Forum: >>> Magnum BBS <<<

Re: Word For Today: “Uglification”

From Richard Kettlewell@21:1/5 to Kaz Kylheku on Tue Mar 12 08:03:50 2024

Kaz Kylheku <433-929-6894@kylheku.com> writes:

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

and the answer came back “1037”. The idea that a C-language
implementation and run-time environment is in any sense monolithic seems
hopelessly out of touch.

There is no such out-of-touch idea. In (say) a Glibc-based system, only
the GCC, Glibc and kernel headers are part of the implementation (which comprises C, POSIX plus GNU and Linux extensions), and only the GCC and
Glibc library components and their external names.

Other libraries are third parties; the __ and _[A-Z] namespace
simply doesn't belong to them.

C doesn't provide any special tools for the application developer and
third party code to avoid clashes among themselves.

That’s true, but AFAICT it’s exactly what Lawrence is complaining about: there’s nothing in the language spec to help those thousand other
libraries avoid name clashes.

--
https://www.greenend.org.uk/rjk/

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Richard Kettlewell@21:1/5 to Keith Thompson on Wed Mar 13 09:01:06 2024

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

My specific complaint was about temporary names being used internal to
library macros. It’s a problem that is essentially impossible to solve
with string-based macro processors.

Here's the macro definition you cited upthread:
"""
From /usr/include/«arch»/bits/select.h on my Debian system:

#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \
for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \
} while (0)
"""

Can you explain how that would cause a problem?

The implementation can use prefixed names as shown, so that example
won’t cause any trouble as long as the implementors coordinate amongst themselves (which is a reasonable assumption).

Any library from outside the implementation doesn’t have that privilege
and just has to hope that the internal names it chooses don’t collide
with anything else that’s visible when its header(s) are included.

--
https://www.greenend.org.uk/rjk/


From Richard Kettlewell@21:1/5 to Keith Thompson on Wed Mar 13 23:32:20 2024

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Wed, 13 Mar 2024 22:33:02 GMT, Scott Lurndal wrote:

Third party libraries are allowed to use any mechanism they like to
minimize name conflicts other than prefixing with two underscores.

But there is no other such mechanism available.

Are you aware that working third party libraries exist, and name
collisions are fairly rare? How do you think that's possible?

There are no 100% reliable mechanisms. There are mechanisms that work
well enough in practice, including using library-specific prefixes.

The collisions I recall running into are library headers defining things
like MIN, MAX, TRUE, FALSE - useful in isolation, but a nuisance when
more than one library does it.

Actually the offending headers that spring to mind are supplied by the
implementor of the platform they support, albeit that the headers
involved are not ones specified in standard C.

--
https://www.greenend.org.uk/rjk/


From Tim Rentsch@21:1/5 to Kaz Kylheku on Thu Mar 14 10:23:01 2024

Kaz Kylheku <433-929-6894@kylheku.com> writes:

[some editing of white space done]

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

From /usr/include/<<arch>>/bits/select.h on my Debian system:

#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \

This assignment has value; it checks that, loosely speaking,
s is an "assignment compatible" pointer with a fd_set *,
so that there is a diagnostic if the macro is applied to
an object of the wrong type.

More to the point, if the macro is applied to a value of the wrong
type.

for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \

Here, I would have done memset(__arr, 0, sizeof *__arr).

That assumes that it is the entire fd_set that needs to be zeroed,
which may not be right. Note the call to the __FDS_BITS() macro.

Better:

#define __FD_ZERO(s) ( \
(void) memset( \
__FDS_BITS( (fd_set*){(s)} ), 0, sizeof __FDS_BITS( (fd_set*){0} ) \
) \
)

This definition: avoids introducing any new identifiers; checks
that the argument s yields an assignment compatible pointer; and
provides a macro that can be used as a void expression (unlike the
original macro definition, which can be used only as a statement).


From Scott Lurndal@21:1/5 to Scott Lurndal on Fri Mar 15 00:01:47 2024

scott@slp53.sl.home (Scott Lurndal) writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

Kaz Kylheku <433-929-6894@kylheku.com> writes:

[some editing of white space done]

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

From /usr/include/<<arch>>/bits/select.h on my Debian system:

#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \

This assignment has value; it checks that, loosely speaking,
s is an "assignment compatible" pointer with a fd_set *,
so that there is a diagnostic if the macro is applied to
an object of the wrong type.

More to the point, if the macro is applied to a value of the wrong
type.

for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \

Here, I would have done memset(__arr, 0, sizeof *__arr).

That assumes that it is the entire fd_set that needs to be zeroed,
which may not be right. Note the call to the __FDS_BITS() macro.

Better:

#define __FD_ZERO(s) ( \
(void) memset( \
__FDS_BITS( (fd_set*){(s)} ), 0, sizeof __FDS_BITS( (fd_set*){0} ) \
) \
)

This definition: avoids introducing any new identifiers; checks
that the argument s yields an assignment compatible pointer; and
provides a macro that can be used as a void expression (unlike the
original macro definition, which can be used only as a statement).

For context, here's the entire file from my system (Ubuntu 24.0.4,
package libc6-dev:amd64 2.35-0ubuntu3.6). I get the impression that the
author(s) decided not to use memset to avoid the required #include,
which might increase compilation times for code that indirectly includes
this header. (I offer no opinion on whether that's a good tradeoff.)

Note that __FD_ZERO is very clearly *not* intended to be invoked by
arbitrary code.

```
/* Copyright (C) 1997-2022 Free Software Foundation, Inc.
This file is part of the GNU C Library.

The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<https://www.gnu.org/licenses/>. */

#ifndef _SYS_SELECT_H
# error "Never use <bits/select.h> directly; include <sys/select.h> instead."
#endif

/* We don't use `memset' because this would require a prototype and
the array isn't too big. */
#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \
for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \
} while (0)
#define __FD_SET(d, s) \
((void) (__FDS_BITS (s)[__FD_ELT(d)] |= __FD_MASK(d)))
#define __FD_CLR(d, s) \
((void) (__FDS_BITS (s)[__FD_ELT(d)] &= ~__FD_MASK(d)))
#define __FD_ISSET(d, s) \
((__FDS_BITS (s)[__FD_ELT (d)] & __FD_MASK (d)) != 0)

```

That code is only selected if it is not compiled with
gcc. If it is gcc 2 or later, the header file uses

# define __FD_ZERO(fdsp) \
do { \
int __d0, __d1; \
__asm__ __volatile__ ("cld; rep; " __FD_ZERO_STOS \
: "=c" (__d0), "=D" (__d1) \
: "a" (0), "0" (sizeof (fd_set) \
/ sizeof (__fd_mask)), \
"1" (&__FDS_BITS (fdsp)[0]) \
: "memory"); \
} while (0)

(Fedora Core 20).


From Scott Lurndal@21:1/5 to Keith Thompson on Thu Mar 14 23:59:03 2024

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

Kaz Kylheku <433-929-6894@kylheku.com> writes:

[some editing of white space done]

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

From /usr/include/<<arch>>/bits/select.h on my Debian system:

#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \

This assignment has value; it checks that, loosely speaking,
s is an "assignment compatible" pointer with a fd_set *,
so that there is a diagnostic if the macro is applied to
an object of the wrong type.

More to the point, if the macro is applied to a value of the wrong
type.

for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \

Here, I would have done memset(__arr, 0, sizeof *__arr).

That assumes that it is the entire fd_set that needs to be zeroed,
which may not be right. Note the call to the __FDS_BITS() macro.

Better:

#define __FD_ZERO(s) ( \
(void) memset( \
__FDS_BITS( (fd_set*){(s)} ), 0, sizeof __FDS_BITS( (fd_set*){0} ) \
) \
)

This definition: avoids introducing any new identifiers; checks
that the argument s yields an assignment compatible pointer; and
provides a macro that can be used as a void expression (unlike the
original macro definition, which can be used only as a statement).

For context, here's the entire file from my system (Ubuntu 24.0.4,
package libc6-dev:amd64 2.35-0ubuntu3.6). I get the impression that the
author(s) decided not to use memset to avoid the required #include,
which might increase compilation times for code that indirectly includes
this header. (I offer no opinion on whether that's a good tradeoff.)

Note that __FD_ZERO is very clearly *not* intended to be invoked by
arbitrary code.

```
/* Copyright (C) 1997-2022 Free Software Foundation, Inc.
This file is part of the GNU C Library.

The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<https://www.gnu.org/licenses/>. */

#ifndef _SYS_SELECT_H
# error "Never use <bits/select.h> directly; include <sys/select.h> instead."
#endif

/* We don't use `memset' because this would require a prototype and
the array isn't too big. */
#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \
for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \
} while (0)
#define __FD_SET(d, s) \
((void) (__FDS_BITS (s)[__FD_ELT(d)] |= __FD_MASK(d)))
#define __FD_CLR(d, s) \
((void) (__FDS_BITS (s)[__FD_ELT(d)] &= ~__FD_MASK(d)))
#define __FD_ISSET(d, s) \
((__FDS_BITS (s)[__FD_ELT (d)] & __FD_MASK (d)) != 0)

```

That code is only selected if it is not compiled with
gcc. If it is gcc 2 or later, the header file uses

# define __FD_ZERO(fdsp) \
do { \
int __d0, __d1; \
__asm__ __volatile__ ("cld; rep; " __FD_ZERO_STOS \
: "=c" (__d0), "=D" (__d1) \
: "a" (0), "0" (sizeof (fd_set) \
/ sizeof (__fd_mask)), \
"1" (&__FDS_BITS (fdsp)[0]) \
: "memory"); \
} while (0)


From Scott Lurndal@21:1/5 to Keith Thompson on Fri Mar 15 00:33:00 2024

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

scott@slp53.sl.home (Scott Lurndal) writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

For context, here's the entire file from my system (Ubuntu 24.0.4,
package libc6-dev:amd64 2.35-0ubuntu3.6). I get the impression that the
author(s) decided not to use memset to avoid the required #include,
which might increase compilation times for code that indirectly includes
this header. (I offer no opinion on whether that's a good tradeoff.)

Note that __FD_ZERO is very clearly *not* intended to be invoked by
arbitrary code.

```

[code snipped]

```

That code is only selected if it is not compiled with
gcc. If it is gcc 2 or later, the header file uses

# define __FD_ZERO(fdsp) \
do { \
int __d0, __d1; \
__asm__ __volatile__ ("cld; rep; " __FD_ZERO_STOS \
: "=c" (__d0), "=D" (__d1) \
: "a" (0), "0" (sizeof (fd_set) \
/ sizeof (__fd_mask)), \
"1" (&__FDS_BITS (fdsp)[0]) \
: "memory"); \
} while (0)

Oh? I don't see that code anywhere in the current glibc sources, in any
older version of bits/select.h, or anywhere under /usr/include on my
system.

I'm using Fedora Core 20. Should have noted that directly in the reply,
sorry about that. It had surprised me when I looked at the generated
code, thinking that perhaps the optimizer generated the rep stos
when optimizing the loop.

$ rpm -q -f /usr/include/bits/select.h
glibc-headers-2.18-19.fc20.x86_64
$ gcc --version
gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)

Yes, I should probably update to a newer Fedora release, but this
has been rock solid for almost a decade (until last week when
an automatic firefox update started requiring wayland libraries).

We do use gcc11+ and recent Ubuntu/Fedora/Redhat releases at the office.


From Scott Lurndal@21:1/5 to Keith Thompson on Fri Mar 15 00:34:25 2024

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]

For context, here's the entire file from my system (Ubuntu 24.0.4,
package libc6-dev:amd64 2.35-0ubuntu3.6). I get the impression that the
author(s) decided not to use memset to avoid the required #include,
which might increase compilation times for code that indirectly includes
this header. (I offer no opinion on whether that's a good tradeoff.)

[...]

An older version did use memset(). It was changed to use a loop in
1997, with a commit message that included:
"Don't use memset to prevent prototype trouble, use simple loop."
It may have been to avoid problems with pre-ANSI C compilers that didn't
support prototypes. That's still speculation on my part.

Here's the full fc20 version:
$ rpm -q -f /usr/include/bits/select.h
glibc-headers-2.18-19.fc20.x86_64

/* Copyright (C) 1997-2013 Free Software Foundation, Inc.
This file is part of the GNU C Library.

The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */

#ifndef _SYS_SELECT_H
# error "Never use <bits/select.h> directly; include <sys/select.h> instead."
#endif

#include <bits/wordsize.h>

#if defined __GNUC__ && __GNUC__ >= 2

# if __WORDSIZE == 64
# define __FD_ZERO_STOS "stosq"
# else
# define __FD_ZERO_STOS "stosl"
# endif

# define __FD_ZERO(fdsp) \
do { \
int __d0, __d1; \
__asm__ __volatile__ ("cld; rep; " __FD_ZERO_STOS \
: "=c" (__d0), "=D" (__d1) \
: "a" (0), "0" (sizeof (fd_set) \
/ sizeof (__fd_mask)), \
"1" (&__FDS_BITS (fdsp)[0]) \
: "memory"); \
} while (0)

#else /* ! GNU CC */

/* We don't use `memset' because this would require a prototype and
the array isn't too big. */
# define __FD_ZERO(set) \
do { \
unsigned int __i; \
fd_set *__arr = (set); \
for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \
} while (0)

#endif /* GNU CC */

#define __FD_SET(d, set) \
((void) (__FDS_BITS (set)[__FD_ELT (d)] |= __FD_MASK (d)))
#define __FD_CLR(d, set) \
((void) (__FDS_BITS (set)[__FD_ELT (d)] &= ~__FD_MASK (d)))
#define __FD_ISSET(d, set) \
((__FDS_BITS (set)[__FD_ELT (d)] & __FD_MASK (d)) != 0)


From Tim Rentsch@21:1/5 to Keith Thompson on Thu Mar 14 19:49:07 2024

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

Kaz Kylheku <433-929-6894@kylheku.com> writes:

[some editing of white space done]

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

From /usr/include/<<arch>>/bits/select.h on my Debian system:

#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \

This assignment has value; it checks that, loosely speaking,
s is an "assignment compatible" pointer with a fd_set *,
so that there is a diagnostic if the macro is applied to
an object of the wrong type.

More to the point, if the macro is applied to a value of the wrong
type.

for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \

Here, I would have done memset(__arr, 0, sizeof *__arr).

That assumes that it is the entire fd_set that needs to be zeroed,
which may not be right. Note the call to the __FDS_BITS() macro.

Better:

#define __FD_ZERO(s) ( \
(void) memset( \
__FDS_BITS( (fd_set*){(s)} ), 0, sizeof __FDS_BITS( (fd_set*){0} ) \
) \
)

This definition: avoids introducing any new identifiers; checks
that the argument s yields an assignment compatible pointer; and
provides a macro that can be used as a void expression (unlike the
original macro definition, which can be used only as a statement).

For context, here's the entire file from my system (Ubuntu 24.0.4,
package libc6-dev:amd64 2.35-0ubuntu3.6). I get the impression that the
author(s) decided not to use memset to avoid the required #include,
which might increase compilation times for code that indirectly includes
this header. [...]

Yes, it seems clear from the (snipped) source that the authors
deliberately avoided using memset(), perhaps so as not to have
an unwanted dependency.

My comments were meant in the sense of comparing one revision to
another, and about macro definitions generally. They were not
meant to say anything specific about the context in which the
original macro was defined, both because it is not one I have
easy access to and because it doesn't affect the general nature
of my comments.


From Tim Rentsch@21:1/5 to Keith Thompson on Tue Jun 18 23:09:00 2024

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Any library from outside the implementation cannot use reserved
identifiers without invoking undefined behavior, [...]

Reserved identifiers can be used. It is only declaring or
defining reserved identifiers (in some contexts) that the C
standard calls out as undefined behavior, not other uses.

