• Re: int a = a

    From Tim Rentsch@21:1/5 to All on Thu Mar 20 02:54:21 2025
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    [how to indicate a variable not being used is okay]
    [some quoted text rearranged]

    Unless I'm missing something, `(void)x` also has undefined behavior
    if x is uninitialized,

    Right. Using (void)&x is better.

    though it's very likely to do nothing in practice.

    Unless x is volatile qualified, in which case there must be an
    access to x in the generated code.
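
    As a minimal sketch of the difference (the function names are
    hypothetical, not from the thread): `(void)&x` only forms the address
    of x and never reads its value, so it is safe even when x is
    uninitialized, while `(void)x` performs lvalue conversion and so
    should only be applied to an object that already holds a value.

```c
/* Marks an uninitialized local as "used" without reading it. */
int mark_unused_by_address(void)
{
    int x;       /* deliberately left uninitialized */
    (void)&x;    /* takes the address only; the value of x is never read */
    return 1;
}

/* Marks a local as "used" by reading it, which requires a stored value. */
int mark_unused_by_value(void)
{
    int x = 0;   /* initialized, so the read below is well defined */
    (void)x;     /* lvalue conversion reads x, then the result is discarded */
    return 2;
}
```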

    The behavior [of int a = a;] is undefined. In C11 and later
    (N1570 6.3.2.1p2):

    Except when [...] an lvalue that does not have array type is
    converted to the value stored in the designated object (and is
    no longer an lvalue); this is called lvalue conversion.
    [...]
    If the lvalue designates an object of automatic storage
    duration that could have been declared with the register
    storage class (never had its address taken), and that object
    is uninitialized (not declared with an initializer and no
    assignment to it has been performed prior to use), the
    behavior is undefined.

    Long digression follows.

    The "could have been declared with the register storage class"
    seems quite odd. And in fact it is quite odd.

    I don't have the same reaction. The point of this phrase is that
    undefined behavior occurs only for variables that don't have
    their address taken. The phrase used describes that nicely.
    Any questions related to "registerness" can be ignored, because
    'register' in C really has nothing to do with hardware registers,
    despite the name.
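
    To illustrate that reading (a sketch; the function name is
    hypothetical): `register` acts as a promise that the object's address
    is never taken, and says nothing about where the compiler actually
    keeps the object.

```c
/* Sum of 1..n using register-qualified locals. */
int triangular(int n)
{
    register int total = 0;   /* applying & to total would be a constraint violation */

    for (register int i = 1; i <= n; i++)
        total += i;

    /* int *p = &total; */    /* rejected by the compiler if uncommented */
    return total;
}
```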

    It's tempting to assume that `int n = n;` did not have undefined
    behavior prior to C11, or that accessing an automatic object whose
    address has not been taken does not have undefined behavior even
    in C11 or later, but it's not that simple.

    In C90, the non-normative Annex G (renamed to Annex J in later
    editions) says:

    The behavior in the following circumstances is undefined:
    [...]
    - The value of an uninitialized object that has automatic storage
    duration is used before a value is assigned (6.5.7).

    6.5.7 discusses initialization, and says that "If an object that
    has automatic storage duration is not initialized explicitly, its
    value is indeterminate", and C90's definition of "undefined
    behavior" explicitly refers to use of indeterminately valued
    objects, though it's not 100% clear that using an indeterminate
    value *always* has undefined behavior.

    So in C90, `int n = n;` explicitly had undefined behavior, even if
    all possible bit representations for an object of type int correspond
    to valid values (C90 didn't mention "trap representations").

    C99 added a definition for "indeterminate value": "either an
    unspecified value or a trap representation", and drops the mention
    of indeterminate values in the definition of "undefined behavior".
    It dropped the reference to uninitialized objects in Annex G/J.
    I believe that in C99, `int n = n;` is well defined *if* int
    has no trap representations, or if the representation stored in
    the memory occupied by n happens not to be a trap representation.
    If int has trap representations, and that memory happens to contain
    such a representation, the behavior is undefined.

    I found a discussion in comp.std.c from 2023, subject "Does reading
    an uninitialized object have undefined behavior?".

    The discontinued IA-64/Itanium processor had something called
    "NaT", "Not a Thing". NaT representations exist only in CPU
    registers, not in memory. (Imagine an extra bit for each register
    indicating whether the register contains a "thing".) A NaT allows
    for representations that act like C trap representations (called
    non-value representations in C23) even for types with no trap
    representations (for example where all 2**N possible representations
    correspond to valid values) -- but again, only in CPU registers.

    https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm

    So the "could have been declared with the register storage class"
    wording was added in C11 specifically to cater to the IA64. This
    change would have been superfluous in C90, where the behavior was
    undefined anyway, but is a semantically significant change between
    C99 and C11. (If some future CPU has something like NaT that can
    be stored in memory, the wording might need to be updated yet again.)

    My takeaway is that if it requires this much research to determine
    whether accessing the value of an uninitialized object has undefined
    behavior (in which circumstances and which edition of the standard),
    I'll just avoid doing so altogether. I'll initialize objects
    when they're defined whenever practical. If it's not practical
    for some reason, I won't initialize it with some dummy value; I'll
    leave it uninitialized so the compiler has a chance to warn me if
    I accidentally use it before assigning a value to it.

    I think you are overthinking the question. In cases where it's
    important to give an initial value to a variable, and can be done
    so at the point of its declaration, use an initializer; otherwise
    don't. We don't have to read several different C standards, or
    even only one, to reach that conclusion. If someone wants to know
    exactly which border cases are safe and which cases are not, then
    reading the relevant version(s) of the C standard is needed, but
    in most situations it isn't. It's important for the C standard to
    be precise about what it prescribes, but as far as initialization
    goes it's easy to write code that doesn't need that level of
    detail. Compiler writers need to know such things; in the
    particular case of when and where to initialize, most developers
    don't.
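
    A short sketch of that rule of thumb (hypothetical function names):
    give a variable an initializer when a meaningful value exists at the
    point of declaration; otherwise leave it uninitialized and assign it
    on every path before use, so the compiler still has a chance to flag
    a missed path.

```c
/* A meaningful initial value exists, so use an initializer. */
int count_digits(unsigned n)
{
    int count = 1;            /* every number has at least one digit */

    while (n >= 10) {
        n /= 10;
        count++;
    }
    return count;
}

/* No meaningful initial value: no dummy initializer, assign on every path. */
int clamp_to_byte(int v)
{
    int result;

    if (v < 0)
        result = 0;
    else if (v > 255)
        result = 255;
    else
        result = v;
    return result;            /* every path above has assigned result */
}
```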

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Thu Mar 20 15:42:06 2025
    On 19/03/2025 21:34, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    As far as I understand it (and I hope to be corrected if I am wrong),

    Your hope is about to be fulfilled.

    "int a = a;" is not undefined behaviour as long as the implementation
    does not have trap values for "int". It simply leaves "a" as an
    unspecified value - just like "int a;" does. Thus it is not in any
    way "worse" than "int a;" as far as C semantics are concerned. Any
    difference is a matter of implementation - and the usual
    implementation effect is to disable "not initialised" warnings.

    The behavior is undefined. In C11 and later (N1570 6.3.2.1p2):

    Except when [...] an lvalue that does not have array type is
    converted to the value stored in the designated object (and is no
    longer an lvalue); this is called lvalue conversion.
    [...]
    If the lvalue designates an object of automatic storage duration that
    could have been declared with the register storage class (never had
    its address taken), and that object is uninitialized (not declared
    with an initializer and no assignment to it has been performed prior
    to use), the behavior is undefined.


    OK. I had missed that for some reason. Elsewhere (6.7.9p10, under
    "initialization") the standard says the value is "indeterminate", which
    is defined as an "unspecified or trap" value.

    It is in much the same category as "(void) x;", which is an idiom for
    skipping an "unused variable" or "unused parameter" warning.

    Unless I'm missing something, `(void)x` also has undefined behavior
    if x is uninitialized, though it's very likely to do nothing in
    practice.

    The situation where "(void) x;" is most useful is, I would say, unused
    parameters. So there is no undefined behaviour there. And for other
    variables it is most likely in situations where you have assigned to the
    variable but then don't use it (perhaps you plan to use it later).
    Maybe you have "status = do_something();", and then don't actually make
    use of "status" - casting it to void tells both the compiler and the
    reader that you know "do_something()" is returning a status indicator,
    but that you are then ignoring it. If you are simply declaring a
    variable without initialising it and you don't want to use it and don't
    want to be warned about it, it's probably just as easy (and definitely
    avoids UB) to remove the declaration.
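
    A minimal sketch of those two uses (do_something and the other names
    here are placeholders, not real APIs):

```c
/* Placeholder for a function that returns a status indicator. */
static int do_something(void)
{
    return 0;
}

/* `context` is an unused parameter. The caller always supplies a value,
   so (void)context reads a valid object and only silences the warning. */
int handle_event(int event, void *context)
{
    (void)context;
    return event + 1;
}

/* `status` is assigned and then deliberately ignored. */
void run_once(void)
{
    int status = do_something();
    (void)status;
}
```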


    Long digression follows.

    The "could have been declared with the register storage class" seems
    quite odd. And in fact it is quite odd.

    It's tempting to assume that `int n = n;` did not have undefined
    behavior prior to C11, or that accessing an automatic object whose
    address has not been taken does not have undefined behavior even
    in C11 or later, but it's not that simple.

    In C90, the non-normative Annex G (renamed to Annex J in later
    editions) says:

    The behavior in the following circumstances is undefined:
    [...]
    - The value of an uninitialized object that has automatic storage
    duration is used before a value is assigned (6.5.7).

    6.5.7 discusses initialization, and says that "If an object that
    has automatic storage duration is not initialized explicitly, its
    value is indeterminate", and C90's definition of "undefined
    behavior" explicitly refers to use of indeterminately valued
    objects, though it's not 100% clear that using an indeterminate
    value *always* has undefined behavior.

    So in C90, `int n = n;` explicitly had undefined behavior, even if
    all possible bit representations for an object of type int correspond
    to valid values (C90 didn't mention "trap representations").

    C99 added a definition for "indeterminate value": "either an
    unspecified value or a trap representation", and drops the mention
    of indeterminate values in the definition of "undefined behavior".
    It dropped the reference to uninitialized objects in Annex G/J.
    I believe that in C99, `int n = n;` is well defined *if* int
    has no trap representations, or if the representation stored in
    the memory occupied by n happens not to be a trap representation.
    If int has trap representations, and that memory happens to contain
    such a representation, the behavior is undefined.

    I found a discussion in comp.std.c from 2023, subject "Does reading
    an uninitialized object have undefined behavior?".

    The discontinued IA-64/Itanium processor had something called
    "NaT", "Not a Thing". NaT representations exist only in CPU
    registers, not in memory. (Imagine an extra bit for each register
    indicating whether the register contains a "thing".) A NaT allows
    for representations that act like C trap representations (called
    non-value representations in C23) even for types with no trap
    representations (for example where all 2**N possible representations
    correspond to valid values) -- but again, only in CPU registers.

    https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm

    So the "could have been declared with the register storage class"
    wording was added in C11 specifically to cater to the IA64. This
    change would have been superfluous in C90, where the behavior was
    undefined anyway, but is a semantically significant change between
    C99 and C11. (If some future CPU has something like NaT that can
    be stored in memory, the wording might need to be updated yet again.)

    My takeaway is that if it requires this much research to determine
    whether accessing the value of an uninitialized object has undefined
    behavior (in which circumstances and which edition of the standard),
    I'll just avoid doing so altogether. I'll initialize objects
    when they're defined whenever practical. If it's not practical
    for some reason, I won't initialize it with some dummy value; I'll
    leave it uninitialized so the compiler has a chance to warn me if
    I accidentally use it before assigning a value to it.


    Thanks for that explanation.

    My opinions here match your "takeaway" entirely. Just because I have
    seen "int a = a;", and know how gcc (and perhaps other compilers) handle
    it, does not mean I think it is a good thing to write!

  • From David Brown@21:1/5 to Keith Thompson on Thu Mar 20 16:22:55 2025
    On 20/03/2025 11:20, Keith Thompson wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:


    The "could have been declared with the register storage class"
    seems quite odd. And in fact it is quite odd.

    I don't have the same reaction. The point of this phrase is that
    undefined behavior occurs only for variables that don't have
    their address taken. The phrase used describes that nicely.
    Any questions related to "registerness" can be ignored, because
    'register' in C really has nothing to do with hardware registers,
    despite the name.

    DR 338 is explicitly motivated by an IA-64 feature that applies only to
    CPU registers. An object whose address is taken can't be stored (only)
    in a register, so it can't have a NaT representation.

    The phrase used is "could have been declared with register storage
    class (never had its address taken)". Surely "never had its address
    taken" would have been clear enough if CPU registers weren't a big
    part of the motivation.


    I too think the phrasing is a bit odd.

    Just because a variable's address is taken, does not mean it cannot be
    put in a cpu register by the compiler. If the variable is not accessed
    in a way that actually requires putting it in memory, then the compiler
    can put it in a cpu register (or otherwise optimise it). So simply
    taking the address of a variable on IA-64 does not mean it cannot be in
    a register, and thus does not necessarily mean it cannot be NaT. Taking
    the address of a variable means the variable cannot be declared
    "register", but it does not mean it cannot be /in/ a register.

    It seems very strange to me that this is UB:

        int foo1(void) {
            int x;

            return x;
        }

    while this is not:

        int foo2(void) {
            int x;

            int * p = &x;

            return x;
        }

    (Unfortunately, godbolt.org doesn't seem to have a gcc IA-64 compiler in
    its list.)

    It strikes me that it would have been far simpler for the standard
    simply to say that using the value of an uninitialised and unassigned
    variable is undefined behaviour.

  • From David Brown@21:1/5 to Keith Thompson on Fri Mar 21 10:44:05 2025
    On 20/03/2025 20:46, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 20/03/2025 11:20, Keith Thompson wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    The "could have been declared with the register storage class"
    seems quite odd. And in fact it is quite odd.

    I don't have the same reaction. The point of this phrase is that
    undefined behavior occurs only for variables that don't have
    their address taken. The phrase used describes that nicely.
    Any questions related to "registerness" can be ignored, because
    'register' in C really has nothing to do with hardware registers,
    despite the name.
    DR 338 is explicitly motivated by an IA-64 feature that applies only
    to CPU registers. An object whose address is taken can't be stored
    (only) in a register, so it can't have a NaT representation.
    The phrase used is "could have been declared with register storage
    class (never had its address taken)". Surely "never had its address
    taken" would have been clear enough if CPU registers weren't a big
    part of the motivation.

    I too think the phrasing is a bit odd.

    Just because a variable's address is taken, does not mean it cannot be
    put in a cpu register by the compiler. If the variable is not
    accessed in a way that actually requires putting it in memory, then
    the compiler can put it in a cpu register (or otherwise optimise it).
    So simply taking the address of a variable on IA-64 does not mean it
    cannot be in a register, and thus does not necessarily mean it cannot
    be NaT. Taking the address of a variable means the variable cannot be
    declared "register", but it does not mean it cannot be /in/ a
    register.

    Sure, any variable that's stored in memory can be mirrored by holding
    its value in a register.

        int n = 42; // Assume n is assigned a memory address
        printf("n+1=%d n+2=%d\n", n+1, n+2);

    A compiler could plausibly store the value of n in a register before
    computing n+1, and then reuse the register value to compute n+2.

    Yes, of course. But there is also no necessity for variables to be in
    memory at all, or that there is any consistency there. "Assume n is
    assigned a memory address" is a completely unwarranted assumption for
    almost all local variables. It is only if the address is taken, and
    used in some way that is beyond the optimiser, that the variable
    actually has to go in a fixed place in memory. Otherwise optimisers can
    and do keep data in registers, or move them in and out of registers and
    different stack slots according to convenience for efficient code.


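        #include <stdint.h>
        #include <string.h>

        uint32_t float_to_uint(float f) {
            uint32_t u;
            memcpy(&u, &f, sizeof u);  /* assumes sizeof(float) == sizeof(uint32_t) */
            return u;
        }

    gcc compiles that to (x86-64, optimisation enabled):

        float_to_uint:
            movd eax, xmm0
            ret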

    So even though the addresses of the variable "u" and the parameter "f"
    are taken, and converted to char pointers, and passed to a function with
    external linkage, nothing is actually put in memory at all.

    Thus the standard's wording as though the legality of using the
    "register" storage-class specifier corresponds to cpu register usage is,
    at best, wildly out of date.

    (And there are some architectures where the cpu registers are directly
    mapped to memory, and can be accessed as memory locations or registers.)


    My understanding is that IA-64 NaT (Not a Thing) representations
    exist only for registers, and the NaT bit should be cleared when
    a value is stored in the register.

    The odd wording in the standard allows an IA-64 C compiler to
    take advantage of NaT representations for their intended purpose.
    It might impose some minor constraints on what machine code can be
    generated, but *most* of the cases where a NaT could be accessed
    are undefined behavior in C.


    I see that, but I believe it would be much simpler and clearer if
    attempting to read an uninitialised and unassigned local variable were
    undefined behaviour in every case.

    Alternatively, it could have said that the value is unspecified in
    every case. Then on the IA-64, the compiler would have to ensure that
    registers do not have their NaT bit set even if they are not
    initialised - this would not be a difficult task. Enabling use of the
    NaT bit for detection of bugs could then be a compiler option if
    implementations wanted to provide that feature.

    It seems very strange to me that this is UB:

        int foo1(void) {
            int x;

            return x;
        }

    while this is not:

        int foo2(void) {
            int x;

            int * p = &x;

            return x;
        }

    (Unfortunately, godbolt.org doesn't seem to have a gcc IA-64 compiler
    in its list.)

    It strikes me that it would have been far simpler for the standard
    simply to say that using the value of an uninitialised and unassigned
    variable is undefined behaviour.

    In C90, it was. C99 changed that, making the behavior defined if the
    representation is not a trap representation.

    For C99, a conforming IA-64 C compiler would have had to go out of its
    way to avoid accessing NaT representations. For example, if you wrote

        {
            int n;
            n;
        }

    the most straightforward IA-64 code would store n in a register and
    not initialize it, resulting in a trap when the register is read.
    A compiler might have to generate code to store an arbitrary value
    in the register to avoid the trap.

    I'm undecided on whether reading the value of an uninitialized
    automatic object *should* be undefined behavior, but given that
    it isn't, the C11 committee made the smallest possible change to
    cater to IA-64 semantics.


    IMHO, having it as UB is the best option, with unspecified behaviour as
    a second best option. The jumble that C11 has is not necessary for the
    IA-64, and clearly worse than the other two choices for architectures
    that don't have a NaT equivalent.

  • From David Brown@21:1/5 to Keith Thompson on Fri Mar 21 21:46:32 2025
    On 21/03/2025 20:23, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:

    I see that, but I believe it would be much simpler and clearer if
    attempting to read an uninitialised and unassigned local variable were
    undefined behaviour in every case.

    I probably agree (I haven't given it all that much thought), but the
    committee made a specific decision between C90 and C99 to say that
    reading an uninitialized automatic object is *not* undefined behavior.
    I don't know why they did that (though, all else being equal, reducing
    the number of instances of undefined behavior is a good thing), but
    reversing that decision for this one issue is not something they decided
    to do.

    Certainly the C committee have to think harder, and consider more
    possibilities than most mere C programmers are likely to do - and they
    don't like to make something "undefined" if it were defined (to at least
    some extent) previously.

    I can agree that it is good to reduce the number of UB's, all else being
    equal - but all else is very seldom equal. To me, it is preferable to
    say clearly and explicitly "this is undefined behaviour" than to leave
    the C programmer to combine several parts of the standard to figure out
    that the construct might do something defined but unspecified, or might
    do something bad (a trap), or might be defined or undefined depending on
    other mostly unrelated code.

    I am a fan of clear undefined behaviour when there is no good definition
    of what the behaviour should be - I'd rather have UB than badly defined
    behaviour. But I strongly prefer it to be explicit and clear.


    Alternatively, it could have said that the value is unspecified in
    every case. Then on the IA-64, the compiler would have to ensure that
    registers do not have their NaT bit set even if they are not
    initialised - this would not be a difficult task. Enabling use of the
    NaT bit for detection of bugs could then be a compiler option if
    implementations wanted to provide that feature.

    The whole point of the NaT bit is to detect accesses to uninitialized
    values. Requiring the compiler to arbitrarily clear that bit
    doesn't strike me as a good idea.


    I think it would be fine to do that - as long as tools also provide
    modes that don't clear the bit so that you can use the feature for
    debugging or run-time checks. If the behaviour here was to make the
    variable an unspecified value, then in fully compliant modes the
    compiler would have to clear the NaT bit, so the compiler mode making
    use of the NaT bit would be marginally non-compliant. But I see no
    problem with that - developers can happily use slightly non-compliant
    mode in order to get more features (language extensions, faster
    execution, better debugging - whatever suits the user and the compiler
    implementer).

    Still, leaving it undefined behaviour would be even better, because
    then compilers could have flags for using the NaT bit, or clearing the
    variable to 0, or giving a compile-time error - whatever they did it
    would still be compliant.

    I dislike the way that wording was added to the standard specifically
    to cater to one specific CPU (which happens to have been discontinued
    later). I would have been happier with a more general solution.
    I think that making accessing the value of an uninitialized automatic
    object UB would have been much cleaner, and it would have allowed for
    sensible use of NaT by IA-64 compilers. But without knowing *why*
    the committee removed that UB between C90 and C99, I'm hesitant to
    say it was a mistake.

    Meanwhile, I will in effect assume that accessing uninitialized objects
    is UB, i.e., I'll carefully avoid doing so.


    That, I think is the best way to handle this.

  • From Tim Rentsch@21:1/5 to Keith Thompson on Sat Mar 22 13:59:05 2025
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    David Brown <david.brown@hesbynett.no> writes:

    [...]I believe it would be much simpler and clearer if attempting
    to read an uninitialised and unassigned local variable were
    undefined behaviour in every case.

    I probably agree (I haven't given it all that much thought), but
    the committee made a specific decision between C90 and C99 to say
    that reading an uninitialized automatic object is *not* undefined
    behavior. I don't know why they did that (though, all else
    being equal, reducing the number of instances of undefined
    behavior is a good thing), but reversing that decision for this
    one issue is not something they decided to do.

    Your description of what was done is wrong. It is still the case in
    C99 that trying to access an uninitialized object is undefined
    behavior, at least potentially, except for accesses using a type
    that either is a character type or has no trap representations (and
    all types other than unsigned char may have trap representations,
    depending on the implementation). A statement like

    int a = a;

    may still be given a warning as potential undefined behavior, even
    in C99.

    Alternatively, it could have said that the value is unspecified
    in every case. Then on the IA-64, the compiler would have to
    ensure that registers do not have their NaT bit set even if they
    are not initialised - this would not be a difficult task.
    Enabling use of the NaT bit for detection of bugs could then be a
    compiler option if implementations wanted to provide that
    feature.

    The whole point of the NaT bit is to detect accesses to
    uninitialized values. Requiring the compiler to arbitrarily clear
    that bit doesn't strike me as a good idea.

    I dislike the way that wording was added to the standard
    specifically to cater to one specific CPU (which happens to have
    been discontinued later). I would have been happier with a more
    general solution. I think that making accessing the value of an
    uninitialized automatic object UB would have been much cleaner,
    and it would have allowed for sensible use of NaT by IA-64
    compilers.

    I think you may be missing a key point. In both C99 and C11,
    accessing an uninitialized object using a character type is defined
    (albeit unspecified) behavior. But in C11, because of the changes
    in 6.3.2.1, even character types are subject to undefined behavior
    when uninitialized objects are accessed (provided of course that
    they don't fall under an exception because their address was taken).
    The C11 rule does more than allowing undefined behavior just for
    non-character types; it extends the possibility of undefined
    behavior to character types as well.

    But without knowing *why* the committee removed that UB between
    C90 and C99, I'm hesitant to say it was a mistake.

    The mistake is thinking that UB for uninitialized access was
    removed in C99. It wasn't. Narrowed, yes; removed, no. And
    later C11 simply widened it back a bit, recovering some of the
    territory that had been taken away in C99. The Itanium may
    have been what prompted the change, but the change that was made
    is one well worth making.

  • From Tim Rentsch@21:1/5 to Keith Thompson on Sun Apr 27 13:41:43 2025
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    [...]

    An addle-brained view. Anyone who thinks that should be forcibly
    removed from any activity involving software development.

    Be less rude.

    My comment was a statement about content. If I had said
    "David Brown is an addle-brained fool" that would be a
    statement about a person. What I did say was not. If you
    want to disagree with my statements about content, you
    are welcome to express an opposing view. As long as a
    statement is about content, rather than about a person,
    there is nothing wrong with expressing it in any degree
    of strong language.

    For the record, there are plenty of behaviors that you
    engage in that I find offensive, insulting, or rude,
    including statements made about people. People who
    live in glass houses shouldn't throw stones.

    I note with amusement that you over-snipped the context
    to which I was replying.

  • From Tim Rentsch@21:1/5 to Keith Thompson on Mon Apr 28 09:39:46 2025
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    David Brown <david.brown@hesbynett.no> writes:

    [...]I believe it would be much simpler and clearer if attempting
    to read an uninitialised and unassigned local variable were
    undefined behaviour in every case.

    I probably agree (I haven't given it all that much thought), but
    the committee made a specific decision between C90 and C99 to say
    that reading an uninitialized automatic object is *not* undefined
    behavior. I don't know why they did that (though, all else
    being equal, reducing the number of instances of undefined
    behavior is a good thing), but reversing that decision for this
    one issue is not something they decided to do.

    Your description of what was done is wrong. It is still the case in
    C99 that trying to access an uninitialized object is undefined
    behavior, at least potentially, except for accesses using a type
    that either is a character type or has no trap representations (and
    all types other than unsigned char may have trap representations,
    depending on the implementation). A statement like

    int a = a;

    may still be given a warning as potential undefined behavior, even
    in C99.

    I had already mentioned that distinction earlier in the thread.

    Oh, I must have missed that. I don't remember seeing it in
    the message I was replying to.

    The mistake is thinking that UB for uninitialized access was
    removed in C99. It wasn't. Narrowed, yes; removed, no.

    Acknowledged.

    Good deal.

  • From Tim Rentsch@21:1/5 to Keith Thompson on Tue Apr 29 13:12:17 2025
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [how to indicate a variable not being used is okay]
    [some quoted text rearranged]

    Unless I'm missing something, `(void)x` also has undefined behavior
    if x is uninitialized,

    Right. Using (void)&x is better.

    I'm not convinced -- and it's far less idiomatic.

    Both phrases are idiomatic. What you mean is one phrase is more
    common than the other. More common doesn't mean better. Recall
    Dijkstra's dictum, not to conclude that something is more convenient
    just because it's more conventional.

    I don't think
    I've ever seen (void)&x in code, and if I did I'd wonder what the
    author's intent was.

    The same is true for any construction seen for the first time,
    and like other such cases either you would figure it out or
    look/ask around to find out. And then you'd know.

    Furthermore, having gotten the benefit of this discussion, you
    wouldn't have to do that, because you've seen it already.

    (void)x is a common idiom for hinting to the compiler that it
    doesn't need to complain about x being unused. (void)&x doesn't
    tell the compiler that the *value* of x is used. I'm not sure how
    much difference that makes.

    Both have the effect of getting rid of the warning even if placed
    after a 'return' statement so as not to be executed.
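
    As a minimal sketch of the two spellings (the function and variable
    names here are hypothetical, not taken from the thread): (void)x
    converts x to a value and discards it, while (void)&x only forms
    the address and never reads the object.

```c
/* Hypothetical handler: the second parameter is required by some
   interface but is not needed in this implementation. */
int handle_event(int code, int flags)
{
    (void)flags;        /* common idiom; reads the (initialized) parameter */
    return code + 1;
}

/* Same idea for a local that is never assigned a value. */
int no_read_of_local(void)
{
    int scratch;        /* deliberately left uninitialized */
    (void)&scratch;     /* takes the address only; no value is read */
    return 7;
}
```

    Whether a given compiler stays quiet for either form is a matter of
    its diagnostics, not something the standard guarantees.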

    Even with (void)x and/or (void)&x, a compiler *could* still warn
    about x being unused, or about the programmer's use of an ugly font.

    Yes, it could. At such time that it happens I expect I would
    react and adapt accordingly, the same as with all questionable
    compiler behaviors.

    though it's very likely to do nothing in practice.

    Unless x is volatile qualified, in which case there must be an
    access to x in the generated code.
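
    A sketch of the volatile case, assuming a hypothetical object named
    status that has been given a value so that the read itself is well
    defined:

```c
/* Hypothetical volatile object; initialized so every read is defined. */
volatile int status = 5;

int read_status(void)
{
    (void)status;    /* evaluating a volatile lvalue is an access, so the
                        read must actually be performed */
    (void)&status;   /* address only; no access to the object */
    return status;   /* a second access */
}
```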

    The behavior [of int a = a;] is undefined. In C11 and later
    (N1570 6.3.2.1p2):

    Except when [...] an lvalue that does not have array type is
    converted to the value stored in the designated object (and is
    no longer an lvalue); this is called lvalue conversion.
    [...]
    If the lvalue designates an object of automatic storage
    duration that could have been declared with the register
    storage class (never had its address taken), and that object
    is uninitialized (not declared with an initializer and no
    assignment to it has been performed prior to use), the
    behavior is undefined.

    Long digression follows.

    The "could have been declared with the register storage class"
    seems quite odd. And in fact it is quite odd.

    I don't have the same reaction. The point of this phrase is that
    undefined behavior occurs only for variables that don't have
    their address taken. The phrase used describes that nicely.
    Any questions related to "registerness" can be ignored, because
    'register' in C really has nothing to do with hardware registers,
    despite the name.
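
    To make the parenthetical concrete, here is a small sketch (names
    are hypothetical). An object can be declared register only if unary
    & is never applied to it, so "could have been declared with the
    register storage class" in effect picks out the objects whose
    address is never taken.

```c
int double_it(int x)
{
    register int r = x;   /* allowed: & is never applied to r */
    return r + r;
}

int via_pointer(int x)
{
    int n;                /* could not be register, because of &n below */
    int *p = &n;          /* address taken */
    *p = x;               /* assigned before any read */
    return n;
}
```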

    DR 338 is explicitly motivated by an IA-64 feature that applies only to
    CPU registers. An object whose address is taken can't be stored (only)
    in a register, so it can't have a NaT representation.

    The phrase used is "could have been declared with register storage
    class (never had its address taken)". Surely "never had its address
    taken" would have been clear enough if CPU registers weren't a big
    part of the motivation.

    I'm surprised you would say this. The phrase "never had its address
    taken" doesn't satisfy the careful language threshold observed in
    the ISO C standard. Do you really not understand this?

    https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm

    So the "could have been declared with the register storage class"
    wording was added in C11 specifically to cater to the IA64. This
    change would have been superfluous in C90, where the behavior was
    undefined anyway, but is a semantically significant change between
    C99 and C11. (If some future CPU has something like NaT that can
    be stored in memory, the wording might need to be updated yet again.)

    My takeaway is that if it requires this much research to determine
    whether accessing the value of an uninitialized object has undefined
    behavior (in which circumstances and which edition of the standard),
    I'll just avoid doing so altogether. I'll initialize objects
    when they're defined whenever practical. If it's not practical
    for some reason, I won't initialize it with some dummy value; I'll
    leave it uninitialized so the compiler has a chance to warn me if
    I accidentally use it before assigning a value to it.

    I think you are overthinking the question. In cases where it's
    important to give an initial value to a variable, and can be done
    so at the point of its declaration, use an initializer; otherwise
    don't.
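
    A sketch of that rule of thumb, with hypothetical function names:
    give a variable an initializer when a meaningful value exists at the
    declaration, and otherwise leave it bare so that a compiler has a
    chance to diagnose a path that forgets to assign it.

```c
int sum_to(int n)
{
    int total = 0;              /* a meaningful starting value exists */
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

int sign_of(int x)
{
    int sign;                   /* no dummy value */
    if (x < 0)
        sign = -1;
    else if (x > 0)
        sign = 1;
    else
        sign = 0;               /* every path assigns before the read */
    return sign;
}
```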

    My overthinking led me to essentially the same conclusion, so I don't
    see the problem. And I also found it to be an interesting exploration
    of how certain aspects of the C standard have evolved over time.

    Doing more thinking than is needed is a waste of effort. I can only
    hope that you have better things to do with your time. Furthermore
    spending any time dwelling on the Itanium being the motivation for
    the change is just a distraction. It was interesting to learn, but
    having learned it there is no need to consider it further.

    We don't have to read several different C standards, or
    even only one, to reach that conclusion.

    No, but we do have to read one or more C standards to counter an
    argument that `int a = a;` is well defined.

    Only if one feels it necessary to convince someone who holds such
    an uneducated view. I don't mind pointing someone in the right
    direction, but it's not my job to convince them.

    If someone wants to know
    exactly which border cases are safe and which cases are not, then
    reading the relevant version(s) of the C standard is needed, but
    in most situations it isn't. It's important for the C standard to
    be precise about what it prescribes, but as far as initialization
    goes it's easy to write code that doesn't need that level of
    detail. Compiler writers need to know such things; in the
    particular case of when and where to initialize, most developers
    don't.

    Most developers don't read this newsgroup.

    Probably true, but there are plenty of places where one can find out
    these things besides comp.lang.c.
