• iso646.h

    From Lawrence D'Oliveiro@21:1/5 to All on Mon Jan 22 01:51:48 2024
    How many people know about this? It was introduced in c99. If you
    “#include <iso646.h>”, then you can use alternative symbols like “not” instead of “!”, “and” instead of “&&” and “or” instead of “||”.

    C++ already had this, without the need to include such a file.
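    For anyone who hasn't seen it in use, here is a minimal sketch of what the
    header gives you (the variable names are purely illustrative):

    #include <iso646.h>
    #include <stdio.h>

    int main(void)
    {
        int a = 1, b = 0;

        /* "and", "or" and "not" are macros that expand to &&, || and ! */
        if (a and not b)
            puts("a is true and b is false");
        if (a or b)
            puts("at least one of a and b is true");
        return 0;
    }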

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to Lawrence D'Oliveiro on Mon Jan 22 02:00:20 2024
    On Mon, 22 Jan 2024 01:51:48 +0000, Lawrence D'Oliveiro wrote:

    How many people know about this? It was introduced in c99.

    It was added in a 1995 amendment to the C90 standard.

    If you
    “#include <iso646.h>”, then you can use alternative symbols like “not”
    instead of “!”, “and” instead of “&&” and “or” instead of “||”.

    C++ already had this, without the need to include such a file.

    Yah, so?

    --
    Lew Pitcher
    "In Skills We Trust"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Lawrence D'Oliveiro on Sun Jan 21 21:48:36 2024
    On 1/21/24 20:51, Lawrence D'Oliveiro wrote:
    How many people know about this? It was introduced in c99. If you
    “#include <iso646.h>”, then you can use alternative symbols like “not”
    instead of “!”, “and” instead of “&&” and “or” instead of “||”.

    I did, and so did a lot of other people. Google groups lists 222
    messages containing <iso646.h>.

    C++ already had this, without the need to include such a file.

    That's because backwards compatibility with older versions of C is a
    lower priority for C++ than it is for C. Existing code that used those
    words as identifiers would break if that feature were simply always
    enabled. Putting it in a standard header allows it to be under user
    control.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to James Kuyper on Mon Jan 22 05:23:30 2024
    On Sun, 21 Jan 2024 21:48:36 -0500, James Kuyper wrote:

    On 1/21/24 20:51, Lawrence D'Oliveiro wrote:

    C++ already had this, without the need to include such a file.

    That's because backwards compatibility with older versions of C is a
    lower priority for C++ than it is for C.

    I don’t think even many C++ programmers know about this.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Mon Jan 22 09:30:21 2024
    On 22/01/2024 02:51, Lawrence D'Oliveiro wrote:
    How many people know about this? It was introduced in c99. If you
    “#include <iso646.h>”, then you can use alternative symbols like “not”
    instead of “!”, “and” instead of “&&” and “or” instead of “||”.

    C++ already had this, without the need to include such a file.

    I can't say how many people know about it, but I certainly did. I have
    not used it - and I don't use the matching names in C++ either, but I
    knew about them. (I know lots more features of C and C++ than I use - I
    expect that applies to most programmers.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Mon Jan 22 16:24:36 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 22/01/2024 02:51, Lawrence D'Oliveiro wrote:
    How many people know about this? It was introduced in c99. If you

    Although it was available before 1999 - the SVR4 C compilation System
    (CCS) had an iso656.h header file in the early 90's.

    #ifndef _ISO646_H
    #define _ISO646_H
    #ident "@(#)sgs-head:common/head/iso646.h 1.2"

    #define and &&
    #define and_eq &=
    #define bitand &

    #define or ||
    #define or_eq |=
    #define bitor |

    #define xor ^
    #define xor_eq ^=

    #define compl ~

    #define not !
    #define not_eq !=

    #endif /*_ISO646_H*/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Mon Jan 22 20:34:04 2024
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability:

    if (ThisCh < '0' or ThisCh > '9')
      {
        if (AllowSign and Index == 0 and (ThisCh == '+' or ThisCh == '-'))
          {
            /* fine */
          }
        else if (AllowDecimal and not DecimalSeen and ThisCh == '.')
          {
            DecimalSeen = true; /* only allow one decimal point */
          }
        else
          {
            Valid = false;
            break;
          } /*if*/
      } /*if*/

    ...

    if
      (
        TheEntry->d_name[0] == '.'
        and
          (
            TheEntry->d_name[1] == 0
            or
                TheEntry->d_name[1] == '.'
            and
                TheEntry->d_name[2] == 0
          )
      )
      {
        /* skip "." and ".." entries */
      }

    ...

    if
      (
            ThisCh >= 'a' and ThisCh <= 'z'
        or
            ThisCh >= 'A' and ThisCh <= 'Z'
        or
            ThisCh >= '0' and ThisCh <= '9'
        or
            ThisCh == '_'
        or
            ThisCh == '-'
        or
            ThisCh == '.'
        or
            ThisCh == '/'
      )
      {
        Result.append(1, ThisCh);
      }

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Mon Jan 22 20:34:37 2024
    On Mon, 22 Jan 2024 16:24:36 GMT, Scott Lurndal wrote:

    Although it was available before 1999 - the SVR4 C compilation System
    (CCS) had an iso6[4]6.h header file in the early 90's.

    By the way, why is it called “iso646.h”?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Mon Jan 22 21:32:41 2024
    On 2024-01-22, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability:

    Only for someone who has no experience in C or C++, and is unfamiliar
    with the regular operators, and speaks English.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Mon Jan 22 22:07:11 2024
    On Mon, 22 Jan 2024 13:22:41 -0800, Keith Thompson wrote:

    As for "and" being more readable than "&&", that's not necessarily the
    case for people who are accustomed to reading C code.

    You mean, “old-style C code”.

    I imagine the introduction of ANSI-style argument declarations, as opposed
    to the old K&R style, was a bit of a jar, too. But we got over it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Blue-Maned_Hawk@21:1/5 to Lawrence D'Oliveiro on Mon Jan 22 23:08:53 2024
    Lawrence D'Oliveiro wrote:

    Don’t you think it improves readability:

    No.

    --
    Blue-Maned_Hawk│shortens to Hawk│/ blu.mɛin.dʰak/ │he/him/his/himself/Mr.
    blue-maned_hawk.srht.site
    Cenunfly!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 00:10:20 2024
    On 2024-01-22, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Mon, 22 Jan 2024 14:56:53 -0800, Keith Thompson wrote:

    As far as I can tell, the macros defined in <iso646.h> have never caught
    on significantly.

    The nice thing is, I don’t have to care. They have to be part of any standards-compliant C compiler, therefore I am free to use them. And I do.

    Indentation not being required is part of any conforming C compiler.
    Therefore, I take advantage of it by starting each line of code without
    any leading whitespace, regardless of the nesting level. My coworkers
    love me!

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 00:12:25 2024
    On 2024-01-22, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Mon, 22 Jan 2024 23:08:53 -0000 (UTC), Blue-Maned_Hawk wrote:

    Lawrence D'Oliveiro wrote:

    Don’t you think it improves readability:

    No.

    Lessig’s Law: The one who writes the code makes the rules.

    Accordingly, if you get to rewrite the genetic code
    of someone reading the code, you may be able to dictate
    what they find readable.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Mon Jan 22 23:44:01 2024
    On Mon, 22 Jan 2024 14:56:53 -0800, Keith Thompson wrote:

    As far as I can tell, the macros defined in <iso646.h> have never caught
    on significantly.

    The nice thing is, I don’t have to care. They have to be part of any standards-compliant C compiler, therefore I am free to use them. And I do.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to All on Mon Jan 22 23:37:33 2024
    On Mon, 22 Jan 2024 23:08:53 -0000 (UTC), Blue-Maned_Hawk wrote:

    Lawrence D'Oliveiro wrote:

    Don’t you think it improves readability:

    No.

    Lessig’s Law: The one who writes the code makes the rules.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Lawrence D'Oliveiro on Mon Jan 22 20:23:59 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    [.. using ISO646 names for logical operators ..]

    I do, if/when I do use C++ and C. Don't you think it improves
    readability:

    [example]

    No, quite the contrary.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 06:54:56 2024
    On 23.01.2024 00:37, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 23:08:53 -0000 (UTC), Blue-Maned_Hawk wrote:

    Lawrence D'Oliveiro wrote:

    Don’t you think it improves readability:

    No.

    Lessig’s Law: The one who writes the code makes the rules.

    If by "The one" you mean the company and the project leader then
    you are right. If you mean the individual programmer you are not
    necessarily right; in my professional contexts there were even
    [coding] standards defined that you had to follow. What people
    make in their private cubbyhole is of course their own business
    (and no one cares).

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 06:47:10 2024
    On 22.01.2024 21:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability: [...]

    I'm a big proponent of readable code. Most of the early languages that
    I used had such keywords (and also 'begin' etc. instead of braces).

    I seem to recall that the (one?) reason to not have them was to reduce
    the number of literal alphabetic keywords, to avoid name clashes.

    But replacing '&&' by 'and' also doesn't add to the comprehensibility
    of the source code (YMMV), presuming that we want the code to be read
    by people who know how to program. It's also idiomatic. And in other
    contexts (than C) these symbols have similar meanings.

    There's also problems with these names. For example '&&' has not the
    semantics of 'and' but of 'and_then', 'or' is actually 'or_else'. Even
    if that gets "fixed", it's also not addressing other punctuation token
    issues, like using '==' instead of the common '=' or 'equals_to'.

    Since the beginning of C we have also had other inherent issues with
    these symbols; not simple lexical ones like 'and', but e.g. the operator
    precedence that is (IMO) broken in at least one place. (I guess you
    know them.) And in the earlier C days we did not even have a true
    boolean type or the 'true' and 'false' literals. Fixing things in a language
    like C appears to me to be an arduous and unrewarding task.

    So I take it as it comes, and I think, in C, we'd need other things
    than an iso646.h header or disputes about personal token preferences.
    Anyway.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 09:24:31 2024
    On 22/01/2024 21:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability:

    No. But I fully appreciate that this is personal preference and habit.
    The same applies to brace style, and the use of parenthesis, and
    variable naming :

    if ((ch < '0') || (ch > '9')) {
        if (allow_sign && (i == 0) && ((ch == '+') || (ch == '-'))) {
            // All fine
        } else if (allow_dec && !seen_dec && (ch == '.')) {
            seen_dec = true;
        } else {
            valid = false;
            break;
        }
    }


    I wouldn't object to seeing "and" and "or", but I would not feel it
    improves readability - it would make the code look more like Python than C.



    if (ThisCh < '0' or ThisCh > '9')
      {
        if (AllowSign and Index == 0 and (ThisCh == '+' or ThisCh == '-'))
          {
            /* fine */
          }
        else if (AllowDecimal and not DecimalSeen and ThisCh == '.')
          {
            DecimalSeen = true; /* only allow one decimal point */
          }
        else
          {
            Valid = false;
            break;
          } /*if*/
      } /*if*/


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 11:54:31 2024
    On 22/01/2024 20:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability:

    if (ThisCh < '0' or ThisCh > '9')
      {
        if (AllowSign and Index == 0 and (ThisCh == '+' or ThisCh == '-'))
          {
            /* fine */
          }
        else if (AllowDecimal and not DecimalSeen and ThisCh == '.')
          {
            DecimalSeen = true; /* only allow one decimal point */
          }
        else
          {
            Valid = false;
            break;
          } /*if*/
      } /*if*/


    I strongly dislike C syntax, but this iso646 thing barely addresses 1% of
    it. So I'm not surprised few bother.

    Especially if you have to go to the trouble of including a header file
    just to be able to use what should be language keywords.

    In my everyday language (lower level like C) this can be written as:

    if ch not in '0'..'9' then
        if allowsign and index = 0 and ch in ['+', '-'] then
            # fine

        elsif allowdecimal and not decimalseen and ch = '.' then
            decimalseen := true

        else
            valid := false
            exit

        end if
    end if

    (Notice no braces or parentheses.)


    if
      (
            ThisCh >= 'a' and ThisCh <= 'z'
        or
            ThisCh >= 'A' and ThisCh <= 'Z'
        or
            ThisCh >= '0' and ThisCh <= '9'
        or
            ThisCh == '_'
        or
            ThisCh == '-'
        or
            ThisCh == '.'
        or
            ThisCh == '/'
      )
      {
        Result.append(1, ThisCh);
      }

    This one could be a switch (mine allows 'a'..'z' ranges, like gcc's
    case-range extension), but that might be overkill.

    Otherwise:

    if ch in 'A'..'Z' or ch in 'a'..'z' or ch in '0'..'9' or
            ch in ['_', '-', '.', '/'] then
        result.append(1, ch)

    I shortened your ThisCh to something more apt for a local variable.

    Alternatively, I'd use:

    if namemap[ch] then
    ...

    but this is now rewriting your code, not just showing alternate syntax.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to Malcolm McLean on Tue Jan 23 17:21:40 2024
    On Tue, 23 Jan 2024 16:32:09 +0000, Malcolm McLean wrote:

    On 22/01/2024 20:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability:
    It breaks the rule that, in C, variables and functions are alphanumeric, whilst operators are symbols. sizeof is an exception, but a justified
    one. However it's harder to justify a symbol for "plus" but a word for "or".

    Less importantly, it also violates the convention that C macros are named in upper case to distinguish them from keywords and "regular" identifiers.

    I'll stick with the native C operators, but IF I were working in an environment where 'special characters' were problematic (such as where digraphs or trigraphs
    are necessary), I'd rather use
    a OR b
    instead of
    a or b



    --
    Lew Pitcher
    "In Skills We Trust"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Malcolm McLean on Tue Jan 23 18:34:09 2024
    On 23/01/2024 16:32, Malcolm McLean wrote:
    On 22/01/2024 20:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves
    readability:

    It breaks the rule that, in C, variables and functions are alphanumeric, whilst operators are symbols. sizeof is an exception, but a justified
    one. However it's harder to justify a symbol for "plus" but a word for
    "or".

    But it's OK to justify 'pow' for exponentiation?

    Every explanation for && and || for every language that copied them from
    C, is that && means AND, and || means OR.

    Presumably everyone knows what AND and OR mean. So why not just use AND
    and OR?

    A lot of C code already looks a sea of punctuation; it can be good to
    break it up a little.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Tue Jan 23 18:52:22 2024
    bart <bc@freeuk.com> writes:
    On 23/01/2024 16:32, Malcolm McLean wrote:


    Every explanation for && and || for every language that copied them from
    C, is that && means AND, and || means OR.

    in C, && specifically means 'conditional and'. The programmer can
    rely on the fact that the second term will not be evaluated if
    the first term evaluates to false.
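
    A minimal sketch of that guarantee (the helper function here is invented
    purely for illustration):

    #include <stdio.h>

    static int second_term(void)
    {
        puts("second term evaluated");
        return 1;
    }

    int main(void)
    {
        int first = 0;

        /* second_term() is never called, because first is already false */
        if (first && second_term())
            puts("both terms true");
        return 0;
    }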

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Scott Lurndal on Tue Jan 23 14:23:01 2024
    On 1/23/24 13:52, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 23/01/2024 16:32, Malcolm McLean wrote:


    Every explanation for && and || for every language that copied them from
    C, is that && means AND, and || means OR.

    in C, && specifically means 'conditional and'. The programmer can
    rely on the fact that the second term will not be evaluated if
    the first term evaluates to false.

    Actually, C uses the term "Logical AND". I don't have any idea what "conditional and" is supposed to mean, except that the explanation you
    provide matches the term "Logical AND".

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kalevi Kolttonen@21:1/5 to James Kuyper on Tue Jan 23 20:32:39 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    Actually, C uses the term "Logical AND". I don't have any idea what "conditional and" is supposed to mean, except that the explanation you provide matches the term "Logical AND".

    The term "conditional and" is probably not so good, but the
    meaning of it here refers to the familiar short-circuiting
    behaviour of C's "&&". The same behaviour exists in, I
    think, all UNIX shells.

    If I write this in bash:

    rm foo.txt && rm bar.txt

    then if the first rm-command fails with a non-zero exit value,
    then the second rm-command is not executed at all.

    It is similar in C code, but there a zero value means false,
    i.e. exactly the opposite of the UNIX shell convention.
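
    A rough C analogue of the bash example, as a sketch (file names are taken
    from the example above; remove() returns 0 on success):

    #include <stdio.h>

    int main(void)
    {
        /* "== 0" plays the role of the shell's zero exit status, so
           bar.txt is only removed if removing foo.txt succeeded */
        if (remove("foo.txt") == 0 && remove("bar.txt") == 0)
            puts("both files removed");
        return 0;
    }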

    br,
    KK

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to James Kuyper on Tue Jan 23 21:28:23 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 1/23/24 13:52, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 23/01/2024 16:32, Malcolm McLean wrote:


    Every explanation for && and || for every language that copied them from C, is that && means AND, and || means OR.

    in C, && specifically means 'conditional and'. The programmer can
    rely on the fact that the second term will not be evaluated if
    the first term evaluates to false.

    Actually, C uses the term "Logical AND".

    The term 'conditional and' has been in common use for decades.

    I never claimed it was the term used in the standard, regardless
    of the use of the word 'specifically'.

    An issue with the use of the iso646 'and' macro is that the
    typical reader, unfamiliar with iso646, might assume a bitwise
    and rather than a 'logical' or 'conditional' and.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to David Brown on Tue Jan 23 21:30:20 2024
    On 2024-01-23, David Brown <david.brown@hesbynett.no> wrote:
    On 22/01/2024 21:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability:

    No. But I fully appreciate that this is personal preference and habit.

    I believe that some of the identifiers improve readability for people
    coming from a programming language which uses those English words for
    very similar operators rather than && and ||.

    In a green field programming language design, it's probably better
    to design that way from the start. It's a nice bonus if a language
    looks readable to newcomers.

    Generations of C coders are used to && and || though; that's the normal
    way to write C. Using these aliases is a vanishingly rare practice. An important aspect of readability is writing code like everyone else. When
    a language is newly designed so that there isn't anyone else, that
    doesn't have to be considered.

    For that reason, these identifiers should not be used, except for machine-encoding of programs into a 6 bit character set.

    Additionally certain names in the iso646.h header are poorly considered,
    and obstruct readability. They use the _eq suffix for an operation that
    is assignment.

    #define and_eq &=

    If the purpose of this header were to optimize readability for those
    unfamiliar with C, this should be called

    #define and_set &=

    or similar.

    The assignment operator = should not be read "equals", but "becomes" or
    "takes the value" or "is assigned" or "is set to". This should be taken
    into consideration when coming up with word-like token or token fragment
    to represent it.

    Also note the following inconsistency:

    #define and &&
    #define bitand &
    #define and_eq &= // what happened to "bit"?

    This looks like and_eq should correspond to &&=, since and is &&,
    and bitand is &. &= wants to be bitand_eq.

    Clearly, the purpose of this header is to allow C to be written with the
    ISO 646 character set. The choices of identifiers do not look like
    evidence of readability having been highly prioritized.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lew Pitcher on Tue Jan 23 21:49:23 2024
    On 2024-01-23, Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    On Tue, 23 Jan 2024 16:32:09 +0000, Malcolm McLean wrote:

    On 22/01/2024 20:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability:

    It breaks the rule that, in C, variables and functions are alphanumeric,
    whilst operators are symbols. sizeof is an exception, but a justified
    one. However it's harder to justify a symbol for "plus" but a word for "or".

    Less importantly, it also violates the convention that C macros are named in upper case to distinguish them from keywords and "regular" identifiers.

    So do true, false, bool, complex, imaginary, errno, assert, fpclassify,
    ..., and any function supplanted by a macro, such as used to be the case
    with getc.


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Tue Jan 23 21:51:29 2024
    On Tue, 23 Jan 2024 16:32:09 +0000, Malcolm McLean wrote:

    It breaks the rule that, in C, variables and functions are alphanumeric, whilst operators are symbols.

    Where is there such a “rule”?

    sizeof is an exception, but a justified one.

    This is how religious people argue: they use circular reasoning to say something is justified because it is justified.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Scott Lurndal on Tue Jan 23 22:00:43 2024
    On 2024-01-23, Scott Lurndal <scott@slp53.sl.home> wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 1/23/24 13:52, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 23/01/2024 16:32, Malcolm McLean wrote:


    Every explanation for && and || for every language that copied them from C, is that && means AND, and || means OR.

    in C, && specifically means 'conditional and'. The programmer can
    rely on the fact that the second term will not be evaluated if
    the first term evaluates to false.

    Actually, C uses the term "Logical AND".

    The term 'conditional and' has been in common use for decades.

    Also, a bitwise and is logical!

    ANSI Common Lisp uses symbols like logand, logior, logxor, ...
    for bitwise operations.

    When you implement this stuff with electronic gates it is digital logic circuits. You can read live values in it with a logic probe.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Lew Pitcher on Tue Jan 23 21:52:00 2024
    On Tue, 23 Jan 2024 17:21:40 -0000 (UTC), Lew Pitcher wrote:

    Less importantly, it also violates the convention that C macros are
    named in upper case to distinguish them from keywords and "regular" identifiers.

    Why does C allow lowercase in macro names, then?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Tue Jan 23 21:50:01 2024
    On Tue, 23 Jan 2024 06:47:10 +0100, Janis Papanagnou wrote:

    There's also problems with these names. For example '&&' has not the semantics of 'and' but of 'and_then', 'or' is actually 'or_else'.

    Funnily enough, that is how the languages that offer those words interpret them. Not just C and C++.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 22:09:35 2024
    On 2024-01-23, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Tue, 23 Jan 2024 20:32:39 -0000 (UTC), Kalevi Kolttonen wrote:

    If I write this in bash:

    rm foo.txt && rm bar.txt

    Then the second is only executed if the first one returns zero.

    What does C do in this case?

    C also doesn't evaluate the right operand if the left one is
    true.

    However in C, the following:

    a && b || c && d

    is parsed like this:

    (a && b) || (c && d)

    whereas in the shell:

    (((a && b) || c) && d)

    where I'm using "virtual parentheses". If you actually stick in real
    ones, they denote subshell execution in a separate process. (Bash
    allows curly braces for command grouping that doesn't create
    processes.)
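
    A small sketch showing one set of truth values where the two groupings
    give different answers:

    #include <stdio.h>

    int main(void)
    {
        int a = 1, b = 1, c = 0, d = 0;

        /* C groups this as (a && b) || (c && d), which is 1 here;
           the shell's grouping ((a && b) || c) && d would give 0 */
        printf("%d\n", a && b || c && d);
        return 0;
    }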


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Kaz Kylheku on Tue Jan 23 22:03:02 2024
    On 23/01/2024 21:30, Kaz Kylheku wrote:

    Also note the following inconsistency:

    #define and &&
    #define bitand &
    #define and_eq &= // what happened to "bit"?

    This looks like and_eq should correspond to &&=, since and is &&,
    and bitand is &. &= wants to be bitand_eq.

    Clearly, the purpose of this header is to allow C to be written with the
    ISO 646 character set. The choices of identifiers do not look like
    evidence of readability having been highly prioritized.

    It shows this is a poor man's way of extending a language's syntax.

    It's forgivable in user-code, as there is no other way of doing it. But
    those new operator names are supposed to look like they are built-in.

    If they were, you'd be able to do this:

    a bitand= b;

    Actually, I thought the macros could be used like that. But '&=' needs
    to be a single token, not two.

    Also, why isn't 'a &&= b' valid? It would just mean 'a = a && b'.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Tue Jan 23 22:01:09 2024
    On Tue, 23 Jan 2024 12:13:27 -0800, Keith Thompson wrote:

    There's no hard rule that operators must be
    punctuation, just a general trend.

    And iso646.h demonstrates that that trend is at an end.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Kalevi Kolttonen on Tue Jan 23 21:52:59 2024
    On Tue, 23 Jan 2024 20:32:39 -0000 (UTC), Kalevi Kolttonen wrote:

    If I write this in bash:

    rm foo.txt && rm bar.txt

    Then the second is only executed if the first one returns zero.

    What does C do in this case?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Kaz Kylheku on Tue Jan 23 22:33:58 2024
    On 23/01/2024 22:00, Kaz Kylheku wrote:
    On 2024-01-23, Scott Lurndal <scott@slp53.sl.home> wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 1/23/24 13:52, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 23/01/2024 16:32, Malcolm McLean wrote:


    Every explanation for && and || for every language that copied them from C, is that && means AND, and || means OR.

    in C, && specifically means 'conditional and'. The programmer can
    rely on the fact that the second term will not be evaluated if
    the first term evaluates to false.

    Actually, C uses the term "Logical AND".

    The term 'conditional and' has been in common use for decades.

    Also, a bitwise and is logical!

    Only on individual bits. The result of A & B can be any value in the
    range of A and B's type, not just true and false.
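
    A two-line illustration of the difference:

    #include <stdio.h>

    int main(void)
    {
        printf("%d\n", 2 && 4);   /* logical: both operands non-zero, prints 1 */
        printf("%d\n", 2 & 4);    /* bitwise: no bits in common, prints 0 */
        return 0;
    }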

    (My tools internally use ANDL/ORL/NOTL for logical versions, and IAND/IOR/IXOR/INOT for bitwise. I don't use bare AND/OR/NOT. There is
    some confusion however with x64 opcodes which use those for bitwise instructions.)

    ANSI Common Lisp uses symbols like logand, logior, logxor, ...
    for bitwise operations.

    Then it is confusing. What does it use for non-bitwise logical operations?

    When you implement this stuff with electronic gates it is digital logic circuits. You can read live values in it with a logic probe.

    Yes. You put the probe on one signal line to read true or false on that
    line.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kalevi Kolttonen@21:1/5 to Kaz Kylheku on Tue Jan 23 22:37:46 2024
    Kaz Kylheku <433-929-6894@kylheku.com> wrote:
    On 2024-01-23, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Tue, 23 Jan 2024 20:32:39 -0000 (UTC), Kalevi Kolttonen wrote:

    If I write this in bash:

    rm foo.txt && rm bar.txt

    Then the second is only executed if the first one returns zero.

    What does C do in this case?

    C also doesn't evaluate the right operand if the left one is
    true.

    We are speaking about logical AND evaluation here.

    If the left one is *false*, then the evaluation stops
    because the whole conjunction is known to be false.

    If we are talking about logical OR, then if the left
    operand is true, then the evaluation stops because
    the disjunction is known to be true.

    So this distinction between "conditional and" and
    "logical and" boils down to the short-circuiting
    left-to-right evaluation order that is guaranteed
    by C language standard.

    We could conceivably have "logical and" with
    out-of-order or right-to-left evaluation. I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    br,
    KK

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kalevi Kolttonen@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 22:54:33 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Tue, 23 Jan 2024 12:13:27 -0800, Keith Thompson wrote:

    There's no hard rule that operators must be
    punctuation, just a general trend.

    And iso646.h demonstrates that that trend is at an end.

    In the real world, iso646.h is almost never used. I
    do not claim to be any kind of programming expert but
    I have read a considerable amount of C code of various
    projects. I have personally witnessed that everybody
    always uses "&&" and "||".

    You can probably download thousands of C codebases
    from github and you will find out that reality is
    like that.

    br,
    KK

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 22:45:17 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Tue, 23 Jan 2024 12:13:27 -0800, Keith Thompson wrote:

    There's no hard rule that operators must be
    punctuation, just a general trend.

    And iso646.h demonstrates that that trend is at an end.


    I don't see how you can draw that conclusion. The header
    file has been around for over three decades, yet it's not
    in common (or even uncommon) use.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Tue Jan 23 23:33:48 2024
    On Tue, 23 Jan 2024 14:51:52 -0800, Keith Thompson wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 23 Jan 2024 17:21:40 -0000 (UTC), Lew Pitcher wrote:

    Less importantly, it also violates the convention that C macros are
    named in upper case to distinguish them from keywords and "regular"
    identifiers.

    Why does C allow lowercase in macro names, then?

    Because it's a convention, not a language rule.

    So what would one mean by “violate”, other than “I personally don’t like
    it”?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to bart on Tue Jan 23 23:46:15 2024
    On 2024-01-23, bart <bc@freeuk.com> wrote:
    ANSI Common Lisp uses symbols like logand, logior, logxor, ...
    for bitwise operations.

    Then it is confusing. What does it use for non-bitwise logical operations?

    and, or, not

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Tue Jan 23 23:40:23 2024
    On Tue, 23 Jan 2024 15:10:36 -0800, Keith Thompson wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 23 Jan 2024 12:13:27 -0800, Keith Thompson wrote:

    There's no hard rule that operators must be punctuation, just a
    general trend.

    And iso646.h demonstrates that that trend is at an end.

    It does no such thing.

    A “trend” means ongoing developments continue to follow the same pattern. There have been no new non-alphanumeric operators added to C in, oh, over
    40 years. Therefore the “trend” actually came to an end a long time ago.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Tue Jan 23 23:39:07 2024
    On Tue, 23 Jan 2024 22:45:17 GMT, Scott Lurndal wrote:

    The header file has been around for over three decades, yet it's not in common (or even uncommon) use.

    The nice thing is, I don’t have to care. It has to be part of any standards-compliant C compiler, therefore I am free to use it. And I do.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Scott Lurndal on Tue Jan 23 19:45:55 2024
    On 1/23/24 16:28, Scott Lurndal wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 1/23/24 13:52, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 23/01/2024 16:32, Malcolm McLean wrote:


    Every explanation for && and || for every language that copied them from C, is that && means AND, and || means OR.

    in C, && specifically means 'conditional and'. The programmer can
    rely on the fact that the second term will not be evaluated if
    the first term evaluates to false.

    Actually, C uses the term "Logical AND".

    The term 'conditional and' has been in common use for decades.

    I've never heard of it. When searching Wikipedia for "conditional and", the
    first two pages of hits don't seem to contain any that are actually
    relevant. A search on "logical and", on the other hand, redirects to
    "logical conjunction", which does seem to be relevant, yet it makes no reference to "conditional and" as an alternative term for the same
    thing. I think that that term might not be as widely used as you think.

    A Google search for "logical and" has many relevant hits. A search for "conditional and" produces a much smaller number of hits, which suggests
    that it's a term associated with C# and Java, and at least one source
    described it by saying the "conditional and operator performs a logical
    and operation", which sounds rather confused to me.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Wed Jan 24 00:47:03 2024
    On Tue, 23 Jan 2024 16:27:25 -0800, Keith Thompson wrote:

    I believe the only thing iso646.h demonstrates is the (largely former)
    need to write C on systems that do not support full ASCII.

    It does seem a very late addition for a purpose which would have been a
    lot more relevant decades earlier. By that point, implementations that did
    not support full ASCII would have been museum pieces.

    Reminds me of how PL/I added “eq”, “ne”, “lt”, ”gt”, “le”, “ge” as
    alternatives to “=”, “¬=”, “<”, ”>“, “¬>”, “¬<” for use on systems which
    didn’t have the requisite character set.

    Incidentally, those six symbols were the only reserved words in the entire language. You’d think they could have done it Fortran-style, without
    reserved words.

    If you want to use it in your own code, nobody will stop you.

    But will I continue to hear complaints from you about it?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Wed Jan 24 00:39:19 2024
    On Tue, 23 Jan 2024 16:16:31 -0800, Keith Thompson wrote:

    Obviously, it would mean not following the convention.

    Thumbing one’s nose at staid convention? How shocking!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Wed Jan 24 02:42:01 2024
    On Tue, 23 Jan 2024 17:32:37 -0800, Keith Thompson wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    It does seem a very late addition for a purpose which would have been a
    lot more relevant decades earlier. By that point, implementations that
    did not support full ASCII would have been museum pieces.

    <iso646.h> was added in 1995, and it was intended to replace a number of implementation-specific workarounds.

    Just in time for them to become unnecessary, I would say.

    But will I continue to hear complaints from you about it?

    I can't continue what I never started.

    I’ll take that as a “yes”.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Keith Thompson on Tue Jan 23 23:26:54 2024
    On 1/23/24 19:27, Keith Thompson wrote:
    ...
    I believe the only thing iso646.h demonstrates is the (largely former)
    need to write C on systems that do not support full ASCII. This is
    fully explained in the C99 Rationale, <http://www.open-std.org/jtc1/sc22/WG14/www/C99RationaleV5.10.pdf>;
    search for "MSE.4". It says nothing about "and" being more readable
    than "&&" on systems that are able to display the '&' character.

    It does say, in lines 26-29, that "The Committee recognizes that the
    solution offered in this header is incomplete and involves a mixture of approaches, but nevertheless believes that it can help make Standard C
    programs more readable." However, if you examine the context of that
    statement, what it's saying is that they are more readable than
    trigraphs. In other words, they are saying that "or" is more readable
    than "??!??!", which I doubt that anyone would disagree with.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Lawrence D'Oliveiro on Tue Jan 23 23:56:11 2024
    On 1/23/24 19:47, Lawrence D'Oliveiro wrote:
    On Tue, 23 Jan 2024 16:27:25 -0800, Keith Thompson wrote:

    I believe the only thing iso646.h demonstrates is the (largely former)
    need to write C on systems that do not support full ASCII.

    It does seem a very late addition for a purpose which would have been a
    lot more relevant decades earlier. By that point, implementations that did not support full ASCII would have been museum pieces.

    No, <iso646.h> was approved as part of AMD1, in 1995. For the countries
    it was targeted at, full ASCII was not a tenable solution - it could not
    be used to write their languages. They were still using encodings such as shift-JIS and ISO/IEC 8859-10. Those did not become museum pieces until
    after the widespread adoption of Unicode, which came out in 1996, and
    did not become widely supported for many years after that.

    If you want to use it in your own code, nobody will stop you.

    But will I continue to hear complaints from you about it?

    If you've misunderstood the comments you've already received as
    complaints, I think it's quite likely that you'll continue doing so.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to James Kuyper on Wed Jan 24 05:24:39 2024
    On Tue, 23 Jan 2024 23:56:11 -0500, James Kuyper wrote:

    No, <iso646.h> was approved as part of AMD1, in 1995. For the countries
    it was targeted at, full ASCII was not a tenable solution - it could not
    be used to write their languages. They were still using encodings such as shift-JIS and ISO/IEC 8859-10.

    But ASCII is a 7-bit code. The ISO 8859 codes are all ASCII supersets. And
    also remember what the “shift” in “shift-JIS” stands for. Oh, and the name
    of the corporation that created it in the first place is a bit of a
    giveaway.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Wed Jan 24 06:05:01 2024
    On Tue, 23 Jan 2024 06:54:56 +0100, Janis Papanagnou wrote:

    ... in my professional contexts there were even [coding] standards
    defined that you had to follow.

    What about the open-source code that your company takes without paying? Do
    you demand that that code follow your rules as well? Do you send it back
    to the developers to demand they rewrite it for you?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Wed Jan 24 06:35:05 2024
    On Tue, 23 Jan 2024 21:43:44 -0800, Keith Thompson wrote:

    In the Shift-JIS encoding, character 0x5C, which is the backslash in
    ASCII and Unicode, is the Yen sign. That means that if a C source file contains "Hello, world\n", viewing it as Shift-JIS makes it look like
    "Hello, world¥n", but a C compiler that treats its input as ASCII would
    see a backslash.

    So what exactly does iso646.h offer to deal with this?

    .

    .

    .

    (crickets)

    .

    .

    .

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Kaz Kylheku on Wed Jan 24 08:53:05 2024
    On 23.01.2024 22:30, Kaz Kylheku wrote:
    On 2024-01-23, David Brown <david.brown@hesbynett.no> wrote:
    On 22/01/2024 21:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability:

    No. But I fully appreciate that this is personal preference and habit.

    I believe that some of the identifiers improve readability for people
    coming from a programming language which uses those English words for
    very similar operators rather than && and ||.

    Well, I can only speak for myself. But as someone originally coming
    from such languages I have no problems with these operators. (== vs = aggravated me more).

    [...]

    For that reason, these identifiers should not be used, except for machine-encoding of programs into a 6 bit character set.

    [...]

    Clearly, the purpose of this header is to allow C to be written with the
    ISO 646 character set. The choices of identifiers do not look like
    evidence of readability having been highly prioritized.

    I don't quite understand your thought. The "6-bit characters" OSes
    that I used had no lowercase letters, as opposed to IA5/ASCII/ISO646.

    To be sure I had also re-inspected the ASCII character set and it
    seems that all C characters (including these operators) are anyway
    in the ASCII domain. It's beyond me why they've used the name
    "iso646.h".

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 08:59:59 2024
    On 23.01.2024 22:50, Lawrence D'Oliveiro wrote:
    On Tue, 23 Jan 2024 06:47:10 +0100, Janis Papanagnou wrote:

    There's also problems with these names. For example '&&' has not the
    semantics of 'and' but of 'and_then', 'or' is actually 'or_else'.

    Funnily enough, that is how the languages that offer those words interpret them. Not just C and C++.

    The languages I know of that support them (Ada and Eiffel) support
    both forms.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Keith Thompson on Wed Jan 24 09:06:22 2024
    On 23.01.2024 21:13, Keith Thompson wrote:
    [...] There's no hard rule that operators must be
    punctuation, just a general trend.)

    Anyone still writing "MULTIPLY a BY b GIVING c" ? :-)

    (Luckily I've never programmed in COBOL, even after
    it allowed "COMPUTE c = a * b" (or some such).)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 09:02:02 2024
    On 23.01.2024 22:52, Lawrence D'Oliveiro wrote:
    On Tue, 23 Jan 2024 17:21:40 -0000 (UTC), Lew Pitcher wrote:

    Less importantly, it also violates the convention that C macros are
    named in upper case to distinguish them from keywords and "regular"
    identifiers.

    Why does C allow lowercase in macro names, then?

    It's the nature of "conventions" to take the place where there's no
    rule.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Kalevi Kolttonen on Wed Jan 24 09:10:55 2024
    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Kaz Kylheku on Wed Jan 24 09:09:05 2024
    On 23.01.2024 23:09, Kaz Kylheku wrote:
    [...]
    whereas in the shell:

    (((a && b) || c) && d)

    where I'm using "virtual parentheses". If you actually stick in real
    ones, they denote subshell execution in a separate process. (Bash
    allows curly braces for command grouping that doesn't create
    processes.)

    Make that "POSIX allows..." (it's standard behavior for POSIX shells).

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 09:15:43 2024
    On 24.01.2024 00:33, Lawrence D'Oliveiro wrote:
    On Tue, 23 Jan 2024 14:51:52 -0800, Keith Thompson wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 23 Jan 2024 17:21:40 -0000 (UTC), Lew Pitcher wrote:

    Less importantly, it also violates the convention that C macros are
    named in upper case to distinguish them from keywords and "regular"
    identifiers.

    Why does C allow lowercase in macro names, then?

    Because it's a convention, not a language rule.

    So what would one mean by “violate”, other than “I personally don’t like
    it”?

    I would interpret it in [professional] project contexts where more
    than one person is programming on the same project.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 09:34:24 2024
    On 24.01.2024 07:05, Lawrence D'Oliveiro wrote:
    On Tue, 23 Jan 2024 06:54:56 +0100, Janis Papanagnou wrote:

    ... in my professional contexts there were even [coding] standards
    defined that you had to follow.

    What about the open-source code that your company takes without paying? Do you demand that that code follow your rules as well? Do you send it back
    to the developers to demand they rewrite it for you?

    Since you're making up your comfort situations feel free to answer them yourself.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to James Kuyper on Wed Jan 24 09:28:11 2024
    On 24.01.2024 01:45, James Kuyper wrote:
    On 1/23/24 16:28, Scott Lurndal wrote:

    The term 'conditional and' has been in common use for decades.

    I've never heard of it. When searching Wikipedia, [...]

    In the context of Ada I read about "boolean operator" ('and')
    and "boolean shortcut operator" ('and then').

    In an Eiffel book (which oriented this operator choice on Ada)
    they speak about "non-commutative" operators when they mean e.g.
    the 'and then' sorts of operators.

    In my book the other formulation(s) mentioned in the thread are
    clear enough to understand (I did, at least), and unless one is
    just looking for an argument we should stop that fruitless discussion.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Keith Thompson on Wed Jan 24 09:41:31 2024
    On 24.01.2024 00:10, Keith Thompson wrote:

    The header was introduced to make it easier (or possible) to write C
    code on systems/keyboards that don't support certain characters like '&'
    and '|' -- similar to digraphs and trigraphs.

    I think this is the most likely explanation; the restricted _keyboards_
    (and not the restricted [ASCII] character set). Matches my experiences
    with old keyboards I used decades ago.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Kaz Kylheku on Wed Jan 24 10:03:03 2024
    On 23/01/2024 22:30, Kaz Kylheku wrote:
    On 2024-01-23, David Brown <david.brown@hesbynett.no> wrote:
    On 22/01/2024 21:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves readability:

    No. But I fully appreciate that this is personal preference and habit.

    I believe that some of the identifiers improve readability for people
    coming from a programming language which uses those English words for
    very similar operators rather than && and ||.

    In a green field programming language design, it's probably better
    to design that way from the start. It's a nice bonus if a language
    looks readable to newcomers.


    Agreed.

    Generations of C coders are used to && and || though; that's the normal
    way to write C. Using these aliases is a vanishingly rare practice. An important aspect of readability is writing code like everyone else. When
    a language is newly designed so that there isn't anyone else, that
    doesn't have to be considered.


    Agreed. (Although people coming to your new language probably have
    experience with some other languages, and you'll want to avoid being confusingly different from common existing languages. Choosing "&&" and
    "&" for "bitwise and" and "logical and" would be a bad idea for a new
    language, even if it arguably makes more sense based on the prevalence
    of these operator usages.)

    For that reason, these identifiers should not be used, except for machine-encoding of programs into a 6 bit character set.

    Additionally certain names in the iso646.h header are poorly considered,
    and obstruct readability. They use the _eq suffix for an operation that
    is assignment.

    #define and_eq &=

    If the purpose of this header were to optimize readability for those unfamiliar with C, this should be called

    #define and_set &=

    or similar.

    Or "bitmask" ?


    The assignment operator = should not be read "equals", but "becomes" or "takes the value" or "is assigned" or "is set to". This should be taken
    into consideration when coming up with word-like token or token fragment
    to represent it.

    Yes.


    Also note the following inconsistency:

    #define and &&
    #define bitand &
    #define and_eq &= // what happened to "bit"?

    This looks like and_eq should correspond to &&=, since and is &&,
    and bitand is &. &= wants to be bitand_eq.

    Clearly, the purpose of this header is to allow C to be written with the
    ISO 646 character set. The choices of identifiers do not look like
    evidence of readability having been highly prioritized.


    They are still better than trigraphs :-)
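
    For anyone who has never seen the header used, here is a minimal sketch of
    what the spellings look like in practice; the names used below (not_eq, and,
    and_eq, compl) are among the eleven macros the standard header defines:

    #include <iso646.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Count the characters in s that are neither spaces nor tabs. */
    static size_t count_visible(const char *s)
    {
        size_t n = 0;
        while (s not_eq NULL and *s not_eq '\0') {
            if (*s not_eq ' ' and *s not_eq '\t')
                n += 1;                   /* same as: if (*s != ' ' && *s != '\t') */
            s += 1;
        }
        return n;
    }

    int main(void)
    {
        unsigned flags = 0x5u;
        flags and_eq compl 0x4u;          /* same as: flags &= ~0x4u; leaves 0x1 */
        printf("%zu %u\n", count_visible("a b\tc"), flags);
        return 0;
    }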

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 14:14:39 2024
    On 24/01/2024 07:35, Lawrence D'Oliveiro wrote:
    On Tue, 23 Jan 2024 21:43:44 -0800, Keith Thompson wrote:

    In the Shift-JIS encoding, character 0x5C, which is the backslash in
    ASCII and Unicode, is the Yen sign. That means that if a C source file
    contains "Hello, world\n", viewing it as Shift-JIS makes it look like
    "Hello, world¥n", but a C compiler that treats its input as ASCII would
    see a backslash.

    So what exactly does iso646.h offer to deal with this?


    In Scandinavian language variants of ASCII, the | symbol was replaced by
    the letter ø or ö. The name "or" is a significant improvement over the "??!??!" trigraph spelling of ||.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Malcolm McLean on Wed Jan 24 13:00:16 2024
    On 24/01/2024 12:20, Malcolm McLean wrote:
    On 23/01/2024 18:34, bart wrote:
    On 23/01/2024 16:32, Malcolm McLean wrote:
    On 22/01/2024 20:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves
    readability:

    It breaks the rule that, in C, variables and functions are
    alphanumeric, whilst operators are symbols. sizeof is an exception,
    but a justified one. However it's harder to justify a symbol for
    "plus" but a word for "or".

    But it's OK to justify 'pow' for exponentiation?

    Mathematically operators are functions, so a mathematician would say that "add" is just as much of a function as "gamma". But to a computer
    programmer an operator compiles to a trivial number of machine code instructions, whilst a function is a subroutine call. Pow is not usually supported in hardware. However it's such a basic mathematical function
    that it has special notation. So some languages say it should be an
    operator. However ASCII won't represent the standard notation.

    Which is what? In mathematical notation the exponentiation operator is usually implied (a superscript), just as multiplication is implied by juxtaposition.

    Computer languages commonly use '**' or '^' for this operator.

    So there
    are good arguments for and against pow as an operator, and different languages take different views. But I think the C decision is better, as
    C code is for programming computers, not for translating formulae into machine-readable form.

    C's decision is possibly the worst. A proper built-in operator, say
    '**', can be overloaded to work on both ints and floats.

    If you do 'pow(a,3)' in C when 'a' is an integer, then it will convert
    it to a double, call the external function, and return a double result,
    which is likely to force neighbouring terms and operators to work in floating point too.

    Using 'a**3', that would probably be a call to an integer power
    function, but here it can also easily choose to do a*a*a.
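
    A minimal sketch of such an integer power helper (the name ipow is invented
    here, not a standard function), using exponentiation by squaring and doing
    no overflow checking:

    #include <stdio.h>

    static long long ipow(long long base, unsigned exp)
    {
        long long result = 1;
        while (exp > 0) {
            if (exp & 1u)
                result *= base;           /* take in this bit of the exponent */
            base *= base;
            exp >>= 1;
        }
        return result;
    }

    int main(void)
    {
        printf("%lld %lld\n", ipow(3, 4), ipow(-2, 5));   /* 81 -32 */
        return 0;
    }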

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Wed Jan 24 14:22:31 2024
    This post appears really odd to me in many respects (details below),
    presuming you mean it as formulated [as a general comment] (as
    opposed to, maybe, some "very specific" C view). Where did you get
    that view from?

    On 24.01.2024 13:20, Malcolm McLean wrote:

    Mathematically operators are functions, so a mathematician would say that "add" is just as much of a function as "gamma". But to a computer
    programmer an operator compiles to a trivial number of machine code instructions, whilst a function is a subroutine call.

    We generally don't know; neither what an operator compiles into, nor
    what a function compiles into, nor what the compilers and optimizers additionally do. And I'm not even asking who that ominous "computer
    programmer" you have in mind actually is.

    I do, for example with matrix a, b, c; c = a * b;
    (Or even more difficult tasks.) How should this matrix multiplication
    be more trivial than a counterpart written with functions?

    (Someone already mentioned upthread that functions and operators can
    be considered similar (or in practice probably even equal) things.
    Of course this is mostly correct, except that you also have to consider
    operator precedence in the second case, and that functions typically
    allow more parameters; most operators are dyadic or monadic, else
    you need some additional syntactic support.)

    Pow is not usually
    supported in hardware. However it's such a basic mathematical function
    that it has special notation. So some languages say it should be an
    operator. However ASCII won't represent the standard notation.

    Generally this operator is defined by one or more ASCII characters.

    So there
    are good arguments for and against pow as an operator, and different languages take different views.

    Yes, other languages support it (often in various ways; ** ^ 'up' ...)

    But I think the C decision is better, as
    C code is for programming computers,

    All the languages that I know to support an operator for power() are
    there "for programming computers"...

    not for translating formulae into
    machine readable form.

    ...and all translate formulas (as part of a program) into machine code.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Wed Jan 24 14:54:54 2024
    On 24/01/2024 13:20, Malcolm McLean wrote:
    On 23/01/2024 18:34, bart wrote:
    On 23/01/2024 16:32, Malcolm McLean wrote:
    On 22/01/2024 20:34, Lawrence D'Oliveiro wrote:
    On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:

    ... I don't use the matching names in C++ either ...

    I do, if/when I do use C++ and C. Don’t you think it improves
    readability:

    It breaks the rule that, in C, variables and functions are
    alphnumeric, whilst operators are symbols. sizeof is an exception,
    but a justified one. However it's harder to justify a symbol for
    "plus" but a word for "or".

    But it's OK to justify 'pow' for exponentiation?

    Mathematically operators are functions, so a mathematician would say that "add" is just as much of a function as "gamma".

    In mathematics, the term "operator" is usually used for functions where
    the domain itself involves functions - such as the Laplace
    transformation, or the integral operator. Addition is an /operation/,
    not an operator, and "+" is a /symbol/. Given an operation and a
    domain, you get a function - addition applied to real numbers is a
    function, distinct from addition applied to integers or complex numbers.

    But to a computer
    programmer an operator compiles to a trivial number of machine code instructions, whilst a function is a subroutine call.

    Not at all.

    People working in most high level languages do not think in terms of
    generated machine code at all, or in terms of subroutine calls. And C programmers who are looking closely enough at efficiency and generated
    code to be interested in the implementation details, should know that
    function calls do not equate to machine-code subroutine calls, nor do
    operators necessarily compile to small numbers of machine code instructions.

    Many operators in C are not mathematical operations. "sizeof" is an
    operator, so are indirection operators, structure member access
    operators, function calls, and the comma operator. These don't
    correspond to machine code instructions in any straightforward manner -
    nor do they match subroutine calls. They are specific parts of the
    syntax and grammar of the language, and can do things that functions
    cannot do in C.

    Pow is not usually
    supported in hardware. However it's such a basic mathematical function
    that it has special notation.

    There are vast numbers of things in mathematics that have special notation.

    Exponentiation is not particularly common in programming, except for a
    few special cases - easily written as "x * x", "x * x * x", "1.0 / x",
    or "sqrt(x)", which are normally significantly more efficient than a
    generic power function or operator would be.

    So some languages say it should be an
    operator. However ASCII won't represent the standard notation. So there
    are good arguments for and against pow as an operator, and different languages take different views. But I think the C decision is better, as
    C code is for programming computers, not for translating formulae into machine-readable form.


    That is not an argument against having an operator in C called "pow".
    It is simply not useful enough for there to be a benefit in adding it to
    the language as an operator, when it could easily be (and was) added as
    a function in the standard library.

    (It could not have been added as "**", because - as Keith said in
    another post - "x ** y" already has a meaning in C. While I believe it
    would be possible to distinguish the uses based on the type of "y",
    other than for the literal 0, having "x ** y" mean two /completely/
    different things depending on the type of "y" would not be a good idea
    for C.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to Janis Papanagnou on Wed Jan 24 14:17:04 2024
    On Wed, 24 Jan 2024 09:06:22 +0100, Janis Papanagnou wrote:

    On 23.01.2024 21:13, Keith Thompson wrote:
    [...] There's no hard rule that operators must be
    punctuation, just a general trend.)

    Anyone still writing "MULTIPLY a BY b GIVEN c" ? :-)

    ITYM

    MULTIPLY A BY B GIVING C.

    and, yes, COBOL programmers are still in demand, mostly by
    financial institutions that have hundreds of millions
    of lines of COBOL code to maintain.

    (Luckily I've never programmed in COBOL, even after
    it allowed "COMPUTE c = a * b" (or some such).)

    I have (lucky me :-) ).

    While I don't tout COBOL as the "be all and end all" of
    programming languages, it still can perform a lot of
    useful work, especially in fields where exact calculations
    are required and rounding and truncation of mathematical
    operations are well defined. Such as financial institutions.

    These days, it even supports object oriented code.
    FWIW, the last ISO COBOL language standard was issued in 2023.

    Are you certain that you want your taxes to be calculated in
    floatingpoint? ;-)


    --
    Lew Pitcher
    "In Skills We Trust"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 15:43:59 2024
    On 2024-01-24, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Tue, 23 Jan 2024 06:54:56 +0100, Janis Papanagnou wrote:

    ... in my professional contexts there were even [coding] standards
    defined that you had to follow.

    What about the open-source code that your company takes without paying? Do you demand that that code follow your rules as well? Do you send it back
    to the developers to demand they rewrite it for you?

    In my experience, when you patch third-party code, you closely adhere to
    *its* conventions, to keep it consistent.

    If you're a person who maintains a diverse open-source stack for an organization, you end up working with numerous coding conventions.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Janis Papanagnou on Wed Jan 24 16:56:58 2024
    On 2024-01-24, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    To be sure I had also re-inspected the ASCII character set and it
    seems that all C characters (including these operators) are anyway
    in the ASCII domain. It's beyond me why they've used the name
    "iso646.h".

    Because the macro names in that header are in the ISO 646
    invariant set, expanding to tokens that use characters outside
    of the invariant set.

    ISO 646 looks like an effort to standardize the "zoo" of regional ASCII variants.

    It defines a base character set which looks exactly like ASCII (correct
    me if I'm wrong) of which there are national variants. It's like a
    "mini ISO Latin" in 7 bits.

    The Wikipedia page on it is quite good.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lew Pitcher on Wed Jan 24 17:44:29 2024
    On 24.01.2024 15:17, Lew Pitcher wrote:
    On Wed, 24 Jan 2024 09:06:22 +0100, Janis Papanagnou wrote:

    ITYM

    MULTIPLY A BY B GIVING C.

    Memories are faint, and I consider it already pathological that
    I still recall (basically) the syntax that I've seen just once,
    many decades ago, and never programmed myself! ;-)


    and, yes, COBOL programmers are still in demand, mostly by
    financial institutions that have hundreds of millions
    of lines of COBOL code to maintain.

    Yeah, I've heard so.


    (Luckily I've never programmed in COBOL, even after
    it allowed "COMPUTE c = a * b" (or some such).)

    I have (lucky me :-) ).

    Hmm, okaaay... :-)


    While I don't tout COBOL as the "be all and end all" of

    Sounds prophetic like: "the end of all". LOL :-)

    programming languages, it still can perform a lot of
    useful work, especially in fields where exact calculations
    are required and rounding and truncation of mathematical
    operations are well defined. Such as financial institutions.

    Yes, sure.


    These days, it even supports object oriented code.
    FWIW, the last ISO COBOL language standard was issued in 2023.

    Yes, I had noticed (and was astonished) that its evolution
    still continues.


    Are you certain that you want your taxes to be calculated in
    floatingpoint? ;-)

    I'm not sure about that. Quite some years ago I actually worked
    (in the country where I live) on tax software for
    the finance authorities. At least the part I was involved was
    written in C++. But I cannot tell what the calculation modules
    were based on, floating point or else. Though I'm sure they've
    chosen a sophisticated calculation base. I doubt it was COBOL.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Kaz Kylheku on Wed Jan 24 18:07:03 2024
    On 24.01.2024 17:56, Kaz Kylheku wrote:
    On 2024-01-24, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    To be sure I had also re-inspected the ASCII character set and it
    seems that all C characters (including these operators) are anyway
    in the ASCII domain. It's beyond me why they've used the name
    "iso646.h".

    Because the macro names in that header are in the ISO 646
    invariant set, expanding to tokens that use characters outside
    of the invariant set.

    I forgot about the "national variants" and the, what was it called,
    IRV (International Reference Version, or some such)?


    ISO 646 looks like an effort to standardize the "zoo" of regional ASCII variants.

    It defines a base character set which looks exactly like ASCII (correct
    me if I'm wrong) of which there are national variants. It's like a
    "mini ISO Latin" in 7 bits.

    A common ASCII subset with specific code points around [ ^ ] | etc.

    The inherent problem is that many of the standard symbols of the C
    language were affected: [ ] ^ { } ~ | are in the IRV but (prevalently)
    not in the national variants.

    You cannot write legible C code with national non-IRV ASCII variants.
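
    For completeness: the same 1995 amendment that gave us <iso646.h> also added
    digraphs for the bracket and hash characters missing from many national
    variants, so a program confined to the invariant set is at least possible,
    if hardly legible. A small sketch using digraphs wherever # [ ] { } would
    otherwise appear:

    %:include <stdio.h>

    int main(void)
    <%
        int a<:3:> = <% 1, 2, 3 %>;       /* digraphs for [ ] and { } */
        printf("%d\n", a<:0:> + a<:2:>);  /* prints 4 */
        return 0;
    %>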

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Kaz Kylheku on Wed Jan 24 18:12:38 2024
    On 24.01.2024 17:56, Kaz Kylheku wrote:

    The Wikipedia page on it is quite good.

    The German Wikipedia has a table that is more legible, IMO: https://de.wikipedia.org/wiki/ISO_646

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Wed Jan 24 17:27:46 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 24.01.2024 15:17, Lew Pitcher wrote:
    On Wed, 24 Jan 2024 09:06:22 +0100, Janis Papanagnou wrote:

    programming languages, it still can perform a lot of
    useful work, especially in fields where exact calculations
    are required and rounding and truncation of mathematical
    operations are well defined. Such as financial institutions.

    Yes, sure.

    On one of the Burroughs mainframe lines, the disk
    defragmentation utility (called SQUASH) was written
    in COBOL68.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Janis Papanagnou on Wed Jan 24 12:24:32 2024
    On 1/24/24 03:10, Janis Papanagnou wrote:
    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Could you explain what surprises you about that statement? As quoted,
    it's a general statement which includes C: "Except as specified later,
    side effects and value computations of subexpressions are unsequenced."

    Now, logical-AND and logical-OR are two cases where the order of
    evaluation is, in fact, specified. Are you expressing surprise that
    there are other languages where that's not the case? I can't remember
    where, but I'm fairly sure I've seen a language where the closest
    equivalent of C's (expression1 && expression2) causes both
    sub-expressions to be evaluated, in an arbitrary order, before
    evaluating the equivalent of && itself. Unfortunately, I don't remember
    where.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to James Kuyper on Wed Jan 24 18:38:38 2024
    On 24.01.2024 18:24, James Kuyper wrote:
    On 1/24/24 03:10, Janis Papanagnou wrote:
    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Could you explain what surprises you about that statement?

    It sounds so wrong, not matching anything I've experienced in the
    programming languages I know of or in compiler construction, that I
    can only express my astonishment about such a statement. The
    poster's statement itself is not explained, though, and if anything,
    the poster should first explain what makes him "pretty sure" about
    it before we can exchange arguments.

    I may have been lucky that such a fundamental property as operational
    semantics defining evaluation order has been part of all languages I
    met, especially in the context of expressions connected by operators
    we were speaking about here.

    [...]

    Now, logical-AND and logical-OR are two cases where the order of
    evaluation is, in fact, specified. Are you expressing surprise that
    there are other languages where that's not the case? I can't remember
    where, but I'm fairly sure I've seen a language where the closest
    equivalent of C's (expression1 && expression2) causes both
    sub-expressions to be evaluated, in an arbitrary order, before
    evaluating the equivalent of && itself. Unfortunately, I don't remember where.

    In functional languages without side effects it might not be an issue.

    The closest I came across were theoretical constructs (like, e.g.,
    Dijkstra's guarded commands, or whatever they were called) in per se
    non-deterministic contexts.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kalevi Kolttonen@21:1/5 to Janis Papanagnou on Wed Jan 24 17:49:16 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    It sounds so wrong, not matching anything I've experienced in the
    programming languages I heard about and about compiler construction
    that I can only express my astonishment about such a statement. The
    poster's statement itself is not explained, though, and if anything,
    the poster should first explain what makes him "pretty sure" about
    it before we can exchange arguments.

    Well, I said I was "pretty sure" simply because there are probably
    hundreds if not thousands of programming languages out there.

    It seems rather likely to me that not all of them have
    C-like properties.

    I suppose it is possible that in some languages out-of-order
    evaluation could be what happens, e.g. the "logical AND" operands
    could be evaluated in parallel by different CPUs.

    But admittedly I cannot give a concrete example.

    br,
    KK

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Janis Papanagnou on Wed Jan 24 18:42:45 2024
    On 24.01.2024 18:38, Janis Papanagnou wrote:
    [...]

    The closest I came across were theoretical constructs (like, e.g.,
    Dijkstra's guarded commands, or whatever they were called) in per se
    non-deterministic contexts.

    Ah, I forgot; and I think also in Intercal... - wasn't (at least) the "politeness check" non-deterministic? - ...in case it matters. :-)


    Janis


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kalevi Kolttonen@21:1/5 to Kalevi Kolttonen on Wed Jan 24 18:40:08 2024
    Kalevi Kolttonen <kalevi@kolttonen.fi> wrote:
    I suppose it is possible that in some languages out-of-order
    evaluation could be what happens, e.g. the "logical AND" operands
    could be evaluated in parallel by different CPUs.

    I did some searching and finally found a suitable result.
    According to:

    https://en.wikipedia.org/wiki/Logical_disjunction

    "Logical disjunction is usually short-circuited; that is,
    if the first (left) operand evaluates to true, then the
    second (right) operand is not evaluated. The logical
    disjunction operator thus usually constitutes a
    sequence point.

    In a parallel (concurrent) language, it is possible to
    short-circuit both sides: they are evaluated in
    parallel, and if one terminates with value true, the
    other is interrupted. This operator is thus called
    the parallel or."

    So that describes "parallel logical OR". If this is
    possible in a parallel language, then surely
    "parallel logical AND" is also doable because in
    that case you can terminate the evaluation when
    one of the operands turns out to be false.
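
    In C itself the evaluation is strictly sequential and short-circuiting,
    which is exactly what lets the left operand guard the right one; a minimal
    illustration:

    #include <stdio.h>
    #include <string.h>

    /* strlen(s) is never evaluated when s is a null pointer, because &&
       evaluates left to right and stops as soon as the result is known. */
    static int is_long_name(const char *s)
    {
        return s != NULL && strlen(s) > 8;
    }

    int main(void)
    {
        printf("%d %d\n", is_long_name(NULL), is_long_name("disjunction"));   /* 0 1 */
        return 0;
    }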

    The Wikipedia article seems to assume that these
    logical operators take just two operands, but as
    we know from some languages (LISP?), AND or OR can
    be generalized to accept any number of operands.

    br,
    KK

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Janis Papanagnou on Wed Jan 24 19:32:45 2024
    On 2024-01-24, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 24.01.2024 00:10, Keith Thompson wrote:

    The header was introduced to make it easier (or possible) to write C
    code on systems/keyboards that don't support certain characters like '&'
    and '|' -- similar to digraphs and trigraphs.

    I think this is the most likely explanation; the restricted _keyboards_
    (and not the restricted [ASCII] character set). Matches my experiences
    with old keyboards I used decades ago.

    Well, keyboards and displays. Your keyboard has something other than a |
    key, and when you type that, you get a character similar to the one
    on the keyboard. They happen to have the same character code as the
    ASCII/ISO 646 base character |. When you read someone's source code,
    you see that character where the code has | and ||.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Wed Jan 24 19:45:58 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 24/01/2024 13:54, David Brown wrote:
    On 24/01/2024 13:20, Malcolm McLean wrote:

    Many operators in C are not mathematical operations.  "sizeof" is an
    operator, so are indirection operators, structure member access
    operators, function calls, and the comma operator.

    I've discussed this ad infinitum with people who don't really understand
    what the term "function" means. Anththing that maps one set to another
    set such that there is one and only one mapping from each member if the >struture set to the result set is mathematically a "function".
    Sizeof clearly counts.

    Just how many angels do you think can dance on that pin?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kalevi Kolttonen@21:1/5 to Malcolm McLean on Wed Jan 24 19:48:32 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
    On 24/01/2024 13:54, David Brown wrote:
    On 24/01/2024 13:20, Malcolm McLean wrote:

    Many operators in C are not mathematical operations.  "sizeof" is an
    operator, so are indirection operators, structure member access
    operators, function calls, and the comma operator.

    I've discussed this ad infinitum with people who don't really understand
    what the term "function" means. Anththing that maps one set to another
    set such that there is one and only one mapping from each member if the struture set to the result set is mathematically a "function".
    Sizeof clearly counts.

    Yes, indeed you are right *if* one accepts that the
    mathematical meaning is the only acceptable one.

    However, the meanings in mathematics and in programming
    languages can be different.

    For example, despite that some people still
    complain that "=" should not be used for assignment
    because in mathematics it means equality. It is
    just that the meanings of "=" are different in different
    contexts.

    One can easily argue that sizeof() is not a function
    in C because you cannot use function pointers to
    refer to it. With C's meaning of functions, you
    always can.

    So it is not just a matter of mapping elements
    of two sets.
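
    A small sketch of that distinction: a library function such as pow has an
    address that can be stored in a pointer and called through it, while sizeof,
    being an operator, has no address at all:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* pow is a real function: it has an address we can store and call. */
        double (*fp)(double, double) = pow;
        printf("%f\n", fp(2.0, 10.0));        /* 1024.000000 */

        /* sizeof is an operator, not a function; a line such as
               size_t (*sp)(int) = sizeof;
           does not compile, which is one way C separates the two notions. */
        printf("%zu\n", sizeof(double));
        return 0;
    }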

    br,
    KK

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Wed Jan 24 19:52:58 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 23/01/2024 21:51, Lawrence D'Oliveiro wrote:
    On Tue, 23 Jan 2024 16:32:09 +0000, Malcolm McLean wrote:

    It breaks the rule that, in C, variables and functions are alphanumeric,
    whilst operators are symbols.

    Where is there such a “rule”?

    Valid function names have to begin with an alphabetical symbol or
    (annoyingly for me) an underscore, as do variables. They may not contain non-alphanumeric symbols except for underscore.

    Dollar symbol ($) is an allowed extension.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kalevi Kolttonen@21:1/5 to Keith Thompson on Wed Jan 24 20:03:11 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 24.01.2024 18:24, James Kuyper wrote:
    On 1/24/24 03:10, Janis Papanagnou wrote:
    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Could you explain what surprises you about that statement?

    It sounds so wrong, not matching anything I've experienced in the
    programming languages I heard about and about compiler construction
    that I can only express my astonishment about such a statement. The
    poster's statement itself is not explained, though, and if anything,
    the poster should first explain what makes him "pretty sure" about
    it before we can exchange arguments.

    A concrete example:

    #include <stdio.h>

    static int count(void) {
        static int result = 0;
        return ++result;
    }

    int main(void) {
        printf("%d %d %d\n", count(), count(), count());
        return 0;
    }

    C does not specify the order in which the arguments are evaluated
    (likewise for operands of most operators). This program could produce
    any of 6 possible outputs, at the whim of the compiler. (On my system,
    I see "3 2 1" with gcc and "1 2 3" with clang; both are perfectly
    valid.)

    I'm surprised that that surprises you. It's a fairly fundamental
    property of C (and also of C++).

    [...]

    I believe Janis knows what you are saying. The object of
    discussion was logical operators, not the fact that C's
    function arguments have no guaranteed order of evaluation.

    However, I already posted a link to Wikipedia that is
    on-topic and shows that parallel logical OR and parallel
    logical AND are available in some parallel/concurrent
    programming languages.

    br,
    KK

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to James Kuyper on Wed Jan 24 20:14:19 2024
    On Wed, 24 Jan 2024 12:24:32 -0500, James Kuyper wrote:

    As quoted, it's a general statement which includes C: "Except as
    specified later, side effects and value computations of subexpressions
    are unsequenced."

    It always seemed to me that explicitly specifying an order removes the
    potential for parallelization on hardware which might allow it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Wed Jan 24 20:11:33 2024
    On Wed, 24 Jan 2024 07:58:44 -0800, Keith Thompson wrote:

    Trigraphs, digraphs, and <iso646.h> were all introduced to support
    systems that *don't* support the full ASCII character set.

    Where is there a national character set that doesn’t support the symbols
    for which iso646.h introduces synonyms?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Wed Jan 24 20:19:27 2024
    On Wed, 24 Jan 2024 14:14:39 +0100, David Brown wrote:

    On 24/01/2024 07:35, Lawrence D'Oliveiro wrote:

    On Tue, 23 Jan 2024 21:43:44 -0800, Keith Thompson wrote:

    In the Shift-JIS encoding, character 0x5C, which is the backslash in
    ASCII and Unicode, is the Yen sign. That means that if a C source
    file contains "Hello, world\n", viewing it as Shift-JIS makes it look
    like "Hello, world¥n", but a C compiler that treats its input as ASCII
    would see a backslash.

    So what exactly does iso646.h offer to deal with this?

    In Scandinavian language variants of ASCII ...

    Relevance to Shift-JIS being?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Kalevi Kolttonen on Wed Jan 24 20:15:43 2024
    On Wed, 24 Jan 2024 18:40:08 -0000 (UTC), Kalevi Kolttonen wrote:

    "Logical disjunction is usually short-circuited ...

    I wonder why that shouldn’t apply to anything else. E.g. in

    a × (b + c)

    if “a” evaluates to zero, why not avoid the computation of “b + c” and just return zero as the value of the expression?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Lew Pitcher on Wed Jan 24 20:21:37 2024
    On Wed, 24 Jan 2024 14:17:04 -0000 (UTC), Lew Pitcher wrote:

    and, yes, COBOL programmers are still in demand, mostly by financial institutions that have hundreds of millions of lines of COBOL code to maintain.

    I suspect a lot of those institutions have already gone out of business,
    or are close to going out of business. And the amounts they have to pay
    COBOL programmers to maintain their code are hastening that end.

    Are you certain that you want your taxes to be calculated in
    floatingpoint? ;-)

    How else would you handle compound interest?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 20:25:00 2024
    On Wed, 24 Jan 2024 20:11:33 +0000, Lawrence D'Oliveiro wrote:

    On Wed, 24 Jan 2024 07:58:44 -0800, Keith Thompson wrote:

    Trigraphs, digraphs, and <iso646.h> were all introduced to support
    systems that *don't* support the full ASCII character set.

    Where is there a national character set that doesn’t support the symbols for which iso646.h introduces synonyms?

    EBCDIC-US, for one. It lacks the CIRCUMFLEX (^) character.



    --
    Lew Pitcher
    "In Skills We Trust"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 20:31:05 2024
    On Wed, 24 Jan 2024 20:21:37 +0000, Lawrence D'Oliveiro wrote:

    On Wed, 24 Jan 2024 14:17:04 -0000 (UTC), Lew Pitcher wrote:

    and, yes, COBOL programmers are still in demand, mostly by financial
    institutions that have hundreds of millions of lines of COBOL code to
    maintain.

    I suspect a lot of those institutions have already gone out of business,
    or are close to going out of business.

    And who do /you/ bank with? Certainly, in Canada, our major banks are still going strong, and they /all/ use COBOL.

    And the amounts they have to pay COBOL programmers

    COBOL programmers are being offered hourly rates at about $90 CAD or more.
    (I checked this morning). OTOH, C++ programmers (for instance) are being offered hourly rates of $45 CAD or thereabouts. (Again, this morning's
    offers on Indeed.com).

    to maintain their code are hastening that end.



    Are you certain that you want your taxes to be calculated in
    floatingpoint? ;-)

    How else would you handle compound interest?

    Fixed point arithmetic, of course. BTW, that's /not/ "integer"
    arithmetic.


    --
    Lew Pitcher
    "In Skills We Trust"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Keith Thompson on Wed Jan 24 20:50:12 2024
    On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    (It could not have been added as "**", because - as Keith said in
    another post - "x ** y" already has a meaning in C. While I believe
    it would be possible to distinguish the uses based on the type of "y",
    other than for the literal 0, having "x ** y" mean two /completely/
    different things depending on the type of "y" would not be a good idea
    for C.)

    The problem with a "**" exponentation operator is lexical. It's common
    to have two consecutive unary "*" operators in declarations and
    expression:
    char **argv;
    char c = **argv;

    Clearly, then, the way forward with this ** operator is to wait for the
    C++ people to do the unthinkable, and reluctantly copy it some years
    later.

    Ya know, like what they did with stacked template closers, which are
    already the >> operator.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Wed Jan 24 21:55:45 2024
    On 24/01/2024 21:23, Keith Thompson wrote:
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Wed, 24 Jan 2024 07:58:44 -0800, Keith Thompson wrote:
    Trigraphs, digraphs, and <iso646.h> were all introduced to support
    systems that *don't* support the full ASCII character set.

    Where is there a national character set that doesn’t support the symbols >> for which iso646.h introduces synonyms?

    Just one example: <https://en.wikipedia.org/wiki/Code_page_1016> has 'ø'
    in the slot that ASCII uses for '|'.

    I don't believe it's in common use today, but it may have been in 1995.


    7-bit modified ASCII sets like this were definitely used for a while.
    They were replaced by 8-bit extended ASCII sets, then UTF-8 (though the
    Latin-1 and Latin-9 sets are still in common use, especially on
    Windows). Both these 8-bit sets and UTF-8 have 7-bit ASCII as a subset,
    and thus have no problems with C.

    In 1995 I was new in Norway, and only spoke a little Norwegian, so I
    used UK English when programming. I believe most Norwegian programmers
    also used either US or UK English - being a relatively small country (in population), almost everything technical here is in English. So C
    character sets were never a problem for me when writing code. But
    dot-matrix printers were often set to particular international sets - it
    was not uncommon for code printouts to have Norwegian letters in place
    of ASCII symbols.

    It might be interesting to hear from any native Germans who were
    programming C at that time. Germany is big enough that people
    programmed in German (so comments would be in German, for example), and
    their 7-bit ASCII variant (Code page 1011) also had accented letters in
    place of some symbols used by C - including "|".

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Keith Thompson on Wed Jan 24 21:00:10 2024
    On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> writes:
    [...]
    These days, it even supports object oriented code.
    FWIW, the last ISO COBOL language standard was issued in 2023.

    ADD 1 TO COBOL GIVING COBOL

    Oh, oh, I have a new one to this oldie:

    ADD 100 TO PITCH OF COBOL

    (100 cents in a semitone.)

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 22:01:38 2024
    On 24/01/2024 21:15, Lawrence D'Oliveiro wrote:
    On Wed, 24 Jan 2024 18:40:08 -0000 (UTC), Kalevi Kolttonen wrote:

    "Logical disjunction is usually short-circuited ...

    I wonder why that shouldn’t apply to anything else. E.g. in

    a × (b + c)

    if “a” evaluates to zero, why not avoid the computation of “b + c” and
    just return zero as the value of the expression?

    Compilers both can and do use such optimisations:

    <https://godbolt.org/z/bhPMo8WPW>

    int test(int a, int b, int c) {
        if (a == 0) {
            return a * (b + c);
        } else {
            return a * (b + c);
        }
    }

    compiles (with gcc -O2) to :

    test:
        mov     eax, edi
        test    edi, edi
        je      .L2
        add     esi, edx
        imul    eax, esi
    .L2:
        ret


    As long as there are no side-effects evaluating "b" and "c", then if the compiler knows "a" is 0, it can return 0 without doing the sums.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Janis Papanagnou on Wed Jan 24 21:08:21 2024
    On 2024-01-24, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 24.01.2024 17:56, Kaz Kylheku wrote:

    The Wikipedia page on it is quite good.

    The German Wikipedia has a table that is more legible, IMO: https://de.wikipedia.org/wiki/ISO_646

    That character 0x24 is funny. Every country listed in that table
    just has it as $. But the IRV has to have ¤?

    Who the hell needs a symbol indicating unspecified currency?

    How about one for unspecified temperature units?

    100°B ("degrees bullshit")

    Is that freezing? Room temperature? Solder-melting? Who knows ...

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to James Kuyper on Wed Jan 24 21:11:31 2024
    On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    On 1/24/24 03:10, Janis Papanagnou wrote:
    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Could you explain what surprises you about that statement? As quoted,
    it's a general statement which includes C: "Except as specified later,
    side effects and value computations of subexpressions are unsequenced."

    Pretty much any language has to guarantee *something* about
    order of evaluation, somewhere.

    Like for instance that calculating output is not possible before a
    needed input is available.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 22:09:33 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Wed, 24 Jan 2024 14:17:04 -0000 (UTC), Lew Pitcher wrote:

    and, yes, COBOL programmers are still in demand, mostly by financial
    institutions that have hundreds of millions of lines of COBOL code to
    maintain.

    I suspect a lot of those institutions have already gone out of business,
    or are close to going out of business. And the amounts they have to pay
    COBOL programmers to maintain their code are hastening that end.

    Are you certain that you want your taxes to be calculated in
    floatingpoint? ;-)

    How else would you handle compound interest?

    Fixed point arithmetic, of course. E.g. for
    currency, work with 'mils' and round (using
    bankers rounding) the result to the desired
    precision (e.g. cents).
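
    A toy sketch of that scheme in C: amounts are held as integer mils
    (1000 mils to the currency unit) and rounded half-to-even ("bankers
    rounding") when converted to cents. The function name is invented for the
    example, and negative amounts are not handled:

    #include <stdio.h>

    /* Round a non-negative amount in mils to cents, half-to-even. */
    static long long mils_to_cents(long long mils)
    {
        long long cents = mils / 10;
        long long rem   = mils % 10;
        if (rem > 5 || (rem == 5 && cents % 2 != 0))
            cents += 1;
        return cents;
    }

    int main(void)
    {
        /* 12.345 rounds to the even 1234 cents; 12.355 rounds up to 1236. */
        printf("%lld %lld\n", mils_to_cents(12345), mils_to_cents(12355));
        return 0;
    }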

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Wed Jan 24 21:20:04 2024
    On 2024-01-24, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Wed, 24 Jan 2024 07:58:44 -0800, Keith Thompson wrote:

    Trigraphs, digraphs, and <iso646.h> were all introduced to support
    systems that *don't* support the full ASCII character set.

    Where is there a national character set that doesn’t support the symbols for which iso646.h introduces synonyms?

    See table in German Wikipedia page found by Janis

    https://de.wikipedia.org/wiki/ISO_646#Aufbau

    Let me reproduce that here:

    ISO 646-IRV     # ¤ @ [ \ ] ^ ` { | } ~
    Deutschland     # $ § Ä Ö Ü ^ ` ä ö ü ß
    Schweiz         ù $ à é ç ê î ô ä ö ü û
    USA (ASCII)     # $ @ [ \ ] ^ ` { | } ~
    Großbritannien  £ $ @ [ \ ] ^ ` { | } ~
    Frankreich      £ $ à ° ç § ^ ` é ù è ¨
    Kanada          # $ à â ç ê î ô é ù è û
    Finnland        # $ @ Ä Ö Å Ü é ä ö å ü
    Norwegen        # $ @ Æ Ø Å ^ ` æ ø å ~
    Schweden        # $ É Ä Ö Å Ü é ä ö å ü
    Italien         £ $ § ° ç é ^ ù à ò ù ì
    Niederlande     £ $ ¾ ÿ ½ | ^ ` ¨ ƒ ¼ ´
    Spanien         £ $ § ¡ Ñ ¿ ^ ` ° ñ ç ~
    Portugal        # $ @ Ã Ç Õ ^ ` ã ç õ ~

    These are 7 bit codes, that are essentially variations on ASCII,
    distinct from ISO-8859 (a.k.a. "ISO Latin").

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Kaz Kylheku on Wed Jan 24 22:13:06 2024
    Kaz Kylheku <433-929-6894@kylheku.com> writes:
    On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> writes:
    [...]
    These days, it even supports object oriented code.
    FWIW, the last ISO COBOL language standard was issued in 2023.

    ADD 1 TO COBOL GIVING COBOL

    Oh, oh, I have a new one to this oldie:

    ADD 100 TO PITCH OF COBOL

    (100 cents in a semitone.)

    ?LI SYSTEM/OPERATOR
    ?COMPILE STREK COBOL LIB MEM + 300.
    ?DATA CARD
    $SET CODE
    IDENTIFICATION DIVISION.
    PROGRAM-ID. STREK.
    AUTHOR. KURT WILHELM.
    INSTALLATION. OAKLAND UNIVERSITY.
    DATE-WRITTEN. COMPLETED SEPTEMBER 1, 1979.
    *
    *******************************************************
    * STAR_TREK SIMULATES AN OUTER SPACE ADVENTURE GAME *
    * ON A REMOTE TERMINAL. THE USER COMMANDS THE U.S.S. *
    * ENTERPRISE, AND THRU VARIOUS OFFENSIVE AND DEFEN- *
    * SIVE COMMANDS, TRAVELS THROUGHOUT THE GALAXY ON A *
    * MISSION TO DESTROY ALL KLINGONS, WHICH ALSO MANEU- *
    * VER AND FIRE ON THE ENTERPRISE. *
    *******************************************************
    *

    ENVIRONMENT DIVISION.
    CONFIGURATION SECTION.
    SOURCE-COMPUTER. V-380.
    OBJECT-COMPUTER. V-300.

    DATA DIVISION.
    WORKING-STORAGE SECTION.
    01 EOF-FLAG PIC X VALUE "N".
    01 STAR-TABLE.
    05 ROW OCCURS 42 TIMES.
    10 KOLUMN PIC X OCCURS 42 TIMES.
    01 RCTR PIC 99.
    01 KCTR PIC 99.
    01 COMMANDS-X.
    05 COMMAND PIC X(3).
    88 NAVIGATE VALUE "NAV".
    88 PHASERS VALUE "PHA".
    88 TORPEDO VALUE "TOR".
    88 SHIELDS VALUE "DEF".
    88 DOCK VALUE "DOC".
    88 LIB-COM VALUE "COM".
    88 NAV-C VALUE "NAV".
    88 PHA-C VALUE "PHA".
    88 TOR-C VALUE "TOR".
    88 DEF-C VALUE "DEF".
    88 DOC-C VALUE "DOC".
    88 COM-C VALUE "COM".
    05 ENTRY1 PIC 9.
    05 ENTRY2 PIC 9.
    01 MINI-TABLE.
    05 MROW OCCURS 14 TIMES.
    10 MCOL PIC X OCCURS 14 TIMES.
    01 RCNTR PIC 99.
    01 KCNTR PIC 99.
    01 X PIC 999.
    01 Y PIC 999.
    01 WS-DATE PIC 9(4) COMP.
    01 TIME-FLAG PIC 9.
    88 TIME-FLAG-SET VALUE 1.
    01 MAX-NO PIC 999.
    01 HQ1 PIC 9.
    01 HQ2 PIC 9.
    01 T-STORE PIC 9(4) COMP.
    01 ATTACK-FLAG PIC 9.
    88 KLINGONS-ATTACKING VALUE 1.
    ...
    MOVE 0 TO TOO-LATE-FLAG.
    DISPLAY " ".
    DISPLAY " *STAR TREK* ".
    DISPLAY " ".
    DISPLAY "CONGRATULATIONS - YOU HAVE JUST BEEN APPOINTED ".
    DISPLAY "CAPTAIN OF THE U.S.S. ENTERPRISE. ".
    DISPLAY " ".
    DISPLAY "PLEASE ENTER YOUR NAME, CAPTAIN ".
    ACCEPT NAME-X.
    DISPLAY "AND YOUR SKILL LEVEL (1-4)? ".
    ACCEPT SKILL-LEV.
    IF SKILL-LEV NOT NUMERIC OR SKILL-LEV < 1 OR SKILL-LEV > 4
    DISPLAY "INVALID SKILL LEVEL "
    DISPLAY "ENTER YOUR SKILL LEVEL (1-4) "
    ACCEPT SKILL-LEV
    IF SKILL-LEV NOT NUMERIC OR SKILL-LEV < 1 OR SKILL-LEV >
    - 4
    MOVE 1 TO SKILL-LEV
    DISPLAY "YOUR SKILL LEVEL MUST BE 1 ".
    MOVE 0 TO VAB5.
    MOVE 0 TO VAB6.
    INSPECT NAME-X TALLYING VAB6 FOR ALL "A".
    INSPECT NAME-X TALLYING VAB6 FOR ALL "E".
    ADD 1 TO VAB6.
    INSPECT NAME-X TALLYING VAB5 FOR ALL " ".
    COMPUTE VAB6 ROUNDED = (VAB5 / 1.75) + (VAB6 / SKILL-LEV).
    COMPUTE K-OR ROUNDED = (SKILL-LEV * 4) + VAB6 + 5.
    COMPUTE VAB1 = 9 - SKILL-LEV.
    COMPUTE VAB2 ROUNDED = (SKILL-LEV / 3) * K-OR.
    MOVE K-OR TO KLINGONS.
    MOVE VAB1 TO VAE1.
    ACCEPT WS-TIME FROM TIME.
    MOVE WS-MIN OF WS-TIME TO DS-MIN.
    MOVE WS-SEC OF WS-TIME TO DS-SEC.
    MOVE DS-TABLE TO S-DATE.
    ADD 16 TO DS-MIN.
    IF DS-MIN > 59
    MOVE 1 TO TIME-FLAG
    ELSE
    MOVE 0 TO TIME-FLAG.
    MOVE DS-TABLE TO DS-DATE.
    ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lew Pitcher on Wed Jan 24 23:20:02 2024
    On 2024-01-24, Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    Are you certain that you want your taxes to be calculated in
    floatingpoint? ;-)

    I've never calculated my taxes in anything but floating-point.

    The Canada Revenue Agency will not refund or charge amounts less than a
    couple of dollars so even if floating-point introduced errors (which it doesn't) it wouldn't matter.

    Ordinary personal accounting, and small business accounting, can be
    done entirely in IEEE 754 double, if used correctly. Or else using
    integers for the ledger values, and taking a trip to floating point
    for percentage calculations and such, which get rounded back to the
    rational representation.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Lew Pitcher on Thu Jan 25 00:17:01 2024
    On Wed, 24 Jan 2024 20:25:00 -0000 (UTC), Lew Pitcher wrote:

    On Wed, 24 Jan 2024 20:11:33 +0000, Lawrence D'Oliveiro wrote:

    Where is there a national character set that doesn’t support the
    symbols for which iso646.h introduces synonyms?

    EBCDIC-US, for one. It lacks the CIRCUMFLEX (^) character.

    Were any of the EBCDICs official standards anywhere in the world, outside
    of IBM?

    Thinking about what the “A” in “ASCII” stands for ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Thu Jan 25 00:30:07 2024
    On Wed, 24 Jan 2024 19:33:09 +0000, Malcolm McLean wrote:

    I've discussed this ad infinitum with people who don't really understand
    what the term "function" means. Anththing that maps one set to another
    set such that there is one and only one mapping from each member if the struture set to the result set is mathematically a "function".

    Sizeof clearly counts.

    It does in the mathematical sense. But in the C sense, a “function” is a block of code which is called at runtime with zero or more arguments and returns a result (which might be void). It can also have side-effects on
    the machine state.

    It helps the discussion to be clear what your terms mean. Otherwise the
    people you are arguing with have a right to be indignant at what they
    might perceive to be wilful obtuseness.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to Lawrence D'Oliveiro on Thu Jan 25 00:50:41 2024
    On Thu, 25 Jan 2024 00:17:01 +0000, Lawrence D'Oliveiro wrote:

    On Wed, 24 Jan 2024 20:25:00 -0000 (UTC), Lew Pitcher wrote:

    On Wed, 24 Jan 2024 20:11:33 +0000, Lawrence D'Oliveiro wrote:

    Where is there a national character set that doesn’t support the
    symbols for which iso646.h introduces synonyms?

    EBCDIC-US, for one. It lacks the CIRCUMFLEX (^) character.

    Were any of the EBCDICs official standards anywhere in the world, outside
    of IBM?

    Who cares? The better questions would be:
    - Are there C compilers for IBM mainframe systems that use EBCDIC?
    Yes, indeed there are.
    - Is IBM represented on the ISO C Standards committee?
    Yes, it is.

    Thinking about what the “A” in “ASCII” stands for ...

    Thinking of what the "E" in "ECMA-6" stands for
    (https://ecma-international.org/publications-and-standards/standards/ecma-6/) :-)

    --
    Lew Pitcher
    "In Skills We Trust"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Keith Thompson on Thu Jan 25 00:52:33 2024
    On 24/01/2024 16:27, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    (It could not have been added as "**", because - as Keith said in
    another post - "x ** y" already has a meaning in C. While I believe
    it would be possible to distinguish the uses based on the type of "y",
    other than for the literal 0, having "x ** y" mean two /completely/
    different things depending on the type of "y" would not be a good idea
    for C.)

    The problem with a "**" exponentation operator is lexical. It's common
    to have two consecutive unary "*" operators in declarations and
    expression:
    char **argv;
    char c = **argv;
    Adding a "**" operator would have made the above invalid due to the
    "maximal munch" rule, before the type of the argument is even
    considered.

    See also x+++++y, which might be intended as x++ + ++y, but is scanned
    as x ++ ++ + y, a syntax error.

    C could have added "**" very early, but then we'd have to write
    "* *argv" or "*(*argv)".

    Given the other syntax decisions, probably the best bet would have been
    to have a named /operator/ called 'pow'. With two operands, using function-like syntax 'pow(x, y)' would look better than 'x pow y'.

    This could then be defined over float, double, int and friends.

    You wouldn't be able to have a reference to it as you can with a
    function, but why is that so important? You can't have a reference to the multiply operator either!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Thu Jan 25 00:26:04 2024
    On Wed, 24 Jan 2024 19:52:58 GMT, Scott Lurndal wrote:

    Dollar symbol ($) is an allowed extension.

    I wonder if we have DEC to thank for that ... ?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Thu Jan 25 01:25:15 2024
    On 2024-01-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Wed, 24 Jan 2024 19:52:58 GMT, Scott Lurndal wrote:

    Dollar symbol ($) is an allowed extension.

    I wonder if we have DEC to thank for that ... ?

    Perhaps. You have to follow the money to find out where that came from.


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Thu Jan 25 01:23:11 2024
    On 2024-01-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Wed, 24 Jan 2024 20:25:00 -0000 (UTC), Lew Pitcher wrote:

    On Wed, 24 Jan 2024 20:11:33 +0000, Lawrence D'Oliveiro wrote:

    Where is there a national character set that doesn’t support the
    symbols for which iso646.h introduces synonyms?

    EBCDIC-US, for one. It lacks the CIRCUMFLEX (^) character.

    Were any of the EBCDICs official standards anywhere in the world, outside
    of IBM?

    Let's make a song!

    (To the tune of Mozart, K265).

    A, B, C, D, E-F-G-H, I

    dead-space plus/minus, J, K, L-M-N-O-Pie

    R, dead-space, tilde, S, T-U-V

    A-I-X-Sux, Y use System Z?

    I almost know my ebsy-dickee-dee.

    Sing "To hell with IBM!" with me.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Thu Jan 25 01:33:19 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Wed, 24 Jan 2024 20:25:00 -0000 (UTC), Lew Pitcher wrote:

    On Wed, 24 Jan 2024 20:11:33 +0000, Lawrence D'Oliveiro wrote:

    Where is there a national character set that doesn’t support the
    symbols for which iso646.h introduces synonyms?

    EBCDIC-US, for one. It lacks the CIRCUMFLEX (^) character.

    Were any of the EBCDICs official standards anywhere in the world, outside
    of IBM?

    Defacto within the particular manufacturer, yes.


    Thinking about what the “A” in “ASCII” stands for ...

    Hard to tell with all the unreadable UTF-8 :-)

    I believe it was the caret that printed as the EBCDIC 'not'
    character when printed on an EBCDIC printer.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Kaz Kylheku on Thu Jan 25 00:01:05 2024
    On 1/24/24 16:11, Kaz Kylheku wrote:
    On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    On 1/24/24 03:10, Janis Papanagnou wrote:
    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Could you explain what surprises you about that statement? As quoted,
    it's a general statement which includes C: "Except as specified later,
    side effects and value computations of subexpressions are unsequenced."

    Pretty much any language has to guarantee *something* about
    order of evaluation, somewhere.

    Not the functional languages, I believe - but I've only heard about such languages, not used them.

    Like for instance that calculating output is not possible before a
    needed input is available.

    Oddly enough, for a long time the C standard never said anything about
    that issue. I argued that this was logically necessary, and few people disagreed with that argument, but I couldn't point to wording in the
    standard to support that claim.

    That changed when they added support for multi-threaded code to C in
    C2011. That required the standard to be very explicit about which things
    could happen simultaneously in different threads, and which things had
    to occur in a specified order. All of the wording about "sequenced" was
    first introduced at that time. In particular, the following wording was
    added:

    "The value computations of the operands of an operator are sequenced
    before the value computation of the result of the operator." (6.5p1)
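    A minimal sketch of the distinction (the helper functions here are
    invented): the value computations of the two operands are sequenced
    before the computation of the sum, but the calls to f() and g() are
    unsequenced relative to each other, so either may happen first.

    #include <stdio.h>

    static int f(void) { puts("f"); return 1; }
    static int g(void) { puts("g"); return 2; }

    int main(void) {
        /* "f" and "g" may be printed in either order; the sum is
           computed only after both operands have been evaluated. */
        int sum = f() + g();
        printf("%d\n", sum);
        return 0;
    }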

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Thu Jan 25 05:25:56 2024
    On Thu, 25 Jan 2024 03:56:13 +0000, Malcolm McLean wrote:

    I said that the C standard's
    use of the term "function" to mean "subroutine" was a misuse ...

    Common Python terminology does the same.

    Back in Pascal days, a “function” returned a value, while a “procedure” had some effect on the machine state. If you wanted to refer to both, you
    tried a semi-common term like “routine” and hoped they understood.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Keith Thompson on Thu Jan 25 05:39:19 2024
    On 2024-01-25, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    As for K&R's thinking, I have no particular insight on that. I have no problem with some operators being represented by symbols and others by keywords (I'm accustomed to it from other languages), and I don't see
    that the decision to make "sizeof" a keyword even requires any
    justification.

    C has hardly any alphabetical operators only if you don't consider the statement keywords to be operators!

    I.e. you don't consider the "if" in "if (expr) S1 else S2"
    to be an operator which evaluates expr, and then chooses
    whether to execute S1 or S2.

    If you think that way, the terminology in the language backs you up;
    it doesn't call them operators.

    Yet ?: in expr ? E1 : E2 is identified as an operator.

    Is it because it yields a value? Casting is an operator, and
    doesn't always yield a value:

    (void) expr

    The function call parentheses are an operator, and likewise don't always
    yield a value:

    free(ptr)

    No, it's because something involved only with expressions is an
    operator, whereas something involved with statements is ... some unnamed something: "statement keyword" or whatever.

    C++ has lambda expressions, which are operators, yielding a value. And
    they are involved with statements: lambdas have a statement body,
    much like a while statement has a body.

    If C adopted similar syntax, the idea that operators don't have anything
    to do with the control of statements would go out the window.

    Basically, the Lisp people nailed all the concepts and terminology. If
    we use that, we can talk about various other languages in a sensible
    way.

    In declarators, const and restrict work like unary operators,
    syntactically in the same phrase structure role as the pointer *,
    stacking from right to left together with these:

    int *const *restrict p

    Again, the language doesn't call "operators" those symbols that guide
    the semantics of declarations. In my background, the main symbols that
    guide meaning are all operators. In int *p, * is a type construction
    operator which derives a pointer.
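    A small sketch of how those qualifiers read in practice (the variable
    names here are invented for the example):

    #include <stdio.h>

    int main(void) {
        int x = 42;
        int *const cp = &x;            /* const pointer to int */

        /* Read right to left: p is a restrict-qualified pointer to a
           const pointer to int. */
        int *const *restrict p = &cp;

        printf("%d\n", **p);
        return 0;
    }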

    If I'm talking about C from the outside, as an object/product of
    computer science, rather than talking about C programming, I will
    tend to use those terms: like C has operators for iteration
    such as while and for (which are not called operators in the
    specification of that language).

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Keith Thompson on Thu Jan 25 07:48:41 2024
    On 24.01.2024 17:27, Keith Thompson wrote:

    (C++, in 2011 IIRC, introduced special handling for the >> token, which occurs in things like std::vector<std::vector<int>>).

    So you no longer have to write it with a space, as in ...?
    std::vector<std::vector<int> >
    That's fine.

    But I suppose they haven't fixed its precedence in cin >> and cout <<
    contexts? (I suppose it's still handled with shl/shr precedence?)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Thu Jan 25 07:22:07 2024
    On 24.01.2024 21:55, David Brown wrote:
    [...]
    It might be interesting to hear from any native Germans who were
    programming C at that time.

    This is matching my profile. (Don't get fooled by my name.) :-)

    My faint memories might be the limiting factor, though! :-/

    Germany is big enough that people
    programmed in German (so comments would be in German, for example), and
    their 7-bit ASCII variant (Code page 1011) also had accented letters in
    place of some symbols used by C - including "|".

    The following recollections imply (but also exceed) the C context.
    (So you've been warned.)

    The systems I used had originally, for example, not used umlauts.
    There's an alternative representation: Ae for Ä, Oe for Ö, Ue for Ü,
    similarly for the lower-case pendants, and ss for ß. Some/most
    mainframes used only uppercase letters. One OS (for the TR 440)
    even had German commands; e.g. the "compile" (de: "übersetze")
    command was written 'UEBERSETZE', etc. But it's not as simple; the
    TR 440 had a few character sets (for punch cards, printer, etc.);
    you find them at http://volatile.gridbug.de/TR440-Charactersets.pdf

    The CDC we had used 6-bit character sets (but I have no docs for
    it): all uppercase, no umlauts. WRT programming: for Pascal there
    were alternative representations, e.g. for array elements a[i]
    something like a.(i.), and I don't recall having used umlauts.
    In Algol, keywords were written, I think, with a leading dot like
    .begin .int i; .for i .from 1 .to ... (all caps of course).
    Unfortunately my only surviving source code from that time (punch
    card set and corresponding 132-column printer output) is buried
    somewhere in the basement.

    The C programming I did, I think, on a Siemens 7.860, which
    actually was an IBM clone (maybe a 360/370?). Not sure about the
    OS I used for that; it was either VM/CMS or the Unix variant
    VM/UTS (Amdahl), both running on that VM platform. Someone who
    knows the IBM platform better may have some insights about the
    code sets; I have no docs. While we did *not* have to use
    trigraphs, I don't recall whether we had to use alternative
    representations for some of these characters as mentioned above.
    I don't recall that I ever used ASCII extensions like umlauts in
    those days, neither in comments nor elsewhere.

    The next instance was Sun workstations (Sun-OS); no code-set
    issues there. Later IBM and HP-UX systems, where I think we had
    coding standards to avoid umlauts (but I'm not sure). Base was
    ISO Latin 1, later ISO Latin 9 (ISO 8859-15), and much later
    UTF-8. Part of the development was for Windows clients (not done
    by me), so compatibility with the Unix platforms was an issue WRT
    the code pages. For a European trans-national X.500 based
    telephone directory we used the ISO/ITU-T
    standards/recommendations in exchanged data.

    A small BASIC-programmable pocket calculator (in the 1980's) used
    an "ASCII code table", but it contains only NUL and positions
    0x20 to 0x5f (no lower-case letters or umlauts); it was 8-bit,
    though, and supported the 'pi' character and the 'sqrt' symbol at
    positions 251 and 252 respectively. In the late 1970's an
    Olivetti P6060 BASIC desktop computer used the ASCII character
    set (unchanged, as we know it), no umlauts. In the docs it's
    named "ISO-CODE TABELLE", but the characters in the control code
    block 0-31 had some graphical representations, and the 8-bit
    "extended range" was present but unspecified. I also found some
    docs from that time for the Commodore PET 2001 (and CBM series)
    that we used here; ASCII and display character set, see
    http://volatile.gridbug.de/Commodore-Characterset.pdf

    So far my faint memories and some docs. It's really hard to recall
    such details and there's some inherent uncertainty.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Thu Jan 25 07:02:30 2024
    On 2024-01-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Thu, 25 Jan 2024 03:56:13 +0000, Malcolm McLean wrote:

    I said that the C standard's
    use of the term "function" to mean "subroutine" was a misuse ...

    Common Python terminology does the same.

    Back in Pascal days, a “function” returned a value, while a “procedure”
    had some effect on the machine state. If you wanted to refer to both, you tried a semi-common term like “routine” and hoped they understood.

    Lisp was there long before Pascal. In Lisp, there are only functions.
    Functions can be pure if written that way, or have side effects.
    They take arguments by value. All computation and side effecting is
    done via expressions, which yield a value, even assignments.

    Lisp also introduced the ternary "if" operator (if expr this that),
    as well as short circuiting "and" and "or" (and expr1 expr2 ...)
    (or expr1 expr2 ...).

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Thu Jan 25 14:07:25 2024
    On 25/01/2024 07:48, Janis Papanagnou wrote:
    On 24.01.2024 17:27, Keith Thompson wrote:

    (C++, in 2011 IIRC, introduced special handling for the >> token, which
    occurs in things like std::vector<std::vector<int>>).

    So you no longer have to write it with a space, as in ...?
    std::vector<std::vector<int> >
    That's fine.

    But I suppose they haven't fixed its precedence in cin >> and cout <<
    contexts? (I suppose it's still handled with shl/shr precedence?)


    There's little problem with the precedence - that's one of the reasons
    these operators were picked in the first place. You sometimes need parentheses, but not often.

    The problem was with the order of evaluation. Prior to C++17 (where it
    was fixed), if you wrote "cout << one() << two() << three();", the order
    the three functions were evaluated was unspecified.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Kuyper on Thu Jan 25 13:43:12 2024
    On 25/01/2024 06:01, James Kuyper wrote:
    On 1/24/24 16:11, Kaz Kylheku wrote:
    On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    On 1/24/24 03:10, Janis Papanagnou wrote:
    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Could you explain what surprises you about that statement? As quoted,
    it's a general statement which includes C: "Except as specified later,
    side effects and value computations of subexpressions are unsequenced."

    Pretty much any language has to guarantee *something* about
    order of evaluation, somewhere.

    Not the functional languages, I believe - but I've only heard about such languages, not used them.


    I remember a programming task at university around infinite lists in a functional programming language (not Haskell, but very similar -
    arguably its predecessor). We wrote a function returning pi as an
    infinite list of decimal digits - the printout of that started long
    before the calculation itself was finished!

    Like for instance that calculating output is not possible before a
    needed input is available.

    Oddly enough, for a long time the C standard never said anything about
    that issue. I argued that this was logically necessary, and few people disagreed with that argument, but I couldn't point to wording in the
    standard to support that claim.

    That changed when they added support for multi-threaded code to C in
    C2011. That required the standard to be very explicit about which things could happen simultaneously in different threads, and which things had
    to occur in a specified order. All of the wording about "sequenced" was
    first introduced at that time. In particular, the following wording was added:

    "The value computations of the operands of an operator are sequenced
    before the value computation of the result of the operator." (6.5p1)


    For the most part, even with threads, C defines things in terms of
    observable behaviour. Even when the standard gives the order of
    evaluation required by the abstract machine (such as with sequence
    points, and the more complicated multi-threading rules), the actual
    implementation can do any kind of re-ordering it likes as long as
    there is no change to the observable behaviour.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Kaz Kylheku on Thu Jan 25 14:01:36 2024
    On 24/01/2024 21:50, Kaz Kylheku wrote:
    On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    (It could not have been added as "**", because - as Keith said in
    another post - "x ** y" already has a meaning in C. While I believe
    it would be possible to distinguish the uses based on the type of "y",
    other than for the literal 0, having "x ** y" mean two /completely/
    different things depending on the type of "y" would not be a good idea
    for C.)

    The problem with a "**" exponentiation operator is lexical. It's common
    to have two consecutive unary "*" operators in declarations and
    expression:
    char **argv;
    char c = **argv;

    Clearly, then, the way forward with this ** operator is to wait for the
    C++ people to do the unthinkable, and reluctantly copy it some years
    later.

    I'm hoping the C++ people will do the sane/unthinkable (cross out one, according to personal preference) thing and allow Unicode symbols for operators, which will then be added to the standard library rather than
    to the language. Then we'll have "x ↑ y", and no possible confusion.

    (It's actually almost fully possible already - all they need to do is
    allow characters such as ↑ to be used as macros, and we're good to go.)


    Ya know, like what they did with stacked template closers, which are
    already the >> operator.


    The "maximum munch" parsing rule seemed like such a good idea, long ago!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Thu Jan 25 14:29:33 2024
    On 24/01/2024 20:33, Malcolm McLean wrote:
    On 24/01/2024 13:54, David Brown wrote:
    On 24/01/2024 13:20, Malcolm McLean wrote:

    Many operators in C are not mathematical operations. "sizeof" is an
    operator, so are indirection operators, structure member access
    operators, function calls, and the comma operator.

    I've discussed this ad infinitum with people who don't really understand
    what the term "function" means.

    Yes, you have - usually at least somewhat incorrectly, and usually
    without being clear if you are talking about a "C function", a
    mathematical "function", or a "Malcolm function" using your own private definitions.

    Anything that maps one set to another
    set such that there is one and only one mapping from each member of the structure set to the result set is mathematically a "function".
    Sizeof clearly counts.

    "sizeof" clearly does not count.

    You don't get to mix "mathematical" definitions and "C" definitions.
    "sizeof" is a C feature - it makes no sense to ask if it is a
    mathematical function or not. It /does/ make sense to ask if it is a
    /C/ function or not - and it is not a C function.

    At a pinch, you could say that "sizeof" is a mathematical function with
    the domain being a subset of the possible expressions and possible types visible in the program at the time, according to the rules in the C
    standards. It would not be useful or helpful, but you /could/ say that.

    However, what I wrote was that "sizeof" was not a "mathematical
    operation" - not that it was not a function in the mathematical sense.


    Exponentiation is not particularly common in programming, except for a
    few special cases - easily written as "x * x", "x * x * x", "1.0 / x",
    or "sqrt(x)", which are normally significantly more efficient than a
    generic power function or operator would be.

    It's pretty common in the sort of programming that I do. But this is
    a fair point. A lot of programs don't apply complex transformations to
    data in the way that mine typically do.

    Different tasks have different programming needs.


    That is not an argument against having an operator in C called "pow".
    It is simply not useful enough for there to be a benefit in adding it
    to the language as an operator, when it could (and was) easily be
    added as a function in the standard library.

    (It could not have been added as "**", because - as Keith said in
    another post - "x ** y" already has a meaning in C. While I believe
    it would be possible to distinguish the uses based on the type of "y",
    other than for the literal 0, having "x ** y" mean two /completely/
    different things depending on the type of "y" would not be a good idea
    for C.)

    Yes, ** and ^, which are the two common ASCII fallbacks, are already
    taken. But as you said earlier, in reality most exponentiation
    operations are either square or cube, or square root. And in C, that
    means either special functions or inefficiently converting the exponent
    into a double. If pow were an operator, that wouldn't be an issue.


    It could certainly still be an issue - it depends entirely on how that
    operator were specified, and how it were implemented.

    Unlike normal C functions, operators in C can be "overloaded" by type.
    But if there were a "pow" operator that fitted with the style of
    existing C operators, we would have overloads for "T pow T to T", where
    "T" is an integer type (of rank at least that of "int"), or a floating
    point type. We would not see the kinds of overloads that would actually
    be useful beyond what you get with today's "pow" function, such as
    raising floating point numbers to an integer power. We would certainly
    not see efficient shortcuts for common cases at the standards level
    (though implementations could do what they wanted).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Thu Jan 25 14:19:18 2024
    On 25.01.2024 13:43, David Brown wrote:

    [...] We wrote a function returning pi as an
    infinite list of decimal digits - the printout of that started long
    before the calculation itself was finished!

    You had an algorithm for an infinite list of decimals that finished?

    I think this formulation will go into my cookie jar of noteworthy
    achievements. - And, sorry, I could not resist. :-)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Thu Jan 25 14:35:50 2024
    On 25.01.2024 14:07, David Brown wrote:

    The problem was with the order of evaluation. Prior to C++17 (where it
    was fixed), if you wrote "cout << one() << two() << three();", the order
    the three functions were evaluated was unspecified.

    The last decade or two I haven't been in C++ to any depth. But I'm a bit surprised by that. The op<< is defined by something like [informally]
    stream op<<(stream,value), where "two() << three()" is "value << value",
    but "cout << one()" would yield a stream, say X, and "X << two()" again
    a stream, etc. So actually we have nested functions
    op<<( op<<( op<<(cout, one()), two()), three())
    At least you'd need to evaluate one() to obtain the argument for the
    next outer of the nested calls.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Jan 25 13:43:37 2024
    On 25/01/2024 12:43, David Brown wrote:
    On 25/01/2024 06:01, James Kuyper wrote:
    On 1/24/24 16:11, Kaz Kylheku wrote:
    On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    On 1/24/24 03:10, Janis Papanagnou wrote:
    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Could you explain what surprises you about that statement? As quoted,
    it's a general statement which includes C: "Except as specified later, side effects and value computations of subexpressions are unsequenced."
    Pretty much any language has to guarantee *something* about
    order of evaluation, somewhere.

    Not the functional languages, I believe - but I've only heard about such
    languages, not used them.


    I remember a programming task at university around infinite lists in a functional programming language (not Haskell, but very similar -
    arguably its predecessor). We wrote a function returning pi as an
    infinite list of decimal digits - the printout of that started long
    before the calculation itself was finished!

    You can write something like that in C. I adapted a program to print the
    first N digits so that it doesn't stop. It looks like this:

    int main(void) {
        while (1) {
            printf("%c", nextpidigit());
        }
    }

    (The output starts as "314159..."; it will need a tweak to insert the
    decimal point.)

    The algorithm obviously wasn't mine; I've no idea how it works. (In a
    sequence like ...399999999..., how does it know that 3 is a 3 and not a
    4, before it's calculated further? It's magic.)

    The nextpidigit() function is set up as a generator.

    It also relies on using big integers (I used it to test my library), so
    will rapidly get much slower at calculating the next digit.

    Even with a much faster library, eventually memory will be exhausted, so
    this is not suitable for an 'infinite' number, or even an unlimited
    number of digits; it will eventually grind to a halt.

    Was yours any different?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Thu Jan 25 14:46:38 2024
    On 25/01/2024 04:56, Malcolm McLean wrote:
    On 25/01/2024 00:30, Lawrence D'Oliveiro wrote:
    On Wed, 24 Jan 2024 19:33:09 +0000, Malcolm McLean wrote:

    I've discussed this ad infinitum with people who don't really understand what the term "function" means. Anything that maps one set to another
    set such that there is one and only one mapping from each member of the
    structure set to the result set is mathematically a "function".

    Sizeof clearly counts.

    It does in the mathematical sense. But in the C sense, a “function” is a block of code which is called at runtime with zero or more arguments and
    returns a result (which might be void). It can also have side-effects on
    the machine state.

    It helps the discussion to be clear what your terms mean. Otherwise the
    people you are arguing with have a right to be indignant at what they
    might perceive to be wilful obtuseness.

    You haven't been around for long enough. I said that the C standard's
    use of the term "function" to mean "subroutine" was a misuse, and that I
    was going to use the term "function", in context, to refer to that
    subset of C subroutines which calculate mathematical functions of bits
    in the computer's memory. The opposition and outrage that this generated
    was incredible, and must have gone on for years.

    There is no single consensus of the definitions of the terms "function"
    or "subroutine" in computing terminology, rendering your argument moot.

    There is, however, a very clear understanding of the term "function" in
    the context of C - thus in comp.lang.c., the unqualified term "function"
    means no more and no less than what the C standards say the term means.

    And there is an established mathematical meaning of the term "function"
    - a "mathematical function" is a mapping from one set to another set.

    These are very different things. They may not be quite as different as,
    say, the meaning of the word "character" in the context of C, and its
    meaning in the context of a Shakespeare play. But they are still very distinct, and it is counter-productive to mix them.

    Remember, no one really cares what you said about them, or whether you
    think the C founding fathers used a poor choice of terms. It doesn't
    even matter if anyone agrees with that opinion, or thinks Nim got it
    right by distinguishing "func" and "proc". This is comp.lang.c, where
    the focus is on the C language - and we use its terms, definitions and
    rules regardless of how we may feel about them.

    I don't think the "opposition and outrage" against your ideas has been incredible - I think people have simply been frustrated at your
    insistence on writing confusing and unhelpful posts. The only
    incredible thing is that you continue this nonsense no matter how
    often and how carefully people explain that you are wrong - and that
    even if you were right, you'd /still/ be wrong here in c.l.c.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Jan 25 13:53:02 2024
    On 25/01/2024 13:01, David Brown wrote:
    On 24/01/2024 21:50, Kaz Kylheku wrote:
    On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    (It could not have been added as "**", because - as Keith said in
    another post - "x ** y" already has a meaning in C. While I believe
    it would be possible to distinguish the uses based on the type of "y", other than for the literal 0, having "x ** y" mean two /completely/
    different things depending on the type of "y" would not be a good idea for C.)

    The problem with a "**" exponentiation operator is lexical. It's common to have two consecutive unary "*" operators in declarations and
    expression:
         char **argv;
         char c = **argv;

    Clearly, then, the way forward with this ** operator is to wait for the
    C++ people to do the unthinkable, and reluctantly copy it some years
    later.

    I'm hoping the C++ people will do the sane/unthinkable (cross out one, according to personal preference) thing and allow Unicode symbols for operators, which will then be added to the standard library rather than
    to the language. Then we'll have "x ↑ y", and no possible confusion.

    (It's actually almost fully possible already - all they need to do is
    allow characters such as ↑ to be used as macros, and we're good to go.)


    Suppose ↑ could be used as a macro now, what would such a definition
    look like?

    Surely you'd be able to invoke it as ↑(x, y)?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Thu Jan 25 15:02:52 2024
    On 24/01/2024 20:48, Malcolm McLean wrote:
    On 23/01/2024 21:51, Lawrence D'Oliveiro wrote:
    On Tue, 23 Jan 2024 16:32:09 +0000, Malcolm McLean wrote:

    It breaks the rule that, in C, variables and functions are alphanumeric,
    whilst operators are symbols.

    Where is there such a “rule”?

    There is no such rule.


    Valid function names have to begin with an alphabetical symbol or
    (annoyingly for me) an underscore, as do variables. They may not contain non-alphanumerical symbols except for underscore. It's in the C standard somewhere.

    6.4.2.1, in the definition of identifiers. (A wide range of other
    Unicode letters are allowed as well.)

    C operators are all non-alphanumerical symbols, with the exception of "sizeof". Again, the operators are listed in the C standard.


    _Alignof has been an operator since C11, and the "typeof" operator is
    coming with C23.

    C does not have user-defined operators - all operators are defined by
    the C standards committee, and documented in the C standards. If they
    choose to make an operator called "pow", or "÷", or "multiply_by_7",
    that is their prerogative. They are not bound by any rules someone on
    Usenet thinks up because of a pattern that they see. Patterns are not
    rules.

    If the C standards committee ever decide to add more new operators, they
    will choose the name based on what is expected to work best for users,
    what has the least risk of compatibility issues, what fits best with any relevant existing extensions, and what is practical to implement. For
    the most recent operators, words have made more sense. But if they were
    to introduce a "power" operator, they would likely consider both "**"
    and "_Pow" as possibilities - and probably other ideas too.



    sizeof is an exception, but a justified one.

    This is how religious people argue: they use circular reasoning to say
    something is justified because it is justified.

    No. This isn't circular reasoning. It's a claim which hasn't been backed
    up. It's expected that the reader won't ask for this because it is so
    obvious that we can give sensible reasons for "sizeof" being a
    function-like alphabetical word rather than a symbol. But if you do, of course I'm sure someone will provide such a justification.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Thu Jan 25 15:20:54 2024
    On 25/01/2024 14:19, Janis Papanagnou wrote:
    On 25.01.2024 13:43, David Brown wrote:

    [...] We wrote a function returning pi as an
    infinite list of decimal digits - the printout of that started long
    before the calculation itself was finished!

    You had an algorithm for an infinite list of decimals that finished?


    That's the beauty of lazy evaluation!

    I think this formulation will go into my cookie jar of noteworthy achievements. - And, sorry, I could not resist. :-)


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Thu Jan 25 15:19:26 2024
    On 25/01/2024 10:55, Malcolm McLean wrote:
    On 25/01/2024 03:59, Keith Thompson wrote:

    As for K&R's thinking, I have no particular insight on that. I have no
    problem with some operators being represented by symbols and others by
    keywords (I'm accustomed to it from other languages), and I don't see
    that the decision to make "sizeof" a keyword even requires any
    justification.

    I looked it up on the web, but I can't find anything that goes back to K
    and R and explains why they took that decision. But clearly to use a
    word rather than punctuators, as was the case with every other operator,
    must have had a reason.

    I think they wanted it to look function-like, because it is a function,
    though a function of a type rather than of bits, so of course not a "function" in the C standard sense of the term.

    It is not a function in the C sense - "sizeof x" is not like a function
    call (where "x" is a variable or expression, rather than a type).
    However, many people (myself included) feel it is clearer in code to
    write it as "sizeof(x)", making it look more like a function or
    function-like macro.

    I suspect the prime reason "sizeof" is a word, rather than a symbol or
    sequence of symbols, is that the word is very clear while there are no
    suitable choices of symbols for the task. The nearest might have been
    "#", but that might have made pre-processor implementations more
    difficult. Of course any symbol or combination /could/ have been used,
    and people would have learned its meaning, but "sizeof" just seems so
    much simpler.
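    A tiny sketch of the two spellings, and of the one place where the
    parentheses are mandatory:

    #include <stdio.h>

    int main(void) {
        double d = 0.0;

        printf("%zu\n", sizeof d);        /* expression: parentheses optional */
        printf("%zu\n", sizeof(d));       /* same thing, function-like look   */
        printf("%zu\n", sizeof(double));  /* type name: parentheses required  */
        return 0;
    }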

    But all operators are
    functions in this sense. However sizeof doesn't map to anything used in non-computer mathematics. But "size" is conventionally denoted by two vertical lines. These are taken by "OR", and would be misleading as in mathematics it means "absolute", not "physical area of paper taken up by
    the notation".
    So I would imagine that that was why they thought a word would be appropriate, and these reasons were strong enough to justify breaking
    the general pattern that operators are punctuators.
    I could be completely wrong of course in the absence of actual
    statements by K and R. But this would seem to make sense.

    That's roughly the same reasoning I see. (And I too do not have any
    evidence or references for the reasoning.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kalevi Kolttonen@21:1/5 to David Brown on Thu Jan 25 15:08:44 2024
    David Brown <david.brown@hesbynett.no> wrote:
    On 24/01/2024 20:33, Malcolm McLean wrote:
    On 24/01/2024 13:54, David Brown wrote:
    On 24/01/2024 13:20, Malcolm McLean wrote:

    Many operators in C are not mathematical operations.  "sizeof" is an
    operator, so are indirection operators, structure member access
    operators, function calls, and the comma operator.

    I've discussed this ad infinitum with people who don't really understand
    what the term "function" means.

    Yes, you have - usually at least somewhat incorrectly, and usually
    without being clear if you are talking about a "C function", a
    mathematical "function", or a "Malcolm function" using your own private definitions.

    Anything that maps one set to another
    set such that there is one and only one mapping from each member of the
    structure set to the result set is mathematically a "function".
    Sizeof clearly counts.

    "sizeof" clearly does not count.

    You don't get to mix "mathematical" definitions and "C" definitions.
    "sizeof" is a C feature - it makes no sense to ask if it is a
    mathematical function or not. It /does/ make sense to ask if it is a
    /C/ function or not - and it is not a C function.

    Why on earth do you say that?

    It makes perfect sense to ask whether C's sizeof() can be
    regarded as a mathematical function. And the answer
    to that question is just what Malcolm said: Yes, sizeof()
    fits perfectly to the definition of a mathematical function.

    But in the C language it is not a function, but an operator.

    br,
    KK

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Jan 25 16:11:26 2024
    On 25/01/2024 14:43, bart wrote:
    On 25/01/2024 12:43, David Brown wrote:
    On 25/01/2024 06:01, James Kuyper wrote:
    On 1/24/24 16:11, Kaz Kylheku wrote:
    On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    On 1/24/24 03:10, Janis Papanagnou wrote:
    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Could you explain what surprises you about that statement? As quoted, it's a general statement which includes C: "Except as specified later, side effects and value computations of subexpressions are
    unsequenced."

    Pretty much any language has to guarantee *something* about
    order of evaluation, somewhere.

    Not the functional languages, I believe - but I've only heard about such languages, not used them.


    I remember a programming task at university around infinite lists in a
    functional programming language (not Haskell, but very similar -
    arguably its predecessor). We wrote a function returning pi as an
    infinite list of decimal digits - the printout of that started long
    before the calculation itself was finished!

    You can write something like that in C. I adapted a program to print the first N digits so that it doesn't stop. It looks like this:

      int main(void) {
          while (1) {
              printf("%c", nextpidigit());
          }
      }

    (The output starts as "314159..."; it will need a tweak to insert the
    decimal point.)

    The algorithm obviously wasn't mine; I've no idea how it works. (In a sequence like ...399999999..., how does it know that 3 is a 3 and not a
    4, before it's calculated further? It's magic.)

    The nextpidigit() function is set up as a generator.

    It also relies on using big integers (I used it to test my library), so
    will rapidly get much slower at calculating the next digit.

    Even with a much faster library, eventually memory will be exhausted, so
    this is not suitable for an 'infinite' number, or even an unlimited
    number of digits; it will eventually grind to a halt.

    Was yours any different?


    That's roughly the same sort of thing. Functional programming languages
    just make this kind of thing much easier. (Most do - some do not
    support lazy evaluation.) So when you write "numbers = [1..]" in
    Haskell, you are really setting up a generator function and a data
    structure to hold the progress so far.

    But it means you can use them just like any other lists, as long as you
    don't try to do something that involves going through to the end of the
    list (like printing it out in full, or finding its length) - or sooner
    or later your machine is going to run out of memory, or the user will
    run out of patience.

    So a way to do this (which is not intended to be a very efficient
    method) is to view numbers as having an integer part (perhaps a "big
    integer", or an expandable list of decimal digits) and an infinite list
    of decimals. You can then build up functions for adding them,
    multiplying them by an integer, multiplying them by other infinite
    precision numbers, etc.

    Each of these functions returns another infinite list. The generator
    function it produces takes its inputs by running the generators of the sub-expressions, as needed. (In general, you sometimes have to look a
    little ahead to check for carries - as long as you don't meet an
    infinite list of 9's, this always stops in a finite time.)

    Now you make an arctan function, using the formula :

    arctan z = z - (z^3)/3 + (z^5)/5 - (z^7)/7 + ...

    It's just an infinite list of scaled powers of z, and you apply "sum" to it.

    And your final pi is just "pi = 4 * arctan 1". Printing out the first
    20 numbers would then be "take 20 pi".

    (Or something similar - I don't remember the exact syntax of the
    language we used.)

    Of course, there's some details to get right in terms of how far you
    need to go down the lists to ensure your output so far is correct. This
    is particularly true for the arctan, as you have to determine how many
    terms of the infinite sum you are adding as well as how far to go down
    each term. (It really helps that for |z| < 1, the terms always get
    smaller.) And there's scope for optimisation, such as "pre-calculating"
    the powers of z so each new term only needs to multiply the previous
    term by z². In a functional programming language, that's easy:

    zs = z : map ( * z2 ) zs
    where z2 = z * z

    Once all that's in place, you can try a more advanced Machin formula like:

    π/4 = 4·arctan(1/5) - arctan(1/239)

    and get far faster convergence. If all the rest of the bits are in
    place correctly, only that last line needs to be changed to use the more efficient formula.


    Of course you can do this sort of thing in C, using generator functions.
    But it's a lot easier when the language handles all the implementation
    detail for you, and lets you use a much simpler syntax. (Simpler when
    you are used to it, of course!)
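    A minimal sketch of such a generator in C - to keep it short it
    streams the decimal digits of 1/7 by long division rather than pi,
    and next_digit() is an invented name:

    #include <stdio.h>

    /* Yields successive decimal digits of 1/7, keeping its state in a
       static variable between calls. */
    static int next_digit(void) {
        static unsigned remainder = 1;
        remainder *= 10;
        int digit = remainder / 7;
        remainder %= 7;
        return digit;
    }

    int main(void) {
        printf("0.");
        for (int i = 0; i < 20; i++)
            printf("%d", next_digit());
        printf("\n");
        return 0;
    }

    The same shape - a call that produces the next element on demand -
    is what the lazy list gives you for free in a functional language.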

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Thu Jan 25 16:40:55 2024
    On 25.01.2024 15:20, David Brown wrote:
    On 25/01/2024 14:19, Janis Papanagnou wrote:
    On 25.01.2024 13:43, David Brown wrote:

    [...] We wrote a function returning pi as an
    infinite list of decimal digits - the printout of that started long
    before the calculation itself was finished!

    You had an algorithm for an infinite list of decimals that finished?

    That's the beauty of lazy evaluation!

    Erm, yes, I'm familiar with lazy evaluation.

    It just doesn't clarify the magic of getting infinite lists from a
    finished procedure.

    But I don't want to disturb the ethereal beauty of such a sentence
    by critically questioning its semantics. :-)


    I think this formulation will go into my cookie jar of noteworthy
    achievements. - And, sorry, I could not resist. :-)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Jan 25 16:48:25 2024
    On 25/01/2024 14:53, bart wrote:
    On 25/01/2024 13:01, David Brown wrote:
    On 24/01/2024 21:50, Kaz Kylheku wrote:
    On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    (It could not have been added as "**", because - as Keith said in
    another post - "x ** y" already has a meaning in C. While I believe it would be possible to distinguish the uses based on the type of "y", other than for the literal 0, having "x ** y" mean two /completely/
    different things depending on the type of "y" would not be a good idea for C.)

    The problem with a "**" exponentiation operator is lexical. It's common to have two consecutive unary "*" operators in declarations and
    expression:
         char **argv;
         char c = **argv;

    Clearly, then, the way forward with this ** operator is to wait for the
    C++ people to do the unthinkable, and reluctantly copy it some years
    later.

    I'm hoping the C++ people will do the sane/unthinkable (cross out
    one, according to personal preference) thing and allow Unicode symbols
    for operators, which will then be added to the standard library rather
    than to the language. Then we'll have "x ↑ y", and no possible
    confusion.

    (It's actually almost fully possible already - all they need to do is
    allow characters such as ↑ to be used as macros, and we're good to go.)


    Suppose ↑ could be used as a macro now, what would such a definition
    look like?

    Surely you'd be able to invoke it as ↑(x, y)?


    It would be slightly nasty C++. Are you sure you want to know? Stop
    reading now if you are having second thoughts...

    Since you can't use ↑ as a macro name, I've used Π below in the sample function. "x Π y" calls "pow(x, y)", with no overhead.

    (Real code like this would use templates to support integer types and
    other floating point types, and perhaps also short-cuts for squares,
    cubes, and square roots. But I don't want to make it too messy, and it
    would also be more of a c.l.c++ topic.)


    #include <cmath>

    // Proxy for the left-hand operand; "PowerProxyInner + double" does
    // the actual pow() call.
    class PowerProxyInner {
        double _x;
    public:
        constexpr PowerProxyInner(double x) : _x(x) {}
        friend inline constexpr double operator + (const PowerProxyInner x,
                                                   double y);
    };

    inline constexpr double operator + (const PowerProxyInner x, double y) {
        return std::pow(x._x, y);
    }

    // Tag type: "double + POWER" merely wraps the left operand in the proxy.
    class PowerProxy {
    public:
        constexpr PowerProxy() {}
        friend inline constexpr PowerProxyInner operator + (double x,
                                                            const PowerProxy y);
    };

    inline constexpr PowerProxyInner operator + (double x, const PowerProxy y) {
        (void) y;
        return PowerProxyInner(x);
    }

    constexpr inline PowerProxy POWER;

    #define Π +POWER+

    double test(double x, double y) {
        return x Π y;   // expands to x + POWER + y, i.e. pow(x, y)
    }

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Thu Jan 25 16:53:17 2024
    On 25/01/2024 14:35, Janis Papanagnou wrote:
    On 25.01.2024 14:07, David Brown wrote:

    The problem was with the order of evaluation. Prior to C++17 (where it
    was fixed), if you wrote "cout << one() << two() << three();", the order
    the three functions were evaluated was unspecified.

    The last decade or two I haven't been in C++ to any depth. But I'm a bit surprised by that. The op<< is defined by something like [informally]
    stream op<<(stream,value), where "two() << three()" is "value << value",
    but "cout << one()" would yield a stream, say X, and "X << two()" again
    a stream, etc. So actually we have nested functions
    op<<( op<<( op<<(cout, one()), two()), three())
    At least you'd need to evaluate one() to obtain the argument for the
    next outer of the nested calls.


    Not quite. To simplify :

    cout << one() << two()

    is parsed as :

    (cout << one()) << two()

    So "cout << one()" is like a call to "op<<(cout, one())", and the full expression is like :

    op<<(op<<(cout, one()), two())

    Without the new C++17 order of evaluation rules, the compiler can
    happily execute "two()" before "op<<(cout, one())". The operands to the
    outer call need to be executed before the outer call itself, but the
    order in which these two operands are evaluated is unspecified (until
    C++17).

    Note that C++17 only specifies the order in certain cases, such as the
    bitwise shift operators.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Thu Jan 25 17:11:11 2024
    On 25.01.2024 16:53, David Brown wrote:
    On 25/01/2024 14:35, Janis Papanagnou wrote:
    On 25.01.2024 14:07, David Brown wrote:

    The problem was with the order of evaluation. Prior to C++17 (where it
    was fixed), if you wrote "cout << one() << two() << three();", the order the three functions were evaluated was unspecified.

    The last decade or two I haven't been in C++ to any depth. But I'm a bit
    surprised by that. The op<< is defined by something like [informally]
    stream op<<(stream,value), where "two() << three()" is "value << value",
    but "cout << one()" would yield a stream, say X, and "X << two()" again
    a stream, etc. So actually we have nested functions
    op<<( op<<( op<<(cout, one()), two()), three())
    At least you'd need to evaluate one() to obtain the argument for the
    next outer of the nested calls.


    Not quite. To simplify :

    cout << one() << two()

    is parsed as :

    (cout << one()) << two()

    So "cout << one()" is like a call to "op<<(cout, one())", and the full expression is like :

    op<<(op<<(cout, one()), two())

    Yes, up to here that's exactly what I said above (with three nestings).

    op<<( op<<( op<<(cout, one()), two()), three())

    Remove one

    op<<( op<<(cout, one()), two())


    Without the new C++17 order of evaluation rules, the compiler can
    happily execute "two()" before "op<<(cout, one())". The operands to the outer call need to be executed before the outer call itself, but the
    order in which these two operands are evaluated is unspecified (until
    C++17).

    If that was formerly the case then the update was obviously necessary.

    Functionally there would probably have been commotion if

    tmp = op<<(cout, one())
    op<<( tmp, two())

    and

    op<<( op<<(cout, one()), two())

    would have had different results.

    Is or was there any compiler that implemented that in the "unexpected"
    order?

    [...]

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Thu Jan 25 16:22:03 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 25/01/2024 03:59, Keith Thompson wrote:
    However sizeof doesn't map to anything used in
    non-computer mathematics. But "size" is conventionally denoted by two vertical lines.

    So is absolute value, if I remember the notation correctly.

    |value|

    Arguing about what Ken and Dennis were thinking seems
    particularly fruitless.

    Almost as bad as arguing whether sizeof is an operator,
    a function or a keyword.

    Personally, I always parenthesize the sizeof
    non-terminal.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Thu Jan 25 20:06:39 2024
    On Thu, 25 Jan 2024 14:01:36 +0100, David Brown wrote:

    I'm hoping the C++ people will do the sane/unthinkable (cross out one, according to personal preference) thing and allow Unicode symbols for operators, which will then be added to the standard library rather than
    to the language.

    Why not do what Algol-68 did, and specify a set of characters that could
    be used to define new custom operators?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Thu Jan 25 21:07:55 2024
    On 25/01/2024 18:57, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 24/01/2024 21:50, Kaz Kylheku wrote:
    On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    [...]
    The problem with a "**" exponentiation operator is lexical. It's common to have two consecutive unary "*" operators in declarations and
    expression:
    char **argv;
    char c = **argv;
    Clearly, then, the way forward with this ** operator is to wait for
    the C++ people to do the unthinkable, and reluctantly copy it some
    years later.

    I'm hoping the C++ people will do the sane/unthinkable (cross out
    one, according to personal preference) thing and allow Unicode symbols
    for operators, which will then be added to the standard library rather
    than to the language. Then we'll have "x ↑ y", and no possible
    confusion.

    That's difficult to type -- but they could add a new trigraph! 8-)}

    Shift-AltGr-U for me. x ↑ y might have been a slightly nicer symbol,
    but it's harder to type (for me).

    This illustrates the two big difficulties with Unicode symbols for this
    kind of thing. Lots of them are difficult to type for many people (at
    least, not without a good deal of messing around or extra programs).
    And it's easy to have different symbols that appear quite similar as
    glyphs, but are very different characters as far as the compiler is
    concerned.


    If the committee decides C needs an exponentiation operator (which, as
    far as I know, nobody has submitted a proposal for), "^^" is available.


    Well, a logical exclusive or operator would not be much use, so why not?

    (It's actually almost fully possible already - all they need to do is
    allow characters such as ↑ to be used as macros, and we're good to
    go.)

    You'd also need something for ↑ to expand to.

    Yes, but you can write that bit in C++ already. (See my reply to Bart.)


    Ya know, like what they did with stacked template closers, which are
    already the >> operator.

    The "maximum munch" parsing rule seemed like such a good idea, long ago!

    It still does. It's simple to describe, and ambiguous cases like
    x+++++y should be resolved with whitespace. (">>" was a real problem in
    C++, resolved with a special-case rule in C++11; C has no such problems of similar severity.)


    Not until we get a ** exponentiation operator...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Thu Jan 25 21:11:18 2024
    On 25/01/2024 17:11, Janis Papanagnou wrote:
    On 25.01.2024 16:53, David Brown wrote:
    On 25/01/2024 14:35, Janis Papanagnou wrote:
    On 25.01.2024 14:07, David Brown wrote:

    The problem was with the order of evaluation. Prior to C++17 (where it was fixed), if you wrote "cout << one() << two() << three();", the order the three functions were evaluated was unspecified.

    The last decade or two I haven't been in C++ to any depth. But I'm a bit surprised by that. The op<< is defined by something like [informally]
    stream op<<(stream,value), where "two() << three()" is "value << value", but "cout << one()" would yield a stream, say X, and "X << two()" again
    a stream, etc. So actually we have nested functions
    op<<( op<<( op<<(cout, one()), two()), three())
    At least you'd need to evaluate one() to obtain the argument for the
    next outer of the nested calls.


    Not quite. To simplify :

    cout << one() << two()

    is parsed as :

    (cout << one()) << two()

    So "cout << one()" is like a call to "op<<(cout, one())", and the full
    expression is like :

    op<<(op<<(cout, one()), two())

    Yes, up to here that's exactly what I said above (with three nestings).

    op<<( op<<( op<<(cout, one()), two()), three())

    Remove one

    op<<( op<<(cout, one()), two())


    Without the new C++17 order of evaluation rules, the compiler can
    happily execute "two()" before "op<<(cout, one())". The operands to the
    outer call need to be executed before the outer call itself, but the
    order in which these two operands are evaluated is unspecified (until
    C++17).

    If that was formerly the case then the update was obviously necessary.

    Functionally there would probably have been commotion if

    tmp = op<<(cout, one())
    op<<( tmp, two())

    and

    op<<( op<<(cout, one()), two())

    would have had different results.

    Is or was there any compiler that implemented that in the "unexpected"
    order?


    There were indeed such real-world cases, complaints were made, and the
    rules changed in C++17.

    Usually it doesn't matter what order arguments to functions (or operands
    to operators) are evaluated. Some compilers have consistent ordering
    (and it is often last to first, not first to last), others pick whatever
    makes sense at the time. The ordering has been explicitly and clearly
    stated as "unspecified" since around the beginning of time (which was,
    as we all know, 01.01.1970).
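    The classic demonstration, as a sketch (the helper functions are
    invented; which line prints first is entirely up to the compiler):

    #include <stdio.h>

    static int first(void)  { puts("first evaluated");  return 1; }
    static int second(void) { puts("second evaluated"); return 2; }

    int main(void) {
        /* The order in which the two argument expressions are evaluated
           is unspecified; a compiler may evaluate second() before
           first(), and may even vary between calls. */
        printf("%d %d\n", first(), second());
        return 0;
    }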

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Thu Jan 25 20:08:34 2024
    On Thu, 25 Jan 2024 09:57:43 -0800, Keith Thompson wrote:

    Then we'll have "x ↑ y", and no possible confusion.

    That's difficult to type

    Compose-circumflex-bar, or compose-bar-circumflex.

    ↑↑ (typed by me)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Thu Jan 25 20:30:36 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Thu, 25 Jan 2024 09:57:43 -0800, Keith Thompson wrote:

    Then we'll have "x ↑ y", and no possible confusion.

    That's difficult to type

    Compose-circumflex-bar, or compose-bar-circumflex.

    yep, difficult to type.


    ↑↑ (typed by me)

    and to read.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Thu Jan 25 20:18:17 2024
    On Thu, 25 Jan 2024 21:07:55 +0100, David Brown wrote:

    This illustrates the two big difficulties with Unicode symbols for this
    kind of thing. Lots of them are difficult to type for many people (at
    least, not without a good deal of messing around or extra programs).

    The compose key on *nix systems gives you a fairly mnemonic way of typing
    many of them.

    And it's easy to have different symbols that appear quite similar as
    glyphs, but are very different characters as far as the compiler is concerned.

    You can actually take advantage of that. E.g. from some of my Python code:

    for cłass in (Window, Pixmap, Cursor, GContext, Region) :
        delattr(cłass, "__del__")
    #end for

    The human reader might not actually notice (or care) that a particular identifier looks like a reserved word, since the meaning is obvious from context. The compiler cannot deduce the meaning from that context, but
    then, it doesn’t need to.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Thu Jan 25 21:04:39 2024
    On 2024-01-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Thu, 25 Jan 2024 14:01:36 +0100, David Brown wrote:

    I'm hoping the C++ people will do the sane/unthinkable (cross out one,
    according to personal preference) thing and allow Unicode symbols for
    operators, which will then be added to the standard library rather than
    to the language.

    Why not do what Algol-68 did, and specify a set of characters that could
    be used to define new custom operators?

    That ship sailed. C uses maximal munch lexing which allows operators
    to be juxtaposed with no intervening space. E.g. !*++p is tokenized
    as {!}{*}{++}{p}. It's difficult to introduce a scheme for defining
    new combinations of characters as new kinds of tokens. The only
    ASCII glyphs that are not some kind of token already are $ and @;
    they are not used in C. So program-defined tokens would have to start
    with one of these. If a program-defined token started with something
    existing like *, for instance a token *%%*, that already has an existing meaning; it scans as {*}{%}{%}{*}. Hacky rules, like speculative parses,
    and whatnot, could make it work; there is no way something like that
    could be standardized into C, though.
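
    A small sketch of maximal munch in action (nothing specific to the proposal above, just the standard tokenization rule):

        #include <cstdio>

        int main() {
            int a = 1, b = 2;
            // Maximal munch: "a+++b" lexes as {a}{++}{+}{b}, i.e. (a++) + b,
            // not a + (++b).  "a+++++b" would not even compile, since it
            // lexes as {a}{++}{++}{+}{b}.
            int c = a+++b;
            std::printf("%d %d %d\n", a, b, c);   // prints "2 2 3"
        }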

    We can use *%%* as a symbol in ANSI Lisp, because Lisp has tokenizing
    rules which support that. Tokens are made up of token constituent
    characters, and are delimited by the first nonconstituent. For instance
    ( is a non-constituent (for obvious reasons) so *%%*( will be read
    properly as the *%%* token (which becomes a symbol object) followed
    by an open parenthesis starting a list. Likewise, whitespace is
    terminating. The # character has a special meaning in various notations,
    and is also a token constituent so that #c(1.0 2.0) is a complex
    number, yet ab#c is a single symbol token.

    Nothing like this is easy to retrofit into C.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Thu Jan 25 21:13:37 2024
    On 2024-01-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Thu, 25 Jan 2024 09:57:43 -0800, Keith Thompson wrote:

    Then we'll have "x ↑ y", and no possible confusion.

    That's difficult to type

    Compose-circumflex-bar, or compose-bar-circumflex.

    ↑↑ (typed by me)

    In Japanese IME: type ue, then space bar several times to find the
    completion, which becomes the top one in the LRU list for the
    next time you type ue.

    Japanese IME is great.

    Need an Ohm symbol? o-mu, spacebar: Ω. omega also works.

    ha-to: ♥
    onpu: ♪
    hoshi: ★, ☆

    arufa, be-ta, ganma, deruta, ...: α, β, γ, Δ

    hidari: ← (and others)
    migi: → (and others)
    shita: ↓

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Lawrence D'Oliveiro on Thu Jan 25 21:16:14 2024
    On 25/01/2024 20:06, Lawrence D'Oliveiro wrote:
    On Thu, 25 Jan 2024 14:01:36 +0100, David Brown wrote:

    I'm hoping the C++ people will do the sane/unthinkable (cross out one,
    according to personal preference) thing and allow Unicode symbols for
    operators, which will then be added to the standard library rather than
    to the language.

    Why not do what Algol-68 did, and specify a set of characters that could
    be used to define new custom operators?

    Because that was a terrible scheme. It was too easy to create a language
    full of cryptic-looking new operators that no one had any idea what they
    did.

    It also allowed you to define precedences of any operator, including
    overriding the precedence of any operator within a nested scope.

    That means that 'a + b * c' could be parsed differently in different
    contexts.

    Imagine putting that power into the hands of ordinary users.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to bart on Thu Jan 25 22:01:59 2024
    On Thu, 25 Jan 2024 21:16:14 +0000, bart wrote:

    Imagine putting that power into the hands of ordinary users.

    Shock, horror. Of course we elite cannot allow that into the hands of the plebs. Imagine what they might do!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Fri Jan 26 02:11:27 2024
    On 25.01.2024 23:01, Lawrence D'Oliveiro wrote:
    On Thu, 25 Jan 2024 21:16:14 +0000, bart wrote:

    Imagine putting that power into the hands of ordinary users.

    Shock, horror. Of course we elite cannot allow that into the hands of the plebs. Imagine what they might do!

    Power to the people!

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Fri Jan 26 02:21:41 2024
    On 25.01.2024 21:11, David Brown wrote:
    On 25/01/2024 17:11, Janis Papanagnou wrote:
    On 25.01.2024 16:53, David Brown wrote:
    On 25/01/2024 14:35, Janis Papanagnou wrote:
    On 25.01.2024 14:07, David Brown wrote:

    The problem was with the order of evaluation. Prior to C++17
    (where it
    was fixed), if you wrote "cout << one() << two() << three();", the
    order
    the three functions were evaluated was unspecified.

    The last decade or two I haven't been in C++ to any depth. But I'm a
    bit
    surprised by that. The op<< is defined by something like [informally]
    stream op<<(stream,value), where "two() << three()" is "value <<
    value",
    but "cout << one()" would yield a stream, say X, and "X << two()" again >>>> a stream, etc. So actually we have nested functions
    op<<( op<<( op<<(cout, one()), two()), three())
    At least you'd need to evaluate one() to obtain the argument for the
    next outer of the nested calls.


    Not quite. To simplify :

    cout << one() << two()

    is parsed as :

    (cout << one()) << two()

    So "cout << one()" is like a call to "op<<(cout, one())", and the full
    expression is like :

    op<<(op<<(cout, one()), two())

    Yes, up to here that's exactly what I said above (with three nestings).

    op<<( op<<( op<<(cout, one()), two()), three())

    Remove one

    op<<( op<<(cout, one()), two())


    Without the new C++17 order of evaluation rules, the compiler can
    happily execute "two()" before "op<<(cout, one())". The operands to the >>> outer call need to be executed before the outer call itself, but the
    order in which these two operands are evaluated is unspecified (until
    C++17).

    If that was formerly the case then the update was obviously necessary.

    Functionally there would probably have been commotion if

    tmp = op<<(cout, one())
    op<<( tmp, two())

    and

    op<<( op<<(cout, one()), two())

    would have had different results.

    Is or was there any compiler that implemented that in the "unexpected"
    order?


    There were indeed such real-world cases, complaints were made,

    Complaints that the rule was not clear in its definition?
    Or complaints that their compiler did not support cout<<a<<b<<c;
    correctly? - I would be astonished about the latter.
    This is so fundamental a construct and so frequently used that any
    compiler would have been withdrawn in the week after it came out.
    That is my expectation. So I would be grateful if you could provide
    some evidence that I can look up.

    Mind that even if two() is evaluated before one(), it will not be
    output before the stream of the first expression op<<(cout, one())
    is available, and for this one() must be evaluated. Then one() can
    be sent to the stream, and then also two() can be sent to the stream.
    (Am I missing something?)

    Janis

    and the rules changed in C++17.

    Usually it doesn't matter what order arguments to functions (or operands
    to operators) are evaluated. Some compilers have consistent ordering
    (and it is often last to first, not first to last), others pick whatever makes sense at the time. The ordering has been explicitly and clearly
    stated as "unspecified" since around the beginning of time (which was,
    as we all know, 01.01.1970).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Janis Papanagnou on Fri Jan 26 01:19:43 2024
    On 2024-01-26, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 25.01.2024 23:01, Lawrence D'Oliveiro wrote:
    On Thu, 25 Jan 2024 21:16:14 +0000, bart wrote:

    Imagine putting that power into the hands of ordinary users.

    Shock, horror. Of course we elite cannot allow that into the hands of the
    plebs. Imagine what they might do!

    Power to the people!

    And through the chairs of bad people!

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Fri Jan 26 15:59:11 2024
    On 25/01/2024 21:18, Lawrence D'Oliveiro wrote:
    On Thu, 25 Jan 2024 21:07:55 +0100, David Brown wrote:

    This illustrates the two big difficulties with Unicode symbols for this
    kind of thing. Lots of them are difficult to type for many people (at
    least, not without a good deal of messing around or extra programs).

    The compose key on *nix systems gives you a fairly mnemonic way of typing many of them.


    It lets you type some, but it is still limited in the default setup.
    It's very useful for things like diacriticals on letters that you
    already have, but if you want to use it for something out of the
    ordinary, you need to make your own .XCompose file. And then you have
    to remember to update things on your home computer, work computer,
    laptop, etc. So it is very useful (I use it myself), but not an
    out-of-the-box solution.

    And it's easy to have different symbols that appear quite similar as
    glyphs, but are very different characters as far as the compiler is
    concerned.

    You can actually take advantage of that. E.g. from some of my Python code:

    for cłass in (Window, Pixmap, Cursor, GContext, Region) :
        delattr(cłass, "__del__")
    #end for

    The human reader might not actually notice (or care) that a particular identifier looks like a reserved word, since the meaning is obvious from context. The compiler cannot deduce the meaning from that context, but
    then, it doesn’t need to.

    I am not at all keen on that. I am not against using non-ASCII letters
    as though they were special symbols for particular purposes, but I'd
    want them to stand out clearly.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Fri Jan 26 17:06:35 2024
    On 26/01/2024 13:17, Malcolm McLean wrote:

    We could say that in comp.lang.c "function" shall mean "a subroutine"

    <snip the blather>

    Why don't we just say - as everyone in this group except you already
    says, that in c.l.c. "function" means "C function" as described in the C standards, and any other type of function needs to be qualified?

    Thus "the tan function" here means the function from <math.h>, not the mathematical function, or something done when making leather.

    It really is not difficult.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Fri Jan 26 17:01:57 2024
    On 26/01/2024 02:21, Janis Papanagnou wrote:
    On 25.01.2024 21:11, David Brown wrote:
    On 25/01/2024 17:11, Janis Papanagnou wrote:

    Is or was there any compiler that implemented that in the "unexpected"
    order?


    There were indeed such real-world cases, complaints were made,

    Complaints that the rule was not clear in its definition?
    Or complaints that their compiler did not support cout<<a<<b<<c;
    correctly? - I would be astonished about the latter.

    The pre-C++17 rule was perfectly clear - there was no specified order of execution for the operands. (And I thought I'd made /that/ perfectly
    clear already.) Compilers all worked correctly - they can hardly have
    fallen foul of a rule that did not exist.

    The complaints (at least, the ones based on facts rather than misunderstandings) were about the lack of a rule that enforced
    evaluation order in certain cases.

    So C++17 added rules for evaluation orders in some circumstances, but
    not others. In C++17, but not before (and not in C), the evaluation of
    the expression "one" (and any side-effects) must come before the
    evaluation of "two" for, amongst other things :

    one << two
    one >> two
    one[two]
    two = one

    There is still /no/ ordering for

    one * two
    one + two

    and many other cases.

    And of course there are cases where there has always been a sequence
    point, and therefore an order of evaluation (a logical order, that is -
    if the compiler can see it makes no difference to the observable
    effects, it can always re-arrange anything).

    <https://en.cppreference.com/w/cpp/language/eval_order>
    <https://en.cppreference.com/w/c/language/eval_order>


    This is so fundamental a construct and so frequently used that any
    compiler would have been withdrawn in the week after it came out.
    That is my expectation. So I would be grateful if you could provide
    some evidence that I can look up.

    <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0145r3.pdf>

    For an example in practice, where you can see the generated assembly:

    <https://www.godbolt.org/z/fWezzx1nd>

    If I remember correctly, gcc 7 implemented the ordering rules from C++17
    and back-ported them to previous C++ standards for user convenience (as
    the order was previously unspecified, it was fine to do that).

    Look at the generated assembly and the order in which the calls to
    one(), two(), three() and four() are made. For the operator "<<", they
    are made in order one() to four(). For the operator "+", and for
    function call parameters, they are generated in order four() to one()
    for this case. (In other cases, that may be different - that's what "unspecified" means.)


    Mind that even if two() is evaluated before one(), it will not be
    output before the stream of the first expression op<<(cout, one())
    is available, and for this one() must be evaluated. Then one() can
    be sent to the stream, and then also two() can be sent to the stream.
    (Am I missing something?)

    The output to the stream must be in the order given in the code - that
    is true. But the values to be output could (prior to C++17) be
    evaluated in any order. If one() and two() have side-effects, that is
    critical - those side-effects could be executed in any order.


    Janis

    and the rules changed in C++17.

    Usually it doesn't matter what order arguments to functions (or operands
    to operators) are evaluated. Some compilers have consistent ordering
    (and it is often last to first, not first to last), others pick whatever
    makes sense at the time. The ordering has been explicitly and clearly
    stated as "unspecified" since around the beginning of time (which was,
    as we all know, 01.01.1970).



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Fri Jan 26 17:12:29 2024
    On 25/01/2024 16:26, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 25/01/2024 10:55, Malcolm McLean wrote:
    On 25/01/2024 03:59, Keith Thompson wrote:

    As for K&R's thinking, I have no particular insight on that.  I have no problem with some operators being represented by symbols and others by keywords (I'm accustomed to it from other languages), and I don't see
    that the decision to make "sizeof" a keyword even requires any
    justification.

    I looked it up on the web, but I can't find anything that goes back
    to K and R and explains why they took that decision. But clearly to
    use a word rather than punctuators, as was the case with every other
    operator, must have had a reason.

    I think they wanted it to look function-like, because it is
    a function, though a function of a type rather than of bits, so of
    course not a "function" in the C standard sense of the term.

    It is not a function in the C sense - "sizeof x" is not like a
    function call (where "x" is a variable or expression, rather than a
    type). However, many people (myself included) feel it is clearer in
    code to write it as "sizeof(x)", making it look more like a function
    or function-like macro.

    And many people (myself included) feel it is clearer to write it as
    `sizeof x`, precisely so it *doesn't* look like a function call, because
    it isn't one. Similarly, I don't use unnecessary parentheses on return statements.

    I also write `sizeof (int)` rather than `sizeof(int)`. The parentheses
    look similar to those in a function call, but the construct is
    semantically distinct. I think of keywords as a different kind of
    token than identifiers, even though they look similar (and the standard describes them that way).


    Fair enough. I think a lot of this kind of thing is just habit or
    personal preference. There's always little differences in the way
    people write their code.

    I suspect the prime reason "sizeof" is a word, rather than a symbol or
    sequence of symbols, is that the word is very clear while there are no
    suitable choices of symbols for the task. The nearest might have been
    "#", but that might have made pre-processor implementations more
    difficult. Of course any symbol or combination /could/ have been
    used, and people would have learned its meaning, but "sizeof" just
    seems so much simpler.

    It has occurred to me that if there had been a strong desire to use a
    symbol, "$" could have worked. It even suggests the 's' in the word
    "size".

    But there was no such desire. sizeof happens to be the only operator
    whose symbol is a keyword, but I see no particular significance to this,
    and no reason not to define it that way. I might even have preferred keywords for some of C's well-populated zoo of operators. See also
    Pascal, which has keywords "and", "or", "not", and "mod".


    Agreed. I don't think these details really make a big difference to programming languages.
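
    To make the three spellings discussed here concrete (a minimal sketch):

        #include <cstdio>

        int main() {
            int x = 0;
            // Parentheses around an expression operand are optional;
            // a type name operand must be parenthesized.
            std::printf("%zu %zu %zu\n",
                        sizeof x,        // expression, no parentheses
                        sizeof(x),       // same expression, with parentheses
                        sizeof (int));   // type name, parentheses required
        }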

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Fri Jan 26 18:31:57 2024
    All that you wrote below targets your last sentence
    "those side-effects could be executed in any order".
    For the examples we had, like (informally) cout<<a<<b<<c;
    this is undisputed for the SIDE EFFECTS of "a", etc. You
    had "hidden" those side effects in "one()"; I gave in an
    earlier post the more obvious example c++ in the context
    of cout << c++ << c++ << c++ << endl; as side effects.
    All side effects can be a problem (and should be avoided
    unless "necessary"). My point was that the order of '<<'
    with its arguments is NOT corrupted. I interpreted your
    previous posting as saying you'd heard that to be an issue.
    If you didn't mean to say that, there's nothing more to
    say about the issue, since the other things you filled your
    post with are only distracting from the point in question.

    Janis


    On 26.01.2024 17:01, David Brown wrote:
    On 26/01/2024 02:21, Janis Papanagnou wrote:
    On 25.01.2024 21:11, David Brown wrote:
    On 25/01/2024 17:11, Janis Papanagnou wrote:

    Is or was there any compiler that implemented that in the "unexpected" order?


    There were indeed such real-world cases, complaints were made,

    Complaints that the rule was not clear in its definition?
    Or complaints that their compiler did not support cout<<a<<b<<c;
    correctly? - I would be astonished about the latter.

    The pre-C++17 rule was perfectly clear - there was no specified order of execution for the operands. (And I thought I'd made /that/ perfectly
    clear already.) Compilers all worked correctly - they can hardly have
    fallen foul of a rule that did not exist.

    The complaints (at least, the ones based on facts rather than misunderstandings) were about the lack of a rule that enforced
    evaluation order in certain cases.

    So C++17 added rules for evaluation orders in some circumstances, but
    not others. In C++17, but not before (and not in C), the evaluation of
    the expression "one" (and any side-effects) must come before the
    evaluation of "two" for, amongst other things :

    one << two
    one >> two
    one[two]
    two = one

    There is still /no/ ordering for

    one * two
    one + two

    and many other cases.

    And of course there are cases where there has always been a sequence
    point, and therefore an order of evaluation (a logical order, that is -
    if the compiler can see it makes no difference to the observable
    effects, it can always re-arrange anything).

    <https://en.cppreference.com/w/cpp/language/eval_order>
    <https://en.cppreference.com/w/c/language/eval_order>


    This is so fundamental a construct and so frequently used that any
    compiler would have been withdrawn in the week after it came out.
    That is my expectation. So I would be grateful if you could provide
    some evidence that I can look up.

    <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0145r3.pdf>

    For an example in practice, where you can see the generated assembly:

    <https://www.godbolt.org/z/fWezzx1nd>

    If I remember correctly, gcc 7 implemented the ordering rules from C++17
    and back-ported them to previous C++ standards for user convenience (as
    the order was previously unspecified, it was fine to do that).

    Look at the generated assembly and the order in which the calls to
    one(), two(), three() and four() are made. For the operator "<<", they
    are made in order one() to four(). For the operator "+", and for
    function call parameters, they are generated in order four() to one()
    for this case. (In other cases, that may be different - that's what "unspecified" means.)


    Mind that even if two() is evaluated before one(), it will not be
    output before the stream of the first expression op<<(cout, one())
    is available, and for this one() must be evaluated. Then one() can
    be sent to the stream, and then also two() can be sent to the stream.
    (Am I missing something?)

    The output to the stream must be in the order given in the code - that
    is true. But the values to be output could (prior to C++17) be
    evaluated in any order. If one() and two() have side-effects, that is critical - those side-effects could be executed in any order.


    Janis

    and the rules changed in C++17.

    Usually it doesn't matter what order arguments to functions (or operands to operators) are evaluated. Some compilers have consistent ordering
    (and it is often last to first, not first to last), others pick whatever makes sense at the time. The ordering has been explicitly and clearly
    stated as "unspecified" since around the beginning of time (which was,
    as we all know, 01.01.1970).




    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Fri Jan 26 18:59:02 2024
    On 26.01.2024 17:06, David Brown wrote:
    On 26/01/2024 13:17, Malcolm McLean wrote:

    We could say that in comp.lang.c "function" shall mean "a subroutine"

    Why don't we just say - as everyone in this group except you already
    says, that in c.l.c. "function" means "C function" as described in the C standards, and any other type of function needs to be qualified?

    Thus "the tan function" here means the function from <math.h>, not the mathematical function, or something done when making leather.

    It really is not difficult.

    Unless the discussion was done on a meta-level as opposed to a
    concrete language-specific implementation model of a function,
    or concrete functions. - My impression from the posts upthread
    was that we were talking on the meta-level to understand what we
    actually have (with the 'sizeof' beast) or how to consider it
    conceptually.

    I also think that this is the key to not talk past each other.

    The term "function" in computer science seems to have never been
    an issue of dispute - I mean on a terminology level; explanations
    in lectures or books were quite coherent, and since there was no
    dispute everyone seems to have understood what a function is; in
    computer science and in mathematics.

    From my references there seems to be a consensus at least in that it's
    reflecting a mathematical f: (x,y,...) -> (u,v,...) which is
    projected at (or implemented by) some routine/procedure/method/
    function, etc. - however it's called in any programming language.

    The terminology certainly differs, but the interpretation less.

    If we look deeper at the issue we can of course make academic
    battles about other "function concepts" (my favorite example
    is analogue computers; but that's extreme, of course). But in
    that narrow corner we're discussing things it's sufficient IMO,
    and probably more rewarding than restricting on the C function
    implementation model.

    How should we get principle insights on 'sizeof', what it is,
    what it should be, etc., if we stay within this restricted C
    world terminology, and discussing even a very special type of
    a, umm.., function (sort of).

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Fri Jan 26 19:59:23 2024
    On 26/01/2024 18:31, Janis Papanagnou wrote:
    All what you wrote below targets at your last sentense
    "those side-effects could be executed in any order".
    For the examples we had, like (informally) cout<<a<<b<<c;
    this is undisputed for the SIDE EFFECTS of "a", etc. You
    had "hidden" those side effects in "one()", I gave in an
    earlier post the more obvious example c++ in the context
    of cout << c++ << c++ << c++ << endl; as side effects.
    All side effects can be a problem (and should be avoided
    unless "necessary"). My point was that the order of '<<'
    with its arguments is NOT corrupted. I interpreted your
    previous posting that you'd have heard that to be an issue.
    If you haven't meant to say that there's nothing more to
    say about the issue, since the other things you filled your
    post with is only distracting from the point in question.


    I said - repeatedly - that the order of evaluation of the operands to
    most operators is unspecified in C and C++. This could result in
    behaviour that was unexpected for some people, especially in connection
    with cout and other C++ streams, and was thus specified in C++17 for
    specific cases.

    A typical example would be :

    cout << "Start time: " << get_time() << "\n"
    << "Running tests... " << run_tests() << "\n"
    << "End time: " << get_time();

    It was realistic - and indeed happened in some cases - for pre-C++17
    compilers to generate the second "get_time()" call before "run_tests()",
    and finally do the first "get_time()" call. Alternatively, the compiler
    could call "get_time()" twice, with "run_tests()" called either before
    or after that pair. In all these cases, the user will see an output
    that was not at all what they intended, with time appearing to go
    backwards or the test apparently taking no time.

    This was the case regardless of whether or not "get_time()" and
    "run_tests()" had any side-effects.

    You are, quite obviously, guaranteed that in "cout << a << b << c", the
    output was in order a, b, c. But that is a totally different matter
    from the order of evaluation (and execution, for function calls) of the subexpressions a, b, and c.


    I have said exactly what I intended to say in this thread, but I suspect
    you have mistaken what the term "order of evaluation" means, and
    therefore misunderstood what I wrote. I hope this is all clear to you now.







    On 26.01.2024 17:01, David Brown wrote:
    On 26/01/2024 02:21, Janis Papanagnou wrote:
    On 25.01.2024 21:11, David Brown wrote:
    On 25/01/2024 17:11, Janis Papanagnou wrote:

    Is or was there any compiler that implemented that in the "unexpected" order?


    There were indeed such real-world cases, complaints were made,

    Complaints that the rule was not clear in its definition?
    Or complaints that their compiler did not support cout<<a<<b<<c;
    correctly? - I would be astonished about the latter.

    The pre-C++17 rule was perfectly clear - there was no specified order of
    execution for the operands. (And I thought I'd made /that/ perfectly
    clear already.) Compilers all worked correctly - they can hardly have
    fallen foul of a rule that did not exist.

    The complaints (at least, the ones based on facts rather than
    misunderstandings) were about the lack of a rule that enforced
    evaluation order in certain cases.

    So C++17 added rules for evaluation orders in some circumstances, but
    not others. In C++17, but not before (and not in C), the evaluation of
    the expression "one" (and any side-effects) must come before the
    evaluation of "two" for, amongst other things :

    one << two
    one >> two
    one[two]
    two = one

    There is still /no/ ordering for

    one * two
    one + two

    and many other cases.

    And of course there are cases where there has always been a sequence
    point, and therefore an order of evaluation (a logical order, that is -
    if the compiler can see it makes no difference to the observable
    effects, it can always re-arrange anything).

    <https://en.cppreference.com/w/cpp/language/eval_order>
    <https://en.cppreference.com/w/c/language/eval_order>


    This is so fundamental a construct and so frequently used that any
    compiler would have been withdrawn in the week after it came out.
    That is my expectation. So I would be grateful if you could provide
    some evidence that I can look up.

    <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0145r3.pdf>

    For an example in practice, where you can see the generated assembly:

    <https://www.godbolt.org/z/fWezzx1nd>

    If I remember correctly, gcc 7 implemented the ordering rules from C++17
    and back-ported them to previous C++ standards for user convenience (as
    the order was previously unspecified, it was fine to do that).

    Look at the generated assembly and the order in which the calls to
    one(), two(), three() and four() are made. For the operator "<<", they
    are made in order one() to four(). For the operator "+", and for
    function call parameters, they are generated in order four() to one()
    for this case. (In other cases, that may be different - that's what
    "unspecified" means.)


    Mind that even if two() is evaluated before one(), it will not be
    output before the stream of the first expression op<<(cout, one())
    is available, and for this one() must be evaluated. Then one() can
    be sent to the stream, and then also two() can be sent to the stream.
    (Am I missing something?)

    The output to the stream must be in the order given in the code - that
    is true. But the values to be output could (prior to C++17) be
    evaluated in any order. If one() and two() have side-effects, that is
    critical - those side-effects could be executed in any order.


    Janis

    and the rules changed in C++17.

    Usually it doesn't matter what order arguments to functions (or operands to operators) are evaluated. Some compilers have consistent ordering
    (and it is often last to first, not first to last), others pick whatever makes sense at the time. The ordering has been explicitly and clearly stated as "unspecified" since around the beginning of time (which was, as we all know, 01.01.1970).





    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Fri Jan 26 20:18:47 2024
    On 26/01/2024 18:59, Janis Papanagnou wrote:
    On 26.01.2024 17:06, David Brown wrote:
    On 26/01/2024 13:17, Malcolm McLean wrote:

    We could say that in comp.lang.c "function" shall mean "a subroutine"

    Why don't we just say - as everyone in this group except you already
    says, that in c.l.c. "function" means "C function" as described in the C
    standards, and any other type of function needs to be qualified?

    Thus "the tan function" here means the function from <math.h>, not the
    mathematical function, or something done when making leather.

    It really is not difficult.

    Unless the discussion was done on a meta-level as opposed to a
    concrete language specific implementation-model of a function,
    or a concrete functions. - My impression from the posts upthread
    was that we were taking on the meta-level to understand what we
    actually have (with tha 'sizeof' beast) or how to consider it
    conceptionally.

    We are - probably futilely - trying to get Malcolm to understand that
    even in "meta-level" discussions, it is vital to be clear what is meant
    by terms. And "function" alone means "C function" in c.l.c. You might
    often think it is obvious from the context whether someone means "C
    functions", "mathematical functions", or "wedding functions", but with
    Malcolm you /never/ know. It regularly means "Malcolm functions", which
    have an approximate definition that might change at any time.


    I also think that this is the key to not talk past each other.

    The term "function" in computer science seems to have never been
    an issue of dispute - I mean on a terminology level; explanations
    in lectures or books were quite coherent, and since there was no
    dispute everyone seems to have understood what a function is; in
    computer science and in mathematics.

    The term "function" is most certainly in dispute in computer science.
    It means different things - sometimes subtly, sometimes significantly -
    in the context of different programming languages, or computation
    theory, or mathematics. A "C function" is different from a "Pascal
    function", a "lambda calculus function", a "Turing machine function", or
    any other kind of function definition you want to pick.


    From my references it seems a consensus at least in that it's
    reflecting a mathematical f: (x,y,...) -> (u,v,...) which is
    projected at (or implemented by) some routine/procedure/method/
    function, etc. - however it's called in any programming language.

    No, that is only one kind of function. There are all sorts of questions
    to ask.

    Can functions have side effects?

    Do functions have to have outputs? Do they have to have inputs?

    Does a function have to give the same output for the same inputs?

    Can a function give more than one output? Does a function actually have
    to be executed as called, or can the language re-arrange things?

    Is it valid to have a function that does not satisfy certain
    requirements, if that function is never called?

    Can functions operate on types? Can they operate on other functions?
    Can they operate on whole programs?

    Does the function include some kind of data store? Does it include the
    machine it executes on?

    Does a function have to be executable? Does it even have to be
    computable? Does it have to execute in a finite time?

    Is a function a run-time entity, or a compile-time entity? Can it be
    changed at run-time? Does it make sense to "run" a function at compile
    time?

    I'm sure we could go on.


    The terminology certainly differs, but the interpretation less.

    The problem is that the terminology is the same, but the interpretation
    can be wildly different. In order to communicate, we must be sure that
    a given term is interpreted in the same way by each person.


    If we look deeper at the issue we can of course make academic
    battles about other "function concepts" (my favorite example
    is analogue computers; but that's extreme, of course). But in
    that narrow corner we're discussing things it's sufficient IMO,
    and probably more rewarding than restricting on the C function
    implementation model.

    I think we're fine sticking to "function" meaning "C function", which is
    well defined by the C standards, and using "mathematical function" for mathematical functions, which are also quite solidly defined. Any other
    usage will need to be explained at the time.


    How should we get principle insights on 'sizeof', what it is,
    what it should be, etc., if we stay within this restricted C
    world terminology, and discussing even a very special type of
    a, umm.., function (sort of).


    Sizeof is not a C function. It is a C operator. If you don't know what
    it is or how it works, or want the technical details, it's all in
    6.5.3.4 of the C standards.

    Trying to describe "sizeof" as a function of some sort with a different
    kind of use of the word "function" really doesn't get us anywhere, as
    shown in this thread. It is what it is - trying to mush it into another
    term is not helpful.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Keith Thompson on Fri Jan 26 22:01:52 2024
    On 26.01.2024 21:27, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    You are, quite obviously, guaranteed that in "cout << a << b << c",
    the output was in order a, b, c. But that is a totally different
    matter from the order of evaluation (and execution, for function
    calls) of the subexpressions a, b, and c.
    [...]

    Perhaps I can help clarify this a bit (or perhaps muddy the waters
    even further). I'll try to add a bit of C relevance at the bottom.

    In `cout << a << b << c`, if a, b, and c are names of non-volatile
    objects, the evaluation order doesn't matter. The values of a, b,
    and c will be written to the standard output stream in that order,
    in all versions of C++.

    Please note that I used cout << a << b << c as a meta expression
    to _subsume_ in one expression the two variants that may have possible
    side effects; that with three functions, and that with three instances
    of c++. I hoped that was clear from the subsequent text, but obviously
    it confused the matter. Sorry about that.


    In `cout << x() << y() << z()`, it's also guaranteed that the
    result of the call to `x()` will precede the result of the call to
    `y()`, which will precede the result of the call to `z()`, in the
    text written to the output stream. What's not guaranteed prior
    to C++17 is the order in which the three functions will be called.
    If none of the functions have side effects that affect the results
    of the other two, or depend on non-local data, it doesn't matter.
    If the functions return, say, a string representation of the current
    time with nanosecond resolution, the three results can be in any
    of 6 orders prior to C++17; in C++17 and later, the timestamps will
    always be in increasing order.

    Yes.


    C++ overloads the "<<" shift operator for output operations, so each
    "<<" after `std::cout` is really a function call, but the rules for sequencing and order of evaluation are the same as for the built-in
    "<<" integer shift operation. C++ could have imposed sequencing
    requirements only on overloaded "<<" and ">>" operators, but that
    would have been more difficult to specify in the standard.

    Okay.


    C++17 added a new requirement that the evaluation of the left
    operand of "<<" or ">>" is "sequenced before" the right operand,
    meaning that any side effects of the evaluation of the left operand
    must be complete before evaluation of the right operand begins
    (though optimizations that don't change the visible behavior are
    still allowed). It did not add such a requirement for the "+"
    operator, which is overloaded for std::string concatenation.

    This is actually what I was asking for; what they changed. Thanks.
    (And as I suspected it's about "side effects of the evaluation".)
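
    A small sketch of that difference (f() and g() are placeholder functions; their cerr traces make the call order visible):

        #include <iostream>
        #include <string>

        std::string f() { std::cerr << "f called\n"; return "foo"; }
        std::string g() { std::cerr << "g called\n"; return "bar"; }

        int main() {
            // Even in C++17 the operands of "+" may be evaluated in either
            // order, although the result is always "foobar".
            std::string s = f() + g();
            std::cout << s << "\n";

            // Since C++17, this line is guaranteed to call f() before g().
            std::cout << f() << g() << "\n";
        }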

    [ snip example and prospect ]

    And no, you haven't muddied the issue, au contraire, it's a very
    clear presentation.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Fri Jan 26 22:46:25 2024
    On 26.01.2024 22:16, Lawrence D'Oliveiro wrote:
    On Thu, 25 Jan 2024 14:07:25 +0100, David Brown wrote:

    "cout << one() << two() << three();"

    Those C++ operators for I/O are a brain-dead idea. C-style printf formats actually work better.

    Well, no. There's a reason for using operators. In OO design you define classes, and you can define ostream operators << for your classes so
    that you can use these like elementary types in an output stream. This
    means you can output (and input) arbitrary complex classes. You'll also
    get strong type-safety. And whatnot. Also the stream hierarchy offers
    design and implementation paths that you just don't have with printf().

    In case you haven't done OO design & programming there's unfortunately
    not an easy way to explain or go into the necessary details here. I can
    only suggest getting into that topic; it's worth it, IMO.

    The printf() method is quite old; with simple types it's a more compact
    formulation. You have some cryptic characters that shorten the
    format string (whereas with C++ manipulators and other features you'll
    have more flexibility and extensibility, but pay for that with a bulkier form).
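
    A minimal sketch of the operator<< idea (the Point class here is just an illustrative stand-in):

        #include <iostream>

        struct Point { double x, y; };

        // Overloading operator<< makes Point usable in an output stream like
        // an elementary type, with full compile-time type checking.
        std::ostream& operator<<(std::ostream& os, const Point& p) {
            return os << "(" << p.x << ", " << p.y << ")";
        }

        int main() {
            Point p{1.5, -2.0};
            std::cout << "p = " << p << "\n";   // prints "p = (1.5, -2)"
        }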

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Fri Jan 26 22:30:02 2024
    On 26.01.2024 19:59, David Brown wrote:

    I said - repeatedly - that the order of evaluation of the operands to
    most operators is unspecified in C and C++. [...]

    Yes, and this was undisputed.


    A typical example would be :

    cout << "Start time: " << get_time() << "\n"
    << "Running tests... " << run_tests() << "\n"
    << "End time: " << get_time();

    It was realistic - and indeed happened in some cases - for pre-C++17 compilers to generate the second "get_time()" call before "run_tests()",
    and finally do the first "get_time()" call.

    Yes, we have no differences.

    And the sample nicely shows how we should NOT implement such time measurements (or similar logic)!

    A computer scientist or a sophisticated programmer would know that
    there are run-times associated with such expressions:

    cout << "S1" << f1() << "S2" << f2() << "S3" << f3();

    t1 t2 t3 t4 t5 t6 t7 t8 t9

    and he would act accordingly and serialize the expression (see below).

    Alternatively, the compiler
    could call "get_time()" twice, with "run_tests()" called either before
    or after that pair. In all these cases, the user will see an output
    that was not at all what they intended, with time appearing to go
    backwards or the test apparently taking no time.

    This was the case regardless of whether or not "get_time()" and
    "run_tests()" had any side-effects.

    We disagree here; it may not appear so to you, but get_time() actually
    has a "side effect" (I put it in quotes because it's literally not an
    "effect", yet for the argument about its _sequencing problem_ it's a
    relevant externality). It obtains (probably from a hardware device)
    the time when the call happened.

    That's why somewhat experienced programmers would not write the above
    code that way; something like "run_tests()" typically is (or can be)
    very time consuming, so they'd do
    t0 = get_time(); res = run_tests(); t1 = get_time();
    cout << ... etc.

    (Note: This argument implies NOT that a language shouldn't be made as bulletproof as possible and sensible.)
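
    Expanded into a compilable sketch, the serialized form above looks like this (get_time() and run_tests() are stand-ins):

        #include <chrono>
        #include <iostream>

        long long get_time() {
            using namespace std::chrono;
            return duration_cast<milliseconds>(
                       steady_clock::now().time_since_epoch()).count();
        }
        int run_tests() { return 42; }

        int main() {
            // Evaluate in separate statements, so the measurement no longer
            // depends on any operand evaluation order.
            auto t0  = get_time();
            auto res = run_tests();
            auto t1  = get_time();
            std::cout << "Start: "  << t0  << "\n"
                      << "Result: " << res << "\n"
                      << "End: "    << t1  << "\n";
        }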


    You are, quite obviously, guaranteed that in "cout << a << b << c", the output was in order a, b, c. But that is a totally different matter
    from the order of evaluation (and execution, for function calls) of the subexpressions a, b, and c.

    (It was meant as a "meta expression". I've addressed that in my
    response to Keith already; please see there.)


    I have said exactly what I intended to say in this thread, but I suspect
    you have mistaken what the term "order of evaluation" means, and
    therefore misunderstood what I wrote. I hope this is all clear to you now.

    The order of evaluation of the '<<' was what I spoke about. The order
    of the arguments had never been an issue. The "problem" with the order
    of the arguments becomes a problem (without quotes) when side effects
    of the arguments are inherent to the arguments.

    You had been focused on the evaluation of the arguments (where side
    effects might lead to unexpected behavior). I wasn't.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Fri Jan 26 21:16:25 2024
    On Thu, 25 Jan 2024 14:07:25 +0100, David Brown wrote:

    "cout << one() << two() << three();"

    Those C++ operators for I/O are a brain-dead idea. C-style printf formats actually work better.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Fri Jan 26 22:41:26 2024
    On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:

    On 26.01.2024 22:16, Lawrence D'Oliveiro wrote:
    On Thu, 25 Jan 2024 14:07:25 +0100, David Brown wrote:

    "cout << one() << two() << three();"

    Those C++ operators for I/O are a brain-dead idea. C-style printf
    formats actually work better.

    Well, no. There's a reason for using operators.

    But remember, you often need to do localized output. Which means parts of
    a message may need to be reordered for grammatical purposes.

    You can do this with printf, but not with C++ output operators.
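
    For example (using the POSIX positional-argument extension to printf, not ISO C; the message strings are invented for illustration):

        #include <cstdio>

        int main() {
            // A translated message may need its arguments in a different order.
            // The "%1$s" / "%2$d" positional conversions are a POSIX extension,
            // not part of ISO C or C++.
            const char *fmt_en = "%1$s has %2$d new messages\n";
            const char *fmt_de = "%2$d neue Nachrichten fuer %1$s\n";
            std::printf(fmt_en, "Alice", 3);
            std::printf(fmt_de, "Alice", 3);
        }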

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Fri Jan 26 23:52:44 2024
    On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:

    Also the stream hierarchy offers
    design and implementation paths that you just don't have with printf().

    And that you don’t need, frankly. Java manages just fine with printf-style formatting and “toString()” methods.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Fri Jan 26 23:51:56 2024
    On Fri, 26 Jan 2024 15:41:43 -0800, Keith Thompson wrote:

    You can do this with POSIX printf.

    POSIX specifies an extension to printf that allows arguments to be re-ordered. For example:
    printf("%2$s%1$s\n", "foo", "bar");
    prints "barfoo".

    ISO C does not have this feature.

    I often feel, reading the complaints about deficiencies in this group,
    that C does not work very well unless it is running on top of a *nix-type system.

    C++'s `cout << ...` has advantages and disadvantages.

    Interesting about Java, with all its needless complexity and futile
    attempts at simplification, that this was one decision it made correctly,
    and that was not to copy those operators.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Sat Jan 27 01:12:05 2024
    On 26.01.2024 20:18, David Brown wrote:
    On 26/01/2024 18:59, Janis Papanagnou wrote:
    On 26.01.2024 17:06, David Brown wrote:
    On 26/01/2024 13:17, Malcolm McLean wrote:

    We could say that in comp.lang.c "function" shall mean "a subroutine"

    Why don't we just say - as everyone in this group except you already
    says, that in c.l.c. "function" means "C function" as described in the C standards, and any other type of function needs to be qualified?

    Thus "the tan function" here means the function from <math.h>, not the
    mathematical function, or something done when making leather.

    It really is not difficult.

    Unless the discussion was done on a meta-level as opposed to a
    concrete language specific implementation-model of a function,
    or a concrete functions. - My impression from the posts upthread
    was that we were taking on the meta-level to understand what we
    actually have (with tha 'sizeof' beast) or how to consider it
    conceptionally.

    We are - probably futilely - trying to get Malcolm to understand that
    even in "meta-level" discussions, it is vital to be clear what is meant
    by terms. And "function" alone means "C function" in c.l.c. You might
    often think it is obvious from the context whether someone means "C functions", "mathematical functions", or "wedding functions", but with Malcolm you /never/ know. It regularly means "Malcolm functions", which
    have an approximate definition that might change at any time.

    (I don't like the habit of introducing personalized terms like
    "Malcolm functions"; this habit exposes more of the person who
    introduced it than anything else. And it anyway would only muddy
    the issue, not clarify it.)



    I also think that this is the key to not talk past each other.

    The term "function" in computer science seems to have never been
    an issue of dispute - I mean on a terminology level; explanations
    in lectures or books were quite coherent, and since there was no
    dispute everyone seems to have understood what a function is; in
    computer science and in mathematics.

    The term "function" is most certainly in dispute in computer science. It means different things - sometimes subtly, sometimes significantly - in
    the context of different programming languages, or computation theory,
    or mathematics.

    We have to divide et impera the area for the discussion to not
    get into the wild. The first divide is the abstraction level;
    this is what I've done below. (Because any technical differences
    of every single computer language makes no sense in a discussion
    on a higher abstraction level. - I see no use in differentiating
    Pascal functions from Simula or Algol functions for the discussion
    here.)

    (I fear this thread will lead nowhere, but okay, I'll enter...)

    A "C function" is different from a "Pascal function", a
    "lambda calculus function", a "Turing machine function", or any other
    kind of function definition you want to pick.

    What relevance has any technical difference of "C functions"
    and "Pascal functions"? - None.

    What do you think is the difference between a function from
    the lambda-calculus and a function from/for Turing Machines
    concerning the class of functions that can be expressed and
    calculated respectively? And in what way would syntax of any
    of these two languages contribute to the question? - Nothing.

    Note: I don't want you to answer these questions. I suppose
    you might have some substantial CS background (I certainly do)
    and are not just spreading buzzwords.
    Neither the technical (implementation) differences of the first
    two types, nor the algorithm-theory definitions of the latter two
    function types, are relevant for the topics discussed here.



    From my references it seems a consensus at least in that it's
    reflecting a mathematical f: (x,y,...) -> (u,v,...) which is
    projected at (or implemented by) some routine/procedure/method/
    function, etc. - however it's called in any programming language.

    No, that is only one kind of function.

    That is an abstract representation from mathematics (and I am
    not interested in syntactic differences to other forms) that can
    be directly mapped to an algorithmic representation.

    We write (for example [borrowed from a book]):

    f: R x R x R -> R for the domains; R here: real numbers

    f(r,R,h) -> pi/3 x h x (r^2 + r x R + R^2)

    and in computer languages (for example) syntactic variants of:

    f = (real r, real R, real h) real :
    pi/3 * h * (r^2 + r * R + R^2)

    The function from the language closely resembles that from the
    mathematical domain.

    This is an Algol(-like) syntax representation, other languages
    have other syntaxes but mostly it boils down to few differences.
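
    For comparison, a direct transcription into C++ (just a sketch of the same mapping):

        #include <cstdio>

        // The formula from above, written as a C++ function.
        double f(double r, double R, double h) {
            const double pi = 3.14159265358979323846;
            return pi / 3.0 * h * (r * r + r * R + R * R);
        }

        int main() {
            std::printf("%f\n", f(1.0, 2.0, 3.0));   // pi * 7, about 21.99
        }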

    It's not really important for our discussions to consider Algol's
    ref, Pascal's var, C++'s const, or what else. For the sake of the
    discussion upthread it's also irrelevant whether we have parameters
    passed by value, by reference, by name, or consider some deep or
    shallow copy mechanisms, and it's also not necessary to know for
    this discussions whether the caller or the called instance will
    allocate the stack size, or whether there's stack at all allocated.
    There's countless _technical_ differences that are meaningless in
    a taxonomy-like discussion here.

    There are all sorts of questions to ask.

    Yes, but not many (none?) of significance in our discussion context
    here.


    Can functions have side effects?

    Do functions have to have outputs? Do they have to have inputs?

    Does a function have to give the same output for the same inputs?

    Can a function give more than one output? Does a function actually have
    to be executed as called, or can the language re-arrange things?

    Is it valid to have a function that does not satisfy certain
    requirements, if that function is never called?

    Can functions operate on types? Can they operate on other functions?
    Can they operate on whole programs?

    Does the function include some kind of data store? Does it include the machine it executes on?

    Does a function have to be executable? Does it even have to be
    computable? Does it have to execute in a finite time?

    Is a function a run-time entity, or a compile-time entity? Can it be
    changed at run-time? Does it make sense to "run" a function at compile
    time?

    I'm sure we could go on.

    (Yes, and it wouldn't add anything.)



    The terminology certainly differs, but the interpretation less.

    The problem is that the terminology is the same, but the interpretation
    can be wildly different. In order to communicate, we must be sure that
    a given term is interpreted in the same way by each person.

    Yes. But remember that our question was not a technical one; wasn't
    the question by the other poster (Malcolm?) about the mathematical
    term 'function' and how it helps to determine what 'sizeof' is
    actually to be considered?



    If we look deeper at the issue we can of course fight academic
    battles about other "function concepts" (my favorite example
    is analogue computers; but that's extreme, of course). But in
    that narrow corner where we're discussing things it's sufficient IMO,
    and probably more rewarding than restricting ourselves to the C
    function implementation model.

    I think we're fine sticking to "function" meaning "C function", which is
    well defined by the C standards, and using "mathematical function" for mathematical functions, which are also quite solidly defined. Any other usage will need to be explained at the time.


    How should we get principled insights into 'sizeof', what it is,
    what it should be, etc., if we stay within this restricted C
    world terminology, and discuss even a very special type of
    a, umm.., function (sort of)?


    Sizeof is not a C function.

    I know it's an operator in C. And I also wasn't saying that it's a
    C function. - You still see the "(sort of)" in my statement. And we
    already spoke about the close (but not exact) equivalences between
    functions and operators.

    It is a C operator. If you don't know what
    it is or how it works, or want the technical details, it's all in
    6.5.3.4 of the C standards.

    If that's all the OP wanted to discuss it would be easy. You don't
    even need any C standard document. Open any book, even the old K&R
    is sufficient, and look up 'sizeof'. You can read there that it is an
    operator, and fine. File closed. Goodbye. (What was the original
    question of this thread again? I seem to recall something about the
    form with parentheses and a type?)


    Trying to describe "sizeof" as a function of some sort with a different
    kind of use of the word "function" really doesn't get us anywhere, as
    shown in this thread. It is what it is - trying to mush it into another
    term is not helpful.

    What would be the difference if the parenthesized form were
    called a function, given that functions and operators are similar,
    and the context so restricted? I don't think you can take its address
    (or can we?); but that again is just another implementation
    detail (C specific).

    The need for parentheses in sizeof(type) seems anyway to be only a
    hack, necessary for type expressions containing blanks, as in
    sizeof(struct x)?
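
    For illustration, a minimal sketch of the two syntactic forms (the
    parentheses are mandatory only when the operand is a type name):

        #include <stdio.h>

        struct x { char c; double d; };

        int main(void)
        {
            int i = 42;
            printf("%zu %zu %zu\n",
                   sizeof i,            /* expression operand: no parentheses needed */
                   sizeof (struct x),   /* type-name operand: parentheses required   */
                   sizeof (i + 1));     /* parentheses around an expression are fine */
            return 0;
        }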

    Janis

    BTW: There was another subthread about preprocessor use for NELEM
    determination using sizeof. When I looked up the K&R reference I
    saw its use described even as a standard pattern to determine the
    number of array elements. No wonder it became idiomatic.
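
    A minimal sketch of that idiom (NELEM is just one of several common
    names for the macro):

        #include <stdio.h>

        #define NELEM(a)  (sizeof (a) / sizeof (a)[0])

        int main(void)
        {
            int primes[] = { 2, 3, 5, 7, 11 };
            /* The usual caveat applies: this works only on actual arrays;
               applied to a pointer it silently gives a wrong answer. */
            printf("%zu elements\n", NELEM(primes));   /* prints "5 elements" */
            return 0;
        }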

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Sat Jan 27 01:17:23 2024
    On 27.01.2024 00:51, Lawrence D'Oliveiro wrote:
    On Fri, 26 Jan 2024 15:41:43 -0800, Keith Thompson wrote:
    C++'s `cout << ...` has advantages and disadvantages.

    Interesting about Java, with all its needless complexity and futile
    attempts at simplification, that this was one decision it made correctly,
    and that was not to copy those operators.

    Choosing these operators is a separate issue. But, yes. With the
    operator precedence borrowed from the shift operator you need to
    be careful. As part of the output operator syntax I'd have expected
    a precedence like '=' or even lower.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Sat Jan 27 01:27:55 2024
    On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:
    On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:

    Also the stream hierarchy offers
    design and implementation paths that you just don't have with printf().

    And that you don’t need, frankly.

    Don't be so fast with your judgment. Of course we use it to elegantly
    and scalably solve tasks in C++.

    Java manages just fine with printf-style formatting and “toString()” methods.

    I tried to explain in my other post that it's not just about a format
    (or a string-sequencing member function). But I'm sure one must be
    deeper into the topic, or have experienced (besides any supposed issues)
    the sophisticated possibilities that C++ offers, to appreciate how it
    supports good design.

    Java (as a newer language) also has some advantages, but was in many
    respects far behind C++ (IMO). ("Was" because I haven't followed its
    evolution lately.) - But that's all off-topic here anyway.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Sat Jan 27 00:38:20 2024
    On Sat, 27 Jan 2024 01:27:55 +0100, Janis Papanagnou wrote:

    On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:

    On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:

    Also the stream hierarchy offers design and implementation paths that
    you just don't have with printf().

    And that you don’t need, frankly.

    Don't be so fast with your judgment. Of course we use it to elegantly
    and scaleably solve tasks in C++.

    But not localization, which is an important issue. printf-style formatting
    allows rearrangement of parts of a message to suit grammar purposes;
    C++-style output operators do not.

    Java (as a newer language) has also some advantages, but was in many
    respects far behind C++ (IMO).

    It made many mistakes. The goal of trying to be simpler than C++ was I
    think a failure.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Janis Papanagnou on Sat Jan 27 00:42:55 2024
    On 27/01/2024 00:27, Janis Papanagnou wrote:

    Java ...C++ ..
    But that's anyway all off-topic here.

    JP finally realises that, after 100 posts of ramblings about anything but C.


    From 'bart cc32n.c' thread:

    JP:
    No, you are wrong, I'm not the owner of this piece of... code.

    If someone makes a big heap of fecal in a public park, would
    you think I'm the owner? I'd rather sue the one who did that;
    because the park (or Usenet) is common property, and the heap
    of fecal (or that code) is not.

    What a thoroughly unpleasant piece of work.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Janis Papanagnou on Fri Jan 26 20:10:17 2024
    On 1/26/24 12:31, Janis Papanagnou wrote:
    ...
    Everything you wrote below targets your last sentence
    "those side-effects could be executed in any order".
    For the examples we had, like (informally) cout<<a<<b<<c;
    this is undisputed for the SIDE EFFECTS of "a", etc. You
    had "hidden" those side effects in "one()"; I gave in an
    earlier post the more obvious example of c++ in the context
    of cout << c++ << c++ << c++ << endl; as a side effect

    There's an important reason why he used one(), two(), and three().
    "If a side effect on a memory location (6.7.1) is unsequenced relative
    to either another side effect on the same memory location or a value computation using the value of any object in the same memory location,
    and they are not potentially concurrent (6.9.2), the behavior is
    undefined." (C++ 6.9.1p10)

    "Two actions are potentially concurrent if
    (21.1)— they are performed by different threads, or
    (21.2)— they are unsequenced, at least one is performed by a signal
    handler, and they are not both performed
    by the same signal handler invocation." (C++ 6.9.2.1p21)

    So the exception for potentially concurrent side effects cannot apply to
    your version.

    All three of your c++ expressions have side effects on the same memory location, and all use the value stored in that location. Prior to the
    change which is being discussed, all three of those side effects would
    have been unsequenced from each other and from each other's value
    computations, so the behavior of such an expression would have been
    undefined. With this change, they are sequenced, and there is no longer
    a problem with such code.

    The functions one(), two() and three() that he used could have side
    effects that affect each other without sharing a memory location, for
    instance by writing to and reading from a file. Therefore, such code
    would have worked both before and after that change - but it might have
    given different results before the change.
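
    For comp.lang.c readers, a C analogue of the same pitfall may help;
    C has essentially the same rule about unsequenced side effects on the
    same object, and no counterpart of the C++17 change for <<. A minimal
    sketch:

        #include <stdio.h>

        int main(void)
        {
            int c = 0;

            /* Undefined behaviour: three unsequenced modifications of c
               within a single expression. */
            /* printf("%d %d %d\n", c++, c++, c++); */

            /* Well defined: each increment is its own full expression. */
            int x = c++;
            int y = c++;
            int z = c++;
            printf("%d %d %d\n", x, y, z);
            return 0;
        }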

    All side effects can be a problem (and should be avoided
    unless "necessary").

    Virtually everything useful that a computer program does qualifies as a
    side effect. Side effects cannot be avoided, they can only be controlled.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Janis Papanagnou on Fri Jan 26 20:43:27 2024
    On 1/26/24 19:12, Janis Papanagnou wrote:
    On 26.01.2024 20:18, David Brown wrote:
    ...
    (I don't like the habit of introducing personalized terms like
    "Malcolm functions"; this habit exposes more of the person who
    introduced it than anything else. And it anyway would only muddy
    the issue not clarify.)

    It is an unfortunate necessity when talking with Malcolm. He has his own idiosyncratic definitions for just about any technical term you care to
    name, which he generally believes to be universally accepted, when he
    appears to be the only person using those words with those definitions.

    ...
    A "C function" is different from a "Pascal function", a
    "lambda calculus function", a "Turing machine function", or any other
    kind of function definition you want to pick.

    What relevance has any technical difference of "C functions"
    and "Pascal functions"? - None.

    The fact that there are differences between the meanings of each of
    those terms is relevant to the fact that you need to be clear which of
    those terms you are using. Given that this is comp.lang.c, the one
    exception is "C function", which can be assumed whenever no other kind
    of function is specified.

    ...
    It's not really important for our discussions to consider Algol's
    ref, Pascal's var, C++'s const, or what else.

    You might think so, but it's not uncommon for those things to come up in discussion here. It's particularly common for critics of C to discuss
    their preferred alternative.

    ...
    Yes. But remember that our question was not a technical one; wasn't
    the question by the other poster (Malcolm?) about a mathematical
    function term and how it fits to determine what 'sizeof' actually is
    to be considered.

    As you'll find if you stay here long enough, every discussion involving
    Malcolm degenerates into confusion until someone realizes that he's
    using a Malcolm-definition for one or more of the relevant terms. It's
    not possible to make sense of his comments until you've extracted from
    him his idiosyncratic definitions for those terms.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Janis Papanagnou on Fri Jan 26 20:22:22 2024
    On 1/26/24 16:30, Janis Papanagnou wrote:
    ...
    We disagree here; it may not appear so to you but get_time() actually
    has a "side effect" (I put it in quotes, because it's literally no
    "effect" but for the argument of its _sequencing problem_ it's a
    relevant externality). It obtains (probably from a hardware device)
    the time when the call happened.

    "Reading an object designated by a volatile glvalue (7.2.1), modifying
    an object, calling a library I/O function, or calling a function that
    does any of those operations are all side effects, which are changes in
    the state of the execution environment." (C++ 6.9.1p7)

    The term "side effect" is in italics in that sentence, an ISO convention indicating that the sentence in which it appears constitutes the
    official definition of that term. Everything that the C++ standard says
    about side effects traces back to that definition.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Sat Jan 27 11:09:56 2024
    On 27.01.2024 01:38, Lawrence D'Oliveiro wrote:
    On Sat, 27 Jan 2024 01:27:55 +0100, Janis Papanagnou wrote:

    On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:

    On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:

    Also the stream hierarchy offers design and implementation paths that
    you just don't have with printf().

    And that you don’t need, frankly.

    Don't be so fast with your judgment. Of course we use it to elegantly
    and scaleably solve tasks in C++.

    But not localization, which is an important issue. printf-style formatting allows rearrangement of parts of a message to suit grammar purposes; C++-style output operators do not.

    I see where you're coming from. I myself have only cursory knowledge
    of and experience with localization, so I have no strong opinions
    on that and thus can't and don't want to go into the details.
    What I [think I] know is that simple word permutations don't help
    for general cases of localization, so printf would just work for
    some primitive special cases in applications. (But feel free to CMIIW.)
    Other languages don't operate on simple word arrangements but on whole
    sentences. In Kornshell, for example, you can predefine language
    strings (in "dictionaries") to be incorporated for displaying
    language-specific messages. This is a simple and effective mechanism,
    and I find it difficult to imagine how printf with its parameter
    permutation would produce flexible and clean localization; I'd expect
    a mess (but, to be honest, I haven't yet seen any convincing example).
    C++ also has localization support, but as said I have no experience
    with it. For more complex language manipulation you'd need more than
    word permutation anyway.

    I used to play Nethack intensively; it's a roguelike game known for
    creating well formulated (English) sentences. The source code has a
    lot of code to handle manipulation of language for various details
    of the grammar; plural forms, forms for cases, and whatnot. Someone
    wanted to migrate the game (with its grammar functions) to another
    language; it was very difficult and could not be achieved by simple
    mechanisms like word permutation and word substitution, IIRC.


    Java (as a newer language) has also some advantages, but was in many
    respects far behind C++ (IMO).

    It made many mistakes. The goal of trying to be simpler than C++ was I
    think a failure.

    Well, personally I prefer "simple" languages to complex ones. (Oh,
    well, yet I like C++.) Okay, by simple I mean coherently defined
    with clean concepts. Though I wouldn't call Java simpler than C++;
    for example the STL was much more sophisticated and orthogonally
    designed (thus in this respect simpler), whereas the Java libraries
    looked more like an ad hoc tool chest (as in Javascript or PHP).
    But when I saw later C++ evolutions (was it C++/2011?) I have to
    admit that at least now it has become an overly complex language.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Sat Jan 27 11:34:02 2024
    On 27.01.2024 04:05, Malcolm McLean wrote:
    [...] Personally I wanted just
    "function" and for it to be clear from context that here the term did
    not mean "subroutine".

    In my book, there's the "concept function" (mathematical), and the mapping/implementation onto/in a computer (a "calculation routine").
    The latter has just different names in different languages and it
    naturally has different technical details. In any form its purpose
    is to be an implemented instance of a formal mathematical concept.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Sat Jan 27 11:43:38 2024
    On 27.01.2024 04:24, Malcolm McLean wrote:

    It's hard to think of anything that can be passed to standard output
    other than integers, floating point values, and strings. So you only
    need three atomic operations.
    You can then build complex objects consisting of integers, floats and
    strings on top of those three basic operations. But the stream itself
    should be locked down and not open to derivation.

    I'm not sure where you're coming from here, what you mean by "locked
    down", and why it would be a goal worth reaching. I used various ostreams
    (output, strings, etc.) to advantage in designing flexible, usable
    algorithms without duplicating code or anything. (I don't know where
    your view comes from, but maybe it helps to take an existing library
    based on OO design, maybe even the STL (which has functional concepts as
    well), and inspect what can be done with stream (or other) hierarchies
    of types. - But maybe it's just as I've written upthread; if you have
    not experienced that yourself it's probably hard to understand where
    and what the advantages are.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to James Kuyper on Sat Jan 27 11:21:48 2024
    On 27.01.2024 02:43, James Kuyper wrote:
    [...]

    Thanks for your posts, explanations and background information.

    Yes, I know little about the persons and their posting history.
    I'm trying to stay on the topical level as long as possible - that
    doesn't always work as I intend, though, when it gets pathological -
    because as soon as things get personal the threads are usually
    tainted and not fruitful. - I observe that the nerves of posters here
    are often stressed.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Sat Jan 27 12:36:15 2024
    On 27.01.2024 12:02, Malcolm McLean wrote:
    On 27/01/2024 10:34, Janis Papanagnou wrote:
    On 27.01.2024 04:05, Malcolm McLean wrote:
    [...] Personally I wanted just
    "function" and for it to be clear from context that here the term did
    not mean "subroutine".

    In my book; there's the "concept function" (mathematical), and the
    mapping/implementation onto/in a computer (a "calculation routine").
    The latter has just different names in different languages and it
    naturally has different technical details. In any form its purpose
    is to be an implemented instance of a formal mathematical concept.

    Janis


    I don't really see how "Bleep" is any sort of mathematical function. But
    it is clearly a "subroutine".

    I don't know what that 'Bleep' is that you mention here, but I suppose
    it is meant to be some function that does not return a value but has an
    acoustic _side effect_. In an Algol representation something like the
    function

    bleep = (void) void: invoke_tone

    Or in Simula or Pascal the return-typeless function (= 'procedure')

    procedure bleep

    Or in C the function

    void bleep(void) { invoke_tone(); }

    In FORTRAN and BASIC (I don't remember exactly) such a function was maybe called a "subroutine"?

    Nomenclature changes wording but not its underlying character. Use the
    terms and models that fit best your goals. Meta-goals may be to clarify
    or to muddy the issue or its details.

    The term "subroutine" (for me) comprises two aspects; a hierarchical
    relation, and a [computation] process character.

    I'm unsure whether you wanted to express that functions with return
    type void are different from functions that return non-void types and
    should thus not be called functions? Others may say that "subroutine"
    is badly chosen since there's not necessarily a hierarchical semantic
    when using "procedures" in the context of Simula's classes with coroutines.
    Or are you saying that any "non-pure" function generally relies on
    side-effects and should be considered and named differently? - I'm fine
    with that. As long as it's clear what goals we follow, and whether the
    nomenclature is appropriate for that specific goal (and doesn't conflict
    with other goals).

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Sat Jan 27 12:53:49 2024
    On 27.01.2024 12:13, Malcolm McLean wrote:

    What I am saying is that standard output can take integers, floats and strings.

    Oh!? That doesn't match any output model I have in mind. (It only
    matches if I take the printf() functionality as the basis of everything.)

    Standard output can take, technically, anything. We're passing binary
    and "text" across the standard channels (and may leave open whether a
    UTF-8 encoded text is to be considered "binary" in some context).

    On one abstraction level you can say I want to consider only non-binary readable output, but then it's all text (int and float and bool are just
    output with one of their textual standard representations).

    So the stream should have some facilities for writing integers (leading
    zeros, signs, maybe comma separators for thousands), some for floats
    (rounding, precision, scientific notation etc), some for strings (not
    much you can do here other than just pass the raw characters).

    From an OO perspective we may also say the type Integer, the type Float,
    etc. shall provide means to create a textual standard representation,
    with options to control the form of that representation. Is the
    form of the standard representation a property of the data type or of a
    [not really] "generic" procedure that has hard-coded support for just
    a handful of predefined primitive types?


    Now when we've got those facilities and we are happy with them, that's
    it. We don't allow further derivation of the stream to change the basic
    behaviour. Now people might say "booleans, you've forgotten booleans,
    surely when you pass booleans it should print 'true' or 'false'". No.
    We'll handle that at a higher level and pass "true" and "false" as strings.

    The disadvantage is that you are locked into an integer/float/string paradigm. And it's not OO. But the advantage is that it will be stable.

    The OO methods are also stable, but they are also flexible. The concept
    makes it possible to extend it to any data type. (I've done that many
    times. And thinking about how that would have looked with non-OO
    and only printf() methods is a horror. But, as said, it's probably
    necessary to experience that yourself if it's not understandable from the
    explanations alone.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Kuyper on Sat Jan 27 16:44:28 2024
    On 27/01/2024 02:10, James Kuyper wrote:
    On 1/26/24 12:31, Janis Papanagnou wrote:

    All side effects can be a problem (and should be avoided
    unless "necessary").

    Virtually everything useful that a computer program does qualifies as a
    side effect. Side effects cannot be avoided, they can only be controlled.


    Try telling that to Haskell programmers :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Sat Jan 27 16:58:43 2024
    On 27/01/2024 01:38, Lawrence D'Oliveiro wrote:
    On Sat, 27 Jan 2024 01:27:55 +0100, Janis Papanagnou wrote:

    On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:

    On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:

    Also the stream hierarchy offers design and implementation paths that
    you just don't have with printf().

    And that you don’t need, frankly.

    Don't be so fast with your judgment. Of course we use it to elegantly
    and scaleably solve tasks in C++.

    But not localization, which is an important issue. printf-style formatting allows rearrangement of parts of a message to suit grammar purposes; C++-style output operators do not.


    Standard printf formatting also does not allow such re-arrangements.

    C++ has added "std::format" as a way of getting more flexible
    formatting, including re-arrangements of parts, with type safety and
    user extension support like iostreams. (I haven't used it myself, and a
    deeper discussion would be better down the hall in c.l.c++.)

    Generally, I think the different formatting systems have their
    advantages and disadvantages. I've yet to see a system in any language
    that could cover all needs.

    (My own key dislike about the C++ output streams is the mess of stateful
    "IO manipulators".)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Sat Jan 27 16:43:15 2024
    On 26/01/2024 22:30, Janis Papanagnou wrote:
    On 26.01.2024 19:59, David Brown wrote:

    I said - repeatedly - that the order of evaluation of the operands to
    most operators is unspecified in C and C++. [...]

    Yes, and this was undisputed.


    A typical example would be :

    cout << "Start time: " << get_time() << "\n"
    << "Running tests... " << run_tests() << "\n"
    << "End time: " << get_time();

    It was realistic - and indeed happened in some cases - for pre-C++17
    compilers to generate the second "get_time()" call before "run_tests()",
    and finally do the first "get_time()" call.

    Yes, we have no differences.

    And the sample is fine to show how we should NOT implement such time measurements (or similar logic)!


    There are, of course, many reasons why this is not a good way to
    implement time measurements - the order of evaluation is not the only one.

    A computer scientist or a sophisticated programmer would know that
    there are run-times associated with the parts of such expressions:

    cout << "S1" << f1() << "S2" << f2() << "S3" << f3();

    (with times t1, t2, ..., t9 associated with the successive parts)


    The experienced or knowledgable C++ programmer (prior to C++17) would
    know that the parts here are not necessarily executed in the order you
    give. (It's not clear to me if these are the run times for different
    parts, or time-stamps.) Indeed, depending on the kinds of
    subexpressions you have, not only can the order of evaluation be changed
    for most operators, but their evaluation can be interleaved. (I could
    go through the details, but it is probably better to look them up on a reference site such as en.cppreference.com.)

    and he would act accordingly and serialize the expression (see below).


    If the programmer wants them to be executed in a particular order,
    he/she must use constructs in the language to force that.

    Alternatively, the compiler
    could call "get_time()" twice, with "run_tests()" called either before
    or after that pair. In all these cases, the user will see an output
    that was not at all what they intended, with time appearing to go
    backwards or the test apparently taking no time.

    This was the case regardless of whether or not "get_time()" and
    "run_tests()" had any side-effects.

    We disagree here; it may not appear so to you but get_time() actually
    has a "side effect" (I put it in quotes, because it's literally no
    "effect" but for the argument of its _sequencing problem_ it's a
    relevant externality). It obtains (probably from a hardware device)
    the time when the call happened.

    We don't disagree - I haven't said what "get_time()" does or how it
    works, or what concept of "time" it has. I agree that getting some
    real-world time from a hardware device would be a side-effect - as would
    most practical ways to get a useful idea of "time". I was making a
    general point - the operands to the << operator could be (pre-C++17)
    evaluated in any order regardless of whether or not they had side-effects.


    That's why somewhat experienced programmers would not write the above
    code that way; something like "run_tests()" typically is (or can be)
    very time consuming, so they'd do
    t0 = get_time(); res = run_tests(); t1 = get_time();
    cout << ... etc.


    Of course.

    In practice, they could still be badly wrong even with that code -
    there's a lot of subtle points to consider when trying to time code, and
    my experience is that very few programmers get it entirely right.
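
    A C sketch of the serialized pattern quoted above, using clock() purely
    for illustration (run_tests() is a hypothetical function here, and as
    noted there are further subtleties, such as clock resolution and what
    exactly is being measured):

        #include <stdio.h>
        #include <time.h>

        extern int run_tests(void);   /* hypothetical test driver */

        void timed_run(void)
        {
            clock_t t0 = clock();
            int res = run_tests();
            clock_t t1 = clock();

            printf("tests returned %d after %.3f s of CPU time\n",
                   res, (double)(t1 - t0) / CLOCKS_PER_SEC);
        }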

    (Note: This argument does NOT imply that a language shouldn't be made as bulletproof as possible and sensible.)


    A language should be convenient to use and avoid surprising the
    programmer. But it should /not/ be a surprise to C or C++ programmers
    that the order of evaluation of subexpressions is usually unspecified.


    You are, quite obviously, guaranteed that in "cout << a << b << c", the
    output was in order a, b, c. But that is a totally different matter
    from the order of evaluation (and execution, for function calls) of the
    subexpressions a, b, and c.

    (It was meant as a "meta expression". I've addressed that in my
    response to Keith already; please see there.)


    I have said exactly what I intended to say in this thread, but I suspect
    you have mistaken what the term "order of evaluation" means, and
    therefore misunderstood what I wrote. I hope this is all clear to you now.

    The order of evaluation of the '<<' was what I spoke about. The order
    of the arguments had never been an issue. The "problem" with the order
    of the arguments becomes a problem (without quotes) when side effects
    of the arguments are inherent to the arguments.

    You had been focused on the evaluation of the arguments (where side
    effects might lead to unexpected behavior). I wasn't.


    I'm afraid I can't quite follow you here. I can just hope that you
    understand that evaluation order is unspecified for most operators, that
    real compilers evaluate subexpressions in different orders in real code
    (so the re-ordering is not hypothetical), and that C++17 added special
    rules for << and >> to make things more convenient for programmers. If
    I have helped you see this, or if Keith's post helped you see it, then
    that's great.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Sat Jan 27 17:07:42 2024
    On 27/01/2024 00:15, Malcolm McLean wrote:
    On 26/01/2024 19:18, David Brown wrote:

    I think we're fine sticking to "function" meaning "C function", which
    is well defined by the C standards, and using "mathematical function"
    for mathematical functions, which are also quite solidly defined.  Any
    other usage will need to be explained at the time.

    Basically I wanted "function" for C functions which are also
    mathematical functions, and "procedure" for C functions which do not
    meet the definition of mathematical functions. In context, of course.

    So basically, you want to use your own terms that are different from
    everyone else's.

    And since this is normal, accepted usage, I thought it would be accepted
    here.

    No, it is not "normal, accepted usage" anywhere but when you are talking
    to yourself. As I have pointed out in other posts, "function" can mean
    a vast number of different things.

    And it is not "normal, accepted usage" to join a discussion group with
    clear and established terminology, and attempt to use your own very
    different terminology. That is true even if the terminology and
    definitions you want to use are well established elsewhere - which in
    this case, they are not.


    Seriously, how hard would it be for you to accept the usage of
    "function" to mean "C function" in this group? How difficult would it
    be for you to try to speak the same language as the rest of us? Do you
    really expect everyone else to adapt to suit your personal choice of definitions? How often do you need to go round the same circles again
    and again, instead of trying to communicate with people in a sane manner?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Sat Jan 27 16:25:40 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 27.01.2024 00:51, Lawrence D'Oliveiro wrote:
    On Fri, 26 Jan 2024 15:41:43 -0800, Keith Thompson wrote:
    C++'s `cout << ...` has advantages and disadvantages.

    Interesting about Java, with all its needless complexity and futile
    attempts at simplification, that this was one decision it made correctly,
    and that was not to copy those operators.

    Chosing these operators is a separate issue.

    I've always found them ugly and inefficient myself, leaving aside
    the ability to overload the operator for "type safety", something
    that I've not found to be a compelling advantage.

    how is

    cout << std::hex << std::setw((bits + 3)/4) << value << std::endl;

    better than

    printf("%*x\n", (bits+3/4), value);

    Especially when the format string includes multiple values represented
    in different bases and they may need to be reordered at runtime based
    on e.g. the current locale?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Sat Jan 27 17:46:00 2024
    On 27/01/2024 01:12, Janis Papanagnou wrote:
    On 26.01.2024 20:18, David Brown wrote:
    On 26/01/2024 18:59, Janis Papanagnou wrote:
    On 26.01.2024 17:06, David Brown wrote:
    On 26/01/2024 13:17, Malcolm McLean wrote:



    (I don't like the habit of introducing personalized terms like
    "Malcolm functions"; this habit exposes more of the person who
    introduced it than anything else. And it anyway would only muddy
    the issue not clarify.)


    I agree. But I'd rather he talked about "Malcolm functions" than just
    wrote "functions" while meaning something completely different from what everyone else here means by the word.


    (I fear this thread will lead nowhere, but okay, I'll enter...)


    I'm trying to snip and skip stuff to reduce the size of the post here.

    A "C function" is different from a "Pascal function", a
    "lambda calculus function", a "Turing machine function", or any other
    kind of function definition you want to pick.

    What relevance has any technical difference of "C functions"
    and "Pascal functions"? - None.

    In Pascal, a "subroutine" (if we may use that as a generic term for now,
    even though that too can have many different meanings) that returns a
    value is a "function". If it does not return a value, it is a
    "procedure". Either may or may not have side-effects. Both are called "functions" in C.

    Thus there is a difference between "C functions" and "Pascal functions".
    In comp.lang.pascal, the unqualified term "function" would mean
    something different from what it means in comp.lang.c.

    The relevance to the discussion is that there are a vast number of
    meanings of the term "function", even within the realm of computer
    programming.


    Note: I don't want you to answer these questions. I suppose
    you might have some substantial CS background (I certainly do)
    and are not just spreading buzzwords.

    My university education was in mathematics and computation, so I do have
    a theoretical background. Since that time, decades ago, my work has
    been practical rather than theoretical.

    Neither the technical (implementation) differences of the first
    two types are relevant for the topics that have been discussed,
    nor the algorithm theory definitions of the latter two function
    types are relevant here.

    I agree that there are few practical differences between Pascal functions/procedures and C functions. But this is not a discussion
    about practical implementations of compiled imperative programming
    languages - it is a discussion about terms, and why it is important to
    agree on the meaning of the terms. The term "function" means different
    things in Pascal and C.





    From my references it seems a consensus at least in that it's
    reflecting a mathematical f: (x,y,...) -> (u,v,...) which is
    projected at (or implemented by) some routine/procedure/method/
    function, etc. - however it's called in any programming language.

    No, that is only one kind of function.

    That is an abstract representation from mathematics (and I am
    not interested in syntactic differences to other forms) that can
    be directly mapped to an algorithmic representation.

    It is more general in mathematics to consider it to map one set to
    another, but we sometimes use multiple parameters for convenience of
    notation. (It's rare to view the codomain as multiple parts.)


    We write (for example [borrowed from a book]):

    f: R x R x R -> R for the domains; R here: real numbers

    f(r,R,h) -> pi/3 x h x (r^2 + r x R + R^2)

    and in computer languages (for example) syntactic variants of:

    f = (real r, real R, real h) real :
    pi/3 * h * (r^2 + r * R + R^2)

    The function from the language closely resembles that from the
    mathematical domain.

    It resembles it in this case - though in important aspects they are very different. However, many real-world C functions don't match closely
    with a neat mathematical function (mathematical functions don't have side-effects), and many real-world mathematical functions don't match
    practical C functions. Picking one example where a reasonable match can
    be made does not mean the two kinds of "function" are similar.
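
    A small C illustration of that mismatch (the names are invented here
    just for the example): the first function is a reasonable image of a
    mathematical function, the second is not, since it modifies and
    depends on hidden state:

        static long counter;

        /* Same input always yields the same output, no side effects. */
        double square(double x)
        {
            return x * x;
        }

        /* Not a mathematical function of its (empty) argument list:
           the result changes from call to call, and calling it has a
           visible side effect on the program state. */
        long next_id(void)
        {
            return ++counter;
        }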



    There are all sorts of questions to ask.

    Yes, but not many (none?) of significance in our discussion context
    here.


    All show that the term "function" can mean a huge number of different
    things. There is no single "computer science" definition or "accepted
    common definition" - it's not even a close call.

    And so if we are going to use that word at all, we have to agree what it
    means. Since this is comp.lang.c, and since "functions" are central to
    C, we use the term "function" to mean "C function" unless otherwise
    /clearly/ stated.


    How should we get principled insights into 'sizeof', what it is,
    what it should be, etc., if we stay within this restricted C
    world terminology, and discuss even a very special type of
    a, umm.., function (sort of)?


    Sizeof is not a C function.

    I know it's an operator in C. And I also wasn't saying that it's a
    C function. - You still see the "(sort of)" in my statement. And we
    already spoke about the close (but not exact) equivalences between
    functions and operators.


    We certainly won't learn anything about "sizeof" by calling it a
    "function" or a "mathematical function".

    It is a C operator. If you don't know what
    it is or how it works, or want the technical details, it's all in
    6.5.3.4 of the C standards.

    If that's all the OP wanted to discuss it would be easy. You don't
    even need any C standard document. Open any book, even the old K&R
    is sufficient, and look up 'sizeof'. You can read about it being an
    operator and fine. File closed. Goodbye. (What for was the original
    question of this thread? I seem to recall something about the form
    with parenthesis and type?)

    I have long since forgotten what the question was - we have certainly
    wandered far from the start of the thread!



    Trying to describe "sizeof" as a function of some sort with a different
    kind of use of the word "function" really doesn't get us anywhere, as
    shown in this thread. It is what it is - trying to mush it into another
    term is not helpful.

    What would be the difference if the parenthesized form would be
    called a function, given that functions and operators are similar,
    and the context so restricted?

    The difference is, it would not be a C function. "sizeof" operates at
    compile time, and it operates on types - either an explicit type, or the
    type of the expression. It has no run-time behaviour (excluding the
    somewhat poorly described VLA behaviour). It does not evaluate its
    operand. It has no prototype or declaration. It cannot be implemented
    by the user in a free-standing implementation. You cannot take its
    address. It has no linkage. You cannot use its name as an identifier.
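
    A tiny illustration of the "does not evaluate its operand" point
    (assuming no variable length arrays are involved):

        #include <stdio.h>

        int main(void)
        {
            int i = 0;
            size_t s = sizeof (i++);   /* the operand is not evaluated: i stays 0 */
            printf("i = %d, s = %zu\n", i, s);
            return 0;
        }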


    I don't think you can get an address
    of it (or can we?); but that again is just another implementation
    details (C specific).

    The need for parenthesis in sizeof(type) seems anyway to be only a
    hack, necessary for type expressions with blanks, sizeof(struct x) ?

    Janis

    BTW: There was another subthread about preprocessor use for NELEM determination using sizeof. When I looked up the K&R reference I
    saw its use described even as a standard pattern to determine the
    number of array elements. No wonder it became idiomatic.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Sat Jan 27 17:26:24 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 27/01/2024 01:38, Lawrence D'Oliveiro wrote:
    On Sat, 27 Jan 2024 01:27:55 +0100, Janis Papanagnou wrote:

    On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:

    On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:

    Also the stream hierarchy offers design and implementation paths that you just don't have with printf().

    And that you don’t need, frankly.

    Don't be so fast with your judgment. Of course we use it to elegantly
    and scaleably solve tasks in C++.

    But not localization, which is an important issue. printf-style formatting
    allows rearrangement of parts of a message to suit grammar purposes;
    C++-style output operators do not.


    Standard printf formatting also does not allow such re-arrangements.


    Depends on what standard you use. POSIX certainly does.
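
    For example, the POSIX numbered-argument conversions (an extension,
    not part of ISO C) let a translated format string pick up the
    arguments in a different order:

        #include <stdio.h>

        int main(void)
        {
            /* Assuming a POSIX printf: %1$ and %2$ select arguments by position. */
            printf("%1$s has %2$d new messages\n", "Janis", 3);
            printf("Sie haben %2$d neue Nachrichten, %1$s\n", "Janis", 3);
            return 0;
        }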



    (My own key dislike about the C++ output streams is the mess of stateful
    "IO manipulators".)


    Hear! Hear!

    The run-time cost of all those stateful manipulators isn't free, either.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Sat Jan 27 17:24:27 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:
    On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:

    Also the stream hierarchy offers
    design and implementation paths that you just don't have with printf().

    And that you don’t need, frankly.

    Don't be so fast with your judgment. Of course we use it to elegantly
    and scaleably solve tasks in C++.

    Java manages just fine with printf-style formatting and “toString()” methods.

    I tried to explain in my other post that it's not just about a format
    (or a string-sequencing member function). But I'm sure one must be
    deeper in the topic or have experienced (besides any supposed issues)
    the sophisticated possibilities that C++ offers to support good design.

    As someone who has programmed daily in C++ since 1989, usually
    in performance sensitive code, I've never found the C++ input
    and output operators useful. The run-time cost both in space
    and time is far more than the *printf formatting functions,
    and they're less flexible when the formatting changes based,
    e.g., on locale.


    Java (as a newer language) has also some advantages, but was in many
    respects far behind C++ (IMO).

    Wow, that's a strong statement. What led you to hold that opinion?

    Java, as a language, was rather well designed. The run-time costs,
    however, precluded the use of Java in most of the projects that I've
    worked on since Java was introduced.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Scott Lurndal on Sat Jan 27 18:53:05 2024
    On 27/01/2024 18:26, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 27/01/2024 01:38, Lawrence D'Oliveiro wrote:
    On Sat, 27 Jan 2024 01:27:55 +0100, Janis Papanagnou wrote:

    On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:

    On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:

    Also the stream hierarchy offers design and implementation paths that you just don't have with printf().

    And that you don’t need, frankly.

    Don't be so fast with your judgment. Of course we use it to elegantly
    and scaleably solve tasks in C++.

    But not localization, which is an important issue. printf-style formatting allows rearrangement of parts of a message to suit grammar purposes; C++-style output operators do not.


    Standard printf formatting also does not allow such re-arrangements.


    Depends on what standard you use. POSIX certainly does.

    Sure. But not all of the world is POSIX. (Okay, a lot of it is, and
    it's fine to rely on POSIX features if you know that's appropriate.)




    (My own key dislike about the C++ output streams is the mess of stateful
    "IO manipulators".)


    Hear! Hear!

    The run-time cost of all those stateful manipulators isn't free, either.

    For my own use, I've sometimes used classes letting you do :

    debug_log << "X = " << x << " = 0x" << hex(x, 8) << "\n";

    "hex(x, 8)" returns a value of a class holding "x" and the number of
    digits 8, and then there is an overload for the << operator on this
    class. No extra state needs to be stored in the logging class, I can
    make as many of these formatters as I like, and the intermediary classes
    all disappear in the optimisation.

    It just seems so much cleaner and more efficient than the way C++
    streams are done.

    (It doesn't allow re-arranging the parts in the format, nor does it
    solve the "1 thingy / 2 thingies" issue, but it's good enough for my needs.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Sat Jan 27 19:39:31 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 27/01/2024 18:26, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:


    (My own key dislike about the C++ output streams is the mess of stateful "IO manipulators".)


    Hear! Hear!

    The run-time cost of all those stateful manipulators isn't free, either.

    For my own use, I've sometimes used classes letting you do :

    debug_log << "X = " << x << " = 0x" << hex(x, 8) << "\n";



    example:
    void
    c_processor::dump_rle(c_logger *lp, ulong task)
    {
        mem_addr_t rle = c_system::self()->get_rlist_base()
                       + (task * RLIST_ENTRY_SIZE);

        lp->log("Reinstate List Entry at %9.9llu\n", rle);
        lp->log("Task #%4.4llu ET %9.9llu #ET: %5.5llu"
                " Prio: %2.2llu Op Claim: %4.4llu\n",
                getdigits(rle+RLIST_TASKNUM, RLIST_TASKNUM_LEN),
                getdigits(rle+RLIST_ENV_TBL_ADDR, RLIST_ENV_TBL_ADDR_LEN),
                getdigits(rle+RLIST_NUM_MAT, RLIST_NUM_MAT_LEN),
                getdigits(rle+RLIST_TASKPRIO, RLIST_TASKPRIO_LEN),
                getdigits(rle+RLIST_OPCLAIM, RLIST_OPCLAIM_LEN));
        lp->log("MCP Lock# %4.4llu USER Lock# %4.4llu "
                "Task Owning %4.4llu Next Task %4.4llu\n",
                getdigits(rle+RLIST_MCPLOCKNUM, RLIST_MCPLOCKNUM_LEN),
                getdigits(rle+RLIST_USERLOCKNUM, RLIST_USERLOCKNUM_LEN),
                getdigits(rle+RLIST_TASKNUMOWN, RLIST_TASKNUMOWN_LEN),
                getdigits(rle+RLIST_NEXTTASK, RLIST_NEXTTASK_LEN));
        lp->log("Time Slice Remaining: %8.8llu New Time Slice: %8.8llu\n",
                getdigits(rle+RLIST_TSR, RLIST_TSR_LEN),
                getdigits(rle+RLIST_NTS, RLIST_NTS_LEN));
        lp->log("Wait field: %6.6llx\n",
                gethex(rle+RLIST_WAIT_FIELD, RLIST_WAIT_FIELD_LEN));
        lp->log(" IX4 %8.8llx IX5 %8.8llx IX6 %8.8llx IX7 %8.8llx\n",
                gethex(rle+RLIST_MOBIX, 8),
                gethex(rle+RLIST_MOBIX+8, 8),
                gethex(rle+RLIST_MOBIX+16, 8),
                gethex(rle+RLIST_MOBIX+24, 8));

        c_environment env(this, rle+RLIST_ACTIVE_ENV);
        char buf[10];
        lp->log(" %s:%6.6llu IntMask=%2.2llx\n",
                env.print(buf, sizeof(buf)),
                getdigits(rle+RLIST_IP, RLIST_IP_LEN),
                getdigits(rle+RLIST_IMASK, RLIST_IMASK_LEN));
    }

    I'd really hate to have to write and support a C++ output-stream
    version of that using the << operator overloads.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Sat Jan 27 20:59:07 2024
    On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:

    David Brown <david.brown@hesbynett.no> writes:

    Standard printf formatting also does not allow such re-arrangements.

    Depends on what standard you use. POSIX certainly does.

    C and POSIX go together like a horse and carriage; one without the other
    is a lot less useful.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Sat Jan 27 21:06:52 2024
    On Sat, 27 Jan 2024 11:13:16 +0000, Malcolm McLean wrote:

    What I am saying is that standard output can take integers, floats and strings.

    You forgot booleans. Also enumerations can be useful.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Harnden@21:1/5 to Malcolm McLean on Sat Jan 27 21:40:31 2024
    On 27/01/2024 20:17, Malcolm McLean wrote:
    You can of course encode any data format as any other as long as you can write enough. But standard output can't take images or audio, for example.

    What?

    stdout is perfectly happy with anything, eg:

    $ cat a.out | xxd | head -1
    00000000: cffa edfe 0c00 0001 0000 0000 0200 0000 ................

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Richard Harnden on Sun Jan 28 00:31:10 2024
    On Sat, 27 Jan 2024 21:40:31 +0000, Richard Harnden wrote:

    $ cat a.out | xxd | head -1
    00000000: cffa edfe 0c00 0001 0000 0000 0200 0000 ................

    UUOC!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Sun Jan 28 01:22:28 2024
    On Sun, 28 Jan 2024 00:35:30 +0000, Malcolm McLean wrote:

    On 27/01/2024 21:06, Lawrence D'Oliveiro wrote:

    On Sat, 27 Jan 2024 11:13:16 +0000, Malcolm McLean wrote:

    What I am saying is that standard output can take integers, floats and
    strings.

    You forgot booleans. Also enumerations can be useful.

    Yes, and we could say fixed point, complex, etc.

    Booleans and enumerations (enumerations can be considered a generalization
    of booleans) are ones that could usefully be displayed in symbolic form.
    Python example:

    for i in (1, 2, 3) :
        print("%d = %d? %s" % (i, 2, i == 2))
    #end for

    produces output:

    1 = 2? False
    2 = 2? True
    3 = 2? False

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Sun Jan 28 01:50:30 2024
    On Sat, 27 Jan 2024 17:34:00 -0800, Keith Thompson wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    C and POSIX go together like a horse and carriage; one without the
    other is a lot less useful.

    Which is why horseless carriages never caught on.

    No, they never did. Until something was invented ... what was it ...
    something to do with “horsepower” ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to David Brown on Sun Jan 28 01:31:28 2024
    On 1/27/24 10:44, David Brown wrote:
    On 27/01/2024 02:10, James Kuyper wrote:
    On 1/26/24 12:31, Janis Papanagnou wrote:

    All side effects can be a problem (and should be avoided
    unless "necessary").

    Virtually everything useful that a computer program does qualifies as a
    side effect. Side effects cannot be avoided, they can only be controlled.


    Try telling that to Haskell programmers :-)

    I was talking very specifically in reference to C's definition of "side-effect". I'm not particularly familiar with Haskell - does it have
    a different definition of "side effect", or does it somehow get
    something useful done without qualifying under C's definition? If so, how?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Sun Jan 28 12:50:14 2024
    On 28/01/2024 02:31, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    For my own use, I've sometimes used classes letting you do :

    debug_log << "X = " << x << " = 0x" << hex(x, 8) << "\n";

    "hex(x, 8)" returns a value of a class holding "x" and the number of
    digits 8, and then there is an overload for the << operator on this
    class. No extra state needs to be stored in the logging class, I can
    make as many of these formatters as I like, and the intermediary
    classes all disappear in the optimisation.

    Or hex() could just return a std::string.


    Yes. That can make the coding a lot simpler, and very flexible. But it
    comes at a cost - in my type of work, I don't want any dynamic memory
    unless it is absolutely unavoidable, and I want results to be as
    efficient as practically possible. On the other hand, I don't need the generality that you would have in a larger and more general purpose
    framework. That all means tighter connections between the "log" class
    and the functions handling these outputs.

    (One possibility that is not bad is to have support for handling
    fixed-length strings, rather than std::string. Real implementations of
    this kind of thing are more complicated, with templates and all, and
    well outside of c.l.c.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Kuyper on Sun Jan 28 12:40:59 2024
    On 28/01/2024 07:31, James Kuyper wrote:
    On 1/27/24 10:44, David Brown wrote:
    On 27/01/2024 02:10, James Kuyper wrote:
    On 1/26/24 12:31, Janis Papanagnou wrote:

    All side effects can be a problem (and should be avoided
    unless "necessary").

    Virtually everything useful that a computer program does qualifies as a
    side effect. Side effects cannot be avoided, they can only be controlled.

    Try telling that to Haskell programmers :-)

    I was talking very specifically in reference to C's definition of "side-effect". I'm not particularly familiar with Haskell - does it have
    a different definition of "side effect", or does it somehow get
    something useful done without qualifying under C's definition? If so, how?


    Pure functional programming does not have side-effects, at least not in
    the way we are familiar with in C. There are various techniques used by different functional programming languages to do things like IO, but
    that's all way off-topic (and beyond my knowledge and understanding).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Sun Jan 28 12:53:36 2024
    On 27/01/2024 21:59, Lawrence D'Oliveiro wrote:
    On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:

    David Brown <david.brown@hesbynett.no> writes:

    Standard printf formatting also does not allow such re-arrangements.

    Depends on what standard you use. POSIX certainly does.

    C and POSIX go together like a horse and carriage; one without the other
    is a lot less useful.

    To the nearest percent, 0% of all systems running C programs support
    POSIX (or Windows, or any other "big" system). The world of small
    embedded systems totally outweighs "big" systems by many orders of
    magnitude. And perhaps 80% of such small systems are programmed in C.

    It's fine to use POSIX functions when your target is POSIX. But the
    target for C programmers is not always POSIX.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Sun Jan 28 13:02:24 2024
    On 28/01/2024 02:34, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    Seriously, how hard would it be for you to accept the usage of
    "function" to mean "C function" in this group? How difficult would it
    be for you to try to speak the same language as the rest of us? Do
    you really expect everyone else to adapt to suit your personal choice
    of definitions? How often do you need to go round the same circles
    again and again, instead of trying to communicate with people in a
    sane manner?

    You don't really think difficulty is the issue, do you?


    No, it was just a figure of speech.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Sun Jan 28 13:00:06 2024
    On 28/01/2024 02:59, Malcolm McLean wrote:
    On 28/01/2024 01:26, Keith Thompson wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 27/01/2024 21:06, Lawrence D'Oliveiro wrote:
    On Sat, 27 Jan 2024 11:13:16 +0000, Malcolm McLean wrote:

    What I am saying is that standard output can take integers, floats and strings.
    You forgot booleans. Also enumerations can be useful.

    Yes, and we could say fixed point, complex, etc.
    It's not inherently a bad idea to extend our little stdout interface
    to include booleans. But in fact there are too many output formats you
    might need.
    Fixed point - in C or C++ there's no standard for that, so now you are
    going the OO route. As you would with enumerations as the symbol
    doesn't exist at runtime.
    It's not that there is no case to be made for the OO approach. What I
    am saying is that in practice the locked down restricted interface
    will work better.

    I think you mean it will work Malcolm-better.

    Apparently inflexibility and vulnerability to type errors are
    Malcolm-better than the alternative.

    Exactly.
    Inflexibility can be better. Because in reality most programs work with a restricted set of data types which it makes sense to pass to a text
    stream, and so you only need three atomic types.

    Type errors are of course a nuisance with printf(). But that's because
    of the quirks of C, not because it takes a restricted set of types, and
    you can write a different restricted interface without this problem.

    The fact is that printf(), which works basically as I recommend, is
    widely used as the interface to standard output, and often OO
    alternatives are available and not used for various reasons.

    So the world is in fact "Malcolm better".


    You mean the Malcolm-world is Malcolm-better with these restrictions,
    because in the Malcolm-world the only programming tasks that are done
    are Malcolm-tasks, and the programmers are all Malcolm-programmers.

    At least that's all cleared up nicely, and the rest of the world can go
    back to using more than three types, and generating outputs that are not
    just ASCII text.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Sun Jan 28 18:02:01 2024
    On 28.01.2024 13:00, David Brown wrote:
    On 28/01/2024 02:59, Malcolm McLean wrote:
    [...]

    You mean the Malcolm-world is Malcolm-better with these restrictions,
    because in the Malcolm-world the only programming tasks that are done
    are Malcolm-tasks, and the programmers are all Malcolm-programmers.

    Meanwhile I understand why Malcolm'isms have been introduced here
    (as had been explained to me in another post). :-)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Sun Jan 28 18:06:36 2024
    On 28.01.2024 17:09, Malcolm McLean wrote:
    On 28/01/2024 12:00, David Brown wrote:
    [...]
    You put non-ASCII text on stdout?

    Of course. - Hadn't that already been explained before?

    I mean, obviously in a program for international use itself. But in
    a routine program for general use?

    The meaning of "general use" is typically to be a sort of general one,
    not one that is (artificially) restricted ("ASCII" vs. "non-ASCII").

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Scott Lurndal on Sun Jan 28 18:46:41 2024
    On 27.01.2024 17:25, Scott Lurndal wrote:

    how is

    cout << std::hex << std::setw((bits + 3)/4) << value << std::eol;

    You forgot the first "std::" (if you wanted to make it appear complex)

    std::cout << std::hex << std::setw((bits + 3)/4) << value << std::endl;

    but I prefer for readability anyway the simpler form (implying 'using')

    cout << hex << setw((bits + 3)/4) << value << endl;


    better than

    printf("%*x\n", (bits+3/4), value);

    It's an extensible and less error prone framework in C++ as opposed
    to a restricted and error prone feature in the C base. (I recently
    gave an application example with some typical advantages visible in
    another post here.)

    But personally I had never been a fan of the stream manipulators;
    I always had to look them up in the documentation (when needed).
    Luckily they were rarely necessary in my contexts, so it didn't
    bother me much. (OTOH, I also have to look up the %x.y modifiers in
    FP output format, that I also rarely use. So not much difference.)

    The question of flexibility of the OO features (compared e.g. to the
    restricted C printf() features) had always been of larger relevance.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Sun Jan 28 18:22:30 2024
    On 27.01.2024 21:17, Malcolm McLean wrote:

    Standard output is any sequence of ASCII characters.

    Nonsense.

    printf() is the
    main C interface to that, and supports integers, floats and strings, to
    a first approximation.

    It's not an approximation; printf() is _restricted_ to these types (and
    a few more variants of these few basic types, to be correct).

    In an OO context you would not unnecessarily restrict yourself. (Unless
    you don't know better.)

    You can of course encode any data format as any other as long as you can write enough. But standard output can't take images or audio, for example.

    Standard output is an I/O channel; I can send to it non-text data like
    the ones you mention. (Just imagine you couldn't send UTF-8 text data.)

    The OO method is to allow the stream to be extended. So, in one common system, we might have a "decimal" stream which takes floats and outputs in the format 123.456. Then we could derive a different type of stream from
    that which outputs floats as 1.23456e2. [...]

    Nonsense. - You seem to have never really learned or understood OO or
    what streams in C++ actually are.

    Janis

    [...]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Sun Jan 28 19:24:37 2024
    On 28/01/2024 17:09, Malcolm McLean wrote:
    On 28/01/2024 12:00, David Brown wrote:
    On 28/01/2024 02:59, Malcolm McLean wrote:
    On 28/01/2024 01:26, Keith Thompson wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 27/01/2024 21:06, Lawrence D'Oliveiro wrote:
    On Sat, 27 Jan 2024 11:13:16 +0000, Malcolm McLean wrote:

    What I am saying is that standard output can take integers,
    floats and
    strings.
    You forgot booleans. Also enumerations can be useful.

    Yes, and we could say fixed point, complex, etc.
    It's not inherently a bad idea to extend our little stdout interface to include booleans. But in fact there are too many output formats you might need.
    Fixed point - in C or C++ there's no standard for that, so now you are going the OO route. As you would with enumerations as the symbol
    doesn't exist at runtime.
    It's not that there is no case to be made for the OO approach. What I am saying is that in practice the locked down restricted interface
    will work better.

    I think you mean it will work Malcolm-better.

    Apparently inflexibility and vulnerability to type errors are
    Malcolm-better than the alternative.

    Exactly.
    Inflexibility can be better. Because in reality most program work
    with a restricted set of data types which it makes sense to pass to a
    text stream, and so you only need three atomic types.

    Tpye errors are of course a nuisance with printf(). But that's
    because of the quirks of C, not because it takes a restricted set of
    types, and you can write a different restricted interface without
    this problem.

    The fact is that printf(), which works basically as I recommend, is
    widely used as the interface to standard output, and often OO
    alternatives are available and not used for various reasons.

    So the world is in fact "Malcolm better".


    You mean the Malcolm-world is Malcolm-better with these restrictions,
    because in the Malcolm-world the only programming tasks that are done
    are Malcolm-tasks, and the programmers are all Malcolm-programmers.

    At least that's all cleared up nicely, and the rest of the world can
    go back to using more than three types, and generating outputs that
    are not just ASCII text.

    You put non-ASCII text on stdout?
    I mean, obviously in a program for international use itself. But in
    routine program for general use?


    I commonly write out in UTF-8 - it does not have to be "international".
    (I assume that by "international" you, as a good Brit, mean "not UK".
    After all, a program written solely for use in Norwegian is not
    international.)

    Sometimes I will have binary data of some kind on the standard output.
    It's a lot less common, but it happens. A common example would be code
    for generating images or other files for a webserver.
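
    As a minimal sketch of that kind of use (the image format, sizes and
    values here are chosen purely for illustration, not taken from any real
    project), a tiny program generating a picture on stdout might look like
    this:

    #include <stdio.h>

    /* Write an 8x8 greyscale image in binary PGM ("P5") format to stdout.
       On POSIX systems stdout passes the bytes through unchanged; on other
       systems it may first have to be reopened in binary mode. */
    int main(void)
    {
        const int w = 8, h = 8;
        printf("P5\n%d %d\n255\n", w, h);       /* text header */
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                putchar((x ^ y) * 36);          /* raw pixel bytes, 0..252 */
        return 0;
    }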

    Most of my "real" programs, rather than small utilities, are for
    embedded systems where the concept of "standard output" is not really
    the same as for PC's.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Sun Jan 28 20:14:24 2024
    On 27.01.2024 16:43, David Brown wrote:
    On 26/01/2024 22:30, Janis Papanagnou wrote:
    On 26.01.2024 19:59, David Brown wrote:

    A computer scientist or a sophisticated programmer would know that
    there are run-times associated in such expressions:

    cout << "S1" << f1() << "S2" << f2() << "S3" << f3();

    t1 t2 t3 t4 t5 t6 t7 t8 t9


    The experienced or knowledgable C++ programmer (prior to C++17) would
    know that the parts here are not necessarily executed in the order you
    give.

    And that was not intended in that representation. It was to show
    where "time factors" are "hidden". (And if I had the intention I
    could also have chosen 'dt1' instead of the inaccurate 't1' to
    indicate that.) Sorry if that confuses you. I hoped that together
    with the text it would be more informative (than confusing).
    If one sees the time demands, and has read about our consensus
    about evaluation order above (it was repeatedly stated!) there
    should be no misunderstanding (or so I thought at least).

    [...]

    That's why somewhat experienced programmers would not write above
    code that way; something like "run_tests()" is (typically) or can be
    very time consuming, so they'd do
    t0 = get_time(); res = run_tests(); t1 = get_time();
    cout << ... etc.

    Of course.

    You can serialize (as I suggested previously as one example) or
    embed functions like take_time(run_tests()) as another example.


    In practice, they could still be badly wrong even with that code -
    there's a lot of subtle points to consider when trying to time code, and
    my experience is that very few programmers get it entirely right.

    Really? - I mostly had to do with folks, even newbies with a
    proper CS education, who had enough experience or knowledge.
    Most problems appeared in contexts where the used languages
    have inherent design issues; not in any case we could avoid
    use of such languages in the first place.

    Janis

    [...]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Sun Jan 28 20:16:12 2024
    On 27.01.2024 17:46, David Brown wrote:
    [...]

    FYI: Too long to read at the moment. (Maybe later, maybe not.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Sun Jan 28 20:43:00 2024
    On Sun, 28 Jan 2024 12:53:36 +0100, David Brown wrote:

    On 27/01/2024 21:59, Lawrence D'Oliveiro wrote:

    On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:

    David Brown <david.brown@hesbynett.no> writes:

    Standard printf formatting also does not allow such re-arrangements.

    Depends on what standard you use. POSIX certainly does.

    C and POSIX go together like a horse and carriage; one without the
    other is a lot less useful.

    To the nearest percent, 0% of all systems running C programs support
    POSIX (or Windows, or any other "big" system). The world of small
    embedded systems totally outweigh "big" systems by many orders of
    magnitude. And perhaps 80% of such small systems are programmed in C.

    And a lot of those “embedded” systems are running Android.

    Android ships as many units per year as the entire installed base of
    Microsoft Windows.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Mon Jan 29 00:03:08 2024
    On Sun, 28 Jan 2024 14:49:53 -0800, Keith Thompson wrote:

    A lot (for some interpretations of "a lot") of embedded systems run
    Android. Those aren't the one David was talking about.

    They have a POSIX-type C runtime. Which does support “%«n»$” for reordering args to the printf routines.

    The point being the prevalence of POSIX is a little larger than you give
    it credit for.
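
    For what it's worth, a minimal sketch of what that reordering looks like
    (the format strings below are invented for the example, and the numbered
    conversions are a POSIX extension to printf(), not plain ISO C):

    #include <stdio.h>

    int main(void)
    {
        const char *name = "Kari";
        int count = 3;

        /* Same argument list, different word order per "translation". */
        const char *fmt_en = "%1$d new messages for %2$s\n";
        const char *fmt_no = "%2$s har %1$d nye meldinger\n";

        printf(fmt_en, count, name);
        printf(fmt_no, count, name);
        return 0;
    }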

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Lawrence D'Oliveiro on Mon Jan 29 01:24:06 2024
    On 29/01/2024 00:03, Lawrence D'Oliveiro wrote:
    On Sun, 28 Jan 2024 14:49:53 -0800, Keith Thompson wrote:

    A lot (for some interpretations of "a lot") of embedded systems run
    Android. Those aren't the one David was talking about.

    They have a POSIX-type C runtime. Which does support “%«n»$” for reordering args to the printf routines.

    The point being the prevalence of POSIX is a little larger than you give
    it credit for.

    As far as I can see, the way it works is to have a separate format
    string for each language target. The format string will contain the bulk
    of the message, together with any variably-placed arguments.

    If those arguments are themselves text, they may also need different
    versions per target.

    This provides only the crudest form of aid; every language will have its
    own exceptions.

    A further problem is having type info encoded within each format string,
    so that now magnifies the problem of maintenance, especially if those
    strings reside in external data files. If the format argument is a
    variable, that also limits the ability to detect format errors.

    I can't see that it's of great benefit. Internationalisation requires
    some extra effort anyway beyond what a language provides, especially a low-level one.

    Can you provide one example of how it will help with just two languages
    with differing word order?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Mon Jan 29 02:17:38 2024
    On Sun, 28 Jan 2024 17:48:53 -0800, Keith Thompson wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    The point being the prevalence of POSIX is a little larger than you
    give it credit for.

    Again, David wasn't talking about Android systems.

    No, I was, as an example of the sort of POSIX system he thought was too minuscule to worry about.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Mon Jan 29 14:09:51 2024
    On 28/01/2024 20:14, Janis Papanagnou wrote:
    On 27.01.2024 16:43, David Brown wrote:
    On 26/01/2024 22:30, Janis Papanagnou wrote:


    That's why somewhat experienced programmers would not write above
    code that way; something like "run_tests()" is (typically) or can be
    very time consuming, so they'd do
    t0 = get_time(); res = run_tests(); t1 = get_time();
    cout << ... etc.

    Of course.

    You can serialize (as I suggested previously as one example) or
    embed functions like take_time(run_tests()) as another example.


    In practice, they could still be badly wrong even with that code -
    there's a lot of subtle points to consider when trying to time code, and
    my experience is that very few programmers get it entirely right.

    Really? - I mostly had to do with folks, even newbies with a
    proper CS education, who had enough experience or knowledge.
    Most problems appeared in contexts where the used languages
    have inherent design issues; not in any case we could avoid
    use of such languages in the first place.


    Let's suppose you have a "get_time()" function that gets a time stamp
    from somewhere, and that it correctly uses a volatile access from
    hardware (or some OS-controlled time function that is volatile
    somewhere). And suppose you are trying to time a function to test the
    speed of recursion on your system :

    unsigned int factorial(unsigned int x) {
    if (x == 0) return 1;
    return x * factorial(x - 1);
    }

    What happens when you write this? :

    unsigned int x = 10;
    double start = get_time();
    unsigned int y = factorial(x);
    double end = get_time();

    printf("Time is %f seconds\n", end - start);


    It looks reasonable enough, and because "get_time()" has observable
    behaviour (a volatile access), it must be correct, right? It shows a
    small but non-zero time, as expected.

    But what you have actually measured is the overhead in the get_time()
    function, because the compiler has removed the call to factorial because
    the answer is not needed.

    So you try :


    unsigned int x = 10;
    double pre_start = get_time();
    double start = get_time();
    unsigned int y = factorial(x);
    double end = get_time();

    double overhead = start - pre_start;
    printf("Factorial %ui is %ui\n", x, y);
    printf("Time is %f seconds\n", end - start - overhead);


    to compensate for the overhead in the timing, and to force the compiler
    to run factorial(10) because you observe its output. Now you get the
    right answer for y, and the time is 0, because the compiler has
    pre-calculated factorial x and substituted 3628800 for y.

    So you take "x" from argv, so that the compiler can't pre-calculate the
    result. And now the time is 0, because the compiler has re-arranged the
    code as though it was :


    unsigned int x = atoi(argv[1]);
    double pre_start = get_time();
    double start = get_time();
    double end = get_time();

    double overhead = start - pre_start;

    unsigned int y = factorial(x);
    printf("Factorial %ui is %ui\n", x, y);
    printf("Time is %f seconds\n", end - start - overhead);


    Making x and y both volatile gives you a time for the call to
    factorial(x). It is still not measuring any recursion speed, because
    the compiler has turned that function into a loop.

    And that is before we start asking if you are measuring the speed of the
    code, or of the memory and cache system on your PC.


    I regularly see people failing to time or benchmark functions as they
    expect - they don't understand how the compiler can re-arrange or
    optimise things. A recurring problem is the belief that volatile
    accesses force an order on other memory accesses, or on calculations,
    which is not correct.

    Then they decide to disable optimisation because "the optimiser messes
    with my timing code", getting a result as useful as measuring the speed
    of a race car stuck in first gear.
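
    For the record, one sketch of a way around the dead-code and
    constant-folding traps described above (this relies on a GCC/Clang
    inline-asm idiom and a POSIX clock, both assumptions on my part; it does
    nothing about the recursion being turned into a loop, or about cache
    effects):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Empty asm that the optimiser must assume reads and writes *p.
       This keeps the input from being constant-folded and the result
       from being discarded, without disabling optimisation globally. */
    static void keep(unsigned int *p)
    {
        __asm__ volatile("" : "+m"(*p) : : "memory");
    }

    static unsigned int factorial(unsigned int x)
    {
        return x == 0 ? 1 : x * factorial(x - 1);
    }

    static double get_time(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int main(int argc, char **argv)
    {
        unsigned int x = (argc > 1) ? (unsigned int)atoi(argv[1]) : 10;
        keep(&x);                       /* value not known at compile time */

        double start = get_time();
        unsigned int y = factorial(x);
        keep(&y);                       /* result counts as "used" */
        double end = get_time();

        printf("factorial(%u) = %u\n", x, y);
        printf("time: %f seconds\n", end - start);
        return 0;
    }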

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Mon Jan 29 13:35:00 2024
    On 28/01/2024 21:43, Lawrence D'Oliveiro wrote:
    On Sun, 28 Jan 2024 12:53:36 +0100, David Brown wrote:

    On 27/01/2024 21:59, Lawrence D'Oliveiro wrote:

    On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:

    David Brown <david.brown@hesbynett.no> writes:

    Standard printf formatting also does not allow such re-arrangements.
    Depends on what standard you use. POSIX certainly does.

    C and POSIX go together like a horse and carriage; one without the
    other is a lot less useful.

    To the nearest percent, 0% of all systems running C programs support
    POSIX (or Windows, or any other "big" system). The world of small
    embedded systems totally outweigh "big" systems by many orders of
    magnitude. And perhaps 80% of such small systems are programmed in C.

    And a lot of those “embedded” systems are running Android.

    No, they are not. Android is Linux, and is included in the 0%.


    Android ships as many units per year as the entire installed base of Microsoft Windows.

    Sure. And it is still within the 0%.

    Take your car as an example. There's a reasonable chance, if it is
    modern, that the entertainment and navigation system is running Android.
    You might have a couple of other parts running embedded Linux of other
    types. And you might have 100 other microcontrollers running programs
    written in C, but not running a "big" POSIX OS. Some will run RTOS's,
    some will be bare metal.

    On the computer on your desk, you have a microcontroller in your mouse, keyboard, webcam, screen, harddisk, managed switch. Your printer might
    have some kind of embedded Linux for its display and UI, but probably
    has many other microcontrollers in it. Your toaster, oven, fridge,
    alarm clock, digital thermometer - microcontrollers are everywhere.

    Even your typical Android device - a phone or tablet - will have a few
    separate microcontrollers, and a variety of bits and pieces in its SoC
    that are programmed in C but do not have a POSIX system.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Mon Jan 29 17:06:53 2024
    On 29/01/2024 03:17, Lawrence D'Oliveiro wrote:
    On Sun, 28 Jan 2024 17:48:53 -0800, Keith Thompson wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    The point being the prevalence of POSIX is a little larger than you
    give it credit for.

    Again, David wasn't talking about Android systems.

    No, I was, as an example of the sort of POSIX system he thought was too minuscule to worry about.

    I think your mind-reading equipment needs adjustment. Perhaps I should
    take off my tin-foil hat - would that make it easier for you to argue
    against what you imagine I think?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Mon Jan 29 17:05:17 2024
    On 29/01/2024 01:03, Lawrence D'Oliveiro wrote:
    On Sun, 28 Jan 2024 14:49:53 -0800, Keith Thompson wrote:

    A lot (for some interpretations of "a lot") of embedded systems run
    Android. Those aren't the one David was talking about.

    They have a POSIX-type C runtime. Which does support “%«n»$” for reordering args to the printf routines.

    The point being the prevalence of POSIX is a little larger than you give
    it credit for.

    No, it is not. I doubt if there are many people here that don't know
    that Android is Linux, and therefore POSIX, or that there are lots more
    Android systems than Windows systems. If you exclude obvious cases like phones, tablets, and smart TVs, there are many more embedded Linux
    systems that are not Android, than embedded Android systems. Those are
    all POSIX too. And yet they are all part of the 0%.

    Now, it is not uncommon for the libraries of small embedded systems to
    support some POSIX extensions - the "newlib" family of C libraries
    supports POSIX extensions to printf formatting flags, such as control of positional arguments. But many other embedded C libraries do not. And
    support for that one extension does not imply POSIX support or
    "POSIX-type C runtime" libraries.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Mon Jan 29 17:20:18 2024
    On 28/01/2024 20:16, Janis Papanagnou wrote:
    On 27.01.2024 17:46, David Brown wrote:
    [...]

    FYI: Too long to read at the moment. (Maybe later, maybe not.)


    OK. It's off-topic, and just chatter, so if you don't reply I will not
    feel insulted :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Mon Jan 29 17:18:57 2024
    On 28/01/2024 20:49, Malcolm McLean wrote:
    On 28/01/2024 18:24, David Brown wrote:
    On 28/01/2024 17:09, Malcolm McLean wrote:

    You put non-ASCII text on stdout?
    I mean, obviously in a program for international use itself. But in
    routine program for general use?


    I commonly write out in UTF-8 - it does not have to be
    "international". (I assume that by "international" you, as a good
    Brit, mean "not UK". After all, a program written solely for use in
    Norwegian is not international.)

    I'd expect that most general purpose programs written by Norwegians use
    an English interface, even if it isn't really expected that the program
    will find an audience beyond some users in Norway. Except of course for programs which in some way are about Norway.

    Why?

    I might write the program with English output if the output language
    doesn't matter, because I am lazy and write better English than
    Norwegian. I might give it English output if the program were to be
    used in a context where English is prevalent already, such as a
    programming utility. But if it is intended to be used by Norwegians in
    general use, I'll make the output in Norwegian.

    Just because Norwegians are, for the most part, very good at English,
    does not mean they don't prefer Norwegian.


    Sometimes I will have binary data of some kind on the standard output.
    It's a lot less common, but it happens.  A common example would be
    code for generating images or other files for a webserver.

    Most of my "real" programs, rather than small utilities, are for
    embedded systems where the concept of "standard output" is not really
    the same as for PC's.

    I've never used standard output for binary data. It might be necessary
    for webservers that serve images. But it strikes me as a poor design decision.

    It's simply something you haven't considered. Don't assume that the
    kind of programming /you/ do, or that /you/ have experience with, is in
    any way representative of other kinds of programming needs or practices.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Keith Thompson on Mon Jan 29 12:09:57 2024
    On 1/28/24 18:00, Keith Thompson wrote:
    ...
    And in environments like POSIX that don't distinguish between text and
    binary output streams, it can be perfectly sensible (though not 100% portable) to send binary data to stdout.

    I'm sure you know about the following, but for Malcolm's benefit, I want
    to expand on that comment. In other environments, the fact that stdout
    is initially a text stream means that freopen() would have to be used to
    change it to a binary stream - but it is otherwise no more of a problem
    than in a POSIX environment.
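
    A minimal sketch of that, for what it's worth (whether the mode change
    is actually honoured is implementation-defined, which is why the failure
    check matters):

    #include <stdio.h>

    int main(void)
    {
        /* C99/C11 allow freopen() with a null filename to request a mode
           change on an existing stream; which changes are permitted is
           implementation-defined. */
        if (freopen(NULL, "wb", stdout) == NULL) {
            fputs("could not switch stdout to binary mode\n", stderr);
            return 1;
        }

        unsigned char bytes[4] = { 0x00, 0xff, 0x1a, 0x0a };  /* arbitrary */
        fwrite(bytes, 1, sizeof bytes, stdout);
        return 0;
    }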

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Mon Jan 29 16:23:45 2024
    On 29/01/2024 12:35, David Brown wrote:
    On 28/01/2024 21:43, Lawrence D'Oliveiro wrote:
    On Sun, 28 Jan 2024 12:53:36 +0100, David Brown wrote:

    On 27/01/2024 21:59, Lawrence D'Oliveiro wrote:

    On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:

    David Brown <david.brown@hesbynett.no> writes:

    Standard printf formatting also does not allow such re-arrangements.
    Depends on what standard you use.  POSIX certainly does.

    C and POSIX go together like a horse and carriage; one without the
    other is a lot less useful.

    To the nearest percent, 0% of all systems running C programs support
    POSIX (or Windows, or any other "big" system).  The world of small
    embedded systems totally outweigh "big" systems by many orders of
    magnitude.  And perhaps 80% of such small systems are programmed in C.

    And a lot of those “embedded” systems are running Android.

    No, they are not.  Android is Linux, and is included in the 0%.


    Android ships as many units per year as the entire installed base of
    Microsoft Windows.

    Sure.  And it is still within the 0%.

    Take your car as an example.  There's a reasonable chance, if it is
    modern, that the entertainment and navigation system is running Android.
     You might have a couple of other parts running embedded Linux of other types.  And you might have 100 other microcontrollers running programs written in C, but not running a "big" POSIX OS.  Some will run RTOS's,
    some will be bare metal.

    On the computer on your desk, you have a microcontroller in your mouse, keyboard, webcam, screen, harddisk, managed switch.  Your printer might
    have some kind of embedded Linux for its display and UI, but probably
    has many other microcontrollers in it.  Your toaster, oven, fridge,
    alarm clock, digital thermometer - microcontrollers are everywhere.

    I think this is being disingenuous. Of course there are countless
    millions of integrated circuits used everywhere, that will outnumber the packaged consumer devices that everyone knows about.

    Some of them may have programmable elements. But, no matter how crude,
    how limited, if somebody, somewhere, has configured a program to turn a
    subset of C into code for that device, that enables you to add that to
    the list of systems you claim are programmed in 'C'.

    Even if it relies on dedicated extensions or uses lots of inline assembly.

    Even your typical Android device - a phone or tablet - will have a few separate microcontrollers, and a variety of bits and pieces in its SoC
    that are programmed in C but do not have a POSIX system.


    Maybe you can count each main CPU and each core separately too!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Mon Jan 29 18:40:56 2024
    On 29/01/2024 17:23, bart wrote:
    On 29/01/2024 12:35, David Brown wrote:
    On 28/01/2024 21:43, Lawrence D'Oliveiro wrote:
    On Sun, 28 Jan 2024 12:53:36 +0100, David Brown wrote:

    On 27/01/2024 21:59, Lawrence D'Oliveiro wrote:

    On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:

    David Brown <david.brown@hesbynett.no> writes:

    Standard printf formatting also does not allow such re-arrangements.
    Depends on what standard you use.  POSIX certainly does.

    C and POSIX go together like a horse and carriage; one without the
    other is a lot less useful.

    To the nearest percent, 0% of all systems running C programs support
    POSIX (or Windows, or any other "big" system).  The world of small
    embedded systems totally outweigh "big" systems by many orders of
    magnitude.  And perhaps 80% of such small systems are programmed in C.
    And a lot of those “embedded” systems are running Android.

    No, they are not.  Android is Linux, and is included in the 0%.


    Android ships as many units per year as the entire installed base of
    Microsoft Windows.

    Sure.  And it is still within the 0%.

    Take your car as an example.  There's a reasonable chance, if it is
    modern, that the entertainment and navigation system is running
    Android.   You might have a couple of other parts running embedded
    Linux of other types.  And you might have 100 other microcontrollers
    running programs written in C, but not running a "big" POSIX OS.  Some
    will run RTOS's, some will be bare metal.

    On the computer on your desk, you have a microcontroller in your
    mouse, keyboard, webcam, screen, harddisk, managed switch.  Your
    printer might have some kind of embedded Linux for its display and UI,
    but probably has many other microcontrollers in it.  Your toaster,
    oven, fridge, alarm clock, digital thermometer - microcontrollers are
    everywhere.

    I think this is being disingenuous. Of course there are countless
    millions of integrated circuits used everywhere, that will outnumber the packaged consumer devices that everyone knows about.

    Some of them may have programmable elements. But, no matter how crude,
    how limited, if somebody, somewhere, has configured a program to turn a subset of C into code for that device, that enables you to add that to
    the list of systems you claim are programmed in 'C'.

    I am talking about systems with CPUs of some sort that are regularly
    programmed in C. I am not including chips that just have configuration,
    or 4-bit devices that are programmed in their own Forth-like assembly.


    Even if it relies on dedicated extensions or uses lots of inline assembly.


    Sure. I am not suggesting that these devices are programmed /solely/ in
    C - certainly not solely in standard and portable C. But then, few
    programs on PC's or other "big" systems are programmed solely in pure
    standard C.

    And I am not restricting this to devices that the end user can program
    in C. Most devices cannot be accessed or re-programmed by the end-user
    - some cannot be reprogrammed at all, with the program in ROM of some kind.

    Even your typical Android device - a phone or tablet - will have a few
    separate microcontrollers, and a variety of bits and pieces in its SoC
    that are programmed in C but do not have a POSIX system.


    Maybe you can count each main CPU and each core separately too!

    That would change the dynamics a bit, but it would not change the
    overall point.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Keith Thompson on Mon Jan 29 19:33:26 2024
    On 29.01.2024 00:00, Keith Thompson wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 27.01.2024 21:17, Malcolm McLean wrote:
    [...]
    printf() is the
    main C interface to that, and supports integers, floats and strings, to
    a first approximation.

    It's not an approximation; printf() is _restricted_ to these types (and
    a few more variants of these few basic types, to be correct).

    printf also supports pointer values with "%p". And it supports single characters, which are not strings.

    strings of course absolutely do not have to be ASCII. Using printf to
    print data with embedded null bytes is tricky

    I haven't tried with a pure C compiler but with a C++ compiler
    it works fine (see below). - Just don't make the mistake to try
    to embed a NUL inside a string constant, like

    printf ("%s\n", "My\0string");

    which won't work as some may expect (you will only see "My").

    -- but of course printf is
    not the only interface. We can print arbitrary data with putchar,
    fwrite, etc.

    I use the Unix I/O functions as well (i.e. 'write' etc.).


    With printf or cout I can print binary, non-ASCII, UTF-8 encoded
    characters...

    std::cout << "lbelkeituerung" << std::endl;
    printf("%s\n", "lbelkeituerung");

    $ utf8out
    lbelkeituerung
    lbelkeituerung


    cout << (char)0x02 << (char)0x01 << (char)0x00 << (char)0xff
    << (char)0xfe << (char)0x80 << (char)0x7f << (char)0x0a;

    printf("%c%c%c%c%c%c%c%c",
    (char)0x02, (char)0x01, (char)0x00, (char)0xff,
    (char)0xfe, (char)0x80, (char)0x7f, (char)0x0a);

    $ binout | od -t x1
    0000000 02 01 00 ff fe 80 7f 0a 02 01 00 ff fe 80 7f 0a
    0000020


    And in environments like POSIX that don't distinguish between text and
    binary output streams,

    Also keep in mind that you can modify the output system attributes;
    in these cases of a modified channel you may not see what you sent.

    it can be perfectly sensible (though not 100%
    portable) to send binary data to stdout.

    Not observed (by me) in the environments I was using the past decades.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Mon Jan 29 19:48:27 2024
    On 29/01/2024 19:32, Malcolm McLean wrote:
    On 27/01/2024 11:36, Janis Papanagnou wrote:
    On 27.01.2024 12:02, Malcolm McLean wrote:
    On 27/01/2024 10:34, Janis Papanagnou wrote:
    On 27.01.2024 04:05, Malcolm McLean wrote:


    In many languages, including C, there's a difference between functions
    that return a value and functions that don't, in that

    In some languages, yes.


    if (realloc(ptr, 0))

    is allowed

    whilst

    if (free(ptr))

    struct S { int a, b; };

    struct S foo(void);

    foo() returns a value, but "if (foo())" is not allowed.

    C does not make much difference between functions that return a value,
    and those that don't. The key distinction is whether the "return"
    statement must have an expression or must not have an expression.



    I don't disagree that it can be useful to distinguish between different
    types of functions. I /do/ disagree with your attempts to classify
    them, which I do not think are useful or well-defined categories.

    And just because particular terms are used in some other context, does
    not mean you get to define them yourself for use in other contexts, or
    apply to them to languages that do not use those terms.

    A "function" here is a "C function" in terms of the C standard. If you
    want to talk about anything else, define what you mean at the time.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to David Brown on Mon Jan 29 19:35:22 2024
    On 2024-01-29, David Brown <david.brown@hesbynett.no> wrote:
    On 29/01/2024 19:32, Malcolm McLean wrote:
    On 27/01/2024 11:36, Janis Papanagnou wrote:
    On 27.01.2024 12:02, Malcolm McLean wrote:
    On 27/01/2024 10:34, Janis Papanagnou wrote:
    On 27.01.2024 04:05, Malcolm McLean wrote:


    In many languages, including C, there's a difference between functions
    that return a value and functions that don't, in that

    In some languages, yes.


    if (realloc(ptr, 0))

    is allowed

    whilst

    if (free(ptr))

    struct S { int a, b; };

    struct S foo(void);

    foo() returns a value, but "if (foo())" is not allowed.

    C does not make much difference between functions that return a value,
    and those that don't. The key distinction is whether the "return"
    statement must have an expression or must not have an expression.

    Don't forget that we can have:

    struct S s = foo();

    not to mention

    struct S bar(void) { return foo(); }

    as well as:

    extern void bar(struct S);

    bar(foo());

    none of which patterns is possible if foo returns void.

    A void return is qualitatively different. A function which returns
    a value can plausibly belong into the functional domain. A function
    which returns void is necessarily an imperative procedure.

    Even if it does nothing, a void foo() function is a procedure in that
    it cannot be planted into a functional expression like bar(foo()).

    So we can identify an emergent category there.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Mon Jan 29 21:06:52 2024
    On 29/01/2024 19:47, Malcolm McLean wrote:
    On 29/01/2024 16:18, David Brown wrote:
    On 28/01/2024 20:49, Malcolm McLean wrote:
    On 28/01/2024 18:24, David Brown wrote:

    I'd expect that most general purpose programs written by Norwegians
    use an English interface, even if it isn't really expected that the
    program will find an audience beyond some users in Norway. Except of
    course for programs which in some way are about Norway.

    Why?

    Generally programmers are educated people and educated people use
    English for serious purposes.

    I think you should probably stop there before you insult people.

    Not always of course and Norway might be
    an exception. But I'd expect that in a Norwegian university, for
    example, it would be forbidden to document a program in Norwegian or to
    use non-English words for identifiers. And probably the same in a large Norwegian company. I might be wrong about that and I have never visited Norway or worked for a Norwegian employer (and obviously I couldn't do
    so unless the policy I expect was followed).

    You are wrong. And it is not something special about Norway.

    You are coming across as the kind of ignorant and inconsiderate git who
    thinks the way to talk to Jonny Foreigner is slowly and loudly.

    Please stop before you embarrass yourself more.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Malcolm McLean on Mon Jan 29 12:10:15 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Kaz Kylheku on Mon Jan 29 21:10:40 2024
    On 29/01/2024 20:35, Kaz Kylheku wrote:
    On 2024-01-29, David Brown <david.brown@hesbynett.no> wrote:
    On 29/01/2024 19:32, Malcolm McLean wrote:
    On 27/01/2024 11:36, Janis Papanagnou wrote:
    On 27.01.2024 12:02, Malcolm McLean wrote:
    On 27/01/2024 10:34, Janis Papanagnou wrote:
    On 27.01.2024 04:05, Malcolm McLean wrote:


    In many languages, including C, there's a difference between functions
    that return a value and functions that don't, in that

    In some languages, yes.


    if (realloc(ptr, 0))

    is allowed

    whilst

    if (free(ptr))

    struct S { int a, b; };

    struct S foo(void);

    foo() returns a value, but "if (foo())" is not allowed.

    C does not make much difference between functions that return a value,
    and those that don't. The key distinction is whether the "return"
    statement must have an expression or must not have an expression.

    Don't forget that we can have:

    struct S s = foo();

    not to mention

    struct S bar(void) { return foo(); }

    as well as:

    extern bar(struct S);

    bar(foo());

    none of which patterns is possible if foo returns void.

    Sure. But Malcolm suggested that the "if" pattern was a special
    distinguishing feature. (He has already made it clear that the only
    types of interest, in his world, are integers, floats and strings.)


    A void return is qualitatively different. A function which returns
    a value can plausibly belong into the functional domain. A function
    which returns void is necessarily an imperative procedure.

    Even if it does nothing, a void foo() function it is a procedure in that
    it cannot be planted into a functional expression like bar(foo()).

    So we can identify an emergent category there.


    Certainly there is a distinction between void and non-void functions -
    but often it is not a particularly important one. What a function does,
    what kind of side-effects it has, how its behaviour and values interact
    with other parts of the code, how it interacts with other threads, are
    far more interesting aspects for classification. (And part of the
    interest is that these features are not normally specified in the code.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Mon Jan 29 22:24:43 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 29/01/2024 20:10, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans

    Sure it can. Pipe it to the appropriate tool.

    $ cat /usr/bin/xrn | xxd |head
    0000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
    0000010: 0200 3e00 0100 0000 ac47 4000 0000 0000 ..>......G@.....
    0000020: 4000 0000 0000 0000 08a0 0500 0000 0000 @...............
    0000030: 0000 0000 4000 3800 0900 4000 2200 1f00 ....@.8...@."...
    0000040: 0600 0000 0500 0000 4000 0000 0000 0000 ........@.......
    0000050: 4000 4000 0000 0000 4000 4000 0000 0000 @.@.....@.@.....
    0000060: f801 0000 0000 0000 f801 0000 0000 0000 ................
    0000070: 0800 0000 0000 0000 0300 0000 0400 0000 ................
    0000080: 3802 0000 0000 0000 3802 4000 0000 0000 8.......8.@.....

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Mon Jan 29 23:51:49 2024
    On Mon, 29 Jan 2024 19:33:26 +0100, Janis Papanagnou wrote:

    printf ("%s\n", "My\0string");

    which won't work as some may expect (you will only see "My").

    I’ve often thought, now that we can assume that strings are normally UTF-8-encoded, that we can use an alternative UTF-8-derived representation
    of NUL that *isn’t* interpreted as a string terminator. E.g.

    \xc0\x80
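
    For what it's worth, a small sketch of that idea (this is the "overlong"
    two-byte encoding of U+0000, the same trick Java's modified UTF-8 uses;
    strictly speaking it is not valid UTF-8, so both ends have to agree to
    accept it):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* No real 0x00 byte appears, so the C string machinery is happy. */
        const char s[] = "My\xC0\x80string";

        printf("strlen = %zu\n", strlen(s));    /* 10, not 2 */
        fwrite(s, 1, sizeof s - 1, stdout);     /* all 10 bytes go out */
        putchar('\n');
        return 0;
    }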

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Mon Jan 29 23:56:12 2024
    On Mon, 29 Jan 2024 17:05:17 +0100, David Brown wrote:

    If you exclude obvious cases like
    phones, tablets, and smart TVs, there are many more embedded Linux
    systems that are not Android, than embedded Android systems. Those are
    all POSIX too. And yet they are all part of the 0%.

    If you include them, you find that Microsoft has only something like 25%
    of the computer market.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Mon Jan 29 23:55:16 2024
    On Mon, 29 Jan 2024 18:47:46 +0000, Malcolm McLean wrote:

    ... I don't think
    my use of standard output is all that untypical. It's unacceptable for anything released to customers and is used mainly for debugging.

    stderr for debugging, typically, not stdout.

    If you want a “human-readable” output stream, stderr would more likely fit that bill than stdout.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Malcolm McLean on Mon Jan 29 23:27:52 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed through systems designed to handle human-readable text. For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting
    for next data byte". Obviously this will cause difficulties if the data
    is binary.
    Also many binary formats can't easily be extended, so you can pass one
    image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to
    extend.

    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Tue Jan 30 10:05:31 2024
    On 29/01/2024 23:27, Malcolm McLean wrote:
    On 29/01/2024 20:10, David Brown wrote:

    Sure.  But Malcolm suggested that the "if" pattern was a special
    distinguishing feature.  (He has already made it clear that the only
    types of interest, in his world, are integers, floats and strings.)

    No, Malcolm gave if() as an example of a distinction in a language grammar between functions that return a value and functions that don't. Sometimes functions return a value and if() still isn't allowed. Fair enough
    point. But it doesn't really detract from the point that Malcolm is making.


    Fair enough. But that does not detract from my counter-point - I don't
    think void / non-void return type is a major or helpful way to
    categorise functions. As you said yourself, there is not a huge
    difference between a function that returns a value directly, or one that
    takes a pointer for passing the return value to the caller.

    And since void / non-void return type is already a distinction made in
    the way the code is written, it is not as useful a classification as you
    could get from other factors that are not immediately clear from the
    code or clear to the compiler (such as "has side-effects" / "depends on side-effects" / "independent of side-effects", or "thread-safe" / "not thread-safe").

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Tue Jan 30 14:03:46 2024
    On 30/01/2024 10:13, Malcolm McLean wrote:
    On 30/01/2024 07:27, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed though systems designed to handle
    human-readable text.  For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting
    for next data byte".  Obviously this will cause difficuties if the data >>> is binary.
    Also many binary formats can't easily be extended, so you can pass one
    image and that's all.  While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to
    extend.

    Your reasoning is all gobbledygook.  Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary output.

    Proof by repeated assertion?

    /stderr/ was designed for text - it was for error messages independent
    of the main output stream, precisely because stdout output was often not
    human readable text but data passed on to other programs or devices.

    I wonder if there is any *nix program older or simpler than "cat" - a
    program that simply passes its input files or the stdin to stdout.

    Whilst there is a "printf()" which operates on standard output by
    default, there are no functions which write binary data to standard
    output by default, for example. Though of course you can pass stdout to
    the regular binary output functions like fwrite().

    There is no standard C library function that takes stderr as the default stream. Does that mean stderr was not designed to be used at all?

    "printf" exists and works the way it does because it is convenient and
    useful. It can be viewed as a short-cut for "fprintf(stdout, ...".
    Indeed, that is /exactly/ how the C standard describes the function.

    That means the C standards acknowledge that people often want to print
    out formatted text (which in no way implies plain ASCII) to stdout.
    This does not mean they expect this to be the /only/ use of stdout, or
    that people will not use binary outputs to stdout, any more than it
    implies that text output will always be sent to stdout and not other
    streams or files.


    So I'm obviously not the only person to take the view that passing
    binary data to standard output is a rather odd thing to do.

    Many programs only write text to stdout. That does not mean writing
    binary data is "odd", or that stdout was "designed" for text.


    I suspect the truth is that it is a bad design and I am right, but
    because for some reason communications have to be via standard output,
    people make the best of it and contrive that it shall work, and then
    forget that essentially it is a misuse of a text stream. They are
    slightly proud of their efforts and intolerant of my point.


    Obviously you think you are right - it would be pretty silly for you to
    witter on in the face of clear and mostly unanimous opposition if you
    did not think you were right. That, of course, does not mean you /are/
    right.

    That you couldn't actually mount a defence of your position whilst I
    could, also strongly implies that I am right.

    He was correct - your reasoning is gobbledygook.

    If I claim grass is pink, and I know this because it is the same colour
    as the sea which is also pink, then I have given a justification and a
    defence of my position. That doesn't mean it is worth the pixels it is
    written with, or that anyone needs to elaborate when they say I am
    talking nonsense.

    It is so blindingly clear and obvious that stdout is regularly used for non-text data, and so many undeniably accurate and common examples have
    been given, that your position is entirely untenable.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Tue Jan 30 15:00:04 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 30/01/2024 07:27, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed though systems designed to handle
    human-readable text. For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting
    for next data byte". Obviously this will cause difficuties if the data
    is binary.
    Also many binary formats can't easily be extended, so you can pass one
    image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to
    extend.

    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary output.

    That was never the case. stdout is an unformatted stream of bytes
    associated by default with file descriptor number one in the
    application.

    Long before Windows was even a gleam in Gates' eye.
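
    A minimal C sketch of that point, assuming nothing beyond the standard
    library: the bytes below are arbitrary, and stdout neither knows nor
    cares whether they spell text. (On implementations that distinguish
    text and binary streams, stdout starts out as a text stream, so a 0x0A
    byte may still be translated on the way out - more on that later in
    the thread.)

    #include <stdio.h>

    int main(void)
    {
        /* Arbitrary bytes, including a NUL and a 0x1A - not text */
        static const unsigned char bytes[] = { 0x4D, 0x43, 0x58, 0x00, 0x1A, 0xFF };

        /* fwrite pushes the raw bytes to stdout; any structure or meaning
           is imposed by whatever consumes the stream (pipe, file, terminal). */
        size_t written = fwrite(bytes, 1, sizeof bytes, stdout);
        return written == sizeof bytes ? 0 : 1;
    }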

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Scott Lurndal on Tue Jan 30 15:30:57 2024
    On 30/01/2024 15:00, Scott Lurndal wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary output.

    That was never the case. stdout is an unformatted stream of bytes
    associated by default with file descriptor number one in the
    application.

    Long before Windows was even a gleam in Gates' eye.


    I don't know what Windows has to do with it.

    The difference between text and binary byte streams is something
    invented by C, so that conversions could be done for byte '\n' on
    systems with alternate line-endings.

    It wasn't just LF versus CRLF either. The difference with the latter is
    that it is still in use, /due to the popularity/ of MSDOS and WINDOWS,
    while others have died out.

    With C out of the picture, then CRLF was and is simply a two-byte
    sequence: you write two bytes, or read two bytes.

    So long as a byte has 8 bits, you can't tell whether a byte stream
    represents text or binary data.

    When coding however, it seems incredibly bad form to me to send what you
    know is binary data (so can contain malformed UTF8, or inadvertent
    escape sequences) to an output which will try to represent that on a
    text display.

    For such reasons, whenever I invent a binary file format, I usually
    include a 26/1A byte (end-of-file) so that programs which expect a text
    file will stop at that character. For example:

    c:\mx>type hello.mx
    MCX
    c:\mx>dump hello.mx
    0000: 4D 43 58 1A 01 30 2E 31 32 33 34 00 05 76 02 00

    If I try it under WSL with 'cat', I get a huge bunch of crap displayed
    on the screen, mixed up with long beeps whenever there's an 07 byte.

    That's why binary code, not representing text, is a bad idea even on
    'stdout'.
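
    A sketch of how such a header might be laid down, going only by the
    dump above (the filename, signature and first payload bytes are taken
    from that example; the rest is a guess at the general idea, not the
    actual code behind it):

    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("hello.mx", "wb");
        if (!f)
            return 1;

        fputs("MCX", f);        /* short human-readable signature */
        fputc(0x1A, f);         /* 26/1A end-of-file byte: 'type' stops here */

        /* ... binary payload follows; 'type' never gets past the 1A ... */
        static const unsigned char payload[] = { 0x01, 0x30, 0x2E, 0x31 };
        fwrite(payload, 1, sizeof payload, f);

        fclose(f);
        return 0;
    }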

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Tue Jan 30 15:53:04 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 30/01/2024 15:00, Scott Lurndal wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 30/01/2024 07:27, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed through systems designed to handle
    human-readable text. For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting
    for next data byte". Obviously this will cause difficulties if the data
    is binary.
    Also many binary formats can't easily be extended, so you can pass one
    image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to
    extend.

    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary output.

    That was never the case. stdout is an unformatted stream of bytes
    associated by default with file descriptor number one in the
    application.

    It is a stream of bytes at the level that the file descriptor is used to
    generate a write event for a byte which can be arbitrary. But standard
    output is often quickly transformed into a stream of characters.
    Sometimes within the application executable.

    It's binary data. UTF-8, for example, is binary data. Your
    statement "standard output is designed for text, not binary" is
    completely false.

    those who use the less common, often more expensive Unix systems.

    More expensive? In what world do you live?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Tue Jan 30 15:54:54 2024
    bart <bc@freeuk.com> writes:
    On 30/01/2024 15:00, Scott Lurndal wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary output.

    That was never the case. stdout is an unformatted stream of bytes
    associated by default with file descriptor number one in the
    application.

    Long before Windows was even a gleam in Gates' eye.


    I don't know what Windows has to do with it.

    The difference between text and binary byte streams is something
    invented by C, so that conversions could be done for byte '\n' on
    systems with alternate line-endings.

    No, it was invented to support Windows CRLF line endings.

    Regardless of your digression, stdout is still an unformatted
    stream of bytes. Any structure on that stream is imposed
    by the -consumer- of those bytes.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Tue Jan 30 16:06:02 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 30/01/2024 13:03, David Brown wrote:
    On 30/01/2024 10:13, Malcolm McLean wrote:

    There is no standard C library function that takes stderr as the default
    stream.  Does that mean stderr was not designed to be used at all?

    "printf" exists and works the way it does because it is convenient and
    useful.  It can be viewed as a short-cut for "fprintf(stdout, ...".
    Indeed, that is /exactly/ how the C standard describes the function.

    That means the C standards acknowledge that people often want to print
    out formatted text (which in no way implies plain ASCII) to stdout. This
    does not mean they expect this to be the /only/ use of stdout, or that
    people will not use binary outputs to stdout, any more than it implies
    that text output will always be sent to stdout and not other streams or
    files.

    Special facilities for text don't necessarily mean that text is the only
    output intended to be used, fair enough.

    Even text is just an unformatted stream of bytes. It is the ultimate
    consumer of that text that imposes structure on it (e.g. by treating
    it as ASCII, UTF-16, UTF-8, UTF-32, EBCDIC, et cetera, et alia, und so weiter)


    printf has no binary data format specifier.

    %s? Simply copies non-nul bytes. That's almost as binary as
    one can get, it certainly isn't restricted to printable characters.

    And of course, there are putc and putchar.

    Not to mention using printf where the format string argument
    includes binary data.

    <snip>
    The fact that there is no
    similar function for standard error

    fprintf(stderr, "%s", binary_data_with_no_embedded_nul_bytes);


    Similarly there is no
    function "write" that passes binary data to standard output by default.

    In the real world, and in the world the C was created to support
    there are several functions (write, pwrite, mmap, lio_listio, aio_read, aio_write et cetera, et alia, und so weiter).

    Most of which existed before 1989.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Scott Lurndal on Tue Jan 30 16:10:13 2024
    On 30/01/2024 15:54, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 30/01/2024 15:00, Scott Lurndal wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary output.

    That was never the case. stdout is an unformatted stream of bytes
    associated by default with file descriptor number one in the
    application.

    Long before Windows was even a gleam in Gates' eye.


    I don't know what Windows has to do with it.

    The difference between text and binary byte streams is something
    invented by C, so that conversions could be done for byte '\n' on
    systems with alternate line-endings.

    No, it was invented to support windows CRLF line endings.

    You just want to have a go at Windows don't you?

    I was using CRLF line-endings in the 1970s; they weren't an invention of
    Windows, which didn't exist until the mid-80s and didn't become popular
    until the mid-90s.

    So, how did C deal with CRLF in all those non-Windows settings?


    Regardless of your digression, stdout is still an unformatted
    stream of bytes. Any structure on that stream is imposed
    by the -consumer- of those bytes.

    Of course. But it still a bad idea to write actual output that you KNOW
    does not represent text, to a consumer that will expect text.

    For example to a terminal window, which can happen if you forget to
    redirect it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to Scott Lurndal on Tue Jan 30 16:12:06 2024
    On Tue, 30 Jan 2024 16:06:02 +0000, Scott Lurndal wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 30/01/2024 13:03, David Brown wrote:
    On 30/01/2024 10:13, Malcolm McLean wrote:

    There is no standard C library function that takes stderr as the default
    stream.  Does that mean stderr was not designed to be used at all?

    "printf" exists and works the way it does because it is convenient and
    useful.  It can be viewed as a short-cut for "fprintf(stdout, ...".
    Indeed, that is /exactly/ how the C standard describes the function.

    That means the C standards acknowledge that people often want to print
    out formatted text (which in no way implies plain ASCII) to stdout. This
    does not mean they expect this to be the /only/ use of stdout, or that
    people will not use binary outputs to stdout, any more than it implies
    that text output will always be sent to stdout and not other streams or
    files.

    Special facilities for text don't necessarily mean that text is the only
    output intended to be used, fair enough.

    Even text is just an unformatted stream of bytes. It is the ultimate consumer of that text that imposes structure on it (e.g. by treating
    it as ASCII, UTF-16, UTF-8, UTF-32, EBCDIC, et cetera, et alia, und so weiter)


    printf has no binary data format specifier.

    %s? Simply copies non-nul bytes. That's almost as binary as
    one can get, it certainly isn't restricted to printable characters.

    And of course, there are putc and putchar.

    Not to mention using printf where the format string argument
    includes binary data.

    <snip>
    The fact that there is no
    similar function for standard error

    fprintf(stderr, "%s", binary_data_with_no_embedded_nul_bytes);

    Or, better yet, fwrite(), which can write to any output stream[1],
    including stdout and stderr.

    [1] At least, the C standard does not impose any restriction on
    which stream(s) fwrite() can access, other than that they must
    be writable.


    Similarly there is no
    function "write" that passes binary data to standard output by default.

    fwrite(buffer,sizeof *buffer,1,stdout);

    In the real world, and in the world the C was created to support
    there are several functions (write, pwrite, mmap, lio_listio, aio_read, aio_write et cetera, et alia, und so weiter).

    Most of which existed before 1989.




    --
    Lew Pitcher
    "In Skills We Trust"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Tue Jan 30 17:25:31 2024
    On 30/01/2024 16:49, Malcolm McLean wrote:
    On 30/01/2024 13:03, David Brown wrote:
    On 30/01/2024 10:13, Malcolm McLean wrote:

    There is no standard C library function that takes stderr as the
    default stream.  Does that mean stderr was not designed to be used at
    all?

    "printf" exists and works the way it does because it is convenient and
    useful.  It can be viewed as a short-cut for "fprintf(stdout, ...".
    Indeed, that is /exactly/ how the C standard describes the function.

    That means the C standards acknowledge that people often want to print
    out formatted text (which in no way implies plain ASCII) to stdout.
    This does not mean they expect this to be the /only/ use of stdout, or
    that people will not use binary outputs to stdout, any more than it
    implies that text output will always be sent to stdout and not other
    streams or files.

    Special facilities for text don't necessarily mean that text is the only output intended to be used, fair enough.

    printf has no binary data format specifier.

    Mixing binary data with formatted text data is very unlikely to be
    useful. fwrite() is perfectly good for writing binary data - it would
    make no sense to have some awkward printf specifier to do this. (What
    would the specifier even be? It would need to take two items of data -
    a pointer and a length - and thus be very different from existing
    specifiers.)
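
    To illustrate that point, here is a hypothetical helper (not a standard
    function, and not proposed seriously): the "specifier" would have to
    carry exactly the pointer-and-length pair that fwrite() already takes,
    at which point it is just fwrite() under another name.

    #include <stdio.h>

    /* Hypothetical "binary printf" boiled down to its essentials:
       a pointer plus a length, handed straight to fwrite(). */
    static size_t put_binary(const void *data, size_t len)
    {
        return fwrite(data, 1, len, stdout);
    }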

    And as you say, the fact
    that it is provided is an acknowledgement that programmers often want to
    pass formatted text to standard output.

    Yes.

    The fact that there is no
    similar function for standard error suggests that wanting to pass
    formatted text to error is a less common requirement.

    stderr is a newer invention than stdout and stdin. Perhaps no one
    bothered to add such a function to the standard library because "fprintf(stderr..." is not particularly difficult to write.

    Which is my
    experience for the sort of programming that I do. Similarly there is no function "write" that passes binary data to standard output by default.

    What would that gain? One fewer parameter to fwrite()?

    You are reading /way/ too much into the existence or non-existence of
    short-cut functions in the C standard library.

    So this suggests that passing binary data to standard output is a less
    common requirement. And in fact on many systems standard output will
    corrupt such data in default mode.

    So these three things together - no binary data format specifer for
    printf(), no binary equivalent function to printf that defaults to
    standard output, and the fact that standard output will corrupt binary
    data in default mode on some systems, adds up to a pretty powerful
    argument for my position.

    If I claim grass is pink, and I know this because it is the same
    colour as the sea which is also pink, then I have given a
    justification and a defence of my position.  That doesn't mean it is
    worth the pixels it is written with, or that anyone needs to elaborate
    when they say I am talking nonsense.

    It is so blindingly clear and obvious that stdout is regularly used
    for non-text data, and so many undeniably accurate and common examples
    have been given, that your position is entirely untenable.

    Do learn to think. I've given coherent, reasonable justifications that are open to dispute on their own terms.

    No, you haven't. You have taken your own experience of C programming in
    a limited niche field, and extrapolated wildly to assume it applies to
    all C programming. You have totally failed to consider extremely common
    cases where binary stdout output is used - utility programs that anyone
    who uses the Linux command line regularly makes use of dozens of times a
    day. You have taken the existence or non-existence of certain standard
    C library functions as though they were hard rules about how stdout was "designed", or hard rules about what is "normal" and what is "odd" in C programming.

    Nothing of that was reasonable or justified.

    That you are capable of inventing an
    incoherent argument on a different topic proves nothing even by analogy except, to be fair, that it is plausible that people will make bad
    arguments.

    And apart from "that's how you have to do it to make a web server work
    under Unix", I haven't seen much of anything in this sub thread which constitutes a good argument for passing binary data to standard output.


    Then you haven't bothered looking at any of the posts in this thread
    branch, and it is therefore not worth trying to educate you by repeating
    the same things.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Tue Jan 30 17:49:57 2024
    On 30/01/2024 17:10, bart wrote:
    On 30/01/2024 15:54, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 30/01/2024 15:00, Scott Lurndal wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    I must admit that it's nothing I have ever done or considered doing.
    However standard output is designed for text and not binary output.

    That was never the case.  stdout is an unformatted stream of bytes
    associated by default with file descriptor number one in the
    application.

    Long before Windows was even a gleam in Gates' eye.


    I don't know what Windows has to do with it.

    The difference between text and binary byte streams is something
    invented by C, so that conversions could be done for byte '\n' on
    systems with alternate line-endings.

    No, it was invented to support windows CRLF line endings.

    You just want to have a go at Windows don't you?

    I was using CRLF line-endings in 1970s, they weren't an invention of
    Windows, which didn't exist until the mid-80s and didn't become popular
    until the mid-90s.

    CRLF line endings were the invention of printers or teletype machines.
    It took time to move print heads from the end of one line to the
    beginning of the next, and separating the "carriage return" and "line
    feed" commands made timings easier. It also let printer implementers
    handle the two operations independently - occasionally people would want
    to do one but not the other.

    The use of CRLF as a standard for line endings in files was, I believe,
    from CP/M - which came after Unix and Multics, which had standardised on
    LF line endings. (Most OS's before that made things up as they chose,
    rather than being "standard", or used record-based files, punched cards,
    etc.)

    So CRLF precedes Windows quite significantly.

    (I have no idea why Macs picked CR - perhaps they just wanted to be
    different.)


    So, how did C deal with CRLF in all those non-Windows settings?


    The difference between "text" and "binary" streams in C is, in practice,
    up to the implementation. That can be the implementation of the C
    library, or the OS functions (or DLLs/SOs) that the C library calls.
    The norm is that you use "\n" for line endings in the C code - what
    happens after that, for text streams, is beyond C.

    The reason C distinguishes between text and binary streams is that some
    OS's distinguish between them.
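
    Where that distinction matters, a stream can sometimes be switched to
    binary mode explicitly. A hedged sketch only: freopen() with a null
    filename is standard C, but whether it actually changes the mode is
    implementation-defined, and _setmode()/_fileno()/_O_BINARY are the
    usual Windows C-runtime names rather than standard C. On POSIX systems
    there is no text/binary distinction, so neither call changes anything.

    #include <stdio.h>
    #ifdef _WIN32
    #include <io.h>
    #include <fcntl.h>
    #endif

    int main(void)
    {
        /* Try to put stdout into binary mode so '\n' is not expanded to CR LF. */
    #ifdef _WIN32
        _setmode(_fileno(stdout), _O_BINARY);   /* Windows C runtimes */
    #else
        freopen(NULL, "wb", stdout);            /* may fail; stdout then stays a text stream */
    #endif

        static const unsigned char data[] = { 'A', '\n', 0x00, 0x1A };
        fwrite(data, 1, sizeof data, stdout);
        return 0;
    }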


    Regardless of your digression, stdout is still an unformatted
    stream of bytes.   Any structure on that stream is imposed
    by the -consumer- of those bytes.

    Of course. But it still a bad idea to write actual output that you KNOW
    does not represent text, to a consumer that will expect text.


    That's just a specific example of "it's a bad idea for a program to
    behave in a way that a reasonable user would not expect". Which is, of
    course, true - but not a big surprise.

    For example to a terminal window, which can happen if you forget to
    redirect it.

    If the program can reasonably be expected to generate binary output,
    then it is the user's fault if they do this accidentally. Examples
    shown in this thread include cat and zcat - that's what these programs do.

    Sometimes people make mistakes, and try to "cat" (or "type") non-text
    files. Mistakes happen.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Tue Jan 30 16:54:56 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 30/01/2024 16:06, Scott Lurndal wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 30/01/2024 13:03, David Brown wrote:
    On 30/01/2024 10:13, Malcolm McLean wrote:

    There is no standard C library function that takes stderr as the default
    stream.  Does that mean stderr was not designed to be used at all?

    "printf" exists and works the way it does because it is convenient and
    useful.  It can be viewed as a short-cut for "fprintf(stdout, ...".
    Indeed, that is /exactly/ how the C standard describes the function.

    That means the C standards acknowledge that people often want to print
    out formatted text (which in no way implies plain ASCII) to stdout. This
    does not mean they expect this to be the /only/ use of stdout, or that
    people will not use binary outputs to stdout, any more than it implies
    that text output will always be sent to stdout and not other streams or
    files.

    Special facilities for text don't necessarily mean that text is the only
    output intended to be used, fair enough.

    Even text is just an unformatted stream of bytes. It is the ultimate
    consumer of that text that imposes structure on it (e.g. by treating
    it as ASCII, UTF-16, UTF-8, UTF-32, EBCDIC, et cetera, et alia, und so weiter)

    No. If we know that text is ASCII it is not highly structured.

    ASCII is, of course, structured.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Tue Jan 30 16:55:33 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 30/01/2024 16:49, Malcolm McLean wrote:

    The fact that there is no
    similar function for standard error suggests that wanting to pass
    formatted text to error is a less common requirement.

    stderr is a newer invention than stdout and stdin.

    c'est what?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Tue Jan 30 18:43:25 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 30/01/2024 16:25, David Brown wrote:
    On 30/01/2024 16:49, Malcolm McLean wrote:

    Which is my experience for the sort of programming that I do.
    [stderr less used than stdout]
    Similarly there is no function "write`" that passes binary data to
    standard output by default.

    What would that gain?  One fewer parameter to fwrite()?

    Yes. printf() could easily have been omitted and fprintf() only
    provided.

    IIRC, printf() existed even before fprintf was invented and
    it was used by a whole lot of code when the C standardization
    efforts began.

    I suspect that most modern C libraries have printf effectively call fprintf(stdout) internally (or at least a common function to
    process the format string & varargs).
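
    Something along these lines, sketched here with a made-up name (the
    real one in any given library will differ), is all it takes, since the
    standard vfprintf() does the actual work:

    #include <stdarg.h>
    #include <stdio.h>

    /* Sketch: printf as a thin wrapper that forwards to stdout. */
    int my_printf(const char *fmt, ...)
    {
        va_list ap;
        va_start(ap, fmt);
        int n = vfprintf(stdout, fmt, ap);
        va_end(ap);
        return n;
    }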

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Tue Jan 30 18:22:56 2024
    On 30/01/2024 16:49, David Brown wrote:

    You just want to have a go at Windows don't you?

    I was using CRLF line-endings in 1970s, they weren't an invention of
    Windows, which didn't exist until the mid-80s and didn't become
    popular until the mid-90s.

    CRLF line endings were the invention of printers or teletype machines.
    It took time to move print heads from the end of one line to the
    beginning of the next, and separating the "carriage return" and "line
    feed" commands made timings easier.  It also let printer implementers
    handle the two operations independently - occasionally people would want
    to do one but not the other.

    The use of CRLF as a standard for line endings in files was, I believe,
    from CP/M - which came after Unix and Multics, which had standardised on
    LF line endings.  (Most OS's before that made things up as they chose, rather than being "standard", or used record-based files, punched cards, etc.)

    So CRLF precedes Windows quite significantly.

    (I have no idea why Macs picked CR - perhaps they just wanted to be different.)


    So, how did C deal with CRLF in all those non-Windows settings?


    The difference between "text" and "binary" streams in C is, in practice,
    up to the implementation.  That can be the implementation of the C
    library, or the OS functions (or DLLs/SOs) that the C library calls. The
    norm is that you use "\n" for line endings in the C code - what happens
    after that, for text streams, is beyond C.

    The reason C distinguishes between text and binary streams is that some
    OS's distinguish between them.


    Regardless of your digression, stdout is still an unformatted
    stream of bytes.   Any structure on that stream is imposed
    by the -consumer- of those bytes.

    Of course. But it still a bad idea to write actual output that you
    KNOW does not represent text, to a consumer that will expect text.


    That's just a specific example of "it's a bad idea for a program to
    behave in a way that a reasonable user would not expect".  Which is, of course, true - but not a big surprise.

    For example to a terminal window, which can happen if you forget to
    redirect it.

    If the program can reasonably be expected to generate binary output,
    then it is the user's fault if they do this accidentally.  Examples
    shown in this thread include cat and zcat - that's what these programs do.

    Sometimes people make mistakes, and try to "cat" (or "type") non-text files.  Mistakes happen.


    If you routinely write pure binary data to stdout, then users are going
    to see garbage a lot more often.

    I gave an example earlier when displaying a binary file with 'type' was better-behaved than with 'cat', since 'type' stops at the first 1A byte.

    I used this in my binary formats by adding 1A after the signature, so
    that if you attempted to type it out, it wouldn't go mad. Here's another example:

    c:\sc>type tree.scd
    SCD

    (.scd is a binary file containing CAD drawing data.)

    If I again do that with 'cat' under WSL, it goes even crazier. It
    starts to try and interpret the output as commands (with what
    program, I don't know), with lots of bell sounds, and I can't get back
    to the WSL prompt.

    It's just very, very sloppy.

    Using 'type' on a binary file without a 1A deliberately added as a
    barrier, can have similar problems, but it will at least stop when 1A is
    seen. For example:

    c:\jpeg>type card2.jpg
    ����JFIF@@HH���ExifM@n@@@@

    I get one line of output. What does 'cat' do? I haven't yet tried it.
    Now that I do, then yep, a bunch of crap disappearing off the top of the screen.

    This is the last part of it:

    -----------------------------
    root@XXX:/mnt/c/jpeg# cat card2.jpg
    .... ��Ny�6��G�W������o���N#qi=�>�˧5+_�,�l������*�����}����3Tg�sd����J�&��{��^7��x)j��\�`�BAFxk6k{K���_�>

    >�͚͞�O��
    ��{J}bT��Nl��Y�ث���:����#�jƾ�`T�s $�b��Y�Lث��root@XXX:/mnt/c/jpeg# 61;6;7;22;23

    3;24;28;32;42c
    61: command not found
    6: command not found
    7: command not found
    22: command not found
    23: command not found
    24: command not found
    28: command not found
    32: command not found

    Command '42c' not found, did you mean:

    command 'g2c' from deb goo (0.155+ds-1)
    command 'f2c' from deb f2c (20160102-1)

    Try: apt install <deb name>

    root@XXX:/mnt/c/jpeg#

    ------------------------------------

    It stopped a third of the way down showing ... 24;28;32;42c (I didn't
    notice the prompt that was hidden in there), then I typed Enter, which generated that other garbage.

    This is really poor. I know you will not agree with that; you can't,
    as you would then be against the entire ethos of Unix, which is that
    everything is done by piping binary data between processes.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Tue Jan 30 18:44:20 2024
    bart <bc@freeuk.com> writes:
    On 30/01/2024 16:49, David Brown wrote:


    Sometimes people make mistakes, and try to "cat" (or "type") non-text
    files.  Mistakes happen.


    If you routinely write pure binary data to stdout, then users are going
    to see garbage a lot more often.

    Why do you believe that? Most users are capable of RTFM.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Tue Jan 30 20:23:06 2024
    On 30/01/2024 19:29, Malcolm McLean wrote:
    On 29/01/2024 21:00, Keith Thompson wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 29/01/2024 16:18, David Brown wrote:
    On 28/01/2024 20:49, Malcolm McLean wrote:
    On 28/01/2024 18:24, David Brown wrote:
    I'd expect that most general purpose programs written by Norwegians
    use an English interface, even if it isn't really expected that the
    program will find an audience beyond some users in Norway. Except
    of course for programs which in some way are about Norway.
    Why?

    Generally programmers are educated people and educated people use
    English for serious purposes. Not always of course and Norway might be
    an exception. But I'd expect that in a Norwegian university, for
    example, it would be forbidden to document a program in Norwegian or
    to use non-English words for identifiers. And probably the same in a
    large Norwegian company. I might be wrong about that and I have never
    visited Norway or worked for a Norwegian employer (and obviously I
    couldn't do so unless the policy I expect was followed).

    You assert that "educated people use English for serious purposes".
    I don't have the experience to refute that claim, but I suspect
    it's arrogant nonsense.  I could be wrong, of course.

    Everything which is at all intellectually serious is these days written
    in English. It's the new Latin. It's the language all educated people
    use to communicate with each other when discussing scientific,
    philosophical, or scholarly matters. And also technical matters to a
    large extent.
    There are a few exceptions but very few. I remember a discussion about whether you could get away with organising a scientific conference in
    French, in France, and the conclusion was that you could not. Even in
    France. However the French are very reluctant to concede, which is why
    the discussion took place at all.

    If a large Norwegian company allows programmers to document software in Norwegian, then it cannot employ non-Norwegian programmers to work on
    it. So I would imagine that this would be forbidden. But I've never
    actually worked for a Norwegian company and I don't actually know. David Brown, to be fair, does work for a Norwegian company so he might know
    better. But he asks "why?" and I gave the reason.


    So your reasoning is "Scholars use English. Scholars are serious.
    Therefore, anything serious is scholarly and consequently in English".

    You do know that Monty Python's Holy Grail is entertainment, not an
    education in logic?

    When you are in such a deep hole, the wise thing would be to stop digging.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Scott Lurndal on Tue Jan 30 20:29:17 2024
    On 30/01/2024 17:55, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 30/01/2024 16:49, Malcolm McLean wrote:

    The fact that there is no
    similar function for standard error suggests that wanting to pass
    formatted text to error is a less common requirement.

    stderr is a newer invention than stdout and stdin.

    c'est what?


    According to Wikipedia (it's not infallible, but it knows better than me
    here) :

    """
    Standard error was added to Unix in the 1970s after several wasted phototypesetting runs ended with error messages being typeset instead of displayed on the user's terminal.[4]
    """

    <https://web.archive.org/web/20200925010614/https://minnie.tuhs.org/pipermail/tuhs/2013-December/006113.html>

    """
    One of the most amusing and unexpected consequences of phototypesetting
    was the Unix standard error file (!). After phototypesetting, you had to
    take a long wide strip of paper and feed it carefully into a smelly, icky machine which eventually (several minutes later) spat out the paper with
    the printing visible.

    One afternoon several of us had the same experience -- typesetting
    something, feeding the paper through the developer, only to find a single, beautifully typeset line: "cannot open file foobar" The grumbles were
    loud enough and in the presence of the right people, and a couple of days
    later the standard error file was born...
    """



    stdout and stdin were apparently available in FORTRAN in the 1950's.


    But you are more likely to know things first hand, or at least second
    hand, so if you can correct both me and Wikipedia, I'd be happy with that.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Harnden@21:1/5 to Malcolm McLean on Tue Jan 30 19:39:24 2024
    On 30/01/2024 16:33, Malcolm McLean wrote:

    C plays fast and loose with the char type. But you can't pass embedded
    nuls. These are so common in binary data that in practice you can't
    use %s for binary data at all.

    Nobody uses printf to output binary data. fwrite(3) would be common, as
    would write(2).

    Maybe you could use printf("%c%c%c" ... but it'd be beyond tedious.
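
    A small made-up buffer shows the difference: "%s" gives up at the first
    embedded NUL, while fwrite() writes everything.

    #include <stdio.h>

    int main(void)
    {
        /* Invented data: an embedded NUL and a control byte */
        static const unsigned char data[] = { 'A', 0x00, 'B', 0x1A, 'C' };

        printf("%s", (const char *)data);        /* writes only 'A' */
        fwrite(data, 1, sizeof data, stdout);    /* writes all five bytes */
        return 0;
    }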

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Harnden@21:1/5 to Malcolm McLean on Tue Jan 30 19:45:49 2024
    On 30/01/2024 18:39, Malcolm McLean wrote:
    On 30/01/2024 16:49, David Brown wrote:

    If the program can reasonably be expected to generate binary output,
    then it is the user's fault if they do this accidentally.  Examples
    shown in this thread include cat and zcat - that's what these programs
    do.

    Sometimes people make mistakes, and try to "cat" (or "type") non-text
    files.  Mistakes happen.


    Elsethread [David Brown]
    I wonder if there is any *nix program older or simpler than "cat" - a program that simply passes its input files or the stdin to stdout.

    But, nobody expects piping a binary file to a tty to "work".

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Tue Jan 30 20:46:05 2024
    On 30/01/2024 19:22, bart wrote:
    On 30/01/2024 16:49, David Brown wrote:

    You just want to have a go at Windows don't you?

    I was using CRLF line-endings in 1970s, they weren't an invention of
    Windows, which didn't exist until the mid-80s and didn't become
    popular until the mid-90s.

    CRLF line endings were the invention of printers or teletype machines.
    It took time to move print heads from the end of one line to the
    beginning of the next, and separating the "carriage return" and "line
    feed" commands made timings easier.  It also let printer implementers
    handle the two operations independently - occasionally people would
    want to do one but not the other.

    The use of CRLF as a standard for line endings in files was, I
    believe, from CP/M - which came after Unix and Multics, which had
    standardised on LF line endings.  (Most OS's before that made things
    up as they chose, rather than being "standard", or used record-based
    files, punched cards, etc.)

    So CRLF precedes Windows quite significantly.

    (I have no idea why Macs picked CR - perhaps they just wanted to be
    different.)


    So, how did C deal with CRLF in all those non-Windows settings?


    The difference between "text" and "binary" streams in C is, in
    practice, up to the implementation.  That can be the implementation of
    the C library, or the OS functions (or DLLs/SOs) that the C library
    calls. The norm is that you use "\n" for line endings in the C code -
    what happens after that, for text streams, is beyond C.

    The reason C distinguishes between text and binary streams is that
    some OS's distinguish between them.


    Regardless of your digression, stdout is still an unformatted
    stream of bytes.   Any structure on that stream is imposed
    by the -consumer- of those bytes.

    Of course. But it still a bad idea to write actual output that you
    KNOW does not represent text, to a consumer that will expect text.


    That's just a specific example of "it's a bad idea for a program to
    behave in a way that a reasonable user would not expect".  Which is,
    of course, true - but not a big surprise.

    For example to a terminal window, which can happen if you forget to
    redirect it.

    If the program can reasonably be expected to generate binary output,
    then it is the user's fault if they do this accidentally.  Examples
    shown in this thread include cat and zcat - that's what these programs
    do.

    Sometimes people make mistakes, and try to "cat" (or "type") non-text
    files.  Mistakes happen.


    If you routinely write pure binary data to stdout, then users are going
    to see garbage a lot more often.

    Or they will learn not to try to view the output of the programs.

    I can see there being a problem if a program looks like it should give
    text output, or sometimes gives text output, but it sometimes gives
    binary output.

    If you have a program like that, then it probably makes sense to have a
    flag to say "output the data to stdout" and the default being writing to
    a file.

    So don't misunderstand me here - I'm all in favour of making programs user-friendly, and not filling people's terminals with junk
    unexpectedly. It's just that sometimes binary output on stdout is
    expected and convenient.


    I gave an example earlier when displaying a binary file with 'type' was better-behaved than with 'cat', since 'type' stops at the first 1A byte.


    That would make "type" better behaved if you only use it for looking at
    text files. "cat" is used for many different things, and stopping at
    0x1a (or 0x00) would be bad behaviour for that program. They are not
    the same program, and do different things (even if "cat" can also do
    everything that "type" can).

    I used this in my binary formats by adding 1A after the signature, so
    that if you attempted to type it out, it wouldn't go mad. Here's another example:

       c:\sc>type tree.scd
       SCD

    (.scd is a binary file containing CAD drawing data.)


    If that suits your needs, fine. I haven't used "type" for decades. (In
    almost all cases, "more" is better. And if I have normal utilities,
    "less" is better still.)

    If I again do that with 'cat' under WSL, it goes even crazier. It
    starts to try and interpret the output as commands (with what
    program, I don't know), with lots of bell sounds, and I can't get back
    to the WSL prompt.

    It's just very, very sloppy.

    No, it is very, very correct. The program is doing what it is intended
    to do. You just think it should have been designed to do something
    else. A "cat" program that stopped after 0x1a would be completely broken.

    It's fine that you prefer "type" to "cat". I wouldn't use either of
    them - I'd use "less" for viewing a file. (You'd have similar issues
    with unhelpful output if you tried to use "less" on a binary file, but
    only one screenful.) But I'd use "cat" for concatenating files, or for sourcing a file to pipe into something else. I realise joining multiple commands with pipes is not common in Windows - that's fine too. You use
    what suits you best. But it makes no sense to blame one program for not functioning like a very different program - especially when the
    difference is a misfeature, as often as not.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Tue Jan 30 20:24:49 2024
    On 30/01/2024 19:29, David Brown wrote:
    On 30/01/2024 17:55, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 30/01/2024 16:49, Malcolm McLean wrote:

    The fact that there is no
    similar function for standard error suggests that wanting to pass
    formatted text to error is a less common requirement.

    stderr is a newer invention than stdout and stdin.

    c'est what?


    According to Wikipedia (it's not infallible, but it knows better than me here) :

    """
    Standard error was added to Unix in the 1970s after several wasted phototypesetting runs ended with error messages being typeset instead of displayed on the user's terminal.[4]
    """

    <https://web.archive.org/web/20200925010614/https://minnie.tuhs.org/pipermail/tuhs/2013-December/006113.html>

    """
    One of the most amusing and unexpected consequences of phototypesetting
    was the Unix standard error file (!).   After phototypesetting, you had to take a long wide strip of paper and feed it carefully into a smelly, icky machine which eventually (several minutes later) spat out the paper with
    the printing visible.

    One afternoon several of us had the same experience -- typesetting
    something, feeding the paper through the developer, only to find a single, beautifully typeset line: "cannot open file foobar"   The grumbles were loud enough and in the presence of the right people, and a couple of days later the standard error file was born...
    """


    That explains a lot. It is ludicrous to just blindly send data to
    stdout, in cases like this. Send it to a file, where it can be neatly self-contained, and then send that to the device. Or at least use a device
    handle like 'stdprinter'.

    Just something that will segregate output that is intended for the
    terminal, from data output.

    Clearly every process in Unix was some sort of batch program.

    With an interactive application, you have an on-going dialog with the
    user. But if EVERYTHING is sent to stdout, how on earth do you switch
    between output for the user, and output for whatever needs to have
    been setup to hang onto stdout?

    Don't tell me that you need to use stderr for the interactive dialog and
    stdout for data!

    It just got things wrong from the start. It doesn't look like much has
    changed.


    stdout and stdin were apparently available in FORTRAN in the 1950's.

    I spent a year writing FORTRAN on mainframes and minicomputers. I don't remember anything like that.

    And my coding involved outputting to a range of imaging peripherals
    including vector graphics terminals (Tektronix).

    Meanwhile other software of mine since then could output to an endless
    variety of peripherals, often several kinds within the same program.

    I don't recall any nonsense where you just wrote necessary data to the equivalent of STDOUT, hoping something at the other end would do
    something sensible with it.

    For a start, that would be the only thing the program could do: dump
    stuff to STDOUT then terminate.

    But hey, if Unix does it then it must be right!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Scott Lurndal on Tue Jan 30 20:54:32 2024
    On 2024-01-30, Scott Lurndal <scott@slp53.sl.home> wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 30/01/2024 16:25, David Brown wrote:
    On 30/01/2024 16:49, Malcolm McLean wrote:

    Which is my experience for the sort of programming that I do.
    [stderr less used than stdout]
    Similarly there is no function "write`" that passes binary data to
    standard output by default.

    What would that gain?  One fewer parameters to fwrite() ?

    Yes. printf() could easily have been omitted and fprintf() only
    provided.

    IIRC, printf() existed even before fprintf was invented and
    it was used by a whole lot of code when the C standardization
    efforts began.

    Well, actually, printf and fprintf were conflated into one
    function in early Unix!

    If the first argument of printf was a number instead of a
    format string, then it indicated the device to send output
    to (it was an index in the table of FILE structures or something).
    In that case, the second argument was the format string:

    printf("hello, %s\n", "world!"); /* print on standard output */

    printf(1, "hello, %s\n", "world!"); /* likewise */

    printf(2, "hello, %s\n", "world!"); /* likewise */

    Or something like that; I may have some detail wrong.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to David Brown on Tue Jan 30 16:16:32 2024
    On 1/30/24 11:49, David Brown wrote:
    ...
    If the program can reasonably be expected to generate binary output,
    then it is the user's fault if they do this accidentally. Examples
    shown in this thread include cat and zcat - that's what these programs do.

    ? There's no problem using cat to concatenate binary files. I've used
    'split' to split binary files into smaller pieces, and then used 'cat'
    to recombine them, and it worked fine. I don't remember why, but I had
    to transfer the files from one place to another by a method that imposed
    an upper limit on the size of individual files.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Tue Jan 30 21:42:29 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 30/01/2024 17:55, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 30/01/2024 16:49, Malcolm McLean wrote:

    The fact that there is no
    similar function for standard error suggests that wanting to pass
    formatted text to error is a less common requirement.

    stderr is a newer invention than stdout and stdin.

    c'est what?


    According to Wikipedia (it's not infallible, but it knows better than me >here) :

    """
    Standard error was added to Unix in the 1970s after several wasted >phototypesetting runs ended with error messages being typeset instead of >displayed on the user's terminal.[4]


    Ok. I had incorrectly assumed you were referring to the late 80's
    when C standardization was underway.

    It was certainly there by unix v6 which was the first version I
    used in the late 70's.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Wed Jan 31 06:00:30 2024
    On Tue, 30 Jan 2024 17:49:57 +0100, David Brown wrote:

    The use of CRLF as a standard for line endings in files was, I believe,
    from CP/M ...

    Which I think copied it from DEC minicomputer systems.

    Fun fact: on some of those DEC systems (which I used when they were still
    being made), you could end a line with CR-LF, or LF-CR-NUL.

    What was the NUL for? Padding. Why did it need padding? (This was before
    CRT terminals.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Wed Jan 31 06:02:45 2024
    On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:

    VMS (now OpenVMS) was also a significant system at the time, and it had
    some rather complex file formats ...

    It relied on extra file metadata called “record attributes” in order to make sense of the file format. It was quite common to transfer files from
    other systems, and have them not be readable until you had set appropriate record attributes on them. Picky, picky, I know.

    Apparently Linus Torvalds used VMS for a while, and hated it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Wed Jan 31 06:04:38 2024
    On Tue, 30 Jan 2024 21:04:56 +0000, Malcolm McLean wrote:

    Linus Torvald's native language is Finnish, for example.

    No, it would be Swedish. He’s an ethnic Swede, from Finland.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Wed Jan 31 06:03:37 2024
    On Tue, 30 Jan 2024 15:21:01 +0000, Malcolm McLean wrote:

    Most systems run Windows where the model of piping from standard output
    to standard input of the next program is much less used than in Unix,
    this is true. That sometimes generates a feeling of superiority amongst
    those who use the less common, often more expensive Unix systems. It's
    very silly, but that's how people think.

    Also we can do select/poll on pipes on *nix systems, you can’t on Windows.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Wed Jan 31 06:06:14 2024
    On Tue, 30 Jan 2024 09:13:07 +0000, Malcolm McLean wrote:

    However standard output is designed for text and not binary output.
    Whilst there is a "printf()" which operates on standard output by
    default, there are no functions which write binary data to standard
    output by default, for example.

    putchar(3).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Wed Jan 31 06:10:56 2024
    On Tue, 30 Jan 2024 20:29:17 +0100, David Brown wrote:

    stdout and stdin were apparently available in FORTRAN in the 1950's.

    There was a convention that channel 5 was the card reader, and 6 was the
    line printer.

    When interactive systems came along later, this became channel 5 for
    keyboard input, and 6 for terminal output.

    What happened to channels 1, 2, 3 & 4? Don’t know.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Richard Harnden on Wed Jan 31 06:12:07 2024
    On Tue, 30 Jan 2024 19:39:24 +0000, Richard Harnden wrote:

    Nobody uses printf to output binary data.

    Do terminal-control escape sequences count?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Wed Jan 31 06:07:47 2024
    On Tue, 30 Jan 2024 17:25:31 +0100, David Brown wrote:

    Mixing binary data with formatted text data is very unlikely to be
    useful.

    PDF does exactly that. To the point where the spec suggests putting some
    random unprintable bytes up front, to distract format sniffers from
    thinking they’re looking at a text file.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Lawrence D'Oliveiro on Wed Jan 31 01:39:28 2024
    On 1/31/24 01:12, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 19:39:24 +0000, Richard Harnden wrote:

    Nobody uses printf to output binary data.

    Do terminal-control escape sequences count?

    The standard says "Data read in from a text stream will necessarily
    compare equal to the data that were earlier written out to that stream
    only if: the data consist only of printing characters and the control characters horizontal tab and new-line; no new-line character is
    immediately preceded by space characters; and the last character is a
    new-line character." (7.23.2p2).

    I would say that any output which would invalidate that guarantee
    qualifies as binary data, since you would need to output it in binary
    mode in order to guarantee being able to read it back. That would
    include the other control characters (vertical tab and form feed,
    backspace, and carriage return).

    The letters and digits are guaranteed to be printing characters. Note that:

    "A letter is an uppercase letter or a lowercase letter as defined above;
    in this document the term does not include other characters that are
    letters in other alphabets."

    The other printing characters are locale-specific, so you'll have to
    test them with isprint(). In particular, many terminal control escape
    sequences start with the ESC character, which doesn't qualify above.
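
    A rough sketch of such a test, following the conditions quoted above
    (printing characters per isprint(), plus horizontal tab and new-line,
    no space immediately before a new-line, and a final new-line); anything
    that fails it would need a binary stream to round-trip reliably:

    #include <ctype.h>
    #include <stddef.h>

    static int safe_for_text_stream(const unsigned char *p, size_t n)
    {
        if (n == 0 || p[n - 1] != '\n')
            return 0;                       /* must end with a new-line */
        for (size_t i = 0; i < n; i++) {
            if (p[i] == '\n') {
                if (i > 0 && p[i - 1] == ' ')
                    return 0;               /* space just before a new-line */
            } else if (p[i] != '\t' && !isprint(p[i])) {
                return 0;                   /* e.g. ESC in a control sequence */
            }
        }
        return 1;
    }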

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Wed Jan 31 08:59:22 2024
    On 31/01/2024 07:04, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 21:04:56 +0000, Malcolm McLean wrote:

    Linus Torvald's native language is Finnish, for example.

    No, it would be Swedish. He’s an ethnic Swede, from Finland.

    He is Finnish, but has Swedish as his mother tongue (like about 5% of
    Finns). Speaking Swedish as your main language does not make you
    ethnically Swedish. As a university-educated Finn, brought up in
    Helsinki, he will also speak Finnish quite fluently.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Malcolm McLean on Tue Jan 30 23:18:21 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed through systems designed to handle
    human-readable text. For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting
    for next data byte". Obviously this will cause difficulties if the data
    is binary.
    Also many binary formats can't easily be extended, so you can pass one
    image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to
    extend.

    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    [...]

    Simple example (disclaimer: not tested):

    ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
    (mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)

    Of the five main programs in this command, four are using
    standard out to send binary data:

    tar -cf - .
    gzip -c
    ssh foo [...]
    gunzip -c

    The tar -xf - at the end reads binary data on standard in
    but doesn't output any (or anything else for that matter).

    It is FAR more cumbersome to accomplish what this command
    is doing without sending binary data through standard out.
    Anyone who doesn't understand this doesn't understand Unix.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Tim Rentsch on Wed Jan 31 12:43:32 2024
    On Tue, 30 Jan 2024 23:18:21 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it
    might have unusual effects if passed through systems designed to
    handle human-readable text. For instance in some systems
    designed to receive ASCII text, there is no distinction between
    the nul byte and "waiting for next data byte". Obviously this
    will cause difficulties if the data is binary.
    Also many binary formats can't easily be extended, so you can
    pass one image and that's all. While it is possible to devise a
    text format which is similar, in practice text formats usually
    have enough redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and
    harder to extend.

    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    [...]

    Simple example (disclaimer: not tested):

    ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
    (mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)

    Of the five main programs in this command, four are using
    standard out to send binary data:

    tar -cf - .
    gzip -c
    ssh foo [...]
    gunzip -c

    The tar -xf - at the end reads binary data on standard in
    but doesn't output any (or anything else for that matter).

    It is FAR more cumbersome to accomplish what this command
    is doing without sending binary data through standard out.

    If I am not mistaken, tar, gzip and gunzip do not write binary data to
    standard output by default. They should be specifically told to do so.
    For ssh I don't know. Anyway, ssh is not a "normal" program so it's
    not surprising when textuality of ssh output is the same as textuality
    of the command it carries.

    Anyone who doesn't understand this doesn't understand Unix.

    Frankly, Unix redirection racket looks like something hacked together
    rather than designed as result of the solid thinking process.
    As long as there were only standard input and output it was sort of
    logical. But when they figured out that it is insufficient, they had
    chosen a quick hack instead of constructing a solution that wouldn't
    offend engineering senses of any non-preconditioned observer.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Wed Jan 31 12:53:04 2024
    On Wed, 31 Jan 2024 08:59:22 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    On 31/01/2024 07:04, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 21:04:56 +0000, Malcolm McLean wrote:

    Linus Torvald's native language is Finnish, for example.

    No, it would be Swedish. He’s an ethnic Swede, from Finland.

    He is Finnish, but has Swedish as his mother tongue (like about 5% of Finns). Speaking Swedish as your main language does not make you
    ethnically Swedish.

    Linus has Swedish as his mother tongue.
    Linus has a Swedish family name. Or at least Scandinavian; to me it
    sounds more Danish than Swedish, but I am not an expert. It certainly
    does not sound Finnish.
    When Linus was younger, he used to like to tell stereotypical jokes
    about Finns.

    When it quacks like a duck...

    As a university-educated Finn, brought up in
    Helsinki, he will also speak Finnish quite fluently.


    That's true.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to James Kuyper on Wed Jan 31 12:35:32 2024
    On 30.01.2024 22:16, James Kuyper wrote:
    On 1/30/24 11:49, David Brown wrote:
    ...
    If the program can reasonably be expected to generate binary output,
    then it is the user's fault if they do this accidentally. Examples
    shown in this thread include cat and zcat - that's what these programs do.

    ? There's no problem using cat to concatenate binary files. I've used
    'split' to split binary files into smaller pieces, and then used 'cat'
    to recombine them, and it worked fine. I don't remember why, but I had
    to transfer the files from one place to another by a method that imposed
    an upper limit on the size of individual files.

    Yes. Been there. Faint memories only. But some instances were:
    * allowed attachment sizes for email exchange
    * posting limits in Usenet (binary files)
    * unreliable FTP(?) download processes[*]
    * (and something else I can't wrap my mind around at the moment)

    Janis

    [*] Worth an anecdote...
    Downloading standard documents from the CCITT (now ITU-T).
    They decided to provide the documents in [non-standard] "MS Word"
    'doc' format. Short standard texts in blown up doc documents could
    not reliably be transmitted due to connection drops; splitting that
    binary doc could have helped (as would any more sensible standard format).
    But since the original data was a bulky monolith, to no avail.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Wed Jan 31 12:43:04 2024
    On 31.01.2024 07:02, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:

    VMS (now OpenVMS) was also a significant system at the time, and it had
    some rather complex file formats ...

    [...]

    Apparently Linus Torvalds used VMS for a while, and hated it.

    I don't understand the intention of this comment.
    VMS and Torvalds are completely different eras.
    And where is the relation?

    (Or just meant as anecdotal trivia?)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Janis Papanagnou on Wed Jan 31 13:58:02 2024
    On Wed, 31 Jan 2024 12:43:04 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 31.01.2024 07:02, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:

    VMS (now OpenVMS) was also a significant system at the time, and
    it had some rather complex file formats ...

    [...]

    Apparently Linus Torvalds used VMS for a while, and hated it.

    I don't understand the intention of this comment.
    VMS and Torvalds are completely different eras.
    And where is the relation?

    (Or just meant as anecdotal trivia?)

    Janis


    Linus is older than you probably realize. He entered the University of
    Helsinki in 1988. Back then VMS was only slightly behind its peak of popularity. By value, likely still bigger than all Unixen combined.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Wed Jan 31 12:22:39 2024
    On 30.01.2024 20:46, David Brown wrote:
    On 30/01/2024 19:22, bart wrote:
    [...]

    If you have a program like that, then it probably makes sense to have a
    flag to say "output the data to stdout" and the default being writing to
    a file.

    Did you here mean to say "output the data to the terminal"? (I noticed
    that a lot of the posts here have a misconception about what 'stdout'
    is; they seem to use it synonymously to "terminal" or "screen/display".
    But you are not guaranteed that stdout will produce screen output; it
    depends on the environment. Being more accurate with the distinction
    might help prevent misconceptions if replying to these people.)

    More to your point you wrote, I don't think this would be a good design
    as you've written it. A default would imply the necessity of some fixed
    name (or naming schema) - think of the disputable "a.out" default. What
    would be the default name (or naming scheme) for a tool's data output?
    (I could make a maybe even sensible choice for _one_ tool, like using
    the base name of a file name argument for cc's output file entity in
    the special case of compiling a "main module". Similar the schema '.o'
    when used with option -c . But generally?)

    Tools that output their data to the [default] stdout channel have many
    options; they may redirect that output, or they may pipe it into other programs, or they may also just suppress the data, unless they want to
    see it on their terminal display. And quite some tools surely (where
    it's sensible) provide options to define an output file name in addition
    to the default stdout.

    And there are even some conventions for tools that require a file name
    as (also output) argument, using '-' to let it go to standard output,
    or /dev/tty to let it go to the terminal console. (It depends on the
    OS, of course.) This all said from the Unix perspective. Users with
    experience only from other (often more primitive) OSes might have
    some problems recognizing such design principles.
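
    A small C sketch (untested, illustrative only) of the '-' convention just
    mentioned: treat "-" as "write to standard output", otherwise open the named
    file.

        #include <stdio.h>
        #include <string.h>

        /* Return stdout for "-" (or a missing name), otherwise open the file. */
        static FILE *open_output(const char *name)
        {
            if (name == NULL || strcmp(name, "-") == 0)
                return stdout;
            return fopen(name, "wb");
        }

        int main(int argc, char **argv)
        {
            FILE *out = open_output(argc > 1 ? argv[1] : "-");
            if (out == NULL) {
                perror("fopen");
                return 1;
            }
            fputs("some output\n", out);
            if (out != stdout)
                fclose(out);
            return 0;
        }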

    For sure we have tools that inherently operate with binary data and of
    course use the same I/O-channels for binary that a text processing tool
    uses for text interchange.

    There's no compelling reason why there should be _additional_ channels
    for "binary" data exchange; and I put "binary" in quotes for reasons we
    already spoke about.

    Every interface has a defined input and a defined output mechanism. Use
    them as they are specified. If you're calling, say (to keep it simple),
    'cat some_image.jpg' what do people expect, a textual description of
    the image? (Which would certainly be a nice feature in our AI hype era.
    But then we'd probably rather use 'describe_image some_image.jpg' or
    any redirected variant thereof. And we would still want binary data
    exchanged, to use it as (e.g.) 'AI_image_generator | describe_image'.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Michael S on Wed Jan 31 13:35:47 2024
    On 31.01.2024 12:58, Michael S wrote:
    On Wed, 31 Jan 2024 12:43:04 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 31.01.2024 07:02, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:

    VMS (now OpenVMS) was also a significant system at the time, and
    it had some rather complex file formats ...

    [...]

    Apparently Linus Torvalds used VMS for a while, and hated it.

    I don't understand the intention of this comment.
    VMS and Torvalds are completely different eras.
    And where is the relation?


    Linus is older than you probably realize.

    Why do you think that I'd be thinking that?

    I know that he's quite some years younger than I am. So what?

    He entered the University of
    Helsinki in 1988. Back then VMS was only slightly behind its peak of popularity.

    What? - I'm not sure where you're coming from.

    I associate DEC's VMS with the old DEC VAX-11 system, both
    from around the mid-1970s. I programmed on a DEC
    VAX with VMS obviously before Linus Torvalds started his
    studies. And that was at a time when the DEC VAX and VMS
    were replaced at our sites by Unix systems.

    By value, likely still bigger than all Unixen combined.

    Not sure what (to me strange sounding) ideas you have here.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Wed Jan 31 13:49:48 2024
    On 31.01.2024 13:15, Malcolm McLean wrote:
    On 31/01/2024 10:43, Michael S wrote:

    Frankly, Unix redirection racket looks like something hacked together
    rather than designed as result of the solid thinking process.
    As long as there were only standard input and output it was sort of
    logical. But when they figured out that it is insufficient, they had
    chosen a quick hack instead of constructing a solution that wouldn't
    offend engineering senses of any non-preconditioned observer.

    (What bullshit.)


    It was designed for very memory constrained systems which handled text
    on a line by line basis. So one line of a long file would be processed
    and passed down the pipeline, and you wouldn't need temporary disk files
    or large amounts of memory. I'm sure it worked quite well for that.

    You are right about the intention of pipes. (Though not only for
    "memory constrained systems", but generally to avoid unnecessary and
    costly disk I/O.)

    But it's not line oriented. (That would be non-performant.) In Unix
    systems there's a distinction between line-oriented and buffered I/O.
    You can configure I/O with what you want if you program your tools.
    Typically tools when in non-interactive mode use efficient buffered
    mode. And there are also methods to force buffered I/O to become line
    oriented (cf. 'pty') for specific cases where you want it.
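
    A small C sketch (untested; assumes POSIX isatty()) of choosing between
    line-buffered and fully buffered stdout, much as stdio itself does by default:

        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            /* Line-buffered when writing to a terminal, block-buffered otherwise
               (e.g. when stdout is a pipe or a file). setvbuf must be called
               before the first output to the stream. */
            if (isatty(STDOUT_FILENO))
                setvbuf(stdout, NULL, _IOLBF, BUFSIZ);
            else
                setvbuf(stdout, NULL, _IOFBF, BUFSIZ);

            for (int i = 0; i < 3; i++)
                printf("line %d\n", i);
            return 0;
        }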

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Wed Jan 31 14:02:51 2024
    On 30.01.2024 22:04, Malcolm McLean wrote:
    Linus Torvald's native language is Finnish, for example. But git was
    released in English. There might be Finnish language bindings for it
    now, but I'm pretty sure not in the original version. Similarly Bjarne Strousup is Swedish, but C++ uses keywords like "class" and "friend",
    not the Swedish terms.

    Okay something had already been said about Linus T.'s ethnicity,
    so let me also point out that Bjarne Stroustrup (sp!) is Danish.
    (Just for the record.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Janis Papanagnou on Wed Jan 31 15:10:05 2024
    On Wed, 31 Jan 2024 13:35:47 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 31.01.2024 12:58, Michael S wrote:
    On Wed, 31 Jan 2024 12:43:04 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 31.01.2024 07:02, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:

    VMS (now OpenVMS) was also a significant system at the time, and
    it had some rather complex file formats ...

    [...]

    Apparently Linus Torvalds used VMS for a while, and hated it.

    I don't understand the intention of this comment.
    VMS and Torvalds are completely different eras.
    And where is the relation?


    Linus is older than you probably realize.

    Why do you think that I'd be thinking that?

    I know that he's quite some years younger than I am. So what?

    He entered the University of
    Helsinki in 1988. Back then VMS was only slightly behind its peak of popularity.

    What? - I'm not sure where you're coming from.

    I associate DEC's VMS with the old DEC VAX-11 system, both
    from around the mid of the 1970's.

    Released in 1977.
    Reached the peak of popularity in the mid-1980s, when DEC decided to use
    VAX not just as mini/super-mini, but also as a competitor to mainframes, effectively killing their earlier mainframe line (PDP-6/10/20).
    In the late 1980s and early 1990s it was used as a desktop/workstation OS as
    well. It was never very popular in that role, but the reason for moderate popularity was the high price of software and relative weakness of hardware (MicroVAX) rather than technical deficiencies of the Operating System.
    Ported to Alpha in early 1990s.
    Ported to Itanium in early 2000s.
    Ported to x86-64 starting from the mid-2010s. Since, unlike previous ports,
    this one was done by a small company, it took plenty of time. However by
    now VMS on x86-64 is already in production stage.

    I programmed on a DEC's
    VAX with VMS obviously before Linus Torvalds started his
    studies. And that was at a time when the DEC VAX and VMS
    were replaced at our sites by Unix systems.


    There were places like that.
    There were far more places where VMS was replaced much later and not necessarily by Unix.
    There are places where VMS is still running.
    Most likely, VMS will still be used in production after the last
    "vendor's Unix", of which I'd bet on AIX, is replaced by ether Linux of
    BSD.

    By value, likely still bigger than all Unixen combined.

    Not sure what (to me strange sounding) ideas you have here.

    Janis


    I can say the same.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Wed Jan 31 14:05:45 2024
    On 30/01/2024 22:04, Malcolm McLean wrote:
    On 30/01/2024 20:06, Keith Thompson wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 29/01/2024 21:00, Keith Thompson wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 29/01/2024 16:18, David Brown wrote:
    On 28/01/2024 20:49, Malcolm McLean wrote:
    On 28/01/2024 18:24, David Brown wrote:
    I'd expect that most general purpose programs written by Norwegians use an English interface, even if it isn't really expected that the program will find an audience beyond some users in Norway. Except of course for programs which in some way are about Norway.
    Why?

    Generally programmers are educated people and educated people use
    English for serious purposes. Not always of course and Norway might be an exception. But I'd expect that in a Norwegian university, for
    example, it would be forbidden to document a program in Norwegian or to use non-English words for identifiers. And probably the same in a large Norwegian company. I might be wrong about that and I have never visited Norway or worked for a Norwegian employer (and obviously I
    couldn't do so unless the policy I expect was followed).
    You assert that "educated people use English for serious purposes".
    I don't have the experience to refute that claim, but I suspect
    it's arrogant nonsense.  I could be wrong, of course.

    Everything which is at all intellectually serious is these days
    written in English. It's the new Latin. It's the language all educated
    people use to communicate with each other when discussing scientific,
    philosophical, or scholarly matters. And also technical matters to a
    large extent.

    Even if that's true, your assertion was about user interfaces.

    Are you under the impression that mobile phones show their messages only
    in English because all their users are scholars?

    Things tend to trickle down.
    Teenagers from non-English speaking countries hum along to pop songs in English.

    And you think that makes them fluent, or as happy communicating in
    English as in their own language(s)?

    It is certainly true that in many non-English speaking countries, people
    get quite familiar with English, and they do so from a younger age than
    in previous generations. It is also true that in many non-English
    speaking countries, people /don't/ learn much English unless they are
    involved in a profession that needs it.

    A guiding factor here is the size of the population. In European
    countries with maybe 20-30 million or more, everything important is
    translated or dubbed. You can watch American movies in your own
    language - they are dubbed. You can read translated books, or natively
    written books, even for specialised non-fiction books. You don't need
    to understand any English of significance to have a full professional,
    cultural and social life.

    But it is correct that English has become the main language for
    international communication, and is therefore critical for anything that involves cross-border communication, or where there are significant
    numbers of foreign workers. That includes academic work. Different
    parts of Europe previously used German or Russian for this, and that is
    still found amongst the older generations, but it is changing to English.


    I live in Norway, which is a country near the top of any list of English fluency. It is small (in population), so relies on a lot of imported
    culture and information. Films are subtitled, not dubbed (except for
    films for small kids), and only popular books are translated. We have a
    very high standard of education, including higher education, a great
    deal of foreign workers in some fields (in my company, perhaps a third
    of employees are not natives), and Norwegians travel extensively. There
    can be few non-English speaking countries where English is significantly
    more prevalent than Norway.


    Yet be in no doubt that the solid majority of Norwegians prefer
    Norwegian for most purposes.

    Yes, pretty much any programmer in Norway will be confident in English.
    So will most scholars. But the huge majority of people in Norway are
    not programmers or scholars.

    Most Norwegians will be reasonably good at understanding English -
    written or spoken, as most are exposed to it regularly (though not
    everyone watches American films or plays English-language computer
    games). A large proportion can speak English well - most of the younger generations, and those with higher education. But many people speak
    little English in their normal life, after their education, and are out
    of practice and under-confident.

    And despite their impressive English skills, most /prefer/ Norwegian,
    and are more comfortable in Norwegian. It does not matter how fluent
    they are in English - if you put some Norwegians together, they will
    talk in Norwegian, because that is the native language here.

    You use Norwegian unless there is good reason to do otherwise.


    So if I am writing a program aimed at Norwegians, the output will be in Norwegian, unless the balance of development effort versus use effort
    makes English the rational choice (since I write English faster and more accurately). I would normally write any user documentation in
    Norwegian. I would normally write internal or code documentation in
    English - but the issue was about the user-visible language, not the programmer's language. I would definitely use English for the code -
    but again, that was not the issue.

    If the program might reasonably be used by non-Norwegian speakers, then
    the choice is between English only, or bilingual.


    And whilst programmers aren't usually scholars, if they are C
    programmers they will use a programming language with keywords based on English.

    The actual programming is almost always in English. Some Norwegians
    will use Norwegian terms for identifiers, usually ASCII'fied. I do that sometimes when the code uses terms that I know in Norwegian but not in
    English. Some Norwegians like to comment in Norwegian. Since a lot of companies have at least some non-Norwegian programmers, and larger
    software development projects are often somewhat international, most
    coding is in English.

    But we were not talking about the language programmers use. We were
    talking about the language output by programs - how /users/ interact
    with the program.


    And you will likely get quite a bit of English coming through the mobile phone.

    Not unless you are talking to someone in English. Do you think mobile
    phone software in Norway is in English?

    Linus Torvald's native language is Finnish, for example.

    It is Swedish.

    But git was
    released in English.

    It was written long after Torvalds (that's how his name is spelt) had
    moved to the USA. It was written for a highly international community,
    all over the world - a community of programmers who were already working
    on an English-language project.

    It is thus utterly irrelevant as an example.

    There might be Finnish language bindings for it
    now, but I'm pretty sure not in the original version. Similarly Bjarne Strousup is Swedish, but C++ uses keywords like "class" and "friend",
    not the Swedish terms.

    Bjarne Stroustrup (again, you got the spelling wrong) is Danish. And he designed a programming language, for programmers, as an extension for an existing English-language programming language.

    Are you actively trying to make yourself seem more absurdly ignorant of
    the world outside your doorstep? If you are going to use famous
    programmers as examples, at least do them the courtesy of getting their
    names and languages right. It would have been even better if the
    examples had been even remotely relevant. (I do understand that you
    can't think of any actually relevant examples - you are not familiar
    with any programs written primarily for Norwegians. How you then think
    you are qualified to pontificate about them is an unfathomable question.)

    Now I think that would rub off on Norwegian programmers. It would be surprising if it did not.
    Are you aware of the existence of medical devices (my current $DAYJOB)
    that can be configured to display messages in any of a number of
    languages?

    Some software is internationalised. But it takes quite a lot of
    resources to translate software. With medical device software the
    software is likely so expensive to develop anyway because of all the
    safety critical portions that the cost is tolerable.

    And now you are trying to tell people who make medical software and
    medical devices about their jobs.

    Our software has
    purely English user interfaces. It was something we looked at, but it
    would have been expensive and made the code base harder to manage, and
    the users said that the benefit to them was marginal as they spoke
    enough English to understand a few simple GUI labels. I think our
    experience is more typical, but some people will no doubt make out that
    it is narrow and parochial.

    The fact is that there is no single answer. The variation in software
    and in requirements is enormous. Every single project is "narrow and parochial". The rest of us understand that - we know there is no "one
    size fits all". There is no "typical".


    [...]

    Correct, you don't actually know.  Why doesn't that prevent you from
    making assertions rather than asking questions, so that you can learn
    something from people who know more than you do?


    That is the really big question here.

    I'm a qualified scientist, amongst other things.

    I find that /very/ difficult to believe. I see very little rational
    thinking, collection of data, consideration of hypotheses, collaboration
    with colleagues, refinement of argument when contradictory data is
    presented, or anything else indicating scientific training or a
    scientific viewpoint.

    What I see is wild and unjustified extrapolation from very limited
    experiences, and a flat denial of information that contradicts your
    pre-formed ideas.

    In science, the things
    that you know are usually either quite basic and covered in the first
    degree, or they are not terribly interesting.

    You could not be more wrong. What an incredibly sad view you have.

    What matters is what you
    don't actually know, but believe to be the case, based on sound evidence
    and reasoning.

    Are you mixing up religion and science? /Belief/ is irrelevant in science.

    And I believe it to be the case that English is used very
    widely in Norway. And in fact, if David Brown, who is in a position to
    know, asserts this not to be the case, I'd put it down to his
    contentious nature and tend to dismiss it. Now of course I could have
    misled myself. But I doubt it.


    No one ever contended that English is used widely in Norway. That was
    never at issue, and is not closely related to the claims you made.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Wed Jan 31 14:45:49 2024
    On 31/01/2024 07:07, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 17:25:31 +0100, David Brown wrote:

    Mixing binary data with formatted text data is very unlikely to be
    useful.

    PDF does exactly that. To the point where the spec suggests putting some random unprintable bytes up front, to distract format sniffers from
    thinking they’re looking at a text file.

    PDF files start with the "magic" indicator "%PDF", which is enough for
    many programs to identify them correctly. And they are usually
    compressed so that the content text is not directly readable or
    identifiable as strings. If they are not compressed, then yes, there
    can be text mixed in with everything else. But I would not call that
    "mixing binary data and formatted text" - I would just say that some of
    the binary data happens to be strings. It's the same as elf files
    containing copies of strings from the program, or identifiers for
    external linking.
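
    A small C sketch (untested) of the kind of format sniffing referred to above -
    checking whether a file begins with the "%PDF" magic bytes:

        #include <stdio.h>
        #include <string.h>

        /* Return 1 if the file starts with the "%PDF" magic, 0 otherwise. */
        static int looks_like_pdf(const char *path)
        {
            char magic[4];
            FILE *f = fopen(path, "rb");
            if (f == NULL)
                return 0;
            size_t n = fread(magic, 1, sizeof magic, f);
            fclose(f);
            return n == sizeof magic && memcmp(magic, "%PDF", 4) == 0;
        }

        int main(int argc, char **argv)
        {
            if (argc > 1)
                printf("%s: %s\n", argv[1],
                       looks_like_pdf(argv[1]) ? "looks like a PDF" : "not a PDF");
            return 0;
        }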

    However, I learned a new trick when checking that I was not mistaken
    about this - it turns out that "less file.pdf" gives a nice text-only
    output from the pdf file (by passing it through "lesspipe"). There's
    always something new to learn from inane conversations on Usenet :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Lawrence D'Oliveiro on Wed Jan 31 15:28:42 2024
    On Wed, 31 Jan 2024 06:03:37 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Tue, 30 Jan 2024 15:21:01 +0000, Malcolm McLean wrote:

    Most systems run Windows where the model of piping from standard
    output to standard input of the next program is much less used than
    in Unix, this is true. That sometimes generates a feeling of
    superiority amongst those who use the less common, often more
    expensive Unix systems. It's very silly, but that's how people
    think.

    Also we can do select/poll on pipes on *nix systems, you can’t on
    Windows.

    You can't do select/poll on Windows anonymous pipes, which are an odd
    bird in the Win32 API. To me Windows anonymous pipes look like a poorly
    implemented late addition to Win32 that, I would imagine, was done in
    order to claim POSIX compatibility back when it mattered for USA
    government contracts.
    However you can do Windows' equivalent of poll (WaitForMultipleObjects)
    on *named* pipes.
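
    For the *nix side of this, a minimal C sketch (untested) of poll() on a pipe,
    the operation being contrasted here with WaitForMultipleObjects:

        #include <poll.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            int fds[2];
            if (pipe(fds) != 0)
                return 1;
            (void)write(fds[1], "x", 1);        /* make the read end readable */

            struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
            int ready = poll(&pfd, 1, 1000);    /* wait up to 1000 ms */
            if (ready > 0 && (pfd.revents & POLLIN))
                puts("pipe is readable");
            else
                puts("no data within the timeout");
            return 0;
        }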

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Wed Jan 31 14:21:23 2024
    On 31/01/2024 11:53, Michael S wrote:
    On Wed, 31 Jan 2024 08:59:22 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    On 31/01/2024 07:04, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 21:04:56 +0000, Malcolm McLean wrote:

    Linus Torvald's native language is Finnish, for example.

    No, it would be Swedish. He’s an ethnic Swede, from Finland.

    He is Finnish, but has Swedish as his mother tongue (like about 5% of
    Finns). Speaking Swedish as your main language does not make you
    ethnically Swedish.

    Linus has Swedish as his mother tongue.
    Linus has Swedish family name. Or at least Scandinavian, for me it
    sounds more Danish than Swedish, but I am not an expert. It certainly
    does not sound Finnish.

    His name is of Swedish language origin, yes. I can't answer for
    Torvalds family, but the solid majority of Swedish-speaking Finn
    families have been in Finland for hundreds of years. I'm not sure what
    the definitions of "ethnically Swedish" or "ethnically Finnish" might
    be, but as a general rule Swedish-speaking Finns consider themselves to
    be Swedish-speaking Finns - not Swedes living in Finland.

    When Linus was younger, he used to like to tell stereotypical jokes
    about Finns.

    So do all Finns I have ever met. They are particularly fond of jokes
    about how absurd the Finnish language can be - whether they are natively Swedish speakers or Finnish speakers. They are all proud of being
    Finnish, including traits that may be viewed as eccentric by other cultures.


    When it quacks like a duck...

    As a university-educated Finn, brought up in
    Helsinki, he will also speak Finnish quite fluently.


    That's true.



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Wed Jan 31 14:35:03 2024
    On 31.01.2024 14:05, David Brown wrote:

    But it is correct that English has become the main language for
    international communication, and is therefore critical for anything that involves cross-border communication, or where there are significant
    numbers of foreign workers. That includes academic work. Different
    parts of Europe previously used German or Russian for this,

    Don't forget the importance of French! - The whole postal and
    telecommunication sectors were (and probably still are) massively
    influenced by France.

    (You're always writing so much text, so I'll skip it and avoid
    more comments.)

    Just two (unrelated) notes concerning statements I've seen
    somewhere in the thread (maybe here as well)...

    First; the EU publishes in all languages of the member states,
    for example. (There's no single lingua franca.)

    And the second note; we have to distinguish the language of the
    programming language's keywords, the comments in the source
    code, and the language(s) used for user-interaction.

    I don't know whether there's some programming language that uses
    non-English keywords, but I'd suppose so, since in the past
    I've seen some using preprocessors for "native language"
    source code. So while not typical, probably a demand at some
    places. (Elsethread I mentioned the German TR440 commands,
    but I don't consider a [primitive] command language, as opposed
    to, say, the Unix shell, much of a language.)

    The comments' language varies, in my experience. Sometimes
    there are coding standards (that demand the native language, or
    that demand English), sometimes it's not defined. Myself I'm
    reluctant to switch between languages and stay with English.
    But there were also other cases with longer descriptions on
    a conceptual basis; if you come from a native language's
    perspective it can be better to stay with the language of the
    specification instead of introducing sources of misunderstanding.

    The user interface, finally, is of course as specified, and can
    be anything, or even multi-lingual.

    Janis

    [...]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Richard Harnden on Wed Jan 31 14:47:38 2024
    On 30.01.2024 20:39, Richard Harnden wrote:

    Nobody uses printf to output binary data. fwrite(3) would be common, as
    would write(2).

    Right. I'm using the OS's write(2), but also printf with ANSI escapes,
    e.g. sprintf (buf, "\033[%d;%dH", ...
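
    As a complete (untested) C sketch of that kind of escape output, assuming an
    ANSI/VT100-style terminal:

        #include <stdio.h>

        int main(void)
        {
            printf("\033[2J");              /* clear the screen */
            printf("\033[%d;%dH", 5, 10);   /* move the cursor to row 5, column 10 */
            printf("here\n");
            return 0;
        }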


    Maybe you could use printf("%c%c%c" ... but it'd be beyond tedious.

    Since I recall having used it in some thread, I want to clarify that
    it was just meant as an example countering an incorrect argument of
    "not being able to output binary data on stdout", or some such.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Wed Jan 31 14:58:42 2024
    On 31/01/2024 12:22, Janis Papanagnou wrote:
    On 30.01.2024 20:46, David Brown wrote:
    On 30/01/2024 19:22, bart wrote:
    [...]

    If you have a program like that, then it probably makes sense to have a
    flag to say "output the data to stdout" and the default being writing to
    a file.

    Did you here mean to say "output the data to the terminal"?

    No.

    I said that if you have a program that sometimes gives binary output on
    stdout, and sometimes gives text messages, and this leads people to have
    a significant chance of accidentally dumping binary output to their
    terminal, then it probably makes sense to require an explicit flag to
    have the program generate binary output on stdout. Then you can use
    pipes or other redirection if you want, but are less likely to make a
    mess of your terminal by mistake.
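
    A minimal C sketch (untested) of that behaviour; the --force flag name is made
    up for the example, and isatty() is POSIX:

        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>

        int main(int argc, char **argv)
        {
            int force = argc > 1 && strcmp(argv[1], "--force") == 0;
            const unsigned char blob[] = { 0x00, 0x1b, 0x7f, 0xff };  /* sample binary data */

            /* Refuse to dump binary data onto a terminal unless forced. */
            if (isatty(STDOUT_FILENO) && !force) {
                fprintf(stderr, "stdout is a terminal; use --force to write binary anyway\n");
                return 1;
            }
            fwrite(blob, 1, sizeof blob, stdout);
            return 0;
        }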

    (I noticed
    that a lot of the posts here have a misconception about what 'stdout'
    is; they seem to use it synonymously to "terminal" or "screen/display".
    But you are not guaranteed that stdout will produce screen output; it
    depends on the environment. Being more accurate with the distinction
    might help prevent misconceptions if replying to these people.)


    I am not sure I have noticed that mistake here, but it is something that
    people do mix up - especially if they are used to systems where pipes
    and shell redirection are uncommon.

    However, here we have used "terminal" intentionally. It is usually not
    a big problem if you accidentally pipe binary data into grep or redirect
    it to a file. It is, however, annoying when it goes to the terminal and generates a screenful of flashing garbage, 42 beeps, and leaves you with
    a red on red colour scheme.

    More to your point you wrote, I don't think this would be a good design
    as you've written it. A default would imply the necessity of some fixed
    name (or naming schema) - think of the disputable "a.out" default.

    Yes.

    Or perhaps just an error message saying you need to specify the filename
    for output or "-" for stdout. Those are just details.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Kuyper on Wed Jan 31 15:09:28 2024
    On 30/01/2024 22:16, James Kuyper wrote:
    On 1/30/24 11:49, David Brown wrote:
    ...
    If the program can reasonably be expected to generate binary output,
    then it is the user's fault if they do this accidentally. Examples
    shown in this thread include cat and zcat - that's what these programs do.

    ? There's no problem using cat to concatenate binary files. I've used
    'split' to split binary files into smaller pieces, and then used 'cat'
    to recombine them, and it worked fine. I don't remember why, but I had
    to transfer the files from one place to another by a method that imposed
    an upper limit on the size of individual files.


    I think there's a misunderstanding here - I gave "cat" as an example of
    a program that /can/ be expected to produce binary output. (It can also produce text output - you get what you put in.) So it is the user's
    fault if they type "cat /bin/cat" and are surprised by a mess in their
    terminal.

    I would expect that the majority of uses of "cat" are with just one
    file, but certainly it is useful when you want to combine files in
    different ways.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Wed Jan 31 15:46:27 2024
    On 31.01.2024 15:09, David Brown wrote:

    I would expect that the majority of uses of "cat" are with just one
    file,

    And of course just because of ignorance; the majority of (but not all)
    uses with just one file are UUOCs.

    but certainly it is useful when you want to combine files in
    different ways.

    I don't know of any concatenations in "different" ways, but of course
    there's some more of the other usages that are supported by options.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Wed Jan 31 15:39:47 2024
    On 31.01.2024 14:58, David Brown wrote:
    On 31/01/2024 12:22, Janis Papanagnou wrote:
    On 30.01.2024 20:46, David Brown wrote:
    On 30/01/2024 19:22, bart wrote:
    [...]

    If you have a program like that, then it probably makes sense to have a
    flag to say "output the data to stdout" and the default being writing to >>> a file.

    Did you here mean to say "output the data to the terminal"?

    No.

    I said that if you have a program that sometimes gives binary output on stdout, and sometimes gives text messages,

    Umm, the same program? - I mean, sure, technically a program can
    produce binary output (with some options) and textual output (with
    some other options); but that is explicitly controlled by the user.

    Your statement reads as if it would arbitrarily output binary or
    text; and in that case the "output type" distinction appears to be
    artificial and should instead be called just "data".

    It might be enlightening to consider for example: decrypt f.cpt
    What would one expect as output ("binary"? "text"? or "data"?) and
    what we gain with a (spurious) flag?

    and this leads people to have
    a significant chance of accidentally dumping binary output to their
    terminal, [...]

    Okay, after all, "output the data to the terminal", as I suspected
    (and that was what I had been asking to clarify).

    I myself (as I'd suppose everyone else also) have dumped non-text
    data onto the terminal, by a typo, or by forgetting to provide
    an option. 'stty sane' or 'reset' and continue...

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Wed Jan 31 15:21:16 2024
    On 31/01/2024 09:36, Malcolm McLean wrote:
    On 31/01/2024 07:18, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed through systems designed to handle
    human-readable text. For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting
    for next data byte". Obviously this will cause difficulties if the data
    is binary.
    Also many binary formats can't easily be extended, so you can pass one
    image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to
    extend.

    Your reasoning is all gobbledygook.  Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    [...]

    Simple example (disclaimer: not tested):

         ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
            (mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)

    Of the five main programs in this command, four are using
    standard out to send binary data:

         tar -cf - .
         gzip -c
         ssh foo [...]
         gunzip -c

    The tar -xf - at the end reads binary data on standard in
    but doesn't output any (or anything else for that matter).

    It is FAR more cumbersome to accomplish what this command
    is doing without sending binary data through standard out.
    Anyone who doesn't understand this doesn't understand Unix.

    Yes. I don't do that sort of thing.
    Whilst I have used Unix, it is as a platform for interactive programs
    which work on graphics, or a general C compilation environment. I don't
    build pipelines to do that sort of data processing. If I had to download
    a tar file I'd either use a graphical tool or type several commands into
    the shell, each launching a single executable, interactively.

    The reason is that I'd only run the command once, and it's so likely
    that there will be either a syntax misunderstanding or a typing error
    that I'd have to test to ensure that it was right. And by the time
    you've done that any time saved by typing only one commandline is lost.
    Of course if you are writing scripts then that doesn't apply. But now
    it's effectively a programming language, and, from the example code, a
    very poorly designed one which is cryptic and fussy and liable to be
    hard to maintain. So it's better to use a language like Perl to achieve
    the same thing, and I did have a few Perl scripts handy for repetitive
    jobs of that nature in my Unix days.


    That gave me a laugh! You think bash is cryptic, fussy and poorly
    designed, and choose /Perl/ as the alternative :-)

    You admit this with "not tested". Says it all. "Understanding Unix" is
    an intellectually useless achievement. You might have to do it if you
    have to use the system and debug and trouble shoot. But it's nothing to
    be proud about.


    It is "useless" for people who don't use it. For people who /do/ use
    it, it is very useful.

    I've used sequences like Tim's - it's a way to copy data remotely from a different machine. I would likely write it slightly differently - I'd
    probably do the mkdir and cd first, thus avoiding the need for a
    subshell, and I'd use "ssh -C" or "tar -z" to do the compression rather
    than "gzip".

    There's no doubt that the learning curve is longer for doing this sort
    of thing from the command line than using gui programs. There is also
    no doubt that when you are used to it, command line utilities and a good
    shell are very flexible and efficient.

    Learn to use the tools that are conveniently available, and then pick
    the right tool for the job - whether it is command line or gui.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Malcolm McLean on Wed Jan 31 17:03:12 2024
    On Wed, 31 Jan 2024 12:15:23 +0000
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:

    On 31/01/2024 10:43, Michael S wrote:
    On Tue, 30 Jan 2024 23:18:21 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it
    might have unusual effects if passed through systems designed to
    handle human-readable text. For instance in some systems
    designed to receive ASCII text, there is no distinction between
    the nul byte and "waiting for next data byte". Obviously this
    will cause difficulties if the data is binary.
    Also many binary formats can't easily be extended, so you can
    pass one image and that's all. While it is possible to devise a
    text format which is similar, in practice text formats usually
    have enough redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and
    harder to extend.

    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered
    doing.

    [...]

    Simple example (disclaimer: not tested):

    ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
    (mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)

    Of the five main programs in this command, four are using
    standard out to send binary data:

    tar -cf - .
    gzip -c
    ssh foo [...]
    gunzip -c

    The tar -xf - at the end reads binary data on standard in
    but doesn't output any (or anything else for that matter).

    It is FAR more cumbersome to accomplish what this command
    is doing without sending binary data through standard out.

    If I am not mistaken, tar, gzip and gunzip do not write binary data
    to standard output by default. They should be specifically told to
    do so. For ssh I don't know. Anyway, ssh is not a "normal" program
    so it's not surprising when textuality of ssh output is the same as textuality of the command it carries.

    Anyone who doesn't understand this doesn't understand Unix.

    Frankly, Unix redirection racket looks like something hacked
    together rather than designed as result of the solid thinking
    process. As long as there were only standard input and output it
    was sort of logical. But when they figured out that it is
    insufficient, they had chosen a quick hack instead of constructing
    a solution that wouldn't offend engineering senses of any non-preconditioned observer.
    It was designed for very memory constrained systems which handled
    text on a line by line basis. So one line of a long file would be
    processed and passed down the pipeline, and you wouldn't need
    temporary disk files or large amounts of memory. I'm sure it worked
    quite well for that.



    The concept of pipes is fine. I was not talking about that side.

    My objection is with each program having exactly 1 special input and
    exactly 2 special outputs. Instead of having, say, up to 5 of each,
    fully interchangeable with the first of the five being special only in
    that that it is a default and as such allows for shorter syntax in the
    shell.

    I would be surprised if something like that was not done by somebody.
    I would be even more surprised if the idea did not cross the mind of Unix
    pioneers. However they decided to add stderr and to stop here. Most
    likely, because they didn't take themselves as seriously as a few posters
    here take them 45-50 years later.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Wed Jan 31 15:15:15 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Tue, 30 Jan 2024 17:49:57 +0100, David Brown wrote:

    The use of CRLF as a standard for line endings in files was, I believe,
    from CP/M ...

    Which I think copied it from DEC minicomputer systems.

    Fun fact: on some of those DEC systems (which I used when they were still being made), you could end a line with CR-LF, or LF-CR-NUL.

    What was the NUL for? Padding. Why did it need padding? (This was before
    CRT terminals.)

    Unix built the nul-padding into the terminal driver. Users used
    the stty command to set the number of nul's (based on the time it
    took for the ASR33 carriage to return to the home position after
    it received a CR).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Wed Jan 31 15:17:53 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 31.01.2024 12:58, Michael S wrote:
    On Wed, 31 Jan 2024 12:43:04 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 31.01.2024 07:02, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:

    VMS (now OpenVMS) was also a significant system at the time, and
    it had some rather complex file formats ...

    [...]

    Apparently Linus Torvalds used VMS for a while, and hated it.

    I don't understand the intention of this comment.
    VMS and Torvalds are completely different eras.
    And where is the relation?


    Linus is older than you probably realize.

    Why do you think that I'd be thinking that?

    I know that he's quite some years younger than I am. So what?

    He entered the University of
    Helsinki in 1988. Back then VMS was only slightly behind its peak of
    popularity.

    What? - I'm not sure where you're coming from.

    I associate DEC's VMS with the old DEC VAX-11 system, both
    from around the mid of the 1970's.

    Early customer systems were shipped around 1979, IIRC. We had four.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Wed Jan 31 15:20:43 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Tue, 30 Jan 2024 19:39:24 +0000, Richard Harnden wrote:

    Nobody uses printf to output binary data.

    Do terminal-control escape sequences count?

    Or UTF-*?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Wed Jan 31 15:16:57 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:

    VMS (now OpenVMS) was also a significant system at the time, and it had
    some rather complex file formats ...

    It relied on extra file metadata called “record attributes” in order to make sense of the file format. It was quite common to transfer files from other systems, and have them not be readable until you had set appropriate record attributes on them. Picky, picky, I know.

    At my first job, I had to write a tool (in macro32) to support access
    to any type of file, regardless of the RMS attributes. Wasn't a trivial
    task like it would have been on a unix system.

    VMS inherited that from the mainframe systems whose filesystems were
    based around COBOL file handling.

    At least it was far superior to IBM's PDS and associated crap.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Wed Jan 31 15:20:08 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 31/01/2024 07:07, Lawrence D'Oliveiro wrote:
    On Tue, 30 Jan 2024 17:25:31 +0100, David Brown wrote:

    Mixing binary data with formatted text data is very unlikely to be
    useful.

    PDF does exactly that. To the point where the spec suggests putting some
    random unprintable bytes up front, to distract format sniffers from
    thinking they’re looking at a text file.

    PDF files start with the "magic" indicator "%PDF", which is enough for
    many programs to identify them correctly. And they are usually
    compressed so that the content text is not directly readable or
    identifiable as strings. If they are not compressed, then yes, there
    can be text mixed in with everything else. But I would not call that
    "mixing binary data and formatted text" - I would just say that some of
    the binary data happens to be strings. It's the same as elf files
    containing copies of strings from the program, or identifiers for
    external linking.

    However, I learned a new trick when checking that I was not mistaken
    about this - it turns out that "less file.pdf" gives a nice text-only
    output from the pdf file (by passing it through "lesspipe"). There's
    always something new to learn from inane conversations on Usenet :-)

    For many years, I used a tool called 'antiword' to read legacy Microsoft Windows .doc files (before .docx).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Wed Jan 31 15:22:58 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 31/01/2024 09:36, Malcolm McLean wrote:

    The reason is that I'd only run the command once, and it's so likely
    that there will be either a syntax misunderstanding or a typing error
    that I'd have to test to ensure that it was right. And by the time
    you've done that any time saved by typing only one commandline is lost.
    Of course if you are writing scripts then that doesn't apply. But now
    it's effectively a programming language, and, from the example code, a
    very poorly designed one which is cryptic and fussy and liable to be
    hard to maintain. So it's better to use a language like Perl to achieve
    the same thing, and I did have a few Perl scripts handy for repetitive
    jobs of that nature in my Unix days.


    That gave me a laugh! You think bash is cryptic, fussy and poorly
    designed, and choose /Perl/ as the alternative :-)

    You admit this with "not tested". Says it all. "Understanding Unix" is
    an intellectually useless achievement. You might have to do it if you
    have to use the system and debug and trouble shoot. But it's nothing to
    be proud about.


    It is "useless" for people who don't use it. For people who /do/ use
    it, it is very useful.

    I've used sequences like Tim's - it's a way to copy data remotely from a different machine. I would likely write it slightly differently - I'd probably do the mkdir and cd first, thus avoiding the need for a
    subshell, and I'd use "ssh -C" or "tar -z" to do the compression rather
    than "gzip".

    There's no doubt that the learning curve is longer for doing this sort
    of thing from the command line than using gui programs. There is also
    no doubt that when you are used to it, command line utilities and a good shell are very flexible and efficient.

    Learn to use the tools that are conveniently available, and then pick
    the right tool for the job - whether it is command line or gui.

    And there is often more than one tool for the job, e.g. rsync(1)
    for copying data remotely.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Michael S on Wed Jan 31 15:25:00 2024
    Michael S <already5chosen@yahoo.com> writes:
    On Tue, 30 Jan 2024 23:18:21 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:


    Anyone who doesn't understand this doesn't understand Unix.

    Frankly, Unix redirection racket looks like something hacked together
    rather than designed as result of the solid thinking process.

    It seems you don't understand Unix.

    As long as there were only standard input and output it was sort of
    logical. But when they figured out that it is insufficient, they had
    chosen a quick hack instead of constructing a solution that wouldn't
    offend engineering senses of any non-preconditioned observer.

    You mean like

    exec 3< /path/to/input/file
    read -u3 line_from_input file

    How does that offend your engineering senses?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Wed Jan 31 16:25:30 2024
    On 31.01.2024 15:21, David Brown wrote:
    On 31/01/2024 09:36, Malcolm McLean wrote:
    [ I snipped a couple of "I actually don't know/need it" things ]

    But now it's effectively a programming language, and, from the example
    code, a very poorly designed one which is cryptic and fussy and liable
    to be hard to maintain. So it's better to use a language like Perl to
    achieve the same thing, and I did have a few Perl scripts handy for
    repetitive jobs of that nature in my Unix days.

    That gave me a laugh! You think bash is cryptic, fussy and poorly
    designed, and choose /Perl/ as the alternative :-)

    I don't think it's that clear a joke. The Unix shell is extremely
    error prone to program, and you should not let a newbie write shell
    programs without careful supervision. ("newbie" [in shell context]
    = less than 10 years of practical experience. - Am I exaggerating?
    Maybe. But not much.)

    And yes, Shell is cryptic, but, OTOH, it's also a programming
    language with powerful concepts (taken from Algol68). It wouldn't
    appear to me, though, to classify Perl as non-cryptic. You can
    write more or less legible shell code; also depends on experience.
    You have functions to structure your code, the control constructs,
    error handling, the I/O glue, all smoothly fitting together. But,
    to be honest, I say this with modern shells in mind (ksh, bash,
    zsh), not so much the POSIX subset, and even less the Bourne sh.

    Perl's advantage is probably that you have the same interface on
    all platforms [where it is installed]. Not having to distinguish,
    say, the 'ps' options from one Unix system to another. And it has
    a lot more features, data types, and supporting external modules.


    There's no doubt that the learning curve is longer for doing this sort
    of thing from the command line than using gui programs. There is also
    no doubt that when you are used to it, command line utilities and a good shell are very flexible and efficient.

    The big advantage of non-GUI is for process automation. With GUI
    oriented applications you can mainly only interactively (=slow and
    cumbersome) do what it provides. Rarely GUI applications support a
    scripting interface, and if so it's then typically some proprietary non-standard language.


    Learn to use the tools that are conveniently available, and then pick
    the right tool for the job - whether it is command line or gui.

    The Unix shell is at least standard and available on Unix systems.
    (Perl is no standard on Unix. And you may not be allowed to install
    it.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Wed Jan 31 15:29:07 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 30/01/2024 22:16, James Kuyper wrote:
    On 1/30/24 11:49, David Brown wrote:
    ...
    If the program can reasonably be expected to generate binary output,
    then it is the user's fault if they do this accidentally. Examples
    shown in this thread include cat and zcat - that's what these programs do.
    ? There's no problem using cat to concatenate binary files. I've used
    'split' to split binary files into smaller pieces, and then used 'cat'
    to recombine them, and it worked fine. I don't remember why, but I had
    to transfer the files from one place to another by a method that imposed
    an upper limit on the size of individual files.


    I think there's a misunderstanding here - I gave "cat" as an example of
    a program that /can/ be expected to produce binary output. (It can also
    produce text output - you get what you put in.) So it is the user's
    fault if they type "cat /bin/cat" and are surprised by a mess in their
    terminal.

    Quick and dirty editor:

    $ cat > /tmp/file < /dev/tty
    line1
    line2
    line3
    ^D
    $
    $ cat /tmp/file
    line1
    line2
    line3
    $
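 
    For what it's worth, the heart of a cat-like program is just a
    byte-for-byte copy from standard input to standard output, which is
    why it is equally happy with text and binary. A rough sketch, not the
    real cat source:
 
        #include <stdio.h>
 
        /* Minimal cat-like filter: copy stdin to stdout unchanged.
           No interpretation of the bytes, so text and binary both work. */
        int main(void)
        {
            int c;
 
            while ((c = getchar()) != EOF)
                putchar(c);
            return 0;
        }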

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Scott Lurndal on Wed Jan 31 16:33:44 2024
    On 31.01.2024 16:25, Scott Lurndal wrote:
    Michael S <already5chosen@yahoo.com> writes:
    [...]

    You mean like

    exec 3< /path/to/input/file
    read -u3 line_from_input file

    Careful with non-standard extensions like '-u'.


    How does that offend your engineering senses?

    It probably would if the standard redirection pattern
    had been used here. It's certainly more "cryptic"
    than '-u3'. ;-)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Scott Lurndal on Wed Jan 31 17:42:49 2024
    On Wed, 31 Jan 2024 15:25:00 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:

    Michael S <already5chosen@yahoo.com> writes:
    On Tue, 30 Jan 2024 23:18:21 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:


    Anyone who doesn't understand this doesn't understand Unix.

    Frankly, Unix redirection racket looks like something hacked together rather than designed as result of the solid thinking process.

    It seems you don't understand Unix.


    That's likely.
    I learned it during a short course ~30 years ago, then read 2 or 3
    books about it and then used it quite sporadically.

    As long as there were only standard input and output it was sort of logical. But when they figured out that it is insufficient, they had
    chosen a quick hack instead of constructing a solution that wouldn't
    offend engineering senses of any non-preconditioned observer.

    You mean like

    exec 3< /path/to/input/file
    read -u3 line_from_input file

    How does that offend your engineering senses?


    That was not in 2-3 books that I had read. I can't say that I understand
    what is going on, what environment we are and whether what you show is
    generic or specific to 'exec' and 'read'.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Michael S on Wed Jan 31 15:58:02 2024
    Michael S <already5chosen@yahoo.com> writes:
    On Wed, 31 Jan 2024 15:25:00 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:

    You mean like

    exec 3< /path/to/input/file
    read -u3 line_from_input file

    How does that offend your engineering senses?


    That was not in 2-3 books that I had read. I can't say that I understand
    what is going on, what environment we are and whether what you show is generic or specific to 'exec' and 'read'.

    It is actually specific to a shell which provides internal
    'exec' and 'read' commands. There are dozens of shells available
    to suit the end-user's requirements (Bourne shell, Korn shell, C shell
    et alia), many of which implement a common subset of commands
    defined by POSIX.

    From the unix kernel perspective, an application opens a file and a file descriptor is assigned (consecutively, starting at zero). There
    are no semantics associated with the fd which can refer to a terminal, pseudo-terminal, block device, character device, pipe,
    named fifo, or even a TCP connection to a remote host.

    Any semantics associated with the file descriptor are a contract
    between the shell and the application.

    One could certainly write a shell that doesn't use file descriptor
    zero as stdin (although that would be incompatible with applications
    written for the standard shells, all of which honor the convention
    that file descriptor zero is stdin).

    For portability between shells, POSIX has codified the relationship
    of stdin to fd 0, stdout to fd 1, etc.
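 
    A small sketch of that point, assuming a POSIX system: open(2) hands
    back the lowest unused descriptor, so with 0, 1 and 2 already open the
    first open() normally returns 3. The path is only a placeholder:
 
        #include <stdio.h>
        #include <fcntl.h>
        #include <unistd.h>
 
        int main(void)
        {
            /* "/etc/passwd" stands in for "some readable file". */
            int fd = open("/etc/passwd", O_RDONLY);
 
            if (fd < 0) {
                perror("open");
                return 1;
            }
            /* With 0, 1 and 2 already open this usually prints 3. */
            printf("got descriptor %d\n", fd);
            close(fd);
            return 0;
        }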

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Michael S on Wed Jan 31 15:49:10 2024
    Michael S <already5chosen@yahoo.com> writes:
    On Wed, 31 Jan 2024 12:15:23 +0000
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:

    It was designed for very memory constrained systems which handled
    text on a line by line basis. So one line of a long file would be
    processed and passed down the pipeline, and you wouldn't need
    temporary disk files or large amounts of memory. I'm sure it worked
    quite well for that.



    A concept of pipes is fine. I was not talking about that side.

    My objection is with each program having exactly 1 special input and
    exactly 2 special outputs. Instead of having, say, up to 5 of each,
    fully interchangeable with the first of the five being special only in
    that that it is a default and as such allows for shorter syntax in the
    shell.

    Each program has 1024 (on my system - it's configurable on a per-process
    basis) fully interchangeable "inputs" and "outputs" (also known as files).

    $ application 5> /tmp/file5

    will redirect file descriptor five to the specified file.


    There's nothing special about stdin, stdout or stderr other than
    that they are tags applied to the first three file descriptors.

    There is a convention that the first file descriptor
    is used for input, the second for output and the third
    for diagnostic output. But it's just a convention.
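 
    A sketch of the application side of that '5> /tmp/file5' redirection -
    my example, not from the post; the './report' name in the message is
    made up. The program can probe whether descriptor 5 was handed to it
    and, if so, write its report there:
 
        #include <stdio.h>
        #include <string.h>
        #include <fcntl.h>
        #include <unistd.h>
 
        int main(void)
        {
            const char msg[] = "report goes to descriptor 5\n";
 
            /* F_GETFD fails with EBADF if nothing was opened on fd 5 for us. */
            if (fcntl(5, F_GETFD) != -1) {
                if (write(5, msg, strlen(msg)) < 0)
                    perror("write fd 5");
            } else {
                fprintf(stderr, "fd 5 not provided; run as: ./report 5> /tmp/file5\n");
            }
            return 0;
        }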

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Wed Jan 31 15:58:36 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 31.01.2024 15:21, David Brown wrote:
    On 31/01/2024 09:36, Malcolm McLean wrote:
    [ I snipped a couple of "I actually don't know/need it" things ]

    But now it's effectively a programming language, and, from the example
    code, a very poorly designed one which is cryptic and fussy and liable
    to be hard to maintain. So it's better to use a language like Perl to
    achieve the same thing, and I did have a few Perl scripts handy for
    repetitive jobs of that nature in my Unix days.

    That gave me a laugh! You think bash is cryptic, fussy and poorly
    designed, and choose /Perl/ as the alternative :-)

    I don't think it's that clear a joke. The Unix shell is extremely
    error prone to program, and you should not let a newbie write shell
    programs without careful supervision.

    Nonsense.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Scott Lurndal on Wed Jan 31 18:04:29 2024
    On Wed, 31 Jan 2024 15:49:10 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:

    Michael S <already5chosen@yahoo.com> writes:
    On Wed, 31 Jan 2024 12:15:23 +0000
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:

    It was designed for very memory constrained systems which handled
    text on a line by line basis. So one line of a long file would be
    processed and passed down the pipeline, and you wouldn't need
    temporary disk files or large amounts of memory. I'm sure it worked
    quite well for that.



    A concept of pipes is fine. I was not talking about that side.

    My objection is with each program having exactly 1 special input and exactly 2 special outputs. Instead of having, say, up to 5 of each,
    fully interchangeable with the first of the five being special only
    in that that it is a default and as such allows for shorter syntax
    in the shell.

    Each program has 1024 (on my system - it's configurable on a
    per-process basis) fully interchangeable "inputs" and "outputs" (also
    known as files).

    $ application 5> /tmp/file5

    will redirect file descriptor five to the specified file.


    There's nothing special about stdin, stdout or stderr other than
    that they are tags applied to the first three file descriptors.

    There is a convention that the first file descriptor
    is used for input, the second for output and the third
    for diagnostic output. But it's just a convention.

    I don't understand.
    Are not descriptors 0,1 and 2 special in that they are already
    open (I don't know if by OS or by shell) when the program starts and the
    rest of them, if ever used, have to be opened by the program code?

    On only remotely related note, what happens on your system when you
    want more than 1024 files to be open by one program simultaneously?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Michael S on Wed Jan 31 17:05:23 2024
    On 31.01.2024 16:42, Michael S wrote:
    On Wed, 31 Jan 2024 15:25:00 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:

    You mean like

    exec 3< /path/to/input/file
    read -u3 line_from_input file

    How does that offend your engineering senses?

    That was not in 2-3 books that I had read. I can't say that I understand
    what is going on, what environment we are and whether what you show is generic or specific to 'exec' and 'read'.

    '-u' is obviously an option of read. Various shells support it; at least
    ksh for 30 years, and bash meanwhile as well. But it's not in POSIX.

    Other redirections are standard, and these should certainly be known by
    anyone who had visited a course and read any book on the Unix shell.
    The syntax is not difficult, follows rules, and certainly not arbitrary.

    The one in the above code is assigning file descriptor 3 to the given
    file for reading. You can let an FD point to the channel another one is
    pointing to, like in the "well known" '2>&1' (where stderr is connected
    to the same channel that stdout currently points to). Similarly, in the
    above example you can use the standard form 'read line_from_input_file <&3',
    which may certainly appear more cryptic than the option '-u3', but it's
    essential to any shell programmer.
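 
    For the C-level view (my sketch, not from the post): the shell
    implements '2>&1' with dup2(2), making descriptor 2 refer to whatever
    descriptor 1 currently refers to:
 
        #include <stdio.h>
        #include <unistd.h>
 
        int main(void)
        {
            /* Equivalent of the shell's 2>&1: stderr now goes wherever
               stdout currently points (file, pipe, terminal, ...). */
            if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
                perror("dup2");
                return 1;
            }
            fprintf(stderr, "this diagnostic now follows stdout\n");
            return 0;
        }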

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Scott Lurndal on Wed Jan 31 17:17:44 2024
    On 31.01.2024 16:58, Scott Lurndal wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 31.01.2024 15:21, David Brown wrote:
    On 31/01/2024 09:36, Malcolm McLean wrote:
    [ I snipped a couple of "I actually don't know/need it" things ]

    But now it's effectively a programming language, and, from the example code, a very poorly designed one which is cryptic and fussy and liable to be hard to maintain. So it's better to use a language like Perl to
    achieve the same thing, and I did have a few Perl scripts handy for
    repetitive jobs of that nature in my Unix days.

    That gave me a laugh! You think bash is cryptic, fussy and poorly
    designed, and choose /Perl/ as the alternative :-)

    I don't think it's that clear a joke. The Unix shell is extremely
    error prone to program, and you should not let a newbie write shell
    programs without careful supervision.

    Nonsense.

    Not the least. - I'm not sure about your background in shell. But all
    what you wrote in this newsgroup is indicating that your experience
    seems to be quite limited. (I've seen only a single and pointless post
    from you in comp.unix.shell). - Nevermind. Just don't expose yourself
    so much with your obviously little knowledge and experience.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Janis Papanagnou on Wed Jan 31 11:20:21 2024
    On 1/31/24 07:35, Janis Papanagnou wrote:
    ...
    I associate DEC's VMS with the old DEC VAX-11 system, both
    from around the mid of the 1970's. I programmed on a DEC's
    VAX with VMS obviously before Linus Torvalds started his
    studies. And that was at a time when the DEC VAX and VMS
    were replaced at our sites by Unix systems.

    OK - so it's that association you've got wrong. I know VMS was still
    going strong around 1990 when I was introduced to it. It might have been
    in decline at the time, but it was very far from being gone.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Janis Papanagnou on Wed Jan 31 18:26:36 2024
    On Wed, 31 Jan 2024 16:25:30 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:


    The big advantage of non-GUI is for process automation. With GUI
    oriented applications you can mainly only interactively (=slow and cumbersome) do what it provides. Rarely GUI applications support a
    scripting interface, and if so it's then typically some proprietary non-standard language.


    I'd take almost any proprietary non-standard GUI macro language over non-proprietary non-standard tcl. They say, Lua is better. I never had motivation to look at it more closely.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Janis Papanagnou on Wed Jan 31 18:18:20 2024
    On Wed, 31 Jan 2024 17:05:23 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 31.01.2024 16:42, Michael S wrote:
    On Wed, 31 Jan 2024 15:25:00 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:

    You mean like

    exec 3< /path/to/input/file
    read -u3 line_from_input file

    How does that offend your engineering senses?

    That was not in 2-3 books that I had read. I can't say that I
    understand what is going on, what environment we are and whether
    what you show is generic or specific to 'exec' and 'read'.

    '-u' is obviously an option of read. Various shells support it; at
    least ksh for 30 years, and bash meanwhile as well. But it's not in
    POSIX.


    The books were talking about Bourne shell and C shell. They acknowledged
    an existence of ksh, but didn't go into details. I don't remember if
    bash was mentioned at all.
    Of course, in practice in this century I used bash almost exclusively,
    but never learned it formally, by book, from start to finish.
    The same as over 90% of bash users, I'd guess.


    Other redirections are standard, and these should certainly be known
    by anyone who had visited a course and read any book on the Unix
    shell. The syntax is not difficult, follows rules, and certainly not arbitrary.

    The one in above code is assigning the file descriptor 3 to the given
    file for reading. You can let a FD point to the channel another one is pointing to, like in the "well known" '2>&1' (where stderr is
    connected to the same channel that stdout currently points to).
    Similarly, in the above example you can use the standard form 'read line_from_input_file <&3', which may certainly appear more cryptic
    than an option '-u3', but it's essential to any shell programmer.

    Janis


    I did understand '3<' by association with '2>' that was in the book,
    but more importantly, is something I use regularly.
    However I had never seen '3<' in the books.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Michael S on Wed Jan 31 18:27:46 2024
    On Wed, 31 Jan 2024 18:26:36 +0200
    Michael S <already5chosen@yahoo.com> wrote:

    On Wed, 31 Jan 2024 16:25:30 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:


    The big advantage of non-GUI is for process automation. With GUI
    oriented applications you can mainly only interactively (=slow and cumbersome) do what it provides. Rarely GUI applications support a scripting interface, and if so it's then typically some proprietary non-standard language.


    I'd take almost any proprietary non-standard GUI macro language over non-proprietary non-standard tcl. They say, Lua is better. I never had motivation to look at it more closely.


    meant to write 'non-proprietary standard tcl'

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Michael S on Wed Jan 31 17:29:30 2024
    On 31.01.2024 16:03, Michael S wrote:

    My objection is with each program having exactly 1 special input and
    exactly 2 special outputs. Instead of having, say, up to 5 of each,
    fully interchangeable with the first of the five being special only in
    that that it is a default and as such allows for shorter syntax in the
    shell.

    The first three are pre-assigned and ready to use for the application.
    You can redirect, close, or open them on shell level with a set of
    redirection commands, and you can do such things also on OS-level
    with the Unix system commands.

    I'm not sure why you mentioned 5, whether that's better or worse.
    There's naturally some limit on OS level on the number of parallel
    open file descriptors, but that limit is very high. Mind that you
    can always close unused ones.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Michael S on Wed Jan 31 17:47:33 2024
    On 31.01.2024 17:18, Michael S wrote:
    On Wed, 31 Jan 2024 17:05:23 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    The books were talking about Bourne shell and C shell. They acknowledged
    an existence of ksh, but didn't go into details. I don't remember if
    bash was mentioned at all.

    So that's understandable then. The C shell is not really suited for
    programming (and commonly deprecated; "C shell considered harmful").
    And Bourne sh is indeed very restricted.

    Of course, in practice in this century I used bash almost exclusively,
    but never learned it formally, by book, from start to finish.

    In case that bash is still part of your working environment I suggest
    to update your knowledge by reading some good bash tutorial. It's
    really a huge gain if compared to Bourne sh. In case, though, you want
    to do _portable_ shell programming then take some book that clearly
    indicates what is POSIX and what is some extension.

    The same as over 90% of bash users, I'd guess.

    Well, I think you have to differentiate the levels. Quite many users
    of bash take this shell as "quasi-standard"; which is mostly okay in a
    Linux world. And know exactly that universe. If you happen to work in
    a larger Unix universe that might be a hindrance. (Thus POSIX, or use
    of ksh, which is even more powerful than bash, or zsh, supplemented
    with its own but some more coherent concepts.)


    I did understand '3<' by association with '2>' that was in the book,
    but more importantly, is something I use regularly.
    However I had never seen '3<' in the books.

    It's just the numbers of file descriptors and whether it's an input >
    or output < channel, or even a read/write channel <> . That's why in
    books (or man pages) you regularly see the building blocks, not the
    complete enumeration.

    (See for example 'man ksh' Section "Input/Output". But careful; ksh
    has additional non-standard additions. So a peek into the POSIX docs
    might serve you better.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Michael S on Wed Jan 31 16:54:06 2024
    Michael S <already5chosen@yahoo.com> writes:
    On Wed, 31 Jan 2024 15:49:10 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:

    Michael S <already5chosen@yahoo.com> writes:
    On Wed, 31 Jan 2024 12:15:23 +0000
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:

    It was designed for very memory constrained systems which handled
    text on a line by line basis. So one line of a long file would be
    processed and passed down the pipeline, and you wouldn't need
    temporary disk files or large amounts of memory. I'm sure it worked
    quite well for that.



    A concept of pipes is fine. I was not talking about that side.

    My objection is with each program having exactly 1 special input and
    exactly 2 special outputs. Instead of having, say, up to 5 of each,
    fully interchangeable with the first of the five being special only
    in that that it is a default and as such allows for shorter syntax
    in the shell.

    Each program has 1024 (on my system - it's configurable on a
    per-process basis) fully interchangeable "inputs" and "outputs" (also
    known as files).

    $ application 5> /tmp/file5

    will redirect file descriptor five to the specified file.


    There's nothing special about stdin, stdout or stderr other than
    that they are tags applied to the first three file descriptors.

    There is a convention that the first file descriptor
    is used for input, the second for output and the third
    for diagnostic output. But it's just a convention.

    I don't understand.
    Are not descriptors 0,1 and 2 special in that they are already
    open (I don't know if by OS or by shell) when the program starts and the
    rest of them, if ever used, have to be opened by the program code?

    That's a contract between a shell and an application. A shell
    doesn't need to provide POSIX semantics, although it will likely
    do so just to maintain compatibility with existing applications.

    Applications like daemons usually are disconnected from stdin/stdout/stderr completely, as they're not designed for interactive use.

    Using the fork and exec (or posix_spawn) system calls, an application can invoke another
    application without the shell and any number of file descriptors
    can be left open for the child to use in any way. It's a contract
    between the two applications.
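 
    A rough sketch of such a contract, under the assumption that the child
    knows to read from descriptor 3; the fd number, the data file and the
    "./child" program are all made up for illustration:
 
        #include <stdio.h>
        #include <fcntl.h>
        #include <unistd.h>
        #include <sys/wait.h>
 
        int main(void)
        {
            /* Open a data file and park it on descriptor 3 for the child. */
            int fd = open("/tmp/data.bin", O_RDONLY);   /* placeholder path */
            if (fd < 0) { perror("open"); return 1; }
            if (fd != 3) { dup2(fd, 3); close(fd); }
 
            pid_t pid = fork();
            if (pid < 0) { perror("fork"); return 1; }
            if (pid == 0) {
                /* Child: fd 3 stays open across exec (no FD_CLOEXEC set),
                   so "./child" can read from it directly. */
                execl("./child", "child", (char *)0);
                perror("execl");
                _exit(127);
            }
            waitpid(pid, NULL, 0);
            return 0;
        }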


    On only remotely related note, what happens on your system when you
    want more than 1024 files to be open by one program simultaneously?


    $ ulimit -f 2048

    Will increase the limit, to any arbitrary value, subject to system
    wide limits configured by the superuser (system manager).

    There's a system call (setrlimit) that an application can use as well, and the limit will be inherited by child processes.

    One might argue that there are few cases (absent network daemons)
    where more than 2k files will need to be open simultaneously.

    For each resource (address space size, core file size, max cpu time,
    file size, open files, et cetera, et alia) there are two values;
    a hard limit (which a process can never exceed) and a soft limit.
    A process can increase the soft limit up to the hard limit.

    In my case, the hard limit is 8192 and the soft limit is 1024.

    $ ulimit -aH
    address space limit (Kibytes) (-M) unlimited
    core file size (blocks) (-c) unlimited
    cpu time (seconds) (-t) unlimited
    data size (Kibytes) (-d) unlimited
    file size (blocks) (-f) unlimited
    locks (-x) unlimited
    locked address space (Kibytes) (-l) 64
    message queue size (Kibytes) (-q) 800
    nice (-e) 0
    nofile (-n) 8192
    nproc (-u) 63878
    pipe buffer size (bytes) (-p) 4096
    max memory size (Kibytes) (-m) unlimited
    rtprio (-r) 0
    socket buffer size (bytes) (-b) 4096
    sigpend (-i) 63878
    stack size (Kibytes) (-s) unlimited
    swap size (Kibytes) (-w) not supported
    threads (-T) not supported
    process size (Kibytes) (-v) unlimited
    $ ulimit -aS
    address space limit (Kibytes) (-M) unlimited
    core file size (blocks) (-c) 0
    cpu time (seconds) (-t) unlimited
    data size (Kibytes) (-d) unlimited
    file size (blocks) (-f) unlimited
    locks (-x) unlimited
    locked address space (Kibytes) (-l) 64
    message queue size (Kibytes) (-q) 800
    nice (-e) 0
    nofile (-n) 1024
    nproc (-u) 1024
    pipe buffer size (bytes) (-p) 4096
    max memory size (Kibytes) (-m) unlimited
    rtprio (-r) 0
    socket buffer size (bytes) (-b) 4096
    sigpend (-i) 63878
    stack size (Kibytes) (-s) 8192
    swap size (Kibytes) (-w) not supported
    threads (-T) not supported
    process size (Kibytes) (-v) unlimited
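 
    The setrlimit(2) route mentioned above might look roughly like this in
    C; the target of 2048 descriptors is just an example value:
 
        #include <stdio.h>
        #include <sys/resource.h>
 
        int main(void)
        {
            struct rlimit rl;
 
            if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
                perror("getrlimit");
                return 1;
            }
            printf("open files: soft %llu, hard %llu\n",
                   (unsigned long long)rl.rlim_cur,
                   (unsigned long long)rl.rlim_max);
 
            rl.rlim_cur = 2048;                 /* example target */
            if (rl.rlim_cur > rl.rlim_max)      /* cannot exceed the hard limit */
                rl.rlim_cur = rl.rlim_max;
 
            if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
                perror("setrlimit");
                return 1;
            }
            return 0;
        }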

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Keith Thompson on Wed Jan 31 17:06:33 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    scott@slp53.sl.home (Scott Lurndal) writes:
    [...]
    Quick and dirty editor:

    $ cat > /tmp/file < /dev/tty
    line1
    line2
    line3
    ^D
    $
    $ cat /tmp/file
    line1
    line2
    line3
    $

    You probably don't need the "< /dev/tty".

    True, in most cases. Although I thought it more illustrative
    to do it that way in this context.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Wed Jan 31 17:05:17 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 31.01.2024 16:58, Scott Lurndal wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 31.01.2024 15:21, David Brown wrote:
    On 31/01/2024 09:36, Malcolm McLean wrote:
    [ I snipped a couple of "I actually don't know/need it" things ]

    But now it's effectively a programming language, and, from the example code, a very poorly designed one which is cryptic and fussy and liable to be hard to maintain. So it's better to use a language like Perl to achieve the same thing, and I did have a few Perl scripts handy for
    repetitive jobs of that nature in my Unix days.

    That gave me a laugh! You think bash is cryptic, fussy and poorly
    designed, and choose /Perl/ as the alternative :-)

    I don't think it's that clear a joke. The Unix shell is extremely
    error prone to program, and you should not let a newbie write shell
    programs without careful supervision.

    Nonsense.

    Not the least. - I'm not sure about your background in shell.

    I've been using shells (sh, ksh) daily since 1979. I've contributed code
    to the linux kernel (a kernel debugger called kdb in 1998).
    I've contributed code to Oracle's RDBMS (OS dependent I/O code).
    I've co-written a unix-compatible distributed operating system[*] for a massively
    parallel machine (in C++) and two bare-metal hypervisors which execute
    linux guest operating systems. I spent 6 years on the base working group
    at X/Open and the Open Group working on the XPG standards (which were merged with the POSIX standards a decade ago).

    I won't say that there aren't gotchas when writing shell scripts,
    particularly when used by someone with elevated privileges, but that
    doesn't make them "extremely error prone".

    [*] eventually in partnership with USL (Unix System Labs - i.e. AT&T), Fujitsu, ICL.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to James Kuyper on Wed Jan 31 18:06:13 2024
    On 31.01.2024 17:20, James Kuyper wrote:
    On 1/31/24 07:35, Janis Papanagnou wrote:
    ...
    I associate DEC's VMS with the old DEC VAX-11 system, both
    from around the mid of the 1970's. I programmed on a DEC's
    VAX with VMS obviously before Linus Torvalds started his
    studies. And that was at a time when the DEC VAX and VMS
    were replaced at our sites by Unix systems.

    OK - so it's that association you've got wrong. I know VMS was still
    going strong around 1990 when I was introduced to it. It might have been
    in decline at the time, but it was very far from being gone.

    I am aware that it still existed in the late 1990's, at
    least in the OpenVMS form. I certainly do not have the
    whole market in view. All I can say is what I wrote, that we
    have changed our platforms at that time, and when I shortly
    later switched jobs to the telecommunication area the "whole
    world" (sort of) downsized their machine parks to mainly Unix
    systems or (partly; later commonly, at least for the Clients)
    Windows based systems. And the Unix server installations grew
    enormously during the 1990's, then came the big wave of Linux,
    and all the providers building their farms on Linux basis. DEC
    became invisible [to me] (I was not saying it vanished), similar
    like COBOL did not vanish.

    Yet I don't understand the relation to Linus Torvalds that was
    the source of mentioning VMS. - I mean; only that he dislikes
    it is hardly news. (I've done a few things on a DEC/VAX
    with VMS, and I recall it had a verbose hierarchical file system,
    a horrible screen VT220, and a (literally!) impressive VT100
    keyboard; my fingers still ache when thinking about it. By then
    I already had earlier experience with Unix (VM/UTS), so I
    did know what I got or what I was missing with VMS.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Keith Thompson on Wed Jan 31 17:21:34 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    scott@slp53.sl.home (Scott Lurndal) writes:
    Michael S <already5chosen@yahoo.com> writes:
    [...]
    On only remotely related note, what happens on your system when you
    want more than 1024 files to be open by one program simultaneously?


    $ ulimit -f 2048

    Will increase the limit, to any arbitrary value, subject to system
    wide limits configured by the superuser (system manager).

    That sets the limit for file size. I think you mean "ulimit -n 2048".

    Yes, thank you.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Wed Jan 31 17:20:30 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 31.01.2024 17:18, Michael S wrote:
    On Wed, 31 Jan 2024 17:05:23 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    (See for example 'man ksh' Section "Input/Output". But careful; ksh
    has additional non-standard additions. So a peek into the POSIX docs
    might serve you better.)

    FWIW, the POSIX shell language was based on a subset of ksh88.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Michael S on Wed Jan 31 18:19:47 2024
    On 31.01.2024 14:10, Michael S wrote:
    [ DEC's VMS ]

    Released in 1977.
    Reached the peak of popularity in mid 1980s, when DEC decided to use
    VAX not just as mini/super-mini, but also as competitor to mainframes, effectively killing their earlier mainframe line (PDP-6/10/20).

    This is interesting. In those days all major players here switched to
    Unix systems (in our context specifically AIX and HP-UX), exactly
    to exchange the huge sports halls full of mainframe computers to
    just a small room full of Unix servers.

    [...]

    By value, likely still bigger than all Unixen combined.

    Not sure what (to me strange sounding) ideas you have here.

    I can say the same.

    Sure, so let me expand. The "By value" was what made me doubt. The
    "values" (Real Money) I experienced in the legacy mainframe areas,
    in the financial sector (banks and assurance companies); these were
    not DECs here, and they were hard to replace. - I know that every
    couple years they made their business cases about how they can get
    rid of the mainframes, to no avail. (Don't know how it evolved the
    past 20 years, though.) And later all the ISP computing power went
    in Linux plants, where the money was made. I never observed that
    DEC/VMS was of any importance "by value". If it had some value by
    means I'm not aware of, I take your word as granted.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Janis Papanagnou on Wed Jan 31 19:36:06 2024
    On Wed, 31 Jan 2024 17:29:30 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    I'm not sure why you mentioned 5, whether that's better or worse.
    There's naturally some limit on OS level on the number of parallel
    open file descriptors, but that limit is very high. Mind that you
    can always close unused ones.

    Janis


    Five of each sort, i.e. five inputs and five outputs, sound good to me.

    Much more than five of each sort pre-opened by shell sound like too
    much. If there exist a need for more than five channels for
    communication between complex of programs then this complex of programs
    very likely was designed to work together and only together. And then
    any intervention of the user into communication between them will
    likely do more harm than good.

    Of course, I fully expect that usefully using more than three
    predefined channels of any particular direction would be very rare, but
    I still like five, or at least four, better than three.

    As to not using predefined direction and instead just providing a pool
    of up to 10 pre-open descriptors, this idea didn't cross my mind in
    those particular five minutes that I was writing my initial (yes,
    provocative, yes intentionally so) post. Right now I don't want to
    think whether I like it or not, because I see no good reasons to
    think deeply about this particular water under bridge.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Wed Jan 31 19:05:22 2024
    On 31/01/2024 17:38, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    However, I learned a new trick when checking that I was not mistaken
    about this - it turns out that "less file.pdf" gives a nice text-only
    output from the pdf file (by passing it through "lesspipe"). There's
    always something new to learn from inane conversations on Usenet :-)

    It doesn't necessarily do this by default. See the documentation for
    details (which are of course off-topic here).


    Sure - I investigated to see how it works when I saw it happening, and
    there are clearly many possibilities here. But it's nice when you
    discover a useful and simple feature of an everyday tool that you never
    knew was there.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Janis Papanagnou on Wed Jan 31 19:53:16 2024
    On Wed, 31 Jan 2024 18:19:47 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 31.01.2024 14:10, Michael S wrote:
    [ DEC's VMS ]

    Released in 1977.
    Reached the peak of popularity in mid 1980s, when DEC decided to use
    VAX not just as mini/super-mini, but also as competitor to
    mainframes, effectively killing their earlier mainframe line
    (PDP-6/10/20).

    This is interesting. In those days all major players here switched to
    Unix systems (in our context specifically AIX and HP-UX), exactly
    to exchange the huge sports halls full of mainframe computers to
    just a small room full of Unix servers.


    Wikipedia tells me that AIX has formally existed since 1986, but in reality
    it was just a curiosity until ported to POWER in 1990.
    HP-UX sort of existed, but under different name and was not
    particularly big until 1990 or 1991.

    [...]

    By value, likely still bigger than all Unixen combined.

    Not sure what (to me strange sounding) ideas you have here.

    I can say the same.

    Sure, so let me expand. The "By value" was what made me doubt. The
    "values" (Real Money) I experienced in the legacy mainframe areas,
    in the financial sector (banks and assurance companies); these were
    not DECs here, and they were hard to replace. - I know that every
    couple years they made their business cases about how they can get
    rid of the mainframes, to no avail. (Don't know how it evolved the
    past 20 years, though.) And later all the ISP computing power went
    in Linux plants, where the money was made. I never observed that
    DEC/VMS was of any importance "by value". If it had some value by
    means I'm not aware of, I take your word as granted.

    Janis


    DEC was a big company back then, very solidly the #2 computer business in
    the "West", if we consider Japan as "East". Somehow I remember a figure of
    14B USD in 1991 or 1992, back when #3 was maybe 5 or 6B USD.
    And in the second half of the 80s VAX/VMS was the biggest part of DEC
    by far. Of course, there always were internal struggles and internal competition, but it seems less so in this period than in other periods
    of DEC's history.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Michael S on Wed Jan 31 19:01:32 2024
    On 31.01.2024 17:26, Michael S wrote:
    On Wed, 31 Jan 2024 16:25:30 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    The big advantage of non-GUI is for process automation. With GUI
    oriented applications you can mainly only interactively (=slow and
    cumbersome) do what it provides. Rarely GUI applications support a
    scripting interface, and if so it's then typically some proprietary
    non-standard language.

    I'd take almost any proprietary non-standard GUI macro language over non-proprietary non-standard tcl. They say, Lua is better. I never had motivation to look at it more closely.

    I don't recall ever having used tcl (maybe once, long ago?),
    and I never stumbled across an application (GUI or otherwise)
    where I needed scripting and it provided Lua. Thus
    I cannot help you here; I don't know it either.

    All I can say is that the Unix shell was a reliable companion
    wherever we had to automate tasks on Unix systems or on Cygwin
    enhanced Windows.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Wed Jan 31 19:11:37 2024
    On 31/01/2024 15:46, Janis Papanagnou wrote:
    On 31.01.2024 15:09, David Brown wrote:

    I would expect that the majority of uses of "cat" are with just one
    file,

    And of course just because of ignorance; the majority of (but not all)
    uses with just one file are UUOCs.

    I regularly see it as more symmetrical and clearer to push data left to
    right. So I might write "cat infile | grep foo | sort > outfile". Of
    course I could use "<" redirection, but somehow it seems more natural to
    me to have this flow. I'll use "<" for simpler cases.

    But perhaps this is just my habit, and makes little sense to other people.


    but certainly it is useful when you want to combine files in
    different ways.

    I don't know of any concatenations in "different" ways, but of course
    there's some more of the other usages that are supported by options.


    Different orders for the files, and different subsets of the same set of
    files.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Wed Jan 31 19:20:16 2024
    On 31/01/2024 16:25, Janis Papanagnou wrote:
    On 31.01.2024 15:21, David Brown wrote:
    On 31/01/2024 09:36, Malcolm McLean wrote:
    [ I snipped a couple of "I actually don't know/need it" things ]

    But now it's effectively a programming language, and, from the example
    code, a very poorly designed one which is cryptic and fussy and liable
    to be hard to maintain. So it's better to use a language like Perl to
    achieve the same thing, and I did have a few Perl scripts handy for
    repetitive jobs of that nature in my Unix days.

    That gave me a laugh! You think bash is cryptic, fussy and poorly
    designed, and choose /Perl/ as the alternative :-)

    I don't think it's that clear a joke. The Unix shell is extremely
    error prone to program, and you should not let a newbie write shell
    programs without careful supervision. ("newbie" [in shell context]
    = less than 10 years of practical experience. - Am I exaggerating?
    Maybe. But not much.)


    I'm not a great fan of shell programming - anything advanced, and I tend
    to reach for Python. But I think that is a matter of familiarity and
    practice. But if you consider bash programming as difficult to get
    right, I'll not argue.

    Perl is famously known as a "write-only" language. Sure, it is possible
    to write good, clear, maintainable Perl code - but few people do that.

    Thus the idea that finding bash cryptic or difficult and using Perl
    instead is the joke.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Scott Lurndal on Wed Jan 31 19:15:56 2024
    On 31/01/2024 16:22, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 31/01/2024 09:36, Malcolm McLean wrote:

    The reason is that I'd only run the command once, and it's so likely
    that there will be either a syntax misunderstanding or a typing error
    that I'd have to test to ensure that it was right. And by the time
    you've done that any time saved by typing only one commandline is lost.
    Of course if you are writing scripts then that doesn't apply. But now
    it's effectively a programming language, and, from the example code, a
    very poorly designed one which is cryptic and fussy and liable to be
    hard to maintain. So it's better to use a language like Perl to achieve
    the same thing, and I did have a few Perl scripts handy for repetitive
    jobs of that nature in my Unix days.


    That gave me a laugh! You think bash is cryptic, fussy and poorly
    designed, and choose /Perl/ as the alternative :-)

    You admit this with "not tested". Says it all. "Understanding Unix" is
    an intellectually useless achievement. You might have to do it if you
    have to use the system and debug and trouble shoot. But it's nothing to
    be proud about.


    It is "useless" for people who don't use it. For people who /do/ use
    it, it is very useful.

    I've used sequences like Tim's - it's a way to copy data remotely from a
    different machine. I would likely write it slightly differently - I'd
    probably do the mkdir and cd first, thus avoiding the need for a
    subshell, and I'd use "ssh -C" or "tar -z" to do the compression rather
    than "gzip".

    There's no doubt that the learning curve is longer for doing this sort
    of thing from the command line than using gui programs. There is also
    no doubt that when you are used to it, command line utilities and a good
    shell are very flexible and efficient.

    Learn to use the tools that are conveniently available, and then pick
    the right tool for the job - whether it is command line or gui.

    And there is often more than one tool for the job, e.g. rsync(1)
    for copying data remotely.

    Or sshfs then "cp -r". There's often more than two tools for the job!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Janis Papanagnou on Wed Jan 31 19:22:50 2024
    On 31.01.2024 17:47, Janis Papanagnou wrote:

    It's just the numbers of file descriptors and whether it's an input >
    or output < channel, or even a read/write channel <> .

    In case the typo wasn't obvious and detected as such, please swap them
    < input
    > output
    <> in/out
    >> append
    << here-doc
    etc.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Wed Jan 31 19:24:58 2024
    On 31/01/2024 19:01, Janis Papanagnou wrote:
    On 31.01.2024 17:26, Michael S wrote:
    On Wed, 31 Jan 2024 16:25:30 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    The big advantage of non-GUI is for process automation. With GUI
    oriented applications you can mainly only interactively (=slow and
    cumbersome) do what it provides. Rarely GUI applications support a
    scripting interface, and if so it's then typically some proprietary
    non-standard language.

    I'd take almost any proprietary non-standard GUI macro language over
    non-proprietary non-standard tcl. They say, Lua is better. I never had
    motivation to look at it more closely.

    I don't recall to have ever used tcl (maybe once, long ago?),
    and I never stumbled across an application (GUI or otherwise)
    where I needed scripting and it would have provided Lua. Thus
    I cannot help you here, I either don't know it.


    TCL is - for reasons beyond my ken - the standard scripting language
    used by several programmable logic design suites. Your code is a
    mixture of VHDL and/or Verilog (and/or higher level HDL languages)
    and/or schematic or block diagrams, and it goes through a range of
    analysers, test simulators, placement and routing systems. There's
    plenty of gui programs involved and non-gui programs, and the whole
    thing is tied together with TCL scripting.

    I have no idea if that's the kind of system Michael is referring to, but
    it is certainly used for that kind of thing.

    All I can say is that the Unix shell was a reliable companion
    wherever we had to automate tasks on Unix systems or on Cygwin
    enhanced Windows.


    Automation is certainly easier with good scripting - whatever the
    language or shell.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Michael S on Wed Jan 31 19:47:43 2024
    On 31.01.2024 18:36, Michael S wrote:
    On Wed, 31 Jan 2024 17:29:30 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    I'm not sure why you mentioned 5, whether that's better or worse.
    There's naturally some limit on OS level on the number of parallel
    open file descriptors, but that limit is very high. Mind that you
    can always close unused ones.

    Five of each sort, i.e. five inputs and five outputs, sound good to me.

    Well, to me it sounds like an arbitrary number, like four or seven.

    But how are these pre-opened FDs then used? To which devices assigned?

    We already have what most folks seem to use pre-opened: 0, 1, 2

    When I need a new one I couple it with an existing one

    exec 3>&1 4>&1 5>&1 ## but why open it in advance per default?

    or to a file (where we may have a concrete demand for a second channel)

    exec 3< ./my_input
    exec 4> /tmp/my_output

    and still using channel 0 and 1 (here still connected to the default)
    in parallel.


    Much more than five of each sort pre-opened by shell sound like too
    much. If there exist a need for more than five channels for
    communication between complex of programs then this complex of programs
    very likely was designed to work together and only together. And then
    any intervention of the user into communication between them will
    likely do more harm than good.

    Of course, I fully expect that usefully using more than three
    predefined channels of any particular direction would be very rare, but
    I still like five, or at least four, better than three.

    You should have an idea, though, to what they should initially point
    to, and why you want them differentiated.


    As to not using predefined direction and instead just providing a pool
    of up to 10 pre-open descriptors, this idea didn't cross my mind in
    those particular five minutes that I was writing my initial (yes, provocative, yes intentionally so) post. Right now I don't want to
    think whether I like it or not, because I see no good reasons to
    think deeply about this particular water under bridge.

    I don't think I understand what you wanted to say here.
    (A pool is there, the shell manages it for you.)

    As an amendment, in ksh there's also the feature to not manually
    assign file descriptor numbers but automatically get them from
    the shell by prepending '{var}' to the redirection; here some FD
    greater than 10 will be chosen by that shell and its number
    stored in 'var'. This may be useful for applications that may
    e.g. serve a couple equivalent communication partners.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Harnden@21:1/5 to Janis Papanagnou on Wed Jan 31 19:43:18 2024
    On 31/01/2024 13:47, Janis Papanagnou wrote:
    On 30.01.2024 20:39, Richard Harnden wrote:

    Nobody uses printf to output binary data. fwrite(3) would be common, as
    would write(2).

    Right. I'm using the OS's write(2), but also printf with ANSI escapes,
    e.g. sprintf (buf, "\033[%d;%dH", ...

    I meant 'binary' as in has \0s

    It seems to work fine with ESC's and utf8 (and I abuse it thus often)
    ... but, from what James said, that is not actually guaranteed.



    Maybe you could use printf("%c%c%c" ... but it'd be beyond tedious.

    Since I recall having used it in some thread I want to clarify that
    it was just meant as an example countering an incorrect argument of
    "not being able to output binary data on stdout", or some such.

    Janis


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Keith Thompson on Wed Jan 31 20:25:59 2024
    On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Terminal control sequences (almost always based on VT100 these days) are typically not printable, but tend to avoid null characters, which means
    you can very probably use printf to print them (assuming you're on a POSIX-like system).

    They use text. For instance, a cursor position is both accepted and
    reported in a decimal format like 13;17. All the commands and
    delimiting characters are textual, except for part of the CSI (control
    sequence introducer). The 7 bit CSI uses two characters, ESC and [.
    Except for that one ESC, everything is printable.
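 
    A tiny sketch of that, assuming a VT100/ANSI-compatible terminal:
    apart from the single ESC byte, everything printf emits here is
    ordinary printable text:
 
        #include <stdio.h>
 
        int main(void)
        {
            int row = 13, col = 17;          /* arbitrary example coordinates */
 
            /* CSI is ESC '['; the rest ("13;17H") is plain decimal text. */
            printf("\033[%d;%dH", row, col); /* move cursor to row 13, column 17 */
            printf("here\n");
            return 0;
        }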

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Malcolm McLean on Wed Jan 31 23:36:43 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed through systems designed to handle
    human-readable text.

    Maybe you are not used to a system where it's trivial to inspect such
    data. When "some_prog" produces data that are not compatible with the
    current terminal settings, "some_prog | hd" shows a hex dump instead.
    The need to do this does not make "some_prog" poorly designed. It may
    simply mean that the output is /intended/ for further processing.

    For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting
    for next data byte". Obviously this will cause difficulties if the data
    is binary.
    Also many binary formats can't easily be extended, so you can pass one
    image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to
    extend.
    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary ouput.

    What is your evidence? stdout was just designed for output (as far as I
    can tell) and, anyway, what is the distinction you are making between
    binary and text? iconv --from ASCII --to EBCDIC-UK will produce
    something that is "logically" text on stdout, but it might look like
    binary to you.

    An example where it's really useful not to care: I have a suite of tools
    for doing toy cryptanalysis. Some apply various transformations and/or
    filters to byte streams and others collect and output (on stderr)
    various statistics. Plugging them together in various pipelines is very
    handy when investigating an encrypted text. The output is almost always "binary" in the sense that there would be no point in looking at it on a terminal.

    According to you, these tools are poorly designed. I don't think so.
    How would you design them? Endless input and output file names to be
    juggled and tidied up afterwards?

    That you couldn't actually mount a defence of your position whilst I could also strongly implies that I am right.

    There is also the possibility that readers are just exhausted by the
    fire-hose of bizarre, unsubstantiated opinions coming from you. I've
    been tied up with personal matters recently, but I've seen many posts
    from you where I have been tempted to make a one-word reply like "no", "nonsense" or "daft", but I see that you would have taken these to be
    strong confirmation that you are right!

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Thu Feb 1 00:47:35 2024
    bart <bc@freeuk.com> writes:
    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    According to you, these tools are poorly designed. I don't think so.
    How would you design them? Endless input and output file names to be
    juggled and tidied up afterwards?

    I think they're poorly designed too.

    Of course you do. They're not bart programs.


    From the POV of interactive console programs, they /are/ poor.

    You don't provide any reason why - do elucidate!


    But the
    mistake is thinking that they are actual programs or commands, when
    really they are just filters. They are not designed to be standalone commands.

    Even 'cat', if I type it by itself, just sits there.

    It is _designed_ to do that. If that's not the behavior you
    want, don't use that command.

    If you read the command documentation, you would understand
    what the behavior of the command would be in that context
    and realize that it would be pointless.

    Although, it is actually useful, since it echos back what
    you type, for e.g. checking serial lines, terminal emulators,
    etc.

    (I wonder what use
    it has in a sequence like ... | cat | ...; what does it add to the data?)

    The manual page provides the answer to that question, and it
    should be obvious even to the meanest programmer.


    AFAICS, this stuff mainly works inside scripts. Or do people here spend
    all day manually piping stuff between programs?

    Yes and Yes.


    As for alternatives, I don't know.

    Indeed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Ben Bacarisse on Thu Feb 1 00:21:32 2024
    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed though systems designed to handle
    human-readable text.

    Maybe you are not used to a system where it's trivial to inspect such
    data. When "some_prog" produces data that are not compatible with the current terminal settings, "some_prog | hd" shows a hex dump instead.
    The need to do this does not make "some_prog" poorly designed. It may
    simply mean that the output is /intended/ for further processing.

    For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting
    for next data byte". Obviously this will cause difficuties if the data >>>> is binary.
    Also many binary formats can't easily be extended, so you can pass one >>>> image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to
    extend.
    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary output.

    What is your evidence? stdout was just designed for output (as far as I
    can tell) and, anyway, what is the distinction you are making between
    binary and text? iconv --from ASCII --to EBCDIC-UK will produce
    something that is "logically" text on stdout, but it might look like
    binary to you.

    An example where it's really useful not to care: I have a suite of tools
    for doing toy cryptanalysis. Some apply various transformations and/or filters to byte streams and others collect and output (on stderr)
    various statistics. Plugging them together in various pipelines is very handy when investigating an encrypted text. The output is almost always "binary" in the sense that there would be no point in looking at it on a terminal.

    According to you, these tools are poorly designed. I don't think so.
    How would you design them? Endless input and output file names to be
    juggled and tidied up afterwards?

    I think they're poorly designed too.

    From the POV of interactive console programs, they /are/ poor. But the
    mistake is thinking that they are actual programs or commands, when
    really they are just filters. They are not designed to be standalone
    commands.

    Even 'cat', if I type it by itself, just sits there. (I wonder what use
    it has in a sequence like ... | cat | ...; what does it add to the data?)

    AFAICS, this stuff mainly works inside scripts. Or do people here spend
    all day manually piping stuff between programs?

    As for alternatives, I don't know. There are any number of ways this
    could be done. But if everyone has become inured to this piping
    business, they will not be receptive to anything different.

    Here however are some ideas:

    * Have versions of these tools for use as filters with no UI, just
    a default input and output, and versions for interactive use with
    helpful prompts. Or even just a sensibly named output! Instead of
    every program writing a.out.

    * Have a concept of a current block of data, analogous to a clipboard.
    Then separate commands can load data, sort it, count it, display it,
    write it etc, with no need for intermediate named files.

    But I'd be happier if this was all contained within a separate
    application from an OS shell program.

    Such things are ludicrously easy to write. I just did one in 15 minutes
    and 50 lines of code, but set up for line-based text files. This is it
    in action; the test input contains 4 lines "one two three four":

    Type exit quit or q to finish

    > load fred
    Data loaded

    > lc
    4 lines

    > list
    1 one
    2 two
    3 three
    4 four

    > rev
    Reversed

    > list
    1 four
    2 three
    3 two
    4 one

    > sort
    Sorted

    > upper

    > list
    1 FOUR
    2 ONE
    3 THREE
    4 TWO

    > save bill
    Written to bill

    > q

    There are any number of transformations that can be applied without
    needing to name and write intermediate files. The commands can even just
    invoke the same filters I mentioned, but now within a friendlier UI.

    A more sophisticated version can permanently show the data without my
    having to type 'list'.
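    For what it's worth, here is a rough C sketch of that kind of tool,
    reconstructed from the session transcript above (the command names
    follow the transcript; everything else is guesswork, and the error
    handling is minimal):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <ctype.h>

    #define MAXLINES 10000

    static char *lines[MAXLINES];
    static int nlines;

    static char *dupstr(const char *s)     /* portable strdup substitute */
    {
        char *p = malloc(strlen(s) + 1);
        if (p) strcpy(p, s);
        return p;
    }

    static int cmp(const void *a, const void *b)
    {
        return strcmp(*(char *const *)a, *(char *const *)b);
    }

    int main(void)
    {
        char cmd[512];

        puts("Type exit quit or q to finish");
        for (;;) {
            printf("> ");
            fflush(stdout);
            if (!fgets(cmd, sizeof cmd, stdin))
                break;

            char name[16] = "", arg[256] = "";
            sscanf(cmd, "%15s %255s", name, arg);

            if (!strcmp(name, "q") || !strcmp(name, "quit") ||
                !strcmp(name, "exit")) {
                break;
            } else if (!strcmp(name, "load")) {
                /* lines from an earlier load are simply leaked here */
                FILE *f = fopen(arg, "r");
                char buf[512];
                if (!f) { printf("Cannot open %s\n", arg); continue; }
                nlines = 0;
                while (nlines < MAXLINES && fgets(buf, sizeof buf, f))
                    lines[nlines++] = dupstr(buf);
                fclose(f);
                puts("Data loaded");
            } else if (!strcmp(name, "lc")) {
                printf("%d lines\n", nlines);
            } else if (!strcmp(name, "list")) {
                for (int i = 0; i < nlines; i++)
                    printf("%d %s", i + 1, lines[i]);
            } else if (!strcmp(name, "rev")) {
                for (int i = 0; i < nlines / 2; i++) {
                    char *t = lines[i];
                    lines[i] = lines[nlines - 1 - i];
                    lines[nlines - 1 - i] = t;
                }
                puts("Reversed");
            } else if (!strcmp(name, "sort")) {
                qsort(lines, nlines, sizeof *lines, cmp);
                puts("Sorted");
            } else if (!strcmp(name, "upper")) {
                for (int i = 0; i < nlines; i++)
                    for (char *p = lines[i]; *p; p++)
                        *p = (char)toupper((unsigned char)*p);
            } else if (!strcmp(name, "save")) {
                FILE *f = fopen(arg, "w");
                if (!f) { printf("Cannot open %s\n", arg); continue; }
                for (int i = 0; i < nlines; i++)
                    fputs(lines[i], f);
                fclose(f);
                printf("Written to %s\n", arg);
            } else if (name[0]) {
                printf("Unknown command: %s\n", name);
            }
        }
        return 0;
    }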

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Thu Feb 1 00:49:53 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 31/01/2024 16:33, Keith Thompson wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 31/01/2024 07:18, Tim Rentsch wrote:
    [...]
    It is FAR more cumbersome to accomplish what this command
    is doing without sending binary data through standard out.
    Anyone who doesn't understand this doesn't understand Unix.

    Yes. I don't do that sort of thing.

    You don't. Others do. What was your point again?

    There's kind of an implication that it's something I ought to be doing.

    No, we're saying that what you do or don't do is irrelevant. Nobody is
    forcing you to do anything other than acknowledge that it is a useful capability that others use regularly.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Scott Lurndal on Thu Feb 1 01:29:44 2024
    On 01/02/2024 00:47, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    According to you, these tools are poorly designed. I don't think so.
    How would you design them? Endless input and output file names to be
    juggled and tidied up afterwards?

    I think they're poorly designed too.

    Of course you do. They're not bart programs.


    From the POV of interactive console programs, they /are/ poor.

    You don't provide any reason why - do elucidate!

    They only do one thing, like you can't first do A, then B. They don't
    give any prompts. They often apparently do nothing (so you can't tell if they're busy, waiting for input, or hanging). There is no dialog.

    But I'm sure this has been mentioned a few times.

    NONE of my console apps are that poor. But then I've written programs
    with proper CLIs and GUIs that had to be used by non-programmers.


    AFAICS, this stuff mainly works inside scripts. Or do people here spend
    all day manually piping stuff between programs?

    Yes and Yes.

    OK. I spend most of my creative time on my machine doing actual coding.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Keith Thompson on Thu Feb 1 01:23:07 2024
    On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Kaz Kylheku <433-929-6894@kylheku.com> writes:
    On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Terminal control sequences (almost always based on VT100 these days) are typically not printable, but tend to avoid null characters, which means
    you can very probably use printf to print them (assuming you're on a
    POSIX-like system).

    They use text. For instance, a cursor position is both accepted and
    reported in a decimal format like 13;17. All the commands and
    delimiting characters are textual, except for part of the CSI (control
    sequence introducer). The 7 bit CSI uses two characters, ESC and [.
    Except for that one ESC, everything is printable.

    I'd describe "printable except for ESC" as binary. And some sequences
    use other non-printable characters like ASCII BEL (Ctl-G) (perhaps not
    VT100 standard, but for example commands to change fonts and colors for xterm).

    But ESC is related text; it's a character described in ASCII used for
    signaling in the middle of text, which is what it's doing here.

    I'd say that if the terminal control sequences contained features like:

    - records delimited by length (number of bytes) rather than any stable
    delimiter, with an emphasis on either fixed length data structures,
    or else length + data encodings for variable ones.

    - numbers encoded in binary: for instance screen coordinates (5,257)
    being encoded as bytes 05 00 01 01 (little endian, 16 bit)
    rather than the text "5;257"

    ... then that would be "binary data".

    In the terminal sequences, when the byte 27/0x1B occurs, it is
    always ESC.

    In binary data, when such a byte value occurs, it might be anything,
    with any meaning. It could be part of a number, e.g. lower byte of
    0x1D1B. It could be some flags meaning 00011101, with various meanings.

    Binary data has the characteristic that it's not concerned with the
    character interpretation of many of the bytes, other than ones which are embedded text fields. Thus in many parts of the binary data, any
    character can potentially occur by chance. An 8 bit flags field can
    reproduce any character, given the right combination of the flag values.
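    A tiny C illustration of that distinction, using the coordinate
    example above (the 16-bit little-endian layout is exactly the one
    described; nothing else is implied):

    #include <stdio.h>

    int main(void)
    {
        unsigned x = 5, y = 257;

        /* "Binary": each value as a 16-bit little-endian field,
           giving the bytes 05 00 01 01. */
        unsigned char bin[4] = {
            x & 0xFF, (x >> 8) & 0xFF,
            y & 0xFF, (y >> 8) & 0xFF
        };

        /* "Text": the same pair as the characters "5;257". */
        char txt[16];
        sprintf(txt, "%u;%u", x, y);

        for (int i = 0; i < 4; i++)
            printf("%02X ", bin[i]);
        printf(" vs \"%s\"\n", txt);
        return 0;
    }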

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Kaz Kylheku on Thu Feb 1 01:34:06 2024
    Kaz Kylheku <433-929-6894@kylheku.com> writes:
    On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Kaz Kylheku <433-929-6894@kylheku.com> writes:
    On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Terminal control sequences (almost always based on VT100 these days) are >>>> typically not printable, but tend to avoid null characters, which means >>>> you can very probably use printf to print them (assuming you're on a
    POSIX-like system).

    They use text. For instance, a cursor position is both accepted and
    reported in a decimal format like 13;17. All the commands and
    delimiting characters are textual, except for part of the CSI (control
    sequence introducer). The 7 bit CSI uses two characters, ESC and [.
    Except for that one ESC, everything is printable.

    I'd describe "printable except for ESC" as binary. And some sequences
    use other non-printable characters like ASCII BEL (Ctl-G) (perhaps not
    VT100 standard, but for example commands to change fonts and colors for
    xterm).

    But ESC is related text; it's a character described in ASCII used for signaling in the middle of text, which is what it's doing here.

    So are most of the other ASCII codes less than 0x20. Including
    file and record delimiters and shift-in/shift-out.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Richard Harnden on Thu Feb 1 00:45:11 2024
    On 1/31/24 14:43, Richard Harnden wrote:
    On 31/01/2024 13:47, Janis Papanagnou wrote:
    On 30.01.2024 20:39, Richard Harnden wrote:

    Nobody uses printf to output binary data. fwrite(3) would be common, as
    would write(2).

    Right. I'm using the OS'es write(2), but also printf with ANSI escapes,
    e.g. sprintf (buf, "\033[%d;%dH", ...

    I meant 'binary' as in has \0s

    It seems to work fine with ESCs and UTF-8 (and I abuse it thus often)
    ... but, from what James said, that is not actually guaranteed.

    You're right. I forgot to point it out, but it's clear that null
    characters also invalidate that guarantee.

    The simplest way in which that guarantee could fail would be that any of
    the prohibited characters would simply be dropped, either while writing
    in text mode or when reading them back in text mode. However, that would
    be pretty arbitrary.

    A more subtle alternative is that some or all of the prohibited
    characters are used by the operating system to control how the other
    contents of text files are interpreted when reading the file. Examples: backspace characters could be removed, along with the immediately
    preceding characters. Carriage return characters could be removed along
    with the entire preceding line. Vertical tabs and form feeds could be
    replaced with an appropriate number of newline characters. I don't know
    if any real operating system does any of those things with text files,
    but if it did, that would not prevent a fully conforming implementation
    of C on that platform, thanks to the clause I cited.
    However, I also know of one real-life example, though an obscure one:
    text files are stored in fixed-size blocks, with each line starting with
    a count of the blocks it occupies. Newline characters are converted into
    spaces that pad out the end of the last block - the net result is that
    spaces immediately preceding a newline would be indistinguishable from
    the padding, and would therefore get dropped when reading the text file
    back in. The existence of such systems is precisely why "no new-line
    character is immediately preceded by space characters" is one of the specifications for text files.

    In that case, the I/O routines have two options: they can remove those characters when writing the text to prevent them from being
    misinterpreted, which would be how the text read in fails to match the
    text that was written. The other alternative is to let those characters
    through, and let them be misinterpreted when read back, which could
    produce arbitrarily bizarre consequences (NOT limited to the examples I
    gave).

    A similar fixed-size block scheme with null characters padding out the
    end of the last block is the reason why the standard says that a binary
    "stream may, however, have an implementation-defined number of null
    characters appended to the end of the stream" when it is read back in.
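    A small C sketch of the point being made here: write the same bytes
    through a text stream and through a binary stream and compare what
    comes back. Only the binary stream is guaranteed to return the bytes
    that were written (apart from possibly appended null characters); the
    file names are just placeholders.

    #include <stdio.h>
    #include <string.h>

    /* Bytes a text stream is not required to preserve: an embedded NUL,
       an ESC, and spaces immediately before a newline. */
    static const char data[] = { 'a', '\0', '\033', ' ', ' ', '\n', 'b' };

    static int roundtrip(const char *name, const char *wmode, const char *rmode)
    {
        char back[sizeof data];
        size_t n;
        FILE *f = fopen(name, wmode);

        if (!f) return -1;
        fwrite(data, 1, sizeof data, f);
        fclose(f);

        f = fopen(name, rmode);
        if (!f) return -1;
        n = fread(back, 1, sizeof back, f);
        fclose(f);
        return n == sizeof data && memcmp(back, data, n) == 0;
    }

    int main(void)
    {
        printf("text stream round trip ok:   %d\n",
               roundtrip("demo.txt", "w", "r"));
        printf("binary stream round trip ok: %d\n",
               roundtrip("demo.bin", "wb", "rb"));
        return 0;
    }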

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Keith Thompson on Thu Feb 1 00:52:24 2024
    On 1/31/24 15:14, Keith Thompson wrote:
    Richard Harnden <richard.nospam@gmail.invalid> writes:
    On 31/01/2024 13:47, Janis Papanagnou wrote:
    On 30.01.2024 20:39, Richard Harnden wrote:

    Nobody uses printf to output binary data. fwrite(3) would be common, as >>>> would write(2).
    Right. I'm using the OS'es write(2), but also printf with ANSI
    escapes,
    e.g. sprintf (buf, "\033[%d;%dH", ...

    I meant 'binary' as in has \0s

    I don't think that's what "binary" means.

    The standard defines text and binary streams. I think it would be
    reasonable to call a file binary if it violates any of the requirements
    for a text stream to read back the same way it was written out. If so,
    null and ESC characters both qualify.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Thu Feb 1 11:34:22 2024
    On 31/01/2024 19:35, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 31/01/2024 15:46, Janis Papanagnou wrote:
    On 31.01.2024 15:09, David Brown wrote:

    I would expect that the majority of uses of "cat" are with just one
    file,

    And of course just because of ignorance; the majority of (but not all)
    uses with just one file are UUOCs.

    I regularly see it as more symmetrical and clearer to push data left
    to right. So I might write "cat infile | grep foo | sort > outfile".
    Of course I could use "<" redirection, but somehow it seems more
    natural to me to have this flow. I'll use "<" for simpler cases.

    But perhaps this is just my habit, and makes little sense to other people.

    You can also use:

    < infile grep foo | sort > outfile

    Redirections don't have to be written after a command.


    I did not know you could write it that way - thanks for another
    off-topic, but useful, tip.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Thu Feb 1 11:30:56 2024
    On 31/01/2024 14:35, Janis Papanagnou wrote:
    On 31.01.2024 14:05, David Brown wrote:

    But it is correct that English has become the main language for
    international communication, and is therefore critical for anything that
    involves cross-border communication, or where there are significant
    numbers of foreign workers. That includes academic work. Different
    parts of Europe previously used German or Russian for this,

    Don't forget the importance of French! - The whole postal and telecommunication sectors were (and probably still are) massively
    influenced by France.

    The French would never let the Anglophiles forget the importance of
    French :-)


    (You're always writing so much text, so I'll skip it and avoid
    more comments.)

    Just two (unrelated) notes concerning statements I've seen
    somewhere in the thread (maybe here as well)...

    First; the EU publishes in all languages of the member states,
    for example. (There's no single lingua franca.)

    Weirdly, while Norway is not in the EU but Sweden and Denmark are, they
    publish (for some things at least) in Norwegian but not in Swedish or
    Danish. The Danes and the Swedes don't like each other's languages, but
    are happy enough with Norwegian so they use that to save a little time
    and money.


    And the second note; we have to distinguish the language of the
    programming language's keywords, the comments in the source
    code, and the language(s) used for user-interaction.


    Absolutely. That is a critical distinction.

    I don't know whether there are programming languages that use
    native (non-English) keywords, but I'd suppose so, since in the past
    I've seen some using preprocessors for a "native language"
    source code. So while not typical, probably a demand at some
    places. (Elsethread I mentioned the German TR440 commands,
    but a [primitive] command language, as opposed to, say, the
    Unix shell, I don't consider much as a language.)


    Apparently there are a few - including, of course, one in French. But I
    have not heard of any that are relevant in modern times.

    The comments' language varies, in my experience. Sometimes
    there are coding standards (that demand the native language, or
    that demand English), sometimes it's not defined. Myself I'm
    reluctant to switch between languages and stay with English.
    But there were also other cases with longer descriptions on
    a conceptual basis; if you come from a native language's
    perspective it can be better to stay with the language of the
    specification instead of introducing sources of misunderstanding.

    The user interface, finally, is of course as specified, and can
    be anything, or even multi-lingual.


    That's my experience too.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Thu Feb 1 14:02:12 2024
    On 01.02.2024 06:24, Malcolm McLean wrote:
    On 31/01/2024 23:36, Ben Bacarisse wrote:

    Maybe you are not used to a system where it's trivial to inspect such
    data. When "some_prog" produces data that are not compatible with the
    current terminal settings, "some_prog | hd" shows a hex dump instead.
    The need to do this does not make "some_prog" poorly designed. It may
    simply mean that the output is /intended/ for further processing.

    Well almost by definition binary output is intended for further
    processing. Binary audio files must ultimately be converted to analogue
    if anyone is to listen to them, for example.

    Well, not necessarily. Let's leave the typical use case for a moment...

    It might also be analyzed and converted to a digitally represented
    formula, say some TeX code, or e.g. like the formal syntax that the
    lilypond program uses.

    I had to check how to do a hex dump on the system I'm typing this on.
    The name of the hex dumper is xxd instead of hd, but otherwise it works
    the same way and will accept piped data. But the fact I had to look it
    up tells you that I've never actually used it.

    Well, there's always the old Unix standard tool, 'od'.

    I use that without thinking or looking it up, since it has always been
    there, even though I only rarely use it.

    And you observed correctly that nowadays there's typically even more
    than one tool available. (And Bart will probably write his own tool. :-)

    The two problems with hex
    dumps are that you've got to do mental arithmetic to convert 8 bit hex
    values into 16 or 32 bit fields,

    Hmm.. - have you inspected the man pages of the tools?

    At least for 'od' I know it's easy per option...
    od -c file # characters (or escapes and octals)
    od -t x1 file # hex octets
    od -t x2 file # words (two octets)
    od -c -t x1 file # characters and octets

    and that once you get a variable length
    field, it's virtually impossible to keep track of and match up the
    following fields.

    An inherent property of binary data. In that case you need data-specific applications.

    So in reality what I do when troubleshooting binary
    data is to write a scratch program, or, more often because the trouble
    is in the existing parser, put diagnostics in an existing parser to print
    out a few fields and inspect them that way.

    That's fine.

    Of course to check that
    audio or image data is right you have to listen to it or view it - you
    can't tell from looking at the individual samples.

    Depends on the sort of check, and the solution approach. (See my
    lilypond example.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Thu Feb 1 05:17:34 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Tue, 30 Jan 2024 23:18:21 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    [..on sending binary to standard out..]

    Simple example (disclaimer: not tested):

    ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
    (mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)

    Of the five main programs in this command, four are using
    standard out to send binary data:

    tar -cf - .
    gzip -c
    ssh foo [...]
    gunzip -c

    The tar -xf - at the end reads binary data on standard in
    but doesn't output any (or anything else for that matter).

    It is FAR more cumbersome to accomplish what this command
    is doing without sending binary data through standard out.

    If I am not mistaken, tar, gzip and gunzip do not write binary
    data to standard output by default. [...]

    What I think you mean is that these programs don't send their
    primary processing output to standard out unless specifically
    directed to do so. Well sure. I don't think that takes away from
    the point that it is useful to use standard out for a binary output
    stream.

    Anyone who doesn't understand this doesn't understand Unix.

    Frankly, Unix redirection racket looks like something hacked
    together rather than designed as result of the solid thinking
    process. As long as there were only standard input and output it
    was sort of logical. But when they figured out that it is
    insufficient, they had chosen a quick hack instead of
    constructing a solution that wouldn't offend engineering senses
    of any non-preconditioned observer.

    First I think you are being too harsh on the people who originally
    came up with the pipe/redirection mechanism. Considering the historical
    context it was a big step forward, and a good match to available
    processing resources at the time.

    That said, no one is claiming that the single pipe/redirection
    mechanism is the be-all and end-all, or that it solves all the
    world's problems. But it does do a good job of solving a
    significant subclass of the world's problems, and in that context
    provides a good motivating example for using standard out for
    binary data as well as textual data.
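    The programs in that pipeline all have the same shape: read bytes from
    standard input, write bytes to standard output, and assume nothing
    about the data being text. A stripped-down C filter of that shape,
    shown here only as an illustration (isatty() and fileno() are POSIX,
    not standard C; the terminal check mimics what gzip does):

    #include <stdio.h>
    #include <unistd.h>   /* isatty - POSIX */

    int main(void)
    {
        unsigned char buf[1 << 16];
        size_t n;

        /* Refuse to dump raw binary onto an interactive terminal. */
        if (isatty(fileno(stdout))) {
            fprintf(stderr, "refusing to write binary data to a terminal\n");
            return 1;
        }
        /* On POSIX systems stdin/stdout are already byte streams, so
           no special "binary mode" is needed here. */
        while ((n = fread(buf, 1, sizeof buf, stdin)) > 0)
            fwrite(buf, 1, n, stdout);
        return 0;
    }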

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Malcolm McLean on Thu Feb 1 05:24:50 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 31/01/2024 07:18, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might >>>>> have unusual effects if passed though systems designed to handle
    human-readable text. For instance in some systems designed to receive >>>>> ASCII text, there is no distinction between the nul byte and "waiting >>>>> for next data byte". Obviously this will cause difficuties if the data >>>>> is binary.
    Also many binary formats can't easily be extended, so you can pass one >>>>> image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to >>>>> extend.

    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    [...]

    Simple example (disclaimer: not tested):

    ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
    (mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)

    Of the five main programs in this command, four are using
    standard out to send binary data:

    tar -cf - .
    gzip -c
    ssh foo [...]
    gunzip -c

    The tar -xf - at the end reads binary data on standard in
    but doesn't output any (or anything else for that matter).

    It is FAR more cumbersome to accomplish what this command
    is doing without sending binary data through standard out.
    Anyone who doesn't understand this doesn't understand Unix.

    Yes. I don't do that sort of thing.
    While I have used Unix, it is as a platform for interactive programs
    which work on graphics, or a general C compilation environment. I
    don't build pipelines to do that sort of data processing. If I had to download a tar file I'd either use a graphical tool or type several
    commands into the shell, each launching a single executable,
    interactively.

    The reason is that I'd only run the command once, and it's so likely
    that there will be either a syntax misunderstanding or a typing error
    that I'd have to test to ensure that it was right. And by the time
    you've done that any time saved by typing only one commandline is
    lost. Of course if you are writing scripts then that doesn't
    apply. But now it's effectively a programming language, and, from the example code, a very poorly designed one which is cryptic and fussy
    and liable to be hard to maintain. So it's better to use a language
    like Perl to achieve the same thing, and I did have a few Perl scripts
    handy for repetitive jobs of that nature in my Unix days.

    You admit this with "not tested". Says it all. "Understanding Unix" is
    an intellectually useless achievement. You might have to do it if you
    have to use the system and debug and trouble shoot. But it's nothing
    to be proud about.

    You're an idiot. As usual trying to have a useful discussion
    with you has turned out to be a complete waste of time.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Thu Feb 1 14:42:34 2024
    bart <bc@freeuk.com> writes:
    On 01/02/2024 00:47, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    According to you, these tools are poorly designed. I don't think so.
    How would you design them? Endless input and output file names to be
    juggled and tidied up afterwards?

    I think they're poorly designed too.

    Of course you do. They're not bart programs.


    From the POV of interactive console programs, they /are/ poor.

    You don't provide any reason why - do elucidate!

    They only do one thing, like you can't first do A, then B. They don't
    give any prompts. They often apparently do nothing (so you can't tell if they're busy, waiting for input, or hanging). There is no dialog.

    Those are features, not defects.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Thu Feb 1 14:45:31 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might >>>>> have unusual effects if passed though systems designed to handle
    human-readable text.

    Maybe you are not used to a system where it's trivial to inspect such
    data. When "some_prog" produces data that are not compatible with the
    current terminal settings, "some_prog | hd" shows a hex dump instead.
    The need to do this does not make "some_prog" poorly designed. It may
    simply mean that the output is /intended/ for further processing.

    For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting >>>>> for next data byte". Obviously this will cause difficuties if the data >>>>> is binary.
    Also many binary formats can't easily be extended, so you can pass one >>>>> image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to >>>>> extend.
    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary ouput.

    What is your evidence? stdout was just designed for output (as far as I
    can tell) and, anyway, what is the distinction you are making between
    binary and text? iconv --from ACSII --to EBCDIC-UK will produce
    something that is "logically" text on stdout, but it might look like
    binary to you.

    An example where it's really useful not to care: I have a suite of tools
    for doing toy cryptanalysis. Some apply various transformations and/or
    filters to byte streams and others collect and output (on stderr)
    various statistics. Plugging them together in various pipelines is very
    handy when investigating an encrypted text. The output is almost always
    "binary" in the sense that there would be not point in looking at on a
    terminal.

    According to you, these tools are poorly designed. I don't think so.
    How would you design them? Endless input and output file names to be
    juggled and tidied up afterwards?

    I'd write a monolithic program.

    Even a monolithic program is decomposed into subroutines (or malcolm functions).

    A pipeline is the same concept at a higher level.

    Load the encrypted text into memory, and then pass it to subroutines to
    do the various analyses.

    So your program is arbitrarily large and needs to be recompiled
    to add new subroutines. Advantage to the pipeline, again.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Feb 1 15:50:44 2024
    On 01/02/2024 02:29, bart wrote:
    On 01/02/2024 00:47, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    According to you, these tools are poorly designed.  I don't think so. >>>> How would you design them?  Endless input and output file names to be >>>> juggled and tidied up afterwards?

    I think they're poorly designed too.

    Of course you do.   They're not bart programs.


     From the POV of interactive console programs, they /are/ poor.

    You don't provide any reason why - do elucidate!

    They only do one thing, like you can't first do A, then B. They don't
    give any prompts. They often apparently do nothing (so you can't tell if they're busy, waiting for input, or hanging). There is no dialog.

    That's the whole point!

    If you want to do A, then B, then you do "A | B", or "A; B", or "A && B"
    or "A || B". And if you want to do A, then B twice, then C, then A
    again, you write "A | B | B | C | A". Other operator choices let you
    say "do this then that", or "do this, and if successful do that", etc.

    Your monolithic AB program fails when you want to do C, or want to do A
    and B in a way the AB author didn't envisage.

    You have a Transformer - a toy that can be either a car or a robot.
    I've got a box of Lego. Sometimes I need instructions and a bit of
    time, but I can have a car, a robot, a plane, an alien, a house, and
    anything else I might want.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Thu Feb 1 14:52:21 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 31/01/2024 23:34, Keith Thompson wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 31/01/2024 20:14, Keith Thompson wrote:
    [...]
    Terminal control sequences (almost always based on VT100 these days)
    are typically not printable, but tend to avoid null characters, which
    means you can very probably use printf to print them (assuming you're
    on a POSIX-like system).
    [...]

    In ASCII, 0 means NUL, or "ignore". So an ASCII sequence may contain
    any number of embedded zero bytes, which the receiver ignores. That's
    because for technical reasons some communications channels have to
    send data every cycle, and if there is no data, they will send a
    signal indistinguishable from all bits zero.

    Not particularly relevant. A quick experiment with xterm indicates that
    embedding null bytes in a control sequence prevents it from being
    recognized. There may be some standards that require embedded zero
    bytes to be ignored, but xterm doesn't any such standard. Similarly, if
    you embed null bytes in text written to a file, the result is a corrupted
    text file.

    The standard is ASCII (American Standard Code for Information Interchange). Byte zero is NUL, which means "ignore".

    ASCII does not define a semantic for
    the NUL byte other than grouping it as a "control character" with
    the function NO-OP. It was often used as a padding character.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Feb 1 15:35:12 2024
    On 01/02/2024 01:21, bart wrote:
    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might >>>>> have unusual effects if passed though systems designed to handle
    human-readable text.

    Maybe you are not used to a system where it's trivial to inspect such
    data.  When "some_prog" produces data that are not compatible with the
    current terminal settings, "some_prog | hd" shows a hex dump instead.
    The need to do this does not make "some_prog" poorly designed.  It may
    simply mean that the output is /intended/ for further processing.

    For instance in some systems designed to receive
    ASCII text, there is no distinction between the nul byte and "waiting >>>>> for next data byte".  Obviously this will cause difficuties if the
    data
    is binary.
    Also many binary formats can't easily be extended, so you can pass one >>>>> image and that's all.  While it is possible to devise a text format >>>>> which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to >>>>> extend.
    Your reasoning is all gobbledygook.  Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary ouput.

    What is your evidence?  stdout was just designed for output (as far as I
    can tell) and, anyway, what is the distinction you are making between
    binary and text?  iconv --from ACSII --to EBCDIC-UK will produce
    something that is "logically" text on stdout, but it might look like
    binary to you.

    An example where it's really useful not to care: I have a suite of tools
    for doing toy cryptanalysis.  Some apply various transformations and/or
    filters to byte streams and others collect and output (on stderr)
    various statistics.  Plugging them together in various pipelines is very
    handy when investigating an encrypted text.  The output is almost always
    "binary" in the sense that there would be not point in looking at on a
    terminal.

    According to you, these tools are poorly designed.  I don't think so.
    How would you design them?  Endless input and output file names to be
    juggled and tidied up afterwards?

    I think they're poorly designed too.

    Then you are wrong too. (Although I think you've been doing a slightly
    better job than Malcolm of explaining /why/ you think the way you do.)

    Basically, neither you nor Malcolm are familiar with these tools. You
    don't use them significantly - and when you do, it is for simple things
    and you feel they are over-complicated for the tasks /you/ need. And
    maybe they /are/ over-complicated for your particular needs - but they
    are useful to a large number of people in a large number of ways. They
    are not poorly designed - they are simply not designed for /your/ needs
    and preferences.

    This of course works in all directions. Some people prefer gui tools
    and think command-line tools are hard to use. Some people prefer
    command line tools and think gui tools are slow and inefficient. Some
    people like combining multiple small, flexible tools that each handle
    part of the task - others prefer monolithic tools that try to do
    everything. Each possibility has its advantages, and disadvantages, its proponents and detractors. And if you want to be a happy and efficient developer, you stay open to using any of the solutions - learn at least
    a bit about them all, and use what suits you and your needs best at the
    time.

    But do not mistake "I don't like this" or "I think this is hard" with
    "it's poorly designed".


    From the POV of interactive console programs, they /are/ poor. But the mistake is thinking that they are actual programs or commands, when
    really they are just filters. They are not designed to be standalone commands.


    Of course they are programs. Some programs can be used as filters -
    that doesn't mean they are not programs!

    Even 'cat', if I type it by itself, just sits there. (I wonder what use
    it has in a sequence like ... | cat | ...; what does it add to the data?)


    Nothing, in this case.

    And if I open - say - MS Word, then close it again, I have also done
    nothing. Does that mean MS Word is a useless program that doesn't
    really do anything?

    AFAICS, this stuff mainly works inside scripts. Or do people here spend
    all day manually piping stuff between programs?

    This stuff works in scripts /and/ ad-hoc from the command line. People
    use both, whatever is convenient at the time. I don't spend "all day"
    piping stuff, but I guess I perhaps use pipes a dozen or so times a day
    - the variance here is so big it's hard to give a sensible average.
    Many cases are piping the output of one command into "less" and/or
    "grep". Other common targets for pipes (for me) are head, tail (or
    "tail -f"), "hexdump -C", "wc -l".


    As for alternatives, I don't know. There are any number of ways this
    could be done. But if everyone has become inured to this piping
    business, they will not be receptive to anything different.

    No one has claimed pipes or *nix stream redirection is a "perfect"
    system. No one has suggested they think things could not have been done
    in different ways. Just as with C, you are mistaking "we know how this
    works, we use it, we are happy enough with it that we can work with it,
    we know why it is this way" with "we think this is perfect and nothing
    else will come close". Discussions with you would be so much more
    fruitful if you didn't have such an absurd "you're with us or against
    us" attitude. Opinions are not binary.


    Those of us who understand *nix shell usage, pipes and indirection, and
    are familiar with *nix common command line tools, and are therefore able
    to give qualified opinions based on facts and experience (i.e., not you,
    and not Malcolm), seem to find them useful. I'm sure, however, we can
    all think of potential improvements.

    But like C, the advantages of familiarity and common implementations
    greatly outweigh the disadvantages. Maybe "ls" would have been better
    if it had been spelt "list". Perhaps "rm" should have had different
    defaults, or "sed" could have been less cryptic. The fact that "ls",
    "rm", and other fundamentals works exactly the same way on every *nix
    system made in the last 40 years, gaining new features if needed without breaking compatibility, is a /massive/ advantage.

    You claim others are not open to considering something different, which
    may potentially be "better" (for some values of "better"). You are
    wrong. Most people, however, understand that momentum is important. It
    can mean clearly inferior systems become standard - such as Windows or
    the x86 ISA, which are technically massively inferior to alternatives,
    but are successful due to momentum.


    Here however are some ideas:

    * Have versions of these tools for use as filters with no UI, just
      a default input and output, and versions for interactive use with
      helpful prompts. Or even just a sensibly named output! Instead of
      every program writing a.out.


    Sometimes a "front end" with a nice UI /is/ useful. So people write
    front ends with nice UI's - text-based or gui. Typically these are
    great for common tasks, and are cumbersome or useless for rarer or more advanced stuff. And that's fine - use whatever works best for you at
    the time. If you think "ssh" is complicated from the command line, use
    "putty" - but that won't handle all the uses that some people need.


    But I most certainly don't want interactive and "helpful" prompts for
    most of my command-line tools. It's fine on occasion if it is
    necessary, useful if something out of the ordinary happens, and
    appropriate for things like passwords. But when I know what I am doing,
    why would I want to see help messages repeated? And if I don't know
    what I am doing - it's the first time using a command, or I've forgotten
    some details, then I have "prog --help", "man prog", or "google prog"
    that will all give much more useful information than interactive prompts
    ever could.

    Can you imagine typing "ls" and being asked :

    * Did you want to list files in the current directory, or elsewhere?
    * Did you want to list all files, including hidden files?
    * Did you want to use colour?

    and then twenty more questions for the other common options for ls?


    What would be the benefits of "sort" using command line options for
    "filter" or "script" usage and then asking a dozen questions for
    "interactive" use?


    * Have a concept of a current block of data, analogous to a clipboard.
      Then separate commands can load data, sort it, count it, display it,
      write it etc, with no need for intermediate named files.

    You mean, a convenient way of moving data between programs? Sort of
    like taking data output by one program, and passing it into another
    program? Maybe something akin to a factory assembly line - or a
    pipeline? Perhaps we could make this convenient with a simple symbol,
    and let it be organised and controlled by the shell so that individual
    programs don't need to implement anything special - they just read in
    data and dump it out in a simple, standardised way, and with this
    wonderful "pipeline" idea, people can tie together different existing
    standard programs in any combinations they like! Genius! If only those hopeless Unix people had thought about this 50+ years ago, instead of
    messing around with stream redirection and pipes.


    But I'd be happier if this was all contained within a separate
    application from an OS shell program.

    Yes, because it is /so/ much better if it is limited to a few commands
    that you think of when writing this special application, than having a
    general system that works with any commands.

    Basically, all you are saying is that you'd like command line utilities
    to work with a default file name "/tmp/clipboard" - something you didn't
    want earlier on.

    Let's use /tmp/x for convenience.

    /tmp$ cat > fred
    one
    two
    three
    four
    <ctrl-D>

    > load fred
    Data loaded

    $ cp fred x


    > lc
    4 lines


    $ wc -l x
    4 x

    > list
    1 one
    2 two
    3 three
    4 four

    $ cat x
    one
    two
    three
    four

    $ cat -n x
    1 one
    2 two
    3 three
    4 four


    > rev
    Reversed

    tac x | sponge x


    > list
    1 four
    2 three
    3 two
    4 one

    $ cat -n x
    1 four
    2 three
    3 two
    4 one


    > sort
    Sorted


    $ sort x | sponge x

    > upper

    $ awk '{print toupper($0)}' x | sponge x


    > list
    1 FOUR
    2 ONE
    3 THREE
    4 TWO

    $ cat -n x
    1 FOUR
    2 ONE
    3 THREE
    4 TWO


    > save bill
    Written to bill

    > q


    $ cp x bill


    The "sponge" utility reads all of its stdin, then writes the file.
    Otherwise, since Unix is inherently multi-tasking and runs the programs
    in parallel (unlike your utility), trying to redirect output back into
    the same file you use for input is a race condition. Utilities are
    generally designed for pipes, not destructive changes to a single file.


    With pipes, this is all vastly simpler:

    $ cat fred | tac | sort | awk '{print toupper($0)}' > bill

    Oh, and the sorting and case conversion works for your locale, including
    UTF-8.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Thu Feb 1 14:53:56 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 01.02.2024 06:24, Malcolm McLean wrote:
    On 31/01/2024 23:36, Ben Bacarisse wrote:

    Maybe you are not used to a system where it's trivial to inspect such
    data. When "some_prog" produces data that are not compatible with the
    current terminal settings, "some_prog | hd" shows a hex dump instead.
    The need to do this does not make "some_prog" poorly designed. It may
    simply mean that the output is /intended/ for further processing.

    Well almost by definition binary output is intended for further
    processing. Binary audio files must ultimately be converted to analogue
    if anyone is to listen to them, for example.

    Well, not necessarily. Let's leave the typical use case for a moment...

    It might also be analyzed and converted to a digitally represented
    formula, say some TeX code, or e.g. like the formal syntax that the
    lilypond program uses.

    I had to check how to do a hex dump on the system I'm typing this on.
    The name of the hex dumper is xxd instead of hd, but otherwise it works
    the same way and will accept piped data. But the fact I had to look it
    up tells you that I've never actually used it.

    Well, there's always the old Unix standard tool, 'od'.

    I use that without thinking or looking it up, since it was ever there, >despite I only rarely use it.

    And you observed correctly that nowadays there's typically even more
    than one tool available. (And Bart will probably write his own tool. :-)

    The two problems with hex
    dumps are that you've got to do mental arithmetic to convert 8 bit hex
    values into 16 or 32 bit fields,

    Hmm.. - have you inspected the man pages of the tools?

    At least for 'od' I know it's easy per option...
    od -c file # characters (or escapes and octals)
    od -t x1 file # hex octets
    od -t x2 file # words (two octets)
    od -c -t x1 file # characters and octets

    Likewise, with xxd, use the -g flag. But Malcolm can't be
    troubled to read the man page before complaining.
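    Grouping bytes into wider units is also only a dozen lines of C if
    neither od nor xxd is to hand; a rough sketch that dumps a file as
    16-bit words assembled little-endian (the byte order is an assumption
    here, swap lo and hi for big-endian data):

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        FILE *f = argc > 1 ? fopen(argv[1], "rb") : stdin;
        long offset = 0;
        int lo, hi;

        if (!f) {
            perror(argv[1]);
            return 1;
        }
        while ((lo = getc(f)) != EOF) {
            hi = getc(f);
            if (offset % 16 == 0)
                printf("%s%06lx:", offset ? "\n" : "", offset);
            if (hi == EOF)
                printf(" %02x", lo);               /* odd byte at end */
            else
                printf(" %04x", (hi << 8) | lo);   /* little-endian word */
            offset += 2;
        }
        putchar('\n');
        return 0;
    }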

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Thu Feb 1 15:55:34 2024
    On 01/02/2024 02:53, Malcolm McLean wrote:
    On 31/01/2024 23:36, Ben Bacarisse wrote:

    An example where it's really useful not to care: I have a suite of tools
    for doing toy cryptanalysis.  Some apply various transformations and/or
    filters to byte streams and others collect and output (on stderr)
    various statistics.  Plugging them together in various pipelines is very
    handy when investigating an encrypted text.  The output is almost always
    "binary" in the sense that there would be not point in looking at on a
    terminal.

    According to you, these tools are poorly designed.  I don't think so.
    How would you design them?  Endless input and output file names to be
    juggled and tidied up afterwards?

    I'd write a monolithic program.

    It's very strange to me to see people that consider themselves
    programmers talk about having multiple small functions to do specific
    tasks and combining them into bigger functions to solve bigger problems,
    yet are reduced to quivering jellies at the thought of multiple small
    programs to do specific tasks that can be combined to solve bigger tasks.

    Do you think the C standard library would be improved by a single
    function "flubadub" that takes 20 parameters and can calculate
    logarithms, print formatted text, allocate memory and write it all to a
    file?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Thu Feb 1 16:07:58 2024
    On 01.02.2024 14:26, Malcolm McLean wrote:
    On 01/02/2024 13:02, Janis Papanagnou wrote:
    Well, not necessarily. Let's leave the typical use case for a moment...

    It might also be analyzed and converted to a digitally represented
    formula, say some TeX code, or e.g. like the formal syntax that the
    lilypond program uses.

    And ultimately converted to a non binary form. A list of 1s and 0s is
    seldom any use to the final consumer of the data.

    No, I was speaking about an application that creates lilypond _input_,
    which is a formal language to write notes, e.g. for evaluation by the
    lilypond software, but not excluding other usages.


    The two problems with hex
    dumps are that you've got to do mental arithmetic to convert 8 bit hex
    values into 16 or 32 bit fields,

    Hmm.. - have you inspected the man pages of the tools?

    I just ran "man xxd". The man page contains this statement.

    The tool's weirdness matches its creator's brain. Use entirely at your
    own risk. Copy files. Trace it. Become a wizard.

    This statement repelled you? (Can't help you here.)

    At least for 'od' I know it's easy per option...
    od -c file # characters (or escapes and octals)
    od -t x1 file # hex octets
    od -t x2 file # words (two octets)
    od -c -t x1 file # characters and octets

    So a JPEG file starts with
    FF D8
    FF E0
    hi lo (length of the FF E0 segment)

    So we want the output

    FF D8 FF E0 [1000] to check that the segment markers are correct and FF
    E0 segment is genuinely a thousand bytes (or whatever it is). This isn't
    easy to achieve with a hex dump utility.

    I don't know binary format details about jpg, so I cannot help you here.

    I was responding to your question where you wanted entities larger than
    a single octet. I showed you some examples what I can do with 'od'.
    (Just open its man page to find all sorts of possible options and
    option combinations.)

    Yes, you can use entities of length four. Guess how? By 'od -t x4 file'
    Or if you need decimal numbers use 'od -t d4 file' or 'od -t d2 file'.

    And I already answered that for specific binary structures you'll need something data specific. You can also generalize that to some degree...

    For example I recently wrote a shell script that supports binary data definitions in a very primitive declarative form; it allows me to
    specify the field lengths, output type, and identification for readable
    output. It allows hex, bin, dec, text, hex-seq, data to skip. Lengths of
    fields may be constants, 0-terminated, or defined by another preceding
    field. It also handles endianness.

    You see, even some primitive "generic" thing needs a couple of features.
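
    And for something as specific as your JPEG example, a dozen lines of C
    are arguably the least error-prone route. A minimal sketch (it only
    checks the two segment markers you named and prints the big-endian
    length field; error handling is kept to the bare minimum):

    /* jpeghead: check FF D8 FF E0 and print the APP0 segment length
       (bytes five and six, big-endian). */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        unsigned char h[6];
        FILE *f = (argc > 1) ? fopen(argv[1], "rb") : NULL;

        if (!f || fread(h, 1, 6, f) != 6) {
            fprintf(stderr, "cannot read header\n");
            return 1;
        }
        if (h[0] != 0xFF || h[1] != 0xD8 || h[2] != 0xFF || h[3] != 0xE0) {
            fprintf(stderr, "segment markers are wrong\n");
            return 1;
        }
        printf("FF D8 FF E0 [%u]\n", (unsigned)((h[4] << 8) | h[5]));
        return 0;
    }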

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Thu Feb 1 16:47:03 2024
    On 01.02.2024 15:57, Malcolm McLean wrote:
    On 01/02/2024 14:45, Scott Lurndal wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    I'd write a monolithic program.

    I certainly want re-usability of functions, modules, components,
    commands, and systems. (And I want no duplication on any level.)

    Even a monolithic program is decomposed into subroutines (or malcolm
    functions).

    There are various ways to organize software. Some supported by the
    language, some by the OS mechanisms, some specifically implemented.

    A pipeline is the same concept at a higher level.

    Functions is a way, communicating processes is a way, etc. etc. etc.
    All can be taken to combine various entities. All mechanisms can be
    used together or some omitted.

    Exactly. So whilst it might have some advantages, they aren't going to
    be very large, because as you say, it's the same basic concept.

    I think that you draw the wrong conclusion (on a statement that is
    prone to misunderstandings or even wrong).

    Pipelines are a very useful method to let processes communicate in
    a one-way direction (as the name already suggests). From that it's
    immediately recognizable that filters are a natural element in that OS-architectural glue.

    One original Unix philosophy was to have specialized commands that
    do one thing well, and to combine such tasks as necessary. (To some
    degree there was a similar statement concerning C function design.) Unfortunately some popular GNU tools deviate from that. Features get incorporated (as duplicates) in many tools (instead of using the
    existing specialized one).

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Thu Feb 1 16:48:59 2024
    On 01/02/2024 14:26, Malcolm McLean wrote:
    I just ran "man xxd". The man page contains this statement.

    The tool's weirdness matches its creator's brain.  Use entirely at your
           own risk. Copy files. Trace it. Become a wizard.

    If you don't like xxd, use one of the dozens of other hex dump programs
    that are easily available. I use hexdump myself.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Feb 1 15:32:26 2024
    On 01/02/2024 14:35, David Brown wrote:
    On 01/02/2024 01:21, bart wrote:


    * Have versions of these tools for use as filters with no UI, just
       a default input and output, and versions for interactive use with
       helpful prompts. Or even just a sensibly named output! Instead of
       every program writing a.out.


    Sometimes a "front end" with a nice UI /is/ useful.  So people write
    front ends with nice UI's - text-based or gui.  Typically these are
    great for common tasks, and are cumbersome or useless for rarer or more advanced stuff.  And that's fine - use whatever works best for you at
    the time.  If you think "ssh" is complicated from the command line, use "putty" - but that won't handle all the uses that some people need.


    But I most certainly don't want interactive and "helpful" prompts for
    most of my command-line tools.  It's fine on occasion if it is
    necessary, useful if something out of the ordinary happens, and
    appropriate for things like passwords.  But when I know what I am doing,
    why would I want to see help messages repeated?  And if I don't know
    what I am doing - it's the first time using a command, or I've forgotten
    some details, then I have "prog --help", "man prog", or "google prog"
    that will all give much more useful information than interactive prompts
    ever could.

    Can you imagine typing "ls" and being asked :

    * Did you want to list files in the current directory, or elsewhere?
    * Did you want to list all files, including hidden files?
    * Did you want to use colour?

    and then twenty more questions for the other common options for ls?


    What would be the benefits of "sort" using command line options for
    "filter" or "script" usage and then asking a dozen questions for "interactive" use?

    I'd classify command-line programs into categories:

    * Filters, as I explained before, which are designed to work with piped data

    * Programs that can meaningfully be run with no command line options
    (such as your 'ls' example), because a default action is defined

    * Programs that expect command line parameters, which can generate an
    error message or usage info if missing

    * Programs that launch into an on-going session, which may or may not
    take command line parameters (eg. python interpreter)

    * Everything else, programs which are launched from the command line but
    do not otherwise have a CLI, eg. a GUI text editor.

    The troublesome ones are those I've called filters. Some programs I
    expect to behave conventionally, unexpectedly behave as filters, such
    as 'as'.


    * Have a concept of a current block of data, analogous to a clipboard.
       Then separate commands can load data, sort it, count it, display it,
       write it etc, with no need for intermediate named files.

    You mean, a convenient way of moving data between programs?  Sort of
    ...


    But I'd be happier if this was all contained within a separate
    application from an OS shell program.

    Yes, because it is /so/ much better if it is limited to a few commands
    that you think of when writing this special application, than having a general system that works with any commands.

    Basically, all you are saying is that you'd like command line utilities
    to work with a default file name "/tmp/clipboard" - something you didn't
    want earlier on.

    No. Someone said that it is convenient to run a sequence of programs,
    where each processes the output of the other, without having to
    explicitly name intermediate files.

    I'm exploring other ways of doing that ...

    Let's use /tmp/x for convenience.

    /tmp$ cat > fred
    one
    two
    three
    four
    <ctrl-D>

          > load fred
          Data loaded

    $ cp fred x
    ...
    $ cat -n x
         1    FOUR
         2    ONE
         3    THREE
         4    TWO


          > save bill
          Written to bill

          > q


    $ cp x bill



    With pipes, this is all vastly simpler:

    $ cat fred | tac | sort | awk '{print toupper($0)}' > bill

    ... one way, which follows on from my previous suggestion to have two
    versions of filter utilities, might be to have versions which work on
    that blob of current data as I demonstrated.

    Then on the shell command line you'd have:

    > cload fred
    > ctac
    > csort
    > cawk ...
    > csave bill

    Any inputs will always come from that blob; it will not just do nothing
    waiting for input, unless the command is specifically for that.

    And if so, it can prompt for it. For that matter, it can also write
    messages confirming what it's just done, or show a progress report.

    If you like, you can also have a stack of such data blobs, so:

    > cload fred
    > cdupl
    > csave bill
    > crev
    > csave llib

    (I think I've just invented shell-Forth!)

    I've just realised why it is that your filter programs don't show
    prompts or any kinds of messages: because those are sent to stdout, and therefore will screw up any data that is being sent there as the primary output.

    THAT'S why sending what ought to be packaged data to stdout is Wrong.


    The "sponge" utility reads all of its stdin, then writes the file. Otherwise, since Unix is inherently multi-tasking and runs the programs
    in parallel (unlike your utility), trying to redirect output back into
    the same file you use for input is a race condition. Utilities are generally designed for pipes, not destructive changes to a single file.
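
    The whole trick behind sponge fits in a few lines of C. What follows is
    only a sketch of the principle - soak up all of stdin first, then open
    and write the named file - not the actual moreutils implementation:

    /* mini-sponge: read all of stdin into memory, then write it to the
       file named on the command line. Because the output file is not
       opened until stdin is exhausted, "filter < data | mini-sponge data"
       does not truncate its own input. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        size_t cap = 4096, len = 0;
        char *buf = malloc(cap);
        FILE *out;
        int c;

        if (argc < 2 || !buf)
            return 1;
        while ((c = getchar()) != EOF) {
            if (len == cap && !(buf = realloc(buf, cap *= 2)))
                return 1;
            buf[len++] = (char)c;
        }
        out = fopen(argv[1], "wb");
        if (!out || fwrite(buf, 1, len, out) != len)
            return 1;
        return fclose(out) ? 1 : 0;
    }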

    My utility is like pretty much any application that loads a file, does something with it, and writes it out. (Eg. text or image editors.)

    So that kind of pattern is well understood. The difference is that your
    filters tend to bluntly work on the whole blob of data.

    The kind of REPL utility I outlined is far more user-friendly. The data
    it deals with isn't as transient as the data in a pipe either.

    Halfway through my session, the data is still there. You can examine it
    so far, and decide where to go next, or perhaps reverse the last step.

    When you write a|b|c>d, it whizzes through it all at once. I can write
    more on this but I simply don't think that any 'Unix-head' here is going
    to get it; it's too ingrained.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Thu Feb 1 17:15:34 2024
    On 01.02.2024 16:41, Malcolm McLean wrote:

    So could you list one or two reasons why you might prefer a program with
    five subroutines, and one or two reasons why you might prefer to write
    five programs which communicate via piped data?

    A quite appealing and naturally appearing task (from the past) to use
    pipes was to model communication cascades. Something like (off the top
    of my head)...

    data-source | sign | compress | crc | encrypt | channel-enc |
    interleaver | channel-simulator | deinterleaver | channel-dec |
    decrypt | crc-check | uncompress | check-sign | data-sink

    Component-pairs can be omitted, say you may leave out the un-/compress function. And every component may be either special purpose or general.
    A special purpose entity could be BCH-enc and RCPC-enc, or it can also
    be (if better suited) a combined module, say 'crc -16' vs. 'crc -32'
    with the function realized as option argument.

    Reasons to not use pipelines are when you don't have a linear flow.
    In some circumstances you can bypass the pipe (opening a side-channel)
    in other cases you can't, or it's overly messy to do so.

    Reasons not to use in-memory processing are of course if you have huge
    amounts of data. Then you need filtering and pipeline processing.
    (A former fellow student who worked for the ESO told me remarkable
    things about the amounts of data they continuously receive and that
    must on the fly be processed.) Another more recent example can be
    processing of real time data for Digital Twins (e.g. city models).

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Thu Feb 1 16:22:43 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 01.02.2024 15:57, Malcolm McLean wrote:
    On 01/02/2024 14:45, Scott Lurndal wrote:


    Exactly. So whilst it might have some advantages, they aren't going to
    be very large, because as you say, it's the same basic concept.

    I think that you draw the wrong conclusion (on a statement that is
    prone to misunderstandings or even wrong).

    Pipelines are a very useful method to let processes communicate in
    a one-way direction (as the name already suggests). From that it's immediately recognizable that filters are a natural element in that OS-architectural glue.

    One original Unix philosophy was to have specialized commands that
    do one thing well, and to combine such tasks as necessary. (To some
    degree there was a similar statement concerning C function design.) Unfortunately some popular GNU tools deviate from that. Features get incorporated (as duplicates) in many tools (instead of using the
    existing specialized one).

    I believe the classic example is the Documenter's Workbench.

    Troff converts a set of page layout directives into a typeset
    document. Originally for the CAT typesetter, but I'll address
    that later.

    So, now you want to add tables. You can modify troff to add
    new macros that use existing markup directives, or you can
    add a filter that converts a 'table description' language
    into troff markup directives.

    $ < document.tr tbl | troff -mm > typesetter_output

    Now, you want to add support for arbitrary mathematical
    formulae. You can modify troff to add more macros, or
    you can write a filter that converts an 'equation description'
    to troff and add that to your pipeline;

    $ <document.tr eqn | tbl | troff -mm > typesetter_output

    Now, you realize that you want to support multiple typesetters,
    so you can either modify troff to support all possible typesetters,
    or you can add post processing filters.

    $ <document.tr eqn | tbl | ditroff -mm | dit2cat > typesetter_output
    $ <document.tr eqn | tbl | ditroff -mm | dit2ps > postscript_output

    Then you might want to be able to include pictures in your
    document.

    $ <document.tr pic | eqn | tbl | ditroff -mm | dit2ps | ps2pdf > document.pdf

    Or you might want ascii text output

    $ <document.tr pic | eqn | tbl | nroff -mm > document.txt

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Thu Feb 1 17:24:54 2024
    On 01.02.2024 11:34, David Brown wrote:
    On 31/01/2024 19:35, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:

    I regularly see it as more symmetrical and clearer to push data left
    to right. So I might write "cat infile | grep foo | sort > outfile".
    Of course I could use "<" redirection, but somehow it seems more
    natural to me to have this flow. I'll use "<" for simpler cases.

    But perhaps this is just my habit, and makes little sense to other
    people.

    I completely understand that.


    You can also use:

    < infile grep foo | sort > outfile

    Redirections don't have to be written after a command.

    Indeed. And if we also respect that 'grep' accepts arguments,
    then it's even more compact and yet probably better legible... :-)

    grep foo infile | sort > outfile


    I did not know you could write it that way - thanks for another
    off-topic, but useful, tip.

    Yes. We certainly should instead have written

    grep foo iso646.h | sort > outfile


    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Scott Lurndal on Thu Feb 1 16:28:10 2024
    On 2024-02-01, Scott Lurndal <scott@slp53.sl.home> wrote:
    Kaz Kylheku <433-929-6894@kylheku.com> writes:
    On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Kaz Kylheku <433-929-6894@kylheku.com> writes:
    But ESC is related text; it's a character described in ASCII used for signaling in the middle of text, which is what it's doing here.

    So are most of the other ASCII codes less than 0x20. Including
    file and record delimiters and shift-in/shift-out.

    ESC is such an important piece of text, that it has a dedicated key on
    your keyboard.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Thu Feb 1 16:30:33 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 01/02/2024 15:07, Janis Papanagnou wrote:
    On 01.02.2024 14:26, Malcolm McLean wrote:
    On 01/02/2024 13:02, Janis Papanagnou wrote:
    Well, not necessarily. Let's leave the typical use case for a moment...
    It might also be analyzed and converted to a digitally represented
    formula, say some TeX code, or e.g. like the formal syntax that the
    lilypond program uses.

    And ultimately converted to a non binary form. A list of 1s and 0s is
    seldom any use to the final consumer of the data.

    No, I was speaking about an application that creates lilypond _input_,
    which is a formal language to write notes, e.g. for evaluation by the
    lilypond software, but not excluding other usages.


    The two problems with hex
    dumps are that you've got to do mental arithmetic to convert 8 bit hex values into 16 or 32 bit fields,

    Hmm.. - have you inspected the man pages of the tools?

    I just ran "man xxd". The man page contains this statement.

    The tool's weirdness matches its creator's brain. Use entirely at your
    own risk. Copy files. Trace it. Become a wizard.

    This statement repelled you? (Can't help you here.)

    At least for 'od' I know it's easy per option...
    od -c file # characters (or escapes and octals)
    od -t x1 file # hex octets
    od -t x2 file # words (two octets)
    od -c -t x1 file # characters and octets

    So a JPEG file starts with
    FF D8
    FF E0
    hi lo (length of the FF E0 segment)

    So we want the output

    FF D8 FF E0 [1000] to check that the segment markers are correct and FF
    E0 segment is genuinely a thousand bytes (or whatever it is). This isn't easy to achieve with a hex dump utility.

    I don't know binary format details about jpg, so I cannot help you here.

    JPEG is an extremely common binary file format and JPEG files will be
    found on most general purpose computers.
    All you need to know for the purposes of the discussion is that the
    first four bytes are segment identifiers and must have the values I
    gave, whilst bytes five and six are a big endian 16 bit number that represents a segment length, and that potentially any of those values
    could be unexpected and you might want to inspect them.

    So how would you achieve that in a convenient and non-error prone way?

    $ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
    it is a jpeg

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Feb 1 17:06:24 2024
    On 01/02/2024 14:50, David Brown wrote:
    On 01/02/2024 02:29, bart wrote:
    On 01/02/2024 00:47, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    According to you, these tools are poorly designed.  I don't think so. How would you design them?  Endless input and output file names to be juggled and tidied up afterwards?

    I think they're poorly designed too.

    Of course you do.   They're not bart programs.


     From the POV of interactive console programs, they /are/ poor.

    You don't provide any reason why - do elucidate!

    They only do one thing, like you can't first do A, then B. They don't
    give any prompts. They often apparently do nothing (so you can't tell
    if they're busy, waiting for input, or hanging). There is no dialog.

    That's the whole point!

    If you want to do A, then B, then you do "A | B", or "A; B", or "A && B"
    or "A || B".  And if you want to do A, then B twice, then C, then A
    again, you write "A | B | B | C | A".  Other operator choices let you
    say "do this then that", or "do this, and if successful do that", etc.

    Your monolithic AB program fails when you want to do C, or want to do A
    and B in a way the AB author didn't envisage.

    You have a Transformer - a toy that can be either a car or a robot. I've
    got a box of Lego.  Sometimes I need instructions and a bit of time, but
    I can have a car, a robot, a plane, an alien, a house, and anything else
    I might want.


    You can only do one thing, as you can only have one unbroken byte
    sequence as output sent to stdout.

    You can't send output A to stdout, then B to stdout, and certainly can't interleave messages to the console on stdout, as that would then be all
    mixed up with the possibly binary data, and if redirected, you won't see it.

    I can see the idea of having one permanently open channel, but call it stdbinout or stdpipeout. But you still won't be able to generate a
    sequence of distinct data blocks along that one channel because it is continuous.

    This why 'as' only ever produces one object file, even for multiple
    input source files.

    And explains why 'as' treats multiple .s input files as though they were
    all part of the same single source file: you can take one .s file, chop
    it up into multiple .s files, and submit them all to 'as' (keeping the
    right order).

    It's a feature! It's also the whackiest assembler I've encountered, this century anyway. The fact that it's implemented as a crude filter with
    one input stream and one output stream helps explain it.

    Although it works differently from most such filters, because if its
    output is not piped, and not redirected, it is sent to a file (always
    called a.out). It's not quite crazy enough to send binary object file
    data to the terminal; I wonder why not?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Thu Feb 1 18:35:32 2024
    On 01.02.2024 11:30, David Brown wrote:
    On 31/01/2024 14:35, Janis Papanagnou wrote:

    First; the EU publishes in all languages of the member states,
    for example. (There's no single lingua franca.)

    Weirdly, while Norway is not in the EU but Sweden and Denmark are, they publish (for some things at least) in Norwegian but not in Swedish or
    Danish. [...]

    Hmm.. - in my ears this sounds strange. I've looked it up and found...

    "The EU has 24 official languages:

    Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish,
    French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian,
    Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish and
    Swedish."

    [
    https://european-union.europa.eu/principles-countries-history/languages_en ]


    Searching for a specific regulation, e.g. the GDPR, there's documents
    in these languages:

    BG ES CS DA DE ET EL EN FR GA HR IT
    LV LT HU MT NL PL PT RO SK SL FI SV


    "We aim to provide information on our websites in all 24 EU official
    languages. If content is not available in your chosen EU language,
    more and more websites offer eTranslation, the Commission’s machine
    translation service."

    "All content is published in at least English, because research has
    shown that with English we can reach around 90% of visitors to our
    sites in either their preferred foreign language or their native
    language."

    [ https://european-union.europa.eu/languages-our-websites_en ]


    It's interesting that they have an extra explanation about English:

    "English remains an official EU language, despite the United Kingdom
    having left the EU. It remains an official and working language of the
    EU institutions as long as it is listed as such in Regulation No 1.
    English is also one of Ireland’s and Malta’s official languages."

    [
    https://european-union.europa.eu/principles-countries-history/languages_en ]


    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Scott Lurndal on Thu Feb 1 18:03:24 2024
    On 01/02/2024 16:30, Scott Lurndal wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    On 01/02/2024 15:07, Janis Papanagnou wrote:
    On 01.02.2024 14:26, Malcolm McLean wrote:
    On 01/02/2024 13:02, Janis Papanagnou wrote:
    Well, not necessarily. Let's leave the typical use case for a moment...
    It might also be analyzed and converted to a digitally represented
    formula, say some TeX code, or e.g. like the formal syntax that the
    lilypond program uses.

    And ultimately converted to a non binary form. A list of 1s and 0s is
    seldom any use to the final consumer of the data.

    No, I was speaking about an application that creates lilypond _input_,
    which is a formal language to write notes, e.g. for evaluation by the
    lilypond software, but not excluding other usages.


    The two problems with hex
    dumps are that you've got to do mental arithmetic to convert 8 bit hex values into 16 or 32 bit fields,

    Hmm.. - have you inspected the man pages of the tools?

    I just ran "man xxd". The man page contains this statement.

    The tool's weirdness matches its creator's brain. Use entirely at your own risk. Copy files. Trace it. Become a wizard.

    This statement repelled you? (Can't help you here.)

    At least for 'od' I know it's easy per option...
    od -c file # characters (or escapes and octals)
    od -t x1 file # hex octets
    od -t x2 file # words (two octets)
    od -c -t x1 file # characters and octets

    So a JPEG file starts with
    FF D8
    FF E0
    hi lo (length of the FF E0 segment)

    So we want the output

    FF D8 FF E0 [1000] to check that the segment markers are correct and FF >>>> E0 segment is genuinely a thousand bytes (or whatever it is). This isn't >>>> easy to achieve with a hex dump utility.

    I don't know binary format details about jpg, so I cannot help you here.
    JPEG is an extremely common binary file format and JPEG files will be
    found on most general purpose computers.
    All you need to know for the purposes of the discussion is that the
    first four bytes are segment identifiers and must have the values I
    gave, whilst bytes five and six are a big endian 16 bit number that
    represents a segment length, and that potentially any of those values
    could be unexpected and you might want to inspect them.

    So how would you achieve that in a convenient and non-error prone way?

    $ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
    it is a jpeg

    That doesn't work for me:

    root@xxx:/mnt/c/mx# ls card2.jpg
    card2.jpg
    root@xxx:/mnt/c/mx# if file card2.jpg | grep JPEG >
    /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
    >
    >

    I just get a lone ">". If press Enter, I get more. If I press Ctrl=D, it
    says:

    > -bash: syntax error: unexpected end of file
    logout

    I think anyway that you need to grep for JFIF not JPEG, but that is a
    really poor way to check for a JPEG file. Any text or binary file can
    have a JFIF byte sequence.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Thu Feb 1 19:36:11 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 01.02.2024 16:41, Malcolm McLean wrote:

    So could you list one or two reasons why you might prefer a program with
    five subroutines, and one or two reasons why you might prefer to write
    five programs which communicate via piped data?

    A quite appealing and naturally appearing task (from the past) to use
    pipes was to model communication cascades. Something like (off the top
    of my head)...

    data-source | sign | compress | crc | encrypt | channel-enc |
    interleaver | channel-simulator | deinterleaver | channel-dec |
    decrypt | crc-check | uncompress | check-sign | data-sink

    Component-pairs can be omitted, say you may leave out the un-/compress function. And every component may be either special purpose or general.
    A special purpose entity could be BCH-enc and RCPC-enc, or it can also
    be (if better suited) a combined module, say 'crc -16' vs. 'crc -32'
    with the function realized as option argument.

    There was also the widely used netpbm package for translating
    between different image formats.

    https://en.wikipedia.org/wiki/Netpbm

    $ giftopnm somepic.gif | ppmtobmp > somepic.bmp
    $ for i in *.png; do pngtopam $i | ppmtojpeg >`basename $i .png`.jpg; done

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Scott Lurndal on Thu Feb 1 19:50:04 2024
    On 2024-02-01, Scott Lurndal <scott@slp53.sl.home> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 01.02.2024 16:41, Malcolm McLean wrote:

    So could you list one or two reasons why you might prefer a program with five subroutines, and one or two reasons why you might prefer to write
    five programs which communicate via piped data?

    A quite appealing and naturally appearing task (from the past) to use
    pipes was to model communication cascades. Something like (off the top
    of my head)...

    data-source | sign | compress | crc | encrypt | channel-enc |
    interleaver | channel-simulator | deinterleaver | channel-dec |
    decrypt | crc-check | uncompress | check-sign | data-sink

    Component-pairs can be omitted, say you may leave out the un-/compress function. And every component may be either special purpose or general.
    A special purpose entity could be BCH-enc and RCPC-enc, or it can also
    be (if better suited) a combined module, say 'crc -16' vs. 'crc -32'
    with the function realized as option argument.

    There was also the widely used netpbm package for translating
    between different image formats.

    https://en.wikipedia.org/wiki/Netpbm

    $ giftopnm somepic.gif | ppmtobmp > somepic.bmp
    $ for i in *.png; do pngtopam $i | ppmtojpeg >`basename $i .png`.jpg; done

    Also, in regard to some silly objections upthread about the danger of
    binary data on standard output, programs in Unix can easily do the
    following (and arguably should):

    if (isatty(STDOUT_FILENO)) {
        fprintf(stderr, "Cowardly refusing to dump binary data to a terminal.\n");
        exit(EXIT_FAILURE);
    }

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Harnden@21:1/5 to Kaz Kylheku on Thu Feb 1 20:03:59 2024
    On 01/02/2024 19:50, Kaz Kylheku wrote:
    On 2024-02-01, Scott Lurndal <scott@slp53.sl.home> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 01.02.2024 16:41, Malcolm McLean wrote:

    So could you list one or two reasons why you might prefer a program with five subroutines, and one or two reasons why you might prefer to write five programs which communicate via piped data?

    A quite appealing and naturally appearing task (from the past) to use
    pipes was to model communication cascades. Something like (off the top
    of my head)...

    data-source | sign | compress | crc | encrypt | channel-enc |
    interleaver | channel-simulator | deinterleaver | channel-dec |
    decrypt | crc-check | uncompress | check-sign | data-sink

    Component-pairs can be omitted, say you may leave out the un-/compress
    function. And every component may be either special purpose or general.
    A special purpose entity could be BCH-enc and RCPC-enc, or it can also
    be (if better suited) a combined module, say 'crc -16' vs. 'crc -32'
    with the function realized as option argument.

    There was also the widely used netpbm package for translating
    between different image formats.

    https://en.wikipedia.org/wiki/Netpbm

    $ giftopnm somepic.gif | ppmtobmp > somepic.bmp
    $ for i in *.png; do pngtopam $i | ppmtojpeg >`basename $i .png`.jpg; done

    Also, in regard to some silly objections upthread about the danger of
    binary data on standard output, programs in Unix can easily do the
    following (and arguably should):

    if (isatty(STDOUT_FILENO)) {
        fprintf(stderr, "Cowardly refusing to dump binary data to a terminal.\n");
        exit(EXIT_FAILURE);
    }


    Yes, so common that the shell has test -t

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Keith Thompson on Thu Feb 1 20:59:14 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    bart <bc@freeuk.com> writes:
    On 01/02/2024 16:30, Scott Lurndal wrote:
    [...]
    $ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
    it is a jpeg

    That doesn't work for me:

    Not if you type the "^J"s as '^' and 'J'. They were intended to
    represent newlines. I would use semicolons instead:

    Yes, that's an artifact of ksh history entries for multiline commands.

    I should have edited it before posting.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Keith Thompson on Thu Feb 1 21:25:33 2024
    On 01/02/2024 20:09, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    On 01/02/2024 16:30, Scott Lurndal wrote:
    [...]
    $ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
    it is a jpeg

    That doesn't work for me:

    Not if you type the "^J"s as '^' and 'J'. They were intended to
    represent newlines. I would use semicolons instead:

    $ if file /tmp/garage.jpg | grep JPEG > /dev/null ; then echo "it is a jpeg" ; fi
    it is a jpeg

    (I might also use "grep -q" rather than redirecting to /dev/null.)

    [...]

    I think anyway that you need to grep for JFIF not JPEG, but that is a
    really poor way to check for a JPEG file. Any text or binary file can
    have a JFIF byte sequence.

    That's not an issue. "file" doesn't just look for "JFIF" to determine
    that a file is a jpg.

    I see, so 'file' is a special command that does all the work. grep
    checks whether the description contains JPEG. Although it won't work for
    any of my private formats.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Thu Feb 1 22:41:05 2024
    On 01/02/2024 18:35, Janis Papanagnou wrote:
    On 01.02.2024 11:30, David Brown wrote:
    On 31/01/2024 14:35, Janis Papanagnou wrote:

    First; the EU publishes in all languages of the member states,
    for example. (There's no single lingua franca.)

    Weirdly, while Norway is not in the EU but Sweden and Denmark are, they
    publish (for some things at least) in Norwegian but not in Swedish or
    Danish. [...]

    Hmm.. - in my ears this sounds strange. I've looked it up and found...

    "The EU has 24 official languages:

    Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish,
    French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian,
    Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish and
    Swedish."

    I can't say I have looked this up myself, or particularly care what
    languages are used there. Maybe it only applied to some documents, or
    used to apply but no longer does. Maybe some things don't stick rigidly
    to the official languages, or maybe different guidelines are used for
    internal documents.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Thu Feb 1 22:46:16 2024
    On 01/02/2024 17:24, Janis Papanagnou wrote:
    On 01.02.2024 11:34, David Brown wrote:
    On 31/01/2024 19:35, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:

    I regularly see it as more symmetrical and clearer to push data left
    to right. So I might write "cat infile | grep foo | sort > outfile".
    Of course I could use "<" redirection, but somehow it seems more
    natural to me to have this flow. I'll use "<" for simpler cases.

    But perhaps this is just my habit, and makes little sense to other
    people.

    I completely understand that.


    You can also use:

    < infile grep foo | sort > outfile

    Redirections don't have to be written after a command.

    Indeed. And if we also respect that 'grep' accepts arguments,
    then it's even more compact and yet probably better legible... :-)

    grep foo infile | sort > outfile


    I did not know you could write it that way - thanks for another
    off-topic, but useful, tip.

    Yes. We certainly should instead have written

    grep foo iso646.h | sort > outfile



    I'm happy using different arrangements at different times. Sometimes I
    think one way is clearer, or easier to type, sometimes I think another
    way is better. As a vague rule, I will usually use "grep foo infile" if
    it is stand alone, or at most piped into "less". If I have a larger
    chain, it seems more natural to me to move the data left to right in a pipe.

    I'm sure I cost my computer a few microseconds of extra effort, but I
    don't worry too much about that!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Feb 1 23:02:17 2024
    On 01/02/2024 18:06, bart wrote:
    On 01/02/2024 14:50, David Brown wrote:
    On 01/02/2024 02:29, bart wrote:
    On 01/02/2024 00:47, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    According to you, these tools are poorly designed.  I don't think so. How would you design them?  Endless input and output file names to be juggled and tidied up afterwards?

    I think they're poorly designed too.

    Of course you do.   They're not bart programs.


     From the POV of interactive console programs, they /are/ poor.

    You don't provide any reason why - do elucidate!

    They only do one thing, like you can't first do A, then B. They don't
    give any prompts. They often apparently do nothing (so you can't tell
    if they're busy, waiting for input, or hanging). There is no dialog.

    That's the whole point!

    If you want to do A, then B, then you do "A | B", or "A; B", or "A &&
    B" or "A || B".  And if you want to do A, then B twice, then C, then A
    again, you write "A | B | B | C | A".  Other operator choices let you
    say "do this then that", or "do this, and if successful do that", etc.

    Your monolithic AB program fails when you want to do C, or want to do
    A and B in a way the AB author didn't envisage.

    You have a Transformer - a toy that can be either a car or a robot.
    I've got a box of Lego.  Sometimes I need instructions and a bit of
    time, but I can have a car, a robot, a plane, an alien, a house, and
    anything else I might want.


    You can only do one thing, as you can only have one unbroken byte
    sequence as output sent to stdout.

    I can do one sequence of many things.

    Your suggested "clipboard" idea was no different.

    But it's not difficult to have intermediary files if you want to do more complicated things.


    You can't send output A to stdout, then B to stdout, and certainly can't interleave messages to the console on stdout, as that would then be all
    mixed up with the possibly binary data, and if redirected, you won't see
    it.

    $ cat A
    one
    two
    three

    $ cat B

    cat
    dog
    cow

    $ (cat A; cat B) | wc -l
    6

    That's the output of two commands, "cat A" and "cat B", each going to
    their stdout, and they are concatenated into a single pipe going to the
    "wc -l" command to count the lines.

    And if I wanted to redirect them to a file "x" and also view them, I'd
    write :

    $ (cat A; cat B) | tee x
    one
    two
    three
    cat
    dog
    cow

    $ wc -l x
    6 x
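
    And the usual answer to the "interleaved messages" worry is that
    prompts, progress reports and diagnostics belong on stderr, which stays
    attached to the terminal even when stdout is piped or redirected. A
    minimal sketch of the convention (the program and record names are made
    up purely for illustration):

    /* emit: data records go to stdout, progress chatter goes to stderr.
       "emit > out" or "emit | wc -c" keeps the messages visible on the
       terminal while the data goes into the file or down the pipe. */
    #include <stdio.h>

    int main(void)
    {
        for (int i = 0; i < 3; i++) {
            fprintf(stderr, "processing block %d...\n", i);  /* to the user */
            printf("data-record-%d\n", i);                   /* down the pipe */
        }
        return 0;
    }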


    I'm not sure we are getting anywhere with you trying to invent more and
    more complex situations in an attempt to find something that can't be
    done from a Linux bash shell.


    I can see the idea of having one permanently open channel, but call it stdbinout or stdpipeout. But you still won't be able to generate a
    sequence of distinct data blocks along that one channel because it is continuous.

    This why 'as' only ever produces one object file, even for multiple
    input source files.

    "as" produces one object file, because that's what the program does. If
    you want two object files, run it twice. In Unix systems, starting
    programs and running lots of programs at a time is cheap. It's not a
    system that requires monolithic programs in order to work efficiently.


    And explains why 'as' treats multiple .s input files as though they were
    all part of the same single source file: you can take one .s file, chop
    it up into multiple .s files, and submit them all to 'as' (keeping the
    right order).

    It does that because that's what makes sense. If you want to assemble
    multiple .s files into individual .o files, then you do that. If you
    want to assemble them into a single .o file, then you do that. Your
    choice. Having it generate multiple .o files for multiple .s inputs
    would restrict that choice.


    It's a feature! It's also the whackiest assembler I've encountered, this century anyway. The fact that it's implemented as a crude filter with
    one input stream and one output stream helps explain it.

    Although it works differently from most such filters, because if its
    output is not piped, and not redirected, it is sent to a file (always
    called a.out). It's not quite crazy enough to send binary object file
    data to the terminal; I wonder why not?

    You really are scraping the bottom of the barrel to try to justify your irrational hatreds, aren't you? You put a lot of effort into
    desperately trying to dislike programs that don't work exactly the way
    your programs work. It's a very strange hobby you have.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Thu Feb 1 23:06:17 2024
    On 01/02/2024 16:41, Malcolm McLean wrote:
    On 01/02/2024 14:55, David Brown wrote:
    On 01/02/2024 02:53, Malcolm McLean wrote:
    On 31/01/2024 23:36, Ben Bacarisse wrote:

    An example where it's really useful not to care: I have a suite of tools
    for doing toy cryptanalysis.  Some apply various transformations and/or
    filters to byte streams and others collect and output (on stderr)
    various statistics.  Plugging them together in various pipelines is very
    handy when investigating an encrypted text.  The output is almost always
    "binary" in the sense that there would be no point in looking at it on a
    terminal.

    According to you, these tools are poorly designed.  I don't think so.
    How would you design them?  Endless input and output file names to be
    juggled and tidied up afterwards?

    I'd write a monolithic program.

    It's very strange to me to see people that consider themselves
    programmers talk about having multiple small functions to do specific
    tasks and combining them into bigger functions to solve bigger
    problems, yet are reduced to quivering jellies at the thought of
    multiple small programs to do specific tasks that can be combined to
    solve bigger tasks.

    Do you think the C standard library would be improved by a single
    function "flubadub" that takes 20 parameters and can calculate
    logarithms, print formatted text, allocate memory and write it all to
    a file?

    By breaking down the problem into several parts e.g. "collect
    statistical data, analyse statistics, form hypothesis, attempt
    decryption, check decrypt for plausible plaintext" we can usually attack
    it better. And you're right, there's not a fundamental difference
    between writing one program with five subroutines, or five programs
    which pass data to each other via pipelines.

    That's not what I said. Try re-reading. I can't be bothered arguing
    against yet another straw man.

    But that doesn't mean that decision must not be made, or that you can't
    give reasons for and against each option.

    So could you list one or two reasons why you might prefer a program with
    five subroutines, and one or two reasons why you might prefer to write
    five programs which communicate via piped data?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Chris M. Thomasson on Thu Feb 1 22:25:18 2024
    "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
    On 2/1/2024 1:25 PM, bart wrote:
    On 01/02/2024 20:09, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    On 01/02/2024 16:30, Scott Lurndal wrote:
    [...]
    $ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
    it is a jpeg

    That doesn't work for me:

    Not if you type the "^J"s as '^' and 'J'.  They were intended to
    represent newlines.  I would use semicolons instead:

         $ if file /tmp/garage.jpg | grep JPEG > /dev/null ; then echo "it is a jpeg" ; fi
         it is a jpeg

    (I might also use "grep -q" rather than redirecting to /dev/null.)

    [...]

    I think anyway that you need to grep for JFIF not JPEG, but that is a
    really poor way to check for a JPEG file. Any text or binary file can
    have a JFIF byte sequence.

    That's not an issue.  "file" doesn't just look for "JFIF" to determine
    that a file is a jpg.

    I see, so 'file' is a special command that does all the work. grep
    checks whether the description contains JPEG. Although it won't work for
    any of my private formats.



    Why would it work with your private formats? ;^)

    It will. He needs to describe the classification criteria
    in /etc/magic.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Thu Feb 1 22:24:48 2024
    bart <bc@freeuk.com> writes:
    On 01/02/2024 20:09, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    On 01/02/2024 16:30, Scott Lurndal wrote:
    [...]
    $ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
    it is a jpeg

    That doesn't work for me:

    Not if you type the "^J"s as '^' and 'J'. They were intended to
    represent newlines. I would use semicolons instead:

    $ if file /tmp/garage.jpg | grep JPEG > /dev/null ; then echo "it is a jpeg" ; fi
    it is a jpeg

    (I might also use "grep -q" rather than redirecting to /dev/null.)

    [...]

    I think anyway that you need to grep for JFIF not JPEG, but that is a
    really poor way to check for a JPEG file. Any text or binary file can
    have a JFIF byte sequence.

    That's not an issue. "file" doesn't just look for "JFIF" to determine
    that a file is a jpg.

    I see, so 'file' is a special command that does all the work. grep
    checks whether the description contains JPEG. Although it won't work for
    any of my private formats.

    Like anything in unix, the 'file(1)' command is flexible.

    There is a file, /etc/magic, that an installation can use
    to describe custom file formats.
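
    The format is essentially one test per line: offset, type, test value,
    message. A made-up entry for a private format might look like this (the
    magic string "BAX1" is purely an invented example):

    # offset  type    test    message
    0         string  BAX1    bart private container data

    With a matching line in place, 'file' would then report "bart private
    container data" for such files.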

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Feb 1 22:56:05 2024
    On 01/02/2024 22:02, David Brown wrote:
    On 01/02/2024 18:06, bart wrote:

    You can't send output A to stdout, then B to stdout, and certainly
    can't interleave messages to the console on stdout, as that would then
    be all mixed up with the possibly binary data, and if redirected, you
    won't see it.

    $ cat A
    one
    two
    three

    $ cat B

    cat
    dog
    cow

    $ (cat A; cat B) | wc -l
    6

    That's the output of two commands, "cat A" and "cat B", each going to
    their stdout, and they are concatenated into a single pipe going to the
    "wc -l" command to count the lines.

    I see you don't get it. This is the equivalent of a program which is
    supposed to do this:

    print A to TTY
    print B to LPT
    print C to TTY
    print D to LPT

    but instead is written as this:

    print A to TTY
    print B to TTY
    print C to TTY
    print D to TTY

    and you are expected to redirect all TTY output to LPT.

    At least, on LPT, B and D can each start with a separate title page; on
    stdout directed to a file, it will be all mixed up.

    I'm not sure we are getting anywhere with you trying to invent more and
    more complex situations in an attempt to find something that can't be
    done from a Linux bash shell.

    They're remarkably simple situations!

    And explains why 'as' treats multiple .s input files as though they
    were all part of the same single source file: you can take one .s
    file, chop it up into multiple .s files, and submit them all to 'as'
    (keeping the right order).

    It does that because that's what makes sense.

    Sorry, but it is rubbish.

    Having it generate multiple .o files for multiple .s inputs
    would restrict that choice.

    And yet that is exactly what gcc does; see below.

    You really are scraping the bottom of the barrel to try to justify your irrational hatreds, aren't you?  You put a lot of effort into
    desperately trying to dislike programs that don't work exactly the way
    your programs work.

    Because their UIs are rubbish. They are inconsistent. They are
    restricted. And yet they are deified for some inexplicable reason. Over
    the past few decades nobody has written a better assembler?

    At least there are external ones you can use instead, but gcc will not
    generate .s files in the right syntax (I guess that's another tool you
    will pull out of that bottomless bag of such tools).

      It's a very strange hobby you have.

    I write language tools. Ones which are always derided in this newsgroup. They've included quite a few assemblers.

    And yet they all have sensible command line interfaces that do what you
    expect.

    Which is more that can be said for gcc and especially 'as'.

    If you do this on gcc:

    gcc one.s two.s

    it will create one.o and two.o. Do this on as:

    as one.s two.s

    it will not only create one file a.out, but will concatenate both,
    giving unexpected results:

    c:\c>gcc -S cipher.c hmac.c sha2.c

    c:\c>as cipher.s hmac.s sha2.s
    hmac.s: Assembler messages:
    hmac.s:328: Error: symbol `.L18' is already defined
    hmac.s:352: Error: symbol `.L17' is already defined
    ...

    WTF? Compare with this:

    c:\c>mcc -s cipher hmac sha2
    Compiling 3 files to .asm

    c:\c>aa cipher hmac sha2
    Assembling cipher.asm to cipher.exe

    It works impeccably. Even better than 'as', even if that had worked as expected, because there you still have the task of linking the outputs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Thu Feb 1 23:49:50 2024
    On Wed, 31 Jan 2024 15:10:05 +0200, Michael S wrote:

    their earlier mainframe line (PDP-6/10/20).

    Nitpick: there was PDP-6 and PDP-10, but never “PDP-20”. There were
    systems sold as “DECsystem-10” and “DECsystem-20”, but they were all based
    on PDP-10 hardware.

    The difference was I think primarily down to the OS: the “DECsystem-10” installations shipped with TOPS-10, while the “DECsystem-20” machines shipped with TOPS-20, which was based on the groundbreaking TENEX from
    BBN.

    TOPS-20 was not in any sense a “successor” to TOPS-10. I suppose it should have been, but it was too big a shift from the old way of doing things, so
    some customers preferred to stick with the clunkier OS. So both the -10
    and -20 lines were being sold concurrently.

    (End nitpick.)

    (Begin extra trivia part.)

    The PDP-6 was DEC’s first foray into “large-scale” systems (it had a 36-bit word length, greater than any earlier DEC machine). They only made a
    few, at great expense, and lost money on them. So the line was
    discontinued, with the proclamation that they were never getting into 36-
    bit machines again.

    Then two years later, the architecture was revived as the PDP-10 ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Janis Papanagnou on Thu Feb 1 23:52:15 2024
    On Wed, 31 Jan 2024 18:06:13 +0100, Janis Papanagnou wrote:

    Yet I don't understand the relation to Linus Torvalds that was the
    source of mentioning VMS. - I mean; only that he dislikes it is not much
    of a news.

    It was the reason he gave for disliking it: you could not easily determine
    the length in bytes of a file.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Fri Feb 2 00:08:09 2024
    On Wed, 31 Jan 2024 14:45:49 +0100, David Brown wrote:

    On 31/01/2024 07:07, Lawrence D'Oliveiro wrote:

    On Tue, 30 Jan 2024 17:25:31 +0100, David Brown wrote:

    Mixing binary data with formatted text data is very unlikely to be
    useful.

    PDF does exactly that. To the point where the spec suggests putting
    some random unprintable bytes up front, to distract format sniffers
    from thinking they’re looking at a text file.

    PDF files start with the "magic" indicator "%PDF", which is enough for
    many programs to identify them correctly.

    Sure, if you were looking for PDF files specifically.

    But consider the more generic case of file-transfer tools that try to automatically convert between line-endings for text files on different platforms: if they mistook a PDF file for text, they could screw it up
    royally.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Fri Feb 2 01:59:51 2024
    On 01.02.2024 22:41, David Brown wrote:
    On 01/02/2024 18:35, Janis Papanagnou wrote:
    On 01.02.2024 11:30, David Brown wrote:
    On 31/01/2024 14:35, Janis Papanagnou wrote:

    First; the EU publishes in all languages of the member states,
    for example. (There's no single lingua franca.)

    Weirdly, while Norway is not in the EU but Sweden and Denmark are, they
    publish (for some things at least) in Norwegian but not in Swedish or
    Danish. [...]

    Hmm.. - in my ears this sounds strange. I've looked it up and found...

    "The EU has 24 official languages:

    Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish,
    French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian,
    Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish and
    Swedish."

    I can't say I have looked this up myself, or particularly care what
    languages are used there. Maybe it only applied to some documents, or
    used to apply but no longer does. Maybe some things don't stick rigidly
    to the official languages, or maybe different guidelines are used for internal documents.

    Extremely unlikely. - More likely that you were just misremembering
    or the document you have in mind was not an official EU document.

    (But if you can dig it up and provide that evidence I'm interested.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Fri Feb 2 01:13:17 2024
    On Wed, 31 Jan 2024 23:25:25 +0000, Malcolm McLean wrote:

    In ASCII, 0 means NUL, or "ignore".

    Fun fact: one of the names for hex 7F was “rubout”. On seven-track paper tape, if you made a mistake typing your program, instead of throwing away
    the tape and starting again, you could go back and punch out all the holes
    at that position to produce a “rubout” character. The meaning was “ignore this character”.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Fri Feb 2 02:10:12 2024
    On 02.02.2024 01:08, Lawrence D'Oliveiro wrote:

    But consider the more generic case of file-transfer tools that try to automatically convert between line-endings for text files on different platforms: if they mistook a PDF file for text, they could screw it up royally.

    What tools are you specifically thinking of? - I recall in FTP you
    explicitly set bin-mode or text-mode. I assume that protocols like
    FTAM (CCITT) would also transfer files reliably. I would certainly
    try to avoid tools that operate unreliably or can't be switched to
    operate correctly with [8 bit] "binary" or [ASCII] text files.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Fri Feb 2 01:08:24 2024
    On Thu, 1 Feb 2024 23:02:17 +0100, David Brown wrote:

    But it's not difficult to have intermediary files if you want to do more complicated things.

    This is the point where someone says “I wish a shell script pipeline could express a general flow graph”.

    .

    .

    .

    .

    .

    ... nobody?
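
    (For the record, a rough sketch of one way to get a small fan-out/fan-in
    flow graph with named pipes; the file names and filters are made up for
    illustration.)

    mkfifo upper.fifo lower.fifo
    tr 'a-z' 'A-Z' < upper.fifo > upper.out &            # branch 1
    tr 'A-Z' 'a-z' < lower.fifo > lower.out &            # branch 2
    tee upper.fifo lower.fifo < input.txt > /dev/null    # fan the stream out
    wait
    paste upper.out lower.out > merged.out               # fan back in, line by line
    rm upper.fifo lower.fifo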

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Fri Feb 2 02:18:35 2024
    On 31.01.2024 19:24, David Brown wrote:
    On 31/01/2024 19:01, Janis Papanagnou wrote:

    All I can say is that the Unix shell was a reliable companion
    wherever we had to automate tasks on Unix systems or on Cygwin
    enhanced Windows.

    Automation is certainly easier with good scripting - whatever the
    language or shell.

    Sure. And shell was always available as part of standard Unix.
    That was not (not always) true for other languages, like Perl
    (for example).

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to David Brown on Fri Feb 2 02:48:58 2024
    On 31.01.2024 19:20, David Brown wrote:
    On 31/01/2024 16:25, Janis Papanagnou wrote:
    On 31.01.2024 15:21, David Brown wrote:
    On 31/01/2024 09:36, Malcolm McLean wrote:
    [ I snipped a couple of "I actually don't know/need it" things ]

    But now it's effectively a programming language, and, from the example code, a very poorly designed one which is cryptic and fussy and liable to be hard to maintain. So it's better to use a language like Perl to
    achieve the same thing, and I did have a few Perl scripts handy for
    repetitive jobs of that nature in my Unix days.

    That gave me a laugh! You think bash is cryptic, fussy and poorly
    designed, and choose /Perl/ as the alternative :-)

    I don't think it's that clear a joke. The Unix shell is extremely
    error prone to program, and you should not let a newbie write shell
    programs without careful supervision. ("newbie" [in shell context]
    = less than 10 years of practical experience. - Am I exaggerating?
    Maybe. But not much.)

    I'm not a great fan of shell programming - anything advanced, and I tend
    to reach for Python. But I think that is a matter of familiarity and practice. But if you consider bash programming as difficult to get
    right, I'll not argue.

    Not specifically bash programming, the same is true for ksh, etc.;
    it's the underlying shell design that has a lot of pitfalls. And
    it's not only about familiarity with the tool - of course being
    familiar with the concepts is necessary. But there are still enough
    pits that even long-time programmers stumble into. (I'm saying
    that as someone who did 35+ years of ksh programming, gave courses,
    defined shell coding standards, followed for 20+ years the problems
    that users had in comp.unix.shell, and even saw experienced shell
    book authors (and I'm not even mentioning bloggers) fail in
    some instances of the language.)

    But of course, with knowledge and discipline, you can also write
    fine shell programs.

    Despite the shell inherent issues I like it because I can solve
    some types of tasks reliably in Unix context.


    Perl is famously known as a "write-only" language. Sure, it is possible
    to write good, clear, maintainable Perl code - but few people do that.

    I've programmed just a few times in Perl, mostly only extending
    existing programs. But a friend of mine is leading a Perl user
    group in our city; his programs (despite some cryptic elements)
    are still quite legible.


    Thus the idea that finding bash cryptic or difficult and using Perl
    instead is the joke.

    Well, in shell it's all that '1>&2' and '${f##*/}' and whatnot
    stuff that can only be called cryptic. I wouldn't count regexps
    because that's the base in any proper scripting language. What
    remains for Perl to be cryptic? The variable type prefixes are
    the most prominent punctuation elements that pop into my head.
    If one wants more legible yet simple scripts one could resort
    to Awk; but its focus is different from shell. (Perl supports
    both, but is not everywhere available.)
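
    For readers who haven't met those two constructs, a quick gloss (the path
    is only an example):

    echo "something went wrong" 1>&2   # redirect the message to stderr (fd 2)
    f=/usr/local/bin/tool
    echo "${f##*/}"                    # strip the longest '*/' prefix; prints "tool"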

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Fri Feb 2 02:12:43 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Wed, 31 Jan 2024 14:45:49 +0100, David Brown wrote:

    On 31/01/2024 07:07, Lawrence D'Oliveiro wrote:

    On Tue, 30 Jan 2024 17:25:31 +0100, David Brown wrote:

    Mixing binary data with formatted text data is very unlikely to be
    useful.

    PDF does exactly that. To the point where the spec suggests putting
    some random unprintable bytes up front, to distract format sniffers
    from thinking they’re looking at a text file.

    PDF files start with the "magic" indicator "%PDF", which is enough for
    many programs to identify them correctly.

    Sure, if you were looking for PDF files specifically.

    But consider the more generic case of file-transfer tools that try to automatically convert between line-endings for text files on different platforms: if they mistook a PDF file for text, they could screw it up royally.

    Just another reason not to use the system with two-byte line endings.

    Not a problem on unix.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Fri Feb 2 02:15:05 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Wed, 31 Jan 2024 23:25:25 +0000, Malcolm McLean wrote:

    In ASCII, 0 means NUL, or "ignore".

    Fun fact: one of the names for hex 7F was “rubout”.

    Additional fun fact. Rubout was the legend on the keycap on
    the ASR-33 used to rub out the prior character (the A in ASR
    means it has the reader/punch). On paper tape, it means ignore
    the prior character.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to bart on Fri Feb 2 03:13:34 2024
    On 01.02.2024 23:56, bart wrote:
    On 01/02/2024 22:02, David Brown wrote:
    On 01/02/2024 18:06, bart wrote:

    You can't send output A to stdout, then B to stdout, and certainly
    can't interleave messages to the console on stdout, as that would
    then be all mixed up with the possibly binary data, and if
    redirected, you won't see it.

    $ cat A
    one
    two
    three

    $ cat B

    cat
    dog
    cow

    $ (cat A; cat B) | wc -l
    6

    That's the output of two commands, "cat A" and "cat B", each going to
    their stdout, and they are concatenated into a single pipe going to
    the "wc -l" command to count the lines.

    I see you're trying to teach shell basics to Bart; interesting.


    I see you don't get it. This is the equivalent of a program which is

    What is meant by "This"? (I cannot find code or description.)

    supposed to do this:

    print A to TTY
    print B to LPT
    print C to TTY
    print D to LPT

    Is that the task? Are A-D program names? Any other requirements
    or restrictions you impose?

    cat A
    cat B | lpr # of course you don't need cat here: lpr B
    cat C
    cat D | lpr # of course you don't need cat here: lpr D

    You want stdout collected? Do it as David suggested, e.g.

    { cat A ; lpr B ; cat C ; lpr D ;} | processor-for-A-and-C

    Or you want to split A, B, C, or D to different channels or tools?
    Then use 'tee', and/or use redirection to files, and/or to processes.


    but instead is written as this:

    print A to TTY
    print B to TTY
    print C to TTY
    print D to TTY

    and you are expected to redirect all TTY output to LPT.

    At least, on LPT, B and D can each start with a separate title page; on stdout directed to a file, it will be all mixed up.

    See above. - Any other task? - Or did you mean something else?

    (It's hard to believe that there's something possible in the DOS
    cmd/bat/bart world that wouldn't be possible in Unix shell - besides
    blue screens of course.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to All on Fri Feb 2 03:26:02 2024
    bart <bc@freeuk.com> writes:
    [...]
    I've just realised why it is that your filter programs don't show
    prompts or any kinds of messages: because those are sent to stdout,
    and therefore will screw up any data that is being sent there as the
    primary output.

    Erm, no. Filter programs just don't need a prompt.

    And if you have a program that wants to provide a prompt, and provide
    data on standard output, you usually wouldn't use stdout for that;
    use stderr or /dev/tty, depending on the intention of the tool.

    And even if you intend to use a prompt on stdout (which really sounds
    like a weak idea) you can program a test of whether your file descriptor
    is attached to the tty if you like.

    To demonstrate that, here's some ksh code...

    $ ( exec 0>&- ; [[ -t 0 ]] ; echo $? ) # stdin closed
    1 # error
    $ ( [[ -t 0 ]] ; echo $? ) # stdin attached to tty
    0 # okay
    $ ls x | ( [[ -t 0 ]] ; echo $? ) # stdin attached to pipe
    1 # error
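
    A minimal sketch of how a tool might use that test (names are
    illustrative): prompt on stderr only when it is attached to a terminal,
    so piped or redirected runs stay clean.

    if [ -t 2 ]; then
        printf 'Enter a file name: ' >&2
    fi
    IFS= read -r name
    wc -c < "$name"        # the actual result still goes to stdout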


    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Scott Lurndal on Fri Feb 2 03:42:58 2024
    On 02.02.2024 03:12, Scott Lurndal wrote:
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    But consider the more generic case of file-transfer tools that try to
    automatically convert between line-endings for text files on different
    platforms: if they mistook a PDF file for text, they could screw it up
    royally.

    Just another reason not to use the system with two-byte line endings.

    That cannot always be avoided.


    Not a problem on unix.

    There are several situations where it matters to consider CR/LF or when
    some OS setting may handle these line terminators. Even if you're only
    staying in your Unix universe. The "funniest" thing is if you process
    files that have been edited by different people on different platforms.
    (I know that I am not the first one who has written a CR-LF-CRLF tool
    to check and fix (in some consistent way) the line endings of files.)
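
    Such a tool doesn't need to be big; a rough sketch of the check-and-fix
    part, assuming you simply normalize to LF (the glob is illustrative):

    cr=$(printf '\r')
    for f in *.txt; do
        if grep -q "$cr" "$f"; then
            echo "normalizing line endings in $f" >&2
            tr -d '\r' < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
        fi
    done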

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Fri Feb 2 03:47:23 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 02.02.2024 03:12, Scott Lurndal wrote:
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    But consider the more generic case of file-transfer tools that try to
    automatically convert between line-endings for text files on different
    platforms: if they mistook a PDF file for text, they could screw it up
    royally.

    Just another reason not to use the system with two-byte line endings.

    That cannot always be avoided.


    Not a problem on unix.

    There are several situations where it matters to consider CR/LF or when
    some OS setting may handle these line terminators. Even if you're only staying in your Unix universe. The "funniest" thing is if you process
    files that have been edited by different people on different platforms.
    (I know that I am not the first one who has written a CR-LF-CRLF tool
    to check and fix (in some consistent way) the line endings of files.)

    vim can load and save with either line ending, switching if the
    user wishes.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Fri Feb 2 05:08:13 2024
    On Fri, 02 Feb 2024 02:15:05 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Wed, 31 Jan 2024 23:25:25 +0000, Malcolm McLean wrote:

    Fun fact: one of the names for hex 7F was “rubout”.

    Additional fun fact. Rubout was the legend on the keycap on the ASR-33
    used to rub out the prior character (the A in ASR means it has the reader/punch). On paper tape, it means ignore the prior character.

    No, you had to overpunch the character to be ignored. Did that key automatically backspace the tape for you, or did you have to do it
    manually?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Fri Feb 2 09:44:15 2024
    On 01/02/2024 23:56, bart wrote:
    On 01/02/2024 22:02, David Brown wrote:

    I'm not sure we are getting anywhere with you trying to invent more
    and more complex situations in an attempt to find something that can't
    be done from a Linux bash shell.

    They're remarkably simple situations!

    As I said, you can try to invent more and more unrealistic situations to
    try to "prove" that bash is useless and Unix is flawed and its designers
    were incompetent, along with every other programmer and developer
    throughout time, while Bart from Usenet has the perfect solution for everything.

    If you are happy with everything you have made yourself, and think
    everything else is unusable, incompetently made, and probably designed specifically to annoy you personally, why bother with any of it? Why
    are you here, complaining about everything? What do you gain from
    frothing at the mouth like this, other than high blood pressure?

    Ask about C issues. Tell us about C programs. You can even ask about off-topic stuff - many of us have tried to give you lots of information
    about things you are ignorant of. But /please/ stop whining.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Fri Feb 2 09:51:04 2024
    On 01/02/2024 23:38, Malcolm McLean wrote:
    I'm sure you're capable of going through the exercise and then you might
    gain a bit of insight on how to design such software systems. And, no, arguing that you'd go for a monolithic program doesn't necessarily mean
    that you are a "quivering jelly" at the thought of writing several
    simpler ones. And in fact to start you off I actually mentioned a few advantages of the pipeline approach.

    I am perfectly aware of the advantages and disadvantages of monolithic approaches.

    I am also perfectly aware that you won't read that previous sentence, understand it, or consider it before making up your next pointless straw
    man or making up another lecture on something you know nothing about
    while the rest of us do.


    There are advantages and drawbacks to both. But I can't force you to
    think about what those might be if you won't, and from experience just telling you provokes your natural contentiousness and isn't very effective.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Keith Thompson on Fri Feb 2 12:41:28 2024
    On 01.02.2024 21:09, Keith Thompson wrote:
    [...] I would use semicolons instead:

    $ if file /tmp/garage.jpg | grep JPEG > /dev/null ; then echo "it is a jpeg" ; fi
    it is a jpeg

    (I might also use "grep -q" rather than redirecting to /dev/null.)

    And probably also avoid multi-line code (to prevent the ^J confusion)

    file /tmp/garage.jpg | grep -q JPEG && echo "it is a jpeg"


    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Fri Feb 2 12:33:46 2024
    On 01.02.2024 16:50, Malcolm McLean wrote:
    On 01/02/2024 15:07, Janis Papanagnou wrote:

    I don't know binary format details about jpg, so I cannot help you here.

    JPEG is an extremely common binary file format and JPEG files will be
    found on most general purpose computers.

    Really?!

    All you need to know for the purposes of the discussion is that the
    first four bytes are segment identifiers and must have the values I

    All I wanted to know is what you intended to do. The intended
    task.

    gave, whilst bytes five and six are a big endian 16 bit number that represents a segment length, and that potentially any of those values
    could be unexpected and you might want to inspect them.

    And from subsequent posts I assume you want to test the values
    to determine the file type.

    So how would you achieve that in a convenient and non-error prone way?

    If I were interested in the file type of the data I'd do what
    has already been suggested by others, use the 'file' command.
    It's the purpose of that command to try to determine the file
    type by objective means, or, if that is not unambiguously
    possible, by heuristic means.

    In case I want to do a more thorough analysis of such binary data,
    what I've written upthread applies: you need knowledge about the
    binary data structures (and would use the tools I mentioned).

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Fri Feb 2 13:34:52 2024
    On 02.02.2024 00:57, Malcolm McLean wrote:

    The difference is that the syntax for redirecting output in the UNIX
    shell is only of the slightest use if you happen to run that particular
    type of system.

    Hasn't DOS adopted the basic redirections as well?
    And hasn't it even tried to mimic pipes?

    Of course redirection and its specific syntaxes are depending on
    the supporting OS. So what?

    In Unix they developed a terse version at a time when on other
    OSes you needed to "formulate novels" to invoke such features.
    And that version was good enough to have been adopted by other
    tools. And it's also simple enough that many users use these in
    their programming contexts effectively (and without complaints).

    [...] And whilst pipes are a concept, they are no way
    comparable in depth and fundamental importance to the concept of
    functions of functions.

    The point is the two are not comparable. [...]

    Of course. This point sounds completely reasonable to me.

    So the confusion was only about some inappropriate statement that
    had been used. (I seem to recall, and think I commented on that.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Fri Feb 2 13:17:55 2024
    On 02.02.2024 08:16, Malcolm McLean wrote:

    In Perl you have an implicit variable called $_. Some Perl statements
    will operate on $_ without it actually being specified, and you then
    have to reference $_ explicitly to obtain the result. It's highly
    confusing for anyone used to a conventional language with only one type
    of named variables. And that's one of the main decisions which makes
    Perl hard to read.

    Yes, I remember that '$_', and even though I'm not the typical Perl
    programmer (I told you about my only few contacts with Perl) it was
    not the least confusing to me. I think, what also generally holds,
    that if you want to use some tool you should at least make yourself
    familiar with its basic concepts. (This appears to be a quite common
    view, although here we often see complaints from folks with only
    basic or no knowledge about the objects of their complaint.)
    But Perl is also a large language, and it needs some time to learn
    or master it. But $_ seemed to me to be some basic thing.

    That said, now consider in comparison (e.g.) Ksh, which also has a
    variable '_' (referenced as '$_'); and the contents of Ksh's '_'
    even depend on the runtime context!
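
    As a rough illustration of that context dependence (details differ
    between ksh and bash, so treat this as a sketch of an interactive
    session rather than a specification):

    $ ls -l /tmp > /dev/null
    $ echo "$_"     # expands to the last argument of the previous command
    /tmp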


    However often you can write slightly less idiomatic Perl code which
    doesn't make use of this feature, and then it's clearer. Or you can lay
    the code out so that all the places where $_ is used in the same way
    are together and make it a bit easier to work out what is going on.
    There are things you can do, and Perl doesn't have to look like a
    confusing mess.

    Yes. Quite typical for many scripting languages. Sometimes there's
    even some desire to produce short, cryptic, "clever", forms of code.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Fri Feb 2 15:47:21 2024
    On 02/02/2024 14:28, Malcolm McLean wrote:
    On 02/02/2024 08:51, David Brown wrote:
    On 01/02/2024 23:38, Malcolm McLean wrote:
    I'm sure you're capable of going through the exercise and then you
    might gain a bit of insight on how to design such software systems.
    And, no, arguing that you'd go for a monolithic program doesn't
    necessarily mean that you are a "quivering jelly" at the thought of
    writing several simpler ones. And in fact to start you off I actually
    mentioned a few advantages of the pipeline approach.

    I am perfectly aware of the advantages and disadvantages of monolithic
    approaches.

    Well it's kind of proof of the pudding. Ben has several programs
    connected by pipelines and asked me what I thought of the design. I said
    I'd go for a monolithic approach. You criticised me giving no reason
    other than that my preferred approach was monolithic. So any reasonable
    person would assume that you think that a monolithic approach is in and
    of itself bad.

    No, they would not.

    But I don't think, based on your postings, you count as a "reasonable
    person".


    When invited to list the advantages and disadvantages of either, you
    refused to do so. I am sure that you are capable of doing this, and you
    are basically right. But you haven't actually done so. And it's proof of
    the pudding.

    The fact is there is a case for Ben's approach, there's a case for my approach, and maybe Ben's case is better. I've no objection to anyone weighing in on that. But fundamentally you do not understand what it
    means to offer an argument or how to make a case.


    I know exactly what it means. But I know when it is pointless, when the
    person on the other side pays not the slightest attention to what is
    being said and instead wanders off in their own little world with their
    own little ideas and their own independent terminology.

    What would be the point in giving reasons for anything, when you won't
    read them? Why should I give arguments, when you will "counter" them by telling us that grass is blue, using your own definition for "blue"?
    I've put a lot of effort into trying to explain things to you - enough
    is enough. I get more intelligent responses talking to my cat.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Fri Feb 2 15:28:33 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Fri, 02 Feb 2024 02:15:05 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Wed, 31 Jan 2024 23:25:25 +0000, Malcolm McLean wrote:

    Fun fact: one of the names for hex 7F was “rubout”.

    Additional fun fact. Rubout was the legend on the keycap on the ASR-33
    used to rub out the prior character (the A in ASR means it has the
    reader/punch). On paper tape, it means ignore the prior character.

    No, you had to overpunch the character to be ignored. Did that key automatically backspace the tape for you, or did you have to do it
    manually?

    1) either way worked, depending on the software reading the tape
    2) there was a button on the pt unit that backed up one character.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Harnden@21:1/5 to Lawrence D'Oliveiro on Fri Feb 2 16:50:49 2024
    On 02/02/2024 01:13, Lawrence D'Oliveiro wrote:
    On Wed, 31 Jan 2024 23:25:25 +0000, Malcolm McLean wrote:

    In ASCII, 0 means NUL, or "ignore".

    Fun fact: one of the names for hex 7F was “rubout”. On seven-track paper tape, if you made a mistake typing your program, instead of throwing away
    the tape and starting again, you could go back and punch out all the holes
    at that position to produce a “rubout” character. The meaning was “ignore
    this character”.

    Also, over-punching all seven holes meant there was never any possibility
    of it ever getting misread as anything else.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Malcolm McLean on Fri Feb 2 21:40:41 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 01/02/2024 15:07, Janis Papanagnou wrote:
    On 01.02.2024 14:26, Malcolm McLean wrote:
    On 01/02/2024 13:02, Janis Papanagnou wrote:
    Well, not necessarily. Let's leave the typical use case for a moment...
    It might also be analyzed and converted to a digitally represented
    formula, say some TeX code, or e.g. like the formal syntax that the
    lilypond program uses.

    And ultimately converted to a non binary form. A list of 1s and 0s is
    seldom any use to the final consumer of the data.
    No, I was speaking about an application that creates lilypond _input_,
    which is a formal language to write notes, e.g. for evaluation by the
    lilypond software, but not excluding other usages.


    The two problems with hex
    dumps are that you've got to do mental arithmetic to convert 8 bit hex values into 16 or 32 bit fields,

    Hmm.. - have you inspected the man pages of the tools?

    I just ran "man xxd". The man page contains this statement.

    The tool's weirdness matches its creator's brain. Use entirely at your
    own risk. Copy files. Trace it. Become a wizard.
    This statement repelled you? (Can't help you here.)

    At least for 'od' I know it's easy per option...
    od -c file # characters (or escapes and octals)
    od -t x1 file # hex octets
    od -t x2 file # words (two octets)
    od -c -t x1 file # characters and octets

    So a JPEG file starts with
    FF D8
    FF E0
    hi lo (length of the FF E0 segment)

    So we want the output

    FF D8 FF E0 [1000] to check that the segment markers are correct and FF
    E0 segment is genuinely a thousand bytes (or whatever it is). This isn't easy to achieve with a hex dump utility.
    I don't know binary format details about jpg, so I cannot help you here.

    JPEG is an extremely common binary file format and JPEG files will be found on most general purpose computers.

    No. The loose term "JPEG file" usually refers to a file encoded using
    either the JFIF or EXIF standard. Prior to the introduction of EXIF,
    the loose term was used to refer only to JFIF files.

    JPEG is the name for the image encoding usually carried in JFIF (or
    EXIF) format files, but since you are actually discussing the file
    format, you should probably use the right name for it: JFIF.

    All you need to know for the purposes of the discussion is that the first four bytes are segment identifiers and must have the values I gave,

    My laptop contains lots of "JPEG files" that start FF D8 FF E1.

    So how would you achieve that in a convenient and non-error prone way?

    One way is to use od like this:

    $ od --endian=big -N6 -t x1u2 x.jpg
    0000000 ff d8 ff e0 00 10
    65496 65504 16
    0000006
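
    For what it's worth, a small sketch of wrapping that check in a script
    (the accepted markers and the default file name are assumptions on my
    part):

    f=${1:-x.jpg}
    sig=$(od -An -N4 -t x1 "$f" | tr -d ' \n')         # first four bytes as hex
    case $sig in
        ffd8ffe0|ffd8ffe1) ;;                          # JFIF or EXIF marker
        *) echo "$f: unexpected segment markers ($sig)" >&2; exit 1 ;;
    esac
    set -- $(od -An -j4 -N2 -t u1 "$f")                # bytes 5 and 6, as decimal
    echo "first segment length: $(( $1 * 256 + $2 )) bytes"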

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Ben Bacarisse on Fri Feb 2 23:59:49 2024
    On 2024-02-02, Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
    I've commented elsewhere on why I think a monolithic program is not a
    good design, so I won't repeat that here.

    All the programs I use are some kind of "lithic"; not so much mono-
    as neo-.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to bart on Fri Feb 2 23:41:01 2024
    bart <bc@freeuk.com> writes:

    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed through systems designed to handle
    human-readable text.
    Maybe you are not used to a system where it's trivial to inspect such
    data. When "some_prog" produces data that are not compatible with the
    current terminal settings, "some_prog | hd" shows a hex dump instead.
    The need to do this does not make "some_prog" poorly designed. It may
    simply mean that the output is /intended/ for further processing.

    For instance in some systems designed to receive ASCII text, there is
    no distinction between the nul byte and "waiting for next data byte".
    Obviously this will cause difficulties if the data is binary.
    Also many binary formats can't easily be extended, so you can pass one
    image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to extend.
    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary output.
    What is your evidence? stdout was just designed for output (as far as I
    can tell) and, anyway, what is the distinction you are making between
    binary and text? iconv --from ASCII --to EBCDIC-UK will produce
    something that is "logically" text on stdout, but it might look like
    binary to you.
    An example where it's really useful not to care: I have a suite of tools
    for doing toy cryptanalysis. Some apply various transformations and/or
    filters to byte streams and others collect and output (on stderr)
    various statistics. Plugging them together in various pipelines is very
    handy when investigating an encrypted text. The output is almost always
    "binary" in the sense that there would be not point in looking at on a
    terminal.
    According to you, these tools are poorly designed. I don't think so.
    How would you design them? Endless input and output file names to be
    juggled and tidied up afterwards?

    I think they're poorly designed too.

    I am curious to read your reasoning...

    From the POV of interactive console programs, they /are/ poor. But the mistake is thinking that they are actual programs or commands, when really they are just filters. They are not designed to be standalone
    commands.

    So it's a bad design, not because of the nature of the data ("binary"
    vs. "text") but because you claim a filter is not a command? Where does
    that notion come from?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Malcolm McLean on Fri Feb 2 23:38:22 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 31/01/2024 23:36, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 30/01/2024 07:27, Tim Rentsch wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 29/01/2024 20:10, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    [...]

    I've never used standard output for binary data.
    [...] it strikes me as a poor design decision.

    How so?

    Because the output can't be inspected by humans, and because it might
    have unusual effects if passed through systems designed to handle
    human-readable text.
    Maybe you are not used to a system where it's trivial to inspect such
    data. When "some_prog" produces data that are not compatible with the
    current terminal settings, "some_prog | hd" shows a hex dump instead.
    The need to do this does not make "some_prog" poorly designed. It may
    simply mean that the output is /intended/ for further processing.

    For instance in some systems designed to receive ASCII text, there is
    no distinction between the nul byte and "waiting for next data byte".
    Obviously this will cause difficulties if the data is binary.
    Also many binary formats can't easily be extended, so you can pass one
    image and that's all. While it is possible to devise a text format
    which is similar, in practice text formats usually have enough
    redundancy to be easily extended.

    So it's harder to correct errors, more prone to errors, and harder to extend.
    Your reasoning is all gobbledygook. Your comments reflect only
    limitations in your thinking, not any essential truth about using
    standard out for binary data.

    I must admit that it's nothing I have ever done or considered doing.

    However standard output is designed for text and not binary output.
    What is your evidence? stdout was just designed for output (as far as I
    can tell) and, anyway, what is the distinction you are making between
    binary and text? iconv --from ASCII --to EBCDIC-UK will produce
    something that is "logically" text on stdout, but it might look like
    binary to you.
    An example where it's really useful not to care: I have a suite of tools
    for doing toy cryptanalysis. Some apply various transformations and/or
    filters to byte streams and others collect and output (on stderr)
    various statistics. Plugging them together in various pipelines is very
    handy when investigating an encrypted text. The output is almost always
    "binary" in the sense that there would be not point in looking at on a
    terminal.
    According to you, these tools are poorly designed. I don't think so.
    How would you design them? Endless input and output file names to be
    juggled and tidied up afterwards?

    I'd write a monolithic program.
    Load the encrypted text into memory, and then pass it to subroutines to do the various analyses.
    You can of course process it, and then pass the processed output to other programs. And that does have a point if the program which is accepting the processed output is doing something which has no necessary connection to cryptanalysis. So for example a program to produce a pie chart from a list
    of letter frequencies. But if it's transforming the encrypted text in intricate and specialised ways, then analysing the transformed text in
    other specialised and intricate ways, then firstly you've probably
    introduced coupling and dependency between the two programs, and secondly you're probably at some point going to want to modify the second program in the pipeline to look at the raw data.

    I don't think you understand the design at all. What coupling? And why
    would I modify the program to inspect the data when there are several inspection programs that can be inserted before or after to do just that?

    I've commented elsewhere on why I think a monolithic program is not a
    good design, so I won't repeat that here. I just don't understand any
    of your objections to my design. Specifically, you don't address the
    fact that you claim it's wrong simply because the data going to stdout
    are binary. Have you abandoned that generic criticism? Is that why
    you split the thread with two replies?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Malcolm McLean on Sat Feb 3 11:27:37 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 01/02/2024 13:24, Tim Rentsch wrote:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    You admit this with "not tested". Says it all. "Understanding Unix" is an intellectually useless achievement. You might have to do it if you
    have to use the system and debug and trouble shoot. But it's nothing
    to be proud about.

    You're an idiot. As usual trying to have a useful discussion
    with you has turned out to be a complete waste of time.

    Some things are interesting in themselves and worth talking about at
    length. Like how Haskell builds up functions of functions. Other
    things really aren't. And how to set up a Unix pipeline is one of
    those that really aren't (unless actually faced with such a system
    and with a practical need to do it).

    I think you have the intelligence to understand this, if you'd just understand where I am coming from. This arrogant and dismissive
    attitude does not become you.

    The arrogance and dismissive attitude is on your side, not
    mine. I don't think everyone is an idiot. Just you.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to James Kuyper on Sat Feb 3 11:35:00 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 1/24/24 16:11, Kaz Kylheku wrote:

    On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

    On 1/24/24 03:10, Janis Papanagnou wrote:

    On 23.01.2024 23:37, Kalevi Kolttonen wrote:

    [...] I am
    pretty sure that not all computer languages
    provide guarantees about the order of evaluation.

    What?!

    Could you explain what surprises you about that statement? As quoted,
    it's a general statement which includes C: "Except as specified later,
    side effects and value computations of subexpressions are unsequenced."

    Pretty much any language has to guarantee *something* about
    order of evaluation, somewhere.

    Not the functional languages, I believe - but I've only heard about such languages, not used them.

    Like for instance that calculating output is not possible before a
    needed input is available.

    Oddly enough, for a long time the C standard never said anything about
    that issue. I argued that this was logically necessary, and few people disagreed with that argument, but I couldn't point to wording in the
    standard to support that claim.

    That changed when they added support for multi-threaded code to C in
    C2011. That required the standard to be very explicit about which things could happen simultaneously in different threads, and which things had
    to occur in a specified order. All of the wording about "sequenced" was first introduced at that time. [...]

    The timing may have been the same, but the motivation for the
    new language about sequencing actually occurred much earlier.
    For quite a long time before C11, and IIRC even before C99,
    the ISO C committee was looking for a formal model to describe
    the sequencing rules. There were proposals for various formal
    models, none of which were thought to suffice. So it wasn't
    just adding threading to C that prompted adding better language
    regarding sequencing rules.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dave_thompson_2@comcast.net@21:1/5 to david.brown@hesbynett.no on Mon Feb 26 04:18:16 2024
    On Tue, 30 Jan 2024 17:25:31 +0100, David Brown
    <david.brown@hesbynett.no> wrote:

    On 30/01/2024 16:49, Malcolm McLean wrote:
    [nonsense as usual]
    Mixing binary data with formatted text data is very unlikely to be
    useful. fwrite() is perfectly good for writing binary data - it would
    make no sense to have some awkward printf specifier to do this. (What
    would the specifier even be? It would need to take two items of data -
    a pointer and a length - and thus be very different from existing specifiers.)

    "%.*s" takes length and pointer, which IMO is not VERY different.

    It does stop (prematurely?) if the data/buffer contains \0.

    [snip rest]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dave_thompson_2@comcast.net@21:1/5 to ldo@nz.invalid on Mon Feb 26 04:20:23 2024
    On Wed, 31 Jan 2024 06:10:56 -0000 (UTC), Lawrence D'Oliveiro
    <ldo@nz.invalid> wrote:

    On Tue, 30 Jan 2024 20:29:17 +0100, David Brown wrote:

    stdout and stdin were apparently available in FORTRAN in the 1950's.

    There was a convention that channel 5 was the card reader, and 6 was the
    line printer.

    and 7 the card punch, at least from FIV/66 up. Data you expected to
    reprocess, like a Unix filter's stdout, would go to 7, and data you
    wanted a human to see, like stderr, to 6.

    When interactive systems came along later, this became channel 5 for
    keyboard input, and 6 for terminal output.

    and 7 got lost -- the same way that ssh -t (and docker -t) loses the distinction between stdout and stderr, it's all just output

    What happened to channels 1, 2, 3 & 4? Don't know.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)