How many people know about this? It was introduced in C99.
If you
“#include <iso646.h>”, then you can use alternative symbols like “not”
instead of “!”, “and” instead of “&&” and “or” instead of “||”.
C++ already had this, without the need to include such a file.
On 1/21/24 20:51, Lawrence D'Oliveiro wrote:
C++ already had this, without the need to include such a file.
That's because backwards compatibility with older versions of C is a
lower priority for C++ than it is for C.
On 22/01/2024 02:51, Lawrence D'Oliveiro wrote:
How many people know about this? It was introduced in c99. If you
... I don't use the matching names in C++ either ...
Although it was available before 1999 - the SVR4 C compilation System
(CCS) had an iso6[4]6.h header file in the early 90's.
On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:
... I don't use the matching names in C++ either ...
I do, if/when I do use C++ and C. Don’t you think it improves readability:
As for "and" being more readable than "&&", that's not necessarily the
case for people who are accustomed to reading C code.
On Mon, 22 Jan 2024 14:56:53 -0800, Keith Thompson wrote:
As far as I can tell, the macros defined in <iso646.h> have never caught
on significantly.
The nice thing is, I don’t have to care. They have to be part of any standards-compliant C compiler, therefore I am free to use them. And I do.
On Mon, 22 Jan 2024 23:08:53 -0000 (UTC), Blue-Maned_Hawk wrote:
Lawrence D'Oliveiro wrote:
Don’t you think it improves readability:
No.
Lessig’s Law: The one who writes the code makes the rules.
On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:
... I don't use the matching names in C++ either ...
I do, if/when I do use C++ and C. Don’t you think it improves readability:
if (ThisCh < '0' or ThisCh > '9')
{
if (AllowSign and Index == 0 and (ThisCh == '+' or ThisCh == '-'))
{
/* fine */
}
else if (AllowDecimal and not DecimalSeen and ThisCh == '.')
{
DecimalSeen = true; /* only allow one decimal point */
}
else
{
Valid = false;
break;
} /*if*/
} /*if*/
if
(
ThisCh >= 'a' and ThisCh <= 'z'
or
ThisCh >= 'A' and ThisCh <= 'Z'
or
ThisCh >= '0' and ThisCh <= '9'
or
ThisCh == '_'
or
ThisCh == '-'
or
ThisCh == '.'
or
ThisCh == '/'
)
{
Result.append(1, ThisCh);
}
On 22/01/2024 20:34, Lawrence D'Oliveiro wrote:
On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:It breaks the rule that, in C, variables and functions are alphnumeric, whilst operators are symbols. sizeof is an exception, but a justified
... I don't use the matching names in C++ either ...
I do, if/when I do use C++ and C. Don’t you think it improves readability: >>
one. However it's harder to justify a symbol for "plus" but a word for "or".
On 23/01/2024 16:32, Malcolm McLean wrote:
Every explanation for && and || for every language that copied them from
C, is that && means AND, and || means OR.
bart <bc@freeuk.com> writes:
On 23/01/2024 16:32, Malcolm McLean wrote:
Every explanation for && and || for every language that copied them from
C, is that && means AND, and || means OR.
in C, && specifically means 'conditional and'. The programmer can
rely on the fact that the second term will not be evaluated if
the first term evaluates to false.
On 1/23/24 13:52, Scott Lurndal wrote:
in C, && specifically means 'conditional and'. The programmer can rely on the fact that the second term will not be evaluated if the first term evaluates to false.
Actually, C uses the term "Logical AND". I don't have any idea what "conditional and" is supposed to mean, except that the explanation you provide matches the term "Logical AND".
On 22/01/2024 21:34, Lawrence D'Oliveiro wrote:
On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:
... I don't use the matching names in C++ either ...
I do, if/when I do use C++ and C. Don’t you think it improves readability:
No. But I fully appreciate that this is personal preference and habit.
On Tue, 23 Jan 2024 16:32:09 +0000, Malcolm McLean wrote:
It breaks the rule that, in C, variables and functions are alphanumeric, whilst operators are symbols. sizeof is an exception, but a justified one. However it's harder to justify a symbol for "plus" but a word for "or".
Less importantly, it also violates the convention that C macros are named in upper case to distinguish them from keywords and "regular" identifiers.
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
Actually, C uses the term "Logical AND".
The term 'conditional and' has been in common use for decades.
Less importantly, it also violates the convention that C macros are
named in upper case to distinguish them from keywords and "regular" identifiers.
There are also problems with these names. For example, '&&' does not have the semantics of 'and' but of 'and_then', and 'or' is actually 'or_else'.
On Tue, 23 Jan 2024 20:32:39 -0000 (UTC), Kalevi Kolttonen wrote:
If I write this in bash:
rm foo.txt && rm bar.txt
Then the second is only executed if the first one returns zero.
What does C do in this case?
On 2024-01-23, Scott Lurndal <scott@slp53.sl.home> wrote:
The term 'conditional and' has been in common use for decades.
Also, a bitwise and is logical!
ANSI Common Lisp uses symbols like logand, logior, logxor, ... for bitwise operations.
When you implement this stuff with electronic gates it is digital logic circuits. You can read live values in it with a logic probe.
On 2024-01-23, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
On Tue, 23 Jan 2024 20:32:39 -0000 (UTC), Kalevi Kolttonen wrote:
If I write this in bash:
rm foo.txt && rm bar.txt
Then the second is only executed if the first one returns zero.
What does C do in this case?
C likewise doesn't evaluate the right operand of && if the left one is false (zero).
On Tue, 23 Jan 2024 12:13:27 -0800, Keith Thompson wrote:
There's no hard rule that operators must be
punctuation, just a general trend.
And iso646.h demonstrates that that trend is at an end.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Tue, 23 Jan 2024 17:21:40 -0000 (UTC), Lew Pitcher wrote:
Less importantly, it also violates the convention that C macros are
named in upper case to distinguish them from keywords and "regular"
identifiers.
Why does C allow lowercase in macro names, then?
Because it's a convention, not a language rule.
ANSI Common Lisp uses symbols like logand, logior, logxor, ...
for bitwise operations.
Then it is confusing. What does it use for non-bitwise logical operations?
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Tue, 23 Jan 2024 12:13:27 -0800, Keith Thompson wrote:
There's no hard rule that operators must be punctuation, just a
general trend.
And iso646.h demonstrates that that trend is at an end.
It does no such thing.
The header file has been around for over three decades, yet it's not in common (or even uncommon) use.
I believe the only thing iso646.h demonstrates is the (largely former)
need to write C on systems that do not support full ASCII.
If you want to use it in your own code, nobody will stop you.
Obviously, it would mean not following the convention.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
It does seem a very late addition for a purpose which would have been a
lot more relevant decades earlier. By that point, implementations that
did not support full ASCII would have been museum pieces.
<iso646.h> was added in 1995, and it was intended to replace a number of implementation-specific workarounds.
But will I continue to hear complaints from you about it?
I can't continue what I never started.
I believe the only thing iso646.h demonstrates is the (largely former)
need to write C on systems that do not support full ASCII. This is
fully explained in the C99 Rationale, <http://www.open-std.org/jtc1/sc22/WG14/www/C99RationaleV5.10.pdf>;
search for "MSE.4". It says nothing about "and" being more readable
than "&&" on systems that are able to display the '&' character.
On Tue, 23 Jan 2024 16:27:25 -0800, Keith Thompson wrote:
I believe the only thing iso646.h demonstrates is the (largely former)
need to write C on systems that do not support full ASCII.
It does seem a very late addition for a purpose which would have been a
lot more relevant decades earlier. By that point, implementations that did not support full ASCII would have been museum pieces.
If you want to use it in your own code, nobody will stop you.
But will I continue to hear complaints from you about it?
No, <iso646.h> was approved as part of AMD1, in 1995. For the countries it was targeted at, full ASCII was not a tenable solution - it could not be used to write their languages. They were still using encodings such as Shift-JIS and ISO/IEC 8859-10.
In the Shift-JIS encoding, character 0x5C, which is the backslash in
ASCII and Unicode, is the Yen sign. That means that if a C source file contains "Hello, world\n", viewing it as Shift-JIS makes it look like
"Hello, world¥n", but a C compiler that treats its input as ASCII would
see a backslash.
On Tue, 23 Jan 2024 06:47:10 +0100, Janis Papanagnou wrote:
There are also problems with these names. For example, '&&' does not have the semantics of 'and' but of 'and_then', and 'or' is actually 'or_else'.
Funnily enough, that is how the languages that offer those words interpret them. Not just C and C++.
On Tue, 23 Jan 2024 17:21:40 -0000 (UTC), Lew Pitcher wrote:
Less importantly, it also violates the convention that C macros are
named in upper case to distinguish them from keywords and "regular"
identifiers.
Why does C allow lowercase in macro names, then?
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
[...]
whereas in the shell:
(((a && b) || c) && d)
where I'm using "virtual parentheses". If you actually stick in real
ones, they denote subshell execution in a separate process. (Bash
allows curly braces for command grouping that doesn't create
processes.)
On Tue, 23 Jan 2024 14:51:52 -0800, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Tue, 23 Jan 2024 17:21:40 -0000 (UTC), Lew Pitcher wrote:
Less importantly, it also violates the convention that C macros are
named in upper case to distinguish them from keywords and "regular"
identifiers.
Why does C allow lowercase in macro names, then?
Because it's a convention, not a language rule.
So what would one mean by “violate”, other than “I personally don’t like
it”?
On Tue, 23 Jan 2024 06:54:56 +0100, Janis Papanagnou wrote:
... in my professional contexts there were even [coding] standards
defined that you had to follow.
What about the open-source code that your company takes without paying? Do you demand that that code follow your rules as well? Do you send it back
to the developers to demand they rewrite it for you?
On 1/23/24 16:28, Scott Lurndal wrote:
The term 'conditional and' has been in common use for decades.
I've never heard of it. When searching Wikipedia, [...]
The header was introduced to make it easier (or possible) to write C
code on systems/keyboards that don't support certain characters like '&'
and '|' -- similar to digraphs and trigraphs.
On 2024-01-23, David Brown <david.brown@hesbynett.no> wrote:
On 22/01/2024 21:34, Lawrence D'Oliveiro wrote:
On Mon, 22 Jan 2024 09:30:21 +0100, David Brown wrote:
... I don't use the matching names in C++ either ...
I do, if/when I do use C++ and C. Don’t you think it improves readability:
No. But I fully appreciate that this is personal preference and habit.
I believe that some of the identifiers improve readability for people
coming from a programming language which uses those English words for
very similar operators rather than && and ||.
In a green field programming language design, it's probably better
to design that way from the start. It's a nice bonus if a language
looks readable to newcomers.
Generations of C coders are used to && and || though; that's the normal
way to write C. Using these aliases is a vanishingly rare practice. An important aspect of readability is writing code like everyone else. When
a language is newly designed so that there isn't anyone else, that
doesn't have to be considered.
For that reason, these identifiers should not be used, except for machine-encoding of programs into a 6 bit character set.
Additionally certain names in the iso646.h header are poorly considered,
and obstruct readability. They use the _eq suffix for an operation that
is assignment.
#define and_eq &=
If the purpose of this header were to optimize readability for those unfamiliar with C, this should be called
#define and_set &=
or similar.
The assignment operator = should not be read "equals", but "becomes" or "takes the value" or "is assigned" or "is set to". This should be taken
into consideration when coming up with word-like token or token fragment
to represent it.
Also note the following inconsistency:
#define and &&
#define bitand &
#define and_eq &= // what happened to "bit"?
This looks like and_eq should correspond to &&=, since and is &&,
and bitand is &. &= wants to be bitand_eq.
Clearly, the purpose of this header is to allow C to be written with the
ISO 646 character set. The choices of identifiers do not look like
evidence of readability having been highly prioritized.
On Tue, 23 Jan 2024 21:43:44 -0800, Keith Thompson wrote:
In the Shift-JIS encoding, character 0x5C, which is the backslash in
ASCII and Unicode, is the Yen sign. That means that if a C source file
contains "Hello, world\n", viewing it as Shift-JIS makes it look like
"Hello, world¥n", but a C compiler that treats its input as ASCII would
see a backslash.
So what exactly does iso646.h offer to deal with this?
On 23/01/2024 18:34, bart wrote:
But it's OK to justify 'pow' for exponentiation?
Mathematically operators are functions, so a mathematician would say that "add" is just as much of a function as "gamma". But to a computer programmer an operator compiles to a trivial number of machine code instructions, whilst a function is a subroutine call. Pow is not usually supported in hardware. However it's such a basic mathematical function that it has special notation. So some languages say it should be an operator. However ASCII won't represent the standard notation. So there are good arguments for and against pow as an operator, and different languages take different views. But I think the C decision is better, as C code is for programming computers, not for translating formulae into machine readable form.
On 23.01.2024 21:13, Keith Thompson wrote:
[...] There's no hard rule that operators must be
punctuation, just a general trend.)
Anyone still writing "MULTIPLY a BY b GIVEN c" ? :-)
(Luckily I've never programmed in COBOL, even after
it allowed "COMPUTE c = a * b" (or some such).)
To be sure I had also re-inspected the ASCII character set and it
seems that all C characters (including these operators) are anyway
in the ASCII domain. It's beyond me why they've used the name
"iso646.h".
On Wed, 24 Jan 2024 09:06:22 +0100, Janis Papanagnou wrote:
ITYM
MULTIPLY A BY B GIVING C.
and, yes, COBOL programmers are still in demand, mostly by
financial institutions that have hundreds of millions
of lines of COBOL code to maintain.
(Luckily I've never programmed in COBOL, even after
it allowed "COMPUTE c = a * b" (or some such).)
I have (lucky me :-) ).
While I don't tout COBOL as the "be all and end all" of
programming languages, it still can perform a lot of
useful work, especially in fields where exact calculations
are required and rounding and truncation of mathematical
operations are well defined. Such as financial institutions.
These days, it even supports object oriented code.
FWIW, the last ISO COBOL language standard was issued in 2023.
Are you certain that you want your taxes to be calculated in
floating point? ;-)
On 2024-01-24, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
To be sure I had also re-inspected the ASCII character set and it
seems that all C characters (including these operators) are anyway
in the ASCII domain. It's beyond me why they've used the name
"iso646.h".
Because the macro names in that header are in the ISO 646
invariant set, expanding to tokens that use characters outside
of the invariant set.
ISO 646 looks like an effort to standardize the "zoo" of regional ASCII variants.
It defines a base character set which looks exactly like ASCII (correct
me if I'm wrong), of which there are national variants. It's like a
"mini ISO Latin" in 7 bits.
The Wikipedia page on it is quite good.
On 24.01.2024 15:17, Lew Pitcher wrote:
On Wed, 24 Jan 2024 09:06:22 +0100, Janis Papanagnou wrote:
programming languages, it still can perform a lot of
useful work, especially in fields where exact calculations
are required and rounding and truncation of mathematical
operations are well defined. Such as financial institutions.
Yes, sure.
On 23.01.2024 23:37, Kalevi Kolttonen wrote:
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
What?!
On 1/24/24 03:10, Janis Papanagnou wrote:
On 23.01.2024 23:37, Kalevi Kolttonen wrote:
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
What?!
Could you explain what surprises you about that statement?
[...]
Now, logical-AND and logical-OR are two cases where the order of
evaluation is, in fact, specified. Are you expressing surprise that
there are other languages where that's not the case? I'm fairly sure
I've seen a language where the closest equivalent of C's
(expression1 && expression2) causes both sub-expressions to be
evaluated, in an arbitrary order, before evaluating the equivalent of
&& itself. Unfortunately, I don't remember where.
It sounds so wrong, not matching anything I've experienced in the
programming languages I heard about and about compiler construction
that I can only express my astonishment about such a statement. The
poster's statement itself is not explained, though, and if anything,
the poster should first explain what makes him "pretty sure" about
it before we can exchange arguments.
[...]
The closest I met were theoretical expressions (like e.g. Dijkstra's
guards, or whatever they were called) in per se non-deterministic contexts.
Janis
I suppose it is possible that in some languages out-of-order
evaluation could be what happens, e.g. the "logical AND" operands
could be evaluated in parallel by different CPUs.
On 24.01.2024 00:10, Keith Thompson wrote:
The header was introduced to make it easier (or possible) to write C
code on systems/keyboards that don't support certain characters like '&'
and '|' -- similar to digraphs and trigraphs.
I think this is the most likely explanation; the restricted _keyboards_
(and not the restricted [ASCII] character set). Matches my experiences
with old keyboards I used decades ago.
On 24/01/2024 13:54, David Brown wrote:
On 24/01/2024 13:20, Malcolm McLean wrote:
Many operators in C are not mathematical operations. "sizeof" is an
operator, so are indirection operators, structure member access
operators, function calls, and the comma operator.
I've discussed this ad infinitum with people who don't really understand
what the term "function" means. Anything that maps one set to another
set such that there is one and only one mapping from each member of the
source set to the result set is mathematically a "function".
Sizeof clearly counts.
On 23/01/2024 21:51, Lawrence D'Oliveiro wrote:
On Tue, 23 Jan 2024 16:32:09 +0000, Malcolm McLean wrote:
It breaks the rule that, in C, variables and functions are alphnumeric,
whilst operators are symbols.
Where is there such a “rule”?
Valid function names have to begin with an alphabetical symbol or
(annoyingly for me) an underscore, as do variables. They may not contain
non-alphanumerical symbols except for underscore.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 24.01.2024 18:24, James Kuyper wrote:
On 1/24/24 03:10, Janis Papanagnou wrote:
On 23.01.2024 23:37, Kalevi Kolttonen wrote:
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
What?!
Could you explain what surprises you about that statement?
It sounds so wrong, not matching anything I've experienced in the
programming languages I heard about and about compiler construction
that I can only express my astonishment about such a statement. The
poster's statement itself is not explained, though, and if anything,
the poster should first explain what makes him "pretty sure" about
it before we can exchange arguments.
A concrete example:
#include <stdio.h>

static int count(void) {
    static int result = 0;
    return ++result;
}

int main(void) {
    printf("%d %d %d\n", count(), count(), count());
    return 0;
}
C does not specify the order in which the arguments are evaluated
(likewise for operands of most operators). This program could produce
any of 6 possible outputs, at the whim of the compiler. (On my system,
I see "3 2 1" with gcc and "1 2 3" with clang; both are perfectly
valid.)
I'm surprised that that surprises you. It's a fairly fundamental
property of C (and also of C++).
[...]
As quoted, it's a general statement which includes C: "Except as
specified later, side effects and value computations of subexpressions
are unsequenced."
Trigraphs, digraphs, and <iso646.h> were all introduced to support
systems that *don't* support the full ASCII character set.
On 24/01/2024 07:35, Lawrence D'Oliveiro wrote:
On Tue, 23 Jan 2024 21:43:44 -0800, Keith Thompson wrote:
In the Shift-JIS encoding, character 0x5C, which is the backslash in
ASCII and Unicode, is the Yen sign. That means that if a C source
file contains "Hello, world\n", viewing it as Shift-JIS makes it look
like "Hello, world¥n", but a C compiler that treats its input as ASCII
would see a backslash.
So what exactly does iso646.h offer to deal with this?
In Scandinavian language variants of ASCII ...
"Logical disjunction is usually short-circuited ...
On Wed, 24 Jan 2024 07:58:44 -0800, Keith Thompson wrote:
Trigraphs, digraphs, and <iso646.h> were all introduced to support
systems that *don't* support the full ASCII character set.
Where is there a national character set that doesn’t support the symbols for which iso646.h introduces synonyms?
On Wed, 24 Jan 2024 14:17:04 -0000 (UTC), Lew Pitcher wrote:
and, yes, COBOL programmers are still in demand, mostly by financial
institutions that have hundreds of millions of lines of COBOL code to
maintain.
I suspect a lot of those institutions have already gone out of business,
or are close to going out of business.
And the amounts they have to pay COBOL programmers
to maintain their code are hastening that end.
Are you certain that you want your taxes to be calculated in
floating point? ;-)
How else would you handle compound interest?
David Brown <david.brown@hesbynett.no> writes:
[...]
(It could not have been added as "**", because - as Keith said in
another post - "x ** y" already has a meaning in C. While I believe
it would be possible to distinguish the uses based on the type of "y",
other than for the literal 0, having "x ** y" mean two /completely/
different things depending on the type of "y" would not be a good idea
for C.)
The problem with a "**" exponentiation operator is lexical. It's common
to have two consecutive unary "*" operators in declarations and
expressions:
char **argv;
char c = **argv;
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Wed, 24 Jan 2024 07:58:44 -0800, Keith Thompson wrote:
Trigraphs, digraphs, and <iso646.h> were all introduced to support
systems that *don't* support the full ASCII character set.
Where is there a national character set that doesn’t support the symbols
for which iso646.h introduces synonyms?
Just one example: <https://en.wikipedia.org/wiki/Code_page_1016> has 'ø'
in the slot that ASCII uses for '|'.
I don't believe it's in common use today, but it may have been in 1995.
Lew Pitcher <lew.pitcher@digitalfreehold.ca> writes:
[...]
These days, it even supports object oriented code.
FWIW, the last ISO COBOL language standard was issued in 2023.
ADD 1 TO COBOL GIVING COBOL
On Wed, 24 Jan 2024 18:40:08 -0000 (UTC), Kalevi Kolttonen wrote:
"Logical disjunction is usually short-circuited ...
I wonder why that shouldn’t apply to anything else. E.g. in
a × (b + c)
if “a” evaluates to zero, why not avoid the computation of “b + c” and
just return zero as the value of the expression?
On 24.01.2024 17:56, Kaz Kylheku wrote:
The Wikipedia page on it is quite good.
The German Wikipedia has a table that is more legible, IMO: https://de.wikipedia.org/wiki/ISO_646
On 1/24/24 03:10, Janis Papanagnou wrote:
On 23.01.2024 23:37, Kalevi Kolttonen wrote:
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
What?!
Could you explain what surprises you about that statement? As quoted,
it's a general statement which includes C: "Except as specified later,
side effects and value computations of subexpressions are unsequenced."
On Wed, 24 Jan 2024 14:17:04 -0000 (UTC), Lew Pitcher wrote:
and, yes, COBOL programmers are still in demand, mostly by financial
institutions that have hundreds of millions of lines of COBOL code to
maintain.
I suspect a lot of those institutions have already gone out of business,
or are close to going out of business. And the amounts they have to pay
COBOL programmers to maintain their code are hastening that end.
Are you certain that you want your taxes to be calculated in
floating point? ;-)
How else would you handle compound interest?
On Wed, 24 Jan 2024 07:58:44 -0800, Keith Thompson wrote:
Trigraphs, digraphs, and <iso646.h> were all introduced to support
systems that *don't* support the full ASCII character set.
Where is there a national character set that doesn’t support the symbols for which iso646.h introduces synonyms?
On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Lew Pitcher <lew.pitcher@digitalfreehold.ca> writes:
[...]
These days, it even supports object oriented code.
FWIW, the last ISO COBOL language standard was issued in 2023.
ADD 1 TO COBOL GIVING COBOL
Oh, oh, I have a new one to this oldie:
ADD 100 TO PITCH OF COBOL
(100 cents in a semitone.)
Are you certain that you want your taxes to be calculated in
floating point? ;-)
On Wed, 24 Jan 2024 20:11:33 +0000, Lawrence D'Oliveiro wrote:
Where is there a national character set that doesn’t support the
symbols for which iso646.h introduces synonyms?
EBCDIC-US, for one. It lacks the CIRCUMFLEX (^) character.
I've discussed this ad infinitum with people who don't really understand
what the term "function" means. Anything that maps one set to another
set such that there is one and only one mapping from each member of the
source set to the result set is mathematically a "function".
Sizeof clearly counts.
On Wed, 24 Jan 2024 20:25:00 -0000 (UTC), Lew Pitcher wrote:
On Wed, 24 Jan 2024 20:11:33 +0000, Lawrence D'Oliveiro wrote:
Where is there a national character set that doesn’t support the
symbols for which iso646.h introduces synonyms?
EBCDIC-US, for one. It lacks the CIRCUMFLEX (^) character.
Were any of the EBCDICs official standards anywhere in the world, outside
of IBM?
Thinking about what the “A” in “ASCII” stands for ...
David Brown <david.brown@hesbynett.no> writes:
[...]
(It could not have been added as "**", because - as Keith said in
another post - "x ** y" already has a meaning in C. While I believe
it would be possible to distinguish the uses based on the type of "y",
other than for the literal 0, having "x ** y" mean two /completely/
different things depending on the type of "y" would not be a good idea
for C.)
The problem with a "**" exponentiation operator is lexical. It's common
to have two consecutive unary "*" operators in declarations and
expressions:
char **argv;
char c = **argv;
Adding a "**" operator would have made the above invalid due to the
"maximal munch" rule, before the type of the argument is even
considered.
See also x+++++y, which might be intended as x++ + ++y, but is scanned
as x ++ ++ + y, a syntax error.
C could have added "**" very early, but then we'd have to write
"* *argv" or "*(*argv)".
Dollar symbol ($) is an allowed extension.
On Wed, 24 Jan 2024 19:52:58 GMT, Scott Lurndal wrote:
Dollar symbol ($) is an allowed extension.
I wonder if we have DEC to thank for that ... ?
On Wed, 24 Jan 2024 20:25:00 -0000 (UTC), Lew Pitcher wrote:
On Wed, 24 Jan 2024 20:11:33 +0000, Lawrence D'Oliveiro wrote:
Where is there a national character set that doesn’t support the
symbols for which iso646.h introduces synonyms?
EBCDIC-US, for one. It lacks the CIRCUMFLEX (^) character.
Were any of the EBCDICs official standards anywhere in the world, outside
of IBM?
On Wed, 24 Jan 2024 20:25:00 -0000 (UTC), Lew Pitcher wrote:
On Wed, 24 Jan 2024 20:11:33 +0000, Lawrence D'Oliveiro wrote:
Where is there a national character set that doesn’t support the
symbols for which iso646.h introduces synonyms?
EBCDIC-US, for one. It lacks the CIRCUMFLEX (^) character.
Were any of the EBCDICs official standards anywhere in the world, outside
of IBM?
Thinking about what the “A” in “ASCII” stands for ...
On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
On 1/24/24 03:10, Janis Papanagnou wrote:
On 23.01.2024 23:37, Kalevi Kolttonen wrote:
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
What?!
Could you explain what surprises you about that statement? As quoted,
it's a general statement which includes C: "Except as specified later,
side effects and value computations of subexpressions are unsequenced."
Pretty much any language has to guarantee *something* about
order of evaluation, somewhere.
Like for instance that calculating output is not possible before a
needed input is available.
I said that the C standard's
use of the term "function" to mean "subroutine" was a misuse ...
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
As for K&R's thinking, I have no particular insight on that. I have no problem with some operators being represented by symbols and others by keywords (I'm accustomed to it from other languages), and I don't see
that the decision to make "sizeof" a keyword even requires any
justification.
(C++, in 2011 IIRC, introduced special handling for the >> token, which occurs in things like std::vector<std::vector<int>>).
[...]
It might be interesting to hear from any native Germans who were
programming C at that time.
Germany is big enough that people
programmed in German (so comments would be in German, for example), and
their 7-bit ASCII variant (Code page 1011) also had accented letters in
place of some symbols used by C - including "|".
On Thu, 25 Jan 2024 03:56:13 +0000, Malcolm McLean wrote:
I said that the C standard's
use of the term "function" to mean "subroutine" was a misuse ...
Common Python terminology does the same.
Back in Pascal days, a “function” returned a value, while a “procedure”
had some effect on the machine state. If you wanted to refer to both, you tried a semi-common term like “routine” and hoped they understood.
On 24.01.2024 17:27, Keith Thompson wrote:
(C++, in 2011 IIRC, introduced special handling for the >> token, which
occurs in things like std::vector<std::vector<int>>).
So you no longer have to write it with a space, as in ...?
std::vector<std::vector<int> >
That's fine.
But I suppose they haven't fixed its precedence in cin >> and cout <<
contexts? (I suppose it's still handled with shl/shr precedence?)
On 1/24/24 16:11, Kaz Kylheku wrote:
On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
On 1/24/24 03:10, Janis Papanagnou wrote:
On 23.01.2024 23:37, Kalevi Kolttonen wrote:
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
What?!
Could you explain what surprises you about that statement? As quoted,
it's a general statement which includes C: "Except as specified later,
side effects and value computations of subexpressions are unsequenced."
Pretty much any language has to guarantee *something* about
order of evaluation, somewhere.
Not the functional languages, I believe - but I've only heard about such languages, not used them.
Like for instance that calculating output is not possible before a
needed input is available.
Oddly enough, for a long time the C standard never said anything about
that issue. I argued that this was logically necessary, and few people disagreed with that argument, but I couldn't point to wording in the
standard to support that claim.
That changed when they added support for multi-threaded code to C in
C2011. That required the standard to be very explicit about which things could happen simultaneously in different threads, and which things had
to occur in a specified order. All of the wording about "sequenced" was
first introduced at that time. In particular, the following wording was added:
"The value computations of the operands of an operator are sequenced
before the value computation of the result of the operator." (6.5p1)
On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
David Brown <david.brown@hesbynett.no> writes:
[...]
(It could not have been added as "**", because - as Keith said in
another post - "x ** y" already has a meaning in C. While I believe
it would be possible to distinguish the uses based on the type of "y",
other than for the literal 0, having "x ** y" mean two /completely/
different things depending on the type of "y" would not be a good idea
for C.)
The problem with a "**" exponentiation operator is lexical. It's common
to have two consecutive unary "*" operators in declarations and
expressions:
char **argv;
char c = **argv;
Clearly, then, the way forward with this ** operator is to wait for the
C++ people to do the unthinkable, and reluctantly copy it some years
later.
Ya know, like what they did with stacked template closers, which are
already the >> operator.
On 24/01/2024 13:54, David Brown wrote:
On 24/01/2024 13:20, Malcolm McLean wrote:
Many operators in C are not mathematical operations. "sizeof" is an
operator, so are indirection operators, structure member access
operators, function calls, and the comma operator.
I've discussed this ad infinitum with people who don't really understand
what the term "function" means.
Anththing that maps one set to another
set such that there is one and only one mapping from each member if the struture set to the result set is mathematically a "function".
Sizeof clearly counts.
Exponentiation is not particularly common in programming, except for a
few special cases - easily written as "x * x", "x * x * x", "1.0 / x",
or "sqrt(x)", which are normally significantly more efficient than a
generic power function or operator would be.
It's pretty common in the sort of programming that I do. But this is a
fair point. A lot of programs don't apply complex transformations to
data in the way that mine typically do.
That is not an argument against having an operator in C called "pow".
It is simply not useful enough for there to be a benefit in adding it
to the language as an operator, when it could (and was) easily be
added as a function in the standard library.
(It could not have been added as "**", because - as Keith said in
another post - "x ** y" already has a meaning in C. While I believe
it would be possible to distinguish the uses based on the type of "y",
other than for the literal 0, having "x ** y" mean two /completely/
different things depending on the type of "y" would not be a good idea
for C.)
Yes, ** and ^, which are the two common ASCII fallbacks, are already
taken. But as you said earlier, in reality most exponentiation
operations are either square or cube, or square root. And in C, that
means either special functions or inefficiently converting the exponent
into a double. If pow were an operator, that wouldn't be an issue.
[...] We wrote a function returning pi as an
infinite list of decimal digits - the printout of that started long
before the calculation itself was finished!
The problem was with the order of evaluation. Prior to C++17 (where it
was fixed), if you wrote "cout << one() << two() << three();", the order
the three functions were evaluated was unspecified.
On 25/01/2024 06:01, James Kuyper wrote:
On 1/24/24 16:11, Kaz Kylheku wrote:
On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
On 1/24/24 03:10, Janis Papanagnou wrote:
On 23.01.2024 23:37, Kalevi Kolttonen wrote:
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
What?!
Could you explain what surprises you about that statement? As quoted,
it's a general statement which includes C: "Except as specified later,
side effects and value computations of subexpressions are unsequenced."
Pretty much any language has to guarantee *something* about
order of evaluation, somewhere.
Not the functional languages, I believe - but I've only heard about such
languages, not used them.
I remember a programming task at university around infinite lists in a functional programming language (not Haskell, but very similar -
arguably its predecessor). We wrote a function returning pi as an
infinite list of decimal digits - the printout of that started long
before the calculation itself was finished!
On 25/01/2024 00:30, Lawrence D'Oliveiro wrote:
On Wed, 24 Jan 2024 19:33:09 +0000, Malcolm McLean wrote:
I've discussed this ad infinitum with people who don't really understand
what the term "function" means. Anything that maps one set to another
set such that there is one and only one mapping from each member of the
source set to the result set is mathematically a "function".
Sizeof clearly counts.
It does in the mathematical sense. But in the C sense, a “function” is a
block of code which is called at runtime with zero or more arguments and
returns a result (which might be void). It can also have side-effects on
the machine state.
It helps the discussion to be clear what your terms mean. Otherwise the
people you are arguing with have a right to be indignant at what they
might perceive to be wilful obtuseness.
You haven't been around for long enough. I said that the C standard's
use of the term "function" to mean "subroutine" was a misuse, and that I
was going to use the term "function", in context, to refer to that
subset of C subroutines which calculate mathematical functions of bits
in the computer's memory. The opposition and outrage that this generated
was incredible, and must have gone on for years.
On 24/01/2024 21:50, Kaz Kylheku wrote:
On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
David Brown <david.brown@hesbynett.no> writes:
[...]
(It could not have been added as "**", because - as Keith said in
another post - "x ** y" already has a meaning in C. While I believe
it would be possible to distinguish the uses based on the type of "y", >>>> other than for the literal 0, having "x ** y" mean two /completely/
different things depending on the type of "y" would not be a good idea >>>> for C.)
The problem with a "**" exponentation operator is lexical. It's common >>> to have two consecutive unary "*" operators in declarations and
expression:
char **argv;
char c = **argv;
Clearly, then, the way forward with this ** operator is to wait for the
C++ people to do the unthinkable, and reluctantly copy it some years
later.
I'm hoping the C++ people will do the sane/unthinkable (cross out one,
according to personal preference) thing and allow Unicode symbols for
operators, which will then be added to the standard library rather than
to the language. Then we'll have "x ↑ y", and no possible confusion.
(It's actually almost fully possible already - all they need to do is
allow characters such as ↑ to be used as macros, and we're good to go.)
On 23/01/2024 21:51, Lawrence D'Oliveiro wrote:
On Tue, 23 Jan 2024 16:32:09 +0000, Malcolm McLean wrote:
It breaks the rule that, in C, variables and functions are alphanumeric,
whilst operators are symbols.
Where is there such a “rule”?
Valid function names have to begin with an alphabetic character or
(annoyingly for me) an underscore, as do variables. They may not contain
non-alphanumeric characters except for underscore. It's in the C
standard somewhere.
C operators are all non-alphanumeric symbols, with the exception of
"sizeof". Again, the operators are listed in the C standard.
sizeof is an exception, but a justified one.
This is how religious people argue: they use circular reasoning to say
something is justified because it is justified.
No. This isn't circular reasoning. It's a claim which hasn't been backed
up. It's expected that the reader won't ask for this because it is so
obvious that we can give sensible reasons for "sizeof" being a
function-like alphabetical word rather than a symbol. But if you do, of course I'm sure someone will provide such a justification.
On 25.01.2024 13:43, David Brown wrote:
[...] We wrote a function returning pi as an
infinite list of decimal digits - the printout of that started long
before the calculation itself was finished!
You had an algorithm for an infinite list of decimals that finished?
I think this formulation will go into my cookie jar of noteworthy achievements. - And, sorry, I could not resist. :-)
On 25/01/2024 03:59, Keith Thompson wrote:
As for K&R's thinking, I have no particular insight on that. I have no
problem with some operators being represented by symbols and others by
keywords (I'm accustomed to it from other languages), and I don't see
that the decision to make "sizeof" a keyword even requires any
justification.
I looked it up on the web, but I can't find anything that goes back to K
and R and explains why they took that decision. But clearly to use a
word rather than punctuators, as was the case with every other operator,
must have had a reason.
I think they wanted it to look function-like, because it is a function,
though a function of a type rather than of bits, so of course not a "function" in the C standard sense of the term.
But all operators are
functions in this sense. However sizeof doesn't map to anything used in non-computer mathematics. But "size" is conventionally denoted by two vertical lines. These are taken by "OR", and would be misleading as in mathematics it means "absolute", not "physical area of paper taken up by
the notation".
So I would imagine that that was why they thought a word would be
appropriate, and these reasons were strong enough to justify breaking
the general pattern that operators are punctuators.
I could be completely wrong of course in the absence of actual
statements by K and R. But this would seem to make sense.
On 24/01/2024 20:33, Malcolm McLean wrote:
On 24/01/2024 13:54, David Brown wrote:
On 24/01/2024 13:20, Malcolm McLean wrote:
Many operators in C are not mathematical operations. "sizeof" is an
operator, so are indirection operators, structure member access
operators, function calls, and the comma operator.
I've discussed this ad infinitum with people who don't really understand
what the term "function" means.
Yes, you have - usually at least somewhat incorrectly, and usually
without being clear if you are talking about a "C function", a
mathematical "function", or a "Malcolm function" using your own private definitions.
Anything that maps one set to another
set such that there is one and only one mapping from each member of the
source set to the result set is mathematically a "function".
Sizeof clearly counts.
"sizeof" clearly does not count.
You don't get to mix "mathematical" definitions and "C" definitions.
"sizeof" is a C feature - it makes no sense to ask if it is a
mathematical function or not. It /does/ make sense to ask if it is a
/C/ function or not - and it is not a C function.
On 25/01/2024 12:43, David Brown wrote:
On 25/01/2024 06:01, James Kuyper wrote:
On 1/24/24 16:11, Kaz Kylheku wrote:
On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
On 1/24/24 03:10, Janis Papanagnou wrote:
On 23.01.2024 23:37, Kalevi Kolttonen wrote:
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
What?!
Could you explain what surprises you about that statement? As quoted,
it's a general statement which includes C: "Except as specified later,
side effects and value computations of subexpressions are unsequenced."
Pretty much any language has to guarantee *something* about
order of evaluation, somewhere.
Not the functional languages, I believe - but I've only heard about such
languages, not used them.
I remember a programming task at university around infinite lists in a
functional programming language (not Haskell, but very similar -
arguably its predecessor). We wrote a function returning pi as an
infinite list of decimal digits - the printout of that started long
before the calculation itself was finished!
You can write something like that in C. I adapted a program to print the first N digits so that it doesn't stop. It looks like this:
#include <stdio.h>

int nextpidigit(void);  /* the big-integer spigot generator, defined elsewhere */

int main(void) {
    while (1) {
        printf("%c", nextpidigit());
    }
}
(The output starts as "314159..."; it will need a tweak to insert the
decimal point.)
The algorithm obviously wasn't mine; I've no idea how it works. (In a
4, before it's calculated further? It's magic.)
The nextpidigit() function is set up as a generator.
It also relies on using big integers (I used it to test my library), so
will rapidly get much slower at calculating the next digit.
Even with a much faster library, eventually memory will be exhausted, so
this is not suitable for an 'infinite' number, or even an unlimited
number of digits; it will eventually grind to a halt.
Was yours any different?
On 25/01/2024 14:19, Janis Papanagnou wrote:
On 25.01.2024 13:43, David Brown wrote:
[...] We wrote a function returning pi as an
infinite list of decimal digits - the printout of that started long
before the calculation itself was finished!
You had an algorithm for an infinite list of decimals that finished?
That's the beauty of lazy evaluation!
I think this formulation will go into my cookie jar of noteworthy
achievements. - And, sorry, I could not resist. :-)
On 25/01/2024 13:01, David Brown wrote:
On 24/01/2024 21:50, Kaz Kylheku wrote:
On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
David Brown <david.brown@hesbynett.no> writes:
[...]
(It could not have been added as "**", because - as Keith said in
another post - "x ** y" already has a meaning in C. While I believe
it would be possible to distinguish the uses based on the type of "y",
other than for the literal 0, having "x ** y" mean two /completely/
different things depending on the type of "y" would not be a good idea
for C.)
The problem with a "**" exponentiation operator is lexical. It's common
to have two consecutive unary "*" operators in declarations and
expressions:
char **argv;
char c = **argv;
Clearly, then, the way forward with this ** operator is to wait for the
C++ people to do the unthinkable, and reluctantly copy it some years
later.
I'm hoping the C++ people will do the sane/unthinkable (cross out
one, according to personal preference) thing and allow Unicode symbols
for operators, which will then be added to the standard library rather
than to the language. Then we'll have "x ↑ y", and no possible
confusion.
(It's actually almost fully possible already - all they need to do is
allow characters such as ↑ to be used as macros, and we're good to go.)
Suppose ↑ could be used as a macro now, what would such a definition
look like?
Surely you'd be able to invoke it as ↑(x, y)?
On 25.01.2024 14:07, David Brown wrote:
The problem was with the order of evaluation. Prior to C++17 (where it
was fixed), if you wrote "cout << one() << two() << three();", the order
the three functions were evaluated was unspecified.
The last decade or two I haven't been in C++ to any depth. But I'm a bit surprised by that. The op<< is defined by something like [informally]
stream op<<(stream,value), where "two() << three()" is "value << value",
but "cout << one()" would yield a stream, say X, and "X << two()" again
a stream, etc. So actually we have nested functions
op<<( op<<( op<<(cout, one()), two()), three())
At least you'd need to evaluate one() to obtain the argument for the
next outer of the nested calls.
On 25/01/2024 14:35, Janis Papanagnou wrote:
On 25.01.2024 14:07, David Brown wrote:
The problem was with the order of evaluation. Prior to C++17 (where it
was fixed), if you wrote "cout << one() << two() << three();", the order
the three functions were evaluated was unspecified.
The last decade or two I haven't been in C++ to any depth. But I'm a bit
surprised by that. The op<< is defined by something like [informally]
stream op<<(stream,value), where "two() << three()" is "value << value",
but "cout << one()" would yield a stream, say X, and "X << two()" again
a stream, etc. So actually we have nested functions
op<<( op<<( op<<(cout, one()), two()), three())
At least you'd need to evaluate one() to obtain the argument for the
next outer of the nested calls.
Not quite. To simplify :
cout << one() << two()
is parsed as :
(cout << one()) << two()
So "cout << one()" is like a call to "op<<(cout, one())", and the full expression is like :
op<<(op<<(cout, one()), two())
Without the new C++17 order of evaluation rules, the compiler can
happily execute "two()" before "op<<(cout, one())". The operands to the outer call need to be executed before the outer call itself, but the
order in which these two operands are evaluated is unspecified (until
C++17).
[...]
On 25/01/2024 03:59, Keith Thompson wrote:
However sizeof doesn't map to anything used in
non-computer mathematics. But "size" is conventionally denoted by two
vertical lines.
I'm hoping the C++ people will do the sane/unthinkable (cross out one,
according to personal preference) thing and allow Unicode symbols for
operators, which will then be added to the standard library rather than
to the language.
David Brown <david.brown@hesbynett.no> writes:
On 24/01/2024 21:50, Kaz Kylheku wrote:
[...]
On 2024-01-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
The problem with a "**" exponentiation operator is lexical. It's common
to have two consecutive unary "*" operators in declarations and
expressions:
char **argv;
char c = **argv;
Clearly, then, the way forward with this ** operator is to wait for
the C++ people to do the unthinkable, and reluctantly copy it some
years later.
I'm hoping the C++ people while do the sane/unthinkable (cross out
one, according to personal preference) thing and allow Unicode symbols
for operators, which will then be added to the standard library rather
than to the language. Then we'll have "x ↑ y", and no possible
confusion.
That's difficult to type -- but they could add a new trigraph! 8-)}
If the committee decides C needs an exponentiation operator (which, as
far as I know, nobody has submitted a proposal for), "^^" is available.
(It's actually almost fully possible already - all they need to do is
allow characters such as ↑ to be used as macros, and we're good to
go.)
You'd also need something for ↑ to expand to.
Ya know, like what they did with stacked template closers, which are
already the >> operator.
The "maximum munch" parsing rule seemed like such a good idea, long ago!
It still does. It's simple to describe, and ambiguous cases like
x+++++y should be resolved with whitespace. (">>" was a real problem in
C++, resolved with a special-case rule in C++11; C has no such problems
of similar severity.)
On 25.01.2024 16:53, David Brown wrote:
On 25/01/2024 14:35, Janis Papanagnou wrote:
On 25.01.2024 14:07, David Brown wrote:
The problem was with the order of evaluation. Prior to C++17 (where it
was fixed), if you wrote "cout << one() << two() << three();", the order
the three functions were evaluated was unspecified.
The last decade or two I haven't been in C++ to any depth. But I'm a bit
surprised by that. The op<< is defined by something like [informally]
stream op<<(stream,value), where "two() << three()" is "value << value",
but "cout << one()" would yield a stream, say X, and "X << two()" again
a stream, etc. So actually we have nested functions
op<<( op<<( op<<(cout, one()), two()), three())
At least you'd need to evaluate one() to obtain the argument for the
next outer of the nested calls.
Not quite. To simplify :
cout << one() << two()
is parsed as :
(cout << one()) << two()
So "cout << one()" is like a call to "op<<(cout, one())", and the full
expression is like :
op<<(op<<(cout, one()), two())
Yes, up to here that's exactly what I said above (with three nestings).
op<<( op<<( op<<(cout, one()), two()), three())
Remove one
op<<( op<<(cout, one()), two())
Without the new C++17 order of evaluation rules, the compiler can
happily execute "two()" before "op<<(cout, one())". The operands to the
outer call need to be executed before the outer call itself, but the
order in which these two operands are evaluated is unspecified (until
C++17).
If that was formerly the case then the update was obviously necessary.
Functionally there would probably have been commotion if
tmp = op<<(cout, one())
op<<( tmp, two())
and
op<<( op<<(cout, one()), two())
would have had different results.
Is or was there any compiler that implemented that in the "unexpected"
order?
Then we'll have "x ↑ y", and no possible confusion.
That's difficult to type
On Thu, 25 Jan 2024 09:57:43 -0800, Keith Thompson wrote:
Then we'll have "x ↑ y", and no possible confusion.
That's difficult to type
Compose-circumflex-bar, or compose-bar-circumflex.
↑↑ (typed by me)
This illustrates the two big difficulties with Unicode symbols for this
kind of thing. Lots of them are difficult to type for many people (at
least, not without a good deal of messing around or extra programs).
And it's easy to have different symbols that appear quite similar as
glyphs, but are very different characters as far as the compiler is concerned.
On Thu, 25 Jan 2024 14:01:36 +0100, David Brown wrote:
I'm hoping the C++ people while do the sane/unthinkable (cross out one,
according to personal preference) thing and allow Unicode symbols for
operators, which will then be added to the standard library rather than
to the language.
Why not do what Algol-68 did, and specify a set of characters that could
be used to define new custom operators?
On Thu, 25 Jan 2024 09:57:43 -0800, Keith Thompson wrote:
Then we'll have "x ↑ y", and no possible confusion.
That's difficult to type
Compose-circumflex-bar, or compose-bar-circumflex.
↑↑ (typed by me)
On Thu, 25 Jan 2024 14:01:36 +0100, David Brown wrote:
I'm hoping the C++ people will do the sane/unthinkable (cross out one,
according to personal preference) thing and allow Unicode symbols for
operators, which will then be added to the standard library rather than
to the language.
Why not do what Algol-68 did, and specify a set of characters that could
be used to define new custom operators?
Imagine putting that power into the hands of ordinary users.
On Thu, 25 Jan 2024 21:16:14 +0000, bart wrote:
Imagine putting that power into the hands of ordinary users.
Shock, horror. Of course we elite cannot allow that into the hands of the plebs. Imagine what they might do!
On 25/01/2024 17:11, Janis Papanagnou wrote:
On 25.01.2024 16:53, David Brown wrote:
On 25/01/2024 14:35, Janis Papanagnou wrote:
On 25.01.2024 14:07, David Brown wrote:
The problem was with the order of evaluation. Prior to C++17
(where it
was fixed), if you wrote "cout << one() << two() << three();", the
order
the three functions were evaluated was unspecified.
The last decade or two I haven't been in C++ to any depth. But I'm a
bit
surprised by that. The op<< is defined by something like [informally]
stream op<<(stream,value), where "two() << three()" is "value <<
value",
but "cout << one()" would yield a stream, say X, and "X << two()" again
a stream, etc. So actually we have nested functions
op<<( op<<( op<<(cout, one()), two()), three())
At least you'd need to evaluate one() to obtain the argument for the
next outer of the nested calls.
Not quite. To simplify :
cout << one() << two()
is parsed as :
(cout << one()) << two()
So "cout << one()" is like a call to "op<<(cout, one())", and the full
expression is like :
op<<(op<<(cout, one()), two())
Yes, up to here that's exactly what I said above (with three nestings).
op<<( op<<( op<<(cout, one()), two()), three())
Remove one
op<<( op<<(cout, one()), two())
Without the new C++17 order of evaluation rules, the compiler can
happily execute "two()" before "op<<(cout, one())". The operands to the
outer call need to be executed before the outer call itself, but the
order in which these two operands are evaluated is unspecified (until
C++17).
If that was formerly the case then the update was obviously necessary.
Functionally there would probably have been commotion if
tmp = op<<(cout, one())
op<<( tmp, two())
and
op<<( op<<(cout, one()), two())
would have had different results.
Is or was there any compiler that implemented that in the "unexpected"
order?
There were indeed such real-world cases, complaints were made,
and the rules changed in C++17.
Usually it doesn't matter what order arguments to functions (or operands
to operators) are evaluated. Some compilers have consistent ordering
(and it is often last to first, not first to last), others pick whatever makes sense at the time. The ordering has been explicitly and clearly
stated as "unspecified" since around the beginning of time (which was,
as we all know, 01.01.1970).
On 25.01.2024 23:01, Lawrence D'Oliveiro wrote:
On Thu, 25 Jan 2024 21:16:14 +0000, bart wrote:
Imagine putting that power into the hands of ordinary users.
Shock, horror. Of course we elite cannot allow that into the hands of the
plebs. Imagine what they might do!
Power to the people!
On Thu, 25 Jan 2024 21:07:55 +0100, David Brown wrote:
This illustrates the two big difficulties with Unicode symbols for this
kind of thing. Lots of them are difficult to type for many people (at
least, not without a good deal of messing around or extra programs).
The compose key on *nix systems gives you a fairly mnemonic way of typing many of them.
And it's easy to have different symbols that appear quite similar as
glyphs, but are very different characters as far as the compiler is
concerned.
You can actually take advantage of that. E.g. from some of my Python code:
for cłass in (Window, Pixmap, Cursor, GContext, Region) :
    delattr(cłass, "__del__")
#end for
The human reader might not actually notice (or care) that a particular identifier looks like a reserved word, since the meaning is obvious from context. The compiler cannot deduce the meaning from that context, but
then, it doesn’t need to.
We could say that in comp.lang.c "function" shall mean "a subroutine"
On 25.01.2024 21:11, David Brown wrote:
On 25/01/2024 17:11, Janis Papanagnou wrote:
Is or was there any compiler that implemented that in the "unexpected"
order?
There were indeed such real-world cases, complaints were made,
Complaints that the rule was not clear in its definition?
Or complaints that their compiler did not support cout<<a<<b<<c;
correctly? - I would be astonished about the latter.
This is so fundamental a construct and so frequently used that any
compiler would have been withdrawn in the week after it came out.
That is my expectation. So I would be grateful if you could provide
some evidence that I can look up.
Mind that even if two() is evaluated before one(), it will not be
output before the stream of the first expression op<<(cout, one())
is available, and for this one() must be evaluated. Then one() can
be sent to the stream, and then also two() can be sent to the stream.
(Am I missing something?)
Janis
and the rules changed in C++17.
Usually it doesn't matter what order arguments to functions (or operands
to operators) are evaluated. Some compilers have consistent ordering
(and it is often last to first, not first to last), others pick whatever
makes sense at the time. The ordering has been explicitly and clearly
stated as "unspecified" since around the beginning of time (which was,
as we all know, 01.01.1970).
David Brown <david.brown@hesbynett.no> writes:
On 25/01/2024 10:55, Malcolm McLean wrote:
On 25/01/2024 03:59, Keith Thompson wrote:
As for K&R's thinking, I have no particular insight on that. I have no
problem with some operators being represented by symbols and others by
keywords (I'm accustomed to it from other languages), and I don't see
that the decision to make "sizeof" a keyword even requires any
justification.
I looked it up on the web, but I can't find anything that goes back
to K and R and explains why they took that decision. But clearly to
use a word rather than punctuators, as was the case with every other
operator, must have had a reason.
I think they wanted it to look function-like, because it is a
function, though a function of a type rather than of bits, so of
course not a "function" in the C standard sense of the term.
It is not a function in the C sense - "sizeof x" is not like a
function call (where "x" is a variable or expression, rather than a
type). However, many people (myself included) feel it is clearer in
code to write it as "sizeof(x)", making it look more like a function
or function-like macro.
And many people (myself included) feel it is clearer to write it as
`sizeof x`, precisely so it *doesn't* look like a function call, because
it isn't one. Similarly, I don't use unnecessary parentheses on return statements.
I also write `sizeof (int)` rather than `sizeof(int)`. The parentheses
look similar to those in a function call, but the construct is
semantically distinct. I think of keywords as a different kind of
token than identifiers, even though they look similar (and the standard describes them that way).
I suspect the prime reason "sizeof" is a word, rather than a symbol or
sequence of symbols, is that the word is very clear while there are no
suitable choices of symbols for the task. The nearest might have been
"#", but that might have made pre-processor implementations more
difficult. Of course any symbol or combination /could/ have been
used, and people would have learned its meaning, but "sizeof" just
seems so much simpler.
It has occurred to me that if there had been a strong desire to use a
symbol, "$" could have worked. It even suggests the 's' in the word
"size".
But there was no such desire. sizeof happens to be the only operator
whose symbol is a keyword, but I see no particular significance to this,
and no reason not to define it that way. I might even have preferred keywords for some of C's well-populated zoo of operators. See also
Pascal, which has keywords "and", "or", "not", and "mod".
On 26/01/2024 02:21, Janis Papanagnou wrote:
On 25.01.2024 21:11, David Brown wrote:
On 25/01/2024 17:11, Janis Papanagnou wrote:
Is or was there any compiler that implemented that in the "unexpected" order?
There were indeed such real-world cases, complaints were made,
Complaints that the rule was not clear in its definition?
Or complaints that their compiler did not support cout<<a<<b<<c;
correctly? - I would be astonished about the latter.
The pre-C++17 rule was perfectly clear - there was no specified order of execution for the operands. (And I thought I'd made /that/ perfectly
clear already.) Compilers all worked correctly - they can hardly have
fallen foul of a rule that did not exist.
The complaints (at least, the ones based on facts rather than misunderstandings) were about the lack of a rule that enforced
evaluation order in certain cases.
So C++17 added rules for evaluation orders in some circumstances, but
not others. In C++17, but not before (and not in C), the evaluation of
the expression "one" (and any side-effects) must come before the
evaluation of "two" for, amongst other things :
one << two
one >> two
one[two]
two = one
There is still /no/ ordering for
one * two
one + two
and many other cases.
And of course there are cases where there has always been a sequence
point, and therefore an order of evaluation (a logical order, that is -
if the compiler can see it makes no difference to the observable
effects, it can always re-arrange anything).
<https://en.cppreference.com/w/cpp/language/eval_order> <https://en.cppreference.com/w/c/language/eval_order>
This is so fundamental a construct and so frequently used that any
compiler would have been withdrawn in the week after it came out.
That is my expectation. So I would be grateful if you could provide
some evidence that I can look up.
<https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0145r3.pdf>
For an example in practice, where you can see the generated assembly:
<https://www.godbolt.org/z/fWezzx1nd>
If I remember correctly, gcc 7 implemented the ordering rules from C++17
and back-ported them to previous C++ standards for user convenience (as
the order was previously unspecified, it was fine to do that).
Look at the generated assembly and the order in which the calls to
one(), two(), three() and four() are made. For the operator "<<", they
are made in order one() to four(). For the operator "+", and for
function call parameters, they are generated in order four() to one()
for this case. (In other cases, that may be different - that's what "unspecified" means.)
Mind that even if two() is evaluated before one(), it will not be
output before the stream of the first expression op<<(cout, one())
is available, and for this one() must be evaluated. Then one() can
be sent to the stream, and then also two() can be sent to the stream.
(Am I missing something?)
The output to the stream must be in the order given in the code - that
is true. But the values to be output could (prior to C++17) be
evaluated in any order. If one() and two() have side-effects, that is critical - those side-effects could be executed in any order.
Janis
and the rules changed in C++17.
Usually it doesn't matter what order arguments to functions (or operands
to operators) are evaluated. Some compilers have consistent ordering
(and it is often last to first, not first to last), others pick whatever
makes sense at the time. The ordering has been explicitly and clearly
stated as "unspecified" since around the beginning of time (which was,
as we all know, 01.01.1970).
On 26/01/2024 13:17, Malcolm McLean wrote:
We could say that in comp.lang.c "function" shall mean "a subroutine"
Why don't we just say - as everyone in this group except you already
says, that in c.l.c. "function" means "C function" as described in the C standards, and any other type of function needs to be qualified?
Thus "the tan function" here means the function from <math.h>, not the mathematical function, or something done when making leather.
It really is not difficult.
All that you wrote below targets your last sentence
"those side-effects could be executed in any order".
For the examples we had, like (informally) cout<<a<<b<<c;
this is undisputed for the SIDE EFFECTS of "a", etc. You
had "hidden" those side effects in "one()", I gave in an
earlier post the more obvious example c++ in the context
of cout << c++ << c++ << c++ << endl; as side effects.
All side effects can be a problem (and should be avoided
unless "necessary"). My point was that the order of '<<'
with its arguments is NOT corrupted. I interpreted your
previous posting that you'd have heard that to be an issue.
If you haven't meant to say that there's nothing more to
say about the issue, since the other things you filled your
post with is only distracting from the point in question.
On 26.01.2024 17:01, David Brown wrote:
On 26/01/2024 02:21, Janis Papanagnou wrote:
On 25.01.2024 21:11, David Brown wrote:
On 25/01/2024 17:11, Janis Papanagnou wrote:
Is or was there any compiler that implemented that in the "unexpected" order?
There were indeed such real-world cases, complaints were made,
Complaints that the rule was not clear in its definition?
Or complaints that their compiler did not support cout<<a<<b<<c;
correctly? - I would be astonished about the latter.
The pre-C++17 rule was perfectly clear - there was no specified order of
execution for the operands. (And I thought I'd made /that/ perfectly
clear already.) Compilers all worked correctly - they can hardly have
fallen foul of a rule that did not exist.
The complaints (at least, the ones based on facts rather than
misunderstandings) were about the lack of a rule that enforced
evaluation order in certain cases.
So C++17 added rules for evaluation orders in some circumstances, but
not others. In C++17, but not before (and not in C), the evaluation of
the expression "one" (and any side-effects) must come before the
evaluation of "two" for, amongst other things :
one << two
one >> two
one[two]
two = one
There is still /no/ ordering for
one * two
one + two
and many other cases.
And of course there are cases where there has always been a sequence
point, and therefore an order of evaluation (a logical order, that is -
if the compiler can see it makes no difference to the observable
effects, it can always re-arrange anything).
<https://en.cppreference.com/w/cpp/language/eval_order>
<https://en.cppreference.com/w/c/language/eval_order>
This is so fundamental a construct and so frequently used that any
compiler would have been withdrawn in the week after it came out.
That is my expectation. So I would be grateful if you could provide
some evidence that I can look up.
<https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0145r3.pdf>
For an example in practice, where you can see the generated assembly:
<https://www.godbolt.org/z/fWezzx1nd>
If I remember correctly, gcc 7 implemented the ordering rules from C++17
and back-ported them to previous C++ standards for user convenience (as
the order was previously unspecified, it was fine to do that).
Look at the generated assembly and the order in which the calls to
one(), two(), three() and four() are made. For the operator "<<", they
are made in order one() to four(). For the operator "+", and for
function call parameters, they are generated in order four() to one()
for this case. (In other cases, that may be different - that's what
"unspecified" means.)
Mind that even if two() is evaluated before one(), it will not be
output before the stream of the first expression op<<(cout, one())
is available, and for this one() must be evaluated. Then one() can
be sent to the stream, and then also two() can be sent to the stream.
(Am I missing something?)
The output to the stream must be in the order given in the code - that
is true. But the values to be output could (prior to C++17) be
evaluated in any order. If one() and two() have side-effects, that is
critical - those side-effects could be executed in any order.
Janis
and the rules changed in C++17.
Usually it doesn't matter what order arguments to functions (or operands
to operators) are evaluated. Some compilers have consistent ordering
(and it is often last to first, not first to last), others pick whatever
makes sense at the time. The ordering has been explicitly and clearly
stated as "unspecified" since around the beginning of time (which was,
as we all know, 01.01.1970).
On 26.01.2024 17:06, David Brown wrote:
On 26/01/2024 13:17, Malcolm McLean wrote:
We could say that in comp.lang.c "function" shall mean "a subroutine"
Why don't we just say - as everyone in this group except you already
says, that in c.l.c. "function" means "C function" as described in the C
standards, and any other type of function needs to be qualified?
Thus "the tan function" here means the function from <math.h>, not the
mathematical function, or something done when making leather.
It really is not difficult.
Unless the discussion was done on a meta-level as opposed to a
concrete language specific implementation-model of a function,
or a concrete functions. - My impression from the posts upthread
was that we were taking on the meta-level to understand what we
actually have (with the 'sizeof' beast) or how to consider it
conceptually.
I also think that this is the key to not talk past each other.
The term "function" in computer science seems to have never been
an issue of dispute - I mean on a terminology level; explanations
in lectures or books were quite coherent, and since there was no
dispute everyone seems to have understood what a function is; in
computer science and in mathematics.
From my references it seems a consensus at least in that it's
reflecting a mathematical f: (x,y,...) -> (u,v,...) which is
projected at (or implemented by) some routine/procedure/method/
function, etc. - however it's called in any programming language.
The terminology certainly differs, but the interpretation less.
If we look deeper at the issue we can of course make academic
battles about other "function concepts" (my favorite example
is analogue computers; but that's extreme, of course). But in
that narrow corner we're discussing things it's sufficient IMO,
and probably more rewarding than restricting on the C function
implementation model.
How should we get principle insights on 'sizeof', what it is,
what it should be, etc., if we stay within this restricted C
world terminology, and discussing even a very special type of
a, umm.., function (sort of).
David Brown <david.brown@hesbynett.no> writes:
[...]
You are, quite obviously, guaranteed that in "cout << a << b << c",
the output was in order a, b, c. But that is a totally different
matter from the order of evaluation (and execution, for function
calls) of the subexpressions a, b, and c.
Perhaps I can help clarify this a bit (or perhaps muddy the waters
even further). I'll try to add a bit of C relevance at the bottom.
In `cout << a << b << c`, if a, b, and c are names of non-volatile
objects, the evaluation order doesn't matter. The values of a, b,
and c will be written to the standard output stream in that order,
in all versions of C++.
In `cout << x() << y() << z()`, it's also guaranteed that the
result of the call to `x()` will precede the result of the call to
`y()`, which will precede the result of the call to `z()`, in the
text written to the output stream. What's not guaranteed prior
to C++17 is the order in which the three functions will be called.
If none of the functions have side effects that affect the results
of the other two, or depend on non-local data, it doesn't matter.
If the functions return, say, a string representation of the current
time with nanosecond resolution, the three results can be in any
of 6 orders prior to C++17; in C++17 and later, the timestamps will
always be in increasing order.
C++ overloads the "<<" shift operator for output operations, so each
"<<" after `std::cout` is really a function call, but the rules for sequencing and order of evaluation are the same as for the built-in
"<<" integer shift operation. C++ could have imposed sequencing
requirements only on overloaded "<<" and ">>" operators, but that
would have been more difficult to specify in the standard.
C++17 added a new requirement that the evaluation of the left
operand of "<<" or ">>" is "sequenced before" the right operand,
meaning that any side effects of the evaluation of the left operand
must be complete before evaluation of the right operand begins
(though optimizations that don't change the visible behavior are
still allowed). It did not add such a requirement for the "+"
operator, which is overloaded for std::string concatenation.
[ snip example and prospect ]
On Thu, 25 Jan 2024 14:07:25 +0100, David Brown wrote:
"cout << one() << two() << three();"
Those C++ operators for I/O are a brain-dead idea. C-style printf formats actually work better.
I said - repeatedly - that the order of evaluation of the operands to
most operators is unspecified in C and C++. [...]
A typical example would be :
cout << "Start time: " << get_time() << "\n"
<< "Running tests... " << run_tests() << "\n"
<< "End time: " << get_time();
It was realistic - and indeed happened in some cases - for pre-C++17 compilers to generate the second "get_time()" call before "run_tests()",
and finally do the first "get_time()" call.
Alternatively, the compiler
could call "get_time()" twice, with "run_tests()" called either before
or after that pair. In all these cases, the user will see an output
that was not at all what they intended, with time appearing to go
backwards or the test apparently taking no time.
This was the case regardless of whether or not "get_time()" and
"run_tests()" had any side-effects.
You are, quite obviously, guaranteed that in "cout << a << b << c", the output was in order a, b, c. But that is a totally different matter
from the order of evaluation (and execution, for function calls) of the subexpressions a, b, and c.
I have said exactly what I intended to say in this thread, but I suspect
you have mistaken what the term "order of evaluation" means, and
therefore misunderstood what I wrote. I hope this is all clear to you now.
"cout << one() << two() << three();"
On 26.01.2024 22:16, Lawrence D'Oliveiro wrote:
On Thu, 25 Jan 2024 14:07:25 +0100, David Brown wrote:
"cout << one() << two() << three();"
Those C++ operators for I/O are a brain-dead idea. C-style printf
formats actually work better.
Well, no. There's a reason for using operators.
Also the stream hierarchy offers
design and implementation paths that you just don't have with printf().
You can do this with POSIX printf.
POSIX specifies an extension to printf that allows arguments to be re-ordered. For example:
printf("%2$s%1$s\n", "foo", "bar");
prints "barfoo".
ISO C does not have this feature.
C++'s `cout << ...` has advantages and disadvantages.
On 26/01/2024 18:59, Janis Papanagnou wrote:
On 26.01.2024 17:06, David Brown wrote:
On 26/01/2024 13:17, Malcolm McLean wrote:
We could say that in comp.lang.c "function" shall mean "a subroutine"
Why don't we just say - as everyone in this group except you already
says, that in c.l.c. "function" means "C function" as described in the C standards, and any other type of function needs to be qualified?
Thus "the tan function" here means the function from <math.h>, not the
mathematical function, or something done when making leather.
It really is not difficult.
Unless the discussion was done on a meta-level as opposed to a
concrete language specific implementation-model of a function,
or a concrete functions. - My impression from the posts upthread
was that we were taking on the meta-level to understand what we
actually have (with the 'sizeof' beast) or how to consider it
conceptually.
We are - probably futilely - trying to get Malcolm to understand that
even in "meta-level" discussions, it is vital to be clear what is meant
by terms. And "function" alone means "C function" in c.l.c. You might
often think it is obvious from the context whether someone means "C functions", "mathematical functions", or "wedding functions", but with Malcolm you /never/ know. It regularly means "Malcolm functions", which
have an approximate definition that might change at any time.
I also think that this is the key to not talk past each other.
The term "function" in computer science seems to have never been
an issue of dispute - I mean on a terminology level; explanations
in lectures or books were quite coherent, and since there was no
dispute everyone seems to have understood what a function is; in
computer science and in mathematics.
The term "function" is most certainly in dispute in computer science. It means different things - sometimes subtly, sometimes significantly - in
the context of different programming languages, or computation theory,
or mathematics.
A "C function" is different from a "Pascal function", a
"lambda calculus function", a "Turing machine function", or any other
kind of function definition you want to pick.
From my references it seems a consensus at least in that it's
reflecting a mathematical f: (x,y,...) -> (u,v,...) which is
projected at (or implemented by) some routine/procedure/method/
function, etc. - however it's called in any programming language.
No, that is only one kind of function.
There are all sorts of questions to ask.
Can functions have side effects?
Do functions have to have outputs? Do they have to have inputs?
Does a function have to give the same output for the same inputs?
Can a function give more than one output? Does a function actually have
to be executed as called, or can the language re-arrange things?
Is it valid to have a function that does not satisfy certain
requirements, if that function is never called?
Can functions operate on types? Can they operate on other functions?
Can they operate on whole programs?
Does the function include some kind of data store? Does it include the machine it executes on?
Does a function have to be executable? Does it even have to be
computable? Does it have to execute in a finite time?
Is a function a run-time entity, or a compile-time entity? Can it be
changed at run-time? Does it make sense to "run" a function at compile
time?
I'm sure we could go on.
The terminology certainly differs, but the interpretation less.
The problem is that the terminology is the same, but the interpretation
can be wildly different. In order to communicate, we must be sure that
a given term is interpreted in the same way by each person.
If we look deeper at the issue we can of course make academic
battles about other "function concepts" (my favorite example
is analogue computers; but that's extreme, of course). But in
that narrow corner we're discussing things it's sufficient IMO,
and probably more rewarding than restricting on the C function
implementation model.
I think we're fine sticking to "function" meaning "C function", which is
well defined by the C standards, and using "mathematical function" for mathematical functions, which are also quite solidly defined. Any other usage will need to be explained at the time.
How should we get principle insights on 'sizeof', what it is,
what it should be, etc., if we stay within this restricted C
world terminology, and discussing even a very special type of
a, umm.., function (sort of).
Sizeof is not a C function.
It is a C operator. If you don't know what
it is or how it works, or want the technical details, it's all in
6.5.3.4 of the C standards.
Trying to describe "sizeof" as a function of some sort with a different
kind of use of the word "function" really doesn't get us anywhere, as
shown in this thread. It is what it is - trying to mush it into another
term is not helpful.
On Fri, 26 Jan 2024 15:41:43 -0800, Keith Thompson wrote:
C++'s `cout << ...` has advantages and disadvantages.
Interesting about Java, with all its needless complexity and futile
attempts at simplification, that this was one decision it made correctly,
and that was not to copy those operators.
On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:
Also the stream hierarchy offers
design and implementation paths that you just don't have with printf().
And that you don’t need, frankly.
Java manages just fine with printf-style formatting and “toString()” methods.
On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:
On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:
Also the stream hierarchy offers design and implementation paths that
you just don't have with printf().
And that you don’t need, frankly.
Don't be so fast with your judgment. Of course we use it to elegantly
and scalably solve tasks in C++.
Java (as a newer language) has also some advantages, but was in many
respects far behind C++ (IMO).
Java ...C++ ..
But that's anyway all off-topic here.
No, you are wrong, I'm not the owner of this piece of... code.
If someone makes a big heap of fecal matter in a public park, would
you think I'm the owner? I'd rather sue the one who did that;
because the park (or Usenet) is common property, and the heap
of fecal matter (or that code) is not.
All that you wrote below targets your last sentence
"those side-effects could be executed in any order".
For the examples we had, like (informally) cout<<a<<b<<c;
this is undisputed for the SIDE EFFECTS of "a", etc. You
had "hidden" those side effects in "one()", I gave in an
earlier post the more obvious example c++ in the context
of cout << c++ << c++ << c++ << endl; as side effects.
All side effects can be a problem (and should be avoided
unless "necessary").
On 26.01.2024 20:18, David Brown wrote: [...]
(I don't like the habit of introducing personalized terms like
"Malcolm functions"; this habit exposes more of the person who
introduced it than anything else. And anyway it would only muddy
the issue, not clarify it.)
A "C function" is different from a "Pascal function", a
"lambda calculus function", a "Turing machine function", or any other
kind of function definition you want to pick.
What relevance has any technical difference of "C functions"
and "Pascal functions"? - None.
It's not really important for our discussions to consider Algol's
ref, Pascal's var, C++'s const, or what else.
Yes. But remember that our question was not a technical one; wasn't
the question by the other poster (Malcolm?) about a mathematical
function term and how it fits to determine what 'sizeof' actually is
to be considered.
We disagree here; it may not appear so to you but get_time() actually
has a "side effect" (I put it in quotes, because it's literally no
"effect" but for the argument of its _sequencing problem_ it's a
relevant externality). It obtains (probably from a hardware device)
the time when the call happened.
On Sat, 27 Jan 2024 01:27:55 +0100, Janis Papanagnou wrote:
On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:
On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:
Also the stream hierarchy offers design and implementation paths that
you just don't have with printf().
And that you don’t need, frankly.
Don't be so fast with your judgment. Of course we use it to elegantly
and scalably solve tasks in C++.
But not localization, which is an important issue. printf-style
formatting allows rearrangement of parts of a message to suit grammar
purposes; C++-style output operators do not.
Java (as a newer language) has also some advantages, but was in many
respects far behind C++ (IMO).
It made many mistakes. The goal of trying to be simpler than C++ was I
think a failure.
[...] Personally I wanted just
"function" and for it to be clear from context that here the term did
not mean "subroutine".
It's hard to think of anything that can be passed to standard output
other than integers, floating point values, and strings. So you only
need three atomic operations.
You can then build complex objects consisting of integers, floats and
strings on top of those three basic operations. But the stream itself
should be locked down and not open to derivation.
[...]
On 27/01/2024 10:34, Janis Papanagnou wrote:
On 27.01.2024 04:05, Malcolm McLean wrote:
[...] Personally I wanted just
"function" and for it to be clear from context that here the term did
not mean "subroutine".
In my book; there's the "concept function" (mathematical), and the
mapping/implementation onto/in a computer (a "calculation routine").
The latter has just different names in different languages and it
naturally has different technical details. In any form its purpose
is to be an implemented instance of a formal mathematical concept.
Janis
I don't really see how "Bleep" is any sort of mathematical function. But
it is clearly a "subroutine".
What I am saying is that standard output can take integers, floats and strings.
So the stream should have some facilities for writing integers (leading
zeros, signs, maybe comma separators for thousands), some for floats
(rounding, precision, scientific notation etc.), some for strings (not
much you can do here other than just pass the raw characters).
Now when we've got those facilities and we are happy with them, that's
it. We don't allow further derivation of the stream to change the basic
behaviour. Now people might say "booleans, you've forgotten booleans,
surely when you pass booleans it should print "true" or "false"". No.
We'll handle that at a higher level and pass "true" and "false" as strings.
The disadvantage is that you are locked into an integer/float/string
paradigm. And it's not OO. But the advantage is that it will be stable.
On 1/26/24 12:31, Janis Papanagnou wrote:
All side effects can be a problem (and should be avoided
unless "necessary").
Virtually everything useful that a computer program does qualifies as a
side effect. Side effects cannot be avoided, they can only be controlled.
On Sat, 27 Jan 2024 01:27:55 +0100, Janis Papanagnou wrote:
On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:
On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:
Also the stream hierarchy offers design and implementation paths that
you just don't have with printf().
And that you don’t need, frankly.
Don't be so fast with your judgment. Of course we use it to elegantly
and scalably solve tasks in C++.
But not localization, which is an important issue. printf-style formatting allows rearrangement of parts of a message to suit grammar purposes, C++-style output operators do not.
On 26.01.2024 19:59, David Brown wrote:
I said - repeatedly - that the order of evaluation of the operands to
most operators is unspecified in C and C++. [...]
Yes, and this was undisputed.
A typical example would be :
cout << "Start time: " << get_time() << "\n"
<< "Running tests... " << run_tests() << "\n"
<< "End time: " << get_time();
It was realistic - and indeed happened in some cases - for pre-C++17
compilers to generate the second "get_time()" call before "run_tests()",
and finally do the first "get_time()" call.
Yes, we have no differences.
And the sample is fine to show how we should NOT implement such time measurements (or similar logic)!
A computer scientist or a sophisticated programmer would know that
there are run-times associated in such expressions:
cout << "S1" << f1() << "S2" << f2() << "S3" << f3();
(t1 ... t9 marking the successive evaluation points interleaved through the expression)
and he would act accordingly and serialize the expression (see below).
Alternatively, the compiler
could call "get_time()" twice, with "run_tests()" called either before
or after that pair. In all these cases, the user will see an output
that was not at all what they intended, with time appearing to go
backwards or the test apparently taking no time.
This was the case regardless of whether or not "get_time()" and
"run_tests()" had any side-effects.
We disagree here; it may not appear so to you but get_time() actually
has a "side effect" (I put it in quotes, because it's literally no
"effect" but for the argument of its _sequencing problem_ it's a
relevant externality). It obtains (probably from a hardware device)
the time when the call happened.
That's why somewhat experienced programmers would not write above
code that way; something like "run_tests()" is (typically) or can be
very time consuming, so they'd do
t0 = get_time(); res = run_tests(); t1 = get_time();
cout << ... etc.
(Note: This argument does NOT imply that a language shouldn't be made as bulletproof as possible and sensible.)
You are, quite obviously, guaranteed that in "cout << a << b << c", the
output was in order a, b, c. But that is a totally different matter
from the order of evaluation (and execution, for function calls) of the
subexpressions a, b, and c.
(It was meant as a "meta expression". I've addressed that in my
response to Keith already; please see there.)
I have said exactly what I intended to say in this thread, but I suspect
you have mistaken what the term "order of evaluation" means, and
therefore misunderstood what I wrote. I hope this is all clear to you now.
The order of evaluation of the '<<' was what I spoke about. The order
of the arguments had never been an issue. The "problem" with the order
of the arguments becomes a problem (without quotes) when side effects
of the arguments are inherent to the arguments.
You had been focused on the evaluation of the arguments (where side
effects might lead to unexpected behavior). I wasn't.
On 26/01/2024 19:18, David Brown wrote:
I think we're fine sticking to "function" meaning "C function", which
is well defined by the C standards, and using "mathematical function"
for mathematical functions, which are also quite solidly defined. Any
other usage will need to be explained at the time.
Basically I wanted "function" for C functions which are also
mathematical functions, and "procedure" for C functions which do not
meet the definition of mathematical functions. In context, of course.
And since this is normal, accepted usage, I thought it would be accepted
here.
On 27.01.2024 00:51, Lawrence D'Oliveiro wrote:
On Fri, 26 Jan 2024 15:41:43 -0800, Keith Thompson wrote:
C++'s `cout << ...` has advantages and disadvantages.
Interesting about Java, with all its needless complexity and futile
attempts at simplification, that this was one decision it made correctly,
and that was not to copy those operators.
Choosing these operators is a separate issue.
On 26.01.2024 20:18, David Brown wrote:
On 26/01/2024 18:59, Janis Papanagnou wrote:
On 26.01.2024 17:06, David Brown wrote:
On 26/01/2024 13:17, Malcolm McLean wrote:
(I don't like the habit of introducing personalized terms like
"Malcolm functions"; this habit exposes more of the person who
introduced it than anything else. And it anyway would only muddy
the issue not clarify.)
(I fear this thread will lead nowhere, but okay, I'll enter...)
A "C function" is different from a "Pascal function", a
"lambda calculus function", a "Turing machine function", or any other
kind of function definition you want to pick.
What relevance has any technical difference of "C functions"
and "Pascal functions"? - None.
Note: I don't want you to answer these questions. I suppose
you might have some substantial CS background (I certainly do)
and are not just spreading buzzwords.
Neither the technical (implementation) differences of the first
two types are relevant for the topics that have been discussed,
nor the algorithm theory definitions of the latter two function
types are relevant here.
From my references there seems to be consensus at least that it
reflects a mathematical f: (x,y,...) -> (u,v,...) which is
projected onto (or implemented by) some routine/procedure/method/
function, etc. - however it's called in any programming language.
No, that is only one kind of function.
That is an abstract representation from mathematics (and I am
not interested in syntactic differences to other forms) that can
be directly mapped to an algorithmic representation.
We write (for example [borrowed from a book]):
f: R x R x R -> R for the domains; R here: real numbers
f(r,R,h) -> pi/3 x h x (r^2 + r x R + R^2)
and in computer languages (for example) syntactic variants of:
f = (real r, real R, real h) real :
pi/3 * h * (r^2 + r * R + R^2)
The function from the language closely resembles that from the
mathematical domain.
There are all sorts of questions to ask.
Yes, but not many (none?) of significance in our discussion context
here.
How should we get principle insights on 'sizeof', what it is,
what it should be, etc., if we stay within this restricted C
world terminology, and discussing even a very special type of
a, umm.., function (sort of).
Sizeof is not a C function.
I know it's an operator in C. And I also wasn't saying that it's a
C function. - You still see the "(sort of)" in my statement. And we
already spoke about the close (but not exact) equivalences between
functions and operators.
It is a C operator. If you don't know what
it is or how it works, or want the technical details, it's all in
6.5.3.4 of the C standards.
If that's all the OP wanted to discuss it would be easy. You don't
even need any C standard document. Open any book, even the old K&R
is sufficient, and look up 'sizeof'. You can read about it being an
operator and fine. File closed. Goodbye. (What for was the original
question of this thread? I seem to recall something about the form
with parenthesis and type?)
Trying to describe "sizeof" as a function of some sort with a different
kind of use of the word "function" really doesn't get us anywhere, as
shown in this thread. It is what it is - trying to mush it into another
term is not helpful.
What would be the difference if the parenthesized form were
called a function, given that functions and operators are similar,
and the context so restricted?
I don't think you can get an address
of it (or can we?); but that again is just another implementation
detail (C specific).
The need for parentheses in sizeof(type) seems anyway to be only a
hack, necessary for type expressions with blanks, as in sizeof(struct x)?
Janis
BTW: There was another subthread about preprocessor use for NELEM determination using sizeof. When I looked up the K&R reference I
saw its use described even as a standard pattern to determine the
number of array elements. No wonder it became idiomatic.
On 27/01/2024 01:38, Lawrence D'Oliveiro wrote:
On Sat, 27 Jan 2024 01:27:55 +0100, Janis Papanagnou wrote:
On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:
On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:
Also the stream hierarchy offers design and implementation paths that you just don't have with printf().
And that you don’t need, frankly.
Don't be so fast with your judgment. Of course we use it to elegantly
and scalably solve tasks in C++.
But not localization, which is an important issue. printf-style formatting
allows rearrangement of parts of a message to suit grammar purposes,
C++-style output operators do not.
Standard printf formatting also does not allow such re-arrangements.
(My own key dislike about the C++ output streams is the mess of stateful
"IO manipulators".)
On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:
On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:
Also the stream hierarchy offers
design and implementation paths that you just don't have with printf().
And that you don’t need, frankly.
Don't be so fast with your judgment. Of course we use it to elegantly
and scalably solve tasks in C++.
Java manages just fine with printf-style formatting and “toString()” methods.
I tried to explain in my other post that it's not just about a format
(or a string-sequencing member function). But I'm sure one must be
deeper in the topic or have experienced (besides any supposed issues)
the sophisticated possibilities that C++ offers to support good design.
Java (as a newer language) has also some advantages, but was in many
respects far behind C++ (IMO).
David Brown <david.brown@hesbynett.no> writes:
On 27/01/2024 01:38, Lawrence D'Oliveiro wrote:
On Sat, 27 Jan 2024 01:27:55 +0100, Janis Papanagnou wrote:
On 27.01.2024 00:52, Lawrence D'Oliveiro wrote:
On Fri, 26 Jan 2024 22:46:25 +0100, Janis Papanagnou wrote:
Also the stream hierarchy offers design and implementation paths that you just don't have with printf().
And that you don’t need, frankly.
Don't be so fast with your judgment. Of course we use it to elegantly
and scalably solve tasks in C++.
But not localization, which is an important issue. printf-style formatting allows rearrangement of parts of a message to suit grammar purposes, C++-style output operators do not.
Standard printf formatting also does not allow such re-arrangements.
Depends on what standard you use. POSIX certainly does.
(My own key dislike about the C++ output streams is the mess of stateful
"IO manipulators".)
Hear! Hear!
The run-time cost of all those stateful manipulators isn't free, either.
On 27/01/2024 18:26, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
(My own key dislike about the C++ output streams is the mess of stateful "IO manipulators".)
Hear! Hear!
The run-time cost of all those stateful manipulators isn't free, either.
For my own use, I've sometimes used classes letting you do :
debug_log << "X = " << x << " = 0x" << hex(x, 8) << "\n";
David Brown <david.brown@hesbynett.no> writes:
Depends on what standard you use. POSIX certainly does.
Standard printf formatting also does not allow such re-arrangements.
What I am saying is that standard output can take integers, floats and strings.
You can of course encode any data format as any other as long as you can write enough. But standard output can't take images or audio, for example.
$ cat a.out | xxd | head -1
00000000: cffa edfe 0c00 0001 0000 0000 0200 0000  ................
On 27/01/2024 21:06, Lawrence D'Oliveiro wrote:
On Sat, 27 Jan 2024 11:13:16 +0000, Malcolm McLean wrote:
What I am saying is that standard output can take integers, floats and
strings.
You forgot booleans. Also enumerations can be useful.
Yes, and we could say fixed point, complex, etc.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
C and POSIX go together like a horse and carriage; one without the
other is a lot less useful.
Which is why horseless carriages never caught on.
On 27/01/2024 02:10, James Kuyper wrote:
On 1/26/24 12:31, Janis Papanagnou wrote:
All side effects can be a problem (and should be avoided
unless "necessary").
Virtually everything useful that a computer program does qualifies as a
side effect. Side effects cannot be avoided, they can only be controlled.
Try telling that to Haskell programmers :-)
David Brown <david.brown@hesbynett.no> writes:
For my own use, I've sometimes used classes letting you do :
debug_log << "X = " << x << " = 0x" << hex(x, 8) << "\n";
"hex(x, 8)" returns a value of a class holding "x" and the number of
digits 8, and then there is an overload for the << operator on this
class. No extra state needs to be stored in the logging class, I can
make as many of these formatters as I like, and the intermediary
classes all disappear in the optimisation.
Or hex() could just return a std::string.
On 1/27/24 10:44, David Brown wrote:
On 27/01/2024 02:10, James Kuyper wrote:
On 1/26/24 12:31, Janis Papanagnou wrote:
All side effects can be a problem (and should be avoided
unless "necessary").
Virtually everything useful that a computer program does qualifies as a
side effect. Side effects cannot be avoided, they can only be controlled.
Try telling that to Haskell programmers :-)
I was talking very specifically in reference to C's definition of "side-effect". I'm not particularly familiar with Haskell - does it have
a different definition of "side effect", or does it somehow get
something useful done without qualifying under C's definition? If so, how?
On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
Depends on what standard you use. POSIX certainly does.
Standard printf formatting also does not allow such re-arrangements.
C and POSIX go together like a horse and carriage; one without the other
is a lot less useful.
David Brown <david.brown@hesbynett.no> writes:
[...]
Seriously, how hard would it be for you to accept the usage of
"function" to mean "C function" in this group? How difficult would it
be for you to try to speak the same language as the rest of us? Do
you really expect everyone else to adapt to suit your personal choice
of definitions? How often do you need to go round the same circles
again and again, instead of trying to communicate with people in a
sane manner?
You don't really think difficulty is the issue, do you?
On 28/01/2024 01:26, Keith Thompson wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 27/01/2024 21:06, Lawrence D'Oliveiro wrote:
On Sat, 27 Jan 2024 11:13:16 +0000, Malcolm McLean wrote:
What I am saying is that standard output can take integers, floats and
strings.
You forgot booleans. Also enumerations can be useful.
Yes, and we could say fixed point, complex, etc.
Exactly.
It's not inherently a bad idea to extend our little stdout interface
to include booleans. But in fact there are too many output formats you
might need.
Fixed point - in C or C++ there's no standard for that, so now you are
going the OO route. As you would with enumerations as the symbol
doesn't exist at runtime.
It's not that there is no case to be made for the OO approach. What I
am saying is that in practice the locked down restricted interface
will work better.
I think you mean it will work Malcolm-better.
Apparently inflexibility and vulnerability to type errors are
Malcolm-better than the alternative.
Inflexibility can be better. Because in reality most programs work with
a restricted set of data types which it makes sense to pass to a text
stream, and so you only need three atomic types.
Type errors are of course a nuisance with printf(). But that's because
of the quirks of C, not because it takes a restricted set of types, and
you can write a different restricted interface without this problem.
The fact is that printf(), which works basically as I recommend, is
widely used as the interface to standard output, and often OO
alternatives are available and not used for various reasons.
So the world is in fact "Malcolm better".
On 28/01/2024 02:59, Malcolm McLean wrote:
[...]
You mean the Malcolm-world is Malcolm-better with these restrictions,
because in the Malcolm-world the only programming tasks that are done
are Malcolm-tasks, and the programmers are all Malcolm-programmers.
On 28/01/2024 12:00, David Brown wrote:
[...]
You put non-ASCII text on stdout?
I mean, obviously in a program for international use itself. But in a
routine program for general use?
how is
cout << std::hex << std::setw((bits + 3)/4) << value << std::endl;
better than
printf("%*x\n", (bits + 3)/4, value);
Standard output is any sequence of ASCII characters.
printf() is the
main C interface to that, and supports integers, floats and strings, to
a first approximation.
You can of course encode any data format as any other as long as you can write enough. But standard output can't take images or audio, for example.
The OO method is to allow the stream to be extended. So, in one common system, we might have a "decimal" stream which takes floats and outputs in the format 123.456. Then we could derive a different type of stream from
that which outputs floats as 1.23456e2. [...]
[...]
On 28/01/2024 12:00, David Brown wrote:
On 28/01/2024 02:59, Malcolm McLean wrote:You put non-ASCII text on stdout?
On 28/01/2024 01:26, Keith Thompson wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:Exactly.
On 27/01/2024 21:06, Lawrence D'Oliveiro wrote:
On Sat, 27 Jan 2024 11:13:16 +0000, Malcolm McLean wrote:Yes, and we could say fixed point, complex, etc.
What I am saying is that standard output can take integers,You forgot booleans. Also enumerations can be useful.
floats and
strings.
It's not inherently a bad idea to extend our little stdout interface
to include booleans. But in fact there are too many output formats you
might need.
Fixed point - in C or C++ there's no standard for that, so now you are
going the OO route. As you would with enumerations as the symbol
doesn't exist at runtime.
It's not that there is no case to be made for the OO approach. What I
am saying is that in practice the locked down restricted interface
will work better.
I think you mean it will work Malcolm-better.
Apparently inflexibility and vulnerability to type errors are
Malcolm-better than the alternative.
Inflexibility can be better. Because in reality most programs work
with a restricted set of data types which it makes sense to pass to a
text stream, and so you only need three atomic types.
Type errors are of course a nuisance with printf(). But that's
because of the quirks of C, not because it takes a restricted set of
types, and you can write a different restricted interface without
this problem.
The fact is that printf(), which works basically as I recommend, is
widely used as the interface to standard output, and often OO
alternatives are available and not used for various reasons.
So the world is in fact "Malcolm better".
You mean the Malcolm-world is Malcolm-better with these restrictions,
because in the Malcolm-world the only programming tasks that are done
are Malcolm-tasks, and the programmers are all Malcolm-programmers.
At least that's all cleared up nicely, and the rest of the world can
go back to using more than three types, and generating outputs that
are not just ASCII text.
I mean, obviously in a program for international use itself. But in a
routine program for general use?
On 26/01/2024 22:30, Janis Papanagnou wrote:
On 26.01.2024 19:59, David Brown wrote:
A computer scientist or a sophisticated programmer would know that
there are run-times associated in such expressions:
cout << "S1" << f1() << "S2" << f2() << "S3" << f3();
(t1 ... t9 marking the successive evaluation points interleaved through the expression)
The experienced or knowledgable C++ programmer (prior to C++17) would
know that the parts here are not necessarily executed in the order you
give.
[...]
That's why somewhat experienced programmers would not write above
code that way; something like "run_tests()" is (typically) or can be
very time consuming, so they'd do
t0 = get_time(); res = run_tests(); t1 = get_time();
cout << ... etc.
Of course.
In practice, they could still be badly wrong even with that code -
there's a lot of subtle points to consider when trying to time code, and
my experience is that very few programmers get it entirely right.
[...]
On 27/01/2024 21:59, Lawrence D'Oliveiro wrote:
On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
Depends on what standard you use. POSIX certainly does.
Standard printf formatting also does not allow such re-arrangements.
C and POSIX go together like a horse and carriage; one without the
other is a lot less useful.
To the nearest percent, 0% of all systems running C programs support
POSIX (or Windows, or any other "big" system). The world of small
embedded systems totally outweigh "big" systems by many orders of
magnitude. And perhaps 80% of such small systems are programmed in C.
A lot (for some interpretations of "a lot") of embedded systems run
Android. Those aren't the one David was talking about.
On Sun, 28 Jan 2024 14:49:53 -0800, Keith Thompson wrote:
A lot (for some interpretations of "a lot") of embedded systems run
Android. Those aren't the one David was talking about.
They have a POSIX-type C runtime. Which does support “%«n»$” for reordering args to the printf routines.
The point being the prevalence of POSIX is a little larger than you give
it credit for.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
The point being the prevalence of POSIX is a little larger than you
give it credit for.
Again, David wasn't talking about Android systems.
On 27.01.2024 16:43, David Brown wrote:
On 26/01/2024 22:30, Janis Papanagnou wrote:
That's why somewhat experienced programmers would not write above
code that way; something like "run_tests()" is (typically) or can be
very time consuming, so they'd do
t0 = get_time(); res = run_tests(); t1 = get_time();
cout << ... etc.
Of course.
You can serialize (as I suggested previously as one example) or
embed functions like take_time(run_tests()) as another example.
In practice, they could still be badly wrong even with that code -
there's a lot of subtle points to consider when trying to time code, and
my experience is that very few programmers get it entirely right.
Really? - I mostly dealt with folks, even newbies with a
proper CS education, who had enough experience or knowledge.
Most problems appeared in contexts where the languages used
have inherent design issues; and not in every case could we
avoid use of such languages in the first place.
On Sun, 28 Jan 2024 12:53:36 +0100, David Brown wrote:
On 27/01/2024 21:59, Lawrence D'Oliveiro wrote:
On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
Depends on what standard you use. POSIX certainly does.
Standard printf formatting also does not allow such re-arrangements.
C and POSIX go together like a horse and carriage; one without the
other is a lot less useful.
To the nearest percent, 0% of all systems running C programs support
POSIX (or Windows, or any other "big" system). The world of small
embedded systems totally outweigh "big" systems by many orders of
magnitude. And perhaps 80% of such small systems are programmed in C.
And a lot of those “embedded” systems are running Android.
Android ships as many units per year as the entire installed base of Microsoft Windows.
On Sun, 28 Jan 2024 17:48:53 -0800, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
The point being the prevalence of POSIX is a little larger than you
give it credit for.
Again, David wasn't talking about Android systems.
No, I was, as an example of the sort of POSIX system he thought was too minuscule to worry about.
On Sun, 28 Jan 2024 14:49:53 -0800, Keith Thompson wrote:
A lot (for some interpretations of "a lot") of embedded systems run
Android. Those aren't the one David was talking about.
They have a POSIX-type C runtime. Which does support “%«n»$” for reordering args to the printf routines.
The point being the prevalence of POSIX is a little larger than you give
it credit for.
On 27.01.2024 17:46, David Brown wrote:
[...]
FYI: Too long to read at the moment. (Maybe later, maybe not.)
On 28/01/2024 18:24, David Brown wrote:
On 28/01/2024 17:09, Malcolm McLean wrote:
You put non-ASCII text on stdout?
I mean, obviously in a program for international use itself. But in a
routine program for general use?
I commonly write out in UTF-8 - it does not have to be
"international". (I assume that by "international" you, as a good
Brit, mean "not UK". After all, a program written solely for use in
Norwegian is not international.)
I'd expect that most general purpose programs written by Norwegians use
an English interface, even if it isn't really expected that the program
will find an audience beyond some users in Norway. Except of course for
programs which in some way are about Norway.
Sometimes I will have binary data of some kind on the standard output.
It's a lot less common, but it happens. A common example would be
code for generating images or other files for a webserver.
Most of my "real" programs, rather than small utilities, are for
embedded systems where the concept of "standard output" is not really
the same as for PC's.
I've never used standard output for binary data. It might be necessary
for webservers that serve images. But it strikes me as a poor design
decision.
And in environments like POSIX that don't distinguish between text and
binary output streams, it can be perfectly sensible (though not 100%
portable) to send binary data to stdout.
On 28/01/2024 21:43, Lawrence D'Oliveiro wrote:
On Sun, 28 Jan 2024 12:53:36 +0100, David Brown wrote:
On 27/01/2024 21:59, Lawrence D'Oliveiro wrote:
On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
Depends on what standard you use. POSIX certainly does.
Standard printf formatting also does not allow such re-arrangements.
C and POSIX go together like a horse and carriage; one without the
other is a lot less useful.
To the nearest percent, 0% of all systems running C programs support
POSIX (or Windows, or any other "big" system). The world of small
embedded systems totally outweigh "big" systems by many orders of
magnitude. And perhaps 80% of such small systems are programmed in C.
And a lot of those “embedded” systems are running Android.
No, they are not. Android is Linux, and is included in the 0%.
Android ships as many units per year as the entire installed base of
Microsoft Windows.
Sure. And it is still within the 0%.
Take your car as an example. There's a reasonable chance, if it is
modern, that the entertainment and navigation system is running Android.
You might have a couple of other parts running embedded Linux of other types. And you might have 100 other microcontrollers running programs written in C, but not running a "big" POSIX OS. Some will run RTOS's,
some will be bare metal.
On the computer on your desk, you have a microcontroller in your mouse, keyboard, webcam, screen, harddisk, managed switch. Your printer might
have some kind of embedded Linux for its display and UI, but probably
has many other microcontrollers in it. Your toaster, oven, fridge,
alarm clock, digital thermometer - microcontrollers are everywhere.
Even your typical Android device - a phone or tablet - will have a few separate microcontrollers, and a variety of bits and pieces in its SoC
that are programmed in C but do not have a POSIX system.
On 29/01/2024 12:35, David Brown wrote:
On 28/01/2024 21:43, Lawrence D'Oliveiro wrote:
On Sun, 28 Jan 2024 12:53:36 +0100, David Brown wrote:
On 27/01/2024 21:59, Lawrence D'Oliveiro wrote:
And a lot of those “embedded” systems are running Android.
On Sat, 27 Jan 2024 17:26:24 GMT, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
Depends on what standard you use. POSIX certainly does.
Standard printf formatting also does not allow such re-arrangements.
C and POSIX go together like a horse and carriage; one without the
other is a lot less useful.
To the nearest percent, 0% of all systems running C programs support
POSIX (or Windows, or any other "big" system). The world of small
embedded systems totally outweigh "big" systems by many orders of
magnitude. And perhaps 80% of such small systems are programmed in C.
No, they are not. Android is Linux, and is included in the 0%.
Android ships as many units per year as the entire installed base of
Microsoft Windows.
Sure. And it is still within the 0%.
Take your car as an example. There's a reasonable chance, if it is
modern, that the entertainment and navigation system is running
Android. You might have a couple of other parts running embedded
Linux of other types. And you might have 100 other microcontrollers
running programs written in C, but not running a "big" POSIX OS. Some
will run RTOS's, some will be bare metal.
On the computer on your desk, you have a microcontroller in your
mouse, keyboard, webcam, screen, harddisk, managed switch. Your
printer might have some kind of embedded Linux for its display and UI,
but probably has many other microcontrollers in it. Your toaster,
oven, fridge, alarm clock, digital thermometer - microcontrollers are
everywhere.
I think this is being disingenuous. Of course there are countless
millions of integrated circuits used everywhere, that will outnumber the packaged consumer devices that everyone knows about.
Some of them may have programmable elements. But, no matter how crude,
how limited, if somebody, somewhere, has configured a program to turn a subset of C into code for that device, that enables you to add that to
the list of systems you claim are programmed in 'C'.
Even if it relies on dedicated extensions or uses lots of inline assembly.
Even your typical Android device - a phone or tablet - will have a few
separate microcontrollers, and a variety of bits and pieces in its SoC
that are programmed in C but do not have a POSIX system.
Maybe you can count each main CPU and each core separately too!
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 27.01.2024 21:17, Malcolm McLean wrote:
[...]
printf() is the
main C interface to that, and supports integers, floats and strings, to
a first approximation.
It's not an approximation; printf() is _restricted_ to these types (and
a few more variants of these few basic types, to be correct).
printf also supports pointer values with "%p". And it supports single
characters, which are not strings.
Strings of course absolutely do not have to be ASCII. Using printf to
print data with embedded null bytes is tricky
-- but of course printf is
not the only interface. We can print arbitrary data with putchar,
fwrite, etc.
And in environments like POSIX that don't distinguish between text and
binary output streams,
it can be perfectly sensible (though not 100%
portable) to send binary data to stdout.
On 27/01/2024 11:36, Janis Papanagnou wrote:
On 27.01.2024 12:02, Malcolm McLean wrote:
On 27/01/2024 10:34, Janis Papanagnou wrote:
On 27.01.2024 04:05, Malcolm McLean wrote:
In many languages, including C, there's a difference between functions
that return a value and functions that don't, in that
if (realloc(ptr, 0))
is allowed
whilst
if (free(ptr))
On 29/01/2024 19:32, Malcolm McLean wrote:
On 27/01/2024 11:36, Janis Papanagnou wrote:
On 27.01.2024 12:02, Malcolm McLean wrote:
On 27/01/2024 10:34, Janis Papanagnou wrote:
On 27.01.2024 04:05, Malcolm McLean wrote:
In many languages, including C, there's a difference between functions
that return a value and functions that don't, in that
In some languages, yes.
if (realloc(ptr, 0))
is allowed
whilst
if (free(ptr))
struct S { int a, b; };
struct S foo(void);
foo() returns a value, but "if (foo())" is not allowed.
C does not make much difference between functions that return a value,
and those that don't. The key distinction is whether the "return"
statement must have an expression or must not have an expression.
On 29/01/2024 16:18, David Brown wrote:
On 28/01/2024 20:49, Malcolm McLean wrote:
On 28/01/2024 18:24, David Brown wrote:
I'd expect that most general purpose programs written by Norwegians
use an English interface, even if it isn't really expected that the
program will find an audience beyond some users in Norway. Except of
course for programs which in some way are about Norway.
Why?
Generally programmers are educated people and educated people use
English for serious purposes.
Not always of course and Norway might be
an exception. But I'd expect that in a Norwegian university, for
example, it would be forbidden to document a program in Norwegian or to
use non-English words for identifiers. And probably the same in a large
Norwegian company. I might be wrong about that and I have never visited
Norway or worked for a Norwegian employer (and obviously I couldn't do
so unless the policy I expect was followed).
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
On 2024-01-29, David Brown <david.brown@hesbynett.no> wrote:
On 29/01/2024 19:32, Malcolm McLean wrote:
On 27/01/2024 11:36, Janis Papanagnou wrote:
On 27.01.2024 12:02, Malcolm McLean wrote:
On 27/01/2024 10:34, Janis Papanagnou wrote:
On 27.01.2024 04:05, Malcolm McLean wrote:
In many languages, including C, there's a difference between functions
that return a value and functions that don't, in that
In some languages, yes.
if (realloc(ptr, 0))
is allowed
whilst
if (free(ptr))
struct S { int a, b; };
struct S foo(void);
foo() returns a value, but "if (foo())" is not allowed.
C does not make much difference between functions that return a value,
and those that don't. The key distinction is whether the "return"
statement must have an expression or must not have an expression.
Don't forget that we can have:
struct S s = foo();
not to mention
struct S bar(void) { return foo(); }
as well as:
extern void bar(struct S);
bar(foo());
none of which patterns is possible if foo returns void.
A void return is qualitatively different. A function which returns
a value can plausibly belong into the functional domain. A function
which returns void is necessarily an imperative procedure.
Even if it does nothing, a void foo() function is a procedure in that
it cannot be planted into a functional expression like bar(foo()).
So we can identify an emergent category there.
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans
printf ("%s\n", "My\0string");
which won't work as some may expect (you will only see "My").
If you exclude obvious cases like
phones, tablets, and smart TVs, there are many more embedded Linux
systems that are not Android, than embedded Android systems. Those are
all POSIX too. And yet they are all part of the 0%.
... I don't think
my use of standard output is all that untypical. It's unacceptable for anything released to customers and is used mainly for debugging.
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed through systems designed to handle
human-readable text. For instance in some systems designed to receive
ASCII text, there is no distinction between the nul byte and "waiting
for next data byte". Obviously this will cause difficulties if the data
is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
On 29/01/2024 20:10, David Brown wrote:
Sure. But Malcolm suggested that the "if" pattern was a special
distinguishing feature. (He has already made it clear that the only
types of interest, in his world, are integers, floats and strings.)
No, Malcolm gave if() as an example of a distinction in a language
grammar between functions that return a value and functions that don't.
Sometimes functions return a value and if() still isn't allowed. Fair
enough point. But it doesn't really detract from the point that Malcolm
is making.
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed through systems designed to handle
human-readable text. For instance in some systems designed to receive
ASCII text, there is no distinction between the nul byte and "waiting
for next data byte". Obviously this will cause difficulties if the data
is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
I must admit that it's nothing I have ever done or considered doing.
However standard output is designed for text and not binary output.
Whilst there is a "printf()" which operates on standard output by
default, there are no functions which write binary data to standard
output by default, for example. Though of course you can pass stdout to
the regular binary output functions like fwrite().
So I'm obviously not the only person to take the view that passing
binary data to standard output is a rather odd thing to do.
I suspect the truth is that it is a bad design and I am right, but
because for some reason communications have to be via standard output,
people make the best of it and contrive that it shall work, and then
forget that essentially it is a misuse of a text stream. They are
slightly proud of their efforts and intolerant of my point.
That you couldn't actually mount a defence of your position whilst I
could also strongly implies that I am right.
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed through systems designed to handle
human-readable text. For instance in some systems designed to receive
ASCII text, there is no distinction between the nul byte and "waiting
for next data byte". Obviously this will cause difficulties if the data
is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
I must admit that it's nothing I have ever done or considered doing.
However standard output is designed for text and not binary output.
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
I must admit that it's nothing I have ever done or considered doing.
However standard output is designed for text and not binary output.
That was never the case. stdout is an unformatted stream of bytes
associated by default with file descriptor number one in the
application.
Long before Windows was even a gleam in Gates' eye.
On 30/01/2024 15:00, Scott Lurndal wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
I must admit that it's nothing I have ever done or considered doing.
However standard output is designed for text and not binary output.
That was never the case. stdout is an unformatted stream of bytes
associated by default with file descriptor number one in the
application.
It is a stream of bytes at the level that the file descriptor is used to
generate a write event for a byte which can be arbitrary. But standard
output is often quickly transformed into a stream of characters.
Sometimes within the application executable.
those who use the less common, often more expensive Unix systems.
On 30/01/2024 15:00, Scott Lurndal wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
I must admit that it's nothing I have ever done or considered doing.
However standard output is designed for text and not binary output.
That was never the case. stdout is an unformatted stream of bytes
associated by default with file descriptor number one in the
application.
Long before Windows was even a gleam in Gates' eye.
I don't know what Windows has to do with it.
The difference between text and binary byte streams is something
invented by C, so that conversions could be done for byte '\n' on
systems with alternate line-endings.
On 30/01/2024 13:03, David Brown wrote:
On 30/01/2024 10:13, Malcolm McLean wrote:
There is no standard C library function that takes stderr as the default
stream. Does that mean stderr was not designed to be used at all?
"printf" exists and works the way it does because it is convenient and
useful. It can be viewed as a short-cut for "fprintf(stdout, ...".
Indeed, that is /exactly/ how the C standard describes the function.
That means the C standards acknowledge that people often want to print
out formatted text (which in no way implies plain ASCII) to stdout. This
does not mean they expect this to be the /only/ use of stdout, or that
people will not use binary outputs to stdout, any more than it implies
that text output will always be sent to stdout and not other streams or
files.
Special facilities for text don't necessarily mean that text is the only
output intended to be used, fair enough.
printf has no binary data format specifier.
The fact that there is no
similar function for standard error...
Similarly there is no
function "write" that passes binary data to standard output by default.
bart <bc@freeuk.com> writes:
On 30/01/2024 15:00, Scott Lurndal wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
I must admit that it's nothing I have ever done or considered doing.
However standard output is designed for text and not binary output.
That was never the case. stdout is an unformatted stream of bytes
associated by default with file descriptor number one in the
application.
Long before Windows was even a gleam in Gates' eye.
I don't know what Windows has to do with it.
The difference between text and binary byte streams is something
invented by C, so that conversions could be done for byte '\n' on
systems with alternate line-endings.
No, it was invented to support Windows CRLF line endings.
Regardless of your digression, stdout is still an unformatted
stream of bytes. Any structure on that stream is imposed
by the -consumer- of those bytes.
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 30/01/2024 13:03, David Brown wrote:
On 30/01/2024 10:13, Malcolm McLean wrote:
There is no standard C library function that takes stderr as the default
stream. Does that mean stderr was not designed to be used at all?
"printf" exists and works the way it does because it is convenient and
useful. It can be viewed as a short-cut for "fprintf(stdout, ...".
Indeed, that is /exactly/ how the C standard describes the function.
That means the C standards acknowledge that people often want to print
out formatted text (which in no way implies plain ASCII) to stdout. This
does not mean they expect this to be the /only/ use of stdout, or that
people will not use binary outputs to stdout, any more than it implies
that text output will always be sent to stdout and not other streams or
files.
Special facilities for text don't necessarily mean that text is the only
output intended to be used, fair enough.
Even text is just an unformatted stream of bytes. It is the ultimate
consumer of that text that imposes structure on it (e.g. by treating
it as ASCII, UTF-16, UTF-8, UTF-32, EBCDIC, et cetera, et alia, und so
weiter).
printf has no binary data format specifier.
%s? Simply copies non-nul bytes. That's almost as binary as
one can get; it certainly isn't restricted to printable characters.
And of course, there are putc and putchar.
Not to mention using printf where the format string argument
includes binary data.
<snip>
The fact that there is no
similar function for standard error
fprintf(stderr, "%s", binary_data_with_no_embedded_nul_bytes);
. Similarly there is no
function "write" that passes binary data to standard output by default.
In the real world, and in the world C was created to support,
there are several functions (write, pwrite, mmap, lio_listio, aio_read,
aio_write et cetera, et alia, und so weiter).
Most of which existed before 1989.
On 30/01/2024 13:03, David Brown wrote:
On 30/01/2024 10:13, Malcolm McLean wrote:
Special facilities for text don't necessarily mean that text is the only
output intended to be used, fair enough.
There is no standard C library function that takes stderr as the
default stream. Does that mean stderr was not designed to be used at
all?
"printf" exists and works the way it does because it is convenient and
useful. It can be viewed as a short-cut for "fprintf(stdout, ...".
Indeed, that is /exactly/ how the C standard describes the function.
That means the C standards acknowledge that people often want to print
out formatted text (which in no way implies plain ASCII) to stdout.
This does not mean they expect this to be the /only/ use of stdout, or
that people will not use binary outputs to stdout, any more than it
implies that text output will always be sent to stdout and not other
streams or files.
printf has no binary data format specifier.
And as you say, the fact
that it is provided is an acknowledgement that programmers often want to
pass formatted text to standard output.
The fact that there is no
similar function for standard error suggests that wanting to pass
formatted text to error is a less common requirement.
Which is my experience for the sort of programming that I do. Similarly
there is no function "write" that passes binary data to standard output
by default.
So this suggests that passing binary data to standard output is a less
common requirement. And in fact on many systems standard output will
corrupt such data in its default mode.
So these three things together - no binary data format specifer for
printf(), no binary equivalent function to printf that defaults to
standard output, and the fact that standard output will corrupt binary
data in default mode on some systems, adds up to a pretty powerful
argument for my position.
Do learn to think. I've given coherent, reasonable, justifications that are open to dispute on their own terms.
If I claim grass is pink, and I know this because it is the same
colour as the sea which is also pink, then I have given a
justification and a defence of my position. That doesn't mean it is
worth the pixels it is written with, or that anyone needs to elaborate
when they say I am talking nonsense.
It is so blindingly clear and obvious that stdout is regularly used
for non-text data, and so many undeniably accurate and common examples
have been given, that your position is entirely untenable.
That you are capable of inventing an
incoherent argument on a different topic proves nothing even by analogy except, to be fair, that it is plausible that people will make bad
arguments.
And apart from "that's how you have to do it to make a web server work
under Unix", I haven't seen much of anything in this sub thread which constitutes a good argument for passing binary data to standard output.
On 30/01/2024 15:54, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 30/01/2024 15:00, Scott Lurndal wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
I must admit that it's nothing I have ever done or considered doing.
However standard output is designed for text and not binary output.
That was never the case. stdout is an unformatted stream of bytes
associated by default with file descriptor number one in the
application.
Long before Windows was even a gleam in Gates' eye.
I don't know what Windows has to do with it.
The difference between text and binary byte streams is something
invented by C, so that conversions could be done for byte '\n' on
systems with alternate line-endings.
No, it was invented to support Windows CRLF line endings.
You just want to have a go at Windows don't you?
I was using CRLF line-endings in the 1970s; they weren't an invention of
Windows, which didn't exist until the mid-80s and didn't become popular
until the mid-90s.
So, how did C deal with CRLF in all those non-Windows settings?
Regardless of your digression, stdout is still an unformatted
stream of bytes. Any structure on that stream is imposed
by the -consumer- of those bytes.
Of course. But it still a bad idea to write actual output that you KNOW
does not represent text, to a consumer that will expect text.
For example to a terminal window, which can happen if you forget to
redirect it.
On 30/01/2024 16:06, Scott Lurndal wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
Special facilities for text don't necessarily mean that text is the only
output intended to be used, fair enough.
Even text is just an unformatted stream of bytes. It is the ultimate
consumer of that text that imposes structure on it (e.g. by treating
it as ASCII, UTF-16, UTF-8, UTF-32, EBCDIC, et cetera, et alia, und so
weiter).
No. If we know that text is ASCII it is not highly structured.
On 30/01/2024 16:49, Malcolm McLean wrote:
The fact that there is no
similar function for standard error suggests that wanting to pass
formatted text to error is a less common requirement.
stderr is a newer invention than stdout and stdin.
On 30/01/2024 16:25, David Brown wrote:
On 30/01/2024 16:49, Malcolm McLean wrote:
[stderr less used than stdout]
Which is my experience for the sort of programming that I do.
Similarly there is no function "write" that passes binary data to
standard output by default.
What would that gain? One fewer parameter to fwrite() ?
Yes. printf() could easily have been omitted and fprintf() only
provided.
You just want to have a go at Windows don't you?
I was using CRLF line-endings in the 1970s; they weren't an invention of
Windows, which didn't exist until the mid-80s and didn't become
popular until the mid-90s.
CRLF line endings were the invention of printers or teletype machines.
It took time to move print heads from the end of one line to the
beginning of the next, and separating the "carriage return" and "line
feed" commands made timings easier. It also let printer implementers
handle the two operations independently - occasionally people would want
to do one but not the other.
The use of CRLF as a standard for line endings in files was, I believe,
from CP/M - which came after Unix and Multics, which had standardised on
LF line endings. (Most OS's before that made things up as they chose, rather than being "standard", or used record-based files, punched cards, etc.)
So CRLF precedes Windows quite significantly.
(I have no idea why Macs picked CR - perhaps they just wanted to be different.)
So, how did C deal with CRLF in all those non-Windows settings?
The difference between "text" and "binary" streams in C is, in practice,
up to the implementation. That can be the implementation of the C
library, or the OS functions (or DLLs/SOs) that the C library calls. The
norm is that you use "\n" for line endings in the C code - what happens
after that, for text streams, is beyond C.
The reason C distinguishes between text and binary streams is that some
OS's distinguish between them.
Regardless of your digression, stdout is still an unformatted
stream of bytes. Any structure on that stream is imposed
by the -consumer- of those bytes.
Of course. But it still a bad idea to write actual output that you
KNOW does not represent text, to a consumer that will expect text.
That's just a specific example of "it's a bad idea for a program to
behave in a way that a reasonable user would not expect". Which is, of course, true - but not a big surprise.
For example to a terminal window, which can happen if you forget to
redirect it.
If the program can reasonably be expected to generate binary output,
then it is the user's fault if they do this accidentally. Examples
shown in this thread include cat and zcat - that's what these programs do.
Sometimes people make mistakes, and try to "cat" (or "type") non-text files. Mistakes happen.
On 30/01/2024 16:49, David Brown wrote:
Sometimes people make mistakes, and try to "cat" (or "type") non-text
files. Mistakes happen.
If you routinely write pure binary data to stdout, then users are going
to see garbage a lot more often.
On 29/01/2024 21:00, Keith Thompson wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 16:18, David Brown wrote:
On 28/01/2024 20:49, Malcolm McLean wrote:
On 28/01/2024 18:24, David Brown wrote:
I'd expect that most general purpose programs written by Norwegians
use an English interface, even if it isn't really expected that the
program will find an audience beyond some users in Norway. Except
of course for programs which in some way are about Norway.
Why?
Generally programmers are educated people and educated people use
English for serious purposes. Not always of course and Norway might be
an exception. But I'd expect that in a Norwegian university, for
example, it would be forbidden to document a program in Norwegian or
to use non-English words for identifiers. And probably the same in a
large Norwegian company. I might be wrong about that and I have never
visited Norway or worked for a Norwegian employer (and obviously I
couldn't do so unless the policy I expect was followed).
You assert that "educated people use English for serious purposes".
I don't have the experience to refute that claim, but I suspect
it's arrogant nonsense. I could be wrong, of course.
Everything which is at all intellectually serious is these days written
in English. It's the new Latin. It's the language all educated people
use to communicate with each other when discussing scientific,
philosophical, or scholarly matters. And also technical matters to a
large extent.
There are a few exceptions but very few. I remember a discussion about whether you could get away with organising a scientific conference in
French, in France, and the conclusion was that you could not. Even in
France. However the French are very reluctant to concede, which is why
the discussion took place at all.
If a large Norwegian company allows programmers to document software in Norwegian, then it cannot employ non-Norwegian programmers to work on
it. So I would imagine that this would be forbidden. But I've never
actually worked for a Norwegian company and I don't actually know. David Brown, to be fair, does work for a Norwegian company so he might know
better. But he asks "why?" and I gave the reason.
David Brown <david.brown@hesbynett.no> writes:
On 30/01/2024 16:49, Malcolm McLean wrote:
The fact that there is no
similar function for standard error suggests that wanting to pass
formatted text to error is a less common requirement.
stderr is a newer invention than stdout and stdin.
c'est what?
C plays fast and loose with the char type. But you can't pass embedded
nuls. These are so common in binary data that in practice you can't
use %s for binary data at all.
On 30/01/2024 16:49, David Brown wrote:
Elsethread [David Brown]
If the program can reasonably be expected to generate binary output,
then it is the user's fault if they do this accidentally. Examples
shown in this thread include cat and zcat - that's what these programs
do.
Sometimes people make mistakes, and try to "cat" (or "type") non-text
files. Mistakes happen.
I wonder if there is any *nix program older or simpler than "cat" - a program that simply passes its input files or the stdin to stdout.
On 30/01/2024 16:49, David Brown wrote:
You just want to have a go at Windows don't you?
I was using CRLF line-endings in the 1970s; they weren't an invention of
Windows, which didn't exist until the mid-80s and didn't become
popular until the mid-90s.
CRLF line endings were the invention of printers or teletype machines.
It took time to move print heads from the end of one line to the
beginning of the next, and separating the "carriage return" and "line
feed" commands made timings easier. It also let printer implementers
handle the two operations independently - occasionally people would
want to do one but not the other.
The use of CRLF as a standard for line endings in files was, I
believe, from CP/M - which came after Unix and Multics, which had
standardised on LF line endings. (Most OS's before that made things
up as they chose, rather than being "standard", or used record-based
files, punched cards, etc.)
So CRLF precedes Windows quite significantly.
(I have no idea why Macs picked CR - perhaps they just wanted to be
different.)
So, how did C deal with CRLF in all those non-Windows settings?
The difference between "text" and "binary" streams in C is, in
practice, up to the implementation. That can be the implementation of
the C library, or the OS functions (or DLLs/SOs) that the C library
calls. The norm is that you use "\n" for line endings in the C code -
what happens after that, for text streams, is beyond C.
The reason C distinguishes between text and binary streams is that
some OS's distinguish between them.
Regardless of your digression, stdout is still an unformatted
stream of bytes. Any structure on that stream is imposed
by the -consumer- of those bytes.
Of course. But it still a bad idea to write actual output that you
KNOW does not represent text, to a consumer that will expect text.
That's just a specific example of "it's a bad idea for a program to
behave in a way that a reasonable user would not expect". Which is,
of course, true - but not a big surprise.
For example to a terminal window, which can happen if you forget to
redirect it.
If the program can reasonably be expected to generate binary output,
then it is the user's fault if they do this accidentally. Examples
shown in this thread include cat and zcat - that's what these programs
do.
Sometimes people make mistakes, and try to "cat" (or "type") non-text
files. Mistakes happen.
If you routinely write pure binary data to stdout, then users are going
to see garbage a lot more often.
I gave an example earlier where displaying a binary file with 'type' was
better-behaved than with 'cat', since 'type' stops at the first 1A byte.
I used this in my binary formats by adding 1A after the signature, so
that if you attempted to type the file out, it wouldn't go mad. Here's
another example:
c:\sc>type tree.scd
SCD
(.scd is a binary file containing CAD drawing data.)
If I again do that with 'cat' under WSL, it goes even crazier. It
starts to try to interpret some of the output as commands (with what
program, I don't know), with lots of bell sounds, and I can't get back
to the WSL prompt.
It's just very, very sloppy.
On 30/01/2024 17:55, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 30/01/2024 16:49, Malcolm McLean wrote:
The fact that there is no
similar function for standard error suggests that wanting to pass
formatted text to error is a less common requirement.
stderr is a newer invention than stdout and stdin.
c'est what?
According to Wikipedia (it's not infallible, but it knows better than me here) :
"""
Standard error was added to Unix in the 1970s after several wasted phototypesetting runs ended with error messages being typeset instead of displayed on the user's terminal.[4]
"""
<https://web.archive.org/web/20200925010614/https://minnie.tuhs.org/pipermail/tuhs/2013-December/006113.html>
"""
One of the most amusing and unexpected consequences of phototypesetting
was the Unix standard error file (!). After phototypesetting, you had to take a long wide strip of paper and feed it carefully into a smelly, icky machine which eventually (several minutes later) spat out the paper with
the printing visible.
One afternoon several of us had the same experience -- typesetting
something, feeding the paper through the developer, only to find a single, beautifully typeset line: "cannot open file foobar" The grumbles were loud enough and in the presence of the right people, and a couple of days later the standard error file was born...
"""
stdout and stdin were apparently available in FORTRAN in the 1950's.
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 30/01/2024 16:25, David Brown wrote:
On 30/01/2024 16:49, Malcolm McLean wrote:[stderr less used than stdout]
Which is my experience for the sort of programming that I do.
Yes. printf() could easily have been omitted and fprintf() only
provided.
Similarly there is no function "write" that passes binary data to
standard output by default.
What would that gain? One fewer parameter to fwrite()?
IIRC, printf() existed even before fprintf was invented and
it was used by a whole lot of code when the C standardization
efforts began.
The use of CRLF as a standard for line endings in files was, I believe,
from CP/M ...
VMS (now OpenVMS) was also a significant system at the time, and it had
some rather complex file formats ...
Linus Torvalds's native language is Finnish, for example.
Most systems run Windows where the model of piping from standard output
to standard input of the next program is much less used than in Unix,
this is true. That sometimes generates a feeling of superiority amongst
those who use the less common, often more expensive Unix systems. It's
very silly, but that's how people think.
However standard output is designed for text and not binary output.
Whilst there is a "printf()" which operates on standard output by
default, there are no functions which write binary data to standard
output by default, for example.
stdout and stdin were apparently available in FORTRAN in the 1950's.
Nobody uses printf to output binary data.
Mixing binary data with formatted text data is very unlikely to be
useful.
On Tue, 30 Jan 2024 19:39:24 +0000, Richard Harnden wrote:
Nobody uses printf to output binary data.
Do terminal-control escape sequences count?
On Tue, 30 Jan 2024 21:04:56 +0000, Malcolm McLean wrote:
Linus Torvalds's native language is Finnish, for example.
No, it would be Swedish. He’s an ethnic Swede, from Finland.
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed through systems designed to handle
human-readable text. For instance in some systems designed to receive
ASCII text, there is no distinction between the nul byte and "waiting
for next data byte". Obviously this will cause difficulties if the data
is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
I must admit that it's nothing I have ever done or considered doing.
[...]
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it
might have unusual effects if passed through systems designed to
handle human-readable text. For instance in some systems
designed to receive ASCII text, there is no distinction between
the nul byte and "waiting for next data byte". Obviously this
will cause difficulties if the data is binary.
Also many binary formats can't easily be extended, so you can
pass one image and that's all. While it is possible to devise a
text format which is similar, in practice text formats usually
have enough redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and
harder to extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
I must admit that it's nothing I have ever done or considered doing.
[...]
Simple example (disclaimer: not tested):
ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
(mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)
Of the five main programs in this command, four are using
standard out to send binary data:
tar -cf - .
gzip -c
ssh foo [...]
gunzip -c
The tar -xf - at the end reads binary data on standard in
but doesn't output any (or anything else for that matter).
It is FAR more cumbersome to accomplish what this command
is doing without sending binary data through standard out.
Anyone who doesn't understand this doesn't understand Unix.
On 31/01/2024 07:04, Lawrence D'Oliveiro wrote:
On Tue, 30 Jan 2024 21:04:56 +0000, Malcolm McLean wrote:
Linus Torvalds's native language is Finnish, for example.
No, it would be Swedish. He’s an ethnic Swede, from Finland.
He is Finnish, but has Swedish as his mother tongue (like about 5% of
Finns). Speaking Swedish as your main language does not make you
ethnically Swedish.
As a university-educated Finn, brought up in
Helsinki, he will also speak Finnish quite fluently.
On 1/30/24 11:49, David Brown wrote:
...
If the program can reasonably be expected to generate binary output,
then it is the user's fault if they do this accidentally. Examples
shown in this thread include cat and zcat - that's what these programs do.
? There's no problem using cat to concatenate binary files. I've used
'split' to split binary files into smaller pieces, and then used 'cat'
to recombine them, and it worked fine. I don't remember why, but I had
to transfer the files from one place to another by a method that imposed
an upper limit on the size of individual files.
On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:
VMS (now OpenVMS) was also a significant system at the time, and it had
some rather complex file formats ...
[...]
Apparently Linus Torvalds used VMS for a while, and hated it.
On 31.01.2024 07:02, Lawrence D'Oliveiro wrote:
On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:
VMS (now OpenVMS) was also a significant system at the time, and
it had some rather complex file formats ...
[...]
Apparently Linus Torvalds used VMS for a while, and hated it.
I don't understand the intention of this comment.
VMS and Torvalds are completely different eras.
And were is the relation?
(Or just meant as anecdotal trivia?)
Janis
On 30/01/2024 19:22, bart wrote:
[...]
If you have a program like that, then it probably makes sense to have a
flag to say "output the data to stdout" and the default being writing to
a file.
On Wed, 31 Jan 2024 12:43:04 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 31.01.2024 07:02, Lawrence D'Oliveiro wrote:
On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:
VMS (now OpenVMS) was also a significant system at the time, and
it had some rather complex file formats ...
[...]
Apparently Linus Torvalds used VMS for a while, and hated it.
I don't understand the intention of this comment.
VMS and Torvalds are completely different eras.
And were is the relation?
Linus is older than you probably realize.
He entered the University of
Helsinki in 1988. Back then VMS was only slightly behind its peak of popularity.
By value, likely still bigger than all Unixen combined.
On 31/01/2024 10:43, Michael S wrote:
Frankly, the Unix redirection racket looks like something hacked
together rather than designed as the result of a solid thinking process.
As long as there were only standard input and output it was sort of
logical. But when they figured out that it is insufficient, they had
chosen a quick hack instead of constructing a solution that wouldn't
offend engineering senses of any non-preconditioned observer.
It was designed for very memory constrained systems which handled text
on a line by line basis. So one line of a long file would be processed
and passed down the pipeline, and you wouldn't need temporary disk files
or large amounts of memory. I'm sure it worked quite well for that.
Linus Torvalds's native language is Finnish, for example. But git was
released in English. There might be Finnish language bindings for it
now, but I'm pretty sure not in the original version. Similarly Bjarne
Stroustrup is Swedish, but C++ uses keywords like "class" and "friend",
not the Swedish terms.
On 31.01.2024 12:58, Michael S wrote:
On Wed, 31 Jan 2024 12:43:04 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 31.01.2024 07:02, Lawrence D'Oliveiro wrote:
On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:
VMS (now OpenVMS) was also a significant system at the time, and
it had some rather complex file formats ...
[...]
Apparently Linus Torvalds used VMS for a while, and hated it.
I don't understand the intention of this comment.
VMS and Torvalds are completely different eras.
And were is the relation?
Linus is older than you probably realize.
Why do you think that I'd be thinking that?
I know that he's quite some years younger than I am. So what?
He entered the University of
Helsinki in 1988. Back then VMS was only slightly behind its peak of popularity.
What? - I'm not sure where you're coming from.
I associate DEC's VMS with the old DEC VAX-11 system, both
from around the mid-1970's.
I programmed on a DEC's
VAX with VMS obviously before Linus Torvalds started his
studies. And that was at a time when the DEC VAX and VMS
were replaced at our sites by Unix systems.
By value, likely still bigger than all Unixen combined.
Not sure what (to me strange sounding) ideas you have here.
Janis
On 30/01/2024 20:06, Keith Thompson wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 21:00, Keith Thompson wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 16:18, David Brown wrote:
On 28/01/2024 20:49, Malcolm McLean wrote:
On 28/01/2024 18:24, David Brown wrote:
I'd expect that most general purpose programs written by Norwegians
use an English interface, even if it isn't really expected that the
program will find an audience beyond some users in Norway. Except
of course for programs which in some way are about Norway.
Why?
Generally programmers are educated people and educated people use
English for serious purposes. Not always of course, and Norway might be
an exception. But I'd expect that in a Norwegian university, for
example, it would be forbidden to document a program in Norwegian or
to use non-English words for identifiers. And probably the same in a
large Norwegian company. I might be wrong about that and I have never
visited Norway or worked for a Norwegian employer (and obviously I
couldn't do so unless the policy I expect was followed).
You assert that "educated people use English for serious purposes".
I don't have the experience to refute that claim, but I suspect
it's arrogant nonsense. I could be wrong, of course.
Things tend to trickle down.
Everything which is at all intellectually serious is these days
written in English. It's the new Latin. It's the language all educated
people use to communicate with each other when discussing scientific,
philosophical, or scholarly matters. And also technical matters to a
large extent.
Even if that's true, your assertion was about user interfaces.
Are you under the impression that mobile phones show their messages only
in English because all their users are scholars?
Teenagers from non-English speaking countries hum along to pop songs in English.
And whilst programmers aren't usually scholars, if they are C
programmers they will use a programming language with keywords based on English.
And you will likely get quite a bit of English coming through the mobile phone.
Linus Torvalds's native language is Finnish, for example.
But git was
released in English.
There might be Finnish language bindings for it
now, but I'm pretty sure not in the original version. Similarly Bjarne Strousup is Swedish, but C++ uses keywords like "class" and "friend",
not the Swedish terms.
Now I think that would rub off on Norwegian programmers. It would be surprising if it did not.
Are you aware of the existence of medical devices (my current $DAYJOB)
that can be configured to display messages in any of a number of
languages?
Some software is internationalised. But it takes quite a lot of
resources to translate software. With medical device software, the
software is likely so expensive to develop anyway because of all the
safety critical portions that the cost is tolerable.
Our software has
purely English user interfaces. It was something we looked at, but it
would have been expensive and made the code base harder to manage, and
the users said that the benefit to them was marginal as they spoke
enough English to understand a few simple GUI labels. I think our
experience is more typical, but some people will no doubt make out that
it is narrow and parochial.
[...]
Correct, you don't actually know. Why doesn't that prevent you from
making assertions rather than asking questions, so that you can learn
something from people who know more than you do?
I'm a qualified scientist, amongst other things.
In science, the things
that you know are usually either quite basic and covered in the first
degree, or they are not terribly interesting.
What matters is what you
don't actually know, but believe to be the case, based on sound evidence
and reasoning.
And I believe it to be the case that English is used very
widely in Norway. And in fact, if David Brown, who is in a position to
know, asserts this not to be the case, I'd put it down to his
contentious nature and tend to dismiss it. Now of course I could have
misled myself. But I doubt it.
On Tue, 30 Jan 2024 17:25:31 +0100, David Brown wrote:
Mixing binary data with formatted text data is very unlikely to be
useful.
PDF does exactly that. To the point where the spec suggests putting some random unprintable bytes up front, to distract format sniffers from
thinking they’re looking at a text file.
On Tue, 30 Jan 2024 15:21:01 +0000, Malcolm McLean wrote:
Most systems run Windows where the model of piping from standard
output to standard input of the next program is much less used than
in Unix, this is true. That sometimes generates a feeling of
superiority amongst those who use the less common, often more
expensive Unix systems. It's very silly, but that's how people
think.
Also we can do select/poll on pipes on *nix systems, you can’t on
Windows.
On Wed, 31 Jan 2024 08:59:22 +0100
David Brown <david.brown@hesbynett.no> wrote:
On 31/01/2024 07:04, Lawrence D'Oliveiro wrote:
On Tue, 30 Jan 2024 21:04:56 +0000, Malcolm McLean wrote:
Linus Torvalds's native language is Finnish, for example.
No, it would be Swedish. He’s an ethnic Swede, from Finland.
He is Finnish, but has Swedish as his mother tongue (like about 5% of
Finns). Speaking Swedish as your main language does not make you
ethnically Swedish.
Linus has Swedish as his mother tongue.
Linus has a Swedish family name. Or at least Scandinavian; to me it
sounds more Danish than Swedish, but I am not an expert. It certainly
does not sound Finnish.
When Linus was younger, he used to like to tell stereotypical jokes
about Finns.
When it quacks like a duck...
As a university-educated Finn, brought up in
Helsinki, he will also speak Finnish quite fluently.
That's true.
But it is correct that English has become the main language for
international communication, and is therefore critical for anything that involves cross-border communication, or where there are significant
numbers of foreign workers. That includes academic work. Different
parts of Europe previously used German or Russian for this,
[...]
Nobody uses printf to output binary data. fwrite(3) would be common, as
would write(2).
Maybe you could use printf("%c%c%c" ... but it'd be beyond tedious.
On 30.01.2024 20:46, David Brown wrote:
On 30/01/2024 19:22, bart wrote:
[...]
If you have a program like that, then it probably makes sense to have a
flag to say "output the data to stdout" and the default being writing to
a file.
Did you mean to say "output the data to the terminal" here?
(I noticed
that a lot of the posts here have a misconception about what 'stdout'
is; they seem to use it synonymously to "terminal" or "screen/display".
But you are not guaranteed that stdout will produce screen output; it
depends on the environment. Being more accurate with the distinction
might help prevent misconceptions if replying to these people.)
More to the point: I don't think this would be a good design as you've
written it. A default would imply the necessity of some fixed name (or
naming schema) - think of the disputable "a.out" default.
On 1/30/24 11:49, David Brown wrote:
...
If the program can reasonably be expected to generate binary output,
then it is the user's fault if they do this accidentally. Examples
shown in this thread include cat and zcat - that's what these programs do.
? There's no problem using cat to concatenate binary files. I've used
'split' to split binary files into smaller pieces, and then used 'cat'
to recombine them, and it worked fine. I don't remember why, but I had
to transfer the files from one place to another by a method that imposed
an upper limit on the size of individual files.
I would expect that the majority of uses of "cat" are with just one
file,
but certainly it is useful when you want to combine files in
different ways.
On 31/01/2024 12:22, Janis Papanagnou wrote:
On 30.01.2024 20:46, David Brown wrote:
On 30/01/2024 19:22, bart wrote:
[...]
If you have a program like that, then it probably makes sense to have a
flag to say "output the data to stdout" and the default being writing to
a file.
Did you mean to say "output the data to the terminal" here?
No.
I said that if you have a program that sometimes gives binary output on stdout, and sometimes gives text messages,
and this leads people to have
a significant chance of accidentally dumping binary output to their
terminal, [...]
On 31/01/2024 07:18, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed through systems designed to handle
human-readable text. For instance in some systems designed to receive
ASCII text, there is no distinction between the nul byte and "waiting
for next data byte". Obviously this will cause difficulties if the data
is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
I must admit that it's nothing I have ever done or considered doing.
[...]
Simple example (disclaimer: not tested):
ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
(mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)
Of the five main programs in this command, four are using
standard out to send binary data:
tar -cf - .
gzip -c
ssh foo [...]
gunzip -c
The tar -xf - at the end reads binary data on standard in
but doesn't output any (or anything else for that matter).
It is FAR more cumbersome to accomplish what this command
is doing without sending binary data through standard out.
Anyone who doesn't understand this doesn't understand Unix.
Yes. I don't do that sort of thing.
Whilst I have used Unix, it is as a platform for interactive programs
which work on graphics, or as a general C compilation environment. I
don't build pipelines to do that sort of data processing. If I had to
download a tar file I'd either use a graphical tool or type several
commands into the shell, each launching a single executable,
interactively.
The reason is that I'd only run the command once, and it's so likely
that there will be either a syntax misunderstanding or a typing error
that I'd have to test to ensure that it was right. And by the time
you've done that any time saved by typing only one commandline is lost.
Of course if you are writing scripts then that doesn't apply. But now
it's effectively a programming language, and, from the example code, a
very poorly designed one which is cryptic and fussy and liable to be
hard to maintain. So it's better to use a language like Perl to achieve
the same thing, and I did have a few Perl scripts handy for repetitive
jobs of that nature in my Unix days.
You admit this with "not tested". Says it all. "Understanding Unix" is
an intellectually useless achievement. You might have to do it if you
have to use the system and debug and trouble shoot. But it's nothing to
be proud about.
On 31/01/2024 10:43, Michael S wrote:
On Tue, 30 Jan 2024 23:18:21 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it
might have unusual effects if passed through systems designed to
handle human-readable text. For instance in some systems
designed to receive ASCII text, there is no distinction between
the nul byte and "waiting for next data byte". Obviously this
will cause difficulties if the data is binary.
Also many binary formats can't easily be extended, so you can
pass one image and that's all. While it is possible to devise a
text format which is similar, in practice text formats usually
have enough redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and
harder to extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
I must admit that it's nothing I have ever done or considered
doing.
[...]
Simple example (disclaimer: not tested):
ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
(mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)
Of the five main programs in this command, four are using
standard out to send binary data:
tar -cf - .
gzip -c
ssh foo [...]
gunzip -c
The tar -xf - at the end reads binary data on standard in
but doesn't output any (or anything else for that matter).
It is FAR more cumbersome to accomplish what this command
is doing without sending binary data through standard out.
If I am not mistaken, tar, gzip and gunzip do not write binary data
to standard output by default. They should be specifically told to
do so. For ssh I don't know. Anyway, ssh is not a "normal" program
so it's not surprising that the textuality of ssh's output is the same
as the textuality of the command it carries.
Anyone who doesn't understand this doesn't understand Unix.
Frankly, the Unix redirection racket looks like something hacked
together rather than designed as the result of a solid thinking
process. As long as there were only standard input and output it
was sort of logical. But when they figured out that it is
insufficient, they had chosen a quick hack instead of constructing
a solution that wouldn't offend the engineering senses of any
non-preconditioned observer.
It was designed for very memory constrained systems which handled
text on a line by line basis. So one line of a long file would be
processed and passed down the pipeline, and you wouldn't need
temporary disk files or large amounts of memory. I'm sure it worked
quite well for that.
On Tue, 30 Jan 2024 17:49:57 +0100, David Brown wrote:
The use of CRLF as a standard for line endings in files was, I believe,
from CP/M ...
Which I think copied it from DEC minicomputer systems.
Fun fact: on some of those DEC systems (which I used when they were
still being made), you could end a line with CR-LF, or LF-CR-NUL.
What was the NUL for? Padding. Why did it need padding? (This was before
CRT terminals.)
On 31.01.2024 12:58, Michael S wrote:
On Wed, 31 Jan 2024 12:43:04 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 31.01.2024 07:02, Lawrence D'Oliveiro wrote:
On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:
VMS (now OpenVMS) was also a significant system at the time, and
it had some rather complex file formats ...
[...]
Apparently Linus Torvalds used VMS for a while, and hated it.
I don't understand the intention of this comment.
VMS and Torvalds are completely different eras.
And were is the relation?
Linus is older than you probably realize.
Why do you think that I'd be thinking that?
I know that he's quite some years younger than I am. So what?
He entered the University of
Helsinki in 1988. Back then VMS was only slightly behind its peak of
popularity.
What? - I'm not sure where you're coming from.
I associate DEC's VMS with the old DEC VAX-11 system, both
from around the mid of the 1970's.
On Tue, 30 Jan 2024 19:39:24 +0000, Richard Harnden wrote:
Nobody uses printf to output binary data.
Do terminal-control escape sequences count?
On Tue, 30 Jan 2024 11:50:35 -0800, Keith Thompson wrote:
VMS (now OpenVMS) was also a significant system at the time, and it had
some rather complex file formats ...
It relied on extra file metadata called "record attributes" in order to
make sense of the file format. It was quite common to transfer files
from other systems, and have them not be readable until you had set
appropriate record attributes on them. Picky, picky, I know.
On 31/01/2024 07:07, Lawrence D'Oliveiro wrote:
On Tue, 30 Jan 2024 17:25:31 +0100, David Brown wrote:
Mixing binary data with formatted text data is very unlikely to be
useful.
PDF does exactly that. To the point where the spec suggests putting some
random unprintable bytes up front, to distract format sniffers from
thinking they’re looking at a text file.
PDF files start with the "magic" indicator "%PDF", which is enough for
many programs to identify them correctly. And they are usually
compressed so that the content text is not directly readable or
identifiable as strings. If they are not compressed, then yes, there
can be text mixed in with everything else. But I would not call that
"mixing binary data and formatted text" - I would just say that some of
the binary data happens to be strings. It's the same as elf files
containing copies of strings from the program, or identifiers for
external linking.
However, I learned a new trick when checking that I was not mistaken
about this - it turns out that "less file.pdf" gives a nice text-only
output from the pdf file (by passing it through "lesspipe"). There's
always something new to learn from inane conversations on Usenet :-)
On 31/01/2024 09:36, Malcolm McLean wrote:
The reason is that I'd only run the command once, and it's so likely
that there will be either a syntax misunderstanding or a typing error
that I'd have to test to ensure that it was right. And by the time
you've done that any time saved by typing only one commandline is lost.
Of course if you are writing scripts then that doesn't apply. But now
it's effectively a programming language, and, from the example code, a
very poorly designed one which is cryptic and fussy and liable to be
hard to maintain. So it's better to use a language like Perl to achieve
the same thing, and I did have a few Perl scripts handy for repetitive
jobs of that nature in my Unix days.
That gave me a laugh! You think bash is cryptic, fussy and poorly
designed, and choose /Perl/ as the alternative :-)
You admit this with "not tested". Says it all. "Understanding Unix" is
an intellectually useless achievement. You might have to do it if you
have to use the system and debug and trouble shoot. But it's nothing to
be proud about.
It is "useless" for people who don't use it. For people who /do/ use
it, is very useful.
I've used sequences like Tim's - it's a way to copy data remotely from a
different machine. I would likely write it slightly differently - I'd
probably do the mkdir and cd first, thus avoiding the need for a
subshell, and I'd use "ssh -C" or "tar -z" to do the compression rather
than "gzip".
There's no doubt that the learning curve is longer for doing this sort
of thing from the command line than using gui programs. There is also
no doubt that when you are used to it, command line utilities and a good
shell are very flexible and efficient.
Learn to use the tools that are conveniently available, and then pick
the right tool for the job - whether it is command line or gui.
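David's "tar -z" variant can be sketched locally like this (directory and file names are invented for the example; in the remote case the first tar's output would be piped through ssh instead of a plain pipe). Note that "-z" and "-C" are widespread GNU/BSD tar extensions rather than strict POSIX:

```shell
# Local sketch: tar does the compression itself, no separate gzip needed.
# Remotely this would be: tar -C srcdir -czf - . | ssh host 'tar -C dstdir -xzf -'
mkdir -p srcdir dstdir
echo "hello" > srcdir/file.txt
tar -C srcdir -czf - . | tar -C dstdir -xzf -   # compress, pipe, decompress
cat dstdir/file.txt                             # prints "hello"
```

The same structure works with "ssh -C" doing the compression at the transport layer instead.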
On Tue, 30 Jan 2024 23:18:21 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Anyone who doesn't understand this doesn't understand Unix.
Frankly, the Unix redirection racket looks like something hacked
together rather than designed as the result of a solid thinking
process.
As long as there were only standard input and output it was sort of
logical. But when they figured out that it was insufficient, they
chose a quick hack instead of constructing a solution that wouldn't
offend the engineering senses of any non-preconditioned observer.
On 31/01/2024 09:36, Malcolm McLean wrote:
[ I snipped a couple of "I actually don't know/need it" things ]
But now it's effectively a programming language, and, from the example
code, a very poorly designed one which is cryptic and fussy and liable
to be hard to maintain. So it's better to use a language like Perl to
achieve the same thing, and I did have a few Perl scripts handy for
repetitive jobs of that nature in my Unix days.
That gave me a laugh! You think bash is cryptic, fussy and poorly
designed, and choose /Perl/ as the alternative :-)
There's no doubt that the learning curve is longer for doing this sort
of thing from the command line than using gui programs. There is also
no doubt that when you are used to it, command line utilities and a
good shell are very flexible and efficient.
Learn to use the tools that are conveniently available, and then pick
the right tool for the job - whether it is command line or gui.
On 30/01/2024 22:16, James Kuyper wrote:
On 1/30/24 11:49, David Brown wrote:
...
If the program can reasonably be expected to generate binary output,
then it is the user's fault if they do this accidentally. Examples
shown in this thread include cat and zcat - that's what these programs
do.
There's no problem using cat to concatenate binary files. I've used
'split' to split binary files into smaller pieces, and then used 'cat'
to recombine them, and it worked fine. I don't remember why, but I had
to transfer the files from one place to another by a method that imposed
an upper limit on the size of individual files.
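James's split-and-recombine round trip can be sketched like this (the sizes and file names are arbitrary):

```shell
# Split a binary file into 16 KiB pieces, then restore it with cat.
dd if=/dev/urandom of=big.bin bs=1k count=100 2>/dev/null
split -b 16k big.bin piece.       # produces piece.aa, piece.ab, ...
cat piece.* > rejoined.bin        # the glob sorts; cat just concatenates bytes
cmp big.bin rejoined.bin && echo "identical"
```

Because cat treats its input as an opaque byte stream, the binary content survives the round trip untouched.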
I think there's a misunderstanding here - I gave "cat" as an example of
a program that /can/ be expected to produce binary output. (It can
also produce text output - you get what you put in.) So it is the
user's fault if they type "cat /bin/cat" and are surprised by a mess
in their terminal.
Michael S <already5chosen@yahoo.com> writes:
[...]
You mean like
exec 3< /path/to/input/file
read -u3 line_from_input file
How does that offend your engineering senses?
Michael S <already5chosen@yahoo.com> writes:
On Tue, 30 Jan 2024 23:18:21 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Anyone who doesn't understand this doesn't understand Unix.
Frankly, the Unix redirection racket looks like something hacked
together rather than designed as the result of a solid thinking
process.
It seems you don't understand Unix.
As long as there were only standard input and output it was sort of
logical. But when they figured out that it was insufficient, they
chose a quick hack instead of constructing a solution that wouldn't
offend the engineering senses of any non-preconditioned observer.
You mean like
exec 3< /path/to/input/file
read -u3 line_from_input file
How does that offend your engineering senses?
On Wed, 31 Jan 2024 15:25:00 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:
You mean like
exec 3< /path/to/input/file
read -u3 line_from_input file
How does that offend your engineering senses?
That was not in the 2-3 books that I had read. I can't say that I
understand what is going on, what environment we are in, and whether
what you show is generic or specific to 'exec' and 'read'.
On Wed, 31 Jan 2024 12:15:23 +0000
Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
It was designed for very memory constrained systems which handled
text on a line by line basis. So one line of a long file would be
processed and passed down the pipeline, and you wouldn't need
temporary disk files or large amounts of memory. I'm sure it worked
quite well for that.
A concept of pipes is fine. I was not talking about that side.
My objection is with each program having exactly 1 special input and
exactly 2 special outputs. Instead of having, say, up to 5 of each,
fully interchangeable with the first of the five being special only in
that it is a default and as such allows for shorter syntax in the
shell.
On 31.01.2024 15:21, David Brown wrote:
On 31/01/2024 09:36, Malcolm McLean wrote:
[ I snipped a couple of "I actually don't know/need it" things ]
But now it's effectively a programming language, and, from the example
code, a very poorly designed one which is cryptic and fussy and liable
to be hard to maintain. So it's better to use a language like Perl to
achieve the same thing, and I did have a few Perl scripts handy for
repetitive jobs of that nature in my Unix days.
That gave me a laugh! You think bash is cryptic, fussy and poorly
designed, and choose /Perl/ as the alternative :-)
I don't think it's that clear a joke. The Unix shell is extremely
error prone to program, and you should not let a newbie write shell
programs without careful supervision.
Michael S <already5chosen@yahoo.com> writes:
On Wed, 31 Jan 2024 12:15:23 +0000
Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
It was designed for very memory constrained systems which handled
text on a line by line basis. So one line of a long file would be
processed and passed down the pipeline, and you wouldn't need
temporary disk files or large amounts of memory. I'm sure it worked
quite well for that.
A concept of pipes is fine. I was not talking about that side.
My objection is with each program having exactly 1 special input and
exactly 2 special outputs. Instead of having, say, up to 5 of each,
fully interchangeable with the first of the five being special only
in that it is a default and as such allows for shorter syntax
in the shell.
Each program has 1024 (on my system - it's configurable on a
per-process basis) fully interchangeable "inputs" and "outputs" (also
known as files).
$ application 5> /tmp/file5
will redirect file descriptor five to the specified file.
There's nothing special about stdin, stdout or stderr other than
that they are tags applied to the first three file descriptors.
There is a convention that the first file descriptor
is used for input, the second for output and the third
for diagnostic output. But it's just a convention.
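Scott's point can be demonstrated in a couple of lines: the shell opens descriptor 5 with '5>', and the command writes to it exactly as it would to stdout or stderr (the file name here is arbitrary):

```shell
# Nothing special about fd 5: the shell opens it, the commands inherit it.
{ echo "to fd 5" >&5; echo "to stdout"; } 5> /tmp/fd5.out
cat /tmp/fd5.out    # prints "to fd 5"
```

The brace group runs with descriptor 5 already connected to /tmp/fd5.out, just as descriptors 0-2 are pre-connected by convention.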
On Wed, 31 Jan 2024 15:25:00 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:
You mean like
exec 3< /path/to/input/file
read -u3 line_from_input file
How does that offend your engineering senses?
That was not in the 2-3 books that I had read. I can't say that I
understand what is going on, what environment we are in, and whether
what you show is generic or specific to 'exec' and 'read'.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 31.01.2024 15:21, David Brown wrote:
On 31/01/2024 09:36, Malcolm McLean wrote:
[ I snipped a couple of "I actually don't know/need it" things ]
But now it's effectively a programming language, and, from the example
code, a very poorly designed one which is cryptic and fussy and liable
to be hard to maintain. So it's better to use a language like Perl to
achieve the same thing, and I did have a few Perl scripts handy for
repetitive jobs of that nature in my Unix days.
That gave me a laugh! You think bash is cryptic, fussy and poorly
designed, and choose /Perl/ as the alternative :-)
I don't think it's that clear a joke. The Unix shell is extremely
error prone to program, and you should not let a newbie write shell
programs without careful supervision.
Nonsense.
I associate DEC's VMS with the old DEC VAX-11 system, both
from around the mid of the 1970's. I programmed on a DEC's
VAX with VMS obviously before Linus Torvalds started his
studies. And that was at a time when the DEC VAX and VMS
were replaced at our sites by Unix systems.
The big advantage of non-GUI is for process automation. With
GUI-oriented applications you can mainly only do what they provide
interactively (= slow and cumbersome). Rarely do GUI applications
support a scripting interface, and if so it's typically some
proprietary non-standard language.
On 31.01.2024 16:42, Michael S wrote:
On Wed, 31 Jan 2024 15:25:00 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:
You mean like
exec 3< /path/to/input/file
read -u3 line_from_input file
How does that offend your engineering senses?
That was not in the 2-3 books that I had read. I can't say that I
understand what is going on, what environment we are in, and whether
what you show is generic or specific to 'exec' and 'read'.
'-u' is obviously an option of read. Various shells support it; ksh
has for at least 30 years, and bash meanwhile as well. But it's not in
POSIX.
Other redirections are standard, and these should certainly be known
by anyone who has taken a course or read any book on the Unix shell.
The syntax is not difficult, follows rules, and is certainly not
arbitrary.
The one in the above code is assigning file descriptor 3 to the given
file for reading. You can let an FD point to the channel another one
is pointing to, as in the "well known" '2>&1' (where stderr is
connected to the same channel that stdout currently points to).
Similarly, in the above example you can use the standard form
'read line_from_input_file <&3', which may certainly appear more
cryptic than the option '-u3', but it's essential to any shell
programmer.
Janis
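The fd-3 redirection being discussed can be sketched in a few lines of POSIX shell (the file name is made up for the example):

```shell
# Open a file on descriptor 3, read from it line by line, then close it.
printf 'alpha\nbeta\n' > /tmp/in3
exec 3< /tmp/in3      # fd 3 now reads from /tmp/in3
read line <&3         # portable equivalent of ksh/bash "read -u3 line"
echo "first: $line"
read line <&3
echo "second: $line"
exec 3<&-             # close descriptor 3 again
```

Because the descriptor stays open across commands, successive reads continue where the previous one stopped - which is the whole point of 'exec 3<' over plain '< file' on each command.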
On Wed, 31 Jan 2024 16:25:30 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
The big advantage of non-GUI is for process automation. With
GUI-oriented applications you can mainly only do what they provide
interactively (= slow and cumbersome). Rarely do GUI applications
support a scripting interface, and if so it's typically some
proprietary non-standard language.
I'd take almost any proprietary non-standard GUI macro language over
non-proprietary non-standard tcl. They say Lua is better. I never had
the motivation to look at it more closely.
My objection is with each program having exactly 1 special input and
exactly 2 special outputs. Instead of having, say, up to 5 of each,
fully interchangeable with the first of the five being special only in
that it is a default and as such allows for shorter syntax in the
shell.
On Wed, 31 Jan 2024 17:05:23 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
The books were talking about the Bourne shell and the C shell. They
acknowledged the existence of ksh, but didn't go into details. I don't
remember if bash was mentioned at all.
Of course, in practice in this century I used bash almost exclusively,
but never learned it formally, by book, from start to finish.
The same as over 90% of bash users, I'd guess.
I did understand '3<' by association with '2>' that was in the book,
but more importantly, is something I use regularly.
However I had never seen '3<' in the books.
On Wed, 31 Jan 2024 15:49:10 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:
Michael S <already5chosen@yahoo.com> writes:
On Wed, 31 Jan 2024 12:15:23 +0000
Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
It was designed for very memory constrained systems which handled
text on a line by line basis. So one line of a long file would be
processed and passed down the pipeline, and you wouldn't need
temporary disk files or large amounts of memory. I'm sure it worked
quite well for that.
A concept of pipes is fine. I was not talking about that side.
My objection is with each program having exactly 1 special input and
exactly 2 special outputs. Instead of having, say, up to 5 of each,
fully interchangeable with the first of the five being special only
in that it is a default and as such allows for shorter syntax
in the shell.
Each program has 1024 (on my system - it's configurable on a
per-process basis) fully interchangeable "inputs" and "outputs" (also
known as files).
$ application 5> /tmp/file5
will redirect file descriptor five to the specified file.
There's nothing special about stdin, stdout or stderr other than
that they are tags applied to the first three file descriptors.
There is a convention that the first file descriptor
is used for input, the second for output and the third
for diagnostic output. But it's just a convention.
I don't understand.
Are not descriptors 0, 1 and 2 special in that they are already
open (I don't know if by the OS or by the shell) when the program
starts, while the rest of them, if ever used, have to be opened by
the program code?
On only remotely related note, what happens on your system when you
want more than 1024 files to be open by one program simultaneously?
scott@slp53.sl.home (Scott Lurndal) writes:
[...]
Quick and dirty editor:
$ cat > /tmp/file < /dev/tty
line1
line2
line3
^D
$
$ cat /tmp/file
line1
line2
line3
$
You probably don't need the "< /dev/tty".
On 31.01.2024 16:58, Scott Lurndal wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 31.01.2024 15:21, David Brown wrote:
On 31/01/2024 09:36, Malcolm McLean wrote:
[ I snipped a couple of "I actually don't know/need it" things ]
But now it's effectively a programming language, and, from the example >>>>> code, a very poorly designed one which is cryptic and fussy and liable >>>>> to be hard to maintain. So it's better to use a language like Perl to >>>>> achieve the same thing, and I did have a few Perl scripts handy for
repetitive jobs of that nature in my Unix days.
That gave me a laugh! You think bash is cryptic, fussy and poorly
designed, and choose /Perl/ as the alternative :-)
I don't think it's that clear a joke. The Unix shell is extremely
error prone to program, and you should not let a newbie write shell
programs without careful supervision.
Nonsense.
Not the least. - I'm not sure about your background in shell.
On 1/31/24 07:35, Janis Papanagnou wrote:
...
I associate DEC's VMS with the old DEC VAX-11 system, both
from around the mid of the 1970's. I programmed on a DEC's
VAX with VMS obviously before Linus Torvalds started his
studies. And that was at a time when the DEC VAX and VMS
were replaced at our sites by Unix systems.
OK - so it's that association you've got wrong. I know VMS was still
going strong around 1990 when I was introduced to it. It might have been
in decline at the time, but it was very far from being gone.
scott@slp53.sl.home (Scott Lurndal) writes:
Michael S <already5chosen@yahoo.com> writes:[...]
On only remotely related note, what happens on your system when you
want more than 1024 files to be open by one program simultaneously?
$ ulimit -f 2048
Will increase the limit, to any arbitrary value, subject to system
wide limits configured by the superuser (system manager).
That sets the limit for file size. I think you mean "ulimit -n 2048".
On 31.01.2024 17:18, Michael S wrote:
On Wed, 31 Jan 2024 17:05:23 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
(See for example 'man ksh' Section "Input/Output". But careful; ksh
has additional non-standard additions. So a peek into the POSIX docs
might serve you better.)
[ DEC's VMS ]
Released in 1977.
Reached the peak of popularity in the mid 1980s, when DEC decided to
use VAX not just as a mini/super-mini, but also as a competitor to
mainframes, effectively killing their earlier mainframe line
(PDP-6/10/20).
[...]
By value, likely still bigger than all Unixen combined.
Not sure what (to me strange sounding) ideas you have here.
I can say the same.
I'm not sure why you mentioned 5, whether that's better or worse.
There's naturally some limit on OS level on the number of parallel
open file descriptors, but that limit is very high. Mind that you
can always close unused ones.
Janis
David Brown <david.brown@hesbynett.no> writes:
[...]
However, I learned a new trick when checking that I was not mistaken
about this - it turns out that "less file.pdf" gives a nice text-only
output from the pdf file (by passing it through "lesspipe"). There's
always something new to learn from inane conversations on Usenet :-)
It doesn't necessarily do this by default. See the documentation for
details (which are of course off-topic here).
On 31.01.2024 14:10, Michael S wrote:
[ DEC's VMS ]
Released in 1977.
Reached the peak of popularity in mid 1980s, when DEC decided to use
VAX not just as mini/super-mini, but also as competitor to
mainframes, effectively killing their earlier mainframe line
(PDP-6/10/20).
This is interesting. These days all major players here switched to
Unix systems (in our context specifically AIX and HP-UX), exactly
to exchange the huge sports halls full of mainframe computers to
just a small room full of Unix servers.
[...]
By value, likely still bigger than all Unixen combined.
Not sure what (to me strange sounding) ideas you have here.
I can say the same.
Sure, so let me expand. The "By value" was what made me doubt. The
"values" (Real Money) I experienced in the legacy mainframe areas were
in the financial sector (banks and insurance companies); these were
not DECs here, and they were hard to replace. - I know that every
couple of years they made their business cases about how they could
get rid of the mainframes, to no avail. (Don't know how it evolved the
past 20 years, though.) And later all the ISP computing power went
into Linux plants, where the money was made. I never observed that
DEC/VMS was of any importance "by value". If it had some value by
means I'm not aware of, I take your word for it.
Janis
On Wed, 31 Jan 2024 16:25:30 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
The big advantage of non-GUI is for process automation. With
GUI-oriented applications you can mainly only do what they provide
interactively (= slow and cumbersome). Rarely do GUI applications
support a scripting interface, and if so it's typically some
proprietary non-standard language.
I'd take almost any proprietary non-standard GUI macro language over
non-proprietary non-standard tcl. They say Lua is better. I never had
the motivation to look at it more closely.
On 31.01.2024 15:09, David Brown wrote:
I would expect that the majority of uses of "cat" are with just one
file,
And of course just because of ignorance; the majority of (but not all)
uses with just one file are UUOCs.
but certainly it is useful when you want to combine files in
different ways.
I don't know of any concatenations in "different" ways, but of course
there's some more of the other usages that are supported by options.
On 31.01.2024 15:21, David Brown wrote:
On 31/01/2024 09:36, Malcolm McLean wrote:
[ I snipped a couple of "I actually don't know/need it" things ]
But now it's effectively a programming language, and, from the example
code, a very poorly designed one which is cryptic and fussy and liable
to be hard to maintain. So it's better to use a language like Perl to
achieve the same thing, and I did have a few Perl scripts handy for
repetitive jobs of that nature in my Unix days.
That gave me a laugh! You think bash is cryptic, fussy and poorly
designed, and choose /Perl/ as the alternative :-)
I don't think it's that clear a joke. The Unix shell is extremely
error prone to program, and you should not let a newbie write shell
programs without careful supervision. ("newbie" [in shell context]
= less than 10 years of practical experience. - Am I exaggerating?
Maybe. But not much.)
David Brown <david.brown@hesbynett.no> writes:
On 31/01/2024 09:36, Malcolm McLean wrote:
The reason is that I'd only run the command once, and it's so likely
that there will be either a syntax misunderstanding or a typing error
that I'd have to test to ensure that it was right. And by the time
you've done that any time saved by typing only one commandline is lost.
Of course if you are writing scripts then that doesn't apply. But now
it's effectively a programming language, and, from the example code, a
very poorly designed one which is cryptic and fussy and liable to be
hard to maintain. So it's better to use a language like Perl to achieve
the same thing, and I did have a few Perl scripts handy for repetitive
jobs of that nature in my Unix days.
That gave me a laugh! You think bash is cryptic, fussy and poorly
designed, and choose /Perl/ as the alternative :-)
You admit this with "not tested". Says it all. "Understanding Unix" is
an intellectually useless achievement. You might have to do it if you
have to use the system and debug and trouble shoot. But it's nothing to
be proud about.
It is "useless" for people who don't use it. For people who /do/ use
it, is very useful.
I've used sequences like Tim's - it's a way to copy data remotely from a
different machine. I would likely write it slightly differently - I'd
probably do the mkdir and cd first, thus avoiding the need for a
subshell, and I'd use "ssh -C" or "tar -z" to do the compression rather
than "gzip".
There's no doubt that the learning curve is longer for doing this sort
of thing from the command line than using gui programs. There is also
no doubt that when you are used to it, command line utilities and a good
shell are very flexible and efficient.
Learn to use the tools that are conveniently available, and then pick
the right tool for the job - whether it is command line or gui.
And there are often more than one tool for the job. e.g. rsync(1)
for copying data remotely.
It's just the numbers of file descriptors and whether it's an input
'<' or output '>' channel, or even a read/write channel '<>'.
<     input
>     output
<>    in/out
<<    here-doc
>>    append
On 31.01.2024 17:26, Michael S wrote:
On Wed, 31 Jan 2024 16:25:30 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
The big advantage of non-GUI is for process automation. With
GUI-oriented applications you can mainly only do what they provide
interactively (= slow and cumbersome). Rarely do GUI applications
support a scripting interface, and if so it's typically some
proprietary non-standard language.
I'd take almost any proprietary non-standard GUI macro language over
non-proprietary non-standard tcl. They say Lua is better. I never had
the motivation to look at it more closely.
I don't recall ever having used tcl (maybe once, long ago?),
and I never stumbled across an application (GUI or otherwise)
where I needed scripting and it would have provided Lua. Thus
I cannot help you here; I don't know it either.
All I can say is that the Unix shell was a reliable companion
wherever we had to automate tasks on Unix systems or on Cygwin
enhanced Windows.
On Wed, 31 Jan 2024 17:29:30 +0100
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
I'm not sure why you mentioned 5, whether that's better or worse.
There's naturally some limit on OS level on the number of parallel
open file descriptors, but that limit is very high. Mind that you
can always close unused ones.
Five of each sort, i.e. five inputs and five outputs, sound good to me.
Much more than five of each sort pre-opened by shell sound like too
much. If there exists a need for more than five channels for
communication between a complex of programs, then this complex of
programs very likely was designed to work together and only together.
And then any intervention by the user into the communication between
them will likely do more harm than good.
Of course, I fully expect that usefully using more than three
predefined channels of any particular direction would be very rare, but
I still like five, or at least four, better than three.
As to not using predefined directions and instead just providing a
pool of up to 10 pre-opened descriptors, this idea didn't cross my
mind in those particular five minutes that I was writing my initial
(yes provocative, yes intentionally so) post. Right now I don't want
to think about whether I like it or not, because I see no good reason
to think deeply about this particular water under the bridge.
On 30.01.2024 20:39, Richard Harnden wrote:
Nobody uses printf to output binary data. fwrite(3) would be common, as
would write(2).
Right. I'm using the OS'es write(2), but also printf with ANSI escapes,
e.g. sprintf (buf, "\033[%d;%dH", ...
Maybe you could use printf("%c%c%c" ... but it'd be beyond tedious.
Since I recall to have used it in some thread I want to clarify that
it was just meant as an example countering an incorrect argument of
"not being able to output binary data on stdout", or some such.
Janis
Terminal control sequences (almost always based on VT100 these days)
are typically not printable, but tend to avoid null characters, which
means you can very probably use printf to print them (assuming you're
on a POSIX-like system).
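For example, a VT100 cursor-positioning sequence contains only ESC plus printable characters, so printf handles it without trouble. Here the bytes are piped through od instead of a terminal so they can be inspected:

```shell
# A VT100 "move cursor to row 13, column 17" sequence: ESC [ 13 ; 17 H.
# Everything except the leading ESC (octal 033) is printable text.
printf '\033[%d;%dH' 13 17 | od -An -c
```

No null bytes are involved, which is why printf (whose %s and format handling stop at NUL) is safe for these sequences.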
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed though systems designed to handle
human-readable text.
For instance in some systems designed to receive ASCII text, there is
no distinction between the nul byte and "waiting for next data byte".
Obviously this will cause difficulties if the data is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
I must admit that it's nothing I have ever done or considered doing.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
However standard output is designed for text and not binary output.
That you couldn't actually mount a defence of your position whilst I
could also strongly implies that I am right.
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
I think they're poorly designed too.
From the POV of interactive console programs, they /are/ poor.
But the mistake is thinking that they are actual programs or commands,
when really they are just filters. They are not designed to be
standalone commands.
Even 'cat', if I type it by itself, just sits there.
(I wonder what use it has in a sequence like ... | cat | ...; what
does it add to the data?)
AFAICS, this stuff mainly works inside scripts. Or do people here spend
all day manually piping stuff between programs?
As for alternatives, I don't know.
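bart's parenthetical question has a short answer: a mid-pipeline cat adds nothing to the data - it is a pure pass-through (it does add an extra process and pipe buffer). A quick check:

```shell
# cat in the middle of a pipeline leaves the byte stream unmodified.
printf 'a\nb\nc\n' | cat | wc -l
```

The count is the same with or without the interposed cat, which is exactly the "useless use of cat" pattern mentioned elsewhere in the thread.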
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed though systems designed to handle
human-readable text.
Maybe you are not used to a system where it's trivial to inspect such
data. When "some_prog" produces data that are not compatible with the
current terminal settings, "some_prog | hd" shows a hex dump instead.
The need to do this does not make "some_prog" poorly designed. It may
simply mean that the output is /intended/ for further processing.
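The "pipe it into a hex dump" trick can be sketched with od, which unlike hd is specified by POSIX ('hd' is commonly an alias for 'hexdump -C' where available); the byte values here are arbitrary stand-ins for some program's binary output:

```shell
# Inspect "binary" output safely: dump it as hex instead of letting it
# hit the terminal. The input simulates a program emitting raw bytes.
printf '\000\001ABC' | od -An -tx1
```

The nul and control bytes that would confuse a terminal are rendered as harmless hex digits.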
For instance in some systems designed to receive ASCII text, there is
no distinction between the nul byte and "waiting for next data byte".
Obviously this will cause difficulties if the data is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
I must admit that it's nothing I have ever done or considered doing.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
However standard output is designed for text and not binary output.
What is your evidence? stdout was just designed for output (as far as
I can tell) and, anyway, what is the distinction you are making
between binary and text? "iconv -f ASCII -t EBCDIC-UK" will produce
something that is "logically" text on stdout, but it might look like
binary to you.
An example where it's really useful not to care: I have a suite of
tools for doing toy cryptanalysis. Some apply various transformations
and/or filters to byte streams and others collect and output (on
stderr) various statistics. Plugging them together in various
pipelines is very handy when investigating an encrypted text. The
output is almost always "binary" in the sense that there would be no
point in looking at it on a terminal.
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
On 31/01/2024 16:33, Keith Thompson wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 31/01/2024 07:18, Tim Rentsch wrote:
[...]
It is FAR more cumbersome to accomplish what this command
is doing without sending binary data through standard out.
Anyone who doesn't understand this doesn't understand Unix.
Yes. I don't do that sort of thing. There's kind of an implication
that it's something I ought to be doing.
You don't. Others do. What was your point again?
bart <bc@freeuk.com> writes:
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
I think they're poorly designed too.
Of course you do. They're not bart programs.
From the POV of interactive console programs, they /are/ poor.
You don't provide any reason why - do elucidate!
AFAICS, this stuff mainly works inside scripts. Or do people here spend
all day manually piping stuff between programs?
Yes and Yes.
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Terminal control sequences (almost always based on VT100 these days)
are typically not printable, but tend to avoid null characters, which
means you can very probably use printf to print them (assuming you're
on a POSIX-like system).
They use text. For instance, a cursor position is both accepted and
reported in a decimal format like 13;17. All the commands and
delimiting characters are textual, except for part of the CSI (control
sequence introducer). The 7 bit CSI uses two characters, ESC and [.
Except for that one ESC, everything is printable.
I'd describe "printable except for ESC" as binary. And some sequences
use other non-printable characters like ASCII BEL (Ctrl-G) (perhaps
not VT100 standard, but for example commands to change fonts and
colors for xterm).
On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Terminal control sequences (almost always based on VT100 these days)
are typically not printable, but tend to avoid null characters, which
means you can very probably use printf to print them (assuming you're
on a POSIX-like system).
They use text. For instance, a cursor position is both accepted and
reported in a decimal format like 13;17. All the commands and
delimiting characters are textual, except for part of the CSI (control
sequence introducer). The 7 bit CSI uses two characters, ESC and [.
Except for that one ESC, everything is printable.
I'd describe "printable except for ESC" as binary. And some sequences
use other non-printable characters like ASCII BEL (Ctl-G) (perhaps not
VT100 standard, but for example commands to change fonts and colors for
xterm).
But ESC is related text; it's a character described in ASCII used for
signaling in the middle of text, which is what it's doing here.
On 31/01/2024 13:47, Janis Papanagnou wrote:
On 30.01.2024 20:39, Richard Harnden wrote:
Nobody uses printf to output binary data. fwrite(3) would be common, as
would write(2).
Right. I'm using the OS'es write(2), but also printf with ANSI escapes,
e.g. sprintf (buf, "\033[%d;%dH", ...
I meant 'binary' as in has \0s
It seems to work fine with ESC's and utf8 (and I abuse it thus often)
... but, from what James said, that is not actually guaranteed.
Richard Harnden <richard.nospam@gmail.invalid> writes:
On 31/01/2024 13:47, Janis Papanagnou wrote:
On 30.01.2024 20:39, Richard Harnden wrote:
Nobody uses printf to output binary data. fwrite(3) would be common, as
would write(2).
Right. I'm using the OS'es write(2), but also printf with ANSI escapes,
e.g. sprintf (buf, "\033[%d;%dH", ...
I meant 'binary' as in has \0s
I don't think that's what "binary" means.
David Brown <david.brown@hesbynett.no> writes:
On 31/01/2024 15:46, Janis Papanagnou wrote:
On 31.01.2024 15:09, David Brown wrote:
I would expect that the majority of uses of "cat" are with just one
file,
And of course just because of ignorance; the majority of (but not all)
uses with just one file are UUOCs.
I regularly see it as more symmetrical and clearer to push data left
to right. So I might write "cat infile | grep foo | sort > outfile".
Of course I could use "<" redirection, but somehow it seems more
natural to me to have this flow. I'll use "<" for simpler cases.
But perhaps this is just my habit, and makes little sense to other people.
You can also use:
< infile grep foo | sort > outfile
Redirections don't have to be written after a command.
On 31.01.2024 14:05, David Brown wrote:
But it is correct that English has become the main language for
international communication, and is therefore critical for anything that
involves cross-border communication, or where there are significant
numbers of foreign workers. That includes academic work. Different
parts of Europe previously used German or Russian for this,
Don't forget the importance of French! - The whole postal and telecommunication sectors were (and probably still are) massively
influenced by France.
(You're always writing so much text, so I'll skip it and avoid
more comments.)
Just two (unrelated) notes concerning statements I've seen
somewhere in the thread (maybe here as well)...
First; the EU publishes in all languages of the member states,
for example. (There's no single lingua franca.)
And the second note; we have to distinguish the language of the
programming language's keywords, the comments in the source
code, and the language(s) used for user-interaction.
I don't know whether there's some native language that use
non-English keywords, but I'd suppose so, since in the past
I've seen some using preprocessors for a "native language"
source code. So while not typical, probably a demand at some
places. (Elsethread I mentioned the German TR440 commands,
but a [primitive] command language, as opposed to, say, the
Unix shell, I don't consider much as a language.)
The comments' languages varies, in my experience. Sometimes
there's coding standards (that demand the native language, or
that demand English), sometimes it's not defined. Myself I'm
reluctant to switch between languages and stay with English.
But there were also other cases with longer descriptions on
a conceptual basis; if you come from a native language's
perspective it can be better to stay with the language of the
specification instead of introducing sources of misunderstanding.
The user interface, finally, is of course as specified, and can
be anything, or even multi-lingual.
On 31/01/2024 23:36, Ben Bacarisse wrote:
Maybe you are not used to a system where it's trivial to inspect such
data. When "some_prog" produces data that are not compatible with the
current terminal settings, "some_prog | hd" shows a hex dump instead.
The need to do this does not make "some_prog" poorly designed. It may
simply mean that the output is /intended/ for further processing.
Well almost by definition binary output is intended for further
processing. Binary audio files must ultimately be converted to analogue
if anyone is to listen to them, for example.
I had to check how to do a hex dump on the system I'm typing this on.
The name of the hex dumper is xxd instead of hd, but otherwise it works
the same way and will accept piped data. But the fact I had to look it
up tells you that I've never actually used it.
The two problems with hex
dumps are that you've got to do mental arithmetic to convert 8 bit hex
values into 16 or 32 bit fields,
and that once you get a variable length
field, it's virtually impossible to keep track of and match up the
following fields.
So in reality what I do when troubleshooting binary
data is to write a scratch program, or, more often because the trouble
is in the existing parser, put diagnostics in an existing parser to print
out a few fields and inspect them that way.
Of course to check that
audio or image data is right you have to listen to it or view it - you
can't tell from looking at the individual samples.
On Tue, 30 Jan 2024 23:18:21 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Simple example (disclaimer: not tested):
ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
(mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)
Of the five main programs in this command, four are using
standard out to send binary data:
tar -cf - .
gzip -c
ssh foo [...]
gunzip -c
The tar -xf - at the end reads binary data on standard in
but doesn't output any (or anything else for that matter).
It is FAR more cumbersome to accomplish what this command
is doing without sending binary data through standard out.
If I am not mistaken, tar, gzip and gunzip do not write binary
data to standard output by default. [...]
Anyone who doesn't understand this doesn't understand Unix.
Frankly, Unix redirection racket looks like something hacked
together rather than designed as result of the solid thinking
process. As long as there were only standard input and output it
was sort of logical. But when they figured out that it is
insufficient, they had chosen a quick hack instead of
constructing a solution that wouldn't offend engineering senses
of any non-preconditioned observer.
On 31/01/2024 07:18, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed through systems designed to handle
human-readable text. For instance in some systems designed to receive
ASCII text, there is no distinction between the nul byte and "waiting
for next data byte". Obviously this will cause difficulties if the data
is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
I must admit that it's nothing I have ever done or considered doing.
[...]
Simple example (disclaimer: not tested):
ssh foo 'cd blah ; tar -cf - . | gzip -c' | \
(mkdir foo.blah ; cd foo.blah ; gunzip -c | tar -xf -)
Of the five main programs in this command, four are using
standard out to send binary data:
tar -cf - .
gzip -c
ssh foo [...]
gunzip -c
The tar -xf - at the end reads binary data on standard in
but doesn't output any (or anything else for that matter).
It is FAR more cumbersome to accomplish what this command
is doing without sending binary data through standard out.
Anyone who doesn't understand this doesn't understand Unix.
Yes. I don't do that sort of thing.
While I have used Unix, it is as a platform for interactive programs
which work on graphics, or a general C compilation environment. I
don't build pipelines to do that sort of data processing. If I had to
download a tar file I'd either use a graphical tool or type several
commands into the shell, each launching a single executable,
interactively.
The reason is that I'd only run the command once, and it's so likely
that there will be either a syntax misunderstanding or a typing error
that I'd have to test to ensure that it was right. And by the time
you've done that any time saved by typing only one commandline is
lost. Of course if you are writing scripts then that doesn't
apply. But now it's effectively a programming language, and, from the example code, a very poorly designed one which is cryptic and fussy
and liable to be hard to maintain. So it's better to use a language
like Perl to achieve the same thing, and I did have a few Perl scripts
handy for repetitive jobs of that nature in my Unix days.
You admit this with "not tested". Says it all. "Understanding Unix" is
an intellectually useless achievement. You might have to do it if you
have to use the system and debug and troubleshoot. But it's nothing
to be proud about.
On 01/02/2024 00:47, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
I think they're poorly designed too.
Of course you do. They're not bart programs.
From the POV of interactive console programs, they /are/ poor.
You don't provide any reason why - do elucidate!
They only do one thing, like you can't first do A, then B. They don't
give any prompts. They often apparently do nothing (so you can't tell if
they're busy, waiting for input, or hanging). There is no dialog.
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
I'd write a monolithic program.
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed through systems designed to handle
human-readable text.
Maybe you are not used to a system where it's trivial to inspect such
data. When "some_prog" produces data that are not compatible with the
current terminal settings, "some_prog | hd" shows a hex dump instead.
The need to do this does not make "some_prog" poorly designed. It may
simply mean that the output is /intended/ for further processing.
For instance in some systems designed to receive
ASCII text, there is no distinction between the nul byte and "waiting
for next data byte". Obviously this will cause difficulties if the data
is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
I must admit that it's nothing I have ever done or considered doing.
However standard output is designed for text and not binary output.
What is your evidence? stdout was just designed for output (as far as I
can tell) and, anyway, what is the distinction you are making between
binary and text? iconv --from ASCII --to EBCDIC-UK will produce
something that is "logically" text on stdout, but it might look like
binary to you.
An example where it's really useful not to care: I have a suite of tools
for doing toy cryptanalysis. Some apply various transformations and/or
filters to byte streams and others collect and output (on stderr)
various statistics. Plugging them together in various pipelines is very
handy when investigating an encrypted text. The output is almost always
"binary" in the sense that there would be no point in looking at it on a
terminal.
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
Load the encrypted text into memory, and then pass it to subroutines to
do the various analyses.
On 01/02/2024 00:47, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
I think they're poorly designed too.
Of course you do. They're not bart programs.
From the POV of interactive console programs, they /are/ poor.
You don't provide any reason why - do elucidate!
They only do one thing, like you can't first do A, then B. They don't
give any prompts. They often apparently do nothing (so you can't tell if they're busy, waiting for input, or hanging). There is no dialog.
On 31/01/2024 23:34, Keith Thompson wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 31/01/2024 20:14, Keith Thompson wrote:
Terminal control sequences (almost always based on VT100 these days)
are typically not printable, but tend to avoid null characters, which
means you can very probably use printf to print them (assuming you're
on a POSIX-like system).
[...]
The standard is ASCII (American Standard Code for Information
Interchange). In ASCII, byte zero is NUL, which means "ignore". So an
ASCII sequence may contain any number of embedded zero bytes, which the
receiver ignores. That's because for technical reasons some
communications channels have to send data every cycle, and if there is
no data, they will send a signal indistinguishable from all bits zero.
Not particularly relevant. A quick experiment with xterm indicates that
embedding null bytes in a control sequence prevents it from being
recognized. There may be some standards that require embedded zero
bytes to be ignored, but xterm doesn't follow any such standard.
Similarly, if you embed null bytes in text written to a file, the
result is a corrupted text file.
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed through systems designed to handle
human-readable text.
Maybe you are not used to a system where it's trivial to inspect such
data. When "some_prog" produces data that are not compatible with the
current terminal settings, "some_prog | hd" shows a hex dump instead.
The need to do this does not make "some_prog" poorly designed. It may
simply mean that the output is /intended/ for further processing.
For instance in some systems designed to receive
ASCII text, there is no distinction between the nul byte and "waiting
for next data byte". Obviously this will cause difficulties if the
data is binary.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
I must admit that it's nothing I have ever done or considered doing.
However standard output is designed for text and not binary output.
What is your evidence? stdout was just designed for output (as far as I
can tell) and, anyway, what is the distinction you are making between
binary and text? iconv --from ASCII --to EBCDIC-UK will produce
something that is "logically" text on stdout, but it might look like
binary to you.
An example where it's really useful not to care: I have a suite of tools
for doing toy cryptanalysis. Some apply various transformations and/or
filters to byte streams and others collect and output (on stderr)
various statistics. Plugging them together in various pipelines is very
handy when investigating an encrypted text. The output is almost always
"binary" in the sense that there would be no point in looking at it on a
terminal.
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
I think they're poorly designed too.
From the POV of interactive console programs, they /are/ poor. But the mistake is thinking that they are actual programs or commands, when
really they are just filters. They are not designed to be standalone commands.
Even 'cat', if I type it by itself, just sits there. (I wonder what use
it has in a sequence like ... | cat | ...; what does it add to the data?)
AFAICS, this stuff mainly works inside scripts. Or do people here spend
all day manually piping stuff between programs?
As for alternatives, I don't know. There are any number of ways this
could be done. But if everyone has become inured to this piping
business, they will not be receptive to anything different.
Here however are some ideas:
* Have versions of these tools for use as filters with no UI, just
a default input and output, and versions for interactive use with
helpful prompts. Or even just a sensibly named output! Instead of
every program writing a.out.
* Have a concept of a current block of data, analogous to a clipboard.
Then separate commands can load data, sort it, count it, display it,
write it etc, with no need for intermediate named files.
But I'd be happier if this was all contained within a separate
application from an OS shell program.
> load fred
Data loaded
> lc
4 lines
> list
1 one
2 two
3 three
4 four
> rev
Reversed
> list
1 four
2 three
3 two
4 one
> sort
Sorted
> upper
> list
1 FOUR
2 ONE
3 THREE
4 TWO
> save bill
Written to bill
> q
On 01.02.2024 06:24, Malcolm McLean wrote:
On 31/01/2024 23:36, Ben Bacarisse wrote:
Well almost by definition binary output is intended for further
Maybe you are not used to a system where it's trivial to inspect such
data. When "some_prog" produces data that are not compatible with the
current terminal settings, "some_prog | hd" shows a hex dump instead.
The need to do this does not make "some_prog" poorly designed. It may
simply mean that the output is /intended/ for further processing.
processing. Binary audio files must ultimately be converted to analogue
if anyone is to listen to them, for example.
Well, not necessarily. Let's leave the typical use case for a moment...
It might also be analyzed and converted to a digitally represented
formula, say some TeX code, or e.g. like the formal syntax that the
lilypond program uses.
I had to check how to do a hex dump on the system I'm typing this on.
The name of the hex dumper is xxd instead of hd, but otherwise it works
the same way and will accept piped data. But the fact I had to look it
up tells you that I've never actually used it.
Well, there's always the old Unix standard tool, 'od'.
I use that without thinking or looking it up, since it has always been
there, even though I only rarely use it.
And you observed correctly that nowadays there's typically even more
than one tool available. (And Bart will probably write his own tool. :-)
The two problems with hex
dumps are that you've got to do mental arithmetic to convert 8 bit hex
values into 16 or 32 bit fields,
Hmm.. - have you inspected the man pages of the tools?
At least for 'od' I know it's easy per option...
od -c file # characters (or escapes and octals)
od -t x1 file # hex octets
od -t x2 file # words (two octets)
od -c -t x1 file # characters and octets
On 31/01/2024 23:36, Ben Bacarisse wrote:
I'd write a monolithic program.
An example where it's really useful not to care: I have a suite of tools
for doing toy cryptanalysis. Some apply various transformations and/or
filters to byte streams and others collect and output (on stderr)
various statistics. Plugging them together in various pipelines is very
handy when investigating an encrypted text. The output is almost always
"binary" in the sense that there would be no point in looking at it on a
terminal.
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
On 01/02/2024 13:02, Janis Papanagnou wrote:
Well, not necessarily. Let's leave the typical use case for a moment...
It might also be analyzed and converted to a digitally represented
formula, say some TeX code, or e.g. like the formal syntax that the
lilypond program uses.
And ultimately converted to a non-binary form. A list of 1s and 0s is
seldom any use to the final consumer of the data.
The two problems with hex dumps are that you've got to do mental
arithmetic to convert 8 bit hex values into 16 or 32 bit fields,
Hmm.. - have you inspected the man pages of the tools?
I just ran "man xxd". The man page contains this statement:
The tool's weirdness matches its creator's brain. Use entirely at your
own risk. Copy files. Trace it. Become a wizard.
At least for 'od' I know it's easy per option...
od -c file # characters (or escapes and octals)
od -t x1 file # hex octets
od -t x2 file # words (two octets)
od -c -t x1 file # characters and octets
So a JPEG file starts with
FF D8
FF E0
hi lo (length of the FF E0 segment)
So we want the output FF D8 FF E0 [1000] to check that the segment
markers are correct and the FF E0 segment is genuinely a thousand bytes
(or whatever it is). This isn't easy to achieve with a hex dump utility.
On 01/02/2024 14:45, Scott Lurndal wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
I'd write a monolithic program.
Even a monolithic program is decomposed into subroutines (or malcolm
functions).
A pipeline is the same concept at a higher level.
Exactly. So whilst it might have some advantages, they aren't going to
be very large, because as you say, it's the same basic concept.
I just ran "man xxd". The man page contains this statement.
The tool's weirdness matches its creator's brain. Use entirely at your
own risk. Copy files. Trace it. Become a wizard.
On 01/02/2024 01:21, bart wrote:
* Have versions of these tools for use as filters with no UI, just
a default input and output, and versions for interactive use with
helpful prompts. Or even just a sensibly named output! Instead of
every program writing a.out.
Sometimes a "front end" with a nice UI /is/ useful. So people write
front ends with nice UI's - text-based or gui. Typically these are
great for common tasks, and are cumbersome or useless for rarer or more advanced stuff. And that's fine - use whatever works best for you at
the time. If you think "ssh" is complicated from the command line, use "putty" - but that won't handle all the uses that some people need.
But I most certainly don't want interactive and "helpful" prompts for
most of my command-line tools. It's fine on occasion if it is
necessary, useful if something out of the ordinary happens, and
appropriate for things like passwords. But when I know what I am doing,
why would I want to see help messages repeated? And if I don't know
what I am doing - it's the first time using a command, or I've forgotten
some details, then I have "prog --help", "man prog", or "google prog"
that will all give much more useful information than interactive prompts
ever could.
Can you imagine typing "ls" and being asked :
* Did you want to list files in the current directory, or elsewhere?
* Did you want to list all files, including hidden files?
* Did you want to use colour?
and then twenty more questions for the other common options for ls?
What would be the benefits of "sort" using command line options for
"filter" or "script" usage and then asking a dozen questions for "interactive" use?
* Have a concept of a current block of data, analogous to a clipboard.
Then separate commands can load data, sort it, count it, display it,
write it etc, with no need for intermediate named files.
You mean, a convenient way of moving data between programs? Sort of
like a pipe?
But I'd be happier if this was all contained within a separate
application from an OS shell program.
Yes, because it is /so/ much better if it is limited to a few commands
that you think of when writing this special application, than having a general system that works with any commands.
Basically, all you are saying is that you'd like command line utilities
to work with a default file name "/tmp/clipboard" - something you didn't
want earlier on.
Let's use /tmp/x for convenience....
/tmp$ cat > fred
one
two
three
four
<ctrl-D>
> load fred
Data loaded
$ cp fred x
$ cat -n x
1 FOUR
2 ONE
3 THREE
4 TWO
> save bill
Written to bill
> q
$ cp x bill
With pipes, this is all vastly simpler:
$ cat fred | tac | sort | awk '{print toupper($0)}' > bill
The "sponge" utility reads all of its stdin, then writes the file.
Otherwise, since Unix is inherently multi-tasking and runs the programs
in parallel (unlike your utility), trying to redirect output back into
the same file you are reading as input is a race condition. Utilities
are generally designed for pipes, not destructive changes to a single
file.
So could you list one or two reasons why you might prefer a program with
five subroutines, and one or two reasons why you might prefer to write
five programs which communicate via piped data?
On 01.02.2024 15:57, Malcolm McLean wrote:
On 01/02/2024 14:45, Scott Lurndal wrote:
Exactly. So whilst it might have some advantages, they aren't going to
be very large, because as you say, it's the same basic concept.
I think that you draw the wrong conclusion (on a statement that is
prone to misunderstandings or even wrong).
Pipelines are a very useful method to let processes communicate in
a one-way direction (as the name already suggests). From that it's
immediately recognizable that filters are a natural element in that
OS-architectural glue.
One original Unix philosophy was to have specialized commands that
do one thing well, and to combine such tasks as necessary. (To some
degree there was a similar statement concerning C function design.)
Unfortunately some popular GNU tools deviate from that. Features get
incorporated (as duplicates) in many tools (instead of using the
existing specialized one).
On 31/01/2024 19:35, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
I regularly see it as more symmetrical and clearer to push data left
to right. So I might write "cat infile | grep foo | sort > outfile".
Of course I could use "<" redirection, but somehow it seems more
natural to me to have this flow. I'll use "<" for simpler cases.
But perhaps this is just my habit, and makes little sense to other
people.
You can also use:
< infile grep foo | sort > outfile
Redirections don't have to be written after a command.
I did not know you could write it that way - thanks for another
off-topic, but useful, tip.
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-01-31, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Kaz Kylheku <433-929-6894@kylheku.com> writes:
But ESC is related text; it's a character described in ASCII used for
signaling in the middle of text, which is what it's doing here.
So are most of the other ASCII codes less than 0x20. Including
file and record delimiters and shift-in/shift-out.
On 01/02/2024 15:07, Janis Papanagnou wrote:
On 01.02.2024 14:26, Malcolm McLean wrote:
On 01/02/2024 13:02, Janis Papanagnou wrote:
Well, not necessarily. Let's leave the typical use case for a moment...
It might also be analyzed and converted to a digitally represented
formula, say some TeX code, or e.g. like the formal syntax that the
lilypond program uses.
And ultimately converted to a non binary form. A list of 1s and 0s is
seldom any use to the final consumer of the data.
No, I was speaking about an application that creates lilypond _input_,
which is a formal language to write notes, e.g. for evaluation by the
lilypond software, but not excluding other usages.
The two problems with hex dumps are that you've got to do mental
arithmetic to convert 8 bit hex values into 16 or 32 bit fields,
Hmm.. - have you inspected the man pages of the tools?
I just ran "man xxd". The man page contains this statement:
The tool's weirdness matches its creator's brain. Use entirely at your
own risk. Copy files. Trace it. Become a wizard.
This statement repelled you? (Can't help you here.)
At least for 'od' I know it's easy per option...
od -c file # characters (or escapes and octals)
od -t x1 file # hex octets
od -t x2 file # words (two octets)
od -c -t x1 file # characters and octets
So a JPEG file starts with
FF D8
FF E0
hi lo (length of the FF E0 segment)
So we want the output FF D8 FF E0 [1000] to check that the segment
markers are correct and the FF E0 segment is genuinely a thousand bytes
(or whatever it is). This isn't easy to achieve with a hex dump utility.
I don't know binary format details about jpg, so I cannot help you here.
JPEG is an extremely common binary file format and JPEG files will be
found on most general purpose computers.
All you need to know for the purposes of the discussion is that the
first four bytes are segment identifiers and must have the values I
gave, whilst bytes five and six are a big-endian 16 bit number that
represents a segment length, and that potentially any of those values
could be unexpected and you might want to inspect them.
So how would you achieve that in a convenient and non-error prone way?
On 01/02/2024 02:29, bart wrote:
On 01/02/2024 00:47, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
I think they're poorly designed too.
Of course you do. They're not bart programs.
From the POV of interactive console programs, they /are/ poor.
You don't provide any reason why - do elucidate!
They only do one thing, like you can't first do A, then B. They don't
give any prompts. They often apparently do nothing (so you can't tell
if they're busy, waiting for input, or hanging). There is no dialog.
That's the whole point!
If you want to do A, then B, then you do "A | B", or "A; B", or "A && B"
or "A || B". And if you want to do A, then B twice, then C, then A
again, you write "A | B | B | C | A". Other operator choices let you
say "do this then that", or "do this, and if successful do that", etc.
Your monolithic AB program fails when you want to do C, or want to do A
and B in a way the AB author didn't envisage.
You have a Transformer - a toy that can be either a car or a robot. I've
got a box of Lego. Sometimes I need instructions and a bit of time, but
I can have a car, a robot, a plane, an alien, a house, and anything else
I might want.
On 31/01/2024 14:35, Janis Papanagnou wrote:
First; the EU publishes in all languages of the member states,
for example. (There's no single lingua franca.)
Weirdly, while Norway is not in the EU but Sweden and Denmark are, they publish (for some things at least) in Norwegian but not in Swedish or
Danish. [...]
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 01/02/2024 15:07, Janis Papanagnou wrote:
On 01.02.2024 14:26, Malcolm McLean wrote:
JPEG is an extremely common binary file format and JPEG files will be
On 01/02/2024 13:02, Janis Papanagnou wrote:
Well, not necessarily. Let's leave the typical use case for a moment...
And ultimately converted to a non binary form. A list of 1s and 0s is
It might also be analyzed and converted to a digitally represented
formula, say some TeX code, or e.g. like the formal syntax that the
lilypond program uses.
seldom any use to the final consumer of the data.
No, I was speaking about an application that creates lilypond _input_,
which is a formal language to write notes, e.g. for evaluation by the
lilypond software, but not excluding other usages.
I just ran "man xxd". The man page contains this statement.
The two problems with hex
dumps are that you've got to do mental arithmetic to convert 8 bit hex
values into 16 or 32 bit fields,
Hmm.. - have you inspected the man pages of the tools?
The tool's weirdness matches its creator's brain. Use entirely at your
own risk. Copy files. Trace it. Become a wizard.
This statement repelled you? (Can't help you here.)
At least for 'od' I know it's easy per option...
So a JPEG file starts with
od -c file # characters (or escapes and octals)
od -t x1 file # hex octets
od -t x2 file # words (two octets)
od -c -t x1 file # characters and octets
FF D8
FF E0
hi lo (length of the FF E0 segment)
So we want the output
FF D8 FF E0 [1000] to check that the segment markers are correct and the FF
E0 segment is genuinely a thousand bytes (or whatever it is). This isn't
easy to achieve with a hex dump utility.
I don't know binary format details about jpg, so I cannot help you here.
found on most general purpose computers.
All you need to know for the purposes of the discussion is that the
first four bytes are segment identifiers and must have the values I
gave, whilst bytes five and six are a big endian 16 bit number that
represents a segment length, and that potentially any of those values
could be unexpected and you might want to inspect them.
So how would you achieve that in a convenient and non-error prone way?
$ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
it is a jpeg
On 01.02.2024 16:41, Malcolm McLean wrote:
So could you list one or two reasons why you might prefer a program with
five subroutines, and one or two reasons why you might prefer to write
five programs which communicate via piped data?
A quite appealing and naturally appearing task (from the past) to use
pipes was to model communication cascades. Something like (off the top
of my head)...
data-source | sign | compress | crc | encrypt | channel-enc |
interleaver | channel-simulator | deinterleaver | channel-dec |
decrypt | crc-check | uncompress | check-sign | data-sink
Component-pairs can be omitted, say you may leave out the un-/compress
function. And every component may be either special purpose or general.
A special purpose entity could be BCH-enc and RCPC-enc, or it can also
be (if better suited) a combined module, say 'crc -16' vs. 'crc -32'
with the function realized as option argument.
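The stage names in that cascade are of course hypothetical, but the same composition works with real tools. A runnable analogue of "source | compress | channel | uncompress | sink", with `cat` standing in for the channel-simulator:

```shell
# data-source | compress | channel-simulator | uncompress | data-sink
# (gzip plays both the compress and uncompress roles; cat is the channel)
printf 'hello, channel\n' \
  | gzip -c \
  | cat \
  | gzip -dc
```

Any stage can be dropped or swapped without touching the others, which is the point of building the cascade from pipes in the first place.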
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 01.02.2024 16:41, Malcolm McLean wrote:
So could you list one or two reasons why you might prefer a program with
five subroutines, and one or two reasons why you might prefer to write
five programs which communicate via piped data?
A quite appealing and naturally appearing task (from the past) to use
pipes was to model communication cascades. Something like (off the top
of my head)...
data-source | sign | compress | crc | encrypt | channel-enc |
interleaver | channel-simulator | deinterleaver | channel-dec |
decrypt | crc-check | uncompress | check-sign | data-sink
Component-pairs can be omitted, say you may leave out the un-/compress
function. And every component may be either special purpose or general.
A special purpose entity could be BCH-enc and RCPC-enc, or it can also
be (if better suited) a combined module, say 'crc -16' vs. 'crc -32'
with the function realized as option argument.
There was also the widely used netpbm package for translating
between different image formats.
https://en.wikipedia.org/wiki/Netpbm
$ giftopnm somepic.gif | ppmtobmp > somepic.bmp
$ for i in *.png; do pngtopam $i | ppmtojpeg >`basename $i .png`.jpg; done
On 2024-02-01, Scott Lurndal <scott@slp53.sl.home> wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 01.02.2024 16:41, Malcolm McLean wrote:
So could you list one or two reasons why you might prefer a program with
five subroutines, and one or two reasons why you might prefer to write
five programs which communicate via piped data?
A quite appealing and naturally appearing task (from the past) to use
pipes was to model communication cascades. Something like (off the top
of my head)...
data-source | sign | compress | crc | encrypt | channel-enc |
interleaver | channel-simulator | deinterleaver | channel-dec |
decrypt | crc-check | uncompress | check-sign | data-sink
Component-pairs can be omitted, say you may leave out the un-/compress
function. And every component may be either special purpose or general.
A special purpose entity could be BCH-enc and RCPC-enc, or it can also
be (if better suited) a combined module, say 'crc -16' vs. 'crc -32'
with the function realized as option argument.
There was also the widely used netpbm package for translating
between different image formats.
https://en.wikipedia.org/wiki/Netpbm
$ giftopnm somepic.gif | ppmtobmp > somepic.bmp
$ for i in *.png; do pngtopam $i | ppmtojpeg >`basename $i .png`.jpg; done
Also, in regard to some silly objections upthread about the danger of
binary data on standard output, programs in Unix can easily do the
following (and arguably should):
if (isatty(STDOUT_FILENO)) {
fprintf(stderr, "Cowardly refusing to dump binary data to a terminal.\n");
exit(EXIT_FAILURE);
}
bart <bc@freeuk.com> writes:
On 01/02/2024 16:30, Scott Lurndal wrote:[...]
$ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
it is a jpeg
That doesn't work for me:
Not if you type the "^J"s as '^' and 'J'. They were intended to
represent newlines. I would use semicolons instead:
$ if file /tmp/garage.jpg | grep JPEG > /dev/null ; then echo "it is a jpeg" ; fi
it is a jpeg
(I might also use "grep -q" rather than redirecting to /dev/null.)
[...]
I think anyway that you need to grep for JFIF not JPEG, but that is a
really poor way to check for a JPEG file. Any text or binary file can
have a JFIF byte sequence.
That's not an issue. "file" doesn't just look for "JFIF" to determine
that a file is a JPEG.
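Indeed, file(1) inspects the magic bytes at the start of the file, not just the "JFIF" string. A sketch of the check as a reusable helper (`is_jpeg` is an invented name; `-b` suppresses the filename so only the description is matched, and `grep -q` avoids the /dev/null redirect):

```shell
# is_jpeg FILE: succeed if file(1) classifies FILE as JPEG image data.
is_jpeg() {
    file -b "$1" | grep -q 'JPEG'
}
```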
On 01.02.2024 11:30, David Brown wrote:
On 31/01/2024 14:35, Janis Papanagnou wrote:
First; the EU publishes in all languages of the member states,
for example. (There's no single lingua franca.)
Weirdly, while Norway is not in the EU but Sweden and Denmark are, they
publish (for some things at least) in Norwegian but not in Swedish or
Danish. [...]
Hmm.. - in my ears this sounds strange. I've looked it up and found...
"The EU has 24 official languages:
Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish,
French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian,
Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish and
Swedish."
On 01.02.2024 11:34, David Brown wrote:
On 31/01/2024 19:35, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
I regularly see it as more symmetrical and clearer to push data left
to right. So I might write "cat infile | grep foo | sort > outfile".
Of course I could use "<" redirection, but somehow it seems more
natural to me to have this flow. I'll use "<" for simpler cases.
But perhaps this is just my habit, and makes little sense to other
people.
I completely understand that.
You can also use:
< infile grep foo | sort > outfile
Redirections don't have to be written after a command.
Indeed. And if we also respect that 'grep' accepts arguments,
then it's even more compact and yet probably better legible... :-)
grep foo infile | sort > outfile
I did not know you could write it that way - thanks for another
off-topic, but useful, tip.
Yes. We certainly should instead have written
grep foo iso646.h | sort > outfile
On 01/02/2024 14:50, David Brown wrote:
On 01/02/2024 02:29, bart wrote:
On 01/02/2024 00:47, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
I think they're poorly designed too.
Of course you do. They're not bart programs.
From the POV of interactive console programs, they /are/ poor.
You don't provide any reason why - do elucidate!
They only do one thing, like you can't first do A, then B. They don't
give any prompts. They often apparently do nothing (so you can't tell
if they're busy, waiting for input, or hanging). There is no dialog.
That's the whole point!
If you want to do A, then B, then you do "A | B", or "A; B", or "A &&
B" or "A || B". And if you want to do A, then B twice, then C, then A
again, you write "A | B | B | C | A". Other operator choices let you
say "do this then that", or "do this, and if successful do that", etc.
Your monolithic AB program fails when you want to do C, or want to do
A and B in a way the AB author didn't envisage.
You have a Transformer - a toy that can be either a car or a robot.
I've got a box of Lego. Sometimes I need instructions and a bit of
time, but I can have a car, a robot, a plane, an alien, a house, and
anything else I might want.
You can only do one thing, as you can only have one unbroken byte
sequence as output sent to stdout.
You can't send output A to stdout, then B to stdout, and certainly can't interleave messages to the console on stdout, as that would then be all
mixed up with the possibly binary data, and if redirected, you won't see
it.
I can see the idea of having one permanently open channel, but call it stdbinout or stdpipeout. But you still won't be able to generate a
sequence of distinct data blocks along that one channel because it is continuous.
This why 'as' only ever produces one object file, even for multiple
input source files.
And explains why 'as' treats multiple .s input files as though they were
all part of the same single source file: you can take one .s file, chop
it up into multiple .s files, and submit them all to 'as' (keeping the
right order).
It's a feature! It's also the wackiest assembler I've encountered, this
century anyway. The fact that it's implemented as a crude filter with
one input stream and one output stream helps explain it.
Although it works differently from most such filters, because if its
output is not piped, and not redirected, it is sent to a file (always
called a.out). It's not quite crazy enough to send binary object file
data to the terminal; I wonder why not?
On 01/02/2024 14:55, David Brown wrote:
On 01/02/2024 02:53, Malcolm McLean wrote:
By breaking down the problem into several parts e.g. "collect
On 31/01/2024 23:36, Ben Bacarisse wrote:
I'd write a monolithic program.
An example where it's really useful not to care: I have a suite of tools
for doing toy cryptanalysis. Some apply various transformations and/or
filters to byte streams and others collect and output (on stderr)
various statistics. Plugging them together in various pipelines is very
handy when investigating an encrypted text. The output is almost always
"binary" in the sense that there would be no point in looking at it on a
terminal.
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
It's very strange to me to see people that consider themselves
programmers talk about having multiple small functions to do specific
tasks and combining them into bigger functions to solve bigger
problems, yet are reduced to quivering jellies at the thought of
multiple small programs to do specific tasks that can be combined to
solve bigger tasks.
Do you think the C standard library would be improved by a single
function "flubadub" that takes 20 parameters and can calculate
logarithms, print formatted text, allocate memory and write it all to
a file?
statistical data, analyse statistics, form hypothesis, attempt
decryption, check decrypt for plausible plaintext" we can usually attack
it better. And you're right, there's not a fundamental difference
between writing one program with five subroutines, or five programs
which pass data to each other via pipelines.
But that doesn't mean that decision must not be made, or that you can't
give reasons for and against each option.
So could you list one or two reasons why you might prefer a program with
five subroutines, and one or two reasons why you might prefer to write
five programs which communicate via piped data?
On 2/1/2024 1:25 PM, bart wrote:
On 01/02/2024 20:09, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 01/02/2024 16:30, Scott Lurndal wrote:[...]
$ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
it is a jpeg
That doesn't work for me:
Not if you type the "^J"s as '^' and 'J'. They were intended to
represent newlines. I would use semicolons instead:
$ if file /tmp/garage.jpg | grep JPEG > /dev/null ; then echo "it is a jpeg" ; fi
it is a jpeg
(I might also use "grep -q" rather than redirecting to /dev/null.)
[...]
I think anyway that you need to grep for JFIF not JPEG, but that is a
really poor way to check for a JPEG file. Any text or binary file can
have a JFIF byte sequence.
That's not an issue. "file" doesn't just look for "JFIF" to determine
that a file is a jpg.
I see, so 'file' is a special command that does all the work. grep
checks whether the description contains JPEG. Although it won't work for
any of my private formats.
Why would it work with your private formats? ;^)
On 01/02/2024 20:09, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 01/02/2024 16:30, Scott Lurndal wrote:[...]
$ if file /tmp/garage.jpg | grep JPEG > /dev/null^Jthen^Jecho "it is a jpeg"^Jfi
it is a jpeg
That doesn't work for me:
Not if you type the "^J"s as '^' and 'J'. They were intended to
represent newlines. I would use semicolons instead:
$ if file /tmp/garage.jpg | grep JPEG > /dev/null ; then echo "it is a jpeg" ; fi
it is a jpeg
(I might also use "grep -q" rather than redirecting to /dev/null.)
[...]
I think anyway that you need to grep for JFIF not JPEG, but that is a
really poor way to check for a JPEG file. Any text or binary file can
have a JFIF byte sequence.
That's not an issue. "file" doesn't just look for "JFIF" to determine
that a file is a jpg.
I see, so 'file' is a special command that does all the work. grep
checks whether the description contains JPEG. Although it won't work for
any of my private formats.
On 01/02/2024 18:06, bart wrote:
You can't send output A to stdout, then B to stdout, and certainly
can't interleave messages to the console on stdout, as that would then
be all mixed up with the possibly binary data, and if redirected, you
won't see it.
$ cat A
one
two
three
$ cat B
cat
dog
cow
$ (cat A; cat B) | wc -l
6
That's the output of two commands, "cat A" and "cat B", each going to
their stdout, and they are concatenated into a single pipe going to the
"wc -l" command to count the lines.
I'm not sure we are getting anywhere with you trying to invent more and
more complex situations in an attempt to find something that can't be
done from a Linux bash shell.
And explains why 'as' treats multiple .s input files as though they
were all part of the same single source file: you can take one .s
file, chop it up into multiple .s files, and submit them all to 'as'
(keeping the right order).
It does that because that's what makes sense.
Having it generate multiple .o files for multiple .s inputs
would restrict that choice.
You really are scraping the bottom of the barrel to try to justify your irrational hatreds, aren't you? You put a lot of effort into
desperately trying to dislike programs that don't work exactly the way
your programs work.
It's a very strange hobby you have.
their earlier mainframe line (PDP-6/10/20).
Yet I don't understand the relation to Linus Torvalds that was the
source of mentioning VMS. - I mean; only that he dislikes it is not much
of a news.
On 31/01/2024 07:07, Lawrence D'Oliveiro wrote:
On Tue, 30 Jan 2024 17:25:31 +0100, David Brown wrote:
Mixing binary data with formatted text data is very unlikely to be
useful.
PDF does exactly that. To the point where the spec suggests putting
some random unprintable bytes up front, to distract format sniffers
from thinking they’re looking at a text file.
PDF files start with the "magic" indicator "%PDF", which is enough for
many programs to identify them correctly.
On 01/02/2024 18:35, Janis Papanagnou wrote:
On 01.02.2024 11:30, David Brown wrote:
On 31/01/2024 14:35, Janis Papanagnou wrote:
First; the EU publishes in all languages of the member states,
for example. (There's no single lingua franca.)
Weirdly, while Norway is not in the EU but Sweden and Denmark are, they
publish (for some things at least) in Norwegian but not in Swedish or
Danish. [...]
Hmm.. - in my ears this sounds strange. I've looked it up and found...
"The EU has 24 official languages:
Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish,
French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian,
Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish and
Swedish."
I can't say I have looked this up myself, or particularly care what
languages are used there. Maybe it only applied to some documents, or
used to apply but no longer does. Maybe some things don't stick rigidly
to the official languages, or maybe different guidelines are used for internal documents.
In ASCII, 0 means NUL, or "ignore".
But consider the more generic case of file-transfer tools that try to automatically convert between line-endings for text files on different platforms: if they mistook a PDF file for text, they could screw it up royally.
But it's not difficult to have intermediary files if you want to do more complicated things.
On 31/01/2024 19:01, Janis Papanagnou wrote:
All I can say is that the Unix shell was a reliable companion
wherever we had to automate tasks on Unix systems or on Cygwin
enhanced Windows.
Automation is certainly easier with good scripting - whatever the
language or shell.
On 31/01/2024 16:25, Janis Papanagnou wrote:
On 31.01.2024 15:21, David Brown wrote:
On 31/01/2024 09:36, Malcolm McLean wrote:
[ I snipped a couple of "I actually don't know/need it" things ]
But now it's effectively a programming language, and, from the example
code, a very poorly designed one which is cryptic and fussy and liable
to be hard to maintain. So it's better to use a language like Perl to
achieve the same thing, and I did have a few Perl scripts handy for
repetitive jobs of that nature in my Unix days.
That gave me a laugh! You think bash is cryptic, fussy and poorly
designed, and choose /Perl/ as the alternative :-)
I don't think it's that clear a joke. The Unix shell is extremely
error prone to program, and you should not let a newbie write shell
programs without careful supervision. ("newbie" [in shell context]
= less than 10 years of practical experience. - Am I exaggerating?
Maybe. But not much.)
I'm not a great fan of shell programming - anything advanced, and I tend
to reach for Python. But I think that is a matter of familiarity and practice. But if you consider bash programming as difficult to get
right, I'll not argue.
Perl is famously known as a "write-only" language. Sure, it is possible
to write good, clear, maintainable Perl code - but few people do that.
Thus the idea that finding bash cryptic or difficult and using Perl
instead is the joke.
On Wed, 31 Jan 2024 14:45:49 +0100, David Brown wrote:
On 31/01/2024 07:07, Lawrence D'Oliveiro wrote:
On Tue, 30 Jan 2024 17:25:31 +0100, David Brown wrote:
Mixing binary data with formatted text data is very unlikely to be
useful.
PDF does exactly that. To the point where the spec suggests putting
some random unprintable bytes up front, to distract format sniffers
from thinking they’re looking at a text file.
PDF files start with the "magic" indicator "%PDF", which is enough for
many programs to identify them correctly.
Sure, if you were looking for PDF files specifically.
But consider the more generic case of file-transfer tools that try to
automatically convert between line-endings for text files on different
platforms: if they mistook a PDF file for text, they could screw it up
royally.
On Wed, 31 Jan 2024 23:25:25 +0000, Malcolm McLean wrote:
In ASCII, 0 means NUL, or "ignore".
Fun fact: one of the names for hex 7F was “rubout”.
On 01/02/2024 22:02, David Brown wrote:
On 01/02/2024 18:06, bart wrote:
You can't send output A to stdout, then B to stdout, and certainly
can't interleave messages to the console on stdout, as that would
then be all mixed up with the possibly binary data, and if
redirected, you won't see it.
$ cat A
one
two
three
$ cat B
cat
dog
cow
$ (cat A; cat B) | wc -l
6
That's the output of two commands, "cat A" and "cat B", each going to
their stdout, and they are concatenated into a single pipe going to
the "wc -l" command to count the lines.
I see you don't get it. This is the equivalent of a program which is
supposed to do this:
print A to TTY
print B to LPT
print C to TTY
print D to LPT
but instead is written as this:
print A to TTY
print B to TTY
print C to TTY
print D to TTY
and you are expected to redirect all TTY output to LPT.
At least, on LPT, B and D can each start with a separate title page; on stdout directed to a file, it will be all mixed up.
bart <bc@freeuk.com> writes:
[...]
I've just realised why it is that your filter programs don't show
prompts or any kinds of messages: because those are sent to stdout,
and therefore will screw up any data that is being sent there as the
primary output.
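That observation points at the usual Unix convention: stdout carries the data, and anything meant for a human goes to stderr, where it reaches the terminal even when stdout is piped or redirected. A minimal sketch (the function name is made up):

```shell
# upcase_with_progress: data passes through on stdout, progress goes
# to stderr so it never corrupts the piped output.
upcase_with_progress() {
    echo "working..." >&2   # user-facing message: stderr
    tr 'a-z' 'A-Z'          # data: stdout, safe to pipe or redirect
}
```

With that split, "upcase_with_progress < in > out" still shows the message on the terminal, and the output file contains only the data.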
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
But consider the more generic case of file-transfer tools that try to
automatically convert between line-endings for text files on different
platforms: if they mistook a PDF file for text, they could screw it up
royally.
Just another reason not to use the system with two-byte line endings.
Not a problem on unix.
On 02.02.2024 03:12, Scott Lurndal wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
But consider the more generic case of file-transfer tools that try to
automatically convert between line-endings for text files on different
platforms: if they mistook a PDF file for text, they could screw it up
royally.
Just another reason not to use the system with two-byte line endings.
That cannot always be avoided.
Not a problem on unix.
There are several situations where it matters to consider CR/LF or when
some OS setting may handle these line terminators. Even if you're only
staying in your Unix universe. The "funniest" thing is if you process
files that have been edited by different people on different platforms.
(I know that I am not the first one who has written a CR-LF-CRLF tool
to check and fix (in some consistent way) the line endings of files.)
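The fixing half of such a tool can be very small. A sketch that normalizes CRLF endings to plain LF (assumes GNU sed's `\r` escape; lone-CR "classic Mac" endings and a consistency check would need more work):

```shell
# crlf_to_lf: strip a carriage return that immediately precedes the
# line feed, leaving Unix-style LF-only line endings.
crlf_to_lf() {
    sed -e 's/\r$//'
}
```

Used as a filter, "crlf_to_lf < dos.txt > unix.txt" does the conversion without touching bytes elsewhere on the line.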
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Wed, 31 Jan 2024 23:25:25 +0000, Malcolm McLean wrote:
Fun fact: one of the names for hex 7F was “rubout”.
Additional fun fact. Rubout was the legend on the keycap on the ASR-33
used to rub out the prior character (the A in ASR means it has the reader/punch). On paper tape, it means ignore the prior character.
On 01/02/2024 22:02, David Brown wrote:
I'm not sure we are getting anywhere with you trying to invent more
and more complex situations in an attempt to find something that can't
be done from a Linux bash shell.
They're remarkably simple situations!
I'm sure you're capable of going through the exercise and then you might
gain a bit of insight on how to design such software systems. And, no, arguing that you'd go for a monolithic program doesn't necessarily mean
that you are a "quivering jelly" at the thought of writing several
simpler ones. And in fact to start you off I actually mentioned a few advantages of the pipeline approach.
There are advantages and drawbacks to both. But I can't force you to
think about what those might be if you won't, and from experience just telling you provokes your natural contentiousness and isn't very effective.
On 01/02/2024 15:07, Janis Papanagnou wrote:
JPEG is an extremely common binary file format and JPEG files will be
I don't know binary format details about jpg, so I cannot help you here.
found on most general purpose computers.
All you need to know for the purposes of the discussion is that the
first four bytes are segment identifiers and must have the values I
gave, whilst bytes five and six are a big endian 16 bit number that represents a segment length, and that potentially any of those values
could be unexpected and you might want to inspect them.
So how would you achieve that in a convenient and non-error prone way?
The difference is that the syntax for redirecting output in the UNIX
shell is ony of the slightest use if you happen to run that particular
type of system.
[...] And whilst pipes are a concept, they are no way
comparable in depth and fundamental importance to the concept of
functions of functions.
The point is the two are not comparable. [...]
In Perl you have an implicit variable called $_. Some Perl statements
will operate on $_ without it actually being specified, and you then
have to reference $_ explicitly to obtain the result. It's highly
confusing for anyone used to a conventional language with only one type
of named variables. And that's one of the main decisions which makes
Perl hard to read.
However often you can write slightly less idiomatic Perl code which
doesn't make use of this feature, and then it's clearer. Or you can lay
the code out so that all the places where $_ are used in the same way
are together and make it a bit easier to work out what is going on.
There are thing you can do and Perl doesn't have to look like a
confusing mess.
On 02/02/2024 08:51, David Brown wrote:
On 01/02/2024 23:38, Malcolm McLean wrote:Well it's kind of poroof of the pudding. Ben has several programs
I'm sure you're capable of going through the exercise and then you
might gain a bit of insight on how to design such software systems.
And, no, arguing that you'd go for a monolithic program doesn't
necessarily mean that you are a "quivering jelly" at the thought of
writing several simpler ones. And in fact to start you off I actually
mentioned a few advantages of the pipeline approach.
I am perfectly aware of the advantages and disadvantages of monolithic
approaches.
connected by pipelines and asked me what I thought of the design. I said
I'd go for a monolithic approach. You criticised me, giving no reason
other than that my preferred approach was monolithic. So any reasonable
person would assume that you think that a monolithic approach is in and
of itself bad.
When invited to list the advantages and disadvantages of either, you
refused to do so. I am sure that you are capable of doing this, and you
are basically right. But you haven't actually done so. And it's proof of
the pudding.
The fact is there is a case for Ben's approach, there's a case for my
approach, and maybe Ben's case is better. I've no objection to anyone
weighing in on that. But fundamentally you do not understand what it
means to offer an argument or how to make a case.
On Fri, 02 Feb 2024 02:15:05 GMT, Scott Lurndal wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Wed, 31 Jan 2024 23:25:25 +0000, Malcolm McLean wrote:
Fun fact: one of the names for hex 7F was “rubout”.
Additional fun fact. Rubout was the legend on the keycap on the ASR-33
used to rub out the prior character (the A in ASR means it has the
reader/punch). On paper tape, it means ignore the prior character.
No, you had to overpunch the character to be ignored. Did that key
automatically backspace the tape for you, or did you have to do it
manually?
On Wed, 31 Jan 2024 23:25:25 +0000, Malcolm McLean wrote:
In ASCII, 0 means NUL, or "ignore".
Fun fact: one of the names for hex 7F was “rubout”. On seven-track paper
tape, if you made a mistake typing your program, instead of throwing away
the tape and starting again, you could go back and punch out all the holes
at that position to produce a “rubout” character. The meaning was “ignore
this character”.
On 01/02/2024 15:07, Janis Papanagnou wrote:
On 01.02.2024 14:26, Malcolm McLean wrote:
JPEG is an extremely common binary file format and JPEG files will be
found on most general purpose computers.
On 01/02/2024 13:02, Janis Papanagnou wrote:
No, I was speaking about an application that creates lilypond _input_,
Well, not necessarily. Let's leave the typical use case for a moment...
And ultimately converted to a non binary form. A list of 1s and 0s is
It might also be analyzed and converted to a digitally represented
formula, say some TeX code, or e.g. like the formal syntax that the
lilypond program uses.
seldom any use to the final consumer of the data.
which is a formal language to write notes, e.g. for evaluation by the
lilypond software, but not excluding other usages.
This statement repelled you? (Can't help you here.)
I just ran "man xxd". The man page contains this statement.
The two problems with hex
dumps are that you've got to do mental arithmetic to convert 8 bit hex
values into 16 or 32 bit fields,
Hmm.. - have you inspected the man pages of the tools?
The tool's weirdness matches its creator's brain. Use entirely at your
own risk. Copy files. Trace it. Become a wizard.
I don't know binary format details about jpg, so I cannot help you here.
At least for 'od' I know it's easy per option...
So a JPEG file starts with
od -c file # characters (or escapes and octals)
od -t x1 file # hex octets
od -t x2 file # words (two octets)
od -c -t x1 file # characters and octets
FF D8
FF E0
hi lo (length of the FF E0 segment)
So we want the output
FF D8 FF E0 [1000] to check that the segment markers are correct and the FF
E0 segment is genuinely a thousand bytes (or whatever it is). This isn't
easy to achieve with a hex dump utility.
All you need to know for the purposes of the discussion is that the first four bytes are segment identifiers and must have the values I gave,
So how would you achieve that in a convenient and non-error prone way?
I've commented elsewhere on why I think a monolithic program is not a
good design, so I won't repeat that here.
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 30/01/2024 07:27, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 29/01/2024 20:10, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
I've never used standard output for binary data.
[...] it strikes me as a poor design decision.
How so?
Because the output can't be inspected by humans, and because it might
have unusual effects if passed through systems designed to handle
human-readable text.
Maybe you are not used to a system where it's trivial to inspect such
data. When "some_prog" produces data that are not compatible with the
current terminal settings, "some_prog | hd" shows a hex dump instead.
The need to do this does not make "some_prog" poorly designed. It may
simply mean that the output is /intended/ for further processing.
For instance in some systems designed to receive ASCII text, there is
no distinction between the nul byte and "waiting for next data byte".
Obviously this will cause difficulties if the data is binary.
I must admit that it's nothing I have ever done or considered doing.
Also many binary formats can't easily be extended, so you can pass one
image and that's all. While it is possible to devise a text format
which is similar, in practice text formats usually have enough
redundancy to be easily extended.
So it's harder to correct errors, more prone to errors, and harder to
extend.
Your reasoning is all gobbledygook. Your comments reflect only
limitations in your thinking, not any essential truth about using
standard out for binary data.
However standard output is designed for text and not binary output.
What is your evidence? stdout was just designed for output (as far as I
can tell) and, anyway, what is the distinction you are making between
binary and text? iconv --from ASCII --to EBCDIC-UK will produce
something that is "logically" text on stdout, but it might look like
binary to you.
An example where it's really useful not to care: I have a suite of tools
for doing toy cryptanalysis. Some apply various transformations and/or
filters to byte streams and others collect and output (on stderr)
various statistics. Plugging them together in various pipelines is very
handy when investigating an encrypted text. The output is almost always
"binary" in the sense that there would be no point in looking at it on
a terminal.
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
I think they're poorly designed too.
From the POV of interactive console programs, they /are/ poor. But the
mistake is thinking that they are actual programs or commands, when
really they are just filters. They are not designed to be standalone
commands.
On 31/01/2024 23:36, Ben Bacarisse wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
According to you, these tools are poorly designed. I don't think so.
How would you design them? Endless input and output file names to be
juggled and tidied up afterwards?
I'd write a monolithic program.
Load the encrypted text into memory, and then pass it to subroutines to
do the various analyses.
You can of course process it, and then pass the processed output to
other programs. And that does have a point if the program which is
accepting the processed output is doing something which has no
necessary connection to cryptanalysis. So for example a program to
produce a pie chart from a list of letter frequencies. But if it's
transforming the encrypted text in intricate and specialised ways, then
analysing the transformed text in other specialised and intricate ways,
then firstly you've probably introduced coupling and dependency between
the two programs, and secondly you're probably at some point going to
want to modify the second program in the pipeline to look at the raw
data.
On 01/02/2024 13:24, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
You admit this with "not tested". Says it all. "Understanding Unix" is
an intellectually useless achievement. You might have to do it if you
have to use the system and debug and trouble shoot. But it's nothing
to be proud about.
You're an idiot. As usual trying to have a useful discussion
with you has turned out to be a complete waste of time.
Some things are interesting in themselves and worth talking about at
length. Like how Haskell builds up functions of functions. Other
things really aren't. And how to set up a Unix pipeline is one of
those that really aren't (unless actually faced with such a system
and with a practical need to do it).
I think you have the intelligence to understand this, if you'd just
understand where I am coming from. This arrogant and dismissive
attitude does not become you.
On 1/24/24 16:11, Kaz Kylheku wrote:
On 2024-01-24, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
On 1/24/24 03:10, Janis Papanagnou wrote:
On 23.01.2024 23:37, Kalevi Kolttonen wrote:
[...] I am
pretty sure that not all computer languages
provide guarantees about the order of evaluation.
What?!
Could you explain what surprises you about that statement? As quoted,
it's a general statement which includes C: "Except as specified later,
side effects and value computations of subexpressions are unsequenced."
Pretty much any language has to guarantee *something* about
order of evaluation, somewhere.
Not the functional languages, I believe - but I've only heard about
such languages, not used them.
Like for instance that calculating output is not possible before a
needed input is available.
Oddly enough, for a long time the C standard never said anything about
that issue. I argued that this was logically necessary, and few people
disagreed with that argument, but I couldn't point to wording in the
standard to support that claim.
That changed when they added support for multi-threaded code to C in
C2011. That required the standard to be very explicit about which
things could happen simultaneously in different threads, and which
things had to occur in a specified order. All of the wording about
"sequenced" was first introduced at that time. [...]
On 30/01/2024 16:49, Malcolm McLean wrote:
[nonsense as usual]
Mixing binary data with formatted text data is very unlikely to be
useful. fwrite() is perfectly good for writing binary data - it would
make no sense to have some awkward printf specifier to do this. (What
would the specifier even be? It would need to take two items of data -
a pointer and a length - and thus be very different from existing
specifiers.)
On Tue, 30 Jan 2024 20:29:17 +0100, David Brown wrote:
stdout and stdin were apparently available in FORTRAN in the 1950's.
There was a convention that channel 5 was the card reader, and 6 was the
line printer.
When interactive systems came along later, this became channel 5 for
keyboard input, and 6 for terminal output.
What happened to channels 1, 2, 3 & 4? Don't know.