• Top 10 most common hard skills listed on resumes...

    From John Forkosh@21:1/5 to All on Fri Aug 23 22:03:45 2024
    I came across
    https://www.fastcompany.com/91169318/
    where I was quite surprised, and very happily so,
    to see C listed as #3 on its list of
    "Top 10 most common hard skills listed in 2023"
    (scroll about halfway down for that list). Moreover,
    C++ doesn't even make it anywhere in that top-10 list.
    So is that list sensible??? I'd personally be delighted
    if so, but I'm suspicious it may just be wishful thinking
    on my part, and some kind of goofiness on the list's author.
    --
    John Forkosh

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Forkosh on Fri Aug 23 23:06:00 2024
    On Fri, 23 Aug 2024 22:03:45 -0000 (UTC), John Forkosh wrote:

    So is that list sensible???

    On the one hand, I would like to think so, since it lists Python at number
    1.

    On the other hand, it can’t be, because it includes Excel. What a
    laugh ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Sat Aug 24 02:26:54 2024
    On Fri, 23 Aug 2024 17:02:39 -0700, Keith Thompson wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On the other hand, it can’t be, because it includes Excel. What a laugh
    ...

    Perhaps you could explain humor to the rest of us.

    https://arstechnica.com/tech-policy/2020/10/excel-glitch-may-have-caused-uk-to-underreport-covid-19-cases-by-15841/
    https://www.theregister.com/2023/06/06/austria_election_excel_blunder/
    https://www.theregister.com/2023/10/12/excel_anesthetist_recruitment_blunder/
    https://arstechnica.com/cars/2024/03/formula-1-chief-appalled-to-find-team-using-excel-to-manage-20000-car-parts/

    What lessons have been learned since the Great Renaming of genes to
    avoid Excel misinterpreting them as dates? Seemingly, none:
    https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008984

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to John Forkosh on Sat Aug 24 14:41:16 2024
    On 24/08/2024 00:03, John Forkosh wrote:
    I came across
    https://www.fastcompany.com/91169318/
    where I was quite surprised, and very happily so,
    to see C listed as #3 on its list of
    "Top 10 most common hard skills listed in 2023"
    (scroll about halfway down for that list). Moreover,
    C++ doesn't even make it anywhere in that top-10 list.
    So is that list sensible??? I'd personally be delighted
    if so, but I'm suspicious it may just be wishful thinking
    on my part, and some kind of goofiness on the list's author.

    As far as I see, the article is about what people put on their CV's -
    not what they /should/ put, or what potential employers want.
    Basically, it is pretty useless - you could use it to argue that people
    think (rightly or wrongly) that C skills are useful for getting a job,
    or that people with C skills are regularly out of a job and needing to
    apply for a new one.

    And you can also expect that the people behind the article don't know
    the difference between C, C++ and C#.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bonita Montero on Sat Aug 24 19:27:08 2024
    On 24/08/2024 19:11, Bonita Montero wrote:
    Am 24.08.2024 um 00:03 schrieb John Forkosh:
    I came across
          https://www.fastcompany.com/91169318/
    where I was quite surprised, and very happily so,
    to see C listed as #3 on its list of
          "Top 10 most common hard skills listed in 2023"
    (scroll about halfway down for that list). Moreover,
    C++ doesn't even make it anywhere in that top-10 list.
    So is that list sensible??? I'd personally be delighted
    if so, but I'm suspicious it may just be wishful thinking
    on my part, and some kind of goofiness on the list's author.

    I guess C++ is used much more often because you're multiple times
    more productive than with C. And programming in C++ is an order of
    magnitude less error-prone.

    C++ incorporates most of C. So someone can write 'C++' code but can
    still have most of the same problems as C.
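
    A tiny illustration of that point (a minimal sketch, assuming nothing
    beyond the standard library): the same out-of-bounds bug compiles
    silently whether the file is built as C or as C++:

        #include <string.h>

        /* Accepted by both a C and a C++ compiler; the overflow is
           undefined behaviour in both languages. */
        void f(void)
        {
            char buf[4];
            strcpy(buf, "too long for buf");
        }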

    Meanwhile real C++ code has several times more boilerplate than C. HTF
    you can even discern your actual program amidst all that crap is beyond me.

    There /are/ proper higher-level languages than both C and C++. You can
    use one to help develop a working application; then porting that part to
    C is a quicker, simpler and safer process.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to Bart on Sat Aug 24 21:12:56 2024
    On 24/08/2024 19:27, Bart wrote:

    C++ incorporates most of C. So someone can write 'C++' code but can
    still have most of the same problems as C.

    Meanwhile real C++ code has several times more boilerplate than C. HTF
    you can even discern your actual program amidst all that crap is beyond me.

    There /are/ proper higher-level languages than both C and C++. You can
    use one to help develop a working application; then porting that part to
    C is a quicker, simpler and safer process.

    Upthread someone is asking about a valgrind leak they can't find.

    Over in comp.lang.c++ we write wrappers around the memory management if
    we can't find a system class that will do it for us, and that's no
    longer an issue.

    (There are still all sorts of memory problems, but if you are leaking
    you are doing Something Wrong.)

    Andy

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Forkosh@21:1/5 to David Brown on Sun Aug 25 12:09:40 2024
    David Brown <david.brown@hesbynett.no> wrote:
    John Forkosh wrote:
    I came across
    https://www.fastcompany.com/91169318/
    where I was quite surprised, and very happily so,
    to see C listed as #3 on its list of
    "Top 10 most common hard skills listed in 2023"
    (scroll about halfway down for that list). Moreover,
    C++ doesn't even make it anywhere in that top-10 list.
    So is that list sensible??? I'd personally be delighted
    if so, but I'm suspicious it may just be wishful thinking
    on my part, and some kind of goofiness on the list's author.

    As far as I see, the article is about what people put on their CV's -
    not what they /should/ put, or what potential employers want.
    Basically, it is pretty useless - you could use it to argue that people
    think (rightly or wrongly) that C skills are useful for getting a job,
    or that people with C skills are regularly out of a job and needing to
    apply for a new one.

    And you can also expect that the people behind the article don't know
    the difference between C, C++ and C#.

    Yeah, I guess "C is #3" was just unlikely wishful thinking
    on my part (I'm now hoping my lottery ticket is a winner).
    So, is there any reasonably reliable such "Top 10" list?
    If so, where? If not, where would C fall on it if it did
    exist? (I'd probably guess C>10, so make that a "Top 100"
    list, as needed.)
    --
    John Forkosh

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Forkosh@21:1/5 to Bart on Sun Aug 25 12:18:40 2024
    Bart <bc@freeuk.com> wrote:
    On 24/08/2024 19:11, Bonita Montero wrote:
    Am 24.08.2024 um 00:03 schrieb John Forkosh:
    I came across
    https://www.fastcompany.com/91169318/
    where I was quite surprised, and very happily so,
    to see C listed as #3 on its list of
    "Top 10 most common hard skills listed in 2023"
    (scroll about halfway down for that list). Moreover,
    C++ doesn't even make it anywhere in that top-10 list.
    So is that list sensible??? I'd personally be delighted
    if so, but I'm suspicious it may just be wishful thinking
    on my part, and some kind of goofiness on the list's author.

    I guess C++ is used much more often because you're multiple times
    more productive than with C. And programming in C++ is an order of
    magnitude less error-prone.

    C++ incorporates most of C. So someone can write 'C++' code but can
    still have most of the same problems as C.

    Meanwhile real C++ code has several times more boilerplate than C. HTF
    you can even discern your actual program amidst all that crap is beyond me.

    There /are/ proper higher-level languages than both C and C++. You can
    use one to help develop a working application; then porting that part to
    C is a quicker, simpler and safer process.

    I recall C as originally characterized as a "portable assembly language",
    as opposed to a "higher level language". And I'd agree with that
    assessment, whereby I think you're barking up the wrong tree by trying
    to evaluate its merits/demerits vis-a-vis higher-level languages.
    Consider it with respect to its own objectives, instead.
    --
    John Forkosh

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to John Forkosh on Sun Aug 25 17:06:26 2024
    On Sun, 25 Aug 2024 12:09:40 -0000 (UTC)
    John Forkosh <forkosh@somewhere.com> wrote:

    David Brown <david.brown@hesbynett.no> wrote:
    John Forkosh wrote:
    I came across
    https://www.fastcompany.com/91169318/
    where I was quite surprised, and very happily so,
    to see C listed as #3 on its list of
    "Top 10 most common hard skills listed in 2023"
    (scroll about halfway down for that list). Moreover,
    C++ doesn't even make it anywhere in that top-10 list.
    So is that list sensible??? I'd personally be delighted
    if so, but I'm suspicious it may just be wishful thinking
    on my part, and some kind of goofiness on the list's author.

    As far as I see, the article is about what people put on their CV's
    - not what they /should/ put, or what potential employers want.
    Basically, it is pretty useless - you could use it to argue that
    people think (rightly or wrongly) that C skills are useful for
    getting a job, or that people with C skills are regularly out of a
    job and needing to apply for a new one.

    And you can also expect that the people behind the article don't
    know the difference between C, C++ and C#.

    Yeah, I guess "C is #3" was just unlikely wishful thinking
    on my part (I'm now hoping my lottery ticket is a winner).
    So, is there any reasonably reliable such "Top 10" list?

    By which criterion exactly?

    If so, where? If not, where would C fall on it if it did
    exist? (I'd probably guess C>10, so make that a "Top 100"
    list, as needed.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to John Forkosh on Sun Aug 25 10:50:10 2024
    On 8/25/24 08:18, John Forkosh wrote:
    Bart <bc@freeuk.com> wrote:
    ...
    I recall C as originally characterized as a "portable assembly language",
    as opposed to a "higher level language". And I'd agree with that
    assessment, whereby I think you're barking up the wrong tree by trying
    to evaluate its merits/demerits vis-a-vis higher-level languages.
    Consider it with respect to its own objectives, instead.

    C has been mischaracterized as a "portable assembly language", but that
    has never been an accurate characterization. It has, from the very
    beginning, been defined by the behavior that is supposed to result from translating and executing the C code, not the assembly language that's
    supposed to be produced by the translation process.
    C is a high level language. It is a very low-level high-level language,
    but it's not in any sense an assembler.

    There's a famous bug that involved unconditionally dereferencing a pointer returned by malloc() before checking whether or not that pointer was
    null. The people who wrote that bug knew that the assembly code which
    they naively expected to be generated by translating their program would
    be perfectly safe on the platform they were working on: a null pointer
    pointed at a specific location in memory, and it was safe to access that location.

    But C is not defined as producing the assembly language they naively
    expected it to produce. It's defined by the behavior of the resulting
    code when executed. Dereferencing a null pointer has undefined behavior,
    so the compiler just assumed that the fact that they were dereferencing
    the pointer meant that they knew, for a fact, for reasons unknown to the compiler itself, that the pointer would never be null. Therefore, the
    compiler optimized away the code that was supposed to be executed if the pointer was null.
    This is a perfectly legal optimization: since the standard imposes no
    requirements on the behavior of a program when the behavior is
    undefined, nothing the program could do when the pointer was null could
    be incorrect behavior. Furthermore, this optimization was not on by
    default; it applied only if you turned it on explicitly, which the
    developers had done. It's OK to write code with behavior that is not
    defined by the standard, so long as some other document does define the
    behavior. But you need to be very sure you know what behavior that other
    document defines - this compiler defined the behavior as enabling
    optimizations based upon the assumption that the pointer was not null.
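
    A minimal sketch of that pattern (hypothetical code illustrating the
    bug class described, not the original source):

        #include <stdlib.h>

        int first_byte(size_t n)
        {
            char *p = malloc(n);
            char c = *p;       /* dereference happens first: UB if p is null   */
            if (p == NULL)     /* the compiler may assume p != NULL because of */
                return -1;     /* the dereference above, and delete this check */
            free(p);
            return c;
        }

    On the platform in question the stray read through a null pointer was
    expected to be harmless; the real protection was the null check, and
    that check is exactly what the optimizer removed.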

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to John Forkosh on Sun Aug 25 10:54:22 2024
    On 8/25/24 08:09, John Forkosh wrote:
    ...
    Yeah, I guess "C is #3" was just unlikely wishful thinking
    on my part (I'm now hoping my lottery ticket is a winner).
    So, is there any reasonably reliable such "Top 10" list?
    If so, where? If not, where would C fall on it if it did
    exist? (I'd probably guess C>10, so make that a "Top 100"
    list, as needed.)

    If you're just looking for a Top 10 list, and don't care "Top 10 what?",
    then the Tiobe index <https://www.tiobe.com/tiobe-index/> might satisfy. There's lots of controversy about their methodology, despite which I
    think it is the least controversial attempt to rank all computer languages.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From fir@21:1/5 to James Kuyper on Sun Aug 25 16:55:14 2024
    James Kuyper wrote:
    On 8/25/24 08:18, John Forkosh wrote:
    Bart <bc@freeuk.com> wrote:
    ...
    I recall C as originally characterized as a "portable assembly language",
    as opposed to a "higher level language". And I'd agree with that
    assessment, whereby I think you're barking up the wrong tree by trying
    to evaluate its merits/demerits vis-a-vis higher-level languages.
    Consider it with respect to its own objectives, instead.

    C has been mischaracterized as a "portable assembly language", but that
    has never been an accurate characterization. It has, from the very
    beginning, been defined by the behavior that is supposed to result from translating and executing the C code, not the assembly language that's supposed to be produced by the translation process.
    C is a high level language. It is a very low-level high-level language,
    but it's not in any sense an assembler.


    C is a mid-level language - I mean it makes more sense to call C that
    way than to call it low-level or high-level

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to James Kuyper on Sun Aug 25 18:10:31 2024
    On Sun, 25 Aug 2024 10:54:22 -0400
    James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

    On 8/25/24 08:09, John Forkosh wrote:
    ...
    Yeah, I guess "C is #3" was just unlikely wishful thinking
    on my part (I'm now hoping my lottery ticket is a winner).
    So, is there any reasonably reliable such "Top 10" list?
    If so, where? If not, where would C fall on it if it did
    exist? (I'd probably guess C>10, so make that a "Top 100"
    list, as needed.)

    If you're just looking for a Top 10 list, and don't care "Top 10
    what?", then the Tiobe index <https://www.tiobe.com/tiobe-index/>
    might satisfy. There's lots of controversy about their methodology,
    despite which I think it is the least controversial attempt to rank
    all computer languages.

    A published methodology is a plus; a bad methodology is a minus.
    Come on, JavaScript never higher than #6?!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to fir on Sun Aug 25 16:30:17 2024
    On 25/08/2024 15:55, fir wrote:
    James Kuyper wrote:
    On 8/25/24 08:18, John Forkosh wrote:
    Bart <bc@freeuk.com> wrote:
    ...
    I recall C as originally characterized as a "portable assembly
    language",
    as opposed to a "higher level language". And I'd agree with that
    assessment, whereby I think you're barking up the wrong tree by trying
    to evaluate its merits/demerits vis-a-vis higher-level languages.
    Consider it with respect to its own objectives, instead.

    C has been mischaracterized as a "portable assembly language", but that
    has never been an accurate characterization. It has, from the very
    beginning, been defined by the behavior that is supposed to result from
    translating and executing the C code, not the assembly language that's
    supposed to be produced by the translation process.
    C is a high level language. It is a very low-level high-level language,
    but it's not in any sense an assembler.


    C is a mid-level language - I mean it makes more sense to call C that
    way than to call it low-level or high-level


    So what language goes between Assembly and C?

    There aren't many! So it's reasonable to consider C as being at the
    lowest level of HLLs.

    Putting C at mid-level would make for a very cramped space above it as
    99% of languages would have to fit in there.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From tTh@21:1/5 to Bart on Sun Aug 25 18:20:52 2024
    On 8/25/24 17:30, Bart wrote:

    So what language goes between Assembly and C?

    Forth ?

    --
    +---------------------------------------------------------------------+
    | https://tube.interhacker.space/a/tth/video-channels                 |
    +---------------------------------------------------------------------+

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bart on Sun Aug 25 19:17:19 2024
    On Sun, 25 Aug 2024 16:30:17 +0100
    Bart <bc@freeuk.com> wrote:

    On 25/08/2024 15:55, fir wrote:
    James Kuyper wrote:
    On 8/25/24 08:18, John Forkosh wrote:
    Bart <bc@freeuk.com> wrote:
    ...
    I recall C as originally characterized as a "portable assembly
    language",
    as opposed to a "higher level language". And I'd agree with that
    assessment, whereby I think you're barking up the wrong tree by
    trying to evaluate its merits/demerits vis-a-vis higher-level
    languages. Consider it with respect to its own objectives,
    instead.

    C has been mischaracterized as a "portable assembly language", but
    that has never been an accurate characterization. It has, from the
    very beginning, been defined by the behavior that is supposed to
    result from translating and executing the C code, not the assembly
    language that's supposed to be produced by the translation process.
    C is a high level language. It is a very low-level high-level
    language, but it's not in any sense an assembler.


    C is a mid-level language - I mean it makes more sense to call C that
    way than to call it low-level or high-level


    So what language goes between Assembly and C?


    Popular today? Not many. In the past? PL/M, BLISS. Although the former
    is at almost the same level as C.

    There aren't many!

    Because C is seen as good enough.

    So it's reasonable to consider C as being at the
    lowest level of HLLs.

    Putting C at mid-level would make for a very cramped space above it
    as 99% of languages would have to fit in there.


    Why is it a problem?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bonita Montero on Sun Aug 25 19:28:10 2024
    On Sun, 25 Aug 2024 17:59:49 +0200
    Bonita Montero <Bonita.Montero@gmail.com> wrote:

    Am 25.08.2024 um 16:50 schrieb James Kuyper:

    C is a high level language. It is a very low-level high-level
    language, but it's not in any sense an assembler.

    C is almost the lowest-level of all high-level languages.
    I don't know of any other language that lacks nearly every
    abstraction.



    Define "abstraction".
    In my book this API is abstract.

    /* Opaque handle: callers never see the implementation's layout. */
    struct bar_implementation;
    typedef struct bar_implementation* bar;
    typedef const struct bar_implementation* cbar;

    bar bar_make(void);   /* constructor: returns the opaque handle */
    void bar_set_abra(bar x, int abra);
    void bar_set_cadabra(bar x, double cadabra);
    void bar_foo(bar x, int y);
    int bar_get_abra(cbar x);
    /* etc... */
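
    A short usage sketch of that style (hypothetical caller; a matching
    bar_free() is assumed for cleanup but is not part of the API above):

        void demo(void)
        {
            bar x = bar_make();
            bar_set_abra(x, 42);
            bar_set_cadabra(x, 2.5);
            bar_foo(x, 7);
            int abra = bar_get_abra(x);  /* callers go through accessors only */
            (void)abra;
        }

    The caller manipulates bar only through the functions; the members of
    the struct stay invisible, which is the abstraction being claimed.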

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to James Kuyper on Sun Aug 25 18:23:07 2024
    On 25.08.2024 16:50, James Kuyper wrote:
    On 8/25/24 08:18, John Forkosh wrote:
    ...
    I recall C as originally characterized as a "portable assembly language",
    as opposed to a "higher level language". And I'd agree with that
    assessment, whereby I think you're barking up the wrong tree by trying
    to evaluate its merits/demerits vis-a-vis higher-level languages.
    Consider it with respect to its own objectives, instead.

    C has been mischaracterized as a "portable assembly language", but that
    has never been an accurate characterization. It has, from the very
    beginning, been defined by the behavior that is supposed to result from translating and executing the C code, not the assembly language that's supposed to be produced by the translation process.
    C is a high level language. It is a very low-level high-level language,
    but it's not in any sense an assembler.

    I wouldn't take the above characterization literally - taken literally
    it's a wrong assessment (as I think you rightly say). But consider its
    origin, its intended use for systems programming, its machine
    orientation, its low-level constructs, its lack of high-level
    constructs, the absence of abstractions that already existed in those
    days in various forms in quite a few other HLLs, all the software bugs
    and hassles that are a consequence of its design, and whatnot...

    We can dispute informal classifications; whether it's a "very
    low-level high-level language" or rather a "very high-level low-level
    language".

    It's just that many folks consider(ed) that language a lousy
    representative within the zoo of HLLs. The title "portable assembly
    language" always appeared to me as just a disrespectful, pointed
    formula used in discussions with naive fans of new hypes who were not
    aware of the state-of-the-art language developments that existed in
    those days, and still do.

    When thinking about the "level" of languages, an image always forms in
    my mind of the economic damage caused by the actual consequences of
    using a language. (Just recently I tried to soothe someone: he
    shouldn't take his desperate error-tracking too personally, given that
    the C issue in question has certainly caused billions of dollars of
    damage already and he's certainly not alone with that problem.)

    Janis

    [...]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bonita Montero on Sun Aug 25 18:28:02 2024
    On 25.08.2024 17:59, Bonita Montero wrote:

    C is almost the lowest-level of all high-level languages.
    I don't know of any other language that lacks nearly every abstraction.

    Do languages like Brainfuck, Intercal, Whitespace, etc. count?

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to John Forkosh on Sun Aug 25 18:51:16 2024
    On 25/08/2024 14:18, John Forkosh wrote:
    Bart <bc@freeuk.com> wrote:
    On 24/08/2024 19:11, Bonita Montero wrote:
    Am 24.08.2024 um 00:03 schrieb John Forkosh:
    I came across
    https://www.fastcompany.com/91169318/
    where I was quite surprised, and very happily so,
    to see C listed as #3 on its list of
    "Top 10 most common hard skills listed in 2023"
    (scroll about halfway down for that list). Moreover,
    C++ doesn't even make it anywhere in that top-10 list.
    So is that list sensible??? I'd personally be delighted
    if so, but I'm suspicious it may just be wishful thinking
    on my part, and some kind of goofiness on the list's author.

    I guess C++ is used much more often because you're multiple times
    more productive than with C. And programming in C++ is an order of
    magnitude less error-prone.

    C++ incorporates most of C. So someone can write 'C++' code but can
    still have most of the same problems as C.

    Meanwhile real C++ code has several times more boilerplate than C. HTF
    you can even discern your actual program amidst all that crap is beyond me.

    There /are/ proper higher-level languages than both C and C++. You can
    use one to help develop a working application; then porting that part to
    C is a quicker, simpler and safer process.

    I recall C as originally characterized as a "portable assembly language",

    You recall incorrectly. Or, rather, you correctly recall people
    incorrectly characterizing it that way.

    One of C's original intents (and it has been extraordinarily successful
    at it) is to reduce the need to write assembly language.

    as opposed to a "higher level language". And I'd agree with that
    assessment,

    Then you'd be wrong. Dangerously wrong - people who think C is a kind
    of "portable assembly language" regularly write incorrect code and miss
    much of the point of the language.

    whereby I think you're barking up the wrong tree by trying
    to evaluate its merits/demerits vis-a-vis higher-level languages.
    Consider it with respect to its own objectives, instead.

    That's word salad without content. You are saying that C should be
    treated like C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bart on Sun Aug 25 18:36:46 2024
    On 24.08.2024 20:27, Bart wrote:
    On 24/08/2024 19:11, Bonita Montero wrote:

    I guess C++ is used much more often because you're multiple times
    more productive than with C. And programming in C++ is an order of
    magnitude less error-prone.

    C++ incorporates most of C. So someone can write 'C++' code but can
    still have most of the same problems as C.

    It's true that C++ decided to inherit unsafe C designs, with C being
    sort of its base. But a sophisticated programmer would knowingly
    avoid the unsafe parts and use the existing safer C++ constructs.
    Just because a language allows that you *can* write bad code doesn't
    mean you cannot avoid the problems. Of course it would have been
    (IMO) better if the unsafe parts had been replaced or left out, but
    there were portability considerations in C++'s design.

    Janis

    [...]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to John Forkosh on Sun Aug 25 18:47:13 2024
    On 25/08/2024 14:09, John Forkosh wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    John Forkosh wrote:
    I came across
    https://www.fastcompany.com/91169318/
    where I was quite surprised, and very happily so,
    to see C listed as #3 on its list of
    "Top 10 most common hard skills listed in 2023"
    (scroll about halfway down for that list). Moreover,
    C++ doesn't even make it anywhere in that top-10 list.
    So is that list sensible??? I'd personally be delighted
    if so, but I'm suspicious it may just be wishful thinking
    on my part, and some kind of goofiness on the list's author.

    As far as I see, the article is about what people put on their CV's -
    not what they /should/ put, or what potential employers want.
    Basically, it is pretty useless - you could use it to argue that people
    think (rightly or wrongly) that C skills are useful for getting a job,
    or that people with C skills are regularly out of a job and needing to
    apply for a new one.

    And you can also expect that the people behind the article don't know
    the difference between C, C++ and C#.

    Yeah, I guess "C is #3" was just unlikely wishful thinking
    on my part (I'm now hoping my lottery ticket is a winner).

    You wished to find C rated at number three on a list, without much
    concern about the relevance of the list to anything? That's an odd
    wish, but I'm happy it was fulfilled for you!

    So, is there any reasonably reliable such "Top 10" list?

    Reliable for what?

    If so, where? If not, where would C fall on it if it did
    exist? (I'd probably guess C>10, so make that a "Top 100"
    list, as needed.)

    If you are talking about a list of programming languages people probably
    should have on their CV's, it depends totally on the job. When people
    are applying to work as software developers at my company, if they don't
    have C on their list, it's unlikely we will bother with an interview.
    C++ would be good, assembly would be good. Listing "Excel" in
    programming languages would count negatively.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Janis Papanagnou on Sun Aug 25 20:11:24 2024
    On Sun, 25 Aug 2024 18:36:46 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 24.08.2024 20:27, Bart wrote:
    On 24/08/2024 19:11, Bonita Montero wrote:

    I guess C++ is used much more often because you're multiple times
    more productive than with C. And programming in C++ is an order of
    magnitude less error-prone.

    C++ incorporates most of C. So someone can write 'C++' code but can
    still have most of the same problems as C.

    It's true that C++ decided to inherit unsafe C designs, with C being
    sort of its base. But a sophisticated programmer would knowingly
    avoid the unsafe parts and use the existing safer C++ constructs.
    Just because a language allows that you *can* write bad code doesn't
    mean you cannot avoid the problems. Of course it would have been
    (IMO) better if the unsafe parts had been replaced or left out, but
    there were portability considerations in C++'s design.

    Janis

    [...]

    Safe HLLs without mandatory automatic memory management tend to fall
    into two categories:
    1. Those that already failed to become popular
    2. Those for which it will happen soon
    That despite at least one language in the 1st category being pretty
    well designed, if more than a little over-engineered.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Michael S on Sun Aug 25 18:17:01 2024
    On 25/08/2024 17:17, Michael S wrote:
    On Sun, 25 Aug 2024 16:30:17 +0100
    Bart <bc@freeuk.com> wrote:

    On 25/08/2024 15:55, fir wrote:
    James Kuyper wrote:
    On 8/25/24 08:18, John Forkosh wrote:
    Bart <bc@freeuk.com> wrote:
    ...
    I recall C as originally characterized as a "portable assembly
    language",
    as opposed to a "higher level language". And I'd agree with that
    assessment, whereby I think you're barking up the wrong tree by
    trying to evaluate its merits/demerits vis-a-vis higher-level
    languages. Consider it with respect to its own objectives,
    instead.

    C has been mischaracterized as a "portable assembly language", but
    that has never been an accurate characterization. It has, from the
    very beginning, been defined by the behavior that is supposed to
    result from translating and executing the C code, not the assembly
    language that's supposed to be produced by the translation process.
    C is a high level language. It is a very low-level high-level
    language, but it's not in any sense an assembler.


    C is a mid-level language - I mean it makes more sense to call C that
    way than to call it low-level or high-level


    So what language goes between Assembly and C?


    Popular today? Not many. In the past? PL/M, BLISS. Although the former
    is at almost the same level as C.

    There aren't many!

    Because C is seen as good enough.

    Because it's seen off most of the competition, partly thanks to the
    dominance of Unix.

    Lots of younger people now think that C is what a lower level, systems
    language is supposed to look like.

    So it's reasonable to consider C as being at the
    lowest level of HLLs.

    Putting C at mid-level would make for a very cramped space above it
    as 99% of languages would have to fit in there.


    Why is it a problem?

    It's only a problem if the aim is to classify languages according to
    perceived level, say from 1 to 100. Then you don't start by classifying
    one of the lowest-level ones as 50.

    If plotting such a chart (say level vs. year of introduction), half of
    it would be empty.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to tTh on Sun Aug 25 18:26:57 2024
    On 25/08/2024 17:20, tTh wrote:
    On 8/25/24 17:30, Bart wrote:

    So what language goes between Assembly and C?

       Forth ?


    I had in mind languages classed as 'HLLs'. I'm not sure if Forth counts.

    Otherwise there was a HLA I once worked with (that looks a lot more like
    a HLL than Forth ever will).

    Plus various Intermediate Languages that I've devised. These are higher
    level than assembly, but are clearly not HLLs either. Especially
    stack-based ones like this actual example for a Hello program:

    proc main*
    loadimm u64 "Hello, World"
    callp puts 1 0
    stop 0
    end

    extproc void puts
    extparam u64
    extend

    These are designed to be machine generated, but this one you could write manually if you had to. It's easier than ASM and it's portable.
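
    For comparison, the same program in plain C (a direct counterpart of
    the IL above; nothing assumed beyond the standard library):

        #include <stdio.h>

        int main(void)
        {
            puts("Hello, World");   /* the IL's loadimm + callp puts */
            return 0;               /* the IL's stop 0 */
        }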

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bonita Montero on Sun Aug 25 19:24:30 2024
    On 25/08/2024 19:12, Bonita Montero wrote:
    Am 25.08.2024 um 18:28 schrieb Michael S:

    Define "abstraction".

    OOP, functional programming, generic programming, exceptions.


    That isn't surprising. The code you constantly post always uses the most advanced features, uses every toy that is available, and the most
    elaborate algorithms.

    And it is always utterly incomprehensible.

    I suspect that most of the time this is unnecessary and you just write
    this way because you can. Or to show off. Or to write stuff that is
    impossible to write in C (good!).

    This is where having fewer such features helps. If I port an algorithm
    to some new language X, then if it was originally in C, I'd have a
    fighting chance.

    In C++ written by you, then forget it. I don't even need to look at the
    code. If you wrote novels, you'd be using a vocabulary 20 or 30 times
    bigger than any normal person understands. It really doesn't help.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bonita Montero on Sun Aug 25 22:00:16 2024
    On Sun, 25 Aug 2024 20:12:47 +0200
    Bonita Montero <Bonita.Montero@gmail.com> wrote:

    Am 25.08.2024 um 18:28 schrieb Michael S:

    Define "abstraction".

    OOP, functional programming, generic programming, exceptions.


    I don't see why any of those are abstractions.
    Not that I am particularly fond of abstractions when I do see them.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Sun Aug 25 17:48:14 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 25 Aug 2024 18:36:46 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 24.08.2024 20:27, Bart wrote:

    On 24/08/2024 19:11, Bonita Montero wrote:

    I guess C++ is used much more often because you're multiple times
    more productive than with C. And programming in C++ is an order of
    magnitude less error-prone.

    C++ incorporates most of C. So someone can write 'C++' code but can
    still have most of the same problems as C.

    It's true that C++ decided to inherit unsafe C designs, with C being
    sort of its base. But a sophisticated programmer would knowingly
    avoid the unsafe parts and use the existing safer C++ constructs.
    Just because a language allows that you *can* write bad code doesn't
    mean you cannot avoid the problems. Of course it would have been
    (IMO) better if the unsafe parts had been replaced or left out, but
    there were portability considerations in C++'s design.


    [...]

    Safe HLLs without mandatory automatic memory management

    I'm not sure what you mean by this description. Do you mean
    languages that are otherwise unsafe but have a safe subset?
    If not that then please elaborate. What are some examples of
    "safe HLLs without mandatory automatic memory management"?

    tend to fall
    into two categories:
    1. Those that already failed to become popular
    2. Those for which it will happen soon

    It's been amusing reading a discussion of which languages are or are
    not high level, without anyone offering a definition of what the
    term means. Wikipedia says, roughly, that a high-level language is
    one that doesn't provide machine-level access (and IMO that is a
    reasonable characterization). Of course no distinction along these
    lines is black and white - almost all languages have a loophole or
    two - but I expect there is general agreement about which languages
    clearly fail that test. In particular, any language that offers
    easy access to raw memory addresses (and both C and C++ certainly
    do), is not a high-level language in the Wikipedia sense.

    Second amusement: using the term popular without giving any
    kind of a metric that measures popularity.

    Third amusement: any language that has not yet become popular
    has already failed to become popular.

    That despite at least one language in the 1st category being
    pretty well designed, if more than a little over-engineered.

    Please, don't keep us in suspense. To what language do you refer?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Mon Aug 26 02:33:27 2024
    On Sun, 25 Aug 2024 20:11:24 +0300, Michael S wrote:

    Safe HLLs without mandatory automatic memory management tend to fall
    into two categories:
    1. Those that already failed to become popular
    2. Those for which it will happen soon
    That despite at least one language in the 1st category
    being pretty well designed, if more than a little over-engineered.

    Which category does Rust fall into? Given that it has already won
    acceptance in the world’s most successful software project -- the Linux kernel.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bonita Montero on Mon Aug 26 05:38:08 2024
    On Sun, 25 Aug 2024 17:59:49 +0200, Bonita Montero wrote:

    I don't know of any other language that lacks nearly every abstraction.

    Look at where C came from: its immediate ancestor was an adaptation of
    BCPL to cope with byte addressability. BCPL had no types: everything was a “word”.

    The DEC folks were fond of a language called BLISS, which was similar to
    BCPL but a bit more advanced. Instead of conventional types, it had abstractions for accessing various memory layouts, including dynamic ones, right down to the bit level.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Mon Aug 26 05:39:17 2024
    On Sun, 25 Aug 2024 22:00:16 +0300, Michael S wrote:

    On Sun, 25 Aug 2024 20:12:47 +0200 Bonita Montero
    <Bonita.Montero@gmail.com> wrote:

    Am 25.08.2024 um 18:28 schrieb Michael S:

    Define "abstraction".

    OOP, functional programming, generic programming, exceptions.

    I don't see why any of those are abstractions.

    OOP -- a form of data type abstraction
    functional programming -- abstraction away from control sequencing
    generic programming -- abstraction away from specific types
    exceptions -- abstraction away from explicit exit jumps

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to fir on Mon Aug 26 05:40:30 2024
    On Sun, 25 Aug 2024 16:55:14 +0200, fir wrote:

    C is a mid-level language ...

    In 1980s terms, it was.

    With the prevalence of languages at a higher level than what was
    considered “high-level” then, you could say it has dropped a level.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bart on Mon Aug 26 05:41:16 2024
    On Sun, 25 Aug 2024 16:30:17 +0100, Bart wrote:

    So what language goes between Assembly and C?

    There aren't many!

    Quite a few, actually. Elsewhere I mentioned B (the precursor of C), BCPL
    and BLISS.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Tim Rentsch on Mon Aug 26 10:54:56 2024
    On Sun, 25 Aug 2024 17:48:14 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 25 Aug 2024 18:36:46 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 24.08.2024 20:27, Bart wrote:

    On 24/08/2024 19:11, Bonita Montero wrote:

    I guess C++ is used much more often because you're multiple times
    more productive than with C. And programming in C++ is an order of
    magnitude less error-prone.

    C++ incorporates most of C. So someone can write 'C++' code but
    can still have most of the same problems as C.

    It's true that C++ decided to inherit unsafe C designs, with C being
    sort of its base. But a sophisticated programmer would knowingly
    avoid the unsafe parts and use the existing safer C++ constructs.
    Just because a language allows that you *can* write bad code doesn't
    mean you cannot avoid the problems. Of course it would have been
    (IMO) better if the unsafe parts had been replaced or left out, but
    there were portability considerations in C++'s design.


    [...]

    Safe HLLs without mandatory automatic memory management

    I'm not sure what you mean by this description. Do you mean
    languages that are otherwise unsafe but have a safe subset?
    If not that then please elaborate.

    That is nearly always the case in practice, but it does not have to be.
    I can't give a counterexample, but I can imagine a language similar to
    Pascal that has no records with variants and no procedure Dispose, and
    also hardens a few other corners that I can't recall right now.

    What are some examples of
    "safe HLLs without mandatory automatic memory management"?


    The most prominent examples are Ada and Rust.
    It seems that Zig tries the same, but I was not sufficiently interested
    to dig deeper. Partly because the last time I tried to play with Zig it
    refused to install on a Win7 machine.

    tend to fall
    into two categories:
    1. Those that already failed to become popular
    2. Those for which it will happen soon

    It's been amusing reading a discussion of which languages are or are
    not high level, without anyone offering a definition of what the
    term means. Wikipedia says, roughly, that a high-level language is
    one that doesn't provide machine-level access (and IMO that is a
    reasonable characterization).

    I don't like this definition. IMHO, what a language does have is at
    least as important as what it does not have for the purpose of
    estimating its level.

    Of course no distinction along these
    lines is black and white - almost all languages have a loophole or
    two - but I expect there is general agreement about which languages
    clearly fail that test. In particular, any language that offers
    easy access to raw memory addresses (and both C and C++ certainly
    do), is not a high-level language in the Wikipedia sense.

    Second amusement: using the term popular without giving any
    kind of a metric that measures popularity.


    Precise definitions of everything are hard.
    Maybe popular == first or close-second choice for a particular sort of
    programming job? Plus, somehow add popularity points for being used in
    many fields?

    Third amusement: any language that has not yet become popular
    has already failed to become popular.


    There is also the "heir apparent" type - languages that are recognized
    as not particularly popular now, but believed by many, including the
    press, to become popular in the future.

    That despite at least one language in the 1st category being
    pretty well designed, if more than a little over-engineered.

    Please, don't keep us in suspense. To what language do you refer?

    I thought, that every reader understood that I meant Ada.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Lawrence D'Oliveiro on Mon Aug 26 12:05:02 2024
    On 26/08/2024 06:41, Lawrence D'Oliveiro wrote:
    On Sun, 25 Aug 2024 16:30:17 +0100, Bart wrote:

    So what language goes between Assembly and C?

    There aren't many!

    Quite a few, actually. Elsewhere I mentioned B (the precursor of C), BCPL
    and BLISS.

    BLISS is a rather strange language. For something supposedly lower level
    than C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs: usually
    if you declare a variable A, then you can access A's value just by
    writing A; its address is automatically dereferenced.

    That doesn't happen in BLISS. If you took this program in normal C:

    int a, b, c;
    b = 10;
    c = 20;
    a = b + c;

    int* p;
    p = &a;

    then without that auto-deref feature, you'd have to write it like this:

    int a, b, c;
    *b = 10;
    *c = 20;
    *a = *b + *c;

    int* p;
    *p = a;

    (The last two lines are interesting; both would be valid in standard C,
    but the first set initialises p, while the second set, compiled as
    normal C, stores a's value into the target of the uninitialised p.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Mon Aug 26 13:30:52 2024
    Bart <bc@freeuk.com> writes:

    BLISS is a rather strange language. For something supposedly lower
    level than C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs: usually if
    you declare a variable A, then you can access A's value just by writing A; its address is automatically dereferenced.

    Not always. This is where left- and right-evaluation came in. On the
    left of an assignment A denotes a "place" to receive a value. On the
    right, it denotes a value obtained from a place. CPL used the terms and
    C got them via BCPL's documentation. Viewed like this, BLISS just makes "evaluation" a universal concept.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Tim Rentsch on Mon Aug 26 15:46:02 2024
    On 26/08/2024 02:48, Tim Rentsch wrote:

    It's been amusing reading a discussion of which languages are or are
    not high level, without anyone offering a definition of what the
    term means.

    That is an important point.

    Wikipedia says, roughly, that a high-level language is
    one that doesn't provide machine-level access (and IMO that is a
    reasonable characterization).

    No, that's not what Wikipedia says. To get the full picture, read the
    links:

    <https://en.wikipedia.org/wiki/Low-level_programming_language>
    <https://en.wikipedia.org/wiki/High-level_programming_language>

    Roughly speaking, they define a "high-level language" as one with a
    strong abstraction from the underlying machine, while a "low-level
    language" has little or no abstraction.

    Wikipedia classifies C as a high-level language that also supports a
    degree of low-level programming, which I think is a fair assessment.

    Of course no distinction along these
    lines is black and white - almost all languages have a loophole or
    two - but I expect there is general agreement about which languages
    clearly fail that test.

    Agreed - trying to make such binary classifications is usually a bad idea.

    In particular, any language that offers
    easy access to raw memory addresses (and both C and C++ certainly
    do), is not a high-level language in the Wikipedia sense.


    That is simply incorrect, based on the Wikipedia articles.

    I think it is perhaps better to first talk about low-level and
    high-level coding or functionality, rather than the language.
    High-level coding deals with abstractions, defined by their
    specifications rather than the hardware (or virtual machine) running the
    code. Low-level coding is tightly tied to the hardware - access to
    arbitrary memory (subject to OS or hardware restrictions), features
    based on the instruction set of the computer, and so on.

    C clearly supports high-level programming - you can write very portable
    code that is independent of the underlying hardware. (Most C
    /programs/ require at least a small amount of implementation-dependent
    behaviour or external library code, but a lot of C /code/ does not.) It
    also clearly supports low-level programming.

    Whether a programming language is considered "high level" or "low level"
    is, IME, determined by one question - is the language mainly defined in
    terms of abstract specifications or by the hardware implementing it? C
    does have implementation-specific behaviour, and is thus not a "pure"
    high-level language, but there can be no doubt that it is primarily
    defined as a high-level language.

    Both C and C++ also /support/ a limited (but very useful in practice)
    subset of low-level programming. That does not make them low-level
    programming languages, any more than C++ is a functional programming
    language just because it has lambdas. And even if one were to classify
    them as low-level languages, it would not stop them /also/ being
    high-level languages.

    And note that Wikipedia classifies it as a high-level language, and
    lists it along with other high-level languages. (I don't consider
    Wikipedia to be authoritative, but it's usually a reasonable and
    objective source for many purposes.)


    Third amusement: any language that has not yet become popular
    has already failed to become popular.

    Or it could be a new language that is gaining traction.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Mon Aug 26 14:54:21 2024
    On 26/08/2024 13:30, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    BLISS is a rather strange language. For something supposedly lower level
    than C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs: usually if
    you declare a variable A, then you can access A's value just by writing A;
    its address is automatically dereferenced.

    Not always. This is where left- and right-evaluation came in. On the
    left of an assignment A denotes a "place" to receive a value. On the
    right, it denotes a value obtained from a place. CPL used the terms and
    C got them via BCPL's documentation. Viewed like this, BLISS just makes "evaluation" a universal concept.


    That doesn't explain why one language requires an explicit dereference
    in the source code, and the other doesn't.

    By "access A's value" I mean either read or write access.

    A denotes a "place" to receive a value. On the
    right, it denotes a value obtained from a place.

    This /is/ confusing as it suggests a different rank for A depending on
    whether it is an lvalue or rvalue, eg. some difference in level of
    indirection. In fact that is the same on both sides.

    My point was that HLLs typically read or write values of variables
    without extra syntax. Given a declaration like 'int A' then:

                                  BLISS      C

    Read or write A's value        .A        A
    Get A's address                 A        &A

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Tim Rentsch on Mon Aug 26 15:13:26 2024
    On 26/08/2024 01:48, Tim Rentsch wrote:
    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 25 Aug 2024 18:36:46 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 24.08.2024 20:27, Bart wrote:

    On 24/08/2024 19:11, Bonita Montero wrote:

    I guess C++ is used much more often because you're multiple times
    more productive than with C. And programming in C++ is an order of
    magnitude less error-prone.

    C++ incorporates most of C. So someone can write 'C++' code but can
    still have most of the same problems as C.

    It's true that C++ decided to inherit unsafe C designs, with C being
    sort of its base. But a sophisticated programmer would knowingly
    avoid the unsafe parts and use the existing safer C++ constructs.
    Just because a language allows that you *can* write bad code doesn't
    mean you cannot avoid the problems. Of course it would have been
    (IMO) better if the unsafe parts had been replaced or left out, but
    there were portability considerations in C++'s design.


    [...]

    Safe HLLs without mandatory automatic memory management

    I'm not sure what you mean by this description. Do you mean
    languages that are otherwise unsafe but have a safe subset?
    If not that then please elaborate. What are some examples of
    "safe HLLs without mandatory automatic memory management"?

    tend to fall
    into two categories:
    1. Those that already failed to become popular
    2. Those for which it will happen soon

    It's been amusing reading a discussion of which languages are or are
    not high level, without anyone offering a definition of what the
    term means. Wikipedia says, roughly, that a high-level language is
    one that doesn't provide machine-level access (and IMO that is a
    reasonable characterization). Of course no distinction along these
    lines is black and white - almost all languages have a loophole or
    two - but I expect there is general agreement about which languages
    clearly fail that test. In particular, any language that offers
    easy access to raw memory addresses (and both C and C++ certainly
    do), is not a high-level language in the Wikipedia sense.

    So, which language do you think is higher level, C++ or Python? Where
    might Lisp fit in, or OCaml?

    Language 'level' is a linear concept, but the various characteristics of languages are such that there is really a multidimensional gamut.

    But among 'systems languages' (something else that needs defining as so
    many are claiming they are in that category), I think most would agree
    that C is near the bottom, but I don't think that C++ is that much
    higher, given how much cruft you still have to write to get anything done.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Keith Thompson on Mon Aug 26 13:07:24 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Bart <bc@freeuk.com> writes:
    [...]

    By "access A's value" I mean either read or write access.

    Write access does not access the value of an object. [...]

    Bart is explaining what he means by the phrase. There is nothing
    wrong with pointing out that the C standard doesn't use that
    phrase with the same meaning, but that doesn't make what Bart
    said wrong, especially since he is comparing semantics in two
    different languages.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to James Kuyper on Mon Aug 26 21:36:11 2024
    On 25/08/2024 15:54, James Kuyper wrote:
    If you're just looking for a Top 10 list, and don't care "Top 10 what?",
    then the Tiobe index <https://www.tiobe.com/tiobe-index/> might satisfy. There's lots of controversy about their methodology, despite which I
    think it is the least controversial attempt to rank all computer languages.

    Thank you, very interesting.

    I'm glad to see it doesn't have APL. I had to use it on a project once,
    and I still bear the scars. 40 years later...

    BTW this debate about high level/low level languages? It doesn't really
    matter, and doesn't really have a clear unambiguous answer - any more
    than the line between a hot day and a cold one.

    And

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Mon Aug 26 21:41:35 2024
    On Mon, 26 Aug 2024 12:30:55 -0700, Keith Thompson wrote:

    As I recall, the terms "lvalue" and "rvalue" originated with CPL.

    And very useful they have proven, too.

    CPL was kind of a rival to PL/I. But as I recall from the design papers
    and stuff, they spent most of their effort on coming up with alternative syntaxes for features, rather than on the features themselves.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bart on Mon Aug 26 21:40:20 2024
    On Mon, 26 Aug 2024 12:05:02 +0100, Bart wrote:

    BLISS is a rather strange language. For something supposedly lower level
    than C, it doesn't have 'goto'.

    BLISS is proof that you don’t need goto to write well-structured, yet low-level code.

    There is also a key feature that sets it apart from most HLLs: [variable references are always L-values]

    Another key feature: scoped macros. And the variations on that concept,
    like the way aggregate types are defined in an essentially macro-like
    fashion.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bart on Tue Aug 27 01:33:23 2024
    On 25.08.2024 20:24, Bart wrote:
    On 25/08/2024 19:12, Bonita Montero wrote:
    Am 25.08.2024 um 18:28 schrieb Michael S:

    Define "abstraction".

    This could have been looked up online (e.g. in a Wikipedia article).


    OOP, functional programming, generic programming, exceptions.

    (And there are yet more.)


    That isn't surprising. The code you constantly post always uses the most advanced features, uses every toy that is available, and the most
    elaborate algorithms.

    I'm not sure in what bubble you have lived for the past decades. The
    listed abstraction examples date back to the 1960s. They were realized
    in many programming languages, including long existing ones as well as
    many contemporary ones. I suggest trying to understand the concepts
    if you want to reach the next experience level. :-)

    Janis

    [...]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Janis Papanagnou on Tue Aug 27 00:47:36 2024
    On 27/08/2024 00:33, Janis Papanagnou wrote:
    On 25.08.2024 20:24, Bart wrote:
    On 25/08/2024 19:12, Bonita Montero wrote:
    Am 25.08.2024 um 18:28 schrieb Michael S:

    Define "abstraction".

    This could have been looked up online (e.g. in a Wikipedia article).


    OOP, functional programming, generic programming, exceptions.

    (And there are yet more.)


    That isn't surprising. The code you constantly post always uses the most
    advanced features, uses every toy that is available, and the most
    elaborate algorithms.

    I'm not sure in what bubble you have lived for the past decades. The listed abstraction examples date back to the 1960s. They were realized in
    many programming languages,

    Perhaps not so much in the ones people used. Assembly? Fortran? Cobol?
    There have always been academic languages.

    including long existing ones as well as
    many contemporary ones. I suggest trying to understand the concepts
    if you want to reach the next experience level. :-)

    I sometimes use (and implement) such features in scripting code which
    has the support to use them effortlessly.

    I've rarely needed them for systems programming.

    My comments were in connection with their clunky and abstruse
    implementations in C++, and BM's habit of posting C++ code full of
    gratuitous uses of such features.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Mon Aug 26 17:55:21 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 25 Aug 2024 17:48:14 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 25 Aug 2024 18:36:46 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    [...]

    It's true that C++ decided to inherit unsafe C designs as C being
    sort of its base. But a sophisticated programmer would knowingly
    avoid the unsafe parts and use the existing safer C++ constructs.
    Only that a language allows that you *can* write bad code doesn't
    mean you cannot avoid the problems. Of course it would have been
    (IMO) better if the unsafe parts were replaced or left out, but
    there were portability consideration in C++'s design.


    [...]

    Safe HLLs without mandatory automatic memory management

    I'm not sure what you mean by this description. Do you mean
    languages that are otherwise unsafe but have a safe subset?
    If not that then please elaborate.

    That is nearly always the case in practice, but it does not have to
    be. I can't give a counterexample, but I can imagine a language
    similar to Pascal that has no records with variants and no
    procedure Dispose and also hardens a few other corners that I
    can't currently recall.

    Does this description mean anything different than saying "Pascal
    with all the unsafe parts taken out", without saying what all the
    unsafe parts are?

    What are some examples of
    "safe HLLs without mandatory automatic memory management"?

    The most prominent examples are Ada and Rust.

    I don't think of either Ada or Rust as safe languages. I expect
    there are some barriers in both languages to using unsafe features,
    but also that it's easy to get around the barriers if one chooses
    to do so. (I should add that I have little to no "real world"
    experience in either Ada or Rust.)

    tend to fall
    into two categories:
    1. Those that already failed to become popular
    2. Those for which it will happen soon

    It's been amusing reading a discussion of which languages are or are
    not high level, without anyone offering a definition of what the
    term means. Wikipedia says, roughly, that a high-level language is
    one that doesn't provide machine-level access (and IMO that is a
    reasonable characterization).

    I don't like this definition. IMHO, what a language does have is at
    least as important as what it does not have for the purpose of
    estimating its level.

    I think Wikipedia means to give a definition in the dictionary
    sense of the word, namely, presenting the most commonly held view,
    or sometimes the most commonly held views plural, of how the word
    or term is used. It is in this dictionary sense that I say the
    definition is a reasonable characterization. In particular I don't
    mean to say that this meaning is "good", only that it is common
    (and also has reasonably high historical fidelity).

    Of course no distinction along these
    lines is black and white - almost all languages have a loophole or
    two - but I expect there is general agreement about which languages
    clearly fail that test. In particular, any language that offers
    easy access to raw memory addresses (and both C and C++ certainly
    do), is not a high-level language in the Wikipedia sense.

    Second amusement: using the term popular without giving any
    kind of a metric that measures popularity.

    Precise definitions of everything are hard. [...]

    Sure. But that shouldn't stop someone from giving an imprecise
    or informal definition. Any attempt to give a useful definition
    is infinitely better than no definition.

    Third amusement: any language that has not yet become popular
    has already failed to become popular.

    There is also "heir apparent' type - languages that are recognized
    as not particularly popular now, but believed by many, including
    press, to become popular in the future.

    Ahhh, so now we have a new metric to consider, whether a language
    is _expected to become_ popular in the future (and an indeterminate
    future at that).

    Please forgive me if that comment comes across as snarky. Mostly I
    just find the whole concept amusing... I don't mean to do that at
    your expense.

    That despite at least one language in the 1st category being
    pretty well designed, if more than a little over-engineered.

    Please, don't keep us in suspense. To what language do you refer?

    I thought that every reader understood that I meant Ada.

    I expect many readers did. If I had had to guess I would have
    guessed Rust.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Mon Aug 26 17:16:06 2024
    Michael S <already5chosen@yahoo.com> writes:

    [..concerning "abstraction"..]

    Not that I am particularly fond of abstractions when I do see
    them.

    A most unexpected comment. IMO choosing the right abstractions
    to define may be the most important skill in developing large
    systems.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Bart on Mon Aug 26 18:16:24 2024
    Bart <bc@freeuk.com> writes:

    On 26/08/2024 01:48, Tim Rentsch wrote:

    It's been amusing reading a discussion of which languages are or
    are not high level, without anyone offering a definition of what
    the term means. Wikipedia says, roughly, that a high-level
    language is one that doesn't provide machine-level access (and IMO
    that is a reasonable characterization). Of course no distinction
    along these lines is black and white - almost all languages have a
    loophole or two - but I expect there is general agreement about
    which languages clearly fail that test. In particular, any
    language that offers easy access to raw memory addresses (and both
    C and C++ certainly do), is not a high-level language in the
    Wikipedia sense.

    So, which language do you think is higher level, C++ or Python?
    Where might Lisp fit in, or OCaml?

    I find it hard to imagine that anyone cares about my answer to
    this question.

    Language 'level' is a linear concept, but the various characteristics
    of languages are such that there is really a multidimensional gamut.

    I think you're confusing the notions of "high-level" and "powerful".
    For example, in the Wikipedia entry sense of the term, the original
    BASIC is a high-level language, but I think most people would agree
    that it is not a very powerful language.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bart on Tue Aug 27 04:34:43 2024
    On Mon, 26 Aug 2024 15:13:26 +0100, Bart wrote:

    Language 'level' is a linear concept ...

    Not strictly. For example, consider that many assembly languages
    traditionally had more advanced macro facilities (e.g. arguments passed by keyword) than C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Tue Aug 27 04:36:23 2024
    On Mon, 26 Aug 2024 15:46:02 +0200, David Brown wrote:

    Wikipedia classifies C as a high-level language that also supports a
    degree of low-level programming, which I think is a fair assessment.

    The same could be said of Python.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bonita Montero on Tue Aug 27 05:17:30 2024
    On Tue, 27 Aug 2024 07:10:57 +0200, Bonita Montero wrote:

    Am 27.08.2024 um 02:16 schrieb Tim Rentsch:

    A most unexpected comment. IMO choosing the right abstractions to
    define may be the most important skill in developing large systems.

    Tell that to someone who thinks C++ doesn't provide something useful over C.

    Linus Torvalds?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bonita Montero on Tue Aug 27 06:47:08 2024
    On Tue, 27 Aug 2024 07:23:17 +0200, Bonita Montero wrote:

    Am 27.08.2024 um 07:17 schrieb Lawrence D'Oliveiro:

    On Tue, 27 Aug 2024 07:10:57 +0200, Bonita Montero wrote:

    Am 27.08.2024 um 02:16 schrieb Tim Rentsch:

    A most unexpected comment. IMO choosing the right abstractions to
    define may be the most important skill in developing large systems.

    Tell that to someone who thinks C++ doesn't provide something useful over C.

    Linus Torvalds?

    It's hopeless trying to persuade him.

    I thought you were going to question the competence of someone who had
    that attitude ... clearly not.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Tue Aug 27 09:44:40 2024
    On 27/08/2024 06:36, Lawrence D'Oliveiro wrote:
    On Mon, 26 Aug 2024 15:46:02 +0200, David Brown wrote:

    Wikipedia classifies C as a high-level language that also supports a
    degree of low-level programming, which I think is a fair assessment.

    The same could be said of Python.

    Python does not support any significant degree of low-level programming.

    A key example of low-level programming is control of hardware, which on
    most systems means accessing memory-mapped registers at specific
    addresses, reading and writing in specific orders. Python has no means
    to do any of that - C and C++ both provide this ability. (Micropython,
    a subset of Python targeting microcontrollers and small systems, has
    library modules that can do this.)
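
    To make that concrete, here is a minimal C sketch of memory-mapped
    register access (the peripheral, addresses and bit layout are invented
    for illustration; real values come from the device's datasheet):

        #include <stdint.h>

        /* Hypothetical UART registers at made-up addresses. 'volatile'
           forces the compiler to perform every read and write, in program
           order - exactly the property memory-mapped I/O needs. */
        #define UART_STATUS (*(volatile uint32_t *)0x40001000u)
        #define UART_DATA   (*(volatile uint32_t *)0x40001004u)
        #define TX_READY    (1u << 0)

        static void uart_putc(char c)
        {
            while ((UART_STATUS & TX_READY) == 0)
                ;                        /* busy-wait until transmitter is free */
            UART_DATA = (uint32_t)c;     /* this write starts the transmission */
        }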

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Aug 27 09:37:52 2024
    On 27/08/2024 01:47, Bart wrote:
    On 27/08/2024 00:33, Janis Papanagnou wrote:
    On 25.08.2024 20:24, Bart wrote:
    On 25/08/2024 19:12, Bonita Montero wrote:
    Am 25.08.2024 um 18:28 schrieb Michael S:

    Define "abstraction".

    This could have been looked up online (e.g. in a Wikipedia article).


    OOP, functional programming, generic programming, exceptions.

    (And there are yet more.)


    That isn't surprising. The code you constantly post always uses the most
    advanced features, uses every toy that is available, and the most
    elaborate algorithms.

    I'm not sure in what bubble you have lived for the past decades. The listed
    abstraction examples date back to the 1960s. They were realized in
    many programming languages,

    Perhaps not so much in the ones people used. Assembly? Fortran? Cobol?
    There have always been academic languages.

     including long existing ones as well as
    many contemporary ones. I suggest trying to understand the concepts
    if you want to reach the next experience level. :-)

    I sometimes use (and implement) such features in scripting code which
    has the support to use them effortlessly.

    I've rarely needed them for systems programming.


    As a counterpoint, I /have/ seen at least some of these in systems
    programming - namely my own, and that of colleagues, customers and
    suppliers. (Small-systems embedded and microcontroller programming is
    pretty much all "systems programming".)

    But it is also fair to say that abstractions are less common than you might see
    on "big" systems. For systems programming, there is more concern about
    the efficiency of the results, leading to a different balance with
    respect to speed or ease of coding, maintainability, code re-use, etc.

    In this field, C++ usage is on the way up, C usage has peaked and is
    going down, assembly is mostly dead (at least as a language for entire programs), and newcomers such as Rust and Micropython are emerging.

    We very rarely see exceptions in this field, but OOP is certainly common
    now. Classes with non-virtual inheritance are basically cost-free, and
    provide structure, safety, encapsulation and flexibility. Virtual
    functions have overhead, but can be a solid alternative to call-backs or function pointers. I use generic programming - templates - regularly,
    with inheritance and CRTP for compile-time polymorphism. I've even used lambdas.

    Yes, abstractions are, and always have been, vital to systems
    programming. They have always been important to systems programming in
    C too, using the limited tools available in C ("void*" pointers, typedef "handle" types, struct pointers so that client code does not see the
    struct contents, etc.). C++ gives you a lot more tools here, and lets
    you get more efficient results in the end (if you use it appropriately).

    But you certainly can use a range of abstractions in C programming too.
    Every time you use an enumerated type instead of an int, it's an
    abstraction. You can encapsulate your data and functions in structs.
    You can do generic coding with macros. C++ makes it easier to get
    right, and harder to get wrong (while still compiling), but you can
    still do abstractions in C.
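
    As a concrete illustration of that style, here is a minimal sketch of
    the classic opaque-handle idiom in C (the 'counter' type and its
    functions are invented for the example). The header exposes only an
    incomplete struct type, so client code cannot touch the representation:

        /* counter.h - public interface; the struct contents stay hidden */
        typedef struct counter counter;          /* opaque handle */
        counter *counter_new(void);
        void counter_inc(counter *c);
        int counter_value(const counter *c);
        void counter_free(counter *c);

        /* counter.c - the only file that knows the representation */
        #include <stdlib.h>
        struct counter { int value; };

        counter *counter_new(void) { return calloc(1, sizeof(counter)); }
        void counter_inc(counter *c) { c->value++; }
        int counter_value(const counter *c) { return c->value; }
        void counter_free(counter *c) { free(c); }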

    And yes, C++ gives more opportunities to write incomprehensible code
    than C, and the language suffers from having been built up gradually
    over time by adding features to a base (roughly C). It is no more a
    "perfect" language than any other language.

    My comments were in connection with their clunky and abstruse
    implementations in C++, and BM's habit of posting C++ code full of
    gratuitous uses of such features.


    I don't think BM's posts are generally good or clear examples of uses of
    C++. And I don't think continuously posting "C++ would be ten times
    easier" in c.l.c is helpful to anyone.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Tim Rentsch on Tue Aug 27 12:33:25 2024
    On Mon, 26 Aug 2024 17:55:21 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 25 Aug 2024 17:48:14 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:


    That despite at least one language in the 1st category being
    pretty well designed, if more than a little over-engineered.

    Please, don't keep us in suspense. To what language do you refer?


    I thought, that every reader understood that I meant Ada.

    I expect many readers did. If I had had to guess I would have
    guessed Rust.

    It means that the derogatory overtones of my past comments about Rust were
    too subtle.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Tim Rentsch on Tue Aug 27 12:44:43 2024
    On Mon, 26 Aug 2024 17:16:06 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    [..concerning "abstraction"..]

    Not that I am particularly fond of abstractions when I do see
    them.

    A most unexpected comment. IMO choosing the right abstractions
    to define may be the most important skill in developing large
    systems.

    You are probably right, but me being me, I am rarely able to grasp
    pure abstract things. Most typically, the first read about an abstract
    concept goes straight over my head. It can be helped by a few
    examples, but success is not guaranteed.
    Even after I have seemingly grasped the principle, when I start using
    an instance of an abstract thing, it's hard for me to stop thinking about
    gears and toothed wheels rotating under the hood.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bonita Montero on Tue Aug 27 11:32:24 2024
    On 27/08/2024 10:36, Bonita Montero wrote:
    Am 27.08.2024 um 09:37 schrieb David Brown:


    But it is also fair to say that abstractions are less common than you might
    see on "big" systems.  For systems programming, there is more concern
    about the efficiency of the results, ...

    C++ is efficient and abstract at the same time.

    Any simple one-line claim here is clearly going to be wrong.


    C++ code can be efficient, or abstract, or both, or neither. The
    language supports a wide range of coding practices, including bad ones.

    Some types of abstraction inevitably have run-time costs (speed or code
    space), which can be highly relevant in resource-constrained systems or
    other situations where efficiency is paramount (games programming is a
    fine example). These abstractions may or may not be worth the cost in
    the overall picture - it is up to the software developer to figure that
    out, regardless of the language.


    We very rarely see exceptions in this field, but OOP is certainly
    common now.

    You have to accept exceptions with C++ since there are a lot of places
    where C++ throws a bad_alloc or system_error.

    Incorrect. Like most low-level or systems programmers using C++, I have exceptions disabled and never use them.


    Classes with non-virtual inheritance are basically cost-free, and provide
    structure, safety, encapsulation and flexibility. Virtual functions have
    overhead, ...

    The virtual function overhead isn't noteworthy, and not much more than
    manual dispatch.

    Incorrect.

    You simply don't know what you are talking about for programming at this
    level - whether it is in C or C++.

    Virtual function overhead will sometimes be worth the cost, and in some circumstances it can be less than more manual dispatch methods. But it
    is not cost-free, and the overhead can most certainly be relevant if it
    is used inappropriately.


    But you certainly can use a range of abstractions in C programming too.

    C doesn't supply features for abstractions like those in C++.

    As I said, you can use a lot of abstractions in C programming, but C++
    can make many types of abstraction easier, safer, and more efficient.


    I don't think BM's posts are generally good or clear examples of uses
    of C++. And I don't think continuously posting "C++ would be ten times
    easier" in c.l.c is helpful to anyone.

    C is just too much work.


    Feel free to unsubscribe from the Usenet group dedicated to a language
    you so strongly dislike.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bonita Montero on Tue Aug 27 12:49:14 2024
    On Tue, 27 Aug 2024 07:10:57 +0200
    Bonita Montero <Bonita.Montero@gmail.com> wrote:

    Am 27.08.2024 um 02:16 schrieb Tim Rentsch:

    A most unexpected comment. IMO choosing the right abstractions
    to define may be the most important skill in developing large
    systems.

    Tell that to someone who thinks C++ doesn't provide something useful over C.


    That's not me.
    But the things that I appreciate about C++ likely form a small strict subset
    of the things that you appreciate.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bonita Montero on Tue Aug 27 15:06:04 2024
    On Tue, 27 Aug 2024 07:23:17 +0200
    Bonita Montero <Bonita.Montero@gmail.com> wrote:

    Am 27.08.2024 um 07:17 schrieb Lawrence D'Oliveiro:

    On Tue, 27 Aug 2024 07:10:57 +0200, Bonita Montero wrote:

    Am 27.08.2024 um 02:16 schrieb Tim Rentsch:

    A most unexpected comment. IMO choosing the right abstractions to
    define may be the most important skill in developing large
    systems.

    Tell that to someone who thinks C++ doesn't provide something useful over C.


    Linus Torvalds?

    It's hopeless trying to persuade him.


    There were times when I tried to defend your position against Linus. https://www.realworldtech.com/forum/?threadid=84931&curpostid=84931
    But it was 16 years ago.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bonita Montero on Tue Aug 27 14:51:42 2024
    On 27/08/2024 11:47, Bonita Montero wrote:
    Am 27.08.2024 um 11:32 schrieb David Brown:

    On 27/08/2024 10:36, Bonita Montero wrote:

    C++ is efficient and abstract at the same time.

    Any simple one-line claim here is clearly going to be wrong.

    90% of the C++ abstractions are zero-cost abstractions.

    90% of statistics are plucked from the air, including that one.


    C++ code can be efficient, or abstract, or both, or neither.

    Of course it can. Imagine functional programming with a std::sort.
    It's abstract since you won't have to deal with the details of the
    sort and supply only a comparison function object, but it's still
    optimally performant.


    What part of "or both" in my comment caused you such confusion?

    You have to accept exceptions with C++ since there are a lot of places
    where C++ throws a bad_alloc or system_error.

    Incorrect.  Like most low-level or systems programmers using C++, I
    have exceptions disabled and never use them.

    You won't be able to change the runtime's behaviour with that. The
    runtime throws bad_alloc or system_error everywhere, and if you disable exceptions the application simply terminates when one is thrown.


    As I said, you have no idea what you are talking about in the context of low-level programming. People use C++ without exceptions on embedded
    systems, resource-constrained systems, high-reliability systems, and
    low-level code.

    Incorrect.

    I just measured the time of a ...

        virtual int fn( int, int )

    ... which adds only two ints. The overhead is about one nanosecond
    on my Zen4-CPU. And usually you do complex tasks inside the virtual
    function so that the call itself doesn't participate much in the
    overall computation time.

    Again, you demonstrate your total ignorance of the topic.


    Virtual function overhead will sometimes be worth the cost, and in
    some circumstances it can be less than more manual dispatch methods.
    But it is not cost-free, and the overhead can most certainly be
    relevant if it is used inappropriately.

    If the destination of the dispatch varies, the overhead is nearly the
    same as with static dispatch, since most of the time is taken by the
    mispredicted branch.


    The vast majority of processors produced and sold do not have any kind
    of branch prediction.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Michael S on Tue Aug 27 09:45:55 2024
    On 8/26/24 03:54, Michael S wrote:
    On Sun, 25 Aug 2024 17:48:14 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    ...
    It's been amusing reading a discussion of which languages are or are
    not high level, without anyone offering a definition of what the
    term means. Wikipedia says, roughly, that a high-level language is
    one that doesn't provide machine-level access (and IMO that is a
    reasonable characterization).

    I don't like this definition. IMHO, what a language does have is at least
    as important as what it does not have for the purpose of estimating its level.

    That's not a particularly useful response. It would have been more
    useful to identify what features a language should have to qualify as
    low level or high level.
    Defining a level solely in terms of what the language has, without
    regard to what it doesn't have, leads to a potential ambiguity: what if
    a language, let's call it A/C, has every feature that you think
    should qualify it as a low-level language, AND every feature that you
    think should qualify it as a high-level language? If you define those
    terms solely in terms of what the language has, then A/C must be called
    both a low-level language and a high-level language.
    If, on the other hand, you define the level of a language both in terms
    of what it has, and what it doesn't have, A/C would be unclassifiable,
    which I think is a more appropriate way of describing it. Every time
    that someone says "low level languages can't ...", that statement will
    be false about A/C, and similarly for "high level languages can't ...".

    One principle that should be kept in mind when you're defining a term
    whose definition is currently unclear, is to decide what statements you
    want to make about things described by that term. In many cases, the
    truth of those statements should be a logical consequence of the
    definition you use.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Keith Thompson on Tue Aug 27 14:18:04 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    Bart <bc@freeuk.com> writes:
    BLISS is a rather strange language. For something supposedly lower level than C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs: usually if you declare a variable A, then you can access A's value just by writing A; its address is automatically dereferenced.

    Not always. This is where left- and right-evaluation came in. On the
    left of an assignment A denotes a "place" to receive a value. On the
    right, it denotes a value obtained from a place. CPL used the terms and
    C got them via BCPL's documentation. Viewed like this, BLISS just makes
    "evaluation" a universal concept.

    As I recall, the terms "lvalue" and "rvalue" originated with CPL. The
    'l' and 'r' suggest the left and right sides of an assignment.

    Disclaimer: I have a couple of CPL documents, and I don't see the terms "lvalue" and "rvalue" in a quick look. The PDFs are not searchable. If someone has better information, please post it. Wikipedia does say that
    the notion of "l-values" and "r-values" was introduced by CPL.

    I presume, since I mentioned the concepts coming from CPL, you are
    referring to specifically the short-form terms l- and r-values?

    I can't help with those specific terms as the document I have uses a
    mixture of terms like "the LH value of...", "left-hand expressions" and "evaluated in LH mode".

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bonita Montero on Tue Aug 27 20:54:30 2024
    On 27/08/2024 15:14, Bonita Montero wrote:
    Am 27.08.2024 um 14:51 schrieb David Brown:

    90% of statistics are plucked from the air, including that one.

    With C++ this fits. Most abstractions don't have an additional overhead
    over a manual implementation.

    Again, you are wrong to generalize. It depends on the situation, the abstraction in question, and the code.


    As I said, you have no idea what you are talking about in the context
    of low-level programming.

    I told you why it isn't practicable to suppress exceptions in C++
    since the runtime uses a lot of exceptions.

    And you were completely wrong when you said that. Perhaps in /your/
    field of programming you are correct - but you are ignoring the rest of
    the world.


    Again, you demonstrate your total ignorance of the topic.

    Most of the time a nanosecond more doesn't count, especially because
    usually you do more complex things in a virtual function.

    Often that is correct. Often it is /not/ correct. The only thing we
    can all be sure of is that your laughable attempt at a benchmark here
    bears no relation to the real world - especially not the real world of small-systems programming.


    The vast majority of processors produced and sold do not have any kind
    of branch prediction.

    Not today.

    For every one of your favourite big x86 chips sold, there will be a
    hundred small microcontrollers - none of which has branch prediction.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Tue Aug 27 21:53:11 2024
    On 27/08/2024 21:16, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 27/08/2024 06:36, Lawrence D'Oliveiro wrote:
    On Mon, 26 Aug 2024 15:46:02 +0200, David Brown wrote:

    Wikipedia classifies C as a high-level language that also supports a
    degree of low-level programming, which I think is a fair assessment.
    The same could be said of Python.

    Python does not support any significant degree of low-level programming.

    A key example of low-level programming is control of hardware, which
    on most systems means accessing memory-mapped registers at specific
    addresses, reading and writing in specific orders. Python has no
    means to do any of that - C and C++ both provide this ability.
    (Micropython, a subset of Python targeting microcontrollers and small
    systems, has library modules that can do this.)

    I've used Python's mmap module to access /dev/kmem on an embedded
    Linux system, accessing fixed addresses defined by an FPGA image.
    (The mmap module happens to be part of the core Python distribution.)


    There are /always/ ways to get around things (especially on Linux, where
    you have such "backdoors"). That is why I said Python does not support low-level programming to any /significant/ degree. "low-level" vs. "high-level" is not a binary distinction. Typically if you have Python
    code controlling some hardware, it is via a Python module with a C implementation, or with ctypes and an external shared library - not
    directly from Python.
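
    For the curious, a sketch of what such a C layer typically does on
    Linux is to mmap() a page of /dev/mem and access it through a volatile
    pointer. The physical address and register offsets below are invented,
    and /dev/mem access normally requires root and a kernel configured to
    allow it:

        #include <fcntl.h>
        #include <stdint.h>
        #include <stdio.h>
        #include <sys/mman.h>
        #include <unistd.h>

        int main(void)
        {
            /* Hypothetical peripheral: one 4 KiB register page. */
            const off_t phys = 0x40001000;
            const size_t len = 4096;

            int fd = open("/dev/mem", O_RDWR | O_SYNC);
            if (fd < 0) { perror("open /dev/mem"); return 1; }

            volatile uint32_t *regs = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                           MAP_SHARED, fd, phys);
            if (regs == MAP_FAILED) { perror("mmap"); return 1; }

            uint32_t status = regs[0];   /* read register at offset 0x0 */
            regs[1] = 0x1;               /* write register at offset 0x4 */
            printf("status = 0x%08x\n", (unsigned)status);

            munmap((void *)regs, len);
            close(fd);
            return 0;
        }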

    This is one of several reasons why we have different newsgroups for
    different languages.


    Sure. It's not really the place to get into details of other languages,
    but it is a thread that compares C to other languages.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to Bonita Montero on Tue Aug 27 21:13:56 2024
    On 27/08/2024 14:14, Bonita Montero wrote:
    Am 27.08.2024 um 14:51 schrieb David Brown:

    90% of statistics are plucked from the air, including that one.

    With C++ this fits. Most abstractions don't have an additional overhead
    over a manual implementation.

    Hmm. I'll pass on that.

    As I said, you have no idea what you are talking about in the context
    of low-level programming.

    I told you why it isn't practicable to suppress exceptions in C++
    since the runtime uses a lot of exceptions.

    There are quite a lot of places in low level programming where you have
    to manage without them. Sometimes you have to do without the runtime as
    well. That doesn't mean you can't use C++ itself.

    Again, you demonstrate your total ignorance of the topic.

    Most of the time a nanosecond more doesn't count, especially because
    usually you do more complex things in a virtual function.

    The vast majority of processors produced and sold do not have any kind
    of branch prediction.

    Not today.

    The vast majority of processors in desktop computers and above, sure.
    But do you think the one in my watch has one? My thermostat? The alarm
    clock? I've got at least a dozen devices with processors in this room
    with me right now.

    There are an awful lot of these small things, where power usage and cost matter far more than performance.

    Andy

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Vir Campestris on Tue Aug 27 21:07:41 2024
    Vir Campestris <vir.campestris@invalid.invalid> writes:
    On 27/08/2024 14:14, Bonita Montero wrote:


    I told you why it isn't practicable to suppress exceptions in C++
    since the runtime uses a lot of exceptions.

    There are quite a lot of places in low level programming where you have
    to manage without them. Sometimes you have to do without the runtime as
    well. That doesn't mean you can't use C++ itself.

    Indeed. I've worked on two hypervisors and one large OS written
    in C++ (no RTTI, no exceptions).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to Bonita Montero on Tue Aug 27 21:20:55 2024
    On 27/08/2024 10:11, Bonita Montero wrote:
    C++ is a language which addresses the lowest level as well as medium abstractions. I like to combine both.

    Your view of lowest probably differs from mine.

    Once upon a time I had to write a BIOS for a computer from scratch.

    You turn it on, and it starts executing code. OK, you can be sure you
    have a ROM and a processor. Have you got any RAM? Best to be sure it's functioning OK before you start using it. I'll just call this test
    function and ... oh. Well, I can put the return address in SP, we're not
    using that yet.

    That code had to be written in assembler. No other language gives you sufficient control. And it was actually quite fun doing it!

    Andy

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bonita Montero on Tue Aug 27 23:44:51 2024
    On Tue, 27 Aug 2024 08:58:01 +0200, Bonita Montero wrote:

    Am 27.08.2024 um 08:47 schrieb Lawrence D'Oliveiro:

    I thought you were going to question the competence of someone who had
    that attitude ... clearly not.

    That's not a question of competence but of attitude. Torvalds has a minimalistic mindset and can't handle abstractions in code.

    On the contrary, the Linux kernel is full of abstractions: the device
    layer, the network layer, the filesystem layer, the security layer ...

    You can’t do an OS kernel without abstractions. And Linux does it better
    than any other.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Keith Thompson on Wed Aug 28 00:15:42 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Ben Bacarisse <ben@bsb.me.uk> writes:
    Bart <bc@freeuk.com> writes:
    BLISS is a rather strange language. For something supposedly lower level than
    C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs: usually if you declare a variable A, then you can access A's value just by writing A;
    its address is automatically dereferenced.

    Not always. This is where left- and right-evaluation came in. On the left of an assignment A denotes a "place" to receive a value. On the
    right, it denotes a value obtained from a place. CPL used the terms and C got them via BCPL's documentation. Viewed like this, BLISS just makes "evaluation" a universal concept.

    As I recall, the terms "lvalue" and "rvalue" originated with CPL. The
    'l' and 'r' suggest the left and right sides of an assignment.

    Disclaimer: I have a couple of CPL documents, and I don't see the terms
    "lvalue" and "rvalue" in a quick look. The PDFs are not searchable. If >>> someone has better information, please post it. Wikipedia does say that >>> the notion of "l-values" and "r-values" was introduced by CPL.

    I presume, since I mentioned the concepts coming from CPL, you are
    referring to specifically the short-form terms l- and r-values?

    I can't help with those specific terms as the document I have uses a
    mixture of terms like "the LH value of...", "left-hand expressions" and
    "evaluated in LH mode".

    The documents I have are unsearchable PDFs; they appear to be scans of
    paper documents.

    https://comjnl.oxfordjournals.org/content/6/2/134.full.pdf
    https://www.ancientgeek.org.uk/CPL/CPL_Elementary_Programming_Manual.pdf

    Do you have friendlier documents?

    The earliest that is searchable has this title page:

    UNIVERSITY OF LONDON INSTITUTE OF COMPUTER SCIENCE
    *************************************************
    THE UNIVERSITY MATHEMATICAL LABORATORY, CAMBRIDGE
    *************************************************
    CPL ELEMENTARY PROGRAMMING MANUAL
    Edition I (London)

    This document, written by the late John Buxton, was preserved by Bill
    Williams, formerly of London University’s Atlas support team. Bill has
    generously made it available to Dik Leatherdale who has OCRed and
    otherwise transcribed it for the Web. All errors should be reported to
    dik@leatherdale.net. The original appearance is respected as far as
    possible, but program text and narrative are distinguished by the use of
    different fonts. Transcriber’s additions and “corrections” are in red,
    hyperlinks in underlined purple. A contents list and a selection of
    references have been added inside the back cover.

    March 1965

    I don't know where I got it from. The other searchable one is just a
    PDF of the oft-cited paper "The main features of CPL" by Barron et al.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Tue Aug 27 23:50:39 2024
    On Tue, 27 Aug 2024 12:44:43 +0300, Michael S wrote:

    Most typically, the first read about an abstract concept goes
    straight over my head.

    This is why you need good examples. Generalization is all very well,
    until your audience fails to grasp why the generalization is actually
    useful.

    Consider the “descriptor” concept in Python <https://docs.python.org/3/reference/datamodel.html#implementing-descriptors>. Can you appreciate, from that bare-bones description in §3.3.2.2 and §3.3.2.3, how useful they are? I certainly didn’t.

    But on further study, I discovered that descriptors are key to how the
    whole class system works in Python. Every function is a descriptor.
    And then you discover that builtin functions like “classmethod” and “property” are just conveniences: you could write them yourself in
    regular Python code if you wanted to, since they don’t rely on any
    magic internal to the particular Python implementation.

    A similar thing applies to metaclasses.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Tue Aug 27 23:53:55 2024
    On Tue, 27 Aug 2024 09:44:40 +0200, David Brown wrote:

    Python does not support any significant degree of low-level programming.

    A key example of low-level programming is control of hardware, which on
    most systems means accessing memory-mapped registers at specific
    addresses, reading and writing in specific orders. Python has no means
    to do any of that - C and C++ both provide this ability.

    I’ve got news for you: this kind of thing is perfectly doable in Python <https://docs.python.org/3/library/ctypes.html>.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Wed Aug 28 00:49:49 2024
    Bart <bc@freeuk.com> writes:

    On 26/08/2024 13:30, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    BLISS is a rather strange language. For something supposedly lower level than C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs: usually if you declare a variable A, then you can access A's value just by writing A; its address is automatically dereferenced.
    Not always. This is where left- and right-evaluation came in. On the
    left of an assignment A denotes a "place" to receive a value. On the
    right, it denotes a value obtained from a place. CPL used the terms and
    C got them via BCPL's documentation. Viewed like this, BLISS just makes
    "evaluation" a universal concept.

    That doesn't explain why one language requires an explicit dereference in the source code, and the other doesn't.

    It does for me. If you think I can help, maybe you could ask some more questions as I don't know what else to say. BLISS uses addresses
    explicitly, so the rvalue/lvalue distinction is not a perfect match for
    what's going on, but it's close enough that I find it helpful.

    By "access A's value" I mean either read or write access.

    A denotes a "place" to receive a value. On the
    right, it denotes a value obtained from a place.

    This /is/ confusing as it suggests a different rank for A depending on whether it is an lvalue or rvalue, e.g. some difference in level of indirection. In fact that is the same on both sides.

    I don't know what you mean by rank here. The whole point of two
    different evaluations -- as an rvalue or an lvalue -- can be seen
    (rather too crudely I fear) as adding one more level of indirection so
    that what we expect to happen (when we've got used to modern programming languages), happens.

    My point was that HLLs typically read or write values of variables without extra syntax.

    Indeed, and BLISS is not like that. I had hoped to shed some light on
    why there is some logic to BLISS's rather idiosyncratic design.

    Given a declaration like 'int A' then:

                                  BLISS    C

    Read or write A's value        .A      A

    I don't think that's right. To change the value at address A (what I
    think you mean by "write A's value") you write

    A = 42;

    in BLISS. And to add one to the value at address A you write

    A = .A + 1;

    Get A's address                A       &A

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to David Brown on Tue Aug 27 23:55:04 2024
    On Tue, 27 Aug 2024 21:53:11 +0200, David Brown wrote:

    ... or with ctypes and an external shared library - not
    directly from Python.

    ctypes is a standard Python library module, and it has low-level
    capabilities (like type-punning) that can be exercised independently of actually using it to load any external C code.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Lawrence D'Oliveiro on Wed Aug 28 01:28:22 2024
    On 28/08/2024 00:53, Lawrence D'Oliveiro wrote:
    On Tue, 27 Aug 2024 09:44:40 +0200, David Brown wrote:

    Python does not support any significant degree of low-level programming.

    A key example of low-level programming is control of hardware, which on
    most systems means accessing memory-mapped registers at specific
    addresses, reading and writing in specific orders. Python has no means
    to do any of that - C and C++ both provide this ability.

    I’ve got news for you: this kind of thing is perfectly doable in Python <https://docs.python.org/3/library/ctypes.html>.

    It's Python calling a special module to do the dirty work. That's not
    far removed from Python just invoking an external C program to do the job.

    By contrast, my scripting language can directly do the low level stuff.
    If there is a byte value at a certain address, it can access it like this:

    p:=makeref(0x40'0000, byte)
    println p^

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Wed Aug 28 01:39:04 2024
    On 28/08/2024 00:49, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 26/08/2024 13:30, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    BLISS is a rather strange language. For something supposedly lower level than
    C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs: usually if you declare a variable A, then you can access A's value just by writing A; its address is automatically dereferenced.
    Not always. This is where left- and right-evaluation came in. On the
    left of an assignment A denotes a "place" to receive a value. On the
    right, it denotes a value obtained from a place. CPL used the terms and C got them via BCPL's documentation. Viewed like this, BLISS just makes "evaluation" a universal concept.

    That doesn't explain why one language requires an explicit dereference in the source code, and the other doesn't.

    It does for me. If you think I can help, maybe you could ask some more questions as I don't know what else to say. BLISS uses addresses
    explicitly, so the rvalue/lvalue distinction is not a perfect match for
    what's going on, but it's close enough that I find it helpful.

    By "access A's value" I mean either read or write access.

    A denotes a "place" to receive a value. On the
    right, it denotes a value obtained from a place.

    This /is/ confusing as it suggests a different rank for A depending on
    whether it is an lvalue or rvalue, e.g. some difference in level of
    indirection. In fact that is the same on both sides.

    I don't know what you mean by rank here. The whole point of two
    different evaluations -- as an rvalue or an lvalue -- can be seen
    (rather too crudely I fear) as adding one more level of indirection so
    that what we expect to happen (when we've got used to modern programming languages), happens.

    My point was that HLLs typically read or write values of variables without extra syntax.

    Indeed, and BLISS is not like that. I had hoped to shed some light on
    why there is some logic to BLISS's rather idiosyncratic design.

    Given a declaration like 'int A' then:

                                  BLISS    C

    Read or write A's value        .A      A

    I don't think that's right. To change the value at address A (what I
    think you mean by "write A's value") you write

    A = 42;

    in BLISS. And to add one to the value at address A you write

    A = .A + 1;

    OK. That just makes it more bizarre than I'd thought. The example I
    saw included these lines:

    GETNUM(X); ! returns a value via X
    Y = STEP(.X);
    PUTNUM(.Y)

    So in an rvalue context: X reads its address; while .X reads its value.

    But in an lvalue one: Y writes its value; .Y may not be defined

    It looks asymmetric. C, like most languages, is symmetric: you write
    'A = A' with the same syntax on both sides.

    I assume that in BLISS, A = A is legal, but does something odd like copy
    A's address into itself.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Tue Aug 27 18:19:39 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Bart <bc@freeuk.com> writes:

    BLISS is a rather strange language. For something supposedly
    lower level than C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs:
    usually if you declare a variable A, then you can access A's
    value just by writing A; its address is automatically
    dereferenced.

    Not always. This is where left- and right-evaluation came in.
    On the left of an assignment A denotes a "place" to receive a
    value. On the right, it denotes a value obtained from a place.
    CPL used the terms and C got them via BCPL's documentation.
    Viewed like this, BLISS just makes "evaluation" a universal
    concept.

    As I recall, the terms "lvalue" and "rvalue" originated with CPL.
    The 'l' and 'r' suggest the left and right sides of an
    assignment.

    Disclaimer: I have a couple of CPL documents, and I don't see
    the terms "lvalue" and "rvalue" in a quick look. The PDFs are
    not searchable. If someone has better information, please post
    it. Wikipedia does say that the notion of "l-values" and
    "r-values" was introduced by CPL.

    I presume, since I mentioned the concepts coming from CPL, you are
    referring to specifically the short-form terms l- and r-values?

    I can't help with those specific terms as the document I have uses
    a mixture of terms like "the LH value of...", "left-hand
    expressions" and "evaluated in LH mode".

    The documents I have are unsearchable PDFs; they appear to be
    scans of paper documents.

    https://comjnl.oxfordjournals.org/content/6/2/134.full.pdf
    https://www.ancientgeek.org.uk/CPL/CPL_Elementary_Programming_Manual.pdf

    Do you have friendlier documents?

    The earliest that is searchable has this title page:

    UNIVERSITY OF LONDON INSTITUTE OF COMPUTER SCIENCE
    *************************************************
    THE UNIVERSITY MATHEMATICAL LABORATORY, CAMBRIDGE
    *************************************************
    CPL ELEMENTARY PROGRAMMING MANUAL
    Edition I (London)

    This document, written by the late John Buxton, was preserved by
    Bill Williams, formerly of London University?s Atlas support team.
    Bill has generously made it available to Dik Leatherdale who has
    OCRed and otherwise transcribed it for the Web. All errors should
    be reported to dik@leatherdale.net. The original appearance is
    respected as far as possible, but program text and narrative are
    distinguished by the use of different fonts. Transcriber's
    additions and 'corrections' are in red, hyperlinks in underlined
    purple. A contents list and a selection of references have been
    added inside the back cover.

    March 1965

    I don't know where I got it from. The other searchable one is just
    a PDF of the oft-cited paper "The main features of CPL" by Barron
    et al.

    My understanding is that the terms l-value and r-value, along with
    several other terms now standard in relation to programming
    languages, became widely used following a summer(?) course taught
    by Christopher Strachey. Some of the other terms are referential
    transparency and parametric polymorphism, IIRC.

    https://en.wikipedia.org/wiki/Fundamental_Concepts_in_Programming_Languages

    I believe it is possible to track down the notes from that course,
    if a diligent web search is employed. I remember reading a copy
    some years ago after finding one on the internet.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Tue Aug 27 19:38:14 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 26 Aug 2024 17:55:21 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 25 Aug 2024 17:48:14 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    That despite at least one language in the 1st category being
    pretty well designed, if more than a little over-engineered.

    Please, don't keep us in suspense. To what language do you refer?

    I thought that every reader understood that I meant Ada.

    I expect many readers did. If I had had to guess I would have
    guessed Rust.

    It means that the derogatory overtones of my past comments about Rust were
    too subtle.

    It could just be my memory. I don't try to keep track of who
    likes (or dislikes) which programming languages.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bonita Montero on Wed Aug 28 05:39:07 2024
    On Wed, 28 Aug 2024 06:59:40 +0200, Bonita Montero wrote:

    Am 28.08.2024 um 01:44 schrieb Lawrence D'Oliveiro:

    On the contrary, the Linux kernel is full of abstractions: the device
    layer, the network layer, the filesystem layer, the security layer ...

    C's abstractions are very low level.

    The Linux kernel abstractions are very high level. Look at how entirely different filesystems, even ones originating from entirely different OSes,
    can be handled through the common VFS layer.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bart on Wed Aug 28 05:45:01 2024
    On Wed, 28 Aug 2024 01:28:22 +0100, Bart wrote:

    On 28/08/2024 00:53, Lawrence D'Oliveiro wrote:

    On Tue, 27 Aug 2024 09:44:40 +0200, David Brown wrote:

    Python does not support any significant degree of low-level
    programming.

    A key example of low-level programming is control of hardware, which
    on most systems means accessing memory-mapped registers at specific
    addresses, reading and writing in specific orders. Python has no
    means to do any of that - C and C++ both provide this ability.
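
    As a concrete sketch of that kind of access in C - the register
    addresses and bit mask here are invented for illustration; the real
    ones come from the device's datasheet:

        #include <stdint.h>

        /* Hypothetical memory-mapped UART registers (made-up addresses). */
        #define UART_STATUS (*(volatile uint32_t *)0x40001000u)
        #define UART_DATA   (*(volatile uint32_t *)0x40001004u)
        #define TX_READY    (1u << 5)

        void uart_send(uint8_t byte)
        {
            while ((UART_STATUS & TX_READY) == 0)
                ;               /* fresh volatile read on every iteration */
            UART_DATA = byte;   /* volatile write, not reordered past the reads */
        }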

    I’ve got news for you: this kind of thing is perfectly doable in Python
    <https://docs.python.org/3/library/ctypes.html>.

    It's Python calling a special module to do the dirty work.

    It’s a standard Python module. It uses standard Python constructs like “«n» * «type»” to construct an array of «n» elements of «type». I/O to/
    from those objects are done using something called the “Buffer Protocol”, which is a core part of how Python works.

    I previously showed you how Python can even do low-level type
    discrimination at runtime -- try doing that with C. Maybe time to add
    the capability to C to call external Python modules, for help with
    this? "#include <pytypes.h>", anybody?

    That's not far removed from Python just invoking an external C program
    to do the job.

    But it is removed.

    And note that there is nothing in the C spec that prevents an
    implementation from doing it that way as well. So don't confuse
    implementation details with fundamental language characteristics.

    By contrast, my scripting language can directly do the low level stuff.

    But it couldn’t do the dynamic type casting, could it?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bonita Montero on Wed Aug 28 11:26:03 2024
    On 28/08/2024 07:02, Bonita Montero wrote:
    On 27.08.2024 at 20:54, David Brown wrote:

    And you were completely wrong when you said that.  Perhaps in /your/
    field of programming you are correct - but you are ignoring the rest
    of the world.

    You'd have to omit nearly the whole standard library to avoid
    exceptions. E.g. everything that has an allocator may throw bad_alloc.

    And in small systems programming, much of the standard library /is/
    omitted. When you are doing real-time or critical software, you avoid
    dynamic memory like the plague. The standard container classes (other
    than std::array<>) are inappropriate for most low-level or small-systems programming.

    The standard C++ libraries for gcc and clang will not throw exceptions
    if you have "-fno-exceptions" enabled - on allocation failure, they will
    go straight to abort(). You can do better by using custom allocators - alternatively, you can use more appropriate container classes that are
    suitable for such systems.


    Often that is correct.  Often it is /not/ correct.

    It's correct nearly every time, since usually you have more than a
    dozen instructions in a virtual function.

    No, it is not - function calls when the compiler cannot see, or cannot
    use, the definition of the called function can have significant
    overheads well above the cost of a dozen instructions. Virtual function
    calls add to that due to an extra layer of indirection. And many
    virtual functions are very small, simply reading or writing a variable.

    So any claims about how much of an overhead you have are meaningless -
    all that can be said is that they are sometimes relevant, sometimes not.


    For every one of your favourite big x86 chips sold, there will be a
    hundred small microcontrollers - none of which has branch prediction.

    Even the simple Cortex CPUs have branch prediction.


    No, they don't. The Cortex-M cores do not have branch prediction. Some
    have static speculative prefetch of branch targets from flash to reduce
    the delay on branch, but no more than that. These are not out-of-order processors, they have no speculative execution, and only the Cortex-M7
    is superscalar (slightly). This is, of course, a benefit - the cores
    are intended to be small and low power, and to have consistent execution
    times rather than maximal average throughput.

    (I am not even sure why you thought branch prediction was relevant here.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Vir Campestris on Wed Aug 28 14:01:37 2024
    On Tue, 27 Aug 2024 21:13:56 +0100
    Vir Campestris <vir.campestris@invalid.invalid> wrote:

    But do you think the one in my watch has one?

    It depends on the type of the watch.
    The main application processor core(s) of a smart watch most certainly
    have branch prediction.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Wed Aug 28 13:49:56 2024
    On Wed, 28 Aug 2024 11:26:03 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    (I am not even sure why you thought branch prediction was relevant
    here.)



    It is relevant.
    Sophisticated branch prediction + BTBs + deep speculation working
    together is a main reason for very good common-case performance of
    virtual function calls on "big" CPUs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Wed Aug 28 13:55:26 2024
    On Wed, 28 Aug 2024 11:26:03 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    No, they don't. The Cortex-M cores do not have branch prediction.

    Formally speaking, Cortex-M3/4 cores do have the most basic form of
    static branch prediction - a conditional branch is always predicted as non-taken.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bonita Montero on Wed Aug 28 13:43:23 2024
    On Wed, 28 Aug 2024 11:49:32 +0200
    Bonita Montero <Bonita.Montero@gmail.com> wrote:

    On 28.08.2024 at 11:26, David Brown wrote:

    No, they don't. ..

    Then the branch is the most expensive part, no matter if
    you're doing a call or a table-dispatch within a function.


    Cortex-M3 and M4 have a really short pipeline - 3 stages.
    So branches are not that expensive.
    The difference between a direct and an indirect branch is close to 2x
    even when the pointer is in maximally fast memory, and significantly
    more than that when the pointer is in flash, especially when the core
    runs at a relatively high frequency, say 150-200 MHz.
    A virtual function call is typically implemented as a double
    indirection, so it ends up even slower than a C-style call through a
    function pointer.
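
    Sketched in C, with invented names, those two cases look roughly like
    this - the virtual call needs two dependent loads before the branch,
    the plain function pointer only one:

        struct shape;                   /* forward declaration */

        struct shape_vtbl {
            double (*area)(const struct shape *);  /* one slot per virtual fn */
        };

        struct shape {
            const struct shape_vtbl *vtbl;  /* load 1: fetch the vtable pointer */
            double w, h;
        };

        static double rect_area(const struct shape *s) { return s->w * s->h; }
        static const struct shape_vtbl rect_vtbl = { rect_area };

        struct shape make_rect(double w, double h)
        {
            struct shape s = { &rect_vtbl, w, h };
            return s;
        }

        double shape_area(const struct shape *s)
        {
            return s->vtbl->area(s);    /* load 2: fetch the slot, then call */
        }

        /* a plain C function pointer: a single load before the call */
        double direct_area(double (*f)(const struct shape *),
                           const struct shape *s)
        {
            return f(s);
        }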

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bart on Wed Aug 28 14:21:19 2024
    On Sun, 25 Aug 2024 18:26:57 +0100
    Bart <bc@freeuk.com> wrote:

    On 25/08/2024 17:20, tTh wrote:
    On 8/25/24 17:30, Bart wrote:

    So what language goes between Assembly and C?

       Forth ?


    I had in mind languages classed as 'HLLs'. I'm not sure if Forth
    counts.


    They say that Forth is an HLL
    https://www.forth.com/forth/
    I tend to agree with them.
    My personal bar for an HLL is pretty low - any language in which one
    can write a useful program (not necessarily a big useful program or a
    *very* useful program) in a way that does not depend on the instruction
    set of the underlying hardware is an HLL.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bonita Montero on Wed Aug 28 15:06:32 2024
    On Wed, 28 Aug 2024 13:02:03 +0200
    Bonita Montero <Bonita.Montero@gmail.com> wrote:

    On 28.08.2024 at 12:43, Michael S wrote:

    Virtual function call is typically implemented as double
    indirection, so it ends up even slower than C-style call through
    function pointer.

    If you have a function<>-object there's no double indirection.


    Are you still talking about virtual functions or trying to shift a
    goalpost?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Wed Aug 28 14:25:05 2024
    On 28/08/2024 12:49, Michael S wrote:
    On Wed, 28 Aug 2024 11:26:03 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    (I am not even sure why you thought branch prediction was relevant
    here.)



    It is relevant.
    Sophisticated branch prediction + BTBs + deep speculation working
    together is a main reason for very good common-case performance of
    virtual function calls on "big" CPUs.


    Well, yes. But branch prediction on its own is not sufficient - it is
    not even the major part of the reason. Without speculative execution to
    at least some extent, a cpu is not going to be able to see into the
    virtual method table pointer, or the virtual method pointer itself, in
    order to pre-fetch the instructions of the virtual method. A
    non-virtual method is just a "call" instruction that relatively simple pre-fetching can handle.

    It is speculative execution, along with register renaming (so that the
    register moves typically found around function calls are "executed" in
    zero clock cycles), that greatly speeds up virtual method calls.
    Smarter instruction caches and return stack caches are also more relevant than
    branch prediction here. (Branch prediction is of course useful for
    other things.)

    And while microcontrollers sometimes have a limited form of branch
    prediction (such as prefetching the target from cache), the more
    numerous and smaller devices don't even have instruction caches.
    Certainly none of them have register renaming or speculative execution.

    Modern x86 processors are highly tuned to give fast execution of
    inefficient code - which is great for code with lots of indirection such
    as from virtual methods, as well as older code and poorly optimised
    code. Microcontroller processors are a different world there. So
    writing code that is efficient on a small microcontroller is very
    different from writing code for big processors - even though C++ (as
    well as C) can be an appropriate choice of language in both cases.

    (I know /you/ know all this, but Bonita is clearly clueless about it.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Wed Aug 28 06:31:25 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 26 Aug 2024 17:16:06 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    [..concerning "abstraction"..]

    Not that I am particularly fond of abstractions when I do see
    them.

    A most unexpected comment. IMO choosing the right abstractions
    to define may be the most important skill in developing large
    systems.

    You are probably right, but me being me, I am rarely able to grasp
    pure abstract things. Most typically, a first read about an abstract
    concept goes straight over my head.

    Abstraction doesn't have to mean abstruse. To some degree every
    function written defines, or partially defines, an abstraction.
    Consider printf() as an example - printf() provides a way of
    sending formatted output without having to worry about the details
    of how the formatting is done. A key property of abstractions in
    programming is offering a way to get something done without having
    to worry about how it is done. Sometimes what is being done is
    fairly simple to explain and understand, and other times more
    difficult, but both cases are abstractions.
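
    A toy sketch in C of the simple kind, with invented names - the
    caller states what to report, and how the line is formatted stays
    hidden behind the function:

        #include <stdio.h>

        /* Hypothetical helper: hides the formatting details from callers. */
        static void report_reading(const char *sensor, double celsius)
        {
            printf("%-10s %6.2f C\n", sensor, celsius);
        }

        int main(void)
        {
            report_reading("intake", 21.4);
            report_reading("exhaust", 68.0);
            return 0;
        }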

    It can't be helped by few examples, but success is not
    guaranteed.

    Presumably you mean it can be helped, etc. It's often true that
    examples greatly help in explaining what an abstraction does.
    However it is also often true that examples alone do not suffice.
    To be useful an abstraction must be understood as to what it does.
    More than that, an abstraction is not useful if the amount of effort
    needed to understand and use it is greater than the amount of effort
    required to do the same thing without making use of the supplied
    functions, type definitions, etc. In short, some documentation is
    needed. How much is needed varies a lot from case to case.

    Even after I have seemingly grasped the principle, when I start using
    an instance of an abstract thing, it's hard for me to stop thinking
    about gears and toothed wheels rotating under the hood.

    Mostly this is a matter of practice and habit. Of course there are
    cases where it can be helpful to think about what is going on inside
    an abstraction. But such cases should be rare. The primary purpose
    of defining an abstraction is so one doesn't have to think about how
    something is done, only what needs to be done. I'm confident you
    have enough discipline to acquire this habit if you make an effort
    to do so. After getting into the habit then you can start to think
    about when to make exceptions. But the first reaction should always
    be "It's on a need not to know basis."

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Michael S on Wed Aug 28 14:51:42 2024
    On 28/08/2024 12:21, Michael S wrote:
    On Sun, 25 Aug 2024 18:26:57 +0100
    Bart <bc@freeuk.com> wrote:

    On 25/08/2024 17:20, tTh wrote:
    On 8/25/24 17:30, Bart wrote:

    So what language goes between Assembly and C?

       Forth ?


    I had in mind languages classed as 'HLLs'. I'm not sure if Forth
    counts.


    They say that Forth is an HLL
    https://www.forth.com/forth/
    I tend to agree with them.


    From 'forth.com', and a commercial site at that? They're going to be completely impartial of course!

    This is the Ackermann function in Forth, from https://rosettacode.org/wiki/Category:Forth :

    : acker ( m n -- u )
        over 0= IF  nip 1+ EXIT  THEN
        swap 1- swap ( m-1 n -- )
        dup 0= IF  1+ recurse EXIT  THEN
        1- over 1+ swap recurse recurse ;

    Well, it's not assembly, but it doesn't have one characteristic of HLLs
    which is readability. The code for Ackermann can usually be trivially
    derived from its mathematical definition; not so here.
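
    For reference, that definition is:

        A(0, n) = n + 1
        A(m, 0) = A(m-1, 1)
        A(m, n) = A(m-1, A(m, n-1))    for m > 0, n > 0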

    So I say it fails the HLL test. But if it's not an HLL, it also fails
    on low-level control: the above doesn't concern itself with types or
    bitwidths for example; you might consider that an HLL trait, but it's
    one that belongs on the other side of C rather than below.

    (Below are versions in more conventional syntax.)


    My personal bar for an HLL is pretty low - any language in which one
    can write a useful program (not necessarily a big useful program or a
    *very* useful program) in a way that does not depend on the instruction
    set of the underlying hardware is an HLL.


    Then you'd need to include any intermediate languages or
    representations, since they are usually not tied to any specific target
    either.

    Although any static typing scheme will likely be crude, for example with
    type info specified per-instruction rather than centrally as in a normal
    HLL.



    -------

    (Examples from my languages)

    func ack(m,n)=
        case
        when m=0 then
            n+1
        when n=0 then
            ack(m-1,1)
        else
            ack(m-1,ack(m,n-1))
        esac
    end

    (Compact version)

    fun ack(m,n) = (m=0|n+1|(n=0|ack(m-1,1)|ack(m-1,ack(m,n-1))))

    (Example in Haskell; usually considered difficult, but here it is
    clearer than the Forth, where the algorithm or formula is
    indiscernible)

    ack :: Int -> Int -> Int
    ack 0 n = succ n
    ack m 0 = ack (m-1) 1
    ack m n = ack (m-1) (ack m (n-1))
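
    (And, this being comp.lang.c, a direct C transcription of the same
    definition for comparison - a sketch only, since the recursion depth
    explodes for all but tiny m and n:)

        unsigned ack(unsigned m, unsigned n)
        {
            if (m == 0) return n + 1;
            if (n == 0) return ack(m - 1, 1);
            return ack(m - 1, ack(m, n - 1));
        }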

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Wed Aug 28 15:47:13 2024
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Bart <bc@freeuk.com> writes:

    BLISS is a rather strange language. For something supposedly
    lower level than C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs:
    usually if you declare a variable A, then you can access A's
    value just by writing A; its address is automatically
    dereferenced.

    Not always. This is where left- and right-evaluation came in.
    On the left of an assignment A denotes a "place" to receive a
    value. On the right, it denotes a value obtained from a place.
    CPL used the terms and C got them via BCPL's documentation.
    Viewed like this, BLISS just makes "evaluation" a universal
    concept.

    As I recall, the terms "lvalue" and "rvalue" originated with CPL.
    The 'l' and 'r' suggest the left and right sides of an
    assignment.

    Disclaimer: I have a couple of CPL documents, and I don't see
    the terms "lvalue" and "rvalue" in a quick look. The PDFs are
    not searchable. If someone has better information, please post
    it. Wikipedia does say that the notion of "l-values" and
    "r-values" was introduced by CPL.

    I presume, since I mentioned the concepts coming from CPL, you are
    referring to specifically the short-form terms l- and r-values?

    I can't help with those specific terms as the document I have uses
    a mixture of terms like "the LH value of...", "left-hand
    expressions" and "evaluated in LH mode".

    The documents I have are unsearchable PDFs; they appear to be
    scans of paper documents.

    https://comjnl.oxfordjournals.org/content/6/2/134.full.pdf
    https://www.ancientgeek.org.uk/CPL/CPL_Elementary_Programming_Manual.pdf
    Do you have friendlier documents?

    The earliest that is searchable has this title page:

    UNIVERSITY OF LONDON INSTITUTE OF COMPUTER SCIENCE
    *************************************************
    THE UNIVERSITY MATHEMATICAL LABORATORY, CAMBRIDGE
    *************************************************
    CPL ELEMENTARY PROGRAMMING MANUAL
    Edition I (London)

    This document, written by the late John Buxton, was preserved by
    Bill Williams, formerly of London University's Atlas support team.
    Bill has generously made it available to Dik Leatherdale who has
    OCRed and otherwise transcribed it for the Web. All errors should
    be reported to dik@leatherdale.net. The original appearance is
    respected as far as possible, but program text and narrative are
    distinguished by the use of different fonts. Transcriber's
    additions and 'corrections' are in red, hyperlinks in underlined
    purple. A contents list and a selection of references have been
    added inside the back cover.

    March 1965

    I don't know where I got it from. The other searchable one is just
    a PDF of the oft-cited paper "The main features of CPL" by Barron
    et al.

    My understanding is the terms l-value and r-value, along with
    several other terms widely used in relation to programming
    languages, became widely used following a summer(?) course taught
    by Christopher Strachey. Some of the other terms are referential transparency and parametric polymorphism, IIRC.

    The earlier ('65 and '66) writings about CPL that I've seen all use the
    longer terms, and those lectures certainly use the short forms, so it
    seems clear this is when they came about and, since he is the sole
    author of the notes (unlike all the CPL documents that are group
    efforts), it's also likely he invented the terms.

    https://en.wikipedia.org/wiki/Fundamental_Concepts_in_Programming_Languages

    I believe it is possible to track down the notes from that course,
    if a diligent web search is employed. I remember reading a copy
    some years ago after finding one on the internet.

    I suspect there is no copyright-free PDF as the notes were published by
    an academic press.

    None the less well worth a read, even nearly 60 years later. Is it a
    shame that that is the case?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Wed Aug 28 15:57:30 2024
    Bart <bc@freeuk.com> writes:

    On 28/08/2024 00:49, Ben Bacarisse wrote:

    Indeed, and BLISS is not like that. I had hoped to shed some light on
    why there is some logic to BLISS's rather idiosyncratic design.

    Given a declaration like 'int A' then:

    BLISS C

    Read or write A's value .A A
    I don't think that's right. To change the value at address A (what I
    think you mean by "write A's value") you write
    A = 42;
    in BLISS. And to add one to the value at address A you write
    A = .A + 1;

    OK. That just makes it more bizarre than I'd thought.

    Curious. It's what makes it consistent, though it is definitely an
    uncommon approach.

    The example I saw
    included these lines:

    GETNUM(X); ! returns a value via X
    Y = STEP(.X);
    PUTNUM(.Y)

    So in an rvalue context: X reads its address; while .X reads its
    value.

    The whole point is to remove the two contexts. A variable name is
    /always/ an lvalue (which is why it can be assigned). C has an implicit
    lvalue to rvalue conversion in the contexts you have come to expect it.
    BLISS does not. You always need a dot to convert to an rvalue.
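
    A rough rendering of that model in C terms, assuming the analogy is
    fair - let a declared name denote an address, so every fetch must be
    spelled out:

        int storage;
        int *const A = &storage;  /* the name A denotes an address, as in BLISS */

        void demo(void)
        {
            *A = 42;              /* BLISS: A = 42                            */
            *A = *A + 1;          /* BLISS: A = .A + 1 ('.' = explicit fetch) */
        }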

    But in an lvalue one: Y writes its value; .Y may not be defined

    It looks asymmetric. C, like most languages, is symmetric: you write
    'A = A' with the same syntax on both sides.

    Since assignment is inherently asymmetric (you can't write 3 = A but you
    can write A = 3) C's syntactic symmetry hides a semantic difference.
    What is needed on the two sides is not the same.

    I assume that in BLISS, A = A is legal, but does something odd like copy
    A's address into itself.

    What's odd about that? And why call is a copy operation? Do you think
    of A = 42 as a copy operation? BLISS is a low-level system language.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Wed Aug 28 08:18:12 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Bart <bc@freeuk.com> writes:

    BLISS is a rather strange language. For something supposedly
    lower level than C, it doesn't have 'goto'.

    It is also typeless.

    There is also a key feature that sets it apart from most HLLs:
    usually if you declare a variable A, then you can access A's
    value just by writing A; its address is automatically
    dereferenced.

    Not always. This is where left- and right-evaluation came in.
    On the left of an assignment A denotes a "place" to receive a
    value. On the right, it denotes a value obtained from a place.
    CPL used the terms and C got them via BCPL's documentation.
    Viewed like this, BLISS just makes "evaluation" a universal
    concept.

    As I recall, the terms "lvalue" and "rvalue" originated with CPL.
    The 'l' and 'r' suggest the left and right sides of an
    assignment.

    Disclaimer: I have a couple of CPL documents, and I don't see
    the terms "lvalue" and "rvalue" in a quick look. The PDFs are
    not searchable. If someone has better information, please post
    it. Wikipedia does say that the notion of "l-values" and
    "r-values" was introduced by CPL.

    I presume, since I mentioned the concepts coming from CPL, you are
    referring to specifically the short-form terms l- and r-values?

    I can't help with those specific terms as the document I have uses
    a mixture of terms like "the LH value of...", "left-hand
    expressions" and "evaluated in LH mode".

    The documents I have are unsearchable PDFs; they appear to be
    scans of paper documents.

    https://comjnl.oxfordjournals.org/content/6/2/134.full.pdf
    https://www.ancientgeek.org.uk/CPL/CPL_Elementary_Programming_Manual.pdf
    Do you have friendlier documents?

    The earliest that is searchable has this title page:

    UNIVERSITY OF LONDON INSTITUTE OF COMPUTER SCIENCE
    *************************************************
    THE UNIVERSITY MATHEMATICAL LABORATORY, CAMBRIDGE
    *************************************************
    CPL ELEMENTARY PROGRAMMING MANUAL
    Edition I (London)

    This document, written by the late John Buxton, was preserved by
    Bill Williams, formerly of London University's Atlas support team.
    Bill has generously made it available to Dik Leatherdale who has
    OCRed and otherwise transcribed it for the Web. All errors should
    be reported to dik@leatherdale.net. The original appearance is
    respected as far as possible, but program text and narrative are
    distinguished by the use of different fonts. Transcriber's
    additions and 'corrections' are in red, hyperlinks in underlined
    purple. A contents list and a selection of references have been
    added inside the back cover.

    March 1965

    I don't know where I got it from. The other searchable one is just
    a PDF of the oft-cited paper "The main features of CPL" by Barron
    et al.

    My understanding is the terms l-value and r-value, along with
    several other terms widely used in relation to programming
    languages, became widely used following a summer(?) course taught
    by Christopher Strachey. Some of the other terms are referential
    transparency and parametric polymorphism, IIRC.

    The earlier ('65 and '66) writings about CPL that I've seen all use the longer terms, and those lectures certainly use the short forms, so it
    seems clear this is when they came about and, since he is the sole
    author of the notes (unlike all the CPL documents that are group
    efforts), it's also likely he invented the terms.

    Not sure if the 'this' refers to the 1965/66 writings or the later
    course. In any case my comment was only about when the terms became
    widely used, not when they were invented or first described.

    https://en.wikipedia.org/wiki/Fundamental_Concepts_in_Programming_Languages
    I believe it is possible to track down the notes from that course,
    if a diligent web search is employed. I remember reading a copy
    some years ago after finding one on the internet.

    I suspect there is no copyright-free PDF as the notes were published by
    an academic press.

    I remember finding a copy on the net somewhere perhaps 10 years ago
    (no claim that the number 10 is accurate!). If I had to bet I would
    bet it was a PDF, but of course I'm not sure of that. Also of course
    I don't know if the copy I found infringed any copyrights.

    None the less well worth a read, even nearly 60 years later. Is it a
    shame that that is the case?

    What I might call the mainstream programming language community has
    largely ignored the ideas that have come out of the declarative and
    functional programming languages and the experience of those who
    developed them. I consider the continuing relevance of the Strachey
    notes to be a symptom of that larger shortcoming.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bonita Montero on Wed Aug 28 16:51:40 2024
    On 28/08/2024 15:18, Bonita Montero wrote:
    On 28.08.2024 at 15:52, Thiago Adams wrote:

    You have to deallocate only if the ownership is still with the same object.
    This is not the case when the object is moved.

    If the compiler sees that it is moved, the whole deallocation
    is usually optimized away for the non-exception code path.

    To create a view object you need a new object, because the destructor
    cannot be disabled.

    It's optimized away for the non-exception code-path if the compiler
    sees it.


    Const-objects are there not to be modified. If you've got logical
    constness you can cast away the const with const_cast and move its
    contents. But casting away const is unclean mostly in C and C++.

    excuses..

    In C you'd have to do the same. And there's only one type of cast,
    and you could mistakenly cast to a non-fitting type, whereas in C++
    you've got const_cast<>, which may only differ in const-ness.

    It was created later, after people realized std::string was bad.

    The same for std::array etc... almost everything is a fix in C++.

    Array is there to have iterator-debugging on something that looks like
    a C-array. With C-arrays there's no such feature and it's harder
    to find such bugs. If I use static const C-style arrays which
    are directed to the DATA segment I always use a span<> on it to
    have iterator-debugging, whereas in C you have to take a lot of
    care.

    (I don't blame the people who created C++ in the past, but I think we
    now have sufficient information to make better choices, and these
    choices are not being made.)

    C is five to ten times more code for the same task.


    Suppose you take the source code for CPython, and suppose that comprises
    500K lines of C source. (It was half that a decade ago so it is feasible.)

    Are you suggesting that CPython written in C++ would only need 50-100K
    lines of C++ user code (excluding support libraries) instead of 500K?

    Would the resulting binary be bigger or smaller?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bart on Wed Aug 28 18:48:34 2024
    On 27.08.2024 01:47, Bart wrote:
    On 27/08/2024 00:33, Janis Papanagnou wrote:
    On 25.08.2024 20:24, Bart wrote:
    On 25/08/2024 19:12, Bonita Montero wrote:
    On 25.08.2024 at 18:28, Michael S wrote:

    Define "abstraction".

    This could have been looked up online (e.g. in a Wikipedia article).


    OOP, functional programming, generic programming, exceptions.

    (And there are yet more.)

    That isn't surprising. The code you constantly post always uses the most
    advanced features, uses every toy that is available, and the most
    elaborate algorithms.

    I'm not sure in what bubble you lived the past decades. The listed
    abstraction examples date back to the 1960's. They were realized in
    many programming languages,

    Perhaps not so much in the ones people used. Assembly? Fortran? Cobol?
    There have always been academic languages.

    As said, there are lots of languages. And since you cannot expect all
    of the thousands of existing languages to become hyped or used, there's
    of course "not so much" that people generally use. From the
    widely used ones - inspect Wikipedia or Google to find them! - just
    pick some and see what abstraction concepts they support (e.g. from
    the listed ones above). Living examples for OOP are many; C++, Java,
    or OO versions of long existing languages. For functional programming
    I've heard of e.g. Lisp(-Dialects) still widely used, and even C++'s
    STL implements a functional framework (in addition to genericity and
    OO). - It's presumably the bubble you're living in that prevents you
    from seeing that? If you had, intellectually or practically from your
    own experience, understood abstraction concepts, you'd probably see
    more clearly what that means and what advantages you gain from each of
    these abstraction concepts.


    including long existing ones as well as
    many contemporary ones. I suggest trying to understand the concepts
    if you want to reach the next experience level. :-)

    I sometimes use (and implement) such features in scripting code which
    has the support to use them effortlessly.

    I've rarely needed them for systems programming.

    If you're restricting yourself to a small subset of software
    engineering areas, some concepts may well be less useful to you. Though why
    you think that, e.g., OO concepts are not useful to be applied to
    systems programming is beyond me. I can only say that "thinking OO"
    is not naturally given, it's something you may instantly understand
    when you hear about it (given a proper experience and open mindset)
    or observe others how they use it advantageously.


    My comments were in connection with their clunky and abstruse
    implementations in C++, and BM's habit of posting C++ code full of
    gratuitous uses of such features.

    I cannot see what examples you have in mind or what it is that you
    find "clunky and abstruse". The intention of, e.g., OO design is
    certainly to make non-trivial code flexible and comprehensible,
    i.e. exactly the opposite of what you allege. Of course, nothing
    is for free, and a programmer must have a minimum of knowledge,
    experience, or openness of mind to understand that. And of course
    you can (as in any language [but Intercal]) design and code your
    programs more or less intelligibly; this is independent of using
    abstraction methods or not. And C++ is a special matter anyway;
    you have (inherited C-) things that contribute to abstruse code,
    and you have (specifically with the newer C++ standards) a lot of
    often cryptic-appearing features which make it hard, especially
    for C++ newbies. I suggest trying to separate the concepts from
    specific (strange appearing) language features or specific (bad)
    code samples.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Thiago Adams on Wed Aug 28 19:29:53 2024
    On 28.08.2024 15:52, Thiago Adams wrote:

    It was created later, after people realized std::string was bad.
    The same for std::array etc... almost everything is a fix in C++.

    You can consider them "fixes" of char* or char[] and sometype[],
    respectively, as something fundamentally amiss in C++'s "C"-base
    (i.e., if at all, it's rather a "C" fix).

    But there's much more to these two types than you showed here;
    you have to consider them in the STL context.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Tim Rentsch on Wed Aug 28 19:57:48 2024
    On 27.08.2024 03:16, Tim Rentsch wrote:
    [...]
    For example, in the Wikipedia entry sense of the term, the original
    BASIC is a high-level language, but I think most people would agree
    that it is not a very powerful language.

    I note you are saying "_original_ BASIC".

    This reminds me of someone (I think it was a university professor
    in the 1980's) saying that BASIC became a low-level language when
    some vendors introduced 'peek' and 'poke' into the language's set
    of functions.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Wed Aug 28 18:37:20 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 27.08.2024 03:16, Tim Rentsch wrote:
    [...]
    For example, in the Wikipedia entry sense of the term, the original
    BASIC is a high-level language, but I think most people would agree
    that it is not a very powerful language.

    I note you are saying "_original_ BASIC".

    This reminds me of someone (I think it was a university professor
    in the 1980's) saying that BASIC became a low-level language when
    some vendors introduced 'peek' and 'poke' into the language's set
    of functions.

    There were BASIC interpreters (and compilers) in the 1970s
    that supported calling functions written in other languages.

    The HP-3000 BASIC interpreter and compiler, for example.

    Side story:

    The HP-3000 BASIC interpreter was installed, for some reason,
    with the PH (Process Handling) Capability. The ability to
    call functions written in other languages allowed a user to
    use functionality otherwise restricted (e.g. changing job
    priorities for other users' jobs). In that instance, I had
    written a small SPL/3000 subroutine that accessed an MPE
    intrinsic gated by the PH capability and called it from a BASIC
    program.

    The operations staff removed that capability from the BASIC
    interpreter shortly thereafter :-)

    Note that this security issue only applied to interpreted
    BASIC programs - compiled programs were standalone
    executables which didn't inherit the permissions (capabilities)
    granted to the interpreter.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Wed Aug 28 19:26:24 2024
    On 28/08/2024 15:57, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 28/08/2024 00:49, Ben Bacarisse wrote:

    Indeed, and BLISS is not like that. I had hoped to shed some light on
    why there is some logic to BLISS's rather idiosyncratic design.

    Given a declaration like 'int A' then:

    BLISS C

    Read or write A's value .A A
    I don't think that's right. To change the value at address A (what I
    think you mean by "write A's value") you write
    A = 42;
    in BLISS. And to add one to the value at address A you write
    A = .A + 1;

    OK. That just makes it more bizarre than I'd thought.

    Curious. It's what makes it consistent, though it is definitely an
    uncommon approach.

    The example I saw
    included these lines:

    GETNUM(X); ! returns a value via X
    Y = STEP(.X);
    PUTNUM(.Y)

    So in an rvalue context: X reads its address; while .X reads its
    value.

    The whole point is to remove the two contexts. A variable name is
    /always/ an lvalue (which is why it can be assigned). C has an implicit lvalue to rvalue conversion in the contexts you have come to expect it.
    BLISS does not. You always need a dot to convert to an rvalue.

    This is the kind of thing I meant a few posts back. You don't need to
    take A (which refers to some place where you can store values), and tell
    it to fetch that value. Most HLLs will do that without being told.

    (My point was that that was a distinguishing feature of HLLs, which is
    missing in Forth for example.)


    But in an lvalue one: Y writes its value; .Y may not be defined

    It looks asymmetric. C, like most languages, is symmetric: you write
    'A = A' with the same syntax on both sides.

    Since assignment is inherently asymmetric (you can't write 3 = A but you
    can write A = 3) C's syntactic symmetry hides a semantic difference.
    What is needed on the two sides is not the same.

    I would argue that it is exactly the same. You seem to imply that
    lvalues and rvalues involve different levels of indirection.

    In most HLLs you use the same syntax whether for lvalue or rvalue (e.g. A
    = A).

    In intermediate representations it is also the same:

    Push A
    Pop A


    And in most assemblers you use the same syntax too:

    LOAD R, [A] # A is in memory
    STORE [A], R
    MOVE R, Ra # A is in a register
    MOVE Ra, R

    You will likely see similarities in instruction encodings too.

    The same applies when the terms are complex: P.m = P.m, A[i] = A[i], *Q
    = *Q.

    What's different between LHS and RHS is lvalues having an extra
    constraint. For example you might require an LHS to support an &
    operation, and require it to be mutable.

    So assigning to 42, or (A+B), won't work. Or it shouldn't do (anybody
    can implement a crazy language where these can either be given a
    meaning, or it just does something weird).


    I assume that in BLISS, A = A is legal, but does something odd like copy
    A's address into itself.

    What's odd about that? And why call is a copy operation? Do you think
    of A = 42 as a copy operation? BLISS is a low-level system language.


    Why do you mean by call? But if this is valid in BLISS:

    A = A

    then what /does/ it do except what I said? My point is that most people
    will expect this to copy the value of A to itself, in which case
    they've forgotten to write '.A' on the RHS.

    It is uncommon to want to do the equivalent of 'A = &A' and usually
    doesn't work with static typing anyway.
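
    In C, for instance, the direct form is rejected outright, and even the
    cast version below is only an illustration of the type mismatch (a
    sketch, not something you would write):

        #include <stdint.h>

        int A;

        void demo(void)
        {
            /* A = &A;             -- rejected: int and int * are incompatible */
            A = (int)(intptr_t)&A; /* legal only via explicit casts, and only
                                      meaningful if int can hold the address */
        }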

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Janis Papanagnou on Wed Aug 28 13:42:45 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 27.08.2024 03:16, Tim Rentsch wrote:

    [...]
    For example, in the Wikipedia entry sense of the term, the original
    BASIC is a high-level language, but I think most people would agree
    that it is not a very powerful language.

    I note you are saying "_original_ BASIC".

    Yes. And deliberately so.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Tim Rentsch on Wed Aug 28 23:22:08 2024
    On 28.08.2024 22:42, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 27.08.2024 03:16, Tim Rentsch wrote:

    [...]
    For example, in the Wikipedia entry sense of the term, the original
    BASIC is a high-level language, but I think most people would agree
    that it is not a very powerful language.

    I note you are saying "_original_ BASIC".

    Yes. And deliberately so.

    That was obvious to me (given what I wrote in the [stripped] text).

    I just emphasized it, because it was essential and could easily be
    overlooked.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Janis Papanagnou on Wed Aug 28 22:11:29 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 28.08.2024 20:37, Scott Lurndal wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 27.08.2024 03:16, Tim Rentsch wrote:
    [...]
    For example, in the Wikipedia entry sense of the term, the original
    BASIC is a high-level language, but I think most people would agree
    that it is not a very powerful language.

    I note you are saying "_original_ BASIC".

    This reminds me of someone (I think it was a university professor
    in the 1980's) saying that BASIC became a low-level language when
    some vendors introduced 'peek' and 'poke' into the language's set
    of functions.

    There were BASIC interpreters (and compilers) in the 1970s
    that supported calling functions written in other languages.

    The HP-3000 BASIC interpreter and compiler, for example.

    Myself I've never seen or worked with such a system.

    I should have included a link to the reference manual:

    http://bitsavers.org/pdf/hp/3000/mpeII/30000-90026_Aug-1978.pdf

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Scott Lurndal on Wed Aug 28 23:18:52 2024
    On 28.08.2024 20:37, Scott Lurndal wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 27.08.2024 03:16, Tim Rentsch wrote:
    [...]
    For example, in the Wikipedia entry sense of the term, the original
    BASIC is a high-level language, but I think most people would agree
    that it is not a very powerful language.

    I note you are saying "_original_ BASIC".

    This reminds me of someone (I think it was a university professor
    in the 1980's) saying that BASIC became a low-level language when
    some vendors introduced 'peek' and 'poke' into the language's set
    of functions.

    There were BASIC interpreters (and compilers) in the 1970s
    that supported calling functions written in other languages.

    The HP-3000 BASIC interpreter and compiler, for example.

    Myself I've never seen or worked with such a system. The first
    BASIC systems I worked with were an Olivetti P6060 (a compiler;
    with a BASIC that had an immense command set, plus libraries
    for graphical plotting and matrix computations), then a Wang
    system (with a cassette tape recorder that worked like a real
    mainframe tape-system with high speed positioning, but I don't
    recall its BASIC commands), and some popular Commodore systems
    (PET, CBM; these had peek and poke). There were just too many
    dialects around in those days. So that language is, feature-wise,
    hard to compare WRT the subthread's topic of being an HLL or LLL.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Thu Aug 29 00:43:19 2024
    Bart <bc@freeuk.com> writes:

    On 28/08/2024 15:57, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 28/08/2024 00:49, Ben Bacarisse wrote:

    Indeed, and BLISS is not like that. I had hoped to shed some light on
    why there is some logic to BLISS's rather idiosyncratic design.

    Given a declaration like 'int A' then:

    BLISS C

    Read or write A's value .A A
    I don't think that's right. To change the value at address A (what I
    think you mean by "write A's value") you write
    A = 42;
    in BLISS. And to add one to the value at address A you write
    A = .A + 1;

    OK. That just makes it more bizarre than I'd thought.
    Curious. It's what makes it consistent, though it is definitely an
    uncommon approach.

    The example I saw
    included these lines:

    GETNUM(X); ! returns a value via X
    Y = STEP(.X);
    PUTNUM(.Y)

    So in an rvalue context: X reads its address; while .X reads its
    value.
    The whole point is to remove the two contexts. A variable name is
    /always/ an lvalue (which is why it can be assigned). C has an implicit
    lvalue to rvalue conversion in the contexts you have come to expect it.
    BLISS does not. You always need a dot to convert to an rvalue.

    This is the kind of thing I meant a few posts back. You don't need to take
    A (which refers to some place where you can store values), and tell it to fetch that value. Most HLLs will do that without being told.

    (My point was that that was a distinguishing feature of HLLs, which is missing in Forth for example.)

    We are talking at cross purposes then. I was not addressing anything
    about your view of what makes an HLL.

    But in an lvalue one: Y writes its value; .Y may not be defined

    It looks asymmetric. C, like most languages, is symmetric: you write
    'A = A' with the same syntax on both sides.
    Since assignment is inherently asymmetric (you can't write 3 = A but you
    can write A = 3) C's syntactic symmetry hides a semantic difference.
    What is needed on the two sides is not the same.

    I would argue that it is exactly the same.

    How do you argue that, given that A=3 is allowed and 3=A is not?

    ...
    I assume that in BLISS, A = A is legal, but does something odd like copy
    A's address into itself.
    What's odd about that? And why call is a copy operation? Do you think
    of A = 42 as a copy operation? BLISS is a low-level system language.

    Why do you mean by call?

    Typo. I meant to write... And why call /it/ a copy-operation? Do you
    think of A = 42 as a copy operation?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Janis Papanagnou on Wed Aug 28 22:36:56 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 28.08.2024 22:42, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 27.08.2024 03:16, Tim Rentsch wrote:

    [...]
    For example, in the Wikipedia entry sense of the term, the original
    BASIC is a high-level language, but I think most people would agree
    that it is not a very powerful language.

    I note you are saying "_original_ BASIC".

    Yes. And deliberately so.

    That was obvious to me [...]

    I replied with the above comment in case it may not have
    been obvious to someone else.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Thu Aug 29 10:41:39 2024
    On 28/08/2024 15:51, Bart wrote:
    On 28/08/2024 12:21, Michael S wrote:
    On Sun, 25 Aug 2024 18:26:57 +0100
    Bart <bc@freeuk.com> wrote:

    On 25/08/2024 17:20, tTh wrote:
    On 8/25/24 17:30, Bart wrote:

    So what language goes between Assembly and C?

         Forth ?

    I had in mind languages classed as 'HLLs'. I'm not sure if Forth
    counts.


    They say that Forth is an HLL
    https://www.forth.com/forth/
    I tend to agree with them.


    From 'forth.com', and a commercial site at that? They're going to be completely impartial of course!

    This is the Ackermann function in Forth, from https://rosettacode.org/wiki/Category:Forth :

    : acker ( m n -- u )
        over 0= IF  nip 1+ EXIT  THEN
        swap 1- swap ( m-1 n -- )
        dup  0= IF  1+  recurse EXIT  THEN
        1- over 1+ swap recurse recurse ;

    Well, it's not assembly, but it doesn't have one characteristic of HLLs
    which is readability. The code for Ackermann can usually be trivially
    derived from its mathematical definition; not so here.

    Yoda you are if Forth readable then.

    More seriously, Forth has a very consistent stack-based and post-fix
    notation. Round parentheses are used for comments - knowing that little
    fact makes a huge difference when you are trying to read it! White
    space is essential between words, but the amount and type of white
    space doesn't matter. And a "word" can be pretty much any combination
    of letters, numbers, punctuation, etc. Even integer constants (which
    are either a sequence of digits or a 0xabcd-style hex constant,
    optionally preceded by a minus sign) can be redefined as words. It is
    not really a good idea to do that, however.


    So I say it fails the HLL test. But if it's not an HLL, it also fails
    on low-level control: the above doesn't concern itself with types or
    bitwidths for example; you might consider that an HLL trait, but it's
    one that belongs on the other side of C rather than below.


    Types are high level concepts - Forth does not have them as such.
    (There are ways to make something like C structs, and you can build up higher-level features from Forth fundamentals, but this is not the
    newsgroup for the details of that. Plus, I have no idea how to do it
    and would have to look it up!) It is also fairly unstructured as a
    language - words (functions) can add or remove data from the stack, and
    they do not have to do so consistently for all paths through the code.

    Forth cell sizes are typically the same size as C "int" on the same
    target. (There is no direct connection here, but the same reasoning
    about range and efficiency applies to both languages.) For other sizes, specific operators are used since there are no types. But you most
    certainly can operate on data of different bitwidths, and access memory
    at specific addresses, as required for low-level work.

    Forth is not a very popular language these days, but much of its use has
    been in low-level work and embedded systems - that's certainly where you
    will find many Forth tools. For /really/ small microcontrollers, such
    as the 4-bit devices that used to dominate in numbers while remaining
    almost hidden to most developers, the assembly /is/ Forth. And in most
    Forth systems, you have not only support for embedded assembly, but the assembler itself is built in (using Forth-style syntax).

    One of the important uses of Forth was to have drivers for hardware
    devices and plug-in cards for workstations, because the code could be
    very compact, easily stored in a small serial rom/flash device, fully
    independent of the processor in the workstation, and mostly independent
    of the OS in the workstation. Of course, that changed as the
    Windows/x86 trashed the workstation market and replaced 10 KB of
    portable Forth driver code with 100+ MB of Windows-specific crapware for
    a simple network card.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Thu Aug 29 11:35:35 2024
    On 29/08/2024 00:43, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 28/08/2024 15:57, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 28/08/2024 00:49, Ben Bacarisse wrote:

    Indeed, and BLISS is not like that. I had hoped to shed some light on
    why there is some logic to BLISS's rather idiosyncratic design.

    Given a declaration like 'int A' then:

    BLISS C

    Read or write A's value .A A
    I don't think that's right. To change the value at address A (what I
    think you mean by "write A's value") you write
    A = 42;
    in BLISS. And to add one to the value at address A you write
    A = .A + 1;

    OK. That just makes it more bizarre than I'd thought.
    Curious. It's what makes it consistent, though it is definitely an
    uncommon approach.

    The example I saw
    included these lines:

    GETNUM(X); ! returns a value via X
    Y = STEP(.X);
    PUTNUM(.Y)

    So in an rvalue context: X reads its address; while .X reads its
    value.
    The whole point is to remove the two contexts. A variable name is
    /always/ an lvalue (which is why it can be assigned). C has an implicit
    lvalue to rvalue conversion in the contexts you have come to expect it.
    BLISS does not. You always need a dot to convert to an rvalue.

    This is the kind of thing I meant a few posts back. You don't need to take
    A (which refers to some place where you can store values), and tell it to
    fetch that value. Most HLLs will do that without being told.

    (My point was that that was a distinguishing feature of HLLs, which is
    missing in Forth for example.)

    We are talking at cross purposes then. I was not addressing anything
    about your view of what makes an HLL.

    But in an lvalue one: Y writes its value; .Y may not be defined

    It looks asymmetric. C, like most languages, is symmetric: you write
    'A = A' with the same syntax on both sides.
    Since assignment is inherently asymmetric (you can't write 3 = A but you >>> can write A = 3) C's syntactic symmetry hides a semantic difference.
    What is needed on the two sides is not the same.

    I would argue that it is exactly the same.

    How do you argue that, given that A=3 is allowed and 3=A is not?

    I explained that. LHS and RHS can be identical terms for assignment in
    pretty much every aspect, but there are extra constraints on the LHS.

    In the case of :

    (c?a:b) = (z?x:y);

    C won't allow it, but some other languages will.
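
    The usual C workaround is to select the object via its address instead
    (and C++, for what it's worth, accepts the direct form, since its
    conditional operator can yield an lvalue). A sketch:

        int a, b, x, y, c, z;

        void assign_selected(void)
        {
            /* (c ? a : b) = (z ? x : y);   -- constraint violation in C */
            *(c ? &a : &b) = z ? x : y;     /* pick the lvalue via its address */
        }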

    Remember that the programmer can only express their intentions in the
    form of syntax.

    ...
    I assume that in BLISS, A = A is legal, but does something odd like copy >>>> A's address into itself.
    What's odd about that? And why call is a copy operation? Do you think
    of A = 42 as a copy operation? BLISS is a low-level system language.

    Why do you mean by call?

    Typo. I meant to write... And why call /it/ a copy-operation? Do you
    think of A = 42 as a copy operation?

    If '=' means assignment, then what else is it?

    Depending on the language, it might be shallow, or deep, or something
    else, depending on how it works or what is being assigned.

    According to Wikipedia:

    "In computer programming, an assignment statement sets and/or re-sets
    the value stored in the storage location(s) denoted by a variable name;
    in other words, it COPIES a value into the variable"

    (My emphasis.)

    I don't know why you're always so contradictory. Is it a game trying to
    catch me out on some pedantry? It seems to be popular here.

    This subthread started with me asking which HLL goes between Assembly
    and C, if C was supposedly mid-level. I don't know how it got onto
    discussing what exactly assignment means.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Thu Aug 29 13:35:47 2024
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 00:43, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 28/08/2024 15:57, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 28/08/2024 00:49, Ben Bacarisse wrote:

    Indeed, and BLISS is not like that. I had hoped to shed some light on
    why there is some logic to BLISS's rather idiosyncratic design.

    Given a declaration like 'int A' then:

                              BLISS    C

    Read or write A's value     .A     A

I don't think that's right. To change the value at address A (what I
think you mean by "write A's value") you write
    A = 42;
    in BLISS. And to add one to the value at address A you write
    A = .A + 1;

OK. That just makes it more bizarre than I'd thought.
    Curious. It's what makes it consistent, though it is definitely an
    uncommon approach.

    The example I saw
    included these lines:

    GETNUM(X); ! returns a value via X
    Y = STEP(.X);
    PUTNUM(.Y)

    So in an rvalue context: X reads its address; while .X reads its
    value.
    The whole point is to remove the two contexts. A variable name is
/always/ an lvalue (which is why it can be assigned). C has an implicit
lvalue to rvalue conversion in the contexts you have come to expect it.
BLISS does not. You always need a dot to convert to an rvalue.

This is the kind of thing I meant a few posts back. You don't need to take
A (which refers to some place where you can store values), and tell it to
fetch that value. Most HLLs will do that without being told.

    (My point was that that was a distinguishing feature of HLLs, which is
    missing in Forth for example.)
    We are talking at cross purposes then. I was not addressing anything
    about your view of what makes an HLL.

    But in an lvalue one: Y writes its value; .Y may not be defined

    It looks asymmetric. C like most languages is symmetric, you write 'A = A'
    with the same syntax on both sides.
Since assignment is inherently asymmetric (you can't write 3 = A but you
can write A = 3) C's syntactic symmetry hides a semantic difference.
    What is needed on the two sides is not the same.

    I would argue that it is exactly the same.
    How do you argue that, given that A=3 is allowed and 3=A is not?

    I explained that. LHS and RHS can be identical terms for assignment in
    pretty much every aspect, but there are extra constraints on the LHS.

    So you use "exactly the same" to mean "exactly the same except for the differences". I don't think this point needs any more discussion, do
    you?

    In the case of :

    (c?a:b) = (z?x:y);

    C won't allow it, but some other languages will.

    Remember that the programmer can only express their intentions in the form
    of syntax.

    ...
I assume that in BLISS, A = A is legal, but does something odd like copy
A's address into itself.
What's odd about that? And why call is a copy operation? Do you think
of A = 42 as a copy operation? BLISS is a low-level system language.

What do you mean by call?
    Typo. I meant to write... And why call /it/ a copy-operation? Do you
    think of A = 42 as a copy operation?

    If '=' means assignment, then what else is it?

That was my question. You called it (in BLISS) a "copy" operation.
Why did you use that term rather than just saying that "A = A assigns
the address of A to the location A". I'm trying to find out if your
    use of the word copy rather than assign is interesting in some way.

    I don't know why you're always so contradictory. Is it a game trying to
catch me out on some pedantry? It seems to be popular here.

    I wanted to explain how BLISS gets rid of the lvalue/rvalue distinction
    because you seemed to have misunderstood it.

    This subthread started with me asking which HLL goes between Assembly and
    C, if C was supposedly mid-level. I don't know how it got on discussing
    what exactly assignment means.

    Because, unlike you, I want to understand you before commenting. It was
a trivial question (made more complex by a typo of mine, for which I'm
    sorry). You found BLISS's meaning of A = A to be "odd" and you
    explained the "odd" by using the word "copy" rather than "assigns". I
    just wanted to know if there was more behind your use of the word.

    I don't think there is anything that needs further explanation because I
    think you just said "copy" when "assigns" would have done.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Thu Aug 29 14:10:29 2024
    On 29/08/2024 13:35, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    I explained that. LHS and RHS can be identical terms for assignment in
    pretty much every aspect, but there are extra constraints on the LHS.

    So you use "exactly the same" to mean "exactly the same except for the differences".

    No, I do mean exactly the same, both in terms of syntax and (in my implementations, which are likely typical) internal representation of
    those terms.

    There are no differences other than where the type system says your code
is invalid. So there are no differences when considering only valid programs.

    This program in my language:

    42 := 42

    is valid syntax. It passes the next step too. It is caught later on.
    Other languages/implementations may object earlier.

    Some may make it invalid syntax via a stricter grammar, but if I try '42
    = 42' in C, then I don't get a syntax error; I get a message about 42
    not being an lvalue. (From what I can make of C's grammar, a constant is allowed on the left of an assignment operator.)


    I don't know why you're always so contradictory. Is it a game trying to
catch me out on some pedantry? It seems to be popular here.

    I wanted to explain how BLISS gets rid of the lvalue/rvalue distinction because you seemed to have misunderstood it.

    It seems to make a dog's dinner of it. I think even in Lisp you just
    write (setf a a) or something like that. And here you are assigning a's
    value to itself.

    (I never came across BLISS even though I used DEC equipment. I did
    implement a lower level language for PDP10 though, more of a HLA. But
even there, you would write A => A to assign A's value to itself. I
    can't remember how you obtained a reference to A.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Thu Aug 29 16:13:26 2024
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 13:35, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    I explained that. LHS and RHS can be identical terms for assignment in
    pretty much every aspect, but there are extra constraints on the LHS.
    So you use "exactly the same" to mean "exactly the same except for the
    differences".

    No, I do mean exactly the same, both in terms of syntax and (in my implementations, which are likely typical) internal representation of those terms.

There are no differences other than where the type system says your code is invalid. So there are no differences when considering only valid programs.

    This program in my language:

    42 := 42

    is valid syntax.

    So what? We were talking about assignment in C. You cut the two
    previous quotes where it was clear we were talking about C. This is not
an honest thing to do. You are arguing for the sake of it, and in a
    dishonest way too.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Kaz Kylheku on Thu Aug 29 16:45:06 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 13:35, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

I explained that. LHS and RHS can be identical terms for assignment in
pretty much every aspect, but there are extra constraints on the LHS.
So you use "exactly the same" to mean "exactly the same except for the
differences".

    No, I do mean exactly the same, both in terms of syntax and (in my
implementations, which are likely typical) internal representation of those
terms.

There are no differences other than where the type system says your code is
invalid. So there are no differences when considering only valid programs.

    This program in my language:

    42 := 42

    is valid syntax.

    So what? We were talking about assignment in C. You cut the two
    previous quotes where it was clear we were talking about C. This is not
an honest thing to do. You are arguing for the sake of it, and in a
    dishonest way too.

    It's also valid syntax in C, with a constraint violation that can be
    "caught later on" in an implementation of C, just like in Bart's
    language.

    Have you taken Bart's bait and are now discussing a narrower context?

    The claim that C's assignment is symmetric and what is required on the
    two sides is exactly the same is junk. C's assignment has different
    syntax on each side, and what is required is even more strict.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Ben Bacarisse on Thu Aug 29 15:40:21 2024
    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 13:35, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

I explained that. LHS and RHS can be identical terms for assignment in
pretty much every aspect, but there are extra constraints on the LHS.
    So you use "exactly the same" to mean "exactly the same except for the
    differences".

    No, I do mean exactly the same, both in terms of syntax and (in my
implementations, which are likely typical) internal representation of those
terms.

There are no differences other than where the type system says your code is
invalid. So there are no differences when considering only valid programs.

    This program in my language:

    42 := 42

    is valid syntax.

    So what? We were talking about assignment in C. You cut the two
    previous quotes where it was clear we were talking about C. This is not
an honest thing to do. You are arguing for the sake of it, and in a dishonest way too.

    It's also valid syntax in C, with a constraint violation that can be
    "caught later on" in an implementation of C, just like in Bart's
    language.

    ISO C doesn't say anything about when errors are caught, other than it
    being associated with translation phase 7:

    White-space characters separating tokens are no longer significant. Each
    preprocessing token is converted into a token. The resulting tokens are
    syntactically and semantically analyzed and translated as a translation
    unit.

A constraint violation like the need for an lvalue could be caught during
    the activity denoted by "semantically analyzed" rather than that denoted
    by "syntactically analyzed" which would count as "later on" with regard
    to syntax.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Ben Bacarisse on Thu Aug 29 15:58:52 2024
    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 13:35, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

I explained that. LHS and RHS can be identical terms for assignment in
pretty much every aspect, but there are extra constraints on the LHS.
So you use "exactly the same" to mean "exactly the same except for the
differences".

    No, I do mean exactly the same, both in terms of syntax and (in my
    implementations, which are likely typical) internal representation of those
    terms.

    There are no differences other than where the type system says your code is
invalid. So there are no differences when considering only valid programs.

    This program in my language:

    42 := 42

    is valid syntax.

    So what? We were talking about assignment in C. You cut the two
previous quotes where it was clear we were talking about C. This is not
an honest thing to do. You are arguing for the sake of it, and in a
    dishonest way too.

    It's also valid syntax in C, with a constraint violation that can be
    "caught later on" in an implementation of C, just like in Bart's
    language.

    Have you taken Bart's bait and are now discussing a narrower context?

    The claim that C's assignment is symmetric and what is required on the
    two sides is exactly the same is junk. C's assignment has different
    syntax on each side, and what is required is even more strict.

    In the ISO C grammar for assignment, there is a "unary expression" on
    the left and an "assignment expression" on the right. That's just a
    particular factoring of the grammar that implementors don't have to
    follow, if the correct results are produced.
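
For reference, the production being described (as given in C11 6.5.16):

    assignment-expression:
        conditional-expression
        unary-expression assignment-operator assignment-expression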

    Under a parser generator tool we could have a production rule like
    expr '=' expr , where the '=' token has an elsewhere-declared
    associativity and precedence.

    The basic idea that the same syntactic kind of thing is on both sides of
    a C assignment (with an additional lvalue constraint) is valid;
    it's just not literally true if we are discussing the details of how
    ISO C expresses the grammar.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Kaz Kylheku on Thu Aug 29 17:06:06 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 13:35, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

I explained that. LHS and RHS can be identical terms for assignment in
pretty much every aspect, but there are extra constraints on the LHS.
So you use "exactly the same" to mean "exactly the same except for the
differences".

    No, I do mean exactly the same, both in terms of syntax and (in my
    implementations, which are likely typical) internal representation of those
    terms.

    There are no differences other than where the type system says your code is
invalid. So there are no differences when considering only valid programs.
    This program in my language:

    42 := 42

    is valid syntax.

    So what? We were talking about assignment in C. You cut the two
previous quotes where it was clear we were talking about C. This is not
an honest thing to do. You are arguing for the sake of it, and in a
    dishonest way too.

    It's also valid syntax in C, with a constraint violation that can be
    "caught later on" in an implementation of C, just like in Bart's
    language.

    Have you taken Bart's bait and are now discussing a narrower context?

    The claim that C's assignment is symmetric and what is required on the
    two sides is exactly the same is junk. C's assignment has different
    syntax on each side, and what is required is even more strict.

    In the ISO C grammar for assignment, there is a "unary expression" on
    the left and an "assignment expression" on the right. That's just a particular factoring of the grammar that implementors don't have to
    follow, if the correct results are produced.

    I can't see what it is you object to in what I wrote. I don't disagree
    with anything you are saying (the "correct result" being to reject a
    program that has, syntactically, the wrong thing on the left hand side).

    Under a parser generator tool we could have a production rule like
    expr '=' expr , where the '=' token has an elsewhere-declared
    associativity and precedence.

    The basic idea that the same syntactic kind of thing is on both sides of
    a C assignment (with an additional lvalue constraint) is valid;
    it's just not literally true if we are discussing the details of how
    ISO C expresses the grammar.

    A C program that has the wrong syntax (for example x+1) on the left hand
    side of an assignment must be rejected. I'm not relying on some fussy definition about how the syntax is written but making a point that what
is required on each side is not exactly the same thing. Do you really
    disagree with that?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Thu Aug 29 18:08:13 2024
    On 29/08/2024 17:06, Ben Bacarisse wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 13:35, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

I explained that. LHS and RHS can be identical terms for assignment in
pretty much every aspect, but there are extra constraints on the LHS.
So you use "exactly the same" to mean "exactly the same except for the
differences".

No, I do mean exactly the same, both in terms of syntax and (in my
implementations, which are likely typical) internal representation of those
    terms.

    There are no differences other than where the type system says your code is
invalid. So there are no differences when considering only valid programs.
    This program in my language:

    42 := 42

    is valid syntax.

    So what? We were talking about assignment in C. You cut the two
previous quotes where it was clear we were talking about C. This is not
an honest thing to do. You are arguing for the sake of it, and in a
dishonest way too.

    It's also valid syntax in C, with a constraint violation that can be
    "caught later on" in an implementation of C, just like in Bart's
    language.

    Have you taken Bart's bait and are now discussing a narrower context?

    The claim that C's assignment is symmetric and what is required on the
    two sides is exactly the same is junk. C's assignment has different
    syntax on each side, and what is required is even more strict.

    In the ISO C grammar for assignment, there is a "unary expression" on
    the left and an "assignment expression" on the right. That's just a
    particular factoring of the grammar that implementors don't have to
    follow, if the correct results are produced.

    I can't see what it is you object to in what I wrote. I don't disagree
    with anything you are saying (the "correct result" being to reject a
    program that has, syntactically, the wrong thing on the left hand side).

    Under a parser generator tool we could have a production rule like
    expr '=' expr , where the '=' token has an elsewhere-declared
    associativity and precedence.

    The basic idea that the same syntactic kind of thing is on both sides of
    a C assignment (with an additional lvalue constraint) is valid;
    it's just not literally true if we are discussing the details of how
    ISO C expresses the grammar.

    A C program that has the wrong syntax (for example x+1) on the left hand
    side of an assignment must be rejected. I'm not relying on some fussy definition about how the syntax is written but making a point that what
is required on each side is not exactly the same thing. Do you really disagree with that?


    So what exactly is different about the LHS and RHS here:

    A = A;

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while 'A = A' is
    also valid, there is a hidden mismatch in indirection levels between
    left and right. It is asymmetric while in C it is symmetric, although
you seem to disagree on that latter point.)

    A C program that has the wrong syntax (for example x+1) on the left hand side of an assignment must be rejected.

    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

are both valid syntax in C. Each will fail for a different reason: a '+'
    term is not a valid lvalue.

    So will this:

    x = z;

when x and z have incompatible types, even though you must agree the
    syntax is valid, and even when x is a valid lvalue.
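
A concrete instance of that last case, with hypothetical declarations:

    struct S { int i; } z;
    int x;

    x = z;    /* valid syntax, x is a modifiable lvalue, but the    */
              /* operands violate the type constraints of 6.5.16.1  */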

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Ben Bacarisse on Thu Aug 29 14:26:02 2024
    On 8/29/24 12:06, Ben Bacarisse wrote:
    ...
    I can't see what it is you object to in what I wrote. I don't disagree
    with anything you are saying (the "correct result" being to reject a
    program that has, syntactically, the wrong thing on the left hand side).

    No - the only requirement is that a diagnostic be produced. A fully
    conforming implementation of C is allowed to accept such code and then
    generate an executable; if you choose to execute the executable, the
    behavior is undefined.
    The only construct for which rejection is mandatory is a #error
    directive that survives conditional compilation. Note that a #error
    directive that contains or is a syntax error or a constraint violation
    would invalidate that requirement, allowing the program to be accepted.
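
A minimal sketch of that one mandatory case (the macro name is made up):

    #if !defined(FEATURE_PRESENT)
    #error FEATURE_PRESENT is required
    #endif

    /* If FEATURE_PRESENT is not defined, the #error survives conditional */
    /* compilation and the implementation must not successfully translate */
    /* the translation unit.                                              */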

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bart on Thu Aug 29 23:43:57 2024
    On Wed, 28 Aug 2024 19:26:24 +0100
    Bart <bc@freeuk.com> wrote:


    In most HLLs you use the same syntax whether for lvalue or rvalue
    (eg. A = A).


Off the top of my head.

    Windows cmd shell:
    set y=%x%

    TCL:
set y $x

    bash shell:
    y=$x

    There should be many more examples like those above.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Thu Aug 29 22:29:39 2024
    On 29/08/2024 21:30, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    So what exactly is different about the LHS and RHS here:

    A = A;

    The RHS is evaluated to determine the current value stored in the object named A. The LHS is evaluated to determine the object that's designated
    by the name A; its current value is irrelevant.

    Sure, but the same thing happens on both sides: one ends up performing a
    Read via that Lvalue, and the other does a Write via that Lvalue.

    In C terms, the RHS undergoes *lvalue conversion*, where an expression
    that's an lvalue is converted to the value stored in the designated
    object. The LHS does not undergo lvalue conversion.

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while 'A = A'
    is also valid, there is a hidden mismatch in indirection levels
    between left and right. It is asymmetric while in C it is symmetric,
although you seem to disagree on that latter point.)

    Because BLISS, unlike C, does not have implicit lvalue conversion; the
    prefix "." operator performs explicit lvalue conversion. I presume the
    "." operator isn't specific to assignments.

    But it must have that conversion on the LHS, otherwise it's A's address
    that is written to rather than its value, which doesn't make sense.
    That's why I said it was asymmetric; the RHS needs an explicit operator,
    the LHS doesn't.

    I'd initially thought that both sides needed it.

    In C, the LHS and RHS are evaluated differently. In BLISS, they're
    evaluated in the same way, requiring an explicit operator to do what
is done implicitly by context in C. I'd call the former asymmetric and the latter symmetric.

    It sounds like you've got it backwards.

    How can A = B be asymmetric, but A = .B be symmetric?

    Lots of people like to be contrary in this group!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Ben Bacarisse on Fri Aug 30 00:08:27 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 8/29/24 12:06, Ben Bacarisse wrote:
    ...
    I can't see what it is you object to in what I wrote. I don't disagree
    with anything you are saying (the "correct result" being to reject a
    program that has, syntactically, the wrong thing on the left hand side).

    No - the only requirement is that a diagnostic be produced. A fully
    conforming implementation of C is allowed to accept such code and then
    generate an executable; if you choose to execute the executable, the
    behavior is undefined.

    Sorry, I used a term incorrectly. To put it informally, you must be
    told that "this is not C". Not everything is C even if a C compiler
    will accept FORTRAN code as an extension.

    Actually I don't think I did. I said "reject" and a compiler that says
    "this is not C" and then generates a executable is rejecting the code as
    far as I am concerned.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to James Kuyper on Thu Aug 29 23:53:04 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 8/29/24 12:06, Ben Bacarisse wrote:
    ...
    I can't see what it is you object to in what I wrote. I don't disagree
    with anything you are saying (the "correct result" being to reject a
    program that has, syntactically, the wrong thing on the left hand side).

    No - the only requirement is that a diagnostic be produced. A fully conforming implementation of C is allowed to accept such code and then generate an executable; if you choose to execute the executable, the
    behavior is undefined.

    Sorry, I used a term incorrectly. To put it informally, you must be
    told that "this is not C". Not everything is C even if a C compiler
    will accept FORTRAN code as an extension.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Thu Aug 29 23:45:47 2024
    On 29/08/2024 23:03, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 29/08/2024 21:30, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    So what exactly is different about the LHS and RHS here:

    A = A;
The RHS is evaluated to determine the current value stored in the object
named A. The LHS is evaluated to determine the object that's designated
by the name A; its current value is irrelevant.

    Sure, but the same thing happens on both sides: one ends up performing
    a Read via that Lvalue, and the other does a Write via that Lvalue.

    The read is done by converting the lvalue to its value, which is not an lvalue. Please read the discussion of "lvalue conversion" in the C
    standard.

    In C terms, the RHS undergoes *lvalue conversion*, where an expression
    that's an lvalue is converted to the value stored in the designated
    object. The LHS does not undergo lvalue conversion.

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while 'A = A'
    is also valid, there is a hidden mismatch in indirection levels
    between left and right. It is asymmetric while in C it is symmetric,
although you seem to disagree on that latter point.)
    Because BLISS, unlike C, does not have implicit lvalue conversion;
    the
    prefix "." operator performs explicit lvalue conversion. I presume the
    "." operator isn't specific to assignments.

    But it must have that conversion on the LHS, otherwise it's A's
    address that is written to rather than its value, which doesn't make
    sense. That's why I said it was asymmetric; the RHS needs an explicit
    operator, the LHS doesn't.

    No, the address isn't written. The object is written.

    The RHS evaluation determines the value currently stored in the object.
    The LHS evaluation does not. That's the asymmetry.

    In BLISS, the evaluation of the expression A determines the object that
    the name A designates. In C, it can either do that *or* it can extract
    the value currently stored in that object.

    So if C was to behave the same way as BLISS:

    int a, b, c;

    b = 23; // sets b to 23
    a = *b; // sets a to that 23 in b
    c = *a + *b; // sets c to 46

    a = b; // attempts to set a to b's address (&b)

You would say that this is now symmetric, but normal C wasn't?

    I get you...





    ... not really! We'll just have to disagree. My comments are based on
    having implemented this stuff myriad times on multiple targets, but I
must obviously have been misunderstanding it all that time.

    Opening a drawer A to put something in, is a totally different thing to
    opening the same drawer A to take something out; of course!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Fri Aug 30 00:29:19 2024
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 17:06, Ben Bacarisse wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 13:35, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    I explained that. LHS and RHS can be identical terms for assignment in
pretty much every aspect, but there are extra constraints on the LHS.
So you use "exactly the same" to mean "exactly the same except for the
differences".

No, I do mean exactly the same, both in terms of syntax and (in my
implementations, which are likely typical) internal representation of those
terms.

There are no differences other than where the type system says your code is
invalid. So there are no differences when considering only valid programs.
    This program in my language:

    42 := 42

    is valid syntax.

    So what? We were talking about assignment in C. You cut the two
previous quotes where it was clear we were talking about C. This is not
an honest thing to do. You are arguing for the sake of it, and in a
dishonest way too.

    It's also valid syntax in C, with a constraint violation that can be >>>>> "caught later on" in an implementation of C, just like in Bart's
    language.

    Have you taken Bart's bait and are now discussing a narrower context?

The claim that C's assignment is symmetric and what is required on the
two sides is exactly the same is junk. C's assignment has different
    syntax on each side, and what is required is even more strict.

    In the ISO C grammar for assignment, there is a "unary expression" on
    the left and an "assignment expression" on the right. That's just a
    particular factoring of the grammar that implementors don't have to
    follow, if the correct results are produced.
    I can't see what it is you object to in what I wrote. I don't disagree
    with anything you are saying (the "correct result" being to reject a
    program that has, syntactically, the wrong thing on the left hand side).

    Under a parser generator tool we could have a production rule like
    expr '=' expr , where the '=' token has an elsewhere-declared
    associativity and precedence.

    The basic idea that the same syntactic kind of thing is on both sides of >>> a C assignment (with an additional lvalue constraint) is valid;
    it's just not literally true if we are discussing the details of how
    ISO C expresses the grammar.
    A C program that has the wrong syntax (for example x+1) on the left hand
    side of an assignment must be rejected. I'm not relying on some fussy
    definition about how the syntax is written but making a point that what
is required on each side is not exactly the same thing. Do you really
    disagree with that?

    So what exactly is different about the LHS and RHS here:

    A = A;

    Do you think (or claim) that what is /required/ on each side of an
    assignment in C is exactly the same thing? The expression on the LHS is required to be a modifiable lvalue expression. That does not apply to
the expression on the right hand side.

    A = A; might or might not be valid C because different kinds of
expression are required on each side of an assignment. If A is not a
modifiable lvalue, it can appear on the RHS but not on the LHS.

    A C program that has the wrong syntax (for example x+1) on the left hand
    side of an assignment must be rejected.

    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

    are both valid syntax in C. It will fail for a different reason: an '+'
    term is not a valid lvalue.

    The compiler must tell you that neither is valid C. That's because what
    is required on each side of assignment is not exactly the same thing.
    It's a distraction to argue about why each is not valid C as both have
errors that require a diagnostic at compile time.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Ben Bacarisse on Fri Aug 30 02:34:27 2024
    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 17:06, Ben Bacarisse wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:

    On 29/08/2024 13:35, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    I explained that. LHS and RHS can be identical terms for assignment in
    pretty much every aspect, but there are extra constraints on the LHS.
    So you use "exactly the same" to mean "exactly the same except for the
    differences".

No, I do mean exactly the same, both in terms of syntax and (in my
implementations, which are likely typical) internal representation of those
terms.

There are no differences other than where the type system says your code is
invalid. So there are no differences when considering only valid programs.
    This program in my language:

    42 := 42

    is valid syntax.

So what? We were talking about assignment in C. You cut the two
previous quotes where it was clear we were talking about C. This is not
an honest thing to do. You are arguing for the sake of it, and in a
dishonest way too.

It's also valid syntax in C, with a constraint violation that can be
"caught later on" in an implementation of C, just like in Bart's
    language.

Have you taken Bart's bait and are now discussing a narrower context?
The claim that C's assignment is symmetric and what is required on the
two sides is exactly the same is junk. C's assignment has different
syntax on each side, and what is required is even more strict.

    In the ISO C grammar for assignment, there is a "unary expression" on
    the left and an "assignment expression" on the right. That's just a
    particular factoring of the grammar that implementors don't have to
    follow, if the correct results are produced.
    I can't see what it is you object to in what I wrote. I don't disagree
    with anything you are saying (the "correct result" being to reject a
program that has, syntactically, the wrong thing on the left hand side).
    Under a parser generator tool we could have a production rule like
    expr '=' expr , where the '=' token has an elsewhere-declared
    associativity and precedence.

The basic idea that the same syntactic kind of thing is on both sides of
a C assignment (with an additional lvalue constraint) is valid;
    it's just not literally true if we are discussing the details of how
    ISO C expresses the grammar.
A C program that has the wrong syntax (for example x+1) on the left hand
side of an assignment must be rejected. I'm not relying on some fussy
definition about how the syntax is written but making a point that what
is required on each side is not exactly the same thing. Do you really
    disagree with that?

    So what exactly is different about the LHS and RHS here:

    A = A;

    Do you think (or claim) that what is /required/ on each side of an
    assignment in C is exactly the same thing? The expression on the LHS is required to be a modifiable lvalue expression. That does not apply to
the expression on the right hand side.

    "modifiable lvalue" is a semantic attribute which depends on type
    and qualification. An array is an lvalue, but not modifiable.
A const-qualified expression is also not a modifiable lvalue.
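
Two quick instances of that, as a sketch:

    int a[10], b[10];
    const int c = 0;

    /* Both a and c are lvalues, but neither is a modifiable lvalue, */
    /* so both of these violate the assignment constraint:           */
    /* a = b;    -- array type       */
    /* c = 1;    -- const-qualified  */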

    Bart is insisting that these attributes are not a matter of syntax.

    If you regard the processing of these semantic attributes to be part of
    syntax (under the model of an attribute grammar), then it can be
    regarded as syntax.

    I think that what ISO C means by syntax does rule that out.

    That also applies in more trivial situations. For instance,
    the grammar rules for declaration specifiers admit this as valid
    syntax:

    unsigned double float char x;

    It's an invalid combination of specifiers ruled out by a constraints
    paragraph. The constraint can be readily identified as syntactic (it
    governs combinations of tokens), yet in ISO C, it is outside of the
    formal syntax.

    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

    are both valid syntax in C. It will fail for a different reason: an '+'
    term is not a valid lvalue.

    The compiler must tell you that neither is valid C. That's because what
    is required on each side of assignment is not exactly the same thing.
    It's a distraction to argue about why each is not valid C as both have
    errors that require diagnostic at compile time.

    Bart is only saying that it's valid syntax, not that it's valid C.

According to the ISO C syntax (not taking into account constraints, which
    are not syntax) that view is justified.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bonita Montero on Fri Aug 30 03:21:05 2024
    On Wed, 28 Aug 2024 08:04:34 +0200, Bonita Montero wrote:

    Am 28.08.2024 um 07:39 schrieb Lawrence D'Oliveiro:

    The Linux kernel abstractions are very high level. Look at how entirely
    different filesystems, even ones originating from entirely different
    OSes, can be handled through the common VFS layer.

    The distance between the levels of indirection is less than in C++.

    Do you have any examples of C++ code that deals with the levels of
    indirection in the Linux kernel?

    Consider how a network-based filesystem is accessed:

    filesystem API ← VFS layer ← network filesystem layer ← network stack ← network protocol

    Look at the number of levels. And that’s actually a simplification.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Fri Aug 30 03:18:17 2024
    On Wed, 28 Aug 2024 14:21:19 +0300, Michael S wrote:

    They say that Forth is a HLL

    PostScript yes, Forth no.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Bart on Thu Aug 29 21:24:04 2024
    Bart <bc@freeuk.com> writes:

    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

    are both valid syntax in C.

    The second line does comply with the ISO C grammar (but does not
    satisfy the constraints for an assignment expression).

    The first line does not comply with the ISO C grammar. Which is
    to say, there is no way to reduce the first line to a single
    nonterminal under the ISO C grammar rules.

    Disclaimer: the two previous statements represent my best
    understanding. I'm fairly confident they are right but I
    wouldn't advise someone to bet their life on that.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Fri Aug 30 06:40:14 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:

    [.. is assignment in C symmetric w.r.t. the two sides of '=' ..]

    Have you taken Bart's bait and are now discussing a narrower
    context?

    The claim that C's assignment is symmetric and what is required on
    the two sides is exactly the same is junk. C's assignment has
    different syntax on each side, and what is required is even more
    strict.

    In the ISO C grammar for assignment, there is a "unary expression"
    on the left and an "assignment expression" on the right. That's
    just a particular factoring of the grammar that implementors don't
    have to follow, if the correct results are produced.

    Under a parser generator tool we could have a production rule like
    expr '=' expr , where the '=' token has an elsewhere-declared
    associativity and precedence.

    The basic idea that the same syntactic kind of thing is on both
    sides of a C assignment (with an additional lvalue constraint) is
    valid; it's just not literally true if we are discussing the
    details of how ISO C expresses the grammar.

    I think this kind of reasoning is more harmful than helpful. The
    point of the discussion is to understand what the ISO C standard
    requires. Constraints apply only in the context of a complete
    parse of a syntactically well-formed translation unit. To give
    an example:

    enum { A = 47 };

int
foo( int x ){
    int A = 23;
    return x+A;
}

    There is nothing wrong with this translation unit. But if we
    look at just the 'A = 23' as an assignment expression, it
    violates a constraint. Reasoning about how an implementation
    might go about parsing its input might lead one astray as to
    how compilers are allowed to behave.

    Of course implementations are allowed to phrase diagnostics in
    any way they choose, even when a diagnostic is required. But for
    understanding what the C standard mandates, it's better not to
    think about how the parsing might be done, and instead follow the
    given easy-to-understand guideline, namely, that constraints
    apply only in the context of a complete parse of a syntactically
    well-formed translation unit. If a translation unit is not
    syntactically well-formed then no further consideration is needed
    because it already doesn't comply with the C standard's rules;
    it's only when a translation unit is completely syntactically
    well-formed that one needs to think about constraint violations
    and whether constraints are violated.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Fri Aug 30 06:44:27 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:

    Bart <bc@freeuk.com> writes:

    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

    are both valid syntax in C. It will fail for a different reason:
    an '+' term is not a valid lvalue.

    The compiler must tell you that neither is valid C. That's
    because what is required on each side of assignment is not
    exactly the same thing. It's a distraction to argue about why
    each is not valid C as both have errors that require diagnostic
    at compile time.

    Bart is only saying that it's valid syntax, not that it's valid C.

According to the ISO C syntax (not taking into account constraints,
    which are not syntax) that view is justified.

    The second line is syntactically well-formed. The first line is
    not.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Fri Aug 30 14:38:03 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Wed, 28 Aug 2024 08:04:34 +0200, Bonita Montero wrote:

    Am 28.08.2024 um 07:39 schrieb Lawrence D'Oliveiro:

    The Linux kernel abstractions are very high level. Look at how entirely
    different filesystems, even ones originating from entirely different
    OSes, can be handled through the common VFS layer.

    The distance between the levels of indirection is less than in C++.

Do you have any examples of C++ code that deals with the levels of indirection in the Linux kernel?

    The SVR4 VFS layer (which linux adopted) is ideally suited to
    using C++ derived classes. As is the network stack.
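
For comparison, the C code gets a comparable effect with tables of
function pointers; a schematic sketch (the names are illustrative, not
the actual kernel structs):

    struct inode;

    struct inode_ops {
        int  (*open)(struct inode *);
        long (*read)(struct inode *, void *buf, unsigned long n);
    };

    struct inode {
        const struct inode_ops *ops;  /* each filesystem supplies its own */
    };

    /* The generic layer dispatches through the table, much as a C++ */
    /* virtual call dispatches through a vtable:                     */
    long vfs_read(struct inode *ino, void *buf, unsigned long n)
    {
        return ino->ops->read(ino, buf, n);
    }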

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to James Kuyper on Fri Aug 30 17:36:24 2024
    On 2024-08-30, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    On 8/29/24 19:08, Ben Bacarisse wrote:
    ...
    Actually I don't think I did. I said "reject" and a compiler that says
    "this is not C" and then generates a executable is rejecting the code as
    far as I am concerned.

    How about a compiler that says: "Congratulations on using our extension
    to C - program accepted"? Such a compiler could be fully conforming, and
    I see no way to describe that as a rejection.

Wouldn't it have to correctly look for and process #error directives?

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Ben Bacarisse on Fri Aug 30 13:28:33 2024
    On 8/29/24 19:08, Ben Bacarisse wrote:
    ...
    Actually I don't think I did. I said "reject" and a compiler that says
    "this is not C" and then generates a executable is rejecting the code as
    far as I am concerned.

    How about a compiler that says: "Congratulations on using our extension
    to C - program accepted"? Such a compiler could be fully conforming, and
    I see no way to describe that as a rejection.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Kaz Kylheku on Fri Aug 30 14:37:44 2024
    On 8/30/24 13:36, Kaz Kylheku wrote:
    On 2024-08-30, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    On 8/29/24 19:08, Ben Bacarisse wrote:
    ...
    Actually I don't think I did. I said "reject" and a compiler that says
    "this is not C" and then generates a executable is rejecting the code as >>> far as I am concerned.

    How about a compiler that says: "Congratulations on using our extension
    to C - program accepted"? Such a compiler could be fully conforming, and
    I see no way to describe that as a rejection.

Wouldn't it have to correctly look for and process #error directives?

    Certainly. I said that it could be fully conforming. It won't actually
    be fully conforming unless it meets all of the other requirements of the
    C standard. That means it must reject any program that contains a #error directive that survives conditional compilation, but it's otherwise free
    to accept any program, even if that program has syntax errors or
    constraint violations, so long as it also generates the required
    diagnostics.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bonita Montero on Sat Aug 31 00:01:53 2024
    On Fri, 30 Aug 2024 10:43:10 +0200, Bonita Montero wrote:

    Am 30.08.2024 um 05:21 schrieb Lawrence D'Oliveiro:

    Do you have any examples of C++ code that deals with the levels of
    indirection in the Linux kernel?

I wanted to say that there are language facilities in C++ that put a
distance between the interface and the code behind it. In C this doesn't
exist. And there's no encapsulation that helps to manage such
abstractions.

    That’s a theoretical argument. I was making a practical one, by referring
    to its use in actual, working code.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Sat Aug 31 00:02:32 2024
    On Fri, 30 Aug 2024 14:38:03 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Wed, 28 Aug 2024 08:04:34 +0200, Bonita Montero wrote:

    Am 28.08.2024 um 07:39 schrieb Lawrence D'Oliveiro:

    The Linux kernel abstractions are very high level. Look at how
    entirely different filesystems, even ones originating from entirely
    different OSes, can be handled through the common VFS layer.

    The distance between the levels of indirection is less than in C++.

Do you have any examples of C++ code that deals with the levels of indirection in the Linux kernel?

    The SVR4 VFS layer (which linux adopted) is ideally suited to using C++ derived classes. As is the network stack.

    Can you point to any examples of an actual C++-based implementation of
    same?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Keith Thompson on Sat Aug 31 07:08:33 2024
    On 2024-08-30, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:
    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

    are both valid syntax in C. It will fail for a different reason:
    an '+' term is not a valid lvalue.

    The compiler must tell you that neither is valid C. That's
    because what is required on each side of assignment is not
    exactly the same thing. It's a distraction to argue about why
    each is not valid C as both have errors that require diagnostic
    at compile time.

    Bart is only saying that it's valid syntax, not that it's valid C.

    According to the ISO C syntax (not taking into account contraints,
    which are not syntax) that view is justified.

    The second line is syntactically well-formed. The first line is
    not.

    Right, because the LHS of an assignment is a unary-expression.
    `(x + 1)` can be parsed as a unary-expression, but `x + 1` cannot.
    However, the compilers I've tried produce the same diagnostic (not a
    syntax error message) for both. Probably they use a tweaked grammar
that allows a more general expression as the LHS of an assignment,
    and catch errors later in semantic analysis, for the purpose of
    producing diagnostics that are easier to understand. It's obvious
    that in `x + 1 = y`, the programmer (probably) intended `x + 1`
    to be the LHS of an assignment. These compilers (I tried gcc,
    clang, and tcc) are clever enough to recognize that.

    A standard operator precedence parsing algorithm such as Shunting Yard
    cannot help but parse that.

    The operator tokens + and = have to be
    assigned a precedence and associativity level, and so the parse has to
    be (x + 1) = y or else x + (1 = y).
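
A sketch of the kind of table such a parser consults (the values are
illustrative):

    /* toy precedence table for an operator-precedence parser */
    struct op { const char *tok; int prec; int right_assoc; };

    static const struct op ops[] = {
        { "=", 1, 1 },   /* lowest precedence, right-associative    */
        { "+", 2, 0 },   /* binds tighter, so x + 1 = y comes out   */
        { "*", 3, 0 },   /*   as (x + 1) = y                        */
    };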

    But precedence, in general, doesn't have to be ordered! It doesn't have
    to have levels, or even partial ordering with transitivity. Precedence
    can be such that for any pair of operators, we arbitrarily assign which
    one is higher than the other, without regard for anything else.

    Also: precedence can depend on order. It can be that in
    X op1 Y op2 Z, where op1 is to the left of op2, op1 has
    the higher precedence. But in X op2 Y op1 Z, op2 might have
    the higher precedence. Or one order could have a defined
    precedence but not the other.

    In the C grammar, assignment breaks the cascading sequence. Whereas
most earlier rules refer to their immediate predecessors
    (e.g. additive builds on multiplicative), assignment looks all
    the way back to unary. What this means is that the assignment operator
    has no defined precedence with regard to all the intermediate
    operators between it and unary. Or, at least, when the other operator
    is to the left:

    x + 1 = y // + =: no defined precedence: ambiguous: syntax error

    y = x + 1 // = +: defined precedence: good syntax

    When the precedence is not defined in one of the two orders,
    you can safely adopt the one from the other order, provided
    everything is still diagnosed that should be diagnosed.

    The precedence not being defined means that the following parse
    tree fragment is invalid:

        =
       / \
      +   y
     / \
    x   1

    it cannot be that + is a left child of =. So the parse could be
    allowed by defining the precedence; and then we can detect the invalid condition by walking the parse tree, looking for assignment nodes that
    have a left child that has no precedence relationship to assignment.

    But as an AST it is valid because that same AST shape can be forced by parentheses, and parentheses disappear in abstract syntax.

    Any invalid syntax condition that can be removed using parentheses is
    not worth enforcing at the parse level. If it's wrong to assign to x +
    1, you also need to diagnose when it's (x + 1). It's better to have
    a single rule which catches both.
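
A sketch of such a single rule, run over the finished tree (the AST
types and the error() helper are hypothetical):

    enum kind { AST_IDENT, AST_DEREF, AST_INDEX, AST_MEMBER,
                AST_ASSIGN, AST_ADD /* ... */ };

    struct ast { enum kind kind; struct ast *child[2]; };

    extern void error(const char *msg);

    static int is_lvalue(const struct ast *n)
    {
        return n->kind == AST_IDENT || n->kind == AST_DEREF
            || n->kind == AST_INDEX || n->kind == AST_MEMBER;
    }

    /* catches x + 1 = y and (x + 1) = y alike, since parentheses */
    /* have already disappeared from the abstract syntax          */
    static void check_assign(const struct ast *n)
    {
        if (n->kind == AST_ASSIGN && !is_lvalue(n->child[0]))
            error("lvalue required as left operand of assignment");
    }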

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sat Aug 31 02:11:36 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 8/29/24 12:06, Ben Bacarisse wrote:
    ...

    I can't see what it is you object to in what I wrote. I don't
    disagree with anything you are saying (the "correct result" being
    to reject a program that has, syntactically, the wrong thing on
    the left hand side).

    No - the only requirement is that a diagnostic be produced. A
    fully conforming implementation of C is allowed to accept such
    code and then generate an executable; if you choose to execute
    the executable, the behavior is undefined.

    Sorry, I used a term incorrectly. To put it informally, you must
    be told that "this is not C". Not everything is C even if a C
    compiler will accept FORTRAN code as an extension.

    Actually I don't think I did. I said "reject" and a compiler that
    says "this is not C" and then generates a executable is rejecting
    the code as far as I am concerned.

    I would like to express a personal reaction.

    I think what you said about rejecting a program isn't exactly
    wrong, but it is misleading, and also, I think, inadvisable. A
    good general principle is not to use words with private meanings
    in a venue where there is a different public meaning as
    understood by a significant majority of participants in the
    venue.

    The response from James Kuyper is off the mark in my view and so
    enters the conversation at cross purposes. The quotes around the
    phrase "correct result" should be enough to make clear that you
    are not making a precise statement about what the C standard
    requires, but instead giving an informal description of a reaction
    to a well-defined condition. James is dragging the conversation
    into a domain that is not quite the same as the one of your
    comment.

    That said, I still think "reject" is a poor word choice there,
    whatever you might think of it privately, for two reasons.
    Reason one, it goes against the common ordinary meaning of the
    word. Reason two, although the C standard does not use the word
    "reject" at all, it does use the word "accept", and it is natural
    to take "reject" to mean the opposite of "accept", but that sense
    of "reject" is not what you mean (at least, I don't think it is).

    Incidentally, the C standard doesn't say anything about refusing
    an entire program. What it does say is that implementations must
    not successfully translate a preprocessing translation unit that
    has an unskipped #error directive. Presumably a not-successfully
    translated preprocessing translation unit is meant to imply that
    any program that tries to incorporate that translation unit must
    also be invalid, but I don't think the C standard ever actually
    says that. (Disclaimer: I haven't checked this claim carefully.)
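    A minimal sketch of the condition being described (my example): an
    unskipped #error directive must prevent successful translation of the
    preprocessing translation unit, while a skipped one is inert.

        #if 0
        #error "skipped: translation may still succeed"
        #endif

        #error "unskipped: translation must not succeed"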

    On the question of what phrase to use instead, I might suggest
    "must flag the program as being erroneous (in the sense that it
    does not comply with the rules given in the C standard for what
    constitutes a C source file)". That's a long phrase, but I think
    the first part - "must flag the program as being erroneous" -
    expresses what it is you want to convey. And I think it would be
    understood by C-standard-experts in a way that's compatible with
    what you want to say.

    So, for what it's worth, there are my thoughts.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to James Kuyper on Sat Aug 31 02:18:13 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 8/29/24 19:08, Ben Bacarisse wrote:
    ...

    Actually I don't think I did. I said "reject" and a compiler
    that says "this is not C" and then generates an executable is
    rejecting the code as far as I am concerned.

    How about a compiler that says: "Congratulations on using our
    extension to C - program accepted"?

    As long as the message conforms to the implementation-defined
    (and so documented) characteristics of other mandatory
    diagnostics, it seems reasonable to expect that someone would
    treat it the same way.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to James Kuyper on Sat Aug 31 03:56:36 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 8/26/24 03:54, Michael S wrote:

    On Sun, 25 Aug 2024 17:48:14 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    ...

    It's been amusing reading a discussion of which languages are or
    are not high level, without anyone offering a definition of what
    the term means. Wikipedia says, roughly, that a high-level
    language is one that doesn't provide machine-level access (and IMO
    that is a reasonable characterization).

    I don't like this definition. IMHO, what language does have is at
    least as important as what it does not have for the purpose of
    estimating its level.

    That's not a particularly useful response. [...]

    If it communicated what Michael wanted to say, it served his
    purposes, which makes it useful, whether you thought it was
    useful or not.

    One principle that should be kept in mind when you're defining a
    term whose definition is currently unclear, is to decide what
    statements you want to make about things described by that term.

    A more important factor to keep in mind is what common usage
    or usages there are, both current and historical.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Kaz Kylheku on Sat Aug 31 12:45:43 2024
    On 8/31/24 03:08, Kaz Kylheku wrote:
    On 2024-08-30, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    ...
    However, the compilers I've tried produce the same diagnostic (not a
    syntax error message) for both. Probably they use a tweaked grammar
    that allows a more general expression as the LHS of an assignment,
    and catch errors later in semantic analysis, for the purpose of
    producing diagnostics that are easier to understand. It's obvious
    that in `x + 1 = y`, the programmer (probably) intended `x + 1`
    to be the LHS of an assignment. These compilers (I tried gcc,
    clang, and tcc) are clever enough to recognize that.

    A standard operator precedence parsing algorithm such as Shunting Yard
    cannot help but parse that.

    True, which is an example of why a precedence parsing algorithm is inappropriate for parsing C, which is not defined in terms of precedence levels.
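    To make the point concrete, here is a toy precedence-climbing parser
    (my sketch, not from the thread). Giving '=' a precedence, as such
    algorithms must, makes "x+1=y" parse as "(x+1)=y"; the echoed
    reverse-Polish output shows the grouping. The grammar violation then
    has to be caught by a later lvalue check, which is exactly why this
    style of parser does not match the ISO C grammar.

        #include <stdio.h>

        static const char *src;     /* cursor into the input */

        static int prec(char op)
        {
            switch (op) {
            case '=':           return 1;  /* lowest; right-associative */
            case '+': case '-': return 2;  /* left-associative */
            case '*': case '/': return 3;  /* left-associative */
            default:            return 0;  /* not a binary operator */
            }
        }

        static void parse_expr(int min_prec)
        {
            printf(" %c", *src++);  /* primary: one letter or digit */
            while (prec(*src) >= min_prec) {
                char op = *src++;
                /* '=' is right-associative: recurse at the same level;
                   the others recurse one level higher */
                parse_expr(op == '=' ? prec(op) : prec(op) + 1);
                printf(" %c", op);
            }
        }

        int main(void)
        {
            src = "x+1=y";
            parse_expr(1);
            putchar('\n');          /* prints: x 1 + y = */
            return 0;
        }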

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Sat Aug 31 19:11:01 2024
    On 30/08/2024 21:41, Keith Thompson wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:
    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

    are both valid syntax in C. It will fail for a different reason:
    an '+' term is not a valid lvalue.

    The compiler must tell you that neither is valid C. That's
    because what is required on each side of assignment is not
    exactly the same thing. It's a distraction to argue about why
    each is not valid C as both have errors that require diagnostic
    at compile time.

    Bart is only saying that it's valid syntax, not that it's valid C.

    According to the ISO C syntax (not taking into account constraints,
    which are not syntax) that view is justified.

    The second line is syntactically well-formed. The first line is
    not.

    Right, because the LHS of an assignment is a unary-expression.
    `(x + 1)` can be parsed as a unary-expression, but `x + 1` cannot.

    AFAICT both terms are parsed the same way.

    Given this:

    x + y = z;
    (x + y) = z;

    My compiler produces the same AST for the LHS of '=' in both cases. I
    can't easily get the AST created by gcc. But the parsing of 'x + y = z'
    has to be as one of these two:

    x + (y = z);
    (x + y) = z;


    If I submit these to gcc:

    x + (y = z);
    (x + y) = z;
    x + y = z;


    then the first passes with gcc, but the other two both fail. Presumably
    then the third must be parsed as '(x + y) = z' rather than 'x + (y = z)'.

    I'm surprised that the experts here are unsure about it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Bart on Sat Aug 31 19:32:32 2024
    On 2024-08-31, Bart <bc@freeuk.com> wrote:
    On 30/08/2024 21:41, Keith Thompson wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:
    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

    are both valid syntax in C. It will fail for a different reason:
    an '+' term is not a valid lvalue.

    The compiler must tell you that neither is valid C. That's
    because what is required on each side of assignment is not
    exactly the same thing. It's a distraction to argue about why
    each is not valid C as both have errors that require diagnostic
    at compile time.

    Bart is only saying that it's valid syntax, not that it's valid C.

    According to the ISO C syntax (not taking into account constraints,
    which are not syntax) that view is justified.

    The second line is syntactically well-formed. The first line is
    not.

    Right, because the LHS of an assignment is a unary-expression.
    `(x + 1)` can be parsed as a unary-expression, but `x + 1` cannot.

    AFAICT both terms are parsed the same way.

    Though they can be, it's not required by the ISO C grammar.


    Given this:

    x + y = z;
    (x + y) = z;

    My compiler produces the same AST for the LHS of '=' in both cases. I

    Sure, your compiler and others also. But this is not required.

    According to the formal grammar in ISO C, this is bad syntax.

    In terms of operator precedence (a concept not used in ISO C) we
    can recognize the situation as similar to there being no defined
    precedence between + and =, when they appear in that order:
    E1 + E2 = E3. If there is no defined precedence, the situation
    cannot be parsed.

    Only in the order E1 = E2 + E3 does + have a higher precedence
    than =, leading to the same abstract syntax as E1 = (E2 + E3).

    The grammar production

    assignment-expression: unary-expression assignment-op assignment-expression

    doesn't admit parses where the left side is any of the intermediate
    phrase structures between unary-expression and assignment-expression.

    The grammar cannot derive any sentential forms that look like any
    of these:

    additive-expression assignment-op assignment-expression

    multiplicative-expression assignment-op assignment-expression

    inclusive-OR-expression assignment-op assignment-expression

    etc. Any token sequence which matches these patterns is not
    derivable by the grammar.

    can't easily get the AST created by gcc. But the parsing of 'x + y = z'
    has to be as one of these two:

    x + (y = z);
    (x + y) = z;

    Yes, if we stipulate that:

    - these operators are independent, binary operators; and
    - there exists a defined precedence between them in that configuration

    then there exists a parse, which is one of those two.

    There are other possibilities, like:

    - there doesn't exist a parse; or

    - the two tokens constitute a ternary operator, and so the parse
    produces an abstract syntax like this:

          + =
         / | \
        x  y  z


    I'm surprised that the experts here are unsure about it.

    x + y = z simply does not match the grammar production "unary-expr op assignment-expr". It is clear.

    It's not necessary for ISO C compilers to follow the grammar that is
    given in ISO C, as long as they implement all the requirements.

    To show that a compiler is nonconforming, you have to show:

    - incorrect input that fails to produce a required diagnostic, or

    - correct input, but which does not produce the observable behavior that
    is required of it, taking into account the implementation
    characteristics.

    A compiler which parses x + y = z as (x + y) = z is conforming,
    as long as it produces the required diagnostic. Because (x + y)
    isn't a modifiable lvalue, it is enough for that to be diagnosed,
    and the requirement for a diagnostic is therefore met.

    The diagnostic does not have to be "assignment requires unary
    expression on the left"; the standard doesn't dictate the wording
    of diagnostics. Semantically, only a unary expression can produce a
    modifiable lvalue, so all is well.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Bart on Sat Aug 31 16:04:57 2024
    On 2024-08-31, Bart <bc@freeuk.com> wrote:
    On 30/08/2024 21:41, Keith Thompson wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    Bart <bc@freeuk.com> writes:
    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

    are both valid syntax in C. It will fail for a different reason:
    an '+' term is not a valid lvalue.
    ...
    Right, because the LHS of an assignment is a unary-expression.
    `(x + 1)` can be parsed as a unary-expression, but `x + 1` cannot.

    AFAICT both terms are parsed the same way.

    Not correctly.

    Given this:

    x + y = z;
    (x + y) = z;

    My compiler produces the same AST for the LHS of '=' in both cases. I
    can't easily get the AST created by gcc. But the parsing of 'x + y = z'
    has to be as one of these two:

    x + (y = z);

    That is an invalid parse: the right operand of an addition expression
    must be a multiplicative expression - y=z doesn't qualify. (y=z) would.

    (x + y) = z;

    That has a valid parse, but violates a constraint - x+y is not a
    modifiable lvalue.

    I'm surprised that the experts here are unsure about it.

    I don't see uncertainty from those I consider experts. Kaz is saying
    something about compensating for an incorrect parse by detecting the
    problems after parsing rather than during parsing. I'd have to know a
    lot of details about how the problem detection was performed to be sure
    that it was safe. If it could be done, it would certainly be permitted -
    the standard only requires that the final results are the same as if it
    had been done the way the standard specifies. It is much easier to evaluate
    the validity of the design of a compiler when it parses things the way
    the C standard specifies they should be parsed - but there might be
    advantages to doing otherwise.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Bart on Sat Aug 31 15:10:07 2024
    Bart <bc@freeuk.com> writes:

    On 30/08/2024 21:41, Keith Thompson wrote:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:

    Bart <bc@freeuk.com> writes:

    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently

    are both valid syntax in C. It will fail for a different reason:
    an '+' term is not a valid lvalue.

    The compiler must tell you that neither is valid C. That's
    because what is required on each side of assignment is not
    exactly the same thing. It's a distraction to argue about why
    each is not valid C as both have errors that require diagnostic
    at compile time.

    Bart is only saying that it's valid syntax, not that it's valid C.

    According to the ISO C syntax (not taking into account constraints,
    which are not syntax) that view is justified.

    The second line is syntactically well-formed. The first line is
    not.

    Right, because the LHS of an assignment is a unary-expression.
    `(x + 1)` can be parsed as a unary-expression, but `x + 1` cannot.

    AFAICT both terms are parsed the same way.

    Given this:

    x + y = z;
    (x + y) = z;

    My compiler produces the same AST for the LHS of '=' in both
    cases.

    To understand why they are different, try drawing parse trees
    rather than abstract syntax trees.

    https://en.wikipedia.org/wiki/Parse_tree

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Bonita Montero on Sat Aug 31 22:30:54 2024
    On Sat, 31 Aug 2024 06:44:25 +0200, Bonita Montero wrote:

    Am 31.08.2024 um 02:01 schrieb Lawrence D'Oliveiro:

    On Fri, 30 Aug 2024 10:43:10 +0200, Bonita Montero wrote:

    Am 30.08.2024 um 05:21 schrieb Lawrence D'Oliveiro:

    Do you have any examples of C++ code that deals with the levels of
    indirection in the Linux kernel?

    I wanted to say that there are language facilities in C++ that put a
    distance between the interface and the code behind it. In C this
    doesn't exist. And there's no encapsulation that helps to manage such
    abstractions.

    That’s a theoretical argument. ...

    No, that's totally practical.
    I use those facilities every 5min while programming.

    So does the Linux kernel. In C. To levels of sophistication beyond your
    code.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Sun Sep 1 00:37:49 2024
    On 31/08/2024 23:31, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    Given this:

    x + y = z;
    (x + y) = z;

    My compiler produces the same AST for the LHS of '=' in both cases.
    [...]

    If that's the case (and I don't doubt that it is), then your compiler is
    not following the grammar specified by the ISO C standard. Since
    `x + y` is not a unary-expression, `x + y = z` is not a syntactically
    valid assignment expression.

    Yet no compiler out of the dozen I tried reported a syntax error (except
    SDCC).

    So what AST is produced by gcc, if any?

    Most compilers, including mine, complain that an lvalue is expected.


    A parser that strictly follows the ISO C grammar would reject
    (diagnose, flag, whatever) `x + y = z;` just as it would reject `x = y +;`.

    Which one does that (apart from SDCC)?


    This is an observation, not a complaint. It doesn't imply that your
    compiler is non-conforming or buggy. A parser that doesn't strictly
    follow the ISO C grammar could still be part of a conforming compiler.


    I can also say that the C grammar is buggy:

    assignment-expression:
        conditional-expression
        unary-expression assignment-operator assignment-expression

    When attempting to parse an assignment-expression, do you go for a conditional-expression or unary-expression?

    The latter is a subset of the former. If you go for a
    conditional-expression and find that an assignment-operator follows, now
    you have to perform some analysis on the LHS to see if that conditional-expression contains only a unary-expression.

    However, if it's not a unary-expression, it will fail for other reasons
    anyway, because all those other things that a conditional-expression
    will be, can't be lvalues.

    That also applies to many unary-expressions, such as 42, or a++; those
    can't be lvalues either, even though the syntax is valid.

    So it makes sense to do only an lvalue test.
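    A sketch of that strategy in C (my code; the node kinds and the helper
    are hypothetical, not from any particular compiler): parse the LHS as
    a general expression, then apply one semantic test that rejects x + 1
    and (x + 1) alike, since parentheses are gone in the AST.

        typedef struct Node Node;
        enum { NODE_NAME, NODE_DEREF, NODE_INDEX, NODE_ADD, NODE_ASSIGN };
        struct Node { int kind; Node *left, *right; };

        /* One lvalue test replaces the grammar's unary-expression
           restriction on the LHS of an assignment. */
        static int is_lvalue(const Node *n)
        {
            switch (n->kind) {
            case NODE_NAME:     /* plain identifier         */
            case NODE_DEREF:    /* *p                       */
            case NODE_INDEX:    /* a[i]                     */
                return 1;
            default:            /* 42, a++, x + 1, f(), ... */
                return 0;
            }
        }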

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Keith Thompson on Sat Aug 31 20:01:10 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Bart <bc@freeuk.com> writes:
    [...]
    I can also say that the C grammar is buggy:

    assignment-expression:
        conditional-expression
        unary-expression assignment-operator assignment-expression

    When attempting to parse an assignment-expression, do you go for a
    conditional-expression or unary-expression?

    The latter is a subset of the former. If you go for a
    conditional-expression and find that an assignment-operator
    follows, now you have to perform some analysis on the LHS to see
    if that conditional-expression contains only a unary-expression.
    [...]

    [...] I'm skeptical that the C grammar is buggy. [...]

    It appears that what Bart means by buggy is different from what
    you mean. I think what Bart means is that the grammar is not
    suitable for being used by a particular parsing algorithm. Of
    course that is not what the C standard means to supply, which is
    rules of grammar that exactly reflect what syntactic forms are
    suitable (syntactically) as C programs, without regard to how
    input source is processed. The C standard's rules of grammar
    are meant as a declarative specification, not as a procedural
    description. Bart's complaint is, I believe, a complaint that
    the grammar rules in the C standard do not provide a suitable
    procedural description (for the procedural framework he has in
    mind). They aren't meant to. Your comment is meant, I believe,
    to be about whether the grammar rules in the C standard provide
    an accurate declarative specification, which they almost
    certainly do, especially as regards the limited area of
    expression syntax.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Bart on Sun Sep 1 03:04:20 2024
    On 2024-08-31, Bart <bc@freeuk.com> wrote:
    On 31/08/2024 23:31, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    Given this:

    x + y = z;
    (x + y) = z;

    My compiler produces the same AST for the LHS of '=' in both cases.
    [...]

    If that's the case (and I don't doubt that it is), then your compiler is
    not following the grammar specified by the ISO C standard. Since
    `x + y` is not a unary-expression, `x + y = z` is not a syntactically
    valid assignment expression.

    Yet no compiler out of the dozen I tried reported a syntax error (except SDCC).

    That says exactly the same thing as, "Without carrying out an
    extraordinary search, I've been able to confirm the existence of a
    compiler that might actually be parsing the way the language is
    specified, evidenced by it reporting a syntax error."

    So what AST is produced by gcc, if any?

    Most compilers, including mine, complain that an lvalue is expected.

    Anyone using better parsing technology than what was available in 1970
    will not be literally using the literal grammar in ISO C.

    I can also say that the C grammar is buggy:

    assignment-expression:
        conditional-expression
        unary-expression assignment-operator assignment-expression

    I second that. If I had energy and motivation for that sort of thing,
    I would submit a defect report.

    There is no value (no pun intended) in constraining the left side of an assignment to just that class of expression that might produce a
    modifiable lvalue, since that constraint must be checked regardless.

    It creates a gratuitous inconsistency between the formal ISO C
    grammar and the intuitive precedence model for expressions that everyone understands. Even the K&R2 book has a precedence table (P. p53?).

    When attempting to parse an assignment-expression, do you go for a conditional-expression or unary-expression?

    If using a parser generator tool, like one of the Yacc derivatives,
    I would just make it:

    expr : ...                   /* numerous rules */
         | expr '=' expr { ... }
         | ...                   /* numerous rules */
         ;

    using %left and %right declarations to set up the operator
    precedences and associativities.

    Anything you do differently from a specification, though,
    creates risk. It takes extra work to show that what you have
    is equivalent.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Sun Sep 1 07:07:19 2024
    On Tue, 27 Aug 2024 17:46:30 -0700, Keith Thompson wrote:

    Strachey coined the
    terms Lvalue and Rvalue to describe the kind of value passed by
    reference and value parameters, respectively.

    I think “call by reference” is/was sometimes also referred to as “call by simple name”.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Tim Rentsch on Sun Sep 1 13:15:33 2024
    On 31/08/2024 23:10, Tim Rentsch wrote:
    Bart <bc@freeuk.com> writes:

    On 30/08/2024 21:41, Keith Thompson wrote:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:

    Bart <bc@freeuk.com> writes:

    I think that these (with x, y having compatible scalar types):

    x + 1 = y;
    (x + 1) = y; // in case above was parsed differently
    are both valid syntax in C. It will fail for a different reason:
    an '+' term is not a valid lvalue.

    The compiler must tell you that neither is valid C. That's
    because what is required on each side of assignment is not
    exactly the same thing. It's a distraction to argue about why
    each is not valid C as both have errors that require diagnostic
    at compile time.

    Bart is only saying that it's valid syntax, not that it's valid C.

    According to the ISO C syntax (not taking into account constraints,
    which are not syntax) that view is justified.

    The second line is syntactically well-formed. The first line is
    not.

    Right, because the LHS of an assignment is a unary-expression.
    `(x + 1)` can be parsed as a unary-expression, but `x + 1` cannot.

    AFAICT both terms are parsed the same way.

    Given this:

    x + y = z;
    (x + y) = z;

    My compiler produces the same AST for the LHS of '=' in both
    cases.

    To understand why they are different, try drawing parse trees
    rather than abstract syntax trees.

    https://en.wikipedia.org/wiki/Parse_tree

    Yeah, the CST differs in retaining the parentheses. But I can already
    see that from the source code.

    For normal compilation, that information is redundant.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Bart on Sun Sep 1 06:30:21 2024
    Bart <bc@freeuk.com> writes:

    On 31/08/2024 23:10, Tim Rentsch wrote:

    Bart <bc@freeuk.com> writes:
    [...]
    Given this:

    x + y = z;
    (x + y) = z;

    My compiler produces the same AST for the LHS of '=' in both
    cases.

    To understand why they are different, try drawing parse trees
    rather than abstract syntax trees.

    https://en.wikipedia.org/wiki/Parse_tree

    Yeah, the CST differs in retaining the parentheses. But I can
    already see that from the source code.

    For normal compilation, that information is redundant.

    The parse trees are different in a lot more ways than whether
    one keeps parentheses. But the point is to compare the parse
    tree for x + y = z to the abstract syntax tree for x + y = z.
    Among other things, the input x + y = z doesn't even have a
    parse tree; the rules governing the formation of parse trees
    don't allow any parse tree matching that entire input.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Sun Sep 1 15:19:23 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Fri, 30 Aug 2024 14:38:03 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Wed, 28 Aug 2024 08:04:34 +0200, Bonita Montero wrote:

    Am 28.08.2024 um 07:39 schrieb Lawrence D'Oliveiro:

    The Linux kernel abstractions are very high level. Look at how
    entirely different filesystems, even ones originating from entirely
    different OSes, can be handled through the common VFS layer.

    The distance between the levels of indirection is less than in C++.

    Do you have any examples of C++ code that deals with the levels of
    indirection in the Linux kernel?

    The SVR4 VFS layer (which linux adopted) is ideally suited to using C++
    derived classes. As is the network stack.

    Can you point to any examples of an actual C++-based implementation of
    same?

    Yes. They are, of course, proprietary and thus not publicly
    available.

    That's always been your problem - you think your experiences are
    all that exist.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Scott Lurndal on Sun Sep 1 15:22:26 2024
    scott@slp53.sl.home (Scott Lurndal) writes:
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Fri, 30 Aug 2024 14:38:03 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Wed, 28 Aug 2024 08:04:34 +0200, Bonita Montero wrote:

    Am 28.08.2024 um 07:39 schrieb Lawrence D'Oliveiro:

    The Linux kernel abstractions are very high level. Look at how
    entirely different filesystems, even ones originating from entirely
    different OSes, can be handled through the common VFS layer.

    The distance between the levels of indirection is less than in C++.

    Do you have any examples of C++ code that deals with the levels of
    indirection in the Linux kernel?

    The SVR4 VFS layer (which linux adopted) is ideally suited to using C++
    derived classes. As is the network stack.

    Can you point to any examples of an actual C++-based implementation of
    same?

    Yes. They are, of course, proprietary and thus not publicly
    available.

    Here's one:

    https://en.wikipedia.org/wiki/Chorus_Syst%C3%A8mes_SA


    That's always been your problem - you think your experiences are
    all that exist.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Bart on Sun Sep 1 13:12:53 2024
    Bart <bc@freeuk.com> writes:
    On 31/08/2024 23:31, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    ...
    I can also say that the C grammar is buggy:

    assignment-expression:
    conditional-expression
    unary-expression asssignment-operator assignment-expression

    When attempting to parse an assignment-expression, do you go for a conditional-expression or unary-expression?

    Why are you attempting to parse an assignment-expression?
    At the start of a new statement, once you've gotten far enough to rule out
    any other kind of statement, you should parse it as an expression
    statement (6.8.3), which starts with an expression. If you have already
    parsed the statement far enough to recognize a unary expression, and
    then find it followed by an assignment operator, then you should check
    to see if the rest of the statement is an assignment-expression. If the
    first part isn't a unary-expression, then with few exceptions (such as
    inside a _Generic() expression), it's a syntax error.
    The latter is a subset of the former. If you go for a
    conditional-expression and find that an assignment-operator follows,
    now you have to perform some analysis on the LHS to see if that conditional-expression contains only a unary-expression.

    It qualifies as a conditional-expression only because being a
    unary-expression is one of the ways it can qualify as a conditional
    expression. You can discard information about having recognized it as a
    unary-expression if it's convenient, but you can't argue that it's
    convenient to discard it and at the same time complain about the
    consequences of having discarded it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Sun Sep 1 09:45:58 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    Any invalid syntax condition that can be removed using parentheses
    is not worth enforcing at the parse level. If it's wrong to
    assign to x + 1, you also need to diagnose when it's (x + 1).
    It's better to have a single rule which catches both.

    If you want to think that you are free to do so. But the
    statement is nothing more than one person's opinion.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Keith Thompson on Sun Sep 1 18:47:44 2024
    On 2024-09-01, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    Any invalid syntax condition that can be removed using parentheses
    is not worth enforcing at the parse level. If it's wrong to
    assign to x + 1, you also need to diagnose when it's (x + 1).
    It's better to have a single rule which catches both.

    If you want to think that you are free to do so. But the
    statement is nothing more than one person's opinion.

    Tim, are you under the impression that we need help figuring out
    whether something is an opinion or not? I don't believe we do.

    The point is obviously not that it's an opinion, but that it's only one
    person's opinion; i.e. that I have an opinion not shared by anyone,
    insinuating that it's extremely weird or poorly considered (or else so
    brilliant that its blinding wisdom is inaccessible to anyone else).

    For that head count to be correct, it must be the case that:

    1. Either BartCC is not a person, or else I'm not, or else we
    are not distinct persons.

    2. This one person is somehow responsible for all those compilers
    parsing x + 1 = y like (x + 1) = y, and letting it succumb to the
    modifiable lvalue constraint. (Or else, that implementation choice
    is just a random, or even ignorant, deviation from the formal grammar
    in ISO C, and doesn't actually reflect what anyone thinks is
    better---but in that case they still implemented something coinciding
    with the one person's weird opinion).


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Kaz Kylheku on Sun Sep 1 15:01:33 2024
    On 9/1/24 14:47, Kaz Kylheku wrote:
    On 2024-09-01, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    ...
    If you want to think that you are free to do so. But the
    statement is nothing more than one person's opinion.

    Tim, are you under the impression that we need help figuring out
    whether something is an opinion or not? I don't believe we do.

    The point is obviously not that it's an opinion, but that it's only one
    person's opinion; i.e. that I have an opinion not shared by anyone,
    insinuating that it's extremely weird or poorly considered (or else so
    brilliant that its blinding wisdom is inaccessible to anyone else).

    He didn't say that it was "only" one person's opinion. I don't think he
    was implying that your opinion is unique. I think that by saying that it
    was "nothing more" than one person's opinion, he meant that it's no more important than any other person's opinion.
    If I'm wrong about that, Tim might bother pointing it out - or maybe not.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Sun Sep 1 13:07:35 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-31, Bart <bc@freeuk.com> wrote:
    [...]
    I can also say that the C grammar is buggy:

    assignment-expression:
        conditional-expression
        unary-expression assignment-operator assignment-expression

    I second that. If I had energy and motivation for that sort of
    thing, I would submit a defect report.

    There is no value (no pun intended) in constraining the left side
    of an assignment to just that class of expression that might
    produce a modifiable lvalue, since that constraint must be checked regardless.

    It creates a gratuitous inconsistency between the formal ISO C
    grammar and the intuitive precedence model for expressions that
    everyone understands. Even the K&R2 book has a precedence table
    (P. p53?).

    When attempting to parse an assignment-expression, do you go for
    a conditional-expression or unary-expression?

    If using a parser generator tool, like one of the Yacc
    derivatives, I would just make it:

    expr : ...                   /* numerous rules */
         | expr '=' expr { ... }
         | ...                   /* numerous rules */
         ;

    using %left and %right declarations to set up the operator
    precedences and associativities.

    Anything you do differently from a specification, though, creates
    risk. It takes extra work to show that what you have is
    equivalent.

    This idea is totally wrongheaded. The most important purpose of
    the ISO C standard is to be read and understood by ordinary C
    developers, not just compiler writers. The grammar in the C
    standard is both easy to understand and a precise and accurate
    specification for what syntax is accepted. Furthermore it is
    written in a way that doesn't encourage any particular parsing
    technology, which is definitely a benefit, because it promotes
    looking at alternate approaches. On the last point, it is much
    more important that developers be able to tell when a compiler
    is doing things wrong than it is for compiler writers to be
    sure they have met a specification, because if it is the
    slightest bit difficult for developers then they will start to
    rely on compilers as the ultimate authority rather than the C
    standard itself, which is a very bad thing.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Sun Sep 1 13:14:34 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-09-01, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    Any invalid syntax condition that can be removed using parentheses
    is not worth enforcing at the parse level. If it's wrong to
    assign to x + 1, you also need to diagnose when it's (x + 1).
    It's better to have a single rule which catches both.

    If you want to think that you are free to do so. But the
    statement is nothing more than one person's opinion.

    Tim, are you under the impression that we need help figuring out
    whether something is an opinion or not? I don't believe we do.

    The point is obviously not that it's an opinion, but that it's only one
    person's opinion; i.e. that I have an opinion not shared by anyone,
    insinuating that it's extremely weird or poorly considered [...]

    I wasn't insinuating any such thing.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to James Kuyper on Sun Sep 1 13:11:45 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 9/1/24 14:47, Kaz Kylheku wrote:

    On 2024-09-01, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ...

    If you want to think that you are free to do so. But the
    statement is nothing more than one person's opinion.

    Tim, are you under the impression that we need help figuring out
    whether something is an opinion or not? I don't believe we do.

    The point is obviously not that it's an opinion, but that it's only one
    person's opinion; i.e. that I have an opinion not shared by anyone,
    insinuating that it's extremely weird or poorly considered (or else so
    brilliant that its blinding wisdom is inaccessible to anyone else).

    He didn't say that it was "only" one person's opinion. I don't think he
    was implying that your opinion is unique. [...]

    Quite so.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Keith Thompson on Sun Sep 1 14:17:02 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    Any invalid syntax condition that can be removed using parentheses
    is not worth enforcing at the parse level. If it's wrong to
    assign to x + 1, you also need to diagnose when it's (x + 1).
    It's better to have a single rule which catches both.

    If you want to think that you are free to do so. But the
    statement is nothing more than one person's opinion.

    Tim, are you under the impression that we need help figuring out
    whether something is an opinion or not? I don't believe we do.

    This could be an opportunity to let us know what your opinion is,
    and perhaps even support it. That might be interesting.

    I don't have an opinion on your question one way or the other.
    As a rule I don't think about what other people's reactions are
    to the posting to which I am responding. Of course sometimes I
    do, but generally I don't.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Sun Sep 1 23:48:17 2024
    On Sun, 01 Sep 2024 15:22:26 GMT, Scott Lurndal wrote:

    scott@slp53.sl.home (Scott Lurndal) writes:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Fri, 30 Aug 2024 14:38:03 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    Do you have any examples of C++ code that deals with the levels of
    indirection in the Linux kernel?

    The SVR4 VFS layer (which linux adopted) is ideally suited to using
    C++ derived classes. As is the network stack.

    Can you point to any examples of an actual C++-based implementation of
    same?

    Here's one:

    https://en.wikipedia.org/wiki/Chorus_Syst%C3%A8mes_SA

    Looks like it’s dead. And it was microkernel-based anyway, and such
    systems are known for low performance (as if C++ helped). Are you
    surprised it died?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Kaz Kylheku on Mon Sep 2 13:03:59 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    ...
    Do you think (or claim) that what is /required/ on each side of an
    assignment in C is exactly the same thing? The expression on the LHS is
    required to be a modifiable lvalue expression. That does not apply to
    the expression on right hand side.

    "modifiable lvalue" is a semantic attribute which depends on type
    and qualification. An array is an lvalue, but not modifiable.
    A const-qualified expression is also not a modifiable lvalue.

    Bart is insisting that these attributes are not a matter of syntax.

    Your intervention derailed the discussion into one of syntax. Bart then
    simply stopped talking about his original claim. Way back I pointed
    out that:

    || What is needed on the two sides is not the same.

    And he replied

    | I would argue that it is exactly the same.

    He did, later, say that it was "exactly the same" except for the
    differences but then went back to "I do mean exactly the same".

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Mon Sep 2 13:39:23 2024
    On 02/09/2024 13:03, Ben Bacarisse wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    ...
    Do you think (or claim) that what is /required/ on each side of an
    assignment in C is exactly the same thing? The expression on the LHS is
    required to be a modifiable lvalue expression. That does not apply to
    the expression on right hand side.

    "modifiable lvalue" is a semantic attribute which depends on type
    and qualification. An array is an lvalue, but not modifiable.
    A const-qualified expression is also not a modifiable lvalue.

    Bart is insisting that these attributes are not a matter of syntax.

    Your intervention derailed the discussion into one of syntax. Bart then simply stopped talking about his original claim. Way back I pointed
    out that:

    || What is needed on the two sides is not the same.

    And he replied

    | I would argue that it is exactly the same.

    He did, later, say that it was "exactly the same" except for the
    differences but then went back to "I do mean exactly the same".


    I said this:

    I explained that. LHS and RHS can be identical terms for assignment in pretty much every aspect, but there are extra constraints on the LHS.

    You then sarcastically suggested:

    So you use "exactly the same" to mean "exactly the same except for the differences".

    I then clarified:

    No, I do mean exactly the same, both in terms of syntax and (in my
    implementations, which are likely typical) internal representation of
    those terms.
    ...So there are no differences when considering only valid programs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Mon Sep 2 16:22:37 2024
    Bart <bc@freeuk.com> writes:

    On 02/09/2024 13:03, Ben Bacarisse wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    ...
    Do you think (or claim) that what is /required/ on each side of an
    assignment in C is exactly the same thing? The expression on the LHS is
    required to be a modifiable lvalue expression. That does not apply to
    the expression on right hand side.

    "modifiable lvalue" is a semantic attribute which depends on type
    and qualification. An array is an lvalue, but not modifiable.
    A const-qualified expression is also not a modifiable lvalue.

    Bart is insisting that these attributes are not a matter of syntax.
    Your intervention derailed the discussion into one of syntax. Bart then
    simply stopped talking about his original claim. Way back I pointed
    out that:
    || What is needed on the two sides is not the same.
    And he replied
    | I would argue that it is exactly the same.
    He did, later, say that it was "exactly the same" except for the
    differences but then went back to "I do mean exactly the same".


    I said this:

    I explained that. LHS and RHS can be identical terms for assignment in
    pretty much every aspect, but there are extra constraints on the LHS.

    You then sarcastically suggested:

    So you use "exactly the same" to mean "exactly the same except for the
    differences".

    I then clarified:

    No, I do mean exactly the same, both in terms of syntax and (in my
    implementations, which are likely typical) internal representation of
    those terms.
    ...So there are no differences when considering only valid programs.

    I wonder what it was you were really objecting to in the original remark
    that I made. Since ignoring the differences in what is required on the
    LHS and RHS always results in an invalid program, your summary is (to a
    first approximation) correct, but it does not render mine wrong in any
    interesting way.

    I note that you have, again, indulged in strategic snipping. The "..."
    was "There are no differences other than where the type system says your
    code is invalid.". What is it about the type system of C that makes

    int main(void) {
        extern char *p;
        *p = 0;
    }

    invalid? Because sometimes it is, depending on what p is in some other translation unit. Are you using your own meaning for "type system"? If
    so what is it?
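    For concreteness (my example, not Ben's): the snippet above has
    undefined behavior when the defining translation unit gives p an
    incompatible type, and no single compiler invocation can see the
    mismatch.

        /* other.c -- a hypothetical second translation unit */
        char p[10];   /* an array, not a char *; linking this with the
                         code above makes the program's behavior
                         undefined, with no diagnostic required */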

    And as for your remarks about typical implementations, does your C
    parser /really/ accept an assignment expression on both sides of an =
    operator? What does that even look like in the code? I have written
    one C parser, contributed to one other and (over the years) examined at
    least two more, and none of them do what you seem to be suggesting is
    typical.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Mon Sep 2 20:43:44 2024
    On 02/09/2024 16:22, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 02/09/2024 13:03, Ben Bacarisse wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-08-29, Ben Bacarisse <ben@bsb.me.uk> wrote:
    ...
    Do you think (or claim) that what is /required/ on each side of an
    assignment in C is exactly the same thing? The expression on the LHS is
    required to be a modifiable lvalue expression. That does not apply to
    the expression on right hand side.

    "modifiable lvalue" is a semantic attribute which depends on type
    and qualification. An array is an lvalue, but not modifiable.
    A const-qualified expression is also not a modifiable lvalue.

    Bart is insisting that these attributes are not a matter of syntax.
    Your intervention derailed the discussion into one of syntax. Bart then
    simply stopped talking about his original claim. Way back I pointed
    out that:
    || What is needed on the two sides is not the same.
    And he replied
    | I would argue that it is exactly the same.
    He did, later, say that it was "exactly the same" except for the
    differences but then went back to "I do mean exactly the same".


    I said this:

    I explained that. LHS and RHS can be identical terms for assignment in
    pretty much every aspect, but there are extra constraints on the LHS.

    You then sarcastically suggested:

    So you use "exactly the same" to mean "exactly the same except for the
    differences".

    I then clarified:

    No, I do mean exactly the same, both in terms of syntax and (in my
    implementations, which are likely typical) internal representation of
    those terms.
    ...So there are no differences when considering only valid programs.

    I wonder what it was you were really objecting to in the original remark
    that I made. Since ignoring the differences in what is required on the
    LHS and RHS always results in an invalid program, your summary is (to a
    first approximation) correct, but it does not render mine wrong in any
    interesting way.

    I note that you have, again, indulged in strategic snipping. The "..."
    was "There are no differences other than where the type system says your
    code is invalid.". What is it about the type system of C that makes

    int main(void) {
        extern char *p;
        *p = 0;
    }

    invalid? Because sometimes it is,

    This is always valid, when compiling this translation unit. If 'p' is
    defined wrongly elsewhere, then that's outside the remit of the compiler.

    This is a separate issue with C, in that it's not possible to check
    consistency across translation units of declarations for shared symbols.

    (My language solves that for the modules comprising a program, but it
    can still exist between programs rather than between modules.)

    But this is venturing away from the question of whether the left and
    right sides of an assignment are compatible, or the same, or symmetric.

    Obviously, one side is written to and the other is read; the RHS can
    also contain a wider range of terms than the left side.

    But usually what can legally be on the left side of an assignment can
    also be written on the right, with the same syntax and the same levels
    of indirection.

    depending on what p is in some other
    translation unit. Are you using your own meaning for "type system"? If
    so what is it?

    And as for your remarks about typical implementations, does your C
    parser /really/ accept an assignment expression on both sides of an = operator? What does that even look like in the code? I have written
    one C parser, contributed to one other and (over the years) examined at
    least two more, and none of them do what you seem to be suggesting is typical.


    Few of the compilers I tried reported a syntax error.

    For assignment expressions, my code is something like this:

    func readassignexpr =
        p := readcondexpr()        # p is an ast node

        if token is an assign operator then
            ....
            q := readassignexpr()
            combine p and q into a new p assignment node
        end

        return p
    end

    This is for top-down recursive descent. If I'd called 'readunaryexpr'
    instead, it would not recognise lots of expressions when what is being
    read isn't the LHS of an assignment.
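    In C, the same shape might look like this (my sketch; the node,
    parser, and helper names are hypothetical, mirroring the pseudocode
    above rather than any particular compiler):

        typedef struct Node Node;
        typedef struct { int token; /* ... */ } Parser;

        Node *read_cond_expr(Parser *ps);   /* assumed defined elsewhere */
        int   is_assign_op(int token);
        void  advance(Parser *ps);
        Node *new_assign_node(int op, Node *lhs, Node *rhs);

        /* Recursive descent for assignment-expression: accept a superset
           (any conditional expression on the left) and leave the lvalue
           check to a later semantic pass. */
        Node *read_assign_expr(Parser *ps)
        {
            Node *p = read_cond_expr(ps);
            if (is_assign_op(ps->token)) {
                int op = ps->token;
                advance(ps);                     /* consume the operator */
                Node *q = read_assign_expr(ps);  /* right-associative */
                p = new_assign_node(op, p, q);
            }
            return p;
        }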

    What did yours look like?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Mon Sep 2 23:48:19 2024
    On 02/09/2024 23:31, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    But this is venturing away from the question of whether the left and
    right sides of an assignment are compatible, or the same, or
    symmetric.

    Obviously, one side is written to and the other is read; the RHS can
    also contain a wider range of terms than the left side.

    But usually what can legally be on the left side of an assignment can
    also be written on the right, with the same syntax and the same
    levels of indirection.

    Yes, but what can legally be on the right side of an assignment very
    often cannot be written on the left. I don't call that "symmetric".

    The symmetry is about when you /do/ legally have the same thing either
    side of '='. That is in contrast to BLISS where the RHS needs an
    explicit deref symbol, but the LHS doesn't.

    BLISS, AFAIK, can also have unbalanced left and right expressions like:

    A = .B + .C + .D

    which I believe is the point you're making above. Yet that was described
    as 'symmetric'.

    Here's a more realistic example, negating a variable A:

    A = - A;     // C, described as 'asymmetric'
    A = - .A;    // BLISS, described as 'symmetric'

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Mon Sep 2 23:59:32 2024
    On 02/09/2024 23:52, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 02/09/2024 23:31, Keith Thompson wrote:
    [...]
    Yes, but what can legally be on the right side of an assignment very
    often cannot be written on the left. I don't call that "symmetric".

    The symmetry is about when you /do/ legally have the same thing either
    side of '='. That is in contrast to BLISS where the RHS needs an
    explicit deref symbol, but the LHS doesn't.

    Thank you for clarifying what you mean by "symmetric".

    I won't waste any more time debating it.


    I wonder what /you/ had in mind then for 'symmetry'; that you can
    legally have the same arbitrary expression on either side of '='?

    That's only going to work when '=' means 'equality'!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Mon Sep 2 20:04:38 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    [...]

    And as for your remarks about typical implementations, does your C
    parser /really/ accept an assignment expression on both sides of
    an = operator? What does that even look like in the code? I have
    written one C parser, contributed to one other and (over the
    years) examined at least two more, and none of them do what you
    seem to be suggesting is typical.

    It wouldn't be surprising to see a parser written so it would
    accept (syntactically) a superset of the well-formed inputs
    allowed by the language grammar. Any parses not allowed by the
    grammar could then be flagged as erroneous in a later semantics
    pass.

    One reason to do this is to simplify error recovery in the face
    of syntax errors. It's much easier to recover from a "correct"
    parse than from one that looks hopelessly lost.

    I'm not making any claim that such an approach is typical. On
    the other hand it does seem to fit with some of the diagnostics
    given by gcc for inputs that are syntactically ill-formed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Bart on Mon Sep 2 19:44:03 2024
    Bart <bc@freeuk.com> writes:

    [...]

    But this is venturing away from the question of whether the left
    and right sides of an assignment are compatible, or the same, or
    symmetric.

    Obviously, one side is written to and the other is read; the RHS
    can also contain a wider range of terms than the left side.

    But usually what can legally be on the left side of an assignment
    can also be written on the right, with the same syntax and the
    same levels of indirection.

    If you wouldn't mind a suggestion or two, here are some.

    First, look for accurate ways of expressing what you want to say. Syntactically, the relationship being considered is not a
    symmetry but a subset. Considering just syntax, what can appear
    on the left side of an assignment is a subset of what can appear
    on the right side of an assignment. I think everyone would agree
    on that point. After getting agreement on the syntax side of the
    issue, the discussion can then pivot to semantic considerations.

    Second, try not to always be so defensive. Disagreement doesn't
    always mean criticism. Asking a question usually isn't meant as
    an attack but just as an attempt to get clarification or more
    information. Focus on communication, especially on understanding
    what the other side is saying. Don't think I'm singling you out
    on this; lots of people here are guilty of not listening as much
    or as carefully as they should (myself sometimes included).

    Let me say explicitly, I don't mean either of these suggestions
    as criticism. My aim is only to help the conversation.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Tue Sep 3 16:08:59 2024
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    [...]

    And as for your remarks about typical implementations, does your C
    parser /really/ accept an assignment expression on both sides of
    an = operator? What does that even look like in the code? I have
    written one C parser, contributed to one other and (over the
    years) examined at least two more, and none of them do what you
    seem to be suggesting is typical.

    It wouldn't be surprising to see a parser written so it would
    accept (syntactically) a superset of the well-formed inputs
    allowed by the language grammar. Any parses not allowed by the
    grammar could then be flagged as erroneous in a later semantics
    pass.

    Yes, that is pretty much what I've seen in more than one C parser.

    I'm going to try to stop replying to Bart, partly because I think he
    finds my replies annoying (so they are likely to provoke unproductive
    exchanges) but mainly because I am too literal. His reply to me shows
    that he parses C like most of the compilers I've seen -- accepting a
    superset of valid LH sides (as you say) but not "exactly the same"
    syntax on both sides. Recursive descent parsers are inherently lopsided
    in this respect because that's how they implement associativity.

    One reason to do this is to simplify error recovery in the face
    of syntax errors. It's much easier to recover from a "correct"
    parse than from one that looks hopelessly lost.

    I'm not making any claim that such an approach is typical. On
    the other hand it does seem to fit with some of the diagnostics
    given by gcc for inputs that are syntactically ill-formed.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Thu Sep 5 15:21:01 2024
    Bart <bc@freeuk.com> wrote:

    So what exactly is different about the LHS and RHS here:

    A = A;

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while 'A = A' is
    also valid, there is a hidden mismatch in indirection levels between
    left and right. It is asymmetric while in C it is symmetric, although
    some seem to disagree on that latter point.)

    You seem to miss the point that the assignment operator is fundamentally
    asymmetric. This is quite visible at low level, where a typical
    machine has a 'store' instruction. Store takes an address (memory location)
    as its left argument, but a _value_ as its right argument. Your example
    introduces fake symmetry: you evaluate the right-hand side using
    a load, and this may look symmetric with the store. But even here
    there is asymmetry, which is better visible with a naive compiler.
    You may get code like

    compute address of A
    load
    compute address of A
    store

    The last step implements '='; the second 'compute address' corresponds
    to A on the left-hand side. The first 'compute address' corresponds to
    A on the right-hand side. Now you see that besides the address computation
    there is also a load corresponding to A on the right-hand side.
    So clearly in most languages the treatment of the sides is asymmetric:
    extra loads are inserted due to 'lvalue conversion'.

    To put it in a more general context: the early success of C was related
    to exposing address computations, so that programmers could
    do optimization by hand (and consequently a non-optimizing compiler
    could produce reasonably fast object code). This has a cost: the
    need for explicit pointer dereferences not needed in other languages.
    BLISS went slightly further and requires explicit dereferences
    to get the values of variables. My point is that this is logical
    regardless of whether you like it or not.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Waldek Hebisch on Thu Sep 5 16:54:57 2024
    On 2024-09-05, Waldek Hebisch <antispam@fricas.org> wrote:
    Bart <bc@freeuk.com> wrote:

    So what exactly is different about the LHS and RHS here:

    A = A;

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while 'A = A' is
    also valid, there is a hidden mismatch in indirection levels between
    left and right. It is asymmetric while in C it is symmetric, although
    some seem to disagree on that latter point.)

    You seem to miss the point that the assignment operator is fundamentally asymmetric.

    Both sides of an assignment can be complex expressions that designate
    an object (though the right side need not). Only one detail is
    different: the prior value of the left hand side object is not fetched,
    but rather overwritten.

    All constituents of both expressions have to be evaluated the same way.

    *(a->b[c++].d(arg)) = e(f)[42]

    Here the value of a has to be fetched, c has to be incremented,
    the function pointer .d called, and so on.

    Note also that a swap operator, which is in the assignment family, is completely symmetric.

    swap(lvalue1, lvalue2)

    Both expressions get evaluated the same way. They designate objects,
    the prior values of which are fetched, and then stored back in
    reverse order.
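
    To make that concrete in C, which has no built-in swap operator, here is
    a minimal sketch using a hypothetical SWAP macro. Both operands are
    lvalue expressions evaluated the same way; the prior values are fetched
    and stored back in reverse order. (Each operand is expanded twice, so
    operands with side effects like c++ would misbehave; this is a sketch,
    not a library.)

        #include <stdio.h>

        /* Hypothetical swap in the assignment family: both arguments must
           designate objects, exactly like the two sides of '='. */
        #define SWAP(T, x, y) do { T tmp_ = (x); (x) = (y); (y) = tmp_; } while (0)

        int main(void) {
            int a[2] = {1, 2};
            int i = 0;
            SWAP(int, a[i], a[i + 1]);
            printf("%d %d\n", a[0], a[1]);   /* prints: 2 1 */
            return 0;
        }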

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Thu Sep 5 19:10:14 2024
    On 05/09/2024 16:21, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    So what exactly is different about the LHS and RHS here:

    A = A;

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while 'A = A' is
    also valid, there is a hidden mismatch in indirection levels between
    left and right. It is asymmetric while in C it is symmetric, although
    some seem to disagree on that latter point.)

    You seem to miss the point that the assignment operator is fundamentally asymmetric.

    If you've followed the subthread then you will know that nobody disputes
    that assignment reads from one side of '=' and writes to the other.

    The symmetry is to do with syntax when the same term appears on both
    sides of '=', the type associated with each side, and, typically, the
    internal representations too.

    Clearly the '=' operation is not reversible (or commutative), as binary
    '+' might be, but not binary '-'. That is not what I'm claiming.

    This is quite visible at low level, where a typical
    machine has a 'store' instruction. Store takes an address (memory location)
    as its left argument, but a _value_ as its right argument. Your example
    introduces fake symmetry: you evaluate the right-hand side using
    a load, and this may look symmetric with the store.

    Low level can reveal symmetry too. If 'A' is stored in a register 'Ra',
    then copying A to itself can be done like this on x64:

    mov Ra, Ra

    (Whether data movement is RTL or LTR depends on the assembly syntax, but
    in this example it doesn't matter.)

    If A is in memory then it could be the same on 2-address architectures:

    mov [A], [A]

    but more typically it needs two instructions (here using RTL):

    mov R, [A]
    mov [A], R

    Here, [A] appears in both instructions, it means the same thing, and
    refers to the same location. Only the position (left vs. right operand,
    exactly the same as in A = A) tells you if it's reading or writing.



    But even here
    there is asymmetry, which is better visible with a naive compiler.
    You may get code like

    compute address of A
    load
    compute address of A
    store

    The last step implements '='; the second 'compute address' corresponds
    to A on the left-hand side. The first 'compute address' corresponds to
    A on the right-hand side. Now you see that besides the address computation
    there is also a load corresponding to A on the right-hand side.


    So clearly in most languages the treatment of the sides is asymmetric:
    extra loads are inserted due to 'lvalue conversion'.

    There is a Load on one operand and a balancing Store on the other. Two
    loads or two stores would not make sense here.

    If you want to go to a lower level, look at how a simple microprocessor
    works. It will generate a /WR signal on memory accesses that tells a RAM
    device whether to use the data bus as input or output.

    Note that Load and Store can also be considered symmetric: each Load
    reads data from somewhere and writes it somewhere else. Just like Store
    does. So some instruction sets use the same mnemonic for both.



    To put it in a more general context: the early success of C was related
    to exposing address computations, so that programmers could
    do optimization by hand (and consequently a non-optimizing compiler
    could produce reasonably fast object code). This has a cost: the
    need for explicit pointer dereferences not needed in other languages.
    BLISS went slightly further and requires explicit dereferences
    to get the values of variables. My point is that this is logical
    regardless of whether you like it or not.

    And /my/ point was that in virtually every HLL, that dereference to turn
    a variable's address, denoted by its name, into either a read or write
    access of its value, is implicit.

    I said this was a defining characteristic of HLLs, but one which BLISS
    does not have, or has it lop-sidedly: you need the explicit dereference
    in an rvalue context, but not an lvalue one, like the RHS and LHS of an
    assignment. I considered that asymmetric, but most who took part decided
    that that made it symmetric!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Kaz Kylheku on Thu Sep 5 17:37:44 2024
    On 9/5/24 12:54, Kaz Kylheku wrote:
    On 2024-09-05, Waldek Hebisch <antispam@fricas.org> wrote:
    ...
    You seem to miss the point that the assignment operator is fundamentally
    asymmetric.

    Both sides of an assignment can be complex expressions that designate
    an object (though the right side need not).

    So you've correctly identified the very fundamental asymmetry.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Kuyper on Fri Sep 6 10:35:16 2024
    On 05/09/2024 22:37, James Kuyper wrote:
    On 9/5/24 12:54, Kaz Kylheku wrote:
    On 2024-09-05, Waldek Hebisch <antispam@fricas.org> wrote:
    ...
    You seem to miss the point that the assignment operator is fundamentally
    asymmetric.

    Both sides of an assignment can be complex expressions that designate
    an object (though the right side need not).

    So you've correctly identified the very fundamental asymmetry.


    Sure, if you want to completely disregard all the cases where the
    symmetry does exist.

    That means that for you, there is no interesting difference (using my
    example of assigning A to itself) in a language where you write 'A = A',
    and one where you write 'A = .A'.

    (I'd be interested in how, in the latter language, you'd write the
    equivalent of 'A = A = A' in C, since the middle term is both on the
    left of '=', and on the right!)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bart on Fri Sep 6 14:05:41 2024
    On Fri, 6 Sep 2024 10:35:16 +0100
    Bart <bc@freeuk.com> wrote:

    On 05/09/2024 22:37, James Kuyper wrote:
    On 9/5/24 12:54, Kaz Kylheku wrote:
    On 2024-09-05, Waldek Hebisch <antispam@fricas.org> wrote:
    ...
    You seem to miss the point that the assignment operator is
    fundamentally asymmetric.

    Both sides of an assignment can be complex expressions that
    designate an object (though the right side need not).

    So you've correctly identified the very fundamental asymmetry.


    Sure, if you want to completely disregard all the cases where the
    symmetry does exist.

    That means that for you, there is no interesting difference (using my
    example of assigning A to itself) in a language where you write 'A =
    A', and one where you write 'A = .A'.

    (I'd be interested in how, in the latter language, you'd write the
    equivalent of 'A = A = A' in C, since the middle term is both on the
    left of '=', and on the right!)

    The point is that in BLISS everything that is legal on the right side of an assignment is also legal on the left side.
    I don't know if the point is generally true. In particular, if BLISS
    supports floating point, what is the meaning of floating point on the left
    side?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Fri Sep 6 10:19:11 2024
    Bart <bc@freeuk.com> wrote:
    On 05/09/2024 16:21, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    So what exactly is different about the LHS and RHS here:

    A = A;

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while 'A = A' is
    also valid, there is a hidden mismatch in indirection levels between
    left and right. It is asymmetric while in C it is symmetric, although
    some seem to disagree on that latter point.)

    You seem to miss the point that the assignment operator is fundamentally
    asymmetric.

    If you've followed the subthread then you will know that nobody disputes
    that assignment reads from one side of '=' and writes to the other.

    I dispute this, and I think that to some degree several other folks do too. Assignment _does not read_, it "only" writes. Assignment gets two
    parameters which are treated in different ways. Imagine that you
    are programming in a language like C, but are forbidden to use the
    assignment operator. But fortunately you have a C "function"
    'assign' with the prototype:

    void assign(int * p, int v);

    Instead of writing

    A = B

    you need to write

    assign(&A, B)

    Of course, in real life nobody is going to force you to do anything,
    but except for the fact that in C assignment has a value, the 'assign'
    function is doing the same thing as the '=' operator. And you can
    see that it is asymmetric: the first argument is an address and the second
    is a value.

    The symmetry is to do with syntax when the same term appears on both
    sides of '=', the type associated with each side, and, typically, the internal representations too.

    A simple compiler after parsing does not need "operators" at all.
    You have a parse tree where instead of "operators" there are function
    calls (of course you still need control structures). The code
    generator has a table, and if the function to be called is in the table,
    then it emits the corresponding instruction(s); otherwise it emits a
    real call. For assignment the table will simply contain a store
    instruction.
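
    As a minimal illustration of that table-driven scheme (all names here
    are hypothetical, not from any real compiler), a code generator might
    look like this:

        #include <stdio.h>
        #include <string.h>

        /* Toy builtin table: "functions" that are open-coded as single
           instructions; anything else becomes a real call. */
        static const struct { const char *name; const char *insn; } builtins[] = {
            { "assign", "store" },
            { "add",    "add"   },
        };

        static void gen_call(const char *fn) {
            for (size_t i = 0; i < sizeof builtins / sizeof builtins[0]; i++) {
                if (strcmp(fn, builtins[i].name) == 0) {
                    printf("  %s\n", builtins[i].insn);  /* open-coded builtin */
                    return;
                }
            }
            printf("  call %s\n", fn);                   /* ordinary function */
        }

        int main(void) {
            gen_call("assign");   /* emits: store */
            gen_call("qsort");    /* emits: call qsort */
            return 0;
        }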

    Clearly the '=' operation is not reversible (or commutative), as binary
    '+' might be, but not binary '-'. That is not what I'm claiming.

    This is quite visible at low level, where a typical
    machine has a 'store' instruction. Store takes an address (memory location)
    as its left argument, but a _value_ as its right argument. Your example
    introduces fake symmetry: you evaluate the right-hand side using
    a load, and this may look symmetric with the store.

    Low level can reveal symmetry too. If 'A' is stored in a register 'Ra',
    then copying A to itself can be done like this on x64:

    mov Ra, Ra

    (Whether data movement is RTL or LTR depends on the assembly syntax, but
    in this example it doesn't matter.)

    That is a _very_ special case and for this reason misleading.

    If A is in memory then it could be the same on 2-address architectures:

    mov [A], [A]

    but more typically it needs two instructions (here using RTL):

    mov R, [A]
    mov [A], R

    Here, [A] appears in both instructions, it means the same thing, and
    refers to the same location. Only the position (left vs. right operand, exactly the same as in A = A) tells you if it's reading or writing.

    You somewhat miss the fact that "A = B" has 3 parts, that is "A", "=", and "B".
    The second 'mov' instruction came from "="; the first 'mov' is extra.
    So the instructions look symmetric, but clearly the assignment part is asymmetric.


    But even here
    there is asymmetry, which is better visible with a naive compiler.
    You may get code like

    compute address of A
    load
    compute address of A
    store

    The last step implements '='; the second 'compute address' corresponds
    to A on the left-hand side. The first 'compute address' corresponds to
    A on the right-hand side. Now you see that besides the address computation
    there is also a load corresponding to A on the right-hand side.


    So clearly in most languages the treatment of the sides is asymmetric:
    extra loads are inserted due to 'lvalue conversion'.

    There is a Load on one operand and a balancing Store on the other. Two
    loads or two stores would not make sense here.

    Again: only the store comes from the assignment. This is clearly visible
    if instead of the misleading "A = A" you take "A = B" and replace
    'B' by various things. The assignment part (the store instruction) stays
    the same, the computation of the value changes. In

    A = c + d

    you get two loads (for c and d) and then an addition. To put it
    differently, you have

    compute value of B
    compute address of A
    store

    the "compute value of B" may be as trivial as putting constant
    as part of store instruction, it may be load as in your case,
    it may be compute value of some complex expression. Similarly,
    "compute address of A" part can trivialize when 'A' is in
    machine register, sligtly less trivial address of 'A' may be
    constant put into store instruction, 'A' may a local variable
    at some offset from stack, then again many machine can do this
    as part of address calculation in store instruction. But 'A'
    may be more compilcated and then you need real computaion and
    extra instructions.

    Anyway, only the store instruction corresponds to the assignment operator
    proper; 'A' and 'B' just compute parameters for the assignment.

    If you want to go to a lower level, look at how a simple microprocessor works. It will generate a /WR signal on memory accesses that tells a RAM device whether to use the data bus as input or output.

    Note that Load and Store can also be considered symmetric: each Load
    reads data from somewhere and writes it somewhere else. Just like Store
    does. So some instruction sets use the same mnemonic for both.

    Concerning instructions, sure. But a load is not an assignment.
    It may look so in simple, misleading cases. But even if 'A' is
    allocated to a register and you translate the whole "A = B" to a single
    load, the load computes the value of 'B', and if the result is in the
    correct register the assignment proper can be optimised to a
    no-operation.


    To put it in a more general context: the early success of C was related
    to exposing address computations, so that programmers could
    do optimization by hand (and consequently a non-optimizing compiler
    could produce reasonably fast object code). This has a cost: the
    need for explicit pointer dereferences not needed in other languages.
    BLISS went slightly further and requires explicit dereferences
    to get the values of variables. My point is that this is logical
    regardless of whether you like it or not.

    And /my/ point was that in virtually every HLL, that dereference to turn
    a variable's address, denoted by its name, into either a read or write
    access of its value, is implicit.

    I partially agree. The normal case is that write access is explicit
    (for example via '=' in C) and it simply takes the variable's address. Only
    read access is implicit.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Bart on Fri Sep 6 04:53:50 2024
    Bart <bc@freeuk.com> writes:

    On 05/09/2024 16:21, Waldek Hebisch wrote:

    Bart <bc@freeuk.com> wrote:

    So what exactly is different about the LHS and RHS here:

    A = A;

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while
    'A = A' is also valid, there is a hidden mismatch in indirection
    levels between left and right. It is asymmetric while in C it
    is symmetric, although some seem to disagree on that latter point.)

    You seem to miss the point that the assignment operator is
    fundamentally asymmetric.

    If you've followed the subthread then you will know that nobody
    disputes that assignment reads from one side of '=' and writes to the
    other.

    The symmetry is to do with syntax when the same term appears on
    both sides of '=', the type associated with each side, and,
    typically, the internal representations too.

    Maybe it would help if you would stop thinking in terms of the
    word symmetry (clearly assignment is not symmetrical) and instead
    think about consistency.

    In C, the meaning of an identifier or object-locating expression
    depends on where it is in the syntax tree. In some places it
    means the address of the object; in other places it means the
    contents of whatever is stored in the object. Those meanings
    are very different; among other things, they have different
    types (if one type is 'int' the other is 'int *').

    In Bliss, by contrast, the meaning of an identifier is the same
    no matter where it appears in the syntax tree: it always means
    the address of the object. The meaning is independent of where
    the term appears in the input, which is to say the meaning is
    consistent from place to place.

    In C the meaning is not consistent - in some places it means the
    address, in other places whatever is stored at the address.

    Considering the point of view of a compiler writer, it's easier
    to write a compiler for Bliss than for C. In Bliss, upon seeing
    an identifier, always simply put its address in a register. If
    an object's value needs to be loaded, there will be a '.' to take
    the address produced by the sub-expression and fetch the word
    stored at that address. On the other hand, in C, upon seeing an
    identifier, the compiler needs to consider the context of where
    the identifier appears: on the left hand side of an assignment
    it means one thing, in almost all other places it means something
    else. There needs to be code in the compiler to decide which of
    these two meanings is in effect for the node in question.
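
    A minimal sketch of that decision (hypothetical compiler internals;
    register allocation and the rest of the code generator are elided, so a
    single register is reused where a real compiler would use two):

        #include <stdio.h>

        /* Which meaning of an identifier is in effect at this node. */
        typedef enum { AS_VALUE, AS_ADDRESS } Ctx;

        static void gen_ident(const char *name, Ctx ctx) {
            printf("  lea r0, [%s]\n", name);   /* always know the address */
            if (ctx == AS_VALUE)
                printf("  load r0, [r0]\n");    /* C's implicit lvalue conversion */
        }

        int main(void) {
            /* A = A;  -- RHS in value context, LHS in address context */
            gen_ident("A", AS_VALUE);
            gen_ident("A", AS_ADDRESS);
            printf("  store\n");
            return 0;
        }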

    Please note that I am making no claim that the Bliss approach is
    better than the C approach, or vice versa. My purpose here is to
    explain the differences, not evaluate them.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Fri Sep 6 12:34:34 2024
    On 06/09/2024 11:19, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    If you've followed the subthread then you will know that nobody disputes
    that assignment reads from one side of '=' and writes to the other.

    I dispute this, and I think that to some degree several other folks do too. Assignment _does not read_, it "only" writes. Assignment gets two
    parameters which are treated in different ways. Imagine that you
    are programming in a language like C, but are forbidden to use the
    assignment operator. But fortunately you have a C "function"
    'assign' with the prototype:

    void assign(int * p, int v);

    Instead of writing

    A = B

    you need to write

    assign(&A, B)

    Of course, in real life nobody is going to force you to do anything,
    but except for the fact that in C assignment has a value, the 'assign'
    function is doing the same thing as the '=' operator. And you can
    see that it is asymmetric: the first argument is an address and the second
    is a value.

    If you have to use a function, yes. Because you've introduced an
    artificial split in Where and When those dereferences are done.

    With A=B they can be done at about the same time and the same place.

    With ASSIGN(), the B dereference is done at the call-site; the A
    dereference is done inside ASSIGN(), so you are obliged to pass an
    explicit reference to A. While B has already been dereferenced and you
    have its value to pass.

    (You can balance it out by requiring ASSIGN(&A, &B)!)


    If A is in memory then it could be the same on 2-address architectures:

    mov [A], [A]

    but more typically it needs two instructions (here using RTL):

    mov R, [A]
    mov [A], R

    Here, [A] appears in both instructions, it means the same thing, and
    refers to the same location. Only the position (left vs. right operand,
    exactly the same as in A = A) tells you if it's reading or writing.

    You somewhat miss the fact that "A = B" has 3 parts, that is "A", "=", and "B".
    The second 'mov' instruction came from "="; the first 'mov' is extra.
    So the instructions look symmetric, but clearly the assignment part is asymmetric.

    Here is A:=B from my HLLs in one of my ILs (intermediate language):

    load B
    store A

    That was stack-based; here it is in 3-address-code IL:

    a := b # more formally, 'move a, b'

    This is it in dynamic byte-code:

    push B
    pop A

    In every case, both A and B operands have the same rank and the same
    levels of indirection. Only the opcode and/or operand position indicates
    if a read or write operation is performed.

    There is a Load on one operand and a balancing Store on the other. Two
    loads or two stores would not make sense here.

    Again: only the store comes from the assignment. This is clearly visible
    if instead of the misleading "A = A" you take "A = B" and replace
    'B' by various things. The assignment part (the store instruction) stays
    the same, the computation of the value changes. In

    A = c + d

    This has been covered. The syntactical symmetry is that whatever you
    have on the LHS, you can write the same thing on the RHS:

    A[i+1].m = A[i+1].m

    Obviously, you can have RHS terms that cannot appear on the left, like
    '42', but that's usually due to separate constraints of the language.
    (C's grammar allows 42 on the LHS of '=' for example).



    you get two loads (for c and d) and then an addition. To put it
    differently, you have

    compute value of B
    compute address of A
    store

    Why don't you need to compute the address of B? Why don't you need to
    load the value of B? It is more like this:

    compute address of B
    load value of B via that address to some temporary location
    compute address of A
    store new value of A via that address

    The only asymmetry in all my examples has been between Load/Store;
    Push/Pop; or positional as in Left/Right.

    The mechanism for EITHER reading or writing the value of an object via
    its reference is the same; only the direction of data movement is the parameter.

    Note that Load and Store can also be considered symmetric: each Load
    reads data from somewhere and writes it somewhere else. Just like Store
    does. So some instruction sets use the same mnemonic for both.

    Concerning instructions, sure. But a load is not an assignment.
    It may look so in simple, misleading cases.

    In my PUSH B example from interpreted code, it will read B from memory,
    and write it to a stack (also in memory). POP A will read from the
    stack, and write to memory.

    So both operations are really memory-to-memory.

    But even if 'A' is
    allocated to a register and you translate the whole "A = B" to a single
    load, the load computes the value of 'B', and if the result is in the
    correct register the assignment proper can be optimised to a
    no-operation.

    I've done this stuff at the chip level (writing into an 8-bit latch for example, then reading from it); it's going to take a lot to convince me
    that this is anything much different from a read/write or direction flag!

    In more complicated cases in languages, some asymmetry does come
    up. For example, suppose C allowed this (my language allows the equivalent):

    (c ? a : b) = x;

    So this assigns to either a or b depending on c. My implementation
    effectively turns it into this:

    *(c ? &a : &b) = x;

    So using explicit references and derefs. However, that is internal. The symmetry still exists in the syntax:

    (c ? a : b) = (c ? a : b);
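
    For anyone who wants to try it, the pointer rewrite is already valid C
    as it stands (in C++, unlike C, even the unrewritten form is legal,
    since ?: can yield an lvalue there):

        #include <stdio.h>

        int main(void) {
            int a = 0, b = 0, x = 42, c = 1;

            /* C rejects (c ? a : b) = x, but the rewritten form works: */
            *(c ? &a : &b) = x;

            printf("a=%d b=%d\n", a, b);   /* prints: a=42 b=0 */
            return 0;
        }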


    And /my/ point was that in virtually every HLL, that dereference to turn
    a variable's address, denoted by its name, into either a read or write
    access of its value, is implicit.

    I partially agree. The normal case is that write access is explicit
    (for example via '=' in C) and it simply takes the variable's address. Only
    read access is implicit.
    Well, you need /something/ to denote whether you are reading or writing
    some location.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Tim Rentsch on Fri Sep 6 14:48:54 2024
    On 06/09/2024 12:53, Tim Rentsch wrote:
    Bart <bc@freeuk.com> writes:

    On 05/09/2024 16:21, Waldek Hebisch wrote:

    Bart <bc@freeuk.com> wrote:

    So what exactly is different about the LHS and RHS here:

    A = A;

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while
    'A = A' is also valid, there is a hidden mismatch in indirection
    levels between left and right. It is asymmetric while in C it
    is symmetric, although some seem to disagree on that latter point.)

    You seem to miss the point that the assignment operator is
    fundamentally asymmetric.

    If you've followed the subthread then you will know that nobody
    disputes that assignment reads from one side of '=' and writes to the
    other.

    The symmetry is to do with syntax when the same term appears on
    both sides of '=', the type associated with each side, and,
    typically, the internal representations too.

    Maybe it would help if you would stop thinking in terms of the
    word symmetry (clearly assignment is not symmetrical) and instead
    think about consistency.

    In C, the meaning of an identifier or object-locating expression
    depends on where it is in the syntax tree. In some places it
    means the address of the object; in other places it means the
    contents of whatever is stored in the object.

    In a HLL, a named object (ie. a variable name) is nearly always meant
    to refer to an object's value, either its current value or what will be
    its new value.

    It will rarely be intended to mean the name itself (ie. its address)
    without extra denotations, other than in special cases (eg. names of
    functions, or names of arrays in C).

    The implementation may sometimes need to use the address instead, but
    that is hidden. (For example in evaluating x.m, you don't want to load a
    500KB struct just to extract one small element).

    I'm not sure what you mean by object-locating expression, but any
    anonymous intermediate results (I call them transient values) generally
    are considered rvalues in the HLL. They would need an explicit pointer
    deref op to perform any stores.

    Those meanings
    are very different; among other things, they have different
    types (if one type is 'int' the other is 'int *').

    In Bliss, by contrast, the meaning of an identifier is the same
    no matter where it appears in the syntax tree: it always means
    the address of the object. The meaning is independent of where
    the term appears in the input, which is to say the meaning is
    consistent from place to place.

    In BLISS both A and .A rvalues apparently have the same type. Both A = A
    and A = .A are apparently valid, but do different things.

    (But I don't know if ..A would work. In C, A = **A is invalid because of
    the type system, but there isn't one in BLISS. However, when A is an
    integer array, then i[A][A][A] does famously work - with suitable data
    values.)
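
    For the record, that trick really does compile and run, because x[i]
    is defined as *(x + i), so i[A] means the same as A[i]. A minimal
    demonstration with made-up data values:

        #include <stdio.h>

        int main(void) {
            int A[] = { 1, 2, 0, 3 };
            int i = 2;
            printf("%d\n", i[A]);         /* A[2] == 0 */
            printf("%d\n", i[A][A]);      /* A[A[2]] == A[0] == 1 */
            printf("%d\n", i[A][A][A]);   /* A[A[A[2]]] == A[1] == 2 */
            return 0;
        }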


    In C the meaning is not consistent - in some places it means the
    address, in other places whatever is stored at the address.

    Considering the point of view of a compiler writer, it's easier
    to write a compiler for Bliss than for C. In Bliss, upon seeing
    an identifier, always simply put its address in a register. If
    an object's value needs to be loaded, there will be a '.' to take
    the address produced by the sub-expression and fetch the word
    stored at that address. On the other hand, in C, upon seeing an
    identifier, the compiler needs to consider the context of where
    the identifier appears:

    You can do the same thing in a C compiler: always load the address of
    any identifier associated with the location of a value. Then decide
    whether anything else needs to be done. The rules are a little more
    elaborate, but then C is a more complicated language.

    You can try this in C source too:

    *&A = *&B;

    although the compiler is likely to cancel out both those sets of
    operations (symmetrically on both sides).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Fri Sep 6 07:56:56 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Fri, 6 Sep 2024 10:35:16 +0100
    Bart <bc@freeuk.com> wrote:

    On 05/09/2024 22:37, James Kuyper wrote:

    On 9/5/24 12:54, Kaz Kylheku wrote:

    On 2024-09-05, Waldek Hebisch <antispam@fricas.org> wrote:

    ...

    You seem to miss the point that assigment operator is
    fundamentally assymetic.

    Both sides of an assignment can be complex expressions that
    designate an object (though the right side need not).

    So you've correctly identified the very fundamental asymmetry.

    Sure, if you want to completely disregard all the cases where the
    symmetry does exist.

    That means that for you, there is no interesting difference (using my
    example of assigning A to itself) in a language where you write 'A =
    A', and one where you write 'A = .A'.

    (I'd be interested in how, in the latter language, you'd write the
    equivalent of 'A = A = A' in C, since the middle term is both on the
    left of '=', and on the right!)

    The point is that in BLISS everything that is legal on the right side of an assignment is also legal on the left side.
    I don't know if the point is generally true. In particular, if BLISS supports floating point, what is the meaning of floating point on the left
    side?

    BLISS is word based and typeless. On a PDP-10, doing a

    .pi = 0

    where 'pi' holds a 36-bit floating-point value (and 3.14159...
    presumably), that floating-point value would be used as an
    address and 0 would be stored into it (assuming I remember
    BLISS correctly).

    So probably not what one wants to do. ;)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Bart on Fri Sep 6 13:23:30 2024
    On Fri, 6 Sep 2024 10:35:16 +0100
    Bart <bc@freeuk.com> wrote:

    On 05/09/2024 22:37, James Kuyper wrote:
    On 9/5/24 12:54, Kaz Kylheku wrote:
    ...
    Both sides of an assignment can be complex expressions that
    designate an object (though the right side need not).

    So you've correctly identified the very fundamental asymmetry.


    Sure, if you want to completely disregard all the cases where the
    symmetry does exist.

    Anything can be considered symmetric, if you ignore all the aspects of
    it that are asymmetric. As a result, calling something symmetric for
    that reason isn't worth commenting on.

    A more useful way of describing what you're commenting on is not to
    falsely claim that assignment in general is symmetric, but rather that
    the particular assignment you're interested in is symmetric. And it's only
    symmetric syntactically; the associated semantics are profoundly asymmetric.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to James Kuyper on Fri Sep 6 19:58:50 2024
    On 2024-09-06, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    On Fri, 6 Sep 2024 10:35:16 +0100
    Bart <bc@freeuk.com> wrote:

    On 05/09/2024 22:37, James Kuyper wrote:
    On 9/5/24 12:54, Kaz Kylheku wrote:
    ...
    Both sides of an assignment can be complex expressions that
    designate an object (though the right side need not).

    So you've correctly identified the very fundamental asymmetry.


    Sure, if you want to completely disregard all the cases where the
    symmetry does exist.

    Anything can be considered symmetric, if you ignore all the aspects of
    it that are asymmetric. As a result, calling something symmetric for
    that reason isn't worth commenting on.

    One damning aspect of the symmetry argument is that the left
    side of a C assignment cannot be an assignment.

    a = b = c

    of course that is valid syntax, but it does not represent
    (a = b) = c; it is not an assignment which has an assignment
    expression as its left constituent.

    It cannot be flipped around the b = c assignment, because c = b = a
    represents a rearrangement of the abstract syntax which goes
    beyond just swapping two children of a binary node.

    This is not like the unary-expression issue; it is not
    removable by an alternative grammar.

    Absolute symmetry of the constructs of a grammar which has associativity
    and precedence can only exist at the abstract syntax tree level, in
    which associativity and precedence have been resolved into structure.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Kuyper on Fri Sep 6 23:38:05 2024
    On 06/09/2024 18:23, James Kuyper wrote:
    On Fri, 6 Sep 2024 10:35:16 +0100
    Bart <bc@freeuk.com> wrote:

    On 05/09/2024 22:37, James Kuyper wrote:
    On 9/5/24 12:54, Kaz Kylheku wrote:
    ...
    Both sides of an assignment can be complex expressions that
    designate an object (though the right side need not).

    So you've correctly identified the very fundamental asymmetry.


    Sure, if you want to completely disregard all the cases where the
    symmetry does exist.

    Anything can be considered symmetric, if you ignore all the aspects of
    it that are asymmetric. As a result, calling something symmetric for
    that reason isn't worth commenting on.

    A more useful way of describing what you're commenting on is not to
    falsely claim that assignment in general is symmetric, but rather that
    the particular assignment you're interested in is symmetric. And it's only
    symmetric syntactically; the associated semantics are profoundly asymmetric.

    In every kind of assignment, a variable is denoted in the same way on
    either side of '=' if accessing its value (either to read or write). It
    has the same type. It has the same amount of indirection.

    The same applies to more elaborate terms: if it can appear on the LHS,
    it can appear unchanged on the RHS:

    A[i+1].m = x;
    y = A[i+1].m;

    That A[i+1].m term can be written the same way on either side, but it
    doesn't need to be on both sides of the same assignment! That wouldn't
    be that useful.

    My A = A; example was to highlight that aspect in a simple manner.

    And it's only
    symmetric syntactically; the associated semantics are profoundly
    asymmetric.

    As I said, the types are the same, and the number of indirections is the
    same. And internally, I've given examples of IL and native code where
    the operands show the same properties.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Sat Sep 7 01:44:28 2024
    Bart <bc@freeuk.com> wrote:
    On 06/09/2024 11:19, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    If you've followed the subthread then you will know that nobody disputes
    that assignment reads from one side of '=' and writes to the other.

    I dispute this, and I think that to some degree several other folks do too.
    Assignment _does not read_, it "only" writes. Assignment gets two
    parameters which are treated in different ways. Imagine that you
    are programming in a language like C, but are forbidden to use the
    assignment operator. But fortunately you have a C "function"
    'assign' with the prototype:

    void assign(int * p, int v);

    Instead of writing

    A = B

    you need to write

    assign(&A, B)

    Of course, in real life nobody is going to force you to do anything,
    but except for the fact that in C assignment has a value, the 'assign'
    function is doing the same thing as the '=' operator. And you can
    see that it is asymmetric: the first argument is an address and the second
    is a value.

    If you have to use a function, yes. Because you've introduced an
    artificial split in Where and When those dereferences are done.

    Well, this is natural if you want simple semantics. As a result of
    parsing you get a parse tree (it does not matter if the parse tree
    exists as a real data structure or is just a conceptual thing).
    To get the meaning of an expression you get the meanings of the arguments,
    which conceptually is simple recursion (pragmatically a compiler
    may work in a bottom-up way), and then you get the meaning of the whole
    expression by applying the operator at the top to the meanings of the
    subtrees. The point is that once you have the meaning of a subtree
    it does not change, so you can throw out the source subtree.
    In terms of code generation you can immediately generate
    code and there is no need to change it after it is generated
    (no need to turn read opcodes into writes).

    With A=B they can be done at about the same time and the same place.

    With ASSIGN(), the B dereference is done at the call-site; the A
    dereference is done inside ASSIGN(), so you are obliged to pass an
    explicit reference to A. While B has already been dereferenced and you
    have its value to pass.

    (You can balance it out by requiring ASSIGN(&A, &B)!)

    This would not work in general; as I wrote it, the following are
    valid:

    assign(&a, 42)
    assign(&a, a + 1)

    but the second argument has no address, so your variant would not
    work.

    As I wrote, implementing this leads to a very simple compiler, which
    is why Forth works that way. In an optimizing compiler you want to
    allocate variables to registers, and for this you need to get
    rid of addresses (when you use a variable's address the variable cannot
    be in a register; some machines have/had addressable registers,
    but even if you can address registers this usually leads to
    slow code). So probably using read/write variable accesses
    leads to an overall simpler compiler (with a more complex front end,
    but a simpler optimizer). I have read that DEC had a highly optimizing
    BLISS compiler; I am not sure if this was due to BLISS features
    or despite BLISS features.

    If A is in memory then it could be the same on 2-address architectures:

    mov [A], [A]

    but more typically it needs two instructions (here using RTL):

    mov R, [A]
    mov [A], R

    Here, [A] appears in both instructions, it means the same thing, and
    refers to the same location. Only the position (left vs. right operand,
    exactly the same as in A = A) tells you if it's reading or writing.

    You somewhat miss the fact that "A = B" has 3 parts, that is "A", "=", and "B".
    The second 'mov' instruction came from "="; the first 'mov' is extra.
    So the instructions look symmetric, but clearly the assignment part is asymmetric.

    Here is A:=B from my HLLs in one of my ILs (intermediate language):

    load B
    store A

    That was stack-based; here it is in 3-address-code IL:

    a := b # more formally, 'move a, b'

    This is it in dynamic byte-code:

    push B
    pop A

    In every case, both A and B operands have the same rank and the same
    levels of indirection.

    You can use stack machines to get a reasonably simple definition of
    semantics. But still slightly more complex than what I outlined
    above. And code for stack machines is unpleasant to optimize.
    In a compiler for a language where the official semantics is stack
    based, the first thing the compiler does is to track a few items
    on top of the stack and match them to sequences like

    push a
    push b
    call op

    Once a sequence is matched it is replaced by the corresponding operation
    on variables. If this matching works well you get a conventional
    (non-stack) intermediate representation and reasonably good code.
    If matching fails, the resulting object code wastes time on stack
    operations.
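
    As a minimal sketch of that matching pass (hypothetical instruction
    representation, nothing from a real Forth system):

        #include <stdio.h>
        #include <string.h>

        typedef struct { const char *op; const char *arg; } Insn;

        /* Collapse "push x; push y; add" into one three-address op;
           anything unmatched is kept as a stack operation. */
        static void lower(const Insn *code, int n) {
            for (int i = 0; i < n; i++) {
                if (i + 2 < n && !strcmp(code[i].op, "push")
                              && !strcmp(code[i+1].op, "push")
                              && !strcmp(code[i+2].op, "add")) {
                    printf("t := %s + %s\n", code[i].arg, code[i+1].arg);
                    i += 2;                 /* consumed three stack ops */
                } else {
                    printf("%s %s\n", code[i].op, code[i].arg);
                }
            }
        }

        int main(void) {
            const Insn code[] = { {"push","a"}, {"push","b"}, {"add",""} };
            lower(code, 3);                 /* prints: t := a + b */
            return 0;
        }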

    Of course, if you do not mind slowdowns due to stack use, then a
    stack machine leads to a very simple implementation. The best Forth
    compilers have sophisticated code to track stack use and replace it by
    the use of registers. Other Forth compilers just accept slowness.

    There is a Load on one operand and a balancing Store on the other. Two
    loads or two stores would not make sense here.

    Again: only the store comes from the assignment. This is clearly visible
    if instead of the misleading "A = A" you take "A = B" and replace
    'B' by various things. The assignment part (the store instruction) stays
    the same, the computation of the value changes. In

    A = c + d

    This has been covered. The syntactical symmetry is that whatever you
    have on the LHS, you can write the same thing on the RHS:

    A[i+1].m = A[i+1].m

    Obviously, you can have RHS terms that cannot appear on the left, like
    '42', but that's usually due to separate constraints of the language.

    Well, logically you cannot change the value of a number, so you
    cannot assign to a number; that is very fundamental. You could
    try to define

    x + 1 = y

    as solving an equation for x, but that quickly runs into trouble due to
    equations which have no solutions or multiple solutions. And solving
    frequently is a complex process, not suitable as a basic operation.
    OK, if you truly insist on symmetry, then Prolog unification
    is symmetric; here information can flow in both directions.
    But unification is quite different from assignment.

    you get two loads (for c and d) and then an addition. To put it
    differently, you have

    compute value of B
    compute address of A
    store

    Why don't you need to compute the address of B?

    Well, B may have no address. In the case when B is a variable, computing
    its address is part of the computation of its value. In general,
    computing the value of B needs the addresses of all variables
    contained in B.

    Why don't you need to
    load the value of B?

    "compute value" means putting result in place which is available
    to subsequent operations, so logically no extra load is needed.
    And for variables "compute value" includes loading them.

    It is more like this:

    compute address of B
    load value of B via that address to some temporary location
    compute address of A
    store new value of A via that address

    The point is that the last part (that is, the store instruction) logically
    does not change when you vary A and B. Only this part corresponds to
    assignment. The first two lines logically form a whole, that
    is "compute B". And when you vary B you may get quite a different split
    of "compute B" into parts. Moreover, logically "compute value of B"
    and "compute address of A" are completely independent.

    The only asymmetry in all my examples has been between Load/Store;
    Push/Pop; or positional as in Left/Right.

    The mechanism for EITHER reading or writing the value of an object via
    its reference is the same; only the direction of data movement is the parameter.

    In more complex cases the mechanism may differ. A Pascal compiler may
    wish to check that variables are in their declared range. On read this
    is a consequence of type correctness, so there is no need for extra
    instructions. But writes usually need checking (the optimizer may be
    smart enough to infer that the assigned value is in range, but in
    general there is need for actual checking code). Read access may be more
    complex than a simple load instruction. I actually ran into a problem of
    this sort in GNU Pascal. Read access was a not entirely trivial
    piece of intermediate representation, and the compiler tried to pattern
    match it to find out what it was reading and turn it into write
    access. But the intermediate representation was transformed between
    creation and write. For some time there was a kind of whack-a-mole
    game: new transformations were introduced which broke the compiler
    and were fixed by a better pattern. I resolved the problem by
    creating a special intermediate representation which was immune
    to transformations, so the simple pattern always worked. But there
    was a cost: this intermediate representation had to be expanded
    (and transformed) in other places.

    Note that Load and Store can also be considered symmetric: each Load
    reads data from somewhere and writes it somewhere else. Just like Store
    does. So some instruction sets use the same mnemonic for both.

    Concerning instructions, sure. But a load is not an assignment.
    It may look so in simple, misleading cases.

    In my PUSH B example from interpreted code, it will read B from memory,
    and write it to a stack (also in memory). POP A will read from the
    stack, and write to memory.

    So both operations are really memory-to-memory.

    The stack does not change anything here compared to a load/store pair;
    this looks symmetric due to fusing two parts of a logically
    three-step process into a single instruction. As above, varying A and
    B will show the actual parts.

    But even if 'A' is
    allocated to a register and you translate the whole "A = B" to a single
    load, the load computes the value of 'B', and if the result is in the
    correct register the assignment proper can be optimised to a
    no-operation.

    I've done this stuff at the chip level (writing into an 8-bit latch for example, then reading from it); it's going to take a lot to convince me
    that this is anything much different from a read/write or direction flag!

    Chips got more complicated, but this is not relevant to the problem.
    Of course, logically there is symmetry between read and write. At
    a lower level, reading a bit string from memory needs, besides the load,
    some masking and shifting. For a write, one needs to read the unmodified
    part first, paste in the new bits and write back. So there are
    differences, but they are irrelevant to the problem; what I wrote
    applies in the case when there is symmetry between read and write
    instructions.
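
    As a sketch of that read/modify/write asymmetry (plain shift-and-mask
    code, not any particular compiler's scheme):

        #include <stdio.h>
        #include <stdint.h>

        /* Reading an n-bit field is load + shift + mask... */
        static uint32_t read_field(uint32_t word, int pos, int width) {
            return (word >> pos) & ((1u << width) - 1);
        }

        /* ...but writing one needs the old word too: read, paste, write back. */
        static uint32_t write_field(uint32_t word, int pos, int width, uint32_t v) {
            uint32_t mask = ((1u << width) - 1) << pos;
            return (word & ~mask) | ((v << pos) & mask);
        }

        int main(void) {
            uint32_t w = 0xFFFFFFFFu;
            w = write_field(w, 4, 8, 0xAB);
            printf("%08X %02X\n", w, read_field(w, 4, 8));  /* FFFFFABF AB */
            return 0;
        }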

    In more complicated cases in languages, some asymmetry does come
    up. For example, suppose C allowed this (my language allows the equivalent):

    (c ? a : b) = x;

    So this assigns to either a or b depending on c. My implementation effectively turns it into this:

    *(c ? &a : &b) = x;

    So using explicit references and derefs. However, that is internal. The symmetry still exists in the syntax:

    (c ? a : b) = (c ? a : b);

    As noticed, people prefer symmetric notation, so most languages
    make it "symmetric looking". But if you dig deeper there is a
    fundamental asymmetry.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Sat Sep 7 03:13:46 2024
    Bart <bc@freeuk.com> wrote:

    I can also say that the C grammar is buggy:

    assignment-expression:
    conditional-expression
    unary-expression assignment-operator assignment-expression

    When attempting to parse an assignment-expression, do you go for a conditional-expression or unary-expression?

    Both. The grammar clearly is not LL. In the C90 grammar there was
    one issue which meant that the grammar _alone_ could not be used for
    parsing: at one (or maybe two) places, after recognizing a token the
    parser had to look it up in the symbol table and possibly transform
    the token based on previous declarations. If you did that,
    then a classic LR(1) parser would work. I think that changing
    the grammar to do the transformation by grammar rules alone
    changed the grammar so that it was no longer LR(1). I think that
    after the change it was LR(2), but I may be remembering things
    incorrectly. For _defining_ language syntax it is enough to have
    an unambiguous grammar, which is much easier than LR(2). Or the standard
    could give clear rules saying which of the parse trees produced by an
    ambiguous grammar is the right one (AFAIK that is what C++ is doing).

    Anyway, any troubles in the C grammar are _not_ due to the part above; that
    part is handled fine by LALR(1) tools (LALR(1) is somewhat hard to
    define, but is a subset of LR(1) which is easier to implement).
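
    The symbol-table transformation mentioned above is the well-known
    "lexer hack". A minimal sketch, with a toy symbol table standing in
    for real declaration processing:

        #include <stdio.h>
        #include <string.h>

        enum Kind { IDENT, TYPEDEF_NAME };

        /* Toy symbol table: names earlier declared with typedef. */
        static const char *typedefs[] = { "size_t", "T" };

        /* After scanning an identifier, retag it if it names a type;
           only then can "(T)(x)" be parsed as a cast. */
        static enum Kind classify(const char *name) {
            for (size_t i = 0; i < sizeof typedefs / sizeof *typedefs; i++)
                if (strcmp(name, typedefs[i]) == 0)
                    return TYPEDEF_NAME;
            return IDENT;
        }

        int main(void) {
            printf("%d %d\n", classify("T"), classify("x"));  /* prints: 1 0 */
            return 0;
        }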

    BTW: LL(k) requires the ability to decide which rule will apply
    after seeing k symbols after the start of a construct. Clearly
    for a nontrivial grammar one needs at least one symbol, so LL(1)
    is the easiest to parse. LR(k) requires the ability to decide which
    rule will apply after seeing the whole construct and at most
    k following symbols. So, an LR parser keeps reading symbols
    as long as it cannot recognize any construct (in other words,
    it considers all alternatives as possible). In the process it
    collects information. Once it has enough information it
    replaces part of the previously read input by the left-hand side of
    a rule (this is called a reduction). After one reduction
    there may be several more: several constructs may end in the
    same place. The parser starts with the smallest recognized part, then
    proceeds to bigger parts. Usually LR parsers keep the input and
    the effects of reductions on a stack. Reading a symbol puts it on
    the stack (this is called a shift); a reduction removes a bunch of
    items from the stack and replaces them by a single (different)
    item. At first glance the control, that is deciding between shift
    and reduction and in the case of reduction deciding which rule
    to use, may look complicated. But ignoring control, this is
    simple and quite efficient. In the sixties it was discovered
    that in principle parser control may also be quite simple:
    it can be done by a finite automaton. In other words, there
    is a state which contains the collected information and tells you
    what to do (shift or reduce); you get the new state by looking
    into a table indexed by the current state and the read symbols (the
    currently considered symbol and the lookahead symbols). At first there
    was doubt that the needed tables might be too big and make this
    impractical, but in the seventies it was discovered that for
    typical grammars of programming languages one can use reasonably
    small tables. And there were tricks for compressing the tables.
    So practically one gets a few, maybe tens of, kilobytes of tables.
    Access to compressed tables needs extra instructions, and LR
    grammars tend to produce more reductions than LL ones, so
    LR parsers tend to be slightly slower than LL ones: people
    got 500000 lines per minute for LR parsers and 900000
    for LL parsers on machines about 1000 times slower than a modern PC.
    Even at that time the speed difference in the parser was not a deal
    breaker and LR parsing was preferred.

    BTW2: Before C90 there was doubt whether a sensible grammar for C
    was possible (the earlier grammar had serious problems). But during
    the standardisation process it was discovered that a quite reasonable
    grammar works. The current grammar is a development of the C90 grammar.
    And problems like the one you mention were understood long ago.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Sat Sep 7 11:53:32 2024
    On 07/09/2024 02:44, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 06/09/2024 11:19, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    (You can balance it out by requiring ASSIGN(&A, &B)!)

    This would not work in general; as I wrote it, the following are
    valid:

    assign(&a, 42)
    assign(&a, a + 1)

    but the second argument has no address, so your variant would not
    work.

    I believe that C's compound literals can give a reference to a+1:

    #include <stdio.h>

    void assign(int* lhs, int* rhs) {
        *lhs = *rhs;
    }

    int main(void) {
        int a = 20;

        assign(&a, &(int){a + 1});

        printf("a = %d\n", a);
    }


    The output from this is 21.

    (You are not supposed to ask questions about how assign() can be
    implemented if the assignment statement is not available. Here I'm just
    showing the symmetry at the call-site and in the callee.)

    You can use stack machines to get a reasonably simple definition
    of semantics. But still slightly more complex than what I outlined
    above. And code for stack machines is unpleasant to optimize.
    In a compiler for a language where the official semantics is stack
    based, the first thing the compiler does is to track a few items
    on top of the stack and match them to sequences like

    push a
    push b
    call op

    Once a sequence is matched it is replaced by the corresponding
    operation on variables. If this matching works well you get a
    conventional (non-stack) intermediate representation and
    reasonably good code. If matching fails, the resulting object code
    wastes time on stack operations.

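    A hedged sketch of that matching in C (the opcodes, IL layout and
    output format here are invented purely for illustration):

    #include <stdio.h>

    /* IL: PUSH pushes a named variable; CALL_ADD pops two values and
       pushes their sum; HALT ends the program. */
    enum opcode { PUSH, CALL_ADD, HALT };
    struct ins { enum opcode op; char var; };

    /* Scan the IL; on "push a; push b; call add" emit one
       three-address op, otherwise fall back to real stack traffic. */
    static void lower(const struct ins *il) {
        for (int i = 0; il[i].op != HALT; i++) {
            if (il[i].op == PUSH && il[i+1].op == PUSH
                                 && il[i+2].op == CALL_ADD) {
                printf("add t, %c, %c\n", il[i].var, il[i+1].var);
                i += 2;                     /* consumed three IL items */
            } else if (il[i].op == PUSH) {
                printf("push %c\n", il[i].var);
            } else {
                printf("call add\n");
            }
        }
    }

    int main(void) {
        const struct ins il[] = { {PUSH,'a'}, {PUSH,'b'},
                                  {CALL_ADD,0}, {HALT,0} };
        lower(il);                          /* prints: add t, a, b */
        return 0;
    }
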
    (My stack IL is different. The stack is a compile-time stack only,
    and code is scanned linearly during code-generation. Roughly, the
    'stack' corresponds to the machine's register file, although in
    practice register allocation is ad hoc.

    In my current scheme, the hardware stack is rarely used except for
    function calls. You might notice I use Load/Store rather than
    Push/Pop; this was to avoid confusion with hardware push/pop
    instructions.

    When A and B are in registers, then A=B results in a
    register-register move.)


    Of course, if you do not mind slowdowns due to stack use, then a
    stack machine leads to a very simple implementation.

    I use a software stack for interpreted code.

    The best Forth compilers
    have sophisticated code to track stack use and replace it by
    use of registers. Other Forth compilers just accept the slowness.

    Then you no longer have a language which can be implemented in a
    few KB. You might as well use a real language with proper data
    types, and not have the stack exposed in the language. Forth code
    can be very cryptic because of that.

    My IL stack code is typically easier to follow than Forth, and would be
    easier to write, if still a slog. (That's why I use a HLL!)


    Obviously, you can have RHS terms that cannot appear on the left,
    like '42', but that's usually due to separate constraints of the
    language.

    Well, logically you cannot change the value of a number, so you
    cannot assign to a number; that is very fundamental.

    But it's not prohibited by the grammar.

    You could
    try to define

    x + 1 = y

    as solving an equation for x, but that quickly runs into trouble
    due to equations which have no solutions or multiple solutions.

    (Actually, 'x + 1 = y' is well defined in my languages. But that's
    because '=' means equality. The result is true/false.)

    Why don't you need to compute the address of B?

    Well, B may have no address. In the case when B is a variable,
    computing its address is part of computing its value. In general,
    computing the value of B needs computing the addresses of all
    variables contained in B.

    Why don't you need to
    load the value of B?

    "Compute value" means putting the result in a place which is
    available to subsequent operations, so logically no extra load is
    needed.

    If you look at typical generated code, then 'compute value' can
    often require discrete loads. But sometimes it is implicit:

    add R, [b]

    However, so is Store sometimes:

    add [a], R

    And for variables "compute value" includes loading them.

    It is more like this:

    compute address of B
    load value of B via that address to some temporary location
    compute address of A
    store new value of A via that address

    The point is that the last part (that is, the store instruction)
    logically does not change when you vary A and B. Only this part
    corresponds to assignment. The first two lines logically form a
    whole, that is "compute B". And when you vary B you may get quite
    a different split of "compute B" into parts. Moreover, logically
    "compute value of B" and "compute address of A" are completely
    independent.

    Suppose you had a 2-address machine with this instruction:

    mov [a], [b] # right-to-left

    With a simple CPU, executing it might involve:

    * Getting the address of b from the instruction into the address register
    * Performing a memory read access to load b into a register
    * Getting the address of a from the instruction into the address register
    * Performing a memory write access to write the register to a

    (c ? a : b) = (c ? a : b);

    As noticed, people prefer symmetric notation, so most languages
    make it "symmetric looking". But if you dig deeper there is a
    fundamental asymmetry.

    What started the subthread was the question of which HLL goes between
    ASM and C (since someone suggested that C was mid-level).

    People suggested ones like BLISS and Forth.

    I remarked that a proper HLL would let you write just A to either read
    the value of variable A, or write to it. Eg. A = A, without special
    operators to dereference A's address.

    At the lower level it might be push/pop, load/store, or even *&A = *&A,
    but in all cases you typically use the same levels of indirection on
    both sides.

    This is in contrast to a non-HLL where you might need an extra level on
    both sides, or to BLISS where it is on one side!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Sun Sep 8 00:05:59 2024
    Bart <bc@freeuk.com> wrote:
    On 07/09/2024 02:44, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 06/09/2024 11:19, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    (You can balance it out by requiring ASSIGN(&A, &B)!)

    This would not work in general, as I wrote it, the following are
    valid:

    assign(&a, 42)
    assign(&a, a + 1)

    but the second argument has no address, so your variant would not
    work.

    I believe that C's compound literals can give a reference to a+1:

    #include <stdio.h>

    void assign(int* lhs, int* rhs) {
        *lhs = *rhs;
    }

    int main(void) {
        int a = 20;

        assign(&a, &(int){a+1});

        printf("a = %d\n", a);
    }


    The output from this is 21.

    Yes, that would work. But you use a complex language feature here,
    rather unattractive if one wants to use assign as a basic
    construct.

    You can use stack machines to get a reasonably simple definition
    of semantics. But still slightly more complex than what I outlined
    above. And code for stack machines is unpleasant to optimize.
    In a compiler for a language where the official semantics is stack
    based, the first thing the compiler does is to track a few items
    on top of the stack and match them to sequences like

    push a
    push b
    call op

    Once a sequence is matched it is replaced by the corresponding
    operation on variables. If this matching works well you get a
    conventional (non-stack) intermediate representation and
    reasonably good code. If matching fails, the resulting object code
    wastes time on stack operations.

    (My stack IL is different. The stack is a compile-time stack only,
    and code is scanned linearly during code-generation. Roughly, the
    'stack' corresponds to the machine's register file, although in
    practice register allocation is ad hoc.

    OK, that is another way to work around the slowness of the stack.
    But you get into trouble when you have more items on the stack
    than the number of available registers.

    The best Forth compilers
    have sophisticated code to track stack use and replace it by
    use of registers. Other Forth compilers just accept the slowness.

    Then you no longer have a language which can be implemented in a
    few KB. You might as well use a real language with proper data
    types, and not have the stack exposed in the language. Forth code
    can be very cryptic because of that.

    First, it is not my goal to advocate for Forth use. But I think
    it is interesting to understand why things were done in the past.
    Concerning the exposed stack, Pop11 has an exposed stack. A 5-line
    executive summary could be the same as Forth's. But the language
    feels quite different. Pop11 uses classic infix syntax, and
    variable access puts the "value" on the stack. I put "value" in
    scare quotes because most Pop11 data items are composite
    and live in memory. They are accessed using their address and
    that is what is put on the stack. But semantically you deal
    with values, and addresses are normally hidden from users. If you
    wish you can ignore the stack. The presence of the stack is
    visible in examples like the ones below:
    : 2*3 =>
    ** 6
    : 2, 3, *() =>
    ** 6
    The first line is infix form; the '=>' operator prints what is on
    the stack (but you can treat it as "print current result").
    In the second line two numbers are pushed on the stack and
    then there is a call to the multiplication routine. The parser
    knows that '*' is an operator, but since there are no arguments
    '*' is treated as an ordinary identifier and as a result you
    get the multiplication routine. Like in other languages,
    parentheses mean a function call. Up to now this may look just
    like some weirdness with no purpose. But there are advantages.
    One is that Pop11 functions can return multiple values; they just
    put as many values as needed on the stack. Second, one can
    write functions which take a variable number of arguments.
    And one can use, say, a loop to put a variable number of arguments
    on the stack and then call a routine expecting a variable
    number of arguments. In fact, there is a common Pop11 idiom
    to handle aggregates: 'explode' puts all members of an aggregate
    on the stack. There are also constructor functions which build
    aggregates from values on the stack.

    Coming back to Forth: you can easily add infix syntax to Forth,
    but Forth users somewhat dislike the idea of using infix for most
    programming. My personal opinion is that Forth was good
    around 1980. At that time it had quite a simple implementation,
    the language offered interactive development and some powerful
    features, and there was an interesting compromise between speed
    and size. Namely, Forth code tended to be smaller than
    machine code and to deliver speed lower than machine code but
    better than some other alternatives. Now, interactive development
    is done on bigger machines, so small size is not so important.
    Traditional Forth uses so-called threaded code, which has
    machine-word-sized units. With 16-bit words that leads to
    relatively compact code. With 32-bit words you double the code
    size. And on 64-bit machines this is quite wasteful.

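    To illustrate the one-word-per-operation layout, here is a hedged
    sketch of a direct-threaded inner interpreter in portable C. The
    primitives and the sample program are invented for the example;
    real Forths typically use indirect threading or native code, plus
    a separate return stack.

    #include <stdio.h>
    #include <stdint.h>

    /* Each cell of the thread is one machine word: either the
       address of a primitive or an inline literal. */
    typedef union cell { void (*fn)(void); intptr_t n; } cell;

    static const cell *ip;          /* thread instruction pointer */
    static intptr_t stack[16];      /* data stack                 */
    static int sp, running = 1;

    static void lit(void)  { stack[sp++] = (ip++)->n; } /* push next cell */
    static void add(void)  { sp--; stack[sp-1] += stack[sp]; }
    static void dot(void)  { printf("%ld\n", (long)stack[--sp]); }
    static void halt(void) { running = 0; }

    int main(void) {
        /* Thread for the Forth phrase:  2 3 + .  */
        static const cell prog[] = {
            {.fn=lit}, {.n=2}, {.fn=lit}, {.n=3},
            {.fn=add}, {.fn=dot}, {.fn=halt}
        };
        ip = prog;
        while (running)
            (ip++)->fn();           /* the entire inner interpreter */
        return 0;                   /* prints 5 */
    }
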
    What started the subthread was the question of which HLL goes between
    ASM and C (since someone suggested that C was mid-level).

    Well, for me the important question is how much work is due to
    tools (basically overhead) and how much deals with the problem
    domain. Since computers are now much cheaper compared to human
    work there is a desire to reduce tool overhead as much as
    possible. This favours higher level languages, so probably most
    recently created languages are at a higher level than C. However,
    in the sixties and seventies there was a pack of so-called
    algorithmic languages, or somewhat more specifically the Algol
    family. I would say that C is close to the middle of this pack.
    As a devil's advocate let me compare a typical implementation of
    early Pascal with modern C. In C variables, including aggregates,
    can be initialized at declaration time; early Pascal did not allow
    this, so one had to initialize variables by code. In C a function
    definition gives argument names and types, even if there is an
    earlier prototype. In early Pascal, if you gave a forward
    declaration (the equivalent of a C prototype), you _had_ to omit
    the function parameters (and their types). That significantly
    decreased the readability of such functions. In early Pascal
    one had to declare all local variables at the start of
    a function. C has block structure, so one can limit variable
    scope to the area where the variable makes sense. More important,
    one can declare a variable when there is a sensible initial value,
    which significantly decreases the chance of missing/wrong
    initialization. Many early Pascal implementations did not
    have conformant arrays (present in original Wirth Pascal).
    That meant that all arrays had to be of size known at compile
    time. From C99 on, C has variably modified types, which allow
    writing functions accepting arrays of varying sizes allocated
    elsewhere. Let me also mention the C preprocessor: it allows the
    programmer to effectively define simple language extensions, so
    that the language better fits the problem domain. This is not
    available in standard Pascal. Of course, C has a bunch of
    low-level features such as casts and pointer arithmetic. However,
    with variably modified types pointer arithmetic is no longer
    needed, and casts are a system programming feature not needed for
    normal programs. So if the presence of those bothers you, the
    simple solution is not to use them (of course the C standard
    defines array access in terms of pointer arithmetic, but one can
    avoid explicit pointer arithmetic and allow only array access in
    the sources). Early Pascal has a bunch of features not present in
    C, but one can reasonably consider modern C to be higher level
    than early Pascal. And _a lot_ of early languages were at a lower
    level than Pascal.

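    As a small illustration of those variably modified types, here is
    a hedged C99 sketch; the function name and the data are invented
    for the example. One function accepts 2-D arrays whose dimensions
    are decided by each caller, with no explicit pointer arithmetic.

    #include <stdio.h>
    #include <stddef.h>

    /* The parameter's type is variably modified (C99): its column
       count is the run-time value of cols. */
    static double sum(size_t rows, size_t cols, double a[rows][cols]) {
        double s = 0.0;
        for (size_t i = 0; i < rows; i++)
            for (size_t j = 0; j < cols; j++)
                s += a[i][j];
        return s;
    }

    int main(void) {
        double small[2][3] = {{1,2,3},{4,5,6}};
        double big[3][4]   = {{1,1,1,1},{2,2,2,2},{3,3,3,3}};
        printf("%g %g\n", sum(2,3,small), sum(3,4,big)); /* 21 24 */
        return 0;
    }
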
    People suggested ones like BLISS and Forth.

    I remarked that a proper HLL would let you write just A to either read
    the value of variable A, or write to it. Eg. A = A, without special
    operators to dereference A's address.

    You are looking at superficial things. Forth is extensible: fetch
    an appropriate extension from the net and you can write
    expressions in infix form using normal syntax. IIUC Bliss has a
    powerful preprocessor; I am not sure if it is powerful enough to
    transform expressions without dereferences into ones with
    dereferences, but even if it cannot do that, extensibility allows
    one to write clearer code closer to the problem domain.

    I am not going to write substantial programs in Bliss or Forth
    but I have no doubt they are HLLs.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bart on Sun Sep 8 05:23:45 2024
    On 06.09.2024 11:35, Bart wrote:
    [...]

    [...] in a language where you write 'A = A',
    and one where you write 'A = .A'.

    (I'd be interested in how, in the latter language, you'd write the
    equivalent of 'A = A = A' in C, since the middle term is both on the
    left of '=', and on the right!)

    Since assignment is right associative it would probably be written in
    a decomposed form as

    A = .A
    A = .A

    It's easier to see if you use different objects; A = B = C would be

    B = .C
    A = .B

    but that depends on how "the latter language"'s semantics are
    defined (or whether there are, maybe, specific forms for
    assignment chains in such a language). If you have a concrete
    language in mind, inspect its documentation; if you want to design
    such a language, it's your task to define its allowed syntax and
    associated semantics.

    In "C" the right-hand assignment operation (B = C) yields the new
    value of B as an rvalue, which is then fed to the left-hand
    assignment (for A = B). (Chained assignments work similarly in
    most languages that support them.)

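    A minimal C illustration of that decomposition (the printed values
    are what standard C guarantees for this snippet):

    #include <stdio.h>

    int main(void) {
        int a = 1, b = 2, c = 3;
        a = b = c;         /* right associative: a = (b = c)        */
        /* In C the value of (b = c) is the new value of b, used
           here as an rvalue; the net effect is b = c; then a = b; */
        printf("%d %d %d\n", a, b, c);   /* prints: 3 3 3 */
        return 0;
    }
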
    I'm reluctant to enter the discussion of your "symmetry" issue. Only
    insofar to say that a program text that visibly _appears_ symmetric
    in "C", like the 'A = A', isn't symmetric concerning its semantics.
    That's certainly easier to see in languages with operators chosen
    like ':=' or ':-' or '<-' etc., where the asymmetry is obvious, or
    where the 'ref' (lvalue) property is explicitly specified or 'deref'
    operations (like the '.' in "the latter language") explicitly given.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bart on Sun Sep 8 05:53:29 2024
    On 07.09.2024 12:53, Bart wrote:

    I remarked that a proper HLL would let you write just A to either read
    the value of variable A, or write to it. Eg. A = A, without special
    operators to dereference A's address.

    At the lower level it might be push/pop, load/store, or even *&A = *&A,
    but in all cases you typically use the same levels of indirection on
    both sides.

    No. (And I think here lies your misconception or irritation
    concerning the term "symmetry".) When writing 'A = A' there is
    still a semantic _asymmetry_ in "C"; it only appears symmetric
    (because of the chosen operator symbol and the implicit 'deref'
    operation, which is typical in most programming languages).

    Janis

    [...]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bart on Sun Sep 8 05:44:16 2024
    On 06.09.2024 13:34, Bart wrote:

    In more complicated cases in languages, then some asymmetry does come
    up. For example, suppose C allowed this (my language allows the
    equivalent):

    (c ? a : b) = x;

    In Algol 68 you can write

    IF c THEN a ELSE b FI := x

    or, in a shorter form, as

    ( c | a | b ) := x

    if you prefer.


    So this assigns to either a or b depending on c. My implementation effectively turns it into this:

    *(c ? &a : &b) = x;

    So using explicit references and derefs. However, that is internal. The symmetry still exists in the syntax:

    (c ? a : b) = (c ? a : b);

    This is only a "visual" symmetry, not a semantical one.

    The LHS of the Algol 68 example is of 'REF' (lvalue) type, as it would
    be the case with a language that supports a syntax as you show it here.

    I'm not sure if you should adjust your wording (concerning "symmetry")
    given that you seem to widely inflict confusion here.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Janis Papanagnou on Sun Sep 8 05:58:25 2024
    On 08.09.2024 05:53, Janis Papanagnou wrote:
    On 07.09.2024 12:53, Bart wrote:

    I remarked that a proper HLL would let you write just A to either read
    the value of variable A, or write to it. Eg. A = A, without special
    operators to dereference A's address.

    At the lower level it might be push/pop, load/store, or even *&A = *&A,
    but in all cases you typically use the same levels of indirection on
    both sides.

    No. (And I think here lies your misconception or irritation
    concerning the term "symmetry".) When writing 'A = A' there is
    still a semantic _asymmetry_ in "C"; it only appears symmetric
    (because of the chosen operator symbol and the implicit 'deref'
    operation, which is typical in most programming languages).

    To clarify my wording: what is typical is not an [explicit]
    'deref' but its _implicit_ use.

    Janis

    [...]



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Tim Rentsch on Sun Sep 8 06:39:38 2024
    On 01.09.2024 22:07, Tim Rentsch wrote:

    [...] The most important purpose of
    the ISO C standard is to be read and understood by ordinary C
    developers, not just compiler writers. [...]

    Is that part of a preamble or rationale given in the C standard?

    That target audience would surely surprise me. I have programmed
    in quite a few programming languages and never read a standard
    document for the respective language, nor have I yet met any
    programmer who has done so. All the programmers I know used
    textbooks to learn and look up things, plus the specific
    documentation that comes with the compiler or interpreter
    products. (This is of course just personal experience.)

    I've also worked a lot with standards documents in various areas
    (mainly ISO and ITU-T standards but also some others). Almost none of
    these standards (if they were substantial ones[*]) were suited for
    "ordinary users". I used them to _implement_ the respective services
    or protocols. But what they describe, and how they describe things,
    is by far not the way that would fit "ordinary users".

    That's why I immediately see the necessity that compiler creators
    need to know them in detail to _implement_ "C". And that's why I
    cannot see how the statement of the C-standard's "most important
    purpose" would sound reasonable (to me). I mean, what will a
    programmer get from the "C" standard that a well-written textbook
    doesn't provide? After all, the compiler vendor has to guarantee
    conformance (or disclose any non-conformances).

    I have met language feature, implementation, and environment
    differences in various C++ compilers, for example, that I used in
    the past. The requirements we had to fulfill were to create
    products for various platforms with differences in their C++
    environments. A restriction to the standard features was one thing
    we learned from the compilers' descriptions, and many things
    beyond that had in any case been non-standard (like, e.g.,
    template handling).

    YMMV, of course.

    Janis

    [*] By substantial I mean extensive ones like the ITU-T X.500
    series and similar, not trivial ones like, say, ISO 8601.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Tim Rentsch on Sun Sep 8 11:53:34 2024
    On Fri, 06 Sep 2024 07:56:56 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Fri, 6 Sep 2024 10:35:16 +0100
    Bart <bc@freeuk.com> wrote:

    On 05/09/2024 22:37, James Kuyper wrote:

    On 9/5/24 12:54, Kaz Kylheku wrote:

    On 2024-09-05, Waldek Hebisch <antispam@fricas.org> wrote:

    ...

    You seem to miss the point that the assignment operator is
    fundamentally asymmetric.

    Both sides of an assignment can be complex expressions that
    designate an object (though the right side need not).

    So you've correctly identified the very fundamental asymmetry.

    Sure, if you want to completely disregard all the cases where the
    symmetry does exist.

    That means that for you, there is no interesting difference (using
    my example of assigning A to itself) in a language where you write
    'A = A', and one where you write 'A = .A'.

    (I'd be interested in how, in the latter language, you'd write the
    equivalent of 'A = A = A' in C, since the middle term is both on
    the left of '=', and on the right!)

    The point is that in BLISS everything that is legal on the right
    side of an assignment is also legal on the left side.
    I don't know if the point is generally true. In particular, if
    BLISS supports floating point, what is the meaning of floating
    point on the left side?

    BLISS is word based and typeless. On a PDP-10, doing a

    .pi = 0

    where 'pi' holds a 36-bit floating-point value (and 3.14159...
    presumably), that floating-point value would be used as an
    address and 0 would be stored into it (assuming I remember
    BLISS correctly).

    On PDP-10 reinterpreting [18 LS bits of] floating-point as address is
    natural, because addresses, integers and FP share the same register
    file.
    It seems to me that on S/360 or CDC-6K or PDP-11 or VAX it would be
    less natural.
    However, natural or not, BLISS was used widely both on PDP-11 and on
    VAX, which means that it worked well enough.


    So probably not what one wants to do. ;)

    Yes, the LS bits of FP as an address do not sound very useful.
    On the other hand, using several MS bits of FP, although typically
    fewer than 18, as an address is useful in the calculation of many
    transcendental functions.

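    A hedged sketch of what such use of the leading significand bits
    can look like in C; the shift amount and table size here are
    illustrative only, not taken from any particular math library.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void) {
        double x = 1.2345;
        uint64_t bits;
        memcpy(&bits, &x, sizeof bits);  /* type-pun via memcpy */
        /* IEEE-754 double: 1 sign bit, 11 exponent bits, 52
           significand bits; take the top 7 significand bits as a
           table index in the range 0..127. */
        unsigned idx = (unsigned)((bits >> 45) & 0x7F);
        printf("table index = %u\n", idx);
        return 0;
    }
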
    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Janis Papanagnou on Sun Sep 8 11:58:27 2024
    On Sun, 8 Sep 2024 05:44:16 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 06.09.2024 13:34, Bart wrote:

    In more complicated cases in languages, then some asymmetry does
    come up. For example, suppose C allowed this (my language allows the equivalent):

    (c ? a : b) = x;

    In Algol 68 you can write

    IF c THEN a ELSE b FI := x

    or, in a shorter form, as

    ( c | a | b ) := x

    if you prefer.


    Are you sure?
    It seems to me that you got it backward.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Michael S on Sun Sep 8 11:27:33 2024
    On 08/09/2024 09:58, Michael S wrote:
    On Sun, 8 Sep 2024 05:44:16 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 06.09.2024 13:34, Bart wrote:

    In more complicated cases in languages, then some asymmetry does
    come up. For example, suppose C allowed this (my language allows the
    equivalent):

    (c ? a : b) = x;

    In Algol 68 you can write

    IF c THEN a ELSE b FI := x

    or, in a shorter form, as

    ( c | a | b ) := x

    if you prefer.


    Are you sure?
    It seems to me that you got it backward.


    The point here is that you can write such a 2-way select on the LHS of
    an assignment. C doesn't allow that unless you wrap it up as a pointer expression:

    *(c ? &a : &b) = x;

    In language like C, the LHS of an assignment is one of four categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    A is a simple variable; X represents a term of any complexity, and Y is
    any expression. (In C, the middle two are really the same thing.)

    Some languages allow extra things on the LHS, but in C they can be
    emulated by transforming the term to a pointer operation. In the
    same way it can emulate pass-by-reference (with objects which are
    not arrays!), as sketched below.

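    A minimal C sketch of both emulations just mentioned (the helper
    inc() is invented for the example):

    #include <stdio.h>

    static void inc(int *p) { ++*p; }    /* pass-by-reference emulation */

    int main(void) {
        int a = 0, b = 0, c = 1, x = 7;
        *(c ? &a : &b) = x;       /* assigns to a, since c is true */
        inc(&a);                  /* "by reference" via a pointer  */
        printf("%d %d\n", a, b);  /* prints: 8 0 */
        return 0;
    }
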
    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Janis Papanagnou on Sun Sep 8 11:18:58 2024
    On 08/09/2024 04:44, Janis Papanagnou wrote:
    On 06.09.2024 13:34, Bart wrote:

    In more complicated cases in languages, then some asymmetry does come
    up. For example, suppose C allowed this (my language allows the
    equivalent):

    (c ? a : b) = x;

    In Algol 68 you can write

    IF c THEN a ELSE b FI := x

    or, in a shorter form, as

    ( c | a | b ) := x

    if you prefer.

    Yes, that's my syntax too as that's where I copied it from. I've been
    able to write such code since the 80s.

    It applies also to any value-returning statement (switch, case, n-way
    select, block**) and can nest.

    But the feature (using them in lvalue contexts) was rarely used.

    (** This is:

    (a, b, c) = 0;

    in C, which is illegal. But put in explicit pointers, and it's suddenly
    fine:

    *(a, b, &c) = 0;

    So why can't the compiler do that?)


    So this assigns to either a or b depending on c. My implementation
    effectively turns it into this:

    *(c ? &a : &b) = x;

    So using explicit references and derefs. However, that is internal. The
    symmetry still exists in the syntax:

    (c ? a : b) = (c ? a : b);

    This is only a "visual" symmetry, not a semantical one.

    The LHS of the Algol 68 example is of 'REF' (lvalue) type, as it would
    be the case with a language that supports a syntax as you show it here.

    This is where I differ from Algol68, where I had to considerably
    simplify the semantics to get something I could understand and implement.

    Take this C:

    int A, B;

    A = B;

    There are two types associated with the LHS: 'int*', which is the
    type of the name A (its address), and 'int', which is the type of
    A's value.

    So, why would a language choose int* over int as /the/ type of an
    assignment target? As far as the user is concerned, they're only dealing
    with int value types.

    If they have to consider that variables have addresses, then the same
    applies to B!

    This is where I think Algol68 got it badly wrong.


    I'm not sure if you should adjust your wording (concerning "symmetry")
    given that you seem to widely inflict confusion here.

    Janis


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Sun Sep 8 12:05:12 2024
    On 08/09/2024 01:05, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Then you no longer have a language which can be implemented in a few KB.
    You might as well use a real with with proper data types, and not have
    the stack exposed in the language. Forth code can be very cryptic
    because of that.

    First, it is not my goal to advocate for Forth use.

    You're doing a fine job of it!

    For me it's one of those languages, like Brainf*ck, which is trivial to implement (I've done both), but next to impossible to code in.

    With Forth, I had to look for sample programs to try out, and
    discovered that Forth was really a myriad of different dialects.
    It's more of a DIY language that you make up as you go along.


    : 2*3 =>
    ** 6
    : 2, 3, *() =>
    ** 6
    the first line is infix form, '=>' oprator prints what is on
    the stack (but you cat treat it as "print current result").
    In the second line two numbers are pushed on the stack and
    then there is call to multiplication routine. Parser knows
    that '*' is an operator, but since there are no argument
    '*' is treated as ordinary identifer and as result you
    get multiplication routine. Like in other languages parentheses
    mean function call. Up to now this may look just as some
    weirdness with no purpose. But there are advantages. One
    is that Pop11 functions can return multiple values, they just
    put as many values as needed on the stack. Second, one can
    write functions which take variable number of arguments.
    And one can use say a loop to put varible number of arguments
    on the stack and then call a routine expecting variable
    number of arguments. In fact, there is common Pop11 idiom
    to handle aggregas: 'explode' puts all members of the aggregate
    on the stack. There are also constructor function which build
    aggregates from values on the stack.

    This sounds like one of my bytecode languages.

    So, in the same way that Lisp looks like the output of an AST dump,
    these stack languages look like intermediate code:


    HLL      ->  AST     ->  Stack IL  ->  Interpret or -> ASM
    (eg. C)      (Lisp)      (Forth)
                             (Pop-11)

    Most people prefer to code in a HLL. But this at least shows Lisp as
    being higher level than Forth, and it's a language that can also be bootstrapped from a tiny implementation.

    Coming back to Forth, you can easily add infix syntax to Forth
    but Forth users somewhat dislike idea of using infix for most
    of programming. My personal opinion is that Fort was good
    around 1980. At that time there was quite simple implementation,
    language offered interactive developement and some powerful
    feature and there were interesting compromise between speed
    and size.

    Around that time I was working on several languages that were low
    level and had small implementations, which included running
    directly on 8-bit hardware. They all looked like proper HLLs, if
    crude and simple.

    There was no need to go 'weird'. For lower level, I used assembly,
    where you weren't constrained to a stack.

    What started the subthread was the question of which HLL goes between
    ASM and C (since someone suggested that C was mid-level).

    Well, for me the important question is how much work is due to
    tools (basically overhead) and how much deals with the problem
    domain. Since computers are now much cheaper compared to human
    work there is a desire to reduce tool overhead as much as
    possible. This favours higher level languages, so probably most
    recently created languages are at a higher level than C. However,
    in the sixties and seventies there was a pack of so-called
    algorithmic languages, or somewhat more specifically the Algol
    family. I would say that C is close to the middle of this pack.

    My exposure before I first looked at C was to Algol, Pascal, Fortran
    (and COBOL).

    C struck me as crude, and I would have placed it lower than
    FORTRAN IV, even though the latter had no structured statements.
    But that was because FORTRAN exposed less in the language - you
    couldn't play around with variable addresses, for example. So
    FORTRAN was no good for systems programming.

    I'd also looked at Algol68, which was higher level than any of those. So
    my own first language used syntax from that, but with simplified
    semantics and with explicit pointer/address ops. It was a fine systems language.



    As a devil's
    advocate let me compare a typical implementation of early Pascal
    ...
    of early languages were at a lower level than Pascal.

    You're taking all those, to me, chaotic features of C as being
    superior to Pascal.

    Like being able to define anonymous structs anywhere, or allowing
    multiple declarations of the same module-level variables and
    functions.

    Pascal was a teaching language and some thought went into its
    structure (unlike C). In my hands I would have given it some
    tweaks to make it a viable systems language. For a better
    evolution of Pascal, forget Wirth; look at Ada, even though that
    is not my thing because it is too strict for my style.


    People suggested ones like BLISS and Forth.

    I remarked that a proper HLL would let you write just A to either read
    the value of variable A, or write to it. Eg. A = A, without special
    operators to dereference A's address.

    You are looking at superficial things.

    Syntax IS superficial! But it's pretty important, otherwise we'd
    be programming in binary machine code, or lambda calculus.

    I am not going to write substantial programs in Bliss or Forth
    but I have no doubt they are HLLs.

    So, what would a non-HLL look like to you that is not actual assembly?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Janis Papanagnou on Sun Sep 8 10:12:17 2024
    On 9/8/24 00:39, Janis Papanagnou wrote:
    ...
    That's why I immediately see the necessity that compiler creators
    need to know them in detail to _implement_ "C". And that's why I
    cannot see how the statement of the C-standard's "most important
    purpose" would sound reasonable (to me). ...

    I agree - the most important purpose is for implementors, not developers.

    ... I mean, what will a programmer get from the
    "C" standard that a well-written textbook doesn't provide?

    What the C standard says is more precise and more complete than what
    most textbooks say. Most important for my purposes, it makes it clear
    what's required and allowed by the standard. For most of my career, I
    worked under rules that required my code to avoid undefined behavior, to
    work correctly regardless of which choice implementations make on
    unspecified behavior, with a few exceptions.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bart on Sun Sep 8 16:34:32 2024
    On Sun, 8 Sep 2024 11:27:33 +0100
    Bart <bc@freeuk.com> wrote:

    On 08/09/2024 09:58, Michael S wrote:
    On Sun, 8 Sep 2024 05:44:16 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 06.09.2024 13:34, Bart wrote:

    In more complicated cases in languages, then some asymmetry does
    come up. For example, suppose C allowed this (my language allows
    the equivalent):

    (c ? a : b) = x;

    In Algol 68 you can write

    IF c THEN a ELSE b FI := x

    or, in a shorter form, as

    ( c | a | b ) := x

    if you prefer.


    Are you sure?
    It seems to me that you got it backward.


    The point here is that you can write such a 2-way select on the LHS
    of an assignment. C doesn't allow that unless you wrap it up as a
    pointer expression:

    *(c ? &a : &b) = x;

    In language like C, the LHS of an assignment is one of four
    categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    A is a simple variable; X represents a term of any complexity, and Y
    is any expression. (In C, the middle two are really the same thing.)

    Some languages allow extra things on the LHS, but in C they can be
    emulated by transforming the term to a pointer operation. In the
    same way it can emulate pass-by-reference (with objects which are
    not arrays!)

    Got it.
    Thank you.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Michael S on Sun Sep 8 16:40:14 2024
    On 08.09.2024 10:58, Michael S wrote:
    On Sun, 8 Sep 2024 05:44:16 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 06.09.2024 13:34, Bart wrote:

    In more complicated cases in languages, then some asymmetry does
    come up. For example, suppose C allowed this (my language allows the
    equivalent):

    (c ? a : b) = x;

    In Algol 68 you can write

    IF c THEN a ELSE b FI := x

    or, in a shorter form, as

    ( c | a | b ) := x

    if you prefer.


    Are you sure?

    Sure about what? - That the code above works? - Yes, it does.

    It seems to me that you got it backward.

    Mind to elaborate?

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to James Kuyper on Sun Sep 8 16:37:05 2024
    On 08.09.2024 16:12, James Kuyper wrote:
    On 9/8/24 00:39, Janis Papanagnou wrote:
    ...
    That's why I immediately see the necessity that compiler creators need
    to know them in detail to _implement_ "C". And that's why I cannot see
    how the statement of the C-standard's "most important purpose" would
    sound reasonable (to me). ...

    I agree - the most important purpose is for implementors, not developers.

    ... I mean, what will a programmer get from the
    "C" standard that a well-written textbook doesn't provide?

    What the C standard says is more precise and more complete than what
    most textbooks say.

    Exactly. And this precision is what makes the standard often
    difficult to read (for programming purposes, for "ordinary"
    folks).

    If a textbook doesn't answer a question I have I'd switch to another
    (better) textbook, (usually) not to the standard.

    Most important for my purposes, it makes it clear
    what's required and allowed by the standard.

    To be honest, I was also inspecting and reading language standards
    (e.g. for the POSIX shell and awk), but not to be able to write
    correct programs in those languages; rather out of interest and
    for academic discussions on Usenet (like the discussions here,
    which also often refer to the "C" standard).

    For most of my career, I
    worked under rules that required my code to avoid undefined behavior, to
    work correctly regardless of which choice implementations make on
    unspecified behavior, with a few exceptions.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Sun Sep 8 16:39:02 2024
    Bart <bc@freeuk.com> writes:

    In language like C, the LHS of an assignment is one of four categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    I can think of three others. There may be more.

    A is a simple variable;

    C does not define the term "simple variable" so presumably you define it
    to be any named object that /can/ appear on the LHS of a simple
    assignment -- a sort of "no true Scots-variable".

    X represents a term of any complexity, and Y is any
    expression.

    I can think of at least one expression form for X that contradicts this
    claim.

    It would be great if C had simple rules, but it doesn't. You could
    have started by saying something about the most common forms of
    assignment being those you list, and that X can be almost any
    term, but the risk of making absolute claims is that people (like
    me) will look into them.

    (In C, the middle two are really the same thing.)

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Sun Sep 8 17:14:19 2024
    Bart <bc@freeuk.com> writes:

    On 07/09/2024 02:44, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 06/09/2024 11:19, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    (You can balance it out by requiring ASSIGN(&A, &B)!)
    This would not work in general, as I wrote it, the following are
    valid:
    assign(&a, 42)
    assign(&a, a + 1)
    but the second argument has no address, so your variant would not
    work.

    I believe that C's compound literals can give a reference to a+1:

    Is there no part of C you can't misrepresent?

    #include <stdio.h>

    void assign(int* lhs, int* rhs) {
        *lhs = *rhs;
    }

    int main(void) {
        int a = 20;

        assign(&a, &(int){a+1});

    This is simply an anonymous object. You could have used a named
    object and it would not have been any further from being a
    "reference to a+1".

    printf("a = %d\n", a);
    }

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bart on Sun Sep 8 17:22:08 2024
    On 08.09.2024 12:18, Bart wrote:
    On 08/09/2024 04:44, Janis Papanagnou wrote:
    On 06.09.2024 13:34, Bart wrote:

    (c ? a : b) = x;

    In Algol 68 you can write

    IF c THEN a ELSE b FI := x

    or, in a shorter form, as

    ( c | a | b ) := x

    if you prefer.

    But the feature (using them in lvalue contexts) was rarely used.

    Sure.

    [...]
    This is only a "visual" symmetry, not a semantical one.

    The LHS of the Algol 68 example is of 'REF' (lvalue) type, as it would
    be the case with a language that supports a syntax as you show it here.

    This is where I differ from Algol68,

    Since Algol 68 is conceptually an extremely well designed language
    I don't expect such formally elaborated and consistent design in
    any language of much lower level.

    where I had to considerably
    simplify the semantics to get something I could understand and implement.

    Take this C:

    int A, B;

    A = B;

    There are two types associated with the LHS: 'int*' which is the type
    the name A (its address), and 'int' which is the type of A's value.

    Erm, no. The LHS of the assignment is a 'ref' 'int'; in "C" and in
    (almost) all other languages I encountered. - If you have an issue
    in seeing that, and with your decades of engagement with computers,
    you may now have a serious practical effort to fix that view. (No
    offense intended, honestly!)


    So, why would a language choose int* over int as /the/ type of an
    assignment target?

    I suspect the "C terminology" might be in the way of understanding.
    If you intend to grasp the LHS type it's probably better as analogy
    to take C++'s 'int &' as the type of the LHS. (Probably the same
    problem you had in understanding the "call by reference" concept,
    which is also more like a '<type> &' and not a '<type> *'.)

    As far as the user is concerned, they're only dealing
    with int value types.

    Only where values are (syntactically and semantically) expected.
    Not at the LHS of an assignment. - I sense (but may be wrong) you
    might have some ideas from the functional programming paradigm or
    why don't you see (or can't accept) that?


    If they have to consider that variables have addresses, then the same
    applies to B!

    Not sure what you think here. - Others in this thread have already
    codified the 'A = B' for explanation purposes using functions; e.g.

    assign (int &, int)

    so let's add operator definitions from other languages

    int & operator = (int & , int)

    OP ASSIGN = (REF INT a, INT b) REF INT


    This is where I think Algol68 got it badly wrong.

    I strongly suspect you have no clue.

    Algol 68, probably the most formally elaborated and consistent
    language, defines the assignment semantics no differently from any
    other of the many existing languages that work with variables.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Sun Sep 8 17:44:58 2024
    On 08/09/2024 16:39, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    In language like C, the LHS of an assignment is one of four categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    I can think of three others. There may be more.

    OK, so what are they?


    A is a simple variable;

    C does not define the term "simple variable" so presumably you define it
    to be any named object that /can/ appear on the LHS of a simple
    assignment -- a sort of "no true Scots-variable".

    X represents a term of any complexity, and Y is any
    expression.

    I can think of at least one expression form for X that contradicts this claim.

    Example? (You seem to be turning into Tim here by hinting at things but withholding any further information.)




    It would be great if C had simple rules, but it doesn't. You could
    have started by saying something about the most common forms of
    assignment being those you list, and that X can be almost any
    term, but the risk of making absolute claims is that people (like
    me) will look into them.


    This was a reply to someone who was questioning whether those examples
    of a complex LHS for Algol68 were valid.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Sun Sep 8 17:36:36 2024
    On 08/09/2024 17:14, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 07/09/2024 02:44, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 06/09/2024 11:19, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    (You can balance it out by by requiring ASSIGN(&A, &B)!)
    This would not work in general, as I wrote it, the following are
    valid:
    assign(&a, 42)
    assign(&a, a + 1)
    but the second argument has no address, so your variant would not
    work.

    I believe that C's compound literals can give a reference to a+1:

    Is there no part of C you can't misrepresent?

    Is there nothing I write that you won't take issue with?


    #include <stdio.h>

    void assign(int* lhs, int* rhs) {
    *lhs = *rhs;
    }

    int main(void) {
    int a=20;

    assign(&a, &(int){a+1});

    This is simply an anonymous object. You could have used a named object
    and it wold not have been any further from being a "reference to a+1".

    I suggested an 'assign()' function could have balanced parameters
    by requiring:

    assign(&A, &B);

    Someone objected that you can't in general apply & to arbitrary,
    unnamed, transient, intermediate values such as 'a + 1'.


    I showed how you could do that using anonymous compound literals,
    which avoids having to create an explicit named temporary, which
    in standard C would need to be outside of that assignment call.

    But you apparently have a problem with it.

    Or more likely you have a problem with me.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Janis Papanagnou on Sun Sep 8 20:09:33 2024
    On Sun, 8 Sep 2024 16:40:14 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 08.09.2024 10:58, Michael S wrote:
    On Sun, 8 Sep 2024 05:44:16 +0200
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 06.09.2024 13:34, Bart wrote:

    In more complicated cases in languages, then some asymmetry does
    come up. For example, suppose C allowed this (my language allows
    the equivalent):

    (c ? a : b) = x;

    In Algol 68 you can write

    IF c THEN a ELSE b FI := x

    or, in a shorter form, as

    ( c | a | b ) := x

    if you prefer.


    Are you sure?

    Sure about what? - That the code above works? - Yes, it does.

    It seems to me that you got it backward.

    Mind to elaborate?

    Janis


    Bart already cleared my confusion.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Janis Papanagnou on Sun Sep 8 19:01:32 2024
    On 08/09/2024 16:22, Janis Papanagnou wrote:
    On 08.09.2024 12:18, Bart wrote:
    On 08/09/2024 04:44, Janis Papanagnou wrote:
    On 06.09.2024 13:34, Bart wrote:

    (c ? a : b) = x;

    In Algol 68 you can write

    IF c THEN a ELSE b FI := x

    or, in a shorter form, as

    ( c | a | b ) := x

    if you prefer.

    But the feature (using them in lvalue contexts) was rarely used.

    Sure.

    [...]
    This is only a "visual" symmetry, not a semantical one.

    The LHS of the Algol 68 example is of 'REF' (lvalue) type, as it would
    be the case with a language that supports a syntax as you show it here.

    This is where I differ from Algol68,

    Since Algol 68 is conceptually an extremely well designed language
    I don't expect such formally elaborated and consistent design in
    any language of much lower level.

    It is ridiculously over-engineered. It requires the user to have too
    much knowledge of its internal workings.

    A higher level HLL should make life simpler not harder!

    where I had to considerably
    simplify the semantics to get something I could understand and implement.

    Take this C:

    int A, B;

    A = B;

    There are two types associated with the LHS: 'int*' which is the type
    the name A (its address), and 'int' which is the type of A's value.

    Erm, no. The LHS of the assignment is a 'ref' 'int'; in "C" and in
    (almost) all other languages I encountered.

    The LHS of an assignment needs to be an LVALUE. It has little to
    do with types, other than that, if the LHS has type T, you might
    use the ability to turn it into REF T by a hypothetical
    application of &, to determine lvalueness.

    Here is a fragment of C code:

    int a, b;

    a = b;

    Here is the typed AST my compiler produces for it:

    int----1 assign:
    int----|---1 name: a
    int----|---2 name: b

    On the left is the type of each node. Where is the 'int*' or 'ref
    int' type? I can't see it.

    You might notice that LHS and RHS both have the same type.

    - If you have an issue
    in seeing that, and with your decades of engagement with computers,
    you may now have a serious practical effort to fix that view.

    Why? My decades have been partly spent devising compilers for systems languages. If my views were wrong, then they simply wouldn't work!


    This is where I think Algol68 got it badly wrong.

    I strongly suspect you have no clue.

    Algol68 was famous for its hard-to-grasp concepts. That's what it got wrong.

    Algol 68 as probably the formally mostly elaborated and consistent
    language defines the assignment semantics not differently from any
    other of the many existing languages that work with variables.

    Here's some syntax in my language which defines 3 ranks of names:

                              Type of name:   Print shows:

    const int A = 100;        int             100
    int B := 200;             ref int         200
    ref int C := &B;          ref ref int     0x123456 (address of B)

    Here it is in Algol68 (I've swapped letter case for consistency):

                              Type of name:   Print shows:

    int A = 100;              int             100
    int B := 200;             ref int         200
    ref int C := B;           ref ref int     200

    The middle column shows the types of the /names/ A B C. For B, C, it
    would be the type of &B and &C in my language and in C.

    In both cases, A is a constant, not a variable. It is not an lvalue, and
    you can't assign to it. Yet the declaration uses the same 'int' rank as B.

    You will see the difference though if you look at the middle column.

    The Print column shows the results of applying Print to A/B/C. A doesn't
    need a dereference. B has that first 'ref' dereferenced automatically as
    is common for variables in nearly every HLL.

    But in Algol 68, both of those 'ref ref' for C are dereferenced
    to get at the underlying int value. That happens with C used as
    an rvalue, but to assign to C, the RHS must have 'ref ref int'
    type; it is unbalanced.

    So my language, like C, needs explicit & operators and explicit
    deref operators when dealing with pointers. Here, C is a pointer;
    B is just a variable. That extra 'ref' in the middle is hidden
    like in every HLL.

    Algol68 as I see it has a bunch of arcane rules that you need to
    understand. I couldn't tell you for example how to get it to display the
    actual address contained within C (ie. the address of B), or how to
    display the address of C itself.

    In my language it has been exactly this for 40 years:

    print C             shows contents of C (address of B)
    print C^  (deref)   shows what C points to (200)
    print &C            shows address of C

    It is incredibly simple. So, you still think that Algol68 got it right?
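
    For comparison, the closest C analogue of those three print
    operations might look like this (a minimal sketch using standard
    printf formats):

    #include <stdio.h>

    int main(void) {
        int B = 200;
        int *C = &B;
        printf("%p\n", (void*)C);    /* contents of C: the address of B */
        printf("%d\n", *C);          /* what C points to: 200           */
        printf("%p\n", (void*)&C);   /* the address of C itself         */
        return 0;
    }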

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Sun Sep 8 18:13:02 2024
    Bart <bc@freeuk.com> wrote:
    On 08/09/2024 01:05, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Then you no longer have a language which can be implemented in a few KB. >>> You might as well use a real with with proper data types, and not have
    the stack exposed in the language. Forth code can be very cryptic
    because of that.

    First, it is not my goal to advocate for Forth use.

    You're doing a fine job of it!

    For me it's one of those languages, like Brainf*ck, which is trivial to implement (I've done both), but next to impossible to code in.

    I wonder if you really implemented Forth. Did you implement immediate
    words? POSTPONE?

    With Forth, I had to look for sample programs to try out, and discovered
    that Forth was really a myriad different dialects. It's more of a DIY language that you make up as you go along.

    Well, there is the standard, and several implementations that do
    not follow the standard. As in every language, implementations
    have their peculiar extensions. And Forth is extensible, so there
    are user extensions.

    : 2*3 =>
    ** 6
    : 2, 3, *() =>
    ** 6
    the first line is infix form, '=>' oprator prints what is on
    the stack (but you cat treat it as "print current result").
    In the second line two numbers are pushed on the stack and
    then there is call to multiplication routine. Parser knows
    that '*' is an operator, but since there are no argument
    '*' is treated as ordinary identifer and as result you
    get multiplication routine. Like in other languages parentheses
    mean function call. Up to now this may look just as some
    weirdness with no purpose. But there are advantages. One
    is that Pop11 functions can return multiple values, they just
    put as many values as needed on the stack. Second, one can
    write functions which take variable number of arguments.
    And one can use say a loop to put varible number of arguments
    on the stack and then call a routine expecting variable
    number of arguments. In fact, there is common Pop11 idiom
    to handle aggregas: 'explode' puts all members of the aggregate
    on the stack. There are also constructor function which build
    aggregates from values on the stack.

    This sounds like one of my bytecode languages.

    So, in the same way that Lisp looks like the output of an AST dump,
    these stack languages look like intermediate code:


    HLL      ->  AST     ->  Stack IL  ->  Interpret or -> ASM
    (eg. C)      (Lisp)      (Forth)
                             (Pop-11)

    Well, in Pop11 it is more like

         reader           parser
    HLL   ->   token stream   ->   Stack IL   ->   ASM or machine code in memory

    Both reader (responsible for turning characters into tokens) and
    parser are user extensible. For example, general addition is defined
    as:

define 5 x + y;
    lvars x, y;
    if isinteger(x) and isinteger(y) then
        if _padd_testovf(x, y) then return() else -> endif
    endif;
    Arith_2(x, y, OP_+)
enddefine;

The '5' after 'define' means that we are defining an operator which
has priority 5. 'lvars' says that 'x' and 'y' obey lexical
scoping (one could use 'vars' to get dynamic scope instead).
The 'isinteger' test checks for machine-sized integers; if both
'x' and 'y' are integers this takes the fast path and uses machine
addition (with a check for overflow). In case of no overflow the
result of the machine addition is returned. Otherwise 'Arith_2' handles
the (hairy) general cases. Users can define their own operators.
For example
    For example

    define 7 x foo y; y; enddefine;

    defines operator 'foo' (of higher priority than addition). And
    we can use it:
    : 1 foo 2 =>
    ** 2
    : 2 + 3 foo 4 =>
    ** 4
    : 2 + (3 foo 4) =>
    ** 6

    Naive Fibonacci routine may look like this:

define fibr(n);
    if n < 2 then 1 else fibr(n - 1) + fibr(n - 2) endif
enddefine;

Note that use of the stack is invisible above; the user may treat this
as something like Pascal, only that source language structures
must be terminated by the verbose 'endif' and 'enddefine' respectively.
OTOH there is no need for an explicit return; the last value computed is
the return value.
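
For comparison, a direct C translation of the same naive routine:

    /* naive Fibonacci, mirroring the Pop11 fibr above */
    int fibr(int n) {
        return n < 2 ? 1 : fibr(n - 1) + fibr(n - 2);
    }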

    There are some anomalies, assignment is written as:

    a -> b;

    which means assign a to b. There is 'printf' which works
    like this:

    : printf(2, '%p\n');
    2
    : printf(2, 3, '%p %p\n');
    3 2

    Note that order of printed values is reversed compared to source
    and format comes last, this is due to stack semantics. You
    can write things like:

external declare tse in oldc;

void
tse1(da, ia, sa, dp, ip, sp, d, i, s)
    double da[];
    int ia[];
    float sa[];
    double * dp;
    int * ip;
    float * sp;
    double d;
    int i;
    float s;
{
}

endexternal;

The first line switches on a syntax extension and lines after that,
up to 'endexternal', are parsed as C code. More precisely, 'oldc'
expects K&R-style function definitions (with empty bodies). As a
result this piece of code generates a Pop11 wrapper that calls the
corresponding C routine, passing it arguments of the types specified
in the C definition (but on the Pop11 side 'double da[]' and 'double * dp'
are treated differently). Note that this is _not_ a feature of the
core Pop11 language. Rather it is handled by library code (which
in principle could be written by an ordinary user).

    Most people prefer to code in a HLL.

    You mean "most people prefer by now traditional syntax", sure.

    But this at least shows Lisp as
    being higher level than Forth, and it's a language that can also be bootstrapped from a tiny implementation.

Pop11 has very similar semantics to Lisp and reasonably traditional
syntax. I would say that Lisp users value extensibility more than
traditional syntax. Namely, Lisp macros give powerful extensibility
and they work naturally with prefix syntax. In particular, to do
an extension you specify a tree transformation, that is a transformation
of Lisp "lists", which can be done conveniently in Lisp. In Pop11
extensions usually involve some hooks into the parser and one needs to
think about how parsing works. And one needs to generate a representation
at a lower level. So extending Pop11 is usually more work than extending
Lisp. I also use a language called Boot; you will not like it
because it uses whitespace to denote blocks. The semantics of Boot is
essentially the same as Lisp and the syntax (except for whitespace)
is Algol-like. A sample is:

dbShowConsKinds(page, cAlist) ==
  cats := doms := paks := defs := nil
  for x in cAlist repeat
    op := CAAR x
    kind := dbConstructorKind op
    kind = 'category => cats := [x,:cats]
    kind = 'domain => doms := [x,:doms]
    kind = 'package => paks := [x,:paks]
    defs := [x,:defs]
  lists := [NREVERSE cats,NREVERSE doms,NREVERSE paks,NREVERSE defs]

If one really dislikes whitespace-based syntax it would not be hard
to modify the translator to produce more traditional syntax:

dbShowConsKinds(page, cAlist) == {
    cats := doms := paks := defs := nil;
    for x in cAlist repeat {
        op := CAAR(x);
        kind := dbConstructorKind(op);
        kind = 'category => cats := [x, :cats];
        kind = 'domain => doms := [x, :doms];
        kind = 'package => paks := [x, :paks];
        defs := [x, :defs]
    }
    lists := [NREVERSE(cats), NREVERSE(doms), NREVERSE(paks), NREVERSE(defs)]
}

Some explanation: "for x in cAlist" indicates iteration over a list.
Boot allows calling single-argument functions without parentheses around
the argument; in the second version I added parentheses for clarity.
':=' is the assignment operator, '=' is equality; assignment has a value
like in C. '=>' is a conditional exit from a block. That is, if the
condition on its left hand side is true, then the expression on the right
hand side is evaluated and the rest of the block is skipped. In Boot all
expressions have values; in case of a taken exit the value of the
expression on the right hand side is the value of the block. If no exit is
taken, then the value of a block is the value of the last expression in
the block. Conditionals have the obvious value; the value of a loop is
Lisp 'nil'.

    Construct like

    [x, :cats]

builds a list; ':' means that 'cats' must be a list and is inserted
in this place. So the result is prepending 'x' to 'cats'.
Again, the body of a function is an expression and the value produced
by this expression is returned from the function.

A drawback of Boot is that some Lisp constructs need awkward syntax
to access. And that includes defining extensions, so extensions
are frequently done by defining macros at the Lisp level (or your
way, by modifying the translator).

Coming back to Forth, you can easily add infix syntax to Forth
but Forth users somewhat dislike the idea of using infix for most
of their programming. My personal opinion is that Forth was good
around 1980. At that time it had quite a simple implementation,
the language offered interactive development and some powerful
features, and it struck an interesting compromise between speed
and size.

Around that time I was working on several languages that were low level
and with small implementations, which included running directly on 8-bit
hardware. They all looked like proper HLLs, if crude and simple.

    There was no need to go 'weird'. For lower level, I used assembly, where
    you weren't constrained to a stack.

Well, I had a 48KB ZX Spectrum. On it I could run Hisoft Pascal,
Hisoft C, FIG-Forth and other things. At that time I had some
knowledge of C but really did not understand it. Hisoft C gave
me similar speed to Hisoft Pascal, had fewer features (the C lacked
floating point) and needed more memory. FIG-Forth needed very
little memory but execution speed was significantly worse than
Pascal. And my programs were small. Some of my programs needed
enough memory for data, but one could write a compiled program
to the tape and run it from the tape. So I mostly used Pascal.
But that could be different with less memory (the compiler would
not run on a 16kB Spectrum, FIG-Forth would be happy on it) or
with bigger programs.

I was not scared of "weird". Rather, I had read Pascal books, could
define data structures and in general I had a reasonable idea of
how to solve problems using Pascal. About Forth I had less
info. And for simple problems Forth did not look hard, just
somewhat more tedious. So for me using Forth looked like
spending effort with no clear gain.

    What started the subthread was the question of which HLL goes between
    ASM and C (since someone suggested that C was mid-level).

Well, for me the important question is how much work is due to tools
(basically overhead) and how much deals with the problem domain.
Since computers are now much cheaper compared to human work
there is a desire to reduce tool overhead as much as possible.
This favours higher level languages, so probably most recently
created languages are at a higher level than C. However, in the
sixties and seventies there was a pack of so-called algorithmic
languages, or somewhat more specifically the Algol family. I would
say that C is close to the middle of this pack.

    My exposure before I first looked at C was to Algol, Pascal, Fortran
    (and COBOL).

C struck me as crude, and I would have placed it lower than FORTRAN IV,
even though the latter had no structured statements. But that was
because FORTRAN exposed less in the language - you couldn't play around
with variable addresses, for example. So FORTRAN was no good for systems
programming.

One of my early programs did a recursive walk on a graph. The goal was
to do some circuit computation. I did it in Basic on a ZX 81.
I used GOSUB to do recursive calls, but had to simulate the argument
stack with arrays. This led to several complications, purely
because of the inadequacy of the language. Official Fortran without
recursion probably would be equally bad. In Pascal or C the program
would be natural and much simpler. One point being recursion,
the other structures: in Basic or Fortran I had to emulate structures
using parallel arrays, in C and Pascal they were built-in.

BTW: I would very much have preferred to use Pascal for the above, but
Basic was the only thing available to me on the ZX 81.
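
A minimal C sketch of the technique being described (parallel arrays
standing in for call frames; the graph is assumed acyclic and all names
are illustrative, bounds checks omitted for brevity):

    #define MAXDEPTH 64
    #define MAXKIDS  8

    /* Walk an acyclic graph with a hand-managed frame stack, the way
       GOSUB forces you to when the language lacks recursion. */
    void walk(int start, const int adj[][MAXKIDS], const int deg[])
    {
        int node[MAXDEPTH], next[MAXDEPTH];  /* parallel "argument" arrays */
        int sp = 0;
        node[0] = start;
        next[0] = 0;                         /* index of next child to try */
        while (sp >= 0) {
            if (next[sp] < deg[node[sp]]) {
                int child = adj[node[sp]][next[sp]++];
                node[++sp] = child;          /* "GOSUB": push a frame */
                next[sp] = 0;
            } else {
                sp--;                        /* "RETURN": pop a frame */
            }
        }
    }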

As a devil's
advocate let me compare typical implementation of early Pascal
    ...
    of early languages were at lower level than Pascal.

You're taking all those, to me, chaotic features of C as being superior
to Pascal.

Like being able to define anonymous structs anywhere, or allowing multiple
declarations of the same module-level variables and functions.

    Look at this C code:

void
do_bgi_add(unsigned int * dst, int xlen, unsigned int * xp,
           int ylen, unsigned int * yp) {
    if (ylen < xlen) {
        int tmp = xlen;
        xlen = ylen;
        ylen = tmp;
        unsigned int * tmpp = xp;
        xp = yp;
        yp = tmpp;
    }
    unsigned int xext = (unsigned int)(((int)(xp[xlen - 1])) >> 31);
    unsigned int yext = (unsigned int)(((int)(yp[ylen - 1])) >> 31);
    unsigned int c = 0;
    int i = 0;
    while(i < xlen) {
        unsigned long long pp = (unsigned long long)(xp[i])
                                + (unsigned long long)(yp[i])
                                + (unsigned long long)c;
        dst[i] = pp;
        c = (pp >> (32ULL));
        i++;
    }
    while(i < ylen) {
        unsigned long long pp = (unsigned long long)xext
                                + (unsigned long long)(yp[i])
                                + (unsigned long long)c;
        dst[i] = pp;
        c = (pp >> (32ULL));
        i++;
    }
    {
        unsigned long long pp = (unsigned long long)xext
                                + (unsigned long long)yext
                                + (unsigned long long)c;
        dst[i] = pp;
    }
}


I claim that this is better than what could be done in early Pascal.
Temporary variables are declared exactly in the scopes where they are
needed; I reuse the same name for 'pp' but scoping makes clear
that the different 'pp' are different variables. All variables
are initialised at declaration time with sensible values. Only
the parameters, 'i' and 'c' are common to the various stages; they have
to be. Note that 'xext' and 'yext' are declared at the point where I
can compute their initial values. Also note that among ordinary
variables only 'i' and 'c' are reassigned (I need to swap parameters
to simplify logic, and the 'dst' array entries are assigned as part of
the function contract). The fact that variables are not reassigned could
be made clearer by declaring them as 'const'.
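
A minimal usage sketch, assuming the function above is in scope. My
reading of the contract, which the code implies but does not state:
the numbers are little-endian arrays of 32-bit words in two's
complement, and dst needs max(xlen, ylen) + 1 words:

    #include <stdio.h>

    int main(void) {
        unsigned int x[1] = { 0xFFFFFFFFu };  /* -1 as a 1-word signed bignum */
        unsigned int y[2] = { 1u, 0u };       /* +1 as a 2-word signed bignum */
        unsigned int dst[3];                  /* max(xlen, ylen) + 1 words */

        do_bgi_add(dst, 1, x, 2, y);
        /* -1 + 1 == 0, so all three result words should be zero: */
        printf("%08x %08x %08x\n", dst[2], dst[1], dst[0]);
        return 0;
    }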

    Pascal was a teaching language and some thought went into its structure (unlike C).

    In general I like Pascal. But in some areas Wirth had too rigid
    view. And experience changed notion of good program structure.

In my hands I would have given it some tweaks to make it a
viable systems language. For a better evolution of Pascal, forget Wirth;
look at Ada, even though that is not my thing because it is too strict
for my style.

For system programming Turbo Pascal had what is needed. For better
evolution look at Extended Pascal and GNU Pascal. Extended Pascal
added variable initialization and relaxed constraints on declaration
placement. Schema types give nice handling of variable length arrays.
GNU Pascal relaxed a bit some remaining restrictions in Extended Pascal
and added system extensions (in particular Turbo Pascal is a subset of
GNU Pascal).

    People suggested ones like BLISS and Forth.

    I remarked that a proper HLL would let you write just A to either read
    the value of variable A, or write to it. Eg. A = A, without special
    operators to dereference A's address.

    You are looking at superficial things.

    Syntax IS superficial! But it's pretty important otherwise we'd be programming in binary machine code, or lambda calculus.

Well, even in the context of syntax, the dot required by Bliss is a little
thing, just a bit of mandatory noise. When some dirt falls onto a program
listing, do you think that the program is faulty because of it?

    Concerning Forth and Lisp, some people like such syntax and there are
    gains.

I am not going to write substantial programs in Bliss or Forth
but I have no doubt they are HLLs.

    So, what would a non-HLL look like to you that is not actual assembly?

In first approximation HLL = not(assembly). Of course (binary, octal,
etc.) machine language counts as assembly for the purpose of this equation.
And some languages like PL360 or Randall Hyde's HLA, which have constructs
which do not look like assembly, still count as assembly. Similarly
macro assemblers like IBM HLASM count as assembly. The fact that there is
a compiler that turns IBM HLASM into Javascript, so that you can run it on
a wide range of machines, does not change this. For me what counts is
thinking and intent: using IBM HLASM you still need to think in terms
of machine instructions. One can use macro assemblers to provide a
set of macros such that the user of those macros does not need to think
about machine instructions. IIUC one of the PL/I compilers was created
by using a macro assembler to implement a small subset of PL/I, and
then a proper PL/I compiler was written in this subset. But this
is really using a macro assembler to implement a different language.
In other words, once the extensions can function as an independent
language and users are encouraged to think in terms of the new language,
this is no longer assembler, but a new thing.

Just as an extra explanation, I read HLL as Higher Level Language,
with Higher implicitly referencing assembly. So it does not need
to be very high level, just higher level than assembly.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Sun Sep 8 18:39:35 2024
    Bart <bc@freeuk.com> wrote:

    But the feature (using them in lvalue contexts) was rarely used.

    (** This is:

    (a, b, c) = 0;

    in C, which is illegal. But put in explicit pointers, and it's suddenly
    fine:

    *(a, b, &c) = 0;

    So why can't the compiler do that?)

    The second really is

    (a, b, c = 0);

    I do not know what you expect writing '(a, b, c) = 0', but the above
    C meaning probably is not what you want.
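
To make the C behaviour concrete: the comma operator evaluates left to
right and its value is the last operand, so the parenthesised expression
yields &c and only 'c' is written. A minimal sketch:

    #include <stdio.h>

    int main(void) {
        int a = 1, b = 2, c = 3;
        *(a, b, &c) = 0;               /* comma expression yields &c;
                                          compilers may warn that a and b
                                          have no effect */
        printf("%d %d %d\n", a, b, c); /* prints: 1 2 0 */
        return 0;
    }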

I use a language that allows:

    (a, b, c) := (b, c, a);

that is, if there are 3 things on the left hand side, then there must
be 3 things on the right hand side. It also allows

    (a, b, c) := l;

    but then l must be a list or record of 3 things (and entries in
    order are assigned to things on the left hand side).

    (a, b, c) := 0;

would be valid only if '0' happened to be an appropriate list or record,
so that the case above would apply. '0' is overloaded and user definable,
so a user in principle could do that. But no sane person would
define such a '0'.

    BTW, in Pop11 this is

    (1, 2, 3) -> (a, b, c);

    and if you write

    0 -> (a, b, c);

    it assigns 0 to 'c' and grabs two items from the stack and assigns
them to 'a' and 'b' (an empty stack signals an error).

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Sun Sep 8 22:01:10 2024
    On 08/09/2024 19:13, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

Like being able to define anonymous structs anywhere, or allowing
multiple declarations of the same module-level variables and functions.

    Look at this C code:

void
do_bgi_add(unsigned int * dst, int xlen, unsigned int * xp,
           int ylen, unsigned int * yp) {
    if (ylen < xlen) {
        int tmp = xlen;
        xlen = ylen;
        ylen = tmp;
        unsigned int * tmpp = xp;
        xp = yp;
        yp = tmpp;
    }
    unsigned int xext = (unsigned int)(((int)(xp[xlen - 1])) >> 31);
    unsigned int yext = (unsigned int)(((int)(yp[ylen - 1])) >> 31);
    unsigned int c = 0;
    int i = 0;
    while(i < xlen) {
        unsigned long long pp = (unsigned long long)(xp[i])
                                + (unsigned long long)(yp[i])
                                + (unsigned long long)c;
        dst[i] = pp;
        c = (pp >> (32ULL));
        i++;
    }
    while(i < ylen) {
        unsigned long long pp = (unsigned long long)xext
                                + (unsigned long long)(yp[i])
                                + (unsigned long long)c;
        dst[i] = pp;
        c = (pp >> (32ULL));
        i++;
    }
    {
        unsigned long long pp = (unsigned long long)xext
                                + (unsigned long long)yext
                                + (unsigned long long)c;
        dst[i] = pp;
    }
}


I claim that this is better than what could be done in early Pascal.
Temporary variables are declared exactly in the scopes where they are
needed; I reuse the same name for 'pp' but scoping makes clear
that the different 'pp' are different variables. All variables
are initialised at declaration time with sensible values. Only
the parameters, 'i' and 'c' are common to the various stages; they have
to be. Note that 'xext' and 'yext' are declared at the point where I
can compute their initial values. Also note that among ordinary
variables only 'i' and 'c' are reassigned (I need to swap parameters
to simplify logic, and the 'dst' array entries are assigned as part of
the function contract). The fact that variables are not reassigned could
be made clearer by declaring them as 'const'.

    I had a problem with this code because it was so verbose. The first
    thing I did was to define aliases u64 and u32 for those long types:

    typedef unsigned long long u64;
    typedef unsigned long u32;

    Then I removed some casts that I thought were not necessary. The first
    result looks like this:

    ---------------------------
void do_bgi_add(u32 * dst, int xlen, u32 * xp, int ylen, u32 * yp) {
    u32 xext, yext, c;
    u64 pp;
    int i;

    if (ylen < xlen) {
        int tmp = xlen;
        xlen = ylen;
        ylen = tmp;
        u32 * tmpp = xp;
        xp = yp;
        yp = tmpp;
    }

    xext = ((int)(xp[xlen - 1])) >> 31;
    yext = ((int)(yp[ylen - 1])) >> 31;

    c = 0;
    i = 0;
    while(i < xlen) {
        pp = (u64)(xp[i]) + (u64)yp[i] + c;
        dst[i] = pp;
        c = pp >> 32;
        i++;
    }

    while(i < ylen) {
        pp = (u64)xext + (u64)yp[i] + c;
        dst[i] = pp;
        c = pp >> 32;
        i++;
    }
    pp = (u64)xext + (u64)yext + c;
    dst[i] = pp;
}
    ---------------------------

Things actually fit onto one line! It's easier now to grasp what's going
on. There are still quite a few casts; it would be better if xext/yext/c
were all of u64 type instead of u32.

pp seems to be used for the same purpose throughout, so I can't see the
point in declaring three separate versions of the same thing.

    I didn't take the C version further, but I did port it to my syntax to
    see what it might look like; that is shown below.

That uses 64 bits, but the arrays are still 32 bits (and here passed by
reference, and are actual arrays, not pointers). I rearranged the
parameters for clarity.

    This is now 20 non-blank lines vs 38 of your original, and has a 60%
    smaller character count. But this is more about keeping clutter away
    from the main body of the function.

    Then it becomes easier to reason about it, something you also seem to
    claim. For example, I probably don't need 'c'; I can set pp := 0 then
    use pp.[32] as needed; two more lines gone. Now however, pp needs to
    have function-scope.

    Now please post a version in Forth!

    ---------------------------
proc do_bgi_add(ref[]u32 dst, xp, yp, int xlen, ylen)=
    u64 xext, yext, c, pp, i

    if ylen < xlen then
        swap(xlen, ylen)
        swap(xp, yp)
    fi

    xext := xp[xlen].[31]
    yext := yp[ylen].[31]

    c := 0
    i := 1
    while i <= xlen, ++i do
        dst[i] := pp := xp[i] + yp[i] + c
        c := pp.[32]
    od

    while i <= ylen, ++i do
        dst[i] := pp := xext + yp[i] + c
        c := pp.[32]
    od

    dst[i] := xext + yext + c
end
    ---------------------------

(Special features: X.[i] is bit indexing. While-loops can have an optional
increment (left over from a suggestion for C where 'for' was over-used);
as written, the arrays are 1-based. 'swap' is a built-in op.)
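
For readers following along in C: the nearest equivalent of the X.[i]
bit indexing would be a shift-and-mask; the macro name is illustrative.

    /* hypothetical C counterpart of X.[i]: extract bit i of x */
    #define BIT(x, i) (((unsigned long long)(x) >> (i)) & 1u)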

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Sun Sep 8 21:18:57 2024
    On 08/09/2024 19:13, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 08/09/2024 01:05, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

Then you no longer have a language which can be implemented in a few KB.
You might as well use a real language with proper data types, and not have
the stack exposed in the language. Forth code can be very cryptic
because of that.

    First, it is not my goal to advocate for Forth use.

    You're doing a fine job of it!

    For me it's one of those languages, like Brainf*ck, which is trivial to
    implement (I've done both), but next to impossible to code in.

    I wonder if you really implemented Forth. Did you implement immediate
    words? POSTPONE?

    I implemented a toy version, with 35 predefined words, that was enough
    to implement Fizz Buzz. Then I looked for more examples to try and found
    they all assumed slightly different sets of built-ins.


    define fibr(n);
    if n < 2 then 1 else fibr(n - 1) + fibr(n - 2) endif
    enddefine;

    There are some anomalies, assignment is written as:

    a -> b;

    which means assign a to b.

    This POP11 seems a more viable language than Forth. (I vaguely remember something from college days, but that might have been POP2.)


external declare tse in oldc;

void
tse1(da, ia, sa, dp, ip, sp, d, i, s)
    double da[];
    int ia[];
    float sa[];
    double * dp;
    int * ip;
    float * sp;
    double d;
    int i;
    float s;
{
}

endexternal;

The first line switches on a syntax extension and lines after that,
up to 'endexternal', are parsed as C code. More precisely, 'oldc'
expects K&R-style function definitions (with empty bodies). As a
result this piece of code generates a Pop11 wrapper that calls the
corresponding C routine, passing it arguments of the types specified
in the C definition (but on the Pop11 side 'double da[]' and 'double * dp'
are treated differently). Note that this is _not_ a feature of the
core Pop11 language. Rather it is handled by library code (which
in principle could be written by an ordinary user).

    I don't quite understand that. Who or what is the C syntax for? Does
    POP11 transpile to C?

    Usually the FFI of a language uses bindings expressed in that language,
    and is needed for the implementation to generate the correct code. So
    it's not clear if calls to that function are checked for numbers and
    types of arguments.

    If they're not, then you don't really need a function sig, just that
    external declaration.

    Most people prefer to code in a HLL.

    You mean "most people prefer by now traditional syntax", sure.

    But this at least shows Lisp as
    being higher level than Forth, and it's a language that can also be
    bootstrapped from a tiny implementation.

Pop11 has very similar semantics to Lisp and reasonably traditional
syntax. I would say that Lisp users value extensibility more than
traditional syntax. Namely, Lisp macros give powerful extensibility
and they work naturally with prefix syntax. In particular, to do
an extension you specify a tree transformation, that is a transformation
of Lisp "lists", which can be done conveniently in Lisp. In Pop11
extensions usually involve some hooks into the parser and one needs to
think about how parsing works. And one needs to generate a representation
at a lower level. So extending Pop11 is usually more work than extending
Lisp. I also use a language called Boot; you will not like it
because it uses whitespace to denote blocks. The semantics of Boot is
essentially the same as Lisp and the syntax (except for whitespace)
is Algol-like.

    One of yours? A search for Boot PL didn't show anything relevant.


One of my early programs did a recursive walk on a graph. The goal was
to do some circuit computation. I did it in Basic on a ZX 81.
I used GOSUB to do recursive calls, but had to simulate the argument
stack with arrays. This led to several complications, purely
because of the inadequacy of the language.

    On ZX81? I can imagine it being hard! (Someone wanted me to do something
    on ZX80, but I turned it down. I considered it too much of a toy.)

    You're taking all those, to me, chaotic features of C as being superior
    to Pascal.

Like being able to define anonymous structs anywhere, or allowing multiple declarations of the same module-level variables and functions.
    multiple declarations of the same module-level variables and functions.

    Look at this C code:

    I'll reply to that separately.

    In my hands I would have given it some tweaks to make it a
    viable systems language. For a better evolution of Pascal, forget Wirth,
    look at Ada, even thought that is not my thing because it is too strict
    for my style.

    For system programming Turbo Pascal had what is needed.

    OK, there's that too, although it's not something I used. (I last used
    actual Pascal c. 1980, and didn't try it again for 35+ years with
    FreePascal.)

    Syntax IS superficial! But it's pretty important otherwise we'd be
    programming in binary machine code, or lambda calculus.

Well, even in the context of syntax, the dot required by Bliss is a little
thing, just a bit of mandatory noise.

    Confusing noise: in A = .B, why doesn't A need the dot too? (This has
    been discussed at length so is rhetorical!)

    Concerning Forth and Lisp, some people like such syntax and there are
    gains.

    In the Reddit PL forum people absolutely love weird and esoteric syntax.
    The harder to understand the better. Loads of pointless punctuation is especially popular.


I am not going to write substantial programs in Bliss or Forth
but I have no doubt they are HLLs.

    So, what would a non-HLL look like to you that is not actual assembly?

In first approximation HLL = not(assembly). Of course (binary, octal,
etc.) machine language counts as assembly for the purpose of this equation.
And some languages like PL360 or Randall Hyde's HLA, which have constructs
which do not look like assembly, still count as assembly. Similarly
macro assemblers like IBM HLASM count as assembly. The fact that there is
a compiler that turns IBM HLASM into Javascript, so that you can run it on
a wide range of machines, does not change this. For me what counts is
thinking and intent: using IBM HLASM you still need to think in terms
of machine instructions. One can use macro assemblers to provide a
set of macros such that the user of those macros does not need to think
about machine instructions. IIUC one of the PL/I compilers was created
by using a macro assembler to implement a small subset of PL/I, and
then a proper PL/I compiler was written in this subset. But this
is really using a macro assembler to implement a different language.
In other words, once the extensions can function as an independent
language and users are encouraged to think in terms of the new language,
this is no longer assembler, but a new thing.

Just as an extra explanation, I read HLL as Higher Level Language,
with Higher implicitly referencing assembly. So it does not need
to be very high level, just higher level than assembly.

    The HLA I implemented for PDP10 looks somewhat odd, but still nothing
    like assembly. Assignment was LTR:

    A + B * C => D

    But there were no operator precedences so this evaluates (A + B)*C.

(I wanted to reimplement that language but I've forgotten most of it,
and the few specs for it are in a museum. They weren't keen on getting
me copies as they considered the documents too fragile.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to David Brown on Sun Sep 8 21:34:27 2024
    David Brown <david.brown@hesbynett.no> wrote:

    And while microcontrollers sometimes have a limited form of branch
    prediction (such as prefetching the target from cache), the more
    numerous and smaller devices don't even have instruction caches.
    Certainly none of them have register renaming or speculative execution.

IIUC the STM32F4 series has a cache, and some of them are not so big. There
are now several Chinese variants of the STM32F103 and some of them have
caches (some very small, like 32 words; IIRC one has 8 words and it
is hard to decide if this is a very small cache or a big prefetch buffer).
A notable example is the MH32F103. The base model officially has 64kB RAM
and 256KB flash. AFAIK this flash is rather slow SPI flash. It also
has a 16kB cache which probably is 4-way set associative with a few
extra lines (probably 4) to increase apparent associativity.
I write "probably" because this is the result of reasoning based on
several time measurements. If you hit the cache it runs nicely at
216 MHz. A cache miss costs around 100 clocks (varies depending on
the exact setting of timing parameters and the form of access).

Similar technology seems to be popular among Chinese chip makers,
especially for "bigger" chips. But IIUC GD's use is for chips
of the size of the STM32F103C8T6.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Sun Sep 8 23:33:28 2024
    On 08/09/2024 22:15, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    I had a problem with this code because it was so verbose. The first
    thing I did was to define aliases u64 and u32 for those long types:

    typedef unsigned long long u64;
    typedef unsigned long u32;

    So you're assuming that unsigned long is 32 bits? (It's 64 bits on the systems I use most.)


    That should be unsigned int.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Mon Sep 9 00:25:07 2024
    On 09/09/2024 00:20, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 08/09/2024 22:15, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    I had a problem with this code because it was so verbose. The first
    thing I did was to define aliases u64 and u32 for those long types:

    typedef unsigned long long u64;
    typedef unsigned long u32;
    So you're assuming that unsigned long is 32 bits? (It's 64 bits on
    the systems I use most.)

    That should be unsigned int.

So you're assuming that unsigned int is 32 bits?

    Yes. The code requires at least a 32-bit int anyway.


I know you're aware of <stdint.h>. You can use it to define your own
u64 and u32 aliases if you like.
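
For instance (these widths are exact wherever the types exist):

    #include <stdint.h>

    typedef uint64_t u64;   /* exactly 64 bits */
    typedef uint32_t u32;   /* exactly 32 bits */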


    I know about that header. I find it fiddlier to type than writing
    unsigned long long once, which is bad enough.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Mon Sep 9 02:07:51 2024
    On 09/09/2024 01:29, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

No. It is essential for efficiency to have 32-bit types. On 32-bit
machines doing otherwise would add useless instructions to the object
code. More precisely, a really stupid compiler will generate useless
instructions even with my declarations; a really smart one will
notice that the variables fit in 32 bits and optimize accordingly.
But at least some gcc versions needed such declarations. Note
also that my version makes clear that there is
symmetry (everything should be added using 64-bit precision);
you depend on promotion rules, which creates visual asymmetry
and requires reasoning to realize that the meaning is symmetric.

Your posted code used 64-bit arithmetic. The xext and c 32-bit variables
were used in loops where they need to be widened to 64 bits anyway. The
new value of c is set from a 32-bit result.

(Have you tried 64-bit versions of xext, yext, c to see if it makes any
difference? I may try it myself if I can set up a suitable test, but I
can only test on a 64-bit machine.

    Do you still have 32-bit machines around? I haven't been able to find
    one for a decade and a half!)

In my version the type of pp is clearly visible; together with the casts
this gives a strong hint what is happening: this is 32-bit addition
producing a carry in 'c'.

    It seems to do a 64-bit addition with a carry on bit 32 of the result
    stored in c.

Note, I did not intend to post a trap for you, but in your
eagerness to shorten the code you removed important information.
And while this code is unlikely to change much (basically an
upgrade to a 64-bit version on 64-bit machines is the only likely
change), normally code evolves and your version is harder
to change.

    3 copies of 'pp' is a bit harder to change than one!

Your code is basically doing 'p = a[i] + b[i] + c' in a loop, but spread
over multiple lines and completely buried in all those 'unsigned long
long' declarations:

unsigned long long pp = (unsigned long long)(xp[i])
                        + (unsigned long long)(yp[i])
                        + (unsigned long long)c;

That was my main objection. And yes, all those casts because of mixing
32/64-bit arithmetic are another source of clutter.

More generally, my aim is to make code obviously correct
(I am not saying that I was fully successful in this case).
I consider your version worse, because with your version the
reader has more work checking correctness (even taking into
account that you lowered the number of lines).

    Sorry, I think mine (either version I posted) is easier to check.

Anyway, I illustrated to you how I use declarations in the middle
of a function. There is nothing chaotic about this; the type is
declared when the variable first gets its value. And in most
cases the variable's scope is tiny. AFAICS neither early Pascal nor
your language allows me to write programs in this way.

    You can write declarations mixed within code in my language, but they
    will have function-wide scope. I tend to do that only for temporary code.

And if
you do not see the benefits, well, that is your loss.

    Average number of local variables in a half-dozen C codebases I surveyed
    was 3 variables per function. So I find it hard to see the point of
    splitting them up into different scopes!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Mon Sep 9 00:29:01 2024
    Bart <bc@freeuk.com> wrote:
    On 08/09/2024 19:13, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

Like being able to define anonymous structs anywhere, or allowing
multiple declarations of the same module-level variables and functions.

    Look at this C code:

void
do_bgi_add(unsigned int * dst, int xlen, unsigned int * xp,
           int ylen, unsigned int * yp) {
    if (ylen < xlen) {
        int tmp = xlen;
        xlen = ylen;
        ylen = tmp;
        unsigned int * tmpp = xp;
        xp = yp;
        yp = tmpp;
    }
    unsigned int xext = (unsigned int)(((int)(xp[xlen - 1])) >> 31);
    unsigned int yext = (unsigned int)(((int)(yp[ylen - 1])) >> 31);
    unsigned int c = 0;
    int i = 0;
    while(i < xlen) {
        unsigned long long pp = (unsigned long long)(xp[i])
                                + (unsigned long long)(yp[i])
                                + (unsigned long long)c;
        dst[i] = pp;
        c = (pp >> (32ULL));
        i++;
    }
    while(i < ylen) {
        unsigned long long pp = (unsigned long long)xext
                                + (unsigned long long)(yp[i])
                                + (unsigned long long)c;
        dst[i] = pp;
        c = (pp >> (32ULL));
        i++;
    }
    {
        unsigned long long pp = (unsigned long long)xext
                                + (unsigned long long)yext
                                + (unsigned long long)c;
        dst[i] = pp;
    }
}


I claim that this is better than what could be done in early Pascal.
Temporary variables are declared exactly in the scopes where they are
needed; I reuse the same name for 'pp' but scoping makes clear
that the different 'pp' are different variables. All variables
are initialised at declaration time with sensible values. Only
the parameters, 'i' and 'c' are common to the various stages; they have
to be. Note that 'xext' and 'yext' are declared at the point where I
can compute their initial values. Also note that among ordinary
variables only 'i' and 'c' are reassigned (I need to swap parameters
to simplify logic, and the 'dst' array entries are assigned as part of
the function contract). The fact that variables are not reassigned could
be made clearer by declaring them as 'const'.

    I had a problem with this code because it was so verbose. The first
    thing I did was to define aliases u64 and u32 for those long types:

    typedef unsigned long long u64;
    typedef unsigned long u32;

    This code runs in 33 bit i386, 32 bit ARM and 64 bit x86-64, in
    all cases under Linux. As Keith noticed, most popular of those
    has 64-bit long. So your definition would break it. You need

    typedef unsigned int u32;

    Then I removed some casts that I thought were not necessary.

I want to be warned about mixing signed and unsigned when
I do not intend it. The casts make clear to the compiler (and the reader)
that I really want this.

    The first
    result looks like this:

    ---------------------------
void do_bgi_add(u32 * dst, int xlen, u32 * xp, int ylen, u32 * yp) {
    u32 xext, yext, c;
    u64 pp;
    int i;

    if (ylen < xlen) {
        int tmp = xlen;
        xlen = ylen;
        ylen = tmp;
        u32 * tmpp = xp;
        xp = yp;
        yp = tmpp;
    }

    xext = ((int)(xp[xlen - 1])) >> 31;
    yext = ((int)(yp[ylen - 1])) >> 31;

    c = 0;
    i = 0;
    while(i < xlen) {
        pp = (u64)(xp[i]) + (u64)yp[i] + c;
        dst[i] = pp;
        c = pp >> 32;
        i++;
    }

    while(i < ylen) {
        pp = (u64)xext + (u64)yp[i] + c;
        dst[i] = pp;
        c = pp >> 32;
        i++;
    }
    pp = (u64)xext + (u64)yext + c;
    dst[i] = pp;
}
    ---------------------------

Things actually fit onto one line! It's easier now to grasp what's going
on. There are still quite a few casts; it would be better if xext/yext/c
were all of u64 type instead of u32.

No. It is essential for efficiency to have 32-bit types. On 32-bit
machines doing otherwise would add useless instructions to the object
code. More precisely, a really stupid compiler will generate useless
instructions even with my declarations; a really smart one will
notice that the variables fit in 32 bits and optimize accordingly.
But at least some gcc versions needed such declarations. Note
also that my version makes clear that there is
symmetry (everything should be added using 64-bit precision);
you depend on promotion rules, which creates visual asymmetry
and requires reasoning to realize that the meaning is symmetric.

pp seems to be used for the same purpose throughout, so I can't see the
point in declaring three separate versions of the same thing.

My version can be checked for correctness in one reasonably fast
pass over the source. Your changes require multiple passes or a much
slower single pass (because with a single pass you need to keep
more info in your head compared to my version).

Your version obscures the important fact that there is truncation
in the assignment

    dst[i] = pp;

In my version the type of pp is clearly visible; together with the casts
this gives a strong hint what is happening: this is 32-bit addition
producing a carry in 'c'. And this is the version of the code intended for
32-bit machines, so everything can be done using 32-bit instructions
(and on a 32-bit machine there is no need for a real shift). Your
remark above indicates that you missed this, but I think that
this is much easier to infer from my version than from your
changed one.

Another thing: I gave this example because most of my functions
are very short. This one was longer, and in longer functions the
benefits of keeping variables local are bigger. And that
function happened to have casts, but the casts were actually
a distraction from my main point.

Also, note that I plan to change this code so that it uses
64-bit arithmetic on 64-bit machines. Then I will have
something like 'host_word' (unsigned 32-bit on a 32-bit
machine and 64-bit on a 64-bit machine), 'signed_host_word'
(the same number of bits, but signed) and 'double_host_word'
(128-bit on a 64-bit machine, 64-bit on a 32-bit machine).
I am waiting with this change because ATM there is still a piece of
code elsewhere which can not handle 64-bit parts. So, while
your u64 and u32 change is acceptable for the current version
it would be misleading about future intent.
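
A sketch of how those planned typedefs might look; the pointer-width
test is my assumption, and unsigned __int128 is a gcc/clang extension
rather than standard C:

    #include <stdint.h>

    #if UINTPTR_MAX > 0xFFFFFFFFu          /* 64-bit host (assumed test) */
    typedef uint64_t          host_word;
    typedef int64_t           signed_host_word;
    typedef unsigned __int128 double_host_word; /* gcc/clang extension */
    #else                                  /* 32-bit host */
    typedef uint32_t          host_word;
    typedef int32_t           signed_host_word;
    typedef uint64_t          double_host_word;
    #endif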

Note, I did not intend to post a trap for you, but in your
eagerness to shorten the code you removed important information.
And while this code is unlikely to change much (basically an
upgrade to a 64-bit version on 64-bit machines is the only likely
change), normally code evolves and your version is harder
to change.

More generally, my aim is to make code obviously correct
(I am not saying that I was fully successful in this case).
I consider your version worse, because with your version the
reader has more work checking correctness (even taking into
account that you lowered the number of lines).

Anyway, I illustrated to you how I use declarations in the middle
of a function. There is nothing chaotic about this; the type is
declared when the variable first gets its value. And in most
cases the variable's scope is tiny. AFAICS neither early Pascal nor
your language allows me to write programs in this way. And if
you do not see the benefits, well, that is your loss.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Mon Sep 9 01:19:06 2024
    Bart <bc@freeuk.com> wrote:
    On 08/09/2024 19:13, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 08/09/2024 01:05, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

Then you no longer have a language which can be implemented in a few KB.
You might as well use a real language with proper data types, and not have
the stack exposed in the language. Forth code can be very cryptic
because of that.

    First, it is not my goal to advocate for Forth use.

    You're doing a fine job of it!

    For me it's one of those languages, like Brainf*ck, which is trivial to
    implement (I've done both), but next to impossible to code in.

    I wonder if you really implemented Forth. Did you implement immediate
    words? POSTPONE?

    I implemented a toy version, with 35 predefined words, that was enough
    to implement Fizz Buzz. Then I looked for more examples to try and found
    they all assumed slightly different sets of built-ins.

OK, so apparently you missed an essential part.

    define fibr(n);
    if n < 2 then 1 else fibr(n - 1) + fibr(n - 2) endif
    enddefine;

    There are some anomalies, assignment is written as:

    a -> b;

    which means assign a to b.

    This POP11 seems a more viable language than Forth. (I vaguely remember something from college days, but that might have been POP2.)

Pop11 was created in Great Britain; originally the 11 referred to the
PDP-11. It was a port (with modifications) of Pop-2. The main site was
Sussex, but it was used in several other places.

external declare tse in oldc;

void
tse1(da, ia, sa, dp, ip, sp, d, i, s)
    double da[];
    int ia[];
    float sa[];
    double * dp;
    int * ip;
    float * sp;
    double d;
    int i;
    float s;
{
}

endexternal;

The first line switches on a syntax extension and lines after that,
up to 'endexternal', are parsed as C code. More precisely, 'oldc'
expects K&R-style function definitions (with empty bodies). As a
result this piece of code generates a Pop11 wrapper that calls the
corresponding C routine, passing it arguments of the types specified
in the C definition (but on the Pop11 side 'double da[]' and 'double * dp'
are treated differently). Note that this is _not_ a feature of the
core Pop11 language. Rather it is handled by library code (which
in principle could be written by an ordinary user).

    I don't quite understand that. Who or what is the C syntax for?

It is information for Pop11 about the signature of a C function. Since
K&R C did not have prototypes, the construction above uses a definition
with an empty body. Basically, you take the C source of your function and
cut the first part of the old style definition up to the opening brace,
paste this into the Pop11 file, and add a closing brace. After that you
have a declaration for the Pop11 compiler.

    Does POP11 transpile to C?

No. The whole system is called Poplog and in interactive use it
compiles to memory (one can save the memory image and load it later).
There is a compiler compiling Pop11 extended with low-level
constructs; this compiler generates assembly which in turn gives
object files (there is an extra file containing extra Pop11 information,
needed for linking).

    Usually the FFI of a language uses bindings expressed in that language,

Pop11 takes the information from C-like code. Maybe I should have used
modern style, then you would just take the prototype from a C header.

    and is needed for the implementation to generate the correct code. So
    it's not clear if calls to that function are checked for numbers and
    types of arguments.

Pop11 is dynamically typed. The 'external declare' construct generates a
Pop11 function with the same number of arguments as the C function.
This Pop11 function converts Pop11 arguments to the right C types if
possible, otherwise it signals an error. Then it calls the C function
(which must be loaded from some shared library; I did not show library
loading). When the C function returns, the return value is converted to
the Pop11 representation.

Pop11 has very similar semantics to Lisp and reasonably traditional
syntax. I would say that Lisp users value extensibility more than
traditional syntax. Namely, Lisp macros give powerful extensibility
and they work naturally with prefix syntax. In particular, to do
an extension you specify a tree transformation, that is a transformation
of Lisp "lists", which can be done conveniently in Lisp. In Pop11
extensions usually involve some hooks into the parser and one needs to
think about how parsing works. And one needs to generate a representation
at a lower level. So extending Pop11 is usually more work than extending
Lisp. I also use a language called Boot; you will not like it
because it uses whitespace to denote blocks. The semantics of Boot is
essentially the same as Lisp and the syntax (except for whitespace)
is Algol-like.

    One of yours? A search for Boot PL didn't show anything relevant.

Boot is an invention of an IBM research lab. Its only current use is
implementing a computer algebra system. If interested, look for
FriCAS; about 60 kloc of it is in Boot.

One of my early programs did a recursive walk on a graph. The goal was
to do some circuit computation. I did it in Basic on a ZX 81.
I used GOSUB to do recursive calls, but had to simulate the argument
stack with arrays. This led to several complications, purely
because of the inadequacy of the language.

    On ZX81? I can imagine it being hard! (Someone wanted me to do something
    on ZX80, but I turned it down. I considered it too much of a toy.)

To give more background, the bare ZX81 had 1kB RAM (including video RAM).
I used a version with an external 64 kB memory module, so I had 56 kB of
usable RAM (8 kB of the address space was taken by ROM). So in fact a
reasonably powerful machine. There was a printer using special paper
(silver looking). The only mass storage available was on cassette.
And worse, the only language available was the Basic in the ROM.

With the bare version I would not have tried; it was clear to me that it
was impossible to fit this program in 1kB.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Keith Thompson on Mon Sep 9 02:06:15 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Waldek Hebisch <antispam@fricas.org> writes:
    Bart <bc@freeuk.com> wrote:
    [...]
    I had a problem with this code because it was so verbose. The first
    thing I did was to define aliases u64 and u32 for those long types:

    typedef unsigned long long u64;
    typedef unsigned long u32;

    This code runs in 33 bit i386, 32 bit ARM and 64 bit x86-64, in
    all cases under Linux. As Keith noticed, most popular of those
    has 64-bit long. So your definition would break it. You need

    typedef unsigned int u32;

    Just add #include <stdint.h> and use uint32_t -- or, if you value
    brevity for some reason:

    typedef uint32_t u32;

    (Bart dislikes <stdint.h>, but there's no reason you should.)

1. Well, I used to care about pre-ANSI compilers and for their benefit
I got into the habit of avoiding <stdint.h>. Yes, it is time
to change, and on microcontrollers I use <stdint.h> as sizes
are important there and I do not expect to ever compile
microcontroller code with a non-ANSI compiler.

2. As I explained, the fixed size is just the current state of development.
On 64-bit machines the code should use bigger types. And to get a
128-bit type it seems that I need a nonstandard type (there seem
to be new types in C23, but I did not investigate them deeper).
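
(Presumably the C23 addition meant here is _BitInt; a minimal sketch,
assuming a C23 compiler whose BITINT_MAXWIDTH in <limits.h> is at
least 128:

    /* C23 bit-precise integer; maximum width is implementation-defined */
    typedef unsigned _BitInt(128) u128;
)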

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Mon Sep 9 03:04:16 2024
    Bart <bc@freeuk.com> wrote:
    On 09/09/2024 01:29, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

No. It is essential for efficiency to have 32-bit types. On 32-bit
machines doing otherwise would add useless instructions to the object
code. More precisely, a really stupid compiler will generate useless
instructions even with my declarations; a really smart one will
notice that the variables fit in 32 bits and optimize accordingly.
But at least some gcc versions needed such declarations. Note
also that my version makes clear that there is
symmetry (everything should be added using 64-bit precision);
you depend on promotion rules, which creates visual asymmetry
and requires reasoning to realize that the meaning is symmetric.

Your posted code used 64-bit arithmetic. The xext and c 32-bit variables
    were used in loops where they need to be widened to 64 bits anyway. The
    new value of c is set from a 32-bit result.

Well, at the C level there is a 64-bit type. The intent is that the C
compiler should notice that the result is 32 bits + a carry flag. Ideally
the compiler should notice that c has only one bit and can keep it
in the carry flag. On i386 the comparison needed for loop control would
destroy the carry flag, so there must be code using the value of the carry
in a register and code to save the carry to a register. But one addition
of the high parts can be skipped. On 32-bit ARM the compiler can use
special machine instructions and actually generates code which
is close to optimal.

(Have you tried 64-bit versions of xext, yext, c to see if it makes any
difference? I may try it myself if I can set up a suitable test, but I
can only test on a 64-bit machine.

I did test this, and when I tested, 64-bit declarations generated
extra instructions. IIRC the 32-bit ones gave a good result (no extra
instructions). Re-checking now with gcc 12 in 32-bit mode seems to
produce extra instructions. Maybe I remembered wrong, or maybe there
is a regression on i386.

Just an extra remark: this is one of several routines which use a
similar style and "the same" declarations. So even if in this
routine the optimization does not work as intended, it made a difference
in other routines and I want to keep the declarations in all routines
consistent with each other.

    Do you still have 32-bit machines around? I haven't been able to find
    one for a decade and a half!)

I have a few old PCs. In the best one the power supply currently
does not work, but the last time I checked, two others were
operational. I have a complete 32-bit Linux userland on
a machine with a 64-bit kernel. So I can produce and run
32-bit binaries on this machine (just now I have no access
to this one).

I also have a bunch of ARM boards; except for one,
the others are 32-bit. The oldest one is from 2012, and one or two were
bought a few years ago. It seems that the Raspberry Pi with a 32-bit CPU
is still in shops, and there were other brands. I do
not make much use of them, but I do use them from time to
time. Actually I have a lot of ARM boards; the ones I mentioned
before are the "powerful" ones, with hundreds of MB of RAM and at least
hundreds of MHz clock. The others are microcontroller boards; those
are small, usually with less than 1MB RAM, so not suitable for PC
class software. The small ones are 32-bit and new ones keep
appearing. Basically, at this size there is no motivation
to go to 64 bits.

Coming back to the arithmetic routines, I plan to use 64-bit
units on 64-bit machines, and then the optimization issue will
be the same as on 32-bit ones.

And if
you do not see the benefits, well, that is your loss.

    Average number of local variables in a half-dozen C codebases I surveyed
    was 3 variables per function. So I find it hard to see the point of
    splitting them up into different scopes!

My style tends to produce more local variables than the older style.
Some functions are big, and in those there are the most benefits.
But even if there is only 1 variable, a wrong or missing initialization
may be a problem. My style minimizes such issues.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Waldek Hebisch on Mon Sep 9 00:09:45 2024
    On 9/8/24 20:29, Waldek Hebisch wrote:
    ...
    This code runs in 33 bit i386, ...

    I'm either amazed or amused, depending upon whether or not 33 is a typo.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Mon Sep 9 11:01:52 2024
    On 08/09/2024 20:13, Waldek Hebisch wrote:

Well, I had a 48KB ZX Spectrum. On it I could run Hisoft Pascal,
Hisoft C, FIG-Forth and other things. At that time I had some
knowledge of C but really did not understand it. Hisoft C gave
me similar speed to Hisoft Pascal, had fewer features (the C lacked
floating point) and needed more memory. FIG-Forth needed very
little memory but execution speed was significantly worse than
Pascal. And my programs were small. Some of my programs needed
enough memory for data, but one could write a compiled program
to the tape and run it from the tape. So I mostly used Pascal.
But that could be different with less memory (the compiler would
not run on a 16kB Spectrum, FIG-Forth would be happy on it) or
with bigger programs.


    That's bringing back memories - I too had these languages for my
    Spectrum. IIRC FIG-Forth had an editor that had almost illegible
    half-width characters to get more characters per line.

    But if you think Forth was slow on the Spectrum, you probably never
    tried Snail Logo :-)

    While I tried these languages a bit on the Spectrum, I did more BASIC
    and assembly programming. The BBC Micro had a much better BASIC and an integrated assembler, so I used that when I could.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Janis Papanagnou on Mon Sep 9 10:46:02 2024
    On 08/09/2024 16:37, Janis Papanagnou wrote:
    On 08.09.2024 16:12, James Kuyper wrote:
    On 9/8/24 00:39, Janis Papanagnou wrote:
    ...
    That's why I immediately see the necessity that compiler creators need
    to know them in detail to _implement_ "C". And that's why I cannot see
    how the statement of the C-standard's "most important purpose" would
    sound reasonable (to me). ...

    I agree - the most important purpose is for implementors, not developers.

    ... I mean, what will a programmer get from the
    "C" standard that a well written text book doesn't provide?

    What the C standard says is more precise and more complete than what
    most textbooks say.

    Exactly. And this precision is what often makes the standard difficult
    to read (for programming purposes, for "ordinary" folks).


    Yes. A standard is a different thing from a textbook, a tutorial, or a reference.

    If a textbook doesn't answer a question I have I'd switch to another
    (better) textbook, (usually) not to the standard.


    Or ask someone :-)

    Not everyone needs to be an expert in the details of a programming language.


    Often a good online reference is more helpful - this one is the best I
    know of (and I believe it is recommended and supported by the C++
    standards committee):

    <https://en.cppreference.com/>


    Most important for my purposes, it makes it clear
    what's required and allowed by the standard.

    No, that is not really true - the C standard is /not/ clear on all
    points. There are aspects of the language that you cannot fully
    understand without cross-referencing between many different sections
    (and there are a few aspects that are not clear even then). That is
    because it is a standard, not a tutorial, and not a language reference.
    A standard is written in more "legalese" language, and makes a point of
    trying to avoid repeating itself - while a good reference will repeat
    the same information multiple times in different places, whenever it
    helps for clarity.


    To be honest, I was also inspecting and reading language standards
    (e.g. for the POSIX shell and awk), but not to be able to correctly
    write programs in those languages - rather out of interest and
    for academic discussions on Usenet (like the discussions here,
    which also often refer to the "C" standard).

    For most of my career, I
    worked under rules that required my code to avoid undefined behavior, to
    work correctly regardless of which choice implementations make on
    unspecified behavior, with a few exceptions.

    Janis


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Mon Sep 9 11:14:54 2024
    On 09/09/2024 05:04, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 09/09/2024 01:29, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    No. It is essential for efficiency to have 32-bit types. On 32-bit
    machines doing otherwise would add useless instructions to object
    code. More precisely, a really stupid compiler will generate useless
    instructions even with my declarations; a really smart one will
    notice that the variables fit in 32 bits and optimize accordingly.
    But at least some gcc versions needed such declarations. Note
    also that my version makes clear that there is
    symmetry (everything should be added using 64-bit precision);
    you depend on promotion rules, which creates visual asymmetry
    and requires reasoning to realize that the meaning is symmetric.

    Your posted code used 64-bit arithmetic. The xext and c 32-bit variables
    were used in loops where they need to be widened to 64 bits anyway. The
    new value of c is set from a 32-bit result.

    Well, at the C level there is a 64-bit type. The intent is that the C
    compiler should notice that the result is 32 bits + a carry flag.
    Ideally the compiler should notice that c has only one bit and can keep
    it in the carry flag. On i386 the comparison needed for loop control
    would destroy the carry flag, so there must be code using the value of
    the carry in a register and code to save the carry to a register. But
    one addition of the high parts can be skipped. On 32-bit ARM the
    compiler can use special machine instructions, and it actually
    generated code which is close to optimal.
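
    (A minimal sketch of the limb-addition pattern being described here,
    assuming 32-bit limbs summed in a 64-bit accumulator; the names x, y,
    dst and c follow the discussion, the rest is hypothetical:)

    #include <stdint.h>
    #include <stddef.h>

    /* Add two n-limb numbers; each 64-bit sum holds limb + limb + carry,
       so the carry out is simply the high half of the sum. */
    void add_limbs(uint32_t *dst, const uint32_t *x, const uint32_t *y,
                   size_t n)
    {
        uint32_t c = 0;
        for (size_t i = 0; i < n; i++) {
            uint64_t t = (uint64_t)x[i] + y[i] + c;
            dst[i] = (uint32_t)t;       /* low 32 bits of the sum */
            c = (uint32_t)(t >> 32);    /* carry: always 0 or 1 */
        }
    }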

    When you have a type that you want to be at least 32 bits (to cover the
    range you need), and want it to be as efficient as possible on 32-bit
    and 64-bit machines (and 16-bit and 8-bit, still found in
    microcontrollers), use "int_fast32_t". On x86-64 it will be 64-bit, on
    32-bit systems it will be 32-bit. Use of the [u]int_fastNN_t types can
    make code significantly more efficient on 64-bit systems while retaining efficiency on 32-bit (or even smaller) targets.
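
    (A minimal sketch of the idea; the function and its data are
    hypothetical, just to show int_fast32_t in context:)

    #include <stdint.h>
    #include <stddef.h>

    /* int_fast32_t is at least 32 bits, but the implementation may pick
       a wider, faster type - typically 64-bit on x86-64, 32-bit on ARM32. */
    int_fast32_t sum32(const int32_t *vals, size_t n)
    {
        int_fast32_t total = 0;    /* assuming the total fits in 32 bits */
        for (size_t i = 0; i < n; i++)
            total += vals[i];
        return total;
    }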


    The average number of local variables in a half-dozen C codebases I surveyed
    was 3 per function. So I find it hard to see the point of
    splitting them up into different scopes!

    My style tends to produce more local variables than the older style.
    Some functions are big, and in those the benefits are greatest.
    But even if there is only 1 variable, wrong or missing initialization
    may be a problem. My style minimizes such issues.


    Local variables are free in C as used by most people, so IMHO it is a
    good idea to use them generously. Breaking big calculations into parts
    with names can make code a lot clearer. Consider declaring them "const"
    to be clear that they do not change after initialisation.
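
    (For instance, a sketch along those lines - the calculation is made up
    purely for illustration:)

    /* Naming the steps of a calculation with const locals instead of
       one big expression. */
    double kinetic_energy(double mass, double velocity)
    {
        const double v_squared = velocity * velocity;
        const double energy = 0.5 * mass * v_squared;
        return energy;
    }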

    (Bart dislikes extra locals because he has a style of declaring them all
    at the start of functions, usually without initialisation, and likes to
    use non-optimising compilers and then complain about the speed.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Mon Sep 9 11:34:06 2024
    On 08/09/2024 23:34, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:

    And while microcontrollers sometimes have a limited form of branch
    prediction (such as prefetching the target from cache), the more
    numerous and smaller devices don't even have instruction caches.
    Certainly none of them have register renaming or speculative execution.

    IIUC the STM4 series has a cache, and some of them are not so big. There
    are now several Chinese variants of the STM32F103, and some of them have
    caches (some very small, like 32 words; IIRC one has 8 words and it
    is hard to decide if this is a very small cache or a big prefetch buffer).

    There are different kinds of cache here. Some of the Cortex-M cores
    have optional caches (i.e., the microcontroller manufacturer can choose
    to have them or not).

    <https://en.wikipedia.org/wiki/ARM_Cortex-M#Silicon_customization>


    Flash memory, flash controller peripherals, external memory interfaces (including things like QSPI) are all specific to the manufacturer,
    rather than part of the Cortex M cores from ARM. Manufacturers can do
    whatever they want there.

    So a "cache" of 32 words is going to be part of the flash interface, not
    a cpu cache (which are typically 16KB - 64KB, and only found on bigger microcontrollers with speeds of perhaps 120 MHz or above). And yes, it
    is often fair to call these flash caches "prefetch buffers" or
    read-ahead buffers. (You also sometimes see small caches for external
    ram or dram interfaces.)


    A notable example is the MH32F103. The base model officially has 64kB RAM
    and 256KB flash. AFAIK this flash is rather slow SPI flash. It also
    has a 16kB cache which is probably 4-way set associative with a few
    extra lines (probably 4) to increase apparent associativity.
    I write "probably" because this is the result of reasoning based on
    several time measurements. If you hit the cache it runs nicely at
    216 MHz. A cache miss costs around 100 clocks (varying depending on
    the exact setting of timing parameters and the form of access).

    Similar technology seems to be popular among Chinese chip makers,
    especially for "bigger" chips. But IIUC GD uses it for chips
    of the size of the STM32F103C8T6.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bart on Mon Sep 9 12:28:56 2024
    On 08.09.2024 13:05, Bart wrote:
    On 08/09/2024 01:05, Waldek Hebisch wrote:

    First, it is not my goal to advocate for Forth use.

    For me it's one of those languages, like Brainf*ck, which is trivial to implement (I've done both), but next to impossible to code in.

    Well, it depends. In comparison to Intercal, Brainf*ck appeared
    to me quite comfortably programmable. - LOL

    (I had also written my version of Brainf*ck in the past with a
    couple more features, though. Just for fun. I've never used it
    seriously; if that makes any sense in the first place.)

    [...]

    Pascal was a teaching language and some thought went into its structure (unlike C). In my hands I would have given it some tweaks to make it a
    viable systems language. For a better evolution of Pascal, forget Wirth;
    look at Ada, even though that is not my thing because it is too strict
    for my style.

    Pascal was (and still is) often despised as a "teaching language".

    However, I heard in the 1980s that even the control software
    of a nuclear reprocessing plant in our country had been written
    in Pascal.

    (Personally I'd have a better feeling of safety if I knew such
    software were written in Pascal rather than, say, in "C". Mileages vary.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Waldek Hebisch on Mon Sep 9 12:29:05 2024
    On 08.09.2024 02:05, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    What started the subthread was the question of which HLL goes between
    ASM and C (since someone suggested that C was mid-level).

    Well, for me the important question is how much work is due to tools
    (basically overhead) and how much deals with the problem domain.

    Indeed.

    [...] As a devil's
    advocate, let me compare a typical implementation of early Pascal
    with modern C. [...] In early Pascal
    one had to declare all local variables at the start of
    a function. C has block structure, so one can limit a variable's
    scope to the area where the variable makes sense.

    Pascal also has block structure. It differs in some details;
    e.g., as you say, declarations are not part of the blocks (for
    reasons beyond relevance here).

    But yes, local declarations are fine (and already known these
    days from other block-oriented HLLs).

    But to be fair, Pascal has nested functions - something that
    "C" doesn't have; or did that change recently? - with their
    own types, constants, variables, and functions/procedures.
    So there's a locality of declarations in another, Pascal way.

    [...] Early Pascal has a
    bunch of features not present in C, but one can reasonably consider
    modern C to be higher level than early Pascal.

    I wouldn't second that. In the "_problem domain_" category that
    you mentioned above, Pascal always appeared to me to at least
    produce much more ("semantically") legible code, as opposed to
    that heap of syntax trash we are used to from "C".

    The stronger typing and safe constructs (no pointer arithmetic)
    from these days' Pascal I also consider a substantial criterion
    of a HLL.

    And _a lot_
    of early languages were at a lower level than Pascal.

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to David Brown on Mon Sep 9 07:03:00 2024
    On 9/9/24 04:46, David Brown wrote:
    On 08/09/2024 16:37, Janis Papanagnou wrote:
    On 08.09.2024 16:12, James Kuyper wrote:
    ...
    Most important for my purposes, it makes it clear
    what's required and allowed by the standard.

    No, that is not really true - the C standard is /not/ clear on all
    points. There are aspects of the language that you cannot fully
    understand without cross-referencing between many different sections
    (and there are a few aspects that are not clear even then). That is
    because it is a standard, not a tutorial, and not a language reference.
    A standard is written in more "legalese" language, and makes a point of
    trying to avoid repeating itself - while a good reference will repeat
    the same information multiple times in different places, whenever it
    helps for clarity.

    I will concede your point, but it's still the case that the standard is
    clearer about such things than any other source I'm familiar with.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Bart on Mon Sep 9 13:05:29 2024
    On 08.09.2024 23:01, Bart wrote:
    On 08/09/2024 19:13, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Like being able define anonymous structs always anywhere, or allowing
    multiple declarations of the same module-level variables and functions.

    Look at this C code:

    [ snip code ]

    (Quite an unreadable mess. Needs character-wise code-inspection
    to be sure it does what is intended. And needs a description in
    the first place to be able to analyze it and make sure that it
    works as intended.)


    I claim that it is better than what could be done in early Pascal.

    Have you tried? (It's not straightforward if you're approaching
    it from the low level.) In Pascal I'd start with the data type I
    intend to process, define an integral type. Implement bit-shifts
    etc. in terms of simple operations (like 'div' and 'mod'). Then
    identify the functional building blocks for the operations. Etc.

    Temporary variables are declared exactly in the scopes where they are
    needed; I reuse the same name for 'pp', but scoping makes clear
    that the different 'pp' are different variables. All variables
    are initialised at declaration time with sensible values. Only the
    parameters, 'i' and 'c', are common to the various stages; they have
    to be. Note that 'xext' and 'yext' are declared at the point where I
    can compute their initial values. Also note that among ordinary
    variables only 'i' and 'c' are reassigned (I need to swap parameters
    to simplify the logic, and the 'dst' array entries are assigned as part
    of the function contract). The fact that variables are not reassigned
    could be made clearer by declaring them 'const'.
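
    (A minimal sketch of that scoping style; the names 'pp', 'i' and 'c'
    follow the description above, but the bodies are hypothetical:)

    /* Each 'pp' lives only in its own block; reusing the name across
       blocks is unambiguous because the scopes do not overlap. */
    void stages(unsigned *dst, const unsigned *src, unsigned n)
    {
        unsigned c = 0;                  /* shared across stages */
        for (unsigned i = 0; i < n; i++) {
            {   /* stage 1 */
                const unsigned pp = src[i] + c;
                dst[i] = pp;
            }
            {   /* stage 2: a different 'pp', same name, new scope */
                const unsigned pp = dst[i] >> 31;
                c = pp;
            }
        }
    }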

    I had a problem with this code because it was so verbose. [...]

    My first impetus was likewise to start refactoring that "C" code.
    But I noticed that it's IMO not a sensible example to demonstrate
    HLLs in the first place! The code is full of low-level stuff where
    you need, for example, to consider the lengths of various word types,
    use tons of casts, bit-operations, and whatnot. The temporary
    variables, at least, are not what makes this code more legible in
    any way. It's also arguable whether it makes sense to introduce
    temporaries to swap variables, instead of adding to legibility
    by hiding that in a 'swap' function (with the temporary then not
    littering the namespace of the already overloaded function block).
    And so on.

    [ snip try of a refactoring ]

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Kuyper on Mon Sep 9 13:06:13 2024
    On 09/09/2024 13:03, James Kuyper wrote:
    On 9/9/24 04:46, David Brown wrote:
    On 08/09/2024 16:37, Janis Papanagnou wrote:
    On 08.09.2024 16:12, James Kuyper wrote:
    ...
    Most important for my purposes, it makes it clear
    what's required and allowed by the standard.

    No, that is not really true - the C standard is /not/ clear on all
    points. There are aspects of the language that you cannot fully
    understand without cross-referencing between many different sections
    (and there are a few aspects that are not clear even then). That is
    because it is a standard, not a tutorial, and not a language reference.
    A standard is written in more "legalese" language, and makes a point of
    trying to avoid repeating itself - while a good reference will repeat
    the same information multiple times in different places, whenever it
    helps for clarity.

    I will concede your point, but it's still the case that the standard is clearer about such things than any other source I'm familiar with.

    I have lost track of which particular "such things" we are talking about
    here, so you could well be right!

    The standard /is/ clear on some aspects of C - but not on others. I
    don't dispute that it is a useful document and one that serious C
    programmers should aspire to read, but I don't think it is really aimed
    at "normal" C programmers or useful to them. Perhaps the original
    writers did not envisage so many non-experts getting involved in C coding.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Mon Sep 9 12:31:10 2024
    On 09/09/2024 02:19, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 08/09/2024 19:13, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 08/09/2024 01:05, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Then you no longer have a language which can be implemented in a few KB.
    You might as well use a real language with proper data types, and not
    have the stack exposed in the language. Forth code can be very cryptic
    because of that.

    First, it is not my goal to advocate for Forth use.

    You're doing a fine job of it!

    For me it's one of those languages, like Brainf*ck, which is trivial to
    implement (I've done both), but next to impossible to code in.

    I wonder if you really implemented Forth. Did you implement immediate
    words? POSTPONE?

    I implemented a toy version, with 35 predefined words, that was enough
    to implement Fizz Buzz. Then I looked for more examples to try and found
    they all assumed slightly different sets of built-ins.

    OK, so apparently you missed an essential part.

    I've looked at half a dozen hits for 'forth postpone' and I still don't understand what it does. Apparently something to do with compiled mode.

    I wouldn't know enough to confidently implement it or use it.

    Another mysterious feature with hard-to-understand semantics. You did
    say this was a very simple language, trivial to implement in a few KB?

    My opinion of Forth has gone down a couple of notches; sorry.

    (I'm not against all stack-based user-languages; I was quite impressed
    by PostScript for example. But then I didn't have to do much in-depth
    coding in it.)

    On ZX81? I can imagine it being hard! (Someone wanted me to do something
    on ZX80, but I turned it down. I considered it too much of a toy.)

    To give more background, bare ZX81 had 1kB RAM (including video RAM).

    You must mean /excluding/ surely? Otherwise there wouldn't be much left
    from 1KB!

    The first Z80 machine I /made/ had 0.25KB RAM, to which I added 1KB
    (actually 1K 6-bit words; two bits unpopulated to save £6) of text-mode
    video memory.

    The second version had 32KB RAM, the same 1K text-mode memory, and 8KB graphics-mode video memory. I was able to write my first compiler on
    that one, written using an assembler, which itself was written via a hex editor, and that was written in actual binary. ('Full-stack')

    But both only had tape storage so tools were memory-based.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to David Brown on Mon Sep 9 08:21:23 2024
    On 9/9/24 07:06, David Brown wrote:
    On 09/09/2024 13:03, James Kuyper wrote:
    On 9/9/24 04:46, David Brown wrote:
    On 08/09/2024 16:37, Janis Papanagnou wrote:
    On 08.09.2024 16:12, James Kuyper wrote:
    ...
    Most important for my purposes, it makes it clear
    what's required and allowed by the standard.

    No, that is not really true - the C standard is /not/ clear on all
    points. There are aspects of the language that you cannot fully
    understand without cross-referencing between many different sections
    (and there are a few aspects that are not clear even then). That is
    because it is a standard, not a tutorial, and not a language reference.
    A standard is written in more "legalese" language, and makes a point of
    trying to avoid repeating itself - while a good reference will repeat
    the same information multiple times in different places, whenever it
    helps for clarity.

    I will concede your point, but it's still the case that the standard is
    clearer about such things than any other source I'm familiar with.

    I have lost track of which particular "such things" we are talking about here, so you could well be right!

    I was talking about "... what's required and allowed by the standard ...".

    The standard /is/ clear on some aspects of C - but not on others. I
    don't dispute that it is a useful document and one that serious C
    programmers should aspire to read, but I don't think it is really aimed
    at "normal" C programmers or useful to them. Perhaps the original
    writers did not envisage so many non-experts getting involved in C coding.

    I will concede that there are many aspects of C that it's less than
    clear about - but I'm unaware of any other document that's clearer about
    those aspects. I don't see how there can be - if the standard isn't
    clear about something, there's no way to be sure what the "truth" behind
    the lack of clarity is, so it's not possible for some other document to
    clarify it. Exception: if a DR has been filed, and the committee has
    responded with a clarification, but has not yet updated the standard
    accordingly, the DR resolution is a document that's clearer than the
    standard on that issue, and just as authoritative.

    Other documents can say things like "... the standard isn't clear about
    this, but you can count on all real-world implementations to do ...".
    But when it's a question about "... what's required and allowed by the
    standard ...", such alternatives are irrelevant.

    Somewhat trickier are the cases where some other document says "... the standard says X, but that doesn't make any sense. It's clear that they
    actually meant Y, and you can just write your code accordingly." Tim
    Rentsch is a prolific source of such comments. From what I've seen, when
    people have written such things, and the committee later resolved the
    issue, they often did not resolve it in the expected way.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Keith Thompson on Mon Sep 9 15:58:39 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Waldek Hebisch <antispam@fricas.org> writes:
    ...
    1. Well, I used to care about pre-ANSI compilers, and for their benefit
    I got into the habit of avoiding <stdint.h>. Yes, it is time
    to change, and on microcontrollers I use <stdint.h>, as sizes
    are important there and I do not expect to ever compile
    microcontroller code with a non-ANSI compiler.

    Pre-ANSI? I haven't even been able to find a working pre-ANSI
    C compiler.

    I know you mean "real world compiler", but to satisfy one's curiosity
    it's possible to run Unix V7 on a Linux VM. (It reminds me how hard it
    was to get anything done in 1979, even though Unix V7 seemed, at the
    time, so much easier to use than IBM mainframes and punched cards.)

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to David Brown on Mon Sep 9 14:36:28 2024
    David Brown <david.brown@hesbynett.no> wrote:
    On 08/09/2024 23:34, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:

    And while microcontrollers sometimes have a limited form of branch
    prediction (such as prefetching the target from cache), the more
    numerous and smaller devices don't even have instruction caches.
    Certainly none of them have register renaming or speculative execution.

    IIUC the STM4 series has a cache, and some of them are not so big. There
    are now several Chinese variants of the STM32F103, and some of them have
    caches (some very small, like 32 words; IIRC one has 8 words and it
    is hard to decide if this is a very small cache or a big prefetch buffer).

    There are different kinds of cache here. Some of the Cortex-M cores
    have optional caches (i.e., the microcontroller manufacturer can choose
    to have them or not).

    <https://en.wikipedia.org/wiki/ARM_Cortex-M#Silicon_customization>

    I do not see relevant information at that link.

    Flash memory, flash controller peripherals, external memory interfaces (including things like QSPI) are all specific to the manufacturer,
    rather than part of the Cortex M cores from ARM. Manufacturers can do whatever they want there.

    AFAIK a typical Cortex-M design has the core connected to a "bus matrix".
    It is up to the chip vendor to decide what else is connected to the bus
    matrix. For me it does not matter if it is an ARM design or vendor
    specific. Normal internal RAM is accessed via the bus matrix, and in the
    MCUs that I know about it is fast enough that a cache is not needed. So
    caches come into play only for flash (and possibly external memory, but
    a design with external memory will probably be rather large).

    It seems that vendors do not like to say that they use a cache; instead
    they use misleading terms like "flash accelerator".

    So a "cache" of 32 words is going to be part of the flash interface, not
    a cpu cache

    Well, caches were never part of the CPU proper; they were part of the
    memory interface. They could act for the whole memory or only for the
    part that needs it (like flash). So I do not understand what "not a cpu
    cache" is supposed to mean. More relevant is whether such a thing acts
    as a cache: a 32-word thing almost surely will act as a cache, while an
    8-word thing may be a simple FIFO buffer (or may act smarter, showing
    behaviour typical of caches).

    (which are typically 16KB - 64KB,

    I wonder where you found this figure. Such a size is typical for
    systems bigger than MCUs. It could be useful for MCUs with
    flash on a separate die, but with flash on the same die as the CPU
    a much smaller cache is adequate.

    and only found on bigger
    microcontrollers with speeds of perhaps 120 MHz or above). And yes, it
    is often fair to call these flash caches "prefetch buffers" or
    read-ahead buffers.

    Typical code has enough branches that simple read-ahead beyond 8
    words is unlikely to give good results. OTOH delivering things
    that were accessed in the past and still present in the cache
    gives good results even with very small caches.

    (You also sometimes see small caches for external
    ram or dram interfaces.)


    A notable example is the MH32F103. The base model officially has 64kB RAM
    and 256KB flash. AFAIK this flash is rather slow SPI flash. It also
    has a 16kB cache which is probably 4-way set associative with a few
    extra lines (probably 4) to increase apparent associativity.
    I write "probably" because this is the result of reasoning based on
    several time measurements. If you hit the cache it runs nicely at
    216 MHz. A cache miss costs around 100 clocks (varying depending on
    the exact setting of timing parameters and the form of access).

    Similar technology seems to be popular among Chinese chip makers,
    especially for "bigger" chips. But IIUC GD uses it for chips
    of the size of the STM32F103C8T6.



    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Mon Sep 9 17:11:34 2024
    On 09/09/2024 16:36, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 08/09/2024 23:34, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:

    And while microcontrollers sometimes have a limited form of branch
    prediction (such as prefetching the target from cache), the more
    numerous and smaller devices don't even have instruction caches.
    Certainly none of them have register renaming or speculative execution.
    IIUC the STM4 series has a cache, and some of them are not so big. There
    are now several Chinese variants of the STM32F103, and some of them have
    caches (some very small, like 32 words; IIRC one has 8 words and it
    is hard to decide if this is a very small cache or a big prefetch buffer).

    There are different kinds of cache here. Some of the Cortex-M cores
    have optional caches (i.e., the microcontroller manufacturer can choose
    to have them or not).

    <https://en.wikipedia.org/wiki/ARM_Cortex-M#Silicon_customization>

    I do not see relevent information at that link.

    There is a table of the Cortex-M cores, with the sizes of the optional
    caches.


    Flash memory, flash controller peripherals, external memory interfaces
    (including things like QSPI) are all specific to the manufacturer,
    rather than part of the Cortex M cores from ARM. Manufacturers can do
    whatever they want there.

    AFAIK a typical Cortex-M design has the core connected to a "bus matrix".
    It is up to the chip vendor to decide what else is connected to the bus matrix.

    Yes.

    However, there are other things connected before these crossbar
    switches, such as tightly-coupled memory (if any). And the cpu caches
    (if any) are on the cpu side of the switches. Manufacturers also have a
    certain amount of freedom with the TCMs and caches, depending on which
    core they are using and which licenses they have.

    There is a convenient diagram here:

    <https://www.electronicdesign.com/technologies/embedded/digital-ics/processors/microcontrollers/article/21800516/cortex-m7-contains-configurable-tightly-coupled-memory>

    For me it does not matter if it is an ARM design or vendor specific.
    Normal internal RAM is accessed via the bus matrix, and in the MCUs that
    I know about it is fast enough that a cache is not needed. So caches
    come into play only for flash (and possibly external memory, but a
    design with external memory will probably be rather large).


    Typically you see data caches on faster Cortex-M4 microcontrollers with
    external DRAM, and they are also standard on Cortex-M7 devices. For the
    faster chips, internal SRAM on the AXI bus is not fast enough. For
    example, the NXP i.mx RT106x family typically run at 528 MHz core clock,
    but the AXI bus and cross-switch are at 133 MHz (a quarter of the
    speed). The tightly-coupled memories and the caches run at full core speed.

    It seems that vendors do not like to say that they use a cache; instead
    they use misleading terms like "flash accelerator".

    That all depends on the vendor, and on how the flash interface
    controller is designed. Vendors do like to use terms that sound good,
    of course!


    So a "cache" of 32 words is going to be part of the flash interface, not
    a cpu cache

    Well, caches were never part of the CPU proper; they were part of the
    memory interface. They could act for the whole memory or only for the
    part that needs it (like flash). So I do not understand what "not a cpu
    cache" is supposed to mean. More relevant is whether such a thing acts
    as a cache: a 32-word thing almost surely will act as a cache, while an
    8-word thing may be a simple FIFO buffer (or may act smarter, showing
    behaviour typical of caches).


    Look at the diagram in the link I gave above, as an example. CPU caches
    are part of the block provided by ARM and are tightly connected to the processor. Control of the caches (such as for enabling them) is done by hardware registers provided by ARM, alongside the NVIC interrupt
    controller, SysTick, MPU, and other units (depending on the exact
    Cortex-M model).

    This is completely different from the small buffers that are often
    included in flash controllers or external memory interfaces as
    read-ahead buffers or write queues (for RAM), which are as external to
    the processor core as SPI, UART, PWM, ADC, and other common blocks
    provided by the microcontroller manufacturer.

    (which are typically 16KB - 64KB,

    I wonder where you found this figure. Such a size is typical for
    systems bigger than MCUs. It could be useful for MCUs with
    flash on a separate die, but with flash on the same die as the CPU
    a much smaller cache is adequate.

    Look at the Wikipedia link I gave. Those are common sizes for the
    Cortex-M7 (which is pretty high-end), and for the newer generation of
    Cortex-M35 and Cortex-M5x parts. I have on my desk an RT1062 with a
    600 MHz Cortex-M7, 1 MB internal SRAM, 32 KB I and D caches, and
    external QSPI flash.


    and only found on bigger
    microcontrollers with speeds of perhaps 120 MHz or above). And yes, it
    is often fair to call these flash caches "prefetch buffers" or
    read-ahead buffers.

    Typical code has enough branches that simple read-ahead beyond 8
    words is unlikely to give good results. OTOH delivering things
    that were accessed in the past and still present in the cache
    gives good results even with very small caches.

    There are no processors with caches smaller than perhaps 4 KB - it is
    simply not worth it. Read-ahead buffers on flash accesses are helpful, however, because most code is sequential most of the time. It is common
    for such buffers to be two-way, and to have between 16 and 64 bytes per
    way. These make a very big difference, especially with external memory.
    They are attached to the flash interface or other external memory
    interface, rather than the processor.


    (You also sometimes see small caches for external
    ram or dram interfaces.)


    A notable example is the MH32F103. The base model officially has 64kB RAM
    and 256KB flash. AFAIK this flash is rather slow SPI flash. It also
    has a 16kB cache which is probably 4-way set associative with a few
    extra lines (probably 4) to increase apparent associativity.
    I write "probably" because this is the result of reasoning based on
    several time measurements. If you hit the cache it runs nicely at
    216 MHz. A cache miss costs around 100 clocks (varying depending on
    the exact setting of timing parameters and the form of access).

    Similar technology seems to be popular among Chinese chip makers,
    especially for "bigger" chips. But IIUC GD uses it for chips
    of the size of the STM32F103C8T6.




    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to David Brown on Mon Sep 9 16:46:40 2024
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 05:04, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 09/09/2024 01:29, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    No. It is essential for efficiency to have 32-bit types. On 32-bit
    machines doing otherwise would add useless instructions to object
    code. More precisely, a really stupid compiler will generate useless
    instructions even with my declarations; a really smart one will
    notice that the variables fit in 32 bits and optimize accordingly.
    But at least some gcc versions needed such declarations. Note
    also that my version makes clear that there is
    symmetry (everything should be added using 64-bit precision);
    you depend on promotion rules, which creates visual asymmetry
    and requires reasoning to realize that the meaning is symmetric.

    Your posted code used 64-bit arithmetic. The xext and c 32-bit variables
    were used in loops where they need to be widened to 64 bits anyway. The
    new value of c is set from a 32-bit result.

    Well, at the C level there is a 64-bit type. The intent is that the C
    compiler should notice that the result is 32 bits + a carry flag.
    Ideally the compiler should notice that c has only one bit and can keep
    it in the carry flag. On i386 the comparison needed for loop control
    would destroy the carry flag, so there must be code using the value of
    the carry in a register and code to save the carry to a register. But
    one addition of the high parts can be skipped. On 32-bit ARM the
    compiler can use special machine instructions, and it actually
    generated code which is close to optimal.

    When you have a type that you want to be at least 32 bits (to cover the
    range you need), and want it to be as efficient as possible on 32-bit
    and 64-bit machines (and 16-bit and 8-bit, still found in
    microcontrollers), use "int_fast32_t". On x86-64 it will be 64-bit, on 32-bit systems it will be 32-bit. Use of the [u]int_fastNN_t types can
    make code significantly more efficient on 64-bit systems while retaining efficiency on 32-bit (or even smaller) targets.

    Well, I have constraints, some of which are outside of the C code.
    To resolve the constraints I normally use some configure machinery.
    If a standard type is an exact match for my constraints, then I would
    use the standard type, possibly adding some fallbacks for pre-standard
    systems. But if my constraints differ, I see no advantage in
    using more "fancy" standard types compared to the old types.

    Concerning '<stdint.h>', a somewhat worrying thing is that several types
    there seem to be optional (or maybe were optional in older versions
    of the standard). I am not sure if this can be a real problem,
    but too many times I have seen developers say, on various issues,
    "this is not mandatory, so we will skip it".

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to James Kuyper on Mon Sep 9 16:50:19 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    On 9/8/24 20:29, Waldek Hebisch wrote:
    ...
    This code runs in 33 bit i386, ...

    I'm either amazed or amused, depending upon whether or not 33 is a typo.

    A typo. Maybe in my unconscious brain there was some thought of the
    sort '32 + carry = 33', but that is unlikely.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Mon Sep 9 09:47:18 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Bart <bc@freeuk.com> writes:

    In a language like C, the LHS of an assignment is one of four categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    I can think of three others. There may be more.

    Yes, very good. I count four or five, depending on what
    differences count as different.

    A is a simple variable;

    C does not define the term "simple variable" so presumably you
    define it to be any named object that /can/ appear on the LHS
    of a simple assignment -- a sort of "no true Scots-variable".

    I think what you're talking about here is an identifier that
    looks like it could be assigned to but can't be due to some
    semantic information attached to the identifier. If that is
    it then I'm okay with the "simple variable" phrase, with the
    understanding that the phrase is being used informally rather
    than completely precisely.

    X represents a term of any complexity, and Y is any
    expression.

    I can think of at least one expression form for X that contradicts
    this claim.

    I haven't figured this one out yet. I'm assuming you don't mean
    just a lack of parentheses around X that is causing the problem.
    I guess I'm also assuming you're talking only about syntax, and
    semantic information doesn't enter into it.

    It would be great if C had simple rules, but it doesn't. You could have
    started by saying something about the most common forms of assignment
    being those you list, and that X can be almost any term, but the risk of
    making absolute claims is that people (like me) will look into them.

    I have two problems with understanding some of Bart's comments.

    One, he often isn't careful to express himself accurately or
    precisely.

    Two, his vocabulary can be unpredictably idiosyncratic.

    I think the combination of these two aspects make it harder for
    me to understand him than either aspect would just by itself.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Keith Thompson on Mon Sep 9 16:21:11 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Waldek Hebisch <antispam@fricas.org> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Waldek Hebisch <antispam@fricas.org> writes:
    Bart <bc@freeuk.com> wrote:
    [...]
    I had a problem with this code because it was so verbose. The first
    thing I did was to define aliases u64 and u32 for those long types:

    typedef unsigned long long u64;
    typedef unsigned long u32;

    This code runs in 33 bit i386, 32 bit ARM and 64 bit x86-64, in
    all cases under Linux. As Keith noticed, the most popular of those
    has a 64-bit long. So your definition would break it. You need

    typedef unsigned int u32;

    Just add #include <stdint.h> and use uint32_t -- or, if you value
    brevity for some reason:

    typedef uint32_t u32;

    (Bart dislikes <stdint.h>, but there's no reason you should.)

    1. Well, I used to care about pre-ANSI compilers, and for their benefit
    I got into the habit of avoiding <stdint.h>. Yes, it is time
    to change, and on microcontrollers I use <stdint.h>, as sizes
    are important there and I do not expect to ever compile
    microcontroller code with a non-ANSI compiler.

    Pre-ANSI? I haven't even been able to find a working pre-ANSI
    C compiler.

    At least in 1994 HP shipped their base system with pre-ANSI compilers.
    I heard that this practice continued for some time, and also that
    some other vendors did it. The hardware was pretty reliable, so it could
    be used for a very long time. I think that a few years ago, if I had
    wished, I could have gotten an HP machine running and used its pre-ANSI
    compiler.

    In more general terms, some developers seem to think that it is their
    moral duty to break any software that is 4 years old or more. I have
    quite a different view: I feel that breaking a 10-year-old or even
    20-year-old system requires a compelling reason. And in my judgement
    the reasons frequently are not compelling enough.

    <stdint.h> was added in C99, so a C89/C90 ("ANSI") implementation might
    not support it, but I'd still be surprised if you could find a C
    implementation these days that doesn't have <stdint.h>. And you can
    roll your own or use something like
    <https://www.lysator.liu.se/c/q8/index.html>.

    Well, recently I had trouble with software that used a C99 feature.
    On some of my Linux systems the gcc default was gnu+C90 and it refused
    the C99 features. That was resolvable, but it broke the automated build
    and required manual intervention to add the appropriate option to gcc.
    For "normal" users on such Linux systems the software was "broken".
    '<stdint.h>' is not affected by such an issue, but in general I would
    like a monkey to be able to compile my code on a wide variety of
    systems. As my ability to test different scenarios is limited, I tend
    to be conservative in the features that I use (OK, that is a personal
    judgement; in several cases I depend on gcc features, but I feel that
    I have reasons to depend on such features).

    2. As I explained, the fixed size is just the current state of
    development. On 64-bit machines the code should use bigger types. And
    to get a 128-bit type it seems that I need a nonstandard type (there
    seem to be new types in C23, but I did not investigate them more
    deeply).

    C23 doesn't add any new support for 128-bit integers.

    gcc (and compilers intended to be compatible with it, like clang)
    support __int128 and unsigned __int128 types, but only on 64-bit
    systems, and they're not strictly integer types (for example, there are
    no constants of type __int128).

    Yes, I know that. IIUC clang supports bigger types, and in one
    of the drafts I found "bit-exact integer types". At first glance it
    looked as though they could support bigger widths than the standard
    integer types.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Mon Sep 9 17:57:37 2024
    On 09/09/2024 17:21, Waldek Hebisch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    C23 doesn't add any new support for 128-bit integers.

    So what does _BitInt do with a width of 128 bits?


    gcc (and compilers intended to be compatible with it, like clang)
    support __int128 and unsigned __int128 types, but only on 64-bit
    systems, and they're not strictly integer types (for example, there are
    no constants of type __int128).

    Yes, I know that. IIUC clang supports bigger types, and in one
    of the drafts I found "bit-exact integer types". At first glance it
    looked as though they could support bigger widths than the standard
    integer types.

    I had quite decent 128-bit support in my language at one time, including:

    * i128 and u128 names
    * 128-bit constants, including char constants up to 'ABCDEFGHIJKLMNOP'
    * Ability to print 128-bit numbers
    * Nearly all arithmetic, comparison and logical ops supported (only
    division was limited to 128/64)

    I believe this is a better spec than provided by gnuC.

    However I dropped it for various reasons. Mainly because I didn't have
    enough use-cases for it (the only one was supporting it within the
    compiler!), so it was not worthwhile (being cool to have wasn't enough).

    But also, in a language with otherwise only 64-bit integer types there
    was no mixed-size arithmetic; now widening and narrowing come
    into play again.

    In C, there could be mixed 32-bit, 64-bit and 128-bit types in an
    expression. With the new _BitInt, which allows arbitrary sizes from 1 to
    63 bits as well as over 64 bits (with no apparent upper limit in the
    spec), it sounds like there will be lots of fun and games to be had.

    Especially if it all needs to work on 32-bit targets too.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Tim Rentsch on Mon Sep 9 18:27:58 2024
    On 09/09/2024 17:47, Tim Rentsch wrote:
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Bart <bc@freeuk.com> writes:

    In a language like C, the LHS of an assignment is one of four categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    I can think of three others. There may be more.

    Yes, very good. I count four or five, depending on what
    differences count as different.

    Well, I implement only three of those (ie. of my four) in my code
    generator. (X[i] gets reduced to *X before this point.)

    If there are any other categories, then they've never been encountered
    in any of the C projects I've tested.

    (My language has nearly 10 categories, including one where the dual
    results of a 'divrem' op are assigned to two LHS values. Are the extra
    ones in C anything like that, eg. to do with 'complex' assignment?)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Sep 9 19:37:15 2024
    On 09/09/2024 18:57, Bart wrote:
    On 09/09/2024 17:21, Waldek Hebisch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    C23 doesn't add any new support for 128-bit integers.

    So what does _BitInt do with a width of 128 bits?


    _BitInt types are not "integer types". Nor is gcc's __int128 type.
    Obviously they are very like integer types in many ways, but there are differences, so they do not count as "integer types" like the standard
    integer types or extended integer types (which C allows, but AFAIK no
    common compiler supports).

    You might think the differences are minor or just "legal
    technicalities", but sometimes it is relevant that while
    0x1234567812345678 is a 64-bit integer constant,
    0x12345678123456781234567812345678 is not a 128-bit integer constant,
    even if you can use __int128 (a gcc or clang extension) or _BitInt(128)
    (a C23 bit-precise integer type) as though it were a 128-bit integer
    type in most circumstances.

    For many purposes, however, you can use _BitInt(128) as though it were a
    normal 128-bit integer type, and ignore the details.

    (I haven't done much testing with these myself. I wonder if gcc handles _BitInt(128) and __int128 identically in code generation.)
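
    (A small illustration of the constant limitation; a sketch for
    gcc/clang on 64-bit targets, where the shift-and-or idiom is the
    usual workaround:)

    /* There are no 128-bit integer literals, so a large __int128 value
       is typically assembled from two 64-bit halves. */
    int main(void)
    {
        unsigned __int128 big =
            ((unsigned __int128)0x1234567812345678ULL << 64)
            | 0x1234567812345678ULL;
        return (unsigned long long)(big >> 64) == 0x1234567812345678ULL
                   ? 0 : 1;   /* exits 0 if the high half round-trips */
    }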

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Janis Papanagnou on Mon Sep 9 17:29:56 2024
    On 2024-09-08, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 08.09.2024 16:12, James Kuyper wrote:
    On 9/8/24 00:39, Janis Papanagnou wrote:
    ...
    That's why I immediately see the necessity that compiler creators need
    to know them in detail to _implement_ "C". And that's why I cannot see
    how the statement of the C-standard's "most important purpose" would
    sound reasonable (to me). ...

    I agree - the most important purpose is for implementors, not developers.

    ... I mean, what will a programmer get from the
    "C" standard that a well written text book doesn't provide?

    What the C standard says is more precise and more complete than what
    most textbooks say.

    Exactly. And this precision is what often makes the standard difficult
    to read (for programming purposes, for "ordinary" folks).

    The C grammar is not presented in a nice way in ISO C.

    It uses nonsensical categories. For instance a basic expression like A
    is considered a unary-expression. A newcomer looking at the
    grammar for assignment will be wondering how on Earth the A
    in A = B is a unary expression, when it contains no unary operator.

    The unary-expression is not given in the immediately preceding section,
    and no section references are given; you have to go searching through
    the document to find it.

    I also suspect programmers not versed in parsing and grammars will not
    intuit that assignment associates right to left. Someone who remembers
    their compiler course from university will know that the right hand side "unary-expression assignment-operator assignment-expression"
    has the assignment-expression on the right, and is therefore
    identified as right-recursive, and that leads to right association.
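
    (A two-line demonstration of what that right association means in
    practice:)

    #include <stdio.h>

    int main(void)
    {
        int a, b, c = 5;
        a = b = c;               /* parses as a = (b = c) */
        printf("%d %d\n", a, b); /* prints "5 5" */
        return 0;
    }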

    I strongly suspect that the vast majority of the C coders on the planet
    (as well as users of other languages that have operator grammars) refer
    to operator precedence tables that fit on one page, rather than flipping
    around in a telescopic grammar that goes on for pages.

    The standard for a language which has operators with precedence
    should give a precedence table. The grammar for assignment should
    be given as <expr> = <expr>, where you refer to the table if you
    want to know how a = b = c or a + b = c are parsed. That would be the
    most understandable presentation, or at least the most widely used one,
    which programmers are used to from books and tutorials.

    Books and tutorials about C could just carefully copy and paste the
    table from the standard, and be confident that they have the correct
    info which matches how the language is defined.

    Lastly, the grammar presentation in ISO C is not actually agnostic of
    parsing technology. It uses left recursion, which isn't LL(1) and
    therefore presents a challenge to someone writing a recursive-descent
    parser, who must factor the grammar.

    For a grammar specification to be directly usable to the widest scope of implementation techniques---if that were a goal---it would have to
    be left-factored, where necessary, to meet that goal.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Mon Sep 9 19:21:33 2024
    On 09/09/2024 18:46, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 05:04, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 09/09/2024 01:29, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    No. It is essential for efficiency to have 32-bit types. On 32-bit
    machines doing otherwise would add useless instructions to object
    code. More precisely, a really stupid compiler will generate useless
    instructions even with my declarations; a really smart one will
    notice that the variables fit in 32 bits and optimize accordingly.
    But at least some gcc versions needed such declarations. Note
    also that my version makes clear that there is
    symmetry (everything should be added using 64-bit precision);
    you depend on promotion rules, which creates visual asymmetry
    and requires reasoning to realize that the meaning is symmetric.

    Your posted code used 64-bit arithmetic. The xext and c 32-bit variables
    were used in loops where they need to be widened to 64 bits anyway. The
    new value of c is set from a 32-bit result.

    Well, at the C level there is a 64-bit type. The intent is that the C
    compiler should notice that the result is 32 bits + a carry flag.
    Ideally the compiler should notice that c has only one bit and can keep
    it in the carry flag. On i386 the comparison needed for loop control
    would destroy the carry flag, so there must be code using the value of
    the carry in a register and code to save the carry to a register. But
    one addition of the high parts can be skipped. On 32-bit ARM the
    compiler can use special machine instructions, and it actually
    generated code which is close to optimal.

    When you have a type that you want to be at least 32 bits (to cover the
    range you need), and want it to be as efficient as possible on 32-bit
    and 64-bit machines (and 16-bit and 8-bit, still found in
    microcontrollers), use "int_fast32_t". On x86-64 it will be 64-bit, on
    32-bit systems it will be 32-bit. Use of the [u]int_fastNN_t types can
    make code significantly more efficient on 64-bit systems while retaining
    efficiency on 32-bit (or even smaller) targets.

    Well, I have constraints, some of which are outside of the C code.
    To resolve the constraints I normally use some configure machinery.
    If a standard type is an exact match for my constraints, then I would
    use the standard type, possibly adding some fallbacks for pre-standard
    systems. But if my constraints differ, I see no advantage in
    using more "fancy" standard types compared to the old types.


    The "fancy" standard types in this case specify /exactly/ what you
    apparently want - a type that can deal with at least 32 bits for range,
    and is as efficient as possible on different targets. What other
    constraints do you have here that make "int_fast32_t" unsuitable?

    Concerning '<stdint.h>', a somewhat worrying thing is that several types
    there seem to be optional (or maybe were optional in older versions
    of the standard). I am not sure if this can be a real problem,
    but too many times I have seen developers say, on various issues,
    "this is not mandatory, so we will skip it".


    <stdint.h> has been required since C99. It has not been optional in any
    C standard in the last 25 years. That's a /long/ time - long enough for
    most people to be able to say "it's standard".

    And almost every C90 compiler also includes <stdint.h> as an extension.

    If you manage to find a compiler that is old enough not to have
    <stdint.h>, and you need to use it for actual code, it's probably worth spending the 3 minutes it takes to write a suitable <stdint.h> yourself
    for that target.
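
    (A minimal sketch of such a hand-written fallback, assuming the common
    ILP32/LP64 type sizes; a real one must be checked against the target:)

    /* Hypothetical minimal stdint.h replacement for an old compiler.
       Each typedef must match the target's actual type sizes, and the
       64-bit pair needs a compiler with a 'long long' type. */
    typedef signed char        int8_t;
    typedef unsigned char      uint8_t;
    typedef short              int16_t;
    typedef unsigned short     uint16_t;
    typedef int                int32_t;
    typedef unsigned int       uint32_t;
    typedef long long          int64_t;
    typedef unsigned long long uint64_t;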

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to David Brown on Mon Sep 9 18:46:46 2024
    On 2024-09-09, David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 18:57, Bart wrote:
    On 09/09/2024 17:21, Waldek Hebisch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    C23 doesn't add any new support for 128-bit integers.

    So what does _BitInt do with a width of 128 bits?


    _BitInt types are not "integer types". Nor is gcc's __int128 type.

    How can we write a program which, in an implementation which has a
    __int128 type, outputs "yes" if it is an integer type, otherwise "no"?

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Kaz Kylheku on Mon Sep 9 14:25:41 2024
    On 9/9/24 13:29, Kaz Kylheku wrote:
    ...
    The standard for a language which has operators with precedence
    should give a precedence table.

    C is not such a language, though it comes close to being one. The
    standard, if it contained a precedence table, could do so in a
    non-normative note indicating that only the actual grammar productions
    are normative.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Kaz Kylheku on Mon Sep 9 21:04:10 2024
    On 09/09/2024 20:46, Kaz Kylheku wrote:
    On 2024-09-09, David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 18:57, Bart wrote:
    On 09/09/2024 17:21, Waldek Hebisch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    C23 doesn't add any new support for 128-bit integers.

    So what does _BitInt do with a width of 128 bits?


    _BitInt types are not "integer types". Nor is gcc's __int128 type.

    How can we write a program which, in an implementation which has a
    __int128 type, outputs "yes" if it is an integer type, otherwise "no"?


    #include <stdio.h>

    int main() {
        auto const x = 0x1'0000'0000'0000'0000;
        if (x > 0xffff'ffff'ffff'ffff) {
            printf("yes\n");
        } else {
            printf("no\n");
        }
    }

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Mon Sep 9 12:08:09 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Fri, 06 Sep 2024 07:56:56 -0700
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Fri, 6 Sep 2024 10:35:16 +0100
    Bart <bc@freeuk.com> wrote:

    On 05/09/2024 22:37, James Kuyper wrote:

    On 9/5/24 12:54, Kaz Kylheku wrote:

    On 2024-09-05, Waldek Hebisch <antispam@fricas.org> wrote:

    ...

    You seem to miss the point that the assignment operator is
    fundamentally asymmetric.

    Both sides of an assignment can be complex expressions that
    designate an object (though the right side need not).

    So you've correctly identified the very fundamental asymmetry.

    Sure, if you want to completely disregard all the cases where the
    symmetry does exist.

    That means that for you, there is no interesting difference (using
    my example of assigning A to itself) in a language where you write
    'A = A', and one where you write 'A = .A'.

    (I'd be interested in how, in the latter language, you'd write the
    equivalent of 'A = A = A' in C, since the middle term is both on
    the left of '=', and on the right!)

    The point is that in BLISS everything that is legal on the right
    side of an assignment is also legal on the left side.
    I don't know if the point is generally true. In particular, if
    BLISS supports floating point, what is the meaning of floating point on
    the left side?

    BLISS is word based and typeless. On a PDP-10, doing a

    .pi = 0

    where 'pi' holds a 36-bit floating-point value (and 3.14159...
    presumably), that floating-point value would be used as an
    address and 0 would be stored into it (assuming I remember
    BLISS correctly).

    On PDP-10 reinterpreting [the 18 LS bits of] floating-point as an
    address is natural, because addresses, integers and FP share the same
    register file.
    It seems to me that on S/360 or CDC-6K or PDP-11 or VAX it would be
    less natural.

    I don't think one thing has much of anything to do with the
    other. It seems just as unlikely to use a floating-point value,
    or a portion of a floating-point value, as an address on a PDP-10
    as it does on any of the other systems you mentioned.

    However, natural or not, BLISS was used widely both on PDP-11 and on
    VAX, which means that it worked well enough.

    BLISS is, or was, closer to the hardware than C. Also it's
    harder to scale in BLISS than in C, because BLISS is typeless.
    Writing code in BLISS needs more discipline than writing in C.
    (Disclaimer: my experience writing code in BLISS is very
    close to epsilon, and is very much dimmed by the long passage
    of time.)


    So probably not what one wants to do. ;)

    Yes, LS bits of FP as address do not sound very useful.
    On the other hand, using several MS bits of FP, although typically
    fewer than 18, as address is useful in calculations of many
    transcendental functions.

    Probably not use it as an address but rather as an index.
    Perhaps something like this (please forgive the bastard
    mixing of BLISS and C):

    // variable d contains a 64-bit double

    needed = .(most_digits + (.d >> 52 & 0x7FF))

    to index a table 'most_digits' by the 11-bit exponent of a
    "double" floating-point value.

    I don't remember enough BLISS to know how to write indexing,
    but this construction should accomplish that.

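    In C the same table lookup might be sketched as follows; the names
    'digits_for' and 'most_digits' are hypothetical, and the layout assumed
    is IEEE-754 binary64, where bits 62..52 hold the 11-bit biased exponent:

    #include <stdint.h>
    #include <string.h>

    /* Index a 2048-entry table by the biased exponent of a double. */
    int digits_for(double d, const int most_digits[2048])
    {
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);          /* reinterpret the bytes */
        unsigned exp = (unsigned)(bits >> 52) & 0x7FF;
        return most_digits[exp];
    }
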
    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Waldek Hebisch on Mon Sep 9 16:56:31 2024
    On 9/9/24 12:46, Waldek Hebisch wrote:
    ...
    Concerning '<stdint.h>', a somewhat worrying thing is that several types
    there seem to be optional (or maybe were optional in older versions
    of the standard). I am not sure if this can be a real problem,
    but too many times I saw that on various issues developers say
    "this is not mandatory, so we will skip it".

    You're missing the point behind the optionality of those types. A lazy
    implementor could do as you say, but would save themselves almost
    nothing - the time it takes to insert an appropriate typedef in
    <stdint.h> is quite negligible. Those types are optional because there
    are some platforms with no hardware support for anything that would
    meet the requirements for those types, and for which software emulation
    would be too difficult. In general, on any implementation that fails to
    provide one of the optional types, there is no mandatory type that will
    do the same thing that you wanted the optional type for.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to David Brown on Mon Sep 9 22:04:54 2024
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 18:46, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 05:04, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 09/09/2024 01:29, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    No. It is essential for efficiency to have 32-bit types. On 32-bit
    machines doing otherwise would add useless instructions to object
    code. More precisely, a really stupid compiler will generate useless
    instructions even with my declarations; a really smart one will
    notice that the variables fit in 32 bits and optimize accordingly.
    But at least some gcc versions needed such declarations. Note
    also that my version makes clear that there is
    symmetry (everything should be added using 64-bit precision);
    yours depends on promotion rules, which creates visual asymmetry
    and requires reasoning to realize that the meaning is symmetric.

    Your posted code used 64-bit arithmetic. The xext and c 32-bit variables
    were used in loops where they need to be widened to 64 bits anyway. The
    new value of c is set from a 32-bit result.

    Well, at C level there is a 64-bit type. The intent is that the C
    compiler should notice that the result is 32-bit + carry flag. Ideally
    the compiler should notice that c has only one bit and can keep it
    in the carry flag. On i386 the comparison needed for loop control would
    destroy the carry flag, so there must be code using the value of carry in
    a register and code to save the carry to a register. But one addition
    of the high parts can be skipped. On 32-bit ARM the compiler can use
    special machine instructions, and it actually generated code which
    is close to optimal.

    When you have a type that you want to be at least 32 bits (to cover the
    range you need), and want it to be as efficient as possible on 32-bit
    and 64-bit machines (and 16-bit and 8-bit, still found in
    microcontrollers), use "int_fast32_t". On x86-64 it will be 64-bit, on
    32-bit systems it will be 32-bit. Use of the [u]int_fastNN_t types can
    make code significantly more efficient on 64-bit systems while retaining
    efficiency on 32-bit (or even smaller) targets.

    Well, I have constraints, some of which are outside of the C code.
    To resolve the constraints I normally use some configure machinery.
    If a standard type is an exact match for my constraints, then I would
    use the standard type, possibly adding some fallbacks for pre-standard
    systems. But if my constraints differ, I see no advantage in
    using the more "fancy" standard types compared to the old types.


    The "fancy" standard types in this case specify /exactly/ what you
    apparently want - a type that can deal with at least 32 bits for range,
    and is as efficient as possible on different targets. What other
    constraints do you have here that make "int_fast32_t" unsuitable?

    This routine is part of a small library. A good approximation is that
    I want a type (possibly depending on target) which will make this
    library fast. As I wrote in another message, I plan to use 64-bit
    units of work on 64-bit targets. I have no plans to use the
    library on 8-bit or 16-bit targets, but if I needed it on such
    targets I would probably use a 16-bit work unit. So, while a 32-bit
    work unit represents the current state, it is not a correct statement of
    intentions. Consequently, 'int_fast32_t' would be as misleading
    as or more misleading than 'int' ('int' is a reasonable indication
    that the choice of types is not finished; 'int_fast32_t' suggests that
    it is the proper choice for all targets).

    Concerning '<stdint.h>', a somewhat worrying thing is that several types
    there seem to be optional (or maybe were optional in older versions
    of the standard). I am not sure if this can be a real problem,
    but too many times I saw that on various issues developers say
    "this is not mandatory, so we will skip it".


    <stdint.h> has been required since C99. It has not been optional in any
    C standard in the last 25 years. That's a /long/ time - long enough for
    most people to be able to say "it's standard".

    And almost every C90 compiler also includes <stdint.h> as an extension.

    If you manage to find a compiler that is old enough not to have
    <stdint.h>, and you need to use it for actual code, it's probably worth
    spending the 3 minutes it takes to write a suitable <stdint.h> yourself
    for that target.

    ATM I do not have the standard text handy to check, but AFAIR several
    _types_ in <stdint.h> were not required but optional.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bart on Tue Sep 10 00:07:48 2024
    On 08/09/2024 17:44, Bart wrote:
    On 08/09/2024 16:39, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    In language like C, the LHS of an assignment is one of four categories:

       A = Y;         // name
       *X = Y;        // pointer
       X[i] = Y;      // index
       X.m = Y;       // member select

    I can think of three others.  There may be more.

    OK, so what are they?

    TM:
    Yes, very good. I count four or five, depending on what
    differences count as different.

    I guess nobody is going to say what those extra categories are, are they?

    It's bad form to call somebody out on something but then refuse to tell
    them exactly what they've got wrong or have missed out.

    3, 4, or maybe 5 mysterious categories of LHS assignment terms that I
    have never encountered in a million lines of C code I've processed,
    but nobody is willing to say what they are?

    I sense a wind-up.

    I can think of at least one expression form for X that contradicts this
    claim.

    Example?

    Nothing here either.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to David Brown on Mon Sep 9 22:33:16 2024
    On 2024-09-09, David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 20:46, Kaz Kylheku wrote:
    On 2024-09-09, David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 18:57, Bart wrote:
    On 09/09/2024 17:21, Waldek Hebisch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    C23 doesn't add any new support for 128-bit integers.

    So what does _BitInt do with a width of 128 bits?


    _BitInt types are not "integer types". Nor is gcc's __int128 type.

    How can we write a program which, in an implementation which has a
    __int128 type, outputs "yes" if it is an integer type, otherwise "no"?


    #include <stdio.h>

    int main() {
        auto const x = 0x1'0000'0000'0000'0000;
        if (x > 0xffff'ffff'ffff'ffff) {
            printf("yes\n");
        } else {
            printf("no\n");
        }
    }

    If the constant in this program is too wide for any integer type, that
    is a constraint violation requiring a diagnostic. After that, if the
    program is translated or executed anyway, the behavior is undefined.

    I'd like the program to have well-defined behavior, under the
    test assumption (that a documented, conforming extension is provided
    in the form of __int128).

    If the implementation were to truncate the constant to 64 bits, and then
    choose a 64-bit type for x, then we would expect the "no" output; but
    that doesn't show that __int128 isn't an integer type, only that
    the type is not equipped with constants.

    Integer types don't have to have their own matching constants. The types
    char and short don't have their own constants; they borrow int.
    Library support can be missing, in regard to formatting __int128 to
    text and back.

    The __int128 type had better support all integer arithmetic, though,
    and there should be conversion rules for when __int128 appears opposite
    an operand of a different type. An __int128 should be usable as a
    displacement in pointer arithmetic. It should be possible to switch() on
    an __int128, even if a constant cannot be expressed for every possible
    value.

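    As a sketch of those expectations: the following compiles with gcc and
    clang on 64-bit targets where the __int128 extension is available (the
    switch goes through a cast here, since integer constants wider than 64
    bits cannot be written):

    #include <stdio.h>

    int main(void)
    {
        __int128 big = (__int128)1 << 100;   /* arithmetic beyond 64 bits */
        int arr[4] = {1, 2, 3, 4};
        __int128 idx = 2;
        printf("%d\n", arr[idx]);            /* index in pointer arithmetic */
        switch ((int)(big >> 100)) {         /* usable in integer contexts */
        case 1:  puts("one");   break;
        default: puts("other"); break;
        }
        return 0;
    }
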
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to David Brown on Mon Sep 9 23:58:45 2024
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 16:36, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 08/09/2024 23:34, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:

    And while microcontrollers sometimes have a limited form of branch
    prediction (such as prefetching the target from cache), the more
    numerous and smaller devices don't even have instruction caches.
    Certainly none of them have register renaming or speculative execution.

    IIUC the STM32F4 series has a cache, and some of them are not so big.
    There are now several Chinese variants of the STM32F103 and some of
    them have caches (some very small, like 32 words; IIRC one has 8 words
    and it is hard to decide if this is a very small cache or a big
    prefetch buffer).

    There are different kinds of cache here. Some of the Cortex-M cores
    have optional caches (i.e., the microcontroller manufacturer can choose
    to have them or not).

    <https://en.wikipedia.org/wiki/ARM_Cortex-M#Silicon_customization>

    I do not see relevant information at that link.

    There is a table of the Cortex-M cores, with the sizes of the optional caches.


    Flash memory, flash controller peripherals, external memory interfaces
    (including things like QSPI) are all specific to the manufacturer,
    rather than part of the Cortex M cores from ARM. Manufacturers can do
    whatever they want there.

    AFAIK a typical Cortex-M design has the core connected to a "bus
    matrix". It is up to the chip vendor to decide what else is connected
    to the bus matrix.

    Yes.

    However, there are other things connected before these crossbar
    switches, such as tightly-coupled memory (if any).

    TCM is _not_ a cache.

    And the cpu caches
    (if any) are on the cpu side of the switches.

    Caches are attached where the system designer thinks they are useful
    (and possible). The word "cache" has a well-established meaning and
    ARM (or you) has no right to redefine it.

    Manufacturers also have a certain amount of freedom with the TCMs and
    caches, depending on which core they are using and which licenses they
    have.

    There is a convenient diagram here:

    <https://www.electronicdesign.com/technologies/embedded/digital-ics/processors/microcontrollers/article/21800516/cortex-m7-contains-configurable-tightly-coupled-memory>

    For me it does not matter if it is an ARM design or vendor specific.
    Normal internal RAM is accessed via the bus matrix, and in the MCUs
    that I know about it is fast enough that a cache is not needed. So
    caches come into play only for flash (and possibly external memory, but
    a design with external memory will probably be rather large).


    Typically you see data caches on faster Cortex-M4 microcontrollers with
    external DRAM, and they are also standard on Cortex-M7 devices. For the
    faster chips, internal SRAM on the AXI bus is not fast enough. For
    example, the NXP i.mx RT106x family typically runs at a 528 MHz core
    clock, but the AXI bus and cross-switch are at 133 MHz (a quarter of
    the speed). The tightly-coupled memories and the caches run at full
    core speed.

    OK, if you run the core at a faster clock than the bus matrix, then a
    cache attached on the core side makes a lot of sense. And since the
    cache has to compensate for the lower bus speed it must be reasonably
    large. But if you look at devices where the bus matrix runs at the same
    clock as the core, then it makes sense to put the cache on the other
    side.

    It seems that vendors do not like to say that they use a cache; instead
    they use misleading terms like "flash accelerator".

    That all depends on the vendor, and on how the flash interface
    controller is designed. Vendors do like to use terms that sound good,
    of course!


    So a "cache" of 32 words is going to be part of the flash interface, not
    a cpu cache

    Well, caches were never part of the CPU proper; they were part of the
    memory interface. They could act for the whole memory or only for the
    part that needs it (like flash). So I do not understand what "not a cpu
    cache" is supposed to mean. More relevant is whether such a thing acts
    as a cache: a 32-word thing almost surely will act as a cache, an
    8-word thing may be a simple FIFO buffer (or may act smarter, showing
    behaviour typical of caches).


    Look at the diagram in the link I gave above, as an example. CPU caches
    are part of the block provided by ARM and are tightly connected to the
    processor. Control of the caches (such as for enabling them) is done by
    hardware registers provided by ARM, alongside the NVIC interrupt
    controller, SysTick, MPU, and other units (depending on the exact
    Cortex-M model).

    This is completely different from the small buffers that are often
    included in flash controllers or external memory interfaces as
    read-ahead buffers or write queues (for RAM), which are as external to
    the processor core as SPI, UART, PWM, ADC, and other common blocks
    provided by the microcontroller manufacturer.

    The discussion started with the possible interaction of caches
    and virtual function dispatch. This interaction does not depend
    on you calling it a cache. It depends on cache hits/misses,
    their cost, and possible eviction. And actually small caches
    can give "interesting" behaviour: with a small code footprint there
    may be a 100% hit ratio, but one extra memory reference may lead
    to significant misses. And even small caches behave differently
    from simple buffers.


    (which are typically 16KB - 64KB,

    I wonder where you found this figure. Such a size is typical for
    systems bigger than MCUs. It could be useful for MCUs with
    flash on a separate die, but with flash on the same die as the CPU a
    much smaller cache is adequate.

    Look at the Wikipedia link I gave. Those are common sizes for the
    Cortex-M7 (which is pretty high-end), and for the newer generation of
    Cortex-M35 and Cortex-M5x parts. I have on my desk an RT1062 with a
    600 MHz Cortex-M7, 1 MB internal SRAM, 32 KB I and D caches, and
    external QSPI flash.

    OK, as I wrote it makes sense for them. But for smaller machines
    much smaller caches may be adequate.


    and only found on bigger
    microcontrollers with speeds of perhaps 120 MHz or above). And yes, it
    is often fair to call these flash caches "prefetch buffers" or
    read-ahead buffers.

    Typical code has enough branches that simple read-ahead beyond 8
    words is unlikely to give good results. OTOH delivering things
    that were accessed in the past and still present in the cache
    gives good results even with very small caches.

    There are no processors with caches smaller than perhaps 4 KB - it is
    simply not worth it.

    Historically there were processors with small caches - 256 B in some
    Motorola chips, and I think smaller too. It depends on the whole
    design. Currently, for "big" processors really small caches seem
    to make no sense. Microcontrollers have their own constraints.
    A manufacturer may decide that a cache giving a 10% average improvement
    is not worth the uncertainty of execution time. Or it may decide that a
    small cache is the cheapest way to get better benchmark figures.

    Read-ahead buffers on flash accesses are helpful,
    however, because most code is sequential most of the time. It is common
    for such buffers to be two-way, and to have between 16 and 64 bytes per
    way.

    If you read the description of the STM "flash accelerator" carefully,
    it is clear that this is a classic cache, with line size matched to the
    flash, something like 2-way set associativity, conflicts and eviction.
    Historically there were variations; some caches only cache targets
    of jumps and use a prefetch buffer for linear code. Such caches
    can be effective at very small sizes.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Bart on Mon Sep 9 17:51:01 2024
    Bart <bc@freeuk.com> writes:

    On 08/09/2024 17:44, Bart wrote:

    On 08/09/2024 16:39, Ben Bacarisse wrote:

    Bart <bc@freeuk.com> writes:

    In a language like C, the LHS of an assignment is one of four categories:

       A = Y;         // name
       *X = Y;        // pointer
       X[i] = Y;      // index
       X.m = Y;       // member select

    I can think of three others. There may be more.

    OK, so what are they?

    TM:

    Yes, very good. I count four or five, depending on what
    differences count as different.

    I guess nobody is going to say what those extra categories are, are they?

    I consider it rude to speak for another poster.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Bart on Mon Sep 9 17:57:08 2024
    Bart <bc@freeuk.com> writes:

    On 06/09/2024 12:53, Tim Rentsch wrote:

    Bart <bc@freeuk.com> writes:

    On 05/09/2024 16:21, Waldek Hebisch wrote:

    Bart <bc@freeuk.com> wrote:

    So what exactly is different about the LHS and RHS here:

    A = A;

    (In BLISS, doing the same thing requires 'A = .A' AIUI; while
    'A = A' is also valid, there is a hidden mismatch in indirection
    levels between left and right. It is asymmetric while in C it
    is symmetric, although you seem to disagree on that latter point.)

    You seem to miss the point that the assignment operator is
    fundamentally asymmetric.

    If you've followed the subthread then you will know that nobody
    disputes that assignment reads from one side of '=' and writes to the
    other.

    The symmetry is to do with syntax when the same term appears on
    both sides of '=', the type associated with each side, and,
    typically, the internal representations too.

    Maybe it would help if you would stop thinking in terms of the
    word symmetry (clearly assignment is not symmetrical) and instead
    think about consistency.

    In C, the meaning of an identifier or object-locating expression
    depends on where it is in the syntax tree. In some places it
    means the address of the object; in other places it means the
    contents of whatever is stored in the object.

    In a HLL, a named object (i.e. a variable name) is nearly always meant
    to refer to an object's value, either its current value or what
    will be its new value.

    BLISS is different.

    Considering the point of view of a compiler writer, it's easier
    to write a compiler for Bliss than for C. In Bliss, upon seeing
    an identifier, always simply put its address in a register. If
    an object's value needs to be loaded, there will be a '.' to take
    the address produced by the sub-expression and fetch the word
    stored at that address. On the other hand, in C, upon seeing an
    identifier, the compiler needs to consider the context of where
    the identifier appears:

    You can do the same thing in a C compiler: always load the
    address of any identifier associated with the location of a
    value.

    Sure, but that doesn't change the basic point that in C some
    additional information needs to be taken into account, and
    possibly additional code generated, when looking at the parse
    node for an identifier. In BLISS the action is always just
    to load the address, and no other action is ever needed.

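    A toy illustration of that difference (not from the post - the assembly
    mnemonics and helper names are made up for the example):

    #include <stdio.h>

    enum ctx { AS_LVALUE, AS_RVALUE };

    /* BLISS style: an identifier always denotes its address; '.' fetches. */
    void emit_ident_bliss(const char *name)
    {
        printf("  lea r0, %s\n", name);
    }

    /* C style: the same identifier needs context to pick address vs. value. */
    void emit_ident_c(const char *name, enum ctx c)
    {
        if (c == AS_LVALUE)
            printf("  lea r0, %s\n", name);      /* assignment target */
        else
            printf("  mov r0, [%s]\n", name);    /* value in other contexts */
    }
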
    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Tue Sep 10 01:20:08 2024
    On 10/09/2024 00:53, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 08/09/2024 17:44, Bart wrote:
    On 08/09/2024 16:39, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    In language like C, the LHS of an assignment is one of four categories: >>>>>
       A = Y;         // name
       *X = Y;        // pointer
       X[i] = Y;      // index
       X.m = Y;       // member select

    I can think of three others.  There may be more.
    OK, so what are they?

    TM:
    Yes, very good. I count four or five, depending on what
    differences count as different.

    I guess nobody is going to say what those extra categories are, are they?

    The LHS of an assignment must be a modifiable lvalue. Searching for the
    word "lvalue" in the standard's section on expressions yields several
    forms not listed above:

    - A parenthesized lvalue:
    (A) = Y;

    - A generic selection whose result expression is an lvalue:
    _Generic(0, int: A) = Y;
    Not sure why you'd do this.

    - X->m, where X is a pointer (you might think of that as the same
    category as X.m, but the standard doesn't define the -> operator in
    terms of the . operator)

    - A compound literal:
    int n;
    (int){n} = 42;

    This assigns a value to a temporary object which is immediately
    discarded. I can't think of a valid use for this.


    OK, thanks for the prompt response.

    You listed 4 examples; the 4th one I had no idea about (I don't support compound literals anyway).

    The first 3, I have doubts as to whether they warrant their own categories.

    The first two just end up doing an assignment to A (parentheses are
    no-ops in terms like these anyway).

    While the X->m term is exactly equivalent to (*X).m. I put these three
    through my compiler and they produced the same AST as A = Y or (*X).m.

    So I might call them curiosities rather than practical categories that
    offer new possibilities.

    For example, one of mine (I have several more actual ones) is multiple assignment: (a, b) = (c, d).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Mon Sep 9 18:00:43 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    [...]

    And as for your remarks about typical implementations, does your C
    parser /really/ accept an assignment expression on both sides of
    an = operator? What does that even look like in the code? I have
    written one C parser, contributed to one other and (over the
    years) examined at least two more, and none of them do what you
    seem to be suggesting is typical.

    It wouldn't be surprising to see a parser written so it would
    accept (syntactically) a superset of the well-formed inputs
    allowed by the language grammar. Any parses not allowed by the
    grammar could then be flagged as erroneous in a later semantics
    pass.

    Yes, that is pretty much what I've seen in more than one C parser.
    [...]

    Thank you for this posting.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Keith Thompson on Mon Sep 9 18:52:34 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    I had thought that the standard required support for constants of
    extended integer types wider than int, but the wording seems
    ambiguous at best.

    An unsuffixed decimal constant, for example, has a type that is
    the first of int, long int, long long int in which its value can
    be represented. But:

    If an integer constant cannot be represented by any type in
    its list, it may have an extended integer type, if the
    extended integer type can represent its value. If all of the
    types in the list for the constant are signed, the extended
    integer type shall be signed. If all of the types in the list
    for the constant are unsigned, the extended integer type shall
    be unsigned. If the list contains both signed and unsigned
    types, the extended integer type may be signed or unsigned.
    If an integer constant cannot be represented by any type in
    its list and has no extended integer type, then the integer
    constant has no type.

    The word "may" in that first sentence seems problematic. It
    implies a choice: either a decimal integer constant with a value
    of LLONG_MAX+1 is of an extended integer type, or it isn't. One
    possible interpretation is that such a constant must be of an
    extended integer type if an appropriate type exists. Another is
    that it may or may not be of an extended integer type at the whim
    of the implementation -- but the standard doesn't say that the
    choice is either implementation-defined or unspecified. It just
    says "may".

    Have you tried looking at other uses of the word "may" in the C
    standard to see if that sheds some light on the question?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Keith Thompson on Mon Sep 9 20:46:24 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    [...]

    Have you tried looking at other uses of the word "may" in the C
    standard to see if that sheds some light on the question?

    If you have any actual insight to offer, feel free to do so.

    Is there some reason you mind my asking a civil question?

    In my view there is no question about the intent here. If
    I'm going to try to help you resolve your uncertainty, it
    would be helpful to have a better understanding of your
    reasoning process. That's why I asked the question.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Tue Sep 10 04:40:04 2024
    Bart <bc@freeuk.com> wrote:
    On 09/09/2024 02:19, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 08/09/2024 19:13, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 08/09/2024 01:05, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Then you no longer have a language which can be implemented in a few KB.
    You might as well use a real language with proper data types, and not
    have the stack exposed in the language. Forth code can be very cryptic
    because of that.

    First, it is not my goal to advocate for Forth use.

    You're doing a fine job of it!

    For me it's one of those languages, like Brainf*ck, which is trivial to
    implement (I've done both), but next to impossible to code in.

    I wonder if you really implemented Forth. Did you implement immediate
    words? POSTPONE?

    I implemented a toy version, with 35 predefined words, that was enough
    to implement Fizz Buzz. Then I looked for more examples to try and found
    they all assumed slightly different sets of built-ins.

    OK, so apparently you missed the essential part.

    I've looked at half a dozen hits for 'forth postpone' and I still don't
    understand what it does. Apparently something to do with compiled mode.

    I wouldn't know enough to confidently implement it or use it.

    Silly example 1:

    : mark 9876 . ;

    : imark mark ; IMMEDIATE

    : baz imark ;

    Comments: 'mark' simply prints an easily distinguishable message (a
    number). Due to the 'IMMEDIATE' after 'imark', it is executed at
    compile time; that is, when the Forth compiler sees 'imark' it suspends
    compilation of 'baz' and starts execution of 'imark', which prints the
    message _at compile time_. Since 'baz' contains no other code, nothing
    interesting happens when we run 'baz'.

    Silly example 2 (with the same 'mark' as above):

    : pmark POSTPONE mark ; IMMEDIATE

    : bar pmark ;

    Comments: Here, due to 'IMMEDIATE', 'pmark' is executed during
    compilation of 'bar'. So "POSTPONE mark" is executed. Due to
    'POSTPONE', instead of executing 'mark' the Forth compiler adds a call
    to 'mark' to the compiled code for 'bar'. So the final effect is
    that executing 'bar' executes 'mark', that is, prints the message.

    The above were silly examples, but the point is that, say, 'pmark'
    could do arbitrary computation and, depending on the result of this
    computation, call POSTPONE or not. 'pmark' could use 'POSTPONE' many
    times and generate a lot of code. 'pmark' can read the program text
    following it and base its decisions on what it has seen. So, for
    example, you could define a word called say 'infix'. Such a word would
    read the program text following it up to some end marker. It could
    parse it using infix syntax and use 'POSTPONE' to generate the
    corresponding code.

    Another mysterious feature with hard-to-understand semantics. You did
    say this was a very simple language and trivial to implement in a few
    KB?

    Yes, it is very simple to implement. First, several parts of the Forth
    compiler are exposed as user routines; some are actually useful
    for "normal" programs, but some are intended as a tool to create
    extensions - if you think about them as "ordinary" routines they
    make no sense. Once you understand the purpose, they are sensible.

    Next, the base structure (you probably know this, but just in case).
    There is a dictionary holding defined Forth words. A simple linked
    list will do. A definition has a name, a few flags and associated
    data/code. The simplest Forth may keep data inside the dictionary.
    Alternatively, you have a separate data area and in the dictionary
    entry you store a pointer to the data. You need a routine to find
    a definition in the dictionary. This routine takes a name (string)
    as an argument and returns, say, a pointer to the dictionary entry.
    A simple variant of this routine is exposed as the user procedure
    FIND. Half of the work of POSTPONE is done by this routine.
    You need a few routines that create dictionary entries, that
    is ':', 'CONSTANT', 'VARIABLE', 'CREATE', etc. They require
    extra things, so they can only be fully defined later. But
    they have a common part which creates a "header", that is, puts
    the name, flags and link in the dictionary. Routines creating
    dictionary entries differ in which flags they set and mainly in
    which data they store. Once you have the basic Forth machinery
    in place you can do the rest as an extension in Forth code.
    But to get started you need a number of words; I do not know
    how many. You may look into the Forth standard: there is something
    like 130 "core" words, and to be able to do the rest in
    Forth you will probably need about 30 other words. Depending on how
    you structure your implementation you may be able to define
    some core words in Forth. Anyway, before Forth is operational
    you need a way to fill, say, 100 dictionary entries. Hardcore
    Forthers would create the initial dictionary as an assembly definition
    (use assembly directives to create the data structure). You may
    use your language to create the appropriate data structure.

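    A hypothetical C rendering of such a dictionary entry and of the FIND
    lookup described above (the field layout varies between real Forths):

    #include <string.h>

    #define F_IMMEDIATE 0x01

    struct word {
        struct word *link;       /* previous definition (linked list) */
        unsigned char flags;     /* e.g. the IMMEDIATE bit */
        char name[32];           /* definition name */
        void (*code)(void);      /* handler: primitive code or "docol" */
    };

    struct word *latest;         /* head of the dictionary list */

    /* FIND: walk from newest to oldest; the first match wins. */
    struct word *find(const char *name)
    {
        for (struct word *w = latest; w; w = w->link)
            if (strcmp(w->name, name) == 0)
                return w;
        return 0;
    }
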
    Creating "headers" is easy in any HLL, but of course you also need the
    data/code. Here there are primitives that traditionally are
    short pieces of machine code. You probably need 30-40 of them,
    things like '+' and stack manipulations (remember, there are two
    stacks, so that increases the number of needed operations). In the
    corresponding dictionary entries you store the addresses of primitives.

    Next, the execution engine. The simplest Forth implementation
    is based on threaded code; the execution engine is a few machine
    instructions (the exact count will depend on details). It uses a few
    machine registers which are supposed to have fixed meaning during
    execution of Forth words. Each "Forth" definition
    contains a stub that transfers control to the execution engine.
    Code in the execution engine saves info about the previous routine
    and then starts executing the current one. Maybe I will not go
    into details here; there are a few subtly different schemes in
    use, and they are described on the net. As I wrote, it is a few
    machine instructions, but if you deviate from the known schemes
    you may end up with something non-working or much more
    complicated than needed. Anyway, the code for each Forth definition
    has an initial stub and a sequence of addresses. The final address is
    the address of a routine that returns control to the caller. Actually
    there is a little twist there, and the return and execution
    of the caller are merged. But the point is that there is a fixed
    start and a fixed end, and the real work of the compiler is to fill
    the middle with addresses of the corresponding words. Concerning
    POSTPONE: at compile time of the definition containing POSTPONE,
    it reads the following name, finds the definition in the dictionary
    and stores the corresponding address in its body. When the code
    of POSTPONE is executed, it simply stores the address contained
    in the body of POSTPONE into the currently compiled definition.

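    A hypothetical sketch of such an indirect-threaded inner interpreter in
    C (a real Forth keeps the instruction pointer and both stacks in
    dedicated machine registers, which is why the engine is only a few
    instructions):

    struct xt;                          /* "execution token" = one word */
    typedef void (*handler)(struct xt *);

    struct xt {
        handler code;                   /* primitive, or docol for ':' words */
        struct xt **body;               /* colon words: list of tokens */
    };

    static struct xt **ip;              /* instruction pointer */

    static void inner_loop(void)
    {
        for (;;) {
            struct xt *w = *ip++;       /* fetch the next execution token */
            w->code(w);                 /* dispatch through its handler */
        }
    }
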
    Remark: there are a dozen or two global variables playing an important
    role in Forth. In particular there is a thing called HERE
    which notes the position of the currently defined data. So this
    "store into definition" reads a global variable and uses the
    result as the address of the target of the store. Then it bumps this
    variable to move to the next machine word (Forth data is usually
    stored as whole words).

    Now about the compiler: the compiler reads a string (the name of a
    word) from the input buffer, advances the input and looks for the name
    in the dictionary. If the name is in the dictionary it looks at the
    flags. If the word is in the dictionary and has the "IMMEDIATE" flag,
    then the compiler switches to execution mode (called "interpretation
    state" in the Forth standard) and calls the corresponding code.
    If the word is in the dictionary but does not have the
    "IMMEDIATE" flag, then the corresponding address is stored in
    the current definition. Otherwise the compiler tries to recognize
    a number; if the string is a number then it generates corresponding
    code, that is, it stores the address of a helper routine and after
    it stores the number. At runtime the helper routine uses info
    from a register to know the location in the code and fetches
    the number (and of course puts the result on the stack).
    There is a little issue here: the compiler must avoid executing
    incomplete definitions and also normally does not do recursive
    calls (instead it calls the previous definition). So there is a flag
    to mark incomplete definitions and a special source construct to
    set a flag indicating that the current definition is recursive.
    But that is almost all of the compiler. A similar thing is done
    to implement the command line and interpretive evaluation,
    but this interpretive mode executes words instead of
    storing them in the definition. So the compiler and interpreter
    can share most of the code and just use a flag to distinguish
    the mode.

    That may look suspicious, as I did not write anything about
    control structures. Well, control structures are implemented
    by IMMEDIATE words; they plant calls to helper routines that
    do conditional and unconditional jumps. The funky look of Forth
    control structures is related to the fact that the corresponding words
    have a very simple job to do: they either store info about the start
    of a structure, plant a jump, or both. At the finish of a control
    structure there may be a need for backpatching, that is, filling in
    the address of an already planted jump. Part of the info needed for
    this is stored during compilation on the Forth stack.

    In traditional Forth user routines could plant the addresses of the
    helper routines used by control structures. Standard Forth
    defines this in an abstract way, so it also works if you generate
    machine code instead of threaded code. The point is that the machinery
    to compile control structures is available to users.

    Coming back to "simple to implement": several Forth words
    essentially expose internals. If you have the "wrong" internals,
    then those words will not work as expected. If you have
    no idea how Forth is implemented, then it can be quite
    hard to discover the right pattern. But this is described in
    books and articles on the net. There is 'JonesForth',
    which is 2314 lines of i386 assembly and 1788 lines of Forth.
    Most of the assembly in JonesForth is there to fill the initial
    dictionary; if you use a different language to fill the dictionary,
    the amount of assembler can be dramatically lowered. And BTW, the line
    count above includes _a lot_ of comments. There is also the assembly
    source of FIG-Forth (JonesForth has a very similar structure,
    but is for 32-bit mode).

    The Forth standard tells you what various things are supposed to do;
    there are comment/rationale parts which say how things _may_
    be implemented, but those are just samples. And it may be
    hard to connect the abstract-sounding terminology with practice.
    Still, the standard tells you what words are expected in modern Forth,
    and for "ordinary" words the descriptions are reasonably readable.

    My opinion of Forth has gone down a couple of notches; sorry.

    Well, implementing a real programming language is a real job.
    There is some fun in implementing things but, as I mentioned,
    on modern machines one can get more advantages from other
    languages than from Forth (at the cost of more machine resources).

    And to be clear, you do not need an explicit stack to get the main
    advantages:
    - interactivity: that is, the possibility to add code at runtime.
    This is kinda trivial in interpreters (if you do not mind the
    slowdown), but if the symbol table is part of the runtime then a
    compiler invoked from the interactive command line can compile a
    single routine to memory, register it in the symbol table and pass
    control back to the interactive toplevel. That is how Pop11, Lisp
    and SML typically work.
    - extensibility: you need a way to represent code so that
    user-generated code can be compiled. One way is to use the base
    language as the representation and implement transformation of
    code; that is how Lisp macros work (I think Seed7 too). Or
    you can expose the code generator in a machine-independent way
    (Pop11, Forth). Or you bet that "everything" can be expressed
    adequately within the language. Haskell bets that lazy evaluation
    and use of functions after optimization give the effect of code
    transformations. The OO crowd bets that inheritance and smart
    exception handling will do (in Smalltalk officially everything
    is a method call; when you call a non-existent method, it
    goes to an exception handler that can do new things). I think
    the previous approaches are more powerful - both Lisp and
    Pop11 implement advanced object systems as a "user extension" -
    but the OO folks think that they have enough fun.
    - small size of object code: bytecode is usually smaller than
    traditional Forth threaded code (on big machines much smaller,
    but also smaller on 16-bit ones), but executes more slowly
    than Forth threaded code.
    - speed of object code: a compiler generating machine code usually
    gives faster code than Forth threaded code. If Forth generates
    machine code, then whoever spent more work on optimizers
    wins. In fact, a small project has a hard time competing with
    gcc and clang (but OCaml did present interesting benchmark
    figures). Machine code is bigger, but the old trick is to
    optimize only a small critical part for speed; the rest may be
    unoptimized or, say, optimized for size by compiling to bytecode.

    (I'm not against all stack-based user-languages; I was quite impressed
    by PostScript for example. But then I didn't have to do much in-depth
    coding in it.)

    On ZX81? I can imagine it being hard! (Someone wanted me to do something
    on ZX80, but I turned it down. I considered it too much of a toy.)

    To give more background, bare ZX81 had 1kB RAM (including video RAM).

    You must mean /excluding/ surely? Otherwise there wouldn't be much left
    from 1KB!

    There is not much left. But there is a trick: the boundary between
    video memory and the rest is movable; as you fill the other part the
    available screen area shrinks. It does not take long to have the
    machine completely filled up with no free space left (and IIRC some
    small screen area for the out-of-memory message).

    The first Z80 machine I /made/ had 0.25KB RAM, to which I added 1KB
    (actually 1K 6-bit words; two bits unpopulated to save £6) of text-mode
    video memory.

    The second version had 32KB RAM, the same 1K text-mode memory, and 8KB
    graphics-mode video memory. I was able to write my first compiler on
    that one, written using an assembler, which itself was written via a
    hex editor, and that was written in actual binary. ('Full-stack')

    But both only had tape storage so tools were memory-based.

    I wonder, did you have any ROM? Or battery-backed RAM? Otherwise,
    how did tape reading work? In the micros I know about, tape reading
    was done in software; on the Spectrum this was more than 100 bytes.
    It would be rather tedious to key in a tape loader via console switches.
    Old mainframes had I/O in hardware, but that was rather large
    circuitry.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Tue Sep 10 09:04:47 2024
    On 10/09/2024 00:04, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 18:46, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 05:04, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 09/09/2024 01:29, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    No. It is essential for efficiency to have 32-bit types. On 32-bit
    machines doing otherwise would add useless instructions to object
    code. More precisely, a really stupid compiler will generate useless
    instructions even with my declarations; a really smart one will
    notice that the variables fit in 32 bits and optimize accordingly.
    But at least some gcc versions needed such declarations. Note
    also that my version makes clear that there is
    symmetry (everything should be added using 64-bit precision);
    yours depends on promotion rules, which creates visual asymmetry
    and requires reasoning to realize that the meaning is symmetric.

    Your posted code used 64-bit arithmetic. The xext and c 32-bit variables
    were used in loops where they need to be widened to 64 bits anyway. The
    new value of c is set from a 32-bit result.

    Well, at C level there is a 64-bit type. The intent is that the C
    compiler should notice that the result is 32-bit + carry flag. Ideally
    the compiler should notice that c has only one bit and can keep it
    in the carry flag. On i386 the comparison needed for loop control would
    destroy the carry flag, so there must be code using the value of carry in
    a register and code to save the carry to a register. But one addition
    of the high parts can be skipped. On 32-bit ARM the compiler can use
    special machine instructions, and it actually generated code which
    is close to optimal.

    When you have a type that you want to be at least 32 bits (to cover the
    range you need), and want it to be as efficient as possible on 32-bit
    and 64-bit machines (and 16-bit and 8-bit, still found in
    microcontrollers), use "int_fast32_t". On x86-64 it will be 64-bit, on
    32-bit systems it will be 32-bit. Use of the [u]int_fastNN_t types can
    make code significantly more efficient on 64-bit systems while retaining
    efficiency on 32-bit (or even smaller) targets.

    Well, I have constraints, some of which are outside of the C code.
    To resolve the constraints I normally use some configure machinery.
    If a standard type is an exact match for my constraints, then I would
    use the standard type, possibly adding some fallbacks for pre-standard
    systems. But if my constraints differ, I see no advantage in
    using the more "fancy" standard types compared to the old types.


    The "fancy" standard types in this case specify /exactly/ what you
    apparently want - a type that can deal with at least 32 bits for range,
    and is as efficient as possible on different targets. What other
    constraints do you have here that make "int_fast32_t" unsuitable?

    This routine is part of a small library. A good approximation is that
    I want a type (possibly depending on target) which will make this
    library fast. As I wrote in another message, I plan to use 64-bit
    units of work on 64-bit targets. I have no plans to use the
    library on 8-bit or 16-bit targets, but if I needed it on such
    targets I would probably use a 16-bit work unit. So, while a 32-bit
    work unit represents the current state, it is not a correct statement of
    intentions. Consequently, 'int_fast32_t' would be as misleading
    as or more misleading than 'int' ('int' is a reasonable indication
    that the choice of types is not finished; 'int_fast32_t' suggests that
    it is the proper choice for all targets).

    So you really want to say "at least 16 bits, as efficiently as possible"
    - the type then is "int_fast16_t". This should be 32-bit or 64-bit on
    most architectures bigger than 16-bit - unless the target is actually
    more efficient at handling 16-bit (such as the original 68000).

    If something like that is not sufficient, because you want more
    flexibility, then consider making a typedef with a name you find
    suitable, and a using that. You can have pre-processor conditional
    compilation to handle known cases, and fall back to int_fast16_t otherwise.

    The answer to "I don't know if this is the best choice on all targets, including ones I haven't tried" is /not/ "I'll use this type that I know
    is not the most efficient on targets that I /have/ tried".

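    A sketch of that suggestion - the target macros and the typedef name
    are hypothetical, chosen just for this example:

    #include <stdint.h>

    #if defined(TARGET_X86_64)
    typedef uint64_t workunit;       /* measured best choice on this target */
    #elif defined(TARGET_SMALL_MCU)
    typedef uint16_t workunit;       /* choice for a hypothetical 16-bit port */
    #else
    typedef uint_fast16_t workunit;  /* portable default: >= 16 bits, fast */
    #endif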

    Concerning '<stdint.h>', a somewhat worrying thing is that several types
    there seem to be optional (or maybe were optional in older versions
    of the standard). I am not sure if this can be a real problem,
    but too many times I saw that on various issues developers say
    "this is not mandatory, so we will skip it".


    <stdint.h> has been required since C99. It has not been optional in any
    C standard in the last 25 years. That's a /long/ time - long enough for
    most people to be able to say "it's standard".

    And almost every C90 compiler also includes <stdint.h> as an extension.

    If you manage to find a compiler that is old enough not to have
    <stdint.h>, and you need to use it for actual code, it's probably worth
    spending the 3 minutes it takes to write a suitable <stdint.h> yourself
    for that target.

    ATM I do not have standard text handy to check, but AFAIR several
    _types_ in <stdint.h> were not required but optional.


    Online versions of the standards - at least from C99 upwards - are
    easily and freely available, as long as you are happy with the final
    pre-published drafts rather than insisting on the expensive official
    documents:

    <https://en.cppreference.com/w/c/links>

    Or you can ask here, and be sure that someone will tell you the answer.
    And if that answer is incorrect or imprecise, someone else will correct
    it :-)

    The exact-size types - int8_t, uint16_t, etc. - might not exist on a
    given target (some DSPs have CHAR_BIT 16 or 32, for example). The
    exact-width types for sizes 8, 16, 32, and 64 are required if the
    implementation has standard or extended integer types of those sizes
    and properties (no padding bits, two's complement representation).
    Otherwise they are optional. Other sizes are always optional - if an
    implementation has a 24-bit int, it still does not have to provide
    int24_t, though it is good practice.

    So the 8, 16, 32, and 64 exact size types are always available if the
    target supports them, and that is /almost/ always the case.

    The [u]int_leastN_t and [u]int_fastN_t types are always required for 8,
    16, 32 and 64 bit. Other sizes are optional.

    [u]intptr_t are optional, but almost always provided. [u]intmax_t are required.


    So you can freely use the "least" and "fast" types for sizes 8, 16, 32,
    and 64 on any platform, knowing they always exist (for C99 upwards).
    The fixed size types (for these standard sizes) will exist on all but
    the most unusual of C targets, and if you are working with such a
    target, you would already know about it.

    If you need to convert a pointer to an integer type, use "uintptr_t" -
    if that type does not exist, you are probably on a specialist target
    that has some kind of long pointer and the conversion to an integer is
    unlikely to work well anyway.

    (It is conceivable that a pre-C99 implementation will provide a partial <stdint.h> with the fixed size types and not the least or fast types.)

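    For instance, the guaranteed round trip looks like this; converting in
    the other direction (an arbitrary integer to a pointer) has no such
    guarantee:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int x = 42;
        void *q = &x;
        uintptr_t u = (uintptr_t)q;    /* pointer -> integer */
        void *r = (void *)u;           /* and back: must compare equal */
        printf("%s\n", r == q ? "round-trip ok" : "information lost");
        return 0;
    }
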
    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Tue Sep 10 09:19:40 2024
    On 09/09/2024 22:16, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 09/09/2024 20:46, Kaz Kylheku wrote:
    On 2024-09-09, David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 18:57, Bart wrote:
    On 09/09/2024 17:21, Waldek Hebisch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    C23 doesn't add any new support for 128-bit integers.

    So what does _BitInt do with a width of 128 bits?


    _BitInt types are not "integer types". Nor is gcc's __int128 type.
    How can we write a program which, in an implementation which has a
    __int128 type, outputs "yes" if it is an integer type, otherwise "no"?

    I'm not sure such a program, one that can detect either an __int128 that isn't an integer type or an __int128 that is an integer type, is possible.

    #include <stdio.h>

    int main() {
        auto const x = 0x1'0000'0000'0000'0000;
        if (x > 0xffff'ffff'ffff'ffff) {
            printf("yes\n");
        } else {
            printf("no\n");
        }
    }

    Of course this uses digit separators, which are a new feature in C23.

    And "auto", another C23. I should perhaps have had "auto constexpr x" !

    I felt free to use C23 because we were already talking about the
    support, or lack thereof, of 128-bit integers in C23, and also _BitInt.
    And on a platform that supported 128-bit integer types, "auto" would
    pick a 128-bit type here, while other platforms would pick a 64-bit type.


    If an implementation doesn't have an integer type wider than 64 bits,
    then the constant 0x1'0000'0000'0000'0000 has no type, which violates a constraint.

    Indeed. Conforming implementations need to give an error or warning
    here. But some will also then map it to a 64-bit value (I'd usually
    expect 0, but 0xffff'ffff or 0x7fff'ffff might also be used) and
    continue compilation. That gives a program with the desired behaviour.
    (For my own uses, I would pick flags that gave an error on such code.)


    If an implementation does have integer types wider than 64 bits,
    there's no guarantee that it uses the name "__int128". A future gcc
    might have a 128-bit (extended) integer type and still have __int128
    with its current semantics.

    If an implementation had a standard integer type of 128 bits, then I
    expect "long long int" would be the name (unless it also had even
    bigger types!), but for extended integer types it could be many
    different things. It is very likely, however, that it would also be
    in <stdint.h> as int128_t.

    Do you know of any implementations that do have 128-bit integer types?
    In particular, any that have a compiler on godbolt.org?


    For gcc, this meets Kaz's specification:

    #include <stdio.h>
    int main(void) {
        puts("no");
    }


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Tue Sep 10 11:52:57 2024
    On 10/09/2024 05:40, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    I've looked at half a dozen hits for 'forth postpone' and I still don't
    understand what it does. Apparently something to do with compiled mode.

    I wouldn't know enough to confidently implement it or use it.

    Silly example 1:



    <snip long article about Forth>

    For a non-advocate of Forth you have a lot to say about it!

    I've archived your post for if/when I get back to that project. (If I
    ever use Forth, it'll be on my own terms. I just tried it to see if it's
    still breathing:

    c:\qapps>qq forth
    Bart-Forth
    Type bye to stop
    > 2 2 + .
    4
    > bye

    So it looks like I already thought of it as my version.)

    To give more background, bare ZX81 had 1kB RAM (including video RAM).

    You must mean /excluding/ surely? Otherwise there wouldn't be much left
    from 1KB!

    There is not much left. But there is a trick: the boundary between
    video memory and the rest is movable; as you fill the other part,
    the available screen area shrinks. It does not take long to have
    the machine completely filled, with no free space left (and IIRC
    some small screen area is kept for the out of memory message).

    So that meagre 1KB had to be shared?

    The Z80 designs I did (for my own use and when I had a job developing
    commercial machines) never used tricks like that**.

    I never had to work within very tight pricing, but I would have
    considered such a machine unviable other than for experimenting.

    (** Well, apart from using partially-populated bytes. For text memory it
    was for cost. For video memory I used 8KB but addressed as 16K 4-bit
    bytes. This was so that in greyscale mode (it had 256x256x1 and
    128x128x4 modes), each pixel had its own address for convenience.

    In that mode, this circuitry was also used for TV frame-grabbing.)


    But both only had tape storage so tools were memory-based.

    I wonder, did you have any ROM? Or battery-backed RAM? Otherwise,
    how did tape reading work? In the micros I know about, tape reading
    was done in software; on the Spectrum this took more than 100 bytes.
    It would be rather tedious to key in a tape loader via console
    switches. Old mainframes had I/O in hardware, but that was rather
    large circuitry.

    The first version used circuitry (shift registers and counters) to read
    and write cassette tape, at around 300bps. I think the same circuitry
    was also used in single-step mode to manually load in data a bit at a
    time, or examine memory after a run (since there was no display at
    first, just on/off LEDs).

    Using software, I could read and write tape at 1200bps. At some point I
    also programmed a ROM (using an ad-hoc arrangement as I had no proper programmer) to provide software on power-on.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Tue Sep 10 11:20:55 2024
    On 10/09/2024 01:58, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 16:36, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 08/09/2024 23:34, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:

    And while microcontrollers sometimes have a limited form of branch
    prediction (such as prefetching the target from cache), the more
    numerous and smaller devices don't even have instruction caches.
    Certainly none of them have register renaming or speculative
    execution.
    IIUC the STM4 series has a cache, and some of them are not so big.
    There are now several Chinese variants of the STM32F103 and some of
    them have caches (some very small, like 32 words; IIRC one has 8
    words and it is hard to decide if this is a very small cache or a
    big prefetch buffer).

    There are different kinds of cache here. Some of the Cortex-M cores
    have optional caches (i.e., the microcontroller manufacturer can
    choose to have them or not).

    <https://en.wikipedia.org/wiki/ARM_Cortex-M#Silicon_customization>

    I do not see relevant information at that link.

    There is a table of the Cortex-M cores, with the sizes of the optional
    caches.


    Flash memory, flash controller peripherals, external memory
    interfaces (including things like QSPI) are all specific to the
    manufacturer, rather than part of the Cortex M cores from ARM.
    Manufacturers can do whatever they want there.

    AFAIK typical Cortex-M design has core connected to "bus matrix".
    It is up to chip vendor to decide what else is connected to bus matrix.

    Yes.

    However, there are other things connected before these crossbar
    switches, such as tightly-coupled memory (if any).

    TCM is _not_ a cache.


    Correct. (I did not suggest or imply that it was.)

    And the cpu caches
    (if any) are on the cpu side of the switches.

    Caches are attached where the system designer thinks they are useful
    (and possible). The word "cache" has a well-established meaning and
    ARM (or you) has no right to redefine it.


    I am using it in the manner ARM uses it when talking about ARM
    processors and microcontroller cores. I think that is the most relevant
    way to use the term here. The term "cache" has many meanings in many
    contexts - there is no single precise "well-established" or "official"
    meaning. Context is everything. That is why I have been using the term
    "cpu cache" for the cache tied tightly to the cpu itself, which comes as
    part of the core that ARM designs and delivers, along with parts such as
    the NVIC. And I have tried to use terms such as "buffer" or "flash
    controller cache" for the memory buffers often provided as part of flash controllers and memory interfaces on microcontrollers, because those are
    terms used by the microcontroller manufacturers.

    Manufacturers also have a
    certain amount of freedom of the TCMs and caches, depending on which
    core they are using and which licenses they have.

    There is a convenient diagram here:

    <https://www.electronicdesign.com/technologies/embedded/digital-ics/processors/microcontrollers/article/21800516/cortex-m7-contains-configurable-tightly-coupled-memory>

    For me it does not matter if it is ARM design or vendor specific.
    Normal internal RAM is accessed via the bus matrix, and in the MCUs
    that I know about it is fast enough that a cache is not needed. So
    caches come into play only for flash (and possibly external memory,
    but a design with external memory will probably be rather large).


    Typically you see data caches on faster Cortex-M4 microcontrollers with
    external DRAM, and it is also standard on Cortex-M7 devices. For the
    faster chips, internal SRAM on the AXI bus is not fast enough. For
    example, the NXP i.mx RT106x family typically run at 528 MHz core clock,
    but the AXI bus and cross-switch are at 133 MHz (a quarter of the
    speed). The tightly-coupled memories and the caches run at full core speed.

    OK, if you run the core at a faster clock than the bus matrix, then
    a cache attached on the core side makes a lot of sense. And since
    the cache has to compensate for the lower bus speed it must be
    reasonably large.

    Yes.

    But
    if you look at devices where bus matrix runs at the same clock
    as the core, then it makes sense to put cache on the other side.

    No.

    You put caches as close as possible to the prime user of the cache. If
    the prime user is the cpu and you want to cache data from flash,
    external memory, and other sources, you put the cache tight up against
    the cpu - then you can have dedicated, wide, fast buses to the cpu.

    But it can also make sense to put small buffers as part of memory
    interface controllers. These are not organized like data or instruction caches, but are specific for the type of memory and the characteristics
    of it. How this is done depends on details of the interface, details of
    the internal buses, and how the manufacturer wants to implement it. For example, on one microcontroller I am using there are queues to let it
    accept multiple flash read/write commands from the AHB bus and the IPS
    bus, but read-ahead is controlled by the burst length of read requests
    from the cross-switch (which in turn will come from cache line fill
    requests from the cpu caches). On a different microcontroller, the
    read-ahead logic is in the flash controller itself as that chip has a
    simpler internal bus where all read requests will be for 32 bits (it has
    no cpu caches). An external DRAM controller, on the other hand, will
    have queues and buffers optimised for multiple smaller transactions and
    be able to hold writes in queues that get lower priority than read requests.

    These sorts of queues and buffers are not generally referred to as
    "caches", because they are specialised queues and buffers. Sometimes
    you might have something that is in effect perhaps a two-way
    single-entry 16 byte wide read-only cache, but using the term "cache"
    here is often confusing. At best it is a "flash controller cache", and
    very distinct from a "cpu cache".


    It seems that vendors do not like to say that they use a cache;
    instead they use misleading terms like "flash accelerator".

    That all depends on the vendor, and on how the flash interface
    controller is designed. Vendors do like to use terms that sound
    good, of course!


    So a "cache" of 32 words is going to be part of the flash interface, not >>>> a cpu cache

    Well, caches never were part of the CPU proper, they were part of
    the memory interface. They could act for the whole memory or only
    for the part that needs it (like flash). So I do not understand what
    "not a cpu cache" is supposed to mean. More relevant is whether such
    a thing acts as a cache: a 32-word thing almost surely will act as a
    cache, an 8-word thing may be a simple FIFO buffer (or may act
    smarter, showing behaviour typical of caches).


    Look at the diagram in the link I gave above, as an example. CPU caches
    are part of the block provided by ARM and are tightly connected to the
    processor. Control of the caches (such as for enabling them) is done by
    hardware registers provided by ARM, alongside the NVIC interrupt
    controller, SysTick, MPU, and other units (depending on the exact
    Cortex-M model).

    This is completely different from the small buffers that are often
    included in flash controllers or external memory interfaces as
    read-ahead buffers or write queues (for RAM), which are as external the
    processor core as SPI, UART, PWM, ADC, and other common blocks provided
    by the microcontroller manufacturer.

    The discussion started about possible interaction of caches
    and virtual function dispatch.

    OK - I admit to having lost track of the earlier discussion, so that is helpful.

    This interaction does not depend
    on you calling it cache. It depends on cache hits/misses,
    their cost and possible eviction. And actually small caches
    can give "interesting" behaviour: with small code footprint there
    may be 100% hit ratio, but one extra memory reference may lead
    to significant misses. And even small caches behave differently
    than simple buffers.


    I agree that behaviour can vary significantly.

    When you have a "flash controller cache" - or read-ahead buffers - you typically have something like a 60-80% hit ratio for sequential code and
    nearly 100% for very short loops (like you'd have for a memcpy() loop).
    You have close to 0% hit ratio for branches or calls, regardless of
    whether they are virtual or not (with virtual function dispatch
    generally having one extra indirection at 0% hit rate). This is the
    kind of "cache" you often see in microcontrollers with internal flash
    and clock speeds of up to perhaps 150 MHz, where the flash might be at a
    quarter of the main cpu clock.

    Bigger microcontrollers have cpu caches to give much higher average hit
    ratios even when calling indirectly, because the delay for external
    memory is more significant (even though external QSPI flash can often
    have higher bandwidth than many internal flashes). Indirect accesses
    such as using virtual function dispatch will increase the likelihood of
    a miss.

    When variation and delays are unacceptable, critical code is often put
    in internal SRAM, or sometimes cache lines are locked so that the speed
    is predictable.
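
    As an illustration of the first technique, with gcc-style toolchains
    this is usually just a section attribute (a sketch - the section
    name ".ramfunc" is a common convention but an assumption here; the
    actual name and the copy-to-RAM step depend on the target's linker
    script and startup code):

        /* Request placement in internal SRAM rather than flash, so the
           function runs with predictable, wait-state-free fetch timing. */
        __attribute__((section(".ramfunc"), noinline))
        void timing_critical_step(volatile unsigned *reg)
        {
            *reg = 1u;
        }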



    (which are typically 16KB - 64KB,

    I wonder where you found this figure. Such size is typical for
    systems bigger than MCU-s. It could be useful for MCU-s with
    flash on a separate die, but with flash on the same die as the CPU
    much smaller cache is adequate.

    Look at the Wikipedia link I gave. Those are common sizes for the
    Cortex-M7 (which is pretty high-end), and for the newer generation of
    Cortex-M35 and Cortex-M5x parts. I have on my desk an RT1062 with a
    600 MHz Cortex-M7, 1 MB internal SRAM, 32 KB I and D caches, and
    external QSPI flash.

    OK, as I wrote it makes sense for them. But for smaller machines
    much smaller caches may be adequate.


    As I have said, they are not really caches in the same sense as you have
    for a cpu cache. But certainly a "flash controller cache" or read-ahead
    buffer (especially if there are two of them) can make a big difference
    to the throughput of a microcontroller, and equally certainly a cpu
    cache would be an unreasonable cost in die area, power, and licensing
    fees for most microcontrollers. Thus these small buffers - or very
    small, very specialised caches in the flash controller - are a good idea.



    and only found on bigger
    microcontrollers with speeds of perhaps 120 MHz or above). And yes, it >>>> is often fair to call these flash caches "prefetch buffers" or
    read-ahead buffers.

    Typical code has enough branches that simple read-ahead beyond 8
    words is unlikely to give good results. OTOH delivering things
    that were accessed in the past and still present in the cache
    gives good results even with very small caches.

    There are no processors with caches smaller than perhaps 4 KB - it is
    simply not worth it.

    Historically there were processors with small caches - 256B in
    Motorola chips, and I think smaller too. It depends on the whole
    design.

    For a general cpu data cache on a modern cpu, the cache control logic is probably going to require the same die area as a few KB of cache
    storage, as a minimum - so it makes no sense to have such small cpu
    caches. The logic for instruction caches is simpler. In days gone by, balances were different and smaller caches could be useful. The 68020
    had a 256 byte instruction cache, and the 68030 and 68040 added a 256
    byte data cache. Both were single way.

    Currently for "big" processors really small caches seem
    to make no sense. Microconrollers have their own constaints.
    Manufacturer may decide that cache giving 10% average improvement
    is not worth uncertainilty of execution time. Or may decide that
    small cache is the cheapest way to get better benchmark figures.

    You are correct that microcontrollers have different constraints, and
    that jitter and variation of timing is far more of a cost in
    microcontrollers than it is on "big" processors, where throughput is
    key. The other factor here is latency. On earlier designs such as the aforementioned M68k family, you could often add a fair bit of logic
    without requiring extra clock cycles. Thus the cache was "free". That
    is different now, even on microcontrollers. Adding a cpu cache on even
    the slowest of modern microcontrollers will mean at least a clock cycle
    extra on cache misses compared to no cache - for medium devices (say,
    120 MHz Cortex-M4) it would mean 2 or 3 extra cycles. So unless you are getting a significant hit ratio, it is not worth it.

    Putting read-ahead buffers and a "micro-cache", if that term suits you,
    at the flash controller and other memory interfaces is, however, free in
    terms of clock cycles and latency - these parts run at a higher clock
    rate than the flash itself.


    Read-ahead buffers on flash accesses are helpful,
    however, because most code is sequential most of the time. It is common
    for such buffers to be two-way, and to have between 16 and 64 bytes per
    way.

    If you read the description of the STM "flash accelerator"
    carefully, it is clear that this is a classic cache, with the line
    size matched to the flash, something like 2-way set associativity,
    conflicts and eviction. Historically there were variations; some
    caches only cache the targets of jumps and use a prefetch buffer
    for linear code. Such caches can be effective at very small sizes.


    I don't know the STM "flash accelerator" specifically - there are many
    ARM microcontrollers and I have not used them all. But while it is true
    that some of these are organised in a similar way to extremely small and restricted caches, I think using the word "cache" alone here is
    misleading. That's why I have tried to distinguish and qualify the term.

    And in the context of virtual function dispatch, a two-way single line micro-cache is pretty much guaranteed to have a cache miss when doing
    such indirect calls as you need the current code, the virtual method
    table, and the virtual method itself to be in cache simultaneously to
    avoid a miss. But these flash accelerators still make a big difference
    to the speed of code in general.
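
    In C terms, the access pattern under discussion is the classic
    function-pointer-table dispatch (a sketch with illustrative names;
    C++ vtables behave the same way at this level):

        struct dev_ops { void (*start)(void *); };   /* the "vtable" */

        struct device {
            const struct dev_ops *ops;               /* -> the table */
            void *state;
        };

        void run(struct device *d) {
            /* Three separate memory regions are touched here: this
               code, the ops table, and the target function's code -
               so a single-line micro-cache almost always misses. */
            d->ops->start(d->state);
        }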

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Sep 10 13:55:16 2024
    On 10/09/2024 12:52, Bart wrote:
    On 10/09/2024 05:40, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    I've looked at half a dozen hits for 'forth postpone' and I still don't
    understand what it does. Apparently something to do with compiled mode.

    I wouldn't know enough to confidently implement it or use it.

    Silly example 1:



    <snip long article about Forth>

    For a non-advocate of Forth you have a lot to say about it!

    I've archived your post for if/when I get back to that project. (If I
    ever use Forth, it'll be on my own terms. I just tried it to see if it's still breathing:

     c:\qapps>qq forth
     Bart-Forth
     Type bye to stop
     > 2 2 + .
     4
     > bye


    But does it handle:

    : 2 1 ;
    2 2 + .

    and get the answer 2? :-)

    So it looks like I already thought of it as my version.)

    To give more background, bare ZX81 had 1kB RAM (including video RAM).

    You must mean /excluding/ surely? Otherwise there wouldn't be much left
    from 1KB!

    There is not much left. But there is a trick: the boundary between
    video memory and the rest is movable; as you fill the other part,
    the available screen area shrinks. It does not take long to have
    the machine completely filled, with no free space left (and IIRC
    some small screen area is kept for the out of memory message).

    So that meagre 1KB had to be shared?


    Yes. 24 lines of 32 characters took 768 bytes, and the OS and BASIC
    took another 125 bytes. There was not a lot of space for user code and
    data!

    Most programs (often games) for the ZX81 required a memory expansion,
    which I believe was typically 4 KB or 16 KB.

    Low price was key for the ZX81 and its predecessor the ZX80 (and its
    successor, the ZX Spectrum).

    I've worked with microcontrollers with extremely small memories. It can
    be fun up to a point, but it's not really a good use of valuable
    programmers' time when implementing a fix or new feature means trying to
    find a few bits of spare RAM or a way to squeeze a half-dozen bytes of
    code into the eeprom space.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Tue Sep 10 04:19:46 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    [edited for brevity]

    How can we write a program which, in an implementation which has a
    __int128 type, outputs "yes" if it is an integer type, otherwise "no"?

    I'd like the program to have well-defined behavior, under the test
    assumption (that a documented, conforming extension is provided in
    the form of __int128).

    Integer types don't have to have their own matching constants.
    The types char and short don't have their own constants; they
    borrow int. Library support can be missing, in regard to
    formatting __int128 to text and back.

    The __int128 type better support all integer arithmetic, though:
    and there should be conversion rules regarding when __int128 is
    opposite to an operand of a different type. An __int128 should be
    usable as a displacement in pointer arithmetic. It should be
    possible to switch() on an __int128, even if a constant cannot be
    expressed for every possible value.

    To clarify, what you want is (I think) an integer-like type, not
    necessarily an extended integer type. As long as expressions and
    values having the type in question behave like other expressions and
    values of integer types (as the C standard uses the term), it's all
    good.

    Something that I think you want but isn't mentioned is casting: you
    want to be able to use explicit conversions (in other words, casts)
    to force values or subexpressions to take on the new type.

    As long as the regular arithmetic operations work and casting
    works, and like you say switch() works with the new type, then
    switch() statements are available, because 'case' needs only an
    integer constant expression, and is not limited to constants.

    Assuming all that is right, I recommend

    typedef __uint128_t U128;
    typedef __int128_t S128;

    which works in both gcc and clang (I don't know yet about
    Visual Studio).

    (Let me add parenthetically that I don't see why you want the
    program described. Just put those typedefs in your code, and
    if compiling the code barfs then you know they aren't supported.
    On platforms where those particular names are not supported
    but other ways of supplying the types are, using a #define or
    two might rescue the situation (and the #define's can be
    #undef'ed after the typedef statements are processed).)
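
    To make that checklist concrete, a small sketch (gcc/clang only,
    since it relies on the __int128 extension rather than standard C):

        #include <stdio.h>

        typedef __uint128_t U128;

        int main(void) {
            U128 x = (U128)1 << 100;   /* cast forces 128-bit arithmetic */
            x += 3;
            switch (x >> 100) {        /* switch() accepts the wide type */
            case 1:  puts("high part as expected"); break;
            default: puts("unexpected");            break;
            }
            /* no printf conversion exists, so print via a cast: */
            printf("low 64 bits: %llu\n", (unsigned long long)x);
            return 0;
        }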

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Tue Sep 10 14:30:29 2024
    On 10/09/2024 12:55, David Brown wrote:
    On 10/09/2024 12:52, Bart wrote:

    I've archived your post for if/when I get back to that project. (If I
    ever use Forth, it'll be on my own terms. I just tried it to see if
    it's still breathing:

      c:\qapps>qq forth
      Bart-Forth
      Type bye to stop
      > 2 2 + .
      4
      > bye


    But does it handle:

    : 2 1 ;
    2 2 + .

    and get the answer 2? :-)

    Apparently:

    c:\qapps>qq forth
    Bart-Forth
    Type bye to stop
    > : 2 1 ;

    > 2 2 + .
    2
    >



    So it looks like I already thought of it as my version.)

    To give more background, bare ZX81 had 1kB RAM (including video RAM).
    You must mean /excluding/ surely? Otherwise there wouldn't be much
    left from 1KB!

    There is not much left. But there is a trick: the boundary between
    video memory and the rest is movable; as you fill the other part,
    the available screen area shrinks. It does not take long to have
    the machine completely filled, with no free space left (and IIRC
    some small screen area is kept for the out of memory message).

    So that meagre 1KB had to be shared?


    Yes.  24 lines of 32 characters took 768 bytes, and the OS and BASIC
    took another 125 bytes.  There was not a lot of space for user code and data!

    An OS /and/ BASIC takes 125 bytes? You can't really complain about bloat
    here! Windows is probably 20GB and doesn't have Basic.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to David Brown on Tue Sep 10 13:56:42 2024
    David Brown <david.brown@hesbynett.no> wrote:
    On 10/09/2024 00:04, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 18:46, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 05:04, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 09/09/2024 01:29, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    No. It is essential for efficiency to have 32-bit types. On 32-bit
    machines doing otherwise would add useless instructions to object
    code. More precisely, a really stupid compiler will generate useless
    instructions even with my declarations, a really smart one will
    notice that variables fit in 32 bits and optimize accordingly.
    But at least some gcc versions needed such declarations. Note
    also that my version makes clear that there is
    symmetry (everything should be added using 64-bit precision);
    you depend on promotion rules, which creates visual asymmetry
    and requires reasoning to realize that the meaning is symmetric.

    Your posted code used 64-bit arithmetic. The xext and c 32-bit
    variables were used in loops where they need to be widened to 64
    bits anyway. The new value of c is set from a 32-bit result.

    Well, at C level there is a 64-bit type. The intent is that the C
    compiler should notice that the result is 32-bit + carry flag.
    Ideally the compiler should notice that c has only one bit and can
    keep it in the carry flag. On i386 the comparison needed for loop
    control would destroy the carry flag, so there must be code using
    the value of the carry in a register and code to save the carry to
    a register. But one addition of the high parts can be skipped. On
    32-bit ARM the compiler can use special machine instructions, and
    actually generated code which is close to optimal.
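
    For readers following along, the idiom being discussed is roughly
    this sketch (the names are illustrative, not the poster's actual
    code):

        #include <stdint.h>
        #include <stddef.h>

        /* Add n-digit numbers using 32-bit digits; doing each sum in
           64 bits lets the compiler recover an add-with-carry chain. */
        void add_vec(uint32_t *dst, const uint32_t *a,
                     const uint32_t *b, size_t n) {
            uint32_t c = 0;
            for (size_t i = 0; i < n; i++) {
                uint64_t s = (uint64_t)a[i] + b[i] + c;
                dst[i] = (uint32_t)s;       /* low 32 bits   */
                c = (uint32_t)(s >> 32);    /* carry: 0 or 1 */
            }
        }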

    When you have a type that you want to be at least 32 bits (to cover
    the range you need), and want it to be as efficient as possible on
    32-bit and 64-bit machines (and 16-bit and 8-bit, still found in
    microcontrollers), use "int_fast32_t". On x86-64 it will be 64-bit,
    on 32-bit systems it will be 32-bit. Use of the [u]int_fastNN_t
    types can make code significantly more efficient on 64-bit systems
    while retaining efficiency on 32-bit (or even smaller) targets.

    Well, I have constraints, some of which are outside of the C code.
    To resolve constraints I normally use some configure machinery.
    If a standard type is an exact match for my constraints, then I
    would use the standard type, possibly adding some fallbacks for
    pre-standard systems. But if my constraints differ, I see no
    advantage in using more "fancy" standard types compared to old types.


    The "fancy" standard types in this case specify /exactly/ what you
    apparently want - a type that can deal with at least 32 bits for range,
    and is as efficient as possible on different targets. What other
    constraints do you have here that make "int_fast32_t" unsuitable?

    This routine is part of a small library. A good approximation is
    that I want a type (possibly depending on target) which will make
    this library fast. As I wrote in another message I plan to use
    64-bit units of work on 64-bit targets. I have no plans to use the
    library on 8-bit or 16-bit targets, but if I needed it on such
    targets I probably would use a 16-bit work unit. So, while the
    32-bit work unit represents the current state, it is not a correct
    statement of intentions. Consequently, 'int_fast32_t' would be as
    misleading or more misleading than 'int' ('int' is a reasonable
    indication that the choice of types is not finished, 'int_fast32_t'
    suggests that it is the proper choice for all targets).

    So you really want to say "at least 16 bits, as efficiently as possible"
    - the type then is "int_fast16_t". This should be 32-bit or 64-bit on
    most architectures bigger than 16-bit - unless the target is actually
    more efficient at handling 16-bit (such as the original 68000).

    As I wrote, I have other constraints that are likely to favour 32
    bits on 32-bit machines and 64 bits on 64-bit machines. As you
    noticed, "int_fast16_t" may be 16-bit even if the native word size
    is bigger. And concerning the proper size for "int_fast16_t", it is
    enough that the CPU performs scalar operations at the same speed
    regardless of the size of the operands (including mixed operands).
    Then there is no reason to favour the bigger size. And there are
    reasons to favour the smaller size, like cache use or opportunities
    for autovectorization. So I would make "int_fast16_t" into a 16-bit
    type on such a machine.

    So, I could use "int_fastX_t" with X determined by my constraints,
    but assuming "int_fastX_t = int_fast16_t" would be a latent bug.

    If something like that is not sufficient, because you want more
    flexibility, then consider making a typedef with a name you find
    suitable, and using that. You can have pre-processor conditional
    compilation to handle known cases, and fall back to int_fast16_t
    otherwise.

    As I wrote, I want type to be determined by configuration machinery.
    And configuration machinery will generate appropriate typedefs.

    The answer to "I don't know if this is the best choice on all targets, including ones I haven't tried" is /not/ "I'll use this type that I know
    is not the most efficient on targets that I /have/ tried".

    Actually, the main use case is when I know _a lot_ about targets.
    Concerning other targets, they are "to do"; when I come to them,
    standard types _may_ give a useful baseline.

    And to put it a bit differently from what I wrote previously:
    consider 'int' in the current code as a big shouting "FIXME". I have
    different things to fix first, and I want a proper fix, so I am
    delaying this one.
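
    A minimal sketch of what such configure-generated typedefs might
    look like (the macro tests and the name work_t are illustrative
    assumptions, not the actual machinery):

        #include <stdint.h>

        /* Pick a "work unit" type per known target, with a standard
           fallback for targets not yet examined. */
        #if defined(__x86_64__) || defined(__aarch64__)
        typedef uint64_t work_t;        /* 64-bit work unit  */
        #elif defined(__i386__) || defined(__arm__)
        typedef uint32_t work_t;        /* 32-bit work unit  */
        #else
        typedef uint_fast32_t work_t;   /* baseline fallback */
        #endif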

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bart on Tue Sep 10 16:53:17 2024
    On Tue, 10 Sep 2024 14:30:29 +0100
    Bart <bc@freeuk.com> wrote:

    On 10/09/2024 12:55, David Brown wrote:
    On 10/09/2024 12:52, Bart wrote:

    I've archived your post for if/when I get back to that project.
    (If I ever use Forth, it'll be on my own terms. I just tried it to
    see if it's still breathing:

      c:\qapps>qq forth
      Bart-Forth
      Type bye to stop
      > 2 2 + .
      4
      > bye


    But does it handle:

    : 2 1 ;
    2 2 + .

    and get the answer 2? :-)

    Apparently:

    c:\qapps>qq forth
    Bart-Forth
    Type bye to stop
    > : 2 1 ;

    > 2 2 + .
    2
    >



    So it looks like I already thought of it as my version.)

    To give more background, bare ZX81 had 1kB RAM (including video
    RAM).

    You must mean /excluding/ surely? Otherwise there wouldn't be
    much left from 1KB!

    There is not much left. But there is a trick: the boundary between
    video memory and the rest is movable; as you fill the other part,
    the available screen area shrinks. It does not take long to have
    the machine completely filled, with no free space left (and IIRC
    some small screen area is kept for the out of memory message).

    So that meagre 1KB had to be shared?


    Yes.  24 lines of 32 characters took 768 bytes, and the OS and
    BASIC took another 125 bytes.  There was not a lot of space for
    user code and data!

    An OS /and/ BASIC takes 125 bytes? You can't really complain about
    bloat here! Windows is probably 20GB and doesn't have Basic.




    I think that Windows comes with VBScript, which can be considered a
    Basic dialect, installed by default. It certainly was still there in
    the latest incarnations of Win10. I didn't check if it is present in
    11.

    Checked Wikipedia. Still there. Removal planned for 2027 or later.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Tue Sep 10 16:28:54 2024
    On 10/09/2024 15:56, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 10/09/2024 00:04, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 18:46, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 05:04, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 09/09/2024 01:29, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    No. It is essential for efficiency to have 32-bit types. On 32-bit
    machines doing otherwise would add useless instructions to object
    code. More precisely, a really stupid compiler will generate useless
    instructions even with my declarations, a really smart one will
    notice that variables fit in 32 bits and optimize accordingly.
    But at least some gcc versions needed such declarations. Note
    also that my version makes clear that there is
    symmetry (everything should be added using 64-bit precision);
    you depend on promotion rules, which creates visual asymmetry
    and requires reasoning to realize that the meaning is symmetric.

    Your posted code used 64-bit arithmetic. The xext and c 32-bit variables
    were used in loops where they need to be widened to 64 bits anyway. The
    new value of c is set from a 32-bit result.

    Well, at C level there is a 64-bit type. The intent is that the C
    compiler should notice that the result is 32-bit + carry flag.
    Ideally the compiler should notice that c has only one bit and can
    keep it in the carry flag. On i386 the comparison needed for loop
    control would destroy the carry flag, so there must be code using
    the value of the carry in a register and code to save the carry to
    a register. But one addition of the high parts can be skipped. On
    32-bit ARM the compiler can use special machine instructions, and
    actually generated code which is close to optimal.

    When you have a type that you want to be at least 32 bits (to cover
    the range you need), and want it to be as efficient as possible on
    32-bit and 64-bit machines (and 16-bit and 8-bit, still found in
    microcontrollers), use "int_fast32_t". On x86-64 it will be 64-bit,
    on 32-bit systems it will be 32-bit. Use of the [u]int_fastNN_t
    types can make code significantly more efficient on 64-bit systems
    while retaining efficiency on 32-bit (or even smaller) targets.

    Well, I have constraints, some of which are outside of the C code.
    To resolve constraints I normally use some configure machinery.
    If a standard type is an exact match for my constraints, then I
    would use the standard type, possibly adding some fallbacks for
    pre-standard systems. But if my constraints differ, I see no
    advantage in using more "fancy" standard types compared to old types.


    The "fancy" standard types in this case specify /exactly/ what you
    apparently want - a type that can deal with at least 32 bits for
    range, and is as efficient as possible on different targets. What
    other
    constraints do you have here that make "int_fast32_t" unsuitable?

    This routine is part of a small library. A good approximation is
    that I want a type (possibly depending on target) which will make
    this library fast. As I wrote in another message I plan to use
    64-bit units of work on 64-bit targets. I have no plans to use the
    library on 8-bit or 16-bit targets, but if I needed it on such
    targets I probably would use a 16-bit work unit. So, while the
    32-bit work unit represents the current state, it is not a correct
    statement of intentions. Consequently, 'int_fast32_t' would be as
    misleading or more misleading than 'int' ('int' is a reasonable
    indication that the choice of types is not finished, 'int_fast32_t'
    suggests that it is the proper choice for all targets).

    So you really want to say "at least 16 bits, as efficiently as possible"
    - the type then is "int_fast16_t". This should be 32-bit or 64-bit on
    most architectures bigger than 16-bit - unless the target is actually
    more efficient at handling 16-bit (such as the original 68000).

    As I wrote, I have other constraints that are likely to favour 32
    bits on 32-bit machines and 64 bits on 64-bit machines. As you
    noticed, "int_fast16_t" may be 16-bit even if the native word size
    is bigger. And concerning the proper size for "int_fast16_t", it is
    enough that the CPU performs scalar operations at the same speed
    regardless of the size of the operands (including mixed operands).
    Then there is no reason to favour the bigger size. And there are
    reasons to favour the smaller size, like cache use or opportunities
    for autovectorization. So I would make "int_fast16_t" into a 16-bit
    type on such a machine.


    On many cpus, using sizes smaller than the full register size means
    doing sign extensions or masking operations at various times - thus full
    size register operations can often be more efficient. On such systems
    you will find that int_fast16_t is 32-bit or 64-bit, according to the
    register width. On other cpus, some common ALU operations on full-size operands can be slower than for smaller operands (such as on the 68000).
    There, int_fast16_t will be 16-bit.

    Compiler authors know what will usually be faster on the target. There
    will always be some exceptions (division is usually faster on smaller
    operands, for example). But if you don't know the target - as is the
    case of portable code - the compiler will usually make a better choice
    here than you would.
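
    A quick way to see what a given toolchain actually chose (the widths
    printed vary by target and C library; for example, x86-64 glibc
    makes int_fast16_t 64 bits wide, while many 32-bit ARM toolchains
    use 32):

        #include <limits.h>
        #include <stdint.h>
        #include <stdio.h>

        int main(void) {
            printf("int_fast16_t: %zu bits\n",
                   sizeof(int_fast16_t) * CHAR_BIT);
            printf("int_fast32_t: %zu bits\n",
                   sizeof(int_fast32_t) * CHAR_BIT);
            return 0;
        }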

    So, I could use "int_fastX_t" with X determined by my constranits,
    but assuming "int_fastX_t = int_fast16_t" would be a latent bug.

    The point is to /avoid/ assumptions, not make new ones.


    If something like that is not sufficient, because you want more
    flexibility, then consider making a typedef with a name you find
    suitable, and using that. You can have pre-processor conditional
    compilation to handle known cases, and fall back to int_fast16_t otherwise.

    As I wrote, I want type to be determined by configuration machinery.
    And configuration machinery will generate appropriate typedefs.

    The answer to "I don't know if this is the best choice on all targets,
    including ones I haven't tried" is /not/ "I'll use this type that I know
    is not the most efficient on targets that I /have/ tried".

    Actually, the main use case is when I know _a lot_ about targets.
    Concerning other targets, they are "to do"; when I come to them,
    standard types _may_ give a useful baseline.

    And to put it a bit differently from what I wrote previously:
    consider 'int' in the current code as a big shouting "FIXME". I have
    different things to fix first, and I want a proper fix, so I am
    delaying this one.


    I would tend to use "int" when I meant "int", "int_fast16_t" when I
    meant "int_fast16_t", and "// FIXME" when I meant "FIXME". But it's
    your code, and you use the style you want. I've given you advice on
    better choices of types, along with details of how they work in
    practice, and what guarantees you get from the standards. If you still
    prefer less efficient code, it's your choice.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Sep 10 16:18:31 2024
    On 10/09/2024 15:30, Bart wrote:
    On 10/09/2024 12:55, David Brown wrote:
    On 10/09/2024 12:52, Bart wrote:

    I've archived your post for if/when I get back to that project. (If I
    ever use Forth, it'll be on my own terms. I just tried it to see if
    it's still breathing:

      c:\qapps>qq forth
      Bart-Forth
      Type bye to stop
      > 2 2 + .
      4
      > bye


    But does it handle:

    : 2 1 ;
    2 2 + .

    and get the answer 2? :-)

    Apparently:

      c:\qapps>qq forth
      Bart-Forth
      Type bye to stop
      > : 2 1 ;

      > 2 2 + .
      2
      >


    There are probably Forth interpreters written in Forth that you could
    then run in your limited Forth in order to get any missing parts!



    So it looks like I already thought of it as my version.)

    To give more background, bare ZX81 had 1kB RAM (including video RAM).
    You must mean /excluding/ surely? Otherwise there wouldn't be much
    left from 1KB!

    There is not much left. But there is a trick: the boundary between
    video memory and the rest is movable; as you fill the other part,
    the available screen area shrinks. It does not take long to have
    the machine completely filled, with no free space left (and IIRC
    some small screen area is kept for the out of memory message).

    So that meagre 1KB had to be shared?


    Yes.  24 lines of 32 characters took 768 bytes, and the OS and BASIC
    took another 125 bytes.  There was not a lot of space for user code
    and data!

    An OS /and/ BASIC takes 125 bytes? You can't really complain about bloat here! Windows is probably 20GB and doesn't have Basic.


    That's RAM - the code was in an 8K ROM. And to be honest, the OS did
    not have a lot of features - the distinction between an OS and a BASIC environment was rather blurred for these kinds of home computers. (I'm
    sure you used a few of the home computers of that era, even if you were
    already a professional programmer at that time.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Tue Sep 10 15:15:06 2024
    Bart <bc@freeuk.com> writes:

    On 08/09/2024 17:44, Bart wrote:
    On 08/09/2024 16:39, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    In language like C, the LHS of an assignment is one of four categories: >>>>
    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    I can think of three others. There may be more.
    OK, so what are they?

    TM:
    Yes, very good. I count four or five, depending on what
    differences count as different.

    I guess nobody is going to say what those extra categories are, are
    they?

    Sorry, I was busy. I see KT has given a good summary (though I was not
    counting forms in parentheses).

    It's bad form to call somebody out on something but then refuse to tell
    them exactly what they've got wrong or have missed out.

    3, 4, or maybe 5 mysterious categories of LHS assignment terms that
    I have never encountered in a million lines of C code I've
    processed, but nobody is willing to say what they are?

    I sense a wind-up.

    You have implemented a C compiler. The wind-up I sensed was your giving
    out misinformation, but I'll just have to take your word for it that
    you've been arguing about assignments without knowing what constitutes an
    lvalue expression.

    But when I didn't answer soon enough, surely you could have just looked
    in any good C reference to find all the expression forms that are
    lvalues.

    I can think of at least one expression form for X that contradicts this
    claim.
    Example?

    Nothing here either.

    f().m where f returns a struct.
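
    A minimal sketch making that concrete (the struct and function
    names are illustrative):

        struct S { int m; };

        struct S f(void) {
            struct S s = { 42 };
            return s;
        }

        int main(void) {
            int x = f().m;    /* fine: f().m is a valid expression   */
            /* f().m = 7; */  /* constraint violation: not an lvalue */
            return x == 42 ? 0 : 1;
        }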

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Tue Sep 10 17:28:29 2024
    On 10/09/2024 15:24, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:


    Someone objects that you can't in general apply & to arbitrary, unnamed,
    transient, intermediate values such as 'a + 1'.

    or even just 1.

    '1' isn't as good an example; I wanted something that necessarily has to
    be in a transient location like a register. Constants can exist as
    immediate fields in instructions, or in actual memory (eg. for floats).


    I showed how you could do that using anonymous compound literals which
    avoids having to create an explicit named temporary which in standard C
    would need to be outside of that assignment call.

    But you apparently have a problem with it.

    Because you called it a reference to a+1. The reference is to an
    object.

    I don't know if you were deliberately twisting the term because you are
    now 100% committed to some false symmetry in assignments, or whether you
    are just very loose in your use of terms, but rvalue expressions (C does
    not really use the term, but it's better than non-lvalue expressions)
    can't have "references" to them. That was all that Waldek Hebisch was saying. Did you think for a second that he did not know that if you put
    an int value into an object you can take the pointer to that object?

    He took the symmetry I was claiming for assignment, and created an
    asymmetric counter-example where the dereferencing part of the LHS was
    moved into a function, necessitating the creation of a discrete
    reference for the LHS.

    I created a counter-counter-example where dereferences on both sides
    were moved into the function, so restoring the symmetry.

    And yes I'm still committed to that symmetry. I've used it for
    countless language implementations. C is little different other than
    it has a 700-page standard that suggests a recommended model of how
    it's supposed to work.

    You can't really use that to bash me about the head with and maintain
    that all my ideas about language implementation are wrong because C
    views assignment in its own idiosyncratic manner.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Tue Sep 10 15:24:09 2024
    Bart <bc@freeuk.com> writes:

    On 08/09/2024 17:14, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 07/09/2024 02:44, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 06/09/2024 11:19, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    (You can balance it out by requiring ASSIGN(&A, &B)!)
    This would not work in general, as I wrote it, the following are
    valid:
    assign(&a, 42)
    assign(&a, a + 1)
    but the second argument has no address, so your variant would not
    work.

    I believe that C's compound literals can give a reference to a+1:
    Is there no part of C you can't misrepresent?

    Is nothing I write that you will take issue with?

    Count the number of posts you've made in, say, the last month. Count
    the number of them to which I have replied. Does that answer your
    question?

    #include <stdio.h>

    void assign(int* lhs, int* rhs) {
        *lhs = *rhs;
    }

    int main(void) {
        int a = 20;

        assign(&a, &(int){a+1});
    This is simply an anonymous object. You could have used a named object
    and it wold not have been any further from being a "reference to a+1".

    I suggested an 'assign()' function could have balanced parameters by
    requiring:

    assign(&A, &B);

    Someone objects that you can't in general apply & to arbitrary, unnamed, transient, intermediate values such as 'a + 1'.

    or even just 1.

    I showed how you could do that using anonymous compound literals which
    avoids having to create an explicit named temporary which in standard C
    would need to be outside of that assignment call.

    But you apparently have a problem with it.

    Because you called it a reference to a+1. The reference is to an
    object.

    I don't know if you were deliberately twisting the term because you are
    now 100% committed to some false symmetry in assignments, or whether you
    are just very loose in your use of terms, but rvalue expressions (C does
    not really use the term, but it's better than non-lvalue expressions)
    can't have "references" to them. That was all that Waldek Hebisch was
    saying. Did you think for a second that he did not know that if you put
    an int value into an object you can take the pointer to that object?

    Or more likely you have a problem with me.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Tue Sep 10 17:58:13 2024
    On 10/09/2024 15:15, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 08/09/2024 17:44, Bart wrote:
    On 08/09/2024 16:39, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    In language like C, the LHS of an assignment is one of four categories: >>>>>
       A = Y;         // name
       *X = Y;        // pointer
       X[i] = Y;      // index
       X.m = Y;       // member select

    I can think of three others.  There may be more.
    OK, so what are they?

    TM:
    Yes, very good. I count four or five, depending on what
    differences count as different.

    I guess nobody is going to say what those extra categories are, are
    they?

    Sorry, I was busy. I see KT has given a good summary (though I was
    not counting forms in parentheses).

    It's bad form to call somebody out on something but then refuse to tell
    them exactly what they've got wrong or have missed out.

    3, 4, or maybe 5 mysterious categories of LHS assignment terms that
    I have never encountered in a million lines of C code I've
    processed, but nobody is willing to say what they are?

    I sense a wind-up.

    You have implemented a C compiler. The wind-up I sensed was your giving
    out misinformation, but I'll just have to take your word for it that
    you've been arguing about assignments without knowing what constitutes an
    lvalue expression.

    But when I didn't answer soon enough, surely you could have just looked
    in any good C reference to find all the expression forms that are
    lvalues.

    I can think of at least one expression form for X that contradicts this >>>> claim.
    Example?

    Nothing here either.

    f().m where f returns a struct.

    f().m is allowed with mcc and tcc compilers (but it doesn't do anything useful). It's not classed as an lvalue by gcc.

    By "the LHS of an assignment", and "X is a term of any complexity" I
    imply those X's forming valid LHS terms.

    An X used as X[i]=Y, *X=Y, or X.m=Y, or even any of the Y's, could be
    rejected for lots of reasons. That X isn't an array, pointer or struct
    for example.

    (BTW I've now counted the different categories of my own languages
    and there are about 15 in all that can be used as assignment
    targets. A lot are just terms that can appear as rvalues, that can
    also appear as lvalues.

    For example a 'switch' statement, which standard C doesn't even allow as
    an rvalue.

    Sorry, did your remark above suggest I don't know what an lvalue is?
    Maybe it's a miracle all this stuff works then!)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Tue Sep 10 22:10:24 2024
    On 10/09/2024 21:18, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 09/09/2024 22:16, Keith Thompson wrote:
    [...]
    I felt free to use C23 because we were already talking about the
    support, or lack thereof, of 128-bit integers in C23, and also
    _BitInt. And on a platform that supported 128-bit integer types,
    "auto" would pick a 128-bit type here, while other platforms would
    pick a 64-bit type.

    That last assumption turns out not to be correct.

    I don't know of a platform that has 128-bit integers, so I haven't
    tested it. But I agree with you that it is not quite clear what the
    standard requires if an implementation has 128 bit integers as an
    extended integer type, rather than a standard integer type.


    If an implementation doesn't have an integer type wider than 64 bits,
    then the constant 0x1'0000'0000'0000'0000 has no type, which violates
    a constraint.

    Indeed. Conforming implementations need to give an error or warning
    here. But some will also then map it to a 64-bit value (I'd usually
    expect 0, but 0xffff'ffff or 0x7fff'ffff might also be used) and

    (I was missing a few ffff's here.)

    continue compilation. That gives a program with the desired
    behaviour. (For my own uses, I would pick flags that gave an error on
    such code.)

    gcc 14.2.0 gives it type int and value 0. I think what's happening is
    that the value wraps around, and then the compiler determines its type
    based on that. It also issues a warning.

    Since the code violates a constraint, you can't rely on its behavior.


    My code was the best I could do that looked like it would give the
    required behaviour - at least on some platforms. It is plausible that
    gcc documents that behaviour somewhere (though I have not seen such documentation myself), which would make it non-portable but reliable for
    that compiler - even in conforming modes. I don't see any way to get
    closer to the requested behaviour in a way that would also work on a hypothetical 128-bit integer target.

    clang 18.1.0 rejects it.

    If an implementation does have integer types wider than 64 bits,
    there's no guarantee that it uses the name "__int128". A future gcc
    might have a 128-bit (extended) integer type and still have __int128
    with its current semantics.

    If an implementation had a standard integer type of 128 bits, then I
    expect "long long int" would be the name (unless it also had even
    bigger types!), but for extended integer types it could be many
    different things. It is very likely, however, that it would also be
    in <stdint.h> as int128_t.

    I would probably expect most compilers to keep long long at 64 bits, to
    cater to existing code that makes that assumption. That's what gcc did
    when it added __int128 (which is *almost* an extended integer type, but
    they don't call it one).

    It seems likely, but it is not the only possibility. For platforms that
    have 64-bit "long", a 128-bit "long long" is not unreasonable.


    Do you know of any implementations that do have 128-bit integer types?
    In particular, any that have a compiler on godbolt.org?

    None that I know of.


    I believe there is a specification for a 128-bit RISC-V version, but no implementation in practice, and no version of gcc for it (at least, not
    in mainline gcc).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Wed Sep 11 01:02:24 2024
    Bart <bc@freeuk.com> writes:

    On 10/09/2024 15:15, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 08/09/2024 17:44, Bart wrote:
    On 08/09/2024 16:39, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    In a language like C, the LHS of an assignment is one of four categories:
    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    I can think of three others. There may be more.
    OK, so what are they?

    TM:
    Yes, very good. I count four or five, depending on what
    differences count as different.

    I guess nobody is going to say what those extra categories are, are
    they?
    Sorry, I was busy. I see KT has given a good summary (though I was not
    counting forms in parentheses).

    It's bad form to call somebody out on something but then refuse to tell
    them exactly what they've got wrong or have missed out.

    3, 4, or maybe 5 mysterious categories of LHS assignment terms that I have never encountered in a million lines of C code I've processed, but
    nobody is willing to say what they are?

    I sense a wind-up.
    You have implemented a C compiler. The wind-up I sensed was your giving
    out misinformation, but I'll just have to take your word for it that
    you've been arguing about assignments without knowing what constitutes an
    lvalue expression.
    But when I didn't answer soon enough, surely you could have just looked
    in any good C reference to find all the expression forms that are
    lvalues.

    I can think of at least one expression form for X that contradicts this >>>>> claim.
    Example?

    Nothing here either.
    f().m where f returns a struct.

    f().m is allowed with mcc and tcc compilers (but it doesn't do anything useful). It's not classed as an lvalue by gcc.

    gcc is correct. It isn't an lvalue.
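    A minimal example of the case in question; the last assignment is
    the constraint violation gcc diagnoses:

        struct S { int m; };

        struct S f(void) { return (struct S){ .m = 1 }; }

        void g(void)
        {
            int x = f().m;   /* fine: f().m used as a value */
            f().m = 2;       /* error: f().m is not an lvalue */
        }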

    By "the LHS of an assignment", and "X is a term of any complexity" I imply those X's forming valid LHS terms.

    An X used as X[i]=Y, *X=Y, or X.m=Y, or even any of the Y's, could be rejected for lots of reasons. That X isn't an array, pointer or struct for example.

    Yes, and you'll notice I did not point that out. There are cases when
    X.m is not an lvalue even when X is an expression of the right type to
    have a member m.

    (BTW I've now counted the different categories in my own languages and
    there are about 15 in all that can be used as assignment targets. A lot are just terms that can appear as rvalues that can also appear as lvalues.

    For example a 'switch' statement, which standard C doesn't even allow as an rvalue.

    Sorry, did your remark above suggest I don't know what an lvalue is?

    That seemed like the obvious explanation for the incorrect information
    you gave. Did you post it /knowing/ what other kinds of things are
    lvalues in C just to confuse people?

    Maybe
    it's a miracle all this stuff works then!)

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Wed Sep 11 01:22:11 2024
    Bart <bc@freeuk.com> writes:

    And yes I'm still committed to that symmetry. I've used it for countless language implementations. C is little different other than it has a
    700-page standard that suggests a recommended model of how it's supposed to work.

    You can't really use that to bash me about the head with and maintain that all my ideas about language implementation are wrong because C views assignment in its own idiosyncratic manner.

    I don't want to bash you about the head, but what C says about
    assignment has /always/ been the point, and your implementation of C
    will be wrong if you don't follow the rules about C's assignments. You
    /know/ the LH and RH side of a C assignment have different constraints
    (you have said so yourself) yet you persist in defending your original
    claim that what is needed on the two sides "is exactly the same". You
    must, surely, be arguing simply for the fun of it.

    Tim suggests that there is communication failure here -- that you have
    not expressed what you mean clearly enough. That may be so, but I can't
    see how to interpret what you've written in any other way.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Wed Sep 11 10:52:39 2024
    On 11/09/2024 01:02, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    Sorry, did your remark above suggest I don't know what an lvalue is?

    That seemed like the obvious explanation for the incorrect information
    you gave. Did you post it /knowing/ what other kinds of things are
    lvalues in C just to confuse people?

    Which incorrect explanation was that?

    I merely said that LHSs of assigments fall into these categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    Clearly I mean VALID LHSs, otherwise they wouldn't be LHSs of assignments!

    I've since learnt about a couple of other possible categories; one is
    with compound literals like '(int){42} = 0'. (I don't count (A), ((A))
    etc as a separate category; come on!)

    The other is 'X.m' but when .m is a bitfield; although this has the
    same syntax as above, internally it's somewhat different. (My C compiler
    treats bitfields as ordinary members.)
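    A small sketch collecting these forms in one place (C99 or later for
    the compound literal; the bitfield line is syntactically just member
    select):

        struct S { int m; unsigned bf : 3; };

        void demo(struct S s, struct S *p, int b[4], int i, int y)
        {
            int a;
            a = y;          /* name */
            *p = s;         /* pointer dereference */
            b[i] = y;       /* index */
            s.m = y;        /* member select */
            s.bf = y;       /* member select, bitfield member */
            (int){0} = y;   /* compound literal: a modifiable lvalue */
        }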

    I acknowledge that allowing 'F().m = Y' is wrong; I might get around to
    fixing it one day.

    (In my language that would fail when F returns a value struct. It would
    pass if F returns a pointer to a struct, since I can still use 'F().m :=
    Y', as derefs are automatic.

    F().m := Y is valid also in my dynamic language, since the returned
    object can be shared so the effect of the assignment can be observable.
    But that's really due to underlying pointers too.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Wed Sep 11 10:34:27 2024
    On 11/09/2024 01:22, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    And yes I'm still committed to that symmetry. I've used it for countless
    language implementations. C is little different other than it has a
    700-page standard that suggests a recommended model of how it's supposed to work.

    You can't really use that to bash me about the head with and maintain that all my ideas about language implementation are wrong because C views
    assignment in its own idiosyncratic manner.

    I don't want to bash you about the head, but what C says about
    assignment has /always/ been the point, and your implementation of C
    will be wrong if you don't follow the rules about C's assignments. You /know/ the LH and RH side of a C assignment have different constraints
    (you have said so yourself) yet you persist in defending your original
    claim that what is needed on the two sides "is exactly the same".

    I've listed the aspects that I said are the same.

    That is, if something is a legal LHS term, then its syntax, and its
    type, are identical to that term appearing on the RHS.

    (And by its type, I mean its base type. So given 'int a,b; a=b;', I'm
    talking about 'int' not 'int*'.)

    There can additionally be similarities within internal representations.


    Tim suggests that there is communication failure here -- that you have
    not expressed what you mean clearly enough. That may be so, but I can't
    see how to interpret what you've written in any other way.

    Or people simply can't grasp what I'm saying. I've given a million
    examples of identical LHSs and RHSs, and they insist on saying they're asymmetric (while also insisting that it's the A=.A of BLISS that has
    true symmetry!).

    I acknowledge that LHSs are written and RHSs are read (and also that,
    while A=A has true reflective symmetry, B=B doesn't, if that's what
    bothers some).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Bart on Wed Sep 11 15:15:09 2024
    On 2024-09-11, Bart <bc@freeuk.com> wrote:
    On 11/09/2024 01:22, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    And yes I'm still committed to that symmetry. I've used it for countless language implementations. C is little different other than it has a
    700-page standard that suggests a recommended model of how it's supposed to work.

    You can't really use that to bash me about the head with and maintain that all my ideas about language implementation are wrong because C views
    assignment in its own idiosyncratic manner.

    I don't want to bash you about the head, but what C says about
    assignment has /always/ been the point, and your implementation of C
    will be wrong if you don't follow the rules about C's assignments. You
    /know/ the LH and RH side of a C assignment have different constraints
    (you have said so yourself) yet you persist in defending your original
    claim that what is needed on the two sides "is exactly the same".

    I've listed the aspects that I said are the same.

    That is, if something is a legal LHS term, then its syntax, and its
    type, are identical to that term appearing on the RHS.

    But the converse isn't true. "a = b" is a valid RHS term, yet isn't
    always valid on the LHS without parentheses on it; and this isn't
    something that is due to a glitch in the ISO C grammar that is removable
    by a compiler implementor, since a = b = c means a = (b = c).

    Assignment is syntactically symmetric at the AST level only, where
    parentheses don't matter any more.

    It's possible to generate an AST that has an '=' node with any kind
    of expressions as left and right children, and then check it for
    valid left sides as a matter of syntax.
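    A minimal sketch of that approach (hypothetical node names, not any
    particular compiler's representation):

        #include <stdbool.h>

        enum kind { K_NAME, K_DEREF, K_INDEX, K_MEMBER, K_ADD, K_ASSIGN };

        struct node {
            enum kind    kind;
            struct node *kid[2];   /* operands, where applicable */
        };

        /* Accept any '=' node at parse time; check the left child
           afterwards, as a purely syntactic property. */
        static bool valid_lhs(const struct node *n)
        {
            switch (n->kind) {
            case K_NAME: case K_DEREF: case K_INDEX: case K_MEMBER:
                return true;
            default:
                return false;      /* e.g. K_ADD, K_ASSIGN */
            }
        }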

    If the surface language before the AST is some Lisp-like, then
    the symmetry extends to that syntax also, which can just be (set <expr> <expr>); it parses regardless of what the expressions are.

    Unless you take symmetry too literally: obviously (set <expr-1>
    <expr-2>) is not symmetric with regard to (<expr-2> <expr-1> set). :)
    Not to mention (<2-rpxe> <1-rpxe> tes). :)

    The "assignment is symmetric at the syntax level" is a kind of Lisp
    mindset, which has banished precedence and associativity.

    In Common Lisp (and some other dialects), the application program
    can extend the repertoire of what kind of expression is a "place";
    i.e. can appear as the left argument of an assignment operator,
    and be used in other related operators.

    In Common Lisp, I can make (setf (+ a 1) 42) do something. For
    instance, I can have it set a to 41, so that (+ a 1) yields 42.

    Note that if I chose this specific semantics, it will still not allow
    (setf (+ 2 2) 42). The + expression will have to contain at least one
    term that is itself a place. (Perhaps we can choose the leftmost
    place as the one to operate on so that (setf (+ a b) ...) will
    operate on a).

    I belong to the Lisp mindset, so I tend to agree with the basic gist
    of your idea. You've chosen to work in languages where your intuitions
    do not work out to being literally true, though. :)

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Kaz Kylheku on Wed Sep 11 16:51:36 2024
    On 11/09/2024 16:15, Kaz Kylheku wrote:
    On 2024-09-11, Bart <bc@freeuk.com> wrote:
    On 11/09/2024 01:22, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    And yes I'm still committed to that symmetry. I've used it for countless language implementations. C is little different other than it has a
    700-page standard that suggests a recommended model of how it's supposed to
    work.

    You can't really use that to bash me about the head with and maintain that all my ideas about language implementation are wrong because C views
    assignment in its own idiosyncratic manner.

    I don't want to bash you about the head, but what C says about
    assignment has /always/ been the point, and your implementation of C
    will be wrong if you don't follow the rules about C's assignments. You
    /know/ the LH and RH side of a C assignment have different constraints
    (you have said so yourself) yet you persist in defending your original
    claim that what is needed on the two sides "is exactly the same".

    I've listed the aspects that I said are the same.

    That is, if something is a legal LHS term, then its syntax, and its
    type, are identical to that term appearing on the RHS.

    But the converse isn't true.

    I'm not claiming that. Only that legal LHS terms are the same on the
    RHS. Most RHS expressions wouldn't be valid on the left.

    "a = b" is a valid RHS term, yet isn't
    always valid on the LHS without parentheses on it;

    But, it isn't valid on the LHS even with parentheses! '(a = b) = c;' is
    an error.

    and this isn't
    something that is due to a glitch in the ISO C grammar that is removable
    by a compiler implementor, since a = b = c means a = (b = c).

    Assignment is syntactically symmetric at the AST level only, where
    parentheses don't matter any more.

    It's possible to generate an AST that has an '=' node with any kind
    of expressions as left and right children, and then check it for
    valid left sides as a matter of syntax.

    If the surface language before the AST is some Lisp-like, then
    the symmetry extends to that syntax also, which can just be (set <expr> <expr>); it parses regardless of what the expressions are.

    (That's what I do anyway. Apparently I don't do extensive lvalue testing
    during type analysis; some of this stuff is picked up during code
    generation. My type analysers are long overdue for an overhaul.)


    Unless you take symmetry too literally: obviously (set <expr-1>
    <expr-2>) is not symmetric with regard to (<expr-2> <expr-1> set). :)
    Not to mention (<2-rpxe> <1-rpxe> tes). :)

    Suppose there were two symbols for assignment, ':=' and '=:' for RTL and
    LTR data movement respectively. Then if you had this:

    A := B[i]

    (assign element of B to A), you could reverse the assignment - assuming
    both sides could be valid lvalues - by switching the symbol:

    A =: B[i] # Store A in B[i]

    I call that symmetric. You'd have trouble doing that when one side of an assignment needs a special symbol to access the value of X, and the other
    doesn't. So instead of 'A := B' and 'A =: B', you'd also need to swap
    that symbol:

    A := .B
    .A =: B

    The "assignment is symmetric at the syntax level" is a kind of Lisp
    mindset, which has banished precedence and associativity.

    In Common Lisp (and some other dialects), the application program
    can extend the repertoire of what kind of expression is a "place";
    i.e. can appear as the left argument of an assignment operator,
    and be used in other related operators.

    In Common Lisp, I can make (setf (+ a 1) 42) do something. For
    instance, I can have it set a to 41, so that (+ a 1) yields 42.


    Note that if I chose this specific semantics, it will still not allow
    (setf (+ 2 2) 42). The + expression will have to contain at least one
    term that is itself a place.

    Your example is not viable. Apart from expressions which do not involve
    any locations, there could be a dozen such locations.

    (Perhaps we can choose the leftmost
    place as the one to operate on so that (setf (+ a b) ...) will
    operate on a).

    What about when the leftmost is itself a nested term, that may be a
    function call?

    I belong to the Lisp mindset, so I tend to agree with the basic gist
    of your idea. You've chosen to work in languages where your intuitions
    do not work out to being literally true, though. :)

    Actually they do. I've given an example where an assignment in
    intermediate code is literally identical to the HLL code (A := B in HLL
    ends up as A := B in three-address-code).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Waldek Hebisch on Wed Sep 11 17:12:39 2024
    On 9/9/24 18:04, Waldek Hebisch wrote:
    ...
    ATM I do not have standard text handy to check, but AFAIR several
    _types_ in <stdint.h> were not required but optional.

    Correct - those are [u]intptr_t, the exact sized types, and the types
    (exact, fast, or least) with sizes other than 8, 16, 32, or 64. As a
    good general rule, those optional types will be missing only if the implementation doesn't support any integers of the specified size. What
    would you want your code to do on an implementation that doesn't support
    the size that you would otherwise specify?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Thu Sep 12 00:32:31 2024
    Bart <bc@freeuk.com> writes:

    On 11/09/2024 01:22, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    And yes I'm still committed to that symmetry. I've used it for countless language implementations. C is little different other than it has a
    700-page standard that suggests a recommended model of how it's supposed to work.

    You can't really use that to bash me about the head with and maintain that all my ideas about language implementation are wrong because C views
    assignment in its own idiosyncratic manner.
    I don't want to bash you about the head, but what C says about
    assignment has /always/ been the point, and your implementation of C
    will be wrong if you don't follow the rules about C's assignments. You
    /know/ the LH and RH side of a C assignment have different constraints
    (you have said so yourself) yet you persist in defending your original
    claim that what is needed on the two sides "is exactly the same".

    I've listed the aspects that I said are the same.

    And you've stated that there are differences but of course you haven't
    listed them (as far as I can tell).

    That is, if something is a legal LHS term, then its syntax, and its type,
    are identical to that term appearing on the RHS.

    And you have /not/ stated, though you know it perfectly well, that the
    reverse does /not/ apply -- that many "legal" RHS expressions can't
    appear, legally, on the LHS.

    (And by its type, I mean its base type. So given 'int a,b; a=b;', I'm
    talking about 'int' not 'int*'.)

    There can additionally be similarities within internal representations.

    Tim suggests that there is communication failure here -- that you have
    not expressed what you mean clearly enough. That may be so, but I can't
    see how to interpret what you've written in any other way.

    Or people simply can't grasp what I'm saying. I've given a million
    examples of identical LHSs and RHSs,

    But your mistake is not that there are not millions of identical
    (looking) LH sides and RH sides. No one disputes that. But in reply to
    my statement that what is /required/ on both sides is not the same, you
    said "I would argue that it is exactly the same". Just one case where something different is required is enough to show that you should not
    "argue that it [what is required] is exactly the same".

    If all you are saying is that, in C, there are millions of examples
    where identical looking valid expression can appear on the left and
    right hand side of an assignment, no one would have objected (except to
    point out that that is not an interesting observation). But you
    seem to be saying more than that while offering (a) only evidence for
    that trivial truth and (b) accepting that the two sides have different constraints.

    Are you just saying that there are millions of examples of identical LH
    and RH sides? Because if so, this can end here -- I emphatically agree
    that there are!

    and they insist on saying they're
    asymmmetric (while also insisting that it's the A=.A of BLISS that has
    true symmetry!).

    Assignment is always asymmetric in some sense because of what the two
    sides must mean. But BLISS permits both A = A and .A = .A because the
    same rules apply to what is permitted on both sides of an assignment.
    What's more, both sides are evaluated in exactly the same way to yield a
    bit pattern. In BLISS your claim that what is permitted on both sides
    is "exactly the same" is true. It's not true in C.

    I acknowledge that LHSs are written and RHSs are read (and also that, while A=A has true reflective symmetry, B=B doesn't, if that's what bothers
    some).




    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to David Brown on Wed Sep 11 23:59:59 2024
    David Brown <david.brown@hesbynett.no> wrote:

    On many cpus, using sizes smaller than the full register size means
    doing sign extensions or masking operations at various times - thus full
    size register operations can often be more efficient. On such systems
    you will find that int_fast16_t is 32-bit or 64-bit, according to the register width. On other cpus, some common ALU operations on full-size operands can be slower than for smaller operands (such as on the 68000).
    There, int_fast16_t will be 16-bit.

    Compiler authors know what will usually be faster on the target. There
    will always be some exceptions (division is usually faster on smaller operands, for example). But if you don't know the target - as is the
    case of portable code - the compiler will usually make a better choice
    here than you would.

    BTW, I just played with Clang 18 on 64-bit FreeBSD. It has 32-bit int_fast16_t. gcc in Linux makes it 64-bit. Who is right?
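    Easy to check on any given implementation with a small probe (the
    numbers above come out as 8 bytes for gcc/glibc on x86-64 Linux and
    4 bytes for clang on FreeBSD):

        #include <stdio.h>
        #include <stdint.h>

        int main(void)
        {
            printf("int_fast16_t: %zu bytes\n", sizeof(int_fast16_t));
            printf("int_fast32_t: %zu bytes\n", sizeof(int_fast32_t));
            return 0;
        }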

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Thu Sep 12 00:47:58 2024
    Bart <bc@freeuk.com> writes:

    On 11/09/2024 01:02, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    Sorry, did your remark above suggest I don't know what an lvalue is?
    That seemed like the obvious explanation for the incorrect information
    you gave. Did you post it /knowing/ what other kinds of things are
    lvalues in C just to confuse people?

    Which incorrect explanation was that?

    I merely said that LHSs of assigments fall into these categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    Yes, that incorrect explanation.

    Clearly I mean VALID LHSs, otherwise they wouldn't be LHSs of assignments!

    I've since learnt about a couple of other possible categories; one is with compound literals like '(int){42} = 0'.

    Along with (a) _Generic expressions (where the selected arm is an
    lvalue) and (b) expressions of the form X->m.
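    For completeness, contrived sketches of those two forms (valid C11,
    the _Generic case especially being something you would rarely write):

        struct S { int m; } s, *p = &s;
        int a;

        void demo(void)
        {
            p->m = 1;                  /* defined as equivalent to (*p).m = 1 */
            _Generic(a, int: a) = 2;   /* an lvalue when the selected arm is one */
        }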

    (I don't count (A), ((A)) etc as a
    separate category; come on!)

    Don't give me "come on!". I was counting forms in the same way that you
    were when I said I could think of three more. I was not counting
    parentheses.

    The other is 'X.m' but when .m is a bitfield;

    What makes X.m = Y, where m is a bitfield, an extra category? It fits
    the X.m = Y pattern perfectly well.

    although this has the
    same syntax as above, internally it's somewhat different.

    Your categories were syntactic. You were describing forms.

    (My C compiler
    treats bitfields as ordinary members.)

    I acknowledge that allowing 'F().m = Y' is wrong;

    Thank you.

    I might get around to
    fixing it one day.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Thu Sep 12 01:40:02 2024
    On 12/09/2024 00:32, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 11/09/2024 01:22, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    And yes I'm still committed to that symmetry. I've used it for countless language implementations. C is little different other than it has a
    700-page standard that suggests a recommended model of how it's supposed to
    work.

    You can't really use that to bash me about the head with and maintain that all my ideas about language implementation are wrong because C views
    assignment in its own idiosyncratic manner.
    I don't want to bash you about the head, but what C says about
    assignment has /always/ been the point, and your implementation of C
    will be wrong if you don't follow the rules about C's assignments. You
    /know/ the LH and RH side of a C assignment have different constraints
    (you have said so yourself) yet you persist in defending your original
    claim that what is needed on the two sides "is exactly the same".

    I've listed the aspects that I said are the same.

    And you've stated that there are differences but of course you haven't
    listed them (as far as I can tell).

    That is, if something is a legal LHS term, then its syntax, and its type,
    are identical to that term appearing on the RHS.

    And you have /not/ stated, though you know it perfectly well, that the reverse does /not/ apply -- that many "legal" RHS expressions can't
    appear, legally, on the LHS.

    Clearly not all RHSs can appear on the left; this is OK:

    A = A + A + A + A;

    but not:

    A + A + A + A = A;

    As for differences, there is my AST for A = B:

    i32-- 1 assign:
    i32-- - 1 name: a
    i32-- - 2 name: b

    Same node type, same type. What are the differences?

    Or people simply can't grasp what I'm saying. I've given a million
    examples of identical LHSs and RHSs,

    But your mistake is not that there are not millions of identical
    (looking) LH sides and RH sides. No one disputes that. But in reply to
    my statement that what is /required/ on both sides is not the same, you
    said "I would argue that it is exactly the same".

    I think you're still not getting it. In C you write:

    A = B

    to assign simple value types. Or you write B = A the other way around.

    You write A as A, B as B no matter which side it is.

    In BLISS which you claim to be more symmetric, you have to write:

    A = .B

    for the same operation (copy B's value into A), or B = .A for the other
    way around.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Thu Sep 12 02:11:21 2024
    Bart <bc@freeuk.com> wrote:
    On 10/09/2024 15:24, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:
    I don't know if you were deliberately twisting the term because you are
    now 100% committed to some false symmetry in assignments, or whether you
    are just very loose in your use of terms, but rvalue expressions (C does
    not really use the term, but it's better than non-lvalue expressions)
    can't have "references" to them. That was all that Waldek Hebisch was
    saying. Did you think for a second that he did not know that if you put
    an int value into an object you can take the pointer to that object?

    He took the symmetry I was claiming for assignment, and created an
    asymmetric counter-example where the dereferencing part of the LHS was
    moved into a function, necessitating the creation of a discrete
    reference for the LHS.

    I created a counter-counter-example where dereferences on both sides
    were moved into the function, so restoring the symmetry.

    Really not. My point was about the conceptual model, and the important
    part is to keep such a model simple. You used a complex language
    construct, which defeats the purpose of a conceptual model.

    And yes I'm still committed to that symmetry. I've used it for
    countless language implementations. C is little different other than it
    has a 700-page standard that suggests a recommended model of how it's supposed to work.

    You can't really use that to bash me about the head with and maintain
    that all my ideas about language implementation are wrong because C
    views assignment in its own idiosyncratic manner.

    Well, gcc implements assignment in a similar way to you (or at least
    did in the past; I have not checked whether recent versions do the
    same). To be more precise, the gcc parser, when seeing a variable,
    creates a read reference to that variable. When the parser realizes
    that the already recognized part of the expression is the left hand
    side of an assignment, it converts it to a write access. So your
    approach is no worse than gcc's. But it creates trouble: the process
    of changing a subexpression with read references into a write access
    is more complicated than replacing a read instruction by a write
    instruction. First, it needs to recognize things which are invalid.
    Second, from the very beginning gcc transforms its intermediate
    representation, but some transforms valid on the right hand side are
    invalid on the left hand side, so gcc needs to postpone them and do
    them later. Third, the intermediate form contains extra information
    that needs to be handled later.

    So your (and gcc's) approach is: "let us pretend that the assignment
    operator is symmetric and fix the asymmetry (that is, the left hand
    side) later". That works, and I can imagine good reasons to proceed
    that way. Alternatively, instead of "fixing", one can first generate
    the intermediate form and then pass the right hand side to a "right
    hand side code generator" and the left hand side to a "left hand side
    code generator". One way or another, in this approach the left hand
    side of an assignment must be handled differently than the right hand
    side. But saying that the assignment operator is really symmetric is
    wrong. The different treatment of the two sides shows this. And once
    you accept that the assignment operator is asymmetric, then in BLISS
    you can handle both sides in exactly the same way. In C, there is
    implicit lvalue conversion, and at the top level of the arguments to
    an assignment there is a slight asymmetry: you do not apply lvalue
    conversion to the left hand argument, but you do apply it to the
    right argument. This is a simpler model than yours when you want a
    precise description.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Thu Sep 12 12:00:18 2024
    On 12/09/2024 00:47, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 11/09/2024 01:02, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    Sorry, did your remark above suggest I don't know what an lvalue is?
    That seemed like the obvious explanation for the incorrect information
    you gave. Did you post it /knowing/ what other kinds of things are
    lvalues in C just to confuse people?

    Which incorrect explanation was that?

    I merely said that LHSs of assigments fall into these categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select

    Yes, that incorrect explanation.

    I dispute that. What I said is very broadly correct. But in this
    newsgroup you do like to nitpick.

    So to you, it is of the greatest importance that somebody doesn't just
    know about those four categories that they will be reading and writing
    all the time in C code, but also know about:

    (int){A} = Y;

    which they will encounter approximately never. And it is also vital that
    they consider:

    (A) = (Y);

    a distinct category. There might be a case for this for '(Z, A) = Y;'
    but that isn't allowed anyway. So it only applies to superfluous
    parentheses.





    Clearly I mean VALID LHSs, otherwise they wouldn't be LHSs of assignments!
    I've since learnt about a couple of other possible categories; one is with compound literals like '(int){42} = 0'.

    Along with (a) _Generic expressions (where the selected arm is an
    lvalue)


    The _Generic forms reduce down to one of those four. It is more like a
    macro, and if you're going to start with macros, there are unlimited
    categories that can be created. If this is merely about syntax, then why
    not?

    (I'd also like to see an actual use case for _Generic on the LHS of an assignment. Perhaps one where there is a matching (symmetric?) _Generic
    on the RHS?)

    and (b) expressions of the form X->m.

    Are there any circumstances where X->m does something different from (*X).m?


    (I don't count (A), ((A)) etc as a
    separate category; come on!)

    Don't give me "come on!". I was counting forms in the same way that you
    were when I said I could think of three more. I was not counting parentheses.

    Keith mentioned this form.


    The other is 'X.m' but when .m is a bitfield;

    What makes X.m = Y, where m is a bitfield, an extra category? It fits
    the X.m = Y pattern perfectly well.

    although this has the same
    same syntax as above, internally it's somewhat different.

    Your categories were syntactic. You were describing forms.

    Not entirely. There is behaviour associated with them.

    Most LHS terms can have & applied in an rvalue context for example;
    bitfield accesses can't. So it's something a user of the language needs
    to know about.
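    For example (the second pointer declaration violates a constraint):

        struct S { unsigned bf : 3; int m; } s;

        int      *pm = &s.m;    /* fine: ordinary member */
        unsigned *pb = &s.bf;   /* error: & cannot be applied to a bit-field */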

    And internally, my ASTs (where bitfields are supported) use a different
    node type when X.m is a bitfield rather than a regular access.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Thu Sep 12 13:45:08 2024
    On 12/09/2024 01:59, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:

    On many cpus, using sizes smaller than the full register size means
    doing sign extensions or masking operations at various times - thus full
    size register operations can often be more efficient. On such systems
    you will find that int_fast16_t is 32-bit or 64-bit, according to the
    register width. On other cpus, some common ALU operations on full-size
    operands can be slower than for smaller operands (such as on the 68000).
    There, int_fast16_t will be 16-bit.

    Compiler authors know what will usually be faster on the target. There
    will always be some exceptions (division is usually faster on smaller
    operands, for example). But if you don't know the target - as is the
    case of portable code - the compiler will usually make a better choice
    here than you would.

    BTW, I just played with Clang 18 on 64-bit FreeBSD. It has 32-bit int_fast16_t. gcc in Linux makes it 64-bit. Who is right?


    Technically, both are right - implementations can use any integer type
    of at least 16 bits here, whatever they think is fastest in general.
    But it surprises me somewhat, given that clang for x86-64 on Linux uses
    64-bit for int_fast16_t.

    But to be clear, the size of the "fast" types depends on the target and
    the implementation. They are not normally used for external ABI's, and
    are purely internal to the generated code. Obviously you must pick a
    "fast" size that is at least as big as the range you need.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Waldek Hebisch on Thu Sep 12 12:27:05 2024
    On 12/09/2024 03:11, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 10/09/2024 15:24, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:
    I don't know if you were deliberately twisting the term because you are
    now 100% committed to some false symmetry in assignments, or whether you are just very loose in your use of terms, but rvalue expressions (C does not really use the term, but it's better than non-lvalue expressions)
    can't have "references" to them. That was all that Waldek Hebisch was
    saying. Did you think for a second that he did not know that if you put an int value into an object you can take the pointer to that object?

    He took the symmetry I was claiming for assignment, and created an
    asymmetric counter-example where the dereferencing part of the LHS was
    moved into a function, necessitating the creation of a discrete
    reference for the LHS.

    I created a counter-counter-example where dereferences on both sides
    were moved into the function, so restoring the symmetry.

    Really not. My point was about conceptual model and important
    part is to keep such model simple. You used complex language
    construct which defeats the purose of conceptual model.

    I've seen countless proposals in the Reddit PL forum with people trying
    to turn everything into a function (but it is very FP oriented anyway).

    To my mind that makes a language much more complicated. You can't really implement 'if' and 'while' statements via functions without closures for example.

    So I'd prefer to keep assignment as low-tech as possible.


    And yes I'm still committed to that symmetry. I'ved used it for
    countless language implementations. C is little different other than it
    has a 700-page standard that suggests a recommended model of how it's
    supposed to work.

    You can't really use that to bash me about the head with and maintain
    that all my ideas about language implementation are wrong because C
    views assignment in its own idiosyncratic manner.

    Well, gcc implements assignment in a similar way to you (or at least
    did in the past; I have not checked whether recent versions do the
    same). To be more precise, the gcc parser, when seeing a variable,
    creates a read reference to that variable. When the parser realizes
    that the already recognized part of the expression is the left hand
    side of an assignment, it converts it to a write access. So your
    approach is no worse than gcc's. But it creates trouble: the process
    of changing a subexpression with read references into a write access
    is more complicated than replacing a read instruction by a write
    instruction. First, it needs to recognize things which are invalid.

    It's not that complicated, not with C anyway.

    Because in C, if you take the 3-4 categories of LHS in assignments
    (ignore the esoteric ones, and [] and . are really the same), there is
    only one top-level lvalue node to consider.

    That's the only thing that needs to 'change', which I don't think is
    onerous anyway.

    With more elaborate LHSs, for example like this:

    (A[i], B[i], (x ? C.m : D.m)) = Y();

    There can be both multiple and nested lvalue nodes. So 'lvalueness' has
    to somehow propagate down into those branches after parsing has been done.

    And yet, I was doing that in the 1980s on my toy compilers. So that's
    not that hard either.

    (Now, try implementing this assignment with a function!)

    Second, from the very beginning gcc transforms its intermediate
    representation, but some transforms valid on the right hand side are
    invalid on the left hand side, so gcc needs to postpone them and do
    them later. Third, the intermediate form contains extra information
    that needs to be handled later.

    So your (and gcc's) approach is: "let us pretend that the assignment
    operator is symmetric and fix the asymmetry (that is, the left hand
    side) later". That works, and I can imagine good reasons to proceed
    that way. Alternatively, instead of "fixing", one can first generate
    the intermediate form and then pass the right hand side to a "right
    hand side code generator" and the left hand side to a "left hand side
    code generator". One way or another, in this approach the left hand
    side of an assignment must be handled differently than the right hand
    side. But saying that the assignment operator is really symmetric is
    wrong. The different treatment of the two sides shows this. And once
    you accept that the assignment operator is asymmetric, then in BLISS
    you can handle both sides in exactly the same way. In C, there is
    implicit lvalue conversion

    What exactly /is/ lvalue conversion? What is converted to what?

    I do an lvalue /check/ (so variable 'A' yes, const 6, no). But nothing
    is actually converted.

    What /sometimes/ happens in elaborate cases is that & is applied to
    multiple terms, balanced by a single * elsewhere. But that is not a
    conversion, and is usually not needed for C which only has a single
    lvalue node on the LHS.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Thu Sep 12 20:54:54 2024
    On 12/09/2024 20:38, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    It's not that complicated, not with C anyway.

    Because in C, if you take the 3-4 categories of LHS in assignments
    (ignore the esoteric ones, and [] and . are really the same), there is
    only one top-level lvalue node to consider.

    I agree, there's only one thing to consider. The LHS of an
    assignment is a modifiable lvalue.

    We've spent a lot of time arguing about how many "categories" there
    are for the LHS of an assignment. If I recall correctly, the whole
    thing started when you stated that "the LHS of an assignment is
    one of four categories", leading to a debate about whether four
    is the correct number.

    Enumerating the kinds of expressions that can be modifiable
    lvalues is interesting, I suppose, but are A and (A) in different "categories"? Is it important to count generic selections and
    compound literals?

    Who cares, and why?

    If you go back far enough, someone was questioning the use of a 2-way
    selection (what in C would be expressed using ?:) on the left of an
    assignment, in an example from another language.

    So my set of categories were in the context of what /was/ allowed in C
    on the left of an assignment.

    People made too much of it.


    And yet, I was doing that in the 1980s on my toy compilers. So that's
    not that hard either.

    Ok, it's not that hard to implement things that are not valid C.


    And my comments were in reply to this from WH:

    But it creates trouble: the process of changing
    a subexpression with read references into a write access is more
    complicated than replacing a read instruction by a write instruction.
    First, it needs to recognize things which are invalid.

    This suggests a complexity that does not really exist, certainly not for
    C with its simple LHSs, and I said that it's not that hard even with
    more elaborate languages.

    So your reply is taking my remark out of context. Or were you just trying to
    write something clever? According to that, any entire language X would
    be easy to implement because it's not valid C?

    What exactly /is/ lvalue conversion? What is converted to what?

    This is specified in the C standard (6.3.2.1p2 in C11, 6.3.3.1p2
    in C23 drafts). I suggest you read it.

    An lvalue (an expression that designates an object) is converted
    to the value stored in the designated object. This conversion
    (adjustment) does not occur on the LHS of an assignment or in
    several other contexts.

    It might have been clearer to say that the expression is adjusted
    from an expression that designates an object (an lvalue) to an
    expression that yields the value of that object (not an lvalue).
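    A concrete illustration:

        int a = 1, b = 2;
        a = b;   /* 'b' undergoes lvalue conversion: the expression is
                    adjusted from designating the object b to yielding
                    the value stored in it.  'a' is not converted: it
                    still designates the object to store into. */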

    Note that this is not a run-time conversion,

    Well quite. WH was trying to make out that it's a big deal, but it's
    usually a no-op (ie. nothing needs to be done in an actual compiler).


    like a conversion of an
    integer value to a floating-point value.

    No, there an actual conversion is usually needed. (Maybe the
    implementation language only stores numeric constants as floats, so only
    the type tag changes.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Bart on Thu Sep 12 21:09:40 2024
    Bart <bc@freeuk.com> wrote:
    On 10/09/2024 05:40, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    I've looked at half a dozen hits for 'forth postpone' and I still don't
    understand what it does. Apparently something to do with compiled mode.

    I wouldn't know enough to confidently implement it or use it.

    Silly example 1:



    <snip long article about Forth>

    For a non-advocate of Forth you have a lot to say about it!

    Consider the following problem: you have a compiled language and
    are mostly satisfied with it, but you would like to add interpretive
    execution. The point is to reduce executable size: bytecode can
    be significantly smaller than machine code. So, you want a small
    size for the interpreted code, but also reasonably fast execution
    (otherwise, to get decent speed, everything would be compiled to
    machine code, which would defeat the purpose). Since this is supposed
    to be a space optimization, there should be seamless cooperation
    between machine code and interpreted code. In particular, interpreted
    functions must be wrapped in machine code so that they are directly
    callable and the wrapper starts the interpreter. Something like this
    is not hard to implement, but the problem/question is how to get low
    overhead. Now, concerning Forth: traditional Forth mixes machine code
    with threaded code, and threaded code is a kind of interpretation. So
    I looked at a Forth implementation with the intent of understanding
    the tricks which Forth uses to make this fast, and whether the tricks
    apply in a more general setting (it seems they do not).

    For different reasons I looked at the use of stack machines as a
    user-visible computation model.

    To give more background, bare ZX81 had 1kB RAM (including video RAM).

    You must mean /excluding/ surely? Otherwise there wouldn't be much left
    from 1KB!

    There is not much left. But there is a trick: the boundary between
    video memory and the rest is movable; as you fill the other part, the
    available screen area shrinks. It does not take a long time to have
    the machine completely filled, with no free space left (and IIRC some
    small screen area kept for an out-of-memory message).

    So that meagre 1KB had to be shared?

    One point of view is that the base ZX81 is much bigger than some
    other machines. It could act as a decent programmable calculator (the
    popular Ti-50 had 50 bytes of program space for users and memory for 8
    numbers, which is roughly equivalent to another 50 bytes). If you
    are satisfied with a small display space, then the ZX81 can hold much
    more program and data than the Ti-50. And all (not many) ZX81s that I
    saw in real use had a memory extension.

    The Z80 designs I did (for my own use and when I had a job developing commercial machines) never used tricks like that**.

    I never had to work within very tight pricing, but I would have
    considered such a machine unviable other than for experimenting.

    The ZX81 had another cost-saving trick: the CPU did a significant
    part of the work needed to create the image on the screen. In normal
    mode the CPU was available to the user only when it was not busy
    creating the image (which IIRC was during vertical retrace). There
    was a fast mode, when the CPU was devoted to the user program and no
    output appeared on the screen.

    Concerning "unviable", ZX81 could be used as a decent programmable
    calculator. I could imagine specialized application for say data
    entry or simple control that could run on base model. But mainly,
    people installed memory extention.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to David Brown on Thu Sep 12 21:28:16 2024
    David Brown <david.brown@hesbynett.no> wrote:
    On 12/09/2024 01:59, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:

    On many cpus, using sizes smaller than the full register size means
    doing sign extensions or masking operations at various times - thus full >>> size register operations can often be more efficient. On such systems
    you will find that int_fast16_t is 32-bit or 64-bit, according to the
    register width. On other cpus, some common ALU operations on full-size
    operands can be slower than for smaller operands (such as on the 68000). >>> There, int_fast16_t will be 16-bit.

    Compiler authors know what will usually be faster on the target. There
    will always be some exceptions (division is usually faster on smaller
    operands, for example). But if you don't know the target - as is the
    case of portable code - the compiler will usually make a better choice
    here than you would.

    BTW, I just played with Clang 18 on 64-bit FreeBSD. It has 32-bit
    int_fast16_t. gcc in Linux makes it 64-bit. Who is right?


    Technically, both are right - implementations can use any integer type
    of at least 16 bits here, whatever they think is fastest in general.
    But it surprises me somewhat, given that clang for x86-64 on Linux uses 64-bit for int_fast16_t.

    Well, both satisfy "hard" requirements. But the question was which
    type is faster.

    But to be clear, the size of the "fast" types depends on the target and
    the implementation. They are not normally used for external ABI's, and
    are purely internal to the generated code. Obviously you must pick a
    "fast" size that is at least as big as the range you need.

    I think that Linux (and probably FreeBSD too) considers the size of
    the fast types as part of the ABI (regardless of guidelines, those
    types have certainly leaked into "public" interfaces). Such an ABI
    change is probably viewed as not worth doing.

    And concerning the choice on x86_64: AFAIK, for operations on numbers
    of the same size, 32-bit gives the fastest operations. 16-bit has two
    disadvantages, a big one due to partial register stalls and a small
    one due to larger code size (operand size prefix). 64-bit requires
    bigger code (more need to use prefixes) and bigger data. When mixing
    types, 32-bit numbers are automatically zero extended, so there
    is no extra cost when mixing unsigned numbers. So what remains
    is mixing signed 32-bit integers with 64-bit ones. Addresses
    use 64-bit arithmetic, so that requires sign extension. OTOH,
    in arithmetic, "fast" types are likely to be mixed with exact
    32-bit types, and then making "fast" types 32-bit is faster
    overall. So, there are bets/assumptions about which usage is more
    frequent. Either way, the choice between 32-bit and 64-bit fast types
    is unlikely to make _much_ difference.
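    A small example of the kind of mixing described (what a compiler
    actually emits varies, but indexing with a signed 32-bit value
    typically needs a sign extension before the 64-bit address
    arithmetic):

        long sum(const long *a, int n)
        {
            long s = 0;
            for (int i = 0; i < n; i++)
                s += a[i];   /* 32-bit signed i widened to 64 bits for
                                the address computation */
            return s;
        }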

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Fri Sep 13 00:46:35 2024
    Bart <bc@freeuk.com> writes:

    On 12/09/2024 00:47, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 11/09/2024 01:02, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    Sorry, did your remark above suggest I don't know what an lvalue is?
    That seemed like the obvious explanation for the incorrect information you gave. Did you post it /knowing/ what other kinds of things are
    lvalues in C just to confuse people?

    Which incorrect explanation was that?

    I merely said that LHSs of assigments fall into these categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select
    Yes, that incorrect explanation.

    I dispute that. What I said is very broadly correct. But in this newsgroup
    you do like to nitpick.

    Someone who wants to write p->m = 42 would not consider it nitpicking if
    your compiler did not accept that form of LHS.

    But I agree I was simply correcting a small error. Why did you not just
    say "yes, I forgot a few cases"?

    So to you, it is of the greatest importance that somebody doesn't just know about those four categories that they will be reading and writing all the time in C code, but also know about:

    (int){A} = Y;

    You see why I wonder if you had a political career? This is pure spin.
    There is no technical argument here, just an attempt to mock someone
    pointing out a truth. I never even suggested that it was important,
    just that it was a missing case. And, still spinning away, you ignore
    X->m, which /is/ important and was also missing.

    which they will encounter approximately never. And it is also vital that
    they consider:

    (A) = (Y);

    a distinct category.

    More spin. There is nothing vital about it at all, and no one ever said
    there was. In fact I explicitly stated that I was ignoring such
    details.

    There might be a case for this for '(Z, A) = Y;' but
    that isn't allowed anyway. So it only applies to superfluous
    parentheses.

    Clearly I mean VALID LHSs, otherwise they wouldn't be LHSs of assignments!
    I've since learnt about a couple of other possible categories; one is with compound literals like '(int){42} = 0'.
    Along with (a) _Generic expressions (where the selected arm is an
    lvalue)


    The _Generic forms reduce down to one of those four.

    It would reduce down to one of the six or seven (depending on how we want to count), not to your four. Anyway, you were not listing "reduced" forms
    but the valid shapes for left hand sides.

    It is more like a macro,
    and if you're going to start with macros, there are unlimited categories
    that can be created. If this is merely about syntax, then why not?

    (I'd also like to see an actual usecase for _Generic on the LHS of an assignment. Perhaps one where there is a matching (symmetric?) _Generic on the RHS?)

    and (b) expressions of the form X->m.

    Are there any circumstances where X->m does something different from
    (*X).m?

    Are there cases where X[i] does something different to *(X+i)? You felt
    the need to distinguish them despite the fact that they are equivalent
    by definition.

    (I don't count (A), ((A)) etc as a
    separate category; come on!)
    Don't give me "come on!". I was counting forms in the same way that you
    were when I said I could think of three more. I was not counting
    parentheses.

    Keith mentioned this form.

    Your reply was to me.

    The other is 'X.m' but when .m is a bitfield;
    What makes X.m = Y, where m is a bitfield, an extra category? It fits
    the X.m = Y pattern perfectly well.

    although this has the same
    same syntax as above, internally it's somewhat different.
    Your categories were syntactic. You were describing forms.

    Not entirely. There is behaviour associated with them.

    Most LHS terms can have & applied in an rvalue context for example;
    bitfield accesses can't. So it's something a user of the language needs to know about.

    You were describing the forms of assignment. If you want to consider
    "things you can't do with some valid left hand sides", bitfields are not
    alone in being special[1], but there is no reason to single them out
    when just being left hand sides.

    And internally, my ASTs (where bitfields are supported) use a different
    node type when X.m is a bitfield rather than a regular access.

    [1] in case you accuse me of teasing again, your first form of LHS, A =
    Y, can't be the operand of & if A is declared with storage class
    register.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Fri Sep 13 01:01:00 2024
    Bart <bc@freeuk.com> writes:

    On 12/09/2024 00:32, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 11/09/2024 01:22, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    And yes I'm still committed to that symmetry. I've used it for countless language implementations. C is little different other than it has a
    700-page standard that suggests a recommended model of how it's supposed to
    work.

    You can't really use that to bash me about the head with and maintain that
    all my ideas about language implementation are wrong because C views assignment in its own idiosyncratic manner.
    I don't want to bash you about the head, but what C says about
    assignment has /always/ been the point, and your implementation of C
    will be wrong if you don't follow the rules about C's assignments. You /know/ the LH and RH side of a C assignment have different constraints (you have said so yourself) yet you persist in defending your original claim that what is needed on the two sides "is exactly the same".

    I've listed the aspects that I said are the same.
    And you've stated that there are differences but of course you haven't
    listed them (as far as I can tell).

    That is, if something is a legal LHS term, then its syntax, and its type,
    are identical to that term appearing on the RHS.
    And you have /not/ stated, though you know it perfectly well, that the
    reverse does /not/ apply -- that many "legal" RHS expressions can't
    appear, legally, on the LHS.

    Clearly all RHSs can't appear on the left; this is OK:

    A = A + A + A + A;

    but not:

    A + A + A + A = A;

    The example I used all those posts ago was A=3 vs 3=A;

    As for differences, there is my AST for A = B:

    i32-- 1 assign:
    i32-- - 1 name: a
    i32-- - 2 name: b

    Same node type, same type. What are the differences?

    You clearly know that what is required on each side is different: the LH
    side must be a modifiable lvalue, the RH side need not be.

    Or people simply can't grasp what I'm saying. I've given a million
    examples of identical LHSs and RHSs,
    But your mistake is not that there are not millions of identical
    (looking) LH sides and RH sides. No one disputes that. But in reply to
    my statement that what is /required/ on both sides is not the same, you
    said "I would argue that it is exactly the same".

    I think you're still not getting it. In C you write:

    A = B

    to assign simple value types. Or you write B = A the other way around.

    I get that. What is it that you think I don't get? What have I said
    that you actually disagree with? You know perfectly well that what is
    required on the two sides is not exactly the same. 3=A is not permitted
    but A=3 is, yet you keep posting in support of the notion that what is
    required on both sides is exactly the same.

    You write A as A, B as B no matter which side it is.

    In BLISS, which you claim to be more symmetric, you have to write:

    A = .B

    for the same operation (copy B's value into A), or B = .A for the other way around.
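    As a loose C rendering of that contrast (purely illustrative, not
    BLISS semantics): if each name is taken to denote an address, the
    fetch that BLISS spells with a dot must be written out, and C then
    happens to use the same * spelling for the store side and the fetch
    side.

        void demo(void)
        {
            int a = 0, b = 7;
            int *A = &a, *B = &b;   /* let each name stand for an address */

            *A = *B;   /* roughly BLISS's  A = .B : fetch from B, store into A;
                          the left * marks a store target, the right * a fetch */
        }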

    Is there anything I've said about BLISS that you did not understand? Is
    there anything I said about BLISS that you think is wrong (I am not an
    expert)? I ask because you have mysteriously cut what I /actually/ said
    about BLISS (I hope you understood my explanation) and instead made up
    some claim that I don't think I have made. If you want to dispute
    something, quote me and say what you think is wrong or incomprehensible.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to David Brown on Fri Sep 13 02:16:00 2024
    David Brown <david.brown@hesbynett.no> wrote:
    On 10/09/2024 01:58, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 16:36, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 08/09/2024 23:34, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:

    And while microcontrollers sometimes have a limited form of branch
    prediction (such as prefetching the target from cache), the more
    numerous and smaller devices don't even have instruction caches.
    Certainly none of them have register renaming or speculative execution.

    IIUC the STM32F4 series has cache, and some of them are not so big. There
    are now several Chinese variants of the STM32F103 and some of them have
    caches (some very small, like 32 words; IIRC one has 8 words and it is
    hard to decide if this is a very small cache or a big prefetch buffer).

    There are different kinds of cache here. Some of the Cortex-M cores
    have optional caches (i.e., the microcontroller manufacturer can choose
    to have them or not).

    <https://en.wikipedia.org/wiki/ARM_Cortex-M#Silicon_customization>

    I do not see relevent information at that link.

    There is a table of the Cortex-M cores, with the sizes of the optional
    caches.


    Flash memory, flash controller peripherals, external memory interfaces
    (including things like QSPI) are all specific to the manufacturer,
    rather than part of the Cortex-M cores from ARM. Manufacturers can do
    whatever they want there.

    AFAIK a typical Cortex-M design has the core connected to a "bus matrix".
    It is up to the chip vendor to decide what else is connected to the bus matrix.
    Yes.

    However, there are other things connected before these crossbar
    switches, such as tightly-coupled memory (if any).

    TCM is _not_ a cache.


    Correct. (I did not suggest or imply that it was.)

    And the cpu caches
    (if any) are on the cpu side of the switches.

    Caches are attached where the system designer thinks they are useful
    (and possible). The word "cache" has a well-established meaning and
    ARM (or you) has no right to redefine it.


    I am using it in the manner ARM uses it when talking about ARM
    processors and microcontroller cores. I think that is the most relevant
    way to use the term here. The term "cache" has many meanings in many contexts - there is no single precise "well-established" or "official" meaning.

    It is "well-established" that meaning is very inclusive, like definition
    here:

    <https://en.wikipedia.org/wiki/Cache_(computing)>


    Context is everything. That is why I have been using the term
    "cpu cache" for the cache tied tightly to the cpu itself, which comes as
    part of the core that ARM designs and delivers, along with parts such as
    the NVIC.

    Logically, given that there was "tightly attached memory", this should
    be called "tightly attached cache" :)

    Logicallt "cpu cache" is cache sitting on path between the CPU and a memory device. It does not need to be tightly attached to the CPU, L2 and L3
    caches in PC-s are not "tightly attached".

    And I have tried to use terms such as "buffer" or "flash
    controller cache" for the memory buffers often provided as part of flash controllers and memory interfaces on microcontrollers, because those are terms used by the microcontroller manufacturers.

    "flash cache" looks resonable. Concerning difference between a buffer
    and a cache there is indeed some fuzzines here. AFAICS word "buffer"
    is used for logically very simple devices, once operation becomes a bit
    more interesting it is usually called a cache. Anyway, given fuzzines
    saying that something called a buffer is not a cache is risky, it
    may have all features associalted normally with caches, and in such
    a case deserves to be called a cache.

    Manufacturers also have a
    certain amount of freedom of the TCMs and caches, depending on which
    core they are using and which licenses they have.

    There is a convenient diagram here:

    <https://www.electronicdesign.com/technologies/embedded/digital-ics/processors/microcontrollers/article/21800516/cortex-m7-contains-configurable-tightly-coupled-memory>

    For me it does not matter if it is ARM design or vendor specific.
    Normal internal RAM is accessed via the bus matrix, and in the MCUs that
    I know about it is fast enough that a cache is not needed. So caches
    come into play only for flash (and possibly external memory, but a
    design with external memory will probably be rather large).


    Typically you see data caches on faster Cortex-M4 microcontrollers with
    external DRAM, and it is also standard on Cortex-M7 devices. For the
    faster chips, internal SRAM on the AXI bus is not fast enough. For
    example, the NXP i.mx RT106x family typically run at a 528 MHz core clock,
    but the AXI bus and cross-switch are at 133 MHz (a quarter of the
    speed). The tightly-coupled memories and the caches run at full core speed.

    OK, if you run the core at a faster clock than the bus matrix, then a cache
    attached on the core side makes a lot of sense. And since the cache has to
    compensate for the lower bus speed it must be reasonably large.

    Yes.

    But
    if you look at devices where bus matrix runs at the same clock
    as the core, then it makes sense to put cache on the other side.

    No.

    You put caches as close as possible to the prime user of the cache. If
    the prime user is the cpu and you want to cache data from flash,
    external memory, and other sources, you put the cache tight up against
    the cpu - then you can have dedicated, wide, fast buses to the cpu.

    I would say that there is a tradeoff between cost and effect. And
    there is the question of technical possibility. For example, the 386 was
    sold as a chip, and all that a system designer could do was to put
    a cache on the motherboard. An on-chip cache would be better, but was
    not possible. IIUC in the case of a Cortex-M0 or, say, M4, manufacturers get
    an ARM core with buses intended to be connected to the bus matrix.
    Manufacturers could add an extra bus matrix or crossbar just to access a cache,
    but the bus width is specified by the ARM design. If the main bus matrix and RAM
    are clocked at the CPU frequency, the extra bus matrix and cache would
    only add extra latency for no gain (of course, this changes when the
    main bus matrix runs at a lower clock). So putting a cache only
    at the flash interface makes sense: it helps there, and on lower-end
    chips it is not needed elsewhere. Also, concerning caches in MCUs, note
    that for writable memory there is the problem of cache coherency. In
    particular, several small MCUs have DMA channels. A non-coherent design
    would violate user expectations and would be hard to use. OTOH putting a
    coherent cache on the memory side means extra complication in the bus matrix (I
    do not know what ARM did with their bigger cores). Flash, being
    mainly read-only, does not have this problem.

    But it can also make sense to put small buffers as part of memory
    interface controllers. These are not organized like data or instruction caches, but are specific for the type of memory and the characteristics
    of it.

    The point is that in many cases they are organized like classic caches.
    They cover only flash, but how is that different from caches in PCs
    that covered only part of the possible RAM?

    How this is done depends on details of the interface, details of
    the internal buses, and how the manufacturer wants to implement it. For example, on one microcontroller I am using there are queues to let it
    accept multiple flash read/write commands from the AHB bus and the IPS
    bus, but read-ahead is controlled by the burst length of read requests
    from the cross-switch (which in turn will come from cache line fill
    requests from the cpu caches). On a different microcontroller, the read-ahead logic is in the flash controller itself as that chip has a
    simpler internal bus where all read requests will be for 32 bits (it has
    no cpu caches). An external DRAM controller, on the other hand, will
    have queues and buffers optimised for multiple smaller transactions and
    be able to hold writes in queues that get lower priority than read requests.

    These sorts of queues and buffers are not generally referred to as
    "caches", because they are specialised queues and buffers. Sometimes
    you might have something that is in effect perhaps a two-way
    single-entry 16 byte wide read-only cache, but using the term "cache"
    here is often confusing. At best it is a "flash controller cache", and
    very distinct from a "cpu cache".

    From STM32F400 reference manual:

    : Instruction cache memory
    :
    : To limit the time lost due to jumps, it is possible to retain 64 lines
    : of 128 bits in an instruction cache memory.

    That is a 1 kB instruction cache. In most of their marketing material they
    say "flash accelerator", but the reference manual admits that this is a
    cache (OK, they also have a prefetch buffer, and possibly "flash accelerator
    = cache + buffer"). Similarly, the documentation of the RP2040 says:

    : An internal cache remembers the contents of recently-accessed flash
    : locations, which accelerates the average bandwidth and latency of
    : the interface.

    Granted, the RP2040 is a rather big chip, but the same thing is used in smaller
    ones.

    It seems that vendors do not like to say that they use a cache; instead
    they use misleading terms like "flash accelerator".

    That all depends on the vendor, and on how the flash interface
    controller is designed. Vendors do like to use terms that sound good, of course!


    So a "cache" of 32 words is going to be part of the flash interface, not >>>>> a cpu cache

    Well, caches were never part of the CPU proper; they were part of the
    memory interface. They could act for the whole memory or only for the part
    that needs it (like flash). So I do not understand what "not a cpu
    cache" is supposed to mean. More relevant is whether such a thing acts
    as a cache: a 32-word thing almost surely will act as a cache;
    an 8-word thing may be a simple FIFO buffer (or may act smarter,
    showing behaviour typical of caches).


    Look at the diagram in the link I gave above, as an example. CPU caches
    are part of the block provided by ARM and are tightly connected to the
    processor. Control of the caches (such as for enabling them) is done by
    hardware registers provided by ARM, alongside the NVIC interrupt
    controller, SysTick, MPU, and other units (depending on the exact
    Cortex-M model).

    This is completely different from the small buffers that are often
    included in flash controllers or external memory interfaces as
    read-ahead buffers or write queues (for RAM), which are as external the
    processor core as SPI, UART, PWM, ADC, and other common blocks provided
    by the microcontroller manufacturer.

    The discussion started about the possible interaction of caches
    and virtual function dispatch.

    OK - I admit to having lost track of the earlier discussion, so that is helpful.

    This interaction does not depend
    on you calling it a cache. It depends on cache hits/misses,
    their cost, and possible eviction. And actually small caches
    can give "interesting" behaviour: with a small code footprint there
    may be a 100% hit ratio, but one extra memory reference may lead
    to significant misses. And even small caches behave differently
    than simple buffers.


    I agree that behaviour can vary significantly.

    When you have a "flash controller cache" - or read-ahead buffers - you typically have something like a 60-80% hit ratio for sequential code and nearly 100% for very short loops (like you'd have for a memcpy() loop).
    You have close to 0% hit ratio for branches or calls, regardless of
    whether they are virtual or not (with virtual function dispatch
    generally having one extra indirection at 0% hit rate). This is the
    kind of "cache" you often see in microcontrollers with internal flash
    and clock speeds of up to perhaps 150 Mz, where the flash might be at a quarter of the main cpu clock.

    Well, with 64 lines and 2-set associativity the STM cache can give you
    quite a decent hit ratio on branchy code, as long as the working set is
    not too large. It does not need to be a simple loop. Already 3
    lines can be enough if you have a single call to a simple function
    and the call is in a loop (and if you call via a function pointer the
    compiler cannot inline the function). More realistically, 8 lines
    will cover several cases where code jumps between a small number
    of locations.
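    A minimal sketch of the function-pointer case mentioned above (names
    illustrative): because the target of f is unknown at compile time the
    compiler cannot inline it, so each iteration performs an indirect call
    whose code must be fetched.

        typedef int (*op_fn)(int);

        static int inc(int x) { return x + 1; }

        int apply_n(op_fn f, int x, int n)
        {
            for (int i = 0; i < n; i++)
                x = f(x);   /* indirect call: not inlinable in general */
            return x;
        }

        /* e.g. apply_n(inc, 0, 10) returns 10 */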

    (which are typically 16KB - 64KB,

    I wonder where you found this figure. Such a size is typical for
    systems bigger than MCUs. It could be useful for MCUs with
    flash on a separate die, but with flash on the same die as the CPU a
    much smaller cache is adequate.

    Look at the Wikipedia link I gave. Those are common sizes for the
    Cortex-M7 (which is pretty high-end), and for the newer generation of
    Cortex-M35 and Cortex-M5x parts. I have on my desk an RT1062 with a
    600 MHz Cortex-M7, 1 MB internal SRAM, 32 KB I and D caches, and
    external QSPI flash.

    OK, as I wrote it makes sense for them. But for smaller machines
    much smaller caches may be adequate.


    As I have said, they are not really caches in the same sense as you have
    for a cpu cache.

    In the early Pentium era there were PC processors (non-Intel) with (IIRC)
    a 1 kB on-chip cache. Due to a small hit ratio they were slower than
    chips with bigger caches, but dramatically faster compared to a chip
    that had no cache at all.

    But certainly a "flash controller cache" or read-ahead
    buffer (especially if there are two of them) can make a big difference
    to the throughput of a microcontroller, and equally certainly a cpu
    cache would be an unreasonable cost in die area, power, and licensing
    fees for most microcontrollers. Thus these small buffers - or very
    small, very specialised caches in the flash controller - are a good idea.



    and only found on bigger
    microcontrollers with speeds of perhaps 120 MHz or above). And yes, it
    is often fair to call these flash caches "prefetch buffers" or
    read-ahead buffers.

    Typical code has enough branches that simple read-ahead beyond 8
    words is unlikely to give good results. OTOH delivering things
    that were accessed in the past and are still present in the cache
    gives good results even with very small caches.

    There are no processors with caches smaller than perhaps 4 KB - it is
    simply not worth it.

    Historically there were processors with small caches: 256 B in
    Motorola chips, and I think smaller too. It depends on the whole
    design.

    For a general cpu data cache on a modern cpu, the cache control logic is probably going to require the same die area as a few KB of cache
    storage, as a minimum - so it makes no sense to have such small cpu
    caches. The logic for instruction caches is simpler. In days gone by, balances were different and smaller caches could be useful. The 68020
    had a 256 byte instruction cache, and the 68030 and 68040 added a 256
    byte data cache. Both were single way.

    Currently for "big" processors really small caches seem
    to make no sense. Microconrollers have their own constaints.
    Manufacturer may decide that cache giving 10% average improvement
    is not worth uncertainilty of execution time. Or may decide that
    small cache is the cheapest way to get better benchmark figures.

    You are correct that microcontrollers have different constraints, and
    that jitter and variation of timing is far more of a cost in
    microcontrollers than it is on "big" processors, where throughput is
    key. The other factor here is latency. On earlier designs such as the aforementioned M68k family, you could often add a fair bit of logic
    without requiring extra clock cycles. Thus the cache was "free". That
    is different now, even on microcontrollers. Adding a cpu cache on even
    the slowest of modern microcontrollers will mean at least a clock cycle
    extra on cache misses compared to no cache - for medium devices (say,
    120 MHz Cortex-M4) it would mean 2 or 3 extra cycles. So unless you are getting a significant hit ratio, it is not worth it.

    As I wrote, adding a RAM cache on low/middle-end devices does not
    help; RAM there is as fast as any possible cache. Small chips have no
    external memories, so the only usefully cachable thing is flash. And
    for flash even a small cache may be useful. Concerning the 120 MHz Cortex-M4,
    I do not think it is 2 clocks, at least when the cache is small enough.
    And in principle the chip may do the cache lookup in parallel with the flash
    access. In case of a miss there would be no time lost compared to
    no cache. In case of a hit there would be a gain.

    Putting read-ahead buffers and a "micro-cache", if that term suits you,
    at the flash controller and other memory interfaces is, however, free in terms of clock cycles and latency - these parts run at a higher clock
    rate than the flash itself.


    Read-ahead buffers on flash accesses are helpful,
    however, because most code is sequential most of the time. It is common
    for such buffers to be two-way, and to have between 16 and 64 bytes per
    way.

    If you read carefully the description of the STM "flash accelerator" it is
    clear that this is a classic cache, with line size matched to the flash,
    something like 2-set associativity, conflicts and eviction.
    Historically there were variations; some caches only cache targets
    of jumps and use a prefetch buffer for linear code. Such caches
    can be effective at very small size.


    I don't know the STM "flash accelerator" specifically - there are many
    ARM microcontrollers and I have not used them all. But while it is true
    that some of these are organised in a similar way to extremely small and restricted caches, I think using the word "cache" alone here is
    misleading. That's why I have tried to distinguish and qualify the term.

    And in the context of virtual function dispatch, a two-way single line micro-cache is pretty much guaranteed to have a cache miss when doing
    such indirect calls as you need the current code, the virtual method
    table, and the virtual method itself to be in cache simultaneously to
    avoid a miss. But these flash accelerators still make a big difference
    to the speed of code in general.

    I do not mean 2 lines, those are typically called prefetch buffers.
    Rather, I mean say 8-line micro cache or caches having tens of lines.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Fri Sep 13 15:12:47 2024
    On 13/09/2024 00:46, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 12/09/2024 00:47, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 11/09/2024 01:02, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    Sorry, did your remark above suggest I don't know what an lvalue is?
    That seemed like the obvious explanation for the incorrect information
    you gave. Did you post it /knowing/ what other kinds of things are
    lvalues in C just to confuse people?

    Which incorrect explanation was that?

    I merely said that LHSs of assigments fall into these categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select
    Yes, that incorrect explanation.

    I dispute that. What I said is very broadly correct. But in this newsgroup
    you do like to nitpick.

    Someone who wants to write p->m = 42 would not consider it nitpicking if
    your compiler did not accept that form of LHS.

    But I agree I was simply correcting a small error.

    It didn't seem like it. You keep saying things like this:

    Is there no part of C you can't misrepresent?

    And (see above):

    Yes, that incorrect explanation.

    Suggesting it was completely wrong rather than there being some corner
    cases that might have been included too.

    You'd have a point if I was writing some treatise on C and this was
    supposed to be a comprehensive list of all lvalue terms.

    The context was that I was pointing out the simpler kinds of LHSs that C allows, compared with other languages where you could do stuff like this:

    IF c THEN a ELSE b FI := x

    which was exactly what was being queried.
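    For reference, the closest C spelling of that Algol-style conditional
    target goes through a pointer, since a conditional expression is not
    an lvalue in C (names illustrative):

        void demo(int c, int x)
        {
            int a = 0, b = 0;

            *(c ? &a : &b) = x;   /* stores into a when c is nonzero, else into b */
        }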

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Ben Bacarisse on Fri Sep 13 15:02:30 2024
    On 13/09/2024 00:46, Ben Bacarisse wrote:

    I merely said that LHSs of assigments fall into these categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select
    Yes, that incorrect explanation.

    I dispute that. What I said is very broadly correct. But in this newsgroup
    you do like to nitpick.

    Someone who wants to write p->m = 42 would not consider it nitpicking if
    your compiler did not accept that form of LHS.

    But I agree I was simply correcting a small error. Why did you not just
    say "yes, I forgot a few cases"?

    So to you, it is of the greatest importance that somebody doesn't just know
    about those four categories that they will be reading and writing all the
    time in C code, but also know about:

    (int){A} = Y;

    You see why I wonder if you had a political career? This is pure spin.
    There is no technical argument here, just an attempt to mock someone
    pointing out a truth. I never even suggested that it was important,
    just that it was a missing case. And, still spinning away, you ignore
    X->m which /is/ important and was also missing.
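    For reference, the compound-literal form quoted above does compile in
    C99 and later, because a compound literal is an lvalue; a minimal
    sketch (the assignment is legal, though the stored value dies with
    the temporary object):

        void demo(void)
        {
            int A = 1, Y = 2;

            (int){A} = Y;   /* assigns to the unnamed int object created
                               by the compound literal */
        }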

    I assumed that most know that X->m is just shorthand for (*X).m,
    presumably created or adapted because the alternative (thanks to C's
    prefix dereference op) is so ungainly.

    But X->m and (*X).m both do member selection; they are conceptually the
    same thing.

    I kept X[i] and *X separate because they are conceptually different. (I
    expect if I'd had only *X, you would have complained that X[i] was missing.)

    Note that the description I added in the comment says only 'member
    select'. What would your descriptions have been for X.m and X->m?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Keith Thompson on Fri Sep 13 14:18:53 2024
    On 12/09/2024 21:51, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 12/09/2024 20:38, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    It's not that complicated, not with C anyway.

    Agreed. Let's stop doing that. (Your specific statement that there are *four* categories triggered a lot of the "too much".)

    I actually said 3-4 categories, depending on when index operations get
    turned into pointer operations. In C source they don't, but it might conceptually be in the mind of whoever is writing the C source.

    Upthread, you wrote:

    That's the only thing that needs to 'change', which I don't think is
    onerous anyway.

    Would you like to clarify what you think needs to change?

    That was in reply to this (I've capitalised 'changing'):

    WH:
    "To be more precise, gcc parser when seeing a variable
    creates read reference to this variable. When parser realizes that
    already recognized part of expression is the left hand side of an
    assignment it converts it to write access. So your approach is
    no worse than gcc. But it creates troubles, process of CHANGING
    subexpression with read references into write access is more
    complicated than replacing read instruction by write instruction."

    I put my 'change' in quotes since I didn't believe any such change is necessary. But if somebody or something deems it so then, in C, that
    would only apply to one lvalue on the LHS of an assignment.

    WH was criticising the approach of initially dealing with LHS/RHS, lvalue/rvalue, the same way, then making any 'changes' later.

    I gave an example of a LHS that might appear in some languages that had MULTIPLE lvalue terms on the left of an assignment, that could be nested
    within a complex expression, where that approach gives very little trouble.

    But I can see you're mainly concerned with scanning my posts to see if
    there's any divergence from the exact wording of the C standard, even
    though the discussion is a little wider than that in crossing language boundaries, and includes implementation details that are beyond the
    scope of the standard anyway.

    All I can say is that comp.std.c is that way --->

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Fri Sep 13 18:05:28 2024
    On Fri, 13 Sep 2024 16:25:44 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    On 13/09/2024 04:16, Waldek Hebisch wrote:

    You put caches as close as possible to the prime user of the
    cache. If the prime user is the cpu and you want to cache data
    from flash, external memory, and other sources, you put the cache
    tight up against the cpu - then you can have dedicated, wide, fast
    buses to the cpu.

    I would say that there is a tradeoff between cost and effect. And
    there is question of technical possibility. For example, 386 was
    sold as a chip, and all that a system designer could do was to put
    a cache ont the motherboard. On chip cache would be better, but was
    not possible.

    There can certainly be such trade-offs. I don't remember the details
    of the 386, but I /think/ the cache was connected separately on a
    dedicated bus, rather than on the bus that went to the memory
    controller (which was also off-chip, on the chipset).

    No, i386 had no back-side bus.
    The caches used in i386 based computers are typically described as
    "inline" cache.

    So it was
    logically close to the cpu even though it was physically on a
    different chip. I think if these sorts of details are of interest, a
    thread in comp.arch might make more sense than comp.lang.c.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Fri Sep 13 16:24:12 2024
    On 12/09/2024 23:28, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 12/09/2024 01:59, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:

    On many cpus, using sizes smaller than the full register size means
    doing sign extensions or masking operations at various times - thus full
    size register operations can often be more efficient. On such systems
    you will find that int_fast16_t is 32-bit or 64-bit, according to the
    register width. On other cpus, some common ALU operations on full-size
    operands can be slower than for smaller operands (such as on the 68000).
    There, int_fast16_t will be 16-bit.

    Compiler authors know what will usually be faster on the target. There
    will always be some exceptions (division is usually faster on smaller
    operands, for example). But if you don't know the target - as is the
    case of portable code - the compiler will usually make a better choice
    here than you would.

    BTW, I just played with Clang 18 on 64-bit FreeBSD. It has 32-bit
    int_fast16_t. gcc in Linux makes it 64-bit. Who is right?


    Technically, both are right - implementations can use any integer type
    of at least 16 bits here, whatever they think is fastest in general.
    But it surprises me somewhat, given that clang for x86-64 on Linux uses
    64-bit for int_fast16_t.
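    A quick, illustrative way to see what a particular toolchain picked:

        #include <stdint.h>
        #include <stdio.h>

        int main(void)
        {
            printf("int_fast16_t: %zu bytes\n", sizeof(int_fast16_t));
            printf("int_fast32_t: %zu bytes\n", sizeof(int_fast32_t));
            return 0;
        }

        /* gcc on x86-64 Linux prints 8 for both; clang 18 on x86-64
           FreeBSD reportedly prints 4, per the post above. */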

    Well, both satisfy "hard" requirements. But the question was which
    type is faster.

    Yes. But unless I am using a target processor that I know well, have a
    good idea of the types of instructions and know about the timings for
    those instructions with different data types, then I am inclined to
    believe the compiler implementer here and use the int_fastNN_t types.
    (For most of my C programming, I /do/ know the target well, and can make
    more refined type choices if it is relevant. But I don't know the
    timing details for the countless x86-64 variants.)


    But to be clear, the size of the "fast" types depends on the target and
    the implementation. They are not normally used for external ABI's, and
    are purely internal to the generated code. Obviously you must pick a
    "fast" size that is at least as big as the range you need.

    I think that Linux (and probably FreeBSD too) considers the size of the
    fast types part of the ABI (regardless of guidelines, those types
    certainly leaked into "public" interfaces). Such an ABI change is
    probably viewed as not worth doing.


    I am not sure if the fast types are in the ABI. It certainly seems not,
    if you say clang on x86-64 BSD has 32-bit int_fast32_t, while it is
    64-bit on Linux gcc and clang. BSD and Linux use the same ABI, AFAIK.
    (Everybody except MS uses the same x86-64 ABI.)


    And concerning the choice on x86_64, AFAIK for operations on numbers of
    the same size, 32-bit gives the fastest operations. 16-bit had two
    disadvantages: a big one due to partial register stalls, and a small one
    due to larger size (operand size prefix). 64-bit requires bigger
    code (more need to use prefixes) and bigger data.

    I don't know if that is all correct or not. Some operations are
    definitely slower on bigger operands, such as division. Different
    x86-64 processors may see different costs for things like prefix sizes.
    And if you can get SIMD instructions into the picture, then smaller
    sizes let you do more in the same instruction.

    When mixing
    types, 32-bit numbers are automatically zero extended, so there
    is no extra cost when mixing unsigned numbers.

    When I look at generated code, unsigned types smaller than 64 bits can
    require masking at times, and they do sometimes require zero extend instructions (typically expressed as a "move 32-bit register to 64-bit register" instruction).

    So what remains
    is mixing signed 32-bit integers with 64-bit ones. Addresses
    use 64-bit arithmetic, so that requires sign extension. OTOH
    in arithmetic, "fast" types are likely to be mixed with exact
    32-bit types, and then making "fast" types 32-bit is faster
    overall. So, there are bets/assumptions about which usage is more
    frequent. OTOH, the choice between 32-bit and 64-bit fast types
    is unlikely to make _much_ difference.


    It could be interesting to compare speeds for different kinds of code on different x86-86 targets. But the details here are beyond me, and also
    well outside of the targets that I am most interested in.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Waldek Hebisch on Fri Sep 13 16:25:44 2024
    On 13/09/2024 04:16, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 10/09/2024 01:58, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:
    On 09/09/2024 16:36, Waldek Hebisch wrote:

    (I'm snipping bits, because these posts are getting a bit long!)


    Context is everything. That is why I have been using the term
    "cpu cache" for the cache tied tightly to the cpu itself, which comes as
    part of the core that ARM designs and delivers, along with parts such as
    the NVIC.

    Logically, given that there was "tightly attached memory", this should
    be called "tightly attached cache" :)

    Logicallt "cpu cache" is cache sitting on path between the CPU and a memory device. It does not need to be tightly attached to the CPU, L2 and L3
    caches in PC-s are not "tightly attached".


    That is all true. But in the case of ARM Cortex-M microcontrollers, the
    cpu cache is part of the "black box" delivered by ARM. Manufacturers
    get some choices when they order the box, including some influence over
    the cache sizes, but it is very much integrated in the core complex
    (along with the NVIC and a number of other parts). It is completely
    irrelevant that on a Pentium PC, the cache was a separate chip, or that
    on a PowerPC microcontroller the interrupt controller is made by the microcontroller manufacturer and not by the cpu core designers. On microcontrollers built around ARM Cortex-M cores, ARM provides the cpu
    core, cpu caches (depending on the core model and options chosen), the
    NVIC interrupt controller, MPU, and a few other bits and pieces. The
    caches are called "cpu caches" - "cpu data cache" and "cpu instruction
    cache" because they are attached to the cpu. The microcontroller
    manufacturer can put whatever else they like on the chip.


    I don't disagree that other types of buffer can fall under a generic
    concept of "cache". And in some cases they may even have the same
    logical build-up as a tiny and limited version of the cpu caches. But I
    don't think it helps to use exactly the same terms for things that are
    in significantly different places on the chip, with very different
    balances in their designs, and for different effects in their detailed
    working.

    It is fair enough to talk about a "flash cache" for buffers that are
    designed somewhat like a cache, with at least two entries indexed by an
    address hash (usually just some of the lower address bits), and with
    tags including the rest of the address bits. It is misleading for
    systems where you just have a read-ahead buffer or two, or a queue
    system. Unlike the cpu caches, the architecture of such flash
    accelerators varies wildly for different manufacturers and their
    different microcontroller models.


    And I have tried to use terms such as "buffer" or "flash
    controller cache" for the memory buffers often provided as part of flash
    controllers and memory interfaces on microcontrollers, because those are
    terms used by the microcontroller manufacturers.

    "flash cache" looks resonable. Concerning difference between a buffer
    and a cache there is indeed some fuzzines here. AFAICS word "buffer"
    is used for logically very simple devices, once operation becomes a bit
    more interesting it is usually called a cache. Anyway, given fuzzines
    saying that something called a buffer is not a cache is risky, it
    may have all features associalted normally with caches, and in such
    a case deserves to be called a cache.


    OK, but it is not a "cpu cache".


    But
    if you look at devices where bus matrix runs at the same clock
    as the core, then it makes sense to put cache on the other side.

    No.

    You put caches as close as possible to the prime user of the cache. If
    the prime user is the cpu and you want to cache data from flash,
    external memory, and other sources, you put the cache tight up against
    the cpu - then you can have dedicated, wide, fast buses to the cpu.

    I would say that there is a tradeoff between cost and effect. And
    there is the question of technical possibility. For example, the 386 was
    sold as a chip, and all that a system designer could do was to put
    a cache on the motherboard. An on-chip cache would be better, but was
    not possible.

    There can certainly be such trade-offs. I don't remember the details of
    the 386, but I /think/ the cache was connected separately on a dedicated
    bus, rather than on the bus that went to the memory controller (which
    was also off-chip, on the chipset). So it was logically close to the
    cpu even though it was physically on a different chip. I think if these
    sorts of details are of interest, a thread in comp.arch might make more
    sense than comp.lang.c.

    IIUC in the case of a Cortex-M0 or, say, M4, manufacturers get an
    ARM core with buses intended to be connected to the bus matrix.

    Yes.

    Manufacturers could add an extra bus matrix or crossbar just to access a cache,
    but the bus width is specified by the ARM design.

    I believe the bus standard is from ARM, but the implementation is by the manufacturers (unlike the cpu core and immediately surrounding parts,
    including the cpu caches for devices that support that).

    If the main bus matrix and RAM
    are clocked at the CPU frequency, the extra bus matrix and cache would
    only add extra latency for no gain (of course, this changes when the
    main bus matrix runs at a lower clock).

    Correct, at least for static RAM (DRAM can have more latency than a
    cache even if it is at the same base frequency). cpu caches are useful
    when the onboard ram is slower than the cpu, and particularly when
    slower memory such as flash or external ram are used.

    So putting a cache only
    at the flash interface makes sense: it helps there, and on lower-end
    chips it is not needed elsewhere.

    Yes.

    Also, concerning caches in MCUs, note
    that for writable memory there is the problem of cache coherency. In
    particular, several small MCUs have DMA channels. A non-coherent design
    would violate user expectations and would be hard to use.

    That is correct. There are three main solutions to this in any system
    with caches. One is to have cache snooping for the DMA controller so
    that the cpu and the DMA have the same picture of real memory. Another
    is to have some parts of the ram as being uncached (this is usually
    controlled by the MMU), so that memory that is accessed by the DMA is
    never in cache. And the third method is to use cache flush and
    invalidate instructions appropriately so that software makes sure it has up-to-date data. I've seen all three - and on some microcontrollers I
    have seen a mixture in use. Obviously they have their advantages and disadvantages in terms of hardware or software complexity.
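    A minimal sketch of the third method on a Cortex-M7 with CMSIS-Core
    (the buffer and the dma_start_* driver calls are hypothetical;
    SCB_CleanDCache_by_Addr and SCB_InvalidateDCache_by_Addr are the
    CMSIS cache-maintenance helpers):

        #include <stdint.h>
        #include <stddef.h>
        /* assumes a device header that pulls in core_cm7.h (CMSIS-Core) */

        extern void dma_start_tx(const uint8_t *buf, size_t len); /* hypothetical */
        extern void dma_start_rx(uint8_t *buf, size_t len);       /* hypothetical */

        static uint8_t buf[512] __attribute__((aligned(32)));  /* cache-line aligned */

        void send_with_dma(void)
        {
            /* The CPU filled buf: clean (write back) dirty lines so the
               DMA engine reads up-to-date data from RAM. */
            SCB_CleanDCache_by_Addr((uint32_t *)buf, sizeof buf);
            dma_start_tx(buf, sizeof buf);
        }

        void receive_done(void)
        {
            /* The DMA wrote RAM behind the cache: invalidate stale lines
               before the CPU reads the buffer. */
            SCB_InvalidateDCache_by_Addr((uint32_t *)buf, sizeof buf);
        }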

    OTOH putting a
    coherent cache on the memory side means extra complication in the bus matrix (I
    do not know what ARM did with their bigger cores). Flash, being
    mainly read-only, does not have this problem.


    Flash still has such issues during updates. I've seen badly made
    systems where things like the flash status register got cached.
    Needless to say, that did not work well! And if you have a bigger
    instruction cache, you have to take care to flush things appropriately
    during software updates.


    But it can also make sense to put small buffers as part of memory
    interface controllers. These are not organized like data or instruction
    caches, but are specific for the type of memory and the characteristics
    of it.

    The point is that in many cases they are organized like classic caches.
    They cover only flash, but how is that different from caches in PCs
    that covered only part of the possible RAM?


    The main differences are the dimensions of the caches, their physical
    and logical location, and the purpose for which they are optimised.

    How this is done depends on details of the interface, details of
    the internal buses, and how the manufacturer wants to implement it. For
    example, on one microcontroller I am using there are queues to let it
    accept multiple flash read/write commands from the AHB bus and the IPS
    bus, but read-ahead is controlled by the burst length of read requests
    from the cross-switch (which in turn will come from cache line fill
    requests from the cpu caches). On a different microcontroller, the
    read-ahead logic is in the flash controller itself as that chip has a
    simpler internal bus where all read requests will be for 32 bits (it has
    no cpu caches). An external DRAM controller, on the other hand, will
    have queues and buffers optimised for multiple smaller transactions and
    be able to hold writes in queues that get lower priority than read requests.
    These sorts of queues and buffers are not generally referred to as
    "caches", because they are specialised queues and buffers. Sometimes
    you might have something that is in effect perhaps a two-way
    single-entry 16 byte wide read-only cache, but using the term "cache"
    here is often confusing. At best it is a "flash controller cache", and
    very distinct from a "cpu cache".

    From STM32F400 reference manual:

    : Instruction cache memory
    :
    : To limit the time lost due to jumps, it is possible to retain 64 lines
    : of 128 bits in an instruction cache memory.

    That is a 1 kB instruction cache. In most of their marketing material they
    say "flash accelerator", but the reference manual admits that this is a
    cache (OK, they also have a prefetch buffer, and possibly "flash accelerator
    = cache + buffer").

    By the time you are talking about 1 KB and 64 lines, "cache" is a
    reasonable term. Many "flash accelerators" have perhaps just two lines.

    Similarly, the documentation of the RP2040 says:

    : An internal cache remembers the contents of recently-accessed flash
    : locations, which accelerates the average bandwidth and latency of
    : the interface.

    Granted, the RP2040 is a rather big chip, but the same thing is used in smaller ones.


    No, it is a small device - it is dual core Cortex-M0+. But it does have
    a surprisingly large XIP flash cache at 16 KB. This is not a cpu cache,
    since it is connected directly to the QSPI flash controller rather than
    the cpu, but it /is/ a cache.


    I agree that behaviour can vary significantly.

    When you have a "flash controller cache" - or read-ahead buffers - you
    typically have something like a 60-80% hit ratio for sequential code and
    nearly 100% for very short loops (like you'd have for a memcpy() loop).
    You have close to 0% hit ratio for branches or calls, regardless of
    whether they are virtual or not (with virtual function dispatch
    generally having one extra indirection at 0% hit rate). This is the
    kind of "cache" you often see in microcontrollers with internal flash
    and clock speeds of up to perhaps 150 MHz, where the flash might be at a
    quarter of the main cpu clock.

    Well, with 64 lines and 2-set associativity the STM cache can give you
    quite a decent hit ratio on branchy code, as long as the working set is
    not too large. It does not need to be a simple loop. Already 3
    lines can be enough if you have a single call to a simple function
    and the call is in a loop (and if you call via a function pointer the
    compiler cannot inline the function). More realistically, 8 lines
    will cover several cases where code jumps between a small number
    of locations.

    Yes, I would call 64 lines a "cache" rather than a "buffer".

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Fri Sep 13 17:32:51 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 13/09/2024 04:16, Waldek Hebisch wrote:
    David Brown <david.brown@hesbynett.no> wrote:


    I don't disagree that other types of buffer can fall under a generic
    concept of "cache". And in some cases they may even have the same
    logical build-up as a tiny and limited version of the cpu caches. But I
    don't think it helps to use exactly the same terms for things that are
    in significantly different places on the chip, with very different
    balances in their designs, and for different effects in their detailed
    working.

    The ARM SMMU caches stream table entries and context table entries,
    not to mention address translations (TLB is a cache, after all) and
    in both cases offers software a way to flush and/or invalidate
    the caches when necessary (i.e. when the translation table is
    updated by software). These are naturally called caches.

    The ARM GIC caches LPI and vLPI properties for frequently used
    interrupt numbers and offers software a way to flush the cache
    when necessary. Likewise naturally called simple caches.

    SoCs may have dozens of internal caches, usually with some programmatic mechanism to flush them; caching everything from encryption keys to
    ethernet MAC addresses.


    It is fair enough to talk about a "flash cache" for buffers that are
    designed somewhat like a cache, with at least two entries indexed by an address hash (usually just some of the lower address bits), and with
    tags including the rest of the address bits. It is misleading for
    systems where you just have a read-ahead buffer or two, or a queue
    system. Unlike the cpu caches, the architecture of such flash
    accelerators varies wildly for different manufacturers and their
    different microcontroller models.

    ARM does provide an instruction to "flush to point of persistence"
    specifically to support NV memory devices directly attached to
    the processor.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Bart on Fri Sep 13 23:01:55 2024
    Bart <bc@freeuk.com> writes:

    On 13/09/2024 00:46, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 12/09/2024 00:47, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    On 11/09/2024 01:02, Ben Bacarisse wrote:
    Bart <bc@freeuk.com> writes:

    Sorry, did your remark above suggest I don't know what an lvalue is?
    That seemed like the obvious explanation for the incorrect information
    you gave. Did you post it /knowing/ what other kinds of things are
    lvalues in C just to confuse people?

    Which incorrect explanation was that?

    I merely said that LHSs of assigments fall into these categories:

    A = Y; // name
    *X = Y; // pointer
    X[i] = Y; // index
    X.m = Y; // member select
    Yes, that incorrect explanation.

    I dispute that. What I said is very broadly correct. But in this newsgroup
    you do like to nitpick.
    Someone who wants to write p->m = 42 would not consider it nitpicking if
    your compiler did not accept that form of LHS.
    But I agree I was simply correcting a small error.

    It didn't seem like it. You keep saying things like this:

    Is there no part of C you can't misrepresent?

    Not about this I didn't. That's another sub-thread about another quite different error. And not the "spin" -- I didn't "keep" saying things
    like that.

    And (see above):

    Yes, that incorrect explanation.

    Suggesting it was completely wrong rather than there being some corner
    cases that might have been included too.

    You asked "which incorrect explanation was that" and then repeated it.
    What should I have said -- yes that explanation that I now concede is
    correct because the missing forms are not important?

    You'd have a point if I was writing some treatise on C and this was
    supposed to be a comprehensive list of all lvalue terms.

    And I'd not have said anything if you had not implied that there were
    only four forms of valid left hand side.

    The context was that I was pointing out the simpler kinds of LHSs that C allows, compared with other languages where you could do stuff like this:

    IF c THEN a ELSE b FI := x

    which was exactly what was being queried.


    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Keith Thompson on Sat Sep 14 15:13:56 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    [...]

    Assuming all that is right, I recommend

    typedef __uint128_t U128;
    typedef __int128_t S128;

    which works in both gcc and clang (I don't know yet about
    Visual Studio).

    The documented names are `__int128` and `unsigned __int128`.

    Both gcc and clang do recognize `__int128_t` and `__uint128_t`,
    but I wouldn't recommend relying on an undocumented feature.

    Both gcc and clang recognized __[u]int128_t in earlier versions
    than they did __int128. The __[u]int128_t types are also
    recognized by the Intel compiler.

    __int128 is treated as a keyword.

    That's another reason to prefer __[u]int128_t types.
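    A minimal sketch using those typedefs (gcc or clang on a 64-bit
    target; printf has no 128-bit conversion, so the value is printed as
    two 64-bit halves):

        #include <stdio.h>

        typedef __uint128_t U128;
        typedef __int128_t  S128;

        int main(void)
        {
            U128 x = (U128)1 << 100;   /* a value needing more than 64 bits */
            printf("%llx%016llx\n",
                   (unsigned long long)(x >> 64),
                   (unsigned long long)x);
            return 0;
        }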

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Keith Thompson on Sat Sep 14 15:07:14 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    [...]

    Have you tried looking at other uses of the word "may" in the C
    standard to see if that sheds some light on the question?

    If you have any actual insight to offer, feel free to do so.

    Is there some reason you mind my asking a civil question?

    In my view there is no question about the intent here. If
    I'm going to try to help you resolve your uncertainty, it
    would be helpful to have a better understanding of your
    reasoning process. That's why I asked the question.

    I got the impression that you were providing vague hints and
    deliberately hiding information. As you know, several other people
    here have gotten the same impression in similar circumstances.
    Perhaps that wasn't your intent, but in my opinion it would be to
    your benefit to be more aware of how you come across.

    You should apply this advice to yourself. I was just about to
    start writing a full explanation when I noticed this paragraph.
    It completely robbed me of any motivation to continue.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sun Sep 15 20:05:47 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Bart <bc@freeuk.com> writes:

    And yes I'm still committed to that symmetry. I've used it for countless
    language implementations. C is little different other than it has a
    700-page standard that suggests a recommended model of how it's supposed to
    work.

    You can't really use that to bash me about the head with and maintain that
    all my ideas about language implementation are wrong because C views
    assignment in its own idiosyncratic manner.

    I don't want to bash you about the head, but what C says about
    assignment has /always/ been the point, and your implementation of C
    will be wrong if you don't follow the rules about C's assignments. You /know/ the LH and RH side of a C assignment have different constraints
    (you have said so yourself) yet you persist in defending your original
    claim that what is needed on the two sides "is exactly the same". You
    must, surely, be arguing simply for the fun of it.

    Tim suggests that there is communication failure here -- that you have
    not expressed what you mean clearly enough. That may be so, but I can't
    see how to interpret what you've written in any other way.

    I'm coming around to the point of view that Bart isn't really
    interested in communicating. He seems not to listen to what
    other people say, and either he can't be bothered to say what
    he really means or he says things in a personal idiosyncratic
    vernacular that no one else understands. I'm okay with people
    who are making a sincere effort to communicate and are just
    having trouble doing so. With Bart though more and more the
    impression I get is that he isn't really trying because at
    some level he doesn't care if he communicates or not.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Tim Rentsch on Mon Sep 16 10:58:53 2024
    On 16/09/2024 04:05, Tim Rentsch wrote:
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Bart <bc@freeuk.com> writes:

    And yes I'm still committed to that symmetry. I've used it for countless
    language implementations. C is little different other than it has a
    700-page standard that suggests a recommended model of how it's supposed to
    work.

    You can't really use that to bash me about the head with and maintain that
    all my ideas about language implementation are wrong because C views
    assignment in its own idiosyncratic manner.

    I don't want to bash you about the head, but what C says about
    assignment has /always/ been the point, and your implementation of C
    will be wrong if you don't follow the rules about C's assignments. You
    /know/ the LH and RH side of a C assignment have different constraints
    (you have said so yourself) yet you persist in defending your original
    claim that what is needed on the two sides "is exactly the same". You
    must, surely, be arguing simply for the fun of it.

    Tim suggests that there is communication failure here -- that you have
    not expressed what you mean clearly enough. That may be so, but I can't
    see how to interpret what you've written in any other way.

    I'm coming around to the point of view that Bart isn't really
    interested in communicating. He seems not to listen to what
    other people say, and either he can't be bothered to say what
    he really means or he says things in a personal idiosyncratic
    vernacular that no one else understands. I'm okay with people
    who are making a sincere effort to communicate and are just
    having trouble doing so. With Bart though more and more the
    impression I get is that he isn't really trying because at
    some level he doesn't care if he communicates or not.

    Trying to communicate in this Standards-obsessed newsgroup is like
    trying to have a meaningful discussion with Bible-bashers.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Bart on Mon Sep 16 11:30:30 2024
    On 2024-09-16, Bart <bc@freeuk.com> wrote:
    Trying to communicate in this Standards-obsessed newsgroup is like
    trying to have a meaningful discussion with Bible-bashers.

    Yes; "I hsve my own personal version of the Bible in which many of its arbitrary stories are otherwise" is probably not a good way to approach
    a Bible study group.


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Kaz Kylheku on Mon Sep 16 14:42:06 2024
    On 16/09/2024 12:30, Kaz Kylheku wrote:
    On 2024-09-16, Bart <bc@freeuk.com> wrote:
    Trying to communicate in this Standards-obsessed newsgroup is like
    trying to have a meaningful discussion with Bible-bashers.

    Yes; "I hsve my own personal version of the Bible in which many of its arbitrary stories are otherwise" is probably not a good way to approach
    a Bible study group.

    So this is a Bible study group now?

    I suspected as much.

    I suppose the diverse requirements of everyday life, and individuals'
    different personalities, responsibilities and attitudes all have to be
    left outside the door.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Bart on Mon Sep 16 17:40:19 2024
    On Mon, 16 Sep 2024 14:42:06 +0100
    Bart <bc@freeuk.com> wrote:

    On 16/09/2024 12:30, Kaz Kylheku wrote:
    On 2024-09-16, Bart <bc@freeuk.com> wrote:
    Trying to communicate in this Standards-obsessed newsgroup is like
    trying to have a meaningful discussion with Bible-bashers.

    Yes; "I hsve my own personal version of the Bible in which many of
    its arbitrary stories are otherwise" is probably not a good way to
    approach a Bible study group.

    So this is a Bible study group now?

    I suspected as much.


    Was it not you who in a recent discussion defended the Orthodox Algolescu
    semantics of the assignment statement against heretics?

    I suppose the diverse requirements of everyday life, and individuals' different personalities, responsibilities and attitudes all have to
    be left outside the door.



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Bart on Mon Sep 16 14:30:39 2024
    On 2024-09-16, Bart <bc@freeuk.com> wrote:
    On 16/09/2024 12:30, Kaz Kylheku wrote:
    On 2024-09-16, Bart <bc@freeuk.com> wrote:
    Trying to communicate in this Standards-obsessed newsgroup is like
    trying to have a meaningful discussion with Bible-bashers.

    Yes; "I hsve my own personal version of the Bible in which many of its
    arbitrary stories are otherwise" is probably not a good way to approach
    a Bible study group.

    So this is a Bible study group now?

    In a way, yes. I would say the analogy holds.
    A big chunk of comp.lang.c is something like "ISO C hermeneutics".

    Computer Science is basically a religious domain full of faiths and
    sects, along lines such as tech stacks and programming languages. These
    have documentation, which is like scripture.

    Things are infinitely flexible; and everything is built on conventions
    for which there are alternatives, and in many cases obviously better
    justified alternatives that would be followed in a green field redesign.

    People who go to Bible studies are almost certainly motivated by faith. Atheists generally aren't going to Bible studies. But in tech, we do
    have plenty of "atheists" doing the analog of Bible study, in order
    to understand the things that are, and get stuff done with them.

    Not everyone who just wants to stick to the topic is necessarily a
    fanatic about that topic who has religious blinders on their eyes.

    I suspected as much.

    I suppose the diverse requirements of everyday life, and individuals' different personalities, responsibilities and attitudes all have to be
    left outside the door.

    I would say no they don't, but there should be a story to frame
    the relevance of anything not left outside the door.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Bart on Mon Sep 16 12:19:17 2024
    On 2024-09-16, Bart <bc@freeuk.com> wrote:
    On 16/09/2024 12:30, Kaz Kylheku wrote:
    ...
    Yes; "I hsve my own personal version of the Bible in which many of its
    arbitrary stories are otherwise" is probably not a good way to approach
    a Bible study group.

    So this is a Bible study group now?

    No, but there are some key similarities. Both bible study groups and
    this newsgroup have an authoritative text to reference. However, the
    nature of that authority is quite different in the two cases. Bible
    study groups believe that the Bible is divinely inspired. Those who are sufficiently familiar with the C standard know that it was created by a committee of experts, fully capable of making mistakes. Many (most?)
    Believers consider the Bible to be incapable of being wrong.

    The C standard is also incapable of being wrong, but in a very different
    sense - the C standard defines C, there is no alternative to compare it
    with, in order to say that the C standard is wrong. The C standard might
    be inconsistent, unimplementable, badly designed, or incomprehensible,
    among many other defects it might have - but as the official definition
    of C, it cannot be wrong.
    Any such defects can be corrected by filing a defect report and
    convincing the committee that the report is correct. If they agree, the
    next version of the standard is likely to contain revised wording to
    address the issue. Try doing that with the Bible.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Kuyper on Mon Sep 16 19:13:07 2024
    On 16/09/2024 18:19, James Kuyper wrote:
    On 2024-09-16, Bart <bc@freeuk.com> wrote:
    On 16/09/2024 12:30, Kaz Kylheku wrote:
    ...
    Yes; "I hsve my own personal version of the Bible in which many of its
    arbitrary stories are otherwise" is probably not a good way to approach
    a Bible study group.

    So this is a Bible study group now?

    No, but there are some key similarities. Both bible study groups and
    this newsgroup have an authoritative text to reference. However, the
    nature of that authority is quite different in the two cases. Bible
    study groups believe that the Bible is divinely inspired. Those who are sufficiently familiar with the C standard know that it was created by a committee of experts, fully capable of making mistakes. Many (most?) Believers consider the Bible to be incapable of being wrong.

    The C standard is also incapable of being wrong, but in a very different sense - the C standard defines C, there is no alternative to compare it
    with, in order to say that the C standard is wrong. The C standard might
    be inconsistent, unimplementable, badly designed, or incomprehensible,
    among many other defects it might have - but as the official definition
    of C, it cannot be wrong.
    Any such defects can be corrected by filing a defect report and
    convincing the committee that the report is correct. If they agree, the
    next version of the standard is likely to contain revised wording to
    address the issue. Try doing that with the Bible.

    At the risk of offending people, I'd say this /has/ been done with the
    Bible countless times. There are dozens of major versions of the Bible
    with different selections of books and sections of the books. There are hundreds of translations for each version, even counting just
    translations into English, based on different source texts and very
    different styles of translation. And that's before you get to major
    re-writes, like Mormonism (though perhaps that's more akin to moving
    from C to Rust).

    Unlike C, it is not a nice linear progression with each new version
    superseding the previous versions. But we still do see some "C90
    fanatics" that are as convinced in their viewpoint as some King James fans!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to James Kuyper on Mon Sep 16 19:26:51 2024
    On 16.09.2024 18:19, James Kuyper wrote:
    On 2024-09-16, Bart <bc@freeuk.com> wrote:
    On 16/09/2024 12:30, Kaz Kylheku wrote:
    ...
    Yes; "I hsve my own personal version of the Bible in which many of its
    arbitrary stories are otherwise" is probably not a good way to approach
    a Bible study group.

    So this is a Bible study group now?

    No, but there are some key similarities. Both bible study groups and
    this newsgroup have an authoritative text to reference. However, the
    nature of that authority is quite different in the two cases. Bible
    study groups believe that the Bible is divinely inspired. Those who are sufficiently familiar with the C standard know that it was created by a committee of experts, fully capable of making mistakes. Many (most?) Believers consider the Bible to be incapable of being wrong.

    But wasn't the Bible created in a ("Chinese whispers"?) way like
    ([God] ->) human -> ... -> human -> Bible write-down
    with (a lot?) of ([possibly] errant) humans in between?

    Disclaimer: I don't know how many instances of "human" were involved.

    Only that there's no way, I suppose, to fix any (even obvious) mistakes
    in the Bible.

    Janis

    [...]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to All on Tue Sep 17 05:46:22 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    [..talking about documents that are at odds with the C standard..]

    Somewhat trickier are the cases where some other document says
    "... the standard says X, but that doesn't make any sense. It's
    clear that they actually meant Y, and you can just write your code accordingly." Tim Rentsch is a prolific source of such comments.

    That is a misrepresentation, if not simply an outright lie.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Tue Sep 17 05:56:02 2024
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-09-08, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 08.09.2024 16:12, James Kuyper wrote:

    On 9/8/24 00:39, Janis Papanagnou wrote:
    ...

    That's why I immediately see the necessity that compiler creators
    need to know them in detail to _implement_ "C". And that's why I
    cannot see how the statement of the C-standard's "most important
    purpose" would

    sound reasonable (to me). ...

    I agree - the most important purpose is for implementors, not
    developers.

    ... I mean, what will a programmer get from the "C" standard that
    a well written text book doesn't provide?

    What the C standard says is more precise and more complete than
    what most textbooks say.

    Exactly. And this precision is what makes the standard often difficult
    to read (for programming purposes for "ordinary" folks).

    The C grammar is not presented in a nice way in ISO C.

    It uses nonsensical categories. For instance a basic expression
    like A is considered a unary-expression. A newcomer looking at the
    grammar for assignment will be wondering how on Earth the A in A = B
    is a unary expression, when it contains no unary operator.

    The unary-expression is not given in the immediately preceding
    section, and no section references are given; you have to go
    searching through the document to find it.

    I also suspect programmers not versed in parsing and grammars will
    not intuit that assignment associates right to left. Someone who
    remembers their compiler course from university will know that the
    right hand side

    "unary-expression assignment-operator assignment-expression"

    has the assignment-expression on the right, and is therefore
    identified as right-recursive, and that leads to right association.

    I strongly suspect that the vast majority of the C coders on the
    planet (as well as users of other languages that have operator
    grammars) refer to operator precedence tables that fit on one
    page, rather than flipping around in a telescopic grammar that
    goes on for pages.

    This critique is off base. The C standard is a reference, not a
    tutorial. If Kaz thinks the grammar given in the C standard is not
    up to the task *for the document it is part of*, he should post a
    proposed replacement.
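
    To make the grammar discussion above concrete: a lone identifier such
    as A derives through primary-expression and postfix-expression to
    unary-expression, which is why the A in A = B satisfies the production
    quoted above, and the right-recursive assignment-expression on the
    right is what makes chained assignment group rightward. A minimal
    standard-C sketch (an illustrative example, not from the thread):

        /* assignment-expression:
         *     conditional-expression
         *     unary-expression assignment-operator assignment-expression
         *
         * The right recursion means  a = b = c = 5  parses as
         * a = (b = (c = 5)).                                          */
        #include <stdio.h>

        int main(void)
        {
            int a, b, c;
            a = b = c = 5;                    /* all three end up as 5 */
            printf("%d %d %d\n", a, b, c);    /* prints "5 5 5"        */
            return 0;
        }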

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Janis Papanagnou on Tue Sep 17 09:27:57 2024
    On 9/16/24 13:26, Janis Papanagnou wrote:
    On 16.09.2024 18:19, James Kuyper wrote:
    On 2024-09-16, Bart <bc@freeuk.com> wrote:
    On 16/09/2024 12:30, Kaz Kylheku wrote:
    ...
    Yes; "I hsve my own personal version of the Bible in which many of its >>>> arbitrary stories are otherwise" is probably not a good way to approach >>>> a Bible study group.

    So this is a Bible study group now?

    No, but there are some key similarities. Both bible study groups and
    this newsgroup have an authoritative text to reference. However, the
    nature of that authority is quite different in the two cases. Bible
    study groups believe that the Bible is divinely inspired. Those who are
    sufficiently familiar with the C standard know that it was created by a
    committee of experts, fully capable of making mistakes. Many (most?)
    Believers consider the Bible to be incapable of being wrong.

    But wasn't the Bible created in a ("Chinese whispers"?) way like
    ([God] ->) human -> ... -> human -> Bible write-down
    with (a lot?) of ([possibly] errant) humans in between?

    Disclaimer: I don't know how many instances of "human" were involved.

    As I understand it, the claim is that God directly inspired the final
    authors. They might have received incorrect information by other means,
    but God would have inspired them to correct it in the version that got committed to paper.
    I've even heard the same claim made for those who merely translated the
    text to a new language, rather than writing it for the first time. In particular, I remember that claim being made for both the Septuagint and
    the King James bible.
    Serious Bible scholars know all about the many variant texts and
    inaccurate translations - but those scholars are not the ones making the
    claims I'm talking about. Also, not all Believers believe that all
    divinely inspired texts are perfect - many believe that every biblical
    text is, at best, somewhat distorted due to the divine inspiration being modified during the process of passing through a human brain.
    I'm an atheist talking about what Believers think - no disrespect is
    intended beyond that inherent in the fact that I am an atheist.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Janis Papanagnou on Tue Sep 17 06:57:25 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 01.09.2024 22:07, Tim Rentsch wrote:

    [...] The most important purpose of
    the ISO C standard is to be read and understood by ordinary C
    developers, not just compiler writers. [...]

    Is that part of a preamble or rationale given in the C standard?

    That target audience would surely surprise me. Myself I've
    programmed in quite some programming languages and never read a
    standard document of the respective language, nor have I yet met
    any programmer who has done so. All programmer folks I know used
    text books to learn and look up things and specific documentation
    that comes with the compiler or interpreter products. (This is of
    course just a personal experience.)

    I've also worked a lot with standards documents in various areas
    (mainly ISO and ITU-T standards but also some others). [..]

    My comment is only about the C standard, not any other standards
    documents.

    That's why I immediately see the necessity that compiler creators
    need to know them in detail to _implement_ "C". And that's why I
    cannot see how the statement of the C-standard's "most important
    purpose" would sound reasonable (to me).

    You're hearing something different than what I said. The C standard
    is not a tutorial, and I didn't say it is. Of course the standard
    is important to implementors, and I didn't say it isn't. It's a
    safe bet that most C developers are not familiar with the C standard
    and I didn't say they were. Despite all that, the most important
    value of the C standard is that it is available to, accessible to, and
    can be read by, ordinary developers. To see why that is, compare
    the C standard to reference documents for other current programming
    languages. It is fairly easy to get a solid sense of exactly what
    the C language allows and what it doesn't ("solid" being the key
    word here). That is not the case for many popular languages today.

    I mean, what will a programmer get from the "C" standard that a
    well written text book doesn't provide?

    The text books being imagined here don't exist, because there is no
    market for them. Very few developers read the C standard. But the
    impact and influence of those who do is much larger than the small
    numbers would suggest.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Tim Rentsch on Tue Sep 17 19:02:55 2024
    On 17.09.2024 15:57, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 01.09.2024 22:07, Tim Rentsch wrote:

    [...] The most important purpose of
    the ISO C standard is to be read and understood by ordinary C
    developers, not just compiler writers. [...]

    Is that part of a preamble or rationale given in the C standard?

    That target audience would surely surprise me. Myself I've
    programmed in quite some programming languages and never read a
    standard document of the respective language, nor have I yet met
    any programmer who has done so. All programmer folks I know used
    text books to learn and look up things and specific documentation
    that comes with the compiler or interpreter products. (This is of
    course just a personal experience.)

    I've also worked a lot with standards documents in various areas
    (mainly ISO and ITU-T standards but also some others). [..]

    My comment is only about the C standard, not any other standards
    documents.

    Yes, that was obvious.

    Are you trying to say that the "C standard" is substantially different
    with respect to "readability" from other standards? - In the context
    of what has been said, that it's a replacement of a textbook (or at
    least maybe a supplement)? - Obviously you seem to agree that it's
    not, since elsethread you say "The C standard is a reference, not a
    tutorial." and I agree with that; since that was actually what I
    expressed (or at least tried to express; sorry, if that was unclear
    to you).

    [...]

    I mean, what will a programmer get from the "C" standard that a
    well written text book doesn't provide?

    The text books being imagined here don't exist, because there is no
    market for them.

    I'm speaking about existing textbooks for programming languages.
    (Not sure what you're reading or implying here.)

    Very few developers read the C standard.

    Yes, that was also my impression. (And I'm sure I know the reason:
    standards are not suited for, and not written for, general programmers;
    they, IMO obviously, have another target group.)

    But the
    impact and influence of those who do is much larger than the small
    numbers would suggest.

    What influence? (I wasn't speaking about any influence or reach.
    I was just speaking about the target reader-audience of standards,
    and about the role of standards and textbooks for programmers.)

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to David Brown on Tue Sep 17 14:08:32 2024
    On 9/16/24 13:13, David Brown wrote:
    On 16/09/2024 18:19, James Kuyper wrote:
    ...
    The C standard is also incapable of being wrong, but in a very different
    sense - the C standard defines C, there is no alternative to compare it
    with, in order to say that the C standard is wrong. The C standard might
    be inconsistent, unimplementable, badly designed, or incomprehensible,
    among many other defects if might have - but as the official definition
    of C, it cannot be wrong.
    Any such defects can be corrected by filing a defect report and
    convincing the committee that the report is correct. If they agree, the
    next version of the standard is likely to contain revised wording to
    address the issue. Try doing that with the Bible.

    At the risk of offending people, I'd say this /has/ been done with the
    Bible countless times. There are dozens of major versions of the Bible
    with different selections of books and sections of the books. There are hundreds of translations for each version, even counting just
    translations into English, based on different source texts and very
    different styles of translation. And that's before you get to major re-writes, like Mormonism (though perhaps that's more akin to moving
    from C to Rust).

    There's a key difference: there's a central authority responsible for C,
    the ISO C committee. Defect reports must be filed with them, and new
    versions of the C standard are issued by them.

    The different versions of the Bible that you refer to generally
    correspond to schisms in the community of Believers, with one version of
    the Bible accepted by one side of the split, and a different version by
    the other side, with neither side accepting the authority of the other
    to determine which version is correct.

    The authority for the Bible that corresponds to the C committee for the
    C standard should be God, but to an atheist like me, it's not clear that
    He's playing any active public role in clarifying which version of the
    Bible should be used. If He's taking any active role, it would appear to
    be in the form of telling individual Believers which version they should believe, with different Believers reporting having gotten different
    advice from Him on the matter.

    Unlike C, it is not a nice linear progression with each new version superseding the previous versions. But we still do see some "C90
    fanatics" that are as convinced in their viewpoint as some King James fans!

    The C90 fanatics do seem to be a good analogy to the Christian
    schismatics. They basically don't accept the authority of ISO to change
    the C standard, despite the fact that it became a standard under ISO
    auspices. However, they are not organized into a coherent group like the schismatic churches have been.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to David Brown on Tue Sep 17 17:32:14 2024
    On 2024-09-16, David Brown <david.brown@hesbynett.no> wrote:
    On 16/09/2024 18:19, James Kuyper wrote:
    Any such defects can be corrected by filing a defect report and
    convincing the committee that the report is correct. If they agree, the
    next version of the standard is likely to contain revised wording to
    address the issue. Try doing that with the Bible.

    At the risk of offending people, I'd say this /has/ been done with the
    Bible countless times.

    However, it was done 1) for ideological reasons, not to actually correct
    any of the obviously ridiculous stuff and 2) while pretending that it's
    all still literally God's word (and believing it).

    It's like if someone took C99 and removed, say, designated initializers, because they just don't like them, and then preached that their version
    of the document is still ISO 9899:1999, with people believing it and
    spreading the falsified document along with that claim that it is
    still the word of the ISO committee.
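
    For readers who have not met the feature named in this analogy:
    designated initializers, added in C99, let an initializer name the
    struct member or array element it sets. A minimal illustrative
    sketch (standard C99, not tied to any post in the thread):

        #include <stdio.h>

        struct point { int x, y; };

        int main(void)
        {
            struct point p = { .y = 2, .x = 1 };  /* members in any order */
            int a[5] = { [4] = 9 };               /* a[0..3] zero, a[4] 9 */
            printf("%d %d %d\n", p.x, p.y, a[4]); /* prints "1 2 9"       */
            return 0;
        }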

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Janis Papanagnou on Tue Sep 17 16:26:34 2024
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 17.09.2024 15:57, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 01.09.2024 22:07, Tim Rentsch wrote:

    [...] The most important purpose of
    the ISO C standard is to be read and understood by ordinary C
    developers, not just compiler writers. [...]

    Is that part of a preamble or rationale given in the C standard?

    That target audience would surely surprise me. Myself I've
    programmed in quite some programming languages and never read a
    standard document of the respective language, nor have I yet met
    any programmer who has done so. All programmer folks I know used
    text books to learn and look up things and specific documentation
    that comes with the compiler or interpreter products. (This is of
    course just a personal experience.)

    I've also worked a lot with standards documents in various areas
    (mainly ISO and ITU-T standards but also some others). [..]

    My comment is only about the C standard, not any other standards
    documents.

    Yes, that was obvious.

    Are you trying to say that the "C standard" is substantially different
    with respect to "readability" from other standards?

    To other language reference documents - yes.

    - In the context
    of what has been said, that it's a replacement of a textbook (or at
    least maybe a supplement)?

    I would say complement. These days most language technical material
    is written in a tutorial style, sometimes overly so. Also, such
    material almost always glosses over some of the technical details.

    [...]

    I mean, what will a programmer get from the "C" standard that a
    well written text book doesn't provide?

    The text books being imagined here don't exist, because there is no
    market for them.

    I'm speaking about existing textbooks for programming languages.
    (Not sure what you're reading or implying here.)

    The books of interest would have to not merely exist already but also
    be comparably well-written and cover the same ground. I believe
    there are no such books.

    Very few developers read the C standard.

    Yes, that was also my impression. (And I'm sure I know the reason:
    standards are not suited for, and not written for, general programmers;
    they, IMO obviously, have another target group.)

    I would say the ISO C standard is meant to be read by experienced
    software developers. Not beginners, but ordinary developers who
    have gained some level of proficiency in the art.

    But the
    impact and influence of those who do is much larger than the small
    numbers would suggest.

    What influence?

    You are getting some of that effect by participating in the
    newsgroup here, and the longer you stay the more you will
    get (perhaps up to the point where you start reading the C
    standard yourself).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Kaz Kylheku on Wed Sep 18 09:44:58 2024
    On 17/09/2024 19:32, Kaz Kylheku wrote:
    On 2024-09-16, David Brown <david.brown@hesbynett.no> wrote:
    On 16/09/2024 18:19, James Kuyper wrote:
    Any such defects can be corrected by filing a defect report and
    convincing the committee that the report is correct. If they agree, the
    next version of the standard is likely to contain revised wording to
    address the issue. Try doing that with the Bible.

    At the risk of offending people, I'd say this /has/ been done with the
    Bible countless times.

    However, it was done 1) for ideological reasons, not to actually correct
    any of the obviously ridiculous stuff and

    Well, it's debatable how often this has been done for ideological
    reasons, and how often the motives have been money or power. But at
    least sometimes, changes to the Bible have been ideologically motivated.
    Motivations for changes to the C standards have also been a bit of a
    mixed bag (though for the defect reports, as far as I have seen, it's
    been simple clarification or the fixing of definite errors or omissions).

    2) while pretending that it's
    all still literally God's word (and believing it).


    Some people pretend to believe, some people believe the pretence. And
    maybe some of the believers are right about some things - some key
    religious beliefs are not amenable to scientific fact-checking.

    It's like if someone took C99 and removed, say, designated initializers, because they just don't like them, and then preached that their version
    of the document is still ISO 9899:1999, with people believing it and spreading the falsified document along with that claim that it is
    still the word of the ISO committee.


    I think the analogies between the Bible and the C standards might be
    breaking down at this point!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Kuyper on Wed Sep 18 10:05:33 2024
    On 17/09/2024 20:08, James Kuyper wrote:
    On 9/16/24 13:13, David Brown wrote:
    On 16/09/2024 18:19, James Kuyper wrote:
    ...
    The C standard is also incapable of being wrong, but in a very different >>> sense - the C standard defines C, there is no alternative to compare it
    with, in order to say that the C standard is wrong. The C standard might >>> be inconsistent, unimplementable, badly designed, or incomprehensible,
    among many other defects it might have - but as the official definition
    of C, it cannot be wrong.
    Any such defects can be corrected by filing a defect report and
    convincing the committee that the report is correct. If they agree, the
    next version of the standard is likely to contain revised wording to
    address the issue. Try doing that with the Bible.

    At the risk of offending people, I'd say this /has/ been done with the
    Bible countless times. There are dozens of major versions of the Bible
    with different selections of books and sections of the books. There are
    hundreds of translations for each version, even counting just
    translations into English, based on different source texts and very
    different styles of translation. And that's before you get to major
    re-writes, like Mormonism (though perhaps that's more akin to moving
    from C to Rust).

    There's a key difference: there's a central authority responsible for C,
    the ISO C committee. Defect reports must be filed with them, and new
    versions of the C standard are issued by them.

    There is that difference, yes. But the central authority is quite aloof
    - few of them deign to communicate directly to us mere mortals very
    often, and it is not uncommon to hear in discussions about the standards
    "when they wrote /this/, they really meant /that/" or fixed beliefs
    about how parts of the standard are to be interpreted.

    And there was a central (very human) authority responsible for the
    original formation of the Bible - not for writing the individual books,
    but at least for choosing which books were included in the anthology.


    However, the C standards have a much clearer and more realistic
    procedure for changes - both small fixes (defect reports) and major
    changes (new versions).


    The different versions of the Bible that you refer to generally
    correspond to schisms in the community of Believers, with one version of
    the Bible accepted by one side of the split, and a different version by
    the other side, with neither side accepting the authority of the other
    to determine which version is correct.

    Yes.

    But even within the groups that nominally accept the same version of the
    Bible, there are subgroups that rely on different translations with significantly different interpretations or implications.


    The authority for the Bible that corresponds to the C committee for the
    C standard should be God, but to an atheist like me, it's not clear that
    He's playing any active public role in clarifying which version of the
    Bible should be used. If He's taking any active role, it would appear to
    be in the form of telling individual Believers which version they should believe, with different Believers reporting having gotten different
    advice from Him on the matter.

    Unlike C, it is not a nice linear progression with each new version
    superseding the previous versions. But we still do see some "C90
    fanatics" that are as convinced in their viewpoint as some King James fans!

    The C90 fanatics do seem to be a good analogy to the Christian
    schismatics. They basically don't accept the authority of ISO to change
    the C standard, despite the fact that it became a standard under ISO auspices. However, they are not organized into a coherent group like the schismatic churches have been.


    I think that for any belief or cause - good or bad, real or imagined -
    there are always some people who follow it so fanatically that it is
    their fanaticism that dominates, not the belief or cause.


    Perhaps we have milked this comparison enough - it's not really good
    c.l.c. topicality. I personally find religious history and related
    topics fascinating, and am easily drawn into discussions about it, but
    there are no doubt more appropriate arenas.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to David Brown on Wed Sep 18 07:27:22 2024
    On 9/18/24 04:05, David Brown wrote:
    ...
    Perhaps we have milked this comparison enough - it's not really good
    c.l.c. topicality. ...

    My purpose in starting this sub-thread was to deny the validity of the comparison of this newsgroup to a Bible study group. Topicality was out
    the window from that point onward.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Kuyper on Wed Sep 18 14:15:05 2024
    On 18/09/2024 13:27, James Kuyper wrote:
    On 9/18/24 04:05, David Brown wrote:
    ...
    Perhaps we have milked this comparison enough - it's not really good
    c.l.c. topicality. ...

    My purpose in starting this sub-thread was to deny the validity of the comparison of this newsgroup to a Bible study group. Topicality was out
    the window from that point onward.


    Sure. And I think it was a valid point. I just know that /I/ easily
    stray further off-topic with some subjects, such as this one. (Lead me
    not into temptation - I can find the way myself :-) )

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@fricas.org@21:1/5 to Tim Rentsch on Wed Sep 18 15:28:11 2024
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 17.09.2024 15:57, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 01.09.2024 22:07, Tim Rentsch wrote:

    [...] The most important purpose of
    the ISO C standard is to be read and understood by ordinary C
    developers, not just compiler writers. [...]

    Is that part of a preamble or rationale given in the C standard?

    That target audience would surely surprise me. Myself I've
    programmed in quite some programming languages and never read a
    standard document of the respective language, nor did I yet met
    any programmer who have done so. All programmer folks I know used
    text books to learn and look up things and specific documentation
    that comes with the compiler or interpreter products. (This is of
    course just a personal experience.)

    I've also worked a lot with standards documents in various areas
    (mainly ISO and ITU-T standards but also some others). [..]

    My comment is only about the C standard, not any other standards
    documents.

    Yes, that was obvious.

    Are you trying to say that the "C standard" is substantially different
    with respect to "readability" from other standards?

    To other language reference documents - yes.

    Compared to ISO 10206 (Extended Pascal) I find the C standard much
    less readable. So there is a difference, but in the opposite direction
    to the one you suggest. The main thing is that the C standard is written
    in a "lawyerish" style, which is confusing to programmers. OTOH
    ISO 10206 is written in a precise technical style, which is
    easier. Of course, neither is light reading.


    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to antispam@fricas.org on Sat Sep 21 06:00:50 2024
    antispam@fricas.org writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 17.09.2024 15:57, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 01.09.2024 22:07, Tim Rentsch wrote:

    [...] The most important purpose of
    the ISO C standard is to be read and understood by ordinary C
    developers, not just compiler writers. [...]

    Is that part of a preamble or rationale given in the C standard?

    That target audience would surely surprise me. Myself I've
    programmed in quite some programming languages and never read a
    standard document of the respective language, nor did I yet met
    any programmer who have done so. All programmer folks I know used
    text books to learn and look up things and specific documentation
    that comes with the compiler or interpreter products. (This is of
    course just a personal experience.)

    I've also worked a lot with standards documents in various areas
    (mainly ISO and ITU-T standards but also some others). [..]

    My comment is only about the C standard, not any other standards
    documents.

    Yes, that was obvious.

    Are you trying to say that the "C standard" is substantially different
    with respect to "readability" from other standards?

    To other language reference documents - yes.

    Compared to ISO 10206 (Extended Pascal) I find the C standard much
    less readable. So there is a difference, but in the opposite direction
    to the one you suggest. The main thing is that the C standard is written
    in a "lawyerish" style, which is confusing to programmers. OTOH
    ISO 10206 is written in a precise technical style, which is
    easier. Of course, neither is light reading.

    I don't necessarily agree with all of your conclusions. In
    any case I never claimed that the ISO C standard is the most
    readable language reference document ever written. Even if
    I were later to decide that the Extended Pascal standard is
    better, that doesn't contradict anything I said above.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)