• Re: Baby X is born again

    From bart@21:1/5 to Malcolm McLean on Tue Jun 11 13:59:23 2024
    On 11/06/2024 10:13, Malcolm McLean wrote:
    I've finally got Baby X (not the resource compiler, the Windows toolkit)
    to link X11 on my Mac. And I can start work on it again. But it was far
    from easy to get it to link.

    Can friendly people please download it and see if it compiles on other platforms?



    In src\windows, I tried compiling both main.c and testbed.c (not using
    any makefiles, just directly applying a compiler).

    They both call a startbabyx() function which is incompatible with the
    one defined in BabyX.h.

    It seems they pass an extra HINSTANCE first parameter. If I switch from
    WinMain() to main(), then testbed.c compiles - by itself. (Note that
    Windows GUI apps don't need WinMain; they work just as well with main.)

    I tried another module, BBX_Canvas.h. gcc complained about a type
    mismatch (but it is gcc 14.1, which is stricter).

    tcc said it couldn't find windowsx.h.

    My mcc had problems with UINT32 which doesn't seem to be defined
    anywhere that I could see.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Malcolm McLean on Tue Jun 11 15:16:00 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    I've finally got Baby X (not the resource compiler, the Windows toolkit) to link X11 on my Mac. And I can start work on it again. But it was far from easy to get it to link.

    Can friendly people please download it and see if it compiles on other platforms?

    Compiles and the test programs run on Ubuntu 24.04 (LTS).

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Bonita Montero on Tue Jun 11 18:09:41 2024
    On 11/06/2024 17:02, Bonita Montero wrote:
    Am 11.06.2024 um 11:13 schrieb Malcolm McLean:

    I've finally got Baby X (not the resource compiler, the Windows
    toolkit) to link X11 on my Mac. And I can start work on it again. But
    it was far from easy to get it to link.
    Can friendly people please download it and see if it compiles on other
    platforms?

    For large files it would be more convenient to have an .obj output in
    the proper format for Windows or Linux. I implemented a binary file to
    char-array compiler myself, and for large files the compilation time was
    totally intolerable; all the compilers I tested (g++, clang++, MSVC)
    ran into out-of-memory conditions sooner or later, depending on the
    size of the char array.

    A char array initialised a byte at a time?

    That is going to be inefficient.

    Instead of a large array like {65, 65, 66 ...} try instead generating a
    single string like:

    "\101\102\103..."
    "\x41\x42\x43..."

    Or maybe a number of shorter strings with a few dozen values per line.

    Each string should be represented internally by the compiler as a single
    object occupying one byte per element, instead of dozens or even
    hundreds of bytes per element.
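    A minimal sketch of such a generator, assuming nothing about any posted
    tool (escape_line is a hypothetical name): it escapes each byte as \xNN
    so a chunk of binary data becomes one C string expression rather than
    millions of small initialiser expressions.

    ```c
    #include <stdio.h>

    /* Illustrative sketch: turn up to n bytes into one line of a C string
       literal using \xNN escapes. Returns the number of characters written. */
    static size_t escape_line(char *buf, const unsigned char *data, size_t n)
    {
        size_t pos = 0;
        buf[pos++] = '"';
        for (size_t i = 0; i < n; i++)
            pos += (size_t)sprintf(buf + pos, "\\x%02x", data[i]);
        buf[pos++] = '"';
        buf[pos] = '\0';
        return pos;
    }

    int main(void)
    {
        const unsigned char bytes[] = { 0x41, 0x42, 0x43 };
        char line[64];
        escape_line(line, bytes, sizeof bytes);
        puts(line);   /* prints "\x41\x42\x43" including the quotes */
        return 0;
    }
    ```

    A real tool would loop over the input file emitting a few dozen bytes
    per line, relying on the compiler to concatenate adjacent literals.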

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kalevi Kolttonen@21:1/5 to Malcolm McLean on Tue Jun 11 18:21:03 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
    Can friendly people please download it and see if it compiles on other platforms?

    Fully up-to-date Fedora 40 with the following GCC:

    kalevi@lappari ~$ rpm -qi gcc|head -4
    Name : gcc
    Version : 14.1.1
    Release : 4.fc40
    Architecture: x86_64

    The build fails:

    [ 85%] Building C object CMakeFiles/babyxfs_shell.dir/babyxfs_src/shell/bbx_fs_shell.c.o
    /home/kalevi/tmp/babyx/babyxrc/babyxfs_src/shell/bbx_fs_shell.c: In function ‘cp’:
    /home/kalevi/tmp/babyx/babyxrc/babyxfs_src/shell/bbx_fs_shell.c:503:9: error: ‘return’ with no value, in function returning non-void [-Wreturn-mismatch]
      503 |         return;
          |         ^~~~~~
    /home/kalevi/tmp/babyx/babyxrc/babyxfs_src/shell/bbx_fs_shell.c:481:12: note: declared here
      481 | static int cp(BBX_FS_SHELL *shell, int argc, char **argv)
          |            ^~
    /home/kalevi/tmp/babyx/babyxrc/babyxfs_src/shell/bbx_fs_shell.c: In function ‘bbx_fs_system’:
    /home/kalevi/tmp/babyx/babyxrc/babyxfs_src/shell/bbx_fs_shell.c:656:9: warning: ‘strncat’ specified bound 1024 equals destination size [-Wstringop-overflow=]
      656 |     strncat(line, " ", 1024);
          |     ^~~~~~~~~~~~~~~~~~~~~~~~
    make[2]: *** [CMakeFiles/babyxfs_shell.dir/build.make:160: CMakeFiles/babyxfs_shell.dir/babyxfs_src/shell/bbx_fs_shell.c.o] Error 1
    make[1]: *** [CMakeFiles/Makefile2:341: CMakeFiles/babyxfs_shell.dir/all] Error 2
    make: *** [Makefile:91: all] Error 2


    After fixing line 503 to be "return 0;", the build completed
    and produced the executables.

    But you should also address the -Wstringop-overflow warning.
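    For that warning, a hedged sketch of the usual fix (append_word is an
    illustrative helper, not from the Baby X sources): the bound passed to
    strncat must be the space remaining after the current contents and the
    terminator, not the destination's total size.

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Sketch of the -Wstringop-overflow fix: pass strncat the REMAINING
       capacity (cap - strlen(line) - 1), so it can never write past the
       end of the buffer. */
    static void append_word(char *line, size_t cap, const char *word)
    {
        size_t used = strlen(line);
        if (used + 1 < cap)
            strncat(line, word, cap - used - 1);
    }

    int main(void)
    {
        char line[8] = "abc";
        append_word(line, sizeof line, " defgh");  /* truncates safely */
        puts(line);   /* prints "abc def" */
        return 0;
    }
    ```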

    I also got a warning about tmpnam() being dangerous and
    a suggestion to use mkstemp() instead.
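    For the tmpnam() warning, the standard POSIX replacement looks roughly
    like this (the path template is illustrative): mkstemp() creates and
    opens the file in one atomic step, avoiding the race between choosing
    a name and creating the file that makes tmpnam() dangerous.

    ```c
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /* mkstemp() replaces the trailing XXXXXX in place and returns an open
       file descriptor for the newly created file. */
    int main(void)
    {
        char path[] = "/tmp/babyx_XXXXXX";   /* illustrative template */
        int fd = mkstemp(path);
        if (fd < 0) {
            perror("mkstemp");
            return 1;
        }
        printf("temporary file: %s\n", path);
        close(fd);
        unlink(path);   /* clean up */
        return 0;
    }
    ```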

    br,
    KK

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Bonita Montero on Tue Jun 11 20:31:56 2024
    On 11/06/2024 19:22, Bonita Montero wrote:
    Am 11.06.2024 um 19:09 schrieb bart:

    A char array initialised a byte at a time?
    That is going to be inefficient.

    xxd does it that way.

    Which option is that?

    All recent discussions of xxd have used the '-i' option which writes out
    the data as individual hex bytes such as '0x41,'.

    The most compact option appears to be '-ps', but that is not a C data
    format.

    Or do you mean that xxd does a byte at a time, and so your version did
    the same?

    In that case don't be afraid to do your own thing if it is better.

    I've just done a test. First writing a 5MB binary as 5 million
    individual bytes, one per line. Compiling that with gcc took 15 seconds.

    Then I wrote it as a single string full of hex codes as I suggested.

    Now compilation took 1.3 seconds.

    Using Tiny C, compile time dropped from 1.75 seconds to under 0.3
    seconds. That is roughly 12 times faster with gcc and 6 times with Tiny
    C; string data is always faster.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Malcolm McLean on Wed Jun 12 00:34:43 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 11/06/2024 15:16, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    I've finally got Baby X (not the resource compiler, the Windows toolkit) to
    link X11 on my Mac. And I can start work on it again. But it was far from
    easy to get it to link.

    Can friendly people please download it and see if it compiles on other
    platforms?
    Compiles and the test programs run on Ubuntu 24.04 (LTS).

    Oh brilliant.

    My installation may not be typical in that I have a lot of -dev packages
    installed, but it will be similar to those of others who develop software.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bonita Montero on Wed Jun 12 09:01:58 2024
    On 12/06/2024 07:40, Bonita Montero wrote:
    Am 11.06.2024 um 18:15 schrieb Malcolm McLean:

    These are Baby programs. But they use a cut down GUI. So they need to
    get fonts and images into the program somehow. And so Baby X does that
    by converting to 32 bit C arrays which can be compiled and linked as
    normal. And for that, you need a tool. Writing a tiff file decoder is
    not a trivial exercise.

    I converted my code into something that produces a C string as output.
    Printing that is still very fast, i.e. the files produced are written
    at about 2.6 GiB/s. But the problem is still that the compilers don't
    parse large files; they quit with an out-of-memory error. So having an
    .obj output along with a small header file would be best.


    How big are the files you are talking about? In an earlier thread (which I
    thought had beaten this topic to death), "xxd -i" include files were
    fine to at least a few tens of megabytes with gcc. And it would be,
    IMHO, absurd to have much bigger files than that embedded with your
    executable in this manner. I can understand wanting some icons and a
    few resource files in a PC executable, but if you have a lot of files or
    big files then a single massive executable often does not make much
    sense as the binary file.

    If you /do/ want such a file, it is typically for making a portable
    package that can be run directly without installing. But then you don't
    mess around with inventing your own little pretend file systems, or
    embedding the files manually, or using absurd ideas like XML text
    strings. You use standard, well-established solutions and tools such
    as AppImage on Linux or self-extracting zip files on Windows.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Wed Jun 12 12:27:13 2024
    On 12/06/2024 08:01, David Brown wrote:
    On 12/06/2024 07:40, Bonita Montero wrote:

    I converted my code into something that produces a C string as output.
    Printing that is still very fast, i.e. the files produced are written
    at about 2.6 GiB/s. But the problem is still that the compilers don't
    parse large files; they quit with an out-of-memory error. So having an
    .obj output along with a small header file would be best.


    How big are the files you are talking about?  In an earlier thread (which I
    thought had beaten this topic to death), "xxd -i" include files were
    fine to at least a few tens of megabytes with gcc.

    What was never discussed is why xxd (and the faster alternatives that
    some posted) produces lists of numbers anyway.

    Why not strings containing the embedded binary data?


      And it would be,
    IMHO, absurd to have much bigger files than that embedded with your executable in this manner.


    BM complained that some files expressed as xxd-like output were causing problems with compilers.

    I suggested using a string representation. While the generated text file
    is not much smaller, it is seen by the compiler as one string
    expression, instead of millions of small expressions. Or at least,
    1/20th the number if you split the strings across lines.

    It's a no-brainer. Why spend 10 times as long on processing such data?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Bonita Montero on Wed Jun 12 14:13:24 2024
    On 12/06/2024 13:35, Bonita Montero wrote:
    Am 12.06.2024 um 13:27 schrieb bart:

    I suggested using a string representation. While the generated text
    file  is not much smaller, it is seen by the compiler as one string
    expression, instead of millions of small expressions. Or at least,
    1/20th the number if you split the strings across lines.

    I implemented that with my second code but the compilers are still
    limited with that.

    What size of file are we talking about, and how much memory does your
    machine have?

    I need another test with a 55MB test file. These are the results:

               {65,66,67,...      "\x41\x42\x43...

    g++       284 seconds        13   seconds
    tcc        20 seconds         2.5 seconds

    A 20x slowdown suggests problems exceeding memory.

    Do you have a test file that does work in either format? If so, how much
    difference was there between them? If very little, then you're doing
    something wrong.

    (I did one more test with my language which directly imported the 55MB
    binary without either of those intermediate textual formats. It took
    under 0.7 seconds.

    That needs built-in language support, but using a more apt textual
    format is a sensible first step.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Bonita Montero on Wed Jun 12 14:52:33 2024
    On 12/06/2024 14:43, Bonita Montero wrote:
    Am 12.06.2024 um 15:13 schrieb bart:

    I need another test with a 55MB test file. These are the results:

               {65,66,67,...      "\x41\x42\x43...

    g++       284 seconds        13   seconds
    tcc        20 seconds         2.5 seconds


    I just tested this with my personal backup.


    I meant to say 'I did' rather than 'I need'.

    I keep forgetting that this forum doesn't allow editing.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Wed Jun 12 15:46:44 2024
    On 12/06/2024 13:27, bart wrote:
    On 12/06/2024 08:01, David Brown wrote:
    On 12/06/2024 07:40, Bonita Montero wrote:

    I converted my code into something that produces a C string as output.
    Printing that is still very fast, i.e. the files produced are written
    at about 2.6 GiB/s. But the problem is still that the compilers don't
    parse large files; they quit with an out-of-memory error. So having an
    .obj output along with a small header file would be best.


    How big are the files you are talking about?  In an earlier thread (which I
    thought had beaten this topic to death), "xxd -i" include files were
    fine to at least a few tens of megabytes with gcc.

    What was never discussed is why xxd (and the faster alternatives that
    some posted) produces lists of numbers anyway.

    Why not strings containing the embedded binary data?


    There are some cases where lists of numbers would be usable while
    strings would not be. But I suppose the opposite will apply too.

    While string literals can contain embedded null characters (a string
    literal in C does not have to be a string), I don't feel as comfortable
    using a messy string literal full of escape codes for binary data. A
    list of hex numbers, with appropriate line lengths, is also vastly
    neater if you need to look at the data (or accidentally open it in a
    text editor).

    I also don't imagine that string literals would be much faster for
    compilation, at least for file sizes that I think make sense. And I
    have heard (it could be wrong) that MSVC has severe limits on the size
    of string literals, though it is not a compiler I ever use myself.

    But of course, if you prefer string literals, use them. I don't think
    xxd can generate them, but it should not be hard to write a program that
    does.


      And it would be, IMHO, absurd to have much bigger files than that
    embedded with your executable in this manner.


    BM complained that some files expressed as xxd-like output were causing problems with compilers.

    I suggested using a string representation. While the generated text file
    is not much smaller, it is seen by the compiler as one string
    expression, instead of millions of small expressions. Or at least,
    1/20th the number if you split the strings across lines.

    It's a no-brainer. Why spend 10 times as long on processing such data?


    10 times negligible is still negligible.

    But to be clear, I'd still rate a string literal like this as vastly
    nicer than some XML monstrosity!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Wed Jun 12 15:20:59 2024
    On 12/06/2024 11:51, Malcolm McLean wrote:
    On 12/06/2024 08:01, David Brown wrote:
    On 12/06/2024 07:40, Bonita Montero wrote:
    Am 11.06.2024 um 18:15 schrieb Malcolm McLean:

    These are Baby programs. But they use a cut down GUI. So they need
    to get fonts and images into the program somehow. And so Baby X does
    that by converting to 32 bit C arrays which can be compiled and
    linked as normal. And for that, you need a tool. Writing a tiff file
    decoder is not a trivial exercise.

    I converted my code into something that produces a C string as output.
    Printing that is still very fast, i.e. the files produced are written
    at about 2.6 GiB/s. But the problem is still that the compilers don't
    parse large files; they quit with an out-of-memory error. So having an
    .obj output along with a small header file would be best.


    How big are the files you are talking about?  In an earlier thread (which I
    thought had beaten this topic to death), "xxd -i" include files were
    fine to at least a few tens of megabytes with gcc.  And it would be,
    IMHO, absurd to have much bigger files than that embedded with your
    executable in this manner.  I can understand wanting some icons and a
    few resource files in a PC executable, but if you have a lot of files
    or big files then a single massive executable often does not make much
    sense as the binary file.

    If you /do/ want such a file, it is typically for making a portable
    package that can be run directly without installing.  But then you
    don't mess around with inventing your own little pretend file systems,
    or embedding the files manually, or using absurd ideas like XML text
    strings.   You use standard, well-established solutions and tools such
    as AppImage on Linux or self-extracting zip files on Windows.


    You don't get what Baby X is all about.

    True.


    These solutions will not work for the audience I am trying to target.
    Baby X is clean, portable, and simple. As much as I can make it. And
    it's meant to be easy for people who are just beginning programmers to use.

    Who are these people? And why would they care if it is "clean",
    whatever you mean by that? Why would they care about portability? The
    great majority of developers spend most of their time targeting a single
    platform. Beginners are unlikely to have more than one OS to work with.
    I think cross-platform portability between Linux and Windows is
    usually a good thing, but not a big issue for beginners. Macs and other
    systems are irrelevant in practice. And no one - apart from you and a
    guy called Paul - has the slightest interest in making gui toolkits in
    strict C89/C90. To the nearest percent, no one, beginner or expert,
    writes gui programs in C.

    "Simple" is good. Big toolkits like GTK, wxWidgets or Qt can easily be overwhelming for beginners. But too simple and limited is not helpful
    either - once the beginner has made their "Hello, world" gui with an OK
    button, they need more. Why should anyone pick Baby X rather than, say,
    FLTK?


    But the main focus now is help and documentation. The improved ls
    command is now in the shell, and the next task is to improve the "help" command, at the same time as writing more docs. The two tasks naturally
    go together, and the website is beginning to gel.


    To be clear here, I do not want to discourage you from your project in
    any way. I am trying to ask questions to make you think, and to focus
    appropriately. It seems to me that Baby X is your real passion here and
    the project that you think will be useful to others (regardless of what
    I may think of it). I believe your "Filesystem XML" and, even more so,
    your shell and utilities like "ls", are a distraction and a rabbit hole.
    It does not make sense to spend months developing them to save the
    user a couple of seconds packing or unpacking the XML file to normal
    files.

    Help, documentation and examples are going to be much more valuable to
    users.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Thu Jun 13 00:29:33 2024
    On Wed, 12 Jun 2024 15:46:44 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    I also don't imagine that string literals would be much faster for compilation, at least for file sizes that I think make sense.

    Just shows how little you know about the internals of a typical compiler.
    Which, by itself, is OK. What is not OK is that, with your level of
    knowledge, you have the nerve to argue with bart, who obviously knows a
    lot more.

    And I
    have heard (it could be wrong) that MSVC has severe limits on the
    size of string literals, though it is not a compiler I ever use
    myself.


    Citation, please.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Thu Jun 13 13:53:54 2024
    On 12/06/2024 23:29, Michael S wrote:
    On Wed, 12 Jun 2024 15:46:44 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    I also don't imagine that string literals would be much faster for
    compilation, at least for file sizes that I think make sense.

    Just shows how little you know about the internals of a typical compiler.
    Which, by itself, is OK. What is not OK is that, with your level of
    knowledge, you have the nerve to argue with bart, who obviously knows a
    lot more.


    I know more than most C programmers about how certain C compilers work,
    and what works well with them, and what is relevant for them - though I certainly don't claim to know everything. Obviously Bart knows vastly
    more about how /his/ compiler works. He also tends to do testing with
    several small and odd C compilers, which can give interesting results
    even though they are of little practical relevance for real-world C
    development work.

    Testing a 1 MB file of random data, gcc -O2 took less than a second to
    compile it. One megabyte is about the biggest size I would think makes
    sense to embed directly in C code unless you are doing something very
    niche - usually if you need that much data, you'd be better off with
    separate files and standardised packaging systems like zip files,
    installer setup.exe builds, or that kind of thing.

    Using string literals, the compile time was shorter, but when you are
    already below a second, it's all just irrelevant noise.

    For much bigger files, string literals are likely to be faster for
    compilation for gcc because the compiler does not track as much
    information (for use in diagnostic messages). But it makes no
    difference to real world development.

    And I
    have heard (it could be wrong) that MSVC has severe limits on the
    size of string literals, though it is not a compiler I ever use
    myself.


    Citation, please.


    <https://letmegooglethat.com/?q=msvc+string+literal+length+limit>

    Actually, I think it was from Bart that I first heard that MSVC has
    limitations on its string literal lengths, but I could well be
    misremembering that. I am confident, however, that it was here in
    c.l.c., as MSVC is not a tool I have used myself.

    <https://learn.microsoft.com/en-us/cpp/cpp/string-and-character-literals-cpp>

    It seems that version 17.0 has removed the arbitrary limits, while
    before that it was limited to 65K in their C++ compiler.

    For the MSVC C compiler, I see this:

    <https://learn.microsoft.com/en-us/cpp/c-language/maximum-string-length>

    Each individual string is up to 2048 bytes, which can be concatenated to
    a maximum of 65K in total.

    I see other links giving different values, but I expect the MS ones to
    be authoritative. It is possible that newer versions of their C
    compiler have removed the limit, just as for their C++ compiler, but it
    was missing from that webpage.

    (And I noticed also someone saying that MSVC is 70x faster at using
    string literals compared to lists of integers for array initialisation.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Jun 13 14:46:32 2024
    On 13/06/2024 12:53, David Brown wrote:
    On 12/06/2024 23:29, Michael S wrote:
    On Wed, 12 Jun 2024 15:46:44 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    I also don't imagine that string literals would be much faster for
    compilation, at least for file sizes that I think make sense.

    Just shows how little you know about the internals of a typical compiler.
    Which, by itself, is OK. What is not OK is that, with your level of
    knowledge, you have the nerve to argue with bart, who obviously knows a
    lot more.


    I know more than most C programmers about how certain C compilers work,
    and what works well with them, and what is relevant for them - though I certainly don't claim to know everything.  Obviously Bart knows vastly
    more about how /his/ compiler works.  He also tends to do testing with several small and odd C compilers, which can give interesting results
    even though they are of little practical relevance for real-world C development work.

    Testing a 1 MB file of random data, gcc -O2 took less than a second to compile it.  One megabyte is about the biggest size I would think makes sense to embed directly in C code unless you are doing something very
    niche - usually if you need that much data, you'd be better off with
    separate files and standardised packaging systems like zip files,
    installer setup.exe builds, or that kind of thing.

    Here are some tests embedding a 1.1 MB binary on my machine:

                  Numbers      One string

    gcc 14.1 -O0  3.2 (0.2)    0.4  (0.2)   seconds

    tcc           0.4 (0.03)   0.07 (0.03)

    Using 'One string' makes gcc as fast as Tiny C working with 'Numbers'!

    The figures in brackets are the build times for hello.c, to better
    appreciate the differences.

    Including those overheads, 'One string' makes gcc 8 times as fast as
    with 'Numbers'. Excluding those overheads, it is 15 times as fast
    (3.0 vs 0.2).

    For comparison, here is the timing for my non-C compiler using direct
    embedding:

    mm            0.05 (0.03)

    The extra time compared with 'hello' is 20ms; tcc was 370/40ms, and gcc
    was 3000/200ms.

    Using string literals, the compile time was shorter, but when you are
    already below a second, it's all just irrelevant noise.

    My machine is slower than yours. Anyway, it's not just about one machine
    and one program. You're choosing to spend 10 times as long on a task,
    using resources that could be used for other processes, and using extra
    power.

    But if you are creating a tool for N other people to use who may be
    running it M times a day on data of size X, you can't just dismiss these considerations. You don't know how far people will push the operating
    limits of your tool.

    Each individual string is up to 2048 bytes, which can be concatenated to
    a maximum of 65K in total.

    I see other links giving different values, but I expect the MS ones to
    be authoritative.  It is possible that newer versions of their C
    compiler have removed the limit, just as for their C++ compiler, but it
    was missing from that webpage.

    (And I noticed also someone saying that MSVC is 70x faster at using
    string literals compared to lists of integers for array initialisation.)

    That doesn't sound unreasonable.

    Note that it is not necessary to use one giant string; you can chop it
    up into smaller strings, say with one line's worth of values per string,
    and still get most of the benefits. It's just a tiny bit more fiddly to generate the strings.

    Within my compiler, each single number takes a 64-byte record to
    represent. So 1MB of data takes 64MB, while a 1MB string takes one
    64-byte record plus the 1MB of the string data.

    Then there are the various type analysis and other passes that have to
    be done a million times rather than once. I'd imagine that compilers
    like gcc do a lot more.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From tTh@21:1/5 to bart on Thu Jun 13 17:11:57 2024
    On 6/13/24 15:46, bart wrote:

    Note that it is not necessary to use one giant string; you can chop it
    up into smaller strings, say with one line's worth of values per string,
    and still get most of the benefits. It's just a tiny bit more fiddly to generate the strings.

    And what about the ending '\0' of all those small strings ?

    --
    +---------------------------------------------------------------------+
    | https://tube.interhacker.space/a/tth/video-channels                 |
    +---------------------------------------------------------------------+

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Thu Jun 13 17:43:54 2024
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    On 12/06/2024 23:29, Michael S wrote:
    On Wed, 12 Jun 2024 15:46:44 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    I also don't imagine that string literals would be much faster for
    compilation, at least for file sizes that I think make sense.

    Just shows how little you know about the internals of a typical
    compiler. Which, by itself, is OK. What is not OK is that, with
    your level of knowledge, you have the nerve to argue with bart, who
    obviously knows a lot more.


    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] for what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.

    Testing a 1 MB file of random data, gcc -O2 took less than a second
    to compile it.

    Somewhat more than a second on less modern hardware. Enough for me to
    feel that compilation is not instant.
    But 1 MB is just an arbitrary number. For 20 MB everybody would feel
    the difference. And for 50 MB few people would not want it to be much
    faster.

    One megabyte is about the biggest size I would think
    makes sense to embed directly in C code unless you are doing
    something very niche - usually if you need that much data, you'd be
    better off with separate files and standardised packaging systems
    like zip files, installer setup.exe builds, or that kind of thing.

    Using string literals, the compile time was shorter, but when you are
    already below a second, it's all just irrelevant noise.

    For much bigger files, string literals are likely to be faster for compilation for gcc because the compiler does not track as much
    information

    And that is sort of the thing that bart knows immediately. Unlike you
    and me.

    (for use in diagnostic messages).
    But it makes no
    difference to real world development.

    And I
    have heard (it could be wrong) that MSVC has severe limits on the
    size of string literals, though it is not a compiler I ever use
    myself.


    Citation, please..


    <https://letmegooglethat.com/?q=msvc+string+literal+length+limit>

    Actually, I think it was from Bart that I first heard that MSVC has limitations on its string literal lengths, but I could well be
    misremembering that. I am confident, however, that it was here in
    c.l.c., as MSVC is not a tool I have used myself.

    <https://learn.microsoft.com/en-us/cpp/cpp/string-and-character-literals-cpp>

    It seems that version 17.0 has removed the arbitrary limits, while
    before that it was limited to 65K in their C++ compiler.

    For the MSVC C compiler, I see this:

    <https://learn.microsoft.com/en-us/cpp/c-language/maximum-string-length>

    Each individual string is up to 2048 bytes, which can be concatenated
    to a maximum of 65K in total.

    I see other links giving different values, but I expect the MS ones
    to be authoritative. It is possible that newer versions of their C
    compiler have removed the limit, just as for their C++ compiler, but
    it was missing from that webpage.

    (And I noticed also someone saying that MSVC is 70x faster at using
    string literals compared to lists of integers for array
    initialisation.)


    I didn't know it, thanks.
    It means that the string method can't be used universally.

    Still, for C (as opposed to C++), the compiler's limitation can be worked
    around by declaring the container as a struct. E.g. for an array of length
    1234567

    struct {
    char bulk[123][10000];
    char tail[4567];
    } bar = {
    {
    "init0-to-99999" ,
    "init10000-to-199999" ,
    ....
    },
    "init123400-to1234566"
    };

    For that I'd expect compilation speed almost as fast as that of one string.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to tTh on Thu Jun 13 16:32:01 2024
    On 13/06/2024 16:11, tTh wrote:
    On 6/13/24 15:46, bart wrote:

    Note that it is not necessary to use one giant string; you can chop it
    up into smaller strings, say with one line's worth of values per
    string, and still get most of the benefits. It's just a tiny bit more
    fiddly to generate the strings.

       And what about the ending '\0' of all those small strings ?


    What about it? If you specify the bounds of the char array, then a
    terminator won't be added.

    But I've now realised that lots of shorter strings will make it awkward to
    define the data (now a table of chars, with the last row being only
    partially full, making an exact size tricky).

    It can only really work if the separate strings are concatenated into
    one big string. This may still run into string length limitations, but
    it depends on whether the limitation applies to individual strings
    (which is OK), or to the sum of all the strings (which isn't).

  • From Michael S@21:1/5 to bart on Thu Jun 13 19:13:51 2024
    On Thu, 13 Jun 2024 14:46:32 +0100
    bart <bc@freeuk.com> wrote:


    Within my compiler, each single number takes a 64-byte record to
    represent. So 1MB of data takes 64MB, while a 1MB string takes one
    64-byte record plus the 1MB of the string data.

    Then there are the various type analysis and other passes that have
    to be done a million times rather then once. I'd imagine that
    compilers like gcc do a lot more.


    For gcc, up to a certain limit, I measured ~160 bytes per number.
    Past that certain very big limit (probably 64M numbers) gcc appears
    to switch into a more economical mode - ~112 bytes per number. At ~300M
    numbers it appears to become yet more economical, but still above 100
    bytes per number. At 400M numbers - 100 bytes per number. Going further
    became quite time consuming so I gave up.

  • From David Brown@21:1/5 to Michael S on Fri Jun 14 18:43:59 2024
    On 13/06/2024 16:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    On 12/06/2024 23:29, Michael S wrote:
    On Wed, 12 Jun 2024 15:46:44 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    I also don't imagine that string literals would be much faster for
    compilation, at least for file sizes that I think make sense.

    Just shows how little you know about the internals of a typical
    compiler. Which, by itself, is o.k. What is not o.k. is that with
    your level of knowledge you have the nerve to argue vs bart, who
    obviously knows a lot more.


    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] of what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.


    I know enough about compiler design and implementation to have a pretty
    good idea about many parts of it, though certainly not all of it. To be
    clear - theoretical knowledge is not the same as practical experience.
    I realise that array initialisation by a sequence of numbers has
    overhead (this was discussed at length in a previous thread), and that a
    string literal will likely have less. But the difference is not
    particularly significant for realistic file sizes.

    Testing a 1 MB file of random data, gcc -O2 took less than a second
    to compile it.

    Somewhat more than a second on less modern hardware. Enough for me to
    feel that compilation is not instant.
    But 1 MB is just an arbitrary number. For 20 MB everybody would feel
    the difference. And for 50 MB few people would not want it to be much
    faster.


    But what would be the point of trying to embed such files in the first
    place? There are much better ways of packing large files. You can
    always increase sizes for things until you get problems or annoying
    slowdowns, but that does not mean that will happen in practical situations.

    And even if you /did/ want to embed a 20 MB file, and even if that took
    20 seconds, so what? Unless you have a masochistic build setup, such as refusing to use "make" or insisting that everything goes in one C file
    that is re-compiled all the time, that 20 second compile is a one-off
    time cost on the rare occasion when you change the big binary file.

    Now, I am quite happy to agree that faster is better, all other things
    being equal. And convenience and simplicity is better. Once the
    compilers I use support #embed, if I need to embed a file and I don't
    need anything more than an array initialisation, I'll use #embed. Until
    then, 5 seconds writing an "xxd -i" line in a makefile and a 20 second
    compile (if it took that long) beats 5 minutes writing a Python script
    to generate string literals even if the compile is now 2 seconds.


    One megabyte is about the biggest size I would think
    makes sense to embed directly in C code unless you are doing
    something very niche - usually if you need that much data, you'd be
    better off with separate files and standardised packaging systems
    like zip files, installer setup.exe builds, or that kind of thing.

    Using string literals, the compile time was shorter, but when you are
    already below a second, it's all just irrelevant noise.

    For much bigger files, string literals are likely to be faster for
    compilation for gcc because the compiler does not track as much
    information

    And that is the sort of thing that bart knows immediately. Unlike you
    and me.

    I can't answer for /you/, but /I/ knew this - look through the
    discussions on #embed if you like. To be fair, I did not consider
    string literals very much there - they are a pretty pointless
    alternative when #embed is coming and integer sequences are fast enough
    for any realistic use.


    And I
    have heard (it could be wrong) that MSVC has severe limits on the
    size of string literals, though it is not a compiler I ever use
    myself.



    For the MSVC C compiler, I see this:

    <https://learn.microsoft.com/en-us/cpp/c-language/maximum-string-length>

    Each individual string is up to 2048 bytes, which can be concatenated
    to a maximum of 65K in total.

    I see other links giving different values, but I expect the MS ones
    to be authoritative. It is possible that newer versions of their C
    compiler have removed the limit, just as for their C++ compiler, but
    it was missing from that webpage.

    (And I noticed also someone saying that MSVC is 70x faster at using
    string literals compared to lists of integers for array
    initialisation.)


    I didn't know it, thanks.

    I didn't know the details either, until you challenged me and I looked
    them up!

    It means that the string method can't be used universally.

    That depends on the state of the current MSVC compiler - and perhaps
    other compilers. The C standards only require support for 4095
    characters in a string literal. (They also only require support for
    objects up to 32767 bytes in length - and for that size, any method
    should be fast.)


    Still, for C (as opposed to C++), the compiler's limitation can be worked around by declaring the container as a struct. E.g. for an array of length
    1234567

    struct {
    char bulk[123][10000];
    char tail[4567];
    } bar = {
    {
    "init0-to-99999" ,
    "init10000-to-199999" ,
    ....
    },
    "init123400-to1234566"
    };

    For that I'd expect compilation speed almost as fast as that of one string.


    I suppose so, but it is not pretty!

  • From bart@21:1/5 to David Brown on Fri Jun 14 19:24:04 2024
    On 14/06/2024 17:43, David Brown wrote:
    On 13/06/2024 16:43, Michael S wrote:

    Somewhat more than a second on less modern hardware. Enough for me to
    feel that compilation is not instant.
    But 1 MB is just an arbitrary number. For 20 MB everybody would feel
    the difference. And for 50 MB few people would not want it to be much
    faster.


    But what would be the point of trying to embed such files in the first place?  There are much better ways of packing large files.

    I remember complaining that some tool installations were bloated at
    100MB, 500MB, 1000MB or beyond, and your attitude was So what, since
    there is now almost unlimited storage.

    But now of course, it's Why would someone ever want to do X with such a
    large file! Suddenly large files are undesirable when it suits you.

      You can
    always increase sizes for things until you get problems or annoying slowdowns, but that does not mean that will happen in practical situations.

    And even if you /did/ want to embed a 20 MB file, and even if that took
    20 seconds, so what?  Unless you have a masochistic build setup, such as refusing to use "make" or insisting that everything goes in one C file
    that is re-compiled all the time, that 20 second compile is a one-off
    time cost on the rare occasion when you change the big binary file.

    Now, I am quite happy to agree that faster is better, all other things
    being equal.  And convenience and simplicity is better.  Once the
    compilers I use support #embed, if I need to embed a file and I don't
    need anything more than an array initialisation, I'll use #embed.  Until then, 5 seconds writing an "xxd -i" line in a makefile and a 20 second compile (if it took that long) beats 5 minutes writing a Python script
    to generate string literals even if the compile is now 2 seconds.

    That's a really bad attitude. It partly explains why such things as
    #embed take so long to get added.

    I've heard lots of horror stories elsewhere about projects taking
    minutes, tens of minutes or even hours to build.

    How much of that is due to attitudes like yours? You've managed to find
    ways of working around speed problems, by throwing hardware resources at
    it (fast processors, loads of memory, multiple cores, SSD, RAM-disk), or
    using ingenuity in *avoiding* having to compile stuff as much as
    possible. Or maybe the programs you build aren't that big.

    But that is not how you fix such problems. Potential bottlenecks should
    be identified and investigated.

    /Could/ it be faster? /Could/ it use less memory? /Could/ a simple
    language extension help out?

    I can understand you having little interest in it because you just use
    the tools that are available and can't do much about it, but it should be somebody's job to keep on top of this stuff.

    Until
    then, 5 seconds writing an "xxd -i" line in a makefile and a 20 second compile (if it took that long) beats 5 minutes writing a Python script
    to generate string literals even if the compile is now 2 seconds.

    So now you need 'xxd'. And 'Python'. And 'make'. When it could all be
    done effortlessly, more easily and 100 times faster within the language
    without all that mucking about.

    Unless you have a masochistic build setup, such as
    refusing to use "make" or insisting that everything goes in one C file
    that is re-compiled all the time,

    When you write such tools, you don't know what people are going to do
    with them, how much they will push their limits. And you can't really
    dictate how they develop or build their software.

  • From David Brown@21:1/5 to bart on Sat Jun 15 12:35:37 2024
    On 14/06/2024 20:24, bart wrote:
    On 14/06/2024 17:43, David Brown wrote:
    On 13/06/2024 16:43, Michael S wrote:

    Somewhat more than a second on less modern hardware. Enough for me to
    feel that compilation is not instant.
    But 1 MB is just an arbitrary number. For 20 MB everybody would feel
    the difference. And for 50 MB few people would not want it to be much
    faster.


    But what would be the point of trying to embed such files in the first
    place?  There are much better ways of packing large files.

    I remember complaining that some tool installations were bloated at
    100MB, 500MB, 1000MB or beyond, and your attitude was So what, since
    there is now almost unlimited storage.

    We all remember that :-)


    But now of course, it's Why would someone ever want to do X with such a
    large file! Suddenly large files are undesirable when it suits you.

    It's a /completely/ different situation. Anyone doing development work
    is going to have a machine with lots of space - 1 GB is peanuts for
    space on a disk. But that does not mean it makes sense to have a 1 GB initialised array in an executable!

    Consider /why/ you might want to include a binary blob inside an
    executable. I can think of a number of scenarios :

    1. You want a "setup.exe" installation file. Then you use appropriate
    tools for the job, you don't use inclusion in a C file.

    2. You want a "portable" version of a big program - portable apps on
    Windows, AppImage on Linux, or something like that. Then you use
    appropriate tools for the job so that the application can access the
    enclosed files as /normal/ files (not some weird "XML Filesystem" nonsense).

    3. You are targeting a platform where there is no big OS and no
    filesystem, and everything is within a single statically-linked binary.
    Then embedded files in C arrays are a good solution, but your files are
    always small because your system is small.

    4. You want to include a few "resources" like icons or images in your executable, because you don't need much and it makes the results neater.
    Then you use some kind of "resource compiler", such as has been used
    on Windows for decades.

    I'm sure there are a few other niche cases where the convenience of a
    single executable file is more important than the inconvenience of not
    being able to access the files with normal file operations. Even then,
    it's unlikely that they will be big files.



    To give an analogy, consider books. In a home, it's no problem having a
    set of bookshelves with hundreds of books on them - that's your disk
    storage. It is also sometimes convenient to have books packed together
    in single units, boxes, even though you need to unpack them to get to
    the books - that's your setup.exe or AppImage files. And sometimes it
    is nice to have a few /small/ books inside one binding, such as a
    trilogy - that's your embedded files. But no one wants
    the complete Encyclopedia Britannica in one binding.




      You can always increase sizes for things until you get problems or
    annoying slowdowns, but that does not mean that will happen in
    practical situations.

    And even if you /did/ want to embed a 20 MB file, and even if that
    took 20 seconds, so what?  Unless you have a masochistic build setup,
    such as refusing to use "make" or insisting that everything goes in
    one C file that is re-compiled all the time, that 20 second compile is
    a one-off time cost on the rare occasion when you change the big
    binary file.

    Now, I am quite happy to agree that faster is better, all other things
    being equal.  And convenience and simplicity is better.  Once the
    compilers I use support #embed, if I need to embed a file and I don't
    need anything more than an array initialisation, I'll use #embed.
    Until then, 5 seconds writing an "xxd -i" line in a makefile and a 20
    second compile (if it took that long) beats 5 minutes writing a Python
    script to generate string literals even if the compile is now 2 seconds.

    That's a really bad attitude. It partly explains why such things as
    #embed take so long to get added.


    Using the best tool available for the job, and using a better tool if
    one becomes available, is a "bad attitude" ?

    Or did you mean it is a "bad attitude" to concentrate on things that are important and make a real difference, instead of improving on something
    that was never really a big issue in the first place?

    I've heard lots of horror stories elsewhere about projects taking
    minutes, tens of minutes or even hours to build.

    I agree - some kinds of builds take a /long/ time. Embedding binary
    blobs has absolutely nothing to do with it. Indeed, long build times
    are often the result of trying to put too much in one build rather than splitting things up into separate files and libraries. (Sometimes such
    big builds are justified, such as for large programs with very large
    user bases.)


    How much of that is due to attitudes like yours? You've managed to find
    ways of working around speed problems, by throwing hardware resources at
    it (fast processors, loads of memory, multiple cores, SSD, RAM-disk), or using ingenuity in *avoiding* having to compile stuff as much as
    possible. Or maybe the programs you build aren't that big.

    You are joking, right? Or trolling?

    (I'm snipping the rest, because if it is not trolling, it would take far
    too long to explain to you how the software development world works for everyone else.)

  • From James Kuyper@21:1/5 to Michael S on Mon Jun 17 02:22:33 2024
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...
    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] of what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.

    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers. In particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

  • From Kaz Kylheku@21:1/5 to James Kuyper on Mon Jun 17 07:30:44 2024
    On 2024-06-17, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should work, which are very different from those of most other programmers. In particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Most programmers use Javascript and Python, which follow Bart's
    priorities. Fast, invisible compilation to some kind of byte code (plus possibly later JIT), slow execution time.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From Michael S@21:1/5 to Kaz Kylheku on Mon Jun 17 12:25:27 2024
    On Mon, 17 Jun 2024 07:30:44 -0000 (UTC)
    Kaz Kylheku <643-408-1753@kylheku.com> wrote:

    On 2024-06-17, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    The problem is that Bart's compiler is VERY unusual. It's
    customized for his use, and he has lots of quirks in the way he
    thinks compilers should work, which are very different from those
    of most other programmers. In particular, compilation speed is very important to him, while execution speed is almost completely
    unimportant, which is pretty much the opposite of the way most
    programmers prioritize those things.

    Most programmers use Javascript and Python, which follow Bart's
    priorities. Fast, invisible compilation to some kind of byte code
    (plus possibly later JIT), slow execution time.


    I'd dare to say that most programmers care about speed of compilation
    more than they care about speed of execution even (or especially) when
    they use "visible" compilation processes. Except when compilation is
    already very fast.
    BTW, my impression was that Bart's 'C' compiler uses 'visible'
    compilation.

    Then again, neither speed of compilation nor speed of execution is a top priority for most pros. My guess is that the #1 priority is conformance
    with co-workers/employer, #2 is convenient IDE, preferably integrated
    with debugger, #3 is support, but there is big distance between #2, and
    #3. #4 are religious issues of various forms. Speed of compilation is at
    best #5.

  • From bart@21:1/5 to James Kuyper on Mon Jun 17 11:30:19 2024
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...
    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] of what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.

    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should work, which are very different from those of most other programmers.


    In
    particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Compilation speed is important to everyone. That's why so many tricks
    are used to get around the lack of speed in a big compiler, or so many
    extra resources are thrown at the problem.

    Runtime performance is important too, but at this level of language, the difference between optimised and unoptimised code is narrow. Unoptimised
    may be between 1x and 2x slower, typically.

    Perhaps slower on benchmarks, or code written in C++ style that
    generates lots of redundancy and relies on optimisation to make it fast.

    But, during development, you probably wouldn't use optimisation anyway.

    In that case, you're still suffering slow build times with a big
    compiler, but you don't get any faster code at the end of it.

    I sometimes suggest to people to use Tiny C most of the time, and run
    gcc from time to time for extra analysis and extra checks, and use
    gcc-O3 for production builds.

    (I have also suggested that gcc should incorporate a -O-1 option that
    runs a secretly bundled copy of Tiny C.)

  • From David Brown@21:1/5 to Kaz Kylheku on Mon Jun 17 15:23:55 2024
    On 17/06/2024 09:30, Kaz Kylheku wrote:
    On 2024-06-17, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers. In
    particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Most programmers use Javascript and Python, which follow Bart's
    priorities. Fast, invisible compilation to some kind of byte code (plus possibly later JIT), slow execution time.


    That is not at all why people use Javascript and/or Python.

    They want fast /development/ time - compilation speed, to the extent
    that such languages have a compilation speed - is a minor issue. I know
    that it would not bother me in the slightest if my Python code took some
    small but non-zero compilation time for "real" programs (as distinct
    from small scripts). I use Python rather than C because for PC code,
    that can often involve files, text manipulation, networking, and various
    data structures, the Python code is at least an order of magnitude
    shorter and faster to write. When I see the amount of faffing around in
    order to read and parse a file consisting of a list of integers, I find
    it amazing that anyone would actively choose C for the task (unless it
    is for the fun of it).

    And people who use these languages - indeed any languages - want their
    code to be /fast enough/. Faster than that should not be a priority.

    Bart's priorities in his C compiler do not match those of Python or
    Javascript programmers. (His scripting language might be closer.)
    Development time with his C compiler will be significantly worse than
    normal C with a quality C compiler - you might save a second or two on compilation time, but the lack of features, compatibility, modern
    standards, and static checking could cost you hours, days, or months.
    And the fact that it does not produce as efficient results as tools like
    gcc and clang make it less useful - one of the prime motivations for
    using C is to get high speed code.

    His C compiler might have use as a companion tool to his other language
    tools that generate C, and it could also be seen as a testbed for
    playing with new potential features in C as it is easier to modify than,
    say, gcc or clang. But it is not a tool that matches the priorities of languages such as Javascript or Python.

  • From David Brown@21:1/5 to bart on Mon Jun 17 15:43:31 2024
    On 17/06/2024 12:30, bart wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...
    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] of what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.

    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers.


    In
    particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Compilation speed is important to everyone. That's why so many tricks
    are used to get around the lack of speed in a big compiler, or so many
    extra resources are thrown at the problem.

    What "tricks" ?


    Runtime performance is important too, but at this level of language, the difference between optimised and unoptimised code is narrow. Unoptimised
    may be between 1x and 2x slower, typically.

    That depends on the language, type of code, and target platform.
    Typical C code on an x86_64 platform might be two or three times slower
    when using a poorly optimising compiler. After all, the designers of
    x86 cpus put a great deal of effort into making shitty code run fast.
    For high-performance code written with care and requiring fast results,
    the performance difference will be bigger. For C++, especially code
    that makes good use of abstractions, the difference can be very much
    bigger. For C code on an embedded ARM device or other microcontroller,
    it's not unusual to see a 5x speed improvement on optimised code.

    Speed is not the only good reason for picking C as the language for a
    task, but it is often a relevant factor. And if it is a factor, then
    you will usually prefer faster speeds.


    Perhaps slower on benchmarks, or code written in C++ style that
    generates lots of redundancy and relies on optimisation to make it fast.

    But, during development, you probably wouldn't use optimisation anyway.


    I virtually always have optimisation enabled during development. I
    might, when trying to chase down a specific bug, reduce some specific optimisations, but I have never seen the point of crippling a
    development tool when doing development work - it makes no sense at all.

    In that case, you're still suffering slow build times with a big
    compiler, but you don't get any faster code at the end of it.

    I sometimes suggest to people to use Tiny C most of the time, and run
    gcc from time to time for extra analysis and extra checks, and use
    gcc-O3 for production builds.


    I cannot imagine any situation where I would think that might be a good
    idea.

    But then, I see development tools as tools to help my work as a
    developer, while you seem to consider tools (other than your own) as
    objects of hatred to be avoided whenever possible or dismissed as
    "tricks". I don't expect we will ever agree there.

    (I have also suggested that gcc should incorporate a -O-1 option that
    runs a secretly bundled copy of Tiny C.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Mon Jun 17 18:38:40 2024
    On 17/06/2024 15:09, Malcolm McLean wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...
    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] of what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.

    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers. In
    particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.



    Yes, but that's probably what you want.

    Who is "you" here? Possibly "you" is Bart, but it is certainly not /me/.

    As a one man band, bart can't
    bear Aple and Microsoft in priiducing a compiler which creates highly optimised code that executes quickly. And that's what the vast majority
    of customers want.

    I believe I can figure out the words you used, despite the spelling
    mistakes, but I can't figure out what you are trying to say. One man
    band developers generally want the best tools they can get hold of,
    within the limits of their budgets - home-made tools can be part of
    that, but not for something like a C compiler.

    But say that 0.1% of customers are more interested in compilation speed.
    Now, Apple and Microsoft might not even bother catering to, what is to
    them, just a tiny market and a disraction for the development team. So
    bart can plausibly produce a compiler which does compile code correctly,
    and much faster than the big boys. And there are about 28 million pepole
    in the world who derive thetr living as computer programmers. 0.1% of
    that is 28,000, Charge 10 dollars each, and that's a nice little
    business for one person.


    Your connection with reality is tenuous at best.

    People /do/ like faster compilation speed, though it is rarely a problem
    in practice for C. But no one who has used a real C compiler would want
    to step down to Bart's tool just to shave a second off their build times.

    It is quite believable that some people will find big tools intimidating
    and want something that they view as smaller and simpler, but not for
    compiler speed. And that market is already saturated by things like
    lcc-win and tcc. (These are, unlike Bart's tool, compilers that make a significant effort to be correct for standard C. Bart's compiler is
    made for his own use only, and is only likely to be correct for the
    subset of C that he wants to use. There's absolutely nothing wrong with
    that, but his tool is far from being ready to sell to others as a C
    compiler.)

  • From David Brown@21:1/5 to Malcolm McLean on Mon Jun 17 18:21:26 2024
    On 17/06/2024 17:48, Malcolm McLean wrote:
    On 17/06/2024 14:43, David Brown wrote:
    On 17/06/2024 12:30, bart wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...
    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] of what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.
    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers.


    In particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Compilation speed is important to everyone. That's why so many tricks
    are used to get around the lack of speed in a big compiler, or so
    many extra resources are thrown at the problem.

    What "tricks" ?


    Runtime performance is important too, but at this level of language,
    the difference between optimised and unoptimised code is narrow.
    Unoptimised may be between 1x and 2x slower, typically.

    That depends on the language, type of code, and target platform.
    Typical C code on an x86_64 platform might be two or three times
    slower when using a poorly optimising compiler.  After all, the
    designers of x86 cpus put a great deal of effort into making shitty
    code run fast. For high-performance code written with care and
    requiring fast results, the performance difference will be bigger.
    For C++, especially code that makes good use of abstractions, the
    difference can be very much bigger. For C code on an embedded ARM
    device or other microcontroller, it's not unusual to see a 5x speed
    improvement on optimised code.

    Speed is not the only good reason for picking C as the language for a
    task, but it is often a relevant factor.  And if it is a factor, then
    you will usually prefer faster speeds.


    Perhaps slower on benchmarks, or code written in C++ style that
    generates lots of redundances that relies on optimisation to make it
    fast.

    But, during development, you probably wouldn't use optimisation anyway.

    I virtually always have optimisation enabled during development.  I
    might, when trying to chase down a specific bug, reduce some specific
    optimisations, but I have never seen the point of crippling a
    development tool when doing development work - it makes no sense at all.

    I never do.
    Until I had to give up work, I was making real time tools for artists.
    And if it didn't work in under just noticeable time on the debug build,
    it wouldn't be working in under just noticeable time on the release
    build, you could be pretty sure. So I never turned the release build on,
    but of course the downstream deployment team built it as release for
    delivery to customers. And that might mean that they could do 2000 paths instead of 1000 before the tool slowed to the point that it became
    unusable. So not a game changer. But not something to deprive a customer
    of either.


    Having a distinction in optimisation between "debug" and "release"
    builds is simply /wrong/. Release what you have debugged, debug what
    you intend to release.

    Sometimes it is helpful to fiddle with optimisation settings for
    specific debugging tasks - though usually it is better to do this for
    specific files or functions. And of course the more heavyweight
    debugging tools, such as sanitizers, are used during development and not
    in releases.

    Optimisation is important. Right out the gate, it means you can let
    your tools do a much better job of static analysis (though it is
    possible to treat static analysis as a separate tool from compilation),
    and you never want to leave bugs to testing if your tools can find them
    at static analysis stage. The more problems you find out early, the better.

    The other major point of optimisation is that it means you can use better abstractions. You don't need to use outdated and unsafe function-like
    macros - you can use proper functions. You can split code up into
    smaller parts, make new variables as and when they are convenient, and
    in general write clearer and more maintainable code because you are
    leaving the donkey work of optimisation up to the compiler.

    Basically, if you are not using a good optimising compiler, with optimisation enabled, then the chances are high that C is the wrong
    choice of language for the task.

    Using C without optimisation is like driving a car but refusing to go
    out of first gear. You would probably have been better off with a
    bicycle or driving a tank, according to the task at hand.

  • From bart@21:1/5 to David Brown on Mon Jun 17 20:24:07 2024
    On 17/06/2024 14:43, David Brown wrote:
    On 17/06/2024 12:30, bart wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...
    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] of what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.

    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers.


    In
    particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Compilation speed is important to everyone. That's why so many tricks
    are used to get around the lack of speed in a big compiler, or so many
    extra resources are thrown at the problem.

    What "tricks" ?


    Runtime performance is important too, but at this level of language,
    the difference between optimised and unoptimised code is narrow.
    Unoptimised may be between 1x and 2x slower, typically.

    That depends on the language, type of code, and target platform. Typical
    C code on an x86_64 platform might be two or three times slower when
    using a poorly optimising compiler.  After all, the designers of x86
    cpus put a great deal of effort into making shitty code run fast. For high-performance code written with care and requiring fast results, the performance difference will be bigger.  For C++, especially code that
    makes good use of abstractions, the difference can be very much bigger.
    For C code on an embedded ARM device or other microcontroller, it's not unusual to see a 5x speed improvement on optimised code.

    Speed is not the only good reason for picking C as the language for a
    task, but it is often a relevant factor.  And if it is a factor, then
    you will usually prefer faster speeds.


    Perhaps slower on benchmarks, or code written in C++ style that
    generates lots of redundancies and relies on optimisation to make it
    fast.

    But, during development, you probably wouldn't use optimisation anyway.


    I virtually always have optimisation enabled during development.  I
    might, when trying to chase down a specific bug, reduce some specific optimisations, but I have never seen the point of crippling a
    development tool when doing development work - it makes no sense at all.

    In that case, you're still suffering slow build times with a big
    compiler, but you don't get any faster code at the end of it.

    I sometimes suggest to people to use Tiny C most of the time, and run
    gcc from time to time for extra analysis and extra checks, and use
    gcc-O3 for production builds.


    I cannot imagine any situation where I would think that might be a good
    idea.

    But then, I see development tools as tools to help my work as a
    developer, while you seem to consider tools (other than your own) as
    objects of hatred to be avoided whenever possible or dismissed as
    "tricks".  I don't expect we will ever agree there.

    Here's one use-case of a C compiler, to process the output of my
    whole-program non-C compiler. The 'mc' transpiler first converts to C
    then invokes a C compiler according to options:

    C:\qx52>tm mc -mcc qc
    W:Invoking C compiler: mcc -out:qc.exe qc.c
    Compiling qc.c to qc.exe
    TM: 0.31

    C:\qx52>tm mc -tcc qc
    W:Invoking C compiler: tcc -oqc.exe qc.c
    c:\windows\system32\user32.dll -luser32 c:\windows\system32\kernel32.dll -fdollars-in-identifiers
    TM: 0.27

    C:\qx52>tm mc -gcc qc
    W:Invoking C compiler: gcc -m64 -oqc.exe qc.c -s
    TM: 2.44

    C:\qx52>tm mc -gcc -opt qc
    W:Invoking C compiler: gcc -m64 -O3 -oqc.exe qc.c -s
    TM: 14.47

    The actual translation to C takes 0.1 seconds, so tcc is 13 times faster
    at producing code than gcc-O0, and about 80 times faster than
    gcc-O3 (67 times faster than -O2).

    If you don't need optimised code right now, why would you invoke gcc
    rather than tcc? It's a no-brainer.

    My C compiler is in there as well, and it's also quite fast, but if I
    can run it, it means I'm running on Windows and can also directly use my
    main compiler, which completes the job in 0.1 seconds:

    C:\qx52>tm mm qc
    TM: 0.09

    But if running on Linux for example, I need to use intermediate C, and
    tcc makes more sense as a default than gcc.

    With tcc, I also only need two files totalling 230KB (I don't need std headers); I can bundle it with the compiler.

  • From David Brown@21:1/5 to Malcolm McLean on Mon Jun 17 21:29:57 2024
    On 17/06/2024 21:17, Malcolm McLean wrote:
    On 17/06/2024 17:21, David Brown wrote:
    On 17/06/2024 17:48, Malcolm McLean wrote:

    Using C without optimisation is like driving a car but refusing to go
    out of first gear.  You would probably have been better off with a
    bicycle or driving a tank, according to the task at hand.

    I drive C in first gear when I'm developing, which means that the car is given instructions to go to the right place and obey all the rules of
    the road.

    I do my C development with optimisations enabled, which means that the C compiler will obey all the rules and requirements of C. Optimisations
    don't change the meaning of correct code - they only have an effect on
    the results of your code if you have written incorrect code. I don't
    know about you, but my aim in development is to write /correct/ code.
    If disabling optimisations helped in some way, it would be due to bugs
    and luck.

    But it never gets out of first gear when I'm driving it.
    However, because of the nature of what we do, which is interactive programming mostly, usually "just noticeable" time is sufficient. It's a
    bit like driving in London - a top of the range sports car is no better
    than a beat up old mini, they travel at the same speed because of all
    the interactions.


    If I am writing PC code where the timing is determined by user
    interaction, I would not be writing in C - it is almost certainly a poor
    choice of language for the task.

    Then I hand it over to the deployment team, and they take the restraints
    off, and allow it to go up to top gear, and it is compiled with full optimisation.

    That is insane development practice, if I understand you correctly. For
    some kinds of development work, it can make sense to have one person (or
    team) make prototypes or proofs-of-concept, and then have another person
    (or team) use that as a guide, specification and test comparison when
    writing a fast implementation for the real product. But the prototype
    should be in a high-level language, written in the clearest and simplest
    manner - not crappy code in a low-level language that works by luck when
    it is not optimised!

    And I don't actually have a computer with one of the most
    important hardware targets, but it's all written in C++, a bit in C, and
    none in assembler. So I can't profile it, and I have to rely on insight
    into where the inner loop will be, and how to avoid expensive operations
    in the inner loop.

    If you are writing C++ and are not happy about using optimisation, you
    are in the wrong job.


    And hopefully those subroutines will be called for many years to come,
    or hardware as yet un-designed.

    With Baby X, I did have severe problems with the rendering speed on an
    old Windows machine. But I haven't noticed them now it's running on the
    Apple Mac. However as the name suggests, Baby X was first designed for X
    lib. I only added Windows support later, and all the rgba buffers were
    in the wrong format. But faster processors cover a multitude of sins, if
    you keep things lean.


  • From bart@21:1/5 to David Brown on Mon Jun 17 22:01:05 2024
    On 17/06/2024 17:21, David Brown wrote:

    Using C without optimisation is like driving a car but refusing to go
    out of first gear.  You would probably have been better off with a
    bicycle or driving a tank, according to the task at hand.


    Which bit is the car: the compiler, or the program that it produces?

    When I am developing, it is the compiler that is used more often. Or, if
    I spend a lot of time on a particular build of an application, the speed
    at which it runs is rarely critical, since during most testing, the
    scale of the tasks is small.

    So if the compiler is the car, then one like tcc goes at 60mph while gcc
    goes at walking pace.

  • From James Kuyper@21:1/5 to David Brown on Mon Jun 17 19:52:16 2024
    On 6/17/24 12:21, David Brown wrote:
    ...
    Having a distinction in optimisation between "debug" and "release"
    builds is simply /wrong/. Release what you have debugged, debug what
    you intend to release.

    I fully agree that you should debug what you intend to release; but I
    can't agree that it always makes sense to release what you've debugged.
    There are ways to debug code that make it horribly inefficient - they
    are also good ways to uncover certain kinds of bugs. There should be a
    debug mode where you enable that inefficient code, and track down and
    remove any bugs that you find. Then you go to release mode, and test it
    as thoroughly as possible with the code as it is intended to be
    released, which is never as much as is possible in debug mode. Do not
    release until the final version of the code has passed both sets of
    tests. If release testing uncovers a bug that requires a code change,
    that means that debug testing also needs to be redone.

  • From David Brown@21:1/5 to Malcolm McLean on Tue Jun 18 08:44:32 2024
    On 17/06/2024 23:06, Malcolm McLean wrote:
    On 17/06/2024 20:29, David Brown wrote:
    On 17/06/2024 21:17, Malcolm McLean wrote:
    On 17/06/2024 17:21, David Brown wrote:
    On 17/06/2024 17:48, Malcolm McLean wrote:

    Using C without optimisation is like driving a car but refusing to
    go out of first gear.  You would probably have been better off with
    a bicycle or driving a tank, according to the task at hand.

    I drive C in first gear when I'm developing, which means that the car
    is given instructions to go to the right place and obey all, the
    rules of the road.

    I do my C development with optimisations enabled, which means that the
    C compiler will obey all the rules and requirements of C.
    Optimisations don't change the meaning of correct code - they only
    have an effect on the results of your code if you have written
    incorrect code.  I don't know about you, but my aim in development is
    to write /correct/ code. If disabling optimisations helped in some
    way, it would be due to bugs and luck.

    But it never gets out of first gear when I'm driving it. However,
    because of the nature of what we do, which is interactive
    programming mostly, usually "just noticeable" time is sufficient.
    It's a bit like driving in London - a top of the range sports car is
    no better than a beat up old mini, they travel at the same speed
    because of all the interactions.


    If I am writing PC code where the timing is determined by user
    interaction, I would not be writing in C - it is almost certainly a
    poor choice of language for the task.

    Then I hand it over to the deployment team, and they take the
    restraints off, and allow it to go up to top gear, and it is compiled
    with full optimisation.

    That is insane development practice, if I understand you correctly.
    For some kinds of development work, it can make sense to have one
    person (or team) make prototypes or proofs-of-concept, and then have
    another person (or team) use that as a guide, specification and test
    comparison when writing a fast implementation for the real product.
    But the prototype should be in a high-level language, written in the
    clearest and simplest manner - not crappy code in a low-level language
    that works by luck when it is not optimised!

    And I don't actually have a computer with one of the most important
    hardware targets, but it's all written in C++, a bit in C, and none
    in assembler. So I can't profile it, and I have to rely on insight
    into where the inner loop will be, and how to avoid expensive
    operations in the inner loop.

    If you are writing C++ and are not happy about using optimisation, you
    are in the wrong job.

    You know what hardware your code will run on. I don't.


    That is absolutely true, and it gives me certain advantages. It is also
    the case that high-quality optimisation is vital to my work.

    But it is also absolutely irrelevant to the point I was making.

  • From David Brown@21:1/5 to bart on Tue Jun 18 09:01:48 2024
    On 17/06/2024 23:01, bart wrote:
    On 17/06/2024 17:21, David Brown wrote:

    Using C without optimisation is like driving a car but refusing to go
    out of first gear.  You would probably have been better off with a
    bicycle or driving a tank, according to the task at hand.


    Which bit is the car: the compiler, or the program that it produces?

    The compiler. It is the compiler you are pointlessly and
    counter-productively limiting.

    Clearly (at least to people not intentionally misinterpreting this) the
    speed of the car is not analogous to the /speed/ of the compiler, but
    its functionality.


    When I am developing, it is the compiler that is used more often. Or, if
    I spend a lot of time on a particular build of an application, the speed
    at which it runs is rarely critical, since during most testing, the
    scale of the tasks is small.

    So if the compiler is the car, then one like tcc goes at 60mph while gcc
    goes at walking pace.

    If you spend most of your development time compiling, you are an
    /extremely/ unusual developer.

    I would expect that for most developers, the great majority of their
    time is spent reading - reading their own code, reading other people's
    code, reading documentation, API details, manuals, specifications,
    notes, and everything else. The tool they spend most time with is their
    IDE, along with whatever tools they use in testing and debugging and
    whatever tools they use for documentation, and whatever collaboration
    tools they use with colleagues (zoom, whiteboards, coffee machines,
    etc.). Proportions will of course vary wildly.

    I haven't measured the times for my own work, but at a vague guess I'd
    suppose I have perhaps 5 to 30 seconds of build time per hour on average
    during most development. Occasionally I'll have peaks where I am doing
    small changes, rebuilds and testing in quick succession, but even there
    the build times are very rarely a major time factor compared to testing
    time. (And that's with perhaps 500 files of C, C++ and headers.)

  • From James Kuyper@21:1/5 to Kaz Kylheku on Tue Jun 18 03:26:45 2024
    On 6/17/24 03:30, Kaz Kylheku wrote:
    On 2024-06-17, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers. In
    particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Most programmers use Javascript and Python, which follow Bart's
    priorities. Fast, invisible compilation to some kind of byte code (plus possibly later JIT), slow execution time.

    Perhaps I should have said "most C programmers"; C tends to attract
    those who have a need for fast execution time.
    Most of my own programming experience has been with programs that worked
    on data coming down to Earth from NASA satellites. My programs read one
    or more input files, process them, and write one or more output files,
    with no human interaction of any kind. Those programs each ran in batch
    processing mode thousands of times a day, and the load they placed on the
    processors was a significant cost factor - the slower they operated, the
    more processors we had to maintain in order to get the output data
    coming out as fast as the input data was coming in. Even though they
    performed complex scientific calculations on the data, they were
    primarily I/O bound, so our top priority was to design them to minimize
    the amount of I/O that needed to be done.
    I fully understand that this experience gives me a biased view of
    programming - but so does everyone else's experience. I am in no danger
    of believing that all programs are batch processing, and you should not
    imagine that all programs are interactive. Some of the biggest, most
    powerful computers in the world process weather forecasting data 24/7, and
    many of those programs operate in a batch mode keeping pace with
    real-time data, similar to the way mine operated.

  • From James Kuyper@21:1/5 to David Brown on Tue Jun 18 03:28:17 2024
    On 6/17/24 12:38, David Brown wrote:
    On 17/06/2024 15:09, Malcolm McLean wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    ...
    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers. In
    particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.



    Yes, but that's probably what you want.

    Who is "you" here? Possibly "you" is Bart, but it is certainly not /me/.

    In a response to a message by me, his "you" is most plausibly me - but
    that certainly isn't true of me, either.

  • From Michael S@21:1/5 to David Brown on Tue Jun 18 11:56:50 2024
    On Mon, 17 Jun 2024 15:23:55 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    I use Python rather than C because for
    PC code, that can often involve files, text manipulation, networking,
    and various data structures, the Python code is at least an order of magnitude shorter and faster to write. When I see the amount of
    faffing around in order to read and parse a file consisting of a list
    of integers, I find it amazing that anyone would actively choose C
    for the task (unless it is for the fun of it).


    The faffing (what does it mean, BTW ?) is caused by unrealistic
    requirements. More specifically, by requirements of (A) to support
    arbitrary line length (B) to process file line by line. Drop just one
    of those requirements and everything become quite simple.
    [O.T.]
    That is despite the fact that the fgets() API is designed rather badly -
    its return value is much less useful than it could easily be. It would be
    interesting to find out who was responsible.
    [/O.T.]

    For task like that Python could indeed be several times shorter, but
    only if you wrote your python script exclusively for yourself, cutting
    all corners, like not providing short help for user, not testing that
    input format matches expectations and most importantly not reporting
    input format problems in potentially useful manner.
    OTOH, if we write our utility in more "anal" manner, as we should if
    we expect it to be used by other people or by ourselves long time after
    it was written (in my age, couple of months is long enough and I am not
    that much older than you) then code size difference between python and
    C variants will be much smaller, probably factor of 2 or so.

    W.r.t. faster to code, it very strongly depends on familiarity.
    You haven't done that sort of task in 'C' since your school days, right?
    Or ever? And you are doing them in Python quite regularly? Then that is
    much bigger reason for the difference than the language itself.
    Now, for more complicated tasks Python, as the language, and even more importantly, Python as a massive set of useful libraries could have
    very big productivity advantage over 'C'. But it does not apply to very
    simple thing like reading numbers from text file.

    In the real world, I wrote a utility akin to that less than two
    years ago. It converted big matrices from space-delimited text to
    Matlab v4 .mat format. Why did I do it? Because although both Matlab
    and GNU Octave are capable of reading text files like those, they are
    quite slow doing so. With the huge files I was using at the time, it
    became uncomfortable.
    I wrote it in 'C' (or was it C-style C++ ? I don't remember) mostly
    because I knew how to produce v4 .mat files in C. If I had done it in
    Python, I'd have had to learn how to do that in Python, and in the
    end it would have taken me more time rather than less. I never even
    got to the point of evaluating whether the speed of Python's
    functions for parsing text was sufficient for my needs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Kuyper on Tue Jun 18 11:26:32 2024
    On 18/06/2024 01:52, James Kuyper wrote:
    On 6/17/24 12:21, David Brown wrote:
    ...
    Having a distinction in optimisation between "debug" and "release"
    builds is simply /wrong/. Release what you have debugged, debug what
    you intend to release.

    I fully agree that you should debug what you intend to release; but I
    can't agree that it always makes sense to release what you've debugged.
    There are ways to debug code that make it horribly inefficient - they
    are also good ways to uncover certain kinds of bugs.

    Yes, as I said I sometimes change things while chasing particular bugs -
    with sanitizers mentioned as an example. And I might compile a
    particular file with low optimisation, or disable particular
    optimisations to make debugging easier.

    Most often, I do this with additions to the source code in question -
    marking some functions as "noinline" (a gcc attribute) to make
    breakpoints easier, or marking some variables as "volatile" to make it
    easier to see them with a debugger, or simply adding some extra
    printf's. You do what you need to do in order to find the bugs, and how
    you do that depends on the bugs and the type of tools you use and the
    kind of program you have. But I do not change the rest of the build.


    Of course it is correct in a sense that you don't release what you
    debug, because you debug code that is not working, and you don't release
    it until it is fixed!

    But you should not (IMHO) be using special debug modes for the main part
    of your debugging and testing, you should be aiming to have as realistic
    a scenario as you can for the code. While optimisation does not change
    the effect of correct code (other than perhaps different choices for unspecified behaviour), few programmers are perfect. Sometimes code
    errors will, by luck, give the desired behaviour with no optimisation
    but erroneous behaviour with high optimisation. I have no interest at
    all in having my code pass its tests with -O0 and fail with -O2 - I want
    to go straight to seeing the problem.


    There should be a
    debug mode where you enable that inefficient code, and track down and
    remove any bugs that you find. Then you go to release mode, and test
    it as thoroughly as possible with the code as it is intended to be
    released - which is never as much as is possible in debug mode. Do
    not release until the final version of the code has passed both sets
    of tests. If release testing uncovers a bug that requires a code
    change, that means that debug testing also needs to be redone.

    There are different development strategies appropriate for different
    types of program, and different types of development teams. Splits
    between development, debugging and testing can vary.


    Perhaps my attitude here is unusual, and due to the type of work I do.
    For me, a "project" consists of all my own source code, all the
    libraries, headers, microcontroller SDK's, all the third-party source
    code, the build process (normally a makefile and often a few scripts,
    including all compiler flags), the toolchain, and the library.

    The binary that goes into the product is thus entirely reproducible. I
    don't deliver a collection of C files to the customer, I provide the
    whole build system - and the code is debugged, tested and guaranteed
    only with that toolchain and build settings. It is important that if
    the customer finds a problem years later, I am working with /exactly/
    the same binary as was delivered.

    Of course I try to make the code as independent of the details of the toolchain, libraries and SDK's as reasonably possible, and I will move
    over to new versions if appropriate - but that means full re-testing, re-qualification, and so on.

  • From David Brown@21:1/5 to bart on Tue Jun 18 14:40:27 2024
    On 17/06/2024 21:24, bart wrote:
    On 17/06/2024 14:43, David Brown wrote:
    On 17/06/2024 12:30, bart wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...

    If you don't need optimised code right now, why would you invoke gcc
    rather than tcc? It's a no-brainer.

    You might use tcc if you have no brain. People who do C development
    seriously don't use the compiler just to generate an exe file. gcc is a development tool, not just a compiler. (As is clang, and MSVC.) If you
    think compilation speed of a subset of C is all that matters, you are
    not doing C development.

  • From David Brown@21:1/5 to Michael S on Tue Jun 18 14:36:40 2024
    On 18/06/2024 10:56, Michael S wrote:
    On Mon, 17 Jun 2024 15:23:55 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    I use Python rather than C because for
    PC code, that can often involve files, text manipulation, networking,
    and various data structures, the Python code is at least an order of
    magnitude shorter and faster to write. When I see the amount of
    faffing around in order to read and parse a file consisting of a list
    of integers, I find it amazing that anyone would actively choose C
    for the task (unless it is for the fun of it).


    The faffing (what does it mean, BTW ?) is caused by unrealistic
    requirements. More specifically, by requirements of (A) to support
    arbitrary line length (B) to process file line by line. Drop just one
    of those requirements and everything become quite simple.

    "Faffing around" or "faffing about" means messing around doing
    unimportant or unnecessary things instead of useful things. In this
    case, it means writing lots of code for handling memory management to
    read a file instead of using a higher-level language and just reading
    the file.

    Yes, dropping requirements might make the task easier in C. But you
    still don't get close to being as easy as it is in a higher level
    language. (That does not have to be Python - I simply use that as an
    example that I am familiar with, and many others here will also have at
    least some experience of it.)


    For task like that Python could indeed be several times shorter, but
    only if you wrote your python script exclusively for yourself, cutting
    all corners, like not providing short help for user, not testing that
    input format matches expectations and most importantly not reporting
    input format problems in potentially useful manner.

    No, even if that were part of the specifications, it would still be far
    easier in Python. The brief Python samples I have posted don't cover
    such user help, options, error checking, etc., but that's because they
    are brief samples.

    OTOH, if we write our utility in more "anal" manner, as we should if
    we expect it to be used by other people or by ourselves long time after
    it was written (in my age, couple of months is long enough and I am not
    that much older than you) then code size difference between python and
    C variants will be much smaller, probably factor of 2 or so.

    Unless half the code is a text string for a help page, I'd expect a
    bigger factor. And I'd expect the development time difference to be an
    even bigger factor - with Python you avoid a number of issues that are
    easy to get wrong in C (such as memory management). Of course that
    would require a reasonable familiarity of both languages for a fair
    comparison.

    C and Python are both great languages, with their pros and cons and
    different areas where they shine. There can be good reasons for writing
    a program like this in C rather than Python, but C is often used without
    good technical reasons. To me, it is important to know a number of
    tools and pick the best one for any given job.


    W.r.t. faster to code, it very strongly depends on familiarity.
    You didn't do that sort of tasks in 'C' since your school days, right?
    Or ever? And you are doing them in Python quite regularly? Then that is
    much bigger reason for the difference than the language itself.

    Sure - familiarity with a particular tool is a big reason for choosing it.

    Now, for more complicated tasks Python, as the language, and even more importantly, Python as a massive set of useful libraries could have
    very big productivity advantage over 'C'. But it does not apply to very simple thing like reading numbers from text file.

    IMHO, it does. I have slightly lost track of which programs were being discussed in which thread, but the Python code for the task is a small
    fraction of the size of the C code. I agree that if you want to add
    help messages and nicer error messages, the difference will go down.

    Here is a simple task - take a file name as a command-line argument,
    then read all white-space (space, tab, newlines, mixtures) separated
    integers. Add them up and print the count, sum, and average (as an
    integer). Give a brief usage message if the file name is missing, and a
    brief error if there is something that is not an integer. This should
    be a task that you see as very simple in C.


    #!/usr/bin/python3
    import sys

    if len(sys.argv) < 2 :
        print("Usage: sums.py <input-file>")
        sys.exit(1)

    data = list(map(int, open(sys.argv[1], "r").read().split()))
    n = len(data)
    s = sum(data)
    print("Count: %i, sum %i, average %i" % (n, s, s // n))




    In the real world, I wrote utility akin to that less than two years ago.
    It converted big matrices from space delimited text to Matlab v4 .mat
    format. Why did I do it? Because while both Matlab and Gnu Octave are
    capable of reading text files like those, but they are quite slow doing
    so. With huge files that I was using at the moment, it became
    uncomfortable.
    I wrote it in 'C' (or was it C-style C++ ? I don't remember) mostly
    because I knew how to produce v4 .mat files in C. If I were doing it in Python, I'd have to learn how to do it in Python and at the end it
    would have taken me more time rather than less. I didn't even came to
    the point of evaluating whether speed of python's functions for parsing
    text was sufficient for my needs.


    Of course if you don't know Python, it will be slower to write it in Python!

    And there are times when Python /could/ be used, but C would be better -
    C has faster run-time for most purposes. In many situations you can get
    Python to run fast, by being careful of the code structures you use, or
    using JIT tools, or using toolkits like numpy. And of course these
    require additional development effort and learning to use.

  • From bart@21:1/5 to David Brown on Tue Jun 18 14:48:15 2024
    On 18/06/2024 13:36, David Brown wrote:
    On 18/06/2024 10:56, Michael S wrote:
    On Mon, 17 Jun 2024 15:23:55 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    I use Python rather than C because for
    PC code, that can often involve files, text manipulation, networking,
    and various data structures, the Python code is at least an order of
    magnitude shorter and faster to write.  When I see the amount of
    faffing around in order to read and parse a file consisting of a list
    of integers, I find it amazing that anyone would actively choose C
    for the task (unless it is for the fun of it).


    The faffing (what does it mean, BTW ?) is caused by unrealistic
    requirements. More specifically, by requirements of (A) to support
    arbitrary line length (B) to process file line by line. Drop just one
    of those requirements and everything become quite simple.

    "Faffing around" or "faffing about" means messing around doing
    unimportant or unnecessary things instead of useful things.  In this
    case, it means writing lots of code for handling memory management to
    read a file instead of using a higher-level language and just reading
    the file.

    Yes, dropping requirements might make the task easier in C.  But you
    still don't get close to being as easy as it is in a higher level
    language.  (That does not have to be Python - I simply use that as an example that I am familiar with, and many others here will also have at
    least some experience of it.)


    For task like that Python could indeed be several times shorter, but
    only if you wrote your python script exclusively for yourself, cutting
    all corners, like not providing short help for user, not testing that
    input format matches expectations and most importantly not reporting
    input format problems in potentially useful manner.

    No, even if that were part of the specifications, it would still be far easier in Python.  The brief Python samples I have posted don't cover
    such user help, options, error checking, etc., but that's because they
    are brief samples.

    OTOH, if we write our utility in more "anal" manner, as we should if
    we expect it to be used by other people or by ourselves long time after
    it was written (in my age, couple of months is long enough and I am not
    that much older than you) then code size difference between python and
    C variants will be much smaller, probably factor of 2 or so.

    Unless half the code is a text string for a help page, I'd expect a
    bigger factor.  And I'd expect the development time difference to be an
    even bigger factor - with Python you avoid a number of issues that are
    easy to get wrong in C (such as memory management).  Of course that
    would require a reasonable familiarity of both languages for a fair comparison.

    C and Python are both great languages, with their pros and cons and
    different areas where they shine.  There can be good reasons for writing
    a program like this in C rather than Python, but C is often used without
    good technical reasons.  To me, it is important to know a number of
    tools and pick the best one for any given job.


    W.r.t. faster to code, it very strongly depends on familiarity.
    You didn't do that sort of tasks in 'C' since your school days, right?
    Or ever? And you are doing them in Python quite regularly? Then that is
    much bigger reason for the difference than the language itself.

    Sure - familiarity with a particular tool is a big reason for choosing it.

    Now, for more complicated tasks Python, as the language, and even more
    importantly, Python as a massive set of useful libraries could have
    very big productivity advantage over 'C'. But it does not apply to very
    simple thing like reading numbers from text file.

    IMHO, it does.  I have slightly lost track of which programs were being discussed in which thread, but the Python code for the task is a small fraction of the size of the C code.  I agree that if you want to add
    help messages and nicer error messages, the difference will go down.

    Here is a simple task - take a file name as an command-line argument,
    then read all white-space (space, tab, newlines, mixtures) separated integers.  Add them up and print the count, sum, and average (as an integer).  Give a brief usage message if the file name is missing, and a brief error if there is something that is not an integer.  This should
    be a task that you see as very simple in C.


    #!/usr/bin/python3
    import sys

    if len(sys.argv) < 2 :
        print("Usage: sums.py <input-file>")
        sys.exit(1)

    data = list(map(int, open(sys.argv[1], "r").read().split()))
    n = len(data)
    s = sum(data)
    print("Count: %i, sum %i, average %i" % (n, s, s // n))

    A rather artificial task that you have chosen so that it can be done
    as a Python one-liner, for the main body.

    Some characteristics of how it is done are that the whole file is read
    into memory as effectively a single string, and all the numbers are
    collated into an in-memory array before it is processed.

    Numbers are also conveniently separated by white-space (no commas!), so
    that .split can be used.

    You are using features from Python that allow arbitrary large integers
    that also avoid any overflow on that sum.

    A C version wouldn't have all those built-ins to draw on (presumably you
    expect the starting point to be 'int main(int n ,char** args){}'; using existing libraries is not allowed).

    Some would write it so that the file is processed serially and doesn't
    have to occupy memory, or needed to deal with files that might fill up
    memory.

    They might also try to avoid building a large data[] array that may
    need to grow in size unless the bounds are determined in advance.

    The C version would be doing it in a different manner, and is likely
    to be more efficient.

    I haven't tried it directly in C (I don't have a C 'readfile' to
    hand); I tried it in my language on a 100MB test input of 15M random
    numbers ranging up to one million.

    It took just under 0.5 seconds. When I optimised it via C and gcc
    -O3, it took just over 0.3 seconds (so the C was 50% faster).

    In CPython, your version took 6 seconds, and PyPy was 4.8 seconds.

    With a more arbitrary input format, this would be the kind of job that a compiler's lexer does. But nobody seriously writes lexers in Python.


    (This is the main program from my attempt; not C, but equally low level:

    -------------------
    proc main=
    int n:=0, x, length:=0, sum:=0

    sptr:=readfile("data.txt")
    if sptr=nil then stop fi
    eof:=0

    while x:=nextnumber(); not eof do
    ++length
    sum+:=x
    od

    println "Length =", length
    println "Sum =", sum
    println "Average =", sum/length
    end
    -------------------

    Not shown is the fiddly 'nextnumber' routine. It uses 64-bit signed
    values, and handles negative numbers.

    This is it in action, run directly from source code (tcc can do this too!):

    C:\mapps>mm -run test
    Length = 15494902
    Sum = 7745911799036
    Average = 499900

  • From James Kuyper@21:1/5 to bart on Tue Jun 18 09:39:10 2024
    On 17/06/2024 21:24, bart wrote:
    ...
    If you don't need optimised code right now, why would you invoke gcc
    rather than tcc? It's a no-brainer.

    On virtually every occasion when I've heard someone claim that a given
    decision is a no-brainer, I would generally make a different decision if
    I actually applied my brain to the issue. This is no exception.

  • From bart@21:1/5 to David Brown on Tue Jun 18 14:55:57 2024
    On 18/06/2024 13:40, David Brown wrote:
    On 17/06/2024 21:24, bart wrote:
    On 17/06/2024 14:43, David Brown wrote:
    On 17/06/2024 12:30, bart wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...

    If you don't need optimised code right now, why would you invoke gcc
    rather than tcc? It's a no-brainer.

    You might use tcc if you have no brain.  People who do C development seriously don't use the compiler just to generate an exe file.  gcc is a development tool, not just a compiler.  (As is clang, and MSVC.)  If you think compilation speed of a subset of C is all that matters, you are
    not doing C development.


    It's all that mattered in the context that you snipped. Which
    actually /isn't/ C development; the C has been generated by a
    program. This is hardly an uncommon use of a C compiler; there, all
    you want of it is (1) to generate executable code, and (2) perhaps to
    make it generate fast code.

    There are a number of use-cases where the extra capabilities of C aren't relevant.

    If I do this:

    gcc prog.c
    del a.exe

    where 'del a.exe' is done by mistake (or perhaps I do 'gcc prog2.c'
    which wipes out a.exe; I like super-smart compilers!), then I have to do
    'gcc prog.c' again.

    But it has already been analysed, and nothing has changed; I just want a translation from .c file to .exe file.

  • From bart@21:1/5 to James Kuyper on Tue Jun 18 14:58:30 2024
    On 18/06/2024 14:39, James Kuyper wrote:
    On 17/06/2024 21:24, bart wrote:
    ...
    If you don't need optimised code right now, why would you invoke gcc
    rather than tcc? It's a no-brainer.

    On virtually every occasion when I've heard someone claim that a given decision is a no-brainer, I would generally make a different decision if
    I actually applied my brain to the issue. This is no exception.


    So your brain would tell you to choose a tool which takes at least 10
    times as long to do the same task?

    OK.

  • From Scott Lurndal@21:1/5 to bart on Tue Jun 18 14:33:34 2024
    bart <bc@freeuk.com> writes:
    On 18/06/2024 14:39, James Kuyper wrote:
    On 17/06/2024 21:24, bart wrote:
    ...
    If you don't need optimised code right now, why would you invoke gcc
    rather than tcc? It's a no-brainer.

    On virtually every occasion when I've heard someone claim that a given
    decision is a no-brainer, I would generally make a different decision if
    I actually applied my brain to the issue. This is no exception.


    So your brain would tell you to choose a tool which takes at least 10
    times as long to do the same task?

    That's a ridiculous characterization. Why on earth would I compile
    with tcc when it generates completely different (and much more poorly
    performing) code than the production compiler that I would use for
    the version shipped to customers?

    The difference in compile time for the vast majority of source
    files being compiled with those two compilers is in the noise.

  • From Michael S@21:1/5 to David Brown on Tue Jun 18 18:40:26 2024
    On Tue, 18 Jun 2024 14:36:40 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    Of course if you don't know Python, it will be slower to write it in
    Python!


    I don't know Python well, but that does not mean that I don't know
    it at all.
    A few minutes ago I took a look into the docs, and it seems that the
    situation with writing binary data files with a predefined layout is
    better than I suspected. They have something called the "Buffer
    Protocol". It allows one to specify the layout in a declarative
    manner, similarly to a C struct, or maybe even to Ada's records with
    a representation clause.
    However, an attempt to read the doc page further down proved that my
    suspicion about the steepness of the learning curve was not wrong :(

  • From David Brown@21:1/5 to Malcolm McLean on Tue Jun 18 18:15:13 2024
    On 18/06/2024 16:30, Malcolm McLean wrote:

    And here's a simple task for you. Our filesystem uses a new technology
    and is a bit dicey. Occasionally you will get a read error. Can you
    modify the python to print out that a read error has occurred?


    The error is in the specification of the task.

    If you are trying to suggest that sometimes Python is not a suitable
    language and C might be better for some tasks, then I already know that.

    (Mind you, it's quite possible that Python and fuse might be a suitable combination for prototyping a filesystem.)

  • From David Brown@21:1/5 to bart on Tue Jun 18 18:11:07 2024
    On 18/06/2024 15:48, bart wrote:
    On 18/06/2024 13:36, David Brown wrote:
    On 18/06/2024 10:56, Michael S wrote:
    On Mon, 17 Jun 2024 15:23:55 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    I use Python rather than C because for
    PC code, that can often involve files, text manipulation, networking,
    and various data structures, the Python code is at least an order of
    magnitude shorter and faster to write.  When I see the amount of
    faffing around in order to read and parse a file consisting of a list
    of integers, I find it amazing that anyone would actively choose C
    for the task (unless it is for the fun of it).


    The faffing (what does it mean, BTW ?) is caused by unrealistic
    requirements. More specifically, by requirements of (A) to support
    arbitrary line length (B) to process file line by line. Drop just one
    of those requirements and everything become quite simple.

    "Faffing around" or "faffing about" means messing around doing
    unimportant or unnecessary things instead of useful things.  In this
    case, it means writing lots of code for handling memory management to
    read a file instead of using a higher-level language and just reading
    the file.

    Yes, dropping requirements might make the task easier in C.  But you
    still don't get close to being as easy as it is in a higher level
    language.  (That does not have to be Python - I simply use that as an
    example that I am familiar with, and many others here will also have
    at least some experience of it.)


    For task like that Python could indeed be several times shorter, but
    only if you wrote your python script exclusively for yourself, cutting
    all corners, like not providing short help for user, not testing that
    input format matches expectations and most importantly not reporting
    input format problems in potentially useful manner.

    No, even if that were part of the specifications, it would still be
    far easier in Python.  The brief Python samples I have posted don't
    cover such user help, options, error checking, etc., but that's
    because they are brief samples.

    OTOH, if we write our utility in more "anal" manner, as we should if
    we expect it to be used by other people or by ourselves long time after
    it was written (in my age, couple of months is long enough and I am not
    that much older than you) then code size difference between python and
    C variants will be much smaller, probably factor of 2 or so.

    Unless half the code is a text string for a help page, I'd expect a
    bigger factor.  And I'd expect the development time difference to be
    an even bigger factor - with Python you avoid a number of issues that
    are easy to get wrong in C (such as memory management).  Of course
    that would require a reasonable familiarity of both languages for a
    fair comparison.

    C and Python are both great languages, with their pros and cons and
    different areas where they shine.  There can be good reasons for
    writing a program like this in C rather than Python, but C is often
    used without good technical reasons.  To me, it is important to know a
    number of tools and pick the best one for any given job.


    W.r.t. faster to code, it very strongly depends on familiarity.
    You didn't do that sort of tasks in 'C' since your school days, right?
    Or ever? And you are doing them in Python quite regularly? Then that is
    much bigger reason for the difference than the language itself.

    Sure - familiarity with a particular tool is a big reason for choosing
    it.

    Now, for more complicated tasks Python, as the language, and even more
    importantly, Python as a massive set of useful libraries could have
    very big productivity advantage over 'C'. But it does not apply to very
    simple thing like reading numbers from text file.

    IMHO, it does.  I have slightly lost track of which programs were
    being discussed in which thread, but the Python code for the task is a
    small fraction of the size of the C code.  I agree that if you want to
    add help messages and nicer error messages, the difference will go down.

    Here is a simple task - take a file name as an command-line argument,
    then read all white-space (space, tab, newlines, mixtures) separated
    integers.  Add them up and print the count, sum, and average (as an
    integer).  Give a brief usage message if the file name is missing, and
    a brief error if there is something that is not an integer.  This
    should be a task that you see as very simple in C.


    #!/usr/bin/python3
    import sys

    if len(sys.argv) < 2 :
         print("Usage: sums.py <input-file>")
         sys.exit(1)

    data = list(map(int, open(sys.argv[1], "r").read().split()))
    n = len(data)
    s = sum(data)
    print("Count: %i, sum %i, average %i" % (n, s, s // n))

    A rather artificial task that you have chosen so that it can be done
    as a Python one-liner, for the main body.

    It is an artificial task that matches Michael's description of a "very
    simple thing like reading numbers from text file". Perhaps I should
    have asked for the median and mode as well as the mean. In Python, that
    would mean adding these lines :


    from collections import Counter

    print("Mode: %i" % Counter(data).most_common(1)[0][0])

    if n % 2 == 1 :
        median = sorted(data)[n // 2]
    else :
        median = sum(sorted(data)[(n // 2 - 1) : (n // 2 + 1)]) / 2

    print("Median: %s" % median)


    Or there is statistics.mode() and statistics.median(), but I expect
    you'd call that cheating. And I know that sorting the data is
    inefficient compared to using heaps to calculate the median, but this
    is targeting low developer time, not low run time.

    How much more would that be in C?


    Some characteristics of how it is done are that the whole file is read
    into memory as effectively a single string, and all the numbers are
    collated into an in-memory array before it is processed.


    Yes. And that's fine.

    Numbers are also conveniently separated by white-space (no commas!), so
    that .split can be used.

    Yes, that was the specification. But if you want it to support
    spaces, newlines, tabs and commas, you can use re.split() instead of
    split():

    re.split(r"[ \t\n,]+", text.strip())

    (str.split() only accepts a single separator string, not a list, so
    the standard re module does the job here.) I'd probably arrange the
    code with a couple of extra lines in that case, as it's not nice to
    put too much functionality in one line.


    You are using features from Python that allow arbitrary large integers
    that also avoid any overflow on that sum.

    I'm using features from Python in my Python code when showing that
    Python has features making it more convenient than C for this kind of
    task! What a horror! That's downright /evil/ of me!


    A C version wouldn't have all those built-ins to draw on (presumably you expect the starting point to be 'int main(int n ,char** args){}'; using existing libraries is not allowed).

    Some would write it so that the file is processed serially and doesn't
    have to occupy memory, or needed to deal with files that might fill up memory.

    /Exactly/.


    They might also try and avoid building a large data[] array that may
    need to grow in size unless the bounds are determined in advance.

    The C version would be doing it in a different manner, and likely to be
    more efficient.

    Run-time speed was not at issue. We all know that it is possible to
    write C code for a task like this which will run a great deal faster
    than the Python code, especially if you can give extra restrictions to
    the incoming data.


    I haven't tried it directly in C (I don't have a C 'readfile' to hand);
    I tried it in my language on a 100MB test input of 15M random numbers
    ranging up to one million.

    No one is interested in that - that was not part of the task.


    With a more arbitrary input format, this would be the kind of job that a compiler's lexer does. But nobody seriously writes lexers in Python.

    Yes, people do. (Look up the PLY project, for example.) Nobody
    seriously writes lexers in C these days. They use Python or another
    high level language during development, prototyping and experimentation,
    and if the language takes off as a realistic general-purpose language,
    they either write the lexer and the rest of the tools in the new
    language itself, or they use C++.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Tue Jun 18 17:38:58 2024
    On 18/06/2024 17:11, David Brown wrote:
    On 18/06/2024 15:48, bart wrote:

    I haven't tried it directly in C (I don't have a C 'readfile' to
    hand); I tried it in my language on a 100MB test input of 15M random
    numbers ranging up to one million.

    No one is interested in that - that was not part of the task.

    You're arguing in favour of a high level scripting language for the task
    rather than a lower level one which you claim involves a lot of
    'faffing' around.

    I tried it using mine (C would have taken 10 minutes longer), and found
    it wasn't actually that hard, especially if you have some ready-made
    routines lying around.

    Plus it was a magnitude faster. Plus I showed how it could be run
    without a discrete build step just like Python (Tcc has that feature for C).

    So for this example, there wasn't a lot in it, while the low-level code
    was shown to be faster without trying too hard.

    For a throwaway program that is only run once you probably would use the nearest scripting language; its extra runtime (5 seconds for my example)
    is shorter than the extra coding time.

    But that's not why you might use C.



    With a more arbitrary input format, this would be the kind of job that
    a compiler's lexer does. But nobody seriously writes lexers in Python.

    Yes, people do.  (Look up the PLY project, for example.)  Nobody
    seriously writes lexers in C these days.  They use Python or another
    high level language during development, prototyping and experimentation,
    and if the language takes off as a realistic general-purpose language,
    they either write the lexer and the rest of the tools in the new
    language itself, or they use C++.

    I'm not talking about experimentation.

    Actually I'm starting to wonder whether you use C much at all, and why.

    You come across as a Python and C++ 'fan-boy'.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Tue Jun 18 18:54:36 2024
    On 18/06/2024 18:38, bart wrote:
    On 18/06/2024 17:11, David Brown wrote:
    On 18/06/2024 15:48, bart wrote:

    I haven't tried it directly in C (I don't have a C 'readfile' to
    hand); I tried it in my language on a 100MB test input of 15M random
    numbers ranging up to one million.

    No one is interested in that - that was not part of the task.

    You're arguing in favour of a high level scripting language for the task
    rather than a lower level one which you claim involves a lot of
    'faffing' around.


    Yes - for cases like the ones we've been looking at recently where
    Python code is vastly simpler and faster to write, and easier to get
    correct, and where the speed is fine for realistic use-cases.

    No one has suggested that Python is faster to /run/ than reasonable C
    code - the point is that where Python runs more than fast enough, going
    faster is of no benefit. But being faster to develop /is/ a benefit.

    I tried it using mine (C would have taken 10 minutes longer), and found
    it wasn't actually that hard, especially if you have some ready-made
    routines lying around.

    You are already far beyond the time it takes to write such code in
    Python. And anything in your language is irrelevant to everyone except you.


    Plus it was a magnitude faster. Plus I showed how it could be run
    without a discrete build step just like Python (Tcc has that feature for
    C).

    So for this example, there wasn't a lot in it, while the low-level code
    was shown to be faster without trying too hard.

    For a throwaway program that is only run once you probably would use the nearest scripting language; its extra runtime (5 seconds for my example)
    is shorter than the extra coding time.

    But that's not why you might use C.


    I agree. That's the point - use C when C is the best choice, use a
    higher level language when /that/ is the best choice.



    With a more arbitrary input format, this would be the kind of job
    that a compiler's lexer does. But nobody seriously writes lexers in
    Python.

    Yes, people do.  (Look up the PLY project, for example.)  Nobody
    seriously writes lexers in C these days.  They use Python or another
    high level language during development, prototyping and
    experimentation, and if the language takes off as a realistic
    general-purpose language, they either write the lexer and the rest of
    the tools in the new language itself, or they use C++.

    I'm not talking about experimentation.

    Actually I'm starting to wonder whether you use C much at all, and why.

    I use C for embedded systems where C is the right (or only!) choice. I
    also use C++ on such systems when that is the right choice.


    You come across as a Python and C++ 'fan-boy'.

    I'm a fan of picking the best available tool for the job. For something involving manipulating text, strings, and file data on a PC, that is
    rarely C.

    Unless, of course, you are doing stuff in C for the fun of it, in which
    case C is clearly the right choice!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to bart on Tue Jun 18 13:02:50 2024
    bart <bc@freeuk.com> writes:
    On 18/06/2024 14:39, James Kuyper wrote:
    On 17/06/2024 21:24, bart wrote:
    ...
    If you don't need optimised code right now, why would you invoke gcc
    rather than tcc? It's a no-brainer.

    On virtually every occasion when I've heard someone claim that a given
    decision is a no-brainer, I would generally make a different decision if
    I actually applied my brain to the issue. This is no exception.


    So your brain would tell you to choose a tool which takes at least 10
    times as long to do the same task?

    No, "the task" isn't "compile a program", it's "develop a program",
    which includes only a quite negligible amount of time spent compiling it.
    What I know about TCC is relatively limited, but the Wikipedia article
    is consistent with what I thought I knew. It says that tcc supports all
    of the features of C90, most of C99, and some gnu extensions. That is
    not the dialect of C I want to write in. I want full conformance with
    the latest official version of C, with any unintentional use of gnu
    extensions flagged with a diagnostic.
    Having to write my code in a crippled version of C would be a waste of
    my time, and having to fix it to take advantage of the features of a
    more modern version of C when I'm ready to optimize it would be a
    further waste of time. I'd save far more development time by writing in
    the same dialect of C from the very beginning, than I could ever
    possibly save by dividing entirely negligible compile times by a factor
    of 10.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Tue Jun 18 10:54:09 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 17 Jun 2024 07:30:44 -0000 (UTC)
    Kaz Kylheku <643-408-1753@kylheku.com> wrote:

    On 2024-06-17, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

    The problem is that Bart's compiler is VERY unusual. It's
    customized for his use, and he has lots of quirks in the way he
    thinks compilers should work, which are very different from those
    of most other programmers. In particular, compilation speed is very
    important to him, while execution speed is almost completely
    unimportant, which is pretty much the opposite of the way most
    programmers prioritize those things.

    Most programmers use Javascript and Python, which follow Bart's
    priorities. Fast, invisible compilation to some kind of byte code
    (plus possibly later JIT), slow execution time.

    I'd dare to say that most programmers care about speed of compilation
    more than they care about speed of execution even (or especially) when
    they use "visible" compilation processes. Except when compilation is
    already very fast.
    BTW, my impression was that Bart's 'C' compiler uses 'visible'
    compilation.

    Then again, neither speed of compilation nor speed of execution are top priorities for most pros. My guess is that #1 priority is conformance
    with co-workers/employer, #2 is convenient IDE, preferably integrated
    with a debugger, #3 is support, but there is a big distance between #2
    and #3. #4 is religious issues of various forms. Speed of compilation is
    at best #5.

    I agree that speed of compilation is nowhere near the top of my
    list, and probably that is true for many or most other developers
    as well. I suspect that what the start of the list looks like,
    both in terms of what items appear and in what order, varies a
    fair amount between different developers and different groups.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to James Kuyper on Tue Jun 18 19:06:48 2024
    On 18/06/2024 18:02, James Kuyper wrote:
    bart <bc@freeuk.com> writes:
    On 18/06/2024 14:39, James Kuyper wrote:
    On 17/06/2024 21:24, bart wrote:
    ...
    If you don't need optimised code right now, why would you invoke gcc
    rather than tcc? It's a no-brainer.

    On virtually every occasion when I've heard someone claim that a given
    decision is a no-brainer, I would generally make a different decision if
    I actually applied my brain to the issue. This is no exception.


    So your brain would tell you to choose a tool which takes at least 10
    times as long to do the same task?

    No, "the task" isn't "compile a program", it's "develop a program",
    which includes only a quite negligible amount of time spent compiling it.
    What I know about TCC is relatively limited, but the Wikipedia article
    is consistent with what I thought I knew. It says that tcc supports all
    of the features of C90, most of C99, and some gnu extensions. That is
    not the dialect of C I want to write in. I want full conformance with
    the latest official version of C, with any unintentional use of gnu extensions flagged with a diagnostic.
    Having to write my code in a crippled version of C would be a waste of
    my time, and having to fix it to take advantage of the features of a
    more modern version of C when I'm ready to optimize it would be a
    further waste of time. I'd save far more development time by writing in
    the same dialect of C from the very beginning, than I could ever
    possibly save by dividing entirely negligible compile times by a factor
    of 10.

    No, the task in my examples was to turn the validated C generated by a
    program into runnable binary.

    The C can be generated very quickly; then invoking gcc, even using -O0,
    would be like hitting a brick wall. With Tiny C, the whole process is
    more fluent.

    My generated C tends to be very conservative. The most controversial
    feature it has is the use of "$" in identifiers, which Tcc for some
    reason doesn't support by default unless you enable it with a
    long-winded option (see the examples I showed).

    This use of C is fairly common among programming languages, where you
    don't need a lot of fancy analysis, since that has already been done by
    the front end compiler. And it doesn't really need extra features
    either; that's also taken care of.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to bart on Tue Jun 18 11:07:28 2024
    bart <bc@freeuk.com> writes:

    On 18/06/2024 14:39, James Kuyper wrote:

    On 17/06/2024 21:24, bart wrote:
    ...

    If you don't need optimised code right now, why would you invoke gcc
    rather than tcc? It's a no-brainer.

    On virtually every occasion when I've heard someone claim that a given
    decision is a no-brainer, I would generally make a different decision if
    I actually applied my brain to the issue. This is no exception.

    So your brain would tell you to choose a tool which takes at least 10
    times as long to do the same task?

    "When two people do the same thing, it's not exactly the same."

    - the ancient playwright Terence

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From DFS@21:1/5 to David Brown on Tue Jun 18 14:14:27 2024
    On 6/18/2024 12:11 PM, David Brown wrote:


    if n % 2 == 1 :
            median = sorted(data)[n // 2]
    else :
            median = sum(sorted(data)[(n // 2 - 1) : (n // 2 + 1)]) / 2

    I think your else formula (n % 2 == 0) is incorrect:

    n = 4
    data = [1,2,3,4]
    median = 2.5

    Yours appears to sum (1,2,3) = 6 / 2 = 3.

    Am I reading it correctly?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Tue Jun 18 19:22:46 2024
    On 17/06/2024 14:43, David Brown wrote:
    On 17/06/2024 12:30, bart wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...
    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] of what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.

    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers.


    In
    particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Compilation speed is important to everyone. That's why so many tricks
    are used to get around the lack of speed in a big compiler, or so many
    extra resources are thrown at the problem.

    What "tricks" ?

    Going to considerable lengths to avoid actually doing any compilation,
    or to somehow cache previous results (I mean things like .pch files
    rather than .o files).

    Have a look at any makefile.

    If compilation was instant, half the reasons for a makefile and its
    dependency graphs would disappear.

    For the scale of programs I write, with the tools I use, compilation
    *is* more or less instant.

    (Roughly 0.1 seconds; faster than it takes to press and release the
    Enter key, for my main compiler. My C compiler takes a bit longer, as it
    has been accelerated, but it tends to be used for smaller projects if it
    is something I've written.)

    That depends on the language, type of code, and target platform. Typical
    C code on an x86_64 platform might be two or three times slower when
    using a poorly optimising compiler.  After all, the designers of x86
    cpus put a great deal of effort into making shitty code run fast.

    Yes, that's one reason why you can get away without an optimiser, for
    sensibly written source code. But it also makes reasoning about optimal
    code much harder: removing superfluous instructions often makes code slower!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From DFS@21:1/5 to Mark Bourne on Tue Jun 18 16:07:05 2024
    On 6/18/2024 3:52 PM, Mark Bourne wrote:
    DFS wrote:
    On 6/18/2024 12:11 PM, David Brown wrote:


    if n % 2 == 1 :
             median = sorted(data)[n // 2]
    else :
             median = sum(sorted(data)[(n // 2 - 1) : (n // 2 + 1)]) / 2

    I think your else formula (n % 2 == 0) is incorrect:

    n      = 4
    data   = [1,2,3,4]
    median = 2.5

    Yours appears to sum (1,2,3) = 6 / 2 = 3.

    Am I reading it correctly?

    Python ranges include the start index but exclude the end index.  So data[1:3] gives the items at data[1] and data[2], but not data[3].
    Indexes are zero-based, so data[1:3] == [2, 3], sum([2, 3]) == 5, and 5
    / 2 == 2.5.


    I knew Python is zero-based and I've done a lot of string slicing,
    but I didn't know you could sum a slice.

    Thanks.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mark Bourne@21:1/5 to Malcolm McLean on Tue Jun 18 21:09:25 2024
    Malcolm McLean wrote:
    On 18/06/2024 13:36, David Brown wrote:
    #!/usr/bin/python3
    import sys

    if len(sys.argv) < 2 :
         print("Usage: sums.py <input-file>")
         sys.exit(1)

    data = list(map(int, open(sys.argv[1], "r").read().split()))
    n = len(data)
    s = sum(data)
    print("Count: %i, sum %i, average %i" % (n, s, s // n))



    And here's a simple task for you. Our filesystem uses a new technology
    and is a bit dicey. Occasionally you will get a read error. Can you
    modify the python to print out that a read error has occurred?

    No modification required ;) A read error will raise an exception;
    unhandled exceptions are printed to stderr along with a stack trace, and
    the program terminates with a failure status.

    If you don't want the stack trace, you can handle the exception to just
    print a message and exit, but this is comp.lang.c, not comp.lang.python.

    --
    Mark.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mark Bourne@21:1/5 to DFS on Tue Jun 18 20:52:58 2024
    DFS wrote:
    On 6/18/2024 12:11 PM, David Brown wrote:


    if n % 2 == 1 :
             median = sorted(data)[n // 2]
    else :
             median = sum(sorted(data)[(n // 2 - 1) : (n // 2 + 1)]) / 2

    I think your else formula (n % 2 == 0) is incorrect:

    n      = 4
    data   = [1,2,3,4]
    median = 2.5

    Yours appears to sum (1,2,3) = 6 / 2 = 3.

    Am I reading it correctly?

    Python ranges include the start index but exclude the end index. So
    data[1:3] gives the items at data[1] and data[2], but not data[3].
    Indexes are zero-based, so data[1:3] == [2, 3], sum([2, 3]) == 5, and 5
    / 2 == 2.5.
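    The slice arithmetic can be checked directly:

```python
data = [1, 2, 3, 4]
n = len(data)
# data[1:3] keeps indexes 1 and 2, excluding 3
assert sorted(data)[(n // 2 - 1):(n // 2 + 1)] == [2, 3]
median = sum(sorted(data)[(n // 2 - 1):(n // 2 + 1)]) / 2
print(median)
# → 2.5
```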

    --
    Mark.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to bart on Tue Jun 18 16:34:59 2024
    bart <bc@freeuk.com> writes:

    If compilation was instant, half the reasons for a makefile and its dependency graphs would disappear.

    Even if that conclusion were right, it's irrelevant, because the
    premise is false. Furthermore as compilers get faster there is
    an irresistible force to add diagnostic tools to simplify and
    improve the reliability of program development, and that will
    easily soak up as many cycles as are available.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Wed Jun 19 07:44:44 2024
    On 18/06/2024 18:39, Malcolm McLean wrote:
    On 18/06/2024 16:40, Michael S wrote:
    On Tue, 18 Jun 2024 14:36:40 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    Of course if you don't know Python, it will be slower to write it in
    Python!


    I don't know Python well, but it does not mean that I don't know it at
    all.
    A few minutes ago I took a look into the docs and it seems the situation
    with writing binary data files with a predefined layout is better than I
    was suspecting. They have something called "Buffer Protocol". It allows
    you to specify the layout in a declarative manner, similarly to a C
    struct or maybe even to Ada's records with a representation clause.
    However, an attempt to read the doc page further down proved that my
    suspicion about the steepness of the learning curve was not wrong :(


    My main experience of Python was that we had some resource files which
    were icons, in matching light and dark themes. The light theme had
    suffix _L followed by extension, and the dark themes had _D. And they
    needed to be sorted alphabetically, except that _L should be placed
    before _D.
    And it didn't take long to get Python to sort the list alphabetically,
    but there seemed no way in to the sort comparison function itself. And
    I had to give up.


    Python "sort" is a bit like C "qsort" (desperately trying to relate this
    to the group topicality) in that you can define your own comparison
    function, and use that for "sort". For simple comparison functions,
    people often use lambdas, for more complicated ones it's clearer to
    define a function with a name.
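    A minimal illustration of the key-function style, with made-up data:

```python
# sort pairs by their second field, using a lambda as the key function
pairs = [("b", 2), ("a", 3), ("c", 1)]
print(sorted(pairs, key=lambda p: p[1]))
# → [('c', 1), ('b', 2), ('a', 3)]
```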

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Wed Jun 19 07:39:26 2024
    On 18/06/2024 17:40, Michael S wrote:
    On Tue, 18 Jun 2024 14:36:40 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    Of course if you don't know Python, it will be slower to write it in
    Python!


    I don't know Python well, but it does not mean that I don't know it at
    all.
    A few minutes ago I took a look into the docs and it seems the situation
    with writing binary data files with a predefined layout is better than I
    was suspecting. They have something called "Buffer Protocol". It allows
    you to specify the layout in a declarative manner, similarly to a C
    struct or maybe even to Ada's records with a representation clause.
    However, an attempt to read the doc page further down proved that my
    suspicion about the steepness of the learning curve was not wrong :(



    "Buffer protocol" is for passing data between Python and C extensions,
    which is certainly a complicated business.

    For dealing with binary data in specific formats in Python, the "struct"
    module is your friend. It lets you pack and unpack data with specific
    sizes and endianness using a compact format string notation. I've used
    it for dealing with binary file formats and especially for network
    packets. There's also the ctypes module which is aimed at duplicating
    C-style types and structures, primarily for interfacing with DLLs and
    dynamic (.so) libraries.
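    As a small sketch of the struct module (the record layout here is made
    up):

```python
import struct

# little-endian record: one uint32, one uint16, then 8 raw bytes
fmt = "<IH8s"
rec = struct.pack(fmt, 1000000, 42, b"icon.png")
assert struct.calcsize(fmt) == 14      # 4 + 2 + 8; "<" disables padding
print(struct.unpack(fmt, rec))
# → (1000000, 42, b'icon.png')
```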

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Wed Jun 19 10:05:53 2024
    On 18/06/2024 20:22, bart wrote:
    On 17/06/2024 14:43, David Brown wrote:
    On 17/06/2024 12:30, bart wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...
    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he does compilers himself, he has a much better feeling [than you
    or me] of what is hard and what is easy, what is small and what is big,
    what is fast and what is slow. That applies to all compilers except
    those that are very unusual. "Major" compilers are not unusual at all.
    The problem is that Bart's compiler is VERY unusual. It's customized
    for his use, and he has lots of quirks in the way he thinks compilers
    should work, which are very different from those of most other
    programmers.


    In
    particular, compilation speed is very important to him, while execution >>>> speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Compilation speed is important to everyone. That's why so many tricks
    are used to get around the lack of speed in a big compiler, or so
    many extra resources are thrown at the problem.

    What "tricks" ?

    Going to considerable lengths to avoid actually doing any compilation,
    or to somehow cache previous results (I mean things like .pch files
    rather than .o files).

    Have a look at any makefile.

    As I suspected, your idea of "tricks" is mostly what other people call
    useful or essential tools.

    I would use makefiles even if compilation was instant. I /do/ use
    makefiles even when compilation is near instant. I use them even if
    every run requires a full rebuild of everything. I use them for all
    kinds of tasks other than compiling C - I first started using them for cross-assembly builds on DOS.

    The point of a makefile (or other build system) is twofold:

    1. Get consistent results, with minimal risk of sometimes getting the
    build process wrong.

    2. Save time and effort for the developer.

    It takes a special kind of dedication and stubborn, wilful ignorance to
    fail to see the benefit of build tools. (make is not the only option available.)


    If compilation was instant, half the reasons for a makefile and its dependency graphs would disappear.

    My makefiles would be simpler if compilation were instant, but they
    would be equally essential to my work.


    For the scale of programs I write, with the tools I use, compilation
    *is* more or less instant.

    Some of us write serious code, use serious tools, and use them in
    serious ways.


    (Roughly 0.1 seconds; faster than it takes to press and release the
    Enter key, for my main compiler. My C compiler takes a bit longer, as it
    has been accelerated, but it tends to be used for smaller projects if it
    is something I've written.)

    That depends on the language, type of code, and target platform.
    Typical C code on an x86_64 platform might be two or three times
    slower when using a poorly optimising compiler.  After all, the
    designers of x86 cpus put a great deal of effort into making shitty
    code run fast.

    Yes, that's one reason why you can get away without an optimiser, for sensibly written source code. But it also makes reasoning about optimal
    code much harder: removing superfluous instructions often makes code
    slower!


    No, removing superfluous instructions very rarely makes code slower.
    But I agree that it is hard to figure out optimal code sequences on
    modern cpus, especially x86_64 devices, and even more so when you want
    fast results on a range of such processors. Writing a good optimiser is
    a very difficult task. But when compilers with good optimisers exist,
    /using/ them is not at all hard for getting reasonable results.
    (Squeezing out the last few percent /is/ hard.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Wed Jun 19 12:42:31 2024
    On 19/06/2024 11:25, Malcolm McLean wrote:
    On 18/06/2024 23:49, Keith Thompson wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    [...]
    And it didn't take long to get Python to sort the list alphabetically,
    but there seemed no way in to the sort comparison function
    itself. And I had to give up.

    <OT>
    https://docs.python.org/3/library/functions.html#sorted
    https://docs.python.org/3/library/stdtypes.html#list.sort
    </OT>

    key specifies a function of one argument that is used to extract a
    comparison key from each element in iterable (for example,
    key=str.lower). The default value is None (compare the elements directly).

    You see the problem. I can sort on any field. I can sort alphabetically
    upwards and downwards. But I don't want to do that. I want to use a
    non-alphabetical comparison function on two fields, and I need to
    specify that myself, because it's impossible that it is available
    anywhere. And that is to sort alphabetically, except where the strings
    match except for an embedded "_L_" or "_D_", where the string with the
    embedded "_L_" should be treated as closer to A than the string with the
    embedded "_D_".


    def LD_key(n) :
        if "_L_" in n : return (0, n)
        if "_D_" in n : return (1, n)
        return (2, n)

    Now you have a key function that will put all names containing "_L_"
    first, then all names containing "_D_", then everything else, with
    alphabetic sorting within those groups.

    There is no problem here - you just have to think about things in a
    different way.
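    For example, with some hypothetical icon names:

```python
def LD_key(n):
    # tuple keys: group rank first ("_L_" before "_D_" before the rest),
    # then alphabetical order within each group
    if "_L_" in n: return (0, n)
    if "_D_" in n: return (1, n)
    return (2, n)

names = ["icon_D_save.png", "icon_L_save.png", "icon_L_open.png", "readme.txt"]
print(sorted(names, key=LD_key))
# → ['icon_L_open.png', 'icon_L_save.png', 'icon_D_save.png', 'readme.txt']
```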

    (I don't know why Python 3 dropped the comparison function support from sort()/sorted(). It might be that a key function is more efficient,
    since you call it once for each item rather than once for each comparison.)
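    (The old comparison-function style is still reachable through
    functools.cmp_to_key, which wraps a three-way comparator as a key
    function; a small sketch with made-up data:)

```python
from functools import cmp_to_key

def compare(a, b):
    # classic three-way comparator: case-insensitive alphabetical order
    al, bl = a.lower(), b.lower()
    return (al > bl) - (al < bl)

words = ["Banana", "apple", "Cherry"]
print(sorted(words, key=cmp_to_key(compare)))
# → ['apple', 'Banana', 'Cherry']
```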

    And I'm sure there is some way to achieve this. But in C, it is achieved
    simply by declaring that qsort takes a function pointer to user-supplied
    code.

    Yes, there is some way to achieve this all in Python. And like pretty
    much every other question that is commonly asked, google will tell you
    the answer. Sometimes things seem hard - then you do a little research,
    learn a bit, and then its easy.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Wed Jun 19 11:52:04 2024
    On 19/06/2024 09:05, David Brown wrote:
    On 18/06/2024 20:22, bart wrote:
    On 17/06/2024 14:43, David Brown wrote:
    On 17/06/2024 12:30, bart wrote:
    On 17/06/2024 07:22, James Kuyper wrote:
    On 6/13/24 10:43, Michael S wrote:
    On Thu, 13 Jun 2024 13:53:54 +0200
    David Brown <david.brown@hesbynett.no> wrote:
    ...
    I know more than most C programmers about how certain C compilers
    work, and what works well with them, and what is relevant for them -
    though I certainly don't claim to know everything. Obviously Bart
    knows vastly more about how /his/ compiler works. He also tends to
    do testing with several small and odd C compilers, which can give
    interesting results even though they are of little practical
    relevance for real-world C development work.


    Since he do compilers himself, he has much better feeling [that you >>>>>> or me] of what is hard and what is easy, what is small and what is >>>>>> big,
    what is fast and what is slow. That applies to all compilers except >>>>>> those that are very unusual. "Major" compiler are not unusual at all. >>>>>
    The problem is that Bart's compiler is VERY unusual. It's
    customized for
    his use, and he has lots of quirks in the way he thinks compilers
    should
    work, which are very different from those of most other programmers.


    In
    particular, compilation speed is very important to him, while
    execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Compilation speed is important to everyone. That's why so many
    tricks are used to get around the lack of speed in a big compiler,
    or so many extra resources are thrown at the problem.

    What "tricks" ?

    Going to considerable lengths to avoid actually doing any compilation,
    or to somehow cache previous results (I mean things like .pch files
    rather than .o files).

    Have a look at any makefile.

    As I suspected, your idea of "tricks" is mostly what other people call
    useful or essential tools.

    I would use makefiles even if compilation was instant.  I /do/ use
    makefiles even when compilation is near instant.  I use them even if
    every run requires a full rebuild of everything.  I use them for all
    kinds of tasks other than compiling C - I first started using them for cross-assembly builds on DOS.

    The point of a makefile (or other build system) is twofold:

    1. Get consistent results, with minimal risk of sometimes getting the
    build process wrong.

    2. Save time and effort for the developer.

    It takes a special kind of dedication and stubborn, wilful ignorance to
    fail to see the benefit of build tools.  (make is not the only option available.)


    If compilation was instant, half the reasons for a makefile and its
    dependency graphs would disappear.

    My makefiles would be simpler if compilation were instant, but they
    would be equally essential to my work.


    For the scale of programs I write, with the tools I use, compilation
    *is* more or less instant.

    Some of us write serious code, use serious tools, and use them in
    serious ways.

    I understand. You can't take a product seriously unless it's big, and
    it's slow, and it's got lots of shiny buttons!

    My company had the same problem once: I had a product that fitted onto
    one floppy disk, which seemed insubstantial. So it was supplied on a CD
    instead to make it seem bigger than it was (a CD had a capacity 500
    times greater than a floppy).

    However, I can paste here the result of running a C program. Could you
    tell whether it was built with a 0.2MB compiler or a 0.2GB one? Could
    you tell whether it was built in 0.1 seconds or if it took a minute?

    Could you tell whether execution took 5 seconds or 10 seconds?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Wed Jun 19 19:53:55 2024
    On 19/06/2024 12:52, bart wrote:
    On 19/06/2024 09:05, David Brown wrote:

    Some of us write serious code, use serious tools, and use them in
    serious ways.

    I understand.

    No. You don't understand. No doubt you never will, because you have
    spend such a lot of time and effort to be sure that you will never
    understand.

    I'm glad that you are happy with the tools you use - and let's leave it
    there.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Wed Jun 19 19:49:50 2024
    On 19/06/2024 18:52, Malcolm McLean wrote:

    Yes, but that's not quite what we want. A typical input would go.

    It's extremely hard to guess what you want (not "we", but "you" - no one
    else wants this kind of thing) when you have bizarre requirements and
    only give bits of them. So modifying the Python code is left as an
    exercise if you are interested, especially as it is off-topic.

    I appreciate that Python programming will be more difficult than C
    programming if you are familiar with C and have never written Python.
    That's not the point. The point is that for someone reasonably familiar
    with both languages, some types of coding - such as the ones discussed
    here - are faster and easier to develop in Python.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Malcolm McLean on Wed Jun 19 22:51:00 2024
    On 19/06/2024 21:42, Malcolm McLean wrote:
    On 19/06/2024 18:49, David Brown wrote:
    On 19/06/2024 18:52, Malcolm McLean wrote:

    Yes, but that's not quite what we want. A typical input would go.

    It's extremely hard to guess what you want (not "we", but "you" - no
    one else wants this kind of thing) when you have bizarre requirements
    and only give bits of them.  So modifying the Python code is left as
    an exercise if you are interested, especially as it is off-topic.

    This was a work project, so "we". I would never set up such a system.
    But we had light-themed and dark-themed icons, and they had to be
    arranged just so, so that the program would find them and show the right
    theme. And as you can imagine, it was a nuisance for us programmers to
    set up the resource scripts so that everything was right.

    So why not get Python to do the job? But there wasn't much enthusiasm.
    So, despite not knowing Python, I decided to have a go, and I got a
    sorted list of icons quite easily, and it looked promising. But now the special requirement for a little deviation from alphabetical sort. And I couldn't work out how to do that.

    And it wasn't what I was supposed to be doing or paid to do. We used to
    have a weekly meeting where we discussed work done. If I said "oh, and I spent an afternoon knocking up this little Python script to help get
    those resource files together", then that's OK. If I say that was my
    main focus for the week, no, and if I say I spent substantial time on it
    and it didn't even work - well that really wouldn't go down well.
    So I had to abandon it once it became clear that it would take many
    hours of trawling through docs and online tips to try to work out a way.
    And no-one has posted a solution here. And, whilst there will be a way,
    I suspect that it just doesn't use the mainstream language facilities. I
    suspect that Python isn't really a programming language - a language
    designed to make it easy to apply arbitrary transforms to data - it's a
    scripting language - a language designed to make it easy to call
    pre-existing code to do the things it is designed to do.

    But maybe I'm unfair.


    No; I don't like Python either. It's big. It has lots of advanced, hard-to-grasp features. There's a dozen unwieldy, usually incompatible
    ways of getting anything done.

    Pretty much everything can be assigned to (the only exception is
    reserved words). Because every user identifier (even if declared with def
    or class or module) is a variable.

    Take a simple feature like a struct with mutable fields:

    typedef struct {int x, y;} Point;

    How do you do that in Python; maybe a class:

    class Point:
        pass

    p = Point() # create an instance
    p.x = 100 # fill in the fields
    p.qwghheghergh = 200

    But here you type gobbledygook instead of 'y'; it still works! You can
    have an unlimited number of attributes.

    Maybe a tuple is better, but those are indexed by number, not by name.
    So you use a module providing NamedTuples - except those fields are
    immutable. You can only update the lot.

    Here's one I've just seen (that '@' line is a 'decorator'; don't ask me
    what it means):

    from dataclasses import dataclass

    @dataclass
    class Point:
        x: int
        y: int

    p = Point(10, 20)
    print (p)

    This looks promising. Then I tried 'p.z = 30'; it still works. So does:

    p = Point("dog", "cat") # what happened to those int types?

    Maybe you need a struct type compatible with C; then you might try this:

    from ctypes import *

    class YourStruct(Structure):    # adapted from online example
        _fields_ = [('x', c_int),
                    ('y', c_int)]

    It's just a big, ugly, kitchen-sink language. They throw in every
    feature they can think of (like C++, possibly why DB likes it) in the
    hope that somewhere in the mess is a solution to your needs.

    I'm not surprised it takes 20MB to embed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Thu Jun 20 09:55:40 2024
    On 19/06/2024 22:42, Malcolm McLean wrote:
    On 19/06/2024 18:49, David Brown wrote:
    On 19/06/2024 18:52, Malcolm McLean wrote:

    Yes, but that's not quite what we want. A typical input would go.

    It's extremely hard to guess what you want (not "we", but "you" - no
    one else wants this kind of thing) when you have bizarre requirements
    and only give bits of them.  So modifying the Python code is left as
    an exercise if you are interested, especially as it is off-topic.

    This was a work project, so "we". I would never set up such a system.
    But we had light-themed and dark-themed icons, and they had to be
    arranged just so, so that the program would find them and show the right
    theme. And as you can imagine, it was a nuisance for us programmers to
    set up the resource scripts so that everything was right.

    So why not get Python to do the job? But there wasn't much enthusiasm.
    So, despite not knowing Python, I decided to have a go, and I got a
    sorted list of icons quite easily, and it looked promising. But now the special requirement for a little deviation from alphabetical sort. And I couldn't work out how to do that.

    And it wasn't what I was supposed to be doing or paid to do. We used to
    have a weekly meeting where we discussed work done. If I said "oh, and I spent an afternoon knocking up this little Python script to help get
    those resource files together", then that's OK. If I say that was my
    main focus for the week, no, and if I say I spent substantial time on it
    and it didn't even work - well that really wouldn't go down well.
    So I had to abandon it once it became clear that it would take many
    hours of trawling through docs and online tips to try to work out a way.
    And no-one has posted a solution here. And, whilst there will be a way,
    I suspect that it just doesn't use the mainstream language facilities. I
    suspect that Python isn't really a programming language - a language
    designed to make it easy to apply arbitrary transforms to data - it's a
    scripting language - a language designed to make it easy to call
    pre-existing code to do the things it is designed to do.

    But maybe I'm unfair.


    It's not so much that you are being unfair, it is that you are arguing
    from a position of almost total ignorance. It's fine to say you know
    little about Python and can't comment much on it. It is /not/ fine to
    say (and demonstrate) that you know almost nothing about a language and
    then make wild claims about what it can and cannot do.

    This is not a Python group, so I have not bothered writing code for your
    weird requirements. Suffice it to say that it would not be hard, and it
    would use "mainstream language facilities" (I take that to mean the
    language and its standard libraries, rather than third-party tools).

    Yes, Python is a "programming language" by any reasonable definition of
    that term. (It is /also/ useful as a scripting language - languages can
    be suitable for more than one task.) Yes, it is designed to make it
    easy to "apply arbitrary transforms to data" - it is usually very much
    easier to do this than in C, at the cost of less efficient run-time performance. And no language has pre-existing code or standard
    functions to handle your highly unusual sorting requirements - such
    things always need their own code.


    You've made it clear you know nothing about the language. Fair enough -
    we all know almost nothing about almost all programming languages. But
    trust someone who does.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Jun 20 12:34:43 2024
    On 19/06/2024 23:51, bart wrote:
    On 19/06/2024 21:42, Malcolm McLean wrote:
    On 19/06/2024 18:49, David Brown wrote:
    On 19/06/2024 18:52, Malcolm McLean wrote:

    Yes, but that's not quite what we want. A typical input would go.

    It's extremely hard to guess what you want (not "we", but "you" - no
    one else wants this kind of thing) when you have bizarre requirements
    and only give bits of them.  So modifying the Python code is left as
    an exercise if you are interested, especially as it is off-topic.

    This was a work project, so "we". I would never set up such a system.
    But we had light-themed and dark-themed icons, and they had to be
    arranged just so, so that the program would find them and show the
    right theme. And as you can imagine, it was a nuisance for us
    programmers to set up the resource scripts so that everything was right.

    So why not get Python to do the job? But there wasn't much enthusiasm.
    So, despite not knowing Python, I decided to have a go, and I got a
    sorted list of icons quite easily, and it looked promising. But now
    the special requirement for a little deviation from alphabetical sort.
    And I couldn't work out how to do that.

    And it wasn't what I was supposed to be doing or paid to do. We used
    to have a weekly meeting where we discussed work done. If I said "oh,
    and I spent an afternoon knocking up this little Python script to help
    get those resource files together", then that's OK. If I say that was
    my main focus for the week, no, and if I say I spent substantial time
    on it and it didn't even work - well that really wouldn't go down well.
    So I had to abandon it once it became clear that it would take many
    hours of trawling through docs and online tips to try to work out a
    way. And no-one has posted a solution here. And, whilst there will be
    a way, I suspect that it just doesn't use the mainstream language
    facilities. I suspect that Python isn't really a programming language
    - a language designed to make it easy to apply arbitrary transforms to
    data - it's a scripting language - a language designed to make it easy
    to call pre-existing code to do the things it is designed to do.

    But maybe I'm unfair.


    No; I don't like Python either. It's big. It has lots of advanced, hard-to-grasp features. There's a dozen unwieldy, usually incompatible
    ways of getting anything done.

    Like any major language, for any given programmer there will be things
    they like, and things they don't like. Python is very flexible, which
    has its advantages and disadvantages. And both the language and the
    standard library are very big - again that has pros and cons. I find
    that you can make good use of Python without knowing anything about its
    more advanced language features, so I never found the size overwhelming
    even as a beginner.


    Pretty much everything can be assigned to (the only exception is
    reserved words). Because every user identifier (even if declared with def
    or class or module) is a variable.

    The concept of "variable" in Python is quite different from that of C.
    You can pretend they are similar for very simple Python snippets, but
    then you will end up thinking there are lots of arbitrary rules for when assignment and function parameters are by value or by reference. It is
    better to think that all "things" in Python are anonymous
    reference-counted objects on the heap. When it looks like you have a
    variable, you actually just have a named reference to such objects.
    Imagine it more like your "variables" are all "void *" pointers or
    references, while all other types and structures are malloc'd. These references have no type information - but the objects they point to are
    all strongly typed. And the objects have reference-counted garbage
    collection.
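    That model is easy to see in action; a small self-contained sketch:

```python
# two names, one list object: assignment copies the reference, not the data
a = [1, 2, 3]
b = a
b.append(4)
print(a)        # [1, 2, 3, 4] - both names see the change
print(a is b)   # True - same object

c = a[:]        # a slice makes a new list object
c.append(5)
print(a is c)   # False - c is an independent copy
```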


    Take a simple feature like a struct with mutable fields:

       typedef struct {int x, y;} Point;

    How do you do that in Python; maybe a class:

       class Point:
           pass

       p = Point()             # create an instance
       p.x = 100               # fill in the fields
       p.qwghheghergh = 200

    But here you type gobbledygook instead of 'y'; it still works! You can
    have an unlimited number of attributes.

    Yes. It is a dynamic language, and the default is to let you do all
    sorts of things like this. In particular, every object has a "__dict__"
    - a hash-map of all fields, including data members and methods. Once an
    object is created, you can add, change or remove these fields as you
    want. That lets you do all kinds of useful and interesting things at
    run-time - but of course it also lets you make all kinds of mistakes.
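    The __dict__ mechanism can be poked at directly; a minimal sketch:

```python
class P:
    pass

p = P()
p.x = 1
p.y = 2
print(p.__dict__)   # {'x': 1, 'y': 2} - fields live in a per-object dict
del p.x             # fields can be removed again at run time
print(p.__dict__)   # {'y': 2}
```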

    But you can restrict your types in various ways if you want - you can
    even add type annotations and use a static checker.

    The way to restrict a type here is using __slots__ :


    class Point :
        __slots__ = "x", "y"
        def __init__(self, x = 0, y = 0) :
            self.x = x
            self.y = y
        def __repr__(self) :
            return f"point({self.x}, {self.y})"

    You don't /need/ the __init__ or __repr__, but it's nice to have them.
    It is the __slots__ entry that is special, because it removes the normal __dict__. Now your type has only data members "x" and "y", and you
    can't add new ones - intentionally or by mistake.
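    A quick check of that behaviour, with a minimal slotted class repeated so the snippet runs on its own:

```python
class Point:
    __slots__ = ("x", "y")      # no per-object __dict__, only these fields
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

p = Point(2, 3)
p.x = 10            # fine: "x" is a declared slot
try:
    p.z = 30        # not a slot, so this raises AttributeError
except AttributeError as e:
    print("rejected:", e)
```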



    Maybe a tuple is better, but those are indexed by number, not by name.
    So you use a module providing NamedTuples - except those fields are immutable. You can only update the lot.


    from collections import namedtuple
    Point = namedtuple("point", ["x", "y"])

    Named tuples are indeed a way to do this, and are very common for the
    purpose. And yes, they are immutable - that's a feature as well as a restriction. It means, amongst other things, that they can be used as
    keys in dictionaries.

    You can use the _replace method to change one (or several) fields, while leaving the rest unchanged. Since the tuples are immutable, it returns
    a new tuple rather than changing the existing one.

    p = Point(2, 3)
    p2 = p._replace(x = 45)
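    Put together as a runnable sketch:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(2, 3)
p2 = p._replace(x=45)   # returns a new tuple; p itself is untouched
print(p2)               # Point(x=45, y=3)
print(p)                # Point(x=2, y=3)
d = {p: "home"}         # immutable, hence hashable: usable as a dict key
```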

    Here's one I've just seen (that '@' line is a 'decorator'; don't ask me
    what it means):

    Then let me tell you - "dataclass" is a function that takes a class (the
    type itself, not instances of the class) and returns a new class with
    some aspects changed. (That's the kind of thing you can do when classes
    are objects, and objects can normally be modified in all sorts of ways.)
    So the syntax here means a class "Point" is defined in the way you
    wrote it, then passed to the "dataclass" decorator which creates a new
    class (by adding __init__ and __repr__ methods), and then this new
    class gets the name "Point", hiding the original class. It sounds a bit complicated, but it is a useful concept, and works for classes,
    functions and other things.
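    A toy class decorator makes the mechanism concrete (this is an illustration, not how dataclass itself is implemented):

```python
def add_repr(cls):
    # takes a class, returns the (modified) class - that is all "@" does
    cls.__repr__ = lambda self: f"{type(self).__name__}(...)"
    return cls

@add_repr
class Thing:
    pass

print(repr(Thing()))   # Thing(...)
```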


      from dataclasses import dataclass

      @dataclass
      class Point:
         x: int
         y: int

      p = Point(10, 20)
      print (p)

    This looks promising. Then I tried 'p.z = 30'; it still works.

    You can use @dataclass(slots = True) to make the class use slots, and
    then you can't create new fields like "p.z".

    So does:

      p = Point("dog", "cat")      # what happened to those int types?

    Python has dynamic typing. The types you specify here are not part of
    the run-time language at all - they are annotations available to other
    tools (or other functions in Python). So static type checkers will see
    them - run-time code and interactive Python does not. (I'm not saying I
    like that - I'm just telling you how it is.)
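    This is easy to confirm: the annotations are stored on the class but never checked when you construct an object.

```python
from dataclasses import dataclass

@dataclass
class Pt:
    x: int
    y: int

p = Pt("dog", "cat")        # runs fine: annotations are not enforced
print(p)                    # Pt(x='dog', y='cat')
print(Pt.__annotations__)   # {'x': <class 'int'>, 'y': <class 'int'>}
```

A static checker such as mypy would flag the call; the interpreter does not.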


    Maybe you need a struct type compatible with C; then you might try this:

      from ctypes import *

      class YourStruct(Structure):          # adapted from online example
          _fields_ = [('x', c_int),
                      ('y', c_int)]


    That is a way to make structures for interaction with external code - basically, for when you are connecting to DLLs or .so libraries.
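    A minimal sketch of such a ctypes structure in use (field names as in the example above):

```python
from ctypes import Structure, c_int, sizeof

class CPoint(Structure):
    # fields get fixed C types and C layout, unlike ordinary attributes
    _fields_ = [("x", c_int), ("y", c_int)]

p = CPoint(10, 20)
print(p.x, p.y)         # 10 20
print(sizeof(CPoint))   # typically 8: two 32-bit ints, no padding
```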

    It's just a big, ugly, kitchen-sink language. They throw in every
    feature they can think of (like C++, possibly why DB likes it) in the
    hope that somewhere in the mess is a solution to your needs.

    I'm not surprised it takes 20MB to embed.


    Neither Python nor C++ throws in "every feature they can think of" - for
    both languages, there is a long process of proposals, discussions,
    testing, and consideration of the impact on the rest of the language,
    existing code, and possible future language features, before a feature
    is included. Yes, these are big languages. Sometimes big is good,
    sometimes it is bad - it would be wildly foolish to think that one
    language, or one style of language, is all that people need or want.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Malcolm McLean on Thu Jun 20 12:24:06 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 20/06/2024 08:55, David Brown wrote:
    ...
    You've made it clear you know nothing about the language.  Fair enough -
    we all know almost nothing about almost all programming languages.  But
    trust someone who does.

    Yes but I just don't.
    Everyone says "oh yes, that is easy - goto a Python group" and so
    on. No-one actually comes up with the code.

    In C, what we do is write a special version of strcmp

    int strcmp_light_and_dark(const char *a, const char *b)
    {
        int i;

        for (i = 0; a[i] && b[i]; i++)
        {
            if (a[i] != b[i])
            {
                if (a[i] == 'L' && b[i] == 'D' && i > 0 && a[i-1] == '_')
                    return -1;
                if (a[i] == 'D' && b[i] == 'L' && i > 0 && a[i-1] == '_')
                    return 1;
                break;
            }
        }

        return a[i] - b[i];
    }

    So easy to do.

    Unless I'm missing something here, that code does not do what you say
    you want. You gave an example of some input and the desired output but
    this comparison function does not sort into the ordering you gave.

    You may find this "ordering" hard to duplicate in other languages
    because it is not even an ordering in the mathematical sense as it is
    not transitive.
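    The non-transitivity is easy to demonstrate; a sketch in Python, with the comparison transcribed from the C above and three made-up strings that form a cycle:

```python
def cmp_ld(a, b):
    # transcription of the C comparison function (sign-compatible)
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if a[i] == 'L' and b[i] == 'D' and i > 0 and a[i-1] == '_':
                return -1
            if a[i] == 'D' and b[i] == 'L' and i > 0 and a[i-1] == '_':
                return 1
            return (a[i] > b[i]) - (a[i] < b[i])
    return (len(a) > len(b)) - (len(a) < len(b))

# each string compares "less than" the next, and the last compares
# less than the first - a cycle, so this is not an order relation
assert cmp_ld("x_La", "x_Db") < 0   # special L-before-D rule fires
assert cmp_ld("x_Db", "x_E") < 0    # plain alphabetic: 'D' < 'E'
assert cmp_ld("x_E", "x_La") < 0    # plain alphabetic: 'E' < 'L'
```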

    Can you specify the desired ordering as a total or partial order
    relation?

    On a C language point, I don't think the standard says anything about
    sorting with non-order functions like the one above. Is an
    implementation of qsort permitted to misbehave (for example by not
    terminating) when the comparison function does not implement a proper
    order relation?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Malcolm McLean on Thu Jun 20 13:18:40 2024
    On 20/06/2024 10:37, Malcolm McLean wrote:
    On 20/06/2024 08:55, David Brown wrote:
    On 19/06/2024 22:42, Malcolm McLean wrote:
    On 19/06/2024 18:49, David Brown wrote:
    On 19/06/2024 18:52, Malcolm McLean wrote:

    Yes, but that's not quite what we want. A typical input would go.

    It's extremely hard to guess what you want (not "we", but "you" - no
    one else wants this kind of thing) when you have bizarre
    requirements and only give bits of them.  So modifying the Python
    code is left as an exercise if you are interested, especially as it
    is off-topic.

    This was a work project, so "we". I would never set up such a system.
    But we had light-themed and dark-themed icons, and they had to be
    arranged just so, so that the program would find them and show the
    right theme. And as you can imagine, it was a nuisance for us
    programmers to set up the resource scripts so that everything was right.

    So why not get Python to do the job? But there wasn't much
    enthusiasm. So, despite not knowing Python, I decided to have a go,
    and I got a sorted list of icons quite easily, and it looked
    promising. But now the special requirement for a little deviation
    from alphabetical sort. And I couldn't work out how to do that.

    And it wasn't what I was supposed to be doing or paid to do. We used
    to have a weekly meeting where we discussed work done. If I said "oh,
    and I spent an afternoon knocking up this little Python script to
    help get those resource files together", then that's OK. If I say
    that was my main focus for the week, no, and if I say I spent
    substantial time on it and it didn't even work - well that really
    wouldn't go down well.
    So I had to abandon it once it became clear that it would take many
    hours of trawling through docs and online tips to try to work out a
    way. And no-one has posted a solution here. And, whilst there will be
    a way, I suspect that it just doesn't use the mainstream language
    facilities. I suspect that Python isn't really a programming language
    - a language designed to make it easy to apply arbitrary transforms
    to data - it's a scripting language - a language designed to make it
    easy to call pre-existing code to do the things it is designed to do.

    But maybe I'm unfair.


    It's not so much that you are being unfair, it is that you are arguing
    from a position of almost total ignorance.  It's fine to say you know
    little about Python and can't comment much on it.  It is /not/ fine to
    say (and demonstrate) that you know almost nothing about a language
    and then make wild claims about what it can and cannot do.

    This is not a Python group, so I have not bothered writing code for
    your weird requirements.  Suffice it to say that it would not be hard,
    and it would use "mainstream language facilities" (I take that to mean
    the language and its standard libraries, rather than third-party tools).

    Yes, Python is a "programming language" by any reasonable definition
    of that term.  (It is /also/ useful as a scripting language -
    languages can be suitable for more than one task.)  Yes, it is
    designed to make it easy to "apply arbitrary transforms to data" - it
    is usually very much easier to do this than in C, at the cost of less
    efficient run-time performance.  And no language has pre-existing code
    or standard functions to handle your highly unusual sorting
    requirements - such things always need their own code.


    You've made it clear you know nothing about the language.  Fair enough
    - we all know almost nothing about almost all programming languages.
    But trust someone who does.

    Yes but I just don't.

    Did you think I was lying? Did you think I was mistaken?

    Everyone says "oh yes, that is easy - goto a Python group" and so on.
    No-one actually comes up with the code.

    In C, what we do is write a special version of strcmp

    int strcmp_light_and_dark(const char *a, const char *b)
    {
        int i;

        for (i = 0; a[i] && b[i]; i++)
        {
            if (a[i] != b[i])
            {
               if (a[i] == 'L' && b[i] == 'D' && i > 0 && a[i-1] == '_')
                   return -1;
                if (a[i] == 'D' && b[i] == 'L' && i > 0 && a[i-1] == '_')
                   return 1;
               break;
            }
        }

        return a[i] - b[i];

    }

    In Python, you write a key function. I'm still not clear on exactly
    what you are asking for, especially since your function above does not
    match your earlier description. My obvious thought here is to replace
    "_D_" with "_Z_", and sort alphabetically :

    sorted(xs, key = lambda x : x.replace("_D_", "_Z_"))
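    Applied to some made-up icon names, the replace trick keeps "_L_" entries ahead of their "_D_" counterparts while everything else sorts alphabetically:

```python
names = ["icon_D_save", "icon_L_save", "icon_L_open", "icon_D_open", "about"]
# "_D_" sorts as "_Z_", pushing dark variants after the light ones
print(sorted(names, key=lambda s: s.replace("_D_", "_Z_")))
# ['about', 'icon_L_open', 'icon_L_save', 'icon_D_open', 'icon_D_save']
```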

    Now, I'm sure you will think of some other way to change your goalposts
    and come up with a new, inconsistent set of rules. Perhaps it was all
    too long ago for you to remember.


    So easy to do. And we can't pass that to qsort. We have to write a
    little wrapper to convert the pointers for us.


    You say you /can't/ pass your function to qsort, and yet you still think
    it is easier?

    But C has that sort of flexibility as a result of stripping features away
    from the language.


    C has never stripped features away from any language. It is simply a relatively small language - it's flexible, yes, but you need to write
    lots of things manually that are built in in higher-level languages.

    And of course anything like this that you can write in C, you can write
    in Python, if that's what you want. It will be /slower/ in Python, and
    is unlikely to be the best way to writing things, but it is entirely
    possible :

    def strcmp_light_and_dark(a: str, b: str) -> int:
        n = min(len(a), len(b))
        for i in range(n):
            if a[i] != b[i]:
                if a[i] == 'L' and b[i] == 'D' and i > 0 and a[i-1] == '_':
                    return -1
                if a[i] == 'D' and b[i] == 'L' and i > 0 and a[i-1] == '_':
                    return 1
                return (a[i] > b[i]) - (a[i] < b[i])
        return (len(a) > len(b)) - (len(a) < len(b))

    This can't be passed to Python's sort functions either, as they use key functions - but the wrapper is provided by Python as functools.cmp_to_key.
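    The cmp_to_key wrapper works with any three-way comparator; a sketch using a simple case-insensitive comparison as the example:

```python
from functools import cmp_to_key

def cmp_nocase(a, b):
    # old-style comparator: negative, zero, or positive
    a, b = a.lower(), b.lower()
    return (a > b) - (a < b)

names = ["banana", "Apple", "cherry"]
print(sorted(names, key=cmp_to_key(cmp_nocase)))
# ['Apple', 'banana', 'cherry']
```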

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Jun 20 14:37:44 2024
    On 20/06/2024 11:34, David Brown wrote:
    On 19/06/2024 23:51, bart wrote:

    [Discussing Python]

    Pretty much everything can be assigned to (the only exception is
    reserved words). Because every user identifier (even if declared with
    def or class or module) is a variable.

    The concept of "variable" in Python is quite different from that of C.
    You can pretend they are similar for very simple Python snippets, but
    then you will end up thinking there are lots of arbitrary rules for when assignment and function parameters are by value or by reference.  It is better to think that all "things" in Python are anonymous
    reference-counted objects on the heap.  When it looks like you have a variable, you actually just have a named reference to such objects.
    Imagine it more like your "variables" are all "void *" pointers or references, while all other types and structures are malloc'd.  These references have no type information - but the objects they point to are
    all strongly typed.  And the objects have reference-counted garbage collection.

    You haven't given an opinion. I think this is an unnecessary aspect of
    it, which also makes it harder to optimise, and to reason about.

    My languages have perhaps a dozen categories of identifiers, known at compile-time, which include variable names. Python has only one, a
    'variable'. It means this is possible:

    def F(n): return n + 1
    ...
    F = 42
    ....
    F(x) # will not work when F is 42

    In my language (in common with most other sensible ones!), if you really
    wanted that effect, you do it like this:

    fun F(n) = n + 1 # F is a function name; it cannot change
    G := F # G is a variable name; its value can change
    ...
    G := 42
    ....
    F(x) # Will always work
    G(x) # Will not work when G doesn't refer to
    # a function


    That is a way to make structures for interaction with external code - basically, for when you are connecting to DLLs or so libraries.

    It's just a big, ugly, kitchen-sink language. They throw in every
    feature they can think of (like C++, possibly why DB likes it) in the
    hope that somewhere in the mess is a solution to your needs.

    I'm not surprised it takes 20MB to embed.


    Neither Python nor C++ throws in "every feature they can think of" - for
    both languages, there is a long process of proposals, discussions,
    testing, and consideration of the impact on the rest of the language, existing code, and possible future language features, before a feature
    is included.

    And /then/ they include the feature! I've long given up keeping track.

      Yes, these are big languages.  Sometimes big is good,
    sometimes it is bad - it would be wildly foolish to think that one
    language, or one style of language, is all that people need or want.

    It's a big language that ignores many fundamental features. My scripting language is smaller and simpler, but it takes care of those because I
    think they are important.

    The record example, with variant elements, is defined like this:

    record point =
    var x, y
    end

    and the C-compatible version, which can also be used to enforce element
    types, or to save memory if there are large homogeneous arrays of them,
    like this:

    type cpoint =
    int32 x, y
    end

    Both have mutable elements. Neither allow arbitrary attributes (so
    impossible to misspell member names). And if the FFI demands it,
    pointers to structs or ints can be passed:

    p := cpoint(10, 20)
    &p # low-level pointer to the struct
    &p.x # low-level pointer to the int32 element

    or even:

    q := point(10, 20)
    &q.x # low-level pointer to the now int64 element

    It just works with no fuss, with no need for add-ons, or decorators, and
    using the same syntax you'd use in a static language that uses records and structs.

    I was aware of Python during the 1990s. My own scripting language was
    ungainly; so was Python. At one point I needed to bolt-on byte-arrays;
    so did Python!

    But Python even then completely disregarded performance. In the 1990s,
    if you wrote a loop like this:

    for i in range(1000000):
    ....

    it would actually create an object with a million elements so that you
    could iterate along it. It sounds absolutely crazy, and it was.

    Later they added xrange() which didn't do that, and later on 'xrange'
    morphed into 'range'.
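    (In Python 3 the renamed range is lazy, so even an absurdly large range
    costs almost nothing until iterated:)

```python
import sys

r = range(10**12)            # no trillion-element list is built
print(sys.getsizeof(r))      # a few dozen bytes, independent of length
print(500_000_000_000 in r)  # True - membership is computed in O(1)
print(r[10])                 # 10 - indexing without materialising anything
```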

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Thu Jun 20 08:05:24 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 20/06/2024 08:55, David Brown wrote:

    ...

    You've made it clear you know nothing about the language. Fair enough -
    we all know almost nothing about almost all programming languages. But
    trust someone who does.

    Yes but I just don't.
    Everyone says "oh yes, that is easy - goto a Python group" and so
    on. No-one actually comes up with the code.

    In C, what we do is write a special version of strcmp

    int strcmp_light_and_dark(const char *a, const char *b)
    {
        int i;

        for (i = 0; a[i] && b[i]; i++)
        {
            if (a[i] != b[i])
            {
                if (a[i] == 'L' && b[i] == 'D' && i > 0 && a[i-1] == '_')
                    return -1;
                if (a[i] == 'D' && b[i] == 'L' && i > 0 && a[i-1] == '_')
                    return 1;
                break;
            }
        }

        return a[i] - b[i];
    }

    So easy to do.

    Unless I'm missing something here, that code does not do what you say
    you want. You gave an example of some input and the desired output but
    this comparison function does not sort into the ordering you gave.

    You may find this "ordering" hard to duplicate in other languages
    because it is not even an ordering in the mathematical sense as it is
    not transitive.

    Can you specify the desired ordering as a total or partial order
    relation?

    On a C language point, I don't think the standard says anything about
    sorting with non-order functions like the one above.

    It does, somewhat indirectly, in section 7.22.5 paragraph 4.

    Is an
    implementation of qsort permitted to misbehave (for example by not terminating) when the comparison function does not implement a proper
    order relation?

    My reading of the C standard is that the comparison function
    must impose a total ordering on the elements actually present
    in the array, or is undefined behavior if it does not. In
    other words it's okay if the comparison function doesn't
    define a proper order relation, as long as there are no
    inconsistencies between values that actually occur in the
    particular array being sorted.
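    (The hazard is easy to demonstrate outside C. Python's sorted accepts a
    comparison function via functools.cmp_to_key; with a non-transitive
    comparator the result merely depends on input order, whereas C's qsort
    makes it undefined behaviour:)

```python
from functools import cmp_to_key

def cmp_ok(a, b):
    # A proper total order.
    return (a > b) - (a < b)

# Rock-paper-scissors: consistent pairwise, but not transitive.
beats = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}

def cmp_bad(a, b):
    if a == b:
        return 0
    return -1 if (a, b) in beats else 1

print(sorted([3, 1, 2], key=cmp_to_key(cmp_ok)))      # [1, 2, 3]

# Same multiset, different input orders - the "sorted" outputs need
# not agree, because no consistent order exists.
print(sorted(["rock", "paper", "scissors"], key=cmp_to_key(cmp_bad)))
print(sorted(["scissors", "rock", "paper"], key=cmp_to_key(cmp_bad)))
```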

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Jun 20 17:07:29 2024
    On 20/06/2024 15:37, bart wrote:
    On 20/06/2024 11:34, David Brown wrote:
    On 19/06/2024 23:51, bart wrote:

    [Discussing Python]

    Pretty much everything can be assigned to (the only exception is
    reserved words). Because every user identifer (even if declared with
    def or class or module) is a variable.

    The concept of "variable" in Python is quite different from that of C.
    You can pretend they are similar for very simple Python snippets, but
    then you will end up thinking there are lots of arbitrary rules for
    when assignment and function parameters are by value or by reference.
    It is better to think that all "things" in Python are anonymous
    reference-counted objects on the heap.  When it looks like you have a
    variable, you actually just have a named reference to such objects.
    Imagine it more like your "variables" are all "void *" pointers or
    references, while all other types and structures are malloc'd.  These
    references have no type information - but the objects they point to
    are all strongly typed.  And the objects have reference-counted
    garbage collection.

    You haven't given an opinion. I think this is an unnecessary aspect of
    it, which also makes it harder to optimise, and to reason about.


    I haven't given an opinion, no - I am trying primarily to give facts
    here. This is simply the way Python works, and if it did things
    differently, it would be a very different language. So I am not sure it
    makes sense to give an opinion on this aspect of Python alone.

    I quite like Python, and find it a useful language. I have used it for
    some large gui programs, some backend server systems, some web server
    code, and countless small programs (or scripts, if you prefer the term)
    in test systems and utilities. It is also my go-to language for
    single-use code.

    Naturally, there are lots of things about it that I /don't/ like, as is
    always the case for any language. And there will be some overlap with
    the things /you/ don't like about Python, as one would expect for such subjective opinions. I don't, however, feel any need to list my
    dislikes of Python here in comp.lang.c, or even in comp.lang.python.
    I've only been discussing Python as an example of how many programming
    tasks are easier in high-level languages than in C. I hadn't expected
    to have to justify that blindingly obvious fact to someone (not you) who
    has no idea about Python, nor to be giving tutorials to you to counter
    your pet peeves about the language. I suppose I should not be
    surprised, however.



    My languages have perhaps a dozen categories of identifiers, known at compile-time, which include variable names. Python has only one, a 'variable'. It means this is possible:

        def F(n): return n + 1
        ...
        F = 42
        ....
        F(x)                # will not work when F is 42

    I seem to remember you getting really worked up about C programmers
    using the same identifier for structs and variables!

    Some languages have a syntax and restrictions that let them keep
    identifier namespaces more separate, others can't. Functions in Python
    are objects, as are types - so they cannot be a different category of identifier. That's a language design choice with its pros and cons - it
    is not some sort of flaw, as you seem to imagine.


    Neither Python nor C++ throws in "every feature they can think of" -
    for both languages, there is a long process of proposals, discussions,
    testing, and consideration of the impact on the rest of the language,
    existing code, and possible future language features, before a feature
    is included.

    And /then/ they include the feature! I've long given up keeping track.

    They include the features they think will be useful and will fit well
    with the language, yes. Surely that's not surprising?


    Both have mutable elements. Neither allow arbitrary attributes (so
    impossible to misspell member names). And if the FFI demands it,
    pointers to structs or ints can be passed.

    You can do all this with Python. I showed you how to have structures
    with mutable elements - and immutable structures, and structures with or without the ability to add new fields.
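    (For anyone following along, two of those variants can be sketched - the
    field names and types here are illustrative:)

```python
from dataclasses import dataclass
from typing import NamedTuple

# Mutable structure with a fixed field set: __slots__ means a misspelt
# member name raises AttributeError instead of silently creating a field.
@dataclass
class Point:
    __slots__ = ("x", "y")
    x: int
    y: int

# Immutable structure.
class CPoint(NamedTuple):
    x: int
    y: int

p = Point(10, 20)
p.x = 11                  # mutation is fine
try:
    p.z = 5               # new fields are rejected
except AttributeError:
    print("no field z")

q = CPoint(10, 20)        # q.x = 11 would raise AttributeError
print(p.x, q.x)
```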


    But Python even then completely disregarded performance. In the 1990s,
    if you wrote a loop like this:

        for i in range(1000000):
            ....

    it would actually create an object with a million elements so that you
    could iterate along it. It sounds absolutely crazy, and it was.

    Later they added xrange() which didn't do that, and later on 'xrange'
    morphed into 'range'.


    So your complaint now is that newer versions of Python have made some
    common tasks more efficient? There's no pleasing some people.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Jun 20 17:58:25 2024
    On 20/06/2024 16:07, David Brown wrote:
    On 20/06/2024 15:37, bart wrote:
    On 20/06/2024 11:34, David Brown wrote:

    I've only been discussing Python as an example of how many programming
    tasks are easier in high-level languages than in C.

    A lot of it seems to be incantations that you can only come up with as
    an expert user. I wouldn't have been able to come up with even basic
    file-reading; I'd have to go and look up examples, every time.

    I seem to remember you getting really worked up about C programmers
    using the same identifier for structs and variables!

    Yes, you can have both 'struct T' and a type, variable etc called 'T';
    or a type 'T' and, due to case sensitivity, a variable or function
    called 't'.

    But those identifiers in C are still fixed at compile-time. You can't do
    this:

    printf = sqrt;

    In Python (not 2.x where 'print' was a reserved word), you can:

    print = math.sqrt


    Both have mutable elements. Neither allow arbitrary attributes (so
    impossible to misspell member names). And if the FFI demands it,
    pointers to structs or ints can be passed.

    You can do all this with Python.  I showed you how to have structures
    with mutable elements - and immutable structures, and structures with or without the ability to add new fields.

    I mentioned 5 ways of doing it, you added one or two more. That is my
    point: when a simple feature isn't built in, solutions have to be
    provided in lots of disparate ways.

    I think your last one corresponded most to what I already have in my
    language, but it needed 3 special features to do it, plus maybe one more
    to hide some of those workings.

    Python is supposed to be a good beginner's language, not a DIY one.


    But Python even then completely disregarded performance. In the 1990s,
    if you wrote a loop like this:

         for i in range(1000000):
             ....

    it would actually create an object with a million elements so that you
    could iterate along it. It sounds absolutely crazy, and it was.

    Later they added xrange() which didn't do that, and later on 'xrange'
    morphed into 'range'.


    So your complaint now is that newer versions of Python have made some
    common tasks more efficient?  There's no pleasing some people.


    No, the complaint was getting it so wrong in the first place, then
    taking too long to fix it. (I think it was in Python 3 that you could
    type 'range' instead of 'xrange'.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Malcolm McLean on Thu Jun 20 17:45:38 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    Incidentally you might like the default sort in my ls command, now in
    the Baby X FileSystem shell. There's also a separate programs that runs
    under the native shell - of course I developed the command on native
    host filesystems first. It uses a natural sort so that "Chapter 10"
    sorts after and not before "Chapter 2".

    Spaces have no place in filenames.

    chapter02 sorts before chapter10 naturally.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Scott Lurndal on Thu Jun 20 13:55:28 2024
    On 6/20/24 13:45, Scott Lurndal wrote:
    ...
    Spaces have no place in filenames.


    Unix-like OSs and Windows both allow them, and they are used frequently.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to bart on Thu Jun 20 20:28:49 2024
    On Thu, 20 Jun 2024 17:58:25 +0100
    bart <bc@freeuk.com> wrote:

    On 20/06/2024 16:07, David Brown wrote:
    On 20/06/2024 15:37, bart wrote:
    On 20/06/2024 11:34, David Brown wrote:

    I've only been discussing Python as an example of how many
    programming tasks are easier in high-level languages than in C.

    A lot of it seems to be incantations that you can only come up with as
    an expert user. I wouldn't have been able to come up with even basic file-reading; I'd have to go and look up examples, every time.

    I seem to remember you getting really worked up about C programmers
    using the same identifier for structs and variables!

    Yes, you can have both 'struct T' and a type, variable etc called
    'T'; or a type 'T' and, due to case sensitivity, a variable or
    function called 't'.

    But those identifiers in C are still fixed at compile-time. You can't
    do this:

    printf = sqrt;

    In Python (not 2.x where 'print' was a reserved word), you can:

    print = math.sqrt


    Both have mutable elements. Neither allow arbitrary attributes (so
    impossible to misspell member names). And if the FFI demands it,
    pointers to structs or ints can be passed.

    You can do all this with Python.  I showed you how to have
    structures with mutable elements - and immutable structures, and
    structures with or without the ability to add new fields.

    I mentioned 5 ways of doing it, you added one or two more. That is my
    point: when a simple feature isn't built in, solutions have to be
    provided in lots of disparate ways.

    I think your last one corresponded most to what I already have in my language, but it needed 3 special features to do it, plus maybe one
    more to hide some of those workings.

    Python is supposed to be a good beginner's language, not a DIY one.


    Python + its huge standard library is supposed to be a good beginner's
    language. Not the language in isolation.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Michael S on Thu Jun 20 20:11:28 2024
    On 20/06/2024 18:28, Michael S wrote:
    On Thu, 20 Jun 2024 17:58:25 +0100
    bart <bc@freeuk.com> wrote:

    Python is supposed to be a good beginner's language, not a DIY one.


    Python + its huge standard library is supposed to be a good beginner's language. Not the language in isolation.

    That doesn't make it a beginner's language; it just makes it a language
    with a huge library.

    The first program I ever wrote was an exercise involving reading three
    numbers from the terminal. I can't remember the exact syntax, but it
    wouldn't have been far off what I write now in my two languages:

    readln a, b, c

    In Python it's something like this (from Stackoverflow):

    a, b, c = map(int, input().split())

    Another answer said:

    a, b, c = [int(v) for v in input().split()]

    If a, b, c are floats, or strings, or a known combination, or (in a
    dynamic language) they can be mixed, then it's all different.

    Oh, and you have to use whitespace to separate; commas will need extra
    work; you need to go and look at the specs for 'split'.

    The input also needs exactly three values, but you might want to read
    the first value, then decide what to read next based on that.
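    (A rough Python sketch of that tolerant, line-oriented readln - the
    helper name is hypothetical, and it ignores quoted strings and
    arbitrary-precision decimals:)

```python
def readln(line, n=3):
    """Split a line on whitespace or commas and convert each token
    to int, then float, falling back to string.  Missing values
    become empty strings."""
    values = []
    for tok in line.replace(",", " ").split()[:n]:
        for conv in (int, float):
            try:
                values.append(conv(tok))
                break
            except ValueError:
                pass
        else:
            values.append(tok)   # neither int nor float: keep the string
    values += [""] * (n - len(values))
    return values

print(readln("1 2.3 abc"))   # [1, 2.3, 'abc']
print(readln("6,7"))         # [6, 7, '']
print(readln("1234"))        # [1234, '', '']
```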

    Hardly intuitive, is it? My read/readln /statements/ are strictly line-oriented. If I run this script:

    repeat
        print "? "
        readln a, b, c
        fprintln "<#> <#> <#>", a, b, c
        println a.type, b.type, c.type
        println
    until not a

    Then this is a session:

    ? 1 2 3

    int int int

    ? 1 2.3 abc
    <1> <2.300000> <abc>
    int real string

    ? 1234
    <1234> <> <>
    int string string

    ? "1 2 3" 4 5
    <1 2 3> <4> <5>
    string int int

    ? 1,2,3,4,5

    int int int

    ? 6,7

    int int string

    ? 999999999999999999999999999999999999999999999999 20 30
    <999999999999999999999999999999999999999999999999> <20> <30>
    decimal int int

    That simple Python line, with or without the conversion to numbers, will
    choke on most such examples. The 'beginner' now needs to become an
    expert on string processing to do simple input.

    This script is in dynamic code. In static code, a, b, c will have fixed
    types, eg. all ints, but float inputs are converted as needed.

    How does C fare? I tried this:

    int a, b, c;

    while (1) {
        printf("? ");
        scanf("%d %d %d", &a, &b, &c);
        printf("<%d> <%d> <%d>\n\n", a, b, c);
    }

    This is a session:

    ? 1 2 3


    ? 1, 2, 3

    So far so good, but I haven't yet pressed Enter at this point; when I
    do, it just loops forever showing '? <1> <2> <3>'. Start again:

    ? 1 2.3 0

    It loops again. I will take out the loop and just do one line per run!

    ? 1 2.3 4


    Oh dear. Plus it screws something up so that lines get out of sync.
    Let's try doubles:

    ? 9223372036854775807 2 3
    <9223372036854775800.000000> <2.000000> <3.000000>

    Hmm...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to James Kuyper on Thu Jun 20 19:01:06 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 6/20/24 13:45, Scott Lurndal wrote:
    ...
    Spaces have no place in filenames.


    Unix-like OSs and Windows both allow them, and they are used frequently.

    True, yet the point stands.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to Kaz Kylheku on Thu Jun 20 21:13:45 2024
    On 17/06/2024 08:30, Kaz Kylheku wrote:
    On 2024-06-17, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    The problem is that Bart's compiler is VERY unusual. It's customized for
    his use, and he has lots of quirks in the way he thinks compilers should
    work, which are very different from those of most other programmers. In
    particular, compilation speed is very important to him, while execution
    speed is almost completely unimportant, which is pretty much the
    opposite of the way most programmers prioritize those things.

    Most programmers use Javascript and Python, which follow Bart's
    priorities. Fast, invisible compilation to some kind of byte code (plus possibly later JIT), slow execution time.

    I was so surprised by that I went to check.

    <https://www.statista.com/statistics/793628/worldwide-developer-survey-most-used-languages/>

    says HTML is second, not Python which is 3rd - but that might be down to definitions of what a language is.

    About a fifth of us use C and C++, and another 5% assembly. I guess the
    rest of the world cares more about getting a product out that works, and
    less about making it fast.

    Pretty much my entire career cared about performance in some way or another.

    Andy

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to David Brown on Thu Jun 20 21:21:28 2024
    On 17/06/2024 20:29, David Brown wrote:
    I do my C development with optimisations enabled, which means that the C compiler will obey all the rules and requirements of C.  Optimisations
    don't change the meaning of correct code - they only have an effect on
    the results of your code if you have written incorrect code.  I don't
    know about you, but my aim in development is to write /correct/ code. If disabling optimisations helped in some way, it would be due to bugs and
    luck.

    To me disabling optimisations does one slightly useful thing (compiles a
    little quicker) and one really useful one. It makes the interactive
    debugger work. Optimised code confuses the debugger, especially when it
    does things like reorder code, unroll loops, or merge equivalent functions.

    Of course I then test with the optimised version.

    Andy

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to Scott Lurndal on Thu Jun 20 21:34:03 2024
    On 20/06/2024 18:45, Scott Lurndal wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    Incidentally you might like the default sort in my ls command, now in
    the Baby X FileSystem shell. There's also a separate programs that runs
    under the native shell - of course I developed the command on native
    host filesystems first. It uses a natural sort so that "Chapter 10"
    sorts after and not before "Chapter 2".

    Spaces have no place in filenames.

    chapter02 sorts before chapter10 naturally.

    What do you do when you write Chapter101?

    Andy
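    (A natural-sort key of the kind Malcolm describes copes with chapter101
    as well; a Python sketch:)

```python
import re

def natural_key(s):
    # Split into digit and non-digit runs; digit runs compare numerically,
    # so no zero-padding is needed.
    return [int(t) if t.isdigit() else t.lower()
            for t in re.split(r"(\d+)", s)]

names = ["chapter10", "chapter101", "chapter2", "chapter1"]
print(sorted(names, key=natural_key))
# ['chapter1', 'chapter2', 'chapter10', 'chapter101']
```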

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to David Brown on Thu Jun 20 21:31:26 2024
    On 17/06/2024 14:43, David Brown wrote:

    Compilation speed is important to everyone. That's why so many tricks
    are used to get around the lack of speed in a big compiler, or so many
    extra resources are thrown at the problem.

    What "tricks" ?

    Precompiled headers sprang to mind in about half a second.

    <https://en.wikipedia.org/wiki/Precompiled_header>

    Andy

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Jun 20 22:16:22 2024
    On 20/06/2024 18:58, bart wrote:
    On 20/06/2024 16:07, David Brown wrote:
    On 20/06/2024 15:37, bart wrote:
    On 20/06/2024 11:34, David Brown wrote:

    I've only been discussing Python as an example of how many programming
    tasks are easier in high-level languages than in C.

    A lot of it seems to be incantations that you can only come up with as
    an expert user. I wouldn't have been able to come up with even basic file-reading; I'd have to go and look up examples, every time.

    I am not a Python expert. I am experienced, yes, but I would not call
    myself an expert. But sure, look things up - it's quite easy to do
    these days.


    I seem to remember you getting really worked up about C programmers
    using the same identifier for structs and variables!

    Yes, you can have both 'struct T' and a type, variable etc called 'T';
    or a type 'T' and, due to case sensitivity, a variable or function
    called 't'.

    But those identifiers in C are still fixed at compile-time. You can't do this:

        printf = sqrt;

    In Python (not 2.x where 'print' was a reserved word), you can:

        print = math.sqrt


    Yes.

    Sometimes that kind of thing is extremely useful, especially during
    testing or debugging. Suppose you've got the function sqrt, and it is
    causing you trouble. Let's get the function from:

    from math import sqrt

    (It doesn't matter if it is a library function, a built-in function, or
    your own function.)

    We want to trace when it is called:

    def debug_wrapper(f):
        def wrapped(x):
            print("Trying value ", x)
            return f(x)
        return wrapped

    Now we can write :

    sqrt = debug_wrapper(sqrt)

    And every time the "sqrt" function is called after that, it is traced.
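    (The same idea, self-contained and generalised to any argument list:)

```python
from math import sqrt

def debug_wrapper(f):
    def wrapped(*args, **kwargs):
        # Trace every call before forwarding to the real function.
        print("calling", f.__name__, "with", args, kwargs)
        return f(*args, **kwargs)
    return wrapped

sqrt = debug_wrapper(sqrt)
print(sqrt(16.0))   # traces the call, then prints 4.0
```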

    There's lots of things in Python that can be useful to some people at
    some time, even if /you/ can't see the use for them. (There are no
    doubt lots of things that I can't see the use for either. I'm
    reasonably confident, however, that either /someone/ sees them as
    useful, or they are a side-effect of other useful features. Contrary to
    your firmly held beliefs, people do not design programming languages
    solely with the aim of annoying you personally.)


    Both have mutable elements. Neither allow arbitrary attributes (so
    impossible to misspell member names). And if the FFI demands it,
    pointers to structs or ints can be passed.

    You can do all this with Python.  I showed you how to have structures
    with mutable elements - and immutable structures, and structures with
    or without the ability to add new fields.

    I mentioned 5 ways of doing it, you added one or two more. That is my
    point: when a simple feature isn't built in, solutions have to be
    provided in lots of disparate ways.

    I mentioned several ways, each with their advantages and disadvantages,
    and they each have their use in different situations. You don't like
    choice? Okay, don't use a language that has choices.


    I think your last one corresponded most to what I already have in my language, but it needed 3 special features to do it, plus maybe one more
    to hide some of those workings.

    Python is supposed to be a good beginner's language, not a DIY one.

    Python has /never/ been a beginner's language. It has always been a
    language that is relatively easy to use quickly - there's a very big
    difference there. A beginner needing a structure type can google and
    get a quick answer on how to make such a type, and use it. Or they can
    just make a class and get around the issue of being able to assign to
    new fields like "p.z" by simply not doing that. Later, as they learn
    more, they can learn new techniques to fine-tune what they are trying to do.



    But Python even then completely disregarded performance. In the
    1990s, if you wrote a loop like this:

         for i in range(1000000):
             ....

    it would actually create an object with a million elements so that
    you could iterate along it. It sounds absolutely crazy, and it was.

    Later they added xrange() which didn't do that, and later on 'xrange'
    morphed into 'range'.


    So your complaint now is that newer versions of Python have made some
    common tasks more efficient?  There's no pleasing some people.


    No, the complaint was getting it so wrong in the first place, then
    taking too long to fix it. (I think it was in Python 3 that you could
    type 'range' instead of 'xrange'.)



    s = 0
    for i in range(10000000):
        s = s + i
    print(s)


    (I made it 10 million, not 1 million, because I have such a vastly
    faster computer than everyone else.)

    $ time python2 sum.py
    49999995000000

    real 0m0,820s
    user 0m0,632s
    sys 0m0,179s

    $ time python3 sum.py
    49999995000000

    real 0m0,868s
    user 0m0,854s
    sys 0m0,004s


    Python is not a racehorse, and never has been - that was never the point
    of the language. You make less slow Python code by being more pythonic,
    such as "s = sum(range(10000000))" rather than using a loop. And you
    make fast Python code by making sure the hard work is done by optimised low-level libraries (usually, but not always, written in C) while the
    Python code is for controlling it and gluing things together.
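    (The pythonic one-liner is easy to check against the loop above; the
    timings below are illustrative and vary by machine:)

```python
import timeit

# Same result as the explicit loop earlier in the thread.
assert sum(range(10_000_000)) == 49999995000000

# Rough relative cost of the two styles.
t_loop = timeit.timeit(
    "s = 0\nfor i in range(10**6):\n    s = s + i", number=5)
t_sum = timeit.timeit("sum(range(10**6))", number=5)
print(f"loop: {t_loop:.3f}s   sum(): {t_sum:.3f}s")
```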

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Keith Thompson on Thu Jun 20 22:39:57 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]
    On a C language point, I don't think the standard says anything about
    sorting with non-order functions like the one above. Is an
    implementation of qsort permitted to misbehave (for example by not
    terminating) when the comparison function does not implement a proper
    order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective of
    their current positions in the array) are passed more than once to
    the comparison function, the results shall be consistent with one
    another. That is, for qsort they shall define a total ordering on
    the array, and for bsearch the same object shall always compare
    the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it results in
    undefined behavior.

    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons of the same elements are consistent with one another. That
    the comparison function defines a total order on the elements is, to me,
    a major extra constraint that should not be written as an apparent clarification to something that does not imply it: repeated calls should
    be consistent with one another and, in addition, a total order should be imposed on the elements present.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Malcolm McLean on Thu Jun 20 22:26:59 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 20/06/2024 12:24, Ben Bacarisse wrote:
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 20/06/2024 08:55, David Brown wrote:
    ...
    You've made it clear you know nothing about the language.  Fair enough -
    we all know almost nothing about almost all programming languages.  But
    trust someone who does.

    Yes but I just don't.
    Everyone says "oh yes, that is easy - goto a Python group" and so
    on. No-one actually comes up with the code.

    In C, what we do is write a special version of strcmp

    int strcmp_light_and_dark(const char *a, const char *b)
    {
        int i;

        for (i = 0; a[i] && b[i]; i++)
        {
            if (a[i] != b[i])
            {
                if (a[i] == 'L' && b[i] == 'D' && i > 0 && a[i-1] == '_')
                    return -1;
                if (a[i] == 'D' && b[i] == 'L' && i > 0 && a[i-1] == '_')
                    return 1;
                break;
            }
        }

        return a[i] - b[i];
    }

    So easy to do.
    Unless I'm missing something here, that code does not do what you say
    you want. You gave an example of some input and the desired output but
    this comparison function does not sort into the ordering you gave.

    You said it's easy in C, but the C code does not give the order you said
    you wanted.

    You may find this "ordering" hard to duplicate in other languages
    because it is not even an ordering in the mathematical sense as it is
    not transitive.
    Can you specify the desired ordering as a total or partial order
    relation?
    On a C language point, I don't think the standard says anything about
    sorting with non-order functions like the one above. Is an
    implementation of qsort permitted to misbehave (for example by not
    terminating) when the comparison function does not implement a proper
    order relation?

    It's allowed to crash because it runs out of stack space if the sort
    function isn't consistent.

    That's not the sort of thing the C standard would say, so I think that's
    just a guess. What little the standard /does/ say, is confusing to me.
    I'll make another post to see if anyone can unravel it.

    --
    Ben.

  • From Michael S@21:1/5 to Ben Bacarisse on Fri Jun 21 00:56:05 2024
    On Thu, 20 Jun 2024 22:39:57 +0100
    Ben Bacarisse <ben@bsb.me.uk> wrote:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]
    On a C language point, I don't think the standard says anything
    about sorting with non-order functions like the one above. Is an
    implementation of qsort permitted to misbehave (for example by not
    terminating) when the comparison function does not implement a
    proper order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective of
    their current positions in the array) are passed more than once to
    the comparison function, the results shall be consistent with one
    another. That is, for qsort they shall define a total ordering on
    the array, and for bsearch the same object shall always compare
    the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it results in
    undefined behavior.

    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons of the same elements are consistent with one another.
    That the comparison function defines a total order on the elements
    is, to me, a major extra constraint that should not be written as an
    apparent clarification to something that does not imply it: repeated
    calls should be consistent with one another and, in addition, a total
    order should be imposed on the elements present.


    Malcolm's comparison function does not establish total order on
    arbitrary strings, but it is both possible and likely that it
    establishes total order on all sets of inputs that he cares about.

  • From bart@21:1/5 to David Brown on Thu Jun 20 23:40:17 2024
    On 20/06/2024 21:16, David Brown wrote:
    On 20/06/2024 18:58, bart wrote:

    No, the complaint was getting it so wrong in the first place, then
    taking too long to fix it. (I think it was in Python 3 that you could
    type 'range' instead of 'xrange'.)



    s = 0
    for i in range(10000000) :
        s = s + i
    print(s)


    (I made it 10 million, not 1 million, because I have such a vastly
    faster computer than everyone else.)

    $ time python2 sum.py
    49999995000000

    real    0m0,820s
    user    0m0,632s
    sys    0m0,179s

    $ time python3 sum.py
    49999995000000

    real    0m0,868s
    user    0m0,854s
    sys    0m0,004s

    During the last 20 or more years, Python implementations have gotten
    more and more efficient. So if there is a difference between range and
    xrange, there are techniques that can be used to minimise it.

    But it should never have been done that crazy way in the first place.

    However if I compare range to xrange now on Py2 using such loops as your examples, then xrange is about twice as fast. But range on Py3 is
    in-between. Basically I've no idea what's going on inside it these days.

    (You should run the loop in your example inside a function, which makes
    it twice as fast. CPython on my Windows machine runs it in 0.6 seconds.

    My language can do it in 0.5 seconds, but that's for a 100M loop.)

  • From vallor@21:1/5 to Keith.S.Thompson+u@gmail.com on Thu Jun 20 22:21:36 2024
    On Thu, 20 Jun 2024 13:44:27 -0700, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote in <87v823wdhw.fsf@nosuchdomain.example.com>:

    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
    [...]
    And I see that using Python's getkey() function to swap "_D" and "_H"
    would be a way to solve this. But it's far from obvious how to set up a
    custom sort.

    And you still think comp.lang.c is the place to talk about it?

    If the links I already posted aren't sufficiently helpful (I expected
    they would be), comp.lang.python exists and is reasonably active.

    Incidentally you might like the default sort in my ls command, now in
    the Baby X FileSystem shell. There's also a separate program that runs
    under the native shell - of course I developed the command on native
    host filesystems first. It uses a natural sort so that "Chapter 10"
    sorts after and not before "Chapter 2".

    GNU coreutils ls has a "-v" or "--sort=version" option that does this.

    I just posted a python program to comp.lang.python that uses
    glibc's strverscmp(3) to sort input parameters.

    There is also a separate thread in comp.unix.shell where this has
    come up.

    --
    -v

  • From James Kuyper@21:1/5 to Ben Bacarisse on Thu Jun 20 20:59:33 2024
    On 6/20/24 17:39, Ben Bacarisse wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]
    On a C language point, I don't think the standard says anything about
    sorting with non-order functions like the one above. Is an
    implementation of qsort permitted to misbehave (for example by not
    terminating) when the comparison function does not implement a proper
    order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective of
    their current positions in the array) are passed more than once to
    the comparison function, the results shall be consistent with one
    another. That is, for qsort they shall define a total ordering on
    the array, and for bsearch the same object shall always compare
    the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it results in
    undefined behavior.

    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons of the same elements are consistent with one another.

    I don't think they were talking only about multiple comparisons of the
    same value producing the same result. I think that they were also
    talking about consistency between the result on different pairs of
    values. In order to sort a, b, and c, the results of comp(a,b),
    comp(b,c) and comp(a,c) need to be consistent with each other, or
    there's no well-defined sort order. "total order" is merely a more
    specific and precise way of specifying that consistency requirement, but
    it is a consistency requirement, and therefore the most plausible kind
    of consistency they could have been referring to with that comment.

  • From Tim Rentsch@21:1/5 to Malcolm McLean on Thu Jun 20 22:34:02 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 20/06/2024 16:05, Tim Rentsch wrote:

    [.. on qsort compare function ..]

    My reading of the C standard is that the comparison function
    must impose a total ordering on the elements actually present
    in the array, or is undefined behavior if it does not. In
    other words it's okay if the comparison function doesn't
    define a proper order relation, as long as there are no
    inconsistencies between values that actually occur in the
    particular array being sorted.

    Yes, a qsort written in the natural way can get stuck if a
    sub-array it considered sorted becomes not sorted on the next
    pass.

    Not get stuck. The array might not be sorted correctly,
    but the algorithm doesn't get stuck.

  • From David Brown@21:1/5 to bart on Fri Jun 21 10:02:49 2024
    On 21/06/2024 00:40, bart wrote:

    During the last 20 or more years, Python implementations have gotten
    more and more efficient. So if there is a difference between range and xrange, there are techniques that can be used to minimise it.

    Yes. Like most languages (except assembly, Forth, and perhaps a few
    other very low-level languages) Python is defined in terms of its
    effects, not specific implementations of its features. The effect of
    generating a list of consecutive integers then iterating over it, and
    of a C-style for loop, are the same - but obviously the run-time
    efficiency will typically be very different.

    Python is not a language with much optimisation in itself - but the
    compiler (byte-compiler) has got more sophisticated over time. Common
    use-cases of range, where the list nature of the range is not important,
    have been improved - modern Python does not generate a list for ranges
    unless they are actually necessary.

    Conceptually, having ranges as a list is very nice. It is a design
    choice geared towards ease of use, consistency and functionality, rather
    than ease of efficient implementation. Like functional programming
    languages, it's a design that appeals to people thinking mathematically
    or with high-level concepts in mind, but it will seem alien and
    inefficient to people thinking from the low level up or considering implementation details.

    Your programming experience has mainly been through assembly, low-level languages, and implementation of your own fairly low-level languages. I
    expect that whenever you see code in any language, part of your mind is automatically trying to think of how you could implement that in a code generator. So when you see something like Python's "range" expression,
    you immediately see the obvious implementation, and that this is clearly
    a highly inefficient way to get a simple for-loop.

    I too have a background in assembly - even lower, in fact, since I work
    with electronics and am also familiar with logic design and the basic principles of processor design. But I also have a background in
    mathematics, and as well as assembly and C, I have worked with
    functional programming languages and other high-level languages. To me,
    there is nothing wrong with using recursive functions over infinite
    lists as a way of making a loop - /if/ the language supports it
    reasonably efficiently. And there is nothing wrong with a language that
    is 1000 times slower to run but 10 times faster to code.



    But it should never have been done that crazy way in the first place.

    However if I compare range to xrange now on Py2 using such loops as your examples, then xrange is about twice as fast. But range on Py3 is
    in-between. Basically I've no idea what's going on inside it these days.

    xrange always gives you a structure that is a kind of iterator or
    generator - it holds a start value, end value, step value and current
    value. It is not a list, and does not have the same types of methods as
    a list. For example, you can't modify it.

    In Python3, range() actually returns a range object, which is basically
    a rename of the Python2 xrange object. To get the old behaviour of
    range() as a list, you now need to write "list(range(100))".

    A major point, however, is that for the most part, you don't need to
    know what is going on inside it. I know you like to understand these
    things (I do too), and it's hard to turn off the curiosity instincts.
    But at least understand that for the vast majority of other users, the underlying details don't matter. Their Python "for" loops are faster
    than they used to be, and that's nice - why bother about the hidden
    details when there are other things to occupy one's time and brainpower?

  • From David Brown@21:1/5 to Malcolm McLean on Fri Jun 21 11:19:10 2024
    On 20/06/2024 18:56, Malcolm McLean wrote:

    Yes, a qsort written in the natural way can get stuck if a sub-array it considered sorted becomes not sorted on the next pass.

    I have no idea what you might think of as the "natural" way to implement
    qsort. But if it were, as the name implies, a quicksort, then it would
    not get stuck. At each step, progress is always made - even with a
    comparison function that gave random values, you'd still have a worst
    case complexity of O(n²).

    The only sorting algorithm I can think of that might get stuck with a
    slightly broken comparison function would be a poor bubblesort
    implementation that is run over the whole array every round, simply
    repeating until no changes are made in a round.

  • From David Brown@21:1/5 to Vir Campestris on Fri Jun 21 11:46:04 2024
    On 20/06/2024 22:21, Vir Campestris wrote:
    On 17/06/2024 20:29, David Brown wrote:
    I do my C development with optimisations enabled, which means that the
    C compiler will obey all the rules and requirements of C.
    Optimisations don't change the meaning of correct code - they only
    have an effect on the results of your code if you have written
    incorrect code.  I don't know about you, but my aim in development is
    to write /correct/ code. If disabling optimisations helped in some
    way, it would be due to bugs and luck.

    To me disabling optimisations does one slightly useful thing (compiles a little quicker) and one really useful one. It makes the interactive
    debugger work. Optimised code confuses the debugger, especially when it
    does things like reorder code, unroll loops, or merge equivalent functions.

    Of course I then test with the optimised version.

    Andy


    I understand your viewpoint and motivation. But my own experience is
    mostly different.

    First, to get it out of the way, there's the speed of compilation.
    While heavy optimisation (-O3) can take noticeably longer, I never see
    -O0 as being in any noticeable way faster for compilation than -O1 or
    even -O2. (I'm implicitly using gcc options here, but it's mostly
    applicable to any serious compiler I have used.) Frankly, if your
    individual C compiles during development are taking too long, you are
    doing something wrong. Maybe you are using far too big files, or trying
    to do too much in one part - split the code into manageable sections and possibly into libraries, and it will be far easier to understand, write
    and test. Maybe you are not using appropriate build tools. Maybe you
    are using a host computer that is long outdated or grossly underpowered.

    There are exceptions. Clearly some languages - like C++ - are more
    demanding of compilers than others. And if you are using whole-program
    or link-time optimisation, compilation and build time is more of an
    issue - but of course these only make sense with strong optimisation.


    Secondly, there is the static error analysis. While it is possible to
    do this using additional tools, your first friend is your compiler and
    its warnings. (Even with additional tools, you'll want compiler
    warnings enabled.) You always want to find your errors as early as
    possible - from your editor/IDE, your compiler, your linker, your
    additional linters, your automatic tests, your manual tests, your beta
    tests, your end user complaints. The earlier in this chain you find the
    issue, the faster, easier and cheaper it is to fix things. And
    compilers do a better job at static error checking with strong
    optimisations enabled, because they do more code analysis.


    Thirdly, optimisation allows you to write your code with more focus on
    clarity, flexibility and maintainability, relying on the compiler for
    the donkey work of efficiency details. If you want efficient results
    (and that doesn't always matter - but if it doesn't, then C is probably
    not the best choice of language in the first place) and you also want to
    write good quality source code, optimisation is a must.


    Now to your point about debugging. It is not uncommon for me to use
    debuggers, including single-stepping, breakpoints, monitoring variables, modifying data via the debugger, and so on. It is common practice in
    embedded development. I also regularly examine the generated assembly,
    and debug at that level. If I am doing a lot of debugging on a section
    of code, I generally use -O1 rather than -O0 - precisely because it is
    far /easier/ to understand the generated code. Typically it is hard to
    see what is going on in the assembly because it is swamped by stack
    accesses or code that would be far simpler when optimised. (That goes
    back to the focus on source code clarity and flexibility rather than micro-managing for run-time efficiency without optimisation.)

    Some specific optimisation options can make a big difference to
    debugging, and can be worth disabling, such as "-fno-inline" or "-fno-toplevel-reorder", and heavily optimised code can be hard to follow in
    a debugger. But disabling optimisation entirely can often, IME, make
    things harder.

    Temporarily changing optimisation flags for all or part of the code
    while chasing particular bugs is a useful tool, however.

  • From David Brown@21:1/5 to Vir Campestris on Fri Jun 21 12:46:40 2024
    On 20/06/2024 22:31, Vir Campestris wrote:
    On 17/06/2024 14:43, David Brown wrote:

    Compilation speed is important to everyone. That's why so many tricks
    are used to get around the lack of speed in a big compiler, or so
    many extra resources are thrown at the problem.

    What "tricks" ?

    Precompiled headers sprang to mind in about half a second.

    <https://en.wikipedia.org/wiki/Precompiled_header>

    Andy

    These are used primarily in C++, where you often have /very/ large
    headers that require a significant amount of analysis. In C, this is
    not remotely the same scale of problem. C headers are usually far
    shorter - and far simpler to process. (In a quick test on my system,
    #include <stdio.h> pulled in 792 lines, while #include <iostream> took
    28152 lines.)

    In C++, compilation speed /is/ an issue. That's why we now have modules
    there.

  • From bart@21:1/5 to David Brown on Fri Jun 21 11:42:47 2024
    On 21/06/2024 10:46, David Brown wrote:
    On 20/06/2024 22:21, Vir Campestris wrote:
    On 17/06/2024 20:29, David Brown wrote:
    I do my C development with optimisations enabled, which means that
    the C compiler will obey all the rules and requirements of C.
    Optimisations don't change the meaning of correct code - they only
    have an effect on the results of your code if you have written
    incorrect code.  I don't know about you, but my aim in development is
    to write /correct/ code. If disabling optimisations helped in some
    way, it would be due to bugs and luck.

    To me disabling optimisations does one slightly useful thing (compiles
    a little quicker) and one really useful one. It makes the interactive
    debugger work. Optimised code confuses the debugger, especially when
    it does things like reorder code, unroll loops, or merge equivalent
    functions.

    Of course I then test with the optimised version.

    Andy


    I understand your viewpoint and motivation.  But my own experience is
    mostly different.

    First, to get it out of the way, there's the speed of compilation. While heavy optimisation (-O3) can take noticeably longer, I never see -O0 as
    being in any noticeable way faster for compilation than -O1 or even
    -O2.

    Absolute time or relative? For me, optimised options with gcc always
    take longer:

    C:\c>tm gcc bignum.c -shared -s -obignum.dll # from cold
    TM: 3.85

    C:\c>tm gcc bignum.c -shared -s -obignum.dll
    TM: 0.31

    C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
    TM: 0.83

    C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
    TM: 0.93

    C:\c>dir bignum.dll
    21/06/2024 11:14 35,840 bignum.dll

    Optimising takes nearly 3 times as long. But this is a single small
    module of 1600 lines.

    Where optimisation makes no difference is when compiling lots of
    declarations:

    C:\c>type w.c
    #include <windows.h>
    int main(void) { MessageBox(0,"World", "Hello", 0); }

    C:\c>tm gcc -s w.c
    TM: 1.39

    C:\c>tm gcc -s w.c -O3
    TM: 1.41

    It is these big headers belonging to large libraries that can affect build-times, even compiling only one module. (I can't compare with tcc
    here since it uses its own smaller windows.h; I'd have to compare with libraries like gtk2.h, which is 0.35M lines of unique declarations in
    hundreds of headers.)

    With my own usage where gcc is used to build one large file, then the difference is stark, and very noticeable:

    C:\qx52>tm gcc -s qc.c -oqc.exe -O0
    TM: 2.17

    C:\qx52>tm gcc -s qc.c -oqc.exe -O1
    TM: 5.31

    C:\qx52>tm gcc -s qc.c -oqc.exe -O2
    TM: 10.98

    C:\qx52>tm gcc -s qc.c -oqc.exe -O3
    TM: 13.60

    C:\qx52>tm mm qc -opt (my normal compiler)
    Compiling qc.m to qc.exe
    TM: 0.09

    Compared with using my usual compiler, if I wanted to benefit from gcc's optimisation, then building it takes 150 times longer.

    And this is, by your own standards, which you are constantly beating me
    about the head with, for a toy application. Although it is still 0.5MB:

    C:\qx52>mm qc -opt -v -rip
    Compiling qc.m to qc.exe
    Code size: 220,587 bytes
    Idata size: 327,184
    Code+Idata: 547,771
    Zdata size: 5,343,544
    EXE size: 553,472

    (gcc -O3 produces a 765KB file, -O0 a 698KB one, and even -Os a 572KB
    one, but that took it 8 seconds. My code could be reduced further in
    size, but it's a low priority.

    Note: my EXE is for low-loading images. A high-loading one (option
    -himem) is 4KB bigger, but needs also extra flags to force the OS to
    load it high. All gcc results are for high-loading.)

  • From bart@21:1/5 to David Brown on Fri Jun 21 14:25:46 2024
    On 21/06/2024 11:46, David Brown wrote:
    On 20/06/2024 22:31, Vir Campestris wrote:
    On 17/06/2024 14:43, David Brown wrote:

    Compilation speed is important to everyone. That's why so many
    tricks are used to get around the lack of speed in a big compiler,
    or so many extra resources are thrown at the problem.

    What "tricks" ?

    Precompiled headers sprang to mind in about half a second.

    <https://en.wikipedia.org/wiki/Precompiled_header>

    Andy

    These are used primarily in C++, where you often have /very/ large
    headers that require a significant amount of analysis.  In C, this is
    not remotely the same scale of problem.  C headers are usually far
    shorter - and far simpler to process.  (In a quick test on my system, #include <stdio.h> pulled in 792 lines, while #include <iostream> took
    28152 lines.)

    C standard headers are nothing. From 3K to 5K lines between Windows and
    Linux, last time I looked. That's for all 30 headers. (Which is why I
    think they should just be included anyway.)

    But library headers can be much bigger. I already gave a timing for
    windows.h, of 1.4 seconds. SDL2, quite a small library compared with
    some, is still 50K lines, and adds 0.75 seconds compilation per module.

    It's too big a job to test with GTK4, but GTK2, from my last
    experiments, was 350K lines across 550 unique header files, and involved processing over 1000 #include statements.

    It was a test project earlier this year where multiple modules each
    including, indirectly, sdl.h, which cause a noticeable sluggishness in
    my mcc compiler.

    If I was to take that product seriously, that is where I would strive to
    apply whole-program-compilation methods (developed for my main 'toy' compilers), so that a library like sdl.h is processed once at most if
    included by N modules in a specific mcc invocation.

    If I set up a dummy test where there are 50 .c files each with '#include
    "sdl"' and one empty function, plus one main.c file, then a gcc build
    takes 36 seconds.

    If I use a precompiled header for sdl.h, it takes 5-6 seconds (remember
    there is no application code; this is just the SDL interface).

    When I set up the same test in my language, it takes 0.05 seconds. (This
    also uses a single interface file for SDL where its 75 C headers are
    summarised as a set of declarations totalling under 3000 lines.)

    So 700 times faster than gcc processing header files, and still over 100
    times faster when gcc employs its secret weapon: precompiled headers.
    Which you said are not necessary.

    To summarise, in my test, a C compiler has to process 2.5M non-unique
    lines of declarations across 4000 non-unique header files. My compiler
    has to process only 3K unique lines in one unique file.

    You will obviously dismiss my amateurish efforts out of hand, but they
    show what is possible, and what you need to look at if trying to speed
    things up.

  • From David Brown@21:1/5 to bart on Fri Jun 21 15:34:30 2024
    On 21/06/2024 12:42, bart wrote:
    On 21/06/2024 10:46, David Brown wrote:
    On 20/06/2024 22:21, Vir Campestris wrote:
    On 17/06/2024 20:29, David Brown wrote:
    I do my C development with optimisations enabled, which means that
    the C compiler will obey all the rules and requirements of C.
    Optimisations don't change the meaning of correct code - they only
    have an effect on the results of your code if you have written
    incorrect code.  I don't know about you, but my aim in development
    is to write /correct/ code. If disabling optimisations helped in
    some way, it would be due to bugs and luck.

    To me disabling optimisations does one slightly useful thing
    (compiles a little quicker) and one really useful one. It makes the
    interactive debugger work. Optimised code confuses the debugger,
    especially when it does things like reorder code, unroll loops, or
    merge equivalent functions.

    Of course I then test with the optimised version.

    Andy


    I understand your viewpoint and motivation.  But my own experience is
    mostly different.

    First, to get it out of the way, there's the speed of compilation.
    While heavy optimisation (-O3) can take noticeably longer, I never see
    -O0 as being in any noticeable way faster for compilation than -O1 or
    even -O2.

    Absolute time or relative?

    Both.

    For me, optimised options with gcc always
    take longer:

    Of course. But I said it was not noticeable - it does not make enough difference in speed for it to be worth choosing.


     C:\c>tm gcc bignum.c -shared -s -obignum.dll        # from cold
     TM: 3.85

    Cold build times are irrelevant to development - when you are working on
    a project, all the source files and all your compiler files are in the
    PC's cache.



     C:\c>tm gcc bignum.c -shared -s -obignum.dll
     TM: 0.31

     C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
     TM: 0.83

     C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
     TM: 0.93

     C:\c>dir bignum.dll
     21/06/2024  11:14            35,840 bignum.dll

    Any build time under a second is as good as instant.

    I tested on a real project, not a single file. It has 158 C files and
    about 220 header files. And I ran it on my old PC, without any "tricks"
    that you dislike so much, doing full clean re-builds. The files are
    actually all compiled twice, building two variants of the binary.

    With -O2, it took 34.3 seconds to build. With -O1, it took 33.4
    seconds. With -O0, it took 30.8 seconds.

    So that is a 15% difference for full builds. In practice, of course,
    full rebuilds are rarely needed, and most builds after changes to the
    source are within a second or so.

    So yes, unoptimised builds are consistently faster. But for something
    other than a "hello world" program or single monster source code, the difference is not relevant in normal development work.

  • From David Brown@21:1/5 to bart on Fri Jun 21 15:51:46 2024
    On 21/06/2024 15:25, bart wrote:
    On 21/06/2024 11:46, David Brown wrote:
    On 20/06/2024 22:31, Vir Campestris wrote:
    On 17/06/2024 14:43, David Brown wrote:

    Compilation speed is important to everyone. That's why so many
    tricks are used to get around the lack of speed in a big compiler,
    or so many extra resources are thrown at the problem.

    What "tricks" ?

    Precompiled headers sprang to mind in about half a second.

    <https://en.wikipedia.org/wiki/Precompiled_header>

    Andy

    These are used primarily in C++, where you often have /very/ large
    headers that require a significant amount of analysis.  In C, this is
    not remotely the same scale of problem.  C headers are usually far
    shorter - and far simpler to process.  (In a quick test on my system,
    #include <stdio.h> pulled in 792 lines, while #include <iostream> took
    28152 lines.)

    C standard headers are nothing. From 3K to 5K lines between Windows and Linux, last time I looked. That's for all 30 headers.

    C headers for most other libraries are usually also short, at least
    compared to C++ headers.

    (Which is why I
    think they should just be included anyway.)

    That would be a terrible idea for many reasons.


    But library headers can be much bigger. I already gave a timing for windows.h, of 1.4 seconds. SDL2, quite a small library compared with
    some, is still 50K lines, and adds 0.75 seconds compilation per module.

    I don't know what version of SDL headers I have (it's not a library I
    have used myself), but there are about 30 headers in the
    /usr/include/SDL/ folder, totalling some 12K lines (after
    preprocessing). Including them all in an otherwise blank C file takes
    about 40-50 ms to compile - basically, noise.

    Again, this is a decade old PC with spinning rust disk. A key
    difference, of course, is that I am using an OS that is suitable for the
    task.

  • From bart@21:1/5 to David Brown on Fri Jun 21 19:43:36 2024
    On 21/06/2024 14:51, David Brown wrote:
    On 21/06/2024 15:25, bart wrote:
    On 21/06/2024 11:46, David Brown wrote:
    On 20/06/2024 22:31, Vir Campestris wrote:
    On 17/06/2024 14:43, David Brown wrote:

    Compilation speed is important to everyone. That's why so many
    tricks are used to get around the lack of speed in a big compiler, >>>>>> or so many extra resources are thrown at the problem.

    What "tricks" ?

    Precompiled headers sprang to mind in about half a second.

    <https://en.wikipedia.org/wiki/Precompiled_header>

    Andy

    These are used primarily in C++, where you often have /very/ large
    headers that require a significant amount of analysis.  In C, this is
    not remotely the same scale of problem.  C headers are usually far
    shorter - and far simpler to process.  (In a quick test on my system,
    #include <stdio.h> pulled in 792 lines, while #include <iostream>
    took 28152 lines.)

    C standard headers are nothing. From 3K to 5K lines between Windows
    and Linux, last time I looked. That's for all 30 headers.

    C headers for most other libraries are usually also short, at least
    compared to C++ headers.

    (Which is why I think they should just be included anyway.)

    That would be a terrible idea for many reasons.


    But library headers can be much bigger. I already gave a timing for
    windows.h, of 1.4 seconds. SDL2, quite a small library compared with
    some, is still 50K lines, and adds 0.75 seconds compilation per module.

    I don't know what version of SDL headers I have (it's not a library I
    have used myself), but there are about 30 headers in the
    /usr/include/SDL/ folder, totalling some 12K lines (after
    preprocessing).  Including them all in an otherwise blank C file takes
    about 40-50 ms to compile - basically, noise.

    I don't trust your figures. How many lines were there /before/
    preprocessing? Since if significantly more, all that has to be processed
    still.

    Even at 12K lines and 50ms, that gives gcc a throughput of 0.25M lines
    per second, something I've never seen on any version of gcc. And if it
    was 50K lines like mine (just the lines in all the SDL*.h files), then
    it suggests a throughput of 1M lines per second.

    My C compiler tells me that in this program:

    #include "sdl.h"

    a total of non-unique 112K lines are processed, and 240 non-unique
    #includes are encountered.

    If I also tell it to just preprocess, then the result is only 3K lines, including standard C headers which SDL includes.

    gcc's -E gives me 56K lines, which includes a lot of crap like loads of
    '#' lines, but also huge numbers of intrinsics like:

    extern __inline __m128i __attribute__((__gnu_inline__,
    __always_inline__, __artificial__))
    _mm_abs_epi32 (__m128i __X)
    {
    return (__m128i) __builtin_ia32_pabsd128 ((__v4si)__X);
    }


    Again, this is a decade old PC with spinning rust disk.  A key
    difference, of course, is that I am using an OS that is suitable for the task.

    I tried it under WSL too (unless that doesn't really count as a suitable
    OS):

    WSL gcc 9.4 -S 0.5 seconds to process #include "sdl.h"
    Windows 11 gcc 14.1 -S 0.7 seconds

    That last is the same OS where my mcc compiler takes 0.08 seconds to
    process those same SDL*.h files.

    I think you've misplaced a decimal point somewhere.

  • From Ben Bacarisse@21:1/5 to Michael S on Fri Jun 21 22:07:39 2024
    Michael S <already5chosen@yahoo.com> writes:

    Malcolm's comparison function does not establish total order on
    arbitrary strings, but it is both possible and likely that it
    establishes total order on all sets of inputs that he cares about.

    Yes, I think he said as much. The trouble is, the problem was posted as something of a challenge ("nobody posted any code" or similar) but we
    knew neither the possible inputs nor the desired ordering!

    --
    Ben.

  • From bart@21:1/5 to David Brown on Fri Jun 21 22:05:08 2024
    On 21/06/2024 14:51, David Brown wrote:
    On 21/06/2024 15:25, bart wrote:

    (Which is why I think they should just be included anyway.)

    That would be a terrible idea for many reasons.

    Such as? It can't be compilation time, since headers ten times the size
    or more apparently compile instantly.

    One delight in using my language is that its standard library is always available. But if you don't want it, it can be disabled. With C I spend
    a big chunk of my time writing include lines.

    First I need stdio. Then string. Then stdlib. Then there's always ones I
    can't quite remember.

    Or I need to debug someone's code, and it needs stdio to define
    'printf', FFS. Would the world stop turning if it was just available?

    I just don't believe that things defined in the headers need to be micro-managed to that extent, most of the time.

  • From bart@21:1/5 to David Brown on Fri Jun 21 22:47:46 2024
    On 21/06/2024 14:34, David Brown wrote:
    On 21/06/2024 12:42, bart wrote:
    On 21/06/2024 10:46, David Brown wrote:
    On 20/06/2024 22:21, Vir Campestris wrote:
    On 17/06/2024 20:29, David Brown wrote:
    I do my C development with optimisations enabled, which means that
    the C compiler will obey all the rules and requirements of C.
    Optimisations don't change the meaning of correct code - they only
    have an effect on the results of your code if you have written
    incorrect code.  I don't know about you, but my aim in development
    is to write /correct/ code. If disabling optimisations helped in
    some way, it would be due to bugs and luck.

    To me disabling optimisations does one slightly useful thing
    (compiles a little quicker) and one really useful one. It makes the
    interactive debugger work. Optimised code confuses the debugger,
    especially when it does things like reorder code, unroll loops, or
    merge equivalent functions.

    Of course I then test with the optimised version.

    Andy


    I understand your viewpoint and motivation.  But my own experience is
    mostly different.

    First, to get it out of the way, there's the speed of compilation.
    While heavy optimisation (-O3) can take noticeably longer, I never
    see -O0 as being in any noticeable way faster for compilation than
    -O1 or even -O2.

    Absolute time or relative?

    Both.

    For me, optimised options with gcc always take longer:

    Of course.  But I said it was not noticeable - it does not make enough difference in speed for it to be worth choosing.


      C:\c>tm gcc bignum.c -shared -s -obignum.dll        # from cold
      TM: 3.85

    Cold build times are irrelevant to development - when you are working on
    a project, all the source files and all your compiler files are in the
    PC's cache.



      C:\c>tm gcc bignum.c -shared -s -obignum.dll
      TM: 0.31

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
      TM: 0.83

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
      TM: 0.93

      C:\c>dir bignum.dll
      21/06/2024  11:14            35,840 bignum.dll

    Any build time under a second is as good as instant.

    I tested on a real project, not a single file.  It has 158 C files and
    about 220 header files.  And I ran it on my old PC, without any "tricks" that you dislike so much, doing full clean re-builds.  The files are actually all compiled twice, building two variants of the binary.

    With -O2, it took 34.3 seconds to build.  With -O1, it took 33.4
    seconds.  With -O0, it took 30.8 seconds.

    So that is a 15% difference for full builds.  In practice, of course,
    full rebuilds are rarely needed, and most builds after changes to the
    source are within a second or so.

    Then there's something very peculiar about your codebase.

    Either there's a disproportionate percentage of declarations compared to executable code (like my tests with headers).

    Or a big chunk of the source either contains lots of redundant code, or generates such code, or maybe conditional code, that is eliminated
    before it gets to the optimising stages.

    My own test show a different pattern. First for Lua sources (32 .c files):

    C:\luac>tm gcc @luafiles
    TM: 7.71
    C:\luac>tm gcc @luafiles -O3
    TM: 16.97

    This project uses lots of macros. Then Malcolm's resource compiler (43
    .c files):

    C:\bbx\src>tm gcc @gfiles
    TM: 10.66
    C:\bbx\src>tm gcc @gfiles -O3
    TM: 28.87

    That is itself unusual in that 80% of the line count is data.

    This is one of the sub-programs of LibJPEG (54 .c files):

    C:\jj>tm gcc @cjpeg
    TM: 7.69
    C:\jj>tm gcc @cjpeg -O3
    TM: 18.01

    Pico C (22 files):

    C:\pico>tm gcc @pico
    TM: 3.25
    C:\pico>tm gcc @pico -O3
    TM: 5.62

    Tiny C (one file):

    C:\tcs>tm gcc tcc.c
    TM: 3.03
    C:\tcs>tm gcc tcc.c -O3
    TM: 13.06

    The difference looks more than 15% to me. More like 70% to 170%, not
    including that last one, which is 330%, as you don't like programs in
    one file.

    Yet, that particular app *is* one file. If working with that, you don't
    have a choice.

  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Fri Jun 21 21:10:07 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]

    On a C language point, I don't think the standard says anything about
    sorting with non-order functions like the one above. Is an
    implementation of qsort permitted to misbehave (for example by not
    terminating) when the comparison function does not implement a proper
    order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective of
    their current positions in the array) are passed more than once to
    the comparison function, the results shall be consistent with one
    another. That is, for qsort they shall define a total ordering on
    the array, and for bsearch the same object shall always compare
    the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it results in
    undefined behavior.

    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons of the same elements are consistent with one another. That
    the comparison function defines a total order on the elements is, to me,
    a major extra constraint that should not be written as an apparent clarification to something that does not imply it: repeated calls should
    be consistent with one another and, in addition, a total order should be imposed on the elements present.

    I think you're misreading the first sentence. Suppose we are in
    court listening to an ongoing murder trial. Witness one comes in
    and testifies that Alice left the house before Bob. Witness two
    comes in (after witness one has gone) and testifies that Bob left
    the house before Cathy. Witness three comes in (after the first
    two have gone) and testifies that Cathy left the house before
    Alice. None of the witnesses have contradicted either of the
    other witnesses, but the testimonies of the three witnesses are
    not consistent with one another. Try a web search

    "consistent with" definition

    for more explanation. Also, for "one another", if we say the
    children in the Jones family get along with one another, we don't
    mean that each child gets along with at least one of the others,
    but instead mean that every child gets along with every other
    child, that is, that they all get along with each other. Whether
    or not some other reading (of that problem sentence in the C
    standard) is sensible, surely the reading I have suggested is a
    plausible one. Do you agree? It seems clear, given how the
    second sentence is phrased, that this suggested reading is what
    was intended.

    I don't mean to defend the quality of writing in this passage.
    Certainly it would be nice if the meaning could have been stated
    more plainly. But I think it's an overstatement to say that the
    first sentence in no way implies a total order.

  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Sun Jun 23 12:19:23 2024
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]

    On a C language point, I don't think the standard says anything about
    sorting with non-order functions like the one above. Is an
    implementation of qsort permitted to misbehave (for example by not
    terminating) when the comparison function does not implement a proper
    order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective of
    their current positions in the array) are passed more than once to
    the comparison function, the results shall be consistent with one
    another. That is, for qsort they shall define a total ordering on
    the array, and for bsearch the same object shall always compare
    the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it results in
    undefined behavior.

    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons of the same elements are consistent with one another. That
    the comparison function defines a total order on the elements is, to me,
    a major extra constraint that should not be written as an apparent
    clarification to something that does not imply it: repeated calls should
    be consistent with one another and, in addition, a total order should be
    imposed on the elements present.

    I think you're misreading the first sentence.

    Let's hope so. That's why I said it should be clearer, not that it was
    wrong.

    Suppose we are in
    court listening to an ongoing murder trial. Witness one comes in
    and testifies that Alice left the house before Bob. Witness two
    comes in (after witness one has gone) and testifies that Bob left
    the house before Cathy. Witness three comes in (after the first
    two have gone) and testifies that Cathy left the house before
    Alice. None of the witnesses have contradicted either of the
    other witnesses, but the testimonies of the three witnesses are
    not consistent with one another.

    My (apparently incorrect) reading of the first sentence is that the
    consistency is only required between the results of multiple calls
    between each pair. In other words, if the witnesses are repeatedly
    asked, again and again, if Alice left before Bob and/or if Bob left
    before Alice the results would always be consistent (with, of course,
    the same required of repeatedly asking about the other pairs of people).

    Try a web search

    "consistent with" definition

    for more explanation.

    Seriously?

    Also, for "one another", if we say the
    children in the Jones family get along with one another, we don't
    mean that each child gets along with at least one of the others,
    but instead mean that every child gets along with every other
    child, that is, that they all get along with each other.

    The sentence in question has, to my mind, already stated what the "one
    another" refers to -- the multiple calls between pairs containing the
    same objects. I get you think that's not the intended meaning, but I
    get my reading so strongly that I struggle to see the other.

    Whether
    or not some other reading (of that problem sentence in the C
    standard) is sensible, surely the reading I have suggested is a
    plausible one. Do you agree? It seems clear, given how the
    second sentence is phrased, that this suggested reading is what
    was intended.

    I still can't read it the way you do. Every time I try, I find the
    consistency is to be taken as applying to the results of the multiple
    calls between pairs of the same objects. Nothing more. It starts with
    "When the same objects". It seems so clear that the consistency is all
    about the multiple calls with these same objects. I keep trying to see
    your reading of it, but I can't.

    I don't mean to defend the quality of writing in this passage.
    Certainly it would be nice if the meaning could have been stated
    more plainly. But I think it's an overstatement to say that the
    first sentence in no way implies a total order.

    I have a second objection that prompted that remark. If I take the (apparently) intended meaning of the first sentence, I think that
    "consistent" is too weak to imply even a partial order. In dog club
    tonight, because of how they get on, I will ensure that Enzo is walking
    behind George, that George is walking behind Benji, Benji behind Gibson,
    Gibson behind Pepper and Pepper behind Enzo. In what sense is this
    "ordering" not consistent? All the calls to the comparison function are consistent with each other.

    --
    Ben.

  • From David Brown@21:1/5 to bart on Sun Jun 23 14:25:53 2024
    On 21/06/2024 23:47, bart wrote:
    On 21/06/2024 14:34, David Brown wrote:
    On 21/06/2024 12:42, bart wrote:
    On 21/06/2024 10:46, David Brown wrote:
    On 20/06/2024 22:21, Vir Campestris wrote:
    On 17/06/2024 20:29, David Brown wrote:
    I do my C development with optimisations enabled, which means that
    the C compiler will obey all the rules and requirements of C.
    Optimisations don't change the meaning of correct code - they only
    have an effect on the results of your code if you have written
    incorrect code.  I don't know about you, but my aim in development
    is to write /correct/ code. If disabling optimisations helped in
    some way, it would be due to bugs and luck.

    To me disabling optimisations does one slightly useful thing
    (compiles a little quicker) and one really useful one. It makes the
    interactive debugger work. Optimised code confuses the debugger,
    especially when it does things like reorder code, unroll loops, or
    merge equivalent functions.

    Of course I then test with the optimised version.

    Andy


    I understand your viewpoint and motivation.  But my own experience
    is mostly different.

    First, to get it out of the way, there's the speed of compilation.
    While heavy optimisation (-O3) can take noticeably longer, I never
    see -O0 as being in any noticeable way faster for compilation than
    -O1 or even -O2.

    Absolute time or relative?

    Both.

    For me, optimised options with gcc always take longer:

    Of course.  But I said it was not noticeable - it does not make enough
    difference in speed for it to be worth choosing.


      C:\c>tm gcc bignum.c -shared -s -obignum.dll        # from cold
      TM: 3.85

    Cold build times are irrelevant to development - when you are working
    on a project, all the source files and all your compiler files are in
    the PC's cache.



      C:\c>tm gcc bignum.c -shared -s -obignum.dll
      TM: 0.31

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
      TM: 0.83

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
      TM: 0.93

      C:\c>dir bignum.dll
      21/06/2024  11:14            35,840 bignum.dll

    Any build time under a second is as good as instant.

    I tested on a real project, not a single file.  It has 158 C files and
    about 220 header files.  And I ran it on my old PC, without any
    "tricks" that you dislike so much, doing full clean re-builds.  The
    files are actually all compiled twice, building two variants of the
    binary.

    With -O2, it took 34.3 seconds to build.  With -O1, it took 33.4
    seconds.  With -O0, it took 30.8 seconds.

    So that is a 15% difference for full builds.  In practice, of course,
    full rebuilds are rarely needed, and most builds after changes to the
    source are within a second or so.

    Then there's something very peculiar about your codebase.



    In my experience, programs don't usually consist of a single C file.
    And if they do, build time is rarely long enough to worry anyone.

    I think, from the history of discussions in this group, that it is more
    likely that your codebases are the peculiar ones.

    However, it is certainly fair to say that codebases vary a lot, as do
    build procedures.


    Either there's a disproportionate percentage of declarations compared to executable code (like my tests with headers).

    No.


    Or a big chunk of the source either contains lots of redundant code,

    No.

    or
    generates such code, or maybe conditional code,

    There's a bit, but not much.

    that is eliminated
    before it gets to the optimising stages.


    To be fair here, I have a /lot/ of flags for my builds, and the only
    thing I changed here was the main optimisation number. Other flags
    might still have had an influence in the compiler run time, such as the
    warning flags. But there's a limit to how much effort I'm going to
    spend playing around for a Usenet post. And a great many of the flags
    (not the warning flags) are important for actually generating the
    correct code for the correct target, so sorting out those that could be
    removed in order to make a completely pointless and useless
    near-unoptimised build is just not worth the effort. However, changing
    the main optimisation number to 0 certainly reduced the code generation
    to much less efficient code - likely too slow to work for some critical
    aspects of the program.

  • From David Brown@21:1/5 to bart on Sun Jun 23 14:56:11 2024
    On 21/06/2024 20:43, bart wrote:
    On 21/06/2024 14:51, David Brown wrote:
    On 21/06/2024 15:25, bart wrote:
    On 21/06/2024 11:46, David Brown wrote:
    On 20/06/2024 22:31, Vir Campestris wrote:
    On 17/06/2024 14:43, David Brown wrote:

    Compilation speed is important to everyone. That's why so many
    tricks are used to get around the lack of speed in a big
    compiler, or so many extra resources are thrown at the problem.

    What "tricks" ?

    Precompiled headers sprang to mind in about half a second.

    <https://en.wikipedia.org/wiki/Precompiled_header>

    Andy

    These are used primarily in C++, where you often have /very/ large
    headers that require a significant amount of analysis.  In C, this
    is not remotely the same scale of problem.  C headers are usually
    far shorter - and far simpler to process.  (In a quick test on my
    system, #include <stdio.h> pulled in 792 lines, while #include
    <iostream> took 28152 lines.)

    C standard headers are nothing. From 3K to 5K lines between Windows
    and Linux, last time I looked. That's for all 30 headers.

    C headers for most other libraries are usually also short, at least
    compared to C++ headers.

    (Which is why I think they should just be included anyway.)

    That would be a terrible idea for many reasons.


    But library headers can be much bigger. I already gave a timing for
    windows.h, of 1.4 seconds. SDL2, quite a small library compared with
    some, is still 50K lines, and adds 0.75 seconds compilation per module.

    I don't know what version of SDL headers I have (it's not a library I
    have used myself), but there are about 30 headers in the
    /usr/include/SDL/ folder, totalling some 12K lines (after
    preprocessing).  Including them all in an otherwise blank C file takes
    about 40-50 ms to compile - basically, noise.

    I don't trust your figures. How many lines were there /before/
    preprocessing? Since if significantly more, all that has to be processed still.


    There are about 12K lines in my /usr/include/SDL/ folder /before/ preprocessing. I'm guessing that during preprocessing, some of these
    will disappear due to conditionals, while some will be duplicated due to
    the same file included more than once, and other headers will be pulled
    in from elsewhere. The end result happens to be around the same, though
    it is certainly not guaranteed that this would be the case.

    I checked versions - it seems this is SDL 1.2. As I say, I have never
    used it directly (I guess the headers got pulled in along with some
    other package). There could be huge differences between SDL 1.2 and SDL
    2, for all I know.

    Even at 12K lines and 50ms, that gives gcc a throughput of 0.25M lines
    per second, something I've never seen on any version of gcc. And if it
    was 50K lines like mine (just the lines in all the SDL*.h files), then
    it suggests a throughput of 1M lines per second.

    Lines per second is a completely meaningless way to measure compiler
    speed. I know you like it, but it makes no sense at all - it's like
    measuring the productivity of programmers in lines per day. I expect
    compilers to chew through C headers far faster than main code, whether
    or not they do any optimisations.



    Again, this is a decade old PC with spinning rust disk.  A key
    difference, of course, is that I am using an OS that is suitable for
    the task.

    I tried it under WSL too (unless that doesn't really count as a suitable
    OS):

    WSL does not, I think, count as a suitable OS in regard to timings and efficiency - it relies on Windows for file and disk access, and either
    runs processes on the Windows kernel or has virtualisation overheads (I
    don't know the details of WSL workings).


       WSL         gcc 9.4  -S   0.5 seconds to process #include "sdl.h"
       Windows 11  gcc 14.1 -S   0.7 seconds

    That last is the same OS where my mcc compiler takes 0.08 seconds to
    process those same SDL*.h files.

    I think you've misplaced a decimal point somewhere.

    No.

  • From David Brown@21:1/5 to bart on Sun Jun 23 15:11:47 2024
    On 21/06/2024 23:05, bart wrote:
    On 21/06/2024 14:51, David Brown wrote:
    On 21/06/2024 15:25, bart wrote:

    (Which is why I think they should just be included anyway.)

    That would be a terrible idea for many reasons.

    Such as? It can't be compilation time, since headers ten times the size
    or more apparently compile instantly.

    Namespace pollution. C does not have namespaces, so everything gets
    dumped in the global namespaces. There is always a risk that user code
    has identifiers that are used by later versions of the C standards, even
    when they try to avoid the names in the reserved list. As each standard
    gets more standard headers, and more identifiers in them, the risk of
    conflicts if you always included all standard headers would be huge.


    One delight in using my language is that its standard library is always available. But if you don't want it, it can be disabled. With C I spend
    a big chunk of my time writing include lines.

    That sounds like inefficient coding practices. You do know that file
    inclusion is recursive? So if most of the files in your project use <stdbool.h>, <stdint.h>, <stdlib.h>, <stddef.h> and <string.h>, you can
    put all these in a file "common.h" and include that? But if you never,
    or almost never, use <setjmp.h>, <complex.h>, <wctype.h>, <fenv.h>, then
    you don't bother with them in the "common.h" file.

    Such practices eliminate any perceived benefit of putting all the
    headers inside the compiler.


    First I need stdio. Then string. Then stdlib. Then there's always ones I can't quite remember.

    There are some pretty good references available. Some compilers will
    even tell you which include files are missing, at least in some common
    cases.



    Or I need to debug someone's code, and it needs stdio to define
    'printf', FFS. Would the world stop turning if it was just available?

    I sometimes want to define my own "printf". You can disagree, but
    certainly /I/ am glad of the flexibility C offers.


    I just don't believe that things defined in the headers need to be micro-managed to that extent, most of the time.


    Most languages require declarations of libraries or modules that are
    used. C, being fairly low-level and with a smaller core language than
    most, might require such declarations a little more than most languages.
    But it works well enough in practice.

  • From James Kuyper@21:1/5 to Ben Bacarisse on Sun Jun 23 10:30:06 2024
    On 6/23/24 07:19, Ben Bacarisse wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    ...
    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons of the same elements are consistent with one another. That
    the comparison function defines a total order on the elements is, to me,
    a major extra constraint that should not be written as an apparent
    clarification to something that does not imply it: repeated calls should
    be consistent with one another and, in addition, a total order should be
    imposed on the elements present.

    I think you're misreading the first sentence.

    Let's hope so. That's why I said it should be clearer, not that it was wrong.

    Suppose we are in
    court listening to an ongoing murder trial. Witness one comes in
    and testifies that Alice left the house before Bob. Witness two
    comes in (after witness one has gone) and testifies that Bob left
    the house before Cathy. Witness three comes in (after the first
    two have gone) and testifies that Cathy left the house before
    Alice. None of the witnesses have contradicted either of the
    other witnesses, but the testimonies of the three witnesses are
    not consistent with one another.

    My (apparently incorrect) reading of the first sentence is that the consistency is only required between the results of multiple calls
    between each pair. In other words, if the witnesses are repeatedly
    asked, again and again, if Alice left before Bob and/or if Bob left
    before Alice the results would always be consistent (with, of course,
    the same required of repeatedly asking about the other pairs of people).

    It says "When the same objects (consisting of size bytes, irrespective
    of their current positions in the array) are passed more than once to
    the comparison function, the results shall be consistent with one
    another."
    I can see you reading that as applying only when both arguments point to
    the same pair of objects as another call to comparison function, but if
    I do compar(&a,&b), compar(&b,&c), and compar(&c,&a), then each argument
    of every call to compar() involves the same object as one of the
    arguments of another call, and it seems to me that those same words
    therefore require consistency of the results of those comparisons, too. Interpreting those words that way might seem less obvious, but it has
    the advantage of making the subsequent "That is, ..." correct, rather
    than an error.
    I certainly would favor improved wording that made this clearer. In
    fact, simply explicitly mandating total ordering rather than making a
    vague comment about consistency would probably be the best approach.

  • From Scott Lurndal@21:1/5 to David Brown on Sun Jun 23 15:22:35 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 21/06/2024 20:43, bart wrote:

    Even at 12K lines and 50ms, that gives gcc a throughput of 0.25M lines
    per second, something I've never seen on any version of gcc. And if it
    was 50K lines like mine (just the lines in all the SDL*.h files), then
    it suggests a throughput of 1M lines per second.

    Lines per second is a completely meaningless way to measure compiler
    speed. I know you like it, but it makes no sense at all - it's like
    measuring the productivity of programmers in lines per day. I expect
    compilers to chew through C headers far faster than main code, whether
    or not they do any optimisations.

    There was a time when lines per second mattered, but it wasn't due
    to the technology used in the compiler, but rather the technology
    used to feed lines to the compiler (e.g. reading from a 150 card-per-minute card reader limited the compile speed).

    I do have one C++ source file that takes just under 8 minutes to compile
    (on a very beefy xeon server) when using -O3. It includes 2.5 million
    lines of class definitions across a half dozen header files.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sun Jun 23 09:47:15 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]

    On a C language point, I don't think the standard says anything
    about sorting with non-order functions like the one above. Is
    an implementation of qsort permitted to misbehave (for example
    by not terminating) when the comparison function does not
    implement a proper order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective of
    their current positions in the array) are passed more than once
    to the comparison function, the results shall be consistent with
    one another. That is, for qsort they shall define a total
    ordering on the array, and for bsearch the same object shall
    always compare the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it results in
    undefined behavior.

    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons of the same elements are consistent with one another.
    That the comparison function defines a total order on the elements
    is, to me, a major extra constraint that should not be written as
    an apparent clarification to something that does not imply it:
    repeated calls should be consistent with one another and, in
    addition, a total order should be imposed on the elements present.

    I think you're misreading the first sentence.

    Let's hope so. That's why I said it should be clearer, not that it
    was wrong.

    Suppose we are in
    court listening to an ongoing murder trial. Witness one comes in
    and testifies that Alice left the house before Bob. Witness two
    comes in (after witness one has gone) and testifies that Bob left
    the house before Cathy. Witness three comes in (after the first
    two have gone) and testifies that Cathy left the house before
    Alice. None of the witnesses have contradicted either of the
    other witnesses, but the testimonies of the three witnesses are
    not consistent with one another.

    My (apparently incorrect) reading of the first sentence is that
    the consistency is only required between the results of multiple
    calls between each pair. In other words, if the witnesses are
    repeatedly asked, again and again, if Alice left before Bob and/or
    if Bob left before Alice the results would always be consistent
    (with, of course, the same required of repeatedly asking about the
    other pairs of people).

    Let me paraphrase that. When the same pair of objects is passed
    more than once to individual calls of the comparison function, the
    results of those different calls shall each be consistent with
    every other one of the results.

    To paraphrase my reading, when some set of "same" objects is each
    passed more than once to individual calls of the comparison
    function, the results of all of those calls taken together shall
    not imply an ordering contradiction.

    Are the last two paragraphs fair restatements of our respective
    readings? Is the second paragraph plain enough so that you
    would not misconstrue it if read in isolation? Or if not, can
    you suggest a better phrasing?


    Try a web search

    "consistent with" definition

    for more explanation.

    Seriously?

    Yes, it's a serious suggestion, and I'm sorry if it came across as
    condescending. I did this search myself, and learned something from
    it. The important point is that "consistent with" is something of an
    idiomatic phrase, and it doesn't mean "equivalent to" or "the same
    as". Maybe you already knew that, but I didn't, and learning it
    helped me see what the quoted passage is getting at.


    Also, for "one another", if we say the
    children in the Jones family get along with one another, we don't
    mean that each child gets along with at least one of the others,
    but instead mean that every child gets along with every other
    child, that is, that they all get along with each other.

    The sentence in question has, to my mind, already stated what the
    "one another" refers to -- the multiple calls between pairs
    containing the same objects. I get you think that's not the
    intended meaning, but I get my reading so strongly that I struggle
    to see the other.

    Yes, I got that. The incongruity between the first sentence and the
    second sentence prompted me to re-examine the entire paragraph,
    which is what eventually led me to my current reading.


    Whether
    or not some other reading (of that problem sentence in the C
    standard) is sensible, surely the reading I have suggested is a
    plausible one. Do you agree? It seems clear, given how the
    second sentence is phrased, that this suggested reading is what
    was intended.

    I still can't read it the way you do. Every time I try, I find
    the consistency is to be taken as applying to the results of the
    multiple calls between pairs of the same objects. Nothing more.
    It starts with "When the same objects". It seems so clear that
    the consistency is all about the multiple calls with these same
    objects. I keep trying to see your reading of it, but I can't.

    Yes, the phrase "the same objects" starts one down a wrong path.
    What I think is meant is that "sameness" applies to objects
    individually, without regard to what the object is being compared
    to. It's a tricky point because it isn't literally the same object:
    what is meant is the same "logical" object, not the same physical
    object. If you think of "the same objects" as meaning a set of
    individual logical objects, rather than pairs of logical objects,
    that might be a way to dislodge the (unfortunately all too easy
    to fall into) initial impression.


    I don't mean to defend the quality of writing in this passage.
    Certainly it would be nice if the meaning could have been stated
    more plainly. But I think it's an overstatement to say that the
    first sentence in no way implies a total order.

    I have a second objection that prompted that remark. If I take the
    (apparently) intended meaning of the first sentence, I think that
    "consistent" is too weak to imply even a partial order. In dog club
    tonight, because of how they get on, I will ensure that Enzo is
    walking behind George, that George is walking behind Benji, Benji
    behind Gibson, Gibson behind Pepper and Pepper behind Enzo. In what
    sense is this "ordering" not consistent? All the calls to the
    comparison function are consistent with each other.

    I understand the objection, and this is the point I was trying to
    make in the paragraph about children in the Jones family. The
    phrase "one another" in "the results shall be consistent with one
    another" is meant to be read as saying "all the results taken
    together". It is not enough that results not be contradictory taken
    two at a time; considering all the results at once must not lead to
    an ordering contradiction.

    Hopefully this has been helpful for you. If it hasn't I'd like to
    hear where the sticking points are.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to James Kuyper on Sun Jun 23 11:04:21 2024
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    [on the requirements for qsort]

    I certainly would favor improved wording that made this clearer.
    In fact, simply explicitly mandating total ordering rather than
    making a vague comment about consistency would probably be the
    best approach.

    Clearly the C standard intends to impose a weaker requirement
    than that the comparison function be a total ordering.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Sun Jun 23 19:21:07 2024
    On 23/06/2024 13:25, David Brown wrote:
    On 21/06/2024 23:47, bart wrote:
    On 21/06/2024 14:34, David Brown wrote:
    On 21/06/2024 12:42, bart wrote:
    On 21/06/2024 10:46, David Brown wrote:
    On 20/06/2024 22:21, Vir Campestris wrote:
    On 17/06/2024 20:29, David Brown wrote:
    I do my C development with optimisations enabled, which means
    that the C compiler will obey all the rules and requirements of
    C. Optimisations don't change the meaning of correct code - they
    only have an effect on the results of your code if you have
    written incorrect code.  I don't know about you, but my aim in
    development is to write /correct/ code. If disabling
    optimisations helped in some way, it would be due to bugs and luck.
    To me disabling optimisations does one slightly useful thing
    (compiles a little quicker) and one really useful one. It makes
    the interactive debugger work. Optimised code confuses the
    debugger, especially when it does things like reorder code, unroll
    loops, or merge equivalent functions.

    Of course I then test with the optimised version.

    Andy


    I understand your viewpoint and motivation.  But my own experience
    is mostly different.

    First, to get it out of the way, there's the speed of compilation.
    While heavy optimisation (-O3) can take noticeably longer, I never
    see -O0 as being in any noticeable way faster for compilation than
    -O1 or even -O2.

    Absolute time or relative?

    Both.

    For me, optimised options with gcc always take longer:

    Of course.  But I said it was not noticeable - it does not make
    enough difference in speed for it to be worth choosing.


      C:\c>tm gcc bignum.c -shared -s -obignum.dll        # from cold
      TM: 3.85

    Cold build times are irrelevant to development - when you are working
    on a project, all the source files and all your compiler files are in
    the PC's cache.



      C:\c>tm gcc bignum.c -shared -s -obignum.dll
      TM: 0.31

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
      TM: 0.83

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
      TM: 0.93

      C:\c>dir bignum.dll
      21/06/2024  11:14            35,840 bignum.dll

    Any build time under a second is as good as instant.

    I tested on a real project, not a single file.  It has 158 C files
    and about 220 header files.  And I ran it on my old PC, without any
    "tricks" that you dislike so much, doing full clean re-builds.  The
    files are actually all compiled twice, building two variants of the
    binary.

    With -O2, it took 34.3 seconds to build.  With -O1, it took 33.4
    seconds.  With -O0, it took 30.8 seconds.

    So that is a 15% difference for full builds.  In practice, of course,
    full rebuilds are rarely needed, and most builds after changes to the
    source are within a second or so.

    Then there's something very peculiar about your codebase.



    In my experience, programs don't usually consist of a single C file. And
    if they do, build time is rarely long enough to worry anyone.

    I think, from the history of discussions in this group, that it is more likely that your codebases are the peculiar ones.


    I specifically excluded any of my own. I tried a variety of distinct
    projects, all sharing the same characteristics: that -O3 generally
    doubled build time, sometimes a bit less, often a lot more.

    But you seem remarkably unbothered that in your code-base, the
    difference is only 15% [for -O2]. I'd be quite curious.

    If that really was typical, and I was in charge of gcc, I'd seriously
    consider whether to bother with the -O0 and -O1 levels.

    However the following timings to build TCC/lUA are typical of my
    experience of gcc over 10-20 years:

    (tcc 0.10)
    -O0 2.84 seconds to build tcc.exe
    -O1 5.70
    -O2 10.78
    -O3 13.21

    (tcc 0.25)
    -O0 7.74 seconds to build lua.exe
    -O1 10.63
    -O2 14.95
    -O3 18.24

    I've shown the timings from building with Tcc to give some perspective.
    The proportional difference between -O3 and -O0 is indeed small compared
    with that between -O0 and tcc!


    However, it is certainly fair to say that codebases vary a lot, as do
    build procedures.


    Either there's disproportionate percentage of declarations compared to
    executable code (like my tests with headers).

    No.


    Or a big chunk of the source either contains lots of redundant code,

    No.

    or generates such code, or maybe conditional code,

    There's a bit, but not much.

    that is eliminated before it gets to the optimising stages.


    To be fair here, I have a /lot/ of flags for my builds, and the only
    thing I changed here was the main optimisation number.  Other flags
    might still have had an influence in the compiler run time, such as the warning flags.  But there's a limit to how much effort I'm going to
    spend playing around for a Usenet post.

    That depends on whether you want your findings to be accurate and not
    misleading. ATM your figures and also your comments raise some red flags
    for me.

    TBF this is a problem with smart compilers, some may find some way of
    caching previous results - beyond whatever the file system does - even
    if you delete the relevant binaries (I think Rust or Zig did this).

      And a great many of the flags
    (not the warning flags) are important for actually generating the
    correct code for the correct target, so sorting out those that could be removed in order to make a completely pointless and useless
    near-unoptimised build is just not worth the effort.  However, changing
    the main optimisation number to 0 certainly reduced the code generation
    to much less efficient code - likely too slow to work for some critical aspects of the program.

    And yet, -O2 must have invoked all those dozens of optimising passes to
    make that difference, all for only 15% cost?

    (My compiler's joke of an 'optimising' pass slows it down by 10%, and it usually makes fuck-all difference to performance of my main apps.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to bart on Sun Jun 23 22:09:09 2024
    On 23/06/2024 19:21, bart wrote:
    On 23/06/2024 13:25, David Brown wrote:

    In my experience, programs don't usually consist of a single C file.
    And if they do, build time is rarely long enough to worry anyone.

    I think, from the history of discussions in this group, that it is
    more likely that your codebases are the peculiar ones.


    I specifically excluded any of my own. I tried a variety of distinct projects, all sharing the same characteristics: that -O3 generally
    doubled build time, sometimes a bit less, often a lot more.

    But you seem remarkably unbothered that in your code-base, the
    difference is only 15% [for -O2]. I'd be quite curious.

    If that really was typical, and I was in charge of gcc, I'd seriously consider whether to bother with the -O0 and -O1 levels.

    However the following timings to build TCC/lUA are typical of my
    experience of gcc over 10-20 years:

       (tcc  0.10)
       -O0   2.84 seconds to build tcc.exe
       -O1   5.70
       -O2  10.78
       -O3  13.21

       (tcc  0.25)
       -O0   7.74 seconds to build lua.exe
       -O1  10.63
       -O2  14.95
       -O3  18.24

    I've shown the timings from building with Tcc to give some perspective.
    The proportional difference between -O3 and -O0 is indeed small compared
    with that between -O0 and tcc!

    I've done one more test, which is compiling 140 .c files of Seed7 to
    object files (not linking). This was done under WSL and outside of a
    makefile where there were a million things going on that I had no idea
    about.

    Results were:

    -O0 17 seconds
    -O2 36 seconds
    -O3 43 seconds

    This was done with one invocation of gcc. Invoking gcc each time might
    well make it slower, but a test I did along those lines was not conclusive.
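    A rough harness for this kind of measurement might look like the
    following (a sketch: the compiler name and the *.c glob are
    placeholder assumptions, and the first run should be discarded so
    that file caching does not skew the comparison):

    ```shell
    #!/bin/sh
    # Compare compile-only times at each optimisation level.
    # CC and the source glob are illustrative; point them at your tree.
    CC=${CC:-gcc}
    for opt in -O0 -O1 -O2 -O3; do
        start=$(date +%s)
        for f in *.c; do
            [ -e "$f" ] && $CC -c "$opt" "$f" -o /dev/null 2>/dev/null
        done
        echo "$opt: $(( $(date +%s) - start )) s"
    done
    ```

    Invoking the compiler once per file, as above, also exposes any
    per-invocation startup overhead that a single combined invocation
    would hide.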

    So, if the difference between O0 and O2 is so narrow for you, and it's
    not the source code, nor how you invoke gcc, then there must be other
    things going on.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Sun Jun 23 23:30:55 2024
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]

    On a C language point, I don't think the standard says anything
    about sorting with non-order functions like the one above. Is
    an implementation of qsort permitted to misbehave (for example
    by not terminating) when the comparison function does not
    implement a proper order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective of
    their current positions in the array) are passed more than once
    to the comparison function, the results shall be consistent with
    one another. That is, for qsort they shall define a total
    ordering on the array, and for bsearch the same object shall
    always compare the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it results in
    undefined behavior.

    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons of the same elements are consistent with one another.
    That the comparison function defines a total order on the elements
    is, to me, a major extra constraint that should not be written as
    an apparent clarification to something that does not imply it:
    repeated calls should be consistent with one another and, in
    addition, a total order should be imposed on the elements present.

    I think you're misreading the first sentence.

    Let's hope so. That's why I said it should be clearer, not that it
    was wrong.

    Suppose we are in
    court listening to an ongoing murder trial. Witness one comes in
    and testifies that Alice left the house before Bob. Witness two
    comes in (after witness one has gone) and testifies that Bob left
    the house before Cathy. Witness three comes in (after the first
    two have gone) and testifies that Cathy left the house before
    Alice. None of the witnesses have contradicted either of the
    other witnesses, but the testimonies of the three witnesses are
    not consistent with one another.

    My (apparently incorrect) reading of the first sentence is that
    the consistency is only required between the results of multiple
    calls between each pair. In other words, if the witnesses are
    repeatedly asked, again and again, if Alice left before Bob and/or
    if Bob left before Alice the results would always be consistent
    (with, of course, the same required of repeatedly asking about the
    other pairs of people).

    Let me paraphrase that. When the same pair of objects is passed
    more than once to individual calls of the comparison function, the
    results of those different calls shall each be consistent with
    every other one of the results.

    No, only with the results of the other calls that get passed the same
    pair. If cmp(&a, &b) == -32 then cmp(&a, &b) must always be negative
    (though not always -32) and cmp(&b, &a) must always be positive. To me,
    this reading is backed up by the remark about "regardless of where in
    the array they are".

    To paraphrase my reading, when some set of "same" objects is each
    passed more than once to individual calls of the comparison
    function, the results of all of those calls taken together shall
    not imply an ordering contradiction.

    Are the last two paragraphs fair restatements of our respective
    readings?

    I don't think so. The first does not seem to be what I meant, and the
    second begs a question: what is an ordering contradiction?

    Maybe I could work out what you mean by that if I thought about it some
    more, but this discussion has reminded me why I swore not to discuss
    wording and interpretation on Usenet. You found the wording adequate.
    I didn't. I won't mind if no one ever knows exactly why I didn't. C
    has managed fine with this wording for decades so there is no practical problem. I think enough time has been spent on this discussion already,
    but I can sense more is likely to spent.

    Is the second paragraph plain enough so that you
    would not misconstrue it if read in isolation? Or if not, can
    you suggest a better phrasing?

    Since I don't know what an ordering contradiction is, I can't suggest an alternative.

    Try a web search

    "consistent with" definition

    for more explanation.

    Seriously?

    Yes, it's a serious suggestion, and I'm sorry if it came across as
    condescending. I did this search myself, and learned something from
    it. The important point is that "consistent with" is something of an
    idiomatic phrase, and it doesn't mean "equivalent to" or "the same
    as". Maybe you already knew that, but I didn't, and learning it
    helped me see what the quoted passage is getting at.

    I find that /inconsistent/ with what I've previously inferred about your knowledge of English, but I have to take your word for it.

    If you care to be less cryptic, maybe you will say what it was about the meaning of "consistent with" that helped you see what the text in
    question was getting at.

    Also, for "one another", if we say the
    children in the Jones family get along with one another, we don't
    mean that each child gets along with at least one of the others,
    but instead mean that every child gets along with every other
    child, that is, that they all get along with each other.

    The sentence in question has, to my mind, already stated what the
    "one another" refers to -- the multiple calls between pairs
    containing the same objects. I get you think that's not the
    intended meaning, but I get my reading so strongly that I struggle
    to see the other.

    Yes, I got that. The incongruity between the first sentence and the
    second sentence prompted me to re-examine the entire paragraph,
    which is what eventually led me to my current reading.


    Whether
    or not some other reading (of that problem sentence in the C
    standard) is sensible, surely the reading I have suggested is a
    plausible one. Do you agree? It seems clear, given how the
    second sentence is phrased, that this suggested reading is what
    was intended.

    I still can't read it the way you do. Every time I try, I find
    the consistency is to be taken as applying to the results of the
    multiple calls between pairs of the same objects. Nothing more.
    It starts with "When the same objects". It seems so clear that
    the consistency is all about the multiple calls with these same
    objects. I keep trying to see your reading of it, but I can't.

    Yes, the phrase "the same objects" starts one down a wrong path.
    What I think is meant is that "sameness" applies to objects
    individually, without regard to what the object is being compared
    to. It's a tricky point because it isn't literally the same object:
    what is meant is the same "logical" object, not the same physical
    object. If you think of "the same objects" as meaning a set of
    individual logical objects, rather than pairs of logical objects,
    that might be a way to dislodge the (unfortunately all too easy
    to fall into) initial impression.

    Can you express this mathematically? I can't follow these words at all.
    I am clearly getting mentally old.

    I don't mean to defend the quality of writing in this passage.
    Certainly it would be nice if the meaning could have been stated
    more plainly. But I think it's an overstatement to say that the
    first sentence in no way implies a total order.

    I have a second objection that prompted that remark. If I take the
    (apparently) intended meaning of the first sentence, I think that
    "consistent" is too weak to imply even a partial order. In dog club
    tonight, because of how they get on, I will ensure that Enzo is
    walking behind George, that George is walking behind Benji, Benji
    behind Gibson, Gibson behind Pepper and Pepper behind Enzo. In what
    sense is this "ordering" not consistent? All the calls to the
    comparison function are consistent with each other.

    I understand the objection, and this is the point I was trying to
    make in the paragraph about children in the Jones family. The
    phrase "one another" in "the results shall be consistent with one
    another" is meant to be read as saying "all the results taken
    together". It is not enough that results not be contradictory taken
    two at a time; considering all the results at once must not lead to
    an ordering contradiction.

    So you agree that the first sentence in no way implies a total order?
    All the results of the dog-order comparison function, taken together,
    are consistent with the circular order, which is obviously not a total
    order.

    I must be missing something because you don't say anything else to
    indicate a change of opinion. Are you making what to me is a circular
    argument that consistent means consistent with a total order, not some
    other ordering relationship?
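    The circular order in the dog-club example can be made concrete (a
    sketch: three elements stand in for the five dogs). Every pair
    compares deterministically and antisymmetrically, yet the results
    taken together are cyclic, so no total ordering exists; passing such
    a function to qsort would fall foul of 7.22.5p4 on either reading
    that requires all results to cohere.

    ```c
    #include <assert.h>

    /* Elements A=0, B=1, C=2 with A "before" B, B "before" C, and
       C "before" A.  Pairwise the table is consistent; collectively
       it is cyclic. */
    static const int cyc[3][3] = {
        {  0, -1,  1 },   /* A vs A, B, C */
        {  1,  0, -1 },   /* B vs A, B, C */
        { -1,  1,  0 },   /* C vs A, B, C */
    };

    static int cyc_cmp(int a, int b)
    {
        return cyc[a][b];
    }

    int main(void)
    {
        /* Pairwise consistent: swapping arguments flips the sign. */
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                assert(cyc_cmp(i, j) == -cyc_cmp(j, i));

        /* ...but collectively cyclic: A < B, B < C, and C < A. */
        assert(cyc_cmp(0, 1) < 0 && cyc_cmp(1, 2) < 0 && cyc_cmp(2, 0) < 0);
        return 0;
    }
    ```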

    Hopefully this has been helpful for you. If it hasn't I'd like to
    hear where the sticking points are.

    I think I am a little more confused than I was.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Sun Jun 23 23:33:34 2024
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    [on the requirements for qsort]

    I certainly would favor improved wording that made this clearer.
    In fact, simply explicitly mandating total ordering rather than
    making a vague comment about consistency would probably be the
    best approach.

    Clearly the C standard intends to impose a weaker requirement
    than that the comparison function be a total ordering.

    The plot thickens. Unless, of course, you are referring to the
    distinction you drew before between an ordering of all possible objects
    and only those in the array.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Malcolm McLean on Mon Jun 24 01:25:46 2024
    On 24/06/2024 00:52, Malcolm McLean wrote:
    On 23/06/2024 22:09, bart wrote:
    On 23/06/2024 19:21, bart wrote:
    On 23/06/2024 13:25, David Brown wrote:

    In my experience, programs don't usually consist of a single C file.
    And if they do, build time is rarely long enough to worry anyone.

    I think, from the history of discussions in this group, that it is
    more likely that your codebases are the peculiar ones.


    I specifically excluded any of my own. I tried a variety of distinct
    projects, all sharing the same characteristics: that -O3 generally
    doubled build time, sometimes a bit less, often a lot more.

    But you seem remarkably unbothered that in your code-base, the
    difference is only 15% [for -O2]. I'd be quite curious.

    If that really was typical, and I was in charge of gcc, I'd seriously
    consider whether to bother with the -O0 and -O1 levels.

    However the following timings to build TCC/lUA are typical of my
    experience of gcc over 10-20 years:

        (tcc  0.10)
        -O0   2.84 seconds to build tcc.exe
        -O1   5.70
        -O2  10.78
        -O3  13.21

        (tcc  0.25)
        -O0   7.74 seconds to build lua.exe
        -O1  10.63
        -O2  14.95
        -O3  18.24

    I've shown the timings from building with Tcc to give some
    perspective. The proportional difference between -O3 and -O0 is
    indeed small compared with that between -O0 and tcc!

    I've done one more test, which is compiling 140 .c files of Seed7 to
    object files (not linking). This was done under WSL and outside of a
    makefile where there were a million things going on that I had no idea
    about.

    Results were:

       -O0   17 seconds
       -O2   36 seconds
       -O3   43 seconds

    This was done with one invocation of gcc. Invoking gcc each time might
    well make it slower, but a test I did along those lines was not
    conclusive.

    So, if the difference between O0 and O2 is so narrow for you, and it's
    not the source code, nor how you invoke gcc, then there must be other
    things going on.


    43 seconds compile time is getting to be a bit of a problem. But not for
    the final build. Only for intermediate builds.

    That isn't the issue here (it could have been 4.3 seconds vs 1.7 seconds).

    David Brown was claiming there was little difference (15%, although on
    the 34.3s and 30.8s timings, it is actually 11% not 15%) between
    optimised and unoptimised builds, whereas I have always seen substantial differences like 100% or more.

    He suggested that my figures (based on working with myriad open source
    programs as well as my own) were erroneous, and that his figures must be correct. But he also claims that gcc compiles the SDL headers 15 times
    faster, on his old machine with HDD, than it makes on my newer machine
    with SSD, and that the main reasons is because it runs Windows.

    He further claims that WSL isn't Linux (because I got the same behaviour
    on that). But I have also seen those differences with gcc running on
    pure Linux machines with no Windows at all.

    I think he's just trying to win some stupid argument (something to do
    with gcc not really being that slow compiler, or if it is, there's a way
    around it: don't compile any files!).

    My own theory is that of those 30.8/34.3s -O0/-O2 timings to build that
    project of his, ~27 seconds of it was nothing to do with gcc but
    something else his build system was up to. But he isn't interested in
    investigating further (no surprise).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Mon Jun 24 00:56:17 2024
    bart <bc@freeuk.com> writes:
    On 24/06/2024 00:52, Malcolm McLean wrote:

    43 seconds compile time is getting to be a bit of a problem. But not for
    the final build. Only for intermediate builds.

    That isn't the issue here (it could have been 4.3 seconds vs 1.7 seconds).

    David Brown was claiming there was little difference (15%, although on
    the 34.3s and 30.8s timings, it is actually 11% not 15%) between
    optimised and unoptimised builds, whereas I have always seen substantial
    differences like 100% or more.

    David is likely referring to the average. You've obviously found
    an outlier. I've one outlier that compiles with -O0 in about
    15 seconds, and with -O3 it takes 8 minutes. However, the
    code in that file is far from optimal. The rest of the source
    files in that project (20+ million lines) are more in the 10-15%
    range, if that high.

    I took a look at the outlier, and got rid of some unnecessary
    local block declarations in a large function, removed some unnecessary #include's and cut the -O3 time in half, to just under 4
    minutes. The variable tracking feature of the gnu compiler
    collection seems to be the culprit here.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Scott Lurndal on Mon Jun 24 10:28:52 2024
    On 24/06/2024 02:56, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 00:52, Malcolm McLean wrote:

    43 seconds compile time is getting to be a bit of a problem. But not for the final build. Only for intermediate builds.

    That isn't the issue here (it could have been 4.3 seconds vs 1.7 seconds).
    David Brown was claiming there was little difference (15%, although on
    the 34.3s and 30.8s timings, it is actually 11% not 15%) between
    optimised and unoptimised builds, whereas I have always seen substantial
    differences like 100% or more.

    I think the main problem is that Bart doesn't understand what "builds"
    are. And he doesn't understand the relevance - or irrelevance - of
    compile times while developing software, and how they relate to /build/
    times or /development/ time. I don't care how long it takes to compile
    a file - I care how long the build takes.


    David is likely referring to the average.

    I think in this case it was just estimating percentages in my head,
    rounding up - it's all approximate.

    You've obviously found
    an outlier. I've one outlier that compiles with -O0 in about
    15 seconds, and with -O3 it takes 8 minutes. However, the
    code in that file is far from optimal. The rest of the source
    files in that project (20+ million lines) are more in the 10-15%
    range, if that high.

    I took a look at the outlier, and got rid of some unnecessary
    local block declarations in a large function, removed some unnecessary #include's and cut the -O3 time in half, to just under 4
    minutes. The variable tracking feature of the gnu compiler
    collection seems to be the culprit here.


    Some optimisations scale quadratically or worse, by code size and/or complexity. And using -O3 rather than -O2 will enable optimisation
    passes that can take a lot longer while usually only having minor
    benefits to the generated code, and indeed risking giving worse results.
    So some types of code can take an inordinate amount of time to compile
    with -O3. Typical cases are very large files or files with very large functions - C code generated by other tools (such as simulators
    generated from HDL languages) can often be problematic. If it matters,
    it's normally possible to do some fine-tuning with particular
    optimisation passes or fiddling optimiser parameters.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Mon Jun 24 10:16:09 2024
    On 23/06/2024 20:21, bart wrote:
    On 23/06/2024 13:25, David Brown wrote:

    To be fair here, I have a /lot/ of flags for my builds, and the only
    thing I changed here was the main optimisation number.  Other flags
    might still have had an influence in the compiler run time, such as
    the warning flags.  But there's a limit to how much effort I'm going
    to spend playing around for a Usenet post.

    That depends on whether you want your findings to be accurate and not misleading. ATM your figures and also your comments raise some red flags
    for me.

    I've given figures for what changing the optimisation level means to the
    timing of /my/ builds. Things can be different for other people. You
    have gone on and on about how compilation time is the only thing that
    matters to you, and how important it is to use "gcc -O0" to keep compile
    times low. I've shown that, for my code and my builds, disabling
    optimisation has such a small impact on the build time that it is not
    remotely worth it.

    This might not be /your/ experience, but it is my experience. And of
    all the people I have worked with, I've never known a single occasion
    when anyone has used low optimisation because of build speed. I've seen
    plenty of people compile without optimisation for various reasons
    (usually bad reasons or plain ignorance), but never because of speed.


    TBF this is a problem with smart compilers: some may find some way of
    caching previous results - beyond whatever the file system does - even
    if you delete the relevant binaries (I think Rust or Zig did this).

    gcc does not cache previous results. It is a program of *nix heritage -
    it does its job, and lets other programs do theirs. So if you want
    caching of compilation, you use a different program - ccache. (And I
    /do/ use that for some projects, especially if there is a lot of C++ or
    the builds involve a lot of parallel builds with different options for different target variants. It was not used in the timings I gave here, however.)


      And a great many of the flags (not the warning flags) are important
    for actually generating the correct code for the correct target, so
    sorting out those that could be removed in order to make a completely
    pointless and useless near-unoptimised build is just not worth the
    effort.  However, changing the main optimisation number to 0 certainly
    reduced the code generation to much less efficient code - likely too
    slow to work for some critical aspects of the program.

    And yet, -O2 must have invoked all those dozens of optimising passes to
    make that difference, all for only 15% cost?


    That's the numbers I measured. I would not be surprised to see a
    significantly bigger difference on a single compile of a single file, independent of the rest of the build. But such figures would be
    meaningless to me - the time it takes to build the target binaries is
    the relevant number.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Mon Jun 24 01:53:27 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]

    On a C language point, I don't think the standard says anything
    about sorting with non-order functions like the one above. Is
    an implementation of qsort permitted to misbehave (for example
    by not terminating) when the comparison function does not
    implement a proper order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective of
    their current positions in the array) are passed more than once
    to the comparison function, the results shall be consistent with
    one another. That is, for qsort they shall define a total
    ordering on the array, and for bsearch the same object shall
    always compare the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it results in
    undefined behavior.

    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons or the same elements are consistent with one another.
    That the comparison function defines a total order on the elements
    is, to me, a major extra constraint that should not be written as
    an apparent clarification to something that does not imply it:
    repeated calls should be consistent with one another and, in
    addition, a total order should be imposed on the elements present.

    I think you're misreading the first sentence.

    Let's hope so. That's why I said it should be clearer, not that it
    was wrong.

    Suppose we are in
    court listening to an ongoing murder trial. Witness one comes in
    and testifies that Alice left the house before Bob. Witness two
    comes in (after witness one has gone) and testifies that Bob left
    the house before Cathy. Witness three comes in (after the first
    two have gone) and testifies that Cathy left the house before
    Alice. None of the witnesses have contradicted either of the
    other witnesses, but the testimonies of the three witnesses are
    not consistent with one another.

    My (apparently incorrect) reading of the first sentence is that
    the consistency is only required between the results of multiple
    calls between each pair. In other words, if the witnesses are
    repeatedly asked, again and again, if Alice left before Bob and/or
    if Bob left before Alice the results would always be consistent
    (with, of course, the same required of repeatedly asking about the
    other pairs of people).

    Let me paraphrase that. When the same pair of objects is passed
    more than once to individual calls of the comparison function, the
    results of those different calls shall each be consistent with
    every other one of the results.

    No, only with the results of the other calls that get passed the same
    pair. [...]

    Sorry, my oversight. That is what I meant. "When the same pair
    of objects is passed more than once to individual calls of the
    comparison function, the results of those different calls shall
    each be consistent with every other one of THOSE results." The
    consistency is meant to be only between results of comparisons
    of the same pair. (This mistake illustrates how hard it is to
    write good specifications in the C standard.)

    To paraphrase my reading, when some set of "same" objects is each
    passed more than once to individual calls of the comparison
    function, the results of all of those calls taken together shall
    not imply an ordering contradiction.

    Are the last two paragraphs fair restatements of our respective
    readings?

    I don't think so. The first does not seem to be what I meant, and the
    second begs a question: what is an ordering contradiction?

    A conclusion that violates the usual mathematical rules of the
    relations less than, equal to, greater than: A<B and B<C implies
    A<C, A<B implies A!=B, A=B implies not A<B, A<B implies B>A, etc.

    Maybe I could work out what you mean by that if I thought about it
    some more, but this discussion has reminded me why I swore not to
    discuss wording and interpretation on Usenet. You found the wording adequate. I didn't. I won't mind if no one ever knows exactly why
    I didn't. C has managed fine with this wording for decades so there
    is no practical problem. I think enough time has been spent on this discussion already, but I can sense more is likely to be spent.

    A small correction: I found the wording understandable. If the
    question is about adequacy, I certainly can't give the current
    wording 10 out of 10. I would like to see the specification for
    qsort stated more plainly. Although, as you can see, I'm having
    trouble figuring out how to do that.

    Is the second paragraph plain enough so that you
    would not misconstrue it if read in isolation? Or if not, can
    you suggest a better phrasing?

    Since I don't know what an ordering contradiction is, I can't suggest
    an alternative.

    Now that I have explained that phrase, I hope you will have a go
    at finding a better wording.

    Try a web search

    "consistent with" definition

    for more explanation.

    Seriously?

    Yes, it's a serious suggestion, and I'm sorry if it came across as
    condescending. I did this search myself, and learned something from
    it. The important point is that "consistent with" is something of an
    idiomatic phrase, and it doesn't mean "equivalent to" or "the same
    as". Maybe you already knew that, but I didn't, and learning it
    helped me see what the quoted passage is getting at.

    I find that /inconsistent/ with what I've previously inferred about
    your knowledge of English, but I have to take your word for it.

    In many cases my knowledge of English that is on display here
    comes only after the fact. I routinely look up words and phrases
    during the process of composing messages for the newsgroups.

    If you care to be less cryptic, maybe you will say what it was
    about the meaning of "consistent with" that helped you see what
    the text in question was getting at.

    I think the key thing is that "consistent with" doesn't mean the
    same. If we're comparing the same pair of objects over and over,
    the results are either the same or they are different. It would
    be odd to use "consistent with one another" if all that mattered
    is whether they are all the same. (I suppose some people would
    think that compare(A,B) < 0 and compare(B,A) > 0 are different
    results, but at least for qsort I don't.) If "consistent with
    one another" doesn't mean "all the same", then "the same objects"
    must not mean the same pairs over and over. I'm guessing about
    what happened because my thought process was not a completely
    conscious one, but this line of reasoning seems plausible.


    Also, for "one another", if we say the
    children in the Jones family get along with one another, we don't
    mean that each child gets along with at least one of the others,
    but instead mean that every child gets along with every other
    child, that is, that they all get along with each other.

    The sentence in question has, to my mind, already stated what the
    "one another" refers to -- the multiple calls between pairs
    containing the same objects. I get you think that's not the
    intended meaning, but I get my reading so strongly that I struggle
    to see the other.

    Yes, I got that. The incongruity between the first sentence and the
    second sentence prompted me to re-examine the entire paragraph,
    which is what eventually led me to my current reading.


    Whether
    or not some other reading (of that problem sentence in the C
    standard) is sensible, surely the reading I have suggested is a
    plausible one. Do you agree? It seems clear, given how the
    second sentence is phrased, that this suggested reading is what
    was intended.

    I still can't read it the way you do. Every time I try, I find
    the consistency is to be taken as applying to the results of the
    multiple calls between pairs of the same objects. Nothing more.
    It starts with "When the same objects". It seems so clear that
    the consistency is all about the multiple calls with these same
    objects. I keep trying to see your reading of it, but I can't.

    Yes, the phrase "the same objects" starts one down a wrong path.
    What I think is meant is that "sameness" applies to objects
    individually, without regard to what the object is being compared
    to. It's a tricky point because it isn't literally the same object:
    what is meant is the same "logical" object, not the same physical
    object. If you think of "the same objects" as meaning a set of
    individual logical objects, rather than pairs of logical objects,
    that might be a way to dislodge the (unfortunately all too easy
    to fall into) initial impression.

    Can you express this mathematically? I can't follow these
    words at all. I am clearly getting mentally old.

    Something like this: if we label values in the array with their
    initial position in the array, where the label stays with the
    value when array elements are exchanged, consider now the set of
    labels whose corresponding objects are each passed to the
    comparison function more than once; this in turn induces a set
    of ordering relationships on the labels. Then the set of label
    order relationships must obey the usual mathematical rules for
    the ordering relations less than, equal to, and greater than.

    Not the best writing I've ever done there. It is at least a bit
    more mathematical.

    I don't mean to defend the quality of writing in this passage.
    Certainly it would be nice if the meaning could have been stated
    more plainly. But I think it's an overstatement to say that the
    first sentence in no way implies a total order.

    I have a second objection that promoted that remark. If I take the
    (apparently) intended meaning of the first sentence, I think that
    "consistent" is too weak to imply even a partial order. In dog club
    tonight, because of how they get on, I will ensure that Enzo is
    walking behind George, that George is walking behind Benji, Benji
    behind Gibson, Gibson behind Pepper and Pepper behind Enzo. In what
    sense is this "ordering" not consistent? All the calls to the
    comparison function are consistent with each other.

    I understand the objection, and this is the point I was trying to
    make in the paragraph about children in the Jones family. The
    phrase "one another" in "the results shall be consistent with one
    another" is meant to be read as saying "all the results taken
    together". It is not enough that results not be contradictory taken
    two at a time; considering all the results at once must not lead to
    an ordering contradiction.

    So you agree that the first sentence in no way implies a total order?

    Well, no, I wouldn't say that. The first sentence does in /some/
    ways imply a total order. Unfortunately it can be read in other
    ways that do not imply a total order. But I can't say that in
    no way does it imply a total order.

    All the results of the dog-order comparison function, taken together,
    are consistent with the circular order, which is obviously not a total
    order.

    If A<B, B<C, C<D, D<E, and E<A, we can infer from the transitivity
    of the "less than" relation that A<A. But A<A can never be true, so
    this set of comparison results is no good. So I guess what we have
    discovered is that "consistent with one another" is intended to mean
    "obeys the usual mathematical rules for ordering relations".

    I must be missing something because you don't say anything else to
    indicate a change of opinion. Are you making what to me is a circular argument that consistent means consistent with a total order, not some
    other ordering relationship?

    The qsort function takes a pointer-to-function argument to perform
    comparisons between objects. That function is obligated to return
    an integer less than, equal to, or greater than zero if the first
    argument is considered to be respectively less than, equal to, or
    greater than the second argument. "Consistent with" means that the
    results indicating less than, equal to, or greater than must obey
    the usual mathematical rules, as I tried to explain above, for those relationships. That these results define a total ordering is a
    consequence of the results obeying the usual mathematical rules.

    It occurs to me now to say that "consistent with" is meant to
    include logical inference. That distinction is a key difference
    between "consistent" and "consistent with" (at least as the two
    terms might be understood). The combination of: one, the results
    of the comparison function are seen as corresponding to an ordering
    relation; and two, that "consistent with one another" includes
    logical inferences considering all of the results together; is what
    allows us to conclude that the results define a total order.

    I'm sorry if any of the above sounds like it's just stating the
    obvious. I'm struggling to find a way to explain what to
    me seems straightforward.

    Hopefully this has been helpful for you. If it hasn't I'd like to
    hear where the sticking points are.

    I think I am a little more confused than I was.

    I am confident that the click will happen.

    On the other hand, I was surprised the other day
    when I looked up "optimist" in the dictionary and
    found a picture of myself. ;)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Mon Jun 24 03:28:16 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    [on the requirements for qsort]

    I certainly would favor improved wording that made this clearer.
    In fact, simply explicitly mandating total ordering rather than
    making a vague comment about consistency would probably be the
    best approach.

    Clearly the C standard intends to impose a weaker requirement
    than that the comparison function be a total ordering.

    The plot thickens. Unless, of course, you are referring to the
    distinction you drew before between an ordering of all possible objects
    and only those in the array.

    Consider the following situation.

    We have an array with seven elements, the integers 1 to 7,
    in that order. We call qsort on the array, with a natural
    comparison function that compares the integer values.

    The qsort function starts with a check, and for any array
    with eight elements or fewer a simple insertion sort is
    done. Because 1 is less than 2, these elements stay
    where they are. Because 2 is less than 3, there is only
    the one comparison, and 3 stays where it is. And so on...
    at each point in the sort an element is compared to the
    one before it, and nothing changes. Six compares are
    done to sort seven elements. Question: has the program
    encountered any undefined behavior? (I expect you will
    say no.)

    Now consider a second situation.

    We again have an array with seven elements, the integers 1
    to 7, but not necessarily in order. We call the same
    qsort function. This time though the argument for the
    comparison function is for a function that just always
    returns -1. The same sequence of events takes place as
    did in the first situation: each element after the first
    is compared to the one before it, and because the previous
    element is deemed "less than" this element no movement
    occurs and we proceed to the next element of the array.
    Six compares are done to "sort" seven elements. Question:
    has the program encountered any undefined behavior?

    If there has been undefined behavior, which passages in
    the C standard explains the difference relative to the
    first situation?

    If there has not been undefined behavior, what does that
    say about what the requirements are for a call to qsort?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Mon Jun 24 13:46:22 2024
    On 24/06/2024 13:17, bart wrote:
    On 24/06/2024 09:28, David Brown wrote:
    On 24/06/2024 02:56, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 00:52, Malcolm McLean wrote:

    43 seconds compile time is getting to be a bit of a problem. But
    not for
    the final build. Only for intermediate builds.

    That isn't the issue here (it could have been 4.3 seconds vs 1.7
    seconds).

    David Brown was claiming there was little difference (15%, although on the 34.3s and 30.8s timings, it is actually 11% not 15%) between
    optimised and unoptimised builds, whereas I have always seen
    substantial
    differences like 100% or more.

    I think the main problem is that Bart doesn't understand what "builds"
    are.  And he doesn't understand the relevance - or irrelevance - of
    compile times while developing software, and how they relate to
    /build/ times or /development/ time.  I don't care how long it takes
    to compile a file - I care how long the build takes.

    I've been building programs for approaching half a century, and I've
    been developing tools do so for most of that.

    Do you really think I don't know what a 'build' is?

    Given your decade (at least) old battle to misunderstand the most
    commonly used build tool, yes.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Mon Jun 24 12:17:50 2024
    On 24/06/2024 09:28, David Brown wrote:
    On 24/06/2024 02:56, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 00:52, Malcolm McLean wrote:

    43 seconds compile time is getting to be a bit of a problem. But not
    for
    the final build. Only for intermediate builds.

    That isn't the issue here (it could have been 4.3 seconds vs 1.7
    seconds).

    David Brown was claiming there was little difference (15%, although on
    the 34.3s and 30.8s timings, it is actually 11% not 15%) between
    optimised and unoptimised builds, whereas I have always seen substantial differences like 100% or more.

    I think the main problem is that Bart doesn't understand what "builds"
    are.  And he doesn't understand the relevance - or irrelevance - of
    compile times while developing software, and how they relate to /build/
    times or /development/ time.  I don't care how long it takes to compile
    a file - I care how long the build takes.

    I've been building programs for approaching half a century, and I've
    been developing tools do so for most of that.

    Do you really think I don't know what a 'build' is?

    I can recreate a 13% difference between -O0 and -O2 on one project (not
    mine), which is 34 files that takes either 22 seconds or 25 seconds.

    That is the one where SDL headers are used by most of the modules (where
    you claim that gcc will build them instantly if only I didn't use
    Windows). But it has been established that processing lots of headers is
    not affected by optimisation flags.

    I noticed that my MCC compiler took 1.8 seconds for a full-build, which
    for me is annoyingly slow, considering the C code amounts to only 7Kloc.

    But because this is C, I switched to incremental compilation, but done manually: setting up a project file for my IDE (instead of compiling
    *.c), where I can re-compile an individual file then link.

    So that is always a possibility, and I have used that in the past when my language had independent compilation.

    With true whole-program compilation as I normally use, the problem
    simply wouldn't arise: the library interface, which would also be a
    short summary not dozens of sprawling headers, is processed once per build.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to bart on Mon Jun 24 16:09:41 2024
    On Fri, 21 Jun 2024 22:47:46 +0100
    bart <bc@freeuk.com> wrote:

    On 21/06/2024 14:34, David Brown wrote:
    On 21/06/2024 12:42, bart wrote:
    On 21/06/2024 10:46, David Brown wrote:


    I understand your viewpoint and motivation.  But my own
    experience is mostly different.

    First, to get it out of the way, there's the speed of
    compilation. While heavy optimisation (-O3) can take noticeably
    longer, I never see -O0 as being in any noticeable way faster for
    compilation than -O1 or even -O2.

    Absolute time or relative?

    Both.

    For me, optimised options with gcc always take longer:

    Of course.  But I said it was not noticeable - it does not make
    enough difference in speed for it to be worth choosing.


      C:\c>tm gcc bignum.c -shared -s -obignum.dll        # from cold
      TM: 3.85

    Cold build times are irrelevant to development - when you are
    working on a project, all the source files and all your compiler
    files are in the PC's cache.



      C:\c>tm gcc bignum.c -shared -s -obignum.dll
      TM: 0.31

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
      TM: 0.83

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
      TM: 0.93

      C:\c>dir bignum.dll
      21/06/2024  11:14            35,840 bignum.dll

    Any build time under a second is as good as instant.

    I tested on a real project, not a single file.  It has 158 C files
    and about 220 header files.  And I ran it on my old PC, without any "tricks" that you dislike so much, doing full clean re-builds.  The
    files are actually all compiled twice, building two variants of the
    binary.

    With -O2, it took 34.3 seconds to build.  With -O1, it took 33.4 seconds.  With -O0, it took 30.8 seconds.

    So that is a 15% difference for full builds.  In practice, of
    course, full rebuilds are rarely needed, and most builds after
    changes to the source are within a second or so.

    Then there's something very peculiar about your codebase.



    To me it looks more likely that your codebase is very unusual rather
    than David's

    In order to get meaningful measurements I took an embedded project that
    is significantly bigger than average by my standards. Here are the times
    for a full parallel rebuild (make -j5) on a relatively old computer (4-core Xeon E3-1271 v3).

    Option  time(s)  -g time  text size
    -O0     13.1     13.3      631648
    -Os     13.6     14.1      424016
    -O1     13.5     13.7      455728
    -O2     14.0     14.1      450056
    -O3     14.0     14.6      525380

    The difference in time between different -O settings in my measurements
    is even smaller than reported by David Brown. That can be attributed to
    the older compiler (gcc 4.1.2). Another difference is that this compiler
    works under cygwin, which is significantly slower than both native
    Linux and native Windows. That causes relatively higher make
    overhead and a longer link.
    If I had "native" tools then all times would likely be shorter by a
    few seconds and the difference between -O0 and -O3 would be close to 10%.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Mon Jun 24 14:01:18 2024
    On 24/06/2024 12:46, David Brown wrote:
    On 24/06/2024 13:17, bart wrote:
    On 24/06/2024 09:28, David Brown wrote:
    On 24/06/2024 02:56, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 00:52, Malcolm McLean wrote:

    43 seconds compile time is getting to be a bit of a problem. But
    not for
    the final build. Only for intermediate builds.

    That isn't the issue here (it could have been 4.3 seconds vs 1.7
    seconds).

    David Brown was claiming there was little difference (15%, although on the 34.3s and 30.8s timings, it is actually 11% not 15%) between
    optimised and unoptimised builds, whereas I have always seen
    substantial
    differences like 100% or more.

    I think the main problem is that Bart doesn't understand what
    "builds" are.  And he doesn't understand the relevance - or
    irrelevance - of compile times while developing software, and how
    they relate to /build/ times or /development/ time.  I don't care how
    long it takes to compile a file - I care how long the build takes.

    I've been building programs for approaching half a century, and I've
    been developing tools do so for most of that.

    Do you really think I don't know what a 'build' is?

    Given your decade (at least) old battle to misunderstand the most
    commonly used build tool, yes.

    Then people have different concepts of what a 'build' is.

    My definition is a full translation from source code to executable
    binary. That, I have never had any trouble with for my own projects, and
    have never had to hang about for it either, because I define and control
    the process (and most often write the necessary tools too, AND design
    the language involved to that end).

    But that's not the case with other people's projects where they define
    and control how it works, since they introduce endless complexities and dependencies.

    Plus of course they tend to use C, which has all sorts of troublesome
    aspects, such as headers having loads of compiler-specific conditional
    blocks. Although if they ONLY used C, it wouldn't be so bad, since then
    the only tool involved would be a C compiler.

  • From bart@21:1/5 to Michael S on Mon Jun 24 15:00:26 2024
    On 24/06/2024 14:09, Michael S wrote:
    On Fri, 21 Jun 2024 22:47:46 +0100
    bart <bc@freeuk.com> wrote:

    On 21/06/2024 14:34, David Brown wrote:
    On 21/06/2024 12:42, bart wrote:
    On 21/06/2024 10:46, David Brown wrote:


    I understand your viewpoint and motivation.  But my own
    experience is mostly different.

    First, to get it out of the way, there's the speed of
    compilation. While heavy optimisation (-O3) can take noticeably
    longer, I never see -O0 as being in any noticeable way faster for
    compilation than -O1 or even -O2.

    Absolute time or relative?

    Both.

    For me, optimised options with gcc always take longer:

    Of course.  But I said it was not noticeable - it does not make
    enough difference in speed for it to be worth choosing.


      C:\c>tm gcc bignum.c -shared -s -obignum.dll        # from cold
      TM: 3.85

    Cold build times are irrelevant to development - when you are
    working on a project, all the source files and all your compiler
    files are in the PC's cache.



      C:\c>tm gcc bignum.c -shared -s -obignum.dll
      TM: 0.31

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
      TM: 0.83

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
      TM: 0.93

      C:\c>dir bignum.dll
      21/06/2024  11:14            35,840 bignum.dll

    Any build time under a second is as good as instant.

    I tested on a real project, not a single file.  It has 158 C files
    and about 220 header files.  And I ran it on my old PC, without any
    "tricks" that you dislike so much, doing full clean re-builds.  The
    files are actually all compiled twice, building two variants of the
    binary.

    With -O2, it took 34.3 seconds to build.  With -O1, it took 33.4
    seconds.  With -O0, it took 30.8 seconds.

    So that is a 15% difference for full builds.  In practice, of
    course, full rebuilds are rarely needed, and most builds after
    changes to the source are within a second or so.

    Then there's something very peculiar about your codebase.



    To me it looks more likely that your codebase is very unusual rather
    than David's

    In order to get meaningful measurements I took an embedded project that
    is significantly bigger than average by my standards. Here are the times of a
    full parallel rebuild (make -j5) on a relatively old computer (4-core Xeon
    E3-1271 v3).

    Option  time(s)  -g time  text size
    -O0     13.1     13.3     631648
    -Os     13.6     14.1     424016
    -O1     13.5     13.7     455728
    -O2     14.0     14.1     450056
    -O3     14.0     14.6     525380

    The difference in time between different -O settings in my measurements
    is even smaller than reported by David Brown. That can be attributed to the
    older compiler (gcc 4.1.2). Another difference is that this compiler
    works under Cygwin, which is significantly slower than both native
    Linux and native Windows. That causes relatively higher make
    overhead and a longer link.

    I don't know why Cygwin would make much difference; the native code is
    still running on the same processor.

    However, is there any way of isolating the compilation time (turning .c
    files into .o files) from 'make' and the linker? Failing that, can
    you compile just one module in isolation (.c to .o) with -O0 and -O2, or
    is that not possible?

    Those throughputs don't look that impressive for a parallel build on
    what sounds like a high-spec machine.

    Your processor has a CPU-mark double that of mine, which has only two
    cores, and is using one.

    Building a 34-module project with .text size of 300KB, with either gcc
    10 or 14, using -O0, takes about 8 seconds, or 37KB/second.

    Your figures show about 50KB/second. You say you use gcc 4, but an older
    gcc is more likely to be faster in compilation speed than a newer one.

    It does sound like something outside of gcc itself.

    For the same project, on the same slow machine, Tiny C's throughput is 1.3MB/second. While my non-C compiler, on other projects, is
    5-10MB/second, still only looking at .text segments. That is 100 times
    faster than your timings, for generating code that is as good as gcc's -O0.

    So IT IS NOT WINDOWS ITSELF THAT IS SLOW.


    If I had "native" tools, all times would likely be shorter by a
    few seconds, and the difference between -O0 and -O3 would be close to 10%.

    So two people now saying that all the many dozens of extra passes and
    extra analysis that gcc -O2/O3 has to do, compared with the basic
    front-end work that every toy compiler needs to do and does it quickly,
    only slows it down by 10%.

    I really don't believe it. And you should understand that it doesn't add up.

  • From Tim Rentsch@21:1/5 to Keith Thompson on Mon Jun 24 07:40:46 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    [on the requirements for qsort]

    I certainly would favor improved wording that made this clearer.
    In fact, simply explicitly mandating total ordering rather than
    making a vague comment about consistency would probably be the
    best approach.

    Clearly the C standard intends to impose a weaker requirement
    than that the comparison function be a total ordering.

    "That is, for qsort they shall define a total ordering on the
    array".

    I presume you didn't intend to contradict that requirement, but
    I can't figure out what you meant -- unless, as Ben suggested,
    you're distinguishing between a total ordering of all possible
    arguments and a total ordering of objects present in the array.
    But even then, the standard explicitly imposes a total ordering.
    (The requirements for bsearch might be weaker, but we're discussing
    qsort.)

    Can you clarify what you meant?

    For starters, saying that the comparison function defines a total
    ordering of elements actually present in the array is already a
    weaker requirement than saying that the comparison function defines
    a total ordering of all values that might legally be present in the
    array.

    Now notice that the C standard isn't referring to the comparison
    function in the statement quoted above. The standard does not say
    "the comparison function shall define". What it does say is that
    "/they/ shall define". Those two aren't the same thing.

  • From David Brown@21:1/5 to bart on Mon Jun 24 17:09:25 2024
    On 24/06/2024 16:00, bart wrote:
    On 24/06/2024 14:09, Michael S wrote:
    On Fri, 21 Jun 2024 22:47:46 +0100
    bart <bc@freeuk.com> wrote:

    On 21/06/2024 14:34, David Brown wrote:
    On 21/06/2024 12:42, bart wrote:
    On 21/06/2024 10:46, David Brown wrote:


    I understand your viewpoint and motivation.  But my own
    experience is mostly different.

    First, to get it out of the way, there's the speed of
    compilation. While heavy optimisation (-O3) can take noticeably
    longer, I never see -O0 as being in any noticeable way faster for
    compilation than -O1 or even -O2.

    Absolute time or relative?

    Both.
    For me, optimised options with gcc always take longer:

    Of course.  But I said it was not noticeable - it does not make
    enough difference in speed for it to be worth choosing.
       C:\c>tm gcc bignum.c -shared -s -obignum.dll        # from cold
       TM: 3.85

    Cold build times are irrelevant to development - when you are
    working on a project, all the source files and all your compiler
    files are in the PC's cache.

       C:\c>tm gcc bignum.c -shared -s -obignum.dll
       TM: 0.31
       C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
       TM: 0.83
       C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
       TM: 0.93
       C:\c>dir bignum.dll
       21/06/2024  11:14            35,840 bignum.dll

    Any build time under a second is as good as instant.

    I tested on a real project, not a single file.  It has 158 C files
    and about 220 header files.  And I ran it on my old PC, without any
    "tricks" that you dislike so much, doing full clean re-builds.  The
    files are actually all compiled twice, building two variants of the
    binary.

    With -O2, it took 34.3 seconds to build.  With -O1, it took 33.4
    seconds.  With -O0, it took 30.8 seconds.

    So that is a 15% difference for full builds.  In practice, of
    course, full rebuilds are rarely needed, and most builds after
    changes to the source are within a second or so.

    Then there's something very peculiar about your codebase.



    To me it looks more likely that your codebase is very unusual rather
    than David's

    In order to get meaningful measurements I took an embedded project that
    is significantly bigger than average by my standards. Here are the times of a
    full parallel rebuild (make -j5) on a relatively old computer (4-core Xeon
    E3-1271 v3).

    Option time(s) -g time text size
    -O0    13.1      13.3   631648
    -Os    13.6      14.1   424016
    -O1    13.5      13.7   455728
    -O2    14.0      14.1   450056
    -O3    14.0      14.6   525380

    The difference in time between different -O settings in my measurements
    is even smaller than reported by David Brown. That can be attributed to the
    older compiler (gcc 4.1.2). Another difference is that this compiler
    works under Cygwin, which is significantly slower than both native
    Linux and native Windows. That causes relatively higher make
    overhead and a longer link.

    I don't know why Cygwin would make much difference; the native code is
    still running on the same processor.


    Cygwin, especially older Cygwin, is very slow for all file access and
    all process control, because it tries to emulate POSIX as closely as
    possible on an OS that has only a fraction of the necessary features.
    gcc is not a monolithic tool - it is a driver, and controls multiple
    processes and accesses a fairly large number of files. So Cygwin-based
    gcc builds will spend a considerable amount of time in this sort of
    thing rather than actual processor-bound compiler work. I am confident
    that Michael would find a mingw/mingw64 based build significantly faster
    since that has a far thinner (almost transparent) emulation layer. And
    it would be a good deal faster again under Linux on the same hardware,
    as that has more efficient file handling.

    (I'm not suggesting Michael change for this project - for serious
    embedded work, repeatable builds and consistency of toolchains is
    generally far more important than build times. But I presume he'll use
    newer and better tools for new projects.)

    However, is there any way of isolating the compilation time (turning .c
    files into .o files) from 'make' and the linker?

    Why would anyone want to do that? At times, it can be useful to do
    partial builds, but compilation alone is not particularly useful.

    Failing that, can
    you compile just one module in isolation (.c to .o) with -O0 and -O2, or
    is that not possible?

    Those throughputs don't look that impressive for a parallel build on
    what sounds like a high-spec machine.

    How can you possibly judge that when you have no idea how big the
    project is?


    If I had "native" tools, all times would likely be shorter by a
    few seconds, and the difference between -O0 and -O3 would be close to 10%.

    So two people now saying that all the many dozens of extra passes and
    extra analysis that gcc -O2/O3 has to do, compared with the basic
    front-end work that every toy compiler needs to do and does it quickly,
    only slows it down by 10%.

    I really don't believe it. And you should understand that it doesn't add
    up.


    That's not what people have said.

    They have said that /build/ times for /real/ projects, measured in real
    time, with optimisation disabled do not give a speedup which justifies
    turning off optimisation and losing the features you get with a strong optimising compiler.

    No one denies that "gcc -O0" is faster than "gcc -O3" for individual
    compiles, and that the percentage difference will vary and sometimes be
    large.

    But that's not the point. People who do C development for a living, do
    not measure the quality of their tools by the speed of compiling random
    junk they found on the internet to see which compiler saves them half a
    second.

    Factors that are important for considering a compiler can include, in no particular order and not all relevant to all developers:

    * Does it support the target devices I need?
    * Does it support the languages and language standards I want?
    * Does it have the extensions I want to use?
    * How good are its error messages at leading me to problems in the code?
    * How good is its static checks and warnings?
    * How efficient are the results?
    * Is it compatible with the libraries and SDK's I want to use?
    * Is it commonly used by others - colleagues, customers, suppliers?
    * Is it supported by the suppliers of my microcontrollers, OS, etc.?
    * Can I easily run it on multiple machines?
    * Can I back it up and run it on systems in the future?
    * Can I get hold of specific old versions of the tools? Can I
    reasonably expect the tools to be available for a long time in the future?
    * What are the policies for bug reporting, and bug fixing in the toolchain?
    * How easy is it to examine the generated code?
    * Does it play well with my IDE, such as cross-navigating between
    compiler messages and source code?
    * Does it have any restrictions in its use?
    * How good is the documentation?
    * Does it have enough flexibility to tune it to my needs and preferences
    for source code checks and warnings?
    * Does it have enough flexibility to tune it to my code generation needs
    and preferences?
    * Can I re-use the same tool for multiple projects?
    * Can I use the same source and the same tool (or same family) on my
    targets and for simulation on PC's?
    * Is the tool mainstream and well-tested by users in practice?
    * Does the same tool, or family of tools, work for different targets?
    * Am I familiar with the tool, its idiosyncrasies, and its options?
    * Is it common enough that I can google for questions about it?
    * Does it generate the debugging information I need? Does it play well
    with my debugger?
    * Is the price within budget? Does it have locks, dongles, or other restrictions? Is commercial support available if I need it?
    * Does it have run-time debugging tools such as sanitizers or optional
    range checks?
    * Does it run on the host systems I want to use?
    * Does it have (or integrate with) other tools such as profilers or code coverage tools?
    * What is the upgrade path and the expectation of future improvements in
    new versions?
    * Are there any legal requirements or ramifications from using the tool?
    * Is it fast enough that it is not annoying to use with normal options
    and common build automation tools, running on a host within reasonable
    budget?


    Notice how important raw compiler speed is in the grand scheme of things?

    The importance of tools is how effective they are for your use as a /developer/. Seconds saved on compile time are totally irrelevant
    compared to days, weeks, months saved by tools that help find or prevent errors, or that let you write better or clearer code.

    I use gcc - specifically toolchains built and released by ARM - because
    that is the tool that I rate highest on these factors. If there were a
    similar featured clang toolchain I'd look closely at that too. And over
    the years I have used many toolchains for many targets, some costing
    multiple $K for the license.

    Of course everyone likes faster compiles, all other things being equal.
    But the other things are /not/ equal when comparing real-world
    development tools with the likes of tcc or your little compiler. The
    idea that anyone should reasonably expect to get paid for wasting
    customer time and money with those is just laughable. It's like being
    hired to dig up a road and arriving with kid's sand spade then claiming
    it is better than a mechanical digger because it is smaller and lighter.

  • From Michael S@21:1/5 to bart on Mon Jun 24 18:10:06 2024
    On Mon, 24 Jun 2024 15:00:26 +0100
    bart <bc@freeuk.com> wrote:

    On 24/06/2024 14:09, Michael S wrote:
    On Fri, 21 Jun 2024 22:47:46 +0100
    bart <bc@freeuk.com> wrote:

    On 21/06/2024 14:34, David Brown wrote:
    On 21/06/2024 12:42, bart wrote:
    On 21/06/2024 10:46, David Brown wrote:


    I understand your viewpoint and motivation.  But my own
    experience is mostly different.

    First, to get it out of the way, there's the speed of
    compilation. While heavy optimisation (-O3) can take noticeably
    longer, I never see -O0 as being in any noticeable way faster
    for compilation than -O1 or even -O2.

    Absolute time or relative?

    Both.

    For me, optimised options with gcc always take longer:

    Of course.  But I said it was not noticeable - it does not make
    enough difference in speed for it to be worth choosing.


      C:\c>tm gcc bignum.c -shared -s -obignum.dll        # from cold
      TM: 3.85

    Cold build times are irrelevant to development - when you are
    working on a project, all the source files and all your compiler
    files are in the PC's cache.



      C:\c>tm gcc bignum.c -shared -s -obignum.dll
      TM: 0.31

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
      TM: 0.83

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
      TM: 0.93

      C:\c>dir bignum.dll
      21/06/2024  11:14            35,840 bignum.dll

    Any build time under a second is as good as instant.

    I tested on a real project, not a single file.  It has 158 C files
    and about 220 header files.  And I ran it on my old PC, without
    any "tricks" that you dislike so much, doing full clean
    re-builds.  The files are actually all compiled twice, building
    two variants of the binary.

    With -O2, it took 34.3 seconds to build.  With -O1, it took 33.4
    seconds.  With -O0, it took 30.8 seconds.

    So that is a 15% difference for full builds.  In practice, of
    course, full rebuilds are rarely needed, and most builds after
    changes to the source are within a second or so.

    Then there's something very peculiar about your codebase.



    To me it looks more likely that your codebase is very unusual rather
    than David's

    In order to get meaningful measurements I took an embedded project that
    is significantly bigger than average by my standards. Here are the times of a
    full parallel rebuild (make -j5) on a relatively old computer (4-core Xeon
    E3-1271 v3).

    Option  time(s)  -g time  text size
    -O0     13.1     13.3     631648
    -Os     13.6     14.1     424016
    -O1     13.5     13.7     455728
    -O2     14.0     14.1     450056
    -O3     14.0     14.6     525380

    The difference in time between different -O settings in my
    measurements is even smaller than reported by David Brown. That can
    be attributed to the older compiler (gcc 4.1.2). Another difference is
    that this compiler works under Cygwin, which is significantly
    slower than both native Linux and native Windows. That causes relatively higher make overhead and a longer link.

    I don't know why Cygwin would make much difference; the native code
    is still running on the same processor.


    I don't know the specific reasons. The bird's-eye perspective is that Cygwin
    tries to emulate POSIX semantics on a platform that is not POSIX, and
    achieves that by using a few low-granularity semaphores in user space,
    which seriously limits parallelism. Besides, there are problems with
    emulation of POSIX I/O semantics that cause Cygwin file I/O to be 2-3
    times slower than native Windows I/O. The latter applies mostly to
    relatively small files, but, then again, software builds mostly
    access small files.
    As a matter of fact, the parallel speed-up I see on this project on this
    quad-core machine is barely 2x. I expect 3x or a little more for the
    same project with native Windows tools.


    However, is there any way of isolating the compilation time (turning
    .c files into .o files) from 'make' and the linker? Failing
    that, can you compile just one module in isolation (.c to .o) with
    -O0 and -O2, or is that not possible?


    Of course, there is a way. But it's more work and gives an answer I am
    not interested in knowing. Here I fully agree with what David said a few
    posts below.

    Those throughputs don't look that impressive for a parallel build on
    what sounds like a high-spec machine.


    Yes, not that impressive.
    However, I wouldn't call the machine high-spec. More precisely, when
    originally bought almost a decade ago it was a high-spec (but not top-spec)
    machine for FPGA development. FPGA development tools even today are not
    so good at parallelizing their jobs. 10 years ago the parallelization
    gain above 2 cores was non-existent. Also, the data access patterns of
    these tools tend to have poor locality of reference. It means that a big L3
    cache has limited usefulness. On the other hand, low-latency main
    memory is very useful. This specific machine was bought with these
    constraints in mind - high [for 2014] single-thread performance, a
    relatively small 8MB L3 cache, and small [for a multi-user server] 32 GB
    low-latency main memory built of unbuffered DIMMs rather than of the
    registered DIMMs more typical for servers.
    People that care about parallel software builds buy a quite different sort
    of server - dual-socket, lots of cores, big last-level cache, big main
    memory with high throughput and high latency. Back in the second half of
    2014 those of them that had bigger budgets bought the Xeon E5-2697 v2; those
    with smaller budgets preferred Xeon E5-2697 v2. Those with good
    contacts were getting the Xeon E5 v3, which was already launched but not
    available for everyone.

    Your processor has a CPU-mark double that of mine, which has only two
    cores, and is using one.

    Building a 34-module project with .text size of 300KB, with either
    gcc 10 or 14, using -O0, takes about 8 seconds, or 37KB/second.


    But my project has much more than 34 modules. 164 modules compiled
    during build + several dozens in libraries.

    Your figures show about 50KB/second.

    text KB/second is hardly a good measure, esp. considering that we are
    talking about different architectures. Mine is Altera Nios2 - a 32-bit
    RISC processor very similar to MIPS. The code density for this
    architecture is significantly lower than on ARMv7 or i386, and even
    somewhat lower than x86-64 and ARM64. The exact ratio depends on the
    project, but 15-20% would be typical.
    Also, part of the text is libraries that were not compiled during this build.
    But I would think that your .text size also includes libraries.

    You say you use gcc 4, but an
    older gcc is more likely to be faster in compilation speed than a
    newer one.


    Yes, but not dramatically so. And your ratios are dramatically
    different from mine, David's, and Scott's.

    It does sound like something outside of gcc itself.

    For the same project, on the same slow machine, Tiny C's throughput
    is 1.3MB/second. While my non-C compiler, on other projects, is 5-10MB/second, still only looking at .text segments. That is 100
    times faster than your timings, for generating code that is as good
    as gcc's -O0.

    So IT IS NOT WINDOWS ITSELF THAT IS SLOW.


    If I had "native" tools, all times would likely be shorter
    by a few seconds, and the difference between -O0 and -O3 would be close
    to 10%.

    So two people now saying that all the many dozens of extra passes
    and extra analysis that gcc -O2/O3 has to do, compared with the basic front-end work that every toy compiler needs to do and does it
    quickly, only slows it down by 10%.

    I really don't believe it. And you should understand that it doesn't
    add up.


    I am not lying. I am pretty sure that DavidB is also telling the truth.
    I recommend trying to compile your C compiler with gcc. Somehow I have a
    feeling that the -O0 to -O2 ratio you'd see will be much closer to 1.15x
    than to 4x.

  • From Tim Rentsch@21:1/5 to Malcolm McLean on Mon Jun 24 09:34:13 2024
    Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

    On 23/06/2024 17:47, Tim Rentsch wrote:

    Yes, it's a serious suggestion, and I'm sorry if it came across as
    condescending. I did this search myself, and learned something from
    it. The important point is the "consistent with" is something of an
    idiomatic phrase, and it doesn't mean "equivalent to" or "the same
    as". Maybe you already knew that, but I didn't, and learning it
    helped me see what the quoted passage is getting at.

    We've established that the wife was in the house at the time when the
    husband was killed. Which is consistent with her having done the
    murder. But it doesn't by itself prove that she did the
    murder. However had we been able to show that she was elsewhere at the
    time, that would not be consistent with her having done the murder,
    and so she would be dropped as a suspect.

    Please don't post this sort of stupid pointless comment again.

  • From bart@21:1/5 to Michael S on Mon Jun 24 17:51:25 2024
    On 24/06/2024 16:10, Michael S wrote:
    On Mon, 24 Jun 2024 15:00:26 +0100
    bart <bc@freeuk.com> wrote:


    Your processor has a CPU-mark double that of mine, which has only two
    cores, and is using one.

    Building a 34-module project with .text size of 300KB, with either
    gcc 10 or 14, using -O0, takes about 8 seconds, or 37KB/second.


    But my project has much more than 34 modules. 164 modules compiled
    during build + several dozens in libraries.

    Does that matter? My example is a smaller project, but I'm comparing the
    rate of compilation, not the total time.

    If you want a bigger example, yesterday I posted one involving 140 .c
    files, total EXE size is 5MB (don't know .text as this is ELF format).

    -O2 took just over twice as long as -O0.

    But I guess every single example I come up with is 'exceptional'. Maybe
    what's exceptional is that I am measuring the runtime of the compiler, and
    not all sorts of other junk.

    You might claim that that other junk is necessary for the build; I'd dispute that.

    Your figures show about 50KB/second.

    text KB/second is hardly a good measure, esp. considering that we are
    talking about different architectures. Mine is Altera Nios2 - a 32-bit
    RISC processor very similar to MIPS. The code density for this
    architecture is significantly lower than on ARMv7 or i386, and even
    somewhat lower than x86-64 and ARM64. The exact ratio depends on the
    project, but 15-20% would be typical.
    Also, part of the text is libraries that were not compiled during this build.
    But I would think that your .text size also includes libraries.

    So, what is the total size of the code that is produced by the
    compilation? What is the data size?

    I really don't believe it. And you should understand that it doesn't
    add up.


    I am not lying.

    I'm not saying that. I'm disputing that -O2 adds only 10% compared to
    -O0 when running only gcc.

    I am pretty sure that DavidB also tells truth.
    I recommend to try to compile your C compiler with gcc.

    My C compiler is not written in C. I can transpile to C, but then it
    would be a single large C file, which puts pressure on the optimiser:

    C:\cx>mc -c cc # transpile to C
    Compiling cc.m to cc.c

    C:\cx>tm gcc cc.c # -O0 build
    TM: 1.92

    C:\cx>tm gcc cc.c -O2
    TM: 9.22

    It takes 380% longer compared with -O0. However the advantage is that I
    now have a whole-program optimised application. But this is not
    something I need to do routinely (maybe on production versions or
    running benchmarks).

    Most of the time I don't need the optimised code (which is only about
    40% faster) and can build like this:

    C:\cx>tm mm cc
    Compiling cc.m to cc.exe
    TM: 0.06

    (If mm.exe has similarly been boosted via gcc; otherwise it takes 0.08s.)

    So optimisation for this product takes 150 times, or 15,000%, longer.

    DB was however saying that he normally has optimisation turned on. Well
    I'm not surprised if turning it off makes only 11% difference! But he is
    not interested in super-fast builds such as those I work on.

    Note that that 0.06 figure is for rebuilding my C compiler from scratch
    (200KB .text size, 300KB EXE size), and 1/3 of it is Windows process overheads. Accurately measuring build times when timings are near zero
    is difficult.

  • From bart@21:1/5 to David Brown on Mon Jun 24 17:19:58 2024
    On 24/06/2024 16:09, David Brown wrote:
    On 24/06/2024 16:00, bart wrote:

    However, is there any way of isolating the compilation time (turning
    .c files into .o files) from 'make' and the linker?

    Why would anyone want to do that?  At times, it can be useful to do
    partial builds, but compilation alone is not particularly useful.

    It is useful when you are trying to establish the true cost of -O2
    compared to -O0. If you include all sorts of extras (why not the time it
    takes to DHL the resulting floppy to a client!) then any figures will be
    misleading.

    Anyone reading this thread could well end up believing that applying -O2
    to a compilation only makes gcc 10-15% slower.


    Failing that, can you compile just one module in isolation (.c to .o)
    with -O0 and -O2, or is that not possible?

    Those throughputs don't look that impressive for a parallel build on
    what sounds like a high-spec machine.

    How can you possibly judge that when you have no idea how big the
    project is?

    Because the 'text' (code) size was provided?


    If I had "native" tools, all times would likely be shorter by a
    few seconds, and the difference between -O0 and -O3 would be close to 10%.

    So two people now saying that all the many dozens of extra passes and
    extra analysis that gcc -O2/O3 has to do, compared with the basic
    front-end work that every toy compiler needs to do and does it
    quickly, only slows it down by 10%.

    I really don't believe it. And you should understand that it doesn't
    add up.


    That's not what people have said.

    They have said that /build/ times for /real/ projects, measured in real
    time, with optimisation disabled do not give a speedup which justifies turning off optimisation and losing the features you get with a strong optimising compiler.

    All the projects I tried except one give a typical speed-up of 2x or
    more at -O0.

    The exception was the SDL2 project (which now gives timings of 14 vs 17 seconds).


    No one denies that "gcc -O0" is faster than "gcc -O3" for individual compiles, and that the percentage difference will vary and sometimes be large.

    But that's not the point.  People who do C development for a living, do
    not measure the quality of their tools by the speed of compiling random
    junk they found on the internet to see which compiler saves them half a second.

Factors that are important for considering a compiler can include, in no particular order and not all relevant to all developers:

    * Does it support the target devices I need?
    * Does it support the languages and language standards I want?
    * Does it have the extensions I want to use?
    * How good are its error messages at leading me to problems in the code?
* How good are its static checks and warnings?
    * How efficient are the results?
    * Is it compatible with the libraries and SDK's I want to use?
    * Is it commonly used by others - colleagues, customers, suppliers?
    * Is it supported by the suppliers of my microcontrollers, OS, etc.?
    * Can I easily run it on multiple machines?
    * Can I back it up and run it on systems in the future?
    * Can I get hold of specific old versions of the tools?  Can I
    reasonably expect the tools to be available for a long time in the future?
    * What are the policies for bug reporting, and bug fixing in the toolchain?
    * How easy is it to examine the generated code?
    * Does it play well with my IDE, such as cross-navigating between
    compiler messages and source code?
    * Does it have any restrictions in its use?
    * How good is the documentation?
    * Does it have enough flexibility to tune it to my needs and preferences
    for source code checks and warnings?
    * Does it have enough flexibility to tune it to my code generation needs
    and preferences?
    * Can I re-use the same tool for multiple projects?
    * Can I use the same source and the same tool (or same family) on my
    targets and for simulation on PC's?
    * Is the tool mainstream and well-tested by users in practice?
    * Does the same tool, or family of tools, work for different targets?
    * Am I familiar with the tool, its idiosyncrasies, and its options?
    * Is it common enough that I can google for questions about it?
    * Does it generate the debugging information I need?  Does it play well
    with my debugger?
    * Is the price within budget?  Does it have locks, dongles, or other restrictions?  Is commercial support available if I need it?
    * Does it have run-time debugging tools such as sanitizers or optional
    range checks?
    * Does it run on the host systems I want to use?
    * Does it have (or integrate with) other tools such as profilers or code coverage tools?
    * What is the upgrade path and the expectation of future improvements in
    new versions?
    * Are there any legal requirements or ramifications from using the tool?
    * Is it fast enough that it is not annoying to use with normal options
    and common build automation tools, running on a host within reasonable budget?


    Notice how important raw compiler speed is in the grand scheme of things?

    Yes, gcc ticks all the boxes. Except the last. For me it would be like
    driving my car at walking pace all over town, even though most of my
    time would be spent at various stopping places.

You get around that by minimising your visits, taking short-cuts, using
    multiple cars each driven by different people to parallelise all the tasks.

    But the cheapest car on the market that can do 30mph would fix it more
    easily.

    You obviously have a very different view of what a compiler is for.

For me it's just a black box where you put source code in at one end, and
    get runnable code at the other, preferably instantly.

    And in the case of C code, I don't want it to depend on specific
    compilers. I used to test my code on 6 C compilers, now I limit it to three.

    I use gcc ONLY to get an extra boost in speed. Except sometimes I have
    to use it because some open source code only works with gcc.


    The importance of tools is how effective they are for your use as a /developer/.  Seconds saved on compile time are totally irrelevant
    compared to days, weeks, months saved by tools that help find or prevent errors, or that let you write better or clearer code.

    For better or clearer code, try a new language. For me the biggest
    problems of developing with C are the language itself. All the
    industrial scale tools in the world can't fix the language.



    I use gcc - specifically toolchains built and released by ARM - because
    that is the tool that I rate highest on these factors.  If there were a similar featured clang toolchain I'd look closely at that too.  And over
    the years I have used many toolchains for many targets, some costing
    multiple $K for the license.

    Of course everyone likes faster compiles, all other things being equal.
    But the other things are /not/ equal when comparing real-world
    development tools with the likes of tcc or your little compiler.  The
    idea that anyone should reasonably expect to get paid for wasting
    customer time and money with those is just laughable.

    That's a good point. How much money has been wasted in paying
    programmers by the hour to twiddle their thumbs while waiting for a rebuild?

Perhaps just once you can forget about analysing every obscure corner of
    the code to quickly try out the latest change to see if it fixes that
    problem.

    It's like being
hired to dig up a road and arriving with a kid's sand spade, then claiming
    it is better than a mechanical digger because it is smaller and lighter.

    Have you considered that you could have a product that works like gcc,
    in ticking most of those boxes, and yet could be nearly as fast as TCC?

    But nobody is interested in that; customers such as you are so inured to
    slow build-times, and have learnt to get around its sluggishness (and in
    your case even convinced yourself that it is really quite fast), so that
    there simply aren't enough people complaining about it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Mon Jun 24 17:02:33 2024
    bart <bc@freeuk.com> writes:
    On 24/06/2024 16:10, Michael S wrote:

    But my project has much more than 34 modules. 164 modules compiled
    during build + several dozens in libraries.

    Does that matter? My example is a smaller project, but I'm comparing the
    rate of compilation not total time.

    If you want a bigger example, yesterday I posted one involving 140 .c
    files, total EXE size is 5MB (don't know .text as this is ELF format).

    Why do you believe that the size of the executable is interesting?

Why do you think 5MB is unusual if you don't know anything
    about ELF?

    The 'size' command will tell you the text size, although the text size
    is a meaningless parameter in modern virtual memory systems which
    load pages on demand.

    $ file /tmp/a
    /tmp/a: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=12944219f87c74b03ec19d1d693771e12416dd81, not stripped
    $ size /tmp/a
    text data bss dec hex filename
    1257 548 4 1809 711 /tmp/a
    $ ls -l /tmp/a
    -rwxrwxr-x. 1 scott scott 8501 Jun 20 07:45 /tmp/a


    When the average disk holds 1 TB, 5MB is not even in the noise.


    -O2 took just over twice as long as -O0.

    But I guess every single example I come up with is 'exceptional'.

    By definition.

Maybe
what's exceptional is that I'm measuring the runtime of the compiler, and
not all sorts of other junk.

    You seem to be measuring the wall-clock time which is influenced by
    factors other than the size of the source file including other processes running during your compile.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to David Brown on Mon Jun 24 17:25:05 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 24/06/2024 18:19, bart wrote:

    For me it would be like
    driving my car at walking pace all over town, even though most of my
    time would be spent at various stopping places.

    You still don't understand. You are telling people how fantastically
    fast your car is, without realising it is just a remote-controlled toy
car. Nobody cares if your toy runs faster than a real tool - people will
still choose the real tool. And the real tools run fast enough for real
developers doing real work.

    Brings to mind the race scene in the film _Oceans 11_.

    https://www.youtube.com/watch?v=d3XwQcfZQtw

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to bart on Mon Jun 24 20:41:31 2024
    On Mon, 24 Jun 2024 17:51:25 +0100
    bart <bc@freeuk.com> wrote:

    On 24/06/2024 16:10, Michael S wrote:
    On Mon, 24 Jun 2024 15:00:26 +0100
    bart <bc@freeuk.com> wrote:


    Your processor has a CPU-mark double that of mine, which has only
    two cores, and is using one.

    Building a 34-module project with .text size of 300KB, with either
    gcc 10 or 14, using -O0, takes about 8 seconds, or 37KB/second.


    But my project has much more than 34 modules. 164 modules compiled
    during build + several dozens in libraries.

    Does that matter? My example is a smaller project, but I'm comparing
    the rate of compilation not total time.

    If you want a bigger example, yesterday I posted one involving 140 .c
    files, total EXE size is 5MB (don't know .text as this is ELF format).

    -O2 took just over twice as long as -O0.

    But I guess every single example I come up with is 'exceptional'.
Maybe what's exceptional is that I'm measuring the runtime of the
    compiler, and not all sorts of other junk.

You might claim that that other junk is necessary for the build;
    I'd dispute that.


Here is compile time only, serial. It is only a few seconds faster than a non-parallel make:
    -O0 0m21.761s
    -Os 0m24.227s
    -O1 0m23.384s
    -O2 0m24.601s
    -O3 0m25.256s
    O2/O0= 1.13

It could probably be up to 1.5x faster if I called the compiler with more
than one source file at a time, but then it would be too different from
our normal workflow.

Here are the 3 biggest modules (option / time (seconds) / text size):

module A
    -O0 0.250 58444
    -Os 0.530 29760
    -O1 0.390 37368
    -O2 0.530 34128
    -O3 0.671 50104

    module B
    -O0 0.250 32680
    -Os 0.343 25840
    -O1 0.296 26452
    -O2 0.343 26992
    -O3 0.343 27672

    Module C
    -O0 0.296 37648
    -Os 0.437 25976
    -O1 0.390 28168
    -O2 0.468 27784
    -O3 0.515 32556

The biggest module (module A), absolutely not co-incidentally, was not
written by us. It was purchased from Jean Labrosse, who is a smart guy and
a good businessman, but whose ideas of software engineering are best
described as eccentric.
The other big modules are dispatchers: they receive messages from the
command line/telnet (module B) and HTTP (module C) and either handle simple
cases by themselves or, in the more typical and more complex cases, call
outside modules to do the job. So while they are big, they consist of a
large number of independent pieces. As you can see, this sort of bigness
does not produce particularly big -O2/-O0 ratios.

    Your figures show about 50KB/second.

text KB/second is hardly a good measure, esp. considering that we
are talking about different architectures. Mine is Altera Nios2, a
32-bit RISC processor very similar to MIPS. The code density for
this architecture is significantly lower than on ARMv7 or i386, and
even somewhat lower than x86-64 and ARM64. The exact ratio depends
on the project, but 15-20% would be typical.
Also, part of the text is libraries that were not compiled during this
build. But I would think that your .text size also includes
libraries.

    So, what is the total size of the code that is produced by the
    compilation?

I don't know. Probably 70-80% of the total text.

    What is the data size?


Initialized data and read-only data? Not big: 23 KB. bss is
significantly bigger (2.8 MB), but I don't see how the size of bss is
possibly relevant.


    I really don't believe it. And you should understand that it
    doesn't add up.


    I am not lying.

    I'm not saying that. I'm disputing that -O2 adds only 10% compared to
    -O0 when running only gcc.

    I am pretty sure that DavidB also tells truth.
I recommend trying to compile your C compiler with gcc.

    My C compiler is not written in C. I can transpile to C, but then it
    would be a single large C file, which puts pressure on the optimiser:


One big file is quite obviously a special case.

    C:\cx>mc -c cc # transpile to C
    Compiling cc.m to cc.c

    C:\cx>tm gcc cc.c # -O0 build
    TM: 1.92

    C:\cx>tm gcc cc.c -O2
    TM: 9.22

    It takes 380% longer compared with -O0. However the advantage is that
    I now have a whole-program optimised application. But this is not
    something I need to do routinely (maybe on production versions or
    running benchmarks).

    Most of the time I don't need the optimised code (which is only about
    40% faster) and can build like this:

    C:\cx>tm mm cc
    Compiling cc.m to cc.exe
    TM: 0.06

(If mm.exe is similarly boosted via gcc; otherwise it takes 0.08s.)

So optimisation for this product takes 150 times, or 15,000%, longer.

    DB was however saying that he normally has optimisation turned on.
    Well I'm not surprised if turning it off makes only 11% difference!
    But he is not interested in super-fast builds such as those I work on.

    Note that that 0.06 figure is for rebuilding my C compiler from
scratch (200KB .text size, 300KB EXE size), and 1/3 of it is Windows
    process overheads. Accurately measuring build-times when timings are
    near zero is difficult.



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Scott Lurndal on Mon Jun 24 18:50:40 2024
    On 24/06/2024 18:02, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 16:10, Michael S wrote:

    But my project has much more than 34 modules. 164 modules compiled
    during build + several dozens in libraries.

    Does that matter? My example is a smaller project, but I'm comparing the
    rate of compilation not total time.

    If you want a bigger example, yesterday I posted one involving 140 .c
    files, total EXE size is 5MB (don't know .text as this is ELF format).

    Why do you believe that the size of the executable is interesting?

    Well, what metric IS interesting?

    You seem to not care whether an executable is 10KB, 10MB, or 10GB. You
    really don't think there's correspondence with build-time?



Why do you think 5MB is unusual
    MS claimed that my project was smaller than his. So I found a bigger one.

    if you don't know anything
    about ELF?

What's that got to do with the price of fish? ELF is another
    executable format, but my tools can't look inside it.


    The 'size' command will tell you the text size, although the text size
    is a meaningless parameter in modern virtual memory systems which
    load pages on demand.

    The .text was also something introduced by MS.


    $ file /tmp/a
    /tmp/a: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=12944219f87c74b03ec19d1d693771e12416dd81, not stripped
    $ size /tmp/a
    text data bss dec hex filename
    1257 548 4 1809 711 /tmp/a
    $ ls -l /tmp/a
    -rwxrwxr-x. 1 scott scott 8501 Jun 20 07:45 /tmp/a


    When the average disk holds 1 TB, 5MB is not even in the noise.

    And what does /that/ have to do with anything?

    One metric I find useful with my compiler work is how many MB per second
    they can produce from direct compilation. And 5MB per second is faster
    than 50KB per second.

    It took gcc 17-38 seconds to build that 5MB product in WSL; that's
    pretty slow.

If you want to ignore that simple fact then <shrug>. The common themes
in this newsgroup are, first, to defend the design of the C language, and
second, to defend the behaviour and performance of gcc.

    Maybe more people should take an interest.

    It would be funny if gcc suddenly doubled in speed overnight because of
    some stupid oversight that nobody had bothered to investigate.

    Because lines/second or MB/second or anything else is 'uninteresting'.

    BTW how slow does a tool have to get before YOU start asking questions?


    -O2 took just over twice as long as -O0.

    But I guess every single example I come up with is 'exceptional'.

    By definition.

    Maybe
what's exceptional is that I'm measuring the runtime of the compiler, and
    not all sorts of other junk.

    You seem to be measuring the wall-clock time which is influenced by
    factors other than the size of the source file including other processes running during your compile.


    Yes, it is during that same time period that people waiting for it to
    complete have to sit twiddling their thumbs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Mon Jun 24 18:10:20 2024
    bart <bc@freeuk.com> writes:
    On 24/06/2024 18:02, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 16:10, Michael S wrote:

    But my project has much more than 34 modules. 164 modules compiled
    during build + several dozens in libraries.

Does that matter? My example is a smaller project, but I'm comparing the
rate of compilation not total time.

    If you want a bigger example, yesterday I posted one involving 140 .c
    files, total EXE size is 5MB (don't know .text as this is ELF format).

    Why do you believe that the size of the executable is interesting?

    Well, what metric IS interesting?

    You seem to not care whether an executable is 10KB, 10MB, or 10GB. You
    really don't think there's correspondence with build-time?

    No. The ELF file contains a lot of stuff that never gets
    loaded into memory (symbol tables, DWARF section data, etc);
    writing to the object files by the compiler is an insignificant
    component of the overall compile time.

    Build time is not related in any way to the size of the
    ELF.

    Disk space is cheap and plentiful.



    The .text was also something introduced by MS.

    I don't understand this comment. .text predates any MS
    compiler by more than a decade.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Mon Jun 24 19:15:32 2024
    On 24/06/2024 18:19, bart wrote:
    On 24/06/2024 16:09, David Brown wrote:
    On 24/06/2024 16:00, bart wrote:

However, is there any way of isolating the compilation time (turning
.c files into .o files) from 'make' and the linker?

    Why would anyone want to do that?  At times, it can be useful to do
    partial builds, but compilation alone is not particularly useful.

    It is useful when you are trying to establish the true cost of -O2
    compared to -O0.

    And the relevance of that is... what? Absolutely nothing. We've
    already established that for some files, the difference is below the measurement threshold and for some (such as Scott's C++ example) it is
    massive.

    But just to please you, and not because it has any bearing on reality, I isolated a couple of C files from my code, removed the stuff that is
    specific to the target, and compiled with native gcc -O0 and -O2.

    For one file (the largest C file in the project), "gcc -O0" took 0.085
    seconds. "gcc -O2" took 0.134 seconds. Yes, optimising was slow - but
    the time was negligible. For the other file, both compiles were 0.025
    seconds - too fast to be distinguishable from noise.

    What have we learned from this experiment? We have learned SFA about compilation speed - the only thing you, perhaps, have learned is that I
    know what I am talking about when I say such measurements are pointless.

    If including all sorts of extras (why not the time it
    takes to DHL the resulting floppy to a client!) then any figures will be misleading.

    Anyone reading this thread could well end up believing that applying -O2
    to a compilation only makes gcc 10-15% slower.


    If they learn that it is not worth fussing about compilation speed and
    they should use the options that give the results they want -
    optimisation, warnings, etc., as desired - then they have learned
    something useful. If they think they have learned about any particular percentage figure, then they have misunderstood.


    Failing that, can you compile just one module in isolation (.c to .o)
    with -O0 and -O2, or is that not possible?

    Those throughputs don't look that impressive for a parallel build on
    what sounds like a high-spec machine.

    How can you possibly judge that when you have no idea how big the
    project is?

    Because the 'text' (code) size was provided?

    Why do you think that is an appropriate way to guess the size of the
    project? Michael does embedded development - that's likely to be a
    pretty big project with a lot of files. This is not PC compilation
    where a "Hello, world" file can reach that size if statically linked.



If I had "native" tools then all times would likely be shorter by a
few seconds and the difference between -O0 and -O3 would be close to
10%.

So two people are now saying that all the many dozens of extra passes
and extra analysis that gcc -O2/O3 has to do, compared with the basic
front-end work that every toy compiler needs to do, and does
quickly, only slow it down by 10%.

    I really don't believe it. And you should understand that it doesn't
    add up.


    That's not what people have said.

    They have said that /build/ times for /real/ projects, measured in
    real time, with optimisation disabled do not give a speedup which
    justifies turning off optimisation and losing the features you get
    with a strong optimising compiler.

    All the projects I tried except one give a typical speed-up of 2x or
    more at -O0.

    The exception was the SDL2 project (which now gives timings of 14 vs 17 seconds).


    No one cares about your figures. No one, except you, cares about /my/
    figures. Sometimes people care about the build speed of /their/ code,
    using /their/ choice of compiler and /their/ choice of options on
    /their/ computers. Do you /really/ not understand that the timings you
    get are utterly pointless to everyone else?


    No one denies that "gcc -O0" is faster than "gcc -O3" for individual
    compiles, and that the percentage difference will vary and sometimes
    be large.

    But that's not the point.  People who do C development for a living,
    do not measure the quality of their tools by the speed of compiling
    random junk they found on the internet to see which compiler saves
    them half a second.

    Factors that are important for considering a compiler can include, in
no particular order and not all relevant to all developers:

    * Does it support the target devices I need?
    * Does it support the languages and language standards I want?
    * Does it have the extensions I want to use?
    * How good are its error messages at leading me to problems in the code?
* How good are its static checks and warnings?
    * How efficient are the results?
    * Is it compatible with the libraries and SDK's I want to use?
    * Is it commonly used by others - colleagues, customers, suppliers?
    * Is it supported by the suppliers of my microcontrollers, OS, etc.?
    * Can I easily run it on multiple machines?
    * Can I back it up and run it on systems in the future?
    * Can I get hold of specific old versions of the tools?  Can I
    reasonably expect the tools to be available for a long time in the
    future?
    * What are the policies for bug reporting, and bug fixing in the
    toolchain?
    * How easy is it to examine the generated code?
    * Does it play well with my IDE, such as cross-navigating between
    compiler messages and source code?
    * Does it have any restrictions in its use?
    * How good is the documentation?
    * Does it have enough flexibility to tune it to my needs and
    preferences for source code checks and warnings?
    * Does it have enough flexibility to tune it to my code generation
    needs and preferences?
    * Can I re-use the same tool for multiple projects?
    * Can I use the same source and the same tool (or same family) on my
    targets and for simulation on PC's?
    * Is the tool mainstream and well-tested by users in practice?
    * Does the same tool, or family of tools, work for different targets?
    * Am I familiar with the tool, its idiosyncrasies, and its options?
    * Is it common enough that I can google for questions about it?
    * Does it generate the debugging information I need?  Does it play
    well with my debugger?
    * Is the price within budget?  Does it have locks, dongles, or other
    restrictions?  Is commercial support available if I need it?
    * Does it have run-time debugging tools such as sanitizers or optional
    range checks?
    * Does it run on the host systems I want to use?
    * Does it have (or integrate with) other tools such as profilers or
    code coverage tools?
    * What is the upgrade path and the expectation of future improvements
    in new versions?
    * Are there any legal requirements or ramifications from using the tool?
    * Is it fast enough that it is not annoying to use with normal options
    and common build automation tools, running on a host within reasonable
    budget?


    Notice how important raw compiler speed is in the grand scheme of things?

    Yes, gcc ticks all the boxes. Except the last.

    No, it does not tick all the boxes. The toolchains I use tick most of
    them (including all the ones that I see as hard requirements), and do
    better than any alternatives, but they are not perfect. They do,
    however, happily pass the last one. I have yet to find a C compiler
    that was not fast enough for my needs.

    For me it would be like
    driving my car at walking pace all over town, even though most of my
    time would be spent at various stopping places.

    You still don't understand. You are telling people how fantastically
    fast your car is, without realising it is just a remote-controlled toy
car. Nobody cares if your toy runs faster than a real tool - people will
    still choose the real tool. And the real tools run fast enough for real developers doing real work.


You get around that by minimising your visits, taking short-cuts, using
    multiple cars each driven by different people to parallelise all the tasks.

    But the cheapest car on the market that can do 30mph would fix it more easily.

    You obviously have a very different view of what a compiler is for.

    Yes.


    The importance of tools is how effective they are for your use as a
    /developer/.  Seconds saved on compile time are totally irrelevant
    compared to days, weeks, months saved by tools that help find or
    prevent errors, or that let you write better or clearer code.

    For better or clearer code, try a new language. For me the biggest
    problems of developing with C are the language itself. All the
    industrial scale tools in the world can't fix the language.


    I often work with C++ rather than C. But I know without any doubt at
    all, that I write better and clearer C code than you do. C is far from
    a "perfect" language, but it is a lot better if you don't cripple it
    with determined ignorance, absurd self-imposed limitations and crappy tools.



    I use gcc - specifically toolchains built and released by ARM -
    because that is the tool that I rate highest on these factors.  If
    there were a similar featured clang toolchain I'd look closely at that
    too.  And over the years I have used many toolchains for many targets,
    some costing multiple $K for the license.

    Of course everyone likes faster compiles, all other things being
    equal. But the other things are /not/ equal when comparing real-world
    development tools with the likes of tcc or your little compiler.  The
    idea that anyone should reasonably expect to get paid for wasting
    customer time and money with those is just laughable.

    That's a good point. How much money has been wasted in paying
    programmers by the hour to twiddle their thumbs while waiting for a
    rebuild?

    None that I know of. Your worries about compiler speed are imaginary or self-imposed.


Perhaps just once you can forget about analysing every obscure corner of
    the code to quickly try out the latest change to see if it fixes that problem.

    It's like being
hired to dig up a road and arriving with a kid's sand spade, then
    claiming it is better than a mechanical digger because it is smaller
    and lighter.

    Have you considered that you could have a product that works like gcc,
    in ticking most of those boxes, and yet could be nearly as fast as TCC?


    Yes. There are two issues with that. First, a compiler as fast as tcc
    would make no practical difference at all to my development process.
    gcc is more than fast enough for my needs. Secondly, there is no such
    tool, never has been, and I can confidently say, never will be. I want
    far more from my tools than tcc can offer, and that takes more time.

    If gcc was ten times slower than it is, it might get annoying sometimes,
    and I'd then get a faster computer.


    But nobody is interested in that; customers such as you are so inured to
    slow build-times, and have learnt to get around its sluggishness (and in
    your case even convinced yourself that it is really quite fast), so that there simply aren't enough people complaining about it.


    You /do/ realise that the only person that "suffers" from slow gcc times
    is /you/ ?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Scott Lurndal on Mon Jun 24 20:33:05 2024
    On 24/06/2024 19:10, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 18:02, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 16:10, Michael S wrote:

    But my project has much more than 34 modules. 164 modules compiled
    during build + several dozens in libraries.

    Does that matter? My example is a smaller project, but I'm comparing the >>>> rate of compilation not total time.

    If you want a bigger example, yesterday I posted one involving 140 .c
    files, total EXE size is 5MB (don't know .text as this is ELF format).

    Why do you believe that the size of the executable is interesting?

    Well, what metric IS interesting?

    You seem to not care whether an executable is 10KB, 10MB, or 10GB. You
    really don't think there's correspondence with build-time?

    No. The ELF file contains a lot of stuff that never gets
    loaded into memory (symbol tables, DWARF section data, etc);
    writing to the object files by the compiler is an insignificant
    component of the overall compile time.

    By ELF I mean executable binary not object files. And I never work with executable files with debug info.

    So the size of an executable file will mainly be the size of the code
    segment, plus the size of the initialised data. Anything else, import
    tables etc, is a smaller overhead.

    For ones generated by compilers for lower level languages for x64, a
    rule of thumb is that every 10 bytes corresponds very roughly (depending
    on the amount of init data and how it is presented in source code) to
    one line of source code.

    You can tweak that, eg. look only at .text size, and estimate the size
    of preprocessed source code, or ignore declarations etc.

    Build time is not related in any way to the size of the
    ELF.

    Yes it is. It is related to the amount of code in the program, and the
    amount of initialised data.

    Now I guess you're going bring up examples where a tiny ELF file takes
    ages to build, and a huge one is instant.

    And I can probably come up with scenarios where it's quicker for me to
    walk or cycle from A to B, than it takes to drive with a 200mph supercar.

    But generally driving is faster than walking or cycling.

  • From Scott Lurndal@21:1/5 to bart on Mon Jun 24 21:34:47 2024
    bart <bc@freeuk.com> writes:
    On 24/06/2024 18:15, David Brown wrote:


    On my Windows machine, gcc -O0 took 0.20 seconds to build the 4-line
    hello.c. As measured with clock() when executing 'system("gcc hello.c")'.

    On WSL using 'time', it took 0.146 seconds 'real' time, 0.007 seconds
    'user' time, and 0.051 seconds 'sys' time.

    I'm not sure what these mean, or which one, or which combination, you
    used to present your figures.

    This means that 0.146 minus 0.058 seconds were spent waiting for I/O.

    The actual compiler CPU time was .007 seconds; the rest was
    I/O.

  • From bart@21:1/5 to David Brown on Mon Jun 24 22:15:53 2024
    On 24/06/2024 18:15, David Brown wrote:
    On 24/06/2024 18:19, bart wrote:
    On 24/06/2024 16:09, David Brown wrote:
    On 24/06/2024 16:00, bart wrote:

    However, is there any way of isolating the compilation time (turning
    .c files into either or .o files) from 'make' the linker?

    Why would anyone want to do that?  At times, it can be useful to do
    partial builds, but compilation alone is not particularly useful.

    It is useful when you are trying to establish the true cost of -O2
    compared to -O0.

    And the relevance of that is... what?  Absolutely nothing.  We've
    already established that for some files, the difference is below the measurement threshold and for some (such as Scott's C++ example) it is massive.

    But just to please you, and not because it has any bearing on reality, I isolated a couple of C files from my code, removed the stuff that is
    specific to the target, and compiled with native gcc -O0 and -O2.

    For one file (the largest C file in the project), "gcc -O0" took 0.085 seconds.  "gcc -O2" took 0.134 seconds.  Yes, optimising was slow - but
    the time was negligible.  For the other file, both compiles were 0.025 seconds - too fast to be distinguishable from noise.

    On my Windows machine, gcc -O0 took 0.20 seconds to build the 4-line
    hello.c. As measured with clock() when executing 'system("gcc hello.c")'.

    On WSL using 'time', it took 0.146 seconds 'real' time, 0.007 seconds
    'user' time, and 0.051 seconds 'sys' time.

    I'm not sure what these mean, or which one, or which combination, you
    used to present your figures.

    However, if someone is waiting for it to finish, it is the elapsed time
    that is relevant, since that is how long they have to wait!

    Why do you think that is an appropriate way to guess the size of the project?  Michael does embedded development - that's likely to be a
    pretty big project with a lot of files.  This is not PC compilation
    where a "Hello, world" file can reach that size if statically linked.


    No one cares about your figures.  No one, except you, cares about /my/ figures.  Sometimes people care about the build speed of /their/ code,
    using /their/ choice of compiler and /their/ choice of options on
    /their/ computers.  Do you /really/ not understand that the timings you
    get are utterly pointless to everyone else?


    Obviously /you/ don't care about fast build systems. It's perfectly
    alright for 90% of the time to build a project to be spent executing
    auto-conf scripts.

    Some people also cared enough about linkers to develop a new generation
    of linker (maybe 'gold', maybe something even newer) that is supposed to
    be 5 times the speed of 'ld'.

    Who told them that linking was a bottleneck? Obviously not you! (And not
    me either; linking is just a nuisance, but I hadn't noticed it being a
    drag on the stuff I did.)



    No one denies that "gcc -O0" is faster than "gcc -O3" for individual
    compiles, and that the percentage difference will vary and sometimes
    be large.

    Yes, gcc ticks all the boxes. Except the last.

    No, it does not tick all the boxes.  The toolchains I use tick most of
    them (including all the ones that I see as hard requirements), and do
    better than any alternatives, but they are not perfect.  They do,
    however, happily pass the last one.  I have yet to find a C compiler
    that was not fast enough for my needs.

    For me it would be like driving my car at walking pace all over town,
    even though most of my time would be spent at various stopping places.

    You still don't understand.  You are telling people how fantastically
    fast your car is, without realising it is just a remote-controlled toy
    car.  Nobody cares if your toy runs faster than a real tool - people will still choose the real tool.  And the real tools run fast enough for real developers doing real work.

    You're wrong. My 'car' would do the equivalent job of driving around
    town. Unless someone /wanted/ a vehicle that was more like a 40-tonne truck.

    Let's go back a few weeks to when the topic was translating millions of
    lines like this:

    unsigned char data[N] = {
    1,
    2,
    ....
    };

    This is not very demanding of a compiler; there is little static
    analysis to do, no advanced code generation, not even a way of
    minimising size: it has to occupy at least N bytes in the output.

    So would you really demand gcc be used here, because an object file
    produced by any lesser compiler wouldn't be up to scratch? gcc will of
    course take 20 times longer to do the job. (And 200 times longer than my language takes to directly embed the original binary.)

    Would you allow the use of a dedicated utility program which turns such
    text into a binary file? It would be rather churlish to say yes to this,
    and no to the use of Tiny C which could probably do the job too.

    How about assemblers, which aren't too challenging either; would you be snobbish and patronising about those too?

    I see basic compilation, naively translating source code to native code,
    as an equally low level, mechanical task. It should always be available
    as something to fall back on.

    I use gcc - specifically toolchains built and released by ARM -
    because that is the tool that I rate highest on these factors.  If
    there were a similar featured clang toolchain I'd look closely at
    that too.  And over the years I have used many toolchains for many
    targets, some costing multiple $K for the license.

    Of course everyone likes faster compiles, all other things being
    equal. But the other things are /not/ equal when comparing real-world
    development tools with the likes of tcc or your little compiler.  The
    idea that anyone should reasonably expect to get paid for wasting
    customer time and money with those is just laughable.

    That's a good point. How much money has been wasted in paying
    programmers by the hour to twiddle their thumbs while waiting for a
    rebuild?

    None that I know of.  Your worries about compiler speed are imaginary or self-imposed.

    So cartoons like https://xkcd.com/303/ have no basis in fact? It's just
    made up?

    You /do/ realise that the only person that "suffers" from slow gcc times
    is /you/ ?

    Forums abound with horror stories. Here are quotes from just one thread:

    ------------------
    Well a build used to take 6 or 7 minutes, and that's a long time for my
    little attention span. I'd always get distracted waiting for builds and
    waste even more time.

    In short, if a developer is waiting on a build to run for one hour and
    doing nothing in that timeframe, the business is still spending $75 on
    average for that developer’s time—and potentially losing out on time
    that developer could be focusing on building more code.

    I worked on a system where only linking the binary took 40-50 minutes.
    For some technical reasons there was no dynamic linking - only static -
    so you had to go through that delay for the slightest change.

    This is why my computer and build server have an 11900k. Builds went
    from 45 minutes to 15.

    This is the reason I stopped being a Java developer and embraced JS.
    Even the 1-3 minutes of build time was a big hit for me because it was
    just enough time for me to get distracted by something else and then the
    next thing you know, you have wasted 30 mins.
    ------------------

    Maybe you should tell these guys how it's done!

    That 1-3 minutes was the fastest build mentioned in these quotes, and
    that would absolutely kill me. Just 1-3 seconds feels like forever.

    So what is taking the time? Is it very large programs? This is where I'd
    be interested in figures like MB of output per second.

    If they're already doing 1-10MB/sec and it takes even one minute (so
    60-600MB of output) then that's reasonable. The task is just huge, and
    they need to look at other approaches. But if they're only managing 10-100KB/sec, then those tools need attention.

    The problem I have is that I've always worked in isolation and with my
    own tools. I never discovered fast or slow build processes were for
    anyone else. I just kept my own tools fast enough to suit me, even on
    1980s hardware.

    It turned out not only that they were very fast, but also toys!

    Yes.  There are two issues with that.  First, a compiler as fast as tcc would make no practical difference at all to my development process. gcc
    is more than fast enough for my needs.  Secondly, there is no such tool, never has been, and I can confidently say, never will be.  I want far
    more from my tools than tcc can offer, and that takes more time.

    If gcc was ten times slower than it is, it might get annoying sometimes,
    and I'd then get a faster computer.

    The wrong approach.

  • From Scott Lurndal@21:1/5 to Scott Lurndal on Mon Jun 24 22:03:54 2024
    scott@slp53.sl.home (Scott Lurndal) writes:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 18:15, David Brown wrote:


    On my Windows machine, gcc -O0 took 0.20 seconds to build the 4-line hello.c. As measured with clock() when executing 'system("gcc hello.c")'.

    On WSL using 'time', it took 0.146 seconds 'real' time, 0.007 seconds 'user' time, and 0.051 seconds 'sys' time.

    I'm not sure what these mean, or which one, or which combination, you
    used to present your figures.

    This means that 0.146 minus 0.058 seconds were spent waiting for I/O.

    The actual compiler CPU time was .007 seconds; the rest was
    I/O.

    Here's a real world example for you.

    $ time mr -s -j32
    real 6m39.96s
    user 1h25m54.07s
    sys 3m40.61s

    Elapsed time slightly under 7 minutes.

    CPU time, 1 hour 26 minutes.

    (mr is a shell function that invokes gnu make with certain pre-set
    make variables including specifying -O3 when the compiler is
    invoked.)

    -j32 tells make to issue up to 32 jobs in parallel.

  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Tue Jun 25 01:35:36 2024
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    [on the requirements for qsort]

    I certainly would favor improved wording that made this clearer.
    In fact, simply explicitly mandating total ordering rather than
    making a vague comment about consistency would probably be the
    best approach.

    Clearly the C standard intends to impose a weaker requirement
    than that the comparison function be a total ordering.

    The plot thickens. Unless, of course, you are referring to the
    distinction you drew before between an ordering of all possible objects
    and only those in the array.

    Consider the following situation.

    We have an array with seven elements, the integers 1 to 7,
    in that order. We call qsort on the array, with a natural
    comparison function that compares the integer values.

    The qsort function starts with a check, and for any array
    with eight elements or fewer a simple insertion sort is
    done. Because 1 is less than 2, these elements stay
    where they are. Because 2 is less than 3, there is only
    the one comparison, and 3 stays where it is. And so on...
    at each point in the sort an element is compared to the
    one before it, and nothing changes. Six compares are
    done to sort seven elements. Question: has the program
    encountered any undefined behavior? (I expect you will
    say no.)

    Now consider a second situation.

    We again have an array with seven elements, the integers 1
    to 7, but not necessarily in order. We call the same
    qsort function. This time though the argument for the
    comparison function is for a function that just always
    returns -1. The same sequence of events takes place as
    did in the first situation: each element after the first
    is compared to the one before it, and because the previous
    element is deemed "less than" this element no movement
    occurs and we proceed to the next element of the array.
    Six compares are done to "sort" seven elements. Question:
    has the program encountered any undefined behavior?

    If there has been undefined behavior, which passages in
    the C standard explains the difference relative to the
    first situation?

    If there has not been undefined behavior, what does that
    say about what the requirements are for a call to qsort?

    So you are pointing out that only the comparisons made have to be
    "consistent with one another"? BTW, your function that returns -1 is
    just the total extension of my partial "dog order" function.

    --
    Ben.

  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Tue Jun 25 01:30:24 2024
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]

    On a C language point, I don't think the standard says anything
    about sorting with non-order functions like the one above. Is
    an implementation of qsort permitted to misbehave (for example
    by not terminating) when the comparison function does not
    implement a proper order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective of
    their current positions in the array) are passed more than once
    to the comparison function, the results shall be consistent with
    one another. That is, for qsort they shall define a total
    ordering on the array, and for bsearch the same object shall
    always compare the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it results in
    undefined behavior.

    I think it should be clearer. What the "that is" phrase seems to
    clarify in no way implies a total order, merely that the repeated
    comparisons of the same elements are consistent with one another.
    That the comparison function defines a total order on the elements
    is, to me, a major extra constraint that should not be written as
    an apparent clarification to something that does not imply it:
    repeated calls should be consistent with one another and, in
    addition, a total order should be imposed on the elements present.

    I think you're misreading the first sentence.

    Let's hope so. That's why I said it should be clearer, not that it
    was wrong.

    Suppose we are in
    court listening to an ongoing murder trial. Witness one comes in
    and testifies that Alice left the house before Bob. Witness two
    comes in (after witness one has gone) and testifies that Bob left
    the house before Cathy. Witness three comes in (after the first
    two have gone) and testifies that Cathy left the house before
    Alice. None of the witnesses have contradicted either of the
    other witnesses, but the testimonies of the three witnesses are
    not consistent with one another.

    My (apparently incorrect) reading of the first sentence is that
    the consistency is only required between the results of multiple
    calls between each pair. In other words, if the witnesses are
    repeatedly asked, again and again, if Alice left before Bob and/or
    if Bob left before Alice the results would always be consistent
    (with, of course, the same required of repeatedly asking about the
    other pairs of people).

    Let me paraphrase that. When the same pair of objects is passed
    more than once to individual calls of the comparison function, the
    results of those different calls shall each be consistent with
    every other one of the results.

    No, only with the results of the other calls that get passed the same
    pair. [...]

    Sorry, my oversight. That's is what I meant. "When the same pair
    of objects is passed more than once to individual calls of the
    comparison function, the results of those different calls shall
    each be consistent with every other one of THOSE results." The
    consistency is meant to be only between results of comparisons
    of the same pair. (This mistake illustrates how hard it is to
    write good specifications in the C standard.)

    To paraphrase my reading, when some set of "same" objects is each
    passed more than once to individual calls of the comparison
    function, the results of all of those calls taken together shall
    not imply an ordering contradiction.

    Are the last two paragraphs fair restatements of our respective
    readings?

    I don't think so. The first does not seem to be what I meant, and the
    second begs a question: what is an ordering contradiction?

    A conclusion that violates the usual mathematical rules of the
    relations less than, equal to, greater than: A<B and B<C implies
    A<C, A<B implies A!=B, A=B implies not A<B, A<B implies B>A, etc.

    Maybe I could work out what you mean by that if I thought about it
    some more, but this discussion has reminded me why I swore not to
    discuss wording and interpretation on Usenet. You found the wording
    adequate. I didn't. I won't mind if no one ever knows exactly why
    I didn't. C has managed fine with this wording for decades so there
    is no practical problem. I think enough time has been spent on this
    discussion already, but I can sense more is likely to spent.

    A small correction: I found the wording understandable. If the
    question is about adequacy, I certainly can't give the current
    wording 10 out of 10. I would like to see the specification for
    qsort stated more plainly. Although, as you can see, I'm having
    trouble figuring out how to do that.

    Is the second paragraph plain enough so that you
    would not misconstrue it if read in isolation? Or if not, can
    you suggest a better phrasing?

    Since I don't know what an ordering contradiction is, I can't suggest
    an alternative.

    Now that I have explained that phrase, I hope you will have a go
    at finding a better wording.

    I would not introduce your new term, "an ordering contradiction", since
    it still leaves exactly what kind of order vague. You interpret
    "consistent" as "consistent with a total order" so I'd use that phrase:

    "when some set of 'same' objects is each passed more than once to
    individual calls of the comparison function, the results of all of
    those calls taken together shall be consistent with a total order"

    Presumably you came to interpret "consistent with one another" as
    implying a total order rather because of the sentence that follows
    ("That is, for qsort they shall define a total ordering on the array").

    I could not do that because I was interpreting the text about multiple
    calls differently.

    ... The important point is the "consistent with" is something of an
    idiomatic phrase, and it doesn't mean "equivalent to" or "the same
    as". Maybe you already knew that, but I didn't, and learning it
    helped me see what the quoted passage is getting at.
    ...
    If you care to be less cryptic, maybe you will say what it was
    about the meaning of "consistent with" that helped you see what
    the text in question was getting at.

    I think the key thing is that "consistent with" doesn't mean the
    same. If we're comparing the same pair of objects over and over,
    the results are either the same or they are different. It would
    be odd to use "consistent with one another" if all that mattered
    is whether they are all the same.

    I never thought they were the same. The trouble is that (a) different
    results imply the same order (e.g. -1 and -34 all mean <) and (b) the
    (old) wording does not say that the objects are passed in the same order
    and the result of cmp(a, b) can't be the same as cmp(b, a) but they can
    be consistent. This makes "consistent with one another" a perfectly
    reasonable thing to say even in my limited view of what results are
    being talked about.

    ...
    I have a second objection that promoted that remark. If I take the
    (apparently) intended meaning of the first sentence, I think that
    "consistent" is too weak to imply even a partial order. In dog club
    tonight, because of how they get on, I will ensure that Enzo is
    walking behind George, that George is walking behind Benji, Benji
    behind Gibson, Gibson behind Pepper and Pepper behind Enzo. In what
    sense is this "ordering" not consistent? All the calls to the
    comparison function are consistent with each other.

    I understand the objection, and this is the point I was trying to
    make in the paragraph about children in the Jones family. The
    phrase "one another" in "the results shall be consistent with one
    another" is meant to be read as saying "all the results taken
    together". It is not enough that results not be contradictory taken
    two at a time; considering all the results at once must not lead to
    an ordering contradiction.
    ...
    All the results of the dog-order comparison function, taken together,
    are consistent with the circular order, which is obviously not a total
    order.

    If A<B, B<C, C<D, D<E, and E<A, we can infer from the transitivity
    of the "less than" relation that A<A. But A<A can never be true, so
    this set of comparison results is no good.

    [Technical aside. The relation should be seen as <=, not <. You can't conclude that I intended A < A from the informal presentation -- no dog
    can be behind itself. However, this does not alter your argument in any significant way.]

    So I guess what we have
    discovered is that "consistent with one another" is intended to mean
    "obeys the usual mathematical rules for ordering relations".

    I would say this is backwards. You are assuming the usual rules where I
    gave an order that is not at all usual with the purpose of showing that
    some sets of comparisons between pairs can be "consistent with one
    another" when the ordering is very peculiar.

    On a more mathematical note, imagine that the text was describing a
    topological sort function. Is there anything in your reading of the
    first sentence that would make it inappropriate? If not, that
    "consistent with one another" can't imply a total order.

    ...
    It occurs to me now to say that "consistent with" is meant to
    include logical inference.

    Sure.

    That distinction is a key difference
    between "consistent" and "consistent with" (at least as the two
    terms might be understood). The combination of: one, the results
    of the comparison function are seen as corresponding to an ordering
    relation;

    But, according to you, only some ordering relations.

    and two, that "consistent with one another" includes
    logical inferences considering all of the results together; is what
    allows us to conclude that the results define a total order.

    Could the sentence in question be used in the description of a
    topological sort based (rather unusually) on a partial order?

    I'm sorry if any of the above sounds like it's just stating the
    obvious. I'm struggling to find a way to explain what to
    me seems straightforward.

    --
    Ben.

  • From Michael S@21:1/5 to Scott Lurndal on Tue Jun 25 11:36:16 2024
    On Mon, 24 Jun 2024 18:10:20 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:

    bart <bc@freeuk.com> writes:
    On 24/06/2024 18:02, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 24/06/2024 16:10, Michael S wrote:

    But my project has much more than 34 modules. 164 modules
    compiled during build + several dozens in libraries.

    Does that matter? My example is a smaller project, but I'm
    comparing the rate of compilation not total time.

    If you want a bigger example, yesterday I posted one involving
    140 .c files, total EXE size is 5MB (don't know .text as this is
    ELF format).

    Why do you believe that the size of the executable is interesting?


    Well, what metric IS interesting?

    You seem to not care whether an executable is 10KB, 10MB, or 10GB.
    You really don't think there's correspondence with build-time?

    No. The ELF file contains a lot of stuff that never gets
    loaded into memory (symbol tables, DWARF section data, etc);
    writing to the object files by the compiler is an insignificant
    component of the overall compile time.

    Build time is not related in any way to the size of the
    ELF.


    That's why in my original post in this sub-thread, in order to give
    a feeling of the size of compiler's job I gave the size of text
    segment rather than size of the elf.
    The size of text segment is, of course, not a good measure of
    compiler's job, esp. when we are trying to compare compile jobs for
    different target architectures, but it is less bad than any alternative
    measure [that is not too hard to gather] that I can think of.
    If you can think about anything better, please tell us.

    Disk space is cheap and plentiful.



    The .text was also something introduced by MS.

    I don't understand this comment. .text predates any MS
    compiler by more than a decade.

  • From David Brown@21:1/5 to bart on Tue Jun 25 10:19:25 2024
    On 24/06/2024 23:15, bart wrote:
    On 24/06/2024 18:15, David Brown wrote:
    On 24/06/2024 18:19, bart wrote:
    On 24/06/2024 16:09, David Brown wrote:
    On 24/06/2024 16:00, bart wrote:



    No one cares about your figures.  No one, except you, cares about /my/
    figures.  Sometimes people care about the build speed of /their/ code,
    using /their/ choice of compiler and /their/ choice of options on
    /their/ computers.  Do you /really/ not understand that the timings
    you get are utterly pointless to everyone else?


    Obviously /you/ don't care about fast build systems.

    How many times does this need repeated? I want my builds to be /fast
    enough/. Fast enough is all that anyone needs. As long as a task is
    fast enough not to hinder other tasks, doing it any faster gives no
    benefits. I care that my builds are fast enough - I don't care if they
    are faster. (I'm quite happy if they are faster, of course.)

    If my gcc builds took too long, becoming a nuisance to my workflow, then
    that would be an issue. Maybe I'd get a faster computer. Maybe I'd
    change the workflow - perhaps doing more compiles without linking, or separating compilation and static error checking. Maybe I'd start using pre-compiled headers, or modules in C++. The one thing I will /not/ do
    is compromise on the quality of the tools I use or cut down on the
    features I rely on.

    It's perfectly
    alright for 90% of the time to build a project to be spent executing auto-conf scripts.

    Please point me to references of Usenet posts where I have said anything remotely like that. In particular, show me where I have said I use
    autoconf for my projects.


    Some people also cared enough about linkers to develop a new generation
    of linker (maybe 'gold', maybe something even newer) that is supposed to
    be 5 times the speed of 'ld'.

    Linking can take a significant amount of time. In particular, C++
    linking is usually /far/ more work than C linking. So for big C++
    projects the build time - including compiling and linking - can often be
    a lot slower than people like.




    No one denies that "gcc -O0" is faster than "gcc -O3" for individual
    compiles, and that the percentage difference will vary and sometimes
    be large.

    Yes, gcc ticks all the boxes. Except the last.

    No, it does not tick all the boxes.  The toolchains I use tick most of
    them (including all the ones that I see as hard requirements), and do
    better than any alternatives, but they are not perfect.  They do,
    however, happily pass the last one.  I have yet to find a C compiler
    that was not fast enough for my needs.

    For me it would be like driving my car at walking pace all over town,
    even though most of my time would be spent at various stopping places.

    You still don't understand.  You are telling people how fantastically
    fast your car is, without realising it is just a remote-controlled toy
    car.  Nobody cares if your toy runs faster than a real tool - people
    will still choose the real tool.  And the real tools run fast enough
    for real developers doing real work.

    You're wrong. My 'car' would do the equivalent job of driving around
    town. Unless someone /wanted/ a vehicle that was more like a 40-tonne
    truck.

    No, your "car" is, at best, a home-made go-cart. It can go really fast
    down steep slopes with a disregard to safety, and that seems to be all
    you want from it. That's fine for you, since that's all you want to do.


    Let's go back a few weeks

    Let's not.

    None that I know of.  Your worries about compiler speed are imaginary
    or self-imposed.

    So cartoons like https://xkcd.com/303/ have no basis in fact? It's just
    made up?

    For my work? No. For /some/ other people's work? Yes.

    Perhaps it would do you some good to raise your eyes from your little
    projects with your highly unusual and restricted style of C, and think
    about the rest of the world. Consider, just for a moment, that when the
    rest of the world sees things differently from you, it is not because
    /you/ are right and everyone else is crazy. Imagine the possibility
    that other people have different needs than you.

    Compile times /are/ long for some code. Build times /are/ big for some projects.

    Compile times for almost all /C/ code are short, even with gcc and heavy optimisation. Build times for /C/ projects are usually fairly short.
    Build times for /my/ /C/ projects are easily fast enough not to be of
    any concern to me.

    Compile times for big /C++/ files can often be long. Build times for
    huge /C++/ projects are often a significant inconvenience. The /C++/
    powers that be are dealing with that (more slowly than many would like)
    with C++ modules in the language, and improving tools such as by making
    new linkers designed from the outset with C++'s needs in mind, and with
    as much parallel processing as practically possible.

    At no point in all this does anyone care in the slightest about the
    speed of your little toys or of the cute little tcc. tcc might be ideal
    for the half-dozen people in the world who think C scripts are a good
    idea, and it had its place in a time when "live Linux" systems were
    booted from floppies, but that's about it.




    You /do/ realise that the only person that "suffers" from slow gcc times is /you/ ?

    Forums abound with horror stories. Here are quotes from just one thread:

    ------------------
    Well a build used to take 6 or 7 minutes, and that's a long time for my little attention span. I'd always get distracted waiting for builds and
    waste even more time.

    In short, if a developer is waiting on a build to run for one hour and
    doing nothing in that timeframe, the business is still spending $75 on average for that developer’s time—and potentially losing out on time
    that developer could be focusing on building more code.

    I worked on a system where only linking the binary took 40-50 minutes.
    For some technical reasons there was no dynamic linking - only static -
    so you had to go through that delay for the slightest change.

    This is why my computer and build server have an 11900k. Builds went
    from 45 minutes to 15.

    This is the reason I stopped being a Java developer and embraced JS.
    Even the 1-3 minutes of build time was a big hit for me because it was
    just enough time for me to get distracted by something else and then the
    next thing you know, you have wasted 30 mins.
    ------------------

    Maybe you should tell these guys how it's done!

    It seems they have already figured it out. If their build times are too
    long for convenience, do something about it. Options include buying
    faster build machines, distributing builds across existing workstations
    (since most machines are 95%+ idle), improving the way the builds are
    done, using more dynamic linking (fixing whatever their technical
    hindrance was), using partial links, changing tools, and changing
    languages.

    Options do /not/ include disabling compiler optimisation or changing to tcc.


    If gcc was ten times slower than it is, it might get annoying
    sometimes, and I'd then get a faster computer.

    The wrong approach.


    The professional approach. If each build is wasting $75 (figures from
    your forum post), it doesn't take many of them to cover the cost of a
    faster machine.

    Of course there are many other ways to speed up builds. I know of
    several ways to speed up my own builds, if I felt the need - but I don't
    need to because they are more than fast enough already.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Tue Jun 25 13:15:20 2024
    On Mon, 24 Jun 2024 17:09:25 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    (I'm not suggesting Michael change for this project - for serious
    embedded work, repeatable builds and consistency of toolchains is
    generally far more important than build times. But I presume he'll
    use newer and better tools for new projects.)


    It is not that simple.
    Tools are supplied by Altera (more recently called Intel, but gossip
    is that it will be called Altera again really soon now).
    Of course, I can build the gcc compiler and binutils to native exes
    myself, but then it wouldn't be supported. And I'd still be forced to
    run these native tools from a cygwin shell because of compatibility
    with other vendor-supplied tools.
    The Altera/Intel-supplied Nios2 SDK on Windows up to 2018 was based on
    cygwin. In 2019-2022 it was based on WSL. In 2023 and later it is
    "deprecated" in theory and removed in practice, both on Windows and on
    Linux, in favor of "Nios-V", which is a name for an Intel-supplied
    RISC-V core.
    I have a weak hope that if Altera becomes more independent then the
    last step will be reversed, but for now it's what we have.
    As you can see, at no point did they support msys/msys2-based tools or
    any other "native" Windows form of the tools.
    So the practical choice Intel/Altera gives is between cygwin and WSL.
    WSL is not usable in our working environment. That leaves cygwin.

    And it's not that bad.
    Yes, the cygwin shell is inconvenient, but not unusable. Yes, cygwin is
    slower. But the project that I presented is among our biggest, and
    still a full rebuild takes only ~15 seconds on rather old hardware.
    During development full rebuilds are very rare. A more typical build on
    a more typical project is 2-3 seconds. For me, it's slightly
    inconvenient, but tolerable. For a few other co-workers it's not even
    inconvenient. I know a few people for whom it would be quite unnerving,
    but luckily none of them is currently doing Nios2 sw development.

    So, your presumption is wrong. I am going to start a new project that
    among other things involves Nios2 software, and I'm planning to start
    it with cygwin-based build tools. A little newer version of the tools
    (gcc 5.2 instead of 4.1, newer binutils 2.25, etc.) but otherwise
    almost identical to the 11-year-old SDK that was used to gather the
    numbers in the post above.

  • From David Brown@21:1/5 to Michael S on Tue Jun 25 12:56:31 2024
    On 25/06/2024 12:15, Michael S wrote:
    On Mon, 24 Jun 2024 17:09:25 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    (I'm not suggesting Michael change for this project - for serious
    embedded work, repeatable builds and consistency of toolchains is
    generally far more important than build times. But I presume he'll
    use newer and better tools for new projects.)


    It is not that simple.
    Tools are supplied by Altera (more recently called Intel, but gossip
    is that it will be called Altera again really soon now).
    Of course, I can build the gcc compiler and binutils to native exes
    myself, but then it wouldn't be supported. And I'd still be forced to
    run these native tools from a cygwin shell because of compatibility
    with other vendor-supplied tools.

    If that's the vendor-supplied tools, then that's what you use, of
    course. I assumed that this was an old project, since gcc 4.1 is a very
    old version and few people use Cygwin now, and it's normal in this
    branch to stick to the tools you start with in a project.

    The Altera/Intel-supplied Nios2 SDK on Windows up to 2018 was based on
    cygwin. In 2019-2022 it was based on WSL. In 2023 and later it is
    "deprecated" in theory and removed in practice, both on Windows and on
    Linux, in favor of "Nios-V", which is a name for an Intel-supplied
    RISC-V core.
    I have a weak hope that if Altera becomes more independent then the
    last step will be reversed, but for now it's what we have.
    As you can see, at no point did they support msys/msys2-based tools or
    any other "native" Windows form of the tools.
    So the practical choice Intel/Altera gives is between cygwin and WSL.
    WSL is not usable in our working environment. That leaves cygwin.


    I have customers using WSL for gcc-based builds, though the gcc
    toolchains in question are mingw64 hosted. I have no experience with it
    myself - I use Linux for most development and my Windows system is
    Windows 7 without WSL.

    I used the Nios 2 briefly when it was new, but that was many years ago.
    I can understand that Altera/Intel are a bit slow with updating the
    toolchains (and I can understand why they are pushing towards a RISC-V
    core). But the Nios 2 is a mainstream gcc port - there is no excuse for
    them being stuck on such an old version of gcc, and such an awkward
    emulation layer.


    And it's not that bad.
    Yes, the cygwin shell is inconvenient, but not unusable. Yes, cygwin is
    slower. But the project that I presented is among our biggest, and
    still a full rebuild takes only ~15 seconds on rather old hardware.
    During development full rebuilds are very rare. A more typical build on
    a more typical project is 2-3 seconds. For me, it's slightly
    inconvenient, but tolerable. For a few other co-workers it's not even
    inconvenient. I know a few people for whom it would be quite unnerving,
    but luckily none of them is currently doing Nios2 sw development.


    Fair enough - after all, fast enough is fast enough.

    You have no plans of switching to Linux? I think these days that's the
    most common OS for FPGA development. (Of course I am not recommending switching to Linux just to get faster builds :-) )

    So, your presumption is wrong. I am going to start a new project that
    among other things involves Nios2 software, and I'm planning to start
    it with cygwin-based build tools. A little newer version of the tools
    (gcc 5.2 instead of 4.1, newer binutils 2.25, etc.) but otherwise
    almost identical to the 11-year-old SDK that was used to gather the
    numbers in the post above.


    OK.

  • From bart@21:1/5 to David Brown on Tue Jun 25 12:18:39 2024
    On 25/06/2024 09:19, David Brown wrote:

    At no point in all this does anyone care in the slightest about the
    speed of your little toys or of the cute little tcc.  tcc might be ideal
    for the half-dozen people in the world who think C scripts are a good
    idea, and it had its place in a time when "live Linux" systems were
    booted from floppies, but that's about it.

    Yet, projects like mine, and like tcc, show what is possible: just how
    fast it should be to turn lower-level code into machine code.

    Since as I said I don't see much difference in such a task compared with
    doing the same with assembly, or translating textual data into binary.

    So, if someone is using a tool (and perhaps language) that takes 1, 2 or
    3 magnitudes longer for the same scale of task, then the trade-offs had
    better be worthwhile.

    And it shouldn't be because the developers of the tool are lousy at
    writing performant code. Or they don't care. Or they expect customers to
    just use faster and bigger hardware.

    My own main tool needs to be fast because it is an experimental
    whole-program compiler designed to turn source modules directly into an
    EXE/DLL with no external build system. There is no independent
    module-based compilation; it is program-based.

    It is also capable of running such an application directly from source
    code, just like a scripting language.

    (It can run itself from source each time you want to compile something.
    Imagine gcc building itself from scratch before you build any of your
    programs; I think build times might be a tad slower than you would like!
    Time to save up for the super-computer.)

    LOTS of people are interested in the speed of such tools, lots of
    people are working on such projects, and ultimately, ungrateful people
    like you will benefit.

    You think it is all totally pointless? Then fuck you.


    Here is one more quote from a 2019 thread about compilation speed:

    "I remember back in the early 90s having a copy of both Borland
    packages, Pascal and C++, for Windows (3.x). They had similar demo
    programs. I compared one of them (forgot which).

    The C++ demo would build in 5 minutes on my machine.

    The [Object] Pascal demo would build in 5 seconds.

    The C++ package got shipped back. Ain’t nobody got time for that."

  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Tue Jun 25 05:38:08 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    [on the requirements for qsort]

    I certainly would favor improved wording that made this clearer.
    In fact, simply explicitly mandating total ordering rather than
    making a vague comment about consistency would probably be the
    best approach.

    Clearly the C standard intends to impose a weaker requirement
    than that the comparison function be a total ordering.

    The plot thickens. Unless, of course, you are referring to the
    distinction you drew before between an ordering of all possible objects
    and only those in the array.

    Consider the following situation.

    We have an array with seven elements, the integers 1 to 7,
    in that order. We call qsort on the array, with a natural
    comparison function that compares the integer values.

    The qsort function starts with a check, and for any array
    with eight elements or fewer a simple insertion sort is
    done. Because 1 is less than 2, these elements stay
    where they are. Because 2 is less than 3, there is only
    the one comparison, and 3 stays where it is. And so on...
    at each point in the sort an element is compared to the
    one before it, and nothing changes. Six compares are
    done to sort seven elements. Question: has the program
    encountered any undefined behavior? (I expect you will
    say no.)

    Now consider a second situation.

    We again have an array with seven elements, the integers 1
    to 7, but not necessarily in order. We call the same
    qsort function. This time though the argument for the
    comparison function is for a function that just always
    returns -1. The same sequence of events takes place as
    did in the first situation: each element after the first
    is compared to the one before it, and because the previous
    element is deemed "less than" this element no movement
    occurs and we proceed to the next element of the array.
    Six compares are done to "sort" seven elements. Question:
    has the program encountered any undefined behavior?

    If there has been undefined behavior, which passages in
    the C standard explains the difference relative to the
    first situation?

    If there has not been undefined behavior, what does that
    say about what the requirements are for a call to qsort?

    So you are pointing out that only the comparisons made have to be
    "consistent with one another"? BTW, your function that returns -1 is
    just the total extension of my partial "dog order" function.

    Let me try to be clear here. As I read the C standard: whatever
    we decide "consistent with" means, it is only the results of the
    calls to the comparison function that were actually performed
    that matter. If other calls to the comparison function would
    have given a result not consistent with the results of calls that
    /were/ performed, but those other calls were not performed, that
    potential inconsistency doesn't make the behavior be undefined.

    It makes sense to me that this rule is what was intended. Whatever
    results qsort gets back, as long as they are consistent with one
    another the sorting algorithm is going to work okay, in the sense
    that it won't rely on contradictory information. Conversely, if
    qsort does get back contradictory information, then it easily might
    wander off into the weeds and do who knows what. So the condition
    for undefined behavior matches those cases where qsort could be
    confused by having received bogus results.

  • From Scott Lurndal@21:1/5 to Michael S on Tue Jun 25 13:48:02 2024
    Michael S <already5chosen@yahoo.com> writes:
    On Mon, 24 Jun 2024 18:10:20 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:


    Well, what metric IS interesting?

    You seem to not care whether an executable is 10KB, 10MB, or 10GB.
    You really don't think there's correspondence with build-time?

    No. The ELF file contains a lot of stuff that never gets
    loaded into memory (symbol tables, DWARF section data, etc);
    writing to the object files by the compiler is an insignificant
    component of the overall compile time.

    Build time is not related in any way to the size of the
    ELF.


    That's why in my original post in this sub-thread, in order to give
    a feeling of the size of compiler's job I gave the size of text
    segment rather than size of the elf.
    The size of text segment is, of course, not a good measure of
    compiler's job, esp. when we are trying to compare compile jobs for
    different target architectures, but it is less bad than any alternative
    measure [that is not too hard to gather] that I can think of.
    If you can think about anything better, please tell us.

    Does the compiled code meet functional and performance specifications?

    That's the only criterion that matters. Size of the executable
    and compilation speed are basically irrelevant metrics in my
    experience.

  • From bart@21:1/5 to Scott Lurndal on Tue Jun 25 15:56:22 2024
    On 25/06/2024 14:48, Scott Lurndal wrote:
    Michael S <already5chosen@yahoo.com> writes:
    On Mon, 24 Jun 2024 18:10:20 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:


    Well, what metric IS interesting?

    You seem to not care whether an executable is 10KB, 10MB, or 10GB.
    You really don't think there's correspondence with build-time?

    No. The ELF file contains a lot of stuff that never gets
    loaded into memory (symbol tables, DWARF section data, etc);
    writing to the object files by the compiler is an insignificant
    component of the overall compile time.

    Build time is not related in any way to the size of the
    ELF.


    That's why in my original post in this sub-thread, in order to give
    a feeling of the size of compiler's job I gave the size of text
    segment rather than size of the elf.
    The size of text segment is, of course, not a good measure of
    compiler's job, esp. when we are trying to compare compile jobs for
    different target architectures, but it is less bad than any alternative
    measure [that is not too hard to gather] that I can think of.
    If you can think about anything better, please tell us.

    Does the compiled code meet functional and performance specifications?

    That's the only criterion that matters. Size of the executable
    and compilation speed are basically irrelevant metrics in my
    experience.

    If apparently anything goes, and you don't care how slow a tool is or
    how big its output, how do you detect unnecessary bloat?

    How do you detect gratuitous use of machine resources?

    BTW since you and DB are both keen on products like Python, I'm curious
    as to how you reconcile the fast, streamlined bytecode compilers within
    such languages, which do no analysis and no optimising, with the
    heavyweight, 'professional' compilers used for translating C code.

    Why is fast, TCC-like translation a terrible idea for C code, making
    it a 'toy' compiler, but perfectly fine for Python code? Bear in mind
    that the worst TCC code will be a magnitude faster than any Python code,
    if it is the speed of the output that you are concerned with.

    (It's lucky that the people behind gcc didn't write CPython's bytecode compiler!)

  • From Scott Lurndal@21:1/5 to bart on Tue Jun 25 15:08:02 2024
    bart <bc@freeuk.com> writes:
    On 25/06/2024 14:48, Scott Lurndal wrote:

    That's why in my original post in this sub-thread, in order to give
    a feeling of the size of compiler's job I gave the size of text
    segment rather than size of the elf.
    The size of text segment is, of course, not a good measure of
    compiler's job, esp. when we are trying to compare compile jobs for
    different target architectures, but it is less bad than any alternative
    measure [that is not too hard to gather] that I can think of.
    If you can think about anything better, please tell us.

    Does the compiled code meet functional and performance specifications?

    That's the only criterion that matters. Size of the executable
    and compilation speed are basically irrelevant metrics in my
    experience.

    If apparently anything goes,

    I never said that.

    and you don't care how slow a tool is or

    I never said that.

    how big its output, how do you detect unnecessary bloat?

    I don't write programs with unnecessary bloat.


    How do you detect gratuitous use of machine resources?

    I write code that doesn't gratuitously use machine resources.


    BTW since you and DB are both keen on products like Python,

    I have never posted anything about python here, that I recall.

    I use it very infrequently.

  • From David Brown@21:1/5 to bart on Tue Jun 25 17:08:42 2024
    On 25/06/2024 13:18, bart wrote:
    On 25/06/2024 09:19, David Brown wrote:

    At no point in all this does anyone care in the slightest about the
    speed of your little toys or of the cute little tcc.  tcc might be
    ideal for the half-dozen people in the world who think C scripts are a
    good idea, and it had its place in a time when "live Linux" systems
    were booted from floppies, but that's about it.

    Yet, projects like mine, and like tcc, show what is possible: just how
    fast it should be to turn lower-level code into machine code.


    That is not the be-all and end-all of compilers. Fortunately, real
    compiler developers think differently from you.

    Tools like tcc - and possibly your compiler - are better than nothing.
    And if you really do have a situation where small size is a concrete requirement (such as Live Linux floppies last century), they can be very useful. But when you have the option of using something much better,
    you use something much better. Disk space costs nothing. RAM space
    costs peanuts. Processor time is dirt cheap. A major cost is
    /developer/ time, so you use tools that save the /developer/ time, not
    tools that save the developer's computer time. And for a lot of C code
    - at least, a lot of code that should be written in C - the run time of
    the final result is a cost.

    Since as I said I don't see much difference in such a task compared with doing the same with assembly, or translating textual data into binary.

    So, if someone is using a tool (and perhaps language) that takes 1, 2 or
    3 magnitudes longer for the same scale of task, then the trade-offs had better be worthwhile.

    An order of magnitude longer than negligible is still not worth
    bothering about. Compiler time - for C - does not matter.


    And it shouldn't be because the developers of the tool are lousy at
    writing performant code. Or they don't care. Or they expect customers to
    just use faster and bigger hardware.

    Do you think the developers of gcc don't care? Or they are just bad at
    writing code? Do you know how laughable that is? It is not /quite/ as
    bad as your usual paranoia that the developers behind C, gcc, Linux,
    make, and countless other things you don't understand created them just
    to annoy you personally.


    You think it is all totally pointless? Then fuck you.


    I didn't say your /compiler/ was pointless. I said your "benchmarks"
    were pointless.

    And I have said your compiler (and tcc) is a toy in comparison to gcc
    (and clang, MSVC, and other serious tools) for C development. That
    again does not mean it can't have some uses - if you like it and use it,
    then fine. Just don't expect other people to share your enthusiasm for
    such limited tools, and certainly don't expect anyone else to share your obsession for meaningless benchmarks and imaginary "lines per second"
    figures.

  • From David Brown@21:1/5 to Scott Lurndal on Tue Jun 25 17:12:45 2024
    On 25/06/2024 17:08, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 25/06/2024 14:48, Scott Lurndal wrote:

    That's why in my original post in this sub-thread, in order to give
    a feeling of the size of compiler's job I gave the size of text
    segment rather than size of the elf.
    The size of text segment is, of course, not a good measure of
    compiler's job, esp. when we are trying to compare compile jobs for
    different target architectures, but it is less bad than any alternative
    measure [that is not too hard to gather] that I can think of.
    If you can think about anything better, please tell us.

    Does the compiled code meet functional and performance specifications?

    That's the only criterion that matters. Size of the executable
    and compilation speed are basically irrelevant metrics in my
    experience.

    If apparently anything goes,

    I never said that.

    and you don't care how slow a tool is or

    I never said that.

    how big its output, how do you detect unnecessary bloat?

    I don't write programs with unnecessary bloat.


    How do you detect gratuitous use of machine resources?

    I write code that doesn't gratuitously use machine resources.


    These answers apply to me too.


    BTW since you and DB are both keen on products like Python,

    I have never posted anything about python here, that I recall.

    I use it very infrequently.


    I /do/ use Python. I use it when it is an appropriate language to use,
    which is in very different circumstances from when I use C (or C++).
    Different tools for different tasks.

  • From bart@21:1/5 to David Brown on Tue Jun 25 16:59:33 2024
    On 25/06/2024 16:12, David Brown wrote:
    On 25/06/2024 17:08, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 25/06/2024 14:48, Scott Lurndal wrote:

    That's why in my original post in this sub-thread, in order to give
    a feeling of the size of compiler's job I gave the size of text
    segment rather than size of the elf.
    The size of text segment is, of course, not a good measure of
    compiler's job, esp. when we are trying to compare compile jobs for
    different target architectures, but it is less bad than any
    alternative
    measure [that is not too hard to gather] that I can think of.
    If you can think about anything better, please tell us.

    Does the compiled code meet functional and performance specifications?
    That's the only criterion that matters. Size of the executable
    and compilation speed are basically irrelevant metrics in my
    experience.

    If apparently anything goes,

    I never said that.

    and you don't care how slow a tool is or

    I never said that.

    how big its output, how do you detect unnecessary bloat?

    I don't write programs with unnecessary bloat.


    How do you detect gratuitous use of machine resources?

    I write code that doesn't gratuitously use machine resources.


    These answers apply to me too.


    BTW since you and DB are both keen on products like Python,

    I have never posted anything about python here, that I recall.

    I use it very infrequently.

    (It seemed to be a big part of that 8Mloc project of yours)


    I /do/ use Python.  I use it when it is an appropriate language to use,
    which is in very different circumstances from when I use C (or C++).
    Different tools for different tasks.


    And yet neither of you are interested in answering my question, which
    was why its simplistic bytecode compiler is acceptable in this scenario,
    but would be considered useless if applied to C code.

  • From DFS@21:1/5 to bart on Tue Jun 25 12:52:30 2024
    On 6/25/2024 7:18 AM, bart wrote:
    On 25/06/2024 09:19, David Brown wrote:

    At no point in all this does anyone care in the slightest about the
    speed of your little toys or of the cute little tcc.  tcc might be
    ideal for the half-dozen people in the world who think C scripts are a
    good idea, and it had its place in a time when "live Linux" systems
    were booted from floppies, but that's about it.

    Yet, projects like mine, and like tcc, show what is possible: just how
    fast it should be to turn lower-level code into machine code.


    Can you use tcc to generate assembly from a .c source file?

    similar to:

    $ gcc -S sourcefile.c

  • From Scott Lurndal@21:1/5 to bart on Tue Jun 25 16:27:25 2024
    bart <bc@freeuk.com> writes:
    On 25/06/2024 16:12, David Brown wrote:


    I use it very infrequently.

    (It seemed to be a big part of that 8Mloc project of yours)

    No, it's not. We generate python "header files" that contain
    definitions of every field in every control and status register
    in the entire product, to be used with a handful of python-based
    unit tests. Only a small subset of the unit tests use python,
    most either use the internal command language or are large C
    applications (we simulate an SoC and must be able to run the
    linux kernel and all linux applications on the simulator).


    And yet neither of you are interested in answering my question, which
    was why its simplistic bytecode compiler is acceptable in this scenario,
    but would be considered useless if applied to C code.

    Still not interested in answering your question, as I consider it
    pointless.

  • From Scott Lurndal@21:1/5 to David Brown on Tue Jun 25 18:11:05 2024
    David Brown <david.brown@hesbynett.no> writes:
    On 25/06/2024 17:59, bart wrote:
    On 25/06/2024 16:12, David Brown wrote:
    On 25/06/2024 17:08, Scott Lurndal wrote:

    BTW since you and DB are both keen on products like Python,

    I have never posted anything about python here, that I recall.

    I use it very infrequently.

    (It seemed to be a big part of that 8Mloc project of yours)


    I got the impression that it was just for some scripting and automation.

    Primarily. Converting YAML descriptions into C++ header files is the primary use;
    this happens infrequently - not during regular builds. The generated
    C++ headers (which are #ifdef'd to support C as well) are checked in after generation.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Tue Jun 25 19:51:31 2024
    On 25/06/2024 17:59, bart wrote:
    On 25/06/2024 16:12, David Brown wrote:
    On 25/06/2024 17:08, Scott Lurndal wrote:

    BTW since you and DB are both keen on products like Python,

    I have never posted anything about python here, that I recall.

    I use it very infrequently.

    (It seemed to be a big part of that 8Mloc project of yours)


    I got the impression that it was just for some scripting and automation.
    I'm sure if Python were not available, he could happily have used
    Perl, or Lua, or Tcl. I know that's the case for the Python code I
    often have for build automation.

    (/I/ have other code for PC and server programs that are in Python, and
    I don't know of any other languages that would suit my needs and wants
    better there. That's why I chose Python. But I don't remember Scott
    talking about such code in Python.)


    I /do/ use Python.  I use it when it is an appropriate language to
    use,   which is very different circumstances from when I use C (or
    C++). Different tools for different tasks.


    And yet neither of you is interested in answering my question, which
    was why its simplistic bytecode compiler is acceptable in this scenario
    but would be considered useless if applied to C code.


    It doesn't often matter that Python code is not efficient - that's not
    why people choose to use Python. As always, faster is usually better
    when all other factors are equal, but it is rarely an important factor
    for Python code. There are other ways to make Python programs fast -
    primarily by making sure that the real work is done by underlying C (or
    other compiled language) libraries.

    There are oft-claimed rules about how programs spend 90% of their time
    in 10% of the code, or other proportions picked out of thin air. Python
    is suitable for the 90% of the code where speed doesn't much matter but flexibility and ease of development are important, while C is suitable
    for the 10% of the code where speed is important and you're willing to
    pay the cost in development time.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to DFS on Tue Jun 25 20:39:23 2024
    On 25/06/2024 17:52, DFS wrote:
    On 6/25/2024 7:18 AM, bart wrote:
    On 25/06/2024 09:19, David Brown wrote:

    At no point in all this does anyone care in the slightest about the
    speed of your little toys or of the cute little tcc.  tcc might be
    ideal for the half-dozen people in the world who think C scripts are
    a good idea, and it had its place in a time when "live Linux" systems
    were booted from floppies, but that's about it.

    Yet, projects like mine, and like tcc, show what is possible: just how
    fast it can be to turn lower-level code into machine code.


    Can you use tcc to generate assembly from a .c source file?

    similar to:

    $ gcc -S sourcefile.c


    I don't think that's an option with tcc.

    gcc works by generating intermediate assembly anyway, and the -S option
    exposes it.

    Generally it depends on the compiler. Mine has it, for example.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From DFS@21:1/5 to bart on Tue Jun 25 16:28:13 2024
    On 6/25/2024 3:39 PM, bart wrote:
    On 25/06/2024 17:52, DFS wrote:
    On 6/25/2024 7:18 AM, bart wrote:
    On 25/06/2024 09:19, David Brown wrote:

    At no point in all this does anyone care in the slightest about the
    speed of your little toys or of the cute little tcc.  tcc might be
    ideal for the half-dozen people in the world who think C scripts are
    a good idea, and it had its place in a time when "live Linux"
    systems were booted from floppies, but that's about it.

    Yet, projects like mine, and like tcc, show what is possible: just
    how fast it can be to turn lower-level code into machine code.


    Can you use tcc to generate assembly from a .c source file?

    similar to:

    $ gcc -S sourcefile.c


    I don't think that's an option with tcc.

    It's not listed in the tcc help anyway.



    gcc works by generating intermediate assembly anyway, and the -S option exposes it.

    Generally it depends on the compiler. Mine has it, for example.

    Cool.

    If you don't mind, what assembly code does your compiler generate for
    this C code:

    -------------------------------------------------

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

        int N=0;
        int *nums = malloc(2 * sizeof(int));

        FILE* datafile = fopen(argv[1], "r");
        while(fscanf(datafile, "%d", &nums[N++]) == 1){
            nums = realloc(nums, (N+1) * sizeof(int));
        }
        fclose (datafile);

        N--;
        for(int i=0;i<N;i++) {
            printf("%d.%d ", i+1, nums[i]);
        }
        free(nums);

        printf("\n");
        return 0;

    }

    ----------------------------------------------------------

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Wed Jun 26 00:28:42 2024
    On Tue, 25 Jun 2024 17:08:42 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    Do you think the developers of gcc don't care?

    That's about right. At least for C, they don't.
    Most likely the same applies even stronger to Fortran.
    For C++ they would love to not care, but then their compiler would
    become unusable. So they have to care, though they are probably very sorry about it.

    However, the bigger reason is not lack of care, but the style of their
    development process. It's best described as a patchwork.

    Or they are just bad at writing code?

    I don't think so. Most of them are rather good. But the factors mentioned
    above prevail.

    Do you know how laughable that is? It is not
    /quite/ as bad as your usual paranoia that the developers behind C,
    gcc, Linux, make, and countless other things you don't understand
    created them just to annoy you personally.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Wed Jun 26 00:42:40 2024
    On Tue, 25 Jun 2024 19:51:31 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    On 25/06/2024 17:59, bart wrote:
    On 25/06/2024 16:12, David Brown wrote:
    On 25/06/2024 17:08, Scott Lurndal wrote:

    BTW since you and DB are both keen on products like Python,

    I have never posted anything about python here, that I recall.

    I use it very infrequently.

    (It seemed to be a big part of that 8Mloc project of yours)


    I got the impression that it was just for some scripting and
    automation. I'm sure if Python were not available, he could happily
    have used Perl, or Lua, or Tcl. I know that's the case for the
    Python code I often have for build automation.


    I have a hard time imagining that anybody could happily use TCL instead
    of Python.

    (/I/ have other code for PC and server programs that are in Python,
    and I don't know of any other languages that would suit my needs and
    wants better there. That's why I chose Python. But I don't remember
    Scott talking about such code in Python.)


    For just about anything apart from availability of [mostly free]
    3rd-party libraries and/or of ready-made modules, Ruby is as good as or
    better than Python. But Python was lucky to reach critical mass first.
    By now, Python has better docs as well, but that's a relatively recent
    development.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to DFS on Wed Jun 26 00:23:47 2024
    On 25/06/2024 21:28, DFS wrote:
    On 6/25/2024 3:39 PM, bart wrote:
    On 25/06/2024 17:52, DFS wrote:
    On 6/25/2024 7:18 AM, bart wrote:
    On 25/06/2024 09:19, David Brown wrote:

    At no point in all this does anyone care in the slightest about the
    speed of your little toys or of the cute little tcc.  tcc might be
    ideal for the half-dozen people in the world who think C scripts
    are a good idea, and it had its place in a time when "live Linux"
    systems were booted from floppies, but that's about it.

    Yet, projects like mine, and like tcc, show what is possible: just
    how fast it can be to turn lower-level code into machine code.


    Can you use tcc to generate assembly from a .c source file?

    similar to:

    $ gcc -S sourcefile.c


    I don't think that's an option with tcc.

    It's not listed in the tcc help anyway.



    gcc works by generating intermediate assembly anyway, and the -S
    option exposes it.

    Generally it depends on the compiler. Mine has it, for example.

    Cool.

    If you don't mind, what assembly code does your compiler generate for
    this C code:

    I've shown the main code below. I normally use alternate register
    names; I've changed it to use Intel ones, and it looks about right.

    This is for Win64 ABI running on x64.

    Although I'm not sure how helpful this will be. Most of the runtime in
    your program will be spent inside library functions.



    ----------------------------------------
    `main::
    `main.argc = -8
    `main.argv = -16
    `R.main.N = edi
    `R.main.nums = rbx
    `R.main.datafile = rsi
    `R.main.i.3 = r12d
    `main.$env = -24
    `main.$info = -152
    push rdi
    push rbx
    push rsi
    push r12
    push rbp
    mov rbp, rsp
    sub rsp, 192
    ;... (some code which sets up argc/argv via __getmainargs()) ...
    mov edi, 0
    mov rcx, 8
    call `malloc*
    mov rbx, rax
    mov rax, [rbp+`main.argv]
    lea rax, [rax+8]
    mov rax, [rax]
    mov rcx, rax
    mov rdx, L6
    call `fopen*
    mov rsi, rax
    jmp L7
    L9:
    lea eax, [rdi+1]
    movsx rax, eax
    shl rax, 2
    mov rcx, rbx
    mov rdx, rax
    call `realloc*
    mov rbx, rax
    L7:
    mov eax, edi
    inc edi
    movsx rax, eax
    lea r10, [rbx+rax*4]
    mov rcx, rsi
    mov rdx, L10
    mov r8, r10
    call `fscanf*
    cmp eax, 1
    jz L9
    L8:
    mov rcx, rsi
    call `fclose*
    dec edi
    mov r12d, 0
    jmp L13
    L14:
    mov eax, r12d
    movsx rax, eax
    lea r10, [rbx+rax*4]
    mov r10d, [r10]
    lea eax, [r12+1]
    mov rcx, L15
    mov edx, eax
    mov r8d, r10d
    call `printf*
    L11:
    inc r12d
    L13:
    cmp r12d, edi
    jl L14
    L12:
    mov rcx, rbx
    call `free*
    mov rcx, L16
    call `printf*
    ;------------------------
    xor ecx, ecx
    call `exit*
    add rsp, 192
    pop rbp
    pop r12
    pop rsi
    pop rbx
    pop rdi
    ret

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Wed Jun 26 09:21:14 2024
    On 25/06/2024 21:04, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    [...]
    You think it is all totally pointless? Then fuck you.
    [...]

    Let's keep things civil.


    He felt provoked. So it is I who need to think more, be more respectful
    and be less provocative in my wording. My apologies to Bart, and to the
    group for frustrating him that much.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Wed Jun 26 09:23:49 2024
    On 25/06/2024 22:13, Keith Thompson wrote:
    DFS <nospam@dfs.com> writes:
    [...]
    Can you use tcc to generate assembly from a .c source file?

    similar to:

    $ gcc -S sourcefile.c

    No, tcc has no such option.

    I was able to get an assembly listing by running `tcc -c sourcefile.c` followed by `objdump -d sourcefile.o`.


    <https://godbolt.org> has tcc on its list of C compilers, and it shows
    the generated assembly. I don't know how it does this - perhaps using
    this same method. If people are interested in looking at the code
    generated by tcc (or vast numbers of other compilers, versions and
    targets), then godbolt is usually the easiest way to get it.

    And if anyone wants to know how they get the nice assembly out of the compilers, the source code for godbolt is also freely available.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Wed Jun 26 09:35:21 2024
    On 25/06/2024 23:42, Michael S wrote:
    On Tue, 25 Jun 2024 19:51:31 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    On 25/06/2024 17:59, bart wrote:
    On 25/06/2024 16:12, David Brown wrote:
    On 25/06/2024 17:08, Scott Lurndal wrote:

    BTW since you and DB are both keen on products like Python,

    I have never posted anything about python here, that I recall.

    I use it very infrequently.

    (It seemed to be a big part of that 8Mloc project of yours)


    I got the impression that it was just for some scripting and
    automation. I'm sure if Python were not available, he could happily
    have used Perl, or Lua, or Tcl. I know that's the case for the
    Python code I often have for build automation.


    I have a hard time imagining that anybody could happily use TCL instead
    of Python.

    Maybe he wouldn't be /happy/ about it (I know I wouldn't!) but for small scripts, you can work in a variety of languages. Usually all you need
    are decent file handling, good string support, some high-level data
    structures (at least lists and hashmaps), and automatic memory handling.


    (/I/ have other code for PC and server programs that are in Python,
    and I don't know of any other languages that would suit my needs and
    wants better there. That's why I chose Python. But I don't remember
    Scott talking about such code in Python.)


    For just about anything apart from availability of [mostly free]
    3rd-party libraries and/or of ready-made modules, Ruby is as good as or
    better than Python. But Python was lucky to reach critical mass first.
    By now, Python has better docs as well, but that's a relatively recent
    development.


    I have only looked briefly at Ruby - for some reason it never really
    appealed to me. There was certainly not enough to make it worth
    learning when I already had Python. If Python suddenly disappeared,
    however, then it is certainly a language I'd look into as an alternative.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Wed Jun 26 09:17:17 2024
    On 25/06/2024 23:28, Michael S wrote:
    On Tue, 25 Jun 2024 17:08:42 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    Do you think the developers of gcc don't care?

    That's about right. At least for C, they don't.

    I don't think speed of compilation for C is a big priority for gcc
    developers. But that's not the same as saying they don't care. Like
    any other big development project, they have limited resources and
    priorities - when Intel, ARM, Red Hat, Google, and other companies pay
    people to work on gcc, they want a concentration on the issues that
    matter to them and their customers. Intel is not bothered about the
    speed of C compilation, because their customers are not bothered about
    it - they are a lot more interested in having the generated code running
    faster (especially on /their/ chips).

    Most likely the same applies even stronger to Fortran.

    Again, people don't care about the speed of Fortran compilation. It is
    more than fast enough.

    There are only two reasons anybody uses Fortran. One is because they
    have a large code base in Fortran and are stuck with it. The other is
    because they have code that does massive numerical calculations and they
    can get slightly more speed from it than the same thing in C (or C++).
    Compile speed is almost entirely irrelevant to both groups.

    For C++ they would love to not care, but then their compiler would
    become unusable. So they have to care, though they are probably very sorry about it.


    C++, especially modern template-heavy code with header-only libraries,
    can easily take an order of magnitude or more longer to compile than
    roughly equivalent C. It can easily take an order of magnitude or more
    longer to link. And since projects in C++ are often far bigger than
    those in C - since C's lack of namespaces of any sort makes it a poor
    language for large code bases - speed of compilation is a different
    world here.


    However, the bigger reason is not lack of care, but the style of their
    development process. It's best described as a patchwork.


    It is, yes. It is a massive project built by vast numbers of people
    over three and a half decades. Sometimes there are big renovations and significant parts get re-written for big gains. But those are rare -
    they take a great deal of development resources, and they are a serious
    risk for new problems. gcc is the backbone of far too much critical
    software to take big risks easily.

    Some twenty years ago, someone also thought that gcc's code structure
    and development process was inflexible and limiting for a compiler, and
    it would be better to start with a clean, modern design with modern
    languages and tools, that allowed more modular development, streamlined
    and faster compilation, better error messages and cross-module link-time optimisations. When llvm/clang was young, in comparison to gcc it was
    very much faster, had much nicer error messages and supported much more efficient link-time optimisation - at the expense of significantly less efficient generated code and less static checking. Now llvm/clang is
    much more mature, and is comparable to gcc in generated code quality
    (clang wins some benchmarks, gcc wins most, but the differences are
    often minor). And its compile times are not much faster than gcc. It
    turns out that quality code generation takes time.

    There are basically two things to be done about C++ compilation speed.
    One is to move past the literal inclusion of headers as a language
    model. The problem with C++ compilation is not that it takes a long
    time to analyse and compile 2000 lines of code in a .cpp file. It is
    that the compiler also has to deal with a million lines of include files
    for that compilation. Thus we have C++ modules. Unfortunately,
    retrofitting a good module solution to a language that has always used
    literal includes, is far from easy, and it has taken longer than anyone
    would have wanted to get them defined and standardised in the language,
    and to get implementations in place. As these fall into place, however,
    it will be possible to move projects to being mostly module based and
    improve C++ build speeds in a way that you couldn't dream of by
    streamlining a C++ compiler.

    The second challenge for C++ build times on large projects is linking.
    Here there are new linkers being developed that have C++ in mind, rather
    than being mere modifications of traditional linkers that targeted
    assembly, C and Fortran.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to David Brown on Wed Jun 26 08:31:45 2024
    On 2024-06-26, David Brown <david.brown@hesbynett.no> wrote:
    On 25/06/2024 21:04, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    [...]
    You think it is all totally pointless? Then fuck you.
    [...]

    Let's keep things civil.


    He felt provoked. So it is I who need to think more, be more respectful
    and be less provocative in my wording. My apologies to Bart, and to the group for frustrating him that much.

    To balance this view, we have to acknowledge that Bart carries a
    "background level of provocation" as a reaction to the state of things
    in the computing world, and is actively looking to be further provoked
    in the C newsgroup.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to bart on Wed Jun 26 11:54:00 2024
    On 24/06/2024 14:01, bart wrote:


    Then people have different concepts of what a 'build' is.

    My definition is a full translation from source code to executable
    binary. That, I have never had any trouble with for my own projects, and
    have never had to hang about for it either, because I define and control
    the process (and most often write the necessary tools too, AND design
    the language involved to that end).

    To me that is part, but not all, of a build.

    At the very least (at least, for production code) there will be some
    kind of packaging.

    And I really hope you have some automated tests to run. Which also count
    as part of the build process.

    Andy

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vir Campestris@21:1/5 to David Brown on Wed Jun 26 11:31:07 2024
    On 21/06/2024 10:46, David Brown wrote:

    I understand your viewpoint and motivation.  But my own experience is
    mostly different.

    First, to get it out of the way, there's the speed of compilation. While heavy optimisation (-O3) can take noticeably longer, I never see -O0 as
    being in any noticeable way faster for compilation than -O1 or even
    -O2.  (I'm implicitly using gcc options here, but it's mostly applicable
    to any serious compiler I have used.)  Frankly, if your individual C compiles during development are taking too long, you are doing something wrong.  Maybe you are using far too big files, or trying to do too much
    in one part - split the code into manageable sections and possibly into libraries, and it will be far easier to understand, write and test.
    Maybe you are not using appropriate build tools.  Maybe you are using a
    host computer that is long outdated or grossly underpowered.

    Or maybe I have a really big code base. (My last project at work was
    using a distributed compilation system across all our workstations)

    I must admit I haven't tried O1 or O2. I'll give it a go.

    There are exceptions.  Clearly some languages - like C++ - are more demanding of compilers than others.  And if you are using whole-program
    or link-time optimisation, compilation and build time is more of an
    issue - but of course these only make sense with strong optimisation.

    In recent years most of my code has been C++.

    Now I'm retired I'm not writing much serious code anyway - for some
    reason I don't seem to have the time for any voluntary projects, and all
    I do is a little hackery. If it's really small I'll do it in a spreadsheet.

    Secondly, there is the static error analysis.  While it is possible to
    do this using additional tools, your first friend is your compiler and
    its warnings.  (Even with additional tools, you'll want compiler
    warnings enabled.)  You always want to find your errors as early as
    possible - from your editor/IDE, your compiler, your linker, your
    additional linters, your automatic tests, your manual tests, your beta
    tests, your end user complaints.  The earlier in this chain you find the issue, the faster, easier and cheaper it is to fix things.  And
    compilers do a better job at static error checking with strong
    optimisations enabled, because they do more code analysis.


    Thirdly, optimisation allows you to write your code with more focus on clarity, flexibility and maintainability, relying on the compiler for
    the donkey work of efficiency details.  If you want efficient results
    (and that doesn't always matter - but if it doesn't, then C is probably
    not the best choice of language in the first place) and you also want to write good quality source code, optimisation is a must.

    Agreed. The cost of maintenance is often overlooked, and correct code is
    more useful than faster code.

    Now to your point about debugging.  It is not uncommon for me to use debuggers, including single-stepping, breakpoints, monitoring variables, modifying data via the debugger, and so on.  It is common practice in embedded development.  I also regularly examine the generated assembly,
    and debug at that level.  If I am doing a lot of debugging on a section
    of code, I generally use -O1 rather than -O0 - precisely because it is
    far /easier/ to understand the generated code.  Typically it is hard to
    see what is going on in the assembly because it is swamped by stack
    accesses or code that would be far simpler when optimised.  (That goes
    back to the focus on source code clarity and flexibility rather than micro-managing for run-time efficiency without optimisation.)

    Some specific optimisation options can make a big difference to
    debugging, and can be worth disabling, such as "-fno-inline" or "-fno-toplevel-reorder", and heavily optimised code can be hard to follow in
    a debugger.  But disabling optimisation entirely can often, IME, make
    things harder.

    Temporarily changing optimisation flags for all or part of the code
    while chasing particular bugs is a useful tool, however.

    I'll play with the settings.

    Andy

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to bart on Wed Jun 26 13:15:36 2024
    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:
    ...
    I /do/ use Python.  I use it when it is an appropriate language to use,
     which is very different circumstances from when I use C (or
    C++). Different tools for different tasks.

    And yet neither of you is interested in answering my question, which was
    why its simplistic bytecode compiler is acceptable in this scenario but would be considered useless if applied to C code.

    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you /should/
    know the answers to.

    If a software engineering student asked me this sort of "challenge"
    question it would immediately become homework: come up with at least two scenarios in which a simplistic C bytecode compiler would be an
    unacceptable tool to use, and two in which Python with a trivial
    bytecode compiler would be an acceptable tool to use. In each case
    explain why. Anyone who could not would get marked down on the course.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Vir Campestris on Wed Jun 26 16:43:14 2024
    On 26/06/2024 12:31, Vir Campestris wrote:
    On 21/06/2024 10:46, David Brown wrote:

    I understand your viewpoint and motivation.  But my own experience is
    mostly different.

    First, to get it out of the way, there's the speed of compilation.
    While heavy optimisation (-O3) can take noticeably longer, I never see
    -O0 as being in any noticeable way faster for compilation than -O1 or
    even -O2.  (I'm implicitly using gcc options here, but it's mostly
    applicable to any serious compiler I have used.)  Frankly, if your
    individual C compiles during development are taking too long, you are
    doing something wrong.  Maybe you are using far too big files, or
    trying to do too much in one part - split the code into manageable
    sections and possibly into libraries, and it will be far easier to
    understand, write and test. Maybe you are not using appropriate build
    tools.  Maybe you are using a host computer that is long outdated or
    grossly underpowered.

    Or maybe I have a really big code base. (My last project at work was
    using a distributed compilation system across all our workstations)


    Usually builds are incremental - even with a big code base, you don't
    recompile more than a few files in most runs. The typical exception is
    when you change a commonly-used header.

    I must admit I haven't tried O1 or O2. I'll give it a go.

    I find that -O3 rarely makes more than a marginal difference to the
    efficiency of the results, but it can lead to noticeably longer compile
    times. In particular, it enables several optimisations that scale superlinearly, so can be particularly bad if the files or functions are
    large.


    There are exceptions.  Clearly some languages - like C++ - are more
    demanding of compilers than others.  And if you are using
    whole-program or link-time optimisation, compilation and build time is
    more of an issue - but of course these only make sense with strong
    optimisation.

    In recent years most of my code has been C++.

    That makes more sense - IME, compilation time for C is very rarely an
    issue, but compilation and build time for C++ can often be.


    Now I'm retired I'm not writing much serious code anyway - for some
    reason I don't seem to have the time for any voluntary projects, and all
    I do is a little hackery. If it's really small I'll do it in a spreadsheet.

    Secondly, there is the static error analysis.  While it is possible to
    do this using additional tools, your first friend is your compiler and
    its warnings.  (Even with additional tools, you'll want compiler
    warnings enabled.)  You always want to find your errors as early as
    possible - from your editor/IDE, your compiler, your linker, your
    additional linters, your automatic tests, your manual tests, your beta
    tests, your end user complaints.  The earlier in this chain you find
    the issue, the faster, easier and cheaper it is to fix things.  And
    compilers do a better job at static error checking with strong
    optimisations enabled, because they do more code analysis.


    Thirdly, optimisation allows you to write your code with more focus on
    clarity, flexibility and maintainability, relying on the compiler for
    the donkey work of efficiency details.  If you want efficient results
    (and that doesn't always matter - but if it doesn't, then C is
    probably not the best choice of language in the first place) and you
    also want to write good quality source code, optimisation is a must.

    Agreed. The cost of maintenance is often overlooked, and correct code is
    more useful than faster code.

    Heathfield's law - "It is easier to make a readable program correct,
    than to make a correct program readable. It is easier to make a correct
    program fast, than to make a fast program correct."

    A program that is not correct is of little use to anyone, and a program
    that is not maintainable is not going to remain correct for long.


    Now to your point about debugging.  It is not uncommon for me to use
    debuggers, including single-stepping, breakpoints, monitoring
    variables, modifying data via the debugger, and so on.  It is common
    practice in embedded development.  I also regularly examine the
    generated assembly, and debug at that level.  If I am doing a lot of
    debugging on a section of code, I generally use -O1 rather than -O0 -
    precisely because it is far /easier/ to understand the generated
    code.  Typically it is hard to see what is going on in the assembly
    because it is swamped by stack accesses or code that would be far
    simpler when optimised.  (That goes back to the focus on source code
    clarity and flexibility rather than micro-managing for run-time
    efficiency without optimisation.)

    Some specific optimisation options can make a big difference to
    debugging, and can be worth disabling, such as "-fno-inline" or
    "-fno-toplevel-reorder", and heavily optimised code can be hard to follow
    in a debugger.  But disabling optimisation entirely can often, IME,
    make things harder.

    Temporarily changing optimisation flags for all or part of the code
    while chasing particular bugs is a useful tool, however.

    I'll play with the settings.


    That's what retirement is for :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to All on Wed Jun 26 12:59:28 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    (I am lazily keeping everything so I don't have to
    think about what to exclude. I have changed some
    white space but otherwise it's all here.)

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:
    [...]

    On a C language point, I don't think the standard says
    anything about sorting with non-order functions like the one
    above. Is an implementation of qsort permitted to misbehave
    (for example by not terminating) when the comparison
    function does not implement a proper order relation?

    N1570 7.22.5p4 (applies to bsearch and qsort):
    """
    When the same objects (consisting of size bytes, irrespective
    of their current positions in the array) are passed more than
    once to the comparison function, the results shall be
    consistent with one another. That is, for qsort they shall
    define a total ordering on the array, and for bsearch the
    same object shall always compare the same way with the key.
    """

    That's a "shall" outside a constraint, so violating it
    results in undefined behavior.

    I think it should be clearer. What the "that is" phrase seems
    to clarify in no way implies a total order, merely that the
    repeated comparisons of the same elements are consistent with
    one another. That the comparison function defines a total
    order on the elements is, to me, a major extra constraint that
    should not be written as an apparent clarification of
    something that does not imply it: repeated calls should be
    consistent with one another and, in addition, a total order
    should be imposed on the elements present.

    I think you're misreading the first sentence.

    Let's hope so. That's why I said it should be clearer, not that
    it was wrong.

    Suppose we are in court listening to an ongoing murder trial.
    Witness one comes in and testifies that Alice left the house
    before Bob. Witness two comes in (after witness one has gone)
    and testifies that Bob left the house before Cathy. Witness
    three comes in (after the first two have gone) and testifies
    that Cathy left the house before Alice. None of the witnesses
    have contradicted either of the other witnesses, but the
    testimonies of the three witnesses are not consistent with one
    another.

    My (apparently incorrect) reading of the first sentence is that
    the consistency is only required between the results of multiple
    calls between each pair. In other words, if the witnesses are
    repeatedly asked, again and again, if Alice left before Bob
    and/or if Bob left before Alice the results would always be
    consistent (with, of course, the same required of repeatedly
    asking about the other pairs of people).

    Let me paraphrase that. When the same pair of objects is passed
    more than once to individual calls of the comparison function,
    the results of those different calls shall each be consistent
    with every other one of the results.

    No, only with the results of the other calls that get passed the
    same pair. [...]

    Sorry, my oversight. That is what I meant. "When the same pair
    of objects is passed more than once to individual calls of the
    comparison function, the results of those different calls shall
    each be consistent with every other one of THOSE results." The
    consistency is meant to be only between results of comparisons of
    the same pair. (This mistake illustrates how hard it is to write
    good specifications in the C standard.)

    To paraphrase my reading, when some set of "same" objects is each
    passed more than once to individual calls of the comparison
    function, the results of all of those calls taken together shall
    not imply an ordering contradiction.

    Are the last two paragraphs fair restatements of our respective
    readings?

    I don't think so. The first does not seem to be what I meant, and
    the second begs a question: what is an ordering contradiction?

    A conclusion that violates the usual mathematical rules of the
    relations less than, equal to, greater than: A<B and B<C implies
    A<C, A<B implies A!=B, A=B implies not A<B, A<B implies B>A, etc.

    Maybe I could work out what you mean by that if I thought about it
    some more, but this discussion has reminded me why I swore not to
    discuss wording and interpretation on Usenet. You found the
    wording adequate. I didn't. I won't mind if no one ever knows
    exactly why I didn't. C has managed fine with this wording for
    decades so there is no practical problem. I think enough time has
    been spent on this discussion already, but I can sense more is
    likely to be spent.

    A small correction: I found the wording understandable. If the
    question is about adequacy, I certainly can't give the current
    wording 10 out of 10. I would like to see the specification for
    qsort stated more plainly. Although, as you can see, I'm having
    trouble figuring out how to do that.

    Is the second paragraph plain enough so that you would not
    misconstrue it if read in isolation? Or if not, can you suggest
    a better phrasing?

    Since I don't know what an ordering contradiction is, I can't
    suggest an alternative.

    Now that I have explained that phrase, I hope you will have a go
    at finding a better wording.

    I would not introduce your new term, "an ordering contradiction",
    since it still leaves exactly what kind of order vague.

    My original thinking was that "ordering contradiction" would be a
    good choice for your benefit, not that it would be good phrasing
    for the C standard. Apparently my aim was not so good.

    You interpret "consistent" as "consistent with a total order"

    Actually I don't. More below.

    so I'd use that phrase:

    "when some set of 'same' objects is each passed more than once
    to individual calls of the comparison function, the results of
    all of those calls taken together shall be consistent with a
    total order"

    Presumably you came to interpret "consistent with one another" as
    implying a total order rather because of the sentence that follows
    ("That is, for qsort they shall define a total ordering on the
    array").

    Actually not. To me the two sentences are not equivalent. More
    specifically, the first is weaker than the second. More below.

    I could not do that because I was interpreting the text about
    multiple calls differently.

    Yes, I understand that now, moreso than before.

    ... The important point is that "consistent with" is something of
    an idiomatic phrase, and it doesn't mean "equivalent to" or "the
    same as". Maybe you already knew that, but I didn't, and
    learning it helped me see what the quoted passage is getting at.

    ...

    If you care to be less cryptic, maybe you will say what it was
    about the meaning of "consistent with" that helped you see what
    the text in question was getting at.

    I think the key thing is that "consistent with" doesn't mean the
    same. If we're comparing the same pair of objects over and over,
    the results are either the same or they are different. It would
    be odd to use "consistent with one another" if all that mattered
    is whether they are all the same.

    I never thought they were the same. The trouble is that (a)
    different results imply the same order (e.g. -1 and -34 all mean
    <) and (b) the (old) wording does not say that the objects are
    passed in the same order and the result of cmp(a, b) can't be the
    same as cmp(b, a) but they can be consistent. This makes
    "consistent with one another" a perfectly reasonable thing to say
    even in my limited view of what results are being talked about.

    It's interesting that our mental pictures here are so different.

    To me there is no difference between a return value of -1 and a
    return value of -34. To say that more generally, different
    return values that have the same meaning are the same result.
    That idea also applies to changing the order of operands, so a
    compare( a, b ) being positive is the same result as getting a
    negative value from compare( b, a ). "Results" of a comparison
    between a and b are either a<b, a==b, or a>b. The actual values
    returned are incidental (and probably aren't even looked at
    except to compare them to zero).

    (That reminds me, I have a little challenge/puzzle exercise that
    might be fun for people, and it is related to the previous
    paragraph, so maybe that will get me to post it.)

    Because I see "sameness of results" as being determined by
    meaning, and not by what particular values come back, it wouldn't
    have occurred to me to think "consistent with" was there just to
    account for differences in the values. That difference in
    viewpoint may account for much of the difference in our first
    impressions of the "consistent with one another" sentence in the
    C standard.


    ...

    I have a second objection that prompted that remark. If I take
    the (apparently) intended meaning of the first sentence, I think
    that "consistent" is too weak to imply even a partial order. In
    dog club tonight, because of how they get on, I will ensure that
    Enzo is walking behind George, that George is walking behind
    Benji, Benji behind Gibson, Gibson behind Pepper and Pepper
    behind Enzo. In what sense is this "ordering" not consistent?
    All the calls to the comparison function are consistent with
    each other.

    I understand the objection, and this is the point I was trying to
    make in the paragraph about children in the Jones family. The
    phrase "one another" in "the results shall be consistent with one
    another" is meant to be read as saying "all the results taken
    together". It is not enough that results not be contradictory
    taken two at a time; considering all the results at once must
    not lead to an ordering contradiction.

    ...

    All the results of the dog-order comparison function, taken
    together, are consistent with the circular order, which is
    obviously not a total order.

    If A<B, B<C, C<D, D<E, and E<A, we can infer from the transitivity
    of the "less than" relation that A<A. But A<A can never be true,
    so this set of comparison results is no good.

    [Technical aside. The relation should be seen as <=, not <. You
    can't conclude that I intended A < A from the informal
    presentation -- no dog can be behind itself. However, this does
    not alter your argument in any significant way.]

    Different authors define "total ordering" differently. Also some
    authors base the discussion on < rather than <=. I'm taking your
    comment above narrowly in that it is meant to apply only to the
    dog-order example, and not meant to be universal. However, if
    the dog-order relation is meant to be <= rather than <, then the
    dog-order example is consistent with "total orderings that allow
    equality". The C standard uses "total ordering" in this sense,
    because the comparison function can return an "equal" result for
    distinct objects. For contrast, the integers have a "total
    ordering that does not allow equality": for any two distinct
    integers, it is always the case that one of them is less than the
    other (and they are never equal). To me it's a little bit funny
    to call a set "totally ordered" if equality is allowed, although
    of course I understand what is meant in such cases.

    So I guess what we have discovered is that "consistent with one
    another" is intended to mean "obeys the usual mathematical rules
    for ordering relations".

    I would say this is backwards. You are assuming the usual rules
    where I gave an order that is not at all usual with the purpose of
    showing that some sets of comparisons between pairs can be
    "consistent with one another" when the ordering is very peculiar.

    I didn't understand before that you meant the "behind" relation
    to be one that might not satisfy the axioms of "less than", but
    rather just the axioms of "less than or equal". So I missed that
    point earlier. Hopefully I'm caught up now.

    On a more mathematical note, imagine that the text was describing
    a topological sort function. Is there anything in your reading of
    the first sentence that would make it inappropriate? If not, then
    "consistent with one another" can't imply a total order.

    I take up this question when it is raised again below.

    ...

    It occurs to me now to say that "consistent with" is meant to
    include logical inference.

    Sure.

    That distinction is a key difference between "consistent" and
    "consistent with" (at least as the two terms might be understood).
    The combination of: one, the results of the comparison function
    are seen as corresponding to an ordering relation;

    But, according to you, only some ordering relations.

    I am guilty of somewhat sloppy language there. Strictly speaking
    an ordering relation is all the ordered pairs that define the
    relationship. The results of all the comparisons done correspond
    (at least usually) to only a subset of the ordered pairs of an
    ordering relation. The qsort function needs to do only enough
    comparisons so that the closure of those results defines a total
    ordering. As long as the set of comparisons actually done is a
    subset of some totally ordered relation then the program is okay
    and hasn't wandered off into the UB weeds. However, if the set
    of all N*(N-1) comparisons (which includes reversing the argument
    orders) would give results that are not a subset of a total
    ordering, then which total ordering is determined (by a qsort
    call that doesn't encounter UB) depends on which comparisons were
    actually done. Considering all that I think my last sentence
    above is better stated as "the results of the comparison calls
    performed correspond to a subset of some ordering relation".

    and two, that "consistent with one another" includes logical
    inferences considering all of the results together; is what
    allows us to conclude that the results define a total order.

    Could the sentence in question be used in the description of a
    topological sort based (rather unusually) on a partial order?

    Short answer: doing a topological sort requires a different
    interface, and that change of context changes the meaning of the
    phrase "consistent with one another".

    Longer answer: the comparison function in the qsort interface is
    specified as giving one of three results: a<b, a==b, a>b. The
    returned value must indicate one of those three possibilities.

    To do a (general) topological sort, there needs to be another
    possibility, namely, that a and b are unrelated. There are now
    four mutually exclusive possibilities. Note that "unrelated"
    cannot be the same as "equal". The reason is that "equal" is
    transitive but "unrelated" is not. In particular, we can have
    a!=!b, b!=!c, but a<c rather than a!=!c (using !=! to mean
    "unrelated"). That combination cannot occur for equal: if a==b
    and b==c, then a==c. I expect you are already familiar with
    these ideas; I'm going through them mainly as a check on my own
    thinking.

    A literal answer to your question is that the sentence about
    being "consistent with one another" could also be used in a
    different function that would do topological sorts. But the
    meaning of the sentence would be different, because of changes in
    how the comparison function would have to be specified. I guess
    I should add, as I understand the meaning of these passages in
    the C standard.

    To me, the meaning of the phrase "consistent with one another" is
    meant to be taken relative to the specifications of the comparison
    function, whose results are three mutually exclusive cases: less
    than, equal to, greater than. The C standard tacitly takes the
    view that these operations behave like the ones we learned about
    in grade school. As long as the results of comparisons done are
    not in conflict with a logically valid deduction, under the usual
    mathematical rules for these elementary relationships, with all of
    the comparison results assumed as being true as a starting point,
    then the condition of the first sentence is satisfied. But that
    being true does not by itself show that the comparison results
    define a total ordering.

    To conclude that the comparison results define a total ordering,
    we need to add what the standard says about the return value of
    qsort, namely, that array elements are placed in ascending order.
    This condition can be achieved only when enough comparisons have been
    done to determine a total order. The second sentence augments
    the "consistent with" condition in the first sentence with a
    tacit recognition of the qsort return condition to say comparison
    results must define a total ordering. So a full statement might
    be that the comparison results shall be consistent with one
    another and they shall be sufficient to determine the total
    ordering required by the output condition. The C standard
    collapses those two parts down into the shorter second sentence.

    In any case that's how I read this part of the standard. I hope
    that clarifies my earlier statements.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Wed Jun 26 23:46:19 2024
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    (I am lazily keeping everything so I don't have to
    think about what to exclude. I have changed some
    white space but otherwise it's all here.)

    And I have lazily removed it all because I need to call it a day.

    I am grateful for your very patient replies and I hope you will not be disappointed if I don't reply in detail to your latest. I think I
    understand your position (though I would not want to try to summarise
    it) and I think you understand how I was initially reading the text.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Wed Jun 26 17:48:17 2024
    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben@bsb.me.uk> writes:

    (I am lazily keeping everything so I don't have to
    think about what to exclude. I have changed some
    white space but otherwise it's all here.)

    And I have lazily removed it all because I need to call it a day.

    Oh good. Thank you for the extremely brief reply.

    I am grateful for your very patient replies and I hope you will not be disappointed if I don't reply in detail to your latest. I think I
    understand your position (though I would not want to try to summarise
    it) and I think you understand how I was initially reading the text.

    Yes, I think so too. A nice way to finish.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Ben Bacarisse on Thu Jun 27 12:16:14 2024
    On 26/06/2024 13:15, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:
    ...
    I /do/ use Python.  I use it when it is an appropriate language to use,
    which is very different circumstances from when I use C (or
    C++). Different tools for different tasks.

    And yet neither of you are interested in answering my question, which was
    why its simplistic bytecode compiler is acceptable in this scenario, but
    would be considered useless if applied to C code.

    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you /should/
    know the answers to.

    If a software engineering student asked me this sort of "challenge"
    question it would immediately become homework: come up with at least two scenarios in which a simplistic C bytecode compiler would be an
    unacceptable tool to use, and two in which Python with a trivial
    bytecode compiler would be an acceptable tool to use. In each case
    explain why. Anyone who could not would get marked down on the course.



    I'm not sure what you're implying here.

    Some here are consistently saying that any compiler whose internal
    processes are not at the scale or depth that you find in 'professional', 'industrial scale' products like gcc, clang, icc, is not worth bothering
    with and is basically a useless toy.

    And yet those same people are happy that such a straightforward
    compiler, which does less error-checking than Tiny C, is used within the dynamic scripting languages they employ.

    It just seemed to me to be blind prejudice.

    They were also unwilling to answer questions about whether, given a
    simpler task of translating initialisation data such as long sequences
    of integer constants, or strings, they'd be willing to entrust it to
    such a 'toy' compiler or even a dedicated tool. Since here there is no
    analysis to be done nor any optimisation.

    Assuming the answer is No, it must be the bigger, much slower product,
    then it sounds like irrational hatred.

    So, what would your students say?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Jun 27 14:31:30 2024
    On 27/06/2024 13:16, bart wrote:
    On 26/06/2024 13:15, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:
    ...
    I /do/ use Python.  I use it when it is an appropriate language to use,
    which is very different circumstances from when I use C (or
    C++). Different tools for different tasks.

    And yet neither of you are interested in answering my question, which
    was why its simplistic bytecode compiler is acceptable in this
    scenario, but would be considered useless if applied to C code.

    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you /should/
    know the answers to.

    If a software engineering student asked me this sort of "challenge"
    question it would immediately become homework: come up with at least two
    scenarios in which a simplistic C bytecode compiler would be an
    unacceptable tool to use, and two in which Python with a trivial
    bytecode compiler would be an acceptable tool to use.  In each case
    explain why.  Anyone who could not would get marked down on the course.



    I'm not sure what you're implying here.

    Some here are consistently saying that any compiler whose internal
    processes are not at the scale or depth that you find in 'professional', 'industrial scale' products like gcc, clang, icc, is not worth bothering
    with and is basically a useless toy.

    And yet those same people are happy that such a straightforward
    compiler, which does less error-checking than Tiny C, is used within the dynamic scripting languages they employ.

    I would expect Ben's students to understand the difference between a low
    level language aimed at systems programming and high efficiency
    binaries, and a high-level language aimed at ease of development,
    convenience, and simplifying coding by avoiding manual resource
    management. I would expect his students to see the difference in the languages, the different appropriate uses of the languages, and the
    different requirements for tools for those languages.

    I'd expect that from you too. Since you seem ignorant of these things,
    I explained them to you. If you won't listen, or even try to think a
    little, then I can't help you learn.


    It just seemed to me to be blind prejudice.

    They were also unwilling to answer questions about whether, given a
    simpler task of translating initialisation data such as long sequences
    of integer constants, or strings, they'd be willing to entrust it to
    such a 'toy' compiler or even a dedicated tool. Since here there is no analysis to be done nor any optimisation.

    I don't think I bothered answering that one because it is clearly a
    pointless question. Again, if you don't read what I and others write, answering your questions is a waste of time.


    Assuming the answer is No, it must be the bigger, much slower product,
    then it sounds like irrational hatred.

    So, what would your students say?


    Maybe if they read your posts, they would think you are projecting. You
    have consistently shown an irrational hatred and blind prejudice to C,
    gcc, IDEs, make, Linux, and indeed every programming language that is
    not your own invention, every compiler that is not your own or tcc,
    every editor, linter, build automation system, and other software
    development tool, and every OS except Windows. I don't quite know how
    tcc and Windows escaped your obsessive "not invented here" syndrome.

    Like most developers, I try to use the best tool for the job - where
    "best" can depend on many factors, including convenience and
    familiarity. So when I embed file data in my C and C++ projects, I use
    the same compiler I use for the rest of the project (which is usually,
    but not always, gcc). Even if tcc supported the targets I use (which it
    does not), why would I bother messing around with an extra tool there?
    gcc does the job in a time I consider practically instant (and therefore
    doing it faster is no benefit).

    I have no "irrational hatred" of tcc - it is simply incapable (in a
    great many ways) of doing the job I need from a compiler, and for the
    jobs it /can/ do it is in no way better than the tools I already need
    and have.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to bart on Thu Jun 27 15:24:50 2024
    On Thu, 27 Jun 2024 12:16:14 +0100
    bart <bc@freeuk.com> wrote:

    On 26/06/2024 13:15, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:
    ...
    I /do/ use Python.  I use it when it is an appropriate language
    to use, which is very different circumstances from when I use C
    (or C++). Different tools for different tasks.

    And yet neither of you are interested in answering my question,
    which was why its simplistic bytecode compiler is acceptable in
    this scenario, but would be considered useless if applied to C
    code.

    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you
    /should/ know the answers to.

    If a software engineering student asked me this sort of "challenge" question it would immediately become homework: come up with at
    least two scenarios in which a simplistic C bytecode compiler would
    be an unacceptable tool to use, and two in which Python with a
    trivial bytecode compiler would be an acceptable tool to use. In
    each case explain why. Anyone who could not would get marked down
    on the course.


    I'm not sure what you're implying here.

    Some here are consistently saying that any compiler whose internal
    processes are not at the scale or depth that you find in
    'professional', 'industrial scale' products like gcc, clang, icc, is
    not worth bothering with and is basically a useless toy.

    And yet those same people are happy that such a straightforward
    compiler, which does less error-checking than Tiny C, is used within
    the dynamic scripting languages they employ.

    It just seemed to me to be blind prejudice.


    The difference is that in a dynamically typed scripting language like
    Python, only the most blatant syntactic errors can be caught during the
    initial compilation. The rest are caught at runtime. That's by design.
    Maybe an ultra-smart compiler would be able to do a little better than
    the ultra-fast compilers used today, but the difference would be small.

    On the other hand, the C language provides a significant amount of
    static information that enables quite useful compile-time analysis. A
    lot more mistakes, or 'likely mistakes', can be found not just in
    theory but in practice.

    There are languages that provide more static information than C, e.g.
    Ada. I fully expect that, other things being equal, an Ada compiler
    would be slower than a C compiler.

    And then there exists one language that provides A LOT more static
    information than C. I think you can name it yourself; it happens to be
    the most hyped programming language of our times. Not by chance, the
    compiler for this language is VERY slow.

    Somehow, I tend to agree with the other posters who say that you
    likely know all this quite well, but pretend not to for the sake of
    argument.

    They were also unwilling to answer questions about whether, given a
    simpler task of translating initialisation data such as long
    sequences of integer constants, or strings, they'd be willing to
    entrust it to such a 'toy' compiler or even a dedicated tool. Since
    here there is no analysis to be done nor any optimisation.

    Assuming the answer is No, it must be the bigger, much slower
    product, then it sounds like irrational hatred.


    In my practice the answer is 'No', but not because of 'hatred'.
    Situations like those do not happen often enough to justify the hassle
    of using more than one compiler for the same language in the same
    project.

    So, what would your students say?



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Michael S on Thu Jun 27 14:13:02 2024
    On 27/06/2024 13:24, Michael S wrote:
    On Thu, 27 Jun 2024 12:16:14 +0100
    bart <bc@freeuk.com> wrote:

    On 26/06/2024 13:15, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:
    ...
    I /do/ use Python.  I use it when it is an appropriate language
    to use, which is very different circumstances from when I use C
    (or C++). Different tools for different tasks.

    And yet neither of you are interested in answering my question,
    which was why its simplistic bytecode compiler is acceptable in
    this scenario, but would be considered useless if applied to C
    code.

    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you
    /should/ know the answers to.

    If a software engineering student asked me this sort of "challenge"
    question it would immediately become homework: come up with at
    least two scenarios in which a simplistic C bytecode compiler would
    be an unacceptable tool to use, and two in which Python with a
    trivial bytecode compiler would be an acceptable tool to use. In
    each case explain why. Anyone who could not would get marked down
    on the course.


    I'm not sure what you're implying here.

    Some here are consistently saying that any compiler whose internal
    processes are not at the scale or depth that you find in
    'professional', 'industrial scale' products like gcc, clang, icc, is
    not worth bothering with and is basically a useless toy.

    And yet those same people are happy that such a straightforward
    compiler, which does less error-checking than Tiny C, is used within
    the dynamic scripting languages they employ.

    It just seemed to me to be blind prejudice.


    The difference is that in a dynamically typed scripting language
    like Python, only the most blatant syntactic errors can be caught
    during initial compilation. The rest is caught at runtime.

    Some things can be caught at runtime in dynamic code. A few of those
    could also be caught at runtime in C processed with a simple compiler;
    such checks are easier to add than optimisation is to implement.

    But catching things at runtime isn't as useful (the program might
    already be at the customer site, or it might be part-way through a
    long-running task - or just before the end!). Testing dynamic code needs
    a different approach.

    Yet people still run those scripting languages, because they confer
    advantages. One might be an instant (or near-instant) edit-run cycle,
    and a simpler (i.e. non-existent) build process. Yet another might be
    informality and tolerance.

    Those in turn can lead to different ways of developing code.

    Some of those benefits are useful in static languages too! But they also
    provide extra discipline via strict typing, and much faster execution
    even with zero optimisation.

    On the other hand, the C language provides a significant amount of
    static info that enables quite useful compile-time analysis. A lot
    more mistakes, or 'likely mistakes', can be found not just in theory
    but in practice.

    Yes, but even tcc will do that. Although this is not helped by the C
    language not requiring type errors to be treated seriously.

    There are languages that provide more static info than C, e.g. Ada. I
    fully expect that, other things being equal, an Ada compiler would be
    slower than a C compiler.

    And then there exists one language that provides a LOT more static
    info than C. I think you can name it yourself; it happens to be the
    most hyped programming language of our times. Not by chance, the
    compiler for this language is VERY slow.

    Are you talking about Rust? A few years ago I did a bunch of
    compilation-speed benchmarks. On one test, Rustc -O would have taken an estimated 80,000 times longer to compile a program than Tiny C.

    Although Rustc is getting better, it is still slow, and there is no
    equivalent product like tcc for that language.

    Here are some notes from its docs:

    * TCC compiles so fast that even for big projects Makefiles may not be necessary.

    * TCC can also be used to make C scripts, i.e. pieces of C source that
    you run as a Perl or Python script. Compilation is so fast that your
    script will be as fast as if it was an executable.

    * With libtcc, you can use TCC as a backend for dynamic code generation

    That can be useful when you want to distribute programs as C source to
    run on restricted hardware.

    (I remember trying to build one of my one-file C programs on an RPi1.
    gcc would have taken from half a minute to a couple of minutes. I didn't
    have tcc then, but tests with my C compiler showed it could be done in a
    second or two; I expect tcc would have been even better.)

    For me, compilation should be like turning on a light; it should just
    work instantly.

    My own MCC product is not as good or as fast as TCC (but it produces
    somewhat better code). My own MM product, for my language, is about as
    fast, and can be faster due to having proper modules. It could be used
    for scripting.

    Somehow, I tend to agree with other posters that say that you likely
    know all that quite well, but pretend to not know for sake of arguments.

    I wanted other posters to say why they thought a simple compiler is OK
    within certain products (where its use is hidden away) but are so
    adamantly against it when it comes to C:

    DB:
    At no point in all this does anyone care in the slightest about the
    speed of your little toys or of the cute little tcc. tcc might be ideal
    for the half-dozen people in the world who think C scripts are a good
    idea, and it had its place in a time when "live Linux" systems were
    booted from floppies, but that's about it.





    They were also unwilling to answer questions about whether, given a
    simpler task of translating initialisation data such as long
    sequences of integer constants, or strings, they'd be willing to
    entrust it to such a 'toy' compiler or even a dedicated tool. Since
    here there is no analysis to be done nor any optimisation.

    Assuming the answer is No, it must be the bigger, much slower
    product, then it sounds like irrational hatred.


    In my practice the answer is 'No' not because of 'hatred'. The
    situations like those do not happen often enough to justify hassle of
    using more than one compile for the same language in the same project.

    Presumably because they use such complex and long-winded build processes already that any benefit would be lost in the noise.

    Not everyone uses such processes, and then the benefits are clearer.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Jun 27 17:28:18 2024
    On 27/06/2024 13:31, David Brown wrote:
    On 27/06/2024 13:16, bart wrote:

    And yet those same people are happy that such a straightforward
    compiler, which does less error-checking than Tiny C, is used within
    the dynamic scripting languages they employ.

    I would expect Ben's students to understand the difference between a low level language aimed at systems programming and high efficiency
    binaries, and a high-level language aimed at ease of development, convenience, and simplifying coding by avoiding manual resource
    management.

    I'd say that a lower level language can also benefit from fast
    turnaround, conveniences and friendly, informal tools. Not every C
    program is some mission-critical bit of software.

    I'd expect that from you too.  Since you seem ignorant of these things,

    You can't help being insulting and patronising, can you?


    I explained them to you.  If you won't listen, or even try to think a little, then I can't help you learn.

    You're also unwilling to learn. You're like someone used to driving a
    high-end car or even a truck, who considers a bicycle (or even a cheap
    car) a toy.

    (The first time I was able to comfortably afford a new car, it was one
    of the cheapest available. It did the job of getting me from A to B
    perfectly well, it could do 100mph, just about, and even had room to
    stick my bike in the back!)


    It just seemed to me to be blind prejudice.

    They were also unwilling to answer questions about whether, given a
    simpler task of translating initialisation data such as long sequences
    of integer constants, or strings, they'd be willing to entrust it to
    such a 'toy' compiler or even a dedicated tool. Since here there is no
    analysis to be done nor any optimisation.

    I don't think I bothered answering that one because it is clearly a
    pointless question.

    I wanted to see the point at which a translation task turned, in your
    view, from a simple mechanical process, where any tool, particularly a
    nippy one, will do, into one involving a language where suddenly
    in-depth analysis and tools that are magnitudes bigger and slower become
    a must.

    answering your questions is a waste of time.


    Assuming the answer is No, it must be the bigger, much slower product,
    then it sounds like irrational hatred.

    So, what would your students say?


    Maybe if they read your posts, they would think you are projecting.  You have consistently shown an irrational hatred and blind prejudice to C,
    gcc, IDEs, make,

    I don't hate IDEs. I just don't use them. The dislike of C is not
    irrational: remember I tried to implement it, and my dislike of it
    increased; it was an even poorer design than I'd thought.

    The dislike of 'make' is also rational; nearly always it makes life
    harder for me than if a more straightforward build option was provided.

    Linux,

    I don't hate Linux either. What I hate is it being foisted upon me for
    the wrong reasons. For example having to use Linux, or MSYS or Cygwin,
    because some stuff I want to build, even if supposedly cross-platform,
    has those dependencies.

    and indeed every programming language that is
    not your own invention, every compiler that is not your own or tcc,
    every editor, linter, build automation system, and other software
    development tool, and every OS except Windows.  I don't quite know how
    tcc and Windows escaped your obsessive "not invented here" syndrome.

    I admire quite a few languages and products. But I tend to admire
    simplicity, user-friendliness, lack of bloat, lack of unnecessary
    dependencies, elegant aesthetics, and things that 'just work'.

    I do dislike brace-syntax, 0-based indexing, and case-sensitivity. Those
    are common characteristics.


    Like most developers, I try to use the best tool for the job

    Sure, you're a user, you don't get involved in devising new languages
    or creating tools, you have to use existing, trusted products. But you
    let that get in the way of your views with a low tolerance for anything
    different or that seems amateurish or pointless.

    Over a decade ago I started looking at whole-program compilers which, if
    I was more into optimising, would lend themselves easily to
    whole-program optimisation.

    But while you will dismiss my own efforts out of hand, you do at least appreciate the benefits of 'LTO' (which I consider a third rate version
    of what I do, and considerably more complex).

    I have no "irrational hatred" of tcc - it is simply incapable (in a
    great many ways) of doing the job I need from a compiler, and for the
    jobs it /can/ do it is in no way better than the tools I already need
    and have.


    This is what I mean about you being incapable of being objective. You
    dissed the whole idea of tcc for everyone. Whereas what you mean is that
    it wouldn't benefit /you/ at all.

    I can understand that: if you have a dozen slow components of some
    elaborate process, replacing one with a faster one would make little difference.

    My view is different: I already have /half/ a dozen /fast/ components,
    then replacing just one with a slow product like 'gcc' makes a very
    noticeable difference.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Jun 27 21:51:56 2024
    On 27/06/2024 18:28, bart wrote:
    On 27/06/2024 13:31, David Brown wrote:
    On 27/06/2024 13:16, bart wrote:


    I'm snipping a lot, because answering it will not get us anywhere except
    more frustrated.


    I do dislike brace-syntax, 0-based indexing, and case-sensitivity. Those
    are common characteristics.


    I can fully appreciate preferences and opinions - likes and dislikes.
    It's the continued determination to fight things that is irrational
    and incomprehensible. I happen to like these three things. But if I am
    programming in Python (with indentation rather than braces), Lua (with
    1-based indexing) or Pascal (case insensitive), I shrug my shoulders
    and carry on. I don't go to comp.lang.python, or comp.lang.lua and
    rant and rave about how terrible the language is and how my own tools
    are vastly better than anything else.


    Like most developers, I try to use the best tool for the job

    Sure, you're a user, you don't get involved in devising new languages
    or creating tools, you have to use existing, trusted products. But you
    let that get in the way of your views with a low tolerance for anything
    different or that seems amateurish or pointless.

    That makes /no/ sense at all.

    First, I am as capable as you or anyone else at finding things in C or
    any other language that I think are not as good as they could have been,
    or poor design decisions. The fact that I am a user, not an
    implementer, is irrelevant - programming languages are made for the
    users, and the effort needed to implement them is of minor concern.


    Over a decade ago I started looking at whole-program compilers which, if
    I was more into optimising, would lend themselves easily to
    whole-program optimisation.

    But while you will dismiss my own efforts out of hand, you do at least appreciate the benefits of 'LTO' (which I consider a third rate version
    of what I do, and considerably more complex).

    To be clear - as I have stated /many/ times, I appreciate the effort
    needed to make your tools, and the achievement of making them. What I
    dispute is your insistence that your tools are /better/ than mainstream
    tools.


    I have no "irrational hatred" of tcc - it is simply incapable (in a
    great many ways) of doing the job I need from a compiler, and for the
    jobs it /can/ do it is in no way better than the tools I already need
    and have.


    This is what I mean about you being incapable of being objective. You
    dissed the whole idea of tcc for everyone. Whereas what you mean is that
    it wouldn't benefit /you/ at all.

    Much of what I say is clearly marked as being about /my/ uses. But yes,
    I sometimes say that things that I believe apply to most people. I've
    yet to hear of anything, from you or anyone else, to change my thoughts
    on these things.


    I can understand that: if you have a dozen slow components of some
    elaborate process, replacing one with a faster one would make little difference.

    My view is different: I already have /half/ a dozen /fast/ components,
    then replacing just one with a slow product like 'gcc' makes a very noticeable difference.


    No one doubts that gcc is slower than tcc. That is primarily because it
    does vastly more, and is a vastly more useful tool. And for most C
    compiles, gcc (even gcc -O2) is more than fast enough. And it is free,
    and easily available on common systems. Therefore there is no benefit
    to using tcc except in very niche cases.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to bart on Thu Jun 27 13:23:50 2024
    bart <bc@freeuk.com> writes:

    On 26/06/2024 13:15, Ben Bacarisse wrote:

    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:

    ...

    I /do/ use Python. I use it when it is an appropriate language to
    use, which is very different circumstances from when I use C (or
    C++). Different tools for different tasks.

    And yet neither of you are interested in answering my question, which
    was why its simplistic bytecode compiler is acceptable in this
    scenario, but would be considered useless if applied to C code.

    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you /should/
    know the answers to.

    If a software engineering student asked me this sort of "challenge"
    question it would immediately become homework: come up with at least two
    scenarios in which a simplistic C bytecode compiler would be an
    unacceptable tool to use, and two in which Python with a trivial
    bytecode compiler would be an acceptable tool to use. In each case
    explain why. Anyone who could not would get marked down on the course.

    I'm not sure what you're implying here.

    Some here are consistently saying that any compiler whose internal
    processes are not at the scale or depth that you find in
    'professional', 'industrial scale' products like gcc, clang, icc, is
    not worth bothering with and is basically a useless toy.

    After reading the above I decided to try tcc. I used tcc for
    the first time earlier today.

    First I tried using tcc for my most recent project. That
    didn't go anywhere, because that project relies on C11,
    and tcc doesn't support C11.

    Next I tried using tcc on a small part of my larger current
    project. That test involves compiling one .c file to produce a
    .o file, and linking with several other .o files to produce an
    executable, and running the executable. The .c file being
    compiled uses C99 and doesn't need C11.

    The first thing that came up is tcc doesn't support all of
    C99. There are some C99 features that tcc just doesn't
    understand. In this case the infringements were minor so I
    edited the source to work around the missing features.

    The second thing to come up is some language incompatibilities.
    There are language features that tcc understands, sort of,
    but implements them in a way that didn't work with my source
    code. To be fair, a case could be made that what tcc does
    conforms to the C standard. However, the code I had before
    works fine with gcc and clang, and doesn't with tcc. Here
    again the changes needed were minor so I edited the source
    to work around the problem.

    The third thing to come up was the link step. Compiling the
    one .c file with tcc -- and there are three other .o files
    produced using gcc -- implicated the link step, which needed
    to be done with tcc to avoid some undefined symbols. That
    kind of surprised me; I'm used to being able to mix gcc
    object files and clang object files with no difficulty,
    so having the link step fail caught me off guard.

    After taking care of all that the build did manage to produce an
    executable, which appears to have run successfully.

    After doing a trial run with the produced executable, I looked at
    the tcc man page. As best I can tell, tcc simply silently
    ignores the -fPIC option. (I didn't test that, I only read what
    the tcc man page says.) That project absolutely relies on -fPIC,
    so if tcc doesn't support it that's a deal breaker.

    Not offering any conclusion. Just reporting my experience.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Thu Jun 27 22:47:35 2024
    On 27/06/2024 20:51, David Brown wrote:
    On 27/06/2024 18:28, bart wrote:
    On 27/06/2024 13:31, David Brown wrote:
    On 27/06/2024 13:16, bart wrote:


    I'm snipping a lot, because answering it will not get us anywhere except
    more frustrated.


    I do dislike brace-syntax, 0-based indexing, and case-sensitivity.
    Those are common characteristics.


    I can fully appreciate preferences and opinions - likes and dislikes.
    It's the continued determination to fight things that is irrational
    and incomprehensible. I happen to like these three things. But if I
    am programming in Python (with indentation rather than braces), Lua
    (with 1-based indexing) or Pascal (case insensitive), I shrug my
    shoulders and carry on. I don't go to comp.lang.python, or
    comp.lang.lua and rant and rave about how terrible the language is
    and how my own tools are vastly better than anything else.


    Like most developers, I try to use the best tool for the job

    Sure, you're a user, you don't get involved in devising new languages
    or creating tools, you have to use existing, trusted products. But you
    let that get in the way of your views with a low tolerance for
    anything different or that seems amateurish or pointless.

    That makes /no/ sense at all.

    First, I am as capable as you or anyone else at finding things in C or
    any other language that I think are not as good as they could have been,
    or poor design decisions.  The fact that I am a user, not an
    implementer, is irrelevant - programming languages are made for the
    users, and the effort needed to implement them is of minor concern.


    Over a decade ago I started looking at whole-program compilers which,
    if I was more into optimising, would lend themselves easily to
    whole-program optimisation.

    But while you will dismiss my own efforts out of hand, you do at least
    appreciate the benefits of 'LTO' (which I consider a third rate
    version of what I do, and considerably more complex).

    To be clear - as I have stated /many/ times, I appreciate the effort
    needed to make your tools, and the achievement of making them. What I
    dispute is your insistence that your tools are /better/ than
    mainstream tools.


    I have no "irrational hatred" of tcc - it is simply incapable (in a
    great many ways) of doing the job I need from a compiler, and for the
    jobs it /can/ do it is in no way better than the tools I already need
    and have.


    This is what I mean about you being incapable of being objective. You
    dissed the whole idea of tcc for everyone. Whereas what you mean is
    that it wouldn't benefit /you/ at all.

    Much of what I say is clearly marked as being about /my/ uses.  But yes,
    I sometimes say that things that I believe apply to most people.  I've
    yet to hear of anything, from you or anyone else, to change my thoughts
    on these things.


    I can understand that: if you have a dozen slow components of some
    elaborate process, replacing one with a faster one would make little
    difference.

    My view is different: I already have /half/ a dozen /fast/ components,
    then replacing just one with a slow product like 'gcc' makes a very
    noticeable difference.


    No one doubts that gcc is slower than tcc. That is primarily because
    it does vastly more, and is a vastly more useful tool. And for most C
    compiles, gcc (even gcc -O2) is more than fast enough.

    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It also about the same speed as code produced by one of my
    compilers.

    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.

    Otherwise the extra overheads are not worth the bother.



    And it is free,
    and easily available on common systems.  Therefore there is no benefit
    to using tcc except in very niche cases.

    And my argument would be the opposite. The use of gcc would be the
    exception. (Long before I used gcc or tcc, I used lccwin32.)

    Here's the result of an experiment I did. gcc 14 is about 800MB and over
    10,000 files. I wanted to see the minimal set of files that would
    compile one of my generated C files.

    After half an hour I reduced the files to the following 15, to compile
    to object code only (the link dependencies were too complex):

    -----------------------
    Directory of C:\tdm\bin
    09/06/2024 23:13 1,837,582 as.exe
    09/06/2024 23:13 1,924,622 gcc.exe
    09/06/2024 23:13 4,627 gdb-add-index
    09/06/2024 23:13 930,493 libgcc_s_seh-1.dll
    09/06/2024 23:13 499,289 libgmp-10.dll
    09/06/2024 23:13 1,679,127 libiconv-2.dll
    09/06/2024 23:13 344,105 libintl-8.dll
    09/06/2024 23:13 2,252,996 libisl-23.dll
    09/06/2024 23:13 136,406 libmpc-3.dll
    09/06/2024 23:13 676,258 libmpfr-6.dll
    09/06/2024 23:13 94,355 libwinpthread-1.dll
    09/06/2024 23:13 996,878 libzstd.dll
    09/06/2024 23:13 121,870 zlib1.dll

    Directory of C:\tdm\libexec\gcc\x86_64-w64-mingw32\14.1.0
    09/06/2024 23:15 34,221,582 cc1.exe
    15 File(s) 45,720,190 bytes
    ------------------------

    (There are no header files; my generated C doesn't use them. Note this
    is not 'tdm' despite the name. The gdb file probably isn't needed, I
    hadn't spotted it.)

    The equivalent set for tcc would be 2 files, totalling 0.22MB, about
    1/200th the size - and it produces EXEs!

    I can't explain to somebody who doesn't get it why a small, simple tool
    is desirable.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to bart on Thu Jun 27 23:24:51 2024
    bart <bc@freeuk.com> writes:

    On 26/06/2024 13:15, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:
    ...
    I /do/ use Python. I use it when it is an appropriate language to
    use, which is very different circumstances from when I use C (or
    C++). Different tools for different tasks.

    And yet neither of you are interested in answering my question, which
    was why its simplistic bytecode compiler is acceptable in this
    scenario, but would be considered useless if applied to C code.
    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you /should/
    know the answers to.
    If a software engineering student asked me this sort of "challenge"
    question it would immediately become homework: come up with at least two
    scenarios in which a simplistic C bytecode compiler would be an
    unacceptable tool to use, and two in which Python with a trivial
    bytecode compiler would be an acceptable tool to use. In each case
    explain why. Anyone who could not would get marked down on the course.

    I'm not sure what you're implying here.

    If you are not sure what I'm saying (and you think it worth finding out)
    then you could ask some questions /about what I said/. I was trying to
    avoid implying anything, so please consider only what I actually said.
    However, I can't easily explain it if you don't say what parts you
    didn't follow.

    The basics are simple: you asked a question. I think you aught to be
    able to answer it yourself. Can you answer your own question and
    explain why those different kinds of tool might be appropriate in
    certain scenarios? If you now follow what I'm saying but can't answer
    your own question, can you say why you can't?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Harnden@21:1/5 to Ben Bacarisse on Fri Jun 28 00:44:09 2024
    On 27/06/2024 23:24, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 26/06/2024 13:15, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:
    ...
    I /do/ use Python. I use it when it is an appropriate language to
    use, which is very different circumstances from when I use C (or
    C++). Different tools for different tasks.

    And yet neither of you are interested in answering my question, which
    was why its simplistic bytecode compiler is acceptable in this
    scenario, but would be considered useless if applied to C code.
    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you
    /should/ know the answers to.
    If a software engineering student asked me this sort of "challenge"
    question it would immediately become homework: come up with at least
    two scenarios in which a simplistic C bytecode compiler would be an
    unacceptable tool to use, and two in which Python with a trivial
    bytecode compiler would be an acceptable tool to use. In each case
    explain why. Anyone who could not would get marked down on the course.

    I'm not sure what you're implying here.

    If you are not sure what I'm saying (and you think it worth finding out)
    then you could ask some questions /about what I said/. I was trying to
    avoid implying anything, so please consider only what I actually said. However, I can't easily explain it if you don't say what parts you
    didn't follow.

    The basics are simple: you asked a question. I think you ought to be
    able to answer it yourself. Can you answer your own question and
    explain why those different kinds of tool might be appropriate in
    certain scenarios? If you now follow what I'm saying but can't answer
    your own question, can you say why you can't?


    Isn't it about the tradeoff between ease of development, vs speed of
    execution?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Thu Jun 27 23:58:34 2024
    bart <bc@freeuk.com> writes:
    On 27/06/2024 21:23, Tim Rentsch wrote:

    After taking care of all that the build did manage to produce an
    executable, which appears to have run successfully.

    After doing a trial run with the produced executable, I looked at
    the tcc man page. As best I can tell, tcc simply silently
    ignores the -fPIC option.

    I think that it tries to be a drop-in replacement for gcc, so supports
    some of its options, even if they don't do anything. Like -O3.

    Position-independent code seems to be a recent thing with gcc tools. My
    tools didn't support it either, until a year ago when I found out about
    ASLR.

    gcc has supported generating position-independent code for over a
    quarter of a century.


    For most, PIC isn't a necessity.

    That's your opinion. I don't think it matches reality.

    For most applications it doesn't matter and PIC code
    works just as well as PDC.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Tim Rentsch on Fri Jun 28 00:16:40 2024
    On 27/06/2024 21:23, Tim Rentsch wrote:
    bart <bc@freeuk.com> writes:

    On 26/06/2024 13:15, Ben Bacarisse wrote:

    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:

    ...

    I /do/ use Python. I use it when it is an appropriate language to
    use, which is very different circumstances from when I use C (or
    C++). Different tools for different tasks.

    And yet neither of you are interested in answering my question, which
    was why its simplistic bytecode compiler is acceptable in this
    scenario, but would be considered useless if applied to C code.

    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you
    /should/ know the answers to.

    If a software engineering student asked me this sort of "challenge"
    question it would immediately become homework: come up with at least
    two scenarios in which a simplistic C bytecode compiler would be an
    unacceptable tool to use, and two in which Python with a trivial
    bytecode compiler would be an acceptable tool to use. In each case
    explain why. Anyone who could not would get marked down on the course.

    I'm not sure what you're implying here.

    Some here are consistently saying that any compiler whose internal
    processes are not at the scale or depth that you find in
    'professional', 'industrial scale' products like gcc, clang, icc, is
    not worth bothering with and is basically a useless toy.

    After reading the above I decided to try tcc. I used tcc for
    the first time earlier today.

    First I tried using tcc for my most recent project. That
    didn't go anywhere, because that project relies on C11,
    and tcc doesn't support C11.

    Next I tried using tcc on a small part of my larger current
    project. That test involves compiling one .c file to produce a
    .o file, and linking with several other .o files to produce an
    executable, and running the executable. The .c file being
    compiled uses C99 and doesn't need C11.

    The first thing that came up is tcc doesn't support all of
    C99. There are some C99 features that tcc just doesn't
    understand.

    Which ones? I thought its C99 support was far more complete than mine.

    In this case the infringements were minor so I
    edited the source to work around the missing features.

    The second thing to come up is some language incompatibilities.
    There are language features that tcc understands, sort of,
    but implements them in a way that didn't work with my source
    code. To be fair, a case could be made that what tcc does
    conforms to the C standard. However, the code I had before
    works fine with gcc

    People develop software using compilers like gcc, and will naturally do
    whatever it takes to make them work with that tool. They might
    inadvertently use extensions.

    Probably not many test across multiple compilers, though some projects specifically check for compilers like GCC or MSVC and will use dedicated conditional blocks for each. I don't think I've seen such code for TCC.

    and clang, and doesn't with tcc. Here
    again the changes needed were minor so I edited the source
    to work around the problem.

    The third thing to come up was the link step. Compiling the
    one .c file with tcc -- and there are three other .o files
    produced using gcc -- implicated the link step, which needed
    to be done with tcc to avoid some undefined symbols. That
    kind of surprised me; I'm used to being able to mix gcc
    object files and clang object files with no difficulty,
    so having the link step fail caught me off guard.

    After taking care of all that the build did manage to produce an
    executable, which appears to have run successfully.

    After doing a trial run with the produced executable, I looked at
    the tcc man page. As best I can tell, tcc simply silently
    ignores the -fPIC option.

    I think that it tries to be a drop-in replacement for gcc, so supports
    some of its options, even if they don't do anything. Like -O3.

    Position-independent code seems to be a recent thing with gcc tools. My
    tools didn't support it either, until a year ago when I found out about
    ASLR.

    For most, PIC isn't a necessity. But tcc supports shared libraries, and
    some kinds of relocation, if not full PIC, are necessary.

    the tcc man page says.) That project absolutely relies on -fPIC,
    so if tcc doesn't support it that's a deal breaker.

    tcc isn't perfect and there are missing bits especially with headers and libraries. Fixing that would make it a bit bigger, but not 100 times
    bigger. If more people took it seriously then it might get done. gcc has
    been under development since 1987.
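    (A rough way to see what a compiler actually does with -fPIC is to
    inspect the relocations it emits; a sketch, assuming gcc and ELF
    binutils, with example relocation names for x86-64:)

    ```shell
    # Compile a file that takes the address of an external global,
    # then compare the relocations with and without -fPIC.
    cat > pic_probe.c <<'EOF'
    extern int g;
    int *addr(void) { return &g; }
    EOF
    gcc -fno-pic -c pic_probe.c -o plain.o
    gcc -fPIC    -c pic_probe.c -o pic.o
    # Non-PIC code can reference 'g' directly; PIC code routes the access
    # through the GOT (on x86-64 you'd see e.g. R_X86_64_REX_GOTPCRELX
    # in pic.o where plain.o has a direct PC-relative relocation).
    readelf -r plain.o
    readelf -r pic.o
    ```

    A compiler that merely accepts the flag but ignores it would emit the
    same relocations in both objects.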

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Keith Thompson on Thu Jun 27 17:50:21 2024
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    [...]

    After reading the above I decided to try tcc. I used tcc for
    the first time earlier today.

    [...]

    You probably used tcc 0.9.27, the most recent release, from 2017.

    Development has continued, with a git repo at

    git://repo.or.cz/tinycc.git

    The master branch hasn't gone past 0.9.27, but the "mob" branch has been updated as recently as 2024-03-22. It builds in just a few seconds on
    my system. The same version is available on godbolt.org as "TCC (trunk)".

    I haven't paid much attention to what's actually been implemented post-0.9.27. See the included Changelog if you're curious. The one
    thing I've noticed is that the "mob" version implements the C99 rule
    that falling off the end of main does an implicit "return 0;",
    admittedly a minor point.
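    (That rule is easy to observe from a program's exit status; a quick
    sketch, assuming gcc is available:)

    ```shell
    # C99 5.1.2.2.3: reaching the closing } of main behaves as if
    # "return 0;" executed, so the exit status below is 0.
    # (Under C90 the status was undefined.)
    cat > fall_off.c <<'EOF'
    #include <stdio.h>
    int main(void)
    {
        puts("done");
        /* no return statement here */
    }
    EOF
    gcc -std=c99 -Wall fall_off.c -o fall_off
    ./fall_off
    echo "exit status: $?"
    ```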

    I'm not suggesting you should build tcc from source and repeat the
    experiment (I suspect it wouldn't make much difference), but it's there
    if you're so inclined.

    Thank you for the info. If I were going to use tcc I would want to
    use only the version that is standard for my environment, rather
    than try to deal with a custom build, but maybe things will be
    different in the future.

  • From Tim Rentsch@21:1/5 to bart on Thu Jun 27 18:08:03 2024
    bart <bc@freeuk.com> writes:

    On 27/06/2024 21:23, Tim Rentsch wrote:

    bart <bc@freeuk.com> writes:

    On 26/06/2024 13:15, Ben Bacarisse wrote:

    bart <bc@freeuk.com> writes:

    On 25/06/2024 16:12, David Brown wrote:

    ...

    I /do/ use Python. I use it when it is an appropriate language to use, which is very different circumstances from when I use C (or
    C++). Different tools for different tasks.

    And yet neither of you are interested in answering my question, which was why its simplistic bytecode compiler is acceptable in this scenario, but would be considered useless if applied to C code.

    You throw out a lot of these sorts of question, by which I mean
    questions that you either /do/ know the answers to or which you /should/ know the answers to.

    If a software engineering student asked me this sort of "challenge"
    question it would immediately become homework: come up with at least two scenarios in which a simplistic C bytecode compiler would be an
    unacceptable tool to use, and two in which Python with a trivial
    bytecode compiler would be an acceptable tool to use. In each case
    explain why. Anyone who could not would get marked down on the course.
    I'm not sure what you're implying here.

    Some here are consistently saying that any compiler whose internal
    processes are not at the scale or depth that you find in
    'professional', 'industrial scale' products like gcc, clang, icc, is
    not worth bothering with and is basically a useless toy.

    After reading the above I decided to try tcc. I used tcc for
    the first time earlier today.

    First I tried using tcc for my most recent project. That
    didn't go anywhere, because that project relies on C11,
    and tcc doesn't support C11.

    Next I tried using tcc on a small part of my larger current
    project. That test involves compiling one .c file to produce a
    .o file, and linking with several other .o files to produce an
    executable, and running the executable. The .c file being
    compiled uses C99 and doesn't need C11.

    The first thing that came up is tcc doesn't support all of
    C99. There are some C99 features that tcc just doesn't
    understand.

    Which ones? I thought its C99 support was far more complete than
    mine.

    I didn't post to have a conversation about tcc. I just thought
    people might be interested to hear about the trial.

    In this case the infringements were minor so I
    edited the source to work around the missing features.

    The second thing to come up is some language incompatibilities.
    There are language features that tcc understands, sort of,
    but implements them in a way that didn't work with my source
    code. To be fair, a case could be made that what tcc does
    conforms to the C standard. However, the code I had before
    works fine with gcc

    People develop software using compilers like gcc, and will naturally do whatever it takes to make them work with that tool. They might
    inadvertently use extensions.

    I routinely use -pedantic-errors for my compiles, and am totally
    intolerant of any warning or error messages. I very much doubt
    my builds use any extensions.

    After doing a trial run with the produced executable, I looked at
    the tcc man page. As best I can tell, tcc simply silently
    ignores the -fPIC option.

    I think that it tries to be a drop-in replacement for gcc, so supports
    some of its options, even if they don't do anything. Like -O3.

    I would say tcc accepts the -fPIC option but does not support it.

    Position-independent code seems to be a recent thing with gcc
    tools. My tools didn't support it either, until a year ago when I
    found out about ASLR.

    For most, PIC isn't a necessity. But tcc supports shared libraries,
    and some kinds of relocation, if not full PIC, are necessary.

    Support for -fPIC is a must-have for that project. I'm not going
    to spend time investigating possible partial solutions when I
    already have other options available that are known to work.

  • From Tim Rentsch@21:1/5 to Scott Lurndal on Thu Jun 27 18:21:37 2024
    scott@slp53.sl.home (Scott Lurndal) writes:

    bart <bc@freeuk.com> writes:

    On 27/06/2024 21:23, Tim Rentsch wrote:

    After taking care of all that the build did manage to produce an
    executable, which appears to have run successfully.

    After doing a trial run with the produced executable, I looked at
    the tcc man page. As best I can tell, tcc simply silently
    ignores the -fPIC option.

    I think that it tries to be a drop-in replacement for gcc, so supports
    some of its options, even if they don't do anything. Like -O3.

    Position-independent code seems to be a recent thing with gcc tools. My
    tools didn't support it either, until a year ago when I found out about
    ASLR.

    gcc has supported generating position independent code for
    over a quarter of a century.

    For most, PIC isn't a necessity.

    That's your opinion.

    Not really an opinion, but either a conjecture or a belief about
    a question of fact. It may not be easy to determine whether
    the conjecture is true or not, but it's still a question of
    fact and not just an opinion.

    I don't think it matches reality.

    That it is reasonable to ask whether the statement matches
    reality is a giveaway that the proposition is a question
    of fact and not merely a matter of opinion.

  • From Kaz Kylheku@21:1/5 to bart on Fri Jun 28 03:23:47 2024
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It is also about the same speed as code produced by one of my
    compilers.

    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall and -W?

    Some diagnostics are not produced or don't work well without
    optimization, because the same analysis that goes into optimization
    also goes into proofs connected to diagnostics.

    For instance, some code looks like it might use a variable without
    initializing it under some conditions. The optimizer can rule out
    those conditions (e.g. by transforming the code in a way that
    makes it obvious that the use of the variable is not reached
    without crossing an assignment to it).
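    The pattern described might look like this sketch (hypothetical
    example; whether a given compiler warns about 'val' depends on how
    much flow analysis it performs, which in gcc varies with the
    optimization level):

    ```c
    #include <assert.h>

    /* 'val' is assigned only when 'have' is true, and read only when
     * 'have' is true.  Proving the read is safe requires the same flow
     * analysis an optimizer performs, so "maybe uninitialized"
     * diagnostics for code like this are typically better at -O2. */
    static int pick(int have, int fallback)
    {
        int val;                /* not unconditionally initialized */
        if (have)
            val = 42;
        if (have)
            return val;         /* reached only after the assignment */
        return fallback;
    }

    int main(void)
    {
        assert(pick(1, -1) == 42);
        assert(pick(0, -1) == -1);
        return 0;
    }
    ```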

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From Kaz Kylheku@21:1/5 to bart on Fri Jun 28 03:30:07 2024
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    For most, PIC isn't a necessity.

    Only because they use a virtual memory operating system which allows
    every executable to be mapped to the same fixed address in its own
    address space.

    If you designed your personal OS, which would be the case?

    [ ] Programs must be PIC

    [ ] Programs needn't be PIC


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From David Brown@21:1/5 to All on Fri Jun 28 06:57:27 2024
    On 27/06/2024 23:47, bart wrote:

    (I'm off on holiday today - if I don't reply more, it's not because I am ghosting you!)

  • From David Brown@21:1/5 to bart on Fri Jun 28 06:56:41 2024
    On 27/06/2024 23:47, bart wrote:
    On 27/06/2024 20:51, David Brown wrote:
    On 27/06/2024 18:28, bart wrote:
    On 27/06/2024 13:31, David Brown wrote:
    On 27/06/2024 13:16, bart wrote:



    No one doubts that gcc is slower than tcc.  That is primarily because
    it does vastly more, and is a vastly more useful tool.  And for most C
    compiles, gcc (even gcc -O2) is more than fast enough.

    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It is also about the same speed as code produced by one of my
    compilers.


    With most of your compiles, is "gcc -O2" too slow to compile? If not,
    then why would you or anyone else actively /choose/ to have a poorer
    quality output and poorer quality warnings? I appreciate that fast
    enough output is fast enough (just as I say the same for compilation
    speed) - but choosing a slower output when a faster one is just as easy
    makes little sense. The only reason I can think of why "gcc -O2 -Wall"
    is not the starting point for compiler flags is because you write poor C
    code and don't want your compiler to tell you so!

    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.

    Otherwise the extra overheads are not worth the bother.



    And it is free, and easily available on common systems.  Therefore
    there is no benefit to using tcc except in very niche cases.

    And my argument would be the opposite. The use of gcc would be the
    exception. (Long before I used gcc or tcc, I used lccwin32.)

    On Linux, almost everyone uses gcc, except for a proportion who actively
    choose to use clang or icc. The same applies to many other *nix
    systems, though some people will use their system compiler on commercial Unixes. For Macs, it is clang disguised as gcc that dominates. On
    Windows, few people use C for native code (C++, C# and other languages dominate). I expect the majority use MSVC for C, and there will be
    people using a variety of other tools including lcc-win and Borland, as
    well as gcc in various packagings. (And embedded developers use
    whatever cross-compile tools are appropriate for their target,
    regardless of the host - the great majority use gcc now.)

    I don't believe that in any graph of compiler usage on any platform, tcc
    would show up as anything more than a tiny sliver under "others".


    Here's the result of an experiment I did. gcc 14 is about 800MB and over 10,000 files. I wanted to see the minimal set of files that would
    compile one of my generated C files.

    Why? 800 MB is a few pence worth of disk space. For almost all uses,
    it simply doesn't matter.


    I can't explain to somebody who doesn't get it why a small, simple tool
    is desirable.


    If you were trying to say that tcc is simpler to /use/ than gcc, that
    would be a different matter entirely. That would be a relevant factor.
    The size of the gcc installation is all hidden behind the scenes. Few
    people know how big it is on their system, fewer still care.

    (And I am not sure I agree with such a claim - certainly you /can/ have
    very advanced and complicated use of gcc. But in comparison to learning
    C itself, running "gcc -Wall -O2 -o hello hello.c" is hardly rocket
    science. But I would certainly be much more open to a "simpler to use" argument.)

  • From Michael S@21:1/5 to Kaz Kylheku on Fri Jun 28 11:11:51 2024
    On Fri, 28 Jun 2024 03:30:07 -0000 (UTC)
    Kaz Kylheku <643-408-1753@kylheku.com> wrote:

    On 2024-06-27, bart <bc@freeuk.com> wrote:
    For most, PIC isn't a necessity.

    Only because they use a virtual memory operating system which allows
    every executable to be mapped to the same fixed address in its own
    address space.

    That does not sound right.
    For a "simple" program that does not mess with copying parts of itself
    and such, PIC just allows a simpler exe format and a more primitive loader.
    If you already have an advanced exe format and a smart loader then all PIC
    buys is a faster load at the cost of slower execution. Hopefully, just a
    little slower, but that depends on the architecture.
    PID is more "interesting". Even on x86-64, where the address displacement
    field is either 8 or 32 bits, completely free relocation of data
    segments would need co-operation from the code generator. More so on
    architectures with significantly narrower displacement ranges.


    If you designed your personal OS, which would be the case?

    [ ] Programs must be PIC

    [ ] Programs needn't be PIC



  • From Michael S@21:1/5 to David Brown on Fri Jun 28 11:20:47 2024
    On Fri, 28 Jun 2024 06:56:41 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    and Borland, as well as gcc in various packagings. (And embedded
    developers use whatever cross-compile tools are appropriate for their
    target, regardless of the host - the great majority use gcc now.)


    I am not sure that it is still true.
    Vendors like TI and ST Micro push users of their ARMv7-M development
    suites toward clang. I didn't look at NXP lately, but would be surprised
    if it's not the same.
    It's not that a dev can't choose gcc with these suites, but it takes
    effort that the majority of embedded code monkeys are incapable of, and
    the majority of the minority see no reason to.

  • From bart@21:1/5 to David Brown on Fri Jun 28 11:05:39 2024
    On 28/06/2024 05:56, David Brown wrote:
    On 27/06/2024 23:47, bart wrote:
    On 27/06/2024 20:51, David Brown wrote:
    On 27/06/2024 18:28, bart wrote:
    On 27/06/2024 13:31, David Brown wrote:
    On 27/06/2024 13:16, bart wrote:



    No one doubts that gcc is slower than tcc.  That is primarily because
    it does vastly more, and is a vastly more useful tool.  And for most
    C compiles, gcc (even gcc -O2) is more than fast enough.

    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It is also about the same speed as code produced by one of my
    compilers.


    With most of your compiles, is "gcc -O2" too slow to compile?  If not,
    then why would you or anyone else actively /choose/ to have a poorer
    quality output and poorer quality warnings?  I appreciate that fast
    enough output is fast enough (just as I say the same for compilation
    speed) - but choosing a slower output when a faster one is just as easy
    makes little sense.  The only reason I can think of why "gcc -O2 -Wall"
    is not the starting point for compiler flags is because you write poor C
    code and don't want your compiler to tell you so!

    I might very occasionally use gcc -O2/-O3 when I want a fast product
    (this is mostly with generated C), most often when I'm benchmarking and
    want to report a higher lines/second figure or some such measure, since
    that would be a fairer comparison with other products, which almost
    certainly will be using an optimised build.

    But usually I never bother. The 40% boost that gcc-O3 gives me makes
    most runtimes of my language tools 10-20ms faster (so 0.07 seconds
    instead of 0.09 seconds; I can live with that).

    The cost of the speedup is not just having to hang about for gcc-O3
    (it's like doing 80mph on the M6 and having to stop for a red light).
    It's keeping my non-C source code conservative - avoiding features that
    are troublesome to transpile to C.

    (One feature is a special looping switch that I implemented as a fast computed-goto. It is transpiled into a regular C switch, but it means
    gcc-O3 can't generate code faster than mine. Only if I were to use C extensions.)
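    (For reference, the C extension alluded to is GCC's "labels as values",
    also supported by clang and tcc; a minimal, hypothetical dispatcher
    using it might look like this sketch -- not bart's actual generated
    code:)

    ```c
    /* Minimal bytecode dispatcher using computed goto (&&label and
     * goto *ptr).  A portable transpilation to standard C would use a
     * switch inside a loop instead. */
    #include <assert.h>

    enum { OP_INC, OP_ADD, OP_HALT };

    static int run(const int *code)
    {
        static void *dispatch[] = { &&op_inc, &&op_add, &&op_halt };
        int acc = 0;
        goto *dispatch[*code++];    /* jump straight to the first opcode */
    op_inc:
        acc += 1;
        goto *dispatch[*code++];
    op_add:
        acc += *code++;             /* operand follows the opcode */
        goto *dispatch[*code++];
    op_halt:
        return acc;
    }

    int main(void)
    {
        int prog[] = { OP_INC, OP_ADD, 10, OP_INC, OP_HALT };
        assert(run(prog) == 12);
        return 0;
    }
    ```

    Each handler jumps directly to the next one, so there is no central
    loop or bounds check per instruction, which is why this tends to beat
    a plain switch.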

    My other language product that could be speeded up is a bytecode
    interpreter, which has two dispatch modes. One HLL-only dispatch mode
    can have that 40-50% speed-up via C and gcc-O3.

    But I normally use the ASM-accelerated mode, which is about 150-200%
    faster even than the interpreter using transpiled C + gcc-O3.

    Note that all these examples benefit from whole-program optimisation of
    C. If written directly in C, while programs like interpreters can
    benefit from all sorts of tricks like inlining everything, they would
    lose that whole-program analysis.



    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.

    Otherwise the extra overheads are not worth the bother.



    And it is free, and easily available on common systems.  Therefore
    there is no benefit to using tcc except in very niche cases.

    And my argument would be the opposite. The use of gcc would be the
    exception. (Long before I used gcc or tcc, I used lccwin32.)

    On Linux, almost everyone uses gcc, except for a proportion who actively
    choose to use clang or icc. The same applies to many other *nix systems,
    though some people will use their system compiler on commercial Unixes.
    For Macs, it is clang disguised as gcc that dominates. On Windows, few
    people use C for native code (C++, C# and other languages dominate). I
    expect the majority use MSVC for C, and there will be
    people using a variety of other tools including lcc-win and Borland, as
    well as gcc in various packagings.  (And embedded developers use
    whatever cross-compile tools are appropriate for their target,
    regardless of the host - the great majority use gcc now.)

    I don't believe that in any graph of compiler usage on any platform, tcc would show up as anything more than a tiny sliver under "others".


    Here's the result of an experiment I did. gcc 14 is about 800MB and
    over 10,000 files. I wanted to see the minimal set of files that would
    compile one of my generated C files.

    Why?  800 MB is a few pence worth of disk space.  For almost all uses,
    it simply doesn't matter.

    It's sloppy.

    If I transpile code to C via my one-file 0.3MB compiler, I'd have to
    tell people they also need this 800MB/10000-file dependency, of which
    they only need 45MB/15 files (or more with linking), but, sorry, I have
    no idea which bits are actually essential!



    I can't explain to somebody who doesn't get it why a small, simple
    tool is desirable.


    If you were trying to say that tcc is simpler to /use/ than gcc, that
    would be a different matter entirely.  That would be a relevant factor.
    The size of the gcc installation is all hidden behind the scenes. Few people know how big it is on their system, fewer still care.

    (And I am not sure I agree with such a claim - certainly you /can/ have
    very advanced and complicated use of gcc.  But in comparison to learning
    C itself, running "gcc -Wall -O2 -o hello hello.c" is hardly rocket science.  But I would certainly be much more open to a "simpler to use" argument.)

    Actually, on Windows, tcc is harder to use than gcc for my generated C.
    Most of that is due to needing the -fdollars-in-identifiers option
    because my C uses '$'.

    It probably takes longer to type that, if you compile half a dozen
    times, than it would take to fix tcc to allow '$' /unless/ such an
    option was used.

    (I just tried it; what took longer was finding the right source file.
    The fix was to change:

    set_idnum('$', s1->dollars_in_identifiers ? IS_ID : 0);

    to:

    set_idnum('$', s1->dollars_in_identifiers ? 0 : IS_ID);

    But that only fixes one copy of it. It doesn't fix the versions that
    other people use.)

  • From bart@21:1/5 to Kaz Kylheku on Fri Jun 28 11:19:34 2024
    On 28/06/2024 04:23, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It is also about the same speed as code produced by one of my
    compilers.

    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall and -W?



    Using products like tcc doesn't mean never using gcc. (Especially on
    Linux where you will have it installed anyway.)

    You can use the latter to do extra, periodic checks for problems that
    the simpler compiler may have missed, or to produce faster production builds.

    But gcc is not needed for routine compilation.

  • From Kaz Kylheku@21:1/5 to bart on Fri Jun 28 10:26:59 2024
    On 2024-06-28, bart <bc@freeuk.com> wrote:
    On 28/06/2024 04:23, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It is also about the same speed as code produced by one of my
    compilers.

    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall and -W?

    Using products like tcc doesn't mean never using gcc. (Especially on
    Linux where you will have it installed anyway.)

    You can use the latter to do extra, periodic checks for problems that
    the simpler compiler may have missed, or to produce faster production builds.

    But gcc is not needed for routine compilation.

    Catching common bugs in routine compilation is better than once
    a month.

    You could be wasting time debugging something where GCC would have told
    you right away you have something uninitialized or whatever.

    If you have a CI pipeline, you should at least have it run the
    good-diagnostics compiler, so it catches problems when developers
    submit code.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From bart@21:1/5 to Scott Lurndal on Fri Jun 28 11:15:36 2024
    On 28/06/2024 00:58, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 27/06/2024 21:23, Tim Rentsch wrote:

    After taking care of all that the build did manage to produce an
    executable, which appears to have run successfully.

    After doing a trial run with the produced executable, I looked at
    the tcc man page. As best I can tell, tcc simply silently
    ignores the -fPIC option.

    I think that it tries to be a drop-in replacement for gcc, so supports
    some of its options, even if they don't do anything. Like -O3.

    Position-independent code seems to be a recent thing with gcc tools. My
    tools didn't support it either, until a year ago when I found out about
    ASLR.

    gcc has supported generating position independent code for
    over a quarter of a century.

    I'm pretty sure that just 4 years ago, I was able to generate non-PIC
    object files that could be linked with gcc-ld on Windows.

    When I tried it last year, it didn't work. So something changed, whether
    in the compiler or the OS loader, such that a dynamic image base address,
    which requires PIC, became the default.



    For most, PIC isn't a necessity.

    That's your opinion. I don't think it matches reality.

    The fact that it hasn't been needed, or been the default, for decades
    suggests that it isn't a necessity.

    A 'necessity' isn't someone arbitrarily decreeing: 'all code must be PIC'.

    If you took any Windows EXE and changed some flags in the headers, so
    that it loads at the given base address, it would probably still work.


    For most applications it doesn't matter and PIC code
    works just as well as PDC.

    Hang on, isn't that what /I/ said, which you just disagreed with?

  • From bart@21:1/5 to Kaz Kylheku on Fri Jun 28 11:41:58 2024
    On 28/06/2024 04:30, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    For most, PIC isn't a necessity.

    Only because they use a virtual memory operating system which allows
    every executable to be mapped to the same fixed address in its own
    address space.

    PIC never seemed to be a requirement during the 1980s and half the 90s.
    But then OSes only ran one program at a time.

    And when virtual addressing came along, and multiple programs could
    co-exist at the same address, PIC wasn't really needed either.


    If you designed your personal OS, which would be the case?

    [ ] Programs must be PIC

    [ ] Programs needn't be PIC

    With or without virtual addressing?

    On x64, there are several aspects to this:

    * Code that runs within the low 2GB (4GB is troublesome)

    * Code that runs above 2GB, so that 32-bit fields (often signed so that
    only 31 bits are usable) that refer to absolute addresses of code or
    data will need 33 bits or more

    * Dynamically linked /shared/ libraries, which need to have
    base-relocation tables (since you can't have multiple libraries all at
    the same address within the host process's virtual space), but which are
    not necessarily relocated above 2GB

    * True PIC that requires special compiler support (to avoid generating
    address modes that involve registers plus absolute 32-bit address
    offsets). RIP-relative address modes, when there are no registers, can
    be done with instruction encoding outside of the compiler

    It's all rather messy.
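    (The "low 2GB" case corresponds to gcc's default "small" code model;
    a sketch of how to compare code models, assuming gcc on x86-64 or
    AArch64, both of which accept -mcmodel=small and -mcmodel=large:)

    ```shell
    # The small model assumes code and data are reachable via signed
    # 32-bit displacements; the large model drops that assumption at the
    # cost of longer instruction sequences.
    cat > cm_probe.c <<'EOF'
    long big[4];
    long get(int i) { return big[i]; }
    EOF
    gcc -O1 -S -fno-pic -mcmodel=small cm_probe.c -o cm_small.s
    gcc -O1 -S -fno-pic -mcmodel=large cm_probe.c -o cm_large.s
    # Compare how 'big' is addressed in the two listings.
    diff cm_small.s cm_large.s || true
    ```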

    I can generate true PIC if necessary (for OBJ files that might be linked
    with ld, or DLLs that /might/ be relocated above 2GB).

    But usually I generate code to load within the first 2GB.

  • From Scott Lurndal@21:1/5 to bart on Fri Jun 28 13:48:53 2024
    bart <bc@freeuk.com> writes:
    On 28/06/2024 04:30, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    For most, PIC isn't a necessity.

    Only because they use a virtual memory operating system which allows
    every executable to be mapped to the same fixed address in its own
    address space.

    PIC never seemed to be a requirement during the 1980s and half the 90s.
    But then OSes only ran one program at a time.

    Interactive operating systems in 1967 (e.g. TSS8) were running
    multiple programs at a time.


    And when virtual addressing came along, and multiple programs could
    co-exist at the same address, PIC wasn't really needed either.

    Virtual addressing has been part of computer systems since the 1960s.

    PIC is obviously necessary for any kind of shared code (shared object
    or DLL) that gets loaded at different base addressses in different
    processes.




    If you designed your personal OS, which would be the case?

    [ ] Programs must be PIC

    [ ] Programs needn't be PIC

    With or without virtual addressing?

    That's completely irrelevant.

  • From Scott Lurndal@21:1/5 to Michael S on Fri Jun 28 13:52:23 2024
    Michael S <already5chosen@yahoo.com> writes:
    On Fri, 28 Jun 2024 06:56:41 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    and Borland, as well as gcc in various packagings. (And embedded
    developers use whatever cross-compile tools are appropriate for their
    target, regardless of the host - the great majority use gcc now.)


    I am not sure that it is still true.
    Vendors like TI and ST Micro push users of their ARMv7-M development
    suites toward clang. I didn't look at NXP lately, but would be surprised
    if it's not the same.

    ARM discontinued their in-house toolchain and switched to
    a clang-based toolchain. It's not surprising that their
    larger customers followed suit.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Fri Jun 28 13:53:25 2024
    bart <bc@freeuk.com> writes:
    On 28/06/2024 00:58, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:

    Position-independent code seems to be a recent thing with gcc tools. My
    tools didn't support it either, until a year ago when I found out about
    ASLR.

    gcc has supported generating position-independent code for
    over a quarter of a century.

    I'm pretty sure that just 4 years ago, I was able to generate non-PIC
    object files that could be linked with gcc-ld on Windows.

    So what? That doesn't mean gcc didn't support generation of PIC
    code three decades ago.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Scott Lurndal on Fri Jun 28 15:36:23 2024
    On 28/06/2024 14:48, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 28/06/2024 04:30, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    For most, PIC isn't a necessity.

    Only because they use a virtual memory operating system which allows
    every executable to be mapped to the same fixed address in its own
    address space.

    PIC never seemed to be a requirement during the 1980s and half the 90s.
    But then OSes only ran one program at a time.

    Interactive operating systems in 1967 (e.g. TSS8) were running
    multiple programs at a time.


    And when virtual addressing came along, and multiple programs could
    co-exist at the same address, PIC wasn't really needed either.

    Virtual addressing has been part of computer systems since the 1960s.

    PIC is obviously necessary for any kind of shared code (shared object
    or DLL) that gets loaded at different base addresses in different
    processes.

    I wouldn't call that PIC. On Windows, DLLs need base-relocation tables.

    That means that if a line of code uses an absolute address, it will
    still be an absolute address after relocation, but the address is
    changed. (Although it can't turn a 32-bit address into a 64-bit one if
    the relocation is large.)

    True PIC wouldn't need those relocation tables; it would avoid absolute addresses.

    I'm not sure that different processes see DLLs at different addresses
    either. A new process using an existing DLL might as well use it at the
    same address.

    Relocation is only needed if there is a clash: the process's start-up
    segments overlap the DLL's existing address, or it is loaded at a later
    point (via dlopen etc) when the address might be used for heap data by
    the new process.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Scott Lurndal on Fri Jun 28 15:42:47 2024
    On 28/06/2024 14:48, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 28/06/2024 04:30, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    For most, PIC isn't a necessity.

    Only because they use a virtual memory operating system which allows
    every executable to be mapped to the same fixed address in its own
    address space.

    PIC never seemed to be a requirement during the 1980s and half the 90s.
    But then OSes only ran one program at a time.

    Interactive operating systems in 1967 (e.g. TSS8) were running
    multiple programs at a time.

    OK, let's say the many millions of PCs used by home and business in that
    era only ran a program at a time.

    Other than that, I'm not familiar with the workings of 1960/70s major
    OSes. But it's quite possible that two programs could be swapped in and
    out of the same memory space.


    And when virtual addressing came along, and multiple programs could
    co-exist at the same address, PIC wasn't really needed either.

    Virtual addressing has been part of computer systems since the 1960s.

    OK, so was PIC needed for those or not? Did all processes share one
    virtual address space, or did each have its own?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Fri Jun 28 16:01:54 2024
    bart <bc@freeuk.com> writes:
    On 28/06/2024 14:48, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 28/06/2024 04:30, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    For most, PIC isn't a necessity.

    Only because they use a virtual memory operating system which allows
    every executable to be mapped to the same fixed address in its own
    address space.

    PIC never seemed to be a requirement during the 1980s and half the 90s.
    But then OSes only ran one program at a time.

    Interactive operating systems in 1967 (e.g. TSS8) were running
    multiple programs at a time.

    OK, let's say the many millions of PCs used by home and business in that
    era only ran a program at a time.

    That's not what you said. I pointed out that your statement
    was blatantly incorrect.

    The vast majority of real-world data processing was done
    on Operating Systems that could run more than one program
    at a time since the early 1960s through the 1990s.

    Even windows 3.1 could run multiple programs at the same
    time, in a crude fashion.

    And truly, whether there were 10000 or 200000 PCs running
    in homes or not is a completely meaningless argument. Much
    like your PIC objections, obsession with compile time or
    obsession with executable file size.


    And when virtual addressing came along, and multiple programs could
    co-exist at the same address, PIC wasn't really needed either.

    Virtual addressing has been part of computer systems since the 1960s.

    OK, so was PIC needed for those or not?

    Who cares? When PIC is called for (e.g. shared objects), it is
    used. When it isn't called for (static executables), it depends
    on the architecture (Burroughs large systems code was always
    position independent due to the stack-based architecture and
    dates to the early 1960s).


    Did all processes share one virtual address space, or did each have its own?

    Each had its own address space. "Virtual" by definition means
    that they have their own address space. Note that even systems
    with virtual memory have use cases for PIC code (Unix, Linux
    shared objects). Even tss8 on the PDP-8 in 1967 supported individual
    address spaces for the 10 or 20 active users (in 16 to 32KW of
    real memory), by swapping 4KW pages.

    Your experiences seem limited to PC systems.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Fri Jun 28 15:48:00 2024
    bart <bc@freeuk.com> writes:
    On 28/06/2024 14:48, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 28/06/2024 04:30, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    For most, PIC isn't a necessity.

    Only because they use a virtual memory operating system which allows
    every executable to be mapped to the same fixed address in its own
    address space.

    PIC never seemed to be a requirement during the 1980s and half the 90s.
    But then OSes only ran one program at a time.

    Interactive operating systems in 1967 (e.g. TSS8) were running
    multiple programs at a time.


    And when virtual addressing came along, and multiple programs could
    co-exist at the same address, PIC wasn't really needed either.

    Virtual addressing has been part of computer systems since the 1960s.

    PIC is obviously necessary for any kind of shared code (shared object
    or DLL) that gets loaded at different base addresses in different
    processes.

    I wouldn't call that PIC. On Windows, DLLs need base-relocation tables.

    DLL code cannot, for example, use the movabs instruction or any
    other instruction that takes absolute addresses; relocatable code
    generated needs to be position independent. Trying to fixup
    non-PIC code at load time is fraught and pointless when compilers
    have the capability of generating PIC code.

    Only external interfaces (function addresses, global variables) should
    need relocation tables (GOT and PLT for ELF).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Michael S on Fri Jun 28 19:39:00 2024
    On Mon, 24 Jun 2024 18:10:06 +0300
    Michael S <already5chosen@yahoo.com> wrote:

    On Mon, 24 Jun 2024 15:00:26 +0100
    bart <bc@freeuk.com> wrote:

    On 24/06/2024 14:09, Michael S wrote:
    On Fri, 21 Jun 2024 22:47:46 +0100
    bart <bc@freeuk.com> wrote:

    On 21/06/2024 14:34, David Brown wrote:
    On 21/06/2024 12:42, bart wrote:
    On 21/06/2024 10:46, David Brown wrote:


    I understand your viewpoint and motivation.  But my own
    experience is mostly different.

    First, to get it out of the way, there's the speed of
    compilation. While heavy optimisation (-O3) can take
    noticeably longer, I never see -O0 as being in any noticeable
    way faster for compilation than -O1 or even -O2.

    Absolute time or relative?

    Both.

    For me, optimised options with gcc always take longer:

    Of course.  But I said it was not noticeable - it does not make
    enough difference in speed for it to be worth choosing.


      C:\c>tm gcc bignum.c -shared -s -obignum.dll        # from
    cold TM: 3.85

    Cold build times are irrelevant to development - when you are
    working on a project, all the source files and all your compiler
    files are in the PC's cache.



      C:\c>tm gcc bignum.c -shared -s -obignum.dll
      TM: 0.31

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O2
      TM: 0.83

      C:\c>tm gcc bignum.c -shared -s -obignum.dll -O3
      TM: 0.93

      C:\c>dir bignum.dll
      21/06/2024  11:14            35,840 bignum.dll

    Any build time under a second is as good as instant.

    I tested on a real project, not a single file.  It has 158 C
    files and about 220 header files.  And I ran it on my old PC,
    without any "tricks" that you dislike so much, doing full clean
    re-builds.  The files are actually all compiled twice, building
    two variants of the binary.

    With -O2, it took 34.3 seconds to build.  With -O1, it took 33.4
    seconds.  With -O0, it took 30.8 seconds.

    So that is a 15% difference for full builds.  In practice, of
    course, full rebuilds are rarely needed, and most builds after
    changes to the source are within a second or so.

    Then there's something very peculiar about your codebase.



    To me it looks more likely that your codebase is very unusual
    rather than David's

    In order to get meaningful measurements I took an embedded project
    that is significantly bigger than average by my standards. Here
    are times of full parallel rebuild (make -j5) on relatively old
    computer (4-core Xeon E3-1271 v3).

    Option   time(s)   -g time   text size
    -O0      13.1      13.3      631648
    -Os      13.6      14.1      424016
    -O1      13.5      13.7      455728
    -O2      14.0      14.1      450056
    -O3      14.0      14.6      525380

    The difference in time between different -O settings in my
    measurements is even smaller than reported by David Brown. That
    can be attributed to the older compiler (gcc 4.1.2). Another
    difference is that this compiler works under cygwin, which is
    significantly slower than both native Linux and native Windows.
    That causes relatively higher make overhead and a longer link.

    I don't know why Cygwin would make much difference; the native code
    is still running on the same processor.


    I don't know the specific reasons. The bird's-eye perspective is that
    cygwin tries to emulate Posix semantics on a platform that is not Posix,
    and achieves that by using a few low-granularity semaphores in user
    space, which seriously limits parallelism. Besides, there are problems
    with emulation of Posix I/O semantics that cause cygwin file I/O to be
    2-3 times slower than native Windows I/O. The latter applies mostly to
    relatively small files, but, then again, a software build mostly
    accesses small files.
    As a matter of fact, the parallel speed-up I see on this project on this
    quad-core machine is barely 2x. I expect 3x or a little more for the
    same project with native Windows tools.



    Further investigation proved that cygwin is not at fault.
    The slowness and poor scalability of the build process on this computer
    was caused by a computer virus that our IT department mistakenly calls
    an antivirus.
    When I run more than 1 compilation job in parallel, the virus, which
    appears to be strictly single-threaded, becomes the main bottleneck. When
    I run more than 2 compilation jobs in parallel, it becomes the sole
    bottleneck.
    The impact is not specific to gcc - the virus hates all compilers in
    existence, both cross and native, indiscriminately. However gcc is
    impacted much worse than, for example, MSVC. I'd guess that the main
    difference between the two is that gcc compilation is two-stage -
    generation of asm and assembling, while MSVC compilation is a single
    stage.
    Of course, with virus in action, the difference between gcc -O0 and
    -O2 is lost in noise.

    However, when I repeated an experiment on virus-free computer (my own
    old home PC) the difference between -O0 and -O2 was still smaller than
    your reports - approximately 1.8x on average when link time excluded.
    But in this case the code base was less of representative of "real"
    project. The truth is - I have no "real" projects in C that are not
    embedded.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Scott Lurndal on Fri Jun 28 19:45:57 2024
    On Fri, 28 Jun 2024 16:01:54 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:


    "Virtual" by definition means that they have their own address space.

    "Virtual" and "Single Address Space" are not contradictory.
    If you don't believe me, ask Ivan Godard.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Scott Lurndal on Fri Jun 28 20:37:59 2024
    On 28/06/2024 16:48, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 28/06/2024 14:48, Scott Lurndal wrote:
    bart <bc@freeuk.com> writes:
    On 28/06/2024 04:30, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    For most, PIC isn't a necessity.

    Only because they use a virtual memory operating system which allows
    every executable to be mapped to the same fixed address in its own
    address space.

    PIC never seemed to be a requirement during the 1980s and half the 90s.
    But then OSes only ran one program at a time.

    Interactive operating systems in 1967 (e.g. TSS8) were running
    multiple programs at a time.


    And when virtual addressing came along, and multiple programs could
    co-exist at the same address, PIC wasn't really needed either.

    Virtual addressing has been part of computer systems since the 1960s.

    PIC is obviously necessary for any kind of shared code (shared object
    or DLL) that gets loaded at different base addresses in different
    processes.

    I wouldn't call that PIC. On Windows, DLLs need base-relocation tables.

    DLL code cannot, for example, use the movabs instruction or any
    other instruction that takes absolute addresses;

    That's not right. That is exactly what base-relocation tables are for.


    relocatable code generated needs to be position independent.

    Um, no it doesn't. If writing object files, then lots will be
    relocatable, but once the code gets linked into an executable, it need
    not be relocatable nor position-independent.

    Trying to fixup
    non-PIC code at load time is fraught and pointless when compilers
    have the capability of generating PIC code.

    They might do, depending on what conveniences the instruction set offers.
    There was a reason why RIP-relative addressing was introduced on x64. But
    the compiler still has to play along. That's where people like me come in.


    Only external interfaces (function addresses, global variables) should
    need relocation tables (GOT and PLT for ELF).

    This is where some people might get the misleading idea that with PIC,
    you can take an individual function and just move it anywhere. If you
    have references to it at 100 call-sites, they will need updating! Plus
    any calls that exist inside that function will need changing.

    PIC applies to a whole unit like an EXE or DLL file.

    Your experiences seem limited to PC systems.

    I've used PCs a lot yes; so?

    I've also used PDP10, PDP11 and ICL equipment. I've written assemblers
    which directly generated machine code for devices like Z80, 8051, 80188
    (a special version of 8088). All this was long before I first used a PC.

    Funnily enough, PIC for native code executables never came up that I can remember.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Sat Jun 29 11:05:41 2024
    On 28/06/2024 10:20, Michael S wrote:
    On Fri, 28 Jun 2024 06:56:41 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    and Borland, as well as gcc in various packagings. (And embedded
    developers use whatever cross-compile tools are appropriate for their
    target, regardless of the host - the great majority use gcc now.)


    I am not sure that it is still true.
    Vendors like TI and ST Micro push users of their ARMv7-M development
    suites toward clang. I didn't look at NXP lately, but would be surprised
    if it's not the same.

    ARM have moved their official development suite to clang (it used to be
    Keil's compiler, and before that they had their own). But that is a big toolchain with lots of proprietary tools - clang is just the compiler.
    And they charge a lot of money for it. (It's been a fair number of
    years since I tested ARM / Keil tools, and at the time I did not think
    it gave significant benefits and was therefore not worth the money and
    the great inconveniences and risks of the licensing system.)

    For ARM development, there are basically four choices of toolchains.
    There is gcc (primarily the "gnu arm embedded" toolchain, but also Code Sourcery and a few other variants where you pay for better libraries, commercial support, and the like). There is IAR (popular amongst
    companies that have plenty of money and have always used IAR on other microcontrollers). There is Green Hills (popular in the automotive
    industry). And there is ARM's own toolchain.

    gcc toolchains are free, and the standard from manufacturers for a very
    long time. The others are very expensive, and it's very questionable if
    they actually provide much extra value for most development teams.
    (They /do/ provide better tools for some kinds of development.)
    Consequently, gcc is overwhelmingly the most popular.

    Of course, microcontroller manufacturers could also package up
    clang-based toolchains and provide them for free. And some group could
    make the equivalent of the microcontroller manufacturer independent "gnu
    arm embedded" toolchain. (ARM is not going to do that themselves, I
    think, because they like to sell their own toolsuite for lots of money.)
    Then I'd expect clang-based tools to be popular, like gcc.

    I think something like that will come, and I welcome the options - but I
    don't see it yet. I've seen a move from manufacturers from Eclipse to
    VS Code, but not from gcc to clang.

    It's not like a dev can't choose gcc with these suites,
    but it takes effort that the majority of embedded code monkeys are
    incapable of, and the majority of the minority see no reason to.



    And of course there /are/ other microcontroller cores, not just ARM.
    But they are very much the minority these days. However, MIPS (PIC32),
    AVR and MSP430 are still around, and most developers use gcc for them
    too. And for embedded x86 development, it's almost all Linux and gcc.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to David Brown on Sat Jun 29 09:15:16 2024
    On 2024-06-29, David Brown <david.brown@hesbynett.no> wrote:
    gcc toolchains are free, and the standard from manufacturers for a very
    long time. The others are very expensive, and it's very questionable if
    they actually provide much extra value for most development teams.
    (They /do/ provide better tools for some kinds of development.)
    Consequently, gcc is overwhelmingly the most popular.

    One thing: builds having to talk to some god forsaken licensing server
    to get permission to compile is beyond ridiculous.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Sat Jun 29 10:47:54 2024
    On 28/06/2024 12:19, bart wrote:
    On 28/06/2024 04:23, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It is also about the same speed as code produced by one of my
    compilers.

    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall and -W?



    Using products like tcc doesn't mean never using gcc. (Especially on
    Linux where you will have it installed anyway.)

    You can use the latter to do extra, periodic checks that the simpler
    compiler may have missed, or to produce faster production builds.

    But gcc is not needed for routine compilation.



    It is certainly a reasonable idea to use gcc as a linter if your normal
    compile is poor at static checking. I've done that myself in the past -
    in embedded development, you don't always get to choose a good compiler.
    These days I'd be more likely to go for clang-tidy as a linter, and
    there are other more advanced tools available (especially if you have the
    money).

    However, I don't see the point in doing any kind of compilation at all,
    until you have at least basic linting and static error checking in
    place. If I've used the value of a variable without initialising it, I
    have no interest in running the program until that is fixed. And if I
    don't want to run the program, I've no interest in compiling it.

    So I simply can't comprehend why you'd want fast but pointless compiles regularly, then only occasionally check to see if the code is actually
    correct - or at least, does not contain detectable errors.

    Now, if you were using one of these "big" linters that does simulations
    of your code and takes hours to run on a server machine before posting
    back a report, that's a different matter. Tools like that run overnight
    or in batches, integrating with version control systems, automatic test
    setups, and so on. But that's not what we are talking about.

    Say tcc takes 0.2 seconds to compile your code, and "gcc -O2 -Wall"
    takes 3 seconds. If gcc catches an error that you missed, that's
    probably going to save you between half an hour and several days
    finding the problem by trial-and-error debugging.
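    As a concrete (invented) instance of the class of bug at stake: an
    uninitialised accumulator, which "gcc -O2 -Wall" reports via
    -Wmaybe-uninitialized while a minimal non-checking compiler may accept
    silently. The fixed form is shown; the buggy form is in the comment:

    ```c
    #include <stdio.h>

    /* Buggy form a fast non-checking compiler compiles without complaint:
     *     int total;            // indeterminate value
     *     ...
     *     total += a[i];        // first add reads garbage
     * The fix is one initialisation: */
    static int sum_positive(const int *a, int n) {
        int total = 0;                 /* the one-line fix */
        for (int i = 0; i < n; i++)
            if (a[i] > 0)
                total += a[i];
        return total;
    }

    int main(void) {
        int a[] = { 1, -2, 3 };
        printf("%d\n", sum_positive(a, 3));   /* prints 4 */
        return 0;
    }
    ```

    Without the initialisation the function "works" on many runs, which is
    exactly why trial-and-error debugging of it is so expensive.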

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Sat Jun 29 12:11:52 2024
    On 29/06/2024 09:47, David Brown wrote:
    On 28/06/2024 12:19, bart wrote:
    On 28/06/2024 04:23, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It is also about the same speed as code produced by one of my
    compilers.

    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall and -W?



    Using products like tcc doesn't mean never using gcc. (Especially on
    Linux where you will have it installed anyway.)

    You can use the latter to do extra, periodic checks that the simpler
    compiler may have missed, or to produce faster production builds.

    But gcc is not needed for routine compilation.



    It is certainly a reasonable idea to use gcc as a linter if your normal compile is poor at static checking.  I've done that myself in the past -
    in embedded development, you don't always get to choose a good compiler.
    These days I'd be more likely to go for clang-tidy as a linter, and there are other more advanced tools available (especially if you have the money).

    However, I don't see the point in doing any kind of compilation at all,
    until you have at least basic linting and static error checking in
    place.  If I've used the value of a variable without initialising it, I
    have no interest in running the program until that is fixed.

    You run the program, and if it doesn't work or has some bug that you
    can't figure out, then you can pass it through a bigger compiler to get
    its opinion.

    If the program works, then you can still do that to pick up things
    you've missed, in case the program worked by luck. But this doesn't need
    to be done every time.

    I wonder, how do you work with Python. Presumably there you're used to
    making frequent small tweaks (maybe lining up some output) and running
    it immediately to check it's OK.

    If there were some elaborate checker for Python code, or you had an
    extensive test suite that executed every line of the program to check you
    hadn't misspelled anything, I expect you wouldn't run those after
    every minor edit.

    You use Python /because/ it's quick and informal.


      And if I
    don't want to run the program, I've no interest in compiling it.

    So I simply can't comprehend why you'd want fast but pointless compiles regularly, then only occasionally check to see if the code is actually correct - or at least, does not contain detectable errors

    For similar reasons that drivers don't stop to check their tyre
    pressures every mile. Or why MOTs (inspections) are only mandated once a
    year.


    Now, if you were using one of these "big" linters that does simulations
    of your code and takes hours to run on a server machine before posting
    back a report, that's a different matter.  Tools like that run overnight
    or in batches, integrating with version control systems, automatic test setups, and so on.  But that's not what we are talking about.

    Say tcc takes 0.2 seconds to compile your code, and "gcc -O2 -Wall"
    takes 3 seconds.  If gcc catches an error that you missed, that's
    probably going to save you between half an hour and several days
    finding the problem by trial-and-error debugging.

    Take any kind of interactive work done on a computer - editing a source
    file, doing creative writing, writing a reply on a forum, all sorts of
    stuff you expect to work smoothly.

    Now imagine that every so often it freezes for several seconds. (This
    is exactly what used to happen on Thunderbird last year. It's what
    happens now on one of my smart TV apps, there can be several seconds
    latency after pressing a button on the remote.)

    It would be incredibly annoying. It would break your concentration. It
    would make some things impractical (I don't use the smart TV app; I
    browse on my PC instead.)

    That is what gcc is like to me. There is no fluency. YMMV.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Kaz Kylheku on Sat Jun 29 15:14:23 2024
    On 28/06/2024 11:26, Kaz Kylheku wrote:
    On 2024-06-28, bart <bc@freeuk.com> wrote:
    On 28/06/2024 04:23, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It is also about the same speed as code produced by one of my
    compilers.

    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall and -W?

    Using products like tcc doesn't mean never using gcc. (Especially on
    Linux where you will have it installed anyway.)

    You can use the latter to do extra, periodic checks that the simpler
    compiler may have missed, or to produce faster production builds.

    But gcc is not needed for routine compilation.

    Catching common bugs in routine compilation is better than once
    a month.

    You could be wasting time debugging something where GCC would have told
    you right away you have something uninitialized or whatever.

    Let's take the C program below. It has 4 things wrong with it, marked
    with comments.

    Gcc 10, run with no extra options, gives me warnings about (1) and (3);
    the others are ignored.

    Gcc 14, also with no extra options, gives errors for (1) and (3).

    I'm not interested in comments that you COULD have provided enough
    options to make those 1 and 3 fatal errors, and possibly taken care of 2
    and 4 as well.

    What is interesting is that Gcc 14 decided to make those hard errors by default. WHY? What was wrong with how earlier versions did it?

    For years I've been complaining about that, with people telling I was an
    idiot, that I should RTFM, that I should do this or that. And now
    suddenly gcc 14 does what I said it should be doing. Maybe I had a point
    after all!

    Anyway, my 'mcc' compiler already reports 1 and 3 as hard errors.

    My older bcc compiler reported 4 as a hard error unless an override was
    used.

    And for a while, it could detect 2; but I had to remove it because
    sometimes such code is valid; it would have needed the C to be changed,
    with a dummy return, to enable it to pass.

    So it's not as though a little compiler does no checking at all; it can
    pick up obvious things that a bigger one misses or does not treat
    seriously. Gcc 10 (perhaps even up to 13) can produce an EXE which
    someone can run; my mcc wouldn't allow it.

    BTW if I try to reproduce this program in my systems language, then all
    4 are hard errors. My language is stricter. My 'linting' is poor, that's
    all.


    -------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int F(void) {
        return;             // 1 No value
    }

    int G(void) {
        if (rand())
            return 0;
    }                       // 2 Possibly runs off the end

    int main() {
        char s[10];
        char *p = &s;       // 3 Wrong types

        main(123);          // 4 Unchecked arg types
    }
    -------------------------
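    For reference, a sketch of the same program with all four issues
    repaired (the recursive main(123) call is simply dropped, since with a
    checked prototype it has no valid form; the strcpy/printf at the end is
    added here just to exercise the fixed pointer):

    ```c
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int F(void) {
        return 0;            /* 1 fixed: an int function returns a value */
    }

    int G(void) {
        if (rand())
            return 0;
        return 1;            /* 2 fixed: every path now returns */
    }

    int main(void) {         /* 4 fixed: (void) lets calls be type-checked */
        char s[10];
        char *p = s;         /* 3 fixed: array decays to char *;
                                &s would be char (*)[10] */

        strcpy(p, "ok");
        printf("%d %s\n", F(), p);
        return 0;
    }
    ```

    This version compiles cleanly under gcc 14's stricter defaults as well
    as the older, more permissive ones.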

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to bart on Sat Jun 29 08:38:11 2024
    bart <bc@freeuk.com> writes:

    On 28/06/2024 11:26, Kaz Kylheku wrote:

    On 2024-06-28, bart <bc@freeuk.com> wrote:

    On 28/06/2024 04:23, Kaz Kylheku wrote:

    On 2024-06-27, bart <bc@freeuk.com> wrote:

    And for most of /my/ compiles, the code produced by gcc-O0 is
    fast enough. It's also about the same speed as code produced by
    one of my compilers.

    So I tend to use it when I want the extra speed, or other
    compilers don't work, or when a particular app only builds
    with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall
    and -W?

    Using products like tcc doesn't mean never using gcc.
    (Especially on Linux where you will have it installed anyway.)

    You can use the latter to do extra, periodic checks that the
    simpler compiler may have missed, or to produce faster production
    builds.

    But gcc is not needed for routine compilation.

    Catching common bugs in routine compilation is better than once
    a month.

    You could be wasting time debugging something where GCC would have
    told you right away you have something uninitialized or whatever.

    Let's take the C program below. It has 4 things wrong with it,
    marked with comments.

    [...]

    People are never going to take you seriously as long as
    you keep offering what are obviously strawman arguments,
    and especially ones where you know better but pretend
    that you don't.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Tim Rentsch on Sat Jun 29 17:11:19 2024
    On 29/06/2024 16:38, Tim Rentsch wrote:
    bart <bc@freeuk.com> writes:

    On 28/06/2024 11:26, Kaz Kylheku wrote:

    On 2024-06-28, bart <bc@freeuk.com> wrote:

    On 28/06/2024 04:23, Kaz Kylheku wrote:

    On 2024-06-27, bart <bc@freeuk.com> wrote:

    And for most of /my/ compiles, the code produced by gcc-O0 is
    fast enough. It's also about the same speed as code produced by
    one of my compilers.

    So I tend to use it when I want the extra speed, or other
    compilers don't work, or when a particular app only builds
    with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall
    and -W?

    Using products like tcc doesn't mean never using gcc.
    (Especially on Linux where you will have it installed anyway.)

    You can use the latter to do extra, periodic checks that the
    simpler compiler may have missed, or to produce faster production
    builds.

    But gcc is not needed for routine compilation.

    Catching common bugs in routine compilation is better than once
    a month.

    You could be wasting time debugging something where GCC would have
    told you right away you have something uninitialized or whatever.

    Let's take the C program below. It has 4 things wrong with it,
    marked with comments.

    [...]

    People are never going to take you seriously as long as
    you keep offering what are obviously strawman arguments,
    and especially ones where you know better but pretend
    that you don't.

    You've perhaps missed my main point, which was that gcc 14 now reports
    hard errors BY DEFAULT for things which I have argued in the past should
    be hard errors by default.

    You've probably also missed my secondary point, which was asking WHY
    that change was made, since all that people had to do to get that
    behaviour was to jump through a number of hoops as I've always been told
    to do. Apparently that was not good enough!

    I've also learnt something interesting. Which is that whatever the
    current version of gcc does is always right, and I'm always wrong if I
    suggest it should be any different.

    But if gcc suddenly starts doing what I'd advocated, then according to
    you I'm still wrong! So I can never win.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Kaz Kylheku on Sat Jun 29 19:11:59 2024
    On 29/06/2024 11:15, Kaz Kylheku wrote:
    On 2024-06-29, David Brown <david.brown@hesbynett.no> wrote:
    gcc toolchains are free, and the standard from manufacturers for a very
    long time. The others are very expensive, and it's very questionable if
    they actually provide much extra value for most development teams.
    (They /do/ provide better tools for some kinds of development.)
    Consequently, gcc is overwhelmingly the most popular.

    One thing: builds having to talk to some god forsaken licensing server
    to get permission to compile is beyond ridiculous.


    I agree. I understand that vendors want to get paid for people using
    their tools (especially if that's their main business), and that it's
    hard to enforce license usage without some kind of license server or
    lock system. But it is really bad for the user. I've seen countless
    tricks to get around licensing restrictions - not to avoid paying money
    or cheating the supplier, but simply to get the flexibility and
    long-term usage that you have paid for. For me, the zero cost price of
    most gcc toolchains is not a big deal. I've paid for gcc toolchains and
    other toolchains, and am fine with that if they are good value for
    money. The zero restrictions on usage, on the other hand, is an
    enormous benefit.

    I think the gcc model is one that works out well - big sponsors of the development include cpu manufacturers like ARM. We use the gcc
    toolchain and make programs that run on ARM microcontrollers. And each microcontroller we buy then results in a bit of money back to ARM.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Harnden@21:1/5 to bart on Sat Jun 29 18:46:56 2024
    On 29/06/2024 15:14, bart wrote:
    [...]
    My older bcc compiler reported 4 as a hard error unless an override was
    used.

    But you didn't say anything about main's args.

    Make that 'int main(void)', then it does what you'd expect:

    bart.c:17:5: error: too many arguments to function ‘main’
       17 |     main(123);          // 4 Unchecked arg types
          |     ^~~~
    bart.c:13:5: note: declared here
       13 | int main(void) {
          |     ^~~~



    [...]

    -------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int F(void) {
        return;             // 1 No value
    }

    int G(void) {
        if (rand())
            return 0;
    }                       // 2 Possibly running into end

    int main() {
        char s[10];
        char *p = &s;       // 3 Wrong types

        main(123);          // 4 Unchecked arg types
    }
    -------------------------


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Sat Jun 29 19:42:01 2024
    On 29/06/2024 18:11, bart wrote:
    On 29/06/2024 16:38, Tim Rentsch wrote:
    bart <bc@freeuk.com> writes:

    On 28/06/2024 11:26, Kaz Kylheku wrote:

    On 2024-06-28, bart <bc@freeuk.com> wrote:

    On 28/06/2024 04:23, Kaz Kylheku wrote:

    On 2024-06-27, bart <bc@freeuk.com> wrote:

    And for most of /my/ compiles, the code produced by gcc-O0 is
    fast enough.  It's also about the same speed as code produced by
    one of my compilers.

    So I tend to use it when I want the extra speed, or other
    compilers don't work, or when a particular app only builds
    with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall
    and -W?

    Using products like tcc doesn't mean never using gcc.
    (Especially on Linux where you will have it installed anyway.)

    You can use the latter to do extra, periodic checks that the
    simpler compiler may have missed, or to produce faster production
    builds.

    But gcc is not needed for routine compilation.

    Catching common bugs in routine compilation is better than once
    a month.

    You could be wasting time debugging something where GCC would have
    told you right away you have something uninitialized or whatever.

    Let's take the C program below.  It has 4 things wrong with it,
    marked with comments.

    [...]

    People are never going to take you seriously as long as
    you keep offering what are obviously strawman arguments,
    and especially ones where you know better but pretend
    that you don't.

    You've perhaps missed my main point, which was that gcc 14 now reports
    hard errors BY DEFAULT for things which I have argued in the past should
    be hard errors by default.

    I don't remember anyone disagreeing with you that some things would be
    better as hard errors by default. People have been quite happy with the
    idea that code that breaks C syntax rules or constraint rules should be
    hard errors. But they disagree with you on some of your hard errors -
    code that we all think is bad code and almost certainly an error, but
    which does not break the rules of standard C can be a warning by
    default, but should not be an error by default.

    gcc, however, is restricted and limited by its past - the developers do
    not lightly make changes that will result in compilation failures of
    code that previously compiled fine and had been tested to run correctly.
    Such changes - as made for gcc 14 - are only done after long
    discussion and long testing with existing code bases.

    Many of us think it would be better if gcc's warnings and errors were a
    lot stricter by default. But we understand the reasons why gcc can't
    change these much, and we understand how to choose the checking levels
    with flags so that we can use gcc the way we want it. Telling you "just
    use -Wall", or similar, is advice and help - it is not saying that we
    like a lax compiler by default.


    You've probably also missed my secondary point, which was asking WHY
    that change was made, since all that people had to do to get that
    behaviour was to jump through a number of hoops as I've always been told
    to do. Apparently that was not good enough!

    I've also learnt something interesting. Which is that whatever the
    current version of gcc does is always right, and I'm always wrong if I suggest it should be any different.


    You are wrong when you continually misrepresent what others say. When
    people tell you the facts - be it about C or gcc - it is not the same as
    saying we necessarily like those facts.

    But if gcc suddenly starts doing what I'd advocated, then according to
    you I'm still wrong! So I can never win.


    You can win when you stop intentionally misunderstanding, and start
    listening.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Harnden@21:1/5 to Kaz Kylheku on Sat Jun 29 18:51:22 2024
    On 29/06/2024 10:15, Kaz Kylheku wrote:
    On 2024-06-29, David Brown <david.brown@hesbynett.no> wrote:
    gcc toolchains are free, and the standard from manufacturers for a very
    long time. The others are very expensive, and it's very questionable if
    they actually provide much extra value for most development teams.
    (They /do/ provide better tools for some kinds of development.)
    Consequently, gcc is overwhelmingly the most popular.

    One thing: builds having to talk to some god forsaken licensing server
    to get permission to compile is beyond ridiculous.


    I haven't seen that since, like, the 90s. Does that actually still happen?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to All on Sat Jun 29 21:56:42 2024
    On Sat, 29 Jun 2024 11:05:41 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    It does not sound like you know what you are talking about.
    Just download the latest Arm MCU variant of TI CCS or of STM32 Cube and see.
    They are free (like beer) and they are based on clang.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Sat Jun 29 21:49:44 2024
    On Sat, 29 Jun 2024 19:42:01 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    gcc, however, is restricted and limited by its past - the developers
    do not lightly make changes that will result in compilation failures
    of code that previously compiled fine and had been tested to run
    correctly. Such changes - as made for gcc 14 - are only done after
    long discussion and long testing with existing code bases.


    The default input dialect of the C language was changed (it seems, to
    gnu17) before gcc14. Maybe in gcc12.
    BTW, finding out which dialect is the default is less than trivial. If
    Bart calls it "jump through a number of hoops" he would be at least
    correct, and more likely understating his case.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Richard Harnden on Sat Jun 29 20:55:54 2024
    On 29/06/2024 18:46, Richard Harnden wrote:
    On 29/06/2024 15:14, bart wrote:
    [...]
    My older bcc compiler reported 4 as a hard error unless an override
    was used.

    But you didn't say anything about main's args.

    I did, indirectly. The actual error was the use of "()" as an empty
    parameter list (for any function, not just main, but my example could
    also have been 'void H(){H(123);}'). If you tried to compile:

    int main() {
        main(123);
    }

    then it wouldn't get past the () to the call.

    Eventually I dropped that restriction, and the reason was that so much
    code used such parameter lists, for any function.

    Not because they wanted unchecked args (there are some legitimate
    use-cases within function pointer types), but because so many people
    assumed () meant zero parameters like (void).

    Why was such code so common? Presumably because compilers said nothing;
    and they couldn't because the language allowed it. If they had required
    an override like mine did, more would have got the message.

    Now it's too late because apparently the meaning of () is changing to
    mean (void). All those people who got it wrong (and introduced a
    dangerous bug) have won!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to bart on Sun Jun 30 00:36:13 2024
    On 2024-06-29, bart <bc@freeuk.com> wrote:
    You've perhaps missed my main point, which was that gcc 14 now reports
    hard errors BY DEFAULT for things which I have argued in the past should
    be hard errors by default.

    I think you will not find disagreement about that here.

    Situations which are constraint or syntax violations in ISO C, where the compiler is not providing a useful extension, should be diagnosed in
    such a way that translation does not succeed.

    ISO C doesn't require that, possibly because it would forbid extensions. (Extensions that are non-conforming, but only in the regard that
    translation doesn't fail.)

    You've probably also missed my secondary point, which was asking WHY
    that change was made, since all that people had to do to get that
    behaviour was to jump through a number of hoops as I've always been told
    to do. Apparently that was not good enough!

    I've also learnt something interesting. Which is that whatever the
    current version of gcc does is always right, and I'm always wrong if I suggest it should be any different.

    It's always been a poor decision in GCC to allow incompatible pointer conversions without a cast, issuing only a diagnostic.

    However, that is not wrong in the sense of not conforming to ISO C.
    ISO C requires a diagnostic, not failed translation.

    There are different senses of "wrong".

    GCC also doesn't follow the ISO C dialect by default (no options given)
    but its own dialect.

    The requirements in ISO C are not sufficient to add up to good
    engineering, though.


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Sun Jun 30 01:43:20 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Sat, 29 Jun 2024 19:42:01 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    gcc, however, is restricted and limited by its past - the developers
    do not lightly make changes that will result in compilation failures
    of code that previously compiled fine and had been tested to run
    correctly. Such changes - as made for gcc 14 - are only done after
    long discussion and long testing with existing code bases.

    The default input dialect of the C language was changed (it seems, to
    gnu17) before gcc14. Maybe in gcc12.
    BTW, finding out which dialect is the default is less than trivial.

    I think it isn't hard to find out what variation of the C language
    gcc takes in its default mode.

    More importantly though, one should always give an explicit -std=
    option, and specify which version of the language is desired.
    Anything less is just asking for trouble.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to bart on Sun Jun 30 01:49:26 2024
    bart <bc@freeuk.com> writes:

    On 29/06/2024 16:38, Tim Rentsch wrote:

    bart <bc@freeuk.com> writes:

    On 28/06/2024 11:26, Kaz Kylheku wrote:

    On 2024-06-28, bart <bc@freeuk.com> wrote:

    On 28/06/2024 04:23, Kaz Kylheku wrote:

    On 2024-06-27, bart <bc@freeuk.com> wrote:

    And for most of /my/ compiles, the code produced by gcc-O0 is
    fast enough. It's also about the same speed as code produced by
    one of my compilers.

    So I tend to use it when I want the extra speed, or other
    compilers don't work, or when a particular app only builds
    with that compiler.

    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall
    and -W?

    Using products like tcc doesn't mean never using gcc.
    (Especially on Linux where you will have it installed anyway.)

    You can use the latter to do extra, periodic checks that the
    simpler compiler may have missed, or to produce faster production
    builds.

    But gcc is not needed for routine compilation.

    Catching common bugs in routine compilation is better than once
    a month.

    You could be wasting time debugging something where GCC would have
    told you right away you have something uninitialized or whatever.

    Let's take the C program below. It has 4 things wrong with it,
    marked with comments.

    [...]

    People are never going to take you seriously as long as
    you keep offering what are obviously strawman arguments,
    and especially ones where you know better but pretend
    that you don't.

    You've perhaps missed my main point,

    I didn't.

    You've probably also missed my secondary point,

    I didn't.

    I've also learnt something interesting. Which is that whatever the
    current version of gcc does is always right, and I'm always wrong if I suggest it should be any different.

    You still haven't learned the most important thing. Many or
    most of the people responding to you are offering constructive
    advice. For some reason you either don't hear or don't heed
    the advice but instead take it as a personal attack. As long
    as you keep doing that no one is going to care about what you
    have to say.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Sun Jun 30 11:05:15 2024
    On 29/06/2024 13:11, bart wrote:
    On 29/06/2024 09:47, David Brown wrote:
    On 28/06/2024 12:19, bart wrote:
    On 28/06/2024 04:23, Kaz Kylheku wrote:
    On 2024-06-27, bart <bc@freeuk.com> wrote:
    And for most of /my/ compiles, the code produced by gcc-O0 is fast
    enough. It's also about the same speed as code produced by one of my
    compilers.

    So I tend to use it when I want the extra speed, or other compilers
    don't work, or when a particular app only builds with that compiler.
    Otherwise the extra overheads are not worth the bother.

    How good are your diagnostics compared to GCC -O2, plus -Wall and -W?



    Using products like tcc doesn't mean never using gcc. (Especially on
    Linux where you will have it installed anyway.)
    You can use the latter to do extra, periodic checks that the simpler
    compiler may have missed, or to produce faster production builds.

    But gcc is not needed for routine compilation.



    It is certainly a reasonable idea to use gcc as a linter if your
    normal compile is poor at static checking.  I've done that myself in
    the past - in embedded development, you don't always get to choose a
    good compiler.   These days I'd be more likely to go for clang-tidy as
    a linter, and there's other more advanced tools available (especially
    if you have the money).

    However, I don't see the point in doing any kind of compilation at
    all, until you have at least basic linting and static error checking
    in place.  If I've used the value of a variable without initialising
    it, I have no interest in running the program until that is fixed.

    You run the program, and if it doesn't work or has some bug that you
    can't figure out, then you can pass it through a bigger compiler to get
    its opinion.

    If the program works, then you can still do that to pick up things
    you've missed, in case the program worked by luck. But this doesn't need
    to be done every time.


    Don't be silly. Testing can only show the presence of bugs, not their
    absence. It is not an alternative to automatic checks.

    I wonder, how do you work with Python. Presumably there you're used to
    making frequent small tweaks (maybe lining up some output) and running
    it immediately to check it's OK.


    It's a very different style of language from C, and I use it for very
    different purposes, so my development process is different. It is very
    common in Python development to have unit tests that you can run for
    automatic testing. (It's common in C and C++ programming too, but it
    can be hard to do that for hardware-specific embedded code.) There are
    also linters and checkers for Python, editors and IDEs that help you
    avoid some mistakes, type annotation, and other tools. You can't do
    nearly as much pre-run checking as you can in static languages, but you
    do what you can. Of course, for small programs or parts of programs,
    quick tests are all you need - not all code needs extensive tests.

    If there was some elaborate checker for Python code, or you had an
    extensive test suite that executed all lines of the program to check you hadn't misspelled anything, I expect you wouldn't run those after
    every minor edit.


    No, not after every edit. But then, I don't compile my C code after
    every edit. It depends on the size and type of the edit. Some checks -
    such as for misspelling - /are/ done continuously during editing, at
    least to some extent. A decent IDE or editor will help there.

    You use Python /because/ it's quick and informal.


    I use Python for tasks where it is a lot faster to develop in Python
    than C or C++, easier to get right, and where it has the libraries and
    modules that I need. I would not say it is "informal", I would say it
    is flexible.



      And if I don't want to run the program, I've no interest in
    compiling it.

    So I simply can't comprehend why you'd want fast but pointless
    compiles regularly, then only occasionally check to see if the code is
    actually correct - or at least, does not contain detectable errors.

    For similar reasons that drivers don't stop to check their tyre
    pressures every mile. Or why MOTs (inspections) are only mandated once a year.


    Every modern car has automatic checks for tyre pressure every mile or
    so. I don't want to spend time every build /manually/ re-checking
    through piles of code looking for errors, but I am quite happy to let my computer and compiler do it.


    Now, if you were using one of these "big" linters that does
    simulations of your code and takes hours to run on a server machine
    before posting back a report, that's a different matter.  Tools like
    that run overnight or in batches, integrating with version control
    systems, automatic test setups, and so on.  But that's not what we are
    talking about.

    Say tcc takes 0.2 seconds to compile your code, and "gcc -O2 -Wall"
    takes 3 seconds.  If gcc catches an error that you missed, that's
    probably going to save you between half an hour and several days
    finding the problem by trial-and-error debugging.

    Take any kind of interactive work done on a computer - editing a source
    file, doing creative writing, writing a reply on a forum, all sorts of
    stuff you expect to work smoothly.

    Now imagine that every so often it freezes for several seconds. (This is exactly what used to happen on Thunderbird last year. It's what happens
    now on one of my smart TV apps, there can be several seconds latency
    after pressing a button on the remote.)

    It would be incredibly annoying. It would break your concentration. It
    would make some things impractical (I don't use the smart TV app; I
    browse on my PC instead.)


    I agree. So I don't use a setup like that.

    That is what gcc is like to me. There is no fluency. YMMV.

    It is possible to have your cake and eat it. You think this is all a
    binary choice - it is not. You can have smooth and fluent development
    /and/ powerful checking /and/ efficient generated code /and/
    full-featured tools.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Sun Jun 30 11:17:50 2024
    On 29/06/2024 20:56, Michael S wrote:
    On Sat, 29 Jun 2024 11:05:41 +0200
    David Brown <david.brown@hesbynett.no> wrote:


    It does not sound like you know what you are talking about.
    Just download the latest Arm MCU variant of TI CCS or of STM32 Cube and see.
    They are free (like beer) and they are based on clang.


    It is a very long time since I have used TI's ARM devices, and therefore
    its tools. For STM, it was maybe a year or two ago. There are many manufacturers of microcontrollers, and while I have used a fair number
    over the years, I don't use all the different ones all the time. I also
    don't change development tools in existing projects.

    If TI and STM have changed over to providing clang-based tools, that is interesting to know. I will probably be working on a new STM32 project
    later this year. So that's nice information to know.

    It doesn't surprise me that some have moved to clang-based tools - I
    think it was inevitable. And I think it is good to see some variety and competition. But if the clang toolchains are not freely available in a device-independent fashion in the manner of the ARM-sponsored gcc
    toolchains, then it will be a big step backwards. The last thing
    embedded developers want is more custom variant toolchains and vendor
    lock-in. We used to have that - gcc let us escape from it. A
    disadvantage of clang's license is that it lets vendors try that shit again.


    So this is all very interesting, but it does not change the fact that
    gcc totally dominates for embedded development at the moment. It does, however, mean the proportions may change significantly in the coming years.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to bart on Sun Jun 30 12:18:35 2024
    On Sat, 29 Jun 2024 20:55:54 +0100
    bart <bc@freeuk.com> wrote:

    On 29/06/2024 18:46, Richard Harnden wrote:
    On 29/06/2024 15:14, bart wrote:
    [...]
    My older bcc compiler reported 4 as a hard error unless an
    override was used.

    But you didn't say anything about main's args.

    I did, indirectly. The actual error was the use of "()" as an empty
    parameter list (for any function, not just main, but my example could
    also have been 'void H(){H(123);}'). If you tried to compile:

    int main() {
        main(123);
    }

    then it wouldn't get past the () to the call.

    Eventually I dropped that restriction, and the reason was that so
    much code used such parameter lists, for any function.

    Not because they wanted unchecked args (there are some legitimate
    use-cases within function pointer types), but because so many people
    assumed () meant zero parameters like (void).

    Why was such code so common? Presumably because compilers said
    nothing; and they couldn't because the language allowed it. If they
    had required an override like mine did, more would have got the
    message.


    I tried following code:
    int foo() { return 1; }

    Both MSVC and clang warn about it at high warnings level (-Wall for
    MSVC, -Wpedantic for clang). But they don't warn at levels that most
    people use in practice (-W3 or -W4 for MSVC, -Wall for clang).
    gcc13 produces no warning even at -Wpedantic. It does produce a warning
    with '-Wpedantic -std=xxx' for all values of xxx except c23 and gnu23.
    The absence of a warning for c23/gnu23 makes sense; the rest of gcc's
    behavior less so.

    Now it's too late because apparently the meaning of () is changing to
    mean (void). All those people who got it wrong (and introduced a
    dangerous bug) have won!


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Sun Jun 30 11:23:02 2024
    On 29/06/2024 20:49, Michael S wrote:
    On Sat, 29 Jun 2024 19:42:01 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    gcc, however, is restricted and limited by its past - the developers
    do not lightly make changes that will result in compilation failures
    of code that previously compiled fine and had been tested to run
    correctly. Such changes - as made for gcc 14 - are only done after
    long discussion and long testing with existing code bases.


    The default input dialect of the C language was changed (it seems, to
    gnu17) before gcc14. Maybe in gcc12.
    BTW, finding out which dialect is the default is less than trivial. If
    Bart calls it "jump through a number of hoops" he would be at least
    correct, and more likely understating his case.


    That would be the wrong hoops to jump through. Don't use the default
    standard unless you are writing a little hello world program and don't
    care about the choice of standard - /specify/ the standard you want.
    And that one is a very, very simple jump.

    But as Keith said, this change was not about C standards. It was about
    gcc developers changing the balance between being stricter about
    defaults to reduce future programmers' mistakes, and being lax about
    defaults to allow older and poorer code to compile without error messages.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Sun Jun 30 11:48:40 2024
    On 30/06/2024 10:05, David Brown wrote:
    On 29/06/2024 13:11, bart wrote:


    If there was some elaborate checker for Python code, or you had an
    extensive test suite that executed all lines of the program to check
    you hadn't misspelled anything, I expect you wouldn't run those
    after every minor edit.


    No, not after every edit.  But then, I don't compile my C code after
    every edit.  It depends on the size and type of the edit.  Some checks - such as for misspelling - /are/ done continuously during editing, at
    least to some extent.  A decent IDE or editor will help there.


    An IDE or editor which presumably uses some fast aspects of fast
    compilation to give you real-time feedback.

    I don't use such smart tools and so rely on such feedback from the compiler.

    Many edits to my source code (I realised this last night) consist of
    commenting or uncommenting one line of code, eg. that may call or not
    call some routine, or changing some internal flag to enable something or
    other, eg. to show extra diagnostics. Or inserting or removing an early
    return. Or adding or removing or commenting out some diagnostic print statements.

    This is not working on the logic of some complex algorithm. It's
    changing something on a whim (maybe calling that fixup routine or not)
    and needing instant feedback. Here I don't need any deep analysis!

    If compilation took a minute, then I might have to use command-line
    options instead of editing an internal flag or using commenting. I might
    need to use (or develop) a debugger to avoid recompiling. I'd have to
    revert (in my language) to independent compilation so I'd only need to
    compile a small part after any change.

    I'd have to use some smart editor to tell me things quicker than a
    compiler would (if such an editor understood my language).

    This does not sound appealing. More like going back decades.

    Perhaps you can understand better why more-or-less instant compilation
    can be a useful thing: it eliminates the need for those cumbersome
    external solutions, and it keeps my own tools simple.

    It opens up possibilities.


    That is what gcc is like to me. There is no fluency. YMMV.

    It is possible to have your cake and eat it.  You think this is all a
    binary choice - it is not.  You can have smooth and fluent development
    /and/ powerful checking /and/ efficient generated code /and/
    full-featured tools.

    And you could have C development that works just like Python (well,
    minus its bundled libraries).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Sun Jun 30 17:54:14 2024
    On 30/06/2024 11:18, Michael S wrote:
    On Sat, 29 Jun 2024 20:55:54 +0100
    bart <bc@freeuk.com> wrote:

    On 29/06/2024 18:46, Richard Harnden wrote:
    On 29/06/2024 15:14, bart wrote:
    [...]
    My older bcc compiler reported 4 as a hard error unless an
    override was used.

    But you didn't say anything about main's args.

    I did, indirectly. The actual error was the use of "()" as an empty
    parameter list (for any function, not just main, but my example could
    also have been 'void H(){H(123);}'). If you tried to compile:

    int main() {
        main(123);
    }

    then it wouldn't get past the () to the call.

    Eventually I dropped that restriction, and the reason was that so
    much code used such parameter lists, for any function.

    Not because they wanted unchecked args (there are some legitimate
    use-cases within function pointer types), but because so many people
    assumed () meant zero parameters like (void).

    Why was such code so common? Presumably because compilers said
    nothing; and they couldn't because the language allowed it. If they
    had required an override like mine did, more would have got the
    message.


    I tried following code:
    int foo() { return 1; }

    Both MSVC and clang warn about it at high warnings level (-Wall for
    MSVC, -Wpedantic for clang). But they don't warn at levels that most
    people use in practice (-W3 or -W4 for MSVC, -Wall for clang).
    gcc 13 produces no warning even at -Wpedantic. It does produce a warning
    with '-Wpedantic -std=xxx' for all values of xxx except c23 and gnu23.
    The absence of a warning for c23/gnu23 makes sense; the rest of gcc's
    behavior, less so.


    gcc -Wpedantic makes very little sense without specifying a C standard
    (rather than a gnu C standard).

    But why would you expect a warning from code that is perfectly legal and well-defined C code, without explicitly enabling warnings that check for particular style issues? Non-prototype function declarations are
    deprecated (since C99), but not removed from the language until C23
    (where that declaration is now a function prototype).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Sun Jun 30 17:47:35 2024
    On 30/06/2024 12:48, bart wrote:
    On 30/06/2024 10:05, David Brown wrote:
    On 29/06/2024 13:11, bart wrote:


    If there was some elaborate checker for Python code, or you had an
    extensive test suite that executed all lines of the program to check
    you hadn't misspelled anything, I expect you wouldn't want to run those
    after every minor edit.


    No, not after every edit.  But then, I don't compile my C code after
    every edit.  It depends on the size and type of the edit.  Some checks
    - such as for misspelling - /are/ done continuously during editing, at
    least to some extent.  A decent IDE or editor will help there.


    An IDE or editor which presumably uses some fast aspects of fast
    compilation to give you real-time feedback.


    There's no compilation involved - there is some code analysis for syntax highlighting, identifying types (and therefore things like struct
    members), structure analysis, and so on.

    I don't use such smart tools and so rely on such feedback from the
    compiler.


    I prefer to use the best tools available. I might use a simple editor
    for remote (ssh) work for minor tasks, but editors which do reasonable highlighting are common and easily available.

    Many edits to my source code (I realised this last night) consist of commenting or uncommenting one line of code, eg. that may call or not
    call some routine, or changing some internal flag to enable something or other, eg. to show extra diagnostics. Or inserting or removing an early return. Or adding or removing or commenting out some diagnostic print statements.

    This is not working on the logic of some complex algorithm. It's
    changing something on a whim (maybe calling that fixup routine or not)
    and needing instant feedback. Here I don't need any deep analysis!

    If compilation took a minute, then I might have to use command-line
    options instead of editing an internal flag or using commenting. I might
    need to use (or develop) a debugger to avoid recompiling. I'd have to
    revert (in my language) to independent compilation so I'd only need to compile a small part after any change.

    I'd have to use some smart editor to tell me things quicker than a
    compiler would (if such an editor understood my language).

    This does not sound appealing. More like going back decades.


    I really think you have everything backwards. Decades ago, I used the
    kind of editors you say you use, before I had access to better choices.
    (At one time, I paid quite a bit of money for a good editor.)

    Perhaps you can understand better why more-or-less instant compilation
    can be a useful thing: it eliminates the need for those cumbersome
    external solutions, and it keeps my own tools simple.


    No, I can't.

    It opens up possibilities.


    It opens possibilities for doing lots more manual work, making more
    mistakes, finding those mistakes later, and generally working in a way
    most people were glad to move away from a generation ago.

    Fast compilation is fine and good in itself. And I can understand
    feeling that giant IDE's like Eclipse or MS Code are bigger than you
    want, and take time to learn. But it is incomprehensible to me that
    you'd /want/ to use such limited tools as you do.


    That is what gcc is like to me. There is no fluency. YMMV.

    It is possible to have your cake and eat it.  You think this is all a
    binary choice - it is not.  You can have smooth and fluent development
    /and/ powerful checking /and/ efficient generated code /and/
    full-featured tools.

    And you could have C development that works just like Python (well,
    minus its bundled libraries).

    I would prefer Python development that worked as well as my C development.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to David Brown on Sun Jun 30 19:10:18 2024
    On Sun, 30 Jun 2024 17:54:14 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    On 30/06/2024 11:18, Michael S wrote:
    On Sat, 29 Jun 2024 20:55:54 +0100
    bart <bc@freeuk.com> wrote:

    On 29/06/2024 18:46, Richard Harnden wrote:
    On 29/06/2024 15:14, bart wrote:
    [...]
    My older bcc compiler reported 4 as a hard error unless an
    override was used.

    But you didn't say anything about main's args.

    I did, indirectly. The actual error was the use of "()" as an empty
    parameter list (for any function, not just main, but my example
    could also have been 'void H(){H(123);}'). If you tried to compile:

    int main() {
        main(123);
    }

    then it wouldn't get past the () to the call.

    Eventually I dropped that restriction, and the reason was that so
    much code used such parameter lists, for any function.

    Not because they wanted unchecked args (there are some legitimate
    use-cases within function pointer types), but because so many
    people assumed () meant zero parameters like (void).

    Why was such code so common? Presumably because compilers said
    nothing; and they couldn't because the language allowed it. If they
    had required an override like mine did, more would have got the
    message.


    I tried following code:
    int foo() { return 1; }

    Both MSVC and clang warn about it at high warnings level (-Wall for
    MSVC, -Wpedantic for clang). But they don't warn at levels that most
    people use in practice (-W3 or -W4 for MSVC, -Wall for clang).
    gcc 13 produces no warning even at -Wpedantic. It does produce a
    warning with '-Wpedantic -std=xxx' for all values of xxx except c23
    and gnu23. The absence of a warning for c23/gnu23 makes sense; the
    rest of gcc's behavior, less so.


    gcc -Wpedantic makes very little sense without specifying a C
    standard (rather than a gnu C standard).

    But why would you expect a warning from code that is perfectly legal
    and well-defined C code, without explicitly enabling warnings that
    check for particular style issues? Non-prototype function
    declarations are deprecated (since C99), but not removed from the
    language until C23 (where that declaration is now a function
    prototype).

    I expect a warning at -Wall, because it is deprecated. Those who do
    not want the warning can turn it off explicitly with -Wno-strict-prototypes
    or whatever the name of the switch is.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Michael S on Mon Jul 1 00:20:38 2024
    On 30/06/2024 18:10, Michael S wrote:
    On Sun, 30 Jun 2024 17:54:14 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    But why would you expect a warning from code that is perfectly legal
    and well-defined C code, without explicitly enabling warnings that
    check for particular style issues? Non-prototype function
    declarations are deprecated (since C99), but not removed from the
    language until C23 (where that declaration is now a function
    prototype).

    I expect a warning at -Wall, because it is deprecated. Those who do
    not want the warning can turn it off explicitly with -Wno-strict-prototypes
    or whatever the name of the switch is.


    Deprecated does not mean you can't, or even should not, use the feature.
    It is just a warning that it might be removed some time in the future.
    I personally agree that it would have been better if there had been
    a warning here, but compatibility with existing code meant that gcc
    chose to be conservative. I've had "-Wstrict-prototypes" in my
    makefiles for decades. So I am glad to see C23 removing non-prototype
    declarations entirely, only 20 years or so late. But the default flags
    and settings are not picked for what you or I would like.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to bart on Mon Jul 1 13:09:54 2024
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc. (Especially on Linux where you will have it installed anyway.)

    The parenthetical remark is wrong.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Mon Jul 1 12:22:42 2024
    On 30/06/2024 16:47, David Brown wrote:
    On 30/06/2024 12:48, bart wrote:

    It opens up possibilities.


    It opens possibilities for doing lots more manual work, making more
    mistakes, finding those mistakes later, and generally working in a way
    most people were glad to move away from a generation ago.

    Fast compilation is fine and good in itself.  And I can understand
    feeling that giant IDE's like Eclipse or MS Code are bigger than you
    want, and take time to learn.  But it is incomprehensible to me that
    you'd /want/ to use such limited tools as you do.

    I've been doing this stuff a long time. I've got to know what problems
    come up for me in programming and in development, and which features of
    editor, compiler and language would have helped.

    Remember that I generally write all those tools and can do what I like.

    Of most help have been micro-features in my non-C language, some of
    which made my efficient compilers possible. There I can't use external
    tools since no one else supports my language.

    If I was to use C, then I'd lose those features, and HAVE to employ
    smarter tools to get over /some/ of the problems.


    Since I started creating my tools, the machines they run on have gotten
    several magnitudes faster /per core/, while the individual binaries they
    have to produce might be typically 1-2 magnitudes bigger.

    (In my Windows system folder, 90% of the DLLs and 93% of the EXEs are
    smaller than 1MB. 1MB is 1/8000th of the RAM of the PC. In 1985, a
    program 1/8000th the size of 640KB would have been 80 bytes in size!)

    And yet, with all the computing power available, people are still
    compiling one 1000-line module at a time (about 10KB of code); they are
    still using techniques to avoid compilation where possible; they are
    having to parallelise via multiple cores; they're having to use
    100-times bigger and more complex compilers to get a mere 2x speedup on
    top of that 1000x faster hardware.

    What's gone wrong?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Ben Bacarisse on Mon Jul 1 15:14:53 2024
    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc. (Especially on Linux
    where you will have it installed anyway.)

    The parenthetical remark is wrong.


    You mean it is possible for a Linux installation to not have gcc
    preinstalled?

    Sure, although in the dozen or two versions I've come across, it always
    has been. This is a different situation from Windows, where you can be
    fairly sure it won't be!

    But in the context I'm talking about, if installing tcc is a
    possibility, then so is installing gcc. Unless you want to include also
    systems too small to run gcc, in which case I'd extend it to systems too
    small to run any compiler.

    I think if talking about tcc vs. gcc, we need to assume a system capable
    of running either.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to bart on Tue Jul 2 16:00:50 2024
    bart <bc@freeuk.com> writes:

    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc. (Especially on Linux
    where you will have it installed anyway.)
    The parenthetical remark is wrong.

    You mean it is possible for a Linux installation to not have gcc preinstalled?

    I mean that saying "on Linux ... you will have it installed anyway" is
    wrong.

    Sure, although in the dozen or two versions I've come across, it
    always has been.

    I'm not sure what you mean by a "version". Every version (in the sense
    of release number) of a source-only Linux distribution will have gcc
    installed, but is that all you mean? Source-only distributions are rare
    and not widely used.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Ben Bacarisse on Tue Jul 2 16:44:10 2024
    On 02/07/2024 16:00, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc. (Especially on Linux
    where you will have it installed anyway.)
    The parenthetical remark is wrong.

    You mean it is possible for a Linux installation to not have gcc
    preinstalled?

    I mean that saying "on Linux ... you will have it installed anyway" is
    wrong.

    Sure, although in the dozen or two versions I've come across, it
    always has been.

    I'm not sure what you mean by a "version". Every version (in the sense
    of release number) of a source-only Linux distribution will have gcc installed, but is that all you mean? Source-only distributions are rare
    and not widely used.


    No I mean binary distributions (unless the install process silently
    compiled from source; I've no idea).

    That fact is that if you take any ordinary Linux user, not even a
    developer, then the chances are high that gcc will be available. Do the
    same with Windows, and the chances are low.

    On one project of mine that ran on Linux and had to invoke a C compiler,
    the default one it tried was gcc. I would have preferred tcc, but that
    was less likely to be installed. So they needed to use a '-tcc' option
    when it was.

    If I did it now, I'd try tcc first anyway, and if that failed, to fall
    back to gcc. Perhaps with a message suggesting they get tcc, as a hint.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to bart on Wed Jul 3 00:58:26 2024
    bart <bc@freeuk.com> writes:

    On 02/07/2024 16:00, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc. (Especially on Linux
    where you will have it installed anyway.)
    The parenthetical remark is wrong.

    You mean it is possible for a Linux installation to not have gcc
    preinstalled?
    I mean that saying "on Linux ... you will have it installed anyway" is
    wrong.

    Sure, although in the dozen or two versions I've come across, it
    always has been.
    I'm not sure what you mean by a "version". Every version (in the sense
    of release number) of a source-only Linux distribution will have gcc
    installed, but is that all you mean? Source-only distributions are rare
    and not widely used.

    No I mean binary distributions (unless the install process silently
    compiled from source; I've no idea).

    Which ones?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Ben Bacarisse on Wed Jul 3 01:23:36 2024
    On 03/07/2024 00:58, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 02/07/2024 16:00, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc. (Especially on Linux
    where you will have it installed anyway.)
    The parenthetical remark is wrong.

    You mean it is possible for a Linux installation to not have gcc
    preinstalled?
    I mean that saying "on Linux ... you will have it installed anyway" is
    wrong.

    Sure, although in the dozen or two versions I've come across, it
    always has been.
    I'm not sure what you mean by a "version". Every version (in the sense
    of release number) of a source-only Linux distribution will have gcc
    installed, but is that all you mean? Source-only distributions are rare
    and not widely used.

    No I mean binary distributions (unless the install process silently
    compiled from source; I've no idea).

    Which ones?


    I really, really don't remember. I've tinkered with Linux every so often
    for 20, maybe 25 years. You used to be able to order a job-lot of CDs
    with different versions. Few did much.

    Then there were various ones I tried under Virtual Box. All had gcc.

    I must have tried half a dozen, maybe more, on RPis. Those I know all
    had gcc too. So did a laptop or two with Linux. As does WSL now.

    I'm not sure what you're trying to do here.

    I will admit that it might not be 100% certain that a Linux OS on a
    system on which someone is planning to run a C compiler will have gcc installed, although that is not my experience.

    Will that do?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to bart on Wed Jul 3 01:54:38 2024
    bart <bc@freeuk.com> writes:

    On 03/07/2024 00:58, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 02/07/2024 16:00, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using
    gcc. (Especially on Linux
    where you will have it installed anyway.)
    The parenthetical remark is wrong.

    You mean it is possible for a Linux installation to not have gcc
    preinstalled?
    I mean that saying "on Linux ... you will have it installed anyway" is
    wrong.

    Sure, although in the dozen or two versions I've come across, it
    always has been.
    I'm not sure what you mean by a "version". Every version (in the sense
    of release number) of a source-only Linux distribution will have gcc
    installed, but is that all you mean? Source-only distributions are rare
    and not widely used.

    No I mean binary distributions (unless the install process silently
    compiled from source; I've no idea).
    Which ones?

    I really, really don't remember.

    "[I]n the dozen or two versions I've come across, it always has been".
    So you can't remember even one of the dozen or two versions you've come
    across? How long ago did you last come across a version of Linux?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to bart on Wed Jul 3 00:47:26 2024
    On 2024-07-03, bart <bc@freeuk.com> wrote:
    I really, really don't remember. I've tinkered with Linux every so often
    for 20, maybe 25 years. You used to be able to order a job-lot of CDs
    with different versions. Few did much.

    Every major distro I've ever used going back to 1995 made it optional
    to install just about everything, including the compiler suite.

    A very popular desktop distro is Ubuntu. GCC is not a base package in
    Ubuntu.

    Then there were various ones I tried under Virtual Box. All had gcc.

    You might have been using some ready-to-run preinstalled Virtual Box appliances, where someone already did the package selection for you and included the dev tools.

    If you have a program that uses a C compiler at run time, you will
    have to ask the user to install one.

    (If you make it work with something small like tcc, that would be
    practical to bundle with your program in a self-contained way.
    Doing that with GCC will bloat up your package.)

    I 100% agree with you about the horrible compiler bloat.

    It's like GCC is pregnant with octuplets: Two database engines (a boy
    and a girl), a CAD suite, three operating system kernels, an office productivity suite, ...

    GCC doesn't produce much better code than it did 25 years ago,
    and good luck running today's GCC on a machine from back then.

    How you write your C still matters today, in spite of all the bloat.

    I've been using this snippet as a "litmus test" for decades:

    void insert_before(node *succ, node *ins)
    {
        ins->prev = succ->prev;
        ins->next = succ;
        succ->prev->next = ins;
        succ->prev = ins;
    }

    GCC today generates the same code as some 28 years ago, and
    just like 28 years ago, you can shave an instruction off
    the target code using:

    void insert_before(node *succ, node *ins)
    {
        node *pred = succ->prev;

        ins->prev = pred;
        ins->next = succ;

        pred->next = ins;
        succ->prev = ins;
    }

    For all the bloat and slow compilation, how you code still
    matters, and decades-old tricks are still relevant.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Kaz Kylheku on Wed Jul 3 01:18:23 2024
    On 2024-07-03, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
    On 2024-07-03, bart <bc@freeuk.com> wrote:
    I really, really don't remember. I've tinkered with Linux every so often
    for 20, maybe 25 years. You used to be able to order a job-lot of CDs
    with different versions. Few did much.

    Every major distro I've ever used going back ti 1995 made it optional
    to install just about everything, including the compiler suite.

    A very popular desktop distro is Ubuntu. GCC is not a base package in
    Ubuntu.

    See this question on the askubuntu stackexchange site:

    https://askubuntu.com/questions/1276468/does-ubuntu-20-04-1-lts-not-come-with-a-c-compiler-by-default

    Quote:
    ] I recently downloaded the official desktop version of 20.04.1 and
    ] installed it on a machine. I was surprised to find that there was no gcc
    ] command available! I installed it with apt but I always thought that
    ] every flavour of Linux came with gcc straight out of the box.
    ]
    ] Has something recently changed in the release philosophy of Ubuntu?

    Same confusion: "I always thought that every flavor of Linux came with
    gcc straight out of the box."

    Umm, no. gcc is needed to build many of the packages in the distro, but
    is not a run-time dependency. (Or, only a small part of gcc is: some
    programs dynamically depend on a "libgcc".)

    A non-programming end user, who just wants e-mail, web, word processing
    and games does not require gcc.

    A programming user not working in C doesn't need gcc either, unless
    their programming language depends on gcc.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Wed Jul 3 09:08:35 2024
    On 03/07/2024 02:23, bart wrote:
    On 03/07/2024 00:58, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 02/07/2024 16:00, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc. (Especially on Linux
    where you will have it installed anyway.)
    The parenthetical remark is wrong.

    You mean it is possible for a Linux installation to not have gcc
    preinstalled?
    I mean that saying "on Linux ... you will have it installed anyway" is
    wrong.

    Sure, although in the dozen or two versions I've come across, it
    always has been.
    I'm not sure what you mean by a "version". Every version (in the sense
    of release number) of a source-only Linux distribution will have gcc
    installed, but is that all you mean? Source-only distributions are rare
    and not widely used.

    No I mean binary distributions (unless the install process silently
    compiled from source; I've no idea).

    Which ones?


    I really, really don't remember. I've tinkered with Linux every so often
    for 20, maybe 25 years. You used to be able to order a job-lot of CDs
    with different versions. Few did much.

    Then there were various ones I tried under Virtual Box. All had gcc.

    I must have tried half a dozen, maybe more, on RPis. Those I know all
    had gcc too.  So did a laptop or two with Linux. As does WSL now.

    I'm not sure what you're trying to do here.

    I will admit that it might not be 100% certain that a Linux OS on a
    system on which someone is planning to run a C compiler will have gcc installed, although that is not my experience.

    Will that do?

    In my experience, Linux distributions (which is a much more correct term
    than your "versions") rarely install gcc by default, unless they are source-based distributions. But virtually all will have gcc available
    for easy installation from their repositories. And they will pull it in automatically if the user installs something that requires it to run, or
    to install (such as some kinds of drivers that need to be matched to the
    kernel being used).

    So perhaps instead of insisting, incorrectly, that gcc is almost always installed on Linux, you could just say that gcc is almost always easily available, and move on. (And perhaps it is so easily installed that you
    did so without noticing it on your systems.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to David Brown on Wed Jul 3 10:36:23 2024
    On 03/07/2024 08:08, David Brown wrote:
    On 03/07/2024 02:23, bart wrote:
    On 03/07/2024 00:58, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 02/07/2024 16:00, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc.
    (Especially on Linux
    where you will have it installed anyway.)
    The parenthetical remark is wrong.

    You mean it is possible for a Linux installation to not have gcc
    preinstalled?
    I mean that saying "on Linux ... you will have it installed anyway" is
    wrong.

    Sure, although in the dozen or two versions I've come across, it
    always has been.
    I'm not sure what you mean by a "version". Every version (in the sense
    of release number) of a source-only Linux distribution will have gcc
    installed, but is that all you mean? Source-only distributions are rare
    and not widely used.

    No I mean binary distributions (unless the install process silently
    compiled from source; I've no idea).

    Which ones?


    I really, really don't remember. I've tinkered with Linux every so
    often for 20, maybe 25 years. You used to be able to order a job-lot
    of CDs with different versions. Few did much.

    Then there were various ones I tried under Virtual Box. All had gcc.

    I must have tried half a dozen, maybe more, on RPis. Those I know all
    had gcc too.  So did a laptop or two with Linux. As does WSL now.

    I'm not sure what you're trying to do here.

    I will admit that it might not be 100% certain that a Linux OS on a
    system on which someone is planning to run a C compiler will have gcc
    installed, although that is not my experience.

    Will that do?

    In my experience, Linux distributions (which is a much more correct term
    than your "versions") rarely install gcc by default, unless they are source-based distributions.  But virtually all will have gcc available
    for easy installation from their repositories.  And they will pull it in automatically if the user installs something that requires it to run, or
    to install (such as some kinds of drivers that need to be matched to the kernel being used).

    So perhaps instead of insisting, incorrectly, that gcc is almost always installed on Linux, you could just say that gcc is almost always easily available, and move on.  (And perhaps it is so easily installed that you
    did so without noticing it on your systems.)

    I've never had to install gcc on any distribution of Linux. That's not
    to say it was already installed, but if I ever had to use it, it was there.

    Maybe on very early versions, where I struggled to get it to do anything
    at all (like support a display) I didn't get around to using a C compiler.

    But I did exactly that on all Linuxes installed on Virtual Box, or that
    was on that notebook I had, or all the ones I tried across my two RPis,
    plus the WSLs I've used.

    That's enough of a track record for even one person that one can say,
    Linux pretty much always comes with gcc. And if it doesn't, it's easy to install as you say.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From DFS@21:1/5 to bart on Wed Jul 3 09:41:08 2024
    On 7/3/2024 5:36 AM, bart wrote:
    On 03/07/2024 08:08, David Brown wrote:
    On 03/07/2024 02:23, bart wrote:
    On 03/07/2024 00:58, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 02/07/2024 16:00, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc.
    (Especially on Linux
    where you will have it installed anyway.)
    The parenthetical remark is wrong.

    You mean it is possible for a Linux installation to not have gcc
    preinstalled?
    I mean that saying "on Linux ... you will have it installed
    anyway" is
    wrong.

    Sure, although in the dozen or two versions I've come across, it >>>>>>> always has been.
    I'm not sure what you mean by a "version".  Every version (in the sense
    of release number) of a source-only Linux distribution will have gcc
    installed, but is that all you mean?  Source-only distributions are rare
    and not widely used.

    No I mean binary distributions (unless the install process silently
    compiled from source; I've no idea).

    Which ones?


    I really, really don't remember. I've tinkered with Linux every so
    often for 20, maybe 25 years. You used to be able to order a job-lot
    of CDs with different versions. Few did much.

    Then there were various ones I tried under Virtual Box. All had gcc.

    I must have tried half a dozen, maybe more, on RPis. Those I know all
    had gcc too.  So did a laptop or two with Linux. As does WSL now.

    I'm not sure what you're trying to do here.

    I will admit that it might not be 100% certain that a Linux OS on a
    system on which someone is planning to run a C compiler will have gcc
    installed, although that is not my experience.

    Will that do?

    In my experience, Linux distributions (which is a much more correct
    term than your "versions") rarely install gcc by default, unless they
    are source-based distributions.  But virtually all will have gcc
    available for easy installation from their repositories.  And they
    will pull it in automatically if the user installs something that
    requires it to run, or to install (such as some kinds of drivers that
    need to be matched to the kernel being used).

    So perhaps instead of insisting, incorrectly, that gcc is almost
    always installed on Linux, you could just say that gcc is almost
    always easily available, and move on.  (And perhaps it is so easily
    installed that you did so without noticing it on your systems.)

    I've never had to install gcc on any distribution of Linux. That's not
    to say it was already installed, but if I ever had to use it, it was there.

    Maybe on very early versions, where I struggled to get it to do anything
    at all (like support a display) I didn't get around to using a C compiler.

    But I did exactly that on all Linuxes installed on Virtual Box, or that
    was on that notebook I had, or all the ones I tried across my two RPis,
    plus the WSLs I've used.

    That's enough of a track record for even one person that one can say,
    Linux pretty much always comes with gcc. And if it doesn't, it's easy to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.

    I think Windows should come with various development tools and programs preinstalled and ready to go: tcc, python, VS Code, SQLite.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to DFS on Wed Jul 3 17:58:32 2024
    On 03/07/2024 14:41, DFS wrote:
    On 7/3/2024 5:36 AM, bart wrote:
    On 03/07/2024 08:08, David Brown wrote:
    On 03/07/2024 02:23, bart wrote:
    On 03/07/2024 00:58, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 02/07/2024 16:00, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    On 01/07/2024 13:09, Ben Bacarisse wrote:
    bart <bc@freeuk.com> writes:

    Using products like tcc doesn't mean never using gcc.
    (Especially on Linux
    where you will have it installed anyway.)
    The parenthetical remark is wrong.

    You mean it is possible for a Linux installation to not have gcc
    preinstalled?
    I mean that saying "on Linux ... you will have it installed
    anyway" is
    wrong.

    Sure, although in the dozen or two versions I've come across, it >>>>>>>> always has been.
    I'm not sure what you mean by a "version".  Every version (in the sense
    of release number) of a source-only Linux distribution will have gcc
    installed, but is that all you mean?  Source-only distributions are rare
    and not widely used.

    No I mean binary distributions (unless the install process silently
    compiled from source; I've no idea).

    Which ones?


    I really, really don't remember. I've tinkered with Linux every so
    often for 20, maybe 25 years. You used to be able to order a job-lot
    of CDs with different versions. Few did much.

    Then there were various ones I tried under Virtual Box. All had gcc.

    I must have tried half a dozen, maybe more, on RPis. Those I know
    all had gcc too.  So did a laptop or two with Linux. As does WSL now.
    I'm not sure what you're trying to do here.

    I will admit that it might not be 100% certain that a Linux OS on a
    system on which someone is planning to run a C compiler will have
    gcc installed, although that is not my experience.

    Will that do?

    In my experience, Linux distributions (which is a much more correct
    term than your "versions") rarely install gcc by default, unless they
    are source-based distributions.  But virtually all will have gcc
    available for easy installation from their repositories.  And they
    will pull it in automatically if the user installs something that
    requires it to run, or to install (such as some kinds of drivers that
    need to be matched to the kernel being used).

    So perhaps instead of insisting, incorrectly, that gcc is almost
    always installed on Linux, you could just say that gcc is almost
    always easily available, and move on.  (And perhaps it is so easily
    installed that you did so without noticing it on your systems.)

    I've never had to install gcc on any distribution of Linux. That's not
    to say it was already installed, but if I ever had to use it, it was
    there.

    Maybe on very early versions, where I struggled to get it to do
    anything at all (like support a display) I didn't get around to using
    a C compiler.

    But I did exactly that on all Linuxes installed on Virtual Box, or
    that was on that notebook I had, or all the ones I tried across my two
    RPis, plus the WSLs I've used.

    That's enough of a track record for even one person that one can say,
    Linux pretty much always comes with gcc. And if it doesn't, it's easy
    to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.

    I think Windows should come with various development tools and programs preinstalled and ready to go: tcc, python, VS Code, SQLite.


    I think it already does if using WSL, presumably because (1) people
    expect it under Linux; (2) only developers are going to use it anyway.

    Windows itself is primarily a consumer product not a DIY OS as Linux
    comes across.

    Although I wouldn't mind if some of those were available; they would
    take up an insignificant amount of space compared to the rest of
    Windows. And would open interesting possibilities, such as supplying
    some programs as source code.

    It also needs a better built-in scripting language than 'BAT' scripts.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to bart on Wed Jul 3 21:33:16 2024
    On Wed, 3 Jul 2024 17:58:32 +0100
    bart <bc@freeuk.com> wrote:


    Windows itself is primarily a consumer product not a DIY OS as Linux
    comes across.

    Although I wouldn't mind if some of those were available; they would
    take up an insignificant amount of space compared to the rest of
    Windows. And would open interesting possibilities, such as supplying
    some programs as source code.

    It also needs a better built-in scripting language than 'BAT' scripts.

    Windows is primarily a corporate product and only secondarily a
    consumer one. I suppose, with a good degree of certainty, that a few
    important corporate clients would veto any attempt by Microsoft to
    provide compilers by default, or even behind a non-default but
    easy-to-check checkbox during installation.
    I'd think a few of them are already quite unhappy about the default
    presence of powerful scripting engines.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Kuyper@21:1/5 to Kaz Kylheku on Wed Jul 3 17:12:47 2024
    On 6/29/24 20:36, Kaz Kylheku wrote:
    ...
    Situations which are constraint or syntax violations in ISO C, where
    the compiler is not providing a useful extension, should be diagnosed
    in such a way that translation does not succeed.

    ISO C doesn't require that, possibly because it would forbid extensions. (Extensions that are non-conforming, but only in the regard that
    translation doesn't fail.)

    You're right, ISO C doesn't require that. ISO C only requires a
    conforming implementation of C to fail to translate a program if that
    program contains a correctly formatted #error directive that survives
    the conditional compilation phase. Therefore, extensions that don't
    result in a failed translation under those circumstances are fully
    conforming.

    The standard talks about conforming extensions (4p6); it says nothing
    about non-conforming ones, because there's so little that can usefully
    be said about them.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to DFS on Wed Jul 3 22:58:52 2024
    DFS <nospam@dfs.com> writes:

    distrowatch.com shows most distros come with gcc preinstalled.

    Assuming that "preinstalled" means "installed by default", can you point
    to a link that shows this? It would be good to have some actual data in
    this thread but I could not find anything like that on the site. Are
    you confusing "supports" with "installed by default"?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to bart on Thu Jul 4 10:18:58 2024
    On 03/07/2024 11:36, bart wrote:

    That's enough of a track record for even one person that one can say,
    Linux pretty much always comes with gcc.

    You can say it if you like, but it is simply not true. I can't say for
    sure why it seems to be true for you - perhaps it was just the options
    you picked during installation. However, I hope you now understand that
    it is not the case in general for other Linux users.

    And if it doesn't, it's easy to
    install as you say.

    Indeed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to DFS on Thu Jul 4 10:24:23 2024
    On 03/07/2024 15:41, DFS wrote:
    On 7/3/2024 5:36 AM, bart wrote:

    That's enough of a track record for even one person that one can say,
    Linux pretty much always comes with gcc. And if it doesn't, it's easy
    to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.


    No, it does not. Distrowatch shows the version of the packages in the distributions repositories, not what is installed by default. And
    almost all distributions will have gcc available.

    I think Windows should come with various development tools and programs preinstalled and ready to go: tcc, python, VS Code, SQLite.


    Python would be useful to have by default on Windows. The rest, not so
    much.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Michael S on Mon Jul 8 19:48:22 2024
    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 30 Jun 2024 17:54:14 +0200
    David Brown <david.brown@hesbynett.no> wrote:

    On 30/06/2024 11:18, Michael S wrote:

    On Sat, 29 Jun 2024 20:55:54 +0100
    bart <bc@freeuk.com> wrote:

    On 29/06/2024 18:46, Richard Harnden wrote:

    On 29/06/2024 15:14, bart wrote:
    [...]

    My older bcc compiler reported 4 as a hard error unless an
    override was used.

    But you didn't say anything about main's args.

    I did, indirectly. The actual error was the use of "()" as an empty
    parameter list (for any function, not just main, but my example
    could also have been 'void H(){H(123);}'). If you tried to compile:

    int main() {
    main(123);
    }

    then it wouldn't get past the () to the call.

    Eventually I dropped that restriction, and the reason was that so
    much code used such parameter lists, for any function.

    Not because they wanted unchecked args (there are some legitimate
    use-cases within function pointer types), but because so many
    people assumed () meant zero parameters like (void).

    Why was such code so common? Presumably because compilers said
    nothing; and they couldn't because the language allowed it. If they
    had required an override like mine did, more would have got the
    message.
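
    A minimal sketch of the distinction being discussed (function names are
    illustrative): before C23, "()" declared a function with an unchecked
    parameter list, while "(void)" declared one taking no arguments; C23
    makes "()" mean the same as "(void)".

    ```c
    #include <stdio.h>

    /* Pre-C23, this parameter list is unchecked: a call like
       unchecked(123) would typically compile without any diagnostic,
       which is exactly the trap described above. In C23, "()" now means
       "(void)" and such a call becomes a constraint violation. */
    int unchecked() { return 1; }

    /* This has always meant "takes no arguments"; passing an argument is
       a constraint violation that compilers must diagnose. */
    int checked(void) { return 2; }

    int main(void)
    {
        printf("%d %d\n", unchecked(), checked());
        return 0;
    }
    ```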

    I tried following code:
    int foo() { return 1; }

    Both MSVC and clang warn about it at high warnings level (-Wall for
    MSVC, -Wpedantic for clang). But they don't warn at levels that most
    people use in practice (-W3 or -W4 for MSVC, -Wall for clang).
    gcc13 produces no warning even at -Wpedantic. It does produce
    warning with '-Wpedantic -std=xxx' for all values of xxx except c23
    and gnu23. The absence of warning for c23/gnu23 makes sense, the
    rest of gcc behavior - less so.

    gcc -Wpedantic makes very little sense without specifying a C
    standard (rather than a gnu C standard).

    But why would you expect a warning from code that is perfectly legal
    and well-defined C code, without explicitly enabling warnings that
    check for particular style issues? Non-prototype function
    declarations are deprecated (since C99), but not removed from the
    language until C23 (where that declaration is now a function
    prototype).

    I expect a warning at -Wall, because it is deprecated. Those who do
    not want the warning can turn it off explicitly with
    -Wno-strict-prototypes or whatever the name of the switch is.

    I would like to offer a different view.

    To me the behavior of -Wall is kind of a "fuck you" from the gcc
    people. If a compile is done with, for example,

    gcc -std=c99 -pedantic -Wall ...

    the empty () were not deprecated for C99 (and in fact still
    are not since C23 hasn't been ratified yet). The attitude
    towards -Wall that it can change at any time - without regard
    to what -std=c?? option is given - effectively makes it
    useless, because it can't be relied on. gcc has any number
    of diagnostic options, but many or most of the prominent ones
    change over time and so have to be avoided if one wants
    repeatable behavior. I would happily settle for things like,
    say, -Wall99 or -Wall11 (that's two ells and two ones), but
    of course gcc doesn't provide stable diagnostic aggregates,
    only frustratingly ever-changing ones.
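
    One workaround for the moving-target aggregates described above is to
    pin individual, long-stable gcc warning flags rather than -Wall (a
    sketch; -Wstrict-prototypes and -Wold-style-definition are real gcc
    options, though the exact wording of their messages varies by version):

    ```shell
    # Write a file using the old-style "()" parameter list from the thread.
    cat > demo.c <<'EOF'
    int foo() { return 1; }          /* unchecked "()" parameter list */
    int main(void) { return 0; }
    EOF

    # Explicit flags instead of the -Wall aggregate: their meaning does not
    # shift between gcc releases the way -Wall's contents do.
    gcc -std=c99 -pedantic -Wstrict-prototypes -Wold-style-definition -c demo.c

    ls demo.o   # compilation still succeeds; the flags only warn
    ```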

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From DFS@21:1/5 to David Brown on Thu Jan 2 13:16:23 2025
    On 7/4/2024 4:24 AM, David Brown wrote:
    On 03/07/2024 15:41, DFS wrote:
    On 7/3/2024 5:36 AM, bart wrote:

    That's enough of a track record for even one person that one can say,
    Linux pretty much always comes with gcc. And if it doesn't, it's easy
    to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.


    No, it does not.

    Yes, it does.



    Distrowatch shows the version of the packages in the
    distributions repositories,

    Wrong again.

    From the founder of distrowatch:

    "The package versions are the ones included on the install media."


    not what is installed by default.

    Wrong.



    And almost all distributions will have gcc available.

    Right


    I think Windows should come with various development tools and
    programs preinstalled and ready to go: tcc, python, VS Code, SQLite.


    Python would be useful to have by default on Windows.  The rest, not so much.

    Wrong

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to DFS on Thu Jan 2 20:46:37 2025
    DFS <guhnoo-basher@linux.advocaca> writes:
    On 7/4/2024 4:24 AM, David Brown wrote:
    On 03/07/2024 15:41, DFS wrote:
    On 7/3/2024 5:36 AM, bart wrote:

    That's enough of a track record for even one person that one can say,
    Linux pretty much always comes with gcc. And if it doesn't, it's easy
    to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.


    No, it does not.

    Yes, it does.

    David is correct. I just installed Fedora41 and there were no
    development tools (compilers, devel libraries, binutils, gdb,
    make) installed by default (preinstalled).

    The end-user is required to install them manually.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Phillip@21:1/5 to Scott Lurndal on Thu Jan 2 16:47:32 2025
    On 1/2/25 3:46 PM, Scott Lurndal wrote:
    DFS <guhnoo-basher@linux.advocaca> writes:
    On 7/4/2024 4:24 AM, David Brown wrote:
    On 03/07/2024 15:41, DFS wrote:
    On 7/3/2024 5:36 AM, bart wrote:

    That's enough of a track record for even one person that one can say,
    Linux pretty much always comes with gcc. And if it doesn't, it's easy
    to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.


    No, it does not.

    Yes, it does.

    David is correct. I just installed Fedora41 and there were no
    development tools (compilers, devel libraries, binutils, gdb,
    make) installed by default (preinstalled).

    The end-user is required to install them manually.

    Same with Arch and Manjaro. Neither come preinstalled with gcc or devs
    tools.

    As far as I know, Debian doesn't come with gcc preinstalled either.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Waldek Hebisch@21:1/5 to Scott Lurndal on Thu Jan 2 22:22:04 2025
    Scott Lurndal <scott@slp53.sl.home> wrote:
    DFS <guhnoo-basher@linux.advocaca> writes:
    On 7/4/2024 4:24 AM, David Brown wrote:
    On 03/07/2024 15:41, DFS wrote:
    On 7/3/2024 5:36 AM, bart wrote:

    That's enough of a track record for even one person that one can say,
    Linux pretty much always comes with gcc. And if it doesn't, it's easy
    to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.


    No, it does not.

    Yes, it does.

    David is correct. I just installed Fedora41 and there were no
    development tools (compilers, devel libraries, binutils, gdb,
    make) installed by default (preinstalled).

    The end-user is required to install them manually.

    On Debian, the "small" install includes cpp, that is, the C
    preprocessor. The preprocessor is actually implemented by 'cc1', the
    core C compiler, so one can compile C programs and generate assembly.
    But most headers are missing and there is no way to create an
    executable (and one needs to invoke 'cc1' by its path in the
    gcc-internal directory, since the 'gcc' executable is missing too).
    Kind of silly, as 'cc1' is 33 MB and installing something like an
    extra 3 MB would give a working C compiler.
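
    On a system where the gcc driver is installed, the internal cc1 path
    can be queried as below (a sketch; on the minimal Debian install
    described above the driver is absent, so cc1 would have to be located
    by hand under the gcc-internal directories instead):

    ```shell
    # Ask the gcc driver where it would find its internal cc1 binary.
    # If cc1 is not found on the driver's search path, gcc just echoes
    # the bare name "cc1" instead of a full path.
    gcc -print-prog-name=cc1
    ```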

    BTW: I wrote "small" as that was minimal thing that installer
    offered and which included GUI and sshd. This installed
    several packages marked "optional" resulting in 1.3 GB disc
    use.

    BTW2: That probably could be good use case for 'tcc', it provides
    preprocessor and is much smaller than 'gcc'.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to DFS on Fri Jan 3 13:48:35 2025
    On 02/01/2025 19:16, DFS wrote:
    On 7/4/2024 4:24 AM, David Brown wrote:
    On 03/07/2024 15:41, DFS wrote:
    On 7/3/2024 5:36 AM, bart wrote:

    That's enough of a track record for even one person that one can
    say, Linux pretty much always comes with gcc. And if it doesn't,
    it's easy to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.


    No, it does not.

    Yes, it does.


    No.

    You were wrong half a year ago when this discussion was active, and you
    are still wrong now. Are you /really/ bearing a grudge for that long?



    Distrowatch shows the version of the packages in the distributions
    repositories,

    Wrong again.

    From the founder of distrowatch:

    "The package versions are the ones included on the install media."


    Re-read that sentence. Then try again a few more times, until you
    understand it. Having a package on the install media does /not/ mean it
    is necessarily installed - it merely means it is available for
    installation if the user wants. In most cases, people do not install
    anything close to all the packages on the installation media (unless the installation media is intentionally for a minimal install).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to DFS on Fri Jan 3 18:12:48 2025
    On 03/01/2025 18:04, DFS wrote:
    On 1/3/2025 7:48 AM, David Brown wrote:
    On 02/01/2025 19:16, DFS wrote:
    On 7/4/2024 4:24 AM, David Brown wrote:
    On 03/07/2024 15:41, DFS wrote:
    On 7/3/2024 5:36 AM, bart wrote:

    That's enough of a track record for even one person that one can
    say, Linux pretty much always comes with gcc. And if it doesn't,
    it's easy to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.


    No, it does not.

    Yes, it does.


    No.

    You were wrong half a year ago when this discussion was active, and
    you are still wrong now.  Are you /really/ bearing a grudge for that
    long?

    My experience in the past was gcc was almost always installed with the distro.


    So what?

    My experience in the past is that Linux is almost always Debian or a
    Debian derivative, and I have installed gcc myself if it is appropriate.
    But that's /my/ experience because of the distos /I/ choose. It is no
    more and no less relevant than /your/ experience. The reality is that
    in most Linux distributions, gcc is not installed by default.



    Distrowatch shows the version of the packages in the distributions
    repositories,

    Wrong again.

     From the founder of distrowatch:

    "The package versions are the ones included on the install media."


    gcc is neither necessary, nor installed by default, by most
    distributions. All Distrowatch says is that it is usually included on installation media, available for easy installation.

    Now you can go back to sleep for another six months.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From DFS@21:1/5 to David Brown on Fri Jan 3 12:04:08 2025
    On 1/3/2025 7:48 AM, David Brown wrote:
    On 02/01/2025 19:16, DFS wrote:
    On 7/4/2024 4:24 AM, David Brown wrote:
    On 03/07/2024 15:41, DFS wrote:
    On 7/3/2024 5:36 AM, bart wrote:

    That's enough of a track record for even one person that one can
    say, Linux pretty much always comes with gcc. And if it doesn't,
    it's easy to install as you say.


    distrowatch.com shows most distros come with gcc preinstalled.


    No, it does not.

    Yes, it does.


    No.

    You were wrong half a year ago when this discussion was active, and you
    are still wrong now.  Are you /really/ bearing a grudge for that long?

    My experience in the past was gcc was almost always installed with the
    distro.



    Distrowatch shows the version of the packages in the distributions
    repositories,

    Wrong again.

     From the founder of distrowatch:

    "The package versions are the ones included on the install media."


    Re-read that sentence.  Then try again a few more times, until you understand it.  Having a package on the install media does /not/ mean it
    is necessarily installed

    I never said it did. Nor did distrowatch make that claim.


    - it merely means it is available for
    installation if the user wants.  In most cases, people do not install anything close to all the packages on the installation media (unless the installation media is intentionally for a minimal install).


    Reread your incorrect claim a few more times, then read what distrowatch
    said, until YOU understand it:

    You: "Distrowatch shows the version of the packages in the
    distributions repositories"

    Distrowatch: "The package versions are the ones included on the install
    media."


    The versions (in the repo and shown on distrowatch) might match the
    moment the distro is released, but the way FOSS is updated willy nilly
    every 5 minutes, they soon do not.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)