• Porting code from C

    From Bart@21:1/5 to All on Sat Dec 3 20:17:01 2022
    David Brown:
    You can take a program in your language and translate it fairly
    directly into C, albeit some parts will be ugly or non-idiomatic.

    This can work, because I've done it, but there are still quite a few
    problems, such as UB, limitations or omissions in C, or needing to
    cajole C compilers into accepting my code.

    You can take an old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    Targeting a lower-level language is easier than one higher level than
    your source language.

    Mine is a little higher level than C, so is a little more challenging.
    It can be done of course, depending on how crudely you want to express
    the original code.

    But let's go through some of them ('M' is my language):

    (TLDR: see end)

    * C uses nested include files. M has 'include', but C's includes are
    used for module headers. M has a proper module system. You really need
    to do quite a lot of work to eliminate module headers from C and express
    things as M modules.

    * Exported variables must be done properly, with an 'owner' module.
    Imported names must not be directly visible in M source; only via a
    module that exports them. This applies to functions, variables, types,
    enums, records, named constants

    * Then there's C's macro system, for which there is no equivalent. You
    can try preprocessing before conversion, but the result can be
    gobbledygook. Not suitable for a one-time translation. Alternatively you
    can go through each macro and see how it might map to the M language.
    You will find it just as much work as eliminating macros from C code to
    end up as still-readable and maintainable C code.

    * M has no conditional code blocks (this stuff is taken care of at the
    module level)

    * For every non-static module-level function or variable in C, you need
    to remember to mark it as 'global' or 'export'.

    * Every A[i] term, when A is not an array, must be turned into (&A[0]+i)^.
    This is not straightforward; C abounds with types like int** which are
    then accessed as an array of pointers, an array of arrays, or maybe a
    pointer to pointer.
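
    (For illustration, a small C-only sketch of that ambiguity, with
    invented names: the same int** may be dereferenced twice or indexed
    like a two-level array, and nothing in the declaration says which was
    meant.)

        /* Two different usages hiding behind the same parameter type. */
        static int deref_twice(int **p)  { return **p; }        /* pointer to pointer      */
        static int index_twice(int **p)  { return p[1][2]; }    /* "array" of row pointers */

        int main(void) {
            int row0[3] = {1, 2, 3}, row1[3] = {4, 5, 6};
            int *rows[2] = {row0, row1};
            return deref_twice(rows) + index_twice(rows);       /* 1 + 6 = 7 */
        }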

    * You must remember to declare any array type with a zero lower bound

    * Switch: these must be well-structured to turn into M equivalents.
    Forget trying to convert Duff's Device. Breaks to prevent fallthrough
    must be removed. Implicit fallthrough, except for consecutive case
    labels, must be dealt with.
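
    (A small invented C example of the distinction: the first two labels
    are consecutive and map over cleanly, while the fallthrough from '0'
    into '1' has to be untangled by hand.)

        #include <stdio.h>

        static void classify(char c) {
            switch (c) {
            case ' ':
            case '\t':                 /* consecutive labels: easy to translate */
                puts("space");
                break;                 /* break just prevents fallthrough       */
            case '0':
                puts("zero");
                                       /* implicit fallthrough: needs rework    */
            case '1':
                puts("digit");
                break;
            default:
                puts("other");
            }
        }

        int main(void) { classify('0'); return 0; }   /* prints "zero" then "digit" */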

    * C's plethora of integer types can usually be easily converted. But you
    need to decide what to do about 'long', and what to do about plain
    'char', which has no equivalent. Note that string literals in M are
    sequences of 'char', a special type, but can be assumed to be sequences
    of 'byte', a u8 type.

    * C's 'int' type must be translated to 'int32'

    * C's integer literals below 2**32 will have i32 or u32 type; in M they
    will be i64 only, leading to possibly different behaviour.

    * M's rules for mixed arithmetic are different.

    * M's rules for widening 8/16-bit types are different. All evaluation in
    M is done as 64 bits

    * C is lax about {...} initialisers for structures; inner braces defining
    the structure can be omitted. M is stricter; the braces must exactly
    match the type.
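
    (For example, C accepts both of these initialisers for the same type,
    whereas a stricter language only accepts the fully braced form; names
    invented.)

        struct Point { int x, y; };

        struct Point a[2] = { 1, 2, 3, 4 };        /* inner braces elided           */
        struct Point b[2] = { {1, 2}, {3, 4} };    /* braces match the type exactly */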

    * C has quite a few features not present or not fully supported in M
    that will need workarounds:

    * VLAs and variable types in general
    * Designated initialisers
    * Compound literals
    * Bitfields as implemented in C (M's are far more controlled)
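
    (A minimal C snippet exercising three of the features just listed - VLA,
    designated initialiser, compound literal - each of which would have to be
    lowered by hand; names invented.)

        #include <string.h>

        struct Point { int x, y; };

        void demo(int n) {
            int vla[n];                                  /* VLA: size known only at run time */
            struct Point p = { .y = 2, .x = 1 };         /* designated initialiser           */
            struct Point q = (struct Point){ .x = 3 };   /* compound literal                 */
            memset(vla, 0, sizeof vla);
            (void)p; (void)q;
        }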

    * M does not have block scopes as extensively used in C

    * M does not have case-sensitive identifiers, leading to likely clashes

    * M does not have struct tags, nor separate name spaces for them.

    * M does not have 'const', but that's an easy one: just leave it out.

    * M's operator precedences are all different

    * There is no equivalent of C's for-loop: each one must be analysed to
    see whether it is a simple iteration, or complex enough to need emulating
    with 'while'.
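
    (Invented examples: the first for-loop is a plain counted iteration; the
    second updates two variables and is most naturally re-expressed as a
    while loop, as shown.)

        #include <stdio.h>

        int main(void) {
            for (int i = 0; i < 10; i++)        /* simple iteration */
                printf("%d ", i);

            int i = 0, j = 100;
            while (i < j) {                     /* for (; i < j; i += 3, j -= 5) rewritten */
                printf("(%d,%d) ", i, j);
                i += 3;
                j -= 5;
            }
            putchar('\n');
            return 0;
        }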

    * M has no variadic parameters

    * M doesn't allow structs to be defined just anywhere, or inside another
    struct, or allow both a struct and variables of that struct to be
    defined in one declaration. It is far more disciplined.

    * M does not allow 8/16/32-bit integer types as parameters or return
    types (only in FFI declarations)

    * M will not have the equivalent of most GNU C extensions

    * M does not have all those macros in limits.h or inttypes.h; those
    are represented as special syntax (eg int32.max)

    * M has no equivalent of C's 29 standard headers

    I won't go on. I'm not saying you can't take a program written in C
    and reimplement it in M. But you can't trivially do it by modifying the
    C sources into M; it would be a huge amount of work involving
    refactoring, and highly error-prone (I've tried it).

    So you'd have to rewrite the program, but that applies also to porting
    to other languages. That doesn't make them ripoffs of C.

    The underlying machine model might be the same between these two
    languages, and the degree of abstraction isn't much different. But there
    is a chasm between how each language presents that model and how it
    presents itself.

    Mine /was/ developed independently from C.

    You can take an old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    You will have trouble just translating it into a modern style of C!

  • From David Brown@21:1/5 to Bart on Sun Dec 4 17:30:00 2022
    On 03/12/2022 21:17, Bart wrote:


    David Brown:
    You can take a program in your language and translate it fairly
    directly into C, albeit some parts will be ugly or non-idiomatic.

    This can work, because I've done it, but there are still quite a few problems, such as UB, limitations or omissions in C, or needing to
    cajole C compilers into accepting my code.


    No, that would just be poor translation from your code to C.

    Typical examples would be if your language has wrapping overflow
    semantics on signed overflow and you translate your own code (I'm
    guessing syntax a bit here) :

    a, b, x : i32

    x = a + b

    into

    int a, b, x;

    x = a + b;

    Use correct C, and you'll be fine :

    int32_t a, b, x;

    x = (uint32_t) a + (uint32_t) b;
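
    (A compilable sketch of that translation, under the assumption that
    two's-complement wrap-around is the behaviour being emulated; the helper
    name is invented, and the conversion back to int32_t is spelled out
    because before C23 the narrowing conversion is implementation-defined.)

        #include <stdint.h>
        #include <stdio.h>

        static int32_t add_wrap_i32(int32_t a, int32_t b) {
            uint32_t r = (uint32_t)a + (uint32_t)b;      /* unsigned add: wraps mod 2^32, no UB */
            return (r <= (uint32_t)INT32_MAX)
                       ? (int32_t)r
                       : (int32_t)(r - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
        }

        int main(void) {
            printf("%d\n", add_wrap_i32(INT32_MAX, 1));  /* prints -2147483648 */
            return 0;
        }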


    I am not suggesting you can translate idiomatic or typical code from
    your language directly into equally idiomatic C code.


    You can take a old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    Targetting a lower-level language is easier than one higher level than
    your source language.

    Mine is a little higher level than C, so is a little more challenging.
    It can be done of course, depending on how crudely you want to express
    the original code.

    But let's go through some of them ('M' is my language):

    (TLDR: see end)

    * C uses nested include files. M has 'include', but C's includes are
    used for module headers. M has a proper module system. You really need
    do quite a lot of work to eliminate module headers from C, and express
    things as M modules


    I don't understand what you mean here. Do you mean that in M, a
    "module" is a single file that contains both an "interface" section and
    an "implementation" section, like in Pascal, rather than having separate "interface" files and "implementation" files, like in Modula-2 ? This
    is, of course, perfectly fine. The translation to C basically consists
    of putting the "interface" part in "file.h" and the implementation part
    in "file.c".

    A modular system must always support nested imports (or "includes" in C parlance). Otherwise the interface part of Module A could not make use
    of types or other definitions from Module B.

    (Translating jumbled or poorly structured C code into your language
    would be harder - after all, it is possible to write a chaotic mess with
    C include files.)

    * Exported variables must be done properly, with an 'owner' module.
    Imported names must not be directly visible in M source; only via a
    module that exports them. This applies to functions, variables, types,
    enums, records, named constants


    Again, I don't see the sense in what you are saying. The names /must/
    be visible in order to use them. The /definitions/ behind the names can
    be hidden.

    When I write a "module" in C, I have a .c file and a .h file. Every
    object and function is either local to the module, and declared with
    "static", or "exported" and declared in the header file with "extern".
    The header file is, of course, #include'd by the C file. Enum and type declarations are in the header if they are exported, or the C file if not.
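
    (A minimal sketch of that layout, with invented names: the header carries
    the exported declarations, and the .c file keeps everything else static.)

        /* counter.h - the exported "interface" */
        #ifndef COUNTER_H
        #define COUNTER_H
        extern int counter_total;          /* exported object   */
        void counter_add(int n);           /* exported function */
        #endif

        /* counter.c - the implementation */
        #include "counter.h"
        static int calls;                  /* local to the module */
        int counter_total;
        void counter_add(int n) { calls++; counter_total += n; }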

    I don't (as yet) see any reason why your M modules could not translate
    directly to the same organisation in C.

    * Then there's C's macro system, for which there is no equivalent. You
    can try preprocessing before conversion, but the result can be
    gobbledygook. Not suitable for a one-time translation. Alternatively you
    can go through each macro and see how it might map to the M language.
    You will find it just as much work as eliminating macros from C code to
    end up as still-readable and maintainable C code.


    Now I am mixed up. Are we translating your M code to C, or C code to M?

    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing). Code that is so messy that it is hard
    for a human reader to comprehend will be hard to translate to anything
    else, regardless of the languages involved.
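
    (Invented examples of such "well-behaved" macros: a named constant and a
    small expression macro, each with an obvious non-preprocessor counterpart
    that a manual translation would aim for.)

        #define MAX_ITEMS  64
        #define CLAMP(x, lo, hi)  ((x) < (lo) ? (lo) : (x) > (hi) ? (hi) : (x))

        /* roughly what the translation target looks like, still in C terms: */
        enum { MaxItems = 64 };
        static inline int clamp(int x, int lo, int hi) {
            return x < lo ? lo : x > hi ? hi : x;
        }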

    But certainly there are things that can be done with macros in C that
    are hard to translate well into many other languages.

    * M has no conditional code blocks (this stuff is taken care of at the
    module level)

    * For every non-static module-level function or variable in C, you need
    to remember to mark it as 'global' or 'export'.


    That's a simple rule, and hardly a challenge. (You were right to make
    the default "private" in your language, and C was wrong to make the
    default "public".)

    <snip as this is just getting too long.>

  • From Bart@21:1/5 to David Brown on Sun Dec 4 18:41:55 2022
    On 04/12/2022 16:30, David Brown wrote:
    On 03/12/2022 21:17, Bart wrote:


    David Brown:
    You can take a program in your language and translate it fairly
    directly into C, albeit some parts will be ugly or non-idiomatic.

    This can work, because I've done it, but there are still quite a few
    problems, such as UB, limitations or omissions in C, or needing to
    cajole C compilers into accepting my code.


    No, that would just be poor translation from your code to C.

    It will be poor, and it is. But it only needs to work with selected
    applications, mostly so that I can apply an optimising C compiler so my
    programs compare more favourably with competing products.

    However it means the original source is sometimes compromised if it uses
    constructs that don't translate; then it needs to be dumbed down.

    You can take an old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    *** This part was about translating from C to my M language ***

    I don't understand what you mean here.  Do you mean that in M, a
    "module" is a single file that contains both an "interface" section and
    an "implementation" section, like in Pascal, rather than having separate "interface" files and "implementation" files, like in Modula-2 ?  This
    is, of course, perfectly fine.  The translation to C basically consists
    of putting the "interface" part in "file.h" and the implementation part
    in "file.c".

    (Actually, the conversion of M->C doesn't bother with discrete headers
    at all, since the whole-program translator generates a single C source file.

    The latest version doesn't even bother with standard C headers.)


    A modular system must always support nested imports (or "includes" in C parlance).  Otherwise the interface part of Module A could not make use
    of types or other definitions from Module B.

    (Translating jumbled or poorly structured C code into your language
    would be harder - after all, it is possible to write a chaotic mess with
    C include files.)

    Yes, exactly; it can be unstructured and more chaotic.

    * Exported variables must be done properly, with an 'owner' module.
    Imported names must not be directly visible in M source; only via a
    module that exports them. This applies to functions, variables, types,
    enums, records, named constants


    Again, I don't see the sense in what you are saying.  The names /must/
    be visible in order to use them.  The /definitions/ behind the names can
    be hidden.

    In a C module that imports function F, the declaration (sometimes the
    definition) of F must exist in the translation unit. (What you get if
    you preprocess that source module, flatten all the includes etc.)

    In the M module that imports function F, the definition of F never
    appears in the source for that module.

    The compiler will find it by searching the namespaces of the imported
    modules.

    (Actually, in the latest version, you won't see the name of the imported
    module either; that information is centralised.)


    Now I am mixed up.  Are we translating your M code to C, or C code to M?

    No, this is C to M now.

    There are all sorts of problems involved with macros.

    In short, doing this manually is just impractical (except for stylised,
    very conservative C like I might write); it can be easier to rewrite.

    Doing it programmatically has all sorts of issues too.

    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing).

    That doesn't work. For a start it will flatten all enumerations and
    defines into literals; you need to preserve those.

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Wed Dec 7 17:47:06 2022
    Bart <bc@freeuk.com> wrote:
    On 04/12/2022 16:30, David Brown wrote:
    On 03/12/2022 21:17, Bart wrote:


    David Brown:
    constructs that don't tranlate; then it needs to dumbed down.

    You can take a old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    *** This part was about translating from C to my M language ***

    I don't understand what you mean here. Do you mean that in M, a
    "module" is a single file that contains both an "interface" section and
    an "implementation" section, like in Pascal, rather than having separate
    "interface" files and "implementation" files, like in Modula-2 ? This
    is, of course, perfectly fine. The translation to C basically consists
    of putting the "interface" part in "file.h" and the implementation part
    in "file.c".

    * Exported variables must be done properly, with an 'owner' module.
    Imported names must not be directly visible in M source; only via a
    module that exports them. This applies to functions, variables, types,
    enums, records, named constants


    Again, I don't see the sense in what you are saying. The names /must/
    be visible in order to use them. The /definitions/ behind the names can
    be hidden.

    In a C module that imports function F, the declaration (sometimes, definition) of F must exist in the translation unit. (What you get if
    you preprocess that source module, flatten all the includes etc.)

    In the M module that imports function F, the definition of F never
    appears in the source for that module.

    The compiler will find it by searching the namespaces of the imported modules.

    (Actually, in the latest version, you won't see the name of the imported module either; that information is centralised.)

    You mean that it is impossible to compile a module without info from
    a central registry? For me it means that the registry is now part of
    your source.

    Regarding C, in well written
    Now I am mixed up. Are we translating your M code to C, or C code to M?

    No, this is C to M now.

    There are all sorts of problems involved with macros.

    In short, doing this manually is just impractical (except for stylised,
    very conservative C like I might write); it can be easier to rewrite.

    Well, if code is _really_ badly written or trivial, then an independent
    rewrite may be the best way. OTOH, in many cases one can do a "limited"
    rewrite: write code that performs "the same" computations as the code
    in the other language. "The same" includes a 1-1 correspondence of variables
    and fields in data structures. Such a limited rewrite can be 5-10
    times faster than writing code from scratch, so in this sense
    translating from one language to a different language is easy;
    it is much less effort than writing from scratch. Of course,
    this assumes that the code is doing something interesting; for trivial
    code you are just dealing with bulk and following the original is of no
    help.

    Doing it programmatically has all sorts of issues too.

    If you want fully automatic translation that preserves meaning,
    then I would expect ugly and inefficient code as the result.
    The first issue that comes to mind is name mangling: since your
    language is case-insensitive one may be forced to change some
    C names to avoid clashes. A simple translator would change all
    names... OTOH a reasonably complex translator may be able to
    translate 80-90% of the code and flag the problematic 10%. Manual
    modification _before_ running the translator can significantly
    improve the working of the automatic part and reduce the total amount
    of manual labor.

    Concerning macros, you somewhat ignore one obvious solution:
    implement a preprocessor for your language so that you
    can translate C macros into macros for your preprocessor.

    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing).

    That doesn't work. For a start it will flatten all enumerations and
    defines into literals; you need to preserve those.

    "doesn't work" depends on your goal. If goal is to get running
    executable as output from your compiler (IIUC "automatic
    transcompilation" and the following text up to closing
    parenthesis covers this case), then I see no problem
    that expansion would cause. If goal is to get idiomatic code
    in your language, then you need to be more creative. IME most
    C macros are rather simple ones, to get effect of named constants
    and to inline code, those can be translated to constructs of
    your langage. Some macros are used as abbreviations, if your
    language has appropriate way for abbreviating you can replace
    tham by constucts of your language, otherwise you need to
    expand. Concerning expansion, you can modify C preprocessor
    to do partial expansion and preserve comments (and possibly
    also conditionals). In fact, even standard C preprocessor
    maybe enough if there are no comments: just make sure that
    input contains only definitions of things that you want
    expanded and make rest undefined.
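
    (A sketch of that partial-expansion idea, with invented file and macro
    names: only the macros to be expanded are defined in the input, so the
    preprocessor leaves every other name untouched.)

        /* part.c: SQUARE is defined here and will be expanded; MAXLEN is
           deliberately left undefined, so it survives as a name. */
        #define SQUARE(x) ((x) * (x))

        int buf[MAXLEN];                 /* stays as 'MAXLEN' in the output */
        int area = SQUARE(MAXLEN);       /* becomes ((MAXLEN) * (MAXLEN))   */

        /* run e.g.:  cpp -P part.c   (or: gcc -E -P part.c) */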

    --
    Waldek Hebisch

  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Wed Dec 7 18:58:45 2022
    On 07/12/2022 17:47, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    (Actually, in the latest version, you won't see the name of the imported
    module either; that information is centralised.)

    You mean that it impossible to compile a module without info from
    central registry? For me it means that registry is now part of
    your source.

    You can't compile a single 'module'; you compile the whole program. The
    info that describes the module layout is at the top of the lead module,
    which typically contains only module info.

    So yes it can be considered part of the source, but is really
    intermediate info between the compiler and the source code proper.

    Other languages may use command parameters, @ files, makefiles, or
    untidy collections of `import <module>` at the top of every module, that
    need constant maintenance.




    Regarding C, in well written

    ?

    In short, doing this manually is just impractical (except for stylised,
    very conservative C like I might write); it can be easier to rewrite.

    Well, if code if _really_ badly written or trivial, then independent
    rewrite may be best way. OTOH, in many cases one can do "limited"
    rewrite: write code that performs "the same" computations as code
    in other language. "The same" includes 1-1 correspondence of variables
    and fields in data structurs. Such limited rewrite can be 5-10
    times faster than writing code from scratch, so in this sense
    translating from one language to different language is easy,
    it is much less effort than writing from scratch. Of course,
    this assumes that code is doing something interesting, for trivial
    code you are just doing with bulk and following original is of no
    help.

    Doing it programmatically has all sorts of issues too.

    If you want fully automatic translation that preserves meaning,
    then I would expect ugly and inefficient code as the result.
    The fist issue that comes to mind is name mangling: since your
    languages is case insensitive one may be forced to change some
    C names to avoid clashes.

    That would be a trivial matter: names can have a backtick prepended that preserves their case. But I wouldn't want to work with such source code:
    having to be case-sensitive /and/ having the backtick. Name mangling
    has the same problems.

    This only works if the output is intermediate code that no one ever
    sees. However compiling C via intermediate M is not a useful exercise.

    Concerning macros, you somewhat ignore one obvious solution:
    implement a preprocessor for your language so that you
    can translate C macros into macros for your preprocessor.

    That's not a solution, that's just dragging half the C language into mine.

    Besides, the C macros will still expand into C expressions, statements
    and types; so not just half the language, but half the C source too.

    Don't forget we're trying to translate the C, not find ways of avoiding
    that task!


    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing).

    That doesn't work. For a start it will flatten all enumerations and
    defines into literals; you need to preserve those.

    "doesn't work" depends on your goal. If goal is to get running
    executable as output from your compiler (IIUC "automatic
    transcompilation" and the following text up to closing
    parenthesis covers this case), then I see no problem
    that expansion would cause.

    I think DB meant being able to manually translate line by line. That can
    work for small examples, but it doesn't really scale.

    I anyway already have a tool that will do that. If this is the original
    C of one example:

    https://github.com/sal55/langs/blob/master/nano.c

    Then my tool (a development of my C compiler) produces this file in my
    syntax:

    https://github.com/sal55/langs/blob/master/nano.m

    It looks great! But it won't compile; this is purely to help with visualising
    C code.

    I have tried a few times to use this as a starting point to manually
    translate C programs to M, but there's always something that needs
    fixing every few lines; it's usually a huge amount of work. And many
    things can't be detected: they will produce legal M that is an incorrect representation of the C.

    So in a line like this:

    out^ := njClip(x7+x1>>14+128)

    the operator priorities are all different.
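
    (Illustration with invented values: in C, '+' binds tighter than '>>', so
    the line above would parse as njClip((x7 + x1) >> (14 + 128)); a
    translator has to add parentheses wherever the target language's
    precedence table differs, or the meaning changes silently.)

        #include <stdio.h>

        int main(void) {
            int x7 = 24, x1 = 40;
            int a = x7 + x1 >> 2 + 1;       /* groups as (x7 + x1) >> (2 + 1)  -> 8  */
            int b = x7 + (x1 >> 2) + 1;     /* a different grouping            -> 35 */
            printf("%d %d\n", a, b);
            return 0;
        }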

    Plus macros are expanded into literals and so on. You can do more work
    on the tool to reduce the manual fixups needed, but generally this
    solution is not viable.

    But, we're not really seriously trying to translate a specific C program.

    David Brown's point was to show that my language is really no different
    from C. But the language /it implements/ doesn't claim to be, and there
    wouldn't be any great difficulty in rewriting any C program in my
    language, so long as you didn't try to do it token-by-token.

    It would be the same task as translating a C program to D or Java or C#
    or Zig or Rust. I think most agree those are different from C.

    My language also has many differences in how such a lower-level language
    is presented, like the module system for example. Or defaulting to
    64-bit integers (causing subtle differences of behaviour). Or
    out-of-order definitions. Or being expression-based not statement-based.
    Or a dozen other matters I've already listed.


    If the goal is to get idiomatic code
    in your language, then you need to be more creative. IME most
    C macros are rather simple ones, to get effect of named constants
    and to inline code, those can be translated to constructs of
    your langage.

    I have another tool that attempts to translate APIs, i.e. C headers. When
    I applied it to gtk.h (which invokes 550 C headers across 330,000 lines
    of code), it produced a flat 25,000-line import module, of which the last
    3000 lines are macros that need to be manually processed.

    I do have a simple macro scheme in my language, designed for
    well-formed expressions only, but C macros can include random bits of
    syntax, and expand to /C code/. Remember the tool only does
    declarations, not code.

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Thu Dec 8 00:30:05 2022
    Bart <bc@freeuk.com> wrote:
    On 07/12/2022 17:47, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    (Actually, in the latest version, you won't see the name of the imported
    module either; that information is centralised.)

    You mean that it impossible to compile a module without info from
    central registry? For me it means that registry is now part of
    your source.

    You can't compile a single 'module'; you compile the whole program. The
    info that describes the module layout is at the top of the lead module,
    which typically contains only module info.

    So yes it can be considered part of the source, but is really
    intermediate info between compiler, and source code proper.

    Other languages may use command parameters, @ files, makefiles, or
    untidy collections of 'import <module>` at the top of every module, that
    need constant maintenance.

    Hmm, if module A uses function f from module B and I want to change
    A to use f from C, then I need to change an import statement. It does
    not matter if the import statement is in A or in a central file. OTOH
    if imports stay stable then the 'import' line does not change, so no
    need for extra maintenance. OTOH in the language I use imports are
    scoped. Routine rB in A can import B and use f from B, routine
    rC in A can import C and use f from C. I do not think this would
    work with your import table. And I can compile each module
    separately. In fact, this is the usual development mode: I stop
    the program, modify the module, compile, load and restart the
    program. The point is that all program data and the debugging setup
    are preserved. Local data of the modified module is destroyed; usually
    this is not a problem for debugging. But if the whole program were
    recompiled and reloaded, then preserving data would be more
    tricky. Anyway, I benefit from independent compilation
    and import lines are not a problem. In fact, in the (not so
    frequent) cases when I need to modify import statements it
    happens along with modification of the module code. And it is easier
    to do the modification in a single file than in two separate files.

    And, if you ask, I have a makefile and it contains a list of all
    modules, so when I add a new module I must add it to the
    makefile. In principle the list of modules could be generated
    automatically ("take all modules in files in the current directory"),
    but I prefer the current way. Some languages with modules have
    a "transitive closure" feature: when you give a module to the compiler
    (or maybe an extra tool) it will recompile all the modules it needs.
    But you still need a list of "entry modules", that is modules
    that should be compiled and are usable when the user explicitly
    requests them by name, but otherwise would be unused.



    Regarding C, in well written

    ?

    Oops, I got distracted and did not finish this part. I mean,
    in a well-written C program there will be an implicit module structure,
    essentially ".c file = module". Exported functions will be
    declared in header files. One school says that there should be a
    .h file for each C file, this .h file containing the "module" export info.
    A less rigorous school allows a single .h file (or some small number)
    declaring the exported functions. Even if a program slightly deviates
    from such patterns it is usually not much work to separate the
    declarations of exported functions and move them into .h files
    corresponding to the .c files. I believe that gcc with appropriate
    options would tell you if there is any function which gets
    exported (is non-static) but which lacks a declaration in a
    header file, so you can correct the program to make sure that
    everything which needs to be exported is declared in header files
    and the rest is static. After that the C structure will map
    1 to 1 to any reasonable module system.

    Of course, this assumes reasonably well-written programs, but
    if a program is badly written then I do not think it makes much
    sense to attempt translation. Translation of a reasonable program
    can in principle "add value": better expose the module structure,
    add stronger type checking, etc. Translation of a bad program
    is likely to make it worse.

    In short, doing this manually is just impractical (except for stylised,
    very conservative C like I might write); it can be easier to rewrite.

    Well, if code if _really_ badly written or trivial, then independent rewrite may be best way. OTOH, in many cases one can do "limited"
    rewrite: write code that performs "the same" computations as code
    in other language. "The same" includes 1-1 correspondence of variables
    and fields in data structurs. Such limited rewrite can be 5-10
    times faster than writing code from scratch, so in this sense
    translating from one language to different language is easy,
    it is much less effort than writing from scratch. Of course,
    this assumes that code is doing something interesting, for trivial
    code you are just doing with bulk and following original is of no
    help.

    Doing it programmatically has all sorts of issues too.

    If you want fully automatic translation that preserves meaning,
    then I would expect ugly and inefficient code as the result.
    The fist issue that comes to mind is name mangling: since your
    languages is case insensitive one may be forced to change some
    C names to avoid clashes.

    That would be a trivial matter: names can have a backtick prepended that preserves their case. But I wouldn't want to work with such source code: having to be case-sensensitive /and/ having the backtick. Name mangling
    has the same problems.

    Exactly.

    This only works if the output is intermediate code that no one ever
    sees. However compiling C via intermediate M is not a useful execise.

    Yes. But for other languages it may be useful.

    Concerning macros, you somewhat ignore one obvious solution:
    implement a preprocessor for your language so that you
    can translate C macros into macros for your preprocessor.

    That's not a solution, that's just dragging half the C language into mine.

    First, I wrote "macros", not "C macros". Second, in language whare
    compiler has about 6000 lines (and leverages other language for code generation, otherwise compiler would be significantly bigger) handling
    of macros is less than 300 lines. So macros are really small
    addition.

    Besides, the C macros will still expand into C expressions, statements
    and types; so not just half the language, but half the C source too.

    I wrote "translate macros". After translation macros will expand to
    your language.

    Don't forget we're trying to translate the C, not find ways of avoiding
    that task!


    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing).

    That doesn't work. For a start it will flatten all enumerations and
    defines into literals; you need to preserve those.

    "doesn't work" depends on your goal. If goal is to get running
    executable as output from your compiler (IIUC "automatic
    transcompilation" and the following text up to closing
    parenthesis covers this case), then I see no problem
    that expansion would cause.

    I think DB meant being able to manually translate line by line. That can
    work for small examples, but it doesn't really scale.

    I anyway already have a tool that will do that. If this is the original
    C of one example:

    https://github.com/sal55/langs/blob/master/nano.c

    Then my tool (a development of my C compiler) produces this file in my syntax:

    https://github.com/sal55/langs/blob/master/nano.m

    It looks great! But it won't compile; this is purely to help visualising
    C code.

    I have tried a few times to use this as a starting point to manually translate C programs to M, but there's always something that needs
    fixing every few lines; it's usually a huge amount of work. And many
    things can't be detected: they will produce legal M that is an incorrect representation of the C.

    So in a line like this:

    out^ := njClip(x7+x1>>14+128)

    the operator priorities are all different.

    Handling priorities was solved many years ago. Of course, you need
    to parse to get a parse tree and then output the tree in your syntax.

    Plus macros are expanded into literals and so on. You can do more work
    on the tool to reduce the manual fixups needed, but generally this
    solution is not viable.

    But, we're not really seriously trying to translate a specific C program.

    David Brown's point was to show that my language is really no different
    from C. But the language /it implements/ doesn't claim to be, and there wouldn't be any great difficult in rewriting any C program in my
    language, so long as your didn't try and do it token-by-token.

    It would be the same task as translating a C program to D or Java or C#
    or Zig or Rust. I think most agree those are different from C.

    My language also has many differences in how such a lower-level language
    is presented, like the module system for example. Or defaulting to
    64-bit integers (causing subtle differences of behaviour). Or
    out-of-order definitions. Or being expression-based no statement-based.
    Or a dozen other matters I've already listed.

    David Brown may explain more of what he meant. But I think that my
    view is similar to his: there is almost nothing new in your
    language. You are moving in a circle of ideas that were mainstream
    in the sixties, mostly resolved in the seventies and considered old hat now.
    AFAICS the newest feature of your language is modules, which got
    popular around 1977. Newer languages either have them or their
    designers (possibly wrongly) thought that they have better
    features (that may be the case with C++). I would say that among the
    reasonably popular "newer" languages all have "new things".
    Some of those things are successful and propagate to other
    languages, some fail, some stay in their niche, but
    together they contribute to progress. It is hard to find in
    your language something that others may wish to copy. Your
    'stringinclude' feature is a borderline case; I think
    nobody will want to copy your way, but they may be inspired
    to invent something better.

    Now, if you think that what I wrote is "putting you down":
    you invite this by comparing your language to C. There are
    a lot of programmers and I think most would be unable to
    invent and implement a language of comparable quality to yours.
    So this is certainly your personal achievement. But the
    deficiencies of C are known and so there is knowledge of how to cope
    with them. There are also real-world advantages, starting
    with the availability of compilers and libraries. If you want
    your language to compare well with C you need either a _huge_
    advantage at the language level or real-world
    support at a level comparable to C. There were a bunch
    of languages that arguably offered significant improvements
    over C and they made more effort than you on real-world
    support, yet so far none has replaced C. If you want
    comparisons, more relevant would be comparisons with non-C
    languages. And for a start, it looks like all the contenders
    offer modules, with better features than yours.

    --
    Waldek Hebisch

  • From David Brown@21:1/5 to Bart on Thu Dec 8 10:08:37 2022
    On 07/12/2022 19:58, Bart wrote:

    David Brown's point was to show that my language is really no different
    from C. But the language /it implements/ doesn't claim to be, and there wouldn't be any great difficult in rewriting any C program in my
    language, so long as your didn't try and do it token-by-token.

    It would be the same task as translating a C program to D or Java or C#
    or Zig or Rust. I think most agree those are different from C.


    My point was that your language (the low-level compiled one) and C are
    similar styles - they are at a similar level, and are both procedural imperative structured languages. They have functions, variables, and
    pointers. They do not have higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII, metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing. They are not domain-specific, or
    tied to particular applications or environment.

    I don't mean they are exactly the same, or that an automatic translator
    could be made to generate idiomatic code. Each language has certain
    features that would be ugly or non-idiomatic when translated. There may
    be a few cases where you'd have to significantly re-structure the code,
    but for much of it, you could translate function for function, variable
    for variable, type for type. Details are different, but not the
    structure of the languages or the way you approach the same tasks.

    I'd put Pascal (not newer Object Pascal) in the same box.

    I'd put C++, C#, Java and D in a box together, though they have more differences. And I'd say you can translate from C, Pascal, or your
    language into those languages reasonably well - but not vice versa.

    Put another way, a programmer who is familiar with C would be able to
    learn your language quickly, and it would not take long to be writing
    "normal" code in the language. They'd miss some features of C, and
    appreciate some new features of your language, but (personal preferences
    aside) would find it straightforward to work with. The same applies the
    other direction.

    Moving from C to idiomatic C++ or D would be a far bigger jump. And
    moving the other direction would feel crippling.

    Programming in Eiffel, Haskell, APL, Forth or Occam is /completely/
    different - you approach your coding in an entirely different way, and
    it makes no sense to think about translating from one of these to C (or
    to each other).


    (None of this suggests you "copied" C. You simply have a roughly
    similar approach to solving the same kinds of tasks - you probably had experience with much the same programming languages as the C designers,
    and similar assembly programming experience before making your languages
    at the beginning.)

  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Thu Dec 8 12:00:40 2022
    On 08/12/2022 00:30, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    Other languages may use command parameters, @ files, makefiles, or
    untidy collections of 'import <module>` at the top of every module, that
    need constant maintenance.

    Hmm, if module A used function f from module B and I want to change
    A to use f from C, then I need to change import statement.

    If it happens that A imports both B and C, each of which export function
    F, then you would need to write B.F() or C.F() from A anyway, to
    disambiguate.

    I does
    not matter if import statement is in A or in central file. OTOH
    if imports stay stable then 'import' line does not change so no
    need for exta maintenance.

    Since I started using my 2022 module scheme, I very rarely have to look
    at the module list. Mainly it's changed in order to create a different configuration of a program.

    Actually it has dramatically simplified that kind of maintenance.

    My scheme also allows circular and mutual imports with no restrictions.

    I would link to some docs but I can sense you're not going to be
    receptive. That's fine; you're used to your way of doing things.

    OTOH in langage I use imports are
    scoped. Routine rB in A can import B and use f from B, routine
    rC in A can import C and use f from C. I do not think it would
    work with your import table.

    You mean a function privately imports a module? That sounds crazy - and chaotic. Suppose 100 different functions all import different subsets of
    20 different modules (is that Python by any chance?)

    There needs to be some structure, some organisation.


    And I can compile each module
    separately.

    I can compile each program separately. Granularity moves from module to program. That's better. Otherwise why not compile individual functions?

    In fact, this is usual developement mode: I stop
    the program, modify the module, compile, load and restart the
    program. The point is that all program data and debugging setup
    is preserved.

    You mean change one module of a running program? OK, that's not really
    a feature of a language I think, more of external tools,
    especially IDEs and debuggers.

    I would have no idea how to do that in C. But I used to do it at the application level where most functionality was implemented via
    hot-loaded scripting modules. Then development was done from /within/
    the running application.


    Anyway, I have benefit from independent compilation
    and import lines are not a problem.

    I used to use independent compilation myself. I moved on to
    whole-program compilation because it was better. But all the issues
    involved with interfacing at the boundaries between modules don't
    completely disappear, they move to the boundaries between programs
    instead; that is, between libraries.


    And, if you ask, I have makefile and it contains list of all
    modules, so when I add a new module I must add it to the
    makefile. I principle list of modules could be generated
    automatically ("take all modules in files in current directory"),
    but I prefer current way.

    Well, my language doesn't use or need makefiles. While I got rid of
    linkers as being a waste of time, I never got into makefiles at all.

    There are however project files used by my mini-IDE, which contain
    additional info needed for listing, browsing, and editing the source
    files, and for running the programs with test inputs.

    But if /you/ wanted to build one of my projects from source, I would
    need to provide exactly two files:

    mm.exe the compiler
    app.ma the source code

    The latter being an automatically created amalgamation (produced with
    `mm -ma app`). Build using `mm app`.

    If you were on Linux or didn't want to use my compiler, then it's even
    simpler; I would provide exactly one file:

    app.c Generated C source code (via `mc -c app`)

    Here, you need a C compiler only. On Windows, you can build it using
    `gcc app.c -oapp.exe`. Linux needs more options, listed at the top of
    the file.

    I haven't yet figured out a way of providing zero files if you wanted to
    build my apps from source.

    But I guess none of this cuts any ice. After all, there are innumerable
    ways of taking the most fantastically complex build process, and
    wrapping up in one command or one file (docker etc).

    The difference is that what I provide is genuinely simple: one bare
    compiler, one actual source file.

    Some langages with modules have
    "transitive closure" feature: when you give module to compiler
    (or maybe extra tool) it will recompile all modules it needs.
    But you still need list of "entry modules", that is modules
    that should be compiled and are usable when user explicitely
    request them by name, but otherwise would be unused.

    My stuff is simpler, believe me. When you have 1Mlps compilation speed,
    most of this build stuff can go out the window.

    Oops, I got distracted and did not finish this part. I mean,
    in well written C program there will be implicit module structure, essentially ".c file = module".

    I've seen lots of C source, it's rarely that tidy.

    If it was, I'd be able to use an unusual feature of my C compiler
    where it can automatically discover the necessary modules by tracking
    the header files (`bcc -auto main.c`).


    Exported functions will be
    declared in header files. One school says that there should be
    .h file for each C file, this .h file containg "module" export info.
    Less rigorous school allows single .h file (or some small number)
    decaring exported function. Even if program slightly deviates
    from such patterns usually this is not much work to separate
    declarations of exported functions and move them in .h files
    corresponding to .c files. I believe that gcc with appropriate
    options would tell you if there is any function which gets
    exported (is non-static) but which lacks declaration in
    header file, so you can correct program to make sure that
    everthing which needs to be exported is declared in header files
    and rest is static. After that the C structure will map
    1 to 1 to any resonable module system.

    C is just so primitive when it comes to this stuff. I'm sure it largely
    works by luck.

    David Brown may explain more what he meant. But I think that my
    view is similar to his: there is almost nothing new in your
    language. You are moving in circle of ideas that were mainstream
    in sixties,

    So what are the new circles of ideas? All that crap in Rust that makes
    coding a nightmare, and makes building programs dead slow? All those new functional languages with esoteric type systems? 6GB IDEs (that used to
    take a minute and a half to load on my old PC)? No thanks.

    I keep the languages I maintain accessible, and the ideas simple.

    Any genuinely good ideas I would already have stolen if I liked them and
    they were practical to implement.

    But the trend now is to make even features like modules as complicated
    and comprehensive as possible. (For example, there may not be 1:1 correspondence between file and module: multiple modules per file;
    nested modules; modules split across multiple files. I keep it simple.)

    mostly resolved in seventies and consider old hat now.


    I think you're not looking at the right features of a language. There
    are some characteristics popular in the 70s that I like and maintain:

    * Clean, uncluttered brace-free syntax
    * Case-insensitive
    * 1-based
    * Line-oriented (no semicolons)
    * Print/read as statements

    C is still tremendously popular for many reasons. But anyone wanting to
    code today in such a language will be out of luck if they preferred any
    or all of these characteristics. This is why I find coding in my language
    such a pleasure.

    Then, if we are comparing the C language with mine, I offer:

    * Out of order definitions
    * One-time definitions (no headers, interfaces etc)
    * Expression-based
    * Program-wide rather than module-wide compilation unit
    * Build direct to executable; no object files or linkers
    * Blazing fast compilation speed, can run direct from source
    * Module scheme with tidy 'one-time' declaration of each module
    * Function reflection (access all functions within the program)
    * 64-bit default data types (ie. 'int' is 64 bits, 123 is 64 bits)
    * No build system needed
    * Create and directly compile one-file amalgamations of projects
    * String and file embedding you know about
    * One-file self-contained implementations

    I haven't listed the literally one hundred small enhancements that make
    the coding experience even in a lower-level system language with
    primitive types so comfortable.

    Few are remarkable or unique, but it's putting it all together in one
    tidy package that counts.


    And for starter, it looks that all contenders
    offer modules, with better features than yours.

    Sure, you have C++, Zig, Rust, Java, C#, Dart, ....

    All with very advanced and complicated features. They also have big implementations and take ages to build projects.

    (When I need more advanced features, I switch to my scripting language.
    Which also, by design, shares the same syntax and most of the same characteristics and features.)

    I think, like David Brown, you just don't get it.

    Try and think of what I do as creating delicious meals in my kitchen
    from my own recipes. Nothing new here either. But both you and DB
    probably work (figuratively) in the food industry or run chains of
    restaurants, or work in a food laboratory.

    Or are used to buying ready-meals from supermarkets.

  • From Bart@21:1/5 to David Brown on Fri Dec 9 00:52:24 2022
    On 08/12/2022 09:08, David Brown wrote:
    On 07/12/2022 19:58, Bart wrote:

    David Brown's point was to show that my language is really no
    different from C. But the language /it implements/ doesn't claim to
    be, and there wouldn't be any great difficult in rewriting any C
    program in my language, so long as your didn't try and do it
    token-by-token.

    It would be the same task as translating a C program to D or Java or
    C# or Zig or Rust. I think most agree those are different from C.


    My point was that your language (the low-level compiled one) and C are similar styles - they are at a similar level, and are both procedural imperative structured languages.  They have functions, variables, and pointers.  They do not have higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII, metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.  They are not domain-specific, or tied to particular applications or environment.

    I don't know what half of those mean.

    My system language was upgraded for a while with higher level data types
    (you seem to have skipped a level or two in your list), but then I
    decided to keep it lower level.

    I think languages should know their place, so let it do what it does as
    well as it can, rather pretend to be something it's not. (And mine
    mainly exists to implement the next language along.)

    I anyway favour features that everyone can understand and appreciate.
    Here's an example using a minor extension to my C compiler:

        void print_this_module(void) {
            puts(strinclude(__FILE__));
        }

    Put this in any module (or in more than one, or make it a macro), and
    when called it will print the source code of the module, which is
    embedded within the executable. There are a dozen uses other than the
    novelty one above.

    higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII, metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.


    Those are the sorts of features that are implemented more diversely
    across different languages than simple ones. They will be harder to
    translate from one language to another. I favour more conservative and
    more universal features.

    A language needs to be able to get things done, and the number one thing missing from mine is the ability to /effortlessly/ use third party
    libraries. Not actors, whatever those are.

  • From David Brown@21:1/5 to Bart on Fri Dec 9 12:54:47 2022
    On 09/12/2022 01:52, Bart wrote:
    On 08/12/2022 09:08, David Brown wrote:
    On 07/12/2022 19:58, Bart wrote:

    David Brown's point was to show that my language is really no
    different from C. But the language /it implements/ doesn't claim to
    be, and there wouldn't be any great difficult in rewriting any C
    program in my language, so long as your didn't try and do it
    token-by-token.

    It would be the same task as translating a C program to D or Java or
    C# or Zig or Rust. I think most agree those are different from C.


    My point was that your language (the low-level compiled one) and C are
    similar styles - they are at a similar level, and are both procedural
    imperative structured languages.  They have functions, variables, and
    pointers.  They do not have higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII,
    metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.  They are not domain-specific,
    or tied to particular applications or environment.

    I don't know what half of those mean.

    That's fine. Your language doesn't have them, and C doesn't have them.
    If one of the languages had them, it would be very hard to translate
    that feature smoothly into the other language.


    My system language was upgraded for a while with higher level data types
    (you seem to have skipped a level or two in your list), but then I
    decided to keep it lower level.

    I didn't cover everything!

    And as I said, there are always some features that one of these broadly
    similar languages has that don't translate smoothly. For example,
    Pascal has sets - translating these to C means enumerations with
    prefixes and quite a loss in the neatness, clarity and type-safety of
    the original. In the other direction, printf() calls will be
    significantly changed moving to Pascal as Pascal doesn't have variadic functions.
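
    (A rough C rendering of a Pascal-style set, with invented names: the
    enumerators double as bit flags and set operations become bitwise
    operators, which is where the neatness and type checking get lost.)

        #include <stdio.h>

        enum Colour { Colour_Red = 1 << 0, Colour_Green = 1 << 1, Colour_Blue = 1 << 2 };
        typedef unsigned ColourSet;                        /* "set of Colour" */

        int main(void) {
            ColourSet warm = Colour_Red | Colour_Green;    /* [Red, Green]    */
            if (warm & Colour_Red)                         /* Red in warm     */
                puts("contains red");
            warm &= ~(ColourSet)Colour_Green;              /* exclude Green   */
            printf("%u\n", warm);                          /* prints 1        */
            return 0;
        }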

    If you've added some higher level data types to your language, then
    these are likely to be harder to translate smoothly into C.


    I think languages should know their place, so let it do what it does as
    well as it can, rather pretend to be something it's not. (And mine
    mainly exists to implement the next language along.)


    Sure. The fact that your language occupies a similar level to C and can practically be translated back and forth does not mean it does not have
    its pros and cons compared to C. There is room for a great many
    programming languages.


    I anyway favour features that everyone can understand and appreciate.

    Let's be clear here - your language is written by one person, for one
    person, based on the preferences and understandings of one person. That
    does not necessarily mean you are wrong, but you should be /extremely/
    careful about extrapolating to "everyone". Serious languages are made
    by groups and teams that work together, and move through testers and
    early adopters to grow communities of users over the years. When you
    have thousands of regular users who can give feedback on your design
    choices, you can talk about features that "many people can understand
    and appreciate".


    Here's an example using a minor extension to my C compiler:

        void print_this_module(void) {
            puts(strinclude(__FILE__));
        }

    Put this in any module (or in more than one, or make it a macro), and
    when called it will print the sourcecode of the module, which is
    embedded within the executable. There are a dozen uses other than the
    novelty one above.


    And your basis for thinking /everyone/ understands this is ... what? I
    guess it looks clear enough for simple cases, but what about more
    complicated circumstances? What if the code is not contained in a file,
    but part of an on-the-fly compilation, or with source code from a pipe?
    What if the file is a temporary one generated by transcompilation from
    a higher level language? Does it work properly with respect to the
    #line pre-processor directive? How does it work regarding other
    #include's in the file? How are macros and other pre-processing handled
    in the file? What about someone trying to include an "infinite" file
    like "/dev/random"? What about a file that contained characters that
    are not acceptable by the compiler?

    There are /lots/ of questions.

    I think it is fair to say that a number of programmers would appreciate
    a way to include data files embedded in their programs. Mostly you want
    the files to be treated as data (initialising an array), rather than
    strings, but sometimes strings are useful too. (Like most people, I
    solved this long ago for my own needs with a simple data-to-include-file utility). Including a file's own source code is not likely to be needed
    by many, but other files are more important.
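
    (For illustration, here is a minimal sketch of the kind of
    data-to-include-file utility mentioned above, similar in spirit to
    `xxd -i`; the `embedded_data`/`embedded_size` names and the file
    handling are purely illustrative, not anyone's actual tool.)

        /* Read a file and write a C header that declares its contents as an
           unsigned char array, so it can be #included and compiled into the
           executable.  Error handling kept to a minimum. */
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            if (argc != 3) {
                fprintf(stderr, "usage: %s infile outfile.h\n", argv[0]);
                return 1;
            }
            FILE *in = fopen(argv[1], "rb");
            FILE *out = fopen(argv[2], "w");
            if (!in || !out) { perror("fopen"); return 1; }

            fprintf(out, "static const unsigned char embedded_data[] = {");
            int c;
            long n = 0;
            while ((c = fgetc(in)) != EOF) {
                /* 12 bytes per output line, written as hex constants */
                fprintf(out, "%s0x%02x,", (n % 12 == 0) ? "\n    " : " ", c);
                n++;
            }
            fprintf(out, "\n};\nstatic const unsigned long embedded_size = %ld;\n", n);

            fclose(in);
            fclose(out);
            return 0;
        }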


    Just for comparison, have a look at how the real C language has dealt
    with this, as a similar (but more powerful and useful) feature has been
    added to C23. Since C is not a one-man toy, discussions and proposals
    are needed, involving proponents of the idea, C standards committee
    members, users, compiler writers, and testers. It takes many rounds to establish what "everyone" understands, and what "everyone" appreciates -
    as well as avoiding potential problems, security issues and other
    complications that are not obvious to most people.

    <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2898.htm>
    <https://thephd.dev/finally-embed-in-c23>
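
    (For reference, a sketch of what the C23 feature looks like in use; it
    assumes a compiler with #embed support and a non-empty file, and the
    file name is just an example.)

        /* C23 #embed expands to a comma-separated list of the file's bytes,
           usable as array initialiser data. */
        #include <stdio.h>

        static const unsigned char help_text[] = {
        #embed "mm_help.txt"
        , 0    /* append a terminating NUL so it can be printed as a string */
        };

        int main(void)
        {
            puts((const char *)help_text);
            return 0;
        }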



    higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII,
    metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.


    Those are the sorts of features that are implemented more diversely
    across different languages than simple ones. They will be harder to
    translate from one language to another. I favour more conservative and
    more universal features.


    Again, you favour things that seem more conservative and universal to
    /you/. There is a fair overlap with these things and the features of C,
    since the languages are similar.

    But people with different programming backgrounds and habits could have completely different ideas of what is "conservative and universal". For
    many languages, type inference and generics is so integral that it would
    seem very strange to declare variables of a given type - they may not
    even see variables as a meaningful concept. Not all languages have
    functions in the sense you think is "universal".

    This is the same in all sorts of areas, not just programming languages -
    our ideas about what things are "fundamental" or "universal" are almost entirely a matter of what we find /familiar/. Even if a concept is
    familiar to many people, it merely means it is found in many popular
    languages - it does not make it universal or fundamental. (And I note
    that there are many concepts found in many popular languages which you
    reject or dislike.)

    Of course you put things in your language that /you/ find appropriate -
    you made it for yourself, and if some Prolog user or Sketch expert wants
    to use it, they'll have to learn how it works. All I am asking you to
    do is understand that your ideas, your preferences, your choices are
    formed from your experiences and background - they are not "universal"
    and do not necessarily apply to others.

    A language needs to be able to get things done, and the number one thing missing from mine is the ability to /effortlessly/ use third party
    libraries. Not actors, whatever those are.


  • From Bart@21:1/5 to David Brown on Fri Dec 9 16:28:53 2022
    On 09/12/2022 11:54, David Brown wrote:
    On 09/12/2022 01:52, Bart wrote:

    (Replying about 'strinclude' only)

    Here's an example using a minor extension to my C compiler:

         void print_this_module(void) {
             puts(strinclude(__FILE__));
         }

    Put this in any module (or in more than one, or make it a macro), and
    when called it will print the sourcecode of the module, which is
    embedded within the executable. There are a dozen uses other than the
    novelty one above.


    And your basis for thinking /everyone/ understands this is ... what?  I guess it looks clear enough for simple cases, but what about more
    complicated circumstances?  What if the code is not contained in a file,
    but part of an on-the-fly compilation, or with source code from a pipe?
     What if the file is a temporary one generated by transcompilation from
    a higher level language?  Does it work properly with respect to the
    #line pre-processor directive?  How does it work regarding other
    #include's in the file?  How are macros and other pre-processing handled
    in the file?  What about someone trying to include an "infinite" file
    like "/dev/random"?  What about a file that contained characters that
    are not acceptable by the compiler?

    There are /lots/ of questions.

    You can say the same about #include in C. The search algorithm for
    include files is actually complex, and is not set by the language: it is implementation defined.

    My strinclude is simpler. The concept itself is simple: incorporate a
    TEXT file as a string literal rather than C code, a task that many will
    have wanted to do at some point.

    This was a quick proof of concept for C. The search for non-absolute
    file-specs is relative to the current directory. In the original in my language, it is relative to the location of the lead module (to allow a
    program to be built remotely from anywhere in the file system).

    For C I won't bother upgrading it.

    Of course, people can enter all sorts of things as the file path which
    are likely to cause problems. Just like they can for #include. And just
    like they can, with languages that have compile-time evaluation and the
    ability to duplicate strings, when they write "A"*1000000000.

    1GB is small enough to actually work, big enough to cause all sorts of problems, like a 1GB executable.


    I think it is fair to say that a number of programmers would appreciate
    a way to include data files embedded in their programs.  Mostly you want
    the files to be treated as data (initialising an array), rather than
    strings, but sometimes strings are useful too.

    My systems language has `bininclude` for binary files, but it's a poor implementation (a 1MB file will generate 1M AST nodes, one for each
    byte, some 64MB in all, but `strinclude` for a 1MB file needs only a 1MB string).

    My scripting language uses `strinclude` for both. The program below
    writes a C file then compiles it, supplying its own C compiler!

        const compiler = strinclude("c:/m/bcc.exe")

        if not checkfile("newcc.exe") then
            writestrfile("newcc.exe",compiler)
        fi

        writetextfile("test.c",
            ("#include <stdio.h>",
             "int main(void) { puts(""Hi There " +strtime(getsystime())+ """);}"
            ))

        execwait("newcc -run test.c")

    (str-file = one big string; text-file = list of strings, one per line)
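
    (A rough C analogue of the script above, as a sketch only: it assumes
    the compiler binary has already been turned into an `embedded_data`
    array by a data-to-include utility or C23 #embed; the names are made
    up.)

        #include <stdio.h>
        #include <stdlib.h>

        extern const unsigned char embedded_data[];   /* embedded compiler image */
        extern const unsigned long embedded_size;

        int main(void)
        {
            /* Write the embedded compiler back out as an executable */
            FILE *f = fopen("newcc.exe", "wb");
            if (!f) { perror("newcc.exe"); return 1; }
            fwrite(embedded_data, 1, embedded_size, f);
            fclose(f);

            /* Write a small test program and run the new compiler on it */
            f = fopen("test.c", "w");
            if (!f) { perror("test.c"); return 1; }
            fprintf(f, "#include <stdio.h>\nint main(void){puts(\"Hi There\");}\n");
            fclose(f);

            return system("newcc -run test.c");
        }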

    (Like most people, I
    solved this long ago for my own needs with a simple data-to-include-file utility).  Including a file's own source code is not likely to be needed
    by many, but other files are more important.

    Here's a simpler use:

        when help_sw then
            println strinclude("mm_help.txt")

    It is invaluable in producing my one-file self-contained tools.

    Just for comparison, have a look at how the real C language has dealt
    with this, as a similar (but more powerful and useful) feature has been
    added to C23.  Since C is not a one-man toy, discussions and proposals
    are needed, involving proponents of the idea, C standards committee
    members, users, compiler writers, and testers.  It takes many rounds to establish what "everyone" understands, and what "everyone" appreciates -
    as well as avoiding potential problems, security issues and other complications that are not obvious to most people.

    <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2898.htm> <https://thephd.dev/finally-embed-in-c23>

    Yeah, it's complicated when the language is already complicated and
    everything is done by committee /and/ you have billions of lines of
    existing code to support and you have 100s of implementations.

    That's an advantage of creating a small product with a limited user-base.

    This sort of feature needs to be as simple as I've shown above.

    BTW in C, I implemented strinclude in 20 lines like this:


        function readstrinclude:unit p=
            ichar text

            lex()
            checksymbol(lbracksym)
            lex()
            p := readexpression()
            checksymbol(rbracksym)
            lex()

            if p.tag<>j_const or p.mode<>trefchar then
                serror("String const expected")
            fi

            text := readfile(p.svalue)
            if not text then
                serror_s("Can't read strinclude file: #",p.svalue)
            fi

            return createstringconstunit(text,rfsize)
        end

  • From Bart@21:1/5 to Bart on Fri Dec 9 19:16:39 2022
    On 09/12/2022 16:28, Bart wrote:
    On 09/12/2022 11:54, David Brown wrote:


    [including files as text or binary data]

    My scripting language uses `strinclude` for both. The program below
    writes a C file then compiles it, supplying its own C compiler!

        const compiler = strinclude("c:/m/bcc.exe")

    While this script program still works (as does the equivalent using `bininclude` in the other language), it was a feature I used when
    scripting code was precompiled to binary bytecode. Then the binary data
    was part of the bytecode file.

    But now scripts are run from source, and production or distributed
    programs are made into one-file amalgamations. These amalgamations are
    still text files, and while they will include such support files, I
    haven't yet solved the problem of representing binary data in the text file.

    Amalgamations will usually incorporate text-files as-is, but here a
    binary file would need to be transformed. It's not a hard problem, I
    just haven't done it yet.
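
    (One conventional way of doing that, sketched in C rather than in the
    scripting language: encode the binary bytes as hex text so they can sit
    inside a text amalgamation, and decode them again when unpacking. The
    function names are invented for the example.)

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        /* Binary -> printable hex string (2 characters per byte). */
        static char *bin_to_hex(const unsigned char *data, size_t n)
        {
            char *s = malloc(2 * n + 1);
            for (size_t i = 0; i < n; i++)
                sprintf(s + 2 * i, "%02x", data[i]);
            return s;
        }

        /* Hex string -> binary, the inverse transformation. */
        static unsigned char *hex_to_bin(const char *s, size_t *out_n)
        {
            size_t n = strlen(s) / 2;
            unsigned char *data = malloc(n);
            for (size_t i = 0; i < n; i++) {
                unsigned v;
                sscanf(s + 2 * i, "%2x", &v);
                data[i] = (unsigned char)v;
            }
            *out_n = n;
            return data;
        }

        int main(void)
        {
            const unsigned char sample[] = { 0x7f, 'E', 'L', 'F', 0, 0xff };
            char *hex = bin_to_hex(sample, sizeof sample);
            printf("as text: %s\n", hex);      /* safe to store in a text file */

            size_t n;
            unsigned char *bin = hex_to_bin(hex, &n);
            printf("round trip ok: %d\n",
                   n == sizeof sample && memcmp(bin, sample, n) == 0);

            free(hex);
            free(bin);
            return 0;
        }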


    <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2898.htm>
    <https://thephd.dev/finally-embed-in-c23>

    I'd imagine the C23 and C++ versions will have the same problem when
    creating source distributions that someone else will build, but they
    probably don't care about bundling binary files as well.

    The point however is to encapsulate the binary data; just zipping
    everything doesn't cut it! You don't want the user or builder to see the discrete binaries.

  • From Bart@21:1/5 to David Brown on Sat Dec 10 17:23:38 2022
    On 09/12/2022 11:54, David Brown wrote:
    On 09/12/2022 01:52, Bart wrote:

    higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII,
    metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.


    Those are the sorts of features that are implemented more diversely
    across different languages than simple ones. They will be harder to
    translate from one language to another. I favour more conservative and
    more universal features.


    Again, you favour things that seem more conservative and universal to
    /you/.

    Well, they were universal and obvious 40+ years ago. Nearly two
    generations on, everyone is into more advanced features (most likely
    due to being foisted on them in university courses).

    Then it may be necessary to look at what would be obvious and intuitive
    to anyone, or can be described in everyday terms like shelves of books, numbered drawers and so on. (Except that soon not many will know what a
    book looks like.)

    But people with different programming backgrounds and habits could have completely different ideas of what is "conservative and universal".  For many languages, type inference and generics is so integral that it would
    seem very strange to declare variables of a given type - they may not
    even see variables as a meaningful concept.

    Give it another generation, someone will reintroduce the idea of mutable variables as a brilliant new concept.

    However I think you're wrong: there are a set of basic features that,
    for the time being, most will understand or can easily guess, often used in pseudo-code.

    If I look at Intel processor manuals for example, the behaviour of
    instructions is described in a pseudo-code language that looks a lot like
    Algol. It has IF statements, explicit FOR-loops and mutable variables.

    Why not use functional style concepts and syntax since that is so great?

    Here's another bit of syntax they use:

    DEST[31:0] := SRC1[31:0] + SRC2[31:0]

    This clearly refers to bits within those variables. While not common in
    languages, how hard is it to guess what this does? Once you understand
    what it does, how hard is it to emulate with existing bitwise
    operations?

    Funnily enough I've used pretty much the same syntax for 30 years:

    dest.[31..0] := src1.[31..0] + src2.[31..0]
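
    (To answer the "how hard is it to emulate" question concretely, here is
    a hedged C sketch using shifts and masks; the set_bits helper is
    invented for the example.)

        #include <stdint.h>
        #include <stdio.h>

        /* Replace bits hi..lo of dest with the low bits of value,
           leaving the other bits untouched. */
        static uint64_t set_bits(uint64_t dest, int hi, int lo, uint64_t value)
        {
            int width = hi - lo + 1;
            uint64_t mask = (width >= 64) ? ~0ull : ((1ull << width) - 1);
            return (dest & ~(mask << lo)) | ((value & mask) << lo);
        }

        int main(void)
        {
            uint64_t src1 = 0x00000001ffffffffull;
            uint64_t src2 = 0x0000000000000002ull;
            uint64_t dest = 0xdeadbeef00000000ull;

            /* DEST[31:0] := SRC1[31:0] + SRC2[31:0] */
            dest = set_bits(dest, 31, 0,
                            (src1 & 0xffffffffull) + (src2 & 0xffffffffull));

            printf("%016llx\n", (unsigned long long)dest);  /* deadbeef00000001 */
            return 0;
        }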

    Look at, say, the Wikipedia article on insertion sort, and it will have
    pseudo-code for it that looks like this:

        i ← 1
        while i < length(A)
            j ← i
            while j > 0 and A[j-1] > A[j]
                swap A[j] and A[j-1]
                j ← j - 1
            end while
            i ← i + 1
        end while

    Using ":=" and adding "do", this would be valid syntax in either of my languages (the swap needs rewriting as `swap(A[j], A[j-1])`, and
    `length(A)` as `A.len`).
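
    (For comparison, the same pseudo-code transcribed into C; a direct,
    unoptimised sketch, using 0-based indexing as C requires.)

        #include <stdio.h>

        static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

        /* Insertion sort, line for line from the pseudo-code above. */
        static void insertion_sort(int A[], int length)
        {
            int i = 1;
            while (i < length) {
                int j = i;
                while (j > 0 && A[j-1] > A[j]) {
                    swap(&A[j], &A[j-1]);
                    j = j - 1;
                }
                i = i + 1;
            }
        }

        int main(void)
        {
            int A[] = { 5, 2, 4, 6, 1, 3 };
            insertion_sort(A, 6);
            for (int i = 0; i < 6; i++) printf("%d ", A[i]);
            printf("\n");
            return 0;
        }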

    So, how about that; my syntax, which I claim to be simple and universal,
    is clear enough to be used as pseudo-code.


    These are some characteristics and fundamentals that I like, that are in
    danger of becoming extinct; they already are in many languages:

    * Case-insensitive
    * 1-based counting (if there is even anything to count!)
    * Mutable variables, even the concept of 'state'
    * Explicit arrays, and explicit indexing of arrays
    * Iteration over A to B using explicit loops
    * While loops
    * Goto
    * Built-in read/print /statements/ (or just having any i/o at all)
    * Ordinary functions, you know, the ones you just declare rather than
    just being some named lambda expression.

    These include many constructs popular in pseudo-code. So what do the
    developers of advanced languages actually have against clear code?

    I get the impression that many can't take a language seriously if it
    uses a syntax that just anyone can understand.


    Of course you put things in your language that /you/ find appropriate -
    you made it for yourself, and if some Prolog user or Sketch expert wants
    to use it, they'll have to learn how it works.  All I am asking you to
    do is understand that your ideas, your preferences, your choices are
    formed from your experiences and background - they are not "universal"
    and do not necessarily apply to others.

    Again, I think they are: more obvious, more intuitive, simpler, suitable
    for use as pseudo-code, and far easier to port to an arbitrary language.
    At least, an arbitrary language that hasn't completely done away with
    the basics.

  • From David Brown@21:1/5 to Bart on Sun Dec 11 17:24:28 2022
    On 09/12/2022 17:28, Bart wrote:
    On 09/12/2022 11:54, David Brown wrote:
    On 09/12/2022 01:52, Bart wrote:

    (Replying about 'strinclude' only)

    Here's an example using a minor extension to my C compiler:

         void print_this_module(void) {
             puts(strinclude(__FILE__));
         }

    Put this in any module (or in more than one, or make it a macro), and
    when called it will print the sourcecode of the module, which is
    embedded within the executable. There are a dozen uses other than the
    novelty one above.


    And your basis for thinking /everyone/ understands this is ... what?
    I guess it looks clear enough for simple cases, but what about more
    complicated circumstances?  What if the code is not contained in a
    file, but part of an on-the-fly compilation, or with source code from
    a pipe?   What if the file is a temporary one generated by
    transcompilation from a higher level language?  Does it work properly
    with respect to the #line pre-processor directive?  How does it work
    regarding other #include's in the file?  How are macros and other
    pre-processing handled in the file?  What about someone trying to
    include an "infinite" file like "/dev/random"?  What about a file that
    contained characters that are not acceptable by the compiler?

    There are /lots/ of questions.

    You can say the same about #include in C. The search algorithm for
    include files is actually complex, and is not set by the language: it is implementation defined.

    Certainly there are implementation-defined aspects about "#include". An implementation does not even have to use normal files here - I know of
    at least one embedded toolchain where the standard library includes are
    handled directly within the toolchain, and do not exist as separate files.

    But I think there is a significant difference between header files,
    which are clearly part of program code, and embedded data files.


    My strinclude is simpler. The concept itself is simple: incorporate a
    TEXT file as a string literal rather than C code, a task that many will
    have wanted to do at some point.


    Yes, it is simpler - that's the point. You can make a simple feature
    for your simple tool that does all you need for your simple
    requirements. That is absolutely fine for these cases - there is never
    any point in over-complicating things. Make things as simple as
    possible, but no simpler. However, for serious and mainstream
    languages, far more is involved.

    This was a quick proof of concept for C. The search for non-absolute file-specs is relative to the current directory. In the original in my language, it is relative to the location of the lead module (to allow a program to be built remotely from anywhere in the file system).

    For C I won't bother upgrading it.

    Of course, people can enter all sorts of things as the file path which
    are likely to cause problems. Just like they can for #include. And just
    like they can, with languages that have compile-time evaluation and the ability to duplicate strings, when they write "A"*1000000000.

    1GB is small enough to actually work, big enough to cause all sorts of problems, like a 1GB executable.


    I think it is fair to say that a number of programmers would
    appreciate a way to include data files embedded in their programs.
    Mostly you want the files to be treated as data (initialising an
    array), rather than strings, but sometimes strings are useful too.

    My systems language has `bininclude` for binary files, but it's a poor implementation (a 1MB file will generate 1M AST nodes, one for each
    byte, some 64MB in all, but `strinclude` for a 1MB file needs only a 1MB string).

    My scripting language uses `strinclude` for both. The program below
    writes a C file then compiles it, supplying its own C compiler!

        const compiler = strinclude("c:/m/bcc.exe")

        if not checkfile("newcc.exe") then
            writestrfile("newcc.exe",compiler)
        fi

        writetextfile("test.c",
            ("#include <stdio.h>",
             "int main(void) { puts(""Hi There " +strtime(getsystime())+ """);}"
            ))

        execwait("newcc -run test.c")

    (str-file = one big string; text-file = list of strings, one per line)

    (Like most people, I solved this long ago for my own needs with a
    simple data-to-include-file utility).  Including a file's own source
    code is not likely to be needed by many, but other files are more
    important.

    Here's a simpler use:

        when help_sw then
            println strinclude("mm_help.txt")

    It is invaluable in producing my one-file self-contained tools.

    Just for comparison, have a look at how the real C language has dealt
    with this, as a similar (but more powerful and useful) feature has
    been added to C23.  Since C is not a one-man toy, discussions and
    proposals are needed, involving proponents of the idea, C standards
    committee members, users, compiler writers, and testers.  It takes
    many rounds to establish what "everyone" understands, and what
    "everyone" appreciates - as well as avoiding potential problems,
    security issues and other complications that are not obvious to most
    people.

    <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2898.htm>
    <https://thephd.dev/finally-embed-in-c23>

    Yeah, it's complicated when the language is already complicated and everything is done by committee /and/ you have billions of lines of
    existing code to support and you have 100s of implementations.

    That's an advantage of creating a small product with a limited user-base.


    Yes. You only need to consider the uses and interests of one person.
    On the other hand, you only get the features created by one person.

    This sort of feature needs to be as simple as I've shown above.

    BTW in C, I implemented strinclude in 20 lines like this:


        function readstrinclude:unit p=
            ichar text

            lex()
            checksymbol(lbracksym)
            lex()
            p := readexpression()
            checksymbol(rbracksym)
            lex()

            if p.tag<>j_const or p.mode<>trefchar then
                serror("String const expected")
            fi

            text := readfile(p.svalue)
            if not text then
                serror_s("Can't read strinclude file: #",p.svalue)
            fi

            return createstringconstunit(text,rfsize)
        end


  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sun Dec 11 16:50:51 2022
    Bart <bc@freeuk.com> wrote:
    On 08/12/2022 00:30, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    Other languages may use command parameters, @ files, makefiles, or
    untidy collections of 'import <module>` at the top of every module, that >> need constant maintenance.

    Hmm, if module A used function f from module B and I want to change
    A to use f from C, then I need to change import statement.

    If it happens that A imports both B and C, each of which export function
    F, then you would need to write B.F() or C.F() from A anyway, to disambiguate.

    If a call would otherwise be ambiguous, then of course there is a need
    to disambiguate. Above it would be F()$B and F()$C (the dollar sign
    means that what follows is the module name). However, an important use
    case is when you have a large program and try to replace B by C. And
    you do this in an incremental way, module by module. At any time a
    given module imports either B (before conversion) or C (after),
    but not both.

    There is also an extra aspect, irrelevant for you, but important
    for me: my language has overloading and functions from different
    modules may differ in the types they return or in argument types.
    So overloading may decide which one to use.

    It does
    not matter if the import statement is in A or in a central file. OTOH
    if imports stay stable then the 'import' line does not change, so no
    need for extra maintenance.

    Since I started using my 2022 module scheme, I very rarely have to look
    at the module list. Mainly it's changed in order to create a different configuration of a program.

    Actually it has dramatically simplified that kind of maintenance.

    My scheme also allows circular and mutual imports with no restrictions.

    You probably mean "with no artificial restrictions". There is a
    fundamental restriction that everything should resolve in a finite
    number of steps.

    I would link to some docs but I can sense you're not going to be
    receptive. That's fine; you're used to your way of doing things.

    OTOH in langage I use imports are
    scoped. Routine rB in A can import B and use f from B, routine
    rC in A can import C and use f from C. I do not think it would
    work with your import table.

    You mean a function privately imports a module? That sounds crazy - and chaotic. Suppose 100 different functions all import different subsets of
    20 different modules (is that Python by any chance?)

    Apparently you only think of ways of writing bad code. There are
    many ways of writing bad code and it is not hard even in languages
    with a strong "nanny" attitude. Private import means that you can
    import what is needed in a small scope; the normal case is a few
    critical functions having an external dependence and the rest of the
    module not needing it. An alternative could be to create some small
    intermediate modules, but normally there is a strong logical connection
    between the functions in a module and intermediate modules would be
    artificial and clumsy. Private import allows you to keep a natural
    decomposition into modules and localize dependencies.

    There needs to be some structure, some organisation.

    Exactly, private import is a tool for better organisation.

    And I can compile each module
    separately.

    I can compile each program separately. Granularity moves from module to program. That's better. Otherwise why not compile individual functions?

    In the past there was support for compiling individual functions, and
    I would not exclude possibility that it will come back. But ATM
    I prefer to keep things simple, so this functionality was removed.

    In fact, this is the usual development mode: I stop
    the program, modify the module, compile, load and restart the
    program. The point is that all program data and debugging setup
    is preserved.

    You mean change one module of a running program? OK, that's not
    really a feature of a language, I think; more of external tools,
    especially IDEs and debuggers.

    Yes, that is a feature of the implementation (no IDE). But language
    features also come into play.

    I would have no idea how to do that in C. But I used to do it at the application level where most functionality was implemented via
    hot-loaded scripting modules. Then development was done from /within/
    the running application.


    Anyway, I have benefit from independent compilation
    and import lines are not a problem.

    I used to use independent compilation myself. I moved on to
    whole-program compilation because it was better. But all the issues
    involved with interfacing at the boundaries between modules don't
    completely disappear, they move to the boundaries between programs
    instead; that is, between libraries.

    I consider programs to be different from libraries. A program may
    use several libraries; in the degenerate case a library may be just a
    single module. Compiling a whole program has clear troubles with
    large programs. If you limit the size of libraries they may be a
    reasonable compromise. ATM in my case inter-module optimizations
    are limited and do not benefit from larger scale. I have
    about 1000 modules that do not naturally decompose into libraries.
    And in principle there could be tens of thousands of modules; this
    is basically limited by the development effort needed to write them.

    I have one global compilation step, basically to resolve
    dependencies between modules. But it is needed only for
    initial build (or if module structure changed quite a lot),
    after build typically info about dependencies can be updated
    in incremental way.

    And, if you ask, I have a makefile and it contains a list of all
    modules, so when I add a new module I must add it to the
    makefile. In principle the list of modules could be generated
    automatically ("take all modules in files in the current directory"),
    but I prefer the current way.

    Well, my language doesn't use or need makefiles. While I got rid of
    linkers as being a waste of time, I never got into makefiles at all.

    There are however project files used by my mini-IDE, which contain
    additional info needed for listing, browsing, and editing the source
    files, and for running the programs with test inputs.

    But if /you/ wanted to build one of my projects from source, I would
    need to provide exactly two files:

    mm.exe the compiler
    app.ma the source code

    The latter being an automatically created amalgamation (produced with
    `mm -ma app`). Build using `mm app`.

    I could provide a single file, a shell archive containing a build
    script and sources, but an important part of providing sources is that
    people can read them, understand and modify them. GNU folks have a
    nice definition of source: "preferred form for making modifications".
    I would guess that 'app.ma' is _not_ your preferred form for making
    modifications, so it is not really true source. And to build from
    "source" I need the source first. And I provide _true_ sources to my
    users.

    If you were on Linux or didn't want to use my compiler, then it's even simpler; I would provide exactly one file:

    app.c Generated C source code (via `mc -c app`)

    Here, you need a C compiler only. On Windows, you can build it using
    `gcc app.c -oapp.exe`. Linux needs more options, listed at the top of
    the file.

    Sorry, a generated file is _not_ a source. If I were to modify the C
    file that you provide I would have trouble incorporating fixes from
    your later versions. And likely, you would have trouble incorporating
    my changes into your real sources.

    I haven't yet figured out a way of providing zero files if you wanted to build my apps from source.

    But I guess none of this cuts any ice. After all, there are innumerable
    ways of taking the most fantastically complex build process, and
    wrapping up in one command or one file (docker etc).

    The difference is that what I provide is genuinely simple: one bare
    compiler, one actual source file.

    Sorry, for me "one file" is not a problem, there is 'tar' (de facto
    standard for distributing sorce code) and with it you download a
    single file and recreate any needed directory structure. Real
    issue is managing dependencies. In general, I make real effort
    to minimize dependencies. I am not saying there are none, but
    dependencies that I have are hard to avoid and IMO resonably natural.
    Most other project have quite a lot of dependencies. Your project
    _may_ have advantage of small number of dependencies. But since
    you show only generated files or parts of source nobody can tell.
    And there is possible quite large dependency, namley Windows.
    It is clear how much of your code _usefully_ runs in now-Window
    environment.

    Some languages with modules have a
    "transitive closure" feature: when you give a module to the compiler
    (or maybe an extra tool) it will recompile all modules it needs.
    But you still need a list of "entry modules", that is modules
    that should be compiled and are usable when the user explicitly
    requests them by name, but otherwise would be unused.

    My stuff is simpler, believe me. When you have 1Mlps compilation speed,
    most of this build stuff can go out the window.

    Oops, I got distracted and did not finish this part. I mean, in a
    well-written C program there will be an implicit module structure,
    essentially ".c file = module".

    I've seen lots of C source, it's rarely that tidy.

    If it was, I'd be able to use an unusual feature of my C compiler
    where it can automatically discover the necessary modules by tracking
    the header files (`bcc -auto main.c`).

    Well, taking header files as the definition of modules clearly has
    problems. IME .c files usually give a reasonable module structure.

    Exported functions will be
    declared in header files. One school says that there should be a
    .h file for each C file, this .h file containing the "module" export
    info. A less rigorous school allows a single .h file (or some small
    number) declaring the exported functions. Even if a program slightly
    deviates from such patterns, usually it is not much work to separate
    the declarations of exported functions and move them into .h files
    corresponding to the .c files. I believe that gcc with appropriate
    options will tell you if there is any function which gets
    exported (is non-static) but which lacks a declaration in a
    header file, so you can correct the program to make sure that
    everything which needs to be exported is declared in header files
    and the rest is static. After that the C structure will map
    1 to 1 to any reasonable module system.
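
    (A sketch of the pattern being described, with invented 'stack' names:
    one .h per .c, every non-static function declared in the header, and
    gcc's -Wmissing-prototypes used to flag external functions that lack
    such a declaration.)

        /* stack.h - the "module interface" */
        #ifndef STACK_H
        #define STACK_H
        void stack_push(int value);
        int  stack_pop(void);
        #endif

        /* stack.c - the "module body".  Compile with
               gcc -c -Wall -Wmissing-prototypes stack.c
           to be warned about any non-static function missing from a header. */
        #include "stack.h"

        static int items[100];   /* static: private to this "module" */
        static int top;

        void stack_push(int value) { items[top++] = value; }
        int  stack_pop(void)       { return items[--top]; }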

    C is just so primitive when it comes to this stuff. I'm sure it largely
    works by luck.

    C is at a low level, that is clear. But programs are written by
    programmers and good programs require work. A good programming
    environment should help. C as a language is not helpful; one
    may have a fully compliant and rather unhelpful compiler. But
    real C compilers tend to be as helpful as they can within the
    limits of the C language. While C still limits what they can do,
    there is quite a lot of difference between a current popular
    compiler and a bare-bones legal compiler. And there are extra
    tools, and here C support is hard to beat.

    David Brown may explain more of what he meant. But I think that my
    view is similar to his: there is almost nothing new in your
    language. You are moving in a circle of ideas that were mainstream
    in the sixties,

    So what are the new circles of ideas? All that crap in Rust that makes
    coding a nightmare, and makes building programs dead slow? All those new functional languages with esoteric type systems? 6GB IDEs (that used to
    take a minute and a half to load on my old PC)? No thanks.

    The borrow checker in Rust looks like a good idea. There is a good
    chance that the _idea_ will be adopted by several languages in the
    near future. Not so new ideas are:
    - removing limitations, that is making sure language constructs
      work as generally as possible (that allows getting rid of many
      special constructs from older languages)
    - nominal, problem-dependent types. That is, types should reflect
      the problem domain. In particular, domains which need types like
      'u32' are somewhat specific; in normal domains the fundamental
      types are different
    - functions as values/parameters. In particular functions have
      types, and can be members of data structures
    - "full rights" for user-defined types. Which means whatever
      syntax/special constructs work on built-in types should
      also work for user-defined types
    - function overloading
    - type reconstruction
    - garbage collection
    - exception handling
    - classes/objects


    I keep the languages I maintain accessible, and the ideas simple.

    Any genuinely good ideas I would already have stolen if I liked them and
    they were practical to implement.

    But the trend now is to make even features like modules as complicated
    and comprehensive as possible. (For example, there may not be 1:1 correspondence between file and module: multiple modules per file;
    nested modules; modules split across multiple files. I keep it simple.)

    mostly resolved in the seventies and considered old hat now.


    I think you're not looking at the right features of a language. There
    are some characteristics popular in the 70s that I like and maintain:

    * Clean, uncluttered brace-free syntax
    Does this count as brace-free?

    for i in 1..10 repeat (print i; s := s + i)

    * Case-insensitive
    * 1-based
    * Line-oriented (no semicolons)
    * Print/read as statements

    A lot of folks consider the above misfeatures/bugs. Concerning
    'line-oriented' and 'intuitive', can you guess which changes to the
    following statement in 'line-oriented' syntax are legal and preserve
    its meaning?

    nm = x or nm = 'log or nm = 'exp or nm = '%power or
    nm = 'nthRoot or
    nm = 'cosh or nm = 'coth or nm = 'sinh or nm = 'tanh or
    nm = 'sech or nm = 'csch or
    nm = 'acosh or nm = 'acoth or nm = 'asinh or nm = 'atanh or
    nm = 'asech or nm = 'acsch or
    nm = 'Ei or nm = 'erf or nm = 'erfi or nm = 'li or
    nm = 'Gamma or nm = 'digamma or nm = 'dilog or
    nm = '%root_sum =>
    "iterate"

    As a hint let me say that '=' is comparison. And this is a single
    statement; small changes to whitespace will change the parse and lead
    to wrong code or a syntax/type error. BTW: you need a real newsreader
    to see it; Google and the like will change it so it no longer works.

    C is still tremendously popular for many reasons. But anyone wanting to
    code today in such a language will be out of luck if they preferred any
    or all of these characteristics. This is why I find coding in my
    language such a pleasure.

    Then, if we are comparing the C language with mine, I offer:

    * Out of order definitions

    That is considered a misfeature in modern times. In modern languages
    a definition may generate some code to be run and the order in which
    this code is run matters.

    * One-time definitions (no headers, interfaces etc)
    * Expression-based

    C is mostly expression-based. There are languages that go further
    than C, for example:

    a := (s = 1; for i in 1..10 repeat (s := s + i); s)

    is legal in the language that I use, but cannot be directly translated
    to C. However, from the examples that you gave it looks like your
    language is _less_ expression-based than C.
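
    (As an aside, the nearest C itself gets to that example is probably the
    GCC/Clang statement-expression extension, shown here as a sketch; it is
    not standard C.)

        #include <stdio.h>

        int main(void)
        {
            /* A ({ ... }) block is an expression whose value is its
               last statement - a compiler extension, not ISO C. */
            int a = ({ int s = 1;
                       for (int i = 1; i <= 10; i++) s += i;
                       s; });
            printf("%d\n", a);   /* 56 */
            return 0;
        }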

    * Program-wide rather than module-wide compilation unit

    AFAICS C leaves the choice to the implementer. If your language _only_
    supports whole-program compilation, then this would be a
    negative feature.

    * Build direct to executable; no object files or linkers

    That is really a question of implementation. Building _only_
    to an executable is a misfeature (what about the case when I want
    to use a few routines in your language, but the rest including
    the main program is in a different language?).

    * Blazing fast compilation speed, can run direct from source

    Again, that is implementation (some language features may slow
    down compilation; as you know, C allows fast compilation).

    * Module scheme with tidy 'one-time' declaration of each module
    * Function reflection (access all functions within the program)
    * 64-bit default data types (ie. 'int' is 64 bits, 123 is 64 bits)
    * No build system needed

    That really depends on the needs of your program. Some are complex
    and need a build system, some are simple and in principle could
    be compiled with "no" build system. I still use Makefiles for
    simple programs for two reasons:
    - typing 'make' is almost as easy as it can get
    - I want to have a record of the compiler used / compiler options /
      libraries

    * Create and directly compile One-file amalgamations of projects
    * String and file embedding you know about
    * One-file self-contained implementations

    I haven't listed the literally one hundred small enhancements that make
    the coding experience even in a lower-level system language with
    primitive types so comfortable.

    Few are remarkable or unique, but it's putting it all together in one
    tidy package that counts.


    And for starters, it looks like all contenders
    offer modules, with better features than yours.

    Sure, you have C++, Zig, Rust, Java, C#, Dart, ....

    All with very advanced and complicated features. They also have big implementations and take ages to build projects.

    (When I need more advanced features, I switch to my scripting language.
    Which also, by design, shares the same syntax and most of the same characteristics and features.)

    I think, like David Brown, you just don't get it.

    Try and think of what I do as creating delicious meals in my kitchen
    from my own recipes. Nothing new here either. But both you and DB
    probably work (figuratively) in the food industry or run chains of restaurants, or work in a food laboratory.

    Or are used to buying ready-meals from supermarkets.

    Meals are a different thing than programming languages. If you want
    to say that _you_ enjoy your language(s), then I get this. My point
    was that you are trying to present your _subjective_ preferences
    as something universal. I like programming and an important part
    is that my programs work. So I like features that help me to get a
    working program and dislike ones that cause trouble. IME, the
    following cause trouble:
    - case insensitivity
    - dependence on poorly specified defaults
    - out-of-order definitions
    - inflexible tools that for example insist on creating an executable
      without the option of making a linkable object file

    Concerning 1-based indexing, IME in more cases it causes trouble
    than it helps, but usually this is a minor issue. I use
    line-oriented syntax a lot. I can say that it works; simple examples
    are easy, but there are unintuitive situations and sometimes trouble.
    For example cut and paste works better for traditional syntax.
    If there are problems, then beginners may be confused. As one
    guy put it: the trouble with white space is that one cannot see it.

    I am comfortable with braces and if I were to design a new language
    there is a good chance that I would use them. And the same with
    semicolons. One could go for a "minimal" syntax, using only
    parentheses and commas, but for reading variety is helpful,
    so it is better to also have braces and semicolons.

    I do my programming mostly without any IDE. Frequently I use
    simple Makefiles, but also I do development with "no" build
    system, basically interactively defining new functions, and
    collecting the working parts in a file. Or a semi-classic
    edit-compile-test cycle where I have an editor open in one terminal
    window, and I give compile commands by hand.

    One thing that I learned many years ago is: what I find easy
    is not necessarily easy for other folks. What I like is not
    necessarily what other folks like. I think that my methods
    of working and languages I use are quite effective, but
    I do not advocate them as something good for all. In fact,
    it seems that the majority goes in quite a different direction.

    --
    Waldek Hebisch

  • From David Brown@21:1/5 to Bart on Sun Dec 11 20:46:13 2022
    On 10/12/2022 18:23, Bart wrote:
    On 09/12/2022 11:54, David Brown wrote:
    On 09/12/2022 01:52, Bart wrote:

    higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII,
    metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.


    Those are the sorts of features that are implemented more diversely
    across different languages than simple ones. They will be harder to
    translate from one language to another. I favour more conservative
    and more universal features.


    Again, you favour things that seem more conservative and universal to
    /you/.

    Well, they were universal and obvious 40+ years ago. Nearly two
    generations on, everyone is into more advanced features (most likely
    due to being foisted on them in university courses).

    Then it may be necessary to look at what would be obvious and intuitive
    to anyone, or can be described in everyday terms like shelves of books, numbered drawers and so on. (Except that soon not many will know what a
    book looks like.)

    But people with different programming backgrounds and habits could
    have completely different ideas of what is "conservative and
    universal".  For many languages, type inference and generics is so
    integral that it would seem very strange to declare variables of a
    given type - they may not even see variables as a meaningful concept.

    Give it another generation, someone will reintroduce the idea of mutable variables as a brilliant new concept.

    However I think you're wrong: there are a set of basic features that,
    for the time being, most will understand or can easily guess, often used in pseudo-code.

    If I look at Intel processor manuals for example, the behaviour of
    instructions is described in a pseudo-code language that looks a lot
    like Algol. It has IF statements, explicit FOR-loops and mutable
    variables.


    This is exactly what I said - you /think/ certain concepts are
    "universal" or "fundamental" because of your background, and that
    includes heavy use of assembly coding. von Neumann computer
    architecture has turned out to be very successful, and imperative
    procedural programming is a good fit for such systems at the low level.
    Not all processors fit that model, however, and there are niche devices
    that are very different. (Internally, even mainstream devices like x86 processors deviate substantially from strict von Neumann models.)

    That does not mean, however, that such languages are ideal for
    programming. Programming is the art of taking a task from the problem
    domain and ending up with something in the implementation domain. For
    most processors, a C-like language (which includes your compiled
    language - imperative and procedural) is close to the implementation
    domain. It is, however, typically very far from the problem domain. A
    high level language such as Python is going to be closer to the problem
    domain. Compilers, interpreters, libraries, etc., automate the process
    of moving from the chosen programming language down to the actual implementation.

    The closer your language choice is to the problem domain, the shorter
    the program, the easier it is to see (or prove) that it is correct, and
    the shorter development time. The closer your language choice is to the implementation domain, the fewer inefficiencies are typically introduced
    in the automated part of the process.

    Different kinds of task in the problem domain are best described in
    different kinds of language. Different situations call for different selections of level of language and trade-offs. But common to all
    programming is this progression from the problem to the implemented
    solution.

    If you view just one part of this process, it's easy to imagine that
    the concepts found at that point are especially important, or
    fundamental or universal. But they are not - they are simply common in
    that area of the whole art form of software development. A test
    engineer might say
    programming revolves around test-cases and user models. An assembly
    programmer might say it is all about memory addresses, registers, ALU instructions and branches. A C programmer might think variables and
    functions are the key concepts that every programmer understands. For
    Forth programmers, the stack is the centre of the universe. And they
    are all /wrong/, because they are only looking at a small part of a big picture, and within their small part, they are all /right/.


    So, how about that; my syntax, which I claim to be simple and universal,
    is clear enough to be used as pseudo-code.


    It is fine to be used as pseudo-code by people who understand it. It is useless for people who don't understand it. The same goes for
    functional programming:

    factorial 0 = 1
    factorial n = n * factorial (n - 1)

    Is that hard for you to understand? Of course not. It is written in a
    very different way than you would write the code in your own language -
    there are no types, no loops, no variables, no conditionals. It works
    by pattern matching, recursion and type inference.
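
    (For contrast, a sketch of the same function in plain C: the recursion
    carries over directly, while the pattern-matched base case becomes an
    explicit conditional.)

        #include <stdio.h>

        static unsigned long long factorial(unsigned n)
        {
            if (n == 0)          /* base case, written as a conditional   */
                return 1;        /* rather than a separate pattern clause */
            return n * factorial(n - 1);
        }

        int main(void)
        {
            printf("%llu\n", factorial(10));   /* 3628800 */
            return 0;
        }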

    It turns out that simple code can be simple to understand, even if it is written in a very different kind of language or a different style. As
    for complex code - well, it depends entirely on your experience and
    familiarity with the kind of language (as well as with the problem or
    operation being described).


    These are some characteristics and fundamentals that I like, that are in danger of becoming extinct; they already are in many languages:

    These are not "in danger of becoming extinct" - though they may be going
    out of fashion in some cases.


    * Case-insensitive
    * 1-based counting (if there is even anything to count!)
    * Mutable variables, even the concept of 'state'
    * Explicit arrays, and explicit indexing of arrays
    * Iteration over A to B using explicit loops
    * While loops
    * Goto
    * Built-in read/print /statements/ (or just having any i/o at all)
    * Ordinary functions, you know, the ones you just declare rather than
    just being some named lambda expression.


    Who cares what /you/ like, except for /you/ ? You make your language
    the way you want it. It does not mean that the features you like are
    somehow "better", "universal", "fundamental", "understandable", or
    anything else. They are just the things you like. (And if you think
    any of these are "in danger of becoming extinct", you must be living on
    a different planet. Far from everyone will agree with you on any or all
    of these points, but that is not remotely the same as saying they are
    going extinct.)

    These include many constructs popular in pseudo-code. So what do the developers of advanced languages actually have against clear code?

    I get the impression that many can't take a language seriously if it
    uses a syntax that just anyone can understand.


    Of course you put things in your language that /you/ find appropriate
    - you made it for yourself, and if some Prolog user or Sketch expert
    wants to use it, they'll have to learn how it works.  All I am asking
    you to do is understand that your ideas, your preferences, your
    choices are formed from your experiences and background - they are not
    "universal" and do not necessarily apply to others.

    Again, I think they are: more obvious, more intuitive, simpler, suitable
    for use as pseudo-code, and far easier to port to an arbitrary language.
    At least, an arbitrary language that hasn't completely done away with
    the basics.


    Again, you are wrong. I think it is a shame that you are sometimes so
    insular that you can't even see that your viewpoint is limited.
    Personal preferences are fine - assuming they apply to everyone is not.

  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Mon Dec 12 00:16:34 2022
    On 11/12/2022 16:50, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:


    (I don't think you've made it clear whether the other language(s) you've referred to are some mainstream ones, or one(s) that you have devised. I
    will assume the latter and tone down my remarks. A little.)

    My scheme also allows circular and mutual imports with no restrictions.

    You probably mean "with no artifical restrictions". There is
    fundamental restriction that everything should resolve in finite
    number of steps.

    Huh? That doesn't come up. Anything which is recursively defined (eg.
    const a=b, b=a) is detected, but that is due to out-of-order
    declarations, not circular modules.

    There needs to be some structure, some organisation.

    Exactly, private import is tool for better organisation.

    Sorry, all I can see is extra work; it was already a hassle having to
    write `import B` on top of every module that used B, when it was visible
    to all functions, because of having to manage that list of imports /per module/. Now I have to do that micro-managing /per function/?

    (Presumably this language has block scopes: can this import also be
    private to a nested block within each function?)

    With module-wide imports, it is easy to draw a diagram of interdependent modules; with function-wide imports, that is not so easy, which is why I
    think there will be less structure.

    I can compile each program separately. Granularity moves from module to
    program. That's better. Otherwise why not compile individual functions?

    In the past there was support for compiling individual functions, and
    I would not exclude possibility that it will come back. But ATM
    I prefer to keep things simple, so this functionality was removed.

    With whole program compilation, especially used in the context of a
    resident tool that incorporates a compiler, with resident symbol tables, resident source and running code in-memory (actually, just like my very
    first compilers), lots of possibilities open up, of which I've hardly
    scratched the surface.

    Including compiling/recompiling a function at a time.

    Although part-recompiling during a pause in a running program, then
    resuming, would still be very tricky. That would need debugging
    features, and I would consider, in that case, running via an interpreter.

    My point: a system that does all this would need all the relevant bits
    in memory, and may involve all sorts of complex components. But a
    whole-program compiler that runs apps in-memory already does half the work.

    I used to use independent compilation myself. I moved on to
    whole-program compilation because it was better. But all the issues
    involved with interfacing at the boundaries between modules don't
    completely disappear, they move to the boundaries between programs
    instead; that is, between libraries.

    I consider programs to be different from libraries. Program may
    use several libraries, in degenerate case library may be just a
    single module. Compiling whole program has clear troubles with
    large programs.

    My definition of 'program', on Windows, is a single EXE or DLL file.

    I expect larger applications to consist of a collection of EXE and DLL
    files. My own will have one EXE and zero or more DLLs, but I would also
    make extensive use of scripting modules, that have different rules.


    The latter being an automatically created amalgamation (produced with
    `mm -ma app`). Build using `mm app`.

    I could provide a single file, a shell archive containing the build script
    and sources, but an important part of providing sources is that people
    can read, understand and modify them.

    I don't agree. On Linux you do it with sources because it doesn't have a reliable binary format like Windows that will work on any machine. If
    there are binaries, they might be limited to a particular Linux
    distribution.

    (Binaries on Linux have always been a mystery to me, starting with the
    fact that they don't have a convenient extension like .exe to even tell
    what it is.)

    GNU folks have a nice definition
    of source: "preferred form for making modifications". I would guess
    that 'app.ma' is _not_ your preferred form for making modifications,
    so it is not really true source.

    No, but deriving the true sources from app.ma is trivial, since it is
    basically a concatenation of the relevant files.

    And to build from "source" I need
    source first. And I provide _true_ sources to my users.

    If you were on Linux or didn't want to use my compiler, then it's even
    simpler; I would provide exactly one file:

    app.c Generated C source code (via `mc -c app`)

    Here, you need a C compiler only. On Windows, you can build it using
    `gcc app.c -oapp.exe`. Linux needs more options, listed at the top of
    the file.

    Sorry, a generated file is _not_ source. If I were to modify the C file

    This is not for modifying. 99% of the time I want to build an open
    source C project, it is in order to provide a running binary, not spend
    hours trying to get it to build. These are the obstacles I have faced:

    * Struggling with formats like .gz2 that require multiple steps on Windows
    * Ending up with myriad files scattered across myriad nested directories
    * Needing to run './configure' first (this will not work on Windows...)
    * Finding a 'make' /program/ (my gcc has a program called
    mingw32-make.exe; is that the one?)
    * Getting 'make' to work. Usually it fails partway and makefiles can be
    so complex that I have no way of figuring a way out
    * Or, trying to compile manually, struggling with files which are all
    over the place and imparting that info to a compiler.

    I don't have any interest in this; I just want the binary!

    So, with my own programs, if I can't provide a binary (eg. they are not trusted), then one step back from a single binary file, is a single
    amalgamated source file.

    I first did this in 2014, as a reaction to the difficulties I kept
    facing: I wanted any of my applications to be as easy to build as hello.c.

    If someone wants the original, discrete sources, then sure they can have
    a ZIP file, which generally will have files that unpack into a single
    directory. But it's on request.


    The difference is that what I provide is genuinely simple: one bare
    compiler, one actual source file.

    Sorry, for me "one file" is not a problem, there is 'tar' (de facto
    standard for distributing source code)

    Yeah, I explained how well that works above. So the last Rust
    implementation was a single binary download (great!), but it installed
    itself as 56,000 discrete files across I don't know how many thousands of
    directories (not so great). And it didn't work (it requires additional
    tools).

    Being able to ZIP or TAR a sprawling set of files into a giant binary
    makes it marginally easier to transmit or download, but it doesn't
    really address complexity.

    And there is a possible quite large dependency, namely Windows.

    Yeah, my binaries run on Windows. Aside from requiring x64 and using
    Win64 ABI, they use one external library MSVCRT.DLL, which itself uses
    Windows.

    For programs that run on Windows and Linux, those depend on the libraries
    used. For 'M' programs, one module has to be chosen from the Windows and
    Linux versions; to run on Linux, I have to do this:

    mc -c -linux app.m # On Windows, makes app.c, using
    # the Linux-specific module

    gcc app.c -oapp -lm etc # On Linux
    ./app

    but M makes little use of WinAPI. With my interpreter, the process is as follows:

    c:\qx>mc -c -linux qq # On Windows
    M6 Compiling qq.m---------- to qq.c

    Copy qq.c to c:\c then under WSL:

    root@DESKTOP-11:/mnt/c/c# gcc qq.c -oqq -fno-builtin -lm -ldl

    Now I can run scripts under Linux:

    root@DESKTOP-11:/mnt/c/c# ./qq -nosys hello
    Hello, World!

    However, notice the '-nosys' option; this is because qq automatically
    incorporates a library suite that includes a GUI library based on Win32.
    Without that, it would complain of not finding user32.dll etc.

    I would need to dig up an old set of libraries or create new
    Linux-specific ones. A bit of extra work. But see how the entire app is contained within that qq.c file.


    It is [not?] clear how much of your code _usefully_ runs in a non-Windows environment.

    OK, let's try my C compiler. Here I've done `mc -c -linux cc`, copied
    cc.q, and compiled under WSL as bcc:

    root@DESKTOP-11:/mnt/c/c# ./bcc -s hello.c
    Compiling hello.c to hello.asm

    root@DESKTOP-11:/mnt/c/c# ./bcc -e hello.c
    Preprocessing hello.c to hello.i

    root@DESKTOP-11:/mnt/c/c# ./bcc -c hello.c
    Compiling hello.c to hello.obj

    root@DESKTOP-11:/mnt/c/c# ./bcc -exe hello.c
    Compiling hello.c to hello.exe
    msvcrt
    msvcrt.dll
    SS code gen error: Can't load search lib

    So, most things actually work; only creating an EXE doesn't work, because
    it needs access to msvcrt.dll. But even if it did, it would work as a
    cross-compiler, as its code generator is for the Win64 ABI.

    But I think this shows useful stuff can be done. A more interesting test
    (which used to work, but it's too much effort right now), is to get my M compiler working on Linux (the 'mc' version that targets C), and use
    that to build qq, bcc etc from original sources on Linux.

    In all, my Windows stuff generally works on Linux. Linux stuff generally doesn't work on Windows, in terms of building from source.

    C is just so primitive when it comes to this stuff. I'm sure it largely
    works by luck.

    C is at a low level, that is clear.

    The way it does modules is crude. So was my scheme in the 1980s, but it
    was still one step up from C. My 2022 scheme is miles above C now. The underlying language can still be low level, but you can at least fix
    some aspects.

    I think a better module scheme could be retrofitted to C, but I'm not
    going to do it.


    A good programming
    environment should help. C as a language is not helpful; one
    may have a fully compliant and rather unhelpful compiler. But
    real C compilers tend to be as helpful as they can within the
    limits of the C language. While C still limits what they can do,
    there is quite a lot of difference between a current popular
    compiler and a bare-bones legal compiler. And there are extra
    tools, and here C support is hard to beat.

    I don't agree with spending 1000 times more effort in devising complex
    tools compared with just fixing the language.



    So what are the new circles of ideas? All that crap in Rust that makes
    coding a nightmare, and makes building programs dead slow? All those new
    functional languages with esoteric type systems? 6GB IDEs (that used to
    take a minute and a half to load on my old PC)? No thanks.

    The borrow checker in Rust looks like a good idea. There is a good chance
    that the _idea_ will be adopted by several languages in the near future.

    OK. I've heard that it makes coding in Rust harder, and also makes
    compilation slower. Not very enticing features!

    Not so new ideas are:
    - removing limitations, that is making sure language constructs
    work as generally as possible (which allows getting rid of many
    special constructs from older languages)
    - nominal, problem-dependent types. That is, types should reflect
    the problem domain. In particular, domains which need types like
    'u32' are somewhat specific; in normal domains the fundamental types
    are different
    - functions as values/parameters. In particular, functions have
    types and can be members of data structures (see the sketch after
    this list)
    - "full rights" for user-defined types. Which means whatever
    syntax/special constructs work on built-in types should
    also work for user-defined types
    - function overloading
    - type reconstruction
    - garbage collection
    - exception handling
    - classes/objects
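
    As a rough illustration of two of those ideas - functions as values that
    can live inside data structures, and type reconstruction - here is a
    minimal Haskell sketch; the names are invented and it is not meant to
    stand for any of the languages discussed here:

        -- A record whose field 'action' holds an ordinary function value.
        data Handler = Handler { label :: String, action :: Int -> Int }

        doubler :: Handler
        doubler = Handler { label = "double", action = (* 2) }

        -- No annotation needed here: the compiler reconstructs the type
        -- runHandler :: Handler -> Int -> Int
        runHandler h x = action h x

        main :: IO ()
        main = print (runHandler doubler 21)   -- prints 42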

    Are these what your language supports? (If you have your own.)

    I can't say these have ever troubled me. My scripting language has
    garbage collection, and experimental features for exceptions and playing
    with OOP, and one or two taken from functional languages.

    Being dynamic, it has generics built-in. But it deliberately keeps type
    systems at a simple, practical level (numbers, strings, lists, that sort
    of thing), because the aim is for easy coding. If you want hard, then
    Rust, Ada, Haskell etc are that way -->!

    * Clean, uncluttered brace-free syntax
    Does this count as brace-free?

    for i in 1..10 repeat (print i; s := s + i)

    Not if you just substitute brackets for braces. Brackets (ie "()") are
    OK within one line, otherwise programs look too Lispy.

    * Case-insensitive
    * 1-based
    * Line-oriented (no semicolons)
    * Print/read as statements

    A lot of folks consider the above misfeatures/bugs.

    I know.

    Concerning
    'line-oriented' and 'intuitive', can you guess which changes to the
    following statement in 'line-oriented' syntax are legal and preserve
    meaning?

    nm = x or nm = 'log or nm = 'exp or nm = '%power or
    nm = 'nthRoot or
    nm = 'cosh or nm = 'coth or nm = 'sinh or nm = 'tanh or
    nm = 'sech or nm = 'csch or
    nm = 'acosh or nm = 'acoth or nm = 'asinh or nm = 'atanh or
    nm = 'asech or nm = 'acsch or
    nm = 'Ei or nm = 'erf or nm = 'erfi or nm = 'li or
    nm = 'Gamma or nm = 'digamma or nm = 'dilog or
    nm = '%root_sum =>
    "iterate"

    As a hint let me say that '=' is comparison. And this is a single
    statement; small changes to whitespace will change the parse and lead
    to wrong code or a syntax/type error. BTW: you need a real newsreader
    to see it, Google and the like will change it so it no longer works.

    Thunderbird screws it up as well, unless it is meant to have a ragged
    left edge. But not sure what your point is.

    Non-line-oriented (like C, like JSON) is better for machine-readable
    code, which can also be transmitted with less risk of garbling. But when
    90% of semicolons in C-style languages coincide with end-of-line, you
    have to start questioning the point of them.

    Note that C's preprocessor is line-oriented, but C itself isn't.

    C is still tremendously popular for many reasons. But anyone wanting to
    code today in such a language will be out of luck if they prefer any or
    all of these characteristics. This is why I find coding in my language
    such a pleasure.

    Then, if we are comparing the C language with mine, I offer:

    * Out of order definitions

    That is considered a misfeature in modern times.

    Really? My experiments showed that modern languages (not C or C++) do
    allow out-of-order functions. This gives great freedom in not worrying
    about whether function F must go before G or after, or being able to
    reorder or copy and paste.

    In modern languages a
    definition may generate some code to be run, and the order in which
    this code is run matters.

    * One-time definitions (no headers, interfaces etc)
    * Expression-based

    C is mostly expression-based.

    No, it's mostly statement-based. Although it might be that most
    statements are expression statements (a=b; f(); ++c;).

    You can't do 'return switch() {...}' for example, unless using gcc
    extensions.

    There are languages that go further
    than C, for example:

    a := (s = 1; for i in 1..10 repeat (s := s + i); s)

    is legal in the language that I use, but cannot be directly translated
    to C. However, from the examples that you gave it looked like your language
    is _less_ expression-based than C.

    I don't use the feature much. I had it from the 80s, then switched to
    statement-based for a few years to match the scripting language; now
    both are expression-based.

    One reason it's not used more is because it causes problems when
    targetting C. However I like it as a cool feature.

    * Program-wide rather than module-wide compilation unit

    AFAICS C leaves the choice to the implementer. If your language _only_
    supports whole-program compilation, then this would be a
    negative feature.

    * Build direct to executable; no object files or linkers

    That is really a question of implementation. Building _only_
    to executable is a misfeature (what about the case when I want
    to use a few routines in your language, but the rest, including
    the main program, is in a different language?).

    There were escape routes involving OBJ files, but that's fallen into
    disuse and needs fixing. For example, I can't do `mm -obj app` ATM, but
    could do this, when I've cleared some bugs:

    mm -asm app # app.m to app.asm
    aa -obj app # app.asm to app.obj
    gcc app.obj lib.o -oapp.exe # or lib.a?

    This (or something near) allows static linking of 'lib' instead of
    dynamic, or including lib written in another language.

    However, my /aim/ is for my language to be self-contained, and not to
    talk to external software except via DLLs.

    * Blazing fast compilation speed, can run direct from source

    Again, that is implementation (some language features may slow
    down compilation; as you know, C allows fast compilation).

    C also requires that the same header (say windows.h or gtk.h) used in 50 modules needs to be processed 50 times for a full build.

    My M language processes it just once for a normal build (further, such
    APIs are typically condensed into a single import module, not 100s of
    nested headers). Some of it is by design!

    * Module scheme with tidy 'one-time' declaration of each module
    * Function reflection (access all functions within the program)
    * 64-bit default data types (ie. 'int' is 64 bits, 123 is 64 bits)
    * No build system needed

    That really depends on the needs of your program. Some are complex
    and need a build system, some are simple and in principle could
    be compiled with "no" build system. I still use Makefiles for
    simple programs for two reasons:
    - typing 'make' is almost as easy as it can get

    Ostensibly simple, yes. But it rarely works for me. And internally, it
    is complex. Compare what a typical makefile contains with one of my
    program headers, which looks like a shopping list - you can't get simpler!

    - I want to have record of compiler used/compiler options/
    libraries

    So do I, but I want to incorporate that into the language. So if a
    program uses OpenGL, when it sees this:

    importdll opengl =

    (followed by imported entities) that tells it it will need opengl.dll.
    In more complex cases (where the mapping of import library to DLL file(s)
    is not straightforward), it's more explicit:

    linkdll opengl # next to module info

    This stuff no longer needs to be submitted via command line; that is
    old-hat.


    Or are used to buying ready-meals from supermarkets.

    Meals are a different thing from programming languages. If you want
    to say that _you_ enjoy your language(s), then I get this. My point
    was that you are trying to present your _subjective_ preferences
    as something universal.

    Yes, and I think I'm right. For example, English breakfast choices are
    simple (cereal, toast, eggs, sausages), everybody likes them, kids and
    adults. But then in the evening you go to a posh restaurant and things
    are very different.

    I think the same basics exist in programming languages.

    I like programming and an important part
    is that my programs work. So I like features that help me to get
    a working program and dislike ones that cause trouble. IME, the
    following cause trouble:

    - case insensitivity

    I believe this would only cause problems if you already have a
    dependence on case-sensitivity, so it's a self-fulfilling problem!

    Create a new language with it, and those problems become minor ones that
    occur on FFI boundaries, and then not that often.

    - dependence on poorly specified defaults
    - out of order definitions

    I don't believe this. In C, not having this feature means:

    * Requiring function prototypes, sometimes
    * Causing problems in self-referential structs (needs struct tags)
    * Causing problems with circular references in structs (S includes
    a pointer to T, and T includes a pointer to S)

    - inflexible tools that, for example, insist on creating an executable
    with no option of making a linkable object file

    Concerning 1-based indexing, IME it causes trouble in more cases
    than it helps, but usually this is a minor issue.

    In my compiler sources, about 30% of arrays are zero-based (with the
    0-term usually associated with some error or non-set/non-valid index).

    I use line-oriented
    syntax a lot. I can say that it works, simple examples
    are easy, but there are unintuitive situations and sometimes trouble.
    For example, cut and paste works better with traditional syntax.
    If there are problems, then beginners may be confused. As one
    guy put it: the trouble with white space is that one cannot see it.

    White space (spaces and tabs) is nothing to do with line-oriented.


    In fact,
    it seems that majority goes in quite different direction.

    I can see that. 30+ years ago, I hardly knew what other people did, and
    didn't care. I did get the impression that I had a competitive edge in the
    way I did things.

    A lot of stuff I do with languages is experimental. When it doesn't work
    it gets dropped. Or it evolves. One problem I have now is that I don't
    do enough apps, which is what helps drive language development.

  • From Bart@21:1/5 to David Brown on Mon Dec 12 02:17:04 2022
    On 11/12/2022 19:46, David Brown wrote:
    On 10/12/2022 18:23, Bart wrote:

    If I look at Intel processor manuals for example, the behaviour of
    instructions is described in a pseudo-code language that looks a lot like
    Algol. It has IF statements, explicit FOR-loops and mutable variables.


    This is exactly what I said - you /think/ certain concepts are
    "universal" or "fundamental" because of your background, and that
    includes heavy use of assembly coding.  von Neumann computer
    architecture has turned out to be very successful, and imperative
    procedural programming is a good fit for such systems at the low level.
    Not all processors fit that model, however, and there are niche devices
    that are very different.  (Internally, even mainstream devices like x86 processors deviate substantially from strict von Neumann models.)

    You're missing my point: the description of what the instructions did,
    which we can assume to be detailed pseudo-code, used explicit loops for example, a feature missing or marginalised in FP.

    I asked, why didn't they use recursion in their pseudo-code? Why didn't
    my Wikipedia example do so rather than use 'while' loops?

    I say it's because such features are understood by more people. You will disagree of course; because /you/ can understand FP code, so must everyone.

    When I look at my paper tax return (or an IKEA instruction leaflet), it
    will contain instructions like: 'go to section 7' or 'go to page 12'. So
    you can assume that 'goto' at least is well understood!



    That does not mean, however, that such languages are ideal for
    programming.  Programming is the art of taking a task from the problem domain and ending up with something in the implementation domain.  For
    most processors, a C-like language (which includes your compiled
    language - imperative and procedural) is close to the implementation domain.  It is, however, typically very far from the problem domain.  A high level language such as Python is going to be closer to the problem domain.

    Python? Probably my scripting language is closer to the problem domain
    too. Mine however has 'goto'!


    The closer your language choice is to the problem domain, the shorter
    the program, the easier it is to see (or prove) that it is correct, and
    the shorter development time.

    That doesn't work with functional languages, not unless you're a genius
    with a PhD in computer science. The program might be 10 times shorter
    but 100 times more cryptic.


    Different kinds of task in the problem domain are best described in
    different kinds of language.  Different situations call for different selections of level of language and trade-offs.  But common to all programming is this progression from the problem to the implemented
    solution.

    Why do you think I decided to create a scripting language at all? I first
    did so in the late 80s, as the next step up from the non-programmable command
    language of my application. It was meant also for users of my app, who
    needed to do domain-specific scripting relevant to the application.

    It was higher level than the one used to implement the application. It
    was domain-specific in having built-in types such as 3D points/vertices
    and 3D transformation matrices.

    So if P, Q were the endpoints of a line, then (P+Q)/2 was the midpoint,
    while A*B might combine two transformation matrices into one. No memory addresses in sight.

    Maybe I know something about the need for a more accessible language.

    However one big part of that language was creating and working with GUI elements (dialogs etc), which was still a slog even in scripting code.
    How much help would Haskell have been there?

  • From David Brown@21:1/5 to Bart on Mon Dec 12 12:56:39 2022
    On 12/12/2022 03:17, Bart wrote:
    On 11/12/2022 19:46, David Brown wrote:
    On 10/12/2022 18:23, Bart wrote:

    If I look at Intel processor manuals for example, the behaviour of
    instructions is described in a pseudo-code language that looks a lot like
    Algol. It has IF statements, explicit FOR-loops and mutable variables.


    This is exactly what I said - you /think/ certain concepts are
    "universal" or "fundamental" because of your background, and that
    includes heavy use of assembly coding.  von Neumann computer
    architecture has turned out to be very successful, and imperative
    procedural programming is a good fit for such systems at the low
    level. Not all processors fit that model, however, and there are niche
    devices that are very different.  (Internally, even mainstream devices
    like x86 processors deviate substantially from strict von Neumann
    models.)

    You're missing my point: the description of what the instructions did,
    which we can assume to be detailed pseudo-code, used explicit loops for example, a feature missing or marginalised in FP.

    I asked, why didn't they use recursion in their pseudo-code? Why didn't
    my Wikipedia example do so rather than use 'while' loops?

    I say it's because such features are understood by more people. You will disagree of course; because /you/ can understand FP code, so must everyone.


    And you are /completely/ missing /my/ point. (This is a recurring theme
    in our discussions. If only Usenet were interactive and supported
    arm-waving and a whiteboard, I'm sure we'd understand each other far
    better!)

    I am not arguing that loops (or any other programming language feature)
    are not commonly understood. I am arguing that they are not universal
    or fundamental in any sense.

    I am /not/ saying that "because I understand FP-style code, so does
    everyone" - I am merely saying that /you/ are /wrong/ to say "because I understand imperative code, so does everyone".


    Look at everyday language. We all understand English here in this
    group. Even in countries with completely different native languages,
    such as China, many technical people understand at least some English.
    Does that mean English is somehow more "fundamental" than Mandarin?
    Simpler? More universal? Understood by everyone?


    When I look at my paper tax return (or an IKEA instruction leaflet), it
    will contain instructions like: 'go to section 7' or 'go to page 12'. So
    you can assume that 'goto' at least is well understood!


    Different algorithms are clear when expressed in different ways.

    You want the greatest common divisor of two numbers, a and b ? If a
    and b are the same, that's the GCD. If b is bigger than a, swap them.
    Then find the GCD of b and (a - b).

    Oh look, it turns out recursion is "universally understood".
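
    Written out directly, that description is only a few lines of Haskell;
    a minimal sketch ('mygcd' is a made-up name, to avoid the built-in gcd):

        -- GCD by repeated subtraction, for positive integers.
        mygcd :: Integer -> Integer -> Integer
        mygcd a b
          | a == b    = a                -- same number: that's the GCD
          | b > a     = mygcd b a        -- swap so the first is larger
          | otherwise = mygcd b (a - b)  -- then GCD of b and (a - b)

        main :: IO ()
        main = print (mygcd 48 36)       -- prints 12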

    You want to go shopping? Make a list. Get the first thing on the list,
    then get the rest of the list. Stop when the list is empty. So list processing, pattern matching and recursion is universal - there are no variables, loops or gotos in that description.

    You need to get some things in one shop, other things in a different
    shop? Take all the things from the first list that you can get in the
    first shop, and put that on a new list. Now you have the key functional programming "filter" function as universal, even if the terminology of functional programming is unfamiliar.

    You realise you need to bake two cakes instead of one? Double
    everything on the list. Now "map" is universal.
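
    In Haskell those two kitchen steps really are just 'filter' and 'map';
    a minimal sketch with made-up item names:

        -- The shopping list, and which items the first shop stocks.
        shoppingList :: [String]
        shoppingList = ["eggs", "sugar", "flour", "screws", "paint"]

        inFirstShop :: String -> Bool
        inFirstShop item = item `elem` ["eggs", "sugar", "flour"]

        main :: IO ()
        main = do
            print (filter inFirstShop shoppingList)   -- the "filter" step
            print (map (* 2) [1, 40, 40])             -- the "map" step: double every quantity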

    You have recipes for six different kinds of cake. To make double-sized
    cakes, you need to double the ingredients and add 30% to the cooking
    time. Now you have a higher-order function.
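
    To make the "higher-order" step concrete: in this minimal Haskell sketch
    a recipe is itself a function (from servings to an ingredient list), so
    the transformation is a function that takes a function and returns a new
    one. All the names here are invented:

        -- A recipe maps a number of servings to ingredient quantities.
        type Recipe = Int -> [(String, Double)]

        spongeCake :: Recipe
        spongeCake n = [("eggs", fromIntegral n), ("flour", 40 * fromIntegral n)]

        -- Higher-order: takes a Recipe and gives back a new Recipe.
        doubleSized :: Recipe -> Recipe
        doubleSized recipe n = map (\(name, qty) -> (name, 2 * qty)) (recipe n)

        main :: IO ()
        main = print (doubleSized spongeCake 3)
        -- prints [("eggs",6.0),("flour",240.0)]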

    You have a things-to-do list? Do the first one, then move on to the
    rest of the list. That's recursion and lazy evaluation.

    Imagine the list of all prime numbers. Take the first 5 containing the
    digit "3". It seems that understanding how to manipulate infinite lists
    is easy too.
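
    That last one is almost literal Haskell; a small sketch, with a naive
    (but adequate) trial-division sieve standing in for "the list of all
    prime numbers":

        -- An infinite list of primes; laziness means only what is needed
        -- ever gets computed.
        primes :: [Integer]
        primes = sieve [2 ..]
          where sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

        main :: IO ()
        main = print (take 5 (filter (elem '3' . show) primes))
        -- prints [3,13,23,31,37]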


    That just about covers the key functional programming concepts - and
    you've barely left the kitchen. No programming experience is needed,
    yet these are instructions that pretty much anyone could follow. And
    while it is certainly possible to give more imperative-style
    descriptions of these tasks, I'd argue that the functional style
    descriptions here are clearer and more natural. (I am quite happy to
    agree that numbered imperative steps are often a better choice for
    making flatpack furniture.)


    If an algorithm can be described relatively clearly and simply, the
    exact style is not critical to understanding - and there is absolutely
    no justification for considering one feature more "universal" than any
    other.

    (The technical terms used for these features are not universally
    understood. But that's the case for most things in life - people are
    quite happy to walk around all day without knowing what "bipedal
    ambulation" means.)


    When it comes to something like the description of processor instructions,
    these are usually written in a mathematical form with a few special
    conventions. The result is normally as much "functional programming
    style" as "imperative programming style" - or a mixture of both. It can
    have "while" loops common to imperative code and "where" clauses common
    to functional languages. Barring coincidences in the particular syntaxes
    chosen, there's unlikely to be much challenge in viewing them as either
    style - the descriptions are usually too short and simple to make much
    difference.




    That does not mean, however, that such languages are ideal for
    programming.  Programming is the art of taking a task from the problem
    domain and ending up with something in the implementation domain.  For
    most processors, a C-like language (which includes your compiled
    language - imperative and procedural) is close to the implementation
    domain.  It is, however, typically very far from the problem domain.
    A high level language such as Python is going to be closer to the
    problem domain.

    Python? Probably my scripting language is closer to the problem domain
    too. Mine however has 'goto'!

    I would definitely expect your scripting language to be closer to
    typical problem domains - otherwise it would be pointless.



    The closer your language choice is to the problem domain, the shorter
    the program, the easier it is to see (or prove) that it is correct,
    and the shorter development time.

    That doesn't work with functional languages, not unless you're a genius
    with a PhD in computer science. The program might be 10 times shorter
    but 100 times more cryptic.


    I don't have a PhD in computer science. I didn't even have a Bachelor's
    degree when I learned functional programming. It is not nearly as hard
    as some people seem to imagine. While it is very difficult to quantify
    or qualify how "hard" something is to learn, I don't think it is
    inherently any more difficult than imperative programming - it's simply
    that many people learn imperative programming first, and never move on.

    I mean, I appreciate your calling me a genius, but it's not actually
    necessary!


    Note that I don't write much pure functional programming code. What I
    like is to be able to mix and match - I like to be able to use
    functional style when that is clearest for the problem, and imperative
    style when /that/ is clearest.


    (Is a Dvorak keyboard layout harder to learn than Qwerty? No - and it
    is more efficient in use. It is all about familiarity. I type code
    three or four times as fast on a Norwegian keyboard layout than a UK
    layout, despite needing extra keypresses for some symbols, as a result
    of familiarity.)


    Different kinds of task in the problem domain are best described in
    different kinds of language.  Different situations call for different
    selections of level of language and trade-offs.  But common to all
    programming is this progression from the problem to the implemented
    solution.

    Why do you think I decided to create a scripting language at all? I first
    did so in the late 80s, as the next step up from the non-programmable command
    language of my application. It was meant also for users of my app, who
    needed to do domain-specific scripting relevant to the application.


    I expect it was for much the same reasons as other scripting languages
    were developed - making it simpler to solve the tasks you needed to
    solve. It's higher level than your compiled low-level language, and
    nearer to the problem domain.

    It was higher level than the one used to implement the application. It
    was domain-specific in having built-in types such as 3D points/vertices
    and 3D transformation matrices.

    So if P, Q were the endpoints of a line, then (P+Q)/2 was the midpoint,
    while A*B might combine two transformation matrices into one. No memory addresses in sight.

    Maybe I know something about the need for a more accessible language.

    However one big part of that language was creating and working with GUI elements (dialogs etc), which was still a slog even in scripting code.
    How much help would Haskell have been there?


    I've never had the need to investigate.

    Declarative styles are often very convenient for defining graphic
    elements, compared to imperative styles. (I.e., you want to say "the
    dialog box has ten buttons" rather than "loop ten times : create a new button").

    Usually, I think, a combination is best. Declarations can be nicer than function calls for creating your gui. Imperative code can make sense
    for acting on events. Event-based programming is useful for user
    interaction (rather than the pure imperative style of polling loops and blocking functions). Object oriented coding helps keep parts of the
    system modularised and structured.

    I wonder if you are imagining functional programming as some kind of
    "opposite" to imperative programming, or that languages are all one or
    the other. The reality is that there is a large number of ways to think
    about programming, and most real-world languages support multiple
    paradigms to a greater or lesser extent. (The paradigms themselves are
    not well defined either - it's all just broad strokes and rough
    categorisation based on common features.)

  • From Bart@21:1/5 to David Brown on Mon Dec 12 14:36:45 2022
    On 12/12/2022 11:56, David Brown wrote:
    On 12/12/2022 03:17, Bart wrote:

    When I look at my paper tax return (or an IKEA instruction leaflet),
    it will contain instructions like: 'go to section 7' or 'go to page
    12'. So you can assume that 'goto' at least is well understood!


    Different algorithms are clear when expressed in different ways.

    You want the greatest common divisor of two numbers, a and b ?  If a
    and b are the same, that's the GCD.  If b is bigger than a, swap them.
    Then find the GCD of b and (a - b).

    Oh look, it turns out recursion is "universally understood".

    You want to go shopping?  Make a list.  Get the first thing on the list, then get the rest of the list.  Stop when the list is empty.  So list processing, pattern matching and recursion is universal - there are no variables, loops or gotos in that description.

    You need to get some things in one shop, other things in a different
    shop?  Take all the things from the first list that you can get in the
    first shop, and put that on a new list.  Now you have the key functional programming "filter" function as universal, even if the terminology of functional programming is unfamiliar.

    You realise you need to bake two cakes instead of one?  Double
    everything on the list.  Now "map" is universal.

    You have recipes for six different kinds of cake.  To make double-sized cakes, you need to double the ingredients and add 30% to the cooking
    time.  Now you have a higher-order function.

    You have a things-to-do list?  Do the first one, then move on to the
    rest of the list.  That's recursion and lazy evaluation.

    Imagine the list of all prime numbers.  Take the first 5 containing the digit "3".  It seems that understanding how to manipulate infinite lists
    is easy too.


    That just about covers the key functional programming concepts - and
    you've barely left the kitchen.

    A nice set of examples, except for the higher order function one, which
    I don't get. Here are some observations:

    * Imperative languages have recursion too.

    * Things like map and reduce can be expressed in imperative code too,
    via built-in functions, or ones that you can write yourself ...

    * ... which leads to the fact that imperative offers the choice to
    express the tasks using lower level features, offering more flexibility

    * My beef with recursion as used in FP is using it for everything, when
    iteration would be clearer

    * Those tasks would typically be expressed much more tersely in FP,
    which also has a penchant for stringing sequences of them together on
    one complex line (but this seems popular everywhere now). In imperative,
    deeply nested function calls would be an anti-pattern.

    * You mentioned a shopping list; I quite like code to /look/ like a
    shopping list. I don't mind reading 10 or 20 lines of clear code where I
    can follow every step; it's better than reading one long line that I
    can't grok. Where would I even stick a debug print if I wanted to know
    an intermediate value?


    You realise you need to bake two cakes instead of one? Double
    everything on the list. Now "map" is universal.

    As is mutable data. Or do you mean create a new list? What exactly do
    you double anyway? A typical ingredients list looks like this:

    ["eggs":1, "sugar":40, "SR flour":40]

    Doubling is not that obvious (and the number for eggs has to be whole;
    other units are grams).

    You want to go shopping? Make a list. Get the first thing on the list, then get the rest of the list. Stop when the list is empty.

    What does that actually look like in Haskell? I can tell you that
    shopping as it's done in real-life is not done recursively.

    In imperative it's:

    proc do_shopping(list) =
        for item in list do
            buy(item)
        end
    end

    Or one-liners:

    for item in list do buy(item) end
    apply(buy, list)            # not map as there is no result

    In imperative-recursive, if you have to, is:

    proc do_shopping(list) =
        if list then
            buy(head(list))
            do_shopping(tail(list))
        fi
    end

    (I agree parametric pattern-matching to get the two parts of the list
    would be better. But imperative is simpler here anyway.)

    You have a things-to-do list? Do the first one, then move on to the
    rest of the list. That's recursion and lazy evaluation.

    But the to-do-list already exists? If there are 1M things to do, the list
    will have 1M items? That's not a great example of lazy evaluation (the
    primes one is better).

    You might as well say that imperative code is lazily evaluated, as it
    consists of a sequence of steps done one at a time, while, ironically,
    FP code isn't as it is one big expression.

    OK, I don't have a particular issue with FP features in small amounts.
    As I said and showed, they can come up in imperative code too.

    I have two broad objections:

    * All the FP features I don't understand and don't find intuitive,
    mainly to do with functions (higher order, closures, currying etc).

    Pattern-matching, map, reduce etc I can deal with; the concepts are
    easy, and they can be trivially expressed in non-FP terms.

    * Basing an entire language around FP features.

  • From David Brown@21:1/5 to Bart on Mon Dec 12 17:27:04 2022
    On 12/12/2022 15:36, Bart wrote:
    On 12/12/2022 11:56, David Brown wrote:
    On 12/12/2022 03:17, Bart wrote:

    When I look at my paper tax return (or an IKEA instruction leaflet),
    it will contain instructions like: 'go to section 7' or 'go to page
    12'. So you can assume that 'goto' at least is well understood!


    Different algorithms are clear when expressed in different ways.

    You want the greatest common divisor of two numbers, a and b ?  If
    a and b are the same, that's the GCD.  If b is bigger than a, swap
    them. Then find the GCD of b and (a - b).

    Oh look, it turns out recursion is "universally understood".

    You want to go shopping?  Make a list.  Get the first thing on the
    list, then get the rest of the list.  Stop when the list is empty.  So
    list processing, pattern matching and recursion is universal - there
    are no variables, loops or gotos in that description.

    You need to get some things in one shop, other things in a different
    shop?  Take all the things from the first list that you can get in the
    first shop, and put that on a new list.  Now you have the key
    functional programming "filter" function as universal, even if the
    terminology of functional programming is unfamiliar.

    You realise you need to bake two cakes instead of one?  Double
    everything on the list.  Now "map" is universal.

    You have recipes for six different kinds of cake.  To make
    double-sized cakes, you need to double the ingredients and add 30% to
    the cooking time.  Now you have a higher-order function.

    You have a things-to-do list?  Do the first one, then move on to the
    rest of the list.  That's recursion and lazy evaluation.

    Imagine the list of all prime numbers.  Take the first 5 containing
    the digit "3".  It seems that understanding how to manipulate infinite
    lists is easy too.


    That just about covers the key functional programming concepts - and
    you've barely left the kitchen.

    A nice set of examples, except for the higher order function one, which
    I don't get.

    It's a set of instructions that are applied to a set of instructions to
    get a new set of instructions - a "recipe" for transforming one cake
    recipe into a new one.


    Here are some observations:

    * Imperative languages have recursion too.

    Sure - there's plenty of overlap. Equally, functional programming
    languages have expressions that are often very much the same as in
    imperative languages (though generally without side-effects). Syntax
    detail varies, but the principles are the same.


    * Things like map and reduce can be expressed in imperative code too,
    via built-in functions, or ones that you can write yourself ...

    * ... which leads to the fact that imperative offers the choice to
    express the tasks using lower level features, offering more flexibility

    * My beef with recursion as used in FP is using it for everything, when
    iteration would be clearer


    I'm happy with mixing them. I don't claim pure functional programming
    style is the best way to write all code. Sticking to pure FP can have
    its advantages, such as making code proofs easier and being inherently
    safe for multi-threading. But it can also be inconvenient for other
    tasks that can be clearer as sequences of commands or explicit loops.
    Equally, some things are vastly simpler and clearer in FP style.

    This all started with a discussion of ideas for a new programming
    language. My suggestion was never for James to make a pure functional programming language. Rather, I suggested he look at some functional programming languages (and other languages) and see what he could copy
    or take as inspiration. That includes features such as higher-order
    functions, anonymous functions, list comprehensions, sum types
    (google for "how to make a binary tree in Haskell", and compare it to
    what's needed in something like C), and pattern matching. It also
    includes considering the benefits from restrictions - what does "no side-effects in expressions" give you? Are the benefits worth the
    limitations?

    I'd also recommend looking at Go and its CSP-style concurrency features.
    Look at Rust's memory safety. Look at contracts from SPARK. There's
    plenty to be inspired from in many languages, in order to make a better language that does new and exciting things. (Let's put these in new
    threads if anyone wants to discuss them in detail.)


    * Those tasks would typically be expressed much more tersely in FP,
    which also has a penchant for stringing sequences of them together on
    one complex line (but this seems popular everywhere now). In imperative, deeply nested function calls would be an anti-pattern.

    * You mentioned a shopping list; I quite like code to /look/ like a
    shopping list. I don't mind reading 10 or 20 lines of clear code where I
    can follow every step; it's better than reading one long line that I
    can't grok. Where would I even stick a debug print if I wanted to know
    an intermediate value.


    A key point about a shopping list is that it seldom requires much order.
    That's also the case in declarative languages - there is no order
    except when it is unavoidable. (Even where it appears to require order,
    lazy evaluation may mean a function can start executing before its
    arguments are known.) Imperative languages, by their fundamental
    nature, order everything.


    You realise you need to bake two cakes instead of one?  Double
    everything on the list.  Now "map" is universal.

    As is mutable data. Or do you mean create a new list?

    Logically, you have a new list. You might not bother writing it down
    (that's lazy evaluation again :-) ), but you are not destroying your old recipe.

    What exactly do
    you double anyway? A typical ingredients list looks like this:

       ["eggs":1, "sugar":40, "SR flour":40]

    Doubling is not that obvious (and the number for eggs has to be whole;
    other units are grams).

    I was giving an example of programming paradigms, not a baking course!


    You want to go shopping?  Make a list.  Get the first thing on the list, then get the rest of the list.  Stop when the list is empty.

    What does that actually look like in Haskell? I can tell you that
    shopping as it's done in real-life is not done recursively.

    In imperative it's:

        proc do_shopping(list) =
            for item in list do
                buy(item)
            end
        end


    In Haskell :

    shop [] = []
    shop (x : xs) = buy x ++ shop xs

    If you've nothing to get, you are done. Otherwise you buy the first
    thing off the list and then buy everything else. (Or you buy everything
    else, then the first thing - or send a kid to buy the first thing while
    you get everything else. Unlike imperative code, the order of
    evaluation is not fixed.)

    This is actually very much the way shopping is done - it is as close a
    model as your loop. They both model the same process.

    (Recursion is more general than looping. Consider sorting a pack of
    cards manually. You might use a mergesort, which is very neatly
    expressed recursively.)
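
    A minimal Haskell mergesort sketch, since the recursive description maps
    almost word for word onto code - split the pack, sort each half, then
    merge the sorted halves:

        msort :: Ord a => [a] -> [a]
        msort []  = []
        msort [x] = [x]
        msort xs  = merge (msort left) (msort right)
          where
            (left, right) = splitAt (length xs `div` 2) xs
            merge [] bs = bs
            merge as [] = as
            merge (a : as) (b : bs)
              | a <= b    = a : merge as (b : bs)
              | otherwise = b : merge (a : as) bs

        main :: IO ()
        main = print (msort [5, 3, 8, 1, 9, 2])   -- [1,2,3,5,8,9]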


    Or one-liners:

       for item in list do buy(item) end
       apply(buy, list)            # not map as there is no result


    In Haskell :

    [ buy x | x <- xs]

    And yes, there /is/ a result - when you buy something, I would hope you
    have the thing as a result of the purchase! In your imperative code,
    this might be hidden by putting the purchases in a global variable bag,
    but the "result" is still there.

    In imperative-recursive, if you have to, is:

        proc do_shopping(list) =
            if list then
                buy(head(list))
            do_shopping(tail(list))
            fi
        end

    (I agree parametric pattern-matching to get the two parts of the list
    would be better. But imperative is simpler here anyway.)

    Not really, no. It's possible that the syntax for the Haskell is
    unfamiliar to you, making it harder to understand, but I think you'll
    see what it says when you think about it. (And obviously for
    non-programming shoppers the algorithm will be in prose, whether it is imperative or functional style.)


    You have a things-to-do list?  Do the first one, then move on to the
    rest of the list.  That's recursion and lazy evaluation.

    But the to-do-list already exists? If there are 1M things to do, the list
    will have 1M items? That's not a great example of lazy evaluation (the
    primes is better).

    The "lazy evaluation" aspect is that you don't have to evaluate
    arguments until they are needed. You can call "do_everything" on the "things-to-do" list even when one entry on the list is "find out what
    the kids want for Christmas" and another is "buy the Christmas
    presents". One of the arguments could even be "pass this list on to
    someone else to finish".


    You might as well say that imperative code is lazily evaluated, as it consists of a sequence of steps done one at a time, while, ironically,
    FP code isn't as it is one big expression.

    No, you can't say that.

    Imperative programming has an ordering (though sometimes certain aspects
    are unspecified by the language). In order to call a function, you
    first evaluate all the arguments, then you call the function with those arguments.

    If you have lazy evaluation (which is not required for functional
    programming, but is common), arguments are not evaluated unless and
    until they are needed. This requires being able to treat functions as
    data and data as functions - instead of passing a value as an argument,
    you pass a function that will generate that value when needed.
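
    A tiny Haskell sketch of that: the second argument below is passed as an
    unevaluated thunk, and since it is never needed, the error inside it
    never happens (the names are made up):

        -- Returns its first argument; never looks at the second.
        firstOf :: a -> a -> a
        firstOf x _ = x

        main :: IO ()
        main = print (firstOf 42 (error "never evaluated"))
        -- prints 42; with strict argument evaluation this call would fail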

    You can do lazy evaluation in more sophisticated non-functional
    languages too. In C++, it is a popular way to implement some kinds of
    heavy mathematics such as matrix libraries. When you write "A = B +
    C;", the code does not create a new matrix that is the result of adding matrices A and B. Rather, it creates a proxy object that knows how to
    get the result of the addition. Once you have written all your
    expressions and calculations, and actually ask for the results, all
    these proxies are evaluated. But now a lot of the memory management for creating and destroying the matrices can be avoided and more done
    in-place, or multiple additions can be combined to more efficient
    calculations.



    OK, I don't have a particular issue with FP features in small amounts.
    As I said and showed, they can come up in imperative code too.

    I have two broad objections:

    * All the FP features I don't understand and don't find intuitive,
    mainly to do with functions (higher order, closures, currying etc).


    OK. These are not the easiest of concepts. It takes effort to learn to understand and appreciate them, and you can program quite happily in
    other languages without ever seeing them. I'm not asking you to
    understand them, or use them, or implement them in your own language -
    all I am asking is that you accept that some other programmers do
    understand them, and do find them useful.

    Pattern-matching, map, reduce etc I can deal with; the concepts are
    easy, and they can be trivially expressed in non-FP terms.

    * Basing an entire language around FP features.


    It involves thinking about things in a significantly different way. I
    do little pure functional programming myself. (I think the last
    functional programming I did was FPGA hardware design, a good number of
    years ago. Many high-level hardware design languages are functional in nature.)

    I am a big fan of combining features and use the best tools for the task
    at hand, rather than trying to be "pure" about programming.

  • From Bart@21:1/5 to Bart on Mon Dec 12 17:20:37 2022
    On 12/12/2022 14:36, Bart wrote:
    On 12/12/2022 11:56, David Brown wrote:

    I have two broad objections:

    * All the FP features I don't understand and don't find intuitive,
    mainly to do with functions (higher order, closures, currying etc).

    Pattern-matching, map, reduce etc I can deal with; the concepts are
    easy, and they can be trivially expressed in non-FP terms.

    * Basing an entire language around FP features.

    I've been looking at examples on rosettacode.org. Most languages there
    are conventional (my style) other than all the weird ones plus FP.

    But one task caught my eye:

    https://rosettacode.org/wiki/Determine_if_a_string_has_all_unique_characters#Haskell

    as all three Haskell versions seem to make a meal of it. I had been
    looking for a short cryptic Haskell example; this was a long cryptic one!

    One mystery is how it gets the output (of first version) properly lined
    up, as I can't see anything relevant in the code.

    Half my version below is all the fiddly formatting; this is where I'd
    consider this a weak spot in my language and think about what could
    improve it.

    Other languages (eg. OCaml) keep the output minimal.


    -------------------------

    proc main =
        teststrings:=(
            "",
            ".",
            "abcABC",
            "XYZ ZYX",
            "1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ")

        println "+--------------------------------------+------+----------+--------+---+---------+"
        println "|string                                |length|all unique|1st diff|hex|positions|"
        println "+--------------------------------------+------+----------+--------+---+---------+"

        for s in teststrings do
            (length, unique, diff, pos):=checkstring(s)

            if diff then
                fprintln "|#|#|# |#|#|#|",
                    s:"38jl", length:"6jl",
                    (unique|"yes"|"no "),
                    diff:"jl8", asc(diff):"Hjl3", sprint(pos[1], pos[2]):"9jl"
            else
                fprintln "|#|#|yes | | | |",
                    s:"38jl", length:"6jl"
            fi
        od

        println "+--------------------------------------+------+----------+--------+---+---------+"
    end

    func checkstring(s)=
        for i, c in s do
            n:=c in rightstr(s,-i)
            if n then !not unique
                return (s.len, 0, c, (i, n+i))
            fi
        od

        (s.len, 1, "", ())
    end

  • From David Brown@21:1/5 to Bart on Tue Dec 13 16:40:21 2022
    On 12/12/2022 18:20, Bart wrote:
    On 12/12/2022 14:36, Bart wrote:
    On 12/12/2022 11:56, David Brown wrote:

    I have two broad objections:

    * All the FP features I don't understand and don't find intuitive,
    mainly to do with functions (higher order, closures, currying etc).

    Pattern-matching, map, reduce etc I can deal with; the concepts are
    easy, and they can be trivially expressed in non-FP terms.

    * Basing an entire language around FP features.

    I've been looking at examples on rosettacode.org. Most languages there
    are conventional (my style) other than all the weird ones plus FP.

    But one task caught my eye:

    https://rosettacode.org/wiki/Determine_if_a_string_has_all_unique_characters#Haskell


    as all three Haskell versions seem to make a meal of it. I had been
    looking for a short cryptic Haskell example; this was a long cryptic one!

    One mystery is how it gets the output (of first version) properly lined
    up, as I can't see anything relevant in the code.

    Half my version below is all the fiddly formatting; this is where I'd consider this a weak spot in my language and think about what could
    improve it.

    Other languages (eg. OCaml) keep the output minimal.


    The same applies to the Haskell code shown - checking for uniqueness is
    only about 10 lines of code - it's the table formatting that makes up
    the bulk of it. (I am not nearly fluent enough in Haskell or its
    standard libraries to write such code.)
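
    Just the check on its own is tiny, though - something like this, I
    believe (an untested sketch; the table formatting is all the rest):

        import Data.List (nub)

        -- A string has all unique characters iff removing duplicates changes nothing.
        allUnique :: String -> Bool
        allUnique s = nub s == s

        main :: IO ()
        main = mapM_ (print . allUnique)
            ["", ".", "abcABC", "XYZ ZYX", "1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ"]
        -- True, True, True, False, False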

  • From Bart@21:1/5 to David Brown on Thu Dec 15 11:26:39 2022
    On 08/12/2022 09:08, David Brown wrote:
    On 07/12/2022 19:58, Bart wrote:

    My point was that your language (the low-level compiled one) and C are similar styles - they are at a similar level, and are both procedural imperative structured languages.

    (None of this suggests you "copied" C.  You simply have a roughly
    similar approach to solving the same kinds of tasks - you probably had experience with much the same programming languages as the C designers,
    and similar assembly programming experience before making your languages
    at the beginning.)

    Inspiration for the syntax of my first language were from these:

    Algol68, Pascal, Ada, Fortran

    I'd only actually used Pascal, Fortran and Algol60; others I'd seen
    examples on paper.

    Influences for the semantics, and operating and memory model, were from
    these, in no particular order:

    Pascal, Fortran, PDP10 ASM, Z80 ASM, Babbage (HLA), Misc ASM (incl.
    6800).

    But also from actual hands-on hardware development, involving Z80 and
    x86 family up to 80188, and including graphics and video. (I don't know
    if the C designers ran their language on a machine they'd had to
    assemble with a soldering iron.)

    The PDP10 was most useful in deciding what not to bother with (ie. word-addressed memory, packed strings etc). The 6800 was the only
    big-endian processor I ever used, which had a brief positive influence,
    but later I was happy to settle with little-endian.

    AFAICR, C had no influence whatsoever. (Which is why I find it irksome
    that all these lower-level aspects of hardware and software are now
    considered the exclusive domain of C.)

    However, I did encounter C much later on, and it did affect my own
    language in some small matters when I adopted C's approach:

                                Original M    C

    Function calls, 0 args      F             F()
    Create function pointer     ^F            F as well as &F
    Address-of symbol           ^X            &F
    Hex literals                0ABCH         0xABC
    Pointer offset              P + i*4       P + i (byte vs object offset)
    Matching T* and Q*          Any T, Q      T = Q only, or one is void

    Another change I made recently is in dropping the need for explicit
    dereference operators in these contexts (originally I was against it for
    losing transparency, but that is also the case with pass-by-reference params):

    Original M    Current M    C

    A^[i]         A[i]         (*A)[i], but usually A[i]
    P^.m          P.m          (*P).m or P->m
    F^(x)         F(x)         (*F)(x) or F(x)

    (Note: C makes little use of true array pointers; it likes to use T*
    types, not pointer-to-array types, which means my example would be
    written as A[i] anyway. That however is very unsafe.)
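
    (For reference, a small C sketch of the right-hand column above; all
    three explicit forms are interchangeable with the shorthand ones:)

        #include <stdio.h>

        struct S { int m; };
        static int add1(int x) { return x + 1; }

        int main(void) {
            int a[3] = {1, 2, 3};
            int (*A)[3] = &a;            /* true pointer-to-array, rarely used in C */
            struct S s = {42};
            struct S *P = &s;
            int (*F)(int) = add1;

            printf("%d\n", (*A)[1]);     /* explicit deref of array pointer */
            printf("%d\n", (*P).m);      /* same as P->m                    */
            printf("%d\n", (*F)(9));     /* same as F(9)                    */
            return 0;
        }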

    Although the end result is the same (cleaner code), I don't achieve it
    by by-passing the type system: internally the language still needs and
    uses `P^.m`; it is purely a convenience.

    ^ can still be used, but not using ^ makes code easier to move to/from
    my Q language.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From luserdroog@21:1/5 to David Brown on Fri Dec 16 20:07:28 2022
    On Thursday, December 8, 2022 at 3:08:39 AM UTC-6, David Brown wrote:
    Programming in Eiffel, Haskell, APL, Forth or Occam is /completely/
    different - you approach your coding in an entirely different way, and
    it makes no sense to think about translating from one of these to C (or
    to each other).

    It can be a fun challenge. I've tried implementing APL and Haskell concepts
    in PostScript and C. Going from PostScript to C can be fairly easy if you completely disregard the goal of making it idiomatic. I have not looked at Eiffel or Occam in any depth. But I suppose "fun" is the key here. If you go into it thinking it's impossible, or impractically difficult, then you're right.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sat Dec 17 06:07:38 2022
    Bart <bc@freeuk.com> wrote:
    On 11/12/2022 16:50, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:


    (I don't think you've made it clear whether the other language(s) you've
    referred to are some mainstream ones, or one(s) that you have devised. I
    will assume the latter and tone down my remarks. A little.)

    Not my invention (I contributed a bit to some). But in most cases I
    would not call them mainstream. Well, if your criterion for a
    mainstream language is having more than 10 users, then there is a good
    chance that all would qualify.

    My scheme also allows circular and mutual imports with no restrictions.

    You probably mean "with no artificial restrictions". There is a
    fundamental restriction that everything must resolve in a finite
    number of steps.

    Huh? That doesn't come up. Anything which is recursively defined (eg.
    const a=b, b=a) is detected, but that is due to out-of-order
    declarations, not circular modules.

    There could be "interesting" dependencies; naive handling would
    treat them as a cycle, but they are resolvable.

    There needs to be some structure, some organisation.

    Exactly; private import is a tool for better organisation.

    Sorry, all I can see is extra work; it was already a hassle having to
    write `import B` at the top of every module that used B, when it was then
    visible to all functions, because of having to manage that list of imports /per module/. Now I have to do that micro-managing /per function/?

    I do not "have to". This is an option, and its use is not frequent.

    (Presumably this language has block scopes: can this import also be
    private to a nested block within each function?)

    Actually, ATM there is no block scope; things propagate from normal
    blocks to the outside. There is a limited number of things that prevent
    propagation: one is functions, and there are a few others, but they are
    somewhat unusual.

    With module-wide imports, it is easy to draw a diagram of interdependent modules; with function-wide imports, that is not so easy, which is why I think there will be less structure.

    You can draw whatever diagrams you want. Scoped imports allow expressing
    more information in the source. For me that is more structure.

    I can compile each program separately. Granularity moves from module to
    program. That's better. Otherwise why not compile individual functions?

    In the past there was support for compiling individual functions, and
    I would not exclude the possibility that it will come back. But ATM
    I prefer to keep things simple, so this functionality was removed.

    With whole program compilation, especially used in the context of a
    resident tool that incorporates a compiler, with resident symbol tables, resident source and running code in-memory (actually, just like my very
    first compilers), lots of possibilities open up, of which I've hardly scratched the surface.

    Including compiling/recompiling a function at a time.

    Although part-recompiling during a pause in a running program, then
    resuming, would still be very tricky. That would need debugging
    features, and I would consider, in that case, running via an interpreter.

    Well, you need relocatable code and indirection via pointers (hidden
    from the user, but part of the runtime system). The main trouble is
    possible changes to the layout of data structures -- if you change the
    layout, then all code depending on it must be recompiled. In the normal
    case the layout is only visible inside a module, so recompiling the
    module is enough. ATM it is up to the user to do what is needed in more
    tricky cases.

    My point: a system that does all this would need all the relevant bits
    in memory, and may involve all sorts of complex components.

    Yes, "relevant bits". But the "relevant bits" are usually much less
    than the whole program. And take into account that intermediate
    data structures during compilation are much bigger than the object
    code.

    But a whole-program compiler that runs apps in-memory already does half the work.

    I used to use independent compilation myself. I moved on to
    whole-program compilation because it was better. But all the issues
    involved with interfacing at the boundaries between modules don't
    completely disappear, they move to the boundaries between programs
    instead; that is, between libraries.

    I consider programs to be different from libraries. A program may
    use several libraries; in the degenerate case a library may be just a
    single module. Compiling the whole program has clear troubles with
    large programs.

    My definition of 'program', on Windows, is a single EXE or DLL file.

    I expect larger applications to consist of a collection of EXE and DLL
    files. My own will have one EXE and zero or more DLLs, but I would also
    make extensive use of scripting modules, that have different rules.

    One way to allow incremental recompilation is to compile each module
    to a separate shared library (roughly corresponding to a Windows DLL).
    At least on Linux this has some overhead: each shared library normally
    needs two or three separate paged areas and some system data
    structures, so an overhead of the order of 10 kB per shared library.
    But on modern Linux it is workable for low thousands of shared
    libraries (I am not sure if it would scale to tens of thousands).

    In a running program you may unload the current version of the shared
    library resulting from a module's compilation and load a new one.

    There are different implementations which do not use the shared-library
    machinery, and which in principle should be much more efficient. But
    with 1200 modules, shared-library handling seems to cope fine.
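
    (A minimal C sketch of that unload/reload step using the POSIX dlopen
    machinery; the module name "modA.so" and the symbol "entry" are made
    up for illustration. Link with -ldl on older glibc:)

        #include <dlfcn.h>
        #include <stdio.h>

        int main(void) {
            /* load the current build of the module */
            void *h = dlopen("./modA.so", RTLD_NOW);
            if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }

            int (*entry)(int) = (int (*)(int))dlsym(h, "entry");
            if (entry) printf("old: %d\n", entry(1));

            /* unload it, then load the freshly recompiled version */
            dlclose(h);
            h = dlopen("./modA.so", RTLD_NOW);
            if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }

            entry = (int (*)(int))dlsym(h, "entry");
            if (entry) printf("new: %d\n", entry(1));

            dlclose(h);
            return 0;
        }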

    The latter being an automatically created amalgamation (produced with
    `mm -ma app`). Build using `mm app`.

    I could provide a single file, a shell archive containing a build script
    and sources, but an important part of providing sources is that people
    can read them, understand them and modify them.

    I don't agree. On Linux you do it with sources because it doesn't have a reliable binary format like Windows that will work on any machine. If
    there are binaries, they might be limited to a particular Linux
    distribution.

    You do not get it. I create binaries that work on 10-year-old Linux
    and new ones, and on distributions that I never tried. Of course,
    I mean that binaries are for a specific architecture, separate
    for i386 Linux and x86_64 Linux (that covers PCs; I would have to
    provide more if I wanted to support more architectures).

    Concerning binary formats, there were two: Linux started with a.out
    and switched to ELF in the second half of the nineties. IIUC it is still
    possible to run old a.out binaries, but support is/was an optional
    addon and most users elected not to install it. Anyway, all Linux
    systems installed in this century are supposed to be able to run
    ELF binaries. There was one change in ELF which means that, by
    default, currently made programs will not run on old systems
    (where old means something like 10 years old), but it is possible
    to create a binary that is compatible with both old and new systems.
    So maybe compatibility of binary formats is not as good as
    in Windows, but in practice it is not a problem.

    But the point is managing dependencies. When you get Linux you
    get a lot of libraries, and normal "good behaviour" on Linux is to
    use the system-provided libraries. Now, libraries change with time,
    and a given library may be replaced by an incompatible one. Also,
    commercial developers like to hardcode various system details
    like the location of system files. And there are utilities and
    language interpreters that change too. One way to deal with
    dependencies is to bundle everything with the binary. AFAIK this is
    the Windows way: Microsoft provides a bunch of "redistributables"
    and vendors are supposed to ship the version that they use, plus
    everything which does not come from Microsoft.

    Linux systems are more varied than Windows. But this is part
    of the point that I made: having sources, people modify them and
    create a lot of similar but (sometimes subtly) different
    variants. Granted, "ordinary users" just want to use the system.
    But almost any developer can make useful changes to programs,
    and such changes propagate to other people and Linux distributions.
    That would be impossible without sources.

    You also ignore educational aspect: some folks fetch sources to
    learn how things are done.

    (Binaries on Linux have always been a mystery to me, starting with the
    fact that they don't have a convenient extension like .exe to even tell
    what it is.)

    Every file in Linux has associated permission bits. In the widest
    sense "executable" means that you have permission to
    execute it. Of course, if you try to execute it the Linux kernel
    must figure out what to do. Here the important part is so-called
    magic numbers: normal executables are supposed to have a few
    bytes at the start which identify the format, like a.out versus ELF,
    the machine architecture, etc. Actually, this is configurable, and
    users can add rules which say what to do given a specific magic
    number. And there is rather general support for interpreted
    executables.
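
    (The magic-number check is a few lines of C; ELF files start with
    0x7F 'E' 'L' 'F', Windows PE/EXE files start with 'M' 'Z':)

        #include <stdio.h>
        #include <string.h>

        int main(int argc, char **argv) {
            if (argc < 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }

            FILE *f = fopen(argv[1], "rb");
            if (!f) { perror(argv[1]); return 1; }

            unsigned char magic[4] = {0};
            fread(magic, 1, 4, f);     /* only a sketch; return value unchecked */
            fclose(f);

            if (memcmp(magic, "\x7f" "ELF", 4) == 0)
                puts("ELF executable");
            else if (magic[0] == 'M' && magic[1] == 'Z')
                puts("MZ/PE executable (Windows)");
            else
                puts("something else");
            return 0;
        }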

    Magic numbers are not much different than the MZ or NE markers in
    Windows binaries. Concerning executables, IIUC in Windows
    there are several executable extensions, and supposedly
    your EXE can have a .com or .bat extension.

    Concerning not having an extension: you can add one if you want;
    moderately popular choices are .exe or .elf. But for using a normal
    Linux executable it should not matter if it is a shell script, an
    interpreted Python file or machine code. So the extension should
    not "give away" the nature of the executable. And having no extension
    means that users are spared needless typing or (some of) the surprises
    of running a different program than they wanted (PATH still
    allows confusion).

    GNU folks have a nice definition
    of source: the "preferred form for making modifications". I would guess
    that 'app.ma' is _not_ your preferred form for making modifications,
    so it is not really true source.

    No, but deriving the true sources from app.ma is trivial, since it is basically a concatenation of the relevant files.

    No less trivial than running 'tar' (which is a standard component
    on Linux).

    And to build from "source" I need
    source first. And I provide _true_ sources to my users.

    If you were on Linux or didn't want to use my compiler, then it's even
    simpler; I would provide exactly one file:

    app.c Generated C source code (via `mc -c app`)

    Here, you need a C compiler only. On Windows, you can build it using
    `gcc app.c -oapp.exe`. Linux needs more options, listed at the top of
    the file.

    Sorry, a generated file is _not_ a source. If I were to modify the C file

    This is not for modifying. 99% of the time when I want to build an open
    source C project, it is in order to produce a running binary, not to spend
    hours trying to get it to build. These are the obstacles I have faced:

    * Struggling with formats like .gz2 that require multiple steps on Windows

    'tar' knows about popular compression formats and can do this in one
    step, also on Windows.

    * Ending up with myriad files scattered across myriad nested directories

    Well, different folks have different ideas about good structure.
    Some think that splitting things into a lot of directories is
    good structure (I try to limit the number of directories and files,
    but probably use many more than you would like). This is not
    a big problem with the right tools.

    * Needing to run './configure' first (this will not work on Windows...)

    I saw one case where a guy tried to run './configure' on Windows NT
    and Windows NT kept crashing. It made a little progress and then
    crashed, so the guy restarted it, hoping that eventually it would
    finish (after a week or two he gave up and used Linux). But
    usually './configure' is not that bad. It may take a lot of
    time; IME a './configure' that runs in seconds on Linux needed
    several minutes on Windows. And of course you need to install
    the essential dependencies; a good program will tell you what you need
    to install first, before running configure. But you need to
    understand what they mean...

    * Finding a 'make' /program/ (my gcc has a program called
    mingw32-make.exe; is that the one?)

    Probably. The normal advice for Windows folks is to install a thing
    called msys (IIUC it is msys2 now) which contains several tools
    including 'make'. You are likely to get it as part of a bigger
    bundle; I am not up to date enough to tell you whether this bundle
    will be called 'gcc' or something else.

    * Getting 'make' to work. Usually it fails partway and makefiles can be
    so complex that I have no way of figuring a way out
    * Or, trying to compile manually, struggling with files which are all
    over the place and imparting that info to a compiler.

    I don't have any interest in this; I just want the binary!

    Well, I provide Linux binaries, but only sources for Windows
    users. One reason is that I have only Linux on my personal
    machine, so to deal with Windows I need to lease a machine.
    A different reason is that I am not paid for programming; I do
    this because I like to program and, to some degree, to build a
    community. But some potential members of the community would
    like to benefit while being unwilling to spend even a little effort.
    Of course, in a big community there may be a lot of "free
    riders" who benefit without contributing anything, with no
    bad effect because other folks will do the needed work. But
    here I am dealing with a small community. I did port
    to Windows to make sure that it actually works and there
    are no serious problems. But I leave to others the creation
    of binaries and the reporting of possible build problems. If
    nobody is willing to do this, then from my point of
    view Windows has no users and is not worth supporting.

    So, with my own programs, if I can't provide a binary (eg. they are not trusted), then one step back from a single binary file is a single amalgamated source file.

    I first did this in 2014, as a reaction to the difficulties I kept
    facing: I wanted any of my applications to be as easy to build as hello.c.

    If someone wants the original, discrete sources, then sure they can have
    a ZIP file, which generally will have files that unpack into a single directory. But it's on request.

    Normally I only deal with programs where I have full sources or I am
    sure I can obtain them. My Linux is installed from binaries, but
    sources are available on public repositories and I know where to
    find them (and in cases where I needed sources I fetched them).

    The difference is that what I provide is genuinely simple: one bare
    compiler, one actual source file.

    Sorry, for me "one file" is not a problem; there is 'tar' (the de facto standard for distributing source code).

    Yeah, I explained how well that works above. So the last Rust
    implementation was a single binary download (great!), but it installed
    itself as 56,000 discrete files across I don't know how many thousands of directories (not so great). And it didn't work (it requires additional tools).

    I did not try Rust. As I wrote, I prefer a moderate number of files,
    but many files are not a big problem, unless your system is deficient.
    To be clearer: if you have too big an allocation unit, then sources
    which should take 20M may take, say, 30M of disk space (that happened
    when I unpacked the sources of 386BSD on DOS). And you need tools
    which can efficiently handle many files.

    Being able to ZIP or TAR a sprawling set of files into a giant binary
    makes it marginally easier to transmit or download, but it doesn't
    really address complexity.

    In my book a single blob of 20M is more problematic than 10000 files
    of 2kB each. At a deeper level the complexity is the same, but the blob
    lacks the useful structure given by division into files and subdirectories.

    And there is possibly quite a large dependency, namely Windows.

    Yeah, my binaries run on Windows. Aside from requiring x64 and using
    Win64 ABI, they use one external library MSVCRT.DLL,

    MSVCRT.DLL _should_ not cause trouble, as this is a C library
    and the functions in it are in principle available on other systems.
    To make _should_ a reality, some care/effort may be needed.

    which itself uses
    Windows.

    For programs to run on both Windows and Linux, that depends on the
    libraries used. For 'M' programs, one module has to be chosen from the
    Windows and Linux versions. To run on Linux, I have to do this:

    mc -c -linux app.m # On Windows, makes app.c, using
    # the Linux-specific module

    gcc app.c -oapp -lm etc # On Linux
    ./app

    but M makes little use of WinAPI. With my interpreter, the process is as follows:

    c:\qx>mc -c -linux qq # On Windows
    M6 Compiling qq.m---------- to qq.c

    Copy qq.c to c:\c then under WSL:

    root@DESKTOP-11:/mnt/c/c# gcc qq.c -oqq -fno-builtin -lm -ldl

    Now I can run scripts under Linux:

    root@DESKTOP-11:/mnt/c/c# ./qq -nosys hello
    Hello, World!

    However, notice the '-nosys' option; this is because qq automatically incorporates a library suite that includes a GUI library based on Win32. Without it, qq would complain about not finding user32.dll etc.

    I would need to dig up an old set of libraries or create new
    Linux-specific ones. A bit of extra work. But see how the entire app is contained within that qq.c file.


    It is not clear how much of your code _usefully_ runs in a non-Windows
    environment.

    OK, let's try my C compiler. Here I've done `mc -c -linux cc`, copied
    cc.q, and compiled under WSL as bcc:

    root@DESKTOP-11:/mnt/c/c# ./bcc -s hello.c
    Compiling hello.c to hello.asm

    root@DESKTOP-11:/mnt/c/c# ./bcc -e hello.c
    Preprocessing hello.c to hello.i

    root@DESKTOP-11:/mnt/c/c# ./bcc -c hello.c
    Compiling hello.c to hello.obj

    root@DESKTOP-11:/mnt/c/c# ./bcc -exe hello.c
    Compiling hello.c to hello.exe
    msvcrt
    msvcrt.dll
    SS code gen error: Can't load search lib

    So, most things actually work; only creating an EXE doesn't work, because
    it needs access to msvcrt.dll. But even if it did, it would work as a cross-compiler, as its code generator is for the Win64 ABI.

    Yes. It was particularly funny when you had the compiler running on a
    Raspberry Pi, but producing Intel code...

    But I think this shows useful stuff can be done. A more interesting test (which used to work, but it's too much effort right now), is to get my M compiler working on Linux (the 'mc' version that targets C), and use
    that to build qq, bcc etc from original sources on Linux.

    In all, my Windows stuff generally works on Linux. Linux stuff generally doesn't work on Windows, in terms of building from source.

    Our experiences differ. There were times when I had to work on a
    Windows machine, and the problem was that Windows does not come with
    tools that I consider essential. But after installing the essential
    tools from binaries, the rest could be installed from sources.
    To be clear, I mean mostly non-GUI stuff.

    C is just so primitive when it comes to this stuff. I'm sure it largely
    works by luck.

    C is at low level, that is clear.

    The way it does modules is crude. So was my scheme in the 1980s, but it
    was still one step up from C. My 2022 scheme is miles above C now.

    Concerning "miles above": using C one can create shared libraries.
    Some shared libraries may be system-provided, some may be private.
    Within the C ecosystem, once you have the corresponding header files you
    can use them as "modules". And they are usable from other languages.
    AFAIK no other module system can match _this_. So, primitive,
    but it works and allows you to do things which would be impossible
    otherwise.
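
    (A minimal sketch of that arrangement in C; the names counter.h,
    counter.c and counter_next are made up for illustration:)

        /* counter.h -- the interface ("module header") */
        #ifndef COUNTER_H
        #define COUNTER_H
        int counter_next(void);
        #endif

        /* counter.c -- the implementation, built as a shared library:
           gcc -shared -fPIC counter.c -o libcounter.so               */
        #include "counter.h"
        static int n = 0;
        int counter_next(void) { return ++n; }

        /* main.c -- a client, in C or any language with a C FFI:
           gcc main.c -L. -lcounter -o main
           run with:  LD_LIBRARY_PATH=. ./main                        */
        #include <stdio.h>
        #include "counter.h"
        int main(void) { printf("%d %d\n", counter_next(), counter_next()); return 0; }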

    The
    underlying language can still be low level, but you can at least fix
    some aspects.

    I think a better module scheme could be retrofitted to C, but I'm not
    going to do it.

    Adding a "better" module system to C is trivial. Adding it in a
    way that preserves the good properties of the current system is tricky.

    A good programming environment should help. C as a language is not
    helpful; one may have a fully compliant and rather unhelpful compiler.
    But real C compilers tend to be as helpful as they can be within the
    limits of the C language. While C still limits what they can do,
    there is quite a lot of difference between a current popular
    compiler and a bare-bones legal compiler. And there are extra
    tools, and here C support is hard to beat.

    I don't agree with spending 1000 times more effort in devising complex
    tools compared with just fixing the language.

    It is cheaper to have 1000 people doing tools than 100000 people
    fixing their programs. If you are a single person, or a small
    group working on both, then adapting the program to the tools may be
    reasonable. You look at what gives more benefit: implementing a
    feature in the compiler, or writing code without some features in
    the compiler. If you choose wisely you can have a simple compiler
    with the features that you need. But this does not scale well.

    So what are the new circles of ideas? All that crap in Rust that makes
    coding a nightmare, and makes building programs dead slow? All those new
    functional languages with esoteric type systems? 6GB IDEs (that used to
    take a minute and a half to load on my old PC)? No thanks.

    The borrow checker in Rust looks like a good idea. There is a good chance
    that the _idea_ will be adopted by several languages in the near future.

    OK. I've heard that that makes coding in Rust harder, and also that it makes compilation slower. Not very enticing features!

    Initial coding is easiest when the compiler reports no errors. But
    if a program is widely used, then errors will cause trouble sooner
    or later. A borrow checker means that programmers must deal
    with problems earlier, which may look harder, but is likely to
    reduce the total work over a longer time.

    Not so new ideas are:
    - removing limitations, that is making sure language constructs
    work as generally as possible (which allows getting rid of many
    special constructs from older languages)
    - nominal, problem-dependent types. That is, types should reflect
    the problem domain. In particular, domains which need types like
    'u32' are somewhat specific; in normal domains the fundamental types
    are different
    - functions as values/parameters. In particular, functions have
    types and can be members of data structures
    - "full rights" for user-defined types, which means whatever
    syntax/special constructs work on built-in types should
    also work for user-defined types
    - function overloading
    - type reconstruction
    - garbage collection
    - exception handling
    - classes/objects

    Are these what your language supports? (If you have your own.)

    I doubt that all are fully implemented in a single language; for example,
    having both function overloading and type reconstruction is a
    balancing act. The language with modules above has quite limited
    support for exceptions: there is a (scoped) "catch all" construct
    to catch errors, but no real exceptions. Here the trouble is: what
    should be the type of exceptions? The language takes type correctness
    very seriously and ATM there is no agreement about a good type.
    There are no real classes and objects. Many things done via
    classes/objects can be done using module features, but not
    all. And here the trouble is to implement nice object semantics
    while keeping the good runtime efficiency of the current system.

    I would say that in other aspects this language is doing
    quite well. "Full rights" for user-defined types is obtained by
    implementing built-in types almost as if they were user-defined types.

    Here I contributed modest changes to the language and larger ones
    to the implementation. I must admit the trouble here is that
    currently several features work as used in existing code, but
    not in general. And you would hate the compiler speed: about
    200 lines per second (but with large variation depending
    on the actual constructs). Note: I would like to improve compiler
    speed; OTOH it is reasonably workable. Namely, an average module
    has about 200 lines and can be compiled in about 1s. And there
    is another compiler for interactive use which compiles somewhat
    faster and allows testing of small pieces of code. So,
    while the compiler is much slower than gcc, the actual turnaround
    time during development is reasonably good.

    I can't say these have ever troubled me. My scripting language has
    garbage collection, and experimental features for exceptions and playing
    with OOP, and one or two taken from functional languages.

    Being dynamic, it has generics built-in. But it deliberately keeps the type system at a simple, practical level (numbers, strings, lists, that sort
    of thing), because the aim is easy coding.

    Well, some people do not think in terms of types. For others, types
    are a significant help; they allow easy coding because when I write
    '*' the compiler chooses the right one based on types (and similarly for
    many other operations). And thanks to types the compiler catches
    a lot of errors quite early. And the resulting code can be quite
    efficient; the main limitation here is the quality of the code generator,
    which is significantly weaker than gcc's ('significantly' meaning
    about half the speed of gcc-compiled C on moderately low-level
    tasks).

    If you want hard, then
    Rust, Ada, Haskell etc are that way -->!

    * Clean, uncluttered brace-free syntax
    Does this count as brace-free?

    for i in 1..10 repeat (print i; s := s + i)

    Not if you just substitute brackets for braces. Brackets (ie "()") are
    OK within one line, otherwise programs look too Lispy.

    There are many possible alternatives: '[' and ']' or '<<' and '>>' :)

    * Case-insensitive
    * 1-based
    * Line-oriented (no semicolons)
    * Print/read as statements

    A lot of folks consider the above misfeatures/bugs.

    I know.

    Concerning
    'line-oriented' and 'intuitive': can you guess which changes to the
    following statement in 'line-oriented' syntax are legal and preserve its meaning?

    nm = x or nm = 'log or nm = 'exp or nm = '%power or
    nm = 'nthRoot or
    nm = 'cosh or nm = 'coth or nm = 'sinh or nm = 'tanh or
    nm = 'sech or nm = 'csch or
    nm = 'acosh or nm = 'acoth or nm = 'asinh or nm = 'atanh or
    nm = 'asech or nm = 'acsch or
    nm = 'Ei or nm = 'erf or nm = 'erfi or nm = 'li or
    nm = 'Gamma or nm = 'digamma or nm = 'dilog or
    nm = '%root_sum =>
    "iterate"

    As a hint let me say that '=' is comparison. And this is a single
    statement; small changes to whitespace will change the parse and lead
    to wrong code or a syntax/type error. BTW: you need a real newsreader
    to see it; Google and the like will change it so it no longer works.

    Thunderbird screws it up as well, unless it is meant to have a ragged
    left edge. But not sure what your point is.

    Yes, there is a ragged left edge. Well, with "line oriented" syntax
    you will get constructs that span more than one line, and you
    need some rules for how they work. The point is that once you get
    into corner cases the rules are unlikely to be intuitive. I mean,
    they will be intuitive once you learn and internalize them, but
    this is not much different from any other syntax.

    Non-line-oriented syntax (like C, like JSON) is better for machine-readable
    code, which can also be transmitted with less risk of garbling. But when
    90% of semicolons in C-style languages coincide with
    end-of-line, you need to start questioning the point of them.

    Well, another sample:

    sub1!(pol, p) ==
        if #pol = 0 then
            pol := new(1, 0)$U32Vector
        if pol(0) = 0 then
            pol(0) := p - 1
        else
            pol(0) := pol(0) - 1
        pol

    While the semantics of this would need some explanation, I hope
    that you agree that this is uncluttered syntax, with no
    braces or semicolons. But there is a price to be paid, namely
    (rare) cases like the previous one.

    BTW: this is strongly typechecked code; 'sub1!' has its type
    declared in the module interface.

    Note that C's preprocessor is line-oriented, but C itself isn't.

    C is still tremendously popular for many reasons. But anyone wanting to
    code today in such a language will be out of luck if they prefer any or all
    of these characteristics. This is why I find coding in my language such
    a pleasure.

    Then, if we are comparing the C language with mine, I offer:

    * Out of order definitions

    That is considered a misfeature in modern times.

    Really? My experiments showed that modern languages (not C or C++) do
    allow out-of-order functions. This gives great freedom in not worrying
    about whether function F must go before G or after, or being able to
    reorder or copy and paste.

    I am not sure what you mean here by "out-of-order functions".
    If you declare functions earlier there is nothing "out-of-order";
    similarly if the language does not need declarations. For me "out-of-order"
    is, say, PL/I, where you need a declaration (possibly implicit) and
    (IIRC) you can define the function before the declaration.

    In modern languages a definition may generate some code to be run, and
    the order in which this code is run matters.

    * One-time definitions (no headers, interfaces etc)
    * Expression-based

    C is mostly expression-based.

    No, it's mostly statement-based. Although it might be that most
    statements are expression statements (a=b; f(); ++c;).

    You can't do 'return switch() {...}' for example, unless using gcc extensions.
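
    (A sketch of what that looks like with the gcc statement-expression
    extension -- non-standard C, and the function name is made up:)

        int describe(int x) {
            return ({ int r;
                      switch (x) {
                          case 0:  r = 100; break;
                          case 1:  r = 200; break;
                          default: r = -1;  break;
                      }
                      r; });
        }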

    That is why I wrote "mostly"; this and a few similar cases mean that it is
    not entirely expression-based. But in C, assignment, the conditional
    and sequencing are expressions, which means that you can do in a single
    expression things that in Pascal or Ada (and several other languages)
    would take multiple statements and possibly extra variables. Note
    that 'switch' in most cases could be replaced by conditionals,
    so looking at the complexity of constructs, only loops, gotos, and declarations/definitions are really excluded. The point is that C allows
    most uses that appear in practical programs.

    There are languages that go further
    than C, for example:

    a := (s = 1; for i in 1..10 repeat (s := s + i); s)

    is legal in the language that I use, but cannot be directly translated to C. However, from the examples that you gave, it looked like your language
    is _less_ expression-based than C.
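
    (A rough GNU C rendering of that example, again leaning on the
    non-standard statement-expression extension, so it is not "directly
    translated" into standard C:)

        #include <stdio.h>

        int main(void) {
            /* the whole block yields the final value of s */
            int a = ({ int s = 1;
                       for (int i = 1; i <= 10; i++) s += i;
                       s; });
            printf("%d\n", a);    /* 56 */
            return 0;
        }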

    I don't use the feature much. I had it from the 80s, then switched to statement-based for a few years to match the scripting language; now
    both are expression-based.

    One reason it's not used more is because it causes problems when
    targetting C. However I like it as a cool feature.

    * Program-wide rather than module-wide compilation unit

    AFAICS C leaves the choice to the implementer. If your language _only_
    supports whole-program compilation, then this would be a
    negative feature.

    * Build direct to executable; no object files or linkers

    That is really a question of implementation. Building _only_
    to an executable is a misfeature (what about the case when I want
    to use a few routines in your language, but the rest, including the
    main program, is in a different language?).

    There were escape routes involving OBJ files, but that's fallen into
    disuse and needs fixing. For example, I can't do `mm -obj app` ATM, but
    could do this once I've cleared some bugs:

    mm -asm app # app.m to app.asm
    aa -obj app # app.asm to app.obj
    gcc app.obj lib.o -oapp.exe # or lib.a?

    This (or something near it) allows static linking of 'lib' instead of
    dynamic linking, or including a lib written in another language.


    [continued in next message]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Sat Dec 17 13:22:24 2022
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    I don't agree. On Linux you do it with sources because it doesn't have a
    reliable binary format like Windows that will work on any machine. If
    there are binaries, they might be limited to a particular Linux
    distribution.

    You do not get it. I create binaries that work on 10 year old Linux
    and new one. And on distributions that I never tried.

    I tried porting a binary from one ARM32 Linux machine to another; it
    didn't work, even 2 minutes later. Maybe it should have worked and there
    was some technical reason why my test failed.

    But I have noticed that on Linux, distributing stuff as giant source
    bundles seems popular. I assumed that was due to difficulties in using binaries.

    Of course,
    I mean that binaries are for specific architecture, separate
    for i386 Linux and x86_64 Linux (that covers PC-s, I would have to
    provide more if I wanted to support more architectures).

    Concerning binary format, there were two: Linux started with a.out
    and switched to ELF in second half of nineties.

    (I don't understand that; a.out is a filename; ELF is a file format.)

    You also ignore educational aspect: some folks fetch sources to
    learn how things are done.

    Sure. Android is open source; so is Firefox. While you can spend years
    reading through the 25,000,000 lines of Linux kernel code.

    Good luck finding out how they work!

    Here I'm concerned only with building stuff that works, and don't want
    to know what directory structure the developers use.

    Concerning not having extension: you can add one if you want,
    moderatly popular choices are .exe or .elf.

    But nobody does. Main problem is in forums like this: if I say
    `hello.exe`, everyone knows that's a binary executable for Windows. But
    if I mention `hello`, how are you supposed to tell that I'm talking
    about a Linux executable?

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c .s .o .a .so, so
    why not actual binaries by convention?

    But for using normal
    Linux executable it should not matter if it is a shell script,
    interpreted Python file or machine code. So exention should
    not "give away" nature of executable.

    You can have a neutral extension that doesn't give it away either. Using
    no extension is not useful: is every file with no extension something
    you can execute?

    But there are also ways to execute .c files directly, and of course .py
    files which are run from source anyway.

    It simply doesn't make sense. On Linux, I can see that executables are displayed on consoles in different colours; what happened when there was
    no colour used?

    And having no extension
    means that users are spared needless typing

    Funny you should bring that up, because every time you run a /C
    compiler/ on a /C source file/, you have to type the extension like this:

    gcc hello.c

    which also writes the output as a.exe or a.out, so you further need to
    write at least:

    gcc hello.c -o hello # hello.exe on Windows

    I would only write this:

    bcc hello

    and it works out, by some very advanced AI, that I want to compile
    hello.c into hello.exe. And once you have hello.exe, you can run it like
    this:

    hello

    You don't need to type .exe. So, paradoxically, having extensions means
    having to type them less often:

    mm -pcl prog # old compiler: translate prog.m to prog.pcl
    pcl -asm prog # prog.pcl to prog.asm
    aa prog # prog.asm to prog.exe
    prog # run it

    At no point did I need to write an extension. It is implied by the
    program I invoked.

    No, but deriving the true sources from app.ma is trivial, since it is
    basically a concatenation of the relevant files.

    Not less trivial than running 'tar' (which is standard component
    on Linux).

    .ma is a text format; you can separate it with a text editor if you want!
    But you don't need to. Effectively you just do:

    gcc app # ie. app.gz2

    and it makes `app` (ie. an ELF binary 'app').

    * Needing to run './configure' first (this will not work on Windows...)

    I saw one case then a guy tried to run './configure' on Windows NT
    and Windows NT kept crashing.

    Possibly you don't quite understand: aside from "./" being a syntax
    error on Windows, 'configure' is a script full of Bash commands which
    invoke all sorts of utilities from Linux. It is meaningless to attempt
    to run it on Windows.

    It would be like my bundling a Windows BAT file with sources intended to
    be built on Linux.


    It made a little progress and than
    crashed, so that guy restarted it hoping that eventually it will
    finish (after a week or two he gave up and used Linux). But
    usually './configure' is not that bad. It make take a lot of
    time, IME './configure' that run in seconds on Linux needed
    several minutes on Windows.

    It can take several minutes on Linux too! Autoconf-generated configure scripts can contain tens of thousands of lines of code.

    Building CPython on Linux (this was a decade ago when it was smaller)
    took 5 minutes from cold. After that, an incremental make (after editing
    one module) took 5 seconds. (Still 25 times longer than building my own interpreter from scratch.)

    Another project, the sources for A68G, also took minutes. On Windows it
    didn't work at all, but by extracting the dozen C files (about 70Kloc) I
    managed to get a part-working version using my bcc compiler. It took under
    a second to build.

    Yet another, the sources for GMP, could only be built on Windows using
    MSYS2. Which I tried; it worked away for an hour, then it failed.

    And of course you need to install
    essential dependencies, good program will tell you what you need
    to install first, before running configure. But you need to
    understand what they mean...

    * Finding a 'make' /program/ (my gcc has a program called
    mingw32-make.exe; is that the one?)

    Probably. Normal advice for Windows folks is to install thing
    called msys (IIUC it is msys2 now) which contains several tools
    incuding 'make'. You are likely to get it as part of bigger
    bundle, I am not up to date to tell you if this bundle will
    be called 'gcc' or something else.

    But that's just a cop-out. As I said above, it's like my delivering a
    build system for Linux that requires so many Windows dependencies, that
    you can only build by installing half of Windows.

    When /I/ provide sources (that is, a representation that is one step
    back from binaries), to build on Linux, then it will build on Linux.
    They will have a dependency on a C compiler that can produce an ELF file,
    and I now stipulate either gcc or tcc.

    I don't have any interest in this; I just want the binary!

    Well, I provide Linux binaries, but only sources for Windows
    users. One reason is that I have only Linux on my personal
    machine, so to deal with Windows I need to lease a machine.
    Different reason is that I an not paid for programming, I do
    this because I like to program and to some degree to build
    community.

    I had the same problem, in reverse. I've spent money on RPis, cheap
    Linux netbooks, spent endless time getting VirtualBox to work, and still
    don't have a suitable Linux machine that Just Works.

    WSL is not interesting since it is still x64, and maybe things will work
    there that will not work on real Linux (eg. it can still run actual Windows
    EXEs; what else is it allowing that wouldn't work on real Linux?).

    I've stopped this since no one has ever expressed any interest in seeing
    my stuff work on Linux, especially on RPi where a very fast alternative
    to C that ran on the actual board would have been useful.

    But if some potential members of community would
    like to benefit but are unwilling to spent little effort.
    Of course, in big community there may be a lot of "free
    riders" who benefit without contributing anything whithout
    bad effect because other folks will do needed work. But
    here I am dealing with small community. I did port
    to Windows to make sure that it actually works and there
    are not serious problems. But I leave to other creation
    of binaries and reporting possible build problems. If
    nobody is willing to do this, then from my point of
    view Windows has no users and is not worth supporting.

    In the 1990s, for my commercial app, about 1 in 1000 users ever enquired
    about versions for Linux. One wanted a Mac version, but apparently my
    app worked well enough under a Windows emulator.

    Being able to ZIP or TAR a sprawling set of files into giant binary
    makes it marginally easier to transmit or download, but it doesn't
    really address complexity.

    In my book single blob of 20M is more problematic than 10000 files,
    2kB each. At deeper level complexity is the same, but blob lacks
    useful structure given by division into files and subdirectories.

    When you download a ready-to-run binary, it will be a single blob. Or a
    single main blob.

    In my case, my 'blob' can be run directly; start with these three files:

    c:\demo>dir
    30/03/2022 13:53 45 hello.m
    17/12/2022 12:52 653,251 mm.ma
    09/12/2022 18:54 471,552 ms.exe

    mm.ma are the sources for my compiler as one text blob. ms.exe is my M
    compiler (when called 'ms', it automatically invokes the '-run' option
    to run from source).

    Now I can run the compiler from that blob without even formally creating
    any executable:

    c:\demo>ms mm hello
    (Building mm.ma)
    Hello World! 12:54:27

    c:\demo>tm ms mm hello
    (Building mm.ma)
    Hello World! 12:54:36
    TM: 0.08

    The second run applied a timer; it took 80ms to build my compiler from
    scratch and use it to build /and/ run Hello.

    On the same machine, gcc takes 0.22 seconds to build hello.c, without
    that minor step of building itself from scratch first.

    If /only/ other language tools were as genuinely effortless and simple
    and fast as mine. Then I would moan a lot less!

    So, most things actually work; only creating EXE doesn't work, because
    it needs access to msvcrt.dll. But even it it did, it would work as a
    cross-compiler, as its code generator is for Win64 ABI.

    Yes. It was particularly funny whan you had compiler running on
    Raspberry Pi, but producing Intel code...

    I did work on a recent project where my x64 code for Win64 ABI could
    work on x64 Linux (not arm64), which seemed exciting, then I discovered
    that WSL could run EXE anyway. That spoilt it and I abandoned it.


    But I think this shows useful stuff can be done. A more interesting test
    (which used to work, but it's too much effort right now), is to get my M
    compiler working on Linux (the 'mc' version that targets C), and use
    that to build qq, bcc etc from original sources on Linux.

    In all, my Windows stuff generally works on Linux. Linux stuff generally
    doesn't work on Windows, in terms of building from source.

    Our experience differ. There were times that I had to work on
    Windows machine and problem was that Windows does not come with
    tools that I consider essential.

    This is the problem. Linux encourages the use of myriad built-in tools
    and utilities. Is it any wonder that stuff is then hard to build
    anywhere else?

    My experience is that the OS provided nothing, especially pre-Windows
    where it basically only provided a file-system. Then you learn to be self-sufficient and to keep your tools lean.

    The way it does modules is crude. So was my scheme in the 1980s, but it
    was still one step up from C. My 2022 scheme is miles above C now.

    Concering "miles above": using C one can create shared libraries.
    Some shared libraries may be system provided, some may be private.
    Within C ecosystem, one you have corresponding header files you
    can use them as "modules". And they are usable with other languages.
    AFAIK no other module system can match _this_.

    Huh? Any language that can produce DLL files can match that. Some can
    produce object files that can be turned into .a and .lib files.

    Regarding C header files, NO language can directly process .h files
    unless that ability has been specifically built in, either by providing
    half a C compiler, or bundling a whole one (eg. Zig comes with Clang).



    I don't agree with spending 1000 times more effort in devising complex
    tools compared with just fixing the language.

    It is cheaper to have 1000 people doing tools, than 100000 people
    fixing their programs.

    Or one person fixing the C language and the compiler. OK, there are lots
    of C compilers, but you don't need 1000 people.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bart on Sat Dec 17 22:35:33 2022
    On 17/12/2022 13:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:

    When /I/ provide sources (that is, a representation that is one step
    back from binaries), to build on Linux, then it will build on Linux.
    They will have a dependency on a C compiler that can produce a ELF file,
    and I now stipulate either gcc or tcc.

    See https://github.com/sal55/langs/tree/master/demo

    This includes mc.c, a generated-C rendering of my M-on-Linux compiler.

    You [antispam] will need gcc or tcc to create a binary on Linux;
    instructions are at the link.

    Once you have a working binary, you can try that on the one-file M
    'true' sources in mc.ma, to create a new binary.

    If that monolithic source file still doesn't cut it for you, I've
    included an extraction program. The readme tells you how to run that,
    and how to run the 2nd compiler on those discrete files to make a third compiler.

    (I've briefly tested those instructions under WSL. It ought to work on
    any 64-bit Linux including ARM, but I can't guarantee it. The C file is
    32Kloc, and the .ma file is 25Kloc.

    If it doesn't work, then forget it. I know it can be made to work, and
    to do so via my one-file distributions.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Sun Dec 18 14:05:09 2022
    On 17/12/2022 14:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:


    (I'm snipping a lot.)


    Of course,
    I mean that binaries are for specific architecture, separate
    for i386 Linux and x86_64 Linux (that covers PC-s, I would have to
    provide more if I wanted to support more architectures).

    Concerning binary format, there were two: Linux started with a.out
    and switched to ELF in second half of nineties.

    (I don't understand that; a.out is a filename; ELF is a file format.)


    "a.out" is an old executable file format. It was also the default name
    a lot of tools used when producing something in that format, and that
    default name has stuck. So if you use a modern gcc to generate an
    executable, and don't give it a name, it uses "a.out" as the name - but
    it will be in ELF format, not "a.out" format. Usually it's best to give
    your executables better names!


    Concerning not having extension: you can add one if you want,
    moderatly popular choices are .exe or .elf.

    But nobody does. Main problem is in forums like this: if I say
    `hello.exe`, everyone knows that's a binary executable for Windows. But
    if I mention `hello`, how are you supposed to tell that I'm talking
    about a Linux executable?


    In the *nix world, the meaning of a file is determined primarily by its contents, along with the "executable" flag, not by its name. This is a
    /good/ thing. If I run a program called "apt", I don't care if it is an
    ELF binary, a Perl script, a BASH shell script, a symbolic link to
    another file, or anything else. I don't want to have to distinguish
    between them, so it's good that they don't need to have the file type as
    part of the name.

    For data files, it can often be convenient to have an extension
    indicating the type - and it is as common on Linux as it is on Windows
    to have ".odt", ".mp3", etc., on data files.

    If you want to know what a file actually is, the "file" command is your
    friend on *nix.

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c -s .o .a .so, so
    why not actual binaries by convention?


    People use extensions where they are useful, and skip them when they are counter-productive (such as for executable programs).

      But for using normal
    Linux executable it should not matter if it is a shell script,
    interpreted Python file or machine code.  So exention should
    not "give away" nature of executable.

    You can have a neutral extension that doesn't give it away either. Using
    no extension is not useful: is every file with no extension something
    you can execute?


    When you are writing code, and you have a function "twiddle" and an
    integer variable "counter", you call them "twiddle" and "counter". You
    don't call them "twiddle_func" and "counter_int". But maybe sometimes
    you find it useful - it's common to write "counter_t" for a type, and
    maybe you'd write "xs" for an array rather than "x". Filenames can
    follow the same principle - naming conventions can be helpful, but you
    don't need to be obsessive about it or you end up with too much focus on
    the wrong thing.

    On *nix, every file with the executable flag can be executed - that's
    what the flag is for.

    Sometimes it is convenient to be able to see which files in a directory
    are executables, directories, etc. That's why "ls" has flags for
    colours or to add indicators for different kinds of files. ("ls -F
    --color").

    But there are also ways to execute .c files directly, and of course .py
    files which are run from source anyway.


    There are standards for that. A text-based file can have a shebang
    comment ("#! /usr/bin/bash", or similar) to let the shell know what
    interpreter to use. This lets you distinguish between "python2" and
    "python3", for example, which is a big improvement over Windows-style
    file associations that can only handle one interpreter for each file
    type. And the *nix system distinguishes between executable files and non-executables by the executable flag - that way you don't accidentally
    try to execute non-executable Python files.


    To be honest, it really does not bother me if there are file extensions
    on programs, or if there are no file extensions. For my own
    executables, I will usually have ".py" or ".sh" for Python or shell
    files, and no extension for compiled files. But that's because there's
    a fair chance I'll want to modify or update the file at some point, not
    because it makes a big difference when it is running.


    It simply doesn't make sense. On Linux, I can see that executables are displayed on consoles in different colours; what happened when there was
    no colour used?


    Try "ls -l", or "ls -F". It's been a long time since I used a computer
    display that did not have colour, but I do not remember it being a
    problem on the Sun workstations at university. (I do remember how
    god-awful ugly and limited it was going back to all-caps
    case-insensitive 8 character DOS/Win16 filenames on a PC at home. At
    least most of these limitations are now outdated even on Windows.)

    And having no extension
    means that users are spared needless typing

    Funny you should bring that up, because every time you run a /C
    compiler/ on a /C source file/, you have to type the extension like this:

        gcc hello.c

    You do realise that gcc can handle some 30-odd different file types?
    It's not a simple C compiler that assumes everything it is given is a C
    file. Of course you have to give the full name of the file. (You can
    also tell gcc exactly how you want the file to be interpreted, if you
    are doing something funny.)


    which also writes the output as a.exe or a.out, so you further need to
    write at least:

       gcc hello.c -o hello           # hello.exe on Windows

    I would only write this:

       bcc hello

    On Linux, you just write "make hello" - you don't need a makefile for
    simple cases like that. (And the "advanced AI" can figure out if it is
    C, C++, Fortran, or several other languages.)


    and it works out, by some very advanced AI, that I want to compile
    hello.c into hello.exe. And once you have hello.exe, you can run it like this:

       hello

    You don't need to type .exe. So, paradoxically, having extensions means having to type them less often:

       mm -pcl prog     # old compiler: translate prog.m to prog.pcl
       pcl -asm prog    # prog.pcl to prog.asm
       aa prog          # prog.asm to prog.exe
       prog             # run it

    At no point did I need to write an extension. It is implied by the
    program I invoked.

    That's fine for programs that handle just one file type.

    But I'm a little confused here. On the one hand, you are saying how
    terrible Linux is for not using file extensions. On the other hand, you
    are saying how wonderful your own tools are because they don't need file extensions.

    Could it be simply that file extensions are sometimes helpful, sometimes inconvenient or irrelevant, and mostly it all just works without much
    trouble?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Sun Dec 18 14:38:27 2022
    On 2022-12-18 14:05, David Brown wrote:

    Could it be simply that file extensions are sometimes helpful, sometimes inconvenient or irrelevant, and mostly it all just works without much trouble?

    Extensions are a way to convey the file type, e.g. a hint about which
    operations are supposed to work with the file, if the system is weakly
    typed.

    Some early OSes supported tagging files externally. The type was kept
    by the filesystem, not in the file. Of course, such systems became
    cluttered in the presence of access rights: a file executable for one user
    could be non-executable for another. Anyway, they did not advance much, as
    DOS and UNIX united, scorched and salted the ground.

    UNIX, as always, combined the worst available options in the most peculiar
    way. The idea of reading a file before accessing it is as stupid as it
    sounds. Isn't reading an access?
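
    (For concreteness: that content-based typing mostly comes down to reading
    a few leading "magic" bytes, which is what the 'file' utility and the
    kernel's program loader do. A minimal C sketch, illustration only; the
    ELF and shebang magic values are the standard ones, everything else is
    made up:)

        /* Sketch: guess a file's type from its first bytes, roughly what
           'file' or the kernel's exec loader does. Illustrative only. */
        #include <stdio.h>
        #include <string.h>

        static const char *sniff(const char *path)
        {
            unsigned char buf[4] = {0};
            FILE *f = fopen(path, "rb");
            if (!f) return "unreadable";
            size_t n = fread(buf, 1, sizeof buf, f);
            fclose(f);
            if (n >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0) return "ELF binary";
            if (n >= 2 && buf[0] == '#' && buf[1] == '!')    return "script with a shebang line";
            return "something else";
        }

        int main(int argc, char **argv)
        {
            for (int i = 1; i < argc; i++)
                printf("%s: %s\n", argv[i], sniff(argv[i]));
            return 0;
        }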

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Sun Dec 18 17:09:49 2022
    On 18/12/2022 13:05, David Brown wrote:
    On 17/12/2022 14:22, Bart wrote:


    For data files, it can often be convenient to have an extension
    indicating the type - and it is as common on Linux as it is on Windows
    to have ".odt", ".mp3", etc., on data files.

    It's convenient for all files. And before you say, I can add a .exe
    extension if I want: I don't want to have to write that every time I run
    that program.

    People use extensions where they are useful, and skip them when they are counter-productive (such as for executable programs).

    I can't imagine all my EXE (and perhaps BAT) files having no
    extensions. Try and envisage all your .c files having no extensions by
    default. How do you even tell that they are C sources and not Python or not executables?

    When you are writing code, and you have a function "twiddle" and an
    integer variable "counter", you call them "twiddle" and "counter".  You don't call them "twiddle_func" and "counter_int".  But maybe sometimes
    you find it useful - it's common to write "counter_t" for a type, and
    maybe you'd write "xs" for an array rather than "x".  Filenames can
    follow the same principle - naming conventions can be helpful, but you
    don't need to be obsessive about it or you end up with too much focus on
    the wrong thing.

    But you /do/ write twiddle.c, twiddle.s, twiddle.o, twiddle.cpp,
    twiddle.h etc? Yet the most important file of all is just plain 'twiddle'!

    In casual writing or conversation, how do you distinguish 'twiddle the
    binary executable' from 'twiddle the folder', from 'twiddle the
    application' (an installation), from 'twiddle' the project etc, without
    having to use that qualification?

    Using 'twiddle.exe' does that succinctly and unequivocally.


    On *nix, every file with the executable flag can be executed - that's
    what the flag is for.

    Sometimes it is convenient to be able to see which files in a directory
    are executables, directories, etc.  That's why "ls" has flags for
    colours or to add indicators for different kinds of files.  ("ls -F --color").

    As I said, if it's convenient for data and source files, it's convenient
    for all files.

    But there are also ways to execute .c files directly, and of course
    .py files which are run from source anyway.


    There are standards for that.  A text-based file can have a shebang
    comment ("#! /usr/bin/bash", or similar) to let the shell know what interpreter to use.  This lets you distinguish between "python2" and "python3", for example, which is a big improvement over Windows-style
    file associations that can only handle one interpreter for each file
    type.

    That is invasive. It takes something that is really an attribute of the
    file name and puts it not only inside the file, but in a way that requires
    the file to be opened and read to find out.

    (Presumably every language that runs on Linux needs to accept '#' as a
    line comment? And you need to build into every one of 10,000 source
    files the direct location of the Python2 or Python3 installation on that
    machine? Is that portable across OSes? But I expect it's smarter than that.)

    With Python, you're still left with the fact that you see a file with a
    .py extension, and don't know if it's Py2 or Py3, or Py3.10 or Py3.11,
    or whether it's a program that works with any version. It is a separate
    problem from having, as convention, no extensions for ELF binary files.

      And the *nix system distinguishes between executable files and
    non-executables by the executable flag - that way you don't accidentally
    try to execute non-executable Python files.

    (So there are files that contain Python code that are non-executable?
    Then what is the point?)
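
    (For concreteness, the executable-flag check mentioned above is ordinary
    file metadata, queried with stat() or access() rather than read from the
    file's contents. A minimal sketch, assuming a POSIX system; illustration
    only:)

        /* Sketch: report the executable bit the way a shell or 'ls -F'
           might, using stat() for the mode bits and access() for "can the
           current user actually run it". */
        #include <stdio.h>
        #include <sys/stat.h>
        #include <unistd.h>

        int main(int argc, char **argv)
        {
            for (int i = 1; i < argc; i++) {
                struct stat st;
                if (stat(argv[i], &st) != 0) { perror(argv[i]); continue; }
                int xbit = (st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;
                int mine = (access(argv[i], X_OK) == 0);
                printf("%s: exec bit %s, runnable by me: %s\n",
                       argv[i], xbit ? "set" : "clear", mine ? "yes" : "no");
            }
            return 0;
        }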


    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc, it
    is with the name of a .c source file. And 99.9% of the times when I
    invoke it on prog.c as the first or only file to create an executable,
    then I want to create prog.exe.

    So its behaviour is unhelpful. After the 10,000th time you have to type
    .c, or backspace over .c to get at the name itself to modify, it becomes tedious.

    Now it's not that hard to write a wrapper script or program on top of
    gcc.exe, but if it isn't hard, why doesn't it just do that?

    It's not a simple C compiler that assumes everything it is given is a C
    file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension, and also,
    bizarrely, generates `a.out` as the object file name.

    If you intend to assemble three .s files to object files, using separate
    'as' invocations, they will all be called a.out!

    That would be crass even for a toy program written by a student. And yet
    here it is in a mainstream product used by millions of people.

    All my language programs (and many of my apps) have a primary type of
    input file, and will default to that file extension if omitted. Anything
    else (eg .dll files) needs the full extension.
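
    (A hedged sketch of that default-extension convention, in C; this is not
    the actual bcc/mm code, and 'add_default_ext' is just an illustrative
    helper name:)

        /* Sketch: if the name the user typed has no extension, append the
           tool's primary one (".c" here); otherwise leave it alone. */
        #include <stdio.h>
        #include <string.h>

        static void add_default_ext(const char *name, const char *ext,
                                    char *out, size_t outsize)
        {
            const char *dot   = strrchr(name, '.');
            const char *slash = strrchr(name, '/');
            /* a dot only counts as an extension if it follows the last
               path separator; "dir.1/prog" has no extension */
            if (dot && (!slash || dot > slash))
                snprintf(out, outsize, "%s", name);        /* keep as given  */
            else
                snprintf(out, outsize, "%s%s", name, ext); /* append default */
        }

        int main(void)
        {
            char buf[256];
            add_default_ext("hello", ".c", buf, sizeof buf);
            printf("%s\n", buf);     /* hello.c   */
            add_default_ext("hello.asm", ".c", buf, sizeof buf);
            printf("%s\n", buf);     /* hello.asm */
            return 0;
        }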

    Here's something funny: take hello.c and rename to 'hello', with no
    extension. If I try and compile it:

    gcc hello

    it says: hello: file not recognised: file format not recognised. Trying
    'gcc hello.' is worse: it can't see the file at all.

    So first, on Linux, where file extensions are supposed to be optional,
    gcc can't cope with a missing .c extension; you have to provide extra
    info. Second, on Linux, "hello" is a distinct file from "hello.".

    With bcc, I just have to type "bcc hello." to make it work. A trailing
    dot means an empty extension.

    On Linux, you just write "make hello" - you don't need a makefile for
    simple cases like that.

    OK... so how does 'make' figure out the file extension?

    'Make' anyway has different behaviour:

    * It can choose not to compile

    * On Windows, it says this:

    c:\yyy>make hello
    cc hello.c -o hello
    process_begin: CreateProcess(NULL, cc hello.c -o hello, ...) failed.
    make (e=2): The system cannot find the file specified.
    <builtin>: recipe for target 'hello' failed
    make: *** [hello] Error 2

    * I also use several C compilers; how does make know which one I intend?
    How do I pass it options?

    If I give another example:

    c:\c>bcc cipher hmac sha2
    Compiling cipher.c to cipher.asm
    Compiling hmac.c to hmac.asm
    Compiling sha2.c to sha2.asm
    Assembling to cipher.exe

    it just works. 'make cipher hmac sha2' doesn't, not even in WSL.

    (And the "advanced AI" can figure out if it is
    C, C++, Fortran, or several other languages.)

    No, it can't. If I have hello.c and hello.cpp, it will favour the .c file.

    But suppose you did have just the one plausible program; why not build
    that logic into the compiler as I said above?

    That's fine for programs that handle just one file type.

    Like a .x file for a compiler for language X?

    But I'm a little confused here.  On the one hand, you are saying how terrible Linux is for not using file extensions.  On the other hand, you
    are saying how wonderful your own tools are because they don't need file extensions.

    That's why I said 'paradoxically'. The extensions are needed so you can
    be confident the tool will take a .x input and produce a .y output (and
    my tools will always confirm exactly what they will do).

    And so that you can identify .x and .y files on directory listings. But
    you don't want to keep typing .x and maybe .y thousands and thousands of
    times.

    Even if most of the time you use an IDE or some other project manager,
    you will be working from a console prompt often enough, for various custom
    builds, tests and debugging, for explicit extensions to become a nuisance.

    Here is an example of actual use:

    c:\mx>mc -c mm
    M6 Compiling mm.m---------- to mm.c

    c:\mx>bcc -s mm
    Compiling mm.c to mm.asm

    c:\mx>aa mm
    Assembling mm.asm to mm.exe

    c:\mx>mm
    M Compiler [M6] 18-Dec-2022 15:06:29 ...

    Notice how it makes clear exactly what x and y are at each step. gcc
    either says nothing, or --verbose gives a wall of gobbledygook.



    Could it be simply that file extensions are sometimes helpful, sometimes inconvenient or irrelevant, and mostly it all just works without much trouble?

    File extensions are tremendously helpful. But that doesn't mean you have
    to keep typing them! They just have to be there.

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sun Dec 18 17:17:51 2022
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 13:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:

    When /I/ provide sources (that is, a representation that is one step
    back from binaries), to build on Linux, then it will build on Linux.
    They will have a dependency on a C compiler that can produce a ELF file, and I now stipulate either gcc or tcc.

    See https://github.com/sal55/langs/tree/master/demo

    This includes mc.c, a generated-C rendering of my M-on-Linux compiler.

    You [antispam] will need gcc or tcc to create a binary on Linux;
    instructions are at the link.

    Once you have a working binary, you can try that on the one-file M
    'true' sources in mc.ma, to create a new binary.

    If that monolithic source file still doesn't cut it for you, I've
    included an extraction program. The readme tells you how to run that,
    and how to run the 2nd compiler on those discrete files to make a third compiler.

    (I've briefly tested those instructions under WSL. It ought to work on
    any 64-bit Linux including ARM, but I can't guarantee it. The C file is 32Kloc, and the .ma file is 25Kloc.

    If it doesn't work, then forget it. I know it can be made to work, and
    to do so via my one-file distributions.)

    It works on 64-bit AMD/Intel Linux. As-is it failed on 64-bit ARM.
    More precisely, the initial 'mc.c' compiled fine, but it could not
    run 'gcc'. Namely, ARM gcc does not have a '-m64' option. Once
    I removed this it worked.

    So you may want to change this:

    --- fred.nn/mm_winc.m 2022-12-18 14:52:37.635098030 +0000
    +++ fred.nn2/mm_winc.m 2022-12-18 16:02:10.494914440 +0000
    @@ -52,7 +52,7 @@

    case ccompiler
    when gcc_cc then
    - fprint @&.str,"gcc -m64 # # -o# # -s ",
    + fprint @&.str,"gcc # # -o# # -s ",
    (doobj|"-c"|""),(optimise|"-O3"|""),exefile, cfile
    when tcc_cc then
    fprint @&.str,f"tcc # -o# # # -luser32 c:\windows\system32\kernel32.dll -fdollars-in-identifiers",
    @@ -88,7 +88,7 @@

    case ccompiler
    when gcc_cc then
    - fprint @&.str,"gcc -m64 # # -o# # -lm -ldl -s -fno-builtin",
    + fprint @&.str,"gcc # # -o# # -lm -ldl -s -fno-builtin",
    (doobj|"-c"|""),(optimise|"-O3"|""),&.newexefile, cfile
    when tcc_cc then
    fprint @&.str,"tcc # -o# # -lm -ldl -fdollars-in-identifiers",

    Also, I noticed that a lot of system stuff is just stubs. You may
    want the following implementation of 'os_getsystime':

    --- fred/mlinux.m 2022-12-18 14:52:42.831097
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Sun Dec 18 23:18:27 2022
    On 18/12/2022 17:17, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 13:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:

    When /I/ provide sources (that is, a representation that is one step
    back from binaries), to build on Linux, then it will build on Linux.
    They will have a dependency on a C compiler that can produce a ELF file, and I now stipulate either gcc or tcc.

    See https://github.com/sal55/langs/tree/master/demo

    This includes mc.c, a generated-C rendering of my M-on-Linux compiler.

    You [antispam] will need gcc or tcc to create a binary on Linux;
    instructions are at the link.

    Once you have a working binary, you can try that on the one-file M
    'true' sources in mc.ma, to create a new binary.

    It works on 64-bit AMD/Intel Linux.

    Wow, you actually had a look! (Usually no one ever bothers.)

    As-is it failed on 64-bit ARM.
    More precisly, initial 'mc.c' compiled fine, but it could not
    run 'gcc'. Namely, ARM gcc does not have '-m64' option. Once
    I removed this it works.

    gcc doesn't have `-m64`, really? I'm sure I've used it even on ARM. (How
    do you tell it to generate ARM32 rather than ARM64 code?)

    But I've taken that out for now. (Programs can also be built using `./mc
    -c prog` then compiling prog.c manually. That reminds me I haven't yet
    provided help text specific to 'mc'.)


    I tested this using the following program:

        proc main=
            rsystemtime tm
            os_getsystime(&tm)
            println tm.second
            println tm.minute
            println tm.hour
            println tm.day
            println tm.month
            println tm.year
        end


    It's funny you picked on that, because the original version of my
    hello.m also printed out the time:

        proc main=
            println "Hello World!",$time
        end

    This was to ensure I was actually running the just-built version, and
    not the last of the 1000s of previous ones. But the time-of-day support
    for Linux wasn't ready so I left it out.

    I've updated the mc.c/mc.ma files (not hello.m, I'm sure you can fix that).

    However getting this to work on Linux wasn't easy as it kept crashing.
    The 'struct tm' record ostensibly has 9 fields of int32, so has a size
    of 36 bytes. And on Windows it is. But on Linux, a test program reported
    the size as 56 bytes.

    Doing -E on that program under Linux, the struct actually looks like this:

        struct tm
        {
            int tm_sec;
            int tm_min;
            int tm_hour;
            int tm_mday;
            int tm_mon;
            int tm_year;
            int tm_wday;
            int tm_yday;
            int tm_isdst;

            long int tm_gmtoff;
            const char *tm_zone;
        };

    The 16 extra bytes for fields not mentioned in the 'man' docs, plus 4
    bytes of alignment padding, account for the 20-byte difference. This is
    typical of the problems in adapting C APIs to the FFIs of other languages.
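
    A quick way to see the discrepancy is to print the size and the offsets
    of the extra fields. On glibc/x86-64 the program below reports 56 bytes,
    matching the figure above; a typical Windows C library, which has no
    tm_gmtoff/tm_zone, gives 36. (The _DEFAULT_SOURCE define is only there
    so that glibc exposes the two extension fields.)

        #define _DEFAULT_SOURCE 1   /* make glibc expose tm_gmtoff/tm_zone */
        #include <stdio.h>
        #include <stddef.h>
        #include <time.h>

        int main(void)
        {
            printf("sizeof(struct tm)   = %zu\n", sizeof(struct tm));
            printf("offsetof(tm_isdst)  = %zu\n", offsetof(struct tm, tm_isdst));
        #ifdef __GLIBC__
            /* glibc/BSD extensions beyond the nine documented int fields */
            printf("offsetof(tm_gmtoff) = %zu\n", offsetof(struct tm, tm_gmtoff));
            printf("offsetof(tm_zone)   = %zu\n", offsetof(struct tm, tm_zone));
        #endif
            return 0;
        }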

    BTW: I still doubt that 'mc.ma' expands to true source: do you
    really write no comments in your code?

    The file was detabbed and decommented, as the comments would be full of
    ancient crap, mainly debugging code that never got removed. I've tidied
    most of that up, and now the file is just detabbed (otherwise things
    won't line up properly). Note the sources are not heavily commented anyway.

    It will always be a snapshot of the actual sources, which are not kept
    on-line and can change every few seconds.

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Mon Dec 19 06:15:26 2022
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    I don't agree. On Linux you do it with sources because it doesn't have a
    reliable binary format like Windows that will work on any machine. If
    there are binaries, they might be limited to a particular Linux
    distribution.

    You do not get it. I create binaries that work on 10-year-old Linux
    and on new ones. And on distributions that I never tried.

    I tried porting a binary from one ARM32 Linux machine to another; it
    didn't work, even 2 minutes later. Maybe it should have worked and there
    was some technical reason why my test failed.

    But I have noticed that on Linux, distributing stuff as giant source
    bundles seems popular. I assumed that was due to difficulties in using binaries.

    Creating binaries that work on many systems requires some effort.
    The easiest way is to create them on the oldest system that you expect
    to be used: typically a compile on Linux will try to take advantage
    of features of the processor and system, and an older system/processor
    may lack those features and fail. As I wrote, one needs to limit
    dependencies or bundle them. Bundling may lead to a very large
    download size.

    Distributing sources has many advantages: the program can be better
    optimized for the concrete machine, and if there is trouble other folks
    may fix it. If your program gains some popularity, then it is
    likely to become part of a Linux distribution. In that case the
    distribution maintainers will create binary packages and the package
    system will take care of dependencies.

    And do not forget that open source means that users can get the source.
    If the source is not available, most users will ignore your program.

    Concerning binaries for ARM, they are a bit more problematic than
    for Intel/AMD. Namely, there are a lot of different variants of
    ARM processors and 3 different instruction encodings in popular
    use: 32-bit instructions operating on 32-bit data (original ARM),
    16- or 32-bit instructions operating on 32-bit data (Thumb), and 32-bit
    instructions operating on 64-bit data (AArch64). And there are versions
    of the ARM architecture, with new versions adding new useful
    instructions. And not all ARM processors have an FPU.

    The original Raspberry Pi used architecture version 6 and had an FPU.
    The first Linux for the Raspberry Pi had code which allowed linking with
    code for machines with no FPU (using emulation); this was supposed to
    be portable, but slowed floating-point computations. Quickly this was
    replaced by a version which does not allow linking with code using
    FPU emulation. The Chinese variants that I have use architecture
    version 7. In version 6 the Thumb instruction encoding led to
    much slower code. In version 7 Thumb instructions are almost
    as fast as ARM ones, but give significantly smaller code.
    Some Linux versions for those boards use the ARM instruction
    encoding for most programs, others use the Thumb instruction
    encoding. A program for version 7 using Thumb instructions will
    almost surely use instructions not present in version 6,
    so will fail on older machines.

    For some time the Raspberry Pi has used a version 8 processor, which is
    64-bit capable. But at first those machines used to run a 32-bit system,
    and only recently has a 64-bit system for ARM become popular. Note that
    technically a 64-bit ARM processor can run 32-bit code and
    IIUC a 64-bit system could provide compatibility with 32-bit
    binaries. But unlike Intel/AMD machines, where Linux for a
    long time provided support for running 32-bit binaries on
    64-bit machines, it seems that 64-bit Linux distributions for
    ARM have no support for 32-bit binaries. So there is less
    compatibility than technically possible.

    My binaries for ARM use the 32-bit encoding in version 5, which
    should be portable to any 32-bit ARM produced in the last 10 years.
    They are some percent slower, because they do not use newer
    instructions, and some percent larger due to not using the Thumb
    encoding. And I limited the dependencies of the binary. One gets
    _much_ more functionality by compiling from source; the binary
    is just for bootstrap (there is no "C target", so one needs an
    initial binary compiler to compile the rest).

    Of course,
    I mean that binaries are for a specific architecture, separate
    for i386 Linux and x86_64 Linux (that covers PCs; I would have to
    provide more if I wanted to support more architectures).

    Concerning binary formats, there were two: Linux started with a.out
    and switched to ELF in the second half of the nineties.

    (I don't understand that; a.out is a filename; ELF is a file format.)

    a.out is also the name of an executable format.

    You also ignore educational aspect: some folks fetch sources to
    learn how things are done.

    Sure. Android is open source; so is Firefox. Meanwhile you can spend years reading through the 25,000,000 lines of Linux kernel code.

    Good luck finding out how they work!

    Things have structure; the kernel is divided into subdirectories in
    a reasonably logical way. And there are tools like 'grep', which can
    find the relevant thing in seconds. This is not pure theory: I had a
    puzzling problem in which binaries were failing on newer Linux
    distributions. It took some time to solve, but another guy
    hinted to me that this might be tightened "security" in the kernel.
    Even after the hint my first try did not work. But I was able
    to find the relevant code in the kernel and then it became clear
    what to do.

    FYI: The problem was with executing machine code generated at
    runtime. Originally the program (Poplog) depended on the old default
    memory permissions for newly allocated memory being read, write,
    execute (the program explicitly requested the old default). But the
    Linux kernel, starting from 5.8, removed execute permission from the
    old default, and one had to make another system call to get execute
    permissions.
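
    In C, the fix described (asking for execute permission explicitly rather
    than relying on the old default) looks roughly like the sketch below.
    This is not Poplog's actual code, the page size is assumed to be 4096,
    and the embedded machine code is x86-64 only:

        /* Sketch: map writable memory, copy generated code into it, then
           ask explicitly for read+execute with mprotect() instead of
           relying on a read-implies-exec default. */
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>

        int main(void)
        {
            /* x86-64 code for: mov eax,42 ; ret */
            static const unsigned char code[] = { 0xb8, 0x2a, 0, 0, 0, 0xc3 };

            void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (p == MAP_FAILED) { perror("mmap"); return 1; }

            memcpy(p, code, sizeof code);
            if (mprotect(p, 4096, PROT_READ | PROT_EXEC) != 0) {
                perror("mprotect");
                return 1;
            }

            int (*fn)(void) = (int (*)(void))p;   /* non-standard but usual */
            printf("generated code returned %d\n", fn());
            return 0;
        }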

    As another example, I observed somewhat strange behaviour from
    USB-to-serial converters. I looked at the corresponding kernel
    driver and at least part of it was explained: instead of taking a
    number (a clock divisor) to set the speed, the chip had some funky
    way of setting the speed which limited the possible speeds.

    Here I'm concerned only with building stuff that works, and don't want
    to know what directory structure the developers use.

    Concerning not having an extension: you can add one if you want;
    moderately popular choices are .exe or .elf.

    But nobody does. Main problem is in forums like this: if I say
    `hello.exe`, everyone knows that's a binary executable for Windows.

    It may be for Linux...

    But
    if I mention `hello`, how are you supposed to tell that I'm talking
    about a Linux executable?

    Most people are intelligent enough to realize that the talk is about
    an executable. And if there is a possibility of confusion, a reasonable
    speaker/writer will explicitly mention the executable. Essentially
    the only case where extensions help is when you want to set up
    some automation based on file extensions. That is usually part
    of a bigger process, and using (adding) an extension like '.exe' or '.elf'
    solves the problem.

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c .s .o .a .so, so
    why not actual binaries by convention?

    Here you miss the virtue of simplicity: binaries are started by the
    kernel and you pass the filename of the binary to the system call. No
    messing with extensions there. There are similar library calls that
    do a search based on PATH, again with no messing with extensions.
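
    (Those PATH-searching library calls are the execvp()/execlp() family;
    the kernel-level execve() itself only ever gets an exact path. A minimal
    sketch, which simply replaces the current process with 'ls -F':)

        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            char *args[] = { "ls", "-F", NULL };
            /* execvp() searches PATH for "ls"; execv("/bin/ls", args) would
               need the exact path, which is what the kernel sees. Neither
               looks at file extensions. */
            execvp("ls", args);
            perror("execvp");   /* reached only if the exec failed */
            return 127;
        }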

    But for using a normal
    Linux executable it should not matter whether it is a shell script,
    an interpreted Python file or machine code. So the extension should
    not "give away" the nature of the executable.

    You can have a neutral extension that doesn't give it away either. Using
    no extension is not useful: is every file with no extension something
    you can execute?

    But there are also ways to execute .c files directly, and of course .py
    files which are run from source anyway.

    It simply doesn't make sense.

    It makes sense if you know that an executable in the PATH is
    simultaneously a shell command. You see, there are folks who really do
    not like useless clutter in their command lines. And before calling an
    executable from a shell script you may wish to check if it is available.
    Having a different extension for calling and for access as a normal
    file would complicate scripts.

    On Linux, I can see that executables are
    displayed on consoles in different colours; what happened when there was
    no colour used?

    There is 'ls -l', which gives rather detailed information. Or 'ls -F',
    which appends a star to the names of executables (and a slash to directory names).

    And having no extension
    means that users are spared needless typing

    Funny you should bring that up, because every time you run a /C
    compiler/ on a /C source file/, you have to type the extension like this:

    gcc hello.c

    which also writes the output as a.exe or a.out, so you further need to
    write at least:

    gcc hello.c -o hello # hello.exe on Windows

    You can write

    make hello

    (this works without a Makefile, just using default make rules).
    And for quick and dirty testing 'a.out' is fine as a file name.

    I would only write this:

    bcc hello

    and it works out, by some very advanced AI, that I want to compile
    hello.c into hello.exe. And once you have hello.exe, you can run it like this:

    hello

    You don't need to type .exe. So, paradoxically, having extensions means having to type them less often:

    mm -pcl prog # old compiler: translate prog.m to prog.pcl
    pcl -asm prog # prog.pcl to prog.asm
    aa prog # prog.asm to prog.exe
    prog # run it

    At no point did I need to write an extension. It is implied by the
    program I invoked.

    Implied extensions have the trouble that somebody else may try to hijack
    them. I use the TeX system, which produces .dvi files. IIUC they could
    be easily mishandled by systems depending just on the extension. And in
    the area of programming languages at least three languages compete for
    the .p extension.

    No, but deriving the true sources from app.ma is trivial, since it is
    basically a concatenation of the relevant files.

    No less trivial than running 'tar' (which is a standard component
    on Linux).

    .ma is a text format; you can separate with a text editor if you want!
    But you don't need to. Effectively you just do:

    gcc app # ie. app.gz2

    and it makes `app` (ie. an ELF binary 'app.').

    * Needing to run './configure' first (this will not work on Windows...)

    I saw one case then a guy tried to run './configure' on Windows NT
    and Windows NT kept crashing.

    Possibly you don't quite understand: aside from "./" being a syntax
    error on Windows,

    AFAIK '/' is legal in Windows pathnames (even though many programs
    do not support it). I am not sure about leading dot.

    'configure' is a script full of Bash commands which
    invoke all sorts of utilities from Linux. It is meaningless to attempt
    to run it on Windows.

    You probably do not understand that 'configure' scripts use POSIX
    commands. The POSIX commands were mostly based on Unix commands, but
    the original motivation was that various OS-s had quite different
    ways of doing simple things. Influential users (like governments)
    got tired of this, and in response came an industry standard, that
    is POSIX. Base POSIX was carefully designed so that the commands
    could be provided on many different systems. IBM MVS had trouble
    because it did not have timestamps on files, but IBM solved
    this by providing a "UNIX" subsystem on MVS (I write it in quotes
    because it has significant differences from normal Unix).
    IIUC other OS providers did not have problems. There is more
    to POSIX than commands, and to sell its systems to the US government
    Microsoft claimed that Windows is "certified POSIX". Microsoft
    being Microsoft made its POSIX compatibility as useless as
    possible while satisfying the letter of the standard. And did not
    provide it at all in home versions of Windows. But you
    can get POSIX tools from third parties and they run fine.

    BTW1: A few years ago Microsoft noticed that the lack of de-facto
    POSIX compatibility was hurting them, so they started WSL.

    BTW2: At least the normal autoconf macros are quite careful to use
    only POSIX constructs. 'configure' scripts would be shorter
    and simpler if one could assume that they are executed by
    'bash' (which has a bunch of useful features not present in
    POSIX). And similarly for the executed commands: if one could assume
    the versions of commands usually present on Linux, 'configure'
    would be simpler.

    BTW3: You could get DOS versions of several Unix commands in the
    late eighties. They were used not for porting, but simply
    because they were useful.

    It would be like my bundling a Windows BAT file with sources intended to
    be built on Linux.

    There are two important differences:
    - COMMAND.COM is very crappy as a command processor, unlike the Unix
    shell, which from the start was designed as a programming language.
    IIUC that is changing with PowerShell.
    - you compare a thing which was designed to be a portability layer with
    a platform-specific thing.

    BTW: I heard that some folks wrote shell compatible with PowerShell
    for Linux. Traditional Linux users probably do not care much
    about this, but would not object if you use it (it is just "another
    scripting language").

    It made a little progress and then
    crashed, so that guy restarted it, hoping that eventually it would
    finish (after a week or two he gave up and used Linux). But
    usually './configure' is not that bad. It may take a lot of
    time; IME a './configure' that runs in seconds on Linux needed
    several minutes on Windows.

    It can take several minutes on Linux too! Autoconf-generated configure scripts can contain tens of thousands of lines of code.

    It depends on the commands and the script. The shell can execute
    thousands of simple commands per second. On Linux most of the time
    probably goes to the C compiler (which is called many times from
    'configure'). On Windows the cost of process creation used to be much
    higher than on Linux, so it is likely that most of the 'configure' time
    went to process creation. Anyway, the same 'configure' script tended to
    run 10-100 times slower on Windows than on Linux. I did not try
    recently...

    And of course you need to install
    essential dependencies; a good program will tell you what you need
    to install first, before running configure. But you need to
    understand what they mean...

    * Finding a 'make' /program/ (my gcc has a program called
    mingw32-make.exe; is that the one?)

    Probably. The normal advice for Windows folks is to install a thing
    called msys (IIUC it is msys2 now) which contains several tools
    including 'make'. You are likely to get it as part of a bigger
    bundle; I am not up to date enough to tell you if this bundle will
    be called 'gcc' or something else.

    But that's just a cop-out. As I said above, it's like my delivering a
    build system for Linux that requires so many Windows dependencies, that
    you can only build by installing half of Windows.

    POSIX utilities could be quite small. On a Unix-like system one
    could fit them in something between 1-2M. On Windows there is the
    trouble that some space-saving tricks do not work (in Unix the usual
    trick is to have one program available under several names, doing
    different things depending on the name). Also, for robustness they
    may be statically linked. And people usually want versions with the
    most features, which are bigger than what is strictly necessary.
    Still, that is a rather small thing if you compare it to the size of
    Windows. Another story is the size of the C compiler and its header
    files. I looked at a compiled version of the Mac OS system interface
    for GNU Pascal; it took about 10M around 2006. And users preferred
    to have it as one large blob, because loading parts separately
    led to quadratically growing time with GNU Pascal. FYI, the
    interface contained several thousands of types and about 200000
    symbolic constants. So basically, having symbolic names and
    type checking has a significant cost and there will be some bloat
    due to this.

    I don't have any interest in this; I just want the binary!

    Well, I provide Linux binaries, but only sources for Windows
    users. One reason is that I have only Linux on my personal
    machine, so to deal with Windows I need to lease a machine.
    A different reason is that I am not paid for programming; I do
    this because I like to program and, to some degree, to build a
    community.

    I had the same problem, in reverse. I've spent money on RPis, cheap
    Linux netbooks, spent endless time getting VirtualBox to work, and still don't have a suitable Linux machine that Just Works.

    I have a bunch of Pi-s (2 original RPis, the rest similar boards from
    Chinese vendors). Probably most troubles are caused by SD cards
    and improper shutdown. In the original Pi the card (in an adapter) got
    stuck, and during removal the adapter broke. In a cheap Orange Pi Zero
    (it was $6.99 at some point), SD cards developed errors for no apparent
    reason. Still, the Pi-s mostly work fine, but if you want to store
    important files, then a USB stick or disk may be safer.

    For a laptop I bought the cheapest one in the store and it works fine.
    It is slow, but I do real work on the desktop, and need the laptop for
    travel. For me, small weight and low current consumption were most
    important. Low current consumption means a slow processor, which
    led to a low price...

    WSL is not interesting since it is still x64, and maybe things will work
    that will not work on real Linux (eg. it can still run actual Windows
    EXEs; what else is it allowing that wouldn't work on real Linux).

    I've stopped this since no one has ever expressed any interest in seeing
    my stuff work on Linux, especially on RPi where a very fast alternative
    to C that ran on the actual board would have been useful.

    Well, a compiler that cannot generate code for the Pi is not very
    interesting to run on an RPi, even if it is very fast. Your
    latest M is a step in a good direction, but suffers due to gcc
    compile time:

    time ./mc -asm mc.m
    M6 Compiling mc.m---------- to mc.asm

    real 0m0.431s
    user 0m0.347s
    sys 0m0.079s

    time ../mc mc.m
    M6 Compiling mc.m---------- to mc
    L:Invoking C compiler: gcc -omc mc.c -lm -ldl -s -fno-builtin
    mc.c:2:32: warning: unknown option after '#pragma GCC diagnostic' kind [-Wpragmas]
    #pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    real 0m9.137s
    user 0m8.746s
    sys 0m0.386s

    So 'mc' can generate C code in 0.431s, but then it takes 9.137s
    to compile the generated C (IIUC Tiny C does not support ARM, and
    even on x86_64 the compile command probably needs fixing).

    And there seems to be a per-program overhead:

    time fred.nn/mc -asm hello.m
    M6 Compiling hello.m------- to hello.asm

    real 0m0.222s
    user 0m0.186s
    sys 0m0.032s

    time fred.nn/mc hello.m
    M6 Compiling hello.m------- to hello
    L:Invoking C compiler: gcc -ohello hello.c -lm -ldl -s -fno-builtin
    hello.c:2:32: warning: unknown option after '#pragma GCC diagnostic' kind [-Wpragmas]
    #pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    real 0m1.596s
    user 0m1.464s
    sys 0m0.125s

    'hello.m' is quite small, but it needs half the time of mc, which
    is 6000 times larger. And the generated 'hello.c' is still 4381
    lines.

    On 32-bit Pi-s I have Poplog:

    http://github.com/hebisch/poplog
    http://www.math.uni.wroc.pl/~hebisch/poplog/corepop.arm

    For bootstrap one needs the binary above. For me it works on the
    original Raspberry Pi, Banana Pi, Orange Pi PC, Orange Pi Zero.
    They all have different ARM chips and run different versions
    of Linux, yet the same 'corepop.arm' works on all of them.

    If you are interested you can look at the INSTALL file in the repo
    above (skip the quick-install part that assumes a tarball and go to
    the full install). Building uses a hand-written 'configure'; it
    is very simple, but calls the C compiler up to 4 times, so the runtime
    of 'configure' on Pi-s is noticeable (of the order of 1 second).
    The actual build (using 'make') takes a few minutes, depending
    on what was configured.

    Note: one needs to install the dependencies first (described in
    INSTALL); if you do not do this, either configure or the build
    will fail.

    For massive files Poplog compiles much slower than your compiler,
    but usually much faster than gcc. However, the main advantage
    is that Poplog compiles to memory, giving you the impression of an
    interpreter, but generating machine code. It is not easy
    to measure the speed of the compiler, but for reasonably sized functions
    the compile time is short enough that one does not notice it.

    With Poplog you actually get 4 languages: Pop11, SML, Prolog and
    Common Lisp. A significant portion of Poplog is kept in source form
    and (transparently) compiled when needed. Memory use is moderate.

    Let me add that there are actually two compilers, one compiling to
    memory and a separate one which generates assembly code. The second
    compiler has low-level extensions allowing faster object code. The
    compiler compiling to memory is significantly faster than
    the one which generates assembly.

    There is also large documentation, and a nontrivial part of the build
    time is spent creating indices for the documentation (that probably
    could be made faster, but it happens rarely compared to compilation
    so nobody has cared enough to improve it).

    --
    Waldek Hebisch

  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Mon Dec 19 15:14:31 2022
    On 19/12/2022 06:15, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    But I have noticed that on Linux, distributing stuff as giant source
    bundles seems popular. I assumed that was due to difficulties in using
    binaries.

    Creating binaries that work on many system requires some effort.

    For users that is by far the simplest way. Especially on Windows, where
    you are looking for an .exe or .msi when wanting to download some application.

    Building from sources on Windows is only viable IMO if (1) everything
    required is contained within .exe and .msi and the process is
    transparent; (2) it actually works with no mysterious errors that you
    can't do anything about.

    (Why not provide the app as a binary anyway? Maybe to provide a more
    targetted executable. But I think it's also a thing now to supply
    multiple different EXEs packaged within the one binary.)


    Easiest way is to create them on oldest system that you expect to
    be used: typically compile on Linux will try to take advantage
    of features of processor and system, older system/processor
    may lack those features and fail. As I wrote, one needs to limit dependencies or bundle them. Bundling may lead to very large
    download size.

    No one cares about that anymore. I remember every trivial app on Android
    always seemed to be 48MB in size, maybe because they were packaged with
    the same tools.

    /I/ care about it because size = complexity.

    And do not forget to open source means that users can get the source.
    If source is not available, most users will ignore your program.

    Going back to Android and the 1000000 downloadable apps on Play Store:
    how many downloaders care about source code? Approximately nobody!

    Tools for programmers on desktop PCs will have a different demographic,
    but I can tell you that when I want to run a language implementation, I
    don't care about the source either. Let me try it out first.

    Concerning binaries for ARM, they are a bit more problematic than
    for Intel/AMD. Namely, there is a lot of different variants of
    ARM processors and 3 different instruction encodings in popular

    You're telling me! I've long been confused by the different numberings
    between processors and architectures. All I want to know about these
    days is, is it ARM32 or ARM64.

    <long snip>
    ARM have no support for 32-bit binaries. So, there is less
    compatiblity than technically possible.

    Sure. That's why, if you are working from source, it should have as
    simple a structure as possible, if you are a /user/ of the application.

    My view is that there are two kinds of source code:

    (1) The true sources that the /developer/ works with, with 1000s of
    source files, dozens or 100s of sprawling directories, config scripts,
    make files, the works. Plus an ecosystem of tools for static analysis, refactoring, ... you have a better idea than I do

    (2) A minimal representation which is the simplest needed to create a
    working binary. Just enough to solve the problems of diverse targets
    that you listed.

    (1) is needed for developing and debugging. (2) is used on finished,
    debugged, working programs.

    You seem to be arguing for everyone to be provided with (1).

    The rationale for the one-file source version of SQLite3 was precisely
    to make it easy to build and to incorporate. (One big file, plus 2-3
    auxiliary ones, compared with 100 separate true source files. The
    choices the developers made in the folder hierarchy etc are irrelevant.)

    Things have structure, kernel is divided into subdirectories in
    resonably logical way.

    Just getting a linear list of files in the Linux kernel sources was a mini-project. This was when it was a mere 21M lines; I can't remember
    how many files there were.

    And there are tools like 'grep', it can
    find relevant thing in seconds. This is not pure theory, I had
    puzzling problem that binaries were failing on newer Linux
    distributions. It took some time to solve, but another guy
    hinted me that this may be tighthended "security" in kernel.
    Even after the hint my first try did not work. But I was able
    to find relevant code in the kernel and then it became clear
    what to do.

    (My approach would have been different; it would not have been that big
    in the first place. I understand that 95% of sources are not relevant to
    any particular build, but I still think an OS, especially just the core
    of an OS, should not be that large.)

    Here I'm concerned only with building stuff that works, and don't want
    to know what directory structure the developers use.

    Concerning not having extension: you can add one if you want,
    moderatly popular choices are .exe or .elf.

    But nobody does. Main problem is in forums like this: if I say
    `hello.exe`, everyone knows that's a binary executable for Windows.

    It may be for Linux...

    David also replied to my points and I made clear my views about file
    extensions yesterday in my reply to him.

    At no point did I need to write an extension. It is implied by the
    program I invoked.

    Implied extentions have trouble that somebody else my try to hijack
    them. I use TeX system which produces .dvi files. IIUC they could
    be easily mishandled by systems depending just on extion. And in
    area of programming languages at least three languages compete for
    .p extention.

    I don't get your point. Somebody could still hijack DVI files, whether
    an application requires 'prog file.dvi', or 'prog file' which defaults
    to file.dvi.

    People can also write misleading 'shebang' lines or file signatures.

    Possibly you don't quite understand: aside from "./" being a syntax
    error on Windows,

    AFAIK '/' is legal in Windows pathnames (even though many programs
    do not support it). I am not sure about leading dot.

    Internally it's legal, but it's not allowed in shell commands (because
    "/" was used for command options). "." and ".." work the same way as in
    Linux.


    'configure' is a script full of Bash commands which
    invoke all sorts of utilities from Linux. It is meaningless to attempt
    to run it on Windows.

    You probably do not understand that 'configure' scripts use POSIX
    commands.

    So? The fact is that they will not work on Windows. This is an extract
    from the 30,500-line configure script for GMP:

        DUALCASE=1; export DUALCASE # for MKS sh
        if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then :
          emulate sh
          NULLCMD=:
          # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which
          # is contrary to our usage. Disable this feature.
          alias -g '${1+"$@"}'='"$@"'
          setopt NO_GLOB_SUBST
        else
          case `(set -o) 2>/dev/null` in #(
            *posix*) :
              set -o posix ;; #(
            *) :
              ;;
          esac
        fi

    This makes no sense to Windows shell, either Command Prompt or Powershell.

    (Actually the size of this file - which is not the source code - is
    bigger than the GMP DLL which is the end product.)

    It would be like my bundling a Windows BAT file with sources intended to
    be built on Linux.

    There are two important differences:
    - COMMAND.COM is very crappy as command processor, unlike Unix shell
    which from the start was designed as programming language. IIUC that
    is changing with PowerShell.
    - you compare thing which was designed to be portability layer with
    platform specific thing.

    I don't really care about COMMAND.COM either. I use only the minimum
    features.

    But I'm not sure what point you're making: are you trying to excuse
    those 30,500 lines of totally useless crap by saying that COMMAND.COM
    should be able to run it?

    It can takes several minutes on Linux too! Auto-conf-generated configure
    scripts can contain tens of thousands of lines of code.

    It depends on commands and script. Shell can execute thousends of
    simple commands per second. On Linux most of time probably goes
    to C compiler (which is called many times from 'confugure').
    On Windows cost of process creation used to be much higher than
    on Linux, so it is likely that most 'configure' time went to
    process creation. Anyway, the same 'configure' script tended to
    run 10-100 times slower on Windows than on Linux. I did not try
    recently...

    It shouldn't need to run at all. I still have little idea what it
    actually does, or why, if it is to determine system parameters, that
    can't be done once and for all per system, and not repeated per
    application and per full build.
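
    From what I can gather, what it mostly boils down to for the C code is
    writing a config.h of feature-test macros which the sources then branch
    on. A hedged sketch, with the two defines standing in for a generated
    config.h (the HAVE_* names follow the usual autoconf convention; the
    rest is made up):

        #define HAVE_UNISTD_H 1     /* configure found <unistd.h> */
        #define HAVE_SYSCONF  1     /* configure found sysconf()  */

        #include <stdio.h>
        #ifdef HAVE_UNISTD_H
        #include <unistd.h>
        #endif

        static long page_size(void)
        {
        #if defined(HAVE_UNISTD_H) && defined(HAVE_SYSCONF)
            return sysconf(_SC_PAGESIZE);   /* the POSIX way, when available */
        #else
            return 4096;                    /* fallback guess */
        #endif
        }

        int main(void)
        {
            printf("page size: %ld\n", page_size());
            return 0;
        }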

    But that's just a cop-out. As I said above, it's like my delivering a
    build system for Linux that requires so many Windows dependencies, that
    you can only build by installing half of Windows.

    POSIX utilities could be quite small. On Unix-like system one
    could fit them in something between 1-2M. On Windows there is
    trouble that some space saving ticks do not work (in Unix usual
    trick is to have one program available under several names, doing
    different thing depending on name). Also, for robustness they
    may be staticaly linked. And people usually want versions with
    most features, which are bigger than what is strictly necessary.
    Still, that is rather small thing if you compare to size of
    Windows.

    The size of Windows doesn't matter, since it is not something that
    somebody needs to locate, download and install. It will already exist.

    Besides my stuff needs a tiny fraction of its functionality: basically,
    a file system.


    Another story is size of C compiler and its header
    files.

    (My mc.c doesn't use any header files, you might have noticed! Not even
    stdio.h. Hence the need for -fno-builtin etc to shut up gcc, but that
    varies across OSes.)

    WSL is not interesting since it is still x64, and maybe things will work
    that will not work on real Linux (eg. it can still run actual Windows
    EXEs; what else is it allowing that wouldn't work on real Linux).

    I've stopped this since no one has ever expressed any interest in seeing
    my stuff work on Linux, especially on RPi where a very fast alternative
    to C that ran on the actual board would have been useful.

    Well, compiler that can not generate code for Pi is not very
    interesting to run on RPi, even if it is very fast.

    I first looked at RPi1 ten years ago. But I didn't find enough incentive
    to target ARM32 natively.

    Still, I have a way to write systems code for RPi in my language, even
    if it annoyingly has to go through C.

    Your
    latest M is step in good direction, but suffers due to gcc
    compile time:

    time ./mc -asm mc.m
    M6 Compiling mc.m---------- to mc.asm

    (This a bug: for 'mc', -asm does the same thing as -c, and writes mc.c,
    but says it is writing mc.asm. Use -c option for C output without
    compiling via C compiler. Note that this may have overwritten mc.c.)


    real 0m0.431s
    user 0m0.347s
    sys 0m0.079s

    Gcc is not a good backend for the M compiler, since compilation speed
    hits a brick wall as soon as it is invoked.

    Much more suitable is tcc, but I couldn't make that the default, since
    it might not be installed. (I suppose it could tentatively try tcc, then
    fall back to gcc.)

    On my WSL (some tidying done):

    /mnt/c/c# time ./mc -gcc mc -out:mc2 # -gcc is default
    M6 Compiling mc.m---------- to mc2
    L:Invoking C compiler: gcc -omc2 mc2.c -lm -ldl -s -fno-builtin
    real 0m1.362s
    user 0m1.139s
    sys 0m0.109s

    /mnt/c/c# time ./mc -tcc mc -out:mc2
    M6 Compiling mc.m---------- to mc2
    L:Invoking C compiler: tcc -omc2 mc2.c -lm -ldl
    -fdollars-in-identifiers
    real 0m0.135s
    user 0m0.058s
    sys 0m0.012s

    So tcc is 10 times the speed of using gcc (or maybe 20; I don't get
    these readings). Note, I used gcc -O3 to get ./mc (the executable!), but
    I get the same results with -O0.

    /mnt/c/c# time ./mc -c mc
    M6 Compiling mc.m---------- to mc.c
    real 0m0.072s
    user 0m0.014s
    sys 0m0.021s
    root@DESKTOP-11:/mnt/c/c#

    This is the time to translate to C only. That is, 70ms or 14ms, probably
    the former.

    Going from mc.m to mc.exe directly on Windows is faster (I think):

    c:\mx>TM mm mc
    M6 Compiling mc.m---------- to mc.exe
    TM: 0.07

    That is unoptimised code. If I get an optimised version of mm.exe like
    this: `mc -opt mm` (the main reason I again have a C target), then it is:

    c:\mx>TM mm mc
    M6 Compiling mc.m---------- to mc.exe
    TM: 0.05



    time ../mc mc.m
    M6 Compiling mc.m---------- to mc
    L:Invoking C compiler: gcc -omc mc.c -lm -ldl -s -fno-builtin
    mc.c:2:32: warning: unknown option after '#pragma GCC diagnostic' kind [-Wpragmas]
    #pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    real 0m9.137s
    user 0m8.746s
    sys 0m0.386s

    So 'mc' can generate C code in 0.431s, but then it takes '9.137s'
    to compile generated C (IIUC Tiny C does not support ARM and
    even on x86_64 compile command probably needs fixing).

    I think it does; I seem to remember using it. It was a major
    disincentive to taking my own fast compiler further.

    I assume you've tried 'apt-get install tcc'.

    And there seem to be per-program overhead:

    time fred.nn/mc -asm hello.m
    M6 Compiling hello.m------- to hello.asm

    real 0m0.222s
    user 0m0.186s
    sys 0m0.032s

    time fred.nn/mc hello.m
    M6 Compiling hello.m------- to hello
    L:Invoking C compiler: gcc -ohello hello.c -lm -ldl -s -fno-builtin
    hello.c:2:32: warning: unknown option after '#pragma GCC diagnostic' kind [-Wpragmas]
    #pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"

    (So many incompatible ways in gcc to get it to say nothing about
    declarations like:

    extern int puts(unsigned char*);

    I thought one point of having an optional stdio.h was to be able to
    easily override such functions.)

    real 0m1.596s
    user 0m1.464s
    sys 0m0.125s

    'hello.m' is quite small, but is needs half of time of mc which
    is 6000 times larger. And generated 'hello.c' is still 4381
    lines.

    To get a clearer picture of what is taking the time, try:

    time ./mc -c hello.m
    time gcc hello.c -ohello @gflags
    time gcc hello.c -ohello @tflags

    It's annoying that on Linux, gcc needs those 3 extra options not needed
    on Windows. And that tcc, because it doesn't support $ in names, needs
    that other long option. So gflags/tflags contain what are necessary.

    On my WSL, hello.m -> hello.c takes 38ms real time. The gcc build of
    hello.c takes 340ms, and tcc takes 17ms.

    A regular 5-line hello.c takes 125ms and 10ms respectively. When I can
    get -nosys/-minsys working with the C target, then generated C files can
    get smaller for certain programs.

    (On Windows, -nosys and -minsys options remove all or most of the
    standard library. Executables go down to 2.5KB or 3KB instead of 50KB
    minimum, but no or minimal library features are available.

    These don't work yet with mc for C target. Full list of options are in mm_cli.m.)


    On 32-bit Pi-s I have Poplog:

    http://github.com/hebisch/poplog http://www.math.uni.wroc.pl/~hebisch/poplog/corepop.arm

    For bootstrap one needs binary above. For me it works on
    oryginal Raspberry Pi, Banana Pi, Orange Pi Pc, Orange Pi Zero.
    They all have different ARM chips and run different version
    of Linux, yet the same 'corepop.arm' works on all of them.

    If you are interested you can look at INSTALL file in repo
    above (skip quick install part that assumes tarball and go to
    full install). Building uses had written 'configure', it
    is very simple, but calls C compiler up to 4 times, so runtime
    of 'configure' on Pi-s is noticeable (of order of 1 second).
    Actual build (using 'make') takes few minutes, depending
    on what was configured.

    I'll have to dig out my Pi boards. I have RPi1 and RPi4. I got the latter
    because I wanted to try ARM64, but it turns out that mature OSes for it
    were still mostly 32-bit. The only 64-bit OS worked poorly. That was 3
    years ago.

    For massive files Poplog compiles much slower than your compiler
    but usually much faster than gcc. However, main advantage
    is that Poplog compiles to memory giving you impression of
    interpreter, but generating machine code.

    That's what my 'mm' compiler does on Windows, using -run option. Or if I
    rename it 'ms', that is the default:

    c:\mx>ms mm -run \qx\qq \qx\hello.q
    Compiling mm.m to memory
    Compiling \qx\qq.m to memory
    Hello, World! 19-Dec-2022 15:00:34

    (There's an issue ATM building ms with ms.) This builds the M compiler
    from source, which builds the Q interpreter from source, which then runs hello.q. The whole process took 0.18 seconds.

    But I can't do this via the C target. (Support native code for Linux on
    x64 is a possibility, but is of limited use and interest for me.)

    Let me add that there are actually two compilers, one compiling to
    memory and separate one which generates assembly code. The second
    compiler has low-level extenstions allowing faster object code.
    Compiler compiling to memory is is significantly faster than
    the one which generates assembly.

    Generating lots of ASM source is slow, but you would need to generate
    millions of lines to see much of a difference.

  • From David Brown@21:1/5 to Bart on Mon Dec 19 16:39:33 2022
    On 19/12/2022 00:18, Bart wrote:
    On 18/12/2022 17:17, antispam@math.uni.wroc.pl wrote:

      As-is it failed on 64-bit ARM.
    More precisly, initial 'mc.c' compiled fine, but it could not
    run 'gcc'.  Namely, ARM gcc does not have '-m64' option.  Once
    I removed this it works.

    gcc doesn't have `-m64`, really? I'm sure I've used it even on ARM. (How
    do you tell it it to generate ARM32 rather than ARM64 code?)


    gcc treats "x86" as one backend, with "-m32" and "-m64" variants for
    32-bit and 64-bit x86, for historical reasons and because it is
    convenient for people running mixed systems. But ARM (32-bit) and
    AArch64 (64-bit) are different backends as they are very different architectures.

    So you need separate gcc builds for 32-bit and 64-bit ARM, not just a flag.

    (I make no comment as to whether this is a good thing or a bad thing -
    I'm just saying how it is.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to antispam@math.uni.wroc.pl on Mon Dec 19 16:51:17 2022
    On 19/12/2022 07:15, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:




    Possibly you don't quite understand: aside from "./" being a syntax
    error on Windows,

    AFAIK '/' is legal in Windows pathnames (even though many programs
    do not support it). I am not sure about leading dot.

    A single dot counts as "the current directory" in Windows, just like in
    *nix.

    But forward slashes are not allowed in Windows filenames, along with
    control characters (0x00 - 0x1f, 0x7f), ", *, /, \, :, <, >, ?, | and
    certain names that match old DOS device names.

    *nix systems typically allow everything except / and NULL characters.
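
    As a rough C sketch of those rules (illustration only - reserved device
    names such as CON or NUL would need a separate check):

        /* Sketch: reject characters that cannot appear in a Windows file
           name component; reserved device names need a separate check. */
        #include <stdbool.h>
        #include <stdio.h>
        #include <string.h>

        static bool valid_windows_name_char(unsigned char c)
        {
            if (c <= 0x1F || c == 0x7F)                /* control characters */
                return false;
            return strchr("\"*/\\:<>?|", c) == NULL;   /* " * / \ : < > ? | */
        }

        int main(void)
        {
            const char *name = "hello,world.c";        /* commas are fine */
            for (const char *p = name; *p; p++)
                if (!valid_windows_name_char((unsigned char)*p))
                    printf("illegal character: %c\n", *p);
            return 0;
        }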


    (I discovered recently that in Powershell on Windows, you need ./ to run executables in the current directory, unless you mess with your $PATH -
    just like on *nix.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bart on Mon Dec 19 19:54:44 2022
    On 19/12/2022 15:14, Bart wrote:

    For massive files Poplog compiles much slower than your compiler
    but usually much faster than gcc. However, the main advantage
    is that Poplog compiles to memory, giving you the impression of an
    interpreter while generating machine code.

    That's what my 'mm' compiler does on Windows, using -run option. Or if I rename it 'ms', that is the default:

       c:\mx>ms mm -run \qx\qq \qx\hello.q
       Compiling mm.m to memory
       Compiling \qx\qq.m to memory
       Hello, World! 19-Dec-2022 15:00:34

    (There's an issue ATM building ms with ms.)

    That problem has gone, it just needed a tweak (usually ms.exe is a
    renaming of mm, rather than being built directly). Now I can do:

    c:\mx>ms ms ms ms ms ms ms ms ms ms ms \qx\qq \qx\hello
    Hello, World! 19-Dec-2022 19:36:43

    (Compiler messages are normally suppressed to avoid spoiling the
    illusion that it's a script, but I showed them above.)

    This builds 10 generations of the compiler in-memory, before building
    the interpreter. This took 0.75 seconds (each version is necessarily unoptimised, only the first might be).

    You wouldn't normally do this, but it illustrates the ability to run
    native code applications, including compilers, directly from source
    code. Not quite JIT, but different from a normal, one-time AOT build.
    And this could be done at a customer site.

    But I can't do this via the C target.

    Actually it can be done, and I've just tried it. But it is an emulation:
    it creates a normal executable file then runs it. I just need to arrange
    for it to pick up the trailing command line parameters to pass to the
    app. However it needs to use the tcc compiler to work fluently.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Tue Dec 20 00:17:41 2022
    On 19/12/2022 06:15, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c .s .o .a .so, so
    why not actual binaries, by convention?

    Here you miss the virtue of simplicity: binaries are started by the kernel
    and you pass the filename of the binary to the system call. No messing
    with extensions there. There are similar library calls that
    do a search based on PATH, again with no messing with extensions.

    I don't get this. Do you mean that inside a piece of code (ie. written
    once and executed endless times), it is better to write run("prog")
    instead of run("prog.exe"), because it saves 4 keystrokes?

    And it's an advantage in not being able to distinguish between
    "prog.exe", "prog.pl", "prog.sh" etc?

    My views are to do with providing an ergonomic /interactive/ user CLI.


    It simply doesn't make sense.

    It makes sense if you know that an executable in the PATH is simultaneously
    a shell command. You see, there are folks who really do not like
    useless clutter in their command lines. And before calling an
    executable from a shell script you may wish to check that it is available.
    Having different extensions for calling and for access as a normal
    file would complicate scripts.

    In every context I've been talking about where extensions have been
    optional and have been inferred, you have always been able to write full extensions if you want. This would be recommended inside a script run
    myriad times to make it clear to people reading or maintaining it.

    People have mentioned that on Linux you could optionally name
    executables with ".exe" or ".elf" extension. If 'gcc' (the main binary
    driver program of gcc, not gcc as a broader concept - you see the
    problems you get into!) had been named "gcc.exe", would you have had to
    type this every time you ran it:

    gcc.exe hello.c

    If so, then I think I can see the real reason why extensions are empty!

    In a Linux terminal shell, there apparently is no scope for informality
    or user-friendliness at all.

    This has led me to thinking about how command line parameters are
    separated. On either OS you normally type this:

    gcc a.c b.c c.c

    You can't do this, separate with commas, as the comma becomes part of
    each filename:

    gcc a.c, b.c, c.c

    That applies also to my bcc, but there, you CAN have comma-separated
    items inside an @file; with gcc, that still fails.

    So, what's going on here: is it an OS shell misfeature, or what?

    Well it's not the OS on Windows, since 'T.BAT a,b,c' will process a, b,
    c as separate 'a b c' items inside the script (not as "a," etc). (I
    can't test how it works on Linux.)

    My bcc fails because it obtains the command parameters via the MSVCRT
    call __getmainargs(), which turns the single command line string that
    Windows normally provides, into separated items that C expects.

    I used to work directly from the command line string (GetCommandLine)
    and chop it up manually. It looks like I will have to go back to that.

    That will provide consistency with subsequent line input from the
    console, or from a file. (In fact I will extend my Read feature which
    works on those, to work also on the command line params that follow the
    program name.)
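
    For illustration, a hypothetical C sketch of that approach (not the
    actual bcc source): fetch the raw command line with GetCommandLineA()
    and split it by hand, treating commas as well as spaces as separators
    and respecting quotes:

        /* Hypothetical sketch (not the actual bcc code): read the raw
           command line with GetCommandLineA() and split it manually,
           treating spaces and commas as separators and honouring quotes.
           Item 1 will be the program name itself. */
        #include <windows.h>
        #include <stdio.h>
        #include <string.h>

        int main(void)
        {
            char line[4096];
            strncpy(line, GetCommandLineA(), sizeof line - 1);
            line[sizeof line - 1] = '\0';

            int n = 0;
            char *p = line;
            while (*p) {
                while (*p == ' ' || *p == ',') ++p;    /* skip separators */
                if (!*p) break;
                char *start;
                if (*p == '"') {                       /* quoted: keep commas/spaces */
                    start = ++p;
                    while (*p && *p != '"') ++p;
                } else {
                    start = p;
                    while (*p && *p != ' ' && *p != ',') ++p;
                }
                char *end = p;
                if (*p) ++p;                           /* step past quote/separator */
                printf("%d: %.*s\n", ++n, (int)(end - start), start);
            }
            return 0;
        }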

    So, this has been productive after all.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 20 02:44:38 2022
    Bart <bc@freeuk.com> wrote:
    On 19/12/2022 06:15, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c .s .o .a .so, so
    why not actual binaries, by convention?

    Here you miss the virtue of simplicity: binaries are started by the kernel
    and you pass the filename of the binary to the system call. No messing
    with extensions there. There are similar library calls that
    do a search based on PATH, again with no messing with extensions.

    I don't get this. Do you mean that inside a piece of code (ie. written
    once and executed endless times), it is better to write run("prog")
    instead of run("prog.exe"), because it saves 4 keystrokes?

    I mean that user input can go unchanged to system calls.
    If the user types 'prog' and the library function gets 'prog.exe', then
    there must be some code in between which messes with file extensions.
    If your code handles file names obtained from the user, then to
    present a consistent interface to users you also must handle the
    issue. So from a single place it spreads out to many other places.

    It simply doesn't make sense.

    It makes sense if you know that executable in the PATH is simultaneousy shell command. You see, there are folks which really do not like
    useless clutter in their command lines. And before calling
    executable from a shell script you may wish to check if it is available. Having different extention for calling and for access as normal
    file would complicate scripts.

    In every context I've been talking about where extensions have been
    optional and have been inferred, you have always been able to write full extensions if you want. This would be recommended inside a script run
    myriad times to make it clear to people reading or maintaining it.

    Your "full extension" may be tricky. In a directory I may have:

    a.exe
    a.exe.gz

    When I type 'a', do you use 'a.exe' or 'a.exe.gz'? Similarly,
    when I type 'a.exe', would you use it or 'a.exe.gz'?

    People have mentioned that on Linux you could optionally name
    executables with ".exe" or ".elf" extension. If 'gcc' (the main binary
    driver program of gcc, not gcc as a broader concept - you see the
    problems you get into!) had been named "gcc.exe", would you have had to
    type this every time you ran it:

    gcc.exe hello.c

    If so, then I think I can see the real reason why extensions are empty!

    You are slowly getting it. Use just 'gcc' as the filename and you will
    be fine.

    In a Linux terminal shell, there apparently is no scope for informality
    or user-friendliness at all.

    Let me just say that you can have whatever program you want as
    a shell. 'sh' from the start was intended as a programmer's tool
    and a programming language. There are other shells; in particular
    'csh' was intended as an interactive shell for "normal" users.
    But descendants of 'sh' got more features. Concerning
    user-friendliness, 'bash' has had command line editing, history search
    and tab completion for ages. And shell loops are quite usable
    from the command line, so a single command can do what would otherwise
    require a program in a file or a lot of individual commands.
    'zsh' tries to correct spelling; some folks consider this
    friendly (I do not use 'zsh').

    This has led me to thinking about how command line parameters are
    separated. On either OS you normally type this:

    gcc a.c b.c c.c

    You can't do this, separate with commas, as the comma becomes part of
    each filename:

    gcc a.c, b.c, c.c

    That applies also to my bcc, but there, you CAN have comma-separated
    items inside an @file; with gcc, that still fails.

    Why would you do such a silly thing? If you really want, you can
    redefine 'gcc' so that it strips trailing commas (that is
    trivial). If you like excess characters you can type something
    longer like:

    echo gcc a.c, b.c, c.c | tr -d ',' | bash

    So, what's going on here: is it an OS shell misfeature, or what?

    KISS principle. Commas are legal in filenames and potentially
    useful. On the command line spaces work fine. If you really need
    splitting to work differently there are reasonably simple ways
    to do this; the most crude is above.

    BTW: travelling between the UK and other countries, do you complain
    that cars drive on the wrong side of the road?

    Well it's not the OS on Windows, since 'T.BAT a,b,c' will process a, b,
    c as separate 'a b c' items inside the script (not as "a," etc). (I
    can't test how it works on Linux.)

    There is the IFS variable, which lists the characters used for word
    splitting; you can put a comma there together with whitespace. I have
    never used it myself, but it is used extensively in hairy shell
    scripts like 'configure'.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Dec 20 07:56:13 2022
    On 18/12/2022 18:09, Bart wrote:
    On 18/12/2022 13:05, David Brown wrote:
    On 17/12/2022 14:22, Bart wrote:


    For data files, it can often be convenient to have an extension
    indicating the type - and it is as common on Linux as it is on Windows
    to have ".odt", ".mp3", etc., on data files.

    It's convenient for all files. And before you say, I can add a .exe
    extension if I want: I don't want to have to write that every time I run
    that program.

    You can add ".exe" if you want, and then it is part of the name of the
    program - so you use it when naming the program. It's not really very difficult.


    People use extensions where they are useful, and skip them when they
    are counter-productive (such as for executable programs).

    I can't imagine all my EXE (and perhaps BAT) files having no
    extensions. Try to envisage all your .c files having no extensions by
    default. How do you even tell that they are C sources and not Python or
    executables?


    I treat my .c files differently from my program files. Why should the
    same rules apply?

    I treat the ELF executables, shell files, python programs and other
    executable programs the same - I run them. Why would I need a different
    file extension for each? I don't need to run them in different ways -
    the OS (and shell) figure out the details of how to run them so that I
    don't need to bother.

    When you are writing code, and you have a function "twiddle" and an
    integer variable "counter", you call them "twiddle" and "counter".
    You don't call them "twiddle_func" and "counter_int".  But maybe
    sometimes you find it useful - it's common to write "counter_t" for a
    type, and maybe you'd write "xs" for an array rather than "x".
    Filenames can follow the same principle - naming conventions can be
    helpful, but you don't need to be obsessive about it or you end up
    with too much focus on the wrong thing.

    But you /do/ write twiddle.c, twiddle.s, twiddle.o, twiddle.cpp,
    twiddle.h etc? Yet the most important file of all, is just plain 'twiddle'!


    Yes.

    In casual writing or conversation, how do you distinguish 'twiddle the
    binary executable' from 'twiddle the folder', from 'twiddle the
    application' (an installation), from 'twiddle' the project, etc., without
    having to use that qualification?

    Using 'twiddle.exe' does that succinctly and unequivocally.


    Sure. But so does "twiddle the executable". I don't generally talk in
    shell commands or file names - it's just not an issue that has any
    relevance.


    On *nix, every file with the executable flag can be executed - that's
    what the flag is for.

    Sometimes it is convenient to be able to see which files in a
    directory are executables, directories, etc.  That's why "ls" has
    flags for colours or to add indicators for different kinds of files.
    ("ls -F --color").

    As I said, if it's convenient for data and source files, it's convenient
    for all files.


    Why do you care?

    But there are also ways to execute .c files directly, and of course
    .py files which are run from source anyway.


    There are standards for that.  A text-based file can have a shebang
    comment ("#! /usr/bin/bash", or similar) to let the shell know what
    interpreter to use.  This lets you distinguish between "python2" and
    "python3", for example, which is a big improvement over Windows-style
    file associations that can only handle one interpreter for each file
    type.

    That is invasive. It takes something that is really an attribute of the
    file name and puts it inside the file, requiring the file
    to be opened and read to find out.

    You seem to have misread the paragraph.


    (Presumably every language that runs on Linux needs to accept '#' as a
    line comment? And you need to build into every one of 10,000 source
    files the direct location of the Python2 or Python3 installation on that
    machine? Is that portable across OSes? But I expect it's smarter than
    that.)

    Of course it is smarter than that.

    The great majority of languages typically used for scripting (that is,
    running directly without compiling) are happy with # as a comment
    character. Even C interpreters, as far as I saw with a very quick
    google check, are happy with a #! shebang in the first line.
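
    For example, with the Tiny C Compiler installed, a C source file that is
    marked executable can itself be run via a shebang (the path below is an
    assumption about where tcc lives on a given system):

        #!/usr/bin/tcc -run
        /* Illustration: tcc's -run mode lets a C file act as a "script";
           the interpreter path above depends on the local install. */
        #include <stdio.h>

        int main(void)
        {
            printf("Hello from a C file run via its shebang line\n");
            return 0;
        }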

    A key point here is that almost every general-purpose OS, other than
    Windows, in modern use on personal computers is basically POSIX
    compliant. (And even Windows has had some POSIX compliance since the
    first NT days.) One of the things POSIX defines is required placement
    of a large number of files and programs, and required support from
    things like a standard POSIX shell. So a shell script can start with "#!/bin/sh", and be sure of running on every POSIX system - Linux, Macs, embedded Linux, Solaris, AIX, Windows WSL, msys, whatever. If it wants
    Python, it can have "#!/usr/bin/python". If it wants Python 2.5
    specifically, it can have "#!/usr/bin/python2.5". (Of course there is
    no guarantee that a given system has Python 2.5 installed, but almost
    all will have /some/ version of Python, and it can be found at /usr/bin/python.)

    (That does not mean Python has to be installed at /usr/bin/python - it
    means it must be /found/ there. Symbolic links are used widely to keep filesystems organised while letting files be found in standard places.)

    These things are not Linux-specific. They predate Linux, and are
    ubiquitous in the *nix world.



    With Python, you're still left with the fact that you see a file with a
    .py extension, and don't know if it's Py2 or Py3, or Py3.10 or Py3.11,
    or whether it's a program that works with any version. It is a separate problem from having, as convention, no extensions for ELF binary files.


    Exactly my point. That's why in the *nix world, you use a shebang that
    can be as specific as you want or as general as you can, in regard to
    versions. You can have a file extension if you like, but it is not
    needed or used in order to find the interpreter for the script, so most
    people don't bother for their executables. If you have an OS that
    relies solely on file extensions, however, you do not have that
    flexibility - it works in simple cases (one Python version installed)
    but not in anything more.

    On Windows, I always have to start my Python programs explicitly with "C:\Python2.8\python prog.py", or equivalent, precisely because of
    Windows limitations.

    File extensions for executable types seems like a nice idea at the
    start, but is quickly shown to be limiting and inflexible.

      And the *nix system distinguishes between executable files and
    non-executables by the executable flag - that way you don't
    accidentally try to execute non-executable Python files.

    (So there are files that contain Python code that are non-executable?
    Then what is the point?)


    Maybe you haven't done much Python programming and have only worked with
    small scripts. But like any other language, bigger programs are split
    into multiple files or modules - only the main program file will be
    executable. So if a big program has 50 Python files, only one of them
    will normally be executable and have the shebang and the executable
    flag. (Sometimes you'll "execute" other modules to run their tests
    during development, but you'd likely do that as "python3 file.py".)


    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc, it
    is with the name of a .c source file. And 99.9% of the times when I
    invoke it on prog.c as the first or only file to create an executable,
    then I want to create prog.exe.


    OK. So gcc should base its handling of input on what /you/ do, never
    mind the rest of the world? That's fine for your own tools, but not for
    gcc.

    So its behaviour is unhelpful. After the 10,000th time you have to type
    .c, or backspace over .c to get at the name itself to modify, it becomes tedious.


    Every serious developer uses build programs - or at least a DOS batch
    file - for major programming work.

    Now it's not that hard to write a wrapper script or program on top of gcc.exe, but if it isn't hard, why doesn't it just do that?


    gcc is already a wrapper for a collection of tools and compilers.

    It's not a simple C compiler that assumes everything it is given is a
    C file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension, and also,
    bizarrely, generates `a.out` as the object file name.

    "a.out" is the standard default name for executables on *nix, used by
    all tools - it's hardly bizarre, even though you rarely want the default.

    Like most *nix tools, gas can get its files from multiple places,
    including pipes. And you can call your files anything you like -
    "file.s", "file.asm", "file", "file.x86asm", "file.version2.4.12", etc.
    It would be a very strange idea to decide it should only take part of
    the file name even if it only accepts one type of file.

    In *nix, the dot is just a character, and file extensions are just part
    of the name. You can have as many or as few as you find convenient and helpful.


    If you intend to assemble three .s files to object files, using separate
    'as' invocations, they will all be called a.out!

    That would be crass even for a toy program written by a student. And yet
    here it is, a mainstream product used by millions of people.

    All my language programs (and many of my apps), have a primary type of
    input file, and will default to that file extension if omitted. Anything
    else (eg .dll files) need the full extension.

    Here's something funny: take hello.c and rename to 'hello', with no extension. If I try and compile it:

        gcc hello

    it says: hello: file not recognised: file format not recognised. Trying
    'gcc hello.' is worse: it can't see the file at all.


    How is that "funny" ? It is perfectly clear behaviour.

    gcc supports lots of file types. For user convenience it uses file
    extensions to tell the file type unless you want to explicitly inform it
    of the type using "-x" options.

    <https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html>

    "hello" has no file extension, so the compiler will not assume it is C. (Remember? gcc is not just a simple little dedicated C compiler.)
    Files without extensions are assumed to be object files to pass to the
    linker, and your file does not fit that format.

    "hello." is a completely different file name - the file does not exist.
    It is an oddity of DOS and Windows that there is a hidden dot at the end
    of files with no extension - it's a hangover from 8.3 DOS names.



    So first, on Linux, where file extensions are supposed to be optional,
    gcc can't cope with a missing .c extension; you have to provide extra
    info. Second, on Linux, "hello" is a distinct file from "hello.".


    Yes. It's the only sane way, and consistent with millions of programs
    spanning 50 years on huge numbers of systems.

    With bcc, I just have to type "bcc hello." to make it work. A trailing
    dot means an empty extension.

    When you make your own little programs for your own use, you can pick
    your own rules.


    On Linux, you just write "make hello" - you don't need a makefile for
    simple cases like that.

    OK... so how does 'make' figure out the file extension?

    "make" has a large number of default rules built in. When you write
    "make hello", you are asking it to create the file "hello". It searches through its rules looking for ones that can be triggered and which match
    files that are found in the directory. One of these rules is how to
    compile and link a file "%.c" into an executable "%" - so it applies that.


    'Make' anyway has different behaviour:

    * It can choose not to compile

    * On Windows, it says this:

      c:\yyy>make hello
      cc     hello.c   -o hello
      process_begin: CreateProcess(NULL, cc hello.c -o hello, ...) failed.
      make (e=2): The system cannot find the file specified.
      <builtin>: recipe for target 'hello' failed
      make: *** [hello] Error 2


    Do you have a program called "cc" on your path? It's unlikely. "cc" is
    the standard name for the system compiler, which may be gcc or may be
    something else entirely.

    * I also use several C compilers; how does make know which one I intend?
    How do I pass it options?


    It uses the POSIX standards. The C compiler is called "cc", the flags
    passed are in the environment variable CFLAGS.

    If that's not what you want, write a makefile.

    If I give another example:

       c:\c>bcc cipher hmac sha2
       Compiling cipher.c to cipher.asm
       Compiling hmac.c to hmac.asm
       Compiling sha2.c to sha2.asm
       Assembling to cipher.exe

    it just works. 'make cipher hmac sha2' doesn't, not even in WSL.

    (And the "advanced AI" can figure out if it is C, C++, Fortran, or
    several other languages.)

    No, it can't. If I have hello.c and hello.cpp, it will favour the .c file.


    Sorry, I should have specified that the "advanced AI" can do it on an
    advanced OS, such as every *nix system since before Bill Gates found
    MS-DOS in a dustbin.


    File extensions are tremendously helpful. But that doesn't mean you have
    to keep typing them! They just have to be there.


    Exactly. You just have a very DOS-biased view as to when they are
    helpful, and when they are not. It's a backwards and limited view due
    to a lifetime of living with a backwards and limited OS.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 20 11:07:34 2022
    Bart <bc@freeuk.com> wrote:
    On 18/12/2022 13:05, David Brown wrote:
    On 17/12/2022 14:22, Bart wrote:


    When you are writing code, and you have a function "twiddle" and an
    integer variable "counter", you call them "twiddle" and "counter". You
    don't call them "twiddle_func" and "counter_int". But maybe sometimes
    you find it useful - it's common to write "counter_t" for a type, and
    maybe you'd write "xs" for an array rather than "x". Filenames can
    follow the same principle - naming conventions can be helpful, but you
    don't need to be obsessive about it or you end up with too much focus on
    the wrong thing.

    But you /do/ write twiddle.c, twiddle.s, twiddle.o, twiddle.cpp,
    twiddle.h etc? Yet the most important file of all, is just plain 'twiddle'!

    During development you normally have several source files
    per executable. And normally the executable is the final product. So
    rules based on extensions help in managing the build but are not
    useful for the executable. There are cases when the executable is
    _not_ a final product; in such cases people add extensions
    to executables to allow simple rules. "Importance" actually
    works in the opposite direction to the one you would like to imply:
    the set of possible short names is limited, so something must be
    important enough to get a short name (without an extension).

    If you intend to assemble three .s files to object files, using separate
    'as' invocations, they will all be called a.out!

    That would be crass even for a toy program written by a student. And yet
    here it is, a mainstream product used by millions of people.

    Well, it would be crass not to specify the output in such a case. Students
    have no trouble learning this.

    * I also use several C compilers; how does make know which one I intend?
    How do I pass it options?

    We have Makefiles for that. My Makefile for microcontrollers
    (STM32F1) has at the start:

    CM_INC = /mnt/m1/pom/kompi/work/libopencm3/include
    CORE_FLAGS = -mthumb -mcpu=cortex-m3
    CFLAGS = -Os -Wall -g $(CORE_FLAGS) -I $(CM_INC) -DSTM32F1

    TOOL_PPREFIX = arm-none-eabi-
    CC = $(TOOL_PPREFIX)gcc
    CXX = $(TOOL_PPREFIX)g++
    AS = $(TOOL_PPREFIX)as

    CM_INC, CORE_FLAGS and TOOL_PPREFIX are my variables which
    help to better organize the Makefile. CC is a standard make
    variable which tells make how to invoke the C compiler (by default
    make would use 'cc'). CXX does the same for C++. And
    AS specifies how to invoke the assembler.

    Without the settings above, make would invoke the normal compiler,
    generating code for the PC, which would not run on the microcontroller.
    And it would invoke the PC assembler, which cannot handle ARM assembly.

    For your use you may want something like:

    CC = bcc
    AS = fasm

    If I give another example:

    c:\c>bcc cipher hmac sha2
    Compiling cipher.c to cipher.asm
    Compiling hmac.c to hmac.asm
    Compiling sha2.c to sha2.asm
    Assembling to cipher.exe

    it just works. 'make cipher hmac sha2' doesn't, not even in WSL.

    Well, you told make that you want cipher, hmac and sha2 as results.
    If your sources are written in an appropriate way (for multiple
    executables), it would work.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 20 11:55:06 2022
    Bart <bc@freeuk.com> wrote:
    On 18/12/2022 17:17, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 13:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:

    I tested this using the following program:

    proc main=
        rsystemtime tm
        os_getsystime(&tm)
        println tm.second
        println tm.minute
        println tm.hour
        println tm.day
        println tm.month
        println tm.year
    end


    It's funny you picked on that, because the original version of my
    hello.m also printed out the time:

    proc main=
        println "Hello World!",$time
    end

    This was to ensure I was actually running the just-built version, and
    not the last of the 1000s of previous ones. But the time-of-day support
    for Linux wasn't ready so I left it out.

    I've updated the mc.c/mc.ma files (not hello.m, I'm sure you can fix that).

    However getting this to work on Linux wasn't easy as it kept crashing.
    The 'struct tm' record ostensibly has 9 fields of int32, so has a size
    of 36 bytes. And on Windows it is. But on Linux, a test program reported
    the size as 56 bytes.

    Doing -E on that program under Linux, the struct actually looks like this:

    struct tm
    {
        int tm_sec;
        int tm_min;
        int tm_hour;
        int tm_mday;
        int tm_mon;
        int tm_year;
        int tm_wday;
        int tm_yday;
        int tm_isdst;

        long int tm_gmtoff;
        const char *tm_zone;
    };

    16 extra bytes for fields not mentioned in the 'man' docs, plus 4 bytes of
    alignment padding, account for the extra 20 bytes. This is typical of the
    problems in adapting C APIs to the FFIs of other languages.
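
    A quick way to check this sort of thing before writing an FFI declaration
    is a throwaway C program (just a sketch; sizes and offsets vary by
    platform):

        /* Throwaway layout check: print the size of struct tm and a couple
           of field offsets for the C library actually in use.  On a typical
           Windows CRT the size is 36 (nine ints); on 64-bit glibc it is 56,
           because tm_gmtoff and a pointer tm_zone follow the standard fields. */
        #include <stdio.h>
        #include <stddef.h>
        #include <time.h>

        int main(void)
        {
            printf("sizeof(struct tm) = %lu\n", (unsigned long)sizeof(struct tm));
            printf("tm_sec   offset   = %lu\n", (unsigned long)offsetof(struct tm, tm_sec));
            printf("tm_isdst offset   = %lu\n", (unsigned long)offsetof(struct tm, tm_isdst));
            return 0;
        }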

    Sorry for that, it worked on my machine so I did not check the struct
    size.


    BTW: I still doubt that 'mc.ma' expands to true source: do you
    really write no comments in your code?

    The file was detabbed and decommented, as the comments would be full of ancient crap, mainly debugging code that never got removed. I've tidied
    most of that up, and now the file is just detabbed (otherwise things
    won't line up properly). Note the sources are not heavily commented anyway.

    It will always be a snapshot of the actual sources, which are not kept on-line and can change every few seconds.

    You are misusing git and github. git is a "source control" system.
    At least from my point of view (there are a lot of flame wars discussing
    what source control should do) the main task of source control is
    to store all significant versions of software and allow reasonably
    easy retrieval of any version. Logically git stores a separate
    source tree for each version (plus some meta info like log messages).
    Done naively this would lead to serious bloat: with 1547 versions
    it would be almost 1547 times larger than a single version. git
    uses compression to reduce this. AFAICS the actual sources of your
    projects are about 4-5M. With normal git use I would expect the
    (compressed) history to add another 5-10M (if there are a lot of
    deletions then the history would be bigger). Your repo is bigger than
    that, probably due to generated files and .exe files. Note: I understand
    that if you write in your own language, then bootstrap is a problem.
    But for bootstrap mc.c is enough. OK, you want to be independent
    from C, so maybe the .exe is justified. But the .ma files just add bloat.
    Note that github has a release feature; people who want just binaries or
    a single version can fetch a release. And many projects with a bootstrap
    problem say: if you do not have the compiler, fetch an earlier binary
    and use it to build the system. Or they add extra generated things
    to releases but do not keep them in the source repository.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to antispam@math.uni.wroc.pl on Tue Dec 20 14:43:57 2022
    On 20/12/2022 03:44, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    This has lead me to thinking about how command line parameters are
    separated. On either OS you normally type this:

    gcc a.c b.c c.c

    You can't do this, separate with commas, as the comma becomes part of
    each filename:

    gcc a.c, b.c, c.c

    That applies also to my bcc, but there, you CAN have comma-separated
    items inside an @file; with gcc, that still fails.

    Why would you do such a silly thing? If you really want, you can
    redefine 'gcc' so that it strips trailing commas (that is
    trivial). If you like excess characters you can type something
    longer like:

    echo gcc a.c, b.c, c.c | tr -d ',' | bash

    So, what's going on here: is it an OS shell misfeature, or what?

    KISS principle. Commas are legal in filenames and potentially
    useful. On the command line spaces work fine. If you really need
    splitting to work differently there are reasonably simple ways
    to do this; the most crude is above.

    BTW: travelling between the UK and other countries, do you complain
    that cars drive on the wrong side of the road?

    Well it's not the OS on Windows, since 'T.BAT a,b,c' will process a, b,
    c as separate 'a b c' items inside the script (not as "a," etc). (I
    can't test how it works on Linux.)

    There is the IFS variable, which lists the characters used for word
    splitting; you can put a comma there together with whitespace. I have
    never used it myself, but it is used extensively in hairy shell
    scripts like 'configure'.


    An important issue here is that the "OS" is not involved in any of this,
    either on Windows or on Linux.

    In *nix, the shell (not the OS) is responsible for many aspects of
    parsing command lines, including splitting up parameters and expanding wildcards in filenames. So on Linux, writing "gcc a.c, b.c, c.c" in
    bash will call gcc with three parameters - "a.c,", "b.c,", and "c.c".

    On Windows, the standard "DOS Prompt" command-line terminal does very
    little of this. (I don't know the details of Powershell. And if you
    use a different shell on Windows, like bash from msys, you get the
    behaviour of that shell.) So if you have a normal "DOS Prompt" and
    write "gcc a.c, b.c, c.c" then the program "gcc" is called with /one/ parameter. It's up to the program to decide how to parse these.
    Typically it will use one of several different WinAPI calls depending on whether it wants the abomination that is "wide characters", or UTF-8, or
    to hope that everything is simple ASCII. If a program wants to parse
    the string itself using commas as separators, it can do that too.

    Of course most programs - especially those that come from a *nix
    heritage - will choose to parse in the same way as is done by *nix shells.

    I did not know that the batch file interpreter handled commas
    differently like this. Who says you never learn things on Usenet? :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Tue Dec 20 15:39:23 2022
    On 2022-12-20 14:43, David Brown wrote:

    In *nix, the shell (not the OS) is responsible for many aspects of
    parsing command lines, including splitting up parameters and expanding wildcards in filenames.  So on Linux, writing "gcc a.c, b.c, c.c" in
    bash will call gcc with three parameters - "a.c,", "b.c,", and "c.c".

    Under UNIX a process is invoked with the argument list as an argument,
    e.g. the exec* calls. So the UNIX OS enforces a certain view of process
    parameters as a flat list of NUL-terminated strings (and not, say, a key
    map, a tree, an object, etc.).
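
    A minimal sketch of that view (the program name and file names here are
    just placeholders):

        /* Sketch: the caller hands exec* an already-split argument vector -
           a flat, NULL-terminated array of NUL-terminated strings.  The
           kernel/libc does no further splitting of its own. */
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            char *argv[] = { "gcc", "a.c", "b.c", "c.c", NULL };
            execvp(argv[0], argv);      /* returns only on failure */
            perror("execvp");
            return 1;
        }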

    On Windows, the standard "DOS Prompt" command-line terminal does very
    little of this.  (I don't know the details of Powershell.  And if you
    use a different shell on Windows, like bash from msys, you get the
    behaviour of that shell.)

    Under Windows API a process gets the command line, e.g. CreateProcess.

    The Windows approach was the standard for other OSes. With the difference
    that, say, RSX-11 provided a standard system function to parse the command
    line in a common way (DOS borrowed that syntax: DIR /C /B etc). UNIX has
    getopt for this (in its UNIX way: parse the arguments once, get garbage,
    re-sort the garbage again (:-)). Luckily Microsoft refrained from
    providing API calls to parse arguments. I am shivering imagining what kind
    of structures and how many dozens of functions they would come up
    with... (:-))

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Dec 21 00:42:09 2022
    On 20/12/2022 13:43, David Brown wrote:
    On 20/12/2022 03:44, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    This has lead me to thinking about how command line parameters are
    separated. On either OS you normally type this:

        gcc a.c b.c c.c

    You can't do this, separate with commas, as the comma becomes part of
    each filename:

        gcc a.c, b.c, c.c
    That applies also to my bcc, but there, you CAN have comma-separated
    items inside an @file; with gcc, that still fails.

    Why would you do such silly thing?  If you really want you can
    redefine 'gcc' so that it strips trailing commas (that is
    trivial).   If you like excess characters you can type longer
    thing like:

    echo gcc a.c, b.c, c.c | tr -d ',' | bash
    So, what's going on here: is it an OS shell misfeature, or what?

    KISS principle.  Commas are legal in filenames and potentially
    useful.  On command line spaces work fine.  If you really need
    splitting to work differently there are resonably simple ways
    to do this, most crude is above.

    BTW: travelling between UK and other countries do you complain
    that cars drive on wrong side of the road?

    Well it's not the OS on Windows, since 'T.BAT a,b,c' will process a, b,
    c as separate 'a b c' items inside the script (not as "a," etc). (I
    can't test how it works on Linux.)

    There is IFS variable which lists characters used for word splitting,
    you can put comma there together with whitspace.  I never used it
    myself, but it is used extensively in hairy shell scripts like
    'configure'.


    An important issue here is that the "OS" is not involved in any of this, either on Windows or on Linux.

    In *nix, the shell (not the OS) is responsible for many aspects of
    parsing command lines, including splitting up parameters and expanding wildcards in filenames.  So on Linux, writing "gcc a.c, b.c, c.c" in
    bash will call gcc with three parameters - "a.c,", "b.c,", and "c.c".

    On Windows, the standard "DOS Prompt" command-line terminal does very
    little of this.  (I don't know the details of Powershell.  And if you
    use a different shell on Windows, like bash from msys, you get the
    behaviour of that shell.)  So if you have a normal "DOS Prompt" and
    write "gcc a.c, b.c, c.c" then the program "gcc" is called with /one/ parameter.  It's up to the program to decide how to parse these.
    Typically it will use one of several different WinAPI calls depending on whether it wants the abomination that is "wide characters", or UTF-8, or
    to hope that everything is simple ASCII.  If a program wants to parse
    the string itself using commas as separators, it can do that too.

    Of course most programs - especially those that come from a *nix
    heritage - will choose to parse in the same way as is done by *nix shells.

    I did not know that the batch file interpreter handled commas
    differently like this.  Who says you never learn things on Usenet? :-)

    It all depends on how the application decides to do it.

    But most, like gcc, appear to just use the 'argv' parameter of C's main
    entry point, which does not do anything clever with commas.

    Windows itself provides the command line as one long string, either as
    one of the arguments of WinMain(), or obtained via GetCommandLine().

    Then applications could do what they want. So it depends on how C-ified
    an application is.
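
    For instance, a program that wants the conventional splitting of that
    single string can ask shell32's CommandLineToArgvW() to do it, or apply
    its own rules instead (sketch only):

        /* Sketch: fetch the single command-line string and split it with the
           conventional space/quote rules via CommandLineToArgvW(); a program
           is free to apply different rules (e.g. treat commas as separators).
           Link with shell32 (e.g. -lshell32 under MinGW). */
        #include <windows.h>
        #include <shellapi.h>
        #include <stdio.h>

        int main(void)
        {
            int argc = 0;
            LPWSTR *argv = CommandLineToArgvW(GetCommandLineW(), &argc);
            if (argv == NULL)
                return 1;
            for (int i = 0; i < argc; i++)
                printf("%d: %ls\n", i + 1, argv[i]);
            LocalFree(argv);
            return 0;
        }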

    This experiment with batch files I think demonstrates how Windows shell
    (as command prompt or PowerShell) works:

    c:\c>type test.bat
    echo off
    echo %1
    echo %2
    echo %3

    c:\c>test a b c

    c:\c>echo off
    a
    b
    c

    c:\c>test a,b,c

    c:\c>echo off
    a
    b
    c

    c:\c>test a, b, c

    c:\c>echo off
    a
    b
    c

    c:\c>test * *.c

    c:\c>echo off
    *
    *.c
    ECHO is off.

    c:\c>test "a,b,c"

    c:\c>echo off
    "a,b,c"
    ECHO is off.
    ECHO is off.

    The same tests with a C program that just lists main's args works like this:

    c:\c>showargs a b c
    1: showargs
    2: a
    3: b
    4: c

    c:\c>showargs a,b,c
    1: showargs
    2: a,b,c

    c:\c>showargs a, b, c
    1: showargs
    2: a,
    3: b,
    4: c

    c:\c>showargs * *.c
    1: showargs
    2: *
    3: *.c

    c:\c>showargs "a,b,c"
    1: showargs
    2: a,b,c


    Under Windows, if you really wanted to do something as crass as having
    commas within filenames, it's possible, but you have to use quotes.

    Under WSL, that 'showargs * *.c' line works very differently: I get 1153 parameters, which includes 888 files corresponding to *, and 266
    corresponding to *.c, with no way of knowing when you've come to the end
    of one list, and started the other.

    I will just say that the behaviour of my 'test.bat' demo is the most
    sane, with the least surprises.

    You of course will disagree, since whatever Unix does, no matter how
    ridiculous or crass, is perfect, and every other kind of behaviour is
    rubbish.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Dec 21 02:02:34 2022
    On 20/12/2022 06:56, David Brown wrote:
    On 18/12/2022 18:09, Bart wrote:

    A key point here is that almost every general-purpose OS, other than
    Windows, in modern use on personal computers is basically POSIX
    compliant.

    POSIX compliant means basically being a clone of Unix with all the same restrictions and stupid quirks?


    Maybe you haven't done much Python programming and have only worked with small scripts.  But like any other language, bigger programs are split
    into multiple files or modules - only the main program file will be executable.  So if a big program has 50 Python files, only one of them
    will normally be executable and have the shebang and the executable
    flag.  (Sometimes you'll "execute" other modules to run their tests
    during development, but you'd likely do that as "python3 file.py".)


    Oh, just like Windows then?

    Obviously, all 50 modules will contain executable code. You probably
    mean that only the lead module can be launched by the OS and needs
    special permissions.


    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc,
    it is with the name of a .c source file. And 99.9% of the times when I
    invoke it on prog.c as the first or only file to create an executable,
    then I want to create prog.exe.


    OK.  So gcc should base its handling of input on what /you/ do, never
    mind the rest of the world?

    No, based on what LOTS of people do. gcc is used as a /C/ compiler, and
    is probably only ever used as a C compiler. Maybe this is acceptable to you:

    gcc prog.c -oprog -lm
    ./prog

    But I prefer:

    bcc prog
    prog

    Who wouldn't?


    That's fine for your own tools, but not for
    gcc.


    Why not? Have they thought of something as simple as using a dedicated executable name for each language? Like gcc and g++.

    Otherwise you're telling me I have to type 'prog.c' 1000s of times
    because once in a blue moon I might want to compile 'prog.ftn'?

    Note that default file extensions weren't just routine in MSDOS, other
    OSes like ones from DEC did it too, with their Fortran and Algol
    compilers for example.

    And remember that MSDOS had to be used by ordinary people, not Unix gurus.

    In *nix, the dot is just a character, and file extensions are just part
    of the name.  You can have as many or as few as you find convenient and helpful.

    That's a poor show, and explains why apparently simple, user-friendly
    concepts are not practical in Linux.

    I notice however that with 'gcc -c hello.c', it creates a file
    'hello.o', and not 'hello.c.o'. So it does recognise in this case that
    the final (or only) extension has special meaning, and that it does not
    form any part of the logical file name.



    If you intend to assemble three .s files to object files, using
    separate 'as' invocations, they will all be called a.out!

    That would be crass even for a toy program written by a student. And
    yet here it is a mainstream product used by million of people.

    All my language programs (and many of my apps), have a primary type of
    input file, and will default to that file extension if omitted.
    Anything else (eg .dll files) need the full extension.

    Here's something funny: take hello.c and rename to 'hello', with no
    extension. If I try and compile it:

         gcc hello

    it says: hello: file not recognised: file format not recognised.
    Trying 'gcc hello.' is worse: it can't see the file at all.


    How is that "funny" ?  It is perfectly clear behaviour.

    It's funny because Linux famously doesn't need extensions, it looks at
    the file contents. What I've learnt is that Linux relies on extensions
    almost as much as Windows, it's just more inconsistent.


    gcc supports lots of file types.  For user convenience it uses file extensions to tell the file type unless you want to explicitly inform it
    of the type using "-x" options.

    <https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html>

    "hello" has no file extension, so the compiler will not assume it is C. (Remember?  gcc is not just a simple little dedicated C compiler.) Files without extensions are assumed to be object files to pass to the linker,
    and your file does not fit that format.

    So it DOES make some assumptions!


    "hello." is a completely different file name - the file does not exist.
    It is an oddity of DOS and Windows that there is a hidden dot at the end
    of files with no extension - it's a hangover from 8.3 DOS names.

    What about those DEC systems I mentioned? I would be surprised if on
    RSX11M for example (PDP11), which I believe had 6-letter file names and
    3-letter extensions that fit into three 16-bit words via RADIX-50, they
    would bother actually storing a "." character, which is an input and
    display artefact.

    So it is Unix that is peculiar here, in making a logical separator a
    physical one.

    Yes.  It's the only sane way, and consistent with millions of programs spanning 50 years on huge numbers of systems.

    With bcc, I just have to type "bcc hello." to make it work. A trailing
    dot means an empty extension.

    When you make your own little programs for your own use, you can pick
    your own rules.

    The rules make sense for EVERY interactive CLI program. One
    characteristic of Unix programs is that you start them, but then nothing
    happens - it has apparently hung. Actually, it's just waiting for
    user input, but didn't see fit to say so with a brief message.

    Behaviour like that, or defaulting to 'a.out' no matter what, I would
    expect in 'little', temporary and private programs, not something to be inflicted on a million people.


    Do you have a program called "cc" on your path?  It's unlikely.  "cc" is the standard name for the system compiler, which may be gcc or may be something else entirely.

    This was the 'make' program supplied with gcc on Windows.

    Of course, 'make' wouldn't work with my own stuff, which is
    unconventional: no object files, no linking, no list of discrete
    modules. It's basically 'mm prog', by design.




    * I also use several C compilers; how does make know which one I
    intend? How do I pass it options?


    It uses the POSIX standards.  The C compiler is called "cc", the flags passed are in the environment variable CFLAGS.


    So are we only talking about C here, or is it other languages whose
    compilers have been adapted from the C compiler, complete with that
    a.out business?


    If that's not what you want, write a makefile.

    Why would I bother with such stone-age rubbish? (And that hardly ever
    works.)




    If I give another example:

        c:\c>bcc cipher hmac sha2
        Compiling cipher.c to cipher.asm
        Compiling hmac.c to hmac.asm
        Compiling sha2.c to sha2.asm
        Assembling to cipher.exe

    it just works. 'make cipher hmac sha2' doesn't, not even in WSL.

    (And the "advanced AI" can figure out if it is C, C++, Fortran, or
    several other languages.)

    No, it can't. If I have hello.c and hello.cpp, it will favour the .c
    file.


    Sorry, I should have specified that the "advanced AI" can do it on an advanced OS, such as every *nix system since before Bill Gates found
    MS-DOS in a dustbin.

    And yet it got it wrong; I wanted to build the cpp file.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Wed Dec 21 11:07:00 2022
    On 20/12/2022 11:55, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    It will always be a snapshot of the actual sources, which are not kept
    on-line and can change every few seconds.

    You are misusing git and github. git is a "source control" system.
    At least from my point of view (there are a lot of flame wars discussing
    what source control should do) the main task of source control is
    to store all significant versions of software and allow reasonably
    easy retrieval of any version. Logically git stores a separate
    source tree for each version (plus some meta info like log messages).
    Done naively this would lead to serious bloat: with 1547 versions
    it would be almost 1547 times larger than a single version. git
    uses compression to reduce this. AFAICS the actual sources of your
    projects are about 4-5M. With normal git use I would expect the
    (compressed) history to add another 5-10M (if there are a lot of
    deletions then the history would be bigger). Your repo is bigger than
    that, probably due to generated files and .exe files. Note: I understand
    that if you write in your own language, then bootstrap is a problem.
    But for bootstrap mc.c is enough. OK, you want to be independent
    from C, so maybe the .exe is justified. But the .ma files just add bloat.
    Note that github has a release feature; people who want just binaries or
    a single version can fetch a release. And many projects with a bootstrap
    problem say: if you do not have the compiler, fetch an earlier binary
    and use it to build the system. Or they add extra generated things
    to releases but do not keep them in the source repository.


    DB:
    Exactly. You just have a very DOS-biased view as to when they are
    helpful, and when they are not. It's a backwards and limited view due
    to a lifetime of living with a backwards and limited OS.

    So much negativity here.

    First I'm castigated for my language not being original enough: it's
    either a 'rip-off' of C, or a derivative.

    Then, in every single case where I try to be different, or innovative,
    I'm doing it wrong.

    You two aren't going to be happy until my language is a clone of C, with
    tools that work exactly the same way they do on Unix. But then you're
    going to say, what's the point?

    * Case-insensitive: bad, very bad

    * 1-based: bad (a fair number of languages are 1-based too)

    * Choice of array base including 1 and 0: bad (Modern Fortran allows this)

    * Centralised module scheme: very, very bad. You want module info to be specified and repeated not only in every module, but sometimes also in
    each function. (Note that makefiles are a crude form of centralised
    module scheme, but not in a form useful to a compiler)

    * Whole-program compilation: very bad. Yet some newer languages do the
    same thing; Python uses a whole-program bytecode compiler.

    * No object files and no linker: very bad. (Python does this)

    * Instantly create amalgamated sources in one file (I thought this was brilliant): very bad: having a sprawling representation is /much/
    better! And apparently plays badly with Github. (Note: the amalgamated sqlite3.c file is on Github.)

    * Line-oriented syntax: bad. (Python is line-oriented; so is the C preprocessor.)

    * Out of order definitions: bad: You /want/ to have to write and
    maintain forward declarations for everything, and sometimes it's not
    possible (circular refs in structs for example)

    * Tools that primarily work on one file type do not need extensions for
    that file to be specified on a command line: bad. OK, that was typical
    on DEC in the 1970s; why doesn't it work now? Oh, because Unix treats
    "." in a funny way, or you could in theory have a file called "c.c.c.c".
    You know, the solution in those 0.1% of cases is very simple: then you
    have to write the full extension. But you have the convenience the other
    99.9% of the time.

    * I don't use makefiles: very, very, very bad. Yet what would be inside
    a makefile for my language where you build a whole program using one
    command ('mm prog')? Do I need to maintain a duplicate list of modules
    so that it can work out that I don't need to spend 50ms on rebuilding
    from scratch? (Note: doing 'make hello' when hello is up to date also
    takes 50ms, but that's one 5-line source file. It's easier to just build anyway.)

    * Providing distributions in form of binary: bad, I think.

    * Providing distributions in a form one step back from binary (eg. a
    single file containing C source code): bad, I think.

    * Putting stuff on Github: bad, because apparently I'm doing it wrong.
    OK, I've taken my sources off it completely; does that help?


    I get the impression that everything I try is viewed negatively.

    At least, I don't remember anyone saying, What a great idea, Bart! Or,
    Yeah, I'd like that, but unfortunately the way Linux works makes that impractical.

    Instead, it would be, Yeah, that's what you would expect from a rubbish
    OS that Bill Gates found in a bin.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Thu Dec 22 14:03:33 2022
    On 21/12/2022 12:07, Bart wrote:
    On 20/12/2022 11:55, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    It will always be a snapshot of the actual sources, which are not kept
    on-line and can change every few seconds.

    You are misusing git and github.  git is a "source control" system.
    At least from my point of view (there are lots of flame wars discussing
    what source control should do), the main task of source control is
    to store all significant versions of the software and to allow reasonably
    easy retrieval of any version.  Logically, git stores a separate
    source tree for each version (plus some meta info like log messages).
    Done naively this would lead to serious bloat: with 1547 versions
    it would be almost 1547 times larger than a single version.  git
    uses compression to reduce this.  AFAICS the actual sources of your
    projects are about 4-5M.  With normal git use I would expect the
    (compressed) history to add another 5-10M (if there are a lot of
    deletions then the history would be bigger).  Your repo is bigger than
    that, probably due to generated files and the .exe.  Note: I understand
    that if you write in your own language, then bootstrapping is a problem.
    But for bootstrapping, mc.c is enough.  OK, you want to be independent
    from C, so maybe the .exe.  But the .ma files just add bloat.  Note that
    github has a release feature; people who want just binaries or a single
    version can fetch a release.  And many projects with a bootstrap
    problem say: if you do not have the compiler, fetch an earlier binary
    and use it to build the system.  Or they add extra generated things
    to releases but do not keep them in the source repository.


    DB:
    Exactly.  You just have a very DOS-biased view as to when they are helpful, and when they are not.  It's a backwards and limited view due
    to a lifetime of living with a backwards and limited OS.

    So much negativity here.

    I have long experience with MS-DOS and Windows, and long experience with
    *nix. DOS is absolute shite in comparison - it was created as a cheap knock-off of other systems, thrown together quickly for a throw-away
    marketing project by IBM. Unfortunately IBM forgot to throw away the
    project and it was accidentally successful, resulting in the world being
    stuck with hardware, software and a processor ISA that were known to be third-rate outdated cheapo solutions at the time the IBM PC was first
    released. Those turds have been polished a great deal in the last 35
    years or so - some versions of Windows are okay, and modern x86-64
    processors are very impressive engineering - but turds they remain at
    their core. While some designs were planned to be forward compatible
    with future enhancements (like the 68k processor architecture, or the
    BBC MOS operating system), and some were designed to be compatible with everything above a set minimum (like Unix), x86 and DOS then Windows
    have been saddled with backwards compatibility as their prime
    motivation. (This isn't really Microsoft or Intel's fault - they are
    stuck with it, and as a result many of their more innovative projects
    have failed even when they were good ideas.)

    Yes, I have a rather negative view on DOS.

    This is not personal - I don't have a negative view of you!

    And while I have a negative view of one-man languages other than for
    fun, learning, research, or very niche applications, I am always
    impressed by people who make them.


    I get the impression that everything I try is viewed negatively.

    Your memory is biased.


    At least, I don't remember anyone saying, What a great idea, Bart!

    You've heard that a lot from me. Mostly when you list features that you
    think are absolutely critical in a language, I ignore them because they
    are so repetitive. But when I do comment on them, I regularly and
    happily comment positively on the ones I like or that I think are often
    liked by others. But I won't lie to you and tell you that I think
    1-based arrays are a good idea, or that case-insensitivity is
    universally liked, or that line-oriented syntax is innovative, or that I
    think out-of-order definitions makes a significant difference in my
    programming (it's nice to have, but for me, the disadvantages balance
    the advantages).

    To cheer you up, from your last list I agree that proper modules are
    important, whole-program optimisation is great, and traditional
    pre-compiled object files are outdated.

    Or,
    Yeah, I'd like that, but unfortunately the way Linux works makes that impractical.

    I don't think anything related to Linux or DOS/Windows is at all
    relevant to your language - it should work the same on any system. Your
    tools don't follow *nix common standards, but they would not be the
    first tools on Linux that are unconventional.


    Instead, it would be, Yeah, that's what you would expect from a rubbish
    OS that Bill Gates found in a bin.


    Bill Gates boasted about how he searched rubbish bins for printouts of other
    people's code, and copied it (without a care for copyright, licensing, recognition, or quality) into his own code for Microsoft.


    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a criticism
    of you or your language.

    I also thought it was quite clear that I was simply telling you how
    things are, and how things work in Linux - primarily to help you out
    with a system that is unfamiliar to you.

  • From Andy Walker@21:1/5 to Bart on Thu Dec 22 15:09:55 2022
    On 21/12/2022 11:07, Bart wrote:
    [...]
    So much negativity here.
    [... To David and Waldek:]
    You two aren't going to be happy until my language is a clone of C,
    with tools that work exactly the same way they do on Unix. But then
    you're going to say, what's the point?

    You [and probably Dmitry] seem to have a very weird idea of what
    Unix /is/. To really understand why many of us have been happy users of
    Unix [and somewhat less so of Linux*] for several decades, you need to understand the history, what came before, and how Unix then evolved. I
    don't intend to write an essay here; there are books on the subject.
    But what you seem to fail to appreciate is that very little of Unix is
    laid down in concrete. It would take major surgery to change the file
    system significantly; you are probably also stuck with the "exec"
    family of system calls; you would be unwise to tamper too much with
    the basic security mechanisms. But thereafter, it's entirely up to
    you.

    You're bright enough to be able to write your own language and compiler. So you're surely bright enough to write your own shell, or
    to tinker with one of those already available -- they /all/ came into
    existence because some other bright person wanted something different.
    Bright enough to write wrappers for things where you would prefer the
    defaults to be different, to write your own editor, your own tools for
    all purposes. All, every single one, of those supplied "by default"
    again came into being because someone decided they wanted it and wrote
    the requisite code. Sources are freely available, so if you want
    something different and don't want to write your own, you can play
    with the code that someone else wrote. Entirely up to you.

    When David says "you can do X", he doesn't mean "you /have/ to
    do X". There is almost no compulsion. All the tools are there, use
    them as you please. When you complain about some aspect of "gcc" or
    "make" or whatever, you're actually complaining that people who gave
    their time and expertise freely to provide a tool that /they/ wanted,
    haven't done so to /your/ specification. Well, shucks.

    To give one example, you have been wittering recently about
    the fact that "cc hello; hello" doesn't, as you would like, find and
    compile a program whose source is in "hello.c", put the binary into
    "hello", and run it. But you can write your own almost trivially;
    it's a "one-line" shell script [for large values of "one", but that's
    to provide checks rather than because it's complicated]. You complain
    that you have to write "./hello" rather than just "hello"; but that's
    because "." is not in your "$PATH", which is set by you, not because
    Unix/Linux insists on extra verbiage. If you need further help, just
    ask. But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].

    [...]
    At least, I don't remember anyone saying, What a great idea, Bart!
    Or, Yeah, I'd like that, but unfortunately the way Linux works makes
    that impractical.

    Perhaps you would tell us what great ideas you'd like "us" to
    consider? The things I recall you telling us are things that existed
    long ago in other languages, such as 1-based arrays, line-based syntax,
    or case insensitivity. If you want them in Unix/Linux, you can have
    them, and no-one will, or should, tell you it's impractical. But if
    the things you want are different from the things most of the rest of
    the world wants, then you may need to write your own or adapt what is
    already freely available.

    _____
    * Linux, sadly, has acquired a degree of bloat. Eg, "man gcc" comes
    to some 300 pages, compared with the two pages of "man cc" in the
    7th Edition version. Basically, it's always easier to add more
    to an existing facility than to take stuff out. Grr! We used to
    grumble when the binary of a fully-featured browser went to over
    a megabyte. Now we scarcely turn a hair at the size of Firefox,
    or the number of processes it spawns. Grr.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Simpson

  • From David Brown@21:1/5 to Andy Walker on Thu Dec 22 16:57:33 2022
    On 22/12/2022 16:09, Andy Walker wrote:
      * Linux, sadly, has acquired a degree of bloat.  Eg, "man gcc" comes
        to some 300 pages, compared with the two pages of "man cc" in the
        7th Edition version.  Basically, it's always easier to add more
        to an existing facility than to take stuff out.  Grr!  We used to
        grumble when the binary of a fully-featured browser went to over
        a megabyte.  Now we scarcely turn a hair at the size of Firefox,
        or the number of processes it spawns.  Grr.


    It's Wirth's law - software gets slower faster than hardware gets
    faster. It's not a Linux innovation!

    (I don't care at all how big the program Firefox is - but it does annoy
    me that it takes so many GB of memory. Maybe it's time to close some of
    these couple of hundred tabs split amongst several dozen windows
    arranged around 12 virtual desktops!)

  • From Dmitry A. Kazakov@21:1/5 to Andy Walker on Thu Dec 22 17:05:35 2022
    On 2022-12-22 16:09, Andy Walker wrote:

        You [and probably Dmitry] seem to have a very weird idea of what Unix /is/.

    I don't know about Bart. As for me, I started with PDP-11 UNIX and
    continued with m68k UNIX Sys V. Both were utter garbage, inferior in every
    possible aspect to any competing system, with the worst C compilers
    I have ever seen. After these I spent a couple of years maintaining Sun
    Solaris machines, which were decent systems. I installed, ran and maintained
    the earliest versions of Linux on i386, when an i486 was considered a
    "mainframe", when the kernel had to be configured and compiled (device
    specific drivers, interrupts and addresses set manually, etc). Setting up
    X11 on SVGA cards with display modes as listed in the CRT monitor
    manual, what a joy!

    I know exactly what UNIX is.

    But what you seem to fail to appreciate is that very little of Unix is
    laid down in concrete.

    What David wrote about DOS/Windows being rotten at the core, which no amount of lipstick can cure, fully applies to UNIX. They are a pair of ugly
    siblings.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Thu Dec 22 16:59:56 2022
    On 22/12/2022 16:21, David Brown wrote:
    On 21/12/2022 01:42, Bart wrote:

    You of course will disagree, since whatever Unix does, no matter how
    ridiculous or crass, is perfect, and every other kind of behaviour is
    rubbish.


    Yes.

    But then, I would not normally keep a thousand files in one directory. I think even MS-DOS has supported directory trees since version 2.x.

    You've obviously never written programs for other people to run on their
    own PCs. How do /you/ know how other people will organise their files?
    Who are you to tell them how to do so?

    And it might in any case be up to third party apps how files are
    generated on your client's machine.

    But my point about '* *.c', which you've chosen to ignore, is valid even
    for ten files; it's just wrong.

    It might be acceptable within a higher level language where each
    wildcard spec expands to a list of files which itself is a nested
    element of the parameter list. But it doesn't work if you just
    concatenate everything into one giant list; there are too many ambiguities.

    Of course, you will never agree there's anything wrong with it; you will
    defend Linux to the death. Or you will point out that you can do X, Y and Z
    to turn off this 'globbing', which then causes problems for programs that
    depend on it, and which it is now up to each customer to do persistently
    on their machines.

  • From David Brown@21:1/5 to Bart on Thu Dec 22 17:21:07 2022
    On 21/12/2022 01:42, Bart wrote:

    You of course will disagree, since whatever Unix does, no matter how ridiculous or crass, is perfect, and every other kind of behaviour is rubbish.


    Yes.

    But then, I would not normally keep a thousand files in one directory.
    I think even MS-DOS has supported directory trees since version 2.x.

  • From David Brown@21:1/5 to Bart on Thu Dec 22 17:17:02 2022
    On 21/12/2022 03:02, Bart wrote:
    On 20/12/2022 06:56, David Brown wrote:
    On 18/12/2022 18:09, Bart wrote:

    A key point here is that almost every general-purpose OS, other than
    Windows, in modern use on personal computers is basically POSIX
    compliant.

    POSIX compliant means basically being a clone of Unix with all the same restrictions and stupid quirks?

    It means a set of common features that you can rely on when writing
    portable code.



    Maybe you haven't done much Python programming and have only worked
    with small scripts.  But like any other language, bigger programs are
    split into multiple files or modules - only the main program file will
    be executable.  So if a big program has 50 Python files, only one of
    them will normally be executable and have the shebang and the
    executable flag.  (Sometimes you'll "execute" other modules to run
    their tests during development, but you'd likely do that as "python3
    file.py".)


    Oh, just like Windows then?

    Sure. Python is cross-platform. The difference is that if I have
    "main.py" that includes "utils.py", and only "main.py" is intended to be executable, then on Linux I can only try to execute "main.py" directly.
    On Windows, I can happily run "utils.py" exactly as I run "main.py"
    with whatever accidental consequences that might have. (Usually nothing
    bad.)


    Obviously, all 50 modules will contain executable code. You probably
    mean that only the lead module can be launched by the OS and needs
    special permissions.


    Yes, that is what "executable" means.


    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc,
    it is with the name of a .c source file. And 99.9% of the times when
    I invoke it on prog.c as the first or only file to create an
    executable, then I want to create prog.exe.


    OK.  So gcc should base its handling of input on what /you/ do, never
    mind the rest of the world?

    No, based on what LOTS of people do. gcc is used as a /C/ compiler, and
    is probably only ever used as a C compiler.

    I think C++ programmers might disagree. So would Fortran programmers,
    or people who use gcc as the front-end for linking (that covers most
    people who use gcc at all), or many of those who use any of the other
    languages it covers.

    Even the name of the tool means "GNU Compiler Collection" - not "GNU C Compiler" as you seem to think.

    Maybe this is acceptable to
    you:

        gcc prog.c -oprog -lm
        ./prog

    But I prefer:

        bcc prog
        prog

    Who wouldn't?


    When your toolchain is so simple and limited, it only needs a simple
    interface - and yes, simple is a good thing. When a tool is advanced
    and has many features, a somewhat more involved interface is needed even
    in simple use-cases. That is inevitable.

    But if that's the way you want to have things, you can put this in a
    file called "rcc" :

    #!/bin/sh
    gcc "$1.c" -o "$1" -lm && ./"$1"


    Put the file "rcc" in a directory on your path ("~/bin" is a common
    choice), and now you can type :

    rcc prog

    That will compile "prog.c", and if the compilation was successful, it
    will run it.

    Feel free to add whatever other gcc options you like (I recommend "-O2
    -Wall -Wextra" as a starting point). It's done once, in one file, that
    you can run ever after.

    I hope you haven't spent years complaining about gcc parameters, file
    names, makefiles, etc., rather than writing such a two-line script.
    (And in Windows it's just a one line batch file.)


    (I don't think there is much I could add to your other comments - you
    clearly have no interest in any answers.)


    Why would I bother with such stone-age rubbish? (And that hardly ever
    works.)


    You really do specialise in failing to use tools others use happily.
    But then, you put a lot of effort into making sure you fail.

  • From Bart@21:1/5 to Andy Walker on Thu Dec 22 16:46:21 2022
    On 22/12/2022 15:09, Andy Walker wrote:
    On 21/12/2022 11:07, Bart wrote:
    [...]
    So much negativity here.
    [... To David and Waldek:]
    You two aren't going to be happy until my language is a clone of C,
    with tools that work exactly the same way they do on Unix. But then
    you're going to say, what's the point?

        You [and probably Dmitry] seem to have a very weird idea of what Unix /is/.  To really understand why many of us have been happy users of Unix [and somewhat less so of Linux*] for several decades, you need to understand the history, what came before, and how Unix then evolved.  I don't intend to write an essay here;  there are books on the subject.
    But what you seem to fail to appreciate is that very little of Unix is
    laid down in concrete.  It would take major surgery to change the file system significantly;  you are probably also stuck with the "exec"
    family of system calls;  you would be unwise to tamper too much with
    the basic security mechanisms.  But thereafter, it's entirely up to
    you.

        You're bright enough to be able to write your own language and compiler.  So you're surely bright enough to write your own shell, or
    to tinker with one of those already available -- they /all/ came into existence because some other bright person wanted something different.
    Bright enough to write wrappers for things where you would prefer the defaults to be different, to write your own editor, your own tools for
    all purposes.  All, every single one, of those supplied "by default"
    again came into being because someone decided they wanted it and wrote
    the requisite code.  Sources are freely available, so if you want
    something different and don't want to write your own, you can play
    with the code that someone else wrote.  Entirely up to you.

        When David says "you can do X", he doesn't mean "you /have/ to
    do X".  There is almost no compulsion.  All the tools are there, use
    them as you please.  When you complain about some aspect of "gcc" or
    "make" or whatever, you're actually complaining that people who gave
    their time and expertise freely to provide a tool that /they/ wanted,
    haven't done so to /your/ specification.  Well, shucks.

        To give one example, you have been wittering recently about
    the fact that "cc hello; hello" doesn't, as you would like, find and
    compile a program whose source is in "hello.c", put the binary into
    "hello", and run it.  But you can write your own almost trivially;
    it's a "one-line" shell script [for large values of "one", but that's
    to provide checks rather than because it's complicated].  You complain
    that you have to write "./hello" rather than just "hello";  but that's because "." is not in your "$PATH", which is set by you, not because Unix/Linux insists on extra verbiage.  If you need further help, just
    ask.  But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].

    So lots of workarounds to be able to do what DOS, maligned as it was,
    did effortlessly.

    Don't forget it is not just me personally who would have trouble. For
    over a decade, I was supplying programs that users would have to launch
    from their DOS systems, or on 8-bit systems before that.

    So every one of 1000 users would have to be told how to fix that "."
    problem? Fortunately, nobody really used Unix back then (Linux was not
    yet ready), at least among our likely customers who were just ordinary
    people.

    It was also fortunate because, with case-sensitivity in the shell and
    file system, it would have created a lot more customer support headaches.


    But you can write your own almost trivially;
    it's a "one-line" shell script

    Sure. I also asked: if it is so trivial, why don't programs do that
    anyway? They could at least learn something from DOS, namely user-friendliness.

    Every time I complain about building stuff from Linux, people talk about installing CYGWIN, or MSYS, or WSL, because they have 100 things that are apparently indispensable for building (I can tell you, they're not; those dependencies were deliberate choices).

    My needs for building stuff on Linux can be satisfied, as you say, with
    a handful of 1-line scripts, and those are not indispensable either;
    just convenient.


    [...]
    At least, I don't remember anyone saying, What a great idea, Bart!
    Or, Yeah, I'd like that, but unfortunately the way Linux works makes
    that impractical.

        Perhaps you would tell us what great ideas you'd like "us" to consider?  The things I recall you telling us are things that existed
    long ago in other languages, such as 1-based arrays, line-based syntax,
    or case insensitivity.

    Well, I am standing up for those features and refusing to budge just
    because C and Linux have taken over the world and are shoving 0-based indexing and case-sensitivity down people's throats.

    Notice that most user-facing interfaces tend to be case-insensitive? And
    for good reason. But don't forget that DOS was a user-facing CLI.

    Any Linux shell made a terrible CLI, but I guess it was designed for
    gurus rather than ordinary people.


    As for other features, you can imagine I'm not really in the mood to go
    through them again. A lot of the innovative stuff is to do with project
    description and with compiling and running programs, rather than the language itself.

    So, nobody here thinks that doing 'mm -ma appl' to produce a one-file
    appl.ma file representing /the entire application/, that can be
    trivially compiled remotely using 'mm appl.ma', is a great idea?

    Apparently it's little different from using 'tar'!

    Well, have a look at the A68G source bundle for example: inside the .gz2
    part which compresses it, there is a .tar file. Unfortunately, when I tried:

    gcc algol68g-3.1.0.tar

    it didn't work: file not recognised. You have to untar all the component
    files and directories, and build conventionally, or at least conventionally for
    Linux. Like so many, this application starts with a 'configure' script, although only 9500 lines this time. So I can't build it on normal Windows.

    Now look again at my 'mm appl.ma' which Just Works, and further, can do
    so on either OS ('mm appl.ma' on Windows, './mc appl.ma' on Linux, or
    just 'mc appl.ma' once you've figured out the "./" disappearing trick).

    There is absolutely no comparison. So I find it surprising there is
    lacklustre enthusiasm.

    (I tried building this program on WSL. It took about 80 seconds in all.

    But typing 'make' again still took 1.4 seconds even with nothing to do.

    Then I looked inside the makefile: it was an auto-generated one with
    nearly 3000 lines of crap inside - no wonder it took a second and a half
    to do nothing!

    And this stuff is supposed to be miles better than what I do?)

  • From Dmitry A. Kazakov@21:1/5 to Bart on Thu Dec 22 18:32:53 2022
    On 2022-12-22 17:46, Bart wrote:

    Any Linux shell made a terrible CLI, but I guess it was designed for
    gurus rather than ordinary people.

    No, they were brainless designs, just as MS-DOS batch was. This was just
    like with C and its countless C-esque followers. There were (and still are)
    sh, csh, tcsh, ksh, bash. Each new incarnation improved on the old ones by
    parroting their every mistake. In the early days of UNIX everybody's hobby
    was to redefine the command prompt, ls, ps, etc. in the shell
    using an rc script. At some point people gave up. Exactly the same situation
    was and is with text editors. There have existed hundreds of absolutely
    disgusting things. An iconic quotation from Datamation was:

    "... Real Programmers consider "what you see is what you get" to be just
    as bad a concept in Text Editors as it is in Women. No, the Real
    Programmer wants a "you asked for it, you got it" text editor--
    complicated, cryptic, powerful, unforgiving, dangerous."

    -- Real Programmers Don't Use Pascal

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Thu Dec 22 21:55:29 2022
    On 22/12/2022 13:03, David Brown wrote:
    On 21/12/2022 12:07, Bart wrote:

    So much negativity here.

    I have long experience with MS-DOS and Windows,

    So have I.

    and long experience with
    *nix.

    I looked into it every few years; it always looked shite to me.

    However, I should say I have little interest in operating systems
    anyway. DOS was fine because it didn't get in my way. It provided a file system, could copy files, launch programs etc, and it didn't cut my productivity and sanity in half by throwing in case-sensitivity. What
    else did I need?

    I expect you didn't like DOS because it doesn't have the dozens of toys
    that you came to rely on in Unix, including a built-in C compiler; what
    luxury!

    It's because DOS was so sparse that I have few dependencies on it; and
    my stuff can build on Linux more easily than Linux programs can build on Windows. (AIUI, most won't; you need to use CYGWIN or MSYS or WSL, but
    then you don't get a bona fide Windows executable that customers can run directly.)


    DOS is absolute shite in comparison - it was created as a cheap
    knock-off of other systems, thrown together quickly for a throw-away marketing project by IBM.

    This is from someone who used, what was it, a Spectrum machine?

    I was involved in creating 8-bit business computers at the time, and
    looked down on such things. (But it was also my job to investigate
    similar, low-cost designs for hobbyist computers as an area of expansion.)

    BTW our machines used a rip-off of CP/M. My boss approached Digital
    Research but couldn't come to an agreement on licensing. So we (not me
    though) created a clone. So why is saving money a bad thing?

    I don't know exactly what you expected from an OS that ran on a 64KB
    machine, which wasn't allowed to use more than about 8KB.

    And, where /were/ the PCs with Unix in those days? Where could you buy
    one? Would you be able to do much on it other than endlessly configure
    stuff to make it work? Could you create binaries that were guaranteed to
    work with any other Unix?

    How unfriendly would it have been to supply apps as software bundles
    that would take an age to build on a dual-floppy machine, with users
    having to keep feeding it floppies?

    I think you just have little experience of that world of creating
    products for low-end consumer PCs.

    IME Linux systems were poor, amateurish attempts at an OS where lots of
    things just didn't work, until the early 2000s. GUIs came late too, and
    looked dreadful. By comparison, Microsoft Windows looked professional.

    Yes you had to pay for it; is that what this is about, that Linux is free?


    Unfortunately IBM forgot to throw away the
    project and it was accidentally successful,

    Good.

    resulting in the world being
    stuck with hardware, software and a processor ISA that were known to be third-rate

    The IBM PC was definitely more advanced than my 8-bit business machine,
    if not that much faster despite an internal 16-bit processor.

    The 8088/86/286 had some disappointing limitations, which were fixed
    with the 80386.

    outdated cheapo solutions at the time the IBM PC was first
    released.  Those turds have been polished a great deal in the last 35
    years or so

    The architecture was open. There was a huge market in add-on
    peripherals, and they came with drivers that worked. Good luck in
    finding equivalent support in the 1990s for even a printer driver under Linux.


    - some versions of Windows are okay, and modern x86-64
    processors are very impressive engineering - but turds they remain at
    their core.  While some designs were planned to be forward compatible
    with future enhancements (like the 68k processor architecture, or the
    BBC MOS operating system), and some were designed to be compatible with everything above a set minimum (like Unix), x86 and DOS then Windows
    have been saddled with backwards compatibility as their prime
    motivation.

    Which has been excellent. Until they chose not to support 16-bit
    binaries under 64-bit Windows.

    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a criticism
    of you or your language.

    I get annoyed when people openly diss Windows, or MSDOS, simply for not
    being Linux.

  • From David Brown@21:1/5 to Bart on Fri Dec 23 16:50:21 2022
    On 22/12/2022 17:59, Bart wrote:
    On 22/12/2022 16:21, David Brown wrote:
    On 21/12/2022 01:42, Bart wrote:

    You of course will disagree, since whatever Unix does, no matter how
    ridiculous or crass, is perfect, and every other kind of behaviour is
    rubbish.


    Yes.

    But then, I would not normally keep a thousand files in one directory.
    I think even MS-DOS has supported directory trees since version 2.x.

    You've obviously never written programs for other people to run on their
    own PCs. How do /you/ know how other people will organise their files?
    Who are you to tell them how to do so?

    And it might in any case be up to third party apps how files are
    generated on your client's machine.

    But my point about '* *.c', which you've chosen to ignore, is valid even
    for ten files; it's just wrong.

    It might be acceptable within a higher level language where each
    wildcard spec expands to a list of files which itself is a nested
    element of the parameter list. But it doesn't work if you just
    concatenate everything into one giant list; there are too many ambiguities.

    Of course, you will never agree there's anything wrong with it; you will
    defend Linux to the death. Or you will point out that you can do X, Y and Z
    to turn off this 'globbing', which then causes problems for programs that depend on it, and which it is now up to each customer to do persistently
    on their machines.

    I can't figure out what you are worrying about here.

    In any shell, in any OS, for any program, if you write "prog *" the
    program is run with a list of all the files in the directory. If you
    wrote "prog * *.c", it will be started with a list of all the files,
    followed by a list of all the ".c" files.

    It's the same in DOS, Linux, Windows, Macs, or anything else you like.
    It's the same for any shell.

    The difference is that for some shells (such as Windows PowerShell or
    bash), the shell does the work of finding the files and expanding the
    wildcards because this is what /every/ program needs - there's no point
    in repeating the same code in each program. In other shells, such as
    DOS "command prompt", every program has to have that functionality added
    to the program.

    Well, I say "every program" supports wildcards for filenames - I'm sure
    there are some DOS/Windows programs that don't. But most do.
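
    For concreteness, a minimal argument-dumping program along the lines of
    the 'showargs' tool that appears later in the thread might look like the
    sketch below (this is only an illustrative stand-in, not Bart's actual
    showargs; the file and program names are hypothetical). Run from bash as
    "./args * *.c" it prints one line per matching file, because the shell has
    already expanded the wildcards; run from the DOS/Windows command prompt it
    prints the literal "*" and "*.c", because that shell leaves expansion to
    the program.

        /* args.c -- print each command-line argument on its own line.
           A hypothetical stand-in for the 'showargs' program mentioned in
           this thread; compile with e.g. "gcc args.c -o args". */
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            for (int i = 0; i < argc; i++)
                printf("%d: %s\n", i, argv[i]);
            return 0;
        }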

  • From David Brown@21:1/5 to Bart on Fri Dec 23 16:38:56 2022
    On 22/12/2022 22:55, Bart wrote:
    On 22/12/2022 13:03, David Brown wrote:
    On 21/12/2022 12:07, Bart wrote:

    So much negativity here.

    I have long experience with MS-DOS and Windows,

    So have I.

    and long experience with *nix.

    I looked into it every few years; it always looked shite to me.

    However, I should say I have little interest in operating systems
    anyway. DOS was fine because it didn't get in my way. It provided a file system, could copy files, launch programs etc, and it didn't cut my productivity and sanity in half by throwing in case-sensitivity. What
    else did I need?

    I expect you didn't like DOS because it doesn't have the dozens of toys
    that you came to rely on in Unix, including a built-in C compiler; what luxury!


    I find it useful to have a well-populated toolbox. I am an engineer -
    finding the right tools for the job, and using them well, is what I do.
    DOS gave you a rock to bash things with. Some people, such as
    yourself, seem to have been successful at bashing out your own tools
    with that rock. That's impressive, but I find it strange how you can be
    happy with it.

    The alternative in DOS and Windows has always been to buy additional
    tools that *nix users take for granted. And those tools always have to
    include /everything/. So while a Pascal compiler for *nix just needs to
    be a Pascal compiler - because there are editors, build systems,
    debuggers, libraries, assemblers and linkers already - on DOS/Windows
    you had to get Turbo Pascal or Borland Pascal which had everything
    included. (Turbo Pascal for DOS and Win3.1 included "make", by the way
    - for well over a decade that was the build tool I used for all my
    assembly and C programming on microcontrollers.) And then when you
    wanted C, you bought MSVC which included a different editor with a
    different setup, a different assembler, a different build tool (called
    "nmake"), and so on. Everything was duplicated but different, everything
    incompatible, everything a huge waste of the manufacturers' time and
    effort, and a huge waste of the users' money and time as they had to get
    familiar with another set of basic tools.

    Do I like the fact that *nix has always come with a wide range of general-purpose tools? Yes, I most surely do!

    It's because DOS was so sparse that I have few dependencies on it; and
    my stuff can build on Linux more easily than Linux programs can build on Windows. (AIUI, most won't; you need to use CYGWIN or MSYS or WSL, but
    then you don't get a bona fide Windows executable that customers can run directly.)


    Programs built with msys2 work fine on any Windows system. You have to
    include any DLL's you use, but that applies to all programs with all tools.


    DOS is absolute shite in comparison - it was created as a cheap
    knock-off of other systems, thrown together quickly for a throw-away
    marketing project by IBM.

    This is from who used, what was it, a Spectrum machine?


    I used many different systems, including a Spectrum. But you expect a different quality of design from a machine made as cheap as possible for
    home use, primarily for games, than from an expensive system for serious
    professional business use.

    I was involved in creating 8-bit business computers at the time, and
    looked down on such things. (But it was also my job to investigate
    similar, low-cost designs for hobbyist computers as an area of expansion.)

    BTW our machines used a rip-off of CP/M. My boss approached Digital
    Research but couldn't come to an agreement on licensing. So we (not me though) created a clone. So why is saving money a bad thing?


    Saving money is fine.

    I don't know exactly what you expected from an OS that ran on a 64KB
    machine, which wasn't allowed to use more than about 8KB.

    I don't know exactly either.

    I think if IBM had followed their plan - learn what the market needed,
    then throw the "proof of concept" out and design something serious for
    the future - it would have been far better.

    I realise that hardware was expensive and there are limits to what can
    be done in so little ram. But even then it could have been /so/ much
    better.

    Start with the processor - it was crap. The IBM engineers knew it was
    crap, and didn't want it. But they were forced to use something that
    was basically limited to 64 KB before resorting to an insane segmented
    memory system that was limited to 1 MB, leading to even more insane
    hacks to get beyond that. The engineers wanted something that had a
    future, such as the 68000.

    Then there was the OS - it was crap. Unix was vastly better than CP/M
    and the various DOS's. But IBM was not involved in Unix at that time,
    and did not want others to control the software on their machines.
    (They thought they could control Microsoft.)

    Or compare it to other machines, such as the BBC Micro. It had the OS
    in ROM, making it far simpler and more reliable. It had a good
    programming language in ROM. (BASIC, but one of the best variants.)
    This meant new software could be written quickly and easily in a high
    level language, instead of assembly (as was the norm for the PC in the
    early days). It had an OS that was expandable - it supported pluggable
    file systems that were barely imagined at the time the OS was designed.
    It was a tenth of the price of the PC. Even if you equipped it with networking (unheard-of in the PC world), external disk drives, and
    multiple languages and software in ROM, it was still far cheaper than a
    PC despite being easier to use and having a vastly better text and
    graphics system. You could even get a Z80 co-processor with CP/M.

    Of course the BBC Micro wasn't perfect either, and had limitations
    compared to the PC - not surprising, given the price difference. The
    6502 was not a powerful processor.

    But imagine what we could have had with a computer using a 68000,
    running a version of Unix, combining the design innovation,
    user-friendliness and forward thinking of Acorn and the business
    know-how of IBM? It would have been achievable at lower cost than the
    IBM PC, and /so/ much better.

    As it was, by the mid-eighties there were home computers with usability, graphics and user interfaces that were not seen in the PC world for a
    decade. There were machines that were cheaper than the PC's of the time
    and could /emulate/ PC's at near full-speed. The PC world was far
    behind in its hardware, OS and basic software. But the IBM PC and MSDOS
    won out because the other machines were not compatible with the IBM PC
    and MSDOS.


    And, where /were/ the PCs with Unix in those days? Where could you buy
    one? Would you be able to do much on it other than endlessly configure
    stuff to make it work? Could you create binaries that were guaranteed to
    work with any other Unix?

    How unfriendly would it have been to supply apps as software bundles
    that would take an age to build on a dual-floppy machine, with users
    havin to keep feeding it floppies?
    having to keep feeding it floppies?

    I think you just have little experience of that world of creating
    products for low-end consumer PCs.

    IME Linux systems were poor, amateurish attempts at an OS where lots of things just didn't work, until the early 2000s. GUIs came late too, and looked dreadful. By comparison, Microsoft Windows looked professional.


    I used Unix systems at university in the early 1990's, and by $DEITY it
    was a /huge/ step backwards when I had to move to Windows. (To be fair,
    there was a very significant price difference involved.) Even Windows
    95 was barely on a feature par with Archimedes machines of a decade
    previous, and of course was never close to it in stability.

    Yes, I agree that Windows 2000 was the point when Windows started
    looking and working like a professional OS. (I used OS/2 earlier.) And
    yes, Linux was quite limited and had not "taken off" at that time.

    Yes you had to pay for it; is that what this is about, that Linux is free?


    No. SunOS and Solaris, which I used earlier, were very far from free.


    Unfortunately IBM forgot to throw away the project and it was
    accidentally successful,

    Good.

    resulting in the world being stuck with hardware, software and a
    processor ISA that were known to be third-rate

    The IBM PC was definitely more advanced than my 8-bit business machine,
    if not that much faster despite an internal 16-bit processor.

    The 8088/86/286 had some disappointing limitations, which were fixed
    with the 80386.


    It's a pity MS didn't catch up with proper 32-bit support until AMD were already working on 64-bit versions.

    The 80386 was certainly an improvement on its predecessors, but still
    way behind state of the art of the time. Backwards compatibility was a
    killer - it has always been the killer feature of the DOS/Windows/x86
    world, and it has always killed innovation.


    outdated cheapo solutions at the time the IBM PC was first released.
    Those turds have been polished a great deal in the last 35 years or so

    The architecture was open. There was a huge market in add-on
    peripherals, and they came with drivers that worked. Good luck in
    finding equivalent support in 1990s for even a printer driver under Linux.


    On the business and marketing side, there's no doubt that MS in
    particular outclassed everyone else. They innovated the idea of
    criminal action as a viable business tactic - use whatever illegal means
    you like to ensure competitors go bankrupt before they manage to sue
    you, and by the time the case gets to the courts the fines will be
    negligible compared to your profit. Even IBM was trapped by them.

    But yes, compatibility and market share was key - Windows and PC's were
    popular because there was lots of hardware and software that worked with
    them, and there was lots of hardware and software for them because they
    were popular. They were technically shite, but successful because they
    were successful.

    No one marketed Linux until far later, and true Unix was happy in its
    niche (as was Apple). A few innovative companies came out with hugely
    better hardware or software, but they could never catch up with the
    momentum of the PC.


     - some versions of Windows are okay, and modern x86-64
    processors are very impressive engineering - but turds they remain at
    their core.  While some designs were planned to be forward compatible
    with future enhancements (like the 68k processor architecture, or the
    BBC MOS operating system), and some were designed to be compatible
    with everything above a set minimum (like Unix), x86 and DOS then
    Windows have been saddled with backwards compatibility as their prime
    motivation.

    Which has been excellent. Until they chose not to support 16-bit
    binaries under 64-bit Windows.


    I believe you can run Wine on Windows, and then you could run 16-bit
    binaries. But you might have to run Wine under Cygwin or something -
    it's not something I have tried.

    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a criticism
    of you or your language.

    I get annoyed when people openly diss Windows, or MSDOS, simply for not
    being Linux.


    I diss Windows or DOS because it deserves it. Linux was not conceived
    when I realised DOS was crap compared to the alternatives. (You always
    seem to have such trouble distinguishing Linux and Unix.)

  • From Dmitry A. Kazakov@21:1/5 to David Brown on Fri Dec 23 17:17:35 2022
    On 2022-12-23 16:38, David Brown wrote:

    The alternative in DOS and Windows has always been to buy additional
    tools that *nix users take for granted.  And those tools always have to include /everything/.  So while a Pascal compiler for *nix just needs to
    be a Pascal compiler - because there are editors, build systems,
    debuggers, libraries, assemblers and linkers already - on DOS/Windows
    you had to get Turbo Pascal or Borland Pascal which had everything
    included.

    Yes, and this was one reason why DOS actually won over UNIX
    workstations, while being utterly inferior. You bought a set of floppies
    from Borland and you could comfortably write and debug your program.

    Compared to that, UNIX was a command line with an assorted set of very
    poorly designed, obscure utilities. Even Solaris with its gorgeous OpenLook
    was no match for Borland C++ and Pascal in terms of productivity.

    Do I like the fact that *nix has always come with a wide range of general-purpose tools?  Yes, I most surely do!

    Most of them were garbage. I remember the time very well. All activities
    quickly migrated to DOS. From time to time someone ran back to us with a
    "huge" data set that some shitty DOS statistics software could not process.
    Of course UNIX had nothing for that. But I just wrote a C program
    computing some linear regression stuff from scratch and gave the processed
    data back. Things like that happened less and less frequently.
    Workstations ended up as network servers running NFS and Yellow Pages, and
    when Linux matured they got scrapped.

    Presently, most of our software development is done under Windows.
    Modern tools are OS-agnostic. You can have your preferred IDE in either
    OS. But the Windows variants are always somewhat more stable. Furthermore,
    prototyping is far easier under Windows because the hardware support is
    much better. So we design, test and debug under Windows and run
    integration tests on the Linux target (some small ARM board).

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Fri Dec 23 16:51:56 2022
    On 23/12/2022 15:50, David Brown wrote:
    On 22/12/2022 17:59, Bart wrote:

    But my point about '* *.c', which you've chosen to ignore, is valid
    even for ten files; it's just wrong.

    It might be acceptable within a higher level language where each
    wildcard spec expands to a list of files which itself is a nested
    element of the parameter list. But it doesn't work if you just
    concatenate everything into one giant list; there are too many
    ambiguities.

    Of course, you will never agree there's anything wrong with it; you will
    defend Linux to the death. Or you will point out that you can do X, Y and
    Z to turn off this 'globbing', which then causes problems for programs
    that depend on it, and which it is now up to each customer to do
    persistently on their machines.

    I can't figure out what you are worrying about here.

    In any shell, in any OS, for any program, if you write "prog *" the
    program is run with a list of all the files in the directory.

    No.

    If you
    wrote "prog * *.c", it will be started with a list of all the files,
    followed by a list of all the ".c" files.

    No.


    It's the same in DOS, Linux, Windows, Macs, or anything else you like.

    No.

    It's the same for any shell.

    No. Where did you get the idea that it works anywhere? Or that it would
    even make sense for most programs?

    The difference is that for some shells (such as Windows PowerShell or
    bash), the shell does the work of finding the files and expanding the wildcards because this is what /every/ program needs - there's no point
    in repeating the same code in each program.  In other shells, such as
    DOS "command prompt", every program has to have that functionality added
    to the program.

    Oh, so your criterion that 'X works everywhere' includes having to
    implement it within applications!

    Then I can say that 'Y also works everywhere'. I don't know what Y might
    be; it doesn't matter, because whatever it does, somebody just needs to implement it first!

    This is Powershell:

    PS C:\c> .\showargs * *.c
    1: C:\c\showargs.exe
    2: *
    3: *.c

    This is Command Prompt:

    c:\c>showargs * *.c
    1: showargs
    2: *
    3: *.c

    This is WSL (a.out is the WSL-gcc-compiled version of showargs.exe):

    root@DESKTOP-11:/mnt/c/mx# ../c/a.out * *.c

    root@DESKTOP-11:/mnt/c/c# ./a.out * *.c
    1: ./a.out
    2: ...
    3: ...
    ...
    193: *.c

    That last entry seems odd: it turns out there are no .c files, so the
    parameter is the wildcard specifier itself.

    Well, I say "every program" supports wildcards for filenames - I'm sure
    there are some DOS/Windows programs that don't.  But most do.

    That will be up to individual programs whether they accept wildcards for filenames, and what they do about them. If the input is "*", what kinds
    of application would be dealing with an ill-matched collection of files, including all sorts of junk that happens to be lying around?

    It would be very specific. BTW if I do this under WSL:

    vim *.c

    Then I was disappointed that I didn't get hundreds of edit windows for
    all those files. So even under Linux, sometimes expansion doesn't
    happen. (What magic does vim use to get the OS to see sense?)


    Usually the last thing you want is for the OS (or whatever is
    responsible for expanding those command line params before they get to
    the app) to just expand EVERYTHING willy-nilly:

    * Maybe the app needs to know the exact parameters entered (like my
    showargs program). Maybe they are to be stored, and passed on to
    different parts of an app as needed.

    * Maybe they are to be passed on to another program, where it's going
    to be much easier and tidier if they are unexpanded:

    c:\mx>ms showargs * *.c
    1 *
    2 *.c

    Here 'ms', which runs 'showargs' as a script, sees THREE parameters:
    showargs, * and *.c. It arranges for the last two to be passed as input
    to the program being run.

    * Maybe the app's inputs are mathematical expressions so that you want
    "A*B" and not a list of files that start with A and end with B!

    * But above all, it simply doesn't work, not when you have expandable
    params interspersed with other expandable ones, or even ordinary params, because everything just merges together.

    So here there are two things I find utterly astonishing:

    (1) That you seem to think this a good idea, despite that list of problems

    (2) That you are under the delusion this is how Windows works too. As
    I've shown above, it doesn't.

    Yes, individual apps can CHOOSE to do their own expansion, but that is
    workable because the expansion list is segregated from the other parameters.
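
    (As an aside, a program on a POSIX system that prefers to receive
    unexpanded patterns -- quoted by the user, or handed over by a shell that
    does not glob -- can still do its own expansion with the standard glob(3)
    call. The sketch below is only an editorial illustration of that, assuming
    a POSIX C library; it is not code from either side of this discussion, and
    the file name is hypothetical.)

        /* expand.c -- hypothetical example of per-program wildcard expansion
           using POSIX glob(3). Each argument is treated as a pattern and the
           matching filenames are printed, one per line. */
        #include <glob.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            for (int i = 1; i < argc; i++) {
                glob_t g;
                /* GLOB_NOCHECK: if nothing matches, keep the pattern itself,
                   much as an interactive shell does by default. */
                if (glob(argv[i], GLOB_NOCHECK, NULL, &g) == 0) {
                    for (size_t j = 0; j < g.gl_pathc; j++)
                        printf("%s\n", g.gl_pathv[j]);
                    globfree(&g);
                }
            }
            return 0;
        }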

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sat Dec 24 00:02:56 2022
    Bart <bc@freeuk.com> wrote:

    That will be up to individual programs whether they accept wildcards for filenames, and what they do about them. If the input is "*", what kinds
    of application would be dealing with an ill-matched collection of files, including all sorts of junk that happens to be lying around?

    Yesterday I used several times the following:

    du -s *

    'man du' will tell you what it does. And sometimes I use the
    (in)famous:

    rm -rf *

    Do not try this if you do not know what it is doing!

    It would be very specific. BTW if I do this under WSL:

    vim *.c

    Then I was disappointed that I didn't get hundreds of edit windows for
    all those files. So even under Linux, sometimes expansion doesn't
    happen. (What magic does vim use to get the OS to see sense?)

    I suspect that you incorrectly interpreted your observation.
    I do not use vim in graphical mode, but I use it a lot in
    text mode. In text mode, vim, given a list of files, shows you
    the first. But it remembers them all and allows easy "movement"
    between files. Probably this is what happened to you.

    Usually the last thing you want is for the OS (or whatever is
    responsible for expanding those command line params before they get to
    the app) to just expand EVERYTHING willy-nilly:

    Speak for yourself. If an application needs a list of files, I want
    that list to be expanded. In principle the application could expand the
    list, but usually it is convenient when it is expanded earlier.

    * Maybe the app needs to know the exact parameters entered (like my
    showargs program).

    AFAICS your 'showargs' works fine, in that it shows what it received.  The only
    nitpick is that position 0 is special (usually it is the program name),
    and normal arguments start at 1.  You started numbering normal
    arguments at 2.

    Maybe they are to be stored, and passed on to
    different parts of an app as needed.

    Maybe, it works fine.

    * Maybe they are to be passed on to another program, where it's going
    to be much easier and tidier if they are unexpanded:

    c:\mx>ms showargs * *.c
    1 *
    2 *.c

    Here 'ms', which runs 'showargs' as a script, sees THREE parameters: showargs, * and *.c; It arranges for the last two to be passed as input
    to the program being run.

    * Maybe the app's inputs are mathematical expressions so that you want
    "A*B" and not a list of files that start with A and end with B!

    Maybe.

    * But above all, it simply doesn't work, not when you have expandable
    params interspersed with other expandable ones, or even ordinary params, because everything just merges together.

    It works fine for me in very common cases, when I need to produce
    a single list of files.

    So here there are two things I find utterly astonishing:

    (1) That you seem to think this a good idea, despite that list of problems

    You did not explain why you want your program to see * *.c. As you
    noted, if needed, one can quote parameters. So I would need a rather
    frequent use case to prefer the non-expanded version.

    Yes, individual apps can CHOOSE to do their own expansion, but that is workable because the expansion list is segregated from the other parameters.

    I am under the impression that you are missing an important fact: Unix was
    designed as a _system_. Programs expect the system conventions and work
    with them, not against them. One convention is that there are a few dozen
    (a myriad, in your language) small programs that are supposed to work
    together. The shell works as the glue that combines them. Shell plus
    utilities form a programming language, crappy as a programming language,
    but quite useful. In particular, the ability to transform/create command
    lines via programming allows automating a lot of tasks.

    Just more on my point of view: I started to use DOS around 1988
    (I was introduced to computers on mainframes and I had ZX
    Spectrum earlier). My first practical contact with Unix was in
    1990. It took me some time to understand how Unix works, but once
    I "got it" I was able easily to do things on Unix that would be
    hard (or require much more work) on DOS. By 1993 I was mostly
    using Unix (more precisely, at that time I switched from 386BSD
    Unix to Linux).

    Coming back to Unix: it works for me. DOS in comparison felt
    crappy. Compared to 1993, Windows has improved, but for me
    this does not change much; I saw nothing that would be
    better for _me_ than what I have in Linux. Now, if you want to
    improve, one can think of many ways of doing things better than on
    Unix. The trouble is that any real-world design will have compromises.
    You lose some possibilities to gain others. You either
    do not understand Unix or at least pretend not to understand.
    If you do not understand Unix, then you are not qualified to judge
    it. It looks like you do not know what you lose by choosing a
    different design.

    BTW: I did spend some time thinking of a better command line than
    Unix. Some of my ideas were quite different. But none were borrowed
    from DOS.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Fri Dec 23 23:46:37 2022
    On 23/12/2022 15:38, David Brown wrote:
    On 22/12/2022 22:55, Bart wrote:

    I expect you didn't like DOS because it doesn't have the dozens of
    toys that you came to rely on in Unix, including a built-in C
    compiler; what luxury!


    I find it useful to have a well-populated toolbox.  I am an engineer - finding the right tools for the job, and using them well, is what I do.
     DOS gave you a rock to bash things with.  Some people, such as
    yourself, seem to have been successful at bashing out your own tools
    with that rock.  That's impressive, but I find it strange how you can be happy with it.

    You seem fixated on DOS. My early coding was done on DEC and ICL OSes,
    then no OS at all, then CP/M (or our clone of it).

    The tools provided, when they were provided, were always spartan:
    editor, compiler, linker, assembler. That's nothing new.

    The alternative in DOS and Windows has always been to buy additional
    tools that *nix users take for granted.  And those tools always have to include /everything/.

    'Everything' is good. Remember those endless discussions on clc about
    what exactly constituted a C compiler? Because this new-fangled 'gcc'
    didn't come with batteries included, like header files, assembler or linker.

    A bad mistake on Windows where those utilities are not OS-provided.

    But tell me again why a 'linker', of all things, should be part of a
    consumer product mostly aimed at people doing things that are nothing to
    do with building software. Why give it such special dispensation?

      So while a Pascal compiler for *nix just needs to
    be a Pascal compiler - because there are editors, build systems,
    debuggers, libraries, assemblers and linkers already - on DOS/Windows
    you had to get Turbo Pascal or Borland Pascal which had everything
    included.

    You can get bare compilers on Windows too.

      (Turbo Pascal for DOS and Win3.1 included "make", by the way
    - for well over a decade that was the build tool I used for all my
    assembly and C programming on microcontrollers.)  And then when you
    wanted C, you bought MSVC which included a different editor with a
    different setup, a different assembler, a different build tool (called
    "nmake"), and so on.  Everything was duplicated but different, everything
    incompatible, everything a huge waste of the manufacturers' time and
    effort, and a huge waste of the users' money and time as they had to get
    familiar with another set of basic tools.

    Nothing stopped anybody from marketing a standalone assembler or linker
    that could be used with third party compilers. These are not complicated programs (a workable linker is only 50KB).

    I can't answer that. Unless that assembler used 'gas' syntax - in which
    case I would write my own too.


    Do I like the fact that *nix has always come with a wide range of general-purpose tools?  Yes, I most surely do!

    What did that do for companies wanting to develop and sell their own
    compilers and tools?


    It's because DOS was so sparse that I have few dependencies on it; and
    my stuff can build on Linux more easily than Linux programs can build
    on Windows. (AIUI, most won't; you need to use CYGWIN or MSYS or WSL,
    but then you don't get a bona fide Windows executable that customers
    can run directly.)


    Programs built with msys2 work fine on any Windows system.  You have to include any DLL's you use, but that applies to all programs with all tools.

    My main experience of MSYS2 was of trying to get GMP to build. I spent
    hours but failed miserably. This stuff should just work, and I don't
    mean watching stuff scroll up the screen for an hour until you were
    eventually rewarded - if you were lucky.

    Just how long should a 500KB library take to build? How many dozens of
    special tools should it need? I didn't care about performance, only
    something I could use.

    Then there was the OS - it was crap.  Unix was vastly better than CP/M
    and the various DOS's.

    Yeah, you keep saying that. So Shakespeare was perhaps a better writer
    than Dickens; maybe so, but I'd rather read Raymond Chandler!

    As I said I have little interest in OSes or the characteristics that for
    you would score points over another. But your preferences would get
    negative points from me due to case-sensitivity and pedanticness.


      But IBM was not involved in Unix at that time,
    and did not want others to control the software on their machines. (They thought they could control Microsoft.)

    Or compare it to other machines, such as the BBC Micro.  It had the OS
    in ROM, making it far simpler and more reliable.  It had a good
    programming language in ROM.  (BASIC, but one of the best variants.)
    This meant new software could be written quickly and easily in a high
    level language, instead of assembly (as was the norm for the PC in the
    early days).  It had an OS that was expandable - it supported pluggable
    file systems that were barely imagined at the time the OS was designed.
     It was a tenth of the price of the PC.

    It used a 6502. I'd argue it was better designed than any Sinclair
    product, with a proper keyboard, but it was still in that class of machine.

    BTW this is the kind of machine my company were selling:

    https://nosher.net/archives/computers/pcw_1982_12_006a

    (My first redesign task was adding the bitmapped graphics on the display.)

    Of course the BBC Micro wasn't perfect either, and had limitations
    compared to the PC - not surprising, given the price difference.  The
    6502 was not a powerful processor.

    As I said...

    More business-oriented 8-bit systems were based on the Z80, such as the
    PCW 8256, with CP/M 3. (My first commercial graphical application was
    for that machine IIRC.)

    So you don't rate its OS - so what? All customers needed were the most
    mundane things. It was marketed as a word processor after all!

    But imagine what we could have had with a computer using a 68000,
    running a version of Unix, combining the design innovation,
    user-friendliness and forward thinking of Acorn and the business
    know-how of IBM?  It would have been achievable at lower cost than the
    IBM PC, and /so/ much better.

    As it was, by the mid-eighties there were home computers with usability, graphics and user interfaces that were not seen in the PC world for a decade.  There were machines that were cheaper than the PC's of the time
    and could /emulate/ PC's at near full-speed.  The PC world was far
    behind in its hardware, OS and basic software.  But the IBM PC and MSDOS
    won out because the other machines were not compatible with the IBM PC
    and MSDOS.

    That's true. I was playing with 24-bit RGB graphics for my private
    designs about a decade before it became mainstream on PCs.

    But where were the Unix alternatives that people could buy from PC
    World? Sure there had been colour graphics in computers for years but
    I'm talking about consumer PCs.

    I used Unix systems at university in the early 1990's, and by $DEITY it
    was a /huge/ step backwards when I had to move to Windows.

    I went from a £500,000 (in mid-70s money) mainframe at college, running
    TOPS 20 I think, to my own £100 Z80 machine with no OS on it at all, and
    no disk drives either.

    I'd say /that/ was a huge step backwards! Perhaps you can appreciate why
    I'm not that bothered.

    The 8088/86/286 had some disappointing limitations, which were fixed
    with the 80386.


    It's a pity MS didn't catch up with proper 32-bit support until AMD were already working on 64-bit versions.

    It wasn't so critical with the 80386. Programs could run in 16-bit mode
    under a 16-bit OS, and use 32-bit operations, registers and address modes.

    On the business and marketing side, there's no doubt that MS in
    particular outclassed everyone else.  They innovated the idea of
    criminal action as a viable business tactic - use whatever illegal means
    you like to ensure competitors go bankrupt before they manage to sue
    you, and by the time the case gets to the courts the fines will be
    negligible compared to your profit.  Even IBM was trapped by them.

    I never got interested in that side; I was always working to deadlines!

    But what exactly was the point of Linux? What exactly was wrong with Unix?


    But yes, compatibility and market share was key - Windows and PC's were popular because there was lots of hardware and software that worked with them, and there was lots of hardware and software for them because they
    were popular.  They were technically shite,

    /Every/ software and hardware product for Windows was shite? Because
    you've looked at every one and given your completely unbiased opinion!

    Which has been excellent. Until they chose not to support 16-bit
    binaries under 64-bit Windows.


    I believe you can run Wine on Windows, and then you could run 16-bit binaries.  But you might have to run Wine under Cygwin or something -
    it's not something I have tried.

    (I gave this 10 minutes but it led nowhere. Except it involved an
    extra 500MB to install stuff that didn't work, but when I purged it, it
    only recovered 0.2MB.

    Hmmm.. have I mentioned the advantages of a piece of software that comes
    and runs as a single executable file? Either it's there or not there.
    There's nowhere for it to hide!)

    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a
    criticism of you or your language.

    I get annoyed when people openly diss Windows, or MSDOS, simply for
    not being Linux.


    I diss Windows or DOS because it deserves it.  Linux was not conceived
    when I realised DOS was crap compared to the alternatives.  (You always
    seem to have such trouble distinguishing Linux and Unix.)

    Understandably. What exactly /is/ the difference? And what are the
    differences between the myriad different versions of Linux even for the
    same platform?

    Apparently having more than one assembler or linker on a platform is a
    disaster. But having 100 different versions of the same OS, that's
    perfectly fine!

    I like all these contradictions.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sat Dec 24 00:05:58 2022
    Bart <bc@freeuk.com> wrote:

    So much negativity here.

    Just for the record: I did offer you a constructive idea: use the GitHub
    release area for distribution (that is the expected use of the release
    area).

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Sat Dec 24 10:13:53 2022
    On 24/12/2022 00:02, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    That will be up to individual programs whether they accept wildcards for
    filenames, and what they do about them. If the input is "*", what kinds
    of application would be dealing with an ill-matched collection of files
    including all sorts of junk that happens to be lying around?

    Yesterday I used several times the following:

    du -s *

    'man du' will tell you what it does. And sometimes I use the
    (in)famous:

    rm -rf *

    Do not try this if you do not know what it is doing!

    It still strikes me as a very sloppy feature which, apart from my other misgivings, only happens to work:

    * When there is at most one wildcard parameter

    * When it happens to be the last one (otherwise it all merges together),

    * When the application can tolerate parameters with embedded * or ?
    character potentially providing huge numbers of expanded parameters

    * When the application doesn't need parameters with raw * or ?
    characters unchanged.

    Plus new issues:

    * Conceivably, some implementation limits can be reached when the number
    of files is large

    * It could waste time, and space, dealing needlessly with all these
    files if the app does nothing with them

    Your examples are OK but those are OS shell utilities.

    Typical applications do not work like that, and they would not anyway
    want to open themselves up to who knows what sorts of problems when
    usage doesn't obey the above rules.

    But even with programs working on files like shell commands, how the
    hell do you do the equivalent of Windows':

    copy *.c *.d

    (Make copies of all files ending in .c, changing the extension to .d)

    Since all the program sees is a uniform list of files. Did the user even
    type two parameters, or was it none, or a dozen?

    Sorry, it is just dreadful. I reckon it was a big mistake in Unix, like
    the many in C, that is now being touted as a feature because it is too
    late to change it.

    Specifying wildcards in input to an app can be useful, but it has to be
    at the application's behest, and it has to be properly organised: the
    copy command of my example:

    * Needs the original "*.c" and "*.d" separately so it can work out the
    filename remapping pattern

    * /Then/ the "*.c" can be formed into a list of files (which on Windows
    is often done lazily), but not the "*.d".

    As it is, if the directory had 5 .c files and 100 .d files, Unix would
    return a list of 105 files as input - wrong! And copied to what? It's impossible to say.
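
    As an illustration of the two steps above - not the actual copy command -
    here is a minimal C sketch, assuming the POSIX glob(3) API and a
    hypothetical name 'mycopy', of a program that receives the raw patterns
    and does the expansion and remapping itself:

        /* mycopy "*.c" "*.d"  -- illustrative sketch only.
           The source pattern is expanded by the program; the destination
           pattern is used only to work out the new extension. */
        #include <glob.h>
        #include <stdio.h>
        #include <string.h>

        int main(int argc, char **argv)
        {
            if (argc != 3) {
                fprintf(stderr, "usage: %s src-pattern dst-pattern\n", argv[0]);
                return 1;
            }

            const char *dstext = strrchr(argv[2], '.');   /* e.g. ".d" from "*.d" */
            if (dstext == NULL) dstext = "";

            glob_t g;
            if (glob(argv[1], 0, NULL, &g) != 0)          /* expand e.g. "*.c" */
                return 1;

            for (size_t i = 0; i < g.gl_pathc; i++) {
                const char *src = g.gl_pathv[i];
                const char *srcext = strrchr(src, '.');
                int stem = srcext ? (int)(srcext - src) : (int)strlen(src);

                char dst[4096];
                snprintf(dst, sizeof dst, "%.*s%s", stem, src, dstext);
                printf("copy %s -> %s\n", src, dst);      /* a real tool would copy here */
            }
            globfree(&g);
            return 0;
        }

    Under a Unix shell the patterns would have to be quoted (mycopy '*.c'
    '*.d') precisely so that the shell does not expand them first - which is
    the point of contention.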

    Still not convinced? Never mind. Perhaps I was hoping that just once
    someone would say that Unix got something wrong. You will say, Ah but it
    works; yeah, but it needs you to keep your fingers crossed.


    It would be very specific. BTW if I do this under WSL:

    vim *.c

    Then I was disappointed that I didn't get hundreds of edit windows for
    all those files. So even under Linux, sometimes expansion doesn't
    happen. (What magic does vim use to get the OS to see sense?)

    I suspect that you incorrectly interpreted your observation.

    Yeah, I ran it in a folder with no C files, so it was editing one file
    called "*.c". With lots of C files, it edited the first, but said little
    about the remaining 200+ files I'd apparently specified.


    Usually the last thing you want is for the OS (or whatever is
    responsible for expanding those command line params before they get to
    the app) to just expand EVERYTHING willy-nilly:

    Speak for yourself. If an application needs a list of files, I want
    this list to be expanded.

    What about the apps that don't: it's easier to expand wildcards later
    than to turn 1000s of filenames back into the small number of actual
    parameters. (Like turning an omelette back into eggs!)

    I am under the impression that you miss an important fact: Unix was
    designed as a _system_. Programs expect system conventions and work with
    them, not against them. One convention is that there are a few dozen
    (a myriad in your language) small programs that are supposed to work
    together. The shell works as the glue that combines them. Shell+utilities
    form a programming language, crappy as a programming language, but
    quite useful. In particular, the ability to transform/create the command
    line via programming means allows automating a lot of tasks.

    Just more on my point of view: I started to use DOS around 1988
    (I was introduced to computers on mainframes and I had ZX
    Spectrum earlier). My first practical contact with Unix was in
    1990. It took me some time to understand how Unix works, but once
    I "got it" I was able easily to do things on Unix that would be
    hard (or require much more work) on DOS. By 1993 I was mostly
    using Unix (more precisely, at that time I switched from 386BSD
    Unix to Linux).

    Coming back to Unix: it works for me. DOS in comparison felt
    crappy. Compared to 1993, Windows has improved, but for me
    this does not change much; I saw nothing that would be
    better for _me_ than what I have in Linux. Now, if you want to
    improve, one can think of many ways of doing things better than on
    Unix. The trouble is that any real-world design will have compromises.
    You lose some possibilities to gain others. You either
    do not understand Unix or at least pretend not to understand.
    If you do not understand Unix, then you are not qualified to judge
    it. It looks like you do not know what you lose by choosing a
    different design.

    I really don't care about shell programs other than to provide the
    absolute basics.

    If you mean the underlying OS, then I don't have much of an opinion.
    Unix is case-sensitive in its file system, that's one thing.

    But its API seems to be controlled by 85 POSIX headers, which include
    the standard C headers. Plus it has special 'syscall' entry points using
    a different mechanism.
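
    As a small illustration (not from the thread; assuming Linux with glibc,
    which provides a syscall() wrapper), the same operation reached through
    the POSIX headers and through the raw system-call entry point:

        /* Illustrative only: write to stdout via the libc/POSIX API,
           then via the raw system-call interface. */
        #define _GNU_SOURCE
        #include <unistd.h>
        #include <sys/syscall.h>

        int main(void)
        {
            const char a[] = "hello via libc\n";
            write(STDOUT_FILENO, a, sizeof a - 1);               /* POSIX header route   */

            const char b[] = "hello via raw syscall\n";
            syscall(SYS_write, STDOUT_FILENO, b, sizeof b - 1);  /* syscall entry point  */
            return 0;
        }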

    I can't remember how DOS worked (most functionality I had to provide
    myself anyway) but comparing with Windows now, that does everything with
    a single 'windows.h' header, but includes vast amounts of extra
    functionality.

    They're just different. One seems mired in C and cannot be extricated
    from it, the other seems to have more successfully drawn a line between
    itself, and the C language, C libraries and C compilation tools.

    Neither is that easy to use from a private language, which was /my/ main
    problem. I used workarounds only sufficient for my actual needs.


    BTW: I did spend some time thinking of a better command line than
    Unix. Some of my ideas were quite different. But none were borrowed
    from DOS.

    Not even the freedom to write 'cd..' or 'CD..' instead of 'cd ..'?

    I guess not; the world would surely stop spinning if Unix didn't require
    that space between "cd" and "..". (Let me guess, "cd.." could be the
    complete name of an independent program? Stupid decisions apparently
    have long-lasting consequences.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to antispam@math.uni.wroc.pl on Sat Dec 24 12:20:22 2022
    On 2022-12-24 01:02, antispam@math.uni.wroc.pl wrote:

    rm -rf *

    Do not try this if you do not know what it is doing!

    Oh, yes. I remember removing half of my Solaris file system running rm
    as root with ".." matched. I noticed that it ran too long...

    One of the ways to illustrate the beauty of UNIX "ideas" is this:

    $ echo "" > -i

    Now try to "more" or remove it (:-))

    (You can also experiment with files named -rf and various UNIX commands...)

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Dmitry A. Kazakov on Sat Dec 24 18:36:58 2022
    Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote:
    On 2022-12-22 16:09, Andy Walker wrote:

    You [and probably Dmitry] seem to have a very weird idea of what
    Unix /is/.

    I don't know about Bart. As for me, I started with PDP-11 UNIX and
    continued with m68k UNIX Sys V. Both were utter garbage, in every
    possible aspect inferior to any competing system, with the worst C
    compilers I have ever seen.

    Can you name those superior competing systems?

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to antispam@math.uni.wroc.pl on Sat Dec 24 19:51:35 2022
    On 2022-12-24 19:36, antispam@math.uni.wroc.pl wrote:
    Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote:
    On 2022-12-22 16:09, Andy Walker wrote:

    You [and probably Dmitry] seem to have a very weird idea of what
    Unix /is/.

    I don't know about Bart. As for me, I started with PDP-11 UNIX and
    continued with m68k UNIX Sys V. Both were utter garbage, in every
    possible aspect inferior to any competing system, with the worst C
    compilers I have ever seen.

    Can you name those superior competing systems?

    RSX-11M, VMS-11, IBM's virtual machines OS, I forgot the name.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Dmitry A. Kazakov on Sat Dec 24 22:15:55 2022
    On 24/12/2022 11:20, Dmitry A. Kazakov wrote:
    One of the way to illustrate the beauty of UNIX "ideas" is this:
    $ echo "" > -i
    Now try to "more" or remove it (:-))

    Any experienced Unix user should know at least four ways of
    doing that; anyone else could RTFM [eg, "man rm"], which gives two
    of them.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Schubert

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Tue Dec 27 14:10:29 2022
    On 2022-12-27 13:24, David Brown wrote:
    On 24/12/2022 00:46, Bart wrote:
    On 23/12/2022 15:38, David Brown wrote:

    But tell me again why a 'linker', of all things, should be part of a
    consumer product mostly aimed at people doing things that are nothing
    to do with building software. Why give it such special dispensation?

    It is convenient to have on the system.  Programs can rely on it being there.

    I have the impression that you guys confuse a linker with a loader.
    Programs (applications) do not need a linker.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Dec 27 13:24:04 2022
    On 24/12/2022 00:46, Bart wrote:
    On 23/12/2022 15:38, David Brown wrote:
    On 22/12/2022 22:55, Bart wrote:

    I expect you didn't like DOS because it doesn't have the dozens of
    toys that you came to rely on in Unix, including a built-in C
    compiler; what luxury!


    I find it useful to have a well-populated toolbox.  I am an engineer -
    finding the right tools for the job, and using them well, is what I
    do.   DOS gave you a rock to bash things with.  Some people, such as
    yourself, seem to have been successful at bashing out your own tools
    with that rock.  That's impressive, but I find it strange how you can
    be happy with it.

    You seem fixated on DOS. My early coding was done on DEC and ICL OSes,
    then no OS at all, then CP/M (or our clone of it).

    I know nothing about these systems, so I can't comment on them. And
    besides, they all died off.


    The tools provided, when they were provided, were always spartan:
    editor, compiler, linker, assembler. That's nothing new.


    Sure.

    The alternative in DOS and Windows has always been to buy additional
    tools that *nix users take for granted.  And those tools always have
    to include /everything/.

    'Everything' is good. Remember those endless discussions on clc about
    what exactly constituted a C compiler? Because this new-fangled 'gcc'
    didn't come with batteries included, like header files, assembler or
    linker.


    "Everything" is /not/ good.

    The definition of a "C compiler" is not something from gcc. The
    definition of a compiler comes from long before C and long before Unix.
    The definition of a "C compiler" comes from the C language standards.
    Compilers take high level source code and turn it into low level code -
    whether that be assembly, machine code, byte code for virtual machine,
    or anything else.

    The GCC folks didn't bother making an assembler, linker or standard
    library because they came with the systems GCC compilers targeted. The
    same applies to pretty much all other compilers for all languages,
    whether it was Intel's compilers, Sun's compilers, IBM's, or anyone else's.

    Even if you look at, say, Borland's tools for DOS/Windows, you see the
    same pattern. Borland's Pascal compiler group made a Pascal compiler,
    their C compiler group made a C compiler, their assembler group made an assembler, their editor group made an editor, their build tools group
    made a "make" utility. They /shipped/ tools together as a collection,
    but they did not pointlessly duplicate effort.

    A bad mistake on Windows where those utilities are not OS-provided.


    GCC did not target Windows. Other people put together collections of
    GCC compilers, along with libraries, an assembler and linker, and any
    other parts they thought were useful. Some groups thought a selection
    of command-line utilities were a helpful addition to the package, others thought an IDE was helpful.

    Could these groups, such as msys, cygwin, etc., do a better job of
    packaging and making things easier? I'm sure they could - I've rarely
    found any piece of software or packaging that I felt was absolutely
    perfect.

    But tell me again why a 'linker', of all things, should be part of a
    consumer product mostly aimed at people doing things that are nothing to
    do with building software. Why give it such special dispensation?


    It is convenient to have on the system. Programs can rely on it being
    there.

    If you take a Windows system, and look at the DLL's and EXE files on the system, I bet most users could identify the purpose of far less than 1%
    of them. Even the most experienced power users will only know about a
    tiny fraction of them.

      So while a Pascal compiler for *nix just needs to be a Pascal
    compiler - because there are editors, build systems, debuggers,
    libraries, assemblers and linkers already - on DOS/Windows you had to
    get Turbo Pascal or Borland Pascal which had everything included.

    You can get bare compilers on Windows too.

      (Turbo Pascal for DOS and Win3.1 included "make", by the way - for
    well over a decade that was the build tool I used for all my assembly
    and C programming on microcontrollers.)  And then when you wanted C,
    you bought MSVC which included a different editor with a different
    setup, a different assembler, a different build tool (called "nmake"),
    and so on.  Everything was duplicated but different, everything
    incompatible, everything a huge waste of the manufacturers' time and
    effort, and a huge waste of the users' money and time as they had to
    get familiar with another set of basic tools.

    Nothing stopped anybody from marketing a standalone assembler or linker
    that could be used with third party compilers. These are not complicated programs (a workable linker is only 50KB).


    Such tools were available as stand-alone products. More often, I think,
    they were licensed and included with third-party tools. (I.e., if you
    wanted to make a Fortran compiler, you'd get your assembler and linker
    from Borland, MS, or someone else.) The same is still done today.

    I can't answer that. Unless that assembler used 'gas' syntax - in which
    case I would write my own too.


    Do I like the fact that *nix has always come with a wide range of
    general-purpose tools?  Yes, I most surely do!

    What did that do for companies wanting to develop and sell their own compilers and tools?


    People can, and do, develop and sell compilers and other tools. But if
    good tools are available for free or included with the system (remember,
    "true" Unix is not normally free, and there are plenty of commercial
    Linux distributions) then you have to make better tools or provide
    better service if you want to make much money.


    Or compare it to other machines, such as the BBC Micro.  It had the OS
    in ROM, making it far simpler and more reliable.  It had a good
    programming language in ROM.  (BASIC, but one of the best variants.)
    This meant new software could be written quickly and easily in a high
    level language, instead of assembly (as was the norm for the PC in the
    early days).  It had an OS that was expandable - it supported
    pluggable file systems that were barely imagined at the time the OS
    was designed.   It was a tenth of the price of the PC.

    It used a 6502. I'd argue it was better designed than any Sinclair
    product, with a proper keyboard, but it was still in that class of machine.


    The Sinclair machines (ZX80, ZX81, ZX Spectrum) were targeting absolute
    minimal costs - the BBC Micro had a significantly higher price (about
    three times the cost, IIRC). And I agree, the result was a far better
    machine in most aspects.

    BTW this is the kind of machine my company were selling:

      https://nosher.net/archives/computers/pcw_1982_12_006a

    (My first redesign task was adding the bitmapped graphics on the display.)


    There was a lot more variety and innovation in those days!

    At that time, 1982, we had a TI 99/4A. My father sometimes had a BBC
    Micro from his work, but I rarely got a chance to use it.

    Of course the BBC Micro wasn't perfect either, and had limitations
    compared to the PC - not surprising, given the price difference.  The
    6502 was not a powerful processor.

    As I said...

    More business-oriented 8-bit systems were based on the Z80, such as the
    PCW 8256, with CP/M 3. (My first commercial graphical application was
    for that machine IIRC.)

    So you don't rate its OS - so what? All customers needed were the most
    mundane things. It was marketed as a word processor after all!


    Yes, that's true.

    But imagine what we could have had with a computer using a 68000,
    running a version of Unix, combining the design innovation,
    user-friendliness and forward thinking of Acorn and the business
    know-how of IBM?  It would have been achievable at lower cost than the
    IBM PC, and /so/ much better.

    As it was, by the mid-eighties there were home computers with
    usability, graphics and user interfaces that were not seen in the PC
    world for a decade.  There were machines that were cheaper than the
    PC's of the time and could /emulate/ PC's at near full-speed.  The PC
    world was far behind in its hardware, OS and basic software.  But the
    IBM PC and MSDOS won out because the other machines were not
    compatible with the IBM PC and MSDOS.

    That's true. I was playing with 24-bit RGB graphics for my private
    designs about a decade before it became mainstream on PCs.


    That must have been fun!

    But where were the Unix alternatives that people could buy from PC
    World? Sure there had been colour graphics in computers for years but
    I'm talking about consumer PCs.


    They were expensive, and business only.

    I used Unix systems at university in the early 1990's, and by $DEITY
    it was a /huge/ step backwards when I had to move to Windows.

    I went from a £500,000 (in mid-70s money) mainframe at college, running
    TOPS 20 I think, to my own £100 Z80 machine with no OS on it at all, and
    no disk drives either.

    I'd say /that/ was a huge step backwards! Perhaps you can appreciate why
    I'm not that bothered.


    OK, that was a big step :-)

    The 8088/86/286 had some disappointing limitations, which were fixed
    with the 80386.


    It's a pity MS didn't catch up with proper 32-bit support until AMD
    were already working on 64-bit versions.

    It wasn't so critical with the 80386. Programs could run in 16-bit mode
    under a 16-bit OS, and use 32-bit operations, registers and address modes.

    On the business and marketing side, there's no doubt that MS in
    particular outclassed everyone else.  They innovated the idea of
    criminal action as a viable business tactic - use whatever illegal
    means you like to ensure competitors go bankrupt before they manage to
    sue you, and by the time the case gets to the courts the fines will be
    negligible compared to your profit.  Even IBM was trapped by them.

    I never got interested in that side; I was always working to deadlines!

    But what exactly was the point of Linux? What exactly was wrong with Unix?


    The Unix world was very insular, and often tightly bound to hardware and services. There were many vendors, such as HP, IBM, and Sun, along with
    more hardware-independent vendors like AT&T and Microsoft. But the big suppliers wanted you to get everything from them - you bought Sun
    workstations, Sun monitors, Sun printers, Sun networking systems, Sun processors, as well as SunOS or Solaris Unix. They cooperated on API's, libraries, standard utilities, file system layouts, etc. - a common base
    that became POSIX and allowed a fair degree of software compatibility
    across widely different hardware.

    And it was expensive - it was expensive to license Unix, and the
    hardware used for it was expensive. It was also closed source and
    proprietary, though there was a fair amount of free and/or open source
    software available under all sorts of different and sometimes
    incompatible licenses.

    Three things changed all this. One is the GNU project that aimed to
    re-write Unix as free software with a new license and development
    model. (They are even working on a kernel.) The second was that a
    university professor and writer, Andrew Tanenbaum, wrote a complete
    Unix-like OS for x86 PC's and made it available cheaply for educational purposes. And the third was the internet, and Usenet.

    These formed the ecosystem for Linux to be developed as an alternative - providing the power of Unix without the software cost and without the
    hardware cost, and in an open sharing environment.


    But yes, compatibility and market share was key - Windows and PC's
    were popular because there was lots of hardware and software that
    worked with them, and there was lots of hardware and software for them
    because they were popular.  They were technically shite,

    /Every/ software and hardware product for Windows was shite? Because
    you've looked at every one and given your completely unbiased opinion!


    "They" refers to the core of the PC and of the OS.

    Which has been excellent. Until they chose not to support 16-bit
    binaries under 64-bit Windows.


    I believe you can run Wine on Windows, and then you could run 16-bit
    binaries.  But you might have to run Wine under Cygwin or something -
    it's not something I have tried.

    (I gave this 10 minutes but it led nowhere. Except it involved an
    extra 500MB to install stuff that didn't work, but when I purged it, it
    only recovered 0.2MB.


    As I say, I haven't tried anything like this, so I can't help.

    But just for fun, I copied a 16-bit Windows program I wrote some 25
    years ago (in Delphi) to my 64-bit Linux system, and it ran fine with
    "wine PROG.EXE". The program doesn't do much without particular
    hardware connected on a serial port, so I didn't test much - and I have
    no idea how successful serial communication might be.

    It is enough to see that 16-bit Windows software /can/ run on 64-bit
    Linux, at least to some extent.

    Hmmm.. have I mentioned the advantages of a piece of software that comes
    and runs as a single executable file? Either it's there or not there.
    There's nowhere for it to hide!)


    I definitely see the advantage of stand-alone software. Single file is completely irrelevant to me, but I do prefer programs to have a specific purpose and place. Programs should not take resources other than file
    space when they are not in use, should not run unless you want them to,
    and it should be straight-forward to remove them without leaving piles
    of mess in odd places. I think we can agree on those principles (even
    if we place different weights on the importance of the size of the
    software).

    This applies to software on Linux and Windows. I don't like software
    that fills the Windows registry with settings which are usually left
    hanging after an uninstall. I don't like software that changes other
    parts of the system, or installs always-running programs or services.

    I like software that has a specific task, and does that task when you
    ask it to, and does not bother you when you are not using it.

    It would be wrong to suggest that *nix software always fits that model,
    or that Windows software always gets it wrong - but it is fair to say
    that "do one thing, and do it well" is the *nix software philosophy that
    is not as common in the Windows world.


    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a
    criticism of you or your language.

    I get annoyed when people openly diss Windows, or MSDOS, simply for
    not being Linux.


    I diss Windows or DOS because it deserves it.  Linux was not conceived
    when I realised DOS was crap compared to the alternatives.  (You
    always seem to have such trouble distinguishing Linux and Unix.)

    Understandably. What exactly /is/ the difference? And what are the differences between the myriad different versions of Linux even for the
    same platform?


    Think of it all as a family tree. Unix was created in 1971, and had
    many branches, of which the most important commercial ones were probably
    AIX (IBM), HP-UX (HP), and SunOS/Solaris (Sun), and the most important academic/research branch was BSD. (There was also MS's Xenix, later
    SCO, but its significance in the story was mostly as MS's proxy war
    against Linux.)

    Linux did not enter the picture until 1991, and did not spread beyond
    somewhat nerdy or specialist use until at least the turn of the century.
    It has no common heritage with "true" Unix - so technically, it is not
    Unix, but it is Unix-like. Some people (including me) refer to the
    wider world of Unix-like OS's as *nix.

    <https://en.wikipedia.org/wiki/File:Unix_history-simple.svg>


    "Linux" technically refers only to the kernel, not the whole OS and
    surrounding basic software. You are correct that it is understandable
    to find it confusing. There is only one "mainline" Linux kernel, but
    there are vast numbers of options and configuration choices, as well as
    patches for various features maintained outside the main kernel source.

    Then there are "distributions", which are collections of the Linux
    kernel and other software, along with some kind of "package management"
    system to let users install and update software. Some distributions are commercial (such as Red Hat and Ubuntu), some are entirely open (such as Debian), some are specialised for particular purposes, some are aimed at servers, some at desktops. Android is also Linux, as is ChromeOS.

    If you want a fairly complete tree, you can see it here:

    <https://upload.wikimedia.org/wikipedia/commons/b/b5/Linux_Distribution_Timeline_21_10_2021.svg>

    If you want a recommendation for use, I'd say "Linux Mint".


    Apparently having more than one assembler or linker on a platform is a disaster.

    That's an exaggeration. Having more than one assembler or linker is unnecessary effort, unless they do something significantly different.

    But having 100 different versions of the same OS, that's
    perfectly fine!


    There aren't that many in common use. But it's all open source -
    putting together a new distribution is not /that/ much effort, as you
    generally start from an existing one and make modifications. One of the
    most popular (and my favourite for desktop use) is Linux Mint - it
    started as a fork of Ubuntu with nothing more than a change of the
    default theme from a brown colour to a nice green colour. Most
    distributions remain niche or die out, and others gain followers and spread.

    I like all these contradictions.

    Contradictions are inevitable, so it's great that you like them :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Tue Dec 27 14:50:24 2022
    On 27/12/2022 13:10, Dmitry A. Kazakov wrote:
    On 2022-12-27 13:24, David Brown wrote:
    On 24/12/2022 00:46, Bart wrote:
    On 23/12/2022 15:38, David Brown wrote:

    But tell me again why a 'linker', of all things, should be part of a
    consumer product mostly aimed at people doing things that are nothing
    to do with building software. Why give it such special dispensation?

    It is convenient to have on the system.  Programs can rely on it being
    there.

    I have the impression that you guys confuse a linker with a loader.
    Programs (applications) do not need a linker.


    I privately coined the term 'loader' in the 1980s for a program that
    combined multiple object files, from independently compiled source
    modules of a program, into a single program binary (eg. a .com file).

    This was a trivial task that could be done as fast as files could be
    read from disk.

    It was also pretty much what a linker did, yet a linker was a far more complicated program that also took much longer. What exactly do linkers
    do? I'm still not really sure!

    Anyway, I no longer have a need to combine object files (there are no
    object files). But there are dynamic fix-ups needed, which the OS EXE
    loader will do, between an EXE and its imported DLLs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 27 16:12:02 2022
    Bart <bc@freeuk.com> wrote:
    On 18/12/2022 13:05, David Brown wrote:

    There are standards for that. A text-based file can have a shebang
    comment ("#! /usr/bin/bash", or similar) to let the shell know what
    interpreter to use. This lets you distinguish between "python2" and
    "python3", for example, which is a big improvement over Windows-style
    file associations that can only handle one interpreter for each file
    type.

    That is invasive. And taking something that is really an attribute of a
    file name, in having it not only inside the file, but requiring the file
    to be opened and read to find out.

    Inferring the interpreter from the extension would be invasive. As it
    is now, one can replace any executable by a script and be sure that
    this catches all uses of the executable. Or one can replace a
    script in one language by a script in a different language.

    (Presumably every language that runs on Linux needs to accept '#' as a
    line comment? And you need to build into every one of 10,000 source
    files the direct location of the Python2 or Python3 installation on that
    machine? Is that portable across OSes? But I expect it's smarter than
    that.)

    You need a way for the interpreter to ignore the first line. Some
    interpreters have a special option for this. And you need '#! ' only for
    executables; there is no reason to add it to library files which are not
    used as executables.
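
    A minimal sketch (illustrative only, not a real tool) of the point that
    the shebang line lives inside the file and has to be read out of it:

        /* Print the interpreter named on a script's '#!' line, if any. */
        #include <stdio.h>
        #include <string.h>

        int main(int argc, char **argv)
        {
            if (argc < 2) return 1;
            FILE *f = fopen(argv[1], "r");
            if (!f) return 1;

            char line[256];
            if (fgets(line, sizeof line, f) && strncmp(line, "#!", 2) == 0) {
                line[strcspn(line, "\r\n")] = '\0';
                printf("interpreter:%s\n", line + 2);  /* e.g. " /usr/bin/bash" */
            } else {
                printf("no shebang line\n");
            }
            fclose(f);
            return 0;
        }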

    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc, it
    is with the name of a .c source file. And 99.9% of the times when I
    invoke it on prog.c as the first or only file to create an executable,
    then I want to create prog.exe.

    Well, I did not realize that the world is supposed to revolve around you.
    When I create an executable using gcc, 90% of the time gcc gets a list
    of .o files and libraries.

    So its behaviour is unhelpful. After the 10,000th time you have to type
    .c, or backspace over .c to get at the name itself to modify, it becomes tedious.

    So automate things. Of course, if your hobby is to continuously
    rename files and see how it affects compilations, then you may
    do a lot of command line editing. In normal use the same
    Makefile can be used for tens or hundreds of edits to
    source files.

    Some folks use compile commands in editors. That works nicely
    because rules are simple.

    Now it's not that hard to write a wrapper script or program on top of gcc.exe, but if it isn't hard, why doesn't it just do that?

    You miss an important point: gcc gives you a lot of possibilities.
    A simple wrapper which substitutes some defaults would make
    using non-default values harder or impossible. If you want
    to have all the functionality of gcc you will end up with a complicated
    command line.

    It's not a simple C compiler that assumes everything it is given is a C file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension,

    AFAICS as accepts any extension. I can name my assembler file
    'ss.m' and it works fine.

    and also,
    bizarrely, generates `a.out` as the object file name.

    Yes, this is a silly default. It is slightly less strange
    than you think because 'a.out' is an abbreviation of
    'assembler output'.

    Here's something funny: take hello.c and rename to 'hello', with no extension. If I try and compile it:

    gcc hello

    it says: hello: file not recognised: file format not recognised. Trying
    'gcc hello.' is worse: it can't see the file at all.

    So first, on Linux, where file extensions are supposed to be optional,
    gcc can't cope with a missing .c extension; you have to provide extra
    info. Second, on Linux, "hello" is a distinct file from "hello.".

    With bcc, I just have to type "bcc hello." to make it work. A trailing
    dot means an empty extension.

    You are determined not to learn, but for the possible benefit of third
    parties: in Linux a file name is just a string of characters. A dot is
    as valid a character in a name as any other.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to antispam@math.uni.wroc.pl on Tue Dec 27 17:08:14 2022
    In article <tof7ol$ovb$1@gioia.aioe.org>, <antispam@math.uni.wroc.pl> wrote:
    Bart <bc@freeuk.com> wrote:
    On 22/12/2022 15:09, Andy Walker wrote:
    You complain
    that you have to write "./hello" rather than just "hello"; but that's
    because "." is not in your "$PATH", which is set by you, not because
    Unix/Linux insists on extra verbiage. If you need further help, just
    ask. But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].

    So lots of workarounds to be able to do what DOS, maligned as it was,
    did effortlessly.

    It seems that nobody mentioned this: not having '.' in PATH is a
    relatively recent trend.

    Define "recent": I haven't included `.` in my $PATH for
    30 or so years now. :-)

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to All on Tue Dec 27 16:53:27 2022
    So automate things.

    Isn't gcc already an automated wrapper around compiler, assembler and
    linker?


    Of course, if your hobby is to continuously
    rename files and see how it affects compilations, then you may
    do a lot of command line editing. In normal use the same
    Makefile can be used for tens or hundreds of edits to
    source files.

    Some folks use compile commands in editors. That works nicely
    because rules are simple.

    Now it's not that hard to write a wrapper script or program on top of
    gcc.exe, but if it isn't hard, why doesn't it just do that?

    You miss an important point: gcc gives you a lot of possibilities.
    A simple wrapper which substitutes some defaults would make
    using non-default values harder or impossible.

    Not on Windows. Clearly, Linux /would/ make some things impossible,
    because there are no rules so anything goes.

    The rules for my BCC compiler inputs are:

    Input        Assumes file

    file.c       file.c
    file.ext     file.ext
    file         file.c
    file.        file         # use "file." when file has no extension

    This is not possible with Unix, since "file." either ends up as "file",
    or stays as "file." You can only refer to "file" or "file.", but not both.

    So a silly decision with Unix, which really buys you very little, means
    having to type .c extensions on inputs to C files, for eternity.
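
    For what it's worth, a minimal C sketch of those defaulting rules (a
    hypothetical helper, deliberately ignoring directories and edge cases) -
    nothing about the rules themselves depends on the OS:

        /* Apply the defaulting rules:
             "file.c"   -> "file.c"     (extension kept)
             "file.ext" -> "file.ext"
             "file"     -> "file.c"     (default extension added)
             "file."    -> "file"       (trailing dot = no extension) */
        #include <stdio.h>
        #include <string.h>

        static void default_extension(const char *in, char *out, size_t outsz)
        {
            size_t n = strlen(in);
            if (n > 0 && in[n - 1] == '.')              /* "file."  -> "file"   */
                snprintf(out, outsz, "%.*s", (int)(n - 1), in);
            else if (strrchr(in, '.') == NULL)          /* "file"   -> "file.c" */
                snprintf(out, outsz, "%s.c", in);
            else                                        /* "file.x" -> "file.x" */
                snprintf(out, outsz, "%s", in);
        }

        int main(void)
        {
            const char *tests[] = { "file.c", "file.ext", "file", "file." };
            char buf[256];
            for (size_t i = 0; i < sizeof tests / sizeof tests[0]; i++) {
                default_extension(tests[i], buf, sizeof buf);
                printf("%-10s -> %s\n", tests[i], buf);
            }
            return 0;
        }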

    If you want
    to have all the functionality of gcc you will end up with a complicated
    command line.

    It's not a simple C compiler that assumes everything it is given is a C
    file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension,

    AFAICS as accepts any extension. I can name my assembler file
    'ss.m' and it works fine.

    But you need an extension. Give it a null extension and the resulting
    executable, if this is the only module, will clash.

    My point however was that the reason gcc needs an explicit extension was
    the number of possible input file types. How many /different/ file types
    does 'as' work with?

    I write my command-line utilities both to be easy to use when manually
    invoked, and for invoking from scripts.

    My experience of Unix utilities is that they do nothing of the sort;
    they have no provision for user-friendliness whatsoever.


    You are determined not to learn

    You also seem reluctant to learn how non-Unix systems might work, or acknowledge that they could be better and more user-friendly.


    , but for the possible benefit of third
    parties: in Linux a file name is just a string of characters. A dot is
    as valid a character in a name as any other.

    Well, that's wrong. It may have sounded like a good idea at one time: accept
    ANY non-space characters as the name of a file. But that allows for a
    lot of completely daft names, while disallowing some sensible practices.

    There is no structure at all, no possibility for common sense.

    With floating point numbers, 1234 is the same value as 1234. while
    1234.. is an error, but they are all legal and distinct filenames under
    Unix.

    Under Windows, 1234 1234. 1234... all represent the same "1234" file.
    While 123,456 are two files "123" and "456"; finally, some rules!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Tue Dec 27 17:32:45 2022
    On 2022-12-27 15:50, Bart wrote:

    I privately coined the term 'loader' in the 1980s for a program that
    combined multiple object files, from independently compiled source
    modules of a program, into a single program binary (eg. a .com file).

    A loader is a dynamic linking and relocation program. It takes an image
    (EXE or DLL) and loads it into the memory of an existing or new process.

    It was also pretty much what a linker did, yet a linker was a far more complicated program that also took much longer. What exactly do linkers
    do? I'm still not really sure!

    A linker is a program that creates a loadable image from various sources
    (object files and static libraries/archives of object files, including
    import libraries). A linker:

    - resolves static symbols (the MS linker supports partial resolution;
    the GNU linker does not, which is why linking large projects with the
    MS linker is many times faster than with the GNU linker)
    - creates vectorized symbols (to be resolved by the loader)
    - evaluates link-time expressions
    - creates sections and generates code to initialize sections and
    elaborate code (e.g. starting tasks, calling constructors, running initializers)
    - creates overlay trees (earlier linkers)
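
    A minimal two-file C example (illustrative only) of the symbol-resolution
    step: the compiler leaves the reference to 'greeting' in main.c
    unresolved, and the linker fills it in when the object files are combined:

        /* greeting.c -- defines the symbol */
        const char *greeting = "hello from another object file";

        /* main.c -- references a symbol the compiler cannot resolve;
           the linker resolves it when combining the objects, e.g.
             cc -c main.c greeting.c
             cc main.o greeting.o -o demo                              */
        #include <stdio.h>

        extern const char *greeting;   /* declared here, defined in greeting.c */

        int main(void)
        {
            puts(greeting);
            return 0;
        }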

    Anyway, I no longer have a need to combine object files (there are no
    object files). But there are dynamic fix-ups needed, which the OS EXE
    loader will do, between an EXE and its imported DLLs.

    My deepest condolences... (:-))

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 27 16:50:29 2022
    Bart <bc@freeuk.com> wrote:
    On 22/12/2022 15:09, Andy Walker wrote:
    You complain
    that you have to write "./hello" rather than just "hello"; but that's
    because "." is not in your "$PATH", which is set by you, not because
    Unix/Linux insists on extra verbiage. If you need further help, just
    ask. But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].

    So lots of workarounds to be able to do what DOS, maligned as it was,
    did effortlessly.

    It seems that nobody mentioned this: not having '.' in PATH is a
    relatively recent trend. The simple fact is that people make
    typos and may end up running a different program than intended.
    This also has security implications: malware can lead to real losses.
    One can assume that system programs are trusted, but it is
    harder to make assumptions about programs in a semi-random
    directory; it may be some junk fetched from the net. Of
    course, normally file permissions would prevent execution of
    random junk, but a lot of folks think that stronger measures
    are needed and so '.' is not in the default PATH. You can still
    add it; this is your decision.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 27 19:05:18 2022
    Bart <bc@freeuk.com> wrote:

    You two aren't going to be happy until my language is a clone of C, with tools that work exactly the same way they do on Unix. But then you're
    going to say, what's the point?

    I certainly do not suggest cloning C; indeed, what would be the point of
    this? Rather, I would like to see a language that fixed the _important_
    problems with C, the biggest one IMO being limited or no type checking
    when one wants interesting/variable sized types. Once you have a
    language different from C there is of course an opportunity to fix
    smaller problems, like the syntax of types.

    I have GNU Pascal; it has:
    - a module system with many features
    - object oriented extensions
    - schema types
    - restricted types
    - nested functions
    - low level extensions

    Schema types allow variable sized types (the size is derived from type
    parameters) with bounds checking. Restricted types allow exporting a
    type such that all operations on the type are done by provided functions.
    More precisely, the language disallows access to the structure of a
    restricted type. Of course, the low level extensions allow breaking the
    normal rules, including the rules for restricted types, but restricted
    types are intended to avoid accidental mistakes and to better structure
    software (they are _not_ a security mechanism, unlike in Java).

    Nested functions, when passed as parameters to other functions, still
    have access to the variables/parameters of the enclosing function.
    This can simplify the use of functional parameters. It is less
    powerful than real closures, but closures normally depend on garbage
    collection, while nested functions work with stack allocation.
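
    [For readers more at home in C: GNU C's nested-function extension
    (not standard C, and implemented with stack trampolines under gcc)
    gives a rough analogue of the feature described above. The names
    below are made up for illustration.]

        #include <stdio.h>

        /* Calls f once per element; f is an ordinary function pointer. */
        static void for_each(const int *a, int n, void (*f)(int)) {
            for (int i = 0; i < n; i++)
                f(a[i]);
        }

        int main(void) {
            int scale = 10;                /* local of the enclosing function */

            void show(int x)               /* nested function (GNU C extension) */
            {
                printf("%d\n", x * scale); /* reads 'scale' from the outer scope */
            }

            int v[] = { 1, 2, 3 };
            for_each(v, 3, show);          /* passed as a functional parameter */
            return 0;
        }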

    GNU Pascal has its warts, many due to compatibility with other
    Pascal dialects. Let me note one wart: Pascal has the built-in
    constant maxinteger. Normal Pascal programs expect all integers to be
    smaller than or equal in magnitude to maxinteger. But two's-complement
    arithmetic has a negative value which is bigger in absolute value
    than maxinteger. To make things more interesting, GNU Pascal has
    integer types bigger than the standard integer, so there are
    "integers" bigger than maxinteger...


    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 27 23:56:54 2022
    Bart <bc@freeuk.com> wrote:

    So automate things.

    Isn't gcc already an automated wrapper around compiler, assembler and
    linker?

    You miss the principle of modularity: gcc contains the bulk of the
    code (I mean the wrapper; it is quite large given that all it does is
    handle the command line). Your own code can then provide fixed values
    (so that you do not need to retype them) and more complex behaviours
    (which would be too specialised to have as gcc options).

    In fact, 'make' was designed to be a tool for "directing
    compilation"; it handles things at a larger scale than gcc.

    To put it differently: gcc and make provide mechanisms. It is up to
    you to specify policy.

    BTW: there are now several competitors to make. One is 'cmake'.

    Of course, if your hobby is to continuously rename files and see how
    that affects compilation, then you may do a lot of command-line
    editing. In normal use the same Makefile can be used across tens or
    hundreds of edits to the source files.

    Some folks use compile commands in their editors. That works nicely
    because the rules are simple.

    Now it's not that hard to write a wrapper script or program on top of
    gcc.exe, but if it isn't hard, why doesn't it just do that?

    You miss an important point: gcc gives you a lot of possibilities. A
    simple wrapper which substitutes some defaults would make using
    non-default values harder or impossible.

    Not on Windows. Clearly, Linux /would/ make some things impossible,
    because there are no rules so anything goes.

    What I wrote really has nothing to do with the operating system; this
    is a very general principle.

    The rules for my BCC compiler inputs are:

    Input       Assumes file

    file.c      file.c
    file.ext    file.ext
    file        file.c
    file.       file         # use "file." when the file has no extension

    This is not possible with Unix, since "file." either ends up as "file",
    or stays as "file." You can only refer to "file" or "file.", but not both.

    You can implement your rules on Unix. Of course, one can ask whether
    the rules are useful. As written above, your rules make it impossible
    to access a file named "file." (it will be mangled to "file"), and
    you get quite unintuitive behaviour for "file".
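
    [A sketch of how the quoted defaulting rules could be implemented in
    portable C; the function and program names are made up, and paths
    with dots in directory components are not handled:]

        #include <stdio.h>
        #include <string.h>

        /* The quoted rules: "name" -> "name.c", "name.ext" -> "name.ext",
           and a trailing "." means "use the bare name, no extension".     */
        static void resolve_input(const char *in, char *out, size_t outsize) {
            size_t len = strlen(in);

            if (len > 0 && in[len - 1] == '.')          /* "file."    */
                snprintf(out, outsize, "%.*s", (int)(len - 1), in);
            else if (strrchr(in, '.') != NULL)          /* "file.ext" */
                snprintf(out, outsize, "%s", in);
            else                                        /* "file"     */
                snprintf(out, outsize, "%s.c", in);
        }

        int main(int argc, char **argv) {
            char path[4096];
            for (int i = 1; i < argc; i++) {
                resolve_input(argv[i], path, sizeof path);
                printf("%-12s -> %s\n", argv[i], path);
            }
            return 0;
        }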

    So a silly decision with Unix, which really buys you very little, means having to type .c extensions on inputs to C files, for eternity.

    Nobody forces you to have extensions on files. And do not exaggerate:
    two extra characters from time to time do not make much difference.

    If you want
    to have all the functionality of gcc you will end up with a
    complicated command line.

    It's not a simple C compiler that assumes everything it is given is a
    C file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension,

    AFAICS as accepts any extension. I can name my assembler file
    'ss.m' and it works fine.

    But you need an extension. Give it a null extension, and the
    resulting executable, if this is the only module, will clash.

    as ss

    will produce 'a.out' (as you know). If there is no a.out there will
    be no clash; otherwise you need to specify an output file, like:

    as ss -o ss.m

    or

    as ss -o tt

    Note: as produces an object file (it does not link). For automatic
    linking use gcc.

    My point however was that the reason gcc needs an explicit extension
    is the number of possible input file types. How many /different/ file
    types does 'as' work with?

    as has no notion of "file type". as takes a stream of bytes and turns
    it into an object file (a different thing from an executable!). The
    stream of bytes may come from a file, but in typical use (when as is
    called by gcc) as gets its input from a pipe.

    I write my command-line utilities both to be easy to use when manually invoked, and for invoking from scripts.

    My experience of Unix utilities is that they do nothing of the sort;
    they have no provision for user-friendliness whatsoever.

    Old saying (possibly mangled): "Unix is friendly, just not everybody
    is its friend".


    You are determined not to learn

    You also seem reluctant to learn how non-Unix systems might work, or acknowledge that they could be better and more user-friendly.

    "might work" is almost meaningless. By design Unix allows a lot
    of flexibility so it "might work" in quite different way. Now,
    concering how system actually work, I have some experience of MVS,
    CP/CMS and significant experience with DOS. You may think that
    non Unix system are friendly, but IME Unix behaviours make me
    more productive. Unix was designed to help automation. Output
    of one Unix utility typically can be used as input to the other
    allowing composing more complex commands. I had choice between
    DOS and Linux, Linux not only run faster than DOS on the same
    machine but also was more convenient to use. Of course, there
    are some particular things which were easier on other systems,
    but that is simply cost of having other featurs.

    , but for possible benefits of third
    parties: in Linux a file name is just a string of characters. A dot
    is as valid a character in a name as any other.

    Well, that's wrong. It may have sounded like a good idea at one time:
    accept ANY non-space characters as the name of a file.

    For the record: spaces, newlines, tabs and similar are legal in Unix
    filenames. They lead to various troubles, but it is up to users to
    avoid them (or to use programs that handle them in a convenient way).
    But that allows for a
    lot of completely daft names, while disallowing some sensible practices.

    Yes, you can create daft names; do not do that. Unix disallows no
    sensible practices (unless you consider a null character or a slash a
    sensible part of a filename).

    There is no structure at all, no possibility for common sense.

    With floating point numbers, 1234 is the same value as 1234. while
    1234.. is an error, but they are all legal and distinct filenames under
    Unix.

    Under Windows, 1234 1234. 1234... all represent the same "1234" file.
    While 123,456 are two files "123" and "456"; finally, some rules!

    In Unix you have a simple rule: any string with no embedded NULs or
    slashes, and within the filesystem's length limit, is a valid
    filename. Different strings give different files. There is no need to
    worry about clashes, and no need to worry that a directory listing
    will give you something different from what you wanted to store. On
    top of that, applications are free to implement their own handling
    and the OS will not stop them. In Windows you have arbitrary rules,
    which AFAIK depend on the codepage of the filesystem.
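
    [A small POSIX-flavoured illustration of that rule, using the names
    from the example above: the three names are distinct byte strings,
    hence three distinct files on Unix, whereas Windows strips the
    trailing dots so the later names collide with "1234".]

        #include <stdio.h>
        #include <fcntl.h>
        #include <unistd.h>

        int main(void) {
            const char *names[] = { "1234", "1234.", "1234.." };

            for (int i = 0; i < 3; i++) {
                /* O_EXCL makes a collision visible as an EEXIST error. */
                int fd = open(names[i], O_CREAT | O_EXCL | O_WRONLY, 0644);
                if (fd >= 0) {
                    printf("created '%s'\n", names[i]);
                    close(fd);
                } else {
                    perror(names[i]);
                }
            }
            return 0;
        }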

    I am afraid that our views diverge here quite a lot. For me, a
    filesystem should store information and allow retrieval of _exactly_
    the same thing that was stored. That includes filenames. Any
    restriction on the form of the information is undesirable; excluding
    NUL and slash is a limitation, but it is quite mild compared to the
    Windows limitations.

    From an abstract point of view, a filesystem is a persistent
    associative table. For a given key (the filename) there is associated
    data (the content of the file plus possible metadata). Theoretically
    you can get fancy about keys and content. Some older OSes required
    users to specify a "file organization" and treated files as sequences
    of "records". Unix simply says that content is a sequence of bytes.
    You may think that the older approach has more structure, but the
    OS-provided records were frequently a poor match for application
    needs. With Unix, an application can organize its data in its own
    way.

    For names, too, one can think about fancy schemes. But names should
    be printable/viewable for "user friendliness", so the simplest way is
    to treat them as strings. Hierarchical organization is not essential
    from an abstract point of view, but it is simple to implement and
    users understand it, so it is natural to have it.

    Concerning metadata, size is important, and I consider modification
    time very important. It is not clear whether one needs 3 times like
    Unix has, and if one goes with multiple times it is not clear which
    ones to store. One could go fancier: Mac OS has the concept of a
    "resource fork", which IIUC can in principle store arbitrary extra
    info. It is not clear to me what that buys compared to storing the
    info directly inside the file. I can imagine benefits if an arbitrary
    program could "attach" its own info to a file in a way that would be
    ignored by other programs which do not understand that information.
    It is not clear to me whether the Mac "resource fork" allows this.

    Anyway, the Unix file system is a straightforward implementation of
    the file-system abstraction. It has no fancy features, just the
    essential things. Its power comes from simplicity and the fact that
    it "just works".

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to All on Thu Dec 29 00:37:20 2022
    On 22/12/2022 16:46, Bart wrote:
    [I wrote:]
    [...] You complain
    that you have to write "./hello" rather than just "hello";  but that's
    because "." is not in your "$PATH", which is set by you, not because
    Unix/Linux insists on extra verbiage.  If you need further help, just
    ask.  But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].
    So lots of workarounds to be able to do what DOS, maligned as it was,
    did effortlessly.

    It isn't a "workaround". It's all perfectly normal; commands are
    run from either a directory in your "PATH" variable or a directory that you specify. Where else would you expect to look for executables?

    Don't forget it is not just me personally who would have trouble. For
    over a decade, I was supplying programs that users would have to
    launch from their DOS systems, or on 8-bit systems before that.

    The question is not how to launch them but where to install them.
    If you [or a user following your instructions] install them in a weird
    place, then it's not surprising that you need a weird instruction to run
    them. Or do you expect to have to search the entire computer to find each command? Unix commands are usually installed in standard places such as "/bin", "/usr/bin", "/usr/local/bin" or "$HOME/bin" [usually therefore to
    be found in your default "$PATH"], and usually run by simply naming the command.

    So every one of 1000 users would have to be told how to fix that "."
    problem?

    A "problem" only in your imagination.

    Fortunately, nobody really used Unix back then (Linux was
    not yet ready), at least among our likely customers who were just
    ordinary people.

    Whereas our students all had two heads and three legs? [Before
    you say "Well, they were CS specialists", no they weren't, we didn't
    have a CS dept or course in those days, we just expected all undergrads
    to learn how to use the computer.]

    Fortunate also because, with case-sensitivity in the shell program
    and file system, it would have created a lot more customer-support
    headaches.

    Not the sort of thing that ever gave me or colleagues headaches,
    either as users or in a support role.

    But you can write your own almost trivially;
    it's a "one-line" shell script
    Sure. I also asked, if it is so trivial, why don't programs do that
    anyway?

    Because no-one other than you is confused by the notion that you
    run programs by [merely] naming them [if they're in standard places] or
    saying where to find them [otherwise]?

    Learn something from DOS at least, which is user
    friendliness.

    Really? I found DOS utterly unhelpful. Of course, I came to it
    from Unix, where life was much easier and more transparent; and I read
    the documentation first rather than just "suck it and see" [resulting in
    the sorts of misunderstanding that you often display here].

    [...]> Well, I am standing up for those features and refusing to budge just
    because C and Linux have taken over the world [...].
    Really? According to

    https://www.simplilearn.com/best-programming-languages-start-learning-today-article

    C and C++ combined are 11th on the language list and according to

    https://gs.statcounter.com/os-market-share

    Linux is the 6th OS, with a mighty 1.09%, pretty much level with
    "unknown" and "other", whatever they may be. Rather niche, wouldn't
    you say? Of course, being niche doesn't stop them being influential.

    Notice that most user-facing interfaces tend to be case-insensitive?

    Doesn't this depend on the application? E-mail addresses are case-insensitive for good reasons, tho' I hope you don't make use of this to
    write all your e-mails in camel-case. OTOH, passwords are almost always case-sensitive, for equally good reasons. I expect word-processors and
    similar to take account of case in the text I type; some other applications much less so. The arguments for computer languages have been well-rehearsed here; personally I think keywords should be distinguished from identifiers, but YMMV.

    [...]
    So, nobody here thinks that doing 'mm -ma appl' to produce a one-file
    appl.ma file representing /the entire application/, that can be
    trivially compiled remotely using 'mm appl.ma', is a great idea?

    Well, it seems to work for you, but then I don't know what "mm"
    is beyond you saying "you can do /this/, /that/ and /the other/" in it. Documentation? System requirements? Will it work on my machine? Will
    it work on legacy machines? Does it utilise R, "curses", "plotlib", PostgreSQL, the Gnu scientific library and/or other similar packages
    if they are available? If the answer is a universal "yes", then bully
    for you. Otherwise, it's not a "great idea" /for me/.

    [...]
    Well, have a look at the A68G source bundle for example: [...].
    Like so many, this application starts with a
    'configure' script, although only 9500 lines this time. So I can't
    build it on normal Windows.

    The auto-produced "configure" script is what enables A68G to work
    not only on my current machine, but also the one I had at work just before
    I retired 15 years ago, and on the next machine I buy [soon, probably].
    I didn't have the chance to try it, but from the comments in the code I
    would expect it to work also for the SGI and Sun machines I had at work in
    even earlier times. Of course, if you want to try A68G on your Windows machine, you could try downloading the pre-built Windows ".exe" instead
    of building for Linux?

    [...]
    But typing 'make' again still took 1.4 seconds even with nothing to
    do.
    Then I looked inside the makefile: it was an auto-generated one with
    nearly 3000 lines of crap inside - no wonder it took a second and a
    half to do nothing!

    You said it yourself -- it's auto-generated. You aren't expected
    to read it. It has to [be prepared to] cope not only with my current PC
    but also with ancient SGI and Sun machines, other modern machines, a wide variety of architectures and available libraries. So yes, it's quite
    complex. It doesn't just build and optionally install an A68G executable, "make" is much more versatile than that, and deals with many other aspects
    of controlling software [inc documentation]. Even if it has "nothing to
    do", it takes time to confirm that no relevant file in the directory tree
    has changed so that all dependencies are still met.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Peerson

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Dan Cross on Thu Dec 29 17:47:18 2022
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In article <tof7ol$ovb$1@gioia.aioe.org>, <antispam@math.uni.wroc.pl> wrote:

    Bart <bc@freeuk.com> wrote:

    On 22/12/2022 15:09, Andy Walker wrote:

    You complain
    that you have to write "./hello" rather than just "hello"; but
    that's because "." is not in your "$PATH", which is set by you,
    not because Unix/Linux insists on extra verbiage. If you need
    further help, just ask. But I'd expect you to be able to work
    it out rather than wring your hands and flap around helplessly
    [or blame Unix for it].

    So lots of workarounds to be able to do what DOS, maligned as it
    was, did effortlessly.

    It seems that nobody mentioned this: not having '.' in PATH is a
    relatively recent trend.

    Define "recent": I haven't included `.` in my $PATH for
    30 or so years now. :-)

    How about this: "recent" is any time since you last had '.'
    in your path. :)

    seasoned greetings and Happy almost New Year ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Fri Dec 30 02:38:46 2022
    On 27/12/2022 23:56, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    So automate things.

    Isn't gcc already an automated wrapper around compiler, assembler and
    linker?

    You miss the principle of modularity: gcc contains the bulk of the
    code (I mean the wrapper; it is quite large given that all it does is
    handle the command line). Your own code can then provide fixed values
    (so that you do not need to retype them) and more complex behaviours
    (which would be too specialised to have as gcc options).


    In fact, 'make' was designed to be a tool for "directing
    compilation"; it handles things at a larger scale than gcc.

    To put it differently: gcc and make provide mechanisms. It is up to
    you to specify policy.

    So, there isn't such a thing as a 'C compiler' program where you give it
    input A, and it produces A.exe from A.c; or you give it A, B, C and it
    produces A.exe from A.c, B.c, C.c.

    No one apparently thinks that is useful. Unix has a million
    utilities, but not one to compile its most important language with
    the minimum of fuss.

    Unless you count using 'make A', which might work, if it's only one
    module, and doesn't need '-lm', and there doesn't happen to be a
    module called 'A' with some other extension, since make might build
    that instead.

    And that's ignoring that it might not actually invoke the compiler,
    since there are reasons you might want to do so even when the source
    has not changed.
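
    [A sketch of the sort of wrapper being argued over, purely
    illustrative: the wrapper name "ccw" is invented, gcc is assumed to
    be on PATH, and names with spaces are not handled. "ccw A B C" runs
    "gcc -o A.exe A.c B.c C.c".]

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        int main(int argc, char **argv) {
            if (argc < 2) {
                fprintf(stderr, "usage: ccw name [name...]\n");
                return 1;
            }

            char cmd[4096];
            snprintf(cmd, sizeof cmd, "gcc -o %s.exe", argv[1]);

            for (int i = 1; i < argc; i++) {
                strncat(cmd, " ", sizeof cmd - strlen(cmd) - 1);
                strncat(cmd, argv[i], sizeof cmd - strlen(cmd) - 1);
                strncat(cmd, ".c", sizeof cmd - strlen(cmd) - 1);
            }

            puts(cmd);              /* show the command line being used */
            return system(cmd);     /* hand it to the shell / cmd.exe   */
        }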


    BTW: there are now several competitors to make. One is 'cmake'.

    A 90MB installation which ... creates makefiles?

    You don't see how that's going in the wrong direction? How many ways can
    a combination of gcc, make, cmake, configure possibly go wrong?

    And who exactly is this for? I've repeatedly made the point that, for
    some very good reasons, the set of environmental info needed by
    someone who just wants to build P from source need be nothing like
    the set used by the developer of the application.

    Do you think the build instructions for a piece of IKEA furniture need
    to include all the locations in the factory where various parts are located?

    Of course, if your hobby is to continuously rename files and see how
    that affects compilation, then you may do a lot of command-line
    editing. In normal use the same Makefile can be used across tens or
    hundreds of edits to the source files.

    Some folks use compile commands in their editors. That works nicely
    because the rules are simple.

    Now it's not that hard to write a wrapper script or program on top of
    gcc.exe, but if it isn't hard, why doesn't it just do that?

    You miss an important point: gcc gives you a lot of possibilities. A
    simple wrapper which substitutes some defaults would make using
    non-default values harder or impossible.

    Not on Windows. Clearly, Linux /would/ make some things impossible,
    because there are no rules so anything goes.

    What I wrote really has nothing to do with the operating system; this
    is a very general principle.

    The rules for my BCC compiler inputs are:

    Input       Assumes file

    file.c      file.c
    file.ext    file.ext
    file        file.c
    file.       file         # use "file." when the file has no extension

    This is not possible with Unix, since "file." either ends up as "file",
    or stays as "file." You can only refer to "file" or "file.", but not both.

    You can implement your rules on Unix. Of course, one can ask whether
    the rules are useful. As written above, your rules make it impossible
    to access a file named "file." (it will be mangled to "file"), and
    you get quite unintuitive behaviour for "file".

    That's a detail I wasn't aware of when I first learned to know and
    love default file extensions. It showed the machine had a spark of
    intelligence; at least it knew what files it was supposed to work on!

    Perhaps I should have looked 45 years into the future, where people
    use a rather bizarre OS in which the "." in "file.ext" is not a bit
    of syntax that separates the two parts, but is actually part of the
    file's name.

    It's funny that, on Windows, the 'gcc' compiler driver binary is called 'gcc.exe'. Yet you don't have to type 'gcc.exe' to invoke it, it can
    just be 'gcc'. Useful, yes?




    So a silly decision with Unix, which really buys you very little, means
    having to type .c extensions on inputs to C files, for eternity.

    Nobody forces you to have extensions on files.

    Oh, come on! You have a file 'prog'; how do you distinguish between:

    * The source file
    * The matching header file
    * The assembler file
    * The object file
    * The executable file

    'Shebangs' are no good here!

    And do not exaggerate:
    two extra characters from time to time do not make much difference.

    They are incredibly annoying. If I have a sequence of ops on versions
    of the same file, then I have to copy the line, backspace over the
    extension and write a new one, which is... ah yes, it's .asm this
    time!

    It is an utter waste of time. Obviously, you're going to have your
    opinion because that's what you've been forced to do for so long.

    Besides, if it wasn't a big deal, why don't Unix executables have
    extensions in general? Because it would be too much of an imposition
    to have to type them repeatedly.

    If you want
    to have all the functionality of gcc you will end up with a
    complicated command line.

    It's not a simple C compiler that assumes everything it is given is a
    C file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension,

    AFAICS as accepts any extension. I can name my assembler file
    'ss.m' and it works fine.

    But you need an extension. Give it a null extension, and the
    resulting executable, if this is the only module, will clash.

    as ss

    will produce 'a.out' (as you know). If there is no a.out there will
    be no clash; otherwise you need to specify an output file, like:

    as ss -o ss.m

    or

    as ss -o tt

    I think even Unix uses the .o extension for object files. In fact, this:

    gcc -c hello.s

    writes a file called hello.o.

    So why in God's name does this, with or without -c:

    as -c hello.s

    write 'a.out' rather than 'hello.o'?! It is just utterly bizarre.

    Why are you even trying to defend this nonsense?

    And why does this:

    as -c hello.s hello.c

    give an error? Is it now trying to do the job of linker?


    Note: as produces an object file (it does not link). For automatic
    linking use gcc.

    My point however was that the reason gcc needs an explicit extension
    is the number of possible input file types. How many /different/ file
    types does 'as' work with?

    as has no notion of "file type". as takes a stream of bytes and turns
    it into an object file (a different thing from an executable!). The
    stream of bytes may come from a file, but in typical use (when as is
    called by gcc) as gets its input from a pipe.

    Not on Windows; there an anonymous, temporary .o file is created.

    I write my command-line utilities both to be easy to use when manually
    invoked, and for invoking from scripts.

    My experience of Unix utilities is that they do nothing of the sort;
    they have no provision for user-friendliness whatsoever.

    Old saying (possibly mangled): "Unix is friendly, just not everybody
    is its friend".



    You are determined not to learn

    You also seem reluctant to learn how non-Unix systems might work, or
    acknowledge that they could be better and more user-friendly.

    "might work" is almost meaningless. By design Unix allows a lot
    of flexibility so it "might work" in quite different way.

    Funny sort of flexibility: file extensions must be spot-on; no chance of inferring an extension to save the user some trouble. And letter case
    must be spot-on too: so was that file oneTwo or OneTwo or Onetwo? And
    commands must be spot-on as well: you type 'ls OneTwo' then you look up
    and realise Caps-lock was on and you typed 'LS oNEtWO'; urggh! Start again..



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Fri Dec 30 11:08:23 2022
    On 30/12/2022 02:38, Bart wrote:
    [...] And letter
    case must be spot-on too: so was that file oneTwo or OneTwo or
    Onetwo? And commands must be spot-on as well:

    There are 3439 commands in the directories of my "$PATH", of
    which just 15 include upper-case letters. The only one I have ever
    used [in over 40 years!] is "SendAnywhere", which is the name shared
    with the corresponding command on my 'phone, tablet and Mac. I don't
    usually type it anyway, as it's available directly from the launcher.
    I might in principle also use "R", but don't need to as the functions
    are included in "a68g". So not a problem in practice.

    you type 'ls OneTwo'
    then you look up and realise Caps-lock was on and you typed 'LS
    oNEtWO'; urggh! Start again..

    Many years ago, I had a keyboard with an ultra-ultra-sensitive caps-lock key which was activated far too frequently by catching it
    while typing a nearby symbol or key [esp a problem in the early days
    when keyboard layouts were less standardised]. So I removed the key,
    which I never used anyway. For the last 25 years or so, I've instead
    disabled it by software. Solves all such problems.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Wolf

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)