• Porting code from C

    From Bart@21:1/5 to All on Sat Dec 3 20:17:01 2022
    David Brown:
    You can take a program in your language and translate it fairly
    directly into C, albeit some parts will be ugly or non-idiomatic.

    This can work, because I've done it, but there are still quite a few
    problems, such as UB, limitations or omissions in C, or needing to
    cajole C compilers into accepting my code.

    You can take an old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    Targeting a lower-level language is easier than one higher level than
    your source language.

    Mine is a little higher level than C, so is a little more challenging.
    It can be done of course, depending on how crudely you want to express
    the original code.

    But let's go through some of them ('M' is my language):

    (TLDR: see end)

    * C uses nested include files. M has 'include', but C's includes are
    used for module headers. M has a proper module system. You really need
    to do quite a lot of work to eliminate module headers from C and express
    things as M modules.

    * Exported variables must be done properly, with an 'owner' module.
    Imported names must not be directly visible in M source; only via a
    module that exports them. This applies to functions, variables, types,
    enums, records, named constants

    * Then there's C's macro system, for which there is no equivalent. You
    can try preprocessing before conversion, but the result can be
    gobbledygook. Not suitable for a one-time translation. Alternatively you
    can go through each macro and see how it might map to the M language.
    You will find it just as much work as eliminating macros from C code to
    end up as still-readable and maintainable C code.

    * M has no conditional code blocks (this stuff is taken care of at the
    module level)

    * For every non-static module-level function or variable in C, you need
    to remember to mark it as 'global' or 'export'.

    * Every A[i] term, when A is not an array, must be turned into (&A[0]+i)^.
    This is not straightforward; C abounds with types like int** which are
    then accessed as an array of pointers, an array of arrays, or maybe a
    pointer to pointer.
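
    (For illustration, a small C-only sketch of that ambiguity, with
    invented names: the same int** may be dereferenced twice or indexed
    like a two-level array, and nothing in the declaration says which was
    meant.)

        /* Two different usages hiding behind the same parameter type. */
        static int deref_twice(int **p)  { return **p; }        /* pointer to pointer      */
        static int index_twice(int **p)  { return p[1][2]; }    /* "array" of row pointers */

        int main(void) {
            int row0[3] = {1, 2, 3}, row1[3] = {4, 5, 6};
            int *rows[2] = {row0, row1};
            return deref_twice(rows) + index_twice(rows);       /* 1 + 6 = 7 */
        }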

    * You must remember to declare any array type with a zero lower bound

    * Switch: these must be well-structured to turn into M equivalents.
    Forget trying to convert Duff's Device. Breaks to prevent fallthrough
    must be removed. Implicit fallthrough, except for consecutive case
    labels, must be dealt with.
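
    (A small invented C example of the distinction: the first two labels
    are consecutive and map over cleanly, while the fallthrough from '0'
    into '1' has to be untangled by hand.)

        #include <stdio.h>

        static void classify(char c) {
            switch (c) {
            case ' ':
            case '\t':                 /* consecutive labels: easy to translate */
                puts("space");
                break;                 /* break just prevents fallthrough       */
            case '0':
                puts("zero");
                                       /* implicit fallthrough: needs rework    */
            case '1':
                puts("digit");
                break;
            default:
                puts("other");
            }
        }

        int main(void) { classify('0'); return 0; }   /* prints "zero" then "digit" */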

    * C's plethora of integer types can usually be easily converted. But you
    need to decide what to do about 'long', and what to do about plain
    'char', which has no equivalent. Note that string literals in M are
    sequences of 'char', a special type, but can be assumed to be sequences
    of 'byte', a u8 type.

    * C's 'int' type must be translated to 'int32'

    * C's integer literals below 2**32 will have i32 or u32 type; in M they
    will be i64 only, leading to possibly different behaviour.

    * M's rules for mixed arithmetic are different.

    * M's rules for widening 8/16-bit types are different. All evaluation in
    M is done as 64 bits

    * C is lax about {...} initialisers for structures; inner braces defining
    the structure can be omitted. M is stricter; the braces must exactly
    match the type.
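
    (For example, C accepts both of these initialisers for the same type,
    whereas a stricter language only accepts the fully braced form; names
    invented.)

        struct Point { int x, y; };

        struct Point a[2] = { 1, 2, 3, 4 };        /* inner braces elided           */
        struct Point b[2] = { {1, 2}, {3, 4} };    /* braces match the type exactly */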

    * C has quite a few features not present or not fully supported in M
    that will need workarounds:

    * VLAs and variable types in general
    * Designated initialisers
    * Compound literals
    * Bitfields as implemented in C (M's are far more controlled)
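
    (A minimal C snippet exercising three of the features just listed - VLA,
    designated initialiser, compound literal - each of which would have to be
    lowered by hand; names invented.)

        #include <string.h>

        struct Point { int x, y; };

        void demo(int n) {
            int vla[n];                                  /* VLA: size known only at run time */
            struct Point p = { .y = 2, .x = 1 };         /* designated initialiser           */
            struct Point q = (struct Point){ .x = 3 };   /* compound literal                 */
            memset(vla, 0, sizeof vla);
            (void)p; (void)q;
        }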

    * M does not have block scopes as extensively used in C

    * M does not have case-sensitive identifiers, leading to likely clashes

    * M does not have struct tags, nor separate name spaces for them.

    * M does not have 'const', but that's an easy one: just leave it out.

    * M's operator precedences are all different

    * There is no equivalent of C's for-loop: each one must be analysed to
    see whether it is a simple iteration, or complex enough to need emulating
    with 'while'.
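
    (Invented examples: the first for-loop is a plain counted iteration; the
    second updates two variables and is most naturally re-expressed as a
    while loop, as shown.)

        #include <stdio.h>

        int main(void) {
            for (int i = 0; i < 10; i++)        /* simple iteration */
                printf("%d ", i);

            int i = 0, j = 100;
            while (i < j) {                     /* for (; i < j; i += 3, j -= 5) rewritten */
                printf("(%d,%d) ", i, j);
                i += 3;
                j -= 5;
            }
            putchar('\n');
            return 0;
        }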

    * M has no variadic parameters

    * M doesn't allow structs to be defined just anywhere, or inside another
    struct, or allow both a struct and variables of that struct to be
    defined in one declaration. It is far more disciplined.

    * M does not allow 8/16/32-bit integer types as parameters or return
    types (only in FFI declarations)

    * M will not have the equivalent of most GNU C extensions

    * M does not have all those macros in limits.h or inttypes.h; those
    are represented as special syntax (eg int32.max)

    * M has no equivalent of C's 29 standard headers

    I won't go on. I'm not saying you can't take a program written in C
    and reimplement it in M. But you can't trivially do it by modifying the
    C sources into M; it would be a huge amount of work involving
    refactoring, and highly error-prone (I've tried it).

    So you'd have to rewrite the program, but that applies also to porting
    to other languages. That doesn't make them ripoffs of C.

    The underlying machine model might be the same between these two
    languages, and the degree of abstraction isn't much different. But there
    is a chasm between how each language presents that model and how it
    presents itself.

    Mine /was/ developed independently from C.

    You can take an old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    You will have trouble just translating it into a modern style of C!

  • From David Brown@21:1/5 to Bart on Sun Dec 4 17:30:00 2022
    On 03/12/2022 21:17, Bart wrote:


    David Brown:
    You can take a program in your language and translate it fairly
    directly into C, albeit some parts will be ugly or non-idiomatic.

    This can work, because I've done it, but there are still quite a few problems, such as UB, limitations or omissions in C, or needing to
    cajole C compilers into accepting my code.


    No, that would just be poor translation from your code to C.

    Typical examples would be if your language has wrapping overflow
    semantics on signed overflow and you translate your own code (I'm
    guessing syntax a bit here) :

    a, b, x : i32

    x = a + b

    into

    int a, b, x;

    x = a + b;

    Use correct C, and you'll be fine :

    int32_t a, b, x;

    x = (uint32_t) a + (uint32_t) b;
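
    (A compilable sketch of that translation, under the assumption that
    two's-complement wrap-around is the behaviour being emulated; the helper
    name is invented, and the conversion back to int32_t is spelled out
    because before C23 the narrowing conversion is implementation-defined.)

        #include <stdint.h>
        #include <stdio.h>

        static int32_t add_wrap_i32(int32_t a, int32_t b) {
            uint32_t r = (uint32_t)a + (uint32_t)b;      /* unsigned add: wraps mod 2^32, no UB */
            return (r <= (uint32_t)INT32_MAX)
                       ? (int32_t)r
                       : (int32_t)(r - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
        }

        int main(void) {
            printf("%d\n", add_wrap_i32(INT32_MAX, 1));  /* prints -2147483648 */
            return 0;
        }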


    I am not suggesting you can translate idiomatic or typical code from
    your language directly into equally idiomatic C code.


    You can take a old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    Targetting a lower-level language is easier than one higher level than
    your source language.

    Mine is a little higher level than C, so is a little more challenging.
    It can be done of course, depending on how crudely you want to express
    the original code.

    But let's go through some of them ('M' is my language):

    (TLDR: see end)

    * C uses nested include files. M has 'include', but C's includes are
    used for module headers. M has a proper module system. You really need
    do quite a lot of work to eliminate module headers from C, and express
    things as M modules


    I don't understand what you mean here. Do you mean that in M, a
    "module" is a single file that contains both an "interface" section and
    an "implementation" section, like in Pascal, rather than having separate "interface" files and "implementation" files, like in Modula-2 ? This
    is, of course, perfectly fine. The translation to C basically consists
    of putting the "interface" part in "file.h" and the implementation part
    in "file.c".

    A modular system must always support nested imports (or "includes" in C parlance). Otherwise the interface part of Module A could not make use
    of types or other definitions from Module B.

    (Translating jumbled or poorly structured C code into your language
    would be harder - after all, it is possible to write a chaotic mess with
    C include files.)

    * Exported variables must be done properly, with an 'owner' module.
    Imported names must not be directly visible in M source; only via a
    module that exports them. This applies to functions, variables, types,
    enums, records, named constants


    Again, I don't see the sense in what you are saying. The names /must/
    be visible in order to use them. The /definitions/ behind the names can
    be hidden.

    When I write a "module" in C, I have a .c file and a .h file. Every
    object and function is either local to the module, and declared with
    "static", or "exported" and declared in the header file with "extern".
    The header file is, of course, #include'd by the C file. Enum and type declarations are in the header if they are exported, or the C file if not.
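
    (A minimal sketch of that layout, with invented names: the header carries
    the exported declarations, and the .c file keeps everything else static.)

        /* counter.h - the exported "interface" */
        #ifndef COUNTER_H
        #define COUNTER_H
        extern int counter_total;          /* exported object   */
        void counter_add(int n);           /* exported function */
        #endif

        /* counter.c - the implementation */
        #include "counter.h"
        static int calls;                  /* local to the module */
        int counter_total;
        void counter_add(int n) { calls++; counter_total += n; }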

    I don't (as yet) see any reason why your M modules could not translate
    directly to the same organisation in C.

    * Then there's C's macro system, for which there is no equivalent. You
    can try preprocessing before conversion, but the result can be
    gobbledygook. Not suitable for a one-time translation. Alternatively you
    can go through each macro and see how it might map to the M language.
    You will find it just as much work as eliminating macros from C code to
    end up as still-readable and maintainable C code.


    Now I am mixed up. Are we translating your M code to C, or C code to M?

    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing). Code that is so messy that it is hard
    for a human reader to comprehend will be hard to translate to anything
    else, regardless of the languages involved.
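
    (Invented examples of such "well-behaved" macros: a named constant and a
    small expression macro, each with an obvious non-preprocessor counterpart
    that a manual translation would aim for.)

        #define MAX_ITEMS  64
        #define CLAMP(x, lo, hi)  ((x) < (lo) ? (lo) : (x) > (hi) ? (hi) : (x))

        /* roughly what the translation target looks like, still in C terms: */
        enum { MaxItems = 64 };
        static inline int clamp(int x, int lo, int hi) {
            return x < lo ? lo : x > hi ? hi : x;
        }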

    But certainly there are things that can be done with macros in C that
    are hard to translate well into many other languages.

    * M has no conditional code blocks (this stuff is taken care of at the
    module level)

    * For every non-static module-level function or variable in C, you need
    to remember to mark it as 'global' or 'export'.


    That's a simple rule, and hardly a challenge. (You were right to make
    the default "private" in your language, and C was wrong to make the
    default "public".)

    <snip as this is just getting too long.>

  • From Bart@21:1/5 to David Brown on Sun Dec 4 18:41:55 2022
    On 04/12/2022 16:30, David Brown wrote:
    On 03/12/2022 21:17, Bart wrote:


    David Brown:
    You can take a program in your language and translate it fairly
    directly into C, albeit some parts will be ugly or non-idiomatic.

    This can work, because I've done it, but there are still quite a few
    problems, such as UB, limitations or omissions in C, or needing to
    cajole C compilers into accepting my code.


    No, that would just be poor translation from your code to C.

    It will be poor, and it is. But it only needs to work with selected
    applications, mostly so that I can apply an optimising C compiler so my
    programs compare more favourably with competing products.

    However it means the original source is sometimes compromised if it uses
    constructs that don't translate; then it needs to be dumbed down.

    You can take an old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    *** This part was about translating from C to my M language ***

    I don't understand what you mean here.  Do you mean that in M, a
    "module" is a single file that contains both an "interface" section and
    an "implementation" section, like in Pascal, rather than having separate "interface" files and "implementation" files, like in Modula-2 ?  This
    is, of course, perfectly fine.  The translation to C basically consists
    of putting the "interface" part in "file.h" and the implementation part
    in "file.c".

    (Actually, the conversion of M->C doesn't bother with discrete headers
    at all, since the whole-program translator generates a single C source file.

    The latest version doesn't even bother with standard C headers.)


    A modular system must always support nested imports (or "includes" in C parlance).  Otherwise the interface part of Module A could not make use
    of types or other definitions from Module B.

    (Translating jumbled or poorly structured C code into your language
    would be harder - after all, it is possible to write a chaotic mess with
    C include files.)

    Yes, exactly; it can be unstructured and more chaotic.

    * Exported variables must be done properly, with an 'owner' module.
    Imported names must not be directly visible in M source; only via a
    module that exports them. This applies to functions, variables, types,
    enums, records, named constants


    Again, I don't see the sense in what you are saying.  The names /must/
    be visible in order to use them.  The /definitions/ behind the names can
    be hidden.

    In a C module that imports function F, the declaration (sometimes the
    definition) of F must exist in the translation unit. (What you get if
    you preprocess that source module, flatten all the includes etc.)

    In the M module that imports function F, the definition of F never
    appears in the source for that module.

    The compiler will find it by searching the namespaces of the imported
    modules.

    (Actually, in the latest version, you won't see the name of the imported
    module either; that information is centralised.)


    Now I am mixed up.  Are we translating your M code to C, or C code to M?

    No, this is C to M now.

    There are all sorts of problems involved with macros.

    In short, doing this manually is just impractical (except for stylised,
    very conservative C like I might write); it can be easier to rewrite.

    Doing it programmatically has all sorts of issues too.

    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing).

    That doesn't work. For a start it will flatten all enumerations and
    defines into literals; you need to preserve those.

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Wed Dec 7 17:47:06 2022
    Bart <bc@freeuk.com> wrote:
    On 04/12/2022 16:30, David Brown wrote:
    On 03/12/2022 21:17, Bart wrote:


    David Brown:
    constructs that don't tranlate; then it needs to dumbed down.

    You can take a old-style C program, with some restrictions, and
    translate it fairly directly into your language.

    *** This part was about translating from C to my M language ***

    I don't understand what you mean here. Do you mean that in M, a
    "module" is a single file that contains both an "interface" section and
    an "implementation" section, like in Pascal, rather than having separate
    "interface" files and "implementation" files, like in Modula-2 ? This
    is, of course, perfectly fine. The translation to C basically consists
    of putting the "interface" part in "file.h" and the implementation part
    in "file.c".

    * Exported variables must be done properly, with an 'owner' module.
    Imported names must not be directly visible in M source; only via a
    module that exports them. This applies to functions, variables, types,
    enums, records, named constants


    Again, I don't see the sense in what you are saying. The names /must/
    be visible in order to use them. The /definitions/ behind the names can
    be hidden.

    In a C module that imports function F, the declaration (sometimes, definition) of F must exist in the translation unit. (What you get if
    you preprocess that source module, flatten all the includes etc.)

    In the M module that imports function F, the definition of F never
    appears in the source for that module.

    The compiler will find it by searching the namespaces of the imported modules.

    (Actually, in the latest version, you won't see the name of the imported module either; that information is centralised.)

    You mean that it is impossible to compile a module without info from
    a central registry? For me it means that the registry is now part of
    your source.

    Regarding C, in well written
    Now I am mixed up. Are we translating your M code to C, or C code to M?

    No, this is C to M now.

    There are all sorts of problems involved with macros.

    In short, doing this manually is just impractical (except for stylised,
    very conservative C like I might write); it can be easier to rewrite.

    Well, if code is _really_ badly written or trivial, then an independent
    rewrite may be the best way. OTOH, in many cases one can do a "limited"
    rewrite: write code that performs "the same" computations as the code
    in the other language. "The same" includes a 1-1 correspondence of variables
    and fields in data structures. Such a limited rewrite can be 5-10
    times faster than writing code from scratch, so in this sense
    translating from one language to a different language is easy;
    it is much less effort than writing from scratch. Of course,
    this assumes that the code is doing something interesting; for trivial
    code you are just dealing with bulk and following the original is of no
    help.

    Doing it programmatically has all sorts of issues too.

    If you want fully automatic translation that preserves meaning,
    then I would expect ugly and inefficient code as the result.
    The first issue that comes to mind is name mangling: since your
    language is case-insensitive one may be forced to change some
    C names to avoid clashes. A simple translator would change all
    names... OTOH a reasonably complex translator may be able to
    translate 80-90% of the code and flag the problematic 10%. Manual
    modification _before_ running the translator can significantly
    improve the working of the automatic part and reduce the total amount
    of manual labor.

    Concerning macros, you somewhat ignore one obvious solution:
    implement a preprocessor for your language so that you
    can translate C macros into macros for your preprocessor.

    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing).

    That doesn't work. For a start it will flatten all enumerations and
    defines into literals; you need to preserve those.

    "doesn't work" depends on your goal. If goal is to get running
    executable as output from your compiler (IIUC "automatic
    transcompilation" and the following text up to closing
    parenthesis covers this case), then I see no problem
    that expansion would cause. If goal is to get idiomatic code
    in your language, then you need to be more creative. IME most
    C macros are rather simple ones, to get effect of named constants
    and to inline code, those can be translated to constructs of
    your langage. Some macros are used as abbreviations, if your
    language has appropriate way for abbreviating you can replace
    tham by constucts of your language, otherwise you need to
    expand. Concerning expansion, you can modify C preprocessor
    to do partial expansion and preserve comments (and possibly
    also conditionals). In fact, even standard C preprocessor
    maybe enough if there are no comments: just make sure that
    input contains only definitions of things that you want
    expanded and make rest undefined.
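
    (A sketch of that partial-expansion idea, with invented file and macro
    names: only the macros to be expanded are defined in the input, so the
    preprocessor leaves every other name untouched.)

        /* part.c: SQUARE is defined here and will be expanded; MAXLEN is
           deliberately left undefined, so it survives as a name. */
        #define SQUARE(x) ((x) * (x))

        int buf[MAXLEN];                 /* stays as 'MAXLEN' in the output */
        int area = SQUARE(MAXLEN);       /* becomes ((MAXLEN) * (MAXLEN))   */

        /* run e.g.:  cpp -P part.c   (or: gcc -E -P part.c) */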

    --
    Waldek Hebisch

  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Wed Dec 7 18:58:45 2022
    On 07/12/2022 17:47, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    (Actually, in the latest version, you won't see the name of the imported
    module either; that information is centralised.)

    You mean that it impossible to compile a module without info from
    central registry? For me it means that registry is now part of
    your source.

    You can't compile a single 'module'; you compile the whole program. The
    info that describes the module layout is at the top of the lead module,
    which typically contains only module info.

    So yes it can be considered part of the source, but is really
    intermediate info between the compiler and the source code proper.

    Other languages may use command parameters, @ files, makefiles, or
    untidy collections of `import <module>` at the top of every module, that
    need constant maintenance.




    Regarding C, in well written

    ?

    In short, doing this manually is just impractical (except for stylised,
    very conservative C like I might write); it can be easier to rewrite.

    Well, if code if _really_ badly written or trivial, then independent
    rewrite may be best way. OTOH, in many cases one can do "limited"
    rewrite: write code that performs "the same" computations as code
    in other language. "The same" includes 1-1 correspondence of variables
    and fields in data structurs. Such limited rewrite can be 5-10
    times faster than writing code from scratch, so in this sense
    translating from one language to different language is easy,
    it is much less effort than writing from scratch. Of course,
    this assumes that code is doing something interesting, for trivial
    code you are just doing with bulk and following original is of no
    help.

    Doing it programmatically has all sorts of issues too.

    If you want fully automatic translation that preserves meaning,
    then I would expect ugly and inefficient code as the result.
    The fist issue that comes to mind is name mangling: since your
    languages is case insensitive one may be forced to change some
    C names to avoid clashes.

    That would be a trivial matter: names can have a backtick prepended that preserves their case. But I wouldn't want to work with such source code:
    having to be case-sensitive /and/ having the backtick. Name mangling
    has the same problems.

    This only works if the output is intermediate code that no one ever
    sees. However compiling C via intermediate M is not a useful exercise.

    Concerning macros, you somewhat ignore one obvious solution:
    implement a preprocessor for your language so that you
    can translate C macros into macros for your preprocessor.

    That's not a solution, that's just dragging half the C language into mine.

    Besides, the C macros will still expand into C expressions, statements
    and types; so not just half the language, but half the C source too.

    Don't forget we're trying to translate the C, not find ways of avoiding
    that task!


    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing).

    That doesn't work. For a start it will flatten all enumerations and
    defines into literals; you need to preserve those.

    "doesn't work" depends on your goal. If goal is to get running
    executable as output from your compiler (IIUC "automatic
    transcompilation" and the following text up to closing
    parenthesis covers this case), then I see no problem
    that expansion would cause.

    I think DB meant being able to manually translate line by line. That can
    work for small examples, but it doesn't really scale.

    I anyway already have a tool that will do that. If this is the original
    C of one example:

    https://github.com/sal55/langs/blob/master/nano.c

    Then my tool (a development of my C compiler) produces this file in my
    syntax:

    https://github.com/sal55/langs/blob/master/nano.m

    It looks great! But it won't compile; this is purely to help with visualising
    C code.

    I have tried a few times to use this as a starting point to manually
    translate C programs to M, but there's always something that needs
    fixing every few lines; it's usually a huge amount of work. And many
    things can't be detected: they will produce legal M that is an incorrect representation of the C.

    So in a line like this:

    out^ := njClip(x7+x1>>14+128)

    the operator priorities are all different.
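
    (Illustration with invented values: in C, '+' binds tighter than '>>', so
    the line above would parse as njClip((x7 + x1) >> (14 + 128)); a
    translator has to add parentheses wherever the target language's
    precedence table differs, or the meaning changes silently.)

        #include <stdio.h>

        int main(void) {
            int x7 = 24, x1 = 40;
            int a = x7 + x1 >> 2 + 1;       /* groups as (x7 + x1) >> (2 + 1)  -> 8  */
            int b = x7 + (x1 >> 2) + 1;     /* a different grouping            -> 35 */
            printf("%d %d\n", a, b);
            return 0;
        }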

    Plus macros are expanded into literals and so on. You can do more work
    on the tool to reduce the manual fixups needed, but generally this
    solution is not viable.

    But, we're not really seriously trying to translate a specific C program.

    David Brown's point was to show that my language is really no different
    from C. But the language /it implements/ doesn't claim to be, and there
    wouldn't be any great difficulty in rewriting any C program in my
    language, so long as you didn't try to do it token-by-token.

    It would be the same task as translating a C program to D or Java or C#
    or Zig or Rust. I think most agree those are different from C.

    My language also has many differences in how such a lower-level language
    is presented, like the module system for example. Or defaulting to
    64-bit integers (causing subtle differences of behaviour). Or
    out-of-order definitions. Or being expression-based not statement-based.
    Or a dozen other matters I've already listed.


    If the goal is to get idiomatic code
    in your language, then you need to be more creative. IME most
    C macros are rather simple ones, to get effect of named constants
    and to inline code, those can be translated to constructs of
    your langage.

    I have another tool that attempts to translate APIs, i.e. C headers. When
    I applied it to gtk.h (which invokes 550 C headers across 330,000 lines
    of code), it produced a flat 25,000-line import module, of which the last
    3000 lines are macros that need to be manually processed.

    I do have a simple macro scheme in my language, designed for
    well-formed expressions only, but C macros can include random bits of
    syntax, and expand to /C code/. Remember the tool only does
    declarations, not code.

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Thu Dec 8 00:30:05 2022
    Bart <bc@freeuk.com> wrote:
    On 07/12/2022 17:47, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    (Actually, in the latest version, you won't see the name of the imported
    module either; that information is centralised.)

    You mean that it impossible to compile a module without info from
    central registry? For me it means that registry is now part of
    your source.

    You can't compile a single 'module'; you compile the whole program. The
    info that describes the module layout is at the top of the lead module,
    which typically contains only module info.

    So yes it can be considered part of the source, but is really
    intermediate info between compiler, and source code proper.

    Other languages may use command parameters, @ files, makefiles, or
    untidy collections of 'import <module>` at the top of every module, that
    need constant maintenance.

    Hmm, if module A uses function f from module B and I want to change
    A to use f from C, then I need to change an import statement. It does
    not matter if the import statement is in A or in a central file. OTOH
    if imports stay stable then the 'import' line does not change, so no
    need for extra maintenance. OTOH in the language I use imports are
    scoped. Routine rB in A can import B and use f from B, routine
    rC in A can import C and use f from C. I do not think this would
    work with your import table. And I can compile each module
    separately. In fact, this is the usual development mode: I stop
    the program, modify the module, compile, load and restart the
    program. The point is that all program data and the debugging setup
    are preserved. Local data of the modified module is destroyed; usually
    this is not a problem for debugging. But if the whole program were
    recompiled and reloaded, then preserving data would be more
    tricky. Anyway, I benefit from independent compilation
    and import lines are not a problem. In fact, in the (not so
    frequent) cases when I need to modify import statements it
    happens along with modification of the module code. And it is easier
    to do the modification in a single file than in two separate files.

    And, if you ask, I have a makefile and it contains a list of all
    modules, so when I add a new module I must add it to the
    makefile. In principle the list of modules could be generated
    automatically ("take all modules in files in the current directory"),
    but I prefer the current way. Some languages with modules have
    a "transitive closure" feature: when you give a module to the compiler
    (or maybe an extra tool) it will recompile all the modules it needs.
    But you still need a list of "entry modules", that is modules
    that should be compiled and are usable when the user explicitly
    requests them by name, but otherwise would be unused.



    Regarding C, in well written

    ?

    Oops, I got distracted and did not finish this part. I mean,
    in a well-written C program there will be an implicit module structure,
    essentially ".c file = module". Exported functions will be
    declared in header files. One school says that there should be a
    .h file for each C file, this .h file containing the "module" export info.
    A less rigorous school allows a single .h file (or some small number)
    declaring the exported functions. Even if a program slightly deviates
    from such patterns it is usually not much work to separate the
    declarations of exported functions and move them into .h files
    corresponding to the .c files. I believe that gcc with appropriate
    options would tell you if there is any function which gets
    exported (is non-static) but which lacks a declaration in a
    header file, so you can correct the program to make sure that
    everything which needs to be exported is declared in header files
    and the rest is static. After that the C structure will map
    1 to 1 to any reasonable module system.

    Of course, this assumes reasonably well-written programs, but
    if a program is badly written then I do not think it makes much
    sense to attempt translation. Translation of a reasonable program
    can in principle "add value": better expose the module structure,
    add stronger type checking, etc. Translation of a bad program
    is likely to make it worse.

    In short, doing this manually is just impractical (except for stylised,
    very conservative C like I might write); it can be easier to rewrite.

    Well, if code if _really_ badly written or trivial, then independent rewrite may be best way. OTOH, in many cases one can do "limited"
    rewrite: write code that performs "the same" computations as code
    in other language. "The same" includes 1-1 correspondence of variables
    and fields in data structurs. Such limited rewrite can be 5-10
    times faster than writing code from scratch, so in this sense
    translating from one language to different language is easy,
    it is much less effort than writing from scratch. Of course,
    this assumes that code is doing something interesting, for trivial
    code you are just doing with bulk and following original is of no
    help.

    Doing it programmatically has all sorts of issues too.

    If you want fully automatic translation that preserves meaning,
    then I would expect ugly and inefficient code as the result.
    The fist issue that comes to mind is name mangling: since your
    languages is case insensitive one may be forced to change some
    C names to avoid clashes.

    That would be a trivial matter: names can have a backtick prepended that preserves their case. But I wouldn't want to work with such source code: having to be case-sensensitive /and/ having the backtick. Name mangling
    has the same problems.

    Exactly.

    This only works if the output is intermediate code that no one ever
    sees. However compiling C via intermediate M is not a useful execise.

    Yes. But for other languages it may be useful.

    Concerning macros, you somewhat ignore one obvious solution:
    implement a preprocessor for your language so that you
    can translate C macros into macros for your preprocessor.

    That's not a solution, that's just dragging half the C language into mine.

    First, I wrote "macros", not "C macros". Second, in language whare
    compiler has about 6000 lines (and leverages other language for code generation, otherwise compiler would be significantly bigger) handling
    of macros is less than 300 lines. So macros are really small
    addition.

    Besides, the C macros will still expand into C expressions, statements
    and types; so not just half the language, but half the C source too.

    I wrote "translate macros". After translation macros will expand to
    your language.

    Don't forget we're trying to translate the C, not find ways of avoiding
    that task!


    Macros in C can be used well to write clearer code, and that should be
    fine to translate to your language (we are talking manual translation,
    not automatic transcompilation - if you were doing that, you'd do it
    from the C after pre-processing).

    That doesn't work. For a start it will flatten all enumerations and
    defines into literals; you need to preserve those.

    "doesn't work" depends on your goal. If goal is to get running
    executable as output from your compiler (IIUC "automatic
    transcompilation" and the following text up to closing
    parenthesis covers this case), then I see no problem
    that expansion would cause.

    I think DB meant being able to manually translate line by line. That can
    work for small examples, but it doesn't really scale.

    I anyway already have a tool that will do that. If this is the original
    C of one example:

    https://github.com/sal55/langs/blob/master/nano.c

    Then my tool (a development of my C compiler) produces this file in my syntax:

    https://github.com/sal55/langs/blob/master/nano.m

    It looks great! But it won't compile; this is purely to help visualising
    C code.

    I have tried a few times to use this as a starting point to manually translate C programs to M, but there's always something that needs
    fixing every few lines; it's usually a huge amount of work. And many
    things can't be detected: they will produce legal M that is an incorrect representation of the C.

    So in a line like this:

    out^ := njClip(x7+x1>>14+128)

    the operator priorities are all different.

    Handling priorities was solved many years ago. Of course, you need
    to parse to get a parse tree and then output the tree in your syntax.

    Plus macros are expanded into literals and so on. You can do more work
    on the tool to reduce the manual fixups needed, but generally this
    solution is not viable.

    But, we're not really seriously trying to translate a specific C program.

    David Brown's point was to show that my language is really no different
    from C. But the language /it implements/ doesn't claim to be, and there wouldn't be any great difficult in rewriting any C program in my
    language, so long as your didn't try and do it token-by-token.

    It would be the same task as translating a C program to D or Java or C#
    or Zig or Rust. I think most agree those are different from C.

    My language also has many differences in how such a lower-level language
    is presented, like the module system for example. Or defaulting to
    64-bit integers (causing subtle differences of behaviour). Or
    out-of-order definitions. Or being expression-based no statement-based.
    Or a dozen other matters I've already listed.

    David Brown may explain more of what he meant. But I think that my
    view is similar to his: there is almost nothing new in your
    language. You are moving in a circle of ideas that were mainstream
    in the sixties, mostly resolved in the seventies and considered old hat now.
    AFAICS the newest feature of your language is modules, which got
    popular around 1977. Newer languages either have them or their
    designers (possibly wrongly) thought that they have better
    features (that may be the case with C++). I would say that among the
    reasonably popular "newer" languages all have "new things".
    Some of those things are successful and propagate to other
    languages, some fail, some stay in their niche, but
    together they contribute to progress. It is hard to find in
    your language something that others may wish to copy. Your
    'stringinclude' feature is a borderline case; I think
    nobody will want to copy your way, but they may be inspired
    to invent something better.

    Now, if you think that what I wrote is "putting you down":
    you invite this by comparing your language to C. There are
    a lot of programmers and I think most would be unable to
    invent and implement a language of comparable quality to yours.
    So this is certainly your personal achievement. But the
    deficiencies of C are known and so there is knowledge of how to cope
    with them. There are also real-world advantages, starting
    with the availability of compilers and libraries. If you want
    your language to compare well with C you need either a _huge_
    advantage at the language level or real-world
    support at a level comparable to C. There were a bunch
    of languages that arguably offered significant improvements
    over C and they made more effort than you on real-world
    support, yet so far none has replaced C. If you want
    comparisons, more relevant would be comparisons with non-C
    languages. And for a start, it looks like all the contenders
    offer modules, with better features than yours.

    --
    Waldek Hebisch

  • From David Brown@21:1/5 to Bart on Thu Dec 8 10:08:37 2022
    On 07/12/2022 19:58, Bart wrote:

    David Brown's point was to show that my language is really no different
    from C. But the language /it implements/ doesn't claim to be, and there wouldn't be any great difficult in rewriting any C program in my
    language, so long as your didn't try and do it token-by-token.

    It would be the same task as translating a C program to D or Java or C#
    or Zig or Rust. I think most agree those are different from C.


    My point was that your language (the low-level compiled one) and C are
    similar styles - they are at a similar level, and are both procedural imperative structured languages. They have functions, variables, and
    pointers. They do not have higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII, metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing. They are not domain-specific, or
    tied to particular applications or environment.

    I don't mean they are exactly the same, or that an automatic translator
    could be made to generate idiomatic code. Each language has certain
    features that would be ugly or non-idiomatic when translated. There may
    be a few cases where you'd have to significantly re-structure the code,
    but for much of it, you could translate function for function, variable
    for variable, type for type. Details are different, but not the
    structure of the languages or the way you approach the same tasks.

    I'd put Pascal (not newer Object Pascal) in the same box.

    I'd put C++, C#, Java and D in a box together, though they have more differences. And I'd say you can translate from C, Pascal, or your
    language into those languages reasonably well - but not vice versa.

    Put another way, a programmer who is familiar with C would be able to
    learn your language quickly, and it would not take long to be writing
    "normal" code in the language. They'd miss some features of C, and
    appreciate some new features of your language, but (personal preferences
    aside) would find it straightforward to work with. The same applies the
    other direction.

    Moving from C to idiomatic C++ or D would be a far bigger jump. And
    moving the other direction would feel crippling.

    Programming in Eiffel, Haskell, APL, Forth or Occam is /completely/
    different - you approach your coding in an entirely different way, and
    it makes no sense to think about translating from one of these to C (or
    to each other).


    (None of this suggests you "copied" C. You simply have a roughly
    similar approach to solving the same kinds of tasks - you probably had experience with much the same programming languages as the C designers,
    and similar assembly programming experience before making your languages
    at the beginning.)

  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Thu Dec 8 12:00:40 2022
    On 08/12/2022 00:30, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    Other languages may use command parameters, @ files, makefiles, or
    untidy collections of 'import <module>` at the top of every module, that
    need constant maintenance.

    Hmm, if module A used function f from module B and I want to change
    A to use f from C, then I need to change import statement.

    If it happens that A imports both B and C, each of which export function
    F, then you would need to write B.F() or C.F() from A anyway, to
    disambiguate.

    I does
    not matter if import statement is in A or in central file. OTOH
    if imports stay stable then 'import' line does not change so no
    need for exta maintenance.

    Since I started using my 2022 module scheme, I very rarely have to look
    at the module list. Mainly it's changed in order to create a different configuration of a program.

    Actually it has dramatically simplified that kind of maintenance.

    My scheme also allows circular and mutual imports with no restrictions.

    I would link to some docs but I can sense you're not going to be
    receptive. That's fine; you're used to your way of doing things.

    OTOH in langage I use imports are
    scoped. Routine rB in A can import B and use f from B, routine
    rC in A can import C and use f from C. I do not think it would
    work with your import table.

    You mean a function privately imports a module? That sounds crazy - and chaotic. Suppose 100 different functions all import different subsets of
    20 different modules (is that Python by any chance?)

    There needs to be some structure, some organisation.


    And I can compile each module
    separately.

    I can compile each program separately. Granularity moves from module to program. That's better. Otherwise why not compile individual functions?

    In fact, this is usual developement mode: I stop
    the program, modify the module, compile, load and restart the
    program. The point is that all program data and debugging setup
    is preserved.

    You mean change one module of a running program? OK, that's not really
    a feature of a language I think, more of external tools,
    especially IDEs and debuggers.

    I would have no idea how to do that in C. But I used to do it at the application level where most functionality was implemented via
    hot-loaded scripting modules. Then development was done from /within/
    the running application.


    Anyway, I have benefit from independent compilation
    and import lines are not a problem.

    I used to use independent compilation myself. I moved on to
    whole-program compilation because it was better. But all the issues
    involved with interfacing at the boundaries between modules don't
    completely disappear, they move to the boundaries between programs
    instead; that is, between libraries.


    And, if you ask, I have makefile and it contains list of all
    modules, so when I add a new module I must add it to the
    makefile. I principle list of modules could be generated
    automatically ("take all modules in files in current directory"),
    but I prefer current way.

    Well, my language doesn't use or need makefiles. While I got rid of
    linkers as being a waste of time, I never got into makefiles at all.

    There are however project files used by my mini-IDE, which contain
    additional info needed for listing, browsing, and editing the source
    files, and for running the programs with test inputs.

    But if /you/ wanted to build one of my projects from source, I would
    need to provide exactly two files:

    mm.exe the compiler
    app.ma the source code

    The latter being an automatically created amalgamation (produced with
    `mm -ma app`). Build using `mm app`.

    If you were on Linux or didn't want to use my compiler, then it's even
    simpler; I would provide exactly one file:

    app.c Generated C source code (via `mc -c app`)

    Here, you need a C compiler only. On Windows, you can build it using
    `gcc app.c -oapp.exe`. Linux needs more options, listed at the top of
    the file.

    I haven't yet figured out a way of providing zero files if you wanted to
    build my apps from source.

    But I guess none of this cuts any ice. After all, there are innumerable
    ways of taking the most fantastically complex build process, and
    wrapping up in one command or one file (docker etc).

    The difference is that what I provide is genuinely simple: one bare
    compiler, one actual source file.

    Some langages with modules have
    "transitive closure" feature: when you give module to compiler
    (or maybe extra tool) it will recompile all modules it needs.
    But you still need list of "entry modules", that is modules
    that should be compiled and are usable when user explicitely
    request them by name, but otherwise would be unused.

    My stuff is simpler, believe me. When you have 1Mlps compilation speed,
    most of this build stuff can go out the window.

    Oops, I got distracted and did not finish this part. I mean,
    in well written C program there will be implicit module structure, essentially ".c file = module".

    I've seen lots of C source, it's rarely that tidy.

    If it was, I'd be able to use an unusual feature of my C compiler
    where it can automatically discover the necessary modules by tracking
    the header files (`bcc -auto main.c`).


    Exported functions will be
    declared in header files. One school says that there should be
    .h file for each C file, this .h file containg "module" export info.
    Less rigorous school allows single .h file (or some small number)
    decaring exported function. Even if program slightly deviates
    from such patterns usually this is not much work to separate
    declarations of exported functions and move them in .h files
    corresponding to .c files. I believe that gcc with appropriate
    options would tell you if there is any function which gets
    exported (is non-static) but which lacks declaration in
    header file, so you can correct program to make sure that
    everthing which needs to be exported is declared in header files
    and rest is static. After that the C structure will map
    1 to 1 to any resonable module system.

    C is just so primitive when it comes to this stuff. I'm sure it largely
    works by luck.

    David Brown may explain more what he meant. But I think that my
    view is similar to his: there is almost nothing new in your
    language. You are moving in circle of ideas that were mainstream
    in sixties,

    So what are the new circles of ideas? All that crap in Rust that makes
    coding a nightmare, and makes building programs dead slow? All those new functional languages with esoteric type systems? 6GB IDEs (that used to
    take a minute and a half to load on my old PC)? No thanks.

    I keep the languages I maintain accessible, and the ideas simple.

    Any genuinely good ideas I would already have stolen if I liked them and
    they were practical to implement.

    But the trend now is to make even features like modules as complicated
    and comprehensive as possible. (For example, there may not be 1:1 correspondence between file and module: multiple modules per file;
    nested modules; modules split across multiple files. I keep it simple.)

    mostly resolved in seventies and consider old hat now.


    I think you're not looking at the right features of a language. There
    are some characteristics popular in the 70s that I like and maintain:

    * Clean, uncluttered brace-free syntax
    * Case-insensitive
    * 1-based
    * Line-oriented (no semicolons)
    * Print/read as statements

    C is still tremendously popular for many reasons. But anyone wanting to
    code today in such a language will be out of luck if they preferred any
    or all of these characteristics. This is why I find coding in my language
    such a pleasure.

    Then, if we are comparing the C language with mine, I offer:

    * Out of order definitions
    * One-time definitions (no headers, interfaces etc)
    * Expression-based
    * Program-wide rather than module-wide compilation unit
    * Build direct to executable; no object files or linkers
    * Blazing fast compilation speed, can run direct from source
    * Module scheme with tidy 'one-time' declaration of each module
    * Function reflection (access all functions within the program)
    * 64-bit default data types (ie. 'int' is 64 bits, 123 is 64 bits)
    * No build system needed
    * Create and directly compile one-file amalgamations of projects
    * String and file embedding you know about
    * One-file self-contained implementations

    I haven't listed the literally one hundred small enhancements that make
    the coding experience even in a lower-level system language with
    primitive types so comfortable.

    Few are remarkable or unique, but it's putting it all together in one
    tidy package that counts.


    And for starter, it looks that all contenders
    offer modules, with better features than yours.

    Sure, you have C++, Zig, Rust, Java, C#, Dart, ....

    All with very advanced and complicated features. They also have big implementations and take ages to build projects.

    (When I need more advanced features, I switch to my scripting language.
    Which also, by design, shares the same syntax and most of the same characteristics and features.)

    I think, like David Brown, you just don't get it.

    Try and think of what I do as creating delicious meals in my kitchen
    from my own recipes. Nothing new here either. But both you and DB
    probably work (figuratively) in the food industry or run chains of
    restaurants, or work in a food laboratory.

    Or are used to buying ready-meals from supermarkets.

  • From Bart@21:1/5 to David Brown on Fri Dec 9 00:52:24 2022
    On 08/12/2022 09:08, David Brown wrote:
    On 07/12/2022 19:58, Bart wrote:

    David Brown's point was to show that my language is really no
    different from C. But the language /it implements/ doesn't claim to
    be, and there wouldn't be any great difficult in rewriting any C
    program in my language, so long as your didn't try and do it
    token-by-token.

    It would be the same task as translating a C program to D or Java or
    C# or Zig or Rust. I think most agree those are different from C.


    My point was that your language (the low-level compiled one) and C are similar styles - they are at a similar level, and are both procedural imperative structured languages.  They have functions, variables, and pointers.  They do not have higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII, metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.  They are not domain-specific, or tied to particular applications or environment.

    I don't know what half of those mean.

    My system language was upgraded for a while with higher level data types
    (you seem to have skipped a level or two in your list), but then I
    decided to keep it lower level.

    I think languages should know their place, so let it do what it does as
    well as it can, rather pretend to be something it's not. (And mine
    mainly exists to implement the next language along.)

    I anyway favour features that everyone can understand and appreciate.
    Here's an example using a minor extension to my C compiler:

        void print_this_module(void) {
            puts(strinclude(__FILE__));
        }

    Put this in any module (or in more than one, or make it a macro), and
    when called it will print the source code of the module, which is
    embedded within the executable. There are a dozen uses other than the
    novelty one above.

    higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII, metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.


    Those are the sorts of features that are implemented more diversely
    across different languages than simple ones. They will be harder to
    translate from one language to another. I favour more conservative and
    more universal features.

    A language needs to be able to get things done, and the number one thing missing from mine is the ability to /effortlessly/ use third party
    libraries. Not actors, whatever those are.

  • From David Brown@21:1/5 to Bart on Fri Dec 9 12:54:47 2022
    On 09/12/2022 01:52, Bart wrote:
    On 08/12/2022 09:08, David Brown wrote:
    On 07/12/2022 19:58, Bart wrote:

    David Brown's point was to show that my language is really no
    different from C. But the language /it implements/ doesn't claim to
    be, and there wouldn't be any great difficult in rewriting any C
    program in my language, so long as your didn't try and do it
    token-by-token.

    It would be the same task as translating a C program to D or Java or
    C# or Zig or Rust. I think most agree those are different from C.


    My point was that your language (the low-level compiled one) and C are
    similar styles - they are at a similar level, and are both procedural
    imperative structured languages.  They have functions, variables, and
    pointers.  They do not have higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII,
    metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.  They are not domain-specific,
    or tied to particular applications or environment.

    I don't know what half of those mean.

    That's fine. Your language doesn't have them, and C doesn't have them.
    If one of the languages had them, it would be very hard to translate
    that feature smoothly into the other language.


    My system language was upgraded for a while with higher level data types
    (you seem to have skipped a level or two in your list), but then I
    decided to keep it lower level.

    I didn't cover everything!

    And as I said, there are always some features that one of these broadly
    similar languages has that don't translate smoothly. For example,
    Pascal has sets - translating these to C means enumerations with
    prefixes and quite a loss in the neatness, clarity and type-safety of
    the original. In the other direction, printf() calls will be
    significantly changed moving to Pascal as Pascal doesn't have variadic functions.
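
    (A rough C rendering of a Pascal-style set, with invented names: the
    enumerators double as bit flags and set operations become bitwise
    operators, which is where the neatness and type checking get lost.)

        #include <stdio.h>

        enum Colour { Colour_Red = 1 << 0, Colour_Green = 1 << 1, Colour_Blue = 1 << 2 };
        typedef unsigned ColourSet;                        /* "set of Colour" */

        int main(void) {
            ColourSet warm = Colour_Red | Colour_Green;    /* [Red, Green]    */
            if (warm & Colour_Red)                         /* Red in warm     */
                puts("contains red");
            warm &= ~(ColourSet)Colour_Green;              /* exclude Green   */
            printf("%u\n", warm);                          /* prints 1        */
            return 0;
        }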

    If you've added some higher level data types to your language, then
    these are likely to be harder to translate smoothly into C.


    I think languages should know their place, so let it do what it does as
    well as it can, rather pretend to be something it's not. (And mine
    mainly exists to implement the next language along.)


    Sure. The fact that your language occupies a similar level to C and can practically be translated back and forth does not mean it does not have
    its pros and cons compared to C. There is room for a great many
    programming languages.


    I anyway favour features that everyone can understand and appreciate.

    Let's be clear here - your language is written by one person, for one
    person, based on the preferences and understandings of one person. That
    does not necessarily mean you are wrong, but you should be /extremely/
    careful about extrapolating to "everyone". Serious languages are made
    by groups and teams that work together, and move through testers and
    early adopters to grow communities of users over the years. When you
    have thousands of regular users who can give feedback on your design
    choices, you can talk about features that "many people can understand
    and appreciate".


    Here's an example using a minor extension to my C compiler:

        void print_this_module(void) {
            puts(strinclude(__FILE__));
        }

    Put this in any module (or in more than one, or make it a macro), and
    when called it will print the sourcecode of the module, which is
    embedded within the executable. There are a dozen uses other than the
    novelty one above.


    And your basis for thinking /everyone/ understands this is ... what? I
    guess it looks clear enough for simple cases, but what about more
    complicated circumstances? What if the code is not contained in a file,
    but part of an on-the-fly compilation, or with source code from a pipe?
    What if the file is a temporary one generated by transcompilation from
    a higher level language? Does it work properly with respect to the
    #line pre-processor directive? How does it work regarding other
    #include's in the file? How are macros and other pre-processing handled
    in the file? What about someone trying to include an "infinite" file
    like "/dev/random"? What about a file that contained characters that
    are not acceptable by the compiler?

    There are /lots/ of questions.

    I think it is fair to say that a number of programmers would appreciate
    a way to include data files embedded in their programs. Mostly you want
    the files to be treated as data (initialising an array), rather than
    strings, but sometimes strings are useful too. (Like most people, I
    solved this long ago for my own needs with a simple data-to-include-file utility). Including a file's own source code is not likely to be needed
    by many, but other files are more important.
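
    (For illustration, here is a minimal sketch of the kind of
    data-to-include-file utility mentioned above, similar in spirit to
    `xxd -i`; the `embedded_data`/`embedded_size` names and the file
    handling are purely illustrative, not anyone's actual tool.)

        /* Read a file and write a C header that declares its contents as an
           unsigned char array, so it can be #included and compiled into the
           executable.  Error handling kept to a minimum. */
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            if (argc != 3) {
                fprintf(stderr, "usage: %s infile outfile.h\n", argv[0]);
                return 1;
            }
            FILE *in = fopen(argv[1], "rb");
            FILE *out = fopen(argv[2], "w");
            if (!in || !out) { perror("fopen"); return 1; }

            fprintf(out, "static const unsigned char embedded_data[] = {");
            int c;
            long n = 0;
            while ((c = fgetc(in)) != EOF) {
                /* 12 bytes per output line, written as hex constants */
                fprintf(out, "%s0x%02x,", (n % 12 == 0) ? "\n    " : " ", c);
                n++;
            }
            fprintf(out, "\n};\nstatic const unsigned long embedded_size = %ld;\n", n);

            fclose(in);
            fclose(out);
            return 0;
        }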


    Just for comparison, have a look at how the real C language has dealt
    with this, as a similar (but more powerful and useful) feature has been
    added to C23. Since C is not a one-man toy, discussions and proposals
    are needed, involving proponents of the idea, C standards committee
    members, users, compiler writers, and testers. It takes many rounds to establish what "everyone" understands, and what "everyone" appreciates -
    as well as avoiding potential problems, security issues and other
    complications that are not obvious to most people.

    <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2898.htm>
    <https://thephd.dev/finally-embed-in-c23>
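
    (For reference, a sketch of what the C23 feature looks like in use; it
    assumes a compiler with #embed support and a non-empty file, and the
    file name is just an example.)

        /* C23 #embed expands to a comma-separated list of the file's bytes,
           usable as array initialiser data. */
        #include <stdio.h>

        static const unsigned char help_text[] = {
        #embed "mm_help.txt"
        , 0    /* append a terminating NUL so it can be printed as a string */
        };

        int main(void)
        {
            puts((const char *)help_text);
            return 0;
        }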



    higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII,
    metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.


    Those are the sorts of features that are implemented more diversely
    across different languages than simple ones. They will be harder to
    translate from one language to another. I favour more conservative and
    more universal features.


    Again, you favour things that seem more conservative and universal to
    /you/. There is a fair overlap with these things and the features of C,
    since the languages are similar.

    But people with different programming backgrounds and habits could have completely different ideas of what is "conservative and universal". For
    many languages, type inference and generics is so integral that it would
    seem very strange to declare variables of a given type - they may not
    even see variables as a meaningful concept. Not all languages have
    functions in the sense you think is "universal".

    This is the same in all sorts of areas, not just programming languages -
    our ideas about what things are "fundamental" or "universal" are almost entirely a matter of what we find /familiar/. Even if a concept is
    familiar to many people, it merely means it is found in many popular
    languages - it does not make it universal or fundamental. (And I note
    that there are many concepts found in many popular languages which you
    reject or dislike.)

    Of course you put things in your language that /you/ find appropriate -
    you made it for yourself, and if some Prolog user or Sketch expert wants
    to use it, they'll have to learn how it works. All I am asking you to
    do is understand that your ideas, your preferences, your choices are
    formed from your experiences and background - they are not "universal"
    and do not necessarily apply to others.

    A language needs to be able to get things done, and the number one thing missing from mine is the ability to /effortlessly/ use third party
    libraries. Not actors, whatever those are.


  • From Bart@21:1/5 to David Brown on Fri Dec 9 16:28:53 2022
    On 09/12/2022 11:54, David Brown wrote:
    On 09/12/2022 01:52, Bart wrote:

    (Replying about 'strinclude' only)

    Here's an example using a minor extension to my C compiler:

         void print_this_module(void) {
             puts(strinclude(__FILE__));
         }

    Put this in any module (or in more than one, or make it a macro), and
    when called it will print the sourcecode of the module, which is
    embedded within the executable. There are a dozen uses other than the
    novelty one above.


    And your basis for thinking /everyone/ understands this is ... what?  I guess it looks clear enough for simple cases, but what about more
    complicated circumstances?  What if the code is not contained in a file,
    but part of an on-the-fly compilation, or with source code from a pipe?
     What if the file is a temporary one generated by transcompilation from
    a higher level language?  Does it work properly with respect to the
    #line pre-processor directive?  How does it work regarding other
    #include's in the file?  How are macros and other pre-processing handled
    in the file?  What about someone trying to include an "infinite" file
    like "/dev/random"?  What about a file that contained characters that
    are not acceptable by the compiler?

    There are /lots/ of questions.

    You can say the same about #include in C. The search algorithm for
    include files is actually complex, and is not set by the language: it is implementation defined.

    My strinclude is simpler. The concept itself is simple: incorporate a
    TEXT file as a string literal rather than C code, a task that many will
    have wanted to do at some point.

    This was a quick proof of concept for C. The search for non-absolute
    file-specs is relative to the current directory. In the original in my language, it is relative to the location of the lead module (to allow a
    program to be built remotely from anywhere in the file system).

    For C I won't bother upgrading it.

    Of course, people can enter all sorts of things as the file path which
    are likely to cause problems. Just like they can for #include. And just
    like they can, with languages that have compile-time evaluation and the
    ability to duplicate strings, when they write "A"*1000000000.

    1GB is small enough to actually work, big enough to cause all sorts of problems, like a 1GB executable.


    I think it is fair to say that a number of programmers would appreciate
    a way to include data files embedded in their programs.  Mostly you want
    the files to be treated as data (initialising an array), rather than
    strings, but sometimes strings are useful too.

    My systems language has `bininclude` for binary files, but it's a poor implementation (a 1MB file will generate 1M AST nodes, one for each
    byte, some 64MB in all, but `strinclude` for a 1MB file needs only a 1MB string).

    My scripting language uses `strinclude` for both. The program below
    writes a C file then compiles it, supplying its own C compiler!

        const compiler = strinclude("c:/m/bcc.exe")

        if not checkfile("newcc.exe") then
            writestrfile("newcc.exe",compiler)
        fi

        writetextfile("test.c",
            ("#include <stdio.h>",
             "int main(void) { puts(""Hi There " +strtime(getsystime())+ """);}"
            ))

        execwait("newcc -run test.c")

    (str-file = one big string; text-file = list of strings, one per line)
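
    (A rough C analogue of the script above, as a sketch only: it assumes
    the compiler binary has already been turned into an `embedded_data`
    array by a data-to-include utility or C23 #embed; the names are made
    up.)

        #include <stdio.h>
        #include <stdlib.h>

        extern const unsigned char embedded_data[];   /* embedded compiler image */
        extern const unsigned long embedded_size;

        int main(void)
        {
            /* Write the embedded compiler back out as an executable */
            FILE *f = fopen("newcc.exe", "wb");
            if (!f) { perror("newcc.exe"); return 1; }
            fwrite(embedded_data, 1, embedded_size, f);
            fclose(f);

            /* Write a small test program and run the new compiler on it */
            f = fopen("test.c", "w");
            if (!f) { perror("test.c"); return 1; }
            fprintf(f, "#include <stdio.h>\nint main(void){puts(\"Hi There\");}\n");
            fclose(f);

            return system("newcc -run test.c");
        }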

    (Like most people, I
    solved this long ago for my own needs with a simple data-to-include-file utility).  Including a file's own source code is not likely to be needed
    by many, but other files are more important.

    Here's a simpler use:

        when help_sw then
            println strinclude("mm_help.txt")

    It is invaluable in producing my one-file self-contained tools.

    Just for comparison, have a look at how the real C language has dealt
    with this, as a similar (but more powerful and useful) feature has been
    added to C23.  Since C is not a one-man toy, discussions and proposals
    are needed, involving proponents of the idea, C standards committee
    members, users, compiler writers, and testers.  It takes many rounds to establish what "everyone" understands, and what "everyone" appreciates -
    as well as avoiding potential problems, security issues and other complications that are not obvious to most people.

    <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2898.htm> <https://thephd.dev/finally-embed-in-c23>

    Yeah, it's complicated when the language is already complicated and
    everything is done by committee /and/ you have billions of lines of
    existing code to support and you have 100s of implementations.

    That's an advantage of creating a small product with a limited user-base.

    This sort of feature needs to be as simple as I've shown above.

    BTW in C, I implemented strinclude in 20 lines like this:


        function readstrinclude:unit p=
            ichar text

            lex()
            checksymbol(lbracksym)
            lex()
            p := readexpression()
            checksymbol(rbracksym)
            lex()

            if p.tag<>j_const or p.mode<>trefchar then
                serror("String const expected")
            fi

            text := readfile(p.svalue)
            if not text then
                serror_s("Can't read strinclude file: #",p.svalue)
            fi

            return createstringconstunit(text,rfsize)
        end

  • From Bart@21:1/5 to Bart on Fri Dec 9 19:16:39 2022
    On 09/12/2022 16:28, Bart wrote:
    On 09/12/2022 11:54, David Brown wrote:


    [including files as text or binary data]

    My scripting language uses `strinclude` for both. The program below
    writes a C file then compiles it, supplying its own C compiler!

        const compiler = strinclude("c:/m/bcc.exe")

    While this script program still works (as does the equivalent using `bininclude` in the other language), it was a feature I used when
    scripting code was precompiled to binary bytecode. Then the binary data
    was part of the bytecode file.

    But now scripts are run from source, and production or distributed
    programs are made into one-file amalgamations. These amalgamations are
    still text files, and while they will include such support files, I
    haven't yet solved the problem of representing binary data in the text file.

    Amalgamations will usually incorporate text-files as-is, but here a
    binary file would need to be transformed. It's not a hard problem, I
    just haven't done it yet.
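
    (One conventional way of doing that, sketched in C rather than in the
    scripting language: encode the binary bytes as hex text so they can sit
    inside a text amalgamation, and decode them again when unpacking. The
    function names are invented for the example.)

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        /* Binary -> printable hex string (2 characters per byte). */
        static char *bin_to_hex(const unsigned char *data, size_t n)
        {
            char *s = malloc(2 * n + 1);
            for (size_t i = 0; i < n; i++)
                sprintf(s + 2 * i, "%02x", data[i]);
            return s;
        }

        /* Hex string -> binary, the inverse transformation. */
        static unsigned char *hex_to_bin(const char *s, size_t *out_n)
        {
            size_t n = strlen(s) / 2;
            unsigned char *data = malloc(n);
            for (size_t i = 0; i < n; i++) {
                unsigned v;
                sscanf(s + 2 * i, "%2x", &v);
                data[i] = (unsigned char)v;
            }
            *out_n = n;
            return data;
        }

        int main(void)
        {
            const unsigned char sample[] = { 0x7f, 'E', 'L', 'F', 0, 0xff };
            char *hex = bin_to_hex(sample, sizeof sample);
            printf("as text: %s\n", hex);      /* safe to store in a text file */

            size_t n;
            unsigned char *bin = hex_to_bin(hex, &n);
            printf("round trip ok: %d\n",
                   n == sizeof sample && memcmp(bin, sample, n) == 0);

            free(hex);
            free(bin);
            return 0;
        }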


    <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2898.htm>
    <https://thephd.dev/finally-embed-in-c23>

    I'd imagine the C23 and C++ versions will have the same problem when
    creating source distributions that someone else will build, but they
    probably don't care about bundling binary files as well.

    The point however is to encapsulate the binary data; just zipping
    everything doesn't cut it! You don't want the user or builder to see the discrete binaries.

  • From Bart@21:1/5 to David Brown on Sat Dec 10 17:23:38 2022
    On 09/12/2022 11:54, David Brown wrote:
    On 09/12/2022 01:52, Bart wrote:

    higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII,
    metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.


    Those are the sorts of features that are implemented more diversely
    across different languages than simple ones. They will be harder to
    translate from one language to another. I favour more conservative and
    more universal features.


    Again, you favour things that seem more conservative and universal to
    /you/.

    Well, they were universal and obvious 40+ years ago. Nearly two
    generations on, everyone is into more advanced features (most likely
    due to being foisted on them in university courses).

    Then it may be necessary to look at what would be obvious and intuitive
    to anyone, or can be described in everyday terms like shelves of books, numbered drawers and so on. (Except that soon not many will know what a
    book looks like.)

    But people with different programming backgrounds and habits could have completely different ideas of what is "conservative and universal".  For many languages, type inference and generics is so integral that it would
    seem very strange to declare variables of a given type - they may not
    even see variables as a meaningful concept.

    Give it another generation, someone will reintroduce the idea of mutable variables as a brilliant new concept.

    However I think you're wrong: there are a set of basic features that,
    for the time being, most will understand or can easily guess, often used in pseudo-code.

    If I look at Intel processor manuals for example, the behaviour of
    instructions is described in a pseudo-code language that looks a lot like
    Algol. It has IF statements, explicit FOR-loops and mutable variables.

    Why not use functional style concepts and syntax since that is so great?

    Here's another bit of syntax they use:

    DEST[31:0] := SRC1[31:0] + SRC2[31:0]

    This clearly refers to bits within those variables. While not common in
    languages, how hard is it to guess what this does? Once you understand
    what it does, how hard is it to emulate with existing bitwise
    operations?

    Funnily enough I've used pretty much the same syntax for 30 years:

    dest.[31..0] := src1.[31..0] + src2.[31..0]
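
    (To answer the "how hard is it to emulate" question concretely, here is
    a hedged C sketch using shifts and masks; the set_bits helper is
    invented for the example.)

        #include <stdint.h>
        #include <stdio.h>

        /* Replace bits hi..lo of dest with the low bits of value,
           leaving the other bits untouched. */
        static uint64_t set_bits(uint64_t dest, int hi, int lo, uint64_t value)
        {
            int width = hi - lo + 1;
            uint64_t mask = (width >= 64) ? ~0ull : ((1ull << width) - 1);
            return (dest & ~(mask << lo)) | ((value & mask) << lo);
        }

        int main(void)
        {
            uint64_t src1 = 0x00000001ffffffffull;
            uint64_t src2 = 0x0000000000000002ull;
            uint64_t dest = 0xdeadbeef00000000ull;

            /* DEST[31:0] := SRC1[31:0] + SRC2[31:0] */
            dest = set_bits(dest, 31, 0,
                            (src1 & 0xffffffffull) + (src2 & 0xffffffffull));

            printf("%016llx\n", (unsigned long long)dest);  /* deadbeef00000001 */
            return 0;
        }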

    Look at, say, the Wikipedia article on insertion sort, and it will have
    pseudo-code for it that looks like this:

        i ← 1
        while i < length(A)
            j ← i
            while j > 0 and A[j-1] > A[j]
                swap A[j] and A[j-1]
                j ← j - 1
            end while
            i ← i + 1
        end while

    Using ":=" and adding "do", this would be valid syntax in either of my languages (the swap needs rewriting as `swap(A[j], A[j-1])`, and
    `length(A)` as `A.len`).
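
    (For comparison, the same pseudo-code transcribed into C; a direct,
    unoptimised sketch, using 0-based indexing as C requires.)

        #include <stdio.h>

        static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

        /* Insertion sort, line for line from the pseudo-code above. */
        static void insertion_sort(int A[], int length)
        {
            int i = 1;
            while (i < length) {
                int j = i;
                while (j > 0 && A[j-1] > A[j]) {
                    swap(&A[j], &A[j-1]);
                    j = j - 1;
                }
                i = i + 1;
            }
        }

        int main(void)
        {
            int A[] = { 5, 2, 4, 6, 1, 3 };
            insertion_sort(A, 6);
            for (int i = 0; i < 6; i++) printf("%d ", A[i]);
            printf("\n");
            return 0;
        }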

    So, how about that; my syntax, which I claim to be simple and universal,
    is clear enough to be used as pseudo-code.


    These are some characteristics and fundamentals that I like, that are in
    danger of becoming extinct; they already are in many languages:

    * Case-insensitive
    * 1-based counting (if there is even anything to count!)
    * Mutable variables, even the concept of 'state'
    * Explicit arrays, and explicit indexing of arrays
    * Iteration over A to B using explicit loops
    * While loops
    * Goto
    * Built-in read/print /statements/ (or just having any i/o at all)
    * Ordinary functions, you know, the ones you just declare rather than
    just being some named lambda expression.

    These include many constructs popular in pseudo-code. So what do the
    developers of advanced languages actually have against clear code?

    I get the impression that many can't take a language seriously if it
    uses a syntax that just anyone can understand.


    Of course you put things in your language that /you/ find appropriate -
    you made it for yourself, and if some Prolog user or Sketch expert wants
    to use it, they'll have to learn how it works.  All I am asking you to
    do is understand that your ideas, your preferences, your choices are
    formed from your experiences and background - they are not "universal"
    and do not necessarily apply to others.

    Again, I think they are: more obvious, more intuitive, simpler, suitable
    for use as pseudo-code, and far easier to port to an arbitrary language.
    At least, an arbitrary language that hasn't completely done away with
    the basics.

  • From David Brown@21:1/5 to Bart on Sun Dec 11 17:24:28 2022
    On 09/12/2022 17:28, Bart wrote:
    On 09/12/2022 11:54, David Brown wrote:
    On 09/12/2022 01:52, Bart wrote:

    (Replying about 'strinclude' only)

    Here's an example using a minor extension to my C compiler:

         void print_this_module(void) {
             puts(strinclude(__FILE__));
         }

    Put this in any module (or in more than one, or make it a macro), and
    when called it will print the sourcecode of the module, which is
    embedded within the executable. There are a dozen uses other than the
    novelty one above.


    And your basis for thinking /everyone/ understands this is ... what?
    I guess it looks clear enough for simple cases, but what about more
    complicated circumstances?  What if the code is not contained in a
    file, but part of an on-the-fly compilation, or with source code from
    a pipe?   What if the file is a temporary one generated by
    transcompilation from a higher level language?  Does it work properly
    with respect to the #line pre-processor directive?  How does it work
    regarding other #include's in the file?  How are macros and other
    pre-processing handled in the file?  What about someone trying to
    include an "infinite" file like "/dev/random"?  What about a file that
    contained characters that are not acceptable by the compiler?

    There are /lots/ of questions.

    You can say the same about #include in C. The search algorithm for
    include files is actually complex, and is not set by the language: it is implementation defined.

    Certainly there are implementation-defined aspects about "#include". An implementation does not even have to use normal files here - I know of
    at least one embedded toolchain where the standard library includes are
    handled directly within the toolchain, and do not exist as separate files.

    But I think there is a significant difference between header files,
    which are clearly part of program code, and embedded data files.


    My strinclude is simpler. The concept itself is simple: incorporate a
    TEXT file as a string literal rather than C code, a task that many will
    have wanted to do at some point.


    Yes, it is simpler - that's the point. You can make a simple feature
    for your simple tool that does all you need for your simple
    requirements. That is absolutely fine for these cases - there is never
    any point in over-complicating things. Make things as simple as
    possible, but no simpler. However, for serious and mainstream
    languages, far more is involved.

    This was a quick proof of concept for C. The search for non-absolute file-specs is relative to the current directory. In the original in my language, it is relative to the location of the lead module (to allow a program to be built remotely from anywhere in the file system).

    For C I won't bother upgrading it.

    Of course, people can enter all sorts of things as the file path which
    are likely to cause problems. Just like they can for #include. And just
    like they can, with languages that have compile-time evaluation and the ability to duplicate strings, when they write "A"*1000000000.

    1GB is small enough to actually work, big enough to cause all sorts of problems, like a 1GB executable.


    I think it is fair to say that a number of programmers would
    appreciate a way to include data files embedded in their programs.
    Mostly you want the files to be treated as data (initialising an
    array), rather than strings, but sometimes strings are useful too.

    My systems language has `bininclude` for binary files, but it's a poor implementation (a 1MB file will generate 1M AST nodes, one for each
    byte, some 64MB in all, but `strinclude` for a 1MB file needs only a 1MB string).

    My scripting language uses `strinclude` for both. The program below
    writes a C file then compiles it, supplying its own C compiler!

        const compiler = strinclude("c:/m/bcc.exe")

        if not checkfile("newcc.exe") then
            writestrfile("newcc.exe",compiler)
        fi

        writetextfile("test.c",
            ("#include <stdio.h>",
             "int main(void) { puts(""Hi There " +strtime(getsystime())+ """);}"
            ))

        execwait("newcc -run test.c")

    (str-file = one big string; text-file = list of strings, one per line)

    (Like most people, I solved this long ago for my own needs with a
    simple data-to-include-file utility).  Including a file's own source
    code is not likely to be needed by many, but other files are more
    important.

    Here's a simpler use:

        when help_sw then
            println strinclude("mm_help.txt")

    It is invaluable in producing my one-file self-contained tools.

    Just for comparison, have a look at how the real C language has dealt
    with this, as a similar (but more powerful and useful) feature has
    been added to C23.  Since C is not a one-man toy, discussions and
    proposals are needed, involving proponents of the idea, C standards
    committee members, users, compiler writers, and testers.  It takes
    many rounds to establish what "everyone" understands, and what
    "everyone" appreciates - as well as avoiding potential problems,
    security issues and other complications that are not obvious to most
    people.

    <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2898.htm>
    <https://thephd.dev/finally-embed-in-c23>

    Yeah, it's complicated when the language is already complicated and everything is done by committee /and/ you have billions of lines of
    existing code to support and you have 100s of implementations.

    That's an advantage of creating a small product with a limited user-base.


    Yes. You only need to consider the uses and interests of one person.
    On the other hand, you only get the features created by one person.

    This sort of feature needs to be as simple as I've shown above.

    BTW in C, I implemented strinclude in 20 lines like this:


        function readstrinclude:unit p=
            ichar text

            lex()
            checksymbol(lbracksym)
            lex()
            p := readexpression()
            checksymbol(rbracksym)
            lex()

            if p.tag<>j_const or p.mode<>trefchar then
                serror("String const expected")
            fi

            text := readfile(p.svalue)
            if not text then
                serror_s("Can't read strinclude file: #",p.svalue)
            fi

            return createstringconstunit(text,rfsize)
        end


  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sun Dec 11 16:50:51 2022
    Bart <bc@freeuk.com> wrote:
    On 08/12/2022 00:30, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    Other languages may use command parameters, @ files, makefiles, or
    untidy collections of 'import <module>` at the top of every module, that >> need constant maintenance.

    Hmm, if module A used function f from module B and I want to change
    A to use f from C, then I need to change import statement.

    If it happens that A imports both B and C, each of which export function
    F, then you would need to write B.F() or C.F() from A anyway, to disambiguate.

    If a call would otherwise be ambiguous, then of course there is a need
    to disambiguate. Above it would be F()$B and F()$C (the dollar sign
    means that what follows is the module name). However, an important use
    case is when you have a large program and try to replace B by C. And
    you do this in an incremental way, module by module. At any time a
    given module imports either B (before conversion) or C (after),
    but not both.

    There is also an extra aspect, irrelevant for you, but important
    for me: my language has overloading and functions from different
    modules may differ in the types they return or in argument types.
    So overloading may decide which one to use.

    It does
    not matter if the import statement is in A or in a central file. OTOH
    if imports stay stable then the 'import' line does not change, so no
    need for extra maintenance.

    Since I started using my 2022 module scheme, I very rarely have to look
    at the module list. Mainly it's changed in order to create a different configuration of a program.

    Actually it has dramatically simplified that kind of maintenance.

    My scheme also allows circular and mutual imports with no restrictions.

    You probably mean "with no artificial restrictions". There is a
    fundamental restriction that everything should resolve in a finite
    number of steps.

    I would link to some docs but I can sense you're not going to be
    receptive. That's fine; you're used to your way of doing things.

    OTOH in langage I use imports are
    scoped. Routine rB in A can import B and use f from B, routine
    rC in A can import C and use f from C. I do not think it would
    work with your import table.

    You mean a function privately imports a module? That sounds crazy - and chaotic. Suppose 100 different functions all import different subsets of
    20 different modules (is that Python by any chance?)

    Apparently you only think of ways of writing bad code. There are
    many ways of writing bad code and it is not hard even in languages
    with a strong "nanny" attitude. Private import means that you can
    import what is needed in a small scope; the normal case is a few
    critical functions having an external dependence and the rest of the
    module not needing it. An alternative could be to create some small
    intermediate modules, but normally there is a strong logical connection
    between the functions in a module and intermediate modules would be
    artificial and clumsy. Private import allows you to keep a natural
    decomposition into modules and localize dependencies.

    There needs to be some structure, some organisation.

    Exactly, private import is a tool for better organisation.

    And I can compile each module
    separately.

    I can compile each program separately. Granularity moves from module to program. That's better. Otherwise why not compile individual functions?

    In the past there was support for compiling individual functions, and
    I would not exclude possibility that it will come back. But ATM
    I prefer to keep things simple, so this functionality was removed.

    In fact, this is the usual development mode: I stop
    the program, modify the module, compile, load and restart the
    program. The point is that all program data and debugging setup
    is preserved.

    You mean change one module of a running program? OK, that's not
    really a feature of a language, I think; more of external tools,
    especially IDEs and debuggers.

    Yes, that is a feature of the implementation (no IDE). But language
    features also come into play.

    I would have no idea how to do that in C. But I used to do it at the application level where most functionality was implemented via
    hot-loaded scripting modules. Then development was done from /within/
    the running application.


    Anyway, I have benefit from independent compilation
    and import lines are not a problem.

    I used to use independent compilation myself. I moved on to
    whole-program compilation because it was better. But all the issues
    involved with interfacing at the boundaries between modules don't
    completely disappear, they move to the boundaries between programs
    instead; that is, between libraries.

    I consider programs to be different from libraries. A program may
    use several libraries; in the degenerate case a library may be just a
    single module. Compiling a whole program has clear troubles with
    large programs. If you limit the size of libraries they may be a
    reasonable compromise. ATM in my case inter-module optimizations
    are limited and do not benefit from larger scale. I have
    about 1000 modules that do not naturally decompose into libraries.
    And in principle there could be tens of thousands of modules; this
    is basically limited by the development effort needed to write them.

    I have one global compilation step, basically to resolve
    dependencies between modules. But it is needed only for
    initial build (or if module structure changed quite a lot),
    after build typically info about dependencies can be updated
    in incremental way.

    And, if you ask, I have a makefile and it contains a list of all
    modules, so when I add a new module I must add it to the
    makefile. In principle the list of modules could be generated
    automatically ("take all modules in files in the current directory"),
    but I prefer the current way.

    Well, my language doesn't use or need makefiles. While I got rid of
    linkers as being a waste of time, I never got into makefiles at all.

    There are however project files used by my mini-IDE, which contain
    additional info needed for listing, browsing, and editing the source
    files, and for running the programs with test inputs.

    But if /you/ wanted to build one of my projects from source, I would
    need to provide exactly two files:

    mm.exe the compiler
    app.ma the source code

    The latter being an automatically created amalgamation (produced with
    `mm -ma app`). Build using `mm app`.

    I could provide a single file, a shell archive containing a build
    script and sources, but an important part of providing sources is that
    people can read them, understand and modify them. GNU folks have a
    nice definition of source: "preferred form for making modifications".
    I would guess that 'app.ma' is _not_ your preferred form for making
    modifications, so it is not really true source. And to build from
    "source" I need the source first. And I provide _true_ sources to my
    users.

    If you were on Linux or didn't want to use my compiler, then it's even simpler; I would provide exactly one file:

    app.c Generated C source code (via `mc -c app`)

    Here, you need a C compiler only. On Windows, you can build it using
    `gcc app.c -oapp.exe`. Linux needs more options, listed at the top of
    the file.

    Sorry, a generated file is _not_ a source. If I were to modify the C
    file that you provide I would have trouble incorporating fixes from
    your later versions. And likely, you would have trouble incorporating
    my changes into your real sources.

    I haven't yet figured out a way of providing zero files if you wanted to build my apps from source.

    But I guess none of this cuts any ice. After all, there are innumerable
    ways of taking the most fantastically complex build process, and
    wrapping up in one command or one file (docker etc).

    The difference is that what I provide is genuinely simple: one bare
    compiler, one actual source file.

    Sorry, for me "one file" is not a problem, there is 'tar' (de facto
    standard for distributing sorce code) and with it you download a
    single file and recreate any needed directory structure. Real
    issue is managing dependencies. In general, I make real effort
    to minimize dependencies. I am not saying there are none, but
    dependencies that I have are hard to avoid and IMO resonably natural.
    Most other project have quite a lot of dependencies. Your project
    _may_ have advantage of small number of dependencies. But since
    you show only generated files or parts of source nobody can tell.
    And there is possible quite large dependency, namley Windows.
    It is clear how much of your code _usefully_ runs in now-Window
    environment.

    Some languages with modules have a
    "transitive closure" feature: when you give a module to the compiler
    (or maybe an extra tool) it will recompile all modules it needs.
    But you still need a list of "entry modules", that is modules
    that should be compiled and are usable when the user explicitly
    requests them by name, but otherwise would be unused.

    My stuff is simpler, believe me. When you have 1Mlps compilation speed,
    most of this build stuff can go out the window.

    Oops, I got distracted and did not finish this part. I mean, in a
    well-written C program there will be an implicit module structure,
    essentially ".c file = module".

    I've seen lots of C source, it's rarely that tidy.

    If it was, I'd be able to use an unusual feature of my C compiler
    where it can automatically discover the necessary modules by tracking
    the header files (`bcc -auto main.c`).

    Well, taking header files as the definition of modules clearly has
    problems. IME .c files usually give a reasonable module structure.

    Exported functions will be
    declared in header files. One school says that there should be a
    .h file for each C file, this .h file containing the "module" export
    info. A less rigorous school allows a single .h file (or some small
    number) declaring the exported functions. Even if a program slightly
    deviates from such patterns, usually it is not much work to separate
    the declarations of exported functions and move them into .h files
    corresponding to the .c files. I believe that gcc with appropriate
    options will tell you if there is any function which gets
    exported (is non-static) but which lacks a declaration in a
    header file, so you can correct the program to make sure that
    everything which needs to be exported is declared in header files
    and the rest is static. After that the C structure will map
    1 to 1 to any reasonable module system.
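
    (A sketch of the pattern being described, with invented 'stack' names:
    one .h per .c, every non-static function declared in the header, and
    gcc's -Wmissing-prototypes used to flag external functions that lack
    such a declaration.)

        /* stack.h - the "module interface" */
        #ifndef STACK_H
        #define STACK_H
        void stack_push(int value);
        int  stack_pop(void);
        #endif

        /* stack.c - the "module body".  Compile with
               gcc -c -Wall -Wmissing-prototypes stack.c
           to be warned about any non-static function missing from a header. */
        #include "stack.h"

        static int items[100];   /* static: private to this "module" */
        static int top;

        void stack_push(int value) { items[top++] = value; }
        int  stack_pop(void)       { return items[--top]; }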

    C is just so primitive when it comes to this stuff. I'm sure it largely
    works by luck.

    C is at a low level, that is clear. But programs are written by
    programmers and good programs require work. A good programming
    environment should help. C as a language is not helpful; one
    may have a fully compliant and rather unhelpful compiler. But
    real C compilers tend to be as helpful as they can within the
    limits of the C language. While C still limits what they can do,
    there is quite a lot of difference between a current popular
    compiler and a bare-bones legal compiler. And there are extra
    tools, and here C support is hard to beat.

    David Brown may explain more of what he meant. But I think that my
    view is similar to his: there is almost nothing new in your
    language. You are moving in a circle of ideas that were mainstream
    in the sixties,

    So what are the new circles of ideas? All that crap in Rust that makes
    coding a nightmare, and makes building programs dead slow? All those new functional languages with esoteric type systems? 6GB IDEs (that used to
    take a minute and a half to load on my old PC)? No thanks.

    The borrow checker in Rust looks like a good idea. There is a good
    chance that the _idea_ will be adopted by several languages in the
    near future. Not so new ideas are:
    - removing limitations, that is making sure language constructs
      work as generally as possible (that allows getting rid of many
      special constructs from older languages)
    - nominal, problem-dependent types. That is, types should reflect
      the problem domain. In particular, domains which need types like
      'u32' are somewhat specific; in normal domains the fundamental
      types are different
    - functions as values/parameters. In particular functions have
      types, and can be members of data structures
    - "full rights" for user-defined types. Which means whatever
      syntax/special constructs work on built-in types should
      also work for user-defined types
    - function overloading
    - type reconstruction
    - garbage collection
    - exception handling
    - classes/objects


    I keep the languages I maintain accessible, and the ideas simple.

    Any genuinely good ideas I would already have stolen if I liked them and
    they were practical to implement.

    But the trend now is to make even features like modules as complicated
    and comprehensive as possible. (For example, there may not be 1:1 correspondence between file and module: multiple modules per file;
    nested modules; modules split across multiple files. I keep it simple.)

    mostly resolved in the seventies and considered old hat now.


    I think you're not looking at the right features of a language. There
    are some characteristics popular in the 70s that I like and maintain:

    * Clean, uncluttered brace-free syntax
    Does this count as brace-free?

    for i in 1..10 repeat (print i; s := s + i)

    * Case-insensitive
    * 1-based
    * Line-oriented (no semicolons)
    * Print/read as statements

    A lot of folks consider the above misfeatures/bugs. Concerning
    'line-oriented' and 'intuitive', can you guess which changes to the
    following statement in 'line-oriented' syntax are legal and preserve
    its meaning?

    nm = x or nm = 'log or nm = 'exp or nm = '%power or
    nm = 'nthRoot or
    nm = 'cosh or nm = 'coth or nm = 'sinh or nm = 'tanh or
    nm = 'sech or nm = 'csch or
    nm = 'acosh or nm = 'acoth or nm = 'asinh or nm = 'atanh or
    nm = 'asech or nm = 'acsch or
    nm = 'Ei or nm = 'erf or nm = 'erfi or nm = 'li or
    nm = 'Gamma or nm = 'digamma or nm = 'dilog or
    nm = '%root_sum =>
    "iterate"

    As a hint let me say that '=' is comparison. And this is a single
    statement; small changes to whitespace will change the parse and lead
    to wrong code or a syntax/type error. BTW: you need a real newsreader
    to see it; Google and the like will change it so it no longer works.

    C is still tremendously popular for many reasons. But anyone wanting to
    code today in such a language will be out of luck if they preferred any
    or all of these characteristics. This is why I find coding in my
    language such a pleasure.

    Then, if we are comparing the C language with mine, I offer:

    * Out of order definitions

    That is considered a misfeature in modern times. In modern languages
    a definition may generate some code to be run and the order in which
    this code is run matters.

    * One-time definitions (no headers, interfaces etc)
    * Expression-based

    C is mostly expression-based. There are languages that go further
    than C, for example:

    a := (s = 1; for i in 1..10 repeat (s := s + i); s)

    is legal in the language that I use, but cannot be directly translated
    to C. However, from the examples that you gave it looks like your
    language is _less_ expression-based than C.
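
    (As an aside, the nearest C itself gets to that example is probably the
    GCC/Clang statement-expression extension, shown here as a sketch; it is
    not standard C.)

        #include <stdio.h>

        int main(void)
        {
            /* A ({ ... }) block is an expression whose value is its
               last statement - a compiler extension, not ISO C. */
            int a = ({ int s = 1;
                       for (int i = 1; i <= 10; i++) s += i;
                       s; });
            printf("%d\n", a);   /* 56 */
            return 0;
        }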

    * Program-wide rather than module-wide compilation unit

    AFAICS C leaves the choice to the implementer. If your language _only_
    supports whole-program compilation, then this would be a
    negative feature.

    * Build direct to executable; no object files or linkers

    That is really a question of implementation. Building _only_
    to an executable is a misfeature (what about the case when I want
    to use a few routines in your language, but the rest including
    the main program is in a different language?).

    * Blazing fast compilation speed, can run direct from source

    Again, that is implementation (some language features may slow
    down compilation; as you know, C allows fast compilation).

    * Module scheme with tidy 'one-time' declaration of each module
    * Function reflection (access all functions within the program)
    * 64-bit default data types (ie. 'int' is 64 bits, 123 is 64 bits)
    * No build system needed

    That really depends on the needs of your program. Some are complex
    and need a build system, some are simple and in principle could
    be compiled with "no" build system. I still use Makefiles for
    simple programs for two reasons:
    - typing 'make' is almost as easy as it can get
    - I want to have a record of the compiler used / compiler options /
      libraries

    * Create and directly compile One-file amalgamations of projects
    * String and file embedding you know about
    * One-file self-contained implementations

    I haven't listed the literally one hundred small enhancements that make
    the coding experience even in a lower-level system language with
    primitive types so comfortable.

    Few are remarkable or unique, but it's putting it all together in one
    tidy package that counts.


    And for starters, it looks like all contenders
    offer modules, with better features than yours.

    Sure, you have C++, Zig, Rust, Java, C#, Dart, ....

    All with very advanced and complicated features. They also have big implementations and take ages to build projects.

    (When I need more advanced features, I switch to my scripting language.
    Which also, by design, shares the same syntax and most of the same characteristics and features.)

    I think, like David Brown, you just don't get it.

    Try and think of what I do as creating delicious meals in my kitchen
    from my own recipes. Nothing new here either. But both you and DB
    probably work (figuratively) in the food industry or run chains of restaurants, or work in a food laboratory.

    Or are used to buying ready-meals from supermarkets.

    Meals are a different thing than programming languages. If you want
    to say that _you_ enjoy your language(s), then I get this. My point
    was that you are trying to present your _subjective_ preferences
    as something universal. I like programming and an important part
    is that my programs work. So I like features that help me to get a
    working program and dislike ones that cause trouble. IME, the
    following cause trouble:
    - case insensitivity
    - dependence on poorly specified defaults
    - out-of-order definitions
    - inflexible tools that for example insist on creating an executable
      without the option of making a linkable object file

    Concerning 1-based indexing, IME in more cases it causes trouble
    than it helps, but usually this is a minor issue. I use
    line-oriented syntax a lot. I can say that it works; simple examples
    are easy, but there are unintuitive situations and sometimes trouble.
    For example cut and paste works better for traditional syntax.
    If there are problems, then beginners may be confused. As one
    guy put it: the trouble with white space is that one cannot see it.

    I am comfortable with braces and if I were to design a new language
    there is a good chance that I would use them. And the same with
    semicolons. One could go for a "minimal" syntax, using only
    parentheses and commas, but for reading variety is helpful,
    so it is better to also have braces and semicolons.

    I do my programming mostly without any IDE. Frequently I use
    simple Makefiles, but also I do development with "no" build
    system, basically interactively defining new functions, and
    collecting the working parts in a file. Or a semi-classic
    edit-compile-test cycle where I have an editor open in one terminal
    window, and I give compile commands by hand.

    One thing that I learned many years ago is: what I find easy
    is not necessarily easy for other folks. What I like is not
    necessarily what other folks like. I think that my methods
    of working and languages I use are quite effective, but
    I do not advocate them as something good for all. In fact,
    it seems that the majority goes in quite a different direction.

    --
    Waldek Hebisch

  • From David Brown@21:1/5 to Bart on Sun Dec 11 20:46:13 2022
    On 10/12/2022 18:23, Bart wrote:
    On 09/12/2022 11:54, David Brown wrote:
    On 09/12/2022 01:52, Bart wrote:

    higher order functions, objects,
    overloading, generics/templates, events, actors, coroutines, RAII,
    metaprogramming, type inference, native multi-precision arithmetic,
    parallel blocks, closures, sandboxing.


    Those are the sorts of features that are implemented more diversely
    across different languages than simple ones. They will be harder to
    translate from one language to another. I favour more conservative
    and more universal features.


    Again, you favour things that seem more conservative and universal to
    /you/.

    Well, they were universal and obvious 40+ years ago. Nearly two
    generations on, everyone is into more advanced features (most likely
    due to being foisted on them in university courses).

    Then it may be necessary to look at what would be obvious and intuitive
    to anyone, or can be described in everyday terms like shelves of books, numbered drawers and so on. (Except that soon not many will know what a
    book looks like.)

    But people with different programming backgrounds and habits could
    have completely different ideas of what is "conservative and
    universal".  For many languages, type inference and generics is so
    integral that it would seem very strange to declare variables of a
    given type - they may not even see variables as a meaningful concept.

    Give it another generation, someone will reintroduce the idea of mutable variables as a brilliant new concept.

    However I think you're wrong: there are a set of basic features that,
    for the time being, most will understand or can easily guess, often used in pseudo-code.

    If I look at Intel processor manuals for example, the behaviour of
    instructions is described in a pseudo-code language that looks a lot
    like Algol. It has IF statements, explicit FOR-loops and mutable
    variables.


    This is exactly what I said - you /think/ certain concepts are
    "universal" or "fundamental" because of your background, and that
    includes heavy use of assembly coding. von Neumann computer
    architecture has turned out to be very successful, and imperative
    procedural programming is a good fit for such systems at the low level.
    Not all processors fit that model, however, and there are niche devices
    that are very different. (Internally, even mainstream devices like x86 processors deviate substantially from strict von Neumann models.)

    That does not mean, however, that such languages are ideal for
    programming. Programming is the art of taking a task from the problem
    domain and ending up with something in the implementation domain. For
    most processors, a C-like language (which includes your compiled
    language - imperative and procedural) is close to the implementation
    domain. It is, however, typically very far from the problem domain. A
    high level language such as Python is going to be closer to the problem
    domain. Compilers, interpreters, libraries, etc., automate the process
    of moving from the chosen programming language down to the actual implementation.

    The closer your language choice is to the problem domain, the shorter
    the program, the easier it is to see (or prove) that it is correct, and
    the shorter development time. The closer your language choice is to the implementation domain, the fewer inefficiencies are typically introduced
    in the automated part of the process.

    Different kinds of task in the problem domain are best described in
    different kinds of language. Different situations call for different selections of level of language and trade-offs. But common to all
    programming is this progression from the problem to the implemented
    solution.

    If you view just one part of this process, it's easy to imagine that
    the concepts found at that point are especially important, or
    fundamental or universal. But they are not - they are simply common in
    that area of the whole art form of software development. A test
    engineer might say
    programming revolves around test-cases and user models. An assembly
    programmer might say it is all about memory addresses, registers, ALU instructions and branches. A C programmer might think variables and
    functions are the key concepts that every programmer understands. For
    Forth programmers, the stack is the centre of the universe. And they
    are all /wrong/, because they are only looking at a small part of a big picture, and within their small part, they are all /right/.


    So, how about that; my syntax, which I claim to be simple and universal,
    is clear enough to be used as pseudo-code.


    It is fine to be used as pseudo-code by people who understand it. It is useless for people who don't understand it. The same goes for
    functional programming:

    factorial 0 = 1
    factorial n = n * factorial (n - 1)

    Is that hard for you to understand? Of course not. It is written in a
    very different way than you would write the code in your own language -
    there are no types, no loops, no variables, no conditionals. It works
    by pattern matching, recursion and type inference.
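
    (For contrast, a sketch of the same function in plain C: the recursion
    carries over directly, while the pattern-matched base case becomes an
    explicit conditional.)

        #include <stdio.h>

        static unsigned long long factorial(unsigned n)
        {
            if (n == 0)          /* base case, written as a conditional   */
                return 1;        /* rather than a separate pattern clause */
            return n * factorial(n - 1);
        }

        int main(void)
        {
            printf("%llu\n", factorial(10));   /* 3628800 */
            return 0;
        }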

    It turns out that simple code can be simple to understand, even if it is written in a very different kind of language or a different style. As
    for complex code - well, it depends entirely on your experience and
    familiarity with the kind of language (as well as with the problem or
    operation being described).


    These are some characteristics and fundamentals that I like, that are in danger of becoming extinct; they already are in many languages:

    These are not "in danger of becoming extinct" - though they may be going
    out of fashion in some cases.


    * Case-insensitive
    * 1-based counting (if there is even anything to count!)
    * Mutable variables, even the concept of 'state'
    * Explicit arrays, and explicit indexing of arrays
    * Iteration over A to B using explicit loops
    * While loops
    * Goto
    * Built-in read/print /statements/ (or just having any i/o at all)
    * Ordinary functions, you know, the ones you just declare rather than
    just being some named lambda expression.


    Who cares what /you/ like, except for /you/ ? You make your language
    the way you want it. It does not mean that the features you like are
    somehow "better", "universal", "fundamental", "understandable", or
    anything else. They are just the things you like. (And if you think
    any of these are "in danger of becoming extinct", you must be living on
    a different planet. Far from everyone will agree with you on any or all
    of these points, but that is not remotely the same as saying they are
    going extinct.)

    These include many constructs popular in pseudo-code. So what do the developers of advanced languages actually have against clear code?

    I get the impression that many can't take a language seriously if it
    uses a syntax that just anyone can understand.


    Of course you put things in your language that /you/ find appropriate
    - you made it for yourself, and if some Prolog user or Sketch expert
    wants to use it, they'll have to learn how it works.  All I am asking
    you to do is understand that your ideas, your preferences, your
    choices are formed from your experiences and background - they are not
    "universal" and do not necessarily apply to others.

    Again, I think they are: more obvious, more intuitive, simpler, suitable
    for use as pseudo-code, and far easier to port to an arbitrary language.
    At least, an arbitrary language that hasn't completely done away with
    the basics.


    Again, you are wrong. I think it is a shame that you are sometimes so
    insular that you can't even see that your viewpoint is limited.
    Personal preferences are fine - assuming they apply to everyone is not.

  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Mon Dec 12 00:16:34 2022
    On 11/12/2022 16:50, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:


    (I don't think you've made it clear whether the other language(s) you've referred to are some mainstream ones, or one(s) that you have devised. I
    will assume the latter and tone down my remarks. A little.)

    My scheme also allows circular and mutual imports with no restrictions.

    You probably mean "with no artifical restrictions". There is
    fundamental restriction that everything should resolve in finite
    number of steps.

    Huh? That doesn't come up. Anything which is recursively defined (eg.
    const a=b, b=a) is detected, but that is due to out-of-order
    declarations, not circular modules.

    There needs to be some structure, some organisation.

    Exactly, private import is tool for better organisation.

    Sorry, all I can see is extra work; it was already a hassle having to
    write `import B` on top of every module that used B, when it was visible
    to all functions, because of having to manage that list of imports /per module/. Now I have to do that micro-managing /per function/?

    (Presumably this language has block scopes: can this import also be
    private to a nested block within each function?)

    With module-wide imports, it is easy to draw a diagram of interdependent modules; with function-wide imports, that is not so easy, which is why I
    think there will be less structure.

    I can compile each program separately. Granularity moves from module to
    program. That's better. Otherwise why not compile individual functions?

    In the past there was support for compiling individual functions, and
    I would not exclude possibility that it will come back. But ATM
    I prefer to keep things simple, so this functionality was removed.

    With whole program compilation, especially used in the context of a
    resident tool that incorporates a compiler, with resident symbol tables, resident source and running code in-memory (actually, just like my very
    first compilers), lots of possibilities open up, of which I've hardly
    scratched the surface.

    Including compiling/recompiling a function at a time.

    Although part-recompiling during a pause in a running program, then
    resuming, would still be very tricky. That would need debugging
    features, and I would consider, in that case, running via an interpreter.

    My point: a system that does all this would need all the relevant bits
    in memory, and may involve all sorts of complex components. But a
    whole-program compiler that runs apps in-memory already does half the work.

    I used to use independent compilation myself. I moved on to
    whole-program compilation because it was better. But all the issues
    involved with interfacing at the boundaries between modules don't
    completely disappear, they move to the boundaries between programs
    instead; that is, between libraries.

    I consider programs to be different from libraries. Program may
    use several libraries, in degenerate case library may be just a
    single module. Compiling whole program has clear troubles with
    large programs.

    My definition of 'program', on Windows, is a single EXE or DLL file.

    I expect larger applications to consist of a collection of EXE and DLL
    files. My own will have one EXE and zero or more DLLs, but I would also
    make extensive use of scripting modules, that have different rules.


    The latter being an automatically created amalgamation (produced with
    `mm -ma app`). Build using `mm app`.

    I could provide a single file, a shell archive containing the build script
    and sources, but an important part of providing sources is that people
    can read, understand and modify them.

    I don't agree. On Linux you do it with sources because it doesn't have a reliable binary format like Windows that will work on any machine. If
    there are binaries, they might be limited to a particular Linux
    distribution.

    (Binaries on Linux have always been a mystery to me, starting with the
    fact that they don't have a convenient extension like .exe to even tell
    what it is.)

    GNU folks have a nice definition
    of source: "preferred form for making modifications". I would guess
    that 'app.ma' is _not_ your preferred form for making modifications,
    so it is not really true source.

    No, but deriving the true sources from app.ma is trivial, since it is
    basically a concatenation of the relevant files.

    And to build from "source" I need
    source first. And I provide _true_ sources to my users.

    If you were on Linux or didn't want to use my compiler, then it's even
    simpler; I would provide exactly one file:

    app.c Generated C source code (via `mc -c app`)

    Here, you need a C compiler only. On Windows, you can build it using
    `gcc app.c -oapp.exe`. Linux needs more options, listed at the top of
    the file.

    Sorry, a generated file is _not_ source. If I were to modify the C file

    This is not for modifying. 99% of the time I want to build an open
    source C project, it is in order to provide a running binary, not spend
    hours trying to get it to build. These are the obstacles I have faced:

    * Struggling with formats like .gz2 that require multiple steps on Windows
    * Ending up with myriad files scattered across myriad nested directories
    * Needing to run './configure' first (this will not work on Windows...)
    * Finding a 'make' /program/ (my gcc has a program called
    mingw32-make.exe; is that the one?)
    * Getting 'make' to work. Usually it fails partway and makefiles can be
    so complex that I have no way of figuring a way out
    * Or, trying to compile manually, struggling with files which are all
    over the place and imparting that info to a compiler.

    I don't have any interest in this; I just want the binary!

    So, with my own programs, if I can't provide a binary (eg. they are not trusted), then one step back from a single binary file, is a single
    amalgamated source file.

    I first did this in 2014, as a reaction to the difficulties I kept
    facing: I wanted any of my applications to be as easy to build as hello.c.

    If someone wants the original, discrete sources, then sure they can have
    a ZIP file, which generally will have files that unpack into a single
    directory. But it's on request.


    The difference is that what I provide is genuinely simple: one bare
    compiler, one actual source file.

    Sorry, for me "one file" is not a problem, there is 'tar' (de facto
    standard for distributing source code)

    Yeah, I explained how well that works above. So the last Rust
    implementation was a single binary download (great!), but it installed
    itself as 56,000 discrete files across I don't know how many thousands of
    directories (not so great). And it didn't work (it requires additional
    tools).

    Being able to ZIP or TAR a sprawling set of files into a giant binary
    makes it marginally easier to transmit or download, but it doesn't
    really address complexity.

    And there is a possible quite large dependency, namely Windows.

    Yeah, my binaries run on Windows. Aside from requiring x64 and using
    Win64 ABI, they use one external library MSVCRT.DLL, which itself uses
    Windows.

    For programs that run on Windows and Linux, those depend on the libraries
    used. For 'M' programs, one module has to be chosen from the Windows and
    Linux versions; to run on Linux, I have to do this:

    mc -c -linux app.m # On Windows, makes app.c, using
    # the Linux-specific module

    gcc app.c -oapp -lm etc # On Linux
    ./app

    but M makes little use of WinAPI. With my interpreter, the process is as follows:

    c:\qx>mc -c -linux qq # On Windows
    M6 Compiling qq.m---------- to qq.c

    Copy qq.c to c:\c then under WSL:

    root@DESKTOP-11:/mnt/c/c# gcc qq.c -oqq -fno-builtin -lm -ldl

    Now I can run scripts under Linux:

    root@DESKTOP-11:/mnt/c/c# ./qq -nosys hello
    Hello, World!

    However, notice the '-nosys' option; this is because qq automatically
    incorporates a library suite that includes a GUI library based on Win32.
    Without that, it would complain of not finding user32.dll etc.

    I would need to dig up an old set of libraries or create new
    Linux-specific ones. A bit of extra work. But see how the entire app is contained within that qq.c file.


    It is [not?] clear how much of your code _usefully_ runs in a non-Windows environment.

    OK, let's try my C compiler. Here I've done `mc -c -linux cc`, copied
    cc.q, and compiled under WSL as bcc:

    root@DESKTOP-11:/mnt/c/c# ./bcc -s hello.c
    Compiling hello.c to hello.asm

    root@DESKTOP-11:/mnt/c/c# ./bcc -e hello.c
    Preprocessing hello.c to hello.i

    root@DESKTOP-11:/mnt/c/c# ./bcc -c hello.c
    Compiling hello.c to hello.obj

    root@DESKTOP-11:/mnt/c/c# ./bcc -exe hello.c
    Compiling hello.c to hello.exe
    msvcrt
    msvcrt.dll
    SS code gen error: Can't load search lib

    So, most things actually work; only creating an EXE doesn't work, because
    it needs access to msvcrt.dll. But even if it did, it would work as a
    cross-compiler, as its code generator is for the Win64 ABI.

    But I think this shows useful stuff can be done. A more interesting test
    (which used to work, but it's too much effort right now), is to get my M compiler working on Linux (the 'mc' version that targets C), and use
    that to build qq, bcc etc from original sources on Linux.

    In all, my Windows stuff generally works on Linux. Linux stuff generally doesn't work on Windows, in terms of building from source.

    C is just so primitive when it comes to this stuff. I'm sure it largely
    works by luck.

    C is at a low level, that is clear.

    The way it does modules is crude. So was my scheme in the 1980s, but it
    was still one step up from C. My 2022 scheme is miles above C now. The underlying language can still be low level, but you can at least fix
    some aspects.

    I think a better module scheme could be retrofitted to C, but I'm not
    going to do it.


    A good programming
    environment should help. C as a language is not helpful; one
    may have a fully compliant and rather unhelpful compiler. But
    real C compilers tend to be as helpful as they can within the
    limits of the C language. While C still limits what they can do,
    there is quite a lot of difference between a current popular
    compiler and a bare-bones legal compiler. And there are extra
    tools, and here C support is hard to beat.

    I don't agree with spending 1000 times more effort in devising complex
    tools compared with just fixing the language.



    So what are the new circles of ideas? All that crap in Rust that makes
    coding a nightmare, and makes building programs dead slow? All those new
    functional languages with esoteric type systems? 6GB IDEs (that used to
    take a minute and a half to load on my old PC)? No thanks.

    The borrow checker in Rust looks like a good idea. There is a good chance
    that the _idea_ will be adopted by several languages in the near future.

    OK. I've heard that it makes coding in Rust harder, and also makes
    compilation slower. Not very enticing features!

    Not so new ideas are:
    - removing limitations, that is making sure language constructs
    work as generally as possible (which allows getting rid of many
    special constructs from older languages)
    - nominal, problem-dependent types. That is, types should reflect
    the problem domain. In particular, domains which need types like
    'u32' are somewhat specific; in normal domains the fundamental types
    are different
    - functions as values/parameters. In particular, functions have
    types and can be members of data structures (see the sketch after
    this list)
    - "full rights" for user-defined types. Which means whatever
    syntax/special constructs work on built-in types should
    also work for user-defined types
    - function overloading
    - type reconstruction
    - garbage collection
    - exception handling
    - classes/objects
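
    As a rough illustration of two of those ideas - functions as values that
    can live inside data structures, and type reconstruction - here is a
    minimal Haskell sketch; the names are invented and it is not meant to
    stand for any of the languages discussed here:

        -- A record whose field 'action' holds an ordinary function value.
        data Handler = Handler { label :: String, action :: Int -> Int }

        doubler :: Handler
        doubler = Handler { label = "double", action = (* 2) }

        -- No annotation needed here: the compiler reconstructs the type
        -- runHandler :: Handler -> Int -> Int
        runHandler h x = action h x

        main :: IO ()
        main = print (runHandler doubler 21)   -- prints 42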

    Are these what your language supports? (If you have your own.)

    I can't say these have ever troubled me. My scripting language has
    garbage collection, and experimental features for exceptions and playing
    with OOP, and one or two taken from functional languages.

    Being dynamic, it has generics built-in. But it deliberately keeps type
    systems at a simple, practical level (numbers, strings, lists, that sort
    of thing), because the aim is for easy coding. If you want hard, then
    Rust, Ada, Haskell etc are that way -->!

    * Clean, uncluttered brace-free syntax
    Does this count as brace-free?

    for i in 1..10 repeat (print i; s := s + i)

    Not if you just substitute brackets for braces. Brackets (ie "()") are
    OK within one line, otherwise programs look too Lispy.

    * Case-insensitive
    * 1-based
    * Line-oriented (no semicolons)
    * Print/read as statements

    A lot of folks consider the above misfeatures/bugs.

    I know.

    Concerning
    'line-oriented' and 'intuitive', can you guess which changes to the
    following statement in 'line-oriented' syntax are legal and preserve
    meaning?

    nm = x or nm = 'log or nm = 'exp or nm = '%power or
    nm = 'nthRoot or
    nm = 'cosh or nm = 'coth or nm = 'sinh or nm = 'tanh or
    nm = 'sech or nm = 'csch or
    nm = 'acosh or nm = 'acoth or nm = 'asinh or nm = 'atanh or
    nm = 'asech or nm = 'acsch or
    nm = 'Ei or nm = 'erf or nm = 'erfi or nm = 'li or
    nm = 'Gamma or nm = 'digamma or nm = 'dilog or
    nm = '%root_sum =>
    "iterate"

    As a hint let me say that '=' is comparison. And this is a single
    statement; small changes to whitespace will change the parse and lead
    to wrong code or a syntax/type error. BTW: you need a real newsreader
    to see it, Google and the like will change it so it no longer works.

    Thunderbird screws it up as well, unless it is meant to have a ragged
    left edge. But not sure what your point is.

    Non-line-oriented (like C, like JSON) is better for machine-readable
    code, which can also be transmitted with less risk of garbling. But when
    90% of semicolons in C-style languages coincide with end-of-line, you
    have to start questioning the point of them.

    Note that C's preprocessor is line-oriented, but C itself isn't.

    C is still tremendously popular for many reasons. But anyone wanting to
    code today in such a language will be out of luck if they prefer any or
    all of these characteristics. This is why I find coding in my language
    such a pleasure.

    Then, if we are comparing the C language with mine, I offer:

    * Out of order definitions

    That is considered a misfeature in modern times.

    Really? My experiments showed that modern languages (not C or C++) do
    allow out-of-order functions. This gives great freedom in not worrying
    about whether function F must go before G or after, or being able to
    reorder or copy and paste.

    In modern languages a
    definition may generate some code to be run, and the order in which
    this code is run matters.

    * One-time definitions (no headers, interfaces etc)
    * Expression-based

    C is mostly expression-based.

    No, it's mostly statement-based. Although it might be that most
    statements are expression statements (a=b; f(); ++c;).

    You can't do 'return switch() {...}' for example, unless using gcc
    extensions.

    There are languages that go further
    than C, for example:

    a := (s = 1; for i in 1..10 repeat (s := s + i); s)

    is legal in the language that I use, but cannot be directly translated
    to C. However, from the examples that you gave it looked like your language
    is _less_ expression-based than C.

    I don't use the feature much. I had it from the 80s, then switched to
    statement-based for a few years to match the scripting language; now
    both are expression-based.

    One reason it's not used more is because it causes problems when
    targetting C. However I like it as a cool feature.

    * Program-wide rather than module-wide compilation unit

    AFAICS C leaves the choice to the implementer. If your language _only_
    supports whole-program compilation, then this would be a
    negative feature.

    * Build direct to executable; no object files or linkers

    That is really a question of implementation. Building _only_
    to executable is a misfeature (what about the case when I want
    to use a few routines in your language, but the rest, including
    the main program, is in a different language?).

    There were escape routes involving OBJ files, but that's fallen into
    disuse and needs fixing. For example, I can't do `mm -obj app` ATM, but
    could do this, when I've cleared some bugs:

    mm -asm app # app.m to app.asm
    aa -obj app # app.asm to app.obj
    gcc app.obj lib.o -oapp.exe # or lib.a?

    This (or something near) allows static linking of 'lib' instead of
    dynamic, or including lib written in another language.

    However, my /aim/ is for my language to be self-contained, and not to
    talk to external software except via DLLs.

    * Blazing fast compilation speed, can run direct from source

    Again, that is implementation (some language features may slow
    down compilation; as you know, C allows fast compilation).

    C also requires that the same header (say windows.h or gtk.h) used in 50 modules needs to be processed 50 times for a full build.

    My M language processes it just once for a normal build (further, such
    APIs are typically condensed into a single import module, not 100s of
    nested headers). Some of it is by design!

    * Module scheme with tidy 'one-time' declaration of each module
    * Function reflection (access all functions within the program)
    * 64-bit default data types (ie. 'int' is 64 bits, 123 is 64 bits)
    * No build system needed

    That really depends on the needs of your program. Some are complex
    and need a build system, some are simple and in principle could
    be compiled with "no" build system. I still use Makefiles for
    simple programs for two reasons:
    - typing 'make' is almost as easy as it can get

    Ostensibly simple, yes. But it rarely works for me. And internally, it
    is complex. Compare what a typical makefile contains with one of my
    program headers, which looks like a shopping list - you can't get simpler!

    - I want to have record of compiler used/compiler options/
    libraries

    So do I, but I want to incorporate that into the language. So if a
    program uses OpenGL, when it sees this:

    importdll opengl =

    (followed by imported entities) that tells it it will need opengl.dll.
    In more complex cases (where the mapping of import library to DLL file(s)
    is not straightforward), it's more explicit:

    linkdll opengl # next to module info

    This stuff no longer needs to be submitted via command line; that is
    old-hat.


    Or are used to buying ready-meals from supermarkets.

    Meals are a different thing from programming languages. If you want
    to say that _you_ enjoy your language(s), then I get this. My point
    was that you are trying to present your _subjective_ preferences
    as something universal.

    Yes, and I think I'm right. For example, English breakfast choices are
    simple (cereal, toast, eggs, sausages), everybody likes them, kids and
    adults. But then in the evening you go to a posh restaurant and things
    are very different.

    I think the same basics exist in programming languages.

    I like programming and an important part
    is that my programs work. So I like features that help me to get
    a working program and dislike ones that cause trouble. IME, the
    following cause trouble:

    - case insensitivity

    I believe this would only cause problems if you already have a
    dependence on case-sensitivity, so it's a self-fulfilling problem!

    Create a new language with it, and those problems become minor ones that
    occur on FFI boundaries, and then not that often.

    - dependence on poorly specified defaults
    - out of order definitions

    I don't believe this. In C, not having this feature means:

    * Requiring function prototypes, sometimes
    * Causing problems in self-referential structs (needs struct tags)
    * Causing problems with circular references in structs (S includes
    a pointer to T, and T includes a pointer to S)

    - inflexible tools that, for example, insist on creating an executable
    with no option of making a linkable object file

    Concerning 1-based indexing, IME it causes trouble in more cases
    than it helps, but usually this is a minor issue.

    In my compiler sources, about 30% of arrays are zero-based (with the
    0-term usually associated with some error or non-set/non-valid index).

    I use line-oriented
    syntax a lot. I can say that it works, simple examples
    are easy, but there are unintuitive situations and sometimes trouble.
    For example, cut and paste works better with traditional syntax.
    If there are problems, then beginners may be confused. As one
    guy put it: the trouble with white space is that one cannot see it.

    White space (spaces and tabs) is nothing to do with line-oriented.


    In fact,
    it seems that majority goes in quite different direction.

    I can see that. 30+ years ago, I hardly knew what other people did, and
    didn't care. I did get the impression that I had a competitive edge in the
    way I did things.

    A lot of stuff I do with languages is experimental. When it doesn't work
    it gets dropped. Or it evolves. One problem I have now is that I don't
    do enough apps, which is what helps drive language development.

  • From Bart@21:1/5 to David Brown on Mon Dec 12 02:17:04 2022
    On 11/12/2022 19:46, David Brown wrote:
    On 10/12/2022 18:23, Bart wrote:

    If I look at Intel processor manuals for example, the behaviour of
    instructions is described in a pseudo-code language that looks a lot like
    Algol. It has IF statements, explicit FOR-loops and mutable variables.


    This is exactly what I said - you /think/ certain concepts are
    "universal" or "fundamental" because of your background, and that
    includes heavy use of assembly coding.  von Neumann computer
    architecture has turned out to be very successful, and imperative
    procedural programming is a good fit for such systems at the low level.
    Not all processors fit that model, however, and there are niche devices
    that are very different.  (Internally, even mainstream devices like x86 processors deviate substantially from strict von Neumann models.)

    You're missing my point: the description of what the instructions did,
    which we can assume to be detailed pseudo-code, used explicit loops for example, a feature missing or marginalised in FP.

    I asked, why didn't they use recursion in their pseudo-code? Why didn't
    my Wikipedia example do so rather than use 'while' loops?

    I say it's because such features are understood by more people. You will disagree of course; because /you/ can understand FP code, so must everyone.

    When I look at my paper tax return (or an IKEA instruction leaflet), it
    will contain instructions like: 'go to section 7' or 'go to page 12'. So
    you can assume that 'goto' at least is well understood!



    That does not mean, however, that such languages are ideal for
    programming.  Programming is the art of taking a task from the problem domain and ending up with something in the implementation domain.  For
    most processors, a C-like language (which includes your compiled
    language - imperative and procedural) is close to the implementation domain.  It is, however, typically very far from the problem domain.  A high level language such as Python is going to be closer to the problem domain.

    Python? Probably my scripting language is closer to the problem domain
    too. Mine however has 'goto'!


    The closer your language choice is to the problem domain, the shorter
    the program, the easier it is to see (or prove) that it is correct, and
    the shorter development time.

    That doesn't work with functional languages, not unless you're a genius
    with a PhD in computer science. The program might be 10 times shorter
    but 100 times more cryptic.


    Different kinds of task in the problem domain are best described in
    different kinds of language.  Different situations call for different selections of level of language and trade-offs.  But common to all programming is this progression from the problem to the implemented
    solution.

    Why do you think I decided to create a scripting language at all? I first
    did so in the late 80s, as the next step up from the non-programmable command
    language of my application. It was meant also for users of my app, who
    needed to do domain-specific scripting relevant to the application.

    It was higher level than the one used to implement the application. It
    was domain-specific in having built-in types such as 3D points/vertices
    and 3D transformation matrices.

    So if P, Q were the endpoints of a line, then (P+Q)/2 was the midpoint,
    while A*B might combine two transformation matrices into one. No memory addresses in sight.

    Maybe I know something about the need for a more accessible language.

    However one big part of that language was creating and working with GUI elements (dialogs etc), which was still a slog even in scripting code.
    How much help would Haskell have been there?

  • From David Brown@21:1/5 to Bart on Mon Dec 12 12:56:39 2022
    On 12/12/2022 03:17, Bart wrote:
    On 11/12/2022 19:46, David Brown wrote:
    On 10/12/2022 18:23, Bart wrote:

    If I look at Intel processor manuals for example, the behaviour of
    instructions is described in a pseudo-code language that looks a lot like
    Algol. It has IF statements, explicit FOR-loops and mutable variables.


    This is exactly what I said - you /think/ certain concepts are
    "universal" or "fundamental" because of your background, and that
    includes heavy use of assembly coding.  von Neumann computer
    architecture has turned out to be very successful, and imperative
    procedural programming is a good fit for such systems at the low
    level. Not all processors fit that model, however, and there are niche
    devices that are very different.  (Internally, even mainstream devices
    like x86 processors deviate substantially from strict von Neumann
    models.)

    You're missing my point: the description of what the instructions did,
    which we can assume to be detailed pseudo-code, used explicit loops for example, a feature missing or marginalised in FP.

    I asked, why didn't they use recursion in their pseudo-code? Why didn't
    my Wikipedia example do so rather than use 'while' loops?

    I say it's because such features are understood by more people. You will disagree of course; because /you/ can understand FP code, so must everyone.


    And you are /completely/ missing /my/ point. (This is a recurring theme
    in our discussions. If only Usenet were interactive and supported
    arm-waving and a whiteboard, I'm sure we'd understand each other far
    better!)

    I am not arguing that loops (or any other programming language feature)
    are not commonly understood. I am arguing that they are not universal
    or fundamental in any sense.

    I am /not/ saying that "because I understand FP-style code, so does
    everyone" - I am merely saying that /you/ are /wrong/ to say "because I understand imperative code, so does everyone".


    Look at everyday language. We all understand English here in this
    group. Even in countries with completely different native languages,
    such as China, many technical people understand at least some English.
    Does that mean English is somehow more "fundamental" than Mandarin?
    Simpler? More universal? Understood by everyone?


    When I look at my paper tax return (or an IKEA instruction leaflet), it
    will contain instructions like: 'go to section 7' or 'go to page 12'. So
    you can assume that 'goto' at least is well understood!


    Different algorithms are clear when expressed in different ways.

    You want the greatest common divisor of two numbers, a and b ? If a
    and b are the same, that's the GCD. If b is bigger than a, swap them.
    Then find the GCD of b and (a - b).

    Oh look, it turns out recursion is "universally understood".
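
    Written out directly, that description is only a few lines of Haskell;
    a minimal sketch ('mygcd' is a made-up name, to avoid the built-in gcd):

        -- GCD by repeated subtraction, for positive integers.
        mygcd :: Integer -> Integer -> Integer
        mygcd a b
          | a == b    = a                -- same number: that's the GCD
          | b > a     = mygcd b a        -- swap so the first is larger
          | otherwise = mygcd b (a - b)  -- then GCD of b and (a - b)

        main :: IO ()
        main = print (mygcd 48 36)       -- prints 12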

    You want to go shopping? Make a list. Get the first thing on the list,
    then get the rest of the list. Stop when the list is empty. So list processing, pattern matching and recursion is universal - there are no variables, loops or gotos in that description.

    You need to get some things in one shop, other things in a different
    shop? Take all the things from the first list that you can get in the
    first shop, and put that on a new list. Now you have the key functional programming "filter" function as universal, even if the terminology of functional programming is unfamiliar.

    You realise you need to bake two cakes instead of one? Double
    everything on the list. Now "map" is universal.
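
    In Haskell those two kitchen steps really are just 'filter' and 'map';
    a minimal sketch with made-up item names:

        -- The shopping list, and which items the first shop stocks.
        shoppingList :: [String]
        shoppingList = ["eggs", "sugar", "flour", "screws", "paint"]

        inFirstShop :: String -> Bool
        inFirstShop item = item `elem` ["eggs", "sugar", "flour"]

        main :: IO ()
        main = do
            print (filter inFirstShop shoppingList)   -- the "filter" step
            print (map (* 2) [1, 40, 40])             -- the "map" step: double every quantity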

    You have recipes for six different kinds of cake. To make double-sized
    cakes, you need to double the ingredients and add 30% to the cooking
    time. Now you have a higher-order function.
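
    To make the "higher-order" step concrete: in this minimal Haskell sketch
    a recipe is itself a function (from servings to an ingredient list), so
    the transformation is a function that takes a function and returns a new
    one. All the names here are invented:

        -- A recipe maps a number of servings to ingredient quantities.
        type Recipe = Int -> [(String, Double)]

        spongeCake :: Recipe
        spongeCake n = [("eggs", fromIntegral n), ("flour", 40 * fromIntegral n)]

        -- Higher-order: takes a Recipe and gives back a new Recipe.
        doubleSized :: Recipe -> Recipe
        doubleSized recipe n = map (\(name, qty) -> (name, 2 * qty)) (recipe n)

        main :: IO ()
        main = print (doubleSized spongeCake 3)
        -- prints [("eggs",6.0),("flour",240.0)]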

    You have a things-to-do list? Do the first one, then move on to the
    rest of the list. That's recursion and lazy evaluation.

    Imagine the list of all prime numbers. Take the first 5 containing the
    digit "3". It seems that understanding how to manipulate infinite lists
    is easy too.
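
    That last one is almost literal Haskell; a small sketch, with a naive
    (but adequate) trial-division sieve standing in for "the list of all
    prime numbers":

        -- An infinite list of primes; laziness means only what is needed
        -- ever gets computed.
        primes :: [Integer]
        primes = sieve [2 ..]
          where sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

        main :: IO ()
        main = print (take 5 (filter (elem '3' . show) primes))
        -- prints [3,13,23,31,37]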


    That just about covers the key functional programming concepts - and
    you've barely left the kitchen. No programming experience is needed,
    yet these are instructions that pretty much anyone could follow. And
    while it is certainly possible to give more imperative-style
    descriptions of these tasks, I'd argue that the functional style
    descriptions here are clearer and more natural. (I am quite happy to
    agree that numbered imperative steps are often a better choice for
    making flatpack furniture.)


    If an algorithm can be described relatively clearly and simply, the
    exact style is not critical to understanding - and there is absolutely
    no justification for considering one feature more "universal" than any
    other.

    (The technical terms used for these features are not universally
    understood. But that's the case for most things in life - people are
    quite happy to walk around all day without knowing what "bipedal
    ambulation" means.)


    When it comes to something like the description of processor instructions,
    these are usually written in a mathematical form with a few special
    conventions. The result is normally as much "functional programming
    style" as "imperative programming style" - or a mixture of both. It can
    have "while" loops common to imperative code and "where" clauses common
    to functional languages. Barring coincidences in the particular syntaxes
    chosen, there's unlikely to be much challenge in viewing them as either
    style - the descriptions are usually too short and simple to make much
    difference.




    That does not mean, however, that such languages are ideal for
    programming.  Programming is the art of taking a task from the problem
    domain and ending up with something in the implementation domain.  For
    most processors, a C-like language (which includes your compiled
    language - imperative and procedural) is close to the implementation
    domain.  It is, however, typically very far from the problem domain.
    A high level language such as Python is going to be closer to the
    problem domain.

    Python? Probably my scripting language is closer to the problem domain
    too. Mine however has 'goto'!

    I would definitely expect your scripting language to be closer to
    typical problem domains - otherwise it would be pointless.



    The closer your language choice is to the problem domain, the shorter
    the program, the easier it is to see (or prove) that it is correct,
    and the shorter development time.

    That doesn't work with functional languages, not unless you're a genius
    with a PhD in computer science. The program might be 10 times shorter
    but 100 times more cryptic.


    I don't have a PhD in computer science. I didn't even have a Bachelor's
    degree when I learned functional programming. It is not nearly as hard
    as some people seem to imagine. While it is very difficult to quantify
    or qualify how "hard" something is to learn, I don't think it is
    inherently any more difficult than imperative programming - it's simply
    that many people learn imperative programming first, and never move on.

    I mean, I appreciate your calling me a genius, but it's not actually
    necessary!


    Note that I don't write much pure functional programming code. What I
    like is to be able to mix and match - I like to be able to use
    functional style when that is clearest for the problem, and imperative
    style when /that/ is clearest.


    (Is a Dvorak keyboard layout harder to learn than Qwerty? No - and it
    is more efficient in use. It is all about familiarity. I type code
    three or four times as fast on a Norwegian keyboard layout than a UK
    layout, despite needing extra keypresses for some symbols, as a result
    of familiarity.)


    Different kinds of task in the problem domain are best described in
    different kinds of language.  Different situations call for different
    selections of level of language and trade-offs.  But common to all
    programming is this progression from the problem to the implemented
    solution.

    Why do you think I decided to create a scripting language at all? I first
    did so in the late 80s, as the next step up from the non-programmable command
    language of my application. It was meant also for users of my app, who
    needed to do domain-specific scripting relevant to the application.


    I expect it was for much the same reasons as other scripting languages
    were developed - making it simpler to solve the tasks you needed to
    solve. It's higher level than your compiled low-level language, and
    nearer to the problem domain.

    It was higher level than the one used to implement the application. It
    was domain-specific in having built-in types such as 3D points/vertices
    and 3D transformation matrices.

    So if P, Q were the endpoints of a line, then (P+Q)/2 was the midpoint,
    while A*B might combine two transformation matrices into one. No memory addresses in sight.

    Maybe I know something about the need for a more accessible language.

    However one big part of that language was creating and working with GUI elements (dialogs etc), which was still a slog even in scripting code.
    How much help would Haskell have been there?


    I've never had the need to investigate.

    Declarative styles are often very convenient for defining graphic
    elements, compared to imperative styles. (I.e., you want to say "the
    dialog box has ten buttons" rather than "loop ten times : create a new button").

    Usually, I think, a combination is best. Declarations can be nicer than function calls for creating your gui. Imperative code can make sense
    for acting on events. Event-based programming is useful for user
    interaction (rather than the pure imperative style of polling loops and blocking functions). Object oriented coding helps keep parts of the
    system modularised and structured.

    I wonder if you are imagining functional programming as some kind of
    "opposite" to imperative programming, or that languages are all one or
    the other. The reality is that there is a large number of ways to think
    about programming, and most real-world languages support multiple
    paradigms to a greater or lesser extent. (The paradigms themselves are
    not well defined either - it's all just broad strokes and rough
    categorisation based on common features.)

  • From Bart@21:1/5 to David Brown on Mon Dec 12 14:36:45 2022
    On 12/12/2022 11:56, David Brown wrote:
    On 12/12/2022 03:17, Bart wrote:

    When I look at my paper tax return (or an IKEA instruction leaflet),
    it will contain instructions like: 'go to section 7' or 'go to page
    12'. So you can assume that 'goto' at least is well understood!


    Different algorithms are clear when expressed in different ways.

    You want the greatest common divisor of two numbers, a and b ?  If a
    and b are the same, that's the GCD.  If b is bigger than a, swap them.
    Then find the GCD of b and (a - b).

    Oh look, it turns out recursion is "universally understood".

    You want to go shopping?  Make a list.  Get the first thing on the list, then get the rest of the list.  Stop when the list is empty.  So list processing, pattern matching and recursion is universal - there are no variables, loops or gotos in that description.

    You need to get some things in one shop, other things in a different
    shop?  Take all the things from the first list that you can get in the
    first shop, and put that on a new list.  Now you have the key functional programming "filter" function as universal, even if the terminology of functional programming is unfamiliar.

    You realise you need to bake two cakes instead of one?  Double
    everything on the list.  Now "map" is universal.

    You have recipes for six different kinds of cake.  To make double-sized cakes, you need to double the ingredients and add 30% to the cooking
    time.  Now you have a higher-order function.

    You have a things-to-do list?  Do the first one, then move on to the
    rest of the list.  That's recursion and lazy evaluation.

    Imagine the list of all prime numbers.  Take the first 5 containing the digit "3".  It seems that understanding how to manipulate infinite lists
    is easy too.


    That just about covers the key functional programming concepts - and
    you've barely left the kitchen.

    A nice set of examples, except for the higher order function one, which
    I don't get. Here are some observations:

    * Imperative languages have recursion too.

    * Things like map and reduce can be expressed in imperative code too,
    via built-in functions, or ones that you can write yourself ...

    * ... which leads to the fact that imperative offers the choice to
    express the tasks using lower level features, offering more flexibility

    * My beef with recursion as used in FP is using it for everything, when
    iteration would be clearer

    * Those tasks would typically be expressed much more tersely in FP,
    which also has a penchant for stringing sequences of them together on
    one complex line (but this seems popular everywhere now). In imperative,
    deeply nested function calls would be an anti-pattern.

    * You mentioned a shopping list; I quite like code to /look/ like a
    shopping list. I don't mind reading 10 or 20 lines of clear code where I
    can follow every step; it's better than reading one long line that I
    can't grok. Where would I even stick a debug print if I wanted to know
    an intermediate value?


    You realise you need to bake two cakes instead of one? Double
    everything on the list. Now "map" is universal.

    As is mutable data. Or do you mean create a new list? What exactly do
    you double anyway? A typical ingredients list looks like this:

    ["eggs":1, "sugar":40, "SR flour":40]

    Doubling is not that obvious (and the number for eggs has to be whole;
    other units are grams).

    You want to go shopping? Make a list. Get the first thing on the list, then get the rest of the list. Stop when the list is empty.

    What does that actually look like in Haskell? I can tell you that
    shopping as it's done in real-life is not done recursively.

    In imperative it's:

    proc do_shopping(list) =
        for item in list do
            buy(item)
        end
    end

    Or one-liners:

    for item in list do buy(item) end
    apply(buy, list)            # not map as there is no result

    In imperative-recursive, if you have to, is:

    proc do_shopping(list) =
        if list then
            buy(head(list))
            do_shopping(tail(list))
        fi
    end

    (I agree parametric pattern-matching to get the two parts of the list
    would be better. But imperative is simpler here anyway.)

    You have a things-to-do list? Do the first one, then move on to the
    rest of the list. That's recursion and lazy evaluation.

    But the to-do-list already exists? If there are 1M things to do, the list
    will have 1M items? That's not a great example of lazy evaluation (the
    primes one is better).

    You might as well say that imperative code is lazily evaluated, as it
    consists of a sequence of steps done one at a time, while, ironically,
    FP code isn't as it is one big expression.

    OK, I don't have a particular issue with FP features in small amounts.
    As I said and showed, they can come up in imperative code too.

    I have two broad objections:

    * All the FP features I don't understand and don't find intuitive,
    mainly to do with functions (higher order, closures, currying etc).

    Pattern-matching, map, reduce etc I can deal with; the concepts are
    easy, and they can be trivially expressed in non-FP terms.

    * Basing an entire language around FP features.

  • From David Brown@21:1/5 to Bart on Mon Dec 12 17:27:04 2022
    On 12/12/2022 15:36, Bart wrote:
    On 12/12/2022 11:56, David Brown wrote:
    On 12/12/2022 03:17, Bart wrote:

    When I look at my paper tax return (or an IKEA instruction leaflet),
    it will contain instructions like: 'go to section 7' or 'go to page
    12'. So you can assume that 'goto' at least is well understood!


    Different algorithms are clear when expressed in different ways.

    You want the greatest common divisor of two numbers, a and b ?  If
    a and b are the same, that's the GCD.  If b is bigger than a, swap
    them. Then find the GCD of b and (a - b).

    Oh look, it turns out recursion is "universally understood".

    You want to go shopping?  Make a list.  Get the first thing on the
    list, then get the rest of the list.  Stop when the list is empty.  So
    list processing, pattern matching and recursion is universal - there
    are no variables, loops or gotos in that description.

    You need to get some things in one shop, other things in a different
    shop?  Take all the things from the first list that you can get in the
    first shop, and put that on a new list.  Now you have the key
    functional programming "filter" function as universal, even if the
    terminology of functional programming is unfamiliar.

    You realise you need to bake two cakes instead of one?  Double
    everything on the list.  Now "map" is universal.

    You have recipes for six different kinds of cake.  To make
    double-sized cakes, you need to double the ingredients and add 30% to
    the cooking time.  Now you have a higher-order function.

    You have a things-to-do list?  Do the first one, then move on to the
    rest of the list.  That's recursion and lazy evaluation.

    Imagine the list of all prime numbers.  Take the first 5 containing
    the digit "3".  It seems that understanding how to manipulate infinite
    lists is easy too.


    That just about covers the key functional programming concepts - and
    you've barely left the kitchen.

    A nice set of examples, except for the higher order function one, which
    I don't get.

    It's a set of instructions that are applied to a set of instructions to
    get a new set of instructions - a "recipe" for transforming one cake
    recipe into a new one.


    Here are some observations:

    * Imperative languages have recursion too.

    Sure - there's plenty of overlap. Equally, functional programming
    languages have expressions that are often very much the same as in
    imperative languages (though generally without side-effects). Syntax
    detail varies, but the principles are the same.


    * Things like map and reduce can be expressed in imperative code too,
    via built-in functions, or ones that you can write yourself ...

    * ... which leads to the fact that imperative offers the choice to
    express the tasks using lower level features, offering more flexibility

    * My beef with recursion as used in FP is using it for everything, when
    iteration would be clearer


    I'm happy with mixing them. I don't claim pure functional programming
    style is the best way to write all code. Sticking to pure FP can have
    its advantages, such as making code proofs easier and being inherently
    safe for multi-threading. But it can also be inconvenient for other
    tasks that can be clearer as sequences of commands or explicit loops.
    Equally, some things are vastly simpler and clearer in FP style.

    This all started with a discussion of ideas for a new programming
    language. My suggestion was never for James to make a pure functional programming language. Rather, I suggested he look at some functional programming languages (and other languages) and see what he could copy
    or take as inspiration. That includes features such as higher-order
    functions, anonymous functions, list comprehensions, sum types
    (google for "how to make a binary tree in Haskell", and compare it to
    what's needed in something like C), and pattern matching. It also
    includes considering the benefits from restrictions - what does "no side-effects in expressions" give you? Are the benefits worth the
    limitations?

    I'd also recommend looking at Go and its CSP-style concurrency features.
    Look at Rust's memory safety. Look at contracts from SPARK. There's
    plenty to be inspired from in many languages, in order to make a better language that does new and exciting things. (Let's put these in new
    threads if anyone wants to discuss them in detail.)


    * Those tasks would typically be expressed much more tersely in FP,
    which also has a penchant for stringing sequences of them together on
    one complex line (but this seems popular everywhere now). In imperative, deeply nested function calls would be an anti-pattern.

    * You mentioned a shopping list; I quite like code to /look/ like a
    shopping list. I don't mind reading 10 or 20 lines of clear code where I
    can follow every step; it's better than reading one long line that I
    can't grok. Where would I even stick a debug print if I wanted to know
    an intermediate value.


    A key point about a shopping list is that it seldom requires much order.
    That's also the case in declarative languages - there is no order
    except when it is unavoidable. (Even where it appears to require order,
    lazy evaluation may mean a function can start executing before its
    arguments are known.) Imperative languages, by their fundamental
    nature, order everything.


    You realise you need to bake two cakes instead of one?  Double
    everything on the list.  Now "map" is universal.

    As is mutable data. Or do you mean create a new list?

    Logically, you have a new list. You might not bother writing it down
    (that's lazy evaluation again :-) ), but you are not destroying your old recipe.

    What exactly do
    you double anyway? A typical ingredients list looks like this:

       ["eggs":1, "sugar":40, "SR flour":40]

    Doubling is not that obvious (and the number for eggs has to be whole;
    other units are grams).

    I was giving an example of programming paradigms, not a baking course!


    You want to go shopping?  Make a list.  Get the first thing on the list, then get the rest of the list.  Stop when the list is empty.

    What does that actually look like in Haskell? I can tell you that
    shopping as it's done in real-life is not done recursively.

    In imperative it's:

        proc do_shopping(list) =
            for item in list do
                buy(item)
            end
        end


    In Haskell :

    shop [] = []
    shop (x : xs) = buy x ++ shop xs

    If you've nothing to get, you are done. Otherwise you buy the first
    thing off the list and then buy everything else. (Or you buy everything
    else, then the first thing - or send a kid to buy the first thing while
    you get everything else. Unlike imperative code, the order of
    evaluation is not fixed.)

    This is actually very much the way shopping is done - it is as close a
    model as your loop. They both model the same process.

    (Recursion is more general than looping. Consider sorting a pack of
    cards manually. You might use a mergesort, which is very neatly
    expressed recursively.)
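
    A minimal Haskell mergesort sketch, since the recursive description maps
    almost word for word onto code - split the pack, sort each half, then
    merge the sorted halves:

        msort :: Ord a => [a] -> [a]
        msort []  = []
        msort [x] = [x]
        msort xs  = merge (msort left) (msort right)
          where
            (left, right) = splitAt (length xs `div` 2) xs
            merge [] bs = bs
            merge as [] = as
            merge (a : as) (b : bs)
              | a <= b    = a : merge as (b : bs)
              | otherwise = b : merge (a : as) bs

        main :: IO ()
        main = print (msort [5, 3, 8, 1, 9, 2])   -- [1,2,3,5,8,9]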


    Or one-liners:

       for item in list do buy(item) end
       apply(buy, list)            # not map as there is no result


    In Haskell :

    [ buy x | x <- xs]

    And yes, there /is/ a result - when you buy something, I would hope you
    have the thing as a result of the purchase! In your imperative code,
    this might be hidden by putting the purchases in a global variable bag,
    but the "result" is still there.

    In imperative-recursive, if you have to, is:

        proc do_shopping(list) =
            if list then
                buy(head(list))
            do_shopping(tail(list))
            fi
        end

    (I agree parametric pattern-matching to get the two parts of the list
    would be better. But imperative is simpler here anyway.)

    Not really, no. It's possible that the syntax for the Haskell is
    unfamiliar to you, making it harder to understand, but I think you'll
    see what it says when you think about it. (And obviously for
    non-programming shoppers the algorithm will be in prose, whether it is imperative or functional style.)


    You have a things-to-do list?  Do the first one, then move on to the
    rest of the list.  That's recursion and lazy evaluation.

    But the to-do-list already exists? If there are 1M things to do, the list
    will have 1M items? That's not a great example of lazy evaluation (the
    primes is better).

    The "lazy evaluation" aspect is that you don't have to evaluate
    arguments until they are needed. You can call "do_everything" on the "things-to-do" list even when one entry on the list is "find out what
    the kids want for Christmas" and another is "buy the Christmas
    presents". One of the arguments could even be "pass this list on to
    someone else to finish".


    You might as well say that imperative code is lazily evaluated, as it consists of a sequence of steps done one at a time, while, ironically,
    FP code isn't as it is one big expression.

    No, you can't say that.

    Imperative programming has an ordering (though sometimes certain aspects
    are unspecified by the language). In order to call a function, you
    first evaluate all the arguments, then you call the function with those arguments.

    If you have lazy evaluation (which is not required for functional
    programming, but is common), arguments are not evaluated unless and
    until they are needed. This requires being able to treat functions as
    data and data as functions - instead of passing a value as an argument,
    you pass a function that will generate that value when needed.
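
    A tiny Haskell sketch of that: the second argument below is passed as an
    unevaluated thunk, and since it is never needed, the error inside it
    never happens (the names are made up):

        -- Returns its first argument; never looks at the second.
        firstOf :: a -> a -> a
        firstOf x _ = x

        main :: IO ()
        main = print (firstOf 42 (error "never evaluated"))
        -- prints 42; with strict argument evaluation this call would fail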

    You can do lazy evaluation in more sophisticated non-functional
    languages too. In C++, it is a popular way to implement some kinds of
    heavy mathematics such as matrix libraries. When you write "A = B +
    C;", the code does not create a new matrix that is the result of adding matrices A and B. Rather, it creates a proxy object that knows how to
    get the result of the addition. Once you have written all your
    expressions and calculations, and actually ask for the results, all
    these proxies are evaluated. But now a lot of the memory management for creating and destroying the matrices can be avoided and more done
    in-place, or multiple additions can be combined to more efficient
    calculations.



    OK, I don't have a particular issue with FP features in small amounts.
    As I said and showed, they can come up in imperative code too.

    I have two broad objections:

    * All the FP features I don't understand and don't find intuitive,
    mainly to do with functions (higher order, closures, currying etc).


    OK. These are not the easiest of concepts. It takes effort to learn to understand and appreciate them, and you can program quite happily in
    other languages without ever seeing them. I'm not asking you to
    understand them, or use them, or implement them in your own language -
    all I am asking is that you accept that some other programmers do
    understand them, and do find them useful.

    Pattern-matching, map, reduce etc I can deal with; the concepts are
    easy, and they can be trivially expressed in non-FP terms.

    * Basing an entire language around FP features.


    It involves thinking about things in a significantly different way. I
    do little pure functional programming myself. (I think the last
    functional programming I did was FPGA hardware design, a good number of
    years ago. Many high-level hardware design languages are functional in nature.)

    I am a big fan of combining features and use the best tools for the task
    at hand, rather than trying to be "pure" about programming.

  • From Bart@21:1/5 to Bart on Mon Dec 12 17:20:37 2022
    On 12/12/2022 14:36, Bart wrote:
    On 12/12/2022 11:56, David Brown wrote:

    I have two broad objections:

    * All the FP features I don't understand and don't find intuitive,
    mainly to do with functions (higher order, closures, currying etc).

    Pattern-matching, map, reduce etc I can deal with; the concepts are
    easy, and they can be trivially expressed in non-FP terms.

    * Basing an entire language around FP features.

    I've been looking at examples on rosettacode.org. Most languages there
    are conventional (my style) other than all the weird ones plus FP.

    But one task caught my eye:

    https://rosettacode.org/wiki/Determine_if_a_string_has_all_unique_characters#Haskell

    as all three Haskell versions seem to make a meal of it. I had been
    looking for a short cryptic Haskell example; this was a long cryptic one!

    One mystery is how it gets the output (of first version) properly lined
    up, as I can't see anything relevant in the code.

    Half my version below is all the fiddly formatting; this is where I'd
    consider this a weak spot in my language and think about what could
    improve it.

    Other languages (eg. OCaml) keep the output minimal.


    -------------------------

    proc main =
        teststrings:=(
            "",
            ".",
            "abcABC",
            "XYZ ZYX",
            "1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ")

        println "+--------------------------------------+------+----------+--------+---+---------+"
        println "|string                                |length|all unique|1st diff|hex|positions|"
        println "+--------------------------------------+------+----------+--------+---+---------+"

        for s in teststrings do
            (length, unique, diff, pos):=checkstring(s)

            if diff then
                fprintln "|#|#|# |#|#|#|",
                    s:"38jl", length:"6jl",
                    (unique|"yes"|"no "),
                    diff:"jl8", asc(diff):"Hjl3", sprint(pos[1], pos[2]):"9jl"
            else
                fprintln "|#|#|yes | | | |",
                    s:"38jl", length:"6jl"
            fi
        od

        println "+--------------------------------------+------+----------+--------+---+---------+"
    end

    func checkstring(s)=
        for i, c in s do
            n:=c in rightstr(s,-i)
            if n then !not unique
                return (s.len, 0, c, (i, n+i))
            fi
        od

        (s.len, 1, "", ())
    end

  • From David Brown@21:1/5 to Bart on Tue Dec 13 16:40:21 2022
    On 12/12/2022 18:20, Bart wrote:
    On 12/12/2022 14:36, Bart wrote:
    On 12/12/2022 11:56, David Brown wrote:

    I have two broad objections:

    * All the FP features I don't understand and don't find intuitive,
    mainly to do with functions (higher order, closures, currying etc).

    Pattern-matching, map, reduce etc I can deal with; the concepts are
    easy, and they can be trivially expressed in non-FP terms.

    * Basing an entire language around FP features.

    I've been looking at examples on rosettacode.org. Most languages there
    are conventional (my style) other than all the weird ones plus FP.

    But one task caught my eye:

    https://rosettacode.org/wiki/Determine_if_a_string_has_all_unique_characters#Haskell


    as all three Haskell versions seem to make a meal of it. I had been
    looking for a short cryptic Haskell example; this was a long cryptic one!

    One mystery is how it gets the output (of first version) properly lined
    up, as I can't see anything relevant in the code.

    Half my version below is all the fiddly formatting; this is where I'd consider this a weak spot in my language and think about what could
    improve it.

    Other languages (eg. OCaml) keep the output minimal.


    The same applies to the Haskell code shown - checking for uniqueness is
    only about 10 lines of code - it's the table formatting that makes up
    the bulk of it. (I am not nearly fluent enough in Haskell or its
    standard libraries to write such code.)
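
    Just the check on its own is tiny, though - something like this, I
    believe (an untested sketch; the table formatting is all the rest):

        import Data.List (nub)

        -- A string has all unique characters iff removing duplicates changes nothing.
        allUnique :: String -> Bool
        allUnique s = nub s == s

        main :: IO ()
        main = mapM_ (print . allUnique)
            ["", ".", "abcABC", "XYZ ZYX", "1234567890ABCDEFGHIJKLMN0PQRSTUVWXYZ"]
        -- True, True, True, False, False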

  • From Bart@21:1/5 to David Brown on Thu Dec 15 11:26:39 2022
    On 08/12/2022 09:08, David Brown wrote:
    On 07/12/2022 19:58, Bart wrote:

    My point was that your language (the low-level compiled one) and C are similar styles - they are at a similar level, and are both procedural imperative structured languages.

    (None of this suggests you "copied" C.  You simply have a roughly
    similar approach to solving the same kinds of tasks - you probably had experience with much the same programming languages as the C designers,
    and similar assembly programming experience before making your languages
    at the beginning.)

    Inspiration for the syntax of my first language were from these:

    Algol68, Pascal, Ada, Fortran

    I'd only actually used Pascal, Fortran and Algol60; others I'd seen
    examples on paper.

    Influences for the semantics, and operating and memory model, were from
    these, in no particular order:

    Pascal, Fortran, PDP10 ASM, Z80 ASM, Babbage (HLA), Misc ASM (incl.
    6800).

    But also from actual hands-on hardware development, involving Z80 and
    x86 family up to 80188, and including graphics and video. (I don't know
    if the C designers ran their language on a machine they'd had to
    assemble with a soldering iron.)

    The PDP10 was most useful in deciding what not to bother with (ie. word-addressed memory, packed strings etc). The 6800 was the only
    big-endian processor I ever used, which had a brief positive influence,
    but later I was happy to settle with little-endian.

    AFAICR, C had no influence whatsoever. (Which is why I find it irksome
    that all these lower-level aspects of hardware and software are now
    considered the exclusive domain of C.)

    However, I did encounter C much later on, and it did affect my own
    language in some small matters when I adopted C's approach:

                                Original M    C

    Function calls, 0 args      F             F()
    Create function pointer     ^F            F as well as &F
    Address-of symbol           ^X            &F
    Hex literals                0ABCH         0xABC
    Pointer offset              P + i*4       P + i (byte vs object offset)
    Matching T* and Q*          Any T, Q      T = Q only, or one is void

    Another change I made recently is in dropping the need for explicit
    dereference operators in these contexts (originally I was against it for
    losing transparency, but that is also the case with pass-by-reference params):

    Original M    Current M    C

    A^[i]         A[i]         (*A)[i], but usually A[i]
    P^.m          P.m          (*P).m or P->m
    F^(x)         F(x)         (*F)(x) or F(x)

    (Note: C makes little use of true array pointers; it likes to use T*
    types, not pointer-to-array types, which means my example would be
    written as A[i] anyway. That however is very unsafe.)
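
    (For reference, a small C sketch of the right-hand column above; all
    three explicit forms are interchangeable with the shorthand ones:)

        #include <stdio.h>

        struct S { int m; };
        static int add1(int x) { return x + 1; }

        int main(void) {
            int a[3] = {1, 2, 3};
            int (*A)[3] = &a;            /* true pointer-to-array, rarely used in C */
            struct S s = {42};
            struct S *P = &s;
            int (*F)(int) = add1;

            printf("%d\n", (*A)[1]);     /* explicit deref of array pointer */
            printf("%d\n", (*P).m);      /* same as P->m                    */
            printf("%d\n", (*F)(9));     /* same as F(9)                    */
            return 0;
        }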

    Although the end result is the same (cleaner code), I don't achieve it
    by by-passing the type system: internally the language still needs and
    uses `P^.m`; it is purely a convenience.

    ^ can still be used, but not using ^ makes code easier to move to/from
    my Q language.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From luserdroog@21:1/5 to David Brown on Fri Dec 16 20:07:28 2022
    On Thursday, December 8, 2022 at 3:08:39 AM UTC-6, David Brown wrote:
    Programming in Eiffel, Haskell, APL, Forth or Occam is /completely/
    different - you approach your coding in an entirely different way, and
    it makes no sense to think about translating from one of these to C (or
    to each other).

    It can be a fun challenge. I've tried implementing APL and Haskell concepts
    in PostScript and C. Going from PostScript to C can be fairly easy if you completely disregard the goal of making it idiomatic. I have not looked at Eiffel or Occam in any depth. But I suppose "fun" is the key here. If you go into it thinking it's impossible, or impractically difficult, then you're right.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sat Dec 17 06:07:38 2022
    Bart <bc@freeuk.com> wrote:
    On 11/12/2022 16:50, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:


    (I don't think you've made it clear whether the other language(s) you've
    referred to are some mainstream ones, or one(s) that you have devised. I
    will assume the latter and tone down my remarks. A little.)

    Not my invention (I contributed a bit to some). But in most cases I
    would not call them mainstream. Well, if your criterion for a
    mainstream language is having more than 10 users, then there is a good
    chance that all would qualify.

    My scheme also allows circular and mutual imports with no restrictions.

    You probably mean "with no artificial restrictions". There is a
    fundamental restriction that everything must resolve in a finite
    number of steps.

    Huh? That doesn't come up. Anything which is recursively defined (eg.
    const a=b, b=a) is detected, but that is due to out-of-order
    declarations, not circular modules.

    There could be "interesting" dependencies; naive handling would
    treat them as a cycle, but they are resolvable.

    There needs to be some structure, some organisation.

    Exactly; private import is a tool for better organisation.

    Sorry, all I can see is extra work; it was already a hassle having to
    write `import B` at the top of every module that used B, when it was then
    visible to all functions, because of having to manage that list of imports /per module/. Now I have to do that micro-managing /per function/?

    I do not "have to". This is an option, and its use is not frequent.

    (Presumably this language has block scopes: can this import also be
    private to a nested block within each function?)

    Actually, ATM there is no block scope; things propagate from normal
    blocks to the outside. There is a limited number of things that prevent
    propagation: one is functions, and there are a few others, but they are
    somewhat unusual.

    With module-wide imports, it is easy to draw a diagram of interdependent modules; with function-wide imports, that is not so easy, which is why I think there will be less structure.

    You can draw whatever diagrams you want. Scoped imports allow expressing
    more information in the source. For me that is more structure.

    I can compile each program separately. Granularity moves from module to
    program. That's better. Otherwise why not compile individual functions?

    In the past there was support for compiling individual functions, and
    I would not exclude the possibility that it will come back. But ATM
    I prefer to keep things simple, so this functionality was removed.

    With whole program compilation, especially used in the context of a
    resident tool that incorporates a compiler, with resident symbol tables, resident source and running code in-memory (actually, just like my very
    first compilers), lots of possibilities open up, of which I've hardly scratched the surface.

    Including compiling/recompiling a function at a time.

    Although part-recompiling during a pause in a running program, then
    resuming, would still be very tricky. That would need debugging
    features, and I would consider, in that case, running via an interpreter.

    Well, you need relocatable code and indirection via pointers (hidden
    from the user, but part of the runtime system). The main trouble is
    possible changes to the layout of data structures -- if you change the
    layout, then all code depending on it must be recompiled. In the normal
    case the layout is only visible inside a module, so recompiling the
    module is enough. ATM it is up to the user to do what is needed in more
    tricky cases.

    My point: a system that does all this would need all the relevant bits
    in memory, and may involve all sorts of complex components.

    Yes, "relevant bits". But the "relevant bits" are usually much less
    than the whole program. And take into account that intermediate
    data structures during compilation are much bigger than the object
    code.

    But a whole-program compiler that runs apps in-memory already does half the work.

    I used to use independent compilation myself. I moved on to
    whole-program compilation because it was better. But all the issues
    involved with interfacing at the boundaries between modules don't
    completely disappear, they move to the boundaries between programs
    instead; that is, between libraries.

    I consider programs to be different from libraries. A program may
    use several libraries; in the degenerate case a library may be just a
    single module. Compiling the whole program has clear troubles with
    large programs.

    My definition of 'program', on Windows, is a single EXE or DLL file.

    I expect larger applications to consist of a collection of EXE and DLL
    files. My own will have one EXE and zero or more DLLs, but I would also
    make extensive use of scripting modules, that have different rules.

    One way to allow incremental recompilation is to compile each module
    to a separate shared library (roughly corresponding to a Windows DLL).
    At least on Linux this has some overhead: each shared library normally
    needs two or three separate paged areas and some system data
    structures, so an overhead of the order of 10 kB per shared library.
    But on modern Linux it is workable for low thousands of shared
    libraries (I am not sure if it would scale to tens of thousands).

    In a running program you may unload the current version of the shared
    library resulting from a module's compilation and load a new one.

    There are different implementations which do not use the shared-library
    machinery, and which in principle should be much more efficient. But
    with 1200 modules, shared-library handling seems to cope fine.
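
    (A minimal C sketch of that unload/reload step using the POSIX dlopen
    machinery; the module name "modA.so" and the symbol "entry" are made
    up for illustration. Link with -ldl on older glibc:)

        #include <dlfcn.h>
        #include <stdio.h>

        int main(void) {
            /* load the current build of the module */
            void *h = dlopen("./modA.so", RTLD_NOW);
            if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }

            int (*entry)(int) = (int (*)(int))dlsym(h, "entry");
            if (entry) printf("old: %d\n", entry(1));

            /* unload it, then load the freshly recompiled version */
            dlclose(h);
            h = dlopen("./modA.so", RTLD_NOW);
            if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }

            entry = (int (*)(int))dlsym(h, "entry");
            if (entry) printf("new: %d\n", entry(1));

            dlclose(h);
            return 0;
        }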

    The latter being an automatically created amalgamation (produced with
    `mm -ma app`). Build using `mm app`.

    I could provide a single file, a shell archive containing a build script
    and sources, but an important part of providing sources is that people
    can read them, understand them and modify them.

    I don't agree. On Linux you do it with sources because it doesn't have a reliable binary format like Windows that will work on any machine. If
    there are binaries, they might be limited to a particular Linux
    distribution.

    You do not get it. I create binaries that work on 10-year-old Linux
    and new ones, and on distributions that I never tried. Of course,
    I mean that binaries are for a specific architecture, separate
    for i386 Linux and x86_64 Linux (that covers PCs; I would have to
    provide more if I wanted to support more architectures).

    Concerning binary formats, there were two: Linux started with a.out
    and switched to ELF in the second half of the nineties. IIUC it is still
    possible to run old a.out binaries, but support is/was an optional
    addon and most users elected not to install it. Anyway, all Linux
    systems installed in this century are supposed to be able to run
    ELF binaries. There was one change in ELF which means that, by
    default, currently made programs will not run on old systems
    (where old means something like 10 years old), but it is possible
    to create a binary that is compatible with both old and new systems.
    So maybe compatibility of binary formats is not as good as
    in Windows, but in practice it is not a problem.

    But the point is managing dependencies. When you get Linux you
    get a lot of libraries, and normal "good behaviour" on Linux is to
    use the system-provided libraries. Now, libraries change with time,
    and a given library may be replaced by an incompatible one. Also,
    commercial developers like to hardcode various system details
    like the location of system files. And there are utilities and
    language interpreters that change too. One way to deal with
    dependencies is to bundle everything with the binary. AFAIK this is
    the Windows way: Microsoft provides a bunch of "redistributables"
    and vendors are supposed to ship the version that they use, plus
    everything which does not come from Microsoft.

    Linux systems are more varied than Windows. But this is part
    of the point that I made: having sources, people modify them and
    create a lot of similar but (sometimes subtly) different
    variants. Granted, "ordinary users" just want to use the system.
    But almost any developer can make useful changes to programs,
    and such changes propagate to other people and Linux distributions.
    That would be impossible without sources.

    You also ignore educational aspect: some folks fetch sources to
    learn how things are done.

    (Binaries on Linux have always been a mystery to me, starting with the
    fact that they don't have a convenient extension like .exe to even tell
    what it is.)

    Every file in Linux has associated permission bits. In the widest
    sense "executable" means that you have permission to
    execute it. Of course, if you try to execute it the Linux kernel
    must figure out what to do. Here the important part is so-called
    magic numbers: normal executables are supposed to have a few
    bytes at the start which identify the format, like a.out versus ELF,
    the machine architecture, etc. Actually, this is configurable, and
    users can add rules which say what to do given a specific magic
    number. And there is rather general support for interpreted
    executables.
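
    (The magic-number check is a few lines of C; ELF files start with
    0x7F 'E' 'L' 'F', Windows PE/EXE files start with 'M' 'Z':)

        #include <stdio.h>
        #include <string.h>

        int main(int argc, char **argv) {
            if (argc < 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }

            FILE *f = fopen(argv[1], "rb");
            if (!f) { perror(argv[1]); return 1; }

            unsigned char magic[4] = {0};
            fread(magic, 1, 4, f);     /* only a sketch; return value unchecked */
            fclose(f);

            if (memcmp(magic, "\x7f" "ELF", 4) == 0)
                puts("ELF executable");
            else if (magic[0] == 'M' && magic[1] == 'Z')
                puts("MZ/PE executable (Windows)");
            else
                puts("something else");
            return 0;
        }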

    Magic numbers are not much different than the MZ or NE markers in
    Windows binaries. Concerning executables, IIUC in Windows
    there are several executable extensions, and supposedly
    your EXE can have a .com or .bat extension.

    Concerning not having an extension: you can add one if you want;
    moderately popular choices are .exe or .elf. But for using a normal
    Linux executable it should not matter if it is a shell script, an
    interpreted Python file or machine code. So the extension should
    not "give away" the nature of the executable. And having no extension
    means that users are spared needless typing or (some of) the surprises
    of running a different program than they wanted (PATH still
    allows confusion).

    GNU folks have a nice definition
    of source: the "preferred form for making modifications". I would guess
    that 'app.ma' is _not_ your preferred form for making modifications,
    so it is not really true source.

    No, but deriving the true sources from app.ma is trivial, since it is basically a concatenation of the relevant files.

    No less trivial than running 'tar' (which is a standard component
    on Linux).

    And to build from "source" I need
    source first. And I provide _true_ sources to my users.

    If you were on Linux or didn't want to use my compiler, then it's even
    simpler; I would provide exactly one file:

    app.c Generated C source code (via `mc -c app`)

    Here, you need a C compiler only. On Windows, you can build it using
    `gcc app.c -oapp.exe`. Linux needs more options, listed at the top of
    the file.

    Sorry, a generated file is _not_ a source. If I were to modify the C file

    This is not for modifying. 99% of the time when I want to build an open
    source C project, it is in order to produce a running binary, not to spend
    hours trying to get it to build. These are the obstacles I have faced:

    * Struggling with formats like .gz2 that require multiple steps on Windows

    'tar' knows about popular compression formats and can do this in one
    step, also on Windows.

    * Ending up with myriad files scattered across myriad nested directories

    Well, different folks have different ideas about good structure.
    Some think that splitting things into a lot of directories is
    good structure (I try to limit the number of directories and files,
    but probably use many more than you would like). This is not
    a big problem with the right tools.

    * Needing to run './configure' first (this will not work on Windows...)

    I saw one case where a guy tried to run './configure' on Windows NT
    and Windows NT kept crashing. It made a little progress and then
    crashed, so the guy restarted it, hoping that eventually it would
    finish (after a week or two he gave up and used Linux). But
    usually './configure' is not that bad. It may take a lot of
    time; IME a './configure' that runs in seconds on Linux needed
    several minutes on Windows. And of course you need to install
    the essential dependencies; a good program will tell you what you need
    to install first, before running configure. But you need to
    understand what they mean...

    * Finding a 'make' /program/ (my gcc has a program called
    mingw32-make.exe; is that the one?)

    Probably. The normal advice for Windows folks is to install a thing
    called msys (IIUC it is msys2 now) which contains several tools
    including 'make'. You are likely to get it as part of a bigger
    bundle; I am not up to date enough to tell you whether this bundle
    will be called 'gcc' or something else.

    * Getting 'make' to work. Usually it fails partway and makefiles can be
    so complex that I have no way of figuring a way out
    * Or, trying to compile manually, struggling with files which are all
    over the place and imparting that info to a compiler.

    I don't have any interest in this; I just want the binary!

    Well, I provide Linux binaries, but only sources for Windows
    users. One reason is that I have only Linux on my personal
    machine, so to deal with Windows I need to lease a machine.
    A different reason is that I am not paid for programming; I do
    this because I like to program and, to some degree, to build a
    community. But some potential members of the community would
    like to benefit while being unwilling to spend even a little effort.
    Of course, in a big community there may be a lot of "free
    riders" who benefit without contributing anything, with no
    bad effect because other folks will do the needed work. But
    here I am dealing with a small community. I did port
    to Windows to make sure that it actually works and there
    are no serious problems. But I leave to others the creation
    of binaries and the reporting of possible build problems. If
    nobody is willing to do this, then from my point of
    view Windows has no users and is not worth supporting.

    So, with my own programs, if I can't provide a binary (eg. they are not trusted), then one step back from a single binary file is a single amalgamated source file.

    I first did this in 2014, as a reaction to the difficulties I kept
    facing: I wanted any of my applications to be as easy to build as hello.c.

    If someone wants the original, discrete sources, then sure they can have
    a ZIP file, which generally will have files that unpack into a single directory. But it's on request.

    Normally I only deal with programs where I have full sources or I am
    sure I can obtain them. My Linux is installed from binaries, but
    sources are available on public repositories and I know where to
    find them (and in cases where I needed sources I fetched them).

    The difference is that what I provide is genuinely simple: one bare
    compiler, one actual source file.

    Sorry, for me "one file" is not a problem; there is 'tar' (the de facto standard for distributing source code).

    Yeah, I explained how well that works above. So the last Rust
    implementation was a single binary download (great!), but it installed
    itself as 56,000 discrete files across I don't know how many thousands of directories (not so great). And it didn't work (it requires additional tools).

    I did not try Rust. As I wrote, I prefer a moderate number of files,
    but many files are not a big problem, unless your system is deficient.
    To be clearer: if you have too big an allocation unit, then sources
    which should take 20M may take, say, 30M of disk space (that happened
    when I unpacked the sources of 386BSD on DOS). And you need tools
    which can efficiently handle many files.

    Being able to ZIP or TAR a sprawling set of files into a giant binary
    makes it marginally easier to transmit or download, but it doesn't
    really address complexity.

    In my book a single blob of 20M is more problematic than 10000 files
    of 2kB each. At a deeper level the complexity is the same, but the blob
    lacks the useful structure given by division into files and subdirectories.

    And there is possibly quite a large dependency, namely Windows.

    Yeah, my binaries run on Windows. Aside from requiring x64 and using
    Win64 ABI, they use one external library MSVCRT.DLL,

    MSVCRT.DLL _should_ not cause trouble, as this is a C library
    and the functions in it are in principle available on other systems.
    To make _should_ a reality, some care/effort may be needed.

    which itself uses
    Windows.

    For programs to run on both Windows and Linux, that depends on the
    libraries used. For 'M' programs, one module has to be chosen from the
    Windows and Linux versions. To run on Linux, I have to do this:

    mc -c -linux app.m # On Windows, makes app.c, using
    # the Linux-specific module

    gcc app.c -oapp -lm etc # On Linux
    ./app

    but M makes little use of WinAPI. With my interpreter, the process is as follows:

    c:\qx>mc -c -linux qq # On Windows
    M6 Compiling qq.m---------- to qq.c

    Copy qq.c to c:\c then under WSL:

    root@DESKTOP-11:/mnt/c/c# gcc qq.c -oqq -fno-builtin -lm -ldl

    Now I can run scripts under Linux:

    root@DESKTOP-11:/mnt/c/c# ./qq -nosys hello
    Hello, World!

    However, notice the '-nosys' option; this is because qq automatically incorporates a library suite that includes a GUI library based on Win32. Without it, qq would complain about not finding user32.dll etc.

    I would need to dig up an old set of libraries or create new
    Linux-specific ones. A bit of extra work. But see how the entire app is contained within that qq.c file.


    It is not clear how much of your code _usefully_ runs in a non-Windows
    environment.

    OK, let's try my C compiler. Here I've done `mc -c -linux cc`, copied
    cc.q, and compiled under WSL as bcc:

    root@DESKTOP-11:/mnt/c/c# ./bcc -s hello.c
    Compiling hello.c to hello.asm

    root@DESKTOP-11:/mnt/c/c# ./bcc -e hello.c
    Preprocessing hello.c to hello.i

    root@DESKTOP-11:/mnt/c/c# ./bcc -c hello.c
    Compiling hello.c to hello.obj

    root@DESKTOP-11:/mnt/c/c# ./bcc -exe hello.c
    Compiling hello.c to hello.exe
    msvcrt
    msvcrt.dll
    SS code gen error: Can't load search lib

    So, most things actually work; only creating an EXE doesn't work, because
    it needs access to msvcrt.dll. But even if it did, it would work as a cross-compiler, as its code generator is for the Win64 ABI.

    Yes. It was particularly funny when you had the compiler running on a
    Raspberry Pi, but producing Intel code...

    But I think this shows useful stuff can be done. A more interesting test (which used to work, but it's too much effort right now), is to get my M compiler working on Linux (the 'mc' version that targets C), and use
    that to build qq, bcc etc from original sources on Linux.

    In all, my Windows stuff generally works on Linux. Linux stuff generally doesn't work on Windows, in terms of building from source.

    Our experiences differ. There were times when I had to work on a
    Windows machine, and the problem was that Windows does not come with
    tools that I consider essential. But after installing the essential
    tools from binaries, the rest could be installed from sources.
    To be clear, I mean mostly non-GUI stuff.

    C is just so primitive when it comes to this stuff. I'm sure it largely
    works by luck.

    C is at low level, that is clear.

    The way it does modules is crude. So was my scheme in the 1980s, but it
    was still one step up from C. My 2022 scheme is miles above C now.

    Concerning "miles above": using C one can create shared libraries.
    Some shared libraries may be system-provided, some may be private.
    Within the C ecosystem, once you have the corresponding header files you
    can use them as "modules". And they are usable from other languages.
    AFAIK no other module system can match _this_. So, primitive,
    but it works and allows you to do things which would be impossible
    otherwise.
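
    (A minimal sketch of that arrangement in C; the names counter.h,
    counter.c and counter_next are made up for illustration:)

        /* counter.h -- the interface ("module header") */
        #ifndef COUNTER_H
        #define COUNTER_H
        int counter_next(void);
        #endif

        /* counter.c -- the implementation, built as a shared library:
           gcc -shared -fPIC counter.c -o libcounter.so               */
        #include "counter.h"
        static int n = 0;
        int counter_next(void) { return ++n; }

        /* main.c -- a client, in C or any language with a C FFI:
           gcc main.c -L. -lcounter -o main
           run with:  LD_LIBRARY_PATH=. ./main                        */
        #include <stdio.h>
        #include "counter.h"
        int main(void) { printf("%d %d\n", counter_next(), counter_next()); return 0; }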

    The
    underlying language can still be low level, but you can at least fix
    some aspects.

    I think a better module scheme could be retrofitted to C, but I'm not
    going to do it.

    Adding a "better" module system to C is trivial. Adding it in a
    way that preserves the good properties of the current system is tricky.

    A good programming environment should help. C as a language is not
    helpful; one may have a fully compliant and rather unhelpful compiler.
    But real C compilers tend to be as helpful as they can be within the
    limits of the C language. While C still limits what they can do,
    there is quite a lot of difference between a current popular
    compiler and a bare-bones legal compiler. And there are extra
    tools, and here C support is hard to beat.

    I don't agree with spending 1000 times more effort in devising complex
    tools compared with just fixing the language.

    It is cheaper to have 1000 people doing tools than 100000 people
    fixing their programs. If you are a single person, or a small
    group working on both, then adapting the program to the tools may be
    reasonable. You look at what gives more benefit: implementing a
    feature in the compiler, or writing code without some features in
    the compiler. If you choose wisely you can have a simple compiler
    with the features that you need. But this does not scale well.

    So what are the new circles of ideas? All that crap in Rust that makes
    coding a nightmare, and makes building programs dead slow? All those new
    functional languages with esoteric type systems? 6GB IDEs (that used to
    take a minute and a half to load on my old PC)? No thanks.

    The borrow checker in Rust looks like a good idea. There is a good chance
    that the _idea_ will be adopted by several languages in the near future.

    OK. I've heard that that makes coding in Rust harder, and also that it makes compilation slower. Not very enticing features!

    Initial coding is easiest when the compiler reports no errors. But
    if a program is widely used, then errors will cause trouble sooner
    or later. A borrow checker means that programmers must deal
    with problems earlier, which may look harder, but is likely to
    reduce the total work over a longer time.

    Not so new ideas are:
    - removing limitations, that is making sure language constructs
    work as generally as possible (which allows getting rid of many
    special constructs from older languages)
    - nominal, problem-dependent types. That is, types should reflect
    the problem domain. In particular, domains which need types like
    'u32' are somewhat specific; in normal domains the fundamental types
    are different
    - functions as values/parameters. In particular, functions have
    types and can be members of data structures
    - "full rights" for user-defined types, which means whatever
    syntax/special constructs work on built-in types should
    also work for user-defined types
    - function overloading
    - type reconstruction
    - garbage collection
    - exception handling
    - classes/objects

    Are these what your language supports? (If you have your own.)

    I doubt that all are fully implemented in a single language; for example,
    having both function overloading and type reconstruction is a
    balancing act. The language with modules above has quite limited
    support for exceptions: there is a (scoped) "catch all" construct
    to catch errors, but no real exceptions. Here the trouble is: what
    should be the type of exceptions? The language takes type correctness
    very seriously and ATM there is no agreement about a good type.
    There are no real classes and objects. Many things done via
    classes/objects can be done using module features, but not
    all. And here the trouble is to implement nice object semantics
    while keeping the good runtime efficiency of the current system.

    I would say that in other aspects this language is doing
    quite well. "Full rights" for user-defined types is obtained by
    implementing built-in types almost as if they were user-defined types.

    Here I contributed modest changes to the language and larger ones
    to the implementation. I must admit the trouble here is that
    currently several features work as used in existing code, but
    not in general. And you would hate the compiler speed: about
    200 lines per second (but with large variation depending
    on the actual constructs). Note: I would like to improve compiler
    speed; OTOH it is reasonably workable. Namely, an average module
    has about 200 lines and can be compiled in about 1s. And there
    is another compiler for interactive use which compiles somewhat
    faster and allows testing of small pieces of code. So,
    while the compiler is much slower than gcc, the actual turnaround
    time during development is reasonably good.

    I can't say these have ever troubled me. My scripting language has
    garbage collection, and experimental features for exceptions and playing
    with OOP, and one or two taken from functional languages.

    Being dynamic, it has generics built-in. But it deliberately keeps the type system at a simple, practical level (numbers, strings, lists, that sort
    of thing), because the aim is easy coding.

    Well, some people do not think in terms of types. For others, types
    are a significant help; they allow easy coding because when I write
    '*' the compiler chooses the right one based on types (and similarly for
    many other operations). And thanks to types the compiler catches
    a lot of errors quite early. And the resulting code can be quite
    efficient; the main limitation here is the quality of the code generator,
    which is significantly weaker than gcc's ('significantly' meaning
    about half the speed of gcc-compiled C on moderately low-level
    tasks).

    If you want hard, then
    Rust, Ada, Haskell etc are that way -->!

    * Clean, uncluttered brace-free syntax
    Does this count as brace-free?

    for i in 1..10 repeat (print i; s := s + i)

    Not if you just substitute brackets for braces. Brackets (ie "()") are
    OK within one line, otherwise programs look too Lispy.

    There are many possible alternatives: '[' and ']' or '<<' and '>>' :)

    * Case-insensitive
    * 1-based
    * Line-oriented (no semicolons)
    * Print/read as statements

    A lot of folks consider the above misfeatures/bugs.

    I know.

    Concerning
    'line-oriented' and 'intuitive': can you guess which changes to the
    following statement in 'line-oriented' syntax are legal and preserve its meaning?

    nm = x or nm = 'log or nm = 'exp or nm = '%power or
    nm = 'nthRoot or
    nm = 'cosh or nm = 'coth or nm = 'sinh or nm = 'tanh or
    nm = 'sech or nm = 'csch or
    nm = 'acosh or nm = 'acoth or nm = 'asinh or nm = 'atanh or
    nm = 'asech or nm = 'acsch or
    nm = 'Ei or nm = 'erf or nm = 'erfi or nm = 'li or
    nm = 'Gamma or nm = 'digamma or nm = 'dilog or
    nm = '%root_sum =>
    "iterate"

    As a hint let me say that '=' is comparison. And this is a single
    statement; small changes to whitespace will change the parse and lead
    to wrong code or a syntax/type error. BTW: you need a real newsreader
    to see it; Google and the like will change it so it no longer works.

    Thunderbird screws it up as well, unless it is meant to have a ragged
    left edge. But not sure what your point is.

    Yes, there is a ragged left edge. Well, with "line oriented" syntax
    you will get constructs that span more than one line, and you
    need some rules for how they work. The point is that once you get
    into corner cases the rules are unlikely to be intuitive. I mean,
    they will be intuitive once you learn and internalize them, but
    this is not much different from any other syntax.

    Non-line-oriented syntax (like C, like JSON) is better for machine-readable
    code, which can also be transmitted with less risk of garbling. But when
    90% of semicolons in C-style languages coincide with
    end-of-line, you need to start questioning the point of them.

    Well, another sample:

    sub1!(pol, p) ==
        if #pol = 0 then
            pol := new(1, 0)$U32Vector
        if pol(0) = 0 then
            pol(0) := p - 1
        else
            pol(0) := pol(0) - 1
        pol

    While the semantics of this would need some explanation, I hope
    that you agree that this is uncluttered syntax, with no
    braces or semicolons. But there is a price to be paid, namely
    (rare) cases like the previous one.

    BTW: this is strongly typechecked code; 'sub1!' has its type
    declared in the module interface.

    Note that C's preprocessor is line-oriented, but C itself isn't.

    C is still tremendously popular for many reasons. But anyone wanting to
    code today in such a language will be out of luck if they prefer any or all
    of these characteristics. This is why I find coding in my language such
    a pleasure.

    Then, if we are comparing the C language with mine, I offer:

    * Out of order definitions

    That is considered a misfeature in modern times.

    Really? My experiments showed that modern languages (not C or C++) do
    allow out-of-order functions. This gives great freedom in not worrying
    about whether function F must go before G or after, or being able to
    reorder or copy and paste.

    I am not sure what you mean here by "out-of-order functions".
    If you declare functions earlier there is nothing "out-of-order";
    similarly if the language does not need declarations. For me "out-of-order"
    is, say, PL/I, where you need a declaration (possibly implicit) and
    (IIRC) you can define the function before the declaration.

    In modern languages a definition may generate some code to be run, and
    the order in which this code is run matters.

    * One-time definitions (no headers, interfaces etc)
    * Expression-based

    C is mostly expression-based.

    No, it's mostly statement-based. Although it might be that most
    statements are expression statements (a=b; f(); ++c;).

    You can't do 'return switch() {...}' for example, unless using gcc extensions.
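
    (A sketch of what that looks like with the gcc statement-expression
    extension -- non-standard C, and the function name is made up:)

        int describe(int x) {
            return ({ int r;
                      switch (x) {
                          case 0:  r = 100; break;
                          case 1:  r = 200; break;
                          default: r = -1;  break;
                      }
                      r; });
        }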

    That is why I wrote "mostly"; this and a few similar cases mean that it is
    not entirely expression-based. But in C, assignment, the conditional
    and sequencing are expressions, which means that you can do in a single
    expression things that in Pascal or Ada (and several other languages)
    would take multiple statements and possibly extra variables. Note
    that 'switch' in most cases could be replaced by conditionals,
    so looking at the complexity of constructs, only loops, gotos, and declarations/definitions are really excluded. The point is that C allows
    most uses that appear in practical programs.

    There are languages that go further
    than C, for example:

    a := (s = 1; for i in 1..10 repeat (s := s + i); s)

    is legal in the language that I use, but cannot be directly translated to C. However, from the examples that you gave, it looked like your language
    is _less_ expression-based than C.
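
    (A rough GNU C rendering of that example, again leaning on the
    non-standard statement-expression extension, so it is not "directly
    translated" into standard C:)

        #include <stdio.h>

        int main(void) {
            /* the whole block yields the final value of s */
            int a = ({ int s = 1;
                       for (int i = 1; i <= 10; i++) s += i;
                       s; });
            printf("%d\n", a);    /* 56 */
            return 0;
        }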

    I don't use the feature much. I had it from the 80s, then switched to statement-based for a few years to match the scripting language; now
    both are expression-based.

    One reason it's not used more is because it causes problems when
    targetting C. However I like it as a cool feature.

    * Program-wide rather than module-wide compilation unit

    AFAICS C leaves the choice to the implementer. If your language _only_
    supports whole-program compilation, then this would be a
    negative feature.

    * Build direct to executable; no object files or linkers

    That is really a question of implementation. Building _only_
    to an executable is a misfeature (what about the case when I want
    to use a few routines in your language, but the rest, including the
    main program, is in a different language?).

    There were escape routes involving OBJ files, but that's fallen into
    disuse and needs fixing. For example, I can't do `mm -obj app` ATM, but
    could do this once I've cleared some bugs:

    mm -asm app # app.m to app.asm
    aa -obj app # app.asm to app.obj
    gcc app.obj lib.o -oapp.exe # or lib.a?

    This (or something near it) allows static linking of 'lib' instead of
    dynamic linking, or including a lib written in another language.


    [continued in next message]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Sat Dec 17 13:22:24 2022
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    I don't agree. On Linux you do it with sources because it doesn't have a
    reliable binary format like Windows that will work on any machine. If
    there are binaries, they might be limited to a particular Linux
    distribution.

    You do not get it. I create binaries that work on 10 year old Linux
    and new one. And on distributions that I never tried.

    I tried porting a binary from one ARM32 Linux machine to another; it
    didn't work, even 2 minutes later. Maybe it should have worked and there
    was some technical reason why my test failed.

    But I have noticed that on Linux, distributing stuff as giant source
    bundles seems popular. I assumed that was due to difficulties in using binaries.

    Of course,
    I mean that binaries are for specific architecture, separate
    for i386 Linux and x86_64 Linux (that covers PC-s, I would have to
    provide more if I wanted to support more architectures).

    Concerning binary format, there were two: Linux started with a.out
    and switched to ELF in second half of nineties.

    (I don't understand that; a.out is a filename; ELF is a file format.)

    You also ignore educational aspect: some folks fetch sources to
    learn how things are done.

    Sure. Android is open source; so is Firefox. While you can spend years
    reading through the 25,000,000 lines of Linux kernel code.

    Good luck finding out how they work!

    Here I'm concerned only with building stuff that works, and don't want
    to know what directory structure the developers use.

    Concerning not having extension: you can add one if you want,
    moderatly popular choices are .exe or .elf.

    But nobody does. Main problem is in forums like this: if I say
    `hello.exe`, everyone knows that's a binary executable for Windows. But
    if I mention `hello`, how are you supposed to tell that I'm talking
    about a Linux executable?

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c .s .o .a .so, so
    why not actual binaries by convention?

    But for using normal
    Linux executable it should not matter if it is a shell script,
    interpreted Python file or machine code. So exention should
    not "give away" nature of executable.

    You can have a neutral extension that doesn't give it away either. Using
    no extension is not useful: is every file with no extension something
    you can execute?

    But there are also ways to execute .c files directly, and of course .py
    files which are run from source anyway.

    It simply doesn't make sense. On Linux, I can see that executables are displayed on consoles in different colours; what happened when there was
    no colour used?

    And having no extension
    means that users are spared needless typing

    Funny you should bring that up, because every time you run a /C
    compiler/ on a /C source file/, you have to type the extension like this:

    gcc hello.c

    which also writes the output as a.exe or a.out, so you further need to
    write at least:

    gcc hello.c -o hello # hello.exe on Windows

    I would only write this:

    bcc hello

    and it works out, by some very advanced AI, that I want to compile
    hello.c into hello.exe. And once you have hello.exe, you can run it like
    this:

    hello

    You don't need to type .exe. So, paradoxically, having extensions means
    having to type them less often:

    mm -pcl prog # old compiler: translate prog.m to prog.pcl
    pcl -asm prog # prog.pcl to prog.asm
    aa prog # prog.asm to prog.exe
    prog # run it

    At no point did I need to write an extension. It is implied by the
    program I invoked.

    No, but deriving the true sources from app.ma is trivial, since it is
    basically a concatenation of the relevant files.

    Not less trivial than running 'tar' (which is standard component
    on Linux).

    .ma is a text format; you can separate it with a text editor if you want!
    But you don't need to. Effectively you just do:

    gcc app # ie. app.gz2

    and it makes `app` (ie. an ELF binary 'app').

    * Needing to run './configure' first (this will not work on Windows...)

    I saw one case then a guy tried to run './configure' on Windows NT
    and Windows NT kept crashing.

    Possibly you don't quite understand: aside from "./" being a syntax
    error on Windows, 'configure' is a script full of Bash commands which
    invoke all sorts of utilities from Linux. It is meaningless to attempt
    to run it on Windows.

    It would be like my bundling a Windows BAT file with sources intended to
    be built on Linux.


    It made a little progress and than
    crashed, so that guy restarted it hoping that eventually it will
    finish (after a week or two he gave up and used Linux). But
    usually './configure' is not that bad. It make take a lot of
    time, IME './configure' that run in seconds on Linux needed
    several minutes on Windows.

    It can take several minutes on Linux too! Autoconf-generated configure scripts can contain tens of thousands of lines of code.

    Building CPython on Linux (this was a decade ago when it was smaller)
    took 5 minutes from cold. After that, an incremental make (after editing
    one module) took 5 seconds. (Still 25 times longer than building my own interpreter from scratch.)

    Another project, the sources for A68G, also took minutes. On Windows it
    didn't work at all, but by extracting the dozen C files (about 70Kloc) I
    managed to get a part-working version using my bcc compiler. It took under
    a second to build.

    Yet another, the sources for GMP, could only be built on Windows using
    MSYS2. Which I tried; it worked away for an hour, then it failed.

    And of course you need to install
    essential dependencies, good program will tell you what you need
    to install first, before running configure. But you need to
    understand what they mean...

    * Finding a 'make' /program/ (my gcc has a program called
    mingw32-make.exe; is that the one?)

    Probably. Normal advice for Windows folks is to install thing
    called msys (IIUC it is msys2 now) which contains several tools
    incuding 'make'. You are likely to get it as part of bigger
    bundle, I am not up to date to tell you if this bundle will
    be called 'gcc' or something else.

    But that's just a cop-out. As I said above, it's like my delivering a
    build system for Linux that requires so many Windows dependencies, that
    you can only build by installing half of Windows.

    When /I/ provide sources (that is, a representation that is one step
    back from binaries), to build on Linux, then it will build on Linux.
    They will have a dependency on a C compiler that can produce an ELF file,
    and I now stipulate either gcc or tcc.

    I don't have any interest in this; I just want the binary!

    Well, I provide Linux binaries, but only sources for Windows
    users. One reason is that I have only Linux on my personal
    machine, so to deal with Windows I need to lease a machine.
    Different reason is that I an not paid for programming, I do
    this because I like to program and to some degree to build
    community.

    I had the same problem, in reverse. I've spent money on RPis, cheap
    Linux netbooks, spent endless time getting VirtualBox to work, and still
    don't have a suitable Linux machine that Just Works.

    WSL is not interesting since it is still x64, and maybe things will work
    there that will not work on real Linux (eg. it can still run actual Windows
    EXEs; what else is it allowing that wouldn't work on real Linux?).

    I've stopped this since no one has ever expressed any interest in seeing
    my stuff work on Linux, especially on RPi where a very fast alternative
    to C that ran on the actual board would have been useful.

    But if some potential members of community would
    like to benefit but are unwilling to spent little effort.
    Of course, in big community there may be a lot of "free
    riders" who benefit without contributing anything whithout
    bad effect because other folks will do needed work. But
    here I am dealing with small community. I did port
    to Windows to make sure that it actually works and there
    are not serious problems. But I leave to other creation
    of binaries and reporting possible build problems. If
    nobody is willing to do this, then from my point of
    view Windows has no users and is not worth supporting.

    In the 1990s, for my commercial app, about 1 in 1000 users ever enquired
    about versions for Linux. One wanted a Mac version, but apparently my
    app worked well enough under a Windows emulator.

    Being able to ZIP or TAR a sprawling set of files into giant binary
    makes it marginally easier to transmit or download, but it doesn't
    really address complexity.

    In my book single blob of 20M is more problematic than 10000 files,
    2kB each. At deeper level complexity is the same, but blob lacks
    useful structure given by division into files and subdirectories.

    When you download a ready-to-run binary, it will be a single blob. Or a
    single main blob.

    In my case, my 'blob' can be run directly; start with these three files:

    c:\demo>dir
    30/03/2022 13:53 45 hello.m
    17/12/2022 12:52 653,251 mm.ma
    09/12/2022 18:54 471,552 ms.exe

    mm.ma are the sources for my compiler as one text blob. ms.exe is my M
    compiler (when called 'ms', it automatically invokes the '-run' option
    to run from source).

    Now I can run the compiler from that blob without even formally creating
    any executable:

    c:\demo>ms mm hello
    (Building mm.ma)
    Hello World! 12:54:27

    c:\demo>tm ms mm hello
    (Building mm.ma)
    Hello World! 12:54:36
    TM: 0.08

    The second run applied a timer; it took 80ms to build my compiler from
    scratch and use it to build /and/ run Hello.

    On the same machine, gcc takes 0.22 seconds to build hello.c, without
    that minor step of building itself from scratch first.

    If /only/ other language tools were as genuinely effortless and simple
    and fast as mine. Then I would moan a lot less!

    So, most things actually work; only creating EXE doesn't work, because
    it needs access to msvcrt.dll. But even it it did, it would work as a
    cross-compiler, as its code generator is for Win64 ABI.

    Yes. It was particularly funny whan you had compiler running on
    Raspberry Pi, but producing Intel code...

    I did work on a recent project where my x64 code for Win64 ABI could
    work on x64 Linux (not arm64), which seemed exciting, then I discovered
    that WSL could run EXE anyway. That spoilt it and I abandoned it.


    But I think this shows useful stuff can be done. A more interesting test
    (which used to work, but it's too much effort right now), is to get my M
    compiler working on Linux (the 'mc' version that targets C), and use
    that to build qq, bcc etc from original sources on Linux.

    In all, my Windows stuff generally works on Linux. Linux stuff generally
    doesn't work on Windows, in terms of building from source.

    Our experience differ. There were times that I had to work on
    Windows machine and problem was that Windows does not come with
    tools that I consider essential.

    This is the problem. Linux encourages the use of myriad built-in tools
    and utilities. Is it any wonder that stuff is then hard to build
    anywhere else?

    My experience is that the OS provided nothing, especially pre-Windows
    where it basically only provided a file-system. Then you learn to be self-sufficient and to keep your tools lean.

    The way it does modules is crude. So was my scheme in the 1980s, but it
    was still one step up from C. My 2022 scheme is miles above C now.

    Concering "miles above": using C one can create shared libraries.
    Some shared libraries may be system provided, some may be private.
    Within C ecosystem, one you have corresponding header files you
    can use them as "modules". And they are usable with other languages.
    AFAIK no other module system can match _this_.

    Huh? Any language that can produce DLL files can match that. Some can
    produce object files that can be turned into .a and .lib files.

    Regarding C header files, NO language can directly process .h files
    unless that ability has been specifically built in, either by providing
    half a C compiler, or bundling a whole one (eg. Zig comes with Clang).



    I don't agree with spending 1000 times more effort in devising complex
    tools compared with just fixing the language.

    It is cheaper to have 1000 people doing tools, than 100000 people
    fixing their programs.

    Or one person fixing the C language and the compiler. OK, there are lots
    of C compilers, but you don't need 1000 people.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bart on Sat Dec 17 22:35:33 2022
    On 17/12/2022 13:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:

    When /I/ provide sources (that is, a representation that is one step
    back from binaries), to build on Linux, then it will build on Linux.
    They will have a dependency on a C compiler that can produce a ELF file,
    and I now stipulate either gcc or tcc.

    See https://github.com/sal55/langs/tree/master/demo

    This includes mc.c, a generated-C rendering of my M-on-Linux compiler.

    You [antispam] will need gcc or tcc to create a binary on Linux;
    instructions are at the link.

    Once you have a working binary, you can try that on the one-file M
    'true' sources in mc.ma, to create a new binary.

    If that monolithic source file still doesn't cut it for you, I've
    included an extraction program. The readme tells you how to run that,
    and how to run the 2nd compiler on those discrete files to make a third compiler.

    (I've briefly tested those instructions under WSL. It ought to work on
    any 64-bit Linux including ARM, but I can't guarantee it. The C file is
    32Kloc, and the .ma file is 25Kloc.

    If it doesn't work, then forget it. I know it can be made to work, and
    to do so via my one-file distributions.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Sun Dec 18 14:05:09 2022
    On 17/12/2022 14:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:


    (I'm snipping a lot.)


    Of course,
    I mean that binaries are for specific architecture, separate
    for i386 Linux and x86_64 Linux (that covers PC-s, I would have to
    provide more if I wanted to support more architectures).

    Concerning binary format, there were two: Linux started with a.out
    and switched to ELF in second half of nineties.

    (I don't understand that; a.out is a filename; ELF is a file format.)


    "a.out" is an old executable file format. It was also the default name
    a lot of tools used when producing something in that format, and that
    default name has stuck. So if you use a modern gcc to generate an
    executable, and don't give it a name, it uses "a.out" as the name - but
    it will be in ELF format, not "a.out" format. Usually it's best to give
    your executables better names!


    Concerning not having extension: you can add one if you want,
    moderatly popular choices are .exe or .elf.

    But nobody does. Main problem is in forums like this: if I say
    `hello.exe`, everyone knows that's a binary executable for Windows. But
    if I mention `hello`, how are you supposed to tell that I'm talking
    about a Linux executable?


    In the *nix world, the meaning of a file is determined primarily by its contents, along with the "executable" flag, not by its name. This is a
    /good/ thing. If I run a program called "apt", I don't care if it is an
    ELF binary, a Perl script, a BASH shell script, a symbolic link to
    another file, or anything else. I don't want to have to distinguish
    between them, so it's good that they don't need to have the file type as
    part of the name.

    For data files, it can often be convenient to have an extension
    indicating the type - and it is as common on Linux as it is on Windows
    to have ".odt", ".mp3", etc., on data files.

    If you want to know what a file actually is, the "file" command is your
    friend on *nix.

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c -s .o .a .so, so
    why not actual binaries by convention?


    People use extensions where they are useful, and skip them when they are counter-productive (such as for executable programs).

      But for using normal
    Linux executable it should not matter if it is a shell script,
    interpreted Python file or machine code.  So exention should
    not "give away" nature of executable.

    You can have a neutral extension that doesn't give it away either. Using
    no extension is not useful: is every file with no extension something
    you can execute?


    When you are writing code, and you have a function "twiddle" and an
    integer variable "counter", you call them "twiddle" and "counter". You
    don't call them "twiddle_func" and "counter_int". But maybe sometimes
    you find it useful - it's common to write "counter_t" for a type, and
    maybe you'd write "xs" for an array rather than "x". Filenames can
    follow the same principle - naming conventions can be helpful, but you
    don't need to be obsessive about it or you end up with too much focus on
    the wrong thing.

    On *nix, every file with the executable flag can be executed - that's
    what the flag is for.

    Sometimes it is convenient to be able to see which files in a directory
    are executables, directories, etc. That's why "ls" has flags for
    colours or to add indicators for different kinds of files. ("ls -F
    --color").

    But there are also ways to execute .c files directly, and of course .py
    files which are run from source anyway.


    There are standards for that. A text-based file can have a shebang
    comment ("#! /usr/bin/bash", or similar) to let the shell know what
    interpreter to use. This lets you distinguish between "python2" and
    "python3", for example, which is a big improvement over Windows-style
    file associations that can only handle one interpreter for each file
    type. And the *nix system distinguishes between executable files and non-executables by the executable flag - that way you don't accidentally
    try to execute non-executable Python files.


    To be honest, it really does not bother me if there are file extensions
    on programs, or if there are no file extensions. For my own
    executables, I will usually have ".py" or ".sh" for Python or shell
    files, and no extension for compiled files. But that's because there's
    a fair chance I'll want to modify or update the file at some point, not
    because it makes a big difference when it is running.


    It simply doesn't make sense. On Linux, I can see that executables are displayed on consoles in different colours; what happened when there was
    no colour used?


    Try "ls -l", or "ls -F". It's been a long time since I used a computer
    display that did not have colour, but I do not remember it being a
    problem on the Sun workstations at university. (I do remember how
    god-awful ugly and limited it was going back to all-caps
    case-insensitive 8 character DOS/Win16 filenames on a PC at home. At
    least most of these limitations are now outdated even on Windows.)

    And having no extension
    means that users are spared needless typing

    Funny you should bring that up, because every time you run a /C
    compiler/ on a /C source file/, you have to type the extension like this:

        gcc hello.c

    You do realise that gcc can handle some 30-odd different file types?
    It's not a simple C compiler that assumes everything it is given is a C
    file. Of course you have to give the full name of the file. (You can
    also tell gcc exactly how you want the file to be interpreted, if you
    are doing something funny.)


    which also writes the output as a.exe or a.out, so you further need to
    write at least:

       gcc hello.c -o hello           # hello.exe on Windows

    I would only write this:

       bcc hello

    On Linux, you just write "make hello" - you don't need a makefile for
    simple cases like that. (And the "advanced AI" can figure out if it is
    C, C++, Fortran, or several other languages.)


    and it works out, by some very advanced AI, that I want to compile
    hello.c into hello.exe. And once you have hello.exe, you can run it like this:

       hello

    You don't need to type .exe. So, paradoxically, having extensions means having to type them less often:

       mm -pcl prog     # old compiler: translate prog.m to prog.pcl
       pcl -asm prog    # prog.pcl to prog.asm
       aa prog          # prog.asm to prog.exe
       prog             # run it

    At no point did I need to write an extension. It is implied by the
    program I invoked.

    That's fine for programs that handle just one file type.

    But I'm a little confused here. On the one hand, you are saying how
    terrible Linux is for not using file extensions. On the other hand, you
    are saying how wonderful your own tools are because they don't need file extensions.

    Could it be simply that file extensions are sometimes helpful, sometimes inconvenient or irrelevant, and mostly it all just works without much
    trouble?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Sun Dec 18 14:38:27 2022
    On 2022-12-18 14:05, David Brown wrote:

    Could it be simply that file extensions are sometimes helpful, sometimes inconvenient or irrelevant, and mostly it all just works without much trouble?

    Extensions are a way to convey the file type, e.g. a hint about which
    operations are supposed to work with the file, if the system is weakly
    typed.

    Some early OSes supported tagging files externally. The type was kept
    by the filesystem, not in the file. Of course, such systems became
    cluttered in the presence of access rights: a file executable for one user
    could be non-executable for another. Anyway, they did not advance much, as
    DOS and UNIX united, scorched and salted the ground.

    UNIX, as always, combined the worst available options in the most peculiar
    way. The idea of reading a file before accessing it is as stupid as it
    sounds. Isn't reading an access?
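
    (For concreteness: that content-based typing mostly comes down to reading
    a few leading "magic" bytes, which is what the 'file' utility and the
    kernel's program loader do. A minimal C sketch, illustration only; the
    ELF and shebang magic values are the standard ones, everything else is
    made up:)

        /* Sketch: guess a file's type from its first bytes, roughly what
           'file' or the kernel's exec loader does. Illustrative only. */
        #include <stdio.h>
        #include <string.h>

        static const char *sniff(const char *path)
        {
            unsigned char buf[4] = {0};
            FILE *f = fopen(path, "rb");
            if (!f) return "unreadable";
            size_t n = fread(buf, 1, sizeof buf, f);
            fclose(f);
            if (n >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0) return "ELF binary";
            if (n >= 2 && buf[0] == '#' && buf[1] == '!')    return "script with a shebang line";
            return "something else";
        }

        int main(int argc, char **argv)
        {
            for (int i = 1; i < argc; i++)
                printf("%s: %s\n", argv[i], sniff(argv[i]));
            return 0;
        }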

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Sun Dec 18 17:09:49 2022
    On 18/12/2022 13:05, David Brown wrote:
    On 17/12/2022 14:22, Bart wrote:


    For data files, it can often be convenient to have an extension
    indicating the type - and it is as common on Linux as it is on Windows
    to have ".odt", ".mp3", etc., on data files.

    It's convenient for all files. And before you say, I can add a .exe
    extension if I want: I don't want to have to write that every time I run
    that program.

    People use extensions where they are useful, and skip them when they are counter-productive (such as for executable programs).

    I can't imagine all my EXE (and perhaps BAT) files having no
    extensions. Try and envisage all your .c files having no extensions by
    default. How do you even tell that they are C sources and not Python or not executables?

    When you are writing code, and you have a function "twiddle" and an
    integer variable "counter", you call them "twiddle" and "counter".  You don't call them "twiddle_func" and "counter_int".  But maybe sometimes
    you find it useful - it's common to write "counter_t" for a type, and
    maybe you'd write "xs" for an array rather than "x".  Filenames can
    follow the same principle - naming conventions can be helpful, but you
    don't need to be obsessive about it or you end up with too much focus on
    the wrong thing.

    But you /do/ write twiddle.c, twiddle.s, twiddle.o, twiddle.cpp,
    twiddle.h etc? Yet the most important file of all is just plain 'twiddle'!

    In casual writing or conversation, how do you distinguish 'twiddle the
    binary executable' from 'twiddle the folder', from 'twiddle the
    application' (an installation), from 'twiddle' the project etc, without
    having to use that qualification?

    Using 'twiddle.exe' does that succinctly and unequivocally.


    On *nix, every file with the executable flag can be executed - that's
    what the flag is for.

    Sometimes it is convenient to be able to see which files in a directory
    are executables, directories, etc.  That's why "ls" has flags for
    colours or to add indicators for different kinds of files.  ("ls -F --color").

    As I said, if it's convenient for data and source files, it's convenient
    for all files.

    But there are also ways to execute .c files directly, and of course
    .py files which are run from source anyway.


    There are standards for that.  A text-based file can have a shebang
    comment ("#! /usr/bin/bash", or similar) to let the shell know what interpreter to use.  This lets you distinguish between "python2" and "python3", for example, which is a big improvement over Windows-style
    file associations that can only handle one interpreter for each file
    type.

    That is invasive. It takes something that is really an attribute of the
    file name and puts it not only inside the file, but in a way that requires
    the file to be opened and read to find out.

    (Presumably every language that runs on Linux needs to accept '#' as a
    line comment? And you need to build into every one of 10,000 source
    files the direct location of the Python2 or Python3 installation on that
    machine? Is that portable across OSes? But I expect it's smarter than that.)

    With Python, you're still left with the fact that you see a file with a
    .py extension, and don't know if it's Py2 or Py3, or Py3.10 or Py3.11,
    or whether it's a program that works with any version. It is a separate
    problem from having, as convention, no extensions for ELF binary files.

      And the *nix system distinguishes between executable files and
    non-executables by the executable flag - that way you don't accidentally
    try to execute non-executable Python files.

    (So there are files that contain Python code that are non-executable?
    Then what is the point?)
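
    (For concreteness, the executable-flag check mentioned above is ordinary
    file metadata, queried with stat() or access() rather than read from the
    file's contents. A minimal sketch, assuming a POSIX system; illustration
    only:)

        /* Sketch: report the executable bit the way a shell or 'ls -F'
           might, using stat() for the mode bits and access() for "can the
           current user actually run it". */
        #include <stdio.h>
        #include <sys/stat.h>
        #include <unistd.h>

        int main(int argc, char **argv)
        {
            for (int i = 1; i < argc; i++) {
                struct stat st;
                if (stat(argv[i], &st) != 0) { perror(argv[i]); continue; }
                int xbit = (st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;
                int mine = (access(argv[i], X_OK) == 0);
                printf("%s: exec bit %s, runnable by me: %s\n",
                       argv[i], xbit ? "set" : "clear", mine ? "yes" : "no");
            }
            return 0;
        }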


    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc, it
    is with the name of a .c source file. And 99.9% of the times when I
    invoke it on prog.c as the first or only file to create an executable,
    then I want to create prog.exe.

    So its behaviour is unhelpful. After the 10,000th time you have to type
    .c, or backspace over .c to get at the name itself to modify, it becomes tedious.

    Now it's not that hard to write a wrapper script or program on top of
    gcc.exe, but if it isn't hard, why doesn't it just do that?

    It's not a simple C compiler that assumes everything it is given is a C
    file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension, and also,
    bizarrely, generates `a.out` as the object file name.

    If you intend to assemble three .s files to object files, using separate
    'as' invocations, they will all be called a.out!

    That would be crass even for a toy program written by a student. And yet
    here it is in a mainstream product used by millions of people.

    All my language programs (and many of my apps) have a primary type of
    input file, and will default to that file extension if omitted. Anything
    else (eg .dll files) needs the full extension.
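
    (A hedged sketch of that default-extension convention, in C; this is not
    the actual bcc/mm code, and 'add_default_ext' is just an illustrative
    helper name:)

        /* Sketch: if the name the user typed has no extension, append the
           tool's primary one (".c" here); otherwise leave it alone. */
        #include <stdio.h>
        #include <string.h>

        static void add_default_ext(const char *name, const char *ext,
                                    char *out, size_t outsize)
        {
            const char *dot   = strrchr(name, '.');
            const char *slash = strrchr(name, '/');
            /* a dot only counts as an extension if it follows the last
               path separator; "dir.1/prog" has no extension */
            if (dot && (!slash || dot > slash))
                snprintf(out, outsize, "%s", name);        /* keep as given  */
            else
                snprintf(out, outsize, "%s%s", name, ext); /* append default */
        }

        int main(void)
        {
            char buf[256];
            add_default_ext("hello", ".c", buf, sizeof buf);
            printf("%s\n", buf);     /* hello.c   */
            add_default_ext("hello.asm", ".c", buf, sizeof buf);
            printf("%s\n", buf);     /* hello.asm */
            return 0;
        }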

    Here's something funny: take hello.c and rename to 'hello', with no
    extension. If I try and compile it:

    gcc hello

    it says: hello: file not recognised: file format not recognised. Trying
    'gcc hello.' is worse: it can't see the file at all.

    So first, on Linux, where file extensions are supposed to be optional,
    gcc can't cope with a missing .c extension; you have to provide extra
    info. Second, on Linux, "hello" is a distinct file from "hello.".

    With bcc, I just have to type "bcc hello." to make it work. A trailing
    dot means an empty extension.

    On Linux, you just write "make hello" - you don't need a makefile for
    simple cases like that.

    OK... so how does 'make' figure out the file extension?

    'Make' anyway has different behaviour:

    * It can choose not to compile

    * On Windows, it says this:

    c:\yyy>make hello
    cc hello.c -o hello
    process_begin: CreateProcess(NULL, cc hello.c -o hello, ...) failed.
    make (e=2): The system cannot find the file specified.
    <builtin>: recipe for target 'hello' failed
    make: *** [hello] Error 2

    * I also use several C compilers; how does make know which one I intend?
    How do I pass it options?

    If I give another example:

    c:\c>bcc cipher hmac sha2
    Compiling cipher.c to cipher.asm
    Compiling hmac.c to hmac.asm
    Compiling sha2.c to sha2.asm
    Assembling to cipher.exe

    it just works. 'make cipher hmac sha2' doesn't, not even in WSL.

    (And the "advanced AI" can figure out if it is
    C, C++, Fortran, or several other languages.)

    No, it can't. If I have hello.c and hello.cpp, it will favour the .c file.

    But suppose you did have just the one plausible program; why not build
    that logic into the compiler as I said above?

    That's fine for programs that handle just one file type.

    Like a .x file for a compiler for language X?

    But I'm a little confused here.  On the one hand, you are saying how terrible Linux is for not using file extensions.  On the other hand, you
    are saying how wonderful your own tools are because they don't need file extensions.

    That's why I said 'paradoxically'. The extensions are needed so you can
    be confident the tool will take a .x input and produce a .y output (and
    my tools will always confirm exactly what they will do).

    And so that you can identify .x and .y files on directory listings. But
    you don't want to keep typing .x and maybe .y thousands and thousands of
    times.

    Even if most of the time you use an IDE or some other project manager,
    you will be working from a console prompt often enough, for various custom
    builds, tests and debugging, for explicit extensions to become a nuisance.

    Here is an example of actual use:

    c:\mx>mc -c mm
    M6 Compiling mm.m---------- to mm.c

    c:\mx>bcc -s mm
    Compiling mm.c to mm.asm

    c:\mx>aa mm
    Assembling mm.asm to mm.exe

    c:\mx>mm
    M Compiler [M6] 18-Dec-2022 15:06:29 ...

    Notice how it makes clear exactly what x and y are at each step. gcc
    either says nothing, or --verbose gives a wall of gobbledygook.



    Could it be simply that file extensions are sometimes helpful, sometimes inconvenient or irrelevant, and mostly it all just works without much trouble?

    File extensions are tremendously helpful. But that doesn't mean you have
    to keep typing them! They just have to be there.

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sun Dec 18 17:17:51 2022
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 13:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:

    When /I/ provide sources (that is, a representation that is one step
    back from binaries), to build on Linux, then it will build on Linux.
    They will have a dependency on a C compiler that can produce a ELF file, and I now stipulate either gcc or tcc.

    See https://github.com/sal55/langs/tree/master/demo

    This includes mc.c, a generated-C rendering of my M-on-Linux compiler.

    You [antispam] will need gcc or tcc to create a binary on Linux;
    instructions are at the link.

    Once you have a working binary, you can try that on the one-file M
    'true' sources in mc.ma, to create a new binary.

    If that monolithic source file still doesn't cut it for you, I've
    included an extraction program. The readme tells you how to run that,
    and how to run the 2nd compiler on those discrete files to make a third compiler.

    (I've briefly tested those instructions under WSL. It ought to work on
    any 64-bit Linux including ARM, but I can't guarantee it. The C file is 32Kloc, and the .ma file is 25Kloc.

    If it doesn't work, then forget it. I know it can be made to work, and
    to do so via my one-file distributions.)

    It works on 64-bit AMD/Intel Linux. As-is it failed on 64-bit ARM.
    More precisely, the initial 'mc.c' compiled fine, but it could not
    run 'gcc'. Namely, ARM gcc does not have a '-m64' option. Once
    I removed this it worked.

    So you may want to change this:

    --- fred.nn/mm_winc.m 2022-12-18 14:52:37.635098030 +0000
    +++ fred.nn2/mm_winc.m 2022-12-18 16:02:10.494914440 +0000
    @@ -52,7 +52,7 @@

    case ccompiler
    when gcc_cc then
    - fprint @&.str,"gcc -m64 # # -o# # -s ",
    + fprint @&.str,"gcc # # -o# # -s ",
    (doobj|"-c"|""),(optimise|"-O3"|""),exefile, cfile
    when tcc_cc then
    fprint @&.str,f"tcc # -o# # # -luser32 c:\windows\system32\kernel32.dll -fdollars-in-identifiers",
    @@ -88,7 +88,7 @@

    case ccompiler
    when gcc_cc then
    - fprint @&.str,"gcc -m64 # # -o# # -lm -ldl -s -fno-builtin",
    + fprint @&.str,"gcc # # -o# # -lm -ldl -s -fno-builtin",
    (doobj|"-c"|""),(optimise|"-O3"|""),&.newexefile, cfile
    when tcc_cc then
    fprint @&.str,"tcc # -o# # -lm -ldl -fdollars-in-identifiers",

    Also, I noticed that a lot of system stuff is just stubs. You may
    want the following implementation of 'os_getsystime':

    --- fred/mlinux.m 2022-12-18 14:52:42.831097
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Sun Dec 18 23:18:27 2022
    On 18/12/2022 17:17, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 13:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:

    When /I/ provide sources (that is, a representation that is one step
    back from binaries), to build on Linux, then it will build on Linux.
    They will have a dependency on a C compiler that can produce a ELF file, and I now stipulate either gcc or tcc.

    See https://github.com/sal55/langs/tree/master/demo

    This includes mc.c, a generated-C rendering of my M-on-Linux compiler.

    You [antispam] will need gcc or tcc to create a binary on Linux;
    instructions are at the link.

    Once you have a working binary, you can try that on the one-file M
    'true' sources in mc.ma, to create a new binary.

    It works on 64-bit AMD/Intel Linux.

    Wow, you actually had a look! (Usually no one ever bothers.)

    As-is it failed on 64-bit ARM.
    More precisly, initial 'mc.c' compiled fine, but it could not
    run 'gcc'. Namely, ARM gcc does not have '-m64' option. Once
    I removed this it works.

    gcc doesn't have `-m64`, really? I'm sure I've used it even on ARM. (How
    do you tell it to generate ARM32 rather than ARM64 code?)

    But I've taken that out for now. (Programs can also be built using `./mc
    -c prog` then compiling prog.c manually. That reminds me I haven't yet
    provided help text specific to 'mc'.)


    I tested this using the following program:

        proc main=
            rsystemtime tm
            os_getsystime(&tm)
            println tm.second
            println tm.minute
            println tm.hour
            println tm.day
            println tm.month
            println tm.year
        end


    It's funny you picked on that, because the original version of my
    hello.m also printed out the time:

        proc main=
            println "Hello World!",$time
        end

    This was to ensure I was actually running the just-built version, and
    not the last of the 1000s of previous ones. But the time-of-day support
    for Linux wasn't ready so I left it out.

    I've updated the mc.c/mc.ma files (not hello.m, I'm sure you can fix that).

    However getting this to work on Linux wasn't easy as it kept crashing.
    The 'struct tm' record ostensibly has 9 fields of int32, so has a size
    of 36 bytes. And on Windows it is. But on Linux, a test program reported
    the size as 56 bytes.

    Doing -E on that program under Linux, the struct actually looks like this:

        struct tm
        {
            int tm_sec;
            int tm_min;
            int tm_hour;
            int tm_mday;
            int tm_mon;
            int tm_year;
            int tm_wday;
            int tm_yday;
            int tm_isdst;

            long int tm_gmtoff;
            const char *tm_zone;
        };

    The 16 extra bytes for fields not mentioned in the 'man' docs, plus 4
    bytes of alignment padding, account for the 20-byte difference. This is
    typical of the problems in adapting C APIs to the FFIs of other languages.
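
    A quick way to see the discrepancy is to print the size and the offsets
    of the extra fields. On glibc/x86-64 the program below reports 56 bytes,
    matching the figure above; a typical Windows C library, which has no
    tm_gmtoff/tm_zone, gives 36. (The _DEFAULT_SOURCE define is only there
    so that glibc exposes the two extension fields.)

        #define _DEFAULT_SOURCE 1   /* make glibc expose tm_gmtoff/tm_zone */
        #include <stdio.h>
        #include <stddef.h>
        #include <time.h>

        int main(void)
        {
            printf("sizeof(struct tm)   = %zu\n", sizeof(struct tm));
            printf("offsetof(tm_isdst)  = %zu\n", offsetof(struct tm, tm_isdst));
        #ifdef __GLIBC__
            /* glibc/BSD extensions beyond the nine documented int fields */
            printf("offsetof(tm_gmtoff) = %zu\n", offsetof(struct tm, tm_gmtoff));
            printf("offsetof(tm_zone)   = %zu\n", offsetof(struct tm, tm_zone));
        #endif
            return 0;
        }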

    BTW: I still doubt that 'mc.ma' expands to true source: do you
    really write no comments in your code?

    The file was detabbed and decommented, as the comments would be full of
    ancient crap, mainly debugging code that never got removed. I've tidied
    most of that up, and now the file is just detabbed (otherwise things
    won't line up properly). Note the sources are not heavily commented anyway.

    It will always be a snapshot of the actual sources, which are not kept
    on-line and can change every few seconds.

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Mon Dec 19 06:15:26 2022
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    I don't agree. On Linux you do it with sources because it doesn't have a
    reliable binary format like Windows that will work on any machine. If
    there are binaries, they might be limited to a particular Linux
    distribution.

    You do not get it. I create binaries that work on 10-year-old Linux
    and on new ones. And on distributions that I never tried.

    I tried porting a binary from one ARM32 Linux machine to another; it
    didn't work, even 2 minutes later. Maybe it should have worked and there
    was some technical reason why my test failed.

    But I have noticed that on Linux, distributing stuff as giant source
    bundles seems popular. I assumed that was due to difficulties in using binaries.

    Creating binaries that work on many systems requires some effort.
    The easiest way is to create them on the oldest system that you expect
    to be used: typically a compile on Linux will try to take advantage
    of features of the processor and system, and an older system/processor
    may lack those features and fail. As I wrote, one needs to limit
    dependencies or bundle them. Bundling may lead to a very large
    download size.

    Distributing sources has many advantages: the program can be better
    optimized for the concrete machine, and if there is trouble other folks
    may fix it. If your program gains some popularity, then it is
    likely to become part of a Linux distribution. In that case the
    distribution maintainers will create binary packages and the package
    system will take care of dependencies.

    And do not forget that open source means that users can get the source.
    If the source is not available, most users will ignore your program.

    Concerning binaries for ARM, they are a bit more problematic than
    for Intel/AMD. Namely, there are a lot of different variants of
    ARM processors and 3 different instruction encodings in popular
    use: 32-bit instructions operating on 32-bit data (original ARM),
    16- or 32-bit instructions operating on 32-bit data (Thumb), and 32-bit
    instructions operating on 64-bit data (AArch64). And there are versions
    of the ARM architecture, with new versions adding new useful
    instructions. And not all ARM processors have an FPU.

    The original Raspberry Pi used architecture version 6 and had an FPU.
    The first Linux for the Raspberry Pi had code which allowed linking with
    code for machines with no FPU (using emulation); this was supposed to
    be portable, but slowed floating-point computations. Quickly this was
    replaced by a version which does not allow linking with code using
    FPU emulation. The Chinese variants that I have use architecture
    version 7. In version 6 the Thumb instruction encoding led to
    much slower code. In version 7 Thumb instructions are almost
    as fast as ARM ones, but give significantly smaller code.
    Some Linux versions for those boards use the ARM instruction
    encoding for most programs, others use the Thumb instruction
    encoding. A program for version 7 using Thumb instructions will
    almost surely use instructions not present in version 6,
    so will fail on older machines.

    For some time the Raspberry Pi has used a version 8 processor, which is
    64-bit capable. But at first those machines used to run a 32-bit system,
    and only recently has a 64-bit system for ARM become popular. Note that
    technically a 64-bit ARM processor can run 32-bit code and
    IIUC a 64-bit system could provide compatibility with 32-bit
    binaries. But unlike Intel/AMD machines, where Linux for a
    long time provided support for running 32-bit binaries on
    64-bit machines, it seems that 64-bit Linux distributions for
    ARM have no support for 32-bit binaries. So there is less
    compatibility than technically possible.

    My binaries for ARM use the 32-bit encoding in version 5, which
    should be portable to any 32-bit ARM produced in the last 10 years.
    They are some percent slower, because they do not use newer
    instructions, and some percent larger due to not using the Thumb
    encoding. And I limited the dependencies of the binary. One gets
    _much_ more functionality by compiling from source; the binary
    is just for bootstrap (there is no "C target", so one needs an
    initial binary compiler to compile the rest).

    Of course,
    I mean that binaries are for a specific architecture, separate
    for i386 Linux and x86_64 Linux (that covers PCs; I would have to
    provide more if I wanted to support more architectures).

    Concerning binary formats, there were two: Linux started with a.out
    and switched to ELF in the second half of the nineties.

    (I don't understand that; a.out is a filename; ELF is a file format.)

    a.out is also the name of an executable format.

    You also ignore educational aspect: some folks fetch sources to
    learn how things are done.

    Sure. Android is open source; so is Firefox. Meanwhile you can spend years reading through the 25,000,000 lines of Linux kernel code.

    Good luck finding out how they work!

    Things have structure; the kernel is divided into subdirectories in
    a reasonably logical way. And there are tools like 'grep', which can
    find the relevant thing in seconds. This is not pure theory: I had a
    puzzling problem in which binaries were failing on newer Linux
    distributions. It took some time to solve, but another guy
    hinted to me that this might be tightened "security" in the kernel.
    Even after the hint my first try did not work. But I was able
    to find the relevant code in the kernel and then it became clear
    what to do.

    FYI: The problem was with executing machine code generated at
    runtime. Originally the program (Poplog) depended on the old default
    memory permissions for newly allocated memory being read, write,
    execute (the program explicitly requested the old default). But the
    Linux kernel, starting from 5.8, removed execute permission from the
    old default, and one had to make another system call to get execute
    permissions.
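
    In C, the fix described (asking for execute permission explicitly rather
    than relying on the old default) looks roughly like the sketch below.
    This is not Poplog's actual code, the page size is assumed to be 4096,
    and the embedded machine code is x86-64 only:

        /* Sketch: map writable memory, copy generated code into it, then
           ask explicitly for read+execute with mprotect() instead of
           relying on a read-implies-exec default. */
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>

        int main(void)
        {
            /* x86-64 code for: mov eax,42 ; ret */
            static const unsigned char code[] = { 0xb8, 0x2a, 0, 0, 0, 0xc3 };

            void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (p == MAP_FAILED) { perror("mmap"); return 1; }

            memcpy(p, code, sizeof code);
            if (mprotect(p, 4096, PROT_READ | PROT_EXEC) != 0) {
                perror("mprotect");
                return 1;
            }

            int (*fn)(void) = (int (*)(void))p;   /* non-standard but usual */
            printf("generated code returned %d\n", fn());
            return 0;
        }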

    As another example, I observed somewhat strange behaviour from
    USB-to-serial converters. I looked at the corresponding kernel
    driver and at least part of it was explained: instead of taking a
    number (a clock divisor) to set the speed, the chip had some funky
    way of setting the speed which limited the possible speeds.

    Here I'm concerned only with building stuff that works, and don't want
    to know what directory structure the developers use.

    Concerning not having an extension: you can add one if you want;
    moderately popular choices are .exe or .elf.

    But nobody does. Main problem is in forums like this: if I say
    `hello.exe`, everyone knows that's a binary executable for Windows.

    It may be for Linux...

    But
    if I mention `hello`, how are you supposed to tell that I'm talking
    about a Linux executable?

    Most people are intelligent enough to realize that the talk is about
    an executable. And if there is a possibility of confusion, a reasonable
    speaker/writer will explicitly mention the executable. Essentially
    the only case where extensions help is when you want to set up
    some automation based on file extensions. That is usually part
    of a bigger process, and using (adding) an extension like '.exe' or '.elf'
    solves the problem.

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c .s .o .a .so, so
    why not actual binaries by convention?

    Here you miss the virtue of simplicity: binaries are started by the
    kernel and you pass the filename of the binary to the system call. No
    messing with extensions there. There are similar library calls that
    do a search based on PATH, again with no messing with extensions.
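
    (Those PATH-searching library calls are the execvp()/execlp() family;
    the kernel-level execve() itself only ever gets an exact path. A minimal
    sketch, which simply replaces the current process with 'ls -F':)

        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            char *args[] = { "ls", "-F", NULL };
            /* execvp() searches PATH for "ls"; execv("/bin/ls", args) would
               need the exact path, which is what the kernel sees. Neither
               looks at file extensions. */
            execvp("ls", args);
            perror("execvp");   /* reached only if the exec failed */
            return 127;
        }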

    But for using a normal
    Linux executable it should not matter whether it is a shell script,
    an interpreted Python file or machine code. So the extension should
    not "give away" the nature of the executable.

    You can have a neutral extension that doesn't give it away either. Using
    no extension is not useful: is every file with no extension something
    you can execute?

    But there are also ways to execute .c files directly, and of course .py
    files which are run from source anyway.

    It simply doesn't make sense.

    It makes sense if you know that an executable in the PATH is
    simultaneously a shell command. You see, there are folks who really do
    not like useless clutter in their command lines. And before calling an
    executable from a shell script you may wish to check if it is available.
    Having a different extension for calling and for access as a normal
    file would complicate scripts.

    On Linux, I can see that executables are
    displayed on consoles in different colours; what happened when there was
    no colour used?

    There is 'ls -l', which gives rather detailed information. Or 'ls -F',
    which appends a star to the names of executables (and a slash to directory names).

    And having no extension
    means that users are spared needless typing

    Funny you should bring that up, because every time you run a /C
    compiler/ on a /C source file/, you have to type the extension like this:

    gcc hello.c

    which also writes the output as a.exe or a.out, so you further need to
    write at least:

    gcc hello.c -o hello # hello.exe on Windows

    You can write

    make hello

    (this works without a Makefile, just using default make rules).
    And for quick and dirty testing 'a.out' is fine as a file name.

    I would only write this:

    bcc hello

    and it works out, by some very advanced AI, that I want to compile
    hello.c into hello.exe. And once you have hello.exe, you can run it like this:

    hello

    You don't need to type .exe. So, paradoxically, having extensions means having to type them less often:

    mm -pcl prog # old compiler: translate prog.m to prog.pcl
    pcl -asm prog # prog.pcl to prog.asm
    aa prog # prog.asm to prog.exe
    prog # run it

    At no point did I need to write an extension. It is implied by the
    program I invoked.

    Implied extensions have the trouble that somebody else may try to hijack
    them. I use the TeX system, which produces .dvi files. IIUC they could
    be easily mishandled by systems depending just on the extension. And in
    the area of programming languages at least three languages compete for
    the .p extension.

    No, but deriving the true sources from app.ma is trivial, since it is
    basically a concatenation of the relevant files.

    No less trivial than running 'tar' (which is a standard component
    on Linux).

    .ma is a text format; you can separate with a text editor if you want!
    But you don't need to. Effectively you just do:

    gcc app # ie. app.gz2

    and it makes `app` (ie. an ELF binary 'app.').

    * Needing to run './configure' first (this will not work on Windows...)

    I saw one case then a guy tried to run './configure' on Windows NT
    and Windows NT kept crashing.

    Possibly you don't quite understand: aside from "./" being a syntax
    error on Windows,

    AFAIK '/' is legal in Windows pathnames (even though many programs
    do not support it). I am not sure about leading dot.

    'configure' is a script full of Bash commands which
    invoke all sorts of utilities from Linux. It is meaningless to attempt
    to run it on Windows.

    You probably do not understand that 'configure' scripts use POSIX
    commands. The POSIX commands were mostly based on Unix commands, but
    the original motivation was that various OS-s had quite different
    ways of doing simple things. Influential users (like governments)
    got tired of this, and in response came an industry standard, that
    is POSIX. Base POSIX was carefully designed so that the commands
    could be provided on many different systems. IBM MVS had trouble
    because it did not have timestamps on files, but IBM solved
    this by providing a "UNIX" subsystem on MVS (I write it in quotes
    because it has significant differences from normal Unix).
    IIUC other OS providers did not have problems. There is more
    to POSIX than commands, and to sell its systems to the US government
    Microsoft claimed that Windows is "certified POSIX". Microsoft
    being Microsoft made its POSIX compatibility as useless as
    possible while satisfying the letter of the standard. And did not
    provide it at all in home versions of Windows. But you
    can get POSIX tools from third parties and they run fine.

    BTW1: A few years ago Microsoft noticed that the lack of de-facto
    POSIX compatibility was hurting them, so they started WSL.

    BTW2: At least the normal autoconf macros are quite careful to use
    only POSIX constructs. 'configure' scripts would be shorter
    and simpler if one could assume that they are executed by
    'bash' (which has a bunch of useful features not present in
    POSIX). And similarly for the executed commands: if one could assume
    the versions of commands usually present on Linux, 'configure'
    would be simpler.

    BTW3: You could get DOS versions of several Unix commands in the
    late eighties. They were used not for porting, but simply
    because they were useful.

    It would be like my bundling a Windows BAT file with sources intended to
    be built on Linux.

    There are two important differences:
    - COMMAND.COM is very crappy as a command processor, unlike the Unix
    shell, which from the start was designed as a programming language.
    IIUC that is changing with PowerShell.
    - you compare a thing which was designed to be a portability layer with
    a platform-specific thing.

    BTW: I heard that some folks wrote shell compatible with PowerShell
    for Linux. Traditional Linux users probably do not care much
    about this, but would not object if you use it (it is just "another
    scripting language").

    It made a little progress and then
    crashed, so that guy restarted it, hoping that eventually it would
    finish (after a week or two he gave up and used Linux). But
    usually './configure' is not that bad. It may take a lot of
    time; IME a './configure' that runs in seconds on Linux needed
    several minutes on Windows.

    It can take several minutes on Linux too! Autoconf-generated configure scripts can contain tens of thousands of lines of code.

    It depends on the commands and the script. The shell can execute
    thousands of simple commands per second. On Linux most of the time
    probably goes to the C compiler (which is called many times from
    'configure'). On Windows the cost of process creation used to be much
    higher than on Linux, so it is likely that most of the 'configure' time
    went to process creation. Anyway, the same 'configure' script tended to
    run 10-100 times slower on Windows than on Linux. I did not try
    recently...

    And of course you need to install
    essential dependencies; a good program will tell you what you need
    to install first, before running configure. But you need to
    understand what they mean...

    * Finding a 'make' /program/ (my gcc has a program called
    mingw32-make.exe; is that the one?)

    Probably. The normal advice for Windows folks is to install a thing
    called msys (IIUC it is msys2 now) which contains several tools
    including 'make'. You are likely to get it as part of a bigger
    bundle; I am not up to date enough to tell you if this bundle will
    be called 'gcc' or something else.

    But that's just a cop-out. As I said above, it's like my delivering a
    build system for Linux that requires so many Windows dependencies, that
    you can only build by installing half of Windows.

    POSIX utilities could be quite small. On a Unix-like system one
    could fit them in something between 1-2M. On Windows there is the
    trouble that some space-saving tricks do not work (in Unix the usual
    trick is to have one program available under several names, doing
    different things depending on the name). Also, for robustness they
    may be statically linked. And people usually want versions with the
    most features, which are bigger than what is strictly necessary.
    Still, that is a rather small thing if you compare it to the size of
    Windows. Another story is the size of the C compiler and its header
    files. I looked at a compiled version of the Mac OS system interface
    for GNU Pascal; it took about 10M around 2006. And users preferred
    to have it as one large blob, because loading parts separately
    led to quadratically growing time with GNU Pascal. FYI, the
    interface contained several thousands of types and about 200000
    symbolic constants. So basically, having symbolic names and
    type checking has a significant cost and there will be some bloat
    due to this.

    I don't have any interest in this; I just want the binary!

    Well, I provide Linux binaries, but only sources for Windows
    users. One reason is that I have only Linux on my personal
    machine, so to deal with Windows I need to lease a machine.
    A different reason is that I am not paid for programming; I do
    this because I like to program and, to some degree, to build a
    community.

    I had the same problem, in reverse. I've spent money on RPis, cheap
    Linux netbooks, spent endless time getting VirtualBox to work, and still don't have a suitable Linux machine that Just Works.

    I have a bunch of Pi-s (2 original RPis, the rest similar boards from
    Chinese vendors). Probably most troubles are caused by SD cards
    and improper shutdown. In the original Pi the card (in an adapter) got
    stuck, and during removal the adapter broke. In a cheap Orange Pi Zero
    (it was $6.99 at some point), SD cards developed errors for no apparent
    reason. Still, the Pi-s mostly work fine, but if you want to store
    important files, then a USB stick or disk may be safer.

    For a laptop I bought the cheapest one in the store and it works fine.
    It is slow, but I do real work on the desktop, and need the laptop for
    travel. For me, small weight and low current consumption were most
    important. Low current consumption means a slow processor, which
    led to a low price...

    WSL is not interesting since it is still x64, and maybe things will work
    that will not work on real Linux (eg. it can still run actual Windows
    EXEs; what else is it allowing that wouldn't work on real Linux).

    I've stopped this since no one has ever expressed any interest in seeing
    my stuff work on Linux, especially on RPi where a very fast alternative
    to C that ran on the actual board would have been useful.

    Well, a compiler that cannot generate code for the Pi is not very
    interesting to run on an RPi, even if it is very fast. Your
    latest M is a step in a good direction, but suffers due to gcc
    compile time:

    time ./mc -asm mc.m
    M6 Compiling mc.m---------- to mc.asm

    real 0m0.431s
    user 0m0.347s
    sys 0m0.079s

    time ../mc mc.m
    M6 Compiling mc.m---------- to mc
    L:Invoking C compiler: gcc -omc mc.c -lm -ldl -s -fno-builtin
    mc.c:2:32: warning: unknown option after '#pragma GCC diagnostic' kind [-Wpragmas]
    #pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    real 0m9.137s
    user 0m8.746s
    sys 0m0.386s

    So 'mc' can generate C code in 0.431s, but then it takes 9.137s
    to compile the generated C (IIUC Tiny C does not support ARM, and
    even on x86_64 the compile command probably needs fixing).

    And there seems to be a per-program overhead:

    time fred.nn/mc -asm hello.m
    M6 Compiling hello.m------- to hello.asm

    real 0m0.222s
    user 0m0.186s
    sys 0m0.032s

    time fred.nn/mc hello.m
    M6 Compiling hello.m------- to hello
    L:Invoking C compiler: gcc -ohello hello.c -lm -ldl -s -fno-builtin
    hello.c:2:32: warning: unknown option after '#pragma GCC diagnostic' kind [-Wpragmas]
    #pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    real 0m1.596s
    user 0m1.464s
    sys 0m0.125s

    'hello.m' is quite small, but it needs half the time of mc, which
    is 6000 times larger. And the generated 'hello.c' is still 4381
    lines.

    On 32-bit Pi-s I have Poplog:

    http://github.com/hebisch/poplog
    http://www.math.uni.wroc.pl/~hebisch/poplog/corepop.arm

    For bootstrap one needs the binary above. For me it works on the
    original Raspberry Pi, Banana Pi, Orange Pi PC, Orange Pi Zero.
    They all have different ARM chips and run different versions
    of Linux, yet the same 'corepop.arm' works on all of them.

    If you are interested you can look at the INSTALL file in the repo
    above (skip the quick-install part that assumes a tarball and go to
    the full install). Building uses a hand-written 'configure'; it
    is very simple, but calls the C compiler up to 4 times, so the runtime
    of 'configure' on Pi-s is noticeable (of the order of 1 second).
    The actual build (using 'make') takes a few minutes, depending
    on what was configured.

    Note: one needs to install the dependencies first (described in
    INSTALL); if you do not do this, either configure or the build
    will fail.

    For massive files Poplog compiles much slower than your compiler,
    but usually much faster than gcc. However, the main advantage
    is that Poplog compiles to memory, giving you the impression of an
    interpreter, but generating machine code. It is not easy
    to measure the speed of the compiler, but for reasonably sized functions
    the compile time is short enough that one does not notice it.

    With Poplog you actually get 4 languages: Pop11, SML, Prolog and
    Common Lisp. A significant portion of Poplog is kept in source form
    and (transparently) compiled when needed. Memory use is moderate.

    Let me add that there are actually two compilers, one compiling to
    memory and a separate one which generates assembly code. The second
    compiler has low-level extensions allowing faster object code. The
    compiler compiling to memory is significantly faster than
    the one which generates assembly.

    There is also large documentation, and a nontrivial part of the build
    time is spent creating indices for the documentation (that probably
    could be made faster, but it happens rarely compared to compilation
    so nobody has cared enough to improve it).

    --
    Waldek Hebisch

  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Mon Dec 19 15:14:31 2022
    On 19/12/2022 06:15, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    But I have noticed that on Linux, distributing stuff as giant source
    bundles seems popular. I assumed that was due to difficulties in using
    binaries.

    Creating binaries that work on many system requires some effort.

    For users that is by far the simplest way. Especially on Windows, where
    you are looking for an .exe or .msi when wanting to download some application.

    Building from sources on Windows is only viable IMO if (1) everything
    required is contained within .exe and .msi and the process is
    transparent; (2) it actually works with no mysterious errors that you
    can't do anything about.

    (Why not provide the app as a binary anyway? Maybe to provide a more
    targetted executable. But I think it's also a thing now to supply
    multiple different EXEs packaged within the one binary.)


    Easiest way is to create them on oldest system that you expect to
    be used: typically compile on Linux will try to take advantage
    of features of processor and system, older system/processor
    may lack those features and fail. As I wrote, one needs to limit dependencies or bundle them. Bundling may lead to very large
    download size.

    No one cares about that anymore. I remember every trivial app on Android
    always seemed to be 48MB in size, maybe because they were packaged with
    the same tools.

    /I/ care about it because size = complexity.

    And do not forget to open source means that users can get the source.
    If source is not available, most users will ignore your program.

    Going back to Android and the 1000000 downloadable apps on Play Store:
    how many downloaders care about source code? Approximately nobody!

    Tools for programmers on desktop PCs will have a different demographic,
    but I can tell you that when I want to run a language implementation, I
    don't care about the source either. Let me try it out first.

    Concerning binaries for ARM, they are a bit more problematic than
    for Intel/AMD. Namely, there is a lot of different variants of
    ARM processors and 3 different instruction encodings in popular

    You're telling me! I've long been confused by the different numberings
    between processors and architectures. All I want to know about these
    days is, is it ARM32 or ARM64.

    <long snip>
    ARM have no support for 32-bit binaries. So, there is less
    compatiblity than technically possible.

    Sure. That's why, if you are working from source, it should have as
    simple a structure as possible, if you are a /user/ of the application.

    My view is that there are two kinds of source code:

    (1) The true sources that the /developer/ works with, with 1000s of
    source files, dozens or 100s of sprawling directories, config scripts,
    make files, the works. Plus an ecosystem of tools for static analysis, refactoring, ... you have a better idea than I do

    (2) A minimal representation which is the simplest needed to create a
    working binary. Just enough to solve the problems of diverse targets
    that you listed.

    (1) is needed for developing and debugging. (2) is used on finished,
    debugged, working programs.

    You seem to be arguing for everyone to be provided with (1).

    The rationale for the one-file source version of SQLite3 was precisely
    to make it easy to build and to incorporate. (One big file, plus 2-3
    auxiliary ones, compared with 100 separate true source files. The
    choices the developers made in the folder hierarchy etc are irrelevant.)

    Things have structure, kernel is divided into subdirectories in
    resonably logical way.

    Just getting a linear list of files in the Linux kernel sources was a mini-project. This was when it was a mere 21M lines; I can't remember
    how many files there were.

    And there are tools like 'grep', it can
    find relevant thing in seconds. This is not pure theory, I had
    puzzling problem that binaries were failing on newer Linux
    distributions. It took some time to solve, but another guy
    hinted me that this may be tighthended "security" in kernel.
    Even after the hint my first try did not work. But I was able
    to find relevant code in the kernel and then it became clear
    what to do.

    (My approach would have been different; it would not have been that big
    in the first place. I understand that 95% of sources are not relevant to
    any particular build, but I still think an OS, especially just the core
    of an OS, should not be that large.)

    Here I'm concerned only with building stuff that works, and don't want
    to know what directory structure the developers use.

    Concerning not having extension: you can add one if you want,
    moderatly popular choices are .exe or .elf.

    But nobody does. Main problem is in forums like this: if I say
    `hello.exe`, everyone knows that's a binary executable for Windows.

    It may be for Linux...

    David also replied to my points and I made clear my views about file
    extensions yesterday in my reply to him.

    At no point did I need to write an extension. It is implied by the
    program I invoked.

    Implied extentions have trouble that somebody else my try to hijack
    them. I use TeX system which produces .dvi files. IIUC they could
    be easily mishandled by systems depending just on extion. And in
    area of programming languages at least three languages compete for
    .p extention.

    I don't get your point. Somebody could still hijack DVI files, whether
    an application requires 'prog file.dvi', or 'prog file' which defaults
    to file.dvi.

    People can also write misleading 'shebang' lines or file signatures.

    Possibly you don't quite understand: aside from "./" being a syntax
    error on Windows,

    AFAIK '/' is legal in Windows pathnames (even though many programs
    do not support it). I am not sure about leading dot.

    Internally it's legal, but it's not allowed in shell commands (because
    "/" was used for command options). "." and ".." work the same way as in
    Linux.


    'configure' is a script full of Bash commands which
    invoke all sorts of utilities from Linux. It is meaningless to attempt
    to run it on Windows.

    You probably do not understand that 'configure' scripts use POSIX
    commands.

    So? The fact is that they will not work on Windows. This is an extract
    from the 30,500-line configure script for GMP:

        DUALCASE=1; export DUALCASE # for MKS sh
        if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then :
          emulate sh
          NULLCMD=:
          # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which
          # is contrary to our usage. Disable this feature.
          alias -g '${1+"$@"}'='"$@"'
          setopt NO_GLOB_SUBST
        else
          case `(set -o) 2>/dev/null` in #(
            *posix*) :
              set -o posix ;; #(
            *) :
              ;;
          esac
        fi

    This makes no sense to Windows shell, either Command Prompt or Powershell.

    (Actually the size of this file - which is not the source code - is
    bigger than the GMP DLL which is the end product.)

    It would be like my bundling a Windows BAT file with sources intended to
    be built on Linux.

    There are two important differences:
    - COMMAND.COM is very crappy as command processor, unlike Unix shell
    which from the start was designed as programming language. IIUC that
    is changing with PowerShell.
    - you compare thing which was designed to be portability layer with
    platform specific thing.

    I don't really care about COMMAND.COM either. I use only the minimum
    features.

    But I'm not sure what point you're making: are you trying to excuse
    those 30,500 lines of totally useless crap by saying that COMMAND.COM
    should be able to run it?

    It can takes several minutes on Linux too! Auto-conf-generated configure
    scripts can contain tens of thousands of lines of code.

    It depends on commands and script. Shell can execute thousends of
    simple commands per second. On Linux most of time probably goes
    to C compiler (which is called many times from 'confugure').
    On Windows cost of process creation used to be much higher than
    on Linux, so it is likely that most 'configure' time went to
    process creation. Anyway, the same 'configure' script tended to
    run 10-100 times slower on Windows than on Linux. I did not try
    recently...

    It shouldn't need to run at all. I still have little idea what it
    actually does, or why, if it is to determine system parameters, that
    can't be done once and for all per system, and not repeated per
    application and per full build.
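
    From what I can gather, what it mostly boils down to for the C code is
    writing a config.h of feature-test macros which the sources then branch
    on. A hedged sketch, with the two defines standing in for a generated
    config.h (the HAVE_* names follow the usual autoconf convention; the
    rest is made up):

        #define HAVE_UNISTD_H 1     /* configure found <unistd.h> */
        #define HAVE_SYSCONF  1     /* configure found sysconf()  */

        #include <stdio.h>
        #ifdef HAVE_UNISTD_H
        #include <unistd.h>
        #endif

        static long page_size(void)
        {
        #if defined(HAVE_UNISTD_H) && defined(HAVE_SYSCONF)
            return sysconf(_SC_PAGESIZE);   /* the POSIX way, when available */
        #else
            return 4096;                    /* fallback guess */
        #endif
        }

        int main(void)
        {
            printf("page size: %ld\n", page_size());
            return 0;
        }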

    But that's just a cop-out. As I said above, it's like my delivering a
    build system for Linux that requires so many Windows dependencies, that
    you can only build by installing half of Windows.

    POSIX utilities could be quite small. On Unix-like system one
    could fit them in something between 1-2M. On Windows there is
    trouble that some space saving ticks do not work (in Unix usual
    trick is to have one program available under several names, doing
    different thing depending on name). Also, for robustness they
    may be staticaly linked. And people usually want versions with
    most features, which are bigger than what is strictly necessary.
    Still, that is rather small thing if you compare to size of
    Windows.

    The size of Windows doesn't matter, since it is not something that
    somebody needs to locate, download and install. It will already exist.

    Besides my stuff needs a tiny fraction of its functionality: basically,
    a file system.


    Another story is size of C compiler and its header
    files.

    (My mc.c doesn't use any header files, you might have noticed! Not even
    stdio.h. Hence the need for -fno-builtin etc to shut up gcc, but that
    varies across OSes.)

    WSL is not interesting since it is still x64, and maybe things will work
    that will not work on real Linux (eg. it can still run actual Windows
    EXEs; what else is it allowing that wouldn't work on real Linux).

    I've stopped this since no one has ever expressed any interest in seeing
    my stuff work on Linux, especially on RPi where a very fast alternative
    to C that ran on the actual board would have been useful.

    Well, compiler that can not generate code for Pi is not very
    interesting to run on RPi, even if it is very fast.

    I first looked at RPi1 ten years ago. But I didn't find enough incentive
    to target ARM32 natively.

    Still, I have a way to write systems code for RPi in my language, even
    if it annoyingly has to go through C.

    Your
    latest M is step in good direction, but suffers due to gcc
    compile time:

    time ./mc -asm mc.m
    M6 Compiling mc.m---------- to mc.asm

    (This a bug: for 'mc', -asm does the same thing as -c, and writes mc.c,
    but says it is writing mc.asm. Use -c option for C output without
    compiling via C compiler. Note that this may have overwritten mc.c.)


    real 0m0.431s
    user 0m0.347s
    sys 0m0.079s

    Gcc is not a good backend for the M compiler, since compilation speed
    hits a brick wall as soon as it is invoked.

    Much more suitable is tcc, but I couldn't make that the default, since
    it might not be installed. (I suppose it could tentatively try tcc, then
    fall back to gcc.)

    On my WSL (some tidying done):

    /mnt/c/c# time ./mc -gcc mc -out:mc2 # -gcc is default
    M6 Compiling mc.m---------- to mc2
    L:Invoking C compiler: gcc -omc2 mc2.c -lm -ldl -s -fno-builtin
    real 0m1.362s
    user 0m1.139s
    sys 0m0.109s

    /mnt/c/c# time ./mc -tcc mc -out:mc2
    M6 Compiling mc.m---------- to mc2
    L:Invoking C compiler: tcc -omc2 mc2.c -lm -ldl
    -fdollars-in-identifiers
    real 0m0.135s
    user 0m0.058s
    sys 0m0.012s

    So tcc is 10 times the speed of using gcc (or maybe 20; I don't get
    these readings). Note, I used gcc -O3 to get ./mc (the executable!), but
    I get the same results with -O0.

    /mnt/c/c# time ./mc -c mc
    M6 Compiling mc.m---------- to mc.c
    real 0m0.072s
    user 0m0.014s
    sys 0m0.021s
    root@DESKTOP-11:/mnt/c/c#

    This is the time to translate to C only. That is, 70ms or 14ms, probably
    the former.

    Going from mc.m to mc.exe directly on Windows is faster (I think):

    c:\mx>TM mm mc
    M6 Compiling mc.m---------- to mc.exe
    TM: 0.07

    That is unoptimised code. If I get an optimised version of mm.exe like
    this: `mc -opt mm` (the main reason I again have a C target), then it is:

    c:\mx>TM mm mc
    M6 Compiling mc.m---------- to mc.exe
    TM: 0.05



    time ../mc mc.m
    M6 Compiling mc.m---------- to mc
    L:Invoking C compiler: gcc -omc mc.c -lm -ldl -s -fno-builtin
    mc.c:2:32: warning: unknown option after '#pragma GCC diagnostic' kind [-Wpragmas]
    #pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    real 0m9.137s
    user 0m8.746s
    sys 0m0.386s

    So 'mc' can generate C code in 0.431s, but then it takes '9.137s'
    to compile generated C (IIUC Tiny C does not support ARM and
    even on x86_64 compile command probably needs fixing).

    I think it does; I seem to remember using it. It was a major
    disincentive to taking my own fast compiler further.

    I assume you've tried 'apt-get install tcc'.

    And there seem to be per-program overhead:

    time fred.nn/mc -asm hello.m
    M6 Compiling hello.m------- to hello.asm

    real 0m0.222s
    user 0m0.186s
    sys 0m0.032s

    time fred.nn/mc hello.m
    M6 Compiling hello.m------- to hello
    L:Invoking C compiler: gcc -ohello hello.c -lm -ldl -s -fno-builtin
    hello.c:2:32: warning: unknown option after '#pragma GCC diagnostic' kind [-Wpragmas]
    #pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"

    (So many incompatible ways in gcc to get it to say nothing about
    declarations like:

    extern int puts(unsigned char*);

    I thought one point of having an optional stdio.h was to be able to
    easily override such functions.)

    real 0m1.596s
    user 0m1.464s
    sys 0m0.125s

    'hello.m' is quite small, but is needs half of time of mc which
    is 6000 times larger. And generated 'hello.c' is still 4381
    lines.

    To get a clearer picture of what is taking the time, try:

    time ./mc -c hello.m
    time gcc hello.c -ohello @gflags
    time gcc hello.c -ohello @tflags

    It's annoying that on Linux, gcc needs those 3 extra options not needed
    on Windows. And that tcc, because it doesn't support $ in names, needs
    that other long option. So gflags/tflags contain what are necessary.

    On my WSL, hello.m -> hello.c takes 38ms real time. The gcc build of
    hello.c takes 340ms, and tcc takes 17ms.

    A regular 5-line hello.c takes 125ms and 10ms respectively. When I can
    get -nosys/-minsys working with the C target, then generated C files can
    get smaller for certain programs.

    (On Windows, -nosys and -minsys options remove all or most of the
    standard library. Executables go down to 2.5KB or 3KB instead of 50KB
    minimum, but no or minimal library features are available.

    These don't work yet with mc for C target. Full list of options are in mm_cli.m.)


    On 32-bit Pi-s I have Poplog:

    http://github.com/hebisch/poplog http://www.math.uni.wroc.pl/~hebisch/poplog/corepop.arm

    For bootstrap one needs binary above. For me it works on
    oryginal Raspberry Pi, Banana Pi, Orange Pi Pc, Orange Pi Zero.
    They all have different ARM chips and run different version
    of Linux, yet the same 'corepop.arm' works on all of them.

    If you are interested you can look at INSTALL file in repo
    above (skip quick install part that assumes tarball and go to
    full install). Building uses had written 'configure', it
    is very simple, but calls C compiler up to 4 times, so runtime
    of 'configure' on Pi-s is noticeable (of order of 1 second).
    Actual build (using 'make') takes few minutes, depending
    on what was configured.

    I'll have to dig out my Pi boards. I have RPi1 and RPi4. I got the latter
    because I wanted to try ARM64, but it turns out that mature OSes for it
    were still mostly 32-bit. The only 64-bit OS worked poorly. That was 3
    years ago.

    For massive files Poplog compiles much slower than your compiler
    but usually much faster than gcc. However, main advantage
    is that Poplog compiles to memory giving you impression of
    interpreter, but generating machine code.

    That's what my 'mm' compiler does on Windows, using -run option. Or if I
    rename it 'ms', that is the default:

    c:\mx>ms mm -run \qx\qq \qx\hello.q
    Compiling mm.m to memory
    Compiling \qx\qq.m to memory
    Hello, World! 19-Dec-2022 15:00:34

    (There's an issue ATM building ms with ms.) This builds the M compiler
    from source, which builds the Q interpreter from source, which then runs hello.q. The whole process took 0.18 seconds.

    But I can't do this via the C target. (Support native code for Linux on
    x64 is a possibility, but is of limited use and interest for me.)

    Let me add that there are actually two compilers, one compiling to
    memory and separate one which generates assembly code. The second
    compiler has low-level extenstions allowing faster object code.
    Compiler compiling to memory is is significantly faster than
    the one which generates assembly.

    Generating lots of ASM source is slow, but you would need to generate
    millions of lines to see much of a difference.

  • From David Brown@21:1/5 to Bart on Mon Dec 19 16:39:33 2022
    On 19/12/2022 00:18, Bart wrote:
    On 18/12/2022 17:17, antispam@math.uni.wroc.pl wrote:

      As-is it failed on 64-bit ARM.
    More precisly, initial 'mc.c' compiled fine, but it could not
    run 'gcc'.  Namely, ARM gcc does not have '-m64' option.  Once
    I removed this it works.

    gcc doesn't have `-m64`, really? I'm sure I've used it even on ARM. (How
    do you tell it it to generate ARM32 rather than ARM64 code?)


    gcc treats "x86" as one backend, with "-m32" and "-m64" variants for
    32-bit and 64-bit x86, for historical reasons and because it is
    convenient for people running mixed systems. But ARM (32-bit) and
    AArch64 (64-bit) are different backends as they are very different architectures.

    So you need separate gcc builds for 32-bit and 64-bit ARM, not just a flag.

    (I make no comment as to whether this is a good thing or a bad thing -
    I'm just saying how it is.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to antispam@math.uni.wroc.pl on Mon Dec 19 16:51:17 2022
    On 19/12/2022 07:15, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:




    Possibly you don't quite understand: aside from "./" being a syntax
    error on Windows,

    AFAIK '/' is legal in Windows pathnames (even though many programs
    do not support it). I am not sure about leading dot.

    A single dot counts as "the current directory" in Windows, just like in
    *nix.

    But forward slashes are not allowed in Windows filenames, along with
    control characters (0x00 - 0x1f, 0x7f), ", *, /, \, :, <, >, ?, | and
    certain names that match old DOS device names.

    *nix systems typically allow everything except / and NULL characters.
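
    As a rough C sketch of those rules (illustration only - reserved device
    names such as CON or NUL would need a separate check):

        /* Sketch: reject characters that cannot appear in a Windows file
           name component; reserved device names need a separate check. */
        #include <stdbool.h>
        #include <stdio.h>
        #include <string.h>

        static bool valid_windows_name_char(unsigned char c)
        {
            if (c <= 0x1F || c == 0x7F)                /* control characters */
                return false;
            return strchr("\"*/\\:<>?|", c) == NULL;   /* " * / \ : < > ? | */
        }

        int main(void)
        {
            const char *name = "hello,world.c";        /* commas are fine */
            for (const char *p = name; *p; p++)
                if (!valid_windows_name_char((unsigned char)*p))
                    printf("illegal character: %c\n", *p);
            return 0;
        }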


    (I discovered recently that in Powershell on Windows, you need ./ to run executables in the current directory, unless you mess with your $PATH -
    just like on *nix.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bart on Mon Dec 19 19:54:44 2022
    On 19/12/2022 15:14, Bart wrote:

    For massive files Poplog compiles much slower than your compiler
    but usually much faster than gcc. However, the main advantage
    is that Poplog compiles to memory, giving you the impression of an
    interpreter while generating machine code.

    That's what my 'mm' compiler does on Windows, using -run option. Or if I rename it 'ms', that is the default:

       c:\mx>ms mm -run \qx\qq \qx\hello.q
       Compiling mm.m to memory
       Compiling \qx\qq.m to memory
       Hello, World! 19-Dec-2022 15:00:34

    (There's an issue ATM building ms with ms.)

    That problem has gone, it just needed a tweak (usually ms.exe is a
    renaming of mm, rather than being built directly). Now I can do:

    c:\mx>ms ms ms ms ms ms ms ms ms ms ms \qx\qq \qx\hello
    Hello, World! 19-Dec-2022 19:36:43

    (Compiler messages are normally suppressed to avoid spoiling the
    illusion that it's a script, but I showed them above.)

    This builds 10 generations of the compiler in-memory, before building
    the interpreter. This took 0.75 seconds (each version is necessarily unoptimised, only the first might be).

    You wouldn't normally do this, but it illustrates the ability to run
    native code applications, including compilers, directly from source
    code. Not quite JIT, but different from a normal, one-time AOT build.
    And this could be done at a customer site.

    But I can't do this via the C target.

    Actually it can be done, and I've just tried it. But it is an emulation:
    it creates a normal executable file then runs it. I just need to arrange
    for it to pick up the trailing command line parameters to pass to the
    app. However it needs to use the tcc compiler to work fluently.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Tue Dec 20 00:17:41 2022
    On 19/12/2022 06:15, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c .s .o .a .so, so
    why not actual binaries, by convention?

    Here you miss the virtue of simplicity: binaries are started by the kernel
    and you pass the filename of the binary to the system call. No messing
    with extensions there. There are similar library calls that
    do a search based on PATH, again with no messing with extensions.

    I don't get this. Do you mean that inside a piece of code (ie. written
    once and executed endless times), it is better to write run("prog")
    instead of run("prog.exe"), because it saves 4 keystrokes?

    And it's an advantage in not being able to distinguish between
    "prog.exe", "prog.pl", "prog.sh" etc?

    My views are to do with providing an ergonomic /interactive/ user CLI.


    It simply doesn't make sense.

    It makes sense if you know that an executable in the PATH is simultaneously
    a shell command. You see, there are folks who really do not like
    useless clutter in their command lines. And before calling an
    executable from a shell script you may wish to check that it is available.
    Having different extensions for calling and for access as a normal
    file would complicate scripts.

    In every context I've been talking about where extensions have been
    optional and have been inferred, you have always been able to write full extensions if you want. This would be recommended inside a script run
    myriad times to make it clear to people reading or maintaining it.

    People have mentioned that on Linux you could optionally name
    executables with ".exe" or ".elf" extension. If 'gcc' (the main binary
    driver program of gcc, not gcc as a broader concept - you see the
    problems you get into!) had been named "gcc.exe", would you have had to
    type this every time you ran it:

    gcc.exe hello.c

    If so, then I think I can see the real reason why extensions are empty!

    In a Linux terminal shell, there apparently is no scope for informality
    or user-friendliness at all.

    This has led me to thinking about how command line parameters are
    separated. On either OS you normally type this:

    gcc a.c b.c c.c

    You can't do this, separate with commas, as the comma becomes part of
    each filename:

    gcc a.c, b.c, c.c

    That applies also to my bcc, but there, you CAN have comma-separated
    items inside an @file; with gcc, that still fails.

    So, what's going on here: is it an OS shell misfeature, or what?

    Well it's not the OS on Windows, since 'T.BAT a,b,c' will process a, b,
    c as separate 'a b c' items inside the script (not as "a," etc). (I
    can't test how it works on Linux.)

    My bcc fails because it obtains the command parameters via the MSVCRT
    call __getmainargs(), which turns the single command line string that
    Windows normally provides, into separated items that C expects.

    I used to work directly from the command line string (GetCommandLine)
    and chop it up manually. It looks like I will have to go back to that.

    That will provide consistency with subsequent line input from the
    console, or from a file. (In fact I will extend my Read feature which
    works on those, to work also on the command line params that follow the
    program name.)
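
    For illustration, a hypothetical C sketch of that approach (not the
    actual bcc source): fetch the raw command line with GetCommandLineA()
    and split it by hand, treating commas as well as spaces as separators
    and respecting quotes:

        /* Hypothetical sketch (not the actual bcc code): read the raw
           command line with GetCommandLineA() and split it manually,
           treating spaces and commas as separators and honouring quotes.
           Item 1 will be the program name itself. */
        #include <windows.h>
        #include <stdio.h>
        #include <string.h>

        int main(void)
        {
            char line[4096];
            strncpy(line, GetCommandLineA(), sizeof line - 1);
            line[sizeof line - 1] = '\0';

            int n = 0;
            char *p = line;
            while (*p) {
                while (*p == ' ' || *p == ',') ++p;    /* skip separators */
                if (!*p) break;
                char *start;
                if (*p == '"') {                       /* quoted: keep commas/spaces */
                    start = ++p;
                    while (*p && *p != '"') ++p;
                } else {
                    start = p;
                    while (*p && *p != ' ' && *p != ',') ++p;
                }
                char *end = p;
                if (*p) ++p;                           /* step past quote/separator */
                printf("%d: %.*s\n", ++n, (int)(end - start), start);
            }
            return 0;
        }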

    So, this has been productive after all.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 20 02:44:38 2022
    Bart <bc@freeuk.com> wrote:
    On 19/12/2022 06:15, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    I know that Linux doesn't care about extensions, but people do. After
    all it still uses, by convention, extensions like .c .s .o .a .so, so
    why not actual binaries, by convention?

    Here you miss the virtue of simplicity: binaries are started by the kernel
    and you pass the filename of the binary to the system call. No messing
    with extensions there. There are similar library calls that
    do a search based on PATH, again with no messing with extensions.

    I don't get this. Do you mean that inside a piece of code (ie. written
    once and executed endless times), it is better to write run("prog")
    instead of run("prog.exe"), because it saves 4 keystrokes?

    I mean that user input can go unchanged to system calls.
    If the user types 'prog' and the library function gets 'prog.exe', then
    there must be some code in between which messes with file extensions.
    If your code handles file names obtained from the user, then to
    present a consistent interface to users you also must handle the
    issue. So from a single place it spreads out to many other places.

    It simply doesn't make sense.

    It makes sense if you know that executable in the PATH is simultaneousy shell command. You see, there are folks which really do not like
    useless clutter in their command lines. And before calling
    executable from a shell script you may wish to check if it is available. Having different extention for calling and for access as normal
    file would complicate scripts.

    In every context I've been talking about where extensions have been
    optional and have been inferred, you have always been able to write full extensions if you want. This would be recommended inside a script run
    myriad times to make it clear to people reading or maintaining it.

    Your "full extension" may be tricky. In a directory I may have:

    a.exe
    a.exe.gz

    When I type 'a', do you use 'a.exe' or 'a.exe.gz'? Similarly,
    when I type 'a.exe', would you use it or 'a.exe.gz'?

    People have mentioned that on Linux you could optionally name
    executables with ".exe" or ".elf" extension. If 'gcc' (the main binary
    driver program of gcc, not gcc as a broader concept - you see the
    problems you get into!) had been named "gcc.exe", would you have had to
    type this every time you ran it:

    gcc.exe hello.c

    If so, then I think I can see the real reason why extensions are empty!

    You are slowly getting it. Use just 'gcc' as the filename and you will
    be fine.

    In a Linux terminal shell, there apparently is no scope for informality
    or user-friendliness at all.

    Let me just say that you can have whatever program you want as
    a shell. 'sh' from the start was intended as a programmer's tool
    and a programming language. There are other shells; in particular
    'csh' was intended as an interactive shell for "normal" users.
    But descendants of 'sh' got more features. Concerning
    user-friendliness, 'bash' has had command line editing, history search
    and tab completion for ages. And shell loops are quite usable
    from the command line, so a single command can do what would otherwise
    require a program in a file or a lot of individual commands.
    'zsh' tries to correct spelling; some folks consider this
    friendly (I do not use 'zsh').

    This has led me to thinking about how command line parameters are
    separated. On either OS you normally type this:

    gcc a.c b.c c.c

    You can't do this, separate with commas, as the comma becomes part of
    each filename:

    gcc a.c, b.c, c.c

    That applies also to my bcc, but there, you CAN have comma-separated
    items inside an @file; with gcc, that still fails.

    Why would you do such a silly thing? If you really want, you can
    redefine 'gcc' so that it strips trailing commas (that is
    trivial). If you like excess characters you can type something
    longer like:

    echo gcc a.c, b.c, c.c | tr -d ',' | bash

    So, what's going on here: is it an OS shell misfeature, or what?

    KISS principle. Commas are legal in filenames and potentially
    useful. On the command line spaces work fine. If you really need
    splitting to work differently there are reasonably simple ways
    to do this; the most crude is above.

    BTW: travelling between the UK and other countries, do you complain
    that cars drive on the wrong side of the road?

    Well it's not the OS on Windows, since 'T.BAT a,b,c' will process a, b,
    c as separate 'a b c' items inside the script (not as "a," etc). (I
    can't test how it works on Linux.)

    There is the IFS variable, which lists the characters used for word
    splitting; you can put a comma there together with whitespace. I have
    never used it myself, but it is used extensively in hairy shell
    scripts like 'configure'.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Dec 20 07:56:13 2022
    On 18/12/2022 18:09, Bart wrote:
    On 18/12/2022 13:05, David Brown wrote:
    On 17/12/2022 14:22, Bart wrote:


    For data files, it can often be convenient to have an extension
    indicating the type - and it is as common on Linux as it is on Windows
    to have ".odt", ".mp3", etc., on data files.

    It's convenient for all files. And before you say, I can add a .exe
    extension if I want: I don't want to have to write that every time I run
    that program.

    You can add ".exe" if you want, and then it is part of the name of the
    program - so you use it when naming the program. It's not really very difficult.


    People use extensions where they are useful, and skip them when they
    are counter-productive (such as for executable programs).

    I can't imagine all my EXE (and perhaps BAT) files having no
    extensions. Try to envisage all your .c files having no extensions by
    default. How do you even tell that they are C sources and not Python or
    executables?


    I treat my .c files differently from my program files. Why should the
    same rules apply?

    I treat the ELF executables, shell files, python programs and other
    executable programs the same - I run them. Why would I need a different
    file extension for each? I don't need to run them in different ways -
    the OS (and shell) figure out the details of how to run them so that I
    don't need to bother.

    When you are writing code, and you have a function "twiddle" and an
    integer variable "counter", you call them "twiddle" and "counter".
    You don't call them "twiddle_func" and "counter_int".  But maybe
    sometimes you find it useful - it's common to write "counter_t" for a
    type, and maybe you'd write "xs" for an array rather than "x".
    Filenames can follow the same principle - naming conventions can be
    helpful, but you don't need to be obsessive about it or you end up
    with too much focus on the wrong thing.

    But you /do/ write twiddle.c, twiddle.s, twiddle.o, twiddle.cpp,
    twiddle.h etc? Yet the most important file of all, is just plain 'twiddle'!


    Yes.

    In casual writing or conversation, how do you distinguish 'twiddle the
    binary executable' from 'twiddle the folder', from 'twiddle the
    application' (an installation), from 'twiddle' the project, etc., without
    having to use that qualification?

    Using 'twiddle.exe' does that succinctly and unequivocally.


    Sure. But so does "twiddle the executable". I don't generally talk in
    shell commands or file names - it's just not an issue that has any
    relevance.


    On *nix, every file with the executable flag can be executed - that's
    what the flag is for.

    Sometimes it is convenient to be able to see which files in a
    directory are executables, directories, etc.  That's why "ls" has
    flags for colours or to add indicators for different kinds of files.
    ("ls -F --color").

    As I said, if it's convenient for data and source files, it's convenient
    for all files.


    Why do you care?

    But there are also ways to execute .c files directly, and of course
    .py files which are run from source anyway.


    There are standards for that.  A text-based file can have a shebang
    comment ("#! /usr/bin/bash", or similar) to let the shell know what
    interpreter to use.  This lets you distinguish between "python2" and
    "python3", for example, which is a big improvement over Windows-style
    file associations that can only handle one interpreter for each file
    type.

    That is invasive. It takes something that is really an attribute of the
    file name and puts it inside the file, requiring the file
    to be opened and read to find out.

    You seem to have misread the paragraph.


    (Presumably every language that runs on Linux needs to accept '#' as a
    line comment? And you need to build into every one of 10,000 source
    files the direct location of the Python2 or Python3 installation on that
    machine? Is that portable across OSes? But I expect it's smarter than
    that.)

    Of course it is smarter than that.

    The great majority of languages typically used for scripting (that is,
    running directly without compiling) are happy with # as a comment
    character. Even C interpreters, as far as I saw with a very quick
    google check, are happy with a #! shebang in the first line.
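
    For example, with the Tiny C Compiler installed, a C source file that is
    marked executable can itself be run via a shebang (the path below is an
    assumption about where tcc lives on a given system):

        #!/usr/bin/tcc -run
        /* Illustration: tcc's -run mode lets a C file act as a "script";
           the interpreter path above depends on the local install. */
        #include <stdio.h>

        int main(void)
        {
            printf("Hello from a C file run via its shebang line\n");
            return 0;
        }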

    A key point here is that almost every general-purpose OS, other than
    Windows, in modern use on personal computers is basically POSIX
    compliant. (And even Windows has had some POSIX compliance since the
    first NT days.) One of the things POSIX defines is required placement
    of a large number of files and programs, and required support from
    things like a standard POSIX shell. So a shell script can start with "#!/bin/sh", and be sure of running on every POSIX system - Linux, Macs, embedded Linux, Solaris, AIX, Windows WSL, msys, whatever. If it wants
    Python, it can have "#!/usr/bin/python". If it wants Python 2.5
    specifically, it can have "#!/usr/bin/python2.5". (Of course there is
    no guarantee that a given system has Python 2.5 installed, but almost
    all will have /some/ version of Python, and it can be found at /usr/bin/python.)

    (That does not mean Python has to be installed at /usr/bin/python - it
    means it must be /found/ there. Symbolic links are used widely to keep filesystems organised while letting files be found in standard places.)

    These things are not Linux-specific. They predate Linux, and are
    ubiquitous in the *nix world.



    With Python, you're still left with the fact that you see a file with a
    .py extension, and don't know if it's Py2 or Py3, or Py3.10 or Py3.11,
    or whether it's a program that works with any version. It is a separate problem from having, as convention, no extensions for ELF binary files.


    Exactly my point. That's why in the *nix world, you use a shebang that
    can be as specific as you want or as general as you can, in regard to
    versions. You can have a file extension if you like, but it is not
    needed or used in order to find the interpreter for the script, so most
    people don't bother for their executables. If you have an OS that
    relies solely on file extensions, however, you do not have that
    flexibility - it works in simple cases (one Python version installed)
    but not in anything more.

    On Windows, I always have to start my Python programs explicitly with "C:\Python2.8\python prog.py", or equivalent, precisely because of
    Windows limitations.

    File extensions for executable types seems like a nice idea at the
    start, but is quickly shown to be limiting and inflexible.

      And the *nix system distinguishes between executable files and
    non-executables by the executable flag - that way you don't
    accidentally try to execute non-executable Python files.

    (So there are files that contain Python code that are non-executable?
    Then what is the point?)


    Maybe you haven't done much Python programming and have only worked with
    small scripts. But like any other language, bigger programs are split
    into multiple files or modules - only the main program file will be
    executable. So if a big program has 50 Python files, only one of them
    will normally be executable and have the shebang and the executable
    flag. (Sometimes you'll "execute" other modules to run their tests
    during development, but you'd likely do that as "python3 file.py".)


    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc, it
    is with the name of a .c source file. And 99.9% of the times when I
    invoke it on prog.c as the first or only file to create an executable,
    then I want to create prog.exe.


    OK. So gcc should base its handling of input on what /you/ do, never
    mind the rest of the world? That's fine for your own tools, but not for
    gcc.

    So its behaviour is unhelpful. After the 10,000th time you have to type
    .c, or backspace over .c to get at the name itself to modify, it becomes tedious.


    Every serious developer uses build programs - or at least a DOS batch
    file - for major programming work.

    Now it's not that hard to write a wrapper script or program on top of gcc.exe, but if it isn't hard, why doesn't it just do that?


    gcc is already a wrapper for a collection of tools and compilers.

    It's not a simple C compiler that assumes everything it is given is a
    C file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension, and also,
    bizarrely, generates `a.out` as the object file name.

    "a.out" is the standard default name for executables on *nix, used by
    all tools - it's hardly bizarre, even though you rarely want the default.

    Like most *nix tools, gas can get its files from multiple places,
    including pipes. And you can call your files anything you like -
    "file.s", "file.asm", "file", "file.x86asm", "file.version2.4.12", etc.
    It would be a very strange idea to decide it should only take part of
    the file name even if it only accepts one type of file.

    In *nix, the dot is just a character, and file extensions are just part
    of the name. You can have as many or as few as you find convenient and helpful.


    If you intend to assemble three .s files to object files, using separate
    'as' invocations, they will all be called a.out!

    That would be crass even for a toy program written by a student. And yet
    here it is, a mainstream product used by millions of people.

    All my language programs (and many of my apps), have a primary type of
    input file, and will default to that file extension if omitted. Anything
    else (eg .dll files) need the full extension.

    Here's something funny: take hello.c and rename to 'hello', with no extension. If I try and compile it:

        gcc hello

    it says: hello: file not recognised: file format not recognised. Trying
    'gcc hello.' is worse: it can't see the file at all.


    How is that "funny" ? It is perfectly clear behaviour.

    gcc supports lots of file types. For user convenience it uses file
    extensions to tell the file type unless you want to explicitly inform it
    of the type using "-x" options.

    <https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html>

    "hello" has no file extension, so the compiler will not assume it is C. (Remember? gcc is not just a simple little dedicated C compiler.)
    Files without extensions are assumed to be object files to pass to the
    linker, and your file does not fit that format.

    "hello." is a completely different file name - the file does not exist.
    It is an oddity of DOS and Windows that there is a hidden dot at the end
    of files with no extension - it's a hangover from 8.3 DOS names.



    So first, on Linux, where file extensions are supposed to be optional,
    gcc can't cope with a missing .c extension; you have to provide extra
    info. Second, on Linux, "hello" is a distinct file from "hello.".


    Yes. It's the only sane way, and consistent with millions of programs
    spanning 50 years on huge numbers of systems.

    With bcc, I just have to type "bcc hello." to make it work. A trailing
    dot means an empty extension.

    When you make your own little programs for your own use, you can pick
    your own rules.


    On Linux, you just write "make hello" - you don't need a makefile for
    simple cases like that.

    OK... so how does 'make' figure out the file extension?

    "make" has a large number of default rules built in. When you write
    "make hello", you are asking it to create the file "hello". It searches through its rules looking for ones that can be triggered and which match
    files that are found in the directory. One of these rules is how to
    compile and link a file "%.c" into an executable "%" - so it applies that.


    'Make' anyway has different behaviour:

    * It can choose not to compile

    * On Windows, it says this:

      c:\yyy>make hello
      cc     hello.c   -o hello
      process_begin: CreateProcess(NULL, cc hello.c -o hello, ...) failed.
      make (e=2): The system cannot find the file specified.
      <builtin>: recipe for target 'hello' failed
      make: *** [hello] Error 2


    Do you have a program called "cc" on your path? It's unlikely. "cc" is
    the standard name for the system compiler, which may be gcc or may be
    something else entirely.

    * I also use several C compilers; how does make know which one I intend?
    How do I pass it options?


    It uses the POSIX standards. The C compiler is called "cc", the flags
    passed are in the environment variable CFLAGS.

    If that's not what you want, write a makefile.

    If I give another example:

       c:\c>bcc cipher hmac sha2
       Compiling cipher.c to cipher.asm
       Compiling hmac.c to hmac.asm
       Compiling sha2.c to sha2.asm
       Assembling to cipher.exe

    it just works. 'make cipher hmac sha2' doesn't, not even in WSL.

    (And the "advanced AI" can figure out if it is C, C++, Fortran, or
    several other languages.)

    No, it can't. If I have hello.c and hello.cpp, it will favour the .c file.


    Sorry, I should have specified that the "advanced AI" can do it on an
    advanced OS, such as every *nix system since before Bill Gates found
    MS-DOS in a dustbin.


    File extensions are tremendously helpful. But that doesn't mean you have
    to keep typing them! They just have to be there.


    Exactly. You just have a very DOS-biased view as to when they are
    helpful, and when they are not. It's a backwards and limited view due
    to a lifetime of living with a backwards and limited OS.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 20 11:07:34 2022
    Bart <bc@freeuk.com> wrote:
    On 18/12/2022 13:05, David Brown wrote:
    On 17/12/2022 14:22, Bart wrote:


    When you are writing code, and you have a function "twiddle" and an
    integer variable "counter", you call them "twiddle" and "counter". You
    don't call them "twiddle_func" and "counter_int". But maybe sometimes
    you find it useful - it's common to write "counter_t" for a type, and
    maybe you'd write "xs" for an array rather than "x". Filenames can
    follow the same principle - naming conventions can be helpful, but you
    don't need to be obsessive about it or you end up with too much focus on
    the wrong thing.

    But you /do/ write twiddle.c, twiddle.s, twiddle.o, twiddle.cpp,
    twiddle.h etc? Yet the most important file of all, is just plain 'twiddle'!

    During development you normally have several source files
    per executable. And normally the executable is the final product. So
    rules based on extensions help in managing the build but are not
    useful for the executable. There are cases when the executable is
    _not_ a final product; in such cases people add extensions
    to executables to allow simple rules. "Importance" actually
    works in the opposite direction to the one you would like to imply:
    the set of possible short names is limited, so something must be
    important enough to get a short name (without an extension).

    If you intend to assemble three .s files to object files, using separate
    'as' invocations, they will all be called a.out!

    That would be crass even for a toy program written by a student. And yet
    here it is, a mainstream product used by millions of people.

    Well, it would be crass not to specify the output in such a case. Students
    have no trouble learning this.

    * I also use several C compilers; how does make know which one I intend?
    How do I pass it options?

    We have Makefiles for that. My Makefile for microcontrollers
    (STM32F1) has at the start:

    CM_INC = /mnt/m1/pom/kompi/work/libopencm3/include
    CORE_FLAGS = -mthumb -mcpu=cortex-m3
    CFLAGS = -Os -Wall -g $(CORE_FLAGS) -I $(CM_INC) -DSTM32F1

    TOOL_PPREFIX = arm-none-eabi-
    CC = $(TOOL_PPREFIX)gcc
    CXX = $(TOOL_PPREFIX)g++
    AS = $(TOOL_PPREFIX)as

    CM_INC, CORE_FLAGS and TOOL_PPREFIX are my variables which
    help to better organize the Makefile. CC is a standard make
    variable which tells make how to invoke the C compiler (by default
    make would use 'cc'). CXX does the same for C++. And
    AS specifies how to invoke the assembler.

    Without the settings above, make would invoke the normal compiler,
    generating code for the PC, which would not run on the microcontroller.
    And it would invoke the PC assembler, which cannot handle ARM assembly.

    For your use you may want something like:

    CC = bcc
    AS = fasm

    If I give another example:

    c:\c>bcc cipher hmac sha2
    Compiling cipher.c to cipher.asm
    Compiling hmac.c to hmac.asm
    Compiling sha2.c to sha2.asm
    Assembling to cipher.exe

    it just works. 'make cipher hmac sha2' doesn't, not even in WSL.

    Well, you told make that you want cipher, hmac and sha2 as results.
    If your sources are written in an appropriate way (for multiple
    executables), it would work.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 20 11:55:06 2022
    Bart <bc@freeuk.com> wrote:
    On 18/12/2022 17:17, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:
    On 17/12/2022 13:22, Bart wrote:
    On 17/12/2022 06:07, antispam@math.uni.wroc.pl wrote:

    I tested this using the following program:

    proc main=
        rsystemtime tm
        os_getsystime(&tm)
        println tm.second
        println tm.minute
        println tm.hour
        println tm.day
        println tm.month
        println tm.year
    end


    It's funny you picked on that, because the original version of my
    hello.m also printed out the time:

    proc main=
        println "Hello World!",$time
    end

    This was to ensure I was actually running the just-built version, and
    not the last of the 1000s of previous ones. But the time-of-day support
    for Linux wasn't ready so I left it out.

    I've updated the mc.c/mc.ma files (not hello.m, I'm sure you can fix that).

    However getting this to work on Linux wasn't easy as it kept crashing.
    The 'struct tm' record ostensibly has 9 fields of int32, so has a size
    of 36 bytes. And on Windows it is. But on Linux, a test program reported
    the size as 56 bytes.

    Doing -E on that program under Linux, the struct actually looks like this:

    struct tm
    {
        int tm_sec;
        int tm_min;
        int tm_hour;
        int tm_mday;
        int tm_mon;
        int tm_year;
        int tm_wday;
        int tm_yday;
        int tm_isdst;

        long int tm_gmtoff;
        const char *tm_zone;
    };

    16 extra bytes for fields not mentioned in the 'man' docs, plus 4 bytes of
    alignment padding, account for the extra 20 bytes. This is typical of the
    problems in adapting C APIs to the FFIs of other languages.
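
    A quick way to check this sort of thing before writing an FFI declaration
    is a throwaway C program (just a sketch; sizes and offsets vary by
    platform):

        /* Throwaway layout check: print the size of struct tm and a couple
           of field offsets for the C library actually in use.  On a typical
           Windows CRT the size is 36 (nine ints); on 64-bit glibc it is 56,
           because tm_gmtoff and a pointer tm_zone follow the standard fields. */
        #include <stdio.h>
        #include <stddef.h>
        #include <time.h>

        int main(void)
        {
            printf("sizeof(struct tm) = %lu\n", (unsigned long)sizeof(struct tm));
            printf("tm_sec   offset   = %lu\n", (unsigned long)offsetof(struct tm, tm_sec));
            printf("tm_isdst offset   = %lu\n", (unsigned long)offsetof(struct tm, tm_isdst));
            return 0;
        }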

    Sorry for that, it worked on my machine so I did not check the struct
    size.


    BTW: I still doubt that 'mc.ma' expands to true source: do you
    really write no comments in your code?

    The file was detabbed and decommented, as the comments would be full of ancient crap, mainly debugging code that never got removed. I've tidied
    most of that up, and now the file is just detabbed (otherwise things
    won't line up properly). Note the sources are not heavily commented anyway.

    It will always be a snapshot of the actual sources, which are not kept on-line and can change every few seconds.

    You are misusing git and github. git is a "source control" system.
    At least from my point of view (there are a lot of flame wars discussing
    what source control should do) the main task of source control is
    to store all significant versions of software and allow reasonably
    easy retrieval of any version. Logically git stores a separate
    source tree for each version (plus some meta info like log messages).
    Done naively this would lead to serious bloat: with 1547 versions
    it would be almost 1547 times larger than a single version. git
    uses compression to reduce this. AFAICS the actual sources of your
    projects are about 4-5M. With normal git use I would expect the
    (compressed) history to add another 5-10M (if there are a lot of
    deletions then the history would be bigger). Your repo is bigger than
    that, probably due to generated files and .exe files. Note: I understand
    that if you write in your own language, then bootstrap is a problem.
    But for bootstrap mc.c is enough. OK, you want to be independent
    from C, so maybe the .exe is justified. But the .ma files just add bloat.
    Note that github has a release feature; people who want just binaries or
    a single version can fetch a release. And many projects with a bootstrap
    problem say: if you do not have the compiler, fetch an earlier binary
    and use it to build the system. Or they add extra generated things
    to releases but do not keep them in the source repository.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to antispam@math.uni.wroc.pl on Tue Dec 20 14:43:57 2022
    On 20/12/2022 03:44, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    This has lead me to thinking about how command line parameters are
    separated. On either OS you normally type this:

    gcc a.c b.c c.c

    You can't do this, separate with commas, as the comma becomes part of
    each filename:

    gcc a.c, b.c, c.c

    That applies also to my bcc, but there, you CAN have comma-separated
    items inside an @file; with gcc, that still fails.

    Why would you do such a silly thing? If you really want, you can
    redefine 'gcc' so that it strips trailing commas (that is
    trivial). If you like excess characters you can type something
    longer like:

    echo gcc a.c, b.c, c.c | tr -d ',' | bash

    So, what's going on here: is it an OS shell misfeature, or what?

    KISS principle. Commas are legal in filenames and potentially
    useful. On the command line spaces work fine. If you really need
    splitting to work differently there are reasonably simple ways
    to do this; the most crude is above.

    BTW: travelling between the UK and other countries, do you complain
    that cars drive on the wrong side of the road?

    Well it's not the OS on Windows, since 'T.BAT a,b,c' will process a, b,
    c as separate 'a b c' items inside the script (not as "a," etc). (I
    can't test how it works on Linux.)

    There is the IFS variable, which lists the characters used for word
    splitting; you can put a comma there together with whitespace. I have
    never used it myself, but it is used extensively in hairy shell
    scripts like 'configure'.


    An important issue here is that the "OS" is not involved in any of this,
    either on Windows or on Linux.

    In *nix, the shell (not the OS) is responsible for many aspects of
    parsing command lines, including splitting up parameters and expanding wildcards in filenames. So on Linux, writing "gcc a.c, b.c, c.c" in
    bash will call gcc with three parameters - "a.c,", "b.c,", and "c.c".

    On Windows, the standard "DOS Prompt" command-line terminal does very
    little of this. (I don't know the details of Powershell. And if you
    use a different shell on Windows, like bash from msys, you get the
    behaviour of that shell.) So if you have a normal "DOS Prompt" and
    write "gcc a.c, b.c, c.c" then the program "gcc" is called with /one/ parameter. It's up to the program to decide how to parse these.
    Typically it will use one of several different WinAPI calls depending on whether it wants the abomination that is "wide characters", or UTF-8, or
    to hope that everything is simple ASCII. If a program wants to parse
    the string itself using commas as separators, it can do that too.

    Of course most programs - especially those that come from a *nix
    heritage - will choose to parse in the same way as is done by *nix shells.

    I did not know that the batch file interpreter handled commas
    differently like this. Who says you never learn things on Usenet? :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Tue Dec 20 15:39:23 2022
    On 2022-12-20 14:43, David Brown wrote:

    In *nix, the shell (not the OS) is responsible for many aspects of
    parsing command lines, including splitting up parameters and expanding wildcards in filenames.  So on Linux, writing "gcc a.c, b.c, c.c" in
    bash will call gcc with three parameters - "a.c,", "b.c,", and "c.c".

    Under UNIX a process is invoked with the argument list as an argument,
    e.g. the exec* calls. So the UNIX OS enforces a certain view of process
    parameters as a flat list of NUL-terminated strings (and not, say, a key
    map, a tree, an object, etc.).
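
    A minimal sketch of that view (the program name and file names here are
    just placeholders):

        /* Sketch: the caller hands exec* an already-split argument vector -
           a flat, NULL-terminated array of NUL-terminated strings.  The
           kernel/libc does no further splitting of its own. */
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            char *argv[] = { "gcc", "a.c", "b.c", "c.c", NULL };
            execvp(argv[0], argv);      /* returns only on failure */
            perror("execvp");
            return 1;
        }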

    On Windows, the standard "DOS Prompt" command-line terminal does very
    little of this.  (I don't know the details of Powershell.  And if you
    use a different shell on Windows, like bash from msys, you get the
    behaviour of that shell.)

    Under Windows API a process gets the command line, e.g. CreateProcess.

    The Windows approach was the standard for other OSes. With the difference
    that, say, RSX-11 provided a standard system function to parse the command
    line in a common way (DOS borrowed that syntax: DIR /C /B etc). UNIX has
    getopt for this (in its UNIX way: parse the arguments once, get garbage,
    re-sort the garbage again (:-)). Luckily Microsoft refrained from
    providing API calls to parse arguments. I am shivering imagining what kind
    of structures and how many dozens of functions they would come up
    with... (:-))

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Dec 21 00:42:09 2022
    On 20/12/2022 13:43, David Brown wrote:
    On 20/12/2022 03:44, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    This has lead me to thinking about how command line parameters are
    separated. On either OS you normally type this:

        gcc a.c b.c c.c

    You can't do this, separate with commas, as the comma becomes part of
    each filename:

        gcc a.c, b.c, c.c
    That applies also to my bcc, but there, you CAN have comma-separated
    items inside an @file; with gcc, that still fails.

    Why would you do such silly thing?  If you really want you can
    redefine 'gcc' so that it strips trailing commas (that is
    trivial).   If you like excess characters you can type longer
    thing like:

    echo gcc a.c, b.c, c.c | tr -d ',' | bash
    So, what's going on here: is it an OS shell misfeature, or what?

    KISS principle.  Commas are legal in filenames and potentially
    useful.  On command line spaces work fine.  If you really need
    splitting to work differently there are resonably simple ways
    to do this, most crude is above.

    BTW: travelling between UK and other countries do you complain
    that cars drive on wrong side of the road?

    Well it's not the OS on Windows, since 'T.BAT a,b,c' will process a, b,
    c as separate 'a b c' items inside the script (not as "a," etc). (I
    can't test how it works on Linux.)

    There is IFS variable which lists characters used for word splitting,
    you can put comma there together with whitspace.  I never used it
    myself, but it is used extensively in hairy shell scripts like
    'configure'.


    An important issue here is that the "OS" is not involved in any of this, either on Windows or on Linux.

    In *nix, the shell (not the OS) is responsible for many aspects of
    parsing command lines, including splitting up parameters and expanding wildcards in filenames.  So on Linux, writing "gcc a.c, b.c, c.c" in
    bash will call gcc with three parameters - "a.c,", "b.c,", and "c.c".

    On Windows, the standard "DOS Prompt" command-line terminal does very
    little of this.  (I don't know the details of Powershell.  And if you
    use a different shell on Windows, like bash from msys, you get the
    behaviour of that shell.)  So if you have a normal "DOS Prompt" and
    write "gcc a.c, b.c, c.c" then the program "gcc" is called with /one/ parameter.  It's up to the program to decide how to parse these.
    Typically it will use one of several different WinAPI calls depending on whether it wants the abomination that is "wide characters", or UTF-8, or
    to hope that everything is simple ASCII.  If a program wants to parse
    the string itself using commas as separators, it can do that too.

    Of course most programs - especially those that come from a *nix
    heritage - will choose to parse in the same way as is done by *nix shells.

    I did not know that the batch file interpreter handled commas
    differently like this.  Who says you never learn things on Usenet? :-)

    It all depends on how the application decides to do it.

    But most, like gcc, appear to just use the 'argv' parameter of C's main
    entry point, which does not do anything clever with commas.

    Windows itself provides the command line as one long string, either as
    one of the arguments of WinMain(), or obtained via GetCommandLine().

    Then applications could do what they want. So it depends on how C-ified
    an application is.
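
    For instance, a program that wants the conventional splitting of that
    single string can ask shell32's CommandLineToArgvW() to do it, or apply
    its own rules instead (sketch only):

        /* Sketch: fetch the single command-line string and split it with the
           conventional space/quote rules via CommandLineToArgvW(); a program
           is free to apply different rules (e.g. treat commas as separators).
           Link with shell32 (e.g. -lshell32 under MinGW). */
        #include <windows.h>
        #include <shellapi.h>
        #include <stdio.h>

        int main(void)
        {
            int argc = 0;
            LPWSTR *argv = CommandLineToArgvW(GetCommandLineW(), &argc);
            if (argv == NULL)
                return 1;
            for (int i = 0; i < argc; i++)
                printf("%d: %ls\n", i + 1, argv[i]);
            LocalFree(argv);
            return 0;
        }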

    This experiment with batch files I think demonstrates how Windows shell
    (as command prompt or PowerShell) works:

    c:\c>type test.bat
    echo off
    echo %1
    echo %2
    echo %3

    c:\c>test a b c

    c:\c>echo off
    a
    b
    c

    c:\c>test a,b,c

    c:\c>echo off
    a
    b
    c

    c:\c>test a, b, c

    c:\c>echo off
    a
    b
    c

    c:\c>test * *.c

    c:\c>echo off
    *
    *.c
    ECHO is off.

    c:\c>test "a,b,c"

    c:\c>echo off
    "a,b,c"
    ECHO is off.
    ECHO is off.

    The same tests with a C program that just lists main's args works like this:

    c:\c>showargs a b c
    1: showargs
    2: a
    3: b
    4: c

    c:\c>showargs a,b,c
    1: showargs
    2: a,b,c

    c:\c>showargs a, b, c
    1: showargs
    2: a,
    3: b,
    4: c

    c:\c>showargs * *.c
    1: showargs
    2: *
    3: *.c

    c:\c>showargs "a,b,c"
    1: showargs
    2: a,b,c


    Under Windows, if you really wanted to do something as crass as having
    commas within filenames, it's possible, but you have to use quotes.

    Under WSL, that 'showargs * *.c' line works very differently: I get 1153 parameters, which includes 888 files corresponding to *, and 266
    corresponding to *.c, with no way of knowing when you've come to the end
    of one list, and started the other.

    I will just say that the behaviour of my 'test.bat' demo is the most
    sane, with the least surprises.

    You of course will disagree, since whatever Unix does, no matter how
    ridiculous or crass, is perfect, and every other kind of behaviour is
    rubbish.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Dec 21 02:02:34 2022
    On 20/12/2022 06:56, David Brown wrote:
    On 18/12/2022 18:09, Bart wrote:

    A key point here is that almost every general-purpose OS, other than
    Windows, in modern use on personal computers is basically POSIX
    compliant.

    POSIX compliant means basically being a clone of Unix with all the same restrictions and stupid quirks?


    Maybe you haven't done much Python programming and have only worked with small scripts.  But like any other language, bigger programs are split
    into multiple files or modules - only the main program file will be executable.  So if a big program has 50 Python files, only one of them
    will normally be executable and have the shebang and the executable
    flag.  (Sometimes you'll "execute" other modules to run their tests
    during development, but you'd likely do that as "python3 file.py".)


    Oh, just like Windows then?

    Obviously, all 50 modules will contain executable code. You probably
    mean that only the lead module can be launched by the OS and needs
    special permissions.


    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc,
    it is with the name of a .c source file. And 99.9% of the times when I
    invoke it on prog.c as the first or only file to create an executable,
    then I want to create prog.exe.


    OK.  So gcc should base its handling of input on what /you/ do, never
    mind the rest of the world?

    No, based on what LOTS of people do. gcc is used as a /C/ compiler, and
    is probably only ever used as a C compiler. Maybe this is acceptable to you:

    gcc prog.c -oprog -lm
    ./prog

    But I prefer:

    bcc prog
    prog

    Who wouldn't?


    That's fine for your own tools, but not for
    gcc.


    Why not? Have they thought of something as simple as using a dedicated executable name for each language? Like gcc and g++.

    Otherwise you're telling me I have to type 'prog.c' 1000s of times
    because once in a blue moon I might want to compile 'prog.ftn'?

    Note that default file extensions weren't just routine in MSDOS, other
    OSes like ones from DEC did it too, with their Fortran and Algol
    compilers for example.

    And remember that MSDOS had to be used by ordinary people, not Unix gurus.

    In *nix, the dot is just a character, and file extensions are just part
    of the name.  You can have as many or as few as you find convenient and helpful.

    That's a poor show, and explains why apparently simple, user-friendly
    concepts are not practical in Linux.

    I notice however that with 'gcc -c hello.c', it creates a file
    'hello.o', and not 'hello.c.o'. So it does recognise in this case that
    the final (or only) extension has special meaning, and that it does not
    form any part of the logical file name.



    If you intend to assemble three .s files to object files, using
    separate 'as' invocations, they will all be called a.out!

    That would be crass even for a toy program written by a student. And
    yet here it is a mainstream product used by million of people.

    All my language programs (and many of my apps), have a primary type of
    input file, and will default to that file extension if omitted.
    Anything else (eg .dll files) need the full extension.

    Here's something funny: take hello.c and rename to 'hello', with no
    extension. If I try and compile it:

         gcc hello

    it says: hello: file not recognised: file format not recognised.
    Trying 'gcc hello.' is worse: it can't see the file at all.


    How is that "funny" ?  It is perfectly clear behaviour.

    It's funny because Linux famously doesn't need extensions, it looks at
    the file contents. What I've learnt is that Linux relies on extensions
    almost as much as Windows, it's just more inconsistent.


    gcc supports lots of file types.  For user convenience it uses file extensions to tell the file type unless you want to explicitly inform it
    of the type using "-x" options.

    <https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html>

    "hello" has no file extension, so the compiler will not assume it is C. (Remember?  gcc is not just a simple little dedicated C compiler.) Files without extensions are assumed to be object files to pass to the linker,
    and your file does not fit that format.

    So it DOES make some assumptions!


    "hello." is a completely different file name - the file does not exist.
    It is an oddity of DOS and Windows that there is a hidden dot at the end
    of files with no extension - it's a hangover from 8.3 DOS names.

    What about those DEC systems I mentioned? I would be surprised if on
    RSX11M for example (PDP11), which I believe had 6-letter file names and
    3-letter extensions that fit into three 16-bit words via RADIX-50, they
    would bother actually storing a "." character, which is an input and
    display artefact.

    So it is Unix that is peculiar here, in making a logical separator a
    physical one.

    Yes.  It's the only sane way, and consistent with millions of programs spanning 50 years on huge numbers of systems.

    With bcc, I just have to type "bcc hello." to make it work. A trailing
    dot means an empty extension.

    When you make your own little programs for your own use, you can pick
    your own rules.

    The rules make sense for EVERY interactive CLI program. One
    characteristic of Unix programs is that you start them, but then nothing
    happens - it has apparently hung. Actually, it's just waiting for
    user input, but didn't see fit to say so with a brief message.

    Behaviour like that, or defaulting to 'a.out' no matter what, I would
    expect in 'little', temporary and private programs, not something to be inflicted on a million people.


    Do you have a program called "cc" on your path?  It's unlikely.  "cc" is the standard name for the system compiler, which may be gcc or may be something else entirely.

    This was the 'make' program supplied with gcc on Windows.

    Of course, 'make' wouldn't work with my own stuff, which is
    unconventional: no object files, no linking, no list of discrete
    modules. It's basically 'mm prog', by design.




    * I also use several C compilers; how does make know which one I
    intend? How do I pass it options?


    It uses the POSIX standards.  The C compiler is called "cc", the flags passed are in the environment variable CFLAGS.


    So are we only talking about C here, or is it other languages whose
    compilers have been adapted from the C compiler, complete with that
    a.out business?


    If that's not what you want, write a makefile.

    Why would I bother with such stone-age rubbish? (And that hardly ever
    works.)




    If I give another example:

        c:\c>bcc cipher hmac sha2
        Compiling cipher.c to cipher.asm
        Compiling hmac.c to hmac.asm
        Compiling sha2.c to sha2.asm
        Assembling to cipher.exe

    it just works. 'make cipher hmac sha2' doesn't, not even in WSL.

    (And the "advanced AI" can figure out if it is C, C++, Fortran, or
    several other languages.)

    No, it can't. If I have hello.c and hello.cpp, it will favour the .c
    file.


    Sorry, I should have specified that the "advanced AI" can do it on an advanced OS, such as every *nix system since before Bill Gates found
    MS-DOS in a dustbin.

    And yet it got it wrong; I wanted to build the cpp file.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Wed Dec 21 11:07:00 2022
    On 20/12/2022 11:55, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    It will always be a snapshot of the actual sources, which are not kept
    on-line and can change every few seconds.

    You are misusing git and github. git is a "source control" system.
    At least from my point of view (there are a lot of flame wars discussing
    what source control should do) the main task of source control is
    to store all significant versions of software and allow reasonably
    easy retrieval of any version. Logically git stores a separate
    source tree for each version (plus some meta info like log messages).
    Done naively this would lead to serious bloat: with 1547 versions
    it would be almost 1547 times larger than a single version. git
    uses compression to reduce this. AFAICS the actual sources of your
    projects are about 4-5M. With normal git use I would expect the
    (compressed) history to add another 5-10M (if there are a lot of
    deletions then the history would be bigger). Your repo is bigger than
    that, probably due to generated files and .exe files. Note: I understand
    that if you write in your own language, then bootstrap is a problem.
    But for bootstrap mc.c is enough. OK, you want to be independent
    from C, so maybe the .exe is justified. But the .ma files just add bloat.
    Note that github has a release feature; people who want just binaries or
    a single version can fetch a release. And many projects with a bootstrap
    problem say: if you do not have the compiler, fetch an earlier binary
    and use it to build the system. Or they add extra generated things
    to releases but do not keep them in the source repository.


    DB:
    Exactly. You just have a very DOS-biased view as to when they are
    helpful, and when they are not. It's a backwards and limited view due
    to a lifetime of living with a backwards and limited OS.

    So much negativity here.

    First I'm castigated for my language not being original enough: it's
    either a 'rip-off' of C, or a derivative.

    Then, in every single case where I try to be different, or innovative,
    I'm doing it wrong.

    You two aren't going to be happy until my language is a clone of C, with
    tools that work exactly the same way they do on Unix. But then you're
    going to say, what's the point?

    * Case-insensitive: bad, very bad

    * 1-based: bad (a fair number of languages are 1-based too)

    * Choice of array base including 1 and 0: bad (Modern Fortran allows this)

    * Centralised module scheme: very, very bad. You want module info to be specified and repeated not only in every module, but sometimes also in
    each function. (Note that makefiles are a crude form of centralised
    module scheme, but not in a form useful to a compiler)

    * Whole-program compilation: very bad. Yet some newer languages do the
    same thing; Python uses a whole-program bytecode compiler.

    * No object files and no linker: very bad. (Python does this)

    * Instantly create amalgamated sources in one file (I thought this was brilliant): very bad: having a sprawling representation is /much/
    better! And apparently plays badly with Github. (Note: the amalgamated sqlite3.c file is on Github.)

    * Line-oriented syntax: bad. (Python is line-oriented; so is the C preprocessor.)

    * Out of order definitions: bad: You /want/ to have to write and
    maintain forward declarations for everything, and sometimes it's not
    possible (circular refs in structs for example)

    * Tools that primarily work on one file type do not need extensions for
    that file to be specified on a command line: bad. OK, that was typical
    on DEC in the 1970s; why doesn't it work now? Oh, because Unix treats
    "." in a funny way, or you could in theory have a file called "c.c.c.c".
    You know, the solution in those 0.1% of cases is very simple: then you
    have to write the full extension. But you have the convenience the other
    99.9% of the time.

    * I don't use makefiles: very, very, very bad. Yet what would be inside
    a makefile for my language where you build a whole program using one
    command ('mm prog')? Do I need to maintain a duplicate list of modules
    so that it can work out that I don't need to spend 50ms on rebuilding
    from scratch? (Note: doing 'make hello' when hello is up to date also
    takes 50ms, but that's one 5-line source file. It's easier to just build anyway.)

    * Providing distributions in form of binary: bad, I think.

    * Providing distributions in a form one step back from binary (eg. a
    single file containing C source code): bad, I think.

    * Putting stuff on Github: bad, because apparently I'm doing it wrong.
    OK, I've taken my sources off it completely; does that help?


    I get the impression that everything I try is viewed negatively.

    At least, I don't remember anyone saying, What a great idea, Bart! Or,
    Yeah, I'd like that, but unfortunately the way Linux works makes that impractical.

    Instead, it would be, Yeah, that's what you would expect from a rubbish
    OS that Bill Gates found in a bin.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Thu Dec 22 14:03:33 2022
    On 21/12/2022 12:07, Bart wrote:
    On 20/12/2022 11:55, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    It will always be a snapshot of the actual sources, which are not kept
    on-line and can change every few seconds.

    You are misusing git and github.  git is a "source control" system.
    At least from my point of view (there are lots of flame wars discussing
    what source control should do), the main task of source control is
    to store all significant versions of the software and to allow reasonably
    easy retrieval of any version.  Logically, git stores a separate
    source tree for each version (plus some meta info like log messages).
    Done naively this would lead to serious bloat: with 1547 versions
    it would be almost 1547 times larger than a single version.  git
    uses compression to reduce this.  AFAICS the actual sources of your
    projects are about 4-5M.  With normal git use I would expect the
    (compressed) history to add another 5-10M (if there are a lot of
    deletions then the history would be bigger).  Your repo is bigger than
    that, probably due to generated files and the .exe.  Note: I understand
    that if you write in your own language, then bootstrapping is a problem.
    But for bootstrapping, mc.c is enough.  OK, you want to be independent
    from C, so maybe the .exe.  But the .ma files just add bloat.  Note that
    github has a release feature; people who want just binaries or a single
    version can fetch a release.  And many projects with a bootstrap
    problem say: if you do not have the compiler, fetch an earlier binary
    and use it to build the system.  Or they add extra generated things
    to releases but do not keep them in the source repository.


    DB:
    Exactly.  You just have a very DOS-biased view as to when they are helpful, and when they are not.  It's a backwards and limited view due
    to a lifetime of living with a backwards and limited OS.

    So much negativity here.

    I have long experience with MS-DOS and Windows, and long experience with
    *nix. DOS is absolute shite in comparison - it was created as a cheap knock-off of other systems, thrown together quickly for a throw-away
    marketing project by IBM. Unfortunately IBM forgot to throw away the
    project and it was accidentally successful, resulting in the world being
    stuck with hardware, software and a processor ISA that were known to be third-rate outdated cheapo solutions at the time the IBM PC was first
    released. Those turds have been polished a great deal in the last 35
    years or so - some versions of Windows are okay, and modern x86-64
    processors are very impressive engineering - but turds they remain at
    their core. While some designs were planned to be forward compatible
    with future enhancements (like the 68k processor architecture, or the
    BBC MOS operating system), and some were designed to be compatible with everything above a set minimum (like Unix), x86 and DOS then Windows
    have been saddled with backwards compatibility as their prime
    motivation. (This isn't really Microsoft or Intel's fault - they are
    stuck with it, and as a result many of their more innovative projects
    have failed even when they were good ideas.)

    Yes, I have a rather negative view on DOS.

    This is not personal - I don't have a negative view of you!

    And while I have a negative view of one-man languages other than for
    fun, learning, research, or very niche applications, I am always
    impressed by people who make them.


    I get the impression that everything I try is viewed negatively.

    Your memory is biased.


    At least, I don't remember anyone saying, What a great idea, Bart!

    You've heard that a lot from me. Mostly when you list features that you
    think are absolutely critical in a language, I ignore them because they
    are so repetitive. But when I do comment on them, I regularly and
    happily comment positively on the ones I like or that I think are often
    liked by others. But I won't lie to you and tell you that I think
    1-based arrays are a good idea, or that case-insensitivity is
    universally liked, or that line-oriented syntax is innovative, or that I
    think out-of-order definitions makes a significant difference in my
    programming (it's nice to have, but for me, the disadvantages balance
    the advantages).

    To cheer you up, from your last list I agree that proper modules are
    important, whole-program optimisation is great, and traditional
    pre-compiled object files are outdated.

    Or,
    Yeah, I'd like that, but unfortunately the way Linux works makes that impractical.

    I don't think anything related to Linux or DOS/Windows is at all
    relevant to your language - it should work the same on any system. Your
    tools don't follow *nix common standards, but they would not be the
    first tools on Linux that are unconventional.


    Instead, it would be, Yeah, that's what you would expect from a rubbish
    OS that Bill Gates found in a bin.


    Bill Gates boasted about how he searched rubbish bins for printouts of other
    people's code, and copied it (without a care for copyright, licensing, recognition, or quality) into his own code for Microsoft.


    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a criticism
    of you or your language.

    I also thought it was quite clear that I was simply telling you how
    things are, and how things work in Linux - primarily to help you out
    with a system that is unfamiliar to you.

  • From Andy Walker@21:1/5 to Bart on Thu Dec 22 15:09:55 2022
    On 21/12/2022 11:07, Bart wrote:
    [...]
    So much negativity here.
    [... To David and Waldek:]
    You two aren't going to be happy until my language is a clone of C,
    with tools that work exactly the same way they do on Unix. But then
    you're going to say, what's the point?

    You [and probably Dmitry] seem to have a very weird idea of what
    Unix /is/. To really understand why many of us have been happy users of
    Unix [and somewhat less so of Linux*] for several decades, you need to understand the history, what came before, and how Unix then evolved. I
    don't intend to write an essay here; there are books on the subject.
    But what you seem to fail to appreciate is that very little of Unix is
    laid down in concrete. It would take major surgery to change the file
    system significantly; you are probably also stuck with the "exec"
    family of system calls; you would be unwise to tamper too much with
    the basic security mechanisms. But thereafter, it's entirely up to
    you.

    You're bright enough to be able to write your own language and compiler. So you're surely bright enough to write your own shell, or
    to tinker with one of those already available -- they /all/ came into
    existence because some other bright person wanted something different.
    Bright enough to write wrappers for things where you would prefer the
    defaults to be different, to write your own editor, your own tools for
    all purposes. All, every single one, of those supplied "by default"
    again came into being because someone decided they wanted it and wrote
    the requisite code. Sources are freely available, so if you want
    something different and don't want to write your own, you can play
    with the code that someone else wrote. Entirely up to you.

    When David says "you can do X", he doesn't mean "you /have/ to
    do X". There is almost no compulsion. All the tools are there, use
    them as you please. When you complain about some aspect of "gcc" or
    "make" or whatever, you're actually complaining that people who gave
    their time and expertise freely to provide a tool that /they/ wanted,
    haven't done so to /your/ specification. Well, shucks.

    To give one example, you have been wittering recently about
    the fact that "cc hello; hello" doesn't, as you would like, find and
    compile a program whose source is in "hello.c", put the binary into
    "hello", and run it. But you can write your own almost trivially;
    it's a "one-line" shell script [for large values of "one", but that's
    to provide checks rather than because it's complicated]. You complain
    that you have to write "./hello" rather than just "hello"; but that's
    because "." is not in your "$PATH", which is set by you, not because
    Unix/Linux insists on extra verbiage. If you need further help, just
    ask. But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].

    [...]
    At least, I don't remember anyone saying, What a great idea, Bart!
    Or, Yeah, I'd like that, but unfortunately the way Linux works makes
    that impractical.

    Perhaps you would tell us what great ideas you'd like "us" to
    consider? The things I recall you telling us are things that existed
    long ago in other languages, such as 1-based arrays, line-based syntax,
    or case insensitivity. If you want them in Unix/Linux, you can have
    them, and no-one will, or should, tell you it's impractical. But if
    the things you want are different from the things most of the rest of
    the world wants, then you may need to write your own or adapt what is
    already freely available.

    _____
    * Linux, sadly, has acquired a degree of bloat. Eg, "man gcc" comes
    to some 300 pages, compared with the two pages of "man cc" in the
    7th Edition version. Basically, it's always easier to add more
    to an existing facility than to take stuff out. Grr! We used to
    grumble when the binary of a fully-featured browser went to over
    a megabyte. Now we scarcely turn a hair at the size of Firefox,
    or the number of processes it spawns. Grr.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Simpson

  • From David Brown@21:1/5 to Andy Walker on Thu Dec 22 16:57:33 2022
    On 22/12/2022 16:09, Andy Walker wrote:
      * Linux, sadly, has acquired a degree of bloat.  Eg, "man gcc" comes
        to some 300 pages, compared with the two pages of "man cc" in the
        7th Edition version.  Basically, it's always easier to add more
        to an existing facility than to take stuff out.  Grr!  We used to
        grumble when the binary of a fully-featured browser went to over
        a megabyte.  Now we scarcely turn a hair at the size of Firefox,
        or the number of processes it spawns.  Grr.


    It's Wirth's law - software gets slower faster than hardware gets
    faster. It's not a Linux innovation!

    (I don't care at all how big the program Firefox is - but it does annoy
    me that it takes so many GB of memory. Maybe it's time to close some of
    these couple of hundred tabs split amongst several dozen windows
    arranged around 12 virtual desktops!)

  • From Dmitry A. Kazakov@21:1/5 to Andy Walker on Thu Dec 22 17:05:35 2022
    On 2022-12-22 16:09, Andy Walker wrote:

        You [and probably Dmitry] seem to have a very weird idea of what Unix /is/.

    I don't know about Bart. As for me, I started with PDP-11 UNIX and
    continued with m68k UNIX Sys V. Both were utter garbage, inferior in every
    possible aspect to any competing system, with the worst C compilers
    I have ever seen. After these I spent a couple of years maintaining Sun
    Solaris machines, which were decent systems. I installed, ran and maintained
    the earliest versions of Linux on i386, when an i486 was considered a
    "mainframe", when the kernel had to be configured and compiled (device
    specific drivers, interrupts and addresses set manually, etc). Setting up
    X11 on SVGA cards with display modes as listed in the CRT monitor
    manual, what a joy!

    I know exactly what UNIX is.

    But what you seem to fail to appreciate is that very little of Unix is
    laid down in concrete.

    What David wrote about DOS/Windows being rotten at the core, which no amount of lipstick can cure, fully applies to UNIX. They are a pair of ugly
    siblings.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Thu Dec 22 16:59:56 2022
    On 22/12/2022 16:21, David Brown wrote:
    On 21/12/2022 01:42, Bart wrote:

    You of course will disagree, since whatever Unix does, no matter how
    ridiculous or crass, is perfect, and every other kind of behaviour is
    rubbish.


    Yes.

    But then, I would not normally keep a thousand files in one directory. I think even MS-DOS has supported directory trees since version 2.x.

    You've obviously never written programs for other people to run on their
    own PCs. How do /you/ know how other people will organise their files?
    Who are you to tell them how to do so?

    And it might in any case be up to third party apps how files are
    generated on your client's machine.

    But my point about '* *.c', which you've chosen to ignore, is valid even
    for ten files; it's just wrong.

    It might be acceptable within a higher level language where each
    wildcard spec expands to a list of files which itself is a nested
    element of the parameter list. But it doesn't work if you just
    concatenate everything into one giant list; there are too many ambiguities.

    Of course, you will never agree there's anything wrong with it; you will
    defend Linux to the death. Or you will point out that you can do X, Y and Z
    to turn off this 'globbing', which then causes problems for programs that
    depend on it, and which it is now up to each customer to do persistently
    on their machines.

  • From David Brown@21:1/5 to Bart on Thu Dec 22 17:21:07 2022
    On 21/12/2022 01:42, Bart wrote:

    You of course will disagree, since whatever Unix does, no matter how ridiculous or crass, is perfect, and every other kind of behaviour is rubbish.


    Yes.

    But then, I would not normally keep a thousand files in one directory.
    I think even MS-DOS has supported directory trees since version 2.x.

  • From David Brown@21:1/5 to Bart on Thu Dec 22 17:17:02 2022
    On 21/12/2022 03:02, Bart wrote:
    On 20/12/2022 06:56, David Brown wrote:
    On 18/12/2022 18:09, Bart wrote:

    A key point here is that almost every general-purpose OS, other than
    Windows, in modern use on personal computers is basically POSIX
    compliant.

    POSIX compliant means basically being a clone of Unix with all the same restrictions and stupid quirks?

    It means a set of common features that you can rely on when writing
    portable code.



    Maybe you haven't done much Python programming and have only worked
    with small scripts.  But like any other language, bigger programs are
    split into multiple files or modules - only the main program file will
    be executable.  So if a big program has 50 Python files, only one of
    them will normally be executable and have the shebang and the
    executable flag.  (Sometimes you'll "execute" other modules to run
    their tests during development, but you'd likely do that as "python3
    file.py".)


    Oh, just like Windows then?

    Sure. Python is cross-platform. The difference is that if I have
    "main.py" that includes "utils.py", and only "main.py" is intended to be executable, then on Linux I can only try to execute "main.py" directly.
    On Windows, I can happily run "utils.py" exactly as I run "main.py"
    with whatever accidental consequences that might have. (Usually nothing
    bad.)


    Obviously, all 50 modules will contain executable code. You probably
    mean that only the lead module can be launched by the OS and needs
    special permissions.


    Yes, that is what "executable" means.


    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc,
    it is with the name of a .c source file. And 99.9% of the times when
    I invoke it on prog.c as the first or only file to create an
    executable, then I want to create prog.exe.


    OK.  So gcc should base its handling of input on what /you/ do, never
    mind the rest of the world?

    No, based on what LOTS of people do. gcc is used as a /C/ compiler, and
    is probably only ever used as a C compiler.

    I think C++ programmers might disagree. So would Fortran programmers,
    or people who use gcc as the front-end for linking (that covers most
    people who use gcc at all), or many of those who use any of the other
    languages it covers.

    Even the name of the tool means "GNU Compiler Collection" - not "GNU C Compiler" as you seem to think.

    Maybe this is acceptable to
    you:

        gcc prog.c -oprog -lm
        ./prog

    But I prefer:

        bcc prog
        prog

    Who wouldn't?


    When your toolchain is so simple and limited, it only needs a simple
    interface - and yes, simple is a good thing. When a tool is advanced
    and has many features, a somewhat more involved interface is needed even
    in simple use-cases. That is inevitable.

    But if that's the way you want to have things, you can put this in a
    file called "rcc" :

    #!/bin/sh
    gcc "$1.c" -o "$1" -lm && ./"$1"


    Put the file "rcc" in a directory on your path ("~/bin" is a common
    choice), and now you can type :

    rcc prog

    That will compile "prog.c", and if the compilation was successful, it
    will run it.

    Feel free to add whatever other gcc options you like (I recommend "-O2
    -Wall -Wextra" as a starting point). It's done once, in one file, that
    you can run ever after.

    I hope you haven't spent years complaining about gcc parameters, file
    names, makefiles, etc., rather than writing such a two-line script.
    (And in Windows it's just a one line batch file.)


    (I don't think there is much I could add to your other comments - you
    clearly have no interest in any answers.)


    Why would I bother with such stone-age rubbish? (And that hardly ever
    works.)


    You really do specialise in failing to use tools others use happily.
    But then, you put a lot of effort into making sure you fail.

  • From Bart@21:1/5 to Andy Walker on Thu Dec 22 16:46:21 2022
    On 22/12/2022 15:09, Andy Walker wrote:
    On 21/12/2022 11:07, Bart wrote:
    [...]
    So much negativity here.
    [... To David and Waldek:]
    You two aren't going to be happy until my language is a clone of C,
    with tools that work exactly the same way they do on Unix. But then
    you're going to say, what's the point?

        You [and probably Dmitry] seem to have a very weird idea of what Unix /is/.  To really understand why many of us have been happy users of Unix [and somewhat less so of Linux*] for several decades, you need to understand the history, what came before, and how Unix then evolved.  I don't intend to write an essay here;  there are books on the subject.
    But what you seem to fail to appreciate is that very little of Unix is
    laid down in concrete.  It would take major surgery to change the file system significantly;  you are probably also stuck with the "exec"
    family of system calls;  you would be unwise to tamper too much with
    the basic security mechanisms.  But thereafter, it's entirely up to
    you.

        You're bright enough to be able to write your own language and compiler.  So you're surely bright enough to write your own shell, or
    to tinker with one of those already available -- they /all/ came into existence because some other bright person wanted something different.
    Bright enough to write wrappers for things where you would prefer the defaults to be different, to write your own editor, your own tools for
    all purposes.  All, every single one, of those supplied "by default"
    again came into being because someone decided they wanted it and wrote
    the requisite code.  Sources are freely available, so if you want
    something different and don't want to write your own, you can play
    with the code that someone else wrote.  Entirely up to you.

        When David says "you can do X", he doesn't mean "you /have/ to
    do X".  There is almost no compulsion.  All the tools are there, use
    them as you please.  When you complain about some aspect of "gcc" or
    "make" or whatever, you're actually complaining that people who gave
    their time and expertise freely to provide a tool that /they/ wanted,
    haven't done so to /your/ specification.  Well, shucks.

        To give one example, you have been wittering recently about
    the fact that "cc hello; hello" doesn't, as you would like, find and
    compile a program whose source is in "hello.c", put the binary into
    "hello", and run it.  But you can write your own almost trivially;
    it's a "one-line" shell script [for large values of "one", but that's
    to provide checks rather than because it's complicated].  You complain
    that you have to write "./hello" rather than just "hello";  but that's because "." is not in your "$PATH", which is set by you, not because Unix/Linux insists on extra verbiage.  If you need further help, just
    ask.  But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].

    So lots of workarounds to be able to do what DOS, maligned as it was,
    did effortlessly.

    Don't forget it is not just me personally who would have trouble. For
    over a decade, I was supplying programs that users would have to launch
    from their DOS systems, or on 8-bit systems before that.

    So every one of 1000 users would have to be told how to fix that "."
    problem? Fortunately, nobody really used Unix back then (Linux was not
    yet ready), at least among our likely customers who were just ordinary
    people.

    It was also fortunate because, with case-sensitivity in the shell and
    file system, it would have created a lot more customer support headaches.


    But you can write your own almost trivially;
    it's a "one-line" shell script

    Sure. I also asked: if it is so trivial, why don't programs do that
    anyway? They could at least learn something from DOS, namely user-friendliness.

    Every time I complain about building stuff from Linux, people talk about installing CYGWIN, or MSYS, or WSL, because they have 100 things that are apparently indispensable for building (I can tell you, they're not; those dependencies were deliberate choices).

    My needs for building stuff on Linux can be satisfied, as you say, with
    a handful of 1-line scripts, and those are not indispensable either;
    just convenient.


    [...]
    At least, I don't remember anyone saying, What a great idea, Bart!
    Or, Yeah, I'd like that, but unfortunately the way Linux works makes
    that impractical.

        Perhaps you would tell us what great ideas you'd like "us" to consider?  The things I recall you telling us are things that existed
    long ago in other languages, such as 1-based arrays, line-based syntax,
    or case insensitivity.

    Well, I am standing up for those features and refusing to budge just
    because C and Linux have taken over the world and are shoving 0-based indexing and case-sensitivity down people's throats.

    Notice that most user-facing interfaces tend to be case-insensitive? And
    for good reason. But don't forget that DOS was a user-facing CLI.

    Any Linux shell made a terrible CLI, but I guess it was designed for
    gurus rather than ordinary people.


    As for other features, you can imagine I'm not really in the mood to go
    through them again. A lot of the innovative stuff is to do with project
    description and with compiling and running programs, rather than the language itself.

    So, nobody here thinks that doing 'mm -ma appl' to produce a one-file
    appl.ma file representing /the entire application/, that can be
    trivially compiled remotely using 'mm appl.ma', is a great idea?

    Apparently it's little different from using 'tar'!

    Well, have a look at the A68G source bundle for example: inside the .gz2
    part which compresses it, there is a .tar file. Unfortunately, when I tried:

    gcc algol68g-3.1.0.tar

    it didn't work: file not recognised. You have to untar all the component
    files and directories, and build conventionally, or at least conventionally for
    Linux. Like so many, this application starts with a 'configure' script, although only 9500 lines this time. So I can't build it on normal Windows.

    Now look again at my 'mm appl.ma' which Just Works, and further, can do
    so on either OS ('mm appl.ma' on Windows, './mc appl.ma' on Linux, or
    just 'mc appl.ma' once you've figured out the "./" disappearing trick).

    There is absolutely no comparison. So I find it surprising there is
    lacklustre enthusiasm.

    (I tried building this program on WSL. It took about 80 seconds in all.

    But typing 'make' again still took 1.4 seconds even with nothing to do.

    Then I looked inside the makefile: it was an auto-generated one with
    nearly 3000 lines of crap inside - no wonder it took a second and a half
    to do nothing!

    And this stuff is supposed to be miles better than what I do?)

  • From Dmitry A. Kazakov@21:1/5 to Bart on Thu Dec 22 18:32:53 2022
    On 2022-12-22 17:46, Bart wrote:

    Any Linux shell made a terrible CLI, but I guess it was designed for
    gurus rather than ordinary people.

    No, they were brainless designs, just as MS-DOS batch was. This was just
    like with C and its countless C-esque followers. There were (and still are)
    sh, csh, tcsh, ksh, bash. Each new incarnation improved on the old ones by
    parroting their every mistake. In the early days of UNIX everybody's hobby
    was to redefine the command prompt, ls, ps, etc. in the shell
    using an rc script. At some point people gave up. Exactly the same situation
    was and is with text editors. There have existed hundreds of absolutely
    disgusting things. An iconic quotation from Datamation was:

    "... Real Programmers consider "what you see is what you get" to be just
    as bad a concept in Text Editors as it is in Women. No, the Real
    Programmer wants a "you asked for it, you got it" text editor--
    complicated, cryptic, powerful, unforgiving, dangerous."

    -- Real Programmers Don't Use Pascal

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Thu Dec 22 21:55:29 2022
    On 22/12/2022 13:03, David Brown wrote:
    On 21/12/2022 12:07, Bart wrote:

    So much negativity here.

    I have long experience with MS-DOS and Windows,

    So have I.

    and long experience with
    *nix.

    I looked into it every few years; it always looked shite to me.

    However, I should say I have little interest in operating systems
    anyway. DOS was fine because it didn't get in my way. It provided a file system, could copy files, launch programs etc, and it didn't cut my productivity and sanity in half by throwing in case-sensitivity. What
    else did I need?

    I expect you didn't like DOS because it doesn't have the dozens of toys
    that you came to rely on in Unix, including a built-in C compiler; what
    luxury!

    It's because DOS was so sparse that I have few dependencies on it; and
    my stuff can build on Linux more easily than Linux programs can build on Windows. (AIUI, most won't; you need to use CYGWIN or MSYS or WSL, but
    then you don't get a bona fide Windows executable that customers can run directly.)


    DOS is absolute shite in comparison - it was created as a cheap
    knock-off of other systems, thrown together quickly for a throw-away marketing project by IBM.

    This is from someone who used, what was it, a Spectrum machine?

    I was involved in creating 8-bit business computers at the time, and
    looked down on such things. (But it was also my job to investigate
    similar, low-cost designs for hobbyist computers as an area of expansion.)

    BTW our machines used a rip-off of CP/M. My boss approached Digital
    Research but couldn't come to an agreement on licensing. So we (not me
    though) created a clone. So why is saving money a bad thing?

    I don't know exactly what you expected from an OS that ran on a 64KB
    machine, which wasn't allowed to use more than about 8KB.

    And, where /were/ the PCs with Unix in those days? Where could you buy
    one? Would you be able to do much on it other than endlessly configure
    stuff to make it work? Could you create binaries that were guaranteed to
    work with any other Unix?

    How unfriendly would it have been to supply apps as software bundles
    that would take an age to build on a dual-floppy machine, with users
    having to keep feeding it floppies?

    I think you just have little experience of that world of creating
    products for low-end consumer PCs.

    IME Linux systems were poor, amateurish attempts at an OS where lots of
    things just didn't work, until the early 2000s. GUIs came late too, and
    looked dreadful. By comparison, Microsoft Windows looked professional.

    Yes you had to pay for it; is that what this is about, that Linux is free?


    Unfortunately IBM forgot to throw away the
    project and it was accidentally successful,

    Good.

    resulting in the world being
    stuck with hardware, software and a processor ISA that were known to be third-rate

    The IBM PC was definitely more advanced than my 8-bit business machine,
    if not that much faster despite an internal 16-bit processor.

    The 8088/86/286 had some disappointing limitations, which were fixed
    with the 80386.

    outdated cheapo solutions at the time the IBM PC was first
    released.  Those turds have been polished a great deal in the last 35
    years or so

    The architecture was open. There was a huge market in add-on
    peripherals, and they came with drivers that worked. Good luck in
    finding equivalent support in the 1990s for even a printer driver under Linux.


    - some versions of Windows are okay, and modern x86-64
    processors are very impressive engineering - but turds they remain at
    their core.  While some designs were planned to be forward compatible
    with future enhancements (like the 68k processor architecture, or the
    BBC MOS operating system), and some were designed to be compatible with everything above a set minimum (like Unix), x86 and DOS then Windows
    have been saddled with backwards compatibility as their prime
    motivation.

    Which has been excellent. Until they chose not to support 16-bit
    binaries under 64-bit Windows.

    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a criticism
    of you or your language.

    I get annoyed when people openly diss Windows, or MSDOS, simply for not
    being Linux.

  • From David Brown@21:1/5 to Bart on Fri Dec 23 16:50:21 2022
    On 22/12/2022 17:59, Bart wrote:
    On 22/12/2022 16:21, David Brown wrote:
    On 21/12/2022 01:42, Bart wrote:

    You of course will disagree, since whatever Unix does, no matter how
    ridiculous or crass, is perfect, and every other kind of behaviour is
    rubbish.


    Yes.

    But then, I would not normally keep a thousand files in one directory.
    I think even MS-DOS has supported directory trees since version 2.x.

    You've obviously never written programs for other people to run on their
    own PCs. How do /you/ know how other people will organise their files?
    Who are you to tell them how to do so?

    And it might in any case be up to third party apps how files are
    generated on your client's machine.

    But my point about '* *.c', which you've chosen to ignore, is valid even
    for ten files; it's just wrong.

    It might be acceptable within a higher level language where each
    wildcard spec expands to a list of files which itself is a nested
    element of the parameter list. But it doesn't work if you just
    concatenate everything into one giant list; there are too many ambiguities.

    Of course, you will never agree there's anything wrong with it; you will
    defend Linux to the death. Or you will point out that you can do X, Y and Z
    to turn off this 'globbing', which then causes problems for programs that depend on it, and which it is now up to each customer to do persistently
    on their machines.

    I can't figure out what you are worrying about here.

    In any shell, in any OS, for any program, if you write "prog *" the
    program is run with a list of all the files in the directory. If you
    wrote "prog * *.c", it will be started with a list of all the files,
    followed by a list of all the ".c" files.

    It's the same in DOS, Linux, Windows, Macs, or anything else you like.
    It's the same for any shell.

    The difference is that for some shells (such as Windows PowerShell or
    bash), the shell does the work of finding the files and expanding the
    wildcards because this is what /every/ program needs - there's no point
    in repeating the same code in each program. In other shells, such as
    DOS "command prompt", every program has to have that functionality added
    to the program.

    Well, I say "every program" supports wildcards for filenames - I'm sure
    there are some DOS/Windows programs that don't. But most do.
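
    For concreteness, a minimal argument-dumping program along the lines of
    the 'showargs' tool that appears later in the thread might look like the
    sketch below (this is only an illustrative stand-in, not Bart's actual
    showargs; the file and program names are hypothetical). Run from bash as
    "./args * *.c" it prints one line per matching file, because the shell has
    already expanded the wildcards; run from the DOS/Windows command prompt it
    prints the literal "*" and "*.c", because that shell leaves expansion to
    the program.

        /* args.c -- print each command-line argument on its own line.
           A hypothetical stand-in for the 'showargs' program mentioned in
           this thread; compile with e.g. "gcc args.c -o args". */
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            for (int i = 0; i < argc; i++)
                printf("%d: %s\n", i, argv[i]);
            return 0;
        }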

  • From David Brown@21:1/5 to Bart on Fri Dec 23 16:38:56 2022
    On 22/12/2022 22:55, Bart wrote:
    On 22/12/2022 13:03, David Brown wrote:
    On 21/12/2022 12:07, Bart wrote:

    So much negativity here.

    I have long experience with MS-DOS and Windows,

    So have I.

    and long experience with *nix.

    I looked into it every few years; it always looked shite to me.

    However, I should say I have little interest in operating systems
    anyway. DOS was fine because it didn't get in my way. It provided a file system, could copy files, launch programs etc, and it didn't cut my productivity and sanity in half by throwing in case-sensitivity. What
    else did I need?

    I expect you didn't like DOS because it doesn't have the dozens of toys
    that you came to rely on in Unix, including a built-in C compiler; what luxury!


    I find it useful to have a well-populated toolbox. I am an engineer -
    finding the right tools for the job, and using them well, is what I do.
    DOS gave you a rock to bash things with. Some people, such as
    yourself, seem to have been successful at bashing out your own tools
    with that rock. That's impressive, but I find it strange how you can be
    happy with it.

    The alternative in DOS and Windows has always been to buy additional
    tools that *nix users take for granted. And those tools always have to
    include /everything/. So while a Pascal compiler for *nix just needs to
    be a Pascal compiler - because there are editors, build systems,
    debuggers, libraries, assemblers and linkers already - on DOS/Windows
    you had to get Turbo Pascal or Borland Pascal which had everything
    included. (Turbo Pascal for DOS and Win3.1 included "make", by the way
    - for well over a decade that was the build tool I used for all my
    assembly and C programming on microcontrollers.) And then when you
    wanted C, you bought MSVC which included a different editor with a
    different setup, a different assembler, a different build tool (called
    "nmake"), and so on. Everything was duplicated but different, everything
    incompatible, everything a huge waste of the manufacturers' time and
    effort, and a huge waste of the users' money and time as they had to get
    familiar with another set of basic tools.

    Do I like the fact that *nix has always come with a wide range of general-purpose tools? Yes, I most surely do!

    It's because DOS was so sparse that I have few dependencies on it; and
    my stuff can build on Linux more easily than Linux programs can build on Windows. (AIUI, most won't; you need to use CYGWIN or MSYS or WSL, but
    then you don't get a bona fide Windows executable that customers can run directly.)


    Programs built with msys2 work fine on any Windows system. You have to
    include any DLL's you use, but that applies to all programs with all tools.


    DOS is absolute shite in comparison - it was created as a cheap
    knock-off of other systems, thrown together quickly for a throw-away
    marketing project by IBM.

    This is from who used, what was it, a Spectrum machine?


    I used many different systems, including a Spectrum. But you expect a different quality of design from a machine made as cheap as possible for
    home use, primarily for games, than from an expensive system for serious
    professional business use.

    I was involved in creating 8-bit business computers at the time, and
    looked down on such things. (But it was also my job to investigate
    similar, low-cost designs for hobbyist computers as an area of expansion.)

    BTW our machines used a rip-off of CP/M. My boss approached Digital
    Research but couldn't come to an agreement on licensing. So we (not me though) created a clone. So why is saving money a bad thing?


    Saving money is fine.

    I don't know exactly what you expected from an OS that ran on a 64KB
    machine, which wasn't allowed to use more than about 8KB.

    I don't know exactly either.

    I think if IBM had followed their plan - learn what the market needed,
    then throw the "proof of concept" out and design something serious for
    the future - it would have been far better.

    I realise that hardware was expensive and there are limits to what can
    be done in so little ram. But even then it could have been /so/ much
    better.

    Start with the processor - it was crap. The IBM engineers knew it was
    crap, and didn't want it. But they were forced to use something that
    was basically limited to 64 KB before resorting to an insane segmented
    memory system that was limited to 1 MB, leading to even more insane
    hacks to get beyond that. The engineers wanted something that had a
    future, such as the 68000.

    Then there was the OS - it was crap. Unix was vastly better than CP/M
    and the various DOS's. But IBM was not involved in Unix at that time,
    and did not want others to control the software on their machines.
    (They thought they could control Microsoft.)

    Or compare it to other machines, such as the BBC Micro. It had the OS
    in ROM, making it far simpler and more reliable. It had a good
    programming language in ROM. (BASIC, but one of the best variants.)
    This meant new software could be written quickly and easily in a high
    level language, instead of assembly (as was the norm for the PC in the
    early days). It had an OS that was expandable - it supported pluggable
    file systems that were barely imagined at the time the OS was designed.
    It was a tenth of the price of the PC. Even if you equipped it with networking (unheard-of in the PC world), external disk drives, and
    multiple languages and software in ROM, it was still far cheaper than a
    PC despite being easier to use and having a vastly better text and
    graphics system. You could even get a Z80 co-processor with CP/M.

    Of course the BBC Micro wasn't perfect either, and had limitations
    compared to the PC - not surprising, given the price difference. The
    6502 was not a powerful processor.

    But imagine what we could have had with a computer using a 68000,
    running a version of Unix, combining the design innovation,
    user-friendliness and forward thinking of Acorn and the business
    know-how of IBM? It would have been achievable at lower cost than the
    IBM PC, and /so/ much better.

    As it was, by the mid-eighties there were home computers with usability, graphics and user interfaces that were not seen in the PC world for a
    decade. There were machines that were cheaper than the PC's of the time
    and could /emulate/ PC's at near full-speed. The PC world was far
    behind in its hardware, OS and basic software. But the IBM PC and MSDOS
    won out because the other machines were not compatible with the IBM PC
    and MSDOS.


    And, where /were/ the PCs with Unix in those days? Where could you buy
    one? Would you be able to do much on it other than endlessly configure
    stuff to make it work? Could you create binaries that were guaranteed to
    work with any other Unix?

    How unfriendly would it have been to supply apps as software bundles
    that would take an age to build on a dual-floppy machine, with users
    havin to keep feeding it floppies?
    having to keep feeding it floppies?

    I think you just have little experience of that world of creating
    products for low-end consumer PCs.

    IME Linux systems were poor, amateurish attempts at an OS where lots of things just didn't work, until the early 2000s. GUIs came late too, and looked dreadful. By comparison, Microsoft Windows looked professional.


    I used Unix systems at university in the early 1990's, and by $DEITY it
    was a /huge/ step backwards when I had to move to Windows. (To be fair,
    there was a very significant price difference involved.) Even Windows
    95 was barely on a feature par with Archimedes machines of a decade
    previous, and of course was never close to it in stability.

    Yes, I agree that Windows 2000 was the point when Windows started
    looking and working like a professional OS. (I used OS/2 earlier.) And
    yes, Linux was quite limited and had not "taken off" at that time.

    Yes you had to pay for it; is that what this is about, that Linux is free?


    No. SunOS and Solaris, which I used earlier, were very far from free.


    Unfortunately IBM forgot to throw away the project and it was
    accidentally successful,

    Good.

    resulting in the world being stuck with hardware, software and a
    processor ISA that were known to be third-rate

    The IBM PC was definitely more advanced than my 8-bit business machine,
    if not that much faster despite an internal 16-bit processor.

    The 8088/86/286 had some disappointing limitations, which were fixed
    with the 80386.


    It's a pity MS didn't catch up with proper 32-bit support until AMD were already working on 64-bit versions.

    The 80386 was certainly an improvement on its predecessors, but still
    way behind state of the art of the time. Backwards compatibility was a
    killer - it has always been the killer feature of the DOS/Windows/x86
    world, and it has always killed innovation.


    outdated cheapo solutions at the time the IBM PC was first released.
    Those turds have been polished a great deal in the last 35 years or so

    The architecture was open. There was a huge market in add-on
    peripherals, and they came with drivers that worked. Good luck in
    finding equivalent support in 1990s for even a printer driver under Linux.


    On the business and marketing side, there's no doubt that MS in
    particular outclassed everyone else. They innovated the idea of
    criminal action as a viable business tactic - use whatever illegal means
    you like to ensure competitors go bankrupt before they manage to sue
    you, and by the time the case gets to the courts the fines will be
    negligible compared to your profit. Even IBM was trapped by them.

    But yes, compatibility and market share was key - Windows and PC's were
    popular because there was lots of hardware and software that worked with
    them, and there was lots of hardware and software for them because they
    were popular. They were technically shite, but successful because they
    were successful.

    No one marketed Linux until far later, and true Unix was happy in its
    niche (as was Apple). A few innovative companies came out with hugely
    better hardware or software, but they could never catch up with the
    momentum of the PC.


     - some versions of Windows are okay, and modern x86-64
    processors are very impressive engineering - but turds they remain at
    their core.  While some designs were planned to be forward compatible
    with future enhancements (like the 68k processor architecture, or the
    BBC MOS operating system), and some were designed to be compatible
    with everything above a set minimum (like Unix), x86 and DOS then
    Windows have been saddled with backwards compatibility as their prime
    motivation.

    Which has been excellent. Until they chose not to support 16-bit
    binaries under 64-bit Windows.


    I believe you can run Wine on Windows, and then you could run 16-bit
    binaries. But you might have to run Wine under Cygwin or something -
    it's not something I have tried.

    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a criticism
    of you or your language.

    I get annoyed when people openly diss Windows, or MSDOS, simply for not
    being Linux.


    I diss Windows or DOS because it deserves it. Linux was not conceived
    when I realised DOS was crap compared to the alternatives. (You always
    seem to have such trouble distinguishing Linux and Unix.)

  • From Dmitry A. Kazakov@21:1/5 to David Brown on Fri Dec 23 17:17:35 2022
    On 2022-12-23 16:38, David Brown wrote:

    The alternative in DOS and Windows has always been to buy additional
    tools that *nix users take for granted.  And those tools always have to include /everything/.  So while a Pascal compiler for *nix just needs to
    be a Pascal compiler - because there are editors, build systems,
    debuggers, libraries, assemblers and linkers already - on DOS/Windows
    you had to get Turbo Pascal or Borland Pascal which had everything
    included.

    Yes, and this was one reason why DOS actually won over UNIX
    workstations, while being utterly inferior. You bought a set of floppies
    from Borland and you could comfortably write and debug your program.

    Compared to that, UNIX was a command line with an assorted set of very
    poorly designed, obscure utilities. Even Solaris with its gorgeous OpenLook
    was no match for Borland C++ and Pascal in terms of productivity.

    Do I like the fact that *nix has always come with a wide range of general-purpose tools?  Yes, I most surely do!

    Most of them were garbage. I remember the time very well. All activities
    quickly migrated to DOS. From time to time someone ran back to us with a
    "huge" data set that some shitty DOS statistics software could not process.
    Of course UNIX had nothing for that. But I just wrote a C program
    computing some linear regression stuff from scratch and gave the processed
    data back. Things like that happened less and less frequently.
    Workstations ended up as network servers running NFS and Yellow Pages, and
    when Linux matured they got scrapped.

    Presently, most of our software development is done under Windows.
    Modern tools are OS-agnostic. You can have your preferred IDE in either
    OS. But the Windows variants are always somewhat more stable. Furthermore,
    prototyping is far easier under Windows because the hardware support is
    much better. So we design, test and debug under Windows and run
    integration tests on the Linux target (some small ARM board).

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Fri Dec 23 16:51:56 2022
    On 23/12/2022 15:50, David Brown wrote:
    On 22/12/2022 17:59, Bart wrote:

    But my point about '* *.c', which you've chosen to ignore, is valid
    even for ten files; it's just wrong.

    It might be acceptable within a higher level language where each
    wildcard spec expands to a list of files which itself is a nested
    element of the parameter list. But it doesn't work if you just
    concatenate everything into one giant list; there are too many
    ambiguities.

    Of course, you will never agree there's anything wrong with it; you will
    defend Linux to the death. Or you will point out that you can do X, Y and
    Z to turn off this 'globbing', which then causes problems for programs
    that depend on it, and which it is now up to each customer to do
    persistently on their machines.

    I can't figure out what you are worrying about here.

    In any shell, in any OS, for any program, if you write "prog *" the
    program is run with a list of all the files in the directory.

    No.

    If you
    wrote "prog * *.c", it will be started with a list of all the files,
    followed by a list of all the ".c" files.

    No.


    It's the same in DOS, Linux, Windows, Macs, or anything else you like.

    No.

    It's the same for any shell.

    No. Where did you get the idea that it works anywhere? Or that it would
    even make sense for most programs?

    The difference is that for some shells (such as Windows PowerShell or
    bash), the shell does the work of finding the files and expanding the wildcards because this is what /every/ program needs - there's no point
    in repeating the same code in each program.  In other shells, such as
    DOS "command prompt", every program has to have that functionality added
    to the program.

    Oh, so your criterion that 'X works everywhere' includes having to
    implement it within applications!

    Then I can say that 'Y also works everywhere'. I don't know what Y might
    be; it doesn't matter, because whatever it does, somebody just needs to implement it first!

    This is Powershell:

    PS C:\c> .\showargs * *.c
    1: C:\c\showargs.exe
    2: *
    3: *.c

    This is Command Prompt:

    c:\c>showargs * *.c
    1: showargs
    2: *
    3: *.c

    This is WSL (a.out is the WSL-gcc-compiled version of showargs.exe):

    root@DESKTOP-11:/mnt/c/mx# ../c/a.out * *.c

    root@DESKTOP-11:/mnt/c/c# ./a.out * *.c
    1: ./a.out
    2: ...
    3: ...
    ...
    193: *.c

    That last entry seems odd: it turns out there are no .c files, so the
    parameter is the wildcard specifier itself.

    Well, I say "every program" supports wildcards for filenames - I'm sure
    there are some DOS/Windows programs that don't.  But most do.

    That will be up to individual programs whether they accept wildcards for filenames, and what they do about them. If the input is "*", what kinds
    of application would be dealing with an ill-matched collection of files, including all sorts of junk that happens to be lying around?

    It would be very specific. BTW if I do this under WSL:

    vim *.c

    Then I was disappointed that I didn't get hundreds of edit windows for
    all those files. So even under Linux, sometimes expansion doesn't
    happen. (What magic does vim use to get the OS to see sense?)


    Usually the last thing you want is for the OS (or whatever is
    responsible for expanding those command line params before they get to
    the app) to just expand EVERYTHING willy-nilly:

    * Maybe the app needs to know the exact parameters entered (like my
    showargs program). Maybe they are to be stored, and passed on to
    different parts of an app as needed.

    * Maybe they are to be passed on to another program, where it's going
    to be much easier and tidier if they are unexpanded:

    c:\mx>ms showargs * *.c
    1 *
    2 *.c

    Here 'ms', which runs 'showargs' as a script, sees THREE parameters:
    showargs, * and *.c. It arranges for the last two to be passed as input
    to the program being run.

    * Maybe the app's inputs are mathematical expressions so that you want
    "A*B" and not a list of files that start with A and end with B!

    * But above all, it simply doesn't work, not when you have expandable
    params interspersed with other expandable ones, or even ordinary params, because everything just merges together.

    So here there are two things I find utterly astonishing:

    (1) That you seem to think this a good idea, despite that list of problems

    (2) That you are under the delusion this is how Windows works too. As
    I've shown above, it doesn't.

    Yes, individual apps can CHOOSE to do their own expansion, but that is
    workable because the expansion list is segregated from the other parameters.
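
    (As an aside, a program on a POSIX system that prefers to receive
    unexpanded patterns -- quoted by the user, or handed over by a shell that
    does not glob -- can still do its own expansion with the standard glob(3)
    call. The sketch below is only an editorial illustration of that, assuming
    a POSIX C library; it is not code from either side of this discussion, and
    the file name is hypothetical.)

        /* expand.c -- hypothetical example of per-program wildcard expansion
           using POSIX glob(3). Each argument is treated as a pattern and the
           matching filenames are printed, one per line. */
        #include <glob.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            for (int i = 1; i < argc; i++) {
                glob_t g;
                /* GLOB_NOCHECK: if nothing matches, keep the pattern itself,
                   much as an interactive shell does by default. */
                if (glob(argv[i], GLOB_NOCHECK, NULL, &g) == 0) {
                    for (size_t j = 0; j < g.gl_pathc; j++)
                        printf("%s\n", g.gl_pathv[j]);
                    globfree(&g);
                }
            }
            return 0;
        }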

  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sat Dec 24 00:02:56 2022
    Bart <bc@freeuk.com> wrote:

    That will be up to individual programs whether they accept wildcards for filenames, and what they do about them. If the input is "*", what kinds
    of application would be dealing with an ill-matched collection of files, including all sorts of junk that happens to be lying around?

    Yesterday I used several times the following:

    du -s *

    'man du' will tell you what it does. And sometimes I use the
    (in)famous:

    rm -rf *

    Do not try this if you do not know what it is doing!

    It would be very specific. BTW if I do this under WSL:

    vim *.c

    Then I was disappointed that I didn't get hundreds of edit windows for
    all those files. So even under Linux, sometimes expansion doesn't
    happen. (What magic does vim use to get the OS to see sense?)

    I suspect that you incorrectly interpreted your observation.
    I do not use vim in graphical mode, but I use it a lot in
    text mode. In text mode, vim, given a list of files, shows you
    the first. But it remembers them all and allows easy "movement"
    between files. Probably this is what happened to you.

    Usually the last thing you want is for the OS (or whatever is
    responsible for expanding those command line params before they get to
    the app) to just expand EVERYTHING willy-nilly:

    Speak for yourself. If an application needs a list of files, I want
    that list to be expanded. In principle the application could expand the
    list, but usually it is convenient when it is expanded earlier.

    * Maybe the app needs to know the exact parameters entered (like my
    showargs program).

    AFAICS your 'showargs' works fine, in that it shows what it received.  The only
    nitpick is that position 0 is special (usually it is the program name),
    and normal arguments start at 1.  You started numbering normal
    arguments at 2.

    Maybe they are to be stored, and passed on to
    different parts of an app as needed.

    Maybe, it works fine.

    * Maybe they are to be passed on to another program, where it's going
    to be much easier and tidier if they are unexpanded:

    c:\mx>ms showargs * *.c
    1 *
    2 *.c

    Here 'ms', which runs 'showargs' as a script, sees THREE parameters: showargs, * and *.c; It arranges for the last two to be passed as input
    to the program being run.

    * Maybe the app's inputs are mathematical expressions so that you want
    "A*B" and not a list of files that start with A and end with B!

    Maybe.

    * But above all, it simply doesn't work, not when you have expandable
    params interspersed with other expandable ones, or even ordinary params, because everything just merges together.

    It works fine for me in very common cases, when I need to produce
    a single list of files.

    So here there are two things I find utterly astonishing:

    (1) That you seem to think this a good idea, despite that list of problems

    You did not explain why you want your program to see * *.c. As you
    noted, if needed, one can quote parameters. So I would need a rather
    frequent use case to prefer the non-expanded version.

    Yes, individual apps can CHOOSE to do their own expansion, but that is workable because the expansion list is segregated from the other parameters.

    I am under the impression that you are missing an important fact: Unix was
    designed as a _system_. Programs expect the system conventions and work
    with them, not against them. One convention is that there are a few dozen
    (a myriad, in your language) small programs that are supposed to work
    together. The shell works as the glue that combines them. Shell plus
    utilities form a programming language, crappy as a programming language,
    but quite useful. In particular, the ability to transform/create command
    lines via programming allows automating a lot of tasks.

    Just more on my point of view: I started to use DOS around 1988
    (I was introduced to computers on mainframes and I had ZX
    Spectrum earlier). My first practical contact with Unix was in
    1990. It took me some time to understand how Unix works, but once
    I "got it" I was able easily to do things on Unix that would be
    hard (or require much more work) on DOS. By 1993 I was mostly
    using Unix (more precisely, at that time I switched from 386BSD
    Unix to Linux).

    Coming back to Unix: it works for me. DOS in comparison felt
    crappy. Compared to 1993, Windows has improved, but for me
    this does not change much; I saw nothing that would be
    better for _me_ than what I have in Linux. Now, if you want to
    improve, one can think of many ways of doing things better than on
    Unix. The trouble is that any real-world design will have compromises.
    You lose some possibilities to gain others. You either
    do not understand Unix or at least pretend not to understand.
    If you do not understand Unix, then you are not qualified to judge
    it. It looks like you do not know what you lose by choosing a
    different design.

    BTW: I did spend some time thinking of a better command line than
    Unix. Some of my ideas were quite different. But none were borrowed
    from DOS.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Fri Dec 23 23:46:37 2022
    On 23/12/2022 15:38, David Brown wrote:
    On 22/12/2022 22:55, Bart wrote:

    I expect you didn't like DOS because it doesn't have the dozens of
    toys that you came to rely on in Unix, including a built-in C
    compiler; what luxury!


    I find it useful to have a well-populated toolbox.  I am an engineer - finding the right tools for the job, and using them well, is what I do.
     DOS gave you a rock to bash things with.  Some people, such as
    yourself, seem to have been successful at bashing out your own tools
    with that rock.  That's impressive, but I find it strange how you can be happy with it.

    You seem fixated on DOS. My early coding was done on DEC and ICL OSes,
    then no OS at all, then CP/M (or our clone of it).

    The tools provided, when they were provided, were always spartan:
    editor, compiler, linker, assembler. That's nothing new.

    The alternative in DOS and Windows has always been to buy additional
    tools that *nix users take for granted.  And those tools always have to include /everything/.

    'Everything' is good. Remember those endless discussions on clc about
    what exactly constituted a C compiler? Because this new-fangled 'gcc'
    didn't come with batteries included, like header files, assembler or linker.

    A bad mistake on Windows where those utilities are not OS-provided.

    But tell me again why a 'linker', of all things, should be part of a
    consumer product mostly aimed at people doing things that are nothing to
    do with building software. Why give it such special dispensation?

      So while a Pascal compiler for *nix just needs to
    be a Pascal compiler - because there are editors, build systems,
    debuggers, libraries, assemblers and linkers already - on DOS/Windows
    you had to get Turbo Pascal or Borland Pascal which had everything
    included.

    You can get bare compilers on Windows too.

      (Turbo Pascal for DOS and Win3.1 included "make", by the way
    - for well over a decade that was the build tool I used for all my
    assembly and C programming on microcontrollers.)  And then when you
    wanted C, you bought MSVC which included a different editor with a
    different setup, a different assembler, a different build tool (called
    "nmake"), and so on.  Everything was duplicated but different, everything
    incompatible, everything a huge waste of the manufacturers' time and
    effort, and a huge waste of the users' money and time as they had to get
    familiar with another set of basic tools.

    Nothing stopped anybody from marketing a standalone assembler or linker
    that could be used with third party compilers. These are not complicated programs (a workable linker is only 50KB).

    I can't answer that. Unless that assembler used 'gas' syntax - in which
    case I would write my own too.


    Do I like the fact that *nix has always come with a wide range of general-purpose tools?  Yes, I most surely do!

    What did that do for companies wanting to develop and sell their own
    compilers and tools?


    It's because DOS was so sparse that I have few dependencies on it; and
    my stuff can build on Linux more easily than Linux programs can build
    on Windows. (AIUI, most won't; you need to use CYGWIN or MSYS or WSL,
    but then you don't get a bona fide Windows executable that customers
    can run directly.)


    Programs built with msys2 work fine on any Windows system.  You have to include any DLL's you use, but that applies to all programs with all tools.

    My main experience of MSYS2 was of trying to get GMP to build. I spent
    hours but failed miserably. This stuff should just work, and I don't
    mean watching stuff scroll up the screen for an hour until you were
    eventually rewarded - if you were lucky.

    Just how long should a 500KB library take to build? How many dozens of
    special tools should it need? I didn't care about performance, only
    something I could use.

    Then there was the OS - it was crap.  Unix was vastly better than CP/M
    and the various DOS's.

    Yeah, you keep saying that. So Shakespeare was perhaps a better writer
    than Dickens; maybe so, but I'd rather read Raymond Chandler!

    As I said I have little interest in OSes or the characteristics that for
    you would score points over another. But your preferences would get
    negative points from me due to case-sensitivity and pedanticness.


      But IBM was not involved in Unix at that time,
    and did not want others to control the software on their machines. (They thought they could control Microsoft.)

    Or compare it to other machines, such as the BBC Micro.  It had the OS
    in ROM, making it far simpler and more reliable.  It had a good
    programming language in ROM.  (BASIC, but one of the best variants.)
    This meant new software could be written quickly and easily in a high
    level language, instead of assembly (as was the norm for the PC in the
    early days).  It had an OS that was expandable - it supported pluggable
    file systems that were barely imagined at the time the OS was designed.
     It was a tenth of the price of the PC.

    It used a 6502. I'd argue it was better designed than any Sinclair
    product, with a proper keyboard, but it was still in that class of machine.

    BTW this is the kind of machine my company were selling:

    https://nosher.net/archives/computers/pcw_1982_12_006a

    (My first redesign task was adding the bitmapped graphics on the display.)

    Of course the BBC Micro wasn't perfect either, and had limitations
    compared to the PC - not surprising, given the price difference.  The
    6502 was not a powerful processor.

    As I said...

    More business-oriented 8-bit systems were based on the Z80, such as the
    PCW 8256, with CP/M 3. (My first commercial graphical application was
    for that machine IIRC.)

    So you don't rate its OS - so what? All customers needed were the most
    mundane things. It was marketed as a word processor after all!

    But imagine what we could have had with a computer using a 68000,
    running a version of Unix, combining the design innovation,
    user-friendliness and forward thinking of Acorn and the business
    know-how of IBM?  It would have been achievable at lower cost than the
    IBM PC, and /so/ much better.

    As it was, by the mid-eighties there were home computers with usability, graphics and user interfaces that were not seen in the PC world for a decade.  There were machines that were cheaper than the PC's of the time
    and could /emulate/ PC's at near full-speed.  The PC world was far
    behind in its hardware, OS and basic software.  But the IBM PC and MSDOS
    won out because the other machines were not compatible with the IBM PC
    and MSDOS.

    That's true. I was playing with 24-bit RGB graphics for my private
    designs about a decade before it became mainstream on PCs.

    But where were the Unix alternatives that people could buy from PC
    World? Sure there had been colour graphics in computers for years but
    I'm talking about consumer PCs.

    I used Unix systems at university in the early 1990's, and by $DEITY it
    was a /huge/ step backwards when I had to move to Windows.

    I went from a £500,000 (in mid-70s money) mainframe at college, running
    TOPS 20 I think, to my own £100 Z80 machine with no OS on it at all, and
    no disk drives either.

    I'd say /that/ was a huge step backwards! Perhaps you can appreciate why
    I'm not that bothered.

    The 8088/86/286 had some disappointing limitations, which were fixed
    with the 80386.


    It's a pity MS didn't catch up with proper 32-bit support until AMD were already working on 64-bit versions.

    It wasn't so critical with the 80386. Programs could run in 16-bit mode
    under a 16-bit OS, and use 32-bit operations, registers and address modes.

    On the business and marketing side, there's no doubt that MS in
    particular outclassed everyone else.  They innovated the idea of
    criminal action as a viable business tactic - use whatever illegal means
    you like to ensure competitors go bankrupt before they manage to sue
    you, and by the time the case gets to the courts the fines will be
    negligible compared to your profit.  Even IBM was trapped by them.

    I never got interested in that side; I was always working to deadlines!

    But what exactly was the point of Linux? What exactly was wrong with Unix?


    But yes, compatibility and market share was key - Windows and PC's were popular because there was lots of hardware and software that worked with them, and there was lots of hardware and software for them because they
    were popular.  They were technically shite,

    /Every/ software and hardware product for Windows was shite? Because
    you've looked at every one and given your completely unbiased opinion!

    Which has been excellent. Until they chose not to support 16-bit
    binaries under 64-bit Windows.


    I believe you can run Wine on Windows, and then you could run 16-bit binaries.  But you might have to run Wine under Cygwin or something -
    it's not something I have tried.

    (I gave this 10 minutes but it led nowhere. Except it involved an
    extra 500MB to install stuff that didn't work, but when I purged it, it
    only recovered 0.2MB.

    Hmmm.. have I mentioned the advantages of a piece of software that comes
    and runs as a single executable file? Either it's there or not there.
    There's nowhere for it to hide!)

    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a
    criticism of you or your language.

    I get annoyed when people openly diss Windows, or MSDOS, simply for
    not being Linux.


    I diss Windows or DOS because it deserves it.  Linux was not conceived
    when I realised DOS was crap compared to the alternatives.  (You always
    seem to have such trouble distinguishing Linux and Unix.)

    Understandably. What exactly /is/ the difference? And what are the
    differences between the myriad different versions of Linux even for the
    same platform?

    Apparently having more than one assembler or linker on a platform is a
    disaster. But having 100 different versions of the same OS, that's
    perfectly fine!

    I like all these contradictions.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Sat Dec 24 00:05:58 2022
    Bart <bc@freeuk.com> wrote:

    So much negativity here.

    Just for the record: I did offer you a constructive idea: use the GitHub
    release area for distribution (that is the expected use of the release
    area).

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Sat Dec 24 10:13:53 2022
    On 24/12/2022 00:02, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    That will be up to individual programs whether they accept wildcards for
    filenames, and what they do about them. If the input is "*", what kinds
    of application would be dealing with an ill-matched collection of files
    including all sorts of junk that happens to be lying around?

    Yesterday I used several times the following:

    du -s *

    'man du' will tell you what it does. And sometimes I use the
    (in)famous:

    rm -rf *

    Do not try this if you do not know what it is doing!

    It still strikes me as a very sloppy feature which, apart from my other misgivings, only happens to work:

    * When there is at most one wildcard parameter

    * When it happens to be the last one (otherwise it all merges together),

    * When the application can tolerate parameters with embedded * or ?
    character potentially providing huge numbers of expanded parameters

    * When the application doesn't need parameters with raw * or ?
    characters unchanged.

    Plus new issues:

    * Conceivably, some implementation limits can be reached when the number
    of files is large

    * It could waste time, and space, dealing needlessly with all these
    files if the app does nothing with them

    Your examples are OK but those are OS shell utilities.

    Typical applications do not work like that, and they would not anyway
    want to open themselves up to who knows what sorts of problems when
    usage doesn't obey the above rules.

    But even with programs working on files like shell commands, how the
    hell do you do the equivalent of Windows':

    copy *.c *.d

    (Make copies of all files ending in .c, changing the extension to .d)

    Since all the program sees is a uniform list of files. Did the user even
    type two parameters, or was it none, or a dozen?

    Sorry, it is just dreadful. I reckon it was a big mistake in Unix, like
    the many in C, that is now being touted as a feature because it is too
    late to change it.

    Specifying wildcards in input to an app can be useful, but it has to be
    at the application's behest, and it has to be properly organised: the
    copy command of my example:

    * Needs the original "*.c" and "*.d" separately so it can work out the
    filename remapping pattern

    * /Then/ the "*.c" can be formed into a list of files (which on Windows
    is often done lazily), but not the "*.d".

    As it is, if the directory had 5 .c files and 100 .d files, Unix would
    return a list of 105 files as input - wrong! And copied to what? It's impossible to say.
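
    As an illustration of the two steps above - not the actual copy command -
    here is a minimal C sketch, assuming the POSIX glob(3) API and a
    hypothetical name 'mycopy', of a program that receives the raw patterns
    and does the expansion and remapping itself:

        /* mycopy "*.c" "*.d"  -- illustrative sketch only.
           The source pattern is expanded by the program; the destination
           pattern is used only to work out the new extension. */
        #include <glob.h>
        #include <stdio.h>
        #include <string.h>

        int main(int argc, char **argv)
        {
            if (argc != 3) {
                fprintf(stderr, "usage: %s src-pattern dst-pattern\n", argv[0]);
                return 1;
            }

            const char *dstext = strrchr(argv[2], '.');   /* e.g. ".d" from "*.d" */
            if (dstext == NULL) dstext = "";

            glob_t g;
            if (glob(argv[1], 0, NULL, &g) != 0)          /* expand e.g. "*.c" */
                return 1;

            for (size_t i = 0; i < g.gl_pathc; i++) {
                const char *src = g.gl_pathv[i];
                const char *srcext = strrchr(src, '.');
                int stem = srcext ? (int)(srcext - src) : (int)strlen(src);

                char dst[4096];
                snprintf(dst, sizeof dst, "%.*s%s", stem, src, dstext);
                printf("copy %s -> %s\n", src, dst);      /* a real tool would copy here */
            }
            globfree(&g);
            return 0;
        }

    Under a Unix shell the patterns would have to be quoted (mycopy '*.c'
    '*.d') precisely so that the shell does not expand them first - which is
    the point of contention.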

    Still not convinced? Never mind. Perhaps I was hoping that just once
    someone would say that Unix got something wrong. You will say, Ah but it
    works; yeah, but it needs you to keep your fingers crossed.


    It would be very specific. BTW if I do this under WSL:

    vim *.c

    Then I was disappointed that I didn't get hundreds of edit windows for
    all those files. So even under Linux, sometimes expansion doesn't
    happen. (What magic does vim use to get the OS to see sense?)

    I suspect that you incorrectly interpreted your observation.

    Yeah, I ran it in a folder with no C files, so it was editing one file
    called "*.c". With lots of C files, it edited the first, but said little
    about the remaining 200+ files I'd apparently specified.


    Usually the last thing you want is for the OS (or whatever is
    responsible for expanding those command line params before they get to
    the app) to just expand EVERYTHING willy-nilly:

    Speak for yourself. If an application needs a list of files, I want
    this list to be expanded.

    What about the apps that don't: it's easier to expand wildcards later
    than to turn 1000s of filenames back into the small number of actual
    parameters. (Like turning an omelette back into eggs!)

    I am under the impression that you miss an important fact: Unix was
    designed as a _system_. Programs expect system conventions and work with
    them, not against them. One convention is that there are a few dozen
    (a myriad in your language) small programs that are supposed to work
    together. The shell works as the glue that combines them. Shell+utilities
    form a programming language, crappy as a programming language, but
    quite useful. In particular, the ability to transform/create the command
    line via programming means allows automating a lot of tasks.

    Just more on my point of view: I started to use DOS around 1988
    (I was introduced to computers on mainframes and I had ZX
    Spectrum earlier). My first practical contact with Unix was in
    1990. It took me some time to understand how Unix works, but once
    I "got it" I was able easily to do things on Unix that would be
    hard (or require much more work) on DOS. By 1993 I was mostly
    using Unix (more precisely, at that time I switched from 386BSD
    Unix to Linux).

    Coming back to Unix: it works for me. DOS in comparison felt
    crappy. Compared to 1993, Windows has improved, but for me
    this does not change much; I saw nothing that would be
    better for _me_ than what I have in Linux. Now, if you want to
    improve, one can think of many ways of doing things better than on
    Unix. The trouble is that any real-world design will have compromises.
    You lose some possibilities to gain others. You either
    do not understand Unix or at least pretend not to understand.
    If you do not understand Unix, then you are not qualified to judge
    it. It looks like you do not know what you lose by choosing a
    different design.

    I really don't care about shell programs other than to provide the
    absolute basics.

    If you mean the underlying OS, then I don't have much of an opinion.
    Unix is case-sensitive in its file system, that's one thing.

    But its API seems to be controlled by 85 POSIX headers, which include
    the standard C headers. Plus it has special 'syscall' entry points using
    a different mechanism.
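
    As a small illustration (not from the thread; assuming Linux with glibc,
    which provides a syscall() wrapper), the same operation reached through
    the POSIX headers and through the raw system-call entry point:

        /* Illustrative only: write to stdout via the libc/POSIX API,
           then via the raw system-call interface. */
        #define _GNU_SOURCE
        #include <unistd.h>
        #include <sys/syscall.h>

        int main(void)
        {
            const char a[] = "hello via libc\n";
            write(STDOUT_FILENO, a, sizeof a - 1);               /* POSIX header route   */

            const char b[] = "hello via raw syscall\n";
            syscall(SYS_write, STDOUT_FILENO, b, sizeof b - 1);  /* syscall entry point  */
            return 0;
        }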

    I can't remember how DOS worked (most functionality I had to provide
    myself anyway) but comparing with Windows now, that does everything with
    a single 'windows.h' header, but includes vast amounts of extra
    functionality.

    They're just different. One seems mired in C and cannot be extricated
    from it, the other seems to have more successfully drawn a line between
    itself, and the C language, C libraries and C compilation tools.

    Neither is that easy to use from a private language, which was /my/ main
    problem. I used workarounds only sufficient for my actual needs.


    BTW: I did spend some time thinking of a better command line than
    Unix. Some of my ideas were quite different. But none were borrowed
    from DOS.

    Not even the freedom to write 'cd..' or 'CD..' instead of 'cd ..'?

    I guess not; the world would surely stop spinning if Unix didn't require
    that space between "cd" and "..". (Let me guess, "cd.." could be the
    complete name of an independent program? Stupid decisions apparently
    have long-lasting consequences.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to antispam@math.uni.wroc.pl on Sat Dec 24 12:20:22 2022
    On 2022-12-24 01:02, antispam@math.uni.wroc.pl wrote:

    rm -rf *

    Do not try this if you do not know what it is doing!

    Oh, yes. I remember removing half of my Solaris file system running rm
    as root with ".." matched. I noticed that it ran too long...

    One of the ways to illustrate the beauty of UNIX "ideas" is this:

    $ echo "" > -i

    Now try to "more" or remove it (:-))

    (You can also experiment with files named -rf and various UNIX commands...)

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Dmitry A. Kazakov on Sat Dec 24 18:36:58 2022
    Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote:
    On 2022-12-22 16:09, Andy Walker wrote:

    You [and probably Dmitry] seem to have a very weird idea of what
    Unix /is/.

    I don't know about Bart. As for me, I started with PDP-11 UNIX and
    continued with m68k UNIX Sys V. Both were utter garbage, in every
    possible aspect inferior to any competing system, with the worst C
    compilers I have ever seen.

    Can you name those superior competing systems?

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to antispam@math.uni.wroc.pl on Sat Dec 24 19:51:35 2022
    On 2022-12-24 19:36, antispam@math.uni.wroc.pl wrote:
    Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote:
    On 2022-12-22 16:09, Andy Walker wrote:

    You [and probably Dmitry] seem to have a very weird idea of what
    Unix /is/.

    I don't know about Bart. As for me, I started with PDP-11 UNIX and
    continued with m68k UNIX Sys V. Both were utter garbage, in every
    possible aspect inferior to any competing system, with the worst C
    compilers I have ever seen.

    Can you name those superior competing systems?

    RSX-11M, VMS-11, IBM's virtual machines OS, I forgot the name.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Dmitry A. Kazakov on Sat Dec 24 22:15:55 2022
    On 24/12/2022 11:20, Dmitry A. Kazakov wrote:
    One of the way to illustrate the beauty of UNIX "ideas" is this:
    $ echo "" > -i
    Now try to "more" or remove it (:-))

    Any experienced Unix user should know at least four ways of
    doing that; anyone else could RTFM [eg, "man rm"], which gives two
    of them.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Schubert

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Tue Dec 27 14:10:29 2022
    On 2022-12-27 13:24, David Brown wrote:
    On 24/12/2022 00:46, Bart wrote:
    On 23/12/2022 15:38, David Brown wrote:

    But tell me again why a 'linker', of all things, should be part of a
    consumer product mostly aimed at people doing things that are nothing
    to do with building software. Why give it such special dispensation?

    It is convenient to have on the system.  Programs can rely on it being there.

    I have the impression that you guys confuse a linker with a loader.
    Programs (applications) do not need a linker.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Dec 27 13:24:04 2022
    On 24/12/2022 00:46, Bart wrote:
    On 23/12/2022 15:38, David Brown wrote:
    On 22/12/2022 22:55, Bart wrote:

    I expect you didn't like DOS because it doesn't have the dozens of
    toys that you came to rely on in Unix, including a built-in C
    compiler; what luxury!


    I find it useful to have a well-populated toolbox.  I am an engineer -
    finding the right tools for the job, and using them well, is what I
    do.   DOS gave you a rock to bash things with.  Some people, such as
    yourself, seem to have been successful at bashing out your own tools
    with that rock.  That's impressive, but I find it strange how you can
    be happy with it.

    You seem fixated on DOS. My early coding was done on DEC and ICL OSes,
    then no OS at all, then CP/M (or our clone of it).

    I know nothing about these systems, so I can't comment on them. And
    besides, they all died off.


    The tools provided, when they were provided, were always spartan:
    editor, compiler, linker, assembler. That's nothing new.


    Sure.

    The alternative in DOS and Windows has always been to buy additional
    tools that *nix users take for granted.  And those tools always have
    to include /everything/.

    'Everything' is good. Remember those endless discussions on clc about
    what exactly constituted a C compiler? Because this new-fangled 'gcc'
    didn't come with batteries included, like header files, assembler or
    linker.


    "Everything" is /not/ good.

    The definition of a "C compiler" is not something from gcc. The
    definition of a compiler comes from long before C and long before Unix.
    The definition of a "C compiler" comes from the C language standards.
    Compilers take high level source code and turn it into low level code -
    whether that be assembly, machine code, byte code for virtual machine,
    or anything else.

    The GCC folks didn't bother making an assembler, linker or standard
    library because they came with the systems GCC compilers targeted. The
    same applies to pretty much all other compilers for all languages,
    whether it was Intel's compilers, Sun's compilers, IBM's, or anyone else's.

    Even if you look at, say, Borland's tools for DOS/Windows, you see the
    same pattern. Borland's Pascal compiler group made a Pascal compiler,
    their C compiler group made a C compiler, their assembler group made an assembler, their editor group made an editor, their build tools group
    made a "make" utility. They /shipped/ tools together as a collection,
    but they did not pointlessly duplicate effort.

    A bad mistake on Windows where those utilities are not OS-provided.


    GCC did not target Windows. Other people put together collections of
    GCC compilers, along with libraries, an assembler and linker, and any
    other parts they thought were useful. Some groups thought a selection
    of command-line utilities were a helpful addition to the package, others thought an IDE was helpful.

    Could these groups, such as msys, cygwin, etc., do a better job of
    packaging and making things easier? I'm sure they could - I've rarely
    found any piece of software or packaging that I felt was absolutely
    perfect.

    But tell me again why a 'linker', of all things, should be part of a
    consumer product mostly aimed at people doing things that are nothing to
    do with building software. Why give it such special dispensation?


    It is convenient to have on the system. Programs can rely on it being
    there.

    If you take a Windows system, and look at the DLL's and EXE files on the system, I bet most users could identify the purpose of far less than 1%
    of them. Even the most experienced power users will only know about a
    tiny fraction of them.

      So while a Pascal compiler for *nix just needs to be a Pascal
    compiler - because there are editors, build systems, debuggers,
    libraries, assemblers and linkers already - on DOS/Windows you had to
    get Turbo Pascal or Borland Pascal which had everything included.

    You can get bare compilers on Windows too.

      (Turbo Pascal for DOS and Win3.1 included "make", by the way - for
    well over a decade that was the build tool I used for all my assembly
    and C programming on microcontrollers.)  And then when you wanted C,
    you bought MSVC which included a different editor with a different
    setup, a different assembler, a different build tool (called "nmake"),
    and so on.  Everything was duplicated but different, everything
    incompatible, everything a huge waste of the manufacturers' time and
    effort, and a huge waste of the users' money and time as they had to
    get familiar with another set of basic tools.

    Nothing stopped anybody from marketing a standalone assembler or linker
    that could be used with third party compilers. These are not complicated programs (a workable linker is only 50KB).


    Such tools were available as stand-alone products. More often, I think,
    they were licensed and included with third-party tools. (I.e., if you
    wanted to make a Fortran compiler, you'd get your assembler and linker
    from Borland, MS, or someone else.) The same is still done today.

    I can't answer that. Unless that assembler used 'gas' syntax - in which
    case I would write my own too.


    Do I like the fact that *nix has always come with a wide range of
    general-purpose tools?  Yes, I most surely do!

    What did that do for companies wanting to develop and sell their own compilers and tools?


    People can, and do, develop and sell compilers and other tools. But if
    good tools are available for free or included with the system (remember,
    "true" Unix is not normally free, and there are plenty of commercial
    Linux distributions) then you have to make better tools or provide
    better service if you want to make much money.


    Or compare it to other machines, such as the BBC Micro.  It had the OS
    in ROM, making it far simpler and more reliable.  It had a good
    programming language in ROM.  (BASIC, but one of the best variants.)
    This meant new software could be written quickly and easily in a high
    level language, instead of assembly (as was the norm for the PC in the
    early days).  It had an OS that was expandable - it supported
    pluggable file systems that were barely imagined at the time the OS
    was designed.   It was a tenth of the price of the PC.

    It used a 6502. I'd argue it was better designed than any Sinclair
    product, with a proper keyboard, but it was still in that class of machine.


    The Sinclair machines (ZX80, ZX81, ZX Spectrum) were targeting absolute
    minimal costs - the BBC Micro had a significantly higher price (about
    three times the cost, IIRC). And I agree, the result was a far better
    machine in most aspects.

    BTW this is the kind of machine my company were selling:

      https://nosher.net/archives/computers/pcw_1982_12_006a

    (My first redesign task was adding the bitmapped graphics on the display.)


    There was a lot more variety and innovation in those days!

    At that time, 1982, we had a TI 99/4A. My father sometimes had a BBC
    Micro from his work, but I rarely got a chance to use it.

    Of course the BBC Micro wasn't perfect either, and had limitations
    compared to the PC - not surprising, given the price difference.  The
    6502 was not a powerful processor.

    As I said...

    More business-oriented 8-bit systems were based on the Z80, such as the
    PCW 8256, with CP/M 3. (My first commercial graphical application was
    for that machine IIRC.)

    So you don't rate its OS - so what? All customers needed were the most
    mundane things. It was marketed as a word processor after all!


    Yes, that's true.

    But imagine what we could have had with a computer using a 68000,
    running a version of Unix, combining the design innovation,
    user-friendliness and forward thinking of Acorn and the business
    know-how of IBM?  It would have been achievable at lower cost than the
    IBM PC, and /so/ much better.

    As it was, by the mid-eighties there were home computers with
    usability, graphics and user interfaces that were not seen in the PC
    world for a decade.  There were machines that were cheaper than the
    PC's of the time and could /emulate/ PC's at near full-speed.  The PC
    world was far behind in its hardware, OS and basic software.  But the
    IBM PC and MSDOS won out because the other machines were not
    compatible with the IBM PC and MSDOS.

    That's true. I was playing with 24-bit RGB graphics for my private
    designs about a decade before it became mainstream on PCs.


    That must have been fun!

    But where were the Unix alternatives that people could buy from PC
    World? Sure there had been colour graphics in computers for years but
    I'm talking about consumer PCs.


    They were expensive, and business only.

    I used Unix systems at university in the early 1990's, and by $DEITY
    it was a /huge/ step backwards when I had to move to Windows.

    I went from a £500,000 (in mid-70s money) mainframe at college, running
    TOPS 20 I think, to my own £100 Z80 machine with no OS on it at all, and
    no disk drives either.

    I'd say /that/ was a huge step backwards! Perhaps you can appreciate why
    I'm not that bothered.


    OK, that was a big step :-)

    The 8088/86/286 had some disappointing limitations, which were fixed
    with the 80386.


    It's a pity MS didn't catch up with proper 32-bit support until AMD
    were already working on 64-bit versions.

    It wasn't so critical with the 80386. Programs could run in 16-bit mode
    under a 16-bit OS, and use 32-bit operations, registers and address modes.

    On the business and marketing side, there's no doubt that MS in
    particular outclassed everyone else.  They innovated the idea of
    criminal action as a viable business tactic - use whatever illegal
    means you like to ensure competitors go bankrupt before they manage to
    sue you, and by the time the case gets to the courts the fines will be
    negligible compared to your profit.  Even IBM was trapped by them.

    I never got interested in that side; I was always working to deadlines!

    But what exactly was the point of Linux? What exactly was wrong with Unix?


    The Unix world was very insular, and often tightly bound to hardware and services. There were many vendors, such as HP, IBM, and Sun, along with
    more hardware-independent vendors like AT&T and Microsoft. But the big suppliers wanted you to get everything from them - you bought Sun
    workstations, Sun monitors, Sun printers, Sun networking systems, Sun processors, as well as SunOS or Solaris Unix. They cooperated on API's, libraries, standard utilities, file system layouts, etc. - a common base
    that became POSIX and allowed a fair degree of software compatibility
    across widely different hardware.

    And it was expensive - it was expensive to license Unix, and the
    hardware used for it was expensive. It was also closed source and
    proprietary, though there was a fair amount of free and/or open source
    software available under all sorts of different and sometimes
    incompatible licenses.

    Three things changed all this. One is the GNU project that aimed to
    re-write Unix as free software with a new license and development
    model. (They are even working on a kernel.) The second was that a
    university professor and writer, Andrew Tanenbaum, wrote a complete
    Unix-like OS for x86 PC's and made it available cheaply for educational purposes. And the third was the internet, and Usenet.

    These formed the ecosystem for Linux to be developed as an alternative - providing the power of Unix without the software cost and without the
    hardware cost, and in an open sharing environment.


    But yes, compatibility and market share was key - Windows and PC's
    were popular because there was lots of hardware and software that
    worked with them, and there was lots of hardware and software for them
    because they were popular.  They were technically shite,

    /Every/ software and hardware product for Windows was shite? Because
    you've looked at every one and given your completely unbiased opinion!


    "They" refers to the core of the PC and of the OS.

    Which has been excellent. Until they chose not to support 16-bit
    binaries under 64-bit Windows.


    I believe you can run Wine on Windows, and then you could run 16-bit
    binaries.  But you might have to run Wine under Cygwin or something -
    it's not something I have tried.

    (I gave this 10 minutes but it led nowhere. Except it involved an
    extra 500MB to install stuff that didn't work, but when I purged it, it
    only recovered 0.2MB.


    As I say, I haven't tried anything like this, so I can't help.

    But just for fun, I copied a 16-bit Windows program I wrote some 25
    years ago (in Delphi) to my 64-bit Linux system, and it ran fine with
    "wine PROG.EXE". The program doesn't do much without particular
    hardware connected on a serial port, so I didn't test much - and I have
    no idea how successful serial communication might be.

    It is enough to see that 16-bit Windows software /can/ run on 64-bit
    Linux, at least to some extent.

    Hmmm.. have I mentioned the advantages of a piece of software that comes
    and runs as a single executable file? Either it's there or not there.
    There's nowhere for it to hide!)


    I definitely see the advantage of stand-alone software. Single file is completely irrelevant to me, but I do prefer programs to have a specific purpose and place. Programs should not take resources other than file
    space when they are not in use, should not run unless you want them to,
    and it should be straight-forward to remove them without leaving piles
    of mess in odd places. I think we can agree on those principles (even
    if we place different weights on the importance of the size of the
    software).

    This applies to software on Linux and Windows. I don't like software
    that fills the Windows registry with settings which are usually left
    hanging after an uninstall. I don't like software that changes other
    parts of the system, or installs always-running programs or services.

    I like software that has a specific task, and does that task when you
    ask it to, and does not bother you when you are not using it.

    It would be wrong to suggest that *nix software always fits that model,
    or that Windows software always gets it wrong - but it is fair to say
    that "do one thing, and do it well" is the *nix software philosophy that
    is not as common in the Windows world.


    But I don't understand how you can take personal offence when I talk
    about operating systems, or how you end up thinking it was a
    criticism of you or your language.

    I get annoyed when people openly diss Windows, or MSDOS, simply for
    not being Linux.


    I diss Windows or DOS because it deserves it.  Linux was not conceived
    when I realised DOS was crap compared to the alternatives.  (You
    always seem to have such trouble distinguishing Linux and Unix.)

    Understandably. What exactly /is/ the difference? And what are the differences between the myriad different versions of Linux even for the
    same platform?


    Think of it all as a family tree. Unix was created in 1971, and had
    many branches, of which the most important commercial ones were probably
    AIX (IBM), HP-UX (HP), and SunOS/Solaris (Sun), and the most important academic/research branch was BSD. (There was also MS's Xenix, later
    SCO, but its significance in the story was mostly as MS's proxy war
    against Linux.)

    Linux did not enter the picture until 1991, and did not spread beyond
    somewhat nerdy or specialist use until at least the turn of the century.
    It has no common heritage with "true" Unix - so technically, it is not
    Unix, but it is Unix-like. Some people (including me) refer to the
    wider world of Unix-like OS's as *nix.

    <https://en.wikipedia.org/wiki/File:Unix_history-simple.svg>


    "Linux" technically refers only to the kernel, not the whole OS and
    surrounding basic software. You are correct that it is understandable
    to find it confusing. There is only one "mainline" Linux kernel, but
    there are vast numbers of options and configuration choices, as well as
    patches for various features maintained outside the main kernel source.

    Then there are "distributions", which are collections of the Linux
    kernel and other software, along with some kind of "package management"
    system to let users install and update software. Some distributions are commercial (such as Red Hat and Ubuntu), some are entirely open (such as Debian), some are specialised for particular purposes, some are aimed at servers, some at desktops. Android is also Linux, as is ChromeOS.

    If you want a fairly complete tree, you can see it here:

    <https://upload.wikimedia.org/wikipedia/commons/b/b5/Linux_Distribution_Timeline_21_10_2021.svg>

    If you want a recommendation for use, I'd say "Linux Mint".


    Apparently having more than one assembler or linker on a platform is a disaster.

    That's an exaggeration. Having more than one assembler or linker is unnecessary effort, unless they do something significantly different.

    But having 100 different versions of the same OS, that's
    perfectly fine!


    There aren't that many in common use. But it's all open source -
    putting together a new distribution is not /that/ much effort, as you
    generally start from an existing one and make modifications. One of the
    most popular (and my favourite for desktop use) is Linux Mint - it
    started as a fork of Ubuntu with nothing more than a change of the
    default theme from a brown colour to a nice green colour. Most
    distributions remain niche or die out, and others gain followers and spread.

    I like all these contradictions.

    Contradictions are inevitable, so it's great that you like them :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Tue Dec 27 14:50:24 2022
    On 27/12/2022 13:10, Dmitry A. Kazakov wrote:
    On 2022-12-27 13:24, David Brown wrote:
    On 24/12/2022 00:46, Bart wrote:
    On 23/12/2022 15:38, David Brown wrote:

    But tell me again why a 'linker', of all things, should be part of a
    consumer product mostly aimed at people doing things that are nothing
    to do with building software. Why give it such special dispensation?

    It is convenient to have on the system.  Programs can rely on it being
    there.

    I have the impression that you guys confuse a linker with a loader.
    Programs (applications) do not need a linker.


    I privately coined the term 'loader' in the 1980s for a program that
    combined multiple object files, from independently compiled source
    modules of a program, into a single program binary (eg. a .com file).

    This was a trivial task that could be done as fast as files could be
    read from disk.

    It was also pretty much what a linker did, yet a linker was a far more complicated program that also took much longer. What exactly do linkers
    do? I'm still not really sure!

    Anyway, I no longer have a need to combine object files (there are no
    object files). But there are dynamic fix-ups needed, which the OS EXE
    loader will do, between an EXE and its imported DLLs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 27 16:12:02 2022
    Bart <bc@freeuk.com> wrote:
    On 18/12/2022 13:05, David Brown wrote:

    There are standards for that. A text-based file can have a shebang
    comment ("#! /usr/bin/bash", or similar) to let the shell know what
    interpreter to use. This lets you distinguish between "python2" and
    "python3", for example, which is a big improvement over Windows-style
    file associations that can only handle one interpreter for each file
    type.

    That is invasive. And taking something that is really an attribute of a
    file name, in having it not only inside the file, but requiring the file
    to be opened and read to find out.

    Inferring the interpreter from the extension would be invasive. As it
    is now, one can replace any executable by a script and be sure that
    this catches all uses of the executable. Or one can replace a
    script in one language by a script in a different language.

    (Presumably every language that runs on Linux needs to accept '#' as a
    line comment? And you need to build into every one of 10,000 source
    files the direct location of the Python2 or Python3 installation on that
    machine? Is that portable across OSes? But I expect it's smarter than
    that.)

    You need a way for the interpreter to ignore the first line. Some
    interpreters have a special option for this. And you need '#! ' only for
    executables; there is no reason to add it to library files which are not
    used as executables.
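
    A minimal sketch (illustrative only, not a real tool) of the point that
    the shebang line lives inside the file and has to be read out of it:

        /* Print the interpreter named on a script's '#!' line, if any. */
        #include <stdio.h>
        #include <string.h>

        int main(int argc, char **argv)
        {
            if (argc < 2) return 1;
            FILE *f = fopen(argv[1], "r");
            if (!f) return 1;

            char line[256];
            if (fgets(line, sizeof line, f) && strncmp(line, "#!", 2) == 0) {
                line[strcspn(line, "\r\n")] = '\0';
                printf("interpreter:%s\n", line + 2);  /* e.g. " /usr/bin/bash" */
            } else {
                printf("no shebang line\n");
            }
            fclose(f);
            return 0;
        }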

    You do realise that gcc can handle some 30-odd different file types?

    That doesn't change the fact that probably 99% of the time I run gcc, it
    is with the name of a .c source file. And 99.9% of the times when I
    invoke it on prog.c as the first or only file to create an executable,
    then I want to create prog.exe.

    Well, I did not realize that the world is supposed to revolve around you.
    When I create an executable using gcc, 90% of the time gcc gets a list
    of .o files and libraries.

    So its behaviour is unhelpful. After the 10,000th time you have to type
    .c, or backspace over .c to get at the name itself to modify, it becomes tedious.

    So automate things. Of course, if your hobby is to continuously
    rename files and see how it affects compilations, then you may
    do a lot of command line editing. In normal use the same
    Makefile can be used for tens or hundreds of edits to
    source files.

    Some folks use compile commands in editors. That works nicely
    because rules are simple.

    Now it's not that hard to write a wrapper script or program on top of gcc.exe, but if it isn't hard, why doesn't it just do that?

    You miss an important point: gcc gives you a lot of possibilities.
    A simple wrapper which substitutes some defaults would make
    using non-default values harder or impossible. If you want
    to have all the functionality of gcc you will end up with a complicated
    command line.

    It's not a simple C compiler that assumes everything it is given is a C file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension,

    AFAICS as accepts any extension. I can name my assembler file
    'ss.m' and it works fine.

    and also,
    bizarrely, generates `a.out` as the object file name.

    Yes, this is a silly default. It is slightly less strange
    than you think because 'a.out' is an abbreviation of
    'assembler output'.

    Here's something funny: take hello.c and rename to 'hello', with no extension. If I try and compile it:

    gcc hello

    it says: hello: file not recognised: file format not recognised. Trying
    'gcc hello.' is worse: it can't see the file at all.

    So first, on Linux, where file extensions are supposed to be optional,
    gcc can't cope with a missing .c extension; you have to provide extra
    info. Second, on Linux, "hello" is a distinct file from "hello.".

    With bcc, I just have to type "bcc hello." to make it work. A trailing
    dot means an empty extension.

    You are determined not to learn, but for the possible benefit of third
    parties: in Linux a file name is just a string of characters. A dot is
    as valid a character in a name as any other.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to antispam@math.uni.wroc.pl on Tue Dec 27 17:08:14 2022
    In article <tof7ol$ovb$1@gioia.aioe.org>, <antispam@math.uni.wroc.pl> wrote:
    Bart <bc@freeuk.com> wrote:
    On 22/12/2022 15:09, Andy Walker wrote:
    You complain
    that you have to write "./hello" rather than just "hello"; but that's
    because "." is not in your "$PATH", which is set by you, not because
    Unix/Linux insists on extra verbiage. If you need further help, just
    ask. But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].

    So lots of workarounds to be able to do what DOS, maligned as it was,
    did effortlessly.

    It seems that nobody mentioned this: not having '.' in PATH is a
    relatively recent trend.

    Define "recent": I haven't included `.` in my $PATH for
    30 or so years now. :-)

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to All on Tue Dec 27 16:53:27 2022
    So automate things.

    Isn't gcc already an automated wrapper around compiler, assembler and
    linker?


    Of course, if your hobby is to continuously
    rename files and see how it affects compilations, then you may
    do a lot of command line editing. In normal use the same
    Makefile can be used for tens or hundreds of edits to
    source files.

    Some folks use compile commands in editors. That works nicely
    because rules are simple.

    Now it's not that hard to write a wrapper script or program on top of
    gcc.exe, but if it isn't hard, why doesn't it just do that?

    You miss an important point: gcc gives you a lot of possibilities.
    A simple wrapper which substitutes some defaults would make
    using non-default values harder or impossible.

    Not on Windows. Clearly, Linux /would/ make some things impossible,
    because there are no rules so anything goes.

    The rules for my BCC compiler inputs are:

    Input        Assumes file

    file.c       file.c
    file.ext     file.ext
    file         file.c
    file.        file         # use "file." when file has no extension

    This is not possible with Unix, since "file." either ends up as "file",
    or stays as "file." You can only refer to "file" or "file.", but not both.

    So a silly decision with Unix, which really buys you very little, means
    having to type .c extensions on inputs to C files, for eternity.
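
    For what it's worth, a minimal C sketch of those defaulting rules (a
    hypothetical helper, deliberately ignoring directories and edge cases) -
    nothing about the rules themselves depends on the OS:

        /* Apply the defaulting rules:
             "file.c"   -> "file.c"     (extension kept)
             "file.ext" -> "file.ext"
             "file"     -> "file.c"     (default extension added)
             "file."    -> "file"       (trailing dot = no extension) */
        #include <stdio.h>
        #include <string.h>

        static void default_extension(const char *in, char *out, size_t outsz)
        {
            size_t n = strlen(in);
            if (n > 0 && in[n - 1] == '.')              /* "file."  -> "file"   */
                snprintf(out, outsz, "%.*s", (int)(n - 1), in);
            else if (strrchr(in, '.') == NULL)          /* "file"   -> "file.c" */
                snprintf(out, outsz, "%s.c", in);
            else                                        /* "file.x" -> "file.x" */
                snprintf(out, outsz, "%s", in);
        }

        int main(void)
        {
            const char *tests[] = { "file.c", "file.ext", "file", "file." };
            char buf[256];
            for (size_t i = 0; i < sizeof tests / sizeof tests[0]; i++) {
                default_extension(tests[i], buf, sizeof buf);
                printf("%-10s -> %s\n", tests[i], buf);
            }
            return 0;
        }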

    If you want
    to have all the functionality of gcc you will end up with a complicated
    command line.

    It's not a simple C compiler that assumes everything it is given is a C
    file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension,

    AFAICS as accepts any extension. I can name my assembler file
    'ss.m' and it works fine.

    But you need an extension. Give it a null extension and the resulting
    executable, if this is the only module, will clash.

    My point however was that the reason gcc needs an explicit extension was
    the number of possible input file types. How many /different/ file types
    does 'as' work with?

    I write my command-line utilities both to be easy to use when manually
    invoked, and for invoking from scripts.

    My experience of Unix utilities is that they do nothing of the sort;
    they have no provision for user-friendliness whatsoever.


    You are determined not to learn

    You also seem reluctant to learn how non-Unix systems might work, or acknowledge that they could be better and more user-friendly.


    , but for the possible benefit of third
    parties: in Linux a file name is just a string of characters. A dot is
    as valid a character in a name as any other.

    Well, that's wrong. It may have sounded like a good idea at one time: accept
    ANY non-space characters as the name of a file. But that allows for a
    lot of completely daft names, while disallowing some sensible practices.

    There is no structure at all, no possibility for common sense.

    With floating point numbers, 1234 is the same value as 1234. while
    1234.. is an error, but they are all legal and distinct filenames under
    Unix.

    Under Windows, 1234 1234. 1234... all represent the same "1234" file.
    While 123,456 are two files "123" and "456"; finally, some rules!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Tue Dec 27 17:32:45 2022
    On 2022-12-27 15:50, Bart wrote:

    I privately coined the term 'loader' in the 1980s for a program that
    combined multiple object files, from independently compiled source
    modules of a program, into a single program binary (eg. a .com file).

    A loader is a dynamic linking and relocation program. It takes an image
    (EXE or DLL) and loads it into the memory of an existing or new process.

    It was also pretty much what a linker did, yet a linker was a far more complicated program that also took much longer. What exactly do linkers
    do? I'm still not really sure!

    A linker is a program that creates a loadable image from various sources
    (object files and static libraries/archives of object files, including
    import libraries). A linker:

    - resolves static symbols (the MS linker supports partial resolution;
    the GNU linker does not, which is why linking large projects with the
    MS linker is many times faster than with the GNU linker)
    - creates vectorized symbols (to be resolved by the loader)
    - evaluates link-time expressions
    - creates sections and generates code to initialize sections and
    elaborate code (e.g. starting tasks, calling constructors, running initializers)
    - creates overlay trees (earlier linkers)
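
    A minimal two-file C example (illustrative only) of the symbol-resolution
    step: the compiler leaves the reference to 'greeting' in main.c
    unresolved, and the linker fills it in when the object files are combined:

        /* greeting.c -- defines the symbol */
        const char *greeting = "hello from another object file";

        /* main.c -- references a symbol the compiler cannot resolve;
           the linker resolves it when combining the objects, e.g.
             cc -c main.c greeting.c
             cc main.o greeting.o -o demo                              */
        #include <stdio.h>

        extern const char *greeting;   /* declared here, defined in greeting.c */

        int main(void)
        {
            puts(greeting);
            return 0;
        }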

    Anyway, I no longer have a need to combine object files (there are no
    object files). But there are dynamic fix-ups needed, which the OS EXE
    loader will do, between an EXE and its imported DLLs.

    My deepest condolences... (:-))

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 27 16:50:29 2022
    Bart <bc@freeuk.com> wrote:
    On 22/12/2022 15:09, Andy Walker wrote:
    You complain
    that you have to write "./hello" rather than just "hello"; but that's
    because "." is not in your "$PATH", which is set by you, not because
    Unix/Linux insists on extra verbiage. If you need further help, just
    ask. But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].

    So lots of workarounds to be able to do what DOS, maligned as it was,
    did effortlessly.

    It seems that nobody mentioned this: not having '.' in PATH is a
    relatively recent trend. The simple fact is that people make
    typos and may end up running a different program than intended.
    This also has security implications: malware can lead to real losses.
    One can assume that system programs are trusted, but it is
    harder to make assumptions about programs in a semi-random
    directory; it may be some junk fetched from the net. Of
    course, normally file permissions would prevent execution of
    random junk, but a lot of folks think that stronger measures
    are needed and so '.' is not in the default PATH. You can still
    add it; this is your decision.

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 27 19:05:18 2022
    Bart <bc@freeuk.com> wrote:

    You two aren't going to be happy until my language is a clone of C, with tools that work exactly the same way they do on Unix. But then you're
    going to say, what's the point?

    I certainly do not suggest cloning C; indeed, what would be the point of
    this? Rather, I would like to see a language that fixed the _important_
    problems with C, the biggest one IMO being limited or no type checking
    when one wants interesting/variable sized types. Once you have a
    language different from C there is of course an opportunity to fix
    smaller problems, like the syntax of types.

    I have GNU Pascal; it has:
    - a module system with many features
    - object oriented extensions
    - schema types
    - restricted types
    - nested functions
    - low level extensions

    Schema types allow variable sized types (the size is derived from type
    parameters) with bounds checking. Restricted types allow exporting a
    type such that all operations on the type are done by provided functions.
    More precisely, the language disallows access to the structure of a
    restricted type. Of course, the low level extensions allow breaking the
    normal rules, including the rules for restricted types, but restricted
    types are intended to avoid accidental mistakes and to better structure
    software (they are _not_ a security mechanism, unlike in Java).

    Nested functions, when passed as parameters to other functions, still
    have access to the variables/parameters of the enclosing function.
    This can simplify the use of functional parameters. It is less
    powerful than real closures, but closures normally depend on garbage
    collection, while nested functions work with stack allocation.
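
    [For readers more at home in C: GNU C's nested-function extension
    (not standard C, and implemented with stack trampolines under gcc)
    gives a rough analogue of the feature described above. The names
    below are made up for illustration.]

        #include <stdio.h>

        /* Calls f once per element; f is an ordinary function pointer. */
        static void for_each(const int *a, int n, void (*f)(int)) {
            for (int i = 0; i < n; i++)
                f(a[i]);
        }

        int main(void) {
            int scale = 10;                /* local of the enclosing function */

            void show(int x)               /* nested function (GNU C extension) */
            {
                printf("%d\n", x * scale); /* reads 'scale' from the outer scope */
            }

            int v[] = { 1, 2, 3 };
            for_each(v, 3, show);          /* passed as a functional parameter */
            return 0;
        }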

    GNU Pascal has its warts, many due to compatibility with other
    Pascal dialects. Let me note one wart: Pascal has the built-in
    constant maxinteger. Normal Pascal programs expect all integers to be
    smaller than or equal in magnitude to maxinteger. But two's-complement
    arithmetic has a negative value which is bigger in absolute value
    than maxinteger. To make things more interesting, GNU Pascal has
    integer types bigger than the standard integer, so there are
    "integers" bigger than maxinteger...


    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From antispam@math.uni.wroc.pl@21:1/5 to Bart on Tue Dec 27 23:56:54 2022
    Bart <bc@freeuk.com> wrote:

    So automate things.

    Isn't gcc already an automated wrapper around compiler, assembler and
    linker?

    You miss the principle of modularity: gcc contains the bulk of the
    code (I mean the wrapper; it is quite large given that all it does is
    handle the command line). Your own code can then provide fixed values
    (so that you do not need to retype them) and more complex behaviours
    (which would be too specialised to have as gcc options).

    In fact, 'make' was designed to be a tool for "directing
    compilation"; it handles things at a larger scale than gcc.

    To put it differently: gcc and make provide mechanisms. It is up to
    you to specify policy.

    BTW: there are now several competitors to make. One is 'cmake'.

    Of course, if your hobby is to continuously rename files and see how
    that affects compilation, then you may do a lot of command-line
    editing. In normal use the same Makefile can be used across tens or
    hundreds of edits to the source files.

    Some folks use compile commands in their editors. That works nicely
    because the rules are simple.

    Now it's not that hard to write a wrapper script or program on top of
    gcc.exe, but if it isn't hard, why doesn't it just do that?

    You miss an important point: gcc gives you a lot of possibilities. A
    simple wrapper which substitutes some defaults would make using
    non-default values harder or impossible.

    Not on Windows. Clearly, Linux /would/ make some things impossible,
    because there are no rules so anything goes.

    What I wrote really has nothing to do with the operating system; this
    is a very general principle.

    The rules for my BCC compiler inputs are:

    Input       Assumes file

    file.c      file.c
    file.ext    file.ext
    file        file.c
    file.       file         # use "file." when the file has no extension

    This is not possible with Unix, since "file." either ends up as "file",
    or stays as "file." You can only refer to "file" or "file.", but not both.

    You can implement your rules on Unix. Of course, one can ask whether
    the rules are useful. As written above, your rules make it impossible
    to access a file named "file." (it will be mangled to "file"), and
    you get quite unintuitive behaviour for "file".
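
    [A sketch of how the quoted defaulting rules could be implemented in
    portable C; the function and program names are made up, and paths
    with dots in directory components are not handled:]

        #include <stdio.h>
        #include <string.h>

        /* The quoted rules: "name" -> "name.c", "name.ext" -> "name.ext",
           and a trailing "." means "use the bare name, no extension".     */
        static void resolve_input(const char *in, char *out, size_t outsize) {
            size_t len = strlen(in);

            if (len > 0 && in[len - 1] == '.')          /* "file."    */
                snprintf(out, outsize, "%.*s", (int)(len - 1), in);
            else if (strrchr(in, '.') != NULL)          /* "file.ext" */
                snprintf(out, outsize, "%s", in);
            else                                        /* "file"     */
                snprintf(out, outsize, "%s.c", in);
        }

        int main(int argc, char **argv) {
            char path[4096];
            for (int i = 1; i < argc; i++) {
                resolve_input(argv[i], path, sizeof path);
                printf("%-12s -> %s\n", argv[i], path);
            }
            return 0;
        }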

    So a silly decision with Unix, which really buys you very little, means having to type .c extensions on inputs to C files, for eternity.

    Nobody forces you to have extensions on files. And do not exaggerate:
    two extra characters from time to time do not make much difference.

    If you want
    to have all the functionality of gcc you will end up with a
    complicated command line.

    It's not a simple C compiler that assumes everything it is given is a
    C file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension,

    AFAICS as accepts any extension. I can name my assembler file
    'ss.m' and it works fine.

    But you need an extension. Give it a null extension, and the
    resulting executable, if this is the only module, will clash.

    as ss

    will produce 'a.out' (as you know). If there is no a.out there will
    be no clash; otherwise you need to specify an output file, like:

    as ss -o ss.m

    or

    as ss -o tt

    Note: as produces an object file (it does not link). For automatic
    linking use gcc.

    My point however was that the reason gcc needs an explicit extension
    is the number of possible input file types. How many /different/ file
    types does 'as' work with?

    as has no notion of "file type". as takes a stream of bytes and turns
    it into an object file (a different thing from an executable!). The
    stream of bytes may come from a file, but in typical use (when as is
    called by gcc) as gets its input from a pipe.

    I write my command-line utilities both to be easy to use when manually invoked, and for invoking from scripts.

    My experience of Unix utilities is that they do nothing of the sort;
    they have no provision for user-friendliness whatsoever.

    Old saying (possibly mangled): "Unix is friendly, just not everybody
    is its friend".


    You are determined not to learn

    You also seem reluctant to learn how non-Unix systems might work, or acknowledge that they could be better and more user-friendly.

    "might work" is almost meaningless. By design Unix allows a lot
    of flexibility so it "might work" in quite different way. Now,
    concering how system actually work, I have some experience of MVS,
    CP/CMS and significant experience with DOS. You may think that
    non Unix system are friendly, but IME Unix behaviours make me
    more productive. Unix was designed to help automation. Output
    of one Unix utility typically can be used as input to the other
    allowing composing more complex commands. I had choice between
    DOS and Linux, Linux not only run faster than DOS on the same
    machine but also was more convenient to use. Of course, there
    are some particular things which were easier on other systems,
    but that is simply cost of having other featurs.

    , but for possible benefits of third
    parties: in Linux a file name is just a string of characters. A dot
    is as valid a character in a name as any other.

    Well, that's wrong. It may have sounded like a good idea at one time:
    accept ANY non-space characters as the name of a file.

    For the record: spaces, newlines, tabs and similar are legal in Unix
    filenames. They lead to various troubles, but it is up to users to
    avoid them (or to use programs that handle them in a convenient way).
    But that allows for a
    lot of completely daft names, while disallowing some sensible practices.

    Yes, you can create daft names; do not do that. Unix disallows no
    sensible practices (unless you consider a null character or a slash a
    sensible part of a filename).

    There is no structure at all, no possibility for common sense.

    With floating point numbers, 1234 is the same value as 1234. while
    1234.. is an error, but they are all legal and distinct filenames under
    Unix.

    Under Windows, 1234 1234. 1234... all represent the same "1234" file.
    While 123,456 are two files "123" and "456"; finally, some rules!

    In Unix you have a simple rule: any string with no embedded NULs or
    slashes, and within the filesystem's length limit, is a valid
    filename. Different strings give different files. There is no need to
    worry about clashes, and no need to worry that a directory listing
    will give you something different from what you wanted to store. On
    top of that, applications are free to implement their own handling
    and the OS will not stop them. In Windows you have arbitrary rules,
    which AFAIK depend on the codepage of the filesystem.
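
    [A small POSIX-flavoured illustration of that rule, using the names
    from the example above: the three names are distinct byte strings,
    hence three distinct files on Unix, whereas Windows strips the
    trailing dots so the later names collide with "1234".]

        #include <stdio.h>
        #include <fcntl.h>
        #include <unistd.h>

        int main(void) {
            const char *names[] = { "1234", "1234.", "1234.." };

            for (int i = 0; i < 3; i++) {
                /* O_EXCL makes a collision visible as an EEXIST error. */
                int fd = open(names[i], O_CREAT | O_EXCL | O_WRONLY, 0644);
                if (fd >= 0) {
                    printf("created '%s'\n", names[i]);
                    close(fd);
                } else {
                    perror(names[i]);
                }
            }
            return 0;
        }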

    I am afraid that our views diverge here quite a lot. For me, a
    filesystem should store information and allow retrieval of _exactly_
    the same thing that was stored. That includes filenames. Any
    restriction on the form of the information is undesirable; excluding
    NUL and slash is a limitation, but it is quite mild compared to the
    Windows limitations.

    From an abstract point of view, a filesystem is a persistent
    associative table. For a given key (the filename) there is associated
    data (the content of the file plus possible metadata). Theoretically
    you can get fancy about keys and content. Some older OSes required
    users to specify a "file organization" and treated files as sequences
    of "records". Unix simply says that content is a sequence of bytes.
    You may think that the older approach has more structure, but the
    OS-provided records were frequently a poor match for application
    needs. With Unix, an application can organize its data in its own
    way.

    For names, too, one can think about fancy schemes. But names should
    be printable/viewable for "user friendliness", so the simplest way is
    to treat them as strings. Hierarchical organization is not essential
    from an abstract point of view, but it is simple to implement and
    users understand it, so it is natural to have it.

    Concerning metadata, size is important, and I consider modification
    time very important. It is not clear whether one needs 3 times like
    Unix has, and if one goes with multiple times it is not clear which
    ones to store. One could go fancier: Mac OS has the concept of a
    "resource fork", which IIUC can in principle store arbitrary extra
    info. It is not clear to me what that buys compared to storing the
    info directly inside the file. I can imagine benefits if an arbitrary
    program could "attach" its own info to a file in a way that would be
    ignored by other programs which do not understand that information.
    It is not clear to me whether the Mac "resource fork" allows this.

    Anyway, the Unix file system is a straightforward implementation of
    the file-system abstraction. It has no fancy features, just the
    essential things. Its power comes from simplicity and the fact that
    it "just works".

    --
    Waldek Hebisch

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to All on Thu Dec 29 00:37:20 2022
    On 22/12/2022 16:46, Bart wrote:
    [I wrote:]
    [...] You complain
    that you have to write "./hello" rather than just "hello";  but that's
    because "." is not in your "$PATH", which is set by you, not because
    Unix/Linux insists on extra verbiage.  If you need further help, just
    ask.  But I'd expect you to be able to work it out rather than wring
    your hands and flap around helplessly [or blame Unix for it].
    So lots of workarounds to be able to do what DOS, maligned as it was,
    did effortlessly.

    It isn't a "workaround". It's all perfectly normal; commands are
    run from either a directory in your "PATH" variable or a directory that you specify. Where else would you expect to look for executables?

    Don't forget it is not just me personally who would have trouble. For
    over a decade, I was supplying programs that users would have to
    launch from their DOS systems, or on 8-bit systems before that.

    The question is not how to launch them but where to install them.
    If you [or a user following your instructions] install them in a weird
    place, then it's not surprising that you need a weird instruction to run
    them. Or do you expect to have to search the entire computer to find each command? Unix commands are usually installed in standard places such as "/bin", "/usr/bin", "/usr/local/bin" or "$HOME/bin" [usually therefore to
    be found in your default "$PATH"], and usually run by simply naming the command.

    So every one of 1000 users would have to be told how to fix that "."
    problem?

    A "problem" only in your imagination.

    Fortunately, nobody really used Unix back then (Linux was
    not yet ready), at least among our likely customers who were just
    ordinary people.

    Whereas our students all had two heads and three legs? [Before
    you say "Well, they were CS specialists", no they weren't, we didn't
    have a CS dept or course in those days, we just expected all undergrads
    to learn how to use the computer.]

    Fortunate also because, with case-sensitivity in the shell program
    and file system, it would have created a lot more customer-support
    headaches.

    Not the sort of thing that ever gave me or colleagues headaches,
    either as users or in a support role.

    But you can write your own almost trivially;
    it's a "one-line" shell script
    Sure. I also asked, if it is so trivial, why don't programs do that
    anyway?

    Because no-one other than you is confused by the notion that you
    run programs by [merely] naming them [if they're in standard places] or
    saying where to find them [otherwise]?

    Learn something from DOS at least, which is user
    friendliness.

    Really? I found DOS utterly unhelpful. Of course, I came to it
    from Unix, where life was much easier and more transparent; and I read
    the documentation first rather than just "suck it and see" [resulting in
    the sorts of misunderstanding that you often display here].

    [...]> Well, I am standing up for those features and refusing to budge just
    because C and Linux have taken over the world [...].
    Really? According to

    https://www.simplilearn.com/best-programming-languages-start-learning-today-article

    C and C++ combined are 11th on the language list and according to

    https://gs.statcounter.com/os-market-share

    Linux is the 6th OS, with a mighty 1.09%, pretty much level with
    "unknown" and "other", whatever they may be. Rather niche, wouldn't
    you say? Of course, being niche doesn't stop them being influential.

    Notice that most user-facing interfaces tend to be case-insensitive?

    Doesn't this depend on the application? E-mail addresses are case-insensitive for good reasons, tho' I hope you don't make use of this to
    write all your e-mails in camel-case. OTOH, passwords are almost always case-sensitive, for equally good reasons. I expect word-processors and
    similar to take account of case in the text I type; some other applications much less so. The arguments for computer languages have been well-rehearsed here; personally I think keywords should be distinguished from identifiers, but YMMV.

    [...]
    So, nobody here thinks that doing 'mm -ma appl' to produce a one-file
    appl.ma file representing /the entire application/, that can be
    trivially compiled remotely using 'mm appl.ma', is a great idea?

    Well, it seems to work for you, but then I don't know what "mm"
    is beyond you saying "you can do /this/, /that/ and /the other/" in it. Documentation? System requirements? Will it work on my machine? Will
    it work on legacy machines? Does it utilise R, "curses", "plotlib", PostgreSQL, the Gnu scientific library and/or other similar packages
    if they are available? If the answer is a universal "yes", then bully
    for you. Otherwise, it's not a "great idea" /for me/.

    [...]
    Well, have a look at the A68G source bundle for example: [...].
    Like so many, this application starts with a
    'configure' script, although only 9500 lines this time. So I can't
    build it on normal Windows.

    The auto-produced "configure" script is what enables A68G to work
    not only on my current machine, but also the one I had at work just before
    I retired 15 years ago, and on the next machine I buy [soon, probably].
    I didn't have the chance to try it, but from the comments in the code I
    would expect it to work also for the SGI and Sun machines I had at work in
    even earlier times. Of course, if you want to try A68G on your Windows machine, you could try downloading the pre-built Windows ".exe" instead
    of building for Linux?

    [...]
    But typing 'make' again still took 1.4 seconds even with nothing to
    do.
    Then I looked inside the makefile: it was an auto-generated one with
    nearly 3000 lines of crap inside - no wonder it took a second and a
    half to do nothing!

    You said it yourself -- it's auto-generated. You aren't expected
    to read it. It has to [be prepared to] cope not only with my current PC
    but also with ancient SGI and Sun machines, other modern machines, a wide variety of architectures and available libraries. So yes, it's quite
    complex. It doesn't just build and optionally install an A68G executable, "make" is much more versatile than that, and deals with many other aspects
    of controlling software [inc documentation]. Even if it has "nothing to
    do", it takes time to confirm that no relevant file in the directory tree
    has changed so that all dependencies are still met.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Peerson

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Dan Cross on Thu Dec 29 17:47:18 2022
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In article <tof7ol$ovb$1@gioia.aioe.org>, <antispam@math.uni.wroc.pl> wrote:

    Bart <bc@freeuk.com> wrote:

    On 22/12/2022 15:09, Andy Walker wrote:

    You complain
    that you have to write "./hello" rather than just "hello"; but
    that's because "." is not in your "$PATH", which is set by you,
    not because Unix/Linux insists on extra verbiage. If you need
    further help, just ask. But I'd expect you to be able to work
    it out rather than wring your hands and flap around helplessly
    [or blame Unix for it].

    So lots of workarounds to be able to do what DOS, maligned as it
    was, did effortlessly.

    It seems that nobody mentioned this: not having '.' in PATH is a
    relatively recent trend.

    Define "recent": I haven't included `.` in my $PATH for
    30 or so years now. :-)

    How about this: "recent" is any time since you last had '.'
    in your path. :)

    seasoned greetings and Happy almost New Year ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to antispam@math.uni.wroc.pl on Fri Dec 30 02:38:46 2022
    On 27/12/2022 23:56, antispam@math.uni.wroc.pl wrote:
    Bart <bc@freeuk.com> wrote:

    So automate things.

    Isn't gcc already an automated wrapper around compiler, assembler and
    linker?

    You miss the principle of modularity: gcc contains the bulk of the
    code (I mean the wrapper; it is quite large given that all it does is
    handle the command line). Your own code can then provide fixed values
    (so that you do not need to retype them) and more complex behaviours
    (which would be too specialised to have as gcc options).


    In fact, 'make' was designed to be a tool for "directing
    compilation"; it handles things at a larger scale than gcc.

    To put it differently: gcc and make provide mechanisms. It is up to
    you to specify policy.

    So, there isn't such a thing as a 'C compiler' program where you give it
    input A, and it produces A.exe from A.c; or you give it A, B, C and it
    produces A.exe from A.c, B.c, C.c.

    No one apparently thinks that is useful. Unix has a million
    utilities, but not one to compile its most important language with
    the minimum of fuss.

    Unless you count using 'make A', which might work, if it's only one
    module, and doesn't need '-lm', and there doesn't happen to be a
    module called 'A' with some other extension, since make might build
    that instead.

    And that's ignoring that it might not actually invoke the compiler,
    since there are reasons you might want to do so even when the source
    has not changed.
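
    [A sketch of the sort of wrapper being argued over, purely
    illustrative: the wrapper name "ccw" is invented, gcc is assumed to
    be on PATH, and names with spaces are not handled. "ccw A B C" runs
    "gcc -o A.exe A.c B.c C.c".]

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        int main(int argc, char **argv) {
            if (argc < 2) {
                fprintf(stderr, "usage: ccw name [name...]\n");
                return 1;
            }

            char cmd[4096];
            snprintf(cmd, sizeof cmd, "gcc -o %s.exe", argv[1]);

            for (int i = 1; i < argc; i++) {
                strncat(cmd, " ", sizeof cmd - strlen(cmd) - 1);
                strncat(cmd, argv[i], sizeof cmd - strlen(cmd) - 1);
                strncat(cmd, ".c", sizeof cmd - strlen(cmd) - 1);
            }

            puts(cmd);              /* show the command line being used */
            return system(cmd);     /* hand it to the shell / cmd.exe   */
        }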


    BTW: there are now several competitors to make. One is 'cmake'.

    A 90MB installation which ... creates makefiles?

    You don't see how that's going in the wrong direction? How many ways can
    a combination of gcc, make, cmake, configure possibly go wrong?

    And who exactly is this for? I've repeatedly made the point that, for
    some very good reasons, the set of environmental info needed by
    someone who just wants to build P from source need be nothing like
    the set used by the developer of the application.

    Do you think the build instructions for a piece of IKEA furniture need
    to include all the locations in the factory where various parts are located?

    Of course, if your hobby is to continuously rename files and see how
    that affects compilation, then you may do a lot of command-line
    editing. In normal use the same Makefile can be used across tens or
    hundreds of edits to the source files.

    Some folks use compile commands in their editors. That works nicely
    because the rules are simple.

    Now it's not that hard to write a wrapper script or program on top of
    gcc.exe, but if it isn't hard, why doesn't it just do that?

    You miss an important point: gcc gives you a lot of possibilities. A
    simple wrapper which substitutes some defaults would make using
    non-default values harder or impossible.

    Not on Windows. Clearly, Linux /would/ make some things impossible,
    because there are no rules so anything goes.

    What I wrote really has nothing to do with the operating system; this
    is a very general principle.

    The rules for my BCC compiler inputs are:

    Input       Assumes file

    file.c      file.c
    file.ext    file.ext
    file        file.c
    file.       file         # use "file." when the file has no extension

    This is not possible with Unix, since "file." either ends up as "file",
    or stays as "file." You can only refer to "file" or "file.", but not both.

    You can implement your rules on Unix. Of course, one can ask whether
    the rules are useful. As written above, your rules make it impossible
    to access a file named "file." (it will be mangled to "file"), and
    you get quite unintuitive behaviour for "file".

    That's a detail I wasn't aware of when I first learned to know and
    love default file extensions. It showed the machine had a spark of
    intelligence; at least it knew what files it was supposed to work on!

    Perhaps I should have looked 45 years into the future, where people
    use a rather bizarre OS in which the "." in "file.ext" is not a bit
    of syntax that separates the two parts, but is actually part of the
    file's name.

    It's funny that, on Windows, the 'gcc' compiler driver binary is called 'gcc.exe'. Yet you don't have to type 'gcc.exe' to invoke it, it can
    just be 'gcc'. Useful, yes?




    So a silly decision with Unix, which really buys you very little, means
    having to type .c extensions on inputs to C files, for eternity.

    Nobody forces you to have extensions on files.

    Oh, come on! You have a file 'prog'; how do you distinguish between:

    * The source file
    * The matching header file
    * The assembler file
    * The object file
    * The executable file

    'Shebangs' are no good here!

    And do not exaggerate:
    two extra characters from time to time do not make much difference.

    They are incredibly annoying. If I have a sequence of ops on versions
    of the same file, then I have to copy the line, backspace over the
    extension and write a new one, which is... ah yes, it's .asm this
    time!

    It is an utter waste of time. Obviously, you're going to have your
    opinion because that's what you've been forced to do for so long.

    Besides, if it wasn't a big deal, why don't Unix executables have
    extensions in general? Because it would be too much of an imposition
    to have to type them repeatedly.

    If you want
    to have all the functionality of gcc you will end up with a
    complicated command line.

    It's not a simple C compiler that assumes everything it is given is a
    C file.

    As I said, that is not helpful for me. Also, how many file types does
    'as' accept? As that also requires the full extension,

    AFAICS as accepts any extension. I can name my assembler file
    'ss.m' and it works fine.

    But you need an extension. Give it a null extension, and the
    resulting executable, if this is the only module, will clash.

    as ss

    will produce 'a.out' (as you know). If there is no a.out there will
    be no clash; otherwise you need to specify an output file, like:

    as ss -o ss.m

    or

    as ss -o tt

    I think even Unix uses the .o extension for object files. In fact, this:

    gcc -c hello.s

    writes a file called hello.o.

    So why in God's name does this, with or without -c:

    as -c hello.s

    write 'a.out' rather than 'hello.o'?! It is just utterly bizarre.

    Why are you even trying to defend this nonsense?

    And why does this:

    as -c hello.s hello.c

    give an error? Is it now trying to do the job of linker?


    Note: as produces an object file (it does not link). For automatic
    linking use gcc.

    My point however was that the reason gcc needs an explicit extension
    is the number of possible input file types. How many /different/ file
    types does 'as' work with?

    as has no notion of "file type". as takes a stream of bytes and turns
    it into an object file (a different thing from an executable!). The
    stream of bytes may come from a file, but in typical use (when as is
    called by gcc) as gets its input from a pipe.

    Not on Windows; there an anonymous, temporary .o file is created.

    I write my command-line utilities both to be easy to use when manually
    invoked, and for invoking from scripts.

    My experience of Unix utilities is that they do nothing of the sort;
    they have no provision for user-friendliness whatsoever.

    Old saying (possibly mangled): "Unix is friendly, just not everybody
    is its friend".



    You are determined not to learn

    You also seem reluctant to learn how non-Unix systems might work, or
    acknowledge that they could be better and more user-friendly.

    "might work" is almost meaningless. By design Unix allows a lot
    of flexibility so it "might work" in quite different way.

    Funny sort of flexibility: file extensions must be spot-on; no chance of inferring an extension to save the user some trouble. And letter case
    must be spot-on too: so was that file oneTwo or OneTwo or Onetwo? And
    commands must be spot-on as well: you type 'ls OneTwo' then you look up
    and realise Caps-lock was on and you typed 'LS oNEtWO'; urggh! Start again..



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Fri Dec 30 11:08:23 2022
    On 30/12/2022 02:38, Bart wrote:
    [...] And letter
    case must be spot-on too: so was that file oneTwo or OneTwo or
    Onetwo? And commands must be spot-on as well:

    There are 3439 commands in the directories of my "$PATH", of
    which just 15 include upper-case letters. The only one I have ever
    used [in over 40 years!] is "SendAnywhere", which is the name shared
    with the corresponding command on my 'phone, tablet and Mac. I don't
    usually type it anyway, as it's available directly from the launcher.
    I might in principle also use "R", but don't need to as the functions
    are included in "a68g". So not a problem in practice.

    you type 'ls OneTwo'
    then you look up and realise Caps-lock was on and you typed 'LS
    oNEtWO'; urggh! Start again..

    Many years ago, I had a keyboard with an ultra-ultra-sensitive caps-lock key which was activated far too frequently by catching it
    while typing a nearby symbol or key [esp a problem in the early days
    when keyboard layouts were less standardised]. So I removed the key,
    which I never used anyway. For the last 25 years or so, I've instead
    disabled it by software. Solves all such problems.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Wolf

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)