• Introducing the C_ Dialect

    From cHaR@21:1/5 to All on Sat Mar 1 03:07:55 2025
    I started working on a preprocessing-based dialect of C a couple of
    years ago for use in personal projects, and now that its documentation
    is complete, I am pleased to share the reference implementation with
    fellow programmers.

    https://github.com/cHaR-shinigami/c_

    The entire implementation rests on the C preprocessor, and the ellipsis
    framework is its metaprogramming cornerstone, which can perform any kind
    of mathematical and logical computation with function composition.
    A new higher-order function named "omni" is introduced, which provides a
    generalized syntax for operating with arrays and scalars; for example:

    `op_(&arr0, +, &arr1)` adds elements at same indices in arr0 and arr1
    `op_(&arr, *, 10)` scales each element of arr by 10
    `op_(sum, +, &arr)` adds all elements of arr to sum
    `op_(price, -, discount)` is simply price - discount
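
    As a rough illustration of the first three forms, here is what they would
    correspond to in plain C loops. This is only a sketch of the described
    behaviour, not how C_ implements `op_`: the function names are
    hypothetical, the element type is fixed to int for brevity, and it assumes
    the result is stored back into the first operand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain-C counterparts of the op_ forms; not part of C_. */

/* Roughly op_(&arr0, +, &arr1): dst[i] += src[i] for every index. */
void add_arrays(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Roughly op_(&arr, *, 10): multiply every element by a scalar. */
void scale_array(int *arr, size_t n, int factor)
{
    assert(arr != NULL);
    for (size_t i = 0; i < n; i++)
        arr[i] *= factor;
}

/* Roughly op_(sum, +, &arr): fold every element into a scalar. */
void sum_into(int *sum, const int *arr, size_t n)
{
    assert(sum != NULL && arr != NULL);
    for (size_t i = 0; i < n; i++)
        *sum += arr[i];
}
```

    The appeal of the single `op_` spelling is that one call shape covers all
    three loops; the exact rules are in chapters 4 and 5 of the documentation.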

    The exact semantics are a tad detailed, and can be found in chapters 4
    and 5 of the documentation.

    C_ establishes quite a few naming conventions: for example, type
    synonyms are named with a leading uppercase letter, the notable aspect
    being that they are non-modifiable by default; adding a trailing
    underscore makes them modifiable. Thus an Int cannot be modified after
    initialization, but an Int_ can be.

    The same convention is also followed for pointers: `Ptr (Char_) ptr`
    means `ptr` cannot be modified but `*ptr` (type Char_) can be, whereas
    `Ptr_(Char) ptr_` means something else: `ptr_` can be modified but
    `*ptr_` (type Char) cannot be. `Ptr (Int [10]) p1, p2` says both are
    non-modifiable pointers to a non-modifiable array of 10 integers; this
    conveys intent more clearly than the conventional
    `const int (* const p1)[10], p2`, which ends up declaring something else:
    `p2` is not a pointer, but a plain non-modifiable int.
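
    The pitfall with the conventional declaration can be checked in standard
    C11. The sketch below uses only plain C, not C_; the names `table`,
    `ConstArrayPtr`, `q1` and `q2` are made up for illustration. It shows that
    the pointer part of the declarator binds to `p1` alone, and that a typedef
    is the usual plain-C way to apply it to every name.

```c
#include <assert.h>

const int table[10] = {0};

/* One declaration, two declarators: "(*const ...)[10]" belongs to p1 only,
   so p2 is just a const int. */
const int (*const p1)[10] = &table, p2 = 42;

/* A typedef carries the whole pointer type to every declarator. */
typedef const int (*const ConstArrayPtr)[10];
ConstArrayPtr q1 = &table, q2 = &table;

/* Compile-time checks of the declared types (C11 _Generic). */
static_assert(_Generic(&p2, const int *: 1, default: 0),
              "p2 is a plain const int, not a pointer");
static_assert(_Generic(&q2, const int (*const *)[10]: 1, default: 0),
              "q2 is a const pointer to an array of 10 const int");
```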

    C_ blends several ideas from object-oriented paradigms and functional
    programming to facilitate abstraction-oriented designs with protocols,
    procedures, classes and interfaces, which are explored from chapter 6
    onward. For algorithm enthusiasts, I have also presented my designs for
    two new(?) sorting strategies in the same chapter: "hourglass sort" uses
    twin heaps with quick sort, and "burrow sort" uses a quasi-inplace
    merge strategy. For the preprocessor sorting, I have used a custom-made
    variant of adaptive bubble sort.
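
    For reference, the textbook runtime form of adaptive bubble sort looks
    like the sketch below. It is not the preprocessor variant used in C_ (that
    one is custom-made and works on macro arguments), and the function name is
    hypothetical. "Adaptive" here means that each pass remembers where the last
    swap happened, so input that is already sorted finishes after one pass.

```c
#include <assert.h>
#include <stddef.h>

/* Textbook adaptive bubble sort on an int array (ascending order). */
void adaptive_bubble_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    while (n > 1) {
        size_t last_swap = 0;
        for (size_t i = 1; i < n; i++) {
            if (a[i - 1] > a[i]) {
                int tmp = a[i - 1];
                a[i - 1] = a[i];
                a[i] = tmp;
                last_swap = i;
            }
        }
        /* Everything at or beyond last_swap is already in its final place;
           if no swap happened, last_swap is 0 and the loop ends. */
        n = last_swap;
    }
}
```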

    The examples have been tested with gcc-14 and clang-19 on a
    32-bit variant of Ubuntu with glibc 2.39; setting the path for header
    files is shown in the README file, and other options are discussed in
    the documentation. I should mention that due to the massive (read as
    obsessive) use of preprocessing by yours truly, the transpilation to C
    programs is slow enough to rival the speed of a tortoise. This is
    currently a major bottleneck without an easy solution.

    Midway through the development, I set an ambitious goal of achieving
    full conformance with the C23 standard (back then in its draft stage),
    and several features have evolved through a long cycle of changes to
    fix language-lawyer(-esque) corner cases that most programmers never
    worry about. While the reference implementation may not have touched
    the finish line of that goal, it is close enough, and at the very
    least, I believe that the ellipsis framework fully conforms to C99
    rules of the preprocessor (if not, then it is probably a bug).
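
    To give a flavour of what C99-conforming preprocessor metaprogramming
    looks like, here is the well-known argument-counting idiom built on
    variadic macros. It is not taken from C_, and the macro names are
    hypothetical; that the ellipsis framework builds on `...` macros in a
    similar spirit is only an assumption based on its name. The sketch handles
    1 to 8 arguments and always passes at least one argument to the variadic
    part, as strict C99 requires.

```c
#include <assert.h>

/* Pick the ninth argument; the trailing "..." always receives at least one
   argument, which keeps the call valid under strict C99 rules. */
#define ARG_COUNT_PICK(a1, a2, a3, a4, a5, a6, a7, a8, n, ...) n

/* The caller's arguments push the descending list to the right, so the
   number landing in the ninth slot equals the argument count (1 to 8). */
#define ARG_COUNT(...) \
    ARG_COUNT_PICK(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

/* The result is an integer constant, so it can size an array ... */
int sample_values[ARG_COUNT(10, 20, 30)] = {10, 20, 30};

/* ... or feed a compile-time check; the arguments are never evaluated. */
static_assert(ARG_COUNT(x, y) == 2, "two arguments counted");
```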

    The documentation has been prepared in LaTeX and its generated PDF
    (with 300-ish pages of content) can be downloaded from
    https://github.com/cHaR-shinigami/c_/blob/main/c_.pdf

    I tried to maintain a formal style of writing throughout the document,
    and as an unintended byproduct, some of the wording may seem overly
    standardese. I am not sure if being a non-native English speaker was
    an issue here, but I am certain that the writing can be made more
    beginner-friendly in future revisions without loss of technical rigor.

    Although this took considerably longer than I had anticipated, the
    code is still not quite polished, and the dialect has not matured
    enough to suggest that it will "wear well with experience". However, I
    do hope that at least some parts of it can serve a greater purpose for
    other programmers to build something better. Bug reports on the
    reference implementation, documentation typos, and general suggestions
    on improving the dialect to widen its scope of application are always
    welcome.

    Regards,
    cHaR

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ar Rakin@21:1/5 to cHaR on Sat Mar 1 22:16:47 2025
    Very interesting. I haven't looked at everything yet, but I just want
    to give my opinion on the language syntax. I will post my detailed
    opinions and takes later.

    From the PDF documentation:

    --------------------------------------------
    #include <c._>

    Int_ main(Int argc, Ptr(Char_) argv[const])
    begin
        guard_(argc == 2)

        Auto_ count_ = 0U;
        guard_(input__(argv[1], count_) == 1)
        Var prices = new__(Float_ [count_]);
        guard_(prices)

        Var discounts = new__(Float_ [count_]);
        guard_(discounts)

        loop_(0, count_ - 1)
            print_("Enter price and discount (in %) for item", _i_ + 1);
            guard_(scan__((*prices)[_i_], (*discounts)[_i_]) == 2, 1)
        end

        Auto_ price_ = 0.f;
        op_(price_, +, prices)
        print_("Total price is", price_);
        op_(discounts, *, prices)
        op_(discounts, /, 100)

        loop_(0, count_ - 1)
            print_("Discount on item", _i_ + 1, "is", (*discounts)[_i_]);
        end

        Auto_ discount_ = .0f;
        op_(discount_, +, discounts)

        print_("Total discount is", discount_);
        print_("Final price is", price_ - discount_);
        print_("Have a nice day");

    end
    --------------------------------------------

    I like the types being in pascal case (in the form of Int_ and int_).
    The function and block syntax looks a little bit like Elixir. However, I
    see that some statements do not end with a semicolon. Is that how it is
    supposed to be? If so, then it is a bit inconsistent.

    Other than that, the dialect looks fancy. Good work!

    --
    Rakin
  • From cHaR@21:1/5 to Ar Rakin on Sat Mar 1 22:42:38 2025
    Yes, the C_-specific statements do not need a semicolon, though we can
    always use one for consistency. The only time it would cause issues is
    with code like:

    if (expr) op_(arr_ptr, +, 10);
    else /* something else */

    It will not compile, as the `else` gets detached from the `if` due to the
    semicolon, but it would only cause a compilation error, not a runtime issue.
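
    The same effect can be reproduced in plain C with any macro that expands
    to a complete braced block. The sketch below is an analogy only: the
    macros `ADD_TO_ALL` and `ADD_TO_ALL_STMT` are hypothetical, and it merely
    assumes that a C_ statement such as `op_` behaves like a self-terminating
    block. The second macro shows the common do-while(0) idiom, which goes the
    other way and requires the semicolon.

```c
#include <assert.h>
#include <stddef.h>

/* Expands to a complete block, so no trailing ';' is needed. */
#define ADD_TO_ALL(arr, n, k) \
    { for (size_t i_ = 0; i_ < (n); i_++) (arr)[i_] += (k); }

/* do-while(0) form: a single statement that must end with ';'. */
#define ADD_TO_ALL_STMT(arr, n, k) \
    do { for (size_t i_ = 0; i_ < (n); i_++) (arr)[i_] += (k); } while (0)

void adjust(int *arr, size_t n, int flag)
{
    assert(arr != NULL || n == 0);
    /* Writing "if (flag) ADD_TO_ALL(arr, n, 10); else ..." would not
       compile: the extra ';' is an empty statement that ends the if,
       leaving the else with nothing to attach to. */
    if (flag) ADD_TO_ALL(arr, n, 10)
    else ADD_TO_ALL(arr, n, -10)
}

void adjust_stmt(int *arr, size_t n, int flag)
{
    assert(arr != NULL || n == 0);
    /* With do-while(0), the semicolon is part of the statement. */
    if (flag) ADD_TO_ALL_STMT(arr, n, 10);
    else ADD_TO_ALL_STMT(arr, n, -10);
}
```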

    Also, thanks for the feedback on type naming: the idea is to declare
    everything as non-modifiable (Int or Char) as a default practice, and
    to add the underscore suffix later (Int_ or Char_) only if the variable
    needs to be updated. While this is impractical for all variables, a good
    starting point is to always declare function parameters as non-modifiable.
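
    In plain C terms, the practice amounts to const-qualifying parameters by
    default. The sketch below is ordinary C rather than C_, with hypothetical
    function names; the comments map each declaration to the C_ spelling it
    would roughly correspond to under the naming convention described in the
    original post.

```c
#include <assert.h>
#include <stddef.h>

/* All three parameters are non-modifiable, roughly "Int" in C_. */
int clamp(const int value, const int low, const int high)
{
    assert(low <= high);
    /* value = low;   <- rejected by the compiler: value is const */
    return value < low ? low : (value > high ? high : value);
}

/* text: const pointer to const char, roughly "Ptr (Char)" in C_. */
size_t count_char(const char *const text, const char wanted)
{
    assert(text != NULL);
    size_t count = 0;  /* needs updating, so it stays modifiable */
    /* cursor: modifiable pointer to const char, roughly "Ptr_(Char)". */
    for (const char *cursor = text; *cursor != '\0'; cursor++)
        if (*cursor == wanted)
            count++;
    return count;
}
```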
  • From Ar Rakin@21:1/5 to Ar Rakin on Sat Mar 1 22:19:30 2025
    On 3/1/25 10:16 PM, Ar Rakin wrote:
    I like the types being in pascal case (in the form of Int_ and int_).

    I would like to correct myself: I meant to say in the form of Int_ and
    *not* int_.

    --
    Rakin