• set-optimizer as an API for per-word optimizer

    From Ruvim@21:1/5 to Anton Ertl on Sun Nov 20 15:38:47 2022
    On 2022-10-01 07:06, Anton Ertl wrote:

    SET-OPTIMIZER sets the implementation of COMPILE, ( xt -- ) for
    the current word. A correct implementation of COMPILE, does not
    change the semantics in any way, only the implementation of the
    semantics.

    It seems, "set-optimizer" as a basis for such an API is suboptimal,
    since you have to describe the same semantics *twice*, and you have a
    chance to do it incorrectly.

    Actually, if we have a definition that compiles some behavior, the
    definition that performs this behavior can be created automatically.

    I mean, if we have a definition:

    : compile-foo ( -- ) ... ;

    that appends behavior "foo" to the current definition,
    then a word "foo" can be defined as

    : foo [ compile-foo ] ;


    Then, why do we need to define both "foo" and "compile-foo" by hand?
    Having one of them, another can be created automatically.


    A better API for per-word optimization should require the user to define
    only the compiler for a word, and the word itself will be created automatically.

    For example:

    [: postpone over postpone over ;] "2dup" define-by-compiler

    compiler: 2dup ]] over over [[ ;

    : value
      create ,
      [: ( addr -- ) lit, postpone @ ;] does-by-compiler
    ;
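
    A minimal sketch of how a word could be derived from its compiler alone, using only standard words. The names DEFINE-BY-COMPILER: (a parsing variant that takes the name from the input stream) and MY-2DUP are made up for illustration. The sketch only creates the word; registering the compiler as the word's optimizer would still need a system-specific hook, and the defining word is meant to be run in interpretation state.

```forth
\ Hypothetical parsing variant: the name is taken from the input stream.
\ xt.compiler ( -- ) must append the desired behaviour to the current definition.
: define-by-compiler: ( xt.compiler "name" -- )
  >r  :                \ start a colon definition for "name"
  r> execute           \ let the compiler append the behaviour to it
  postpone ; ;         \ finish the definition

:noname  postpone over postpone over ;  define-by-compiler: my-2dup

3 4 my-2dup  .s        \ the stack should now hold 3 4 3 4
```

    Here the compiler is the only place where the behaviour "over over" is written down, which is the point of the proposal.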


    BTW, I don't see why xt should be passed to a compiler (as it's done in "set-compiler"). In what cases it's useful?


    --
    Ruvim

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Ruvim on Sun Nov 20 16:43:40 2022
    Ruvim <ruvim.pinka@gmail.com> writes:
    On 2022-10-01 07:06, Anton Ertl wrote:

    SET-OPTIMIZER sets the implementation of COMPILE, ( xt -- ) for
    the current word. A correct implementation of COMPILE, does not
    change the semantics in any way, only the implementation of the
    semantics.

    It seems, "set-optimizer" as a basis for such an API is suboptimal,
    since you have to describe the same semantics *twice*, and you have a
    chance to do it incorrectly.

    Yes.

    Actually, if we have a definition that compiles some behavior, the
    definition that performs this behavior can be created automatically.

    Yes, but ... [see below].

    I mean, if we have a definition:

    : compile-foo ( -- ) ... ;

    that appends behavior "foo" to the current definition,
    then a word "foo" can be defined as

    : foo [ compile-foo ] ;

    Actually it's

    : compile-foo ( xt -- ) ... ;
    : foo recursive [ ' foo compile-foo ] ;

    If COMPILE-FOO just drops the xt, you can just pass 0 to COMPILE-FOO.

    Then, why do we need to define both "foo" and "compile-foo" by hands?
    Having one of them, another can be created automatically.

    The usual usage of SET-COMPILER is in defining words, e.g.

    : constant1 ( n "name" -- )
      create ,
      ['] @ set-does>
      [: >body @ ]] literal [[ ;] set-optimizer ;

    Here you have the advantage that the constant needs only one cell in
    addition to the header. Yes, you have the disadvantage that the
    SET-DOES> and SET-OPTIMIZER actions might disagree, leading to
    incorrect behaviour. An additional aspect here is that this
    definition assumes that the value of the constant is not changed.

    Could we avoid the redundancy and the potential disagreement? You
    suggest creating a colon definition for "name". How could this work?
    We have to store N somewhere. What I can come up with is:

    : lit, postpone literal ;
    : constant2 ( n "name" -- )
      >r :noname r> ]] drop literal lit, ; [[ >r
      : 0 r@ execute postpone ; r> set-optimizer ;

    The definition

    5 constant1 five1

    takes 6 cells (on a 64-bit machine) in the dictionary, while

    5 constant2 five2

    takes 16 cells in the dictionary plus 146 Bytes of native code with
    the debugging engine on AMD64.

    Moreover, I had several bugs in CONSTANT2 until I got it right, but
    that could get better with more practice. But will it get better than
    the alternative? The code is larger, so that's far from clear.

    In any case, it seems to me that the size advantage alone makes the
    CONSTANT1 approach preferable. Yes, you describe the same thing
    twice, and you may get it wrong in one description while getting it
    right in the other, so you have to test both implementations separately
    (e.g., interpret the word once, and include it in a colon definition,
    and use the same tests on it; maybe we could automate that), but such
    bugs are rare.
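
    A rough sketch of what such an automated check could look like for words with the stack effect ( -- x ), using only standard words. The name SAME-BOTH-WAYS? is made up for illustration; the helper must be run in interpretation state, and it only compares one result, so it is no substitute for a real test suite.

```forth
\ Hypothetical test helper for words with stack effect ( -- x ).
\ It runs the word once through EXECUTE and once through a definition
\ built with COMPILE, and compares the two results.
: same-both-ways? ( xt -- flag )
  dup execute swap            \ x1 xt : x1 is what EXECUTE produces
  >r :noname                  \ start an anonymous definition
  r> compile,                 \ compile the word into it via COMPILE,
  postpone ;                  \ finish it, leaving its xt
  execute = ;                 \ x1 x2 -- flag

5 constant five
' five same-both-ways? .      \ prints -1 (true) if both paths agree
```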

    A better API for per-word optimization should require the user to define
    only the compiler for a word, and the word itself will be created automatically.

    For example:

    [: postpone over postpone over ;] "2dup" define-by-compiler

    compiler: 2dup ]] over over [[ ;

    : value
    create ,
    [: ( addr -- ) lit, postpone @ ;] does-by-compiler
    ;

    The first two are alternatives; the third one addresses a different
    need. For the VALUE example, how does the implementation work? I can
    imagine how it works for the 2DUP examples.

    BTW, I don't see why xt should be passed to a compiler (as it's done in "set-compiler"). In what cases it's useful?

    It's useful for getting the value of the constant in CONSTANT1. It's
    also the interface of COMPILE,. SET-OPTIMIZER only defines what
    COMPILE, does for the word that SET-OPTIMIZER is applied to. If
    COMPILE, instead DROPped the xt and only then called the word that we
    pass with SET-OPTIMIZER, that works nicely for the 2DUP example, but
    how would DOES-BY-COMPILER produce the ADDR that is passed to the xt?

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: https://euro.theforth.net

  • From albert@21:1/5 to ruvim.pinka@gmail.com on Mon Nov 21 11:26:15 2022
    In article <tldhm8$3hk6b$1@dont-email.me>,
    Ruvim <ruvim.pinka@gmail.com> wrote:
    On 2022-10-01 07:06, Anton Ertl wrote:

    SET-OPTIMIZER sets the implementation of COMPILE, ( xt -- ) for
    the current word. A correct implementation of COMPILE, does not
    change the semantics in any way, only the implementation of the
    semantics.

    It seems, "set-optimizer" as a basis for such an API is suboptimal,
    since you have to describe the same semantics *twice*, and you have a
    chance to do it incorrectly.

    SET-OPTIMIZER seems to be a glorified peep-hole optimiser.
    A general optimiser that works on a whole definition, then inlines and
    processes the resulting code, is described in https://home.hccnet.nl/a.w.m.van.der.horst/forthlecture5.html

    <SNIP>

    --
    Ruvim
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

  • From minforth@arcor.de@21:1/5 to Ruvim on Mon Nov 21 04:03:37 2022
    Ruvim schrieb am Sonntag, 20. November 2022 um 16:38:51 UTC+1:
    On 2022-10-01 07:06, Anton Ertl wrote:

    SET-OPTIMIZER sets the implementation of COMPILE, ( xt -- ) for
    the current word. A correct implementation of COMPILE, does not
    change the semantics in any way, only the implementation of the
    semantics.

    It seems, "set-optimizer" as a basis for such an API is suboptimal,
    since you have to describe the same semantics *twice*, and you have a
    chance to do it incorrectly.

    I agree, too much handcrafting required for my taste. What are compilers
    good for? _Automatic_ translation from a source to a target language.
    In most Forths that would be assembler or machine code.

    Unfortunately or luckily (depending on one's POV) the standard Forth
    compiler is ultra-dumb (it can even generate correct code from state-
    smart definitions). AFAIU the SET-OPTIMIZER scheme is one way to
    enhance or even bypass the dumb compiler for better results through
    lots of cryptic meta-information. I would expect a better Forth compiler
    to parse run-time & compile-time stack diagrams to generate the greater
    part of this meta-information automatically.

    IIRC gforth uses vmgen to prepare Forth code for gcc and relies on
    the many optimization passes built into gcc: https://gcc.gnu.org/onlinedocs/gccint/Passes.html
    What does SET-OPTIMIZER do better than gcc's optimizers?

  • From Anton Ertl@21:1/5 to minf...@arcor.de on Mon Nov 21 20:27:06 2022
    "minf...@arcor.de" <minforth@arcor.de> writes:
    Ruvim schrieb am Sonntag, 20. November 2022 um 16:38:51 UTC+1:
    It seems, "set-optimizer" as a basis for such an API is suboptimal,
    since you have to describe the same semantics *twice*, and you have a
    chance to do it incorrectly.

    I agree, too much handcrafting required for my taste. What are compilers
    good for? _Automatic_ translation from a source to a target language.

    Yes, but someone has to write the compiler. And how do these people
    plug it into the Forth system? With SET-OPTIMIZER. And since this is
    Forth, they don't bury their tools. So you have the option of using
    SET-OPTIMIZER. Or you can rely on what gets done automatically when
    you don't use SET-OPTIMIZER. The latter option is correct, but may be
    slower.

    Unfortunately or luckily (depending on one's POV) the standard Forth
    compiler is ultra-dumb (it can even generate correct code from state-
    smart definitions). AFAIU the SET-OPTIMIZER scheme is one way to
    enhance or even bypass the dumb compiler for better results through
    lots of cryptic meta-information.

    State-smartness does not come into play at the level where COMPILE,/SET-OPTIMIZER operate; but of course a correct COMPILE,
    generates correct code when you pass it the xt of a STATE-smart word,
    whether it generates more or less efficient code. Whether the word
    that you now have poisoned with STATE-smartness behaves as intended by
    you and as expected by others is another story.

    I would expect a better Forth compiler
    to parse run-time & compile-time stack diagrams to generate the greater
    part of this meta-information automatically.

    COMPILE,/SET-OPTIMIZER works at a different level and sees only one
    word each time COMPILE, is invoked.
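
    To illustrate the level at which COMPILE, operates, here is a small sketch with made-up names DOUBLE-A and DOUBLE-B. The second definition is built by calling COMPILE, by hand, one xt at a time. Calling COMPILE, between [ and ] works in Gforth and most systems, but the standard does not guarantee it, so treat it as an assumption.

```forth
\ Two ways to build the same definition; the names are illustrative only.
: double-a ( n -- 2n )  dup + ;                             \ the text interpreter calls COMPILE, for us
: double-b ( n -- 2n )  [ ' dup compile,  ' + compile, ] ;  \ we call COMPILE, once per word

21 double-a .  21 double-b .   \ both print 42
```

    Each COMPILE, call sees only the single xt it is given; it knows nothing about the word compiled before or after it.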

    But I would not expect a Forth compiler that compiles a colon
    definition at a time to benefit from stack diagrams. The words to be
    compiled determine the stack effect, while the stack effect comment
    may be wrong or the compiler may misunderstand it.

    IIRC gforth uses vmgen to prepare Forth code for gcc and relies on
    the many optimization passes built into gcc: https://gcc.gnu.org/onlinedocs/gccint/Passes.html
    What does SET-OPTIMIZER do better than gcc's optimizers?

    It's faster. Therefore it is used at the Forth system run-time, while
    gcc is only used at Gforth build time.

    Martin Maierhofer did a Forth2C compiler in 1995 that used gcc for
    its back end, but it's a proof of concept. There has not been much
    interest from the Forth community in this work (or work by others that
    compiled through C).

    @InProceedings{ertl&maierhofer95,
    author = {M. Anton Ertl and Martin Maierhofer},
    title = {Translating {Forth} to Efficient {C}},
    crossref = {euroforth95},
    url = {http://www.complang.tuwien.ac.at/papers/ertl%26maierhofer95.ps.gz},
    url2 = {http://www.complang.tuwien.ac.at/papers/ertl%26maierhofer95.pdf},
    abstract = {An automatic translator can translate Forth into C
    code which the current generation of optimizing C
    compilers compiles to efficient machine code. I.e.,
    the resulting code keeps stack items in registers
    and rarely updates the stack pointer. This paper
    presents a simple translation method that produces
    efficient C code, describes an implementation of the
    method and presents results achieved with this
    implementation: The translated code is 4.5--7.5
    times faster than Gforth (the fastest measured
    interpretive system), 1.3--3 times faster than
    BigForth 386 (a native code compiler), and smaller
    than Gforth's threaded code.}
    }

    @Proceedings{euroforth95,
    title = "EuroForth~'95 Conference Proceedings",
    booktitle = "EuroForth~'95 Conference Proceedings",
    year = "1995",
    key = "EuroForth '95",
    address = "Schloss Dagstuhl, Germany",
    }

    - anton
  • From Anton Ertl@21:1/5 to albert@cherry. on Mon Nov 21 20:28:48 2022
    albert@cherry.(none) (albert) writes:
    SET-OPTIMIZER seems to be a glorified peep-hole optimiser.

    What makes you think so?

    COMPILE, is not even a peephole optimizer; it just compiles a single
    word.

    - anton
  • From P Falth@21:1/5 to Ruvim on Mon Nov 21 13:23:11 2022
    On Sunday, 20 November 2022 at 16:38:51 UTC+1, Ruvim wrote:
    On 2022-10-01 07:06, Anton Ertl wrote:

    SET-OPTIMIZER sets the implementation of COMPILE, ( xt -- ) for
    the current word. A correct implementation of COMPILE, does not
    change the semantics in any way, only the implementation of the
    semantics.

    It seems, "set-optimizer" as a basis for such an API is suboptimal,
    since you have to describe the same semantics *twice*, and you have a
    chance to do it incorrectly.

    Actually, if we have a definition that compiles some behavior, the definition that performs this behavior can be created automatically.

    I mean, if we have a definition:

    : compile-foo ( -- ) ... ;

    that appends behavior "foo" to the current definition,
    then a word "foo" can be defined as

    : foo [ compile-foo ] ;

    In lxf/ntf this is also done in 2 steps.

    :p DUP 0 v-dup ;p

    Will define the compilation part

    :r DUP dup ;r

    will define the interpretive part, attaching that xt to the already created header

    lxf/ntf is metacompiled. The metacompilation step creates a kernel with
    all the :p ;p definitions. At the extension step, the first file compiled
    is the one with all the :r ;r definitions for the interpretive parts of
    the words. It would be very difficult (if not impossible) to also create
    the interpretive parts during metacompilation: the newly created compiler
    part might not even be in executable memory. In lxf/ntf the code
    generator is not an add-on; it is the only way to create code, and it
    cannot be turned off. New code generators can be added with :p ;p and
    :r ;r, but that is very seldom needed. There are already over 400!

    One objective for the design of the code generator was that standard code should produce fast machine code.

    lxf/ntf does not have set-optimizer and COMPILE, will just compile a call
    to the xt.

    BR
    Peter Fälth

  • From albert@21:1/5 to Anton Ertl on Tue Nov 22 16:07:28 2022
    In article <2022Nov21.212848@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    albert@cherry.(none) (albert) writes:
    SET-OPTIMIZER seems to be a glorified peep-hole optimiser.

    What makes you think so?

    COMPILE, is not even a peephole optimizer; it just compiles a single
    word.

    You almost got me! I thought it has something to do with optimisation.
    My fault.


    - anton

    Groetjes Albert
  • From Anton Ertl@21:1/5 to albert@cherry. on Wed Nov 23 12:02:43 2022
    albert@cherry.(none) (albert) writes:
    COMPILE, is not even a peephole optimizer; it just compiles a single
    word.

    You almost got me! I thought it has something to do with optimisation.

    What makes you think it does not?

    SET-OPTIMIZER must not be used for changing the behaviour of
    COMPILE,ing the xt (the meaning of COMPILE, is fixed), so the only
    correct use is to change the implementation; the primary use is for
    improving the generated code (i.e., optimization). A secondary
    potential use is instrumentation, but we have not used it for that
    yet.

    Let's see what happens if we use the most general COMPILE,
    implementation instead of the ones installed with SET-OPTIMIZER:

    sieve bubble matrix   fib   fft  numbers on a 4GHz Skylake
    0.078  0.109  0.044 0.068 0.025  gforth-fast with SET-OPTIMIZER (default)
    0.181  0.219  0.138 0.274 0.091  gforth-fast without SET-OPTIMIZER
    0.144  0.213  0.100 0.201 0.069  gforth-itc with SET-OPTIMIZER
    0.152  0.237  0.102 0.228 0.071  gforth-itc without SET-OPTIMIZER (default)

    The invocations for these four measurements were (same order as above):

    gforth-fast onebench.fs
    gforth-fast -e ":noname ['] lit peephole-compile, , ['] execute peephole-compile, ; is compile," onebench.fs
    gforth-itc -e "' opt-compile, is compile," onebench.fs
    gforth-itc onebench.fs

    - anton
  • From Ruvim@21:1/5 to Anton Ertl on Wed Nov 23 17:26:00 2022
    On 2022-11-20 16:43, Anton Ertl wrote:
    Ruvim <ruvim.pinka@gmail.com> writes:
    On 2022-10-01 07:06, Anton Ertl wrote:

    SET-OPTIMIZER sets the implementation of COMPILE, ( xt -- ) for
    the current word. A correct implementation of COMPILE, does not
    change the semantics in any way, only the implementation of the
    semantics.

    It seems, "set-optimizer" as a basis for such an API is suboptimal,
    since you have to describe the same semantics *twice*, and you have a
    chance to do it incorrectly.

    Yes.

    Actually, if we have a definition that compiles some behavior, the
    definition that performs this behavior can be created automatically.

    Yes, but ... [see below].

    I mean, if we have a definition:

    : compile-foo ( -- ) ... ;

    that appends behavior "foo" to the current definition,
    then a word "foo" can be defined as

    : foo [ compile-foo ] ;

    Actually it's

    : compile-foo ( xt -- ) ... ;
    : foo recursive [ ' foo compile-foo ] ;

    If COMPILE-FOO just drops the xt, you can just pass 0 to COMPILE-FOO.

    Then, why do we need to define both "foo" and "compile-foo" by hands?
    Having one of them, another can be created automatically.

    The usual usage of SET-COMPILER is in defining words, e.g.

    : constant1 ( n "name" -- )
    create ,
    ['] @ set-does>
    [: >body @ ]] literal [[ ;] set-optimizer ;

    To me, "lit," looks far more comprehensible than "]] literal [["


    Here you have the advantage that the constant needs only one cell in
    addition to the header. Yes, you have the disadvantage that the
    SET-DOES> and SET-OPTIMIZER actions might disagree, leading to
    incorrect behaviour. An additional aspect here is that this
    definition assumes that the value of the constant is not changed.

    Could we avoid the redundancy and the potential disagreement? You
    suggest creating a colon definition for "name". How could this work?
    We have to store N somewhere. What I can come up with is:

    : lit, postpone literal ;
    : constant2 ( n "name" -- )
    >r :noname r> ]] drop literal lit, ; [[ >r
    : 0 r@ execute postpone ; r> set-optimizer ;

    The definition

    5 constant1 five1

    takes 6 cells (on a 64-bit machine) in the dictionary, while

    5 constant2 five2

    takes 16 cells in the dictionary plus 146 Bytes of native code with
    the debugging engine on AMD64.



    The code is larger since create-does in Gforth avoids duplicating some
    code parts (i.e., it uses one instance for many definitions), and
    since anonymous definitions are too heavy in Gforth. For example,
    ":noname ;" takes 24 bytes (3 cells) in Gforth, 3 bytes (3/4 cells) in
    SwiftForth 3.11.6, and 1 byte (1/4 cells) in SP-Forth/4.

    For colon definitions this difference should not be so drastic.

    OTOH, if you provide an optimizer that generates longer code instead of
    a definition call, it's probably not a problem that the definition
    itself takes more space.



    Moreover, I had several bugs in CONSTANT2 until I got it right, but
    that could get better with more practice. But will it get better than
    the alternative? The code is larger, so that's far from clear.


    Having proper tools, it should not be more difficult.

    The compiler in "constant1":
    [: >body @ ]] literal [[ ;] ( xt )

    The compiler in "constant2":
    >r :noname r> ]] drop literal lit, ; [[ ( xt )


    They can be expressed far more simply, as follows.

    In "constant1":
    [: >body @ lit, ;]

    In "constant2":
    ['] lit, partial1



    In any case, it seems to me that the size advantage alone makes the
    CONSTANT1 approach preferable. Yes, you describe the same thing
    twice, and you may get it wrong in one description while getting it
    right in the other, so you have test both implementations separately
    (e.g., interpret the word once, and include it in a colon definition,
    and use the same tests on it; maybe we could automate that), but such
    bugs are rare.

    A better API for per-word optimization should require the user to define
    only the compiler for a word, and the word itself will be created
    automatically.

    For example:

    [: postpone over postpone over ;] "2dup" define-by-compiler

    compiler: 2dup ]] over over [[ ;

    : value
    create ,
    [: ( addr -- ) lit, postpone @ ;] does-by-compiler
    ;

    The first two are alternatives, the third one addresses a different
    need. For the VALUE example, how does the implementation work; I can
    imagine how it works for the 2DUP examples.

    Ideally, an implementation for such "does-by-compiler" should be
    supported by the corresponding implementations for "create" and "does>".

    But for the purpose of a PoC we can do it less efficiently. So a
    Gforth-specific PoC follows.

    In Gforth, ":" and ":noname" affect "latestxt" (which is used by
    "set-does>" and "set-compiler"), but "[: ... ;]" doesn't affect it.
    So I use the latter construct to create intermediate helper definitions.
    The intermediate definitions are needed to adapt the interface of "does-by-compiler" to the interface of "set-does>" and "set-optimizer"
    in Gforth.

    : begin-quot ( C: -- quotation-sys colon-sys ) ['] [: execute ;
    : end-quot ( C: quotation-sys colon-sys -- xt ) postpone ;] ;

    : does-by-compiler ( xt.compiler -- ) \ xt.compiler ( addr.body -- )
      latestxt >body >r >r ( R: addr.body xt.compiler )
      begin-quot
        postpone drop \ the passed addr.body is not needed
        2r@ execute
      end-quot set-does>
      begin-quot
        postpone drop \ the passed xt is not needed
        r> r> lit, compile,
      end-quot set-optimizer
    ;


    A usage example:

    : val ( x "name" -- )
    create , [: lit, postpone @ ;] does-by-compiler
    ;

    123 val x
    x . \ prints 123
    : foo x . ; foo \ prints 123
    456 ' x >body !
    x . \ prints 456
    see foo \ should show an optimized variant


    BTW, I don't see why xt should be passed to a compiler (as it's done in
    "set-compiler"). In what cases it's useful?

    It's useful for getting the value of the constant in CONSTANT1.

    As I can see, what is actually needed in this case is not an xt but a
    data field address.

    Do we have an example when an xt itself is needed?



    It's also the interface of COMPILE,. SET-OPTIMIZER only defines
    what COMPILE, does for the word that SET-OPTIMIZER is applied to. If
    COMPILE, instead DROPped the xt and only then called the word that we
    pass with SET-OPTIMIZER, that works nicely for the 2DUP example, but
    how would DOES-BY-COMPILER produce the ADDR that is passed to the xt?

    From the formal point of view, "does>" performs partial application at
    run time. It partially applies the part "X" in "does> X ;" to the
    ADDR, producing a new definition, and replaces the execution semantics
    of the most recent definition with the execution semantics of this new
    definition.

    In the case of "does-by-compiler", this new definition is created by
    means of the passed xt.compiler, and then the execution semantics of the
    most recent definition is replaced by this new definition.

    But it still has to partially apply the xt.compiler to create the full
    optimizer. A possibly more concise definition for "does-by-compiler":

    : does-by-compiler ( xt.compiler -- ) \ xt.compiler ( addr.body -- )
    latest-name> >body swap 2>r ( R: addr.body xt.compiler )
    begin-quot 2r@ execute end-quot latest-name> replace-behavior
    2r> partial1 latest-name> advise-compiler
    ;

    where
    partial1 ( x xt1 -- xt2 )
    \ xt2 is partially applied xt1 to x
    \ This word may use data space.

    latest-name> ( -- xt )
    \ xt is the execution token of the most recently appended
    \ definition in the compilation word list.
    \ An ambiguous condition exists if such a definition is absent.

    advise-compiler ( xt.compiler xt -- )
    \ It makes "compile," perform only xt.compiler
    \ when it's applied to xt. It may use data space.
    \ An ambiguous condition exists if the execution semantics
    \ identified by xt.compiler are distinct from appending
    \ the execution semantics identified by xt to the current definition.

    replace-behavior ( xt.new xt -- )
    \ It makes xt to identify the execution semantics identified
    \ by xt.new. It may use data space.
    \ An ambiguous condition exists if xt is not for a definition
    \ that is created by "create".
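
    For illustration, "partial1" as specified above could be sketched in standard Forth like this. It is only one possible implementation, it must be run in interpretation state, and because it uses ":noname" it changes what the system considers the most recent definition (in Gforth, "latestxt"), so it is not a drop-in for the Gforth PoC above, which uses quotations for exactly that reason. The name "add5" is made up for the usage example.

```forth
\ A possible implementation of the word specified above, via :NONAME.
: partial1 ( x xt1 -- xt2 )
  >r >r :noname          \ R: xt1 x ; start the new definition
  r> postpone literal    \ compile x as a literal
  r> compile,            \ then compile xt1
  postpone ; ;           \ finish, leaving xt2

5 ' + partial1 constant add5
10 add5 execute .        \ prints 15
```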




    --
    Ruvim

  • From Anton Ertl@21:1/5 to Ruvim on Wed Nov 23 22:07:47 2022
    Ruvim <ruvim.pinka@gmail.com> writes:
    On 2022-11-20 16:43, Anton Ertl wrote:
    The usual usage of SET-COMPILER is in defining words, e.g.

    : constant1 ( n "name" -- )
    create ,
    ['] @ set-does>
    [: >body @ ]] literal [[ ;] set-optimizer ;

    To me, "lit," looks far more comprehensible than "]] literal [["


    Here you have the advantage that the constant needs only one cell in
    addition to the header. Yes, you have the disadvantage that the
    SET-DOES> and SET-OPTIMIZER actions might disagree, leading to
    incorrect behaviour. An additional aspect here is that this
    definition assumes that the value of the constant is not changed.

    Could we avoid the redundancy and the potential disagreement? You
    suggest creating a colon definition for "name". How could this work?
    We have to store N somewhere. What I can come up with is:

    : lit, postpone literal ;
    : constant2 ( n "name" -- )
    >r :noname r> ]] drop literal lit, ; [[ >r
    : 0 r@ execute postpone ; r> set-optimizer ;

    The definition

    5 constant1 five1

    takes 6 cells (on a 64-bit machine) in the dictionary, while

    5 constant2 five2

    takes 16 cells in the dictionary plus 146 Bytes of native code with
    the debugging engine on AMD64.



    The code is larger since create-does in Gforth avoids duplication of some
    code parts (i.e., it utilizes one instance for many definitions). And
    since anonymous definitions are too heavy in Gforth. For example,
    ":noname ;" takes 24 bytes (3 cells) in Gforth, 3 bytes (3/4 cells) in
    SwiftForth 3.11.6, and 1 byte (1/4 cells) in SP-Forth/4.

    So let's see how the size of such a constant would be in SwiftForth
    4.0.0-RC52 (64-bit):

    defer thunk ok
    here :noname drop 5 postpone literal ; is thunk ok
    : five2 [ 0 thunk ] ; ok
    here swap - . \ 56 ok

    see thunk
    44CBA0 402637 ( (DEFER) ) CALL E8925AFBFF

    thunk +F
    44CBAF 5 # EBX MOV BB05000000
    44CBB4 40C27A ( LITERAL ) JMP E9C1F6FBFF ok
    see five2
    44CBD2 -8 [RBP] RBP LEA 488D6DF8
    44CBD6 RBX 0 [RBP] MOV 48895D00
    44CBDA 5 # RBX MOV 48BB0500000000000000
    44CBE4 RET C3 ok

    So despite the heavy definitions of Gforth, the Gforth FIVE1 is
    smaller than the SwiftForth FIVE2. How small would a SwiftForth FIVE1
    be?

    here 5 constant five1 here swap - . \ 38 ok

    see five1
    44CBA0 402528 ( (CONSTANT) ) CALL E88359FBFF
    44CBA5 5 ok

    Having proper tools, it should not be more difficult.

    The compiler in "constant1":
    [: >body @ ]] literal [[ ;] ( xt )

    The compiler in "constant2":
    >r :noname r> ]] drop literal lit, ; [[ ( xt )


    They can be expressed far simpler as following.

    In "constant1":
    [: >body @ lit, ;]

    In "constant2":
    ['] lit, partial1

    Yes, I can use closures rather than :noname for plugging the constant
    in. Closures only consist of the stored data plus two cells of
    metadata; and they are also much nicer to write. So we get:

    : lit, postpone literal ;
    : constant2 ( n "name" -- )
    [n:d nip lit, ;] >r
    : 0 r@ execute postpone ; r> set-optimizer ;

    The whole part after the closure should be the same for every such
    defining word (but use the proper xt instead of 0), so yes, this could
    be much smaller in source code, something like:

    : lit, postpone literal ;
    : constant2 ( n "name" -- )
    [n:d nip lit, ;] define-by-optimizer ;

    Concerning the executable code, the need for a colon definition for
    every defined word is still a disadvantage of this approach. For
    gforth (the debugging engine) on AMD64 I see 11 cells and 47 Bytes of
    native code for five2.

    BTW, I don't see why xt should be passed to a compiler (as it's done in
    "set-compiler"). In what cases it's useful?

    It's useful for getting the value of the constant in CONSTANT1.

    As I can see, what is actually needed in this case is not an xt but a
    data field address.

    Do we have an example when an xt itself is needed?

    : general-compile, ( xt -- )
    postpone literal postpone execute ;

    This is the default for the COMPILE, method. It is used whenever no
    more specific COMPILE, implementation is installed with SET-OPTIMIZER.
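
    As a rough illustration of that default (a sketch in standard Forth,
    not Gforth's actual source; SQ, NINE1 and NINE2 are made-up names):
    compiling an xt with GENERAL-COMPILE, should give a word that behaves
    the same as one built with the system's own COMPILE, .

```forth
: general-compile, ( xt -- )
  postpone literal postpone execute ;  \ append "xt EXECUTE" to the current definition

: sq ( n -- n*n ) dup * ;              \ hypothetical example word

: nine1 ( -- n ) 3 [ ' sq compile, ] ;          \ the system's COMPILE,
: nine2 ( -- n ) 3 [ ' sq general-compile, ] ;  \ the generic fallback

nine1 . nine2 .  \ both print 9
```

    The generic version costs a literal plus an EXECUTE at run time, which
    is the overhead a more specific COMPILE, implementation can remove.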

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: https://euro.theforth.net

  • From albert@21:1/5 to ruvim.pinka@gmail.com on Thu Nov 24 11:08:45 2022
    In article <tlll39$cu1p$1@dont-email.me>, Ruvim <ruvim.pinka@gmail.com> wrote:
    On 2022-11-20 16:43, Anton Ertl wrote:
    Ruvim <ruvim.pinka@gmail.com> writes:
    On 2022-10-01 07:06, Anton Ertl wrote:

    SET-OPTIMIZER sets the implementation of COMPILE, ( xt -- ) for
    the current word. A correct implementation of COMPILE, does not
    change the semantics in any way, only the implementation of the
    semantics.

    It seems, "set-optimizer" as a basis for such an API is suboptimal,
    since you have to describe the same semantics *twice*, and you have a
    chance to do it incorrectly.

    Yes.

    Actually, if we have a definition that compiles some behavior, the
    definition that performs this behavior can be created automatically.

    Yes, but ... [see below].

    I mean, if we have a definition:

    : compile-foo ( -- ) ... ;

    that appends behavior "foo" to the current definition,
    then a word "foo" can be defined as

    : foo [ compile-foo ] ;

    Actually it's

    : compile-foo ( xt -- ) ... ;
    : foo recursive [ ' foo compile-foo ] ;

    If COMPILE-FOO just drops the xt, you can just pass 0 to COMPILE-FOO.
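
    A minimal standard-Forth sketch of this pattern, using 2DUP as the
    behaviour. COMPILE-2DUP and MY-2DUP are hypothetical names; since the
    compiler ignores its xt, 0 is passed in its place:

```forth
: compile-2dup ( xt -- )
  drop postpone over postpone over ;  \ ignore the xt, append OVER OVER

: my-2dup ( a b -- a b a b )
  [ 0 compile-2dup ] ;                \ the body is produced by the compiler word

1 2 my-2dup . . . .  \ prints 2 1 2 1
```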

    Then, why do we need to define both "foo" and "compile-foo" by hands?
    Having one of them, another can be created automatically.

    The usual usage of SET-COMPILER is in defining words, e.g.

    : constant1 ( n "name" -- )
    create ,
    ['] @ set-does>
    [: >body @ ]] literal [[ ;] set-optimizer ;

    To me, "lit," looks far more comprehensible than "]] literal [["


    Here you have the advantage that the constant needs only one cell in
    addition to the header. Yes, you have the disadvantage that the
    SET-DOES> and SET-OPTIMIZER actions might disagree, leading to
    incorrect behaviour. An additional aspect here is that this
    definition assumes that the value of the constant is not changed.
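
    The last point can be made concrete with a sketch that assumes a
    recent Gforth development snapshot (SET-DOES>, SET-OPTIMIZER, [: ;]
    and ]] [[ are Gforth words, not standard Forth); GET-FIVE is a made-up
    name. Changing the body after a use has been compiled is expected to
    make the interpreted and the compiled behaviour disagree:

```forth
: constant1 ( n "name" -- )
  create ,
  ['] @ set-does>
  [: >body @ ]] literal [[ ;] set-optimizer ;

5 constant1 five1
: get-five ( -- n ) five1 ;  \ the optimizer compiles the current value as a literal
6 ' five1 >body !            \ overwrite the stored value afterwards
five1 .                      \ EXECUTE path: fetches from the body
get-five .                   \ compiled path: uses the literal compiled earlier
```

    If the two printed values differ, that is the disagreement between
    the SET-DOES> and SET-OPTIMIZER actions described above.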

    Could we avoid the redundancy and the potential disagreement? You
    suggest creating a colon definition for "name". How could this work?
    We have to store N somewhere. What I can come up with is:

    : lit, postpone literal ;
    : constant2 ( n "name" -- )
    >r :noname r> ]] drop literal lit, ; [[ >r
    : 0 r@ execute postpone ; r> set-optimizer ;

    The definition

    5 constant1 five1

    takes 6 cells (on a 64-bit machine) in the dictionary, while

    5 constant2 five2

    takes 16 cells in the dictionary plus 146 Bytes of native code with
    the debugging engine on AMD64.



    The code is larger since create-does in Gforth avoids duplication of some
    code parts (i.e., it utilizes one instance for many definitions). And
    since anonymous definitions are too heavy in Gforth. For example,
    ":noname ;" takes 24 bytes (3 cells) in Gforth, 3 bytes (3/4 cells) in
    SwiftForth 3.11.6, and 1 byte (1/4 cells) in SP-Forth/4.

    { } takes 5 CELLS (20/40 bytes) in ciforth. Who cares?
    I compile an AHEAD in front of and a THEN after, in order to
    use it in the middle of a definition. In interpret mode this is
    not necessary, but I do it anyway. Who cares?


    For colon definitions this difference should not be so drastic.

    OTOH, if you provide an optimizer that generates longer code instead of
    a definition call, it's probably not a problem that the definition
    itself takes more space.

    Hear hear. An optimiser works best if there is simple code to
    begin with.

    <SNIP>

    --
    Ruvim
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

  • From Anton Ertl@21:1/5 to Anton Ertl on Thu Nov 24 10:32:26 2022
    anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
    Ruvim <ruvim.pinka@gmail.com> writes:
    On 2022-11-20 16:43, Anton Ertl wrote:
    The usual usage of SET-COMPILER is in defining words, e.g.

    : constant1 ( n "name" -- )
    create ,
    ['] @ set-does>
    [: >body @ ]] literal [[ ;] set-optimizer ;
    ...
    : lit, postpone literal ;
    : constant2 ( n "name" -- )
    [n:d nip lit, ;] >r
    : 0 r@ execute postpone ; r> set-optimizer ;

    The whole part after the closure should be the same for every such
    defining word (but use the proper xt instead of 0), so yes, this could
    be much smaller in source code, something like:

    : lit, postpone literal ;
    : constant2 ( n "name" -- )
    [n:d nip lit, ;] define-by-optimizer ;

    Concerning the executable code, the need for a colon definition for
    every defined word is still a disadvantage of this approach. For
    gforth (the debugging engine) on AMD64 I see 11 cells and 47 Bytes of
    native code for five2.

    Some more thoughts in this direction: We can define (tested)

    : constant3 ( n "name" -- )
    create {: n :}
    n [{: n :}d drop n ;] set-does>
    n [{: n :}d drop ]] n [[ ;] set-optimizer ;

    5 constant3 five3
    five3 .
    : foo five3 ;
    see foo
    foo .

    11 cells for five3 (with no native code), and worked on first try.
    That's 5 cells for the CREATEd word, and 3 cells for each closure.

    We have three word headers here, one for "name" and two for the
    closures. We could change the header of "name" to be of a closure

    [{: n :}d n ;]

    and do a variant of the set-optimizer closure that uses the passed xt
    to get to the data for "name" and use that (essentially reusing the
    [{: n :}d part of the other closure). Something like (does not work):

    : constant3 ( n "name" -- )
    create [{: n :}d n ;]... ]] n [[ ;] set-xt&optimizer ;

    As a result, FIVE3 would consume the same 6 cells as FIVE1. Next, how
    can we eliminate the redundancy of specifying separately what happens
    on EXECUTE and what happens at COMPILE,?

    Looking at defining words for words with read-only parameters, the
    usage looks quite systematic:

    : +field3 ( n1 n2 "name" -- )
    over + swap create [{: n :}d n + ;]... ]] n + [[ ;] set-xt&optimizer ;

    : fconstant3 ( r "name" -- )
    create [{: f: r :}d r ;]... ]] r [[ ;] set-xt&optimizer ;

    So one might think that we can have something like

    : fconstant3 ( r "name" -- )
    create [{: f: r :}d [ "r" gen-xt&optimizer ] ;

    and GEN-XT&OPTIMIZER repeats its parameter at the appropriate places,
    generates the code for the rest of the double-closure, and calls SET-XT&OPTIMIZER.

    For words with changeable data (e.g., 2VALUE), we could use the same
    approach by treating the address as read-only:

    : 2value4 ( n1 n2 "name" -- )
    here >r align 2, r> create [{: a :}d [ "a @" gen-xt&optimizer ] ;

    However, this needs an extra cell for keeping the address, and TO
    would have to find the data by going through the address. A more
    appropriate way would be to start with

    : 2value3 ( n1 n2 "name" -- )
    create 2,
    [: 2@ ;] set-does>
    [: >body ]] literal 2@ [[ ;] set-optimizer ;

    I guess all defining words for changeable data can be implemented with
    this scheme, so we might have something like GEN-XT&OPT-WRITABLE,
    where we could define 2VALUE3 as:

    : 2value3 ( n1 n2 "name" -- )
    create 2,
    [: [ "2@" gen-xt&opt-writable ] ;

    And maybe we can avoid the redundant code fragments occurring in
    practice with just these two words.

    However, we don't have so many potential uses of SET-XT&OPTIMIZER,
    GEN-XT&OPTIMIZER and GEN-XT&OPT-WRITABLE in Gforth that I would
    expect implementing such words to ever pay off. There are only
    17 occurrences of SET-OPTIMIZER in the Gforth image, not all of them
    fit the bill (e.g., the use in FORWARD), and bugs stemming from this
    redundancy have not been a problem yet.

    Having several closures with shared data might be more generally
    useful, though.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: https://euro.theforth.net

  • From albert@21:1/5 to Anton Ertl on Mon Nov 28 15:17:20 2022
    In article <2022Nov23.130243@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    albert@cherry.(none) (albert) writes:
    COMPILE, is not even a peephole optimizer; it just compiles a single
    word.

    You almost got me! I thought it has something to do with optimisation.

    What makes you think it does not?

    SET-OPTIMIZER must not be used for changing the behaviour of
    COMPILE,ing the xt (the meaning of COMPILE, is fixed), so the only
    correct use is to change the implementation; the primary use is for
    improving the generated code (i.e., optimization). A secondary
    potential use is instrumentation, but we have not used it for that
    yet.

    Let's see what happens if we use the most general COMPILE,
    implementation instead of the ones installed with SET-OPTIMIZER:

    sieve bubble matrix fib fft numbers on a 4GHz Skylake
    0.078 0.109 0.044 0.068 0.025 gforth-fast with SET-OPTIMIZER (default)
    0.181 0.219 0.138 0.274 0.091 gforth-fast without SET-OPTIMIZER
    0.144 0.213 0.100 0.201 0.069 gforth-itc with SET-OPTIMIZER
    0.152 0.237 0.102 0.228 0.071 gforth-itc without SET-OPTIMIZER (default)

    My Ubuntu installs gforth 0.7.3.
    It helps if you mention the results with that version for comparison,
    to give an impression of the progress you have made with optimisation.
    (And we can see the benefit if the gforth team pushes a newer
    version to Debian.)
    <SNIP>
    - anton
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

  • From Anton Ertl@21:1/5 to albert@cherry. on Tue Nov 29 08:12:10 2022
    albert@cherry.(none) (albert) writes:
    In article <2022Nov23.130243@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    Let's see what happens if we use the most general COMPILE,
    implementation instead of the ones installed with SET-OPTIMIZER:

    sieve bubble matrix fib fft numbers on a 4GHz Skylake
    0.078 0.109 0.044 0.068 0.025 gforth-fast with SET-OPTIMIZER (default)
    0.181 0.219 0.138 0.274 0.091 gforth-fast without SET-OPTIMIZER
    0.144 0.213 0.100 0.201 0.069 gforth-itc with SET-OPTIMIZER
    0.152 0.237 0.102 0.228 0.071 gforth-itc without SET-OPTIMIZER (default)

    My Ubuntu installs gforth 0.7.3.
    It helps if you mention the results with that version for comparison,

    I don't have Ubuntu on that machine, but for comparison here are the
    Debian 11 distribution of gforth-0.7.3 (first line) and current
    gforth-fast invoked in the Debian-default way (second line).

    sieve bubble matrix fib fft numbers on a 4GHz Skylake
    0.104 0.144 0.064 0.146 gforth-fast 0.7.3 from Debian 11
    0.098 0.125 0.067 0.121 0.042 gforth-fast --no-dynamic with SET-OPTIMIZER
    0.078 0.109 0.044 0.068 0.025 gforth-fast with SET-OPTIMIZER (default)
    0.181 0.219 0.138 0.274 0.091 gforth-fast without SET-OPTIMIZER
    0.144 0.213 0.100 0.201 0.069 gforth-itc with SET-OPTIMIZER
    0.152 0.237 0.102 0.228 0.071 gforth-itc without SET-OPTIMIZER (default)
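
    To put the gforth-fast rows in proportion, the ratio of the time
    without SET-OPTIMIZER to the time with it can be computed directly
    from the table (plain arithmetic on the numbers above; SPEEDUP. is a
    made-up helper and needs the floating-point word set):

```forth
\ speed-up factor = time without SET-OPTIMIZER / time with SET-OPTIMIZER
: speedup. ( F: t-without t-with -- ) f/ f. ;

0.181e 0.078e speedup.  \ sieve
0.219e 0.109e speedup.  \ bubble
0.138e 0.044e speedup.  \ matrix
0.274e 0.068e speedup.  \ fib
0.091e 0.025e speedup.  \ fft
```

    That works out to roughly a factor of 2 for bubble and about 4 for
    fib, while the corresponding gforth-itc rows differ far less.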

    to give an impression of the progress you have made with optimisation.

    The difference between 0.7.3 and current is not primarily in code
    generation (there the big step was from 0.5 to 0.6), and the code
    generation differences are not just on the COMPILE, level.
    Nevertheless, let's look at the difference in fib between the first,
    second, and third line:

    Debian 0.7.3        current             current
    (no-dynamic)        no-dynamic          dynamic
    dup                 dup                 dup        1->1
    lit                 lit                 lit        1->1
    <2>                 #2                  #2
    <                   < ?branch           < ?branch  1->1
    ?branch
    <140135227933968>   <fib+$58>           <fib+$58>
    drop                drop                drop       1->0
    lit                 lit                 lit        0->1
    <1>                 #1                  #1
    branch              branch              branch     1->1
    <140135227934056>   <fib+$A8>           <fib+$A8>
    dup                 dup                 dup        1->1
    1-                  1-                  1-         1->1
    call                call                call       1->1
    <fib>               fib                 fib
    swap                swap                swap       1->1
    lit                 lit+                lit+       1->1
    <2>                 #-2                 #-2
    -
    call                call                call       1->1
    <fib>               fib                 fib
    +                   +                   +          1->1
    ;s ok               ;s                  ;s         1->1

    We see here that current has a static superinstruction for < ?BRANCH
    (possible in 0.7 and IIRC 0.6, but the superinstruction was not
    there).

    We also see that "2 -" is compiled in current into lit+ (with the
    operand -2); this is achieved using SET-OPTIMIZER.

    And we see the static stack caching states in the dynamic output; on
    AMD64 it mostly stays in the default state 1 of having one stack item
    in a register, but the sequence DROP LIT is optimized with using state
    0 (no stack item in a register) in between them; that was already
    possible in 0.7, but was and is not possible with --no-dynamic, and
    you only see it nicely with the current SEE-CODE.

    (And we can see the benefit if the gforth team pushes a newer
    version to Debian.)

    We don't push, Debian pulls. Given that the main thing missing from
    Gforth-1.0 is to update the documentation to the changes, and given
    that Debian does not deliver the documentation, they could just pull a
    current snapshot.

    Anyway, Debian has been maiming Gforth not just by not delivering the documentation, but also by making --no-dynamic the default, which
    disables a number of Gforth optimizations below the COMPILE, level.
    To show what you can expect from Debian/Ubuntu, I also present the
    numbers for gforth-fast --no-dynamic.

    So you can see the difference between the first and second line as
    indication of what improvement you can expect from the Debian
    installation of Gforth-1.0, and the difference between the second and
    third line of what you can expect from making your own installation of Gforth-1.0 (plus, you get the documentation). Note that the
    improvements from dynamic code generation tend to be larger for bigger programs; for smaller programs the indirect branch predictors of
    modern CPUs work very well, for larger programs a little worse.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: https://euro.theforth.net

  • From albert@21:1/5 to Anton Ertl on Tue Nov 29 11:32:46 2022
    In article <2022Nov29.091210@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    <SNIP>
    Anyway, Debian has been maiming Gforth not just by not delivering the
    documentation, but also by making --no-dynamic the default, which
    disables a number of Gforth optimizations below the COMPILE, level.
    To show what you can expect from Debian/Ubuntu, I also present the
    numbers for gforth-fast --no-dynamic.

    Yeah. The free software lawyers at Debian have decided that
    the info docs as supplied for Gforth (and many more free
    programs) do not comply with their "freedom" standards.




    So you can see the difference between the first and second line as
    indication of what improvement you can expect from the Debian
    installation of Gforth-1.0, and the difference between the second and
    third line of what you can expect from making your own installation of
    Gforth-1.0 (plus, you get the documentation). Note that the
    improvements from dynamic code generation tend to be larger for bigger
    programs; for smaller programs the indirect branch predictors of
    modern CPUs work very well, for larger programs a little worse.

    Thanks, saved for further study.


    - anton
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

  • From Ruvim@21:1/5 to Anton Ertl on Tue Nov 29 12:12:16 2022
    On 2022-11-23 22:07, Anton Ertl wrote:
    Ruvim <ruvim.pinka@gmail.com> writes:
    [...]
    BTW, I don't see why xt should be passed to a compiler (as it's done in
    "set-compiler"). In what cases is it useful?

    It's useful for getting the value of the constant in CONSTANT1.

    As I can see, what is actually needed in this case is not an xt but a
    data field address.

    Do we have an example when an xt itself is needed?

    : general-compile, ( xt -- )
    postpone literal postpone execute ;

    This is the default for the COMPILE, method. It is used whenever no
    more specific COMPILE, implementation is installed with SET-OPTIMIZER.


    Well, "compile," can pass xt to the default general method.

    But do we have an example when an xt itself is useful for the compiler
    that is set via "set-optimizer"?

    The mentioned optimizer for "pick" (from another thread) actually does
    not require an xt argument, since it knows beforehand that it can only
    be the xt of "pick".


    --
    Ruvim

  • From Anton Ertl@21:1/5 to Ruvim on Tue Nov 29 22:08:58 2022
    Ruvim <ruvim.pinka@gmail.com> writes:
    But do we have an example when an xt itself is useful for the compiler
    that is set via "set-optimizer"?

    : ;abi-code, ['] ;abi-code-exec peephole-compile, , ;
    : does, ( xt -- ) does-check ['] does-xt peephole-compile, , ;

    : fold-constants {: xt m xt: pop xt: unpop xt: push -- :}
    \ compiles xt with constant folding: xt ( m*n -- l*n ).
    \ xt-pop pops m items from literal stack to data stack, xt-push
    \ pushes l items from data stack to literal stack.
    lits# m u>= if
    pop xt catch 0= if
    push rdrop exit then
    unpop then
    xt dup >code-address docol: = if
    :,
    else
    peephole-compile,
    then ;

    : fold2-1 ( xt -- ) 2 ['] 2lits> ['] >2lits ['] >lits fold-constants ;
    ' fold2-1 folds * and or xor
    ' fold2-1 folds min max umin umax
    ' fold2-1 folds nip
    ' fold2-1 folds rshift lshift arshift rol ror
    ' fold2-1 folds = > >= < <= u> u>= u< u<=
    ' fold2-1 folds d0> d0< d0=
    ' fold2-1 folds /s mods

    and similar for FOLD1-1 FOLD1-2 FOLD2-0 FOLD2-2 FOLD2-3 FOLD3-1
    FOLD3-3 FOLD4-1 FOLD4-2. And while FOLD1-0 and FOLD4-4 only have one
    client at the moment, this could change, so why make it specific to
    that client?

    \ optimize +loop (not quite folding)
    : replace-(+loop) ( xt1 -- xt2 )
    case
    ['] (+loop) of ['] (/loop)# endof
    ['] (+loop)-lp+!# of ['] (/loop)#-lp+!# endof
    -21 throw
    endcase ;

    : (+loop)-optimizer ( xt -- )
    lits# 1 u>= if
    lits> dup 0> if
    swap replace-(+loop) peephole-compile, , exit then
    lits then
    peephole-compile, ;

    ' (+loop)-optimizer optimizes (+loop)
    ' (+loop)-optimizer optimizes (+loop)-lp+!#

    : opt+- {: xt: op -- :}
    lits# 1 = if
    0 lits> op ?dup-if
    ['] lit+ peephole-compile, , then
    exit then
    action-of op fold2-1 ;
    ' opt+- folds + -

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: https://euro.theforth.net

  • From Bernd Paysan@21:1/5 to All on Mon Dec 12 23:53:11 2022
    On Tue, 29 Nov 2022 11:32:46 +0100, albert wrote:
    Yeah. The free software lawyers at Debian have decided that the info
    docs as supplied for Gforth (and many more free programs) do not comply
    with their "freedom" standards.

    Actually, they didn't. They passed
    https://www.debian.org/vote/2006/vote_001
    “GFDL-licensed works without unmodifiable sections are free”.
    Gforth's documentation has no invariant section, so it is free.

    It's just plain and simple idiocy. Nothing we can fix (we will mention
    that the documentation has no invariant section in the next release
    notes, though). Well, we do absolutely everything to make a Debian
    maintainer's life as easy as possible with the current development
    system. We even maintain our own Debian distribution.

    --
    Bernd Paysan
    "If you want it done right, you have to do it yourself"
    net2o id: kQusJzA;7*?t=uy@X}1GWr!+0qqp_Cn176t4(dQ*
    https://bernd-paysan.de/

  • From albert@21:1/5 to bernd@net2o.de on Tue Dec 13 10:45:11 2022
    In article <tn8et7$2afti$1@dont-email.me>,
    Bernd Paysan <bernd@net2o.de> wrote:
    On Tue, 29 Nov 2022 11:32:46 +0100, albert wrote:
    Yeah. The free software lawyers at Debian have decided that the info
    docs as supplied for Gforth (and many more free programs) do not comply
    with their "freedom" standards.

    Actually, they didn't. They passed
    https://www.debian.org/vote/2006/vote_001
    “GFDL-licensed works without unmodifiable sections are free”.
    Gforth's documentation has no invariant section, so it is free.

    I didn't know that. I thought they use the unmodifiable sections
    as an excuse. A poor excuse, because they should then put this
    invaluable documentation in the non-free section ("non-free" between
    scare quotes).


    It's just plain and simple idiocy. Nothing we can fix (we will mention
    that the documentation has no invariant section in the next release
    notes, though). Well, we do absolutely everything to make a Debian
    maintainer's life as easy as possible with the current development
    system. We even maintain our own Debian distribution.

    You are not alone in having issues with Debian maintainers.
    I've spent a couple of years arriving at a .deb archive for ciforth
    that complies with all their rules. No one is willing to sponsor it
    (that is, to look at it and put it in a distribution).
    Likely candidates who sponsor IMHO crappy Forths didn't bother
    to answer.
    I generated i86 ciforth in Debian format (.deb) and just
    distribute them myself.
    It is actually much easier to create a .deb than to abide by
    their zillions of rules and use their tools.

    Fun fact: an rpm distribution of the AMD version appeared
    spontaneously, without any effort on my part.

    It is a pity that there is no official, newer gforth version that is
    spread more widely in distributions.

    --
    Bernd Paysan

    Groetjes Albert
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
