• Re: GPIO access with GForth on Raspberry Pi

    From Zbig@21:1/5 to All on Sun Jul 24 14:10:31 2022
    See also: https://news.ycombinator.com/item?id=29564458

    „Wasted many hours of my young life trying to figure out why my
    self-modifying assembler program worked perfect in the debugger
    but not without it.”

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Zbig@21:1/5 to All on Sun Jul 24 14:07:25 2022
    >> Since that part is described as "self modifying code" the conclusion is, that
    >> the self-modification doesn't go exactly as well in different circumstances.
    > That conclusion is correct. The bug is not in mina, but in the msdos emulator
    > that not allows this modification.

    Well, maybe it cannot be called „a bug” in the strict sense, but it is still
    unreliable code:

    „For Intel486(TM) processors, a write to an instruction in the cache will modify
    it in both the cache and memory, but if the instruction was prefetched before
    the write, the old version of the instruction could be the one executed.”

    ( https://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/pentium4_hh/events4/self-modifying_code_clear.htm )

    So it's confirmed 100%. Creating self-modifying code for the processors higher
    than 386 seems to be risky business, unfortunately.

  • From dxforth@21:1/5 to Zbig on Mon Jul 25 16:42:41 2022
    On 25/07/2022 07:07, Zbig wrote:
    > So it's confirmed 100%. Creating self-modifying code for the processors higher
    > than 386 seems to be risky business, unfortunately.

    But what is the official position? All I could find was:

    IA-32 Intel® Architecture Software Developer’s Manual Volume 3:

    To write self-modifying code and ensure that it is compliant with current
    and future versions of the IA-32 architecture, one of the following two
    coding options must be chosen:

    (* OPTION 1 *)
    Store modified code (as data) into code segment;
    Jump to new code or an intermediate location;
    Execute new code;

    (* OPTION 2 *)
    Store modified code (as data) into code segment;
    Execute a serializing instruction; (* For example, CPUID instruction *)
    Execute new code;

    (The use of one of these options is not required for programs intended
    to run on the Pentium or Intel486 processors, but are recommended to
    insure compatibility with the Pentium 4, Intel Xeon, and P6 family
    processors.)

  • From Zbig@21:1/5 to All on Mon Jul 25 01:50:05 2022
    > But what is the official position? All I could find was:
    >
    > IA-32 Intel® Architecture Software Developer’s Manual Volume 3:

    Not sure about an „official position”; I've found statements from programmers
    who experienced that problem. They confirmed my initial suspicion.

    It seems that self-modifying code:
    - can be considered safe on older processors (and I mean „quite old”)
    - may or may not work as intended on newer processors
    - always slows down the procedure it's applied to

  • From Anton Ertl@21:1/5 to dxforth on Tue Jul 26 09:26:40 2022
    dxforth <dxforth@gmail.com> writes:
    > On 25/07/2022 07:07, Zbig wrote:
    >> So it's confirmed 100%. Creating self-modifying code for the processors higher
    >> than 386 seems to be risky business, unfortunately.

    Risky? You just follow the rules, and it works.

    And that's already with the 8086 and 8088: They have an instruction
    prefetch buffer of 6 and 4 bytes, respectively, so if you change an
    instruction that has already been prefetched, you will not see the
    effect of the change right away. I expect that the prefetch buffers
    grew over the generations (e.g., the 486 would prefetch one cache line
    (16 bytes) at a time). But the prefetching was straight ahead up to
    and including the 486, and when a jump was taken, the prefetching
    started from scratch. So they announced the rule that you had to take
    a jump between the modification and the execution.
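
    As a rough illustration of the jump rule, here is a toy model in Python. It
    is not a cycle-accurate description of any CPU: it assumes the prefetch
    queue is completely full when the store executes, holding a snapshot of the
    bytes that follow it, and that a taken jump empties the queue. The function
    name, its parameters and the worst-case "queue is full" assumption are all
    invented for this sketch.

```python
def executed_byte(memory, store_addr, target_addr, new_value,
                  queue_size, jump_between=False):
    """Return the byte a toy CPU would execute at target_addr.

    Assumption (not from any manual): when the store at store_addr runs,
    the prefetch queue already holds a copy of the next queue_size bytes.
    """
    mem = bytearray(memory)
    first = store_addr + 1
    last = min(first + queue_size, len(mem))
    # Snapshot of what was prefetched before the store took effect.
    queue = {addr: mem[addr] for addr in range(first, last)}

    mem[target_addr] = new_value  # the store changes memory only

    if jump_between:
        queue.clear()  # a taken jump discards everything prefetched

    # A queued (possibly stale) copy wins over memory.
    return queue.get(target_addr, mem[target_addr])


code = bytes(64)  # every byte is 0x00, so the old operand is 0x00
stale = executed_byte(code, 0, 10, 0x21, queue_size=32)
fresh_after_jump = executed_byte(code, 0, 10, 0x21, queue_size=32,
                                 jump_between=True)
fresh_far_away = executed_byte(code, 0, 41, 0x21, queue_size=32)
fresh_small_queue = executed_byte(code, 0, 10, 0x21, queue_size=6)
```

    In this model a 6-byte queue never sees the stale operand at offset 10,
    while a 32-byte queue does, which hints at why code that happened to work
    on older parts could misbehave on a CPU with a deeper queue.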

    The Pentium added branch prediction and an instruction cache that is
    separate from the data cache, both of which mean that the jump rule
    would no longer work without extra hardware effort. Because backwards
    compatibility is so important, they spent the extra hardware effort,
    so the jump rule is still sufficient. I suspect that Pentium and
    later CPUs are actually closer to the ideal of self-modifying code as
    you would imagine it working from the specifications of the
    instructions than the earlier CPUs, but I have not tested this.

    > IA-32 Intel® Architecture Software Developer’s Manual Volume 3:
    >
    > To write self-modifying code and ensure that it is compliant with current
    > and future versions of the IA-32 architecture, one of the following two
    > coding options must be chosen:
    >
    > (* OPTION 1 *)
    > Store modified code (as data) into code segment;
    > Jump to new code or an intermediate location;
    > Execute new code;

    That's the rule I stated above.

    > (* OPTION 2 *)
    > Store modified code (as data) into code segment;
    > Execute a serializing instruction; (* For example, CPUID instruction *)
    > Execute new code;

    Note that CPUID only exists on the Pentium and later CPUs. I wonder
    what that rule is about? Who would use a slow serializing instruction
    (typically 10+ cycles) instead of a fast jump and make their program
    less portable at the same time?

    > (The use of one of these options is not required for programs intended
    > to run on the Pentium or Intel486 processors, but are recommended to
    > insure compatibility with the Pentium 4, Intel Xeon, and P6 family
    > processors.)

    I am absolutely certain that the jump rule is needed for the 486 and
    earlier CPUs. I don't know if it's needed for the more recent CPUs
    mentioned above.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

  • From Zbig@21:1/5 to All on Tue Jul 26 03:56:23 2022
    >> So it's confirmed 100%. Creating self-modifying code for the processors higher
    >> than 386 seems to be risky business, unfortunately.
    > Risky? You just follow the rules, and it works.

    Oh, I don't know. Most of the time it works, but then one finds
    the case where it doesn't, and then there are controversies: „no,
    it's not the code; it's a bad emulator” or the like.

    From what I see, the technique is better avoided unless it is
    „really really” needed for some particular reason.

  • From dxforth@21:1/5 to Zbig on Tue Jul 26 22:05:58 2022
    On 26/07/2022 20:56, Zbig wrote:
    >>> So it's confirmed 100%. Creating self-modifying code for the processors higher
    >>> than 386 seems to be risky business, unfortunately.
    >> Risky? You just follow the rules, and it works.
    >
    > Oh, I don't know. Most of the time it works - but then one finds
    > the case when it doesn't, and then there are controversies: „no,
    > it's not the code; it's bad emulator” or the like.

    What steps did you take to rule out MINA as the cause of your error?
    Does it happen with DX-Forth? I used self-modifying code which you
    can test with:

    s" foobar$" drop 'DX ! 9 doscall

    > From what I see the technique is better to be avoided, unless it is
    > „really really” needed for some particular reason.

    Such as when there's no single 'INT' instruction.

  • From dxforth@21:1/5 to Anton Ertl on Tue Jul 26 22:37:50 2022
    On 26/07/2022 19:26, Anton Ertl wrote:
    ...
    > I am absolutely certain that the jump rule is needed for the 486 and
    > earlier CPUs. I don't know if it's needed for the more recent CPUs
    > mentioned above.

    I've no experience but assuming so I would at least expect to see it
    stated in the docs for the 486. At the time the 486 was created, DOS
    programs were still widely used and for Intel to release a CPU that
    was incompatible would be 'A courageous decision, Minister'.

  • From Zbig@21:1/5 to All on Tue Jul 26 05:24:22 2022
    > What steps did you take to rule out MINA as the cause of your error?

    But I wrote everything in this thread already. What more can I say?

    > Does it happen with DX-Forth? I used self-modifying code which you
    > can test with:
    >
    > s" foobar$" drop 'DX ! 9 doscall

    No, I haven't had such a problem with DXForth. But, considering that Mina
    example, can we be sure it won't happen in the future, and on every
    possible x86 clone?

    >> From what I see the technique is better to be avoided, unless it is
    >> „really really” needed for some particular reason.
    > Such as when there's no single 'INT' instruction.

    But is it really a situation that begs for self-modifying code?
    - one doesn't usually use that many different DOS/BIOS software interrupts
    - these interrupts usually require a variable number of arguments anyway
    - having quite a nice interface to ML, like exactly in DXForth, allows one to
      request an interrupt in an elegant way without resorting to self-modifying code

  • From dxforth@21:1/5 to Zbig on Wed Jul 27 00:10:49 2022
    On 18/07/2022 12:19, Zbig wrote:
    [...]
    >> As DX-Forth uses the same scheme (self-modify an 'INT 0') it should also
    >> fail assuming that's the cause. To rule out MINA, I would patch INT 00h to
    >> INT 66h (unused interrupt). AFAIK 'Runtime error 200 at' is a Borland code
    >> and message.
    >
    > So I did it, changing "INT 00" to "INT 66". While with "INT 00" it spits out
    > "Divide error" under debug (but then I'm able to continue), in case of "INT 66"
    > the machine is hung up.
    > Of course DXForth works as usual (no problems).

    It appears INT 66 needs to be initialized first, hence the crash. I got the
    same when I called it directly using $66 INTCALL.

    From the above it certainly looks like MINA isn't modifying the 'INT 00' instr
    - but why if the DX-Forth DOSCALL test I posted works? I'll compare the codes
    to see if I can spot something. It doesn't make sense that SMC should work on
    DX-Forth but not on MINA when run on the same machine/setup.

  • From Zbig@21:1/5 to All on Tue Jul 26 07:42:15 2022
    OK, so I ran the test you posted and yes, it works like this:
    s" foobar$" drop 'DX ! 9 doscall foobar ok

  • From Zbig@21:1/5 to All on Tue Jul 26 07:38:49 2022
    > From the above it certainly looks like MINA isn't modifying the 'INT 00' instr
    > - but why if the DX-Forth DOSCALL test I posted works? I'll compare the codes
    > to see if I can spot something. It doesn't make sense that SMC should work on
    > DX-Forth but not on MINA when run on the same machine/setup.

    I DIDN'T use the test you posted. You asked whether I had had any such
    problems with DXForth, so I responded that I hadn't had any so far.

  • From dxforth@21:1/5 to Zbig on Wed Jul 27 01:49:04 2022
    On 27/07/2022 00:42, Zbig wrote:
    > OK, so I made the test you've posted and yes, it works like this:
    > s" foobar$" drop 'DX ! 9 doscall foobar ok

    So it successfully modified INT 00h to INT 21h and performed DOS Fn9 -
    'Write String to STDout'.
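
    To make the mechanism concrete: an INT n instruction is encoded as the
    opcode byte CDh followed by the interrupt number, so turning INT 00h into
    INT 21h means overwriting a single byte. DOS function 9 then prints the
    string at DS:DX up to, but not including, a '$'. The sketch below only
    manipulates bytes in Python; the helper names are invented for illustration
    and nothing here is DX-Forth's actual code.

```python
def patch_int(code, interrupt):
    """Overwrite the operand byte of an 'INT n' instruction (CD nn)."""
    if code[0] != 0xCD:
        raise ValueError("not an INT imm8 instruction")
    patched = bytearray(code)
    patched[1] = interrupt & 0xFF
    return bytes(patched)


def dos_fn9_text(buffer):
    """Return what DOS function 9 would print: the bytes before the first '$'."""
    end = buffer.index(b"$")
    return buffer[:end].decode("ascii")


patched = patch_int(b"\xCD\x00", 0x21)  # INT 00h becomes INT 21h
printed = dos_fn9_text(b"foobar$")
```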

    Try this:

    : test begin $30 doscall key? until key drop ;

    It proves nothing if it works but a fail would be interesting.

    You are using FreeDOS - can you boot a genuine MS-DOS?

  • From Zbig@21:1/5 to All on Tue Jul 26 09:03:23 2022
    > : test begin $30 doscall key? until key drop ;
    >
    > It proves nothing if it works but a fail would be interesting.

    It seems to be working OK. I waited some time and ended the
    loop with a keypress.

    > You are using FreeDOS - can you boot a genuine MS-DOS?

    Not on that particular 486; too much hassle with such a „transition”.
    But honestly: reliable code should work on either one, not „only for
    MS-DOS (tm)”. FreeDOS isn't an emulator, and I already wrote that
    under an emulator (which uses FreeDOS files, BTW) Mina, rather
    surprisingly in such a situation, works correctly.

  • From dxforth@21:1/5 to Zbig on Wed Jul 27 13:56:21 2022
    On 27/07/2022 02:03, Zbig wrote:
    > Not on that particular 486 - too much hassle with such „transition”.
    > But honestly: reliable code should work on either one, not „only for
    > MS-DOS (tm)". FreeDOS isn't an emulator - and I already wrote, that
    > exactly under emulator (which uses FreeDOS files, BTW) Mina - rather
    > surprisingly, in such situation - works correctly?

    Running FreeDOS under an emulator is introducing another variable -
    not ruling out FreeDOS. If the claim is MINA's code is unreliable
    then one has to prove it. Since only you can reproduce the fault
    it's left to you to pin-point where the defect is.

    I compared MINA BIOSO vs. DX-Forth INTCALL. Not a lot of difference
    other than the latter's code is larger as it can use any 8086 register
    as parameter.

    There are 23 instructions in DX-Forth between modifying and executing
    INT 00, vs. 8 instructions for MINA. Neither uses a JMP between modify
    and execute as suggested by Intel for later generation cpu's.

    DX-Forth has extra code to handle a MS-DOS 2 bug:

    mov cs:fssav,sp
    intc1: int 0 ; NOTE: self-modifying code
    cli
    mov ss,cs:cseg1 ; restore SS:SP
    mov sp,cs:fssav ; for DOS 2.x
    sti

    I don't understand the purpose of the XCHG instr's in MINA:

    XCHG SI,AX ; Save AX in (already free) SI

    XCHG SI,AX
    RQBIOS: INT(0) ; Request number to be overwritten.
    PUSHF ; Save status into DI
    POP DI
    XCHG SI,AX ; Save AX in (still free) SI

    XCHG SI,AX

    If you are able to re-assemble MINA, try inserting a JMP per Intel:

    MOV BYTE [RQBIOS+1],AL ; Patch the code.
    JMP XXX
    XXX: POP DX

  • From dxforth@21:1/5 to dxforth on Wed Jul 27 14:58:32 2022
    On 26/07/2022 22:37, dxforth wrote:
    > I've no experience but assuming so I would at least expect to see it
    > stated in the docs for the 486. At the time the 486 was created, DOS
    > programs were still widely used and for Intel to release a CPU that
    > was incompatible would be 'A courageous decision, Minister'.

    Well, it appears Intel did. From the 486DX2 manual:

    3. The prefetch queue has been increased from 16
    bytes to 32 bytes. A jump always needs to execute
    after modifying code to guarantee correct execution
    of the new instruction.

    http://www.s100computers.com/My%20System%20Pages/80486%20Board/Intel486_DX2_Microprocessor_Data_Book_Jul92.pdf

  • From dxforth@21:1/5 to dxforth on Wed Jul 27 15:22:18 2022
    On 27/07/2022 13:56, dxforth wrote:

    > If you are able to re-assemble MINA, try inserting a JMP per Intel:
    >
    > MOV BYTE [RQBIOS+1],AL ; Patch the code.
    > JMP XXX
    > XXX: POP DX

    p.s. Change that so the interval between modify and execute exceeds 32 bytes.

  • From Zbig@21:1/5 to All on Wed Jul 27 06:28:32 2022
    > p.s. Change that so the interval between modify and execute exceeds 32 bytes.

    Indeed, after inserting 32 NOPs, which along with the following few instructions
    gave a little more than 32 bytes, it started to work. But when I tried to fix
    it with two shorter jumps „back and forth”, no way.
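
    A back-of-the-envelope check of why the padding helped, assuming the eight
    one-byte instructions that sit between the patching MOV and the INT in
    MINA's routine (four POPs, two PUSHes, two XCHG SI,AX) and a 32-byte
    prefetch queue as in the 486DX2 data book quoted in this thread. Where
    exactly the NOPs were placed is an assumption; only the total matters here.

```python
# Encoded sizes of the register forms used here (16-bit x86).
SIZE = {"POP": 1, "PUSH": 1, "XCHG": 1, "NOP": 1}

BETWEEN = ["POP", "POP", "POP", "POP", "PUSH", "PUSH", "XCHG", "XCHG"]
PREFETCH_QUEUE = 32  # bytes, per the 486DX2 figure quoted in the thread


def operand_distance(instructions):
    """Bytes from the end of the patching MOV to the INT operand byte."""
    # +1 steps over the INT opcode (CDh) to reach the patched operand.
    return sum(SIZE[name] for name in instructions) + 1


original = operand_distance(BETWEEN)
padded = operand_distance(["NOP"] * 32 + BETWEEN)

print(original, original < PREFETCH_QUEUE)  # inside the queue's reach
print(padded, padded > PREFETCH_QUEUE)      # beyond it
```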

    I'll simply replace that SMC with „ordinary” code.

  • From dxforth@21:1/5 to Zbig on Thu Jul 28 00:14:14 2022
    On 27/07/2022 23:28, Zbig wrote:
    >> p.s. Change that so the interval between modify and execute exceeds 32 bytes.
    >
    > Indeed after insertion of 32 NOPs - which along with following few instructions
    > gave a little more than 32 bytes - it started to work. But when I was trying to fix
    > that with two shorter jumps „back and forth” - no way.

    Googling I got the impression a JMP to the next instruction should have worked.
    Did you use a 3-byte JMP instr? As both BIOSO and BIOSN self-modify, both need
    patching.

    If the JMP's don't work it would be possible to split the routines and satisfy
    the 32-byte distance requirement without the need for padding.

    > I'll simply replace that SMC with „ordinary” code.

    It rather defeats the purpose. I wouldn't do it to DX-Forth. A programmer is
    expected to solve problems - not run away from them :)

  • From Zbig@21:1/5 to All on Wed Jul 27 12:24:28 2022
    >> Indeed after insertion of 32 NOPs - which along with following few instructions
    >> gave a little more than 32 bytes - it started to work. But when I was trying to fix
    >> that with two shorter jumps „back and forth” - no way.
    > Googling I got the impression a JMP to the next instruction should have worked.
    > Did you use a 3-byte JMP instr? As both BIOSO and BIOSN self-modify, both need
    > patching.

    Indeed, NASM optimized these jumps to 2-byte instructions, so I changed both
    to an explicit 'JMP LONG'. Unfortunately, it didn't change much; mina breaks as
    before.

    > If the JMP's don't work it would be possible to split the routines and satisfy
    > the 32-byte distance requirement without the need for padding.

    Probably. Even very likely.

    >> I'll simply replace that SMC with „ordinary” code.
    > It rather defeats the purpose. I wouldn't do it to DX-Forth. A programmer is
    > expected to solve problems - not run away from them :)

    Actually I'm not sure it is worth the effort.
    From what I see, the most important one is the „DOS dispatcher” INT 21h, plus just
    a few more, like: INT 10, 11, 12, 13, 15, 16, 19, 1A, 20, 33 (h). That makes eleven
    interrupts altogether. For such a handful it's possible to handle the problem
    using a kind of table, being 100% sure nothing will break in case the program is
    run on another, different processor that may behave in a slightly different
    way. If there were a need to use, say, 60 different interrupts or so, well, that would be
    quite a different story. But if there's no need, maybe it would be practical to follow the
    advice from „Thinking Forth”, I mean: „Generality usually involves complexity. Don't
    generalize your solution any more than will be required; instead, keep it changeable”.

    I believe that of course it can be handled, since I fixed it the primitive way by adding NOPs;
    still, there remains an apprehension of the kind: „...until next time” (in different conditions).
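
    One way to read the table idea, sketched in Python rather than assembler:
    build one fixed INT n stub per supported interrupt ahead of time and pick
    one by number at run time, so no instruction is ever rewritten. The stub
    layout (INT n followed by a near RET) and all names are assumptions for
    illustration, not MINA code.

```python
# The eleven interrupt numbers listed above (hex).
SUPPORTED = [0x21, 0x10, 0x11, 0x12, 0x13, 0x15, 0x16, 0x19, 0x1A, 0x20, 0x33]


def make_stub(interrupt):
    """Machine code for 'INT n' followed by a near 'RET'."""
    return bytes([0xCD, interrupt, 0xC3])


# Built once; nothing in this table changes while the program runs.
STUBS = {n: make_stub(n) for n in SUPPORTED}


def stub_for(interrupt):
    """Look up the pre-built stub, refusing numbers outside the table."""
    try:
        return STUBS[interrupt]
    except KeyError:
        raise ValueError(f"INT {interrupt:02X}h is not in the table") from None


dos_stub = stub_for(0x21)
table_bytes = sum(len(stub) for stub in STUBS.values())
```

    At three bytes per entry the whole table costs 33 bytes in this layout,
    which is the trade being weighed: a little space and a fixed list of
    interrupts in exchange for never depending on prefetch behaviour.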

  • From dxforth@21:1/5 to Zbig on Thu Jul 28 11:52:11 2022
    On 28/07/2022 05:24, Zbig wrote:
    >> If the JMP's don't work it would be possible to split the routines and satisfy
    >> the 32-byte distance requirement without the need for padding.
    >
    > Probably. Even very likely.

    This should do it.

    RQBIOS: INT(0) ; Request number to be overwritten.
    ...
    JMP NEXT
    ;
    RQBIOSN: INT(0) ; Request number to be overwritten.
    ...
    JMP NEXT

    ; *************
    ; * BIOSO *
    ; *************
    ;
    N_BIOSO:
    DW 5
    DB "BIOSO"
    BIOSO:
    ...
    MOV BYTE [RQBIOS+1],AL ; Patch the code.
    ...
    XCHG SI,AX
    JMP RQBIOS

    ; *************
    ; * BIOSN *
    ; *************
    ;
    N_BIOSN:
    DW 5
    DB "BIOSN"
    BIOSN:
    ...
    MOV BYTE [RQBIOSN+1],AL ; Patch the code.
    ...
    XCHG SI,AX
    JMP RQBIOSN

    > Actually I'm not sure is it worth the effort.
    > From what I see the most important is „DOS dispatcher” INT 21h, and just
    > a few more, like: INT 10, 11, 12, 13, 15, 16, 19, 1A, 20, 33 (h). That makes
    > eleven interrupts together. So for such handful it's possible to handle the problem
    > using kind of table, being 100% sure nothing will break in case the program will be
    > run on another, even different processor, that maybe will behave some slightly different
    > way. If there was need to use, say, 60 different interrupts or so - well, that would be
    > quite different story. But if there's no need - maybe it would be practical to follow the
    > advice from „Thinking Forth”, I mean: „Generality usually involves complexity. Don't
    > generalize your solution any more than will be required; instead, keep it changeable”.

    Nice try. I'd try a backwards JMP. If beaten, I'd go for 32-byte separation
    and a JMP (see above). As for SMC failing on later CPUs I would have expected
    to see reports of same. I ran MINA on a Pentium without issue. Without evidence
    to the contrary I would treat 486 as a special case. No point crippling software
    because Intel gave birth to one bastard. Similarly the idea SMC is the devil's
    work to be avoided at any cost.

  • From dxforth@21:1/5 to Zbig on Thu Jul 28 19:41:27 2022
    On 27/07/2022 23:28, Zbig wrote:

    > I'll simply replace that SMC with „ordinary” code.

    It's possible to simulate an INT n instruction. How foolproof it is, I don't know.

    pop bx ; INT#
    sub ax,ax ; get vector
    mov es,ax
    shl bx,1
    shl bx,1
    mov ax,es:[bx]
    mov ivec,ax
    mov ax,es:[bx+2]
    mov ivec+2,ax
    ...
    pushf ; execute it
    cli
    call dword ptr [ivec]
    ...


    ivec dw 0,0
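
    The vector lookup in the listing relies on the real-mode interrupt vector
    table: entry n lives at linear address n*4, with the handler's offset in the
    first word and its segment in the second. A small Python sketch of that
    arithmetic follows; the function names and the sample vector value are
    invented for this example.

```python
import struct


def ivt_entry_address(interrupt):
    """Linear address of the 4-byte vector for the given interrupt."""
    return interrupt * 4  # what the two 'shl bx,1' instructions compute


def read_vector(memory, interrupt):
    """Return (segment, offset) stored in the table for this interrupt."""
    offset, segment = struct.unpack_from(
        "<HH", memory, ivt_entry_address(interrupt))
    return segment, offset


def linear_address(segment, offset):
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


# Fake 1 KiB table with a made-up handler for INT 21h at 1234h:5678h.
table = bytearray(1024)
struct.pack_into("<HH", table, ivt_entry_address(0x21), 0x5678, 0x1234)

segment, offset = read_vector(table, 0x21)
handler = linear_address(segment, offset)
```

    The PUSHF and CLI ahead of the far CALL in the listing stand in for what a
    real INT does before entering the handler (push the flags, disable
    interrupts), so that the handler's IRET finds the stack layout it expects.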

  • From Zbig@21:1/5 to All on Thu Jul 28 12:06:48 2022
    > As for SMC failing on later CPUs I would have expected
    > to see reports of same. I ran MINA on a Pentium without issue. Without evidence
    > to the contrary I would treat 486 as a special case. No point crippling software
    > because Intel gave birth to one bastard. Similarly the idea SMC is the devil's
    > work to be avoided at any cost.

    I tried Mina on AthlonXP — and yes, it works (I mean unmodified, original binary).
    The CPU I tried it with:

    processor : 0
    vendor_id : AuthenticAMD
    cpu family : 6
    model : 10
    model name : AMD Athlon(TM) XP 3000+
    stepping : 0
    cpu MHz : 2100.201
    cache size : 512 KB
    physical id : 0
    siblings : 1
    core id : 0
    cpu cores : 1
    apicid : 0
    initial apicid : 0
    fdiv_bug : no
    f00f_bug : no
    coma_bug : no
    fpu : yes
    fpu_exception : yes
    cpuid level : 1
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow cpuid 3dnowprefetch vmmcall
    bugs : fxsave_leak sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
    bogomips : 4202.41
    clflush size : 32
    cache_alignment : 32
    address sizes : 34 bits physical, 32 bits virtual
    power management: ts

    So yes, maybe „one bastard”, but still quite a ubiquitous one (if we mean what is now
    „retro” gear). Really a pity I don't have any AMD 486 for comparison.
    On the other hand, it's good to know that unmodified Mina (or just that short procedure)
    can be used for testing CPUs for potential SMC (un)reliability.

  • From dxforth@21:1/5 to Zbig on Fri Jul 29 12:48:51 2022
    On 29/07/2022 05:06, Zbig wrote:

    > So yes, maybe „one bastard” - still it's the quite ubiquitous one (if we mean that now
    > „retro” gear). Really a pity I don't have any AMD 486 for comparison.
    > On the other hand: it's good to know, that unmodified Mina (or just that short procedure)
    > can be used for testing CPUs regarding potential SMC (un)reliability.

    It may be worth chasing down an AMD 486 manual. An earlier i486 manual
    (not DX2) explains:

    "The prefetch unit is flushed whenever the next instruction needed is not
    in numerical sequence with the previous instruction - for example, during
    jumps, task switches, exceptions, and interrupts."

    So the JMP can be to anywhere *but* the next sequential instruction? That
    may explain why:

    MOV BYTE [RQBIOS+1],AL ; Patch the code.
    JMP XXX
    XXX: POP DX

    didn't work. You may like to try:

    JMP XXX
    NOP
    XXX:

    to see whether it flushes the prefetch queue. Otherwise any backwards JMP should do it.

  • From Zbig@21:1/5 to All on Fri Jul 29 04:38:06 2022
    > Otherwise any backwards JMP should do it.

    Two days ago I changed that code the following way:

    POP AX ; Function code
    ; Once we are more acknowledgeable, put segment overwrite here.
    JMP XXX1 ; senseless jump to make self-modifying code work on 486
    YYY1: MOV BYTE [RQBIOS+1],AL ; Patch the code.
    POP DX
    POP CX
    POP BX
    POP AX
    PUSH SI ; Save Forth registers. NEEDED?
    PUSH BP
    ; XCHG SI,AX ; Save AX in (already free) SI
    ; XCHG SI,AX
    RQBIOS: INT(0) ; Request number to be overwritten.
    PUSHF ; Save status into DI
    POP DI
    ; XCHG SI,AX ; Save AX in (still free) SI
    ; XCHG SI,AX
    POP BP ; Restore Forth registers. NEEDED?
    POP SI
    PUSH AX
    PUSH BX
    PUSH CX
    PUSH DX
    PUSH DI ; i.e. flags
    JMP SHORT ZZZ1
    XXX1: JMP YYY1 ; senseless return
    ZZZ1: JMP NEXT

    ; SELF MODIFYING CODE ENDS HERE! YOU HAVE BEEN WARNED!

    So, as you can see, there are even two jumps, forth and back, and this wasn't of any help.
    That's why I became somewhat cautious with SMC.

  • From dxforth@21:1/5 to Zbig on Fri Jul 29 22:49:52 2022
    On 29/07/2022 21:38, Zbig wrote:
    Otherwise any backwards JMP should do it.

    Two days ago I changed that code in the following way:

    POP AX ; Function code
    ; Once we are more knowledgeable, put segment override here.
    JMP XXX1 ; senseless jump to make self-modifying code work on 486
    YYY1: MOV BYTE [RQBIOS+1],AL ; Patch the code.
    POP DX
    POP CX
    POP BX
    POP AX
    PUSH SI ; Save Forth registers. NEEDED?
    PUSH BP
    ; XCHG SI,AX ; Save AX in (already free) SI
    ; XCHG SI,AX
    RQBIOS: INT(0) ; Request number to be overwritten.
    PUSHF ; Save status into DI
    POP DI
    ; XCHG SI,AX ; Save AX in (still free) SI
    ; XCHG SI,AX
    POP BP ; Restore Forth registers. NEEDED?
    POP SI
    PUSH AX
    PUSH BX
    PUSH CX
    PUSH DX
    PUSH DI ; i.e. flags
    JMP SHORT ZZZ1
    XXX1: JMP YYY1 ; senseless return
    ZZZ1: JMP NEXT

    ; SELF MODIFYING CODE ENDS HERE! YOU HAVE BEEN WARNED!

    So, as you can see, there are even two jumps, forward and back, and this didn't help at all.
    That's why I became somewhat cautious with SMC.

    I wouldn't expect the above to work as there's no JMP between modification
    and execution.
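
    A rough way to see why: count the bytes between YYY1 and the patched operand. The sketch below is only an illustration. It assumes NASM's usual 16-bit encodings (3 bytes for MOV moffs8,AL, 1 byte per PUSH/POP, 2 bytes for INT) and the 32-byte prefetch queue commonly quoted for the i486; neither figure comes from the actual listing file, and the names are made up for the example.

```python
# Instructions from YYY1 onward, with ASSUMED encoded sizes in bytes
# (16-bit code, no segment override prefix).
LAYOUT = [
    ("YYY1", "MOV BYTE [RQBIOS+1],AL", 3),  # A2 lo hi
    (None, "POP DX", 1),
    (None, "POP CX", 1),
    (None, "POP BX", 1),
    (None, "POP AX", 1),
    (None, "PUSH SI", 1),
    (None, "PUSH BP", 1),
    ("RQBIOS", "INT n", 2),  # CD nn; nn is the byte being patched
]

PREFETCH_BYTES = 32  # ASSUMED i486 prefetch queue size


def label_offsets(layout):
    """Return {label: offset from the first instruction} for labelled lines."""
    offsets, pos = {}, 0
    for label, _text, size in layout:
        if label is not None:
            offsets[label] = pos
        pos += size
    return offsets


OFFSETS = label_offsets(LAYOUT)
PATCHED = OFFSETS["RQBIOS"] + 1  # the operand byte of INT
# If the patched byte lies inside the window the prefetcher may already have
# read after the jump to YYY1, the old operand can still be the one executed.
MAY_BE_STALE = PATCHED - OFFSETS["YYY1"] < PREFETCH_BYTES

print(f"patched byte is {PATCHED} bytes past YYY1; may be stale: {MAY_BE_STALE}")
```

    With these assumed sizes the patched byte sits well inside the window, so a jump taken before the MOV cannot be relied on to help.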

    Are you saying the following patch (applied to both BIOSO and BIOSN) doesn't work?

    MOV BYTE [RQBIOS+1],AL ; Patch the code.
    JMP XXX
    NOP
    XXX: POP DX

  • From Zbig@21:1/5 to All on Fri Jul 29 05:19:21 2022
    Oh, I forgot: that was „the first version”. I then changed the lines:

    JMP XXX1 ; senseless jump to make self-modifying code work on 486
    [..]
    XXX1: JMP YYY1 ; senseless return

    ...to both contain JMP LONG. No effect.

  • From Zbig@21:1/5 to All on Fri Jul 29 08:24:01 2022
    Are you saying the following patch (applied to both BIOSO and BIOSN) doesn't work?
    MOV BYTE [RQBIOS+1],AL ; Patch the code.
    JMP XXX
    NOP
    XXX: POP DX

    Yep, that works. Thanks.

  • From dxforth@21:1/5 to Zbig on Sat Jul 30 12:20:01 2022
    On 30/07/2022 01:24, Zbig wrote:
    Are you saying the following patch (applied to both BIOSO and BIOSN) doesn't work?
    MOV BYTE [RQBIOS+1],AL ; Patch the code.
    JMP XXX
    NOP
    XXX: POP DX

    Yep, that works. Thanks.

    Great! Can you confirm:

    1) With JMP XXX changed to JMP SHORT XXX, it still works
    2) Deleting the NOP causes it to fail

  • From Zbig@21:1/5 to All on Sat Jul 30 05:18:02 2022
    Great! Can you confirm:

    1) With JMP XXX changed to JMP SHORT XXX, it still works

    NASM had already optimized these JMPs to „short”,
    so this made no difference (same opcode, I checked).

    2) Deleting the NOP causes it to fail

    When I commented out the NOPs, it... still works.

  • From dxforth@21:1/5 to Zbig on Sat Jul 30 23:36:08 2022
    On 30/07/2022 22:18, Zbig wrote:
    Great! Can you confirm:

    [...]

    2) Deleting the NOP causes it to fail

    When I commented out the NOPs, it... still works.

    Ok - so the NOP is superfluous and a JMP to the next sequential instruction works, i.e. it is sufficient to flush the CPU prefetch queue. Solved!
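
    For anyone reading later, here is a toy model of the effect, written in Python rather than assembler so it can be run anywhere. It is a sketch under stated assumptions, not a description of real 486 internals: the queue copies bytes when they are prefetched, data writes never update those copies, and any taken jump discards the queue. The class and function names, the 16-byte memory and the offsets are invented for the illustration.

```python
class ToyCPU:
    """Toy model: memory plus a prefetch queue that never sees later writes."""

    def __init__(self, memory, queue_bytes=32):
        self.mem = list(memory)
        self.queue_bytes = queue_bytes  # assumed queue size
        self.queue = {}                 # address -> byte copied at prefetch time

    def jump(self, target):
        """Taken jump: discard the queue and refill it starting at `target`."""
        self.queue.clear()
        end = min(target + self.queue_bytes, len(self.mem))
        for addr in range(target, end):
            self.queue[addr] = self.mem[addr]

    def write(self, addr, value):
        """Data write: changes memory only; any queued copy stays as it was."""
        self.mem[addr] = value

    def fetch(self, addr):
        """Instruction fetch: served from the queue, refilled on a miss."""
        if addr not in self.queue:
            self.jump(addr)
        return self.queue[addr]


def run(flush_after_write):
    """Patch the INT operand and return the operand byte the CPU executes."""
    mem = [0x90] * 16   # filler bytes
    mem[9] = 0xCD       # RQBIOS: INT opcode
    mem[10] = 0x00      # RQBIOS+1: operand still to be patched
    cpu = ToyCPU(mem)
    cpu.jump(0)             # control arrives; queue fills from address 0
    cpu.write(10, 0x21)     # MOV BYTE [RQBIOS+1],AL with AL = 21h
    if flush_after_write:
        cpu.jump(3)         # JMP XXX placed after the MOV
    return cpu.fetch(10)


print(hex(run(flush_after_write=False)))  # operand taken from the old queue
print(hex(run(flush_after_write=True)))   # operand re-read after the flush
```

    In this model the only thing that matters is that a jump is taken between the write and the fetch of the patched byte; where the jump lands makes no difference, which matches what was observed above.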

  • From Stephen Pelc@21:1/5 to All on Mon Aug 8 09:21:38 2022
    On 5 Jul 2022 at 15:51:22 CEST, "Christof Eberspaecher" <chwebersp@gmail.com> wrote:

    Hi,
    the current Raspberry Pi OS no longer supports wiringPi, which was a way to access GPIO under GForth as described here: https://forums.raspberrypi.com/viewtopic.php?t=207597
    Is there now another (easy) way?
    Thanks in advance! Christof

    VFX Forth for ARM Linux includes port access code for Raspberry Pi. See
    Examples/Lin32/rpi-gpio.fth

    Stephen
    --
    Stephen Pelc, stephen@vfxforth.com
    MicroProcessor Engineering, Ltd. - More Real, Less Time
    133 Hill Lane, Southampton SO15 5AF, England
    tel: +44 (0)23 8063 1441, +44 (0)78 0390 3612,
    +34 649 662 974
    http://www.mpeforth.com - free VFX Forth downloads
