Forum: >>> Magnum BBS <<<

Re: GPIO access with GForth on Raspberry Pi

From Zbig@21:1/5 to All on Sun Jul 24 14:10:31 2022

See also: https://news.ycombinator.com/item?id=29564458

„Wasted many hours of my young life trying to figure out why my
self-modifying assembler program worked perfect in the debugger
but not without it.”

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Zbig@21:1/5 to All on Sun Jul 24 14:07:25 2022

Since that part is described as "self modifying code" the conclusion is, that
the self-modification doesn't go exactly as well in different circumstances.

That conclusion is correct. The bug is not in mina, but in the msdos emulator
that not allows this modification.

Well maybe it cannot be called „a bug” in strict sense, still it is non-reliable code:

„For Intel486(TM) processors, a write to an instruction in the cache will modify
it in both the cache and memory, but if the instruction was prefetched before the write, the old version of the instruction could be the one executed.”

( https://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/pentium4_hh/events4/self-modifying_code_clear.htm )

So it's confirmed 100%. Creating self-modifying code for the processors higher than 386 seems to be risky business, unfortunately.


From dxforth@21:1/5 to Zbig on Mon Jul 25 16:42:41 2022

On 25/07/2022 07:07, Zbig wrote:

Since that part is described as "self modifying code" the conclusion is, that
the self-modification doesn't go exactly as well in different circumstances.

That conclusion is correct. The bug is not in mina, but in the msdos emulator
that not allows this modification.

Well maybe it cannot be called „a bug” in strict sense, still it is non-reliable code:

„For Intel486(TM) processors, a write to an instruction in the cache will modify
it in both the cache and memory, but if the instruction was prefetched before the write, the old version of the instruction could be the one executed.”

( https://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/pentium4_hh/events4/self-modifying_code_clear.htm )

So it's confirmed 100%. Creating self-modifying code for the processors higher
than 386 seems to be risky business, unfortunately.

But what is the official position? All I could find was:

IA-32 Intel® Architecture Software Developer’s Manual Volume 3:

To write self-modifying code and ensure that it is compliant with current
and future versions of the IA-32 architecture, one of the following two
coding options must be chosen:

(* OPTION 1 *)
Store modified code (as data) into code segment;
Jump to new code or an intermediate location;
Execute new code;

(* OPTION 2 *)
Store modified code (as data) into code segment;
Execute a serializing instruction; (* For example, CPUID instruction *)
Execute new code;

(The use of one of these options is not required for programs intended
to run on the Pentium or Intel486 processors, but are recommended to
insure compatibility with the Pentium 4, Intel Xeon, and P6 family
processors.)


From Zbig@21:1/5 to All on Mon Jul 25 01:50:05 2022

Since that part is described as "self modifying code" the conclusion is, that
the self-modification doesn't go exactly as well in different circumstances.

That conclusion is correct. The bug is not in mina, but in the msdos emulator
that not allows this modification.

Well maybe it cannot be called „a bug” in strict sense, still it is non-reliable code:

„For Intel486(TM) processors, a write to an instruction in the cache will modify
it in both the cache and memory, but if the instruction was prefetched before
the write, the old version of the instruction could be the one executed.”

( https://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/pentium4_hh/events4/self-modifying_code_clear.htm )

So it's confirmed 100%. Creating self-modifying code for the processors higher
than 386 seems to be risky business, unfortunately.

But what is the official position? All I could find was:

IA-32 Intel® Architecture Software Developer’s Manual Volume 3:

Not sure about „official position”; I've found statements of the programmers
who experienced that problem. They confirmed my initial suspicion.

It seems, that self-modifying code:
— can be considered safe on older processors (and I mean „quite old”)
— may work as intended, or may not — on newer processors
— always makes slower the procedure it's applied to


From Anton Ertl@21:1/5 to dxforth on Tue Jul 26 09:26:40 2022

dxforth <dxforth@gmail.com> writes:

On 25/07/2022 07:07, Zbig wrote:

So it's confirmed 100%. Creating self-modifying code for the processors higher
than 386 seems to be risky business, unfortunately.

Risky? You just follow the rules, and it works.

And that's already with the 8086 and 8088: They have an instruction
prefetch buffer of 6 and 4 bytes, respectively, so if you change an
instruction that has already been prefetched, you will not see the
effect of the change right away. I expect that the prefetch buffers
grew over the generations (e.g., the 486 would prefetch one cache line
(16 bytes) at a time). But the prefetching was straight ahead up to
and including the 486, and when a jump was taken, the prefetching
started from scratch. So they announced the rule that you had to take
a jump between the modification and the execution.
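
The straight-ahead prefetching described above can be pictured with a small toy model. The Python sketch below is only an illustration, not a cycle-accurate simulation: the function name `executed_byte`, the one-byte "store" at address 0 and the filler bytes are assumptions made for the example. Only the idea of a fixed-size queue (6 bytes on the 8086) that restarts after a taken jump comes from the discussion.

```python
from collections import deque


def executed_byte(queue_size, distance, jump_after_store=False):
    """Return the byte a toy CPU executes at the patched address.

    Address 0 holds a pretend one-byte "store" that patches the byte
    located `distance` bytes ahead (distance must be at least 1).
    """
    OLD, NEW = 0x00, 0x21             # e.g. INT 00h patched to INT 21h
    memory = [0x90] * (distance + 1)  # filler bytes between store and target
    target = distance
    memory[target] = OLD

    queue = deque()                   # prefetched (address, byte) pairs
    fetch_ptr = 0
    while True:
        # Prefetch straight ahead until the queue is full.
        while len(queue) < queue_size and fetch_ptr < len(memory):
            queue.append((fetch_ptr, memory[fetch_ptr]))
            fetch_ptr += 1
        addr, byte = queue.popleft()
        if addr == target:
            return byte               # this is what actually gets executed
        if addr == 0:
            memory[target] = NEW      # the store reaches memory, not the queue
            if jump_after_store:
                queue.clear()         # a taken jump restarts prefetching
                fetch_ptr = addr + 1


print(hex(executed_byte(6, 3)))                         # stale byte
print(hex(executed_byte(6, 3, jump_after_store=True)))  # patched byte
print(hex(executed_byte(6, 10)))                        # patched byte
```

In this model a patch 3 bytes ahead of the store is missed with a 6-byte queue unless a jump intervenes, while a patch 10 bytes ahead is seen either way, because that byte had not been fetched yet when the store happened.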

The Pentium added branch prediction and an instruction cache that is
separate from the data cache, both of which mean that the jump rule
would no longer work without extra hardware effort. Because backwards
compatibility is so important, they spent the extra hardware effort,
so the jump rule is still sufficient. I suspect that Pentium and
later CPUs are actually closer to the ideal of self-modifying code as
you would imagine it working from the specifications of the
instructions than the earlier CPUs, but I have not tested this.

IA-32 Intel® Architecture Software Developer’s Manual Volume 3:

To write self-modifying code and ensure that it is compliant with current
and future versions of the IA-32 architecture, one of the following two
coding options must be chosen:

(* OPTION 1 *)
Store modified code (as data) into code segment;
Jump to new code or an intermediate location;
Execute new code;

That's the rule I stated above.

(* OPTION 2 *)
Store modified code (as data) into code segment;
Execute a serializing instruction; (* For example, CPUID instruction *)
Execute new code;

Note that CPUID only exists on the Pentium and later CPUs. I wonder
what that rule is about? Who would use a slow serializing instruction
(typically 10+ cycles) instead of a fast jump and make their program
less portable at the same time?

(The use of one of these options is not required for programs intended
to run on the Pentium or Intel486 processors, but are recommended to
insure compatibility with the Pentium 4, Intel Xeon, and P6 family
processors.)

I am absolutely certain that the jump rule is needed for the 486 and
earlier CPUs. I don't know if it's needed for the more recent CPUs
mentioned above.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: http://www.euroforth.org/ef22/cfp.html


From Zbig@21:1/5 to All on Tue Jul 26 03:56:23 2022

So it's confirmed 100%. Creating self-modifying code for the processors higher
than 386 seems to be risky business, unfortunately.

Risky? You just follow the rules, and it works.

Oh, I don't know. Most of the time it works — but then one finds
the case when it doesn't, and then there are controversies: „no,
it's not the code; it's bad emulator” or the like.

From what I see the technique is better to be avoided, unless it is
„really really” needed for some particular reason.


From dxforth@21:1/5 to Zbig on Tue Jul 26 22:05:58 2022

On 26/07/2022 20:56, Zbig wrote:

So it's confirmed 100%. Creating self-modifying code for the processors higher
than 386 seems to be risky business, unfortunately.

Risky? You just follow the rules, and it works.

Oh, I don't know. Most of the time it works — but then one finds
the case when it doesn't, and then there are controversies: „no,
it's not the code; it's bad emulator” or the like.

What steps did you take to rule out MINA as the cause of your error?
Does it happen with DX-Forth? I used self-modifying code which you
can test with:

s" foobar$" drop 'DX ! 9 doscall

From what I see the technique is better to be avoided, unless it is
„really really” needed for some particular reason.

Such as when there's no single 'INT' instruction.


From dxforth@21:1/5 to Anton Ertl on Tue Jul 26 22:37:50 2022

On 26/07/2022 19:26, Anton Ertl wrote:

...

IA-32 Intel® Architecture Software Developer’s Manual Volume 3:

To write self-modifying code and ensure that it is compliant with current
and future versions of the IA-32 architecture, one of the following two
coding options must be chosen:

(* OPTION 1 *)
Store modified code (as data) into code segment;
Jump to new code or an intermediate location;
Execute new code;

That's the rule I stated above.

(* OPTION 2 *)
Store modified code (as data) into code segment;
Execute a serializing instruction; (* For example, CPUID instruction *)
Execute new code;

Note that CPUID only exists on the Pentium and later CPUs. I wonder
what that rule is about? Who would use a slow serializing instruction
(typically 10+ cycles) instead of a fast jump and make their program
less portable at the same time?

(The use of one of these options is not required for programs intended
to run on the Pentium or Intel486 processors, but are recommended to
insure compatibility with the Pentium 4, Intel Xeon, and P6 family
processors.)

I am absolutely certain that the jump rule is needed for the 486 and
earlier CPUs. I don't know if it's needed for the more recent CPUs
mentioned above.

I've no experience but assuming so I would at least expect to see it
stated in the docs for the 486. At the time the 486 was created, DOS
programs were still widely used and for Intel to release a CPU that
was incompatible would be 'A courageous decision, Minister'.


From Zbig@21:1/5 to All on Tue Jul 26 05:24:22 2022

What steps did you take to rule out MINA as the cause of your error?

But I wrote everything in this thread already. What more can I say?

Does it happen with DX-Forth? I used self-modifying code which you
can test with:

s" foobar$" drop 'DX ! 9 doscall

No, I hadn't such problem with DXForth. But — considering that Mina
example — can we be sure it won't happen in the future, and in case
of every possible x86 clone?

From what I see the technique is better to be avoided, unless it is „really really” needed for some particular reason.

Such as when there's no single 'INT' instruction.

But is it really the situation that begs for self-modifying code?
- one doesn't usually use that many different DOS/BIOS software interrupts
- these interrupts usually require a variable number of arguments anyway
- having a quite nice interface to ML, like exactly in DXForth, allows one to
request an interrupt in an elegant way without resorting to self-modifying code


From dxforth@21:1/5 to Zbig on Wed Jul 27 00:10:49 2022

On 18/07/2022 12:19, Zbig wrote:

[...]

As DX-Forth uses the same scheme (self-modify an 'INT 0') it should also
fail assuming that's the cause. To rule out MINA, I would patch INT 00h to
INT 66h (unused interrupt). AFAIK 'Runtime error 200 at' is a Borland code
and message.

So I did it, changing "INT 00" to "INT 66". While with "INT 00" it spits out "Divide error" under debug (but then I'm able to continue), in case of "INT 66"
the machine is hung up.
Of course DXForth works as usual (no problems).

It appears INT 66 needs to be initialized first, hence the crash. I got the same when I called it directly using $66 INTCALL.

From the above it certainly looks like MINA isn't modifying the 'INT 00' instr - but why if the DX-Forth DOSCALL test I posted works? I'll compare the codes to see if I can spot something. It doesn't make sense that SMC should work on DX-Forth but not on MINA when run on the same machine/setup.


From Zbig@21:1/5 to All on Tue Jul 26 07:42:15 2022

OK, so I made the test you've posted and yes, it works like this:
s" foobar$" drop 'DX ! 9 doscall foobar ok


From Zbig@21:1/5 to All on Tue Jul 26 07:38:49 2022

From the above it certainly looks like MINA isn't modifying the 'INT 00' instr
- but why if the DX-Forth DOSCALL test I posted works? I'll compare the codes
to see if I can spot something. It doesn't make sense that SMC should work on
DX-Forth but not on MINA when run on the same machine/setup.

I DIDN'T use the test you've posted. You asked a question, whether I had any such
problems with DXForth — so I responded I didn't have any so far.


From dxforth@21:1/5 to Zbig on Wed Jul 27 01:49:04 2022

On 27/07/2022 00:42, Zbig wrote:

OK, so I made the test you've posted and yes, it works like this:
s" foobar$" drop 'DX ! 9 doscall foobar ok

So it successfully modified INT 00h to INT 21h and performed DOS Fn9 -
'Write String to STDout'.

Try this:

: test begin $30 doscall key? until key drop ;

It proves nothing if it works but a fail would be interesting.

You are using FreeDOS - can you boot a genuine MS-DOS?


From Zbig@21:1/5 to All on Tue Jul 26 09:03:23 2022

: test begin $30 doscall key? until key drop ;

It proves nothing if it works but a fail would be interesting.

It seems to be working OK. Waited some time and ended the
loop with keypress.

You are using FreeDOS - can you boot a genuine MS-DOS?

Not on that particular 486 — too much hassle with such „transition”.
But honestly: reliable code should work on either one, not „only for
MS-DOS (tm)". FreeDOS isn't an emulator — and I already wrote, that
exactly under emulator (which uses FreeDOS files, BTW) Mina — rather surprisingly, in such situation — works correctly?


From dxforth@21:1/5 to Zbig on Wed Jul 27 13:56:21 2022

On 27/07/2022 02:03, Zbig wrote:

: test begin $30 doscall key? until key drop ;

It proves nothing if it works but a fail would be interesting.

It seems to be working OK. Waited some time and ended the
loop with keypress.

You are using FreeDOS - can you boot a genuine MS-DOS?

Not on that particular 486 — too much hassle with such „transition”. But honestly: reliable code should work on either one, not „only for
MS-DOS (tm)". FreeDOS isn't an emulator — and I already wrote, that
exactly under emulator (which uses FreeDOS files, BTW) Mina — rather surprisingly, in such situation — works correctly?

Running FreeDOS under an emulator is introducing another variable -
not ruling out FreeDOS. If the claim is MINA's code is unreliable
then one has to prove it. Since only you can reproduce the fault
it's left to you to pin-point where the defect is.

I compared MINA BIOSO vs. DX-Forth INTCALL. Not a lot of difference
other than the latter's code is larger as it can use any 8086 register
as parameter.

There are 23 instructions in DX-Forth between modifying and executing
INT 00, vs. 8 instructions for MINA. Neither uses a JMP between modify
and execute as suggested by Intel for later generation cpu's.

DX-Forth has extra code to handle a MS-DOS 2 bug:

mov cs:fssav,sp
intc1: int 0 ; NOTE: self-modifying code
cli
mov ss,cs:cseg1 ; restore SS:SP
mov sp,cs:fssav ; for DOS 2.x
sti

I don't understand the purpose of the XCHG instr's in MINA:

XCHG SI,AX ; Save AX in (already free) SI

XCHG SI,AX
RQBIOS: INT(0) ; Request number to be overwritten.
PUSHF ; Save status into DI
POP DI
XCHG SI,AX ; Save AX in (still free) SI

XCHG SI,AX

If you are able to re-assemble MINA, try inserting a JMP per Intel:

MOV BYTE [RQBIOS+1],AL ; Patch the code.
JMP XXX
XXX: POP DX


From dxforth@21:1/5 to dxforth on Wed Jul 27 14:58:32 2022

On 26/07/2022 22:37, dxforth wrote:

On 26/07/2022 19:26, Anton Ertl wrote:

...

IA-32 Intel® Architecture Software Developer’s Manual Volume 3:

To write self-modifying code and ensure that it is compliant with current
and future versions of the IA-32 architecture, one of the following two
coding options must be chosen:

(* OPTION 1 *)
Store modified code (as data) into code segment;
Jump to new code or an intermediate location;
Execute new code;

That's the rule I stated above.

(* OPTION 2 *)
Store modified code (as data) into code segment;
Execute a serializing instruction; (* For example, CPUID instruction *)
Execute new code;

Note that CPUID only exists on the Pentium and later CPUs. I wonder
what that rule is about? Who would use a slow serializing instruction
(typically 10+ cycles) instead of a fast jump and make their program
less portable at the same time?

(The use of one of these options is not required for programs intended
to run on the Pentium or Intel486 processors, but are recommended to
insure compatibility with the Pentium 4, Intel Xeon, and P6 family
processors.)

I am absolutely certain that the jump rule is needed for the 486 and
earlier CPUs. I don't know if it's needed for the more recent CPUs
mentioned above.

I've no experience but assuming so I would at least expect to see it
stated in the docs for the 486. At the time the 486 was created, DOS programs were still widely used and for Intel to release a CPU that
was incompatible would be 'A courageous decision, Minister'.

Well, it appears Intel did. From the 486DX2 manual:

3. The prefetch queue has been increased from 16
bytes to 32 bytes. A jump always needs to execute
after modifying code to guarantee correct execution
of the new instruction.

http://www.s100computers.com/My%20System%20Pages/80486%20Board/Intel486_DX2_Microprocessor_Data_Book_Jul92.pdf


From dxforth@21:1/5 to dxforth on Wed Jul 27 15:22:18 2022

On 27/07/2022 13:56, dxforth wrote:

If you are able to re-assemble MINA, try inserting a JMP per Intel:

MOV BYTE [RQBIOS+1],AL ; Patch the code.
JMP XXX
XXX: POP DX

p.s. Change that so the interval between modify and execute exceeds 32 bytes.


From Zbig@21:1/5 to All on Wed Jul 27 06:28:32 2022

p.s. Change that so the interval between modify and execute exceeds 32 bytes.

Indeed after insertion of 32 NOPs — which along with following few instructions
gave a little more than 32 bytes — it started to work. But when I was trying to fix
that with two shorter jumps „back and forth” — no way.

I'll simply replace that SMC with „ordinary” code.


From dxforth@21:1/5 to Zbig on Thu Jul 28 00:14:14 2022

On 27/07/2022 23:28, Zbig wrote:

p.s. Change that so the interval between modify and execute exceeds 32 bytes.

Indeed after insertion of 32 NOPs — which along with following few instructions
gave a little more than 32 bytes — it started to work. But when I was trying to fix
that with two shorter jumps „back and forth” — no way.

Googling I got the impression a JMP to the next instruction should have worked. Did you use a 3-byte JMP instr? As both BIOSO and BIOSN self-modify, both need patching.

If the JMP's don't work it would be possible to split the routines and satisfy the 32-byte distance requirement without the need for padding.

I'll simply replace that SMC with „ordinary” code.

It rather defeats the purpose. I wouldn't do it to DX-Forth. A programmer is expected to solve problems - not run away from them :)


From Zbig@21:1/5 to All on Wed Jul 27 12:24:28 2022

Indeed after insertion of 32 NOPs — which along with following few instructions
gave a little more than 32 bytes — it started to work. But when I was trying to fix
that with two shorter jumps „back and forth” — no way.

Googling I got the impression a JMP to the next instruction should have worked.
Did you use a 3-byte JMP instr? As both BIOSO and BIOSN self-modify, both need
patching.

Indeed NASM optimized these jumps to 2-byte instruction — so I changed it to be
both 'JMP LONG' explicitly. Unfortunately, it didn't change much; mina breaks as
before.

If the JMP's don't work it would be possible to split the routines and satisfy
the 32-byte distance requirement without the need for padding.

Probably. Even very likely.

I'll simply replace that SMC with „ordinary” code.

It rather defeats the purpose. I wouldn't do it to DX-Forth. A programmer is expected to solve problems - not run away from them :)

Actually I'm not sure is it worth the effort.
From what I see the most important is „DOS dispatcher” INT 21h, and just
a few more, like: INT 10, 11, 12, 13, 15, 16, 19, 1A, 20, 33 (h). That makes eleven interrupts together. So for such handful it's possible to handle the problem
using kind of table, being 100% sure nothing will break in case the program will be
run on another, even different processor, that maybe will behave in some slightly different
way. If there was need to use, say, 60 different interrupts or so — well, that would be
quite different story. But if there's no need — maybe it would be practical to follow the
advice from „Thinking Forth”, I mean: „Generality usually involves complexity. Don't
generalize your solution any more than will be required; instead, keep it changeable”.

I believe that of course it can be handled, since I fixed it primitive way by adding NOPs —
still there remains an apprehension of kind: „...until next time” (in different conditions).
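
One possible reading of the "kind of table" idea is a block of fixed, never-modified stubs, one per interrupt, selected by index. The Python sketch below only builds the bytes such a table could contain and computes where each stub would start. The names `INTERRUPTS`, `build_stub_table` and `stub_offset` are hypothetical, and the `INT n` / `RET` stub layout is an assumption for illustration, not Mina's actual code. The interrupt numbers are the eleven listed above.

```python
# The eleven interrupts named in the post (hexadecimal).
INTERRUPTS = [0x21, 0x10, 0x11, 0x12, 0x13, 0x15, 0x16, 0x19, 0x1A, 0x20, 0x33]

STUB_SIZE = 3  # bytes per stub: INT imm8 (CD nn) followed by near RET (C3)


def build_stub_table(numbers):
    """Return the machine-code bytes for one 'INT n ; RET' stub per number."""
    table = bytearray()
    for n in numbers:
        table += bytes([0xCD, n, 0xC3])
    return bytes(table)


def stub_offset(numbers, n):
    """Offset of the stub for interrupt n; raises ValueError if n is not listed."""
    return numbers.index(n) * STUB_SIZE


table = build_stub_table(INTERRUPTS)
print(len(table))                      # total table size in bytes
print(stub_offset(INTERRUPTS, 0x10))   # where the INT 10h stub starts
```

Because every stub is assembled once and never written to again, no prefetch or cache behaviour can make it stale. The cost is that an interrupt outside the list is simply not available.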


From dxforth@21:1/5 to Zbig on Thu Jul 28 11:52:11 2022

On 28/07/2022 05:24, Zbig wrote:

Indeed after insertion of 32 NOPs — which along with following few instructions
gave a little more than 32 bytes — it started to work. But when I was trying to fix
that with two shorter jumps „back and forth” — no way.

Googling I got the impression a JMP to the next instruction should have worked.
Did you use a 3-byte JMP instr? As both BIOSO and BIOSN self-modify, both need
patching.

Indeed NASM optimized these jumps to 2-byte instruction — so I changed it to be
both 'JMP LONG' explicitly. Unfortunately, it didn't change much; mina breaks as
before.

If the JMP's don't work it would be possible to split the routines and satisfy
the 32-byte distance requirement without the need for padding.

Probably. Even very likely.

This should do it.

RQBIOS: INT(0) ; Request number to be overwritten.
...
JMP NEXT
;
RQBIOSN: INT(0) ; Request number to be overwritten.
...
JMP NEXT

; *************
; * BIOSO *
; *************
;
N_BIOSO:
DW 5
DB "BIOSO"
BIOSO:
...
MOV BYTE [RQBIOS+1],AL ; Patch the code.
...
XCHG SI,AX
JMP RQBIOS

; *************
; * BIOSN *
; *************
;
N_BIOSN:
DW 5
DB "BIOSN"
BIOSN:
...
MOV BYTE [RQBIOSN+1],AL ; Patch the code.
...
XCHG SI,AX
JMP RQBIOSN

I'll simply replace that SMC with „ordinary” code.

It rather defeats the purpose. I wouldn't do it to DX-Forth. A programmer is
expected to solve problems - not run away from them :)

Actually I'm not sure is it worth the effort.
From what I see the most important is „DOS dispatcher” INT 21h, and just a few more, like: INT 10, 11, 12, 13, 15, 16, 19, 1A, 20, 33 (h). That makes eleven interrupts together. So for such handful it's possible to handle the problem
using kind of table, being 100% sure nothing will break in case the program will be
run on another, even different processor, that maybe will behave in some slightly different
way. If there was need to use, say, 60 different interrupts or so — well, that would be
quite different story. But if there's no need — maybe it would be practical to follow the
advice from „Thinking Forth”, I mean: „Generality usually involves complexity. Don't
generalize your solution any more than will be required; instead, keep it changeable”.

Nice try. I'd try a backwards JMP. If beaten, I'd go for 32-byte separation and a JMP (see above). As for SMC failing on later CPUs I would have expected to see reports of same. I ran MINA on a Pentium without issue. Without evidence
to the contrary I would treat 486 as a special case. No point crippling software
because Intel gave birth to one bastard. Similarly the idea SMC is the devil's work to be avoided at any cost.


From dxforth@21:1/5 to Zbig on Thu Jul 28 19:41:27 2022

On 27/07/2022 23:28, Zbig wrote:

I'll simply replace that SMC with „ordinary” code.

It's possible to simulate an INT n instruction. How foolproof it is, I don't know.

pop bx ; INT#
sub ax,ax ; get vector
mov es,ax
shl bx,1
shl bx,1
mov ax,es:[bx]
mov ivec,ax
mov ax,es:[bx+2]
mov ivec+2,ax
...
pushf ; execute it
cli
call dword ptr [ivec]
...

ivec dw 0,0
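
The address arithmetic in the fragment above (shift the interrupt number left twice, then read two words from segment 0) can be checked with a small model. The Python sketch below works on a plain byte image standing in for the first kilobyte of real-mode memory. The helper names `vector_address`, `read_vector` and `linear` are hypothetical, and the sample vector value is made up for the demonstration.

```python
import struct


def vector_address(int_no):
    """Offset of the 4-byte vector for int_no within segment 0 (int_no * 4)."""
    return int_no * 4


def read_vector(memory, int_no):
    """Return (segment, offset) of the handler stored for int_no.

    The offset word comes first, then the segment word, both little-endian,
    which matches the es:[bx] and es:[bx+2] reads in the assembly.
    """
    offset, segment = struct.unpack_from("<HH", memory, vector_address(int_no))
    return segment, offset


def linear(segment, offset):
    """Real-mode linear address of segment:offset (1 MB wraparound ignored)."""
    return (segment << 4) + offset


# Fake interrupt vector table with a made-up handler for INT 21h.
ivt = bytearray(1024)
struct.pack_into("<HH", ivt, vector_address(0x21), 0x1234, 0xF000)

seg, off = read_vector(ivt, 0x21)
print(hex(vector_address(0x21)), hex(seg), hex(off), hex(linear(seg, off)))
```

The `pushf` / `cli` / far `call` sequence in the assembly then imitates what the `INT` instruction itself does: it leaves flags, CS and IP on the stack so that the handler's `IRET` returns correctly.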


From Zbig@21:1/5 to All on Thu Jul 28 12:06:48 2022

As for SMC failing on later CPUs I would have expected
to see reports of same. I ran MINA on a Pentium without issue. Without evidence
to the contrary I would treat 486 as a special case. No point crippling software
because Intel gave birth to one bastard. Similarly the idea SMC is the devil's
work to be avoided at any cost.

I tried Mina on AthlonXP — and yes, it works (I mean unmodified, original binary).
The CPU I tried it with:

processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 10
model name : AMD Athlon(TM) XP 3000+
stepping : 0
cpu MHz : 2100.201
cache size : 512 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow cpuid 3dnowprefetch vmmcall
bugs : fxsave_leak sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips : 4202.41
clflush size : 32
cache_alignment : 32
address sizes : 34 bits physical, 32 bits virtual
power management: ts

So yes, maybe „one bastard” — still it's the quite ubiquitous one (if we mean that now
„retro” gear). Really a pity I don't have any AMD 486 for comparison.
On the other hand: it's good to know, that unmodified Mina (or just that short procedure)
can be used for testing CPUs regarding potential SMC (un)reliability.


From dxforth@21:1/5 to Zbig on Fri Jul 29 12:48:51 2022

On 29/07/2022 05:06, Zbig wrote:

So yes, maybe „one bastard” — still it's the quite ubiquitous one (if we mean that now
„retro” gear). Really a pity I don't have any AMD 486 for comparison.
On the other hand: it's good to know, that unmodified Mina (or just that short procedure)
can be used for testing CPUs regarding potential SMC (un)reliability.

It may be worth chasing down an AMD 486 manual. An earlier i486 manual (not DX2) explains:

"The prefetch unit is flushed whenever the next instruction needed is not
in numerical sequence with the previous instruction - for example, during
jumps, task switches, exceptions, and interrupts."

So the JMP can be to anywhere *but* the next sequential instruction? That
may explain why:

MOV BYTE [RQBIOS+1],AL ; Patch the code.
JMP XXX
XXX: POP DX

didn't work. You may like to try:

JMP XXX
NOP
XXX:

to see whether it flushes the prefetch queue. Otherwise any backwards JMP should do it.


From Zbig@21:1/5 to All on Fri Jul 29 04:38:06 2022

Otherwise any backwards JMP should do it.

Two days ago I changed that code following way:

POP AX ; Function code
; Once we are more acknowledgeable, put segment overwrite here.
JMP XXX1 ; senseless jump to make self-modifying code work on 486
YYY1: MOV BYTE [RQBIOS+1],AL ; Patch the code.
POP DX
POP CX
POP BX
POP AX
PUSH SI ; Save Forth registers. NEEDED?
PUSH BP
; XCHG SI,AX ; Save AX in (already free) SI
; XCHG SI,AX
RQBIOS: INT(0) ; Request number to be overwritten.
PUSHF ; Save status into DI
POP DI
; XCHG SI,AX ; Save AX in (still free) SI
; XCHG SI,AX
POP BP ; Restore Forth registers. NEEDED?
POP SI
PUSH AX
PUSH BX
PUSH CX
PUSH DX
PUSH DI ; i.e. flags
JMP SHORT ZZZ1
XXX1: JMP YYY1 ; senseless return
ZZZ1: JMP NEXT

; SELF MODIFYING CODE ENDS HERE! YOU HAVE BEEN WARNED!

So, as you can see there are even two jumps — forth and back — and this wasn't of any help.
That's why I became somewhat cautious with SMC.


From dxforth@21:1/5 to Zbig on Fri Jul 29 22:49:52 2022

On 29/07/2022 21:38, Zbig wrote:

Otherwise any backwards JMP should do it.

Two days ago I changed that code following way:

POP AX ; Function code
; Once we are more acknowledgeable, put segment overwrite here.
JMP XXX1 ; senseless jump to make self-modifying code work on 486
YYY1: MOV BYTE [RQBIOS+1],AL ; Patch the code.
POP DX
POP CX
POP BX
POP AX
PUSH SI ; Save Forth registers. NEEDED?
PUSH BP
; XCHG SI,AX ; Save AX in (already free) SI
; XCHG SI,AX
RQBIOS: INT(0) ; Request number to be overwritten.
PUSHF ; Save status into DI
POP DI
; XCHG SI,AX ; Save AX in (still free) SI
; XCHG SI,AX
POP BP ; Restore Forth registers. NEEDED?
POP SI
PUSH AX
PUSH BX
PUSH CX
PUSH DX
PUSH DI ; i.e. flags
JMP SHORT ZZZ1
XXX1: JMP YYY1 ; senseless return
ZZZ1: JMP NEXT

; SELF MODIFYING CODE ENDS HERE! YOU HAVE BEEN WARNED!

So, as you can see there are even two jumps — forth and back — and this wasn't of any help.
That's why I became somewhat cautious with SMC.

I wouldn't expect the above to work as there's no JMP between modification
and execution.

Are you saying the following patch (applied to both BIOSO and BIOSN) doesn't work?

MOV BYTE [RQBIOS+1],AL ; Patch the code.
JMP XXX
NOP
XXX: POP DX


From Zbig@21:1/5 to All on Fri Jul 29 05:19:21 2022

Oh, I forgot — that was „the first version”. I changed then the lines:

JMP XXX1 ; senseless jump to make self-modifying code work on 486
[..]
XXX1: JMP YYY1 ; senseless return

...to both contain JMP LONG. No effect.


From Zbig@21:1/5 to All on Fri Jul 29 08:24:01 2022

Are you saying the following patch (applied to both BIOSO and BIOSN) doesn't work?
MOV BYTE [RQBIOS+1],AL ; Patch the code.
JMP XXX
NOP
XXX: POP DX

Yep, that works. Thanks.


From dxforth@21:1/5 to Zbig on Sat Jul 30 12:20:01 2022

On 30/07/2022 01:24, Zbig wrote:

Are you saying the following patch (applied to both BIOSO and BIOSN) doesn't work?
MOV BYTE [RQBIOS+1],AL ; Patch the code.
JMP XXX
NOP
XXX: POP DX

Yep, that works. Thanks.

Great! Can you confirm:

1) Changing JMP XXX to JMP SHORT XXX it still works
2) Deleting the NOP causes it to fail


From Zbig@21:1/5 to All on Sat Jul 30 05:18:02 2022

Great! Can you confirm:

1) Changing JMP XXX to JMP SHORT XXX it still works

NASM earlier already optimized these JMPs to be „short”,
so this made no difference (same opcode, I made sure).

2) Deleting the NOP causes it to fail

When I commented-out the NOPs, it... still works.


From dxforth@21:1/5 to Zbig on Sat Jul 30 23:36:08 2022

On 30/07/2022 22:18, Zbig wrote:

Great! Can you confirm:

[...]

2) Deleting the NOP causes it to fail

When I commented-out the NOPs, it... still works.

Ok - so the NOP is superfluous and a JMP to the next sequential instruction works i.e. is sufficient to flush the CPU prefetch. Solved!


From Stephen Pelc@21:1/5 to All on Mon Aug 8 09:21:38 2022

On 5 Jul 2022 at 15:51:22 CEST, "Christof Eberspaecher" <chwebersp@gmail.com> wrote:

Hi,
The current Raspberry Pi OS no longer supports wiringPi, which was a way to access GPIO under GForth as described here: https://forums.raspberrypi.com/viewtopic.php?t=207597
Is there now another (easy) way?
Thanks in advance! Christof

VFX Forth for ARM Linux includes port access code for Raspberry PI. See
Examples/Lin32/rpi-gpio.fth

Stephen
--
Stephen Pelc, stephen@vfxforth.com
MicroProcessor Engineering, Ltd. - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, +44 (0)78 0390 3612,
+34 649 662 974
http://www.mpeforth.com - free VFX Forth downloads
