ciforth is very much a classical Forth. The headers are followed by
high-level code, machine code or data.
Is there any experience with separating code and data using
the text segment?
Nowadays Apple apparently requires that all executable code reside in
the text segment on its modern systems.
I'm interested in the problems encountered, and also in whether there
is any benefit in speed.
ciforth can do this relatively easily, because it is indirect
threaded. I can imagine that direct-threaded or subroutine-threaded
code runs into even more difficulties.
On Wednesday, October 18, 2023 at 7:06:16 PM UTC+2, Anton Ertl wrote:
[..]
> However, looking at the second-to-last line, I expect that we can
> still see a performance problem from code where the data does not
> start with a defining word, like (proof-of-concept):
FORTH> : foo 100000000 0 do 0 over ! loop drop ; ok
FORTH> here 0 , ok
[1]FORTH> ' foo idis
$013CEAC0 : foo
$013CEACA mov rcx, $05F5E100 d#
$013CEAD1 xor rbx, rbx
$013CEAD4 call (DO) offset NEAR
$013CEADE nop
$013CEADF nop
$013CEAE0 mov [rbx] qword, 0 d#
$013CEAE7 add [rbp 0 +] qword, 1 b#
$013CEAEC add [rbp 8 +] qword, 1 b#
$013CEAF1 jno $013CEAE0 offset NEAR
$013CEAF7 add rbp, #24 b#
$013CEAFB ;
$013CEB05 nop
$013CEB06 nop
[1]FORTH> dup h. $013CEB70 ok
[1]FORTH> foo ok
FORTH> $013CEB70 ? 0 ok
FORTH> see foo
Flags: TOKENIZE, ANSI
: foo 100000000 0 DO 0 OVER ! LOOP DROP ; ok
ok
FORTH> : test ( addr -- ) cr dup h. space timer-reset foo .elapsed ; ok
FORTH> $013CEB70 test
$013CEB70 0.037 seconds elapsed. ok
FORTH> PAD test
$013CF1B8 0.036 seconds elapsed. ok
FORTH> PAD 4000 + aligned test
$013D0158 0.038 seconds elapsed. ok
iForth is since long prepared for separated data and code, but I never enabled
it because it would mean introducing new/non-standard words for CREATE ..
DOES> and , C, etc.. Maybe in next year's release.
Marcel Hendrix <m...@iae.nl> writes:[..]
On Wednesday, October 18, 2023 at 7:06:16 PM UTC+2, Anton Ertl wrote:
foo start   loop end    cell address   time
$10226000   $10226037   $102260B0      0.140s
$102268C0   $102268FF   $10226970      5.711s
Why is the code longer in the second case? For some reason, it used a
10-byte instruction to put $00000001:00000000 into rcx, while the
first variant used a 7-byte instruction to put $05F5E100 into rcx.
But in this case I did not see the slowdown, even with BAR ending 1
byte before the end of the cache line, and even if it ends at the end
of a cache line.
So, the padding you put after code is usually enough, but I found one
case where it was not.
> iForth is since long prepared for separated data and code, but I never enabled
> it because it would mean introducing new/non-standard words for CREATE ..
> DOES> and , C, etc.. Maybe in next year's release.
I don't see why it should. Gforth keeps the native code elsewhere
without such words.
On Wednesday, October 18, 2023 at 10:00:25 PM UTC+2, Anton Ertl wrote:
Marcel Hendrix <m...@iae.nl> writes:[..]
On Wednesday, October 18, 2023 at 7:06:16 PM UTC+2, Anton Ertl wrote:
foo start   loop end    cell address   time
$10226000   $10226037   $102260B0      0.140s
$102268C0   $102268FF   $10226970      5.711s
Why is the code longer in the second case? For some reason, it used a
10-byte instruction to put $00000001:00000000 into rcx, while the
first variant used a 7-byte instruction to put $05F5E100 into rcx.
It seems you were in HEX, which means your second loop was ...
  decimal $0000000100000000 100000000 / .
... 42 times longer than the first loop. Therefore the ratio of timings was
  5711 140 / .
... 40, which is no surprise.
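The arithmetic behind this explanation can be checked with two integer divisions. This is only a sketch, assuming a 64-bit Forth that accepts the `$` hex prefix; the timings are entered in milliseconds (5.711 s and 0.140 s).

```forth
\ Sketch: redo the two ratios with plain (truncating) integer division.
decimal
$100000000 100000000 /  .   \ iterations: hex 100000000 versus decimal 100000000
5711 140 /  .               \ timings in ms: 5.711 s versus 0.140 s
```

The two quotients truncate to 42 and 40, so the run time scales with the loop count to within a few percent.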
>> iForth is since long prepared for separated data and code, but I never enabled
>> it because it would mean introducing new/non-standard words for CREATE ..
>> DOES> and , C, etc.. Maybe in next year's release.
> I don't see why it should. Gforth keeps the native code elsewhere
> without such words.
How do you generate native code with separate code (protected for write)
and data segments, given the assembler is written in Forth? I can't use
the standard !, C!, @, C@, C, and , to access the code segment.
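To make the concern concrete, here is a minimal sketch of the kind of parallel word set the question alludes to: a second dictionary pointer with its own "comma" words for a code area. All names (`code-area`, `code-dp`, `code-here`, `code-c,`, `code,`) are hypothetical and are not iForth, ciforth or Gforth words. The code area here is ordinary data space standing in for a real segment; the sketch does not deal with making it executable or write-protected.

```forth
\ Hypothetical sketch: a separate "code space" with its own pointer.
\ None of these names comes from an existing Forth system.
4096 constant /code-area
create code-area /code-area allot      \ stand-in for a real code segment
variable code-dp   code-area code-dp !

: code-here ( -- addr )  code-dp @ ;
: code-c,   ( c -- )     code-here c!   1 code-dp +! ;
: code,     ( x -- )     code-here !    1 cells code-dp +! ;

\ "Assemble" two bytes (x86 NOP and RET) into the code area.
\ The data-space pointer HERE must not move while doing so.
here                       \ remember the data-space pointer
$90 code-c,  $C3 code-c,
here = .                   \ true flag: data space untouched
code-here code-area - .    \ number of code bytes laid down
```

An assembler written in Forth would then have to call `code-c,` and `code,` instead of `C,` and `,`, which is where the non-standard words come in.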
On Wednesday, October 18, 2023 at 7:06:16 PM UTC+2, Anton Ertl wrote:
[..]
> However, looking at the second-to-last line, I expect that we can
> still see a performance problem from code where the data does not
> start with a defining word, like (proof-of-concept):
FORTH> : foo 100000000 0 do 0 over ! loop drop ; ok
FORTH> here 0 , ok
[1]FORTH> ' foo idis
$013CEAC0 : foo
$013CEACA mov rcx, $05F5E100 d#
$013CEAD1 xor rbx, rbx
$013CEAD4 call (DO) offset NEAR
$013CEADE nop
$013CEADF nop
$013CEAE0 mov [rbx] qword, 0 d#
$013CEAE7 add [rbp 0 +] qword, 1 b#
$013CEAEC add [rbp 8 +] qword, 1 b#
$013CEAF1 jno $013CEAE0 offset NEAR
$013CEAF7 add rbp, #24 b#
$013CEAFB ;
$013CEB05 nop
$013CEB06 nop
[1]FORTH> dup h. $013CEB70 ok
[1]FORTH> foo ok
FORTH> $013CEB70 ? 0 ok
FORTH> see foo
Flags: TOKENIZE, ANSI
: foo 100000000 0 DO 0 OVER ! LOOP DROP ; ok
ok
FORTH> : test ( addr -- ) cr dup h. space timer-reset foo .elapsed ; ok
FORTH> $013CEB70 test
$013CEB70 0.037 seconds elapsed. ok
FORTH> PAD test
$013CF1B8 0.036 seconds elapsed. ok
FORTH> PAD 4000 + aligned test
$013D0158 0.038 seconds elapsed. ok
Not in this case, at least. However, with a bit more cleverness it is possible
to write data in a cached line of preceding code that really needs to execute
(CREATE ... DOES> or ... [ 0 , ] ... ). ISTR that in the past I have used ALIGN
once or twice to get rid of a real (or imagined) problem.
iForth is since long prepared for separated data and code, but I never enabled
it because it would mean introducing new/non-standard words for CREATE ..
DOES> and , C, etc.. Maybe in next year's release.
-marcel
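The ALIGN remark above can be pushed one step further by padding to a cache-line boundary rather than to a cell boundary. This is a sketch only: the 64-byte line size is an assumption, and `cache-line`, `cache-aligned` and `cache-align` are made-up names, not words from any of the systems discussed.

```forth
\ Assumption: 64-byte cache lines.
64 constant cache-line

\ Round an address up to the next cache-line boundary.
: cache-aligned ( addr -- addr' )
  cache-line 1- +  cache-line negate and ;

\ Advance HERE to a cache-line boundary (hypothetical word).
: cache-align ( -- )
  here cache-aligned  here -  allot ;

\ Usage: start a frequently written cell on a fresh cache line.
cache-align  here 0 ,       ( -- addr )
dup cache-line 1- and .     \ low six bits of addr are clear, prints 0
drop
```

Whether this is enough depends on what is compiled after the cell: code that follows it can still land in the same line, so padding after the data may be needed as well.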
In article <b739e7b2-56ce-4020...@googlegroups.com>,
Marcel Hendrix <m...@iae.nl> wrote:
<SNIP>
> How do you generate native code with separate code (protected for write)
> and data segments, given the assembler is written in Forth? I can't use
> the standard !, C!, @, C@, C, and , to access the code segment.
Time to proceed to 64 bits, with its flat memory space.
In fact, in the 32-bit era it was already behind the times to
have separate data, code, stack, and extra segments.
Linus Torvalds could not be bothered. He wouldn't have started
Linux if he had been obliged to.
In article <53600953-77c8-466c-a1b2-3044388359e9n@googlegroups.com>,
Marcel Hendrix <mhx@iae.nl> wrote:
> iForth is since long prepared for separated data and code, but I never enabled
> it because it would mean introducing new/non-standard words for CREATE ..
> DOES> and , C, etc.. Maybe in next year's release.
I can't see that one has to introduce non-standard words.
Also the changes to CODE ENDCODE ;CODE don't seem to be
a big deal either. But you are right, only do it if it has
benefits.
-marcel
when data is close to code. The question above is in case segments are write-protected.
-marcel
In article <nnd$23e6e7e7$28df6727@b48c89f815d28223>,
none) (albert <albert@cherry.> wrote:
> In article <53600953-77c8-466c-a1b2-3044388359e9n@googlegroups.com>,
> Marcel Hendrix <mhx@iae.nl> wrote:
>> iForth is since long prepared for separated data and code, but I never enabled
>> it because it would mean introducing new/non-standard words for CREATE ..
>> DOES> and , C, etc.. Maybe in next year's release.
> I can't see that one has to introduce non-standard words.
> Also the changes to CODE ENDCODE ;CODE don't seem to be
> a big deal either. But you are right, only do it if it has
> benefits.
I have done it. The benefits are generally cleaner code and
preparation in case we are in fact forced to separate for
the newest ARM Apple computers.
As you know, the ciforths are generated from one source file,
controlled by macros using m4. This is i86 and AMD only.
An addition for separating the code and data sections must
keep the main builds for Windows and Linux working, i.e. the following
tests must pass:
make testlina64
make testlina32
make testwina64
make testwina32
These are built with fasm, one of the four assemblers foreseen.
The additions are made to the GNU assembler version.
That is controlled by the same lina64.cfg control file, but
the target in the Makefile is .s.
define( {_SEPARATED_}, _yes)dnl