mhx@iae.nl (mhx) writes:
However, a state machine has well defined rules based on a
state's stored information and its inputs, causing it to go to
another well-defined state while generating outputs. In that
context a goto is harmless and merely serves as a crutch when
there are not enough computing nodes to serve all states in
parallel. How to make such an efficient crutch in Forth?
You lost me. Why would one "serve all states in parallel"?
Anyway, if the question is how to implement a state machine
efficiently in Forth, one answer is to stay at a higher, more
structured level of abstraction or recreate it from the state machine.
E.g., don't transform a regular expression into a (indeterministic or deterministic) finite state machine, but instead interpret it directly (that's what Bernd Paysan's regexp.fs does). Or instead of
transforming a grammar into a push-down automaton, transform it into a structured Forth program (like Gray does).
If you cannot do that, in standard Forth you don't really have good
options. The best is probably to have the current state on the stack (probably in the form of the address of an array indexed with the
input (or whatever causes a state change) and containing the potential
next states at the appropriate elements.
In a particular implementation, you can do more, including goto-like
things. What I would do then is have a colon definition per state,
and do the transition to the next state as tail call. Have some
efficient forward-tail-call mechanism to allow calling states where
the definition comes later. Gforth has a clever FORWARD, but for now
that does not do tail-calls.
- anton
Use of a wordlist, whose wid is held in an immediate constant, enables
easy linkage between states at compile time, eg a typical state action
in outline is:
:noname <action>
if state-x [ >order ] next-state [ previous ]
else state-y [ >order ] next-state [ previous ]
; this-state >order to next-state previous
Disadvantages are:
- the Forth code doesn't give much idea of the overall operation of the
FSM (probably true for FSMs in general)
: run-fsm ( ad xt -- ) begin dup while execute repeat 2drop ;
Gerry Jackson <do-not-use@swldwa.uk> writes::) Well if you've never encountered this in (how many years has GForth
Use of a wordlist, whose wid is held in an immediate constant, enables
easy linkage between states at compile time, eg a typical state action
in outline is:
:noname <action>
if state-x [ >order ] next-state [ previous ]
else state-y [ >order ] next-state [ previous ]
; this-state >order to next-state previous
It means that Gforth will have to improve its SEE in order to point
out the differences between the different NEXT-STATEs. Currently I get:
/2 >order ok
next-state xt-see
noname :
dup @ #1 and
IF next-state
ELSE next-state
THEN ; ok
Disadvantages are:
- the Forth code doesn't give much idea of the overall operation of the
FSM (probably true for FSMs in general)
That's definitely the case here.
IIRC for Michael Gassanenko it was a
demonstration of filtering and backtracking,
but it's unclear to me
how that transfers to FSMs.
Anyway, let's look at the core:
: run-fsm ( ad xt -- ) begin dup while execute repeat 2drop ;
This means that every state has to return to this loop to dispatch the
next one. Now Gforth (development) has EXECUTE-EXIT, which allows tail-calling the next state for better efficiency.
I have worked out an example: https://www.complang.tuwien.ac.at/forth/programs/fsm-ae.4th
- anton
Here are the results on a Zen4:...
loop plain optimized implementation variant
1_278_763_454 1_175_241_846 1_175_505_964 cycles
3_651_376_357 2_441_376_030 2_045_832_844 instructions
For now I don't see why the whole thing takes so many cycles. I'll
take a closer look later.
I have worked out an example: https://www.complang.tuwien.ac.at/forth/programs/fsm-ae.4th
Let's first look at the example: The example recognizes and prints
numbers in a text and ignores everything else. It terminates when it
sees '$'. It has two states, one for being inside a number and one
for outside:
state outside-num
state inside-num
(Note that this is not the standar Forth word STATE).
Then we define transitions:
: out->out ( c-addr -- c-addr1 )
count outside-num transition ;
' out->out outside-num all-transitions
The out->out transition is the simplest one: It fetches the next char
(with COUNT) and switches to OUTSIDE-NUM. TRANSITION already starts
the dispatch for that state and the next char; this (and maybe also
COUNT) could be put in the general FSM interpreter (START-DFA), but by
having TRANSITION in the individual transition actions (e.g.,
OUT->OUT), the implementation is more flexible, as we will see.
At first OUT->OUT is put in transitions from OUTSIDE-NUM for all
characters using ALL-TANSITIONS; later the transitions of various
characters are overwritten:
' out->in '9' 1+ '0' outside-num some-transitions
' out->stop '$' outside-num one-transition
Note that the stack effect comment for out->out is from the start of
the word to the start of the next state-transition word; the actual
stack effect depends on the implementation of transition.
For more state transitions and the corresponding transition words see
the source code.
Example usage:
s" 123 abc 456 df$" drop outside-num start-dfa \ prints "123 456"
Now for the implementations: States are just arrays of xts, indexed by
the character, and the xt is that of the transition from the state
with that character.
The implementation without EXECUTE-EXIT looks as follows:
: transition ( c addr -- xt )
\ addr specifies the next state
]] swap th @ [[ ; immediate
: stop ( c-addr -- 0 )
drop 0 ;
: start-dfa ( c-addr addr -- )
swap count rot transition
begin ( ... xt )
execute dup
0= until
drop ;
TRANSITION could be a plain colon definition here, but it is a macro
in order to make it more competetive in Gforth with the EXECUTE-EXIT
variant. Here the termination is performed by replacing the next
c-addr with 0 and testing for 0 in the loop.
An alternative implementation is to use EXECUTE-EXIT to tail-call the
next transition:
: transition ( c addr -- )
]] swap th @ execute-exit [[ ; immediate
: stop ( -- )
\ let the ";" behind the STOP do the stopping
]] drop [[ ; immediate
: start-dfa ( c-addr addr -- )
\ let the dfa work on the string starting at c-addr, with initial
\ state addr
swap count rot transition ;
Here TRANSITION contains the EXECUTE in the form of EXECUTE-EXIT, and
so each transition word directly calls the next one, and no loop is necessary; with EXECUTE this would fill the return stack after a few
thousand transitions, but EXECUTE-EXIT takes the return address off
the return stack before EXECUTEing the next word and therefore can
perform an indefinite number of state transitions.
So how do we get out of the state machine? By not performing a
transition; instead we simply return to the caller of START-DFA.
I looked at the generated code and thought that we can get rid of the
SWAP in the transition code by using the generalized constant folding
feature of Gforth. This replaces the definition of TRANSITION with:
: noopt-transition-compile, ( xt -- )
\ basic compile, implementation for TRANSITION
drop ]] swap th @ execute-exit [[ ;
Sysop: | Keyop |
---|---|
Location: | Huddersfield, West Yorkshire, UK |
Users: | 546 |
Nodes: | 16 (2 / 14) |
Uptime: | 145:23:37 |
Calls: | 10,383 |
Calls today: | 8 |
Files: | 14,054 |
D/L today: |
2 files (1,861K bytes) |
Messages: | 6,417,685 |