• Re: F*/ (f-star-slash)

    From Krishna Myneni@21:1/5 to Krishna Myneni on Sun May 19 16:44:30 2024
    On 5/19/24 16:28, Krishna Myneni wrote:
    ...
    The code below, which requires a separate FP stack in memory, has been
    tested on kForth-64 and Gforth. ...

    === start of fstarslash.4th ===
    \ fstarslash.4th
    \
    \ Implementation of F*/ on 64-bit Forth
    \
    \ Krishna Myneni, 2024-05-19
    \
    \ F*/ ( F: r1 r2 r3 -- r )
    \ Multiply two IEEE-754 double precision floating point
    \ numbers r1 and r2 and divide by r3 to obtain the result,
    \ r. The intermediate product significand, r1*r2, is
    \ represented with at least IEEE quad precision for both
    \ its significand and exponent.
    \
    \ F*/ uses an intermediate significand represented by
    \   128 bits
    \
    ...
    \ Tests
    1 [IF]
    [UNDEFINED] T{ [IF] include ttester.4th [THEN]
    1e-17 rel-near f!
    1e-17 abs-near f!
    TESTING F*/
    \ These results are different using F* and F/
    t{  1.0e300  -3.0e259   1.0e280  f*/ -> -3.0e279 }t
    t{  1.0e-300  3.0e-259  1.0e-280 f*/ -> 3.0e-279 }t
    t{  1.0e300  -3.0e259  -1.0e280  f*/ ->  3.0e279 }t
    t{ -1.0e300  -3.0e259  -1.0e280  f*/ -> -3.0e279 }t
    t{  1.0e302  -3.0e302   DP_+INF  f*/ -> -0.0e0   }t
    t{ -1.0e-302 -1.0e-302  0.0e0    f*/ -> DP_+INF  }t

    \ The following should give same results as using F* and F/
    t{  0.0e0    3.0e302  1.0e-100 f*/ -> 0.0e0   }t
    t{  3.0e302  0.0e0    1.0e-100 f*/ -> 0.0e0   }t
    t{  1.0e302  -3.0e259 0.0e0    f*/ -> DP_-INF }t
    t{  DP_+INF  3.0e300  1.0e0    f*/ -> DP_+INF }t
    t{  DP_+INF  3.0e-300 -1.0e0   f*/ -> DP_-INF }t
    [THEN]


Just a note of caution. This code is preliminary, and I have not used
it for anything yet. There are only a few test cases, and it may well
have significant bugs in it.

    --
    KM

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to All on Sun May 19 16:28:12 2024
    The analog of */ for single length integers, with intermediate double
    length result, can be implemented for floating point arithmetic as well.
The following implementation of F*/ uses the non-standard word UD/MOD,
although it might be possible to code it with UM/MOD instead. It also
requires FP@ (and hence a separate FP stack in memory).

    How is F*/ different from using F* and F/ sequentially?

    Consider the following cases:

    1) 1.0e300 -3.0e259 F* 1.0e280 F/

    2) 1.0e-300 3.0e-259 F* 1.0e-280 F/

    3) 1.0e302 -3.0e302 F* DP_+INF F/

    4) -1.0e-302 -1.0e-302 F* 0.0e0 F/

    (DP_+INF is the double-precision IEEE-754 representation of plus infinity).

    The first two cases, on a double-precision FP system, result in -inf and
    zero, respectively.

    Cases 3 and 4 result in NaN (not a number).

A proper implementation of F*/ can evaluate the result to a number for
cases 1 and 2, recognizing that the final result is representable as a
double-precision number. For cases 3 and 4, F*/ recognizes that the
final results are signed zero and +Inf, respectively. To do so, F*/
represents the intermediate product with a wider number representation.
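As a cross-check (an illustrative sketch, not part of the Forth code): Python floats are IEEE-754 binary64, so the failure of the naive F* / F/ sequence in these cases can be reproduced directly. Note that Python raises an exception for division by a literal zero, so case 4 is shown only up to the underflow of the product.

```python
import math

# Case 1: the intermediate product overflows to -inf, even though the
# final quotient -3.0e279 is representable in double precision.
assert 1.0e300 * -3.0e259 == float("-inf")

# Case 2: the intermediate product underflows to zero, even though the
# final quotient 3.0e-279 is representable.
assert 1.0e-300 * 3.0e-259 == 0.0

# Case 3: after the product overflows to -inf, IEEE -inf/+inf is NaN,
# although F*/ can still deliver the signed-zero result.
assert math.isnan(float("-inf") / float("inf"))

# Case 4: the product underflows to +0.0 (below the smallest
# subnormal), so the subsequent division would be the IEEE invalid
# operation 0/0, giving NaN.
assert -1.0e-302 * -1.0e-302 == 0.0
```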

    The code below, which requires a separate FP stack in memory, has been
    tested on kForth-64 and Gforth. Gforth appears to have a problem
    representing signed zero, although it does represent plus and minus
    infinity.

    1.0e302 -3.0e302 DP_+INF f*/ fs. 0.00000000000000E0 ok

    ( should be minus zero in the above case).

The factoring out of _F*/ was done on purpose, so it can serve as a
basis for higher-precision calculations.

The tests do not currently cover subnormal numbers or NaNs.

    --
    Krishna Myneni

    === start of fstarslash.4th ===
    \ fstarslash.4th
    \
    \ Implementation of F*/ on 64-bit Forth
    \
    \ Krishna Myneni, 2024-05-19
    \
    \ F*/ ( F: r1 r2 r3 -- r )
    \ Multiply two IEEE-754 double precision floating point
    \ numbers r1 and r2 and divide by r3 to obtain the result,
    \ r. The intermediate product significand, r1*r2, is
    \ represented with at least IEEE quad precision for both
    \ its significand and exponent.
    \
    \ F*/ uses an intermediate significand represented by
    \ 128 bits
    \

    include ans-words

1 cells 8 < [IF]
    cr .( Requires 64-bit Forth system ) cr
    ABORT
[THEN]

    fvariable temp
: >fpstack ( u -- ) ( F: -- r )
    temp ! temp f@ ;

    base @ hex
    8000000000000000 constant DP_SIGNBIT_MASK
    000FFFFFFFFFFFFF constant DP_FRACTION_MASK
    7FF0000000000000 constant DP_EXPONENT_MASK
    \ DP special value constants
    7FF0000000000001 >fpstack fconstant DP_NAN
    FFF0000000000000 >fpstack fconstant DP_-INF
    7FF0000000000000 >fpstack fconstant DP_+INF
    base !

    52 constant DP_FRACTION_BITS
    1023 constant DP_EXPONENT_BIAS
    2047 constant DP_EXPONENT_MAX

    1 DP_FRACTION_BITS LSHIFT constant DP_IMPLIED_BIT
    63 constant DP_SIGN_BIT

: dp-fraction ( F: r -- r ) ( -- u )
    FP@ @ DP_FRACTION_MASK and ;
: dp-exponent ( F: r -- r ) ( -- e )
    FP@ @ DP_EXPONENT_MASK and DP_FRACTION_BITS rshift ;
: dp-sign ( F: r -- r ) ( -- sign )
    FP@ @ DP_SIGN_BIT rshift ;

    : _isZero? ( ufrac uexp -- b ) 0= swap 0= and ;
    : _isInf? ( ufrac uexp -- b ) DP_EXPONENT_MAX = swap 0= and ;
    : _isNaN? ( ufrac uexp -- b ) DP_EXPONENT_MAX = swap 0<> and ;
    : _isSubNormal? ( ufrac uexp -- b ) 0= swap 0<> and ;

    : frac>significand ( u -- u' ) DP_IMPLIED_BIT or ;

    \ Show the significand, unbiased base-2 exponent, and sign bit
    \ for a double-precision floating point number. To recover the
    \ dp floating point number, r, from these fields,
    \
    \ r = (-1)^sign * significand * 2^(exponent - 52)
    \
: dp-components. ( F: r -- )
    dp-sign
    dp-exponent
    dp-fraction
    2dup D0= invert IF frac>significand THEN
    swap DP_EXPONENT_BIAS - swap
    fdrop
    ." significand = " 18 u.r 2 spaces
    ." exponent = " 4 .r 2 spaces
    ." sign = " .
;

: dp-normal-fraction ( u -- expinc u' )
    dup DP_FRACTION_BITS 1+ rshift
    dup IF
      tuck rshift
    ELSE
      drop -1 swap  \ -1 u
      BEGIN
        1 lshift
        dup DP_IMPLIED_BIT and 0=
      WHILE
        swap 1- swap
      REPEAT
    THEN
;

    0 value r1_sign
    0 value r2_sign
    0 value r3_sign
    0 value num_sign
    variable r1_exp
    variable r2_exp
    variable r3_exp
    variable r_exp
    variable r1_frac
    variable r2_frac
    variable r3_frac

: get-arg-components ( F: r1 r2 r3 -- )
    dp-fraction r3_frac !
    dp-exponent r3_exp !
    dp-sign to r3_sign
    fdrop
    dp-fraction r2_frac !
    dp-exponent r2_exp !
    dp-sign to r2_sign
    fdrop
    dp-fraction r1_frac !
    dp-exponent r1_exp !
    dp-sign to r1_sign
    fdrop ;

    0 value b_num_zero
    0 value b_den_zero
    0 value b_num_Inf
    0 value b_den_Inf

    variable remainder

: _F*/ ( F: r1 r2 r3 -- ) ( -- usign uexponent udfraction true )
       \ or ( F: -- DP_NAN | +/- DP_INF ) ( -- false )
    get-arg-components

    \ Check for NaN in args
    r1_frac @ r1_exp @ _IsNaN?
    r2_frac @ r2_exp @ _IsNaN? or
    r3_frac @ r3_exp @ _IsNaN? or IF DP_NAN FALSE EXIT THEN

    r1_frac @ r1_exp @ _IsZero?
    r2_frac @ r2_exp @ _IsZero? or to b_num_zero
    r3_frac @ r3_exp @ _IsZero? to b_den_zero

    b_num_zero b_den_zero and IF DP_NAN FALSE EXIT THEN

    r1_sign r2_sign xor to num_sign

    r1_frac @ r1_exp @ _IsInf?
    r2_frac @ r2_exp @ _IsInf? or to b_num_Inf
    r3_frac @ r3_exp @ _IsInf? to b_den_Inf

    b_num_Inf b_den_Inf and IF DP_NAN FALSE EXIT THEN

    b_num_zero b_den_Inf or IF  \ return signed zero
      num_sign r3_sign xor
      0
      0 s>d TRUE EXIT
    THEN

    b_den_zero b_num_Inf or IF  \ return signed Inf
      DP_+INF
      num_sign r3_sign xor IF FNEGATE THEN
      FALSE EXIT
    THEN

    num_sign r3_sign xor
    r1_exp @ r2_exp @ + r3_exp @ -
    r1_frac @ frac>significand
    r2_frac @ frac>significand um*
    r3_frac @ frac>significand ud/mod
    rot remainder !
    TRUE
;

    variable uhigh

: F*/ ( F: r1 r2 r3 -- r )
    _F*/  \ ( -- usign uexp udfrac true ) or
          \ ( -- false ) ( F: NAN | +/-INF )
    IF
      2 pick 0 DP_EXPONENT_MAX WITHIN 0= IF
        -43 throw
      THEN
      dup IF  \ high 64 bits of the quotient
        uhigh !
        -43 throw
      ELSE
        drop
        2dup D0= IF  \ ufrac = 0 and uexp = 0
          drop  \ usign uexp
        ELSE
          \ usign
          \ quotient: normal double precision fraction
          dp-normal-fraction  \ usign uexp expinc uquot
          >r + DP_FRACTION_BITS lshift r> or
        THEN
        swap DP_SIGN_BIT lshift or
        >fpstack
      THEN
    THEN
;

    \ Tests
    1 [IF]
    [UNDEFINED] T{ [IF] include ttester.4th [THEN]
    1e-17 rel-near f!
    1e-17 abs-near f!
    TESTING F*/
    \ These results are different using F* and F/
t{  1.0e300  -3.0e259   1.0e280  f*/ -> -3.0e279 }t
t{  1.0e-300  3.0e-259  1.0e-280 f*/ -> 3.0e-279 }t
t{  1.0e300  -3.0e259  -1.0e280  f*/ ->  3.0e279 }t
t{ -1.0e300  -3.0e259  -1.0e280  f*/ -> -3.0e279 }t
t{  1.0e302  -3.0e302   DP_+INF  f*/ -> -0.0e0   }t
t{ -1.0e-302 -1.0e-302  0.0e0    f*/ -> DP_+INF  }t

    \ The following should give same results as using F* and F/
t{  0.0e0    3.0e302  1.0e-100 f*/ -> 0.0e0   }t
t{  3.0e302  0.0e0    1.0e-100 f*/ -> 0.0e0   }t
t{  1.0e302  -3.0e259 0.0e0    f*/ -> DP_-INF }t
t{  DP_+INF  3.0e300  1.0e0    f*/ -> DP_+INF }t
t{  DP_+INF  3.0e-300 -1.0e0   f*/ -> DP_-INF }t
    [THEN]
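For readers who want to check the approach outside Forth, here is a hedged Python sketch of the same idea (illustrative only, not the posted code; `decompose` and `fstarslash` are hypothetical names): decompose each double into sign, biased exponent, and fraction, widen the significands to 53-bit integers, form the 106-bit product with exact integer arithmetic, divide, and renormalize. Special values, subnormals, and rounding are omitted, so the result is truncated rather than rounded to nearest.

```python
import math
import struct

def decompose(r):
    """Return (sign, biased exponent, 52-bit fraction) of a double."""
    bits = struct.unpack("<Q", struct.pack("<d", r))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

def fstarslash(r1, r2, r3):
    """Sketch of F*/ for finite, normal, nonzero inputs with an
    in-range result; rounding is by truncation."""
    s1, e1, f1 = decompose(r1)
    s2, e2, f2 = decompose(r2)
    s3, e3, f3 = decompose(r3)
    implied = 1 << 52
    # Widen to 53-bit integer significands; the product needs 106 bits.
    q = (f1 | implied) * (f2 | implied) // (f3 | implied)
    e = e1 + e2 - e3          # the 1023 biases cancel out here
    # Renormalize q into [2^52, 2^53), adjusting the exponent.
    while q >= (1 << 53):
        q >>= 1
        e += 1
    while q < (1 << 52):
        q <<= 1
        e -= 1
    bits = ((s1 ^ s2 ^ s3) << 63) | (e << 52) | (q & ((1 << 52) - 1))
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

assert fstarslash(2.0, 3.0, 4.0) == 1.5   # exact case
assert math.isclose(fstarslash(1.0e300, -3.0e259, 1.0e280),
                    -3.0e279, rel_tol=1e-12)
```

The second assertion uses a tolerance because the truncating division can be off by one ulp, which is exactly the issue discussed later in this thread.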

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From minforth@21:1/5 to All on Sun May 19 23:28:30 2024
    Nice work! This failed here:

    $FFFFFFFFFFFFF S>F FCONSTANT MANTMAX \ full 52 bit mantissa
    t{ mantmax 65535e fdup f*/ -> mantmax }T

    But perhaps it is just me not thinking straight anymore...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to minforth on Sun May 19 22:35:51 2024
    On 5/19/24 18:28, minforth wrote:
    Nice work! This failed here:

    $FFFFFFFFFFFFF S>F FCONSTANT MANTMAX \ full 52 bit mantissa
    t{ mantmax 65535e fdup f*/ -> mantmax }T

    But perhaps it is just me not thinking straight anymore...

Your example with S>F is creating a number whose fraction does not
represent the maximum mantissa:

    HEX
    ok
    MANTMAX DP-FRACTION 40 BINARY U.R
    1111111111111111111111111111111111111111111111111110 ok

    Note the lowest bit in the field.

    MANTMAX dp-components.
    significand = 1FFFFFFFFFFFFE exponent = 33 sign = 0 ok

    I think you need to do the following:

    1FFFFFFFFFFFFF s>f fconstant MANTMAX ( note the leading 53rd bit )
    MANTMAX DP-FRACTION 40 BINARY U.R
    1111111111111111111111111111111111111111111111111111 ok
    DECIMAL
    ok
    MANTMAX dp-components.
    significand = 9007199254740991 exponent = 52 sign = 0 ok
    MANTMAX fs.
    9.00719925474099e+15 ok
    MANTMAX 65535e fdup f*/ fs.
    9.00719925474099e+15 ok

    The implied 53rd bit exists in the d.p. representation. It is only
    absent when the exponent is zero, and we have a subnormal number.
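This is easy to confirm by inspecting the raw encodings (an illustrative Python check; `dp_fields` is a hypothetical helper mirroring DP-EXPONENT and DP-FRACTION):

```python
import struct

def dp_fields(r):
    """Return (unbiased exponent, 52-bit fraction field) of a double."""
    bits = struct.unpack("<Q", struct.pack("<d", r))[0]
    return ((bits >> 52) & 0x7FF) - 1023, bits & ((1 << 52) - 1)

# $FFFFFFFFFFFFF = 2^52-1 has only 52 significant bits; normalized,
# its stored fraction is 51 ones followed by a trailing 0.
e, f = dp_fields(float(0xFFFFFFFFFFFFF))
assert (e, f) == (51, 0xFFFFFFFFFFFFE)

# $1FFFFFFFFFFFFF = 2^53-1 uses the implied bit as its leading 1 and
# fills all 52 stored fraction bits.
e, f = dp_fields(float(0x1FFFFFFFFFFFFF))
assert (e, f) == (52, 0xFFFFFFFFFFFFF)
```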

    --
    Krishna

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From minforth@21:1/5 to All on Mon May 20 05:11:22 2024
Okay, thanks, I should have looked it up. I was too tired yesterday...
Now even though the mantissa can still digest another bit, this still
fails:

    $1FFFFFFFFFFFFF S>F FCONSTANT MANTMAX
    t{ mantmax $ffff s>f fdup f*/ -> mantmax }t

    What did go wrong?

    The test is about expanding the mantissa to more than 53 bits by
    multiplication
    and shrinking it back by division. The intermediate 128-bit format
    should allow it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From mhx@21:1/5 to Krishna Myneni on Mon May 20 05:20:32 2024
    Krishna Myneni wrote:

    \ The following should give same results as using F* and F/
    t{  0.0e0    3.0e302  1.0e-100 f*/ -> 0.0e0   }t
    t{  3.0e302  0.0e0    1.0e-100 f*/ -> 0.0e0   }t
    t{  1.0e302  -3.0e259 0.0e0    f*/ -> DP_-INF }t
    t{  DP_+INF  3.0e300  1.0e0    f*/ -> DP_+INF }t
    t{  DP_+INF  3.0e-300 -1.0e0   f*/ -> DP_-INF }t

    What is the stack diagram of FP@ ?

    With IEEE extended floats in iForth:

    FORTH> 1.0e300 -3.0e259 F* 1.0e280 F/ e. -3.000000e279 ok
    FORTH> 1.0e-300 3.0e-259 F* 1.0e-280 F/ e. 3.000000e-279 ok
    FORTH> 1.0e302 -3.0e302 F* +INF F/ e. -0.000000e0 ok
    FORTH> -1.0e-302 -1.0e-302 F* 0.0e0 F/ e. +INF ok
    etc.

    Of course, this won't always work. What is the exponent range
    that should be supported?

    Of course, you considered F/* instead of F*/. What is your
    problem with that?

    -marcel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From albert@spenarnc.xs4all.nl@21:1/5 to mhx on Mon May 20 11:37:40 2024
    In article <09d2b252e92d9c88e67055fb6ee601d4@www.novabbs.com>,
    mhx <mhx@iae.nl> wrote:
    Krishna Myneni wrote:

    \ The following should give same results as using F* and F/
    t{  0.0e0    3.0e302  1.0e-100 f*/ -> 0.0e0   }t
    t{  3.0e302  0.0e0    1.0e-100 f*/ -> 0.0e0   }t
    t{  1.0e302  -3.0e259 0.0e0    f*/ -> DP_-INF }t
    t{  DP_+INF  3.0e300  1.0e0    f*/ -> DP_+INF }t
    t{  DP_+INF  3.0e-300 -1.0e0   f*/ -> DP_-INF }t

    What is the stack diagram of FP@ ?

    With IEEE extended floats in iForth:

    FORTH> 1.0e300 -3.0e259 F* 1.0e280 F/ e. -3.000000e279 ok
    FORTH> 1.0e-300 3.0e-259 F* 1.0e-280 F/ e. 3.000000e-279 ok
    FORTH> 1.0e302 -3.0e302 F* +INF F/ e. -0.000000e0 ok
    FORTH> -1.0e-302 -1.0e-302 F* 0.0e0 F/ e. +INF ok
    etc.

    Of course, this won't always work. What is the exponent range
    that should be supported?

    Of course, you considered F/* instead of F*/. What is your
    problem with that?

It is absolutely symmetric; that runs into the same problem.

In Philips mat-lab I fixed a defect in 5th-order interpolation.
They plugged in the electron charge as some 10E-19 Coulomb and got
overflow. The solution is not a fix to the floating point package,
but rather to use electron charges as the unit, or femto Coulomb.

    (10E-80 was too much for the IBM machine. They wrote some strange code
    that confused the FORTRAN optimiser. Boy, were they surprised when
    I pointed out the defect in the generated assembler code. Maybe first
    time they saw assembler code.)

    -marcel
    Groetjes Albert
    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat purring. - the Wise from Antrim -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to minforth on Mon May 20 07:22:32 2024
    On 5/20/24 00:11, minforth wrote:
    Okay, thanks, I should have looked it up. I was too tired yesterday...
    Now even when the mantissa still can digest another bit, this still
    fails:

    $1FFFFFFFFFFFFF S>F FCONSTANT MANTMAX
    t{ mantmax $ffff s>f fdup f*/ -> mantmax }t

    What did go wrong?

    The test is about expanding the mantissa to more than 53 bits by multiplication
    and shrinking it back by division. The intermediate 128-bit format
    should allow it.

    Good question. I can reproduce it here.

    hex
    ok
    1FFFFFFFFFFFFF S>F FCONSTANT MANTMAX
    ok
    decimal
    ok
    MANTMAX dp-components.
    significand = 9007199254740991 exponent = 52 sign = 0 ok
    hex
    ok
    MANTMAX FFFF S>F FDUP F*/
    ok
    decimal dp-components.
    significand = 9007199254740990 exponent = 52 sign = 0 ok


    The lowest bit is different in the significand. Need to look into why
    the lowest fraction bit is different.

    --
    Krishna

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to Krishna Myneni on Mon May 20 07:39:25 2024
    On 5/20/24 07:17, Krishna Myneni wrote:
    On 5/20/24 00:20, mhx wrote:
    ...
    Of course, you considered F/* instead of F*/. What is your
    problem with that?


The issue is you have three fp numbers on the fp stack, from some
source. If you can write a word which automatically selects between
doing the sequence F/ then F*, or the sequence F* then F/, so that the
intermediate quotient or product does not underflow or overflow, and
produces a result which fits into the d.p. range, you will have
effectively written F*/ (except for maybe a corner case or two).


Just for clarity: a word which automatically selects between the two
sequences, F/ then F* vs. F* then F/, will also have to re-order the
arguments so as to produce the intended F*/ on the original argument
sequence.

    --
    Krishna

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to mhx on Mon May 20 07:17:44 2024
    On 5/20/24 00:20, mhx wrote:
    Krishna Myneni wrote:

    \ The following should give same results as using F* and F/
    t{  0.0e0    3.0e302  1.0e-100 f*/ -> 0.0e0   }t
    t{  3.0e302  0.0e0    1.0e-100 f*/ -> 0.0e0   }t
    t{  1.0e302  -3.0e259 0.0e0    f*/ -> DP_-INF }t
    t{  DP_+INF  3.0e300  1.0e0    f*/ -> DP_+INF }t
    t{  DP_+INF  3.0e-300 -1.0e0   f*/ -> DP_-INF }t

    What is the stack diagram of FP@ ?

    FP@ ( -- a ) \ return address of top of floating point stack

    FP@ is the analog of SP@ but for the FP stack.



    With IEEE extended floats in iForth:

    FORTH> 1.0e300  -3.0e259  F*  1.0e280  F/ e. -3.000000e279  ok
    FORTH> 1.0e-300  3.0e-259 F*  1.0e-280 F/ e. 3.000000e-279  ok
    FORTH> 1.0e302  -3.0e302  F*  +INF     F/ e. -0.000000e0  ok
    FORTH> -1.0e-302 -1.0e-302 F* 0.0e0    F/ e. +INF  ok
    etc.

    Of course, this won't always work. What is the exponent range that
    should be supported?


    For this implementation, the arguments should be within the range of
    IEEE-754 double-precision floats, which, for normal numbers is a biased
    base-2 exponent range of [0,2047] -- the exponent bias is 1023.
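The bias and the special-value encodings at the ends of that range can be checked directly against the bit patterns (illustrative Python; `biased_exp` is a hypothetical helper):

```python
import struct

def biased_exp(r):
    """Biased base-2 exponent field of a double (bits 52..62)."""
    bits = struct.unpack("<Q", struct.pack("<d", r))[0]
    return (bits >> 52) & 0x7FF

assert biased_exp(1.0) == 1023            # 1.0 * 2^0, bias 1023
assert biased_exp(2.0) == 1024
assert biased_exp(float("inf")) == 2047   # all-ones field: Inf/NaN
assert biased_exp(5e-324) == 0            # subnormal: biased exponent 0
```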


    Of course, you considered F/* instead of F*/. What is your
    problem with that?


    The issue is you have three fp numbers on the fp stack, from some
    source. If you can write a word which automatically selects between
    doing the sequence, F/ and F*, or the sequence, F* and F/, so that the intermediate quotient or product does not underflow or overflow, and
    produce a result which fits into the d.p. range, you will have
    effectively written F*/ (except for maybe a corner case or two).

    --
    Krishna

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From minforth@21:1/5 to All on Mon May 20 13:40:31 2024
AFAIK IEEE 754 specifies that arithmetic on integers (numbers without
fractional places) that fit within the width of the mantissa must be
EXACT, i.e. it behaves like integer arithmetic, without extra
rounding.
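More precisely, the operation is exact only when the true result also fits in the mantissa; once the product needs more than 53 bits, rounding is unavoidable. An illustrative Python check (binary64 floats):

```python
# Integers are exactly representable in binary64 up to 2^53, and
# +, -, * on them are exact whenever the true result also fits.
assert float(2**53) == 2**53
assert float(2**53 + 1) == 2**53       # first integer that must round
assert 3.0 * 7.0 == 21.0               # exact: well within 53 bits
# The product in the failing test needs 53 + 16 = 69 bits, so a
# single F* has to round it:
assert float((2**53 - 1) * 0xFFFF) != (2**53 - 1) * 0xFFFF
```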

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to minforth on Mon May 20 18:12:24 2024
    On 5/20/24 08:40, minforth wrote:
    AFAIK IEEE 754 specifies that arithmetic with integers without
    decimal places within the width of the mantissa must be EXACT,
    i.e. it behaves like integer arithmetic, i.e. without extra
    rounding.

    I think you are looking for the following in integer arithmetic.

    hex FFFFFFFFFFFFF decimal 68 binary u.r
    1111111111111111111111111111111111111111111111111111 ok
hex FFFFFFFFFFFFF FFFF um* decimal 68 binary ud.r
11111111111111101111111111111111111111111111111111110000000000000001 ok
    hex FFFFFFFFFFFFF FFFF um* FFFF ud/mod decimal 68 binary ud.r drop decimal
    1111111111111111111111111111111111111111111111111111 ok

    The original 52 bit pattern is restored.
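The same round trip can be checked with unbounded integers (illustrative Python, mirroring UM* and UD/MOD):

```python
m = 0xFFFFFFFFFFFFF           # the 52-bit pattern (2^52 - 1)
prod = m * 0xFFFF             # double-length product (UM* analogue)
q, r = divmod(prod, 0xFFFF)   # UD/MOD analogue
assert (q, r) == (m, 0)       # the original bit pattern is restored
```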

I'm not sure S>F will encode the value as a subnormal to avoid
fractional places within the width of the mantissa. A subnormal has a
biased exponent of zero and a nonzero fraction.

    --
    Krishna

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From mhx@21:1/5 to Krishna Myneni on Mon May 20 23:04:53 2024
    Krishna Myneni wrote:
    [...]
The IEEE extended float uses a 15 bit exponent, while it is only 11 bits
for IEEE double precision. This is why you are not seeing a problem
above. However, it will show up once your arithmetic
overflows/underflows the extended float exponent range.

That's why I was curious if you had a specific range of exponents
in mind. Is it possible to formulate which combinations of x,y,z
do NOT give a result that fits a double-precision float, and can NOT
be fixed with x,z,y?

    E.g., when x is very large, y/z must be near 1, and therefore there
    would be no overflow problem when first doing y/z and then x*(y/z).

    Or maybe it is possible to use 'rational numbers' and simply not
    evaluate fractions until the very end?

    -marcel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to mhx on Mon May 20 17:34:26 2024
    On 5/20/24 00:20, mhx wrote:
    Krishna Myneni wrote:


    What is the stack diagram of FP@ ?

Sorry, the code given in my original post does not use FP@ -- I used
it earlier, but >FPSTACK uses a more portable method of moving a 64-bit
bit pattern on the data stack to the fp stack.

    With IEEE extended floats in iForth:

    FORTH> 1.0e300  -3.0e259  F*  1.0e280  F/ e. -3.000000e279  ok
    FORTH> 1.0e-300  3.0e-259 F*  1.0e-280 F/ e. 3.000000e-279  ok
    FORTH> 1.0e302  -3.0e302  F*  +INF     F/ e. -0.000000e0  ok
    FORTH> -1.0e-302 -1.0e-302 F* 0.0e0    F/ e. +INF  ok
    etc.


    The IEEE extended float uses a 15 bit exponent, while it is only 11 bits
    for IEEE double precision. This is why you are not seeing a problem
    above. However, it will show up once your arithmetic
    overflows/underflows the extended float exponent range.

    --
    Krishna

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to PMF on Mon May 20 18:27:51 2024
    On 5/20/24 08:06, PMF wrote:
    Krishna Myneni wrote:

    On 5/20/24 00:11, minforth wrote:
    Okay, thanks, I should have looked it up. I was too tired
    yesterday... Now even when the mantissa still can digest another bit,
    this still
    fails:

    $1FFFFFFFFFFFFF S>F FCONSTANT MANTMAX
    t{ mantmax $ffff s>f fdup f*/ -> mantmax }t

    What did go wrong?

    The test is about expanding the mantissa to more than 53 bits by
    multiplication
    and shrinking it back by division. The intermediate 128-bit format
    should allow it.

    Good question. I can reproduce it here.

    hex
      ok
    1FFFFFFFFFFFFF S>F FCONSTANT MANTMAX
      ok
    decimal
      ok
    MANTMAX dp-components.
    significand =   9007199254740991  exponent =   52  sign = 0  ok
    hex
      ok
    MANTMAX FFFF S>F FDUP F*/
      ok
    decimal dp-components.
    significand =   9007199254740990  exponent =   52  sign = 0  ok


    The lowest bit is different in the significand. Need to look into why
    the lowest fraction bit is different.

    It looks like you are truncating the result and not rounding to nearest
    which is expected by F*  and F/.
    You save the remainder but do not use it for anything!


I think you're correct, but the rounding used by F*/ should follow
whatever rounding mode the FPU is set to. In kForth, the FPU is always
initialized to round to nearest. Since we are doing the calculation for
F*/ in integer arithmetic, it is not observing the round-to-nearest mode
of the FPU.
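A truncated integer quotient can be corrected to round-to-nearest (ties to even) using the saved remainder; a minimal sketch (`round_quotient` is a hypothetical helper, not from the posted code):

```python
def round_quotient(q, r, divisor):
    """Round-to-nearest-even correction of a truncated quotient q
    with remainder r (0 <= r < divisor)."""
    twice = 2 * r
    if twice > divisor or (twice == divisor and q & 1):
        q += 1
    return q

assert round_quotient(10, 1, 4) == 10   # 41/4 = 10.25 -> 10
assert round_quotient(10, 3, 4) == 11   # 43/4 = 10.75 -> 11
assert round_quotient(10, 2, 4) == 10   # 42/4 = 10.5 ties to even -> 10
assert round_quotient(11, 2, 4) == 12   # 46/4 = 11.5 ties to even -> 12
```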

    --
    Krishna

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to mhx on Tue May 21 06:52:24 2024
    mhx@iae.nl (mhx) writes:
That's why I was curious if you had a specific range of exponents
in mind. Is it possible to formulate which combinations of x,y,z
do NOT give a result that fits a double-precision float, and can NOT
be fixed with x,z,y?

    E.g., when x is very large, y/z must be near 1, and therefore there
    would be no overflow problem when first doing y/z and then x*(y/z).

Basically we have a multiplication of three factors: x, y, and 1/z;
and the assumption is that the product of all three is in range, while
some product of two of the factors is out of range.

    It seems to me that this can be solved by sorting the three factors
    into a>b>c. Then you can avoid the intermediate overflow by
    performing the computation as (a*c)*b.

This approach can be taken to more than three factors, but you don't
need to do full sorting: Divide the factors into a set that is <1 and
a set that is >1. Start with an intermediate result p = 1. Now if
p<1, take a value v from the set >1, otherwise take v from the set
<1. Compute p = p*v. Repeat until one of the sets is empty, then
multiply in the remaining numbers from the other set.
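The two-set scheme can be sketched as follows (illustrative Python; `stable_product` is a hypothetical name, and the example uses exact powers of two so the assertions are exact):

```python
def stable_product(factors):
    """Keep the running product near 1 by alternating between
    factors of magnitude < 1 and >= 1. Assumes finite, nonzero
    factors and an in-range final result."""
    small = [f for f in factors if abs(f) < 1.0]
    big = [f for f in factors if abs(f) >= 1.0]
    p = 1.0
    while small and big:
        p *= big.pop() if abs(p) < 1.0 else small.pop()
    for f in small + big:   # one set is empty; multiply the rest
        p *= f
    return p

# Left-to-right order overflows; the balanced order does not.
xs = [2.0**1000, 2.0**1000, 2.0**-1000, 2.0**-1000]
naive = 1.0
for f in xs:
    naive *= f
assert naive == float("inf")
assert stable_product(xs) == 1.0
```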

    Or maybe it is possible to use 'rational numbers' and simply not
    evaluate fractions until the very end?

    Rational numbers in programming (e.g., in Prolog II) are just
    represented as fractions of unlimited-size integers. I don't think
    that you have that in mind.

    What you may have in mind and solves the problem: From the numbers x,
    y, and z extract the exponents xe, ye, ze, and replace them with 0 in
    the numbers, resulting in xm, ym, and zm. Now compute rm=xm*ym/zm,
    and re=xe+ye-ze, and finally add re to the exponent of rm, giving the
    result r.
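This exponent-stripping scheme maps directly onto frexp/ldexp; a hedged sketch (`fstarslash_frexp` is an illustrative name, and unlike a 128-bit intermediate this version still incurs two double roundings, hence the tolerance):

```python
import math

def fstarslash_frexp(x, y, z):
    """Strip exponents, combine the mantissas (all in [0.5, 1)),
    then restore the combined exponent. Finite nonzero inputs only."""
    xm, xe = math.frexp(x)
    ym, ye = math.frexp(y)
    zm, ze = math.frexp(z)
    rm = xm * ym / zm          # magnitude in (0.25, 2): cannot overflow
    return math.ldexp(rm, xe + ye - ze)

r = fstarslash_frexp(1.0e300, -3.0e259, 1.0e280)
assert math.isclose(r, -3.0e279, rel_tol=1e-12)
```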

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2023: https://euro.theforth.net/2023

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From albert@spenarnc.xs4all.nl@21:1/5 to Anton Ertl on Tue May 21 09:33:11 2024
    In article <2024May21.085224@mips.complang.tuwien.ac.at>,
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    <SNIP>
    Basically we have a multiplication of three factors: x, y, and 1/z;
    and the assumption is that the product of all three is in range, while
    there is one product of one of the factors that is out of range.

    It seems to me that this can be solved by sorting the three factors
    into a>b>c. Then you can avoid the intermediate overflow by
    performing the computation as (a*c)*b.

    Make that |a|>|b|>|c|
    - anton

    Groetjes Albert
    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat purring. - the Wise from Antrim -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From mhx@21:1/5 to Anton Ertl on Tue May 21 09:03:49 2024
    Anton Ertl wrote:

    [..]
    It seems to me that this can be solved by sorting the three factors
    into a>b>c. Then you can avoid the intermediate overflow by
    performing the computation as (a*c)*b.

    Elegant! This F3SORT, enhanced with Albert's comment further down,
    looks like a useful and cheap factor in case I ever encounter
    a problem with intermediate FP overflow.
    [..]
    Or maybe it is possible to use 'rational numbers' and simply not
    evaluate fractions until the very end?

    Rational numbers in programming (e.g., in Prolog II) are just
    represented as fractions of unlimited-size integers. I don't think
    that you have that in mind.

    For some reason, MATLAB / Octave uses 'characters' for this extension (https://nl.mathworks.com/help/matlab/ref/rat.html). Infinite
    precision creates problems where a rational approximation
    does not exist.
    [ BTW: here octave acts a bit surprising (but strictly correct):
    octave:1> R=rat(pi,1e-20)
    R = 3 + 1/(7 + 1/(16 + 1/(-294 + 1/(3 + 1/(-4 + 1/(5 + 1/(-15 +
    1/(-3))))))))
    octave:2> R=rat(pi,1e-32)
    R = 3 + 1/(7 + 1/(16 + 1/(-294 + 1/(3 + 1/(-4 + 1/(5 + 1/(-15 +
    1/(-3))))))))
    ]
    What you may have in mind and solves the problem: From the numbers x,
    y, and z extract the exponents xe, ye, ze, and replace them with 0 in
    the numbers, resulting in xm, ym, and zm. Now compute rm=xm*ym/zm,
    and re=xe+ye-ze, and finally add re to the exponent of rm, giving the
    result r.

That was indeed my first idea when I saw the OP's post. DNW wrote a
nice toolkit to dissect FP numbers. I should check if it uncovers the
problem encountered in this thread.

    -marcel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to mhx on Tue May 21 07:27:57 2024
    On 5/21/24 04:03, mhx wrote:
    Anton Ertl wrote:

    [..]
    It seems to me that this can be solved by sorting the three factors
    into a>b>c.  Then you can avoid the intermediate overflow by
    performing the computation as (a*c)*b.

    Elegant! This F3SORT, enhanced with Albert's comment further down, looks
    like a useful and cheap factor in case I ever encounter a problem with intermediate FP overflow.
    [..]
    Or maybe it is possible to use 'rational numbers' and simply not
    evaluate fractions until the very end?

    Rational numbers in programming (e.g., in Prolog II) are just
    represented as fractions of unlimited-size integers.  I don't think
    that you have that in mind.

    For some reason, MATLAB / Octave uses 'characters' for this extension (https://nl.mathworks.com/help/matlab/ref/rat.html). Infinite precision creates problems where a rational approximation
    does not exist. [ BTW: here octave acts a bit surprising (but strictly correct): octave:1> R=rat(pi,1e-20)
    R = 3 + 1/(7 + 1/(16 + 1/(-294 + 1/(3 + 1/(-4 + 1/(5 + 1/(-15 +
    1/(-3))))))))
    octave:2> R=rat(pi,1e-32)
    R = 3 + 1/(7 + 1/(16 + 1/(-294 + 1/(3 + 1/(-4 + 1/(5 + 1/(-15 +
    1/(-3))))))))
    ]
    What you may have in mind and solves the problem: From the numbers x,
    y, and z extract the exponents xe, ye, ze, and replace them with 0 in
    the numbers, resulting in xm, ym, and zm.  Now compute rm=xm*ym/zm,
    and re=xe+ye-ze, and finally add re to the exponent of rm, giving the
    result r.

    That was indeed my first idea when I saw the OP's post. DNW wrote a
    nice
    toolkit to dissect FP numbers. I should check if it uncovers the problem encountered in this thread.


    Remember that you will also have to deal with IEEE 754 special values
    like Inf and NaN. It will be interesting to compare the efficiency of
    both my approach and your sorting approach. I'm skeptical that the
    additional sorting will make the equivalent calculation faster.

    --
    Krishna

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From minforth@21:1/5 to Krishna Myneni on Tue May 21 19:17:20 2024
    Krishna Myneni wrote:
    MANTMAX dp-components.
    significand =   9007199254740991  exponent =   52  sign = 0  ok
    hex
      ok
    MANTMAX FFFF S>F FDUP F*/
      ok
    decimal dp-components.
    significand =   9007199254740990  exponent =   52  sign = 0  ok

    The lowest bit is different in the significand. Need to look into why
    the lowest fraction bit is different.

    It looks like you are truncating the result and not rounding to nearest
    which is expected by F*  and F/.
    You save the remainder but do not use it for anything!


    I think you're correct, but the rounding used by F*/ should follow whatever rounding mode the FPU is set to. In kForth, the FPU is always
    initialized to round-to-nearest. Since we are doing the calculation for F*/ in integer arithmetic, it does not observe the FPU's
    round-to-nearest mode.
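    The saved remainder is exactly what a round-to-nearest step needs. A minimal Python sketch of nearest-even rounding for positive integer operands (not the kForth code itself):

```python
def div_round_nearest(num, den):
    # Round-to-nearest, ties-to-even, for positive integers.
    # The remainder that truncating division discards is exactly
    # what decides whether the quotient must be bumped by one.
    q, r = divmod(num, den)
    if 2 * r > den or (2 * r == den and q & 1):
        q += 1
    return q
```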

    Manual cross-check with quad precision floats:

    MinForth 3.6 (64 bit) (fp 128 bit)
    # $1FFFFFFFFFFFFF S>F FCONSTANT MANTMAX ok
    # mantmax ok
    f: 9.0072e+15 | # $FFFF s>f ok
    f: 9.0072e+15 65535 | # f* ok
    f: 5.90287e+20 | # fdup f>d ok
    f: 5.90287e+20 | -9007199254806527 31 # hex ok
    f: 5.90287e+20 | FFDFFFFFFFFF0001 1F $ decimal 65535e f/ ok
    f: 9.0072e+15 | -9007199254806527 31 # 2drop f>s ok
    9007199254740991 # hex ok
    1FFFFFFFFFFFFF $

    So for the 64-bit F*/ challenge, the interim significand after the multiplication step should be $1F_FFDF_FFFF_FFFF_0001.

    It is now easy to check whether the erroneous 1-digit offset was caused
    by the multiplication step or the division step.
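    The quad-precision dump above can also be cross-checked with exact integer arithmetic (a sketch; mantmax here stands for the 53-bit significand $1FFFFFFFFFFFFF, not the float):

```python
mantmax = 0x1FFFFFFFFFFFFF         # 53-bit significand of MANTMAX
prod = mantmax * 0xFFFF            # exact interim product
assert prod == 0x1FFFDFFFFFFFFF0001
# In this particular pairing the division is exact, so any 1-bit
# offset must come from shifting/truncation elsewhere:
assert prod % 0xFFFF == 0
assert prod // 0xFFFF == mantmax
```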

  • From minforth@21:1/5 to All on Wed May 22 18:09:50 2024
    NaNs can be surprising. I read in Wikipedia:

    Encodings of qNaN and sNaN are not completely specified in
    IEEE 754 and depend on the processor. Most processors, such
    as the x86 family and the ARM family processors, use the most
    significant bit of the significand field to indicate a quiet
    NaN; this is what is recommended by IEEE 754. The PA-RISC
    processors use the bit to indicate a signaling NaN.

    IOW, tests with filled significands had better leave the 53rd bit alone.
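    This is easy to check on a given machine (a Python sketch; on x86 and ARM the quiet bit is the top stored fraction bit, bit 51 of the 64-bit pattern):

```python
import struct

def dp_bits(x):
    # Raw 64-bit IEEE 754 pattern of a Python float
    return struct.unpack('<Q', struct.pack('<d', x))[0]

qnan = dp_bits(float('nan'))
print(hex(qnan))                 # typically 0x7ff8000000000000 on x86
assert (qnan >> 51) & 1 == 1     # quiet bit set on x86/ARM
```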

  • From Anton Ertl@21:1/5 to Krishna Myneni on Wed May 22 16:31:45 2024
    Krishna Myneni <krishna.myneni@ccreweb.org> writes:
    On 5/21/24 04:03, mhx wrote:
    Anton Ertl wrote:

    [..]
    It seems to me that this can be solved by sorting the three factors
    into a>b>c.  Then you can avoid the intermediate overflow by
    performing the computation as (a*c)*b.
    ...
    Remember that you will also have to deal with IEEE 754 special values
    like Inf and NaN.

    Not a problem. If any operand is a NaN, the result will be NaN no
    matter how the operations are associated. For infinities (and 0 as
    divisor), I would analyse it by looking at all cases, but I don't see
    that it makes any difference:

    Variable names here represent finite non-zero values:

    (inf*y)/z=inf/z=inf
    inf*(y/z)=inf*finite=inf
    y*(inf/z)=y*inf=inf

    Likewise if x is finite and y is infinite

    (x*y)/inf=finite/inf=0
    x*(y/inf)=x*0=0
    y*(x/inf)=y*0=0

    (x*y)/0=finite/0=inf
    x*(y/0)=x*inf=inf
    y*(x/0)=y*inf=inf

    Signs in all these cases follow the same rules whether infinities are
    involved or not.
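    The case analysis can be spot-checked numerically (a Python sketch; note Python raises ZeroDivisionError for float division by zero, unlike IEEE hardware, so the 0-divisor rows are omitted):

```python
inf = float('inf')

def associations(x, y, z):
    # The three ways of associating x*y/z from the analysis above
    return ((x * y) / z, x * (y / z), y * (x / z))

# x infinite, y and z finite non-zero: every association gives inf
assert associations(inf, 2.0, 3.0) == (inf, inf, inf)
# infinite divisor: finite/inf collapses to 0 in every association
assert associations(2.0, 3.0, inf) == (0.0, 0.0, 0.0)
```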

    It will be interesting to compare the efficiency of
    both my approach and your sorting approach. I'm skeptical that the
    additional sorting will make the equivalent calculation faster.

    Actually sorting is overkill:

    : fsort2 ( r1 r2 -- r3 r4 )
    \ |r3|>=|r4|
    fover fabs fover fabs f< if
    fswap
    then ;

    : f*/ ( r1 r2 r3 -- r )
    fdup fabs 1e f> fswap frot fsort2 if
    fswap then
    frot f/ f* ;
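    Read off the stack comments, the net effect of this F*/ is: when |r3| > 1, divide the larger-magnitude factor by r3 first (shrinking it); otherwise divide the smaller one first (growing it). A hedged Python rendering of that logic:

```python
def fstarslash(r1, r2, r3):
    big, small = (r1, r2) if abs(r1) >= abs(r2) else (r2, r1)
    if abs(r3) > 1.0:
        return small * (big / r3)   # shrink the big factor first
    return big * (small / r3)       # grow the small factor first

# Both of these would overflow/underflow as a plain r1*r2/r3:
print(fstarslash(1.0e300, -3.0e259, 1.0e280))    # about -3.0e279
print(fstarslash(1.0e-300, 3.0e-259, 1.0e-280))  # about 3.0e-279
```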

    I have tested this with your tests from
    <v2dqte$3ir32$1@dont-email.me>, but needed to change rel-near (I
    changed it to 1e-16) for gforth to pass your tests. I leave
    performance testing to you. Here's what vfx64 produces for this F*/:

    see f*/
    F*/
    ( 0050A310 D9C0 ) FLD ST
    ( 0050A312 D9E1 ) FABS
    ( 0050A314 D9E8 ) FLD1
    ( 0050A316 E8F5BEFFFF ) CALL 00506210 F>
    ( 0050A31B D9C9 ) FXCH ST(1)
    ( 0050A31D D9C9 ) FXCH ST(1)
    ( 0050A31F D9CA ) FXCH ST(2)
    ( 0050A321 E88AFFFFFF ) CALL 0050A2B0 FSORT2
    ( 0050A326 4885DB ) TEST RBX, RBX
    ( 0050A329 488B5D00 ) MOV RBX, [RBP]
    ( 0050A32D 488D6D08 ) LEA RBP, [RBP+08]
    ( 0050A331 0F8402000000 ) JZ/E 0050A339
    ( 0050A337 D9C9 ) FXCH ST(1)
    ( 0050A339 D9C9 ) FXCH ST(1)
    ( 0050A33B D9CA ) FXCH ST(2)
    ( 0050A33D DEF9 ) FDIVP ST(1), ST
    ( 0050A33F DEC9 ) FMULP ST(1), ST
    ( 0050A341 C3 ) RET/NEXT
    ( 50 bytes, 18 instructions )
    ok
    see fsort2
    FSORT2
    ( 0050A2B0 D9C1 ) FLD ST(1)
    ( 0050A2B2 D9E1 ) FABS
    ( 0050A2B4 D9C1 ) FLD ST(1)
    ( 0050A2B6 D9E1 ) FABS
    ( 0050A2B8 E863BEFFFF ) CALL 00506120 F<
    ( 0050A2BD 4885DB ) TEST RBX, RBX
    ( 0050A2C0 488B5D00 ) MOV RBX, [RBP]
    ( 0050A2C4 488D6D08 ) LEA RBP, [RBP+08]
    ( 0050A2C8 0F8402000000 ) JZ/E 0050A2D0
    ( 0050A2CE D9C9 ) FXCH ST(1)
    ( 0050A2D0 C3 ) RET/NEXT
    ( 33 bytes, 11 instructions )
    ok
    see f<
    F<
    ( 00506120 E86BFEFFFF ) CALL 00505F90 FCMP2
    ( 00506125 4881FB00010000 ) CMP RBX, # 00000100
    ( 0050612C 0F94C3 ) SETZ/E BL
    ( 0050612F F6DB ) NEG BL
    ( 00506131 480FBEDB ) MOVSX RBX, BL
    ( 00506135 C3 ) RET/NEXT
    ( 22 bytes, 6 instructions )
    ok
    see fcmp2
    FCMP2
    ( 00505F90 4883ED08 ) SUB RBP, # 08
    ( 00505F94 48895D00 ) MOV [RBP], RBX
    ( 00505F98 D9C9 ) FXCH ST(1)
    ( 00505F9A DED9 ) FCOMPP
    ( 00505F9C 9B ) FWAIT
    ( 00505F9D DFE0 ) FSTSW AX
    ( 00505F9F 66250041 ) AND AX, # 4100
    ( 00505FA3 480FB7D8 ) MOVZX RBX, AX
    ( 00505FA7 C3 ) RET/NEXT
    ( 24 bytes, 9 instructions )

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2023: https://euro.theforth.net/2023

  • From Krishna Myneni@21:1/5 to minforth on Mon May 27 13:48:05 2024
    On 5/21/24 14:17, minforth wrote:
    Krishna Myneni wrote:
    MANTMAX dp-components.
    significand =   9007199254740991  exponent =   52  sign = 0  ok
    hex
      ok
    MANTMAX FFFF S>F FDUP F*/
      ok
    decimal dp-components.
    significand =   9007199254740990  exponent =   52  sign = 0  ok

    The lowest bit is different in the significand. Need to look into
    why the lowest fraction bit is different.

    It looks like you are truncating the result and not rounding to nearest
    which is expected by F*  and F/.
    You save the remainder but do not use it for anything!


    I think you're correct, but the rounding used by F*/ should be to
    whatever rounding mode the FPU is set. In kForth, the FPU is always
    initialized to round to nearest. Since we are doing the calculation for

    F*/ in integer arithmetic, it is not observing the round-to-nearest mode

    of the FPU.

    Manual cross-check with quad precision floats:

    MinForth 3.6 (64 bit) (fp 128 bit)
    # $1FFFFFFFFFFFFF S>F FCONSTANT MANTMAX  ok
    # mantmax  ok
    f: 9.0072e+15 | # $FFFF s>f  ok
    f: 9.0072e+15 65535 | # f*  ok
    f: 5.90287e+20 | # fdup f>d  ok
    f: 5.90287e+20 | -9007199254806527 31 # hex  ok
    f: 5.90287e+20 | FFDFFFFFFFFF0001 1F $ decimal 65535e f/  ok
    f: 9.0072e+15 | -9007199254806527 31 # 2drop f>s  ok
    9007199254740991 # hex  ok
    1FFFFFFFFFFFFF $

    So for the 64-bit F*/ challenge, the interim significand after the multiplication step should be $1F_FFDF_FFFF_FFFF_0001.

    It is now easy to check whether the erroneous 1-digit offset was caused
    by the multiplication step or the division step.

    Ok. Will go back and compare.

    --
    Krishna

  • From Krishna Myneni@21:1/5 to Anton Ertl on Mon May 27 13:45:38 2024
    On 5/22/24 11:31, Anton Ertl wrote:
    Krishna Myneni <krishna.myneni@ccreweb.org> writes:
    On 5/21/24 04:03, mhx wrote:
    Anton Ertl wrote:

    [..]
    It seems to me that this can be solved by sorting the three factors
    into a>b>c.  Then you can avoid the intermediate overflow by
    performing the computation as (a*c)*b.
    ...
    Remember that you will also have to deal with IEEE 754 special values
    like Inf and NaN.

    Not a problem. If any operand is a NaN, the result will be NaN no
    matter how the operations are associated. For infinities (and 0 as
    divisor), I would analyse it by looking at all cases, but I don't see
    that it makes any difference:

    Variable names here represent finite non-zero values:

    (inf*y)/z=inf/z=inf
    inf*(y/z)=inf*finite=inf
    y*(inf/z)=y*inf=inf

    Likewise if x is finite and y is infinite

    (x*y)/inf=finite/inf=0
    x*(y/inf)=x*0=0
    y*(x/inf)=y*0=0

    (x*y)/0=finite/0=inf
    x*(y/0)=x*inf=inf
    y*(x/0)=y*inf=inf

    Signs in all these cases follow the same rules whether infinities are involved or not.

    It will be interesting to compare the efficiency of
    both my approach and your sorting approach. I'm skeptical that the
    additional sorting will make the equivalent calculation faster.

    Actually sorting is overkill:

    : fsort2 ( r1 r2 -- r3 r4 )
    \ |r3|>=|r4|
    fover fabs fover fabs f< if
    fswap
    then ;

    : f*/ ( r1 r2 r3 -- r )
    fdup fabs 1e f> fswap frot fsort2 if
    fswap then
    frot f/ f* ;

    I have tested this with your tests from
    <v2dqte$3ir32$1@dont-email.me>, but needed to change rel-near (I
    changed it to 1e-16) for gforth to pass your tests. I leave
    performance testing to you. Here's what vfx64 produces for this F*/:

    see f*/
    F*/
    ( 0050A310 D9C0 ) FLD ST
    ( 0050A312 D9E1 ) FABS
    ( 0050A314 D9E8 ) FLD1
    ( 0050A316 E8F5BEFFFF ) CALL 00506210 F>
    ( 0050A31B D9C9 ) FXCH ST(1)
    ( 0050A31D D9C9 ) FXCH ST(1)
    ( 0050A31F D9CA ) FXCH ST(2)
    ( 0050A321 E88AFFFFFF ) CALL 0050A2B0 FSORT2
    ( 0050A326 4885DB ) TEST RBX, RBX
    ( 0050A329 488B5D00 ) MOV RBX, [RBP]
    ( 0050A32D 488D6D08 ) LEA RBP, [RBP+08]
    ( 0050A331 0F8402000000 ) JZ/E 0050A339
    ( 0050A337 D9C9 ) FXCH ST(1)
    ( 0050A339 D9C9 ) FXCH ST(1)
    ( 0050A33B D9CA ) FXCH ST(2)
    ( 0050A33D DEF9 ) FDIVP ST(1), ST
    ( 0050A33F DEC9 ) FMULP ST(1), ST
    ( 0050A341 C3 ) RET/NEXT
    ( 50 bytes, 18 instructions )
    ok
    see fsort2
    FSORT2
    ( 0050A2B0 D9C1 ) FLD ST(1)
    ( 0050A2B2 D9E1 ) FABS
    ( 0050A2B4 D9C1 ) FLD ST(1)
    ( 0050A2B6 D9E1 ) FABS
    ( 0050A2B8 E863BEFFFF ) CALL 00506120 F<
    ( 0050A2BD 4885DB ) TEST RBX, RBX
    ( 0050A2C0 488B5D00 ) MOV RBX, [RBP]
    ( 0050A2C4 488D6D08 ) LEA RBP, [RBP+08]
    ( 0050A2C8 0F8402000000 ) JZ/E 0050A2D0
    ( 0050A2CE D9C9 ) FXCH ST(1)
    ( 0050A2D0 C3 ) RET/NEXT
    ( 33 bytes, 11 instructions )
    ok
    see f<
    F<
    ( 00506120 E86BFEFFFF ) CALL 00505F90 FCMP2
    ( 00506125 4881FB00010000 ) CMP RBX, # 00000100
    ( 0050612C 0F94C3 ) SETZ/E BL
    ( 0050612F F6DB ) NEG BL
    ( 00506131 480FBEDB ) MOVSX RBX, BL
    ( 00506135 C3 ) RET/NEXT
    ( 22 bytes, 6 instructions )
    ok
    see fcmp2
    FCMP2
    ( 00505F90 4883ED08 ) SUB RBP, # 08
    ( 00505F94 48895D00 ) MOV [RBP], RBX
    ( 00505F98 D9C9 ) FXCH ST(1)
    ( 00505F9A DED9 ) FCOMPP
    ( 00505F9C 9B ) FWAIT
    ( 00505F9D DFE0 ) FSTSW AX
    ( 00505F9F 66250041 ) AND AX, # 4100
    ( 00505FA3 480FB7D8 ) MOVZX RBX, AX
    ( 00505FA7 C3 ) RET/NEXT
    ( 24 bytes, 9 instructions )



    Nice work. I'll try some comparisons using your Forth implementation of
    F*/ .

    Interesting that there is an FWAIT instruction assembled into FCMP2.
    IIRC, FWAIT was needed to fix a bug in early FPUs.

    --
    Krishna

  • From Krishna Myneni@21:1/5 to Anton Ertl on Mon May 27 18:27:20 2024
    On 5/22/24 11:31, Anton Ertl wrote:
    Krishna Myneni <krishna.myneni@ccreweb.org> writes:
    On 5/21/24 04:03, mhx wrote:
    Anton Ertl wrote:

    [..]
    It seems to me that this can be solved by sorting the three factors
    into a>b>c.  Then you can avoid the intermediate overflow by
    performing the computation as (a*c)*b.
    ...
    Remember that you will also have to deal with IEEE 754 special values
    like Inf and NaN.

    Not a problem. If any operand is a NaN, the result will be NaN no
    matter how the operations are associated. For infinities (and 0 as
    divisor), I would analyse it by looking at all cases, but I don't see
    that it makes any difference:

    Variable names here represent finite non-zero values:

    (inf*y)/z=inf/z=inf
    inf*(y/z)=inf*finite=inf
    y*(inf/z)=y*inf=inf

    Likewise if x is finite and y is infinite

    (x*y)/inf=finite/inf=0
    x*(y/inf)=x*0=0
    y*(x/inf)=y*0=0

    (x*y)/0=finite/0=inf
    x*(y/0)=x*inf=inf
    y*(x/0)=y*inf=inf

    Signs in all these cases follow the same rules whether infinities are involved or not.

    It will be interesting to compare the efficiency of
    both my approach and your sorting approach. I'm skeptical that the
    additional sorting will make the equivalent calculation faster.

    Actually sorting is overkill:

    : fsort2 ( r1 r2 -- r3 r4 )
    \ |r3|>=|r4|
    fover fabs fover fabs f< if
    fswap
    then ;

    : f*/ ( r1 r2 r3 -- r )
    fdup fabs 1e f> fswap frot fsort2 if
    fswap then
    frot f/ f* ;

    I have tested this with your tests from
    <v2dqte$3ir32$1@dont-email.me>, but needed to change rel-near (I
    changed it to 1e-16) for gforth to pass your tests. I leave
    performance testing to you. ...

    I put together a benchmark test. My implementation of F*/ hangs during
    the test, so I have some debugging to do on that. Your implementation
    above does execute.

    --
    Krishna


    === begin bench-fstarslash.4th ===
    \ bench-fstarslash.4th
    \
    \ Benchmark implementation of double-precision F*/ using
    \ a large sample of triplet dp floating-point numbers,
    \ which would overflow/underflow with F* F/ sequence.
    \

    include ans-words \ only for kForth
    include random
    include fsl/fsl-util
    include fsl/dynmem

    1 [IF]

    \ Anton Ertl's implementation of F*/ per discussion on
    \ comp.lang.forth: 5/22/2024, "Re: F*/ (f-star-slash)"

    : fsort2 ( F: r1 r2 -- r3 r4 )
    \ |r3|>=|r4|
    fover fabs fover fabs f< if
    fswap
    then ;

    : f*/ ( F: r1 r2 r3 -- r )
    fdup fabs 1e f> fswap frot fsort2 if
    fswap then
    frot f/ f* ;

    [ELSE]
    include fstarslash \ other implementation
    [THEN]

    \ Generate triplet:
    \
    \ 1. Use an integer random number generator to generate
    \ binary signed exponents, e1, e2, e3, in the interval
    \
    \ [-DP_EXPONENT_BIAS, MAX_DP_EXPONENT-DP_EXPONENT_BIAS)
    \
    \ 2. Obtain the triplet, 2^{e1--e3}, such that it is in
    \ the normal range for double-precision fp.
    \
    \ 3. Store in matrix and repeat from 1 for N_SAMPLES.
    \
    \ 4. Benchmark results of F*/ on triplets; store in
    \ results array.
    \

    1014678321 seed !

    [UNDEFINED] DP_EXPONENT_MAX [IF]
    2047 constant DP_EXPONENT_MAX
    [THEN]

    DP_EXPONENT_MAX constant RAND_EXP_MASK

    [UNDEFINED] DP_EXPONENT_BIAS [IF]
    1023 constant DP_EXPONENT_BIAS
    [THEN]

    1000000 constant N_SAMPLES

    FLOAT DMATRIX triplets{{
    FLOAT DARRAY results{

    : alloc-mem ( u -- bFail )
    & triplets{{ over 3 }}malloc
    & results{ swap }malloc
    malloc-fail? ;

    : free-mem ( -- )
    & results{ }free
    & triplets{{ }}free
    ;

    \ Return signed binary exponent
    : rand-exp ( -- e )
    random2 RAND_EXP_MASK and \ ensure e stays within range
    DP_EXPONENT_BIAS - ;

    : normal-exponent? ( e -- bNormal )
    DP_EXPONENT_BIAS negate DP_EXPONENT_BIAS within ;

    : random-exponents ( -- e1 e2 e3 )
    BEGIN
    rand-exp rand-exp
    2dup + normal-exponent?
    WHILE
    2drop
    REPEAT
    \ -- e1 e2
    \ (2^e1)*(2^e2) will underflow or overflow double fp under F*
    BEGIN
    2dup +
    rand-exp 2dup -
    normal-exponent? 0=
    WHILE
    2drop
    REPEAT
    nip
    ;

    : randomize-sign ( F: r -- +/-r )
    random2 1 and IF fnegate THEN ;

    : gen-triplet ( F: -- r1 r2 r3 )
    random-exponents
    2.0e0 s>f f** randomize-sign
    2.0e0 s>f f** randomize-sign
    2.0e0 s>f f** randomize-sign ;

    : gen-samples ( -- )
    N_SAMPLES alloc-mem ABORT" Unable to allocate memory!"
    N_SAMPLES 0 DO
    gen-triplet
    triplets{{ I 2 }} f! triplets{{ I 1 }} f! triplets{{ I 0 }} f!
    key? IF key drop I . cr unloop EXIT THEN
    LOOP ;

    \ For Gforth, include kforth-compat.fs, or replace MS@ with UTIME
    \ and double arithmetic to obtain elapsed time.

    : bench-f*/ ( -- )
    ms@
    results{ 0 }
    N_SAMPLES 0 DO
    triplets{{ I 0 }}
    dup f@
    float+ dup f@
    float+ f@
    f*/
    dup f! float+
    LOOP
    drop
    ms@ swap - . ." ms" cr ;

    cr
    cr .( Type 'gen-samples' to obtain arithmetic cases for F*/ )
    cr .( Type bench-f*/ to execute the benchmark. )
    cr
    cr
    === end bench-fstarslash.4th ===

  • From Krishna Myneni@21:1/5 to Krishna Myneni on Tue May 28 06:11:05 2024
    On 5/27/24 18:27, Krishna Myneni wrote:
    On 5/22/24 11:31, Anton Ertl wrote:
    Krishna Myneni <krishna.myneni@ccreweb.org> writes:
    On 5/21/24 04:03, mhx wrote:
    Anton Ertl wrote:

    [..]
    It seems to me that this can be solved by sorting the three factors
    into a>b>c.  Then you can avoid the intermediate overflow by
    performing the computation as (a*c)*b.
    ...
    Remember that you will also have to deal with IEEE 754 special values
    like Inf and NaN.

    Not a problem.  If any operand is a NaN, the result will be NaN no
    matter how the operations are associated.  For infinities (and 0 as
    divisor), I would analyse it by looking at all cases, but I don't see
    that it makes any difference:

    Variable names here represent finite non-zero values:

    (inf*y)/z=inf/z=inf
    inf*(y/z)=inf*finite=inf
    y*(inf/z)=y*inf=inf

    Likewise if x is finite and y is infinite

    (x*y)/inf=finite/inf=0
    x*(y/inf)=x*0=0
    y*(x/inf)=y*0=0

    (x*y)/0=finite/0=inf
    x*(y/0)=x*inf=inf
    y*(x/0)=y*inf=inf

    Signs in all these cases follow the same rules whether infinities are
    involved or not.

    It will be interesting to compare the efficiency of
    both my approach and your sorting approach. I'm skeptical that the
    additional sorting will make the equivalent calculation faster.

    Actually sorting is overkill:

    : fsort2 ( r1 r2 -- r3 r4 )
         \ |r3|>=|r4|
         fover fabs fover fabs f< if
             fswap
         then ;

    : f*/ ( r1 r2 r3 -- r )
         fdup fabs 1e f> fswap frot fsort2 if
             fswap then
         frot f/ f* ;

    I have tested this with your tests from
    <v2dqte$3ir32$1@dont-email.me>, but needed to change rel-near (I
    changed it to 1e-16) for gforth to pass your tests.  I leave
    performance testing to you.  ...

    I put together a benchmark test. My implementation of F*/ hangs during
    the test, so I have some debugging to do on that. Your implementation
    above does execute.

    --
    Krishna


    === begin bench-fstarslash.4th ===
    \ bench-fstarslash.4th
    \
    \ Benchmark implementation of double-precision F*/ using
    \ a large sample of triplet dp floating-point numbers,
    \ which would overflow/underflow with F* F/ sequence.
    \

    Updated benchmark code also allows for validating the arithmetic of F*/
    for all of the 1 million test cases. At the moment, all arithmetic cases
    use only "normal" fp numbers, i.e., they do not include zero, infinity,
    or subnormal numbers -- special tests are needed for such cases.

    --
    Krishna

    === begin bench-fstarslash.4th ===
    \ bench-fstarslash.4th
    \
    \ Benchmark implementation of double-precision F*/ using
    \ a large sample of triplet dp floating-point numbers,
    \ which would overflow/underflow with F* F/ sequence.
    \

    include ans-words \ only for kForth
    include random
    include fsl/fsl-util
    include fsl/dynmem

    1 [IF]

    \ Anton Ertl's implementation of F*/ per discussion on
    \ comp.lang.forth: 5/22/2024, "Re: F*/ (f-star-slash)"

    : fsort2 ( F: r1 r2 -- r3 r4 )
    \ |r3|>=|r4|
    fover fabs fover fabs f< if
    fswap
    then ;

    : f*/ ( F: r1 r2 r3 -- r )
    fdup fabs 1e f> fswap frot fsort2 if
    fswap then
    frot f/ f* ;

    [ELSE]
    include fstarslash \ other implementation
    [THEN]

    \ Generate triplet:
    \
    \ 1. Use an integer random number generator to generate
    \ binary signed exponents, e1, e2, e3, in the interval
    \
    \ [-DP_EXPONENT_BIAS, MAX_DP_EXPONENT-DP_EXPONENT_BIAS)
    \
    \ 2. Obtain the triplet, 2^{e1--e3}, such that it is in
    \ the normal range for double-precision fp.
    \
    \ 3. Store in matrix and repeat from 1 for N_SAMPLES.
    \
    \ 4. Benchmark results of F*/ on triplets; store in
    \ results array.
    \

    1014678321 seed !

    [UNDEFINED] DP_EXPONENT_MAX [IF]
    2047 constant DP_EXPONENT_MAX
    [THEN]

    DP_EXPONENT_MAX constant RAND_EXP_MASK

    [UNDEFINED] DP_EXPONENT_BIAS [IF]
    1023 constant DP_EXPONENT_BIAS
    [THEN]

    1000000 constant N_SAMPLES

    FLOAT DMATRIX triplets{{
    FLOAT DARRAY results{
    INTEGER DMATRIX refexp{{

    : 3dup ( n1 n2 n3 -- n1 n2 n3 n1 n2 n3 )
    2 pick 2 pick 2 pick ;

    : alloc-mem ( u -- bFail )
    & triplets{{ over 3 }}malloc
    & results{ over }malloc
    & refexp{{ swap 3 }}malloc
    malloc-fail? ;

    : free-mem ( -- )
    & refexp{{ }}free
    & results{ }free
    & triplets{{ }}free
    ;

    : normal-exponent? ( e -- bNormal )
    DP_EXPONENT_BIAS negate 1+ DP_EXPONENT_BIAS 1+ within ;

    \ Return signed binary exponent in normal range
    : rand-exp ( -- e )
    random2 RAND_EXP_MASK and \ ensure e stays within range
    DP_EXPONENT_BIAS -
    DP_EXPONENT_BIAS min ;

    : random-exponents ( -- e1 e2 e3 )
    BEGIN
    rand-exp rand-exp
    2dup + normal-exponent?
    WHILE
    2drop
    REPEAT
    \ -- e1 e2
    \ (2^e1)*(2^e2) will underflow or overflow double fp under F*
    BEGIN
    2dup +
    rand-exp 2dup -
    normal-exponent? 0=
    WHILE
    2drop
    REPEAT
    nip
    ;

    : combine-exponents ( e1 e2 e3 -- e1 e2 e3 etot )
    2dup - 3 pick + ;

    : randomize-sign ( F: r -- +/-r )
    random2 1 and IF fnegate THEN ;

    : etriple>floats ( e1 e2 e3 -- ) ( F: -- r1 r2 r3 )
    rot 2.0e0 s>f f**
    swap 2.0e0 s>f f**
    2.0e0 s>f f** ;

    \ GEN-SAMPLES does not generate cases with zeros or
    \ infinities.
    : gen-samples ( -- )
    N_SAMPLES alloc-mem ABORT" Unable to allocate memory!"
    N_SAMPLES 0 DO
    random-exponents
    3dup refexp{{ I 2 }} ! refexp{{ I 1 }} ! refexp{{ I 0 }} !
    etriple>floats
    triplets{{ I 2 }} randomize-sign f!
    triplets{{ I 1 }} randomize-sign f!
    triplets{{ I 0 }} randomize-sign f!
    LOOP ;

    \ For Gforth, include kforth-compat.fs, or replace MS@ with UTIME
    \ and double arithmetic to obtain elapsed time.

    : bench-f*/ ( -- )
    ms@
    results{ 0 }
    N_SAMPLES 0 DO
    triplets{{ I 0 }}
    dup f@
    float+ dup f@
    float+ f@
    f*/
    dup f! float+
    LOOP
    drop
    ms@ swap - . ." ms" cr ;


    \ Check the calculation performed by F*/
    : relative-error ( F: r1 r2 -- relerror )
    fswap fover f- fswap f/ fabs ;

    1e-17 fconstant tol
    variable tol-errors
    variable sign-errors
    fvariable max-rel-error

    : check-results ( -- )
    cr N_SAMPLES 7 .r ." samples" cr
    0 tol-errors !
    0 sign-errors !
    0.0e0 max-rel-error f!
    N_SAMPLES 0 DO
    results{ I } f@ fabs
    2.0e0 refexp{{ I 0 }} @ refexp{{ I 1 }} @ +
    refexp{{ I 2 }} @ - s>f f**
    relative-error fdup tol f> IF
    max-rel-error f@ fmax max-rel-error f!
    1 tol-errors +!
    ELSE
    fdrop
    THEN
    \ check for sign errors
    results{ I } f@ f0< \ bSign
    triplets{{ I 0 }} f@ f0< triplets{{ I 1 }} f@ f0< xor
    triplets{{ I 2 }} f@ f0< xor
    <> IF 1 sign-errors +! THEN
    LOOP
    cr ." # tolerance errors: " tol-errors @ 6 .r
    cr ." # sign errors: " sign-errors @ 6 .r
    cr ." Max relative error: " max-rel-error f@ f.
    cr
    ;

    cr
    cr .( Type: )
    cr .( 'GEN-SAMPLES' to obtain arithmetic cases for F*/ )
    cr .( 'BENCH-F*/' to execute the benchmark )
    cr .( 'CHECK-RESULTS' to validate arithmetic by F*/ )
    cr .( 'FREE-MEM' to free memory )
    cr .( Note: Test cases do not involve zeros or infinities. )
    cr

    === end bench-fstarslash.4th ===


    === Sample Run ===
    Type:
    'GEN-SAMPLES' to obtain arithmetic cases for F*/
    'BENCH-F*/' to execute the benchmark
    'CHECK-RESULTS' to validate arithmetic by F*/
    'FREE-MEM' to free memory
    Note: Test cases do not involve zeros or infinities.
    ok
    gen-samples
    ok
    bench-f*/
    87 ms
    ok
    check-results

    1000000 samples

    # tolerance errors: 0
    # sign errors: 0
    Max relative error: 0
    ok
    === end Sample Run ===

  • From Krishna Myneni@21:1/5 to Krishna Myneni on Wed May 29 07:27:33 2024
    On 5/28/24 06:11, Krishna Myneni wrote:
    ...
    Updated benchmark code also allows for validating the arithmetic of F*/
    for all of the 1 million test cases. At the moment, all arithmetic cases
    use only "normal" fp numbers, i.e., they do not include zero, infinity,
    or subnormal numbers -- special tests are needed for such cases.
    ...

    I found the reason for the hang in my implementation of F*/ -- the
    problem was in the word DP_NORMAL_FRACTION (the only word in the code
    containing an indefinite loop). The version which doesn't go into an
    infinite loop is given below:

    : dp-normal-fraction ( u -- expinc u')
    dup DP_FRACTION_BITS 1+ rshift
    dup IF
    tuck rshift
    ELSE
    drop -1 swap \ -1 u
    BEGIN
    dup DP_IMPLIED_BIT and 0=
    WHILE
    1 lshift
    swap 1- swap
    REPEAT
    THEN
    ;
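    For comparison, the same normalization step can be sketched with arbitrary-precision integers (a hypothetical normalize helper, assuming u > 0; note that the right shift truncates, which ties back to the rounding discussion earlier in the thread):

```python
DP_FRACTION_BITS = 52
DP_IMPLIED_BIT = 1 << DP_FRACTION_BITS

def normalize(u):
    # Shift u so its top set bit lands on the implied-one position
    # (bit 52); return (exponent adjustment, normalized value).
    adj = u.bit_length() - (DP_FRACTION_BITS + 1)
    if adj >= 0:
        return adj, u >> adj     # too wide: shift right (truncates)
    return adj, u << -adj        # too narrow: shift left

assert normalize(DP_IMPLIED_BIT) == (0, DP_IMPLIED_BIT)
assert normalize(1) == (-52, DP_IMPLIED_BIT)
```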

    Now the benchmark code runs. It shows two things (see below):

    1. The execution time is poor compared to Anton's implementation, about
    8x slower.

    2. Half of the arithmetic results are wrong, with the max relative error
    being as high as 2/3.

    So, I clearly have some work to do to fix the arithmetic. It was
    fortuitous that it passed my test cases, but clearly the test cases in
    my original implementation of F*/ were insufficient.

    --
    Krishna


    bench-f*/
    603 ms
    ok
    check-results

    1000000 samples

    # tolerance errors: 501374
    # sign errors: 0
    Max relative error: 0.666667
    ok

  • From Krishna Myneni@21:1/5 to Krishna Myneni on Fri May 31 22:57:18 2024
    On 5/29/24 07:27, Krishna Myneni wrote:
    On 5/28/24 06:11, Krishna Myneni wrote:
    ...
    ...
    Now the benchmark code runs. It shows two things (see below):

    1. The execution time is poor compared to Anton's implementation, about
    8x slower.

    2. Half of the arithmetic results are wrong, with the max relative error being as high as 2/3.

    So, I clearly have some work to do to fix the arithmetic. It was
    fortuitous that it passed my test cases, but clearly the test cases in
    my original implementation of F*/ were insufficient.

    --
    Krishna


    bench-f*/
    603 ms
     ok
    check-results

    1000000 samples

    # tolerance errors: 501374
    # sign errors:           0
    Max relative error: 0.666667
     ok


    Ok. I found and fixed the errors in my Forth implementation of F*/ . It
    runs the benchmark test and now there are no arithmetic errors to within
    the specified tolerance of 1E-17.

    My Forth implementation is much slower than Anton's, which uses the
    FPU's floating-point operations. However, my Forth implementation of
    F*/ makes no use of floating-point arithmetic, and will likely be
    considerably faster in assembly code than in Forth due to the
    low-level bit operations.

    There is also a restriction currently on passing subnormal numbers to
    my version of F*/ -- as it stands, it will give incorrect results for
    subnormal numbers and won't detect this condition.

    I also found an error in the benchmark code (bench-fstarslash.4th), in
    the word,

    RAND-EXP ( -- e ) \ Return signed binary exponent in normal range

    It was allowing the return of an unbiased binary exponent of -1023
    (biased exponent 0), which is not in the normal range. Anton's
    implementation works with subnormal numbers, so it passed the results
    verification check.

    The updated definition of the random exponent generator for bench-fstarslash.4th is

    \ Return signed binary exponent in normal range
    : rand-exp ( -- e )
    random2 RAND_EXP_MASK and
    1 max DP_EXPONENT_MAX 1- min \ ensure e within normal range
    DP_EXPONENT_BIAS - ;
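    In Python terms, the revised generator amounts to clamping the biased exponent to [1, DP_EXPONENT_MAX-1] before unbiasing (a sketch with a hypothetical rand_exp mirroring the revised word):

```python
import random

DP_EXPONENT_MAX = 2047
DP_EXPONENT_BIAS = 1023
_rng = random.Random(1014678321)   # seed value from the benchmark

def rand_exp():
    # Mask to [0, DP_EXPONENT_MAX], clamp to the normal biased range
    # [1, DP_EXPONENT_MAX-1], then unbias -- mirrors the revised RAND-EXP
    e = _rng.randrange(1 << 32) & DP_EXPONENT_MAX
    e = max(1, min(e, DP_EXPONENT_MAX - 1))
    return e - DP_EXPONENT_BIAS
```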


    My current implementation, which passes the results verification test in bench-fstarslash.4th (with the above revised def of RAND-EXP), is listed
    below, along with a sample run.

    Cheers,
    Krishna

    --

    === begin fstarslash.4th ===
    \ fstarslash.4th
    \
    \ Implementation of F*/ on 64-bit Forth
    \
    \ Krishna Myneni, 2024-05-19
    \ Revised: 2024-05-31
    \
    \ F*/ ( F: r1 r2 r3 -- r )
    \ Multiply two IEEE-754 double precision floating point
    \ numbers r1 and r2 and divide by r3 to obtain the result,
    \ r. The intermediate product significand, r1*r2, is
    \ represented with at least IEEE quad precision for both
    \ its significand and exponent.
    \
    \ F*/ uses an intermediate significand represented by
    \ 128 bits
    \
    \ Notes:
    \ 1. Does not work with subnormal IEEE 754 double-
    \ precision numbers. Needs tests for subnormal
    \ inputs.

    \ include ans-words

    1 cells 8 < [IF]
    cr .( Requires 64-bit Forth system ) cr
    ABORT
    [THEN]

    fvariable temp
    : >fpstack ( u -- ) ( F: -- r )
    temp ! temp f@ ;

    base @ hex
    8000000000000000 constant DP_SIGNBIT_MASK
    000FFFFFFFFFFFFF constant DP_FRACTION_MASK
    7FF0000000000000 constant DP_EXPONENT_MASK
    \ DP special value constants
    7FF0000000000001 >fpstack fconstant DP_NAN
    FFF0000000000000 >fpstack fconstant DP_-INF
    7FF0000000000000 >fpstack fconstant DP_+INF
    base !

    52 constant DP_FRACTION_BITS
    1023 constant DP_EXPONENT_BIAS
    2047 constant DP_EXPONENT_MAX

    1 DP_FRACTION_BITS LSHIFT constant DP_IMPLIED_BIT
    63 constant DP_SIGN_BIT

    : dp-fraction ( F: r -- r ) ( -- u )
    FP@ @ DP_FRACTION_MASK and ;
    : dp-exponent ( F: r -- r ) ( -- e )
    FP@ @ DP_EXPONENT_MASK and DP_FRACTION_BITS rshift ;
    : dp-sign ( F: r -- r ) ( -- sign )
    FP@ @ DP_SIGN_BIT rshift ;

    : _isZero? ( ufrac uexp -- b ) 0= swap 0= and ;
    : _isInf? ( ufrac uexp -- b ) DP_EXPONENT_MAX = swap 0= and ;
    : _isNaN? ( ufrac uexp -- b ) DP_EXPONENT_MAX = swap 0<> and ;
    : _isSubNormal? ( ufrac uexp -- b ) 0= swap 0<> and ;

    : frac>significand ( u -- u' ) DP_IMPLIED_BIT or ;

    \ Show the significand, unbiased base-2 exponent, and sign bit
    \ for a double-precision floating point number. To recover the
    \ dp floating point number, r, from these fields,
    \
    \ r = (-1)^sign * significand * 2^(exponent - 52)
    \
    : dp-components. ( F: r -- )
    dp-sign
    dp-exponent
    dp-fraction
    2dup D0= invert IF frac>significand THEN
    swap DP_EXPONENT_BIAS - swap
    fdrop
    ." significand = " 18 u.r 2 spaces
    ." exponent = " 4 .r 2 spaces
    ." sign = " .
    ;

    : dp-normal-significand ( u -- expinc u' )
    dup DP_FRACTION_BITS 1+ rshift
    dup IF
    tuck rshift
    ELSE
    drop 0 swap \ 0 u
    BEGIN
    dup DP_IMPLIED_BIT and 0=
    WHILE
    1 lshift
    swap 1- swap
    REPEAT
    THEN
    ;

    0 value r1_sign
    0 value r2_sign
    0 value r3_sign
    0 value num_sign
    variable r1_exp
    variable r2_exp
    variable r3_exp
    variable r_exp
    variable r1_frac
    variable r2_frac
    variable r3_frac

    : get-arg-components ( F: r1 r2 r3 -- )
    dp-fraction r3_frac !
    dp-exponent r3_exp !
    dp-sign to r3_sign
    fdrop
    dp-fraction r2_frac !
    dp-exponent r2_exp !
    dp-sign to r2_sign
    fdrop
    dp-fraction r1_frac !
    dp-exponent r1_exp !
    dp-sign to r1_sign
    fdrop ;

    0 value b_num_zero
    0 value b_den_zero
    0 value b_num_Inf
    0 value b_den_Inf

    variable remainder

    : _F*/ ( F: r1 r2 r3 -- ) ( -- usign uexponent udquot true )
    \ or ( F: -- DP_NAN | +/- DP_INF ) ( -- false )
    get-arg-components

    \ Check for NaN in args
    r1_frac @ r1_exp @ _IsNaN?
    r2_frac @ r2_exp @ _IsNaN? or
    r3_frac @ r3_exp @ _IsNaN? or IF DP_NAN FALSE EXIT THEN

    r1_frac @ r1_exp @ _IsZero?
    r2_frac @ r2_exp @ _IsZero? or to b_num_zero
    r3_frac @ r3_exp @ _IsZero? to b_den_zero

    b_num_zero b_den_zero and IF DP_NAN FALSE EXIT THEN

    r1_sign r2_sign xor to num_sign

    r1_frac @ r1_exp @ _IsInf?
    r2_frac @ r2_exp @ _IsInf? or to b_num_Inf
    r3_frac @ r3_exp @ _IsInf? to b_den_Inf

    b_num_Inf b_den_Inf and IF DP_NAN FALSE EXIT THEN

    b_num_zero b_den_Inf or IF \ return signed zero
    num_sign r3_sign xor
    0
    0 s>d TRUE EXIT
    THEN

    b_den_zero b_num_Inf or IF \ return signed Inf
    DP_+INF
    num_sign r3_sign xor IF FNEGATE THEN
    FALSE EXIT
    THEN

    num_sign r3_sign xor
    r1_exp @ r2_exp @ + r3_exp @ -
    r1_frac @ frac>significand
    r2_frac @ frac>significand um*
    r3_frac @ frac>significand ud/mod
    rot remainder !
    TRUE
    ;

    variable uhigh

    : F*/ ( F: r1 r2 r3 -- r )
    _F*/ \ ( -- usign uexp udquot true ) or
    \ ( -- false ) ( F: NAN | +/-INF )
    IF
    2 pick 0 DP_EXPONENT_MAX WITHIN 0= IF
    -43 throw
    THEN
    dup IF \ high 64 bits of the quotient
    uhigh !
    -43 throw
    ELSE
    drop \ -- usign uexp uquot
    2dup D0= IF \ uquot = 0 and uexp = 0
    drop \ usign uexp
    ELSE
    dp-normal-significand \ usign uexp expinc uquot
    >r + DP_FRACTION_BITS lshift
    r> DP_FRACTION_MASK and or
    THEN
    swap DP_SIGN_BIT lshift or
    >fpstack
    THEN
    THEN
    ;
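The overall recipe of _F*/ and F*/ (exact integer product of the significands, pooled exponents, one wide divide) can be cross-checked in Python, whose unbounded integers stand in for the 128-bit intermediate. This sketch skips the NaN/Inf/zero special cases handled above, and the final float(q) conversion adds one extra rounding, so it may differ in the last bit from a correctly rounded result:

```python
import math

def fstarslash(r1, r2, r3):
    """Compute r1*r2/r3 with an exact integer intermediate product,
    so the product may exceed the double-precision exponent range.
    Assumes finite, nonzero inputs (no NaN/Inf/zero handling)."""
    m1, e1 = math.frexp(r1)          # r = m * 2**e, 0.5 <= |m| < 1
    m2, e2 = math.frexp(r2)
    m3, e3 = math.frexp(r3)
    # exact 53-bit integer significands
    i1, i2, i3 = (int(abs(m) * 2 ** 53) for m in (m1, m2, m3))
    q = (i1 * i2 << 64) // i3        # 64 guard bits below the quotient
    r = math.ldexp(q, e1 + e2 - e3 - 53 - 64)
    return -r if (r1 < 0) ^ (r2 < 0) ^ (r3 < 0) else r

print(fstarslash(3.12e306, 1.1e145, 2.8e150))   # ~1.2257142857142857e+301
```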

    \ Tests
    1 [IF]
    [UNDEFINED] T{ [IF] include ttester.4th [THEN]
    1e-17 rel-near f!
    1e-17 abs-near f!
    TESTING F*/
    \ These results are different using F* and F/
    t{ 1.0e300 -3.0e259 1.0e280 f*/ -> -3.0e279 }t
    t{ 1.0e-300 3.0e-259 1.0e-280 f*/ -> 3.0e-279 }t
    t{ 1.0e300 -3.0e259 -1.0e280 f*/ -> 3.0e279 }t
    t{ -1.0e300 -3.0e259 -1.0e280 f*/ -> -3.0e279 }t
    t{ 1.0e302 -3.0e302 DP_+INF f*/ -> -0.0e0 }t
    t{ -1.0e-302 -1.0e-302 0.0e0 f*/ -> DP_+INF }t

    \ The following should give same results as using F* and F/
    t{ 0.0e0 3.0e302 1.0e-100 f*/ -> 0.0e0 }t
    t{ 3.0e302 0.0e0 1.0e-100 f*/ -> 0.0e0 }t
    t{ 1.0e302 -3.0e259 0.0e0 f*/ -> DP_-INF }t
    t{ DP_+INF 3.0e300 1.0e0 f*/ -> DP_+INF }t
    t{ DP_+INF 3.0e-300 -1.0e0 f*/ -> DP_-INF }t
    [THEN]

    === end fstarslash.4th ===

    === sample run of bench-fstarslash.4th ===
    \ Testing implementation of F*/ in fstarslash.4th
    include bench-fstarslash

    FSL-UTIL V1.3d 21 Apr 2024 EFC, KM
    DYNMEM V1.9d 19 February 2012 EFC
    TESTING F*/


    Type:
    'GEN-SAMPLES' to obtain arithmetic cases for F*/
    'BENCH-F*/' to execute the benchmark
    'CHECK-RESULTS' to validate arithmetic by F*/
    'FREE-MEM' to free memory
    Note: Test cases do not involve zeros or infinities.
    ok
    gen-samples
    ok
    bench-f*/
    602 ms
    ok
    check-results

    1000000 samples

    # tolerance errors: 0
    # sign errors: 0
    Max relative error: 0
    ok
    === end sample run ===

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Krishna Myneni@21:1/5 to Krishna Myneni on Sun Jun 2 07:11:12 2024
    On 5/19/24 16:28, Krishna Myneni wrote:
    The analog of */ for single length integers, with intermediate double
    length result, can be implemented for floating point arithmetic as well.
    The following implementation of F*/ uses the non-standard word UD/MOD, although it might be possible to code it with UM/MOD as well. It also requires FP@ (and hence a separate fp stack in memory).

    ...

    Based on how it handles large integer arithmetic, I half-expected that
    double-precision floating point arithmetic overflow/underflow would be
    handled automatically by Python. That appears not to be the case:

    Python 3.12.3:
    3.12e306 * 1.1e145 / 2.8e150
    inf


    Using F*/ in Forth:

    3.12e306 1.1e145 2.8e150 f*/ fs.
    1.22571428571429e+301 ok
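The quoted CPython behaviour is easy to reproduce; in this particular case reassociating the expression also dodges the overflow (a sketch; as later posts in the thread show, no fixed association works for every operand range):

```python
a, b, c = 3.12e306, 1.1e145, 2.8e150
print(a * b / c)   # inf: a*b alone exceeds the double maximum ~1.8e308
print(a / c * b)   # finite, ~1.2257e+301: same math, reassociated
```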

    --
    KM

  • From mhx@21:1/5 to Krishna Myneni on Sun Jun 2 15:03:47 2024
    Krishna Myneni wrote:

    Python 3.12.3:
    3.12e306 * 1.1e145 / 2.8e150
    inf
    [..]

    Worse.

    3.12e306 * 1.1e2 / 2.8e150
    inf

    -marcel

  • From Krishna Myneni@21:1/5 to mhx on Sun Jun 2 13:01:19 2024
    On 6/2/24 10:03, mhx wrote:
    Krishna Myneni wrote:

    Python 3.12.3:
    3.12e306 * 1.1e145 / 2.8e150
    inf
    [..]

    Worse.

    3.12e306 * 1.1e2 / 2.8e150
    inf


    The largest decimal number expressible at double precision is

    1.7976931348623157e+308

    so 3.12e306 * 1.1e2 overflows double-precision IEEE fp.

    3.12e306 1.1e2 f* fs.
    inf ok

    But, the following is ok:

    3.12e306 5.76183697071255e1 f* 2.8e150 f/ fs.
    6.420332624508270e+157 ok

    I have a least-significant-bit problem with the significand in my implementation of F*/, which I need to fix:

    3.12e306 5.7618369707125503e1 2.8e150 f*/ fs.
    6.420332624508269e+157 ok

    --
    KM

  • From minforth@21:1/5 to Krishna Myneni on Sun Jun 2 18:03:17 2024
    Krishna Myneni wrote:

    On 5/19/24 16:28, Krishna Myneni wrote:
    The analog of */ for single length integers, with intermediate double
    length result, can be implemented for floating point arithmetic as
    well.

    The following implementation of F*/ uses the non-standard word, UD/MOD,

    although it might be possible to code it with UM/MOD as well. It also
    requires FP@ (requiring a separate fp stack in memory).

    ....

    Based on how it handles large integer arithmetic, I half-expected that double-precision floating point arithmetic overflow/underflow would be handled automatically by Python. That appears not to be the case:

    Python 3.12.3:
    3.12e306 * 1.1e145 / 2.8e150
    inf


    Using F*/ in Forth:

    3.12e306 1.1e145 2.8e150 f*/ fs.
    1.22571428571429e+301 ok

    Of course, in Python you could use parentheses around the division, or
    load bigfloat or float128 packages, but...
    MicroPython v1.19.1 on 2022-11-05; win32 [GCC 6.3.0] version

    Use Ctrl-D to exit, Ctrl-E for paste mode
    3.12e306 * ( 1.1e145 / 2.8e150 )
    1.225714285714286e+301


    I know of no language with automatic type casting/inference to higher
    precision floating-point libraries, triggered by falling into a +/-Inf
    trap.

    OTOH, given the ease of working with type information in Julia,
    it shouldn't be too hard to implement there.

  • From Krishna Myneni@21:1/5 to Krishna Myneni on Sun Jun 2 13:18:13 2024
    On 6/2/24 07:11, Krishna Myneni wrote:
    ...
    Using F*/ in Forth:

    3.12e306 1.1e145 2.8e150 f*/ fs.
    1.22571428571429e+301  ok


    The alternative way to calculate this without overflow in Forth is to add/subtract the logarithms, then take the antilog. However, let's check
    the accuracy of this approach compared to the above:

    3.12e306 flog 1.1e145 flog f+ 2.8e150 flog f- falog fs.
    1.225714285714187e+301 ok

    The base 10 significand of the result is

    3.12e 1.1e f* 2.8e f/ fs.
    1.225714285714286e+00 ok

    The result from using F*/ to the same number of decimal digits is

    3.12e306 1.1e145 2.8e150 f*/ fs. \ Anton's version of F*/
    1.225714285714286e+301 ok

    The relative error of the logarithm approach to using F*/ is

    1.225714285714286e+301 1.225714285714187e+301 f- 1.225714285714286e+301
    f/ f.
    8.07495e-14 ok

    We appear to be losing about 3 decimal digits of accuracy with the
    logarithm approach.
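The same accuracy loss shows up in Python: ulp-level errors in each log10 are amplified by the ~300 magnitude of the exponent sum when the antilog is taken, costing roughly the last three digits (a sketch using the thread's example numbers):

```python
import math

a, b, c = 3.12e306, 1.1e145, 2.8e150
via_logs = 10.0 ** (math.log10(a) + math.log10(b) - math.log10(c))
print(via_logs)   # close to 1.2257142857142857e+301, but typically
                  # wrong in the low-order digits
```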

    --
    Krishna

  • From Krishna Myneni@21:1/5 to minforth on Sun Jun 2 13:51:28 2024
    On 6/2/24 13:03, minforth wrote:
    Krishna Myneni wrote:

    On 5/19/24 16:28, Krishna Myneni wrote:
    The analog of */ for single length integers, with intermediate double
    length result, can be implemented for floating point arithmetic as
    well.

    The following implementation of F*/ uses the non-standard word, UD/MOD,

    although it might be possible to code it with UM/MOD as well. It also
    requires FP@ (requiring a separate fp stack in memory).

    ....

    Based on how it handles large integer arithmetic, I half-expected that
    double-precision floating point arithmetic overflow/underflow would be
    handled automatically by Python. That appears not to be the case:

    Python 3.12.3:
    3.12e306 * 1.1e145 / 2.8e150
    inf


    Using F*/ in Forth:

    3.12e306 1.1e145 2.8e150 f*/ fs.
    1.22571428571429e+301  ok

    Of course, in Python you could use parentheses around the division, or
    load bigfloat or float128 packages, but...
    MicroPython v1.19.1 on 2022-11-05; win32 [GCC 6.3.0] version

    Use Ctrl-D to exit, Ctrl-E for paste mode
    3.12e306 * ( 1.1e145 / 2.8e150 )
    1.225714285714286e+301



    Yes, you could do that for a specific case; however, you can get
    overflow to INF if you encode the calculation as a*(b/c) for general
    cases. Consider,

    a = 3.12e-306
    b = 1.1e150
    c = 2.8e-159

    For this case, a*b/c will not overflow, but a*(b/c) won't work.

    3.12e-306*(1.1e150/2.8e-159)
    inf

    But F*/ will work for this case as well as for the prior case (a =
    3.12e306, b = 1.1e145, c = 2.8e150).

    3.12e-306 1.1e150 2.8e-159 f*/ fs.
    1.22571428571429e+03 ok

    You don't have to do a case by case selection of (a*b)/c or a*(b/c),
    just a b c F*/
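Krishna's point is easy to verify in Python: each association fails on one of the two operand sets and succeeds on the other, so neither (a*b)/c nor a*(b/c) can be chosen once and for all:

```python
# first case: a*b overflows, a*(b/c) is fine
a, b, c = 3.12e306, 1.1e145, 2.8e150
print(a * b / c, a * (b / c))     # inf, then ~1.2257e+301

# second case: b/c overflows, (a*b)/c is fine
a, b, c = 3.12e-306, 1.1e150, 2.8e-159
print(a * (b / c), a * b / c)     # inf, then ~1225.71
```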

    --
    Krishna

  • From Krishna Myneni@21:1/5 to minforth on Sun Jun 2 21:58:12 2024
    On 5/21/24 14:17, minforth wrote:
    Krishna Myneni wrote:
    MANTMAX dp-components.
    significand =   9007199254740991  exponent =   52  sign = 0  ok
    hex
      ok
    MANTMAX FFFF S>F FDUP F*/
      ok
    decimal dp-components.
    significand =   9007199254740990  exponent =   52  sign = 0  ok

    The lowest bit is different in the significand. Need to look into
    why the lowest fraction bit is different.

    It looks like you are truncating the result and not rounding to nearest,
    which is expected by F* and F/.
    You save the remainder but do not use it for anything!


    I think you're correct, but the rounding used by F*/ should follow
    whatever rounding mode the FPU is set to. In kForth, the FPU is always
    initialized to round to nearest. Since we are doing the calculation for
    F*/ in integer arithmetic, it is not observing the round-to-nearest mode
    of the FPU.
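minforth's observation (the remainder is saved but never consulted) points at the fix: after the truncating UD/MOD, the remainder tells you which way to round. A round-to-nearest, ties-to-even correction on an integer quotient looks like this in Python (illustrative, not kForth code):

```python
def div_rne(num, den):
    """Divide non-negative num by positive den, rounding the quotient
    to nearest (ties to even) using the remainder that the truncating
    division leaves behind."""
    q, r = divmod(num, den)
    if 2 * r > den or (2 * r == den and q & 1):
        q += 1
    return q

print(div_rne(7, 2), div_rne(5, 2))   # 4 2  (both ties go to even)
```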

    Manual cross-check with quad precision floats:

    MinForth 3.6 (64 bit) (fp 128 bit)
    # $1FFFFFFFFFFFFF S>F FCONSTANT MANTMAX  ok
    # mantmax  ok
    f: 9.0072e+15 | # $FFFF s>f  ok
    f: 9.0072e+15 65535 | # f*  ok
    f: 5.90287e+20 | # fdup f>d  ok
    f: 5.90287e+20 | -9007199254806527 31 # hex  ok
    f: 5.90287e+20 | FFDFFFFFFFFF0001 1F $ decimal 65535e f/  ok
    f: 9.0072e+15 | -9007199254806527 31 # 2drop f>s  ok
    9007199254740991 # hex  ok
    1FFFFFFFFFFFFF $

    So for the 64-bit F*/ challenge, the interim significand after the multiplication step should be $1F_FFDF_FFFF_FFFF_0001.

    It is now easy to check whether the erroneous 1-digit offset was caused
    by the multiplication step or the division step.

    I had errors in my previous F*/ implementation, provided in
    fstarslash.4th, which were corrected in my subsequent post of this code
    on 5/31/24 (see listing for fstarslash.4th in that post). Using the
    2024-05-31 revision, I now reproduce your results:

    hex
    ok
    1FFFFFFFFFFFFF S>F FCONSTANT MANTMAX
    ok
    mantmax dp-components.
    significand = 1FFFFFFFFFFFFF exponent = 34 sign = 0 ok
    decimal
    ok
    mantmax dp-components.
    significand = 9007199254740991 exponent = 52 sign = 0 ok
    mantmax 65535 s>f fdup f*/
    ok
    decimal dp-components.
    significand = 9007199254740991 exponent = 52 sign = 0 ok

    ok
    9007199254740991 hex .
    1FFFFFFFFFFFFF ok

    We are in agreement now.
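The quad-precision cross-check can also be replayed with Python's unbounded integers, which confirms minforth's interim significand directly:

```python
MANTMAX = (1 << 53) - 1          # $1FFFFFFFFFFFFF
prod = MANTMAX * 0xFFFF          # exact intermediate product
print(hex(prod))                 # 0x1fffdfffffffff0001
print(prod // 0xFFFF == MANTMAX) # True: the divide step is exact here
```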

    --
    Krishna

  • From minforth@21:1/5 to All on Tue Jun 4 10:55:26 2024
    Super!

    In a dreamy moment I wondered whether these methods
    would also be suitable for optimising the standard Forth
    operator M*/ (with its triple-wide intermediate result after
    multiplication).

    AFAIR there are very different implementation approaches
    between different Forth systems. But presumably M*/ is not
    used by anyone...

  • From Krishna Myneni@21:1/5 to minforth on Tue Jun 4 17:46:47 2024
    On 6/4/24 05:55, minforth wrote:
    Super!

    In a dreamy moment I wondered whether these methods
    would also be suitable for optimising the standard Forth
    operator M*/ (with its triple-wide intermediate result after
    multiplication).

    AFAIR there are very different implementation approaches
    between different Forth systems. But presumably M*/ is not
    used by anyone...

    kForth implements M*/ in this way. It provides the word UTM/ which has
    the stack diagram,

    UTM/ ( ut u -- ud )

    Divide unsigned triple by unsigned single to give unsigned double quotient.

    It also provides DS* and UDM* :

    DS* ( d n -- t )

    Multiply double and single to give signed triple length product.

    UDM* ( ud u -- ut )

    Multiply unsigned double and unsigned single to give unsigned triple
    length product.
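The semantics of these words can be mimicked with Python's unbounded integers for cross-checking (a sketch; the symmetric, truncating divide shown here is one convention, and a given Forth system's M*/ may floor instead):

```python
def mstarslash(d, n, u):
    """d * n / u with an exact (triple-wide) intermediate product,
    then division by unsigned u; truncating (symmetric) division."""
    t = d * n                                  # exact product, like DS*
    return t // u if t >= 0 else -((-t) // u)  # like UTM/ plus sign fixup

print(mstarslash(10, 7, 2))    # 35
print(mstarslash(-10, 7, 2))   # -35
```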

    I remember David Williams contributed these words to kForth, during his
    port of the kForth VM to the PowerPC processor (this port has not been maintained for many years due to lack of access to the hardware).

    --
    Krishna

  • From Krishna Myneni@21:1/5 to PMF on Tue Jun 4 17:50:04 2024
    On 6/4/24 08:13, PMF wrote:
    Krishna Myneni wrote:

    On 6/2/24 07:11, Krishna Myneni wrote:
    ....
    Using F*/ in Forth:

    3.12e306 1.1e145 2.8e150 f*/ fs.
    1.22571428571429e+301  ok


    The alternative way to calculate this without overflow in Forth is to
    add/subtract the logarithms, then take the antilog. However, let's
    check

    the accuracy of this approach compared to the above:

    3.12e306 flog 1.1e145 flog f+ 2.8e150 flog f- falog fs.
    1.225714285714187e+301  ok

    The base 10 significand of the result is

    3.12e 1.1e f* 2.8e f/ fs.
    1.225714285714286e+00  ok

    The result from using F*/ to the same number of decimal digits is

    3.12e306 1.1e145 2.8e150 f*/ fs.  \ Anton's version of F*/
    1.225714285714286e+301  ok

    The relative error of the logarithm approach to using F*/ is

    1.225714285714286e+301 1.225714285714187e+301 f- 1.225714285714286e+301

    f/ f.
    8.07495e-14  ok

    We appear losing between about 3 decimal digits of accuracy with the
    logarithm approach.

    When I implemented floating point for LXF64 I decided to use a C math library,
    fdlibm. As they were available I implemented frexp and scalbn.
    For implementing F*/ they finally became useful!


    Cool. FREXP and SCALBN look like useful factors. I can define them in
    terms of my primitive factors.

    \ F*/  using frexp and fscalbn
    \ frexp ( f -- mantissa exp )  split float into mantissa and integer exponent
    \ fscalbn ( F: f exp -- f1 )  scale the float with integer exponent: f1 = f*2^exp

    : F*/  ( f1 f2 f3  -- f1*f2/f3 )
       frexp fswap frexp fswap f/
       fswap frexp f*    + swap -
       fscalbn ;
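Python's math module exposes the same pair (frexp, with ldexp playing the role of fscalbn), so PMF's formulation can be tried directly. The mantissa arithmetic cannot overflow since each mantissa is in [0.5, 1), though the two float roundings may differ in the last bit from an exact-intermediate F*/ :

```python
import math

def fstarslash_frexp(f1, f2, f3):
    """F*/ via frexp/ldexp: combine the mantissas in a safe float
    range, apply the pooled binary exponent at the end."""
    m1, e1 = math.frexp(f1)
    m2, e2 = math.frexp(f2)
    m3, e3 = math.frexp(f3)
    return math.ldexp(m1 * m2 / m3, e1 + e2 - e3)

print(fstarslash_frexp(3.12e306, 1.1e145, 2.8e150))  # ~1.2257e+301
print(fstarslash_frexp(0.0, 3.0e302, 1.0e-100))      # 0.0
```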


    Will be interesting to benchmark this version against the other two implementations of F*/ (mine and Anton's).


    This is somewhat similar to your approach but keeps the mantissa as a
    floating-point number. The benefit is that subnormals, INFs, and NaNs
    are handled automatically.


    -- Krishna

  • From minforth@21:1/5 to All on Wed Jun 12 19:01:21 2024
    Just for fun: I took Peter Fälth's F*/ to good use in
    one of my locals test programs. By pure coincidence ;-)
    the test program is a modified version of Krishna Myneni's
    calculation program sph_bes_neu.fth for spherical Bessel
    and Neumann functions.

    NB: The test program deliberately makes heavy use of
    locals and of an embedded function that accesses an upvalue.
    It should not be taken as an indication of some crazy
    normal coding style. ;-)

    \ ====== SPH_BES_NEU.MF =============================================
    \
    \ Spherical Bessel and Neumann functions for large index range
    \
    \ Reference
    \ E. Gillman and H. R. Fiebig, "Accurate recursive generation
    \ of spherical Bessel and Neumann functions for a large range
    \ of indices," Computers in Physics vol. 2, p. 62 (1988).

    \ Original FORTRAN to Forth conversion by Krishna Myneni

    \ Adapted to MinForth 3.6 syntax (as locals test):

    1000 CONSTANT MAXL MAXL VECTOR RBES MAXL VECTOR RNEU

    : SPHFUNCS { f: x -- } \ fill vectors RBES and RNEU
    \ --- Initialize
    x fsqr x fcos x fsin { f: xx cx sx }
    1e RBES{ maxl }! 1e RBES{ maxl 1- }!
    cx RNEU{ 1 }! x sx f* cx f+ RNEU{ 2 }!
    \ --- Recursion loop
    <: LFUN ( l -- r3 ) xx sqr 4 * 1- s>f f/ ;>
    maxl 1- 1 { lu lv }
    maxl 1+ 3 DO
    RBES{ lu } lu lfun RBES{ lu 1+ } f* f- RBES{ lu 1- }!
    RNEU{ lv 1+ } lv lfun RNEU{ lv } f* f- RNEU{ lv 2+ }!
    -1 += lu 1 += lv
    LOOP
    \ --- Scale
    RBES{ 1 } RNEU{ 2 } f* xx cx f* RBES{ 2 } 3e f*/ f- 1/f *to RBES
    \ --- Normalize
    x 1/f -1e { f: cu cv | w }
    maxl 1+ 1 DO
    i 1- 2* s>f := w
    cu x w 1e f+ f*/ := cu cu RBES{ i }*
    cv 1e w f- x f*/ := cv cv RNEU{ i }*
    LOOP ;

    \ Test: 1e SPHFUNCS RBES m. RNEU m.

  • From Krishna Myneni@21:1/5 to minforth on Fri Jun 14 21:17:12 2024
    On 6/12/24 14:01, minforth wrote:
    Just for fun: I took Peter Fälth's F*/ to good use in
    one of my locals test programs. By pure coincidence ;-)
    the test program is a modified version of Krishna Myneni's
    calculation program sph_bes_neu.fth for spherical Bessel and Neumann functions.

    NB: The test program deliberately makes heavy use of
    locals and of an embedded function that accesses an upvalue.
    It should not be taken as an indication of some crazy
    normal coding style. ;-)

    \ ====== SPH_BES_NEU.MF =============================================
    \
    \ Spherical Bessel and Neumann functions for large index range
    \
    \ Reference
    \      E. Gillman and H. R. Fiebig, "Accurate recursive generation
    \      of spherical Bessel and Neumann functions for a large range
    \      of indices," Computers in Physics vol. 2, p. 62 (1988).

    \ Original FORTRAN to Forth conversion by Krishna Myneni

    \ Adapted to MinForth 3.6 syntax (as locals test):

    1000 CONSTANT MAXL  MAXL VECTOR RBES  MAXL VECTOR RNEU

    : SPHFUNCS { f: x -- } \ fill vectors RBES and RNEU
    \ --- Initialize
     x fsqr  x fcos  x fsin  { f: xx cx sx }
     1e RBES{ maxl }! 1e RBES{ maxl 1- }!
     cx RNEU{ 1 }!  x sx f* cx f+ RNEU{ 2 }!
    \ --- Recursion loop
     <: LFUN ( l -- r3 ) xx  sqr 4 * 1- s>f  f/ ;>
     maxl 1-  1  { lu lv }
     maxl 1+ 3 DO
       RBES{ lu } lu lfun RBES{ lu 1+ } f* f- RBES{ lu 1- }!
       RNEU{ lv 1+ } lv lfun RNEU{ lv } f* f- RNEU{ lv 2+ }!
       -1 += lu  1 += lv
     LOOP
    \ --- Scale
     RBES{ 1 } RNEU{ 2 } f* xx cx f* RBES{ 2 } 3e f*/ f-  1/f *to RBES
    \ --- Normalize
     x 1/f  -1e  { f: cu cv | w }
     maxl 1+ 1 DO
       i 1- 2* s>f  := w
       cu  x  w 1e f+  f*/  := cu  cu RBES{ i }*
       cv  1e w f-  x  f*/  := cv  cv RNEU{ i }*
     LOOP ;

    \ Test:  1e SPHFUNCS  RBES m.  RNEU m.

    Does the use of F*/ change any of the results in the test code
    accompanying the original Forth code?

    --
    Krishna

  • From minforth@21:1/5 to All on Sat Jun 15 08:21:01 2024
    .. more precisely: precision deviation at j_10 (100)

  • From minforth@21:1/5 to All on Sat Jun 15 08:12:24 2024
    A quick check showed no difference between the F*/ versions.
    Comparisons with -1e-13 F~ worked well, while -1e-14 F~ stalled
    at RBES{ 10 }.

    Either I gave F*/ wrong factors or it is simply a limitation
    of the algorithm. TBH I was more interested in the locals
    scheme with embedded function.

  • From minforth@21:1/5 to All on Sat Jun 15 19:38:43 2024
    .. I became curious and reordered factors in the main loop
    to use f*/ also there:

    <: LFUN ( l -- r3 ) sqr 4 * 1- s>f ;>
    maxl 1- 1 { n: lu lv }
    maxl 1+ 3 DO
    RBES( lu ) RBES( lu 1+ ) xx lu lfun f*/ f- RBES( lu 1- )!
    RNEU( lv 1+ ) RNEU( lv ) xx lv lfun f*/ f- RNEU( lv 2+ )!
    -1 += lu 1 += lv
    LOOP

    Precision improved indeed by about one decimal digit, iow close to
    maximal achievable double fp-number resolution. Higher precision
    would require calculation with extended fp-numbers.

  • From Krishna Myneni@21:1/5 to minforth on Sat Jun 15 15:41:20 2024
    On 6/15/24 14:38, minforth wrote:
    .. I became curious and reordered factors in the main loop
    to use f*/ also there:

     <: LFUN ( l -- r3 )  sqr 4 * 1- s>f ;>
     maxl 1- 1  { n: lu lv }
     maxl 1+ 3 DO
       RBES( lu )  RBES( lu 1+ ) xx  lu lfun  f*/  f-  RBES( lu 1- )!
       RNEU( lv 1+ )  RNEU( lv ) xx  lv lfun  f*/  f-  RNEU( lv 2+ )!
       -1 += lu  1 += lv
     LOOP

    Precision improved indeed by about one decimal digit, iow close to
    maximal achievable double fp-number resolution. Higher precision
    would require calculation with extended fp-numbers.

    The numerical implementation of the algorithm using double-precision
    numbers overflows the exponent for large l, and for large x. The results
    of the calculation, however, are typically reasonable numbers over a
    large range of l,x. I looked at the code to see if F*/ could be used to
    avoid the exponent overflow in the intermediate calculations, and
    couldn't see anything obvious.

    --
    Krishna
