• copying and pasting from pdf to Forth

    From Krishna Myneni@21:1/5 to All on Mon Jul 1 05:38:23 2024
    Yesterday, I learned a good lesson: do not copy and paste text from a PDF
    into the Forth environment. There can be hidden characters when doing
    so, and then a word fails because the input isn't correct.

    For an hour or so I was chasing down an imaginary bug in the
    (non-standard) word NUMBER? used to convert a counted string into a
    signed double length number.

    --
    Krishna Myneni

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From sjack@21:1/5 to Krishna Myneni on Mon Jul 1 12:13:36 2024
    Krishna Myneni <krishna.myneni@ccreweb.org> wrote:
    > Yesterday, I learned a good lesson: do not copy and paste text from a PDF
    > into the Forth environment. There can be hidden characters when doing
    > so, and then a word fails because the input isn't correct.

    I've encountered that problem often enough over many years to
    get a feel for it: the action doesn't match the logic.
    Vim has a ':list' option that will display non-printables,
    and for block files I just rewrite the suspect line. I also
    have ADUMP, which prints text and displays any non-printable
    as '^nn', where nn is the value of the non-printable.
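
    For readers without such a word, a minimal sketch in that spirit
    follows. It is a hypothetical stand-in, not the actual ADUMP: the name
    .VISIBLE and the choice of hexadecimal for nn are assumptions, and it
    relies on WITHIN from the CORE EXT word set.

    ```forth
    \ Hypothetical stand-in for an ADUMP-like word (assumed name: .VISIBLE).
    \ Prints printable ASCII as-is and every other byte as ^nn, nn in hex.
    : .VISIBLE ( c-addr u -- )
      BASE @ >R HEX                 \ save the caller's base, format in hex
      0 ?DO
        DUP I + C@                  \ fetch the next byte
        DUP 32 127 WITHIN           \ printable ASCII is 32..126
        IF   EMIT
        ELSE [CHAR] ^ EMIT  0 <# # # #> TYPE   \ always two hex digits
        THEN
      LOOP DROP
      R> BASE ! ;                   \ restore the caller's base

    \ Assumes S" works at the interpreter prompt and the paste is UTF-8.
    S" -17" .VISIBLE    \ prints -17
    S" −17" .VISIBLE    \ prints ^E2^88^9217 when the minus is U+2212
    ```

    Because each escape is exactly two digits, ^92 followed by 17 reads
    unambiguously as byte 92 and then the text "17".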

    --
    me

  • From mhx@21:1/5 to Krishna Myneni on Mon Jul 1 11:47:24 2024
    Krishna Myneni wrote:
    > [..]
    > For an hour or so I was chasing down an imaginary bug in the
    > (non-standard) word NUMBER? used to convert a counted string into a
    > signed double length number.

    I know about '-', and single/double quote characters. They are a
    nuisance not only in Forth.

    Was it something else?

    -marcel

  • From Krishna Myneni@21:1/5 to sjack on Mon Jul 1 18:45:10 2024
    On 7/1/24 07:13, sjack wrote:
    > Krishna Myneni <krishna.myneni@ccreweb.org> wrote:
    >> Yesterday, I learned a good lesson: do not copy and paste text from a PDF
    >> into the Forth environment. There can be hidden characters when doing
    >> so, and then a word fails because the input isn't correct.
    >
    > I've encountered that problem often enough over many years to
    > get a feel for it: the action doesn't match the logic.
    > Vim has a ':list' option that will display non-printables,
    > and for block files I just rewrite the suspect line. I also
    > have ADUMP, which prints text and displays any non-printable
    > as '^nn', where nn is the value of the non-printable.



    No display difference when I used :SET LIST in Vim. But I found the
    issue was the difference between a Unicode minus sign and an ASCII minus
    sign -- see above.

    --
    KM

  • From Krishna Myneni@21:1/5 to mhx on Mon Jul 1 18:33:58 2024
    On 7/1/24 06:47, mhx wrote:
    > Krishna Myneni wrote:
    >> [..]
    >> For an hour or so I was chasing down an imaginary bug in the
    >> (non-standard) word NUMBER? used to convert a counted string into a
    >> signed double length number.
    >
    > I know about '-', and single/double quote characters. They are a
    > nuisance not only in Forth.
    >
    > Was it something else?
    >
    > -marcel

    It was a large number containing commas. I deleted the commas.

    The text is pasted here:

    −170,141,183,460,469,231,731,687,303,715,884,105,728

    It appears to have pasted properly, but there is a difference between
    Line 1 and Line 2. The latter is entered by hand.
    \ Line 1
    c" −170141183460469231731687303715884105728" NUMBER? .s

    0
    0
    0
    ok
    drop 2drop \ Now enter by hand
    ok
    \ Line 2
    c" -170141183460469231731687303715884105728" NUMBER? .s

    -1
    -9223372036854775808
    0

    ok

    From my newsreader, I can copy Line 1 into my Forth environment and
    reproduce the error. When I copy Line 2 which is entered by hand,
    NUMBER? works as expected.

    --
    KM

  • From Krishna Myneni@21:1/5 to Krishna Myneni on Mon Jul 1 18:43:09 2024
    On 7/1/24 18:33, Krishna Myneni wrote:
    > On 7/1/24 06:47, mhx wrote:
    >> Krishna Myneni wrote:
    >>> [..]
    >>> For an hour or so I was chasing down an imaginary bug in the
    >>> (non-standard) word NUMBER? used to convert a counted string into a
    >>> signed double length number.
    >>
    >> I know about '-', and single/double quote characters. They are a
    >> nuisance not only in Forth.
    >>
    >> Was it something else?
    >>
    >> -marcel
    >
    > It was a large number containing commas. I deleted the commas.
    >
    > The text is pasted here:
    >
    > −170,141,183,460,469,231,731,687,303,715,884,105,728
    >
    > It appears to have pasted properly, but there is a difference between
    > Line 1 and Line 2. The latter is entered by hand.
    > \ Line 1
    > c" −170141183460469231731687303715884105728" NUMBER? .s
    >
    >         0
    >         0
    >         0
    >  ok
    > drop 2drop  \ Now enter by hand
    >  ok
    > \ Line 2
    > c" -170141183460469231731687303715884105728" NUMBER? .s


    Ok, when I apply COUNT I see there is a 2-character difference in length
    between lines 1 and 2. If you look closely, the minus sign is different
    between the two! The first line must have copied a UTF-8 encoded Unicode
    character for the minus sign.
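
    The length difference can be checked directly. A minimal sketch,
    assuming a Forth 2012 system where S" works at the interpreter prompt
    and the pasted text arrives as UTF-8. NON-ASCII is a hypothetical
    helper name, not part of kForth or the standard, and S" is used
    instead of c" with COUNT to stay within standard interpretation
    semantics:

    ```forth
    \ Hypothetical helper: count the bytes above 127 in a string.
    \ A nonzero result means the input is not plain ASCII.
    : NON-ASCII ( c-addr u -- n )
      0 ROT ROT                     \ running count goes under the string
      0 ?DO
        DUP I + C@ 127 > IF SWAP 1+ SWAP THEN
      LOOP DROP ;

    \ Assumes UTF-8 input; U+2212 MINUS SIGN encodes as bytes E2 88 92.
    S" −17" NIP .          \ byte length with the pasted minus: 5
    S" -17" NIP .          \ byte length with the typed minus:  3
    S" −17" NON-ASCII .    \ 3
    S" -17" NON-ASCII .    \ 0
    ```

    The 5 versus 3 is the same two-byte difference COUNT reports for the
    long number: the Unicode minus occupies three bytes in UTF-8 where
    the ASCII minus occupies one.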

    --
    KM

  • From mhx@21:1/5 to All on Tue Jul 2 08:03:36 2024
    iForth silently drops non-ASCII characters:

    ( mouse copy and paste followed by keyboard ENTER )
    FORTH> 17 ok
    [1]FORTH> . 17 ok
    FORTH>

    Octave is more explicit:

    octave:1> −17
    error: parse error:

    invalid character '�' (ASCII 226)

    −17
    ^
    octave:1>

    -marcel

  • From sjack@21:1/5 to Krishna Myneni on Tue Jul 2 14:29:14 2024
    Krishna Myneni <krishna.myneni@ccreweb.org> wrote:

    > No display difference when I used :SET LIST in Vim. But I found the
    > issue was the difference between a Unicode minus sign and an ASCII minus
    > sign -- see above.


    Hard to believe Vim :SET LIST wouldn't catch that. But I use codepage
    KOI8-R, so that wouldn't show up for me. I'll have to try Vim on a
    Unicode terminal. But my ADUMP would catch it for sure; it shows the
    values Forth sees.

    Ok, tried on a Unicode terminal, and the PDF I used had the correct value,
    45, for minus after copying to a text file that I loaded with Forth.
    (Vim ga over the minus also showed 45.) Also, when I typed a minus in
    Vim in a text file, it had the correct value. So your particular
    PDF file was the culprit; so yes, all be warned.


    --
    me

  • From Krishna Myneni@21:1/5 to sjack on Tue Jul 2 19:49:43 2024
    On 7/2/24 09:29, sjack wrote:
    > Krishna Myneni <krishna.myneni@ccreweb.org> wrote:
    >
    >> No display difference when I used :SET LIST in Vim. But I found the
    >> issue was the difference between a Unicode minus sign and an ASCII minus
    >> sign -- see above.
    >
    > Hard to believe Vim :SET LIST wouldn't catch that. But I use codepage
    > KOI8-R, so that wouldn't show up for me. I'll have to try Vim on a
    > Unicode terminal. But my ADUMP would catch it for sure; it shows the
    > values Forth sees.
    >
    > Ok, tried on a Unicode terminal, and the PDF I used had the correct value,
    > 45, for minus after copying to a text file that I loaded with Forth.
    > (Vim ga over the minus also showed 45.) Also, when I typed a minus in
    > Vim in a text file, it had the correct value. So your particular
    > PDF file was the culprit; so yes, all be warned.



    Forth is a good environment for troubleshooting this -- just type S"
    followed by a space, paste the text, and close the quote, then perform
    DUMP to see the hex codes of all the characters. Many implementations
    of DUMP also show the printable characters.
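
    As a minimal sketch of that check, assuming S" is usable at the
    interpreter prompt and the pasted text is UTF-8. The layout DUMP
    prints is implementation-defined, so only the byte values noted in
    the comments are meant literally:

    ```forth
    \ Assumes interpretive S" and UTF-8 input; DUMP's layout varies by system.
    S" −17" DUMP    \ pasted minus (U+2212): expect bytes E2 88 92 31 37
    S" -17" DUMP    \ typed minus (ASCII):   expect bytes 2D 31 37
    ```

    Any byte at 80 hex or above in the dump marks a character that is not
    plain ASCII.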

    About the PDF file: it was one I created (the User's manual for
    kForth-64); now I specifically remember changing the ASCII minus sign
    to a Unicode minus so that it would be more readable! I had forgotten
    about that.

    --
    Krishna
