BTW: I think that if „space” is too difficult to use as a „thousands separator” ignored by Forth, I have a better „candidate”: Vertical Tab (0Bh):
- it's practically unused anywhere
- it could be entered with, say, Shift-Space
- it could be displayed as, guess what, just a single space
Tab is not a glyph, but a control code for mechanical typewriters.
Mechanical typewriters haven't been used for a very long time,
so VT can be „misused” for more practical things than controlling non-existent hardware.

> I have a better „candidate”: Vertical Tab (0Bh):
Terrible bad idea, because it cannot be visually discerned from a space.
> Mechanical typewriters haven't been used for a very long time,
True enough; but the point that you can't see it unless a special font
is used is a valid one.

Maybe my assumption is different, but actually I don't see any need to make it visible. I treat it as a kind of „hard space” („non-breakable”), as sometimes used
in text editors. I see its supposed invisibility rather as an advantage.
Of course, if for some particular reason that character should be visible, underscore
may be good enough.
The entire point of a thousands separator is to help humans
read large numbers or small fractions.

Like this:
284 985 000 234,23

Great. So how should a Forth text interpreter know that this is one
number, not four? And how should a human reading this as Forth code
know that?
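To make the objection concrete, here is a minimal illustration, assuming a standard text interpreter that splits its input at spaces. The digit groups from the example arrive as four separate numbers, so three additions are needed to consume them.

```forth
\ Four blank-delimited tokens are four numbers on the stack.
\ "000" is simply the number zero.
284 985 000 234 + + + .    \ prints: 1503
```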
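\ Copy a string into the HOLD buffer, leaving out the underscores.
: strip ( c-addr u -- c-addr2 u2 )
  <# 2dup 1- over + do                             \ from the last character down to the first
    i c@ [char] _ over - if hold else drop then    \ keep everything except "_"
  -1 +loop #> ;                                    \ #> drops the original c-addr u
Won't work on zero-length strings, but that is irrelevant here.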
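For comparison, a hedged variant of the same idea that also accepts zero-length strings and takes the separator as a parameter. The names `strip-sep` and `sep>ud` are made up for this sketch and are not from the thread; `d.` assumes the Double-Number word set is present.

```forth
: strip-sep ( c-addr u char -- c-addr2 u2 )  \ copy the string minus every occurrence of char
  >r <#                              \ keep char on the return stack; start the HOLD buffer
  begin dup while                    \ while characters remain
    1- 2dup + c@                     \ fetch the last unprocessed character
    dup r@ = if drop else hold then  \ skip separators, keep everything else
  repeat
  r> drop #> ;                       \ #> discards ( c-addr 0 ) and returns the result

: sep>ud ( c-addr u char -- ud flag )  \ flag is true if the whole string converted
  strip-sep  0 0 2swap >number  nip 0= ;

s" 284_985_000" char _ sep>ud . d.    \ prints: -1 284985000
```

As in the original, the result lives in the pictured-numeric-output buffer, so it has to be converted before anything else prints a number.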
That's why I proposed VT for that. The operator, by pressing Shift-Space, inserts VT between _groups_ of digits of a single number.
On the screen it looks like „ordinary” spaces, exactly as with
„ordinary space” and „non-breakable space” in a text editor.
Actually, employing VT could have another advantage: consider all
these „hyphenated words”. They wouldn't have to be hyphenated
any longer. Instead, the „pseudo space” VT could „link” the two strings
that comprise such a word, making it look more natural.
So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?
Why can't you see the issues this would cause???
There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.
001 002 003 004
Again, how should a human see the difference between
unused-words
and
unused words
if you replace the "-" by something that looks like a space?
Sometimes it may indeed create a problem, but taking a peek
into the glossary should usually help.
> So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?
No.
> Why can't you see the issues this would cause???
What issues, in particular?
> There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.
> 001 002 003 004
It depends on whether the groups of digits are separated by spaces or „connected” by VT.
> It depends on whether the groups of digits are separated by spaces or „connected” by VT.
That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!
Why can't you grasp this fail?
> Why can't you grasp this fail?
1. You wrote about the text interpreter -- did you mean 'human' or Forth?
Forth won't have any problem, it'll find VT there.
2. If you mean human: if you want others to understand you, you
have to be precise in your statements. So it's enough to separate two numbers with TWO (or more) spaces, while keeping the groups of digits „connected” with a SINGLE VT (shown as a single space).
I honestly don't understand why you are putting so much effort into creating a problem out of nothing. You want to be properly understood? Be precise, that's all.
Ok, how many spaces did I type to separate these digits?
0123 4567 8901 2345
gnuarm.del...@gmail.com wrote on Tuesday, 26 July 2022 at 22:06:34 UTC+2:
> Ok, how many spaces did I type to separate these digits?
> 0123 4567 8901 2345
At least there is a space between N and 7 in this geo coordinate: 38°17′10″N 76°24′42″W
Very helpful. ;o)

On Tuesday, July 26, 2022 at 4:51:25 PM UTC-4, minf...@arcor.de wrote:
> At least there is a space between N and 7 in this geo coordinate: 38°17′10″N 76°24′42″W
So if you had a few spaces (not vertical tabs) in your coordinate, 38° 17′ 10″ N 76° 24′ 42″ W, I believe Forth would read the number 38, then treat ° as a word, no? I suppose ' would be a problem, since that is already in use. ", however, is not in use, so that would be good. I suppose if you were looking for coordinates in text, you could redefine ' for a bit, then restore it to mean "tick". Or do I not understand how numbers are read?
> 1. You wrote about the text interpreter -- did you mean 'human' or Forth? Forth won't have any problem, it'll find VT there.
Yes, but you then had to ask what I typed, showing the shortcoming that a human can't tell. That was my point... unless you are not a human after all.
> I honestly don't understand why you are putting so much effort into creating a problem out of nothing. You want to be properly understood? Be precise, that's all.
Yes, you don't understand. That's the point.
> Yes, but you then had to ask what I typed, showing the shortcoming that a human can't tell.
If you write something like this: 001_002_003 004 -- I'll also have to ask you
what you actually typed.
It doesn't depend on the selected separator character.
> Ok, how many spaces did I type to separate these digits?
> 0123 4567 8901 2345
Maybe now it's time for me to ask a question (you have already made
fair use of your question quota): does your Forth interpreter, and/or
your computer screen, „compress” spaces like the Google News interface? Or doesn't it?
> Yes, you don't understand. That's the point.
I never understood people who insist on looking for problems where there aren't any. I'm not a psychologist, you know, so I don't have to.
> The operator, by pressing Shift-Space, inserts VT between _groups_ of digits of a single number.
  >>> i=input(); i, ord(i)
  (' ', 32)
> Like this:
> 284 985 000 234,23
or '284 985 000 234.23' depending on locale?
'284_985_000_234,23' has fewer problems to resolve. Ugly as it might look, it is clearly one Forth item.

I was trying to explain that there are EXACTLY THE SAME „problems
to resolve” whether you connect the 3-digit groups with underscore
or with VT, but in the latter case it just... looks better.
32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?
Usual suspects pre-answered.
Q. Why the underscore character?
A. It's not one of the characters Forth Inc uses to denote a double number. It's increasingly used in programming languages for this purpose. Even XPL0 has it.
A. ANS didn't see the need for it.
Q. Are you married?
Q. Should >NUMBER process the underscore?
A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.
Q. Then you'll need a routine to strip the underscores and a temporary buffer
to hold the result. What do you suggest?
A. The HOLD buffer.
Q. Won't it interfere with numeric output?
A. Input/output are usually mutually exclusive.
Q. Won't the HOLD buffer need to be larger to hold the punctuation?
A. Assuming worst case and one underscore per 4 characters, 20% larger.
Q. Is all this just c.l.f. speculation - or have you implemented it?
A. Implemented
Q. Has it broken anything?
A. Not AFAIK
Q. What did it cost?
A. 34 bytes on 8086, 39 bytes on 8080
Q. Can't it be done using recognizers?
A. If so, probably at more cost.
Q. Will you keep it?
A. Good question. For 16-bit integers its value may be marginal. How often do you enter values in binary?
On Saturday, 23 July 2022 at 08:24:09 UTC+2, dxforth wrote:
32/64-bit machines have increased the risk of entering numbers incorrectly.
Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?
I got interested in this suggestion and implemented it.
I thought the underscore was a bit ugly, so I implemented a word to set the grouping char:

: SET-GROUPING-CHAR ( xchar -- )
  0 grping !                        \ clear the stored grouping char
  dup 32 > and grping xc!+ drop ;   \ store xchar, or 0 if it is not above BL

I also set the grouping differently based on BASE:
decimal and octal group 3 digits, hex 4 and binary 8.

After that I started testing different chars. Today I use ´ ($B4, acute accent).
I think that ties the numbers together, while _ puts them apart.

123´456´789 ok.
. 123´456´789 ok
'_' set-grouping-char ok
123_456_789 ok.
. 123_456_789 ok

I also tried out the space as suggested by Zbig, but not using VT.
At codepoint $A0 there is a non-breaking space char.

$a0 set-grouping-char ok
123456789 ok.
. 123 456 789 ok

It gets more difficult to input without remapping a key.
´ is nice as it is (on my Swedish keyboard) next to the + key on the top row;
no shift or alt key is needed to input it.

But using the non-breaking space I can now make words with spaces in them!

: Hej Peter ." Ciao Peter" ; ok
Hej Peter Ciao Peter ok

This of course looks even more confusing than spaces in numbers!
For me this improves readability enormously! Thanks for the suggestion.
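The output side of such a scheme can be sketched in standard Forth with pictured numeric output. This is not the poster's implementation: `group-char`, `group-size`, `#group` and `u.grouped` are hypothetical names, the group sizes follow the 3/4/8 rule described above, and `hold` limits this sketch to a single-byte separator (a character such as ´ in UTF-8 would need `xhold` from the XCHAR word set).

```forth
variable group-char   char _ group-char !   \ separator to insert (single byte only)

: group-size ( -- n )            \ digits per group for the current BASE
  base @ case
    16 of 4 endof
     2 of 8 endof
    3 swap                       \ default: groups of three
  endcase ;

: #group ( ud1 n -- ud2 )        \ convert up to n digits, then a separator if digits remain
  0 ?do
    #  2dup or 0= if unloop exit then
  loop
  group-char @ hold ;

: u.grouped ( u -- )             \ print an unsigned single with digit grouping
  0 <#  begin group-size #group  2dup or 0= until  #> type space ;

1234567 u.grouped                \ prints: 1_234_567
hex FFFF0000 u.grouped decimal   \ prints: FFFF_0000
```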
P Falth wrote on Monday, 22 August 2022 at 15:59:54 UTC+2:
> Today I use ´ ( $B4 acute accent)
Fine! I am just wondering if ´, i.e. $B4, is the same in most codepages/locales.
On Monday, 22 August 2022 at 16:44:02 UTC+2, minf...@arcor.de wrote:
> Fine! I am just wondering if ´ ie $B4 is the same in most codepages/locales.
My systems require input to be UTF-8 encoded Unicode and will output UTF-8 streams.
It has worked like that for over 20 years on both Windows and Linux.
´ at $B4 is present in the Windows 1252 and Linux Latin 1 codepages.
Is there any reason not to use Unicode and UTF-8 today on Windows and Linux?
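One detail behind the codepage question: $B4 is the code point, and it is stored as the single byte B4 only in the 8-bit codepages. In UTF-8 the same character takes two bytes. A small sketch that shows the UTF-8 bytes for code points below $800; the helper names `.byte` and `.utf8` are assumptions made for this illustration.

```forth
: .byte ( u -- )                 \ print u as two hex digits, whatever BASE is
  base @ >r hex  0 <# # # #> type space  r> base ! ;

: .utf8 ( xc -- )                \ show the UTF-8 bytes of a code point below $800
  dup $80 u< if .byte exit then  \ ASCII: one byte, unchanged
  dup 6 rshift $C0 or .byte      \ lead byte:  110xxxxx
  $3F and $80 or .byte ;         \ trail byte: 10xxxxxx

$B4 .utf8    \ prints: C2 B4   (acute accent)
$A0 .utf8    \ prints: C2 A0   (no-break space)
```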
There is a good reason to junk { BL WORD } in favor of TOKEN / NAME or whatever.
NAME ( -- addr n ) get a blank surrounded token from the input stream
with appropriate side effects on the input stream.
Then encoding of the characters shouldn't be a concern of the Forth system.
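Forth 2012 standardised a word of this kind as PARSE-NAME ( "name" -- c-addr u ). A minimal sketch of using it, assuming a system that provides PARSE-NAME; `.token` is a made-up name. The parser only looks for blanks, so whatever bytes lie between them are passed through untouched and the length is counted in address units, not in displayed characters.

```forth
: .token ( "name" -- )    \ print the length and text of the next blank-delimited token
  parse-name dup . type ;

.token 123_456            \ prints: 7 123_456
```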
On 23/08/2022 06:50, P Falth wrote:
> ... My systems require input to be utf8 encoded Unicode and will
> output utf8 streams. It has worked for over 20 years like that on
> both Windows and Linux. ´ at $B4 is present in Windows 1252 and
> Linux Latin 1 codepages. Is there any reason to not use Unicode and
> utf8 today on Windows and Linux?
String literals and comment fields excepted, there's not a lot of
reason to use UTF-8 in programming code.
Underscore in numbers is about convention. Several programming
languages have adopted it as a programmer convenience. It might
bemuse other languages to know Forth had no problem giving comma et
al new meanings but drew the line at underscore.
On 23.08.2022 05:21, dxforth wrote:
> String literals and comment fields excepted, there's not a lot of
> reason to use UTF-8 in programming code.
I really do like writing literals in sources in UTF-8, since my system
fully supports it and has not the faintest will to use antiquated or
strange things like CP1252, ISO-8859-xxx or UTF-16. But one quickly gets
used to writing sources in ASCII with hex escapes again when
collaborating with Windows people who are not willing or able to save
edited files as UTF-8, so that all your special characters (for me,
especially measurement units containing characters like U+00B0
(degrees), U+00B5 (Greek mu for the micro prefix), U+202F (narrow no-break
space between value and measurement unit) etc.) are lost every time one
of these moron^H^H^H^H^Hfolks changes something.
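What "ASCII with hex escapes" can look like, as a sketch assuming a Forth 2012 system with `S\"` and a terminal that displays UTF-8. The bytes E2 80 AF encode U+202F and C2 B5 encodes U+00B5; `.um` is a hypothetical name.

```forth
: .um ( n -- )   \ print n, then U+202F, then U+00B5 and "m" (micrometres)
  0 .r  s\" \xE2\x80\xAF\xC2\xB5m" type ;

21 .um
```

The source file stays pure ASCII, so it survives an editor that saves in a legacy codepage.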
> [..]
> 123´456´789 ok.
> . 123´456´789 ok
> ´ at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
> Is there any reason to not use Unicode and utf8 today on Windows and Linux?
Not wanting to contradict, but lots of Forth programs run on small systems where UTF-8 is not present, even when the programs are developed on feature-rich desktops.