Here are the results of six invocations of the "regexp" command.
For one of them I'm sure that the result is correct (1).
For some of them I'm unsure (2, 3). I wouldn't be surprised if the result can be explained to be
correct though.
However, for invocations 4, 5 and 6 I definitely can't imagine how the results can be correct:
After more thinking, I can imagine that 5 and 6 can be explained also.
But I can't wrap my mind around cases 4 and 3 (the latter additionally to my previous post).
Erik.
--
elns@ nl | Merge the left part of these two lines into one,
xs4all. | respecting a character's position in a line.
The arguments typed in the source code are not precisely what the command actually sees. This is explained by a close reading of the rules of Tcl.
The results:
regexp - a-z
regexp - a-z
regexp - a-z
regexp -\ a-z
regexp \- a-z
regexp - a-z
-Brian
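One way to check which words a command receives once Tcl's substitution rules have been applied is to pass the same words to a small echo procedure. The sketch below assumes a helper named showargs, which is hypothetical and not part of this thread:

```tcl
# Hypothetical helper (not from the thread): returns the words it receives
# after the Tcl parser has performed its substitutions.
proc showargs {args} {
    return $args
}

# Backslash substitution turns \- into a plain "-", so both calls
# hand the same two words to the command.
puts [showargs - a-z]
puts [showargs \- a-z]

# A backslash before the space keeps "-" and "a-z" together as ONE word.
puts [llength [showargs -\ a-z]]
```

Replacing showargs with any other command name does not change how the words are parsed; it only changes what is done with them afterwards.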
On 17/09/2022 02:54, briang wrote:
The arguments typed in the source code are not precisely what the command actually sees. This is explained by a close reading of the rules of Tcl.
Thanks Brian, I will investigate what you indicate.
Nonetheless, these results ... :
The results:
regexp - a-z
regexp - a-z
regexp - a-z
regexp -\ a-z
regexp \- a-z
regexp - a-z
... indicate that the regexp command sees identical arguments for cases 1, 2, 3 and 6.
However, the *results* of the command invocations for these cases are not the same.
% regexp - $str; #1
bad option "-": must be -all, -about, -indices, -inline, -expanded,
-line, -linestop, -lineanchor, -nocase, -start, or --
% regexp \- $str; #2
1
That seems to me like there's a bug lurking somewhere.
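Independently of whether cases 1 and 2 ought to behave alike, a pattern that starts with "-" can be kept from being parsed as a switch by putting "--" in front of it, the end-of-switches marker listed in the error message above. A minimal sketch, assuming str holds the value a-z (the thread does not show how str was set):

```tcl
# Assumed sample value; not shown in the thread.
set str "a-z"

# "--" ends switch processing, so the following "-" is taken as the pattern.
puts [regexp -- - $str]

# The same idiom protects a pattern that arrives in a variable.
set pattern "-"
puts [regexp -- $pattern $str]
```

With "--" in place, the outcome no longer depends on whether the leading "-" was written with or without a backslash.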
On 16/09/2022 21:06, Erik Leunissen wrote:
% regexp - $str; #1
bad option "-": must be -all, -about, -indices, -inline, -expanded,
-line, -linestop, -lineanchor, -nocase, -start, or --
% regexp \- $str; #2
1
While these should be the exact same thing, they produce different byte codes:
% ::tcl::unsupported::disassemble script {regexp - $str}
ByteCode 0x0x555be8c65680, refCt 1, epoch 17, interp 0x0x555be8bfa380
(epoch 17)
Source "regexp - $str"
Cmds 1, src 13, inst 10, litObjs 3, aux 0, stkDepth 3, code/src 0.00
Commands 1:
1: pc 0-8, src 0-12
Command 1: "regexp - $str"
(0) push1 0 # "regexp"
(2) push1 1 # "-"
(4) push1 2 # "str"
(6) loadStk
(7) invokeStk1 3
(9) done
% ::tcl::unsupported::disassemble script {regexp \- $str}
ByteCode 0x0x555be8c66180, refCt 1, epoch 17, interp 0x0x555be8bfa380
(epoch 17)
Source "regexp \- $str"
Cmds 1, src 14, inst 8, litObjs 2, aux 0, stkDepth 2, code/src 0.00
Commands 1:
1: pc 0-6, src 0-13
Command 1: "regexp \- $str"
(0) push1 0 # "-"
(2) push1 1 # "str"
(4) loadStk
(5) regexp +3
(7) done
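The comparison above can be repeated for several scripts in one go. The following is only a sketch: it uses the same ::tcl::unsupported::disassemble command as above, the third script (with "--") is an addition for contrast, and searching for invokeStk is a rough heuristic rather than a documented interface. What it reports depends on the Tcl version in use.

```tcl
# Scripts to compare; the third one is not from the thread.
set scripts [list {regexp - $str} {regexp \- $str} {regexp -- - $str}]

foreach script $scripts {
    set asm [::tcl::unsupported::disassemble script $script]
    # Heuristic: an invokeStk instruction means the command is called
    # generically instead of being compiled inline.
    if {[string match *invokeStk* $asm]} {
        set how "generic invocation"
    } else {
        set how "compiled inline"
    }
    puts [format "%-20s -> %s" $script $how]
}
```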
On 16/09/2022 21:06, Erik Leunissen wrote:
% regexp - $str; #1
bad option "-": must be -all, -about, -indices, -inline, -expanded, -line, -linestop, -lineanchor, -nocase, -start, or --
% regexp \- $str; #2
1
While these should be the exact same thing, they produce different byte codes:
% ::tcl::unsupported::disassemble script {regexp - $str}
ByteCode 0x0x555be8c65680, refCt 1, epoch 17, interp 0x0x555be8bfa380 (epoch 17)
Source "regexp - $str"
Cmds 1, src 13, inst 10, litObjs 3, aux 0, stkDepth 3, code/src 0.00
Commands 1:
1: pc 0-8, src 0-12
Command 1: "regexp - $str"
(0) push1 0 # "regexp"
(2) push1 1 # "-"
(4) push1 2 # "str"
(6) loadStk
(7) invokeStk1 3
(9) done
The byte-code compiler cannot optimize regexp with the invalid "switch" "-"; therefore, it simply invokes the actual command (which will bail out).
% ::tcl::unsupported::disassemble script {regexp \- $str}
ByteCode 0x0x555be8c66180, refCt 1, epoch 17, interp 0x0x555be8bfa380 (epoch 17)
Source "regexp \- $str"
Cmds 1, src 14, inst 8, litObjs 2, aux 0, stkDepth 2, code/src 0.00
Commands 1:
1: pc 0-6, src 0-13
Command 1: "regexp \- $str"
(0) push1 0 # "-"
(2) push1 1 # "str"
(4) loadStk
(5) regexp +3
(7) done
The byte-code compiler seems not to detect the erroneous first argument. It fails to apply backslash substitution before looking for switches; therefore, it produces the "correct" invocation of regexp.
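The difference between the two code paths can also be observed without reading byte code, by running the same words once where the compiler may inline regexp and once where it cannot. This is a sketch under assumptions: str is set to a-z, the body of catch is assumed to be byte-compiled, and the command name is placed in a variable to force the generic invocation. The result of the first call depends on the Tcl version, since that is the behavior under discussion here.

```tcl
set str "a-z"   ;# assumed sample value, not shown in the thread

# The catch body is normally byte-compiled, so regexp may be compiled inline.
set rcCompiled [catch {regexp \- $str} resCompiled]

# With the command name in a variable the compiler cannot inline regexp;
# the command itself receives "-" after backslash substitution.
set cmd regexp
set rcGeneric [catch {$cmd \- $str} resGeneric]

puts "compiled: code=$rcCompiled result=$resCompiled"
puts "generic:  code=$rcGeneric result=$resGeneric"
```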