Given the string 'Let FN be "Billy Bob, jr.",MI be {Bob},LN be {Billy Bob, Jr}
I am able to use a regexp negative lookahead to locate all commas in positions not enclosed in braces: ,(?![^\{]*\})
However, a regular expression that identifies the locations of commas outside both double quotes and braces would be appreciated.
Thanks
On Fri, 4 Nov 2022 10:56:05 -0700 (PDT), The Rickster wrote:
Given the string 'Let FN be "Billy Bob, jr.",MI be {Bob},LN be {Billy Bob, Jr}
I am able to use a regexp negative lookahead to locate all commas in positions not enclosed in braces: ,(?![^\{]*\})
However, a regular expression that identifies the locations of commas outside both double quotes and braces would be appreciated.
Thanks

Those problems are better handled in multiple stages.
For example, begin by removing everything in quotes or braces:
regsub -all {"[^"]*"|{[^{}]*}} $string "" variable
Then all commas remaining in $variable are the ones you're looking for. Case closed.
But if you still need to know their exact positions within the original complete string, then I think you'll have to write your own parser.
But regexp still can help you a lot:
% set FN {Let FN be "Billy Bob, jr.",MI be {Bob},LN be {Billy Bob, Jr}}
% regexp -all -inline -indices {"} $FN
{10 10} {25 25}
The first double quote is in position 10, the second in position 25.
There are no others.
Let's confirm:
% string first \" $FN 0
10
% string first \" $FN 11
25
OK. Let's continue:
% regexp -all -inline -indices {,} $FN
{20 20} {26 26} {38 38} {55 55}
Four commas: 20, 26, 38, 55.
The first comma falls inside the 10 to 25 range of the double quotes,
so it's out. All the others are outside the quotes.
The same trick is harder with braces because they come in pairs, i.e.
the opening and closing characters differ. So I recommend using string first for that:
% string first \{ $FN 0
33
% string first \} $FN 33
37
So all commas within the 33 to 37 range are invalid.
But there are more brace pairs in the string. You have to find
them all.
You can probably pick up from there.
--
Luc
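Picking up from the quote and brace positions found above, the hand-written scan could be sketched roughly like this. It is only a sketch under two assumptions the thread has not settled: quotes and braces are never nested and never escaped. The proc name outerCommas is made up for illustration.

```tcl
# Hypothetical helper: return the indices of all commas that are
# neither inside double quotes nor inside braces.
# Assumption: delimiters are not nested and not escaped.
proc outerCommas {str} {
    set result {}
    set closer ""          ;# closing delimiter we are waiting for, if any
    set i 0
    foreach ch [split $str ""] {
        if {$closer ne ""} {
            # Inside a quoted or braced region: only look for its end.
            if {$ch eq $closer} {set closer ""}
        } elseif {$ch eq "\""} {
            set closer "\""
        } elseif {$ch eq "\{"} {
            set closer "\}"
        } elseif {$ch eq ","} {
            lappend result $i
        }
        incr i
    }
    return $result
}

set FN {Let FN be "Billy Bob, jr.",MI be {Bob},LN be {Billy Bob, Jr}}
puts [outerCommas $FN]   ;# prints: 26 38
```

The scan visits each character once and keeps the original indices, which is the part the regsub approach loses.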
On Friday, November 4, 2022 at 11:29:52 AM UTC-7, Luc wrote:
Those problems are better handled in multiple stages.

Thanks for confirming that I wasn't missing some trivial solution... Your suggestions are appreciated and will be followed.
On Saturday, November 5, 2022 at 6:25:56 AM UTC-7, heinrichmartin wrote:
A few notes about precise problem statements:
Where does the string end? As there is no closing single quote, is the single quote part of the string?
We are currently seeing two threads on c.l.t. that demonstrate the XY-problem - everyone can learn from that, too.
Are you looking for the comma or are you looking for the definitions "* be *"?
What is the grammar? May quotation marks or braces be nested or escaped? Does repeated whitespace matter? Which whitespace? May the statement span lines?
% set in {Let FN be "Billy Bob, jr.",MI be {Bob},LN be {Billy Bob, Jr}}
Let FN be "Billy Bob, jr.",MI be {Bob},LN be {Billy Bob, Jr}
% set kv [regexp -all -inline {(^Let\s|,)\s*(\w+)\s+be\s+([^,]+|"[^"]+"|\{[^\}]+\})\s*(?=,|$)} $in]
{Let FN be "Billy Bob, jr."} {Let } FN {"Billy Bob, jr."} {,MI be {Bob}} , MI {{Bob}} {,LN be {Billy Bob, Jr}} , LN {{Billy Bob, Jr}}
% dict get $kv FN
"Billy Bob, jr."
% dict get $kv MI
{Bob}
% dict get $kv LN
{Billy Bob, Jr}

On Friday, November 4, 2022 at 11:29:52 AM UTC-7, Luc wrote:
Those problems are better handled in multiple stages.

I disagree with the general statement (e.g. most parsers work by traversing the input only once), but I agree that regexp need not be the best solution.

On Saturday, November 5, 2022 at 5:13:26 AM UTC+1, The Rickster wrote:
Thanks for confirming that I wasn't missing some trivial solution... Your suggestions are appreciated and will be followed.

A single negative answer does not mean that no trivial solution exists - especially in the setting of an XY-problem.

On Thursday, November 10, 2022 at 9:50:37 PM UTC-8, The Rickster wrote:
Hey, lesson learned.
Answers:
There is no closing quote. A string of text may be enclosed in double quotes or braces.
One can ignore the "Let" portion of the string. The intent is to be able to evaluate each ?varname be ?textstring, where the text string is delimited by braces or double quotes.
The statement does not span lines and repeated whitespace does not matter. My initial thought was to replace each comma with a 'non printable' character (e.g. x03) and then split the string.
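For what it's worth, the replace-then-split idea mentioned above could be sketched along these lines. The assumptions are mine, not from the thread: delimiters are never nested or escaped, and the input never contains \x03 itself. The proc name splitDefs is hypothetical, and the leading "Let" is left off the sample input since it can be ignored.

```tcl
# Hypothetical helper: split a definition list on the commas that lie
# outside double quotes and braces, by first turning those commas
# into \x03.
# Assumptions: delimiters are not nested or escaped, and the input
# never contains \x03 itself.
proc splitDefs {str} {
    # Tokens: a quoted run, a braced run, a comma, a run of other
    # characters, or (as a fallback) any single character.
    set tokens [regexp -all -inline {"[^"]*"|\{[^\{\}]*\}|,|[^",\{]+|.} $str]
    set marked ""
    foreach tok $tokens {
        # Only a bare comma token is a separator; commas inside a
        # quoted or braced token stay part of that token.
        if {$tok eq ","} {set tok \x03}
        append marked $tok
    }
    return [split $marked \x03]
}

set in {FN be "Billy Bob, jr.",MI be {Bob},LN be {Billy Bob, Jr}}
foreach def [splitDefs $in] {
    puts $def
}
```

Each element of the result is one "varname be textstring" definition with its delimiters still in place.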
The regexp you supplied is what was needed. What is c.l.t.? How can I access it?
Rick
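As a possible follow-up to the regexp from earlier in the thread: since each match yields four list elements, foreach can walk them in groups and strip the surrounding quotes or braces while building a clean dict. This is only a sketch and assumes every text string really is delimited, as stated above. The variable names re and defs are illustrative.

```tcl
set in {Let FN be "Billy Bob, jr.",MI be {Bob},LN be {Billy Bob, Jr}}
set re {(^Let\s|,)\s*(\w+)\s+be\s+([^,]+|"[^"]+"|\{[^\}]+\})\s*(?=,|$)}

set defs [dict create]
# -inline returns four elements per match: the whole match, the
# separator, the variable name, and the still-delimited text.
foreach {whole sep name text} [regexp -all -inline $re $in] {
    # Assumption: the text is always wrapped in one pair of quotes or
    # braces, so drop the first and last character.
    dict set defs $name [string range $text 1 end-1]
}

puts [dict get $defs FN]   ;# prints: Billy Bob, jr.
puts [dict get $defs MI]   ;# prints: Bob
puts [dict get $defs LN]   ;# prints: Billy Bob, Jr
```

Unlike reading the raw result list as a dict, this keeps only the variable names as keys.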
On Thursday, November 10, 2022 at 9:50:37 PM UTC-8, The Rickster wrote:
What is c.l.t.? How can I access it?

It is an abbreviation for the Usenet[2] group comp.lang.tcl. We are
currently exchanging messages there.