• missing feature for "regsub"

    From aotto1968@21:1/5 to All on Sun May 1 14:59:57 2022
    Hi,

    I don't know if the TCL community is the right place to post a Regular-Expression (RG) enhancement, but let me try :-)

    Problem: how to combine MULTIPLE "regsub" statements into ONE single
    (probably more efficient) command.

    Let's start with "string map…"

    string map { old1 new1 old2 new2 … } STRING

    ONE command does MULTIPLE replacements.
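For reference, a minimal sketch of that one-pass behavior. The sample text and the variable names are made up for illustration and are not from the original post.

```tcl
# Hypothetical sample input, chosen only to show two keys being replaced.
set text "old1 and old2, then old1 again"

# A single call replaces every key in the map with its paired value.
set result [string map {old1 new1 old2 new2} $text]
puts $result
```

Each key is matched literally, so no regular-expression syntax is involved here.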

    I want to do the same with "regsub" BUT with a RG string, proposal:

    regsub {…(old1|old2|…)…} STRING {\?1(new1|new2|…)} OUTVAR

    "old1" will be replaced with "new1" … etc

    example: >>>>>

    string="MyName§E other-thing"

    regsub {(\w+)§(E|S|U)} $string {\?2(enum|struct|union) \1} OUTVAR

    OUTVAR="enum MyName other-thing"

    <<<<<<<<<<<<<<

    What is new is the "\?2(…)".

    \2 would insert the matched string itself, while "\?2(…)" would use the
    index of the matched alternative and pick up the replacement at that index from the "(…)".


    Kind regards

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Arjen Markus@21:1/5 to All on Mon May 2 03:06:09 2022
    A feature of [string map] is that the substitutions are defined in pairs, whereas your suggested syntax interferes with the alternatives already defined in regular expressions. I suspect there are more interferences lurking in there, besides the obvious one. Is the benefit over a sequence of regsubs really worth the trouble? I am not an expert on the machinery of regular expression matching, but I can imagine this would be quite complicated to get right, even at the definition level.
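For comparison, the "sequence of regsubs" could be sketched roughly as below. This is only an illustration under a few assumptions: the code-to-keyword table is taken from the example in the original post, the loop variable names are made up, and the "§" character is written as \u00a7 so the script does not depend on the file encoding.

```tcl
# Sample input from the original post; \u00a7 is the "§" separator.
set string "MyName\u00a7E other-thing"

# One regsub per alternative, driven by a small lookup table.
dict for {code keyword} {E enum S struct U union} {
    # Pattern becomes (\w+)§E, (\w+)§S, ...; replacement becomes "enum \1", ...
    regsub -all "(\\w+)\u00a7$code" $string "$keyword \\1" string
}
puts $string
```

The cost is one pass over the string per alternative, which is the inefficiency the proposal wants to avoid.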

    Regards,

    Arjen

  • From Schelte@21:1/5 to All on Mon May 2 14:04:01 2022
    On 01/05/2022 14:59, aotto1968 wrote:
    Problem: how to combine MULTIPLE "regsub" statements into ONE single
    (probably more efficient) command.

    In Tcl 8.7, the regsub command has a new -command option that may be of
    use (TIP #463: https://core.tcl-lang.org/tips/doc/trunk/tip/463.md).

    At least it can handle your example with a single regsub:

    regsub -command {(\w+)§[ESU]} $string [list apply [list {match prefix} {
        set ch [string index $match end]
        string cat [string map {E enum S struct U union} $ch] " " $prefix
    }]] OUTVAR


    Schelte.

  • From Schelte@21:1/5 to Schelte on Mon May 2 14:12:35 2022
    On 02/05/2022 14:04, Schelte wrote:
    regsub -command {(\w+)§[ESU]} $string [list apply [list {match prefix} {
        set ch [string index $match end]
        string cat [string map {E enum S struct U union} $ch] " " $prefix
    }]] OUTVAR

    Of course that can be simplified to:
    regsub -command {(\w+)§([ESU])} $string [list apply [list {match prefix ch} {
        string cat [string map {E enum S struct U union} $ch] " " $prefix
    }]] OUTVAR


    Schelte

  • From Harald Oehlmann@21:1/5 to All on Mon May 2 15:58:14 2022
    Schelte,
    I appreciate your high level answers and deep knowledge of TCL.
    I always learn from them!
    Thank you for that,
    Harald

  • From aotto1968@21:1/5 to All on Fri Jun 24 10:02:08 2022
    Hi,

    *regsub* and *regexp* are used to apply a *regular expression* to a string.
    Unlike *regexp*, *regsub* is used to modify the string.

    I have the following problem:

    → I want to modify the string *config_error_text* to *configErrorText*.

    This looks like a simple *regsub* job, like:

    regsub -all {_(\w)} "config_error_text" {\1??}

    But as you can see, I cannot uppercase the '\1' to do the job I want.
    I can't even call a *proc* in the *{\1??}* to do arbitrary
    post-processing.


    Kind regards

  • From Arjen Markus@21:1/5 to All on Fri Jun 24 05:46:07 2022
    If I remember correctly, Tcl 8.7 does allow you to specify a procedure like that.
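A minimal sketch of what that could look like, assuming Tcl 8.7 or later where regsub accepts -command (TIP #463). The proc name snakeToCamelCmd is hypothetical and only wraps the call so it can be reused.

```tcl
# Requires Tcl 8.7 or later: regsub -command is not available in 8.6.
# snakeToCamelCmd is a made-up helper name, not part of Tcl.
proc snakeToCamelCmd {str} {
    # The command prefix is called with the full match and then each
    # capture group; its result replaces the matched text.
    return [regsub -all -command {_(\w)} $str \
        {apply {{match ch} {string toupper $ch}}}]
}
```

With this in place, `snakeToCamelCmd config_error_text` should return `configErrorText`.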

    Regards,

    Arjen

  • From Gerald Lester@21:1/5 to All on Fri Jun 24 10:04:16 2022

    You may want to consider using the *string map* command instead.
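One way this might be done is to build the map programmatically. The sketch below assumes only ASCII lowercase letters need handling; the variable names are made up for illustration.

```tcl
# Build a map of the form {_a A _b B ... _z Z}.
# Assumption: only ASCII lowercase letters follow an underscore.
set map {}
foreach ch [split abcdefghijklmnopqrstuvwxyz ""] {
    lappend map _$ch [string toupper $ch]
}

set result [string map $map config_error_text]
puts $result
```

An underscore followed by anything else (a digit, another underscore, or the end of the string) is left untouched by this map.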

    --
    +----------------------------------------------------------------------+
    | Gerald W. Lester, President, KNG Consulting LLC                      |
    | Email: Gerald.Lester@kng-consulting.net                              |
    +----------------------------------------------------------------------+

  • From clt.to.davebr@dfgh.net@21:1/5 to All on Fri Jun 24 15:09:56 2022
    The Tcl string functions are often faster than regexp (and regsub) for fairly complex string manipulations.

    For instance to convert snake to camel case:

    join [lmap x [split $str _] {string toupper $x 0 0}] ""

    is faster than:

    regsub -all {_(\w)} $str {\1}

    And the regsub only removes the "_", it does not capitalize the following character.

    There are some differences in the results for "odd" cases. The first one will eat multiple or trailing "_", and the regexp will leave some of them.
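A small sketch to make these differences concrete. Note that the split/join one-liner also capitalizes the first piece, so it yields ConfigErrorText; the hypothetical helper snakeToCamel below (a made-up name) leaves the first piece alone. Requires Tcl 8.6 or later for lmap.

```tcl
# snakeToCamel is a hypothetical helper name, not part of Tcl.
proc snakeToCamel {str} {
    set parts [split $str _]
    # Capitalize every piece except the first one.
    set rest [lmap x [lrange $parts 1 end] {string toupper $x 0 0}]
    return [join [list [lindex $parts 0] {*}$rest] ""]
}

puts [snakeToCamel config_error_text]                                        ;# configErrorText
puts [join [lmap x [split config_error_text _] {string toupper $x 0 0}] ""]  ;# ConfigErrorText

# Odd input: doubled and trailing underscores.
puts [snakeToCamel a__b_]               ;# aB   (all underscores eaten)
puts [regsub -all {_(\w)} a__b_ {\1}]   ;# a_b_ (\w also matches "_")
```

The last line shows why the regsub leaves some underscores behind: in Tcl, \w includes the underscore, so "__" is matched as a pair and replaced by a single "_".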

    Dave B

  • From Siri Cruise@21:1/5 to Arjen Markus on Fri Jun 24 09:35:35 2022
    In article
    <a81653bf-f4f0-4532-b4b2-d5e971b8cd27n@googlegroups.com>,
    Arjen Markus <arjen.markus895@gmail.com> wrote:

    If I remember correctly, Tcl 8.7 does allow you to specify a procedure like that.

    Making every conceivable and inconceivable (and I do know its
    meaning) edit possible results in a bloat of never used options.

    while {[regexp {^(.*)_(\w)(.*)$} $string - pre letter post]} {
        set string $pre[string toupper $letter]$post
    }

    And if I do use this pattern a lot:
    proc regall {string re args} {
        # The last argument is the script; the ones before it are variable names.
        set script [lindex $args end]
        set vars [lrange $args 0 end-1]
        while {[llength [set vals [regexp -inline -- $re $string]]]} {
            # Assign the captured groups to the named variables in the caller.
            uplevel 1 [list lassign [lrange $vals 1 end] {*}$vars]
            set string [uplevel 1 $script]
        }
        return $string
    }

    regall $string {^(.*)_(\w)(.*)$} pre letter post {concat $pre[string toupper $letter]$post}

    --
    :-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted. @
    'I desire mercy, not sacrifice.' /|\ Discordia: not just a religion but also a parody. This post / \
    I am an Andrea Chen sockpuppet. insults Islam. Mohammed
