• Re: Breaking a table of record rows into an array

    From Janis Papanagnou@21:1/5 to Mr. Man-wai Chang on Fri Mar 1 15:52:42 2024
    On 01.03.2024 14:33, Mr. Man-wai Chang wrote:
    I am new to Awk programming.

    Given a text table with the following sample entry:

    [ 8] SSID[ [HOME]] BSSID[04:9F:xx:xx:xx:xx] channel[ 6] frequency[2437] numsta[1] rssi[-63] noise[-75] beacon[98] cap[1411]
    dtim[0] rate[450] enc[Group-AES-CCMP CCMP PSK2 ]

    Is that all on one line? (If it's on multiple lines you should
    provide more context information: how multiple records are
    separated from each other.)


    How do you use Awk to quickly & easily break it into:

    The nasty thing is the nested '[...]'.

    One quick way is to choose an appropriate field separator. For
    example

    BEGIN { FS="] " }
    { for (i=1; i<=NF; i++)
        print $i
    }

    will produce, for a data line like the one above (it also works
    if the data is spread across three lines, but then you still need
    to know the record separators)...

    [ 8
    SSID[ [HOME]
    BSSID[04:9F:xx:xx:xx:xx
    channel[ 6]
    frequency[2437
    numsta[1
    rssi[-63
    noise[-75
    beacon[98
    cap[1411]
    dtim[0
    rate[450
    enc[Group-AES-CCMP CCMP PSK2

    If the basic splitting is okay you can do the formatting;
    using sub() or gsub() on $i to remove/replace parts of the
    text (e.g. to remove undesired spaces), use string
    concatenation (e.g. to add the "]" again which had been
    removed with the field splitting), etc., whatever needed.
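
    A minimal sketch of that formatting step (not from the original
    post). It assumes one record per line and lower-cased names in the
    output; the helper name format_fields is hypothetical. Lines are
    emitted in input order, so the bssid= line is not first.

```shell
# Hypothetical helper: split on "] " and print assignment-style lines.
format_fields() {
    awk '
    BEGIN { FS = "] " }
    {
        sub(/\] *$/, "")              # drop the final "]" so every field looks alike
        for (i = 1; i <= NF; i++) {
            p = index($i, "[")        # first "[" separates name from value
            name = tolower(substr($i, 1, p - 1))
            val  = substr($i, p + 1)
            gsub(/^ +| +$/, "", val)  # trim padding such as "[ 6"
            if (name == "")           # the leading "[ 8" has no name; skip it
                continue
            if (name == "bssid")
                printf "bssid=\"%s\";\n", val
            else
                printf "%s[bssid]=\"%s\";\n", name, val
        }
    }'
}

sample='[ 8] SSID[ [HOME]] BSSID[04:9F:xx:xx:xx:xx] channel[ 6] frequency[2437] numsta[1] rssi[-63] noise[-75] beacon[98] cap[1411] dtim[0] rate[450] enc[Group-AES-CCMP CCMP PSK2 ]'
printf '%s\n' "$sample" | format_fields
```

    Because the split consumes one "]" per field, the SSID value keeps
    its inner brackets and comes out as "[HOME]".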

    Janis


    bssid="04:9F:xx:xx:xx:xx";
    ssid[bssid]="[HOME]";
    channel[bssid]="6";
    frequency[bssid]="2437";
    ....
    rate[bssid]="450";
    enc[bssid]="Group-AES-CCMP CCMP PSK2";

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From jeorge@invalid.invalid@21:1/5 to All on Fri Mar 1 15:59:59 2024
    I am new to Awk programming.

    Given a text table with the following sample entry:

    [ 8] SSID[ [HOME]] BSSID[04:9F:xx:xx:xx:xx] channel[ 6] frequency[2437] numsta[1] rssi[-63] noise[-75] beacon[98] cap[1411]
    dtim[0] rate[450] enc[Group-AES-CCMP CCMP PSK2 ]

    How do you use Awk to quickly & easily break it into:

    bssid="04:9F:xx:xx:xx:xx";
    ssid[bssid]="[HOME]";
    channel[bssid]="6";
    frequency[bssid]="2437";
    ....
    rate[bssid]="450";
    enc[bssid]="Group-AES-CCMP CCMP PSK2";

    Found your issue interesting enough to attempt a solution:


    #../sandbox/test.awk
    BEGIN { FS="\\[[ []*" ; RS="]" }
    { sub("\n","")
      for (i=1; i<=NF; i+=2) {
        ($i ~ /^$/) ? $i = "Station" : sub(/^ */,"\t",$i)
        if ($(i+1) != "")
          printf "%s[bssid] = %s\n", $i,$(i+1)
    } }

    $ nawk -f test.awk test.data
    Station[bssid] = 8
    SSID[bssid] = HOME
    BSSID[bssid] = 04:9F:xx:xx:xx:xx
    channel[bssid] = 6
    frequency[bssid] = 2437
    numsta[bssid] = 1
    rssi[bssid] = -63
    noise[bssid] = -75
    beacon[bssid] = 98
    cap[bssid] = 1411
    dtim[bssid] = 0
    rate[bssid] = 450
    enc[bssid] = Group-AES-CCMP CCMP PSK2
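
    If the goal is to hold the values in awk arrays keyed by BSSID, as
    the question suggests, one possible sketch follows. It is a
    different approach from the script above: it scans each line with
    match() instead of changing FS and RS. The helper name
    load_stations, the array name info and the summary format printed
    in END are all assumptions made for illustration.

```shell
# Hypothetical helper: store every name[value] pair under its BSSID.
load_stations() {
    awk '
    {
        rec = $0; n = 0; key = ""
        # Pull out each name[value] token; "]+" also swallows the doubled
        # bracket in SSID[ [HOME]].
        while (match(rec, /[A-Za-z]+\[[^]]*\]+/)) {
            tok = substr(rec, RSTART, RLENGTH)
            rec = substr(rec, RSTART + RLENGTH)
            p = index(tok, "[")
            name[++n] = tolower(substr(tok, 1, p - 1))
            val[n] = substr(tok, p + 1, length(tok) - p - 1)
            gsub(/^ +| +$/, "", val[n])
            if (name[n] == "bssid") key = val[n]
        }
        if (key == "") next             # no BSSID on this line: ignore it
        order[++count] = key
        for (k = 1; k <= n; k++)
            if (name[k] != "bssid") info[key, name[k]] = val[k]
    }
    END {
        for (c = 1; c <= count; c++) {
            key = order[c]
            printf "%s ssid=%s channel=%s rate=%s\n", key,
                   info[key, "ssid"], info[key, "channel"], info[key, "rate"]
        }
    }'
}

sample='[ 8] SSID[ [HOME]] BSSID[04:9F:xx:xx:xx:xx] channel[ 6] frequency[2437] numsta[1] rssi[-63] noise[-75] beacon[98] cap[1411] dtim[0] rate[450] enc[Group-AES-CCMP CCMP PSK2 ]'
printf '%s\n' "$sample" | load_stations
```

    The sketch assumes each record sits on a single line; a table
    wrapped across several lines would need the lines joined first.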

  • From Janis Papanagnou@21:1/5 to Mr. Man-wai Chang on Tue Mar 12 00:08:17 2024
    On 11.03.2024 18:41, Mr. Man-wai Chang wrote:
    On 1/3/2024 10:52 pm, Janis Papanagnou wrote:

    BEGIN { FS="] " }
    { for (i=1; i<=NF; i++)
        print $i
    }

    Use of `NF` in awk command - Stack Overflow

    So what?

    You want a more cryptic way? - Here it is...

    BEGIN { FS="] " ; OFS="\n" }
    { NF=NF } 1

    or

    BEGIN { FS="] " ; OFS="\n" }
    { $1=$1 } 1


    Mind, though, that for a program skeleton to solve your task
    my original code is easier to adjust for your data processing.
    You are aware that it's just the first step and needs further
    processing, aren't you?
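
    A short illustration of why the assignment is needed in the
    compact variants (the input line here is made up for the example):
    awk only rebuilds $0 with OFS once a field or NF has been assigned,
    so a bare "1" pattern prints the record unchanged.

```shell
# No assignment: $0 is printed exactly as read, OFS is never applied.
printf 'a[1] b[2] c[3]\n' | awk 'BEGIN { FS="] " ; OFS="\n" } 1'
# prints: a[1] b[2] c[3]

# Assigning a field to itself forces $0 to be rebuilt using OFS.
printf 'a[1] b[2] c[3]\n' | awk 'BEGIN { FS="] " ; OFS="\n" } { $1=$1 } 1'
# prints three lines: a[1  b[2  c[3]
```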

    Janis

  • From Aharon Robbins@21:1/5 to mortonspam@gmail.com on Wed Mar 13 09:21:44 2024
    In article <usqkgn$he7u$2@dont-email.me>,
    Ed Morton <mortonspam@gmail.com> wrote:
    the effect of setting `NF` is
    undefined behavior per POSIX and so will do different things in
    different awk variants and even in 1 awk variant can behave differently
    depending on whether you're setting it to a higher or lower than
    original value

    This is not true. The effect of setting NF was well defined
    by the original awk book and also in POSIX.

    Decreasing NF throws away fields. Increasing NF adds the
    intervening fields with the null string as their values
    and rebuilds the record.
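
    A quick way to see the increasing case, as a sketch (the sample
    input is arbitrary). This is the behavior of gawk and mawk as far
    as I know; other awks are expected to agree but were not checked.

```shell
# Raising NF from 4 to 6 appends two empty fields and rebuilds $0 with OFS.
printf '1 2 3 4\n' | awk -v OFS=: '{ NF = 6; print $0; print NF }'
# prints: 1:2:3:4::
#         6
```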

  • From Kaz Kylheku@21:1/5 to Keith Thompson on Wed Mar 13 18:24:37 2024
    On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    arnold@freefriends.org (Aharon Robbins) writes:
    In article <usqkgn$he7u$2@dont-email.me>,
    Ed Morton <mortonspam@gmail.com> wrote:
    the effect of setting `NF` is
    undefined behavior per POSIX and so will do different things in
    different awk variants and even in 1 awk variant can behave differently
    depending on whether you're setting it to a higher or lower than
    original value

    This is not true. The effect of setting NF was well defined
    by the original awk book and also in POSIX.

    Decreasing NF throws away fields. Increasing NF adds the
    intervening fields with the null string as their values
    and rebuilds the record.

    I don't see that in the POSIX specification.

    The key is this:

    References to nonexistent fields (that is, fields after $NF), shall
    evaluate to the uninitialized value.

    NF is assignable, and fields after $NF do not exist. Thus if we
    have four fields and set NF = 3, then $4 doesn't exist.

    That implies it must cease to exist; i.e. be destroyed. If setting
    NF = 4 were to restore $4 then that would mean it had continued to
    exist, but was only hidden.

    The behavior is present in GNU Awk, Mawk, BusyBox Awk and others.

    I reproduced the behavior carefully in the awk macro of TXR Lisp:

    $ echo '1 2 3 4' | txr -e '(awk (t (set nf 1) (set nf 3) (prn [f 1])))'

    $ echo '1 2 3 4' | txr -e '(awk (t (set nf 3) (prn [f 1])))'
    2
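
    For comparison, a sketch of the same experiment in awk itself.
    Note that TXR's [f 1] is zero-based, so it corresponds to $2 here;
    the square brackets in the output are only added to make an empty
    field visible.

```shell
# Shrink to one field, then grow back to three: the old $2 is gone for good.
printf '1 2 3 4\n' | awk '{ NF = 1; NF = 3; print "[" $2 "]" }'
# prints: []

# Shrinking only to three fields leaves $2 untouched.
printf '1 2 3 4\n' | awk '{ NF = 3; print "[" $2 "]" }'
# prints: [2]
```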

    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
    """
    NF
    The number of fields in the current record. Inside a BEGIN action,
    the use of NF is undefined unless a getline function without a var
    argument is executed previously. Inside an END action, NF shall
    retain the value it had for the last record read, unless a
    subsequent, redirected, getline function without a var argument is
    performed prior to entering the END action.

    This looks defective. The value of NF observed in END must obviously
    be the last stored one, however it was stored, whether by assignment
    or getline.

    Note that NF is also recalculated if $0 is assigned, which is
    explicitly required in the document; it is glaringly defective to
    be appearing to be making an exception for getline but not for
    assignment to $0.
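
    The point about assignment to $0 can be checked with a small
    sketch (inputs are arbitrary). The outputs shown are what gawk and
    mawk produce as far as I know; very old awks may not retain NF in
    END.

```shell
# In BEGIN, assigning $0 splits it and sets NF, with no getline involved.
awk 'BEGIN { $0 = "a b c"; print NF }'
# prints: 3

# In END, NF first holds the last record's count, then follows a new $0.
printf '1 2\n' | awk 'END { print NF; $0 = "a b c d"; print NF }'
# prints: 2
#         4
```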

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From Kaz Kylheku@21:1/5 to Keith Thompson on Wed Mar 13 21:49:26 2024
    On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Kaz Kylheku <433-929-6894@kylheku.com> writes:
    On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    arnold@freefriends.org (Aharon Robbins) writes:
    In article <usqkgn$he7u$2@dont-email.me>,
    Ed Morton <mortonspam@gmail.com> wrote:
    the effect of setting `NF` is
    undefined behavior per POSIX and so will do different things in
    different awk variants and even in 1 awk variant can behave differently
    depending on whether you're setting it to a higher or lower than
    original value

    This is not true. The effect of setting NF was well defined
    by the original awk book and also in POSIX.

    Decreasing NF throws away fields. Increasing NF adds the
    intervening fields with the null string as their values
    and rebuilds the record.

    I don't see that in the POSIX specification.

    The key is this:

    References to nonexistent fields (that is, fields after $NF), shall
    evaluate to the uninitialized value.

    NF is assignable, and fields after $NF do not exist. Thus if we
    have four fields and set NF = 3, then $4 doesn't exist.

    That describes what happens if NF is modified by assignment, but I don't
    see that it implies that such an assignment is allowed.

    "The left-hand side of an assignment and the target of increment and
    decrement operators can be one of a variable, an array with index, or a
    field selector."

    NF is described as a variable. Some unique remarks are made about NF,
    but none deny that it's assignable like any other variable.

    But I can imagine a hypothetical awk-like language in which assigning to
    NF has undefined behavior. My question is, how does the POSIX
    specification not describe that language?

    That language is failing to support an instance of a variable
    being the left operand of an assignment, which a variable "can be".

    It looks like the violation of a requirement.

    On the other hand, it also implies that `foo = 42` is valid where `foo`
    is the name of a user-defined function (gawk disallows it).

    POSIX does say that "[t]he same name shall not be used as both a
    function parameter name and as the name of a function or a special
    awk variable." So foo = 42 isn't valid if foo is already a function.

    Also: "The same name shall not be used both as a variable name with
    global scope and as the name of a function. The same name shall not be
    used within the same scope both as a scalar variable and as an array."

    All that said, the business of the NF tail wagging the $1, $2, ...
    legs of the dog should be the target of at least one clarifying remark,
    and the other defects should also be corrected:

    - In a BEGIN clause NF should be undefined unless any action
    whatsoever is executed that sets its value: direct assignment,
    use of getline or assignment to $0.

    - At the start of the execution of an END clause, NF retains
    its current value (or undefined status, if it was never set);
    the END clause has no implicit effect on NF.


  • From Kaz Kylheku@21:1/5 to Keith Thompson on Thu Mar 14 00:22:56 2024
    On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    That describes what happens if NF is modified by assignment, but I don't
    see that it implies that such an assignment is allowed.

    Here is a problem. In numerous implementations, when you set NF, not
    only does that set the number of fields, but $0 is recomputed.
    So instead of $1=$1 you can use NF=NF.

    $ echo '1 2 3 4' | awk -v OFS=: '{ NF=NF; print $0; }'
    1:2:3:4

    $ echo '1 2 3 4' | awk -v OFS=: '{ NF=2; print $0; }'
    1:2


    We can continue to infer that if setting NF causes certain fields to
    exist, and not others, then $0 must be reconstituted accordingly,
    just like when a field is assigned, according to the idea that Awk
    implements a kind of "reactive programming" paradigm whereby $0
    and the fields are kept in sync.

    But that's going a little uncomfortably far on the proverbial limb,
    without assurance from the text.


  • From Kaz Kylheku@21:1/5 to Ed Morton on Thu Mar 14 00:17:48 2024
    On 2024-03-13, Ed Morton <mortonspam@gmail.com> wrote:
    On 3/13/2024 4:49 PM, Kaz Kylheku wrote:
    On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Kaz Kylheku <433-929-6894@kylheku.com> writes:
    On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    arnold@freefriends.org (Aharon Robbins) writes:
    In article <usqkgn$he7u$2@dont-email.me>,
    Ed Morton <mortonspam@gmail.com> wrote:
    the effect of setting `NF` is
    undefined behavior per POSIX and so will do different things in
    different awk variants and even in 1 awk variant can behave differently
    depending on whether you're setting it to a higher or lower than
    original value

    This is not true. The effect of setting NF was well defined
    by the original awk book and also in POSIX.

    Decreasing NF throws away fields. Increasing NF adds the
    intervening fields with the null string as their values
    and rebuilds the record.

    I don't see that in the POSIX specification.

    The key is this:

    References to nonexistent fields (that is, fields after $NF), shall
    evaluate to the uninitialized value.

    NF is assignable, and fields after $NF do not exist. Thus if we
    have four fields and set NF = 3, then $4 doesn't exist.

    That's a bit like the argument from an old episode of the comedy TV
    show "Yes, Prime Minister"

    But that show is the reference model for how ISO and IEEE standardization
    works.

    in the UK where his aide says (paraphrased) "Some
    country has done X, we must do something. War is something, therefore we
    must go to war".

    Being able to set NF to 3 does not mean you must delete $4.

    The passage says that fields do not exist beyond $NF. So if NF
    is 3, $4 doesn't exist.

    Why not delete $1 or $2 instead? You'd still end up with 3 fields to
    satisfy the value of NF.

    Because those are less than 3, the value in NF. Those exist.
    $2 and $3 exist while NF is originally 4; and continue to
    exist if it is decremented to 3. Why would $2 be victimized,
    when at no point had NF been less than 2?


  • From Aharon Robbins@21:1/5 to Keith.S.Thompson+u@gmail.com on Thu Mar 14 06:19:40 2024
    In article <87y1am5cfo.fsf@nosuchdomain.example.com>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Do you see something in POSIX that defines the behavior of assigning to
    NF?

    In the section "Variables and Special Values"

    | References to nonexistent fields (that is, fields after $NF), shall
    | evaluate to the uninitialized value. Such references shall not create
    | new fields. However, assigning to a nonexistent field (for example,
    | $(NF+2)=5) shall increase the value of NF; create any intervening fields
    | with the uninitialized value; and cause the value of $0 to be
    | recomputed, with the fields being separated by the value of OFS. Each
    | field variable shall have a string value or an uninitialized value when
    | created.

    It doesn't say what happens when you do NF -= 2; nonetheless, all
    traditional awks throw away fields when you do something like that.
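
    The quoted passage distinguishes referencing a nonexistent field
    from assigning to one. A small sketch of both cases (the input and
    the variable x are arbitrary):

```shell
# Merely reading $(NF+2) creates nothing: NF and $0 stay as they were.
printf 'a b c\n' | awk -v OFS=- '{ x = $(NF+2); print NF; print $0 }'
# prints: 3
#         a b c

# Assigning to $(NF+2) raises NF to 5, creates an empty $4, and
# rebuilds $0 with OFS between the fields.
printf 'a b c\n' | awk -v OFS=- '{ $(NF+2) = 5; print NF; print $0 }'
# prints: 5
#         a-b-c--5
```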
