• Using gets in nonblocking mode on a file not ending in a newline

    From clt.to.davebr@dfgh.net@21:1/5 to All on Sun Nov 6 03:59:07 2022
    After thinking about recent comments about reading CSV files, and trying a few experiments,
    I noticed odd behavior when using gets in nonblocking mode on files that did not end with a newline.

    The first [gets] attempting to read the last text before the end of file returned no data
    (-1 if a variable name was supplied), and [fblocked] reported 1 for the channel after the read attempt.
    A second [gets] returned the text up to the end of file, cleared fblocked, and set EOF.
    Any subsequent [gets] returned no data (-1 if a variable name was given), and EOF remained set.

    This behavior does not appear to be explained on the gets or fconfigure man pages. Where is this documented?

    transcripts from tkcon:

    # create a short file with no trailing newline
    (try) 57 % set ff [open no-eol.txt w]
    file6
    (try) 58 % puts -nonewline $ff "asdf"
    (try) 59 % close $ff
    (try) 60 % file size no-eol.txt
    4

    # gets in nonblocking mode
    (try) 70 % set ff [open no-eol.txt r]
    file6
    (try) 71 % fconfigure $ff -blocking 0
    (try) 72 % fblocked $ff
    0
    (try) 73 % gets $ff line
    -1
    # no newline after the current file position; blocked waiting for a newline to appear?
    (try) 74 % fblocked $ff
    1
    (try) 75 % eof $ff
    0
    # at this point no EOF, but fblocked is true, issue another gets
    (try) 76 % gets $ff line
    4
    # line is read up to the end of file!
    (try) 77 % set line
    asdf
    (try) 78 % eof $ff
    1
    (try) 79 % fblocked $ff
    0
    # now at this point have EOF, nothing left to read
    (try) 80 % gets $ff line
    -1
    (try) 81 % fblocked $ff
    0
    (try) 82 % eof $ff
    1
    # the file I/O configuration was
    (try) 85 % fconfigure $ff
    -blocking 0 -buffering full -buffersize 4096 -encoding utf-8 -eofchar {} -translation auto
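
    A minimal sketch of how the three outcomes seen in the transcript above could be told apart in code. The proc name getsStatus is hypothetical (it is not part of Tcl or tcllib); it relies only on [gets], [eof] and [fblocked] semantics as observed above:

```tcl
# Hypothetical helper: read one line and report why gets returned what it did.
# Returns "line" (a line was stored in varName), "eof", or "blocked".
proc getsStatus {chan varName} {
    upvar 1 $varName line
    if {[gets $chan line] >= 0} {
        return line
    }
    if {[eof $chan]} {
        return eof
    }
    # Only reachable in nonblocking mode: no complete line is available yet.
    return blocked
}
```

    On the Tcl version used for the transcript, successive calls on no-eol.txt should report blocked, line, eof, matching the -1, 4, -1 results shown above.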


    Dave B

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Christian Gollwitzer@21:1/5 to All on Sun Nov 6 11:51:29 2022
    On 06.11.22 at 04:59, clt.to.davebr@dfgh.net wrote:
    > After thinking about recent comments about reading CSV files, and trying a few experiments,
    > I noticed odd behavior when using gets in nonblocking mode on files that did not end with a newline.

    non-blocking mode doesn't make much sense on a file. On a socket or
    command pipe, non-blocking is useful because if the sender decides not
    to send any data, you would otherwise wait infinitely long. A file is
    always ready. My guess is that you actually want to do something else.

    Christian

  • From clt.to.davebr@dfgh.net@21:1/5 to All on Sun Nov 6 16:32:50 2022
    > non-blocking mode doesn't make much sense on a file. On a socket or
    > command pipe, non-blocking is useful because if the sender decides not
    > to send any data, you would otherwise wait infinitely long. A file is
    > always ready. My guess is that you actually want to do something else.

    I generally open a disk file using the default blocking mode, so the behavior with files not ending in a newline is normally not a problem for me.
    However one of my use cases is reading CSV data as it is being captured from a data logger.

    I'm writing procs to split CSV data into a list of records (instead of a queue or matrix).
    In looking at how this is done in the tcllib csv package, I realized the code was written to gracefully handle nonblocking mode when using [gets]. A minor change in my code would make it more robust.

    This kind of thing should be documented somewhere. I'd like to see what other best practices
    and quirks are described there. If documentation does not exist, it needs to be created.

    Dave B

  • From Ralf Fassel@21:1/5 to All on Mon Nov 7 11:12:53 2022
    * clt.to.davebr@dfgh.net
    | >non-blocking mode doesn't make much sense on a file. On a socket or
    | >command pipe, non-blocking is useful because if the sender decides not
    | >to send any data, you would otherwise wait infinitely long. A file is
    | >always ready. My guess is that you actually want to do something else.

    | I generally open a disk file using the default blocking mode, so the
    | behavior with files not ending in a newline is normally not a problem
    | for me. However one of my use cases is reading CSV data as it is
    | being captured from a data logger.

    Are you reading from a pipe to the logger, or from a disk file which is
    in parallel updated by the logger?

    | In looking a how this is done in the tcllib csv package I realized the
    | code was written to gracefully handle nonblocking mode when using
    | [gets]. A minor change in my code would make it more robust.

    Where do you see nonblocking mode in tcllib csv?
    I don't see any 'fconfigure' commands in the module, only

        while !eof
            if gets < 0 continue

    loops... (tcllib 1.20).

    R'

  • From clt.to.davebr@dfgh.net@21:1/5 to All on Mon Nov 7 17:14:36 2022
    > Are you reading from a pipe to the logger, or from a disk file which is
    > in parallel updated by the logger?

    I am reading from a file now, but might end up reading data over USB serial from a Teensy LC or 3.2 microcontroller.

    > Where do you see nonblocking mode in tcllib csv?
    > I don't see any 'fconfigure' commands in the module, only
    >
    >     while !eof
    >         if gets < 0 continue
    >
    > loops... (tcllib 1.20).

    It's the same in tcllib 1.21 too.

    My code to read one CSV record included:

    while {-1 < [gets $ff line]} { ... }

    which works fine reading a file in blocking mode.
    In looking for why the library code might be different
    I found my code did not work for nonblocking mode,
    while the tcllib code works in blocking or nonblocking mode.

    In retrospect the operation of [gets] makes sense, however
    nothing in the documentation for gets (or fconfigure or open)
    appears to explain this.

    Dave B

  • From Ralf Fassel@21:1/5 to All on Mon Nov 7 18:57:13 2022
    * clt.to.davebr@dfgh.net
    | My code to read one CSV record included:

    | while {-1 < [gets $ff line]} { ... }

    | which works fine reading a file in blocking mode.
    | In looking for why the library code might be different
    | I found my code did not work for nonblocking mode,
    | while the tcllib code works in blocking or nonblocking mode.

    Note that the TCLLIB code is

        while !eof
            if gets < 0 continue

    i.e. in non-blocking mode it busy-waits for EOF.

    This is quite different from

        while -1 < gets
            ...

    which in non-blocking mode stops at the first no-data-available.
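
    The difference between the two loop shapes can be sketched as follows. This is an illustration only, not the actual tcllib source; the proc names readWhileGets and readUntilEof are hypothetical:

```tcl
# Stops at the first -1 from gets, whether that means EOF or merely
# "no complete line available yet". Safe in blocking mode only.
proc readWhileGets {chan} {
    set lines {}
    while {[gets $chan line] >= 0} {
        lappend lines $line
    }
    return $lines
}

# Tcllib-style shape: keep going until eof, skipping -1 results that only
# mean "blocked". In nonblocking mode this busy-waits, so it needs a
# channel that eventually reports EOF.
proc readUntilEof {chan} {
    set lines {}
    while {![eof $chan]} {
        if {[gets $chan line] < 0} {
            continue
        }
        lappend lines $line
    }
    return $lines
}
```

    On a disk file, readUntilEof should return every line in either mode, while readWhileGets can stop early in nonblocking mode when the last line lacks a newline.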

    I doubt that the Tcllib CSV code will work on input from a serial line,
    blocking or not, since it waits for eof, which will never arrive on the
    serial line (unless of course you remove the hardware).

    Since the Tcllib code does not change the properties of the channel, it
    completely depends on how the channel is configured by the caller. And
    since it waits for eof, it implicitly requires a channel which will
    ultimately 'send' eof.

    | I am reading from a file now, but might end up reading data over USB
    | serial from a Teensy LC or 3.2 microcontroller.

    I *think* your best bet in this case would be to set up a fileevent
    handler for the serial line, and read the data event-based as they come
    in. Note however that you would need some indication when no more input
    is to be expected from the serial line.
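
    A rough sketch of such an event-based reader, under stated assumptions: the handler name onReadable and the globals ::records and ::finished are hypothetical, and a [chan pipe] pair (Tcl 8.6) stands in for the serial port so the example can run without hardware:

```tcl
# Hypothetical handler: collect complete lines as they arrive.
proc onReadable {chan} {
    if {[gets $chan line] >= 0} {
        lappend ::records $line
    } elseif {[eof $chan]} {
        # A serial line normally never gets here; a pipe or file does.
        close $chan
        set ::finished 1
    }
    # Otherwise only part of a line has arrived (fblocked is true):
    # return and wait for the next readable event.
}

# Demo: a pipe stands in for the serial port (assumption for illustration).
lassign [chan pipe] rd wr
fconfigure $rd -blocking 0
set ::records {}
fileevent $rd readable [list onReadable $rd]
puts $wr "1,2,3"
puts -nonewline $wr "4,5,6"   ;# last record has no trailing newline
close $wr                     ;# EOF lets gets hand over the partial line
vwait ::finished
```

    On a real serial line there is no EOF, so the elseif branch would have to be replaced by some in-band end-of-data marker, as noted above.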

    R'

  • From clt.to.davebr@dfgh.net@21:1/5 to All on Mon Nov 7 21:13:50 2022
    > I *think* your best bet in this case would be to set up a fileevent
    > handler for the serial line, and read the data event-based as they come
    > in. Note however that you would need some indication when no more input
    > is to be expected from the serial line.

    You are correct, fileevent will probably be used to schedule reads over the serial port.
    If the data is a report from a buffer it will be easy to tag the end of data. If it is being streamed to a graph (not likely in my current project) it is likely the end of streaming will be commanded from my end.

    Dave Bruchie

  • From saitology9@21:1/5 to Ralf Fassel on Mon Nov 7 17:25:25 2022
    On 11/7/2022 12:57 PM, Ralf Fassel wrote:

    > | I am reading from a file now, but might end up reading data over USB
    > | serial from a Teensy LC or 3.2 microcontroller.
    >
    > I *think* your best bet in this case would be to set up a fileevent
    > handler for the serial line, and read the data event-based as they come
    > in. Note however that you would need some indication when no more input
    > is to be expected from the serial line.
    >
    > R'

    I think the behavior is more common than serial bus lines. I recall
    dealing with output from R and having similar results. IIRC, R has
    "print" or "cat" and neither prints a newline character by default. So,
    when the file closes without doing a newline, the fileevent handler
    misses the data that was in the buffer.

  • From Ralf Fassel@21:1/5 to All on Tue Nov 8 10:35:44 2022
    * saitology9 <saitology9@gmail.com>
    | IIRC, R has "print" or "cat" and neither prints a newline character by
    | default. So, when the file closes without doing a newline, the
    | fileevent handler misses the data that was in the buffer.

    Cf 'gets' manpage:

    This command reads the next line from channelId, returns everything in
    the line up to (but not including) the end-of-line character(s), and
    discards the end-of-line character(s).
    --<snip-snip>--
    If end of file occurs while scanning for an end of line, the command
    returns whatever input is available up to the end of file.

    If you're reading from a file, gets will return the last line of that
    file regardless of whether there is a newline at the end or not:

    set fd [open test.txt w]
    puts $fd "line one"
    puts -nonewline $fd "line two"
    close $fd

    exec od -tc test.txt
    0000000   l   i   n   e       o   n   e  \n   l   i   n   e       t   w
    0000020   o
    0000021

    set fd [open test.txt r]
    while {[gets $fd line] >= 0} {
        puts "READ: {$line}"
    }
    close $fd
    =>
    READ: {line one}
    READ: {line two}

    The crucial part is 'EOF'. If you're reading from a serial line, there
    is no EOF, so 'gets' keeps waiting for the newline to arrive (and does
    *not* return the bytes which have already arrived). You need to plain
    'read' in that case to get at the waiting bytes.
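
    A small sketch of that fallback, assuming a nonblocking channel. The proc name lineOrPending is hypothetical, and the caller would have to reassemble lines from any partial data it receives this way:

```tcl
# Hypothetical helper: try for a full line first; if none is available,
# fall back to read to fetch whatever bytes have already arrived.
# Returns a two-element list: {line text}, {partial text} or {eof {}}.
proc lineOrPending {chan} {
    if {[gets $chan line] >= 0} {
        return [list line $line]
    }
    if {[eof $chan]} {
        return [list eof {}]
    }
    # In nonblocking mode, read without a count returns what is
    # available right now instead of waiting for EOF.
    return [list partial [read $chan]]
}
```

    In blocking mode [gets] would simply wait for the newline, so the partial branch only matters after fconfigure $chan -blocking 0.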

    R'
