• Tentative File Open & Safe Save

    From Lawrence D'Oliveiro@21:1/5 to All on Fri Jan 24 00:39:50 2025
    When developing an app, saving changes that a user has made to a document
    needs to be managed carefully. Simply overwriting the existing file with
    the new data can cause trouble, if your app (or the system) should crash part-way through, because then the file ends up with some part of the old document overwritten with the new one, and so the user ends up without a
    valid copy of either the old or the new version -- in effect, all their
    work is lost.

    A better technique is to rename any existing version of the file (e.g. appending a suffix such as “-old”) before saving the new document under
    the original file name. After the successful save of the new document, the
    old version might or might not be deleted.
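
    A minimal Python sketch of the rename-then-save approach described above. The function name save_with_backup and the keep_backup flag are illustrative assumptions, not part of any library mentioned in this thread.

```python
import os


def save_with_backup(path: str, data: bytes, keep_backup: bool = True) -> None:
    """Rename any existing file to '<path>-old', then write the new data."""
    backup = path + "-old"
    if os.path.exists(path):
        # os.replace() renames atomically and overwrites a stale backup.
        os.replace(path, backup)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to disk
    if not keep_backup and os.path.exists(backup):
        os.remove(backup)
```

    Note that between the rename and the end of the write there is no complete file under the original name; after a crash in that window the "-old" copy is the one to recover.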

    Alternatively, you might save the new document under the original file
    name but with some temporary suffix, e.g. “-new”, added. Then use the
    Linux RENAME_EXCHANGE option to the renameat2(2) call to simultaneously
    rename each file to the other name -- exchanging the names of the new and
    old files. After this, you can delete the file with the name ending in “-new”, since this is now the old version.
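
    A hedged Python sketch of the name-exchange save. The Python os module has no direct wrapper for renameat2(2) as far as I know, so this goes through ctypes and assumes a Linux system whose C library exports renameat2 (glibc 2.28 or later) and a filesystem that supports RENAME_EXCHANGE. The function names are illustrative.

```python
import ctypes
import os

AT_FDCWD = -100           # "relative to the current directory" (linux/fcntl.h)
RENAME_EXCHANGE = 1 << 1  # atomically swap the two names (linux/fs.h)

_libc = ctypes.CDLL(None, use_errno=True)


def exchange_names(path_a: str, path_b: str) -> None:
    """Atomically swap the names path_a and path_b; both must exist."""
    res = _libc.renameat2(
        AT_FDCWD, os.fsencode(path_a),
        AT_FDCWD, os.fsencode(path_b),
        RENAME_EXCHANGE,
    )
    if res != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path_a, None, path_b)


def save_by_exchange(path: str, data: bytes) -> None:
    """Write to '<path>-new', swap names, then delete the old version."""
    new_path = path + "-new"
    with open(new_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    if os.path.exists(path):
        exchange_names(path, new_path)
        os.remove(new_path)  # after the swap this name holds the old version
    else:
        os.rename(new_path, path)  # nothing to exchange with on first save
```

    At every instant the original name refers to a complete file, either the old version or the new one.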

    Another technique is to do “tentative” file creation. If you open a file with the O_TMPFILE option, then no entry is made in any directory; space
    is allocated on the destination volume, but if the process terminates for
    any reason without taking action to make the file permanent, it simply disappears from the filesystem (and any space it was using is reclaimed).

    Making the file permanent involves giving it an explicit name within the destination filesystem. This is done with a linkat(2) call. But this call requires an existing name to be linked to a new name; how do you specify
    the existing name when, by design, the file doesn’t have one?

    In fact, Linux gives it a name, by a mechanism called a “magic symlink”.
    If you look in /proc/«pid»/fd for a given process, it will show symlinks
    to the files the process has open. For a file opened with the O_TMPFILE
    option, this name can be used in a linkat(2) call to give the file a
    “real” name -- i.e. one that exists in the regular filesystem.
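
    A rough Python sketch of the tentative-creation sequence, using only the standard os module. This is not the save_tmpfile() routine from the linuxfs module; the function name is made up for illustration, and it assumes Linux with /proc mounted and a filesystem that supports O_TMPFILE.

```python
import os


def save_tentatively(directory: str, name: str, data: bytes) -> None:
    """Write data to an unnamed file in `directory`, then link it in as `name`."""
    dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # With O_TMPFILE the path names the directory whose filesystem
        # will hold the file; no directory entry is created yet.
        fd = os.open(directory, os.O_TMPFILE | os.O_WRONLY, 0o644)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
            # Passing dst_dir_fd makes CPython call linkat(2); with
            # follow_symlinks=True it sets AT_SYMLINK_FOLLOW, so the magic
            # symlink under /proc/self/fd is resolved to the open file.
            os.link(
                f"/proc/self/fd/{f.fileno()}",
                name,
                dst_dir_fd=dir_fd,
                follow_symlinks=True,
            )
    finally:
        os.close(dir_fd)
```

    linkat(2) fails with EEXIST if the name is already taken, so replacing an existing document still needs a rename step after linking under a temporary name.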

    Some example C code that shows how to do this linking is on the openat(2)
    man page <https://manpages.debian.org/openat(2)>. I implemented a Python version of this code in the save_tmpfile() routine in the linuxfs module
    here <https://gitlab.com/ldo/python_linuxfs>.

    For an example program that uses this module to demonstrate various of the above options, see the safe_save script here <https://gitlab.com/ldo/python_linuxfs_examples/>.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From 186283@ud0s4.net@21:1/5 to Lawrence D'Oliveiro on Fri Jan 24 00:39:54 2025
    On 1/23/25 7:39 PM, Lawrence D'Oliveiro wrote:
    When developing an app, saving changes that a user has made to a document needs to be managed carefully. Simply overwriting the existing file with
    the new data can cause trouble, if your app (or the system) should crash part-way through, because then the file ends up with some part of the old document overwritten with the new one, and so the user ends up without a valid copy of either the old or the new version -- in effect, all their
    work is lost.

    A better technique is to rename any existing version of the file (e.g. appending a suffix such as “-old”) before saving the new document under the original file name. After the successful save of the new document, the old version might or might not be deleted.


    This has become pretty standard. Word and Excel tend
    to create (and forget) lots of such temp files.


    Alternatively, you might save the new document under the original file
    name but with some temporary suffix, e.g. “-new”, added. Then use the Linux RENAME_EXCHANGE option to the renameat2(2) call to simultaneously rename each file to the other name -- exchanging the names of the new and
    old files. After this, you can delete the file with the name ending in “-new”, since this is now the old version.

    Another technique is to do “tentative” file creation. If you open a file with the O_TMPFILE option, then no entry is made in any directory; space
    is allocated on the destination volume, but if the process terminates for
    any reason without taking action to make the file permanent, it simply disappears from the filesystem (and any space it was using is reclaimed).

    Fair, but you CAN lose minor edits.

    Making the file permanent involves giving it an explicit name within the destination filesystem. This is done with a linkat(2) call. But this call requires an existing name to be linked to a new name; how do you specify
    the existing name when, by design, the file doesn’t have one?

    In fact, Linux gives it a name, by a mechanism called a “magic symlink”. If you look in /proc/«pid»/fd for a given process, it will show symlinks
    to the files the process has open. For a file opened with the O_TMPFILE option, this name can be used in a linkat(2) call to give the file a “real” name -- i.e. one that exists in the regular filesystem.

    Some example C code that shows how to do this linking is on the openat(2)
    man page <https://manpages.debian.org/openat(2)>. I implemented a Python version of this code in the save_tmpfile() routine in the linuxfs module
    here <https://gitlab.com/ldo/python_linuxfs>.

    For an example program that uses this module to demonstrate various of the above options, see the safe_save script here <https://gitlab.com/ldo/python_linuxfs_examples/>.

    Good ... but maybe a little more complicated than
    usually required. Just making .tmp1, .tmp2 files
    before renaming has always been good enough
    for me.
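
    For comparison, a small Python sketch of the plain temp-file-then-rename approach mentioned here. The function name is illustrative, and it assumes the temp file is created in the same directory as the target so the final rename stays on one filesystem.

```python
import contextlib
import os
import tempfile


def save_via_temp(path: str, data: bytes) -> None:
    """Write data to a temp file beside `path`, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # data on disk before the rename
        # os.replace() atomically replaces any existing file at `path`.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

    If the process dies before the rename, a stray .tmp file is left behind, which is the cleanup cost the O_TMPFILE technique avoids.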

    Of course systems CAN glitch at any time, often for
    totally mysterious reasons - power maybe, minor
    coding error only hit 1:1000 times, cosmic rays ....
    so if yer stuff is SUPER important, like tax docs
    or whatever .......

    Of late I've been trying to find the wisdom of
    people who do systems for deep space probes -
    where cosmic glitches become a very real issue.

  • From Rich@21:1/5 to 186282@ud0s4.net on Fri Jan 24 19:08:43 2025
    186282@ud0s4.net <186283@ud0s4.net> wrote:
    Of course systems CAN glitch at any time, often for
    totally mysterious reasons - power maybe, minor
    coding error only hit 1:1000 times, cosmic rays ....
    so if yer stuff is SUPER important, like tax docs
    or whatever .......

    Third option:

    Use a Sqlite file as the "file" the app uses, and delegate all the ugly
    aspects of atomic file "adjusting" and "storing" to Sqlite (which by
    now has mitigations for issues most individual developers will never
    see nor hear of).

    Plus, a Sqlite file would allow a very easy "versioned file" setup as
    well.

    Downside: one has to have an Sqlite module for one's language available,
    or one has to include Sqlite's driver in one's app.
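
    A minimal sketch of what a "versioned file" inside an Sqlite database could look like, using Python's standard sqlite3 module. The table name versions and the helper names are assumptions made up for this example.

```python
import sqlite3


def open_document(db_path: str) -> sqlite3.Connection:
    """Open (or create) the document database and its versions table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS versions ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " saved_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
        " content BLOB NOT NULL)"
    )
    conn.commit()
    return conn


def save_version(conn: sqlite3.Connection, content: bytes) -> int:
    """Store a new version; the insert is committed as one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO versions (content) VALUES (?)", (content,)
        )
    return cur.lastrowid


def load_latest(conn: sqlite3.Connection):
    """Return the most recently saved content, or None if nothing is saved."""
    row = conn.execute(
        "SELECT content FROM versions ORDER BY id DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```

    Every save adds a row instead of overwriting anything, so older versions stay available until the app chooses to prune them.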

  • From 186283@ud0s4.net@21:1/5 to Rich on Sat Jan 25 00:02:50 2025
    On 1/24/25 2:08 PM, Rich wrote:
    Third option:

    Use a Sqlite file as the "file" the app uses, and delegate all the ugly aspects of atomic file "adjusting" and "storing" to Sqlite (which by
    now has mitigations for issues most individual developers will never
    see nor hear of).

    Plus, a Sqlite file would allow a very easy "versioned file" setup as
    well.

    Downside: one has to have an Sqlite module for one's language available,
    or one has to include Sqlite's driver in one's app.

    I looked into this a bit ... it's a potential solution,
    but seems, well, a little TOO much for the issue at hand.

    If using Word or Excel, the system continually creates
    temp files of every little change every X minutes. My
    bitch is that sometimes it FORGETS to delete all those
    files after (had to add a filter to my backup pgms) - but
    I'm not bitching about the CONCEPT.

    Basically ANY programming language allows easy use of
    that particular kind of solution. No add-ons needed.

    Let's say I'm a fan of "KISS" solutions.

    A concern is systems that update ALL OF THE TIME like
    databases. Keeping in-transaction copies of every little
    file is less fun. Totally do-able, and oft is, but
    less fun. Multi-user record-only-locked files makes
    it even more less fun.

    But, alas, abrupt crashes/lockups or user madness is
    STILL a real problem so SOMETHING has to be done.
    Computers are machines, and machines fuck up and/or
    CAN be fucked-up.

  • From Rich@21:1/5 to 186282@ud0s4.net on Sat Jan 25 17:24:02 2025
    186282@ud0s4.net <186283@ud0s4.net> wrote:
    I looked into this a bit ... it's a potential solution, but seems,
    well, a little TOO much for the issue at hand.

    If using Word or Excel, the system continually creates temp files
    of every little change every X minutes. My bitch is that sometimes
    it FORGETS to delete all those files after (had to add a filter to
    my backup pgms) - but I'm not bitching about the CONCEPT.

    If the "storage file" had been a sqlite DB, all those "little temp
    files" could have been new rows in a "backup log" table inside sqlite,
    and from your file browser perspective, there's only one "file" on disk
    at all times (or two, if one turns on the alternate sqlite update
    method).
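
    A sketch of the "backup log" table idea in Python's sqlite3 module. The table name backup_log, the keep limit and the use_wal switch are illustrative assumptions; use_wal turns on write-ahead logging via PRAGMA journal_mode=WAL, which I take to be the alternate update method meant here.

```python
import sqlite3


def open_store(db_path: str, use_wal: bool = False) -> sqlite3.Connection:
    """Open the storage file and make sure the backup log table exists."""
    conn = sqlite3.connect(db_path)
    if use_wal:
        # Write-ahead logging keeps extra files beside the database.
        conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS backup_log ("
        " id INTEGER PRIMARY KEY,"
        " saved_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
        " snapshot BLOB NOT NULL)"
    )
    conn.commit()
    return conn


def autosave(conn: sqlite3.Connection, snapshot: bytes, keep: int = 10) -> None:
    """Add a snapshot row and prune all but the newest `keep` rows."""
    with conn:  # insert and prune commit together, or not at all
        conn.execute(
            "INSERT INTO backup_log (snapshot) VALUES (?)", (snapshot,)
        )
        conn.execute(
            "DELETE FROM backup_log WHERE id NOT IN ("
            " SELECT id FROM backup_log ORDER BY id DESC LIMIT ?)",
            (keep,),
        )
```

    Pruning inside the same transaction means forgotten snapshots cannot pile up the way stray temp files do.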

    Basically ANY programming language allows easy use of that
    particular kind of solution. No add-ons needed.

    Yes, and programmers will take the route of least resistance most every
    time. That's the cause of so many popup modal dialog "OK" boxes
    confirming that what you expected to happen did happen, but because of
    the modality, you now have to go dismiss the damn thing to get on
    with whatever you are actually trying to do. Those popup modal "ok"
    style boxes are often the only UI widget provided by default in any
    given UI library - most anything else has to be "assembled yourself".

    A concern is systems that update ALL OF THE TIME like databases.
    Keeping in-transaction copies of every little file is less fun.
    Totally do-able, and oft is, but less fun. Multi-user
    record-only-locked files makes it even more less fun.

    sqlite *is* a database, but it takes care of all those nitty gritty
    details for you, so you don't have to care from the level of the app
    you are writing.

    But, alas, abrupt crashes/lockups or user madness is STILL a real
    problem so SOMETHING has to be done. Computers are machines, and
    machines fuck up and/or CAN be fucked-up.

    Sqlite, being a database, goes that extra mile to make sure the changes
    you commit to it survive most anything that can happen (beyond the
    obviously unrecoverable, such as "disk storing file dies, and can no
    longer be accessed"). But for OS crashes, power failures, etc., it
    tries its best (and claims to be very good at it) to avoid data loss in
    those situations.

    The programmer gets none of that with the usual generic file-open,
    file-write, file-close style interface that most languages provide.
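
    To illustrate the contrast with a plain overwrite, here is a small Python sketch of an all-or-nothing replace inside one Sqlite transaction. The parts table and function names are invented for the example, and the failure is simulated with a constraint violation inside the process, not an actual crash or power cut.

```python
import sqlite3


def init_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that holds the document's parts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parts ("
        " name TEXT PRIMARY KEY,"
        " body TEXT NOT NULL)"
    )
    conn.commit()


def replace_document(conn: sqlite3.Connection, parts) -> None:
    """Replace every stored part in a single transaction."""
    with conn:  # rolls back the DELETE too if any INSERT fails
        conn.execute("DELETE FROM parts")
        conn.executemany(
            "INSERT INTO parts (name, body) VALUES (?, ?)", parts
        )
```

    If the second call below fails part-way, the old content is still there afterwards:

```python
conn = sqlite3.connect(":memory:")
init_store(conn)
replace_document(conn, [("intro", "old text")])
try:
    # The None body violates NOT NULL, so the whole replace is undone.
    replace_document(conn, [("intro", "new text"), ("broken", None)])
except sqlite3.IntegrityError:
    pass
```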
