Forum: >>> Magnum BBS <<<

Tentative File Open & Safe Save

From Lawrence D'Oliveiro@21:1/5 to All on Fri Jan 24 00:39:50 2025

When developing an app, saving changes that a user has made to a document
needs to be managed carefully. Simply overwriting the existing file with
the new data can cause trouble, if your app (or the system) should crash part-way through, because then the file ends up with some part of the old document overwritten with the new one, and so the user ends up without a
valid copy of either the old or the new version -- in effect, all their
work is lost.

A better technique is to rename any existing version of the file (e.g. appending a suffix such as “-old”) before saving the new document under
the original file name. After the successful save of the new document, the
old version might or might not be deleted.
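
A minimal sketch of this rename-then-save approach in Python. The function name save_with_backup and the default "-old" suffix are illustrative assumptions, not taken from any library. Note that between the rename and the end of the write, no complete file exists under the original name; the previous version is only reachable under the backup name.

```python
import os

def save_with_backup(path, data, suffix="-old"):
    """Move any existing file aside as path+suffix, then write the new data."""
    backup = path + suffix
    if os.path.exists(path):
        # os.replace overwrites a stale backup left over from an earlier save.
        os.replace(path, backup)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the data to disk
    return backup
```

Whether to delete the backup afterwards is left to the caller, as described above.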

Alternatively, you might save the new document under the original file
name but with some temporary suffix, e.g. “-new”, added. Then use the
Linux RENAME_EXCHANGE option to the renameat2(2) call to simultaneously
rename each file to the other name -- exchanging the names of the new and
old files. After this, you can delete the file with the name ending in “-new”, since this is now the old version.
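
A rough sketch of the exchange technique in Python. The standard os module does not expose renameat2(2), so this goes through ctypes; it assumes a Linux system whose C library exports renameat2 (glibc 2.28 or later does) and a filesystem that supports RENAME_EXCHANGE. The names exchange_names and safe_save are hypothetical helpers for illustration, and the constant values are copied from the Linux headers.

```python
import ctypes
import os

AT_FDCWD = -100           # "relative to current directory", from <fcntl.h>
RENAME_EXCHANGE = 1 << 1  # from <linux/fs.h>

_libc = ctypes.CDLL(None, use_errno=True)

def exchange_names(path_a, path_b):
    """Atomically swap the names of two existing files (Linux only)."""
    res = _libc.renameat2(
        AT_FDCWD, os.fsencode(path_a),
        AT_FDCWD, os.fsencode(path_b),
        RENAME_EXCHANGE,
    )
    if res != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

def safe_save(path, data, suffix="-new"):
    """Write data under a temporary name, then swap it into place."""
    tmp = path + suffix
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    if os.path.exists(path):
        exchange_names(tmp, path)
        os.unlink(tmp)        # after the swap, this name holds the OLD version
    else:
        os.rename(tmp, path)  # first save: nothing to exchange with
```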

Another technique is to do “tentative” file creation. If you open a file with the O_TMPFILE option, then no entry is made in any directory; space
is allocated on the destination volume, but if the process terminates for
any reason without taking action to make the file permanent, it simply disappears from the filesystem (and any space it was using is reclaimed).

Making the file permanent involves giving it an explicit name within the destination filesystem. This is done with a linkat(2) call. But this call requires an existing name to be linked to a new name; how do you specify
the existing name when, by design, the file doesn’t have one?

In fact, Linux gives it a name, by a mechanism called a “magic symlink”.
If you look in /proc/«pid»/fd for a given process, it will show symlinks
to the files the process has open. For a file opened with the O_TMPFILE
option, this name can be used in a linkat(2) call to give the file a
“real” name -- i.e. one that exists in the regular filesystem.
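
A sketch of the O_TMPFILE-plus-linkat sequence in Python, for illustration only; it is not the save_tmpfile() routine from the linuxfs module. It assumes Linux, a mounted /proc, and a filesystem that supports O_TMPFILE. linkat(2) is called through ctypes with AT_SYMLINK_FOLLOW so that the magic symlink is followed to the open file. The function name save_tentative is hypothetical, and the constants are copied from the Linux headers. Since linkat(2) fails with EEXIST if the new name already exists, replacing an existing document would need a link under a temporary name followed by a rename.

```python
import ctypes
import os

AT_FDCWD = -100            # from <fcntl.h>
AT_SYMLINK_FOLLOW = 0x400  # from <fcntl.h>

_libc = ctypes.CDLL(None, use_errno=True)

def save_tentative(dest_path, data):
    """Write data to an unnamed file, then give it the name dest_path."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # O_TMPFILE is given the directory that the unnamed file will live in.
    fd = os.open(dest_dir, os.O_TMPFILE | os.O_WRONLY, 0o644)
    with os.fdopen(fd, "wb") as f:   # closes fd when the block exits
        f.write(data)
        f.flush()
        os.fsync(fd)
        # The entry in /proc/self/fd is the "existing name" to link from.
        magic = f"/proc/self/fd/{fd}".encode()
        res = _libc.linkat(
            AT_FDCWD, magic,
            AT_FDCWD, os.fsencode(dest_path),
            AT_SYMLINK_FOLLOW,
        )
        if res != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err), dest_path)
```

If the process dies anywhere before the linkat call succeeds, nothing is left behind in the directory.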

Some example C code that shows how to do this linking is on the openat(2)
man page <https://manpages.debian.org/openat(2)>. I implemented a Python version of this code in the save_tmpfile() routine in the linuxfs module
here <https://gitlab.com/ldo/python_linuxfs>.

For an example program that uses this module to demonstrate various of the above options, see the safe_save script here <https://gitlab.com/ldo/python_linuxfs_examples/>.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From 186283@ud0s4.net@21:1/5 to Lawrence D'Oliveiro on Fri Jan 24 00:39:54 2025

On 1/23/25 7:39 PM, Lawrence D'Oliveiro wrote:

When developing an app, saving changes that a user has made to a document needs to be managed carefully. Simply overwriting the existing file with
the new data can cause trouble, if your app (or the system) should crash part-way through, because then the file ends up with some part of the old document overwritten with the new one, and so the user ends up without a valid copy of either the old or the new version -- in effect, all their
work is lost.

A better technique is to rename any existing version of the file (e.g. appending a suffix such as “-old”) before saving the new document under the original file name. After the successful save of the new document, the old version might or might not be deleted.

This has become pretty standard. Word and Excel tend
to create (and forget) lots of such temp files.

Alternatively, you might save the new document under the original file
name but with some temporary suffix, e.g. “-new”, added. Then use the Linux RENAME_EXCHANGE option to the renameat2(2) call to simultaneously rename each file to the other name -- exchanging the names of the new and
old files. After this, you can delete the file with the name ending in “-new”, since this is now the old version.

Another technique is to do “tentative” file creation. If you open a file with the O_TMPFILE option, then no entry is made in any directory; space
is allocated on the destination volume, but if the process terminates for
any reason without taking action to make the file permanent, it simply disappears from the filesystem (and any space it was using is reclaimed).

Fair, but you CAN lose minor edits.

Making the file permanent involves giving it an explicit name within the destination filesystem. This is done with a linkat(2) call. But this call requires an existing name to be linked to a new name; how do you specify
the existing name when, by design, the file doesn’t have one?

In fact, Linux gives it a name, by a mechanism called a “magic symlink”. If you look in /proc/«pid»/fd for a given process, it will show symlinks
to the files the process has open. For a file opened with the O_TMPFILE option, this name can be used in a linkat(2) call to give the file a “real” name -- i.e. one that exists in the regular filesystem.

Some example C code that shows how to do this linking is on the openat(2)
man page <https://manpages.debian.org/openat(2)>. I implemented a Python version of this code in the save_tmpfile() routine in the linuxfs module
here <https://gitlab.com/ldo/python_linuxfs>.

For an example program that uses this module to demonstrate various of the above options, see the safe_save script here <https://gitlab.com/ldo/python_linuxfs_examples/>.

Good ... but maybe a little more complicated than
usually required. Just making .tmp1, .tmp2 files
before renaming has always been good enough
for me.
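
For comparison, the plain temp-file-then-rename approach can be sketched in a few lines of Python. The name atomic_write and the ".tmp1" default are illustrative assumptions. os.replace is atomic on POSIX as long as both names are on the same filesystem, so a reader sees either the complete old file or the complete new one.

```python
import os

def atomic_write(path, data, suffix=".tmp1"):
    """Write data to path+suffix, then rename it over path."""
    tmp = path + suffix
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the data is on disk before renaming
    os.replace(tmp, path)     # atomically replaces any existing file at path
```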

Of course systems CAN glitch at any time, often for
totally mysterious reasons - power maybe, minor
coding error only hit 1:1000 times, cosmic rays ....
so if yer stuff is SUPER important, like tax docs
or whatever .......

Of late I've been trying to find the wisdom of
people who do systems for deep space probes -
where cosmic glitches become a very real issue.


From Rich@21:1/5 to 186282@ud0s4.net on Fri Jan 24 19:08:43 2025

186282@ud0s4.net <186283@ud0s4.net> wrote:

On 1/23/25 7:39 PM, Lawrence D'Oliveiro wrote:

When developing an app, saving changes that a user has made to a document
needs to be managed carefully. Simply overwriting the existing file with
the new data can cause trouble, if your app (or the system) should crash
part-way through, because then the file ends up with some part of the old
document overwritten with the new one, and so the user ends up without a
valid copy of either the old or the new version -- in effect, all their
work is lost.

Of course systems CAN glitch at any time, often for
totally mysterious reasons - power maybe, minor
coding error only hit 1:1000 times, cosmic rays ....
so if yer stuff is SUPER important, like tax docs
or whatever .......

Third option:

Use a Sqlite file as the "file" the app uses, and delegate all the ugly
aspects of atomic file "adjusting" and "storing" to Sqlite (which by
now has mitigations for issues most individual developers will never
see nor hear of).

Plus, a Sqlite file would allow a very easy "versioned file" setup as
well.

Downside: one has to have an Sqlite module for one's language available,
or one has to include Sqlite's driver in one's app.
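
A minimal sketch of the idea using Python's bundled sqlite3 module. The table name "versions", its columns, and the helper names are assumptions made up for illustration. Each save is one committed transaction that adds a row, so earlier versions stay available.

```python
import sqlite3

def open_document(db_path):
    """Open (or create) a document store with one row per saved version."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS versions ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " saved_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
        " content BLOB NOT NULL)"
    )
    conn.commit()
    return conn

def save_version(conn, content):
    """Store a new version; the transaction commits fully or not at all."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO versions (content) VALUES (?)", (content,)
        )
    return cur.lastrowid

def load_latest(conn):
    """Return the most recently saved content, or None if nothing is saved."""
    row = conn.execute(
        "SELECT content FROM versions ORDER BY id DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```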


From 186283@ud0s4.net@21:1/5 to Rich on Sat Jan 25 00:02:50 2025

On 1/24/25 2:08 PM, Rich wrote:

186282@ud0s4.net <186283@ud0s4.net> wrote:

On 1/23/25 7:39 PM, Lawrence D'Oliveiro wrote:

When developing an app, saving changes that a user has made to a document
needs to be managed carefully. Simply overwriting the existing file with
the new data can cause trouble, if your app (or the system) should crash
part-way through, because then the file ends up with some part of the old
document overwritten with the new one, and so the user ends up without a
valid copy of either the old or the new version -- in effect, all their
work is lost.

Of course systems CAN glitch at any time, often for
totally mysterious reasons - power maybe, minor
coding error only hit 1:1000 times, cosmic rays ....
so if yer stuff is SUPER important, like tax docs
or whatever .......

Third option:

Use a Sqlite file as the "file" the app uses, and delegate all the ugly aspects of atomic file "adjusting" and "storing" to Sqlite (which by
now has mitigations for issues most individual developers will never
see nor hear of).

Plus, a Sqlite file would allow a very easy "versioned file" setup as
well.

Downside: one has to have an Sqlite module for one's language available,
or one has to include Sqlite's driver in one's app.

I looked into this a bit ... it's a potential solution,
but seems, well, a little TOO much for the issue at hand.

If using Word or Excel, the system continually creates
temp files of every little change every X minutes. My
bitch is that sometimes it FORGETS to delete all those
files after (had to add a filter to my backup pgms) - but
I'm not bitching about the CONCEPT.

Basically ANY programming language allows easy use of
that particular kind of solution. No add-ons needed.

Let's say I'm a fan of "KISS" solutions.

A concern is systems that update ALL OF THE TIME like
databases. Keeping in-transaction copies of every little
file is less fun. Totally do-able, and oft is, but
less fun. Multi-user record-only-locked files makes
it even more less fun.

But, alas, abrupt crashes/lockups or user madness is
STILL a real problem so SOMETHING has to be done.
Computers are machines, and machines fuck up and/or
CAN be fucked-up.


From Rich@21:1/5 to 186282@ud0s4.net on Sat Jan 25 17:24:02 2025

186282@ud0s4.net <186283@ud0s4.net> wrote:

On 1/24/25 2:08 PM, Rich wrote:

186282@ud0s4.net <186283@ud0s4.net> wrote:

On 1/23/25 7:39 PM, Lawrence D'Oliveiro wrote:

When developing an app, saving changes that a user has made to a document
needs to be managed carefully. Simply overwriting the existing file with
the new data can cause trouble, if your app (or the system) should crash
part-way through, because then the file ends up with some part of the old
document overwritten with the new one, and so the user ends up without a
valid copy of either the old or the new version -- in effect, all their
work is lost.

Of course systems CAN glitch at any time, often for
totally mysterious reasons - power maybe, minor
coding error only hit 1:1000 times, cosmic rays ....
so if yer stuff is SUPER important, like tax docs
or whatever .......

Third option:

Use a Sqlite file as the "file" the app uses, and delegate all the ugly
aspects of atomic file "adjusting" and "storing" to Sqlite (which by
now has mitigations for issues most individual developers will never
see nor hear of).

Plus, a Sqlite file would allow a very easy "versioned file" setup as
well.

Downside: one has to have an Sqlite module for one's language available,
or one has to include Sqlite's driver in one's app.

I looked into this a bit ... it's a potential solution, but seems,
well, a little TOO much for the issue at hand.

If using Word or Excel, the system continually creates temp files
of every little change every X minutes. My bitch is that sometimes
it FORGETS to delete all those files after (had to add a filter to
my backup pgms) - but I'm not bitching about the CONCEPT.

If the "storage file" had been a sqlite DB, all those "little temp
files" could have been new rows in a "backup log" table inside sqlite,
and from your file browser perspective, there's only one "file" on disk
at all times (or two, if one turns on the alternate sqlite update
method).
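
As a rough illustration of such a "backup log" table, here is a sketch with Python's sqlite3 module. The table and column names, the helper names, and the keep=10 pruning limit are all assumptions for the example. The use_wal switch assumes that the "alternate update method" refers to SQLite's write-ahead log (WAL) journal mode; with WAL enabled, side files appear next to the main database file while it is open.

```python
import sqlite3

def open_store(db_path, use_wal=False):
    """Open a document store that keeps autosave snapshots as table rows."""
    conn = sqlite3.connect(db_path)
    if use_wal:
        # Assumption: WAL is the alternate update method referred to above.
        conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS backup_log ("
        " id INTEGER PRIMARY KEY,"
        " saved_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
        " snapshot BLOB NOT NULL)"
    )
    conn.commit()
    return conn

def autosave(conn, snapshot, keep=10):
    """Add a snapshot row and prune all but the newest `keep` rows."""
    with conn:  # the insert and the pruning commit together or not at all
        conn.execute(
            "INSERT INTO backup_log (snapshot) VALUES (?)", (snapshot,)
        )
        conn.execute(
            "DELETE FROM backup_log WHERE id NOT IN ("
            " SELECT id FROM backup_log ORDER BY id DESC LIMIT ?)",
            (keep,),
        )
```

Old snapshots are pruned inside the same transaction, so no stray temp files are left behind for a backup program to filter out.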

Basically ANY programming language allows easy use of that
particular kind of solution. No add-ons needed.

Yes, and programmers will take the route of least resistance most every
time. That's the cause of so many popup modal dialog "OK" boxes
confirming that what you expected to happen did happen, but because of
the modalness, you now have to go dismiss the damn thing to get on
with whatever you are actually trying to do. Those popup modal "ok"
style boxes are often the only UI widget provided by default in any
given UI library - most anything else has to be "assembled yourself".

A concern is systems that update ALL OF THE TIME like databases.
Keeping in-transaction copies of every little file is less fun.
Totally do-able, and oft is, but less fun. Multi-user
record-only-locked files makes it even more less fun.

sqlite *is* a database, but it takes care of all those nitty gritty
details for you, so you don't have to care from the level of the app
you are writing.

But, alas, abrupt crashes/lockups or user madness is STILL a real
problem so SOMETHING has to be done. Computers are machines, and
machines fuck up and/or CAN be fucked-up.

Sqlite, being a database, goes that extra mile to make sure the changes
you commit to it survive most anything that can happen (beyond the
obviously unrecoverable, such as "disk storing file dies, and can no
longer be accessed"). With OS crashes, power failures, etc., it tries
its best (and claims to be very good at it) to avoid data loss.

The programmer gets none of that with the usual generic file-open,
file-write, file-close style interface that most languages provide.

