XPost: linux.debian.bugs.dist, linux.debian.maint.gtk.gnome
From: smcv@debian.org
Package: tech-ctte
Severity: normal
X-Debbugs-Cc: debian-dpkg@lists.debian.org, debian-gtk-gnome@lists.debian.org, vorlon@debian.org
I'm requesting advice from the tech-ctte (or anyone else with relevant
knowledge, e.g. the dpkg team or the drivers of the time64 transition)
on how to resolve glib2.0 bug #1065022. This is time-sensitive,
because it is an RC bug (temporarily breaking many applications across
this transition) and will hold up the time64 transition.
Background
==========
glib2.0 has two similar patterns where files that are managed by dpkg
are summarized in a non-dpkg-managed file, maintained by triggers and
the library postinst/postrm.
The first of these patterns is GSettings schemas. There is a tool that
loads GSettings schemas in XML format from /usr/share/glib-2.0/schemas
and aggregates them into a single binary blob in a more efficient format,
/usr/share/glib-2.0/schemas/gschemas.compiled. For performance reasons,
applications only load gschemas.compiled: there is no support for loading
the more authorable but less efficient XML files directly.
The second of these patterns is GIO modules, a plugin architecture which
loads .so files from /usr/lib/${DEB_HOST_MULTIARCH}/gio/modules
and summarizes their functionality
in /usr/lib/${DEB_HOST_MULTIARCH}/gio/modules/giomodule.cache.
Applications that want to load plugins parse giomodule.cache, and only
dlopen the plugins that provide the desired functionality (for example
an application that doesn't do any networking will not load plugins that
only implement gio-proxy-resolver).
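To illustrate the selection step, here is a minimal sketch of looking up
which modules a cache file advertises for one extension point. It assumes
each cache line has the form "<module>.so: <point>[,<point>...]"; the
function name and the sample module names used with it are illustrative,
and this is not GLib's own parser.

```shell
# modules_for_extension_point CACHE_FILE EXTENSION_POINT
# Prints the module file names whose cache entry lists EXTENSION_POINT.
# Assumed line format (not taken from this message):
#   <module>.so: <point>[,<point>...]
modules_for_extension_point() {
    awk -v want="$2" '
        /^#/ || !/:/ { next }    # skip comments and lines without a colon
        {
            name = $0
            sub(/:.*/, "", name)              # text before the first colon
            points = $0
            sub(/^[^:]*:[ \t]*/, "", points)  # text after the first colon
            n = split(points, list, /,[ \t]*/)
            for (i = 1; i <= n; i++)
                if (list[i] == want) { print name; break }
        }
    ' "$1"
}
```

With such a lookup, an application that never asks for gio-proxy-resolver
never sees (and so never dlopens) the modules that only provide it. If the
cache file is missing, the lookup has nothing to read at all.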
This is implemented with dpkg file-based triggers: when a package adds
or removes GSettings schemas or GIO modules, it triggers processing by
the libglib2.0-0{,t64} postinst. The implementation has been approximately
the same shape for 10 years, and has worked well until now.
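As a rough sketch of that shape (not the actual postinst), the dispatch
could look like the following. dpkg invokes a postinst as
"postinst triggered '<name> <name> ...'" when triggers fire; the function
name, the echoed descriptions and the assumption that "configure"
regenerates both files are illustrative only.

```shell
# plan_for_postinst ACTION [TRIGGER_NAMES]
# Prints which summary files a postinst of this shape would regenerate.
# Hypothetical helper: a real script would run the regeneration tools
# instead of printing a description.
plan_for_postinst() {
    case "$1" in
        triggered)
            # $2 is a space-separated list of activated trigger names
            for trigger in $2; do
                case "$trigger" in
                    /usr/share/glib-2.0/schemas)
                        echo "recompile gschemas.compiled" ;;
                    /usr/lib/*/gio/modules)
                        echo "rebuild giomodule.cache" ;;
                esac
            done
            ;;
        configure)
            # Assumption: a fresh configure regenerates both summaries
            echo "recompile gschemas.compiled"
            echo "rebuild giomodule.cache"
            ;;
    esac
}
```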
Because dpkg doesn't have an equivalent of RPM %ghost files, the two
generated summary files need to be deleted by the library's postrm.
As of bookworm (and still true in trixie), the implementation is:
- for giomodule.cache (per-architecture), the file is simply deleted by
  postrm remove
- for gschemas.compiled (shared by all architectures), if the multiarch
  refcount of the library reaches 0, then the file is deleted during the
  next postrm purge
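The two rules above can be modelled roughly as follows. This is a sketch
reconstructed from the description, not the shipped postrm: the function
name, the explicit ROOT argument (so it can be tried in a scratch
directory) and the way the remaining-instance count is passed in are all
assumptions.

```shell
# old_postrm ACTION ROOT MULTIARCH REMAINING
#   ACTION:    "remove" or "purge", as dpkg passes to a postrm
#   ROOT:      filesystem root to act on (a scratch directory when testing)
#   MULTIARCH: multiarch tuple, e.g. x86_64-linux-gnu
#   REMAINING: number of instances of the library still installed
# Hypothetical model of the behaviour described above; not the real script.
old_postrm() {
    action=$1 root=$2 multiarch=$3 remaining=$4
    case "$action" in
        remove)
            # Per-architecture cache: always deleted on removal
            rm -f "$root/usr/lib/$multiarch/gio/modules/giomodule.cache"
            ;;
        purge)
            # Shared file: deleted only once no instance is left
            if [ "$remaining" -eq 0 ]; then
                rm -f "$root/usr/share/glib-2.0/schemas/gschemas.compiled"
            fi
            ;;
    esac
}
```

Note that nothing in this logic checks whether another package still
relies on either file.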
The bug
=======
When we transition from libglib2.0-0 to libglib2.0-0t64, this involves
the removal of libglib2.0-0. In the postrm of libglib2.0-0, removing
libglib2.0-0:amd64 deletes
/usr/lib/x86_64-linux-gnu/gio/modules/giomodule.cache, and so on for
all the other architectures.
The result is that until the postinst of libglib2.0-0t64:amd64 is run,
amd64 applications will be unable to load GIO plugins, causing
functionality loss (for example, inability to use https, because the
TLS plugin is not loaded).
Similarly, either during or after the transition from libglib2.0-0
to libglib2.0-0t64, users will want to purge libglib2.0-0. In
the postrm of libglib2.0-0, if there are no multiarch
instances of libglib2.0-0 remaining, purging the package deletes
/usr/share/glib-2.0/schemas/gschemas.compiled. The result is that
applications that want to load GSettings schemas will not find their
required schemas, which is normally treated as a programming error
(incorrect installation) that causes a crash with an assertion failure.
The workaround is: after removal or purging of libglib2.0-0, reinstall
either libglib2.0-0t64 or any package that will trigger libglib2.0-0t64.
On multiarch systems, this must be done for the architecture that matches
the instance of libglib2.0-0 that was removed.
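To tell whether a system is affected and needs the workaround, a small
check along these lines may help. It is only a sketch: the function name
and the ROOT argument are illustrative, and the paths are the ones named
earlier in this message.

```shell
# missing_summary_files ROOT MULTIARCH
# Prints the path of each generated summary file that is absent under ROOT.
# Pass an empty ROOT ("") to check the running system.
missing_summary_files() {
    for f in \
        "$1/usr/lib/$2/gio/modules/giomodule.cache" \
        "$1/usr/share/glib-2.0/schemas/gschemas.compiled"
    do
        [ -e "$f" ] || echo "$f"
    done
}
```

For example, if `missing_summary_files "" x86_64-linux-gnu` prints
anything on an amd64 system, then something like
`apt-get install --reinstall libglib2.0-0t64:amd64` should regenerate
the files by re-running the postinst.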
During upgrade, I am unsure what ordering guarantees we have about
the postrm of libglib2.0-0 running before or after the postinst of
libglib2.0-0t64 - perhaps we avoid the giomodule.cache bug in practice,
because the postrm runs before the postinst? But purge can happen at any
later time, so we certainly cannot guarantee that libglib2.0-0t64.postinst
will run after purging libglib2.0-0.
I apologise for not having foreseen this.
Non-solutions
=============
I am not interested in solutions that would require the use of a time
machine to change the postrm that was shipped in bookworm: bookworm was
already released, and now we are stuck with it. *After* the time-sensitive
part of this issue has been solved, I plan to look into making the postrm
robust against future transitions similar to this one by adding some way
for the new package to take over responsibility for giomodule.cache and
gschemas.compiled, but for this particular transition it's too late: the
first time at which we could rely on that functionality is trixie -> forky.
I am also not interested in solutions that require design changes in GLib,
for example adding a fallback slow-path that ignores the absence of the
summary files and loads the individual GSettings schemas and GIO modules
directly. This is because upstream would not accept such a change, and it
would introduce significant delta into Debian, which we would potentially
never be able to remove (because the removed libglib2.0-0 can be purged
at any later date). I consider the deletion of these summary files to
be a packaging problem, which we should be able to solve in packaging.
Possible solution: delete libglib2.0-0.postrm in libglib2.0-0t64.preinst
========================================================================
libglib2.0-0t64 could gain a preinst that deletes
/var/lib/dpkg/info/libglib2.0-0:${DEB_HOST_ARCH}.postrm. This is a clear
Policy violation, but perhaps between closely cooperating packages
(glib2.0 and, er, glib2.0) it would be the least-bad answer to this?
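A minimal sketch of what such a preinst could do is below. It is not a
proposed final script: the function name is hypothetical, and ADMINDIR and
ARCH are taken as arguments here so the logic can be exercised against a
scratch directory. How the architecture would actually be supplied (for
example, substituted at build time) is left open.

```shell
# drop_old_postrm ACTION ADMINDIR ARCH
#   ACTION:   first argument dpkg passes to a preinst ("install", "upgrade", ...)
#   ADMINDIR: dpkg database directory, /var/lib/dpkg on a real system
#   ARCH:     Debian architecture of this package instance, e.g. amd64
# Hypothetical sketch only; deleting another package's maintainer script
# is the Policy violation discussed above.
drop_old_postrm() {
    case "$1" in
        install|upgrade)
            # rm -f succeeds even if the old package was never installed
            rm -f "$2/info/libglib2.0-0:$3.postrm"
            ;;
    esac
}
```

A real preinst would call this as something like
`drop_old_postrm "$1" /var/lib/dpkg amd64`; only the postrm for the
matching architecture is touched.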
There is nothing else in the postrm other than the two problematic file
deletions (I'll have to check bookworm, but this is certainly true for
trixie) so I think there would not be any harmful side-effect of this,
[continued in next message]