• Bug#1065170: tech-ctte: Requesting advice on glib2.0 #1065022, file del

    From Simon McVittie@1:229/2 to All on Fri Mar 1 13:00:02 2024
    XPost: linux.debian.bugs.dist, linux.debian.maint.gtk.gnome
    From: smcv@debian.org

    Package: tech-ctte
    Severity: normal
    X-Debbugs-Cc: debian-dpkg@lists.debian.org, debian-gtk-gnome@lists.debian.org, vorlon@debian.org

    I'm requesting advice from the tech-ctte (or anyone else with relevant
    knowledge, e.g. the dpkg team or the drivers of the time64 transition)
    on how to resolve glib2.0 bug #1065022. This is time-sensitive,
    because it is an RC bug (temporarily breaking many applications across
    this transition) and will hold up the time64 transition.

    Background
    ==========

    glib2.0 has two similar patterns where files that are managed by dpkg
    are summarized in a non-dpkg-managed file, maintained by triggers and
    the library postinst/postrm.

    The first of these patterns is GSettings schemas. There is a tool that
    loads GSettings schemas in XML format from /usr/share/glib-2.0/schemas
    and aggregates them into a single binary blob in a more efficient format,
    /usr/share/glib-2.0/schemas/gschemas.compiled. For performance reasons,
    applications only load gschemas.compiled: there is no support for loading
    the more authorable but less efficient XML files directly.

    The second of these patterns is GIO modules, a plugin architecture which
    loads .so files from /usr/lib/${DEB_HOST_MULTIARCH}/gio/modules
    and summarizes their functionality
    in /usr/lib/${DEB_HOST_MULTIARCH}/gio/modules/giomodule.cache.
    Applications that want to load plugins parse giomodule.cache, and only
    dlopen the plugins that provide the desired functionality (for example
    an application that doesn't do any networking will not load plugins that
    only implement gio-proxy-resolver).

    This is implemented with dpkg file-based triggers: when a package adds
    or removes GSettings schemas or GIO modules, it triggers processing by
    the libglib2.0-0{,t64} postinst. The implementation has been approximately
    the same shape for 10 years, and has worked well until now.
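
    As a rough illustration of the trigger-driven regeneration described
    above, here is a minimal sketch of what such a postinst step could look
    like. It is not the script shipped in glib2.0. The function name
    on_postinst, its second argument and the DRY_RUN switch are hypothetical,
    and calling glib-compile-schemas and gio-querymodules by unqualified name
    assumes they are on PATH.

```shell
#!/bin/sh
# Hypothetical sketch, not the shipped libglib2.0-0 postinst.
# on_postinst, its "multiarch" argument and DRY_RUN are illustrative assumptions.

on_postinst() {
    action="$1"       # first maintainer-script argument, e.g. "configure" or "triggered"
    multiarch="$2"    # multiarch tuple, e.g. x86_64-linux-gnu (assumed to be known)
    run=""
    if [ "${DRY_RUN:-0}" = 1 ]; then
        run="echo"    # print the commands instead of running them
    fi

    case "$action" in
        configure|triggered)
            # Rebuild gschemas.compiled from the XML schemas in this directory.
            $run glib-compile-schemas /usr/share/glib-2.0/schemas
            # Rebuild giomodule.cache for this architecture's plugin directory.
            $run gio-querymodules "/usr/lib/${multiarch}/gio/modules"
            ;;
    esac
}
```

    With DRY_RUN=1, calling "on_postinst triggered x86_64-linux-gnu" only
    prints the two commands, which is a safe way to see what would be
    regenerated.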

    Because dpkg doesn't have an equivalent of RPM %ghost files, the two
    generated summary files need to be deleted by the library's postrm.
    As of bookworm (and still true in trixie), the implementation is:

    - for giomodule.cache (per-architecture), the file is simply deleted by
    postrm remove

    - for gschemas.compiled (shared by all architectures), if the multiarch
    refcount of the library reaches 0, then the file is deleted during the
    next postrm purge
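
    The two deletion rules above can be sketched as follows. This is a
    simplified model rather than the bookworm postrm itself: ROOT is a
    hypothetical prefix that lets the logic run against a scratch directory,
    and the third argument (the number of instances of the package still
    installed) is a stand-in. How a real script obtains that count (dpkg
    exposes a reference count to maintainer scripts in
    DPKG_MAINTSCRIPT_PACKAGE_REFCOUNT) and the exact comparison it uses are
    deliberately left out.

```shell
#!/bin/sh
# Simplified model of the postrm behaviour described above (not the shipped script).
# ROOT and the "remaining" argument are hypothetical, for illustration only.

on_postrm() {
    action="$1"       # "remove" or "purge", as passed by dpkg
    multiarch="$2"    # e.g. x86_64-linux-gnu
    remaining="$3"    # assumed: instances of the package still installed
    root="${ROOT:-}"

    case "$action" in
        remove)
            # Per-architecture cache: deleted unconditionally on removal.
            rm -f "${root}/usr/lib/${multiarch}/gio/modules/giomodule.cache"
            ;;
        purge)
            # Shared file: deleted only once no instance of the package is left.
            if [ "$remaining" -eq 0 ]; then
                rm -f "${root}/usr/share/glib-2.0/schemas/gschemas.compiled"
            fi
            ;;
    esac
}
```

    Note that nothing in this logic knows whether another package has taken
    over the role of the one being removed.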

    The bug
    =======

    When we transition from libglib2.0-0 to libglib2.0-0t64, this involves
    the removal of libglib2.0-0. In the postrm of libglib2.0-0, removing
    libglib2.0-0:amd64 deletes
    /usr/lib/x86_64-linux-gnu/gio/modules/giomodule.cache, and so on for
    all the other architectures.

    The result is that until the postinst of libglib2.0-0t64:amd64 is run,
    amd64 applications will be unable to load GIO plugins, causing
    functionality loss (for example, inability to use https, because the
    TLS plugin is not loaded).

    Similarly, either during or after the transition from libglib2.0-0
    to libglib2.0-0t64, users will want to purge libglib2.0-0. In
    the postrm of libglib2.0-0, if there are no multiarch
    instances of libglib2.0-0 remaining, purging the package deletes
    /usr/share/glib-2.0/schemas/gschemas.compiled. The result is that
    applications that want to load GSettings schemas will not find their
    required schemas, which is normally treated as a programming error
    (incorrect installation) that causes a crash with an assertion failure.

    The workaround is: after removal or purging of libglib2.0-0, reinstall
    either libglib2.0-0t64 or any package that will trigger libglib2.0-0t64.
    On multiarch systems, this must be done for the architecture that matches
    the instance of libglib2.0-0 that was removed.
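
    For the workaround, a small helper along these lines could print one
    reinstall command per affected architecture. The function name is
    hypothetical, and the commands are only printed, not executed, so they
    can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) one reinstall command
# for each architecture passed as an argument.

print_reinstall_commands() {
    for arch in "$@"; do
        echo "apt-get install --reinstall libglib2.0-0t64:${arch}"
    done
}
```

    For example, "print_reinstall_commands amd64 i386" prints two commands;
    the arguments should be the architectures whose libglib2.0-0 instance
    was removed or purged.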

    During upgrade, I am unsure what ordering guarantees we have about
    the postrm of libglib2.0-0 running before or after the postinst of
    libglib2.0-0t64 - perhaps we avoid the giomodule.cache bug in practice,
    because the postrm runs before the postinst? But purge can happen at any
    later time, so we certainly cannot guarantee that libglib2.0-0t64.postinst
    will run after purging libglib2.0-0.

    I apologise for not having foreseen this.

    Non-solutions
    =============

    I am not interested in solutions that would require the use of a time
    machine to change the postrm that was shipped in bookworm: bookworm was
    already released, and now we are stuck with it. *After* the time-sensitive
    part of this issue has been solved, I plan to look into making the postrm
    robust against future transitions similar to this one by adding some way
    for the new package to take over responsibility for giomodule.cache and
    gschemas.compiled, but for this particular transition it's too late: the
    first time at which we could rely on that functionality is trixie -> forky.

    I am also not interested in solutions that require design changes in GLib,
    for example adding a fallback slow-path that ignores the absence of the
    summary files and loads the individual GSettings schemas and GIO modules
    directly. This is because upstream would not accept such a change, and it
    would introduce significant delta into Debian, which we would potentially
    never be able to remove (because the removed libglib2.0-0 can be purged
    at any later date). I consider the deletion of these summary files to
    be a packaging problem, which we should be able to solve in packaging.

    Possible solution: delete libglib2.0-0.postrm in libglib2.0-0t64.preinst
    ========================================================================

    libglib2.0-0t64 could gain a preinst that deletes
    /var/lib/dpkg/info/libglib2.0-0:${DEB_HOST_ARCH}.postrm. This is a clear
    Policy violation, but perhaps between closely cooperating packages
    (glib2.0 and, er, glib2.0) it would be the least-bad answer to this?
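
    A minimal sketch of that preinst step, assuming the architecture is
    known at build time and substituted in. The function name and the ROOT
    prefix are hypothetical; ROOT exists only so the snippet can be tried
    against a scratch directory instead of the real dpkg database.

```shell
#!/bin/sh
# Sketch of the proposed preinst step, not a tested or shipped script.
# remove_old_postrm and ROOT are illustrative assumptions.

remove_old_postrm() {
    arch="$1"            # e.g. amd64, standing in for ${DEB_HOST_ARCH}
    root="${ROOT:-}"
    # Delete the old library's postrm so that a later remove or purge
    # of libglib2.0-0 can no longer delete the generated summary files.
    rm -f "${root}/var/lib/dpkg/info/libglib2.0-0:${arch}.postrm"
}
```

    Only the postrm for the given architecture is touched, so on a
    multiarch system each instance of libglib2.0-0t64 would deal with its
    own counterpart.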

    There is nothing else in the postrm other than the two problematic file
    deletions (I'll have to check bookworm, but this is certainly true for
    trixie) so I think there would not be any harmful side-effect of this,

    [continued in next message]
