• Re: DEP 17: Improve support for directory aliasing in dpkg

    From Guillem Jover@21:1/5 to Helmut Grohne on Sat Apr 8 04:40:01 2023
    Hi!

    On Mon, 2023-04-03 at 14:02:13 +0200, Helmut Grohne wrote:
    I have been looking into the aliasing problems in dpkg on behalf of Freexian's Debian funding. To that end I proposed a possible way forward
    last year (https://lists.debian.org/debian-dpkg/2022/11/msg00007.html),
    but the feedback I got was not particularly helpful in determining
    consensus.

    I thought my reply was rather clear, and that we had further clarified
    that privately, that at the time I thought there was no other answer
    required as (AFAIR) you stated you'd be digging further on it. And I
    mentioned I'd try to reply to the list, but it didn't feel urgent given
    the clarifications given, nor the timing during the freeze?

    A little later, Simon Richter also looked into the problem (https://lists.debian.org/debian-dpkg/2022/12/msg00023.html), but
    remained silent after the initial post. Little happened since then. Now Raphael Hertzog proposed to use the DEP process to get this thing
    unstuck

    Sigh, a DEP(!?), for a dpkg change? It feels more like a way to exert
    pressure over this than anything else TBH…

    and with the help of Emilio Pozuelo Monfort I created a draft
    for discussion. I allocate number 17 via debian-project@l.d.o. What
    follows is the draft text. Please consider it to be a piece of best intentions at reconciling feedback wherever I could.

    I'm unlikely to discuss this topic on debian-devel, given previous
    nastiness and abuse.

    The text includes most (but not all) of what I've been saying publicly,
    and what I've tried to further clarify to you and Emilio in private.
    But I think ignores the essence of what I've been repeating all along.

    Introduction
    ============

    At its core, `dpkg` assumes that every filename uniquely refers to a
    file on disk. The situation where two distinct filenames refer to the
    same file on disk is referred to as aliasing.

    (To be precise, I think this describes hardlinks. Aliasing occurs when different pathnames, whose last component is not a symlink, all
    refer to the same filename in the same directory. But I don't think this matters much.)

    Proposal
    ========

    In order to handle aliasing efficiently, `dpkg` gains new options `--add-alias <symlink>`, `--remove-alias <symlink>` and
    `--list-aliases`. When creating symbolic links that cause aliasing
    effects, the creating entity is supposed to inform `dpkg` using an appropriate invocation. Doing so records the aliasing information in a
    new mapping inside its administrative directory. No existing
    administrative files are modified as a result of this operation. When
    `dpkg` operates on paths, it can compute a canonicalized version using a
    pure function without the need to `stat()` files on disk thus greatly improving performance. Canonicalized paths are only needed when
    determining whether a file conflict exists. In all other cases,
    original paths continue to be used as symbolic links will be followed by filesystem operations. The `--add-alias` operation records the target
    of the symbolic link that must exist prior to invocation. The `--remove-alias` operation fails if any files are still installed in the aliased location.
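
    The canonicalization step described in the draft can be pictured roughly as follows. This is only a minimal sketch in Python, assuming the alias mapping is a plain dictionary from symlink location to target; the names `canonicalize`, `conflicts` and `ALIASES` are hypothetical and are not part of dpkg or of the draft.

```python
import posixpath


def canonicalize(path, aliases):
    """Map `path` to its canonical form using only the alias table.

    `aliases` maps an aliasing symlink (e.g. "/bin") to its target directory
    (e.g. "/usr/bin"). Hypothetical helper: no filesystem access, no stat().
    """
    path = posixpath.normpath(path)
    # Each pass rewrites at most one aliased prefix. Needing more passes than
    # there are aliases is treated as a looping alias table and rejected.
    for _ in range(len(aliases) + 1):
        for link, target in aliases.items():
            if path.startswith(link + "/"):
                path = target + path[len(link):]
                break
        else:
            return path
    raise ValueError("alias mapping loops")


def conflicts(path_a, path_b, aliases):
    """Two shipped paths conflict if they canonicalize to the same string."""
    return canonicalize(path_a, aliases) == canonicalize(path_b, aliases)


# Assumed example mapping, mirroring the merged-/usr layout.
ALIASES = {"/bin": "/usr/bin", "/sbin": "/usr/sbin", "/lib": "/usr/lib"}
```

    With such a mapping, `conflicts("/bin/sh", "/usr/bin/sh", ALIASES)` is true without either path being looked up on disk. The symlink pathname itself (e.g. `/bin`) is deliberately left unchanged; only paths below it are rewritten.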

    I already mentioned this in my reply for the thread you reference. So,
    let me repeat and possibly expand to avoid any future doubt. I already considered and discarded something like this (except for using a config
    option instead of a new command, but that does not really change the
    substance of the problems).

    Let's also get back to the very basics. dpkg manages objects shipped
    in binary packages, on the filesystem. It assumes this managing role in exclusivity, it will for example overwrite unmanaged files. It preserves
    admin changes with interfaces specifically provided for that (diversions, statoverrides, conffile changes) or the unfortunate symlink redirects.
    These shipped objects define the filesystem layout (not the other way
    around). Due to the missing fsys metadata, where it does not have all
    such metadata at hand when necessary (it might only have the one for
    the currently unpacked .deb), it might use heuristics or check the
    filesystem for such metadata, because it does not have anything else,
    but that should not be taken to mean that the filesystem is the source
    of truth, as most of those will be unnecessary once it has such
    metadata at hand.

    So the reason this proposal is still conceptually wrong is manifold:

    * dpkg cannot safely and atomically perform such switches (and I don't
    see it ever being able to portably do so, so I don't see ever
    supporting that).
    * No package ships those symlinks (and none should! as that would
    currently imply having the same pathname contain different file types
    on the same system, introducing ordering issues and file type
    conflicts).
    * This introduces a series of commands to let dpkg know that a
    filesystem change that was not shipped in any .deb (even though that
    should have been the way to do it), has been done, which:
    - Switches the source of truth from the .deb to the fsys.
    - Confuses admin initiated changes from distro initiated ones.
    * Wants to be a generic change but it is really targeted to this
    specific mess. We have been doing similar aliasing transitions for
    many doc dirs, by stopping shipping files within, shipping that
    pathname as a symlink and then switching the directories to symlinks
    to match (via the dpkg-maintscript-helper hack because we miss fsys
    metadata). This means we'd need to then register all these directories
    too? Meh.
    * This information can get out of sync with reality, as it adds an
    additional source of truth, unconnected with anything else, that dpkg
    cannot do anything about if it diverges (in contrast to diversions
    or statoverrides f.ex.). This can never happen when that information
    comes from the real source of truth (the fsys metadata via the .deb).
    * This also adds undue complexity, by supporting those as admin aliases.
    The admin generated redirecting symlinks are already annoying, I'd rather
    not add further to that pile. I don't really want to support admins doing
    this (dpkg-divert does not even support diverting a directory).

    [ As an aside, I think ideally eventually nothing distro provided should
    be allowed to be installed within an aliased dir, and dpkg should
    eventually just error out in those cases, which eventually would get
    rid of the aliasing problems and any such complexity (I'm not sure how
    or when that would be feasible though, but obviously in Debian at
    least not until nothing ships files there). ]

    So this still looks like a terrible interface, like it did at the time
    it was discarded; founded on a hack, an interface that seems to want to
    be kind of a file-type override but it cannot be, and cannot even
    properly act as record tracker, etc…

    Rejected proposals
    ==================

    Hardcoding aliases into dpkg
    ----------------------------

    It was suggested to include a static aliasing mapping into the `dpkg`
    source code. Since `dpkg` is used by multiple projects in different
    ways (not necessarily Debian-derivatives), this approach would break
    other consumers. Also note that Debian's `dpkg` can be used to operate
    on an installation using different aliases via the `--root` flag. As
    such the alias mapping needs to be a property of the installation.

    Yes.

    Modifying package lists in place
    --------------------------------

    `dpkg` could rewrite the extracted `.list` files from `control.tar` and
    store paths in canonicalized form. Canonicalization would happen
    when a `control.tar` is extracted. It would also happen either as a
    one-time conversion during the upgrade of `dpkg` or whenever a `.list`
    file is read. Given canonicalized list files, string comparison on
    files would support conflict detection. Other pieces to be updated in a similar way include `alternatives`, `diversions`, `statoverride`, and `triggers`.

    This would affect the output of `dpkg -S`, which would then output canonicalized paths. Packages generated by `dpkg-repack` would have
    their contents canonicalized as well.

    This is an interface breaking change, as it introduces
    change-at-a-distance for packages themselves, and reproducibility
    issues that depend on the system at hand. As it is based on a foundation
    of an invented filesystem view, as those remapped packages never shipped
    those pathnames.

    Managing the aliasing mapping using a control file
    --------------------------------------------------

    It was suggested that the mapping could be managed via a special control
    file `canonical`. Given that aliasing is not a common operation, the
    benefit of handling it declaratively is minor. Beyond that, aliasing
    can also happen as a customization issued by an administrator.
    Therefore, a command line based approach is preferred.

    As long as the package does not provide the symlinks, shipping this
    type of information declaratively would also be conceptually wrong.
    And it is just a distraction from the fsys metadata stuff, with all
    the drawbacks of the CLI commands.

    Having dpkg move files and create symbolic links
    ------------------------------------------------

    When instructed with `--add-alias`, `dpkg` could also create the corresponding symbolic links and move the affected files to their new location. While that would be convenient, doing so is non-trivial in an atomic way. Sometimes, the underlying filesystem does not fully conform
    to POSIX (e.g. `overlayfs`) and such corner cases need to be managed individually. Since such an implementation already exists outside
    `dpkg` and its complexity is non-trivial, the moving of files shall
    remain external. In case aliases are set up in a bootstrap setting, no
    moves are necessary.

    dpkg expects several requirements for filesystems semantics, if they do
    not provide them, then those filesystems are not supported for dpkg to
    manage objects on them.

    dpkg cannot guarantee atomicity and safety for this kind of aliasing
    switch, and I don't see it will ever be able to support performing such
    switch, as that can break the system.

    Implement aliasing after metadata tracking
    ------------------------------------------

    The [metadata tracking](https://wiki.debian.org/Teams/Dpkg/Spec/MetadataTracking)
    feature enhances `dpkg` with knowledge about filesystem metadata for installed files. This includes knowledge of symbolic links, which would
    help with tracking aliasing. Unfortunately, progress on this is fairly
    slow and we think that aliasing support is more urgent.

    I thought it would be clear that if there is stuff that depends on
    any of this kind of changes to dpkg, relying on those changes in
    Debian would not be possible until after trixie+1. Of course there is
    always the route to further pile up over the Jenga tower of hacks,
    by for example adding huge amounts of Pre-Depends…

    So given the above, I don't see why the apparent rush here. And as I've mentioned many times now, I'm planning to continue working on the fsys
    metadata stuff for 1.22.x, probably at the cost of database duplication
    if necessary, if current blockers have not adapted by then. But as I've mentioned before, that might not guarantee this support is sufficient to support fixing this mess. But all other proposed changes I've seen
    flying around for changes to dpkg are just conceptually wrong in one way
    or another.

    Regards,
    Guillem

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Helmut Grohne@21:1/5 to Guillem Jover on Sat Apr 22 13:00:01 2023
    Hi Guillem,

    On Sat, Apr 08, 2023 at 04:35:25AM +0200, Guillem Jover wrote:
    I thought my reply was rather clear, and that we had further clarified
    that privately, that at the time I thought there was no other answer
    required as (AFAIR) you stated you'd be digging further on it. And I mentioned I'd try to reply to the list, but it didn't feel urgent given
    the clarifications given, nor the timing during the freeze?

    I'm sorry for the misunderstanding. Thank you for replying here as it
    helps clarify the relevant matters in various areas. The freeze timing
    arises because we ideally merge intrusive changes early in a release
    cycle.

    Sigh, a DEP(!?), for a dpkg change? It feels more like a way to exert pressure over this than anything else TBH…

    I'm sorry if you perceive it as such. The intention was to capture
    consensus around a change that is necessary in some sense and
    controversial in another.

    I'm unlikely to discuss this topic on debian-devel, given previous
    nastiness and abuse.

    I see. I would have appreciated a Cc in that case as I am not subscribed
    to debian-dpkg.

    The text includes most (but not all) of what I've been saying publicly,
    and what I've tried to further clarify to you and Emilio in private.
    But I think ignores the essence of what I've been repeating all along.

    It's good to see that I captured many of your arguments. That essence
    however probably is the sticking point where we fundamentally disagree.
    I'll reference this "fundamental disagreement" a few more times and
    circle back to it later.

    I already mentioned this in my reply for the thread you reference. So,
    let me repeat and possibly expand to avoid any future doubt. I already considered and discarded something like this (except for using a config option instead of a new command, but that does not really change the substance of the problems).

    Thank you for repeating it in such clarity.

    Let's also get back to the very basics. dpkg manages objects shipped
    in binary packages, on the filesystem. It assumes this managing role in exclusivity, it will for example overwrite unmanaged files. It preserves admin changes with interfaces specifically provided for that (diversions, statoverrides, conffile changes) or the unfortunate symlink redirects.
    These shipped objects define the filesystem layout (not the other way around). Due to the missing fsys metadata, where it does not have all
    such metadata at hand when necessary (it might only have the one for
    the currently unpacked .deb), it might use heuristics or check the
    filesystem for such metadata, because it does not have anything else,
    but that should not be taken to mean that the filesystem is the source
    of truth, as most of those will be unnecessary once it has such
    metadata at hand.

    This captures an insight I previously didn't have in that clarity and
    that I find agreeable conceptually.

    So the reason this proposal is still conceptually wrong is manifold:

    * dpkg cannot safely and atomically perform such switches (and I don't
    see it ever being able to portably do so, so I don't see ever
    supporting that).

    I agree, but the proposal also does not ask dpkg to perform such
    switches, so I kinda fail to see how this is a relevant argument.

    * No package ships those symlinks (and none should! as that would
    currently imply having the same pathname contain different file types
    on the same system, introducing ordering issues and file type
    conflicts).

    I disagree with this argument on two levels. For one thing, I think that
    the transition only is complete once these symlinks are shipped in a
    package. In particular, that notion of complete likely encompasses that
    no aliasing occurs anymore as all aliased files have been moved to their canonical location somehow (<- and this likely will be a quite difficult
    thing to do). For another, no package actually ships those symlinks now.
    They are created behind dpkg's back in some postinst. This is
    unfortunate and I agree with Simon Richter that this kinda is a policy violation, but at this time, it is an aspect we have to deal with
    whether we want to or not.

    I suspect that you disagree with the notion that we have to deal with
    this situation, which I consider to be our fundamental disagreement.

    * This introduces a series of commands to let dpkg know that a
    filesystem change that was not shipped in any .deb (even though that
    should have been the way to do it), has been done, which:
    - Switches the source of truth from the .deb to the fsys.

    While this is correct on some level, the aim of this change is to put
    that truth back into dpkg of course.

    - Confuses admin initiated changes from distro initiated ones.

    I think we already do this with dpkg-divert, dpkg-statoverride and other
    tools. While this may not be nice, it certainly has prior art and is
    consistent with how we have been doing things in the past.

    * Wants to be a generic change but it is really targeted to this
    specific mess. We have been doing similar aliasing transitions for
    many doc dirs, by stopping shipping files within, shipping that
    pathname as a symlink and then switching the directories to symlinks
    to match (via the dpkg-maintscript-helper hack because we miss fsys
    metadata). This means we'd need to then register all these directories
    too? Meh.

    I would love to agree with this, but I believe that this ship has
    sailed. This likely is part of our fundamental disagreement.

    * This information can get out of sync with reality, as it adds an
    additional source of truth, unconnected with anything else, that dpkg
    cannot do anything about if it diverges (in contrast to diversions
    or statoverrides f.ex.). This can never happen when that information
    comes from the real source of truth (the fsys metadata via the .deb).

    I have difficulties accurately capturing the argument. The problem of information getting out of sync with reality should affect every aspect
    of dpkg and indeed, that kinda is the status quo where upgrades can
    lose files, because dpkg has an incomplete picture of reality. The aim
    of this change is to allow us to re-sync the status quo into dpkg. My
    view is that dpkg's information presently is out of sync with reality
    and the proposed change partially fixes that.

    * This also adds undue complexity, by supporting those as admin aliases.
    The admin generated redirecting symlinks are already annoying, I'd rather
    not add further to that pile. I don't really want to support admins doing
    this (dpkg-divert does not even support diverting a directory).

    I have to agree with this one. This and the fact that this feature is
    probably impossible to remove later is my main problem with the proposed change. I have been proposing it anyway, because my impression is that
    it is the least bad option available. The notion of "least bad" quite
    obviously is somewhat subjective and is where we need to get consensus.

    [ As an aside, I think ideally eventually nothing distro provided should
    be allowed to be installed within an aliased dir, and dpkg should
    eventually just error out in those cases, which eventually would get
    rid of the aliasing problems and any such complexity (I'm not sure how
    or when that would be feasible though, but obviously in Debian at
    least not until nothing ships files there). ]

    It seems to me that this is something everyone agrees on. So our
    disagreement resides in the way to get there rather than where to get
    to.

    So this still looks like a terrible interface, like it did at the time
    it was discarded; founded on a hack, an interface that seems to want to
    be kind of a file-type override but it cannot be, and cannot even
    properly act as record tracker, etc…

    I agree that in a perfect world, we would not need this. Let me circle
    back to our fundamental disagreement.

    My impression is that at this time basically everyone except you agrees
    that we have to deal with the aliasing problems that have been rolled
    out to users and will be forced in bookworm. I believe that this is the
    state that we have to consider as starting point and that we cannot
    magically turn this transition back to perform it in a better way. And
    indeed, I believe that there would have been a better way[1] that no
    longer is available to us.

    On the other hand, my impression is that you continue to see the
    transition as fundamentally broken and in a state that we cannot work
    from. You appear to believe that if we want to do it, we must start over
    in a better way. That better way must not cause aliasing problems to
    dpkg.

    And sure enough, DEP 17 is a result of having first created these
    aliases and now trying to move the files. If we had opted for first
    moving the files and then creating the aliases, many of the bad effects
    would never have affected anyone and we wouldn't be discussing this
    change as it would not be necessary. With DEP 17 (or any similar change
    to dpkg), we will be able to actually move files to their canonical
    location and thus resolve the aliasing at which point DEP 17 will become unused.

    Did I accurately describe your view on this matter?

    I thought it would be clear that if there is stuff that depends on
    any of this kind of changes to dpkg, relying on those changes in
    Debian would not be possible until after trixie+1. Of course there is
    always the route to further pile up over the Jenga tower of hacks,
    by for example adding huge amounts of Pre-Depends…

    I agree that we probably will deal with this until at least trixie+1.
    This is precisely why I would like to have a plan to finish it sooner
    rather than later.

    So given the above, I don't see why the apparent rush here. And as I've mentioned many times now, I'm planning to continue working on the fsys metadata stuff for 1.22.x, probably at the cost of database duplication
    if necessary, if current blockers have not adapted by then. But as I've mentioned before, that might not guarantee this support is sufficient to support fixing this mess. But all other proposed changes I've seen
    flying around for changes to dpkg are just conceptually wrong in one way
    or another.

    As I see it, the fsys metadata work may help with the aliasing problems
    only if the respective symlinks are actually shipped in a data.tar of a
    .deb, which is not the way we currently do things. For that reason, I
    fail to see how the fsys metadata work is part of the solution for the
    problems we currently experience. I'd appreciate if you could elaborate
    on how you see it helping to fix the file loss on upgrade problem
    affecting current installations if you have sufficient energy to do so.

    Helmut

    [1] What follows is to be considered a time travel fix and is purely
    academic. Imagine we had never done the usrmerge and would be
    introducing it again in a usrmerge package that works quite
    differently. Rather than create those aliases upfront, it would
    declare a trigger interest in affected locations (i.e. both /bin and
    /usr/bin and so on). It would practically trigger on every package
    operation. Whenever triggered, it would scan the non-/usr location
    for files. If any are found, it would populate both the /usr
    location and the non-/usr location with symlinks to the actual
    files. However when the non-/usr location becomes empty, it would
    remove the hierarchy and place the symlink directly. In that
    approach, we'd end up at the same layout that we currently have, but
    use symlink farms until we reach the point that all files have been
    moved. There are a number of aspects where this approach is
    problematic as well, but I think it is no longer useful to discuss
    it as switching to this approach is not an available option anymore.

  • From Helmut Grohne@21:1/5 to Simon Richter on Sat Apr 22 13:00:01 2023
    Hi Simon,

    On Sat, Apr 08, 2023 at 04:06:54PM +0200, Simon Richter wrote:
    Yes, I am quite busy, but it's not forgotten. I keep adding new test cases.

    Thank you for taking the time to follow up. I discarded many of your
    arguments in this reply due to agreement.

    Dpkg already has defined behaviour for directory vs symlink: the directory wins. In principle a future version of dpkg could change that, but /lib/ld-linux.so.2 is just too special, we'd never want to have a package that actually moves it.

    Your argument of the dynamic linker being too special is interesting. I certainly agree that we must take great care at moving it, but on the
    flip side I do not consider the transition complete until we reach the
    point where we have moved it.

    Do you actually see us coping with some aliases (e.g. the /lib one)
    eternally?

    That's why I went with "this needs to be a separate mechanism."

    Or do you want to say that we need a new mechanism to be able to move
    such important files?

    The reason to use a control file instead of a tool would be to install the alias from an Essential package, so the old-school "unpack essential packages, then overwrite with dpkg" approach to system installation would work again without special-casing usrmerge in debootstrap&co.

    I did not have this goal in mind, but now that you mention it, it seems important to me. It is not clear to me though how that control file
    actually gets us there. In your picture, which component is in charge of actually creating the symbolic links on the filesystem? Can you go into
    detail as to how you imagine that bootstrap without special-casing?

    It was suggested that the mapping could be managed via a special control file `canonical`. Given that aliasing is not a common operation, the benefit of handling it declaratively is minor. Beyond that, aliasing
    can also happen as a customization issued by an administrator. Therefore, a command line based approach is preferred.

    The advantage is that it works for Essential packages, like the one shipping /lib/ld-linux.so.2.

    In the --add-alias variant, I think we would still move the dynamic
    linker to /usr and ship the /lib symlink in base-files eventually. I
    admit that it is not entirely clear how such a move could be performed
    safely, but that seems like a solvable problem to me. In that way, I
    fail to see the control file being an advantage.

    Helmut

  • From Guillem Jover@21:1/5 to Helmut Grohne on Wed Jun 21 13:40:01 2023
    Hi!

    On Sat, 2023-04-22 at 10:27:26 +0200, Helmut Grohne wrote:
    On Sat, Apr 08, 2023 at 04:35:25AM +0200, Guillem Jover wrote:
    Let's also get back to the very basics. dpkg manages objects shipped
    in binary packages, on the filesystem. It assumes this managing role in exclusivity, it will for example overwrite unmanaged files. It preserves admin changes with interfaces specifically provided for that (diversions, statoverrides, conffile changes) or the unfortunate symlink redirects. These shipped objects define the filesystem layout (not the other way around). Due to the missing fsys metadata, where it does not have all
    such metadata at hand when necessary (it might only have the one for
    the currently unpacked .deb), it might use heuristics or check the filesystem for such metadata, because it does not have anything else,
    but that should not be taken to mean that the filesystem is the source
    of truth, as most of those will be unnecessary once it has such
    metadata at hand.

    This captures an insight I previously didn't have in that clarity and
    that I find agreeable conceptually.

    So the reason this proposal is still conceptually wrong is manifold:

    * dpkg cannot safely and atomically perform such switches (and I don't
    see it ever being able to portably do so, so I don't see ever
    supporting that).

    I agree, but the proposal also does not ask dpkg to perform such
    switches, so I kinda fail to see how this is a relevant argument.

    It is relevant because it affects the end state, and what solutions
    are going to be appropriate then. See below.

    This might perhaps also have been a source of misunderstanding, my
    thinking is not focused solely on this particular instance but how
    this interacts with other current or long term behavior and upcoming
    features, and how this all would look like in the end.

    * No package ships those symlinks (and none should! as that would
    currently imply having the same pathname contain different file types
    on the same system, introducing ordering issues and file type
    conflicts).

    I disagree with this argument on two levels. For one thing, I think that
    the transition only is complete once these symlinks are shipped in a
    package. In particular, that notion of complete likely encompasses that
    no aliasing occurs anymore as all aliased files have been moved to their canonical location somehow (<- and this likely will be a quite difficult thing to do). For another, no package actually ships those symlinks now.
    They are created behind dpkg's back in some postinst. This is
    unfortunate and I agree with Simon Richter that this kinda is a policy violation, but at this time, it is an aspect we have to deal with
    whether we want to or not.

    I suspect that you disagree with the notion that we have to deal with
    this situation, which I consider to be our fundamental disagreement.

    I don't think we disagree (?), I probably didn't express myself clearly.
    The fact that no package ships those symlinks *is* and *has* been a
    problem, and what I've been saying all along, this will be the only
    correct way to let dpkg know whether there will be aliasing in play.
    At the same time what I was trying to say is that we cannot ship those
    symlinks because even though dpkg does not yet track fsys metadata
    (even though it should and is one requirement to be able to be
    aliasing-aware), it would be an implicit file type conflict, where
    dpkg (currently) would not know or be able to do anything meaningful
    with it, and might make unpacks fail in the future (depending on the
    ordering of packages being unpacked).

    Coming now back to the atomic and safe switches, and the ordering,
    as I think I've mentioned elsewhere, dpkg should eventually be made aliasing-aware, in that it should know about all fsys file types and
    be able to detect these cases during unpack (once these symlinks are
    properly shipped in a package). But given these mentioned constraints
    it cannot be made to support (as in accept) unpacking files inside
    aliased directories (it should be able to unpack the symlinks creating
    those aliased directories though!).

    There are several reasons for that:

    * One is that the expected behavior for file types tracked by dpkg
    is to switch their file type if this is data.tar initiated and the
    operation can be done when the dirs are empty (so to get rid of
    these dpkg-maintscript-helper parts) and otherwise abort; applying the
    symlink←→dir preservation behavior should only be done (if at all)
    for admin initiated changes on the fsys.
    * Another is that dpkg would need to allow those pathnames to have at
    the same time two sets of metadata attributes (mode, perms, xattrs,
    file type, one a symlink target), which is a terrible interface.
    * But more importantly this causes ordering issues and unpredictability.
    If there is a package A shipping a directory and package B shipping
    an aliasing symlink on the same pathname, and package C shipping also
    contents within that directory, and we have established that dpkg
    cannot always safely perform such file type switch, then depending on
    the unpack order and whether the "directory" is empty or not, dpkg
    would be able to perform the file type switch or not, and you might
    end up with files appearing in two "directories" and with an
    aliased directory or not. This is also terrible behavior. And that's
    why I say dpkg should simply refuse that, and something that should
    not be supported.
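
    A toy simulation can make the ordering problem concrete. This is only a
    sketch: the three packages, the /lib and /usr/lib paths, and the rule
    "switch a directory to a symlink only when it is missing or empty" are
    assumptions modelled on the scenario above, not dpkg's actual unpack logic.

```python
from itertools import permutations


def unpack(order):
    """Unpack hypothetical packages A, B and C in the given order.

    A ships /lib as a directory, B ships /lib as a symlink to /usr/lib,
    and C ships the file /lib/foo.
    """
    fs = {}  # path -> "dir" | "symlink" | "file"
    for pkg in order:
        if pkg == "A":
            # Model assumption: a shipped directory never replaces a symlink.
            fs.setdefault("/lib", "dir")
        elif pkg == "B":
            # Model assumption: the switch is refused if /lib has contents.
            has_content = any(p.startswith("/lib/") for p in fs)
            if not has_content:
                fs["/lib"] = "symlink"
        elif pkg == "C":
            fs.setdefault("/lib", "dir")
            if fs["/lib"] == "symlink":
                fs["/usr/lib/foo"] = "file"  # written through the alias
            else:
                fs["/lib/foo"] = "file"
    return fs


def outcomes():
    """Map each unpack order to (type of /lib, list of files on disk)."""
    results = {}
    for order in permutations("ABC"):
        fs = unpack(order)
        files = sorted(p for p, kind in fs.items() if kind == "file")
        results["".join(order)] = (fs["/lib"], files)
    return results
```

    In this model the orders ABC, BAC and BCA end with /lib as a symlink,
    while ACB, CAB and CBA end with a real directory, so the final layout
    depends purely on the unpack order.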

    * This introduces a series of commands to let dpkg know that a
    filesystem change that was not shipped in any .deb (even though that
    should have been the way to do it), has been done, which:
    - Switches the source of truth from the .deb to the fsys.

    While this is correct on some level, the aim of this change is to put
    that truth back into dpkg of course.

    Sure, the problem is the price that will need to be paid to get there,
    in terms of problematic interfaces or behavior and what kind of
    workarounds or hacks that will entail, and for how long.

    - Confuses admin initiated changes from distro initiated ones.

    I think we already do this with dpkg-divert, dpkg-statoverride and other
    tools. While this may not be nice, it certainly has prior art and is
    consistent with how we have been doing things in the past.

    dpkg-divert distinguishes between local and package level changes, it
    is true that dpkg-statoverride does not have (currently) that
    distinction, although it is primarily an admin tool where I don't
    think it makes much sense to support something like declarative
    package statoverrides TBH once we can ship fsys metadata (perhaps
    conditional one though).

    * Wants to be a generic change but it is really targeted to this
    specific mess. We have been doing similar aliasing transitions for
    many doc dirs, by stopping shipping files within, shipping that
    pathname as a symlink and then switching the directories to symlinks
    to match (via the dpkg-maintscript-helper hack because we miss fsys
    metadata). This means we'd need to then register all these directories
    too? Meh.

    I would love to agree with this, but I believe that this ship has
    sailed. This likely is part of our fundamental disagreement.

    The comment was not focused on how this could have been done, but in
    that this is a common operation we do, and would need to get the same treatment, which seems bad.

    * This information can get out of sync with reality, as it adds an
    additional and unconnected with anything source of truth, that dpkg
    cannot do anything about if it diverges (in contrast to diversions
    or statoverrides f.ex.). This can never happen when that information
    comes from the real source of truth (the fsys metadata via the .deb).

    I have difficulties accurately capturing the argument. The problem of information getting out of sync with reality should affect every aspect
    of dpkg and indeed, that kinda is the status quo where upgrades can
    lose files, because dpkg has an incomplete picture of reality. The aim
    of this change is to allow us to re-sync the status quo into dpkg. My
    view is that dpkg's information presently is out of sync with reality
    and the proposed change partially fixes that.

    The current problem stems from both dpkg lacking fsys metadata and
    Debian holding dpkg wrong in an unsupported way, but where ideally
    both of these will eventually go away (?). My objection was that the
    proposal introduces a mechanism which makes things worse because it
    adds more information sources that can/will get out of sync.

    [ As an aside, I think ideally eventually nothing distro provided should
    be allowed to be installed within an aliased dir, and dpkg should
    eventually just error out in those cases, which eventually would get
    rid of the aliasing problems and any such complexity (I'm not sure how
    or when that would be feasible though, but obviously in Debian at
    least not until nothing ships files there). ]

    It seems to me that this is something everyone agrees on. So our
    disagreement resides in the way to get there rather than where to get
    to.

    If that's the case, then great. My impression though is that some
    people expect dpkg will be able to unpack content within aliased
    directories (?), which I don't see happening for the reasons I
    mentioned above. This will imply that you cannot install any old
    package that ships content there, which might be unexpected, but I
    don't see any other sane way to handle this. :/

    So this still looks like a terrible interface, like it did at the time
    it was discarded; founded on a hack, an interface that seems to want to
    be kind of a file-type override but it cannot be, and cannot even
    properly act as a record tracker, etc…

    I agree that in a perfect world, we would not need this. Let me circle
    back to our fundamental disagreement.

    My impression is that at this time basically everyone except you agrees
    that we have to deal with the aliasing problems that have been rolled
    out to users and will be forced in bookworm. I believe that this is the
    state that we have to consider as starting point and that we cannot
    magically turn this transition back to perform it in a better way. And indeed, I believe that there would have been a better way[1] that no
    longer is available to us.

    I think I've mentioned before multiple times, that dpkg should
    eventually be able to be aliasing-aware. I think I've also mentioned
    that to get there we need to move all files out of aliased directories, otherwise several of the changes required for that "support" might not
    be even able to be deployed.

    On the other hand, my impression is that you continue to see the
    transition as fundamentally broken and in a state that we cannot work
    from. You appear to believe that if we want to do it, we must start over
    in a better way. That better way must not cause aliasing problems to
    dpkg.

    Well, it should be obvious by now that this so-called transition is
    fundamentally broken, and I also see that there is no magic simple and
    clean way to get out of it. And every way out is through further
    complexity, workarounds or badness. Of course given the corner Debian
    has painted itself into, there needs to be a way out, my objection is
    what kind of price to pay for that.

    I thought it would be clear that if there is stuff that depends on
    any of this kind of changes to dpkg, relying on those changes in
    Debian would not be possible until after trixie+1. Of course there is always the route to further pile up over the Jenga tower of hacks,
    by for example adding huge amounts of Pre-Depends…

    I agree that we probably will deal with this until at least trixie+1.
    This is precisely why I would like to have a plan to finish it sooner
    rather than later.

    Also, to note, that even if the way out was through some dpkg
    workaround, which would even get backported to bookworm, AIUI upgrades
    are never guaranteed to start from the last point release, so that
    would not seem to help much anyway.


    So coming back to workarounds and hacks, I'm finding the diversions
    stuff to be rather bad, as it requires to bypass an explicit dpkg
    refusal to deal with diverted directories, so it's going into further unsupported territory. :/ My other concern is that this might end up
    leaving unsupported directory diversions around which could break dpkg
    if it starts refusing to work on them during unpack, not just during
    diversion additions.

    I did a PoC (untested) implementation for the partial upgrade deletion
    prevention workaround to see how bad that might look, and in
    comparison to the diverted stuff it is bad but not as bad. As I
    mentioned in our talks, this needs to imply emitting a warning,
    because otherwise this might end up as relied on behavior that should
    not be supported, and it would be a temporary hack for Debian and
    derivatives until things have moved out.

    https://git.hadrons.org/git/debian/dpkg/dpkg.git/log/?h=pu/aliasing-workaround

    Also, in case there is any confusion, this is a _partial_ workaround
    that does not cover many of the other badness, such as file overwrites
    and disappearances in other stages of the package life-cycle nor in
    other tools from the dpkg suite, from local packages, or from admin
    initiated changes via supported interfaces.

    I still think all the proposed workarounds are pretty terrible, TBH.

    Thanks,
    Guillem

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Simon Richter@21:1/5 to Guillem Jover on Wed Jun 21 16:30:01 2023
    Hi,

    On 6/21/23 20:33, Guillem Jover wrote:

    I don't think we disagree (?), I probably didn't express myself clearly.
    The fact that no package ships those symlinks *is* and *has* been a
    problem, and what I've been saying all along, this will be the only
    correct way to let dpkg know whether there will be aliasing in play.

    I've looked into building a dpkg-alias tool that would work similar to dpkg-divert, and currently that looks like it might be a viable solution.

    Rules:

    - some package will need to register the alias in its preinst and
    remove it in its postrm (to make piuparts happy and provide symmetry).

    My favourite for that would be systemd-sysv, because that is literally
    the package that brings in the requirement, but there might be problems
    with that approach for containers, so I suspect that's not a good choice.

    - unpacking a symlink over a registered alias is fine if the symlink
    and the alias match.

    This way, we can ship the symlink in a package for bootstrapping. Terms
    and conditions apply.

    - dpkg keeps track of the name of files in the .tar.gz, but also
    recognizes aliased names as referring to the same file

    This can be done inside dpkg's file database -- whenever an entry is
    created, additional entries for aliases are also generated along it, so
    the file can be looked up using any aliased path

    - circular aliases are not allowed

    This would break the requirement that it is possible to generate an
    exhaustive set of all names a file may be found under.

    - the newly created dpkg-alias tool is responsible for moving files,
    if necessary

    This is a separate tool, so we don't need to extend the unpacking logic,
    and we can build an algorithm here that includes error recovery.

    - if an alias is registered for a symlink that already exists, that is
    not an error

    This way, we accept the status quo silently.

    - registering an existing alias or unregistering a nonexistent alias
    is not an error

    This allows future releases to change the list of aliases without
    requiring complex logic in maintainer scripts.

    - Files remain in the same place during the trixie cycle

    We only shift responsibility for moving files and creating the symlinks
    during this cycle, but bootstrapping will have to go through an unmerged
    phase in the beginning, and files are then moved into merged paths from
    the preinst of the key package, after the initial unpack.

    Whether bootstrapping tools prefer to create the symlinks themselves
    does not really matter at this point -- ideally they wouldn't, because
    we'd need to keep track of any symlinks typically created by bootstrap
    tools and explicitly remove these if the actual system should not have
    them (e.g. if an architecture specific symlink is created on an arch
    that doesn't have it).

    - Symlinks cannot yet be shipped in data.tar during the trixie cycle

    Because the unpack phase during bootstrap creates an unmerged file
    system, the symlinks cannot be unpacked here.

    - In trixie+1, the symlinks are then created from data.tar, and files
    can then be moved.

    This allows bootstrap to create merged filesystems directly during the
    unpack phase.

    - dpkg-alias can fail if there is a conflict during alias registration

    This should not actually happen, but protects people who may have local packages that use unmerged paths.

    - dpkg-divert and dpkg-statoverride act on the normalized path
    - dpkg-divert and dpkg-statoverride are registered with un-normalized
    paths
    - it is an error to register a diversion or statoverride with
    non-matching data

    This should allow most of the handover scenarios. For a diversion, it is sufficient if the normalized destinations match, so we can have a
    handover that registers /usr/lib/x -> /usr/lib/y from the preinst of a
    new package and then unregisters /lib/x -> /lib/y from the postrm of an
    older one.

    The package would need to unregister on upgrade in the postrm though,
    but that is standard for removed diversions.

    - dpkg-query returns the package name if any aliased name matches

    There should also be a flag whether to report the file name from the
    data.tar as well, defaulting to "no", because that's what scripts expect.
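
    The normalization rules above can be sketched as follows. This is a
    minimal illustration under stated assumptions: the alias table is a plain
    dict of directory prefixes, and the function names normalize,
    register_alias and diversions_match are hypothetical, not part of dpkg
    or of the proposed dpkg-alias tool.

```python
def normalize(path, aliases):
    """Rewrite leading aliased directories until no alias applies."""
    # Each alias should apply at most once per lookup; more rewrites than
    # aliases is (conservatively) treated as a circular definition.
    for _ in range(len(aliases) + 1):
        for src, dst in aliases.items():
            if path == src or path.startswith(src + "/"):
                path = dst + path[len(src):]
                break
        else:
            return path
    raise ValueError("circular alias")


def register_alias(aliases, src, dst):
    """Add src -> dst; re-registering an identical alias is not an error."""
    trial = dict(aliases)
    trial[src] = dst
    normalize(src, trial)  # raises ValueError if the new table is circular
    aliases[src] = dst


def diversions_match(old, new, aliases):
    """Compare two (path, divert_to) pairs after normalization."""
    return tuple(normalize(p, aliases) for p in old) == tuple(
        normalize(p, aliases) for p in new
    )
```

    With /lib registered as an alias for /usr/lib, the handover example from
    the text, ("/lib/x", "/lib/y") versus ("/usr/lib/x", "/usr/lib/y"),
    compares equal, while an attempt to also register /usr/lib as an alias
    for /lib is rejected and leaves the table unchanged.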

    But given these mentioned constraints
    it cannot be made to support (as in accept) unpacking files inside
    aliased directories (it should be able to unpack the symlinks creating
    those aliased directories though!).

    I think that can be done. I have already successfully made it report a
    conflict between /bin/testfile and /usr/bin/testfile, with a meaningful
    error message, and runtime overhead isn't too bad -- a factor of
    log_{262144} 2 on the lookup time for a single path, but inserts got a
    bit more expensive because these now have prefix comparisons on the
    path. The latter could probably be improved with another hash on the
    first N bytes of the path.
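
    A conflict report of the kind described here could look roughly like the
    sketch below. The alias table, the record function and the FileConflict
    exception are illustrative assumptions; the real branch works inside
    dpkg's own file database and is not reproduced here.

```python
class FileConflict(Exception):
    """Raised when two packages ship the same file under aliased names."""


ALIASES = {"/bin": "/usr/bin"}  # hypothetical alias table


def canonical(path):
    """Map a shipped path to its canonical location (single alias level)."""
    for src, dst in ALIASES.items():
        if path == src or path.startswith(src + "/"):
            return dst + path[len(src):]
    return path


def record(db, package, path):
    """Record that package ships path, keyed by the canonical location."""
    key = canonical(path)
    owner = db.get(key)
    if owner is not None and owner[0] != package:
        raise FileConflict(
            f"{package} ships {path}, which aliases {owner[1]} "
            f"shipped by {owner[0]}"
        )
    # Keep the shipped name so the package can still find its own file.
    db[key] = (package, path)
```

    Keying the database on the canonical path while storing the shipped name
    is one possible design; it keeps lookups by either name cheap and still
    lets tools report the name the package actually shipped.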

    dpkg-divert distinguishes between local and package level changes, it
    is true that dpkg-statoverride does not have (currently) that
    distinction, although it is primarily an admin tool where I don't
    think it makes much sense to support something like declarative
    package statoverrides TBH once we can ship fsys metadata (perhaps
    conditional one though).

    This interface could be provided independent from the implementation, by
    essentially pretending that maintainer scripts contain calls to
    dpkg-statoverride if a specific control file is present (and the same
    would work for dpkg-divert and dpkg-alias). The change for that would be
    fairly localized, around the maintainer script calls.

    This can then be optimized later, keeping the same interface, if needed.

    I'd like to see a mechanism that ensures that dpkg understands those
    control files, though -- like a "critical" flag.

    I suspect that for trixie, this will have to be an archive side check
    that any package using one of the declarative interfaces depends on an appropriate version of dpkg, and/or its use disallowed until trixie+1
    for the convenience of backporters.

    Simon

  • From Guillem Jover@21:1/5 to Simon Richter on Mon Jul 10 11:30:01 2023
    Hi!

    On Wed, 2023-06-21 at 23:24:53 +0900, Simon Richter wrote:
    On 6/21/23 20:33, Guillem Jover wrote:
    I don't think we disagree (?), I probably didn't express myself clearly. The fact that no package ships those symlinks *is* and *has* been a problem, and what I've been saying all along, this will be the only
    correct way to let dpkg know whether there will be aliasing in play.

    I've looked into building a dpkg-alias tool that would work similar to dpkg-divert, and currently that looks like it might be a viable solution.

    Hmm, I get the impression the bulk of the mail this is replying to
    (and other previous ones), got ignored here.

    The package would need to unregister on upgrade in the postrm though, but that is standard for removed diversions.

    - dpkg-query returns the package name if any aliased name matches

    There should also be a flag whether to report the file name from the
    data.tar as well, defaulting to "no", because that's what scripts expect.

    That completely breaks the interface. This is one of the things that
    this change-at-a-distance breaks. The packages expect to be able to
    find their own files under their shipped names, not something the
    system might have done under their feet. So this new behavior can
    never be the default.

    But given these mentioned constraints
    it cannot be made to support (as in accept) unpacking files inside
    aliased directories (it should be able to unpack the symlinks creating those aliased directories though!).

    I think that can be done. I have already successfully made it report a conflict between /bin/testfile and /usr/bin/testfile, with a meaningful
    error message, and runtime overhead isn't too bad -- a factor of
    log_{262144} 2 on the lookup time for a single path, but inserts got a bit more expensive because these now have prefix comparisons on the path. The latter could probably be improved with another hash on the first N bytes of the path.

    I think this comment does not take into account the "mentioned
    constraints". A directory cannot be replaced atomically, even less so
    if it is non-empty. If it is empty and only owned by this package, then
    it is still non-atomic, and replacing its file-type seems like a
    safe-ish thing to do, as it does not suddenly disappear entire
    hierarchies that would stop being accessible through the old pathname.
    But moving entire hierarchies is simply not possible to do in an
    atomic and crash resistant way. This is one of the reasons dpkg-divert
    refuses to operate on directories.
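
    The atomicity point can be checked with a small experiment. The sketch
    below assumes a POSIX system where rename(2) refuses to put a
    non-directory over an existing directory; the ".dpkg-new" suffix and the
    function names are only illustrative.

```python
import os
import tempfile


def switch_in_one_rename(root, name, target):
    """Try to make root/name a symlink to target with a single rename."""
    final = os.path.join(root, name)
    tmp = final + ".dpkg-new"  # hypothetical temporary name
    os.symlink(target, tmp)
    try:
        os.replace(tmp, final)  # one rename(2): atomic when it succeeds
    except OSError:
        os.unlink(tmp)  # the rename was refused; clean up
        return False
    return True


def demo():
    """Return (result over a real directory, result over a symlink)."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "usr", "lib"))
        os.mkdir(os.path.join(root, "lib"))  # a real, empty directory
        over_dir = switch_in_one_rename(root, "lib", "usr/lib")
        os.symlink("usr/lib", os.path.join(root, "link"))
        over_link = switch_in_one_rename(root, "link", "usr")
        return over_dir, over_link
```

    Replacing one symlink with another succeeds in a single step, but even an
    empty directory cannot be swapped for a symlink that way; it has to be
    removed first, which leaves a window where the pathname does not exist.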

    This also seems to ignore for example the ordering issue I mentioned
    before.

    I have no doubt that the "db aliasing" could be "done", although that
    also breaks a bunch of interfaces, but that's not the point, the
    problem comes from its consequences on unpacking and on the disk, and
    on having to sync the world views for these with the contents in the
    db on an ongoing basis.

    I'd like to see a mechanism that ensures that dpkg understands those control files, though -- like a "critical" flag.

    I don't think this is possible in a quick or non-nasty way (currently),
    given that such a mechanism does not exist today, and we'd need to wait a
    release cycle to be able to use it anyway. In general things get
    introduced into dpkg, and then depending on the interface they can be
    used right away or one needs to wait until it can be used; if it's an independent tool that does not require any support from the running dpkg
    then that just requires dependencies, if it is part of dpkg then you
    need to wait, or if it's an optional feature you might be able to use
    «dpkg --assert-*» to check for availability, or just to bail out if
    it's a hard requirement.

    (Using a different .deb member that comes before the expected ones
    would be an option, which seems like the nasty but only existing
    mechanism for what this would involve, given the requirement to
    update all .deb consumers, and because it would be taking over the
    role of the control.tar member.)

    But for the future, perhaps it's worth considering, yes.

    I suspect that for trixie, this will have to be an archive side check that any package using one of the declarative interfaces depends on an
    appropriate version of dpkg, and/or its use disallowed until trixie+1 for
    the convenience of backporters.

    If we can manage to move files to their canonical locations, then the
    bulk of the aliasing disappears, and then as I've mentioned before
    then dpkg only needs to eventually become aliasing aware (in the fsys
    db sense from the fsys metadata) and simply get to an eventual point
    where it can (once it has a complete fsys metadata view) reject any
    potential attempt to perform a similar migration that would otherwise
    be unsafe and non-atomic. Any such migration would require to first
    move contents into their destination and then switch the dir to a
    symlink (which will be possible automatically once dpkg has the fsys
    metadata and can then replace the matching logic in
    dpkg-maintscript-helper), like we have been doing for /usr/share/doc/
    for ages for example.
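
    The "move the contents first, then switch the directory to a symlink"
    sequence can be sketched like this. It is a simplified illustration, not
    what dpkg-maintscript-helper does internally: the function name is
    hypothetical, there is no rollback, and the comments mark where the
    operation is not atomic.

```python
import os
import tempfile


def migrate_dir_to_symlink(old, new):
    """Move everything from old into new, then replace old with a symlink."""
    os.makedirs(new, exist_ok=True)
    for name in sorted(os.listdir(old)):
        src, dst = os.path.join(old, name), os.path.join(new, name)
        if os.path.lexists(dst):
            raise FileExistsError(dst)  # refuse to overwrite anything
        os.rename(src, dst)  # atomic per entry, not for the whole set
    os.rmdir(old)  # only succeeds once old is empty
    # Window: between rmdir and symlink the old pathname does not exist.
    os.symlink(os.path.relpath(new, os.path.dirname(old)), old)


def demo():
    """Migrate a toy lib directory into usr/lib inside a temporary root."""
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "lib")
        new = os.path.join(root, "usr", "lib")
        os.makedirs(old)
        with open(os.path.join(old, "libfoo.so"), "w") as fh:
            fh.write("")
        migrate_dir_to_symlink(old, new)
        still_reachable = os.path.exists(os.path.join(old, "libfoo.so"))
        return os.readlink(old), still_reachable
```

    After the migration the file is still reachable through the old pathname,
    but a crash partway through the loop would leave contents split across
    both locations, which is why the switch is only considered safe once the
    directory is empty.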

    Thanks,
    Guillem

  • From Raphael Hertzog@1:229/2 to Helmut Grohne on Fri Apr 21 14:20:01 2023
    XPost: linux.debian.devel
    From: hertzog@debian.org

    Hello,

    On Mon, 03 Apr 2023, Helmut Grohne wrote:
    Please consider it to be a piece of best
    intentions at reconciling feedback wherever I could. At the time of this writing it certainly is not consensus, but consensus is what I seek
    here. Without further ado, the full DEP text follows after my name
    while it also is available at https://salsa.debian.org/dep-team/deps/-/merge_requests/5

    I'd like to express some disappointment that nobody replied publicly
    so far. Last year's developer survey concluded that "Debian should complete
    the merged-/usr transition" was the most important project for Debian [1]
    (among those proposed in the survey). That's what we are trying to do
    here and it would be nice to build some sort of consensus on what it means
    in terms of changes for dpkg.

    I know that Guillem (dpkg's maintainer) is generally opposed to the
    approach that Debian has followed to implement merged-/usr but I have
    yet to read his concerns on the changes proposed here (besides the fact
    that they ought to not be needed because we should redo the transition
    in another way, but that's a ship that has sailed a long time ago...).

    The rough project consensus seems to be that we should modify dpkg to
    avoid the cases where some files can disappear upon upgrades. Most people
    don't really care how we modify dpkg for this, and I can't blame them, but given that dpkg's maintainer seems unwilling to work on this problem,
    someone else has to come up with a design, implement it and get it applied
    on Debian's version of dpkg.

    We are committed to work on the design and implementation but we want to
    make sure the design is sound and agreed upon by the persons who are technically knowledgeable on this issue and who have thought a lot on this issue. There aren't that many persons in that set but it is also not empty.
    So please read the DEP and share your feedback, even if it's just "I have
    read it and it sounds fine", it will definitely help.

    Thank you!

    [1] page 28-32 of https://debian.pages.debian.net/dd-surveys/dd-survey-analysis-2022.pdf
    --
    ⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog <hertzog@debian.org>
    ⣾⠁⢠⠒⠀⣿⡁
    ⢿⡄⠘⠷⠚⠋ The Debian Handbook: https://debian-handbook.info/get/
    ⠈⠳⣄⠀⠀⠀⠀ Debian Long Term Support: https://deb.li/LTS

    --- SoupGate-Win32 v1.05
    * Origin: you cannot sedate... all the things you hate (1:229/2)
  • From Raphael Hertzog@1:229/2 to Helmut Grohne on Fri Apr 21 15:10:02 2023
    XPost: linux.debian.devel
    From: hertzog@debian.org

    Hello,

    I'd like to offer some food for thoughts on this issue.

    From what I have understood, Guillem would rather avoid committing
    to a new public interface for this specific use-case, i.e. the
    fact that the DEP is suggesting "dpkg --add-alias" is problematic
    because that feature will be useless once we have moved
    to .deb shipping files in /usr only.

    However the problem of file loss through aliased directories is a broader problem, it is not specific to this transition. It's quite possible that a package is shipping a symlink pointing to a directory and to have other packages installing files through that symlink (and then move those files between binary packages and between their two possible locations).

    Let's try to tackle that problem in a generic way without requiring
    any external information... it ought to be doable. You did consider
    it partly already:

    On Mon, 03 Apr 2023, Helmut Grohne wrote:
    Naive solution
    ==============

    In theory, `dpkg` could resolve this automatically. For every file it touches, it could canonicalize the location using the actual filesystem
    and check whether any other installed file has the same canonicalized location. Unfortunately, `dpkg` cannot know which filenames can
    collide, so it would check every filename in its database. For canonicalization, it would `stat()` every component of every filename.
    This easily amounts to a million or more `stat()` calls on larger installations. Caching could reduce the impact somewhat, but since
    Debian introduces aliases during maintainer scripts, it would have to invalidate the cache after maintainer scripts have been run. The
    resulting performance would be unacceptable.
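
    The naive approach can be illustrated with a short sketch. It assumes the
    list of installed filenames is already available as plain strings and
    uses os.path.realpath on the directory part of each name, which performs
    roughly the per-component resolution work the text refers to. The
    function names are hypothetical.

```python
import os
import tempfile
from collections import defaultdict


def find_aliased_paths(root, db_paths):
    """Group database filenames that resolve to the same on-disk location."""
    groups = defaultdict(list)
    for path in db_paths:
        full = os.path.join(root, path.lstrip("/"))
        # Resolve only the directory part, so that a file which is itself
        # a symlink is not followed.
        parent = os.path.realpath(os.path.dirname(full))
        groups[os.path.join(parent, os.path.basename(full))].append(path)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def demo():
    """Build a tiny merged-/usr style tree and look for collisions."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "usr", "bin"))
        os.symlink("usr/bin", os.path.join(root, "bin"))
        paths = ["/bin/tool", "/usr/bin/tool", "/usr/bin/other"]
        return find_aliased_paths(root, paths)
```

    Every database entry costs one full path resolution here, which is what
    makes the approach expensive on installations with hundreds of thousands
    of files.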

    Here you are considering all files, but for the purpose of our issue,
    we can restrict ourselves to the directories known by dpkg. We really
    only care about directories that have been turned into symlinks (or
    packaged symlinks that are pointing to directories). That's a much lower
    number of paths that we would have to check.

    You are speaking of having some sort of cache and I certainly agree
    that it would make sense to have such a cache.

    We could decide that /var/lib/dpkg/aliases is that cache, it would
    be the result of a scan of all directories known by dpkg (i.e. all
    paths known by dpkg where files are installed through that path) and
    it would list the target directory in case that path is a symlink.
    The absence of a directory in that file would mean that, according to
    dpkg, the directory ought to be a real directory.

    Thus this time-consuming operation would be done once, the first
    time that the updated dpkg starts and when /var/lib/dpkg/aliases
    does not yet exist.

    That cache file would be kept up-to-date by the various dpkg invocations:
    - when you install a new .deb containing a symlink pointing to a
    directory, that new "aliased path" is added to this file
    - when dpkg removes a symlink that is listed in the aliases file, we drop
    it too
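
    A rough sketch of such a cache follows. The tab-separated file format,
    the function names and the way the list of known directories is passed
    in are all assumptions made for illustration; nothing here reflects an
    existing dpkg file or interface.

```python
import os
import tempfile


def scan_aliases(root, known_dirs):
    """Return {path: symlink target} for known directories that are symlinks."""
    aliases = {}
    for path in sorted(known_dirs):
        full = os.path.join(root, path.lstrip("/"))
        if os.path.islink(full) and os.path.isdir(full):
            aliases[path] = os.readlink(full)
    return aliases


def load_or_scan(cache_file, root, known_dirs):
    """Use the cache if it exists; otherwise scan once and write it."""
    if os.path.exists(cache_file):
        with open(cache_file) as fh:
            return dict(
                line.rstrip("\n").split("\t", 1) for line in fh if line.strip()
            )
    aliases = scan_aliases(root, known_dirs)
    with open(cache_file, "w") as fh:
        for path, target in aliases.items():
            fh.write(f"{path}\t{target}\n")
    return aliases


def demo():
    """First call scans the tree, second call reads the cache only."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "usr", "bin"))
        os.symlink("usr/bin", os.path.join(root, "bin"))
        cache = os.path.join(root, "aliases")
        first = load_or_scan(cache, root, ["/bin", "/usr", "/usr/bin"])
        second = load_or_scan(cache, root, [])
        return first, second
```

    Deleting the cache file forces a fresh scan on the next call, which
    corresponds to the "remove the file to refresh" idea in the proposal.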

    We don't add any new public interface to dpkg, but we also have the
    possibility to remove /var/lib/dpkg/aliases to force a new scan
    (some sort of "dpkg --refresh-aliases" without an official name).

    It might still be cleaner to have that "dpkg --refresh-aliases" command
    so that we can invoke it for example in "dpkg-maintscript-helper symlink_to_dir/dir_to_symlink" when we are voluntarily turning a directory
    into a symlink (or vice-versa).

    In any case, now that you have a database of aliases, you can do the other modifications to detect conflicting files and avoid file losses.

    How does that sound?

    Implement aliasing after metadata tracking
    ------------------------------------------

    The [metadata tracking](https://wiki.debian.org/Teams/Dpkg/Spec/MetadataTracking)
    feature enhances `dpkg` with knowledge about filesystem metadata for installed files. This includes knowledge of symbolic links, which would
    help with tracking aliasing. Unfortunately, progress on this is fairly
    slow and we think that aliasing support is more urgent.

    The proposal I made above is not a real database in the sense that we
    don't record what was shipped by the .deb when we installed the files...
    it's rather the opposite, it analyzes the system to detect possible
    conflicts with dpkg's view of the system.

    It can be seen as complementary to it. In any case, I don't see how
    implementing metadata tracking would help to solve the problem that we
    have today. dpkg would know that all .deb have /bin as a directory and
    not as a symlink, and it would be able to conclude that the directory
    has been replaced by a symlink by something external, but that's it.

    It should still accept that replacement and do its best to work with it.

    Cheers,
    --
    ⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog <hertzog@debian.org>
    ⣾⠁⢠⠒⠀⣿⡁
    ⢿⡄⠘⠷⠚⠋ The Debian Handbook: https://debian-handbook.info/get/
    ⠈⠳⣄⠀⠀⠀⠀ Debian Long Term Support: https://deb.li/LTS

  • From Luca Boccassi@1:229/2 to Raphael Hertzog on Fri Apr 21 16:50:03 2023
    XPost: linux.debian.devel
    From: luca.boccassi@gmail.com

    On Fri, 21 Apr 2023 at 13:16, Raphael Hertzog <hertzog@debian.org> wrote:

    Hello,

    On Mon, 03 Apr 2023, Helmut Grohne wrote:
    Please consider it to be a piece of best
    intentions at reconciling feedback wherever I could. At the time of this writing it certainly is not consensus, but consensus is what I seek
    here. Without further ado, the full DEP text follows after my name
    while it also is available at https://salsa.debian.org/dep-team/deps/-/merge_requests/5

    I'd like to express some disappointment that nobody replied publicly
    so far. Last year's developer survey concluded that "Debian should complete
    the merged-/usr transition" was the most important project for Debian [1]
    (among those proposed in the survey). That's what we are trying to do
    here and it would be nice to build some sort of consensus on what it means
    in terms of changes for dpkg.

    I know that Guillem (dpkg's maintainer) is generally opposed to the
    approach that Debian has followed to implement merged-/usr but I have
    yet to read his concerns on the changes proposed here

    There was an answer from the maintainer, 2 weeks ago: https://lists.debian.org/debian-dpkg/2023/04/msg00001.html
    Essentially, the answer was "no", so...

    After Bookworm ships I plan to propose a policy change to the CTTE and
    policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small
    % that does not use dh and a piuparts test to stop migration for
    anything that is uploaded and doesn't comply. That should bring the
    matter to an end, without needing to modify dpkg.

    Kind regards,
    Luca Boccassi

  • From Simon Richter@1:229/2 to Raphael Hertzog on Fri Apr 21 19:30:02 2023
    XPost: linux.debian.devel
    From: sjr@debian.org

    Hi,

    On 21.04.23 15:03, Raphael Hertzog wrote:

    Here you are considering all files, but for the purpose of our issue,
    we can restrict ourselves to the directories known by dpkg. We really
    only care about directories that have been turned into symlinks (or
    packaged symlinks that are pointing to directories). That's a much lower
    number of paths that we would have to check.

    Having all paths in the database is cheaper, because doubling the number
    of paths multiplies the (average) cost by log_{262144} 2 only, and we do significantly more lookups than inserts.

    The other problem is that we do not know all of these paths, because the
    file system has been modified externally without informing dpkg. The
    closest thing we can do is scan everything that is supposed to be a
    directory.

    As an additional complication, dpkg silently resolves
    symlink-vs-directory conflicts in favour of the directory (which happens seldom, but third-party tools sometimes generate broken packages like
    that, so it is useful to keep it that way).

    Thus this time-consuming operation would be done once, the first
    time that the updated dpkg starts and when /var/lib/dpkg/aliases
    does not yet exist.

    That is still a public interface. :/

    In any case, now that you have a database of aliases, you can do the other modifications to detect conflicting files and avoid file losses.

    How does that sound?

    Alas, that is the easy part. My branch already implements most of that, including the logic to trigger a reload after a maintainer script if the
    stat information changed (like for diversions).

    The proposal I made above is not a real database in the sense that we
    don't record what was shipped by the .deb when we installed the files...
    it's rather the opposite, it analyzes the system to detect possible
    conflicts with dpkg's view of the system.

    That is going to be slow, and it changes dpkg's public interface to a
    more complex one where our tight loop that handles unpacking files gains additional error states.

    It can be seen as complementary to it. In any case, I don't see how implementing metadata tracking would help to solve the problem that we
    have today. dpkg would know that all .deb have /bin as a directory and
    not as a symlink, and it would be able to conclude that the directory
    has been replaced by a symlink by something external, but that's it.

    It should still accept that replacement and do its best to work with it.

    That means there are two sources of truth: packages and the file system.
    We then need a (lowercase) policy on how to resolve conflicts between
    these, which becomes our public interface, and thus part of (uppercase)
    Policy.

    I'd also single out the usrmerge transition here. This package operates
    in a grey area of Policy where technically a grave bug is warranted
    because it manipulates files belonging to other packages without going
    through one of the approved interfaces, but since we accidentally
    shipped that, we need to deal with it now. That does not mean this is acceptable, it just wasn't enforced.

    To me it would also be acceptable to just hardcode "if usrmerge or usr-is-merged is installed, take over the known aliases and silently
    discard that package", then salt the earth in dak that no package of
    this name can ever be shipped again until bookworm+3.

    That would be significantly easier than finding a generic solution that
    covers all existing use cases.

    Simon

  • From Simon McVittie@1:229/2 to Luca Boccassi on Sat Apr 22 12:50:01 2023
    XPost: linux.debian.devel
    From: smcv@debian.org

    On Fri, 21 Apr 2023 at 15:29:33 +0100, Luca Boccassi wrote:
    After Bookworm ships I plan to propose a policy change to the CTTE and
    policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild

    That seems quite likely to trigger the scenario Helmut is trying to avoid, which if I understand correctly is this:

    * foo_12.0 in Debian 12 ships /lib/abcd
    * bar_13.0 takes over /lib/abcd from foo, but because of either your
    proposed change or a manual action by the maintainer, it is actually in
    the data.tar as ./usr/lib/abcd (not ./lib/abcd like it was in foo_12.0)
    * the maintainer of bar didn't add the correct Breaks/Replaces on foo
    * a user upgrading from Debian 12 to 13 installs bar_13.0, perhaps pulled
    in as a dependency
    * expected result: dpkg refuses to unpack bar ("trying to overwrite ..."),
    the upgrade is cancelled, and the user reports a RC bug in bar
    * actual result: /usr/lib/abcd in bar quietly overwrites /lib/abcd from foo
    * if bar is subsequently removed, then dpkg (and therefore apt) thinks foo
    is fully functional, but in fact /{usr/,}lib/abcd is missing

    (For simplicity I've described that scenario in terms of files directly
    shipped in the data.tar, but dpkg also tracks the ownership of files
    created by dpkg-divert or alternatives, and similar things can happen
    to those.)
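
    The scenario above can be replayed in a toy model. This is only a sketch: the class, its methods and the alias table are illustrative names and not dpkg internals. It assumes a database keyed by the literal path found in the data.tar, while the disk only ever sees the canonical path.

```python
# Toy model only: these names are illustrative and are not dpkg internals.
ALIASES = {"/lib": "/usr/lib"}  # /lib is a symlink to /usr/lib on a merged-/usr system


def canonical(path):
    """Resolve an aliased top-level directory the way the filesystem would."""
    for link, target in ALIASES.items():
        if path == link or path.startswith(link + "/"):
            return target + path[len(link):]
    return path


class ToySystem:
    def __init__(self):
        self.owner = {}    # the database's view: literal path -> owning package
        self.disk = set()  # the real filesystem: canonical paths only

    def unpack(self, package, path, replaces=()):
        current = self.owner.get(path)
        if current not in (None, package) and current not in replaces:
            raise RuntimeError(
                f"trying to overwrite '{path}', which is also in package {current}"
            )
        self.owner[path] = package
        self.disk.add(canonical(path))

    def remove(self, package):
        for path in [p for p, o in self.owner.items() if o == package]:
            del self.owner[path]
            self.disk.discard(canonical(path))

    def is_intact(self, package):
        """True if every path the database assigns to the package exists on disk."""
        return all(
            canonical(p) in self.disk for p, o in self.owner.items() if o == package
        )


system = ToySystem()
system.unpack("foo", "/lib/abcd")
system.unpack("bar", "/usr/lib/abcd")  # no error: the database sees two different names
system.remove("bar")
print(system.owner)             # foo is still recorded as owning /lib/abcd
print(system.is_intact("foo"))  # but the file is no longer on disk
```

    In this model the overwrite check only fires when both packages use the same literal name, which is the gap the missing Breaks/Replaces would otherwise have exposed.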

    I had hoped that the last section of technical committee resolution
    #994388 (which concerns this situation) would become irrelevant in Debian
    13, but it's looking as though without the sort of dpkg changes discussed
    in this thread, the concern about files moving between packages would
    remain a valid concern.

    However, as far as I can see, the other reasons not to do this that were mentioned in the last section of #994388 *do* become irrelevant in Debian
    13, so solving the files-moved-between-packages thing is the last major
    blocker for doing what you propose. (Unless someone has a reason why this
    is not the case?)

    You might reasonably say that "the maintainer of bar didn't add the
    correct Breaks/Replaces on foo" is a RC bug in bar - and it is! - but
    judging by the number of "missing Breaks/Replaces" bug reports that have
    to be opened by unstable users (sometimes me), it's a very easy mistake
    to make.

    One thing that's particularly tricky about this is that the move from
    / into /usr and the move from foo to bar might be 18 months apart if
    they happen to occur at opposite ends of our stable release cycle. In particular, if the move from / into /usr is done as soon as the Debian 13
    cycle opens, we cannot predict whether the packages that have undergone
    that move will also need to undergo a package split/merge at some point
    in the following 18 months (but it's reasonable to assume that at least
    some of them will).

    smcv

  • From Helmut Grohne@1:229/2 to Raphael Hertzog on Sat Apr 22 13:00:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Raphaël,

    On Fri, Apr 21, 2023 at 03:03:10PM +0200, Raphael Hertzog wrote:
    Here you are considering all files, but for the purpose of our issue,
    we can restrict ourselves to the directories known by dpkg. We really
    only care about directories that have been turned into symlinks (or
    packaged symlinks that are pointing to directories). That's a much lower number of paths that we would have to check.

    Considering just sid amd64 main, I count around 140000 directories,
    which clearly is less than millions. A typical installation will only
    have a fraction of that, probably less than 50000. I think this is the
    number of stat() calls we'd have to do. I timed this on a reasonably
    fast system (admittedly using Python but I think the overhead is not
    huge) and this can complete in around 0.1s (with a hot vfs cache). So
    depending on the cache invalidation strategy this may be viable or not.
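
    A minimal sketch of such a scan, for illustration only. It is not the script that produced the 0.1s figure. The function names are made up, the list of directories known to dpkg is assumed to be supplied by the caller, and lstat() is used so that the symlink itself is examined.

```python
import os
import stat
import tempfile
import time


def find_aliased_dirs(directories):
    """Return paths expected to be directories that are symlinks to directories on disk."""
    aliased = []
    for path in directories:
        try:
            info = os.lstat(path)  # one lstat() per known directory
        except OSError:
            continue               # missing or unreadable paths are skipped
        if stat.S_ISLNK(info.st_mode) and os.path.isdir(path):
            aliased.append(path)
    return aliased


def timed_scan(directories):
    """Run the scan and report how long it took in seconds."""
    start = time.perf_counter()
    result = find_aliased_dirs(directories)
    return result, time.perf_counter() - start


# Small demonstration on a throwaway tree: bin -> usr/bin is the only alias.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "usr"))
    os.mkdir(os.path.join(root, "usr", "bin"))
    os.symlink("usr/bin", os.path.join(root, "bin"))
    known = [os.path.join(root, name) for name in ("usr", "usr/bin", "bin", "missing")]
    aliased, seconds = timed_scan(known)
    print(aliased, f"{seconds:.6f}s")
```

    Timings from such a sketch depend heavily on whether the vfs cache is hot, as noted above.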

    This is looking at it from a performance point of view. Guillem also
    raised that this is changing the source of truth from the dpkg database
    to the actual filesystem, which Guillem considers wrong and I find that
    vaguely agreeable.

    We don't add any new public interface to dpkg, but we also have the possibility to remove /var/lib/dpkg/aliases to force a new scan
    (some sort of "dpkg --refresh-aliases" without an official name).

    Can I rephrase this as: your cache invalidation strategy is that any
    external entity (such as a maintainer script) introducing aliases should explicitly invalidate the cache?

    It might still be cleaner to have that "dpkg --refresh-aliases" command
    so that we can invoke it for example in "dpkg-maintscript-helper symlink_to_dir/dir_to_symlink" when we are voluntarily turning a directory into a symlink (or vice-versa).

    If you put it this way, it is not that different from the --add-alias/--remove-alias proposal. It is a different interface to
    dpkg, but the semantics are roughly the same:

    In both cases, something external to dpkg is responsible for performing
    the moves and creating the symbolic links followed by informing dpkg
    about the alias (explicitly or implicitly via scanning directories).

    Would you agree with me that this is a minor adaption of DEP17? In
    essence what changes is the way that a user communicates aliases to
    dpkg, but the assumption that a user must communicate aliases to dpkg is
    not affected. I'd be fine with changing this aspect in principle, but I
    still consider this a new public interface to dpkg with much the same
    effects to long term maintenance.

    In any case, now that you have a database of aliases, you can do the other modifications to detect conflicting files and avoid file losses.

    How does that sound?

    It sounds all the same as DEP17 with a different color to me. Hope I got
    it right.

    What I tried to rule out as a naive solution is eliminating the need to
    tell dpkg about aliasing changes: we'd then have to incur this 0.1s
    delay after every maintainer script invocation, which would amount to 5
    minutes of stat()ing on a typical dist-upgrade, assuming a hot vfs cache
    on a fast x86 CPU.
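
    Spelled out, the two figures above imply roughly 3000 maintainer script invocations per dist-upgrade. That count is derived here from the 0.1s and 5 minute numbers and is not stated in the mail itself:

```latex
\[
N_{\text{invocations}} \;\approx\; \frac{5 \times 60\ \mathrm{s}}{0.1\ \mathrm{s}} \;=\; 3000
\]
```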

    The proposal I made above is not a real database in the sense that we
    don't record what was shipped by the .deb when we installed the files...
    it's rather the opposite, it analyzes the system to detect possible
    conflicts with dpkg's view of the system.

    I think that Guillem considers this a bad property as he has expressed
    in his reply on debian-dpkg, that .debs should be the source of truth.

    It can be seen as complementary to it. In any case, I don't see how implementing metadata tracking would help to solve the problem that we
    have today. dpkg would know that all .deb have /bin as a directory and
    not as a symlink, and it would be able to conclude that the directory
    has been replaced by a symlink by something external, but that's it.

    Let me put it subtly differently. As we currently do not ship the aliasing symbolic links in any data.tar, metadata tracking will not tell dpkg
    about the aliasing and therefore metadata tracking cannot help resolve
    the current situation (as a singular measure). We can only add the
    symbolic links to a data.tar after the aliasing has been resolved (see
    Simon Richter's mails on how dpkg resolves directory vs symlink) and
    thus metadata tracking can only help with resolving the situation after
    we have fully resolved the situation. I don't see a way to resolve this
    vicious circle and shall update the DEP17 text.

    Helmut

  • From Helmut Grohne@1:229/2 to Luca Boccassi on Sat Apr 22 13:00:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Luca,

    On Fri, Apr 21, 2023 at 03:29:33PM +0100, Luca Boccassi wrote:
    After Bookworm ships I plan to propose a policy change to the CTTE and
    policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small
    % that does not use dh and a piuparts test to stop migration for
    anything that is uploaded and doesn't comply. That should bring the
    matter to an end, without needing to modify dpkg.

    I agree with the goal of removing aliases by moving files to their
    canonical locations. However, I do not quite see us getting there in the
    way you see it, but maybe I am missing something. As long as dpkg does
    not understand the effects of aliasing, we cannot safely move those
    files and thus the file move moratorium will have to be kept in place.
    And while moving the files would bring the matter to an end, we cannot
    do so without either modifying dpkg or rolling back the transition and
    starting over. I hope that we all agree that rolling back would be too
    insane to even consider, but I fail to see how you safely move files
    without dpkg being changed. Can you elaborate on that aspect?

    I'd also be interested in how you plan to move important files in
    essential packages. This is an aspect raised by Simon Richter and where
    I do not see an obvious answer yet.

    Helmut

  • From Luca Boccassi@1:229/2 to Helmut Grohne on Sat Apr 22 14:30:01 2023
    XPost: linux.debian.devel
    From: luca.boccassi@gmail.com

    On Sat, 22 Apr 2023 at 11:50, Helmut Grohne <helmut@subdivi.de> wrote:

    Hi Luca,

    On Fri, Apr 21, 2023 at 03:29:33PM +0100, Luca Boccassi wrote:
    After Bookworm ships I plan to propose a policy change to the CTTE and policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small
    % that does not use dh and a piuparts test to stop migration for
    anything that is uploaded and doesn't comply. That should bring the
    matter to an end, without needing to modify dpkg.

    I agree with the goal of removing aliases by moving files to their
    canonical locations. However, I do not quite see us getting there in the
    way you see it, but maybe I am missing something. As long as dpkg does
    not understand the effects of aliasing, we cannot safely move those
    files and thus the file move moratorium will have to be kept in place.
    And while moving the files would bring the matter to an end, we cannot
    do so without either modifying dpkg or rolling back the transition and starting over. I hope that we all agree that rolling back would be too
    insane to even consider, but I fail to see how you safely move files
    without dpkg being changed. Can you elaborate on that aspect?

    Moving files within _the same_ package is actually fine as far as I
    know. It's moving between locations _and_ packages within the same
    upgrade that is problematic. The piuparts test I added is overzealous,
    but it doesn't need to be.

    I'd also be interested in how you plan to move important files in
    essential packages. This is an aspect raised by Simon Richter and where
    I do not see an obvious answer yet.

    Do you have a pointer? Not sure I follow what "important" files means
    here, doesn't ring a bell.

    Kind regards,
    Luca Boccassi

  • From Luca Boccassi@1:229/2 to Simon McVittie on Sat Apr 22 14:30:01 2023
    XPost: linux.debian.devel
    From: luca.boccassi@gmail.com

    On Sat, 22 Apr 2023 at 11:41, Simon McVittie <smcv@debian.org> wrote:

    On Fri, 21 Apr 2023 at 15:29:33 +0100, Luca Boccassi wrote:
    After Bookworm ships I plan to propose a policy change to the CTTE and policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild

    That seems quite likely to trigger the scenario Helmut is trying to avoid, which if I understand correctly is this:

    * foo_12.0 in Debian 12 ships /lib/abcd
    * bar_13.0 takes over /lib/abcd from foo, but because of either your
    proposed change or a manual action by the maintainer, it is actually in
    the data.tar as ./usr/lib/abcd (not ./lib/abcd like it was in foo_12.0)
    * the maintainer of bar didn't add the correct Breaks/Replaces on foo
    * a user upgrading from Debian 12 to 13 installs bar_13.0, perhaps pulled
    in as a dependency
    * expected result: dpkg refuses to unpack bar ("trying to overwrite ..."),
    the upgrade is cancelled, and the user reports a RC bug in bar
    * actual result: /usr/lib/abcd in bar quietly overwrites /lib/abcd from foo
    * if bar is subsequently removed, then dpkg (and therefore apt) thinks foo
    is fully functional, but in fact /{usr/,}lib/abcd is missing

    (For simplicity I've described that scenario in terms of files directly shipped in the data.tar, but dpkg also tracks the ownership of files
    created by dpkg-divert or alternatives, and similar things can happen
    to those.)

    I had hoped that the last section of technical committee resolution
    #994388 (which concerns this situation) would become irrelevant in Debian
    13, but it's looking as though without the sort of dpkg changes discussed
    in this thread, the concern about files moving between packages would
    remain a valid concern.

    However, as far as I can see, the other reasons not to do this that were mentioned in the last section of #994388 *do* become irrelevant in Debian
    13, so solving the files-moved-between-packages thing is the last major blocker for doing what you propose. (Unless someone has a reason why this
    is not the case?)

    You might reasonably say that "the maintainer of bar didn't add the
    correct Breaks/Replaces on foo" is a RC bug in bar - and it is! - but
    judging by the number of "missing Breaks/Replaces" bug reports that have
    to be opened by unstable users (sometimes me), it's a very easy mistake
    to make.

    One thing that's particularly tricky about this is that the move from
    / into /usr and the move from foo to bar might be 18 months apart if
    they happen to occur at opposite ends of our stable release cycle. In particular, if the move from / into /usr is done as soon as the Debian 13 cycle opens, we cannot predict whether the packages that have undergone
    that move will also need to undergo a package split/merge at some point
    in the following 18 months (but it's reasonable to assume that at least
    some of them will).

    We already have piuparts tests detecting files moving, it should be
    easy enough to extend that to check that the appropriate
    Breaks/Replaces have been added. Correct me if I'm wrong, but I
    believe it's already against policy to do this without
    Breaks/Replaces, so it's not a use case that we need to support, no?
    If someone does that by mistake, the package will not migrate to
    testing.

    Kind regards,
    Luca Boccassi

  • From Simon McVittie@1:229/2 to Helmut Grohne on Sat Apr 22 15:50:02 2023
    XPost: linux.debian.devel
    From: smcv@debian.org

    On Sat, 22 Apr 2023 at 12:43:09 +0200, Helmut Grohne wrote:
    Guillem also
    raised that this is changing the source of truth from the dpkg database
    to the actual filesystem, which Guillem considers wrong and I find that vaguely agreeable.

    To be fair, dpkg does already have at least one case where it treats
    the filesystem as the source of truth, namely a local sysadmin (or
    even a package, I think?) substituting a symlink-to-directory for a
    "real" directory. It has supported this for a long time, and requires
    specific action to avoid that behaviour (usually the dir-to-symlink and symlink-to-dir maintscript actions). The strategy used in the usrmerge
    package wouldn't have worked otherwise.

    I believe the original use-case for this was offloading large
    subtrees to a secondary filesystem, like a symlink
    /usr/share/games -> /srv/large-secondary-disk/games (which is better
    achieved with bind-mounts, or perhaps btrfs subvolumes if you use btrfs,
    on modern systems).

    If I understand correctly, this feature is considered vaguely deprecated,
    but is also quite entrenched, to the extent that we even have wording in
    Policy specifically to make it work better, namely the rule about symlinks within a top-level directory (/usr/lib/foo/data -> ../../share/foo/data)
    being relative, while symlinks between separate top-level directories (/usr/bin/foo -> /etc/alternatives/foo -> /usr/bin/foo-full) are absolute.
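
    The relative-versus-absolute rule described above can be sketched as a small helper. The function names are hypothetical and the helper only covers absolute input paths. It is an illustration of the rule as paraphrased here, not a reference implementation of Policy.

```python
import posixpath


def top_level(path):
    """Return the first component of an absolute path, e.g. 'usr' for /usr/bin/foo."""
    return path.strip("/").split("/", 1)[0]


def policy_link_target(link, target):
    """Spell the link target as relative within one top-level directory, else absolute."""
    if top_level(link) == top_level(target):
        return posixpath.relpath(target, start=posixpath.dirname(link))
    return target


print(policy_link_target("/usr/lib/foo/data", "/usr/share/foo/data"))
# ../../share/foo/data
print(policy_link_target("/usr/bin/foo", "/etc/alternatives/foo"))
# /etc/alternatives/foo
```

    A relative link inside /usr keeps working when /usr as a whole is reached through another name, which is what makes the offloading use case described above behave.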

    The key difference in how usrmerge uses this dpkg feature is that
    traditionally the target directory would be somewhere non-dpkg-managed
    (meaning that the resulting path aliasing doesn't affect dpkg's
    behaviour), whereas in usrmerge, both the symlink and the target directory
    are in the subset of the filesystem tree that is managed by dpkg (meaning
    that the resulting path aliasing becomes significant).

    smcv

  • From Raphael Hertzog@1:229/2 to Helmut Grohne on Mon Apr 24 11:40:01 2023
    XPost: linux.debian.devel
    From: hertzog@debian.org

    Hello,

    On Sat, 22 Apr 2023, Helmut Grohne wrote:
    This is looking at it from a performance point of view. Guillem also
    raised that this is changing the source of truth from the dpkg database
    to the actual filesystem, which Guillem considers wrong and I find that vaguely agreeable.

    smcv already replied to that part.


    We don't add any new public interface to dpkg, but we also have the possibility to remove /var/lib/dpkg/aliases to force a new scan
    (some sort of "dpkg --refresh-aliases" without an official name).

    Can I rephrase this as: your cache invalidation strategy is that any
    external entity (such as a maintainer script) introducing aliases should explicitly invalidate the cache?

    Yes and no. Sticking to the idea that .deb should be the source of truth,
    we make the assumption that external entities should not do that and if
    they do it, they use the already existing interfaces (either shipping
    symlinks in a .deb or calling dpkg-maintscript-helper to convert a
    directory into a symlink or the opposite, depending on the history of said
    path).

    If you put it this way, it is not that different from the --add-alias/--remove-alias proposal. It is a different interface to
    dpkg, but the semantics are roughly the same:

    In both cases, something external to dpkg is responsible for performing
    the moves and creating the symbolic links followed by informing dpkg
    about the alias (explicitly or implicitly via scanning directories).

    I don't consider "dpkg-maintscript-helper" as external to dpkg. Quite
    the opposite, it's an ugly hack that is part of dpkg so that it can
    evolve together with dpkg relying on internal implementation details that nobody else can rely on.

    Would you agree with me that this is a minor adaption of DEP17? In

    It is an adaptation of DEP17 that tries to not create a new public
    interface for users. Whether that change is minor or not, I leave that
    up to Guillem to decide. My hope is that the restricted scope makes
    it acceptable to him.

    The proposal I made above is not a real database in the sense that we
    don't record what was shipped by the .deb when we installed the files... it's rather the opposite, it analyzes the system to detect possible conflicts with dpkg's view of the system.

    I think that Guillem considers this a bad property as he has expressed
    in his reply on debian-dpkg, that .debs should be the source of truth.

    I understood this but at the same time dpkg has supported an exception
    already, this is only about improving how we detect issues related
    to that supported exception.

    Cheers,
    --
    ⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog <hertzog@debian.org>
    ⣾⠁⢠⠒⠀⣿⡁
    ⢿⡄⠘⠷⠚⠋ The Debian Handbook: https://debian-handbook.info/get/
    ⠈⠳⣄⠀⠀⠀⠀ Debian Long Term Support: https://deb.li/LTS

  • From Luca Boccassi@1:229/2 to Simon Richter on Wed Apr 26 11:40:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Wed, 26 Apr 2023 at 10:11, Simon Richter <sjr@debian.org> wrote:

    Hi,

    On Tue, Apr 25, 2023 at 09:07:28PM +0200, Helmut Grohne wrote:

    This and /bin/sh is the kind of files I'd consider important. And then
    upon thinking further it became more and more difficult for me to make sense of the objection. On a merged system, we can just move that file
    to its canonical location without having any trouble even with an unmodified dpkg. So from my pov, the question about important files can
    be disregarded. I hope Simon Richter agrees.

    Yes, the relevant code at

    https://github.com/guillemj/dpkg/blob/main/src/main/unpack.c#L749

    already handles moving a file inside the same package, and that has
    existed for some time, that's why I use two packages for the PoC.

    I have not looked for more issues beyond that, so there might be others lurking in the depths of this code.

    What I'm mostly concerned about (read: have not verified either way)
    with /lib/ld.so and /bin/sh is what happens when dpkg learns of /bin and
    /lib as symlinks -- because right now, the symlinks created by usrmerge
    are protected by the rule that if dpkg expects a directory and finds a symlink, that is fine because that is obviously an action taken by the
    admin.

    But if dpkg sees a package containing these as symlinks, then this is
    entered into the dpkg database, and subject to conflict resolution, and
    for that, a separate rule exists that directory-symlink conflicts are resolved in favour of the directory, so the interaction between a newer base-files package shipping /lib as a symlink and an older or
    third-party package containing /lib as a directory (e.g. a kernel
    package from a hosting provider) could overwrite the /lib symlink.

    It might be possible to avoid that by never shipping /lib as a symlink
    and always creating it externally, but I still think that's kind of
    wobbly.

    IMHO we should not ship the top-level symlinks in a package. The
    reason for that is to allow the use case where /usr is a separate
    vendor partition and / is either a luks volume or a tmpfs, and thus
    the top-level symlinks are ephemeral and re-created on boot on the
    fly. If they were part of a package, that would get awkward to say the
    least.
    I really would like to move toward the direction of having exclusively
    /usr and /etc shipped in data.tar, and everything else created locally
    as needed, and that includes files in /.

    Kind regards,
    Luca Boccassi

  • From Simon Richter@1:229/2 to Helmut Grohne on Wed Apr 26 11:20:01 2023
    XPost: linux.debian.devel
    From: sjr@debian.org

    Hi,

    On Tue, Apr 25, 2023 at 09:07:28PM +0200, Helmut Grohne wrote:

    This and /bin/sh is the kind of files I'd consider important. And then
    upon thinking further it became more and more difficult for me to make
    sense of the objection. On a merged system, we can just move that file
    to its canonical location without having any trouble even with an
    unmodified dpkg. So from my pov, the question about important files can
    be disregarded. I hope Simon Richter agrees.

    Yes, the relevant code at

    https://github.com/guillemj/dpkg/blob/main/src/main/unpack.c#L749

    already handles moving a file inside the same package, and that has
    existed for some time, that's why I use two packages for the PoC.

    I have not looked for more issues beyond that, so there might be others
    lurking in the depths of this code.

    What I'm mostly concerned about (read: have not verified either way)
    with /lib/ld.so and /bin/sh is what happens when dpkg learns of /bin and
    /lib as symlinks -- because right now, the symlinks created by usrmerge
    are protected by the rule that if dpkg expects a directory and finds a
    symlink, that is fine because that is obviously an action taken by the
    admin.

    But if dpkg sees a package containing these as symlinks, then this is
    entered into the dpkg database, and subject to conflict resolution, and
    for that, a separate rule exists that directory-symlink conflicts are
    resolved in favour of the directory, so the interaction between a newer base-files package shipping /lib as a symlink and an older or
    third-party package containing /lib as a directory (e.g. a kernel
    package from a hosting provider) could overwrite the /lib symlink.

    It might be possible to avoid that by never shipping /lib as a symlink
    and always creating it externally, but I still think that's kind of
    wobbly.

    If we look deeper into the dpkg toolbox, we pause at diversions. What if
    the new package were to add a (--no-rename) diversion for files that are
    at risk of being accidentally deleted in newpkg.preinst and then remove
    that diversion in newpkg.postinst? Any such diversion will cause package removal of the oldpkg to skip removal of the diverted file (and instead delete the non-existent path that we diverted to). Thus we retain the
    files we were interested in.
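
    The effect described above can be modelled in a few lines. This is a toy sketch of the behaviour as stated in this mail and has not been checked against dpkg itself. The alias table, the function names and the diverted-to filename are all hypothetical.

```python
# Toy model only: names and the diverted-to path are hypothetical.
ALIASES = {"/lib": "/usr/lib"}  # merged-/usr: /lib is a symlink to /usr/lib


def canonical(path):
    """Resolve an aliased top-level directory the way the filesystem would."""
    for link, target in ALIASES.items():
        if path == link or path.startswith(link + "/"):
            return target + path[len(link):]
    return path


def remove_package(owned_paths, diversions, disk):
    """Delete a package's files; a diverted path is replaced by its diverted-to name."""
    for path in owned_paths:
        victim = diversions.get(path, path)
        disk.discard(canonical(victim))  # deleting a non-existent path is a no-op


# Without a diversion, removing oldpkg deletes the file newpkg now ships.
disk_plain = {"/usr/lib/abcd"}
remove_package(["/lib/abcd"], {}, disk_plain)
print("/usr/lib/abcd" in disk_plain)     # False: the file was lost

# With a temporary diversion registered by newpkg.preinst, the file survives.
disk_diverted = {"/usr/lib/abcd"}
diversions = {"/lib/abcd": "/lib/abcd.diverted-by-newpkg"}
remove_package(["/lib/abcd"], diversions, disk_diverted)
print("/usr/lib/abcd" in disk_diverted)  # True: removal hit the diverted-to name
```

    In the model, newpkg.postinst would simply drop the entry from the diversions table again once oldpkg's removal has been processed.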

    O_O

    Yes. Hahahahaha yes.

    Brittle, but it could work. What is a bit annoying is that we'd have to
    keep this for an entire cycle.

    Simon

  • From Sam Hartman@1:229/2 to All on Wed Apr 26 15:20:01 2023
    XPost: linux.debian.devel
    From: hartmans@debian.org

    "Simon" == Simon McVittie <smcv@debian.org> writes:

    Simon> You might reasonably say that "the maintainer of bar didn't
    Simon> add the correct Breaks/Replaces on foo" is a RC bug in bar -
    Simon> and it is! - but judging by the number of "missing
    Simon> Breaks/Replaces" bug reports that have to be opened by
    Simon> unstable users (sometimes me), it's a very easy mistake to
    Simon> make.

    Is adding the correct breaks/replaces enough to solve things?
    I could believe adding a versioned conflicts would be sufficient, but it
    is not obvious to me that breaks/replaces is enough given that dpkg
    doesn't understand aliasing.

    My intuition (and I have not worked through this as much as you) is that
    any time you can have files moving where both packages are unpacked can
    create problems.
    I think that can happen with breaks/replaces but not without a conflicts (without replaces?)

  • From Raphael Hertzog@1:229/2 to Luca Boccassi on Wed Apr 26 15:20:01 2023
    XPost: linux.debian.devel
    From: hertzog@debian.org

    On Tue, 25 Apr 2023, Luca Boccassi wrote:
    Brilliant! Would never have thought of using divert like that.

    +1 Nice trick.

    So, what work would need to happen to make this reality? Do we need tooling/scripts/build changes to support the divert scheme, or is it
    "simply" a matter of documenting and testing?

    I would suggest to implement it with "dpkg-maintscript-helper" and
    document what needs to be documented as part of the associated manual
    page documentation.

    After all it's a hack to work around a dpkg limitation and it fits
    well with the purpose of that helper tool.

    Cheers,
    --
    ⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog <hertzog@debian.org>
    ⣾⠁⢠⠒⠀⣿⡁
    ⢿⡄⠘⠷⠚⠋ The Debian Handbook: https://debian-handbook.info/get/
    ⠈⠳⣄⠀⠀⠀⠀ Debian Long Term Support: https://deb.li/LTS

  • From Helmut Grohne@1:229/2 to Helmut Grohne on Thu Apr 27 00:50:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    On Tue, Apr 25, 2023 at 09:07:28PM +0200, Helmut Grohne wrote:
    I sincerely hope that this fixed-up plan doesn't have any serious
    issues. If you find any please tell.

    Thanks for the praise, but I found problems, and I'm pretty sure this is
    only the tip of the iceberg.

    So for one thing, let us imagine merged /usr was mandatory in bullseye
    already and we were now moving all the files to /usr for bookworm. This
    is what is on the table for trixie, but since there is no trixie yet, we
    can try to use the current freeze to see what would happen.

    To that end, let's look at /lib/systemd/system-generators/systemd-bless-boot-generator. This file
    was part of systemd in bullseye and has been split out to systemd-boot
    in bookworm. If it were moved to its canonical location, it could be
    unpacked by dpkg before upgrading systemd and thus the systemd-bless-boot-generator would vanish from the system despite
    correct Breaks+Replaces. This is a situation where we'd have to use
    Conflicts instead.

    There are actually many more such situations such as:
    * /bin/fusermount: fuse -> fuse3
    * /bin/rksh93: ksh -> ksh93u+m
    * /lib/systemd/system/dbus.socket: dbus -> dbus-system-bus-common
    * /lib/systemd/system/dhcpcd.service: dhcpcd5 -> dhcpcd
    * /lib/systemd/system/polkit.service: policykit-1 -> polkitd
    * /lib/systemd/system/systemd-resolved.service: systemd -> systemd-resolved
    * /sbin/hwclock: util-linux -> util-linux-extra
    * ...

    This really is a common situation and given the number of systemd units affected, we now also see why not allowing them to move to /usr was a
    smart thing to do.

    And that's just the ones where correct Breaks+Replaces have been added.
    We also have a number of situations where Breaks+Replaces are missing.

    Ok, let's move on. I've proposed diversions as a cure, but in reality diversions are a problem themselves. Consider that
    cryptsetup-nuke-password diverts /lib/cryptsetup/askpass, which is
    usually owned by cryptsetup. If cryptsetup were to move that file to
    /usr, the diversion would not cover it anymore and the actual content of askpass would depend on the unpack order. That's very bad and none of
    what I proposed earlier is going to fix this.

    And of course, this is not some special example, it's a pattern:
    * /lib/udev/rules.d/60-cdrom_id.rules: udev -> amazon-ec2-utils
    * /sbin/dhclient: isc-dhcp-client -> isc-dhcp-client-ddns
    * /bin/systemd-sysusers: systemd -> opensysusers
    * ...

    So how do we fix diversions? Let's have a look into the dpkg toolbox
    again. I've got an idea. Diversions. What you say? How do you fix
    diversions with diversions? Quite obviously, you divert
    /usr/bin/dpkg-divert! And whenever dpkg-divert is instructed to add a
    diversion for a non-canonical path, you forward that call to the real dpkg-divert, but also call it with a canonicalized version such that
    both locations are covered. When initially deploying the diversion of /usr/bin/dpkg-divert, we also need to transform existing diversions.
    Other than that, things should work after doubling down on diversions.
    Sorry, I don't have a test case for this yet.
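
    A minimal sketch of the canonicalization step such a dpkg-divert
    wrapper would need. The function names and the list of aliased
    top-level directories are assumptions made for illustration; they are
    not taken from dpkg or from any existing wrapper.

```shell
#!/bin/sh
# Hypothetical helper: map an aliased path to its canonical /usr location.
# Assumes a merged-/usr layout in which /bin, /sbin and /lib* are symlinks
# into /usr. Any other path is printed unchanged.
canonicalize_path() {
	case "$1" in
		/bin/*|/sbin/*|/lib/*|/lib32/*|/lib64/*|/libx32/*|/libo32/*)
			printf '/usr%s\n' "$1"
			;;
		*)
			printf '%s\n' "$1"
			;;
	esac
}

# Hypothetical helper: succeed if a diversion of path $1 would also need
# a second diversion covering the canonical location.
needs_twin_diversion() {
	test "$(canonicalize_path "$1")" != "$1"
}
```

    With these definitions, canonicalize_path /lib/cryptsetup/askpass
    prints /usr/lib/cryptsetup/askpass, while a path that is already
    below /usr is passed through untouched, so the wrapper would only
    issue a second dpkg-divert call for non-canonical paths.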

    I have a bad feeling about this. I think some dpkg maintainer warned us
    that diversions would break. Let's peek at his list again. He also said update-alternatives would be broken. I admit not having dug into this
    yet, but my gut feeling already is that update-alternatives will become
    "funny" as well though I guess we cannot fix update-alternatives by
    adding alternatives.

    So we started with moving some files to their canonical location. We
    learned that Breaks+Replaces are sometimes insufficient and we can fix
    that with Conflicts. Then we learned that Conflicts cannot always be
    used and we can work around that using diversions. Now we learned that diversions are also broken and we can work around that as well. The
    amount of complexity we are piling up here becomes non-trivial.

    At some point the question becomes: Do we want that complexity inside
    dpkg (aka DEP 17 or some variant of it) or outside of dpkg (i.e. what
    we're talking about here). It seems clear at this time, that complexity
    is unavoidable.

    Helmut

  • From Helmut Grohne@1:229/2 to Sam Hartman on Thu Apr 27 00:50:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    On Wed, Apr 26, 2023 at 07:11:10AM -0600, Sam Hartman wrote:
    "Simon" == Simon McVittie <smcv@debian.org> writes:

    Simon> You might reasonably say that "the maintainer of bar didn't
    Simon> add the correct Breaks/Replaces on foo" is a RC bug in bar -
    Simon> and it is! - but judging by the number of "missing
    Simon> Breaks/Replaces" bug reports that have to be opened by
    Simon> unstable users (sometimes me), it's a very easy mistake to
    Simon> make.

    Indeed, I was thinking that we correctly do Breaks and Replaces. I now
    know better and filed an initial batch of four RC bugs for missing
    cases. So yeah, quite clearly we need to fix this more systematically,
    but I think that's technically possible:

    1. Download all Contents and Packages files for stable and testing.
    2. Generate candidates for file conflicts from the Contents.
    3. Skip cases covered by Breaks+Replaces or Conflicts.
    3b. Optionally also parse binarycontrol.d.n data to be able to skip
    known diversions.
    4. Create a fresh stable chroot and install the "old" package.
    5. Update sources.list to testing and download the "new" package.
    6. dpkg --auto-deconfigure --unpack new.pkg.
    7. Check output for "trying to overwrite".
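
    Step 2 above could be sketched roughly as follows, assuming two
    uncompressed Contents files whose lines consist of a path followed by
    whitespace and a package field. The function name find_candidates is
    hypothetical, and the comparison is deliberately naive: it reports
    every path whose package field differs between the two files and does
    no path canonicalization.

```shell
#!/bin/sh
# Hypothetical helper: $1 = Contents file of the old release,
# $2 = Contents file of the new release (both uncompressed).
# Prints "path: oldpkgs -> newpkgs" for every path present in both
# files whose package field changed.
find_candidates() {
	awk '
		# Split each line into the trailing package field and the path.
		{ pkgs = $NF; path = $0; sub(/[ \t]+[^ \t]+$/, "", path) }
		# First file: remember the owner(s) of every path.
		NR == FNR { old[path] = pkgs; next }
		# Second file: report paths whose owner(s) differ.
		(path in old) && old[path] != pkgs {
			print path ": " old[path] " -> " pkgs
		}
	' "$1" "$2"
}
```

    Each reported pair would still have to be filtered (step 3) and then
    validated in a chroot (steps 4 to 7).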

    Is adding the correct breaks/replaces enough to solve things?
    I could believe adding a versioned conflicts would be sufficient, but it
    is not obvious to me that breaks/replaces is enough given that dpkg
    doesn't understand aliasing.

    It is not.

    My intuition (and I have not worked through this as much as you) is that
    any time you can have files moving where both packages are unpacked can create problems.

    This is exactly the situation that caused the moratorium.

    I think that can happen with breaks/replaces but not without a conflicts (without replaces?)

    Yes, Conflicts can work in situations where Breaks+Replaces would fail due to
    the aliasing issue. My mail from yesterday goes into more detail and
    also explains when you cannot use Conflicts.

    Helmut

  • From Simon Richter@1:229/2 to Helmut Grohne on Thu Apr 27 08:00:02 2023
    XPost: linux.debian.devel
    From: sjr@debian.org

    Hi,

    On Thu, Apr 27, 2023 at 12:34:06AM +0200, Helmut Grohne wrote:

    At some point the question becomes: Do we want that complexity inside
    dpkg (aka DEP 17 or some variant of it) or outside of dpkg (i.e. what
    we're talking about here). It seems clear at this time, that complexity
    is unavoidable.

    My gut feeling is that returning to "dpkg's model is an accurate
    representation of the file system" will be less complex to manage
    long-term. For this to work, the model needs to be able to express
    reality, so I guess we can't avoid updating dpkg.

    I'm also not convinced that the current filesystem layout will remain as
    it is, for example I can see a use case for installing kernel modules
    outside of /usr. It would be great to have a generic mechanism here, and
    be able to do transitions like these without inventing new tools every
    time.

    Also, the more we can do in a descriptive framework, the better we can
    do static checks. The main reason we can argue about what packages are
    affected is that we have a database of what files are installed where,
    and that still accurately reflects reality, so we can apply a
    transformation onto this data and check for conflicts -- but we cannot
    see diversions in this database as these are created from imperative
    code.

    So my fear is that if we create a workaround here that is implemented as imperative code in pre/postinst, this will be invisible to whoever plans
    the next transition, so this would create immense technical debt.

    Simon

  • From Helmut Grohne@1:229/2 to Simon McVittie on Thu Apr 27 15:40:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Simon,

    On Sat, Apr 22, 2023 at 11:41:29AM +0100, Simon McVittie wrote:
    You might reasonably say that "the maintainer of bar didn't add the
    correct Breaks/Replaces on foo" is a RC bug in bar - and it is! - but
    judging by the number of "missing Breaks/Replaces" bug reports that have
    to be opened by unstable users (sometimes me), it's a very easy mistake
    to make.

    That number seemed quite vague to me and I wanted to get a better handle
    on it. The rough idea here should be that we have some package from
    bullseye and "upgrade" it to a different package from bookworm.
    Generating useful candidates for this can be done using Contents. Given candidates, I've attached a validation script:

    ./check_conflicts.sh $OLDPKG bullseye $NEWPKG bookworm

    In order to draw value from it, the output must be parsed. The exit code
    can be non-zero for various reasons. As for candidate generation, I
    think one can either just try them all (which takes a bit longer on the validation phase) or reduce their number by ignoring existing
    Breaks+Replaces, but I haven't found an elegant solution for the latter
    yet.
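
    Parsing the output could look roughly like the sketch below, which
    assumes that the combined stdout and stderr of a validation run is
    piped in and that a file conflict shows up as dpkg's "trying to
    overwrite" message. The function name classify_unpack_log is
    hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a captured log on stdin and print "conflict"
# if dpkg reported a file conflict, "other" otherwise. The exit code of
# the run itself is ignored, since it can be non-zero for other reasons.
classify_unpack_log() {
	if grep -q 'trying to overwrite'; then
		echo conflict
	else
		echo other
	fi
}
```

    A run could then be classified with something like
    ./check_conflicts.sh $OLDPKG bullseye $NEWPKG bookworm 2>&1 | classify_unpack_log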

    In any case, unstable has around:
    * 5700 Breaks
    * 6500 Replaces
    * 100 unpack errors due to missing Breaks+Replaces

    That latter number has just been turned into rc bugs...

    So maybe it isn't as bad as we think it is, but it definitely is an
    aspect that may require a more automated solution.

    In any case, that also gives us a rough idea on how many Breaks+Replaces
    we'd have to convert to Conflicts. 6000 likely is an upper bound. I
    expect that it is probably below 1000 since we can ignore conflicting
    paths that reside below /usr only. I'm not sure whether these numbers
    argue in favour or against the projected approach.

    Helmut

    #!/bin/sh
    set -eux
    if test -n "${CONFLICT_TEST_CASE_INNER:+set}"; then
        cd "$1"
        sed -i -e "s/ $CONFLICT_TEST_CASE_FROMDIST / $CONFLICT_TEST_CASE_TODIST /" ./etc/apt/sources.list
        test "$CONFLICT_TEST_CASE_TODIST" = bookworm && sed -i -e 's/ non-free/& non-free-firmware/' ./etc/apt/sources.list
        APT_CONFIG="$MMDEBSTRAP_APT_CONFIG" apt-get update
        C=$(APT_CONFIG="$MMDEBSTRAP_APT_CONFIG" apt-cache show --no-all-versions "$CONFLICT_TEST_CASE_TOPKG" | sed -n '/^Conflicts:/p')
        PD=$(APT_CONFIG="$MMDEBSTRAP_APT_CONFIG" apt-cache show --no-all-versions "$CONFLICT_TEST_CASE_TOPKG" | sed -n 's/^Pre-Depends://p')
        if test -n "$C" -o -n "$PD"; then
            set --
            test -n "$C" && set -- "$@" "$C"
            test -n "$PD" && set -- "$@" "$PD"
            # apt-get satisfy behaves differently when not chrooted. Unknown cause.
            chroot . apt-get satisfy -y "$@"
        fi
        APT_CONFIG="$MMDEBSTRAP_APT_CONFIG" apt-get download "$CONFLICT_TEST_CASE_TOPKG"
        # cannot use dpkg --root due to https://lists.debian.org/debian-dpkg/2023/03/msg00003.html
        chroot . dpkg --auto-deconfigure --unpack ./*.deb
    else
        export CONFLICT_TEST_CASE_INNER=yes
        export CONFLICT_TEST_CASE_FROMPKG=$1
        export CONFLICT_TEST_CASE_FROMDIST=$2
        export CONFLICT_TEST_CASE_TOPKG=$3
        export CONFLICT_TEST_CASE_TODIST=$4
        export CONFLICT_TEST_CASE_MIRROR=${5:-http://deb.debian.org/debian}
        mmdebstrap \
            --verbose \
            --variant=apt \
            --hook-dir=/usr/share/mmdebstrap/hooks/maybe-merged-usr \
            --include="$CONFLICT_TEST_CASE_FROMPKG" \
            --components=main,contrib,non-free \
            --customize-hook="$(realpath "$0")" \
            "$CONFLICT_TEST_CASE_FROMDIST" \
            /dev/null \
            "$CONFLICT_TEST_CASE_MIRROR"
    fi

  • From Sven Joachim@1:229/2 to Luca Boccassi on Wed Apr 26 18:30:01 2023
    XPost: linux.debian.devel
    From: svenjoac@gmx.de

    On 2023-04-26 10:34 +0100, Luca Boccassi wrote:

    On Wed, 26 Apr 2023 at 10:11, Simon Richter <sjr@debian.org> wrote:

    What I'm mostly concerned about (read: have not verified either way)
    with /lib/ld.so and /bin/sh is what happens when dpkg learns of /bin and
    /lib as symlinks -- because right now, the symlinks created by usrmerge
    are protected by the rule that if dpkg expects a directory and finds a
    symlink, that is fine because that is obviously an action taken by the
    admin.

    But if dpkg sees a package containing these as symlinks, then this is
    entered into the dpkg database, and subject to conflict resolution, and
    for that, a separate rule exists that directory-symlink conflicts are
    resolved in favour of the directory, so the interaction between a newer
    base-files packages shipping /lib as a symlink and an older or
    third-party package containing /lib as a directory (e.g. a kernel
    package from a hosting provider) could overwrite the /lib symlink.

    No, this does not change anything. The dpkg database currently does not
    even record if a pathname in it corresponds to a symlink, a directory or something else. See also Policy 6.6.4 :

    ,----
    | A directory will never be replaced by a symbolic link to a directory or
    | vice versa; instead, the existing state (symlink or not) will be left
    | alone and "dpkg" will follow the symlink if there is one.
    `----

    It might be possible to avoid that by never shipping /lib as a symlink
    and always creating it externally, but I still think that's kind of
    wobbly.

    IMHO we should not ship the top-level symlinks in a package. The
    reason for that is to allow the use case where /usr is a separate
    vendor partition and / is either a luks volume or a tmpfs, and thus
    the top-level symlinks are ephemeral and re-created on boot on the
    fly. If they were part of a package, that would get awkward to say the
    least.
    I really would like to move toward the direction of having exclusively
    /usr and /etc shipped in data.tar, and everything else created locally
    as needed, and that includes files in /.

    This means that you need special code in dpkg to preserve these top
    level symlinks, as otherwise they are going to disappear with the last
    package that contained these as directories, instantly hosing your installation. I am pretty sure the dpkg maintainer will not like this.

    Cheers,
    Sven

  • From Timo Röhling@1:229/2 to All on Fri Apr 28 12:00:01 2023
    XPost: linux.debian.devel
    From: roehling@debian.org

    * Luca Boccassi <bluca@debian.org> [2023-04-28 10:12]:
    Which part of config-package-dev causes a conflict here? Is it
    something that can be fixed? Given it's declarative, an upload + rdeps
    rebuild should be all that's needed, assuming we know what the issue
    is and how to fix it. As far as I can remember, it's a build-time
    utility and everything it does is embedded in the target package's
    maintainer scripts. But it's been a few years since I last used it, so
    I might remember wrongly.

    You remember correctly. It is relatively straight-forward to patch config-package-dev to create additional diversions for files in /
    and /usr (if we decide that is the way forward), but admins will
    have to rebuild their local config packages for these changes to
    take effect.


    Cheers
    Timo

    --
    ⢀⣴⠾⠻⢶⣦⠀ ╭────────────────────────────────────────────────────╮
    ⣾⠁⢠⠒⠀⣿⡁ │ Timo Röhling │
    ⢿⡄⠘⠷⠚⠋⠀ │ 9B03 EBB9 8300 DF97 C2B1 23BF CC8C 6BDD 1403 F4CA │
    ⠈⠳⣄⠀⠀⠀⠀ ╰────────────────────────────────────────────────────╯

  • From Luca Boccassi@1:229/2 to Luca Boccassi on Sat Apr 29 02:50:02 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Fri, 28 Apr 2023 at 10:12, Luca Boccassi <bluca@debian.org> wrote:

    On Fri, 28 Apr 2023 at 09:09, Helmut Grohne <helmut@subdivi.de> wrote:
    So yeah, with the exception of dash, this looks fairly good. Let me also
    dive into dash. Unlike the majority of diverters, it diverts in postinst
    rather than preinst to allow controlling /bin/sh via debconf. A similar
    technique is used by gpr. In any case, this is special, because
    dash diverts its own files, so when moving dash's files, its diversions
    can be migrated at the same time. It merely means that we cannot have
    debhelper just move files (as that would horribly break dash) and
    instead have to move files package by package. We could also
    opt for removing dash's diversion in the default case, and there has even
    been a patch for doing so (#989632) for almost two years. Too bad we didn't
    apply it. In any case, as long as the file moving is not forced via
    debhelper, dash should be harmless.

    If I understand correctly, by "forced via debhelper" you mean the
    proposal of fixing the paths at build time, right? But if not via
    that, it means having to fix all of them by hand, which is a lot - is
    it possible to fix dash instead? Or else, we could add an opt-out via
    one of the usual dh mechanisms, and use it only in dash perhaps?

    Also as pointed out by the maintainer the debconf machinery was
    dropped from dash, last year: https://salsa.debian.org/debian/dash/-/commit/c322a1c9fc6be11d7eb4439

    This should hopefully simplify things?

    Kind regards,
    Luca Boccassi

  • From Helmut Grohne@1:229/2 to Simon Richter on Fri Apr 28 15:50:02 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Simon,

    On Fri, Apr 28, 2023 at 02:07:33PM +0200, Simon Richter wrote:
    Transforming existing diversions: yes, if you can find out about them
    without looking at dpkg internal files. It may very well be necessary to update the file format on one of these, and if that would cause your
    script to create nonsense diversions, that'd be another thing we'd have
    to work around while building real aliasing support.

    My current mood is "I'd rather focus on a proper solution, not another
    hack that needs to be supported by the proper solution."

    Anything we build here that is not aliasing support for dpkg, but
    another "shortcut" will delay aliasing support for dpkg because it adds
    more possible code paths that all need to be tested.

    Keep in mind that we're here because someone took a shortcut, after all.

    I think we have a misunderstanding here. As far as I understand it, the
    core idea of Luca's approach is that we move all files to their
    canonical locations and then - when nothing is left in directories such
    as /bin or /lib - there is no aliasing anymore, which is why we do not
    have to teach dpkg about aliasing and never patch it.

    From my point of view the only reason to try and solve this with a pile
    of hacks is to get us to a state that the current dpkg can deal well with
    again (because all aliasing is gone). And while I've argued earlier that
    dpkg will need to support aliasing, I'm trying to get myself convinced
    that this is not necessary. Personally, I don't have a final conclusion
    on this yet. Can I ask you to go into more detail as to why you think
    that patching dpkg is ultimately necessary?

    At the point where we conclude that no, we cannot move forward without
    patching dpkg, I fully agree with you that taking more shortcuts is only
    going to make it worse.

    Please do understand my research as evaluating all possible approaches
    (and their consequences) in parallel. I'm not trying to push a
    particular approach other than trying to move the whole matter forward.
    Once we have a better understanding, we'll have to build consensus
    around one of these approaches somehow.

    Helmut

  • From Helmut Grohne@1:229/2 to Marvin Renich on Sat Apr 29 22:20:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Marvin,

    On Sat, Apr 29, 2023 at 02:08:37PM -0400, Marvin Renich wrote:
    My understanding from following this thread (and others) is that dpkg
    has a bug that can easily be triggered by a sysadmin replacing a
    directory with a symlink (and some other necessary conditions that don't happen very often), which is explicitly allowed by policy. This bug is

    I fear that this is more nuanced than it initially looks. While we kinda support symlinking parts of the directory hierarchy, the implicit
    assumption has always been that this never introduces aliasing. In other
    words, the target of such a symlink needs to be a location that is not
    used by any package.

    The problem here arises from introducing aliasing symlinks - which has
    never been supported. In that sense, we do not have consensus on whether
    this is a bug in dpkg or merely a missing feature.

    the one that is causing the problem with the approach that was chosen by
    the people implementing usrmerge, even though they were aware of this
    problem and a different approach that would have taken two release
    cycles and would not have triggered this bug was considered and
    rejected.

    In all fairness, my understanding is that the different approach would
    have had different semantics. Not all binaries would have been available
    via both / and /usr and that would have resulted in continuous
    incompatibility with other Linux distributions.

    If this is correct, then Luca's approach may fix the problem for
    usrmerge, but does not fix the general dpkg bug. (And, IIUC, is going
    to take two _more_ release cycles to fix the problems with usrmerge as implemented! Hmm...)

    That general dpkg bug is only observable when aliasing happens. Once no
    package ships anything inside non-canonical paths, no relevant aliasing
    is happening anymore, so we will not experience any symptoms of that
    bug. As such, we could call it unsupported again once Luca's approach
    has completed. So yeah, I think the beauty of that approach would be
    getting out of this mess without patching dpkg. I've not yet fully
    convinced myself that this is indeed viable (though I'm trying to).

    The --add-alias solution that has been suggested in this thread seems
    like it would fix the general problem iff policy was changed to require sysadmins to use it if they replaced a directory with a symlink.

    I don't think we want to support aliasing as a general mechanism
    available to administrators.

    I do not understand why the dpkg maintainer has rejected this solution;
    it would still be a fix for the general bug after the usrmerge
    transition has completed. And it would be at least one order of
    magnitude more performant than scanning the filesystem for directory symlinks.

    The --add-alias mechanism certainly adds significant complexity to the
    dpkg code base. Once introduced, it would become API and cannot be
    removed anymore, so it introduces a continuous maintenance cost. And if
    you disagree with the bug being a bug (and think of it as declining to
    implement a new feature), that rejection vaguely makes sense. We may
    disagree with that reasoning, but it is far from baseless.

    So the problem kinda is that aliasing is happening and causes dpkg's
    undefined behaviour wrt aliasing to cause problems. We basically have
    two ways to fix this:
    A. Ensure that no aliasing is happening anymore. (e.g. by
    canonicalizing all paths)
    B. Make dpkg cope with such aliasing.

    And while some seem to see these solutions as complementary, I think we
    should build consensus around either option as combining them gets us
    the downsides of both without adding benefit.

    Helmut

  • From Marvin Renich@1:229/2 to All on Sat Apr 29 20:20:02 2023
    XPost: linux.debian.devel
    From: mrvn@renich.org

    * Helmut Grohne <helmut@subdivi.de> [230428 09:50]:
    I think we have a misunderstanding here. As far as I understand it, the
    core idea of Luca's approach is that we move all files to their
    canonical locations and then - when nothing is left in directories such
    as /bin or /lib - there is no aliasing anymore, which is why we do not
    have to teach dpkg about aliasing and never patch it.

    My understanding from following this thread (and others) is that dpkg
    has a bug that can easily be triggered by a sysadmin replacing a
    directory with a symlink (and some other necessary conditions that don't
    happen very often), which is explicitly allowed by policy. This bug is
    the one that is causing the problem with the approach that was chosen by
    the people implementing usrmerge, even though they were aware of this
    problem and a different approach that would have taken two release
    cycles and would not have triggered this bug was considered and
    rejected.

    If this is correct, then Luca's approach may fix the problem for
    usrmerge, but does not fix the general dpkg bug. (And, IIUC, is going
    to take two _more_ release cycles to fix the problems with usrmerge as implemented! Hmm...)

    The --add-alias solution that has been suggested in this thread seems
    like it would fix the general problem iff policy was changed to require sysadmins to use it if they replaced a directory with a symlink.

    I do not understand why the dpkg maintainer has rejected this solution;
    it would still be a fix for the general bug after the usrmerge
    transition has completed. And it would be at least one order of
    magnitude more performant than scanning the filesystem for directory
    symlinks.

    ...Marvin

  • From Raphael Hertzog@1:229/2 to Simon Richter on Tue May 2 12:40:01 2023
    XPost: linux.debian.devel
    From: hertzog@debian.org

    Hello,

    On Wed, 26 Apr 2023, Simon Richter wrote:
    This and /bin/sh is the kind of files I'd consider important. And then
    upon thinking further it became more and more difficult for me to make sense of the objection. On a merged system, we can just move that file
    to its canonical location without having any trouble even with an unmodified dpkg. So from my pov, the question about important files can
    be disregarded. I hope Simon Richter agrees.

    Yes, the relevant code at

    https://github.com/guillemj/dpkg/blob/main/src/main/unpack.c#L749

    already handles moving a file inside the same package, and that has
    existed for some time, that's why I use two packages for the PoC.

    Hum... why aren't we improving this part of the code?

    We don't want to stat all the files in all packages but we could do better:
    if we are about to remove an old file that is available through a
    symlinked directory, we could check the new name of the file and see if
    it's available in some package... and if yes just forget the file without removing it.

    This file removal is the reason for the moratorium, and incurring some extra
    cost in some specific cases (installation through directory symlinks, which
    is not the default case and would not affect us after the migration is
    complete) seems certainly fair.

    The cost of analyzing directory components is a cost that we will have on
    all dpkg invocations but it doesn't seem unreasonable to me. We could also restrict it to the top-level directories to make it negligible as this
    is the only transition that we care about here.
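
    To illustrate how cheap a check restricted to top-level directories
    can be, here is a minimal shell sketch. It is not dpkg code (dpkg is
    written in C); the function name toplevel_is_symlink and the optional
    root argument are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first component of the absolute
# path $1 is a symlink. $2 is an optional alternative root directory
# (defaults to the real root), which makes the check easy to try out.
toplevel_is_symlink() {
	root=${2:-}
	rest=${1#/}          # strip the leading slash
	top=${rest%%/*}      # keep only the first path component
	test -n "$top" && test -L "$root/$top"
}
```

    On a merged-/usr system, toplevel_is_symlink /bin/sh would succeed
    and toplevel_is_symlink /usr/bin/sh would fail, so only paths of the
    first kind would need the extra lookup under their canonical name.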

    Cheers,
    --
    ⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog <hertzog@debian.org>
    ⣾⠁⢠⠒⠀⣿⡁
    ⢿⡄⠘⠷⠚⠋ The Debian Handbook: https://debian-handbook.info/get/
    ⠈⠳⣄⠀⠀⠀⠀ Debian Long Term Support: https://deb.li/LTS

  • From Simon Richter@1:229/2 to Helmut Grohne on Fri Apr 28 14:10:01 2023
    XPost: linux.debian.devel
    From: sjr@debian.org

    Hi,

    On Thu, Apr 27, 2023 at 12:34:06AM +0200, Helmut Grohne wrote:

    Ok, let's move on. I've proposed diversions as a cure, but in reality diversions are a problem themselves. Consider that
    cryptsetup-nuke-password diverts /lib/cryptsetup/askpass, which is
    usually owned by cryptsetup. If cryptsetup were to move that file to
    /usr, the diversion would not cover it anymore and the actual content of askpass would depend on the unpack order. That's very bad and none of
    what I proposed earlier is going to fix this.

    Another question: how would this interact with a patch that teaches dpkg
    to do aliases properly, because such a patch would affect diversion
    handling as well -- specifically, aliases mean that diversions are also
    aliased (I believe my patch implicitly does this right, but I think I
    just got 60 more testcases to write), and that diversion targets are
    implicitly redirected to a resolved form (I don't do that yet, but it's
    simple to add to my patch).

    I think the "divert, but do not rename" approach itself should be fairly
    safe, because all it does is make a deletion fail. Registering the
    diversion to the new package should be sufficient to make sure the newly unpacked file is not diverted. This probably needs some additional undo operations for failed installs/upgrades, so the diversion is properly
    removed in these cases (there is no guarantee that postinst will be
    called after preinst, we could also end up in postrm).

    So how do we fix diversions? Let's have a look into the dpkg toolbox
    again. I've got an idea. Diversions. What you say? How do you fix
    diversions with diversions? Quite obviously, you divert
    /usr/bin/dpkg-divert! And whenever dpkg-divert is instructed to add a diversion for a non-canonical path, you forward that call to the real dpkg-divert, but also call it with a canonicalized version such that
    both locations are covered. When initially deploying the diversion of /usr/bin/dpkg-divert, we also need to transform existing diversions.

    Ouch, if you deploy that, I will definitely need to add diversion
    merging code to alias registration. That's another 60 testcases, but we
    need defined behaviour for that anyway.

    Transforming existing diversions: yes, if you can find out about them
    without looking at dpkg internal files. It may very well be necessary to
    update the file format on one of these, and if that would cause your
    script to create nonsense diversions, that'd be another thing we'd have
    to work around while building real aliasing support.

    My current mood is "I'd rather focus on a proper solution, not another
    hack that needs to be supported by the proper solution."

    Anything we build here that is not aliasing support for dpkg, but
    another "shortcut" will delay aliasing support for dpkg because it adds
    more possible code paths that all need to be tested.

    Keep in mind that we're here because someone took a shortcut, after all.

    Simon

    --- SoupGate-Win32 v1.05
    * Origin: you cannot sedate... all the things you hate (1:229/2)
  • From Helmut Grohne@1:229/2 to Luca Boccassi on Tue May 2 15:00:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Luca,

    On Fri, Apr 21, 2023 at 03:29:33PM +0100, Luca Boccassi wrote:
    After Bookworm ships I plan to propose a policy change to the CTTE and
    policy maintainers to forbid shipping files in the legacy directories altogether, followed by a debhelper change to adjust any stragglers automatically at build time and a mass rebuild, plus MBF for the small
    % that does not use dh and a piuparts test to stop migration for
    anything that is uploaded and doesn't comply. That should bring the
    matter to an end, without needing to modify dpkg.

    I think we now learned that this is quite oversimplified, but possibly
    fixable. At this time we know about the following problem areas:
    * file loss during upgrades
      + The main reason for having the moratorium
      + Possible workaround is using Conflicts
    * diversion mismatches
      + Possibly fixable by duplicating affected diversions
    * alternatives
      + Only become a problem if we try migrating them to canonical paths
    * triggers
      + Possible fix is duplicating trigger interest

    These are the problems we know about now, but this is likely not an exhaustive
    list. This list was mostly guided by Guillem's intuition of what could
    break at https://wiki.debian.org/Teams/Dpkg/MergedUsr and I have to say
    that his intuition was quite precise thus far. Notably missing in the investigation are statoverrides. However, we should also look for a more generic approach that tries capturing unexpected breakage.

    I noticed that the number of packages shipping non-canonical files is relatively small. It's less than 2000 binary packages in unstable and
    their total size is about 2GB. So I looked into binary-patching them and
    attach the resulting scripts.
    * repackdeb.sh is a helper script for repacking an individual binary
      package and canonicalizing the contained paths.
    * autorepack.sh creates an apt repository of repacked .debs and calls
      repackdeb.sh repeatedly in that process. Should complete in half an
      hour.
    * createchroot.sh consumes the apt repository to create a chroot using
      mmdebstrap. In there, all paths should be canonical.
    * Depends: curl, devscripts, dpkg-dev, fakeroot, mmdebstrap, moreutils
    * You are expected to place these scripts in an empty directory and run
      them from there. Please read them before running them. You'll likely
      have to edit at least createchroot.sh to make it useful.

    I'm not sure how to devise test cases from this yet.

    * I tried createchroot.sh with --include=zutils, because that's one of
    the packages where I expect breakage due to mismatching diversions.
    Sure enough, I got the expected unpack error from dpkg!

    * I tried createchroot.sh with --include=gnome to see it in action with
    a few more packages and that appeared successful. So in this scenario
    there were no unforeseen and readily visible problems.

    If you end up performing tests using these or similar scripts, please
    report your results. If you can think of relevant scenarios, please
    tell.

    Helmut

  • From Helmut Grohne@1:229/2 to Raphael Hertzog on Tue May 2 15:00:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Raphaël,

    On Tue, May 02, 2023 at 12:30:21PM +0200, Raphael Hertzog wrote:
    We don't want to stat all the files in all packages but we could do better: if we are about to remove an old file that is available through a
    symlinked directory, we could check the new name of the file and see if
    it's available in some package... and if yes just forget the file without removing it.

    Indeed, this is a neat idea. You are effectively implying that we only
    ever move files from non-canonical locations to canonical locations and
    never the other way round. I think this is a reasonable assumption and
    this assumption is the one that makes your variant simpler.
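
    The decision rule discussed here could be sketched roughly as follows. This is an illustration in shell, not dpkg code. The names removal_action and OWNER_LOOKUP are hypothetical, the aliasing symlinks are assumed to be in place already, and using "dpkg-query -S" as the ownership lookup is an assumption made for this sketch (dpkg itself would consult its own file database).

```sh
#!/bin/sh
# OWNER_LOOKUP: a command that succeeds when some package owns the given path.
# "dpkg-query -S" is an assumed stand-in chosen for this illustration.
OWNER_LOOKUP=${OWNER_LOOKUP:-dpkg-query -S}

# Print "forget" if the old, non-canonical path should be dropped from the
# database without touching the disk, or "remove" if it can be deleted.
removal_action() {
	old=$1
	case "$old" in
		/bin/*|/sbin/*|/lib/*|/lib32/*|/lib64/*|/libo32/*|/libx32/*)
			# Files are assumed to move only towards /usr.
			new="/usr$old" ;;
		*)
			echo remove
			return 0 ;;
	esac
	# Intentional word splitting of OWNER_LOOKUP into command and options.
	if $OWNER_LOOKUP "$new" >/dev/null 2>&1; then
		echo forget
	else
		echo remove
	fi
}
```

    The sketch relies on the same assumption pointed out above: files only ever move from non-canonical to canonical locations, so only one direction needs to be checked.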

    I think there is a caveat (whose severity I am unsure about): In order
    to rely on this (and on DEP 17), we will likely have versioned
    Pre-Depends on dpkg. Can we reasonably rule out the case where an old
    dpkg is running, unpacking a fixed dpkg, configuring the fixed dpkg and
    then unpacking an affected package still running the unfixed dpkg
    process?

    This file removal is the reason for the moratorium, and incurring some extra cost in some specific cases (installation through directory symlinks, which
    is not the default case, and would not affect us after the migration is complete) seems certainly fair.

    I think the file loss problem is one sufficient reason to have the
    moratorium. We didn't need other reasons once we knew this one. Now that
    we look into dropping the moratorium, we need to ensure that there are
    no reasons anymore and we learned that diversions are affected in a
    non-trivial way. So even if we were to fix just the file loss problem,
    the diversion problems would still be sufficient reason to keep the
    moratorium unless they were also fixed by the approach. Here you need
    both directions a) diverting a non-canonical location would have to
    divert a canonical file and b) diverting a canonical location would have
    to divert a non-canonical file. This is breaking the initial assumption.

    In any case, this train of thought is definitely widening the solution
    space. Thank you very much.

    Helmut

  • From Helmut Grohne@1:229/2 to Helmut Grohne on Tue May 2 16:00:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    On Tue, May 02, 2023 at 02:09:32PM +0200, Helmut Grohne wrote:
    These are the problems we know about now, but this is likely not an exhaustive
    list. This list was mostly guided by Guillem's intuition of what could
    break at https://wiki.debian.org/Teams/Dpkg/MergedUsr and I have to say
    that his intuition was quite precise thus far. Notably missing in the investigation are statoverrides. However, we should also look for a more generic approach that tries capturing unexpected breakage.

    I mentioned statoverrides as missing. I think we can split statoverrides
    into the two classes "package changes" and "admin changes". Quite
    obviously, moving files will break admin changes. I see few ways
    around this; we can partially mitigate this by detecting common
    statoverrides and migrating them automatically, but in the end, we'll
    probably have to explain issues with admin-initiated statoverrides in
    the release notes.

    For package changes, the good thing is that statoverrides usually change
    stats of files owned by the package initiating them. Thus a package
    moving files can also move statoverrides (though this again means that automatic moves e.g. by debhelper must be opt-in in order to avoid
    breaking stuff). For getting an idea of the scope, we can use https://binarycontrol.debian.net/?q=dpkg-statoverride.*+%2F%28bin%7Csbin%7Clib%7Clib32%7Clib64%7Clibo32%7Clibx32%29&path=%2Funstable%2F

    * fuse and fuse3 adapt to an admin initiated statoverride of
    /bin/fusermount.
    * nfs-common cleans an obsolete dpkg-statoverride of /sbin/mount.nfs
    * systemd-cron adds a statoverride for /lib/systemd-cron/crontab_setgid
    and needs to migrate it with its files.
    * yp-tools adds a statoverride for /sbin/unix_chkpwd and needs to
    migrate it with its files.

    I also tried installing all packages that contain dpkg-statoverride in
    any of their maintainer scripts and capturing the resulting statoverride
    file. That doesn't yield anything unexpected thus far, but it also
    hasn't completed yet. I'll reply to this message with findings if
    there are any beyond the ones above.

    So statoverrides seem quite similar to the diversions induced by dash:
    Mostly harmless if handled correctly while moving the files, but we
    cannot just move the files in an opt-out fashion. Beyond that we need to augment release notes to ask admins to carefully update their local statoverrides (and local diversions).
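
    As a rough sketch of what "moving a statoverride along with the file" could look like in a maintainer script: the function name migrate_statoverride and the STATOVERRIDE variable are hypothetical, and the sketch assumes that the --list pattern matches at most one override. It only uses the --list, --remove and --add commands of dpkg-statoverride.

```sh
#!/bin/sh
# STATOVERRIDE can be pointed at a stub for dry runs (assumed helper variable).
STATOVERRIDE=${STATOVERRIDE:-dpkg-statoverride}

# Move an existing override from the old path to the new path.
# Does nothing if no override is registered for the old path.
migrate_statoverride() {
	old=$1
	new=$2
	# --list prints "user group mode path" and fails if nothing matches.
	entry=$($STATOVERRIDE --list "$old") || return 0
	# shellcheck disable=SC2086 # intentional word splitting
	set -- $entry
	$STATOVERRIDE --remove "$old"
	$STATOVERRIDE --add "$1" "$2" "$3" "$new"
}
```

    A call for one of the cases listed above might be migrate_statoverride /sbin/unix_chkpwd /usr/sbin/unix_chkpwd. Admin-initiated overrides for other paths are not found by this and would still need the release-notes treatment.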

    Helmut

  • From Sam Hartman@1:229/2 to All on Tue May 2 17:40:01 2023
    XPost: linux.debian.devel
    From: hartmans@debian.org

    "Helmut" == Helmut Grohne <helmut@subdivi.de> writes:

    Helmut> Luca Boccassi kindly pointed me at config-package-dev
    Helmut> though. This is a tool for generating local packages and it
    Helmut> also employs dpkg-divert. There is a significant risk of
    Helmut> breaking this use case. If we were to divert dpkg-divert and
    Helmut> automatically duplicate diversions, this use case would be
    Helmut> automatically covered.

    Most of the uses of config-package-dev are typically within /etc.
    Also, people who use config-package-dev tend to be managing
    infrastructure and have ways to handle upgrades of their infrastructure.

  • From Raphael Hertzog@1:229/2 to Helmut Grohne on Wed May 3 10:40:01 2023
    XPost: linux.debian.devel
    From: hertzog@debian.org

    Hello,

    On Tue, 02 May 2023, Helmut Grohne wrote:
    I think there is a caveat (whose severity I am unsure about): In order
    to rely on this (and on DEP 17), we will likely have versioned
    Pre-Depends on dpkg. Can we reasonably rule out the case where an old
    dpkg is running, unpacking a fixed dpkg, configuring the fixed dpkg and
    then unpacking an affected package still running the unfixed dpkg
    process?

    I don't know APT well enough to answer that question but from my point of
    view it's perfectly acceptable to document in the release notes that you
    need to upgrade dpkg first.

    I think the file loss problem is one sufficient reason to have the moratorium. We didn't need other reasons once we knew this one. Now that
    we look into dropping the moratorium, we need to ensure that there are
    no reasons anymore and we learned that diversions are affected in a non-trivial way. So even if we were to fix just the file loss problem,
    the diversion problems would still be sufficient reason to keep the moratorium unless they were also fixed by the approach. Here you need
    both directions a) diverting a non-canonical location would have to
    divert a canonical file and b) diverting a canonical location would have
    to divert a non-canonical file. This is breaking the initial assumption.

    Are you sure that we need anything for diversions except some documented
    policy on how to deal with it?

    AFAIK the following sequence performs no filesystem changes and should
    be sufficient to move a diversion to its new location (I only consider the
    case of an upgrade, not of a new installation that should just work
    "normally" on the new location):

    dpkg-divert --package $package --remove /bin/foo --no-rename
    dpkg-divert --package $package --add /usr/bin/foo --divert /usr/bin/foo.diverted --no-rename

    The case of update-alternatives is likely more tricky. You already looked
    into it. That's a place where it will be harder to get things right
    without some changes.

    In any case, this train of thought is definitely widening the solution
    space. Thank you very much.

    You are welcome.

    Cheers,
    --
    ⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog <hertzog@debian.org>
    ⣾⠁⢠⠒⠀⣿⡁
    ⢿⡄⠘⠷⠚⠋ The Debian Handbook: https://debian-handbook.info/get/
    ⠈⠳⣄⠀⠀⠀⠀ Debian Long Term Support: https://deb.li/LTS

  • From Helmut Grohne@1:229/2 to Helmut Grohne on Tue May 2 15:10:02 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    On Tue, May 02, 2023 at 02:09:32PM +0200, Helmut Grohne wrote:
    I noticed that the number of packages shipping non-canonical files is relatively small. It's less than 2000 binary packages in unstable and
    their total size is about 2GB. So I looked into binary-patching them and attach the resulting scripts.

    Sorry for missing the attachments.

    Helmut

    repackdeb.sh:

    #!/bin/sh
    set -eux
    PACKAGE=$1
    CHDIST_DIR=$(realpath ./chdist)
    WORKDIR=$(mktemp -d)
    MERGED_DIRS="bin sbin lib lib32 lib64 libo32 libx32"
    cleanup() {
        rm -Rf "$WORKDIR"
    }
    trap cleanup EXIT HUP INT TERM
    env -C "$WORKDIR" chdist -d "$CHDIST_DIR" apt-get orig download "$PACKAGE"
    debfile=$(echo "$WORKDIR/${PACKAGE}_"*.deb)
    dpkg-deb -x "$debfile" "$WORKDIR/unpack"
    dpkg-deb -e "$debfile" "$WORKDIR/unpack/DEBIAN"
    rm -f "$debfile"
    for d in $MERGED_DIRS; do
        test -d "$WORKDIR/unpack/$d" || continue
        mkdir -p "$WORKDIR/unpack/usr"
        cp --link --archive "$WORKDIR/unpack/$d" "$WORKDIR/unpack/usr/"
        rm -Rf "$WORKDIR/unpack/$d"
    done
    dpkg-deb -b "$WORKDIR/unpack" "$WORKDIR"
    debfile=$(echo "$WORKDIR/${PACKAGE}_"*.deb)
    mv "$debfile" repacked/

    autorepack.sh:

    #!/bin/sh
    set -eux
    SUITE=unstable
    MIRROR=http://deb.debian.org/debian
    COMPONENTS="main contrib non-free non-free-firmware"
    ARCH=$(dpkg --print-architecture)
    MERGED_DIRS="bin sbin lib lib32 lib64 libo32 libx32"
    rm -Rf chdist repacked
    mkdir repacked
    # shellcheck disable=SC2086 # intentional word splitting
    chdist -d ./chdist create orig "$MIRROR" "$SUITE" $COMPONENTS
    chdist -d ./chdist apt-get orig update
    SED_PATTERN="s,^\\($(echo "$MERGED_DIRS" | sed 's/ /\\|/g')"'\).*/\(.*\),\2,p'
    for arch in "$ARCH" all; do
        for component in $COMPONENTS; do
            curl -s "$MIRROR/dists/$SUITE/$component/Contents-$arch.gz" | zcat | sed -n "$SED_PATTERN" | sort -u
        done
    done | xargs parallel fakeroot ./repackdeb.sh --
    env -C repacked dpkg-scanpackages . /dev/null | gzip -9 > repacked/Packages.gz

    createchroot.sh:

    #!/bin/sh
    TARGET=/dev/null
    SUITE=unstable
    MIRROR=http://deb.debian.org/debian
    # try passing:
    # --customize-hook="bash < /dev/tty >/dev/tty 2>&1"
    mmdebstrap --verbose --variant=apt \
        --hook-dir=/usr/share/mmdebstrap/hooks/maybe-merged-usr \
        "$SUITE" "$TARGET" \
        "deb [trusted=yes] copy://$(realpath ./repacked) ./" \
        "deb $MIRROR $SUITE main contrib non-free non-free-firmware" \
        "$@"

  • From Simon Richter@1:229/2 to Helmut Grohne on Wed May 3 20:40:01 2023
    XPost: linux.debian.devel
    From: sjr@debian.org

    Hi,

    On 03.05.23 19:19, Helmut Grohne wrote:

    What still applies here is that we can have usr-is-merged divert /usr/bin/dpkg-divert and have it automatically duplicate any possibly
    aliased diversion and then the diverter may Pre-Depends: usr-is-merged (>=...) to have its diversions duplicated. Of course, doing so will make usr-is-merged very hard to remove, but we have experience here from multiarch-support.

    For aliasing support in dpkg, that means we need a safe policy of
    dealing with diversions that conflict through aliasing that isn't
    "reject with error", because the magic dpkg-divert would always generate conflicts.

    One thing we need to check is whether diversions to the same target
    cause file conflicts -- I think they should.

    So if you divert

    /bin/foo -> /usr/bin/foo.dontdelete
    /usr/bin/foo -> /usr/bin/foo.dontdelete

    then a package containing /bin/foo and a package containing /usr/bin/foo
    now have a file conflict in dpkg. Not sure if that is a problem, or
    exactly the behaviour we want. Probably the latter, which would allow us
    to define a policy "if aliased paths are diverted, the diversion needs
    to match", which in turn would allow the conflict checker during alias registration to verify that the aliased diversions are not in conflict.
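
    To make the proposed policy concrete, here is a small sketch of such a check. It is not taken from any dpkg branch: the function name check_aliased_diversions, the input format (one "path target" pair per line, no whitespace in paths) and the decision to canonicalize the targets as well are assumptions made for this example. Package ownership of the diversions is ignored.

```sh
#!/bin/sh
# Read "path target" lines on stdin and fail if two diversions whose paths
# are aliases of each other point at different (canonicalized) targets.
check_aliased_diversions() {
	awk '
		# Map /bin/foo and friends to /usr/bin/foo; leave other paths alone.
		function canon(p) {
			if (p ~ /^\/(bin|sbin|lib|lib32|lib64|libo32|libx32)\//)
				return "/usr" p
			return p
		}
		{
			path = canon($1)
			target = canon($2)
			if (path in seen && seen[path] != target) {
				printf "conflict: %s -> %s vs %s\n", path, seen[path], target
				bad = 1
			}
			seen[path] = target
		}
		END { exit bad + 0 }
	'
}
```

    With the two diversions from the example above (/bin/foo and /usr/bin/foo, both to /usr/bin/foo.dontdelete) the check passes, because both entries canonicalize to the same path and target.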

    The diverted dpkg-divert would probably generate extra
    register/unregister calls as soon as dpkg-divert itself is aliasing
    aware, but all that does is generate warning messages about existing
    diversions being added again, or nonexistent diversions being deleted --
    these are harmless anyway, because maintainer scripts are supposed to be idempotent, and dpkg-divert supports that by not requiring scripts to
    check before they register/unregister.

    And of course, we can always draw the diversion card and have
    usr-is-merged divert /usr/bin/update-alternatives to have it
    canonicalize paths as required to be able to migrate alternatives in a
    sane way (from a consumer point of view).

    We get to draw this card exactly once, and any package that would need
    the same diversion would need to conflict with usr-is-merged, which
    would make it basically uninstallable.

    Simon

  • From Helmut Grohne@1:229/2 to Raphael Hertzog on Wed May 3 12:30:02 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Raphaël,

    On Wed, May 03, 2023 at 10:31:14AM +0200, Raphael Hertzog wrote:
    I don't know APT well enough to answer that question but from my point of view it's perfectly acceptable to document in the release notes that you
    need to upgrade dpkg first.

    Yes, this issue seems vaguely solvable one way or another. It also
    affects other approaches modifying dpkg in the very same way.

    Are you sure that we need anything for diversions except some documented policy on how to deal with it?

    Yes! There is a hard ordering constraint involved here. Failure to observe it results in unpack errors and/or file loss in much the same way.

    AFAIK the following sequence performs no filesystem changes and should
    be sufficient to move a diversion to its new location (I only consider the case of an upgrade, not of a new installation that should just work "normally" on the new location):

    dpkg-divert --package $package --remove /bin/foo --no-rename
    dpkg-divert --package $package --add /usr/bin/foo --divert /usr/bin/foo.diverted --no-rename

    This is insufficient. Either we modify dpkg to consider aliasing when
    managing diversions (i.e. Simon Richter's branch or DEP17) or there is a
    more complex ordering requirement involved:

    * We must not remove the aliased diversion (/bin/foo) before the
    diverted package has moved its files to the canonical location
    (/bin/foo -> /usr/bin/foo).
    * We must add the canonical diversion (/usr/bin/foo) before the
    diverted package update that moves its files to canonical locations
    can be unpacked.

    Say we currently have

    Package: diverter
    Version: 1
    Files: /bin/foo
    preinst: diverts /bin/foo

    Package: diverted
    Version: 1
    Files: /bin/foo

    We must first update the diverter.

    Package: diverter
    Version: 2
    Files: /usr/bin/foo
    preinst: diverts both /bin/foo and /usr/bin/foo

    Since we divert both locations, diverter can now deal with an old
    diverted and a canonicalized diverted.

    Package: diverted
    Version: 2
    Conflicts: diverter (<< 2~)
    Files: /usr/bin/foo

    At the time of unpacking the updated diverted, we must ensure that no
    diverter versioned 1 is unpacked. Breaks does not help here as it allows concurrent unpacks. Neither does Replaces since dpkg thinks that
    /bin/foo is different from /usr/bin/foo and thus no replacing happens.

    Package: diverter
    Version: 3
    Conflicts: diverted (<< 2~)
    Files: /usr/bin/foo
    preinst: diverts /usr/bin/foo

    When unpacking the updated diverter, we must ensure that no diverted
    version 1 is unpacked. Again, Breaks and Replaces do not suffice.
    Therefore an upgrade from stable to nextstable containing both diverter
    and diverted must temporarily remove either package, which is known to
    annoy apt.
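
    For the upgrade case only, the preinst step of diverter version 2 ("diverts both /bin/foo and /usr/bin/foo") might look roughly like the sketch below. The function name divert_both_locations, the DPKG_DIVERT variable and the ".diverted" suffix for both targets are assumptions for illustration. The --no-rename option is taken from the sequence quoted earlier in this message; whether a rename is needed on fresh installs is not addressed here.

```sh
#!/bin/sh
# DPKG_DIVERT can be set to "echo" for a dry run (assumed helper variable).
DPKG_DIVERT=${DPKG_DIVERT:-dpkg-divert}

# Register a diversion for both the aliased and the canonical location.
# $1: diverting package, $2: path relative to / and /usr, e.g. bin/foo
divert_both_locations() {
	package=$1
	name=$2
	for prefix in "" /usr; do
		$DPKG_DIVERT --package "$package" --add --no-rename \
			--divert "$prefix/$name.diverted" "$prefix/$name"
	done
}
```

    A preinst would call it as divert_both_locations diverter bin/foo. This does nothing about the ordering constraints themselves; the Conflicts declarations described above are still needed.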

    What still applies here is that we can have usr-is-merged divert
    /usr/bin/dpkg-divert and have it automatically duplicate any possibly
    aliased diversion and then the diverter may Pre-Depends: usr-is-merged
    (>=...) to have its diversions duplicated. Of course, doing so will make
    usr-is-merged very hard to remove, but we have experience here from
    multiarch-support.

    Hope this clarifies.

    The case of update-alternatives is likely more tricky. You already looked into it. That's a place where it will be harder to get things right
    without some changes.

    As detailed in
    https://lists.debian.org/debian-devel/2023/04/msg00169.html I believe
    that update-alternatives really are not tricky at all as long as we do
    not attempt to migrate them to canonical paths in any way. For instance, elvis-tiny needs to continue to name the editor alternative
    /bin/elvis-tiny even when it actually moves that file to /usr/bin. The
    reason that this does not hurt is that we never attempted to move
    alternatives (unlike regular files in packages).

    If we really want to migrate alternatives to canonical paths, we do get
    into the tricky area of preserving the user configuration and we also
    break custom scripts, ansible's community.general.alternatives, uses of puppet's alternatives modules and probably a lot more.

    And of course, we can always draw the diversion card and have
    usr-is-merged divert /usr/bin/update-alternatives to have it
    canonicalize paths as required to be able to migrate alternatives in a
    sane way (from a consumer point of view).

    Helmut

  • From Timo Röhling@1:229/2 to All on Fri May 5 11:40:01 2023
    XPost: linux.debian.devel
    From: roehling@debian.org

    Hi,

    * Simon Richter <sjr@debian.org> [2023-05-05 17:59]:
    - it is not an error to register a diversion for an alias of an
    existing diversion, provided the package and target matches, this is a
    no-op
    - it is not an error to unregister a diversion for an alias of a path
    that has been unregistered previously, that is a no-op as well

    How do you distinguish between aliased diversions and "real" ones?
    Because if you allow the registration of duplicate diversions, the
    following can happen:

    - Package A is installed, preinst creates a diversion
    - Package B is installed, preinst creates the same diversion
    - Package A is uninstalled, postrm removes the diversion

    Now package B has lost its diversion.


    Cheers
    Timo

    --
    ⢀⣴⠾⠻⢶⣦⠀ ╭────────────────────────────────────────────────────╮
    ⣾⠁⢠⠒⠀⣿⡁ │ Timo Röhling │
    ⢿⡄⠘⠷⠚⠋⠀ │ 9B03 EBB9 8300 DF97 C2B1 23BF CC8C 6BDD 1403 F4CA │
    ⠈⠳⣄⠀⠀⠀⠀ ╰────────────────────────────────────────────────────╯

  • From Helmut Grohne@1:229/2 to Simon Richter on Thu May 4 19:20:02 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Simon,

    On Thu, May 04, 2023 at 03:37:49AM +0900, Simon Richter wrote:
    For aliasing support in dpkg, that means we need a safe policy of dealing with diversions that conflict through aliasing that isn't "reject with error", because the magic dpkg-divert would always generate conflicts.

    I think we still have that misunderstanding I mentioned earlier, so let
    me try to resolve that again.

    From my point of view, the ultimate goal here should be moving all files
    to their canonical location and thereby make aliasing effects
    irrelevant. Do you confirm?

    As such, I do not see aliasing support in dpkg as complementing the
    forced file move approach with lots of workarounds such as diverting dpkg-divert. Rather, I see them as exclusive strategies. Each of these strategies has significant downsides. In combining the different
    strategies, we combine their downsides, but since their benefit is
    shared, we do not gain anything in return but increase the price to pay.
    Why should we do that?

    So when we discuss diverting dpkg-divert, I imply that we do not change
    the implementation of dpkg wrt. aliasing. So this branch of discussion
    that you raise here, seems irrelevant to me.

    On the flip side, if dpkg (and thus dpkg-divert) is to gain aliasing
    support, I see no reason (and benefit) to diverting dpkg-divert.

    Can you explain why you see combining these strategies as something
    worth exploring?

    then a package containing /bin/foo and a package containing /usr/bin/foo now have a file conflict in dpkg. Not sure if that is a problem, or exactly the

    This case already is prohibited by policy section 10.1. It can only
    happen as a temporary state during a file move (from / to /usr and from
    one package to another).

    behaviour we want. Probably the latter, which would allow us to define a policy "if aliased paths are diverted, the diversion needs to match", which in turn would allow the conflict checker during alias registration to verify that the aliased diversions are not in conflict.

    If we do not modify dpkg to improve aliasing support, then yes, such a
    scenario will require a Conflicts declaration or a different measure
    averting this problem.

    The diverted dpkg-divert would probably generate extra register/unregister calls as soon as dpkg-divert itself is aliasing aware, but all that does is generate warning messages about existing diversions being added again, or nonexistent diversions being deleted -- these are harmless anyway, because maintainer scripts are supposed to be idempotent, and dpkg-divert supports that by not requiring scripts to check before they register/unregister.

    Again, the premise seems unreasonable to me. Also note that such a
    diversion of dpkg-divert certainly is meant as a temporary measure
    facilitating the transition. I'd hope we could delete it in forky
    already and failing that thereafter.

    We get to draw this card exactly once, and any package that would need the same diversion would need to conflict with usr-is-merged, which would make
    it basically uninstallable.

    I don't think the case of packages wanting to divert update-alternatives
    is all that common. Please elaborate on the use case. Also note that
    this suggestion already is to be considered a plan B. My current
    understanding is that as long as we do not canonicalize alternatives at
    all, we don't run into problems with them. This kinda is ugly, but the
    number of affected packages is small.

    Helmut

  • From Simon Richter@1:229/2 to All on Fri May 5 14:00:01 2023
    XPost: linux.debian.devel
    From: sjr@debian.org
    To: debian-dpkg@lists.debian.org

    [SoupGate killed MIME-encoded file 00000000.ATT (1101 bytes)]

  • From Andreas Metzler@1:229/2 to sjr@debian.org on Fri May 5 18:40:01 2023
    XPost: linux.debian.devel
    From: ametzler@bebt.de

    On 2023-05-05 Simon Richter <sjr@debian.org> wrote:
    [...]
    My proposal would be to put the onus on the client registering the
    diversion:
    [...]
    - packages are encouraged to register both diversions

    Hello,

    That seems to be a rather ugly user interface ("There is dpkg-divert on Debian, but because of the usrmerge you need to invoke it twice to be
    sure"). Will we need to have a meta-transition years from now trying to
    get rid of the double diversions?

    cu Andreas

  • From Luca Boccassi@1:229/2 to Andreas Metzler on Sat May 6 00:40:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Fri, 5 May 2023 at 17:38, Andreas Metzler <ametzler@bebt.de> wrote:

    On 2023-05-05 Simon Richter <sjr@debian.org> wrote:
    [...]
    My proposal would be to put the onus on the client registering the diversion:
    [...]
    - packages are encouraged to register both diversions

    Hello,

    That seems to be a rather ugly user interface ("There is dpkg-divert on Debian, but because of the usrmerge you need to invoke it twice to be
    sure"). Will we need to have a meta-transition years from now trying to
    get rid of the double diversions?

    It is not the prettiest thing but it is a very clever solution.
    Perhaps it could be mitigated with an addendum that makes it optional,
    and to be used only when strictly needed, after all moving files
    within the same package is fine, it's only the combination of moving
    location _and_ package that causes problems. In other words:

    - every package is forcefully canonicalized as soon as trixie is open
    for business
    - the moratorium on moving files from bin/ sbin/ lib/ _and_ to other
    packages at the same time is maintained from bookworm till trixie, and
    will be lifted after trixie ships, and applies implicitly to all the
    ~2000 binary pkgs that are affected by the above
    - the moratorium can be bypassed by a maintainer if and only if the
    appropriate conflicts/replaces/diverts as discussed in this thread are
    put in place and left there for as long as needed (TBD whether this
    means for trixie's cycle or for trixie+1)

    In practice, I suspect that out of ~2000 packages shipping bin/ sbin/
    lib*/, only a small fraction would end up needing to further move
    files out to other packages, so the divert dance requirement can be
    restricted only to those. This way impact is minimized, required
    testing is smaller, and we get in the final state on day one of trixie
    dev cycle.
    Moving files between packages already requires busywork anyway, so a
    bit more won't hurt that much, especially if we figure out a way to
    provide the functionality with a dh addon or such to do the heavy
    lifting.

    Kind regards,
    Luca Boccassi

  • From Luca Boccassi@1:229/2 to Simon Richter on Sat May 6 14:30:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Sat, 6 May 2023 at 06:21, Simon Richter <sjr@debian.org> wrote:

    Hi,

    On 06.05.23 07:11, Luca Boccassi wrote:

    - every package is forcefully canonicalized as soon as trixie is open
    for business

    You will also need to ship at least

    - /lib -> usr/lib (on 32 bit)
    - /lib64 -> usr/lib64 (on 64 bit)

    as a symlink either in the libc-bin package or any other Essential
    package, to fulfill the requirement that unpacked Essential packages are
    operational.

    Sure, that doesn't sound problematic? We'll need the same for bin/ and
    sbin/ for at least a cycle as you already pointed out. Sounds like a
    job for base-files.

    In the far future I'd like for these details to be owned by image
    builders/launchers/setup processes rather than a package, but this can
    be discussed separately and independently, no need to be tied to this
    effort.

    Kind regards,
    Luca Boccassi

  • From Simon Richter@1:229/2 to Luca Boccassi on Sat May 6 17:20:02 2023
    XPost: linux.debian.devel
    From: sjr@debian.org
    To: debian-devel@lists.debian.org
    To: debian-dpkg@lists.debian.org

    [SoupGate killed MIME-encoded file 00000000.ATT (1031 bytes)]


  • From Simon Richter@1:229/2 to All on Sat May 6 07:30:01 2023
    XPost: linux.debian.devel
    From: sjr@debian.org
    To: debian-devel@lists.debian.org

    [SoupGate killed MIME-encoded file 00000000.ATT (423 bytes)]


  • From Luca Boccassi@1:229/2 to Simon Richter on Sat May 6 18:10:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Sat, 6 May 2023 at 16:11, Simon Richter <sjr@debian.org> wrote:

    Hi,

    On 06.05.23 21:28, Luca Boccassi wrote:

    [shipping usrmerge symlinks in packages]

    In the far future I'd like for these details to be owned by image builders/launchers/setup processes rather than a package, but this can
    be discussed separately and independently, no need to be tied to this effort.

    Ideally I'd like to have this information in a single package rather
    than distributed over ten different tools, especially as this is also
    release and platform dependent.

    If possible, I'd like to go back to the gold standard of
    - download Essential: yes packages and their dependencies
    - unpack them using dpkg --fsys-tarfile | tar -x
    - install over this directory with dpkg --root=... -i *.deb

    to get something that works as a container. Right now, that only works
    if I remove "init" from the package list, which is fair since a
    container doesn't need an init system anyway.

    The less an image builder needs to deviate from this approach, the
    better for our users.
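
    The unpack and install steps listed above can be written down as
    plain command lines. The following is a minimal sketch that only
    builds the strings and runs nothing; the function name and the `-C`
    target passed to tar are assumptions added for the example, and the
    download step is left out.

```python
import shlex


def bootstrap_commands(debs, root):
    """Return shell command lines for the unpack and install phases.

    debs is a list of .deb file names and root the target directory.
    Nothing is executed; the caller decides how to run the commands.
    """
    quoted_root = shlex.quote(root)
    # Phase 1: extract each package's filesystem tarball into the root.
    unpack = [
        "dpkg --fsys-tarfile {} | tar -x -C {}".format(shlex.quote(d), quoted_root)
        for d in debs
    ]
    # Phase 2: let dpkg install over the unpacked tree.
    install = "dpkg --root={} -i {}".format(
        quoted_root, " ".join(shlex.quote(d) for d in debs)
    )
    return unpack + [install]
```

    For phase 2 to succeed, the tree produced by phase 1 already has to
    be able to run maintainer scripts, which is where the aliasing
    symlinks come in.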

    To have a working system you need several more steps that are
    performed by the instantiator/image builder, such as providing working
    and populated proc/sys/dev, writable tmp/var, possibly etc. And it
    needs to be instantiated with user/password/ssh certs/locale/timezone.
    And if it needs to be bootable on baremetal/vm, it needs an ESP. And
    then if you have an ESP and want to run in a VM with SB, you'll need self-enrolling certs on first use or ensuring the 3rd party CA is
    provisioned. And then...

    You get the point. Going from a bunch of packages to a running system necessarily has many steps in between, some that are already done and
    taken for granted, for example when you say "works as a container" I'm
    pretty sure the "container" engine is taking care of at the very least proc/dev/sys for you, and it's just expected to work. bin -> usr/bin,
    sbin -> usr/sbin and lib -> usr/lib should get the same treatment: if
    they are not there, the invoked engine should prepare them. systemd
    and nspawn have been able to do this for a while now.
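
    What "the invoked engine should prepare them" could look like is
    sketched below. This is an illustration only, not how systemd or
    nspawn implement it; the function name and the alias table (limited
    to the three links named above) are assumptions.

```python
import os

# The three aliasing links named in the text; lib64 and friends are omitted.
ALIASES = {"bin": "usr/bin", "sbin": "usr/sbin", "lib": "usr/lib"}


def ensure_aliases(root, aliases=ALIASES):
    """Create missing top-level symlinks under root; return the names created."""
    created = []
    for name, target in aliases.items():
        link = os.path.join(root, name)
        if os.path.lexists(link):
            continue  # leave existing directories or links untouched
        # Make sure the link target exists so the link is not dangling.
        os.makedirs(os.path.join(root, target), exist_ok=True)
        os.symlink(target, link)  # relative target, valid inside a chroot
        created.append(name)
    return created
```

    The links are relative on purpose, so they keep resolving when the
    tree is entered as a chroot or mounted elsewhere.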

    Not having those hard coded means that the use case of / on a tmpfs
    with the rest instantiated on the fly, assembled with the vendor's
    /usr and /etc trees, becomes possible, which is neat. And said trees
    can pass the checksum/full integrity muster.

    Kind regards,
    Luca Boccassi

  • From Helmut Grohne@1:229/2 to Luca Boccassi on Sat May 6 21:00:02 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Luca,

    On Sat, May 06, 2023 at 04:52:30PM +0100, Luca Boccassi wrote:
    To have a working system you need several more steps that are
    performed by the instantiator/image builder, such as providing working
    and populated proc/sys/dev, writable tmp/var, possibly etc. And it
    needs to be instantiated with user/password/ssh certs/locale/timezone.
    And if it needs to be bootable on baremetal/vm, it needs an ESP. And
    then if you have an ESP and want to run in a VM with SB, you'll need
    self-enrolling certs on first use or ensuring the 3rd party CA is
    provisioned. And then...

    You paint it this way, but it really used to just work until we got the
    /usr-merge. Indeed, debvm creates virtual machine images effectively by
    bootstrapping a filesystem from packages and turning the resulting tree
    into a file system image.

    * /proc, /sys, /dev are mounted by systemd. All you need to do here is
    create the directories and base-files does so.
    * /tmp is shipped by base-files.
    * user and password creation is not handled yet, but can be handled by
    something similar to systemd-firstboot.
    * Not sure what you mean with certs, locale and timezone. You can just
    install ca-certificates, locales and tzdata as part of the bootstrap.
    * The bootloader part for baremetal is kinda out of scope for
    bootstrap, which is why debvm side-steps this. You can also skip it
    for containers and build chroots. So it is one out of multiple use
    cases that needs extra work here.

    In a good chunk of situations, you can just get by without messing
    around. Well that is until we broke it via usr-is-merged. I concur with
    Simon Richter, that restoring this property is a primary concern.

    You get the point. Going from a bunch of packages to a running system necessarily has many steps in between, some that are already done and
    taken for granted, for example when you say "works as a container" I'm
    pretty sure the "container" engine is taking care of at the very least proc/dev/sys for you, and it's just expected to work. bin -> usr/bin,
    sbin -> usr/sbin and lib -> usr/lib should get the same treatment: if
    they are not there, the invoked engine should prepare them. systemd
    and nspawn have been able to do this for a while now.

    No, this misses the point. You can configure essential in a very limited
    environment. However, you cannot do so without the lib or lib64 symlink
    (depending on the architecture) and the bin symlink. This is so
    critical, that it cannot be deferred to some external entity. It must be
    part of the bootstrap protocol. There are some suggested ways to fix
    this (such as adding separate bootstrap scripts next to maintainer
    scripts), but nothing implemented.
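
    The dependency described here can be made concrete with a rough
    pre-flight check: before configuring Essential packages, do /bin/sh
    and the dynamic linker resolve inside the unpacked tree? This is a
    sketch under assumptions, not part of any bootstrap tool: the
    function name is made up, and the interpreter path is the amd64 one
    (other architectures use different paths).

```python
import os

# Paths that must resolve before maintainer scripts can run.
# The interpreter path is the amd64 one and is an assumption of this sketch.
REQUIRED = ("/bin/sh", "/lib64/ld-linux-x86-64.so.2")


def unresolved(root, required=REQUIRED):
    """Return the required paths that do not resolve to a file inside root.

    Relative symlinks (bin -> usr/bin) are followed correctly. An
    absolute symlink would be resolved against the host filesystem, so
    this is only a rough check.
    """
    return [
        path
        for path in required
        if not os.path.exists(os.path.join(root, path.lstrip("/")))
    ]
```

    On a tree where all files were shipped in their canonical /usr
    locations and nothing created the top-level links, both paths are
    reported, which is the failure mode discussed above.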

    Not having those hard coded means that the use case of / on a tmpfs
    with the rest instantiated on the fly, assembled with the vendor's
    /usr and /etc trees, becomes possible, which is neat. And said trees
    can pass the checksum/full integrity muster.

    It's neat that you can solve your use case by breaking other people's
    use cases. This is not constructive interaction however. This kind of
    behaviour is precisely what caused so much conflict around the
    /usr-merge. What if I gave a shit for your use case? Denying the
    /usr-merge and just continuing unmerged as long as possible (as merging
    would break my use case) would be my strategy of choice. You can make a difference here by starting to recognize other people's use cases and
    proposing solutions in that merged world. And no, it's not "add duct
    tape to every bootstrap tool".

    So I really want to see a solution for the bootstrap protocol before
    moving the dynamic linker and /bin/sh to its canonical location. The
    current bootstrap protocol is kept on life-support by installing the
    usrmerge package by default. Dropping usrmerge from the
    init-system-helpers dependency as first alternative or moving the
    dynamic linker would break it. If I had a solution in mind, I'd
    definitely post it right here, but unfortunately I have not.

    Helmut

  • From Ansgar@1:229/2 to Helmut Grohne on Sun May 7 11:20:01 2023
    XPost: linux.debian.devel
    From: ansgar@43-1.org

    Hi,

    On Sun, 2023-05-07 at 07:50 +0200, Helmut Grohne wrote:
    But then, you only capture diversions inside Debian's ecosystem

    It's unreasonable to support stuff outside Debian's ecosystem: even
    basic dependency relations do not work for this.

    Debian's dependency system requires to explicitly declare Depends/Conflicts/Replaces/Breaks, but for obvious reasons we cannot do
    that for packages outside Debian's ecosystem.

    The same is true for diversions/alternatives/* or anything else
    requiring coordination among all users: the dpkg ecosystem has too many practical limitations to support non-Debian packages on anything but a
    "it might work" basis (which is usually "good enough"). (This is even
    true for packages within the Debian ecosystem, especially when one
    considers partially implemented features like multi-arch.)

    Is there any specific reason why specifically diversions are a problem
    where "it might work" is not sufficient? That is, why should we divert
    from the usual standard for dealing with packages outside the Debian
    ecosystem here?

    I also caution that we've started from a very simple approach and tried fixing it up to address the problems that we recognized in analyzing it.
    My impression is that we are not finished with our discovery here and
    won't be for quite some time.

    Well, we find limitations in dpkg that we in all other contexts usually
    ignore. If we used similar expectations in other cases, we would need
    to very much restrict when Breaks/Conflicts/Replaces might be used at
    all: it's totally unrealistic to list all (possibly local) packages
    that ship conflicting files, possibly only created by maintainer
    scripts. Or to explicitly list all reverse dependencies that might be
    broken by a particular change. We also would not have multi-arch yet as
    the dependency system doesn't support it fully (some of which is
    already known, but probably discovery isn't finished yet).

    (Of course in some cases explicitly listing reverse dependencies can be avoided: just always introduce something like

    Provides: ${foo}-compat (= 1)

    for *all* dependencies and forbid `>=` in `Depends`; this allows to
    stop providing that in cases where one would have to declare explicit
    `Breaks` before. But only the direct provider can use this, so it's
    already too limited... Alternatively forbid *all* changes that would
    require this, i.e., require stable interfaces. However we do not do
    this.)

    But for all these issues we just say "meh, you are out of luck".

    Ansgar

  • From Ansgar@1:229/2 to Helmut Grohne on Sun May 7 09:10:01 2023
    XPost: linux.debian.devel
    From: ansgar@43-1.org

    Hi,

    On Sun, 2023-05-07 at 07:50 +0200, Helmut Grohne wrote:
    But then, you only capture diversions inside Debian's ecosystem and miss
    out on other kinds of diversions such as local diversions. We currently support imposing local diversions on pretty much arbitrary files
    including unit files.

    No, we do not really support diversions. Once you divert a file, it
    is luck whether this works or not.

    Once you divert files, maintainer scripts and other parts would have to
    be aware of this and choose whether to use the original file included in
    the package or the diverted file.

    Not handling diversions can lead to files disappearing, data loss or
    other breakage, but it's very rare a package considers this.

    Ansgar

  • From Luca Boccassi@1:229/2 to Ansgar on Sun May 7 13:10:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Sun, 7 May 2023 at 10:14, Ansgar <ansgar@43-1.org> wrote:

    Hi,

    On Sun, 2023-05-07 at 07:50 +0200, Helmut Grohne wrote:
    But then, you only capture diversions inside Debian's ecosystem

    It's unreasonable to support stuff outside Debian's ecosystem: even
    basic dependency relations do not work for this.

    Debian's dependency system requires to explicitly declare Depends/Conflicts/Replaces/Breaks, but for obvious reasons we cannot do
    that for packages outside Debian's ecosystem.

    The same is true for diversions/alternatives/* or anything else
    requiring coordination among all users: the dpkg ecosystem has too many practical limitations to support non-Debian packages on anything but a
    "it might work" basis (which is usually "good enough"). (This is even
    true for packages within the Debian ecosystem, especially when one
    considers partially implemented features like multi-arch.)

    Is there any specific reason why specifically diversions are a problem
    where "it might work" is not sufficient? That is, why should we divert
    from the usual standard for dealing with packages outside the Debian ecosystem here?

    I also caution that we've started from a very simple approach and tried fixing it up to address the problems that we recognized in analyzing it.
    My impression is that we are not finished with our discovery here and
    won't be for quite some time.

    Well, we find limitations in dpkg that we in all other contexts usually ignore. If we used similar expectations in other cases, we would need
    to very much restrict when Breaks/Conflicts/Replaces might be used at
    all: it's totally unrealistic to list all (possibly local) packages
    that ship conflicting files, possibly only created by maintainer
    scripts. Or to explicitly list all reverse dependencies that might be
    broken by a particular change. We also would not have multi-arch yet as
    the dependency system doesn't support it fully (some of which is
    already known, but probably discovery isn't finished yet).

    (Of course in some cases explicitly listing reverse dependencies can be avoided: just always introduce something like

    Provides: ${foo}-compat (= 1)

    for *all* dependencies and forbid `>=` in `Depends`; this allows to
    stop providing that in cases where one would have to declare explicit `Breaks` before. But only the direct provider can use this, so it's
    already too limited... Alternatively forbid *all* changes that would
    require this, i.e., require stable interfaces. However we do not do
    this.)

    But for all these issues we just say "meh, you are out of luck".

    Indeed, if we don't worry about local random changes or random third
    party packages beyond documenting what needs attention, we shouldn't
    worry about them for this either. As already mentioned lots of third
    party packages don't even use Debian's toolchain to build packages, so
    there's nothing that we can do anyway.

    Kind regards,
    Luca Boccassi

  • From Luca Boccassi@1:229/2 to Luca Boccassi on Sun May 7 16:50:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Sun, 7 May 2023 at 12:51, Luca Boccassi <bluca@debian.org> wrote:

    On Sun, 7 May 2023 at 06:55, Helmut Grohne <helmut@subdivi.de> wrote:

    Hi Luca,

    On Sat, May 06, 2023 at 09:47:15PM +0100, Luca Boccassi wrote:
    Sure, there are some things that need special handling, as you have pointed out. What I meant is that I don't think we need special
    handling for _all_ affected packages. AFAIK nothing is using
    diversions for unit files or udev rules, for example (I mean if any package is, please point it out, because I would like a word...). I

    I've posted a list in https://lists.debian.org/20230428080516.GA203171@subdivi.de and indeed, udev rules are being diverted in one case.

    *fetching sledgehammer*

    Filed https://bugs.debian.org/1035667 with extreme prejudice, and MR
    up to fix it too.

    Kind regards,
    Luca Boccassi

  • From Simon Richter@1:229/2 to Ansgar on Mon May 8 05:00:01 2023
    XPost: linux.debian.devel
    From: sjr@debian.org

    Hi,

    On 5/7/23 18:14, Ansgar wrote:

    Is there any specific reason why specifically diversions are a problem
    where "it might work" is not sufficient? That is, why should we divert
    from the usual standard for dealing with packages outside the Debian ecosystem here?

    Locally created diversions are a supported feature, and the only way for
    admins to replace single files in a way that is safe for installing updates.

    Even within Debian, it is not sufficient to just coordinate uploads of
    packages that divert each others' files, because the new diversion needs
    to be in place before a newly-canonicalized package is unpacked, a
    Breaks relationship does not enforce that ordering, and while a
    Conflicts without a Replaces does, this adds a lot of constraints to the solver.

    Simon

  • From Luca Boccassi@1:229/2 to Simon Richter on Mon May 8 14:00:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Mon, 8 May 2023 at 03:57, Simon Richter <sjr@debian.org> wrote:

    Hi,

    On 5/7/23 18:14, Ansgar wrote:

    Is there any specific reason why specifically diversions are a problem where "it might work" is not sufficient? That is, why should we divert
    from the usual standard for dealing with packages outside the Debian ecosystem here?

    Locally created diversions are a supported feature, and the only way for admins to replace single files in a way that is safe for installing updates.

    Even within Debian, it is not sufficient to just coordinate uploads of packages that divert each others' files, because the new diversion needs
    to be in place before a newly-canonicalized package is unpacked, a
    Breaks relationship does not enforce that ordering, and while a
    Conflicts without a Replaces does, this adds a lot of constraints to the solver.

    Sure, they are supported in the sense that they can be enabled, and
    then you get to keep the pieces. We ship thousands of maintainer
    scripts, and I have never seen one that takes into account completely
    arbitrary and random possible local diversion, apart from dash for
    /bin/sh (and we are about to nuke most of it!), when
    moving/adjusting/fixing and whatnot. Do you have any such
    counter-example in mind?

    Kind regards,
    Luca Boccassi

  • From Sam Hartman@1:229/2 to All on Mon May 8 21:10:01 2023
    XPost: linux.debian.devel
    From: hartmans@debian.org

    "Helmut" == Helmut Grohne <helmut@subdivi.de> writes:

    Helmut> Hi Luca,
    Helmut> On Sun, May 07, 2023 at 12:51:21PM +0100, Luca Boccassi wrote:
    >> The local/external aspect is already covered in Ansgar's reply
    >> and subthread.

    Helmut> I hope that we can at least agree that we don't have
    Helmut> consensus on this view. And the more I think about it, the
    Helmut> more it becomes clear to me that this non-consensus is part
    Helmut> of the larger disagreement we have about this whole
    Helmut> transition. Do you see any way towards getting to common
    Helmut> ground here?

    As someone who has been following this, I support the work Helmut and
    Simon Richter have been doing.
    I have more confidence in that view than the one Luca is proposing.
    I also support Shawn's interpretation that being conservative here is
    good.

    I think even with my support we have no consensus. However hopefully we
    can get a few more people who have been reading the whole thread to
    chime in and a consensus will appear.

  • From Simon Richter@1:229/2 to Luca Boccassi on Mon May 8 14:20:01 2023
    XPost: linux.debian.devel
    From: sjr@debian.org

    Hi,

    On 5/8/23 20:38, Luca Boccassi wrote:

    [local diversions]

    Sure, they are supported in the sense that they can be enabled, and
    then you get to keep the pieces.

    They are supported in the sense that someone actually added an explicit
    flag for dpkg-divert for specifically this feature and documented it in
    the manual page as an interface.

    Maintainer scripts don't need to work around the admin installing
    arbitrary incompatible tools, because we generally expect admins to know
    what they are doing -- however requiring admins to perform multiple
    diversion registrations to have them count is a change of the interface.
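
    Why a single registration may stop being enough can be shown with a
    small model. It is a sketch only: it does not read dpkg's diversion
    database, the helper names are made up, and the list of aliased
    directories is an assumption.

```python
# Top-level directories assumed to be symlinks into /usr (illustrative list).
ALIASED = ("bin", "sbin", "lib", "lib32", "lib64", "libx32")


def equivalent_names(path):
    """Return every literal name that refers to the same file under merged /usr."""
    parts = path.strip("/").split("/")
    # Reduce /usr/bin/x to bin/x so both spellings are handled alike.
    if parts[0] == "usr" and len(parts) > 1 and parts[1] in ALIASED:
        parts = parts[1:]
    if parts[0] in ALIASED:
        short = "/" + "/".join(parts)
        return [short, "/usr" + short]
    return ["/" + "/".join(parts)]


def uncovered(diverted, shipped):
    """Return shipped paths that alias a diverted file but are not diverted.

    diverted holds the literal paths with a registered diversion,
    shipped the literal paths a package is about to unpack.
    """
    diverted = set(diverted)
    return [
        path
        for path in shipped
        if path not in diverted
        and any(name in diverted for name in equivalent_names(path))
    ]
```

    With a diversion registered for /bin/sh only, a package shipping
    /usr/bin/sh is reported as uncovered; registering both names, the
    "multiple diversion registrations" mentioned above, makes the
    report empty.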

    The /bin/sh diversion is a bit ...special. This should have been an
    alternative, but cannot be because /bin/sh needs to be functional after
    unpacking Essential packages, so maintainer scripts work during bootstrap.

    There is an interesting use case here as well, bootstrapping a foreign architecture, where we only unpack the Essential set, divert /sbin/init
    to /sbin/init.real and place a shell script as /sbin/init that completes
    the installation once the resulting tree is mounted as a root filesystem
    on the target platform. It has been a while since I used that, in
    principle this should even still work with systemd.

    Simon

  • From Luca Boccassi@1:229/2 to Helmut Grohne on Mon May 8 23:40:02 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    Will get to the rest later tonight, two quick points:

    On Mon, 8 May 2023 at 09:58, Helmut Grohne <helmut@subdivi.de> wrote:
    But the more I think about it, the more I am convinced that the
    default option working best for Debian is the one that matches the project's choice of a filesystem layout. After all, this is
    configurable in the toolchain for a reason.
    And the vast majority of the rest of the world has long since finished
    this transition, so I struggle to think where software built with this default wouldn't work. Bullseye will be oldoldstable at that point,
    and even that was default merged for new installations, and really old
    ones (oldoldoldoldstable at that point? I lost count) will be long
    EOL. I suppose they could still be around unmaintained, but who uses a toolchain from 8 years in the future to build software for an EOL distribution 8 years in the past? Normally it's the other way around,
    as even glibc adds new symbols and is not forward compatible.

    This seems somewhat convincing to me. Would you reach out to toolchain maintainers to discuss this as an early change after the release of
    bookworm?

    Have done so now via the gcc mailing list.

    On the ELF interpreter, as long as we can reasonably ensure it works,
    I do believe we should switch it, regardless of what we do with the symlinks, how we ship/add/build/package/create/manage them, as a
    desired final state. Again, we should make the default in Debian work
    for Debian. And given the default for Debian from Bookworm onward is
    that the loader is in /usr/lib/, it seems perfectly reasonable to me
    that software built for Debian and shipped in Debian should look
    there for it.

    I suppose that we've been confusing the different approaches here. The question of what links base-files should contain mostly arises if you
    start from the assumption that we do not modify the ELF interpreter
    location. Once changing its (and /bin/sh's) location, the question of
    how to install those symlinks can indeed be done in base-files.postinst
    or at some other place where dpkg doesn't have to know much about it
    indeed. Would you agree to examine the approach where we don't modify
    the ELF interpreter location in parallel as a backup plan?

    Yeah we definitely should do that. I think we should separate a bit
    long-term vs short-term on that front, as it will help reach a
    conclusion more quickly. I think that aspect is easy to revise, and
    shouldn't lock us in a particular position one way or the other.

    Kind regards,
    Luca Boccassi

  • From Sean Whitton@1:229/2 to Luca Boccassi on Mon May 8 20:10:01 2023
    XPost: linux.debian.devel
    From: spwhitton@spwhitton.name

    Hello,

    On Sun 07 May 2023 at 12:03PM +01, Luca Boccassi wrote:

    Sure, this is in the context of the ongoing discussion in the TC about revising their side of the advice.

    I think it's highly unlikely that we revise it rather than just reissue
    it, at the present time, because too many details are unsettled.

    Also, we shouldn't lose sight of the reason why this was issued in the
    first place: it is designed to stop a problem from happening, and that problem can only happen when both conditions are true. I can't read
    minds obviously, but I imagine that's the reason the RT advice was
    worded as it was.

    It's designed to stop as-yet-unknown problems happening, too.

    --
    Sean Whitton

  • From Luca Boccassi@1:229/2 to Sean Whitton on Tue May 9 03:00:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Mon, 8 May 2023 at 19:06, Sean Whitton <spwhitton@spwhitton.name> wrote:

    Hello,

    On Sun 07 May 2023 at 12:03PM +01, Luca Boccassi wrote:

    Sure, this is in the context of the ongoing discussion in the TC about revising their side of the advice.

    I think it's highly unlikely that we revise it rather than just reissue
    it, at the present time, because too many details are unsettled.

    Also, we shouldn't lose sight of the reason why this was issued in the first place: it is designed to stop a problem from happening, and that problem can only happen when both conditions are true. I can't read
    minds obviously, but I imagine that's the reason the RT advice was
    worded as it was.

    It's designed to stop as-yet-unknown problems happening, too.

    Well, sure, but we've been at this for years, any such problems should
    really be known by now. This is with Bookworm as it stands of course,
    when we go in and make more changes then we obviously have to be
    careful, but that's the entire reason this thread exists and is still
    going on.

    Kind regards,
    Luca Boccassi

  • From Luca Boccassi@1:229/2 to Helmut Grohne on Tue May 9 13:10:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Tue, 9 May 2023 at 05:12, Helmut Grohne <helmut@subdivi.de> wrote:

    Hi Luca,

    On Tue, May 09, 2023 at 01:56:53AM +0100, Luca Boccassi wrote:
    On Mon, 8 May 2023 at 19:06, Sean Whitton <spwhitton@spwhitton.name> wrote:
    It's designed to stop as-yet-unknown problems happening, too.

    Well, sure, but we've been at this for years, any such problems should really be known by now. This is with Bookworm as it stands of course,
    when we go in and make more changes then we obviously have to be
    careful, but that's the entire reason this thread exists and is still
    going on.

    This actually feels rather worrying to me. On one hand, you say that
    problems should be known. On the other hand, you proposed a simple
    transition with quite a number of problems that you apparently didn't
    see coming. Even relatively simple mechanisms, such as just repacking
    all the .debs to ship files in their canonical location and then trying
    to install them, revealed a dpkg unpack error in zutils. This
    combination of claiming that problems should be known while at the same
    time apparently not knowing them makes me uneasy to move forward here.

    So while I want to see the moratorium lifted, it all makes a lot more
    sense to me given what we've seen in this thread. The worst of outcomes
    I see here is the one where we cause problems that don't have a good
    solution as any way forward would break someone's use case (with
    someone's use case often being smooth upgrades in one way or another).
    It's those where we cannot move forward nor revert.

    No need to get worried! With "Bookworm as it stands" I mean that
    literally. What we ship in Bookworm has been stable for years, so
    while something new could be lurking somewhere, it seems vanishingly
    unlikely at this stage.

    Once we start changing things in Trixie, in whichever way we decide,
    that's of course all new and all bets are off.

    Kind regards,
    Luca Boccassi

  • From Helmut Grohne@1:229/2 to Luca Boccassi on Tue May 9 06:20:01 2023
    XPost: linux.debian.devel
    From: helmut@subdivi.de

    Hi Luca,

    On Tue, May 09, 2023 at 01:56:53AM +0100, Luca Boccassi wrote:
    On Mon, 8 May 2023 at 19:06, Sean Whitton <spwhitton@spwhitton.name> wrote:
    It's designed to stop as-yet-unknown problems happening, too.

    Well, sure, but we've been at this for years, any such problems should
    really be known by now. This is with Bookworm as it stands of course,
    when we go in and make more changes then we obviously have to be
    careful, but that's the entire reason this thread exists and is still
    going on.

    This actually feels rather worrying to me. On one hand, you say that
    problems should be known. On the other hand, you proposed a simple
    transition with quite a number of problems that you apparently didn't
    see coming. Even relatively simple mechanisms, such as just repacking
    all the .debs to ship files in their canonical location and then trying
    to install them, revealed a dpkg unpack error in zutils. This
    combination of claiming that problems should be known while at the same
    time apparently not knowing them makes me uneasy to move forward here.
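The repacking check described above can be illustrated with a minimal sketch. It is not the tooling used in this thread: the list of aliased directories, the function names and the package names are all assumptions made for illustration. It maps each shipped path to its canonical location under /usr and reports canonical paths claimed by more than one package.

```python
import posixpath

# Assumption: illustrative list of top-level directories that are symlinks
# into /usr on a merged-/usr system; the real set depends on the architecture.
ALIASED_TOP_LEVEL = ("bin", "sbin", "lib", "lib32", "lib64", "libx32")


def canonicalize(path: str) -> str:
    """Map an aliased path such as /bin/sh to its canonical /usr/bin/sh."""
    parts = posixpath.normpath(path).strip("/").split("/")
    if parts[0] in ALIASED_TOP_LEVEL:
        parts.insert(0, "usr")
    return "/" + "/".join(parts)


def find_alias_conflicts(packages: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return canonical paths that more than one package would ship."""
    owners: dict[str, set[str]] = {}
    for pkg, files in packages.items():
        for shipped in files:
            owners.setdefault(canonicalize(shipped), set()).add(pkg)
    return {path: sorted(pkgs) for path, pkgs in owners.items() if len(pkgs) > 1}


# Hypothetical package contents, not taken from the archive.
conflicts = find_alias_conflicts({
    "pkg-a": ["/bin/tool"],
    "pkg-b": ["/usr/bin/tool"],
    "pkg-c": ["/etc/c.conf"],
})
```

Two packages that never shared a filename on a split layout end up owning the same file once both paths are canonicalized, which is the kind of collision a repacking experiment can surface.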

    So while I want to see the moratorium lifted, it all makes a lot more
    sense to me given what we've seen in this thread. The worst of outcomes
    I see here is the one where we cause problems that don't have a good
    solution as any way forward would break someone's use case (with
    someone's use case often being smooth upgrades in one way or another).
    It's those where we cannot move forward nor revert.

    Helmut

  • From Ansgar@1:229/2 to Sean Whitton on Wed May 10 21:30:01 2023
    XPost: linux.debian.devel
    From: ansgar@43-1.org

    On Wed, 2023-05-10 at 08:35 -0700, Sean Whitton wrote:
    On Sun 07 May 2023 at 11:14AM +02, Ansgar wrote:
    Debian's dependency system requires to explicitly declare
    Depends/Conflicts/Replaces/Breaks, but for obvious reasons we cannot do
    that for packages outside Debian's ecosystem.

    The same is true for diversions/alternatives/* or anything else
    requiring coordination among all users: the dpkg ecosystem has too many
    practical limitations to support non-Debian packages on anything but a
    "it might work" basis (which is usually "good enough").  (This is even
    true for packages within the Debian ecosystem, especially when one
    considers partially implemented features like multi-arch.)

    I don't think this is the consensus view.

    So why do we allow changes that require listing all reverse
    dependencies in Breaks then? This is known to be wrong for all
    non-listed packages, e.g., all local/vendor/derivative-specific packages.

    Our derivatives are among our users, for example, and we care about
    them being able to add packages in appropriate ways.

    As far as I understand, we do explicitly *not* care about our
    derivatives with regard to merged-/usr as some packages in Debian
    recommend users to move *away* from merged-/usr to split-/usr on
    derivatives, i.e., to an unsupported fs layout.

    AFAIR the ctte felt that doing so on derivatives is fine for packages
    in Debian (w/o an explicit formal ruling).

    Ansgar

  • From Russ Allbery@1:229/2 to Ansgar on Wed May 10 23:00:01 2023
    XPost: linux.debian.devel
    From: rra@debian.org

    Ansgar <ansgar@43-1.org> writes:

    So why do we allow changes that require listing all reverse dependencies
    in Breaks then? This is known to be wrong for all non-listed packages,
    e.g., all local/vendor/derivative-specific packages.

    Because it's a balance; we don't want to stop making changes, and never
    making a backward-incompatible change doesn't work, so we do the best we
    can with the tools we have. However, if someone with an out-of-Debian
    package tells us that a change breaks it, historically we did add them to
    Breaks. We just don't have a good way of discovering this.

    As far as I understand, we do explicitly *not* care about our
    derivatives with regard to merged-/usr as some packages in Debian
    recommend users to move *away* from merged-/usr to split-/usr on
    derivatives, i.e., to an unsupported fs layout.

    Caring about them isn't the same thing as doing everything they want. We
    can both try to make things as smooth for them as possible and still make
    design decisions about Debian that they may disagree with or that may make
    some property they want to maintain difficult or impossible. It's the
    sort of decision we have to make on a case-by-case basis.

    --
    Russ Allbery (rra@debian.org) <https://www.eyrie.org/~eagle/>

  • From Ansgar@1:229/2 to Russ Allbery on Wed May 10 23:20:01 2023
    XPost: linux.debian.devel
    From: ansgar@43-1.org

    Hi Russ,

    On Wed, 2023-05-10 at 13:50 -0700, Russ Allbery wrote:
    Ansgar <ansgar@43-1.org> writes:
    As far as I understand, we do explicitly *not* care about our
    derivatives with regard to merged-/usr as some packages in Debian
    recommend users to move *away* from merged-/usr to split-/usr on
    derivatives, i.e., to an unsupported fs layout.

    Caring about them isn't the same thing as doing everything they want.  We
    can both try to make things as smooth for them as possible and still make
    design decisions about Debian that they may disagree with or that may make
    some property they want to maintain difficult or impossible.  It's the
    sort of decision we have to make on a case-by-case basis.

    Debian going out of its way to tell derivative users to switch back
    from merged-/usr to split-/usr is the *opposite* of trying to make
    things as smooth for them as possible.

    I asked the ctte to consider not telling derivative users to revert
    from merged-/usr and was told that "we [ctte] would not consider
    this [change] to be in line with our existing decisions" (informally).

    I take that as explicitly not caring that we break derivative users'
    systems.

    Ansgar

  • From Russ Allbery@1:229/2 to Ansgar on Wed May 10 23:40:02 2023
    XPost: linux.debian.devel
    From: rra@debian.org

    Ansgar <ansgar@43-1.org> writes:
    On Wed, 2023-05-10 at 13:50 -0700, Russ Allbery wrote:

    Caring about them isn't the same thing as doing everything they want. 
    We can both try to make things as smooth for them as possible and still
    make design decisions about Debian that they may disagree with or that
    may make some property they want to maintain difficult or impossible. 
    It's the sort of decision we have to make on a case-by-case basis.

    Debian going out of its way to tell derivative users to switch back from
    merged-/usr to split-/usr is the *opposite* of trying to make things as
    smooth for them as possible.

    Yes, I agree with that part and I think I objected to that at the time.
    Nonetheless, one bad decision doesn't mean that it is Debian policy that
    we don't care about derivatives or their users. I think we made a mistake
    there which is not in alignment with our ideals or our goals. We should
    try to reverse that mistake, not double down on it.

    --
    Russ Allbery (rra@debian.org) <https://www.eyrie.org/~eagle/>

  • From Sean Whitton@1:229/2 to Ansgar on Wed May 10 17:40:01 2023
    XPost: linux.debian.devel
    From: spwhitton@spwhitton.name

    Hello,

    On Sun 07 May 2023 at 11:14AM +02, Ansgar wrote:

    Debian's dependency system requires to explicitly declare
    Depends/Conflicts/Replaces/Breaks, but for obvious reasons we cannot do
    that for packages outside Debian's ecosystem.

    The same is true for diversions/alternatives/* or anything else
    requiring coordination among all users: the dpkg ecosystem has too many
    practical limitations to support non-Debian packages on anything but a
    "it might work" basis (which is usually "good enough"). (This is even
    true for packages within the Debian ecosystem, especially when one
    considers partially implemented features like multi-arch.)

    I don't think this is the consensus view.

    Our derivatives are among our users, for example, and we care about them
    being able to add packages in appropriate ways.

    Our policy documents and best practices contain various provisions with
    users' own packages in mind.

    --
    Sean Whitton

  • From Ansgar@1:229/2 to Russ Allbery on Fri May 26 08:10:01 2023
    XPost: linux.debian.devel
    From: ansgar@43-1.org

    Hi Russ,

    On Wed, 2023-05-10 at 14:36 -0700, Russ Allbery wrote:
    Ansgar <ansgar@43-1.org> writes:
    Debian going out of its way to tell derivative users to switch back from
    merged-/usr to split-/usr is the *opposite* of trying to make things as
    smooth for them as possible.

    Yes, I agree with that part and I think I objected to that at the time.
    Nonetheless, one bad decision doesn't mean that it is Debian policy that
    we don't care about derivatives or their users.  I think we made a mistake
    there which is not in alignment with our ideals or our goals.  We should
    try to reverse that mistake, not double down on it.

    My impression is that the tech-ctte disagrees on this point and would
    not want to reverse the mistake, but double down on it (in your words).

    Or rather my impression is that they would like to avoid any decision
    on the dpkg mess situation. (Though not making a decision when asked is
    of course also an explicit decision.)

    So let me summarize Debian's "official" position as I understand it: we
    do *NOT* care how dpkg's recommendations will break derivative
    installations at all; if systems become unbootable, cause data loss,
    ... now or in the future that is explicitly fine.

    Ansgar

  • From Luca Boccassi@1:229/2 to Matthew Vernon on Fri May 26 10:50:01 2023
    XPost: linux.debian.devel
    From: bluca@debian.org

    On Fri, 26 May 2023 at 08:39, Matthew Vernon <matthew@debian.org> wrote:

    Hi,

    On 26/05/2023 07:03, Ansgar wrote:
    On Wed, 2023-05-10 at 14:36 -0700, Russ Allbery wrote:
    Ansgar <ansgar@43-1.org> writes:
    Debian going out of its way to tell derivative users to switch back from
    merged-/usr to split-/usr is the *opposite* of trying to make things as
    smooth for them as possible.

    Yes, I agree with that part and I think I objected to that at the time.
    Nonetheless, one bad decision doesn't mean that it is Debian policy that
    we don't care about derivatives or their users. I think we made a mistake
    there which is not in alignment with our ideals or our goals. We should
    try to reverse that mistake, not double down on it.

    My impression is that the tech-ctte disagrees on this point and would
    not want to reverse the mistake, but double down on it (in your words).

    Your impression is incorrect. And assigning motivations to other parties
    during contentious discussions should be done with care if at all.

    Consider: it is consistent to believe that it would have been better for
    dpkg not to have had that warning added (quite some time ago now), but
    that by now most derivatives that care will likely have patched it out
    again (mitigating the harm); and if the current work on dpkg is allowed
    to run its course then the warning will probably go away anyway.

    That assumes all derivatives track unstable/testing and have taken
    action, but it is possible for derivatives to track stable only, and
    those would be broken.

    Kind regards,
    Luca Boccassi

  • From Ansgar@1:229/2 to Matthew Vernon on Fri May 26 10:10:02 2023
    XPost: linux.debian.devel
    From: ansgar@43-1.org

    On Fri, 2023-05-26 at 08:39 +0100, Matthew Vernon wrote:
    So let me summarize Debian's "official" position as I understand it: we
    do *NOT* care how dpkg's recommendations will break derivative
    installations at all; if systems become unbootable, cause data loss,
    ... now or in the future that is explicitly fine.

    This is also unhelpful (and incorrect).

    No, it is correct.

    We allow boot-critical parts to refer to files using either the path in
    / or /usr; on systems following the recommendations from dpkg's warning
    this might result in non-booting systems.
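The failure mode described above can be sketched as follows. This is only an illustration under stated assumptions: the directory list, the helper names and the example path are hypothetical, and no real package is being modelled. A reference through the / alias resolves on a merged-/usr system but not on a split-/usr system where the file exists only under /usr.

```python
import posixpath

# Assumption: illustrative list of top-level directories that merged-/usr
# replaces with symlinks into /usr.
ALIASED_TOP_LEVEL = ("bin", "sbin", "lib", "lib64")


def canonical(path: str) -> str:
    """Return the /usr location that an aliased path points to."""
    parts = posixpath.normpath(path).strip("/").split("/")
    if parts[0] in ALIASED_TOP_LEVEL:
        parts.insert(0, "usr")
    return "/" + "/".join(parts)


def resolves(reference: str, shipped: set[str], merged: bool) -> bool:
    """Report whether `reference` names a shipped file under the given layout."""
    if merged:
        # merged-/usr: both spellings name the same file.
        return canonical(reference) in {canonical(p) for p in shipped}
    # split-/usr: / and /usr are distinct trees, so only exact matches work.
    return posixpath.normpath(reference) in {posixpath.normpath(p) for p in shipped}


# Hypothetical boot-time helper shipped only under /usr.
shipped = {"/usr/lib/hypothetical/boot-helper"}
on_merged = resolves("/lib/hypothetical/boot-helper", shipped, merged=True)
on_split = resolves("/lib/hypothetical/boot-helper", shipped, merged=False)
```

Here `on_merged` is true and `on_split` is false, so the same reference works on one layout and fails on the other.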

    That is what we sign up to accept by having the warning in dpkg.

    Ansgar

  • From Matthew Vernon@1:229/2 to Ansgar on Fri May 26 09:50:01 2023
    XPost: linux.debian.devel
    From: matthew@debian.org

    Hi,

    On 26/05/2023 07:03, Ansgar wrote:
    On Wed, 2023-05-10 at 14:36 -0700, Russ Allbery wrote:
    Ansgar <ansgar@43-1.org> writes:
    Debian going out of its way to tell derivative users to switch back from
    merged-/usr to split-/usr is the *opposite* of trying to make things as
    smooth for them as possible.

    Yes, I agree with that part and I think I objected to that at the time.
    Nonetheless, one bad decision doesn't mean that it is Debian policy that
    we don't care about derivatives or their users.  I think we made a mistake
    there which is not in alignment with our ideals or our goals.  We should
    try to reverse that mistake, not double down on it.

    My impression is that the tech-ctte disagrees on this point and would
    not want to reverse the mistake, but double down on it (in your words).

    Your impression is incorrect. And assigning motivations to other parties
    during contentious discussions should be done with care if at all.

    Consider: it is consistent to believe that it would have been better for
    dpkg not to have had that warning added (quite some time ago now), but
    that by now most derivatives that care will likely have patched it out
    again (mitigating the harm); and if the current work on dpkg is allowed
    to run its course then the warning will probably go away anyway.

    Or rather my impression is that they would like to avoid any decision
    on the dpkg mess situation. (Though not making a decision when asked is
    of course also an explicit decision.)

    There is currently a pile of ongoing work and discussion about
    /usr-merge and dpkg (in -devel at least). It seems to me that the right
    thing to do is to see how that work pans out, and let the people doing
    that work do so in peace.

    So let me summarize Debian's "official" position as I understand it: we
    do *NOT* care how dpkg's recommendations will break derivative
    installations at all; if systems become unbootable, cause data loss,
    ... now or in the future that is explicitly fine.

    This is also unhelpful (and incorrect). I do not think the case has been
    made that it is urgent that we remove (or revise) the warning from dpkg
    Right Now; if you want to attempt to do so, please do so without
    impugning those who disagree with you.

    Regards,

    Matthew

  • From Matthew Vernon@1:229/2 to Luca Boccassi on Fri May 26 11:10:01 2023
    XPost: linux.debian.devel
    From: matthew@debian.org

    On 26/05/2023 09:24, Luca Boccassi wrote:
    On Fri, 26 May 2023 at 08:39, Matthew Vernon <matthew@debian.org> wrote:

    Consider: it is consistent to believe that it would have been better for
    dpkg not to have had that warning added (quite some time ago now), but
    that by now most derivatives that care will likely have patched it out
    again (mitigating the harm); and if the current work on dpkg is allowed
    to run its course then the warning will probably go away anyway.

    That assumes all derivatives track unstable/testing and have taken
    action, but it is possible for derivatives to track stable only, and
    those would be broken.

    I agree such distributions would be left with a confusing disagreement
    between the release notes "only /usr-merged systems are supported" and
    dpkg's warning. I agree this isn't ideal; but the release notes will
    mitigate the risk to such derivatives.

    And as I said up-thread (and I'm trying not to repeat myself too much),
    I'm not sure why this is suddenly urgent so late in the release cycle,
    nor that we wouldn't be better off working on fixing the issues around
    dpkg and /usr-merge (which some people are currently doing).

    Regards,

    Matthew
