• Bug#1103679: unblock: autopkgtest/5.49 (1/2)

    From Ian Jackson@21:1/5 to All on Sun Apr 20 17:20:01 2025
    XPost: linux.debian.devel.release

    Package: release.debian.org
    Severity: normal
    X-Debbugs-Cc: autopkgtest@packages.debian.org, Debian CI team <team+ci@tracker.debian.org>
    Control: affects -1 + src:autopkgtest
    User: release.debian.org@packages.debian.org
    Usertags: unblock

    Hi. (I'm not sure "unblock" is the right request type.)


    unblock autopkgtest/5.49


    Summary:

    Please would you arrange that src:autopkgtest 5.49 will migrate
    (when the 10 days are up) despite the apparent regression in the
    autopkgtest for src:surf?

    I have looked at the logs for the test, and I am confident that the
    apparent regression is not due to any change in the autopkgtest
    package. It looks like the tests are flaky (and the surf package is
    also possibly just broken).

    I hope it's right for me to file this request now rather than waiting
    for the 10 days to be up.


    Analysing the "regression" in more detail:

    Firstly, note that two versions of autopkgtest are involved in this
    scenario, in totally different roles. One is the autopkgtest that's
    used as a test runner in ci.debian.net. That autopkgtest is *not
    being updated* here. Presumably it will be updated to 5.49 in due
    course if 5.49 migrates to testing. The other is the autopkgtest from
    sid, the migration candidate, which is pulled in as a dependency
    because it's listed in the dependencies of one of surf's test cases.
    It is this latter autopkgtest which is impugned by the "regression".

    The surf package has three test cases, which I will call "command1",
    "command2" and "command3", in the order from d/t/control, as ci.d.n's
    autopkgtest calls them.

    "command1" passes, is marked superficial, and is a simple `cmp`.

    "command3" Depends on autopkgtest. This is because it uses
    autopkgtest's README.package-tests as test data. The test fails with
    an error message which seems to indicate that surf, or webkit, or
    webdriver, or something, didn't start up at all. The test is marked
    flaky, so this is not classed as a "regression".

    "command2" is *not* marked flaky, and fails in the britney-requested
    migration runs. But it does *not* Depend on autopkgtest. Indeed,
    looking at the test logs, autopkgtest isn't even installed! The error
    message is similar to the failure from "command3".

    Or to put it another way, if it weren't for the flaky "command3" test
    case, nothing would even have worried that a new autopkgtest might
    break surf - the test wouldn't have been triggered.
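
    To make the argument above concrete, here is a minimal sketch that
    tabulates, per test case, whether it depends on a given package and
    whether it is marked flaky. The control text is hypothetical: it is
    modelled only on the description above (the second and third
    commands are placeholders), and the real surf d/t/control may
    differ. The parser is deliberately simplistic and ignores deb822
    continuation lines.

```python
# Hypothetical d/t/control, modelled on the description in this mail.
# Only the `cmp` in the first test comes from the text; the other
# commands and the exact Depends lists are placeholders.
CONTROL = """\
Test-Command: cmp a b
Restrictions: superficial

Test-Command: ./hypothetical-test-2
Depends: surf

Test-Command: ./hypothetical-test-3
Depends: surf, autopkgtest
Restrictions: flaky
"""


def parse_stanzas(text):
    """Split deb822-style text into a list of {field: value} dicts.

    Simplification: continuation lines and comments are not handled.
    """
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            fields[key.strip().lower()] = value.strip()
        stanzas.append(fields)
    return stanzas


def summarise(text, package):
    """Name tests command1, command2, ... in file order and report,
    for each, whether it depends on `package` and how it is restricted."""
    rows = []
    for n, stanza in enumerate(parse_stanzas(text), start=1):
        # Keep only the package name, dropping any version constraint.
        deps = [d.strip().split()[0]
                for d in stanza.get("depends", "").split(",") if d.strip()]
        restrictions = stanza.get("restrictions", "").replace(",", " ").split()
        rows.append({
            "name": f"command{n}",
            "depends_on_package": package in deps,
            "flaky": "flaky" in restrictions,
            "superficial": "superficial" in restrictions,
        })
    return rows


if __name__ == "__main__":
    for row in summarise(CONTROL, "autopkgtest"):
        print(row)
```

    Under these assumptions, the only test that depends on autopkgtest
    is the one already marked flaky, and the non-flaky test that fails
    does not depend on it.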

    The "regression" appears only on riscv64. All of the changes in
    autopkgtest are arch-neutral.

    I have grepped the surf source code and it doesn't invoke autopkgtest.
    This is discussed some more in the bug report where Paul Gevers asked
    for the "command3" test to be marked flaky:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055639

    I did ask ci.d.n to retry the baseline, and to retry the failed test,
    but this didn't change the results. Even so, it seems impossible
    that this failure can be blamed on the new autopkgtest.

    I filed a bug about the "command2" test being flaky too:
    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1103356
    There hasn't been a response from the maintainer yet. Outside the
    freeze, this would eventually result in surf being removed from
    testing, but that will come too slowly now.

    Additionally, there is another bug against surf saying that it doesn't
    work at all:
    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054150

    I think there is probably something inherently flaky inside surf :-/.

    So, in summary: it seems clear that this "regression" is not real.


    What is in autopkgtest 5.49

    The autopkgtest that is currently in testing, 5.47, has (like most
    previous versions) some serious security problems. It is only
    suitable for use with highly-isolating virtualisation environments.

    This is mostly because it makes a number of important directories
    world-writeable. It also plays fast and loose with ownership and
    user-vs-root, and the autopkgtest-virt-null backend (which does not
    offer any isolation whatsoever) falsely advertises
    isolation-container.

    These things are all fixed in 5.48, with a NEWS entry added in 5.49.

    I think it is important to get this security update into trixie.

    There is some risk that the changes will break something. The design
    intent is to avoid operational breakage in existing containerised or
    virtualised deployments - the code looks for the isolation
    capabilities, and uses the world-writeable permissions and a relaxed
    attitude, when those capabilities are present.
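
    The design intent described above can be sketched roughly as
    follows. This is not the autopkgtest code: the function names, the
    concrete modes (sticky world-writeable versus 0o755) and the choice
    of which capabilities count as sufficiently isolating are all
    assumptions made for illustration. Only the capability names
    themselves are real autopkgtest capabilities.

```python
import os
import stat

# Assumption of this sketch: these two capabilities are the ones
# treated as "suitably isolating".
ISOLATING = {"isolation-container", "isolation-machine"}


def shared_dir_mode(capabilities):
    """Pick a directory mode from the testbed's advertised capabilities.

    World-writeable (with the sticky bit) only when the testbed is
    isolated; otherwise writable by the owner alone.
    """
    if ISOLATING & set(capabilities):
        return 0o1777
    return 0o755


def make_shared_dir(path, capabilities):
    """Create `path` with the chosen mode and return the resulting mode."""
    os.mkdir(path)
    # chmod explicitly, so that the result does not depend on the umask.
    os.chmod(path, shared_dir_mode(capabilities))
    return stat.S_IMODE(os.stat(path).st_mode)
```

    With a backend that advertises no isolation capability at all, this
    sketch falls back to the restrictive mode, which is the behaviour
    the fix is aiming for in non-isolated environments.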

    This is tested in the extensive test suite - the salsa CI tests are
    quite comprehensive. There are autopkgtests too.

    So I'm hoping that there won't be any significant fallout.
    Nevertheless, if there is, I think we want to detect it early.

    The alternatives to migrating this to trixie now are to delay these
    critical security fixes for another ~two years, or to try to do them
    as a stable release update or security update. Both of these are
    quite unpalatable.

    I say "critical security fixes", because right now anyone who runs
    autopkgtest from their own normal development environment, and is
    using anything other than full container-based virtualisation, is
    exposing their main account to all other uids on their system (for
    example, uids used for privilege separation, or maybe, container uid
    namespaces).


    There is additionally one small bugfix, to install a missing file.

    The changes in question are in the changelog in detail, and summarised
    in NEWS.Debian.


    Thanks for your attention,
    Ian.


    diff --git a/Makefile b/Makefile
    index 13ef15b6..ba9ab010 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -97,6 +97,7 @@ install:
    $(INSTALL_DATA) $(rstfiles) $(htmlfiles) $(docdir)
    $(INSTALL_PROG) lib/*.sh $(datadir)/lib/
    $(INSTALL_PROG) lib/in-testbed/*.sh $(datadir)/lib/in-testbed/
    + $(INSTALL_PROG) lib/arch-is-concerned.pl $(datadir)/lib/
    $(INSTALL_PROG) lib/parse-deps.pl $(datadir)/lib/
    $(INSTALL_PROG) lib/unshare-helper $(datadir)/lib/
    $(INSTALL_PROG) setup-commands/*[!~] $(datadir)/setup-commands
    diff --git a/debian/NEWS b/debian/NEWS
    index 2fe8d5ed..920b4a37 100644
    --- a/debian/NEWS
    +++ b/debian/NEWS
    @@ -1,3 +1,25 @@
    +autopkgtest (5.48) unstable; urgency=medium
    +
    + Security fixes involving many permissions changes:
    + * Several directories are now world-writeable only when used with
    + suitably isolating virtualisation servers. Previously they were
    + world-writeable in circumstances where this wasn't safe.
    + * autopkgtest-virt-null no longer advertises isolation-machine.
    + This was a dangerous lie. For environments where the call