• Including build metadata in packages

    From Vagrant Cascadian@21:1/5 to All on Sun Feb 13 23:20:01 2022
    A while ago I noticed binutils had some embedded logs in one of its
    packages, which included timing information about the test suite runs
    which will almost certainly have differences between the different
    builds, even on the exact same machine:

    https://bugs.debian.org/950585

    My proposed patch removed the timing information and various other
    things, but that was exactly the information wanted from these files, so
    it was not an appropriate patch.


    It also became known that other key toolchain packages (e.g. gcc) also
    embed similar log files in the .deb packages... I have since found a few
    other packages that do similar things:

    https://tests.reproducible-builds.org/debian/issues/unstable/test_suite_logs_issue.html


    Obviously, this would interfere with any meaningful reproducible builds
    testing for any package that did something like this. Ideally metadata
    like this about a build should *not* be included in the .deb files
    themselves.


    I'll try to summarize and detail a bit some of the proposed strategies
    for resolving this issue:


    * output plaintext data to the build log

    Some of these log files are large (>13MB? per architecture, per package
    build) and would greatly benefit from compression...

    How large is too large for this approach to work?

    Relatively simple to implement (at least for plain text logs), but
    potentially stores a lot of data on the buildd infrastructure...


    * Selectively filter out known unreproducible files

    This adds complexity to the process of verification; you can't beat the
    simplicity of comparing checksums on two .deb files.

    With increased complexity comes increased opportunity for errors, as
    well as maintenance overhead.

    RPM packages, for example, embed signatures in the packages, and these
    need to be excluded for comparison.

    I vaguely recall at least one case where attempting something like this
    in the past resulted in packages incorrectly being reported as
    reproducible when the filter was overly broad...

    Some nasty corner cases probably lurk down this approach...
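
    To illustrate the trade-off, here is a minimal Python sketch. It is not
    diffoscope or any existing Debian tool: the `digest` helper, the file
    names and the exclusion patterns are all made up for illustration, and a
    package is modelled as a plain mapping of path to contents. It shows how
    a narrow filter still catches a real difference, while an overly broad
    one reports two different builds as identical.

```python
import hashlib
from fnmatch import fnmatch


def digest(files, exclude=()):
    """Hash a {path: bytes} mapping, skipping paths matching any exclude pattern."""
    h = hashlib.sha256()
    for path in sorted(files):
        if any(fnmatch(path, pattern) for pattern in exclude):
            continue  # filtered out: differences here are never seen
        h.update(path.encode("utf-8") + b"\0")
        h.update(hashlib.sha256(files[path]).digest())
    return h.hexdigest()


# Two hypothetical builds: the log differs (expected), the binary differs (a real problem).
build_a = {"usr/bin/tool": b"v1", "usr/share/doc/pkg/test-summary.log": b"took 12s"}
build_b = {"usr/bin/tool": b"v2", "usr/share/doc/pkg/test-summary.log": b"took 15s"}

narrow = ["usr/share/doc/*/test-summary.log"]
broad = ["usr/*"]  # overly broad: hides everything

print("strict equal:", digest(build_a) == digest(build_b))
print("narrow filter equal:", digest(build_a, narrow) == digest(build_b, narrow))
print("broad filter equal:", digest(build_a, broad) == digest(build_b, broad))
```

    Note that fnmatch's "*" also matches "/", which is part of why a
    pattern that looks harmless can end up excluding far more than intended.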


    * Split build metadata into a separate .deb file

    Some of the same problems as the previous approach, though maybe a little
    easier to get a reliable exclusion pattern? Wouldn't require huge
    toolchain changes.

    I would expect that such packages not actually be depended on by any
    other packages, and *only* contain build metadata. Maybe named
    SOURCEPACKAGE-buildmetadata-unreproducible.deb ... or.... ?

    Not beautiful or elegant, but maybe actually achievable for bookworm
    release cycle?


    * Split build metadata into a separate file or archive

    Some of the debian-installer packages generate tarballs that are not
    .deb files and are included in the .changes files when uploading to the
    archive; making a similar generalized option for other packages to put
    build metadata into a separate artifact might be a workable approach,
    although this would presumably require toolchain changes in dpkg and dak
    at the very least, and might take a couple release cycles, which
    is... well, debian.

    The possibility of bundling up .buildinfo files into this metadata too,
    while taking some changes in relevant dpkg, dak, etc. tooling, might in
    the long term be worth exploring.

    There was a relevant bug report in launchpad:

    https://bugs.launchpad.net/launchpad/+bug/1845159

    This seems like the best long-term approach, but pretty much *only* a
    long-term approach...


    I'd really like to remove this hurdle to reproducible builds from some
    key packages like binutils and gcc, but also curious about a
    generalizable approach so each package needing something like this
    doesn't reinvent the wheel in incompatible ways...


    Curious to hear your thoughts!


    live well,
    vagrant

    p.s. please consider CCing me and/or
    reproducible-builds@lists.alioth.debian.org, as I'm not subscribed to
    debian-devel.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paul Wise@21:1/5 to Vagrant Cascadian on Mon Feb 14 12:40:01 2022
    On Sun, 2022-02-13 at 14:13 -0800, Vagrant Cascadian wrote:

    * Split build metadata into a separate file or archive

    Some of the debian-installer packages generate tarballs that are not
    .deb files and are included in the .changes files when uploading to
    the archive; making a similar generalized option for other packages to
    put build metadata into a separate artifact might be a workable approach,
    although this would presumably require toolchain changes in dpkg and
    dak at the very least, and might take a couple release cycles, which
    is... well, debian.

    I already sent a mail like this in the past, but...

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=950585#30

    This approach is already in use in the archive, but not
    yet for the kind of artefacts that you are talking about:

    https://codesearch.debian.net/search?perpkg=1&q=dpkg-distaddfile
    https://salsa.debian.org/ftp-team/dak/raw/master/config/debian/dak.conf (search for AutomaticByHandPackages)
    https://salsa.debian.org/ftp-team/dak/raw/master/daklib/upload.py (search for byhand_files)
    https://salsa.debian.org/ftp-team/dak/tree/master/scripts/debian/

    I think this would not require anything except a new dak config stanza
    for AutomaticByHandPackage and potentially a patch to dak code or a
    script. Seems unlikely it would require changes to anything other than
    dak plus the packages that want to opt in to using it, so should be
    completely doable within the bookworm release cycle.

    If you want to have some way to automatically download the data, then
    something like apt-file and Contents files could be used. I expect that
    could also be done for the bookworm release and put in
    bullseye-backports.

    You could even include all the build logs and build info in the same
    data set, and potentially modify the package build process so that
    build logs for maintainer built binaries end up there too.

    Something like this would be my suggested archive structure:

    Release -> Builds-amd64 -> foo_amd64.build
                          \ \-> foo_amd64.buildinfo
                           \--> foo_amd64.buildmeta.tar.xz

    Or since the buildinfo files are similar to Packages/Sources stanzas:

    Release -> BuildLogs-amd64 -> foo_amd64.build.xz
         \ \-> BuildInfo-amd64
          \--> BuildMeta-amd64 -> foo_amd64.buildmeta.tar.xz

    This could be in the main archive, or a separate debian-builds archive.

    --
    bye,
    pabs

    https://wiki.debian.org/PaulWise

  • From Simon McVittie@21:1/5 to Vagrant Cascadian on Wed Feb 16 14:50:01 2022
    On Sun, 13 Feb 2022 at 14:13:10 -0800, Vagrant Cascadian wrote:
    Obviously, this would interfere with any meaningful reproducible builds
    testing for any package that did something like this. Ideally metadata
    like this about a build should *not* be included in the .deb files
    themselves.

    Relatedly, I would like to be able to capture some information about
    builds even if (perhaps especially if) the build fails. It might make
    sense to combine that with what you're looking at. It doesn't seem
    ideal that for a successful build, the maintainer can recover detailed
    results via a .deb (at a significant reproducibility cost), but for a
    failing build - perhaps one that fails as a result of test regressions
    - they get no information other than what's in the log. If anything,
    these artifacts seem *more* important for failing builds.

    Some prior art for this:

    In any autopkgtest test-case, in addition to the machine-readable result
    (exit status, and depending on test flags, maybe also whether stderr is 0
    bytes or >= 1 byte), and the generic human-readable log (stdout/stderr),
    tests can drop arbitrary files into $AUTOPKGTEST_ARTIFACTS and they will
    be saved in a directory or tarball by the test infrastructure.
    ci.debian.net keeps test artifacts for a while and then discards them
    (they are not kept indefinitely).
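
    As a rough illustration of that mechanism from the test's side, here is
    a minimal Python sketch. AUTOPKGTEST_ARTIFACTS is the real environment
    variable mentioned above; the `save_artifact` helper and the file name
    in the usage comment are hypothetical.

```python
import os
import shutil
from pathlib import Path


def save_artifact(src, env=None):
    """Copy src into $AUTOPKGTEST_ARTIFACTS if set; return the new path or None."""
    env = os.environ if env is None else env
    target = env.get("AUTOPKGTEST_ARTIFACTS")
    if not target:
        return None  # not running under autopkgtest: nothing to do
    dest_dir = Path(target)
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir / Path(src).name))


# Hypothetical usage inside a test script, after a reftest has failed:
#     save_artifact("tests/reftests/button.actual.png")
```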

    Lots of upstream build systems output detailed test results in some
    way: for example Autotools test-suite.log, *.log and *.trs, Meson
    meson-logs/**, and various packages like librsvg and gtk4 that drop logs,
    images etc. into somewhere under the build directory and/or /tmp for later
    analysis if a test fails. At the moment, anything written to these places
    and not recorded in the build's stdout/stderr just gets thrown away.

    In at least librsvg and gtk4, there is Debian-specific code to grab the
    results of failing "reftests" (drawing the same thing in two different
    ways that should end up equivalent, and comparing the resulting PNG
    images for equality), uuencode them and output them into the log for later
    inspection: this is particularly important when a maintainer is assessing
    whether a reftest result is "close enough" (e.g. font hinting is off
    by a few pixels due to different rounding) or unacceptable (e.g. text
    is unreadable or in the wrong place). Getting this out via uuencode is
    practically annoying, but it's better than nothing...
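
    For readers who have not seen this workaround, the round trip looks
    roughly like the Python sketch below. It is not the actual librsvg or
    gtk4 packaging code: it uses base64 as a stand-in for uuencode, and the
    marker lines and the `embed`/`extract` helpers are invented for
    illustration.

```python
import base64
import re

# Hypothetical marker lines framing each embedded file in the build log.
BEGIN = "----8<---- begin artifact {name} ----"
END = "----8<---- end artifact {name} ----"

_BLOCK = re.compile(
    r"^----8<---- begin artifact (?P<name>.+?) ----\n"
    r"(?P<body>.*?)"
    r"^----8<---- end artifact (?P=name) ----$",
    re.S | re.M,
)


def embed(name, data):
    """Return log text carrying `data` as base64 between marker lines."""
    body = base64.encodebytes(data).decode("ascii")
    return f"{BEGIN.format(name=name)}\n{body}{END.format(name=name)}\n"


def extract(log):
    """Recover {name: bytes} for every embedded block found in the log text."""
    return {
        m["name"]: base64.decodebytes(m["body"].encode("ascii"))
        for m in _BLOCK.finditer(log)
    }
```

    Even in this tidy form the maintainer has to download the whole log and
    scrape it before they can look at a single image, which is the annoyance
    being described.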

    In Gitlab-CI, there's a simple, declarative way to ask Gitlab to save
    certain files from the CI job's container and store them in a zip file
    for later inspection. For example, for a Meson build this could look like:

    artifacts:
      when: always
      paths:
        - _build/meson-logs
        - _build/tests/reftests/*.expected.png
        - _build/tests/reftests/*.actual.png

    * output plaintext data to the build log

    Some of these log files are large (>13MB? per architecture, per package build) and would greatly benefit from compression...

    How large is too large for this approach to work?

    Relatively simple to implement (at least for plain text logs), but potentially stores a lot of data on the buildd infrastructure...

    This has the advantage that it can work equally well for failing and
    successful builds, and doesn't need any special support in either the
    buildd infrastructure or dak.

    For packages like gtk4 and librsvg that are quite visual, it would be
    very useful to be able to record images (that is, potentially quite
    large binary files) and not just text: uuencoding them is a workaround,
    but screen-scraping the logs to get the uuencoded binary PNGs out is
    not a great start to a debugging session. So far, I've been lucky and
    all the failing reftests have had relatively small output...

    * Selectively filter out known unreproducible files

    This adds complexity to the process of verification; you can't beat the
    simplicity of comparing checksums on two .deb files.

    With increased complexity comes increased opportunity for errors, as
    well as maintenance overhead.

    RPM packages, for example, embed signatures in the packages, and these
    need to be excluded for comparison.

    I vaguely recall at least one case where attempting something like this
    in the past resulted in packages incorrectly being reported as
    reproducible when the filter was overly broad...

    Some nasty corner cases probably lurk down this approach...

    A significant disadvantage of this approach is that it will only work
    for successful builds: you can't use it to record more information about
    a FTBFS caused by build-time test failures, unless you are willing to
    let packages with build-time test failures into the archive (at which
    point people will start using them or build-depending on them, which
    we don't really want for packages that have failed the QA checks that
    were meant to stop them from being shipped if they're broken/unusable,
    either on a particular architecture or in general).

    I personally don't like this: as you say, it's difficult to beat the
    simplicity of "if the content isn't identical then the package is unreproducible".

    * Split build metadata into a separate .deb file

    Some of the same problems as the previous approach, though maybe a little
    easier to get a reliable exclusion pattern? Wouldn't require huge
    toolchain changes.

    I would expect that such packages not actually be depended on by any
    other packages, and *only* contain build metadata. Maybe named
    SOURCEPACKAGE-buildmetadata-unreproducible.deb ... or.... ?

    Not beautiful or elegant, but maybe actually achievable for bookworm
    release cycle?

    This is not exactly elegant, but it wouldn't need infrastructural changes
    (we'd just have to define a naming convention for packages that are
    intentionally unreproducible).

    Another disadvantage of this approach is that it will only work for
    successful builds, as above.

    This will require each package that adopts it to go through NEW, unless
    the ftp team are willing to special-case packages that match the pattern
    to be accepted automatically (like they do for automatic -dbgsym packages).

    * Split build metadata into a separate file or archive

    Some of the debian-installer packages generate tarballs that are not
    .deb files and are included in the .changes files when uploading to the
    archive; making a similar generalized option for other packages to put
    build metadata into a separate artifact might be a workable approach,
    although this would presumably require toolchain changes in dpkg and dak
    at the very least, and might take a couple release cycles, which
    is... well, debian.

    The possibility of bundling up .buildinfo files into this metadata too,
    while taking some changes in relevant dpkg, dak, etc. tooling, might in
    the long term be worth exploring.

    There was a relevant bug report in launchpad:

    https://bugs.launchpad.net/launchpad/+bug/1845159

    This seems like the best long-term approach, but pretty much *only* a long-term approach...

    I think even if we do one of the other approaches as a stopgap, we'll
    want this in the long term.

    There are two approaches that could be taken to this. One is to use
    BYHAND, as Paul Wise already discussed. This would require action from the
    ftp team and dak (I think), but nothing special in sbuild or the buildd infrastructure.

    However, I'd prefer it if this was output from the build alongside the log, instead of being exported via the .changes file, so that failing builds
    can also produce artifacts, to help the maintainer and/or porters to
    figure out why the build failed. This would require action in sbuild and
    the buildd infrastructure, but not in dak, because handling build logs is
    not dak's job (and I don't think handling things like the binutils test
    results should be dak's job either).

    Here's a straw-man spec, which I have already prototyped in <https://salsa.debian.org/debian/sbuild/-/merge_requests/14>:

    Each whitespace-separated token in the Build-Artifacts field represents
    a filename pattern in the same simplified shell glob syntax used in
    "Machine-readable debian/copyright file", version 1.0.

    If the pattern matches a directory, its contents are included in
    the artifacts, recursively. If a pattern matches another file type,
    it is included in the artifacts as-is. If a pattern does not match
    anything, nothing is included in the artifacts: this may be diagnosed
    with a warning, but is not an error.

    If a pattern matches files outside the build directory, is an absolute
    path or contains ".." segments, build tools may exclude those files
    from the artifacts.

    Build tools should collect the artifacts that match the specified
    patterns, for example in a compressed tar archive, and make them
    available alongside the build log for inspection. The artifacts should
    usually be collected regardless of whether the build succeeds or fails.

    The Build-Artifacts field is not copied into the source package control
    file (dsc(5)), binary package control file (deb-control(5)),
    changes file (deb-changes(5)) or any other build results.

    (To prototype this without dpkg supporting it, X-Build-Artifacts would be appropriate, with none of the XS-, XB-, XC- prefixes.)
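
    To make the straw-man spec more concrete, here is a minimal Python
    sketch of the collection step. It is not the code from the sbuild merge
    request: `is_safe_pattern` and `collect_artifacts` are hypothetical
    names, and pathlib's globbing is only an approximation of the
    debian/copyright glob syntax (there "*" also matches "/", whereas here
    it does not). Patterns that are absolute or contain ".." segments are
    skipped, directories are included recursively, and patterns that match
    nothing are silently ignored.

```python
from __future__ import annotations

import tarfile
from pathlib import Path, PurePosixPath


def is_safe_pattern(pattern: str) -> bool:
    """Reject absolute paths and patterns containing '..' segments."""
    p = PurePosixPath(pattern)
    return not p.is_absolute() and ".." not in p.parts


def collect_artifacts(build_dir: Path, patterns: list[str], output: Path) -> list[str]:
    """Pack files matched by `patterns` under build_dir into a .tar.xz.

    Returns the sorted list of member names written to the archive.
    """
    build_dir = build_dir.resolve()
    members: set[Path] = set()
    for pattern in patterns:
        if not is_safe_pattern(pattern):
            continue  # the spec allows build tools to exclude these
        for match in build_dir.glob(pattern):
            if match.is_dir():
                # A matched directory contributes its contents, recursively.
                members.update(p for p in match.rglob("*") if p.is_file())
            else:
                members.add(match)
    names = []
    with tarfile.open(output, "w:xz") as tar:
        for path in sorted(members):
            name = path.relative_to(build_dir).as_posix()
            tar.add(path, arcname=name, recursive=False)
            names.append(name)
    return names
```

    A build tool would call something like this after the build finishes,
    whether it succeeded or failed, with the whitespace-separated tokens of
    the Build-Artifacts field as `patterns`.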

    For example, a package using Meson with recent debhelper versions would typically use:

    Build-Artifacts: obj-*/meson-logs

    or a package using recursive Automake might use:

    Build-Artifacts:
     config.log
     tests/*.log
     tests/*.trs
     tests/reftests/*.png

    Does that sound like what you had in mind?

    In practice, if a build currently produces a log file with a name like
    foo_1.2-3_amd64.build, I think it would make sense for it to produce an
    accompanying tarball foo_1.2-3_amd64-artifacts.tar.xz or similar. This
    could be pushed down into dpkg, but I think it might actually make more
    sense in sbuild and pbuilder.
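
    The naming convention suggested here is simple enough to express in a
    few lines of Python. This is only a sketch: `artifacts_name` is a
    hypothetical helper, and the "-artifacts.tar.xz" suffix is the "or
    similar" example from the paragraph above rather than a settled name.

```python
def artifacts_name(build_log_name: str) -> str:
    """Map e.g. 'foo_1.2-3_amd64.build' to 'foo_1.2-3_amd64-artifacts.tar.xz'."""
    suffix = ".build"
    if not build_log_name.endswith(suffix):
        raise ValueError(f"not a build log name: {build_log_name!r}")
    return build_log_name[: -len(suffix)] + "-artifacts.tar.xz"
```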

    To be useful on buildds, the buildd infrastructure would have to pick up
    these artifacts from sbuild and make them available for download alongside
    the build logs. I don't know the buildd infrastructure, so I'm not
    volunteering for that part.

    smcv

  • From Paul Wise@21:1/5 to Simon McVittie on Wed Feb 16 16:30:01 2022
    Simon McVittie wrote:

    Relatedly, I would like to be able to capture some information about
    builds even if (perhaps especially if) the build fails.

    That is a good point that I hadn't considered.

    so that failing builds can also produce artifacts, to help the
    maintainer and/or porters to figure out why the build failed.

    Agreed that this is useful.

    handling build logs is not dak's job (and I don't think handling
    things like the binutils test results should be dak's job either).

    It has always felt weird to me that build logs are entirely separate to
    the archive off in a side service rather than first-class artefacts
    that people occasionally need to look at. Also that the maintainer
    build logs don't end up anywhere and are probably just deleted. I think
    the same applies to the buildinfo files and also these test results
    and other artefacts that are mentioned in this thread.

    Here's a straw-man spec, which I have already prototyped in <https://salsa.debian.org/debian/sbuild/-/merge_requests/14>:

    This seems better than my proposal, modulo the above and also the
    reproducible builds effort's need for a way to distribute buildinfo files
    somehow.

    IIRC last time the build artefact discussion came up I was cycling
    between having the artefact handling in the sbuild configs on the
    buildds for quick implementation vs having it in debian/ dirs for
    distributed maintenance by maintainers.

    I think there is a fundamental question here that needs answering
    definitively: who is the audience for the artefact feature?

     * Is it individual package maintainers who want test result details?
     * Is it build tool maintainers who want data on tool use/failures?
     * Is it porters who want more detailed logs in case of failure?
     * Is it buildd maintainers for some reason?
     * Is it RC bug fixers?
     * Is it all of the above?

    Once that is answered, we can think about how and where the list(s?)
    of files are to be maintained:

     * in debian/
     * in build tools (meson, gcc etc)
     * in debhelper extensions
     * in debhelper
     * in wanna-build
     * in sbuild
     * in sbuild.conf in dsa-puppet
     * in sbuild overrides on buildds

    Some of the above will be faster to implement and some will be slower.
    The faster parts can possibly even make up for the slower parts, by for
    example doing the sbuild proposal in hooks until it is done in stable.

    Then there is the question of how the files get off the systems where
    builds happen (buildds, maintainer systems). Again, the faster/slower implementation implications exist here too.

    Then there is the question of how the files are further distributed
    from there and the question of how people access them.

    Then there is the question of whether any of the above will be
    implemented in a way that is useful solely to Debian, or in a more
    general way to all Debian or apt repository based distributions. Being
    able to publish build logs/artifacts seems like something other distros
    would be interested in. It sounds like at least the GCC maintainers
    want that for Ubuntu too at minimum.

    --
    bye,
    pabs

    https://wiki.debian.org/PaulWise

  • From Simon McVittie@21:1/5 to Paul Wise on Wed Feb 16 18:00:01 2022
    On Wed, 16 Feb 2022 at 23:25:46 +0800, Paul Wise wrote:
    Simon McVittie wrote:
    handling build logs is not dak's job (and I don't think handling
    things like the binutils test results should be dak's job either).

    It has always felt weird to me that build logs are entirely separate to
    the archive off in a side service rather than first-class artefacts
    that people occasionally need to look at. Also that the maintainer
    build logs don't end up anywhere and are probably just deleted. I think
    the same applies to the buildinfo files and also these test results
    and other artefacts that are mentioned in this thread.

    If the maintainers of dak (our eternally overworked ftp team) want to
    pick up build logs as first-class artifacts produced by both failed
    and successful builds, they're welcome to do so (and then handling my
    prototype of test artifacts would be a matter of adding another glob
    pattern to be stored, for the tarball of artifacts that accompanies the
    log); but I don't want to block on them doing that, because that seems
    like a recipe for it never happening.

    I am also not sure that it would be appropriate for dak to be doing
    any processing on *failed* builds, which currently fail and get diverted
    off into other code paths long before they get to dak.

    If you are trying to solve the problem "we cannot see into the logs of
    maintainer-built binaries that exist in the archive", I think a better
    answer to that would be to stop letting maintainer-built binaries into the
    archive, as the release team are already pushing us towards. That way,
    we don't have to worry about whether maintainers' build logs and/or test
    artifacts would be leaking personal or sensitive information that they
    would prefer not to have shared.

    IIRC last time the build artefact discussion came up I was cycling
    between having the artefact handling in the sbuild configs on the
    buildds for quick implementation vs having it in debian/ dirs for
    distributed maintenance by maintainers.

    I'm reasonably sure that the sbuild configuration is the wrong place
    to specify what the artifacts are, because the interesting artifacts
    depend on the build system (Autotools vs Meson vs etc.) and on how the
    package uses it (in-tree vs. out-of-tree build, single vs multiple builds,
    and so on), as well as on the package itself (for example GTK's ad-hoc mechanism to store reftest results as PNG files is entirely GTK-specific).
    This is something that the package maintainer already needs to know, so
    that they can debug failing builds locally.

    I tested my prototype with a Meson package, which has the advantage that
    it's very consistent: whatever your build directory is, it will have
    a meson-logs subdirectory and that's where all the logs are. However,
    even Meson is not always done identically: the most obvious example
    is that most Meson-built packages use the dh default build directory ./obj-${multiarch}, but if you do two builds (perhaps one for the .deb
    and one for the .udeb, like GLib does), you have to find somewhere else
    to put the second build.

    I think there is a fundamental question here that needs answering definitively: who is the audience for the artefact feature?

    * Is it individual package maintainers who want test result details?
    * Is it build tool maintainers who want data on tool use/failures?
    * Is it porters who want more detailed logs in case of failure?
    * Is it buildd maintainers for some reason?
    * Is it RC bug fixers?
    * Is it all of the above?

    As an individual package maintainer, I certainly want this feature.
    The exact artifacts that I want vary between packages, which is why
    I prototyped it as a new field in d/control.

    When toolchain packages like binutils and gcc collect their test
    results, I think that's also their maintainer acting as an individual
    package maintainer. Obviously they're very important core packages,
    but collecting their test results doesn't seem like it fundamentally
    differs from me wanting to collect GTK test results.

    If the other groups get a benefit from this too, then that's a welcome
    bonus, but I think solving it for individual package maintainers and
    ignoring everyone else would be a net improvement.

    Porters and RC bug fixers can benefit from this information in the
    same way package maintainers do; if they're looking at fixing a bug,
    they are going to have to change the package *anyway* (to apply the
    bug fix), so changing it to collect artifacts (if it doesn't already)
    doesn't seem like a huge cost.

    I am not aware of buildd maintainers having asked for more detailed
    logs. Indeed, buildd maintainers are in the unique position that they
    can run arbitrary privileged code on buildds, so they are in a better
    position to collect information from a half-built package than mere DDs,
    and presumably have less need for this feature.

    Build tool maintainers seem like the only one of the groups you've named
    that isn't necessarily well-served by my prototype: they don't want to
    modify everyone else's packages to get more information about how their
    build tool is working.

    Perhaps it would make sense to have a hybrid of what I prototyped, and something more like substvars:

    - the package maintainer can write a list of patterns into
      debian/build-artifacts (or a field in d/control, as in my prototype)
    - the package's build system (d/rules, debhelper or whatever) can write
      additional patterns into debian/extra-build-artifacts at runtime
    - anything listed in either or both places is collected into the
      -artifacts.tar.gz
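
    A minimal Python sketch of how a build tool might merge the two lists
    follows. The file names are the ones proposed in the list above; the
    whitespace-separated file format and the `read_patterns` and
    `merged_patterns` helpers are assumptions made for illustration, not
    part of any existing tool.

```python
from pathlib import Path


def read_patterns(path):
    """Return whitespace-separated patterns from a file, or [] if it is absent."""
    p = Path(path)
    if not p.exists():
        return []
    return p.read_text(encoding="utf-8").split()


def merged_patterns(debian_dir):
    """Combine maintainer-written and build-generated patterns, without duplicates."""
    seen = {}  # dict keeps insertion order, so the first occurrence wins
    for name in ("build-artifacts", "extra-build-artifacts"):
        for pattern in read_patterns(Path(debian_dir) / name):
            seen.setdefault(pattern, None)
    return list(seen)
```

    Either file may be missing, so a package that only uses the declarative
    list and one whose build system adds patterns at runtime are handled the
    same way.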

    What I definitely want to avoid is a system that requires collecting
    the artifacts imperatively rather than declaratively, e.g. converting

    dh_auto_test -- --parameters

    into

    if ! dh_auto_test -- --parameters; then \
        cp _build/meson-logs/* debian/build-artifacts/ || :; \
        cp tests/foo/bar/*.png debian/build-artifacts/ || :; \
        exit 1; \
    fi

    (with all the right makefile escaping) in every package that has its
    own ad-hoc artifacts. That scales really poorly, and conflicts quite
    badly with the philosophy of failing with a fatal error as soon as a sufficiently bad problem is seen.

    Once that is answered, we can think about how and where the list(s?)
    of files are to be maintained:
    ...
    * in wanna-build
    * in sbuild
    * in sbuild.conf in dsa-puppet
    * in sbuild overrides on buildds

    I think those are a non-starter: as a maintainer of an individual package,
    I do not want to have to ask the Debian sysadmins' permission to collect
    test results (or, worse, ask the sbuild maintainer's permission and then
    wait 2 years for the change to be in a stable release).

    However, if you think those people will genuinely want to use it, then
    it seems fine to have a sbuild option with the semantics "always behave
    as though the package's list of build artifacts had these extra patterns
    in it".

    I think part of being a do-ocracy is that if there isn't an important
    reason for a small and usually overworked group to be in a position to
    block other people's work, then we should avoid putting extra load on them.

    smcv

  • From Paul Wise@21:1/5 to Simon McVittie on Thu Feb 17 06:30:01 2022
    On Wed, 2022-02-16 at 16:51 +0000, Simon McVittie wrote:

    If the maintainers of dak (our eternally overworked ftp team) want to
    pick up build logs as first-class artifacts produced by both failed
    and successful builds, they're welcome to do so (and then handling my prototype of test artifacts would be a matter of adding another glob
    pattern to be stored, for the tarball of artifacts that accompanies the
    log); but I don't want to block on them doing that, because that seems
    like a recipe for it never happening.

    I have heard that they accept patches :)

    If you are trying to solve the problem "we cannot see into the logs of maintainer-built binaries that exist in the archive", I think a better
    answer to that would be to stop letting maintainer-built binaries into the archive, as the release team are already pushing us towards. That way,
    we don't have to worry about whether maintainers' build logs and/or test artifacts would be leaking personal or sensitive information that they
    would prefer not to have shared.

    There are always going to be non-buildd binaries in the archive, since
    Debian doesn't support autobuilding with non-default build profiles and
    even if we had that there will likely always be the need for packages
    to be manually bootstrapped.

    There is already a (merged?) dak patch for dropping maintainer-built
    binaries after NEW processing, so we are close to completing this.

    ISTR dropping all (not just NEW) maintainer-built binaries by default
    was decided to be unwanted and the NEW-only approach was preferred.
    Personally I wanted to drop all maintainer-built binaries by default,
    with perhaps a .changes field to opt in to accepting binaries.

    I'm reasonably sure that the sbuild configuration is the wrong place
    to specify what the artifacts are

    I think that completely depends on the audiences and which artefacts
    each of the audiences wants to look at. For some it will be.

    If the other groups get a benefit from this too, then that's a welcome
    bonus, but I think solving it for individual package maintainers and
    ignoring everyone else would be a net improvement.

    I think that package maintainers are indeed the primary audience for
    this feature but that the other audiences shouldn't be ignored.

    Perhaps it would make sense to have a hybrid of what I prototyped, and
    something more like substvars:
    ...
    What I definitely want to avoid is a system that requires collecting
    the artifacts imperatively rather than declaratively, e.g. converting

    Sounds good.

    I think those are a non-starter: as a maintainer of an individual package,
    I do not want to have to ask the Debian sysadmins' permission to collect
    test results (or, worse, ask the sbuild maintainer's permission and then
    wait 2 years for the change to be in a stable release).

    I'm saying we want all of the options, not just one of the options.

    I think part of being a do-ocracy is that if there isn't an important
    reason for a small and usually overworked group to be in a position to
    block other people's work, then we should avoid putting extra load on them.

    I think that working around groups like this often leads to suboptimal
    designs and a better approach is to get the design right and help those
    groups do the implementation work, leaving only the deployment to them.

    Anyway, I'm not in any of the audiences for this feature and I won't be
    doing any work on it, so I'll leave it up to others to determine the
    final design and implementation of the flow of build info/artifacts.

    --
    bye,
    pabs

    https://wiki.debian.org/PaulWise

  • From Holger Levsen@21:1/5 to Vagrant Cascadian on Sun Feb 20 03:00:01 2022
    On Sun, Feb 13, 2022 at 02:13:10PM -0800, Vagrant Cascadian wrote:
    Curious to hear your thoughts!

    I'd just like to make three rather general comments:

    a.) thanks for bringing this up here, Vagrant.
    b.) solving this seems to be a requirement for getting the build-essential
    package set reproducible in Debian (!).
    c.) solving this could include solving the problem(s) of distributing
    Debian .buildinfo files, some of which are collected here:
    https://wiki.debian.org/ReproducibleBuilds#Big_outstanding_issues

    IOW, this could become a *major* step towards reproducible Debian!


    --
    cheers,
    Holger

    ⢀⣴⠾⠻⢶⣦⠀
    ⣾⠁⢠⠒⠀⣿⡁ holger@(debian|reproducible-builds|layer-acht).org
    ⢿⡄⠘⠷⠚⠋⠀ OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
    ⠈⠳⣄

    The devel is in the details.

    -----BEGIN PGP SIGNATURE-----

    iQIzBAABCgAdFiEEuL9UE3sJ01zwJv6dCRq4VgaaqhwFAmIRnjYACgkQCRq4Vgaa qhzlWQ//SPylzuahaJzf0nKInNMiIJ2HOqhwY3rV1dVAtK2au37oQC6sYbKB/ENJ 9uEETf3j8jjE0VhZ2nQTouGKnuZeie2yy6sax4Qh7wWDv30KgXHIHmyKGyPb0B1Y d/ZWduitIm/YE+vgYQQo3WEecTrTxR1yjJzD8l3hoPRq0IOvVcY+FBMHxhK3UBZI XowMJIQjdK1JVuLPR7n+ytDEt022vHoYpBgUfLOUjm3Nj/WUkDDOiCTc3pU7O12j Mk27S4a8lw71zsU9U82VTJdhM1BH2ZSRD0N/MXZvNe4H2mN35og5WJWnLpGSg4MA iWIptIAuj1a7DrU30GNrsOEvPxKfISci0MTCkrCZDyGOAi3+Ni+n/dUa/6d5xUxx e/8E3N07DL7iWnzKIaf8eMdNj4J/7vg5LljWb27SLrIOCfiNe37eRAM01fGpKMG3 tSPy8Ov+cWBoa8Q8BNDfTFCLQbVBjTUuNIcoEEWr8gtiyVTxNyjwbbvG4Ij5jUtR RLjjcfBNwYogFTDs4UpnJt5lvxxD2eUGz+0Yeo04XVR9ilPSfs1xZXYm9hNGRSYv 6dNRp0BCGtM7ZvvQoDRL5m8fqWGHGf1LdB/C6sC7Q5WQAbSl2B2gDa8rCMtZk7AE 8ZJM7fv4rdzBvm46/8