Obviously, this would interfere with any meaningful reproducible builds testing for any package that did something like this. Ideally metadata
like this about a build should *not* be included in the .deb files themselves.
* output plaintext data to the build log
Some of these log files are large (>13MB? per architecture, per package build) and would greatly benefit from compression...
How large is too large for this approach to work?
Relatively simple to implement (at least for plain text logs), but potentially stores a lot of data on the buildd infrastructure...
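As a rough way to gauge how much compression might help with plain-text logs, here is a minimal sketch using Python's standard gzip module. The sample log is synthetic and far more repetitive than a real build log, so the ratio it prints is only illustrative; compression_ratio is a hypothetical helper, not part of any buildd tooling.

```python
import gzip


def compression_ratio(log_text: str) -> float:
    """Return compressed size divided by raw size for a log (smaller is better)."""
    raw = log_text.encode("utf-8")
    if not raw:
        return 1.0  # nothing to compress
    return len(gzip.compress(raw)) / len(raw)


# Synthetic stand-in for a test-suite log; real logs will compress less well.
sample_log = "PASS: test_case\n" * 10000
ratio = compression_ratio(sample_log)
print(f"compressed size is {ratio:.1%} of the original")
```

Running something like this over a few real logs would give a better answer to the "how large is too large" question than guessing.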
* Selectively filter out known unreproducible files
This adds complexity to the process of verification; you can't beat the simplicity of comparing checksums on two .deb files.
With increased complexity comes increased opportunity for errors, as
well as maintenance overhead.
RPM packages, for example, embed signatures in the packages, and these
need to be excluded for comparison.
I vaguely recall at least one case where something like this was attempted
in the past and resulted in packages incorrectly being reported as reproducible because the filter was overly broad...
Some nasty corner cases probably lurk down this approach...
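To make the "overly broad filter" risk concrete, here is a minimal sketch of a filtered comparison. It assumes the package contents have already been extracted into path -> bytes mappings; the paths, patterns, and function names are all hypothetical and this is not how any existing reproducibility tool works.

```python
import fnmatch
import hashlib


def digest_manifest(files, exclude_patterns=()):
    """Map each path to its SHA-256 digest, skipping excluded paths."""
    manifest = {}
    for path, content in files.items():
        if any(fnmatch.fnmatch(path, pat) for pat in exclude_patterns):
            continue
        manifest[path] = hashlib.sha256(content).hexdigest()
    return manifest


def same_after_filtering(build_a, build_b, exclude_patterns=()):
    """True if both builds match once excluded paths are ignored."""
    return (digest_manifest(build_a, exclude_patterns)
            == digest_manifest(build_b, exclude_patterns))


# Two hypothetical builds: the test log differs (expected), but so does the binary.
build_a = {"usr/bin/tool": b"binary-1",
           "usr/share/doc/tool/test.log": b"run at 10:00"}
build_b = {"usr/bin/tool": b"binary-2",
           "usr/share/doc/tool/test.log": b"run at 11:00"}

narrow = ["usr/share/doc/*/test.log"]   # excludes only the known-noisy log
too_broad = ["usr/*"]                   # accidentally excludes everything

print(same_after_filtering(build_a, build_b, narrow))     # False
print(same_after_filtering(build_a, build_b, too_broad))  # True
```

The second result is the failure mode: the binary really differs, but the broad pattern hides it and the package looks reproducible.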
* Split build metadata into a separate .deb file
This shares some of the problems of the previous option, though maybe it
is a little easier to get a reliable exclusion pattern? Wouldn't require huge
toolchain changes.
I would expect that such packages would not actually be depended on by any
other packages, and would *only* contain build metadata. Maybe named SOURCEPACKAGE-buildmetadata-unreproducible.deb ... or.... ?
Not beautiful or elegant, but maybe actually achievable for the bookworm
release cycle?
* Split build metadata into a separate file or archive
Some of the debian-installer packages generate tarballs that are not
.deb files and are included in the .changes files when uploading to the archive; making a similar generalized option for other packages to put
build metadata into a separate artifact might be a workable approach,
although this would presumably require toolchain changes in dpkg and dak
at the very least, and might take a couple release cycles, which
is... well, debian.
The possibility of bundling up .buildinfo files into this metadata too,
while requiring some changes in the relevant dpkg, dak, etc. tooling, might in
the long term be worth exploring.
There was a relevant bug report in launchpad:
https://bugs.launchpad.net/launchpad/+bug/1845159
This seems like the best long-term approach, but pretty much *only* a long-term approach...
Relatedly, I would like to be able to capture some information about
builds even if (perhaps especially if) the build fails,
so that failing builds can also produce artifacts, to help the
maintainer and/or porters to figure out why the build failed.
Handling build logs is not dak's job (and I don't think handling
things like the binutils test results should be dak's job either).
Here's a straw-man spec, which I have already prototyped in <https://salsa.debian.org/debian/sbuild/-/merge_requests/14>:
Simon McVittie wrote:
handling build logs is not dak's job (and I don't think handling
things like the binutils test results should be dak's job either).
It has always felt weird to me that build logs are entirely separate to
the archive off in a side service rather than first-class artefacts
that people occasionally need to look at. Also that the maintainer
build logs don't end up anywhere and are probably just deleted. I think
the same applies to the buildinfo files and also these test results
and other artefacts that are mentioned in this thread.
IIRC last time the build artefact discussion came up I was cycling
between having the artefact handling in the sbuild configs on the
buildds for quick implementation vs having it in debian/ dirs for
distributed maintenance by maintainers.
I think there is a fundamental question here that needs answering definitively: who is the audience for the artefact feature?
* Is it individual package maintainers who want test result details?
* Is it build tool maintainers who want data on tool use/failures?
* Is it porters who want more detailed logs in case of failure?
* Is it buildd maintainers for some reason?
* Is it RC bug fixers?
* Is it all of the above?
Once that is answered, then we can think about how to accommodate them,
and where the list(s?) of files are to be maintained:
* in wanna-build
* in sbuild
* in sbuild.conf in dsa-puppet
* in sbuild overrides on buildds
If the maintainers of dak (our eternally overworked ftp team) want to
pick up build logs as first-class artifacts produced by both failed
and successful builds, they're welcome to do so (and then handling my prototype of test artifacts would be a matter of adding another glob
pattern to be stored, for the tarball of artifacts that accompanies the
log); but I don't want to block on them doing that, because that seems
like a recipe for it never happening.
If you are trying to solve the problem "we cannot see into the logs of maintainer-built binaries that exist in the archive", I think a better
answer to that would be to stop letting maintainer-built binaries into the archive, as the release team are already pushing us towards. That way,
we don't have to worry about whether maintainers' build logs and/or test artifacts would be leaking personal or sensitive information that they
would prefer not to have shared.
I'm reasonably sure that the sbuild configuration is the wrong place
to specify what the artifacts are.
If the other groups get a benefit from this too, then that's a welcome
bonus, but I think solving it for individual package maintainers and
ignoring everyone else would be a net improvement.
Perhaps it would make sense to have a hybrid of what I prototyped, and something more like substvars:...
What I definitely want to avoid is a system that requires collecting
the artifacts imperatively rather than declaratively, e.g. converting
I think those are a non-starter: as a maintainer of an individual package,
I do not want to have to ask the Debian sysadmins' permission to collect
test results (or, worse, ask the sbuild maintainer's permission and then
wait 2 years for the change to be in a stable release).
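To illustrate what declarative collection could look like from the package side, here is a minimal sketch in which the maintainer only lists glob patterns and a generic collector packs the matches into a tarball. This is an assumption-laden illustration, not the format used by the sbuild merge request; collect_artifacts, the pattern list, and the file names are all hypothetical.

```python
import glob
import os
import tarfile
import tempfile


def collect_artifacts(build_dir, patterns, tarball_path):
    """Pack every regular file under build_dir matching any pattern into a tarball."""
    matched = set()
    for pattern in patterns:
        for path in glob.glob(os.path.join(build_dir, pattern), recursive=True):
            if os.path.isfile(path):
                matched.add(os.path.relpath(path, build_dir))
    with tarfile.open(tarball_path, "w:gz") as tar:
        for rel in sorted(matched):
            tar.add(os.path.join(build_dir, rel), arcname=rel)
    return sorted(matched)


# Demo with a throwaway build tree; the patterns stand in for a list that
# a maintainer might declare somewhere under debian/.
with tempfile.TemporaryDirectory() as tmp:
    build_dir = os.path.join(tmp, "build")
    os.makedirs(os.path.join(build_dir, "tests"))
    for name in ("tests/gcc.sum", "tests/gcc.log", "tests/scratch.o"):
        with open(os.path.join(build_dir, name), "w") as f:
            f.write("example\n")
    patterns = ["tests/*.sum", "tests/*.log"]
    collected = collect_artifacts(
        build_dir, patterns, os.path.join(tmp, "artifacts.tar.gz"))
    print(collected)
```

Because the collector runs the same way whether or not the build succeeded, a list like this would also cover the failed-build case without the package needing any imperative copy steps.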
I think part of being a do-ocracy is that if there isn't an important
reason for a small and usually overworked group to be in a position to
block other people's work, then we should avoid putting extra load on them.
Curious to hear your thoughts!