I am one of those who build a lot of different packages with different requirements, and I have found that picking a good parallel=... value in DEB_BUILD_OPTIONS is hard. Go too low and your build takes very long. Go
too high and you swap until the OOM killer terminates your build. (Using
choom is recommended in any case.)
I think this demonstrates that we probably have something between 10 and
50 packages in unstable that would benefit from a generic parallelism
limit based on available RAM. Do others agree that this is a problem
worth solving in a more general way?
For one thing, I propose extending debhelper to provide --min-ram-per-parallel-core as that seems to be the most common way to
do it. I've proposed https://salsa.debian.org/debian/debhelper/-/merge_requests/128
to this end.
Unfortunately, the affected packages tend to be not just big, but also
so special that they cannot use dh_auto_*. As a result, I also looked at another layer to support this and found /usr/share/dpkg/buildopts.mk,
which sets DEB_BUILD_OPTION_PARALLEL by parsing DEB_BUILD_OPTIONS. How
about extending this file with a mechanism to reduce parallelism? I am attaching a possible extension of it to this mail to see what you think. Guillem, is that something you would consider including in dpkg?
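To make the parsing step concrete, here is a minimal sketch in Python of extracting the parallel=N value from DEB_BUILD_OPTIONS. It is not the code in buildopts.mk, and the function name parse_parallel is made up for illustration. It assumes options are separated by whitespace, that the last parallel= token wins, and that a missing or malformed value means a serial build.

```python
import os


def parse_parallel(build_options: str) -> int:
    """Return N from a 'parallel=N' token, or 1 if absent or malformed.

    Assumptions (not taken from dpkg): tokens are whitespace-separated
    and the last valid 'parallel=' token wins.
    """
    jobs = 1
    for token in build_options.split():
        key, sep, value = token.partition("=")
        if key != "parallel" or not sep:
            continue
        # Accept plain ASCII digits only, and reject zero.
        if value.isascii() and value.isdigit() and int(value) > 0:
            jobs = int(value)
    return jobs


if __name__ == "__main__":
    print(parse_parallel(os.environ.get("DEB_BUILD_OPTIONS", "")))
```

Running it with DEB_BUILD_OPTIONS="nocheck parallel=8" in the environment prints the requested job count, which a RAM-based limit could then reduce.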
Are there other layers that could reasonably be used to implement a more general form of parallelism limiting based on system RAM? Ideally, we'd consolidate these implementations into fewer places.
As I am operating build daemons (outside Debian), I note that I have to
limit their cores below what is actually available to avoid OOM
kills, and even that is insufficient in some cases. By adopting such a mechanism, we could generally raise the core count per buildd and
consider OOM a problem of the package, to be fixed by applying a sensible parallelism limit.
Hi Guillem (2024.12.04_13:03:29_+0000)
> > Are there other layers that could reasonably be used to implement a more general form of parallelism limiting based on system RAM? Ideally, we'd consolidate these implementations into fewer places.
> I think adding this in dpkg-buildpackage itself would make most sense
> to me, where it is already deciding what amount of parallelism to use
> when specifying «auto» for example.
> Given that this would be an outside-in interface, I think this would
> imply declaring these parameters say as debian/control fields for example, or some other file to be parsed from the source tree.
I don't think this can be entirely outside-in: the package needs to say
how much RAM it needs per core, to be able to calculate the appropriate degree of parallelism. So we have to declare a value that then gets calculated against the proposed parallelism.
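The calculation described here can be sketched roughly as follows. This is only an illustration in Python, not anything dpkg or debhelper implements; the function name limit_parallel and the choice of MiB as the unit are assumptions. The package declares a per-job RAM requirement, the builder proposes a job count, and the result is the smaller of the proposal and what memory allows, but never less than one job.

```python
def limit_parallel(requested: int, available_mib: int, min_mib_per_job: int) -> int:
    """Clamp the requested job count to what the available RAM permits.

    requested        -- parallelism proposed by the builder
    available_mib    -- RAM the builder is willing to use, in MiB
    min_mib_per_job  -- RAM the package declares it needs per job, in MiB
    """
    if min_mib_per_job <= 0:
        # No declared requirement: keep the proposal (at least one job).
        return max(1, requested)
    by_memory = available_mib // min_mib_per_job
    # Never go below one job, otherwise the build could not run at all.
    return max(1, min(requested, by_memory))


if __name__ == "__main__":
    # 16 jobs proposed, 8 GiB available, 2 GiB declared per job.
    print(limit_parallel(16, 8192, 2048))
```

With those example numbers the build would run with 4 jobs instead of 16. A builder with plenty of RAM keeps its proposed value unchanged.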
On Thu, 2024-11-28 at 10:54:37 +0100, Helmut Grohne wrote:
> I think this demonstrates that we probably have something between 10 and
> 50 packages in unstable that would benefit from a generic parallelism
> limit based on available RAM. Do others agree that this is a problem
> worth solving in a more general way?
I think the general idea makes sense, yes.
> For one thing, I propose extending debhelper to provide --min-ram-per-parallel-core as that seems to be the most common way to
> do it. I've proposed https://salsa.debian.org/debian/debhelper/-/merge_requests/128
> to this end.
To me this looks too high in the stack (and too Linux-specific :).
I think adding this in dpkg-buildpackage itself would make most sense
to me, where it is already deciding what amount of parallelism to use
when specifying «auto» for example.
Given that this would be an outside-in interface, I think this would
imply declaring these parameters say as debian/control fields for example,
or some other file to be parsed from the source tree.
My main concerns would be:
* Portability.
* Whether this is a local property of the package (so that the
maintainer has the needed information to decide on a value, or
whether this depends on the builder's setup, or perhaps both).
* We might need a way to percolate these parameters to children of
the build/test system (as Paul has mentioned), where sometimes
you cannot specify this directly in the parent. Setting some
standardized environment variables would seem sufficient I think,
but while all this seems kind of optional, this goes a bit into
reliance on dpkg-buildpackage being the only supported build
entry point. :)
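As a rough illustration of the environment-variable idea in the last point, the sketch below passes a computed limit to a child process. The variable name EXAMPLE_PARALLEL_LIMIT is purely hypothetical; no such variable is defined by dpkg, and the helper names are made up for this example.

```python
import os
import subprocess
import sys

# Hypothetical variable name, used only for this illustration.
LIMIT_VAR = "EXAMPLE_PARALLEL_LIMIT"


def child_env(limit: int, base=None) -> dict:
    """Return a copy of the environment with the limit added."""
    env = dict(os.environ if base is None else base)
    env[LIMIT_VAR] = str(limit)
    return env


def run_child(limit: int) -> str:
    """Start a child process that reads the limit back from its environment."""
    code = f"import os; print(os.environ[{LIMIT_VAR!r}])"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=child_env(limit),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_child(4))
```

A nested build or test runner that cannot be given the limit on its command line could read such a variable instead, which is why the entry point that sets it matters.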
My thinking here was also about the general case: say, a system
that has many cores relative to its available memory, where each core
would get what we'd consider not enough memory (assuming, for
example, a baseline for what dpkg-deb might require, plus build helpers
and their interpreters, and what a compiler with, say, an empty C, C++
or similar file might need, etc.).
This could also imply, alternatively or in addition, providing a tool
or adding some querying logic to an existing tool (in the dpkg
toolset) to gather that information, which the packaging could use, or…
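For what such querying logic might look like, here is a small sketch in Python. It is not a dpkg tool. It assumes the sysconf names SC_PHYS_PAGES and SC_PAGE_SIZE are available, which is true on Linux and some other systems but is not guaranteed everywhere, and it reports total physical memory rather than memory currently free. When the information cannot be determined it returns None, so that a caller can fall back to not limiting parallelism.

```python
import os


def physical_memory_mib():
    """Return total physical memory in MiB, or None if it cannot be determined."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing, or the names are unknown or unsupported here.
        return None
    if pages <= 0 or page_size <= 0:
        return None
    return pages * page_size // (1024 * 1024)


if __name__ == "__main__":
    print(physical_memory_mib())
```

Dividing this figure by the number of cores gives the memory per core that could be compared against a baseline of the kind described above.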