• lintian worker on Debian infra ; future of lintian.debian.org

    From Lucas Nussbaum@21:1/5 to All on Wed Jul 6 22:00:02 2022
    Dear DSA,

    For context, quoting <YsGezYimnj9HwY+D@xanadu.blop.info>:

    Seeing that lintian got adopted, I got motivated into looking if I could
    help on the lintian.d.o side, that is, provide up-to-date archive-wide
    lintian results to developers.

    Since the architecture of lintian.d.o seemed quite complicated, I
    instead decided to follow what worked for other UDD-based data
    importers (such as the one that scans for new upstream versions). So
    my plan is the following:
    - use a UDD postgresql table for data storage
    - use UDD to decide which packages need to be analyzed
    - coordinate the analysis from UDD, but do the analysis itself on a
    third-party 'worker' machine (since the process is quite CPU intensive)
    - provide visualisation directly on https://udd.debian.org (similar
    to https://udd.debian.org/dmd/ or https://udd.debian.org/bugs/)
    - work with data consumers on how to best export the data from UDD to
    them
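
    The plan above can be sketched roughly as follows. This is only an
    illustration under stated assumptions, not the actual UDD code: the
    names Package, pending and ssh_command, the shape of the results
    mapping, and the host name and file path in the usage lines are all
    hypothetical. It shows the orchestrator deciding which packages need
    analysis (no stored result for that package version, or a result
    produced by a different lintian version) and building the SSH command
    that would run lintian on the worker.

```python
import shlex
from typing import Dict, List, NamedTuple, Tuple


class Package(NamedTuple):
    """A source package as the orchestrator might see it (hypothetical shape)."""
    source: str
    version: str


def pending(archive: List[Package],
            results: Dict[Tuple[str, str], str],
            lintian_version: str) -> List[Package]:
    """Return packages that still need a lintian run.

    `results` maps (source, version) to the lintian version that produced
    the stored result. A package is pending when it has no stored result
    or when the result comes from another lintian version.
    """
    return [p for p in archive
            if results.get((p.source, p.version)) != lintian_version]


def ssh_command(worker: str, remote_path: str) -> List[str]:
    """Build the argv that runs lintian on the worker over SSH.

    The remote command is quoted because ssh hands it to a shell on the
    remote side. BatchMode=yes makes ssh fail instead of prompting.
    """
    remote = shlex.join(["lintian", remote_path])
    return ["ssh", "-o", "BatchMode=yes", worker, remote]


if __name__ == "__main__":
    archive = [Package("hello", "2.10-2"), Package("zlib", "1:1.2.11-5")]
    results = {("hello", "2.10-2"): "2.115.1"}
    for pkg in pending(archive, results, "2.115.1"):
        # Hypothetical worker host and path, for illustration only.
        print(ssh_command("worker.example.org", f"/tmp/{pkg.source}.dsc"))
```

    Keeping the decision logic on the orchestrator side means the worker
    holds no state of its own: it only needs lintian installed and an SSH
    account.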

    I know it feels a bit like NIH, but I believe the simpler design will
    help in the long term...

    The current status is that my code works and is currently finishing its
    first scan of all packages in Debian unstable or experimental. The data
    is made available using https://udd.debian.org/lintian/

    There are two topics I'd like to discuss with you:

    1/ Moving the worker node to Debian infra

    Currently the worker node (that runs lintian) is an AWS VM. It would
    be nice to move it to Debian infra instead. Requirements are:
    - bullseye VM where lintian from bullseye-backports can be installed.
    Ideally, you would then auto-upgrade it from backports when a new
    version gets released. Or someone can ping you to do it.
    - the orchestrator (on the UDD VM) connects using SSH to the worker
    node.
    - technical specs: running lintian is mainly CPU intensive, and requires
    some disk space to store the temporary data. The AWS VM has 8 cores,
    32 GB RAM, 100 GB disk.

    2/ Future of lintian.debian.org.

    It is currently not actively maintained, and the data on it is stale. We
    can:
    - keep it like that until someone decides to adopt it (but the fact that
    the data is stale is a bit misleading)
    - shut it down or redirect it to https://udd.debian.org/lintian/ or
    somewhere else.
    I don't have a strong opinion.

    Lucas

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julien Cristau@21:1/5 to Lucas Nussbaum on Wed Aug 31 19:20:01 2022
    On Wed, Jul 6, 2022 at 21:56:16 +0200, Lucas Nussbaum wrote:

    1/ Moving the worker node to Debian infra

    Currently the worker node (that runs lintian) is an AWS VM. It would
    be nice to move it to Debian infra instead. Requirements are:
    - bullseye VM where lintian from bullseye-backports can be installed.
    Ideally, you would then auto-upgrade it from backports when a new
    version gets released. Or someone can ping you to do it.
    - the orchestrator (on the UDD VM) connects using SSH to the worker
    node.
    - technical specs: running lintian is mainly CPU intensive, and requires
    some disk space to store the temporary data. The AWS VM has 8 cores,
    32 GB RAM, 100 GB disk.

    That is probably feasible, although I'm curious why this is preferred
    over a cloud VM, or set of cloud VMs (in a debian-owned account) that
    can get spawned as needed instead of a static host?

    2/ Future of lintian.debian.org.

    It is currently not actively maintained, and the data on it is stale. We
    can:
    - keep it like that until someone decides to adopt it (but the fact that
    the data is stale is a bit misleading)
    - shut it down or redirect it to https://udd.debian.org/lintian/ or
    somewhere else.
    I don't have a strong opinion.

    I think we should very much not keep it stale, it's been that way way
    too long already. I'd lean towards shutting it down but a redirect
    would also be OK IMO.

    Cheers,
    Julien

  • From Lucas Nussbaum@21:1/5 to Lucas Nussbaum on Wed Nov 2 10:00:01 2022
    On 02/11/22 at 09:50 +0100, Lucas Nussbaum wrote:
    I think we should very much not keep it stale, it's been that way way
    too long already. I'd lean towards shutting it down but a redirect
    would also be OK IMO.

    I started a wiki page to document how to transition from lintian.d.o to
    the UDD implementation. That could be useful if you turn lintian.d.o
    into a static page saying it has been shut down, and want to point
    somewhere to help people transition.

    https://wiki.debian.org/UltimateDebianDatabase/ReplacingLintianDebianOrg

    Lucas

  • From Lucas Nussbaum@21:1/5 to Julien Cristau on Wed Nov 2 10:00:01 2022
    Hi

    On 31/08/22 at 19:04 +0200, Julien Cristau wrote:
    On Wed, Jul 6, 2022 at 21:56:16 +0200, Lucas Nussbaum wrote:

    1/ Moving the worker node to Debian infra

    Currently the worker node (that runs lintian) is an AWS VM. It would
    be nice to move it to Debian infra instead. Requirements are:
    - bullseye VM where lintian from bullseye-backports can be installed.
    Ideally, you would then auto-upgrade it from backports when a new
    version gets released. Or someone can ping you to do it.
    - the orchestrator (on the UDD VM) connects using SSH to the worker
    node.
    - technical specs: running lintian is mainly CPU intensive, and requires
    some disk space to store the temporary data. The AWS VM has 8 cores,
    32 GB RAM, 100 GB disk.

    That is probably feasible, although I'm curious why this is preferred
    over a cloud VM, or set of cloud VMs (in a debian-owned account) that
    can get spawned as needed instead of a static host?

    Multiple VMs would be overkill. Re-processing everything (as required
    to update the results for a new lintian version) takes 4-5 days, so
    that's probably acceptable.

    I thought the general goal was to have official Debian services run on
    DSA-controlled infrastructure. In that case it's true that it's only the
    data acquisition part that is running outside DSA-controlled
    infrastructure. We can leave it like that (and it's not that hard to
    transition later, the worker is mostly stateless).
    I just wanted to raise the topic.

    2/ Future of lintian.debian.org.

    It is currently not actively maintained, and the data on it is stale. We
    can:
    - keep it like that until someone decides to adopt it (but the fact that
    the data is stale is a bit misleading)
    - shut it down or redirect it to https://udd.debian.org/lintian/ or
    somewhere else.
    I don't have a strong opinion.

    I think we should very much not keep it stale, it's been that way way
    too long already. I'd lean towards shutting it down but a redirect
    would also be OK IMO.

    I started a wiki page to document how to transition from lintian.d.o to
    the UDD implementation. That could be useful if you turn lintian.d.o
    into a static page saying it has been shut down, and want to point
    somewhere to help people transition.

    Lucas
