• Stats on packages not on Salsa (Was: Bits from DPL)

    From Andreas Tille@21:1/5 to All on Thu Jan 9 17:40:02 2025
    XPost: linux.debian.devel.qa

    Hi Stuart,

    I'm changing the subject and suggest moving the topic to the Debian QA
    list, where it probably belongs.

    On Thu, Jan 09, 2025 at 11:54:47AM +1100, Stuart Prescott wrote:
    Good point on anonscm as well... that really does blow out the numbers.

    Unfortunately yes.

    However... some of them still work via the aliasing mechanism that was introduced at the time of migration to salsa.

    In the migration phase from Alioth to Salsa I maintained lists of
    packages for the Debian Med and Debian Science teams. In my practical
    experience, finding a working alias is a rare exception. I also think
    this alias mechanism was a temporary solution that should not survive
    for more than 5 years.

    Duck used to check them all
    but I don't think it is running any more, unfortunately. vcswatch still
    does, more on that later.

    Vcswatch is a good hint.

    The vast majority of these packages have seen post-alioth uploads but with
    the broken Vcs fields still in place.

    Do you have numbers backing up this "vast majority" statement?

    Yes, that's in the table below. Of those 161 packages, 145 have been
    uploaded since salsa launched and alioth stopped. (updated data with anonscm at the bottom - the story is still the same, although not all those anonscm links are broken)

    Ahhh, got your point now. The Bug of the Day criteria select packages
    that have not been uploaded for a long time, so my experience is
    different.

    (I accidentally found 2 python-team packages without Vcs URLs yesterday - the repos were on salsa, just not listed in d/control)

    Not so nice. Did you just inject these? If not, would you mind naming the packages?

    One got uploaded because I was sorting other changes for qtpy, the other is fixed in git. Having looked at 20-something packages in the last 2 days, I'm not sure I could actually name which ones at this stage...

    OK as long as these are fixed now.

    In pursuing this, you might also find the vcswatch table in udd - it lists 1533 packages where the VCS fields might need fixing. Some of the errors there are transient, but this also picks up typos in the VCS fields ('debain', 'debian/packages/') and repos that simply don't exist.

    Good point.

    Updated queries and data appended. (and btw postgres can do regex matches which simplifies the sql quite a lot)

    I'm aware in principle of the regexp feature. Unfortunately I have
    to deal with SQL databases without this kind of feature in my day job,
    so I usually try to avoid PostgreSQL-only features.
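
A minimal sketch of how the regex filter could be written without PostgreSQL-only features, using plain LIKE patterns. This is an illustration under assumptions, not a query from the thread: it runs against an in-memory SQLite database with a made-up, cut-down `sources` table and invented rows instead of the real UDD, and the helper name `legacy_vcs_sql` is hypothetical. The match is only roughly equivalent to the `~` operator: LIKE treats the dots literally, and SQLite's LIKE is case-insensitive for ASCII by default.

```python
import sqlite3

# Host names taken from the regex used in this thread.
LEGACY_HOSTS = ("git", "svn", "alioth", "anonscm")


def legacy_vcs_sql():
    """Build a portable WHERE fragment plus its parameters."""
    clauses = " OR ".join("vcs_url LIKE ?" for _ in LEGACY_HOSTS)
    params = [f"%/{host}.debian.org%" for host in LEGACY_HOSTS]
    return f"({clauses})", params


# Toy stand-in for UDD's sources table; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sources (source TEXT, release TEXT, vcs_url TEXT)")
conn.executemany(
    "INSERT INTO sources VALUES (?, ?, ?)",
    [
        ("pkg-a", "sid", "https://anonscm.debian.org/git/foo/pkg-a.git"),
        ("pkg-b", "sid", "https://salsa.debian.org/foo/pkg-b.git"),
        ("pkg-c", "sid", "svn://svn.debian.org/foo/pkg-c/trunk"),
        ("pkg-d", "bookworm", "git://git.debian.org/foo/pkg-d.git"),
    ],
)

where, params = legacy_vcs_sql()
rows = conn.execute(
    f"SELECT source FROM sources WHERE release = 'sid' AND {where} "
    "ORDER BY source",
    params,
).fetchall()
legacy_sources = [row[0] for row in rows]
print(legacy_sources)
```

The generated WHERE fragment uses only LIKE and OR, so it should carry over to most SQL dialects; only the placeholder style (`?` here) depends on the database driver.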

    Majority of packages with invalid vcs_url uploaded post salsa:

    SELECT
        DATE_PART('year', date) AS year,
        COUNT(*)
    FROM
        sources AS s
        JOIN upload_history AS h
            ON s.source = h.source AND s.version = h.version
    WHERE
        release = 'sid'
        AND vcs_url ~ '/(git|svn|alioth|anonscm).debian.org'
    GROUP BY
        year
    ORDER BY
        year ASC;

    year | count
    -----+-------
    2011 | 2
    2012 | 5
    2013 | 7
    2014 | 9
    2015 | 9
    2016 | 20
    2017 | 102
    2018 | 85 ← (salsa.d.o general availability)
    2019 | 10
    2020 | 77
    2021 | 411
    2022 | 115
    2023 | 13
    2024 | 31
    2025 | 3
    (15 rows)



    Teams with packages to fix - and the packages are probably already on salsa so this is just metadata, not lots of work.

    SELECT
        maintainer_name, COUNT(*)
    FROM sources
    WHERE
        release = 'sid'
        AND vcs_url ~ '/(git|svn|alioth|anonscm).debian.org'
        AND maintainer ~ '(team|group|lists)'
    GROUP BY
        maintainer_name
    ORDER BY
        count DESC;

    maintainer_name | count
    ---------------------------------+-------
    Debian Ruby Extras Maintainers | 196 (+2 that are in Uploaders)
    Debian Java Maintainers | 178
    Debian Go Packaging Team | 105
    Debian Perl Group | 83
    pkg-go | 25
    Debian Javascript Maintainers | 20
    Debian Fonts Task Force | 15
    Debian PHP PEAR Maintainers | 14
    Debian X Strike Force | 12
    Debian Science Maintainers | 11
    Debian XML/SGML Group | 5
    Debichem Team | 4
    Debian VDR Team | 4
    Debian CLI Applications Team | 2
    Debian Games Team | 2
    Debian Java maintainers | 2
    Debian Tasktools Packaging Team | 2
    Debian VoIP Team | 2
    Debian Astronomy Maintainers | 2
    Debian Privacy Tools Maintainers | 2
    Debian Clojure Maintainers | 2
    Debian Astronomy Team | 2
    Debian Telepathy maintainers | 2
    Live Systems Maintainers | 1
    The Debian Lua Team | 1
    Pulseaudio maintenance team | 1
    Android Tools Maintainers | 1
    Debian PhotoTools Maintainers | 1
    Puppet Package Maintainers | 1
    ClamAV Team | 1
    Debian-IN Team | 1
    Debian CLI Libraries Team | 1
    Debian Islamic Maintainers | 1
    Debian GNOME Maintainers | 1
    Debian Science Team | 1
    Debian Sugar Team | 1
    Debian GNUKhata Team | 1
    Debian Emacs addons team | 1
    Debian Med Packaging Team | 1
    Debian Salt Team | 1
    NeuroDebian Team | 1



    Find packages in your favourite team that you want to work on...

    SELECT
        source, vcs_url
    FROM sources
    WHERE
        release = 'sid'
        AND vcs_url ~ '/(git|svn|alioth|anonscm).debian.org'
        AND maintainer ~ 'science'
    ORDER BY
        source;

    Thank you for publishing these data - I hope this will encourage people
    to look into this.

    The vcswatch table has lots of interesting things... Note that the salsa error "could not read Username" in the table is not a misconfiguration - it means that the repo couldn't be obtained anonymously, which could be that it doesn't exist, or that it needs permissions - both are wrong for Debian.

    SELECT
        source, url, error
    FROM
        vcswatch
    WHERE
        error IS NOT NULL
    ORDER BY
        source;
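
Once the rows from the vcswatch query are exported, they could be triaged roughly as sketched below. This is only an illustration: the category names, the `classify_error` helper and the sample rows (including the exact error wording) are invented for the example and are not real UDD data. The one rule taken from the thread is that "could not read Username" means the repository could not be fetched anonymously.

```python
from collections import Counter


def classify_error(error):
    """Map a vcswatch-style error string to a rough triage category."""
    if error is None:
        return "ok"
    if "could not read Username" in error:
        # Anonymous access failed: per the thread, the repository either
        # does not exist or needs permissions.
        return "missing-or-private"
    return "other"


# Invented rows in (source, url, error) order; not real UDD data.
sample_rows = [
    ("pkg-a", "https://salsa.debian.org/foo/pkg-a.git", None),
    (
        "pkg-b",
        "https://salsa.debian.org/foo/pkg-b.git",
        "fatal: could not read Username for 'https://salsa.debian.org'",
    ),
    (
        "pkg-c",
        "https://anonscm.debian.org/git/foo/pkg-c.git",
        "repository not found",
    ),
]

summary = Counter(classify_error(error) for _, _, error in sample_rows)
print(dict(summary))
```

The "ok" branch is only there for completeness, since the query above already filters out rows without an error.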

    I've removed the quotation markers from the SQL queries to enable easy
    copy-and-pasting for readers. I can confirm that a couple of Debian
    Science packages will no longer show up tomorrow (but some are not simple
    metadata fixes, since a lot has happened to the code in Git, which does
    not build currently; at least I pinged the team in those cases).

    Kind regards
    Andreas.

    --
    https://fam-tille.de
