• javadocs

    From Julien Plissonneau Duquène@21:1/5 to All on Tue Feb 18 10:50:01 2025
    Hi,

    There is a growing tendency these last few years to remove -java-doc
    packages as part of regular maintenance (including fixing builds or
    updating to a new upstream release). As I could not find this issue
    being discussed previously in this mailing list's archives, here we go.

    Some figures: there are currently 337 -java-doc packages in bookworm,
    264 in trixie, 272 in unstable, for a total of 348 unique names overall,
    while I counted 1537 libsomething-java. This means that currently close
    to 23% of packaged Java libraries provide a -java-doc package overall,
    and that number is down to 18% currently in trixie.
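As a quick check of the figures above, the shares can be recomputed from the raw counts. This is only a minimal sketch: the helper name `percent` is hypothetical, and it computes just the overall share and the bookworm-to-trixie decline, since the post gives no per-release count of library packages.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts quoted in the post.
doc_bookworm, doc_trixie, doc_unique = 337, 264, 348
lib_packages = 1537

# Share of packaged Java libraries that have a -java-doc package overall.
overall_share = percent(doc_unique, lib_packages)
# Relative drop in the number of -java-doc packages from bookworm to trixie.
decline = percent(doc_bookworm - doc_trixie, doc_bookworm)

print(f"overall share: {overall_share}%")
print(f"decline bookworm -> trixie: {decline}%")
```

The overall share comes out close to the 23% quoted above; the decline figure is derived here and is not stated in the post.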

    Arguments in favor of their removal are so far:
    - they often cause builds to break, especially with new releases of JDKs
    or build tools
    - they have a low popcon
    - they can usually be downloaded from the upstream project or Maven
    Central, or browsed online
    - maintainer time would be better invested on other issues.

    Tell me if I missed some.

    I believe however that we should continue to provide -java-doc packages
    for several reasons:
    - Debian sometimes provides patched libraries that may behave slightly differently than the upstream version
    - not all projects reliably publish all versions of their developer documentation on public repositories, and the documentation of some
    older versions of libraries still packaged in Debian is not available on
    the usual online services
    - they are convenient for working offline (which also happens in
    corporate settings, e.g. when deep inside a building where you won't
    pick up your operator's network, there is no guest network, and the
    corporate network has such an unfriendly and invasive policy that you
    won't even try to connect to it)
    - additional developer documentation (e.g. markdown files, reports) can
    be provided with these packages but is usually not bundled in upstream
    javadoc archives
    - they would remain available in Debian even if the upstream project
    entirely removed its online presence for any reason
    - as a matter of principle and for completeness: downloadable developer
    documentation is part of what's expected from a popular,
    general-purpose, quality distribution such as Debian (even though there
    is no such requirement in the Debian policy AIUI).

    Popcon is IMO not a relevant metric to estimate the usefulness of
    developer documentation packages (or, more generally, of packages that
    would only be used by developers). Some -java-doc packages have a very
    low popcon because the library package itself has a low popcon, and
    developers (those that might need the documentation of a library) are a
    tiny fraction anyway compared to regular users of a package (which
    include developers that don't directly work on anything related to this package). I also sampled a few other non-java library -doc packages and
    they get similarly low popcon scores.

    Build issues are a fact, but I think that there are ways to drastically
    improve the situation, among other things by (automatically) testing new
    JDK or build tool releases before discovering compatibility issues as
    FTBFS bug reports pile up. A few other fixes in the toolchain are also
    needed, and I'm planning to work on these (testing and toolchains) later
    this year.

    Now I understand why some maintainers would rather drop -java-doc
    packages and I think that in the current situation it's fair to not make
    it a priority to maintain them.

    I'm thus proposing the following policy from now on, to be revisited
    after the toolchain is fixed and we see how it goes with a few JDK and
    build tool updates (so maybe 3 years from now, let's say 2028):
    - maintainers may at their discretion drop -java-doc packages rather
    than fix them when they encounter build issues
    - other maintainers may (re-)introduce them at their discretion
    - new library packages may introduce new -java-doc packages and that
    "may" will revert to a "should" (as in the currently published Debian
    policy for Java) once the toolchain is sufficiently improved.

    What do you think of that?

    Cheers,

    --
    Julien Plissonneau Duquène

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Emmanuel Bourg@21:1/5 to All on Tue Feb 18 12:50:01 2025
    On 18/02/2025 10:46, Julien Plissonneau Duquène wrote:

    There is a growing tendency these last few years to remove -java-doc
    packages as part of regular maintenance (including fixing builds or
    updating to a new upstream release). As I could not find this issue
    being discussed previously in this mailing list's archives, here we go.

    Some figures: there are currently 337 -java-doc packages in bookworm,
    264 in trixie, 272 in unstable, for a total of 348 unique names overall, while I counted 1537 libsomething-java. This means that currently close
    to 23% of packaged Java libraries provide a -java-doc package overall,
    and that number is down to 18% currently in trixie.

    Thank you for the stats, I admit I hoped there were fewer packages left :)


    Arguments in favor of their removal are so far:
    - they often cause builds to break, especially with new releases of JDKs
    or build tools
    - they have a low popcon
    - they can usually be downloaded from the upstream project or Maven
    Central, or browsed online
    - maintainer time would be better invested on other issues.

    Tell me if I missed some.

    - javadoc slows down the build and uses more computing resources (the
    -java-doc dependency resolution of maven-debian-helper at the end of the
    build is extremely slow, and often inaccurate)
    - the -java-doc packages take up more space on the mirrors
    - the links between -java-doc packages are often broken, unless the
    build is patched (more maintenance)
    - javadoc is often the source of reproducibility issues: the openjdk-21
    package has 7 patches addressing some of these issues, and they have to
    be updated for every new JDK release. There are also patches in Ant and
    Maven.
    - javadocs sometimes contain Google Analytics scriptlets or load external
    resources, which causes privacy issues and must be patched out
    - https://javadoc.io is much more useful than our -java-doc packages
    - jquery unbundling complicates the build
    - every new JDK release over the past ten years came with new javadoc
    issues. The tool gets stricter and stricter, forcing us to rewrite the
    documentation or use undocumented/unsupported flags to skip the
    errors... until the next JDK reshuffles everything.


    I believe however that we should continue to provide -java-doc packages
    for several reasons:
    - Debian sometimes provides patched libraries that may behave slightly differently than the upstream version

    The public API visible in the javadoc is rarely changed. Some notable
    exceptions that come to mind are Guava, where removed code is often
    reintroduced to preserve backward compatibility, and the latest
    iterations of the Servlet API for the same reason. But these changes are
    just a convenience for Debian; developers will rather pull their
    dependencies from Maven Central than from /usr/share/maven-repo.
    Developing specifically for the Debian libraries makes no sense for a
    Java developer: why would I sacrifice cross-platform compatibility by
    targeting a version of a library only available in Debian?


    - not all projects reliably publish all versions of their developer documentation on public repositories, and the documentation of some
    older versions of libraries still packaged in Debian is not available on
    the usual online services

    The javadoc can still be built locally from the source package, it's
    just a couple of commands away:
    - apt source <package>
    - mvn javadoc:javadoc, or ant javadoc
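The two steps above could be wrapped in a small helper that picks the right javadoc command for an unpacked source tree. This is a rough sketch under stated assumptions: the function names are hypothetical, the build system is guessed from the presence of `pom.xml` or `build.xml`, and for Ant it assumes the project's build file actually defines a `javadoc` target. It only builds the command lines and does not run them.

```python
from pathlib import Path


def fetch_source_command(package):
    """Command line that downloads and unpacks the Debian source package."""
    return ["apt", "source", package]


def javadoc_command(source_dir):
    """Pick the javadoc command for an unpacked source tree, or None."""
    src = Path(source_dir)
    if (src / "pom.xml").is_file():
        # Maven project: run the javadoc goal of the Maven Javadoc Plugin.
        return ["mvn", "javadoc:javadoc"]
    if (src / "build.xml").is_file():
        # Ant project: assumes build.xml defines a target named "javadoc".
        return ["ant", "javadoc"]
    # Unknown build system: nothing to suggest.
    return None
```

The returned lists can be passed to `subprocess.run(..., cwd=source_dir)`; whether the build then succeeds depends on the dependencies and JDK available locally.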


    - they are convenient for working offline (which also happens in
    corporate settings, e.g. when deep inside a building where you won't
    pick up your operator's network, there is no guest network, and the
    corporate network has such an unfriendly and invasive policy that you
    won't even try to connect to it)

    That's a corporate issue, not a Debian issue. A developer without
    internet access is close to useless in 2025 anyway.


    - additional developer documentation (e.g. markdown files, reports) can
    be provided with these package but are usually not bundled in upstream javadoc archives
    - they would remain available in Debian even if the upstream project
    removed entirely its online presence for any reason
    - as a matter of principle for completeness, downloadable developer documentation being part of what's expected from a popular,
    general-purpose, quality distribution such as Debian (even though there
    is no such requirement in the Debian policy AIUI).

    Popcon is IMO not a relevant metric to estimate the usefulness of
    developer documentation packages (or, more generally, of packages that
    would only be used by developers). Some -java-doc packages have a very
    low popcon because the library package itself has a low popcon, and developers (those that might need the documentation of a library) are a
    tiny fraction anyway compared to regular users of a package (which
    include developers that don't directly work on anything related to this package). I also sampled a few other non-java library -doc packages and
    they get similarly low popcon scores.

    And yet we keep removing -java-doc packages and no user complains.


    Build issues are a fact, but I think that there are ways to drastically improve the situation, among other things by (automatically) testing new
    JDK or build tool releases before discovering compatibility issues as
    FTBFS bug reports pile up. A few other fixes in the toolchain are also needed, and I'm planning to work on these (testing and toolchains) later
    this year.

    You can detect the issues automatically, but not fix them automatically.


    Now I understand why some maintainers would rather drop -java-doc
    packages and I think that in the current situation it's fair to not make
    it a priority to maintain them.

    I'm thus proposing the following policy from now on, to be revisited
    after the toolchain is fixed and we see how it goes with a few JDK and
    build tool updates (so maybe 3 years from now, let's say 2028):
    - maintainers may at their discretion drop -java-doc packages rather
    than fix them when they encounter build issues
    - other maintainers may (re-)introduce them at their discretion
    - new library packages may introduce new -java-doc packages and that
    "may" will revert to a "should" (as in the currently published Debian
    policy for Java) once the toolchain is sufficiently improved.

    What do you think of that?

    I think the best would be to recognize that -java-doc packages are not sustainable and of limited use. That doesn't mean we have to give up any
    hope of documenting the Java APIs in Debian, it's just that the
    -java-doc packages are not the right tool. In my opinion the right
    approach would be to build a javadoc.debian.net service gathering the
    javadoc of all Debian packages, similar to javadoc.io but specific to
    Debian. And if it could also serve as a class search engine it would be incredibly useful.

    Emmanuel Bourg

  • From Julien Plissonneau Duquène@21:1/5 to All on Wed Feb 19 09:00:01 2025
    Hi,

    On 2025-02-18 10:59, Sebastiaan Couwenberg wrote:

    I doubt those "other maintainers" have the discipline to target their
    uploads reintroducing -java-doc packages to experimental where they'll
    land after NEW processing.

    I'm not sure this is much of an issue. Maybe you could elaborate why you
    think so?

    I also wouldn't appreciate having to drop the reintroduced -java-doc
    package again when it breaks with the next JDK update.

    People who value -java-doc packages should maintain them separately to
    not bother the maintainers who don't care for them. The separately
    maintained gcc -doc packages might serve as an example.

    I'm not against the principle, but this isn't workable for javadocs.
    gcc-N-doc has separate source files and its own build system (and even a
    different, non-DFSG license). Javadoc source is embedded in the source
    code, and its build is more or less tightly integrated with the build
    system of the binaries. Having different maintainers for those would
    mean duplicating the entire source code as a new source package, which
    is obviously not reasonable.

    Cheers,

    --
    Julien Plissonneau Duquène

  • From Julien Plissonneau Duquène@21:1/5 to All on Wed Feb 19 09:40:01 2025
    On 2025-02-19 09:08, Sebastiaan Couwenberg wrote:

    Building something on top of snapshot.d.o might be feasible for
    separately maintained javadocs.

    I don't see how this could help to prevent the source package
    duplication, could you explain your idea?

    Cheers,

    --
    Julien Plissonneau Duquène

  • From Julien Plissonneau Duquène@21:1/5 to All on Wed Feb 19 09:30:01 2025
    Hi,

    On 2025-02-18 12:31, Emmanuel Bourg wrote:
    - javadoc slows down the build and uses more computing resources (the
    -java-doc dependency resolution of maven-debian-helper at the end of
    the build is extremely slow, and often inaccurate)

    That's not much of an issue in my experience, and there is room for
    improvement in the maven-debian-helper.

    - the -java-doc packages take up more space on the mirrors

    Some more stats then: I counted 189 MiB of -java-doc packages for
    bookworm, compared to 937 MiB of -java packages (which is surprisingly
    low, I was expecting 5x-10x more). The -java-doc packages thus account
    for around 17% of the volume used by all Java packages, which is
    reasonable. In comparison, the whole bookworm archive is about 162 GiB,
    of which all -java-doc packages account for about 0.1%. I think we can
    call this a non-issue.
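The percentages above can be reproduced with a short calculation. This sketch assumes that the 937 MiB figure excludes the -java-doc packages (so the Java total is the sum of both) and that 1 GiB = 1024 MiB; the variable names are illustrative only.

```python
MIB_PER_GIB = 1024

# Sizes quoted in the post, for bookworm.
doc_mib = 189        # all -java-doc packages
java_mib = 937       # all -java packages (assumed to exclude -java-doc)
archive_gib = 162    # whole bookworm archive

# Share of -java-doc within all Java packages (docs plus libraries).
share_of_java = 100 * doc_mib / (doc_mib + java_mib)
# Share of -java-doc within the whole archive.
share_of_archive = 100 * doc_mib / (archive_gib * MIB_PER_GIB)

print(f"-java-doc share of Java packages: {share_of_java:.1f}%")
print(f"-java-doc share of the archive:   {share_of_archive:.2f}%")
```

Both results round to the figures given in the post (around 17% and about 0.1%).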

    - the links between -java-doc packages are often broken, unless the
    build is patched (more maintenance)
    - javadoc is often the source of reproducibility issues: the openjdk-21
    package has 7 patches addressing some of these issues, and they have to
    be updated for every new JDK release. There are also patches in Ant and
    Maven.
    - jquery unbundling complicates the build
    - every new JDK release over the past ten years came with new javadoc
    issues. The tool gets stricter and stricter, forcing us to rewrite the
    documentation or use undocumented/unsupported flags to skip the
    errors... until the next JDK reshuffles everything.

    Some of these issues need to be fixed in the toolchain, possibly in
    the JDK itself, possibly by bringing to the attention of upstream
    developers (JDK, build tools) that backwards compatibility of that chain
    has been suboptimal so far, to say the least, and asking them to give
    more consideration to these issues in their development, as
    maintainability matters for popularity.

    - javadocs sometimes contain Google Analytics scriptlets or load
    external resources, which causes privacy issues and must be patched out

    I would argue here that it's actually a very good reason to provide alternatives that don't have these privacy issues.

    - https://javadoc.io is much more useful than our -java-doc packages

    Not in my experience though. It lacks a proper index and a convenient
    search feature, fails to display some versions, and in any case not all
    Java projects push their javadoc artifacts to Maven Central; Gradle
    doesn't, for example.

    The public API visible in the javadoc is rarely changed. Some notable
    exceptions that come to mind are Guava, where removed code is often
    reintroduced to preserve backward compatibility, and the latest
    iterations of the Servlet API for the same reason. But these changes
    are just a convenience for Debian; developers will rather pull their
    dependencies from Maven Central than from /usr/share/maven-repo.
    Developing specifically for the Debian libraries makes no sense for a
    Java developer: why would I sacrifice cross-platform compatibility by
    targeting a version of a library only available in Debian?

    I was thinking more about the cases where features are dropped in the
    Debian versions (because of missing dependencies, security or licensing
    issues, etc.). The javadocs are a convenient place to document the
    deviations from upstream, reference bugs/discussions and possibly
    recommend ways to work around known issues.

    The javadoc can still be built locally from the source package, it's
    just a couple of commands away:
    - apt source <package>
    - mvn javadoc:javadoc, or ant javadoc

    ... and then Maven happily downloads the entire Internet etc etc and
    then fails the build because of some of the issues above that are not
    fixed, unless Maven toolchains were configured to also download an
    entire older JDK to run that build. I would just download the javadoc
    jar from Maven Central or some upstream repository, or keep grepping the
    source code as is, that's faster and less trouble.

    - they are convenient for working offline (which also happens in
    corporate settings, e.g. when deep inside a building where you won't
    pick up your operator's network, there is no guest network, and the
    corporate network has such an unfriendly and invasive policy that you
    won't even try to connect to it)

    That's a corporate issue, not a Debian issue. A developer without
    internet access is close to useless in 2025 anyway.

    That was an example. We take permanent, high-speed, unrestricted
    Internet access for granted, but there are still many places and
    situations where being able to work offline is appreciated, if not a
    necessity.

    And yet we keep removing -java-doc packages and no user complains.

    Fixing that: I'm hereby complaining as one of them ^^

    More seriously, there is a certain threshold that has to be crossed for
    users to start complaining or report issues. The fact that it's quiet so
    far doesn't mean everybody is happy with that.

    Build issues are a fact, but I think that there are ways to
    drastically improve the situation, among other things by
    (automatically) testing new JDK or build tool releases before
    discovering compatibility issues as FTBFS bug reports pile up. A few
    other fixes in the toolchain are also needed, and I'm planning to work
    on these (testing and toolchains) later this year.

    You can detect the issues automatically, but not fix them
    automatically.

    Sure, but this gives a chance to mitigate the issues in the build tools
    and document what needs to be done in the packages if anything. Also a
    part of the issue is that by policy we keep rebuilding older packages
    with newer JDKs, which has some advantages but is clearly diverging from
    the mainstream java practice where a given version is built once,
    artifacts are published and then that same version is never built again.
    This has some consequences that we have to deal with.

    I think the best would be to recognize that -java-doc packages are not sustainable and of limited use. That doesn't mean we have to give up
    any hope of documenting the Java APIs in Debian, it's just that the
    -java-doc packages are not the right tool. In my opinion the right
    approach would be to build a javadoc.debian.net service gathering the
    javadoc of all Debian packages, similar to javadoc.io but specific to
    Debian. And if it could also serve as a class search engine it would be incredibly useful.

    I also had this idea for an online service, and it's not mutually
    exclusive with -java-doc packages. Actually if you wanted to provide
    online javadocs that are built from Debian sources you would still have
    to do mostly the same work as what's required for producing -java-doc
    packages. In terms of logistics it's also much easier to have them
    packaged (e.g. they are guaranteed to be rebuilt with every new release
    of the main packages), and that would make it possible to distribute the
    webapp as a package that could be easily used in private deployments
    including one's own machine.

    Do you (or others) happen to know a corporate sponsor that would be
    willing to provide the hosting or fund the service?

    Cheers,

    --
    Julien Plissonneau Duquène
