• Bug#1109253: at least nl_BE definitely needs to be put back

    From Soren Stoutner@21:1/5 to All on Tue Jul 15 12:45:10 2025
    To: rene@debian.org (Rene Engelhard)

    On Tuesday, July 15, 2025 4:47:45 AM Mountain Standard Time Rene Engelhard wrote:
    Hi,

    yes, I also think at least nl_BE (if not all) definitely should be put back.

    There was no reason to remove them - in contrast...

    I disagree.

    Country or region specific dictionaries should only exist if they actually contain distinct country or region specific information. So, for example, if upstream shipped a nl_BE.dic that was different than the main nl.dic, then that file should be shipped in Debian. In this case, the upstream project does not produce any country or region specific dictionaries, but rather only one language dictionary, which they name nl.dic. Creating country-specific symlinks causes the LibreOffice GUI to list each country as if it had a separate, customized dictionary for that country, which in the case of this package is incorrect.

    In other words, I consider this a bug in LibreOffice, which I have opened here:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1109355

    However, given the short amount of time before the trixie release, I consider the best place to work around this bug to be to temporarily rename the dictionary to include the primary country code so that LibreOffice can find it. For forky, once the bug is fixed in LibreOffice, this package should resume shipping the original upstream file name of nl.dic.

    --
    Soren Stoutner
    soren@debian.org

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kurt Roeckx@21:1/5 to All on Wed Jul 16 09:10:01 2025

    BE is just as primary as NL. An institute was created with members from both. The other 2 countries have also agreed to it, so it's the official spelling for those 4.

    But there are differences in usage. For instance, some words are masculine or feminine in one country but genderless in the other. This has no effect on hunspell, because there is no difference in spelling. The setting in LibreOffice is not just about spelling.

    Kurt
  • From Soren Stoutner@21:1/5 to All on Wed Jul 16 10:02:04 2025
    René,

    On Tuesday, July 15, 2025 11:38:37 PM Mountain Standard Time René Engelhard wrote:
    Country or region specific dictionaries should only exist if they actually contain distinct country or region specific information. So, for example, if upstream shipped a nl_BE.dic that was different than the main nl.dic, then that file should be shipped in Debian. In this case, the upstream project does not produce any country or region specific dictionaries, but rather only one language dictionary, which they name nl.dic.

    That would be ideal, yes.

    That is just not how it works... It worked the current way since the 2000s. No reason to immediately change it now.

    I lived for two years in Perú and speak fairly decent Spanish. When I go to enable Spanish spell checking in LibreOffice, I see the following list of options:

    Spanish (Argentina)
    Spanish (Bolivia)
    Spanish (Chile)
    Spanish (Colombia)
    Spanish (Costa Rica)
    Spanish (Dom. Rep.)
    Spanish (Ecuador)
    Spanish (El Salvador)
    Spanish (Guatemala)
    Spanish (Honduras)
    Spanish (Mexico)
    Spanish (Nicaragua)
    Spanish (Panama)
    Spanish (Paraguay)
    Spanish (Peru)
    Spanish (Puerto Rico)
    Spanish (Spain)
    Spanish (Uruguay)
    Spanish (Venezuela)

    Before I had looked into this closely, I assumed this meant that Debian was shipping distinct dictionaries for each of these countries. There is some regional variation in Spanish. Argentina, for example, uses a verb tense that is not common elsewhere. Mexico, especially Mexico City, uses a lot of slang that approaches an official custom vocabulary.

    In other cases, there is very little variation. I think anyone would be hard pressed to describe any dictionary differences between Peru and Ecuador or Bolivia.

    But the truth is that this entire list is a farce. There is not a single difference in LibreOffice between selecting any of them. They are all just symlinks to es_ES.dic (Spain). So, even if I select Argentina because I want Argentina specific spell checking, I don’t get it.

    I consider this to be false advertising.
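
    The claim that the regional entries are only symlinks can be checked on an installed system. The following is a minimal sketch, not part of any package: the function name `group_by_target` is hypothetical, and /usr/share/hunspell is assumed to be the directory that holds the dictionaries. It groups every .dic name by the real file it resolves to.

```python
from collections import defaultdict
from pathlib import Path


def group_by_target(dict_dir):
    """Map each real .dic file name to the sorted names that resolve to it."""
    groups = defaultdict(list)
    for entry in sorted(Path(dict_dir).glob("*.dic")):
        # resolve() follows symlinks, so aliases collapse onto the real file.
        groups[entry.resolve().name].append(entry.name)
    return dict(groups)


if __name__ == "__main__":
    # Assumption: dictionaries are installed under /usr/share/hunspell.
    for target, names in group_by_target("/usr/share/hunspell").items():
        aliases = [name for name in names if name != target]
        print(f"{target}: {len(aliases)} alias(es) {aliases}")
```

    A target with many aliases and no separate files is the situation described above: several menu entries, one dictionary.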

    In the case of Dutch, the upstream project does not produce any country-specific Hunspell dictionaries. They ship one dictionary named nl.dic.

    https://github.com/OpenTaal/opentaal-hunspell/

    They do not ship four separate dictionaries for the Netherlands (nl_NL), Belgium (nl_BE), Aruba (nl_AW), and Suriname (nl_SR), which were the four symlinks shipped previously. Interestingly, this is not a complete list of possible Dutch country/region specific codes, just like the above Spanish list is not a complete list of all the possible codes. For Dutch, at least those in the following link are possible:

    https://localizely.com/language-code/nl/

    In the case of Dutch, LibreOffice only recognizes two language codes, nl_NL and nl_BE, which makes the previous shipping of nl_AW and nl_SR in Debian superfluous.

    The change I have already made to the package, and which has already been unblocked to migrate to testing, is to ship one language specific code: nl_NL. This preserves the ability of Dutch users of LibreOffice to enable Dutch spell checking. In LibreOffice’s GUI, it lists this language as Dutch (Netherlands). This is not completely accurate as there is nothing about the dictionary we are shipping that has specific information about the Netherlands in it. So, I consider doing so a temporary workaround. However, it is more accurate than shipping two symlinks, one for nl_NL and the other for nl_BE. That would falsely advertise that two separate dictionaries are provided.

    The solution to this problem is for LibreOffice to correctly enumerate the Hunspell dictionaries present on the system, including any dictionaries without country/region specific codes. As suggested, I will file an upstream LibreOffice bug and reference this bug and #1109355.
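
    To illustrate what such an enumeration could look like, here is a rough sketch. It is not LibreOffice code; the names `DIC_NAME`, `enumerate_dictionaries`, and `label` are hypothetical, and the file name pattern (a lowercase language code with an optional two-letter uppercase region) is an assumption that ignores script and variant suffixes. The point is only that a file such as nl.dic yields an entry with no region instead of being skipped.

```python
import re
from pathlib import Path

# Matches names such as "nl.dic" or "nl_BE.dic"; the region part is optional.
DIC_NAME = re.compile(r"^(?P<lang>[a-z]{2,3})(?:_(?P<region>[A-Z]{2}))?\.dic$")


def enumerate_dictionaries(dict_dir):
    """Return sorted (language, region) pairs; region is None for e.g. nl.dic."""
    found = []
    for entry in Path(dict_dir).glob("*.dic"):
        match = DIC_NAME.match(entry.name)
        if match:
            found.append((match["lang"], match["region"]))
    return sorted(found, key=lambda pair: (pair[0], pair[1] or ""))


def label(lang, region):
    """Build a menu label that claims a region only when the file names one."""
    return f"{lang} ({region})" if region else lang
```

    With this approach a directory holding only nl.dic would produce the single label "nl", without implying a Netherlands-specific or Belgium-specific dictionary.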

    --
    Soren Stoutner
    soren@debian.org
  • From Kurt Roeckx@21:1/5 to All on Thu Jul 17 00:00:02 2025

    It doesn't matter which combinations are supported by LibreOffice; the dictionary can be used by other software too. It's the official spelling for:
    - Netherlands (including Bonaire, Sint Eustatius, and Saba)
    - Belgium
    - Suriname

    Aruba, Curaçao, and Sint Maarten are also part of the Kingdom of the Netherlands, but are separate countries, and Dutch is one of the official languages there. They seem to collaborate with the rest, but it's unclear whether they use the same spelling; my guess is not.

    Kurt
  • From Soren Stoutner@21:1/5 to All on Wed Jul 16 15:51:00 2025
    To: rene@debian.org (Rene Engelhard)

    On Wednesday, July 16, 2025 11:12:32 AM Mountain Standard Time Rene Engelhard wrote:
    Hi,

    On 16.07.25 at 19:02, Soren Stoutner wrote:
    In the case of Dutch, LibreOffice only recognizes two language codes, nl_NL and nl_BE, which makes the previous shipping of nl_AW and nl_SR in Debian superfluous.

    Indeed. I didn't do those links. (They don't do harm, though)

    rene@frodo:~/LibreOffice/git/master/i18npool/source/localedata/data$ ls nl*
    nl_BE.xml nl_NL.xml

    That also technically means that those in AW or SR probably even need to format their text as either Dutch (Netherlands) or Dutch (Belgium) to be able to spellcheck.
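
    The comparison being made here can be written down as a set difference. This is only an illustration using the names mentioned in this thread: the four symlinks Debian shipped previously and the two locale data files from the `ls nl*` output quoted above. The variable names are made up for the example.

```python
# Symlink names previously shipped by Debian, as listed in this thread.
shipped = {"nl_NL", "nl_BE", "nl_AW", "nl_SR"}

# Locale data files from the `ls nl*` output quoted above.
locale_files = ["nl_BE.xml", "nl_NL.xml"]
recognized = {name.removesuffix(".xml") for name in locale_files}

# Names that were shipped but have no matching locale data file.
unused = sorted(shipped - recognized)
print(unused)  # ['nl_AW', 'nl_SR']
```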

    That doesn't say anything about nl_BE, though.

    nl_BE has to be there.

    (And yes, that's there also for es, as for your other example in your mail:
    rene@frodo:~/LibreOffice/git/master/i18npool/source/localedata/data$ ls es*
    es_AR.xml es_CO.xml es_EC.xml es_HN.xml es_PA.xml es_PY.xml es_VE.xml
    es_BO.xml es_CR.xml es_ES.xml es_MX.xml es_PE.xml es_SV.xml
    es_CL.xml es_DO.xml es_GT.xml es_NI.xml es_PR.xml es_UY.xml
    )

    Again, that is so deep in LO.. (And is saved in files for Text/Cells, as I wrote before).

    Please re-add the needed symlinks.

    I read all of the above as evidence that this is a bug in LibreOffice that needs to be fixed there. I assume that is obvious to everyone involved.

    Because I do realize that this is a non-trivial change to LibreOffice, I am shipping nl_NL.dic for now, which all Dutch speakers can use for spell checking.

    But the real solution for forky is for this bug to be fixed in LibreOffice.

    --
    Soren Stoutner
    soren@debian.org
  • From Rene Engelhard@21:1/5 to All on Thu Jul 17 13:50:01 2025
    [ resending since when sending it via phone I "lost" the bug in Cc ]


    Hi,

    On 17 July 2025 00:51:00 CEST, Soren Stoutner <soren@debian.org> wrote:

    I read all of the above as evidence that this is a bug in LibreOffice that needs to be fixed there. I assume that is obvious to everyone involved.

    No. It's a shortcoming of LibreOffice's system, yes. Bug? No. It's as designed.

    Because I do realize that this is a non-trivial change to LibreOffice, I am shipping nl_NL.dic for now, which all Dutch speakers can use for spell checking.

    Not those in Belgium, unless their text is formatted as Dutch (Belgium). How hard is it to understand that?
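
    The disagreement comes down to how a dictionary is looked up for a piece of text. The sketch below contrasts the two behaviours discussed in this thread. It is a simplified model and not LibreOffice's actual implementation; the function names `find_exact` and `find_with_fallback` are hypothetical.

```python
def find_exact(tag, available):
    """Return the dictionary whose name equals the text's locale tag, if any."""
    return tag if tag in available else None


def find_with_fallback(tag, available):
    """Try the full tag first, then the bare language code (nl_BE -> nl)."""
    language = tag.split("_", 1)[0]
    for candidate in (tag, language):
        if candidate in available:
            return candidate
    return None


if __name__ == "__main__":
    # Only nl_NL is shipped: exact matching finds nothing for nl_BE text.
    print(find_exact("nl_BE", {"nl_NL"}))
    # Only nl is shipped: a fallback lookup would still find a dictionary.
    print(find_with_fallback("nl_BE", {"nl"}))
```

    Under exact matching, text tagged nl_BE only gets a dictionary if something named nl_BE is installed, which is what a symlink provides. A fallback lookup would make a plain nl.dic sufficient.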

    But the real solution for forky is for this bug to be fixed in LibreOffice.

    As said, that is so deep in LibreOffice that I'd actually be surprised if they did it for those "old style dictionaries" and reworked the locale stuff, since they - as said - use the extensions and their registration mechanism. Can be surprised though.

    I am quite sure we won't be further in forky.

    Asked the Thunderbird maintainers how their detection works...

    I personally give up. But breaking hunspell-nl for Belgian users (in Trixie, but most probably beyond) is seriously bad.

    Regards

    René

  • From Soren Stoutner@21:1/5 to All on Thu Jul 17 12:08:11 2025
    To: kurt@roeckx.be (Kurt Roeckx)

    Kurt,

    On Wednesday, July 16, 2025 11:19:53 PM Mountain Standard Time Kurt Roeckx wrote:
    On July 17, 2025 12:51:00 AM GMT+02:00, Soren Stoutner <soren@debian.org> wrote:
    Because I do realize that this is a non-trivial change to LibreOffice, I am shipping nl_NL.dic for now, which all Dutch speakers can use for spell checking.

    I disagree. I think not shipping nl_BE is an RC bug. If you don't fix it, I will upload it myself.

    First, I should note that the above email I am replying to was sent by Kurt to the bug email address, as well as to myself directly. It does not currently appear on the bug interface, although I read on debian-private today that the BTS was experiencing some load issues related to scraping that either delayed or deleted some emails. I am quoting the text above because Kurt obviously intended it to be included in the bug history.

    Based on the discussion in this thread, it has become apparent that this is a problem with both LibreOffice and Thunderbird. This provides an interesting scenario, as these two programs are probably the primary consumers of the Hunspell dictionaries on Debian. So, their behavior results in a de facto standard for how things are done.

    I feel that the current behavior is incorrect and should be considered a bug. Other people either seem to feel that the behavior is correct or that, even if it is a bug, it is not important enough to be changed.

    Given the ramifications for users, I feel it is important that the Hunspell packages shipped in Debian work with LibreOffice and Thunderbird, even if the behavior of those packages is not in conformance with any actual standard or how language codes are treated in general by other programs.

    If LibreOffice and Thunderbird were interested in working towards a correct use of the language codes, I would be willing to maintain the previous behavior, including shipping as many inaccurate country symlinks as necessary, until the changes could be implemented. But, personally, I do not want to have my name associated with a package that intends to ship these inaccurate symlinks in perpetuity.

    As such, I think the best way forward is probably for Kurt to upload the changes he proposes. When doing so, please remove my name from the Maintainer field.

    It should be noted that the Dutch package has some significant issues that I intended to address when I salvaged it, but were not appropriate to address before the freeze. These changes exist in experimental for dutch and in the new hunspell-nl source package in the NEW queue. Because they are significant, I intended to discuss them via email before they entered unstable. The email, which I had previously written and intended to send out after the trixie release, is attached. As the new maintainer of the package, you can decide how you want to deal with these issues and if you want to use the work on these two packages currently in experimental or if you would like to go a different direction.

    The other thing that should be noted is that there are at least three other Hunspell packages affected by this same issue:

    hunspell-ar
    hunspell-bo
    hunspell-dz

    There is already an existing bug report discussing hunspell-ar that is worth reviewing:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1109305

    The other two dictionaries also have bugs, already closed, but those do not contain any important information not already discussed here.

    In all those cases, please feel free to upload whatever changes you feel are appropriate as long as you remove me as the Maintainer.

    --
    Soren Stoutner
    soren@debian.org
    [Attachment: dutch email.txt]

    To: debian-l10n-dutch@lists.debian.org,
    Debian Install System Team <debian-boot@lists.debian.org>,
    nicoo <nicoo@debian.org>
    CC: Manuel Guerra <ar.manuelguerra@gmail.com>

    Subject: Intention to drop aspell-nl and idutch binary packages

    TL;DR
    -----

    There is no DFSG-free source available to produce the aspell-nl and idutch binary packages. Unless a DFSG-free source can be identified, they will be removed from Debian.


    Background
    ----------

    Recently, Manuel Guerra and I salvaged the dutch source package, which produces the hunspell-nl, wdutch, aspell-nl, and idutch binary packages. Our primary motivation was to update hunspell-nl to ship a .bdic binary dictionary compatible with Qt WebEngine used by various browsers in Debian including Privacy B
  • From Kurt Roeckx@21:1/5 to Soren Stoutner on Thu Jul 17 21:30:01 2025
    On July 17, 2025 12:51:00 AM GMT+02:00, Soren Stoutner <soren@debian.org> wrote:

    Because I do realize that this is a non-trivial change to LibreOffice, I am shipping nl_NL.dic for now, which all Dutch speakers can use for spell checking.

    I disagree. I think not shipping nl_BE is an RC bug. If you don't fix it, I will upload it myself.

    Kurt
