• Re: Efficiency of requesting by message-id vs group+number

    From Jon Ribbens@21:1/5 to Colin Macleod on Thu Aug 22 07:49:43 2024
    On 2024-08-22, Colin Macleod <user7@cmacleod.me.uk.invalid> wrote:
    > Hi, I'm looking for some guidance on the efficiency of requesting
    > individual articles by message-id compared to group and article number.
    > For a server such as INN, does finding an article from its message-id
    > impose extra load?
    >
    > I'm operating a web/usenet gateway at https://cmacleod.me.uk/ng/ .
    > When I display a thread I need to get all the articles in it. I'm
    > wondering about spreading the load of this across multiple nntp servers.
    > However they would have different article numbering, so I would need to
    > request each article by its message-id. Is this practical, or would it
    > impose an unreasonable load on the servers?

    It is reasonable to fetch articles by Message-ID.
    NNTP's "NEWNEWS" command is designed to work that way.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Colin Macleod@21:1/5 to All on Thu Aug 22 07:33:40 2024
    Hi, I'm looking for some guidance on the efficiency of requesting
    individual articles by message-id compared to group and article number.
    For a server such as INN, does finding an article from its message-id
    impose extra load?

    I'm operating a web/usenet gateway at https://cmacleod.me.uk/ng/ . When I
    display a thread I need to get all the articles in it. I'm wondering about
    spreading the load of this across multiple nntp servers. However they would
    have different article numbering, so I would need to request each article
    by its message-id. Is this practical, or would it impose an unreasonable
    load on the servers?

    --
    Colin Macleod.

  • From Jon Ribbens@21:1/5 to urs@buil.tin.org on Thu Aug 22 14:15:18 2024
    On 2024-08-22, Urs Janßen <urs@buil.tin.org> wrote:
    > Jon Ribbens wrote:
    >> It is reasonable to fetch articles by Message-ID.
    >> NNTP's "NEWNEWS" command is designed to work that way.
    >
    > | 502 NEWNEWS command disabled by administrator
    >
    > guess why.

    Because people have tended to abuse it?

  • From Urs Janßen@21:1/5 to Jon Ribbens on Thu Aug 22 13:15:35 2024
    Jon Ribbens wrote:
    > It is reasonable to fetch articles by Message-ID.
    > NNTP's "NEWNEWS" command is designed to work that way.

    | 502 NEWNEWS command disabled by administrator

    guess why.

  • From Russ Allbery@21:1/5 to urs@buil.tin.org on Thu Aug 22 08:08:40 2024
    Urs Janßen <urs@buil.tin.org> writes:
    > Jon Ribbens wrote:

    >> It is reasonable to fetch articles by Message-ID.
    >> NNTP's "NEWNEWS" command is designed to work that way.

    > | 502 NEWNEWS command disabled by administrator

    > guess why.

    The expensive part of NEWNEWS for INN isn't retrieving articles by message
    ID. It's finding the list of message IDs by arrival time, since the
    traditional overview and history data structures do not maintain that
    information in an efficient way.

    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

  • From Jon Ribbens@21:1/5 to kyonshi on Thu Aug 22 15:10:40 2024
    On 2024-08-22, kyonshi <smaug@ereborbbs.duckdns.org> wrote:
    > On Thu, 22 Aug 2024 07:49:43 -0000 (UTC), Jon Ribbens wrote:
    >> It is reasonable to fetch articles by Message-ID.
    >> NNTP's "NEWNEWS" command is designed to work that way.
    >
    > but isn't NEWNEWS blocked on most servers?

    I've no idea. I haven't looked since the 1990s, when it was very rarely,
    if ever, blocked. But it would seem rather surprising if the efficiency
    of looking up an article by Message-ID has somehow gone *down* in the
    intervening period.

  • From Russ Allbery@21:1/5 to Colin Macleod on Thu Aug 22 08:09:59 2024
    Colin Macleod <user7@cmacleod.me.uk.invalid> writes:

    > Hi, I'm looking for some guidance on the efficiency of requesting
    > individual articles by message-id compared to group and article number.
    > For a server such as INN, does finding an article from its message-id
    > impose extra load?

    No, it should be fast. It's a history lookup of the message ID to get the
    storage token, and then retrieval of the article by storage token. It
    should be roughly equivalent to retrieving the article by number. There's
    one extra hash calculation, I think, but it's going to be in the noise.
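
    The lookup path described above can be pictured with a toy model. This is
    only a sketch under stated assumptions, not INN's actual code: the class
    ToySpool, its dictionaries, the token format and the choice of SHA-256 as
    the hash are all hypothetical, and the by-number path is assumed to go
    through a per-group index that also yields a storage token.

```python
import hashlib


class ToySpool:
    """Toy model of the two retrieval paths; not INN's real data structures."""

    def __init__(self):
        self.history = {}    # hash of message ID -> storage token
        self.by_group = {}   # (group, article number) -> storage token
        self.storage = {}    # storage token -> article text
        self._next_token = 0

    @staticmethod
    def _hash(message_id):
        # The "one extra hash calculation": message ID -> fixed-size key.
        # SHA-256 is an arbitrary choice for this sketch.
        return hashlib.sha256(message_id.encode("utf-8")).hexdigest()

    def store(self, message_id, group, number, text):
        token = "T%d" % self._next_token
        self._next_token += 1
        self.storage[token] = text
        self.history[self._hash(message_id)] = token
        self.by_group[(group, number)] = token
        return token

    def by_message_id(self, message_id):
        # History lookup to get the token, then retrieval by token.
        token = self.history[self._hash(message_id)]
        return self.storage[token]

    def by_number(self, group, number):
        # Assumed per-group index lookup, then the same retrieval by token.
        token = self.by_group[(group, number)]
        return self.storage[token]
```

    Both paths end in the same retrieval by token; in this model the only
    difference is the hash computed before the first dictionary lookup.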


  • From kyonshi@21:1/5 to Jon Ribbens on Thu Aug 22 14:34:55 2024
    On Thu, 22 Aug 2024 07:49:43 -0000 (UTC), Jon Ribbens wrote:

    > It is reasonable to fetch articles by Message-ID.
    > NNTP's "NEWNEWS" command is designed to work that way.

    but isn't NEWNEWS blocked on most servers?

  • From D@21:1/5 to urs@buil.tin.org on Thu Aug 22 16:58:03 2024
    On Thu, 22 Aug 2024 13:15:35 -0000 (UTC), Urs Janßen <urs@buil.tin.org> wrote:
    >Jon Ribbens wrote:
    >> It is reasonable to fetch articles by Message-ID.
    >> NNTP's "NEWNEWS" command is designed to work that way.
    >
    >| 502 NEWNEWS command disabled by administrator
    >
    >guess why.

    only curious (didn't know why so i googled it) . . .

    (using Tor Browser 13.5.2)
    https://duckduckgo.com/?q=nntp+newnews+expensive
    ...
    https://linux.die.net/man/8/nntpd
    >The optional allownewnews option enables the NNTP NEWNEWS command. NOTE:
    >For servers with a large volume of articles, the NEWNEWS command can be
    >expensive.
    ...
    https://www.unix.com/man-page/centos/8/nntpd/
    https://www.cyrusimap.org/imap/reference/manpages/systemcommands/nntpd.html
    https://manpages.ubuntu.com/manpages/kinetic/en/man8/cyrus-nntpd.8.html
    ...

  • From Russ Allbery@21:1/5 to Jon Ribbens on Thu Aug 22 09:12:53 2024
    Jon Ribbens <jon+usenet@unequivocal.eu> writes:

    > I've no idea. I haven't looked since the 1990s, when it was very rarely,
    > if ever, blocked.

    Hm, I'm not sure about that. Blocking NEWNEWS in INN goes back a long
    time. The old implementation did history text file searches, IIRC, and
    was rather inefficient for single groups if the server had a lot of
    traffic. Maybe I'm misremembering, but I don't think I am; I'm pretty
    sure I disabled NEWNEWS on the servers I was running in the late 1990s.

    The current implementation is quite efficient if the user specifies a
    single newsgroup, which is a common use case of NEWNEWS. If they specify
    a wildmat that matches a bunch of groups, it's still fairly bad.
    NEWNEWS * has to search the overview of every group on the server for
    articles in the date range, so if you have a lot of groups, that is going
    to be painful.

    Fast NEWNEWS with broad wildmat patterns requires a different data
    structure (article message IDs in arrival order regardless of newsgroup,
    but with the newsgroup information available to do the wildmat match)
    that's otherwise kind of pointless to maintain.
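
    A minimal sketch of what such a structure could look like, assuming an
    in-memory list kept in arrival order. The class ArrivalIndex and its
    methods are hypothetical, and Python's fnmatch only approximates wildmat
    (it has no "!" negation and no comma-separated pattern lists), so this
    illustrates the idea rather than anything INN ships.

```python
import bisect
import fnmatch


class ArrivalIndex:
    """Message IDs in arrival order, with newsgroups kept for matching."""

    def __init__(self):
        self._times = []    # arrival timestamps, kept sorted
        self._entries = []  # parallel list of (message_id, groups)

    def add(self, arrival, message_id, groups):
        # Articles normally arrive in time order, so this is usually an append.
        pos = bisect.bisect_right(self._times, arrival)
        self._times.insert(pos, arrival)
        self._entries.insert(pos, (message_id, tuple(groups)))

    def newnews(self, pattern, since):
        # Binary search for the first article at or after `since`, then scan
        # only that tail, whatever the pattern matches.
        start = bisect.bisect_left(self._times, since)
        return [mid for mid, groups in self._entries[start:]
                if any(fnmatch.fnmatchcase(g, pattern) for g in groups)]
```

    With this layout a broad pattern costs one binary search plus a scan of
    the recent tail, instead of a pass over the overview of every group.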


  • From Jon Ribbens@21:1/5 to Russ Allbery on Thu Aug 22 16:33:48 2024
    On 2024-08-22, Russ Allbery <eagle@eyrie.org> wrote:
    > Jon Ribbens <jon+usenet@unequivocal.eu> writes:
    >> I've no idea. I haven't looked since the 1990s, when it was very rarely,
    >> if ever, blocked.
    >
    > Hm, I'm not sure about that. Blocking NEWNEWS in INN goes back a long
    > time. The old implementation did history text file searches, IIRC, and
    > was rather inefficient for single groups if the server had a lot of
    > traffic. Maybe I'm misremembering, but I don't think I am; I'm pretty
    > sure I disabled NEWNEWS on the servers I was running in the late 1990s.

    It may perhaps be that you were not working at a UK dial-up ISP? :-)

    At the time we sold an Internet connectivity package for Acorn RISC
    computers, that included an off-line mail/newsreader whereby you
    dialled up, fetched the new articles for your groups, then disconnected
    (calls were charged by the minute, so you did not want to remain online
    while reading/replying).

    This obviously worked by using NEWNEWS, and I don't recall many
    complaints about it not working, although looking back I may dimly
    recollect one or two.

    (I may perhaps be the only person in the world who's written both
    a news server and a graphical web browser in assembly language.)

    It may have helped that the UK ISP market at the time was dominated
    by Demon Internet, who pioneered the dial-up market, and as far as
    I recall they used their own custom news server, which certainly
    did support NEWNEWS efficiently enough for their purposes.

    > The current implementation is quite efficient if the user specifies a
    > single newsgroup, which is a common use case of NEWNEWS. If they specify
    > a wildmat that matches a bunch of groups, it's still fairly bad.
    > NEWNEWS * has to search the overview of every group on the server for
    > articles in the date range, so if you have a lot of groups, that is going
    > to be painful.

    I would expect that our software used NEWNEWS with specific group names.
    It was a non-technical GUI interface so I don't think people will have
    been interacting with it using patterns rather than ticking groups shown
    on the screen.

  • From Russ Allbery@21:1/5 to Jon Ribbens on Thu Aug 22 09:40:30 2024
    Jon Ribbens <jon+usenet@unequivocal.eu> writes:

    > It may perhaps be that you were not working at a UK dial-up ISP? :-)

    I was not! :)

    > It may have helped that the UK ISP market at the time was dominated by
    > Demon Internet, who pioneered the dial-up market, and as far as I recall
    > they used their own custom news server, which certainly did support
    > NEWNEWS efficiently enough for their purposes.

    Oh, sure, there were definitely other implementations out there that were
    designed to work well with NEWNEWS. The context of the original question
    was INN, though, and I'm pretty sure the options to disable NEWNEWS in INN
    go way back.


  • From Stefan Ram@21:1/5 to Jon Ribbens on Thu Aug 22 20:08:07 2024
    Jon Ribbens <jon+usenet@unequivocal.eu> wrote or quoted:
    >I've no idea. I haven't looked since the 1990s, when it was very rarely,
    >if ever, blocked. But it would seem rather surprising if the efficiency
    >of looking up an article by Message-ID has somehow gone *down* in the
    >intervening period.

    If I've got a message ID and wanna check out the related article,
    I throw in:

    article <message ID>

    .
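
    For a gateway that wants to do this for a whole thread, a rough sketch
    of building that command for each message ID and dealing the requests
    out over several servers in turn. The function names and the
    example.invalid host names are made up for illustration, round-robin is
    just one possible policy, and the code only builds the bytes to send; it
    does not open any connections.

```python
from itertools import cycle


def article_command(message_id):
    """Build the NNTP command line that fetches an article by message ID."""
    mid = message_id.strip()
    # Message IDs go on the wire wrapped in angle brackets.
    if not mid.startswith("<"):
        mid = "<" + mid
    if not mid.endswith(">"):
        mid = mid + ">"
    return ("ARTICLE " + mid + "\r\n").encode("utf-8")


def plan_fetches(message_ids, servers):
    """Assign each message ID to a server in round-robin order."""
    plan = {server: [] for server in servers}
    for server, mid in zip(cycle(servers), message_ids):
        plan[server].append(article_command(mid))
    return plan


if __name__ == "__main__":
    thread = ["<1@example.invalid>", "<2@example.invalid>", "<3@example.invalid>"]
    hosts = ["news1.example.invalid", "news2.example.invalid"]
    for host, commands in plan_fetches(thread, hosts).items():
        print(host, commands)
```

    A server that does not carry a given article answers 430 rather than
    220, so a real client would probably want to retry that message ID on
    one of the other servers.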

  • From Colin Macleod@21:1/5 to All on Fri Aug 23 06:49:23 2024
    Russ Allbery <eagle@eyrie.org> posted:

    > Colin Macleod <user7@cmacleod.me.uk.invalid> writes:
    >
    >> Hi, I'm looking for some guidance on the efficiency of requesting
    >> individual articles by message-id compared to group and article number.
    >> For a server such as INN, does finding an article from its message-id
    >> impose extra load?
    >
    > No, it should be fast. It's a history lookup of the message ID to get the
    > storage token, and then retrieval of the article by storage token. It
    > should be roughly equivalent to retrieving the article by number. There's
    > one extra hash calculation, I think, but it's going to be in the noise.

    That's great, thanks for the clarification!

    --
    Colin Macleod.
