• Historical articles and longest retention.

    From ZMarkGC@21:1/5 to All on Sun May 14 22:56:39 2023
    I have used giganews for grabbing old articles, but they only reach
    2004. Does anyone have older text retention available over NNTP (i.e., not
    google newsgroups or web archives). I would love to slurp/archive
    anything not stored on the major commercial providers.

    If so, can you give a rough disk usage and storage backend?

    I have seen people mention 50 MB/day recently based on eternal-september
    stats, so assuming the average daily usage is static since 1980, it
    should be under 1TB.
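
As a rough check of that estimate, assuming a constant 50 MB/day from 1980 through 2023 (real volume varied a lot over that period, so this is only a ballpark), the arithmetic can be sketched as follows. The function name is hypothetical.

```shell
#!/bin/bash
# Rough upper bound: constant daily volume over the whole period.
# Assumption: 365 days per year, leap days ignored.
estimate_mb() {
    local mb_per_day=$1 start_year=$2 end_year=$3
    echo $(( mb_per_day * 365 * (end_year - start_year) ))
}

total_mb=$(estimate_mb 50 1980 2023)
echo "${total_mb} MB (~$(( total_mb / 1000 )) GB)"
```

Under those assumptions the total stays below 1 TB, which matches the guess above, although overview data and history files would add to it.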

    If not, I am planning to inject articles from archive.org and anywhere
    else I can find them.

    Are there any issues with injecting posts from 30 years ago? I don't
    peer with anyone but if I can get everything imported and renumbered
    correctly for my local reader to understand, I might consider peering or
    making a public NNTP connection available.

    -------------

    ZMarkGC

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jesse Rehmer@21:1/5 to ZMarkGC on Sun May 14 23:33:28 2023
    On May 14, 2023 at 4:56:39 PM CDT, "ZMarkGC" <ZMarkGC@example.com> wrote:

    I have used giganews for grabbing old articles, but they only reach
    2004. Does anyone have older text retention available over NNTP (i.e not google newsgroups or web archives). I would love to slurp/archive
    anything not stored on the major commercial providers.

    If so, can you give a rough disk usage and storage backend?

    I have seen people mention 50mb/day recently based on eternal-september stats, so assuming the average daily usage is static since 1980, it
    should be under 1TB.

    If not, I am planning to inject articles from archive.org and anywhere
    else I can find them.

    Are there any issues with injecting posts from 30 years ago? I don't
    peer with anyone but if I can get everything imported and renumbered correctly for my local reader to understand, I might consider peering or making a public NNTP connection available.

    -------------

    ZMarkGC

    The oldest available on-spool articles I've been able to obtain are from 2003, not much farther back than GigaNews.

    I've grabbed the Big8, de.*, it.*, and most of uk.* from ~2003 and I'm at
    1.4TB with overview.

  • From Retro Guy@21:1/5 to ZMarkGC on Mon May 15 01:28:09 2023
    ZMarkGC wrote:

    I have used giganews for grabbing old articles, but they only reach
    2004. Does anyone have older text retention available over NNTP (i.e not google newsgroups or web archives). I would love to slurp/archive
    anything not stored on the major commercial providers.

    If so, can you give a rough disk usage and storage backend?

    I have seen people mention 50mb/day recently based on eternal-september stats, so assuming the average daily usage is static since 1980, it
    should be under 1TB.

    If not, I am planning to inject articles from archive.org and anywhere
    else I can find them.

    Are there any issues with injecting posts from 30 years ago? I don't
    peer with anyone but if I can get everything imported and renumbered correctly for my local reader to understand, I might consider peering or making a public NNTP connection available.

    It's been a while since I looked at them, but I grabbed some old archives
    and took a look. The oldest ones I found (some were uni's sending their first test article) had some differences in headers.

    I can't remember right now the specifics, but it would take some (probably simple) scripting to modify them to work correctly with current news servers.

    --
    Retro Guy

  • From Retro Guy@21:1/5 to Retro Guy on Mon May 15 02:08:18 2023
    Retro Guy wrote:

    ZMarkGC wrote:

    I have used giganews for grabbing old articles, but they only reach
    2004. Does anyone have older text retention available over NNTP (i.e not
    google newsgroups or web archives). I would love to slurp/archive
    anything not stored on the major commercial providers.

    If so, can you give a rough disk usage and storage backend?

    I have seen people mention 50mb/day recently based on eternal-september
    stats, so assuming the average daily usage is static since 1980, it
    should be under 1TB.

    If not, I am planning to inject articles from archive.org and anywhere
    else I can find them.

    Are there any issues with injecting posts from 30 years ago? I don't
    peer with anyone but if I can get everything imported and renumbered
    correctly for my local reader to understand, I might consider peering or
    making a public NNTP connection available.

    It's been a while since I looked at them, but I grabbed some old archives
    and took a look. The oldest ones I found (some were uni's sending their first test article) had some differences in headers.

    I can't remember right now the specifics, but it would take some (probably simple) scripting to modify them to work correctly with current news servers.

    I found an example:

    ----------
    Autzoo.101
    test
    utzoo!henry
    Fri Feb 6 00:19:47 1981
    first_test
    This is the first U of T test of the Duke news program.
    Here is some more text.
    And some more.
    ----------

    The newer the article, the less work the header needs to work properly.

    --
    Retro Guy

  • From Russ Allbery@21:1/5 to Retro Guy on Mon May 15 08:31:53 2023
    retro.guy@rocksolidbbs.com (Retro Guy) writes:

    I found an example:

    ----------
    Autzoo.101
    test
    utzoo!henry
    Fri Feb 6 00:19:47 1981
    first_test
    This is the first U of T test of the Duke news program.
    Here is some more text.
    And some more.
    ----------

    The newer the article, the less work the header needs to work properly.

    This is the "A News" format (named after the software in use at the time). There is an example in RFC-850 but not a specification (RFC-850 documents
    the B News format). I think there may be a specification for it somewhere
    in old software, but I'm not sure where off-hand.
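
To illustrate the layout, here is a minimal sketch that maps the five leading lines of an A News article (ID prefixed with "A", newsgroups, path, date, title, as in the example above and in RFC-850) onto named header fields. The function name and the "@anews.invalid" suffix are hypothetical, and the output would still need a From field and a reformatted Date before a current server is likely to accept it.

```shell
#!/bin/bash
# Sketch only: turn the positional A News header lines into named fields.
# Assumption: the article arrives on stdin with exactly five header lines.
anews_to_headers() {
    awk '
        NR == 1 { id = substr($0, 2); next }   # drop the leading "A"
        NR == 2 { groups = $0; next }
        NR == 3 { path = $0; next }
        NR == 4 { date = $0; next }
        NR == 5 {
            # "@anews.invalid" is a made-up suffix, not part of the format
            print "Message-ID: <" id "@anews.invalid>"
            print "Newsgroups: " groups
            print "Path: " path
            print "Date: " date
            print "Subject: " $0
            print ""                            # blank line ends the headers
            next
        }
        { print }                               # body lines pass through
    '
}

# Usage (hypothetical file names):
#   anews_to_headers < utzoo.101 > utzoo.101.converted
```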

    RFC-1036 documents the modern format, and everything from that point
    forward is *mostly* compatible. The B News format (RFC-850) looks more
    like the modern format but has some interesting variations, such as Title instead of Subject, Article-I.D. instead of Message-ID, and UUCP bang
    paths for From addresses.
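
The two field-name differences mentioned above could be handled with a small filter like the one below. It is a sketch under the assumption that renaming is all that is wanted; it does not rewrite the field values (an Article-I.D. value may still need angle brackets, and bang-path From addresses are left alone). The function name is hypothetical.

```shell
#!/bin/bash
# Sketch only: rename B News header fields to their modern names.
# Only the header section (up to the first blank line) is touched.
rename_bnews_headers() {
    awk '
        in_body { print; next }
        /^$/    { in_body = 1; print; next }
        /^Title:/          { sub(/^Title:/, "Subject:") }
        /^Article-I\.D\.:/ { sub(/^Article-I\.D\.:/, "Message-ID:") }
        { print }
    '
}

# Usage (hypothetical file names):
#   rename_bnews_headers < old-article > new-article
```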

    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

  • From Spiros Bousbouras@21:1/5 to ZMarkGC on Thu May 18 10:37:21 2023
    On Sun, 14 May 2023 22:56:39 +0100
    ZMarkGC <ZMarkGC@example.com> wrote:
    I have used giganews for grabbing old articles, but they only reach
    2004. Does anyone have older text retention available over NNTP (i.e not google newsgroups or web archives). I would love to slurp/archive
    anything not stored on the major commercial providers.

    If so, can you give a rough disk usage and storage backend?

    I have seen people mention 50mb/day recently based on eternal-september stats, so assuming the average daily usage is static since 1980, it
    should be under 1TB.

    If not, I am planning to inject articles from archive.org and anywhere
    else I can find them.

    https://www.xach.com/naggum/articles/notes.html has a link to a
    comp.lang.lisp archive , http://data.xach.com.s3.amazonaws.com/cll.txt.gz . This I think is close to what you're asking but specific to one newsgroup. Earliest posts are from 1987. The moderator of comp.compilers also keeps a comprehensive archive going back to the 1990s. You can find it with a bit of googling.

    Are there any issues with injecting posts from 30 years ago? I don't
    peer with anyone but if I can get everything imported and renumbered correctly for my local reader to understand, I might consider peering or making a public NNTP connection available.

    A public NNTP connection to such an archive would be amazing.

  • From Retro Guy@21:1/5 to Spiros Bousbouras on Fri Jun 2 18:48:43 2023
    Spiros Bousbouras wrote:

    On Sun, 14 May 2023 22:56:39 +0100
    ZMarkGC <ZMarkGC@example.com> wrote:
    I have used giganews for grabbing old articles, but they only reach
    2004. Does anyone have older text retention available over NNTP (i.e not
    google newsgroups or web archives). I would love to slurp/archive
    anything not stored on the major commercial providers.

    If so, can you give a rough disk usage and storage backend?

    I have seen people mention 50mb/day recently based on eternal-september
    stats, so assuming the average daily usage is static since 1980, it
    should be under 1TB.

    If not, I am planning to inject articles from archive.org and anywhere
    else I can find them.

    https://www.xach.com/naggum/articles/notes.html has a link to a comp.lang.lisp archive , http://data.xach.com.s3.amazonaws.com/cll.txt.gz . This I think is close to what you're asking but specific to one newsgroup. Earliest posts are from 1987. The moderator of comp.compilers also keeps a comprehensive archive going back to the 1990s. You can find it with a bit of googling.

    Are there any issues with injecting posts from 30 years ago? I don't
    peer with anyone but if I can get everything imported and renumbered
    correctly for my local reader to understand, I might consider peering or
    making a public NNTP connection available.

    A public NNTP connection to such an archive would be amazing.

    I've taken some time to modify some articles so that inn2 will accept them. These are all from the 1980s.

    I needed to change the Date: format, so all the articles now end up with
    my timezone (MST), but the date/times are correct, just wrong timezone.
    Removed 'Relay-Version', 'Posting-Version' and 'Date-Received' headers.
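
The timezone side effect described above might be avoided by converting old timestamps with an explicit zone. A minimal sketch, assuming GNU date (for the -d option), an input in the ctime-style form seen in the A News example in this thread, and that treating the timestamp as UTC is acceptable (the original zone is not recorded there, so that is an assumption). The function name is hypothetical.

```shell
#!/bin/bash
# Sketch only: reformat a ctime-style timestamp as an RFC 5322 style date.
# Assumptions: GNU date is installed; the input is interpreted as UTC.
to_rfc_date() {
    LC_ALL=C TZ=UTC date -d "$1" '+%a, %d %b %Y %H:%M:%S +0000'
}

to_rfc_date 'Fri Feb 6 00:19:47 1981'
```

Other 1980s date layouts would need their own handling; this only covers the one form shown.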

    Now they post except for one exception. I still get '441 Can't set system Xref header field'
    on some articles, but it is a minority of them.

    I've started with the can.* hierarchy, and will continue through the rest of what I have (which is a lot), but it will take me a long time to complete.

    You are free to view and/or pull the articles from news.novalink.us:119 if
    you are interested. It will probably take me most of the summer to get it all done as I don't have a ton of free time to work on it, but I want to complete it at some point.

    If anyone has suggestions on the above error (Xref), I'd be glad to try to get those articles to post also.

    No account required to read at news.novalink.us:119

    --
    Retro Guy

  • From Julien ÉLIE@21:1/5 to All on Fri Jun 2 21:27:28 2023
    Hi Retro Guy,

    Removed 'Relay-Version', 'Posting-Version' and 'Date-Received' headers.

    Now they post except for one exception. I still get '441 Can't set
    system Xref header field' on some articles, but it is a minority of
    them.

    If anyone has suggestions on the above error (Xref), I'd be glad to try
    to get those articles to post also.

    I would just suggest removing existing Xref header fields, like you did
    for Relay-Version et al.

    I bet you'll find that the more recent the articles are, the more
    header fields you'll need to add to the removal list, as they are not
    supposed to be present in posted articles.
    Like X-Trace, X-Complaints-To, NNTP-Posting-Host, Injection-Info, etc.
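
A header-stripping filter along these lines could look like the sketch below. The field list is taken from the names mentioned in this thread and is not exhaustive; the function name is hypothetical. Folded continuation lines of a dropped field are dropped with it, and the body is left untouched.

```shell
#!/bin/bash
# Sketch only: remove header fields that should not be present when posting.
# The list of names comes from this thread; extend it as needed.
strip_headers() {
    awk '
        BEGIN {
            n = split("relay-version posting-version date-received xref " \
                      "x-trace x-complaints-to nntp-posting-host injection-info",
                      names, " ")
            for (i = 1; i <= n; i++) drop[names[i]] = 1
        }
        in_body  { print; next }
        /^$/     { in_body = 1; print; next }      # end of header section
        /^[ \t]/ { if (!skipping) print; next }    # folded continuation line
        {
            name = tolower($0)
            sub(/:.*/, "", name)                   # keep the field name only
            skipping = (name in drop)
            if (!skipping) print
        }
    '
}

# Usage (hypothetical file names):
#   strip_headers < article > article.clean
```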

    --
    Julien ÉLIE

    « Je préfère glisser ma peau sous des draps pour le plaisir des sens que
    de la risquer sous les drapeaux pour le prix de l'essence. » (Raymond
    Devos)

  • From Retro Guy@21:1/5 to All on Fri Jun 2 19:47:33 2023
    Julien_ÉLIE wrote:


    Hi Retro Guy,

    Removed 'Relay-Version', 'Posting-Version' and 'Date-Received' headers.

    Now they post except for one exception. I still get '441 Can't set
    system Xref header field' on some articles, but it is a minority of
    them.

    If anyone has suggestions on the above error (Xref), I'd be glad to try
    to get those articles to post also.

    I would just suggest to remove existing Xref header fields, like you did
    for Relay-Version & al.

    I bet you'll find out that the more recent the articles are, the more
    header fields you'll need adding in the list to remove as they are not supposed to be present in posted articles.
    Like X-Trace, X-Complaints-To, NNTP-Posting-Host, Injection-Info, etc.

    Thank you for the hints. I will go ahead and add these headers for deletion
    as they don't need to be there anyway when posting as a READER.

    Let's see how it goes :)

    --
    Retro Guy

  • From Jesse Rehmer@21:1/5 to iulius@nom-de-mon-site.com.invalid on Fri Jun 2 20:12:32 2023
    On Jun 2, 2023 at 2:27:28 PM CDT, "Julien ÉLIE" <iulius@nom-de-mon-site.com.invalid> wrote:


    Hi Retro Guy,

    Removed 'Relay-Version', 'Posting-Version' and 'Date-Received' headers.

    Now they post except for one exception. I still get '441 Can't set
    system Xref header field' on some articles, but it is a minority of
    them.

    If anyone has suggestions on the above error (Xref), I'd be glad to try
    to get those articles to post also.

    I would just suggest to remove existing Xref header fields, like you did
    for Relay-Version & al.

    I bet you'll find out that the more recent the articles are, the more
    header fields you'll need adding in the list to remove as they are not supposed to be present in posted articles.
    Like X-Trace, X-Complaints-To, NNTP-Posting-Host, Injection-Info, etc.

    When I use suck/pullnews, articles with these headers come in with no issue,
    is this due to a difference in the way the message gets to INN?

  • From Jesse Rehmer@21:1/5 to All on Fri Jun 2 20:11:48 2023
    On Jun 2, 2023 at 1:48:43 PM CDT, "Retro Guy" <Retro Guy> wrote:

    Spiros Bousbouras wrote:

    On Sun, 14 May 2023 22:56:39 +0100
    ZMarkGC <ZMarkGC@example.com> wrote:
    I have used giganews for grabbing old articles, but they only reach
    2004. Does anyone have older text retention available over NNTP (i.e not
    google newsgroups or web archives). I would love to slurp/archive
    anything not stored on the major commercial providers.

    If so, can you give a rough disk usage and storage backend?

    I have seen people mention 50mb/day recently based on eternal-september
    stats, so assuming the average daily usage is static since 1980, it
    should be under 1TB.

    If not, I am planning to inject articles from archive.org and anywhere
    else I can find them.

    https://www.xach.com/naggum/articles/notes.html has a link to a
    comp.lang.lisp archive , http://data.xach.com.s3.amazonaws.com/cll.txt.gz .
    This I think is close to what you're asking but specific to one newsgroup.
    Earliest posts are from 1987. The moderator of comp.compilers also keeps a
    comprehensive archive going back to the 1990s. You can find it with a bit of
    googling.

    Are there any issues with injecting posts from 30 years ago? I don't
    peer with anyone but if I can get everything imported and renumbered
    correctly for my local reader to understand, I might consider peering or
    making a public NNTP connection available.

    A public NNTP connection to such an archive would be amazing.

    I've taken some time to modify some articles so that inn2 will accept them. These are all from the 1980s.

    I needed to change the Date: format, so all the articles now end up with
    my timezone (MST), but the date/times are correct, just wrong timezone. Removed 'Relay-Version', 'Posting-Version' and 'Date-Received' headers.

    Now they post except for one exception. I still get '441 Can't set system Xref
    header field'
    on some articles, but it is a minority of them.

    I've started with the can.* hierarchy, and will continue through the rest of what I have (which is a lot), but it will take me a long time to complete.

    You are free to view and/or pull the articles from news.novalink.us:119 if you are interested. It will probably take me most of the summer to get it all done as I don't have a ton of free time to work on it, but I want to complete at some point.

    If anyone has suggestions on the above error (Xref), I'd be glad to try to get
    those articles to post also.

    No account required to read at news.novalink.us:119

    Are you going to take a crack at the net.* stuff that's available in various archives? That stuff I will definitely suck off of your server, if you do. :)

    Keep us updated as you progress. If you come up with a scriptable or easily repeatable process and need another machine to help munge/inject articles let me know, I'd be happy to offer some assistance.

    I'm still pulling stuff available on public spools and would love to get stuff from archives, but this work is slow and time consuming. Took a break from sucking because I need to switch out INN's article storage subsystem to CNFS from tradspool. Starting to run into some stupid performance problems with certain maintenance operations that are annoying (like expireover taking days to a week or more to complete). Currently feeding my spool into another machine at home with one large CNFS buffer and will see if it resolves some annoyances of a large spool.

  • From Julien ÉLIE@21:1/5 to All on Fri Jun 2 22:26:20 2023
    Hi Jesse,

    I bet you'll find out that the more recent the articles are, the more
    header fields you'll need adding in the list to remove as they are not
    supposed to be present in posted articles.
    Like X-Trace, X-Complaints-To, NNTP-Posting-Host, Injection-Info, etc.

    When I use suck/pullnews, articles with these headers come in with no issue, is this due to a difference in the way the message gets to INN?

    Yes, you've configured in incoming.conf your suck/pullnews connections
    to be handled by innd.
    Retro Guy uses nnrpd. He may want to try to feed innd, that's a good
    idea (hoping it won't complain of missing headers).

    --
    Julien ÉLIE

    « Hey, I had to let awk be better at *something*… » (Larry Wall)

  • From Jesse Rehmer@21:1/5 to iulius@nom-de-mon-site.com.invalid on Fri Jun 2 20:40:33 2023
    On Jun 2, 2023 at 3:26:20 PM CDT, "Julien ÉLIE" <iulius@nom-de-mon-site.com.invalid> wrote:

    Hi Jesse,

    I bet you'll find out that the more recent the articles are, the more
    header fields you'll need adding in the list to remove as they are not
    supposed to be present in posted articles.
    Like X-Trace, X-Complaints-To, NNTP-Posting-Host, Injection-Info, etc.

    When I use suck/pullnews, articles with these headers come in with no issue,
    is this due to a difference in the way the message gets to INN?

    Yes, you've configured in incoming.conf your suck/pullnews connections
    to be handled by innd.
    Retro Guy uses nnrpd. He may want to try to feed innd, that's a good
    idea (hoping it won't complain of missing headers).

    I never added anything to incoming.conf, but I'm running the tools on the same server as INN. I never paid attention to how the tools actually 'post' the articles to be honest.

  • From Retro Guy@21:1/5 to Retro Guy on Fri Jun 2 21:19:05 2023
    Retro Guy wrote:

    Julien_ÉLIE wrote:


    Hi Retro Guy,

    Removed 'Relay-Version', 'Posting-Version' and 'Date-Received' headers.

    Now they post except for one exception. I still get '441 Can't set
    system Xref header field' on some articles, but it is a minority of
    them.

    If anyone has suggestions on the above error (Xref), I'd be glad to try
    to get those articles to post also.

    I would just suggest to remove existing Xref header fields, like you did
    for Relay-Version & al.

    I bet you'll find out that the more recent the articles are, the more
    header fields you'll need adding in the list to remove as they are not
    supposed to be present in posted articles.
    Like X-Trace, X-Complaints-To, NNTP-Posting-Host, Injection-Info, etc.

    Thank you for the hints. I will go ahead and add these headers for deletion as they don't need to be there anyway when posting as a READER.

    Let's see how it goes :)

    That helps. I actually did try to remove the Xref header previously, but I
    must have had a typo or something. That error is gone now.

    One other thing I forgot to mention is that I needed to remove lines of
    just '.', so I converted them to '..', same as a newsreader should.

    --
    Retro Guy

  • From Retro Guy@21:1/5 to All on Fri Jun 2 21:25:05 2023
    Julien_ÉLIE wrote:

    Hi Jesse,

    I bet you'll find out that the more recent the articles are, the more
    header fields you'll need adding in the list to remove as they are not
    supposed to be present in posted articles.
    Like X-Trace, X-Complaints-To, NNTP-Posting-Host, Injection-Info, etc.

    When I use suck/pullnews, articles with these headers come in with no issue,
    is this due to a difference in the way the message gets to INN?

    Yes, you've configured in incoming.conf your suck/pullnews connections
    to be handled by innd.
    Retro Guy uses nnrpd. He may want to try to feed innd, that's a good
    idea (hoping it won't complain of missing headers).

    Yes, I'm using nnrpd. The uploading is easy, and I've written a script to modify the headers so that is easy also.

    One thing that would really make a difference is not needing to create the groups by hand. Is it possible for inn2 to create groups on demand? That would make all the difference.

    --
    Retro Guy

  • From Retro Guy@21:1/5 to Jesse Rehmer on Fri Jun 2 21:29:32 2023
    Jesse Rehmer wrote:

    On Jun 2, 2023 at 3:26:20 PM CDT, "Julien ÉLIE" <iulius@nom-de-mon-site.com.invalid> wrote:

    Hi Jesse,

    I bet you'll find out that the more recent the articles are, the more
    header fields you'll need adding in the list to remove as they are not
    supposed to be present in posted articles.
    Like X-Trace, X-Complaints-To, NNTP-Posting-Host, Injection-Info, etc.

    When I use suck/pullnews, articles with these headers come in with no issue,
    is this due to a difference in the way the message gets to INN?

    Yes, you've configured in incoming.conf your suck/pullnews connections
    to be handled by innd.
    Retro Guy uses nnrpd. He may want to try to feed innd, that's a good
    idea (hoping it won't complain of missing headers).

    I never added anything to incoming.conf, but I'm running the tools on the same
    server as INN. I never paid attention to how the tools actually 'post' the articles to be honest.

    Just to explain how I'm doing it. I dump all the file names (with path) to a big
    file (using find), then run my script to modify all the headers at one time and dump those files
    to another dir. Then I run a script to upload all the files in that dir using rpost.

    I do notice that the newer files (later 80s or so) do not contain the headers I need
    to remove, but earlier files do. I just run them all through my script.

    --
    Retro Guy

  • From Retro Guy@21:1/5 to Jesse Rehmer on Fri Jun 2 21:23:15 2023
    Jesse Rehmer wrote:

    On Jun 2, 2023 at 1:48:43 PM CDT, "Retro Guy" <Retro Guy> wrote:

    Spiros Bousbouras wrote:

    On Sun, 14 May 2023 22:56:39 +0100
    ZMarkGC <ZMarkGC@example.com> wrote:
    I have used giganews for grabbing old articles, but they only reach
    2004. Does anyone have older text retention available over NNTP (i.e not
    google newsgroups or web archives). I would love to slurp/archive
    anything not stored on the major commercial providers.

    If so, can you give a rough disk usage and storage backend?

    I have seen people mention 50mb/day recently based on eternal-september
    stats, so assuming the average daily usage is static since 1980, it
    should be under 1TB.

    If not, I am planning to inject articles from archive.org and anywhere
    else I can find them.

    https://www.xach.com/naggum/articles/notes.html has a link to a
    comp.lang.lisp archive , http://data.xach.com.s3.amazonaws.com/cll.txt.gz .
    This I think is close to what you're asking but specific to one newsgroup.
    Earliest posts are from 1987. The moderator of comp.compilers also keeps a
    comprehensive archive going back to the 1990s. You can find it with a bit of
    googling.

    Are there any issues with injecting posts from 30 years ago? I don't
    peer with anyone but if I can get everything imported and renumbered
    correctly for my local reader to understand, I might consider peering or
    making a public NNTP connection available.

    A public NNTP connection to such an archive would be amazing.

    I've taken some time to modify some articles so that inn2 will accept them.
    These are all from the 1980s.

    I needed to change the Date: format, so all the articles now end up with
    my timezone (MST), but the date/times are correct, just wrong timezone.
    Removed 'Relay-Version', 'Posting-Version' and 'Date-Received' headers.

    Now they post except for one exception. I still get '441 Can't set system Xref
    header field'
    on some articles, but it is a minority of them.

    I've started with the can.* hierarchy, and will continue through the rest of
    what I have (which is a lot), but it will take me a long time to complete.

    You are free to view and/or pull the articles from news.novalink.us:119 if
    you are interested. It will probably take me most of the summer to get it all
    done as I don't have a ton of free time to work on it, but I want to complete
    at some point.

    If anyone has suggestions on the above error (Xref), I'd be glad to try to get
    those articles to post also.

    No account required to read at news.novalink.us:119

    Are you going to take a crack at the net.* stuff that's available in various archives? That stuff I will definitely suck off of your server, if you do. :)

    That's the hierarchy I'm working on now. Only 504,091 articles to handle :)

    Keep us updated as you progress. If you come up with a scriptable or easily repeatable process and need another machine to help munge/inject articles let me know, I'd be happy to offer some assistance.

    Thank you, I'll keep you in mind if needed. Right now I should be able to handle
    it as this is an inn2 install specifically dedicated to this.

    --
    Retro Guy

  • From Julien ÉLIE@21:1/5 to All on Fri Jun 2 23:36:50 2023
    Hi Retro Guy,

    One other thing I forgot to mention is that I needed to remove lines of
    just '.', so I converted them to '..', same as a newsreader should.

    Actually, you need to add an additional dot to lines *beginning* with a
    dot, not only to lines containing only a dot.
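
That rule (usually called dot-stuffing) can be sketched as a one-line filter. This assumes the articles are being prepared for a tool that does not already do the stuffing itself; if it does, applying it twice would corrupt the text. The function name is hypothetical.

```shell
#!/bin/bash
# Sketch only: double the leading dot on every line that starts with one.
dot_stuff() {
    sed 's/^\./../'
}

# Usage (hypothetical file names):
#   dot_stuff < article > article.stuffed
```

A line containing only "." becomes "..", and a line such as ".signature" becomes "..signature"; lines that do not start with a dot are unchanged.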

    --
    Julien ÉLIE

    « La bête aux douze pieds qui marche sur la tête. » (Nougaro)

  • From Julien ÉLIE@21:1/5 to All on Fri Jun 2 23:41:32 2023
    Hi Retro Guy,

    One thing that would really make a difference is not needing to create the groups by hand. Is it possible for inn2 to create groups on demand? That would make all the difference.

    No, it does not create groups on-the-fly.
    Note that the logtrash parameter in inn.conf can be used to get a list
    of newsgroups that are not present on the server but received a posting
    attempt.

    As you're parsing all the articles before feeding them, why not parse
    the Newsgroups header field and create a list of newsgroups you then
    make unique and run "ctlinnd newgroup xxx" on all of them? (INN will
    then create missing newsgroups)
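
The extraction step suggested above might be sketched as follows. It assumes one article per file with the header section ending at the first blank line; the function and file names are hypothetical. The resulting list is worth reviewing by hand before anything is created, since it will contain every misspelled group name found in the articles.

```shell
#!/bin/bash
# Sketch only: print the unique newsgroup names found in the Newsgroups
# header field of each article file given as an argument.
list_newsgroups() {
    for article in "$@"; do
        awk '
            /^$/ { exit }                          # stop at end of headers
            tolower($0) ~ /^newsgroups:/ {
                sub(/^[^:]*:[ \t]*/, "")           # strip the field name
                n = split($0, groups, /[ \t]*,[ \t]*/)
                for (i = 1; i <= n; i++)
                    if (groups[i] != "") print groups[i]
                exit
            }
        ' "$article"
    done | sort -u
}

# Usage (hypothetical paths):
#   list_newsgroups /path/to/articles/* > newsgroups.txt
# After reviewing newsgroups.txt, each name can be passed to
# "ctlinnd newgroup" as described above.
```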

    --
    Julien ÉLIE

    « Il n'y a que le premier pas qui coûte. » (Mme du Deffand)

  • From Retro Guy@21:1/5 to All on Fri Jun 2 21:51:00 2023
    Julien_ÉLIE wrote:


    Hi Retro Guy,

    One thing that would really make a difference is not needing to create the
    groups by hand. Is it possible for inn2 to create groups on demand? That
    would make all the difference.

    No, it does not create groups on-the-fly.
    Note that the logtrash parameter in inn.conf can be used to have a list
    of newsgroups not present on the server but which received an attempt of post.

    As you're parsing all the articles before feeding them, why not parse
    the Newsgroups header field and create a list of newsgroups you then
    make unique and run "ctlinnd newgroup xxx" on all of them? (INN will
    then create missing newsgroups)

    That's an excellent idea. My brain was getting a bit weak trying to come up
    with a plan. That's when you miss the obvious :)

    --
    Retro Guy

  • From Retro Guy@21:1/5 to Retro Guy on Fri Jun 2 22:43:20 2023
    Retro Guy wrote:

    Julien_ÉLIE wrote:


    Hi Retro Guy,

    One thing that would really make a difference is not needing to create the
    groups by hand. Is it possible for inn2 to create groups on demand? That
    would make all the difference.

    No, it does not create groups on-the-fly.
    Note that the logtrash parameter in inn.conf can be used to have a list
    of newsgroups not present on the server but which received an attempt of
    post.

    As you're parsing all the articles before feeding them, why not parse
    the Newsgroups header field and create a list of newsgroups you then
    make unique and run "ctlinnd newgroup xxx" on all of them? (INN will
    then create missing newsgroups)

    That's an excellent idea. My brain was getting bit weak trying to come up with a plan. That's when you miss the obvious :)

    Much better! Thanks to Julien's brain (better than mine), I got the groups created in about 15 minutes of work (including writing the script to extract the group names and split the multiple groups in a line).

    Also, thanks to wed for providing a simple bash script to create groups:

    #!/bin/bash
    # Create every newsgroup named in newsgroups.txt (whitespace-separated).
    for WORD in $(cat ./newsgroups.txt)
    do
        echo "$WORD"
        ctlinnd newgroup "$WORD"
    done
    echo "Done."

    from: https://news.novabbs.org/rocksolid/article-flat.php?id=162&group=rocksolid.shared.linux#162

    --
    Retro Guy

  • From Thomas Hochstein@21:1/5 to All on Sat Jun 3 01:35:40 2023
    Julien ÉLIE wrote:

    As you're parsing all the articles before feeding them, why not parse
    the Newsgroups header field and create a list of newsgroups you then
    make unique and run "ctlinnd newgroup xxx" on all of them? (INN will
    then create missing newsgroups)

    ... including all typos. :)

  • From Retro Guy@21:1/5 to Thomas Hochstein on Sun Jun 4 00:41:17 2023
    Thomas Hochstein wrote:

    Julien ÉLIE wrote:

    As you're parsing all the articles before feeding them, why not parse
    the Newsgroups header field and create a list of newsgroups you then
    make unique and run "ctlinnd newgroup xxx" on all of them? (INN will
    then create missing newsgroups)

    .... including all typos. :)

    Very true! I'll try to clean those up later.

    Currently uploading net.* and it's been running now for about 24 hours.
    Let's see if inn2 recovers after this is done, it's throttling right now,
    but accepting the posts.

    --
    Retro Guy

  • From Retro Guy@21:1/5 to Retro Guy on Sun Jun 4 13:53:04 2023
    Retro Guy wrote:

    Thomas Hochstein wrote:

    Julien ÉLIE wrote:

    As you're parsing all the articles before feeding them, why not parse
    the Newsgroups header field and create a list of newsgroups you then
    make unique and run "ctlinnd newgroup xxx" on all of them? (INN will
    then create missing newsgroups)

    .... including all typos. :)

    Very true! I'll try to clean those up later.

    Currently uploading net.* and it's been running now for about 24 hours.
    Let's see if inn2 recovers after this is done, it's throttling right now,
    but accepting the posts.

    Finally have net.* on the server. I needed to rebuild history once it was
    complete, probably due to all the messing around I was doing with the server.

    I'll clean up the typo group names at some point, but for now I plan to
    put can.* back on, then move to some more hierarchies.

    The fact it's working is nice to see.

    --
    Retro Guy

  • From Retro Guy@21:1/5 to Retro Guy on Mon Jun 5 12:25:11 2023
    Retro Guy wrote:

    Retro Guy wrote:

    Thomas Hochstein wrote:

    Julien ÉLIE wrote:

    As you're parsing all the articles before feeding them, why not parse
    the Newsgroups header field and create a list of newsgroups you then
    make unique and run "ctlinnd newgroup xxx" on all of them? (INN will
    then create missing newsgroups)

    .... including all typos. :)

    Very true! I'll try to clean those up later.

    Currently uploading net.* and it's been running now for about 24 hours.
    Let's see if inn2 recovers after this is done, it's throttling right now,
    but accepting the posts.

    Finally have net.* on the server. I needed to rebuild history when complete due probably to all the messing around I was doing with the server.

    I'll clean up the typo group names at some point, but for now I plan to
    put can.* back on, then move to some more hierarchies.

    The fact it's working is nice to see.

    Or is it? I'm having some trouble where after inn2 runs for a few hours I
    get the error 'File exists writing SMstore file -- throttling'

    I then shut it down, rebuild the history 'makehistory -b -f history.n -O -l 30000 -I',
    copy the .h files over as directed in the man page, then start inn2 up again. (I note
    some duplicate Message-ID messages when it runs)

    After a few hours the error returns. I'm not posting any messages at all, just letting it run.

    How can I fix this?

    --
    Retro Guy

  • From Julien ÉLIE@21:1/5 to All on Mon Jun 5 22:36:58 2023
    Hi Retro Guy,

    I'm having some trouble where after inn2 runs for a few hours I
    get the error 'File exists writing SMstore file -- throttling'

    Do you happen to use tradspool and some newsgroup names have components
    with only digits?
    For instance, if you have a newsgroup named net.test.17 or
    net.test.17.help and another named net.test, I believe this error will
    come up when receiving article number 17 for net.test. INN will try to
    write the article into the file <patharticles>/net/test/17 whereas it is
    a directory (belonging to the net.test.17 newsgroup or net.test.17.help).

    Or the inverse is possible: having net.test and trying to insert article
    1 for the net.test.17 newsgroup whereas net.test already has 17 articles.

    You should either remove the <patharticles>/net/test/17 file or the
    net.test.17 newsgroup. Or use another storage method.

    --
    Julien ÉLIE

    « – Je t'ai préparé une bonne soupe dont tu me diras des nouvelles, mon
    garçon !
    – Pour moi, ça, ce n'est pas des bonnes nouvelles ! » (Astérix)

  • From Retro Guy@21:1/5 to All on Thu Jun 8 16:13:44 2023
    Julien ÉLIE wrote:

    Hi Retro Guy,

    I'm having some trouble where after inn2 runs for a few hours I
    get the error 'File exists writing SMstore file -- throttling'

    Do you happen to use tradspool and some newsgroup names have components
    with only digits?
    For instance, if you have a newsgroup named net.test.17 or
    net.test.17.help and another named net.test, I believe this error will
    come up when receiving article number 17 for net.test. INN will try to
    write the article into the file <patharticles>/net/test/17 whereas it is
    a directory (belonging to the net.test.17 newsgroup or net.test.17.help).

    Or the inverse is possible: having net.test and trying to insert article
    1 for the net.test.17 newsgroup whereas net.test already has 17 articles.

    You should either remove the <patharticles>/net/test/17 file or the net.test.17 newsgroup. Or use another storage method.

    Thank you for the pointer. This appears to be exactly the problem. I found net.micro, net.micro432 and net.micro6809. I followed your advice and the problem appears to be resolved.

    --
    Retro Guy

  • From Jesse Rehmer@21:1/5 to All on Wed Aug 23 04:27:54 2023
    On Jun 8, 2023 at 11:13:44 AM CDT, "Retro Guy" <Retro Guy> wrote:

    Julien ÉLIE wrote:

    Hi Retro Guy,

    I'm having some trouble where after inn2 runs for a few hours I
    get the error 'File exists writing SMstore file -- throttling'

    Do you happen to use tradspool and some newsgroup names have components
    with only digits?
    For instance, if you have a newsgroup named net.test.17 or
    net.test.17.help and another named net.test, I believe this error will
    come up when receiving article number 17 for net.test. INN will try to
    write the article into the file <patharticles>/net/test/17 whereas it is
    a directory (belonging to the net.test.17 newsgroup or net.test.17.help).

    Or the inverse is possible: having net.test and trying to insert article
    1 for the net.test.17 newsgroup whereas net.test already has 17 articles.

    You should either remove the <patharticles>/net/test/17 file or the
    net.test.17 newsgroup. Or use another storage method.

    Thank you for the pointer. This appears to be exactly the problem. I found net.micro, net.micro432 and net.micro6809. I followed your advice and the problem appears to be resolved.

    How's your effort coming along?

  • From Retro Guy@21:1/5 to Jesse Rehmer on Wed Aug 23 12:31:56 2023
    Jesse Rehmer wrote:

    On Jun 8, 2023 at 11:13:44 AM CDT, "Retro Guy" <Retro Guy> wrote:

    Julien ÉLIE wrote:

    Hi Retro Guy,

    I'm having some trouble where after inn2 runs for a few hours I
    get the error 'File exists writing SMstore file -- throttling'

    Do you happen to use tradspool and some newsgroup names have components
    with only digits?
    For instance, if you have a newsgroup named net.test.17 or
    net.test.17.help and another named net.test, I believe this error will
    come up when receiving article number 17 for net.test. INN will try to
    write the article into the file <patharticles>/net/test/17 whereas it is
    a directory (belonging to the net.test.17 newsgroup or net.test.17.help).

    Or the inverse is possible: having net.test and trying to insert article
    1 for the net.test.17 newsgroup whereas net.test already has 17 articles.

    You should either remove the <patharticles>/net/test/17 file or the
    net.test.17 newsgroup. Or use another storage method.

    Thank you for the pointer. This appears to be exactly the problem. I found
    net.micro, net.micro432 and net.micro6809. I followed your advice and the
    problem appears to be resolved.

    How's your effort coming along?

    I currently have 1.49 million posts on novalink.us:119. Some are visible by web browser at http://novalink.us .

    I was able to import most of the articles after running a script to modify some older headers, and they are on the inn server. The oldest ones, of which I have a few, have not yet been modified. They need some work and I haven't had the time, but I still have them.

    --
    Retro Guy

  • From Billy G. (go-while)@21:1/5 to Retro Guy on Tue Sep 5 08:45:21 2023
    On 23.08.23 14:31, Retro Guy wrote:
    I currently have 1.49 million posts on novalink.us:119. Some visible by web

    this is the utzoo archive?

    i scanned novalink vs my server and sucked only a few missing messages.

  • From Retro Guy@21:1/5 to All on Tue Sep 5 14:32:42 2023
    Billy G. (go-while) wrote:

    On 23.08.23 14:31, Retro Guy wrote:
    I currently have 1.49 million posts on novalink.us:119. Some visible by web

    this is the utzoo archive?

    i scanned novalink vs my server and sucked only few missing messages.

    Yes, it is utzoo.

    I don't know what differences there are between what is available from archive.org
    and what is in utzoo. Just that the source of the posts on novalink is utzoo.

    --
    Retro Guy

  • From Billy G. (go-while)@21:1/5 to All on Mon Sep 11 09:52:34 2023
    On 02.06.23 23:36, Julien ÉLIE wrote:

    Actually, you need to add an additional dot to lines *beginning* with a
    dot, not only lines containing only a dot.


    are you sure?

    if a line has a dot at the beginning followed by any text, it can't be
    the terminating <CRLF>.<CRLF> line, so the leading dot in for example
    ".anytext" or ". any text" does not need to be escaped.

    nntp protocol defines end-of-message with a <CRLF> <DOT> <CRLF>?

    https://www.rfc-editor.org/rfc/rfc3977.txt

    3.1.1. Multi-line Data Blocks

    A multi-line data block is used in certain commands and responses.
    It MUST adhere to the following rules:

    1. The block consists of a sequence of zero or more "lines", each
    being a stream of octets ending with a CRLF pair. Apart from
    those line endings, the stream MUST NOT include the octets NUL,
    LF, or CR.

    2. In a multi-line response, the block immediately follows the CRLF
    at the end of the initial line of the response. When used in any
    other context, the specific command will define when the block is
    sent.

    3. If any line of the data block begins with the "termination octet"
    ("." or %x2E), that line MUST be "dot-stuffed" by prepending an
    additional termination octet to that line of the block.

    4. The lines of the block MUST be followed by a terminating line
    consisting of a single termination octet followed by a CRLF pair
    in the normal way. Thus, unless it is empty, a multi-line block
    is always terminated with the five octets CRLF "." CRLF
    (%x0D.0A.2E.0D.0A).

    5. When a multi-line block is interpreted, the "dot-stuffing" MUST
    be undone; i.e., the recipient MUST ensure that, in any line
    beginning with the termination octet followed by octets other
    than a CRLF pair, that initial termination octet is disregarded.

    6. Likewise, the terminating line ("." CRLF or %x2E.0D.0A) MUST NOT
    be considered part of the multi-line block; i.e., the recipient
    MUST ensure that any line beginning with the termination octet
    followed immediately by a CRLF pair is disregarded. (The first
    CRLF pair of the terminating CRLF "." CRLF of a non-empty block
    is, of course, part of the last line of the block.)

    Note that texts using an encoding (such as UTF-16 or UTF-32) that may
    contain the octets NUL, LF, or CR other than a CRLF pair cannot be
    reliably conveyed in the above format (that is, they violate the MUST
    requirement above). However, except when stated otherwise, this
    specification does not require the content to be UTF-8, and therefore
    (subject to that same requirement) it MAY include octets above and
    below 128 mixed arbitrarily.

    This document does not place any limit on the length of a line in a
    multi-line block. However, the standards that define the format of
    articles may do so.

  • From Russ Allbery@21:1/5 to no-reply@no.spam on Mon Sep 11 08:42:43 2023
    "Billy G. (go-while)" <no-reply@no.spam> writes:
    On 02.06.23 23:36, Julien ÉLIE wrote:

    Actually, you need to add an additional dot to lines *beginning* with a
    dot, not only lines containing only a dot.

    are you sure?

    Yes. :)

    if a line contains a dot at the beginning and any text following can't
    be a <CRLF>. the beginning dot in for example ".anytext" or ". any
    text" does not need to be escaped.

    But see the bit that you quoted:

    3. If any line of the data block begins with the "termination octet"
    ("." or %x2E), that line MUST be "dot-stuffed" by prepending an
    additional termination octet to that line of the block.
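
    The reason the extra dot is needed even for ".anytext": a receiver that follows rule 5 strips one leading dot from every line that is not the terminator, so an unstuffed ".anytext" would arrive as "anytext". A minimal Go sketch of the sending side for a single line (dotStuff is an illustrative name, and the line is assumed to be passed without its CRLF):

```go
package main

import (
	"fmt"
	"strings"
)

// dotStuff prepends one extra "." to any line that begins with ".",
// whatever follows the dot (RFC 3977 section 3.1.1, rule 3).
// The line is expected without its trailing CRLF.
func dotStuff(line string) string {
	if strings.HasPrefix(line, ".") {
		return "." + line
	}
	return line
}

func main() {
	for _, l := range []string{".", ".anytext", ". any text", "plain text"} {
		fmt.Printf("%q -> %q\n", l, dotStuff(l))
	}
}
```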

    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

  • From Billy G. (go-while)@21:1/5 to Russ Allbery on Tue Sep 12 19:12:51 2023
    On 11.09.23 17:42, Russ Allbery wrote:
    "Billy G. (go-while)" <no-reply@no.spam> writes:
    On 02.06.23 23:36, Julien ÉLIE wrote:

    Actually, you need to add an additional dot to lines *beginning* with a
    dot, not only lines containing only a dot.

    are you sure?

    Yes. :)

    But see the bit that you quoted:

    3. If any line of the data block begins with the "termination octet"
    ("." or %x2E), that line MUST be "dot-stuffed" by prepending an
    additional termination octet to that line of the block.


    ++thanks Russ!
    ++thanks Julien!

    sometimes you can't see the forest for the trees!

    found this and looks like GO does it correctly...
    if you use dotreader/dotwriter.

    i don't. the basic GO way does not stop reading.
    in order to have an incoming article size limit, i read by lines, count
    bytes and break out if an article is too large.
    basic dotreader returns whenever client sends a closing dot which could
    be somewhere near infinite.


    to send data use "dotwriter": add a dot to every leading dot.

    to read data use "dotreader": cut any first dot if line is not only a
    (closing) dot.


    i hope this is correct or did i miss something?

    1) a) server sends article via IHAVE/TAKETHIS OR
    b) client sends article via POST to server OR
    c) server sends ARTICLE/BODY to a client:
    --> use "dotwriter"

    2) server receives article via IHAVE/TAKETHIS/POST
    --> use "dotreader"
    + server writes data to storage
    + client requests ARTICLE/BODY: jump to 1) c).

    3) client receives (reads) ARTICLE/BODY:
    --> use "dotreader" and print it.




    dotreader https://cs.opensource.google/go/go/+/refs/tags/go1.21.1:src/net/textproto/reader.go;drc=1e43cfa15b4b618812e85c00c9e92c2615b324c8;l=448

    // Dot by itself marks end; otherwise cut one dot.
    if len(line) > 0 && line[0] == '.' {
        if len(line) == 1 {
            break
        }
        line = line[1:]
    }


    dotwriter https://cs.opensource.google/go/go/+/refs/tags/go1.21.1:src/net/textproto/writer.go;drc=2580d0e08d5e9f979b943758d3c49877fb2324cb;l=67,

    case wstateBegin, wstateBeginLine:
        d.state = wstateData
        if c == '.' {
            // escape leading dot
            bw.WriteByte('.')
        }

  • From Julien ÉLIE@21:1/5 to All on Tue Sep 12 19:39:40 2023
    Hi Billy,

    to send data use "dotwriter": add a dot to every leading dot.

    to read data use "dotreader": cut any first dot if line is not only a (closing) dot.

    Exactly.


    i hope this is correct or did i miss something?

    1) a) server sends article via IHAVE/TAKETHIS OR
       b) client sends article via POST to server OR
       c) server sends ARTICLE/BODY to a client:
       --> use "dotwriter"

    2) server receives article via IHAVE/TAKETHIS/POST
       --> use "dotreader"
           + server writes data to storage
           + client requests ARTICLE/BODY: jump to 1) c).

    3) client receives (reads) ARTICLE/BODY:
       --> use "dotreader" and print it.

    Also for HEAD when you say ARTICLE/BODY.

    Though not related to articles, there would also be dot-stuffing to deal
    with when reading/sending HELP and LIST MOTD.
    Any time you're reading/sending a multi-line data block, dot-stuffing
    applies (though it should not occur in other commands than those listed
    above, as the syntax of newsgroup names, distributions, etc. does not
    allow a leading dot).
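
    As a sanity check of the round trip, Go's net/textproto DotWriter and DotReader can be run back to back on any multi-line block. A small sketch, where encode and decode are illustrative wrappers and not part of the library:

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"net/textproto"
	"strings"
)

// encode returns the wire form of a block as produced by textproto's DotWriter.
func encode(block string) (string, error) {
	var buf bytes.Buffer
	dw := textproto.NewWriter(bufio.NewWriter(&buf)).DotWriter()
	if _, err := io.WriteString(dw, block); err != nil {
		return "", err
	}
	// Close writes the terminating ".\r\n" line and flushes the buffer.
	if err := dw.Close(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// decode reads one dot-terminated block with textproto's DotReader.
// Note that DotReader hands lines back with "\n" instead of "\r\n".
func decode(wire string) (string, error) {
	dr := textproto.NewReader(bufio.NewReader(strings.NewReader(wire))).DotReader()
	b, err := io.ReadAll(dr)
	return string(b), err
}

func main() {
	wire, err := encode("Subject: test\r\n\r\n.hidden\r\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", wire) // "Subject: test\r\n\r\n..hidden\r\n.\r\n"

	text, err := decode(wire)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", text) // "Subject: test\n\n.hidden\n"
}
```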

    --
    Julien ÉLIE

    « Sum, ergo bibo ; bibo, ergo sum. »

  • From Russ Allbery@21:1/5 to no-reply@no.spam on Tue Sep 12 11:49:48 2023
    "Billy G. (go-while)" <no-reply@no.spam> writes:
    On 12.09.23 19:39, Julien ÉLIE wrote:

    Also for HEAD when you say ARTICLE/BODY.

    hm but first char of a Header Line should be either a space to indicate
    a continuing line or [A-Z] (maybe [a-z] for some strange (old) clients
    which should not exist nowadays, in theory)?

    any leading dot in the header should break when server receives it?

    Nope, there is no such requirement on RFC 5322 header fields, and thus no
    such requirement on RFC 5536 header fields because this is not one of the places where netnews is stricter. RFC 5322 section 2.2:

    Header fields are lines beginning with a field name, followed by a
    colon (":"), followed by a field body, and terminated by CRLF. A
    field name MUST be composed of printable US-ASCII characters (i.e.,
    characters that have values between 33 and 126, inclusive), except
    colon. A field body may be composed of printable US-ASCII characters
    as well as the space (SP, ASCII value 32) and horizontal tab (HTAB,
    ASCII value 9) characters (together known as the white space
    characters, WSP). A field body MUST NOT include CR and LF except
    when used in "folding" and "unfolding", as described in section
    2.2.3. All field bodies MUST conform to the syntax described in
    sections 3 and 4 of this specification.

    So it's allowed to have a header field name that starts with a period, as
    well as all sorts of other exotic and fascinating stuff that you don't see
    in practice.

    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

  • From Billy G. (go-while)@21:1/5 to Russ Allbery on Tue Sep 12 21:12:32 2023
    On 12.09.23 20:49, Russ Allbery wrote:
    So it's allowed to have a header field name that starts with a period, as well as all sorts of other exotic and fascinating stuff that you don't see
    in practice.


    great thanks your help is priceless!

  • From Billy G. (go-while)@21:1/5 to All on Tue Sep 12 20:46:16 2023
    On 12.09.23 19:39, Julien ÉLIE wrote:

    Also for HEAD when you say ARTICLE/BODY.


    hm but first char of a Header Line should be either a space to indicate
    a continuing line or [A-Z] (maybe [a-z] for some strange (old) clients
    which should not exist nowadays, in theory)?

    any leading dot in the header should break when server receives it?

    thanks!

  • From Julien ÉLIE@21:1/5 to All on Tue Sep 12 21:12:58 2023
    Hi Russ,

    So it's allowed to have a header field name that starts with a period, as well as all sorts of other exotic and fascinating stuff that you don't see
    in practice.

    Just tried, but looks like there's a bug in INN as headers are not
    dot-stuffed when retrieved via for instance ARTICLE:


    POST
    [...]
    ..header-test: valid

    Adding a dot-stuffed .header-test header field in headers.
    .. as well as a dot-stuffed line in the body.
    .



    ARTICLE
    [...]
    .header-test: valid

    Adding a dot-stuffed .header-test header field in headers.
    .. as well as a dot-stuffed line in the body.
    .



    HDR .header-test 729-
    225 Header information for .header-test follows (from articles)
    729 valid
    .




    I'll have a look, as well as the computation of the :bytes metadata in
    that case.



    FWIW, Thunderbird correctly shows the .header-test header field.

    --
    Julien ÉLIE

    « L'éternité, c'est long, surtout vers la fin. » (Woody Allen)

  • From Russ Allbery@21:1/5 to no-reply@no.spam on Tue Sep 12 12:42:57 2023
    "Billy G. (go-while)" <no-reply@no.spam> writes:
    On 12.09.23 20:49, Russ Allbery wrote:

    So it's allowed to have a header field name that starts with a period,
    as well as all sorts of other exotic and fascinating stuff that you
    don't see in practice.

    great thanks your help is priceless!

    I will never have time to write this personally, but if someone with a
    love of pedantic nit-picks ever felt like writing an NNTP and netnews compliance test suite that tried all sorts of edge conditions like this
    that could be run as a read/write NNTP client against a server (and
    presumably also the target of a feed), that would be a real service to
    everyone writing NNTP and netnews software.

    I have from time to time thought about rewriting some of the random tools
    I use, like tinyleaf, in Rust, but my thought process gets as far as
    header parsing and then I groan and find other hobbies.

    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

  • From Julien ÉLIE@21:1/5 to All on Wed Sep 13 12:21:53 2023
    In addition to my previous article:

    Just tried, but looks like there's a bug in INN as headers are not dot-stuffed when retrieved via for instance ARTICLE:


    POST
    [...]
    ..header-test: valid

    Adding a dot-stuffed .header-test header field in headers.
    .. as well as a dot-stuffed line in the body.
    .



    ARTICLE
    [...]
    .header-test: valid

    Adding a dot-stuffed .header-test header field in headers.
    .. as well as a dot-stuffed line in the body.
    .

    Issue found, and appearing only with nnrpd. It uses two different
    output methods for the headers (Towire) and the body (NNTPsendarticle).
    Only the second one dot-stuffs lines when appropriate. I'll fix that.

    innd (IHAVE/TAKETHIS) already correctly handles dot-stuffed header lines.



    FWIW, Thunderbird correctly shows the .header-test header field.

    flnews too BTW.
    And both Thunderbird and flnews are resilient with ".header-test"
    (invalid in an ARTICLE response per RFC - but INN currently sends it
    without doubling the initial dot) and "..header-test" (the valid syntax
    in wire format). They display the ".header-test" header field in both
    cases.
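
    A check for this particular case can be quite small. The following Go sketch (unstuffedLines is a hypothetical helper, and the wire strings are made-up samples modelled on the test above) scans the raw wire form of a multi-line response and reports the lines before the terminator that begin with a single dot, which is how .header-test currently comes back from ARTICLE.

```go
package main

import (
	"fmt"
	"strings"
)

// unstuffedLines returns the 1-based numbers of the lines that begin with
// a single, unescaped dot before the terminating "." line is reached.
// The input is the raw wire form of the block, with CRLF line endings.
func unstuffedLines(wire string) []int {
	var bad []int
	for i, line := range strings.Split(wire, "\r\n") {
		if line == "." {
			break // terminating line: end of the block
		}
		if strings.HasPrefix(line, ".") && !strings.HasPrefix(line, "..") {
			bad = append(bad, i+1)
		}
	}
	return bad
}

func main() {
	buggy := ".header-test: valid\r\n\r\n..a stuffed body line\r\n.\r\n"
	fixed := "..header-test: valid\r\n\r\n..a stuffed body line\r\n.\r\n"
	fmt.Println(unstuffedLines(buggy), unstuffedLines(fixed)) // [1] []
}
```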

    --
    Julien ÉLIE

    « Vinum bonum laetificat cor hominis. »
