Of the many thousands of unmoderated newsgroups (e.g., as of the last check on 2 Nov 2023, the news.mixmin.net http://mixmin.net/active list [2.03 MB] shows 39,385 "y" and 5,936 "m"), is there any way, preferably an easy one, for news admins to post totals of all articles in all unmoderated newsgroups that contain "googlegroups.com" in the Message-ID header, and separately of replies whose References header also contains "googlegroups.com", e.g., from 1 January 2023 to the current date?
On 2023-11-09 13:35, D wrote:
> of the many thousands of unmoderated newsgroups (e.g., last check 2 Nov 2023
> news.mixmin.net http://mixmin.net/active list [2.03 MB] shows 39,385 "y" and
> 5,936 "m"), is there any, preferably easy way for news admins to post totals
> for all articles in all unmoderated newsgroups containing "googlegroups.com"
> in message-id headers, and separately for reference headers of replies which
> also contain "googlegroups.com", e.g., since 1 January 2023 to current date?
Any news admin with at least 320 days of message retention would be able
to run scripts against the message files on their servers.
For speed, this should be done in a fast language that can parse RFC 822
headers on the fly, do regex matches, and keep a hash table of
Message-IDs. Perl or Python might do, given enough RAM for the table.
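
A minimal Python sketch of such a script, under stated assumptions: the articles have already been read from the spool as text (one string per article), and the function name `count_gg` and the keys of the returned totals are made up for illustration. It parses only the headers, keeps a set of Message-IDs so crossposted copies count once, and skips anything older than the cutoff or with an unparseable Date.

```python
import re
from datetime import datetime, timezone
from email.parser import HeaderParser
from email.utils import parsedate_to_datetime

GG = "googlegroups.com"
MSGID = re.compile(r"<[^<>\s]+>")  # one <id@host> token

def count_gg(articles, since: datetime):
    """Count articles dated on/after `since` (hypothetical helper).

    `articles` is an iterable of raw article texts; how they are read
    from the spool depends on the server's storage format.
    """
    parser = HeaderParser()          # parses headers only, ignores the body
    seen = set()                     # hash table of Message-IDs already counted
    totals = {"articles": 0, "gg_message_id": 0, "gg_references": 0}
    for raw in articles:
        hdr = parser.parsestr(raw)
        ids = MSGID.findall(hdr.get("Message-ID", ""))
        if not ids or ids[0] in seen:
            continue                 # no usable ID, or a crossposted duplicate
        try:
            posted = parsedate_to_datetime(hdr.get("Date", ""))
        except (TypeError, ValueError):
            continue                 # missing or unparseable Date header
        if posted.tzinfo is None:    # no zone given: assume UTC
            posted = posted.replace(tzinfo=timezone.utc)
        if posted < since:
            continue
        seen.add(ids[0])
        totals["articles"] += 1
        if GG in ids[0].lower():
            totals["gg_message_id"] += 1
        refs = MSGID.findall(hdr.get("References", ""))
        if any(GG in ref.lower() for ref in refs):
            totals["gg_references"] += 1
    return totals
```

Calling `count_gg(texts, datetime(2023, 1, 1, tzinfo=timezone.utc))` would give the two totals asked for, plus the overall article count for comparison.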
One check could be run backwards (newest post first) and special-case
messages whose own Message-ID was not seen in the References or
In-Reply-To header of a newer message. Such messages have a higher
chance of being of little interest to other participants, though that
crude metric will produce many misdetections both ways.
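
The backwards pass could be sketched roughly as follows in Python. This assumes the headers have already been reduced to `(posted_at, msg_id, referenced_ids)` tuples, where `referenced_ids` holds the IDs from both References and In-Reply-To; the function name `unreferenced` and that tuple layout are illustrative only.

```python
def unreferenced(posts):
    """Return Message-IDs never referenced by a newer post, newest first.

    `posts` is an iterable of (posted_at, msg_id, referenced_ids) tuples;
    posted_at can be any sortable value, such as a datetime or a timestamp.
    """
    referenced = set()   # IDs seen so far in newer posts' reference headers
    lonely = []
    # Walk from newest to oldest, so every possible reply to a post
    # has been processed before the post itself is examined.
    for posted_at, msg_id, refs in sorted(posts, key=lambda p: p[0], reverse=True):
        if msg_id not in referenced:
            lonely.append(msg_id)
        referenced.update(refs)
    return lonely
```

Only the set of referenced IDs has to stay in memory, which is the hash table that needs the RAM.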
For detection of Google Groups (GG) traffic, look at the Path header, not the Message-ID.
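
A hedged Python sketch of a Path-based check. The marker host names in `markers` are an assumption, not something stated in this thread, and should be verified against Path headers of known GG posts on your own server; `via_google_groups` is a made-up helper name.

```python
def path_hosts(path_value):
    """Split a Path header value into its '!'-separated entries, lowercased."""
    return [h.strip().lower() for h in path_value.split("!") if h.strip()]

def via_google_groups(path_value,
                      markers=("googlegroups.com", "postnews.google.com")):
    """True if any Path entry equals, or is a subdomain of, a marker host.

    The default markers are assumed values; adjust them to what your
    spool actually shows for Google Groups injections.
    """
    return any(
        host == marker or host.endswith("." + marker)
        for host in path_hosts(path_value)
        for marker in markers
    )
```

Matching whole Path entries rather than substrings avoids counting an unrelated host whose name merely contains the marker text.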
Enjoy
Jakob