• webecv4 questions

    From Hemo@1:103/705 to Al on Wed Apr 1 18:39:33 2020
    Re: webecv4 questions
    By: Al to Hemo on Wed Apr 01 2020 02:28 pm

    I've looked for a while but my google-fu is failing me.

    I want to have the BBS web pages present, but not allow anyone to
    browse the message areas unless logged in. Perhaps allow one or two
    areas like a local/main, if possible. I want to keep the network
    areas from being web crawling/indexing targets.

    You can stop the web crawlers with your robots.txt.

    I'm not sure, but I think the default robots.txt that comes with
    Synchronet will do this. My own robots.txt looks like this:

    User-agent: *
    Disallow: /bbbs


    I've got this:
    User-agent: *
    Disallow: /

    It's not stopping things that are not identifying as a crawler, I think. I think a legitimate crawler starts by looking for the robots.txt file; I see some of those too.

    Here are some snips of what I see in the log:

    Apr 1 12:31:32 bbs synchronet: web 0045 HTTP connection accepted from: 52.82.96.27 port 49946
    Apr 1 12:31:32 bbs synchronet: web 0045 Hostname: ec2-52-82-96-27.cn-northwest-1.compute.amazonaws.com.cn [52.82.96.27]
    Apr 1 12:31:32 bbs synchronet: web 0045 Request: GET /api/files.ssjs?call=download-file&dir=sndmodv1mod_hl&file=INFLNCIA.MOD HTTP/1.1
    Apr 1 12:31:32 bbs synchronet: web 0045 Unable to send to peer
    Apr 1 12:31:32 bbs synchronet: web 0045 Sending file: /sbbs/tmp/SBBS_SSJS.31685.45.html (0 bytes)
    Apr 1 12:31:33 bbs synchronet: web 0045 Session thread terminated (0 clients, 3 threads remain, 219 served)
    Apr 1 12:32:16 bbs synchronet: web 0045 HTTPS connection accepted from: 111.225.148.163 port 55238
    Apr 1 12:32:17 bbs synchronet: web 0045 Hostname: bytespider-111-225-148-163.crawl.bytedance.com [111.225.148.163]
    Apr 1 12:32:17 bbs synchronet: web 0045 Request: GET /robots.txt HTTP/1.1
    Apr 1 12:32:17 bbs synchronet: web 0045 Sending file: /sbbs/webv4/root/robots.txt (2076 bytes)
    Apr 1 12:32:17 bbs synchronet: web 0045 Sent file: /sbbs/webv4/root/robots.txt (2076 bytes)
    Apr 1 12:32:18 bbs synchronet: web 0045 Session thread terminated (0 clients, 3 threads remain, 220 served)
    Apr 1 12:32:58 bbs synchronet: web 0045 HTTP connection accepted from: 111.225.148.177 port 46388
    Apr 1 12:32:58 bbs synchronet: web 0045 Hostname: bytespider-111-225-148-177.crawl.bytedance.com [111.225.148.177]
    Apr 1 12:32:58 bbs synchronet: web 0045 Request: GET /robots.txt HTTP/1.1
    Apr 1 12:32:58 bbs synchronet: web 0045 Sending file: /sbbs/webv4/root/robots.txt (2076 bytes)
    Apr 1 12:32:58 bbs synchronet: web 0045 Sent file: /sbbs/webv4/root/robots.txt (2076 bytes)
    Apr 1 12:32:59 bbs synchronet: web 0045 Session thread terminated (0 clients, 3 threads remain, 221 served)
    Apr 1 12:33:42 bbs synchronet: web 0045 HTTPS connection accepted from: 52.83.249.124 port 52734
    Apr 1 12:33:42 bbs synchronet: web 0045 Hostname: ec2-52-83-249-124.cn-northwest-1.compute.amazonaws.com.cn [52.83.249.124]
    Apr 1 12:33:43 bbs synchronet: web 0045 Request: GET /api/files.ssjs?call=download-file&dir=st20s92msdosc&file=CNEWS003.ARC HTTP/1.1
    Apr 1 12:33:43 bbs synchronet: web 0045 Sending file: /sbbs/tmp/SBBS_SSJS.31685.45.html (0 bytes)
    Apr 1 12:33:44 bbs synchronet: web 0045 Session thread terminated (0 clients, 3 threads remain, 222 served)


    Every minute or so, something comes in and goes directly to a specific file and tries to download it. Most of these seem to come from cn-northwest-1.compute.amazonaws.com.cn.
    --
    H

    ... It is impossible to please the whole world and your mother-in-law.

    ---
    þ Synchronet þ - Running madly into the wind and screaming - bbs.ujoint.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Hemo@1:103/705 to poindexter FORTRAN on Wed Apr 1 18:49:38 2020
    Re: webecv4 questions
    By: poindexter FORTRAN to Hemo on Wed Apr 01 2020 03:21 pm

    Re: webecv4 questions
    By: Hemo to All on Wed Apr 01 2020 03:45 pm

    I am wanting to have the BBS web pages present, but not allow anyone
    to browse the message areas unless logged in. Perhaps allow one or
    two areas like a local/main, if possible. I want to shutdown the
    network areas from being web crawling/indexing targets.

    The security levels of the groups determine what can be seen on the web. The guest user's security level controls what un-authenticated users can see from the web.

    Boom, that was it, thank you. I wasn't picking up that 'non-logged-in' web access was controlled by the security level of the Guest account, and somehow my Guest account got 'validated' (I'm sure I either did that without realizing the implications or it was a mis-typed key). Validation cranked up the security level and opened up the forums and files on the web pages to the public.

    I got my Guest account security back where it should be, and that sealed the web portion back up to where I wanted it. Whew!
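
    For anyone else who trips over this, the knobs involved (a sketch; exact menu labels vary by Synchronet version): each message group and sub-board in SCFG carries an Access Requirements string (ARS), e.g.

    Access Requirements: LEVEL 50

    and the web pages present non-logged-in visitors as the Guest account, so anything whose ARS is above Guest's security level stays hidden from the public side.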

    thanks!

    --
    Hemo

    ... I love criticism just so long as it's unqualified praise.

    ---
    þ Synchronet þ - Running madly into the wind and screaming - bbs.ujoint.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Rampage@1:103/705 to Hemo on Thu Apr 2 07:27:16 2020
    Re: webecv4 questions
    By: Hemo to Al on Wed Apr 01 2020 18:39:33


    Hemo> It's not stopping things that are not identifying as a crawler, I think.

    robots.txt cannot stop anything... it is only a guide from the site operator to the spider operator indicating the areas the spider is allowed to crawl or not...
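
    (if you actually need to stop a pest rather than just ask it nicely, the trashcan files are one option... adding its address to /sbbs/ctrl/ip.can, one entry per line, makes the servers refuse the connection... filename per the stock Synchronet layout, and i believe wildcard patterns are accepted there too...)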

    Hemo> I think a legitimate crawler starts by looking for the robots.txt file,
    Hemo> I see some of those too.

    close... robots.txt may or may not be gathered on each visit by a spider... if it is gathered, it may not be taken into account until later visits...


    Hemo> every minute or so, something comes in and goes directly to a specific
    Hemo> file and tries to download it. Most of these seem to come from
    Hemo> cn-northwest-1.compute.amazonaws.com.cn

    look in your /sbbs/data/logs directory for the http logs (if you have them enabled) and you will see a traditional apache-style log format... the last field contains the user agent, which will generally tell you whether the visitor really is a spider or not... what you're seeing from that amazon cloud domain may be a spider, or it may be someone's file getter, or possibly even an indexer (which is like a spider or crawler)...
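
    if you want a quick census of who's hitting you, a sketch like this under jsexec will tally that last field... the log filename below is only a placeholder, point it at whatever file is actually sitting in your logs directory...

    // tally user agents from an apache-style access log...
    // the agent is assumed to be the last double-quoted field per line
    var f = new File("/sbbs/data/logs/http-access.log"); // placeholder name
    if (f.open("r")) {
        var counts = {};
        var line;
        while ((line = f.readln()) !== null) {
            var m = line.match(/"([^"]*)"\s*$/);
            if (m)
                counts[m[1]] = (counts[m[1]] || 0) + 1;
        }
        f.close();
        for (var ua in counts)
            writeln(counts[ua] + "\t" + ua);
    }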


    )\/(ark

    ---
    þ Synchronet þ The SouthEast Star Mail HUB - SESTAR
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Hemo@1:103/705 to Rampage on Thu Apr 2 13:17:26 2020
    Re: webecv4 questions
    By: Rampage to Hemo on Thu Apr 02 2020 07:27 am

    Re: webecv4 questions
    By: Hemo to Al on Wed Apr 01 2020 18:39:33
    Hemo> every minute or so, something comes in and goes directly to a specific
    Hemo> file and tries to download it. Most of these seem to come from
    Hemo> cn-northwest-1.compute.amazonaws.com.cn

    look in your /sbbs/data/logs directory for the http logs (if you have
    them enabled) and you will see a traditional apache-style log
    format... the last field contains the user agent, which will
    generally tell you whether the visitor really is a spider or not...
    what you're seeing from that amazon cloud domain may be a spider, or
    it may be someone's file getter, or possibly even an indexer (which
    is like a spider or crawler)...

    Interesting. I need to spend more time reading logs, I think.
    The lines in question all show this:
    "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.9740.1940 Mobile Safari/537.36"

    I also see some that are not even trying to hide anything. polaris botnet, ZmEu, zgrab, The Knowledge AI, and so forth. Even some from this fella: "masscan/1.0 (https://github.com/robertdavidgraham/masscan)"


    Coincidence or not, about an hour after closing down reading of files and forums to anyone/anything not logged in, I was slammed for a couple of hours from no-reverse-dns-configured.com with what looks like attempted PHP exploits.

    I see the PHP exploit attempts randomly here and there in all the log files, but this period was non-stop for about 2 hours, 2-5 attempts every second. The log file is huge.


    Man... this stuff felt simpler when we were just dealing with a modem and baud rates.

    ... Buy Land Now. It's Not Being Made Any More.

    ---
    þ Synchronet þ - Running madly into the wind and screaming - bbs.ujoint.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From poindexter FORTRAN@1:103/705 to Rampage on Thu Apr 2 09:09:00 2020
    Rampage wrote to Hemo <=-

    look in your /sbbs/data/logs directory for the http logs (if you have
    them enabled) and you will see a traditional apache-style log
    format... the last field contains the user agent, which will
    generally tell you whether the visitor really is a spider or not...
    what you're seeing from that amazon cloud domain may be a spider, or
    it may be someone's file getter, or possibly even an indexer (which
    is like a spider or crawler)...

    That's a good point - ROBOTS.TXT can block by *user agent*, so if you
    have a particularly annoying web crawler, you can block that user
    agent from getting to anything, instead of trying to block specific
    areas from all crawlers.

    This is all voluntary; a badly behaved crawler can simply ignore your ROBOTS.TXT file.
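
    For example (Bytespider is the token ByteDance's crawler reports, judging by the hostnames in Hemo's log; the /msgs/ path is only a placeholder for whatever you'd rather shield):

    User-agent: Bytespider
    Disallow: /

    User-agent: *
    Disallow: /msgs/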


    ... What do you think management's real interests are?
    --- MultiMail/XT v0.52
    þ Synchronet þ realitycheckBBS -- http://realitycheckBBS.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Mortifis@1:103/705 to Hemo on Fri Apr 3 15:49:07 2020
    Re: webecv4 questions
    By: Al to Hemo on Wed Apr 01 2020 02:28 pm

    I've looked for a while but my google-fu is failing me.

    I want to have the BBS web pages present, but not allow anyone to
    browse the message areas unless logged in. Perhaps allow one or two
    areas like a local/main, if possible. I want to keep the network
    areas from being web crawling/indexing targets.

    You can stop the web crawlers with your robots.txt.

    I'm not sure, but I think the default robots.txt that comes with Synchronet will do this. My own robots.txt looks like this:

    User-agent: *
    Disallow: /bbbs


    I've got this:
    User-agent: *
    Disallow: /

    It's not stopping things that are not identifying as a crawler, I think. I think a legitimate crawler starts by looking for the robots.txt file; I see some of those too.

    Here are some snips of what I see in the log:

    Every minute or so, something comes in and goes directly to a specific file and tries to download it. Most of these seem to come from cn-northwest-1.compute.amazonaws.com.cn.
    --
    H

    I wonder if adding if(user.alias === 'Guest') { writeln('You must be logged in to view files!'); exit(); } to /sbbs/webv4/root/api/files.ssjs would help? Or something like that.
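
    Spelled out, that idea is a rough sketch like this near the top of files.ssjs (user, writeln() and exit() are all standard in Synchronet's SSJS environment):

    // bail out before any file handling when running as the Guest account
    if (user.alias === 'Guest') {
        writeln('You must be logged in to view files!');
        exit();
    }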

    ---
    þ Synchronet þ AlleyCat! BBS Lake Echo, NS Canada
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Digital Man@1:103/705 to Mortifis on Fri Apr 3 12:11:03 2020
    Re: Re: webecv4 questions
    By: Mortifis to Hemo on Fri Apr 03 2020 03:49 pm

    Re: webecv4 questions
    By: Al to Hemo on Wed Apr 01 2020 02:28 pm

    I've looked for a while but my google-fu is failing me.

    I want to have the BBS web pages present, but not allow anyone to
    browse the message areas unless logged in. Perhaps allow one or two
    areas like a local/main, if possible. I want to keep the network
    areas from being web crawling/indexing targets.

    You can stop the web crawlers with your robots.txt.

    I'm not sure, but I think the default robots.txt that comes with Synchronet will do this. My own robots.txt looks like this:

    User-agent: *
    Disallow: /bbbs


    I've got this:
    User-agent: *
    Disallow: /

    It's not stopping things that are not identifying as a crawler, I think.
    I think a legitimate crawler starts by looking for the robots.txt file; I see some of those too.

    Here are some snips of what I see in the log:

    Every minute or so, something comes in and goes directly to a specific file and tries to download it. Most of these seem to come from cn-northwest-1.compute.amazonaws.com.cn.
    --
    H

    I wonder if adding if(user.alias === 'Guest') { writeln('You must be logged in to view files!'); exit(); } to /sbbs/webv4/root/api/files.ssjs would help? Or something like that.

    I don't think bots are logging in as Guest, but ecweb might do an auto-login-as-guest thing.

    digital man

    Synchronet "Real Fact" #4:
    Synchronet version 3 is written mostly in C, with some C++, x86 ASM, and Pascal.
    Norco, CA WX: 63.1°F, 58.0% humidity, 3 mph E wind, 0.00 inches rain/24hrs
    --- SBBSecho 3.10-Linux
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From echicken@1:103/705 to Mortifis on Fri Apr 3 15:28:05 2020
    Re: Re: webecv4 questions
    By: Mortifis to Hemo on Fri Apr 03 2020 15:49:07

    I wonder if adding if(user.alias === 'Guest') { writeln('You must be
    logged in to view files!'); exit(); } to /sbbs/webv4/root/api/files.ssjs
    would help? Or something like that.

    That script already checks if the current user has the ability to download, so this shouldn't be necessary.

    Likewise I think all of the file stuff uses 'file_area.lib_list', which is:

    "File Transfer Libraries (current user has access to) - introduced in v3.10"

    So I would expect it not to include areas that the current user isn't supposed to be able to see. Maybe I'm wrong or maybe that isn't working as expected.
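
    For reference, a minimal sketch of how that scoping plays out (property names per the Synchronet JS object model; run it where file_area is defined, i.e. under the web server or jsexec):

    // lib_list holds only the libraries the current user can access,
    // so enumerating it as Guest shows exactly the public view
    for (var l in file_area.lib_list) {
        var lib = file_area.lib_list[l];
        writeln(lib.description);
        for (var d in lib.dir_list)
            writeln('  ' + lib.dir_list[d].code);
    }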

    I suspect OP needs to tweak the guest account in use, along with settings on file and message areas.

    ---
    echicken
    electronic chicken bbs - bbs.electronicchicken.com
    þ Synchronet þ electronic chicken bbs - bbs.electronicchicken.com
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)