• NTP discarding servers when we have more than 3 servers configured, and

    From rcheaito via questions Mailing List@21:1/5 to All on Sun Feb 16 20:28:00 2025
    I am using NTP 4.8.2p15 with VxWorks. The current issue I have with NTP
    is that when I configure more than 3 servers (my box allows configuring 5 servers), the daemon always discards the servers beyond 3, as in
    this example (2 out of 5); i.e., I cannot use more than 3 servers, otherwise the NTP client won't sync to all of them.

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  101m     8     0   0.000  +0.000   0.000
    -192.168.1.143   10.32.35.198     3 u   410   512   377   0.532  -3.020   0.460
    +2620:11b:d06d:f 10.176.6.101     3 u    48   512   377   0.757  +1.801   0.472
    -192.168.1.140   149.56.19.163    3 u    81   512   377   0.553  +2.506   0.593
    *192.168.1.146   10.176.6.101     3 u    82   512   377   0.776  +1.880   0.469
    +192.168.1.204   10.176.6.101     3 u   374   512   377   0.545  +1.867   0.440

    After some time, the daemon may resync with one of the rejected servers, but
    it keeps rejecting at least one of the servers.

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  192m     8     0   0.000  +0.000   0.000
    -192.168.1.143   10.32.35.198     3 u    87  1024   377   0.553  -7.216   2.259
    +2620:11b:d06d:f 10.176.6.101     3 u   825  1024   377   0.676  -1.294   1.693
    *192.168.1.140   149.56.19.163    3 u   266  1024   377   0.595  -3.259   3.592
    +192.168.1.146   10.176.6.101     3 u   233  1024   377   0.828  -2.009   2.260
    +192.168.1.204   10.176.6.101     3 u  1049  1024   377   0.553  -2.856   2.957

    This is another instance from a different box but with the same 5 servers configured:

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  117m     8     0   0.000  +0.000   0.000
    *2620:11b:d06d:f 10.176.6.101     3 u   232   256   377   1.273  +0.107   0.395
    -192.168.1.146   10.176.6.101     3 u   138   512   377   1.302  +0.342   0.214
    +192.168.1.140   149.56.19.163    3 u   211   512   377   1.092  -0.059   0.166
    +192.168.1.204   10.176.6.101     3 u     8   512   377   1.123  -0.395   0.339
    -192.168.1.143   10.32.35.198     3 u   246   256   377   1.176  -4.317   0.308


    Daemon config:

    ntpd.config.param=restrict 127.0.0.1;server 127.127.1.0 minpoll 3 maxpoll 3 iburst;
        server 2620:11b:d06d:f10a:4a4d:7eff:fea2:b2d1 iburst minpoll 6 maxpoll 10;
        server 192.168.1.146 iburst minpoll 6 maxpoll 10;
        server 192.168.1.140 iburst minpoll 6 maxpoll 10;
        server 192.168.1.204 iburst minpoll 6 maxpoll 10;
        server 192.168.1.243 iburst minpoll 6 maxpoll 10;

    ntpd.init.param=-g -f /tffs0/ntpd_driftfile

    I checked reported bugs against 4.8.2p15 to .p18 and did not find any related to this issue.

    Note that if I configure only 3 servers (any) out of the above 5, NTP daemon synchs to all 3 with no issue.

    Is this a known limitation of the NTP daemon when more than 3 servers are configured? Or is this expected behavior due, for example, to changes in network latency, offset, jitter, etc.? Has anyone else seen a similar issue?

    Also, another issue I came across: if NTP is configured as client + server, it takes around 5 minutes to converge and sync to the local system clock when the NTP client cannot sync to any configured server. Is there anything I can do to speed up the sync to the local system clock in this case?

    Any insights and help to solve those issues will be appreciated.

    Thanks,
    RC

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From William Unruh@21:1/5 to rcheaito via questions Mailing List on Sun Feb 16 22:12:51 2025
    On 2025-02-16, rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I am using NTP 4.8.2p15 with VxWorks. The current issue that I have with NTP is that when I configure more than 3 servers (my box allows to configure 5 servers), the daemon will always discard servers configured above 3, as in this example (2 out of 5), i.e., cannot use more than 3 servers, otherwise NTP
    client won't sync to all servers.

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  101m     8     0   0.000  +0.000   0.000
    -192.168.1.143   10.32.35.198     3 u   410   512   377   0.532  -3.020   0.460
    +2620:11b:d06d:f 10.176.6.101     3 u    48   512   377   0.757  +1.801   0.472
    -192.168.1.140   149.56.19.163    3 u    81   512   377   0.553  +2.506   0.593
    *192.168.1.146   10.176.6.101     3 u    82   512   377   0.776  +1.880   0.469
    +192.168.1.204   10.176.6.101     3 u   374   512   377   0.545  +1.867   0.440

    Why do you have .LOCL. in there? It is useless as a time source (it is
    like looking at your own watch to set the time on that same watch; it
    is always perfectly in time).
    As to the three servers: the two, 192.168.1.143 and 192.168.1.140, are out
    by 2 standard deviations from the other three (2620:11b:d06d:f, 192.168.1.146, 192.168.1.204),
    so they are regarded as falsetickers. Why would you want ntp to use clocks
    which are possibly off from the real time? It is not the number but how
    well they fit in with the other time sources. NTP has no idea what the
    right time is. It relies on a majority vote from its servers.

    What I worry about is that four of your servers are all on the local
    network and three of them get their time from the same source, 10.176.6.101,
    and those three all agree with each other. I.e., those three are hardly
    independent sources; they are effectively all the same source.

    You should always use at least three INDEPENDENT sources, i.e., sources
    which cannot be traced back to the same ultimate source.
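    As a rough sketch of that advice in ntp.conf form (the pool hostnames below are illustrative placeholders, not a recommendation for this specific network): drop the local clock, keep only one of the peers behind 10.176.6.101, and add independent outside sources.

```conf
# Sketch only: one peer per ultimate upstream, plus independent sources.
server 192.168.1.146 iburst minpoll 6 maxpoll 10   # keep ONE 10.176.6.101 peer
server 192.168.1.140 iburst minpoll 6 maxpoll 10   # different upstream (149.56.19.163)
server 0.pool.ntp.org iburst                       # independent outside source
server 1.pool.ntp.org iburst                       # independent outside source
```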


    After some time, the daemon may resync with one of the rejected servers, but it keeps rejecting at least one of the servers.

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  192m     8     0   0.000  +0.000   0.000
    -192.168.1.143   10.32.35.198     3 u    87  1024   377   0.553  -7.216   2.259
    +2620:11b:d06d:f 10.176.6.101     3 u   825  1024   377   0.676  -1.294   1.693
    *192.168.1.140   149.56.19.163    3 u   266  1024   377   0.595  -3.259   3.592
    +192.168.1.146   10.176.6.101     3 u   233  1024   377   0.828  -2.009   2.260
    +192.168.1.204   10.176.6.101     3 u  1049  1024   377   0.553  -2.856   2.957

    This is another instance from a different box but with same 5 servers configured:

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  117m     8     0   0.000  +0.000   0.000
    *2620:11b:d06d:f 10.176.6.101     3 u   232   256   377   1.273  +0.107   0.395
    -192.168.1.146   10.176.6.101     3 u   138   512   377   1.302  +0.342   0.214
    +192.168.1.140   149.56.19.163    3 u   211   512   377   1.092  -0.059   0.166
    +192.168.1.204   10.176.6.101     3 u     8   512   377   1.123  -0.395   0.339
    -192.168.1.143   10.32.35.198     3 u   246   256   377   1.176  -4.317   0.308


    Daemon config:

    ntpd.config.param=restrict 127.0.0.1;server 127.127.1.0 minpoll 3 maxpoll 3 iburst;server 2620:11b:d06d:f10a:4a4d:7eff:fea2:b2d1 iburst minpoll 6 maxpoll 10;server 192.168.1.146 iburst minpoll 6 maxpoll 10;server 192.168.1.140 iburst minpoll 6 maxpoll 10;server 192.168.1.204 iburst minpoll 6 maxpoll 10; server 192.168.1.243 iburst minpoll 6 maxpoll 10;


    Get rid of .LOCL. as a server. Choose ONE of the servers which use 10.176.6.101.
    ntpd.init.param=-g -f /tffs0/ntpd_driftfile

    I checked reported bugs against 4.8.2p15 to .p18 and did not find any related to this issue.

    You are interpreting the issue wrongly.


    Note that if I configure only 3 servers (any) out of the above 5, NTP daemon synchs to all 3 with no issue.

    Yes, because now none are falsetickers.


    Is it a known limitation with NTP daemon when there are more than 3 servers configured? Or is this an expected behavior due for example to changes in network latency, offset, jitter…? Anyone else have similar issue?

    It has nothing to do with "3 sources". It has to do with three of them
    all having the same upstream source.

    Also, another issue I came across is that if NTP is configured as client + server, it takes around 5 minutes to converge to sync up with Local system clock if NTP client cannot sync up with the configured server. Anything I can
    do to speed up the sync up with local system clock in this case?

    The local system clock is always synced with the local system clock. Also,
    it takes a while for ntp to gather enough statistics to figure out how
    far out the local system clock is from the others. NTP is NOT for
    switching on and off; it is for running "forever". You can tell ntp to
    zero in on some one server at startup and jump the time to agree with
    that one. Of course it has no idea yet how bad the local clock rate is
    with respect to that clock, so even though it immediately has the same
    time, it could drift away until ntp has a good idea of how bad the
    local clock rate is compared to the server clock rate. That takes time.

    Any insights and help to solve those issues will be appreciated.

    Thanks,
    RC


  • From Harlan Stenn via questions Mailing@21:1/5 to All on Mon Feb 17 07:53:00 2025
    To: pessimus192@gmail.com
    To: questions@lists.ntp.org

    On 2/16/2025 5:04 PM, rcheaito (via questions Mailing List) wrote:
    Thanks @pessimus192 for your reply. Just for own clarification, do I need to increase minclock to 6 or maxclock to 6?
    On my system, minclock = 3 and maxclock is 10.

    Have you looked at the ntpq documentation?

    Do you understand the 'tally' code in the first column?

    --
    Harlan Stenn <stenn@nwtime.org>
    https://www.nwtime.org/ - be a member!

  • From Danny Mayer via questions Mailing L@21:1/5 to All on Mon Feb 17 22:48:00 2025
    To: questions@lists.ntp.org

    On 2/17/25 3:13 PM, rcheaito (via questions Mailing List) wrote:
    Hi Harlan,

    You mean the following:

    + : included by the combine algorithm
    # : backup (more than tos maxclock sources)
    ' ': (a space) discarded as not valid (TEST10-TEST13)
    x : discarded by intersection algorithm
    . : discarded by table overflow (not used)
    - : discarded by the cluster algorithm

    In our case, the servers above 3 are being discarded with the - sign, i.e., by
    the cluster algorithm.

    From documentation:
    minclock minclock
    Specify the number of servers used by the clustering algorithm as the minimum to include on the candidate list. The default is 3. This is also the number of
    servers to be averaged by the combining algorithm

    So from the above, I understand that I need to increase minclock from 3 to 6. Is this correct?

    No. They are being thrown out because their offsets are too far outside
    those of the rest of the list.

    The minclock is related to how often the server is being queried.
    Raising it merely means you are asking less frequently.

    Danny

  • From Danny Mayer via questions Mailing L@21:1/5 to Edward McGuire on Mon Feb 17 23:13:00 2025
    Copy: rcheaito@yahoo.com
    Copy: questions@lists.ntp.org

    On 2/17/25 6:03 PM, Edward McGuire wrote:
    On Mon, Feb 17, 2025 at 4:40 PM Danny Mayer <questions@lists.ntp.org> wrote:
    The minclock is related to how often the server is being queried.
    Raising it merely means you are asking less frequently.

    Hi Danny, this is actually not right. From the fine manual: "tos
    minclock /minclock/ [...] Specify the number of servers used by the clustering algorithm as the minimum to include on the candidate list.
    The default is 3. This is also the number of servers to be averaged by
    the combining algorithm." You might be thinking of the "minpoll"
    parameter, used to adjust peer polling interval.

    Oh, yes, you're right. However increasing the minimum number of servers
    doesn't really help here.

    Danny


  • From Harlan Stenn via questions Mailing@21:1/5 to James Browning on Mon Feb 17 23:38:00 2025
    To: questions@lists.ntp.org

    Hi James,

    On 2/17/2025 3:04 PM, James Browning wrote:
    On Mon, Feb 17, 2025, 14:40 Danny Mayer <questions@lists.ntp.org <mailto:questions@lists.ntp.org>> wrote:

    No. They are being thrown out because the offset is too far outside the
    ones compared to the rest of the list.

    The minclock is related to how often the server is being queried.
    Raising it merely means you are asking less frequently.


    Then, please forgive this fool because I thought:
    Minpoll was how frequently* you will try to get data from clocks (in 2^n seconds)

    I would say that it sets the floor on how frequently the other side is
    polled.

    Maxpoll was how infrequently* by the same metric.

    I would say that it sets the ceiling on how frequently ...

    NTP will dynamically set the poll interval, bounded by minpoll and maxpoll.

    Minsane was how many clocks need to agree for good time.
    Minclock was how many clocks are needed including those who disagree.
    Maxclock was how many can get to the quorum at most.

    * barring burst and iburst which I consider not very nice.

    While I have seen rare use cases for 'burst', iburst is great for
    initial clock sync. In fairly common cases it means the local clock is
    sync'd in about 10 seconds with iburst, compared to about 10 minutes
    without iburst.
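    As a sketch (the server name below is a placeholder), the difference is just the presence of the keyword on the server line:

```conf
# With iburst: a burst of packets at short spacing at startup, so the
# first sync typically lands within seconds rather than poll intervals.
server time.example.org iburst minpoll 6 maxpoll 10
# Without iburst, the same association waits out full poll intervals:
# server time.example.org minpoll 6 maxpoll 10
```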

    --
    Harlan Stenn <stenn@ntp.org>
    The NTP Project is part of
    https://www.nwtime.org/ - be a member!

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Tue Feb 18 14:06:02 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I changed the minclock on my system from 3 to 6 and it looks like this solves the issue of discarding the servers above 3.

    For the second issue, using the 'tos orphan 10 orphanwait 0' did not help. If anyone has other ideas, please let me know.


    I think you are overthinking the whole thing.

    Look at this:

    *127.127.28.0    .SHM.            0 l     3    16   377   0.000  -0.315   0.896
    +192.168.0.21    .PPS.            1 u     6    64   377   1.260  +0.694   2.345
    +192.168.0.100   .PPS.            1 u    30    64   377   1.264  +0.592   3.476
    +192.168.0.101   .PPS.            1 u    55    64   377   1.196  -0.280   3.157
    +192.168.0.185   .PPS.            1 u     1    64   377   1.223  +0.727   1.362

    And at some time later:

    *127.127.28.0    .SHM.            0 l    13    16   377   0.000  +0.680   1.344
    -192.168.0.21    .PPS.            1 u    64    64   377   1.260  +0.694   2.345
    +192.168.0.100   .PPS.            1 u    22    64   377   1.264  +0.592   3.694
    +192.168.0.101   .PPS.            1 u    46    64   377   1.132  -2.246   2.722
    -192.168.0.185   .PPS.            1 u    59    64   377   1.223  +0.727   1.362

    SHM is a USB GNSS dongle which by itself provides crap time as it has no
    PPS output, yet it is the selected server thanks to the other four
    servers which are all on the local network and stratum 1.

    There are no options in the config file except for minpoll 4 on the USB
    device and iburst for all servers for a quicker restart.

    No minclock, no tos anything, nothing.

    So the +/- signs at the start of the line move around a bit. So what?

    If any of them change to x, then there is a problem.

  • From William Unruh@21:1/5 to rcheaito via questions Mailing List on Wed Feb 19 01:10:59 2025
    On 2025-02-17, rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    Hi Harlan,

    You mean the following:

    + : included by the combine algorithm
    # : backup (more than tos maxclock sources)
    ' ': (a space) discarded as not valid (TEST10-TEST13)
    x : discarded by intersection algorithm
    . : discarded by table overflow (not used)
    - : discarded by the cluster algorithm

    In our case, the servers above 3 are being discarded with the - sign, i.e., by
    the cluster algorithm.

    No, they are not discarded because they are more than three. They are
    discarded because those other two have offsets larger than the standard
    deviation of those three. And those three have a smaller standard
    deviation because they are all from the same source, i.e., they are NOT
    independent. So if their server happens to have a small jitter, they will
    overwhelm the others.
    What you are essentially doing is giving one source three times the
    weight of any other source of time. It is thus highly probable that it
    will dominate the time source.
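    That point can be illustrated with a small sketch (offsets taken loosely from the first ntpq listing in the thread; the 2-sigma cut is an illustration, not ntpd's exact test):

```python
# Illustration only: three peers fed by one upstream cluster so tightly
# that any independent peer looks like a multi-sigma outlier.
import statistics

same_upstream = [1.801, 1.880, 1.867]   # offsets (ms) of the 10.176.6.101 peers
independent   = [-3.020, 2.506]         # offsets (ms) of the two rejected peers

mean  = statistics.mean(same_upstream)
sigma = statistics.stdev(same_upstream)
for off in independent:
    tag = "outlier" if abs(off - mean) > 2 * sigma else "ok"
    print(f"{off:+.3f} ms: {tag}")
```

    Both rejected peers sit dozens of sigmas away from the tight trio, so they get dropped no matter how many servers are configured.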


    From documentation:
    minclock minclock
    Specify the number of servers used by the clustering algorithm as the minimum to include on the candidate list. The default is 3. This is also the number of
    servers to be averaged by the combining algorithm

    So from the above, I understand that I need to increase minclock from 3 to 6. Is this correct?

    If you want to lose all time sources, yes. I.e., if there are not 6 which
    agree, then deliver nothing. That is just silly.

    a) Get rid of .LOCL.
    b) Choose just one of the servers that come from that one source, not three.


  • From Harlan Stenn via questions Mailing@21:1/5 to rcheaito@yahoo.com on Wed Feb 19 04:13:00 2025
    To: questions@lists.ntp.org

    On 2/18/2025 5:26 AM, rcheaito@yahoo.com wrote:
    Thank you all for the replies. I still have not seen a confirmation or objection on whether I should increase minclock from 3 to 6... But I tried this change on my system and now I see the NTP daemon is including all 5 servers and not discarding any. So it looks like this change solves the issue I was having
    with NTP.

    Your local ntpd is rejecting those servers because they are out of spec
    for the selection group, as others have said.

    If you bump maxclock you are effectively forcing a larger selection
    group, and that's why these worse-behaving servers are remaining in your selection pool.

    Be careful what you wish for.

    As for speeding up the convergence of NTP to sync up with the local system clock when it cannot reach any of the configured external servers, 'tos orphan
    10 orphanwait 0' did not unfortunately help. It still takes ~5 minutes to converge. Any ideas?

    The local system clock is already sync'd to the local system clock,
    regardless of what ntpd says.

    Iburst is your friend when it comes to getting the initial sync done
    quickly, when it can be used. Some refclocks will support iburst, some
    will not.

    Perhaps your "time distribution" systems need some better configuration.

    --
    Harlan Stenn <stenn@ntp.org>
    The NTP Project is part of
    https://www.nwtime.org/ - be a member!

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Wed Feb 19 15:04:12 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    Yes, in theory I should not bother about this NTP behavior, as it is after all controlled by the selection algorithm based on the quality of communication with the server, precision, etc. However, what made me follow up on this
    issue is mainly this:

    We allow user to configure up to 5 time sources on our system, and so if all 5
    time servers are supposedly good, then the expectation is that one of them will be selected as the system peer and the remaining ones as backup.

    A properly operating system should show one server preceded by a '*',
    two (in rare situations transiently more) preceded by a '+', and all the
    rest preceded by a '-' in the ntpq line, which is what you have.

    So what is your problem?

    Not having this behavior will create confusion and raise questions. I understand that the system peer may change as NTP polls the servers, but still one will be selected as system peer and the remaining will be backups (of course assuming all 5 are still considered good time sources per the selection algorithm!).


    Which is, from your examples, what is happening though it appears you
    have a strange view of what "backup" means.

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Wed Feb 19 15:13:35 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    We already have maxclock set to 10 on our system (the default value), so why do I need to bump it up further?
    What's wrong with bumping up minclock? Isn't this the value that allows more servers into the selection process, which is what I found in my testing?

    If we increase the number of servers in the selection process, NTP should still include only the good sources and discard the bad ones, but I am not sure whether this may slow down the selection process, since the more servers we have, the more calculations NTP needs to make before deciding which to keep and which to discard.

    Discarding a server from the selection process does NOT mean it is bad,
    it just means the ones selected are statistically "better" AT THE
    CURRENT TIME. Ntp is ALWAYS looking at ALL the configured servers to
    determine which are the statistically best.

    A "bad" server will show an 'x' at the beginning of the ntpq line.

  • From David Woolley@21:1/5 to rcheaito via questions Mailing List on Thu Feb 20 00:50:02 2025
    On 19/02/2025 22:02, rcheaito via questions Mailing List wrote:
    We allow user to configure up to 5 time sources on our system, and so if all 5
    time servers are supposedly good, then the expectation is that one of them will be selected as the system peer and the remaining ones as backup. Not

    I think you are putting too much weight on the system peer. The actual
    time is set based on a (I think weighted) average of more than one source.

    I also think, in terms of the original question, you are trying to shoot
    the messenger. You are getting sources rejected because you have
    sources which are not independent of each other. You need to fix that
    issue, rather than trying to get more accepted by insisting on a minimum
    number that must be used.

    If you say that you only allow users to specify five servers, you need
    to set a minimum that is less than five, to give some scope for
    rejecting some of them.

  • From Harlan Stenn via questions Mailing@21:1/5 to All on Thu Feb 20 00:28:12 2025
    To: questions@lists.ntp.org

    On 2/19/2025 8:49 AM, rcheaito (via questions Mailing List) wrote:
    We already have maxclock set to 10 on our system (default value), so why I need to bump it up further?
    What's wrong if I bump up minclock? Isn't this the value that allows to include more servers to the selection process, which is what I found with my testing?

    If we increase the number of servers into the selection process, NTP should still include only the good sources and discard the bad ones, but not sure if this may slow down the selection process as the more servers we have the more calculations NTP will need to make before deciding which ones to keep and/or discard?

    The - tally code in ntpq means "discarded by the cluster algorithm".

    Have you seen https://www.ntp.org/documentation/4.2.8-series/cluster/ ?

    Those hosts with the '-' have already survived the selection process.

    They were dropped from consideration by the (subsequent) cluster algorithm.
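    The pruning done by the cluster algorithm can be sketched roughly like this (simplified from the description in RFC 5905; offsets and jitters are illustrative milliseconds loosely modeled on the thread's first listing, not real data):

```python
# Simplified sketch of the NTPv4 cluster algorithm: repeatedly prune the
# survivor whose "selection jitter" (RMS distance of its offset from the
# other survivors' offsets) is worst, until only minclock survivors remain
# or the worst outlier is no noisier than the quietest individual clock.
import math

def selection_jitter(offset, others):
    """RMS of the differences between this offset and the other offsets."""
    return math.sqrt(sum((offset - o) ** 2 for o in others) / len(others))

def cluster(candidates, minclock=3):
    """candidates: name -> (offset_ms, peer_jitter_ms). Returns survivor names."""
    survivors = dict(candidates)
    while len(survivors) > minclock:
        sel = {
            name: selection_jitter(off, [o for n, (o, _) in survivors.items() if n != name])
            for name, (off, _) in survivors.items()
        }
        worst = max(sel, key=sel.get)
        if sel[worst] <= min(j for _, j in survivors.values()):
            break  # pruning further would not improve the estimate
        del survivors[worst]
    return sorted(survivors)

# Three peers behind one upstream plus two independent peers: the two
# independents are pruned first, mirroring the '-' tallies in the thread.
peers = {
    "peer_a": (+1.80, 0.47), "peer_b": (+1.88, 0.47), "peer_c": (+1.87, 0.44),
    "peer_d": (-3.02, 0.46), "peer_e": (+2.51, 0.59),
}
print(cluster(peers))
```

    Raising minclock to the number of configured servers simply makes the while-loop condition false, which is why all five then keep their '+' tallies.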

    --
    Harlan Stenn <stenn@ntp.org>
    NTP Project Lead. The NTP Project is part of
    https://www.nwtime.org/ - be a member!

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Thu Feb 20 17:45:57 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    When we have more than 3 servers configured (5 in our case), increasing the minclock from 3 to 6 should help keep all the 5 survivors by the cluster algorithm and terminate the pruning rounds faster, based on my understanding of the following snippets from documentation. And so this should lead to tagging the additional survivors in ntpq with a '+' sign instead of '-' sign as the additional survivors won't be discarded (as they are no longer pruned).
    Correct?

    And just what do you think this would accomplish?

    It certainly isn't going to make the time on the system any "better" and
    my gut feel is that it will just make jitter worse.

    Have you ever run any statistical analysis of what the system clock is
    actually doing or are you just counting + and - signs?

    <snip>

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Fri Feb 21 18:21:34 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I can say we are focusing more on the count of '+' and '-' signs. We may admittedly be overly paranoid about this, but the concern raised is that with 2 servers out of 5 showing as discarded ('-' sign) almost all the time, our clients will raise questions when they know that those discarded servers
    are working fine with other boxes (with '*' or '+' sign).

    It sounds to me like neither you nor your clients have a clue how ntp
    works and what the + and - signs actually mean to the overall operation
    of ntp.


    I did not run any statistical tests with those servers.

    From the different answers I got, increasing minclock helps replace the
    '-' signs with '+' signs on those previously discarded servers, but it may make our client more fragile. So what I take from all these discussions is that we had better keep minclock at 3 and just document this behavior as expected!


    Actually, the behavior is already well described and documented, which
    is why I question your obsession over it.

    Counting + and - signs tells you absolutely nothing about how accurate
    the system clock is.

    If you actually want to know how accurate the clock is, at least use
    something like ntpstat which will give information like:

    synchronised to UHF radio at stratum 1
    time correct to within 2 ms
    polling server every 16 s

    Better is to use ntpviz which will graph things and give in depth
    reports.
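    One low-tech version of that kind of statistical check (a sketch: the loopstats lines below are illustrative, and loopstats must first be enabled with "statistics loopstats" plus a "statsdir" in ntp.conf):

```python
# Sketch: judge clock quality from ntpd's loopstats rather than tally
# signs. The third field of each loopstats line is the clock offset in
# seconds; the sample lines below are made up for illustration.
import statistics

sample_loopstats = """\
60722 1534.339 0.000021 -1.432 0.000012 0.000004 6
60722 1598.341 -0.000014 -1.431 0.000011 0.000004 6
60722 1662.339 0.000007 -1.431 0.000010 0.000004 6
"""

offsets_ms = [float(line.split()[2]) * 1e3 for line in sample_loopstats.splitlines()]
print(f"mean offset {statistics.mean(offsets_ms):+.4f} ms, "
      f"stdev {statistics.stdev(offsets_ms):.4f} ms")
```

    A mean offset and spread over hours of loopstats says far more about clock accuracy than which peers happen to carry '+' or '-' at any instant.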

  • From William Unruh@21:1/5 to Jim Pennino on Sat Feb 22 04:06:32 2025
    On 2025-02-22, Jim Pennino <jimp@gonzo.specsol.net> wrote:
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I can say we are focusing more on the count of '+' and '-' signs. We may
    obviously be so paranoid by this, however the concern raised is that with 2
    servers out of 5 showing as discarded ('-' sign) almost all the time, our
    clients will have questions raised when they know that those discarded servers
    are working fine with other boxes (with '*' or '+' sign).

    It sounds to me like neither you nor your clients have a clue how ntp
    works and what the + and - signs actually mean to the overall operation
    of ntp.


    I did not run any statistical tests with those servers.

    From the different answers I got, increasing the minclock helps replace the
    '-' signs with the '+' signs against those previously discarded servers, but
    it may make our client more fragile. So, what I take from all those
    discussions that we better keep the minclock as 3 and then just document this
    behavior as expected!


    Actually, the behavior is already well described and documented, which
    is why I question your obsession over it.

    Counting + and - signs tells you absolutely nothing about how accurate
    the system clock is.

    If you actually want to know how accurate the clock is, at least use something like ntpstat which will give information like:

    synchronised to UHF radio at stratum 1
    time correct to within 2 ms
    polling server every 16 s

    UHF is a terrible server. Its precision is 2 ms. Its accuracy is way
    worse than that. It is not a server for which you can determine the round
    trip time (your system to the UHF radio station, and back to your
    system). If you want to determine the accuracy of your time, get a GPS
    time receiver. GPS knows both where you are and where the satellite is,
    and thus can accurately determine the one-way distance and the one-way
    time lag (well, modulo the atmospheric lag fluctuation due to the
    ionosphere, and the water vapour changes in the air). Also, while you are
    at it, you can use the GPS time on one of your servers and get an
    accuracy of microseconds, not milliseconds. One GPS clock can be worth a
    million internet servers.


    Better is to use ntpviz which will graph things and give in depth
    reports.



  • From William Unruh@21:1/5 to rcheaito via questions Mailing List on Sat Feb 22 03:40:03 2025
    On 2025-02-21, rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I can say we are focusing more on the count of '+' and '-' signs. We may obviously be so paranoid by this, however the concern raised is that with 2 servers out of 5 showing as discarded ('-' sign) almost all the time, our clients will have questions raised when they know that those discarded servers
    are working fine with other boxes (with '*' or '+' sign).

    I did not run any statistical tests with those servers.

    From the different answers I got, increasing the minclock helps replace the
    '-' signs with the '+' signs against those previously discarded servers, but it may make our client more fragile. So, what I take from all those discussions that we better keep the minclock as 3 and then just document this behavior as expected!

    You do not listen to anyone.
    a) .LOCL. is the local clock. Obviously it is always in perfect
    agreement with the local clock. It is useless as a server,

    b) 3 of your sources have the same server. That they agree with each
    other is no surprise. They are the ones that the system uses as its
    preferred sources. Bad idea. You might as well have just one server
    instead of those three.

    c) 3 is not a magic number. The other sources are rejected because they
    disagree with those 3 sources (which are effectively only one source,
    because they all use the same upstream server).
    Get rid of two of them and replace them with two other servers which do
    NOT have the same server feeding them.
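    A hedged sketch of that fix as an ntp.conf fragment (the example.net
    hostnames are placeholders, not real servers; the point is that each
    entry should have an independent upstream):

```
# Illustrative: five servers fed by *different* upstream sources, so
# that no three of them collapse into a single effective source.
server 192.168.1.140 iburst       # upstream: 149.56.19.163
server 192.168.1.146 iburst       # upstream: 10.176.6.101
server ntp-a.example.net iburst   # placeholder: independent upstream
server ntp-b.example.net iburst   # placeholder: independent upstream
server ntp-c.example.net iburst   # placeholder: independent upstream
```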





  • From Jim Pennino@21:1/5 to William Unruh on Sat Feb 22 07:26:57 2025
    William Unruh <unruh@invalid.ca> wrote:
    On 2025-02-22, Jim Pennino <jimp@gonzo.specsol.net> wrote:
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I can say we are focusing more on the count of '+' and '-' signs. We may admittedly be overly paranoid about this, but the concern raised is that with 2 servers out of 5 showing as discarded ('-' sign) almost all the time, our clients will raise questions when they know that those discarded servers
    are working fine with other boxes (with '*' or '+' sign).

    It sounds to me like neither you nor your clients have a clue how ntp
    works and what the + and - signs actually mean to the overall operation
    of ntp.
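    For the record, the tally characters in the first column of ntpq -p
    are documented states, and '-' is a normal one; a small sketch mapping
    them to their documented meanings:

```python
# The ntpq -p tally codes (first column of the billboard), per the
# ntpq documentation: '-' marks a statistical outlier, not a dead server.
TALLY = {
    " ": "reject: peer discarded as not valid",
    "x": "falseticker: discarded by the intersection algorithm",
    ".": "excess: culled from the end of the candidate list",
    "-": "outlier: discarded by the cluster algorithm",
    "+": "candidate: survivor, contributes to the combined offset",
    "#": "backup: survivor, but more sources than tos maxclock",
    "*": "sys.peer: the peer the system clock is synchronized to",
    "o": "pps.peer: synchronized to a PPS signal from this peer",
}

def describe(tally: str) -> str:
    """Return the documented meaning of a tally character."""
    return TALLY.get(tally, "unknown tally code")

print(describe("-"))
```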


    I did not run any statistical tests with those servers.

    From the different answers I got, increasing the minclock helps replace the
    '-' signs with '+' signs against those previously discarded servers, but
    it may make our client more fragile. So, what I take from all those
    discussions is that we had better keep minclock at 3 and just document this
    behavior as expected!


    Actually, the behavior is already well described and documented, which
    is why I question your obsession over it.

    Counting + and - signs tells you absolutely nothing about how accurate
    the system clock is.

    If you actually want to know how accurate the clock is, at least use
    something like ntpstat which will give information like:

    synchronised to UHF radio at stratum 1
    time correct to within 2 ms
    polling server every 16 s

    UHF is a terrible server. Its precision is 2ms. Its accuracy is way
    worse than that.

    It is actually GNSS.

    *127.127.28.0   .SHM.   0 l   15   16  377  0.000  -0.528  3.298
    -192.168.0.21   .PPS.   1 u   64   64  377  1.241  -1.550  2.182
    +192.168.0.100  .PPS.   1 u   32   64  377  1.285  -3.298  1.780
    +192.168.0.101  .PPS.   1 u   36   64  377  1.310  -4.271  2.197
    -192.168.0.185  .PPS.   1 u   48   64  377  1.236  +0.784  2.147

    The local clock time offset has a mean of -0.004 ms and a standard
    deviation of 1.120 ms.
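    Figures like that mean and standard deviation are straightforward to
    compute from logged offsets; a minimal sketch using Python's statistics
    module on sample values loosely based on the peer offsets quoted above
    (real figures would come from loopstats/peerstats logs):

```python
import statistics

# Hypothetical offset samples in ms; values are for illustration only.
offsets_ms = [-0.528, -1.550, -3.298, -4.271, 0.784]

mean = statistics.mean(offsets_ms)
stdev = statistics.stdev(offsets_ms)  # sample standard deviation
print(f"mean {mean:+.3f} ms, stdev {stdev:.3f} ms")
```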

    Why ntpstat calls it UHF radio I have no clue.

    It is not a server for which you can determine the round-trip time
    (your system to the UHF radio station, and back to your server).

    I think it is calling the .SHM. server UHF, which means for all
    practical purposes the rtt is basically zero as shown in the ntpq line.

    If you want to determine the accuracy of your time, get a GPS
    time receiver. GPS knows both where you are and where the satellite is,
    and thus can accurately determine the one-way distance and the one-way
    time lag (well, modulo the atmospheric lag fluctuation due to the
    ionosphere, and the water vapour changes in the air). Also, while you are
    at it, you can use the GPS time and one of your servers and get an
    accuracy of microseconds, not milliseconds. One GPS clock can be worth a million internet servers.

    Note that all five servers are GNSS and are on a local 5G WiFi network.

    The .100 and .101 servers are GNSS ntp appliance boxes that are accurate
    to a bit less than 100 us.

    The .21 server is a Raspberry Pi 4 with a GNSS card and is accurate to
    about 60 us.

    The .185 server is a serial-attached GNSS box with a high-accuracy,
    temperature-controlled, GNSS-steered oscillator with a PPS accuracy of
    +/- 1 nanosecond. The server is typically accurate to less than 5 us.

  • From Miroslav Lichvar@21:1/5 to Jim Pennino on Mon Feb 24 08:56:08 2025
    On 2025-02-22, Jim Pennino <jimp@gonzo.specsol.net> wrote:
    Why ntpstat calls it UHF radio I have no clue.

    That's how ntpd describes SHM refclocks in the mode-6 protocol. See

    https://www.rfc-editor.org/rfc/rfc9327.html#table-3

    By the ITU definition, GPS is in the UHF band, so that actually seems to
    be correct, even though SHM is used with other time sources, e.g.
    longwave radio signals.

    --
    Miroslav Lichvar
