• NTP discarding servers when we have more than 3 servers configured, and

    From rcheaito via questions Mailing List@21:1/5 to All on Sun Feb 16 20:28:00 2025
    I am using NTP 4.8.2p15 with VxWorks. The current issue I have with NTP
    is that when I configure more than 3 servers (my box allows configuring 5 servers), the daemon always discards the servers beyond 3, as in
    this example (2 out of 5); i.e., I cannot use more than 3 servers, otherwise the NTP client won't sync to all of them.

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  101m     8     0   0.000  +0.000   0.000
    -192.168.1.143   10.32.35.198     3 u   410   512   377   0.532  -3.020   0.460
    +2620:11b:d06d:f 10.176.6.101     3 u    48   512   377   0.757  +1.801   0.472
    -192.168.1.140   149.56.19.163    3 u    81   512   377   0.553  +2.506   0.593
    *192.168.1.146   10.176.6.101     3 u    82   512   377   0.776  +1.880   0.469
    +192.168.1.204   10.176.6.101     3 u   374   512   377   0.545  +1.867   0.440

    After some time, the daemon may resync with one of the rejected servers, but
    it keeps rejecting at least one of the servers.

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  192m     8     0   0.000  +0.000   0.000
    -192.168.1.143   10.32.35.198     3 u    87  1024   377   0.553  -7.216   2.259
    +2620:11b:d06d:f 10.176.6.101     3 u   825  1024   377   0.676  -1.294   1.693
    *192.168.1.140   149.56.19.163    3 u   266  1024   377   0.595  -3.259   3.592
    +192.168.1.146   10.176.6.101     3 u   233  1024   377   0.828  -2.009   2.260
    +192.168.1.204   10.176.6.101     3 u  1049  1024   377   0.553  -2.856   2.957

    This is another instance from a different box but with the same 5 servers configured:

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  117m     8     0   0.000  +0.000   0.000
    *2620:11b:d06d:f 10.176.6.101     3 u   232   256   377   1.273  +0.107   0.395
    -192.168.1.146   10.176.6.101     3 u   138   512   377   1.302  +0.342   0.214
    +192.168.1.140   149.56.19.163    3 u   211   512   377   1.092  -0.059   0.166
    +192.168.1.204   10.176.6.101     3 u     8   512   377   1.123  -0.395   0.339
    -192.168.1.143   10.32.35.198     3 u   246   256   377   1.176  -4.317   0.308


    Daemon config:

    ntpd.config.param=restrict 127.0.0.1;server 127.127.1.0 minpoll 3 maxpoll 3 iburst;
        server 2620:11b:d06d:f10a:4a4d:7eff:fea2:b2d1 iburst minpoll 6 maxpoll 10;
        server 192.168.1.146 iburst minpoll 6 maxpoll 10;
        server 192.168.1.140 iburst minpoll 6 maxpoll 10;
        server 192.168.1.204 iburst minpoll 6 maxpoll 10;
        server 192.168.1.243 iburst minpoll 6 maxpoll 10;

    ntpd.init.param=-g -f /tffs0/ntpd_driftfile

    I checked reported bugs against 4.8.2p15 to .p18 and did not find any related to this issue.

    Note that if I configure only 3 servers (any) out of the above 5, NTP daemon synchs to all 3 with no issue.

    Is this a known limitation of the NTP daemon when more than 3 servers are configured? Or is this expected behavior due, for example, to changes in network latency, offset, jitter, etc.? Has anyone else seen a similar issue?

    Also, another issue I came across: if NTP is configured as client + server, it takes around 5 minutes to converge and sync to the local system clock when the NTP client cannot sync to any configured server. Is there anything I can do to speed up the sync to the local system clock in this case?

    Any insights and help to solve those issues will be appreciated.

    Thanks,
    RC

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From William Unruh@21:1/5 to rcheaito via questions Mailing List on Sun Feb 16 22:12:51 2025
    On 2025-02-16, rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I am using NTP 4.8.2p15 with VxWorks. The current issue that I have with NTP is that when I configure more than 3 servers (my box allows to configure 5 servers), the daemon will always discard servers configured above 3, as in this example (2 out of 5), i.e., cannot use more than 3 servers, otherwise NTP
    client won't sync to all servers.

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  101m     8     0   0.000  +0.000   0.000
    -192.168.1.143   10.32.35.198     3 u   410   512   377   0.532  -3.020   0.460
    +2620:11b:d06d:f 10.176.6.101     3 u    48   512   377   0.757  +1.801   0.472
    -192.168.1.140   149.56.19.163    3 u    81   512   377   0.553  +2.506   0.593
    *192.168.1.146   10.176.6.101     3 u    82   512   377   0.776  +1.880   0.469
    +192.168.1.204   10.176.6.101     3 u   374   512   377   0.545  +1.867   0.440

    Why do you have .LOCL. in there? It is useless as a time source (it is
    like looking at your own watch to set the time on that same watch; it
    is always perfectly in time).
    As to the three servers: the two, 192.168.1.143 and 192.168.1.140, are out
    by 2 standard deviations from the other three (2620:11b:d06d:f, 192.168.1.146, 192.168.1.204),
    so they are regarded as falsetickers. Why would you want ntp to use clocks
    which are possibly off from the real time? It is not the number but how
    well they fit in with the other time sources. NTP has no idea what the
    right time is. It relies on a majority vote from its servers.

    What I worry about is that four of your servers are all on the local
    network and three of them get their time from the same source, 10.176.6.101,
    and those three all agree with each other. I.e., those three are hardly
    independent sources; they are effectively all the same source.

    You should always use at least three INDEPENDENT sources, i.e., sources
    which cannot be traced back to the same ultimate source.
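    As a rough sketch of that advice in ntp.conf form (the pool hostnames below are illustrative placeholders, not a recommendation for this specific network): drop the local clock, keep only one of the peers behind 10.176.6.101, and add independent outside sources.

```conf
# Sketch only: one peer per ultimate upstream, plus independent sources.
server 192.168.1.146 iburst minpoll 6 maxpoll 10   # keep ONE 10.176.6.101 peer
server 192.168.1.140 iburst minpoll 6 maxpoll 10   # different upstream (149.56.19.163)
server 0.pool.ntp.org iburst                       # independent outside source
server 1.pool.ntp.org iburst                       # independent outside source
```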


    After some time, the daemon may resync with one of the rejected servers, but it keeps rejecting at least one of the servers.

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  192m     8     0   0.000  +0.000   0.000
    -192.168.1.143   10.32.35.198     3 u    87  1024   377   0.553  -7.216   2.259
    +2620:11b:d06d:f 10.176.6.101     3 u   825  1024   377   0.676  -1.294   1.693
    *192.168.1.140   149.56.19.163    3 u   266  1024   377   0.595  -3.259   3.592
    +192.168.1.146   10.176.6.101     3 u   233  1024   377   0.828  -2.009   2.260
    +192.168.1.204   10.176.6.101     3 u  1049  1024   377   0.553  -2.856   2.957

    This is another instance from a different box but with same 5 servers configured:

    remote           refid           st t  when  poll reach   delay  offset  jitter
    ===============================================================================
     127.127.1.0     .LOCL.           5 l  117m     8     0   0.000  +0.000   0.000
    *2620:11b:d06d:f 10.176.6.101     3 u   232   256   377   1.273  +0.107   0.395
    -192.168.1.146   10.176.6.101     3 u   138   512   377   1.302  +0.342   0.214
    +192.168.1.140   149.56.19.163    3 u   211   512   377   1.092  -0.059   0.166
    +192.168.1.204   10.176.6.101     3 u     8   512   377   1.123  -0.395   0.339
    -192.168.1.143   10.32.35.198     3 u   246   256   377   1.176  -4.317   0.308


    Daemon config:

    ntpd.config.param=restrict 127.0.0.1;server 127.127.1.0 minpoll 3 maxpoll 3 iburst;server 2620:11b:d06d:f10a:4a4d:7eff:fea2:b2d1 iburst minpoll 6 maxpoll 10;server 192.168.1.146 iburst minpoll 6 maxpoll 10;server 192.168.1.140 iburst minpoll 6 maxpoll 10;server 192.168.1.204 iburst minpoll 6 maxpoll 10; server 192.168.1.243 iburst minpoll 6 maxpoll 10;


    Get rid of .LOCL. as a server. Choose ONE of the servers which use 10.176.6.101.
    ntpd.init.param=-g -f /tffs0/ntpd_driftfile

    I checked reported bugs against 4.8.2p15 to .p18 and did not find any related to this issue.

    You are interpreting the issue wrongly.


    Note that if I configure only 3 servers (any) out of the above 5, NTP daemon synchs to all 3 with no issue.

    Yes, because now none are falsetickers.


    Is it a known limitation with NTP daemon when there are more than 3 servers configured? Or is this an expected behavior due for example to changes in network latency, offset, jitter…? Anyone else have similar issue?

    It has nothing to do with "3 sources". It has to do with three of them
    all having the same upstream source.

    Also, another issue I came across is that if NTP is configured as client + server, it takes around 5 minutes to converge to sync up with Local system clock if NTP client cannot sync up with the configured server. Anything I can
    do to speed up the sync up with local system clock in this case?

    The local system clock is always synced with the local system clock. Also,
    it takes a while for ntp to gather enough statistics to figure out how
    far out the local system clock is from the others. NTP is NOT for
    switching on and off; it is for running "forever". You can tell ntp to
    zero in on some one server at startup and jump the time to agree with
    that one. Of course it has no idea yet how bad the local clock rate is
    with respect to that clock, so even though it immediately has the same
    time, it could drift away until ntp has a good idea of how bad the
    local clock rate is compared to the server clock rate. That takes time.

    Any insights and help to solve those issues will be appreciated.

    Thanks,
    RC


  • From Harlan Stenn via questions Mailing@21:1/5 to All on Mon Feb 17 07:53:00 2025
    To: pessimus192@gmail.com
    To: questions@lists.ntp.org

    On 2/16/2025 5:04 PM, rcheaito (via questions Mailing List) wrote:
    Thanks @pessimus192 for your reply. Just for own clarification, do I need to increase minclock to 6 or maxclock to 6?
    On my system, minclock = 3 and maxclock is 10.

    Have you looked at the ntpq documentation?

    Do you understand the 'tally' code in the first column?

    --
    Harlan Stenn <stenn@nwtime.org>
    https://www.nwtime.org/ - be a member!

  • From Danny Mayer via questions Mailing L@21:1/5 to All on Mon Feb 17 22:48:00 2025
    To: questions@lists.ntp.org

    On 2/17/25 3:13 PM, rcheaito (via questions Mailing List) wrote:
    Hi Harlan,

    You mean the following:

    + : included by the combine algorithm
    # : backup (more than tos maxclock sources)
    ' ': (a space) discarded as not valid (TEST10-TEST13)
    x : discarded by intersection algorithm
    . : discarded by table overflow (not used)
    - : discarded by the cluster algorithm

    In our case, the servers above 3 are being discarded with the - sign, i.e., by
    the cluster algorithm.

    From documentation:
    minclock minclock
    Specify the number of servers used by the clustering algorithm as the minimum to include on the candidate list. The default is 3. This is also the number of
    servers to be averaged by the combining algorithm

    So from the above, I understand that I need to increase minclock from 3 to 6. Is this correct?

    No. They are being thrown out because their offsets are too far outside
    those of the rest of the list.

    The minclock is related to how often the server is being queried.
    Raising it merely means you are asking less frequently.

    Danny

  • From Danny Mayer via questions Mailing L@21:1/5 to Edward McGuire on Mon Feb 17 23:13:00 2025
    Copy: rcheaito@yahoo.com
    Copy: questions@lists.ntp.org

    On 2/17/25 6:03 PM, Edward McGuire wrote:
    On Mon, Feb 17, 2025 at 4:40 PM Danny Mayer <questions@lists.ntp.org> wrote:
    The minclock is related to how often the server is being queried.
    Raising it merely means you are asking less frequently.

    Hi Danny, this is actually not right. From the fine manual: "tos
    minclock /minclock/ [...] Specify the number of servers used by the clustering algorithm as the minimum to include on the candidate list.
    The default is 3. This is also the number of servers to be averaged by
    the combining algorithm." You might be thinking of the "minpoll"
    parameter, used to adjust peer polling interval.

    Oh, yes, you're right. However increasing the minimum number of servers
    doesn't really help here.

    Danny


  • From Harlan Stenn via questions Mailing@21:1/5 to James Browning on Mon Feb 17 23:38:00 2025
    To: questions@lists.ntp.org

    Hi James,

    On 2/17/2025 3:04 PM, James Browning wrote:
    On Mon, Feb 17, 2025, 14:40 Danny Mayer <questions@lists.ntp.org <mailto:questions@lists.ntp.org>> wrote:

    No. They are being thrown out because the offset is too far outside the
    ones compared to the rest of the list.

    The minclock is related to how often the server is being queried.
    Raising it merely means you are asking less frequently.


    Then, please forgive this fool because I thought:
    Minpoll was how frequently* you will try to get data from clocks (in 2^n seconds)

    I would say that it sets the floor on how frequently the other side is
    polled.

    Maxpoll was how infrequently* by the same metric.

    I would say that it sets the ceiling on how frequently ...

    NTP will dynamically set the poll interval, bounded by minpoll and maxpoll.

    Minsane was how many clocks need to agree for good time.
    Minclock was how many clocks are needed including those who disagree.
    Maxclock was how many can get to the quorum at most.

    * barring burst and iburst which I consider not very nice.

    While I have seen rare use cases for 'burst', iburst is great for
    initial clock sync. In fairly common cases it means the local clock is
    sync'd in about 10 seconds with iburst, compared to about 10 minutes
    without iburst.
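    As a sketch (the server name below is a placeholder), the difference is just the presence of the keyword on the server line:

```conf
# With iburst: a burst of packets at short spacing at startup, so the
# first sync typically lands within seconds rather than poll intervals.
server time.example.org iburst minpoll 6 maxpoll 10
# Without iburst, the same association waits out full poll intervals:
# server time.example.org minpoll 6 maxpoll 10
```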

    --
    Harlan Stenn <stenn@ntp.org>
    The NTP Project is part of
    https://www.nwtime.org/ - be a member!

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Tue Feb 18 14:06:02 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I changed the minclock on my system from 3 to 6 and it looks like this solves the issue of discarding the servers above 3.

    For the second issue, using the 'tos orphan 10 orphanwait 0' did not help. If anyone has other ideas, please let me know.


    I think you are overthinking the whole thing.

    Look at this:

    *127.127.28.0    .SHM.            0 l     3    16   377   0.000  -0.315   0.896
    +192.168.0.21    .PPS.            1 u     6    64   377   1.260  +0.694   2.345
    +192.168.0.100   .PPS.            1 u    30    64   377   1.264  +0.592   3.476
    +192.168.0.101   .PPS.            1 u    55    64   377   1.196  -0.280   3.157
    +192.168.0.185   .PPS.            1 u     1    64   377   1.223  +0.727   1.362

    And at some time later:

    *127.127.28.0    .SHM.            0 l    13    16   377   0.000  +0.680   1.344
    -192.168.0.21    .PPS.            1 u    64    64   377   1.260  +0.694   2.345
    +192.168.0.100   .PPS.            1 u    22    64   377   1.264  +0.592   3.694
    +192.168.0.101   .PPS.            1 u    46    64   377   1.132  -2.246   2.722
    -192.168.0.185   .PPS.            1 u    59    64   377   1.223  +0.727   1.362

    SHM is a USB GNSS dongle which by itself provides crap time as it has no
    PPS output, yet it is the selected server thanks to the other four
    servers which are all on the local network and stratum 1.

    There are no options in the config file except for minpoll 4 on the USB
    device and iburst for all servers for a quicker restart.

    No minclock, no tos anything, nothing.

    So the +/- signs at the start of the line move around a bit. So what?

    If any of them change to x, then there is a problem.

  • From William Unruh@21:1/5 to rcheaito via questions Mailing List on Wed Feb 19 01:10:59 2025
    On 2025-02-17, rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    Hi Harlan,

    You mean the following:

    + : included by the combine algorithm
    # : backup (more than tos maxclock sources)
    ' ': (a space) discarded as not valid (TEST10-TEST13)
    x : discarded by intersection algorithm
    . : discarded by table overflow (not used)
    - : discarded by the cluster algorithm

    In our case, the servers above 3 are being discarded with the - sign, i.e., by
    the cluster algorithm.

    No, they are not discarded because they are more than three. They are
    discarded because those other two have offsets larger than the standard
    deviation of those three. And those three have a smaller standard
    deviation because they are all from the same source, i.e., they are NOT
    independent. So if their server happens to have a small jitter, they will
    overwhelm the others.
    What you are essentially doing is giving one source three times the
    weight of any other source of time. It is thus highly probable that it
    will dominate the time source.
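    That point can be illustrated with a small sketch (offsets taken loosely from the first ntpq listing in the thread; the 2-sigma cut is an illustration, not ntpd's exact test):

```python
# Illustration only: three peers fed by one upstream cluster so tightly
# that any independent peer looks like a multi-sigma outlier.
import statistics

same_upstream = [1.801, 1.880, 1.867]   # offsets (ms) of the 10.176.6.101 peers
independent   = [-3.020, 2.506]         # offsets (ms) of the two rejected peers

mean  = statistics.mean(same_upstream)
sigma = statistics.stdev(same_upstream)
for off in independent:
    tag = "outlier" if abs(off - mean) > 2 * sigma else "ok"
    print(f"{off:+.3f} ms: {tag}")
```

    Both rejected peers sit dozens of sigmas away from the tight trio, so they get dropped no matter how many servers are configured.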


    From documentation:
    minclock minclock
    Specify the number of servers used by the clustering algorithm as the minimum to include on the candidate list. The default is 3. This is also the number of
    servers to be averaged by the combining algorithm

    So from the above, I understand that I need to increase minclock from 3 to 6. Is this correct?

    If you want to lose all time sources, yes. I.e., if there are not 6 which
    agree, then deliver nothing. That is just silly.

    a) Get rid of .LOCL.
    b) Choose just one of the servers that come from that one source, not three.


  • From Harlan Stenn via questions Mailing@21:1/5 to rcheaito@yahoo.com on Wed Feb 19 04:13:00 2025
    To: questions@lists.ntp.org

    On 2/18/2025 5:26 AM, rcheaito@yahoo.com wrote:
    Thank you all for the replies. I still have not seen a confirmation or objection on whether I should increase minclock from 3 to 6... But I tried this change on my system and now I see the NTP daemon is including all 5 servers and not discarding any. So it looks like this change solves the issue I was having
    with NTP.

    Your local ntpd is rejecting those servers because they are out of spec
    for the selection group, as others have said.

    If you bump maxclock you are effectively forcing a larger selection
    group, and that's why these worse-behaving servers are remaining in your selection pool.

    Be careful what you wish for.

    As for speeding up the convergence of NTP to sync up with the local system clock when it cannot reach any of the configured external servers, 'tos orphan
    10 orphanwait 0' did not unfortunately help. It still takes ~5 minutes to converge. Any ideas?

    The local system clock is already sync'd to the local system clock,
    regardless of what ntpd says.

    Iburst is your friend when it comes to getting the initial sync done
    quickly, when it can be used. Some refclocks will support iburst, some
    will not.

    Perhaps your "time distribution" systems need some better configuration.

    --
    Harlan Stenn <stenn@ntp.org>
    The NTP Project is part of
    https://www.nwtime.org/ - be a member!

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Wed Feb 19 15:04:12 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    Yes, in theory I should not bother about this NTP behavior, as it is after all controlled by the selection algorithm based on the quality of communication with the server, precision, etc. However, what made me follow up on this
    issue is mainly this:

    We allow user to configure up to 5 time sources on our system, and so if all 5
    time servers are supposedly good, then the expectation is that one of them will be selected as the system peer and the remaining ones as backup.

    A properly operating system should show one server preceded by a '*',
    two (in rare situations transiently more) preceded by a '+', and all the
    rest preceded by a '-' in the ntpq line, which is what you have.

    So what is your problem?

    Not having this behavior will create confusion and raise questions. I understand that the system peer may change as NTP polls the servers, but still one will be selected as system peer and the remaining will be backups (of course assuming all 5 are still considered good time sources per the selection algorithm!).


    Which is, from your examples, what is happening though it appears you
    have a strange view of what "backup" means.

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Wed Feb 19 15:13:35 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    We already have maxclock set to 10 on our system (the default value), so why do I need to bump it up further?
    What's wrong with bumping up minclock? Isn't this the value that allows more servers into the selection process, which is what I found in my testing?

    If we increase the number of servers in the selection process, NTP should still include only the good sources and discard the bad ones, but I am not sure whether this may slow down the selection process, since the more servers we have, the more calculations NTP needs to make before deciding which to keep and which to discard.

    Discarding a server from the selection process does NOT mean it is bad,
    it just means the ones selected are statistically "better" AT THE
    CURRENT TIME. Ntp is ALWAYS looking at ALL the configured servers to
    determine which are the statistically best.

    A "bad" server will show an 'x' at the beginning of the ntpq line.

  • From David Woolley@21:1/5 to rcheaito via questions Mailing List on Thu Feb 20 00:50:02 2025
    On 19/02/2025 22:02, rcheaito via questions Mailing List wrote:
    We allow user to configure up to 5 time sources on our system, and so if all 5
    time servers are supposedly good, then the expectation is that one of them will be selected as the system peer and the remaining ones as backup. Not

    I think you are putting too much weight on the system peer. The actual
    time is set based on a (I think weighted) average of more than one source.

    I also think, in terms of the original question, you are trying to shoot
    the messenger. You are getting sources rejected because you have
    sources which are not independent of each other. You need to fix that
    issue, rather than trying to get more accepted by insisting on a minimum
    number that must be used.

    If you say that you only allow users to specify five servers, you need
    to set a minimum that is less than five, to give some scope for
    rejecting some of them.

  • From Harlan Stenn via questions Mailing@21:1/5 to All on Thu Feb 20 00:28:12 2025
    To: questions@lists.ntp.org

    On 2/19/2025 8:49 AM, rcheaito (via questions Mailing List) wrote:
    We already have maxclock set to 10 on our system (default value), so why I need to bump it up further?
    What's wrong if I bump up minclock? Isn't this the value that allows to include more servers to the selection process, which is what I found with my testing?

    If we increase the number of servers into the selection process, NTP should still include only the good sources and discard the bad ones, but not sure if this may slow down the selection process as the more servers we have the more calculations NTP will need to make before deciding which ones to keep and/or discard?

    The - tally code in ntpq means "discarded by the cluster algorithm".

    Have you seen https://www.ntp.org/documentation/4.2.8-series/cluster/ ?

    Those hosts with the '-' have already survived the selection process.

    They were dropped from consideration by the (subsequent) cluster algorithm.
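    The pruning done by the cluster algorithm can be sketched roughly like this (simplified from the description in RFC 5905; offsets and jitters are illustrative milliseconds loosely modeled on the thread's first listing, not real data):

```python
# Simplified sketch of the NTPv4 cluster algorithm: repeatedly prune the
# survivor whose "selection jitter" (RMS distance of its offset from the
# other survivors' offsets) is worst, until only minclock survivors remain
# or the worst outlier is no noisier than the quietest individual clock.
import math

def selection_jitter(offset, others):
    """RMS of the differences between this offset and the other offsets."""
    return math.sqrt(sum((offset - o) ** 2 for o in others) / len(others))

def cluster(candidates, minclock=3):
    """candidates: name -> (offset_ms, peer_jitter_ms). Returns survivor names."""
    survivors = dict(candidates)
    while len(survivors) > minclock:
        sel = {
            name: selection_jitter(off, [o for n, (o, _) in survivors.items() if n != name])
            for name, (off, _) in survivors.items()
        }
        worst = max(sel, key=sel.get)
        if sel[worst] <= min(j for _, j in survivors.values()):
            break  # pruning further would not improve the estimate
        del survivors[worst]
    return sorted(survivors)

# Three peers behind one upstream plus two independent peers: the two
# independents are pruned first, mirroring the '-' tallies in the thread.
peers = {
    "peer_a": (+1.80, 0.47), "peer_b": (+1.88, 0.47), "peer_c": (+1.87, 0.44),
    "peer_d": (-3.02, 0.46), "peer_e": (+2.51, 0.59),
}
print(cluster(peers))
```

    Raising minclock to the number of configured servers simply makes the while-loop condition false, which is why all five then keep their '+' tallies.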

    --
    Harlan Stenn <stenn@ntp.org>
    NTP Project Lead. The NTP Project is part of
    https://www.nwtime.org/ - be a member!

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Thu Feb 20 17:45:57 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    When we have more than 3 servers configured (5 in our case), increasing the minclock from 3 to 6 should help keep all the 5 survivors by the cluster algorithm and terminate the pruning rounds faster, based on my understanding of the following snippets from documentation. And so this should lead to tagging the additional survivors in ntpq with a '+' sign instead of '-' sign as the additional survivors won't be discarded (as they are no longer pruned).
    Correct?

    And just what do you think this would accomplish?

    It certainly isn't going to make the time on the system any "better" and
    my gut feel is that it will just make jitter worse.

    Have you ever run any statistical analysis of what the system clock is
    actually doing or are you just counting + and - signs?

    <snip>

  • From Jim Pennino@21:1/5 to rcheaito via questions Mailing List on Fri Feb 21 18:21:34 2025
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I can say we are focusing more on the count of '+' and '-' signs. We may admittedly be overly paranoid about this, but the concern raised is that with 2 servers out of 5 showing as discarded ('-' sign) almost all the time, our clients will raise questions when they know that those discarded servers
    are working fine with other boxes (with '*' or '+' sign).

    It sounds to me like neither you nor your clients have a clue how ntp
    works and what the + and - signs actually mean to the overall operation
    of ntp.


    I did not run any statistical tests with those servers.

    From the different answers I got, increasing minclock helps replace the
    '-' signs with '+' signs on those previously discarded servers, but it may make our client more fragile. So what I take from all these discussions is that we had better keep minclock at 3 and just document this behavior as expected!


    Actually, the behavior is already well described and documented, which
    is why I question your obsession over it.

    Counting + and - signs tells you absolutely nothing about how accurate
    the system clock is.

    If you actually want to know how accurate the clock is, at least use
    something like ntpstat which will give information like:

    synchronised to UHF radio at stratum 1
    time correct to within 2 ms
    polling server every 16 s

    Better is to use ntpviz which will graph things and give in depth
    reports.
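    One low-tech version of that kind of statistical check (a sketch: the loopstats lines below are illustrative, and loopstats must first be enabled with "statistics loopstats" plus a "statsdir" in ntp.conf):

```python
# Sketch: judge clock quality from ntpd's loopstats rather than tally
# signs. The third field of each loopstats line is the clock offset in
# seconds; the sample lines below are made up for illustration.
import statistics

sample_loopstats = """\
60722 1534.339 0.000021 -1.432 0.000012 0.000004 6
60722 1598.341 -0.000014 -1.431 0.000011 0.000004 6
60722 1662.339 0.000007 -1.431 0.000010 0.000004 6
"""

offsets_ms = [float(line.split()[2]) * 1e3 for line in sample_loopstats.splitlines()]
print(f"mean offset {statistics.mean(offsets_ms):+.4f} ms, "
      f"stdev {statistics.stdev(offsets_ms):.4f} ms")
```

    A mean offset and spread over hours of loopstats says far more about clock accuracy than which peers happen to carry '+' or '-' at any instant.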

  • From William Unruh@21:1/5 to Jim Pennino on Sat Feb 22 04:06:32 2025
    On 2025-02-22, Jim Pennino <jimp@gonzo.specsol.net> wrote:
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I can say we are focusing more on the count of '+' and '-' signs. We may
    obviously be so paranoid by this, however the concern raised is that with 2
    servers out of 5 showing as discarded ('-' sign) almost all the time, our
    clients will have questions raised when they know that those discarded servers
    are working fine with other boxes (with '*' or '+' sign).

    It sounds to me like neither you nor your clients have a clue how ntp
    works and what the + and - signs actually mean to the overall operation
    of ntp.


    I did not run any statistical tests with those servers.

    From the different answers I got, increasing the minclock helps replace the
    '-' signs with the '+' signs against those previously discarded servers, but
    it may make our client more fragile. So, what I take from all those
    discussions that we better keep the minclock as 3 and then just document this
    behavior as expected!


    Actually, the behavior is already well described and documented, which
    is why I question your obsession over it.

    Counting + and - signs tells you absolutely nothing about how accurate
    the system clock is.

    If you actually want to know how accurate the clock is, at least use something like ntpstat which will give information like:

    synchronised to UHF radio at stratum 1
    time correct to within 2 ms
    polling server every 16 s

    UHF is a terrible server. Its precision is 2 ms. Its accuracy is way
    worse than that. It is not a server for which you can determine the round
    trip time (your system to the UHF radio station, and back to your
    system). If you want to determine the accuracy of your time, get a GPS
    time receiver. GPS knows both where you are and where the satellite is,
    and thus can accurately determine the one-way distance and the one-way
    time lag (well, modulo the atmospheric lag fluctuation due to the
    ionosphere, and the water vapour changes in the air). Also, while you are
    at it, you can use the GPS time on one of your servers and get an
    accuracy of microseconds, not milliseconds. One GPS clock can be worth a
    million internet servers.


    Better is to use ntpviz which will graph things and give in depth
    reports.



  • From William Unruh@21:1/5 to rcheaito via questions Mailing List on Sat Feb 22 03:40:03 2025
    On 2025-02-21, rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I can say we are focusing more on the count of '+' and '-' signs. We may obviously be so paranoid by this, however the concern raised is that with 2 servers out of 5 showing as discarded ('-' sign) almost all the time, our clients will have questions raised when they know that those discarded servers
    are working fine with other boxes (with '*' or '+' sign).

    I did not run any statistical tests with those servers.

    From the different answers I got, increasing the minclock helps replace the
    '-' signs with the '+' signs against those previously discarded servers, but it may make our client more fragile. So, what I take from all those discussions that we better keep the minclock as 3 and then just document this behavior as expected!

    You do not listen to anyone.
    a) .LOCL. is the local clock. Obviously it is always in perfect
    agreement with the local clock. It is useless as a server,

    b) 3 of your sources have the same server. That they agree with each
    other is no surprise. They are the ones that the system uses as its
    preferred sources. Bad idea. You might as well have just one server
    instead of those three.

    c) 3 is not a magic number. The other sources are rejected because they
    disagree with those 3 sources (which are effectively only one source,
    because they all use the same upstream server).
    Get rid of two of them and replace them with two other servers which do
    NOT have the same server feeding them.
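    A hedged sketch of that fix as an ntp.conf fragment (the example.net
    hostnames are placeholders, not real servers; the point is that each
    entry should have an independent upstream):

```
# Illustrative: five servers fed by *different* upstream sources, so
# that no three of them collapse into a single effective source.
server 192.168.1.140 iburst       # upstream: 149.56.19.163
server 192.168.1.146 iburst       # upstream: 10.176.6.101
server ntp-a.example.net iburst   # placeholder: independent upstream
server ntp-b.example.net iburst   # placeholder: independent upstream
server ntp-c.example.net iburst   # placeholder: independent upstream
```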





  • From Jim Pennino@21:1/5 to William Unruh on Sat Feb 22 07:26:57 2025
    William Unruh <unruh@invalid.ca> wrote:
    On 2025-02-22, Jim Pennino <jimp@gonzo.specsol.net> wrote:
    rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
    I can say we are focusing more on the count of '+' and '-' signs. We may admittedly be overly paranoid about this, but the concern raised is that with 2 servers out of 5 showing as discarded ('-' sign) almost all the time, our clients will raise questions when they know that those discarded servers
    are working fine with other boxes (with '*' or '+' sign).

    It sounds to me like neither you nor your clients have a clue how ntp
    works and what the + and - signs actually mean to the overall operation
    of ntp.
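    For the record, the tally characters in the first column of ntpq -p
    are documented states, and '-' is a normal one; a small sketch mapping
    them to their documented meanings:

```python
# The ntpq -p tally codes (first column of the billboard), per the
# ntpq documentation: '-' marks a statistical outlier, not a dead server.
TALLY = {
    " ": "reject: peer discarded as not valid",
    "x": "falseticker: discarded by the intersection algorithm",
    ".": "excess: culled from the end of the candidate list",
    "-": "outlier: discarded by the cluster algorithm",
    "+": "candidate: survivor, contributes to the combined offset",
    "#": "backup: survivor, but more sources than tos maxclock",
    "*": "sys.peer: the peer the system clock is synchronized to",
    "o": "pps.peer: synchronized to a PPS signal from this peer",
}

def describe(tally: str) -> str:
    """Return the documented meaning of a tally character."""
    return TALLY.get(tally, "unknown tally code")

print(describe("-"))
```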


    I did not run any statistical tests with those servers.

    From the different answers I got, increasing the minclock helps replace the
    '-' signs with '+' signs against those previously discarded servers, but
    it may make our client more fragile. So, what I take from all those
    discussions is that we had better keep minclock at 3 and just document this
    behavior as expected!


    Actually, the behavior is already well described and documented, which
    is why I question your obsession over it.

    Counting + and - signs tells you absolutely nothing about how accurate
    the system clock is.

    If you actually want to know how accurate the clock is, at least use
    something like ntpstat which will give information like:

    synchronised to UHF radio at stratum 1
    time correct to within 2 ms
    polling server every 16 s

    UHF is a terrible server. Its precision is 2ms. Its accuracy is way
    worse than that.

    It is actually GNSS.

    *127.127.28.0   .SHM.   0 l   15   16  377  0.000  -0.528  3.298
    -192.168.0.21   .PPS.   1 u   64   64  377  1.241  -1.550  2.182
    +192.168.0.100  .PPS.   1 u   32   64  377  1.285  -3.298  1.780
    +192.168.0.101  .PPS.   1 u   36   64  377  1.310  -4.271  2.197
    -192.168.0.185  .PPS.   1 u   48   64  377  1.236  +0.784  2.147

    The local clock time offset has a mean of -0.004 ms and a standard
    deviation of 1.120 ms.
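    Figures like that mean and standard deviation are straightforward to
    compute from logged offsets; a minimal sketch using Python's statistics
    module on sample values loosely based on the peer offsets quoted above
    (real figures would come from loopstats/peerstats logs):

```python
import statistics

# Hypothetical offset samples in ms; values are for illustration only.
offsets_ms = [-0.528, -1.550, -3.298, -4.271, 0.784]

mean = statistics.mean(offsets_ms)
stdev = statistics.stdev(offsets_ms)  # sample standard deviation
print(f"mean {mean:+.3f} ms, stdev {stdev:.3f} ms")
```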

    Why ntpstat calls it UHF radio I have no clue.

    It is not a server for which you can determine the round-trip time
    (your system to the UHF radio station, and back to your server).

    I think it is calling the .SHM. server UHF, which means for all
    practical purposes the rtt is basically zero as shown in the ntpq line.

    If you want to determine the accuracy of your time, get a GPS
    time receiver. GPS knows both where you are and where the satellite is,
    and thus can accurately determine the one-way distance and the one-way
    time lag (well, modulo the atmospheric lag fluctuation due to the
    ionosphere, and the water vapour changes in the air). Also, while you are
    at it, you can use the GPS time and one of your servers and get an
    accuracy of microseconds, not milliseconds. One GPS clock can be worth a million internet servers.

    Note that all five servers are GNSS and are on a local 5G WiFi network.

    The .100 and .101 servers are GNSS ntp appliance boxes that are accurate
    to a bit less than 100 us.

    The .21 server is a Raspberry Pi 4 with a GNSS card and is accurate to
    about 60 us.

    The .185 server is a serial-attached GNSS box with a high-accuracy,
    temperature-controlled, GNSS-steered oscillator with a PPS accuracy of
    +/- 1 nanosecond. The server is typically accurate to less than 5 us.

  • From Miroslav Lichvar@21:1/5 to Jim Pennino on Mon Feb 24 08:56:08 2025
    On 2025-02-22, Jim Pennino <jimp@gonzo.specsol.net> wrote:
    Why ntpstat calls it UHF radio I have no clue.

    That's how ntpd describes SHM refclocks in the mode-6 protocol. See

    https://www.rfc-editor.org/rfc/rfc9327.html#table-3

    By the ITU definition, GPS is in the UHF band, so that actually seems to
    be correct, even though SHM is used with other time sources, e.g.
    longwave radio signals.

    --
    Miroslav Lichvar
