• Re: Another little puzzle

    From Dmitry A. Kazakov@21:1/5 to Stefan Ram on Wed Dec 14 14:06:11 2022
    On 2022-12-14 13:24, Stefan Ram wrote:
    Given n times of the 24-hour day, print their average.

    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".

    You probably forgot to require the interesting part: doing all that in
    the modular type (modulo 24) arithmetic:

    20 + 5 = 1 (mod 24)

    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    That is not a matter of representation. Averaging hours, minutes, and seconds are three different problems, though the algorithm would be the same.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Heathfield@21:1/5 to Stefan Ram on Wed Dec 14 12:30:06 2022
    On 14/12/2022 12:24 pm, Stefan Ram wrote:
    Given n times of the 24-hour day, print their average.

    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".

    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Subject line: "Another little puzzle"

    Your specification seems too simple to qualify as a puzzle, so
    what am I missing?

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Stefan Ram@21:1/5 to All on Wed Dec 14 12:24:32 2022
    Given n times of the 24-hour day, print their average.

    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".

    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

  • From Stefan Ram@21:1/5 to Richard Heathfield on Wed Dec 14 12:33:16 2022
    Richard Heathfield <rjh@cpax.org.uk> writes:
    On 14/12/2022 12:24 pm, Stefan Ram wrote:
    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)
    Subject line: "Another little puzzle"
    Your specification seems too simple to qualify as a puzzle, so
    what am I missing?

    Answering this is part of the puzzle!

  • From Richard Heathfield@21:1/5 to Dmitry A. Kazakov on Wed Dec 14 13:10:54 2022
    On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 13:24, Stefan Ram wrote:
       Given n times of the 24-hour day, print their average.

       For example, the average of "eight o'clock" and
       "ten o'clock" (n=2) would be "nine o'clock".

    You probably forgot to require the interesting part: doing all
    that in the modular type (modulo 24) arithmetic:

       20 + 5 = 1 (mod 24)

    ...which will give you the wrong answer. Chase that goose!

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Dmitry A. Kazakov@21:1/5 to Richard Heathfield on Wed Dec 14 14:35:19 2022
    On 2022-12-14 14:10, Richard Heathfield wrote:
    On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 13:24, Stefan Ram wrote:
       Given n times of the 24-hour day, print their average.

       For example, the average of "eight o'clock" and
       "ten o'clock" (n=2) would be "nine o'clock".

    You probably forgot to require the interesting part: doing all that in
    the modular type (modulo 24) arithmetic:

        20 + 5 = 1 (mod 24)

    ...which will give you the wrong answer. Chase that goose!

    Right, you must count the wrap-ups. E.g.

    for I in Times'Range loop
       New_Sum := Old_Sum + Times (I);
       if Old_Sum > New_Sum then -- Wrapped
          Count := Count + 1;
          if Count = n then -- Each n wrap-ups can be discarded
             Count := 0;
          end if;
       end if;
       Old_Sum := New_Sum;
    end loop;

    In the end you have to evaluate

    (Old_Sum + Count * 24) / n (mod 24)

    where Count < n is the wrap-ups count.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Richard Heathfield@21:1/5 to Dmitry A. Kazakov on Wed Dec 14 14:10:11 2022
    On 14/12/2022 1:35 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 14:10, Richard Heathfield wrote:
    On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 13:24, Stefan Ram wrote:
       Given n times of the 24-hour day, print their average.

       For example, the average of "eight o'clock" and
       "ten o'clock" (n=2) would be "nine o'clock".

    You probably forgot to require the interesting part: doing all
    that in the modular type (modulo 24) arithmetic:

        20 + 5 = 1 (mod 24)

    ...which will give you the wrong answer. Chase that goose!

    Right, you must count the wrap-ups.

    No, you don't. You're given n times of the 24-hour day, so all
    values are already in the hour range [0-24). Convert to seconds
    from midnight, add all values, divide by n to give a number t
    guaranteed (assuming no leap seconds) to be in the range
    [0-86400), and convert to the representation of your choice. (And
    even if there is a leap second, it doesn't matter more than half
    a sixpence.)

    e.g.

    int h, m, s;

    h = t / 3600;
    m = (t - h*3600) / 60;
    s = (t - h*3600 - m * 60);

    So why do you need mod?

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Dmitry A. Kazakov@21:1/5 to Richard Heathfield on Wed Dec 14 15:58:45 2022
    On 2022-12-14 15:10, Richard Heathfield wrote:
    On 14/12/2022 1:35 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 14:10, Richard Heathfield wrote:
    On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 13:24, Stefan Ram wrote:
       Given n times of the 24-hour day, print their average.

       For example, the average of "eight o'clock" and
       "ten o'clock" (n=2) would be "nine o'clock".

    You probably forgot to require the interesting part: doing all that
    in the modular type (modulo 24) arithmetic:

        20 + 5 = 1 (mod 24)

    ...which will give you the wrong answer. Chase that goose!

    Right, you must count the wrap-ups.

    [...]

    So why do you need mod?

    As I said, the challenge is only interesting in modulo arithmetic (24 or
    24*60 or 24*60*60). E.g. let

    type Hour is mod 24;

    Average a series of Hour without conversion to another integer type.

    (Otherwise it is trivial: convert to a wider type, average, convert back)

    P.S. This problem may actually arise in programming a microcontroller,
    when you have some large series and narrow 16-bit types. Related cases
    are computations with color channels and grayscale levels, when
    conversions to wider types are too expensive, etc.

    The general case is some f(X1,X2,..,Xn) such that max{Xi} >= f() >=
    min{Xi}, but the intermediate values are not within range, as is the
    case with averaging the Xi.

    BTW, averaging floats is a nasty problem too. A naive implementation
    quickly loses precision.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Richard Heathfield@21:1/5 to Dmitry A. Kazakov on Wed Dec 14 15:18:44 2022
    On 14/12/2022 2:58 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 15:10, Richard Heathfield wrote:
    On 14/12/2022 1:35 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 14:10, Richard Heathfield wrote:
    On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 13:24, Stefan Ram wrote:
       Given n times of the 24-hour day, print their average.

       For example, the average of "eight o'clock" and
       "ten o'clock" (n=2) would be "nine o'clock".

    You probably forgot to require the interesting part: doing
    all that in the modular type (modulo 24) arithmetic:

        20 + 5 = 1 (mod 24)

    ...which will give you the wrong answer. Chase that goose!

    Right, you must count the wrap-ups.

    [...]

    So why do you need mod?

    As I said, the challenge is only interesting in modulo arithmetic

    So the challenge is only interesting if you add something you
    don't need. Let's throw in some elephants. Does that make it more
    interesting?

    BTW, averaging floats is a nasty problem too. A naive
    implementation quickly loses precision.

    We're dealing with 'o'clock' and "HH:MM", and nowadays we have
    64-bit integer types and there are even 128-bit integers mooching
    around looking for a reason to exist. You'd have to average a
    hell of a lot of times even to /need/ floats, let alone lose
    significant precision.

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Dmitry A. Kazakov@21:1/5 to Richard Heathfield on Wed Dec 14 16:41:57 2022
    On 2022-12-14 16:18, Richard Heathfield wrote:

    BTW, averaging floats is a nasty problem too. A naive implementation
    quickly loses precision.

    We're dealing with 'o'clock' and "HH:MM", and nowadays we have 64-bit
    integer types and there are even 128-bit integers mooching around
    looking for a reason to exist. You'd have to average a hell of a lot of
    times even to /need/ floats, let alone lose significant precision.

    I never suggested float for averaging time stamps; I pointed out that
    averaging is not a simple problem. E.g. try this one:

    function Average (X : Float; N : Positive) return Float is
       Sum : Float := 0.0;
    begin
       for Index in 1..N loop
          Sum := Sum + X;
       end loop;
       return Sum / Float (N);
    end Average;

    The function does naive averaging. For simplicity it just sums up the
    same number X N times and divides by N.

    Average (1.0, 17_000_000) = 0.986895
    Average (1.0, 100_000_000) = 0.167772
    Average (1.0, 200_000_000) = 0.838861

    The issue is catastrophic precision loss of addition Sum + X when Sum >> X.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Dmitry A. Kazakov@21:1/5 to Dmitry A. Kazakov on Wed Dec 14 16:43:14 2022
    On 2022-12-14 16:41, Dmitry A. Kazakov wrote:
    On 2022-12-14 16:18, Richard Heathfield wrote:

    BTW, averaging floats is a nasty problem too. A naive implementation
    quickly loses precision.

    We're dealing with 'o'clock' and "HH:MM", and nowadays we have 64-bit
    integer types and there are even 128-bit integers mooching around
    looking for a reason to exist. You'd have to average a hell of a lot
    of times even to /need/ floats, let alone lose significant precision.

    I never suggested float for averaging time stamps; I pointed out that averaging is not a simple problem. E.g. try this one:

       function Average (X : Float; N : Positive) return Float is
          Sum : Float := 0.0;
       begin
          for Index in 1..N loop
             Sum := Sum + X;
          end loop;
          return Sum / Float (N);
       end Average;

    The function does naive averaging. For simplicity it just sums up the
    same number X N times and divides by N.

       Average (1.0, 17_000_000)  = 0.986895
       Average (1.0, 100_000_000) = 0.167772
       Average (1.0, 200_000_000) = 0.838861

    Correction: the last value should read 0.0838861.

    The error is obviously systematic.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Richard Heathfield@21:1/5 to Dmitry A. Kazakov on Wed Dec 14 16:13:17 2022
    On 14/12/2022 3:41 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 16:18, Richard Heathfield wrote:

    BTW, averaging floats is a nasty problem too. A naive
    implementation quickly loses precision.

    We're dealing with 'o'clock' and "HH:MM", and nowadays we have
    64-bit integer types and there are even 128-bit integers
    mooching around looking for a reason to exist. You'd have to
    average a hell of a lot of times even to /need/ floats, let
    alone lose significant precision.

    I never suggested float for averaging time stamps,

    Yes, you did.

    I pointed out
    that averaging is not a simple problem.

    Yes, it is.

    E.g. try this one:

       function Average (X : Float; N : Positive) return Float is
          Sum : Float := 0.0;
       begin
          for Index in 1..N loop
             Sum := Sum + X;
          end loop;
          return Sum / Float (N);
       end Average;

    The function does naive averaging. For simplicity it just sums up
    the same number X N times and divides by N.

    You're not being naïve /enough/. The average of N instances of X
    is X, so just return X.

    Yes, if you try hard enough, you can screw anything up. So what?

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Ben Bacarisse@21:1/5 to Stefan Ram on Thu Dec 15 02:04:05 2022
    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.

    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".

    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    I came across this problem in another context where the solution I ended
    up using had two components -- the average and the "strength" of that
    average.

    Is this a purely abstract problem, or did it crop up in some practical
    context? The two-valued solution (average/strength) would also work
    well for some applications.

    --
    Ben.

  • From Tim Rentsch@21:1/5 to Stefan Ram on Thu Dec 15 04:49:24 2022
    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Richard Heathfield <rjh@cpax.org.uk> writes:

    On 14/12/2022 12:24 pm, Stefan Ram wrote:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Subject line: "Another little puzzle"
    Your specification seems too simple to qualify as a puzzle, so
    what am I missing?

    Answering this is part of the puzzle!

    Without any further constraints this problem is
    absurdly easy.

    If you let the reader choose the constraints, there
    appears to be no reason to choose any constraints
    that make the problem harder. On the contrary,
    imposing additional constraints seems artificial.

    So please let us know what additional constraints
    you have in mind. I daresay mind reading is not in
    the skill set of anyone reading these postings.

  • From Stefan Ram@21:1/5 to Stefan Ram on Wed Dec 21 12:03:27 2022
    ram@zedat.fu-berlin.de (Stefan Ram) writes:
    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 ) == 0.5
    average( 11.5, 13.5 ) == 12.5

    (I use hours as units, so "0.5" means "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases have just rotated that example by 3.5 and
    15.5 hours.

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

  • From Richard Heathfield@21:1/5 to Stefan Ram on Wed Dec 21 13:09:29 2022
    On 21/12/2022 12:03 pm, Stefan Ram wrote:
    ram@zedat.fu-berlin.de (Stefan Ram) writes:
    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 ) == 0.5

    The specification says "the 24-hour day" --- one day, not two
    days. So the average of 23.5 and 1.5 is 12.5. 0.5 is a mistake.

    average( 11.5, 13.5 ) == 12.5

    Correct.


    (I use hours as units, so "0.5" means "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.

    Your hope is misplaced, because one of your test cases bears an
    incorrect expected result.


    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    You misunderstood your specification.

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Dmitry A. Kazakov@21:1/5 to Stefan Ram on Wed Dec 21 15:15:58 2022
    On 2022-12-21 13:03, Stefan Ram wrote:

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 ) == 0.5

    Nope, it is 12.5! (:-))

    Well, assuming that time samples can cross the day boundary, and
    assuming that the averaged time is defined as averaging durations from
    some epoch and adding the epoch back:

    Avr({Ti}) =def= Avr({Ti}-E) + E

    (you cannot average time as it is).

    Then in the case of n = 2 you have two competing answers:

    (23.5 + j * 24 + 1.5 + k * 24) / 2 (mod 24)

    where j and k are any natural numbers (specifying the day). This
    resolves to:

    12.5 + (j + k) * 12 (mod 24)

    which gives either 12.5 or 0.5 depending on whether j + k is odd or even.

    Now the fun part is that when n contains factors other than 2 or 3
    (24=2*2*2*3) then there might be up to 23 different answers!

    Your puzzle is ill specified.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Ben Bacarisse@21:1/5 to Richard Heathfield on Wed Dec 21 17:05:03 2022
    Richard Heathfield <rjh@cpax.org.uk> writes:

    On 21/12/2022 12:03 pm, Stefan Ram wrote:
    ram@zedat.fu-berlin.de (Stefan Ram) writes:
    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)
    Thanks for all replies!
    I waited a few days before answering to allow
    sufficient time to think about the problem.
    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).
    So, here are two possible test cases.
    average( 23.5, 1.5 ) == 0.5

    The specification says "the 24-hour day" --- one day, not two days. So
    the average of 23.5 and 1.5 is 12.5. 0.5 is a mistake.

    I don't think it's reasonable to call it a mistake. There is a
    perfectly plausible interpretation of "average" that gives 0.5. My
    (un-posted) code gives 0.5.

    I think the one day/two days distinction might be a distraction. Can
    you say more about how that distinction changes the answer? (Mind you,
    I can't see how SR's words imply one day.)

    average( 11.5, 13.5 ) == 12.5

    Correct.

    (I use hours as units, so "0.5" means "half past midnight".)
    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.

    Your hope is misplaced, because one of your test cases bears an
    incorrect expected result.

    Hmm... My interpretation of average gives 0.5. Mind you, I just took
    some existing related code and changed degrees into hours. That mapping
    seemed to me to match up with what SR was driving at.

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    You misunderstood your specification.

    That's very severe!

    --
    Ben.

  • From Dmitry A. Kazakov@21:1/5 to Ben Bacarisse on Wed Dec 21 18:55:47 2022
    On 2022-12-21 18:05, Ben Bacarisse wrote:

    There is a
    perfectly plausible interpretation of "average" that gives 0.5.

    Not very plausible, as it violates two properties of an average:

    min({Xi}) <= avr({Xi}) <= max({Xi})

    and

    avr({C * Xi}) = C * avr({Xi})

    The second one allows calculation of the average as:

    23.5 / 2 + 1.5 / 2 = ?

    P.S. Of course one could reformulate the puzzle in a non-numeric way
    that would give Stefan's answer. For example, this:

    Let us consider vectors of the same length, with polar angles measured
    on [0, 24), e.g. the clock hands. An average of these vectors will be a
    clock hand again. Then, indeed, (v(23.5) + v(1.5)) / 2 = v(0.5).

    P.P.S. The bounds property does not apply to vector averages. The
    scaling property does, because multiplication by a scalar works
    differently on vectors; it does not rotate the hands.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Richard Heathfield@21:1/5 to Ben Bacarisse on Wed Dec 21 17:21:25 2022
    On 21/12/2022 5:05 pm, Ben Bacarisse wrote:
    Richard Heathfield <rjh@cpax.org.uk> writes:

    On 21/12/2022 12:03 pm, Stefan Ram wrote:

    <snip>

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    You misunderstood your specification.

    That's very severe!

    Then I apologise to Stefan for my severity, but I stand by my
    interpretation as being a valid reading of his problem.

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Mike Terry@21:1/5 to Dmitry A. Kazakov on Wed Dec 21 19:55:56 2022
    On 21/12/2022 17:55, Dmitry A. Kazakov wrote:
    On 2022-12-21 18:05, Ben Bacarisse wrote:

    There is a
    perfectly plausible interpretation of "average" that gives 0.5.

    Not very plausible as it violates two properties of:

    min({Xi}) <= avr({Xi}) <= max({Xi})

    and

    avr({C * Xi}) = C * avr({Xi})

    The second one allows calculation of the average as:

    23.5 / 2 + 1.5 / 2 = ?


    Dmitry: while I'm replying to /your/ post, my points are not especially addressed to you - it's more
    of an accumulation of "frustration" at what I consider to be "unimaginative" misinterpretations and
    dogmatic claims about what was originally requested... (I pretty much just chose a recent post and
    responded due to some internal threshold being reached! :) So don't be personally offended!)

    A lot of people here have treated the problem as saying "find the numeric average of the number of
    seconds past midnight for a given set of times". Well, clearly (to me) that is NOT what the OP
    intended, and I would say it's by no means the required meaning of the words he actually used.

    It was pretty clear to me that the intention was (somehow) to look at the problem of averaging
    values in some kind of modular setting, e.g. times given with a 24 hour clock. Such times wrap at
    midnight and common sense would say that we are to consider times a couple of minutes apart as
    "close" in this modular setting. I.e. 2 minutes to midnight and 2 minutes past midnight should be
    considered separated by 4 minutes, not 23 hours 56 minutes.

    The OP was probably deliberately rather vague on this point! The easiest definition for "average"
    literally doesn't make sense in a modulo context: we can ADD the modular values, but in general
    dividing in such a mathematical setting doesn't make much sense, so

    Average ([i=1,2,..n] x_i) =[def]= Sum ([i=1,2,..n] x_i) /n

    would be inappropriate due to the /n operation.

    HOWEVER, there's another characterisation for the average of a set, which looks more promising in a
    modular (or other more general) setting: the average is the value of V which MINIMISES THE "ERROR"
    calculated by

    error = Sum ([i=1,2,..n] (V - x_i)^2)

    that is, minimises the sum of squares of differences from V. [Incidentally, if we minimise the sum
    of absolute differences from V, that characterises the mode (aka "midpoint") of the sample, but I
    think the median is more what is wanted here...]

    So I'll suggest a precise wording of what was (I think) wanted: find the time V that minimises the
    error function above, based on "distances" in a modular setting. We also need to make clear what
    "distance" means in that setting, but put like that I think there would be less disagreement that
    the sensible way to define the distance between two times would be in the expected "wrapping"
    manner. E.g. we could say the distance is the LEAST positive D such that V +- D = x_i, with the
    arithmetic + or - being modular arithmetic, i.e. mod 24 hours. Obviously this makes the distance
    between 23:58 and 00:02 4 minutes, like we want...

    Had the OP suggested that the times in question were EVENTS, then obviously averaging the time for
    them is a concretely defined operation, but to interpret the OP like that, we would then need to
    turn "times" into "event occurrences (date + time)" - and there was no mention of dates anywhere - to me
    it's clear the times are just intended as modular 24-hour times, not applying to specific events.

    And for those that go further and suggest that what was said was that the times were all events
    occurring on one and the same day, and that is what the OP asked to be averaged - well! (I don't
    think that is reasonable at all, to put it mildly.)

    Now, as to how to CALCULATE the V above???

    [I haven't got a definitive approach - obviously we could calculate the normal arithmetic "number of
    seconds past midnight" average, then look at adjusting it as we flip various x_i by 24hours, but
    there could be a lot of x_i to flip (problem grows exponentially). Or maybe we sample V at a fixed
    number of places like on each hour, and (with luck) there may be an obvious minimum - at that point
    we could make a good guess for exactly which x_i should be "flipped" by 24hours, and then calculate
    using normal arithmetic averaging??? Well that's my first idea.]

    Regards,
    Mike.

  • From Tim Rentsch@21:1/5 to Stefan Ram on Wed Dec 21 13:17:55 2022
    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 ) == 0.5
    average( 11.5, 13.5 ) == 12.5

    (I use hours as units, so "0.5" means "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    As before, the problem is underspecified.

    What is average( 0700, 1900 ), where the times indicate 24-hour
    military times?

    What is average( 1900, 0700 )?

    What is average( 0100, 0700, 1300, 1900 )?
    What is average( 0700, 1300, 1900, 0100 )?
    What is average( 1300, 1900, 0100, 0700 )?
    What is average( 1900, 0100, 0700, 1300 )?

    and similarly for all other permutations of
    the four arguments?

    Stop giving just examples; instead give a statement
    of the problem that gives correct answers for all
    inputs. Giving just examples is a waste of everyone's
    time.

  • From Dmitry A. Kazakov@21:1/5 to Mike Terry on Wed Dec 21 22:54:43 2022
    On 2022-12-21 20:55, Mike Terry wrote:

    The OP was probably deliberately rather vague on this point!  The
    easiest definition for "average" literally doesn't make sense in a
    modulo context:  we can ADD the modular values, but in general dividing
    in such a mathematical setting doesn't make much sense, so

        Average ([i=1,2,..n] x_i)  =[def]=  Sum ([i=1,2,..n] x_i) /n

    would be inappropriate due to the /n operation.

    Unless you calculate everything as reals or integers and then take the remainder modulo 24, which is kind of the most natural definition, at least to me.

    HOWEVER, there's another characterisation for the average of a set,
    which looks more promising in a modular (or other more general)
    setting:  the average is the value of V which MINIMISES THE "ERROR" calculated by

        error = Sum ([i=1,2,..n] (V - x_i)^2)

    that is, minimises the sum of squares of differences from V.

    This has exactly the same problem as above, because subtraction and
    multiplication (raising to the power of two) have different semantics
    in modular arithmetic.

    [Incidentally, if we minimise the sum of absolute differences from V,
    that characterises the mode (aka "midpoint") of the sample, but I think
    the median is more what is wanted here...]

    Actually it is the same beast. The median minimises the C-norm, defined
    as |x-y|.

    The mean minimises the Euclidean norm (squares), the thing you proposed.

    Mean and average are the same for real numbers.

    You are right that median (and C-norm) can be unambiguously defined in
    modular numbers. But nobody would ever qualify this as a puzzle! (:-))

    [...]

    Now, as to how to CALCULATE the V above???

    As I said, in real numbers V is the mean, i.e.

    V = (1/n) * Sum_{i=1..n} Xi =def= mean({Xi | i=1..n})

    Any numeric method is in the end mapping the Xi to ordinary numbers,
    calculating a classic mean, and mapping back. The problem is that the
    forward mapping is ambiguous. One method to disambiguate is staying
    within the same day. Another could be taking a median and considering
    the 12 hours before and 12 hours after it. And there are many more.

    P.S. There are differences between the average and the mean. The OP
    referred to the average, which may mean something completely
    counterintuitive (no pun intended (:-))

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julio Di Egidio@21:1/5 to Mike Terry on Wed Dec 21 14:47:49 2022
    On Wednesday, 21 December 2022 at 20:56:01 UTC+1, Mike Terry wrote:

    an accumulation of "frustration" at what I consider to be
    "unimaginative" misinterpretations and
    dogmatic claims of what was originally requested...

    Besides that the problem was and remains at best poorly
    phrased (and I agree that guesswork around here should
    rather be banned), there is a much simpler and canonical
    way to define "midpoint" in a modular space, namely the
    midpoint along the shortest of the two possible paths.
    And that's actually a quite common kind of calculation...
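
    For n = 2 that midpoint-along-the-shortest-arc calculation is short; a
    sketch in Python (the names and the tie-handling for diametrically
    opposite times are my own assumptions):

```python
def midpoint(a, b, modulus=24.0):
    # Signed distance from a to b, reduced to the shorter way round.
    d = (b - a) % modulus
    if d > modulus / 2:
        d -= modulus
    # Note: when d == modulus/2 the two paths tie; this picks a + 12.
    return (a + d / 2) % modulus

print(midpoint(23.5, 1.5))  # 0.5
print(midpoint(8, 10))      # 9.0
```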

    Julio

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julio Di Egidio@21:1/5 to Mike Terry on Wed Dec 21 16:05:00 2022
    On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
    On 21/12/2022 21:54, Dmitry A. Kazakov wrote:

    kind of most natural definition, at least to me.

    That is unambiguous, but to my mind not at all what is wanted

    You just dont get it, do you? This is (comp.)programming
    where guesswork is a *capital sin*.

    Bunch of pretenders then fraudulent spammers...

    Julio

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mike Terry@21:1/5 to Dmitry A. Kazakov on Thu Dec 22 00:01:03 2022
    On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
    On 2022-12-21 20:55, Mike Terry wrote:

    The OP was probably deliberately rather vague on this point! The easiest definition for "average"
    literally doesn't make sense in a modulo context: we can ADD the modular values, but in general
    dividing in such a mathematical setting doesn't make much sense, so

    Average ([i=1,2,..n] x_i) =[def]= Sum ([i=1,2,..n] x_i) /n

    would be inappropriate due to the /n operation.

    Unless you calculate everything as reals or integers and then take the remainder of 24. Which is
    kind of most natural definition, at least to me.

    That is unambiguous, but to my mind not at all what is wanted - an artificial cut-point has been
    introduced to the continuous mod-24 "loop", converting all the points into non-modular lengths (say
    real numbers), and then those are averaged. That produces "silly" conclusions as people have
    listed. I get that people want to read the original question in a literal fashion, because at least
    that would make it clear - but to me that is very much not what was intended or actually said.

    (Well, I agree very little was actually /said/ but near the end of the post I give a couple of [what
    I consider] minimal requirements for what a modulo average should achieve, and I bet the OP would
    agree with those!)


    HOWEVER, there's another characterisation for the average of a set, which looks more promising in
    a modular (or other more general) setting: the average is the value of V which MINIMISES THE
    "ERROR" calculated by

    error = Sum ([i=1,2,..n] (V - x_i)^2)

    that is, minimises the sum of squares of differences from V.

    This has exactly same problem as above, because subtraction and multiplication (power of two) have
    different semantics in modular arithmetic.

    Hmm, I don't agree - we've moved from a "calculate this formula for averaging numbers in R"
    perspective, which DOESN'T WORK IN THE MODULO SYSTEM [due to the division] to a "choose V which
    minimises such-and-such error function" which DOES work, provided we have a way of making the error
    function clear.

    Note that in the "error =" formula above, the "(V - x_i)^2" terms are the squares of the distance
    between V and X_i. I agree V-x_i would be ambiguous in the modulo system, but that formula is the
    formula for REAL number averages (the "mean" as you point out below). In the modulo system we need
    a replacement for "distance" that is well defined and appropriate.

    So provided we can propose a sensible metric on the modular system, the error function approach
    makes perfect sense. That's not at all like the "convert modulo number to real number and then take
    arithmetic mean, and then reduce that modulo whatever" definition, which will give different results
    depending on where you make the required "cut" in the modulo loop.

    So to be clearer perhaps I should have written:

    error = Sum ([i=1,2,..n] Distance(V,x_i)^2)

    ALSO NOTE: this error calculation ISN'T A MODULO CALCULATION! It's a calculation in the real
    numbers R. So Distance is a real number, and squaring is being done in the real numbers, and the
    error function is a function mapping to real numbers that is being minimised as a value in R (by a
    choice of V in the modulo system). [That all relates to your remark about different semantics in
    modular arithmetic.]


    [Incidentally, if we minimise the sum of absolute differeces from V, that characterises the mode
    (aka "midpoint") of the sample, but I think the median is more what is wanted here...]

    Actually it is the same beast. Median is the least C-norm defined as |x-y|.

    Mean is the least Euclidean norm (squares), the thing you proposed.

    Mean and average are same for real numbers.

    Yeah, I agree - Thanks, I'd totally muddled the definitions for those terms!! Well, I'll say it's
    been a while :) I should have said "mean". A quick check shows "mode" is actually "the most common
    value" which is not of much interest here.


    You are right that median (and C-norm) can be unambiguously defined in modular numbers. But nobody
    would ever qualify this as a puzzle! (:-))

    [...]

    Now, as to how to CALCULATE the V above???

    As I said, in real numbers V is the mean, i.e.

    V = (1/n) * Sum_{i=1..n} Xi =def= mean({Xi | i=1..n})

    But that doesn't work sensibly for a modulo system where arithmetic wraps. That's the whole point
    of this thread, it seems to me.

    Hmmm, ok clearly for a modulo system, if we translate all the x_i by a fixed amount, THE "AVERAGE"
    SHOULD ALSO TRANSLATE BY THE SAME FIXED AMOUNT. Otherwise it is simply not a sensible definition of
    average for that modulo system. Your definition does not have the translation property while mine
    does - that's why I propose it for this problem.

    I'd agree that there's plenty of room for discussing what is the right "distance" function or
    precisely how that feeds into the "error" function (squares vs. abs values or whatever). But also I
    think my suggestion for those is good enough...


    Any numeric method is in the end mapping Xi to some normal number, calculating a classic mean,
    mapping back. The problem is with the forward mapping being ambiguous. One method to disambiguate is
    staying within the same day. Another could be taking a median and considering 12 hours before and 12
    hours after it. And there are many more.

    Maybe so - I'd just say here that you are discussing an /algorithm/ to compute a result, rather than
    the definition of what is to be considered the correct answer, which was the focus in my post. I
    agree any algorithm could well incorporate calculating the mean of certain real numbers, but
    /correctness/ is judged by whether the algorithm produces the V with minimum error function.
    [Otherwise people just make an arbitrary coding decision, and say by definition their code is
    correct because it correctly codes what they want it to :) ]

    Going a bit further than my last post, I'd say that the average as I've defined it would probably
    turn out to be the real-number average, if only we could make the cut in the modulo loop 12 hours
    away from the point V. Of course the point V is the value to be determined, which isn't known up
    front, so /defining/ V like this would be circular! It does seem that the effect of moving V in all
    the calculations would be more easily evaluated, if the times x_i were SORTED, so we could quickly
    find how many x_i are in different intervals. Just thinking...


    P.S. There are differences between the average and mean. OP referred the average, which may mean
    something completely counterintuitive (no pun intended (:-))


    I think the OP didn't think too deeply about the mathematics - but I think any solution without
    these two properties would be an automatic fail for OP:

    - if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING
    AROUND THE END OF THE MODULO LOOP] then the "average" will be within
    those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.

    - if all x_i are translated by a fixed offset like 1 hour, the average
    will translate by the same offset, i.e. it maintains the same "modulo"
    relation with the translated x_i.
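
    Both requirements can be checked against a brute-force version of the
    minimum-error definition; a Python sketch under my own choices of grid
    resolution and of squared shortest-arc distance as the error term:

```python
def circ_dist(a, b, m=24.0):
    # Length of the shorter arc between two times on an m-hour clock.
    d = abs(a - b) % m
    return min(d, m - d)

def circ_average(times, m=24.0, steps=2400):
    # Grid search for the V minimising the sum of squared distances.
    candidates = (i * m / steps for i in range(steps))
    return min(candidates,
               key=lambda v: sum(circ_dist(v, t, m) ** 2 for t in times))

# Translating the inputs by one hour translates the result by one hour:
print(circ_average([23.5, 1.5]))  # 0.5
print(circ_average([0.5, 2.5]))   # 1.5
```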

    Regards,
    Mike.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Thu Dec 22 04:10:50 2022
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 )== 0.5
    average( 11.5, 13.5 )== 12.5

    (I use hours as units, so "0.5" means, "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    As before, the problem is underspecified.

    Some remarks not specifically in reply to you, Tim...

    The input is a collection, t(n), of n > 1 numbers in [0, 24). The
    average should be a number, A, in [0, 24) that minimises

    Sum_{i=1,n} distance(A, t(i))

    (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms of variance). So far, this is just what an average is. The key point is
    what is the distance (or difference) whose sum (or sum of squares) we
    want to minimise? For times, I would say it is the length of the
    shorter arc round an imaginary 24-hour clock face.

    The problem has a natural interpretation in terms of angles. Whatever
    the circular quantity is, convert the values to unit vectors round a
    circle. For times of day, just scale [0, 24) to [0, 2*pi). The average
    is then just the direction of the average vector, converted back to a
    time of day.

    Sometimes that vector has zero length, and the average is undefined, but otherwise the length of the vector gives an indication of the
    variability of the data.

    Why do I consider this a reasonable interpretation of the problem?
    Well, given a list of times of day when a train is observed to pass some station, the circular 24-hour-time average should be our best estimate
    of the scheduled time.

    Obviously there are other possible readings of the problem, but I was
    not able to justify any of them as useful for any real-world
    applications. This is a case where I hope I am wrong and there /are/
    other circular averages with practical interpretations.
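
    A sketch of that interpretation which also returns the length of the
    mean vector as the dispersion indicator (in Python; the names are mine,
    and r == 0 is the undefined case mentioned above):

```python
import math

def circular_stats(times, m=24.0):
    # Scale [0, m) to [0, 2*pi), average the unit vectors, and return
    # (mean time, mean resultant length r).  r near 1 means the times
    # cluster tightly; r == 0 means the mean direction is undefined.
    angles = [t / m * 2 * math.pi for t in times]
    mx = sum(map(math.cos, angles)) / len(angles)
    my = sum(map(math.sin, angles)) / len(angles)
    r = math.hypot(mx, my)
    mean = math.atan2(my, mx) / (2 * math.pi) * m % m
    return mean, r

# Train observed at 07:54, 08:00, 08:06 -> estimated schedule ~08:00
mean, r = circular_stats([7.9, 8.0, 8.1])
```

    For [7.9, 8.0, 8.1] the mean comes out at 8.0 with r close to 1; for
    [0.0, 12.0] the vectors cancel and r is (numerically) zero.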

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Mike Terry on Thu Dec 22 09:10:41 2022
    On 2022-12-22 01:01, Mike Terry wrote:

    Maybe so - I'd just say here that you are discussing an /algorithm/ to compute a result, rather than the definition of what is to be considered
    the correct answer, which was the focus in my post.

    That was my first impression too. This is why my first response was that
    the idea of the puzzle was to perform calculations in modular arithmetic
    without conversions to real/integer/fixed-point. Which is possible to
    do; one need only count wrap-ups after each addition and, in the end,
    take them into account in the final division and remainder computations.
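
    A sketch of that wrap-counting approach (this is the literal "stay
    within the same day" reading, so for the thread's wrap-around test case
    average( 23.5, 1.5 ) it yields 12.5, not 0.5; the names are mine):

```python
def modular_average(times, m=24.0):
    # Accumulate modulo m, counting wrap-ups; reconstruct the ordinary
    # sum at the end and divide there.
    acc, wraps = 0.0, 0
    for t in times:
        acc += t
        if acc >= m:
            acc -= m
            wraps += 1
    return (wraps * m + acc) / len(times) % m

print(modular_average([8, 10]))      # 9.0
print(modular_average([23.5, 1.5]))  # 12.5
```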

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Julio Di Egidio on Thu Dec 22 16:50:45 2022
    On 22/12/2022 01:05, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
    On 21/12/2022 21:54, Dmitry A. Kazakov wrote:

    kind of most natural definition, at least to me.

    That is unambiguous, but to my mind not at all what is wanted

    You just dont get it, do you? This is (comp.)programming
    where guesswork is a *capital sin*.


    Programming is all about trying to guess what the customer actually
    wanted, rather than what they asked for! :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julio Di Egidio@21:1/5 to David Brown on Thu Dec 22 08:30:03 2022
    On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
    On 22/12/2022 01:05, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
    On 21/12/2022 21:54, Dmitry A. Kazakov wrote:

    kind of most natural definition, at least to me.

    That is unambiguous, but to my mind not at all what is wanted

    You just dont get it, do you? This is (comp.)programming
    where guesswork is a *capital sin*.

    Programming is all about trying to guess what the customer actually
    wanted, rather than what they asked for! :-)

    Utter nonsense of the usual kind and an entire industry down the drain.

    No, we are rather supposed to *elicit* requirements... which is just the
    tip of the (software) analysis iceberg.

    Julio

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Julio Di Egidio on Thu Dec 22 22:06:57 2022
    On 22/12/2022 17:30, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
    On 22/12/2022 01:05, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
    On 21/12/2022 21:54, Dmitry A. Kazakov wrote:

    kind of most natural definition, at least to me.

    That is unambiguous, but to my mind not at all what is wanted

    You just dont get it, do you? This is (comp.)programming
    where guesswork is a *capital sin*.

    Programming is all about trying to guess what the customer actually
    wanted, rather than what they asked for! :-)

    Utter nonsense of the usual kind and an entire industry down the drain.

    No, we are rather supposed to *elicit* requirements... which is just the
    tip of the (software) analysis iceberg.


    Perhaps I should have put the smiley in a bigger font?

    Of course the correct way to do things is to work with the customer to establish the requirements and specifications that everyone is happy
    with, and then code to those specifications.

    And if it turns out that the specifications were unclear, you clarify
    them, rather than guessing.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Heathfield@21:1/5 to David Brown on Thu Dec 22 21:30:59 2022
    On 22/12/2022 9:06 pm, David Brown wrote:
    On 22/12/2022 17:30, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown
    wrote:
    On 22/12/2022 01:05, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry
    wrote:
    On 21/12/2022 21:54, Dmitry A. Kazakov wrote:

    kind of most natural definition, at least to me.

    That is unambiguous, but to my mind not at all what is wanted

    You just dont get it, do you? This is (comp.)programming
    where guesswork is a *capital sin*.

    Programming is all about trying to guess what the customer
    actually
    wanted, rather than what they asked for! :-)

    Utter nonsense of the usual kind and an entire industry down
    the drain.

    No, we are rather supposed to *elicit* requirements... which is
    just the
    tip of the (software) analysis iceberg.


    Perhaps I should have put the smiley in a bigger font?

    David, obviously it's your call, but do you really think you're
    going to have a meaningful conversation with someone who on
    05/01/2022 10:57 replied to you[1]:
    "It is you who are an incompetent fucking clown just repeating
    some bullshit you read in the blogs."

    And you're far from being the only one he abuses in this wise.

    God invented killfiles for a reason.

    [1] If you want to look it up, see just over halfway down this
    thread: https://groups.google.com/g/comp.programming/c/gKQ3q0D5IT0/m/7JoZ5e7ICAAJ

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julio Di Egidio@21:1/5 to David Brown on Thu Dec 22 14:15:29 2022
    On Thursday, 22 December 2022 at 22:07:01 UTC+1, David Brown wrote:
    On 22/12/2022 17:30, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
    On 22/12/2022 01:05, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
    On 21/12/2022 21:54, Dmitry A. Kazakov wrote:

    kind of most natural definition, at least to me.

    That is unambiguous, but to my mind not at all what is wanted

    You just dont get it, do you? This is (comp.)programming
    where guesswork is a *capital sin*.

    Programming is all about trying to guess what the customer actually
    wanted, rather than what they asked for! :-)

    Utter nonsense of the usual kind and an entire industry down the drain.

    No, we are rather supposed to *elicit* requirements... which is just the tip of the (software) analysis iceberg.

    Perhaps I should have put the smiley in a bigger font?

    You can laugh your frauds and bad conscience as much as you like,
    I am not kidding at all.

    Of course the correct way to do things is to work with the customer to establish the requirements and specifications that everyone is happy
    with, and then code to those specifications.

    And if it turns out that the specifications were unclear, you clarify
    them, rather than guessing.

    No, you don't, you go back to the customer and work together until you
    do reach again an agreement (indeed analysis, for another big secret,
    can and should iterate with everything else)... which is still barely 101.

    But you spammers (not to even mention meanwhile the plain vile cunts),
    "of course" will never concede, and this remains mainly to the benefit of
    the innocent reader, learning something sensible for a change.. which
    might even be your employers, for those of you who don't just suck blood.

    Fucking pathetic...

    Julio

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Richard Heathfield on Fri Dec 23 13:46:31 2022
    On 22/12/2022 22:30, Richard Heathfield wrote:
    On 22/12/2022 9:06 pm, David Brown wrote:
    On 22/12/2022 17:30, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
    On 22/12/2022 01:05, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
    On 21/12/2022 21:54, Dmitry A. Kazakov wrote:

    kind of most natural definition, at least to me.

    That is unambiguous, but to my mind not at all what is wanted

    You just dont get it, do you? This is (comp.)programming
    where guesswork is a *capital sin*.

    Programming is all about trying to guess what the customer actually
    wanted, rather than what they asked for! :-)

    Utter nonsense of the usual kind and an entire industry down the drain.

    No, we are rather supposed to *elicit* requirements... which is just the tip of the (software) analysis iceberg.


    Perhaps I should have put the smiley in a bigger font?

    David, obviously it's your call, but do you really think you're going to
    have a meaningful conversation with someone who on 05/01/2022 10:57
    replied to you[1]:
    "It is you who are an incompetent fucking clown just repeating some
    bullshit you read in the blogs."

    And you're by far from being the only one he abuses in thus wise.

    God invented killfiles for a reason.

    [1] If you want to look it up, see just over halfway down this thread: https://groups.google.com/g/comp.programming/c/gKQ3q0D5IT0/m/7JoZ5e7ICAAJ


    I am blessed with absent-mindedness, which makes it easier to give
    people a second chance! But seeing his latest reply, it seems that was
    wasted effort.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Julio Di Egidio on Fri Dec 23 13:47:27 2022
    On 22/12/2022 23:15, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 22:07:01 UTC+1, David Brown wrote:

    Perhaps I should have put the smiley in a bigger font?

    You can laugh your frauds and bad conscience as much as you like,
    I am not kidding at all.


    Happy Christmas to you too :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julio Di Egidio@21:1/5 to David Brown on Fri Dec 23 08:33:43 2022
    On Friday, 23 December 2022 at 13:47:31 UTC+1, David Brown wrote:
    On 22/12/2022 23:15, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 22:07:01 UTC+1, David Brown wrote:

    Perhaps I should have put the smiley in a bigger font?

    You can laugh your frauds and bad conscience as much as you like,
    I am not kidding at all.

    Happy Christmas to you too :-)

    What will you have for lunch, kidneys of some little African kids?

    You pieces of vile, retarded, abusive, and always fraudulent nazi-shit:
    if anything I'll be spitting on you till the end of time...

    Julio

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julio Di Egidio@21:1/5 to Julio Di Egidio on Fri Dec 23 08:35:19 2022
    On Wednesday, 21 December 2022 at 23:47:52 UTC+1, Julio Di Egidio wrote:
    On Wednesday, 21 December 2022 at 20:56:01 UTC+1, Mike Terry wrote:

    an accumulation of "frustration" at what I consider to be
    "unimaginative" misinterpretations and
    dogmatic claims of what was originally requested...

    Besides that the problem was and remains at best poorly
    phrased (and I agree that guesswork around here should
    rather be banned), there is a much simpler and canonical
    way to define "midpoint" in a modular space, namely the
    midpoint along the shortest of the two possible paths.
    And that's actually a quite common kind of calculation...

    Up 1, you piece of retarded polluting fraudulent shit.

    Julio

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sat Dec 24 06:21:40 2022
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 )== 0.5
    average( 11.5, 13.5 )== 12.5

    (I use hours as units, so "0.5" means, "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    As before, the problem is underspecified.

    Some remarks not specifically in reply to you, Tim...

    I didn't really say very much. Your comments are a lot more
    interesting.

    The input is a collection, t(n), of n > 1 numbers in [0, 24). The
    average should be a number, A, in [0, 24) that minimises

    Sum_{i=1,n} distance(A, t(i))

    (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
    of variance). So far, this is just what an average is. The key point
    is what is the distance (or difference) whose sum (or sum of squares)
    we want to minimise? For times, I would say it is the length of the
    shorter arc round an imaginary 24-hour clock face.

    Minimizing the square of the distance (aka length of the shorter arc)
    is one metric, and I think a reasonable one. (Minimizing the absolute
    value of the distance gives a result more like the median than the
    mean, IIANM.)

    The problem has a natural interpretation in terms of angles. Whatever
    the circular quantity is, convert the values to unit vectors round a
    circle. For times of day, just scale [0, 24) to [0, 2*pi). The
    average is then just the direction of the average vector, converted
    back to a time of day.

    Averaging the "time unit vectors" is another plausible metric.

    Sometimes that vector has zero length, and the average is undefined,
    but otherwise the length of the vector gives an indication of the
    variability of the data.

    Why do I consider this a reasonable interpretation of the problem?
    Well, given a list of times of day when a train is observed to pass
    some station, the circular 24-hour-time average should be our best
    estimate of the scheduled time.

    Obviously there are other possible readings of the problem, but I was
    not able to justify any of them as useful for any real-world
    applications. This is a case where I hope I am wrong and there /are/
    other circular averages with practical interpretations.

    A nice analysis.

    Some observations (all of which I believe are correct, but no
    guarantees are offered).

    The "vector average" metric is easier to calculate and gives an
    explicit indication when the data produces a degenerate result.

    Considering only cases that do not produce a degenerate result:

    (1) the "conventional average" metric and the "vector average"
    metric can produce different results when there are more than two
    data points. (For one or two data points I think they are always
    the same.)

    (2) the "vector average" metric can be calculated in O(n). The
    "conventional average" metric can be calculated in O(n*n).

    (3) when the times are all reasonably close together, the
    "conventional average" metric seems more likely to produce a result corresponding to what most people expect. Another way to say this
    is that the "vector average" metric will sometimes surprise people.

    Incidentally, I had implemented a method that is isomorphic to
    the "vector average" metric (at least I think it is), and I was
    surprised to discover that it gave different results than the
    "conventional average" metric in some cases.
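
    A self-contained sketch of the two metrics side by side, checking
    observation (1) on the earlier two-point test cases (grid resolution
    and the names are my own choices, not the method actually used):

```python
import math

def vector_average(times, m=24.0):
    # O(n): direction of the mean unit vector.
    x = sum(math.cos(t / m * 2 * math.pi) for t in times)
    y = sum(math.sin(t / m * 2 * math.pi) for t in times)
    return math.atan2(y, x) / (2 * math.pi) * m % m

def conventional_average(times, m=24.0, steps=2400):
    # Grid search for the V minimising the sum of squared
    # shortest-arc distances.
    def err(v):
        return sum(min(abs(v - t) % m, m - abs(v - t) % m) ** 2
                   for t in times)
    return min((i * m / steps for i in range(steps)), key=err)

# For one or two data points the two metrics should agree:
for ts in ([23.5, 1.5], [11.5, 13.5], [8, 10]):
    assert abs(vector_average(ts) - conventional_average(ts)) < 0.01
```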

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Heathfield@21:1/5 to Mike Terry on Sat Dec 24 17:14:41 2022
    On 24/12/2022 4:53 pm, Mike Terry wrote:
    On 24/12/2022 14:21, Tim Rentsch wrote:
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their
    average. For example, the average of "eight o'clock"
    and "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example
    "HH:MM" or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a
    result, the puzzle has not yet been solved (unless I
    have overlooked a contribution or misworded
    expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 ) == 0.5
    average( 11.5, 13.5 ) == 12.5

    (I use hours as units, so "0.5" means, "half past
    midnight".)

    I hope that these test cases encode sensible
    expectations for an average of two times on a 24-hour
    clock in the spirit of the example given in the OP,
    which was, "the average of eight o'clock and ten
    o'clock would be nine o'clock", since these test cases
    just have rotated that example by 3.5 and 15.5 hours.

    I believe that I have not seen an algorithm so far in
    this thread that would pass these tests.

    As before, the problem is underspecified.

    Some remarks not specifically in reply to you, Tim...

    I didn't really say very much. Your comments are a lot
    more interesting.

    The input is a collection, t(n), of n > 1 numbers in [0,
    24). The average should be a number, A, in [0, 24) that
    minimises

    Sum_{i=1,n} distance(A, t(i))

    (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to
    think in terms of variance). So far, this is just what an
    average is. The key point is what is the distance (or
    difference) whose sum (or sum of squares) we want to
    minimise? For times, I would say it is the length of the
    shorter arc round an imaginary 24-hour clock face.

    Minimizing the square of the distance (aka length of the
    shorter arc) is one metric, and I think a reasonable one.
    (Minimizing the absolute value of the distance gives a
    result more like the median than the mean, IIANM.)

    Yes - perhaps Ben was thinking of making

    Sum_{i=1,n} signed_distance(A, t(i)) = 0

    rather than minimising the (absolute) distance.

    And so perhaps you are closer to what Ben was thinking of, but as
    far as we know you are no closer to knowing what Stefan was
    thinking of.

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Mike Terry@21:1/5 to Tim Rentsch on Sat Dec 24 16:53:57 2022
    On 24/12/2022 14:21, Tim Rentsch wrote:
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 )== 0.5
    average( 11.5, 13.5 )== 12.5

    (I use hours as units, so "0.5" means, "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    As before, the problem is underspecified.

    Some remarks not specifically in reply to you, Tim...

    I didn't really say very much. Your comments are a lot more
    interesting.

    The input is a collection, t(n), of n > 1 numbers in [0, 24). The
    average should be a number, A, in [0, 24) that minimises

    Sum_{i=1,n} distance(A, t(i))

    (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
    of variance). So far, this is just what an average is. The key point
    is what is the distance (or difference) whose sum (or sum of squares)
    we want to minimise? For times, I would say it is the length of the
    shorter arc round an imaginary 24-hour clock face.

    Minimizing the square of the distance (aka length of the shorter arc)
    is one metric, and I think a reasonable one. (Minimizing the absolute
    value of the distance gives a result more like the median than the
    mean, IIANM.)

    Yes - perhaps Ben was thinking of making

    Sum_{i=1,n} signed_distance(A, t(i)) = 0

    rather than minimising the (absolute) distance. That would match up with minimising the sum of
    squares of the distance (mean). Or he just meant the median as you say, which is different from the
    mean. Both have plausible "average" properties, and meet my "minimal" criteria for what I think
    anyone would require of an "average" applying to modular arithmetic:

    - if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING
    AROUND THE END OF THE MODULO LOOP] then the "average" will be within
    those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.

    - if all x_i are translated by a fixed offset like 1 hour, the average
    will translate by the same offset, i.e. it maintains the same "modulo"
    relation with the translated x_i.

    hmm, thinking now I'll add the similar:

    - if all x_i are "reflected" about a chosen time, then the new average
    will be the reflected time of the original average.

    (those aren't intended to be a definition of a "modular average", just to exclude suggestions that
    blatantly lack the right symmetry for such an average.)
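    The shorter-arc distance and the minimise-the-sum-of-squares reading discussed above can be sketched directly. The one-minute candidate grid and the function names below are illustrative choices, not something from the thread, and this brute-force scan is not an efficient exact algorithm:

```python
def arc_distance(a, b, period=24.0):
    """Length of the shorter arc between two times on a circular clock."""
    d = abs(a - b) % period
    return min(d, period - d)

def arc_mean_hours(times, period=24.0):
    """Brute-force 'average': scan a one-minute grid of candidate times
    and keep the one minimising the sum of squared shorter-arc
    distances to the data."""
    candidates = [i / 60 for i in range(int(period * 60))]
    return min(candidates,
               key=lambda a: sum(arc_distance(a, t, period) ** 2 for t in times))
```

    On the test cases posted earlier, arc_mean_hours([23.5, 1.5]) gives 0.5 and arc_mean_hours([11.5, 13.5]) gives 12.5.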


    The problem has a natural interpretation in terms of angles. Whatever
    the circular quantity is, convert the values to unit vectors round a
    circle. For times of day, just scale [0, 24) to [0, 2*pi). The
    average is then just the direction of the average vector, converted
    back to a time of day.

    Averaging the "time unit vectors" is another plausible metric.

    Yes I think this one would work best in most situations, given that the OP didn't have a precise
    mathematical requirement for his solution - probably he wants something that just works reasonably
    as expected. (It's my favourite answer.)

    Another way of thinking of this approach is "balancing" the 24-hour modular circle, where we've put
    unit weights at each of the times x_i. E.g. we look for a balancing line through the centre of the
    circle. [Note times on the circle go from 1-24, not 1-12 like a normal clock.] Also, thinking like
    this, it's equivalent to (conventional) averaging of the sin of the angular differences from the
    average (balancing) point, since the sin will give the perpendicular distance of the weight from the
    balancing line. So it still fits with the general scheme of minimising distance(x_i, v)^2, but with
    a slightly different metric (distance function) in use.

    As sin(x) is close to x when x is small, the vector average solution would match closely with the
    mean (by arc length) solution when the times are close together, but diverge more noticeably when
    the times are spread out...
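    The vector-average method described above can be sketched concretely; the function name and the near-zero tolerance are assumed details, not from the thread:

```python
import math

def circular_mean_hours(times, period=24.0):
    """Average clock times by summing unit vectors round the circle.

    Returns a time in [0, period), or None when the vector sum is
    (near) zero and no meaningful average exists."""
    x = sum(math.cos(2 * math.pi * t / period) for t in times)
    y = sum(math.sin(2 * math.pi * t / period) for t in times)
    if math.hypot(x, y) < 1e-9:   # perfectly balanced sample
        return None
    return (math.atan2(y, x) * period / (2 * math.pi)) % period
```

    This passes the rotated test cases (average of 23.5 and 1.5 is 0.5) and returns None for a perfectly balanced sample such as {0, 8, 16}.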


    Mike.

  • From Kristjan Robam@21:1/5 to Richard Heathfield on Sat Dec 24 12:17:22 2022
    Has any of you guys tried to make a universal puzzle solver ?





    On Wednesday, December 14, 2022 at 2:30:11 PM UTC+2, Richard Heathfield wrote:
    On 14/12/2022 12:24 pm, Stefan Ram wrote:
    Given n times of the 24-hour day, print their average.

    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".

    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Subject line: "Another little puzzle"

    Your specification seems too simple to qualify as a puzzle, so
    what am I missing?

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Ben Bacarisse@21:1/5 to Mike Terry on Sat Dec 24 22:30:55 2022
    Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:

    On 24/12/2022 14:21, Tim Rentsch wrote:
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 )== 0.5
    average( 11.5, 13.5 )== 12.5

    (I use hours as units, so "0.5" means, "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    As before, the problem is underspecified.

    Some remarks not specifically in reply to you, Tim...
    I didn't really say very much. Your comments are a lot more
    interesting.

    The input is a collection, t(n), of n > 1 numbers in [0, 24). The
    average should be a number, A, in [0, 24) that minimises

    Sum_{i=1,n} distance(A, t(i))

    (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
    of variance). So far, this is just what an average is. The key point
    is what is the distance (or difference) whose sum (or sum of squares)
    we want to minimise? For times, I would say it is the length of the
    shorter arc round an imaginary 24-hour clock face.
    Minimizing the square of the distance (aka length of the shorter arc)
    is one metric, and I think a reasonable one. (Minimizing the absolute
    value of the distance gives a result more like the median than the
    mean, IIANM.)

    Yes - perhaps Ben was thinking of making

    Sum_{i=1,n} signed_distance(A, t(i)) = 0

    rather than minimising the (absolute) distance. That would match up
    with minimising the sum of squares of the distance (mean). Or he just
    meant the median as you say, which is different from the mean. Both
    have plausible "average" properties,

    Well, I was trying to be deliberately vague (hence my talking about "an average" rather than mean, median or whatever) but I was not vague enough
    with the formula.

    I wasn't thinking of a zero net distance because some metrics (though
    probably not applicable in this situation) don't always allow for a zero
    sum. I should have referred to minimising some kind of distance sum, be it
    the absolute value of a signed sum or, as in the solution I was
    proposing, the sum of directional differences (i.e. full vectors).

    and meet my "minimal" criteria
    for what I think anyone would require of an "average" applying to a
    modular arithmetic:

    - if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING
    AROUND THE END OF THE MODULO LOOP] then the "average" will be within
    those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.

    And when the x_i are widely spaced the average might be considered
    nearly meaningless. If you have {0, 8, 16} hours then what is the
    "average"? Are there cases where every set of circular values has a
    meaningful "average"? If not, you really need more than a single number answer.

    (These are not questions to you, specifically, I'm interested to know of interpretations that go beyond the way I see things.)

    - if all x_i are translated by a fixed offset like 1 hour, the average
    will translate by the same offset, i.e. it maintains the same "modulo"
    relation with the translated x_i.

    hmm, thinking now I'll add the similar:

    - if all x_i are "reflected" about a chosen time, then the new average
    will be the reflected time of the original average.

    (those aren't intended to be a definition of a "modular average", just
    to exclude suggestions that blatantly lack the right symmetry for such
    an average.)

    The problem has a natural interpretation in terms of angles. Whatever
    the circular quantity is, convert the values to unit vectors round a
    circle. For times of day, just scale [0, 24) to [0, 2*pi). The
    average is then just the direction of the average vector, converted
    back to a time of day.
    Averaging the "time unit vectors" is another plausible metric.

    Yes I think this one would work best in most situations, given that
    the OP didn't have a precise mathematical requirement for his solution
    - probably he wants something that just works reasonably as expected.
    (It's my favourite answer.)

    Another way of thinking of this approach is "balancing" the 24-hour
    modular circle, where we've put unit weights at each of the times x_i.
    E.g. we look for a balancing line through the centre of the circle.

    Is that the same? For {0, 8, 16} any of the three lines through 0, 8 or
    16 balances, but the vector sum interpretation is undefined. And for
    {0, 6, 12, 18} there are two balance lines, neither of which seems to be
    a natural answer for the average.

    (Also, every balance line gives two answers, so how do you pick?)

    [Note times on the circle go from 1-24, not 1-12 like a normal clock.]
    Also, thinking like this, it's equivalent to (conventional) averaging
    of the sin of the angular differences from the average (balancing)
    point, since the sin will give the perpendicular distance of the
    weight from the balancing line. So it still fits with the general
    scheme of minimising distance(x_i, v)^2, but with a slightly different
    metric (distance function) in use.

    As sin(x) is close to x when x is small, the vector average solution
    would match closely with the mean (by arc length) solution when the
    times are close together, but diverge more noticeably when the times
    are spread out...

    --
    Ben.

  • From Mike Terry@21:1/5 to Ben Bacarisse on Sun Dec 25 00:02:31 2022
    On 24/12/2022 22:30, Ben Bacarisse wrote:
    Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:

    On 24/12/2022 14:21, Tim Rentsch wrote:
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 )== 0.5
    average( 11.5, 13.5 )== 12.5

    (I use hours as units, so "0.5" means, "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    As before, the problem is underspecified.

    Some remarks not specifically in reply to you, Tim...
    I didn't really say very much. Your comments are a lot more
    interesting.

    The input is a collection, t(n), of n > 1 numbers in [0, 24). The
    average should be a number, A, in [0, 24) that minimises

    Sum_{i=1,n} distance(A, t(i))

    (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
    of variance). So far, this is just what an average is. The key point is what is the distance (or difference) whose sum (or sum of squares)
    we want to minimise? For times, I would say it is the length of the
    shorter arc round an imaginary 24-hour clock face.
    Minimizing the square of the distance (aka length of the shorter arc)
    is one metric, and I think a reasonable one. (Minimizing the absolute
    value of the distance gives a result more like the median than the
    mean, IIANM.)

    Yes - perhaps Ben was thinking of making

    Sum_{i=1,n} signed_distance(A, t(i)) = 0

    rather than minimising the (absolute) distance. That would match up
    with minimising the sum of squares of the distance (mean). Or he just
    meant the median as you say, which is different from the mean. Both
    have plausible "average" properties,

    Well, I was trying to be deliberately vague (hence my talking about "an average" rather than mean, median or whatever) but I was not vague enough
    with the formula.

    I wasn't thinking of a zero net distance because some metrics (though probably not applicable in this situation) don't always allow for a zero
    sum. I should have referred to minimising some kind of distance sum, be it
    the absolute value of a signed sum or, as in the solution I was
    proposing, the sum of directional differences (i.e. full vectors).

    and meet my "minimal" criteria
    for what I think anyone would require of an "average" applying to
    modular arithmetic:

    - if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING
    AROUND THE END OF THE MODULO LOOP] then the "average" will be within
    those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.

    And when the x_i are widely spaced the average might be considered
    nearly meaningless. If you have {0, 8, 16} hours then what is the
    "average"? Are there cases where every set of circular values has a meaningful "average"? If not, you really need more than a single number answer.

    (These are not questions to you, specifically, I'm interested to know of interpretations that go beyond the way I see things.)

    - if all x_i are translated by a fixed offset like 1 hour, the average
    will translate by the same offset, i.e. it maintains the same "modulo" relation with the translated x_i.

    hmm, thinking now I'll add the similar:

    - if all x_i are "reflected" about a chosen time, then the new average
    will be the reflected time of the original average.

    (those aren't intended to be a definition of a "modular average", just
    to exclude suggestions that blatantly lack the right symmetry for such
    an average.)

    The problem has a natural interpretation in terms of angles. Whatever the circular quantity is, convert the values to unit vectors round a
    circle. For times of day, just scale [0, 24) to [0, 2*pi). The
    average is then just the direction of the average vector, converted
    back to a time of day.
    Averaging the "time unit vectors" is another plausible metric.

    Yes I think this one would work best in most situations, given that
    the OP didn't have a precise mathematical requirement for his solution
    - probably he wants something that just works reasonably as expected.
    (It's my favourite answer.)

    Another way of thinking of this approach is "balancing" the 24-hour
    modular circle, where we've put unit weights at each of the times x_i.
    E.g. we look for a balancing line through the centre of the circle.

    Is that the same? For {0, 8, 16} any of the three lines through 0, 8 or
    16 balances, but the vector sum interpretation is undefined. And for
    {0, 6, 12, 18} there are two balance lines, neither of which seems to be
    a natural answer for the average.

    Yes, it's logically the same, in so far as WHEN the x_i aren't already balanced [aka they /have/ a
    meaningful average] we get the same average both ways. (Adding vectors is effectively a way of
    calculating moments of the weights in the balancing view of things.)

    With modular arithmetic there is a scenario where everything is already balanced, and so it isn't
    clear what an average would represent, if anything. Both {0, 8, 16} and {0, 6, 12, 18} are already
    perfectly balanced - ANY line through the origin would balance for either example. That is just a
    feature of modular balancing/averaging I guess. I agree there's no natural "average", so I'll just
    suggest we say "this distribution doesn't have an average" or something. Adding vectors gives a
    zero result and so doesn't work either...

    Of course, we might apply some definition that we're working with and just see how it goes! E.g. we
    might say "we're minimising XXXXXX error function" - but then we just find the function is constant
    on the whole circle - so it doesn't give us anything useful other than that there's no useful
    average. Or we could say "every point meets the average criteria", but those are just words and no
    more help. This is what happens with the minimising functions that have been mentioned in the
    thread, for your two "already balanced" examples. [E.g. if we're minimising

    error(A) = Sum_{x=0,8,16} sin (angle of least arc from A to x)^2

    it's going to be zero on the whole circle - no (unique) minimum. :( ]

    You're right that a balancing line has two ends, so which to take? We would take the end bearing
    most weight! But actually the balancing line idea wasn't my proper thinking - that was that we
    "counter-balanced" the circle by adding a new weight somewhere on the circle. There is a unique
    such place where such a weight can go, and a unique weight that works. The "average" for the x_i is
    then directly OPPOSITE the counter weight. This is all assuming we don't start with an exactly
    balancing sample! The mass of the counter-weight is the magnitude of your vector sum, indicating
    the "strength" of the average...

    [I thought someone would ask the annoying question "why is the counter-balancing point on the circle
    /opposite/ to the average point?" so by turning it into a balancing line which went through the
    average-point I could save a response... And of course once we're considering a balancing line
    passing through the counter-weight and average-point, we could then remove the counter-weight and
    things would still balance]

    Anyhow, I just said it all as another way of thinking about the vector average method. In a program
    we're going to compute x and y components of vectors and add them like you said.


    (Also, every balance line gives two answers, so how do you pick?)

    The "heaviest end" of the circle! But really, maybe forget the balancing line and think of the
    counter-balancing weight idea.
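    The counter-balancing weight idea can be sketched the same way: the counterweight sits directly opposite the vector sum, and its mass is the sum's magnitude (the names and the balance tolerance below are assumptions, not from the thread):

```python
import math

def counterweight(times, period=24.0):
    """Place one weight on the rim so the disc balances.

    Returns (position_in_hours, mass).  The position is directly
    opposite the vector sum of the data, so the "average" sits
    opposite the counterweight; the mass is the vector-sum magnitude,
    indicating the strength of the average."""
    x = sum(math.cos(2 * math.pi * t / period) for t in times)
    y = sum(math.sin(2 * math.pi * t / period) for t in times)
    mass = math.hypot(x, y)
    if mass < 1e-9:                     # already balanced: nothing to add
        return None, 0.0
    pos = (math.atan2(-y, -x) * period / (2 * math.pi)) % period
    return pos, mass
```

    For {8, 10} the counterweight goes at 21 o'clock (opposite the average, 9), and an already-balanced sample such as {0, 6, 12, 18} yields no counterweight at all.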


    Mike.

  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Sun Dec 25 00:41:38 2022
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 )== 0.5
    average( 11.5, 13.5 )== 12.5

    (I use hours as units, so "0.5" means, "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    As before, the problem is underspecified.

    Some remarks not specifically in reply to you, Tim...

    I didn't really say very much. Your comments are a lot more
    interesting.

    The input is a collection, t(n), of n > 1 numbers in [0, 24). The
    average should be a number, A, in [0, 24) that minimises

    Sum_{i=1,n} distance(A, t(i))

    (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
    of variance). So far, this is just what an average is. The key point
    is what is the distance (or difference) whose sum (or sum of squares)
    we want to minimise? For times, I would say it is the length of the
    shorter arc round an imaginary 24-hour clock face.

    Minimizing the square of the distance (aka length of the shorter arc)
    is one metric, and I think a reasonable one. (Minimizing the absolute
    value of the distance gives a result more like the median than the
    mean, IIANM.)

    The problem has a natural interpretation in terms of angles. Whatever
    the circular quantity is, convert the values to unit vectors round a
    circle. For times of day, just scale [0, 24) to [0, 2*pi). The
    average is then just the direction of the average vector, converted
    back to a time of day.

    Averaging the "time unit vectors" is another plausible metric.

    Sometimes that vector has zero length, and the average is undefined,
    but otherwise the length of the vector gives an indication of the
    variability of the data.

    Why do I consider this a reasonable interpretation of the problem?
    Well, given a list of times of day when a train is observed to pass
    some station, the circular 24-hour-time average should be our best
    estimate of the scheduled time.

    Obviously there are other possible readings of the problem, but I was
    not able to justify any of them as useful for any real-world
    applications. This is a case where I hope I am wrong and there /are/
    other circular averages with practical interpretations.

    A nice analysis.

    A generous appraisal, thanks. I was trying to be vague enough not to
    put my foot in it, but was too specific with the formula.

    Some observations (all of which I believe are correct, but no
    guarantees are offered).

    The "vector average" metric is easier to calculate and gives an
    explicit indication when the data produces a degenerate result.

    Considering only cases that do not produce a degenerate result:

    (1) the "conventional average" metric and the "vector average"
    metric can produce different results when there are more than two
    data points. (For one or two data points I think they are always
    the same.)

    (2) the "vector average" metric can be calculated in O(n). The
    "conventional average" metric can be calculated in O(n*n).

    By which I presume you mean that it can /not/ be done in O(n). (Big O is just an upper
    bound.)

    (3) when the times are all reasonably close together, the
    "conventional average" metric seems more likely to produce a result corresponding to what most people expect. Another way to say this
    is that the "vector average" metric will sometimes surprise people.

    I'm sorry to be obtuse, but what is the "conventional average"? The
    name makes it sound trivial, but the quadratic time makes me certain
    that it isn't.

    My "conventional average" algorithm (which is not well thought out) was
    to (a) rotate the data set to avoid the 23/0 boundary (not always
    possible), (b) take the arithmetic mean, and then (c) rotate the result
    back. E.g. [23,0,1] -> [0,1,2] by adding one, and the average is
    mean[0,1,2] - 1 = 0.
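    That rotate-and-mean idea can be sketched as follows; choosing the rotation point just past the largest circular gap between the sorted data points is an assumed detail for step (a), which the description above leaves open:

```python
def rotated_mean_hours(times, period=24.0):
    """Rotate the data away from the 23/0 boundary, take the ordinary
    arithmetic mean, then rotate the result back.  The rotation point
    is the first data point after the widest empty arc, so the data
    never straddle the boundary (when a clear gap exists)."""
    ts = sorted(t % period for t in times)
    n = len(ts)
    # circular gap after each sorted point
    gaps = [(ts[(i + 1) % n] - ts[i]) % period for i in range(n)]
    i = gaps.index(max(gaps))                 # widest empty arc
    start = ts[(i + 1) % n]                   # first point after the gap
    shifted = [(t - start) % period for t in ts]
    return (sum(shifted) / n + start) % period
```

    On Ben's example, rotated_mean_hours([23, 0, 1]) gives 0, and the earlier test cases also pass (the average of 23.5 and 1.5 is 0.5).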

    --
    Ben.

  • From Ben Bacarisse@21:1/5 to Mike Terry on Sun Dec 25 01:55:35 2022
    Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:

    ... But actually the balancing line idea wasn't my proper thinking -
    that was that we "counter-balanced" the circle by adding a new weight somewhere on the circle. There is a unique such place where such a
    weight can go, and a unique weight that works.

    Or just find the point in the (weightless) disc where it balances.
    That's the vector average.

    The "average" for the
    x_i is then directly OPPOSITE the counter weight. This is all
    assuming we don't start with an exactly balancing sample!
    The mass of
    the counter-weight is the magnitude of your vector sum, indicating the "strength" of the average...

    The distance from the centre is then a measure of the strength of the
    average.

    I like the physical metaphor of a balanced disc, but I prefer the idea
    of finding the balance point directly rather than the mass and location
    (on the circumference) of a counterweight.

    --
    Ben.

  • From Ben Bacarisse@21:1/5 to Mike Terry on Sun Dec 25 01:30:42 2022
    Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:

    On 24/12/2022 22:30, Ben Bacarisse wrote:
    Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:

    Another way of thinking of this approach is "balancing" the 24-hour
    modular circle, where we've put unit weights at each of the times x_i.
    E.g. we look for a balancing line through the centre of the circle.

    Is that the same? For {0, 8, 16} any of the three lines through 0, 8 or
    16 balances, but the vector sum interpretation is undefined. And for
    {0, 6, 12, 18} there are two balance lines, neither of which seems to be
    a natural answer for the average.

    Yes, it's logically the same, in so far as WHEN the x_i aren't already balanced [aka they /have/ a meaningful average] we get the same
    average both ways. (Adding vectors is effectively a way of
    calculating moments of the weights in the balancing view of things.)

    Oh, yes, I see what you mean. Sorry.

    --
    Ben.

  • From Tim Rentsch@21:1/5 to Mike Terry on Sun Dec 25 00:37:47 2022
    Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:

    On 24/12/2022 14:21, Tim Rentsch wrote:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    [...]

    The input is a collection, t(n), of n > 1 numbers in [0, 24). The
    average should be a number, A, in [0, 24) that minimises

    Sum_{i=1,n} distance(A, t(i))

    (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
    of variance). So far, this is just what an average is. The key point
    is what is the distance (or difference) whose sum (or sum of squares)
    we want to minimise? For times, I would say it is the length of the
    shorter arc round an imaginary 24-hour clock face.

    Minimizing the square of the distance (aka length of the shorter
    arc) is one metric, and I think a reasonable one. [...]

    Yes - perhaps Ben was thinking of making

    Sum_{i=1,n} signed_distance(A, t(i)) = 0

    rather than minimising the (absolute) distance. [...]

    I think Ben meant what he said, but I will let him speak for
    himself.

    The problem has a natural interpretation in terms of angles.
    Whatever the circular quantity is, convert the values to unit
    vectors round a circle. For times of day, just scale [0, 24) to
    [0, 2*pi). The average is then just the direction of the average
    vector, converted back to a time of day.

    Averaging the "time unit vectors" is another plausible metric.

    Yes I think this one would work best in most situations, [...]

    It gives results in many cases, including many easy cases, that I
    think would surprise most people.

    Another way of thinking of this approach is "balancing" the 24-hour
    modular circle, where we've put unit weights at each of the times
    x_i. E.g. we look for a balancing line through the centre of the
    circle. [Note times on the circle go from 1-24, not 1-12 like a
    normal clock.]

    To me this way of thinking about the problem seems not very
    useful.

    Also, thinking like this, it's equivalent to (conventional)
    averaging of the sin of the angular differences from the average
    (balancing) point, since the sin will give the perpendicular
    distance of the weight from the balancing line. [...]

    I think this statement is nonsensical. We are taking the
    conventional average of "something", but the "something" has
    been chosen so that its conventional average is zero. It's
    like saying, Once you get to where you're going, you know that
    where you're going is where you are. There is no information
    about how to solve the problem, and so it cannot be equivalent
    to any method that provides an actual answer.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sun Dec 25 01:05:06 2022
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    Given n times of the 24-hour day, print their average.
    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".
    (You can choose any representation, for example "HH:MM"
    or "seconds since midnight".)

    Thanks for all replies!

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    There were not enough tests written and run. As a result,
    the puzzle has not yet been solved (unless I have overlooked
    a contribution or misworded expectations).

    So, here are two possible test cases.

    average( 23.5, 1.5 )== 0.5
    average( 11.5, 13.5 )== 12.5

    (I use hours as units, so "0.5" means, "half past midnight".)

    I hope that these test cases encode sensible expectations
    for an average of two times on a 24-hour clock in the spirit
    of the example given in the OP, which was, "the average of
    eight o'clock and ten o'clock would be nine o'clock", since
    these test cases just have rotated that example by 3.5 and
    15.5 hours.
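    [Editorially, for concreteness: the vector (circular) mean proposed
    elsewhere in the thread does pass both of these tests. A minimal
    Python sketch, assuming hours in [0, 24); the name `average_time`
    is my own choice, not anything from the thread:]

```python
import math

def average_time(times):
    # Map each hour t in [0, 24) to an angle on the unit circle,
    # average the unit vectors, and map the direction of the mean
    # vector back to an hour.
    angles = [t * math.pi / 12 for t in times]
    x = sum(map(math.cos, angles)) / len(angles)
    y = sum(map(math.sin, angles)) / len(angles)
    if math.hypot(x, y) < 1e-12:
        raise ValueError("times balance out; the average is undefined")
    return math.atan2(y, x) * 12 / math.pi % 24
```

    [With this definition, average_time([23.5, 1.5]) gives 0.5 and
    average_time([11.5, 13.5]) gives 12.5, matching the test cases.]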

    I believe that I have not seen an algorithm so far in this
    thread that would pass these tests.

    As before, the problem is underspecified.

    Some remarks not specifically in reply to you, Tim...

    I didn't really say very much. Your comments are a lot more
    interesting.

    The input is a collection, t(n), of n > 1 numbers in [0, 24). The
    average should be a number, A, in [0, 24) that minimises

    Sum_{i=1,n} distance(A, t(i))

    (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
    of variance). So far, this is just what an average is. The key point
    is what is the distance (or difference) whose sum (or sum of squares)
    we want to minimise? For times, I would say it is the length of the
    shorter arc round an imaginary 24-hour clock face.

    Minimizing the square of the distance (aka length of the shorter arc)
    is one metric, and I think a reasonable one. (Minimizing the absolute
    value of the distance gives a result more like the median than the
    mean, IIANM.)

    The problem has a natural interpretation in terms of angles. Whatever
    the circular quantity is, convert the values to unit vectors round a
    circle. For times of day, just scale [0, 24) to [0, 2*pi). The
    average is then just the direction of the average vector, converted
    back to a time of day.

    Averaging the "time unit vectors" is another plausible metric.

    Sometimes that vector has zero length, and the average is undefined,
    but otherwise the length of the vector gives an indication of the
    variability of the data.
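    [Ben's recipe can be sketched directly; the resultant length R
    doubles as the variability indicator he mentions. A sketch with
    names of my choosing:]

```python
import math

def circular_mean_with_spread(times):
    # Scale [0, 24) to [0, 2*pi), average the unit vectors, and
    # return (mean time, resultant length R).  R is 1 when all
    # times coincide and drops to 0 when they balance out exactly,
    # in which case the mean component is meaningless.
    angles = [t * math.pi / 12 for t in times]
    x = sum(map(math.cos, angles)) / len(angles)
    y = sum(map(math.sin, angles)) / len(angles)
    return math.atan2(y, x) * 12 / math.pi % 24, math.hypot(x, y)
```

    [E.g. the balanced set {0, 6, 12, 18} gives R close to 0, flagging
    the degenerate case explicitly.]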

    Why do I consider this a reasonable interpretation of the problem?
    Well, given a list of times of day when a train is observed to pass
    some station, the circular 24-hour-time average should be our best
    estimate of the scheduled time.

    Obviously there are other possible readings of the problem, but I was
    not able to justify any of them as useful for any real-world
    applications. This is a case where I hope I am wrong and there /are/
    other circular averages with practical interpretations.

    A nice analysis.

    A generous appraisal, thanks. I was trying to be vague enough not to
    put my foot in it, but was too specific with the formula.

    Some observations (all of which I believe are correct, but no
    guarantees are offered).

    The "vector average" metric is easier to calculate and gives an
    explicit indication when the data produces a degenerate result.

    Considering only cases that do not produce a degenerate result:

    (1) the "conventional average" metric and the "vector average"
    metric can produce different results when there are more than two
    data points. (For one or two data points I think they are always
    the same.)

    (2) the "vector average" metric can be calculated in O(n). The
    "conventional average" metric can be calculated in O(n*n).

    By which I presume you mean and /not/ in O(n). (Big O is just an
    upper bound.)

    What I mean is I have an algorithm that runs in time proportional
    to n*n. I suspect that there is an algorithm with better big-O
    behavior (probably O( n * log n )), but I don't actually have one
    to show, so I simply gave the best big-O result that I knew.

    (3) when the times are all reasonably close together, the
    "conventional average" metric seems more likely to produce a result
    corresponding to what most people expect. Another way to say this
    is that the "vector average" metric will sometimes surprise people.

    I'm sorry to be obtuse, but what is the "conventional average"? The
    name makes it sound trivial, but the quadratic time makes me certain
    that it isn't.

    Sorry, I meant to refer to your formulation of average

    A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }

    where 'difference' means the shorter arc length. This formula
    matches the result for 'mean' on real numbers.

    My "conventional average" algorithm (which is not well thought out) was
    to (a) rotate the data set to avoid the 23/0 boundary (not always
    possible), (b) take the arithmetic mean, and then (c) rotate the result
    back. E.g. [23,0,1] -> [0,1,2] by adding one, and the average is
    mean[0,1,2] - 1 = 0.
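    [Ben's rotate/mean/rotate-back recipe as a sketch; the caller must
    supply a shift that moves the data away from the 23/0 boundary:]

```python
def rotated_mean(times, shift):
    # Rotate every time by `shift` hours (mod 24), take the ordinary
    # arithmetic mean, and rotate the result back.  Only meaningful
    # when the shifted data no longer straddles the 23/0 boundary.
    shifted = [(t + shift) % 24 for t in times]
    return (sum(shifted) / len(shifted) - shift) % 24
```

    [rotated_mean([23, 0, 1], 1) reproduces the example above, giving 0.
    Finding a valid shift automatically is the open part of the problem.]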

    Yes, if you know where to split the cycle then the answer can be
    found in O(n) time. But how can we figure out where to split the
    cycle? (Incidentally I got that part wrong the first way I tried
    to do it.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Tim Rentsch on Sun Dec 25 09:30:55 2022
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    [pulled up from further on in the responded-to posting]

    [By "conventional average" I mean a time/direction A such that]

    A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }

    where 'difference' means the shorter arc length.

    [...]

    The "conventional average" metric can be calculated in O(n*n).

    By which I presume you mean and /not/ in O(n). (Big O is just an
    upper bound.)

    What I mean is I have an algorithm that runs in time proportional
    to n*n. I suspect that there is an algorithm with better big-O
    behavior (probably O( n * log n )), but I don't actually have one
    to show, so I simply gave the best big-O result that I knew.

    After some further thought I have convinced myself that there is
    indeed an algorithm that is O( n * log n ). The idea isn't that
    complicated, but it's tedious, so I haven't actually implemented
    it. Perhaps someone will be interested to take that as a challenge
    problem.

    And who knows, maybe there is an algorithm with even better big-O
    performance. At least for now that's above my pay grade.

    (and of course, Merry Christmas...)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Sun Dec 25 21:44:03 2022
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    I'm sorry to be obtuse, but what is the "conventional average"? The
    name makes it sound trivial, but the quadratic time makes me certain
    that it isn't.

    Sorry, I meant to refer to your formulation of average

    A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }

    where 'difference' means the shorter arc length. This formula
    matches the result for 'mean' on real numbers.

    My "conventional average" algorithm (which is not well thought out) was
    to (a) rotate the data set to avoid the 23/0 boundary (not always
    possible), (b) take the arithmetic mean, and then (c) rotate the result
    back. E.g. [23,0,1] -> [0,1,2] by adding one, and the average is
    mean[0,1,2] - 1 = 0.

    Yes, if you know where to split the cycle then the answer can be
    found in O(n) time. But how can we figure out where to split the
    cycle?

    Well I handily stopped considering this at the stage where I assumed
    there must be a simple way to spot the optimal rotation, so I never
    thought it might have to be quadratic. Presumably your algorithm tries
    all the offsets and minimises the result.

    Looking at it a bit more I can't see a better way (but that might be
    the Ratafia de Champagne). It feels as if there /should/ be one. In
    fact it feels as if it should be linear.
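    [One way below quadratic, assuming, as the discussion does, that it
    suffices to try each cyclic "cut" between consecutive sorted points:
    sort once, then update the sum and sum of squares incrementally so
    each candidate mean is scored in O(1). This is only a sketch of
    that idea, not Tim's actual algorithm:]

```python
def arc_mean(times, period=24.0):
    # O(n log n) sketch: sort, then for each cyclic "cut" unroll the
    # circle there, take the ordinary mean of the unrolled values,
    # and score it by the sum of squared deviations.  The running
    # sums are updated in O(1) per cut, so the loop is linear.
    ts = sorted(times)
    n = len(ts)
    s = sum(ts)                    # sum of the unrolled values
    s2 = sum(t * t for t in ts)    # sum of their squares
    best_cost = best_mean = None
    for i in range(n):
        cost = s2 - s * s / n      # n * variance of the unrolled data
        if best_cost is None or cost < best_cost:
            best_cost, best_mean = cost, (s / n) % period
        # Move the cut past ts[i]: that point wraps up by one period,
        # so s grows by `period` and s2 by (t + p)^2 - t^2.
        s += period
        s2 += 2 * period * ts[i] + period * period
    return best_mean
```

    [E.g. arc_mean([23, 0, 1]) returns 0, since the cut that unrolls the
    data to [23, 24, 25] has the smallest variance.]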

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Damon@21:1/5 to Ben Bacarisse on Sun Dec 25 17:59:18 2022
    On 12/25/22 4:44 PM, Ben Bacarisse wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    I'm sorry to be obtuse, but what is the "conventional average"? The
    name makes it sound trivial, but the quadratic time makes me certain
    that it isn't.

    Sorry, I meant to refer to your formulation of average

    A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }

    where 'difference' means the shorter arc length. This formula
    matches the result for 'mean' on real numbers.

    My "conventional average" algorithm (which is not well thought out) was
    to (a) rotate the data set to avoid the 23/0 boundary (not always
    possible), (b) take the arithmetic mean, and then (c) rotate the result
    back. E.g. [23,0,1] -> [0,1,2] by adding one, and the average is
    mean[0,1,2] - 1 = 0.

    Yes, if you know where to split the cycle then the answer can be
    found in O(n) time. But how can we figure out where to split the
    cycle?

    Well I handily stopped considering this at the stage where I assumed
    there must be a simple way to spot the optimal rotation, so I never
    thought it might have to be quadratic. Presumably your algorithm tries
    all the offsets and minimises the result.

    Looking at it a bit more I can't see a better way (but that might be
    the Ratafia de Champagne). It feels as if there /should/ be one. In
    fact it feels as if it should be linear.


    I may have lost track of the problem, but wouldn't the addition of the
    points in two-dimensions and finding the center of gravity let you find
    the answer in O(N) time?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julio Di Egidio@21:1/5 to Julio Di Egidio on Sun Dec 25 15:52:54 2022
    On Thursday, 22 December 2022 at 17:30:06 UTC+1, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
    On 22/12/2022 01:05, Julio Di Egidio wrote:
    On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
    On 21/12/2022 21:54, Dmitry A. Kazakov wrote:

    kind of most natural definition, at least to me.

    That is unambiguous, but to my mind not at all what is wanted

    You just don't get it, do you? This is (comp.)programming
    where guesswork is a *capital sin*.

    Programming is all about trying to guess what the customer actually
    wanted, rather than what they asked for! :-)

    Utter nonsense of the usual kind and an entire industry down the drain.

    No, we are rather supposed to *elicit* requirements... which is just the
    tip of the (software) analysis iceberg.

    Up 2. You fraudulent cunts.

    Julio

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Richard Damon on Sun Dec 25 18:39:52 2022
    Richard Damon <Richard@Damon-Family.org> writes:

    On 12/25/22 4:44 PM, Ben Bacarisse wrote:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    I'm sorry to be obtuse, but what is the "conventional average"? The
    name makes it sound trivial, but the quadratic time makes me certain
    that it isn't.

    Sorry, I meant to refer to your formulation of average

    A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }

    where 'difference' means the shorter arc length. This formula
    matches the result for 'mean' on real numbers.

    My "conventional average" algorithm (which is not well thought
    out) was to (a) rotate the data set to avoid the 23/0 boundary
    (not always possible), (b) take the arithmetic mean, and then (c)
    rotate the result back. E.g. [23,0,1] -> [0,1,2] by adding one,
    and the average is mean[0,1,2] - 1 = 0.

    Yes, if you know where to split the cycle then the answer can be
    found in O(n) time. But how can we figure out where to split the
    cycle?

    Well I handily stopped considering this at the stage where I
    assumed there must be a simple way to spot the optimal rotation, so
    I never thought it might have to be quadratic. Presumably your
    algorithm tries all the offsets and minimises the result.

    Looking at it a bit more I can't see a better way (but that might
    be the Ratafia de Champagne). It feels as if there /should/ be
    one. In fact it feels as if it should be linear.

    I may have lost track of the problem, but wouldn't the addition of
    the points in two-dimensions and finding the center of gravity let
    you find the answer in O(N) time?

    Finding the center of mass does give an answer in O(N) time, but
    the answer it gives doesn't match the metric of minimizing the
    squares of shorter arc lengths. The results of the two metrics
    can be, and usually are, different.
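    [A concrete illustration of the disagreement, using the vector mean
    on the clustered sample [0, 1, 11]. All three points fit within half
    the circle, so the arc-length average should be the plain mean, 4,
    yet the vector mean is pulled toward the tight pair near midnight:]

```python
import math

def vector_mean(times):
    # Direction of the mean unit vector round a 24-hour circle.
    angles = [t * math.pi / 12 for t in times]
    y = sum(map(math.sin, angles))
    x = sum(map(math.cos, angles))
    return math.atan2(y, x) * 12 / math.pi % 24

# vector_mean([0, 1, 11]) is roughly 1.8 -- far from the
# arc-length answer of 4 for the same data.
```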

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sun Dec 25 18:35:18 2022
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    I'm sorry to be obtuse, but what is the "conventional average"? The
    name makes it sound trivial, but the quadratic time makes me certain
    that it isn't.

    Sorry, I meant to refer to your formulation of average

    A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }

    where 'difference' means the shorter arc length. This formula
    matches the result for 'mean' on real numbers.

    My "conventional average" algorithm (which is not well thought
    out) was to (a) rotate the data set to avoid the 23/0 boundary
    (not always possible), (b) take the arithmetic mean, and then (c)
    rotate the result back. E.g. [23,0,1] -> [0,1,2] by adding one,
    and the average is mean[0,1,2] - 1 = 0.

    Yes, if you know where to split the cycle then the answer can be
    found in O(n) time. But how can we figure out where to split the
    cycle?

    Well I handily stopped considering this at the stage where I assumed
    there must be a simple way to spot the optimal rotation, so I never
    thought it might have to be quadratic. Presumably your algorithm
    tries all the offsets and minimises the result.

    Right.

    Looking at it a bit more I can't see a better way (but that might be
    the Ratafia de Champagne). It feels as if there /should/ be one.
    In fact it feels as if it should be linear.

    My best so far is only O( n * log n ). Probably that isn't
    optimal. I don't see how to make it linear though. Do you have
    any ideas?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Mon Dec 26 03:49:36 2022
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    I'm sorry to be obtuse, but what is the "conventional average"? The
    name makes it sound trivial, but the quadratic time makes me certain
    that it isn't.

    Sorry, I meant to refer to your formulation of average

    A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }

    where 'difference' means the shorter arc length. This formula
    matches the result for 'mean' on real numbers.

    My "conventional average" algorithm (which is not well thought
    out) was to (a) rotate the data set to avoid the 23/0 boundary
    (not always possible), (b) take the arithmetic mean, and then (c)
    rotate the result back. E.g. [23,0,1] -> [0,1,2] by adding one,
    and the average is mean[0,1,2] - 1 = 0.

    Yes, if you know where to split the cycle then the answer can be
    found in O(n) time. But how can we figure out where to split the
    cycle?

    Well I handily stopped considering this at the stage where I assumed
    there must be a simple way to spot the optimal rotation, so I never
    thought it might have to be quadratic. Presumably your algorithm
    tries all the offsets and minimises the result.

    Right.

    Looking at it a bit more I can't see a better way (but that might be
    the Ratafia de Champagne). It feels as if there /should/ be one.
    In fact it feels as if it should be linear.

    My best so far is only O( n * log n ). Probably that isn't
    optimal. I don't see how to make it linear though. Do you have
    any ideas?

    No. I should try with a clear head.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Richard Damon on Mon Dec 26 03:18:44 2022
    Richard Damon <Richard@Damon-Family.org> writes:

    On 12/25/22 4:44 PM, Ben Bacarisse wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    I'm sorry to be obtuse, but what is the "conventional average"? The
    name makes it sound trivial, but the quadratic time makes me certain
    that it isn't.

    Sorry, I meant to refer to your formulation of average

    A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }

    where 'difference' means the shorter arc length. This formula
    matches the result for 'mean' on real numbers.

    My "conventional average" algorithm (which is not well thought out)
    was to (a) rotate the data set to avoid the 23/0 boundary (not always
    possible), (b) take the arithmetic mean, and then (c) rotate the
    result back. E.g. [23,0,1] -> [0,1,2] by adding one, and the average
    is mean[0,1,2] - 1 = 0.

    Yes, if you know where to split the cycle then the answer can be
    found in O(n) time. But how can we figure out where to split the
    cycle?
    Well I handily stopped considering this at the stage where I assumed
    there must be a simple way to spot the optimal rotation, so I never
    thought it might have to be quadratic. Presumably your algorithm tries
    all the offsets and minimises the result.
    Looking at it a bit more I can't see a better way (but that might be the
    Ratafia de Champagne). It feels as if there /should/ be one. In fact
    it feels as if it should be linear.

    I may have lost track of the problem, but wouldn't the addition of the
    points in two-dimensions and finding the center of gravity let you
    find the answer in O(N) time?

    Yes. It's been generally agreed that the vector average is the simplest
    way to go. Not only is it simple, it gives an answer to the question
    when there is no meaningful average (the zero vector), and an indication
    when the average is "unreliable" (a very short vector).

    But there is another, one dimensional, average that minimises the sum of
    the squares of the short-arc distances to all the data points. It has
    been argued that this average might be less surprising to users, but
    it's tricky to calculate -- at least my first thoughts were way too
    optimistic. Tim has an algorithm for it but I don't know what his
    method is.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julio Di Egidio@21:1/5 to Julio Di Egidio on Mon Dec 26 06:33:57 2022
    On Wednesday, 21 December 2022 at 23:47:52 UTC+1, Julio Di Egidio wrote:
    On Wednesday, 21 December 2022 at 20:56:01 UTC+1, Mike Terry wrote:

    an accumulation of "frustration" at what I consider to be
    "unimaginative" misinterpretations and
    dogmatic claims of what was originally requested...
    Besides that the problem was and remains at best poorly
    phrased (and I agree that guesswork around here should
    rather be banned), there is a much simpler and canonical
    way to define "midpoint" in a modular space, namely the
    midpoint along the shortest of the two possible paths.
    And that's actually a quite common kind of calculation...

    Up.3, you fraudulent cunts.

    Julio

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Stefan Ram on Wed Dec 28 12:12:13 2022
    ram@zedat.fu-berlin.de (Stefan Ram) writes:
    I waited a few days before answering to allow
    sufficient time to think about the problem.

    (A few lines in this post contain more than 72 characters.)

    I sometimes look at records of events like when someone got
    up or went to bed on different days and wonder about the
    "average time" he gets up or goes to bed. Some years ago,
    I asked in another newsgroup how to average times, and from
    answers there I kept this expression for a spreadsheet and
    two times in cells A1 and B1:

    REST(1.5+1/6.28318530717959*ARCTAN2((COS((A1-0.5)*6.28318530717959)+COS((B1-0.5)*6.28318530717959))/2,(SIN((A1-0.5)*6.28318530717959)+SIN((B1-0.5)*6.28318530717959))/2),1)

    . I added a bit, like "REST", but the trigonometric part
    came from that newsgroup. (Times t of a day in spreadsheets
    often are 0 <= t < 1 because they often use the day as the
    unit of time). At that time, I did not understand what was
    going on! In hindsight, today, I see that this seems to
    calculate the average via the average of the position of
    the two points on a 24-hour clock in two dimensions.
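    [Stripped of spreadsheet syntax, the formula is exactly that: the
    two-point circular mean. A hypothetical Python transcription, with
    times as day fractions in [0, 1) as spreadsheets store them; note
    that the spreadsheet ATAN2 takes (x, y) where Python's math.atan2
    takes (y, x):]

```python
import math

TAU = 2 * math.pi

def spreadsheet_mean(a1, b1):
    # Transcription of the REST(1.5 + ATAN2(...)/TAU, 1) formula.
    # The -0.5 / +1.5 offsets cancel modulo 1; they only shift which
    # branch of the arctangent is exercised.
    x = (math.cos((a1 - 0.5) * TAU) + math.cos((b1 - 0.5) * TAU)) / 2
    y = (math.sin((a1 - 0.5) * TAU) + math.sin((b1 - 0.5) * TAU)) / 2
    return (1.5 + math.atan2(y, x) / TAU) % 1
```

    [E.g. spreadsheet_mean(23.5/24, 1.5/24) gives 0.5/24, i.e. half past
    midnight.]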

    The corresponding Wikipedia page is called, "Mean of circular
    quantities".

    I also have a copy of a page "Circular Values Math and
    Statistics with C++11" saved. Other pages I found were
    "Averages-Mean angle - Rosetta Code", "Averaging Angles -
    The Math Doctors", or "algorithm - How do you calculate
    the average of a set of angles - Stack Overflow". Maybe
    some of these pages can still be found on the Web today.

    There was also another web page that used lines some of
    which have more than 72 characters. It contained,

    |A good way to estimate an average angle, A, from a set of angle
    |measurements a[i] 0<=i<N, is:
    |
    | sum_i_from_1_to_N sin(a[i])
    | a = arctangent ---------------------------
    | sum_i_from_1_to_N cos(a[i])
    |
    |A very careful study of the properties of this estimator is in the
    |book "Statistics On Spheres", Geoffrey S. Watson, University of
    |Arkansas Lecture Notes in the Mathematical Sciences, 1983 John
    |Wiley & Sons, Inc.

    .

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Stefan Ram on Fri Dec 30 01:00:43 2022
    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    ram@zedat.fu-berlin.de (Stefan Ram) writes:
    I waited a few days before answering to allow
    sufficient time to think about the problem.

    (A few lines in this post contain more than 72 characters.)

    ??? That's an odd thing to say!

    I sometimes look at records of events like when someone got
    up or went to bed on different days and wonder about the
    "average time" he gets up or goes to bed. Some years ago,
    I asked in another newsgroup how to average times, and from
    answers there I kept this expression for a spreadsheet and
    two times in cells A1 and B1:

    REST(1.5+1/6.28318530717959*ARCTAN2((COS((A1-0.5)*6.28318530717959)+COS((B1-0.5)*6.28318530717959))/2,(SIN((A1-0.5)*6.28318530717959)+SIN((B1-0.5)*6.28318530717959))/2),1)

    I'm not going to try to unravel that (it is most likely the
    conventional atan2 of the sum of sines and cosines), but the problem
    is barely worth considering for two times. What did you do with more
    than two?

    . I added a bit, like "REST", but the trigonometric part
    came from that newsgroup. (Times t of a day in spreadsheets
    often are 0 <= t < 1 because they often use the day as the
    unit of time). At that time, I did not understand what was
    going on! In hindsight, today, I see that this seems to
    calculate the average via the average of the position of
    the two points on a 24-hour clock in two dimensions.

    The corresponding Wikipedia page is called, "Mean of circular
    quantities".

    I also have a copy of a page "Circular Values Math and
    Statistics with C++11" saved. Other pages I found were
    "Averages-Mean angle - Rosetta Code", "Averaging Angles -
    The Math Doctors", or "algorithm - How do you calculate
    the average of a set of angles - Stack Overflow". Maybe
    some of these pages can still be found on the Web today.

    There was also another web page that used lines some of
    which have more than 72 characters. It contained,

    |A good way to estimate an average angle, A, from a set of angle
    |measurements a[i] 0<=i<N, is:
    |
    | sum_i_from_1_to_N sin(a[i])
    | a = arctangent ---------------------------
    | sum_i_from_1_to_N cos(a[i])

    I think it's easier to think in terms of vectors (for one thing you get
    a useful magnitude as well). Also the programmer's atan2 is more
    helpful here than the mathematician's arctan.

    I found this problem interesting but only later in the discussion as I
    have been using this metric for some time. What got interesting (to me)
    was that there is another sound interpretation of the average as
    suggested by Tim, ironically prompted by a general definition of what
    might constitute an average that I had posted and failed to follow
    through on.

    So thank you for bringing it up.

    |A very careful study of the properties of this estimator is in the
    |book "Statistics On Spheres", Geoffrey S. Watson, University of
    |Arkansas Lecture Notes in the Mathematical Sciences, 1983 John
    |Wiley & Sons, Inc.

    Rant: This book is now out of print and effectively dead. What a shame.
    I can't imagine that either the author or the publisher (John Wiley and
    Sons) hold out much hope of making any more money from it, so with
    things as they stand, all that person's hard work is now lost to anyone
    not lucky enough to have access to a library that happens to have it,
    or patient enough to wait (and prepared to pay) for the inter-library
    loan system to work its magic.

    And for people nowhere near any library, tough!

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Thu Dec 29 23:25:26 2022
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    [...]

    I found this problem interesting but only later in the discussion as
    I have been using this metric for some time. What got interesting
    (to me) was that there is another sound interpretation of the
    average as suggested by Tim, [...]

    It wasn't my idea. I got it from a posting by Mike Terry on
    December 21. I hadn't seen that formulation of arithmetic mean
    before and I was amazed that it worked. So I can't really take
    any credit for the suggestion.

    ironically prompted by a general definition of what might
    constitute an average that I had posted and failed to follow
    through on.

    I remember your posting as coming after the one by Mike Terry,
    and so I thought your comments were derived from his. Sorry if
    my conclusions there were off the mark.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Stefan Ram on Fri Dec 30 05:07:45 2022
    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    [edited for brevity]

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    I waited a few days before answering to allow
    sufficient time to think about the problem.

    I sometimes look at records of events like when someone got up
    or went to bed on different days and wonder about the "average
    time" he gets up or goes to bed. Some years ago, I asked in
    another newsgroup how to average times, and from answers there
    I [found a long and somewhat forbidding formula]

    At that time, I did not understand what was going on! In
    hindsight, today, I see that this seems to calculate the
    average via the average of the position of the two points on a
    24-hour clock in two dimensions.

    [..other references to the same general idea..]

    Several people, including myself, thought of this idea, suitably
    generalized to N points rather than two. Two problems:

    One, there is no way of knowing this idea is what you were
    looking for. Basically your puzzle was "Can you guess what I'm
    thinking?", without giving any useful clues to solve the puzzle.
    At the very least, it is misleading to call it a "puzzle".

    Two, this approach gives unexpected (or counter-intuitive)
    results for many ordinary cases where the times have at least
    some clustering (for example, all times within a 4-hour range, or
    potentially even a 15-hour range). In cases like that, which are
    likely to be common for most real-world data, using geometric
    averaging gives an inferior result.

    Besides those drawbacks, it looks like you didn't even bother to
    read the responses. As far as the outside world can tell, your
    purpose in posting was just an exercise in self-congratulations.

    I know you can do better. And I hope that in the future you will
    make an effort to do so.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Fri Dec 30 14:04:16 2022
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    [...]

    I found this problem interesting but only later in the discussion as
    I have been using this metric for some time. What got interesting
    (to me) was that there is another sound interpretation of the
    average as suggested by Tim, [...]

    It wasn't my idea. I got it from a posting by Mike Terry on
    December 21. I hadn't seen that formulation of arithmetic mean
    before and I was amazed that it worked. So I can't really take
    any credit for the suggestion.

    ironically prompted by a general definition of what might
    constitute an average that I had posted and failed to follow
    through on.

    I remember your posting as coming after the one by Mike Terry,
    and so I thought your comments were derived from his. Sorry if
    my conclusions there were off the mark.

    You are right in that MT posted the same general formulation 9 hours
    before I did (though I'd not seen that). My confusion came from your explanation, to me, of "conventional average":

    "Sorry, I meant to refer to your formulation of average"

    followed by the formula I gave rather than the entirely equivalent one
    given by MT.

    Anyway, the credit I'm giving is for your considering this a reasonable
    thing to try to calculate for arc lengths, rather than the vector average
    that minimises a different measure altogether. Maybe MT also suggested
    that as an alternative but I only remember his championing the vector interpretation. My apologies to MT if he did that as well.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From =?UTF-8?B?xo4=?=@21:1/5 to All on Fri Dec 30 05:59:32 2022


    Richard Heathfield wrote on Wednesday, 14 December 2022 at 15:10:58 UTC+2:
    On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
    On 2022-12-14 13:24, Stefan Ram wrote:
    Given n times of the 24-hour day, print their average.

    For example, the average of "eight o'clock" and
    "ten o'clock" (n=2) would be "nine o'clock".

    You probably missed to require the interesting part: doing all
    that in the modular type (modulo 24) arithmetic:

    20 + 5 = 1 (mod 24)
    ...which will give you the wrong answer. Chase that goose!
    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Ben Bacarisse on Sat Dec 31 00:24:34 2022
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    A couple of further thoughts only one of which is directed at you
    Tim (at the end)...

    Given the general conception of a mean (rather than any other kind of
    summary statistic) as minimising the sum of squares of some "distance"
    metric:

    Sum_{i=1,n} difference(A, t(i))^2

    we can characterise the two contenders by what distance is being
    minimised. For the less well discussed "conventional average" (to use
    your terminology) we are minimising the sum of squares of the shorter
    arc lengths between A and the t(i).

    For the "vector average", we convert the t(i) to unit vectors u(i) and
    we calculate the mean of the u(i) to get a vector m. The "average", A,
    is just the direction of this vector -- another point on the unit
    circle. In this case we are minimising the sum of squares of the
    /chord/ lengths between A and the t(i).

    The mean vector itself (which may not lie on the unit circle) minimises
    the sum of the squares of the length of the vector differences m-u(i):

    Sum_{i=1,n} |m - u(i)|^2

    and any other vector along the same line as m (i.e. c*m for real c)
    minimises

    Sum_{i=1,n} |c*m - u(i)|^2

    This includes, of course, m projected out to the unit circle.

    This distinction between arc lengths and chord lengths helps to
    visualise where these averages differ, and why the conventional average
    may seem more intuitive.

    Incidentally, I found another book on statistics on spheres, and that
    gets the average over in a few paragraphs. It states, without
    considering any alternatives, that the average is the vector average. I
    can't find anyone using or citing the arc-length minimising average,
    despite it being natural on a sphere to find the mid-point of, say, a
    set of points on the Earth's surface.

    Tim, my best shot at calculating this other average sorts the points to
    find the widest gap. I suspect your algorithm is similar since you say
    it is O(n log n).

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From =?UTF-8?B?xo4=?=@21:1/5 to Dmitry A. Kazakov on Fri Dec 30 18:18:16 2022



    On Wednesday, December 21, 2022 at 11:54:47 PM UTC+2, Dmitry A. Kazakov wrote:
    On 2022-12-21 20:55, Mike Terry wrote:

    The OP was probably deliberately rather vague on this point! The
    easiest definition for "average" literally doesn't make sense in a
    modulo context: we can ADD the modular values, but in general dividing
    in such a mathematical setting doesn't make much sense, so

    Average ([i=1,2,..n] x_i) =[def]= Sum ([i=1,2,..n] x_i) /n

    would be inappropriate due to the /n operation.
    Unless you calculate everything as reals or integers and then take the remainder of 24. Which is kind of the most natural definition, at least to me.
    HOWEVER, there's another characterisation for the average of a set,
    which looks more promising in a modular (or other more general)
    setting: the average is the value of V which MINIMISES THE "ERROR" calculated by

    error = Sum ([i=1,2,..n] (V - x_i)^2)

    that is, minimises the sum of squares of differences from V.
    This has exactly the same problem as above, because subtraction and multiplication (power of two) have different semantics in modular
    arithmetic.
    [Incidentally, if we minimise the sum of absolute differences from V,
    that characterises the mode (aka "midpoint") of the sample, but I think
    the median is more what is wanted here...]
    Actually it is the same beast. Median is the least C-norm defined as |x-y|.

    Mean is the least Euclidean norm (squares), the thing you proposed.

    Mean and average are the same for real numbers.

    You are right that median (and C-norm) can be unambiguously defined in modular numbers. But nobody would ever qualify this as a puzzle! (:-))

    [...]
    Now, as to how to CALCULATE the V above???
    As I said, in real numbers V is the mean, i.e.

    V = Sum_{i=1..n} Xi / n =def= mean({Xi | i=1..n})

    Any numeric method is in the end mapping Xi to some normal number, calculating a classic mean, mapping back. The problem is with the
    forward mapping being ambiguous. One method to disambiguate is staying
    within the same day. Another could be taking a median and considering 12 hours before and 12 hours after it. And there are many more.

    P.S. There are differences between the average and mean. OP referred to the average, which may mean something completely counterintuitive (no pun intended (:-))
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sat Dec 31 06:20:03 2022
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    [...]

    I found this problem interesting but only later in the discussion as
    I have been using this metric for some time. What got interesting
    (to me) was that there is another sound interpretation of the
    average as suggested by Tim, [...]

    It wasn't my idea. I got it from a posting by Mike Terry on
    December 21. I hadn't seen that formulation of arithmetic mean
    before and I was amazed that it worked. So I can't really take
    any credit for the suggestion.

    ironically prompted by a general definition of what might
    constitute an average that I had posted and failed to follow
    through on.

    I remember your posting as coming after the one by Mike Terry,
    and so I thought your comments were derived from his. Sorry if
    my conclusions there were off the mark.

    You are right in that MT posted the same general formulation 9 hours
    before I did (though I'd not seen that). My confusion came from your explanation, to me, of "conventional average":

    "Sorry, I meant to refer to your formulation of average"

    followed by the formula I gave rather than the entirely equivalent
    one given by MT.

    Ahh, that makes sense. My comment muddied the waters; I was
    thinking of your formula and the MT formula as the same, and
    rather inadvertently said "your formulation" (after all, I was
    responding to and had quoted your formula) without mentioning
    that I had also read MT's post before responding.

    Anyway, the credit I'm giving is for your considering this a
    reasonable thing to try to calculate for arc lengths, rather than
    [...]

    Thank you for that. And in that respect I think I can modestly
    say that some credit is deserved (especially considering the
    amount of effort it took to code something up that calculated
    this measure).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sat Dec 31 06:42:15 2022
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    A couple of further thoughts only one of which is directed at you
    Tim (at the end)...

    Given the general conception of a mean (rather than any other kind of
    summary statistic) as minimising the sum of squares of some "distance" metric:

    Sum_{i=1,n} difference(A, t(i))^2

    we can characterise the two contenders by what distance is being
    minimised. For the less well discussed "conventional average" (to use
    your terminology) we are minimising the sum of squares of the shorter
    arc lengths between A and the t(i).

    For the "vector average", we convert the t(i) to unit vectors u(i) and
    we calculate the mean of the u(i) to get a vector m. The "average", A,
    is just the direction of this vector -- another point on the unit
    circle. In this case we are minimising the sum of squares of the
    /chord/ lengths between A and the t(i).

    I think of this approach differently. I take the time values
    t(i) as being unit masses on the unit circle, and calculate the
    center of mass. As long as the center of mass is not the origin
    we can project it from the origin to find a corresponding time
    value on the unit circle (which in my case is done implicitly by
    using atan2()).

    The mean vector itself (which may not lie on the unit circle) minimises
    the sum of the squares of the length of the vector differences m-u(i):

    Sum_{i=1,n} |m - u(i)|^2

    That's true but I thought of it just as the average of the time
    value "masses" on the unit circle.

    and any other vector along the same line as m (i.e. c*m for real c)
    minimises

    Sum_{i=1,n} |c*m - u(i)|^2

    I'm not sure what you're saying here. Only the one point (m in
    your terminology) minimizes the sum of squares of distances. How
    do other points on the same line qualify?

    This includes, of course, m projected out to the unit circle.

    This distinction between arc lengths and chord lengths helps to
    visualise where these averages differ, and why the conventional
    average may seem more intuitive.

    Interesting perspective. I wouldn't call them chord lengths
    because I think of a chord as being between two points both on
    the same circle, and the center of mass is never on the unit
    circle (not counting the case when all the time values are the
    same). Even so it's an interesting way to view the distinction.

    Incidentally, I found another book on statistics on spheres, and that
    gets the average over in a few paragraphs. It states, without
    considering any alternatives, that the average is the vector average. I can't find anyone using or citing the arc-length minimising average,
    despite it being natural on a sphere to find the mid-point of, say, a
    set of points on the Earth's surface.

    Do you mind my asking, which book was that?

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant. But it could be relevant in some contexts (thinking
    in particular of astronautics, or, dare I say it, rocket science).

    Tim, my best shot at calculating this other average sorts the points to
    find the widest gap. I suspect your algorithm is similar since you say
    it is O(n log n).

    Your instinct that I sorted the time values is right. If you
    think a little more I think you will see that the widest gap need
    not have any particular relationship to where the cut point is.
    No further elaboration for now so as not to be a spoiler (and I
    know you like to try figuring things out yourself).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Tim Rentsch on Sat Dec 31 17:04:47 2022
    On 2022-12-31 15:42, Tim Rentsch wrote:
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    For the "vector average", we convert the t(i) to unit vectors u(i) and
    we calculate the mean of the u(i) to get a vector m. The "average", A,
    is just the direction of this vector -- another point on the unit
    circle. In this case we are minimising the sum of squares of the
    /chord/ lengths between A and the t(i).

    I think of this approach differently. I take the time values
    t(i) as being unit masses on the unit circle, and calculate the
    center of mass. As long as the center of mass is not the origin
    we can project it from the origin to find a corresponding time
    value on the unit circle (which in my case is done implicitly by
    using atan2()).

    Center of mass of a set of ideal points (particles) and vector average
    are the same:

    CoM = Sum_{i=1..n} Mi * Ri / Sum_{i=1..n} Mi

    Mi = masses, Ri = vectors. If all Mi are the same you get

    CoM = Sum_{i=1..n} Ri / n

    This distinction between arc lengths and chord lengths helps to
    visualise where these averages differ, and why the conventional
    average may seem more intuitive.

    Interesting perspective. I wouldn't call them chord lengths
    because I think of a chord as being between two points both on
    the same circle, and the center of mass is never on the unit
    circle (not counting the case when all the time values are the
    same). Even so it's an interesting way to view the distinction.

    Arc length is proportional to angle:

    L = Rα, R is radius, α is angle in radians.

    Averaging arcs is equivalent to averaging angles.

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant.

    See above, it is just angles to average.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Sun Jan 1 01:10:57 2023
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    A couple of further thoughts only one of which is directed at you
    Tim (at the end)...

    Given the general conception of a mean (rather than any other kind of
    summary statistic) as minimising the sum of squares of some "distance"
    metric:

    Sum_{i=1,n} difference(A, t(i))^2

    we can characterise the two contenders by what distance is being
    minimised. For the less well discussed "conventional average" (to use
    your terminology) we are minimising the sum of squares of the shorter
    arc lengths between A and the t(i).

    For the "vector average", we convert the t(i) to unit vectors u(i) and
    we calculate the mean of the u(i) to get a vector m. The "average", A,
    is just the direction of this vector -- another point on the unit
    circle. In this case we are minimising the sum of squares of the
    /chord/ lengths between A and the t(i).

    I think of this approach differently. I take the time values
    t(i) as being unit masses on the unit circle, and calculate the
    center of mass. As long as the center of mass is not the origin
    we can project it from the origin to find a corresponding time
    value on the unit circle (which in my case is done implicitly by
    using atan2()).

    It's the calculation though.

    The mean vector itself (which may not lie on the unit circle) minimises
    the sum of the squares of the length of the vector differences m-u(i):

    Sum_{i=1,n} |m - u(i)|^2

    That's true but I thought of it just as the average of the time
    value "masses" on the unit circle.

    and any other vector along the same line as m (i.e. c*m for real c)
    minimises

    Sum_{i=1,n} |c*m - u(i)|^2

    I'm not sure what you're saying here. Only the one point (m in
    your terminology) minimizes the sum of squares of distances. How
    do other points on the same line qualify?

    I over-simplified by omitting "compared to any other point on that
    circle". The mean vector is the absolute minimum, but at any radius
    |c*m| the sum above is at a minimum at c*m.
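    A quick expansion, added here since the claim is easy to verify
    algebraically (p any point, each u(i) a unit vector, m their mean):

```latex
\sum_{i=1}^{n} \lvert p - u(i) \rvert^2
  = n\lvert p\rvert^2 \;-\; 2\,p \cdot \sum_{i=1}^{n} u(i) \;+\; \sum_{i=1}^{n} \lvert u(i)\rvert^2
  = n\lvert p\rvert^2 \;+\; n \;-\; 2n\,(p \cdot m)
```

    For fixed |p| only the last term varies, so the sum is minimised by making
    p.m as large as possible, i.e. by taking p in the direction of m (c*m with
    c > 0). In particular, among all points at radius |c*m|, the point c*m
    itself minimises the sum.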

    This includes, of course, m projected out to the unit circle.

    This distinction between arc lengths and chord lengths helps to
    visualise where these averages differ, and why the conventional
    average may seem more intuitive.

    Interesting perspective. I wouldn't call them chord lengths
    because I think of a chord as being between two points both on
    the same circle,

    m (the vector) is the centre of mass, but for some c, c*m lies on the
    circle. All the lines from the data points to c*m are actual chords.
    The average "time", A, is the point on the circle that minimises the sum
    of squares of the chords between A and the t(i).

    The "other" average, minimises the sum of the of the angular distances.

    and the center of mass is never on the unit
    circle (not counting the case when all the time values are the
    same). Even so it's an interesting way to view the distinction.

    It's more interesting because it really is about chords!

    Incidentally, I found another book on statistics on spheres, and that
    gets the average over in a few paragraphs. It states, without
    considering any alternatives, that the average is the vector average. I
    can't find anyone using or citing the arc-length minimising average,
    despite it being natural on a sphere to find the mid-point of, say, a
    set of points on the Earth's surface.

    Do you mind my asking, which book was that?

    http://library.sadjad.ac.ir/opac//temp/19108.pdf

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant. But it could be relevant in some contexts (thinking
    in particular of astronautics, or, dare I say it, rocket science).

    Tim, my best shot at calculating this other average sorts the points to
    find the widest gap. I suspect your algorithm is similar since you say
    it is O(n log n).

    Your instinct that I sorted the time values is right. If you
    think a little more I think you will see that the widest gap need
    not have any particular relationship to where the cut point is.
    No further elaboration for now so as not to be a spoiler (and I
    know you like to try figuring things out yourself).

    You must be confusing me with someone else!

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Dmitry A. Kazakov on Sun Jan 1 01:24:27 2023
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2022-12-31 15:42, Tim Rentsch wrote:
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    For the "vector average", we convert the t(i) to unit vectors u(i) and
    we calculate the mean of the u(i) to get a vector m. The "average", A,
    is just the direction of this vector -- another point on the unit
    circle. In this case we are minimising the sum of squares of the
    /chord/ lengths between A and the t(i).
    I think of this approach differently. I take the time values
    t(i) as being unit masses on the unit circle, and calculate the
    center of mass. As long as the center of mass is not the origin
    we can project it from the origin to find a corresponding time
    value on the unit circle (which in my case is done implicitly by
    using atan2()).

    Center of mass of a set of ideal points (particles) and vector average are the same:

    CoM = Sum_{i=1..n} Mi * Ri / Sum_{i=1..n} Mi

    Mi = masses, Ri = vectors. If all Mi are the same you get

    CoM = Sum_{i=1..n} Ri / n

    This distinction between arc lengths and chord lengths helps to
    visualise where these averages differ, and why the conventional
    average may seem more intuitive.
    Interesting perspective. I wouldn't call them chord lengths
    because I think of a chord as being between two points both on
    the same circle, and the center of mass is never on the unit
    circle (not counting the case when all the time values are the
    same). Even so it's an interesting way to view the distinction.

    Arc length is proportional to angle:

    L = Rα, R is radius, α is angle in radians.

    Averaging arcs is equivalent to averaging angles.

    Sure. I'm certain Tim knows that! The question is what "average" are
    we interested in. An average, angle A, minimises

    Sum_{i=1,n} distance(A, t(i))^2

    for some measure of distance. The most common circular (or spherical)
    average, sums the unit vectors and, where possible, projects the
    resultant vector onto the unit circle. This average, it turns out,
    minimises the sum of squares of the chords between A and the t(i).

    Another average, which is more complicated to calculate, minimises the
    sum of squares of the arc lengths or, as you say, the angular distances
    between A and the t(i).

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant.

    See above, it is just angles to average.

    Yes, but that's hard. The simple average (just sum the unit vectors)
    does not minimise the sum of squares of the angle differences, and,
    despite being a reasonable concept, is not discussed in any of the texts
    I've been able to find.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sun Jan 8 07:17:25 2023
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    A couple of further thoughts only one of which is directed at you
    Tim (at the end)...

    Given the general conception of a mean (rather than any other kind
    of summary statistic) as minimising the sum of squares of some
    "distance" metric:

    Sum_{i=1,n} difference(A, t(i))^2

    we can characterise the two contenders by what distance is being
    minimised. For the less well discussed "conventional average" (to
    use your terminology) we are minimising the sum of squares of the
    shorter arc lengths between A and the t(i).

    For the "vector average", we convert the t(i) to unit vectors u(i)
    and we calculate the mean of the u(i) to get a vector m. The
    "average", A, is just the direction of this vector -- another
    point on the unit circle. In this case we are minimising the sum
    of squares of the /chord/ lengths between A and the t(i).

    I think of this approach differently. I take the time values
    t(i) as being unit masses on the unit circle, and calculate the
    center of mass. As long as the center of mass is not the origin
    we can project it from the origin to find a corresponding time
    value on the unit circle (which in my case is done implicitly by
    using atan2()).

    It's the [same] calculation though.

    Yes, my first sentence there was meant to convey that I am
    offering an alternate formulation that is mathematically
    equivalent, which I expected would be self-evident.

    The mean vector itself (which may not lie on the unit circle)
    minimises the sum of the squares of the length of the vector
    differences m-u(i):

    Sum_{i=1,n} |m - u(i)|^2

    That's true but I thought of it just as the average of the time
    value "masses" on the unit circle.

    and any other vector along the same line as m (i.e. c*m for real
    c) minimises

    Sum_{i=1,n} |c*m - u(i)|^2

    I'm not sure what you're saying here. Only the one point (m in
    your terminology) minimizes the sum of squares of distances. How
    do other points on the same line qualify?

    I over-simplified by omitting "compared to any other point on that
    circle". The mean vector is the absolute minimum, but at any radius
    |c*m| the sum above is at a minimum at c*m.

    I see. Not obvious to me that this is true (and I didn't see an
    easy proof either). OTOH certainly it is plausibly true, so for
    now I can take your word for it.

    This includes, of course, m projected out to the unit circle.

    This distinction between arc lengths and chord lengths helps to
    visualise where these averages differ, and why the conventional
    average may seem more intuitive.

    Interesting perspective. I wouldn't call them chord lengths
    because I think of a chord as being between two points both on
    the same circle,

    m (the vector) is the centre of mass, but for some c, c*m lies on
    the circle. All the lines from the data points to c*m are actual
    chords. The average "time", A, is the point on the circle that
    minimises the sum of squares of the chords between A and the t(i).

    Minor point: the center of mass is a point, not a vector.
    Because the circle is centered on the origin, it happens that the
    point and the vector from the origin to the point have the same
    coordinate values, but still points are different from vectors.

    The "other" average, minimises the sum of the of the angular
    distances.

    I think you mean it minimizes the sum of the squares of the
    angular distances.

    and the center of mass is never on the unit
    circle (not counting the case when all the time values are the
    same). Even so it's an interesting way to view the distinction.

    It's more interesting because it really is about chords!

    I don't buy the "really" here. The center of mass is fundamental
    and always well-defined. Furthermore it works for collections of
    points that are not all on the same circle or sphere. Looking at
    the problem in terms of chords seems somewhat artificial, or at
    least less fundamental.

    Incidentally, I found another book on statistics on spheres, and
    that gets the average over in a few paragraphs. It states,
    without considering any alternatives, that the average is the
    vector average. I can't find anyone using or citing the
    arc-length minimising average, despite it being natural on a
    sphere to find the mid-point of, say, a set of points on the
    Earth's surface.

    Do you mind my asking, which book was that?

    http://library.sadjad.ac.ir/opac//temp/19108.pdf

    That's great, thank you.

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant. But it could be relevant in some contexts (thinking
    in particular of astronautics, or, dare I say it, rocket science).

    Tim, my best shot at calculating this other average sorts the
    points to find the widest gap. I suspect your algorithm is
    similar since you say it is O(n log n).
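    For reference, one way to compute this arc-length-minimising average is a
    brute-force sweep over cut points (a sketch of the idea only; it is O(n^2),
    not necessarily the O(n log n) algorithm discussed here, and the function
    name is mine):

    ```python
    import math

    def arc_sse_average(angles):
        """Angle minimising the sum of squared circular (mod 2*pi)
        distances to the inputs.  Brute force: for each candidate cut
        point in the sorted data, linearise the circle there and take
        the ordinary mean; keep the linearisation with the least sum
        of squared deviations.  O(n^2); prefix sums would give O(n log n)."""
        tau = 2 * math.pi
        pts = sorted(a % tau for a in angles)
        n = len(pts)
        best_sse, best_mean = float("inf"), 0.0
        for cut in range(n):
            # points below the cut are wrapped forward by one full turn
            lin = [pts[i] + (tau if i < cut else 0.0) for i in range(n)]
            m = sum(lin) / n
            sse = sum((x - m) ** 2 for x in lin)
            if sse < best_sse:
                best_sse, best_mean = sse, m % tau
        return best_mean
    ```

    With times mapped to angles (hours * 2*pi / 24), two values straddling
    midnight average to midnight rather than noon, as one would hope.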

    Your instinct that I sorted the time values is right. If you
    think a little more I think you will see that the widest gap need
    not have any particular relationship to where the cut point is.
    No further elaboration for now so as not to be a spoiler (and I
    know you like to try figuring things out yourself).

    You must be confusing me with someone else!

    Oh. Maybe it was your brother posting in those other messages I
    saw coming from your address.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Dmitry A. Kazakov on Sun Jan 8 07:45:23 2023
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2022-12-31 15:42, Tim Rentsch wrote:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    For the "vector average", we convert the t(i) to unit vectors u(i) and
    we calculate the mean of the u(i) to get a vector m. The "average", A,
    is just the direction of this vector -- another point on the unit
    circle. In this case we are minimising the sum of squares of the
    /chord/ lengths between A and the t(i).

    I think of this approach differently. I take the time values
    t(i) as being unit masses on the unit circle, and calculate the
    center of mass. As long as the center of mass is not the origin
    we can project it from the origin to find a corresponding time
    value on the unit circle (which in my case is done implicitly by
    using atan2()).
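    A minimal sketch of this centre-of-mass reading, assuming times are given
    in hours on a 24-hour clock (the function name is mine):

    ```python
    import math

    def mass_center_average(hours, period=24.0):
        """Place each time value as a unit mass on the unit circle,
        take the centre of mass, and project it back out to the circle
        via atan2 to recover an average time value."""
        n = len(hours)
        mx = sum(math.cos(2 * math.pi * h / period) for h in hours) / n
        my = sum(math.sin(2 * math.pi * h / period) for h in hours) / n
        if mx == 0.0 and my == 0.0:
            raise ValueError("centre of mass at the origin: no direction")
        return (math.atan2(my, mx) * period / (2 * math.pi)) % period
    ```

    For example, eight and ten o'clock average to nine, while 23:00 and
    01:00 average to (approximately) midnight.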

    Center of mass of a set of ideal points (particles) and vector
    average are same:

    Yes, I thought the equivalence is obvious and not in need of
    explanation.

    This distinction between arc lengths and chord lengths helps to
    visualise where these averages differ, and why the conventional
    average may seem more intuitive.

    Interesting perspective. I wouldn't call them chord lengths
    because I think of a chord as being between two points both on
    the same circle, and the center of mass is never on the unit
    circle (not counting the case when all the time values are the
    same). Even so it's an interesting way to view the distinction.

    Arc length is proportional to angle:

    A trivial and useless observation.

    Averaging arcs is equivalent to averaging angles.

    Angles are a one-dimensional measure. Finding an arc length
    "average" of points on a sphere needs a two-dimensional result.

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant.

    See above, it is just angles to average.

    Apparently you have not yet understood the problem. Why don't
    you try writing a program that inputs a set of points normalized
    to be on the unit sphere, and then calculates the arc length
    average point (on the unit sphere) of those input points?

  • From Dmitry A. Kazakov@21:1/5 to Tim Rentsch on Sun Jan 8 17:17:21 2023
    On 2023-01-08 16:45, Tim Rentsch wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    Averaging arcs is equivalent to averaging angles.

    Angles are a one-dimensional measure.

    Averaging arcs is still equivalent to averaging angles, which is a trivial
    result of elementary trigonometry.

    Finding an arc length
    "average" of points on a sphere needs a two-dimensional result.

    Points do not have arcs.

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant.

    See above, it is just angles to average.

    Apparently you have not yet understood the problem.

    Again, averages of arcs and angles are equivalent up to a multiplier.

    Why don't
    you try writing a program that inputs a set of points normalized
    to be on the unit sphere, and then calculates the arc length
    average point (on the unit sphere) of those input points?

    Why don't you write a formula specifying your need?

    Programs are written according to the specifications. Numeric programs
    require a properly stated problem, rather than a bunch of words
    arbitrarily thrown in a meaningless sentence as above.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Ben Bacarisse@21:1/5 to Tim Rentsch on Sun Jan 8 19:43:53 2023
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
    <cut>
    The "other" average, minimises the sum of the of the angular
    distances.

    I think you mean it minimizes the sum of the squares of the
    angular distances.

    Yes thanks. I originally thought I'd write an abbreviation but then I convinced myself I would not forget any of the "squares of"s.
    Proofreading might have been useful too.

    and the center of mass is never on the unit
    circle (not counting the case when all the time values are the
    same). Even so it's an interesting way to view the distinction.

    It's more interesting because it really is about chords!

    I don't buy the "really" here. The center of mass is fundamental
    and always well-defined. Furthermore it works for collections of
    points that are not all on the same circle or sphere. Looking at
    the problem in terms of chords seems somewhat artificial, or at
    least less fundamental.

    There are other ways for something to be interesting (at least to me).
    When you plot the two averages, and look at the chords vs the arcs, it
    becomes clear why one average "looks" more average than the other.

    --
    Ben.

  • From Ben Bacarisse@21:1/5 to Ben Bacarisse on Sun Jan 8 19:59:14 2023
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
    <cut>
    The "other" average, minimises the sum of the of the angular
    distances.

    I think you mean it minimizes the sum of the squares of the
    angular distances.

    Yes thanks. I originally thought I'd write an abbreviation but then I convinced myself I would not forget any of the "squares of"s.
    Proofreading might have been useful too.

    and the center of mass is never on the unit
    circle (not counting the case when all the time values are the
    same). Even so it's an interesting way to view the distinction.

    It's more interesting because it really is about chords!

    I don't buy the "really" here. The center of mass is fundamental
    and always well-defined. Furthermore it works for collections of
    points that are not all on the same circle or sphere. Looking at
    the problem in terms of chords seems somewhat artificial, or at
    least less fundamental.

    There are other ways for something to be interesting (at least to me).
    When you plot the two averages, and look at the chords vs the arcs, it becomes clear why one average "looks" more average than the other.

    I can see a possible misunderstanding here. You might have taken my "it
    really is about chords" to mean "it is fundamentally about chords", but
    I said "it really is about chords" because you thought I'd used the word
    chord incorrectly to refer to the lines between the centre of mass and
    the data points. I was saying "I really did mean chords".

    --
    Ben.

  • From Ben Bacarisse@21:1/5 to Dmitry A. Kazakov on Sun Jan 8 20:41:19 2023
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-08 16:45, Tim Rentsch wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    Averaging arcs is equivalent to averaging angles.
    Angles are a one-dimensional measure.

    Averaging arcs is still equivalent to averaging angles, which is a trivial result of elementary trigonometry.

    Finding an arc length
    "average" of points on a sphere needs a two-dimensional result.

    Points do not have arcs.

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant.

    See above, it is just angles to average.
    Apparently you have not yet understood the problem.

    Again, averages of arcs and angles are equivalent up to a multiplier.

    Why don't
    you try writing a program that inputs a set of points normalized
    to be on the unit sphere, and then calculates the arc length
    average point (on the unit sphere) of those input points?

    Why don't you write a formula specifying your need?

    You seemed to understand the need sufficiently to dismiss the problem: "averaging angles, which is trivial", "it is just angles to average" and "averages of arcs and angles are equivalent up to a multiplier". But
    the problem is /finding/ a specific average -- the point (or angle) that minimises the sum of squares of the distances (or angles) from that
    average point (or angle).

    The fact that it makes no odds (as everyone knows) whether we consider
    angles (often called central angles in this context) or great circle
    distances is not the issue. It's finding the average that minimises the
    sum of squares of differences that's the issue.

    You say you need a formula, so I'll try... Let P_1, ..., P_n be n
    unit vectors specifying n points on a unit sphere. Find the unit vector
    A that minimises

    Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2

    (arctan is the "all quadrant" version that is often called atan2 in
    programming languages.)
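    For what it's worth, the objective in that formula is straightforward to
    transcribe into code (this evaluates the sum only; it is not a solver,
    and the helper names are mine):

    ```python
    import math

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def sum_sq_arc(A, points):
        """Sum over the points P of the squared great-circle (angular)
        distance atan2(|A x P|, A . P) from the candidate unit vector A."""
        total = 0.0
        for p in points:
            c = cross(A, p)
            ang = math.atan2(math.sqrt(dot(c, c)), dot(A, p))
            total += ang * ang
        return total
    ```

    Minimising this over unit vectors A is the hard part: it is a non-linear
    optimisation constrained to the sphere, which is why the problem is not
    "just averaging angles".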

    Programs are written according to the specifications. Numeric programs require a properly stated problem, rather than a bunch of words
    arbitrarily thrown in a meaningless sentence as above.

    Given the context, I think that's a very biased characterisation of
    what's been said here.

    My first job was as a "numerical analyst", and the very first program I
    was employed to write was for a professor of statistics. It was to
    calculate a novel kind of fit line. The specification was just a few sentences. No formulas. It was perfectly clear, and I could get the
    job done. I don't think this is unusual. Words are often enough, and
    they can avoid undue over specification. For example, the problem in
    question is essentially the same if the points are given by latitude and longitude on a non-unit sphere, but the formula would look very
    different.

    --
    Ben.

  • From Richard Heathfield@21:1/5 to Ben Bacarisse on Sun Jan 8 21:14:05 2023
    On 08/01/2023 8:41 pm, Ben Bacarisse wrote:
    The specification was just a few
    sentences.

    There hangs on the wall of $COMPANY$ (a UK insurance company) a
    restaurant's paper napkin in a wooden frame behind glass.
    Scribbled on the napkin in biro are the words "quotes system".

    End of spec.

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From Dmitry A. Kazakov@21:1/5 to Ben Bacarisse on Sun Jan 8 22:31:21 2023
    On 2023-01-08 21:41, Ben Bacarisse wrote:

    But
    the problem is /finding/ a specific average -- the point (or angle) that minimises the sum of squares of the distances (or angles) from that
    average point (or angle).

    That is not a programming problem. It is a problem space's one. You must
    go to the corresponding part of science or engineering or economics etc
    and formulate it there in terms of that space.

    Average is just such formulation of finding a representative member from
    some set having some specific properties. E.g. least squares in physics
    usually has the meaning of least energy etc. So, it goes in this order
    (not in reverse):

    1. You need the least energy representative.
    2. An average gives you that.
    3. You measure the inputs.

    After that you ask yourself given these measures how to compute the
    average? And only now it becomes a programming issue.

    The fact that it makes no odds (as everyone knows) whether we consider
    angles (often called central angles in this context) or great circle distances is not the issue. It's finding the average that minimises the
    sum of squares of differences that's the issue.

    Minimum of squares (Euclidean norm) is represented by the mathematical mean.

    You say you need a formula, so I'll try... Let P_n be a collection of n
    unit vectors specifying n points on a unit sphere. Find the unit vector
    A that minimises

    Sum_{i=1,n} ( arctan( |A x P_n| / A . P_n ) )^2

    (arctan is the "all quadrant" version that is often called atan2 in programming languages.)

    1. atan2 has two arguments (the adjacent and the opposite legs);
    2. What is "A";
    3. What operations(?) "x" and "." denote?

    Programs are written according to the specifications. Numeric programs
    require a properly stated problem, rather than a bunch of words
    arbitrarily thrown in a meaningless sentence as above.

    Given the context, I think that's a very biased characterisation of
    what's been said here.

    My first job was as a "numerical analyst", and the very first program I
    was employed to write was for a professor of statistics. It was to
    calculate a novel kind of fit line. The specification was just a few sentences. No formulas. It was perfectly clear, and I could get the
    job done. I don't think this is unusual. Words are often enough, and
    they can avoid undue over specification. For example, the problem in question is essentially the same if the points are given by latitude and longitude on a non-unit sphere, but the formula would look very
    different.

    This just means that you formulated the specifications yourself, being
    able to read your professor's mind. Not everybody's mind is easy read or
    worth reading... (:-))

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Ben Bacarisse@21:1/5 to Dmitry A. Kazakov on Mon Jan 9 03:25:10 2023
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-08 21:41, Ben Bacarisse wrote:

    But
    the problem is /finding/ a specific average -- the point (or angle) that
    minimises the sum of squares of the distances (or angles) from that
    average point (or angle).

    That is not a programming problem. It is a problem space's one. You
    must go to the corresponding part of science or engineering or
    economics etc and formulate it there in terms of that space.

    I don't follow. It's a mathematical problem for which an algorithm is
    sought. I don't see any connection to some science or engineering
    space. If it /came/ from some scientific study, it has already been
    turned into what the programmer needs to come up with.

    Average is just such formulation of finding a representative member
    from some set having some specific properties. E.g. least squares in
    physics usually has the meaning of least energy etc. So, it goes in
    this order (not in reverse):

    1. You need the least energy representative.
    2. An average gives you that.
    3. You measure the inputs.

    After that you ask yourself given these measures how to compute the
    average? And only now it becomes a programming issue.

    So if I asked you for code to calculate the arithmetic mean of an array
    of numbers you would ask for all this first, declaring it not yet a
    programming issue? I really don't see where all this comes from.

    The fact that it makes no odds (as everyone knows) whether we consider
    angles (often called central angles in this context) or great circle
    distances is not the issue. It's finding the average that minimises the
    sum of squares of differences that's the issue.

    Minimum of squares (Euclidean norm) is represented by the mathematical
    mean.

    This problem obviously uses a different norm.

    You say you need a formula, so I'll try... Let P_1, ..., P_n be n
    unit vectors specifying n points on a unit sphere. Find the unit vector
    A that minimises
    Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
    (arctan is the "all quadrant" version that is often called atan2 in
    programming languages.)

    1. atan2 has two arguments (the adjacent and the opposite legs);

    Obviously I can't guess how much I can safely assume so I expected there
    might be questions. arctan(p / q) is usually coded as atan2(p, q).

    2. What is "A";

    The unit vector that is being sought -- the result of the algorithm. It represents a point on the unit sphere that is the desired average of the
    input points.

    3. What operations(?) "x" and "." denote?

    x denotes the cross product, and . the dot product.

    Do you agree that this is not a trivial problem? If not, please give
    the trivial algorithm!

    Programs are written according to the specifications. Numeric programs
    require a properly stated problem, rather than a bunch of words
    arbitrarily thrown in a meaningless sentence as above.
    Given the context, I think that's a very biased characterisation of
    what's been said here.
    My first job was as a "numerical analyst", and the very first program I
    was employed to write was for a professor of statistics. It was to
    calculate a novel kind of fit line. The specification was just a few
    sentences. No formulas. It was perfectly clear, and I could get the
    job done. I don't think this is unusual. Words are often enough, and
    they can avoid undue over specification. For example, the problem in
    question is essentially the same if the points are given by latitude and
    longitude on a non-unit sphere, but the formula would look very
    different.

    This just means that you formulated the specifications yourself, being
    able to read your professor's mind.

    No, I did not have to read his mind because he told me what he wanted.
    To my mind "find the point on the sphere that minimises the sum of the
    squares of the great-circle distances between that point and the input
    points" is clearer than the formula I gave because the formula says too
    much.

    Of course, no specification is 100% precise. You didn't know what x and
    . denoted above and I have answered you with mere words. Will you need
    the formulas for X x Y and X . Y before accepting the specification?
    What symbols in /those/ formulas might you want to have formally
    specified?

    Not everybody's mind is easy read or worth reading... (:-))

    --
    Ben.

  • From Dmitry A. Kazakov@21:1/5 to Ben Bacarisse on Mon Jan 9 11:22:58 2023
    On 2023-01-09 04:25, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-08 21:41, Ben Bacarisse wrote:

    But
    the problem is /finding/ a specific average -- the point (or angle) that minimises the sum of squares of the distances (or angles) from that
    average point (or angle).

    That is not a programming problem. It is a problem space's one. You
    must go to the corresponding part of science or engineering or
    economics etc and formulate it there in terms of that space.

    I don't follow. It's a mathematical problem for which an algorithm is sought.

    Firstly, this is comp.programming group. Secondly, if algorithm is
    sought then there must be a mathematical formulation of what a numeric
    solution is sought for.

    (Finding a minimum is an optimization problem, there are thousands of algorithms already)

    I don't see any connection to some science or engineering
    space. If it /came/ from some scientific study, it has already been
    turned into what the programmer needs come up with.

    Average is just such formulation of finding a representative member
    from some set having some specific properties. E.g. least squares in
    physics usually has the meaning of least energy etc. So, it goes in
    this order (not in reverse):

    1. You need the least energy representative.
    2. An average gives you that.
    3. You measure the inputs.

    After that you ask yourself given these measures how to compute the
    average? And only now it becomes a programming issue.

    So if I asked you for code to calculate the arithmetic mean of an array
    of numbers you would ask for all this first, declaring it not yet a programming issue?

    Then I would ask you, if this is a solution, what was the problem?

    I really don't see where all this comes from.

    From an average (:-)) customer who does not mind his business.
    Customers have programming ideas, wrong ideas, ideas they express in
    terms they have no idea of (:-)). It takes time and tact to bring the
    customer back to the field of his expertise and get from him what he
    actually needs in order to solve his actual problem.

    The fact that it makes no odds (as everyone knows) whether we consider
    angles (often called central angles in this context) or great circle
    distances is not the issue. It's finding the average that minimises the sum of squares of differences that's the issue.

    Minimum of squares (Euclidean norm) is represented by the mathematical
    mean.

    This problem obviously uses a different norm.

    I see nothing obvious in an unstated problem.

    You say you need a formula, so I'll try... Let P_1, ..., P_n be n unit vectors specifying n points on a unit sphere. Find the unit vector A that minimises
    Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
    (arctan is the "all quadrant" version that is often called atan2 in
    programming languages.)

    1. atan2 has two arguments (the adjacent and the opposite legs);

    Obviously I can't guess how much I can safely assume so I expected there might be questions. arctan(p / q) is usually coded as atan2(p, q).

    2. What is "A";

    The unit vector that is being sought -- the result of the algorithm. It represents a point on the unit sphere that is the desired average of the input points.

    3. What operations(?) "x" and "." denote?

    x denotes the cross product, and . the dot product.

    Do you agree that this is not a trivial problem? If not, please give
    the trivial algorithm!

    If I correctly understood your explanation (and with a few obvious corrections):

    Sum_{i=1..n} atan2 (|A x Pi|, A · Pi) --> min over {A | A on the circle}

    |A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)

    A · Pi = |A| |Pi| cos θi

    atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'

    Where θi' is θi with lower semicircle mapped to the upper one (for
    whatever reason).

    (If mapping was unintended you must have taken the direction of the
    cross product for the sign for |A x Pi|)

    On a circle varying the angle between A and (R, 0), where R is the
    radius, you find that the mean of the angles between Pi and (R, 0)
    yields the minimum.

    To my mind "find the point on the sphere that minimises the sum of the squares of the great-circle distances between that point and the input points" is clearer than the formula I gave because the formula says too
    much.

    If you mean the circumference, then that is trivially proportional to
    the angle.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Ben Bacarisse@21:1/5 to Dmitry A. Kazakov on Mon Jan 9 20:37:05 2023
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-09 04:25, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-08 21:41, Ben Bacarisse wrote:

    But
    the problem is /finding/ a specific average -- the point (or angle) that minimises the sum of squares of the distances (or angles) from that
    average point (or angle).

    That is not a programming problem. It is a problem space's one. You
    must go to the corresponding part of science or engineering or
    economics etc and formulate it there in terms of that space.
    I don't follow. It's a mathematical problem for which an algorithm is
    sought.

    Firstly, this is comp.programming group.

    Do you consider discussion of algorithms off topic here? Seriously?

    Secondly, if algorithm is sought then there must be a mathematical formulation of what a numeric solution is sought for.

    There needs to be a problem statement that can be discussed and refined.
    The original post about circular averages has led to two different
    refinements -- the vector average (projected onto the unit circle) and
    the arc-length average. Both generalise to more dimensions and I
    happened to mention in passing that the great-circle average would be a
    natural fit in some situations.

    Tim mentioned (again in passing) that it looks hard. You seemed to
    suggest it was not, but I am not sure you knew what the problem was at
    that time.

    All of this seems to me to be exactly what comp.programming should be
    about.

    (Finding a minimum is an optimization problem, there are thousands of algorithms already)

    Programmers should know that specifications are not algorithm designs.
    You would not find a number that minimises the sum of squares of
    differences from an input set by breaking out one of these thousands of minimisation algorithms. You'd sum and divide.

    I don't see any connection to some science or engineering
    space. If it /came/ from some scientific study, it has already been
    turned into what the programmer needs come up with.

    Average is just such formulation of finding a representative member
    from some set having some specific properties. E.g. least squares in
    physics usually has the meaning of least energy etc. So, it goes in
    this order (not in reverse):

    1. You need the least energy representative.
    2. An average gives you that.
    3. You measure the inputs.

    After that you ask yourself given these measures how to compute the
    average? And only now it becomes a programming issue.
    So if I asked you for code to calculate the arithmetic mean of an array
    of numbers you would ask for all this first, declaring it not yet a
    programming issue?

    Then I would ask you, if this is a solution, what was the problem?

    This is a solution to the simpler problem.

    I really don't see where all this comes from.

    From an average (:-)) customer who does not mind his
    business. Customers have programming ideas, wrong ideas, ideas they
    express in terms they have no idea of (:-)). It takes time and tact to
    bring the customer back to the field of his expertise and get from him
    what he actually needs in order to solve his actual problem.

    But we are not in that situation. A few knowledgeable and experienced
    people are discussing a programming problem. You are welcome to join
    in, but the best way for you to do that is to ask whatever questions you
    think would help you understand what's being discussed. So far, your
    replies have alternated between "it's trivial" and "it's unstated".

    The fact that it makes no odds (as everyone knows) whether we consider angles (often called central angles in this context) or great circle
    distances is not the issue. It's finding the average that minimises the sum of squares of differences that's the issue.

    Minimum of squares (Euclidean norm) is represented by the mathematical
    mean.
    This problem obviously uses a different norm.

    I see nothing obvious in an unstated problem.

    It's been stated several times.

    You say you need a formula, so I'll try... Let P_1, ..., P_n be n unit vectors specifying n points on a unit sphere. Find the unit vector A that minimises
    Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
    (arctan is the "all quadrant" version that is often called atan2 in
    programming languages.)

    1. atan2 has two arguments (the adjacent and the opposite legs);
    Obviously I can't guess how much I can safely assume so I expected there
    might be questions. arctan(p / q) is usually coded as atan2(p, q).

    2. What is "A";
    The unit vector that is being sought -- the result of the algorithm. It
    represents a point on the unit sphere that is the desired average of the
    input points.

    3. What operations(?) "x" and "." denote?
    x denotes the cross product, and . the dot product.
    Do you agree that this is not a trivial problem? If not, please give
    the trivial algorithm!

    If I correctly understood your explanation (and with a few obvious corrections):

    Sum_{i=1..n} atan2 (|A x Pi|, A · Pi) --> min over {A | A on the circle}

    No. A is a vector denoting a point on the unit sphere. And you have
    missed out the power of two. Were these changes what you call "obvious corrections"?

    |A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)

    A · Pi = |A| |Pi| cos θi

    atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'

    Where θi' is θi with lower semicircle mapped to the upper one (for
    whatever reason).

    (If mapping was unintended you must have taken the direction of the
    cross product for the sign for |A x Pi|)

    On a circle varying the angle between A and (R, 0), where R is the
    radius, you find that the mean of the angles between Pi and (R, 0)
    yields the minimum.

    This does not even solve the related case of finding the average point
    on the unit circle.

    To my mind "find the point on the sphere that minimises the sum of the
    squares of the great-circle distances between that point and the input
    points" is clearer than the formula I gave because the formula says too
    much.

    If you mean the circumference, then that is trivially proportional to
    the angle.

    You keep making the point that angles are proportional to arcs and/or
    great circle distances. No one disputes this and I even tried at one
    point to keep writing both so you would know that that is not the issue.
    The problem is finding the average.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Ben Bacarisse on Tue Jan 10 09:06:36 2023
    On 2023-01-09 21:37, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-09 04:25, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-08 21:41, Ben Bacarisse wrote:

    But
    the problem is /finding/ a specific average -- the point (or angle) that minimises the sum of squares of the distances (or angles) from that
    average point (or angle).

    That is not a programming problem. It is a problem space's one. You
    must go to the corresponding part of science or engineering or
    economics etc and formulate it there in terms of that space.
    I don't follow. It's a mathematical problem for which an algorithm is
    sought.

    Firstly, this is comp.programming group.

    Do you consider discussing algorithms off topic here? Seriously?

    No, I consider off topic discussing mathematical problems /= algorithms.
    You did not discuss any algorithms so far.

    Secondly, if algorithm is sought then there must be a mathematical
    formulation of what a numeric solution is sought for.

    There needs to be a problem statement that can be discussed and refined.
    The original post about circular averages has led to two different refinements -- the vector average (projected onto the unit circle) and
    the arc-length average. Both generalise to more dimensions and I
    happened to mention in passing that the great-circle average would be a natural fit in some situations.

    I missed generalization to 3D. Sorry.
    [You kept on writing about unit *circle* and an angle, where it actually
    meant unit sphere and polar angles/spherical coordinates]

    Tim mentioned (again in passing) that it looks hard. You seemed to
    suggest it was not, but I am not sure you knew what the problem was at
    that time.

    All of this seems to me to be exactly what comp.programming should be
    about.

    Differential geometry and spherical trigonometry are certainly off topic.

    (Finding a minimum is an optimization problem, there are thousands of
    algorithms already)

    Programmers should know that specifications are not algorithm designs.

    Who said they are?

    I don't see any connection to some science or engineering
    space. If it /came/ from some scientific study, it has already been
    turned into what the programmer needs to come up with.

    Average is just such formulation of finding a representative member
    from some set having some specific properties. E.g. least squares in
    physics usually has the meaning of least energy etc. So, it goes in
    this order (not in reverse):

    1. You need the least energy representative.
    2. An average gives you that.
    3. You measure the inputs.

    After that you ask yourself given these measures how to compute the
    average? And only now it becomes a programming issue.
    So if I asked you for code to calculate the arithmetic mean of an array
    of numbers you would ask for all this first, declaring it not yet a
    programming issue?

    Then I would ask you, if this is a solution, what was the problem?

    This is a solution to the simpler problem.

    And what was that problem? [The adjective adds/clarifies nothing]

    But we are not in that situation. A few knowledgeable and experienced
    people are discussing a programming problem.

    I saw no discussion of programming so far either.

    You are welcome to join
    in, but the best way for you to do that is to ask whatever questions you think would help you understand what's being discussed. So far, your
    replies have alternated between "it's trivial" and "it's unstated".

    Yes, because you wrote about minimum of arc length on a *circle*. And
    that is trivial.

    You say you need a formula, so I'll try... Let P_i, i = 1..n, be a collection of n unit vectors specifying n points on a unit sphere. Find the unit vector A that minimises
    Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
    (arctan is the "all quadrant" version that is often called atan2 in
    programming languages.)

    1. atan2 has two arguments (the adjacent and the opposite legs);
    Obviously I can't guess how much I can safely assume so I expected there might be questions. arctan(p / q) is usually coded as atan2(p, q).

    2. What is "A";
    The unit vector that is being sought -- the result of the algorithm. It represents a point on the unit sphere that is the desired average of the input points.

    3. What operations(?) "x" and "." denote?
    x denotes the cross product, and . the dot product.
    Do you agree that this is not a trivial problem? If not, please give
    the trivial algorithm!

    If I correctly understood your explanation (and with a few obvious
    corrections):

    Sum atan2 (|A x Pi|, A · Pi) --> min
    i=1..n {A | A on the circle}

    No. A is a vector denoting a point on the unit sphere. And you have
    missed out the power of two. Were these changes what you call "obvious corrections"?

    |A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)

    A · Pi = |A| |Pi| cos θi

    atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'

    Where θi' is θi with lower semicircle mapped to the upper one (for
    whatever reason).

    (If mapping was unintended you must have taken the direction of the
    cross product for the sign for |A x Pi|)

    On a circle varying the angle between A and (R, 0), where R is the
    radius, you find that the mean of the angles between Pi and (R, 0)
    yields the minimum.

    This does not even solve the related case of finding the average point
    on the unit circle.

    It elementarily does, for both the formula you gave (after corrections) and the circumference:

    Sum Length_Of_Arc (Pi, A)^2 =
    i=1..n

    = Sum (R Angle (Pi, A))^2 =
    i=1..n

    = R^2 Sum (Angle (Pi, (R, 0)) - Angle (A, (R, 0)))^2
    i=1..n

    (R, 0) denotes vector x = R, y = 0. Let θi denote Angle (Pi, (R, 0)) and
    α denote Angle (A, (R, 0)); then

    = R^2 Sum (θi - α)^2 --> min
    i=1..n

    Derivative d/dα:

    2 R^2 Sum (θi - α) = 0 <=> (R /= 0)
    i=1..n

    <=> Sum θi - nα = 0 <=>
    i=1..n

    <=> α = Sum θi / n = mean ({θi})
    i=1..n

    The angle between A and (R, 0) is the mean of angles of points.
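    [The derivation is easy to confirm numerically, at least when the angles are treated as plain real numbers with no wrap-around at 2π. A sketch with hypothetical values:]

    ```python
    # Numeric check of the derivation above, treating angles as plain
    # reals (no wrap-around at 2*pi): the arithmetic mean minimises the
    # sum of squared differences.
    thetas = [0.2, 0.5, 1.1, 1.7]            # hypothetical angles, radians
    mean = sum(thetas) / len(thetas)

    def cost(alpha):
        return sum((t - alpha) ** 2 for t in thetas)

    # A coarse scan over [0, pi) finds no candidate better than the mean.
    candidates = [i * 0.001 for i in range(3142)]
    best = min(candidates, key=cost)
    ```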

    I can't help you with spherical trigonometry, which is utterly off
    topic anyway. I do not know if there is a direct solution of least
    squares. For two points it is like for a circle. For three points it
    would be the circumcenter of the spherical triangle.

    However even if the direct solution existed, it could be numerically
    unstable as many formulae in spherical geometry are. I used to process
    remote sensing geological data which involved miles long chains of
    cosines and sines. That stuff was no fun.

    Anyway, there are lengthy papers on spherical averages introducing
    iterative solutions, some with source code for the algorithms. That is
    the right way to compute it, if you really wanted...

    [Again, it is not programming. In case of spherical geometry a
    programmer will find an expert mathematician who will have the
    problem properly stated. The next stop is an expert applied
    mathematician who will outline a workable numeric solution. Then the
    programmer can start.]

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Dmitry A. Kazakov on Wed Jan 11 02:41:59 2023
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-09 21:37, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-09 04:25, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-08 21:41, Ben Bacarisse wrote:

    But
    the problem is /finding/ a specific average -- the point (or angle) that minimises the sum of squares of the distances (or angles) from that average point (or angle).

    That is not a programming problem. It is a problem space's one. You
    must go to the corresponding part of science or engineering or
    economics etc and formulate it there in terms of that space.
    I don't follow. It's a mathematical problem for which an algorithm is sought.

    Firstly, this is comp.programming group.
    Do you consider discussing algorithms off topic here? Seriously?

    No, I consider off topic discussing mathematical problems /=
    algorithms. You did not discuss any algorithms so far.

    Algorithms don't come out of thin air. The objective must be discussed
    first. You can't seriously think that everyone here should remain
    silent until, out of nowhere, someone says "consider this algorithm to
    find the average of a set of points".

    Secondly, if algorithm is sought then there must be a mathematical
    formulation of what a numeric solution is sought for.
    There needs to be a problem statement that can be discussed and refined.
    The original post about circular averages has led to two different
    refinements -- the vector average (projected onto the unit circle) and
    the arc-length average. Both generalise to more dimensions and I
    happened to mention in passing that the great-circle average would be a
    natural fit in some situations.

    I missed generalization to 3D. Sorry.
    [You kept on writing about unit *circle* and an angle, where it
    actually meant unit sphere and polar angles/spherical coordinates]

    Both have been discussed, but I think I've been clear about which one I
    am talking about.

    Tim mentioned (again in passing) that it looks hard. You seemed to
    suggest it was not, but I am not sure you knew what the problem was at
    that time.
    All of this seems to me to be exactly what comp.programming should be
    about.

    Differential geometry and spherical trigonometry are certainly off
    topic.

    I don't think I discussed either.

    (Finding a minimum is an optimization problem, there are thousands of
    algorithms already)
    Programmers should know that specifications are not algorithm designs.

    Who said they are?

    I explained in the text you cut. Maybe your reference to the thousands
    of minimisation algorithms was just an off the cuff remark, but it reads
    as if you were suggesting using one of them as the solution to this
    programming task.

    I don't see any connection to some science or engineering
    space. If it /came/ from some scientific study, it has already been
    turned into what the programmer needs to come up with.

    Average is just such formulation of finding a representative member
    from some set having some specific properties. E.g. least squares in physics usually has the meaning of least energy etc. So, it goes in
    this order (not in reverse):

    1. You need the least energy representative.
    2. An average gives you that.
    3. You measure the inputs.

    After that you ask yourself given these measures how to compute the
    average? And only now it becomes a programming issue.
    So if I asked you for code to calculate the arithmetic mean of an array of numbers you would ask for all this first, declaring it not yet a
    programming issue?

    Then I would ask you, if this is a solution, what was the problem?
    This is a solution to the simpler problem.

    And what was that problem? [The adjective adds/clarifies nothing]

    But we are not in that situation. A few knowledgeable and experienced
    people are discussing a programming problem.

    I saw no discussion of programming so far either.

    You are welcome to join
    in, but the best way for you to do that is to ask whatever questions you
    think would help you understand what's being discussed. So far, your
    replies have alternated between "it's trivial" and "it's unstated".

    Yes, because you wrote about minimum of arc length on a *circle*. And
    that is trivial.

    I think I have been clear that the problem you claimed was trivial was
    about points on a sphere.

    But the circle version is not trivial in the sense you seem to think it
    is. What you say below seems to miss the key issue altogether.
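    [For the circle, the two averages under discussion can be contrasted in a few lines. This is only an illustrative sketch with hypothetical inputs; the brute-force search is not proposed as an algorithm:]

    ```python
    import math

    def vector_average(angles):
        """Chord-based average: sum the unit vectors and project the
        result back onto the unit circle."""
        x = sum(math.cos(t) for t in angles)
        y = sum(math.sin(t) for t in angles)
        return math.atan2(y, x)

    def arc_average(angles, steps=3600):
        """Arc-length average: the angle minimising the sum of squared
        angular distances, found by brute-force grid search."""
        two_pi = 2 * math.pi
        def ang_dist(a, b):
            d = abs(a - b) % two_pi
            return min(d, two_pi - d)
        def cost(alpha):
            return sum(ang_dist(alpha, t) ** 2 for t in angles)
        return min((two_pi * i / steps for i in range(steps)), key=cost)
    ```

    [For tightly clustered data the two agree closely; for widely spread data they can differ, which is what makes the circular case non-trivial.]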

    You say you need a formula, so I'll try... Let P_i, i = 1..n, be a collection of n unit vectors specifying n points on a unit sphere. Find the unit vector A that minimises
    Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
    (arctan is the "all quadrant" version that is often called atan2 in programming languages.)

    1. atan2 has two arguments (the adjacent and the opposite legs);
    Obviously I can't guess how much I can safely assume so I expected there might be questions. arctan(p / q) is usually coded as atan2(p, q).

    2. What is "A";
    The unit vector that is being sought -- the result of the algorithm. It represents a point on the unit sphere that is the desired average of the input points.

    3. What operations(?) "x" and "." denote?
    x denotes the cross product, and . the dot product.
    Do you agree that this is not a trivial problem? If not, please give
    the trivial algorithm!

    If I correctly understood your explanation (and with a few obvious
    corrections):

    Sum atan2 (|A x Pi|, A · Pi) --> min
    i=1..n {A | A on the circle}
    No. A is a vector denoting a point on the unit sphere. And you have
    missed out the power of two. Were these changes what you call "obvious
    corrections"?

    |A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)

    A · Pi = |A| |Pi| cos θi

    atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'

    Where θi' is θi with lower semicircle mapped to the upper one (for
    whatever reason).

    (If mapping was unintended you must have taken the direction of the
    cross product for the sign for |A x Pi|)

    On a circle varying the angle between A and (R, 0), where R is the
    radius, you find that the mean of the angles between Pi and (R, 0)
    yields the minimum.
    This does not even solve the related case of finding the average point
    on the unit circle.

    It elementarily does, for both the formula you gave (after corrections) and
    the circumference:

    Sum Length_Of_Arc (Pi, A)^2 =
    i=1..n

    = Sum (R Angle (Pi, A))^2 =
    i=1..n

    = R^2 Sum (Angle (Pi, (R, 0)) - Angle (A, (R, 0)))^2
    i=1..n

    There's no need to bring R into it. Either accept that this is a unit
    circle or consider only the angles.

    (R, 0) denotes vector x = R, y = 0. Let θi denote Angle (Pi, (R, 0))
    and α denote Angle (A, (R, 0)); then

    Simpler to start from here -- to find the average of the angles -- the
    angle that minimises the sum of squares of angular differences:

    = R^2 Sum (θi - α)^2 --> min
    i=1..n

    Derivative d/dα:

    2 R^2 Sum (θi - α) = 0 <=> (R /= 0)
    i=1..n

    <=> Sum θi - nα = 0 <=>
    i=1..n

    <=> α = Sum θi / n = mean ({θi})
    i=1..n

    The angle between A and (R, 0) is the mean of angles of points.

    I don't think you have grasped the basic problem. If you want to
    consider the circular case you'll have to review the posts that were specifically about it. You seem to be suggesting that a conventional
    mean solves the problem.
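    [A hypothetical pair of times illustrates the point: 23:00 and 01:00 map to 345° and 15° on the 24-hour dial, and the conventional arithmetic mean of those raw angles lands at 12:00 rather than midnight:]

    ```python
    import math

    # 24-hour dial: 15 degrees per hour.
    t1 = math.radians(23 * 15)   # 23:00 -> 345 degrees
    t2 = math.radians(1 * 15)    # 01:00 -> 15 degrees
    naive = (t1 + t2) / 2        # arithmetic mean of the raw angles
    # naive == pi, i.e. 12:00 -- whereas any sensible circular
    # average of these two times is 0, i.e. midnight.
    ```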

    Anyway, there are lengthy papers on spherical averages introducing
    iterative solutions, some with source code for the algorithms. That is
    the right way to compute it, if you really wanted...

    I'd like to read one. Do you have a citation, please? The only
    treatments I've found don't discuss the average that Tim and I were
    talking about. They only consider the simple vector average. (But I've
    only found two.)

    [Again, it is not programming. In case of spherical geometry a
    programmer will find an expert mathematician who will have the
    problem properly stated.

    The problem has been properly stated. We seek an algorithm that finds a
    point that minimises the sum of squares of the great-circle distances
    between that point and the input points. You could have asked if any of
    this was unclear. To me it is both clear and intuitive.
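    [As a sketch of what such an algorithm might look like: a plain gradient descent on the sphere, written for illustration only. It is not the method of any particular paper; the function names, step size and iteration count are invented, and it assumes the points are not spread so widely that the vector mean vanishes:]

    ```python
    import math

    def normalize(v):
        n = math.sqrt(sum(x * x for x in v))
        return tuple(x / n for x in v)

    def spherical_lsq_average(points, step=0.25, iters=300):
        """Approximate the unit vector A minimising
        sum_i angle(A, P_i)^2 by gradient descent on the sphere."""
        # Start from the normalised vector mean (assumed non-zero).
        a = normalize([sum(c) for c in zip(*points)])
        for _ in range(iters):
            move = [0.0, 0.0, 0.0]
            for p in points:
                dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, p))))
                theta = math.acos(dot)
                if theta < 1e-12:
                    continue
                # Unit tangent at A pointing toward P.
                tang = [p[i] - dot * a[i] for i in range(3)]
                tn = math.sqrt(sum(t * t for t in tang))
                if tn < 1e-12:
                    continue
                # Descent direction for theta^2: magnitude 2*theta toward P.
                for i in range(3):
                    move[i] += 2.0 * theta * tang[i] / tn
            a = normalize([a[i] + step * move[i] / len(points)
                           for i in range(3)])
        return a
    ```

    [A production version would need a convergence test and more care near antipodal configurations; the iterative schemes in the spherical-averages literature are of this general flavour but considerably more careful.]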

    The next stop is an expert applied mathematician who will outline a workable numeric solution. Then the programmer can start.]

    Ah, you hand off the task of devising algorithms to mathematicians?
    That would make programming very dull. Anyway, you would not tackle
    this sort of problem until an expert applied mathematician has done
    everything except code up a workable numerical solution. Is that also
    the case for the average angle problem?

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Dmitry A. Kazakov on Tue Jan 10 23:36:36 2023
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-08 16:45, Tim Rentsch wrote:

    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    Averaging arcs is equivalent to averaging angles.

    Angles are a one-dimensional measure.

    Averaging arcs is still equivalent to averaging angles, which is
    a trivial result of elementary trigonometry.

    Finding an arc length
    "average" of points on a sphere needs a two-dimensional result.

    Points do not have arcs.

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant.

    See above, it is just angles to average.

    Apparently you have not yet understood the problem.

    Again, averages of arcs and angles are equivalent up to a
    multiplier.

    Why don't
    you try writing a program that inputs a set of points normalized
    to be on the unit sphere, and then calculates the arc length
    average point (on the unit sphere) of those input points?

    Why don't you write a formula specifying your need?

    Programs are written according to the specifications. Numeric
    programs require a properly stated problem, rather than a bunch
    of words arbitrarily thrown in a meaningless sentence as above.

    After reading this response and also three other responses of
    yours (to Ben Bacarisse) down stream, I can't help but think you
    have little or no interest in understanding the problem, offering
    any useful comments about it, or trying to write a program to
    solve the problem. Apparently the only interest you do have is
    making captious remarks.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Tue Jan 10 23:25:10 2023
    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

    <cut>

    The "other" average, minimises the sum of the of the angular
    distances.

    I think you mean it minimizes the sum of the squares of the
    angular distances.

    Yes thanks. I originally thought I'd write an abbreviation but then I
    convinced myself I would not forget any of the "squares of"s. Proof
    reading might have been useful too.

    and the center of mass is never on the unit
    circle (not counting the case when all the time values are the
    same). Even so it's an interesting way to view the distinction.

    It's more interesting because it really is about chords!

    I don't buy the "really" here. The center of mass is fundamental
    and always well-defined. Furthermore it works for collections of
    points that are not all on the same circle or sphere. Looking at
    the problem in terms of chords seems somewhat artificial, or at
    least less fundamental.

    There are other ways for something to be interesting (at least to me).
    When you plot the two averages, and look at the chords vs the arcs, it
    becomes clear why one average "looks" more average than the other.

    I can see a possible misunderstanding here. You might have taken my "it really is about chords" to mean "it is fundamentally about chords", but
    I said "it really is about chords" because you thought I'd used the word chord incorrectly to refer to the lines between the centre of mass and
    the data points. I was saying "I really did mean chords".

    Now that I see what you meant it all makes sense.

    Also I agree that comparing the chords view and the arcs view
    gives a useful perspective. I am starting to appreciate the
    value of that perspective more as time goes on.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Ben Bacarisse on Wed Jan 11 10:01:45 2023
    On 2023-01-11 03:41, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-09 21:37, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-09 04:25, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    Firstly, this is comp.programming group.
    Do you consider discussing algorithms off topic here? Seriously?

    No, I consider off topic discussing mathematical problems /=
    algorithms. You did not discuss any algorithms so far.

    Algorithms don't come out of thin air. The objective must be discussed first. You can't seriously think that everyone here should remain
    silent until, out of nowhere, someone says "consider this algorithm to
    find the average of a set of points".

    I explained in the text you cut. Maybe your reference to the thousands
    of minimisation algorithms was just an off the cuff remark, but it reads
    as if you were suggesting using one of them as the solution to this programming task.

    Sure. I highly doubt that production code, if it ever existed, would use
    some novel algorithm.

    Anyway, there are lengthy papers on spherical averages introducing
    iterative solutions, some with source code for the algorithms. That is
    the right way to compute it, if you really wanted...

    I'd like to read one. Do you have a citation, please?

    Perhaps this could be a starting point:

    "Spherical Averages and Applications to Spherical Splines and
    Interpolation" Samuel R. Buss, Jay P. Fillmore

    It is not an area of my interest, and research is not a programming task
    either. And you still completely miss the key point about stating the
    problem. The problems come from the problem spaces. Spherical geometry
    is used in remote sensing, navigation, computer graphics (spherical
    polygons and triangles) etc. You must look for works in these
    application areas first. If the problem were *real* you would already
    have that step behind you.

    [Again, it is not programming. In case of spherical geometry a
    programmer will find an expert mathematician who will have the
    problem properly stated.

    The problem has been properly stated. We seek an algorithm that finds a point that minimises the sum of squares of the great-circle distances
    between that point and the input points. You could have asked if any of
    this was unclear. To me it is both clear and intuitive.
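
    The problem as stated above admits an iterative attack. Below is a
    minimal Python sketch of one such approach, in the spirit of the
    tangent-plane iteration described by Buss and Fillmore: map each point
    into the tangent plane at the current estimate (log map), average
    there, and map the result back onto the sphere (exp map). The function
    names, the chordal starting guess, and the iteration count are
    illustrative choices of this sketch, not taken from any post or paper
    in the thread.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def log_map(p, q):
    """Tangent vector at unit point p toward unit point q; length = arc distance."""
    d = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    theta = math.acos(d)
    if theta < 1e-12:
        return (0.0, 0.0, 0.0)
    # Component of q orthogonal to p, rescaled to length theta.
    w = tuple(qi - d * pi for pi, qi in zip(p, q))
    wn = math.sqrt(sum(x * x for x in w))
    return tuple(theta * x / wn for x in w)

def exp_map(p, v):
    """Walk along the geodesic from p in tangent direction v."""
    t = math.sqrt(sum(x * x for x in v))
    if t < 1e-12:
        return p
    return tuple(math.cos(t) * pi + math.sin(t) * vi / t
                 for pi, vi in zip(p, v))

def spherical_mean(points, iters=50):
    """Iterative spherical average of unit 3-vectors (tangent-plane averaging)."""
    est = normalize(tuple(sum(c) for c in zip(*points)))  # chordal mean as start
    for _ in range(iters):
        avg = tuple(sum(c) / len(points)
                    for c in zip(*(log_map(est, q) for q in points)))
        est = exp_map(est, avg)
    return est
```

    The sketch assumes the points are not spread so widely (e.g. nearly
    antipodal, or summing to the zero vector) that the average is
    ill-defined; for clustered points the loop converges quickly.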

    The next stop is by an expert applied mathematician who will outline a
    workable numeric solution. Then the programmer can start.]

    Ah, you hand off the task of devising algorithms to mathematicians?

    Yep, *numeric* algorithms are an area of applied mathematics.

    That would make programming very dull. Anyway, you would not tackle
    this sort of problem until an expert applied mathematician has done everything except code up a workable numerical solution. Is that also
    the case for the average angle problem?

    If you are not familiar with spherical geometry, numerical approximation
    and optimization, yes.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Y A@21:1/5 to Tim Rentsch on Wed Jan 11 02:39:48 2023
    On Wednesday, January 11, 2023 at 9:36:40 AM UTC+2, Tim Rentsch wrote:
    "Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:

    On 2023-01-08 16:45, Tim Rentsch wrote:

    "Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:

    Averaging arcs is equivalent to averaging angles.

    Angles are a one-dimensional measure.

    Averaging arcs is still equivalent to averaging angles, which is a
    trivial result of elementary trigonometry.

    Finding an arc length
    "average" of points on a sphere needs a two-dimensional result.

    Points do not have arcs.

    Now that I think about it, finding the point that minimizes the
    great circle distances squared would be at least computationally
    unpleasant.

    See above, it is just angles to average.

    Apparently you have not yet understood the problem.

    Again, averages of arcs and angles are equivalent up to a
    multiplier.
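
    For the one-dimensional case (angles, or times on a 24-hour clock as
    in the puzzle that started the thread), the usual "mean direction"
    from directional statistics makes the arc/angle equivalence concrete:
    represent each angle as a unit vector, sum the vectors, and take the
    angle of the result. The sketch below illustrates that one standard
    convention; it is not the only possible definition of an average angle.

```python
import math

def circular_mean(angles):
    """Mean direction (radians) of a list of angles, via summed unit vectors."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)  # ill-defined when both sums are ~0

# Example from earlier in the thread: 8 o'clock and 10 o'clock -> 9 o'clock.
hours = [8.0, 10.0]
mean = circular_mean([h * 2.0 * math.pi / 24.0 for h in hours])
print(round(mean * 24.0 / (2.0 * math.pi) % 24.0, 6))  # -> 9.0
```

    Note that the result is undefined for balanced inputs such as
    midnight and noon, where the summed vector vanishes.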

    Why don't
    you try writing a program that inputs a set of points normalized
    to be on the unit sphere, and then calculates the arc length
    average point (on the unit sphere) of those input points?

    Why don't you write a formula specifying your need?

    Programs are written according to specifications. Numeric
    programs require a properly stated problem, rather than a bunch
    of words arbitrarily thrown into a meaningless sentence, as above.

    After reading this response and also three other responses of
    yours (to Ben Bacarisse) downstream, I can't help but think you
    have little or no interest in understanding the problem, offering
    any useful comments about it, or trying to write a program to
    solve the problem. Apparently the only interest you do have is
    making captious remarks.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to Dmitry A. Kazakov on Thu Jan 12 01:00:09 2023
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    On 2023-01-11 03:41, Ben Bacarisse wrote:
    "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

    Anyway, there are lengthy papers on spherical averages introducing
    iterative solutions, some with source code for the algorithms. That is
    the right way to compute it, if you really wanted to...

    I'd like to read one. Do you have a citation, please?

    Perhaps this could be a starting point:

    "Spherical Averages and Applications to Spherical Splines and
    Interpolation" Samuel R. Buss, Jay P. Fillmore

    Thanks.

    --
    Ben.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)