Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
On 14/12/2022 12:24 pm, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Subject line: "Another little puzzle"
Your specification seems too simple to qualify as a puzzle, so
what am I missing?
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed requiring the interesting part: doing all
that in modular (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing all that in
the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
On 2022-12-14 14:10, Richard Heathfield wrote:
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing all
that in the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
Right, you must count the wrap-ups.
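A minimal sketch (Python, illustrative; `mod_then_divide` is a made-up name) of the goose in question, reducing the sum mod 24 before dividing:

```python
# Naive approach: reduce the sum mod 24, then divide by n.
# Works only when the true sum never wraps past 24.
def mod_then_divide(times):
    return (sum(times) % 24) / len(times)

print(mod_then_divide([8, 10]))    # 9.0 -- no wrap, fine
print(mod_then_divide([11, 13]))   # 0.0 -- wrong: the sum wrapped once,
                                   # dropping 24, i.e. 24/2 = 12 hours
                                   # from the average
```

Counting the wrap-ups, as suggested above, means adding back 24 * wraps / len(times) before reducing the final result.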
On 14/12/2022 1:35 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 14:10, Richard Heathfield wrote:
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing all that
in the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
Right, you must count the wrap-ups.
So why do you need mod?
On 2022-12-14 15:10, Richard Heathfield wrote:
On 14/12/2022 1:35 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 14:10, Richard Heathfield wrote:
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing
all that in the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
Right, you must count the wrap-ups.
[...]
So why do you need mod?
As I said, the challenge is only interesting in modulo arithmetic
BTW, averaging floats is a nasty problem too. A naive
implementation quickly loses precision.
BTW, averaging floats is a nasty problem too. A naive implementation
quickly loses precision.
We're dealing with 'o'clock' and "HH:MM", and nowadays we have 64-bit
integer types and there are even 128-bit integers mooching around
looking for a reason to exist. You'd have to average a hell of a lot of
times even to /need/ floats, let alone lose significant precision.
On 2022-12-14 16:18, Richard Heathfield wrote:
BTW, averaging floats is a nasty problem too. A naive implementation
quickly loses precision.
We're dealing with 'o'clock' and "HH:MM", and nowadays we have 64-bit
integer types and there are even 128-bit integers mooching around
looking for a reason to exist. You'd have to average a hell of a lot
of times even to /need/ floats, let alone lose significant precision.
I never suggested float for averaging time stamps, I pointed out that averaging is not a simple problem. E.g. try this one:
function Average (X : Float; N : Positive) return Float is
   Sum : Float := 0.0;
begin
   for Index in 1..N loop
      Sum := Sum + X;
   end loop;
   return Sum / Float (N);
end Average;
The function does naive averaging. For simplicity it just sums up the
same number X N times and divides by N.
Average (1.0, 17_000_000) = 0.986895
Average (1.0, 100_000_000) = 0.167772
Average (1.0, 200_000_000) = 0.0838861
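These numbers follow from single-precision rounding: once the running sum reaches 2^24 = 16777216, adding 1.0 falls below the rounding threshold and is absorbed, so the sum saturates and the "average" is just 2^24 / n. A quick check (Python, emulating a 32-bit Float accumulator via struct; not from the thread):

```python
import struct

def f32(x):
    # Round x to the nearest IEEE-754 single-precision value,
    # emulating a 32-bit Float accumulator.
    return struct.unpack('f', struct.pack('f', x))[0]

saturation = f32(2.0 ** 24)                  # 16777216.0
print(f32(saturation + 1.0) == saturation)   # True: the sum stops growing here

# So for n > 2**24 the naive loop effectively returns 2**24 / n:
for n in (17_000_000, 100_000_000, 200_000_000):
    print(n, saturation / n)
```

which reproduces 0.986895, 0.167772 and 0.0838861 to six figures. Kahan (compensated) summation, or simply a double-precision accumulator, avoids the loss.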
Richard Heathfield <rjh@cpax.org.uk> writes:
On 14/12/2022 12:24 pm, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Subject line: "Another little puzzle"
Your specification seems too simple to qualify as a puzzle, so
what am I missing?
Answering this is part of the puzzle!
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
On 21/12/2022 12:03 pm, Stefan Ram wrote:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
The specification says "the 24-hour day" --- one day, not two days. So
the average of 23.5 and 1.5 is 12.5. 0.5 is a mistake.
average( 11.5, 13.5 )== 12.5
Correct.
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
Your hope is misplaced, because one of your test cases bears an
incorrect expected result.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
You misunderstood your specification.
There is a
perfectly plausible interpretation of "average" that gives 0.5.
Richard Heathfield <rjh@cpax.org.uk> writes:
On 21/12/2022 12:03 pm, Stefan Ram wrote:
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
You misunderstood your specification.
That's very severe!
On 2022-12-21 18:05, Ben Bacarisse wrote:
There is a
perfectly plausible interpretation of "average" that gives 0.5.
Not very plausible, as it violates two properties of an average:
min({Xi}) <= avr({Xi}) <= max({Xi})
and
avr({C * Xi}) = C * avr({Xi})
The second one allows calculation of the average as:
23.5 / 2 + 1.5 / 2 = ?
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
The OP was probably deliberately rather vague on this point! The
easiest definition for "average" literally doesn't make sense in a
modulo context: we can ADD the modular values, but in general dividing
in such a mathematical setting doesn't make much sense, so
Average ([i=1,2,..n] x_i) =[def]= Sum ([i=1,2,..n] x_i) /n
would be inappropriate due to the /n operation.
HOWEVER, there's another characterisation for the average of a set,
which looks more promising in a modular (or other more general)
setting: the average is the value of V which MINIMISES THE "ERROR" calculated by
error = Sum ([i=1,2,..n] (V - x_i)^2)
that is, minimises the sum of squares of differences from V.
[Incidentally, if we minimise the sum of absolute differences from V,
that characterises the mode (aka "midpoint") of the sample, but I think
the median is more what is wanted here...]
Now, as to how to CALCULATE the V above???
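One blunt way to calculate that V, good enough for a sanity check: brute-force over a fine grid of candidates, scoring each by the sum of squared shorter-arc distances (a sketch in Python with made-up names, not from the thread):

```python
def circ_dist(a, b):
    # Length of the shorter arc between two times on the 24-hour circle.
    d = abs(a - b) % 24.0
    return min(d, 24.0 - d)

def circular_mean(times, steps=24 * 60):
    # Brute force: try every grid point V (one-minute resolution by
    # default) and keep the one minimising Sum (V - x_i)^2, with the
    # circular distance above playing the role of the difference.
    return min((k * 24.0 / steps for k in range(steps)),
               key=lambda v: sum(circ_dist(v, t) ** 2 for t in times))

print(circular_mean([8, 10]))      # 9.0
print(circular_mean([23.5, 1.5]))  # 0.5
```

The grid search is O(steps * n), but it sidesteps the question of how to solve the minimisation analytically.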
an accumulation of "frustration" at what I consider to be
"unimaginative" misinterpretations and
dogmatic claims of what was originally requested...
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
On 2022-12-21 20:55, Mike Terry wrote:
The OP was probably deliberately rather vague on this point! The easiest definition for "average"
literally doesn't make sense in a modulo context: we can ADD the modular values, but in general
dividing in such a mathematical setting doesn't make much sense, so
Average ([i=1,2,..n] x_i) =[def]= Sum ([i=1,2,..n] x_i) /n
would be inappropriate due to the /n operation.
Unless you calculate everything as reals or integers and then take the remainder modulo 24. Which is
kind of the most natural definition, at least to me.
HOWEVER, there's another characterisation for the average of a set, which looks more promising in
a modular (or other more general) setting: the average is the value of V which MINIMISES THE
"ERROR" calculated by
error = Sum ([i=1,2,..n] (V - x_i)^2)
that is, minimises the sum of squares of differences from V.
This has exactly the same problem as above, because subtraction and multiplication (squaring) have
different semantics in modular arithmetic.
[Incidentally, if we minimise the sum of absolute differences from V, that characterises the mode
(aka "midpoint") of the sample, but I think the median is more what is wanted here...]
Actually it is the same beast. Median is the least C-norm defined as |x-y|.
Mean is the least Euclidean norm (squares), the thing you proposed.
Mean and average are the same for real numbers.
You are right that median (and C-norm) can be unambiguously defined in modular numbers. But nobody
would ever qualify this as a puzzle! (:-))
[...]
Now, as to how to CALCULATE the V above???
As I said, in real numbers V is the mean, i.e.
V = Sum_{i=1..n} Xi / n =def= mean({Xi | i=1..n})
Any numeric method in the end maps each Xi to some ordinary number, calculates a classic mean, and
maps back. The problem is with the forward mapping being ambiguous. One method to disambiguate is
staying within the same day. Another could be taking a median and considering 12 hours before and 12
hours after it. And there are many more.
P.S. There are differences between the average and the mean. The OP referred to the average, which may
mean something completely counterintuitive (no pun intended (:-))
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
As before, the problem is underspecified.
Maybe so - I'd just say here that you are discussing an /algorithm/ to compute a result, rather than the definition of what is to be considered
the correct answer, which was the focus in my post.
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down the drain.
No, we are rather supposed to *elicit* requirements... which is just the
tip of the (software) analysis iceberg.
On 22/12/2022 17:30, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown
wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry
wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer
actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down
the drain.
No, we are rather supposed to *elicit* requirements... which is
just the
tip of the (software) analysis iceberg.
Perhaps I should have put the smiley in a bigger font?
On 22/12/2022 17:30, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down the drain.
No, we are rather supposed to *elicit* requirements... which is just the tip of the (software) analysis iceberg.
Perhaps I should have put the smiley in a bigger font?
Of course the correct way to do things is to work with the customer to establish the requirements and specifications that everyone is happy
with, and then code to those specifications.
And if it turns out that the specifications were unclear, you clarify
them, rather than guessing.
On 22/12/2022 9:06 pm, David Brown wrote:
On 22/12/2022 17:30, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down the drain.
No, we are rather supposed to *elicit* requirements... which is just the tip of the (software) analysis iceberg.
Perhaps I should have put the smiley in a bigger font?
David, obviously it's your call, but do you really think you're going to
have a meaningful conversation with someone who on 05/01/2022 10:57
replied to you[1]:
"It is you who are an incompetent fucking clown just repeating some
bullshit you read in the blogs."
And you're far from being the only one he abuses in this wise.
God invented killfiles for a reason.
[1] If you want to look it up, see just over halfway down this thread: https://groups.google.com/g/comp.programming/c/gKQ3q0D5IT0/m/7JoZ5e7ICAAJ
On Thursday, 22 December 2022 at 22:07:01 UTC+1, David Brown wrote:
Perhaps I should have put the smiley in a bigger font?
You can laugh your frauds and bad conscience as much as you like,
I am not kidding at all.
On 22/12/2022 23:15, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 22:07:01 UTC+1, David Brown wrote:
Perhaps I should have put the smiley in a bigger font?
You can laugh your frauds and bad conscience as much as you like,
I am not kidding at all.
Happy Christmas to you too :-)
On Wednesday, 21 December 2022 at 20:56:01 UTC+1, Mike Terry wrote:
an accumulation of "frustration" at what I consider to be
"unimaginative" misinterpretations and
dogmatic claims of what was originally requested...
Besides that the problem was and remains at best poorly
phrased (and I agree that guesswork around here should
rather be banned), there is a much simpler and canonical
way to define "midpoint" in a modular space, namely the
midpoint along the shortest of the two possible paths.
And that's actually a quite common kind of calculation...
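For n=2 that shortest-path midpoint can be written down directly (a sketch in Python, made-up name, not from the thread):

```python
def clock_midpoint(a, b):
    # Midpoint along the shorter of the two arcs between times a and b
    # on a 24-hour clock. Exactly opposite times (12 h apart) are
    # ambiguous; here they resolve to the arc running forward from a.
    diff = (b - a) % 24.0
    if diff > 12.0:
        diff -= 24.0          # take the short way round
    return (a + diff / 2.0) % 24.0

print(clock_midpoint(8, 10))       # 9.0
print(clock_midpoint(23.5, 1.5))   # 0.5
print(clock_midpoint(11.5, 13.5))  # 12.5
```

This passes both of the OP's test cases, though it does not generalise to n > 2 on its own.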
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Sometimes that vector has zero length, and the average is undefined,
but otherwise the length of the vector gives an indication of the
variability of the data.
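The vector recipe above, sketched in Python (made-up name, not from the thread):

```python
import math

def circular_average(times):
    # Scale [0, 24) to [0, 2*pi), average the unit vectors, and convert
    # the direction of the mean vector back to a time of day.
    angles = [t * math.pi / 12.0 for t in times]
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    if math.hypot(x, y) < 1e-12:
        raise ValueError("mean vector is zero; average undefined")
    return (math.atan2(y, x) % (2.0 * math.pi)) * 12.0 / math.pi

print(round(circular_average([23.5, 1.5]), 6))   # 0.5
print(round(circular_average([11.5, 13.5]), 6))  # 12.5
```

The length of the mean vector (math.hypot(x, y)) is the variability indicator mentioned above: near 1 for tightly clustered times, near 0 for widely spread ones.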
Why do I consider this a reasonable interpretation of the problem?
Well, given a list of times of day when a train is observed to pass
some station, the circular 24-hour-time average should be our best
estimate of the scheduled time.
Obviously there are other possible readings of the problem, but I was
not able to justify any of them as useful for any real-world
applications. This is a case where I hope I am wrong and there /are/
other circular averages with practical interpretations.
On 24/12/2022 14:21, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their
average. For example, the average of "eight o'clock"
and "ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example
"HH:MM" or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a
result, the puzzle has not yet been solved (unless I
have overlooked a contribution or misworded
expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past
midnight".)
I hope that these test cases encode sensible
expectations for an average of two times on a 24-hour
clock in the spirit of the example given in the OP,
which was, "the average of eight o'clock and ten
o'clock would be nine o'clock", since these test cases
just have rotated that example by 3.5 and 15.5 hours.
I believe that I have not seen an algorithm so far in
this thread that would pass these tests.
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot
more interesting.
The input is a collection, t(n), of n > 1 numbers in [0,
24). The average should be a number, A, in [0, 24) that
minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to
think in terms of variance). So far, this is just what an
average is. The key point is what is the distance (or
difference) whose sum (or sum of squares) we want to
minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the
shorter arc) is one metric, and I think a reasonable one.
(Minimizing the absolute value of the distance gives a
result more like the median than the mean, IIANM.)
Yes - perhaps Ben was thinking of making
Sum_{i=1,n} signed_distance(A, t(i)) = 0
rather than minimising the (absolute) distance.
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter arc)
is one metric, and I think a reasonable one. (Minimizing the absolute
value of the distance gives a result more like the median than the
mean, IIANM.)
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
On 24/12/2022 14:21, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter arc)
is one metric, and I think a reasonable one. (Minimizing the absolute
value of the distance gives a result more like the median than the
mean, IIANM.)
Yes - perhaps Ben was thinking of making
Sum_{i=1,n} signed_distance(A, t(i)) = 0
rather than minimising the (absolute) distance. That would match up
with minimising the sum of squares of the distance (mean). Or he just
meant the median as you say, which is different from the mean. Both
have plausible "average" properties,
and meet my "minimal" criteria
for what I think anyone would require of an "average" applying to a
modular arithmetic:
- if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING
AROUND THE END OF THE MODULO LOOP] then the "average" will be within
those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.
- if all x_i are translated by a fixed offset like 1 hour, the average
will translate by the same offset, i.e. it maintains the same "modulo"
relation with the translated x_i.
hmm, thinking now I'll add the similar:
- if all x_i are "reflected" about a chosen time, then the new average
will be the reflected time of the original average.
(those aren't intended to be a definition of a "modular average", just
to exclude suggestions that blatantly lack the right symmetry for such
an average.)
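These symmetry criteria can be spot-checked mechanically. A sketch (Python assumed; `circular_mean` is my own name for the vector-average idea that comes up just below, not a definition anyone in the thread committed to):

```python
from math import sin, cos, atan2, tau

def circular_mean(xs):
    # Vector-average "modular mean" on a 24-hour circle (one candidate
    # definition; undefined when the summed vector is ~zero).
    x = sum(cos(v / 24 * tau) for v in xs)
    y = sum(sin(v / 24 * tau) for v in xs)
    return atan2(y, x) / tau * 24 % 24

xs = [23.97, 0.02, 0.03]  # all within a few minutes, wrapping past 24
m = circular_mean(xs)
# Criterion 1: the mean lies inside the short arc covered by the data.
assert 23.97 <= m or m <= 0.03

# Criterion 2: translating every x_i by 1 hour translates the mean too.
shifted = circular_mean([(v + 1) % 24 for v in xs])
assert abs(shifted - (m + 1) % 24) < 1e-9

# Criterion 3: reflecting about a chosen time reflects the mean.
reflected = circular_mean([(5 - v) % 24 for v in xs])
assert abs(reflected - (5 - m) % 24) < 1e-9
```

The vector average passes all three checks because rotations and reflections of the circle map the summed vector accordingly.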
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
Yes I think this one would work best in most situations, given that
the OP didn't have a precise mathematical requirement for his solution
- probably he wants something that just works reasonably as expected.
(It's my favourite answer.)
Another way of thinking of this approach is "balancing" the 24-hour
modular circle, where we've put unit weights at each of the times x_i.
E.g. we look for a balancing line through the centre of the circle.
[Note times on the circle go from 1-24, not 1-12 like a normal clock.]
Also, thinking like this, it's equivalent to (conventional) averaging
of the sin of the angular differences from the average (balancing)
point, since the sin will give the perpendicular distance of the
weight from the balancing line. So it still fits with the general
scheme of minimising distance(x_i, v)^2, but with a slightly different
metric (distance function) in use.
As sin(x) is close to x when x is small, the vector average solution
would match closely with the mean (by arc length) solution when the
times are close together, but diverge more noticeably when the times
are spread out...
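That divergence is easy to see numerically. A sketch (Python assumed) comparing the vector average with the plain arithmetic mean for a tight cluster and for spread-out times:

```python
from math import sin, cos, atan2, tau

def vector_average(xs):
    # Direction of the summed unit vectors, converted back to hours.
    x = sum(cos(v / 24 * tau) for v in xs)
    y = sum(sin(v / 24 * tau) for v in xs)
    return atan2(y, x) / tau * 24 % 24

# Tight cluster: vector average ~ arithmetic mean (sin(x) ~ x).
tight = [8.9, 9.0, 9.1]
print(vector_average(tight), sum(tight) / 3)   # both ~9.0

# Spread-out times: the two notions diverge sharply.
spread = [6.0, 9.0, 21.0]
print(vector_average(spread), sum(spread) / 3) # ~6.0 versus 12.0
```

For {6, 9, 21} the 9 and 21 vectors cancel exactly, leaving only the 6 o'clock vector, so the vector average is 6.0 while the arithmetic mean is 12.0.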
Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:
On 24/12/2022 14:21, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter arc)
is one metric, and I think a reasonable one. (Minimizing the absolute
value of the distance gives a result more like the median than the
mean, IIANM.)
Yes - perhaps Ben was thinking of making
Sum_{i=1,n} signed_distance(A, t(i)) = 0
rather than minimising the (absolute) distance. That would match up
with minimising the sum of squares of the distance (mean). Or he just
meant the median as you say, which is different from the mean. Both
have plausible "average" properties,
Well, I was trying to be deliberately vague (hence my talking about
"an average" rather than mean, median or whatever) but I was not
vague enough with the formula.
I wasn't thinking of a zero net distance because some metrics (though
probably not applicable in this situation) don't always allow for a
zero sum. I should have referred to minimising some kind of distance
sum, be it the absolute value of a signed sum or, as in the solution
I was proposing, the sum of directional differences (i.e. full
vectors).
and meet my "minimal" criteria
for what I think anyone would require of an "average" applying to a
modular arithmetic:
- if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING
AROUND THE END OF THE MODULO LOOP] then the "average" will be within
those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.
And when the x_i are widely spaced the average might be considered
nearly meaningless. If you have {0, 8, 16} hours then what is the
"average"? Are there cases where every set of circular values has a
meaningful "average"? If not, you really need more than a single
number answer.
(These are not questions to you, specifically; I'm interested to know
of interpretations that go beyond the way I see things.)
- if all x_i are translated by a fixed offset like 1 hour, the average
will translate by the same offset, i.e. it maintains the same "modulo"
relation with the translated x_i.
hmm, thinking now I'll add the similar:
- if all x_i are "reflected" about a chosen time, then the new average
will be the reflected time of the original average.
(those aren't intended to be a definition of a "modular average", just
to exclude suggestions that blatantly lack the right symmetry for such
an average.)
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
Yes I think this one would work best in most situations, given that
the OP didn't have a precise mathematical requirement for his solution
- probably he wants something that just works reasonably as expected.
(It's my favourite answer.)
Another way of thinking of this approach is "balancing" the 24-hour
modular circle, where we've put unit weights at each of the times x_i.
E.g. we look for a balancing line through the centre of the circle.
Is that the same? For {0, 8, 16} any of the three lines through 0, 8 or
16 balances, but the vector sum interpretation is undefined. And for
{0, 6, 12, 18} there are two balance lines, neither of which seems to be
a natural answer for the average.
(Also, every balance line gives two answers, so how do you pick?)
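Those degenerate cases are easy to confirm numerically: the summed unit vectors cancel, so the vector-average direction is undefined. A sketch (Python assumed):

```python
from math import sin, cos, hypot, tau

def resultant_length(xs):
    # Length of the summed unit vectors; ~0 means no meaningful average.
    x = sum(cos(v / 24 * tau) for v in xs)
    y = sum(sin(v / 24 * tau) for v in xs)
    return hypot(x, y)

print(resultant_length([0, 8, 16]))      # ~0: three times 120 degrees apart
print(resultant_length([0, 6, 12, 18]))  # ~0: four times 90 degrees apart
print(resultant_length([8, 10]))         # clearly non-zero: average well defined
```

The resultant length also serves as the "strength" indicator mentioned elsewhere in the thread: near n the data is tightly clustered, near 0 the average is meaningless.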
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter arc)
is one metric, and I think a reasonable one. (Minimizing the absolute
value of the distance gives a result more like the median than the
mean, IIANM.)
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
Sometimes that vector has zero length, and the average is undefined,
but otherwise the length of the vector gives an indication of the
variability of the data.
Why do I consider this a reasonable interpretation of the problem?
Well, given a list of times of day when a train is observed to pass
some station, the circular 24-hour-time average should be our best
estimate of the scheduled time.
Obviously there are other possible readings of the problem, but I was
not able to justify any of them as useful for any real-world
applications. This is a case where I hope I am wrong and there /are/
other circular averages with practical interpretations.
A nice analysis.
Some observations (all of which I believe are correct, but no
guarantees are offered).
The "vector average" metric is easier to calculate and gives an
explicit indication when the data produces a degenerate result.
Considering only cases that do not produce a degenerate result:
(1) the "conventional average" metric and the "vector average"
metric can produce different results when there are more than two
data points. (For one or two data points I think they are always
the same.)
(2) the "vector average" metric can be calculated in O(n). The
"conventional average" metric can be calculated in O(n*n).
(3) when the times are all reasonably close together, the
"conventional average" metric seems more likely to produce a result
corresponding to what most people expect. Another way to say this
is that the "vector average" metric will sometimes surprise people.
... But actually the balancing line idea wasn't my proper thinking -
that was that we "counter-balanced" the circle by adding a new weight
somewhere on the circle. There is a unique such place where such a
weight can go, and a unique weight that works. The "average" for the
x_i is then directly OPPOSITE the counter weight. This is all
assuming we don't start with an exactly balancing sample! The mass of
the counter-weight is the magnitude of your vector sum, indicating
the "strength" of the average...
On 24/12/2022 22:30, Ben Bacarisse wrote:
Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:
Another way of thinking of this approach is "balancing" the 24-hour
modular circle, where we've put unit weights at each of the times x_i.
E.g. we look for a balancing line through the centre of the circle.
Is that the same? For {0, 8, 16} any of the three lines through 0, 8 or
16 balances, but the vector sum interpretation is undefined. And for
{0, 6, 12, 18} there are two balance lines, neither of which seems to be
a natural answer for the average.
Yes, it's logically the same, in so far as WHEN the x_i aren't
already balanced [aka they /have/ a meaningful average] we get the
same average both ways. (Adding vectors is effectively a way of
calculating moments of the weights in the balancing view of things.)
On 24/12/2022 14:21, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
[...]
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter
arc) is one metric, and I think a reasonable one. [...]
Yes - perhaps Ben was thinking of making
Sum_{i=1,n} signed_distance(A, t(i)) = 0
rather than minimising the (absolute) distance. [...]
The problem has a natural interpretation in terms of angles.
Whatever the circular quantity is, convert the values to unit
vectors round a circle. For times of day, just scale [0, 24) to
[0, 2*pi). The average is then just the direction of the average
vector, converted back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
Yes I think this one would work best in most situations, [...]
Another way of thinking of this approach is "balancing" the 24-hour
modular circle, where we've put unit weights at each of the times
x_i. E.g. we look for a balancing line through the centre of the
circle. [Note times on the circle go from 1-24, not 1-12 like a
normal clock.]
Also, thinking like this, it's equivalent to (conventional)
averaging of the sin of the angular differences from the average
(balancing) point, since the sin will give the perpendicular
distance of the weight from the balancing line. [...]
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
[...]
A nice analysis.
A generous appraisal, thanks. I was trying to be vague enough not to
put my foot in it, but was too specific with the formula.
Some observations (all of which I believe are correct, but no
guarantees are offered).
The "vector average" metric is easier to calculate and gives an
explicit indication when the data produces a degenerate result.
Considering only cases that do not produce a degenerate result:
(1) the "conventional average" metric and the "vector average"
metric can produce different results when there are more than two
data points. (For one or two data points I think they are always
the same.)
(2) the "vector average" metric can be calculated in O(n). The
"conventional average" metric can be calculated in O(n*n).
By which I presume you mean and /not/ in O(n). (Big O is just an
upper bound.)
(3) when the times are all reasonably close together, the
"conventional average" metric seems more likely to produce a result
corresponding to what most people expect. Another way to say this
is that the "vector average" metric will sometimes surprise people.
I'm sorry to be obtuse, but what is the "conventional average"? The
name makes it sound trivial, but the quadratic time makes me certain
that it isn't.
My "conventional average" algorithm (which is not well thought out) was
to (a) rotate the data set to avoid the 23/0 boundary (not always
possible), (b) take the arithmetic mean, and then (c) rotate the result
back. E.g. [23,0,1] -> [0,1,2] by adding one, and the average is
mean[0,1,2] - 1 = 0.
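Ben's rotate, average, rotate-back step as a sketch (Python assumed; the offset is taken as given here, and choosing it well is exactly the part that turns out to be hard):

```python
def rotated_mean(times, offset, period=24.0):
    # (a) rotate the data so it avoids the wrap-around boundary,
    # (b) take the ordinary arithmetic mean,
    # (c) rotate the result back onto [0, period).
    shifted = [(t + offset) % period for t in times]
    return (sum(shifted) / len(shifted) - offset) % period

print(rotated_mean([23, 0, 1], offset=1))  # 0.0, matching the example above
```

With offset=1 the data [23, 0, 1] becomes [0, 1, 2], whose mean is 1, and rotating back gives 0.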
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[By "conventional average" I mean a time/direction A such that]
A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }
where 'difference' means the shorter arc length.
The "conventional average" metric can be calculated in O(n*n).
By which I presume you mean and /not/ in O(n). (Big O is just an
upper bound.)
What I mean is I have an algorithm that runs in time proportional
to n*n. I suspect that there is an algorithm with better big-O
behavior (probably O( n * log n )), but I don't actually have one
to show, so I simply gave the best big-O result that I knew.
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
I'm sorry to be obtuse, but what is the "conventional average"? The
name makes it sound trivial, but the quadratic time makes me certain
that it isn't.
Sorry, I meant to refer to your formulation of average
A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }
where 'difference' means the shorter arc length. This formula
matches the result for 'mean' on real numbers.
My "conventional average" algorithm (which is not well thought out) was
to (a) rotate the data set to avoid the 23/0 boundary (not always
possible), (b) take the arithmetic mean, and then (c) rotate the result
back. E.g. [23,0,1] -> [0,1,2] by adding one, and the average is
mean[0,1,2] - 1 = 0.
Yes, if you know where to split the cycle then the answer can be
found in O(n) time. But how can we figure out where to split the
cycle?
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
Well I handily stopped considering this at the stage where I assumed
there must be a simple way to spot the optimal rotation, so I never
thought it might have to be quadratic. Presumably your algorithm tries
all the offsets and minimises the result.
Looking at it a bit more I can't see a better way (but that might be
the Ratafia de Champagne). It feels as if there /should/ be one. In
fact it feels as if it should be linear.
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just don't get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down the drain.
No, we are rather supposed to *elicit* requirements... which is just the
tip of the (software) analysis iceberg.
On 12/25/22 4:44 PM, Ben Bacarisse wrote:
[...]
I may have lost track of the problem, but wouldn't the addition of
the points in two-dimensions and finding the center of gravity let
you find the answer in O(N) time?
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
Well I handily stopped considering this at the stage where I assumed
there must be a simple way to spot the optimal rotation, so I never
thought it might have to be quadratic. Presumably your algorithm
tries all the offsets and minimises the result.
Right.
Looking at it a bit more I can't see a better way (but that might be
the Ratafia de Champagne). It feels as if there /should/ be one.
In fact it feels as if it should be linear.
My best so far is only O( n * log n ). Probably that isn't
optimal. I don't see how to make it linear though. Do you have
any ideas?
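One route to O(n log n), offered as a sketch of my own rather than as Tim's actual algorithm: sort the times, note that the optimal place to cut the circle open can only fall in one of the n gaps between consecutive sorted values, and use prefix sums so each candidate cut's sum-of-squares cost is evaluated in O(1). Sorting dominates.

```python
def circular_mean_sq(times, period=24.0):
    # Minimise the sum of squared deviations over all n candidate cuts.
    ts = sorted(t % period for t in times)
    n = len(ts)
    # Prefix sums of t and t^2 give each cut's cost in O(1).
    s1, s2 = [0.0], [0.0]
    for t in ts:
        s1.append(s1[-1] + t)
        s2.append(s2[-1] + t * t)
    best = None
    for k in range(n):
        # Cut before index k: the first k values wrap round, i.e. are
        # lifted by one full period in the linearised ordering.
        tot1 = s1[n] + k * period
        tot2 = s2[n] + 2 * period * s1[k] + k * period * period
        sse = tot2 - tot1 * tot1 / n   # sum of squares about the mean
        if best is None or sse < best[0]:
            best = (sse, (tot1 / n) % period)
    return best[1]

print(circular_mean_sq([23.5, 1.5]))   # ~0.5
print(circular_mean_sq([11.5, 13.5]))  # ~12.5
```

For each cut the linearised squared deviations are at least the shorter-arc ones, and for the cut opposite the true minimiser they coincide, so taking the minimum over all n cuts recovers the shorter-arc answer (at least on examples like the test cases above; I have not proved this in full generality).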
On Wednesday, 21 December 2022 at 20:56:01 UTC+1, Mike Terry wrote:
an accumulation of "frustration" at what I consider to be
"unimaginative" misinterpretations and dogmatic claims of what was
originally requested...
Besides that the problem was and remains at best poorly
phrased (and I agree that guesswork around here should
rather be banned), there is a much simpler and canonical
way to define "midpoint" in a modular space, namely the
midpoint along the shortest of the two possible paths.
And that's actually a quite common kind of calculation...
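The shortest-path midpoint of two times, as a sketch (Python assumed; for a diametrically opposed pair the two paths tie and the choice of arc is arbitrary):

```python
def midpoint(a, b, period=24.0):
    # Signed gap from a to b, folded into [-period/2, period/2):
    # this picks out the shorter of the two arcs between them.
    d = (b - a + period / 2) % period - period / 2
    return (a + d / 2) % period

print(midpoint(23.5, 1.5))   # 0.5
print(midpoint(11.5, 13.5))  # 12.5
print(midpoint(8, 10))       # 9.0
```

Unlike the vector average this is defined only for two values, but for n=2 it agrees with it (both walk halfway along the shorter arc), and it passes the thread's two test cases.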
ram@zedat.fu-berlin.de (Stefan Ram) writes:
I waited a few days before answering to allow
sufficient time to think about the problem.
(A few lines in this post contain more than 72 characters.)
I sometimes look at records of events like when someone got
up or went to bed on different days and wonder about the
"average time" he gets up or goes to bed. Some years ago,
I asked in another newsgroup how to average times, and from
answers there I kept this expression for a spreadsheet and
two times in cells A1 and B1:
REST(1.5+1/6.28318530717959*ARCTAN2((COS((A1-0.5)*6.28318530717959)+COS((B1-0.5)*6.28318530717959))/2,(SIN((A1-0.5)*6.28318530717959)+SIN((B1-0.5)*6.28318530717959))/2),1)
. I added a bit, like "REST", but the trigonometric part
came from that newsgroup. (Times t of a day in spreadsheets
often are 0 <= t < 1 because they often use the day as the
unit of time). At that time, I did not understand what was
going on! In hindsight, today, I see that this seems to
calculate the average via the average of the position of
the two points on a 24-hour clock in two dimensions.
The corresponding Wikipedia page is called, "Mean of circular
quantities".
I also have a copy of a page "Circular Values Math and
Statistics with C++11" saved. Other pages I found were
"Averages-Mean angle - Rosetta Code", "Averaging Angles -
The Math Doctors", or "algorithm - How do you calculate
the average of a set of angles - Stack Overflow". Maybe
some of these pages can still be found in the Web today.
There was also another web page that used lines some of
which have more than 72 characters. It contained,
|A good way to estimate an average angle, A, from a set of angle measurements
|a[i] 0<=i<N, is:
|
| sum_i_from_1_to_N sin(a[i])
| a = arctangent ---------------------------
| sum_i_from_1_to_N cos(a[i])
|A very careful study of the properties of this estimator is in the book
|"Statistics On Spheres", Geoffrey S. Watson, University of Arkansas Lecture
|Notes in the Mathematical Sciences, 1983 John Wiley & Sons, Inc.
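The arctangent-of-sums recipe quoted above is short to code directly. A minimal sketch in Python (the function name and the 24-hours-per-turn convention are my own choices, not from the quoted sources):

```python
import math

def circular_mean_hours(times):
    """Circular ("vector") mean of times on a 24-hour clock.

    Each time t (hours, 0 <= t < 24) becomes a point on the unit
    circle; we sum the sines and cosines and take the direction of
    the resulting vector with atan2, exactly as in the quoted
    formula.  Returns a value in [0, 24)."""
    s = sum(math.sin(t * 2 * math.pi / 24) for t in times)
    c = sum(math.cos(t * 2 * math.pi / 24) for t in times)
    return (math.atan2(s, c) * 24 / (2 * math.pi)) % 24

print(circular_mean_hours([8, 10]))   # approximately 9.0
print(circular_mean_hours([23, 1]))   # approximately 0 (midnight), not 12
```

Note that the naive arithmetic mean of 23:00 and 01:00 would give noon; the circular mean gives midnight, which is the point of the whole exercise.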
I found this problem interesting but only later in the discussion as
I have been using this metric for some time. What got interesting
(to me) was that there is another sound interpretation of the
average as suggested by Tim, [...]
ironically prompted by a general definition of what might
constitute an average that I had posted and failed to follow
through on.
ram@zedat.fu-berlin.de (Stefan Ram) writes:
I waited a few days before answering to allow
sufficient time to think about the problem.
I sometimes look at records of events like when someone got up
or went to bed on different days and wonder about the "average
time" he gets up or goes to bed. Some years ago, I asked in
another newsgroup how to average times, and from answers there
I [found a long and somewhat forbidding formula]
At that time, I did not understand what was going on! In
hindsight, today, I see that this seems to calculate the
average via the average of the position of the two points on a
24-hour clock in two dimensions.
[..other references to the same general idea..]
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
[...]
I found this problem interesting but only later in the discussion as
I have been using this metric for some time. What got interesting
(to me) was that there is another sound interpretation of the
average as suggested by Tim, [...]
It wasn't my idea. I got it from a posting by Mike Terry on
December 21. I hadn't seen that formulation of arithmetic mean
before and I was amazed that it worked. So I can't really take
any credit for the suggestion.
ironically prompted by a general definition of what might
constitute an average that I had posted and failed to follow
through on.
I remember your posting as coming after the one by Mike Terry,
and so I thought your comments were derived from his. Sorry if
my conclusions there were off the mark.
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing all
that in the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
On 2022-12-21 20:55, Mike Terry wrote:
The OP was probably deliberately rather vague on this point! The
easiest definition for "average" literally doesn't make sense in a
modulo context: we can ADD the modular values, but in general dividing
in such a mathematical setting doesn't make much sense, so
Average ([i=1,2,..n] x_i) =[def]= Sum ([i=1,2,..n] x_i) /n
would be inappropriate due to the /n operation.
Unless you calculate everything as reals or integers and then take the
remainder of 24. Which is kind of the most natural definition, at least
to me.
HOWEVER, there's another characterisation for the average of a set,
which looks more promising in a modular (or other more general)
setting: the average is the value of V which MINIMISES THE "ERROR" calculated by
error = Sum ([i=1,2,..n] (V - x_i)^2)
that is, minimises the sum of squares of differences from V.
This has exactly the same problem as above, because subtraction and
multiplication (power of two) have different semantics in modular
arithmetic.
[Incidentally, if we minimise the sum of absolute differences from V,
that characterises the mode (aka "midpoint") of the sample, but I think
the median is more what is wanted here...]
Actually it is the same beast. Median is the least C-norm defined as
|x-y|.
Mean is the least Euclidean norm (squares), the thing you proposed.
Mean and average are the same for real numbers.
You are right that median (and C-norm) can be unambiguously defined in modular numbers. But nobody would ever qualify this as a puzzle! (:-))
[...]
Now, as to how to CALCULATE the V above???
As I said, in real numbers V is the mean, i.e.
V = Sum_{i=1..n} (Xi) / n =def= mean({Xi | i=1..n})
Any numeric method is in the end mapping Xi to some normal number,
calculating a classic mean, mapping back. The problem is with the
forward mapping being ambiguous. One method to disambiguate is staying
within the same day. Another could be taking a median and considering
12 hours before and 12 hours after it. And there are many more.
P.S. There are differences between the average and the mean. The OP
referred to the average, which may mean something completely
counterintuitive (no pun intended (:-))
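The "map forward, take a classic mean, map back" idea sketched above can be made concrete. Assuming times in hours, here are two such forward mappings; `mean_near_reference` uses the first sample as the reference point (my simplification for illustration, rather than the median suggested in the post):

```python
def mean_same_day(times):
    """Forward-map every time to a real in [0, 24) "within the same
    day", take the classic mean, and reduce modulo 24."""
    return (sum(t % 24 for t in times) / len(times)) % 24

def mean_near_reference(times):
    """Alternative forward mapping: re-express every time within 12
    hours either side of a reference point (here, simply the first
    sample) before taking the classic mean."""
    ref = times[0]
    shifted = [ref + ((t - ref + 12) % 24) - 12 for t in times]
    return (sum(shifted) / len(shifted)) % 24

print(mean_same_day([23, 1]))        # 12.0 -- noon
print(mean_near_reference([23, 1]))  # 0.0  -- midnight
```

The two mappings disagree exactly when the sample straddles the cut at midnight, which is the ambiguity being discussed.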
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
[...]
I found this problem interesting but only later in the discussion as
I have been using this metric for some time. What got interesting
(to me) was that there is another sound interpretation of the
average as suggested by Tim, [...]
It wasn't my idea. I got it from a posting by Mike Terry on
December 21. I hadn't seen that formulation of arithmetic mean
before and I was amazed that it worked. So I can't really take
any credit for the suggestion.
ironically prompted by a general definition of what might
constitute an average that I had posted and failed to follow
through on.
I remember your posting as coming after the one by Mike Terry,
and so I thought your comments were derived from his. Sorry if
my conclusions there were off the mark.
You are right in that MT posted the same general formulation 9 hours
before I did (though I'd not seen that). My confusion came from your
explanation, to me, of "conventional average":
"Sorry, I meant to refer to your formulation of average"
followed by the formula I gave rather than the entirely equivalent
one given by MT.
Anyway, the credit I'm giving is for your considering this a
reasonable thing to try calculate for arc lengths, rather than
[...]
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
A couple of further thoughts only one of which is directed at you
Tim (at the end)...
Given the general conception of a mean (rather than any other kind of
summary statistic) as minimising the sum of squares of some "distance" metric:
Sum_{i=1,n} difference(A, t(i))^2
we can characterise the two contenders by what distance is being
minimised. For the less well discussed "conventional average" (to use
your terminology) we are minimising the sum of squares of the shorter
arc lengths between A and the t(i).
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
The mean vector itself (which may not lie on the unit circle) minimises
the sum of the squares of the length of the vector differences m-u(i):
Sum_{i=1,n} |m - u(i)|^2
and any other vector along the same line as m (i.e. c*m for real c)
minimises
Sum_{i=1,n} |c*m - u(i)|^2
This includes, of course, m projected out to the unit circle.
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Incidentally, I found another book on statistics on spheres, and that
gets the average over in a few paragraphs. It states, without
considering any alternatives, that the average is the vector average. I can't find anyone using or citing the arc-length minimising average,
despite it being natural on a sphere to find the mid-point of, say, a
set of points on the Earth's surface.
Tim, my best shot at calculating this other average sorts the points to
find the widest gap. I suspect your algorithm is similar since you say
it is O(n log n).
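For concreteness, a brute-force version of this calculation can be sketched as follows (an O(n^2) simplification of the sorting approach: rather than locating one special gap, it tries every gap as the cut point and keeps the candidate with the least sum of squared arc distances; names are my own):

```python
def circ_dist(a, b):
    """Shorter arc between two times on the 24-hour circle."""
    d = abs(a - b) % 24
    return min(d, 24 - d)

def arc_mean_hours(times):
    """Point A minimising sum(circ_dist(A, t)**2 for t in times).

    Brute-force sketch: cutting the circle just before each sorted
    point in turn, unwrapping, and taking the ordinary mean of the
    unwrapped values gives a set of candidates; the candidate with
    the least error wins."""
    ts = sorted(t % 24 for t in times)
    best_a, best_err = None, float("inf")
    for cut in ts:
        unwrapped = [t if t >= cut else t + 24 for t in ts]
        a = (sum(unwrapped) / len(unwrapped)) % 24
        err = sum(circ_dist(a, t) ** 2 for t in ts)
        if err < best_err:
            best_a, best_err = a, err
    return best_a

print(arc_mean_hours([8, 10]))   # 9.0
print(arc_mean_hours([23, 1]))   # 0.0 (midnight)
```

With prefix sums over the sorted values, the per-cut mean and error can be updated in O(1), recovering the O(n log n) bound mentioned above.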
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle, and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
Ben Bacarisse <ben.u...@bsb.me.uk> writes:
Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
Ben Bacarisse <ben.u...@bsb.me.uk> writes:
[...]
I found this problem interesting but only later in the discussion as
I have been using this metric for some time. What got interesting
(to me) was that there is another sound interpretation of the
average as suggested by Tim, [...]
It wasn't my idea. I got it from a posting by Mike Terry on
December 21. I hadn't seen that formulation of arithmetic mean
before and I was amazed that it worked. So I can't really take
any credit for the suggestion.
ironically prompted by a general definition of what might
constitute an average that I had posted and failed to follow
through on.
I remember your posting as coming after the one by Mike Terry,
and so I thought your comments were derived from his. Sorry if
my conclusions there were off the mark.
You are right in that MT posted the same general formulation 9 hours
before I did (though I'd not seen that). My confusion came from your
explanation, to me, of "conventional average":
"Sorry, I meant to refer to your formulation of average"
followed by the formula I gave rather than the entirely equivalent
one given by MT.
Ahh, that makes sense. My comment muddied the waters; I was
thinking of your formula and the MT formula as the same, and
rather inadvertently said "your formulation" (after all, I was
responding to and had quoted your formula) without mentioning
that I had also read MT's post before responding.
Anyway, the credit I'm giving is for your considering this a
reasonable thing to try calculate for arc lengths, rather than
[...]
Thank you for that. And in that respect I think I can modestly
say that some credit is deserved (especially considering the
amount of effort it took to code something up that calculated
this measure).
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
A couple of further thoughts only one of which is directed at you
Tim (at the end)...
Given the general conception of a mean (rather than any other kind of
summary statistic) as minimising the sum of squares of some "distance"
metric:
Sum_{i=1,n} difference(A, t(i))^2
we can characterise the two contenders by what distance is being
minimised. For the less well discussed "conventional average" (to use
your terminology) we are minimising the sum of squares of the shorter
arc lengths between A and the t(i).
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
The mean vector itself (which may not lie on the unit circle) minimises
the sum of the squares of the length of the vector differences m-u(i):
Sum_{i=1,n} |m - u(i)|^2
That's true but I thought of it just as the average of the time
value "masses" on the unit circle.
and any other vector along the same line as m (i.e. c*m for real c)
minimises
Sum_{i=1,n} |c*m - u(i)|^2
I'm not sure what you're saying here. Only the one point (m in
your terminology) minimizes the sum of squares of distances. How
do other points on the same line qualify?
This includes, of course, m projected out to the unit circle.
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle,
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
Incidentally, I found another book on statistics on spheres, and that
gets the average over in a few paragraphs. It states, without
considering any alternatives, that the average is the vector average. I
can't find anyone using or citing the arc-length minimising average,
despite it being natural on a sphere to find the mid-point of, say, a
set of points on the Earth's surface.
Do you mind my asking, which book was that?
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant. But it could be relevant in some contexts (thinking
in particular of astronautics, or, dare I say it, rocket science).
Tim, my best shot at calculating this other average sorts the points to
find the widest gap. I suspect your algorithm is similar since you say
it is O(n log n).
Your instinct that I sorted the time values is right. If you
think a little more I think you will see that the widest gap need
not have any particular relationship to where the cut point is.
No further elaboration for now so as not to be a spoiler (and I
know you like to try figuring things out yourself).
On 2022-12-31 15:42, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
Center of mass of a set of ideal points (particles) and vector average
are the same:
CoM = Sum_{i=1..n} Mi * Ri / Sum_{i=1..n} Mi
Mi = masses, Ri = vectors. If all Mi are the same you get
CoM = Sum_{i=1..n} Ri / n
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle, and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
Arc length is proportional to angle:
L = Rα, R is radius, α is angle in radians.
Averaging arcs is equivalent to averaging angles.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
A couple of further thoughts only one of which is directed at you
Tim (at the end)...
Given the general conception of a mean (rather than any other kind
of summary statistic) as minimising the sum of squares of some
"distance" metric:
Sum_{i=1,n} difference(A, t(i))^2
we can characterise the two contenders by what distance is being
minimised. For the less well discussed "conventional average" (to
use your terminology) we are minimising the sum of squares of the
shorter arc lengths between A and the t(i).
For the "vector average", we convert the t(i) to unit vectors u(i)
and we calculate the mean of the u(i) to get a vector m. The
"average", A, is just the direction of this vector -- another
point on the unit circle. In this case we are minimising the sum
of squares of the /chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
It's the [same] calculation though.
The mean vector itself (which may not lie on the unit circle)
minimises the sum of the squares of the length of the vector
differences m-u(i):
Sum_{i=1,n} |m - u(i)|^2
That's true but I thought of it just as the average of the time
value "masses" on the unit circle.
and any other vector along the same line as m (i.e. c*m for real
c) minimises
Sum_{i=1,n} |c*m - u(i)|^2
I'm not sure what you're saying here. Only the one point (m in
your terminology) minimizes the sum of squares of distances. How
do other points on the same line qualify?
I over-simplified by omitting "compared to any other point on that
circle". The mean vector is the absolute minimum, but at any radius
|c*m| the sum above is at a minimum at c*m.
This includes, of course, m projected out to the unit circle.
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle,
m (the vector) is the centre of mass, but for some c, c*m lies on
the circle. All the lines from the data points to c*m are actual
chords. The average "time", A, is the point on the circle that
minimises the sum of squares of the chords between A and the t(i).
The "other" average minimises the sum of the angular
distances.
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
It's more interesting because it really is about chords!
Incidentally, I found another book on statistics on spheres, and
that gets the average over in a few paragraphs. It states,
without considering any alternatives, that the average is the
vector average. I can't find anyone using or citing the
arc-length minimising average, despite it being natural on a
sphere to find the mid-point of, say, a set of points on the
Earth's surface.
Do you mind my asking, which book was that?
http://library.sadjad.ac.ir/opac//temp/19108.pdf
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant. But it could be relevant in some contexts (thinking
in particular of astronautics, or, dare I say it, rocket science).
Tim, my best shot at calculating this other average sorts the
points to find the widest gap. I suspect your algorithm is
similar since you say it is O(n log n).
Your instinct that I sorted the time values is right. If you
think a little more I think you will see that the widest gap need
not have any particular relationship to where the cut point is.
No further elaboration for now so as not to be a spoiler (and I
know you like to try figuring things out yourself).
You must be confusing me with someone else!
On 2022-12-31 15:42, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
Center of mass of a set of ideal points (particles) and vector
average are same:
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle, and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
Arc length is proportional to angle:
Averaging arcs is equivalent to averaging angles.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Averaging arcs is equivalent to averaging angles.
Angles are a one-dimensional measure.
Finding an arc length
"average" of points on a sphere needs a two-dimensional result.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Apparently you have not yet understood the problem.
Why don't
you try writing a program that inputs a set of points normalized
to be on the unit sphere, and then calculates the arc length
average point (on the unit sphere) of those input points?
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
<cut>
The "other" average minimises the sum of the angular
distances.
I think you mean it minimizes the sum of the squares of the
angular distances.
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
It's more interesting because it really is about chords!
I don't buy the "really" here. The center of mass is fundamental
and always well-defined. Furthermore it works for collections of
points that are not all on the same circle or sphere. Looking at
the problem in terms of chords seems somewhat artificial, or at
least less fundamental.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
<cut>
The "other" average minimises the sum of the angular
distances.
I think you mean it minimizes the sum of the squares of the
angular distances.
Yes thanks. I originally thought I'd write an abbreviation but then I
convinced myself I would not forget any of the "squares of"s.
Proofreading might have been useful too.
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
It's more interesting because it really is about chords!
I don't buy the "really" here. The center of mass is fundamental
and always well-defined. Furthermore it works for collections of
points that are not all on the same circle or sphere. Looking at
the problem in terms of chords seems somewhat artificial, or at
least less fundamental.
There are other ways for something to be interesting (at least to me).
When you plot the two averages, and look at the chords vs the arcs, it becomes clear why one average "looks" more average than the other.
On 2023-01-08 16:45, Tim Rentsch wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Averaging arcs is equivalent to averaging angles.
Angles are a one-dimensional measure.
Averaging arcs is still equivalent to averaging angles, which is a
trivial result of elementary trigonometry.
Finding an arc length
"average" of points on a sphere needs a two-dimensional result.
Points do not have arcs.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Apparently you have not yet understood the problem.
Again, averages of arcs and angles are equivalent up to a multiplier.
Why don't
you try writing a program that inputs a set of points normalized
to be on the unit sphere, and then calculates the arc length
average point (on the unit sphere) of those input points?
Why don't you write a formula specifying your need?
Programs are written according to the specifications. Numeric programs require a properly stated problem, rather than a bunch of words
arbitrarily thrown in a meaningless sentence as above.
The specification was just a few
sentences.
But
the problem is /finding/ a specific average -- the point (or angle) that minimises the sum of squares of the distances (or angles) from that
average point (or angle).
The fact that it makes no odds (as everyone knows) whether we consider
angles (often called central angles in this context) or great circle distances is not the issue. It's finding the average that minimises the
sum of squares of differences that's the issue.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in programming languages.)
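Under the stated reading of the formula (A is the sought unit vector, "x" the cross product, "." the dot product, and the ratio evaluated with atan2), the objective is straightforward to evaluate; a minimal sketch, leaving the actual minimisation over the sphere to any general-purpose optimiser:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def objective(A, points):
    """Sum over the points P_i of the squared great-circle angle
    between unit vector A and P_i, computed as
    atan2(|A x P_i|, A . P_i)."""
    return sum(math.atan2(math.hypot(*cross(A, p)), dot(A, p)) ** 2
               for p in points)

# Two points on the equator, 30 degrees either side of the x-axis:
p1 = (math.cos(math.pi / 6),  math.sin(math.pi / 6), 0.0)
p2 = (math.cos(math.pi / 6), -math.sin(math.pi / 6), 0.0)

# The x-axis scores 2 * (pi/6)**2 and beats either data point.
print(objective((1.0, 0.0, 0.0), [p1, p2]))
```

(The three-argument `math.hypot` needs Python 3.8+.) This only evaluates the objective; finding the minimising A is the hard part being discussed.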
Programs are written according to the specifications. Numeric programs
require a properly stated problem, rather than a bunch of words
arbitrarily thrown in a meaningless sentence as above.
Given the context, I think that's a very biased characterisation of
what's been said here.
My first job was as a "numerical analyst", and the very first program I
was employed to write was for a professor of statistics. It was to
calculate a novel kind of fit line. The specification was just a few
sentences. No formulas. It was perfectly clear, and I could get the
job done. I don't think this is unusual. Words are often enough, and
they can avoid undue over specification. For example, the problem in
question is essentially the same if the points are given by latitude and
longitude on a non-unit sphere, but the formula would look very
different.
On 2023-01-08 21:41, Ben Bacarisse wrote:
But
the problem is /finding/ a specific average -- the point (or angle) that
minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
Average is just such formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
The fact that it makes no odds (as everyone knows) whether we consider
angles (often called central angles in this context) or great circle
distances is not the issue. It's finding the average that minimises the
sum of squares of differences that's the issue.
Minimum of squares (Euclidean norm) is represented by the mathematical
mean.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
2. What is "A"?
3. What operations do "x" and "." denote?
Programs are written according to the specifications. Numeric programs
require a properly stated problem, rather than a bunch of words
arbitrarily thrown in a meaningless sentence as above.
Given the context, I think that's a very biased characterisation of
what's been said here.
My first job was as a "numerical analyst", and the very first program I
was employed to write was for a professor of statistics. It was to
calculate a novel kind of fit line. The specification was just a few
sentences. No formulas. It was perfectly clear, and I could get the
job done. I don't think this is unusual. Words are often enough, and
they can avoid undue over specification. For example, the problem in
question is essentially the same if the points are given by latitude and
longitude on a non-unit sphere, but the formula would look very
different.
This just means that you formulated the specifications yourself, being
able to read your professor's mind.
Not everybody's mind is easy to read or worth reading... (:-))
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-08 21:41, Ben Bacarisse wrote:
But
the problem is /finding/ a specific average -- the point (or angle) that
minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
I don't follow. It's a mathematical problem for which an algorithm is sought.
I don't see any connection to some science or engineering
space. If it /came/ from some scientific study, it has already been
turned into what the programmer needs to come up with.
Average is just such a formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
So if I asked you for code to calculate the arithmetic mean of an array
of numbers you would ask for all this first, declaring it not yet a programming issue?
I really don't see where all this comes from.
The fact that it makes no odds (as everyone knows) whether we consider
angles (often called central angles in this context) or great circle
distances is not the issue. It's finding the average that minimises the
sum of squares of differences that's the issue.
Minimum of squares (Euclidean norm) is represented by the mathematical
mean.
This problem obviously uses a different norm.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
Obviously I can't guess how much I can safely assume so I expected there might be questions. arctan(p / q) is usually coded as atan2(p, q).
2. What is "A";
The unit vector that is being sought -- the result of the algorithm. It represents a point on the unit sphere that is the desired average of the input points.
3. What operations(?) "x" and "." denote?
x denotes the cross product, and . the dot product.
Do you agree that this is not a trivial problem? If not, please give
the trivial algorithm!
To my mind "find the point on the sphere that minimises the sum of the squares of the great-circle distances between that point and the input points" is clearer than the formula I gave because the formula says too
much.
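To make the specification concrete, here is a small numerical sketch (mine, not from the thread; all function names are invented for illustration). It minimises the sum of squared great-circle distances atan2(|A x P|, A . P)^2 by projected gradient descent on the unit sphere, starting from the normalised vector sum:

```python
import math

def cross(a, b):
    # 3D cross product
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def normalize(a):
    n = norm(a)
    return tuple(x / n for x in a)

def geodesic(a, b):
    # great-circle distance between unit vectors: atan2(|a x b|, a . b)
    return math.atan2(norm(cross(a, b)), dot(a, b))

def cost(a, pts):
    # the objective from the thread: sum of squared geodesic distances
    return sum(geodesic(a, p)**2 for p in pts)

def spherical_mean(pts, steps=200, h=1e-6, lr=0.1):
    # start from the chord (vector) mean projected onto the sphere
    a = normalize([sum(p[i] for p in pts) for i in range(3)])
    for _ in range(steps):
        # crude forward-difference gradient, re-projected onto the sphere
        g = []
        for i in range(3):
            e = list(a)
            e[i] += h
            g.append((cost(normalize(e), pts) - cost(a, pts)) / h)
        a = normalize([a[i] - lr * g[i] for i in range(3)])
    return a
```

For two points the minimiser is the geodesic midpoint, so spherical_mean([(1,0,0), (0,1,0)]) should return approximately (0.707, 0.707, 0). A serious implementation would use an analytic gradient and a convergence test; this is only meant to show that the stated problem is directly computable.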
On 2023-01-09 04:25, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-08 21:41, Ben Bacarisse wrote:
But the problem is /finding/ a specific average -- the point (or angle)
that minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
I don't follow. It's a mathematical problem for which an algorithm is
sought.
Firstly, this is the comp.programming group.
Secondly, if an algorithm is sought then there must be a mathematical
formulation of what a numeric solution is sought for.
(Finding a minimum is an optimization problem; there are thousands of
algorithms already.)
I don't see any connection to some science or engineering
space. If it /came/ from some scientific study, it has already been
turned into what the programmer needs to come up with.
Average is just such a formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
So if I asked you for code to calculate the arithmetic mean of an array
of numbers you would ask for all this first, declaring it not yet a
programming issue?
Then I would ask you, if this is a solution, what was the problem?
I really don't see where all this comes from.
From an average (:-)) customer who does not mind his own
business. Customers have programming ideas, wrong ideas, ideas they
express in terms they have no idea of (:-)). It takes time and tact to
bring the customer back to the field of his expertise and get from him
what he actually needs in order to solve his actual problem.
The fact that it makes no odds (as everyone knows) whether we consider
angles (often called central angles in this context) or great circle
distances is not the issue. It's finding the average that minimises the
sum of squares of differences that's the issue.
Minimum of squares (Euclidean norm) is represented by the mathematical
mean.
This problem obviously uses a different norm.
I see nothing obvious in an unstated problem.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
Obviously I can't guess how much I can safely assume so I expected there
might be questions. arctan(p / q) is usually coded as atan2(p, q).
2. What is "A";
The unit vector that is being sought -- the result of the algorithm. It
represents a point on the unit sphere that is the desired average of the
input points.
3. What operations(?) "x" and "." denote?
x denotes the cross product, and . the dot product.
Do you agree that this is not a trivial problem? If not, please give
the trivial algorithm!
If I correctly understood your explanation (and with a few obvious corrections):
Sum atan2 (|A x Pi|, A · Pi) --> min
i=1..n {A | A on the circle}
|A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)
A · Pi = |A| |Pi| cos θi
atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'
Where θi' is θi with lower semicircle mapped to the upper one (for
whatever reason).
(If mapping was unintended you must have taken the direction of the
cross product for the sign for |A x Pi|)
On a circle varying the angle between A and (R, 0), where R is the
radius, you find that the mean of the angles between Pi and (R, 0)
yields the minimum.
To my mind "find the point on the sphere that minimises the sum of the
squares of the great-circle distances between that point and the input
points" is clearer than the formula I gave because the formula says too
much.
If you mean the circumference, then that is trivially proportional to
the angle.
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-09 04:25, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-08 21:41, Ben Bacarisse wrote:
But the problem is /finding/ a specific average -- the point (or angle)
that minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
I don't follow. It's a mathematical problem for which an algorithm is
sought.
Firstly, this is the comp.programming group.
Do you consider discussing algorithms off topic here? Seriously?
Secondly, if algorithm is sought then there must be a mathematical
formulation of what a numeric solution is sought for.
There needs to be a problem statement that can be discussed and refined.
The original post about circular averages has led to two different refinements -- the vector average (projected onto the unit circle) and
the arc-length average. Both generalise to more dimensions and I
happened to mention in passing that the great-circle average would be a natural fit in some situations.
Tim mentioned (again in passing) that it looks hard. You seemed to
suggest it was not, but I am not sure you knew what the problem was at
that time.
All of this seems to me to be exactly what comp.programming should be
about.
(Finding a minimum is an optimization problem, there are thousands of
algorithms already)
Programmers should know that specifications are not algorithm designs.
I don't see any connection to some science or engineering
space. If it /came/ from some scientific study, it has already been
turned into what the programmer needs to come up with.
Average is just such a formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
So if I asked you for code to calculate the arithmetic mean of an array
of numbers you would ask for all this first, declaring it not yet a
programming issue?
Then I would ask you, if this is a solution, what was the problem?
This is a solution to the simpler problem.
But we are not in that situation. A few knowledgeable and experienced
people are discussing a programming problem.
You are welcome to join
in, but the best way for you to do that is to ask whatever questions you think would help you understand what's being discussed. So far, your
replies have alternated between "it's trivial" and "it's unstated".
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
Obviously I can't guess how much I can safely assume so I expected there
might be questions. arctan(p / q) is usually coded as atan2(p, q).
2. What is "A";
The unit vector that is being sought -- the result of the algorithm. It
represents a point on the unit sphere that is the desired average of the
input points.
3. What operations(?) "x" and "." denote?
x denotes the cross product, and . the dot product.
Do you agree that this is not a trivial problem? If not, please give
the trivial algorithm!
If I correctly understood your explanation (and with a few obvious
corrections):
Sum atan2 (|A x Pi|, A · Pi) --> min
i=1..n {A | A on the circle}
No. A is a vector denoting a point on the unit sphere. And you have
missed out the power of two. Were these changes what you call "obvious corrections"?
|A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)
A · Pi = |A| |Pi| cos θi
atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'
Where θi' is θi with lower semicircle mapped to the upper one (for
whatever reason).
(If mapping was unintended you must have taken the direction of the
cross product for the sign for |A x Pi|)
On a circle varying the angle between A and (R, 0), where R is the
radius, you find that the mean of the angles between Pi and (R, 0)
yields the minimum.
This does not even solve the related case of finding the average point
on the unit circle.
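A tiny example (mine, not from the thread) makes the point concrete: the plain arithmetic mean of the angles minimises the unwrapped sum of squares, but on the circle differences wrap modulo 2*pi, so for angles straddling the 0/360 seam the naive mean lands on the wrong side of the circle:

```python
import math

def naive_mean(angles):
    # ordinary arithmetic mean, ignoring the circular wrap
    return sum(angles) / len(angles)

def circular_cost(alpha, angles):
    # sum of squared *shortest* angular distances from alpha
    def wrap(d):
        # wrap an angular difference into (-pi, pi]
        return math.atan2(math.sin(d), math.cos(d))
    return sum(wrap(t - alpha)**2 for t in angles)

angles = [math.radians(350), math.radians(10)]   # straddle the seam
m = naive_mean(angles)   # pi radians, i.e. 180 degrees
# Under the wrapped cost, 0 degrees beats the naive mean of 180 degrees:
# circular_cost(0.0, angles) < circular_cost(m, angles)
```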
On 2023-01-09 21:37, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-09 04:25, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-08 21:41, Ben Bacarisse wrote:
But the problem is /finding/ a specific average -- the point (or angle)
that minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
I don't follow. It's a mathematical problem for which an algorithm is
sought.
Firstly, this is the comp.programming group.
Do you consider discussing algorithms off topic here? Seriously?
No, I consider off topic discussing mathematical problems /=
algorithms. You did not discuss any algorithms so far.
Secondly, if an algorithm is sought then there must be a mathematical
formulation of what a numeric solution is sought for.
There needs to be a problem statement that can be discussed and refined.
The original post about circular averages has led to two different
refinements -- the vector average (projected onto the unit circle) and
the arc-length average. Both generalise to more dimensions and I
happened to mention in passing that the great-circle average would be a
natural fit in some situations.
I missed generalization to 3D. Sorry.
[You kept on writing about the unit *circle* and an angle, where you
actually meant the unit sphere and polar angles/spherical coordinates]
Tim mentioned (again in passing) that it looks hard. You seemed to
suggest it was not, but I am not sure you knew what the problem was at
that time.
All of this seems to me to be exactly what comp.programming should be
about.
Differential geometry and spherical trigonometry are certainly off
topic.
(Finding a minimum is an optimization problem, there are thousands of
algorithms already)
Programmers should know that specifications are not algorithm designs.
Who said they are?
I don't see any connection to some science or engineering
space. If it /came/ from some scientific study, it has already been
turned into what the programmer needs to come up with.
Average is just such a formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
So if I asked you for code to calculate the arithmetic mean of an array
of numbers you would ask for all this first, declaring it not yet a
programming issue?
Then I would ask you, if this is a solution, what was the problem?
This is a solution to the simpler problem.
And what was that problem? [The adjective adds/clarifies nothing]
But we are not in that situation. A few knowledgeable and experienced
people are discussing a programming problem.
I saw no discussion on programming so far, either.
You are welcome to join
in, but the best way for you to do that is to ask whatever questions you
think would help you understand what's being discussed. So far, your
replies have alternated between "it's trivial" and "it's unstated".
Yes, because you wrote about minimum of arc length on a *circle*. And
that is trivial.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
Obviously I can't guess how much I can safely assume so I expected there
might be questions. arctan(p / q) is usually coded as atan2(p, q).
2. What is "A";
The unit vector that is being sought -- the result of the algorithm. It
represents a point on the unit sphere that is the desired average of the
input points.
3. What operations(?) "x" and "." denote?
x denotes the cross product, and . the dot product.
Do you agree that this is not a trivial problem? If not, please give
the trivial algorithm!
If I correctly understood your explanation (and with a few obvious
corrections):
Sum atan2 (|A x Pi|, A · Pi) --> min
i=1..n {A | A on the circle}
No. A is a vector denoting a point on the unit sphere. And you have
missed out the power of two. Were these changes what you call "obvious
corrections"?
|A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)
A · Pi = |A| |Pi| cos θi
atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'
Where θi' is θi with lower semicircle mapped to the upper one (for
whatever reason).
(If mapping was unintended you must have taken the direction of the
cross product for the sign for |A x Pi|)
On a circle varying the angle between A and (R, 0), where R is the
radius, you find that the mean of the angles between Pi and (R, 0)
yields the minimum.
This does not even solve the related case of finding the average point
on the unit circle.
It elementarily does, for both the formula you gave (after corrections)
and the circumference:
Sum Length_Of_Arc (Pi, A)^2 =
i=1..n
= Sum (R Angle (Pi, A))^2 =
i=1..n
= R^2 Sum (Angle (Pi, (R, 0)) - Angle (A, (R, 0)))^2
i=1..n
(R, 0) denotes vector x = R, y = 0. Let θi denote Angle (Pi, (R, 0))
and α do Angle (A, (R, 0)) then
= R^2 Sum (θi - α)^2 --> min
i=1..n
Derivative d/dα:
2 R^2 Sum (θi - α) = 0 <=> (R /= 0)
i=1..n
<=> Sum θi - nα = 0 <=>
i=1..n
<=> α = Sum θi / n = mean ({θi})
i=1..n
The angle between A and (R, 0) is the mean of angles of points.
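A quick numerical sanity check of the last step (my snippet, not part of the post): on the real line, Sum (θi - α)^2 is indeed minimised at the arithmetic mean, which is all the derivative argument claims. The modular wrap of angles on the actual circle is a separate matter.

```python
# Check that sum (theta_i - alpha)^2 is smallest at the mean alpha.
thetas = [0.3, 1.1, 2.0, 2.9]
alpha = sum(thetas) / len(thetas)

def cost(a):
    return sum((t - a)**2 for t in thetas)

# cost(alpha) <= cost(a) for every nearby candidate a
candidates = [alpha - 0.5, alpha - 0.01, alpha + 0.01, alpha + 0.5]
```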
Anyway, there are lengthy papers on spherical averages introducing
iterative solutions, some with source code for the algorithms. That is
the right way to compute it, if you really wanted...
[Again, it is not programming. In the case of spherical geometry a
programmer will find an expert mathematician who will have the
problem properly stated.
The next stop is by an expert applied mathematician who will outline a
workable numeric solution. Then the programmer can start.]
On 2023-01-08 16:45, Tim Rentsch wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Averaging arcs is equivalent to averaging angles.
Angles are a one-dimensional measure.
Averaging arcs is still equivalent to averaging angles, which is a
trivial result of elementary trigonometry.
Finding an arc length
"average" of points on a sphere needs a two-dimensional result.
Points do not have arcs.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Apparently you have not yet understood the problem.
Again, averages of arcs and angles are equivalent up to a
multiplier.
Why don't
you try writing a program that inputs a set of points normalized
to be on the unit sphere, and then calculates the arc length
average point (on the unit sphere) of those input points?
Why don't you write a formula specifying your need?
Programs are written according to the specifications. Numeric
programs require a properly stated problem, rather than a bunch
of words arbitrarily thrown in a meaningless sentence as above.
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
<cut>
The "other" average minimises the sum of the angular
distances.
I think you mean it minimizes the sum of the squares of the
angular distances.
Yes thanks. I originally thought I'd write an abbreviation but then I
convinced myself I would not forget any of the "squares of"s.
Proofreading might have been useful too.
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
It's more interesting because it really is about chords!
I don't buy the "really" here. The center of mass is fundamental
and always well-defined. Furthermore it works for collections of
points that are not all on the same circle or sphere. Looking at
the problem in terms of chords seems somewhat artificial, or at
least less fundamental.
There are other ways for something to be interesting (at least to me).
When you plot the two averages, and look at the chords vs the arcs, it
becomes clear why one average "looks" more average than the other.
I can see a possible misunderstanding here. You might have taken my "it really is about chords" to mean "it is fundamentally about chords", but
I said "it really is about chords" because you thought I'd used the word chord incorrectly to refer to the lines between the centre of mass and
the data points. I was saying "I really did mean chords".
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-09 21:37, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-09 04:25, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Firstly, this is the comp.programming group.
Do you consider discussing algorithms off topic here? Seriously?
No, I consider off topic discussing mathematical problems /=
algorithms. You did not discuss any algorithms so far.
Algorithms don't come out of thin air. The objective must be discussed first. You can't seriously think that everyone here should remain
silent until, out of nowhere, someone says "consider this algorithm to
find the average of a set of points".
I explained in the text you cut. Maybe your reference to the thousands
of minimisation algorithms was just an off-the-cuff remark, but it reads
as if you were suggesting using one of them as the solution to this
programming task.
Anyway, there are lengthy papers on spherical averages introducing
iterative solutions, some with source code for the algorithms. That is
the right way to compute it, if you really wanted...
I'd like to read one. Do you have a citation, please?
[Again, it is not programming. In the case of spherical geometry a
programmer will find an expert mathematician who will have the
problem properly stated.
The problem has been properly stated. We seek an algorithm that finds a point that minimises the sum of squares of the great-circle distances
between that point and the input points. You could have asked if any of
this was unclear. To me it is both clear and intuitive.
The next stop is by an expert applied mathematician who will outline a
workable numeric solution. Then the programmer can start.]
Ah, you hand off the task of devising algorithms to mathematicians?
That would make programming very dull. Anyway, you would not tackle
this sort of problem until an expert applied mathematician has done everything except code up a workable numerical solution. Is that also
the case for the average angle problem?
"Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:
On 2023-01-08 16:45, Tim Rentsch wrote:
"Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:
Averaging arcs is equivalent to averaging angles.
Angles are a one-dimensional measure.
Averaging arcs is still equivalent to averaging angles, which is a
trivial result of elementary trigonometry.
Finding an arc length
"average" of points on a sphere needs a two-dimensional result.
Points do not have arcs.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Apparently you have not yet understood the problem.
Again, averages of arcs and angles are equivalent up to a
multiplier.
Why don't
you try writing a program that inputs a set of points normalized
to be on the unit sphere, and then calculates the arc length
average point (on the unit sphere) of those input points?
Why don't you write a formula specifying your need?
Programs are written according to the specifications. Numeric
programs require a properly stated problem, rather than a bunch
of words arbitrarily thrown in a meaningless sentence as above.
After reading this response and also three other responses of
yours (to Ben Bacarisse) downstream, I can't help but think you
have little or no interest in understanding the problem, offering
any useful comments about it, or trying to write a program to
solve the problem. Apparently the only interest you do have is
making captious remarks.
On 2023-01-11 03:41, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Anyway, there are lengthy papers on spherical averages introducing
iterative solutions, some with source code for the algorithms. That is
the right way to compute it, if you really wanted...
I'd like to read one. Do you have a citation, please?
Perhaps this could be a starting point:
"Spherical Averages and Applications to Spherical Splines and
Interpolation" Samuel R. Buss, Jay P. Fillmore
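For readers who want to experiment before reading such a paper: the iterative schemes in this literature typically map the points into the tangent plane at the current estimate, average there, and map back, repeating until the update vanishes. Below is a generic sketch of that idea (my code illustrating the tangent-plane iteration; it is not taken from the cited paper, and it assumes no input point is antipodal to the running estimate):

```python
import math

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def normalize(a):
    n = norm(a)
    return tuple(x / n for x in a)

def log_map(a, p):
    # tangent vector at a pointing toward p, length = geodesic distance
    d = max(-1.0, min(1.0, dot(a, p)))
    theta = math.acos(d)
    if theta < 1e-12:
        return (0.0, 0.0, 0.0)
    u = tuple(p[i] - d * a[i] for i in range(3))  # component of p orthogonal to a
    un = norm(u)                                  # zero only if p is antipodal to a
    return tuple(theta * x / un for x in u)

def exp_map(a, v):
    # walk distance |v| along the great circle from a in direction v
    t = norm(v)
    if t < 1e-12:
        return a
    return normalize(tuple(math.cos(t) * a[i] + math.sin(t) * v[i] / t
                           for i in range(3)))

def spherical_average(pts, iters=50):
    # start from the projected vector mean, then refine
    a = normalize(tuple(sum(p[i] for p in pts) for i in range(3)))
    for _ in range(iters):
        v = [0.0, 0.0, 0.0]
        for p in pts:
            lv = log_map(a, p)
            v = [v[i] + lv[i] / len(pts) for i in range(3)]
        a = exp_map(a, v)
    return a
```

For two points this converges to the geodesic midpoint, and in general to a stationary point of the sum of squared great-circle distances.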