Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
On 14/12/2022 12:24 pm, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Subject line: "Another little puzzle"
Your specification seems too simple to qualify as a puzzle, so
what am I missing?
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed requiring the interesting part: doing all
that in modular (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing all that in
the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
On 2022-12-14 14:10, Richard Heathfield wrote:
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing all
that in the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
Right, you must count the wrap-ups.
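A minimal sketch (Python, illustrative; `mod_then_divide` is a made-up name) of the goose in question, reducing the sum mod 24 before dividing:

```python
# Naive approach: reduce the sum mod 24, then divide by n.
# Works only when the true sum never wraps past 24.
def mod_then_divide(times):
    return (sum(times) % 24) / len(times)

print(mod_then_divide([8, 10]))    # 9.0 -- no wrap, fine
print(mod_then_divide([11, 13]))   # 0.0 -- wrong: the sum wrapped once,
                                   # dropping 24, i.e. 24/2 = 12 hours
                                   # from the average
```

Counting the wrap-ups, as suggested above, means adding back 24 * wraps / len(times) before reducing the final result.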
On 14/12/2022 1:35 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 14:10, Richard Heathfield wrote:
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing all that
in the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
Right, you must count the wrap-ups.
So why do you need mod?
On 2022-12-14 15:10, Richard Heathfield wrote:
On 14/12/2022 1:35 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 14:10, Richard Heathfield wrote:
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing
all that in the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
Right, you must count the wrap-ups.
[...]
So why do you need mod?
As I said, the challenge is only interesting in modulo arithmetic
BTW, averaging floats is a nasty problem too. A naive
implementation quickly loses precision.
BTW, averaging floats is a nasty problem too. A naive implementation
quickly loses precision.
We're dealing with 'o'clock' and "HH:MM", and nowadays we have 64-bit
integer types and there are even 128-bit integers mooching around
looking for a reason to exist. You'd have to average a hell of a lot of
times even to /need/ floats, let alone lose significant precision.
On 2022-12-14 16:18, Richard Heathfield wrote:
BTW, averaging floats is a nasty problem too. A naive implementation
quickly loses precision.
We're dealing with 'o'clock' and "HH:MM", and nowadays we have 64-bit
integer types and there are even 128-bit integers mooching around
looking for a reason to exist. You'd have to average a hell of a lot
of times even to /need/ floats, let alone lose significant precision.
I never suggested float for averaging time stamps, I pointed out that averaging is not a simple problem. E.g. try this one:
function Average (X : Float; N : Positive) return Float is
   Sum : Float := 0.0;
begin
   for Index in 1..N loop
      Sum := Sum + X;
   end loop;
   return Sum / Float (N);
end Average;
The function does naive averaging. For simplicity it just sums up the
same number X N times and divides by N.
Average (1.0, 17_000_000) = 0.986895
Average (1.0, 100_000_000) = 0.167772
Average (1.0, 200_000_000) = 0.0838861
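These numbers follow from single-precision rounding: once the running sum reaches 2^24 = 16777216, adding 1.0 falls below the rounding threshold and is absorbed, so the sum saturates and the "average" is just 2^24 / n. A quick check (Python, emulating a 32-bit Float accumulator via struct; not from the thread):

```python
import struct

def f32(x):
    # Round x to the nearest IEEE-754 single-precision value,
    # emulating a 32-bit Float accumulator.
    return struct.unpack('f', struct.pack('f', x))[0]

saturation = f32(2.0 ** 24)                  # 16777216.0
print(f32(saturation + 1.0) == saturation)   # True: the sum stops growing here

# So for n > 2**24 the naive loop effectively returns 2**24 / n:
for n in (17_000_000, 100_000_000, 200_000_000):
    print(n, saturation / n)
```

which reproduces 0.986895, 0.167772 and 0.0838861 to six figures. Kahan (compensated) summation, or simply a double-precision accumulator, avoids the loss.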
Richard Heathfield <rjh@cpax.org.uk> writes:
On 14/12/2022 12:24 pm, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Subject line: "Another little puzzle"
Your specification seems too simple to qualify as a puzzle, so
what am I missing?
Answering this is part of the puzzle!
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
On 21/12/2022 12:03 pm, Stefan Ram wrote:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
The specification says "the 24-hour day" --- one day, not two days. So
the average of 23.5 and 1.5 is 12.5. 0.5 is a mistake.
average( 11.5, 13.5 )== 12.5
Correct.
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
Your hope is misplaced, because one of your test cases bears an
incorrect expected result.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
You misunderstood your specification.
There is a
perfectly plausible interpretation of "average" that gives 0.5.
Richard Heathfield <rjh@cpax.org.uk> writes:
On 21/12/2022 12:03 pm, Stefan Ram wrote:
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
You misunderstood your specification.
That's very severe!
On 2022-12-21 18:05, Ben Bacarisse wrote:
There is a
perfectly plausible interpretation of "average" that gives 0.5.
Not very plausible, as it violates two properties of an average:
min({Xi}) <= avr({Xi}) <= max({Xi})
and
avr({C * Xi}) = C * avr({Xi})
The second one allows calculation of the average as:
23.5 / 2 + 1.5 / 2 = ?
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
The OP was probably deliberately rather vague on this point! The
easiest definition for "average" literally doesn't make sense in a
modulo context: we can ADD the modular values, but in general dividing
in such a mathematical setting doesn't make much sense, so
Average ([i=1,2,..n] x_i) =[def]= Sum ([i=1,2,..n] x_i) /n
would be inappropriate due to the /n operation.
HOWEVER, there's another characterisation for the average of a set,
which looks more promising in a modular (or other more general)
setting: the average is the value of V which MINIMISES THE "ERROR" calculated by
error = Sum ([i=1,2,..n] (V - x_i)^2)
that is, minimises the sum of squares of differences from V.
[Incidentally, if we minimise the sum of absolute differences from V,
that characterises the mode (aka "midpoint") of the sample, but I think
the median is more what is wanted here...]
Now, as to how to CALCULATE the V above???
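One blunt way to calculate that V, good enough for a sanity check: brute-force over a fine grid of candidates, scoring each by the sum of squared shorter-arc distances (a sketch in Python with made-up names, not from the thread):

```python
def circ_dist(a, b):
    # Length of the shorter arc between two times on the 24-hour circle.
    d = abs(a - b) % 24.0
    return min(d, 24.0 - d)

def circular_mean(times, steps=24 * 60):
    # Brute force: try every grid point V (one-minute resolution by
    # default) and keep the one minimising Sum (V - x_i)^2, with the
    # circular distance above playing the role of the difference.
    return min((k * 24.0 / steps for k in range(steps)),
               key=lambda v: sum(circ_dist(v, t) ** 2 for t in times))

print(circular_mean([8, 10]))      # 9.0
print(circular_mean([23.5, 1.5]))  # 0.5
```

The grid search is O(steps * n), but it sidesteps the question of how to solve the minimisation analytically.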
an accumulation of "frustration" at what I consider to be
"unimaginative" misinterpretations and
dogmatic claims of what was originally requested...
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
On 2022-12-21 20:55, Mike Terry wrote:
The OP was probably deliberately rather vague on this point! The easiest definition for "average"
literally doesn't make sense in a modulo context: we can ADD the modular values, but in general
dividing in such a mathematical setting doesn't make much sense, so
Average ([i=1,2,..n] x_i) =[def]= Sum ([i=1,2,..n] x_i) /n
would be inappropriate due to the /n operation.
Unless you calculate everything as reals or integers and then take the remainder modulo 24. Which is
kind of the most natural definition, at least to me.
HOWEVER, there's another characterisation for the average of a set, which looks more promising in
a modular (or other more general) setting: the average is the value of V which MINIMISES THE
"ERROR" calculated by
error = Sum ([i=1,2,..n] (V - x_i)^2)
that is, minimises the sum of squares of differences from V.
This has exactly the same problem as above, because subtraction and multiplication (squaring) have
different semantics in modular arithmetic.
[Incidentally, if we minimise the sum of absolute differences from V, that characterises the mode
(aka "midpoint") of the sample, but I think the median is more what is wanted here...]
Actually it is the same beast. Median is the least C-norm defined as |x-y|.
Mean is the least Euclidean norm (squares), the thing you proposed.
Mean and average are the same for real numbers.
You are right that median (and C-norm) can be unambiguously defined in modular numbers. But nobody
would ever qualify this as a puzzle! (:-))
[...]
Now, as to how to CALCULATE the V above???
As I said, in real numbers V is the mean, i.e.
V = Sum_{i=1..n} Xi / n =def= mean({Xi | i=1..n})
Any numeric method in the end maps each Xi to some ordinary number, calculates a classic mean, and
maps back. The problem is with the forward mapping being ambiguous. One method to disambiguate is
staying within the same day. Another could be taking a median and considering 12 hours before and 12
hours after it. And there are many more.
P.S. There are differences between the average and the mean. The OP referred to the average, which may
mean something completely counterintuitive (no pun intended (:-))
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
As before, the problem is underspecified.
Maybe so - I'd just say here that you are discussing an /algorithm/ to compute a result, rather than the definition of what is to be considered
the correct answer, which was the focus in my post.
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down the drain.
No, we are rather supposed to *elicit* requirements... which is just the
tip of the (software) analysis iceberg.
On 22/12/2022 17:30, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown
wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry
wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer
actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down
the drain.
No, we are rather supposed to *elicit* requirements... which is
just the
tip of the (software) analysis iceberg.
Perhaps I should have put the smiley in a bigger font?
On 22/12/2022 17:30, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down the drain.
No, we are rather supposed to *elicit* requirements... which is just the tip of the (software) analysis iceberg.
Perhaps I should have put the smiley in a bigger font?
Of course the correct way to do things is to work with the customer to establish the requirements and specifications that everyone is happy
with, and then code to those specifications.
And if it turns out that the specifications were unclear, you clarify
them, rather than guessing.
On 22/12/2022 9:06 pm, David Brown wrote:
On 22/12/2022 17:30, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just dont get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down the drain.
No, we are rather supposed to *elicit* requirements... which is just the tip of the (software) analysis iceberg.
Perhaps I should have put the smiley in a bigger font?
David, obviously it's your call, but do you really think you're going to
have a meaningful conversation with someone who on 05/01/2022 10:57
replied to you[1]:
"It is you who are an incompetent fucking clown just repeating some
bullshit you read in the blogs."
And you're far from being the only one he abuses in this wise.
God invented killfiles for a reason.
[1] If you want to look it up, see just over halfway down this thread: https://groups.google.com/g/comp.programming/c/gKQ3q0D5IT0/m/7JoZ5e7ICAAJ
On Thursday, 22 December 2022 at 22:07:01 UTC+1, David Brown wrote:
Perhaps I should have put the smiley in a bigger font?
You can laugh your frauds and bad conscience as much as you like,
I am not kidding at all.
On 22/12/2022 23:15, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 22:07:01 UTC+1, David Brown wrote:
Perhaps I should have put the smiley in a bigger font?
You can laugh your frauds and bad conscience as much as you like,
I am not kidding at all.
Happy Christmas to you too :-)
On Wednesday, 21 December 2022 at 20:56:01 UTC+1, Mike Terry wrote:
an accumulation of "frustration" at what I consider to be
"unimaginative" misinterpretations and
dogmatic claims of what was originally requested...
Besides that the problem was and remains at best poorly
phrased (and I agree that guesswork around here should
rather be banned), there is a much simpler and canonical
way to define "midpoint" in a modular space, namely the
midpoint along the shortest of the two possible paths.
And that's actually a quite common kind of calculation...
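For n=2 that shortest-path midpoint can be written down directly (a sketch in Python, made-up name, not from the thread):

```python
def clock_midpoint(a, b):
    # Midpoint along the shorter of the two arcs between times a and b
    # on a 24-hour clock. Exactly opposite times (12 h apart) are
    # ambiguous; here they resolve to the arc running forward from a.
    diff = (b - a) % 24.0
    if diff > 12.0:
        diff -= 24.0          # take the short way round
    return (a + diff / 2.0) % 24.0

print(clock_midpoint(8, 10))       # 9.0
print(clock_midpoint(23.5, 1.5))   # 0.5
print(clock_midpoint(11.5, 13.5))  # 12.5
```

This passes both of the OP's test cases, though it does not generalise to n > 2 on its own.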
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Sometimes that vector has zero length, and the average is undefined,
but otherwise the length of the vector gives an indication of the
variability of the data.
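The vector recipe above, sketched in Python (made-up name, not from the thread):

```python
import math

def circular_average(times):
    # Scale [0, 24) to [0, 2*pi), average the unit vectors, and convert
    # the direction of the mean vector back to a time of day.
    angles = [t * math.pi / 12.0 for t in times]
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    if math.hypot(x, y) < 1e-12:
        raise ValueError("mean vector is zero; average undefined")
    return (math.atan2(y, x) % (2.0 * math.pi)) * 12.0 / math.pi

print(round(circular_average([23.5, 1.5]), 6))   # 0.5
print(round(circular_average([11.5, 13.5]), 6))  # 12.5
```

The length of the mean vector (math.hypot(x, y)) is the variability indicator mentioned above: near 1 for tightly clustered times, near 0 for widely spread ones.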
Why do I consider this a reasonable interpretation of the problem?
Well, given a list of times of day when a train is observed to pass
some station, the circular 24-hour-time average should be our best
estimate of the scheduled time.
Obviously there are other possible readings of the problem, but I was
not able to justify any of them as useful for any real-world
applications. This is a case where I hope I am wrong and there /are/
other circular averages with practical interpretations.
On 24/12/2022 14:21, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their
average. For example, the average of "eight o'clock"
and "ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example
"HH:MM" or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a
result, the puzzle has not yet been solved (unless I
have overlooked a contribution or misworded
expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past
midnight".)
I hope that these test cases encode sensible
expectations for an average of two times on a 24-hour
clock in the spirit of the example given in the OP,
which was, "the average of eight o'clock and ten
o'clock would be nine o'clock", since these test cases
just have rotated that example by 3.5 and 15.5 hours.
I believe that I have not seen an algorithm so far in
this thread that would pass these tests.
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot
more interesting.
The input is a collection, t(n), of n > 1 numbers in [0,
24). The average should be a number, A, in [0, 24) that
minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to
think in terms of variance). So far, this is just what an
average is. The key point is what is the distance (or
difference) whose sum (or sum of squares) we want to
minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the
shorter arc) is one metric, and I think a reasonable one.
(Minimizing the absolute value of the distance gives a
result more like the median than the mean, IIANM.)
Yes - perhaps Ben was thinking of making
Sum_{i=1,n} signed_distance(A, t(i)) = 0
rather than minimising the (absolute) distance.
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter arc)
is one metric, and I think a reasonable one. (Minimizing the absolute
value of the distance gives a result more like the median than the
mean, IIANM.)
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
On 24/12/2022 14:21, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
(You can choose any representation, for example "HH:MM"
or "seconds since midnight".)
Thanks for all replies!
I waited a few days before answering to allow
sufficient time to think about the problem.
There were not enough tests written and run. As a result,
the puzzle has not yet been solved (unless I have overlooked
a contribution or misworded expectations).
So, here are two possible test cases.
average( 23.5, 1.5 )== 0.5
average( 11.5, 13.5 )== 12.5
(I use hours as units, so "0.5" means, "half past midnight".)
I hope that these test cases encode sensible expectations
for an average of two times on a 24-hour clock in the spirit
of the example given in the OP, which was, "the average of
eight o'clock and ten o'clock would be nine o'clock", since
these test cases just have rotated that example by 3.5 and
15.5 hours.
I believe that I have not seen an algorithm so far in this
thread that would pass these tests.
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter arc)
is one metric, and I think a reasonable one. (Minimizing the absolute
value of the distance gives a result more like the median than the
mean, IIANM.)
Yes - perhaps Ben was thinking of making
Sum_{i=1,n} signed_distance(A, t(i)) = 0
rather than minimising the (absolute) distance. That would match up
with minimising the sum of squares of the distance (mean). Or he just
meant the median as you say, which is different from the mean. Both
have plausible "average" properties,
and meet my "minimal" criteria
for what I think anyone would require of an "average" applying to a
modular arithmetic:
- if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING
AROUND THE END OF THE MODULO LOOP] then the "average" will be within
those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.
- if all x_i are translated by a fixed offset like 1 hour, the average
will translate by the same offset, i.e. it maintains the same "modulo"
relation with the translated x_i.
hmm, thinking now I'll add the similar:
- if all x_i are "reflected" about a chosen time, then the new average
will be the reflected time of the original average.
(those aren't intended to be a definition of a "modular average", just
to exclude suggestions that blatantly lack the right symmetry for such
an average.)
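These symmetry criteria can be spot-checked mechanically. A sketch (Python assumed; `circular_mean` is my own name for the vector-average idea that comes up just below, not a definition anyone in the thread committed to):

```python
from math import sin, cos, atan2, tau

def circular_mean(xs):
    # Vector-average "modular mean" on a 24-hour circle (one candidate
    # definition; undefined when the summed vector is ~zero).
    x = sum(cos(v / 24 * tau) for v in xs)
    y = sum(sin(v / 24 * tau) for v in xs)
    return atan2(y, x) / tau * 24 % 24

xs = [23.97, 0.02, 0.03]  # all within a few minutes, wrapping past 24
m = circular_mean(xs)
# Criterion 1: the mean lies inside the short arc covered by the data.
assert 23.97 <= m or m <= 0.03

# Criterion 2: translating every x_i by 1 hour translates the mean too.
shifted = circular_mean([(v + 1) % 24 for v in xs])
assert abs(shifted - (m + 1) % 24) < 1e-9

# Criterion 3: reflecting about a chosen time reflects the mean.
reflected = circular_mean([(5 - v) % 24 for v in xs])
assert abs(reflected - (5 - m) % 24) < 1e-9
```

The vector average passes all three checks because rotations and reflections of the circle map the summed vector accordingly.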
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
Yes I think this one would work best in most situations, given that
the OP didn't have a precise mathematical requirement for his solution
- probably he wants something that just works reasonably as expected.
(It's my favourite answer.)
Another way of thinking of this approach is "balancing" the 24-hour
modular circle, where we've put unit weights at each of the times x_i.
E.g. we look for a balancing line through the centre of the circle.
[Note times on the circle go from 1-24, not 1-12 like a normal clock.]
Also, thinking like this, it's equivalent to (conventional) averaging
of the sin of the angular differences from the average (balancing)
point, since the sin will give the perpendicular distance of the
weight from the balancing line. So it still fits with the general
scheme of minimising distance(x_i, v)^2, but with a slightly different
metric (distance function) in use.
As sin(x) is close to x when x is small, the vector average solution
would match closely with the mean (by arc length) solution when the
times are close together, but diverge more noticeably when the times
are spread out...
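That divergence is easy to see numerically. A sketch (Python assumed) comparing the vector average with the plain arithmetic mean for a tight cluster and for spread-out times:

```python
from math import sin, cos, atan2, tau

def vector_average(xs):
    # Direction of the summed unit vectors, converted back to hours.
    x = sum(cos(v / 24 * tau) for v in xs)
    y = sum(sin(v / 24 * tau) for v in xs)
    return atan2(y, x) / tau * 24 % 24

# Tight cluster: vector average ~ arithmetic mean (sin(x) ~ x).
tight = [8.9, 9.0, 9.1]
print(vector_average(tight), sum(tight) / 3)   # both ~9.0

# Spread-out times: the two notions diverge sharply.
spread = [6.0, 9.0, 21.0]
print(vector_average(spread), sum(spread) / 3) # ~6.0 versus 12.0
```

For {6, 9, 21} the 9 and 21 vectors cancel exactly, leaving only the 6 o'clock vector, so the vector average is 6.0 while the arithmetic mean is 12.0.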
Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:
On 24/12/2022 14:21, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter arc)
is one metric, and I think a reasonable one. (Minimizing the absolute
value of the distance gives a result more like the median than the
mean, IIANM.)
Yes - perhaps Ben was thinking of making
Sum_{i=1,n} signed_distance(A, t(i)) = 0
rather than minimising the (absolute) distance. That would match up
with minimising the sum of squares of the distance (mean). Or he just
meant the median as you say, which is different from the mean. Both
have plausible "average" properties,
Well, I was trying to be deliberately vague (hence my talking about
"an average" rather than mean, median or whatever) but I was not
vague enough with the formula.
I wasn't thinking of a zero net distance because some metrics (though
probably not applicable in this situation) don't always allow for a
zero sum. I should have referred to minimising some kind of distance
sum, be it the absolute value of a signed sum or, as in the solution
I was proposing, the sum of directional differences (i.e. full
vectors).
and meet my "minimal" criteria
for what I think anyone would require of an "average" applying to a
modular arithmetic:
- if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING
AROUND THE END OF THE MODULO LOOP] then the "average" will be within
those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.
And when the x_i are widely spaced the average might be considered
nearly meaningless. If you have {0, 8, 16} hours then what is the
"average"? Are there cases where every set of circular values has a
meaningful "average"? If not, you really need more than a single
number answer.
(These are not questions to you, specifically; I'm interested to know
of interpretations that go beyond the way I see things.)
- if all x_i are translated by a fixed offset like 1 hour, the average
will translate by the same offset, i.e. it maintains the same "modulo"
relation with the translated x_i.
hmm, thinking now I'll add the similar:
- if all x_i are "reflected" about a chosen time, then the new average
will be the reflected time of the original average.
(those aren't intended to be a definition of a "modular average", just
to exclude suggestions that blatantly lack the right symmetry for such
an average.)
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
Yes I think this one would work best in most situations, given that
the OP didn't have a precise mathematical requirement for his solution
- probably he wants something that just works reasonably as expected.
(It's my favourite answer.)
Another way of thinking of this approach is "balancing" the 24-hour
modular circle, where we've put unit weights at each of the times x_i.
E.g. we look for a balancing line through the centre of the circle.
Is that the same? For {0, 8, 16} any of the three lines through 0, 8 or
16 balances, but the vector sum interpretation is undefined. And for
{0, 6, 12, 18} there are two balance lines, neither of which seems to be
a natural answer for the average.
(Also, every balance line gives two answers, so how do you pick?)
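Those degenerate cases are easy to confirm numerically: the summed unit vectors cancel, so the vector-average direction is undefined. A sketch (Python assumed):

```python
from math import sin, cos, hypot, tau

def resultant_length(xs):
    # Length of the summed unit vectors; ~0 means no meaningful average.
    x = sum(cos(v / 24 * tau) for v in xs)
    y = sum(sin(v / 24 * tau) for v in xs)
    return hypot(x, y)

print(resultant_length([0, 8, 16]))      # ~0: three times 120 degrees apart
print(resultant_length([0, 6, 12, 18]))  # ~0: four times 90 degrees apart
print(resultant_length([8, 10]))         # clearly non-zero: average well defined
```

The resultant length also serves as the "strength" indicator mentioned elsewhere in the thread: near n the data is tightly clustered, near 0 the average is meaningless.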
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter arc)
is one metric, and I think a reasonable one. (Minimizing the absolute
value of the distance gives a result more like the median than the
mean, IIANM.)
The problem has a natural interpretation in terms of angles. Whatever
the circular quantity is, convert the values to unit vectors round a
circle. For times of day, just scale [0, 24) to [0, 2*pi). The
average is then just the direction of the average vector, converted
back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
Sometimes that vector has zero length, and the average is undefined,
but otherwise the length of the vector gives an indication of the
variability of the data.
Why do I consider this a reasonable interpretation of the problem?
Well, given a list of times of day when a train is observed to pass
some station, the circular 24-hour-time average should be our best
estimate of the scheduled time.
Obviously there are other possible readings of the problem, but I was
not able to justify any of them as useful for any real-world
applications. This is a case where I hope I am wrong and there /are/
other circular averages with practical interpretations.
A nice analysis.
Some observations (all of which I believe are correct, but no
guarantees are offered).
The "vector average" metric is easier to calculate and gives an
explicit indication when the data produces a degenerate result.
Considering only cases that do not produce a degenerate result:
(1) the "conventional average" metric and the "vector average"
metric can produce different results when there are more than two
data points. (For one or two data points I think they are always
the same.)
(2) the "vector average" metric can be calculated in O(n). The
"conventional average" metric can be calculated in O(n*n).
(3) when the times are all reasonably close together, the
"conventional average" metric seems more likely to produce a result
corresponding to what most people expect. Another way to say this
is that the "vector average" metric will sometimes surprise people.
... But actually the balancing line idea wasn't my proper thinking -
that was that we "counter-balanced" the circle by adding a new weight
somewhere on the circle. There is a unique such place where such a
weight can go, and a unique weight that works. The "average" for the
x_i is then directly OPPOSITE the counter weight. This is all
assuming we don't start with an exactly balancing sample! The mass of
the counter-weight is the magnitude of your vector sum, indicating
the "strength" of the average...
On 24/12/2022 22:30, Ben Bacarisse wrote:
Mike Terry <news.dead.person.stones@darjeeling.plus.com> writes:
Another way of thinking of this approach is "balancing" the 24-hour
modular circle, where we've put unit weights at each of the times x_i.
E.g. we look for a balancing line through the centre of the circle.
Is that the same? For {0, 8, 16} any of the three lines through 0, 8 or
16 balances, but the vector sum interpretation is undefined. And for
{0, 6, 12, 18} there are two balance lines, neither of which seems to be
a natural answer for the average.
Yes, it's logically the same, in so far as WHEN the x_i aren't
already balanced [aka they /have/ a meaningful average] we get the
same average both ways. (Adding vectors is effectively a way of
calculating moments of the weights in the balancing view of things.)
On 24/12/2022 14:21, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
ram@zedat.fu-berlin.de (Stefan Ram) writes:
[...]
The input is a collection, t(n), of n > 1 numbers in [0, 24). The
average should be a number, A, in [0, 24) that minimises
Sum_{i=1,n} distance(A, t(i))
(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance). So far, this is just what an average is. The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise? For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.
Minimizing the square of the distance (aka length of the shorter
arc) is one metric, and I think a reasonable one. [...]
Yes - perhaps Ben was thinking of making
Sum_{i=1,n} signed_distance(A, t(i)) = 0
rather than minimising the (absolute) distance. [...]
The problem has a natural interpretation in terms of angles.
Whatever the circular quantity is, convert the values to unit
vectors round a circle. For times of day, just scale [0, 24) to
[0, 2*pi). The average is then just the direction of the average
vector, converted back to a time of day.
Averaging the "time unit vectors" is another plausible metric.
Yes I think this one would work best in most situations, [...]
Another way of thinking of this approach is "balancing" the 24-hour
modular circle, where we've put unit weights at each of the times
x_i. E.g. we look for a balancing line through the centre of the
circle. [Note times on the circle go from 1-24, not 1-12 like a
normal clock.]
Also, thinking like this, it's equivalent to (conventional)
averaging of the sin of the angular differences from the average
(balancing) point, since the sin will give the perpendicular
distance of the weight from the balancing line. [...]
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
As before, the problem is underspecified.
Some remarks not specifically in reply to you, Tim...
I didn't really say very much. Your comments are a lot more
interesting.
[...]
A nice analysis.
A generous appraisal, thanks. I was trying to be vague enough not to
put my foot in it, but was too specific with the formula.
Some observations (all of which I believe are correct, but no
guarantees are offered).
The "vector average" metric is easier to calculate and gives an
explicit indication when the data produces a degenerate result.
Considering only cases that do not produce a degenerate result:
(1) the "conventional average" metric and the "vector average"
metric can produce different results when there are more than two
data points. (For one or two data points I think they are always
the same.)
(2) the "vector average" metric can be calculated in O(n). The
"conventional average" metric can be calculated in O(n*n).
By which I presume you mean and /not/ in O(n). (Big O is just an
upper bound.)
(3) when the times are all reasonably close together, the
"conventional average" metric seems more likely to produce a result
corresponding to what most people expect. Another way to say this
is that the "vector average" metric will sometimes surprise people.
I'm sorry to be obtuse, but what is the "conventional average"? The
name makes it sound trivial, but the quadratic time makes me certain
that it isn't.
My "conventional average" algorithm (which is not well thought out) was
to (a) rotate the data set to avoid the 23/0 boundary (not always
possible), (b) take the arithmetic mean, and then (c) rotate the result
back. E.g. [23,0,1] -> [0,1,2] by adding one, and the average is
mean[0,1,2] - 1 = 0.
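Ben's rotate, average, rotate-back step as a sketch (Python assumed; the offset is taken as given here, and choosing it well is exactly the part that turns out to be hard):

```python
def rotated_mean(times, offset, period=24.0):
    # (a) rotate the data so it avoids the wrap-around boundary,
    # (b) take the ordinary arithmetic mean,
    # (c) rotate the result back onto [0, period).
    shifted = [(t + offset) % period for t in times]
    return (sum(shifted) / len(shifted) - offset) % period

print(rotated_mean([23, 0, 1], offset=1))  # 0.0, matching the example above
```

With offset=1 the data [23, 0, 1] becomes [0, 1, 2], whose mean is 1, and rotating back gives 0.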
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[By "conventional average" I mean a time/direction A such that]
A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }
where 'difference' means the shorter arc length.
The "conventional average" metric can be calculated in O(n*n).
By which I presume you mean and /not/ in O(n). (Big O is just an
upper bound.)
What I mean is I have an algorithm that runs in time proportional
to n*n. I suspect that there is an algorithm with better big-O
behavior (probably O( n * log n )), but I don't actually have one
to show, so I simply gave the best big-O result that I knew.
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
I'm sorry to be obtuse, but what is the "conventional average"? The
name makes it sound trivial, but the quadratic time makes me certain
that it isn't.
Sorry, I meant to refer to your formulation of average
A that minimizes { Sum_{i=1,n} difference(A, t(i))^2 }
where 'difference' means the shorter arc length. This formula
matches the result for 'mean' on real numbers.
My "conventional average" algorithm (which is not well thought out) was
to (a) rotate the data set to avoid the 23/0 boundary (not always
possible), (b) take the arithmetic mean, and then (c) rotate the result
back. E.g. [23,0,1] -> [0,1,2] by adding one, and the average is
mean[0,1,2] - 1 = 0.
Yes, if you know where to split the cycle then the answer can be
found in O(n) time. But how can we figure out where to split the
cycle?
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
Well I handily stopped considering this at the stage where I assumed
there must be a simple way to spot the optimal rotation, so I never
thought it might have to be quadratic. Presumably your algorithm tries
all the offsets and minimises the result.
Looking at it a bit more I can't see a better way (but that might be
the Ratafia de Champagne). It feels as if there /should/ be one. In
fact it feels as if it should be linear.
On Thursday, 22 December 2022 at 16:50:49 UTC+1, David Brown wrote:
On 22/12/2022 01:05, Julio Di Egidio wrote:
On Thursday, 22 December 2022 at 01:01:08 UTC+1, Mike Terry wrote:
On 21/12/2022 21:54, Dmitry A. Kazakov wrote:
kind of most natural definition, at least to me.
That is unambiguous, but to my mind not at all what is wanted
You just don't get it, do you? This is (comp.)programming
where guesswork is a *capital sin*.
Programming is all about trying to guess what the customer actually
wanted, rather than what they asked for! :-)
Utter nonsense of the usual kind and an entire industry down the drain.
No, we are rather supposed to *elicit* requirements... which is just the
tip of the (software) analysis iceberg.
On 12/25/22 4:44 PM, Ben Bacarisse wrote:
[...]
I may have lost track of the problem, but wouldn't the addition of
the points in two-dimensions and finding the center of gravity let
you find the answer in O(N) time?
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
Well I handily stopped considering this at the stage where I assumed
there must be a simple way to spot the optimal rotation, so I never
thought it might have to be quadratic. Presumably your algorithm
tries all the offsets and minimises the result.
Right.
Looking at it a bit more I can't see a better way (but that might be
the Ratafia de Champagne). It feels as if there /should/ be one.
In fact it feels as if it should be linear.
My best so far is only O( n * log n ). Probably that isn't
optimal. I don't see how to make it linear though. Do you have
any ideas?
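One route to O(n log n), offered as a sketch of my own rather than as Tim's actual algorithm: sort the times, note that the optimal place to cut the circle open can only fall in one of the n gaps between consecutive sorted values, and use prefix sums so each candidate cut's sum-of-squares cost is evaluated in O(1). Sorting dominates.

```python
def circular_mean_sq(times, period=24.0):
    # Minimise the sum of squared deviations over all n candidate cuts.
    ts = sorted(t % period for t in times)
    n = len(ts)
    # Prefix sums of t and t^2 give each cut's cost in O(1).
    s1, s2 = [0.0], [0.0]
    for t in ts:
        s1.append(s1[-1] + t)
        s2.append(s2[-1] + t * t)
    best = None
    for k in range(n):
        # Cut before index k: the first k values wrap round, i.e. are
        # lifted by one full period in the linearised ordering.
        tot1 = s1[n] + k * period
        tot2 = s2[n] + 2 * period * s1[k] + k * period * period
        sse = tot2 - tot1 * tot1 / n   # sum of squares about the mean
        if best is None or sse < best[0]:
            best = (sse, (tot1 / n) % period)
    return best[1]

print(circular_mean_sq([23.5, 1.5]))   # ~0.5
print(circular_mean_sq([11.5, 13.5]))  # ~12.5
```

For each cut the linearised squared deviations are at least the shorter-arc ones, and for the cut opposite the true minimiser they coincide, so taking the minimum over all n cuts recovers the shorter-arc answer (at least on examples like the test cases above; I have not proved this in full generality).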
On Wednesday, 21 December 2022 at 20:56:01 UTC+1, Mike Terry wrote:
an accumulation of "frustration" at what I consider to be
"unimaginative" misinterpretations and dogmatic claims of what was
originally requested...
Besides that the problem was and remains at best poorly
phrased (and I agree that guesswork around here should
rather be banned), there is a much simpler and canonical
way to define "midpoint" in a modular space, namely the
midpoint along the shortest of the two possible paths.
And that's actually a quite common kind of calculation...
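The shortest-path midpoint of two times, as a sketch (Python assumed; for a diametrically opposed pair the two paths tie and the choice of arc is arbitrary):

```python
def midpoint(a, b, period=24.0):
    # Signed gap from a to b, folded into [-period/2, period/2):
    # this picks out the shorter of the two arcs between them.
    d = (b - a + period / 2) % period - period / 2
    return (a + d / 2) % period

print(midpoint(23.5, 1.5))   # 0.5
print(midpoint(11.5, 13.5))  # 12.5
print(midpoint(8, 10))       # 9.0
```

Unlike the vector average this is defined only for two values, but for n=2 it agrees with it (both walk halfway along the shorter arc), and it passes the thread's two test cases.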
ram@zedat.fu-berlin.de (Stefan Ram) writes:
I waited a few days before answering to allow
sufficient time to think about the problem.
(A few lines in this post contain more than 72 characters.)
I sometimes look at records of events like when someone got
up or went to bed on different days and wonder about the
"average time" he gets up or goes to bed. Some years ago,
I asked in another newsgroup how to average times, and from
answers there I kept this expression for a spreadsheet and
two times in cells A1 and B1:
REST(1.5+1/6.28318530717959*ARCTAN2((COS((A1-0.5)*6.28318530717959)+COS((B1-0.5)*6.28318530717959))/2,(SIN((A1-0.5)*6.28318530717959)+SIN((B1-0.5)*6.28318530717959))/2),1)
. I added a bit, like "REST", but the trigonometric part
came from that newsgroup. (Times t of a day in spreadsheets
often are 0 <= t < 1 because they often use the day as the
unit of time). At that time, I did not understand what was
going on! In hindsight, today, I see that this seems to
calculate the average via the average of the position of
the two points on a 24-hour clock in two dimensions.
The corresponding Wikipedia page is called, "Mean of circular
quantities".
I also have a copy of a page "Circular Values Math and
Statistics with C++11" saved. Other pages I found were
"Averages-Mean angle - Rosetta Code", "Averaging Angles -
The Math Doctors", or "algorithm - How do you calculate
the average of a set of angles - Stack Overflow". Maybe
some of these pages can still be found in the Web today.
There was also another web page that used lines some of
which have more than 72 characters. It contained,
|A good way to estimate an average angle, A, from a set of angle measurements
|a[i] 0<=i<N, is:
|
| sum_i_from_1_to_N sin(a[i])
| a = arctangent ---------------------------
| sum_i_from_1_to_N cos(a[i])
|A very careful study of the properties of this estimator is in the book
|"Statistics On Spheres", Geoffrey S. Watson, University of Arkansas Lecture
|Notes in the Mathematical Sciences, 1983 John Wiley & Sons, Inc.
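The arctangent-of-sums recipe quoted above is short to code directly. A minimal sketch in Python (the function name and the 24-hours-per-turn convention are my own choices, not from the quoted sources):

```python
import math

def circular_mean_hours(times):
    """Circular ("vector") mean of times on a 24-hour clock.

    Each time t (hours, 0 <= t < 24) becomes a point on the unit
    circle; we sum the sines and cosines and take the direction of
    the resulting vector with atan2, exactly as in the quoted
    formula.  Returns a value in [0, 24)."""
    s = sum(math.sin(t * 2 * math.pi / 24) for t in times)
    c = sum(math.cos(t * 2 * math.pi / 24) for t in times)
    return (math.atan2(s, c) * 24 / (2 * math.pi)) % 24

print(circular_mean_hours([8, 10]))   # approximately 9.0
print(circular_mean_hours([23, 1]))   # approximately 0 (midnight), not 12
```

Note that the naive arithmetic mean of 23:00 and 01:00 would give noon; the circular mean gives midnight, which is the point of the whole exercise.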
I found this problem interesting but only later in the discussion as
I have been using this metric for some time. What got interesting
(to me) was that there is another sound interpretation of the
average as suggested by Tim, [...]
ironically prompted by a general definition of what might
constitute an average that I had posted and failed to follow
through on.
ram@zedat.fu-berlin.de (Stefan Ram) writes:
I waited a few days before answering to allow
sufficient time to think about the problem.
I sometimes look at records of events like when someone got up
or went to bed on different days and wonder about the "average
time" he gets up or goes to bed. Some years ago, I asked in
another newsgroup how to average times, and from answers there
I [found a long and somewhat forbidding formula]
At that time, I did not understand what was going on! In
hindsight, today, I see that this seems to calculate the
average via the average of the position of the two points on a
24-hour clock in two dimensions.
[..other references to the same general idea..]
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
[...]
I found this problem interesting but only later in the discussion as
I have been using this metric for some time. What got interesting
(to me) was that there is another sound interpretation of the
average as suggested by Tim, [...]
It wasn't my idea. I got it from a posting by Mike Terry on
December 21. I hadn't seen that formulation of arithmetic mean
before and I was amazed that it worked. So I can't really take
any credit for the suggestion.
ironically prompted by a general definition of what might
constitute an average that I had posted and failed to follow
through on.
I remember your posting as coming after the one by Mike Terry,
and so I thought your comments were derived from his. Sorry if
my conclusions there were off the mark.
On 14/12/2022 1:06 pm, Dmitry A. Kazakov wrote:
On 2022-12-14 13:24, Stefan Ram wrote:
Given n times of the 24-hour day, print their average.
For example, the average of "eight o'clock" and
"ten o'clock" (n=2) would be "nine o'clock".
You probably missed to require the interesting part: doing all
that in the modular type (modulo 24) arithmetic:
20 + 5 = 1 (mod 24)
...which will give you the wrong answer. Chase that goose!
--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
On 2022-12-21 20:55, Mike Terry wrote:
The OP was probably deliberately rather vague on this point! The
easiest definition for "average" literally doesn't make sense in a
modulo context: we can ADD the modular values, but in general dividing
in such a mathematical setting doesn't make much sense, so
Average ([i=1,2,..n] x_i) =[def]= Sum ([i=1,2,..n] x_i) /n
would be inappropriate due to the /n operation.
Unless you calculate everything as reals or integers and then take the
remainder of 24. Which is kind of the most natural definition, at least
to me.
HOWEVER, there's another characterisation for the average of a set,
which looks more promising in a modular (or other more general)
setting: the average is the value of V which MINIMISES THE "ERROR" calculated by
error = Sum ([i=1,2,..n] (V - x_i)^2)
that is, minimises the sum of squares of differences from V.
This has exactly the same problem as above, because subtraction and
multiplication (power of two) have different semantics in modular
arithmetic.
[Incidentally, if we minimise the sum of absolute differences from V,
that characterises the mode (aka "midpoint") of the sample, but I think
the median is more what is wanted here...]
Actually it is the same beast. Median is the least C-norm defined as
|x-y|.
Mean is the least Euclidean norm (squares), the thing you proposed.
Mean and average are the same for real numbers.
You are right that median (and C-norm) can be unambiguously defined in modular numbers. But nobody would ever qualify this as a puzzle! (:-))
[...]
Now, as to how to CALCULATE the V above???
As I said, in real numbers V is the mean, i.e.
V = Sum_{i=1..n} (Xi) / n =def= mean({Xi | i=1..n})
Any numeric method is in the end mapping Xi to some normal number,
calculating a classic mean, mapping back. The problem is with the
forward mapping being ambiguous. One method to disambiguate is staying
within the same day. Another could be taking a median and considering
12 hours before and 12 hours after it. And there are many more.
P.S. There are differences between the average and the mean. The OP
referred to the average, which may mean something completely
counterintuitive (no pun intended (:-))
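The "map forward, take a classic mean, map back" idea sketched above can be made concrete. Assuming times in hours, here are two such forward mappings; `mean_near_reference` uses the first sample as the reference point (my simplification for illustration, rather than the median suggested in the post):

```python
def mean_same_day(times):
    """Forward-map every time to a real in [0, 24) "within the same
    day", take the classic mean, and reduce modulo 24."""
    return (sum(t % 24 for t in times) / len(times)) % 24

def mean_near_reference(times):
    """Alternative forward mapping: re-express every time within 12
    hours either side of a reference point (here, simply the first
    sample) before taking the classic mean."""
    ref = times[0]
    shifted = [ref + ((t - ref + 12) % 24) - 12 for t in times]
    return (sum(shifted) / len(shifted)) % 24

print(mean_same_day([23, 1]))        # 12.0 -- noon
print(mean_near_reference([23, 1]))  # 0.0  -- midnight
```

The two mappings disagree exactly when the sample straddles the cut at midnight, which is the ambiguity being discussed.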
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
[...]
I found this problem interesting but only later in the discussion as
I have been using this metric for some time. What got interesting
(to me) was that there is another sound interpretation of the
average as suggested by Tim, [...]
It wasn't my idea. I got it from a posting by Mike Terry on
December 21. I hadn't seen that formulation of arithmetic mean
before and I was amazed that it worked. So I can't really take
any credit for the suggestion.
ironically prompted by a general definition of what might
constitute an average that I had posted and failed to follow
through on.
I remember your posting as coming after the one by Mike Terry,
and so I thought your comments were derived from his. Sorry if
my conclusions there were off the mark.
You are right in that MT posted the same general formulation 9 hours
before I did (though I'd not seen that). My confusion came from your
explanation, to me, of "conventional average":
"Sorry, I meant to refer to your formulation of average"
followed by the formula I gave rather than the entirely equivalent
one given by MT.
Anyway, the credit I'm giving is for your considering this a
reasonable thing to try calculate for arc lengths, rather than
[...]
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
A couple of further thoughts only one of which is directed at you
Tim (at the end)...
Given the general conception of a mean (rather than any other kind of
summary statistic) as minimising the sum of squares of some "distance" metric:
Sum_{i=1,n} difference(A, t(i))^2
we can characterise the two contenders by what distance is being
minimised. For the less well discussed "conventional average" (to use
your terminology) we are minimising the sum of squares of the shorter
arc lengths between A and the t(i).
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
The mean vector itself (which may not lie on the unit circle) minimises
the sum of the squares of the length of the vector differences m-u(i):
Sum_{i=1,n} |m - u(i)|^2
and any other vector along the same line as m (i.e. c*m for real c)
minimises
Sum_{i=1,n} |c*m - u(i)|^2
This includes, of course, m projected out to the unit circle.
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Incidentally, I found another book on statistics on spheres, and that
gets the average over in a few paragraphs. It states, without
considering any alternatives, that the average is the vector average. I can't find anyone using or citing the arc-length minimising average,
despite it being natural on a sphere to find the mid-point of, say, a
set of points on the Earth's surface.
Tim, my best shot at calculating this other average sorts the points to
find the widest gap. I suspect your algorithm is similar since you say
it is O(n log n).
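For concreteness, a brute-force version of this calculation can be sketched as follows (an O(n^2) simplification of the sorting approach: rather than locating one special gap, it tries every gap as the cut point and keeps the candidate with the least sum of squared arc distances; names are my own):

```python
def circ_dist(a, b):
    """Shorter arc between two times on the 24-hour circle."""
    d = abs(a - b) % 24
    return min(d, 24 - d)

def arc_mean_hours(times):
    """Point A minimising sum(circ_dist(A, t)**2 for t in times).

    Brute-force sketch: cutting the circle just before each sorted
    point in turn, unwrapping, and taking the ordinary mean of the
    unwrapped values gives a set of candidates; the candidate with
    the least error wins."""
    ts = sorted(t % 24 for t in times)
    best_a, best_err = None, float("inf")
    for cut in ts:
        unwrapped = [t if t >= cut else t + 24 for t in ts]
        a = (sum(unwrapped) / len(unwrapped)) % 24
        err = sum(circ_dist(a, t) ** 2 for t in ts)
        if err < best_err:
            best_a, best_err = a, err
    return best_a

print(arc_mean_hours([8, 10]))   # 9.0
print(arc_mean_hours([23, 1]))   # 0.0 (midnight)
```

With prefix sums over the sorted values, the per-cut mean and error can be updated in O(1), recovering the O(n log n) bound mentioned above.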
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle, and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
Ben Bacarisse <ben.u...@bsb.me.uk> writes:
Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
Ben Bacarisse <ben.u...@bsb.me.uk> writes:
[...]
I found this problem interesting but only later in the discussion as
I have been using this metric for some time. What got interesting
(to me) was that there is another sound interpretation of the
average as suggested by Tim, [...]
It wasn't my idea. I got it from a posting by Mike Terry on
December 21. I hadn't seen that formulation of arithmetic mean
before and I was amazed that it worked. So I can't really take
any credit for the suggestion.
ironically prompted by a general definition of what might
constitute an average that I had posted and failed to follow
through on.
I remember your posting as coming after the one by Mike Terry,
and so I thought your comments were derived from his. Sorry if
my conclusions there were off the mark.
You are right in that MT posted the same general formulation 9 hours
before I did (though I'd not seen that). My confusion came from your
explanation, to me, of "conventional average":
"Sorry, I meant to refer to your formulation of average"
followed by the formula I gave rather than the entirely equivalent
one given by MT.
Ahh, that makes sense. My comment muddied the waters; I was
thinking of your formula and the MT formula as the same, and
rather inadvertently said "your formulation" (after all, I was
responding to and had quoted your formula) without mentioning
that I had also read MT's post before responding.
Anyway, the credit I'm giving is for your considering this a
reasonable thing to try calculate for arc lengths, rather than
[...]
Thank you for that. And in that respect I think I can modestly
say that some credit is deserved (especially considering the
amount of effort it took to code something up that calculated
this measure).
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
A couple of further thoughts only one of which is directed at you
Tim (at the end)...
Given the general conception of a mean (rather than any other kind of
summary statistic) as minimising the sum of squares of some "distance"
metric:
Sum_{i=1,n} difference(A, t(i))^2
we can characterise the two contenders by what distance is being
minimised. For the less well discussed "conventional average" (to use
your terminology) we are minimising the sum of squares of the shorter
arc lengths between A and the t(i).
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
The mean vector itself (which may not lie on the unit circle) minimises
the sum of the squares of the length of the vector differences m-u(i):
Sum_{i=1,n} |m - u(i)|^2
That's true but I thought of it just as the average of the time
value "masses" on the unit circle.
and any other vector along the same line as m (i.e. c*m for real c)
minimises
Sum_{i=1,n} |c*m - u(i)|^2
I'm not sure what you're saying here. Only the one point (m in
your terminology) minimizes the sum of squares of distances. How
do other points on the same line qualify?
This includes, of course, m projected out to the unit circle.
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle,
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
Incidentally, I found another book on statistics on spheres, and that
gets the average over in a few paragraphs. It states, without
considering any alternatives, that the average is the vector average. I
can't find anyone using or citing the arc-length minimising average,
despite it being natural on a sphere to find the mid-point of, say, a
set of points on the Earth's surface.
Do you mind my asking, which book was that?
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant. But it could be relevant in some contexts (thinking
in particular of astronautics, or, dare I say it, rocket science).
Tim, my best shot at calculating this other average sorts the points to
find the widest gap. I suspect your algorithm is similar since you say
it is O(n log n).
Your instinct that I sorted the time values is right. If you
think a little more I think you will see that the widest gap need
not have any particular relationship to where the cut point is.
No further elaboration for now so as not to be a spoiler (and I
know you like to try figuring things out yourself).
On 2022-12-31 15:42, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
Center of mass of a set of ideal points (particles) and vector average
are the same:
CoM = Sum_{i=1..n} Mi * Ri / Sum_{i=1..n} Mi
Mi = masses, Ri = vectors. If all Mi are the same you get
CoM = Sum_{i=1..n} Ri / n
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle, and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
Arc length is proportional to angle:
L = Rα, R is radius, α is angle in radians.
Averaging arcs is equivalent to averaging angles.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
A couple of further thoughts only one of which is directed at you
Tim (at the end)...
Given the general conception of a mean (rather than any other kind
of summary statistic) as minimising the sum of squares of some
"distance" metric:
Sum_{i=1,n} difference(A, t(i))^2
we can characterise the two contenders by what distance is being
minimised. For the less well discussed "conventional average" (to
use your terminology) we are minimising the sum of squares of the
shorter arc lengths between A and the t(i).
For the "vector average", we convert the t(i) to unit vectors u(i)
and we calculate the mean of the u(i) to get a vector m. The
"average", A, is just the direction of this vector -- another
point on the unit circle. In this case we are minimising the sum
of squares of the /chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
It's the [same] calculation though.
The mean vector itself (which may not lie on the unit circle)
minimises the sum of the squares of the length of the vector
differences m-u(i):
Sum_{i=1,n} |m - u(i)|^2
That's true but I thought of it just as the average of the time
value "masses" on the unit circle.
and any other vector along the same line as m (i.e. c*m for real
c) minimises
Sum_{i=1,n} |c*m - u(i)|^2
I'm not sure what you're saying here. Only the one point (m in
your terminology) minimizes the sum of squares of distances. How
do other points on the same line qualify?
I over-simplified by omitting "compared to any other point on that
circle". The mean vector is the absolute minimum, but at any radius
|c*m| the sum above is at a minimum at c*m.
This includes, of course, m projected out to the unit circle.
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle,
m (the vector) is the centre of mass, but for some c, c*m lies on
the circle. All the lines from the data points to c*m are actual
chords. The average "time", A, is the point on the circle that
minimises the sum of squares of the chords between A and the t(i).
The "other" average minimises the sum of the angular
distances.
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
It's more interesting because it really is about chords!
Incidentally, I found another book on statistics on spheres, and
that gets the average over in a few paragraphs. It states,
without considering any alternatives, that the average is the
vector average. I can't find anyone using or citing the
arc-length minimising average, despite it being natural on a
sphere to find the mid-point of, say, a set of points on the
Earth's surface.
Do you mind my asking, which book was that?
http://library.sadjad.ac.ir/opac//temp/19108.pdf
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant. But it could be relevant in some contexts (thinking
in particular of astronautics, or, dare I say it, rocket science).
Tim, my best shot at calculating this other average sorts the
points to find the widest gap. I suspect your algorithm is
similar since you say it is O(n log n).
Your instinct that I sorted the time values is right. If you
think a little more I think you will see that the widest gap need
not have any particular relationship to where the cut point is.
No further elaboration for now so as not to be a spoiler (and I
know you like to try figuring things out yourself).
You must be confusing me with someone else!
On 2022-12-31 15:42, Tim Rentsch wrote:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
For the "vector average", we convert the t(i) to unit vectors u(i) and
we calculate the mean of the u(i) to get a vector m. The "average", A,
is just the direction of this vector -- another point on the unit
circle. In this case we are minimising the sum of squares of the
/chord/ lengths between A and the t(i).
I think of this approach differently. I take the time values
t(i) as being unit masses on the unit circle, and calculate the
center of mass. As long as the center of mass is not the origin
we can project it from the origin to find a corresponding time
value on the unit circle (which in my case is done implicitly by
using atan2()).
Center of mass of a set of ideal points (particles) and vector
average are same:
This distinction between arc lengths and chord lengths helps to
visualise where these averages differ, and why the conventional
average may seem more intuitive.
Interesting perspective. I wouldn't call them chord lengths
because I think of a chord as being between two points both on
the same circle, and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
Arc length is proportional to angle:
Averaging arcs is equivalent to averaging angles.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Averaging arcs is equivalent to averaging angles.
Angles are a one-dimensional measure.
Finding an arc length
"average" of points on a sphere needs a two-dimensional result.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Apparently you have not yet understood the problem.
Why don't
you try writing a program that inputs a set of points normalized
to be on the unit sphere, and then calculates the arc length
average point (on the unit sphere) of those input points?
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
<cut>
The "other" average minimises the sum of the angular
distances.
I think you mean it minimizes the sum of the squares of the
angular distances.
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
It's more interesting because it really is about chords!
I don't buy the "really" here. The center of mass is fundamental
and always well-defined. Furthermore it works for collections of
points that are not all on the same circle or sphere. Looking at
the problem in terms of chords seems somewhat artificial, or at
least less fundamental.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
<cut>
The "other" average minimises the sum of the angular
distances.
I think you mean it minimizes the sum of the squares of the
angular distances.
Yes thanks. I originally thought I'd write an abbreviation but then I
convinced myself I would not forget any of the "squares of"s.
Proofreading might have been useful too.
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
It's more interesting because it really is about chords!
I don't buy the "really" here. The center of mass is fundamental
and always well-defined. Furthermore it works for collections of
points that are not all on the same circle or sphere. Looking at
the problem in terms of chords seems somewhat artificial, or at
least less fundamental.
There are other ways for something to be interesting (at least to me).
When you plot the two averages, and look at the chords vs the arcs, it becomes clear why one average "looks" more average than the other.
On 2023-01-08 16:45, Tim Rentsch wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Averaging arcs is equivalent to averaging angles.
Angles are a one-dimensional measure.
Averaging arcs is still equivalent to averaging angles, which is a
trivial result of elementary trigonometry.
Finding an arc length
"average" of points on a sphere needs a two-dimensional result.
Points do not have arcs.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Apparently you have not yet understood the problem.
Again, averages of arcs and angles are equivalent up to a multiplier.
Why don't
you try writing a program that inputs a set of points normalized
to be on the unit sphere, and then calculates the arc length
average point (on the unit sphere) of those input points?
Why don't you write a formula specifying your need?
Programs are written according to the specifications. Numeric programs require a properly stated problem, rather than a bunch of words
arbitrarily thrown in a meaningless sentence as above.
The specification was just a few
sentences.
But
the problem is /finding/ a specific average -- the point (or angle) that minimises the sum of squares of the distances (or angles) from that
average point (or angle).
The fact that it makes no odds (as everyone knows) whether we consider
angles (often called central angles in this context) or great circle distances is not the issue. It's finding the average that minimises the
sum of squares of differences that's the issue.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in programming languages.)
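Under the stated reading of the formula (A is the sought unit vector, "x" the cross product, "." the dot product, and the ratio evaluated with atan2), the objective is straightforward to evaluate; a minimal sketch, leaving the actual minimisation over the sphere to any general-purpose optimiser:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def objective(A, points):
    """Sum over the points P_i of the squared great-circle angle
    between unit vector A and P_i, computed as
    atan2(|A x P_i|, A . P_i)."""
    return sum(math.atan2(math.hypot(*cross(A, p)), dot(A, p)) ** 2
               for p in points)

# Two points on the equator, 30 degrees either side of the x-axis:
p1 = (math.cos(math.pi / 6),  math.sin(math.pi / 6), 0.0)
p2 = (math.cos(math.pi / 6), -math.sin(math.pi / 6), 0.0)

# The x-axis scores 2 * (pi/6)**2 and beats either data point.
print(objective((1.0, 0.0, 0.0), [p1, p2]))
```

(The three-argument `math.hypot` needs Python 3.8+.) This only evaluates the objective; finding the minimising A is the hard part being discussed.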
Programs are written according to the specifications. Numeric programs
require a properly stated problem, rather than a bunch of words
arbitrarily thrown in a meaningless sentence as above.
Given the context, I think that's a very biased characterisation of
what's been said here.
My first job was as a "numerical analyst", and the very first program I
was employed to write was for a professor of statistics. It was to
calculate a novel kind of fit line. The specification was just a few
sentences. No formulas. It was perfectly clear, and I could get the
job done. I don't think this is unusual. Words are often enough, and
they can avoid undue over specification. For example, the problem in
question is essentially the same if the points are given by latitude and
longitude on a non-unit sphere, but the formula would look very
different.
On 2023-01-08 21:41, Ben Bacarisse wrote:
But
the problem is /finding/ a specific average -- the point (or angle) that
minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
Average is just such formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
The fact that it makes no odds (as everyone knows) whether we consider
angles (often called central angles in this context) or great circle
distances is not the issue. It's finding the average that minimises the
sum of squares of differences that's the issue.
Minimum of squares (Euclidean norm) is represented by the mathematical
mean.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
2. What is "A"?
3. What operations do "x" and "." denote?
Programs are written according to the specifications. Numeric programs
require a properly stated problem, rather than a bunch of words
arbitrarily thrown in a meaningless sentence as above.
Given the context, I think that's a very biased characterisation of
what's been said here.
My first job was as a "numerical analyst", and the very first program I
was employed to write was for a professor of statistics. It was to
calculate a novel kind of fit line. The specification was just a few
sentences. No formulas. It was perfectly clear, and I could get the
job done. I don't think this is unusual. Words are often enough, and
they can avoid undue over specification. For example, the problem in
question is essentially the same if the points are given by latitude and
longitude on a non-unit sphere, but the formula would look very
different.
This just means that you formulated the specifications yourself, being
able to read your professor's mind.
Not everybody's mind is easy to read or worth reading... (:-))
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-08 21:41, Ben Bacarisse wrote:
But
the problem is /finding/ a specific average -- the point (or angle) that
minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
I don't follow. It's a mathematical problem for which an algorithm is sought.
I don't see any connection to some science or engineering
space. If it /came/ from some scientific study, it has already been
turned into what the programmer needs to come up with.
Average is just such a formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
So if I asked you for code to calculate the arithmetic mean of an array
of numbers you would ask for all this first, declaring it not yet a programming issue?
I really don't see where all this comes from.
The fact that it makes no odds (as everyone knows) whether we consider
angles (often called central angles in this context) or great circle
distances is not the issue. It's finding the average that minimises the
sum of squares of differences that's the issue.
Minimum of squares (Euclidean norm) is represented by the mathematical
mean.
This problem obviously uses a different norm.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
Obviously I can't guess how much I can safely assume so I expected there might be questions. arctan(p / q) is usually coded as atan2(p, q).
2. What is "A";
The unit vector that is being sought -- the result of the algorithm. It represents a point on the unit sphere that is the desired average of the input points.
3. What operations(?) "x" and "." denote?
x denotes the cross product, and . the dot product.
Do you agree that this is not a trivial problem? If not, please give
the trivial algorithm!
To my mind "find the point on the sphere that minimises the sum of the squares of the great-circle distances between that point and the input points" is clearer than the formula I gave because the formula says too
much.
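To make the specification concrete, here is a small numerical sketch (mine, not from the thread; all function names are invented for illustration). It minimises the sum of squared great-circle distances atan2(|A x P|, A . P)^2 by projected gradient descent on the unit sphere, starting from the normalised vector sum:

```python
import math

def cross(a, b):
    # 3D cross product
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def normalize(a):
    n = norm(a)
    return tuple(x / n for x in a)

def geodesic(a, b):
    # great-circle distance between unit vectors: atan2(|a x b|, a . b)
    return math.atan2(norm(cross(a, b)), dot(a, b))

def cost(a, pts):
    # the objective from the thread: sum of squared geodesic distances
    return sum(geodesic(a, p)**2 for p in pts)

def spherical_mean(pts, steps=200, h=1e-6, lr=0.1):
    # start from the chord (vector) mean projected onto the sphere
    a = normalize([sum(p[i] for p in pts) for i in range(3)])
    for _ in range(steps):
        # crude forward-difference gradient, re-projected onto the sphere
        g = []
        for i in range(3):
            e = list(a)
            e[i] += h
            g.append((cost(normalize(e), pts) - cost(a, pts)) / h)
        a = normalize([a[i] - lr * g[i] for i in range(3)])
    return a
```

For two points the minimiser is the geodesic midpoint, so spherical_mean([(1,0,0), (0,1,0)]) should return approximately (0.707, 0.707, 0). A serious implementation would use an analytic gradient and a convergence test; this is only meant to show that the stated problem is directly computable.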
On 2023-01-09 04:25, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-08 21:41, Ben Bacarisse wrote:
But the problem is /finding/ a specific average -- the point (or angle)
that minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
I don't follow. It's a mathematical problem for which an algorithm is
sought.
Firstly, this is the comp.programming group.
Secondly, if an algorithm is sought then there must be a mathematical
formulation of what a numeric solution is sought for.
(Finding a minimum is an optimization problem; there are thousands of
algorithms already.)
I don't see any connection to some science or engineering
space. If it /came/ from some scientific study, it has already been
turned into what the programmer needs to come up with.
Average is just such a formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
So if I asked you for code to calculate the arithmetic mean of an array
of numbers you would ask for all this first, declaring it not yet a
programming issue?
Then I would ask you, if this is a solution, what was the problem?
I really don't see where all this comes from.
From an average (:-)) customer who does not mind his own
business. Customers have programming ideas, wrong ideas, ideas they
express in terms they have no idea of (:-)). It takes time and tact to
bring the customer back to the field of his expertise and get from him
what he actually needs in order to solve his actual problem.
The fact that it makes no odds (as everyone knows) whether we consider
angles (often called central angles in this context) or great circle
distances is not the issue. It's finding the average that minimises the
sum of squares of differences that's the issue.
Minimum of squares (Euclidean norm) is represented by the mathematical
mean.
This problem obviously uses a different norm.
I see nothing obvious in an unstated problem.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
Obviously I can't guess how much I can safely assume so I expected there
might be questions. arctan(p / q) is usually coded as atan2(p, q).
2. What is "A";
The unit vector that is being sought -- the result of the algorithm. It
represents a point on the unit sphere that is the desired average of the
input points.
3. What operations(?) "x" and "." denote?
x denotes the cross product, and . the dot product.
Do you agree that this is not a trivial problem? If not, please give
the trivial algorithm!
If I correctly understood your explanation (and with a few obvious corrections):
Sum atan2 (|A x Pi|, A · Pi) --> min
i=1..n {A | A on the circle}
|A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)
A · Pi = |A| |Pi| cos θi
atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'
Where θi' is θi with lower semicircle mapped to the upper one (for
whatever reason).
(If mapping was unintended you must have taken the direction of the
cross product for the sign for |A x Pi|)
On a circle varying the angle between A and (R, 0), where R is the
radius, you find that the mean of the angles between Pi and (R, 0)
yields the minimum.
To my mind "find the point on the sphere that minimises the sum of the
squares of the great-circle distances between that point and the input
points" is clearer than the formula I gave because the formula says too
much.
If you mean the circumference, then that is trivially proportional to
the angle.
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-09 04:25, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-08 21:41, Ben Bacarisse wrote:
But the problem is /finding/ a specific average -- the point (or angle)
that minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
I don't follow. It's a mathematical problem for which an algorithm is
sought.
Firstly, this is the comp.programming group.
Do you consider discussing algorithms off topic here? Seriously?
Secondly, if algorithm is sought then there must be a mathematical
formulation of what a numeric solution is sought for.
There needs to be a problem statement that can be discussed and refined.
The original post about circular averages has led to two different refinements -- the vector average (projected onto the unit circle) and
the arc-length average. Both generalise to more dimensions and I
happened to mention in passing that the great-circle average would be a natural fit in some situations.
Tim mentioned (again in passing) that it looks hard. You seemed to
suggest it was not, but I am not sure you knew what the problem was at
that time.
All of this seems to me to be exactly what comp.programming should be
about.
(Finding a minimum is an optimization problem, there are thousands of
algorithms already)
Programmers should know that specifications are not algorithm designs.
I don't see any connection to some science or engineering
space. If it /came/ from some scientific study, it has already been
turned into what the programmer needs to come up with.
Average is just such a formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
So if I asked you for code to calculate the arithmetic mean of an array
of numbers you would ask for all this first, declaring it not yet a
programming issue?
Then I would ask you, if this is a solution, what was the problem?
This is a solution to the simpler problem.
But we are not in that situation. A few knowledgeable and experienced
people are discussing a programming problem.
You are welcome to join
in, but the best way for you to do that is to ask whatever questions you think would help you understand what's being discussed. So far, your
replies have alternated between "it's trivial" and "it's unstated".
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
Obviously I can't guess how much I can safely assume so I expected there
might be questions. arctan(p / q) is usually coded as atan2(p, q).
2. What is "A";
The unit vector that is being sought -- the result of the algorithm. It
represents a point on the unit sphere that is the desired average of the
input points.
3. What operations(?) "x" and "." denote?
x denotes the cross product, and . the dot product.
Do you agree that this is not a trivial problem? If not, please give
the trivial algorithm!
If I correctly understood your explanation (and with a few obvious
corrections):
Sum atan2 (|A x Pi|, A · Pi) --> min
i=1..n {A | A on the circle}
No. A is a vector denoting a point on the unit sphere. And you have
missed out the power of two. Were these changes what you call "obvious corrections"?
|A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)
A · Pi = |A| |Pi| cos θi
atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'
Where θi' is θi with lower semicircle mapped to the upper one (for
whatever reason).
(If mapping was unintended you must have taken the direction of the
cross product for the sign for |A x Pi|)
On a circle varying the angle between A and (R, 0), where R is the
radius, you find that the mean of the angles between Pi and (R, 0)
yields the minimum.
This does not even solve the related case of finding the average point
on the unit circle.
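A tiny example (mine, not from the thread) makes the point concrete: the plain arithmetic mean of the angles minimises the unwrapped sum of squares, but on the circle differences wrap modulo 2*pi, so for angles straddling the 0/360 seam the naive mean lands on the wrong side of the circle:

```python
import math

def naive_mean(angles):
    # ordinary arithmetic mean, ignoring the circular wrap
    return sum(angles) / len(angles)

def circular_cost(alpha, angles):
    # sum of squared *shortest* angular distances from alpha
    def wrap(d):
        # wrap an angular difference into (-pi, pi]
        return math.atan2(math.sin(d), math.cos(d))
    return sum(wrap(t - alpha)**2 for t in angles)

angles = [math.radians(350), math.radians(10)]   # straddle the seam
m = naive_mean(angles)   # pi radians, i.e. 180 degrees
# Under the wrapped cost, 0 degrees beats the naive mean of 180 degrees:
# circular_cost(0.0, angles) < circular_cost(m, angles)
```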
On 2023-01-09 21:37, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-09 04:25, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-08 21:41, Ben Bacarisse wrote:
But the problem is /finding/ a specific average -- the point (or angle)
that minimises the sum of squares of the distances (or angles) from that
average point (or angle).
That is not a programming problem. It is a problem space's one. You
must go to the corresponding part of science or engineering or
economics etc and formulate it there in terms of that space.
I don't follow. It's a mathematical problem for which an algorithm is
sought.
Firstly, this is the comp.programming group.
Do you consider discussing algorithms off topic here? Seriously?
No, I consider off topic discussing mathematical problems /=
algorithms. You did not discuss any algorithms so far.
Secondly, if an algorithm is sought then there must be a mathematical
formulation of what a numeric solution is sought for.
There needs to be a problem statement that can be discussed and refined.
The original post about circular averages has led to two different
refinements -- the vector average (projected onto the unit circle) and
the arc-length average. Both generalise to more dimensions and I
happened to mention in passing that the great-circle average would be a
natural fit in some situations.
I missed generalization to 3D. Sorry.
[You kept on writing about the unit *circle* and an angle, where you
actually meant the unit sphere and polar angles/spherical coordinates]
Tim mentioned (again in passing) that it looks hard. You seemed to
suggest it was not, but I am not sure you knew what the problem was at
that time.
All of this seems to me to be exactly what comp.programming should be
about.
Differential geometry and spherical trigonometry are certainly off
topic.
(Finding a minimum is an optimization problem, there are thousands of
algorithms already)
Programmers should know that specifications are not algorithm designs.
Who said they are?
I don't see any connection to some science or engineering
space. If it /came/ from some scientific study, it has already been
turned into what the programmer needs to come up with.
Average is just such a formulation of finding a representative member
from some set having some specific properties. E.g. least squares in
physics usually has the meaning of least energy etc. So, it goes in
this order (not in reverse):
1. You need the least energy representative.
2. An average gives you that.
3. You measure the inputs.
After that you ask yourself given these measures how to compute the
average? And only now it becomes a programming issue.
So if I asked you for code to calculate the arithmetic mean of an array
of numbers you would ask for all this first, declaring it not yet a
programming issue?
Then I would ask you, if this is a solution, what was the problem?
This is a solution to the simpler problem.
And what was that problem? [The adjective adds/clarifies nothing]
But we are not in that situation. A few knowledgeable and experienced
people are discussing a programming problem.
I saw no discussion on programming so far, either.
You are welcome to join
in, but the best way for you to do that is to ask whatever questions you
think would help you understand what's being discussed. So far, your
replies have alternated between "it's trivial" and "it's unstated".
Yes, because you wrote about minimum of arc length on a *circle*. And
that is trivial.
You say you need a formula, so I'll try... Let P_n be a collection of n
unit vectors specifying n points on a unit sphere. Find the unit vector
A that minimises
Sum_{i=1,n} ( arctan( |A x P_i| / A . P_i ) )^2
(arctan is the "all quadrant" version that is often called atan2 in
programming languages.)
1. atan2 has two arguments (the adjacent and the opposite legs);
Obviously I can't guess how much I can safely assume so I expected there
might be questions. arctan(p / q) is usually coded as atan2(p, q).
2. What is "A";
The unit vector that is being sought -- the result of the algorithm. It
represents a point on the unit sphere that is the desired average of the
input points.
3. What operations(?) "x" and "." denote?
x denotes the cross product, and . the dot product.
Do you agree that this is not a trivial problem? If not, please give
the trivial algorithm!
If I correctly understood your explanation (and with a few obvious
corrections):
Sum atan2 (|A x Pi|, A · Pi) --> min
i=1..n {A | A on the circle}
No. A is a vector denoting a point on the unit sphere. And you have
missed out the power of two. Were these changes what you call "obvious
corrections"?
|A x Pi| = |A| |Pi| |sin θi| (θi is the angle between the vectors A and Pi)
A · Pi = |A| |Pi| cos θi
atan2 (|A x Pi|, A · Pi) = atan2 (|sin θi|, cos θi) = θi'
Where θi' is θi with lower semicircle mapped to the upper one (for
whatever reason).
(If mapping was unintended you must have taken the direction of the
cross product for the sign for |A x Pi|)
On a circle varying the angle between A and (R, 0), where R is the
radius, you find that the mean of the angles between Pi and (R, 0)
yields the minimum.
This does not even solve the related case of finding the average point
on the unit circle.
It elementarily does, for both the formula you gave (after corrections)
and the circumference:
Sum Length_Of_Arc (Pi, A)^2 =
i=1..n
= Sum (R Angle (Pi, A))^2 =
i=1..n
= R^2 Sum (Angle (Pi, (R, 0)) - Angle (A, (R, 0)))^2
i=1..n
(R, 0) denotes vector x = R, y = 0. Let θi denote Angle (Pi, (R, 0))
and α do Angle (A, (R, 0)) then
= R^2 Sum (θi - α)^2 --> min
i=1..n
Derivative d/dα:
2 R^2 Sum (θi - α) = 0 <=> (R /= 0)
i=1..n
<=> Sum θi - nα = 0 <=>
i=1..n
<=> α = Sum θi / n = mean ({θi})
i=1..n
The angle between A and (R, 0) is the mean of angles of points.
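A quick numerical sanity check of the last step (my snippet, not part of the post): on the real line, Sum (θi - α)^2 is indeed minimised at the arithmetic mean, which is all the derivative argument claims. The modular wrap of angles on the actual circle is a separate matter.

```python
# Check that sum (theta_i - alpha)^2 is smallest at the mean alpha.
thetas = [0.3, 1.1, 2.0, 2.9]
alpha = sum(thetas) / len(thetas)

def cost(a):
    return sum((t - a)**2 for t in thetas)

# cost(alpha) <= cost(a) for every nearby candidate a
candidates = [alpha - 0.5, alpha - 0.01, alpha + 0.01, alpha + 0.5]
```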
Anyway, there are lengthy papers on spherical averages introducing
iterative solutions, some with source code for the algorithms. That is
the right way to compute it, if you really wanted...
[Again, it is not programming. In the case of spherical geometry a
programmer will find an expert mathematician who will have the
problem properly stated.
The next stop is by an expert applied mathematician who will outline a
workable numeric solution. Then the programmer can start.]
On 2023-01-08 16:45, Tim Rentsch wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Averaging arcs is equivalent to averaging angles.
Angles are a one-dimensional measure.
Averaging arcs is still equivalent to averaging angles, which is a
trivial result of elementary trigonometry.
Finding an arc length
"average" of points on a sphere needs a two-dimensional result.
Points do not have arcs.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Apparently you have not yet understood the problem.
Again, averages of arcs and angles are equivalent up to a
multiplier.
Why don't
you try writing a program that inputs a set of points normalized
to be on the unit sphere, and then calculates the arc length
average point (on the unit sphere) of those input points?
Why don't you write a formula specifying your need?
Programs are written according to the specifications. Numeric
programs require a properly stated problem, rather than a bunch
of words arbitrarily thrown in a meaningless sentence as above.
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
<cut>
The "other" average minimises the sum of the angular
distances.
I think you mean it minimizes the sum of the squares of the
angular distances.
Yes thanks. I originally thought I'd write an abbreviation but then I
convinced myself I would not forget any of the "squares of"s.
Proofreading might have been useful too.
and the center of mass is never on the unit
circle (not counting the case when all the time values are the
same). Even so it's an interesting way to view the distinction.
It's more interesting because it really is about chords!
I don't buy the "really" here. The center of mass is fundamental
and always well-defined. Furthermore it works for collections of
points that are not all on the same circle or sphere. Looking at
the problem in terms of chords seems somewhat artificial, or at
least less fundamental.
There are other ways for something to be interesting (at least to me).
When you plot the two averages, and look at the chords vs the arcs, it
becomes clear why one average "looks" more average than the other.
I can see a possible misunderstanding here. You might have taken my "it really is about chords" to mean "it is fundamentally about chords", but
I said "it really is about chords" because you thought I'd used the word chord incorrectly to refer to the lines between the centre of mass and
the data points. I was saying "I really did mean chords".
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-09 21:37, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
On 2023-01-09 04:25, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Firstly, this is the comp.programming group.
Do you consider discussing algorithms off topic here? Seriously?
No, I consider off topic discussing mathematical problems /=
algorithms. You did not discuss any algorithms so far.
Algorithms don't come out of thin air. The objective must be discussed first. You can't seriously think that everyone here should remain
silent until, out of nowhere, someone says "consider this algorithm to
find the average of a set of points".
I explained in the text you cut. Maybe your reference to the thousands
of minimisation algorithms was just an off-the-cuff remark, but it reads
as if you were suggesting using one of them as the solution to this
programming task.
Anyway, there are lengthy papers on spherical averages introducing
iterative solutions, some with source code for the algorithms. That is
the right way to compute it, if you really wanted...
I'd like to read one. Do you have a citation, please?
[Again, it is not programming. In the case of spherical geometry a
programmer will find an expert mathematician who will have the
problem properly stated.
The problem has been properly stated. We seek an algorithm that finds a point that minimises the sum of squares of the great-circle distances
between that point and the input points. You could have asked if any of
this was unclear. To me it is both clear and intuitive.
The next stop is by an expert applied mathematician who will outline a
workable numeric solution. Then the programmer can start.]
Ah, you hand off the task of devising algorithms to mathematicians?
That would make programming very dull. Anyway, you would not tackle
this sort of problem until an expert applied mathematician has done everything except code up a workable numerical solution. Is that also
the case for the average angle problem?
"Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:
On 2023-01-08 16:45, Tim Rentsch wrote:
"Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:
Averaging arcs is equivalent to averaging angles.
Angles are a one-dimensional measure.
Averaging arcs is still equivalent to averaging angles, which is a
trivial result of elementary trigonometry.
Finding an arc length
"average" of points on a sphere needs a two-dimensional result.
Points do not have arcs.
Now that I think about it, finding the point that minimizes the
great circle distances squared would be at least computationally
unpleasant.
See above, it is just angles to average.
Apparently you have not yet understood the problem.
Again, averages of arcs and angles are equivalent up to a
multiplier.
Why don't
you try writing a program that inputs a set of points normalized
to be on the unit sphere, and then calculates the arc length
average point (on the unit sphere) of those input points?
Why don't you write a formula specifying your need?
Programs are written according to the specifications. Numeric
programs require a properly stated problem, rather than a bunch
of words arbitrarily thrown in a meaningless sentence as above.
After reading this response and also three other responses of
yours (to Ben Bacarisse) downstream, I can't help but think you
have little or no interest in understanding the problem, offering
any useful comments about it, or trying to write a program to
solve the problem. Apparently the only interest you do have is
making captious remarks.
On 2023-01-11 03:41, Ben Bacarisse wrote:
"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
Anyway, there are lengthy papers on spherical averages introducing
iterative solutions, some with source code for the algorithms. That is
the right way to compute it, if you really wanted...
I'd like to read one. Do you have a citation, please?
Perhaps this could be a starting point:
"Spherical Averages and Applications to Spherical Splines and
Interpolation" Samuel R. Buss, Jay P. Fillmore
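For readers who want to experiment before reading such a paper: the iterative schemes in this literature typically map the points into the tangent plane at the current estimate, average there, and map back, repeating until the update vanishes. Below is a generic sketch of that idea (my code illustrating the tangent-plane iteration; it is not taken from the cited paper, and it assumes no input point is antipodal to the running estimate):

```python
import math

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def normalize(a):
    n = norm(a)
    return tuple(x / n for x in a)

def log_map(a, p):
    # tangent vector at a pointing toward p, length = geodesic distance
    d = max(-1.0, min(1.0, dot(a, p)))
    theta = math.acos(d)
    if theta < 1e-12:
        return (0.0, 0.0, 0.0)
    u = tuple(p[i] - d * a[i] for i in range(3))  # component of p orthogonal to a
    un = norm(u)                                  # zero only if p is antipodal to a
    return tuple(theta * x / un for x in u)

def exp_map(a, v):
    # walk distance |v| along the great circle from a in direction v
    t = norm(v)
    if t < 1e-12:
        return a
    return normalize(tuple(math.cos(t) * a[i] + math.sin(t) * v[i] / t
                           for i in range(3)))

def spherical_average(pts, iters=50):
    # start from the projected vector mean, then refine
    a = normalize(tuple(sum(p[i] for p in pts) for i in range(3)))
    for _ in range(iters):
        v = [0.0, 0.0, 0.0]
        for p in pts:
            lv = log_map(a, p)
            v = [v[i] + lv[i] / len(pts) for i in range(3)]
        a = exp_map(a, v)
    return a
```

For two points this converges to the geodesic midpoint, and in general to a stationary point of the sum of squared great-circle distances.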