• Re: Radians Or Degrees?

    From Michael S@21:1/5 to Lawrence D'Oliveiro on Thu Feb 22 23:39:12 2024
    XPost: comp.lang.c

    On Thu, 22 Feb 2024 21:04:52 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Thu, 22 Feb 2024 20:16:25 -0000 (UTC), Steven G. Kargl wrote:

    Apparently, you missed the part about argument reduction. For
    sinpi(x), it is fairly easy to reduce x = n + r with n an integer
    and r in [0,1). For the extended interval, x in [0,2^23], there are
    roughly 2^23 values with r = 0.5; and thus, sinpi(x) = 1 exactly.
    There are no such values for sin(x), and argument reduction for
    sin(x) is much more involved.

    You are working with approximations anyway. That those approximations
    happen to exactly equal some random value seems irrelevant.

    As to real-world use, how about any physical phenomena where one is
    interested in the resonance frequencies of the system. For a simple
    example see https://www.feynmanlectures.caltech.edu/I_49.html where
    one might write f(x) = sin(kx) = sin(pi * (2*x/L)) with L the length
    of, say, a clamped string.

    I don’t see a formula anything like that anywhere on that page. On
    the other hand, I see lots of use of “ω” symbols, representing frequencies in radians per second.

    There are also uses with computing other functions, e.g., the true
    gamma function via the Euler reflection formula.

    gamma(x) * gamma(1 - x) = pi / sin(pi * x) = pi / sinpi(x)

    π radians = half a circle. Are there any other examples of the
    usefulness of half-circles as an angle unit? As opposed to the dozens
    or hundreds of examples of the usefulness of radians as an angle unit?

    In digital signal processing, circle-based units are pretty much always
    more natural than radians.
    For the specific case of 1/2 circle, I can't see where it can be used
    directly.
    From an algorithmic perspective, the full circle looks like the most
    obvious choice.
    From a [binary floating point] numerical-properties perspective, 1/8th of
    the circle (== pi/4 radians = 45 degrees) is probably the best option
    for a library routine, because for sin() the derivative at 0 is then
    closest to 1 among all power-of-two scalings, which means that the loss
    of precision near 0 is very similar for input and output. But this
    advantage does not sound like a particularly big deal.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Thu Feb 22 22:57:20 2024
    XPost: comp.lang.c

    Michael S wrote:

    On Thu, 22 Feb 2024 21:04:52 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Thu, 22 Feb 2024 20:16:25 -0000 (UTC), Steven G. Kargl wrote:

    Apparently, you missed the part about argument reduction. For
    sinpi(x), it is fairly easy to reduce x = n + r with n an integer
    and r in [0,1).

    You are better off with r in [-½,+½]

    For the extended interval, x in [0,2^23], there are
    roughly 2^23 values with r = 0.5; and thus, sinpi(x) = 1 exactly.
    There are no such values for sin(x), and argument reduction for
    sin(x) is much more involved.

    I have a patent on doing argument reduction for sin(x) in 4 cycles...that
    is as good as Payne-Hanek argument reduction over -infinity..+infinity


    You are working with approximations anyway. That those approximations
    happen to exactly equal some random value seems irrelevant.

    As to real-world use, how about any physical phenomena where one is
    interest in resonance frequencies of the system. For a simple
    example see https://www.feynmanlectures.caltech.edu/I_49.html where
    one might write f(x) = sin(kx) = sin(pi * (2*x/L)) with L a length
    of say a clamped string.

    I don’t see a formula anything like that anywhere on that page. On
    the other hand, I see lots of use of “ω” symbols, representing
    frequencies in radians per second.

    There are also uses with computing other functions, e.g., the true
    gamma function via the Euler reflection formula.

    gamma(x) * gamma(1 - x) = pi / sin(pi * x) = pi / sinpi(x)

    I would not have expected the magnitude of gamma(x) * gamma(1 - x)
    to always be greater than pi....but there it is.


    π radians = half a circle. Are there any other examples of the
    usefulness of half-circles as an angle unit? As opposed to the dozens
    or hundreds of examples of the usefulness of radians as an angle unit?

    In digital signal processing circle-based units are pretty much always
    more natural than radians.
    For specific case of 1/2 circle, I can't see where it can be used
    directly.
    From algorithmic perspective, full circle looks like the most obvious
    choice.
    From [binary floating point] numerical properties perspective, 1/8th of
    the circle (==pi/4 radians = 45 degrees) is probably the best option
    for a library routine, because for sin() its derivative at 0 is closest
    to 1 among all powers of two which means that loss of precision
    near 0 is very similar for input and for output. But this advantage
    does not sound like particularly big deal.

    I should remind you that my patents include methods for calculating sin()
    and cos() in 18 cycles, and sinpi() and cospi() in 14 cycles, while
    achieving an accuracy of 0.5002 bits of error. All of this is in HW, and
    all carried out in precision wider than IEEE 754 with a bunch of HW tricks
    not available to SW.

    At this performance level, let the algorithms wanting pi-based trigonometry
    have it and those wanting unit-based trigonometry have it too. Let
    programmers use what is easiest for them.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Steven G. Kargl@21:1/5 to All on Fri Feb 23 00:13:02 2024
    XPost: comp.lang.c

    On Thu, 22 Feb 2024 22:57:20 +0000, MitchAlsup1 wrote:

    Michael S wrote:

    On Thu, 22 Feb 2024 21:04:52 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Thu, 22 Feb 2024 20:16:25 -0000 (UTC), Steven G. Kargl wrote:

    Apparently, you missed the part about argument reduction. For
    sinpi(x), it is fairly easy to reduce x = n + r with n an integer
    and r in [0,1).

    You are better off with r in [-½,+½]

    Partial implementation details for single precision C code (trying
    to keep this on topic for c.l.c) are at https://cgit.freebsd.org/src/tree/lib/msun/src/s_sinpif.c

    One uses sinpi(-|x|) = -sinpi(|x|), which naturally gets one to
    r = x - floor(x) in [0,1). One then uses kernels for sin() and
    cos() to do the actual computation (a small dispatch sketch follows
    the list below).

    r in [0,0.25) is ksin(r).
    r in [0.25,0.5) is kcos(0.5-r).
    r in [0.5,0.75) is kcos(r - 0.5).
    r in [0.75,1) is ksin(1 - r).
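
    A minimal sketch of that reduction and dispatch (assuming, as above, that
    x is non-negative and below 2^23, and that ksin()/kcos() are kernels that
    take the reduced argument in half-circle units; the real s_sinpif.c also
    handles signs, special values, and exact integers separately):

        #include <math.h>

        float ksin(float r);   /* kernel: sin(pi*r), valid for r in [0, 0.25] */
        float kcos(float r);   /* kernel: cos(pi*r), valid for r in [0, 0.25] */

        static float sinpif_sketch(float x)   /* hypothetical name */
        {
            float r = x - floorf(x);          /* r in [0,1) */

            if (r < 0.25f)
                return ksin(r);               /* sin(pi*r) directly           */
            if (r < 0.5f)
                return kcos(0.5f - r);        /* sin(pi*r) == cos(pi*(0.5-r)) */
            if (r < 0.75f)
                return kcos(r - 0.5f);        /* sin(pi*r) == cos(pi*(r-0.5)) */
            return ksin(1.0f - r);            /* sin(pi*r) == sin(pi*(1-r))   */
        }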

    For the extended interval, x in [0,2^23], there are
    roughly 2^23 values with r = 0.5; and thus, sinpi(x) = 1 exactly.
    There are no such values for sin(x), and argument reduction for
    sin(x) is much more involved.

    I have a patent on doing argument reduction for sin(x) in 4 cycles...that
    is as good as Payne-Hanek argument reduction over -infinity..+infinity

    By a strange coincidence, I have the Payne-Hanek paper on the
    top of the stack of papers on my desk. Still need to internalize
    the details.

    (much trimmed for brevity)


    In digital signal processing circle-based units are pretty much always
    more natural than radians.
    For specific case of 1/2 circle, I can't see where it can be used
    directly.
    From algorithmic perspective, full circle looks like the most obvious
    choice.
    From [binary floating point] numerical properties perspective, 1/8th of
    the circle (==pi/4 radians = 45 degrees) is probably the best option
    for a library routine, because for sin() its derivative at 0 is closest
    to 1 among all powers of two which means that loss of precision
    near 0 is very similar for input and for output. But this advantage
    does not sound like particularly big deal.

    I should remind you that my patents include methods for calculating sin()
    and cos() in 18 cycles, and sinpi() and cospi() in 14 cycles, while
    achieving an accuracy of 0.5002 bits of error. All of this is in HW, and
    all carried out in precision wider than IEEE 754 with a bunch of HW tricks
    not available to SW.

    At this performance level, let the algorithms wanting pi-based trigonometry
    have it and those wanting unit-based trigonometry have it too. Let
    programmers use what is easiest for them.

    Agreed, a programmer should use what is required by the problem
    that they are solving. I'll note that SW implementations have
    their own sets of tricks (e.g., use of double-double arithmetic to
    achieve double precision).

    --
    steve

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Chris M. Thomasson on Fri Feb 23 02:42:23 2024
    XPost: comp.lang.c

    Chris M. Thomasson wrote:

    On 2/22/2024 4:49 PM, Chris M. Thomasson wrote:
    [...]> Use some form of arbitrary precision if you want to get to:

    A 10^4141 mset zoom!? It must have taken many hours to render... ;^)

    https://youtu.be/Xjy_HSUujaw

    A 10^275 zoom:

    https://youtu.be/0jGaio87u3A

    Argument reduction zoom::

    COS( 6381956970095103×2^797) = -4.68716592425462761112×10^-19


    You can argue all you want about whether this has any significance, but
    it happens to be the value closest to that radian argument when performed
    with infinite precision.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Steven G. Kargl on Fri Feb 23 02:28:07 2024
    XPost: comp.lang.c

    Steven G. Kargl wrote:

    On Thu, 22 Feb 2024 22:57:20 +0000, MitchAlsup1 wrote:

    Michael S wrote:

    On Thu, 22 Feb 2024 21:04:52 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Thu, 22 Feb 2024 20:16:25 -0000 (UTC), Steven G. Kargl wrote:

    Apparently, you missed the part about argument reduction. For
    sinpi(x), it is fairly easy to reduce x = n + r with n an integer
    and r in [0,1).

    You are better off with r in [-½,+½]

    Partial implementation details for single precision C code (trying
    to keep this on topic for c.l.c) are at https://cgit.freebsd.org/src/tree/lib/msun/src/s_sinpif.c

    One uses sinpi(-|x|) = -sinpi(|x|), which naturally gets one to
    r = x - floor(x) in [0,1). One then uses kernels for sin() and
    cos() to do the actual computation.

    r in [0,0.25) is ksin(r).
    r in [0.25,0.5) is kcos(0.5-r).
    r in [0.5,0.75) is kcos(r - 0.5).
    r in [0.75,1) is ksin(1 - r).

    By "better off" I mean you have ½-bit of greater reduced argument precision when the reduced argument is centered on zero.

    For the extended interval, x in [0,2^23], there are
    roughly 2^23 values with r = 0.5; and thus, sinpi(x) = 1 exactly.
    There are no such values for sin(x), and argument reduction for
    sin(x) is much more involved.

    I have a patent on doing argument reduction for sin(x) in 4 cycles...that
    is as good as Payne-Hanek argument reduction over -infinity..+infinity

    By a strange coincidence, I have the Payne-Hanek paper on the
    top of the stack of papers on my desk. Still need to internalize
    the details.

    Here is the version found in the SUN transcendentals::

    static void ReduceFull(double *xp, int *a, double x)
    {
        Double X = { x };
        int ec = X.s.exponent - (1023+33);
        int k = (ec + 26) * (607*4) >> 16;    // ~ (ec+26)/27: selects the 2/pi chunk
        int m = 27*k - ec;
        int offset = m >> 3;
        x *= 0x1p-400;
        // Dekker split of x into a high part x0 and a low part x1
        double xDekker = x * (0x1p27 + 1);
        double x0 = xDekker - (xDekker - x);
        double x1 = x - x0;
        const double *p0 = &TwoOverPiWithOffset[offset][k]; // 180 DP FP numbers
        const double fp0 = p0[0];
        const double fp1 = p0[1];
        const double fp2 = p0[2];
        const double fp3 = p0[3];
        const double f0 = x1 * fp0 + fp1 * x0;
        double f = x1 * fp1 + fp2 * x0;
        const double fi = f0 + f;
        // extract the quadrant bits by adding a large integer bias
        static const double IntegerBias = 0x1.8p52;
        Double Fi = { fi + IntegerBias };
        *a = Fi.s.significand2;
        double fint = Fi.d - IntegerBias;
        const double fp4 = p0[4];
        const double fp5 = p0[5];
        const double fp6 = p0[6];
        // accumulate the remaining partial products of x * (2/pi)
        f = f0 - fint + f;
        f += x1 * fp2 + fp3 * x0;
        f += x1 * fp3 + fp4 * x0;
        f += x1 * fp4 + fp5 * x0;
        f += x1 * fp5 + fp6 * x0;
        *xp = f * 0x3.243F6A8885A3p-1;        // scale the fraction back to radians (pi/2)
    }

    Which I can do in 4 cycles in HW !! {{I happen to have this on hand to explain how
    I can do this in HW in 4 cycles......}}

    I should also note that the conversion of f back into *xp loses ¼ bit of precision.


    (much trimmed for brevity)


    In digital signal processing circle-based units are pretty much always
    more natural than radians.
    For specific case of 1/2 circle, I can't see where it can be used
    directly.
    From algorithmic perspective, full circle looks like the most obvious
    choice.
    From [binary floating point] numerical properties perspective, 1/8th of
    the circle (==pi/4 radians = 45 degrees) is probably the best option
    for a library routine, because for sin() its derivative at 0 is closest
    to 1 among all powers of two which means that loss of precision
    near 0 is very similar for input and for output. But this advantage
    does not sound like particularly big deal.

    I should remind that my patents include methods for calculating sin()
    and cos() in 18 cycles, sinpi() and cospi() in 14 cycles while achieving
    an accuracy n the 0.5002-bits of error. All of this is in HW, and all
    carried out in precision wider than IEEE 754 with a bunch of HW tricks
    not available to SW.

    At this performance level, let the algorithms wanting pi-based trigonometry
    have it and those wanting unit-based trigonometry have it too. Let
    programmers use what is easiest for them.

    Agreed a programmer should use what is required by the problem
    that they are solving. I'll note that SW implementations have
    their sets of tricks (e.g., use of double-double arithmetic to
    achieve double precision).

    To get near IEEE desired precision, one HAS TO use more than 754 precision.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to All on Fri Feb 23 11:10:00 2024
    XPost: comp.lang.c

    MitchAlsup1 wrote:
    Steven G. Kargl wrote:
    Agreed a programmer should use what is required by the problem
    that they are solving.  I'll note that SW implementations have
    their sets of tricks (e.g., use of double-double arithmetic to
    achieve double precision).

    To get near IEEE desired precision, one HAS TO use more than 754 precision.

    There are groups who have shown that exactly rounded transcendental
    functions are in fact achievable with maybe 3X reduced performance.

    There is a suggestion on the table to make that a (probably optional
    imho) feature for an upcoming ieee754 revision.

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Michael S on Fri Feb 23 11:01:02 2024
    XPost: comp.lang.c

    Michael S wrote:
    On Thu, 22 Feb 2024 21:04:52 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    π radians = half a circle. Are there any other examples of the
    usefulness of half-circles as an angle unit? As opposed to the dozens
    or hundreds of examples of the usefulness of radians as an angle unit?

    In digital signal processing circle-based units are pretty much always
    more natural than radians.
    For specific case of 1/2 circle, I can't see where it can be used
    directly.
    From algorithmic perspective, full circle looks like the most obvious choice.
    From [binary floating point] numerical properties perspective, 1/8th of
    the circle (==pi/4 radians = 45 degrees) is probably the best option
    for a library routine, because for sin() its derivative at 0 is closest
    to 1 among all powers of two which means that loss of precision
    near 0 is very similar for input and for output. But this advantage
    does not sound like particularly big deal.

    ieee754 defines sinpi() and siblings, but imho it really doesn't matter
    if you use circles, half-circles (i.e. sinpi) or some other binary
    fraction of a circle: argument reduction for huge inputs is just as
    easy; you might just have to multiply by the corresponding power of two
    (i.e. adjust the exponent) before extracting the fractional term.

    For sinpi(x) I could do it like this:

    if (fabs(x) >= 0x1p52) error("Zero significant bits.");
    int64_t ix = (int64_t)x;                     // needs a 64-bit integer for this range
    double x_reduced = x - (double)(ix & ~1);    // subtract an even integer near x
    if (x_reduced < 0.0) x_reduced += 2.0;       // fold into [0.0 .. 2.0)

    but it is probably better to return a value in the [-1.0 .. 1.0> range?
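
    One way to do that (a sketch only, not tested against any particular
    kernel): rint() keeps the reduced value centered, and the subtraction
    stays exact below 2^52 because both operands are multiples of ulp(x).

        #include <math.h>

        /* Reduce x (half-circle units) to [-1.0, +1.0]; sinpi has period 2,
           so sinpi(x_reduced) == sinpi(x).  Ties may land on either endpoint. */
        double sinpi_reduce_centered(double x)
        {
            if (fabs(x) >= 0x1p52)
                return 0.0;                    /* x is an integer: sinpi(x) == +-0.0 */
            return x - 2.0 * rint(0.5 * x);    /* 2*rint(x/2): even integer nearest x */
        }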

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Terje Mathisen on Fri Feb 23 14:02:09 2024
    On Fri, 23 Feb 2024 11:01:02 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    Michael S wrote:
    On Thu, 22 Feb 2024 21:04:52 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    π radians = half a circle. Are there any other examples of the
    usefulness of half-circles as an angle unit? As opposed to the
    dozens or hundreds of examples of the usefulness of radians as an
    angle unit?

    In digital signal processing circle-based units are pretty much
    always more natural than radians.
    For specific case of 1/2 circle, I can't see where it can be used
    directly.
    From algorithmic perspective, full circle looks like the most
    obvious choice.
    From [binary floating point] numerical properties perspective,
    1/8th of the circle (==pi/4 radians = 45 degrees) is probably the
    best option for a library routine, because for sin() its derivative
    at 0 is closest to 1 among all powers of two which means that loss
    of precision near 0 is very similar for input and for output. But
    this advantage does not sound like particularly big deal.

    ieee754 defines sinpi() and siblings, but imho it really doesn't
    matter if you use circles, half-circles (i.e. sinpi) or some other
    binary fraction of a circle: Argument reduction for huge inputs are
    just as easy, you might just have to multiply by the corresponding
    power of two (i.e. adjust the exponent) before extracting the
    fractional term.

    For sinpi(x) I could do it like this:

    if (abs(x) >= two_to_52nd_power) error("Zero significant bits.");
    ix = int(x);
    x_reduced = x - (double) (ix & ~1);
    if (x_reduced < 0.0) x_reduced += 2.0;

    but it is probably better to return a value in the [-1.0 .. 1.0>
    range?

    Terje


    Both you and Mitch look at it from the wrong perspective.
    When we define a library API, ease of implementation of the library
    function should be pretty low on our priority scale. As long as a
    reasonably precise calculation is theoretically possible, we should
    give credit to the intelligence of the implementor, that's all.
    The real concern of the API designer should be avoiding loss of
    precision in the preparation of inputs and the use of outputs.
    In the specific case of y=sin2pi(x), it is x that is more problematic,
    because near 0 it starts to lose precision 3 octaves before y. In the
    subnormal range we lose ~2.5 bits of precision in the preparation of the
    argument. An implementation, no matter how good, can't recover what's
    already lost.
    sinpi() is slightly better, but only slightly; not enough better to
    justify its less natural semantics.

    My suggestion from yesterday is a step in the right direction, but today
    I think it is not a sufficiently radical step. In the specific case of
    sin/cos there is no good reason to match the loss of precision on input
    with the loss of precision on output.
    <Thinking out loud>
    There are advantages in matched loss of precision for input and for
    output when both input and output occupy the full range of real numbers
    (ignoring sign). The output of sin/cos does not occupy the full range.
    But for tan() it does. So, maybe, there is one good reason for matching
    input with output for sin/cos - consistency with tan.
    </Thinking out loud>
    So, ignoring tan(), what is really the optimal input scaling for sin/cos
    inputs? Today I think that it is a scaling in which the full circle
    corresponds to 2**64. With such a scaling you never lose any input
    precision to subnormals before the precision of the result is lost
    completely.
    Now, one could ask "Why 2**64, why not 2**56, which has the same
    property?". My answer is "Because 2**64 is the scaling that is most
    convenient for preparation of trig arguments in fixed point [on a 64-bit
    computer]." I.e., apart from being a good scaling for avoiding loss of
    precision in the tiny range, it happens to be the best scaling for
    interoperability with fixed point.
    That is my answer today. Will it hold tomorrow? Tomorrow will tell.
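
    To illustrate, a minimal sketch of what such an interface could look like,
    assuming a hypothetical sinu64() built on top of an existing sinpi() (as
    in FreeBSD libm or C23); full circle == 2^64 phase units:

        #include <stdint.h>
        #include <math.h>

        double sinu64(uint64_t phase)
        {
            /* Reinterpret the phase as signed so that [0, 2^64) maps onto
               [-1/2, +1/2) of a circle, i.e. [-1.0, +1.0) half circles. */
            int64_t centered = (int64_t)phase;
            double half_circles = (double)centered * 0x1p-63;
            return sinpi(half_circles);    /* sin has period 2 in half-circle units */
        }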

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Steven G. Kargl on Fri Feb 23 14:32:50 2024
    XPost: comp.lang.c

    On Fri, 23 Feb 2024 00:13:02 -0000 (UTC)
    "Steven G. Kargl" <sgk@REMOVEtroutmask.apl.washington.edu> wrote:

    On Thu, 22 Feb 2024 22:57:20 +0000, MitchAlsup1 wrote:


    I have a patent on doing argument reduction for sin(x) in 4
    cycles...that is as good as Payne-Hanek argument reduction over
    -infinity..+infinity

    By a strange coincidence, I have the Payne-Hanek paper on the
    top of the stack of papers on my desk. Still need to internalize
    the details.


    Payne-Hanek is a right answer to a wrong question.
    In science/engineering, a floating-point number x represents a range
    [x-ULP:x+ULP], with hopefully a much higher probability of being in the
    range [x-ULP/2:x+ULP/2]. However, within this reduced range the
    probability is flat; an "exact" x is no worse and no better than any
    other point within this range.
    Which means that when we look for y=sin(x), then for
    scientific/engineering purposes any answer in the range*
    [sin(x-ULP/2):sin(x+ULP/2)] is as "right" as any other answer. The
    right answer to the question "What is the sin() of IEEE-754 binary64
    number 1e17?" is "Anything in the range [-1:1]". The wise and useful
    answer to the same question is "You're holding it wrong".

    * In this paragraph sin(x) designates the result calculated with infinite
    precision.
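
    A quick numeric illustration of the 1e17 case (a small standalone check
    with ordinary libm calls, nothing more):

        #include <math.h>
        #include <stdio.h>

        int main(void)
        {
            double x = 1e17;
            double ulp = nextafter(x, INFINITY) - x;   /* binary64 spacing at 1e17: 16 */
            printf("ulp(1e17) = %g\n", ulp);
            printf("one-ulp interval spans %g periods of sin\n", ulp / (2.0 * M_PI));
            return 0;
        }

    With the interval [x-ULP/2:x+ULP/2] covering roughly 2.5 full periods,
    sin takes every value in [-1:1] somewhere inside it.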

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Fri Feb 23 20:02:00 2024
    XPost: comp.lang.c

    Michael S wrote:

    On Fri, 23 Feb 2024 00:13:02 -0000 (UTC)
    "Steven G. Kargl" <sgk@REMOVEtroutmask.apl.washington.edu> wrote:

    On Thu, 22 Feb 2024 22:57:20 +0000, MitchAlsup1 wrote:


    I have a patent on doing argument reduction for sin(x) in 4
    cycles...that is as good as Payne-Hanek argument reduction over
    -infinity..+infinity

    By a strange coincidence, I have the Payne-Hanek paper on the
    top of the stack of papers on my desk. Still need to internalize
    the details.


    Payne-Hanek is a right answer to a wrong question.

    If someone is looking for a fast sin(x) you are correct; for a person
    looking for the best possible (correctly rounded) result, you are not. In
    any event I have driven that monstrosity down from 50-odd cycles to 4--
    making it at least palatable.

    In science/engineering floating-point number x represents a range [x-ULP:x+ULP] with hopefully much higher probability of being in range [x-ULP/2:x+ULP/2]. However in this reduced range probability is flat,
    an "exact" x is no worse and no better than any other point within this range.
    Which means that when we look for y=sin(x) then for
    scientific/engineering purposes any answer in range* [sin(x-ULP/2):sin(x+ULP/2)] is as "right" as any other answer. The
    right answer to the question "What is a sin() of IEEE-754 binary64
    number 1e17?" is "Anything in range [-1:1]". The wise and useful answer
    to the same question is "You're holding it wrong".

    There are verification tests that know the exact value of both x and
    sin(x). For these uses your argument is not valid--however I am willing
    to grant that outside of verification, and in actual use, your argument
    holds as well as the noise in the data allows.

    * in paragraph Sin(x) designates result calculated with infinite
    precision.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Steven G. Kargl@21:1/5 to All on Fri Feb 23 20:38:59 2024
    XPost: comp.lang.c

    On Fri, 23 Feb 2024 20:02:00 +0000, MitchAlsup1 wrote:

    Michael S wrote:

    On Fri, 23 Feb 2024 00:13:02 -0000 (UTC)
    "Steven G. Kargl" <sgk@REMOVEtroutmask.apl.washington.edu> wrote:

    On Thu, 22 Feb 2024 22:57:20 +0000, MitchAlsup1 wrote:


    I have a patent on doing argument reduction for sin(x) in 4
    cycles...that is as good as Payne-Hanek argument reduction over
    -infinity..+infinity

    By a strange coincidence, I have the Payne-Hanek paper on the top of
    the stack of papers on my desk. Still need to internalize the details.


    Payne-Hanek is a right answer to a wrong question.

    If someone is looking for a fast sin(x) you are correct; for a person
    looking for the best possible (correctly rounded) result, you are not.
    In any event I have driven that monstrosity down from 50-odd cycles to
    4-- making it at least palatable.

    In science/engineering floating-point number x represents a range
    [x-ULP:x+ULP] with hopefully much higher probability of being in range
    [x-ULP/2:x+ULP/2]. However in this reduced range probability is flat,
    an "exact" x is no worse and no better than any other point within this
    range.
    Which means that when we look for y=sin(x) then for
    scientific/engineering purposes any answer in range*
    [sin(x-ULP/2):sin(x+ULP/2)] is as "right" as any other answer. The
    right answer to the question "What is a sin() of IEEE-754 binary64
    number 1e17?" is "Anything in range [-1:1]". The wise and useful answer
    to the same question is "You're holding it wrong".

    There are verification tests that know the exact value of both x and
    sin(x).
    For these uses your argument is not valid--however I am willing to grant
    that outside of verification, and in actual use, your argument holds as
    well as the noise in the data allows.

    Fortunately, for 32-bit single precision floating point, one can
    exhaustively test single-argument functions. For the SW implementation
    on FreeBSD, exhaustive testing shows

    % tlibm sinpi -fPED -x 0 -X 0x1p23 -s 0
    Interval tested for sinpif: [0,8.38861e+06]
    1000000 calls, 0.015453 secs, 0.01545 usecs/call
    ulp <= 0.5: 99.842% 1253631604 | 99.842% 1253631604
    0.5 < ulp < 0.6: 0.158% 1989420 | 100.000% 1255621024
    Max ulp: 0.583666 at 6.07950642e-05

    % tlibm sin -fPED -x 0 -X 0x1p23 -s 0
    Interval tested for sinf: [0,8.38861e+06]
    1000000 calls, 0.019520 secs, 0.01952 usecs/call
    ulp <= 0.5: 99.995% 1249842491 | 99.995% 1249842491
    0.5 < ulp < 0.6: 0.005% 60102 | 100.000% 1249902593
    Max ulp: 0.500900 at 3.68277281e+05

    The speed test is for 1M values evenly distributed over
    the interval. The difference in speed for sinpi vs sin
    is due to the argument reduction.

    Note 1: libm built with clang/llvm with only -O2 on a
    AMD Phenom II X2 560 Processor (3300.14-MHz K8-class CPU).

    Note 2: subnormal values of x are not included in
    the count, as subnormals are tested with a different approach.

    --
    steve

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to All on Fri Feb 23 22:39:19 2024
    XPost: comp.lang.c

    On Fri, 23 Feb 2024 22:29:17 +0000, MitchAlsup1 wrote:

    I find it interesting that sinpi() has worse numeric error than sin().

    Another argument for sticking with the simpler solution? One set of radian-specific functions plus conversion factors, instead of a separate
    set of unit-specific functions for every unit?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Steven G. Kargl on Fri Feb 23 22:29:17 2024
    XPost: comp.lang.c

    Steven G. Kargl wrote:

    On Fri, 23 Feb 2024 20:02:00 +0000, MitchAlsup1 wrote:

    Michael S wrote:

    On Fri, 23 Feb 2024 00:13:02 -0000 (UTC)
    "Steven G. Kargl" <sgk@REMOVEtroutmask.apl.washington.edu> wrote:

    On Thu, 22 Feb 2024 22:57:20 +0000, MitchAlsup1 wrote:


    I have a patent on doing argument reduction for sin(x) in 4
    cycles...that is as good as Payne-Hanek argument reduction over
    -infinity..+infinity

    By a strange coincidence, I have the Payne-Hanek paper on the top of
    the stack of papers on my desk. Still need to internalize the details.


    Payne-Hanek is a right answer to a wrong question.

    If someone looking for fast sin(x) you are correct, for a person looking
    for the best possible (correctly rounded) result, you are not. In any
    event I have driven that monstrosity down from 50-odd cycles to 4--
    making it at least palatable.

    In science/engineering floating-point number x represents a range
    [x-ULP:x+ULP] with hopefully much higher probability of being in range
    [x-ULP/2:x+ULP/2]. However in this reduced range probability is flat,
    an "exact" x is no worse and no better than any other point within this
    range.
    Which means that when we look for y=sin(x) then for
    scientific/engineering purposes any answer in range*
    [sin(x-ULP/2):sin(x+ULP/2)] is as "right" as any other answer. The
    right answer to the question "What is a sin() of IEEE-754 binary64
    number 1e17?" is "Anything in range [-1:1]". The wise and useful answer
    to the same question is "You're holding it wrong".

    There are verification tests that know the exact value of both x and
    sin(x).
    For these uses you argument is not valid--however I am willing to grant
    that outside of verification, and in actual use, your argument holds as
    well ad the noise in the data provides.

    Fortunately, for 32-bit single precision floating point, one can

    All of the numerics I have spoken about are DP (64-bit) values.

    exhaustively test single-argument functions. For the SW implementation
    on FreeBSD, exhaustive testing shows

    % tlibm sinpi -fPED -x 0 -X 0x1p23 -s 0
    Interval tested for sinpif: [0,8.38861e+06]
    1000000 calls, 0.015453 secs, 0.01545 usecs/call
    ulp <= 0.5: 99.842% 1253631604 | 99.842% 1253631604
    0.5 < ulp < 0.6: 0.158% 1989420 | 100.000% 1255621024
    Max ulp: 0.583666 at 6.07950642e-05

    % tlibm sin -fPED -x 0 -X 0x1p23 -s 0
    Interval tested for sinf: [0,8.38861e+06]
    1000000 calls, 0.019520 secs, 0.01952 usecs/call
    ulp <= 0.5: 99.995% 1249842491 | 99.995% 1249842491
    0.5 < ulp < 0.6: 0.005% 60102 | 100.000% 1249902593
    Max ulp: 0.500900 at 3.68277281e+05

    I find it interesting that sinpi() has worse numeric error than sin().

    I also find it interesting that the highest error is in the easy part
    of polynomial evaluation without argument reduction whereas sin() has
    its worst result in a region where significant argument reduction has transpired.

    {{Seems like another case of faster poor answers top slower correct
    answers.}}

    The speed test is for 1M values evenly distributed over
    the interval. The difference in speed for sinpi vs sin
    is due to the argument reduction.

    But result precision suffers nonetheless.

    Note 1: libm built with clang/llvm with only -O2 on a
    AMD Phenom II X2 560 Processor (3300.14-MHz K8-class CPU).

    Note 2: subnormal values for x are test not included in
    the count as subnormal are tested with a different approach.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Steven G. Kargl@21:1/5 to All on Fri Feb 23 23:20:17 2024
    XPost: comp.lang.c

    On Fri, 23 Feb 2024 22:29:17 +0000, MitchAlsup1 wrote:

    Steven G. Kargl wrote:

    On Fri, 23 Feb 2024 20:02:00 +0000, MitchAlsup1 wrote:

    Michael S wrote:

    On Fri, 23 Feb 2024 00:13:02 -0000 (UTC)
    "Steven G. Kargl" <sgk@REMOVEtroutmask.apl.washington.edu> wrote:

    On Thu, 22 Feb 2024 22:57:20 +0000, MitchAlsup1 wrote:


    I have a patent on doing argument reduction for sin(x) in 4
    cycles...that is as good as Payne-Hanek argument reduction over
    -infinity..+infinity

    By a strange coincidence, I have the Payne-Hanek paper on the top of
    the stack of papers on my desk. Still need to internalize the details.


    Payne-Hanek is a right answer to a wrong question.

    If someone looking for fast sin(x) you are correct, for a person
    looking for the best possible (correctly rounded) result, you are not.
    In any event I have driven that monstrosity down from 50-odd cycles to
    4-- making it at least palatable.

    In science/engineering floating-point number x represents a range
    [x-ULP:x+ULP] with hopefully much higher probability of being in
    range [x-ULP/2:x+ULP/2]. However in this reduced range probability is
    flat, an "exact" x is no worse and no better than any other point
    within this range.
    Which means that when we look for y=sin(x) then for
    scientific/engineering purposes any answer in range*
    [sin(x-ULP/2):sin(x+ULP/2)] is as "right" as any other answer. The
    right answer to the question "What is a sin() of IEEE-754 binary64
    number 1e17?" is "Anything in range [-1:1]". The wise and useful
    answer to the same question is "You're holding it wrong".

    There are verification tests that know the exact value of both x and
    sin(x).
    For these uses you argument is not valid--however I am willing to
    grant that outside of verification, and in actual use, your argument
    holds as well ad the noise in the data provides.

    Fortunately, for 32-bit single precision floating point, one can

    All of the numerics I have spoken about are DP (64-bit) values.

    Sure, but it isn't possible to exhaustively test DP on the hardware
    currently available to me. Simple testing can give one a warm fuzzy
    feeling. With 10M values in the range [0,0x1p53], the SW sinpi seems to
    be better than the SW sin.

    % tlibm sin -d -x 0 -X 0x1p53 -N 10 -s 0
    Interval tested for sin: [0,9.0072e+15]
    10000000 calls, 2.594275 secs, 0.25943 usecs/call
    count: 10000000
    xm = 3.4119949728897955e+15, /* 0x43283e61, 0xf8ab4587 */
    libm = 7.2135591711063873e-01, /* 0x3fe71559, 0x0118855e */
    mpfr = 7.2135591711063862e-01, /* 0x3fe71559, 0x0118855d */
    ULP = 0.77814

    % tlibm sinpi -d -x 0 -X 0x1p53 -N 10 -s 0
    Interval tested for sinpi: [0,9.0072e+15]
    10000000 calls, 0.184541 secs, 0.01845 usecs/call
    count: 10000000
    xm = 1.5384297865527400e+12, /* 0x42766318, 0xf99b8bd7 */
    libm = 7.2898962872051931e-01, /* 0x3fe753e2, 0x0ecf4a3e */
    mpfr = 7.2898962872051942e-01, /* 0x3fe753e2, 0x0ecf4a3f */
    ULP = 0.69476

    Note, the timing difference is again dominated by argument reduction.

    Also note that each of the above tests took ~180 cpu seconds. At
    the moment, tlibm will not use multiple cores or cpus on other nodes.

    exhaustively test single-argument functions. For the SW implementation
    on FreeBSD, exhaustive testing shows

    % tlibm sinpi -fPED -x 0 -X 0x1p23 -s 0
    Interval tested for sinpif: [0,8.38861e+06]
    1000000 calls, 0.015453 secs, 0.01545 usecs/call
    ulp <= 0.5: 99.842% 1253631604 | 99.842% 1253631604
    0.5 < ulp < 0.6: 0.158% 1989420 | 100.000% 1255621024
    Max ulp: 0.583666 at 6.07950642e-05

    % tlibm sin -fPED -x 0 -X 0x1p23 -s 0
    Interval tested for sinf: [0,8.38861e+06]
    1000000 calls, 0.019520 secs, 0.01952 usecs/call
    ulp <= 0.5: 99.995% 1249842491 | 99.995% 1249842491
    0.5 < ulp < 0.6: 0.005% 60102 | 100.000% 1249902593
    Max ulp: 0.500900 at 3.68277281e+05

    I find it interesting that sinpi() has worse numeric error than sin().

    I also find it interesting that the highest error is in the easy part of polynomial evaluation without argument reduction whereas sin() has its
    worst result in a region where significant argument reduction has
    transpired.

    {{Seems like another case of faster poor answers top slower correct answers.}}

    For my purposes, a max ulp of 0.538 is good enough. The time needed
    to reduce this to 0.5009 is better spent on functions not yet
    implemented in FreeBSD's libm, or on implemented functions with
    much worse ULP.

    The speed test is for 1M values evenly distributed over the interval.
    The difference in speed for sinpi vs sin is due to the argument
    reduction.

    But result precision suffers nonetheless.

    Argument reduction itself isn't the culprit. Go read the FreeBSD source
    code. What is done once one has the reduced argument is the issue. I
    suspect that any patch you come up with would be gladly accepted.

    --
    steve

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Steven G. Kargl@21:1/5 to Lawrence D'Oliveiro on Sat Feb 24 04:03:08 2024
    XPost: comp.lang.c

    On Fri, 23 Feb 2024 22:39:19 +0000, Lawrence D'Oliveiro wrote:

    On Fri, 23 Feb 2024 22:29:17 +0000, MitchAlsup1 wrote:

    I find it interesting that sinpi() has worse numeric error than sin().

    Another argument for sticking with the simpler solution? One set of radian-specific functions plus conversion factors, instead of a separate
    set of unit-specific functions for every unit?

    Here's a counter argument for your simple conversion
    factors. Changing FreeBSD's libm to return sinf(M_PI*x)
    for sinpif(x), one observes

    %./tlibm sinpi -f -x 0 -X 0x1p22 -PED
    Interval tested for sinpif: [0,4.1943e+06]
    ulp <= 0.5: 83.174% 1037372501 | 83.174% 1037372501
    0.5 < ulp < 0.6: 0.937% 11690904 | 84.111% 1049063405
    0.6 < ulp < 0.7: 0.689% 8587516 | 84.800% 1057650921
    0.7 < ulp < 0.8: 0.497% 6200902 | 85.297% 1063851823
    0.8 < ulp < 0.9: 0.323% 4023726 | 85.620% 1067875549
    0.9 < ulp < 1.0: 0.157% 1952256 | 85.776% 1069827805
    1.0 < ulp < 1.5: 0.347% 4332879 | 86.124% 1074160684
    1.5 < ulp < 2.0: 0.247% 3083647 | 86.371% 1077244331
    2.0 < ulp < 3.0: 0.360% 4491936 | 86.731% 1081736267
    3.0 < ulp < 0.0: 13.269% 165496149 | 100.000% 1247232416
    Max ulp: 8388278.500000 at 5.65300049e+03

    % ./tlibm sinpi -f -a 5.65300049e+03
    x = 5.65300049e+03f, /* 0x45b0a801 */
    libm = -2.51050433e-03f, /* 0xbb248746 */
    mpfr = -1.53398013e-03f, /* 0xbac90fd5 */
    ULP = 8388278.50000

    I can assure one of the libm and mpfr values, when x = 5653.00049,
    is quite wrong.
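
    The size of that miss is consistent with the argument being rounded after
    the multiply: near x = 5653.0005 the scaled argument M_PI*x sits around
    17759.4, where a float ulp is 2^-9 (about 0.002), and because sinpif(x)
    is near a zero crossing that argument error shows up almost 1:1 in the
    result. A small standalone sketch of the effect (plain libm, not tlibm):

        #include <math.h>
        #include <stdio.h>

        int main(void)
        {
            float  x   = 5653.00049f;          /* 0x45b0a801 */
            double pix = M_PI * (double)x;     /* scaled argument, ~17759.42 */
            float  arg = (float)pix;           /* what a float sin kernel receives */

            printf("argument rounding error = %.3g\n", (double)arg - pix);
            printf("sin(double argument)    = %.8e\n", sin(pix));         /* ~ -1.534e-03 */
            printf("sin(float  argument)    = %.8e\n", sin((double)arg)); /* shifted by the error above */
            return 0;
        }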

    --
    steve

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Steven G. Kargl on Sat Feb 24 04:47:29 2024
    XPost: comp.lang.c

    On Sat, 24 Feb 2024 04:03:08 -0000 (UTC), Steven G. Kargl wrote:

    I can assure one of the libm and mpfr values, when x = 5653.00049,
    is quite wrong.

    That’s 9 figures, but single-precision IEEE754 floats only allow about 7.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Steven G. Kargl@21:1/5 to Lawrence D'Oliveiro on Sat Feb 24 05:27:10 2024
    XPost: comp.lang.c

    On Sat, 24 Feb 2024 04:47:29 +0000, Lawrence D'Oliveiro wrote:

    On Sat, 24 Feb 2024 04:03:08 -0000 (UTC), Steven G. Kargl wrote:

    I can assure one of the libm and mpfr values, when x = 5653.00049,
    is quite wrong.

    That’s 9 figures, but single-precision IEEE754 floats only allow about
    7.

    ROTFL. I suggest (re-)reading the entire section

    IEEE Std 754™-2008

    5.12.2 External decimal character sequences representing
    finite numbers

    However, I'll give you the salient part on p.32


    For the purposes of discussing the limits on correctly
    rounded conversion, define the following quantities:

    for binary32, Pmin (binary32) = 9
    ...

    Conversions from a supported binary format bf to an external
    character sequence and back again results in a copy of the
    original number so long as there are at least Pmin (bf)
    significant digits specified and the rounding-direction
    attributes in effect during the two conversions are round to
    nearest rounding-direction attributes.

    You also seem to miss that I gave you the bit pattern as
    an unsigned 32-bit int. There are no extra binary digits.

    At this point, I think it might be prudent for you to
    read Goldberg's paper [1] and IEEE 754.

    [1] https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

    I'm done responding to your trolls, and apologies to the other c.l.c
    posters.

    --
    steve

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Steven G. Kargl on Sat Feb 24 05:38:53 2024
    XPost: comp.lang.c

    On Sat, 24 Feb 2024 04:03:08 -0000 (UTC), Steven G. Kargl wrote:

    % ./tlibm sinpi -f -a 5.65300049e+03
    x = 5.65300049e+03f, /* 0x45b0a801 */
    libm = -2.51050433e-03f, /* 0xbb248746 */
    mpfr = -1.53398013e-03f, /* 0xbac90fd5 */
    ULP = 8388278.50000

    I can assure one of the libm and mpfr values, when x = 5653.00049,
    is quite wrong.

    They’re all wrong. Python says (in double precision) that
    math.sin(5653.00049) is -0.9566595256716378.

    Even GCC’s sinf function says the value is -0.9565167, which is only a
    slight discrepancy in the fourth decimal place, so even in single-
    precision it is doing a darn sight better than you have managed above.

    So where is there a version of the code you’re using, using different
    angle units, that gets closer to the right answer?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Steven G. Kargl on Sat Feb 24 05:48:33 2024
    XPost: comp.lang.c

    On 2024-02-24, Steven G. Kargl <sgk@REMOVEtroutmask.apl.washington.edu> wrote:
    On Sat, 24 Feb 2024 04:47:29 +0000, Lawrence D'Oliveiro wrote:

    On Sat, 24 Feb 2024 04:03:08 -0000 (UTC), Steven G. Kargl wrote:

    I can assure one of the libm and mpfr values, when x = 5653.00049,
    is quite wrong.

    That’s 9 figures, but single-precision IEEE754 floats only allow about
    7.

    ROTFL. I suggest (re-)reading the entire section

    [ snip ]

    for binary32, Pmin (binary32) = 9

    Also, from the comp.lang.c point of view, in the C language we have
    FLT_DIG and FLT_DECIMAL_DIG in <float.h>.

    On IEEE 754 systems these are 6 and 9.

    FLT_DIG gives you, roughly speaking, how many decimal digits of
    precision a float can preserve in the decimal -> float -> decimal
    conversion sequence.

    FLT_DECIMAL_DIG indicates how many digits are required to exactly
    preserve a float value in decimal form: float -> decimal -> float.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Steven G. Kargl@21:1/5 to Lawrence D'Oliveiro on Sat Feb 24 06:13:58 2024
    XPost: comp.lang.c

    On Sat, 24 Feb 2024 05:38:53 +0000, Lawrence D'Oliveiro wrote:

    On Sat, 24 Feb 2024 04:03:08 -0000 (UTC), Steven G. Kargl wrote:

    % ./tlibm sinpi -f -a 5.65300049e+03
    x = 5.65300049e+03f, /* 0x45b0a801 */
    libm = -2.51050433e-03f, /* 0xbb248746 */
    mpfr = -1.53398013e-03f, /* 0xbac90fd5 */
    ULP = 8388278.50000

    I can assure one of the libm and mpfr values, when x = 5653.00049,
    is quite wrong.

    They’re all wrong. Python says (in double-precision), that math.sin(5653.00049) is -0.9566595256716378.

    Even GCC’s sinf function says the value is -0.9565167, which is only a slight discrepancy in the fourth decimal place, so even in single-
    precision it is doing a darn sight better than you have managed above.

    So where is there a version of the code you’re using, using different
    angle units, that gets closer to the right answer?

    sin(x) is not sinpi(x). The conversion factor that you're missing
    is M_PI as in sinpi(x) = sin(M_PI*x). The whole point of the
    half-cycle trig functions is that one does not need to do the
    multiplication by pi, or more importantly the argument reduction
    modulo pi.

    % ./tlibm sin -f -a 5.65300049e+3
    x = 5.65300049e+03f, /* 0x45b0a801 */
    libm = -9.56659019e-01f, /* 0xbf74e79b */
    mpfr = -9.56659019e-01f, /* 0xbf74e79b */
    ULP = 0.10338

    --
    steve

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Michael S on Sun Feb 25 16:00:58 2024
    Michael S wrote:
    On Fri, 23 Feb 2024 11:01:02 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    Michael S wrote:
    On Thu, 22 Feb 2024 21:04:52 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    π radians = half a circle. Are there any other examples of the
    usefulness of half-circles as an angle unit? As opposed to the
    dozens or hundreds of examples of the usefulness of radians as an
    angle unit?

    In digital signal processing circle-based units are pretty much
    always more natural than radians.
    For specific case of 1/2 circle, I can't see where it can be used
    directly.
    From algorithmic perspective, full circle looks like the most
    obvious choice.
    From [binary floating point] numerical properties perspective,
    1/8th of the circle (==pi/4 radians = 45 degrees) is probably the
    best option for a library routine, because for sin() its derivative
    at 0 is closest to 1 among all powers of two which means that loss
    of precision near 0 is very similar for input and for output. But
    this advantage does not sound like particularly big deal.

    ieee754 defines sinpi() and siblings, but imho it really doesn't
    matter if you use circles, half-circles (i.e. sinpi) or some other
    binary fraction of a circle: Argument reduction for huge inputs are
    just as easy, you might just have to multiply by the corresponding
    power of two (i.e. adjust the exponent) before extracting the
    fractional term.

    For sinpi(x) I could do it like this:

    if (abs(x) >= two_to_52nd_power) error("Zero significant bits.");
    ix = int(x);
    x_reduced = x - (double) (ix & ~1);
    if (x_reduced < 0.0) x_reduced += 2.0;

    but it is probably better to return a value in the [-1.0 .. 1.0>
    range?

    Terje


    Both you and Mitch look at it from wrong perspective.
    When we define a library API, an ease of implementation of the library function should be pretty low on our priority scale. As long as
    reasonably precise calculation is theoretically possible, we should
    give credit to intelligence of implementor, that's all.
    The real concern of API designer should be with avoidance of loss of precision in preparation of inputs and use of outputs.
    In specific case of y=sin2pi(x), it is x that is more problematic,
    because near 0 it starts to lose precision 3 octaves before y. In
    subnormal range we lose ~2.5 bits of precision in preparation of the argument. An implementation, no matter how good, can't recover what's
    already lost.
    sinpi() is slightly better, but only slightly, not enough better to
    justify less natural semantics.

    My yesterday suggestion is a step in right direction, but today I think
    that it is not sufficiently radical step. In specific case of sin/cos
    there is no good reason to match loss of precision on input with loss of precision.
    <Thinking out load>
    There are advantages in matched loss of precision for input and for
    output when both input and output occupy full range of real numbers
    (ignoring sign). Output of sin/cos does not occupy o full range. But
    for tan() it does. So, may be, there is a one good reason for matching
    input with output for sin/cos - consistency with tan.
    </Thinking out load>
    So, ignoring tan(), what is really an optimal input scaling for sin/cos inputs? Today I think that it is a scaling in which full circle
    corresponds to 2**64. With such scaling you never lose any input
    precision to subnoramals before precision of the result is lost
    completely.
    Now, one could ask "Why 2**64, why not 2**56 that has the same
    property?". My answer is "Because 2**64 is a scaling that is most
    convenient for preparation of trig arguments in fix-point [on 64-bit computer]." I.e. apart from being a good scaling for avoidance of loss
    of precision in tiny range, it happens to be the best scaling for interoperability with fixed point.
    That is my answer today. Will it hold tomorrow? Tomorrow will tell.

    This is, except for today being a 64-bit world as opposed to 30 years
    earlier, exactly the same reasoning Garmin's programmers used to decide
    that all their lat/long calculations would use 2^32 as the full circle.

    With a signed 32-bit int you get a resolution of 40e6 m / 2^32 = 0.0093 m
    or 9.3 mm, which they considered more than good enough back in the
    days of SA and its ~100 m RMS noise, and even after Clinton got rid of
    that (May 2, 2000), sub-cm GPS is very rarely available.

    Doing the same with 64 bits means that you get a resolution of 2.17e-12 m,
    which is 2.17 picometers or 0.0217 Å, so significantly smaller than a
    single H atom, which is about 1 Å in size.
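
    A minimal sketch of that kind of representation in C (hypothetical helper
    names, not Garmin's actual code): angles held as 32-bit binary angular
    measurement, full circle == 2^32 units, so wrap-around comes for free
    from ordinary integer arithmetic.

        #include <stdint.h>

        typedef int32_t bam32_t;                 /* +/- half circle */

        static double bam32_to_degrees(bam32_t a)
        {
            return (double)a * (360.0 / 4294967296.0);    /* 360 / 2^32 */
        }

        static double bam32_to_meters(bam32_t a)
        {
            return (double)a * (40.0e6 / 4294967296.0);   /* ~40e6 m circumference: ~9.3 mm per unit */
        }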

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Steven G. Kargl on Sun Feb 25 16:19:02 2024
    XPost: comp.lang.c

    Steven G. Kargl wrote:
    Fortunately, for 32-bit single precision floating point, one can
    exhaustively test single-argument functions. For the SW implementation
    on FreeBSD, exhaustive testing shows

    % tlibm sinpi -fPED -x 0 -X 0x1p23 -s 0
    Interval tested for sinpif: [0,8.38861e+06]
    1000000 calls, 0.015453 secs, 0.01545 usecs/call
    ulp <= 0.5: 99.842% 1253631604 | 99.842% 1253631604
    0.5 < ulp < 0.6: 0.158% 1989420 | 100.000% 1255621024
    Max ulp: 0.583666 at 6.07950642e-05

    That seems worse than it should be; I would expect better results than
    the 0.5009 quoted for regular sin() below, and still with the same or
    better performance. The only obvious explanation would be that sin(x)
    over +/- 45 degrees has a first Taylor term of 1.0*x, and you
    could probably force the first Chebyshev term to be the same, so by
    adding that as the last stage you get improved precision for free?


    % tlibm sin -fPED -x 0 -X 0x1p23 -s 0
    Interval tested for sinf: [0,8.38861e+06]
    1000000 calls, 0.019520 secs, 0.01952 usecs/call
    ulp <= 0.5: 99.995% 1249842491 | 99.995% 1249842491
    0.5 < ulp < 0.6: 0.005% 60102 | 100.000% 1249902593
    Max ulp: 0.500900 at 3.68277281e+05

    The speed test is for 1M values evenly distributed over
    the interval. The difference in speed for sinpi vs sin
    is due to the argument reduction.

    Note 1: libm built with clang/llvm with only -O2 on a
    AMD Phenom II X2 560 Processor (3300.14-MHz K8-class CPU).

    Note 2: subnormal values for x are test not included in
    the count as subnormal are tested with a different approach.


    sin(x) for subnormal x seems like it should be trivial: if you just
    return x, the maximum error of a Taylor series with two terms
    (x - x^3/3!)
    would be that second term, which is by definition zero since x is already
    subnormal, right?

    If you want to perform well on a platform where subnormal ops trap to a
    much slower hardware or software path, then it might make sense to start
    by checking for |x| < 2^-340 or so?
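
    Something like this, i.e. just a cheap early-out in front of the normal
    path (a sketch with a hypothetical wrapper name, assuming round-to-nearest):

        #include <math.h>

        double my_sin(double x)
        {
            if (fabs(x) < 0x1p-340)
                return x;                /* x^3/6 is far below half an ulp of x */
            /* ... normal argument reduction + polynomial evaluation ... */
            return sin(x);
        }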

    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Terje Mathisen on Sun Feb 25 18:11:40 2024
    XPost: comp.lang.c

    Terje Mathisen wrote:



    sin(x) for Subnormal x seems like it should be trivial: If you just
    return x the maximum error of a Taylor series with two terms
    (x - x^3/3!)
    would be that second term which is by definition zero since x is already subnormal, right?

    The 2-term polynomial begins to be accurate when |x| < ~2^-19, which is far
    bigger than 2^-1023. (2^-19)^3 is 2^-57, which contributes only a few digits
    to a fraction of size 2^-19; certainly no 3rd term is required.

    If you want to perform well on a platform where subnormal ops traps to a
    much slower hardware or software path, then it might make sense to start
    by checking for |x| < 2^-340 or so?

    Many subnormal operands yield constant results or themselves::

    cos(sn) = 1.0
    sin(sn) = sn
    tan(sn) = sn


    Terje

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Terje Mathisen on Thu Mar 14 11:26:55 2024
    XPost: comp.lang.c

    On Fri, 23 Feb 2024 11:10:00 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    MitchAlsup1 wrote:
    Steven G. Kargl wrote:
    Agreed a programmer should use what is required by the problem
    that they are solving. I'll note that SW implementations have
    their sets of tricks (e.g., use of double-double arithmetic to
    achieve double precision).

    To get near IEEE desired precision, one HAS TO use more than 754
    precision.

    There are groups who have shown that exactly rounded trancendental
    functions are in fact achievable with maybe 3X reduced performance.


    At what cost in table sizes?


    There is a suggestion on the table to make that a (probably optional
    imho) feature for an upcoming ieee754 revision.

    Terje


    The critical point here is the definition of what is considered exact. If
    'exact' is measured only on the y side of y=foo(x), disregarding
    possible imprecision on the x side, then you are very likely to end up
    with results that are slower to calculate, but not at all more useful
    from the point of view of an engineer or physicist. Exactly like
    Payne-Hanek or Mitch's equivalent of Payne-Hanek.

    The definition of 'exact' should be:
    For any finite-precision function foo(x), let's designate the same
    mathematical function calculated with infinite precision as Foo(x).
    Let's designate the operation of rounding an infinite-precision number to
    the desired finite precision as Rnd(). Rounding is done in to-nearest
    mode. Unlike in the case of basic operations, ties are allowed to be
    broken in any direction.
    The result y=foo(x) for a finite-precision number x is considered
    exact if *at least one* of the two conditions is true:
    (1=Y-clause) Rnd(Foo(x)) == y
    (2=X-clause) There exists an infinite-precision number X for which
    both Foo(X) == y and Rnd(X) == x.

    As follows from (2), it is possible and not uncommon that more
    than one finite-precision number y is accepted as an exact result of foo(x).

    If the Committee omits the 2nd clause then the whole proposition will be
    not just not useful, but harmful.
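
    A sketch of how such an acceptance test could be coded, assuming MPFR for
    the high-precision reference and sin() as the function under test. The
    X-clause is checked via the intermediate value theorem on the rounding
    interval of x, which ignores the corner case of an extremum falling
    inside that interval:

        #include <math.h>
        #include <mpfr.h>

        /* Rnd(Sin(x)): sin of a double evaluated in 256-bit MPFR,
           rounded back to double (to nearest). */
        static double ref_sin(double x)
        {
            mpfr_t t;
            mpfr_init2(t, 256);
            mpfr_set_d(t, x, MPFR_RNDN);
            mpfr_sin(t, t, MPFR_RNDN);
            double y = mpfr_get_d(t, MPFR_RNDN);
            mpfr_clear(t);
            return y;
        }

        /* Y-clause: Rnd(Foo(x)) == y. */
        static int y_clause(double x, double y)
        {
            return ref_sin(x) == y;
        }

        /* X-clause (approximate): does some real X with Rnd(X) == x have
           sin(X) == y?  When sin is monotonic over the rounding interval,
           this is just "y lies between sin at the two half-way points". */
        static int x_clause(double x, double y)
        {
            double up   = nextafter(x,  INFINITY);
            double down = nextafter(x, -INFINITY);
            double lo   = ref_sin(x - (x - down) * 0.5);
            double hi   = ref_sin(x + (up - x) * 0.5);
            if (lo > hi) { double t = lo; lo = hi; hi = t; }
            return lo <= y && y <= hi;
        }

        int accepted_as_exact(double x, double y)
        {
            return y_clause(x, y) || x_clause(x, y);
        }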

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Steven G. Kargl on Thu Mar 14 17:15:11 2024
    XPost: comp.lang.c

    Steven G. Kargl wrote:


    sin(x) is not sinpi(x). The conversion factor that you're missing
    is M_PI as in sinpi(x) = sin(M_PI*x).

    When I faced this kind of accuracy/precision problem in my HW
    transcendentals:

    To get a sufficiently correct reduced argument I had to multiply the
    fraction of x by a 2/pi number selected such that the HoB was aligned
    to 2 (quadrant) and the multiplied result had 51+53 bits of precision,
    so that up to 51 leading bits of the reduced argument (after the
    quadrant bits) could be skipped if 0. This required 3 uses of the
    multiplier tree; one produced the leading bits::

    Lead bits |/
    Middle bits / /
    trail bits /|

    which were arranged |/ /| into a single 104-bit (minimum) product. My
    current implementation uses 128 bits. And my polynomial evaluation
    uses 58-bit arguments (min, 64-bit current) at each iteration. And I
    still only get 0.502 (58-bit:: 0.5002 64-bit) precision.

    Payne and Hanek argument reduction--because it does not have access to
    the intermediate bits--needs 4 multiplies instead of 2 and very
    careful arithmetic to preserve accuracy. I can do this in 2 trips
    through the multiplier array (patented).

    So, I used 159 bits of 2/pi in 2 multiplies over 2 trips through the
    array and get perfect argument (DP) reduction in 4 cycles. Payne and
    Hanek use 256 bits of DP FP operands and 30+ instructions to do the
    same thing.

    My multiplier tree is cut into 2 sections (just like one would do for
    dual SP) but here I feed the top 2/pi bits into the left hand side and
    the bottom 2/pi bits into the right hand side so both get computed
    simultaneously; the subsequent cycle multiplies by the middle bits of
    2/pi. The 2/pi table is indexed by exponent.
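    For contrast with the hardware scheme described above, a purely
    software sketch of the conventional reduction for moderate arguments
    (Cody-Waite style, with the usual fdlibm-type split constants; this is
    emphatically not the patented 2-trip method, just the textbook
    approach it improves on):

    #include <math.h>

    /* Reduce x to r in roughly [-pi/4, pi/4] plus a quadrant number.
       Only valid while n is small enough that n*pio2_1 stays exact. */
    static double reduce_pio2(double x, int *quadrant)
    {
        static const double invpio2 = 6.36619772367581382433e-01; /* 2/pi */
        static const double pio2_1  = 1.57079632673412561417e+00; /* head */
        static const double pio2_1t = 6.07710050650619224932e-11; /* tail */
        double n = nearbyint(x * invpio2);
        *quadrant = (int)n & 3;
        return (x - n * pio2_1) - n * pio2_1t;
    }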

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Thu Mar 14 17:34:57 2024
    XPost: comp.lang.c

    Michael S wrote:

    On Fri, 23 Feb 2024 11:10:00 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    MitchAlsup1 wrote:
    Steven G. Kargl wrote:
    Agreed a programmer should use what is required by the problem
    that they are solving.  I'll note that SW implementations have
    their sets of tricks (e.g., use of double-double arithmetic to
    achieve double precision).

    To get near IEEE desired precision, one HAS TO use more than 754
    precision.

    There are groups who have shown that exactly rounded transcendental
    functions are in fact achievable with maybe 3X reduced performance.


    At which cost in tables sizes?


    There is a suggestion on the table to make that a (probably optional
    imho) feature for an upcoming ieee754 revision.

    Terje


    The critical point here is the definition of what is considered exact.
    If 'exact' is measured only on the y side of y=foo(x), disregarding
    possible imprecision on the x side, then you are very likely to end up
    with results that are slower to calculate but not at all more useful
    from the point of view of an engineer or physicist. Exactly like
    Payne-Hanek or Mitch's equivalent of Payne-Hanek.

    The definition of 'exact' should be:
    For any finite-precision function foo(x), let us designate the same
    mathematical function calculated with infinite precision as Foo(x).
    Let us designate the operation of rounding an infinite-precision number
    to the desired finite precision as Rnd(). Rounding is done in
    to-nearest mode. Unlike in the case of basic operations, ties are
    allowed to be broken in either direction.
    The result y=foo(x) for a finite-precision number x is considered
    exact if *at least one* of the two conditions is true:
    (1=Y-clause) Rnd(Foo(x)) == y
    (2=X-clause) There exists an infinite-precision number X for which
    both Foo(X) == y and Rnd(X) == x.

    In the second clause:: are we guaranteed that RND(Foo(X))= Y ??

    As follows from (2), it is possible and not uncommon that more than
    one finite-precision number y is accepted as an exact result of foo(x).

    If the Committee omits the 2nd clause then the whole proposition will
    be not just useless, but harmful.

    An interesting side effect of greater intermediate precision is the
    lack of need to round prior to the final result. Thus, my sin(x) over
    its entire calculation suffers exactly 1 rounding. Payne & Hanek does
    not have this property.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Terje Mathisen on Thu Mar 14 17:28:35 2024
    Terje Mathisen wrote:

    Michael S wrote:
    On Fri, 23 Feb 2024 11:01:02 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    Michael S wrote:
    On Thu, 22 Feb 2024 21:04:52 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    π radians = half a circle. Are there any other examples of the
    usefulness of half-circles as an angle unit? As opposed to the
    dozens or hundreds of examples of the usefulness of radians as an
    angle unit?

    In digital signal processing circle-based units are pretty much
    always more natural than radians.
    For specific case of 1/2 circle, I can't see where it can be used
    directly.
    From algorithmic perspective, full circle looks like the most
    obvious choice.
    From [binary floating point] numerical properties perspective,
    1/8th of the circle (==pi/4 radians = 45 degrees) is probably the
    best option for a library routine, because for sin() its derivative
    at 0 is closest to 1 among all powers of two which means that loss
    of precision near 0 is very similar for input and for output. But
    this advantage does not sound like particularly big deal.

    ieee754 defines sinpi() and siblings, but imho it really doesn't
    matter if you use circles, half-circles (i.e. sinpi) or some other
    binary fraction of a circle: Argument reduction for huge inputs is
    just as easy, you might just have to multiply by the corresponding
    power of two (i.e. adjust the exponent) before extracting the
    fractional term.

    For sinpi(x) I could do it like this:

    if (abs(x) >= two_to_52nd_power) error("Zero significant bits.");
    ix = int(x);
    x_reduced = x - (double) (ix & ~1);
    if (x_reduced < 0.0) x_reduced += 2.0;

    but it is probably better to return a value in the [-1.0 .. 1.0>
    range?

    Terje


    Both you and Mitch look at it from the wrong perspective.
    When we define a library API, ease of implementation of the library
    function should be pretty low on our priority scale. As long as a
    reasonably precise calculation is theoretically possible, we should
    give credit to the intelligence of the implementor, that's all.
    The real concern of the API designer should be avoidance of loss of
    precision in the preparation of inputs and the use of outputs.
    In the specific case of y=sin2pi(x), it is x that is more problematic,
    because near 0 it starts to lose precision 3 octaves before y. In the
    subnormal range we lose ~2.5 bits of precision in preparation of the
    argument. An implementation, no matter how good, can't recover what's
    already lost.
    sinpi() is slightly better, but only slightly; not enough better to
    justify its less natural semantics.

    My suggestion from yesterday is a step in the right direction, but
    today I think that it is not a sufficiently radical step. In the
    specific case of sin/cos there is no good reason to match loss of
    precision on input with loss of precision on output.
    <Thinking out loud>
    There are advantages in matched loss of precision for input and for
    output when both input and output occupy the full range of real
    numbers (ignoring sign). The output of sin/cos does not occupy the
    full range. But for tan() it does. So maybe there is one good reason
    for matching input with output for sin/cos - consistency with tan.
    </Thinking out loud>
    So, ignoring tan(), what is really an optimal input scaling for
    sin/cos inputs? Today I think that it is a scaling in which the full
    circle corresponds to 2**64. With such a scaling you never lose any
    input precision to subnormals before the precision of the result is
    lost completely.
    Now, one could ask "Why 2**64, why not 2**56, which has the same
    property?". My answer is "Because 2**64 is the scaling that is most
    convenient for preparation of trig arguments in fixed point [on a
    64-bit computer]." I.e., apart from being a good scaling for avoiding
    loss of precision in the tiny range, it happens to be the best scaling
    for interoperability with fixed point.
    That is my answer today. Will it hold tomorrow? Tomorrow will tell.
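    For illustration, a minimal sketch of that scaling (not code from the
    thread): the angle lives in a uint64_t with one full circle == 2**64,
    so reduction modulo the circle is just unsigned wrap-around; the
    conversion through a plain sin() call below only shows the scaling and
    is not meant as a quality implementation.

    #include <stdint.h>
    #include <math.h>

    /* angle = (phase / 2^64) of a full circle */
    static double sin_phase64(uint64_t phase)
    {
        double turns = (double)phase * 0x1p-64;    /* in [0, 1) */
        return sin(6.283185307179586 * turns);     /* 2*pi*turns */
    }

    /* A DDS-style oscillator simply adds a 64-bit increment per sample;
       the unsigned wrap-around of "phase += increment;" is the argument
       reduction. */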

    This is, except for today being a 64-bit world as opposed to 30 years earlier, exactly the same reasoning Garmin's programmers used to decide
    that all their lat/long calculations would use 2^32 as the full circle.

    With a signed 32-bit int you get a resolution of 40e6m / 2^32 = 0.0093m
    or 9.3 mm, which they considered was more than good enough back in the
    days of SA and its ~100 RMS noise, and even after Clinton got rid of
    that (May 2 2000 or 2001?), sub-cm GPS is very rarely available.

    The drift rate of the oscillators in the satellites is such that it
    will never be. The clock drift is reset every orbit.

    Also note: hackers are now using ground based GPS transmitters to alter
    where your GPS calculates where you think you are. This is most annoying
    around airports when planes use GPS to auto guide the planes to runways.

    Doing the same with 64-bit means that you get a resolution of 2.17e-12 m which is 2.17 picometers or 0.0217 Å, so significantly smaller than a
    single H atom which is about 1 Å in size.

    And yet, driving by Edwards AFB, sometimes my car's GPS shows me 50 m
    off the interstate-quality road, and sometimes not.

    Terje

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to All on Thu Mar 14 19:48:26 2024
    XPost: comp.lang.c

    And thirdly::

    Let us postulate that the reduced argument r is indeed calculated with 0.5ULP error.

    What makes you think you can calculate the polynomial without introducing
    any more error ??

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to All on Thu Mar 14 20:30:47 2024
    XPost: comp.lang.c

    On Thu, 14 Mar 2024 17:34:57 +0000, MitchAlsup1 wrote:

    Michael S wrote:

    (2=X-clause) There exists an infinite-precision number X for which
    both Foo(X) == y and Rnd(X) == x.

    In the second clause:: are we guaranteed that RND(Foo(X))= Y ??

    No idea why that’s relevant. Michael S was talking about “Rnd”, not “RND”.

    When you say “RND”, I think of a bad random-number generator found in various implementations of BASIC.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Lawrence D'Oliveiro on Thu Mar 14 22:19:16 2024
    XPost: comp.lang.c

    Lawrence D'Oliveiro wrote:

    On Thu, 14 Mar 2024 17:34:57 +0000, MitchAlsup1 wrote:

    Michael S wrote:

    (2=X-clause) There exists an infinite-precision number X for which
    both Foo(X) == y and Rnd(X) == x.

    In the second clause:: are we guaranteed that RND(Foo(X))= Y ??

    No idea why that’s relevant. Michael S was talking about “Rnd”, not “RND”.

    You have just selected yourself as someone I will never reply to again.

    When you say “RND”, I think of a bad random-number generator found in various implementations of BASIC.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Chris M. Thomasson on Thu Mar 14 22:18:27 2024
    Chris M. Thomasson wrote:

    On 3/14/2024 10:28 AM, MitchAlsup1 wrote:

    The drift rate of the oscillators in the satellites is such that it
    will never be.
    The clock drift is reset every orbit.

    Also note: hackers are now using ground based GPS transmitters to alter
    where your GPS calculates where you think you are. This is most annoying
    around airports when planes use GPS to auto guide the planes to runways.

    GPS Spoofing?

    More so in Europe than USA, but it is coming. Mentor Pilot had a video on
    this last week.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Thu Mar 14 22:31:57 2024
    Michael S wrote:

    On Fri, 23 Feb 2024 11:01:02 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:



    Both you and Mitch look at it from wrong perspective.

    For the record, I am only addressing SIN() or SINPI() as a HW instruction.

    When we define a library API, an ease of implementation of the library function should be pretty low on our priority scale. As long as
    reasonably precise calculation is theoretically possible, we should
    give credit to intelligence of implementor, that's all.

    At the HW level, when someone executes a
    SIN R9,R17
    instruction, they should get a result that has been computed to very
    high precision, including argument reductions (which may require more
    bits in the reduced argument than fit in an IEEE container).

    The real concern of API designer should be with avoidance of loss of precision in preparation of inputs and use of outputs.

    With an instruction performing all the work, this argument is moot.

    In specific case of y=sin2pi(x), it is x that is more problematic,
    because near 0 it starts to lose precision 3 octaves before y. In
    subnormal range we lose ~2.5 bits of precision in preparation of the argument.

    NOT when the argument reduction process does not contain ROUNDINGs !!!
    OR when the argument reduction process contains more precision bits than
    fit in an IEEE container !!! I determined for SIN() this needed 57-bits
    of intermediate reduced argument, for TAN() and ATAN() this requires
    58-bits:: compared to the 53 available in normalized IEEE 754.

    An implementation, no matter how good, can't recover what's already lost.

    My 66000 ISA has these as instructions (not function calls) and has
    the execution width to avoid your perils.

    sinpi() is slightly better, but only slightly, not enough better to
    justify less natural semantics.

    SINPI() is also an instruction.
    SIN() takes 19 cycles whereas SINPI() takes 16.....

    My yesterday suggestion is a step in right direction, but today I think
    that it is not sufficiently radical step. In specific case of sin/cos
    there is no good reason to match loss of precision on input with loss of precision.

    In the end, you either have a precise result, or you do not.

    <Thinking out loud>
    There are advantages in matched loss of precision for input and for
    output when both input and output occupy the full range of real
    numbers (ignoring sign). The output of sin/cos does not occupy the
    full range. But for tan() it does. So maybe there is one good reason
    for matching input with output for sin/cos - consistency with tan.
    </Thinking out loud>
    So, ignoring tan(), what is really an optimal input scaling for
    sin/cos inputs? Today I think that it is a scaling in which the full
    circle corresponds to 2**64. With such a scaling you never lose any
    input precision to subnormals before the precision of the result is
    lost completely.

    See USPTO 10,761,806 and 10,983,755 for your answer.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Michael S on Fri Mar 15 11:23:45 2024
    XPost: comp.lang.c

    Michael, for the most part I agree with you here, i.e. calculating
    sin(x) with x larger than 2^53 or so is almost certainly stupid.

    Actually using and depending upon the result is more stupid.

    OTOH, it is, and has always been, a core principle of ieee754 that
    basic operations (FADD/FSUB/FMUL/FDIV/FSQRT) shall assume that the
    inputs are exact (no fractional ulp uncertainty), and that, from that
    starting point, we must deliver a correctly rounded version of the
    infinitely precise exact result of the operation.

    Given the latter, it is in fact very tempting to see if that basic
    result rule could be applied to more of the non-core operations, but I
    cannot foresee any situation where I would use it myself: If I find
    myself in a situation where the final fractional ulp is important, then
    I would far rather switch to doing the operation in fp128.

    Terje

    Michael S wrote:
    On Fri, 23 Feb 2024 11:10:00 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    MitchAlsup1 wrote:
    Steven G. Kargl wrote:
    Agreed a programmer should use what is required by the problem
    that they are solving.  I'll note that SW implementations have
    their sets of tricks (e.g., use of double-double arithmetic to
    achieve double precision).

    To get near IEEE desired precision, one HAS TO use more than 754
    precision.

    There are groups who have shown that exactly rounded transcendental
    functions are in fact achievable with maybe 3X reduced performance.


    At which cost in tables sizes?


    There is a suggestion on the table to make that a (probably optional
    imho) feature for an upcoming ieee754 revision.

    Terje


    The critical point here is the definition of what is considered exact.
    If 'exact' is measured only on the y side of y=foo(x), disregarding
    possible imprecision on the x side, then you are very likely to end up
    with results that are slower to calculate but not at all more useful
    from the point of view of an engineer or physicist. Exactly like
    Payne-Hanek or Mitch's equivalent of Payne-Hanek.

    The definition of 'exact' should be:
    For any finite-precision function foo(x), let us designate the same
    mathematical function calculated with infinite precision as Foo(x).
    Let us designate the operation of rounding an infinite-precision number
    to the desired finite precision as Rnd(). Rounding is done in
    to-nearest mode. Unlike in the case of basic operations, ties are
    allowed to be broken in either direction.
    The result y=foo(x) for a finite-precision number x is considered
    exact if *at least one* of the two conditions is true:
    (1=Y-clause) Rnd(Foo(x)) == y
    (2=X-clause) There exists an infinite-precision number X for which
    both Foo(X) == y and Rnd(X) == x.

    As follows from (2), it is possible and not uncommon that more than
    one finite-precision number y is accepted as an exact result of foo(x).

    If the Committee omits the 2nd clause then the whole proposition will
    be not just useless, but harmful.



    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to All on Fri Mar 15 12:00:45 2024
    MitchAlsup1 wrote:
    Terje Mathisen wrote:
    This is, except for today being a 64-bit world as opposed to 30 years
    earlier, exactly the same reasoning Garmin's programmers used to
    decide that all their lat/long calculations would use 2^32 as the full
    circle.

    With a signed 32-bit int you get a resolution of 40e6m / 2^32 =
    0.0093m or 9.3 mm, which they considered was more than good enough
    back in the days of SA and its ~100 RMS noise, and even after Clinton
    got rid of that (May 2 2000 or 2001?), sub-cm GPS is very rarely
    available.

    The drift rate of the oscillators in the satellites is such that it
    will never be.
    The clock drift is reset every orbit.

    Sub-meter (typically 2 cm at 10-20 Hz) GPS requires a nearby (static)
    base station which can measure the individual pseudo-range errors from
    each sat and send that to the measuring device (the rover) over a
    separate channel.

    If you record all the pseudorange values, then you can do the same
    with post-processing; this can be useful for DIY surveying.

    Also note: hackers are now using ground based GPS transmitters to alter
    where your GPS calculates where you think you are. This is most annoying around airports when planes use GPS to auto guide the planes to runways.

    The potential for this attack is one of the reasons the military signal
    is encrypted, so that it cannot be spoofed.

    Doing the same with 64-bit means that you get a resolution of 2.17e-12
    m which is 2.17 picometers or 0.0217 Å, so significantly smaller than
    a single H atom which is about 1 Å in size.

    And yet, driving by Edwards AFB, sometimes my car's GPS shows me 50 m
    off the interstate-quality road, and sometimes not.

    50 m is a bit high, but still fairly typical for what commercial
    receivers can do in the vicinity of reflecting surfaces like cliffs or
    buildings.

    Modern multi-system receivers are actually getting much better at
    detecting and mitigating such problems. Starting with
    GPS+Glonass+Galileo having worldwide coverage, then adding in the
    Chinese and Japanese sats, means that as long as you don't worry too
    much about power usage you can do quite well. My personal target is
    sub-3 m when under a wet forest canopy, which is good enough for
    orienteering map field survey work.

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to mitchalsup@aol.com on Fri Mar 15 13:49:36 2024
    XPost: comp.lang.c

    On Thu, 14 Mar 2024 17:34:57 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:

    Michael S wrote:

    (2=X-clause) There exists an infinite-precision number X for which
    both Foo(X) == y and Rnd(X) == x.

    In the second clause:: are we guaranteed that RND(Foo(X))= Y ??


    No, we are not.

    As follows from (2), it is possible and not uncommon that more than
    one finite-precision number y is accepted as an exact result of
    foo(x).

    If the Committee omits the 2nd clause then the whole proposition will
    be not just useless, but harmful.

    An interesting side effect of greater intermediate precision is the
    lack of need to round prior to the final result. Thus, my sin(x) over
    its entire calculation suffers exactly 1 rounding. Payne & Hanek does
    not have this property.

    Which does not help to recover precision lost during rounding of x
    that happened before your wonderful instruction.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to All on Fri Mar 15 12:16:27 2024
    XPost: comp.lang.c

    MitchAlsup1 wrote:
    And thirdly::

    Let us postulate that the reduced argument r is indeed calculated with
    0.5ULP
    error.

    What makes you think you can calculate the polynomial without introducing
    any more error ??

    It should be obvious that any argument reduction step must return a
    value with significantly higher precision than the input(s), and that
    this higher precision value is then used in any polynomial evaluation.

    With careful setup, it is very often possible to reduce the amount of
    extended-precision work needed to just one or two steps, i.e. for the
    classic Taylor sin(x) series, with x fairly small, the x^3 and higher
    terms can make do with double precision, so that the final step is to
    add the two parts of the leading x term: first the trailing part, and
    then, when adding the upper 53 bits of x, you get a single rounding at
    this stage.

    This is easier and better when done with 64-bit fixed-point values,
    augmented with a few 128-bit operations.
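    For illustration, a sketch of that split evaluation (not code from the
    thread; plain Taylor coefficients and only a few terms, where real
    code would use a longer minimax polynomial): x_hi + x_lo is the
    reduced argument carried to more than double precision, and only the
    very last addition rounds the leading x term.

    /* sin(x) for small x = x_hi + x_lo */
    static double sin_small(double x_hi, double x_lo)
    {
        double x2 = x_hi * x_hi;             /* double precision suffices */
        double p  = x2 * x_hi *
                    (-1.0/6 + x2 * (1.0/120 + x2 * (-1.0/5040)));
        return x_hi + (x_lo + p);            /* the single rounding of the
                                                leading term happens here */
    }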

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Terje Mathisen on Fri Mar 15 14:15:52 2024
    XPost: comp.lang.c

    On Fri, 15 Mar 2024 11:23:45 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    Michael, for the most part I agree with you here, i.e. calculating
    sin(x) with x larger than 2^53 or so is almost certainly stupid.

    Actually using and depending upon the result is more stupid.

    OTOH, it is, and has always been, a core principle of ieee754 that
    basic operations (FADD/FSUB/FMUL/FDIV/FSQRT) shall assume that the
    inputs are exact (no fractional ulp uncertainty), and that, from that
    starting point, we must deliver a correctly rounded version of the
    infinitely precise exact result of the operation.

    Given the latter, it is in fact very tempting to see if that basic
    result rule could be applied to more of the non-core operations, but
    I cannot foresee any situation where I would use it myself: If I find
    myself in a situation where the final fractional ulp is important,
    then I would far rather switch to doing the operation in fp128.

    Terje


    To make it less tempting, you could try to push for inclusion of
    rsqrt() into the basic set. Long overdue, IMHO.

    Right now, I can't think of any other transcendental that I really
    want to elevate to a higher status. It seems to me that elevation of
    log2(x) and of 2**x will do no harm, but I am not sure about their
    usefulness.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From George Neuner@21:1/5 to terje.mathisen@tmsw.no on Fri Mar 15 11:40:57 2024
    On Fri, 15 Mar 2024 12:00:45 +0100, Terje Mathisen
    <terje.mathisen@tmsw.no> wrote:

    MitchAlsup1 wrote:
    Terje Mathisen wrote:
    This is, except for today being a 64-bit world as opposed to 30 years
    earlier, exactly the same reasoning Garmin's programmers used to
    decide that all their lat/long calculations would use 2^32 as the full
    circle.

    With a signed 32-bit int you get a resolution of 40e6m / 2^32 =
    0.0093m or 9.3 mm, which they considered was more than good enough
    back in the days of SA and its ~100 RMS noise, and even after Clinton
    got rid of that (May 2 2000 or 2001?), sub-cm GPS is very rarely
    available.

    The drift rate of the oscillators in the satellites is such that it
    will never be.
    The clock drift is reset every orbit.

    Sub-meter (typically 2 cm at 10-20 Hz) GPS requires a nearby (static)
    base station which can measure the individual pseudo-range errors from
    each sat and send that to the measuring device (the rover) over a
    separate channel.

    If you record all the pseudorange values, then you can do the same
    with post-processing; this can be useful for DIY surveying.

    Also note: hackers are now using ground based GPS transmitters to alter
    where your GPS calculates where you think you are. This is most annoying
    around airports when planes use GPS to auto guide the planes to runways.

    The potential for this attack is one of the reasons the military signal
    is encrypted, so that it cannot be spoofed.

    Doing the same with 64-bit means that you get a resolution of 2.17e-12
    m which is 2.17 picometers or 0.0217 Å, so significantly smaller than
    a single H atom which is about 1 Å in size.

    And yet, driving by Edwards AFB, sometimes my car's GPS shows me 50 m
    off the interstate-quality road, and sometimes not.

    50 m is a bit high, but still fairly typical for what commercial
    receivers can do in the vicinity of reflecting surfaces like cliffs or
    buildings.

    Also the AFB operates differential GPS for its runways. An OTS unit
    might be confused by close proximity to the ground transmitter.


    Modern multi-system receivers are actually getting much better at
    detecting and mitigating such problems. Starting with
    GPS+Glonass+Galileo having worldwide coverage, then adding in the
    Chinese and Japanese sats, means that as long as you don't worry too
    much about power usage you can do quite well. My personal target is
    sub-3 m when under a wet forest canopy, which is good enough for
    orienteering map field survey work.

    Terje

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Keith Thompson on Sat Mar 16 01:16:25 2024
    XPost: comp.lang.c

    Keith Thompson wrote:

    I can see how computing sin(x) with high precision for "reasonable"
    values of x would be useful, but does any of that benefit from being
    able to compute sin(2^53) accurately?

    Because accurate argument reduction reduces the burden on the
    programmer to remain within his sandbox. Since the early 1990s, when
    Payne and Hanek argument reduction became widely available, there is
    no reason NOT to perform argument reduction with great accuracy. NOTE:
    you don't HAVE to use Payne and Hanek when simpler strategies work
    adequately; but when they do not, you do not have to "Throw up your
    hands" and run away.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Sat Mar 16 01:23:12 2024
    XPost: comp.lang.c

    Michael S wrote:

    To make it less tempting, you could try to push for inclusion of
    rsqrt() into basic set. Long overdue, IMHO.

    Many Newton-Raphson iterations converge more rapidly with RSQRT than
    with SQRT, for reasons I never looked that deeply into.

    Right now, I can't think of any other transcendental that I really want
    to elevate to higher status. It seems to me that elevation of log2(x)
    and of 2**x will do no harm, but I am not sure about usefulness.

    Ln2 and EXP2 are the basis of My 66000 log and EXP families, performed
    in a HW function unit.

    Ln2 has the property that a binade [1..2) maps directly to another
    binade [2..4); exp2 has a similar property. Ln2 is easier to get good
    accuracy from than Ln or Log as a starting point. But in any event::
    you are always within a high-precision multiply of the correct result
    = round(53×106).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to All on Sat Mar 16 16:59:19 2024
    XPost: comp.lang.c

    MitchAlsup1 wrote:
    Michael S wrote:

    To make it less tempting, you could try to push for inclusion of
    rsqrt() into basic set. Long overdue, IMHO.

    Many Newton-Raphson iterations converge more rapidly with RSQRT than
    with SQRT, for reasons I never looked that deeply into.

    The classic NR iteration for rsqrt() uses only fmul & fadd/fsub, while
    the naive sqrt() version needs a divide in every iteration.

    Personally I prefer to use the y = rsqrt(x) form and then just multiply
    by x in the end:

    y = sqrt(x) = x/sqrt(x) = rsqrt(x)*x

    In order to get the rounding correct from this form you can probably
    just square the final result and compare this with the original input?
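    For illustration, a minimal sketch of that iteration (not code from
    the thread; the seed below is a lazy placeholder where real code would
    use a small table or a bit-trick estimate, the final rounding check is
    omitted, and overflow/zero edge cases are ignored):

    #include <math.h>

    static double sqrt_via_rsqrt(double x)
    {
        double y = 1.0 / sqrt((float)x);    /* crude ~24-bit seed          */
        y = y * (1.5 - 0.5 * x * y * y);    /* NR step for y ~ 1/sqrt(x);  */
        y = y * (1.5 - 0.5 * x * y * y);    /* each step ~doubles the bits */
        return x * y;                       /* sqrt(x) = x * rsqrt(x)      */
    }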

    Right now, I can't think of any other transcendental that I really want
    to elevate to higher status. It seems to me that elevation of log2(x)
    and of 2**x will do no harm, but I am not sure about usefulness.

    Ln2 and EXP2 are the basis of My 66000 log and EXP families, performed in
    HW function unit.

    Ln2 has the property that a binade [1..2) maps directly to another binade [2..4); exp2 has a similar property. Ln2 is easier to get good accuracy
    from than Ln or Log as a starting point. But in any event:: you are
    always within a high precision multiply of the correct result = round(53×106);

    Exactly the same reasoning I used to compute rsqrt() over the [1.0 -
    range, or (simplifying the exp logic) over [0.5 - 2.0> so that the
    last exp bit maps to the first vs second half of the range. Back
    before we got approximate rsqrt() lookup opcodes, I wrote code that
    took the last exp bit and the first n mantissa bits and looked up an
    optimal starting point for that value + 0.5 ulp. Getting 11-12 bits
    this way means an approximately correct float after a single iteration
    and a useful double after two.

    Terje


    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to mitchalsup@aol.com on Sat Mar 16 19:08:15 2024
    XPost: comp.lang.c

    On Sat, 16 Mar 2024 01:16:25 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:

    Keith Thompson wrote:

    I can see how computing sin(x) with high precision for "reasonable"
    values of x would be useful, but does any of that benefit from being
    able to compute sin(2^53) accurately?

    Because accurate argument reduction reduces the burden on the
    programmer to remain within his sandbox.

    Not really.
    sin(2**53) is not what the programmer wanted, because 2**53 is not the
    x he really wanted. It has nothing to do with the calculation of sin().

    Since early 1990s, when
    Payne and Hanek argument reduction became widely available, there is
    no reason NOT to perform argument reduction with great accuracy.
    NOTE: you don't HAVE to use Payne and Hanek when simpler strategies
    work adequately; but when they do not, you do not have to "Throw up
    your hands" and run away.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Sat Mar 16 17:22:50 2024
    XPost: comp.lang.c

    Michael S wrote:

    On Sat, 16 Mar 2024 01:16:25 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:

    Keith Thompson wrote:

    I can see how computing sin(x) with high precision for "reasonable"
    values of x would be useful, but does any of that benefit from being
    able to compute sin(2^53) accurately?

    Because accurate argument reduction reduces the burden on the
    programmer to remain within his sandbox.

    Not really.

    Say you are a programmer and you receive a value like 2^53 from an
    input read and you want the most accurate possible SIN( of that ).

    With high precision argument reduction you have to do (wait for it)
    NoThInG !!.

    Without high precision argument reduction you have to do a lot of
    programming--programming that I insist should have been done by the
    programmer responsible for sin(x).

    sin(2**53) is not what programmer wanted, because 2**53 is not the x he really wanted. It has nothing to do with calculation of sin().

    And EvErYtHiNg to do with high precision argument reduction; and
    maybe the programmer has no requirement on what he is allowed to
    read in from a file and pass onto sin(x).

    Since early 1990s, when
    Payne and Hanek argument reduction became widely available, there is
    no reason NOT to perform argument reduction with great accuracy.
    NOTE: you don't HAVE to use Payne and Hanek when simpler strategies
    work adequately; but when they do not, you do not have to "Throw up
    your hands" and run away.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to mitchalsup@aol.com on Sat Mar 16 18:32:52 2024
    XPost: comp.lang.c

    mitchalsup@aol.com (MitchAlsup1) writes:
    Michael S wrote:

    On Sat, 16 Mar 2024 01:16:25 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:

    Keith Thompson wrote:

    I can see how computing sin(x) with high precision for "reasonable"
    values of x would be useful, but does any of that benefit from being
    able to compute sin(2^53) accurately?

    Because accurate argument reduction reduces the burden on the
    programmer to remain within his sandbox.

    Not really.

    Say you are a programmer and you receive a value like 2^53 from an
    input read and you want the most accurate possible SIN( of that ).

    That generally seems to be a highly unlikely scenario.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to mitchalsup@aol.com on Sat Mar 16 20:49:45 2024
    XPost: comp.lang.c

    On Sat, 16 Mar 2024 17:22:50 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:

    Michael S wrote:

    On Sat, 16 Mar 2024 01:16:25 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:

    Keith Thompson wrote:

    I can see how computing sin(x) with high precision for
    "reasonable" values of x would be useful, but does any of that
    benefit from being able to compute sin(2^53) accurately?

    Because accurate argument reduction reduces the burden on the
    programmer to remain within his sandbox.

    Not really.

    Say you are a programmer and you receive a value like 2^53 from an
    input read and you want the most accurate possible SIN( of that ).

    With high precision argument reduction you have to do (wait for it)
    NoThInG !!.

    Without high precision argument reduction yo have to do a lot of programming--programming that I insist should have been done by
    the programmer responsible for sin(x).


    I have never met programmers who want to do anything of this sort.
    Programmers that I meet in the real world could sometimes want
    something else. They could want, for example, to draw 1000 points of
    the wave of a blue LED with frequency 663.42 THz (663.42e12 Hz), with
    a step of 10 nm (1e-8 m), starting at a distance of 1 million km.
    Well, not really, but at least I can imagine them wanting to do it*.
    Will a double-precision sin() routine that does perfect argument
    reduction be any more helpful for their task than a sin()
    implementation that makes no effort at all beyond a simple modulo
    2*pi? No, they would both be the same.
    Maybe sinpi(), which does a perfect reduction more or less by
    definition, would be more helpful?
    For naive folks - not at all.
    For advanced folks - yes, but just a little. It still wouldn't be easy.
    And among the programmers that I know, the naive outnumber the
    advanced by more than a factor of 10.

    -----
    * - For a more realistic case, they could want to draw the same wave
    at a distance of 2 m using single-precision arithmetic.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Keith Thompson on Sun Mar 17 00:00:00 2024
    XPost: comp.lang.c

    On 16/03/2024 23:19, Keith Thompson wrote:
    mitchalsup@aol.com (MitchAlsup1) writes:

    Say you are a programmer and you receive a value like 2^53 from an
    input read and you want the most accurate possible SIN( of that ).

    I can't think of a scenario where that would be useful (other than just
    doing it for the sake of doing it).

    If 2^53 represents a physical quantity, how likely is the actual value
    to be known within ±π (+/- pi for those who prefer ASCII)?

    If you can get better precision without too much extra cost, that's
    great. I don't know enough to have an opinion about what the best
    tradeoff is, but I presume it's going to be different depending on the application.

    Here's a C program that shows how precise sin(2^53) can be for types
    float, double, and long double (I used gcc and glibc). The nextafter functions are used to compute the nearest representable number. For
    long double, the value of sin() changes by about 1 part in 1600, which
    seems decent, but it's not nearly as precise as for values around 1.0.
    For float and double, the imprecision of the argument is enough to make
    the result practically meaningless.

    ...
    Output:

    float (32 bits, 24 mantissa bits)
    9007199254740992.00000000 -0.84892595
    9007200328482816.00000000 -0.34159181

    double (64 bits, 53 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740994.00000000 -0.12729655

    long double (128 bits, 64 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740992.00097656 -0.84944168


    Is this output supposed to be different between gcc -O0 and gcc -O3?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From bart@21:1/5 to Keith Thompson on Sun Mar 17 01:57:22 2024
    XPost: comp.lang.c

    On 17/03/2024 01:38, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    On 16/03/2024 23:19, Keith Thompson wrote:
    mitchalsup@aol.com (MitchAlsup1) writes:

    Say you are a programmer and you receive a value like 2^53 from an
    input read and you want the most accurate possible SIN( of that ).
    I can't think of a scenario where that would be useful (other than
    just
    doing it for the sake of doing it).
    If 2^53 represents a physical quantity, how likely is the actual
    value
    to be known within ±π (+/- pi for those who prefer ASCII)?
    If you can get better precision without too much extra cost, that's
    great. I don't know enough to have an opinion about what the best
    tradeoff is, but I presume it's going to be different depending on the
    application.
    Here's a C program that shows how precise sin(2^53) can be for types
    float, double, and long double (I used gcc and glibc). The nextafter
    functions are used to compute the nearest representable number. For
    long double, the value of sin() changes by about 1 part in 1600, which
    seems decent, but it's not nearly as precise as for values around 1.0.
    For float and double, the imprecision of the argument is enough to make
    the result practically meaningless.

    ...
    Output:
    float (32 bits, 24 mantissa bits)
    9007199254740992.00000000 -0.84892595
    9007200328482816.00000000 -0.34159181
    double (64 bits, 53 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740994.00000000 -0.12729655
    long double (128 bits, 64 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740992.00097656 -0.84944168


    Is this output supposed to be different between gcc -O0 and gcc -O3?

    No, why do you ask?

    On my system, the output is identical with gcc and clang, and tcc at all optimization levels, with both glibc and musl. I do get slightly
    different results for long double on Cygwin vs. Ubuntu (Cygwin uses
    newlib, not glibc).


    Because I get the results shown below. Even if the library is different
    from yours, I would have assumed matching results.


    ------------------------------

    C:\c>gcc c.c -O3

    C:\c>a
    float (32 bits, 24 mantissa bits)
    9007199254740992.00000000 -0.84892595
    9007200328482816.00000000 -0.34159181

    double (64 bits, 53 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740994.00000000 -0.12729655

    long double (128 bits, 64 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740992.00097656 -0.84944168

    C:\c>gcc c.c -O0

    C:\c>a
    float (32 bits, 24 mantissa bits)
    9007199254740992.00000000 -0.84893209
    9007200328482816.00000000 -0.34160271

    double (64 bits, 53 mantissa bits)
    9007199254740992.00000000 -0.84893209
    9007199254740994.00000000 -0.12728505

    long double (128 bits, 64 mantissa bits)
    9007199254740992.00000000 -0.84893209
    9007199254740992.00097656 -0.84944780

    C:\c>gcc --version
    gcc (tdm64-1) 10.3.0

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Keith Thompson on Sun Mar 17 11:06:21 2024
    XPost: comp.lang.c

    On Sat, 16 Mar 2024 16:19:11 -0700
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    mitchalsup@aol.com (MitchAlsup1) writes:
    Michael S wrote:
    On Sat, 16 Mar 2024 01:16:25 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:
    Keith Thompson wrote:
    I can see how computing sin(x) with high precision for
    "reasonable" values of x would be useful, but does any of that
    benefit from being able to compute sin(2^53) accurately?
    Because accurate argument reduction reduces the burden on the
    programmer to remain within his sandbox.

    Not really.

    Say you are a programmer and you receive a value like 2^53 from an
    input read and you want the most accurate possible SIN( of that ).

    I can't think of a scenario where that would be useful (other than
    just doing it for the sake of doing it).

    If 2^53 represents a physical quantity, how likely is the actual value
    to be known within ±π (+/- pi for those who prefer ASCII)?

    If you can get better precision without too much extra cost, that's
    great. I don't know enough to have an opinion about what the best
    tradeoff is, but I presume it's going to be different depending on the application.

    Here's a C program that shows how precise sin(2^53) can be for types
    float, double, and long double (I used gcc and glibc). The nextafter functions are used to compute the nearest representable number. For
    long double, the value of sin() changes by about 1 part in 1600, which
    seems decent, but it's not nearly as precise as for values around 1.0.
    For float and double, the imprecision of the argument is enough to
    make the result practically meaningless.

    #include <math.h>
    #include <stdio.h>
    #include <limits.h>
    #include <float.h>
    int main(void) {
        {
            printf("float (%zu bits, %d mantissa bits)\n",
                   CHAR_BIT * sizeof (float), FLT_MANT_DIG);
            const float x = (float)(1LL<<53);
            const float y = nextafterf(x, x*2);
            printf("%.8f %.8f\n", x, sinf(x));
            printf("%.8f %.8f\n", y, sinf(y));
        }
        putchar('\n');
        {
            printf("double (%zu bits, %d mantissa bits)\n",
                   CHAR_BIT * sizeof (double), DBL_MANT_DIG);
            const double x = (double)(1LL<<53);
            const double y = nextafter(x, x*2);
            printf("%.8f %.8f\n", x, sin(x));
            printf("%.8f %.8f\n", y, sin(y));
        }
        putchar('\n');
        {
            printf("long double (%zu bits, %d mantissa bits)\n",
                   CHAR_BIT * sizeof (long double), LDBL_MANT_DIG);
            const long double x = (long double)(1LL<<53);
            const long double y = nextafterl(x, x*2);
            printf("%.8Lf %.8Lf\n", x, sinl(x));
            printf("%.8Lf %.8Lf\n", y, sinl(y));
        }
    }

    Output:

    float (32 bits, 24 mantissa bits)
    9007199254740992.00000000 -0.84892595
    9007200328482816.00000000 -0.34159181

    double (64 bits, 53 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740994.00000000 -0.12729655

    long double (128 bits, 64 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740992.00097656 -0.84944168


    As written, your example does not emphasize that the problem has
    nothing to do with the implementation of the sinX() library routine.
    It's best illustrated by the followup conversation with bart, IMHO
    100% O.T.
    To make the point clearer I'd rather change it to the following form:

    #include <math.h>
    #include <stdio.h>
    #include <limits.h>
    #include <float.h>

    void foo(long double x1, long double x2)
    {
        const double y1 = (double)sinl(x1);
        const double y2 = (double)sinl(x2);
        printf("%.20Le %.17f\n", x1, y1);
        printf("%.20Le %.17f\n", x2, y2);
    }

    int main(void) {
        const float x0 = (float)(1LL<<53);
        {
            printf("float (%zu bits, %d mantissa bits)\n",
                   CHAR_BIT * sizeof (float), FLT_MANT_DIG);
            const float x1 = x0;
            const float x2 = nextafterf(x1, FLT_MAX);
            foo(x1, x2);
        }
        putchar('\n');
        {
            printf("double (%zu bits, %d mantissa bits)\n",
                   CHAR_BIT * sizeof (double), DBL_MANT_DIG);
            const double x1 = x0;
            const double x2 = nextafter(x1, FLT_MAX);
            foo(x1, x2);
        }
        putchar('\n');
        {
            printf("long double (%zu bits, %d mantissa bits)\n",
                   CHAR_BIT * sizeof (long double), LDBL_MANT_DIG);
            const long double x1 = x0;
            const long double x2 = nextafterl(x1, FLT_MAX);
            foo(x1, x2);
        }
    }

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Michael S on Sun Mar 17 11:34:58 2024
    XPost: comp.lang.c

    On Sun, 17 Mar 2024 11:06:21 +0200
    Michael S <already5chosen@yahoo.com> wrote:

    On Sat, 16 Mar 2024 16:19:11 -0700
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

    mitchalsup@aol.com (MitchAlsup1) writes:
    Michael S wrote:
    On Sat, 16 Mar 2024 01:16:25 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:
    Keith Thompson wrote:
    I can see how computing sin(x) with high precision for
    "reasonable" values of x would be useful, but does any of that
    benefit from being able to compute sin(2^53) accurately?
    Because accurate argument reduction reduces the burden on the
    programmer to remain within his sandbox.

    Not really.

    Say you are a programmer and you receive a value like 2^53 from an
    input read and you want the most accurate possible SIN( of that ).


    I can't think of a scenario where that would be useful (other than
    just doing it for the sake of doing it).

    If 2^53 represents a physical quantity, how likely is the actual
    value to be known within ±π (+/- pi for those who prefer ASCII)?

    If you can get better precision without too much extra cost, that's
    great. I don't know enough to have an opinion about what the best
    tradeoff is, but I presume it's going to be different depending on
    the application.

    Here's a C program that shows how precise sin(2^53) can be for types
    float, double, and long double (I used gcc and glibc). The
    nextafter functions are used to compute the nearest representable
    number. For long double, the value of sin() changes by about 1
    part in 1600, which seems decent, but it's not nearly as precise as
    for values around 1.0. For float and double, the imprecision of the argument is enough to make the result practically meaningless.


    <snip>



    As written, your example does not emphasize that the problem has
    nothing to do with implementation of sinX() library routine.
    It's best illustrated by followup conversation with bart, IMHO 100%
    O.T.
    To make the point more clear I'd rather change it to following form:

    <snip>



    BTW, personally I'd prefer to illustrate the futility of Payne-Hanek
    or equivalent reduction with the following example:

    #include <math.h>
    #include <stdio.h>
    #include <limits.h>
    #include <float.h>

    void foo(long double x)
    {
    const double y = (double)sinl(x);
    printf("%.20Le %25.16La %.17f\n", x, x, y);
    }

    int main(void) {
    const float a0 = 0x1p56;
    const float b0 = 11;
    {
    printf("%-12s (%3zu bits, %d mantissa bits) ",
    "float", CHAR_BIT * sizeof (float), FLT_MANT_DIG);
    const float a = a0;
    const float b = b0;
    const float x = a / b;
    foo(x);
    }
    {
    printf("%-12s (%3zu bits, %d mantissa bits) ",
    "double", CHAR_BIT * sizeof (double), DBL_MANT_DIG);
    const double a = a0;
    const double b = b0;
    const double x = a / b;
    foo(x);
    }
    {
    printf("%-12s (%3zu bits, %d mantissa bits) ",
    "long double", CHAR_BIT * sizeof (long double), LDBL_MANT_DIG);
    const long double a = a0;
    const long double b = b0;
    const long double x = a / b;
    foo(x);
    }
    }

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Keith Thompson on Sun Mar 17 13:10:56 2024
    XPost: comp.lang.c

    On 17/03/2024 03:57, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    On 17/03/2024 01:38, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    On 16/03/2024 23:19, Keith Thompson wrote:
    mitchalsup@aol.com (MitchAlsup1) writes:

    Say you are a programmer and you receive a value like 2^53 from an
    input read and you want the most accurate possible SIN( of that ).

    I can't think of a scenario where that would be useful (other than
    just doing it for the sake of doing it).

    If 2^53 represents a physical quantity, how likely is the actual value
    to be known within ±π (+/- pi for those who prefer ASCII)?

    If you can get better precision without too much extra cost, that's
    great. I don't know enough to have an opinion about what the best
    tradeoff is, but I presume it's going to be different depending on the
    application.

    Here's a C program that shows how precise sin(2^53) can be for types
    float, double, and long double (I used gcc and glibc). The nextafter
    functions are used to compute the nearest representable number. For
    long double, the value of sin() changes by about 1 part in 1600, which
    seems decent, but it's not nearly as precise as for values around 1.0.
    For float and double, the imprecision of the argument is enough to
    make the result practically meaningless.

    ...
    Output:
    float (32 bits, 24 mantissa bits)
    9007199254740992.00000000 -0.84892595
    9007200328482816.00000000 -0.34159181
    double (64 bits, 53 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740994.00000000 -0.12729655
    long double (128 bits, 64 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740992.00097656 -0.84944168


    Is this output supposed to be different between gcc -O0 and gcc -O3?
    No, why do you ask?
    On my system, the output is identical with gcc and clang, and tcc at
    all
    optimization levels, with both glibc and musl. I do get slightly
    different results for long double on Cygwin vs. Ubuntu (Cygwin uses
    newlib, not glibc).


    Because I get the results shown below. Even if the library is
    different from yours, I would have assumed matching results.


    ------------------------------

    C:\c>gcc c.c -O3

    C:\c>a
    float (32 bits, 24 mantissa bits)
    9007199254740992.00000000 -0.84892595
    9007200328482816.00000000 -0.34159181

    double (64 bits, 53 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740994.00000000 -0.12729655

    long double (128 bits, 64 mantissa bits)
    9007199254740992.00000000 -0.84892596
    9007199254740992.00097656 -0.84944168

    C:\c>gcc c.c -O0

    C:\c>a
    float (32 bits, 24 mantissa bits)
    9007199254740992.00000000 -0.84893209
    9007200328482816.00000000 -0.34160271

    double (64 bits, 53 mantissa bits)
    9007199254740992.00000000 -0.84893209
    9007199254740994.00000000 -0.12728505

    long double (128 bits, 64 mantissa bits)
    9007199254740992.00000000 -0.84893209
    9007199254740992.00097656 -0.84944780

    C:\c>gcc --version
    gcc (tdm64-1) 10.3.0

    I see similar results (possibly identical, I haven't checked) with i686-w64-mingw32-gcc on Cygwin. I think it uses the same, or a similar, runtime library as the tdm64 implementation you're using.

    Comparing the generated assembly, at -O0 it generates a call to "sin",
    while at -O1 and above it replaces the expression with a compile-time constant. Apparently that compile-time constant matches the run-time computed value for glibc, but not for whatever tdm64 uses.

    I don't *think* it's a conformance issue as long as the precision is
    within the standard's (fairly vague) requirements. When I modify the
    program to start with sin(1.0) rather than sin(2**53), the output is identical at all optimization levels.

    Apparently glibc and the library used by tdm64 behave differently when
    computing the sin of very large values. The result for sin(2**53) in
    long double differs by about one part in 2**17 between -O0 and -O1, or
    between gcc/glibc and tdm64.


    That is interesting. It might not be a conformance issue, but it is a divergence from the intention of the compiler. gcc makes use of a very comprehensive library that emulates a whole range of floating point
    hardware implementations to try to get bit-perfect compile-time versions
    of run-time calculations, precisely so that it can do compile-time
    calculations that fully match what you would get at run-time, regardless
    of compilation flags and optimisation.

    So if the program is producing different results for run-time and
    compile-time versions of "sin", then there is a mismatch between the configuration of the compiler and the run-time library it is using. I
    don't know enough about the configuration options for gcc (there are a
    /lot/ of them), but I know it can be configured to assume you are using
    glibc or a glibc-compatible library. Perhaps the mingw build here is configured with this option, but is actually using a run-time library
    with a slightly different implementation of these functions (such as an MS DLL C library). If that is the case, then this is a mistake in the
    configuration of gcc here, and I would think it could be reported back
    to the mingw people as a bug.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Monnier@21:1/5 to All on Mon Mar 18 15:18:17 2024
    XPost: comp.lang.c

    There are groups who have shown that exactly rounded transcendental
    functions are in fact achievable with maybe 3X reduced performance.

    That much? I had the impression it was significantly cheaper.

    At which cost in tables sizes?

    My impression was that it wasn't costly in that respect, but since my recollection seems to be off on the performance, maybe it's off here
    as well.

    The critical point here is the definition of what is considered exact.
    If 'exact' is measured only on the y side of y=foo(x), disregarding
    possible imprecision on the x side, then you are very likely to end up
    with results that are slower to calculate but not at all more useful
    from the point of view of an engineer or physicist. Exactly like
    Payne-Hanek or Mitch's equivalent of Payne-Hanek.

    I don't know what are/were the motivations for the people working on
    exact transcendentals, but they have applications unrelated to the fact
    that they're "better": the main benefit (from this here PL guy) is that
    it gives them a reliable, reproducible semantics.
    Bit-for-bit reproducibility makes several things much easier.


    Stefan

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Stefan Monnier on Mon Mar 18 22:19:19 2024
    XPost: comp.lang.c

    Stefan Monnier wrote:

    There are groups who have shown that exactly rounded transcendental
    functions are in fact achievable with maybe 3X reduced performance.

    That much? I had the impression it was significantly cheaper.

    The J. M. Muller book indicates about 2× to 2.5×

    At what cost in table sizes?

    Making transcendentals faster is always a tradeoff between table size
    and speed. SIN() and COS() can use 10-term polynomials when the reduced
    argument is in -¼pi..+¼pi; on the other hand, when the reduced argument
    is within ±0.008, a 3-term polynomial is just as accurate, but you need
    128×3 tables.
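
    One common way to spend a table like that (a sketch only, with made-up
    names, and not necessarily the scheme meant above): split the reduced
    argument y as y = a + r, where a is the nearest of ~128 tabulated
    points and |r| is tiny, then combine sin(a+r) = sin(a)cos(r) +
    cos(a)sin(r) using very short polynomials for sin(r) and cos(r).

        #include <math.h>

        #define N    128
        #define PI   3.14159265358979323846
        #define STEP (PI / (2.0 * N))        /* table spacing over [-pi/4, +pi/4] */

        static double sin_tab[N + 1], cos_tab[N + 1];

        static void init_tables(void)        /* illustration only: filled with libm */
        {
            for (int k = 0; k <= N; k++) {
                sin_tab[k] = sin((k - N / 2) * STEP);
                cos_tab[k] = cos((k - N / 2) * STEP);
            }
        }

        /* y must already be reduced into [-pi/4, +pi/4]. */
        static double sin_reduced(double y)
        {
            int    k  = (int)lround(y / STEP) + N / 2;  /* nearest table point    */
            double r  = y - (k - N / 2) * STEP;         /* |r| <= STEP/2 ~= 0.006 */
            double r2 = r * r;
            double sin_r = r * (1.0 - r2 / 6.0);            /* short odd poly     */
            double cos_r = 1.0 - r2 / 2.0 + r2 * r2 / 24.0; /* short even poly    */
            return sin_tab[k] * cos_r + cos_tab[k] * sin_r; /* sin(a + r)         */
        }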

    My impression was that it wasn't costly in that respect, but since my recollection seems to be off on the performance, maybe it's off here
    as well.

    ITANIC did rather well, here, because it had 2 FMAC units and could use
    both for the polynomials.

    The critical point here is the definition of what is considered exact. If
    'exact' is measured only on the y side of y=foo(x), disregarding
    possible imprecision on the x side, then you are very likely to end up
    with results that are slower to calculate, but not at all more useful
    from the point of view of an engineer or physicist. Exactly like Payne-Hanek
    or Mitch's equivalent of Payne-Hanek.

    I don't know what are/were the motivations for the people working on
    exact transcendentals, but they have applications unrelated to the fact
    that they're "better": the main benefit (from this here PL guy) is that
    it gives them a reliable, reproducible semantics.
    Bit-for-bit reproducibility makes several things much easier.

    Consider moving an application which uses libm from machine to machine.
    When libm is correctly rounded, there is no issue at all; not so otherwise.


    Stefan

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Monnier@21:1/5 to All on Wed Mar 20 09:54:36 2024
    XPost: comp.lang.c

    There are groups who have shown that exactly rounded transcendental
    functions are in fact achievable with maybe 3X reduced performance.
    That much? I had the impression it was significantly cheaper.
    The J. M. Muller book indicates about 2× to 2.5×

    The [Rlibm](https://people.cs.rutgers.edu/~sn349/rlibm/) project claims
    to get much better performance (basically, in the same ballpark as not-correctly-rounded implementations).

    [ Their key insight is the idea that to get correct rounding, you
    shouldn't try to compute the best approximation of the exact result
    and then round, but you should instead try to compute any
    approximation whose rounding gives the correct result. ]

    My impression was that their performance was good enough that the case
    for not-correctly-rounded implementations becomes very weak.
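
    Concretely, the target of their search is a rounding interval rather
    than a point: any double that lands strictly inside the interval below
    rounds to the same binary32 result, so the polynomial only has to hit
    the interval, not the exact value. A small sketch (the helper name is
    mine; ties-to-even boundary cases are glossed over):

        #include <math.h>
        #include <stdio.h>

        /* Open interval of doubles that round (to nearest) to the float y. */
        static void rounding_interval(float y, double *lo, double *hi)
        {
            float below = nextafterf(y, -INFINITY);
            float above = nextafterf(y, +INFINITY);
            *lo = 0.5 * ((double)below + (double)y);
            *hi = 0.5 * ((double)y + (double)above);
        }

        int main(void)
        {
            float  y = sinf(0.7f);   /* stand-in for the correctly rounded result */
            double lo, hi;
            rounding_interval(y, &lo, &hi);
            printf("any double in (%.17g, %.17g) rounds to %.9g\n", lo, hi, (double)y);
            return 0;
        }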

    I don't know what are/were the motivations for the people working on
    exact transcendentals, but they have applications unrelated to the fact
    that they're "better": the main benefit (from this here PL guy) is that
    it gives them a reliable, reproducible semantics.
    Bit-for-bit reproducibility makes several things much easier.

    Consider moving an application which uses libm from machine to machine.
    When libm is correctly rounded, there is no issue at all; not so otherwise.

    Exactly!
    [ Or should I say "Correctly rounded!"? ]


    Stefan

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Stefan Monnier on Wed Mar 20 18:21:47 2024
    XPost: comp.lang.c

    On Wed, 20 Mar 2024 09:54:36 -0400
    Stefan Monnier <monnier@iro.umontreal.ca> wrote:

    There are groups who have shown that exactly rounded
    transcendental functions are in fact achievable with maybe 3X
    reduced performance.
    That much? I had the impression it was significantly cheaper.
    The J. M. Muller book indicates about 2× to 2.5×

    The [Rlibm](https://people.cs.rutgers.edu/~sn349/rlibm/) project
    claims to get much better performance (basically, in the same
    ballpark as not-correctly-rounded implementations).


    I had only read the 1st page.
    It sounds like they are not particularly interested in IEEE binary64,
    which appears to be the major point of interest for the math libs of
    conventional languages.

    [ Their key insight is the idea that to get correct rounding, you
    shouldn't try to compute the best approximation of the exact result
    and then round, but you should instead try to compute any
    approximation whose rounding gives the correct result. ]

    My impression was that their performance was good enough that the case
    for not-correctly-rounded implementations becomes very weak.


    It all depends on what you compare against.
    For a scalar call, for the majority of transcendental functions on the
    IEEE-754 list, it's probably very easy to get correctly rounded binary32
    results in approximately the same time as results calculated with a max.
    error of, say, 0.75 ULP. Especially so if the target machine has fast
    binary64 arithmetic.
    But in practice, when we use lower (than binary64) precision, we often
    care about vector performance rather than scalar.
    I.e. we care little about the speed of sinf(), but want ippsTone_32f() as
    fast as possible. In case you wonder, this function is part of Intel
    Performance Primitives and it is FAST. Writing a correctly rounded
    function that approaches the speed of this *almost* correctly
    rounded routine (I think, for sane input ranges it's better than
    0.55 ULP) would not be easy at all!
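
    For reference, a plain-C stand-in for that kind of routine (this is
    not the IPP API; the function and parameter names are made up) shows
    why throughput is the whole game: it is the loop, not any single
    sinf() call, that has to be fast.

        #include <math.h>

        /* rel_freq is the tone frequency relative to the sample rate. */
        void tone_32f_sketch(float *dst, int len, float magn,
                             float rel_freq, float phase)
        {
            const float two_pi = 6.283185307f;
            for (int i = 0; i < len; i++)    /* vectorizable with a SIMD sinf */
                dst[i] = magn * sinf(two_pi * rel_freq * (float)i + phase);
        }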

    I don't know what are/were the motivations for the people working
    on exact transcendentals, but they have applications unrelated to
    the fact that they're "better": the main benefit (from this here
    PL guy) is that it gives them a reliable, reproducible semantics.
    Bit-for-bit reproducibility makes several things much easier.

    Consider moving an application which uses libm from machine to
    machine. When libm is correctly rounded, there is no issue at all;
    not so otherwise.

    Exactly!
    [ Or should I say "Correctly rounded!"? ]


    Stefan

    You like this proposal because you are an implementer of the language/lib.
    It makes your regression tests easier. And it's a good challenge.
    I don't like it because I am primarily a user of the language/lib. My
    floating-point tests have zero chance of that sort of repeatability, for
    a thousand other reasons.
    I don't want to pay for correct rounding of transcendental functions.
    Neither in speed nor, especially, in table footprint. Not even a
    little. Because for me there are no advantages.
    Now, there are things that I am ready to pay for, e.g. preservation of the
    mathematical properties of the original exact function. I.e., if the original
    is monotonic on a certain interval, then I do want at least weak monotonicity
    of the approximation. If the original is even (or odd), I want the same for
    the approximation. If the original never exceeds 1, I want the same
    for the approximation. Etc. But correct rounding is not on the list.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Steven G. Kargl@21:1/5 to Michael S on Wed Mar 20 17:02:57 2024
    XPost: comp.lang.c

    On Wed, 20 Mar 2024 18:21:47 +0200, Michael S wrote:

    On Wed, 20 Mar 2024 09:54:36 -0400
    Stefan Monnier <monnier@iro.umontreal.ca> wrote:

    There are groups who have shown that exactly rounded
    transcendental functions are in fact achievable with maybe 3X
    reduced performance.
    That much? I had the impression it was significantly cheaper.
    The J. M. Muller book indicates about 2× to 2.5×

    The [Rlibm](https://people.cs.rutgers.edu/~sn349/rlibm/) project
    claims to get much better performance (basically, in the same
    ballpark as not-correctly-rounded implementations).


    I had only read the 1st page.
    It sounds like they are not particularly interested in IEEE binary64
    which appears to be the major point of interest of math libs of
    conventional languages.


    I skimmed their logf(x) implementation. Their technique will
    fall apart for binary64 (and other higher precisions). With
    logf(x), they combine an argument reduction step with table look-up and
    a 5th-order polynomial approximation. The polynomial is computed
    in double precision, and provides the correct rounding. For
    binary64, the polynomial would require many more terms and
    double-double or binary128 evaluation.
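
    The structure described above might look roughly like this in C (a
    sketch with made-up names and a 128-entry table; the plain Taylor
    coefficients stand in for the tuned coefficients a real implementation
    would use to force correct rounding):

        #include <math.h>
        #include <stdint.h>
        #include <string.h>

        #define TBL_BITS 7                        /* 128-entry table         */
        static double log_tab[1 << TBL_BITS];     /* log of the table points */

        static void init_log_tab(void)            /* illustration: uses libm */
        {
            for (int k = 0; k < (1 << TBL_BITS); k++)
                log_tab[k] = log(1.0 + (double)k / (1 << TBL_BITS));
        }

        static float logf_sketch(float x)         /* assumes x positive, normal */
        {
            uint32_t u;
            memcpy(&u, &x, sizeof u);
            int      e    = (int)(u >> 23) - 127;              /* unbiased exponent */
            uint32_t frac = u & 0x007fffffu;                   /* 23 fraction bits  */
            int      k    = (int)(frac >> (23 - TBL_BITS));    /* top bits -> index */
            double   m    = 1.0 + (double)frac * 0x1p-23;      /* mantissa in [1,2) */
            double   c    = 1.0 + (double)k / (1 << TBL_BITS); /* table point       */
            double   t    = (m - c) / c;                       /* |t| < 2^-7        */
            /* Degree-5 polynomial for log1p(t), evaluated in double. */
            double p = t * (1.0 + t * (-0.5 + t * (1.0/3.0 + t * (-0.25 + t * 0.2))));
            return (float)(e * 0.69314718055994531 + log_tab[k] + p);
        }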

    --
    steve

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Monnier@21:1/5 to All on Wed Mar 20 12:59:24 2024
    XPost: comp.lang.c

    The [Rlibm](https://people.cs.rutgers.edu/~sn349/rlibm/) project
    claims to get much better performance (basically, in the same
    ballpark as not-correctly-rounded implementations).
    I had only read the 1st page.
    It sounds like they are not particularly interested in IEEE binary64
    which appears to be the major point of interest of math libs of
    conventional languages.

    Indeed, I hoped by now they had attacked the world of 64bit floats, but
    it looks like it's not the case yet. In theory the same idea is
    applicable there and should give similar results, but of course it's
    harder for two reasons:

    - The search space to consider is much larger, making it much more
    costly to do exhaustive testing (and to collect the info they need to
    choose/compute the polynomials).
    - You can't cheaply use higher-precision floats for internal computations.

    It all depends on what you compare against.
    For a scalar call, for the majority of transcendental functions on the
    IEEE-754 list, it's probably very easy to get correctly rounded binary32
    results in approximately the same time as results calculated with a max.
    error of, say, 0.75 ULP. Especially so if the target machine has fast
    binary64 arithmetic.

    IIUC that was not the case before their work: it was "easy" to get the
    correct result in 99% of the cases, but covering all 100% of the cases
    used to be costly because those few cases needed a lot more
    internal precision.

    I don't want to pay for correct rounding of transcendental functions.
    Neither in speed nor, especially, in table footprint. Not even a
    little. Because for me there are no advantages.

    For that reason, correctly rounded results were nowhere near the menu,
    indeed. But the proposition of Rlibm is that maybe we can have our cake
    and eat it too, because (if all goes well) you don't have to pay for it
    in speed nor in table size.

    I'm hoping that pans out :-)

    Now, there are things that I am ready to pay for, e.g. preservation of the
    mathematical properties of the original exact function. I.e., if the original
    is monotonic on a certain interval, then I do want at least weak monotonicity
    of the approximation. If the original is even (or odd), I want the same for
    the approximation. If the original never exceeds 1, I want the same
    for the approximation. Etc. But correct rounding is not on the list.

    But correct rounding would give you those properties.


    Stefan

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Stefan Monnier on Wed Mar 20 20:26:03 2024
    XPost: comp.lang.c

    Stefan Monnier wrote:

    There are groups who have shown that exactly rounded transcendental
    functions are in fact achievable with maybe 3X reduced performance.
    That much? I had the impression it was significantly cheaper.
    The J. M. Muller book indicates about 2× to 2.5×

    The [Rlibm](https://people.cs.rutgers.edu/~sn349/rlibm/) project claims
    to get much better performance (basically, in the same ballpark as not-correctly-rounded implementations).

    [ Their key insight is the idea that to get correct rounding, you
    shouldn't try to compute the best approximation of the exact result
    and then round, but you should instead try to compute any
    approximation whose rounding gives the correct result. ]

    This sounds like the "Minefield Method"

    My impression was that their performance was good enough that the case
    for not-correctly-rounded implementations becomes very weak.

    I don't know what are/were the motivations for the people working on
    exact transcendentals, but they have applications unrelated to the fact
    that they're "better": the main benefit (from this here PL guy) is that
    it gives them a reliable, reproducible semantics.
    Bit-for-bit reproducibility makes several things much easier.

    Consider moving an application which uses libm from machine to machine.
    When libm is correctly rounded, there is no issue at all; not so otherwise.

    Exactly!
    [ Or should I say "Correctly rounded!"? ]


    Stefan

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Monnier@21:1/5 to All on Wed Mar 20 16:34:22 2024
    XPost: comp.lang.c

    [ Their key insight is the idea that to get correct rounding, you
    shouldn't try to compute the best approximation of the exact result
    and then round, but you should instead try to compute any
    approximation whose rounding gives the correct result. ]

    This sounds like the "Minefield Method"

    Indeed it is.


    Stefan

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Wed Mar 20 20:33:44 2024
    XPost: comp.lang.c

    Michael S wrote:

    On Wed, 20 Mar 2024 09:54:36 -0400
    Stefan Monnier <monnier@iro.umontreal.ca> wrote:

    [ Their key insight is the idea that to get correct rounding, you
    shouldn't try to compute the best approximation of the exact result
    and then round, but you should instead try to compute any
    approximation whose rounding gives the correct result. ]

    My impression was that their performance was good enough that the case
    for not-correctly-rounded implementations becomes very weak.


    It all depends on what you compare against.
    For a scalar call, for the majority of transcendental functions on the
    IEEE-754 list, it's probably very easy to get correctly rounded binary32
    results in approximately the same time as results calculated with a max.
    error of, say, 0.75 ULP. Especially so if the target machine has fast
    binary64 arithmetic.

    But in practice, when we use lower (than binary64) precision, we often
    care about vector performance rather than scalar.
    I.e. we care little about the speed of sinf(), but want ippsTone_32f() as
    fast as possible. In case you wonder, this function is part of Intel
    Performance Primitives and it is FAST. Writing a correctly rounded
    function that approaches the speed of this *almost* correctly
    rounded routine (I think, for sane input ranges it's better than
    0.55 ULP) would not be easy at all!

    I challenge ANY software version of SIN() correctly rounded or not
    to compete with my <patented> HW implementations for speed (or even
    power).

    I don't know what are/were the motivations for the people working
    on exact transcendentals, but they have applications unrelated to
    the fact that they're "better": the main benefit (from this here
    PL guy) is that it gives them a reliable, reproducible semantics.
    Bit-for-bit reproducibility makes several things much easier.

    Consider moving an application which uses libm from machine to
    machine. When libm is correctly rounded, there is no issue at all;
    not so otherwise.

    Exactly!
    [ Or should I say "Correctly rounded!"? ]


    Stefan

    You like this proposal because you are an implementer of the language/lib.
    It makes your regression tests easier. And it's a good challenge.
    I don't like it because I am primarily a user of the language/lib. My
    floating-point tests have zero chance of that sort of repeatability, for
    a thousand other reasons.

    I don't want to pay for correct rounding of transcendental functions.

    Even when the HW is 7× faster than SW algorithms ??

    Neither in speed nor, especially, in table footprint. Not even a
    little. Because for me there are no advantages.

    My HW algorithms use no memory (cache, DRAM, or even LDs.)

    Now, there are things that I am ready to pay for, e.g. preservation of the mathematical properties of the original exact function.

    You know, of course, that incorrectly rounded SIN() and COS() do not
    maintain the property of SIN()^2+COS()^2 == 1.0

    I.e., if the original is monotonic on a certain interval, then I do want at least weak monotonicity
    of the approximation.

    My HW algorithms have a numeric proof of this maintenance.

    If the original is even (or odd), I want the same for the approximation. If the original never exceeds 1, I want the same
    for the approximation. Etc. But correct rounding is not on the list.

    SIN(x) can be incorrectly rounded to be greater than 1.0.

    Still want incorrect rounding--or just a polynomial that does not
    have SIN(x) > 1.0 ??
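
    For anyone who wants to poke at those two properties with their own
    libm, a quick empirical check (a sketch, nothing more) is easy to
    write:

        #include <math.h>
        #include <stdio.h>
        #include <stdlib.h>

        int main(void)
        {
            double worst_pyth = 0.0;   /* max |sin^2 + cos^2 - 1| seen */
            int    above_one  = 0;     /* results with |sin(x)| > 1.0  */
            for (int i = 0; i < 10000000; i++) {
                double x = (double)rand() / RAND_MAX * 1.0e6;  /* arbitrary range */
                double s = sin(x), c = cos(x);
                double d = fabs(s * s + c * c - 1.0);
                if (d > worst_pyth) worst_pyth = d;
                if (fabs(s) > 1.0)  above_one++;
            }
            printf("max |sin^2+cos^2 - 1| = %.3g\n", worst_pyth);
            printf("results with |sin(x)| > 1.0: %d\n", above_one);
            return 0;
        }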

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Steven G. Kargl on Wed Mar 20 20:47:02 2024
    XPost: comp.lang.c

    Steven G. Kargl wrote:

    On Wed, 20 Mar 2024 18:21:47 +0200, Michael S wrote:

    On Wed, 20 Mar 2024 09:54:36 -0400
    Stefan Monnier <monnier@iro.umontreal.ca> wrote:

    There are groups who have shown that exactly rounded
    transcendental functions are in fact achievable with maybe 3X
    reduced performance.
    That much? I had the impression it was significantly cheaper.
    The J. M. Muller book indicates about 2× to 2.5×

    The [Rlibm](https://people.cs.rutgers.edu/~sn349/rlibm/) project
    claims to get much better performance (basically, in the same
    ballpark as not-correctly-rounded implementations).


    I had only read the 1st page.
    It sounds like they are not particularly interested in IEEE binary64
    which appears to be the major point of interest of math libs of
    conventional languages.


    I skimmed their logf(x) implementation. Their technique will
    fall apart for binary64 (and other higher precisions). With
    logf(x), they combine an argument reduction step with table look-up and
    a 5th-order polynomial approximation. The polynomial is computed
    in double precision, and provides the correct rounding. For
    binary64, the polynomial would require many more terms and
    double-double or binary128 evaluation.

    The first 7 terms of the DP polynomial do not require FP128, whereas
    the last 3 do/will.

    Whereas: HW with 64-bits of fraction does not need FP128 at all.
    64-bits of fraction with 64-bit coefficients added into a 128-bit
    accumulator is more than sufficient.

    Note FP64 has 53-bits of fraction.....

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Stefan Monnier on Wed Mar 20 20:40:25 2024
    XPost: comp.lang.c

    Stefan Monnier wrote:

    The [Rlibm](https://people.cs.rutgers.edu/~sn349/rlibm/) project
    claims to get much better performance (basically, in the same
    ballpark as not-correctly-rounded implementations).
    I had only read the 1st page.
    It sounds like they are not particularly interested in IEEE binary64
    which appears to be the major point of interest of math libs of
    conventional languages.

    Indeed, I hoped by now they had attacked the world of 64bit floats, but
    it looks like it's not the case yet. In theory the same idea is
    applicable there and should give similar results, but of course it's
    harder for two reasons:

    - The search space to consider is much larger, making it much more
    costly to do exhaustive testing (and to collect the info they need to
    choose/compute the polynomials).
    - You can't cheaply use higher-precision floats for internal computations.

    You can in HW, just not in SW. That is a strong argument for migrating
    these functions from SW procedures to HW instructions. You do not need
    FP128, just FP with a 64-bit fraction.

    It all depend of what you compare against.

    OK:: SIN(x) takes 19 cycles and is 0.5002 ulp accurate with Payne & Hanek argument reduction--against any SW implementation.

    For a scalar call, for the majority of transcendental functions on the
    IEEE-754 list, it's probably very easy to get correctly rounded binary32
    results in approximately the same time as results calculated with a max.
    error of, say, 0.75 ULP.

    Yes, at the cost of 4× larger tables.

    Especially so if target machine has fast binary64
    arithmetic.

    Obviously.

    IIUC that was not the case before their work: it was "easy" to get the correct result in 99% of the cases, but covering all 100% of the cases
    used to be costly because those few cases needed a lot more
    internal precision.

    Muller indicates one typically needs 2×n+6 to 2×n+12 bits to get correct rounding 100% of the time. FP128 only has 2×n+3 and is insufficient by itself.

    I don't want to pay for correct rounding of transcendental functions.
    Neither in speed nor, especially, in table footprint. Not even a
    little. Because for me there are no advantages.

    For that reason, correctly rounded results were nowhere near the menu, indeed. But the proposition of Rlibm is that maybe we can have our cake
    and eat it too, because (if all goes well) you don't have to pay for it
    in speed nor in table size.

    I'm hoping that pans out :-)

    Now, there are things that I am ready to pay for, e.g. preservation of the
    mathematical properties of the original exact function. I.e., if the original
    is monotonic on a certain interval, then I do want at least weak monotonicity
    of the approximation. If the original is even (or odd), I want the same for
    the approximation. If the original never exceeds 1, I want the same
    for the approximation. Etc. But correct rounding is not on the list.

    But correct rounding would give you those properties.


    Stefan

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to mitchalsup@aol.com on Thu Mar 21 00:03:48 2024
    XPost: comp.lang.c

    On Wed, 20 Mar 2024 20:33:44 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:

    Michael S wrote:

    On Wed, 20 Mar 2024 09:54:36 -0400
    Stefan Monnier <monnier@iro.umontreal.ca> wrote:

    [ Their key insight is the idea that to get correct rounding, you
    shouldn't try to compute the best approximation of the exact
    result and then round, but you should instead try to compute any
    approximation whose rounding gives the correct result. ]

    My impression was that their performance was good enough that the
    case for not-correctly-rounded implementations becomes very weak.


    It all depends on what you compare against.
    For a scalar call, for the majority of transcendental functions on the
    IEEE-754 list, it's probably very easy to get correctly rounded binary32
    results in approximately the same time as results calculated with a max.
    error of, say, 0.75 ULP. Especially so if the target machine has fast
    binary64 arithmetic.

    But in practice, when we use lower (than binary64) precision, we often
    care about vector performance rather than scalar.
    I.e. we care little about the speed of sinf(), but want ippsTone_32f()
    as fast as possible. In case you wonder, this function is part of Intel
    Performance Primitives and it is FAST. Writing a correctly rounded
    function that approaches the speed of this *almost* correctly
    rounded routine (I think, for sane input ranges it's better than
    0.55 ULP) would not be easy at all!

    I challenge ANY software version of SIN() correctly rounded or not
    to compete with my <patented> HW implementations for speed (or even
    power).


    Before you post this response, you might as well look at what
    ippsTone_32f() is doing. Hint - it's not a generic scalar sin().
    IMHO, for a long enough vector and on a modern enough Intel or AMD CPU it
    will very easily beat any scalar-oriented, binary64-oriented HW
    implementation of sin() or cos().
    This function is not about latency. It's about throughput.

    AFAIR, you were quite surprised by the speed (throughput) of another IPP
    primitive, ippsSin_64f_A53(), when I posted timing measurements here less
    than 2 years ago. So, before you issue a challenge, just take into account
    that ippsTone_32f() is both more specialized than ippsSin_64f_A53() and has
    much lower precision. So, while I didn't test it, I expect that it is much,
    much faster.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Stefan Monnier on Thu Mar 21 08:38:53 2024
    XPost: comp.lang.c

    Stefan Monnier wrote:
    There are groups who have shown that exactly rounded transcendental
    functions are in fact achievable with maybe 3X reduced performance.
    That much? I had the impression it was significantly cheaper.
    The J. M. Muller book indicates about 2× to 2.5×

    The [Rlibm](https://people.cs.rutgers.edu/~sn349/rlibm/) project claims
    to get much better performance (basically, in the same ballpark as not-correctly-rounded implementations).

    [ Their key insight is the idea that to get correct rounding, you
    shouldn't try to compute the best approximation of the exact result
    and then round, but you should instead try to compute any
    approximation whose rounding gives the correct result. ]

    That is indeed interesting. However, it is also very interesting that
    they only do this for 32-bit or less. I.e. the domains for which it is
    almost trivially easy to verify the results by checking all possible
    inputs. :-)

    My impression was that their performance was good enough that the case
    for not-correctly-rounded implementations becomes very weak.

    I agree in principle: If you can use polys that are not much more
    complicated than the min-max/cheby case, and which always round to the
    desired values, then that's an obvious good.
    ...
    Reading the full log2f() code, it seems that they can use double for all evaluations, and with a single (!) mantissa exception, this produces the correct results for all inputs and all rounding modes. :-)

    I.e. with 53 bits to work with, giving up about one ulp for the range reduction, the 52 remaining bits correspond to 2n+6 bits available for
    the 5-term poly to evaluate a final float result.

    When asking for perfectly rounded transcendentals, with approximately the
    same runtime, the float case is a form of cheating, simply because
    current FPUs tend to run float and double at the same speed.

    OTOH, I'm still persuaded that, for a float library, this approach
    might in fact be included in the 754 standard.

    Doing the same for double by "simply" doing all operations with fp128 variables would definitely take significantly longer, and verification
    would be an "interesting" problem. (Interesting in the "multiple PhDs" domain.)

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to All on Thu Mar 21 08:52:18 2024
    XPost: comp.lang.c

    MitchAlsup1 wrote:
    Stefan Monnier wrote:
    IIUC that was not the case before their work: it was "easy" to get the
    correct result in 99% of the cases, but covering all 100% of the cases
    used to be costly because those few cases needed a lot more
    internal precision.

    Muller indicates one typically needs 2×n+6 to 2×n+12 bits to get correct rounding 100% of the time. FP128 only has 2×n+3 and is insufficient by itself.

    I agree with everything else you've written about this subject, but
    afair, fp128 is using 1:15:112 while double is of course 1:10:53.

    On the one hand, 53*2+6 -> 112, on the other hand (if we include the
    hidden bits) we get 54*2+5 -> 113.

    So significantly more than 2n+3 but not enough on its own to guarantee
    correct rounding.

    As Michael S has mentioned, we want these algorithms to work for
    vector/SIMD calculations, and at that point both lookup tables and
    widening the size of the temporaries are very costly.

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Terje Mathisen on Thu Mar 21 14:51:33 2024
    XPost: comp.lang.c

    On Thu, 21 Mar 2024 08:52:18 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    MitchAlsup1 wrote:
    Stefan Monnier wrote:
    IIUC that was not the case before their work: it was "easy" to get
    the correct result in 99% of the cases, but covering all 100% of
    the cases used to be costly because those few cases needed a lot
    more internal precision.

    Muller indicates one typically needs 2×n+6 to 2×n+12 bits to get
    correct rounding 100% of the time. FP128 only has 2×n+3 and is insufficient by itself.

    I agree with everything else you've written about this subject, but
    afair, fp128 is using 1:15:112 while double is of course 1:10:53.


    IEEE-754 binary64 is 1:11:52 :-)

    But anyway, I am skeptical about Muller's rules of thumb.
    I'd expect that different transcendental functions would exercise
    non-trivially different behaviors, mostly because they have different relationships between input and output ranges. Some of them compress
    wider inputs into narrower output and some do the opposite.
    Yet another factor is luck.

    Besides, I see nothing special about binary128 as a helper format.
    It is not supported on the vast majority of HW, and even when it is
    supported, like on IBM POWER, for the majority of operations it is slower
    than emulated 128-bit fixed-point. Fixed-point is more work for the coder,
    but it sounds like a surer path to success.
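
    What "emulated 128-bit fixed-point" can look like in practice, as a
    sketch (assuming the gcc/clang __int128 extension; the Q2.62 format
    and helper names are mine): the working value keeps 62 fraction bits,
    and each polynomial step is a 64x64->128 multiply followed by a shift,
    with no binary128 arithmetic anywhere.

        #include <stdint.h>

        typedef int64_t q2_62;          /* 2 integer bits, 62 fraction bits */

        static inline q2_62 q_mul(q2_62 a, q2_62 b)
        {
            /* Full 128-bit product, then rescale (arithmetic shift assumed). */
            return (q2_62)(((__int128)a * b) >> 62);
        }

        /* One Horner step of a polynomial evaluation: acc = acc*x + coeff. */
        static inline q2_62 horner_step(q2_62 acc, q2_62 x, q2_62 coeff)
        {
            return q_mul(acc, x) + coeff;
        }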

    On the one hand, 53*2+6 -> 112, on the other hand (if we include the
    hidden bits) we get 54*2+5 -> 113.

    So significantly more than 2n+3 but not enough on its own to
    guarantee correct rounding.

    As Michael S have mentioned, we want these algorithms to work for vector/SIMD calculations, and at that point both lookup tables and
    widening the size of the temporaries are very costly.

    Terje


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Thu Mar 21 16:37:29 2024
    XPost: comp.lang.c

    Michael S wrote:

    On Thu, 21 Mar 2024 08:52:18 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    MitchAlsup1 wrote:
    Stefan Monnier wrote:
    IIUC that was not the case before their work: it was "easy" to get
    the correct result in 99% of the cases, but covering all 100% of
    the cases used to be costly because those few cases needed a lot
    more internal precision.

    Muller indicates one typically needs 2×n+6 to 2×n+12 bits to get
    correct rounding 100% of the time. FP128 only has 2×n+3 and is
    insufficient by itself.

    I agree with everything else you've written about this subject, but
    afair, fp128 is using 1:15:112 while double is of course 1:10:53.


    IEEE-754 binary64 is 1:11:52 :-)

    But anyway, I am skeptical about Muller's rules of thumb.

    See Chapter 10 of "Elementary Functions: Algorithms and Implementation" by
    Jean-Michel Muller {you can find a free copy on the net} and in
    particular Tables 10.5-10.14 -- quoting from the book::

    "In his PhD dissertation [206], Lefèvre gives algorithms for finding
    the worst cases of the table maker’s dilemma. These algorithms use linear
    approximations and massive parallelism. A recent presentation of these
    algorithms, with some improvements, can be found in [207]. We have run
    Lefèvre’s algorithms to find worst cases in double-precision for the
    most common functions and domains. The results obtained so far, first
    published in [208], are given in Tables 10.5 to 10.14."

    They are NOT rules of thumb !! But go read it for yourself.

    I'd expect that different transcendental functions would exercise non-trivially different behaviors, mostly because they have different relationships between input and output ranges. Some of them compress
    wider inputs into narrower output and some do the opposite.
    Yet another factor is luck.

    Besides, I see nothing special about binary128 as a helper format.
    It is not supported on the vast majority of HW, and even when it is
    supported, like on IBM POWER, for the majority of operations it is slower
    than emulated 128-bit fixed-point. Fixed-point is more work for the coder, but it sounds like a surer path to success.

    On the one hand, 53*2+6 -> 112, on the other hand (if we include the
    hidden bits) we get 54*2+5 -> 113.

    So significantly more than 2n+3 but not enough on its own to
    guarantee correct rounding.

    As Michael S have mentioned, we want these algorithms to work for
    vector/SIMD calculations, and at that point both lookup tables and
    widening the size of the temporaries are very costly.

    Terje


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Michael S on Sat Mar 23 09:11:38 2024
    XPost: comp.lang.c

    Michael S wrote:
    On Thu, 21 Mar 2024 08:52:18 +0100
    Terje Mathisen <terje.mathisen@tmsw.no> wrote:

    MitchAlsup1 wrote:
    Stefan Monnier wrote:
    IIUC that was not the case before their work: it was "easy" to get
    the correct result in 99% of the cases, but covering all 100% of
    the cases used to be costly because those few cases needed a lot
    more internal precision.

    Muller indicates one typically needs 2×n+6 to 2×n+12 bits to get
    correct rounding 100% of the time. FP128 only has 2×n+3 and is
    insufficient by itself.

    I agree with everything else you've written about this subject, but
    afair, fp128 is using 1:15:112 while double is of course 1:10:53.


    IEEE-754 binary64 is 1:11:52 :-)

    Oops! Mea Culpa!
    I _do_ know that double has an exponent bias of 1023 (a 10-bit number), so
    the exponent field has to be 11 bits wide. :-(


    But anyway, I am skeptical about Muller's rules of thumb.
    I'd expect that different transcendental functions would exercise non-trivially different behaviors, mostly because they have different relationships between input and output ranges. Some of them compress
    wider inputs into narrower output and some do the opposite.
    Yet another factor is luck.

    I agree, this is a per-function problem, with some being substantially
    harder than others.

    Besides, I see nothing special about binary128 as a helper format.
    It is not supported on the vast majority of HW, and even when it is
    supported, like on IBM POWER, for the majority of operations it is slower
    than emulated 128-bit fixed-point. Fixed-point is more work for the coder, but it sounds like a surer path to success.

    In my own code (since I don't have Mitch's ability to use much wider
    internal fp formats) I also prefer 64-bit u64 as the working chunk size.

    Almost 30 years ago, during the FDIV workaround, I needed a q&d way to
    verify that our fpatan2 replacement was correct, so what I did was to
    write a 1:31:96 format library over a weekend.

    Yeah, it was much more exponent than needed, but with only 32-bit
    registers available it was much easier to get the asm correct.

    For the fpatan2 I used a dead simple approach with little range
    reduction, just a longish Taylor series (i.e. no Cheby optimizations).

    I had previously written 3-4 different implementations of arbitrary
    precision atan2() when I wanted to calculate as many digits of pi as possible, so I just reused one of those algorithms.

    Terje


    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)