do_sin
This post is about the do_sin(double x, double dx) function from s_sin.c. The source code is here.
The purpose of do_sin(x, dx) is to compute an approximation to \(\sin(x + dx)\) (addition in the sense of real numbers) even when the floating point addition x + dx is not exact (e.g. because dx is very small relative to x). The source doesn’t specify how much smaller dx should be, but with one exception, whenever do_sin(x, dx) is called, x and dx have come from a call to reduce_sincos, so, as explained in my post on that function, we can assume dx will be at most \(2^{-27}\) times x in magnitude.
x less than 0.126 in magnitude
do_sin does a check to see if \(\vert x\vert <0.126\) using fabs, and if so, invokes the TAYLOR_SIN macro with parameters x*x, x, dx. TAYLOR_SIN in turn invokes two other macros; all three are defined on lines 49-65 and I’ve copied them below.
#define POLYNOMIAL2(xx) ((((s5 * (xx) + s4) * (xx) + s3) * (xx) + s2) * (xx))
#define POLYNOMIAL(xx) (POLYNOMIAL2 (xx) + s1)
/* The computed polynomial is a variation of the Taylor series expansion for
sin(x):
x - x^3/3! + x^5/5! - x^7/7! + x^9/9! - dx*x^2/2 + dx
The constants s1, s2, s3, etc. are pre-computed values of 1/3!, 1/5! and so
on. The result is returned to LHS. */
#define TAYLOR_SIN(xx, x, dx) \
({ \
  double t = ((POLYNOMIAL (xx) * (x) - 0.5 * (dx)) * (xx) + (dx)); \
  double res = (x) + t; \
  res; \
})
The constants used, defined in usncs.h, are
double s1 = -0x1.5555555555555p-3; /* -0.16666666666666666 */
double s2 = 0x1.1111111110ECEp-7; /* 0.0083333333333323288 */
double s3 = -0x1.A01A019DB08B8p-13; /* -0.00019841269834414642 */
double s4 = 0x1.71DE27B9A7ED9p-19; /* 2.755729806860771e-06 */
double s5 = -0x1.ADDFFC2FCDF59p-26; /* -2.5022014848318398e-08 */
and these numbers are close to, but not the nearest doubles to, the coefficients -1/3!, 1/5!, -1/7!, 1/9!, and -1/11! in the Taylor series for sin. (For example, the nearest double to 1/5! is 0x1.1111111111111p-7, which isn’t equal to s2.)
Unpacking the definitions, what gets evaluated is
\[\texttt{s5} x^{11} + \texttt{s4} x^9 + \texttt{s3} x^7 + \texttt{s2} x^5 + \texttt{s1} x^3 + x - 0.5\, \texttt{dx}\, x^2 + \texttt{dx}\]that is, a polynomial of degree 11 whose coefficients are close to those of the Taylor series for \(\sin(x)\), plus \(dx(1-x^2/2)\), which is dx times the degree 2 truncation of the Taylor series for \(\cos(x)\). The comment in the code about
x - x^3/3! + x^5/5! - x^7/7! + x^9/9! - dx*x^2/2 + dx
is misleading: the approximation used for sin(x) is degree 11, not 9, and as above, the constants s2 to s5 are not just double approximations to the true Taylor coefficients. (In older versions of s_sin.c the coefficient of dx was also wrong, but the maintainers fixed it a few years ago.)
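Since Python floats are IEEE doubles, TAYLOR_SIN can be transcribed directly. Here is a sketch; the hex literals are the constants above, and plain Python arithmetic stands in for the C expression, which should round identically:

import math

# The coefficients from usncs.h, as hex float literals.
s1 = float.fromhex("-0x1.5555555555555p-3")
s2 = float.fromhex("0x1.1111111110ECEp-7")
s3 = float.fromhex("-0x1.A01A019DB08B8p-13")
s4 = float.fromhex("0x1.71DE27B9A7ED9p-19")
s5 = float.fromhex("-0x1.ADDFFC2FCDF59p-26")

def taylor_sin(x, dx):
    # Mirrors TAYLOR_SIN(x*x, x, dx): poly * x * xx contributes the
    # odd-degree terms x^3 .. x^11, and the dx part gives dx * (1 - x*x/2).
    xx = x * x
    poly = (((s5 * xx + s4) * xx + s3) * xx + s2) * xx + s1
    t = (poly * x - 0.5 * dx) * xx + dx
    return x + t

print(taylor_sin(0.1, 0.0), math.sin(0.1))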
The reason for these coefficients s1, …, s5 differing from the true Taylor coefficients is that in general, the “best” polynomial approximation to a function on an interval, even a nice function like sin on an interval symmetric around 0, is not just a truncation of its Taylor series (although often the coefficients are close to those from the Taylor series). Since as double users we care about significant figures, these coefficients will have been chosen to minimise relative error on an interval.
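The file doesn’t say how the constants were derived, but as a rough illustration one can compare the maximum relative error, over \([0, 0.126]\), of the exact degree-11 Taylor truncation against the polynomial with the glibc coefficients, using mpmath as the reference. Both are measured here in exact (200-bit) arithmetic, so this is only suggestive of what happens at double precision:

import mpmath

mpmath.mp.prec = 200

glibc = [float.fromhex(h) for h in (
    "-0x1.5555555555555p-3", "0x1.1111111110ECEp-7", "-0x1.A01A019DB08B8p-13",
    "0x1.71DE27B9A7ED9p-19", "-0x1.ADDFFC2FCDF59p-26")]
taylor = [(-1) ** k / mpmath.factorial(2 * k + 1) for k in range(1, 6)]

def max_rel_err(coeffs):
    # Max relative error of x + c[0]*x^3 + ... + c[4]*x^11 on a grid in
    # (0, 0.126], evaluated in high precision so only the coefficients matter.
    worst = mpmath.mpf(0)
    for i in range(1, 1001):
        x = mpmath.mpf("0.126") * i / 1000
        p = x + sum(mpmath.mpf(c) * x ** (2 * k + 3) for k, c in enumerate(coeffs))
        worst = max(worst, abs((p - mpmath.sin(x)) / mpmath.sin(x)))
    return worst

print(max_rel_err(taylor), max_rel_err(glibc))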
There’s no discussion in the file of how small dx should be for this function to produce acceptable results, but some basic estimates show that \(\vert dx \vert \leqslant 2^{-40} \vert x \vert\) is good enough. Truncating Taylor expansions for sin and cos and using the Lagrange remainder, we get
\[\sin(x+dx) = \sin(x) + dx\left(1 - \frac{x^2}{2}\right) + \frac{dx\cos(\zeta)x^4}{4!} - \frac{\sin(\xi)(dx)^2}{2!}\]for some \(\zeta, \xi\). The error in TAYLOR_SIN is therefore the sum of the error in approximating \(\sin(x)\) by the degree 11 polynomial above, the error in neglecting \(dx\cos(\zeta)x^4/4!\), and the error in neglecting \(\sin(\xi)(dx)^2/2!\). If \(\vert dx \vert \leqslant 2^{-k} \vert x\vert\) then \(\vert \cos(\zeta) x^4 dx / 4!\vert \leqslant 2^{-k-16.5}\vert x \vert\) (using \(\vert x \vert \leqslant 0.126\)) and \(\vert \sin(\xi)(dx)^2/2!\vert \leqslant 2^{-2k-3.9}\vert x \vert\) (using the same approximation). For x of this magnitude an ULP for x is about the same as an ULP for \(\sin(x+dx)\), so with \(k=40\) the overall error will be dominated by the contribution from the polynomial approximation to \(\sin(x)\), which testing in Python suggests is not much more than half an ULP. On the other hand, the same test code shows \(\vert dx \vert \leqslant 2^{-27}\vert x \vert\) is nowhere near small enough, but \(2^{-40}\) is OK (in the sense of having errors little more than half an ULP). Since the actual implementation shows this level of accuracy, I think the conclusion is that my estimate of the magnitude of dx returned by reduce_sincos was way too pessimistic.
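The test I have in mind is along these lines, reusing the taylor_sin transcription from the snippet above; it samples random inputs rather than searching exhaustively, so it only gives a lower bound on the worst case:

import math
import random
import mpmath

mpmath.mp.prec = 200

def worst_ulp_error(k, trials=100000):
    # Largest observed error of TAYLOR_SIN, in ULPs, when |dx| is about
    # 2**-k times |x|, with mpmath's sin as the high-precision reference.
    worst = 0.0
    for _ in range(trials):
        x = random.uniform(0.01, 0.126)
        dx = x * 2.0 ** -k * random.uniform(-1.0, 1.0)
        got = taylor_sin(x, dx)
        err = abs(mpmath.mpf(got) - mpmath.sin(mpmath.mpf(x) + mpmath.mpf(dx)))
        worst = max(worst, float(err / math.ulp(got)))
    return worst

print(worst_ulp_error(27), worst_ulp_error(40))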
This range of x-values does provide examples of doubles for which the glibc sin function is not correctly rounded. For example, testing in Python against the crlibm library, which claims to be correctly rounded, and with mpmath’s arbitrary precision sin function, shows that glibc sin of -0x1.e6fbcae266c20p-4 has an error of 0.506 ULPs, so is off by one in the final place.
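That comparison is easy to reproduce, with the caveat that CPython’s math.sin calls the platform libm, so the following only exercises glibc’s sin on a system where glibc provides libm:

import math
import mpmath

mpmath.mp.prec = 200
x = float.fromhex("-0x1.e6fbcae266c20p-4")
got = math.sin(x)  # the platform libm's sin; glibc's on a typical Linux build
err = abs(mpmath.mpf(got) - mpmath.sin(mpmath.mpf(x)))
print(err / math.ulp(got))  # error in ULPs; math.ulp(x) is ulp(-x) for x < 0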
x between 0.126 and 0.855469 in magnitude
This starts around line 131. The code is rather annoying to read, so I have rewritten it below with some new variables introduced for clarity and with simplifications made by assuming x is positive.
mynumber u;
u.x = big + x;
double y = x - (u.x - big);
big is a static const double equal to 0x1.8000000000000p45. An ULP for big is \(2^{-7}\), while \(\vert x\vert < 1\). Therefore the last 7 bits of u.x will be the result of rounding x to 7 bits, that is, to the nearest multiple of 1/128. y is now the difference between x and the nearest multiple of 1/128. Let’s say this multiple is m/128 where m is a whole number. We have \(x+dx = y + m/128 + dx\).
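The trick can be checked in Python, whose floats are IEEE doubles; the round call below is a stand-in for reading m out of the low bits of u, which should agree since both roundings break ties to even:

big = float.fromhex("0x1.8000000000000p45")
x = 0.7305                # any value with 0.126 < x < 0.855469
u = big + x               # rounds x to the nearest multiple of 1/128
m = round(x * 128)        # in the C code, m is read from u's low bits
y = x - (u - big)         # both subtractions here are exact (Sterbenz)
print(u - big == m / 128, y == x - m / 128)  # both True

Back in the rewritten code, s and c are then computed as short polynomial approximations around y: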
double yy = y * y;
double s = y + (dx + y * yy * (sn3 + yy * sn5));
double c = y * dx + yy * (cs2 + yy * (cs4 + yy * cs6));
The constants here are given in the following odd format.
static const double
sn3 = -1.66666666666664880952546298448555E-01,
sn5 = 8.33333214285722277379541354343671E-03,
cs2 = 4.99999999999999999999950396842453E-01,
cs4 = -4.16666666666664434524222570944589E-02,
cs6 = 1.38888874007937613028114285595617E-03;
There are 32 decimal places, about half of which don’t make any difference to the double value (cs2 is just 0.5, for example). Presumably they came from some higher precision optimisation calculation. The result is that c is an approximation to \(1-\cos(y+dx)\) and s is an approximation to \(\sin(y + dx)\). The code now uses a macro SINCOS_TABLE_LOOKUP which retrieves double-double approximations to sin and cos of m/128 from a lookup table __sincostab.
int k = u.i[LOW_HALF] << 2; // u.i[LOW_HALF] multiplied by 4
// Interpreted as an integer, u.i[LOW_HALF] is the numerator of the
// closest fraction m/128 to x.
// k is the index of the "row" of the table __sincostab corresponding to m/128
sn = __sincostab.d[k];
ssn = __sincostab.d[k + 1];
cs = __sincostab.d[k + 2];
ccs = __sincostab.d[k + 3];
sn and ssn are a double-double approximation to sin(m/128), with ssn being the smaller part. cs and ccs are a double-double approximation to cos(m/128), with ccs being the smaller part.
cor = (ssn + s * ccs - sn * c) + cs * s;
return sn + cor;
The trig identity being used here is the addition formula for sin: \(\sin(x+dx) = \sin(m/128 + (y+dx)) = \sin(m/128)\cos(y+dx) + \cos(m/128)\sin(y+dx)\). If we introduce error terms \(\mathtt{s} = \sin(y+dx) + \epsilon_s\), \(1-\mathtt{c} = \cos(y+dx)+\epsilon_c\), \(\mathtt{sn}+\mathtt{ssn} = \sin(m/128)+\epsilon_{ssn}\), \(\mathtt{cs}+\mathtt{ccs} = \cos(m/128) + \epsilon_{ccs}\) then we get
\[\sin(x+dx) = (\mathtt{sn} + \mathtt{ssn} -\epsilon_{ssn})(1-\mathtt{c} - \epsilon_c) + (\mathtt{cs} + \mathtt{ccs} - \epsilon_{ccs})(\mathtt{s}-\epsilon_s)\]so the neglected terms in the return value sn + cor are \(-\mathtt{ssn} \cdot \mathtt{c} - \epsilon_c (\mathtt{sn} + \mathtt{ssn}) - \epsilon_{ccs} \mathtt{s} - \epsilon_{ssn} + \epsilon_{ssn} \mathtt{c} + \epsilon_{ssn} \epsilon_c - \epsilon_s \mathtt{cs} - \epsilon_s \mathtt{ccs} + \epsilon_{ccs} \epsilon_s\).
We can bound the sizes of the various terms appearing here as follows:
expression | lower bound | upper bound |
---|---|---|
x | 0.126 | 0.855469 |
y | -1/256 | 1/256 |
dx | \(-2^{-40}x\) | \(2^{-40}x\) |
sin(x+dx) | 0.125 | 0.76 |
m | 16 | 110 |
sn + ssn | 0.124 | 0.76 |
sn | 0.124 | 0.76 |
ssn | \(-2^{-54}\) | \(2^{-54}\) |
cs | 0.658 | 0.993 |
ccs | \(-2^{-54}\) | \(2^{-54}\) |
c | 0 | \(2^{-17}\) |
s | \(-2^{-8}\) | \(2^{-8}\) |
\(\epsilon_c\) | 0 | \(2^{-66}\), est. |
\(\epsilon_s\) | 0 | \(2^{-57}\), est. |
\(\epsilon_{ssn}\) | 0 | \(2^{-100}\) |
\(\epsilon_{ccs}\) | 0 | \(2^{-100}\) |
(the bounds for \(\epsilon_c\) and \(\epsilon_s\) are estimates derived by computing a large number of randomly chosen examples). The term with the smallest bound that is kept is s * ccs, which is at most \(2^{-8} \cdot 2^{-54} = 2^{-62}\) (but can make a difference to the return value, in the last place). On the other hand, the neglected term \(-\epsilon_s\cdot \mathtt{cs}\) has a larger upper bound than that for s * ccs. Given that cs is never smaller than 0.658, it’s surprising to me that this works out OK.
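To see the whole branch in action, here is a Python imitation of it. I substitute double-double values computed with mpmath for the __sincostab entries, on the assumption that the real table agrees with the mathematical values to roughly \(2^{-100}\), and use round(x * 128) in place of reading m from the low bits of u:

import math
import mpmath

mpmath.mp.prec = 200

sn3 = -1.66666666666664880952546298448555e-01
sn5 = 8.33333214285722277379541354343671e-03
cs2 = 4.99999999999999999999950396842453e-01
cs4 = -4.16666666666664434524222570944589e-02
cs6 = 1.38888874007937613028114285595617e-03

def do_sin_imitation(x, dx):
    # For 0.126 < x < 0.855469 and dx tiny relative to x.
    big = float.fromhex("0x1.8000000000000p45")
    u = big + x
    y = x - (u - big)                                # y = x - m/128
    m = round(x * 128)                               # stand-in for u's low bits

    yy = y * y
    s = y + (dx + y * yy * (sn3 + yy * sn5))         # ~ sin(y + dx)
    c = y * dx + yy * (cs2 + yy * (cs4 + yy * cs6))  # ~ 1 - cos(y + dx)

    # Double-double stand-ins for the __sincostab row for m.
    v = mpmath.mpf(m) / 128
    sn = float(mpmath.sin(v)); ssn = float(mpmath.sin(v) - mpmath.mpf(sn))
    cs = float(mpmath.cos(v)); ccs = float(mpmath.cos(v) - mpmath.mpf(cs))

    cor = (ssn + s * ccs - sn * c) + cs * s
    return sn + cor

x = 0.5
err = abs(mpmath.mpf(do_sin_imitation(x, 0.0)) - mpmath.sin(mpmath.mpf(x)))
print(err / math.ulp(math.sin(x)))  # error in ULPs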