msun: fix cbrt iterations from Newton to Halley method

Since we're inverting a cube, we have:
    f(Tₙ)=Tₙ³-x    (1)

Its first and second derivatives are:
    f'(Tₙ)=3Tₙ²    (2)
    f"(Tₙ)=6Tₙ     (3)

Halley iteration[1] uses:
    Tₙ₊₁=Tₙ-2f(Tₙ)f'(Tₙ)/(2f'(Tₙ)²-f(Tₙ)f"(Tₙ))    (4)

Replacing the terms of (4) using (1), (2) and (3):

Tₙ₊₁ = Tₙ-2f(Tₙ)f'(Tₙ)/(2f'(Tₙ)²-f(Tₙ)f"(Tₙ))
     = Tₙ-2(Tₙ³-x)3Tₙ²/(2(3Tₙ²)²-(Tₙ³-x)6Tₙ)
     = <snip, see WolframAlpha[2] alternate forms>
     = Tₙ(2x+Tₙ³)/(x+2Tₙ³)

This formula corresponds to the exact expression used in the code.

Newton formula is Tₙ-f(Tₙ)/f'(Tₙ) which would have simplified to
(2Tₙ³+x)/(3Tₙ²) instead.

[1] https://en.wikipedia.org/wiki/Halley's_method
[2] https://www.wolframalpha.com/input?i=T-2%28T%5E3-x%293T%5E2%2F%282%283T%5E2%29%5E2-%28T%5E3-x%296T%29

Note: UTF8 in commit message due to the heavy math being hard to
recreate w/o it. -- imp

Signed-off-by: Clément Bœsch <u@pkh.me>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1684

This commit is contained in:

Clément Bœsch

2025-05-01 19:19:36 +02:00

committed by

Warner Losh

parent 5b9660caff

commit f887d0215f

3 changed files with 5 additions and 5 deletions

									
										lib/msun/src/s_cbrt.c
									
		+2
		-2
	
												View File
												
				@@ -90,7 +90,7 @@ cbrt(double x)

				     * the result is larger in magnitude than cbrt(x) but not much more than

				     * 2 23-bit ulps larger).  With rounding towards zero, the error bound

				     * would be ~5/6 instead of ~4/6.  With a maximum error of 2 23-bit ulps

				     * in the rounded t, the infinite-precision error in the Newton

				     * in the rounded t, the infinite-precision error in the Halley

				     * approximation barely affects third digit in the final error

				     * 0.667; the error in the rounded t can be up to about 3 23-bit ulps

				     * before the final error is larger than 0.667 ulps.

				@@ -99,7 +99,7 @@ cbrt(double x)

					u.bits=(u.bits+0x80000000)&0xffffffffc0000000ULL;

					t=u.value;

				    /* one step Newton iteration to 53 bits with error < 0.667 ulps */

				    /* one step Halley iteration to 53 bits with error < 0.667 ulps */

					s=t*t;				/* t*t is exact */

					r=x/s;				/* error <= 0.5 ulps; |r| < |t| */

					w=t+t;				/* t+t is exact */

									
										lib/msun/src/s_cbrtf.c
									
		+2
		-2
	
												View File
												
				@@ -50,7 +50,7 @@ cbrtf(float x)

					    SET_FLOAT_WORD(t,sign|(hx/3+B1));

				    /*

				     * First step Newton iteration (solving t*t-x/t == 0) to 16 bits.  In

				     * First step Halley iteration (solving t*t-x/t == 0) to 16 bits.  In

				     * double precision so that its terms can be arranged for efficiency

				     * without causing overflow or underflow.

				     */

				@@ -59,7 +59,7 @@ cbrtf(float x)

					T=T*((double)x+x+r)/(x+r+r);

				    /*

				     * Second step Newton iteration to 47 bits.  In double precision for

				     * Second step Halley iteration to 47 bits.  In double precision for

				     * efficiency and accuracy.

				     */

					r=T*T*T;

									
										lib/msun/src/s_cbrtl.c
									
		+1
		-1
	
												View File
												
				@@ -126,7 +126,7 @@ cbrtl(long double x)

				#endif

					/*

				     	 * Final step Newton iteration to 64 or 113 bits with

					 * Final step Halley iteration to 64 or 113 bits with

					 * error < 0.667 ulps

					 */

					s=t*t;				/* t*t is exact */