greg@jiff.berkeley.edu (Greg) (03/15/88)
In article <1656@thorin.cs.unc.edu> turner@unc.cs.unc.edu (Douglass Turner) writes:
>Hello, does anyone know of a good, robust, method to find all roots of a
>quartic equation?

There are various plans of attack.  I assume, since you are ray-tracing,
that you want the real roots, or perhaps even the real positive roots.
I will describe one reasonable method below.

>I have heard that direct solution is prone to numerical instability, so
>an iterative technique is called for.

The direct solution is prone to programmer's instability.  It's very
messy and doesn't generalize.  The most painful step is taking the
cube root of a complex number, which cannot be expressed in terms of
roots of real numbers.  In any case, the numerous roots in the formula
are themselves computed by iterative algorithms in the math package,
so you aren't really saving much by using the exact formula.

My algorithm is based on Newton's method and Sturm chains.  A Sturm
chain is an easy way to find out how many roots a polynomial has in
an interval.  You take the polynomial p(x), which we may call p0(x),
and you take its derivative, p1(x) = p'(x).  Now let pn(x) be the
remainder obtained when you divide p{n-2}(x) by p{n-1}(x).  Quit when
you obtain a constant polynomial (the degree goes down by at least
one every time).  The sequence of polynomials p0,...,pk is the Sturm
chain.  Now let S(a) be the number of sign changes in the sequence
p0(a),p1(a),...,pk(a), i.e. the number of times that pi(a) is
positive and p{i+1}(a) is negative, or vice versa.  Then the number
of roots in the interval [a,b] is S(a)-S(b).

One can use Sturm chains directly in an enlightened
divide-and-conquer approach, but if you want to zero in on the
solution quickly it's best to switch to Newton's method.  The trick
with Newton's method is to avoid the inflection points.  They are the
roots of the second derivative, p''(x).  Find the real roots of
p''(x), say y1,...,ym, by recursion.  In your case p(x) is a quartic,
so you can use the quadratic formula for p''.  Pick y0 to be smaller
than all of the roots, and y{m+1} to be larger than all of the roots.
Now use Sturm chains to find the number of roots in the intervals
[y0,y1], [y1,y2],...,[ym,y{m+1}].  Each interval has at most two
roots.

Use Newton's method in each interval that has roots in it.  If an
interval has two roots, Newton's method will converge to the two
roots starting from each end of the interval.  For an interval with
one root, check whether p(x) is concave up or concave down in that
interval (the concavity of course alternates with successive
intervals).  If it is concave up, Newton's method will converge
starting from the endpoint at which p(x) has a positive value.
Otherwise start from the other endpoint.

Since you are ray tracing, you may often want only the first positive
root.  You can discard and truncate the intervals appropriately, and
then progress from left to right, stopping at the first root you
find.

Sturm chains will fail if p(x) and p'(x) share roots, which happens
exactly when p(x) has multiple roots.  Exact numerical coincidences
are a delicate matter in floating point arithmetic.  If the first
constant polynomial in the Sturm chain is the zero function, then the
previous polynomial in the chain is the greatest common divisor of
p(x) and p'(x), and you can just divide p(x) by this and start over.
-- Greg
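[For concreteness, here is a minimal C sketch of the chain
construction and the sign-change count for a quartic.  It assumes
double coefficients and no multiple roots, and it builds in two fixes
that the followups below call for: each remainder is negated, and
exact zeros are skipped when counting sign changes.]

#include <stdio.h>

#define MAXCHAIN 6                      /* p0..p4 suffices for a quartic */

typedef struct { int deg; double c[5]; } Poly;  /* c[i] is the x^i coeff */

static double peval(const Poly *p, double x)
{
    double v = 0.0;
    int i;
    for (i = p->deg; i >= 0; i--)
        v = v * x + p->c[i];
    return v;
}

/* *rem = -(a mod b): the negated remainder of synthetic division */
static void negrem(const Poly *a, const Poly *b, Poly *rem)
{
    Poly r = *a;
    int i, k;
    while (r.deg >= b->deg) {
        double q = r.c[r.deg] / b->c[b->deg];
        k = r.deg - b->deg;
        for (i = 0; i < b->deg; i++)
            r.c[k + i] -= q * b->c[i];
        r.c[r.deg] = 0.0;               /* leading term cancels exactly */
        r.deg--;
    }
    while (r.deg > 0 && r.c[r.deg] == 0.0)
        r.deg--;                        /* trim accidental cancellations */
    for (i = 0; i <= r.deg; i++)
        r.c[i] = -r.c[i];
    *rem = r;
}

/* Build p0 = p, p1 = p', p[n+2] = -(p[n] mod p[n+1]); return last index */
static int sturm(const Poly *p, Poly chain[MAXCHAIN])
{
    int i, n = 1;
    chain[0] = *p;
    chain[1].deg = p->deg - 1;
    for (i = 1; i <= p->deg; i++)
        chain[1].c[i - 1] = i * p->c[i];
    while (chain[n].deg > 0) {
        negrem(&chain[n - 1], &chain[n], &chain[n + 1]);
        n++;
    }
    return n;
}

/* S(a): sign changes along the chain, skipping exact zeros */
static int schanges(const Poly chain[], int k, double a)
{
    int i, changes = 0, last = 0;
    for (i = 0; i <= k; i++) {
        double v = peval(&chain[i], a);
        int s = (v > 0.0) - (v < 0.0);
        if (s != 0 && last != 0 && s != last)
            changes++;
        if (s != 0)
            last = s;
    }
    return changes;
}

int main(void)
{
    /* (x-1)(x-2)(x-3)(x-4) = x^4 - 10x^3 + 35x^2 - 50x + 24 */
    Poly p = { 4, { 24.0, -50.0, 35.0, -10.0, 1.0 } };
    Poly chain[MAXCHAIN];
    int k = sturm(&p, chain);
    printf("roots in (0,5]: %d\n",
           schanges(chain, k, 0.0) - schanges(chain, k, 5.0));
    return 0;
}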
td@alice.UUCP (03/16/88)
I believe that the Sturm chain algorithm as described in the
referenced article has at least two bugs.

The article describes Sturm chains thus:

> You take the polynomial p(x), which we may call
>p0(x), and you take its derivative, p1(x) = p'(x).  Now let pn(x)
>be the remainder obtained when you divide p{n-2}(x) by p{n-1}(x).
>Quit when you obtain a constant polynomial (the degree goes down
>by at least one every time).

Restating, this is

	p0(x) = p(x)
	p1(x) = p'(x)
	p[n+2](x) = p[n](x) mod p[n+1](x)

where a(x) mod b(x) is the remainder left after `synthetic division.'
This is incorrect.  Sturm chains have

	p[n+2](x) = -(p[n](x) mod p[n+1](x))

i.e. the sign of the remainders in jiff!greg's version must be flipped
for even n>2.

The article describes the method of counting roots on an interval
[a,b] as follows:

> Now let S(a) be the number of sign changes
>in the sequence p0(a),p1(a),...,pk(a), i.e. the number of times
>that pi(a) is positive and p{i+1}(a) is negative, or vice versa.
>Then the number of roots in the interval [a,b] is S(a)-S(b).

This is wrong for two reasons: first, the interval must be open at
the bottom (i.e. (a,b], not [a,b]).  Verify this for yourself by
considering the case a=b where p(a)=0.  Second, the method of
counting sign changes is subtly wrong.  It is possible for p[k](a) to
be zero.  Any zeros in the sequence must be deleted before counting
sign changes.

Another criticism is that the numerical method for converging to the
root once it is isolated is over-elaborate.  Since the root is
bracketed, methods like regula falsi are guaranteed to converge,
albeit slowly.  A good way to proceed is to take Newton steps
starting at one end of the interval, contracting the interval as you
go.  If a Newton step takes you outside the interval, or involves
division by zero, take a regula falsi step instead.
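[A minimal C sketch of the bracketed Newton/regula-falsi hybrid
described in the last paragraph, assuming the root has already been
isolated in [a,b] with p(a) and p(b) of opposite sign.  The function
pointers f and fp (for p and p') are placeholders.]

#include <math.h>

double safe_newton(double (*f)(double), double (*fp)(double),
                   double a, double b, double tol)
{
    double fa = f(a), fb = f(b);
    double x = a, fx = fa;
    int i;

    for (i = 0; i < 100; i++) {
        double d = fp(x);
        double xn;

        /* Try a Newton step; reject it if p'(x) is zero or the
         * step lands outside the current bracket. */
        if (d != 0.0) {
            xn = x - fx / d;
            if (xn <= a || xn >= b)
                d = 0.0;                /* force a regula falsi step */
        }
        if (d == 0.0)
            xn = (a * fb - b * fa) / (fb - fa);   /* regula falsi */

        x = xn;
        fx = f(x);
        if (fabs(fx) < tol || b - a < tol)
            return x;

        /* Contract the bracket, keeping the sign change inside. */
        if ((fa < 0.0) == (fx < 0.0)) { a = x; fa = fx; }
        else                          { b = x; fb = fx; }
    }
    return x;
}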
td@alice.UUCP (03/16/88)
Small error in previous article:

> i.e. the sign of the remainders in jiff!greg's version must be flipped
> for even n>2.
             ^
should be `for even n>0.'
cfchiesa@bsu-cs.UUCP (Christopher Chiesa) (03/17/88)
In article <7662@agate.BERKELEY.EDU>, greg@jiff.berkeley.edu (Greg) writes:
> In article <1656@thorin.cs.unc.edu> turner@unc.cs.unc.edu (Douglass Turner) writes:
> >Hello, does anyone know of a good, robust, method to find all roots of a
> >quartic equation?
>
> [miscellaneous introductory comments]
>
> The direct solution is prone to programmer's instability.  It's very
> messy and doesn't generalize.  The most painful step is taking the
> cube root of a complex number, which cannot be expressed in terms of
> roots of real numbers.  In any case, the numerous roots in the formula

Perhaps I misunderstand you (heaven knows I didn't follow the Sturm
chains and so on), but...  I recall taking cube and higher roots of
complex numbers quite easily by converting from "cartesian" (A+Bi)
representation to "polar" (r cis Theta), and performing a few
straightforward real-number calculations.  Basically, you take the
root (let's generalize and say the Nth root) of the "r" (radius) part
of the representation, and divide the "Theta" (angle) part by N, to
find the first Nth root; then you add 2*Pi/N to the angle part
repeatedly to get the other N-1 roots.  Works quite nicely; I've
never bothered to program it, but if it's been a bottleneck for you,
this may be a way around it.

I'd be interested to know if this has been helpful to you or anyone
else...

Chris Chiesa
  Senior, CS Dept
  Ball State University
  Muncie, IN
--
<><><><><><><><><><><><><><><><><><><><><><><><><><><><> Chris Chiesa <><><><><>
<>   {ihpn4|seismo}!{iuvax|pur-ee}!bsu-cs!cfchiesa                             <>
<>   cfchiesa@bsu-cs.UUCP                                                      <>
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
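[For concreteness, a small C sketch of the polar-form method
described above: the N complex Nth roots of A + Bi are
r^(1/N) cis((Theta + 2*Pi*k)/N) for k = 0..N-1.  The helper name
nth_roots and the test values are illustrative, not from the post.]

#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

static void nth_roots(double a, double b, int n)
{
    double r = pow(sqrt(a * a + b * b), 1.0 / n); /* Nth root of the radius */
    double theta = atan2(b, a);                   /* angle of A + Bi */
    int k;

    for (k = 0; k < n; k++) {
        double ang = (theta + 2.0 * M_PI * k) / n;
        printf("root %d: %g + %gi\n", k, r * cos(ang), r * sin(ang));
    }
}

int main(void)
{
    nth_roots(-8.0, 0.0, 3);  /* cube roots of -8: 1+sqrt(3)i, -2, 1-sqrt(3)i */
    return 0;
}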
bumby@math.rutgers.edu (Richard Bumby) (03/17/88)
In article <7741@alice.UUCP> td@alice.UUCP writes:

>I believe that the Sturm chain algorithm as described
>in the referenced article has at least two bugs.
>...(stuff omitted)...
>	p[n+2](x) = -(p[n](x) mod p[n+1](x))
>...(more stuff omitted)...
>The article describes the method of counting roots on an interval [a,b]
>as follows:
>> Now let S(a) be the number of sign changes
>>in the sequence p0(a),p1(a),...,pk(a), i.e. the number of times
>>that pi(a) is positive and p{i+1}(a) is negative, or vice versa.
>>Then the number of roots in the interval [a,b] is S(a)-S(b).
>
>This is wrong for two reasons: first, the interval must be open at
>the bottom (i.e. (a,b], not [a,b]).
>...(more stuff omitted)...

The proof of Sturm's theorem, which you can find in lovely detail in
Weber's "Algebra", makes clear why the method works and what the
theorem really says.  For example, the polynomial that you start with
should have no multiple roots, so that any Euclidean algorithm
starting from these two polynomials will always have consecutive
terms without common factor.  Next, you observe what happens as you
pass through a zero of any of the polynomials in your chain: the role
of the sign change is to guarantee that p{i-1}(a) and p{i+1}(a) have
opposite signs if pi(a) = 0.  Thus, the number of changes of sign
does not change, however you interpret the sign of 0.  Finally, going
through a zero of p from left to right is easily seen always to
remove a change of sign between p and p'.  The same method of proof
gives "Descartes' rule of signs" and other related results.

A thoroughly robust program must be able to handle the accidental
discovery of a root of the polynomial, but the first version should
be allowed to handle it in a sloppier fashion.  Similarly, the
decision of whether to use a rapid iterative solution once the root
has been isolated can be postponed.  It is sobering to realize that
much more time is spent telling students why 'bisection' is a poor
rootfinding method than would be spent in using it for all relevant
problems in a numerical analysis course.
--
--R. T. Bumby ** Math ** Rutgers ** New Brunswick **
(in one form or another for all kinds of mail)
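[As an illustration of the closing remark, a minimal bisection
sketch, assuming the Sturm count has already trapped exactly one root
in a bracket with p(a)*p(b) < 0.  Slow, but unconditionally
convergent.]

double bisect(double (*p)(double), double a, double b, double tol)
{
    double pa = p(a);

    while (b - a > tol) {
        double m = 0.5 * (a + b);
        double pm = p(m);
        if (pm == 0.0)
            return m;                   /* accidental exact hit */
        if ((pa < 0.0) == (pm < 0.0)) { a = m; pa = pm; }
        else                          { b = m; }
    }
    return 0.5 * (a + b);
}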
greg@jiff.berkeley.edu (Greg) (03/19/88)
In article <2381@bsu-cs.UUCP> cfchiesa@bsu-cs.UUCP (Christopher Chiesa) writes:
>In article <7662@agate.BERKELEY.EDU>, greg@jiff.berkeley.edu (Greg) writes:
>> The direct solution is prone to programmer's instability.  It's very
>> messy and doesn't generalize.  The most painful step is taking the
>> cube root of a complex number,...
...
>I recall taking cube and higher roots of complex numbers
>quite easily by converting from "cartesian" (A+Bi) representation to "polar"
>(r cis Theta), and performing a few straightforward real-number calculations.

You are correct; you can take roots of complex numbers by passing to
polar coordinates.  Although the calculation is not quite as messy as
I suggested, it is still painful in the sense that it requires
several invocations of transcendental functions.  It does not seem
like the fastest solution.

Be that as it may, the formula for the roots of a quartic is
complicated.  Its derivation is even worse.
-- Greg
bumby@math.rutgers.edu (Richard Bumby) (03/22/88)
This discussion concerns the solution of quartic equations.  The
formula reducing the solution of quartic equations to the extraction
of roots was referred to as 'the direct method', and questions about
implementing the complex cube root it requires were addressed in some
of the articles.  Here is a quote from article <2381@bsu-cs.UUCP> by
cfchiesa@bsu-cs.UUCP (Christopher Chiesa):

   Perhaps I misunderstand you (heaven knows I didn't follow the
   Sturm chains and so on), but...  I recall taking cube and higher
   roots of complex numbers quite easily by converting from
   "cartesian" (A+Bi) representation to "polar" (r cis Theta), and
   performing a few straightforward real-number calculations.

This is possible, but it is totally insensitive to the topology of
the real numbers, and hence to any numerical considerations.
Similarly, the Sturm chains are easy (I thought I posted on that, but
I'm not sure it got out), but are really used for theoretical
purposes.  In this problem, it is probably easy to separate the roots
using other considerations.  If there are 4 real roots, then Newton's
method is sure to locate the largest if you start from a value larger
than it.  The smallest is found similarly.  If there are only two
real roots, this may be reliable for only one of them.  Deflation to
reduce to an equation of lower degree is probably safe if you check
the roots in the original equation.

Since I last wrote on this, I have had a chance to consult "Numerical
Recipes" and find that it does not give very much information.  On
the other hand, this is really a numerical problem, not an algebraic
one.  There is no reason to believe that 'extraction of roots', which
must be done numerically (frequently in software), is to be preferred
to a direct rootfinding method which does not require complicated
preprocessing.
--
--R. T. Bumby ** Math ** Rutgers ** New Brunswick **
(in one form or another for all kinds of mail)
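[A small C sketch of the start-above-the-largest-root idea, assuming
a quartic with all roots real.  The starting value uses the standard
Cauchy bound 1 + max|c[i]/c[4]|, which is an assumption of this
sketch, not something from the article.]

#include <math.h>
#include <stdio.h>

static double eval(const double c[5], double x)
{
    double v = 0.0;
    int i;
    for (i = 4; i >= 0; i--)
        v = v * x + c[i];
    return v;
}

static double deval(const double c[5], double x)
{
    double v = 0.0;
    int i;
    for (i = 4; i >= 1; i--)
        v = v * x + i * c[i];
    return v;
}

int main(void)
{
    /* (x-1)(x-2)(x-3)(x-4): c[i] is the coefficient of x^i */
    double c[5] = { 24.0, -50.0, 35.0, -10.0, 1.0 };
    double x = 0.0;
    int i;

    for (i = 0; i < 4; i++)             /* Cauchy bound on the roots */
        if (fabs(c[i] / c[4]) > x)
            x = fabs(c[i] / c[4]);
    x += 1.0;

    for (i = 0; i < 50; i++) {          /* Newton from above all roots */
        double step = eval(c, x) / deval(c, x);
        x -= step;
        if (fabs(step) < 1e-12)
            break;
    }
    printf("largest root: %.12g\n", x); /* prints 4 */
    return 0;
}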
jadwa@henry.cs.reading.ac.uk (James Anderson) (03/23/88)
I am an absolute beginner, so let me get this clear.  If I have an
equation with several roots and I start Newton's method lower than
the least one it will converge to the least one, and if I start it
larger than the largest it will converge to the largest?  No matter
how close the roots are?  Wow!

James

(JANET) James.Anderson@reading.ac.uk
ags@s.cc.purdue.edu (Dave Seaman) (03/26/88)
In article <726@onion.cs.reading.ac.uk>, jadwa@henry.cs.reading.ac.uk (James Anderson) writes:
> I am an absolute beginner, so let me get this clear.  If I have an
> equation with several roots and I start Newton's method lower
> than the least one it will converge to the least one and if I
> start it larger than the largest it will converge to the largest?
> No matter how close the roots are?  Wow!

This is true if all roots are real and if you assume infinite
precision arithmetic (a very large assumption).  The reason is that a
polynomial with all roots real is either concave upward (if positive)
or concave downward (if negative) on the intervals (-inf,rmin] and
[rmax,+inf), where rmin and rmax are the smallest and largest roots.

The claim is not necessarily true if the polynomial has complex
roots.  There may be extreme values which lie outside the interval
[rmin,rmax], and choosing an initial approximation near such an
extremum may cause the first Newton iteration to lead to a point that
is far away on the other side of all the real roots.
--
Dave Seaman
ags@j.cc.purdue.edu
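[A tiny C demonstration of the caveat; the polynomial and the
starting point are chosen here for illustration, not taken from the
thread.  p(x) = (x^2 - 1)((x-3)^2 + 0.01) has real roots only at +-1,
but its complex roots give it a local minimum near x = 3, outside
[rmin,rmax] = [-1,1].  Starting Newton beside that extremum flings
the first iterate far away from all the real roots.]

#include <stdio.h>

static double p(double x)
{
    return (x * x - 1.0) * ((x - 3.0) * (x - 3.0) + 0.01);
}

static double dp(double x)
{
    return 2.0 * x * ((x - 3.0) * (x - 3.0) + 0.01)
         + (x * x - 1.0) * 2.0 * (x - 3.0);
}

int main(void)
{
    double x = 2.9963;                  /* beside the local minimum near 3 */
    int i;
    for (i = 0; i < 5; i++) {
        printf("x%d = %g\n", i, x);     /* watch the first step fly */
        x -= p(x) / dp(x);
    }
    return 0;
}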