Fitting Data

            Chi-square is defined to be

(1)        \chi^2 = \sum_{i=1}^{N} \left[ \frac{f_i - f_A(\vec{x}_i;\, c)}{\sigma_i} \right]^2

The function f_A is fitted to the data by adjusting the parameters c such that chi-square is a minimum.  The power of 2 in (1) is convenient but not necessary; a χ² with the 2 replaced by 2n is discussed in ExtremePow.doc .htm.  Note that the x in (1) is not limited to one dimension.  The vector sign will usually be omitted, but this causes no extra complications.
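As a concrete sketch, (1) can be evaluated in a few lines of Python (numpy assumed; the names f_model, xs, fs, and sigmas are illustrative, not taken from the fitting code):

    import numpy as np

    def chi_square(c, xs, fs, sigmas, f_model):
        # Equation (1): sum of squared residuals, each weighted by
        # the one-standard-deviation error sigma_i of that point.
        residuals = (fs - f_model(xs, c)) / sigmas
        return np.sum(residuals ** 2)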

Expansion about c_0

            In general χ² is known, but is not a minimum, for some set of constants c_0.  Expand χ²(c) about this value so that

(2)        \chi^2(c) = \chi^2(c_0) + \sum_L \left.\frac{\partial \chi^2}{\partial c_L}\right|_{c_0} (c_L - c_{0,L}) + \frac{1}{2} \sum_{L,J} \left.\frac{\partial^2 \chi^2}{\partial c_L\,\partial c_J}\right|_{c_0} (c_L - c_{0,L})(c_J - c_{0,J}) + \ldots

Truncate at the quadratic term to define the predicted χ²_P.

Truncating (2) at the quadratic term, the partial of χ² with respect to c_L is approximately given by

(3)        \frac{\partial \chi^2}{\partial c_L} \approx B_L + \sum_J A_{L,J}\,(c_J - c_{0,J}), \qquad B_L \equiv \left.\frac{\partial \chi^2}{\partial c_L}\right|_{c_0}, \quad A_{L,J} \equiv \left.\frac{\partial^2 \chi^2}{\partial c_L\,\partial c_J}\right|_{c_0}

At its extreme value, the derivatives of χ² with respect to the components of c are equal to zero.  Setting (3) to zero yields

(4)        B_L = -\sum_J A_{L,J}\,(c_J - c_{0,J})

Multiply both sides of (4) by (A^{-1})_{I,L} (the inverse exists because the curve-fit matrix is positive definite: GaussJ.doc, Cholesky\Curve fit matrices are positive definite.doc) and sum over L

(5)        \sum_L (A^{-1})_{I,L}\, B_L = -\sum_J \left[ \sum_L (A^{-1})_{I,L}\, A_{L,J} \right] (c_J - c_{0,J})

The bracketed sum on the right is δ_{I,J}.  This makes the sum over J easy to evaluate, leading to

(6)        c_I = c_{0,I} - \sum_L (A^{-1})_{I,L}\, B_L
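In code, (6) is a short sketch, assuming A and B have already been evaluated at c_0; solving the linear system is cheaper and more stable than forming the inverse explicitly:

    import numpy as np

    def newton_step(c0, A, B):
        # Equation (6): c = c0 - A^{-1} B, done as a linear solve
        # rather than an explicit matrix inverse.
        return c0 - np.linalg.solve(A, B)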

Finding the partials of χ²

(7)        B_L = \frac{\partial \chi^2}{\partial c_L} = -2 \sum_{i=1}^{N} \frac{f_i - f_A(x_i;\,c)}{\sigma_i^2}\, \frac{\partial f_A(x_i;\,c)}{\partial c_L}

Note that the error value σ_i in (7) appears squared.

(8)        A_{L,K} = \frac{\partial^2 \chi^2}{\partial c_L\,\partial c_K} = 2 \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ \frac{\partial f_A}{\partial c_L}\,\frac{\partial f_A}{\partial c_K} - \left(f_i - f_A(x_i;\,c)\right) \frac{\partial^2 f_A}{\partial c_L\,\partial c_K} \right]

The second term in (8) is exactly zero for linear approximating functions, and approximately zero for any approximating function capable of reproducing the data.  This is because the residual factor f_i - f_A is not squared and contains roughly equal numbers of positive and negative terms, which tend to cancel in the sum.
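A sketch of evaluating (7) and (8) numerically, dropping the second-derivative term of (8) as argued above; f_model and df_dc are assumed helpers, with df_dc(xs, c) returning the matrix of partials ∂f_A/∂c_L at every x_i:

    import numpy as np

    def gradient_and_hessian(c, xs, fs, sigmas, f_model, df_dc):
        r = (fs - f_model(xs, c)) / sigmas ** 2       # (f_i - f_A) / sigma_i^2
        J = df_dc(xs, c)                              # shape (N, number of parameters)
        B = -2.0 * J.T @ r                            # equation (7)
        A = 2.0 * J.T @ (J / sigmas[:, None] ** 2)    # equation (8), second term dropped
        return B, A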

Non Linear

(9)

A general function is one such as a Padé approximant

(10)        f_A(x) = \frac{\sum_{i=1}^{M} c_i\, x^{i-1}}{1 + \sum_{i=M+1}^{\hat{N}} c_i\, x^{i-M}}

The first derivative with respect to c_i for i < M+1 is

(11)        \frac{\partial f_A}{\partial c_i} = \frac{x^{i-1}}{1 + \sum_{j=M+1}^{\hat{N}} c_j\, x^{j-M}}

And the first derivative with respect to c_i for i > M is

(12)        \frac{\partial f_A}{\partial c_i} = -\,f_A(x)\, \frac{x^{i-M}}{1 + \sum_{j=M+1}^{\hat{N}} c_j\, x^{j-M}}

These depend heavily on c_0.  This means that a new second derivative matrix needs to be calculated and inverted before each step in the iteration sequence of equation (6).  No great harm is done by the extra calculations when they are not needed.
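A sketch of (10)-(12) in Python, assuming the Padé form written above (numerator coefficients c_1..c_M, denominator constant term fixed at 1; the code is 0-based, so c[0] is c_1):

    import numpy as np

    def pade(x, c, M):
        # Equation (10): a ratio of two polynomials in x.
        num = sum(c[i] * x ** i for i in range(M))
        den = 1.0 + sum(c[M + j] * x ** (j + 1) for j in range(len(c) - M))
        return num / den

    def pade_partials(x, c, M):
        # Equations (11) and (12).  Every partial carries the current
        # denominator, so the derivatives depend heavily on c_0.
        den = 1.0 + sum(c[M + j] * x ** (j + 1) for j in range(len(c) - M))
        f = pade(x, c, M)
        grads = [x ** i / den for i in range(M)]                        # (11)
        grads += [-f * x ** (j + 1) / den for j in range(len(c) - M)]   # (12)
        return np.array(grads)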

nlfit\Welcome.doc .htm

Linear f_A

For many of the function types in CFIT1.htm .doc the approximating function can be written as

(13)        f_A(x) = \sum_{K=1}^{\hat{N}} c_K\, g_K(x)

The \hat{N} is in general much less than the N in equation (1).  In this case equation (7) becomes

(14)        B_L = -2 \sum_{i=1}^{N} \frac{f_i - \sum_{K=1}^{\hat{N}} c_K\, g_K(x_i)}{\sigma_i^2}\, g_L(x_i)

And (8) becomes

(15)        A_{K,L} = 2 \sum_{i=1}^{N} \frac{g_K(x_i)\, g_L(x_i)}{\sigma_i^2}

Note that A_{K,L} does not depend on the value of f_i or on the value of c.  Start with c = 0; then (14) becomes

(16)        B_L = -2 \sum_{i=1}^{N} \frac{f_i\, g_L(x_i)}{\sigma_i^2}

Then (6) becomes a single-step minimization, since for a linear f_A the expansion (2) is exact and χ² is exactly quadratic in c.

 

In practice, there is a large accuracy gain in following this step with one in which c → c - c_0 and B is the explicit B given by (14).
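In the linear case the whole fit collapses to one linear solve; a sketch, where basis(x) is an assumed function returning the values g_1(x)..g_N̂(x):

    import numpy as np

    def linear_fit(xs, fs, sigmas, basis):
        G = np.array([basis(x) for x in xs])     # g_K(x_i), shape (N, Nhat)
        w = 1.0 / sigmas ** 2
        A = 2.0 * G.T @ (G * w[:, None])         # equation (15)
        B = -2.0 * G.T @ (fs * w)                # equation (16), i.e. (14) at c = 0
        return -np.linalg.solve(A, B)            # equation (6) with c0 = 0

The refinement pass described above amounts to repeating the solve with B rebuilt from (14) at the first-pass result.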

The 2's in the definitions of A and B differ from the usual definitions.  This eventually changes the equation for the error estimate by a factor of 2 from the usual equation.  I use these definitions so that the first derivative of chi-square is exactly B and the second derivative matrix is exactly A.

Real life

            The Newton-Raphson method has quadratic convergence when it works.  That is, each succeeding step squares the error: 1D-1 → 1D-2 → 1D-4 → 1D-8 → 1D-16 → truncation limit.  This seems to almost never happen.  What usually happens in physics is a function that is not quadratic at all.
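The error-squaring pattern is easy to reproduce with the one-dimensional Newton iteration for the root of x^2 - 2 (a toy example, not taken from the fitting code):

    x, root = 1.5, 2.0 ** 0.5
    for _ in range(6):
        x = x - (x * x - 2.0) / (2.0 * x)   # one Newton-Raphson step
        print(abs(x - root))                # the error roughly squares each pass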

            Define the function

(17)        \chi^2_m(c, \lambda) = \chi^2_P(c) + \lambda \sum_J \frac{(c_J - c_{0,J})^2}{S_J^2}

This is χ²_P, equation (2), plus a Marquardt parameter λ multiplying the sum of the squared changes in the constants, each weighted by the relative importance of that constant, S_J².  For reference, in the fitting function S_J is given by

           (18)

 Minimize χ²_m following the steps that led from (2) to (6).  Setting the first derivative to zero yields

(19)        B_L + \sum_J A_{L,J}\,(c_J - c_{0,J}) + \frac{2\lambda\,(c_L - c_{0,L})}{S_L^2} = 0

Move the last term inside the summation

(20)        \sum_J \left[ A_{L,J} + \frac{2\lambda\,\delta_{L,J}}{S_J^2} \right] (c_J - c_{0,J}) = -B_L

The Marquardt parameter λ is a one-dimensional parameter that can be chosen to make χ²_P take "almost" any desired value.  The smoothers, which are to be found empirically, make this multidimensional.

  1. For λ = 0, this is the Newton-Raphson equation with all of its faults.
  2. For λ = infinity, the inverse of the damped second derivative matrix is zero, and the c which minimizes equation (17) is simply c_0.
  3. For a sufficiently large value of λ, the inverse can always be found.
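A sketch of one damped step, solving (20) at a given λ; S is the assumed vector of importance weights S_J:

    import numpy as np

    def marquardt_step(c0, A, B, lam, S):
        # Equation (20): (A + 2*lam*diag(1/S^2)) (c - c0) = -B.
        # lam = 0 recovers the Newton-Raphson step (6);
        # lam -> infinity leaves c at c0.
        damped = A + 2.0 * lam * np.diag(1.0 / S ** 2)
        return c0 - np.linalg.solve(damped, B)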

The usual situation is shown as a one-dimensional plot on the left.  The term χ² and all of its derivatives have been evaluated at the position c_0.  The expansion χ²_m(c, λ=0) predicts a minimum at c_q which, when tested, gives a χ² much larger than the current value χ²(c_0).  The plot of χ²_m is also shown for a fairly large value of λ, which results in a c_m that gives rise to a χ² slightly smaller than χ²(c_0), but nothing to write home about.  The goal, of course, is to find just the right value of λ so that the minimum is at c_b, which is also at the minimum of χ².

The general method

The idea is to request an improvement in χ² by a specific amount; in particular, to solve for a λ such that

(21)        \chi^2_P(c(\lambda)) = F_r\, \chi^2(c_0)

The value of F_r is determined adaptively.  It is initially specified in the direction file as, for example, 0.99999.  When the χ² at c(λ) is equal to the predicted value, F_r is cubed.  When it is not equal, F_r moves ¾ of the distance towards one.  Equation (21) is solved by bracketing (Bracketing.doc .htm) followed by a one-dimensional Newton's method (.htm).
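A sketch of the λ search for (21) and of the F_r update rule just described; chi2_pred is an assumed callable evaluating the predicted χ² at c(λ), and plain bisection stands in for the bracketing-plus-Newton solver used by the text:

    def solve_for_lambda(chi2_pred, target, lam_lo=0.0, lam_hi=1.0):
        # The predicted chi-square rises towards chi^2(c0) as lam grows,
        # so grow the upper bound until the target value is bracketed.
        while chi2_pred(lam_hi) < target:
            lam_hi *= 10.0
        for _ in range(60):              # bisect down to machine precision
            mid = 0.5 * (lam_lo + lam_hi)
            if chi2_pred(mid) < target:
                lam_lo = mid
            else:
                lam_hi = mid
        return 0.5 * (lam_lo + lam_hi)

    def update_fr(fr, prediction_matched):
        # F_r is cubed when the tested chi-square matches the prediction;
        # otherwise it moves 3/4 of the distance towards one.
        return fr ** 3 if prediction_matched else fr + 0.75 * (1.0 - fr)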