Fitting Data

            Chi-square is defined to be

(1)        \chi^2 = \sum_{i=1}^{N} \left[ \frac{f_i - f_A(\vec{x}_i;\, c)}{\sigma_i} \right]^2

The function f_A is fitted to the data by adjusting the parameters c such that chi-square is a minimum.  The power of 2 in (1) is convenient but not necessary; a χ² with the 2 replaced by 2n is discussed in ExtremePow.doc .htm.  Note that the x in (1) is not limited to one dimension.  The vector sign will usually be omitted, but this causes no extra complications.
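As a concrete sketch, (1) can be evaluated in a few lines of Python (numpy assumed; the names f_model, xs, fs, and sigmas are illustrative, not taken from the fitting code):

    import numpy as np

    def chi_square(c, xs, fs, sigmas, f_model):
        # Equation (1): sum of squared residuals, each weighted by
        # the one-standard-deviation error sigma_i of that point.
        residuals = (fs - f_model(xs, c)) / sigmas
        return np.sum(residuals ** 2)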

Expansion about c_0

            In general χ² is known, but is not a minimum, for some set of constants c_0.  Expand χ²(c) about this value so that

(2)        \chi^2(c) = \chi^2(c_0) + \sum_L \left.\frac{\partial \chi^2}{\partial c_L}\right|_{c_0} (c_L - c_{0,L}) + \frac{1}{2} \sum_{L,J} \left.\frac{\partial^2 \chi^2}{\partial c_L\,\partial c_J}\right|_{c_0} (c_L - c_{0,L})(c_J - c_{0,J}) + \ldots

Truncate at the quadratic term to define the predicted χ²_P.

Truncating (2) at the quadratic term, the partial of χ² with respect to c_L is approximately given by

(3)        \frac{\partial \chi^2}{\partial c_L} \approx B_L + \sum_J A_{L,J}\,(c_J - c_{0,J}), \qquad B_L \equiv \left.\frac{\partial \chi^2}{\partial c_L}\right|_{c_0}, \quad A_{L,J} \equiv \left.\frac{\partial^2 \chi^2}{\partial c_L\,\partial c_J}\right|_{c_0}

At its extreme value, the derivatives of χ² with respect to the components of c are equal to zero.  Setting (3) to zero yields

(4)        B_L = -\sum_J A_{L,J}\,(c_J - c_{0,J})

Multiply both sides of (4) by (A^{-1})_{I,L} (the inverse exists because the curve-fit matrix is positive definite: GaussJ.doc, Cholesky\Curve fit matrices are positive definite.doc) and sum over L

(5)        \sum_L (A^{-1})_{I,L}\, B_L = -\sum_J \left[ \sum_L (A^{-1})_{I,L}\, A_{L,J} \right] (c_J - c_{0,J})

The bracketed sum on the right is δ_{I,J}.  This makes the sum over J easy to evaluate, leading to

(6)        c_I = c_{0,I} - \sum_L (A^{-1})_{I,L}\, B_L
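In code, (6) is a short sketch, assuming A and B have already been evaluated at c_0; solving the linear system is cheaper and more stable than forming the inverse explicitly:

    import numpy as np

    def newton_step(c0, A, B):
        # Equation (6): c = c0 - A^{-1} B, done as a linear solve
        # rather than an explicit matrix inverse.
        return c0 - np.linalg.solve(A, B)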

Finding the partials of χ²

(7)        B_L = \frac{\partial \chi^2}{\partial c_L} = -2 \sum_{i=1}^{N} \frac{f_i - f_A(x_i;\,c)}{\sigma_i^2}\, \frac{\partial f_A(x_i;\,c)}{\partial c_L}

Note that the error value σ_i in (7) appears squared.

(8)        A_{L,K} = \frac{\partial^2 \chi^2}{\partial c_L\,\partial c_K} = 2 \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ \frac{\partial f_A}{\partial c_L}\,\frac{\partial f_A}{\partial c_K} - \left(f_i - f_A(x_i;\,c)\right) \frac{\partial^2 f_A}{\partial c_L\,\partial c_K} \right]

The second term in (8) is exactly zero for linear approximating functions, and approximately zero for any approximating function capable of reproducing the data.  This is because the residual factor f_i - f_A is not squared and contains roughly equal numbers of positive and negative terms, which tend to cancel in the sum.
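A sketch of evaluating (7) and (8) numerically, dropping the second-derivative term of (8) as argued above; f_model and df_dc are assumed helpers, with df_dc(xs, c) returning the matrix of partials ∂f_A/∂c_L at every x_i:

    import numpy as np

    def gradient_and_hessian(c, xs, fs, sigmas, f_model, df_dc):
        r = (fs - f_model(xs, c)) / sigmas ** 2       # (f_i - f_A) / sigma_i^2
        J = df_dc(xs, c)                              # shape (N, number of parameters)
        B = -2.0 * J.T @ r                            # equation (7)
        A = 2.0 * J.T @ (J / sigmas[:, None] ** 2)    # equation (8), second term dropped
        return B, A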

Non Linear

(9)

A general function is one such as a Padé approximant

(10)        f_A(x) = \frac{\sum_{i=1}^{M} c_i\, x^{i-1}}{1 + \sum_{i=M+1}^{\hat{N}} c_i\, x^{i-M}}

The first derivative with respect to c_i for i < M+1 is

(11)        \frac{\partial f_A}{\partial c_i} = \frac{x^{i-1}}{1 + \sum_{j=M+1}^{\hat{N}} c_j\, x^{j-M}}

And the first derivative with respect to c_i for i > M is

(12)        \frac{\partial f_A}{\partial c_i} = -\,f_A(x)\, \frac{x^{i-M}}{1 + \sum_{j=M+1}^{\hat{N}} c_j\, x^{j-M}}

These depend heavily on c_0.  This means that a new second derivative matrix needs to be calculated and inverted before each step in the iteration sequence of equation (6).  No great harm is done by the extra calculations when they are not needed.
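A sketch of (10)-(12) in Python, assuming the Padé form written above (numerator coefficients c_1..c_M, denominator constant term fixed at 1; the code is 0-based, so c[0] is c_1):

    import numpy as np

    def pade(x, c, M):
        # Equation (10): a ratio of two polynomials in x.
        num = sum(c[i] * x ** i for i in range(M))
        den = 1.0 + sum(c[M + j] * x ** (j + 1) for j in range(len(c) - M))
        return num / den

    def pade_partials(x, c, M):
        # Equations (11) and (12).  Every partial carries the current
        # denominator, so the derivatives depend heavily on c_0.
        den = 1.0 + sum(c[M + j] * x ** (j + 1) for j in range(len(c) - M))
        f = pade(x, c, M)
        grads = [x ** i / den for i in range(M)]                        # (11)
        grads += [-f * x ** (j + 1) / den for j in range(len(c) - M)]   # (12)
        return np.array(grads)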

nlfit\Welcome.doc .htm

Linear f_A

For many of the function types in CFIT1.htm .doc the approximating function can be written as

(13)        f_A(x) = \sum_{K=1}^{\hat{N}} c_K\, g_K(x)

The \hat{N} is in general much less than the N in equation (1).  In this case equation (7) becomes

(14)        B_L = -2 \sum_{i=1}^{N} \frac{f_i - \sum_{K=1}^{\hat{N}} c_K\, g_K(x_i)}{\sigma_i^2}\, g_L(x_i)

And (8) becomes

(15)        A_{K,L} = 2 \sum_{i=1}^{N} \frac{g_K(x_i)\, g_L(x_i)}{\sigma_i^2}

Note that A_{K,L} does not depend on the value of f_i or on the value of c.  Start with c = 0; then (14) becomes

(16)        B_L = -2 \sum_{i=1}^{N} \frac{f_i\, g_L(x_i)}{\sigma_i^2}

Then (6) becomes a single-step minimization, since for a linear f_A the expansion (2) is exact and χ² is exactly quadratic in c.

 

In practice, there is a large accuracy gain in following this step with one in which c → c - c_0 and B is the explicit B given by (14).
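In the linear case the whole fit collapses to one linear solve; a sketch, where basis(x) is an assumed function returning the values g_1(x)..g_N̂(x):

    import numpy as np

    def linear_fit(xs, fs, sigmas, basis):
        G = np.array([basis(x) for x in xs])     # g_K(x_i), shape (N, Nhat)
        w = 1.0 / sigmas ** 2
        A = 2.0 * G.T @ (G * w[:, None])         # equation (15)
        B = -2.0 * G.T @ (fs * w)                # equation (16), i.e. (14) at c = 0
        return -np.linalg.solve(A, B)            # equation (6) with c0 = 0

The refinement pass described above amounts to repeating the solve with B rebuilt from (14) at the first-pass result.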

The 2's in the definitions of A and B differ from the usual definitions.  This eventually changes the equation for the error estimate by a factor of 2 from the usual equation.  I use these definitions so that the first derivative of chi-square is exactly B and the second derivative matrix is exactly A.

Real life

            The Newton-Raphson method has quadratic convergence when it works.  That is, each succeeding step squares the error: 1D-1 → 1D-2 → 1D-4 → 1D-8 → 1D-16 → truncation limit.  This seems to almost never happen.  What usually happens in physics is a function that is not quadratic at all.
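The error-squaring pattern is easy to reproduce with the one-dimensional Newton iteration for the root of x^2 - 2 (a toy example, not taken from the fitting code):

    x, root = 1.5, 2.0 ** 0.5
    for _ in range(6):
        x = x - (x * x - 2.0) / (2.0 * x)   # one Newton-Raphson step
        print(abs(x - root))                # the error roughly squares each pass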

            Define the function

(17)        \chi^2_m(c, \lambda) = \chi^2_P(c) + \lambda \sum_J \frac{(c_J - c_{0,J})^2}{S_J^2}

This is χ²_P, equation (2), plus a Marquardt parameter λ multiplying the sum of the squared changes in the constants, each weighted by the relative importance of that constant, S_J².  For reference, in the fitting function S_J is given by

           (18)

 Minimize χ²_m following the steps that led from (2) to (6).  Setting the first derivative to zero yields

(19)        B_L + \sum_J A_{L,J}\,(c_J - c_{0,J}) + \frac{2\lambda\,(c_L - c_{0,L})}{S_L^2} = 0

Move the last term inside the summation

(20)        \sum_J \left[ A_{L,J} + \frac{2\lambda\,\delta_{L,J}}{S_J^2} \right] (c_J - c_{0,J}) = -B_L

The Marquardt parameter λ is a one-dimensional parameter that can be chosen to make χ²_P take "almost" any desired value.  The smoothers, which are to be found empirically, make this multidimensional.

  1. For λ = 0, this is the Newton-Raphson equation with all of its faults.
  2. For λ = infinity, the inverse of the damped second derivative matrix is zero, and the c which minimizes equation (17) is simply c_0.
  3. For a sufficiently large value of λ, the inverse can always be found.
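A sketch of one damped step, solving (20) at a given λ; S is the assumed vector of importance weights S_J:

    import numpy as np

    def marquardt_step(c0, A, B, lam, S):
        # Equation (20): (A + 2*lam*diag(1/S^2)) (c - c0) = -B.
        # lam = 0 recovers the Newton-Raphson step (6);
        # lam -> infinity leaves c at c0.
        damped = A + 2.0 * lam * np.diag(1.0 / S ** 2)
        return c0 - np.linalg.solve(damped, B)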

The usual situation is shown as a one-dimensional plot on the left.  The term χ² and all of its derivatives have been evaluated at the position c_0.  The expansion χ²_m(c, λ=0) predicts a minimum at c_q which, when tested, gives a χ² much larger than the current value χ²(c_0).  The plot of χ²_m is also shown for a fairly large value of λ, which results in a c_m that gives rise to a χ² slightly smaller than χ²(c_0), but nothing to write home about.  The goal, of course, is to find just the right value of λ so that the minimum is at c_b, which is also at the minimum of χ².

The general method

The idea is to request an improvement in χ² by a specific amount; in particular, to solve for a λ such that

(21)        \chi^2_P(c(\lambda)) = F_r\, \chi^2(c_0)

The value of F_r is determined adaptively.  It is initially specified in the direction file as, for example, 0.99999.  When the χ² at c(λ) is equal to the predicted value, F_r is cubed.  When it is not equal, F_r moves ¾ of the distance towards one.  Equation (21) is solved by bracketing (Bracketing.doc .htm) followed by a one-dimensional Newton's method (.htm).
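A sketch of the λ search for (21) and of the F_r update rule just described; chi2_pred is an assumed callable evaluating the predicted χ² at c(λ), and plain bisection stands in for the bracketing-plus-Newton solver used by the text:

    def solve_for_lambda(chi2_pred, target, lam_lo=0.0, lam_hi=1.0):
        # The predicted chi-square rises towards chi^2(c0) as lam grows,
        # so grow the upper bound until the target value is bracketed.
        while chi2_pred(lam_hi) < target:
            lam_hi *= 10.0
        for _ in range(60):              # bisect down to machine precision
            mid = 0.5 * (lam_lo + lam_hi)
            if chi2_pred(mid) < target:
                lam_lo = mid
            else:
                lam_hi = mid
        return 0.5 * (lam_lo + lam_hi)

    def update_fr(fr, prediction_matched):
        # F_r is cubed when the tested chi-square matches the prediction;
        # otherwise it moves 3/4 of the distance towards one.
        return fr ** 3 if prediction_matched else fr + 0.75 * (1.0 - fr)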