Robust Fitting of Spectra to Splines with Variable Knots

 

R. L. Coldwell

 

Department of Physics, University of Florida, Gainesville, FL 32611 and

Constellation Technology Corporation, 7887 Brian Dairy Road, Largo, FL 33777

 

 

Spectra consist of continuum features that vary over many channels and peaks that vary over few channels.  In a fit to the continuum, the peaks appear as outliers.  Robust methods, in which the weights associated with the peaks are reduced, allow the continuum to be fitted almost independently of the peaks.  This requires a very smooth background.  Cubic splines, which are continuous with continuous first and second derivatives, are a good choice for this task when the locations of the discontinuities in the third derivatives (the knots) are included among the parameters optimized in the fitting process.  An extension to the Marquardt method allows a Newton-Raphson method to be used in this optimization.

 


 

 

SPLINES

 

     A spline can be defined as

    (3) Where

                                        (4)

With c1=1 and x1 and all others 0, this is

FIGURE 1  A single knot spline and its derivatives
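
The single-knot case can be sketched numerically.  The sketch below assumes that each term of the spline is a one-sided truncated cubic, c (x_k - x)^3 for x below the knot x_k and zero above it.  That form is an assumption chosen to agree with the later remark that these splines become large for small values of x; it is not a reproduction of equations (3) and (4).  The function name single_knot_spline and its arguments are hypothetical.

```python
import numpy as np

def single_knot_spline(x, knot=0.0, c=1.0):
    """Assumed single-knot term c*(knot - x)^3 for x < knot, else 0.

    Returns the value and its first three derivatives with respect to x.
    """
    x = np.asarray(x, dtype=float)
    d = np.clip(knot - x, 0.0, None)        # (knot - x) where positive, else 0
    value = c * d**3
    first = -3.0 * c * d**2                 # continuous at the knot
    second = 6.0 * c * d                    # continuous at the knot
    third = np.where(x < knot, -6.0 * c, 0.0)  # jumps at the knot
    return value, first, second, third

x = np.linspace(-2.0, 2.0, 9)
value, first, second, third = single_knot_spline(x)
```

Under this assumption the value and the first two derivatives vanish at the knot, and only the third derivative is discontinuous there.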

The spline is the function representing the data with the smoothest possible second derivatives. (7) (8)

     Consider a peak-like bump that is 1 at x=0, extends from -1 to 1, and is zero elsewhere, which we want to fit with a spline.   This B-spline is

     (5)

FIGURE 2  The B-spline of equation 5
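
A bump of this kind can be built from five truncated cubics.  The sketch below assumes uniformly spaced knots at -1, -0.5, 0, 0.5 and 1 and the one-sided term (x_k - x)^3 for x below x_k; both choices are assumptions and need not match equation 5 term by term.  The coefficients are the binomial pattern 1, -4, 6, -4, 1 of a uniform cubic B-spline, scaled so that the value at x=0 is 1.  The name bump_bspline is hypothetical.

```python
import numpy as np

def bump_bspline(x):
    """Cubic B-spline on five uniform knots in [-1, 1], scaled to 1 at x = 0."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    knots = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
    # Fourth-difference coefficients; the factor 2 normalizes the peak to 1
    # for a knot spacing of 0.5.
    coeffs = 2.0 * np.array([1.0, -4.0, 6.0, -4.0, 1.0])
    # One column per knot: (knot - x)^3 where x < knot, else 0.
    trunc = np.clip(knots[None, :] - x[:, None], 0.0, None) ** 3
    return trunc @ coeffs

x = np.linspace(-2.0, 2.0, 17)
y = bump_bspline(x)
```

All five terms are needed for the cubic pieces to cancel exactly outside the interval from -1 to 1.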

Five knots from -1 to +1, a span of approximately twice the FWHM of the peak, are required to produce this.   It is interesting to attempt to produce this result with only three splines.

          A back-to-back cubic spline is

 


             

FIGURE 3  Attempt to produce a B-spline with 3 knots

     The practical value of this is that a cubic spline can reproduce a peak only if it has approximately five knots within a few half-widths of the peak.  In addition, the natural tendency of the splines defined here is to become large for small values of x.  This is also the natural tendency of spectral data.  The only coding needed to take advantage of this reluctance of a spline to produce a peak is the requirement that new knots be introduced at a distance from the old knots.  
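
The knot-spacing requirement can be sketched as a simple acceptance test.  The function name can_add_knot and the parameter min_separation are hypothetical; the text does not state what distance is used, so the value must be supplied by the user, for example a few peak half-widths.

```python
import numpy as np

def can_add_knot(existing_knots, candidate, min_separation):
    """Accept a new knot only if it is far enough from every existing knot."""
    existing = np.asarray(existing_knots, dtype=float)
    if existing.size == 0:
        return True
    return bool(np.min(np.abs(existing - candidate)) >= min_separation)

knots = [0.0, 10.0, 20.0]
accepted = can_add_knot(knots, 14.0, min_separation=3.0)
rejected = can_add_knot(knots, 11.0, min_separation=3.0)
```

With knots kept this far apart, five of them can never crowd into the width of a single peak, so the spline cannot follow the peak.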

     A positive-definite continuum is best fitted by the exponential of this spline.  The hard part of fitting splines to data is the fitting of the knots.  Allowing these to vary, however, significantly decreases the number of constants required to fit the continuum and, more importantly, eliminates the usual curve-fitting problem of oscillations in the fitting function.

     The final detail needed to keep the background below the peaks is the outlier regression, which increases the error estimates for data above the continuum and decreases them for those below.
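
The effect of such asymmetric error estimates can be sketched with an iteratively reweighted least-squares fit.  Everything specific in the sketch is an assumption: the inflation rule in asymmetric_sigma, the parameter below_factor, and the straight-line basis that stands in for the exponential of a spline are illustrative choices, since the text does not give the weighting function.  The data are synthetic and noise-free so that the effect is easy to see.

```python
import numpy as np

def asymmetric_sigma(residual, sigma, below_factor=0.5):
    """Assumed rule: inflate errors above the fit, shrink them below it."""
    z = residual / sigma
    return np.where(z > 0.0, sigma * (1.0 + z**2), sigma * below_factor)

def robust_linear_fit(basis, y, sigma, n_iter=20):
    """Weighted linear least squares, re-estimating the errors each pass."""
    sigma_eff = sigma.copy()
    for _ in range(n_iter):
        w = 1.0 / sigma_eff                       # row scaling = 1/error
        coeffs, *_ = np.linalg.lstsq(basis * w[:, None], y * w, rcond=None)
        sigma_eff = asymmetric_sigma(y - basis @ coeffs, sigma)
    return coeffs

# Synthetic spectrum: sloping background plus one Gaussian peak.
x = np.arange(100.0)
background = 50.0 + 0.2 * x
y = background + 200.0 * np.exp(-0.5 * ((x - 50.0) / 3.0) ** 2)
sigma = np.sqrt(y)                                # counting-statistics errors
basis = np.column_stack([np.ones_like(x), x])

# Ordinary weighted fit for comparison, then the robust fit.
plain, *_ = np.linalg.lstsq(basis / sigma[:, None], y / sigma, rcond=None)
robust = robust_linear_fit(basis, y, sigma)

plain_err = np.max(np.abs(basis @ plain - background))
robust_err = np.max(np.abs(basis @ robust - background))
```

In this sketch the ordinary fit is pulled upward by the peak, while the reweighted fit settles close to the true background because the channels in the peak carry almost no weight.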