# Difference between revisions of "Multifit"

Toby Perring (Talk | contribs) |
Toby Perring (Talk | contribs) |
||

Line 1: | Line 1: | ||

Horace (and the parent utilities library, Herbert) comes with a rich and powerful fitting syntax that is common to the methods used to fit functions or models of S(Q,w) to one or more datasets. The documentation here is only meant to give an introduction and overview. For the full help, please use the Matlab documentation for the various fitting functions that can be obtained by using the <code>doc</code> command, for example <code>doc d1d/multifit</code> (for fitting function like Gaussians to d1d objects) or <code>doc sqw/multifit_sqw</code> (fitting models for S(Q,w) to sqw objects). It is strongly recommended that you use <code>doc</code>, not <code>help</code> to explore how to use these methods, so that you can navigate between the numerous pages of documentation in the Matlab help window. | Horace (and the parent utilities library, Herbert) comes with a rich and powerful fitting syntax that is common to the methods used to fit functions or models of S(Q,w) to one or more datasets. The documentation here is only meant to give an introduction and overview. For the full help, please use the Matlab documentation for the various fitting functions that can be obtained by using the <code>doc</code> command, for example <code>doc d1d/multifit</code> (for fitting function like Gaussians to d1d objects) or <code>doc sqw/multifit_sqw</code> (fitting models for S(Q,w) to sqw objects). It is strongly recommended that you use <code>doc</code>, not <code>help</code> to explore how to use these methods, so that you can navigate between the numerous pages of documentation in the Matlab help window. | ||

+ | |||

+ | ==Introduction to setting up and performing a fit== | ||

+ | All the variants of multifit share a common procedure for setting up and performing a fit - they differ only in the form of the functions which are either functions of the plot coordinates or qh,qk,ql,en. In what follows we refer to <code>multifit</code>, but for this you can equally read <code>multifit_func</code>, <code>multifit_sqw</code>, <code>multifit_sqw</code>, and even the resolution convolution program <code>Tobyfit</code>. | ||

+ | |||

+ | First you have to create a multifit object with the data you want to fit. You can give it any name you like - here we'll use the name <code>kk</code> | ||

+ | <pre> | ||

+ | >> kk = multifit (w1) % w1 is an object (or array of objects) to be fitted | ||

+ | </pre> | ||

+ | |||

+ | <pre> | ||

+ | >> kk = multifit (w1,w2,w3,w4) % several (arrays of) objects to be fitted simultaneously | ||

+ | </pre> | ||

+ | |||

+ | Next you need to set the fitting functions. In this case, let us assume that you are fitting an array of three objects and that you are going to fit Gaussian functions to all three objects simultaneously: | ||

+ | <pre> | ||

+ | >> kk = multifit (my_data) ; | ||

+ | >> kk = kk.set_fun (@gauss); | ||

+ | </pre> | ||

+ | In the Horace installation there is a folder with a selection of fitting functions, including <code>gauss.m </code>. A fit function requires a particular set of input and output arguments. Type <code>help gauss </code> for help for an example of a function, or <code>doc sqw/multifit <code> and click on the links in the help window that is opened up. The way to set the fit functions follows the generic syntax of multifit, which is to set properties of the object using a command of the form <code>myobj = myobj.command(arg1,arg2,...)</code>. | ||

+ | |||

+ | Now we need to provide the starting parameters for the fit. This is a row vector of the numerical values for the parameters, which in the case of <code>gauss </code> is the height, position and standard deviation: | ||

+ | <pre> | ||

+ | >> kk = kk.set_fun ([100,0,10]); % height 100, centred on 0, standard deviation 10 | ||

+ | </pre> | ||

+ | |||

+ | By default, multifit will allow all these parameters to float freely in the fit. However, suppose you want to keep the Gaussian centred on the origin. Then you can provide a list of which parameters are allowed to float (1) or are fixed (0). In this example: | ||

+ | <pre> | ||

+ | >> kk = kk.set_free ([1,0,1]); % keep the second parameter fixed in the fit | ||

+ | </pre> | ||

+ | |||

+ | At this point, you can perform a fit: | ||

+ | <pre> | ||

+ | >> [my_fitted_data, fit_params] = kk.fit; | ||

+ | </pre> | ||

+ | |||

+ | This returns two arguments: <code>my_fitted_data</code> is an array of objects which is the same as the input data except that the signal (or equivalently, the intensity) is set to the values calculated at the fitted parameter values; and <code>fit_params</code>, which is a structure that contains the fitted parameter values, estimated errors on those fitted values, the value of chi-squared for the fit and the covariance matrix of the fitted parameters. | ||

+ | |||

+ | If you want to see how the fit is progressing from one iteration to the next, and also get a listing in the Matlab command window of the final fit parameters, you can ask for more verbose output by changing one of the multifit options: | ||

+ | <pre> | ||

+ | >> kk = kk.set_options ('list',2); % prints highly verbose output to the screen | ||

+ | </pre> | ||

+ | |||

+ | Other options change the fit convergence criteria, and whether or not the final fit is calculated only are data points that remained once points with zero error bars were removed or at all data points. | ||

+ | |||

==Overview== | ==Overview== |

## Revision as of 16:00, 20 January 2019

Horace (and the parent utilities library, Herbert) comes with a rich and powerful fitting syntax that is common to the methods used to fit functions or models of S(Q,w) to one or more datasets. The documentation here is only meant to give an introduction and overview. For the full help, please use the Matlab documentation for the various fitting functions that can be obtained by using the `doc`

command, for example `doc d1d/multifit`

(for fitting function like Gaussians to d1d objects) or `doc sqw/multifit_sqw`

(fitting models for S(Q,w) to sqw objects). It is strongly recommended that you use `doc`

, not `help`

to explore how to use these methods, so that you can navigate between the numerous pages of documentation in the Matlab help window.

## Contents

## Introduction to setting up and performing a fit

All the variants of multifit share a common procedure for setting up and performing a fit - they differ only in the form of the functions which are either functions of the plot coordinates or qh,qk,ql,en. In what follows we refer to `multifit`

, but for this you can equally read `multifit_func`

, `multifit_sqw`

, `multifit_sqw`

, and even the resolution convolution program `Tobyfit`

.

First you have to create a multifit object with the data you want to fit. You can give it any name you like - here we'll use the name `kk`

>> kk = multifit (w1) % w1 is an object (or array of objects) to be fitted

>> kk = multifit (w1,w2,w3,w4) % several (arrays of) objects to be fitted simultaneously

Next you need to set the fitting functions. In this case, let us assume that you are fitting an array of three objects and that you are going to fit Gaussian functions to all three objects simultaneously:

>> kk = multifit (my_data) ; >> kk = kk.set_fun (@gauss);

In the Horace installation there is a folder with a selection of fitting functions, including `gauss.m `

. A fit function requires a particular set of input and output arguments. Type `help gauss `

for help for an example of a function, or `doc sqw/multifit <code> and click on the links in the help window that is opened up. The way to set the fit functions follows the generic syntax of multifit, which is to set properties of the object using a command of the form <code>myobj = myobj.command(arg1,arg2,...)`

.

Now we need to provide the starting parameters for the fit. This is a row vector of the numerical values for the parameters, which in the case of `gauss `

is the height, position and standard deviation:

>> kk = kk.set_fun ([100,0,10]); % height 100, centred on 0, standard deviation 10

By default, multifit will allow all these parameters to float freely in the fit. However, suppose you want to keep the Gaussian centred on the origin. Then you can provide a list of which parameters are allowed to float (1) or are fixed (0). In this example:

>> kk = kk.set_free ([1,0,1]); % keep the second parameter fixed in the fit

At this point, you can perform a fit:

>> [my_fitted_data, fit_params] = kk.fit;

This returns two arguments: `my_fitted_data`

is an array of objects which is the same as the input data except that the signal (or equivalently, the intensity) is set to the values calculated at the fitted parameter values; and `fit_params`

, which is a structure that contains the fitted parameter values, estimated errors on those fitted values, the value of chi-squared for the fit and the covariance matrix of the fitted parameters.

If you want to see how the fit is progressing from one iteration to the next, and also get a listing in the Matlab command window of the final fit parameters, you can ask for more verbose output by changing one of the multifit options:

>> kk = kk.set_options ('list',2); % prints highly verbose output to the screen

Other options change the fit convergence criteria, and whether or not the final fit is calculated only are data points that remained once points with zero error bars were removed or at all data points.

## Overview

Horace provides a set of methods for fitting sqw and d1d,d2d,...d4d objects. which all share the same fitting syntax and capabilities. The various forms of multifit enable you to:

- fit a function to single dataset

- for example, fitting a Gaussian function to a single one-dimensional dataset

- fit a function with a global set of parameters to several datasets simultaneously

- for example, fitting a model for S(Q,w) from spin waves to several function sqw objects to determine the global intensity scale and magnetic exchange constants that best fit the set of data

- fit a global foreground function with local background functions

- for example, in the previous illustration, allowing an independent linear background for each sqw dataset, or even different functional forms for the background for different datasets

- fix one or more parameters in the foreground functions and background functions
- bind the values of pairs of parameters so that they vary in a fixed ratio in the fit

The last functionality can be very useful if you have a model for S(Q,w) where you want to have parameters that apply globally (for example, magnetic exchange constants that define spin wave dispersion) but other parameters that can vary independently for each dataset (for example, the spin wave lifetime) . In this instance, you can define the foreground function to be local, then bind the exchange constants across all datasets with ratio unity. The following multifit variants are available for sqw and d1d,d2d,...d4d objects:

`multifit_func`

(or equivalently`multifit`

)

- the foreground and background functions both are functions of the plot axes x1,x2,...

`multifit_sqw`

- the foreground function(s) are functions of S(Q,w), and the background functions are functions of the plot axes x1,x2,...

`multifit_sqw_sqw`

- the foreground and background function(s) are all functions of S(Q,w)

## Detail

In the discussion below we will start by considering the most basic syntax that can be provided to the function, and will then gradually introduce further complexity. These notes are essentially the same as the in-built help for `multifit`

, albeit with a little more detail. To read this abridged version, type

>> help multifit_sqw

There are several different ways of calling the multifit function, with varying levels of complexity:

>> [wout, fitdata] = multifit_sqw (w, func, pin) >> [wout, fitdata] = multifit_sqw (w, func, pin, pfree) >> [wout, fitdata] = multifit_sqw (w, func, pin, pfree, pbind)

The inputs have a similar form to those used in `fit`

. For the first case the array of objects `w`

are fitted using the function `func`

, with an initial guess of the parameters supplied in the vector or cell array `pin`

. If `pin`

is a vector then all of its elements are fittable parameters. If it is a cell array then its first cell must be a vector of this form, and the remaining cells contain additional information passed to the fit function, that cannot be varied (e.g. magnetic form factor lookup tables).

If no `pfree`

is specified then it is assumed that all fit parameters can be varied. Otherwise `pfree`

must be a vector with the same length as the number of fit parameters, and comprising 1s and 0s (for the parameter being free or fixed respectively).

`pbind`

is an additional argument which is a halfway house between allowing parameters to be free or fixed. It allows you to specify that some parameters must maintain a fixed ratio, but that their absolute values can be varied. For example, you may wish to fix the ratio of two exchange constants, but allow their magnitude to vary in order to find the best fit.

There are several different things you can do with parameter binding, so you must be careful about how you write `pbind`

. The simplest form for `pbind`

is a 2-element cell array, such as `pbind = {1,3}`

, which binds the first and third parameters to a constant ratio, determined by the ratio between the parameters given in `pin`

. You can specify multiple pairs of bindings by saying `pbind = {{1,3},{4,3},{5,6}}`

, so parameters 1 and 3, 4 and 3, and 5 and 6 are bound in the ratios of the initial guesses. You can also explicitly provide the ratios for the bindings (rather than using the ratio of the initial guess) by writing, for example, `pbind = {{1,3,0,7.4},{4,3,0,0.023}}`

in which case parameter 1 and 3 are bound in the ratio 7.4, and 4 and 3 are bound with a ratio 0.023. Notice the extra zero – this is required so that the program knows what you are trying to do. Finally, you can choose to bind parameters in the fit function to parameters in the separate background function(s). The syntax for this would be, for example, `pbind = {1,3,[1,2],3.1}`

- whereby parameter 1 is bound to background parameter 3 for the 2nd background function, and the ratio is 3.1.

>> [wout, fitdata] = multifit_sqw (..., bkdfunc, bpin) >> [wout, fitdata] = multifit_sqw (..., bkdfunc, bpin, bpfree) >> [wout, fitdata] = multifit_sqw (..., bkdfunc, bpin, bpfree, bpbind)

As mentioned above, when using multifit the background can be fitted separately. You can either specify a different background function for each object to be fitted, or use a single function. In each case the backgrounds are not fitted globally, so the background parameters for each fit can be different, even though the model fit is the same. The syntax for the background fitting is a little more complicated than for the main fitting, essentially because there are more permutations for keeping parameters fixed and/or bound.

There are two options for specifying the background function `bkdfunc`

. You either specify a single function handle (in the same way as you did for the fitting function), or you provide a cell array of function handles. In the former case the same background function (but with different parameters) is used for every object to be fitted. In the latter case the cell array must have the same number of elements as there are objects to fit. Each element is then a function handle (e.g. `@bg_func`

) which specifies the function that will be used to model the background for each object to be fitted. This means that even if you only wish to use a different background function for one of the objects to be fitted, you must give a cell array of function handles (for that case all of the function handles would be the same, except for one).

For specifying the initial guesses for the parameters passed to the background function(s) the syntax is also pretty similar to that used for the fitting function. If only a single background function is to be used then the parameters `bpin`

can be specified as either a vector or a cell array. If the former then the elements of the vector are the parameters required by the background function. If the latter, then the first element of the cell array should be a vector of parameters, and the other elements are extra parameters (e.g. lookup tables, etc) that are required by the background function. If more than one background function is being used, or you wish to supply a different set of initial guesses for different objects to be fitted, then `bpin`

must be a cell array with the same number of elements as there are objects to be fitted. The form of each element of the cell array would then be the same as that described above, i.e. a vector of parameters, or a cell array of parameters (so you could have a cell array of cell arrays…).

You can specify which parameters to vary or keep fixed for the background in a similar way to how you do it for the fitting function. You can either specify a vector of 1s and 0s if only one background function is being used. If more than one background function is used, or you do not wish to vary the parameters for every object, you can specify a cell array which has a vector of 1s and 0s in each element.

Finally for the most complicated part – specifying the bindings for the background. You should be especially careful here, as there could be cell arrays all over the place, and the precise format depends on a variety of factors. The way to lay this out is best illustrated by a series of examples of single binding descriptions:

1) `bpbind = {1,4}`

. The background parameter `bp(1)`

is bound to background parameter `bp(4)`

with a fixed ratio, determined by the initial guess. This binding will apply for all objects and for all background functions.
2) `bpbind = {2,3,[7,2]}`

. bp(2) is bound to bp(3) for background function bkdfunc{7,2} only, in the case where multiple functions have been specified. For all other background functions there is no binding. The ratio is determined from the initial guesses.
3) `bpbind = {5,11,0}`

. bp(5) is bound to parameter p(11) of the global fitting function. This binding will apply for all objects and all background functions. The ratio is determined by the initial guess.
4) `bpbind = {{1,4},{2,3,[7,2]},{5,11,0}}`

. Several different bindings (see above) are defined.
5) `bpbind = {5,11,0,0.013}`

. Background parameter bp(5) is bound to global parameter p(11), with a ratio explicitly given as 0.013, overriding the initial guess.
6) `bpbind = {1,4,[7,2],14.15}`

. bp(1) is bound to bp(4) for bg_func(7,2) only, with a ratio of 14.15, which overrides the ratio determined from the initial guess.

As an illustration of the need for care, consider the following:

bpbind1 = { {1,4}, {2,3,[7,2]}, {5,11,0} } bpbind2 = { { {1,4}, {2,3,[7,2]}, {5,11,0} } }

`bpbind1`

specifies the bindings for 3 different backgrounds. `bpbind2`

specifies a set of bindings applicable to all backgrounds.

>> [wout, fitdata] = multifit_sqw (…,keyword, value)

Keywords relating to which parts of the data to fit, what information to display in the Matlab command window whilst the fit is running, etc, can be chosen. The keywords and values are the same as for the functions `fit`

and `fit_sqw`

.

The outputs from `multifit`

are as follows. `wout`

is an array of dnd / sqw objects that matches the input, but the intensity values are the results of the fitting. `fitdata`

is a structure array with the following fields:

1) fitdata.p – the global parameter values. 2) fitdata.sig – the errors on the global parameters (=0 if a parameter was fixed). 3) fitdata.bp – the background parameters. This is either a vector, or a cell array of vectors, depending on how the input was arranged. 4) fitdata.bsig – the errors on the background parameters, in either a vector or a cell array of vectors. 5) fitdata.corr – the correlation matrix for the free parameters. 6) fitdata.chisq – the value of the reduced Chi^2 of the fit, i.e. divided by (number of data points minus number of free parameters). 7) fitdata.pnames – parameter names (if they were specified in the fit function). 8) fitdata.bpnames – background parameter names (if they were specified in the background function(s)).

## Examples

Let us illustrate how multifit is actually used by means of a series of examples. For all of the following we shall assume that we wish to fit 3 two-dimensional sqw objects comprising constant energy slices in the (h,0,0),(0,k,0) plane, and that we wish to fit a quartet of Gaussian profile peaks, much as we did in the Getting Started section of this manual.

### No background functions

In this case we allow all of the parameters to vary, and we bind the second and third parameters to a ratio of 1.

win(1)=first_obj; win(2)=second_obj; win(3)=third_obj; pin=[6 1 1 0.1 1.25 6]; pfree=ones(size(pin{1})); pbind={2,3,0,1}; [wout, fitdata] = multifit_sqw (win, @demo_4gauss, pin, pfree, pbind);

### Simple background functions

In this case we will have the same as above, but will add a single background function that we allow to vary between the objects to be fitted. We supply different initial guesses for the background parameters, and we bind the third background parameter to the 1st background parameter in the ratio specified by the initial guess.

pin=[6 1 1 0.1 1.25 6]; pfree=ones(size(pin{1})); pbind={2,3,0,1}; bpin=cell(1,3); bpin{1}=[1 1 0]; bpin{2}=[1 1.5 0.2]; bpin{3}=[2 -0.1 1]; bpfree=[1,1,1]; bpbind={3,1}; [wout, fitdata] = multifit_sqw (win, @demo_4gauss, pin, pfree, pbind, @sloping_background,... bpin, bpfree, bpbind);

### The Fully Monty

Let us now demonstrate a more complicated use of multifit. In this case we will have a different background function for each object. The third of these background functions will require additional information passed to it (a lookup table), necessitating the use of a nested cell array. We will bind the third parameter to the first parameter of the first background function, and likewise for the second background function. Finally we will also bind the second parameter of the background functions to the first parameter of the fitting function.

pin=[6 1 1 0.1 1.25 6]; pfree=ones(size(pin{1})); pbind={2,3,0,1}; bg_funcs=cell(1,3); bg_funcs{1}=@constant_background; bg_funcs{2}=@sloping_background; bg_funcs{3}=@special_background; bpin=cell(1,3); bpin{1}=[1 1 0]; bpin{2}=[1 1.5 0.2]; bg_cell=cell(1,2) bg_cell{1}=[2 -0.1 1]; bg_cell{2}='C:\mprogs\lookup_table.dat'; bpin{3}=bg_cell; bpfree=cell{1,3}; bpfree{1}=[1,1,1]; bpfree{2}=[1,1,0]; bpfree{3}=[0,1,0]; bpbind={{3,1,[1,1]},{3,1,[1,2]},{2,1,0}}; [wout, fitdata] = multifit_sqw (win, @demo_4gauss, pin, pfree, pbind, bg_funcs,... bpin, bpfree, bpbind);

## Semi-global fits

Horace allows you to fit several objects independently (i.e. sequentially) using the `fit_sqw`

and `fit`

functions. You can also fit several objects with a global set of parameters using `multifit`

. However there is a third thing you might wish to do, halfway between these two, which is to fit a model where some of the parameters are global but others are object-specific. For example, you may wish to fit several 1d cuts which show some magnetic scattering, but are taken in different Brillouin zones so that the intensity scaling is very different. If the magnetic form-factor alone does not describe the intensity differences you may wish to have a different intensity scale for each cut, but to keep other parameters global. Previously there is no straightforward way of doing this in `multifit`

, however it is now possible to do this using the function `multifit_sqw_sqw`

.

`multifit_sqw_sqw`

allows you to have sqw models for both the spectral function and the background function. You can then bind parameters in the *background* function to parameters in the spectral function, i.e. the background function becomes the spectral function.

In order to make some parameters global, and others object-specific, you can be clever with the background bindings. Let us illustrate this with an example…

Suppose the spectral weight can be specified by a model which has I0, k0 and G as its input parameters, and that the background is just a constant B. We wish to allow I0 and B to be different for different objects, but k0 and G to be global. We specify a spectral function with the handle `@null_func`

, the Matlab code for which would be something like:

function weight = null_func(qh,qk,ql,en,p) weight=zeros(size(qh));

We then write a Matlab function called `real_func`

which takes I0, k0, G and B as input parameters and calculates the spectral weight plus background. We then use multifit as follows:

[wfit,fitdata] = multifit_sqw_sqw( win,@null_func,[k0_guess,G_guess],[1,1],… @real_func,[I0_guess,k0_guess,G_guess,B_guess], [1,1,1,1],… { { {2,1,0,1}, {3,2,0,1} } } );

The binding argument we have specified applies to all backgrounds. We have bound parameter 2 of real_func to parameter 1 of null_func, in the ratio 1:1; and we have bound parameter 3 of real_func to parameter 2 of null_func, in the ratio 1:1.

When this code is executed the actual fitting all happens with real_func, since null_func does not do anything, however by passing k0 and G to null_func we can ensure that they are global parameters. Of course if your model is complicated then writing the function real_func might be rather involved, but it can be done if you have the patience!