
Global 2DSA Fitting Algorithm:

When we fit several datasets globally using 2DSA, the basic goal is to find a single set of solutes that can explain all included datasets. In addition to the so-called "superglobal" model, we will produce two additional fits, each a variation of the superglobal model, as explained below.

When we fit multiple datasets by 2DSA, the first requirement is that all datasets are scaled to the SAME TOTAL CONCENTRATION. The total concentration is the sum of all solutes in a standard 2DSA fit. Presumably, all datasets have already been taken through the 3-step refinement process, so ti and ri noise have already been subtracted out, as well as any baselines. All datasets should start at zero concentration.

So the first step in a global experiment is to do a quick 2DSA run on each individual dataset and form the sum over all components found in that single-dataset fit. This sum is the total concentration of that dataset; store it in memory until the very end of the global analysis, because it is needed again then. For now, normalize each dataset by dividing every datapoint by this value. As a result, each dataset will now have a total concentration of 1.0000.
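A minimal sketch of this normalization step in Python/NumPy, assuming the datasets are already noise- and baseline-corrected 2-D arrays and that a caller-supplied routine (here called fit_2dsa, a hypothetical name) performs the quick individual 2DSA fit and returns the solute amplitudes:

    import numpy as np

    def normalize_datasets(datasets, fit_2dsa):
        """Normalize each dataset to a total concentration of 1.0.

        datasets : list of 2-D arrays (scans x radial points)
        fit_2dsa : hypothetical helper running a quick single-dataset 2DSA
                   fit and returning the solute amplitudes
        Returns the normalized datasets and the original total
        concentrations, which must be kept for the scaled models later.
        """
        totals = []
        normalized = []
        for data in datasets:
            amplitudes = fit_2dsa(data)        # quick individual 2DSA run
            total = float(np.sum(amplitudes))  # total concentration = sum of solutes
            totals.append(total)
            normalized.append(data / total)    # every datapoint divided by the total
        return normalized, totals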

Next we need to fit the first global model. This is the SUPERGLOBAL model. To find it, build the A matrix by simply concatenating all datasets as additional rows. Since they are ALL normalized to 1.0000, they can all be fitted with the same amplitude for each solute. So the structure of the Ax=b linear system for n globally fitted datasets and m solutes is:

   |a11|    |a21|        |am1| |b1|
   |a12|    |a22|        |am2| |b2|
x1*|a13|+x2*|a23|+...+xm*|am3|=|b3|
   |...|    |...|        |...| |..|
   |a1n|    |a2n|        |amn| |bn|

Each aij of course stands for an entire experiment, i.e., a block containing however many radial and time points dataset j has, simulated for solute i.

The x vector obtained here is our SUPERGLOBAL model. The superglobal model should be scaled to the weighted average of the total concentrations of all included datasets.
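A sketch of the superglobal fit using SciPy's non-negative least squares. The simulator routine design_columns (a hypothetical name) is assumed to return, for one dataset, one column of simulated signal per candidate solute; weighting the average total concentration by dataset size is likewise an assumption:

    import numpy as np
    from scipy.optimize import nnls

    def superglobal_fit(normalized, totals, solutes, design_columns):
        """Fit one amplitude per solute across all normalized datasets."""
        # Stack the datasets row-wise: the same amplitude vector x must
        # explain every experiment at once.
        A = np.vstack([design_columns(d, solutes) for d in normalized])
        b = np.concatenate([d.ravel() for d in normalized])
        x, _rnorm = nnls(A, b)               # superglobal amplitudes, sum ~ 1.0

        # Rescale the model to the (size-weighted) average of the original
        # total concentrations saved during normalization.
        weights = np.array([d.size for d in normalized], dtype=float)
        mean_total = np.average(np.asarray(totals, dtype=float), weights=weights)
        return x * mean_total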

This particular model is useful for determining whether the included datasets are identical. If they were all the same sample measured at the same concentration, say from a single batch, you would expect a perfect fit to every dataset unless there is an issue with the instrument or the cell alignment. So if you get bad residuals in one or more datasets, and you did in fact load the same sample n times, this points to a problem with your instrument.
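For example, the residuals can be checked per dataset under the superglobal model; one clearly larger RMSD flags the problem dataset (a sketch, reusing the hypothetical design_columns helper from above):

    import numpy as np

    def per_dataset_rmsd(normalized, solutes, x_superglobal, design_columns):
        """RMSD of each normalized dataset against the superglobal model."""
        x_unit = np.asarray(x_superglobal) / np.sum(x_superglobal)  # back to total 1.0
        rmsds = []
        for d in normalized:
            model = design_columns(d, solutes) @ x_unit
            resid = d.ravel() - model
            rmsds.append(float(np.sqrt(np.mean(resid ** 2))))
        return rmsds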

The next model, let's call it the scaled model, tests something different: whether the sample is non-interacting. If the sample is non-interacting, ANY concentration would have the same ratio of individual solutes, and each dataset is simply a scaled version of the superglobal model, adjusted to the total concentration of that dataset. To obtain the scaled models, simply multiply the superglobal model by the corresponding total concentration obtained earlier and saved in memory.
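In code this step is just a rescaling of the superglobal composition to each stored total concentration, e.g.:

    import numpy as np

    def scaled_models(x_superglobal, totals):
        """One model per dataset: the superglobal composition rescaled to the
        dataset's original total concentration."""
        x_unit = np.asarray(x_superglobal) / np.sum(x_superglobal)  # total = 1.0
        return [x_unit * total for total in totals]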

The third model, let's call it the variable ratio model, tests whether the system is reversibly self-associating. If that were the case, the RATIO of one solute to another would change when the concentration changes, since mass action will increase or decrease the oligomeric species present based on concentration. In this case, you would make one more fit: it simply uses all solutes obtained in the superglobal model as the fitting grid and lets NNLS adjust the amplitudes of each solute to their most optimal values.
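A sketch of one reading of this step: the superglobal solutes become a fixed grid, and NNLS is run once per (unnormalized) dataset so that the amplitudes, and therefore the solute ratios, are free to differ between datasets; design_columns is the same hypothetical simulator as above:

    import numpy as np
    from scipy.optimize import nnls

    def variable_ratio_models(datasets, superglobal_solutes, design_columns):
        """One NNLS solve per original dataset over the fixed solute grid
        found by the superglobal model."""
        models = []
        for data in datasets:
            A = design_columns(data, superglobal_solutes)
            x, _rnorm = nnls(A, data.ravel())
            models.append(x)                 # ratios may differ between datasets
        return models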

If the system doesn't fit well to any of the above models, then the included datasets cannot be described by a common set of solutes, and none of these global models applies.

I propose to name the superglobal models 2dsa-sg, the scaled models 2dsa-sc, and the variable ratio models 2dsa-vr.

Last modified on Apr 26, 2016 10:45:22 PM