Opened 7 years ago

Last modified 7 years ago

#375 assigned enhancement

PCSA Tikhonov Regularization

Reported by: emre Owned by: gegorbet
Priority: major Milestone: future
Component: ultrascan3 Version:
Keywords: review Cc:

Description (last modified by emre)

NNLS solves min |Ax-b|_2
For Tikhonov Regularization (TR)
you additionally want to minimize |x|_2
Add a variable regularization factor "alpha"
After normal setup for NNLS
The matrix A will have the simulations in each column and b will have the experimental data.
Identify the simulations with numbers, 1,2,...,c
create the c by c identity matrix multiplied by alpha and place below A, creating an augmented matrix.
augment b with c zeros

e.g.

A = [ sim1, sim2, ... , simc ] where each sim is a column vector
A_augmented =
[

sim1, sim2, ..., simc
alpha, 0, ..., 0
0, alpha, 0, ..., 0
...
0, 0, ..., alpha

]

b_augmented =
[

b
0
0
0
...
0

]

Place this system into NNLS
(this will not work with subgrids* [we could later, but this will require some careful work], you must solve the one big NNLS system,
that's why I recommend for PCSA right now only).

Once you get the results,
you will want to vary alpha
and produce the parametric (on alpha) plot of
|x|=x_02 + x_12 + ... + x_c2 vs rmsd2.
This will often give an L shaped curve, the sharpness of which will vary from problem-to-problem. The idea is to minimize the rmsd while simultaneously minimizing the norm of x.
The regularization value at the elbow of the L is called the L curve criterion and is frequently chosen as the optimal value of alpha.
This second bit of determining alpha should be available as an option and particularly as we are getting a handle on good alphas.
It may turn out that alpha falls in a good range, but I feel this will likely be somewhat experiment dependent.
Note: alpha=0 is like doing no regularization.
alpha = 1 is probably a reasonable maximum.
I would suggest starting with alpha = 0, .1, .2, ..., 1 and plot th e parametric curve described above.

Let me know if you have any questions.
-e.

Change History (9)

comment:1 Changed 7 years ago by emre

  • Description modified (diff)
  • Owner changed from dzollars to gegorbet

comment:2 follow-up: Changed 7 years ago by emre

  • Description modified (diff)

comment:3 in reply to: ↑ 2 ; follow-up: Changed 7 years ago by gegorbet

Replying to emre:
The method for TR with one specific alpha is understood. The initial questions are about how this fits into PCSA. Currently, PCSA first computes and orders results for a fixed set of PC variations. Next, it uses Levenberg-Marquardt to refine the best result from stage one. Would the TR method be a stage three? Would the B matrix be constructed from the final stage-two (LM) result?

As a test of the basic NNLS-TR coding, can it be assumed that any arbitrary alpha (say, 0.4) would give better results than non-regularization (alpha=0)?

comment:4 Changed 7 years ago by gegorbet

  • Status changed from new to assigned

comment:5 in reply to: ↑ 3 ; follow-up: Changed 7 years ago by emre

Replying to gegorbet:

I initially envisioned the usage of TR uniformly (i.e. at every NNLS stage). This could be optional and then you run an additional final NNLS with TR, but I think the best solution would be to do it at every NNLS instance and leave the rest of the code alone.
This because you will get different RMSDs at every NNLS with TR on, and the minimum may move. This is not certain, and there may be
some computational efficiency to postpone TR until the last instance (as a 3rd stage), but it seems possible you may get a different regularized solution.

No, the RMSD will go UP with the additional constraint on the magnitude of NNLS result vector x. What is gained is smoothness
in the vector x, the sharp peaks will be penalized and this is
the effect of TR.

Would the B matrix be constructed from the final stage-two (LM) result?

b is a vector (containing the experimental data augmented with zeros when TR is on).
A is a matrix (containing the simulations augmented with the identity matrix multiplied by alpha when TR is on).
To do a stage three computation (only done when TR is not used for NNLS during stage one and two), you would take the standard b vector, augment with zeros, take best last PCSA A matrix, augument with the identity time alpha and run NNLS again. Your resulting x vector of concentrations would be different.

comment:6 in reply to: ↑ 5 Changed 7 years ago by gegorbet

Replying to emre:

Replying to gegorbet:

I initially envisioned the usage of TR uniformly (i.e. at every NNLS stage). This could be optional and then you run an additional final NNLS with TR, but I think the best solution would be to do it at every NNLS instance and leave the rest of the code alone.
This because you will get different RMSDs at every NNLS with TR on, and the minimum may move. This is not certain, and there may be
some computational efficiency to postpone TR until the last instance (as a 3rd stage), but it seems possible you may get a different regularized solution.

Wouldn't that mean that you would eventually have to construct an outer loop of alpha values and the whole process get repeated for each one? This is going to take a long time. Also: won't we have to change the external routine for returning "fitness" for L-M? Replace low-RMSD with low-X-Norm+low-RMSD?

No, the RMSD will go UP with the additional constraint on the magnitude of NNLS result vector x. What is gained is smoothness
in the vector x, the sharp peaks will be penalized and this is
the effect of TR.

Wow! I don't quite understand this. Minimizing RMSD is no longer the goal? A solution which has a larger RMSD might be regarded as a "better" one? If you actually have just two macro-molecules of interest, you don't want to see sharp peaks representing them?

Would the B matrix be constructed from the final stage-two (LM) result?

b is a vector (containing the experimental data augmented with zeros when TR is on).
A is a matrix (containing the simulations augmented with the identity matrix multiplied by alpha when TR is on).
To do a stage three computation (only done when TR is not used for NNLS during stage one and two), you would take the standard b vector, augment with zeros, take best last PCSA A matrix, augument with the identity time alpha and run NNLS again. Your resulting x vector of concentrations would be different.

I, or course, meant "A". That is, I meant to ask "Would the A matrix be constructed from stage-two result?". And you have answered my question.

comment:7 Changed 7 years ago by gegorbet

Tikhonov regularization has been implemented in PCSA. It has just been committed. It needs a bit more use and independent verification of its correctness and utility.

comment:8 Changed 7 years ago by gegorbet

  • Keywords review added

A number of recent refinements to regularization have been implemented (revision 1594). The PCSA program is likely production ready in this regard.

This ticket is review-ready.

comment:9 Changed 7 years ago by demeler

The program is basically working, but the alpha scan returns values that are of the same magnitude in both UV absorbance and FDS data, but the regularization seems very different. It appears that the alpha value may be off by 20-100 fold in order to get qualitatively similar regularization results when comapring uvabs and fds data. Right now, when I use the default run parameters for scanning alpha, I get values between 0.25-0.35, regardless of optical system. So the magnitude is identical, but the regularization level is very different. Strangely, both of these values seem to come from the elbow of the regularization scan curve. I think it would be nice if we could resolve this discrepancy before calling this program done.

Note: See TracTickets for help on using tickets.