ML IALGO LINREG: Difference between revisions

From VASP Wiki
Line 1: Line 1:
{{TAGDEF|ML_IALGO_LINREG|[integer]|1}}
{{TAGDEF|ML_IALGO_LINREG|[integer]|1}}


Description: This tag determines which algorithm is employed to solve the system of linear equations in the ridge regression method for machine learning.
Description: This tag determines the algorithm that is employed to solve the system of linear equations in the ridge regression method for machine learning.
----
----


In the ridge regression method for machine learning one needs to solve for the unknown weights <math>\mathbf{w}</math> within
In the ridge regression method for machine learning one needs to solve for the unknown weights <math>\mathbf{w}</math> minimizing


<math>
<math>
\mathbf{Y} = \mathbf{\Phi} \mathbf{w}.
|| \mathbf{Y} - \mathbf{\Phi} \mathbf{w} || \rightarrow \mbox{min}
</math>
</math>


Line 13: Line 13:


The following options are available to solve for <math>\mathbf{w}</math>:
The following options are available to solve for <math>\mathbf{w}</math>:
*{{TAG|ML_IALGO_LINREG}}=1: Bayesian linear regression (see [[Machine learning force field: Theory#Bayesian error estimation|here]]). Recoomended for {{TAG|NSW}}<math>\ge</math>1. Usable with on-the-fly learning.
*{{TAG|ML_IALGO_LINREG}}=1: Bayesian linear regression (see [[Machine learning force field: Theory#Bayesian error estimation|here]]). Recommended for {{TAG|NSW}}<math>\ge</math>1. Usable with on-the-fly learning.
*{{TAG|ML_IALGO_LINREG}}=2: QR factorization. Usable with {{TAG|NSW}}=0,1.
*{{TAG|ML_IALGO_LINREG}}=2: QR factorization. Usable with {{TAG|NSW}}=0.
*{{TAG|ML_IALGO_LINREG}}=3: Singular value decomposition. Usable with {{TAG|NSW}}=0,1.
*{{TAG|ML_IALGO_LINREG}}=3: Singular value decomposition. Usable with {{TAG|NSW}}=0.
*{{TAG|ML_IALGO_LINREG}}=4: Singular value decomposition with Tikhonov regularization. Usable with {{TAG|NSW}}=0,1.
*{{TAG|ML_IALGO_LINREG}}=4: Singular value decomposition with Tikhonov regularization. Usable with {{TAG|NSW}}=0.


In the current implementation the on-the-fly learning algorithm requires to have a probability model included within the regression. So only the Bayesian linear regression method ({{TAG|ML_IALGO_LINREG}}=1) is usable with this option. All other methods should be used only in a single step calculation ({{TAG|NSW}}=0) to refine the force-field after the force field was trained with {{TAG|ML_IALGO_LINREG}}=1. {{TAG|ML_IALGO_LINREG}}=3  is the most tested for this purpose up to now. It should be also noted that this method is also computationally more demanding that the Bayesian linear regression.
For on-the-fly learning, Bayesian regression ({{TAG|ML_IALGO_LINREG}}=1) is strictly necessary, since uncertainty estimates are only available for Bayesian regression.

All other methods are used to read an existing {{TAG|ML_AB}} database file and create a final {{TAG|ML_FFN}} force field file (postprocessing). During on-the-fly training, the database file {{TAG|ML_ABN}} is created, and before postprocessing the user needs to copy the {{TAG|ML_ABN}} file to the {{TAG|ML_AB}} file. Among the postprocessing methods, {{TAG|ML_IALGO_LINREG}}=3 is the best tested approach and we use it routinely before employing a machine-learned force field. Note that this postprocessing step is computationally somewhat more demanding than the Bayesian linear regression, but it typically still requires only between a few minutes and an hour, so the extra cost is usually negligible compared to the original training.


== Related Tags and Sections ==
== Related Tags and Sections ==

Revision as of 07:11, 3 February 2022

ML_IALGO_LINREG = [integer]
Default: ML_IALGO_LINREG = 1 

Description: This tag determines the algorithm that is employed to solve the system of linear equations in the ridge regression method for machine learning.


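In the ridge regression method for machine learning one needs to solve for the unknown weights <math>\mathbf{w}</math> minimizing

<math>
|| \mathbf{Y} - \mathbf{\Phi} \mathbf{w} || \rightarrow \mbox{min}
</math>

For more details please see here.

The following options are available to solve for <math>\mathbf{w}</math>:
*ML_IALGO_LINREG=1: Bayesian linear regression (see here). Recommended for NSW<math>\ge</math>1. Usable with on-the-fly learning.
*ML_IALGO_LINREG=2: QR factorization. Usable with NSW=0.
*ML_IALGO_LINREG=3: Singular value decomposition. Usable with NSW=0.
*ML_IALGO_LINREG=4: Singular value decomposition with Tikhonov regularization. Usable with NSW=0.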
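
The four kinds of solver selected by this tag (Bayesian linear regression, QR factorization, singular value decomposition, and singular value decomposition with Tikhonov regularization) can be illustrated with a small NumPy sketch. This is not the VASP implementation: the function names, the regularization parameter <code>lam</code>, the variances <code>sigma_v2</code> and <code>sigma_w2</code>, the cutoff <code>rcond</code>, and the random test data are all assumptions chosen for illustration only.

```python
import numpy as np


def solve_qr(phi, y):
    """Least squares via QR factorization: phi = Q R, then solve R w = Q^T y."""
    q, r = np.linalg.qr(phi)  # reduced QR; r is square and upper triangular
    return np.linalg.solve(r, q.T @ y)


def solve_svd(phi, y, rcond=1e-12):
    """Least squares via SVD, dropping singular values below rcond * s_max."""
    u, s, vt = np.linalg.svd(phi, full_matrices=False)
    inv_s = np.zeros_like(s)
    keep = s > rcond * s[0]
    inv_s[keep] = 1.0 / s[keep]
    return vt.T @ (inv_s * (u.T @ y))


def solve_svd_tikhonov(phi, y, lam):
    """Minimize ||y - phi w||^2 + lam ||w||^2 using SVD filter factors."""
    u, s, vt = np.linalg.svd(phi, full_matrices=False)
    filt = s / (s**2 + lam)
    return vt.T @ (filt * (u.T @ y))


def solve_bayesian(phi, y, sigma_v2, sigma_w2):
    """Posterior mean of w for Gaussian noise (variance sigma_v2)
    and a Gaussian prior on the weights (variance sigma_w2)."""
    n = phi.shape[1]
    a = phi.T @ phi / sigma_v2 + np.eye(n) / sigma_w2
    return np.linalg.solve(a, phi.T @ y / sigma_v2)


# Hypothetical, noise-free test problem: 50 equations, 5 unknown weights.
rng = np.random.default_rng(0)
phi = rng.normal(size=(50, 5))
w_true = rng.normal(size=5)
y = phi @ w_true

w_qr = solve_qr(phi, y)
w_svd = solve_svd(phi, y)
w_tik = solve_svd_tikhonov(phi, y, lam=0.5)
w_bay = solve_bayesian(phi, y, sigma_v2=0.5, sigma_w2=1.0)
```

In this sketch the Bayesian posterior mean coincides with the Tikhonov-regularized solution when <code>lam = sigma_v2 / sigma_w2</code>. The practical difference is that the Bayesian formulation also supplies a probability model, which is what provides the uncertainty estimates needed during on-the-fly learning.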

For on-the-fly learning, Bayesian regression (ML_IALGO_LINREG=1) is strictly necessary, since uncertainty estimates are only available for Bayesian regression.

All other methods are used to read an existing ML_AB database file and create a final ML_FFN force field file (postprocessing). During on-the-fly training, the database file ML_ABN is created, and before postprocessing the user needs to copy the ML_ABN file to the ML_AB file. Among the postprocessing methods, ML_IALGO_LINREG=3 is the best tested approach and we use it routinely before employing a machine-learned force field. Note that this postprocessing step is computationally somewhat more demanding than the Bayesian linear regression, but it typically still requires only between a few minutes and an hour, so the extra cost is usually negligible compared to the original training.
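
The file handling for the postprocessing step can be sketched as a small Python helper. The function name <code>stage_refit</code>, the directory layout, and the placeholder file contents are hypothetical. The returned lines are only the settings discussed on this page plus ML_LMLFF (assumed here to be required to switch on the machine learning mode); a complete INCAR needs further tags, such as an appropriate ML_ISTART value, which are deliberately not guessed here.

```python
import shutil
import tempfile
from pathlib import Path


def stage_refit(run_dir, refit_dir, ialgo=3):
    """Copy ML_ABN from a finished on-the-fly run to ML_AB in a new directory
    and return the INCAR lines relevant to this page (not a full INCAR)."""
    run_dir, refit_dir = Path(run_dir), Path(refit_dir)
    refit_dir.mkdir(parents=True, exist_ok=True)
    # The postprocessing methods read ML_AB, while training writes ML_ABN.
    shutil.copyfile(run_dir / "ML_ABN", refit_dir / "ML_AB")
    return [
        "ML_LMLFF = .TRUE.",           # assumption: enables machine learning
        f"ML_IALGO_LINREG = {ialgo}",  # 3 = singular value decomposition
        "NSW = 0",                     # single step calculation
    ]


# Demonstration with a placeholder file instead of a real database.
with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp) / "training"
    run.mkdir()
    (run / "ML_ABN").write_text("placeholder database contents\n")
    fragment = stage_refit(run, Path(tmp) / "refit")
    copied = (Path(tmp) / "refit" / "ML_AB").read_text()
```

The helper returns the fragment rather than writing an INCAR so that it can be merged by hand into the settings of the original training run.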

Related Tags and Sections

ML_LMLFF, ML_W1, ML_WTOTEN, ML_WTIFOR, ML_WTSIF, ML_ISTART

Examples that use this tag