ML_ICRITERIA

From VASP Wiki
Revision as of 15:18, 17 September 2022 by Kresse (talk | contribs)

ML_ICRITERIA = [integer]
Default: ML_ICRITERIA = 1 

Description: Decides whether (ML_ICRITERIA>0) and, if so, how the Bayesian error threshold (ML_CTIFOR) is updated within the machine-learning force-field method. ML_CTIFOR determines whether a first-principles calculation is performed.


The use of this tag in combination with the learning algorithms is described here.

The following options are possible for ML_ICRITERIA:

  • ML_ICRITERIA = 0: No update of the threshold ML_CTIFOR is performed. We recommend using this mode only to refine an existing force field. For instance, if you know that in previous runs ML_CTIFOR settled at a value of 0.03, you might continue acquiring training data with the threshold now fixed to ML_CTIFOR=0.03, in order to catch all outliers and areas of the potential-energy surface where first-principles data are still missing. To obtain highly robust force fields, we recommend running for, say, NSW=100000 (one hundred thousand steps) in this mode at the highest temperature to be considered (or slightly above it).
  • ML_ICRITERIA = 1: Set ML_CTIFOR to a value proportional to the average of the Bayesian errors over the last ML_MHIS steps. For ML_ICRITERIA = 1, the average is calculated only over the errors recorded directly after updates of the force field. Such updates occur rather rarely, hence updates of ML_CTIFOR are also fairly infrequent in this mode. Furthermore, since first-principles calculations are performed only for configurations with large Bayesian errors ("outliers"), updates of the force field likewise occur only after outliers have been encountered. The Bayesian errors that enter the averaging are therefore typically larger than the average Bayesian error in this mode. It is thus recommended to set ML_CX to 0 in this mode (the default).
  • ML_ICRITERIA = 2: Update the criterion using a gliding average of all previous Bayesian errors. This mode averages the error over all previous predictions (that is, every previously considered MD step), whereas ML_ICRITERIA = 1 averages only over predictions immediately after re-training. The history length in this mode is currently hard-coded to 400 steps (or ML_MHIS x 50 in newer versions). This mode tends to continue sampling and is thus somewhat prone to oversampling: as the Bayesian errors decrease, the threshold is continuously lowered and further first-principles calculations are initiated. Recommended values for ML_CX are about 0.1-0.2 in this mode. For a value around ML_CX = 0.2, a first-principles calculation is typically performed every 50 steps. This means that if the number of ionic steps is set to, say, NSW=50000, about 1000 first-principles calculations are performed. This results in a fairly good and robust database for machine learning for many materials.
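For orientation, an INCAR fragment for an on-the-fly training run with the gliding-average criterion might look as follows. Only ML_ICRITERIA, ML_CX, and NSW follow from the text above; the tag used to switch on on-the-fly learning (ML_ISTART here, ML_MODE in newer versions) is version dependent and shown only as an assumption:

```
ML_LMLFF     = .TRUE.   ! enable machine-learned force fields
ML_ISTART    = 0        ! on-the-fly training from scratch (version dependent; ML_MODE in newer versions)
ML_ICRITERIA = 2        ! gliding-average update of ML_CTIFOR
ML_CX        = 0.2      ! recommended range 0.1-0.2 for this mode
NSW          = 50000    ! roughly 1000 first-principles calculations at these settings
```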

As already hinted above, the tag ML_CX allows one to fine-tune the update of ML_CTIFOR. Whether to use ML_ICRITERIA = 1 or ML_ICRITERIA = 2 is a matter of taste. Just recall that ML_CX must be set differently for the two modes: whereas a good default for ML_ICRITERIA = 1 is ML_CX = 0.0, a sensible default for ML_ICRITERIA = 2 is ML_CX = 0.2. Most of our force fields have been generated using ML_ICRITERIA = 1, but this mode sometimes stagnates and stops performing first-principles calculations. On the other hand, and as already mentioned, ML_ICRITERIA = 2 tends to over-sample, that is, it can perform too many first-principles calculations.
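To make the difference between the two averaging schemes concrete, the gliding-average update of ML_ICRITERIA = 2 can be sketched in a few lines of Python. This is an illustration only: the fixed history length of 400 steps and the role of ML_CX follow the text above, but the exact scaling formula (here, threshold = (1 + ML_CX) times the gliding average) is an assumption, not the VASP implementation.

```python
from collections import deque

def update_threshold(history, new_error, cx=0.2):
    """Sketch of a gliding-average threshold update.

    history   -- deque(maxlen=400) holding previous Bayesian errors
    new_error -- Bayesian error predicted at the current MD step
    cx        -- plays the role of ML_CX (assumption: scales the average)
    """
    history.append(new_error)          # deque discards entries beyond 400 steps
    avg = sum(history) / len(history)  # gliding average of all stored errors
    return (1.0 + cx) * avg            # assumed update rule for ML_CTIFOR
```

With ML_ICRITERIA = 1, by contrast, only the errors recorded right after a re-training would enter the history, so the threshold would change far less often.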

Related tags and articles

ML_LMLFF, ML_CTIFOR, ML_CSLOPE, ML_CSIG, ML_MHIS, ML_CX
