ML IWEIGHT: Difference between revisions

From VASP Wiki

Revision as of 06:35, 2 October 2020

ML_FF_IWEIGHT = [integer]
Default: ML_FF_IWEIGHT = 3 

Description: Flag to control the weighting of the energy, force and stress equations in the machine learning force field method.


For ML_FF_IWEIGHT the following settings are possible:

  • ML_FF_IWEIGHT=1: The unnormalized energies, forces and stress tensor training data are divided by the weights determined by the flags ML_FF_WTOTEN (eV/atom), ML_FF_WTIFOR (eV/Angstrom) and ML_FF_WTSIF (kBar), respectively.
  • ML_FF_IWEIGHT=2: The training data are normalized by using their standard deviations. The averaging is done over all training data. Then, the normalized energy, forces and stress tensor are multiplied by ML_FF_WTOTEN, ML_FF_WTIFOR and ML_FF_WTSIF, respectively. In this case the flags ML_FF_WTOTEN, ML_FF_WTIFOR and ML_FF_WTSIF are unitless quantities.
  • ML_FF_IWEIGHT=3: Same as ML_FF_IWEIGHT=2, but the training data are divided into individual subsets. For each subset the standard deviations are calculated separately. The energies, forces and stress tensors are then normalized using the average of the standard deviations of all subsets. Finally, the normalized energies, forces and stress tensors are multiplied by ML_FF_WTOTEN, ML_FF_WTIFOR and ML_FF_WTSIF, respectively. The division into subsets is based on the name tag given in the first line of the POSCAR file. If training is performed for widely different materials, for instance different phases that have widely different energies, it is important to choose different system names in the first line of the POSCAR file. If this is not done, the standard deviation for the energy might become large, concomitantly reducing the weight of the energy equations.
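
The difference between the three settings can be illustrated with a small Python sketch. This is not VASP's implementation: the function names, the sample energies and the use of the population standard deviation are assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def sigma_all(values):
    """Standard deviation over the whole training set (the ML_FF_IWEIGHT=2 picture)."""
    return pstdev(values)


def sigma_subsets(values, names):
    """Average of per-subset standard deviations (the ML_FF_IWEIGHT=3 picture).

    `names` plays the role of the system name in the first POSCAR line.
    """
    subsets = {}
    for value, name in zip(values, names):
        subsets.setdefault(name, []).append(value)
    return mean([pstdev(vals) for vals in subsets.values()])


def scale_by_unit_weight(values, weight):
    """ML_FF_IWEIGHT=1 picture: divide raw data by a weight carrying the same unit."""
    return [v / weight for v in values]


def scale_by_sigma(values, sigma, weight):
    """ML_FF_IWEIGHT=2/3 picture: normalize by sigma, then multiply by a unitless weight."""
    return [weight * v / sigma for v in values]


# Hypothetical energies per atom (eV/atom) for two phases with very different energies.
energies = [-5.00, -5.02, -4.98, -3.00, -3.02, -2.98]
names = ["phase A"] * 3 + ["phase B"] * 3

s_all = sigma_all(energies)             # dominated by the gap between the phases
s_sub = sigma_subsets(energies, names)  # reflects the spread within each phase
```

With these made-up numbers, s_all is about 1.0 eV/atom while s_sub is about 0.016 eV/atom. Normalizing both phases with a single shared standard deviation would therefore scale the energies down roughly sixty-fold compared with the per-subset treatment, which is the loss of energy weight described above.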


Mind: For ML_FF_IWEIGHT=2 and 3 the weights are unitless quantities used to multiply the data, whereas for ML_FF_IWEIGHT=1 they have a unit. All three methods provide unitless energies, forces and stress tensors, which are then passed to the regression. Although the defaults are usually rather sensible, it can be useful to explore different weights. For instance, if vibrational frequencies are supposed to be reproduced accurately, we found it helpful to increase ML_FF_WTIFOR to 10-100. On the other hand, if energy differences between different phases need to be described accurately by the force field, it might be useful to increase ML_FF_WTOTEN to around 10-100.
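
Why a larger weight favours one class of data can be seen in a toy least-squares fit. The sketch below assumes that a weight scales its whole equation and uses plain least squares with a single coefficient as a stand-in for the regression. The names fit_scalar and toy_fit and all numbers are hypothetical, and VASP's actual solver is not reproduced here.

```python
def fit_scalar(equations):
    """Least-squares fit of one coefficient c from rows (a, y, w).

    Minimizes sum((w * (a * c - y)) ** 2); each weight scales its whole equation.
    """
    numerator = sum(w * w * a * y for a, y, w in equations)
    denominator = sum(w * w * a * a for a, y, w in equations)
    return numerator / denominator


def toy_fit(energy_weight, force_weight):
    # Hypothetical conflict: the energy row prefers c = 1, the force row prefers c = 2.
    return fit_scalar([(1.0, 1.0, energy_weight), (1.0, 2.0, force_weight)])


equal = toy_fit(1.0, 1.0)         # compromise halfway between both targets
force_heavy = toy_fit(1.0, 10.0)  # fit is pulled toward the force target
```

With equal weights the fit settles at 1.5, halfway between the two targets. With a tenfold force weight it moves to 201/101, or about 1.99, so the force equation is reproduced almost exactly at the expense of the energy equation.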

Related Tags and Sections

ML_FF_LMLFF, ML_FF_WTOTEN, ML_FF_WTIFOR, ML_FF_WTSIF

Examples that use this tag