Stochastic LTMP2: Difference between revisions

Revision as of 16:55, 29 March 2018

(UNDER CONSTRUCTION)

Parallelization

The stochastic LTMP2 algorithm supports parallelization with MPI and OpenMP (OMP). For optimal performance, set both the number of MPI ranks and the KPAR flag to the number of cores (#cores), i.e. start VASP using

mpirun -np #cores vasp

and write

KPAR = #cores

in the INCAR file. With this setting the entire set of Hartree-Fock orbitals (WAVECAR) is available on each MPI rank, which is necessary to calculate stochastic samples independently. Note that KPAR is used here only to control the distribution of the orbitals and has nothing to do with k-point parallelization.

However, for very large systems (large WAVECAR files) the available memory per MPI rank could be insufficient to store the entire set of orbitals. In this case, simply decrease KPAR. Note that the available memory for the orbitals can be calculated by (memory per MPI rank) * (number of MPI ranks) / KPAR. For example, if your WAVECAR file is 17 GB in size, you need 2*17 GB = 34 GB of memory to distribute the orbitals (the factor 2 is due to double precision). If you want to use 64 cores with 4 GB per core and 64 MPI ranks, you have to set KPAR = 4: the available memory is then 4 GB * 64 / 4 = 64 GB, whereas KPAR = 8 would provide only 32 GB. In this case the orbitals are distributed over 64/4 = 16 MPI ranks. Each MPI rank will still be able to perform independent stochastic calculations, but somewhat more MPI communication is necessary.
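The memory estimate above can be checked with a few lines of Python. This is only a minimal sketch and not part of VASP: the function names orbital_memory_gb and largest_kpar are hypothetical, and the sketch assumes that KPAR has to be a divisor of the number of MPI ranks.

```python
def orbital_memory_gb(mem_per_rank_gb, n_ranks, kpar):
    """Memory available for the orbitals:
    (memory per MPI rank) * (number of MPI ranks) / KPAR."""
    return mem_per_rank_gb * n_ranks / kpar


def largest_kpar(wavecar_gb, mem_per_rank_gb, n_ranks):
    """Return the largest KPAR that still leaves enough memory for the
    orbitals, or None if even KPAR = 1 is not sufficient."""
    # Factor 2 due to double precision, as stated in the text.
    required_gb = 2 * wavecar_gb
    # Assumption: only divisors of the number of MPI ranks are tried.
    for kpar in range(n_ranks, 0, -1):
        if n_ranks % kpar != 0:
            continue
        if orbital_memory_gb(mem_per_rank_gb, n_ranks, kpar) >= required_gb:
            return kpar
    return None


if __name__ == "__main__":
    # Example from the text: 17 GB WAVECAR, 64 MPI ranks, 4 GB per rank.
    print(largest_kpar(17, 4, 64))
```

For the example above the sketch returns KPAR = 4. A return value of None means that the orbitals do not fit even with KPAR = 1, so more memory per MPI rank would be needed.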

It is also possible to increase the memory per MPI rank using shared memory with OMP. This is a viable option if your available memory per core is too small, if decreasing KPAR does not help, or if you do not want to set KPAR to a very small value. However, in general, it is recommended to solve memory issues with the KPAR flag first.

@@ Line 11: / Line 11: @@
 However, for very large systems (large WAVECAR files) the available storage per MPI rank could be insufficient to store the entire set of orbitals. In this case, simply decrease the '''KPAR'''. Note that the available memory for the orbitals can be calculated by (memory per MPI rank) * (number of MPI ranks) / KPAR. For example, if your WAVECAR file has 17 GB you need 2*17 GB = 34 GB of memory to distribute the orbitals (the factor 2 is due to double precision). If want to use 64 cores with 4 GB per core and 64 MPI ranks, you have to set '''KPAR''' = 4. In this case the orbitals are distributed over 64/4 = 16 MPI ranks. Each MPI rank will still be able to perform independent stochastic calculations, however, a bit more MPI communication is necessary.
-It is also possible to increase the memory per MPI rank using shared memory with OMP. This is a viable option if your available memory per core is too small or you don't want to set too small '''KPAR''' values. However, in general, it is recommended to solve memory issues with the '''KPAR''' flag first.
+It is also possible to increase the memory per MPI rank using shared memory with OMP. This is a viable option if your available memory per core is too small, decreasing '''KPAR''' does not help or you don't want to set too small '''KPAR''' values. However, in general, it is recommended to solve memory issues with the '''KPAR''' flag first.
 === KPAR flag ===