Not enough memory: Difference between revisions

From VASP Wiki
No edit summary
Line 1: Line 1:
Nowadays, for standard DFT and hybrid functional calculations, memory is not an issue.
Nowadays, for standard DFT and hybrid functional calculations, memory is usually not an issue.
Furthermore, by increasing the number of cores, the memory requirements per core can be reduced significantly.


Line 5: Line 5:
The following things can be tried to reduce the memory requirements per core.
*Switch off symmetrisation ({{TAG|ISYM}}=0). Symmetrisation is done locally on each node and requires three fairly large arrays. VASP.4.4.2 (and newer versions) offer a more memory-conserving symmetrisation, which can be selected by specifying {{TAG|ISYM}}=2. Results might, however, differ slightly from {{TAG|ISYM}}=1 (usually by only 1/100th of an meV). Also avoid writing or reading the {{TAG|CHGCAR}} file ({{TAG|LCHARG}}=''.FALSE.'').
*Use {{TAG|NPAR}}=1.
*For large systems with many atoms, increase {{TAG|NCORE}}. This decreases the memory required per core to store the non-local projectors. Furthermore, real-space projection ({{TAG|LREAL}}=A) decreases the required memory per core.
*{{TAG|KPAR}} distributes the k-points over cores. Unfortunately, only the computation is distributed; the storage of the orbitals is not distributed over cores. This means that {{TAG|KPAR}}=1 results in the smallest memory requirement per core (but the slowest calculations, since VASP must then rely on other, less efficient parallelization strategies).
 
A final hint: at several key places, the code writes out the required memory per core. Search for
lines containing
 
total amount of memory used by VASP MPI-rank0
 
and inspect how much memory VASP uses per core.


VASP relies heavily on dynamic memory
allocation (''ALLOCATE'' and ''DEALLOCATE''). As far as we know, there
is no memory leakage (''ALLOCATE'' without ''DEALLOCATE''), but
it is impossible to be entirely sure that none exists. Some users
have observed that the memory footprint of the code grows during
dynamic simulations on the T3E.
This is, however, most likely due to "problematic"
dynamic memory management in the f90 runtime system and not due to
a programming error in VASP. Unfortunately, the
dynamic memory subsystems of most f90 compilers are still
rather inefficient. As a result, the memory
may become more and more fragmented during the run, so that large blocks
of memory cannot be allocated. We can only hope for
improvements in dynamic memory management (for instance,
the introduction of garbage collectors).


----
[[Category:Performance]][[Category:Howto]]

Revision as of 17:43, 2 November 2020

Nowadays, for standard DFT and hybrid functional calculations, memory is usually not an issue. Furthermore, by increasing the number of cores, the memory requirements per core can be reduced significantly.


The following things can be tried to reduce the memory requirements per core.

  • Switch off symmetrisation (ISYM=0). Symmetrisation is done locally on each node and requires three fairly large arrays. VASP.4.4.2 (and newer versions) offer a more memory-conserving symmetrisation, which can be selected by specifying ISYM=2. Results might, however, differ slightly from ISYM=1 (usually by only 1/100th of an meV). Also avoid writing or reading the CHGCAR file (LCHARG=.FALSE.).
  • For large systems with many atoms, increase NCORE. This decreases the memory required per core to store the non-local projectors. Furthermore, real-space projection (LREAL=A) decreases the required memory per core.

  • KPAR distributes the k-points over cores. Unfortunately, only the computation is distributed; the storage of the orbitals is not distributed over cores. This means that KPAR=1 results in the smallest memory requirement per core (but the slowest calculations, since VASP must then rely on other, less efficient parallelization strategies).

A final hint: at several key places, the code writes out the required memory per core. Search for lines containing

total amount of memory used by VASP MPI-rank0

and inspect how much memory VASP uses per core.
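
The search described above can be scripted. The following is a minimal sketch, not part of VASP: the helper name `find_memory_reports` is hypothetical, the file name `OUTCAR` is an assumption about where the lines end up, and the parser assumes the marker is followed by a number and a unit on the same line. Adjust the regular expression if your output looks different.

```python
import re
from pathlib import Path

# Marker text quoted from the page above.
MARKER = "total amount of memory used by VASP MPI-rank0"


def find_memory_reports(text):
    """Return (line_number, value, unit) for each line containing MARKER.

    Assumption: the marker is followed by a number and a unit on the same
    line (e.g. "12345. kBytes", a hypothetical value). Lines that contain
    the marker but no number are skipped.
    """
    reports = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if MARKER not in line:
            continue
        rest = line.split(MARKER, 1)[1]
        match = re.search(r"([0-9]+\.?[0-9]*)\s*(\S+)?", rest)
        if match:
            reports.append((lineno, float(match.group(1)), match.group(2)))
    return reports


if __name__ == "__main__":
    path = Path("OUTCAR")  # assumed file name; point this at your own output
    if path.exists():
        text = path.read_text(errors="replace")
        for lineno, value, unit in find_memory_reports(text):
            print(f"line {lineno}: {value:g} {unit or ''}")
```

Run the script in the directory of a finished or running calculation. Comparing the reported values before and after changing NCORE, KPAR, or ISYM shows whether the change actually lowered the per-core requirement.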