LDIAG

From VASP Wiki
Revision as of 14:48, 16 January 2017 by Karsai

{\tt IALGO} = 38 | 48 \qquad {\tt LDIAG} = .TRUE. | .FALSE.

\begin{tabular}{lll}
Default \\
{\tt IALGO} & = & 8 for VASP.4.4 and older \\
            & = & 38 for VASP.4.5, VASP.4.6 and VASP.5.2 (if {\tt ALGO} is not set) \\
{\tt LDIAG} & = & .TRUE. \\
\end{tabular}\vspace{5mm}

\begin{verbatim}
IALGO = integer selecting algorithm
LDIAG = perform sub space rotation
\end{verbatim}

Note that the VASP.4.5 default is {\tt IALGO} = 38 (a Davidson block iteration scheme). {\tt IALGO} = 8 is not supported in VASP.4.5 for copyright reasons, but {\tt IALGO} = 38 is roughly 2 times faster than {\tt IALGO} = 8 for large systems and at least as stable. The algorithm can also be selected by setting {\tt ALGO} = Normal | Fast | Very\_Fast in the INCAR file (see Sec. \ref{incar-algo}).

\noindent {\tt IALGO} selects the main algorithm, and {\tt LDIAG} determines whether a subspace diagonalization is performed. {\em We strongly urge users to select the algorithm via {\tt ALGO}. Algorithms other than those available via {\tt ALGO} are subject to instabilities.}
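
\noindent As a minimal sketch of this recommendation, the INCAR fragment below selects the algorithm by name. The correspondence {\tt ALGO} = Normal $\leftrightarrow$ {\tt IALGO} = 38 is taken from the list of settings on this page; the comments are illustrative only.

\begin{verbatim}
# Recommended: select the algorithm by name
ALGO  = Normal      # blocked Davidson, corresponds to IALGO = 38

# Low-level form (set only one of ALGO / IALGO):
# IALGO = 38
# LDIAG = .TRUE.    # default: perform the sub space rotation
\end{verbatim}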

Generally, the first digit of {\tt IALGO} specifies the main algorithm, and the second digit controls the settings within that algorithm. For instance, 4X always calls the same routine for the electronic minimization, while the second digit X controls the details of that minimization (preconditioning etc.).

{\em Note:} All implemented algorithms yield the same result, i.e. they correctly calculate the KS groundstate, {\em if they converge}. This is guaranteed because all minimization routines use the same set of subroutines to calculate the residual (correction) vector $({\bf H} - \epsilon {\bf S}) \vert \phi \rangle$ for the current orbitals $\phi$, and the orbitals are considered converged once this correction vector becomes smaller than a specified threshold. The only difference between the algorithms is the way this correction vector is added to the trial orbital, and therefore the performance of the routines can differ considerably.
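
\noindent The convergence test described above can be sketched as follows. The band index $n$, the symbol $\vert R_n \rangle$ for the residual vector, the threshold $\delta$, and the evaluation of $\epsilon_n$ as a Rayleigh quotient are notational assumptions made here for illustration; the text above fixes only the form of the residual.
\begin{eqnarray*}
\epsilon_n & = & \frac{\langle \phi_n \vert {\bf H} \vert \phi_n \rangle}{\langle \phi_n \vert {\bf S} \vert \phi_n \rangle}, \\
\vert R_n \rangle & = & ({\bf H} - \epsilon_n {\bf S}) \vert \phi_n \rangle, \\
\Vert R_n \Vert & < & \delta \quad \mbox{(band $n$ is considered converged)}.
\end{eqnarray*}
With this choice of $\epsilon_n$ the residual is orthogonal to the orbital, $\langle \phi_n \vert R_n \rangle = 0$, and it vanishes exactly when $\phi_n$ solves ${\bf H} \vert \phi_n \rangle = \epsilon_n {\bf S} \vert \phi_n \rangle$.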

\noindent The most extensive tests have been done for {\tt IALGO} = 38 ({\tt IALGO} = 8 before VASP.4.5). {\em If random vectors ({\tt INIWAV} = 1) are used for the initialization of the orbitals, this algorithm always gives the correct KS groundstate. Therefore, if you have problems with {\tt IALGO} = 48 ({\tt ALGO} = Fast), switch to {\tt IALGO} = 38.}

\noindent List of possible settings for {\tt IALGO}.
\begin{itemize}
\item[-1] Performance test.

VASP does not perform an actual calculation; only some important parts of the program are executed, and the timing for each part is printed at the end.

\item[5-8] Conjugate gradient algorithm (section \ref{min-en4})

Optimize each band iteratively using a conjugate gradient algorithm.

Subspace-diagonalization before conjugate gradient algorithm. The conjugate gradient algorithm is used to optimize the eigenvalue of each band.

Sub-switches:

\begin{tabular}{ll}
5 & steepest descent \\
6 & conjugate gradient \\
7 & preconditioned steepest descent \\
8 & preconditioned conjugate gradient \\
\end{tabular}\vspace{5mm}

\noindent {\tt IALGO} = 8 (VASP releases older than VASP.4.5) is always fastest; {\tt IALGO} = 5-7 are implemented for test purposes only.

Note that {\tt IALGO} = 8 is not supported by VASP.4.5, since M. Teter, Corning and M. Payne hold a patent on this algorithm.

\item[38] ({\tt ALGO} = N) Kosugi algorithm (special Davidson block iteration scheme, see section \ref{min-david})

This algorithm is the default in VASP.4.6 and VASP.5.X. It optimizes a subset of {\tt NSIM} \index{INCAR!N!NSIM|textit} bands simultaneously (Sec. \ref{incar-nsim}). The optimized bands are kept orthogonal to all other bands. Problems with this algorithm arise if linear dependencies develop in the search space. In that case, try decreasing {\tt NSIM}, which reduces the rank of the search space.
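
A minimal INCAR sketch of this remedy is given below. The value {\tt NSIM} = 2 is a hypothetical choice for illustration; it is simply smaller than the default of 4 quoted on this page.

\begin{verbatim}
ALGO = Normal       # blocked Davidson (IALGO = 38)
NSIM = 2            # hypothetical value, smaller than the default
\end{verbatim}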


\item[44-48] ({\tt ALGO} = F) Residual minimization method with direct inversion in the iterative subspace (RMM-DIIS, see sections \ref{min-en3} and \ref{min-en5})

The RMM-DIIS algorithm considerably reduces the number of orthonormalization steps ($O(N^3)$) and is therefore much faster than {\tt IALGO} = 8 and {\tt IALGO} = 38, at least for large systems and for workstations with a small memory bandwidth. For optimal performance, we recommend using this switch together with {\tt LREAL} = Auto (Section \ref{incar-real}). The algorithm works in a blocked mode in which several bands are optimized at the same time. This can improve the performance even further on systems with a low memory bandwidth (see Sec. \ref{incar-nsim}; the default is presently {\tt NSIM} = 4).

The following sub-switches exist:

\begin{tabular}{ll}
44 & steepest descent eigenvalue minimization \\
46 & residuum-minimization + preconditioning \\
48 & preconditioned residuum-minimization ({\tt ALGO} = F) \\
\end{tabular}\vspace{5mm}

\noindent {\tt IALGO} = 48 is usually most reliable ({\tt IALGO} = 44 and 46 are mainly for test purposes).

For {\tt IALGO} = 4X, a subspace diagonalization is performed before the residual vector minimization, and a Gram-Schmidt orthogonalization is employed after the RMM-DIIS step. In the RMM-DIIS step, each band is optimized individually (without the orthogonality constraint), with a maximum of {\tt NDAV}\index{INCAR!N!NDAV|textbf} iterative steps per band. The default is {\tt NDAV} = 4, and we recommend leaving this value unchanged.

Note that the RMM-DIIS algorithm can fail in rare cases, whereas {\tt IALGO} = 38 has not failed for any system tested to date. Therefore, if you have problems with {\tt IALGO} = 48, first try switching to {\tt IALGO} = 38.

However, in some cases the performance gains due to {\tt IALGO} = 48 are so significant that {\tt IALGO} = 38 might not be a feasible option. In the following we try to explain what to do if {\tt IALGO} = 48 does not work reliably:

In general, two major problems can be encountered when using {\tt IALGO} = 48. First, the optimization of unoccupied bands might fail for molecular dynamics and relaxations. This is because our implementation of the RMM-DIIS algorithm treats unoccupied bands more ``sloppily'' than occupied bands (see section \ref{incar-wei}) during MD runs. The problem can be solved rather easily by specifying {\tt WEIMIN} = 0\index{INCAR!W!WEIMIN|textit} in the INCAR file. In that case all bands are treated accurately.
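
As a sketch, the corresponding INCAR lines for an MD run or relaxation might read as follows; only the two tags named above are used, and the comments are illustrative.

\begin{verbatim}
IALGO  = 48         # RMM-DIIS
WEIMIN = 0          # treat all bands, also unoccupied ones, accurately
\end{verbatim}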


The other major problem, which also occurs for static calculations, is the initialization of the orbitals. Because the RMM-DIIS algorithm tends to find eigenvectors that are close to the initial set of trial vectors, there is no guarantee of converging to the correct ground state! This situation is usually very easy to recognize: whenever one eigenvector is missing in the final solution, the convergence becomes slow at the end (note that it is possible that one state with a small fractional occupancy above the Fermi-level is missing). If you suspect that this is the case, switch to {\tt ICHARG} = 12\index{INCAR!I!ICHARG|textit} (i.e. no update of charge and Hamiltonian) and try to calculate the orbitals with high accuracy ($10^{-6}$). If the convergence is fairly slow or gets stuck at some precision, the RMM-DIIS algorithm has problems with the initial set of orbitals (as a rule of thumb, not more than 12 electronic iterations should be required to determine the orbitals to the default precision for {\tt ICHARG} = 12).

The first thing to do in that case is to increase the number of bands ({\tt NBANDS}\index{INCAR!N!NBANDS|textit}) in the INCAR file. This is usually the simplest and most efficient fix, but it does not work in all cases. It is also undesirable for MD runs and long relaxations because it increases the computational demand somewhat. A simple alternative, which worked in all tested cases, is to use {\tt IALGO} = 38 (Davidson) for a few non-selfconsistent iterations and then to switch to the RMM-DIIS algorithm. This setup is selected automatically when {\tt ALGO} = Fast is specified in the INCAR file ({\tt IALGO} must not be specified in the INCAR file in this case).
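
The diagnostic run described above might be set up as in the following sketch. The use of {\tt EDIFF} to request the $10^{-6}$ accuracy is an assumption (the text above states only the target accuracy), and the {\tt NBANDS} value is a hypothetical placeholder.

\begin{verbatim}
# Diagnostic run: charge density and Hamiltonian are kept fixed
IALGO  = 48         # RMM-DIIS, the algorithm under test
ICHARG = 12         # no update of charge and Hamiltonian
EDIFF  = 1E-6       # assumed tag for the requested accuracy

# If convergence stalls, a first remedy is to add bands:
# NBANDS = 72       # hypothetical value, larger than in the failing run
\end{verbatim}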

The final option is somewhat complicated and requires an understanding of how the initialization of the RMM-DIIS algorithm works. After the random initialization of the orbitals, the initial orbitals for the RMM-DIIS algorithm are determined during a non-selfconsistent steepest descent phase (the number of steepest descent sweeps is given by {\tt NELMDL}\index{INCAR!N!NELMDL|textit}; the default is {\tt NELMDL} = -12 for RMM-DIIS, section \ref{incar-nelm}). In each sweep of this initial phase, one steepest descent step per orbital is performed between successive sub space rotations. This ``automatic'' simple steepest descent approach during the delay faces a rather ill-conditioned minimization problem and can fail to produce reasonable trial orbitals for the RMM-DIIS algorithm. In this case the quantity in the column ``rms'' will not decrease during the initial phase (12 steps), and you must improve the conditioning of the problem by setting the {\tt ENINI} parameter in the INCAR file. {\tt ENINI}\index{INCAR!E!ENINI|textbf} controls the cutoff during the initial (steepest descent) phase for {\tt IALGO} = 48. The default is {\tt ENINI} = {\tt ENCUT}. If convergence problems are observed, start with a slightly smaller {\tt ENINI} and reduce it in steps of $20~\%$ until the norm of the residual vector (column ``rms'') decreases continuously during the first 12 steps.
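
As a worked example of one such reduction, assume a hypothetical cutoff of {\tt ENCUT} = 400 (this value is not from the text above). A reduction by $20~\%$ then gives {\tt ENINI} = $0.8 \times 400 = 320$:

\begin{verbatim}
IALGO = 48
ENCUT = 400         # hypothetical cutoff, for illustration only
ENINI = 320         # 20 % below ENCUT; reduce further if the
                    # "rms" column still does not decrease
\end{verbatim}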

A final note concerns the mixing: {\tt IALGO} = 48 dislikes too abrupt mixing. The RMM-DIIS algorithm always stays in the space spanned by the initial orbitals, whereas too strong mixing (large {\tt AMIX}\index{INCAR!A!AMIX|textit}, small {\tt BMIX}\index{INCAR!B!BMIX|textit}) might require a change of this Hilbert space. The initial mixing must therefore not be too strong for {\tt IALGO} = 48. Try to reduce {\tt AMIX} and increase {\tt BMIX} if you suspect such a situation. Increasing {\tt NBANDS} also helps in this situation.
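
The direction of this adjustment is sketched below. Both numbers are hypothetical placeholders chosen only to show a smaller {\tt AMIX} and a larger {\tt BMIX} than in a previous, failing run; they are not recommended values.

\begin{verbatim}
IALGO = 48
AMIX  = 0.2         # hypothetical: reduced relative to the failing run
BMIX  = 2.0         # hypothetical: increased relative to the failing run
\end{verbatim}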

\item[53-58] Treat total free energy as variational quantity and minimize the functional completely selfconsistently.

This algorithm is based on an idea first proposed in Refs. \cite{sti89,gil89,ari92}. The algorithm has been carefully optimized and should be selected for Hartree-Fock type calculations. The present version is rather stable and robust even for metallic systems. Important sub-switches:

\begin{tabular}{ll}
53 & damped MD with damping term automatically determined by the given time-step ({\tt ALGO} = D) \\
54 & damped MD (velocity quench or quickmin) \\
58 & preconditioned conjugate gradient ({\tt ALGO} = A) \\
\end{tabular}\vspace{5mm}

\noindent Furthermore, {\tt LDIAG} determines whether the subspace rotation matrix (the rotation matrix in the space spanned by the occupied and unoccupied orbitals) is optimized. The current default is {\tt LDIAG} = .TRUE., which selects the algorithm presented in Ref. \cite{marsalgo07}. This allows for efficient groundstate calculations of metals and small gap semiconductors. {\tt LDIAG} = .FALSE. selects L\"owdin perturbation theory for the subspace rotation matrix \cite{kre96b}, which is much faster but generally significantly less stable for metallic and small gap systems.

\noindent The preconditioned conjugate gradient algorithm ({\tt IALGO} = 58, {\tt ALGO} = A) is recommended for insulators. The best stability is usually obtained if the number of bands equals half the number of electrons (non spin polarized case). In this case, the algorithm is fairly robust and foolproof and might even outperform the mixing algorithm.

For small gap systems and for metals, however, it is usually required (metals) or desirable (semiconductors) to use a larger value for {\tt NBANDS}. In this case, we recommend using the damped MD algorithm ({\tt IALGO} = 53, {\tt ALGO} = Damped) instead of the conjugate gradient one.

The stability of the algorithms that update all bands simultaneously depends strongly on the setting of {\tt TIME}\index{INCAR!T!TIME|textit}. For the conjugate gradient case, {\tt TIME} controls the step size of the trial step, which is required in order to perform a line minimization of the energy along the gradient (or conjugate gradient, see section \ref{incar-ibrion} for details). Too small steps make the line minimization less accurate, whereas too large steps can cause instabilities. The step size is usually scaled automatically by the actual step size minimizing the total energy along the gradient (values can range from 1.0 for insulators to 0.01 for metals with a large density of states at the Fermi-level).

For the damped MD algorithm ({\tt IALGO} = 53, {\tt ALGO} = Damped), a sensible {\tt TIME} step is even more important. In this case {\tt TIME} is not adjusted automatically, and the user is entirely responsible for choosing an appropriate value. Too small time-steps slow the convergence significantly, whereas too large values will always lead to divergence. It is sensible to optimize this value, in particular if many different configurations are considered for a particular system. We recommend starting with a small step size {\tt TIME} and increasing {\tt TIME} by a factor of 1.2 until the calculations diverge. The largest stable step {\tt TIME} should then be used for all calculations.
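
The scan over {\tt TIME} can be sketched as follows. The starting value 0.05 is a hypothetical choice for illustration; each subsequent trial multiplies the previous value by 1.2.

\begin{verbatim}
ALGO = Damped       # IALGO = 53
TIME = 0.05         # hypothetical starting value

# Successive trial values (factor 1.2 each):
#   0.05 -> 0.06 -> 0.072 -> 0.0864 -> ...
# Keep the largest value for which the run still converges.
\end{verbatim}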

The final algorithm, {\tt IALGO} = 54, also uses a damped molecular dynamics algorithm and quenches the velocities to zero if they are antiparallel to the present forces (quick-min). It is usually not as efficient as {\tt IALGO} = 53, but it is also less sensitive to the {\tt TIME} parameter (for details see also section \ref{incar-ibrion}).

{\em Note: it is very important to set the {\tt TIME} tag for these algorithms (see section \ref{incar-time})}.

\item[2] Orbitals and one-electron energies are kept fixed. One-electron occupancies and the electronic density of states (DOS) are, however, recalculated. This option is only useful if a pre-converged WAVECAR file is read. It allows running selected post-processing tasks, such as the local DOS or the interface code to Wannier90.

\item[3] Orbitals (one-electron wavefunctions) are kept fixed. One-electron energies, one-electron occupancies, band structure energies, the electronic density of states (DOS), and the total energy are recalculated for the present Hamiltonian. This option is only useful if a pre-converged WAVECAR file is read. It also allows running selected post-processing tasks, such as the local DOS or the interface code to Wannier90.

\item[4] Orbitals are updated by applying a sub-space rotation, i.e. the Hamiltonian is evaluated in the space spanned by the orbitals (read from WAVECAR), and one diagonalization in this space is performed. No optimization outside the subspace spanned by the orbitals is performed.

{\em Note: if {\tt NBANDS} is larger than or equal to the total number of plane waves, the resulting one-electron orbitals are exact.}

\item[15-18] Conjugate gradient algorithm

Subspace diagonalization after iterative refinement of the eigenvectors using the conjugate gradient algorithm. This switch is kept for compatibility reasons only and should no longer be used. Generally {\tt IALGO} = 5-8 is preferable, but it was not implemented prior to VAMP 1.1.

Sub-switches as above.


\item[28] Conjugate gradient algorithm (section \ref{min-en4})

Subspace-diagonalization before conjugate gradient algorithm.

No explicit orthonormalization of the gradients to the trial orbitals is done.

This setting saves time, but fails in most cases; it is mainly included for test purposes. Try {\tt IALGO} = 4X instead.

\item[90] Exact diagonalization. This flag selects an exact diagonalization of the one-electron Hamiltonian. This requires a fairly large amount of memory and should be selected with caution. Specifically, we recommend selecting this algorithm for RPA or $GW$ calculations if many unoccupied orbitals are calculated (more than 30-50~\% of the states spanned by the full plane wave basis). To speed up the calculations, we recommend performing a routine groundstate calculation before calculating the unoccupied states. \end{itemize}
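
\noindent The two-step procedure recommended for {\tt IALGO} = 90 can be sketched with two successive INCAR files. The {\tt NBANDS} value is a hypothetical placeholder, and restarting the second run from the WAVECAR file of the first is an assumption about the workflow.

\begin{verbatim}
# Step 1: routine groundstate calculation
ALGO   = Normal

# Step 2: exact diagonalization for many unoccupied orbitals
# (assumed to restart from the WAVECAR written in step 1)
IALGO  = 90
NBANDS = 512        # hypothetical: a large share of the plane-wave basis
\end{verbatim}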