vasp6 gpu version ab initio MD crashes

Posted: Tue Dec 19, 2023 5:04 pm
by german_d.samolyuk1
Dear developers,

I'm running Langevin MD for 128 tungsten + 1 Re atoms on NERSC Perlmutter.
With the same input (INCAR, POSCAR, KPOINTS, POTCAR), vasp.6.4.1 fails to reach self-consistency after 200 SCF iterations and fails at the second time step, while the vasp 5 CPU version runs without any problem.

Below I have copied the e-mail I received from a NERSC software engineer. I would appreciate any help. Please let me know what information you need.

"2023-12-15 13:14:48 PST - Phillip Thomas - Additional comments
Hi German,

Thank you for your patience! I tested your job with several versions of VASP:

5.4.4-cpu
6.3.2-cpu
6.4.1-cpu
6.2.1-gpu
6.3.2-gpu
6.4.1-gpu
6.4.2-gpu (not yet public on Perlmutter, new build)

I can reproduce the error that you experienced in 6.2.1-gpu, but I found that this error appears in *all* VASP-6 builds at NERSC; it is not specific to the GPU builds. Looking at the output files I noticed that the SCF iterations begin to differ between VASP 5.4.4 and the VASP 6.x.y runs very early in the calculation, with the SCF energies diverging within the first few SCF cycles (sometimes even in the very first step). In 5.4.4 the free energy always converges to a value around -1645 eV for all SCF cycles in the job, but all of the VASP 6.x.y builds show SCF divergence, so I believe the values from VASP 5.4.4 to be correct.

I notice that the number of "eigenvalue-minimisations" in VASP 6.X begins to differ from VASP 5.4.4 at the point of divergence, so I suspect the issue lies in the eigensolver routine.

At this point I recommend that you file a bug report with the VASP developers. Some issues that the VASP developers might check include:

1) Were there any changes in the eigensolver routine between VASP 5 and VASP 6 which may have introduced a bug?
2) Were any default parameters changed between VASP 5 and VASP 6 which might affect SCF convergence for certain types of systems? If so, then you may be able to restore convergence by setting some parameter in your INCAR in the VASP 6.x.y runs.
3) Is there a possibility of a bug either in the compiler or in the linked libraries which may affect the VASP 6.x.y versions but not VASP 5.4.4? All versions of VASP at NERSC were built using NVIDIA SDK 22.7 and use Cray-MPICH, if that helps.

If you decide to file a bug report with VASP, we would be grateful if you reference the thread in this ticket so that we can track it and patch our VASP builds if the developers suggest a patch!

Best,
Phillip
"

The thread referenced in this ticket is Ref:MSG3501497
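
For anyone who wants to repeat the comparison described in the quoted e-mail, here is a minimal sketch in Python. It assumes that each OUTCAR reports the count on a line of the form `eigenvalue-minimisations  :  <integer>`; that line shape, the helper names, and the sample strings are illustrative assumptions, not taken from the actual runs.

```python
import re

# Assumed OUTCAR line shape: "eigenvalue-minimisations  :   512"
PATTERN = re.compile(r"eigenvalue-minimisations\s*:\s*(\d+)")


def minimisation_counts(outcar_text):
    """Return every eigenvalue-minimisation count, in order of appearance."""
    return [int(m.group(1)) for m in PATTERN.finditer(outcar_text)]


def first_divergence(counts_a, counts_b):
    """Index of the first entry where two runs differ, or None if the
    common prefix of both lists is identical."""
    for index, (a, b) in enumerate(zip(counts_a, counts_b)):
        if a != b:
            return index
    return None


# Synthetic text for illustration only (not real VASP output).
reference = (
    "eigenvalue-minimisations  :   512\n"
    "eigenvalue-minimisations  :   512\n"
)
candidate = (
    "eigenvalue-minimisations  :   512\n"
    "eigenvalue-minimisations  :   640\n"
)

index = first_divergence(
    minimisation_counts(reference), minimisation_counts(candidate)
)
print("first differing entry:", index)
```

In practice the two strings would be read from the OUTCAR files of the VASP 5 and VASP 6 runs; a differing entry only shows where the runs part ways, not why.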

Re: vasp6 gpu version ab initio MD crashes

Posted: Tue Dec 19, 2023 6:03 pm
by pedro_melo
Dear german_d.samolyuk1,

We will need some more information about the jobs in question to check the performance of VASP 5.4.4 and the later 6.x.y versions. Could you provide us with the input files that you or the NERSC engineer are using?

Kind regards,
Pedro Melo

Re: vasp6 gpu version ab initio MD crashes

Posted: Tue Dec 19, 2023 6:25 pm
by german_d.samolyuk1
Dear Pedro Melo,

Thank you for your quick reply.
I attached the archive wre.tar. It contains INCAR, POSCAR, POTCAR, KPOINTS, gpu.pbatch (the one I used to run vasp6), and cpu.pbatch (vasp5).

Sincerely,

German

Re: vasp6 gpu version ab initio MD crashes

Posted: Tue Dec 19, 2023 6:38 pm
by pedro_melo
Dear German,

You seem to have forgotten the .tar file.

Best,
Pedro

Re: vasp6 gpu version ab initio MD crashes

Posted: Tue Dec 19, 2023 6:42 pm
by german_d.samolyuk1
Dear Pedro,

Did it work this time?

Thanks,

German

Re: vasp6 gpu version ab initio MD crashes

Posted: Wed Dec 20, 2023 10:02 am
by pedro_melo
Dear German,

In your INCAR there are at least 3 references to the algorithm that you want VASP to use:

ALGO = Fast
IALGO = 48
ALGO = VeryFast

If I am not wrong, VASP will only consider the first time ALGO is assigned. Could you try changing the INCAR to use a single, more robust option for ALGO, such as Normal?

Kind regards,
Pedro
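
Conflicting algorithm settings like the ones listed above can be flagged before a job is submitted. The following is a minimal sketch in Python, not part of VASP: the function names and the `ALGO_TAGS` grouping are hypothetical, and it assumes the common INCAR conventions of `TAG = value` assignments, `#` or `!` comments, and `;` separating several assignments on one line. It only reports what it finds and makes no claim about which assignment VASP actually uses.

```python
import re
from collections import defaultdict

# Tags that both select the electronic minimisation algorithm
# (grouping assumed for this check).
ALGO_TAGS = {"ALGO", "IALGO"}


def parse_incar_tags(text):
    """Return {TAG: [(line_number, value), ...]} for every assignment."""
    found = defaultdict(list)
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Drop anything after a '#' or '!' comment character.
        line = re.split(r"[#!]", line, maxsplit=1)[0]
        # Several assignments may share one line, separated by ';'.
        for part in line.split(";"):
            if "=" not in part:
                continue
            tag, value = part.split("=", 1)
            tag = tag.strip().upper()
            if tag:
                found[tag].append((lineno, value.strip()))
    return dict(found)


def find_conflicts(tags):
    """Return tags assigned more than once, and which algorithm tags appear."""
    duplicates = {t: v for t, v in tags.items() if len(v) > 1}
    algo_present = sorted(ALGO_TAGS & tags.keys())
    return duplicates, algo_present


# The three lines quoted in this post.
incar = """\
ALGO = Fast
IALGO = 48
ALGO = VeryFast
"""

tags = parse_incar_tags(incar)
duplicates, algo_present = find_conflicts(tags)
print("assigned more than once:", duplicates)
print("algorithm tags present:", algo_present)
```

If `duplicates` is non-empty, or `algo_present` lists both ALGO and IALGO, the INCAR is worth cleaning up so that only one algorithm setting remains.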

Re: vasp6 gpu version ab initio MD crashes

Posted: Thu Dec 21, 2023 9:19 pm
by german_d.samolyuk1
Dear Pedro,

Now it works :)

Surprisingly, IALGO=48 was read from the INCAR; it didn't work with vasp6, but it worked with vasp5.

Thank you,

German

Happy Holidays!