Error: convergence problem in MD! on startup
Posted: Wed Aug 13, 2025 11:19 am
by akretschmer
Hello,
I am trying to do an ML-FF run. I have trained on different structures, then did a refitting step. Now I want to run a FF-only simulation. But right after starting, I get this error message:
 -----------------------------------------------------------------------------
|                                                                             |
|     EEEEEEE  RRRRRR   RRRRRR   OOOOOOO  RRRRRR      ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     E        R     R  R     R  O     O  R     R     ###     ###     ###     |
|     EEEEE    RRRRRR   RRRRRR   O     O  RRRRRR       #       #       #      |
|     E        R   R    R   R    O     O  R   R                               |
|     E        R    R   R    R   O     O  R    R      ###     ###     ###     |
|     EEEEEEE  R     R  R     R  OOOOOOO  R     R     ###     ###     ###     |
|                                                                             |
|     Error: convergence problem in MD!                                       |
|                                                                             |
|       ---->  I REFUSE TO CONTINUE WITH THIS SICK JOB ... BYE!!! <----       |
|                                                                             |
 -----------------------------------------------------------------------------
I don't know what this means, and a search did not turn up anything useful.
The input and output files are attached (except for the ML_FF which is very large).
Re: Error: convergence problem in MD! on startup
Posted: Thu Aug 14, 2025 7:26 am
by manuel_engel1
Hi,
Thanks for reaching out to us on the VASP forum. What you are experiencing is most likely due to the fast dynamics of the hydrogen atoms in your simulation.
With the light mass of H and a time step of 0.5 fs, it is very likely that at the final step of your calculation the forces on a particular hydrogen atom become very large. This can cause the hydrogen atom to be flung out of the hydrogen molecule and can even cause the MD algorithm to fail.
I suggest that you investigate the issue by looking into the final steps of your trajectory (XDATCAR, CONTCAR). A potential fix is to reduce the time step or to increase the mass of the hydrogen atoms via POMASS.
Let me know if that solves it for you.
Re: Error: convergence problem in MD! on startup
Posted: Thu Aug 14, 2025 1:02 pm
by akretschmer
The problem is that there is no trajectory. VASP crashes before the first step is done. I also increased the H mass to 8 and it behaves the same.
Re: Error: convergence problem in MD! on startup
Posted: Thu Aug 14, 2025 1:51 pm
by manuel_engel1
In your INCAR file, you have set ML_OUTBLOCK=2000, meaning that if the error happens before those 2000 steps, you will not see any output. Since the error happens really early, I suggest setting ML_OUTBLOCK=1 and checking the output at every step to see what is happening.
Re: Error: convergence problem in MD! on startup
Posted: Mon Sep 22, 2025 8:44 am
by akretschmer
I have checked the trajectory with ML_OUTBLOCK = 1, and everything runs perfectly fine well beyond 2000 time steps, also at slightly larger ML_OUTBLOCK values. Only when ML_OUTBLOCK gets into the 1000 range do I see this problem. It also happens with other, completely different jobs: with large ML_OUTBLOCK values I get this error, while the same jobs run perfectly fine with smaller values.
So this seems to be a bug rather than a problem with my force field.
Re: Error: convergence problem in MD! on startup
Posted: Mon Sep 22, 2025 8:57 am
by ferenc_karsai
Please post your ML_AB and ML_FF files. We definitely need the ML_AB file, but it would be good to have both.
Re: Error: convergence problem in MD! on startup
Posted: Tue Sep 23, 2025 1:52 pm
by akretschmer
Re: Error: convergence problem in MD! on startup
Posted: Wed Sep 24, 2025 8:27 am
by andreas.singraber
Hello!
Thanks for uploading the ML files, I was able to reproduce the error. So far I have also noticed that decreasing the number of MD steps NSW hides the issue. For example, everything works as expected with ML_OUTBLOCK=2000 when reducing NSW from 2000000 to 200000. I suspect that there is some kind of integer overflow involved here (Fortran integers are usually only 4 bytes long; unfortunately, numbers > 2.147...*10^9 are easily reached when multiplying large NSW numbers with other variables). I hope I will have a fix soon, please stay tuned...
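To illustrate the suspected mechanism, here is a minimal Python sketch of what happens when a product is stored in a signed 4-byte integer. It assumes two's-complement wraparound, which is what compilers typically produce although the Fortran standard does not guarantee it, and it assumes that NSW multiplied by ML_OUTBLOCK is the quantity involved. The helper name wrap_int32 is made up for this example and is not part of VASP.

```python
# Largest value a signed 4-byte integer can hold.
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Map an arbitrary integer onto the signed 32-bit range (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


# Values taken from this thread: the failing and the working NSW setting.
ml_outblock = 2000
for nsw in (2_000_000, 200_000):
    product = nsw * ml_outblock  # Python integers do not overflow
    print(nsw, product, product > INT32_MAX, wrap_int32(product))
```

With NSW=2000000 the exact product is above the 4-byte limit and the wrapped value comes out negative, whereas with NSW=200000 the product is stored unchanged.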
All the best,
Andreas Singraber
Re: Error: convergence problem in MD! on startup
Posted: Wed Sep 24, 2025 12:40 pm
by andreas.singraber
Hey!
So it turns out that this is indeed an integer overflow problem, and it has been reported previously (https://www.vasp.at/forum/viewtopic.php?t=19760). A bugfix was already prepared and its release was planned for VASP 6.5.0, which is also indicated on the Known Issues wiki page. Unfortunately, this was overlooked and the fix is not yet in, not even in VASP 6.5.1. I am sorry for the inconvenience; the fix will definitely be available in the upcoming release. However, to provide a solution right now, I have two options for you:
- Work around the overflow by using smaller values for ML_OUTBLOCK and NSW, such that their product does not exceed the 4-byte integer limit 2^31-1 = 2147483647. For example, keeping ML_OUTBLOCK=2000, you could decrease NSW from 2000000 to 1000000 and should not run into this issue any more (see also the known issue description).
- Alternatively, if you are willing to patch and recompile your VASP installation, I have attached a patch file which you can use to fix the bug directly in VASP 6.5.1. To install the patch, unzip the patch file, go to your VASP base directory and use the patch command to apply the changes to the source code:
cd /path/to/vasp.6.5.1/
patch -p0 < /path/to/integer-overflow.patch
Then, it is best to recompile VASP from scratch (clean the directory first with make veryclean).
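If you go with the first option (the workaround) instead, a small helper like the following Python sketch can be used to check a pair of settings before submitting a job. It assumes that the overflowing quantity is exactly the product NSW*ML_OUTBLOCK; the function names are hypothetical and only serve this illustration.

```python
# 4-byte signed integer limit quoted above: 2^31 - 1.
INT32_MAX = 2**31 - 1


def product_is_safe(nsw: int, ml_outblock: int) -> bool:
    """True if NSW * ML_OUTBLOCK still fits into a signed 4-byte integer."""
    return nsw * ml_outblock <= INT32_MAX


def max_safe_nsw(ml_outblock: int) -> int:
    """Largest NSW whose product with ML_OUTBLOCK stays within the limit."""
    return INT32_MAX // ml_outblock


def max_safe_ml_outblock(nsw: int) -> int:
    """Largest ML_OUTBLOCK whose product with NSW stays within the limit."""
    return INT32_MAX // nsw


print(product_is_safe(2_000_000, 2000))  # original settings from this thread
print(product_is_safe(1_000_000, 2000))  # suggested workaround
print(max_safe_nsw(2000))
print(max_safe_ml_outblock(2_000_000))
```

Under this assumption, NSW=2000000 would tolerate ML_OUTBLOCK values only up to roughly 1000, which appears consistent with the failures reported earlier in this thread.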
Hope this works for you! Please let us know if there are still issues and sorry again for the inconvenience.
All the best,
Andreas Singraber