Test suite when VASP is compiled using Intel 2020.1.217

Posted: Wed Feb 10, 2021 2:29 pm
by john_low1
I have found that the 2018 toolchain has numerical issues with instruction sets newer than AVX2. I am using Intel 2020.1.217 for this build. I have found that VASP built with this toolset fails on tests involving the Andersen thermostat in the VASP test-suite. I have listed them below.

andersen_nve_constrain_fixed
andersen_nve_constrain_fixed_MDALGO=11
andersen_nve_constrain_fixed_MDALGO=11_RPR
andersen_nve_constrain_fixed_RPR
andersen_nve_fixed
andersen_nve_fixed_MDALGO=11
andersen_nve_fixed_MDALGO=11_RPR
andersen_nve_fixed_RPR
andersen_nvt_fixed
andersen_nvt_fixed_MDALGO=11
andersen_nvt_fixed_MDALGO=11_RPR
andersen_nvt_fixed_RPR

John Low
Argonne National Laboratory

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Mon Feb 15, 2021 11:58 am
by henrique_miranda
This post was originally made on a different thread:
forum/viewtopic.php?f=4&t=17952

This is a different question so I made a new thread.
Could you please post the log file with the exact error that you got in this run?

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Fri Feb 26, 2021 10:35 pm
by roger_amos1
I have also come across this problem in vasp.6.2.0 (and vasp.6.1.2). It is due to the use of -xHost in the standard makefiles.

The attached two outputs were run on a large Intel SkyLake cluster at the Australian National University.
They used Intel compilers and libraries 2020.0.166 which is one of the 'verified' toolchains.
The standard makefile.include (makefile.include.linux_intel_omp) was used.

The output labelled 'fail' was from a program built with -xHost, as in the supplied files.
The output labelled 'correct' was from a program built without -xHost.
The example job is andersen_nve_constrain_fixed from the test suite, but all jobs with andersen+fixed fail the same way with -xHost and work without it.
To be more specific, they fail if AVX512 instructions are requested.

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Wed Apr 21, 2021 7:31 pm
by mmarsman
Hi John, Hi Roger,

I can confirm that this is an issue that is triggered by requesting AVX512 instructions (-xCORE-AVX512 or -xHost on an applicable host), and disappears when limiting things to AVX2.
We have not seen this before because only recently we acquired an AVX512 capable machine (a Cascade Lake Xeon).
I reproduced this with the Intel 19.1.2.254 compilers (which means some 2020 version of Parallel Studio --- confusing).
I have not checked whether it is solved in the new oneAPI distros, but will try to do so ASAP.
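
As a rough illustration of the workaround described above, the sketch below checks whether a Linux host advertises AVX-512, in which case -xHost would enable the instructions that trigger the failure. The helper name `host_simd_level` is made up for this example, and the script assumes the CPU feature list is exposed on the "flags" line of /proc/cpuinfo, which is not the case on every platform.

```shell
#!/bin/sh
# Sketch only: host_simd_level is a hypothetical helper, not part of VASP.
# It takes a space-separated CPU feature string and prints the highest
# SIMD level it recognises: "avx512", "avx2" or "other".
host_simd_level() {
  case " $1 " in
    *" avx512f "*) echo "avx512" ;;
    *" avx2 "*)    echo "avx2" ;;
    *)             echo "other" ;;
  esac
}

# On Linux, read the feature list of the first CPU and warn if -xHost
# would pull in AVX-512 on this machine.
if [ -r /proc/cpuinfo ]; then
  flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
  if [ "$(host_simd_level "$flags")" = "avx512" ]; then
    echo "-xHost would enable AVX-512 here; consider -xCORE-AVX2 instead"
  fi
fi
```

If the warning is printed, replacing -xHost with -xCORE-AVX2 in makefile.include is one way to limit the build to AVX2 with the Intel compilers discussed in this thread.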

Just in case you are interested: I traced the problem to a few completely innocuous lines:

diff --git a/src/mymath.F b/src/mymath.F
index f46d930b..919fda62 100644
--- a/src/mymath.F
+++ b/src/mymath.F
@@ -1449,9 +1449,7 @@
           Ltxyz=1
           DO i=1,T_INFO%NIONS
             DO j=1,3
-              IF (.NOT. T_INFO%LSFOR(j,i)) THEN
-                Ltxyz(j)=0
-              ENDIF
+              IF (.NOT. T_INFO%LSFOR(j,i) .AND. Ltxyz(j)==1) Ltxyz(j)=0
             ENDDO
           ENDDO

The change shown above solves the problem .. go figure :-)
I will incorporate this change in the upcoming release, just in case it is still a problem with the current Intel oneAPI compilers & tools.

Cheers!

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Fri May 21, 2021 2:25 pm
by john_low1
Martin and Roger,

I have added Martin's patch to VASP6.2.0 and tested it on the Skylake clusters at eagle.nrel.gov. This build passed all the tests in the VASP test suite.

"Intel(R) MPI Library for Linux* OS, Version 2019 Update 7 Build 20200312 (id: 5dc2dd3e9)" and "ifort version 19.1.1.217" were used in this build.

I used the attached makefile.include which includes the compiler flag "-fp-model precise" to avoid other numerical errors.

Thank you Martin for the patch!

John

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Sun Jun 27, 2021 4:34 pm
by john_low1
The patch for AVX-512 helped with vasp built for KNLs (MIC-AVX512) but there are still a few issues with SCAN on the KNLs.

I have attached the makefile.include I used and the results from the testsuite on a KNL.

If anyone is interested in helping me with this issue but does not have access to KNLs, I might be able to arrange access to KNLs at lcrc.anl.gov.

John J. Low
Argonne National Laboratory.

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Mon Jul 12, 2021 8:58 am
by henrique_miranda
Sorry for my delayed answer.
Thanks for your report.
We are currently looking into this issue and will let you know as soon as we've found something.

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Mon Jul 12, 2021 3:24 pm
by henrique_miranda
Ok, we (@mmarsman and I) looked more carefully at your makefile.include.
The reason for the discrepancy is probably the flag "-DnoAugXCmeta" and not the MIC-AVX512 architecture.
This "-DnoAugXCmeta" tag falls in the category of "Deprecated/Not-recommended":
wiki/index.php/Precompiler_flags#Deprec ... ecommended

We added some further explanation in the wiki for the reason it should not be used.
This option was added to compute the metaGGA contributions from the non-augmented pseudo density (instead of the augmented density). TPSS and revTPSS have a condition ingrained in them concerning the behavior of the von-Weizsäcker kinetic energy density (second derivative of the charge density) and the kinetic energy density computed from the orbitals. This condition can be strongly violated when the charge density is augmented, and in those cases the TPSS and revTPSS functionals can become unstable. SCAN and its derivatives (RSCAN, R2SCAN, etc.) do not assume that this condition is met and remain stable for the augmented density as well, so this option should not be used, as it will negatively affect the final results.
Could you try recompiling the code without this flag and let us know if the test suite passes?
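
For anyone wanting to try this, here is a minimal sketch of removing the flag. The helper name `strip_noaugxcmeta` is made up for this example, and the CPP_OPTIONS line is a shortened sample rather than the makefile.include from this thread.

```shell
#!/bin/sh
# Sketch only: strip_noaugxcmeta is a hypothetical helper, not part of VASP.
# It reads makefile.include text on stdin and writes it to stdout with the
# -DnoAugXCmeta precompiler flag (and the whitespace before it) removed.
# All other content is passed through unchanged.
strip_noaugxcmeta() {
  sed 's/[[:space:]]*-DnoAugXCmeta//g'
}

# Demonstration on a shortened, made-up CPP_OPTIONS line.
strip_noaugxcmeta <<'EOF'
CPP_OPTIONS = -DMPI -DnoAugXCmeta -Duse_collective
EOF
# prints: CPP_OPTIONS = -DMPI -Duse_collective
```

On a real build one would pipe makefile.include through the function into a new file, compare the two, and then rebuild VASP from scratch so that every source file is preprocessed again without the flag.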

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Wed Aug 11, 2021 3:43 pm
by john_low1
Henrique,

Thanks for the tip on the -DnoAugXCmeta flag. I have built vasp 6.2.1 with that flag removed from my makefile.include file and it passes all the tests in the vasp test suite!

I have attached a tar archive with my makefile.include, the testing scripts and results from the test suite.

Note that one of the tests (bulk_BN_SCAN+rVV10) failed when run with 8 MPI processes and 8 OMP threads per MPI process, but passed when run with 4 MPI processes and 2 OMP threads each.

Sorry for the delay in following up on this!

Thanks for the help!

John Low
Argonne National Laboratory

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Fri Feb 18, 2022 11:12 am
by hszhao.cn@gmail.com
john_low1 wrote: Wed Aug 11, 2021 3:43 pm
Henrique,

Thanks for the tip on the -DnoAugXCmeta flag. I have built vasp 6.2.1 with that flag removed from my makefile.include file and it passes all the tests in the vasp test suite!

I have attached a tar archive with my makefile.include, the testing scripts and results from the test suite.

Note that one of the tests (bulk_BN_SCAN+rVV10) failed when run with 8 MPI processes with 8 OMP threads for each MPI process. But did pass when run with 4 MPI processes with two OMP threads each.

I have built vasp 6.3.0 with the latest Intel oneAPI Base and HPC toolkits. In my validation, the bulk_BN_SCAN+rVV10 test runs successfully both with 8 MPI processes and 8 OMP threads per process and with 4 MPI processes and 2 OMP threads per process. I have attached a tar archive which includes my makefile.include, the testing scripts and the results from the test suite.

Best regards,
Hongsheng Zhao

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Fri Feb 18, 2022 12:37 pm
by hszhao.cn@gmail.com
I tried the bulk_BN_SCAN+rVV10 example with different combinations of nranks and nthrds and found that the run time can differ considerably. With nranks=16 nthrds=16 the test took so long that I terminated it before it finished. The timings for the combinations I tried are summarized below:

nranks=4 nthrds=2

real	0m13.734s
user	1m21.187s
sys	0m4.244s

nranks=8 nthrds=8

real	0m12.930s
user	8m31.540s
sys	0m30.591s

nranks=16 nthrds=16
^C

So a natural question is: what combination of nranks and nthrds is optimal for a specific computational task? Is there a rule of thumb?

Regards,
HZ
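
One factor that may be at play here is oversubscription: each MPI rank starts its own OpenMP threads, so the total thread count is nranks times nthrds. The sketch below only does that arithmetic. The helper name `check_layout` is made up, and the 64-core node is an assumption, since the thread does not state the core count of the machine used.

```shell
#!/bin/sh
# Sketch only: check_layout is a hypothetical helper, not part of VASP.
# Usage: check_layout NRANKS NTHRDS NCORES
# Prints the total number of threads and whether it exceeds NCORES.
check_layout() {
  total=$(( $1 * $2 ))
  if [ "$total" -gt "$3" ]; then
    echo "$1 ranks x $2 threads = $total threads: oversubscribed on $3 cores"
  else
    echo "$1 ranks x $2 threads = $total threads: fits on $3 cores"
  fi
}

# The three combinations from the post, on an ASSUMED 64-core node.
check_layout 4 2 64
check_layout 8 8 64
check_layout 16 16 64
```

Under that assumption only the 16 x 16 layout asks for more threads than there are cores. A common starting point is to keep nranks times nthrds at or below the number of physical cores and then benchmark a few splits for the system at hand.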

Re: Test suite when VASP is compiled using Intel 2020.1.217

Posted: Mon Mar 07, 2022 12:02 pm
by hszhao.cn@gmail.com
I've tried to discuss this question here and got some useful advice. Interested users can refer to the above discussion for some relevant clues.

Yours,
HZ