Problems with MPI vasp at runtime

Questions regarding the compilation of VASP on various platforms: hardware, compilers and libraries, etc.

Moderators: Global Moderator, Moderator

brockp
Newbie
Posts: 5
Joined: Thu Jun 08, 2006 7:33 pm

#1 Post by brockp » Thu Jun 08, 2006 7:45 pm

I am a sysadmin helping a user install VASP on our Linux (RHEL 4.0) Opteron cluster. The compiler is PGI F90 6.1 and the MPI library is Open MPI 1.0.2. The serial version builds and runs just fine, but the parallel version gives the following error:

running on 2 nodes
[nyx.engin.umich.edu:28430] *** An error occurred in MPI_Cart_create
[nyx.engin.umich.edu:28430] *** on communicator MPI_COMM_WORLD
[nyx.engin.umich.edu:28430] *** MPI_ERR_OTHER: known error not in list
[nyx.engin.umich.edu:28430] *** MPI_ERRORS_ARE_FATAL (goodbye)
[nyx.engin.umich.edu:28431] *** An error occurred in MPI_Cart_create
[nyx.engin.umich.edu:28431] *** on communicator MPI_COMM_WORLD
[nyx.engin.umich.edu:28431] *** MPI_ERR_OTHER: known error not in list
[nyx.engin.umich.edu:28431] *** MPI_ERRORS_ARE_FATAL (goodbye)
1 additional process aborted (not shown)

This is a regular OMPI error, and I have contacted the Open MPI developers. I am posting here to see whether anyone else has seen this problem and, if so, how (or if) they were able to fix it.

Brock

brockp
Newbie
Posts: 5
Joined: Thu Jun 08, 2006 7:33 pm

#2 Post by brockp » Fri Jun 09, 2006 2:03 pm

Looks like the problem isn't with Open MPI. Here is the result of rebuilding everything with lam-7.1.2:

bash-3.00$ mpirun -np 2 ./vasp
running on 2 nodes
MPI_Cart_create: invalid dimension argument: Invalid argument (rank 0, MPI_COMM_WORLD)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Cart_create()
Rank (0, MPI_COMM_WORLD): - main()
MPI_Cart_create: invalid dimension argument: Invalid argument (rank 1, MPI_COMM_WORLD)
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD): - MPI_Cart_create()
Rank (1, MPI_COMM_WORLD): - main()

Could it be a problem with the user's input, such that the problem can't be broken down correctly? This input works fine with the serial version of VASP.

job
Jr. Member
Posts: 55
Joined: Tue Aug 16, 2005 7:44 am

#3 Post by job » Wed Jun 14, 2006 11:03 am

Have you compiled vasp with -i8 and the mpi library with default settings? That won't work.

c00jsh00
Newbie
Posts: 7
Joined: Tue Nov 15, 2005 9:01 am

#4 Post by c00jsh00 » Thu Jun 29, 2006 5:22 am

Hi,

We have a similar problem. Have you solved it yet?

admin
Administrator
Posts: 2922
Joined: Tue Aug 03, 2004 8:18 am
License Nr.: 458

#5 Post by admin » Mon Jul 24, 2006 11:20 am

Please check whether LAM was compiled in the same bit mode as you used for the compilation of VASP.

brockp
Newbie
Posts: 5
Joined: Thu Jun 08, 2006 7:33 pm

#6 Post by brockp » Tue Sep 26, 2006 7:02 pm

[quote="job"]Have you compiled vasp with -i8 and the mpi library with default settings? That won't work.[/quote]

$ mpirun -np 2 -v ./vasp
running on 2 nodes
[nyx.engin.umich.edu:31483] *** An error occurred in MPI_Cartdim_get
[nyx.engin.umich.edu:31483] *** on communicator MPI_COMM_WORLD
[nyx.engin.umich.edu:31483] *** MPI_ERR_COMM: invalid communicator
[nyx.engin.umich.edu:31483] *** MPI_ERRORS_ARE_FATAL (goodbye)
distr: one band on 1 nodes, 1 groups
1 process killed (possibly by Open MPI)


So I still have not made any progress. I also added -Ddebug to the flags, but VASP did not display anything.

Also, what does -Dkind8 mean?

brockp
Newbie
Posts: 5
Joined: Thu Jun 08, 2006 7:33 pm

#7 Post by brockp » Tue Oct 03, 2006 1:07 pm

The problem was solved using the following:

lam-7.1.2
Open MPI would not work with VASP. This is unfortunate, since both MPICH and LAM are no longer under development. Moving to more up-to-date MPI libraries like Open MPI would be a plus in the future. I'm not sure whether Open MPI or VASP is causing the problem, so I will pass it on to the OMPI devs and see if we can fix it.

PGI 6.1 -i4
Matching the size of LOGICALs and such was a real pain. It's not documented anywhere, but the default PGI makefile for Linux has -i8 in it. This caused quite a headache. This MUST match what your MPI library was built with.
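
For anyone wondering why the mismatch shows up in MPI_Cart_create, here is a minimal sketch of what can happen to an integer array crossing the -i8 / 4-byte boundary. It is only an illustration: the 2 x 1 process grid and the function name `reinterpret_as_int32` are assumptions made up for this example, not taken from the VASP source, and it assumes a little-endian machine such as an Opteron.

```python
import struct


def reinterpret_as_int32(values_int64, count):
    """Pack values as 8-byte little-endian integers (what a caller built
    with -i8 would pass) and read back `count` 4-byte integers (what an
    MPI library built with default 4-byte INTEGERs would see)."""
    raw = struct.pack(f"<{len(values_int64)}q", *values_int64)
    return list(struct.unpack_from(f"<{count}i", raw))


if __name__ == "__main__":
    dims = [2, 1]  # hypothetical process grid, for illustration only
    seen = reinterpret_as_int32(dims, len(dims))
    print("caller passes:", dims)
    print("library reads:", seen)
```

In this sketch the library reads the upper half of the first 8-byte value as its second dimension, so it sees a 0 where the caller meant 1. A zero-sized dimension would be consistent with the "invalid dimension argument" message LAM printed, although the exact arguments VASP passes were not checked here.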

VASP is running now in MPI. GotoBLAS was very slow on the example case I had (don't know why); ACML 3.5 was slightly faster than ATLAS.

This was on Opteron 244 nodes with non-blocking Gigabit Ethernet networking and jumbo frames. Hope this helps someone else. If you want, I can provide makefiles for anyone having trouble.

Brock
1(734)936-1985
Center for Advanced Computing
University of Michigan (Ann Arbor)

atogo
Newbie
Posts: 10
Joined: Tue Mar 22, 2005 7:45 am
License Nr.: 221

#8 Post by atogo » Tue Oct 17, 2006 6:06 am

I ran into a similar problem. In my case, I could solve it as follows:

Set the full path to the compiler wrapper, like
FC=/usr/foo/bar/bin/mpif90
'FC=mpif90' with $PATH doesn't work: a shared library is missing. I don't know why.

Another option is to link statically (libmpi.a, liborte.a and libopal.a in the Open MPI case). Remember to copy the header files from the 'include' directory of Open MPI.
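
A rough way to check both points (which mpif90 $PATH actually picks up, and which shared libraries the resulting binary cannot find) might look like the sketch below. The helper names `resolve_wrapper`, `parse_missing_libs` and `missing_shared_libs` are made up for this example, and `./vasp` is just the binary name used earlier in the thread; it assumes a Linux system where `ldd` is available.

```python
import shutil
import subprocess


def resolve_wrapper(name="mpif90"):
    """Return the absolute path that $PATH resolves `name` to, or None."""
    return shutil.which(name)


def parse_missing_libs(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_shared_libs(binary):
    """Run ldd on `binary` and list its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=False
    )
    return parse_missing_libs(result.stdout)


if __name__ == "__main__":
    print("mpif90 resolves to:", resolve_wrapper())
    print("unresolved libraries:", missing_shared_libs("./vasp"))
```

If the resolved path is not the MPI installation you expected, or the second list is not empty, that would point in the same direction as the missing shared library described above.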
