Compilation with OpenMPI

Dear all,

I have installed siesta-4.1-b4 on my cluster with OpenMPI-4.0.3 and Intel parallel_studio_xe_2016.4. The compilation completes and the executable runs in serial. However, when I run the executable with mpirun, whether it works depends on the number of cores I use. For example, for one specific job it runs with 1/2/3/5/7/11/13 cores but crashes with 4/6/8/9/10/12/14 cores. In most of the mpirun runs, the crash happens at the beginning of the SCF.

====
New grid distribution: 3
1 1: 76 40: 160 1: 93
2 77: 216 1: 47 1: 92
3 1: 76 1: 39 1: 91
4 77: 216 1: 47 93: 108
5 77: 216 48: 160 1: 26
6 77: 216 48: 160 27: 108
7 1: 76 1: 39 92: 108
8 1: 76 40: 160 94: 108
Setting up quadratic distribution…
ExtMesh (bp) on 0 = 129 x 193 x 166 = 4132902
PhiOnMesh: Number of (b)points on node 0 = 747250
PhiOnMesh: nlist on node 0 = 1258199

====

Then it crashes with the following error:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
siesta-4.1.b4-new 00000000024BEECD Unknown Unknown Unknown
siesta-4.1.b4-new 00000000024BCD67 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000246C134 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000246BF46 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000002410786 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000002416A50 Unknown Unknown Unknown
Unknown 00002B6A269317C0 Unknown Unknown Unknown
libmpi.so.40 00002B6A276DCA3D Unknown Unknown Unknown
siesta-4.1.b4-new 00000000023A5C3A Unknown Unknown Unknown
siesta-4.1.b4-new 00000000023A3305 Unknown Unknown Unknown
siesta-4.1.b4-new 00000000023999C4 Unknown Unknown Unknown
siesta-4.1.b4-new 00000000008CD03C Unknown Unknown Unknown
siesta-4.1.b4-new 00000000008C7569 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000049A7F8 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000046995E Unknown Unknown Unknown
siesta-4.1.b4-new 00000000005B60EA Unknown Unknown Unknown
siesta-4.1.b4-new 00000000005F93C9 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000000C1286D Unknown Unknown Unknown
siesta-4.1.b4-new 0000000000430E1E Unknown Unknown Unknown
libc.so.6 00002B6A279E7C36 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000000430D29 Unknown Unknown Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
siesta-4.1.b4-new 00000000024BEECD Unknown Unknown Unknown
siesta-4.1.b4-new 00000000024BCD67 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000246C134 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000246BF46 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000002410786 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000002416A50 Unknown Unknown Unknown
Unknown 00002B5ACCAFC7C0 Unknown Unknown Unknown
libmpi.so.40 00002B5ACD8A7A3D Unknown Unknown Unknown
siesta-4.1.b4-new 00000000023A5C3A Unknown Unknown Unknown
siesta-4.1.b4-new 00000000023A3305 Unknown Unknown Unknown
siesta-4.1.b4-new 00000000023999C4 Unknown Unknown Unknown
siesta-4.1.b4-new 00000000008CD03C Unknown Unknown Unknown
siesta-4.1.b4-new 00000000008C7569 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000049A7F8 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000046995E Unknown Unknown Unknown
siesta-4.1.b4-new 00000000005B60EA Unknown Unknown Unknown
siesta-4.1.b4-new 00000000005F93C9 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000000C1286D Unknown Unknown Unknown
siesta-4.1.b4-new 0000000000430E1E Unknown Unknown Unknown
libc.so.6 00002B5ACDBB2C36 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000000430D29 Unknown Unknown Unknown

Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
siesta-4.1.b4-new 00000000024BEECD Unknown Unknown Unknown
siesta-4.1.b4-new 00000000024BCD67 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000246C134 Unknown Unknown Unknown
siesta-4.1.b4-new 000000000246BF46 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000002410786 Unknown Unknown Unknown
siesta-4.1.b4-new 0000000002416A50 Unknown Unknown Unknown
Unknown 00002AE4A9DEA7C0 Unknown Unknown Unknown

Stack trace terminated abnormally.
forrtl: severe (174): SIGSEGV, segmentation fault occurred
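
The dependence on the core count described above could be mapped out systematically with a small script. This is only a sketch: the executable name is taken from the trace, while the input file name `input.fdf`, the output file names, and the helper names `run_with_mpirun` and `sweep` are hypothetical placeholders, and it assumes the usual way of feeding the input file to siesta on standard input.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def run_with_mpirun(n_procs: int) -> int:
    """Launch one job on n_procs processes and return its exit code.

    Assumptions: "input.fdf" is a hypothetical input file name, and the
    executable sits in the current directory under the name seen in the trace.
    """
    with open("input.fdf") as fdf, open(f"np{n_procs}.out", "w") as out:
        proc = subprocess.run(
            ["mpirun", "-np", str(n_procs), "./siesta-4.1.b4-new"],
            stdin=fdf,
            stdout=out,
            stderr=subprocess.STDOUT,
        )
    return proc.returncode


def sweep(
    counts: Iterable[int],
    run: Callable[[int], int] = run_with_mpirun,
) -> Dict[str, List[int]]:
    """Run the job once per process count and sort the counts by outcome."""
    results: Dict[str, List[int]] = {"ok": [], "failed": []}
    for n in counts:
        # A non-zero exit code (for example after a SIGSEGV) counts as a failure.
        results["ok" if run(n) == 0 else "failed"].append(n)
    return results
```

Calling `sweep(range(1, 15))` would repeat the test for 1 to 14 processes and leave one output file per run, so the last lines before each crash can be compared.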

Any suggestions?

Thanks so much.

/Guang-Ping Zhang

Hi all,
My problem is solved. It was caused by combining a compiler and an OpenMPI release from different periods: the Intel parallel_studio_xe_2016.4 compiler dates from around 2016, while OpenMPI-4.0.3 was released after 2018. When I use OpenMPI-2.0.4, which also dates from around 2016, the executable runs normally.
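
When several OpenMPI versions are installed side by side, it may help to check which MPI library an executable actually resolves at run time and to compare that with the output of `mpirun --version`. The following is a minimal sketch that assumes a Linux system where `ldd` is available; the helper names `ldd_output_for` and `mpi_libraries` are hypothetical.

```python
import subprocess
from typing import List


def ldd_output_for(executable: str) -> str:
    """Return the raw text printed by `ldd` for the given executable."""
    return subprocess.run(
        ["ldd", executable], capture_output=True, text=True, check=True
    ).stdout


def mpi_libraries(ldd_output: str) -> List[str]:
    """Keep only the ldd lines that mention an MPI shared library."""
    return [
        line.strip()
        for line in ldd_output.splitlines()
        if "libmpi" in line
    ]
```

For example, `mpi_libraries(ldd_output_for("./siesta-4.1.b4-new"))` would list the `libmpi` entries and the paths they resolve to, which shows whether they point into the intended OpenMPI installation.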

Thanks for your attention.

/Guang-Ping Zhang