I have a working environment where $OMP_NUM_THREADS=1 is enforced (login node), but the system has many more threads available. It seems that when I compile with -O2 or -O3, the optimized code hard-codes assumptions about OpenMP thread availability based on the host system, or at least no longer handles $OMP_NUM_THREADS gracefully. On execution, I get a segfault on entering __kmp_enter_single().
home$ ifort diagonalize.f90 -g -debug -assume buffered_io -ipo -fpic -openmp -O2 -I$MKLROOT/include/intel64/lp64 -I$MKLROOT/include $MKLROOT/lib/intel64/libmkl_blas95_lp64.a $MKLROOT/lib/intel64/libmkl_lapack95_lp64.a -Wl,--start-group $MKLROOT/lib/intel64/libmkl_intel_lp64.a $MKLROOT/lib/intel64/libmkl_core.a $MKLROOT/lib/intel64/libmkl_intel_thread.a -Wl,--end-group -lpthread -lm -o diagonalize
ipo: remark #11000: performing multi-file optimizations
ipo: remark #11006: generating object file /tmp/ipo_ifortuRFRzr.o

home$ diagonalize
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine   Line      Source
libiomp5.so        00002B52D2AFB71A  Unknown   Unknown   Unknown
libiomp5.so        00002B52D2ADEE16  Unknown   Unknown   Unknown
diagonalize_16     0000000000411514  Unknown   Unknown   Unknown
libiomp5.so        00002B52D2B20FE3  Unknown   Unknown   Unknown

home$ idbc diagonalize
Intel(R) Debugger for applications running on Intel(R) 64, Version 13.0, Build [79.936.23]
------------------
object file name: diagonalize
Reading symbols from /home/me/diagonalize...done.
(idb) run
Starting program: /home/me/diagonalize
[New Thread 26293 (LWP 26293)]
Program received signal SIGSEGV
__kmp_enter_single () in /opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so
(idb) where
#0 0x00002b92e998671a in __kmp_enter_single () in /opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so
#1 0x00002b92e9969e16 in __kmpc_single () in /opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so
#2 0x0000000000411514 in diagonalize () at /home/me/diagonalize.f90:403
#3 0x000000000040f6ed in diagonalize () at /home/me/diagonalize.f90:333
#4 0x000000000040dfec in main () in /home/me/diagonalize
#5 0x00000038cc81ecdd in __libc_start_main () in /lib64/libc-2.12.so
(idb) set $cmdset='dbx'
(idb) file diagonalize.f90
(idb) list 403,406
    403 !$omp workshare
    404 ! zero result array
    405 resArray(ptr(2):ptr(2)+dim_matrix-1) = 0_dbl_real
    406 !$omp end workshare
If I only use -O1 level optimizations, the program runs fine. However, this is an HPC environment, and for real data sets I will need at least -O2 to work.
Also, the segfault is intermittent: it occurs on most runs of this executable, but not all. I'm fairly certain this is due to varying load on the host system and thus varying thread availability.
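Since the crash is intermittent, one way to put a number on "usually" is to run the executable repeatedly and count the failing runs. Below is a minimal sketch in POSIX shell; the helper name count_failures and the run count of 20 are illustrative assumptions, not part of my actual setup. It treats any non-zero exit status as a failure, so it does not distinguish a segfault from other errors.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]   (hypothetical helper, for illustration)
# Runs CMD N times and prints how many runs exited with a non-zero status.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard program output; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example invocation (assumed path and run count): 20 runs with one thread.
# count_failures 20 env OMP_NUM_THREADS=1 ./diagonalize
```

Running the same count against the -O1 and -O2 builds, ideally at different times of day, would show whether the failure rate really tracks the optimization level and the load on the node.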
As a side note, the same source works fine with gfortran and the GOMP library at -O2 and -O3 (the main point being that there is nothing odd about this code).
Any suggestions?
Thanks,
Jonathan