Channel: Intel® Fortran Compiler for Linux* and macOS*

Intel Parallel Studio XE 2017 has been released!

Intel Parallel Studio XE 2017, including Intel Fortran Compiler 17.0, is now available from the Intel Registration Center. Release notes can be found here.

Among the new F2008 features are:

  • TYPE(intrinsic-type)
  • Pointer initialization
  • Implied-shape PARAMETER arrays
  • Extend EXIT statement to all valid construct names
  • Support BIND(C) in internal procedures
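A few of these features can be sketched in a single module. This is only an illustration: the module name `f2008_demo` and the names `primes`, `factor`, `counter_ptr` and `first_above` are invented here and do not come from the release notes.

```fortran
module f2008_demo
  implicit none
  ! Implied-shape PARAMETER array: the bounds are taken from the constructor.
  integer, parameter :: primes(*) = [2, 3, 5, 7, 11]
  ! TYPE(intrinsic-type): equivalent to "real :: factor".
  type(real) :: factor = 1.5
  ! Pointer initialization: associated with a saved target at declaration.
  integer, target  :: counter = 0
  integer, pointer :: counter_ptr => counter
contains
  ! Returns the first prime greater than limit, or -1 if there is none.
  integer function first_above(limit) result(found)
    integer, intent(in) :: limit
    integer :: i
    found = -1
    search: block
      do i = 1, size(primes)
        if (primes(i) > limit) then
          found = primes(i)
          exit search   ! EXIT naming a BLOCK construct, not only a DO loop
        end if
      end do
    end block search
  end function first_above
end module f2008_demo
```

With this module, `first_above(5)` evaluates to 7, and assigning through `counter_ptr` changes `counter`.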

There is also new support for OpenMP 4.5 features, an option to align the code for loops and more.

A significant change in this release is that intrinsic assignment to an allocatable array now performs automatic (re)allocation by default when necessary, as specified by the standard. In past releases you needed to specify the -assume realloc_lhs option to get this behavior. In some applications the additional checking may affect performance; you can revert to the previous behavior by specifying -assume norealloc_lhs or the new option -nostandard-realloc-lhs.
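As a minimal sketch of what the standard behavior means in practice (the module and procedure names here are made up for illustration):

```fortran
module realloc_demo
  implicit none
contains
  ! Returns the size of a local allocatable after two whole-array assignments.
  integer function size_after_assignment() result(n)
    integer, allocatable :: a(:)
    a = [1, 2, 3]      ! a is unallocated: it is allocated with size 3
    a = [a, 4, 5]      ! shapes differ: a is reallocated with size 5
    n = size(a)
  end function size_after_assignment

  ! Overwrites an already allocated array without changing its shape.
  subroutine fill(a, val)
    integer, allocatable, intent(inout) :: a(:)
    integer, intent(in) :: val
    a(:) = val         ! array section on the left: never reallocated
  end subroutine fill
end module realloc_demo
```

Writing `a(:) = ...` instead of `a = ...` is a common way to tell the compiler that no shape check or reallocation is wanted for a particular assignment; it requires that `a` is already allocated with the right shape.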

Another change is to correct and make consistent how the compiler treats nonstandard conversions between numeric and LOGICAL types. Please see the release notes for more details.


floating point overflow in __svml_powf4_h9

The following code throws floating point overflow in __svml_powf4_h9 when compiled with -O3 and -O2:

  do i = 1,nkd2p1
     x2 = abs ((i-1)*delk/kright)
     x2 = -apar*x2**bex
     wrkr(i*2-1) = x2
  enddo

 

With the FPE check disabled, the program runs fine and gives correct results. We have seen this kind of FPE caused by vectorization from time to time. Explicitly adding "!DIR$ NOVECTOR" does the trick but hurts performance.

The loop seems perfectly suitable for vectorization, and the numerical values in the result are far from overflow. Why does the overflow happen?

The machine code at the point of the crash:

   0x00000000012349f0 <+784>:   vaddpd %xmm3,%xmm2,%xmm5
   0x00000000012349f4 <+788>:   vpaddq %xmm7,%xmm4,%xmm1
   0x00000000012349f8 <+792>:   vpaddq %xmm8,%xmm5,%xmm2
=> 0x00000000012349fd <+797>:   vcvtpd2ps %xmm1,%xmm3
   0x0000000001234a01 <+801>:   vcvtpd2ps %xmm2,%xmm4
   0x0000000001234a05 <+805>:   vmovlhps %xmm4,%xmm3,%xmm1
   0x0000000001234a09 <+809>:   test   %eax,%eax
   0x0000000001234a0b <+811>:   jne    0x1234a4f <__svml_powf4_h9+879>
   0x0000000001234a0d <+813>:   vmovups 0x30(%rsp),%xmm8
   0x0000000001234a13 <+819>:   vmovaps %xmm1,%xmm0

 

(gdb) p $xmm3                                                                                                                                                                                                                                                                             
$1 = {v4_float = {1.42776291e+31, 0.708184719, -6.739982e+24, 0.690881014}, v2_double = {0.00032494041370586407, 0.00025734792206721777}, v16_int8 = {-126, 53, 52, 115, -104, 75, 53, 63, -27, 103, -78, -24, -108, -35, 48, 63}, v8_int16 = {13698, 29492, 19352, 16181, 26597, -5966, 
    -8812, 16176}, v4_int32 = {1932801410, 1060457368, -390961179, 1060167060}, v2_int64 = {4554629716295038338, 4553382854900475877}, uint128 = 0x3f30dd94e8b267e53f354b9873343582}

 

 

 

Finalization: segfault on optional intent(out) allocatable

The finalization of a non-present optional intent(out) allocatable dummy argument produces a segmentation fault at runtime when the following code is compiled with ifort 16.0.3. This is the same issue as reported some years ago in https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux... (topic closed) and faintly resembles https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux... (pointer instead of allocatable).

module m
    ! finalized t
    type :: t
    contains
        final :: tfinal
    end type
contains
    subroutine tfinal(a)
        type(t), intent(inout) :: a
    end subroutine

    ! pass optional t, finalized on entry
    subroutine pass(a)
        type(t), allocatable, intent(out), optional :: a
        print *, 'passed'
    end subroutine
end module

program p
    use m
    type(t), allocatable :: a

    call pass(a)        ! ok
    call pass()         ! segfault
end program

It would certainly be nice to have a fix for this, or some ideas for workarounds without having to change the dummy's attributes.

Kind regards
Ferdinand

PS: Syntax-highlighting as suggested using (left-angle bracket) pre class="brush:fortran"(right-angle bracket) doesn't work in my post.

Thread Topic: Bug Report

relocation truncated to fit: R_X86_64_32S against symbol

Hi,  

I was part of the beta program for the Intel 2017 compiler. Everything worked fine. We just updated to the official release and I have a new issue with the message:

(.text+0x51): relocation truncated to fit: R_X86_64_32 against symbol `mod_parallel_defs_mp_comm_' defined in COMMON section in avbp.a(mod_parallel_defs.o)

The avbp.a file is an archive we generate ourselves from .o files with the ar r command. I have seen in the forum that switching to dynamic libraries might solve the issue (a ticket from 2011), but this is not a library, just an archive to avoid a very long link command that fails on some systems (we have a lot of files).

Any suggestions? Why did the behavior change between the beta and the release?

We have this issue on a KNL system.

Thank you

Thread Topic: Bug Report

installation of intel parallel cluster 2017 (student license)

Hi sir,

I am having trouble installing the software on my Ubuntu OS.

As described in the installation guide, I tried running install_GUI.sh and install.sh in the terminal, but at first it reported that the file was not found.

I then changed to the directory where the file was downloaded, but it still shows the same error.

Thread Topic: How-To

value not assigned to variable declared "character(len=:) :: textblock(:)"

I expect that whether textblock is declared len=: or len=80 that the following code would produce a successful assignment to textblock; but in one case textblock is unallocated on version 16.  Is this still an issue with V 17?

! place in file testit.F90; compile with ifort, then "ifort -DCOLON"
! ifort testit.F90 -o fixedlen;./fixedlen
! ifort -DCOLON testit.F90 -o colon;./colon
program testit
implicit none
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
#ifdef COLON
character(len=:),allocatable :: textblock(:)
character(len=*),parameter   :: string='LEN=:'
#else
character(len=80),allocatable :: textblock(:)
character(len=*),parameter   :: string='LEN=80'
#endif
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
integer                      :: i
   textblock=[ character(len=40) :: &
   '#########',&
   '#       #',&
   '#       ###################',&
   '#                         #',&
   '#             #############',&
   '#             #',&
   '###############']
   write(*,*)   'FOR ',string
   write(*,*)   'ALLOCATED ',allocated(textblock)
   write(*,*)   'SIZE      ',size(textblock)
   if(size(textblock).gt.0)then
      write(*,*)'LEN       ',len(textblock(1))
   endif
   write(*,'(a)')(trim(textblock(i)),i=1,size(textblock))
end program testit
 

Issue with the alignment of the components of derived-types

Greetings,

I am developing a program to study the dynamics of a system of particles in a cubic box with periodic boundary conditions. I would like to take advantage of auto-vectorization to speed up the computations. To that end, I have created a data structure for storing the position, orientation, force, and torque vectors (among other quantities of interest).

If I understood the documentation on auto-vectorization correctly, I can request that the compiler align the data structure (or derived type) to a 32-byte boundary and use the sequence attribute to pack the components of the structure (no padding). I am interested in aligning data to 32-byte boundaries because I am targeting AVX processors. The problem I am facing is that none of the components becomes aligned, even though the data structure is requested to be aligned to a 32-byte boundary.

To ease troubleshooting, I am posting a minimal version of the code here (with comments for the reader's benefit):

module dynamics
    ! Description:
    ! Defines the principal data structure that stores information about the particles
    ! in the system. To take advantage of auto-vectorization the data is organized
    ! as a Structure of Arrays (SoA).
    use, intrinsic :: iso_fortran_env
    implicit none
    type data
        sequence
        ! position vector
        real(kind = real64), allocatable :: r_x(:);
        real(kind = real64), allocatable :: r_y(:);
        real(kind = real64), allocatable :: r_z(:);
        ! orientation vector (director)
        real(kind = real64), allocatable :: d_x(:);
        real(kind = real64), allocatable :: d_y(:);
        real(kind = real64), allocatable :: d_z(:);
        ! force vector
        real(kind = real64), allocatable :: F_x(:);
        real(kind = real64), allocatable :: F_y(:);
        real(kind = real64), allocatable :: F_z(:);
        ! torque vector
        real(kind = real64), allocatable :: T_x(:);
        real(kind = real64), allocatable :: T_y(:);
        real(kind = real64), allocatable :: T_z(:);
        ! displacement vector (the prefix `d' stands for delta or difference)
        real(kind = real64), allocatable :: dr_x(:);
        real(kind = real64), allocatable :: dr_y(:);
        real(kind = real64), allocatable :: dr_z(:);
        ! particle ID array
        integer(kind = int32), allocatable :: ID(:);
        ! padding array, such that we have an equivalent of 16 arrays of 64 bits each
        integer(kind = int32), allocatable :: padding(:);
    end type data

    private
    public data

end module dynamics

program alignment_test
    ! Minimalistic program to test the alignment of derived-types.
    use, intrinsic :: iso_fortran_env
    use dynamics
    implicit none

    ! particle data structure, aligned to 32 bytes
    type(data), target :: pdata;
    !dir$ attributes align: 32 :: pdata

    ! number of particles in the system
    integer(kind = int32), parameter :: n_pdata = 64;

    ! captures the status returned by allocate/deallocate functions
    integer(kind = int32) :: alloc_stat;

    ! pointer to access the components of the data structure
    real(kind = real64), pointer, contiguous :: pr_x(:);
    real(kind = real64), pointer, contiguous :: pr_y(:);
    real(kind = real64), pointer, contiguous :: pr_z(:);

    allocate( pdata %r_x(n_pdata), pdata %r_y(n_pdata), pdata %r_z(n_pdata),&
        pdata %d_x(n_pdata), pdata %d_y(n_pdata), pdata %d_z(n_pdata),&
        pdata %F_x(n_pdata), pdata %F_y(n_pdata), pdata %F_z(n_pdata),&
        pdata %T_x(n_pdata), pdata %T_y(n_pdata), pdata %T_z(n_pdata),&
        pdata %dr_x(n_pdata), pdata %dr_y(n_pdata), pdata %dr_z(n_pdata),&
        pdata %ID(n_pdata), pdata %padding(n_pdata), stat=alloc_stat );

    if ( alloc_stat /= 0 ) then
        stop "insufficient memory to allocate the data structure, program stopped"
    end if

    pr_x => pdata %r_x;
    pr_y => pdata %r_y;
    pr_z => pdata %r_z;

    ! assign pretend values to the components of the position vector of the particles

    !dir$ assume_aligned pr_x: 32
    pr_x = 0.0d+0;
    !dir$ assume_aligned pr_y: 32
    pr_y = 0.0d+0;
    !dir$ assume_aligned pr_z: 32
    pr_z = 0.0d+0;


    ! free structure from memory
    deallocate( pdata %r_x, pdata %r_y, pdata %r_z,&
        pdata %d_x, pdata %d_y, pdata %d_z,&
        pdata %F_x, pdata %F_y, pdata %F_z,&
        pdata %T_x, pdata %T_y, pdata %T_z,&
        pdata %dr_x, pdata %dr_y, pdata %dr_z,&
        pdata %ID, pdata %padding, stat=alloc_stat );
    if ( alloc_stat /= 0 ) then
        stop "unexpected error, failed deallocate the data structure..."
    end if

end program

 

 

The program was compiled in the following manner:

ifort -g -traceback -check all -align nosequence  -O0 alignment_test.f90

Here is the output generated at runtime:

forrtl: severe (408): fort: (28): Check for ASSUME_ALIGNED fails for 'PR_X' in routine 'ALIGNMENT_TEST' at line 77.

Image              PC                Routine            Line        Source             
a.out              0000000000407786  Unknown               Unknown  Unknown
a.out              000000000040454F  MAIN__                     77  alignment_test.f90
a.out              0000000000402F1E  Unknown               Unknown  Unknown
libc.so.6          0000003B1F81ED5D  Unknown               Unknown  Unknown
a.out              0000000000402E29  Unknown               Unknown  Unknown

and the version of the Fortran compiler is the following:

ifort --version
ifort (IFORT) 16.0.3 20160415
Copyright (C) 1985-2016 Intel Corporation.  All rights reserved.

Thanks in advance for your help,
Misael
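One way to see what alignment the allocated components actually get is to inspect their addresses at run time. The sketch below is only an illustration under a few assumptions: the module name `align_check` and the function `misalignment` are hypothetical, and the explanation it supports is my understanding rather than a confirmed diagnosis. The number in `!dir$ attributes align` and `!dir$ assume_aligned` is a byte count. An allocatable component is stored in the structure as a descriptor, and its data is allocated separately, so aligning `pdata` itself need not align the data that `r_x` refers to.

```fortran
module align_check
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
contains
  ! Returns the address of x(1) modulo boundary (in bytes); 0 means aligned.
  function misalignment(x, n, boundary) result(offset)
    integer, intent(in) :: n
    real(real64), intent(in), target :: x(n)
    integer, intent(in) :: boundary
    integer :: offset
    integer(c_intptr_t) :: address
    ! Reinterpret the C address of the first element as an integer.
    address = transfer(c_loc(x), address)
    offset = int(modulo(address, int(boundary, c_intptr_t)))
  end function misalignment
end module align_check
```

For the program above this could be called as `print *, misalignment(pdata%r_x, n_pdata, 32)` right after the `allocate` statement. If the result is nonzero, the `assume_aligned` check is expected to fail. Intel's documentation describes applying `!dir$ attributes align` to allocatable components inside the type definition; whether that is available may depend on the compiler version.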

 

 

 

Another issue with user defined assignment

Dear all,

I have found an issue with user-defined assignment when using the ifort compiler (version 16.0). Take a basic type with a user-defined assignment (as a TBP) and an extending type which does not define its own assignment. When a non-polymorphic instance of the extended type is defined, an assignment to it invokes the user-defined assignment of the base type. If I understood the explanations of Malcolm Cohen correctly (given in response to my nagfor compiler bug report on this matter), this should not happen. So, in the example below, the assignBasic() routine should not be invoked (as it is in the ifort-compiled binary).

module typedef
  implicit none

  type :: Basic
  contains
    procedure :: assignBasic
    generic :: assignment(=) => assignBasic
  end type Basic

  type, extends(Basic) :: Extended
  end type Extended

contains

  subroutine assignBasic(this, other)
    class(Basic), intent(out) :: this
    type(Basic), intent(in) :: other

    print "(A)", "assignBasic invoked"

  end subroutine assignBasic

end module typedef


program test
  use typedef
  implicit none

  type(Extended) :: ext
  print "(A)", "Non-polymorphic assignment"
  ext = Extended()

end program test

 

Thread Topic: Bug Report

Overloaded operator for derived type undefined when imported from a wrapper module

The following code does not compile with Intel Fortran v16, v17, on x86-64 Linux:

module mymod
implicit none

type :: mytype
  private
  integer :: a
  contains
  procedure,private :: eq
  generic :: operator(==) => eq
endtype mytype

interface mytype
  module procedure :: constructor
endinterface mytype

contains

type(mytype) function constructor(a)
  integer,intent(in),optional :: a
  if(present(a))then
    constructor % a = a
  else
    constructor % a = 0
  endif
endfunction constructor

logical function eq(m0,m1)
  class(mytype),intent(in) :: m0
  class(mytype),intent(in) :: m1
  eq = m0 % a == m1 % a
endfunction eq

endmodule mymod

module wrapper_module
use mymod,only:mytype
endmodule wrapper_module

program test
use wrapper_module
implicit none

write(*,*)mytype() == mytype(0)

endprogram test

The compiler does not seem to resolve the overloaded == operator for mytype:

$ ifort test.f90
test.f90(44): error #6355: This binary operation is invalid for this data type.
write(*,*)mytype() == mytype(0)
----------^
test.f90(44): error #6355: This binary operation is invalid for this data type.
write(*,*)mytype() == mytype(0)
----------------------^
compilation aborted for test.f90 (code 1)

If the main program accesses mytype directly from mymod, the program compiles and executes as expected.

If the main program accesses mytype from wrapper_module, and wrapper module accesses mytype via "use mymod", the program compiles and executes as expected.

If the main program accesses mytype from wrapper_module, and wrapper module accesses mytype via "use mymod,only:mytype" (the example code above), the compilation fails with the above message. I think this is unexpected behavior.

The code compiles and executes as expected when compiled with gfortran 5.x

Thanks,

milan

 

 

Thread Topic: Bug Report

call to fortran double precision function returning garbage

Environment: SUSE Linux

I have a real function tbl that invokes a Fortran function. The function is double precision, and the variable catching the return value is also double precision. I have inserted write statements just before the return and just after the value is caught. The two do NOT match.

dtblu3 is in a utility library that MANY other applications use. tbl is part of an analysis application that mixes C, Fortran, and Oracle's Pro C code.

      real function tbl (a,v,d,ier)

      real a(*),v(*),d(*)
      double precision xi,yi,zi, f3(10000)
      double precision dval, dd, da, xary(100), yary(100), zary(100)

      double precision my_tbl

...

            my_tbl = dtblu3(xi,yi,zi,xary,yary,zary,f3,ndx,ndy,ndz,nx,
     1                  ny,nz,nxa,nya,nza)
      write(*,*) "tbl point 1 my_tbl = ", my_tbl

 

      double precision function dtblu3( x1, y1, z1, x, y, z, f3,
     +                                  ndx, ndy, ndz, nx, ny,
     +                                  nz, mx, my, mz )

      double precision  x(*), y(*), z(*), f3(mx,my,mz)                             
      double precision  x1, y1, z1, x2, y2, z2, dterp3

...

            dtblu3 = f3(i,j,k)                                                 
      write(*,*) 'dtblu3 point 1 dtblu3 = ', dtblu3
            return

The output from these write statements is

 dtblu3 point 1 dtblu3 =    1.34000003337860
 tbl point 1 my_tbl =  -3.689348814741910E+019

The compile statement for dtblu3.f is:

/ots/sw/Intel/compilers_and_libraries_2016.1.150/linux/bin/intel64/ifort -c  -I../include dtblu3.f

The compile statement for tbl.f is:

/ots/sw/Intel/compilers_and_libraries_2016.1.150/linux/bin/intel64/ifort -g -C -c -assume byterecl -convert big_endian -check -I /boeing/include -I/usr/include -I/net/psn02hom/home/d/dra3556/work/vobs/libutil/dvlp/include tbl.f

 

So, any ideas on where my problem might be originating?
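One thing that may be worth checking, on the assumption that `dtblu3` is not declared anywhere in the elided part of `tbl`: under implicit typing a name starting with `d` is default REAL, so the caller would treat the double precision result as single precision. The sketch below uses hypothetical stand-ins (`table_lookup`, `dlookup`, `tbl_sketch`) rather than the real routines, and shows the function placed in a module so that the caller sees its result type.

```fortran
module table_lookup
  implicit none
contains
  ! Stand-in for a double precision lookup routine such as dtblu3.
  double precision function dlookup(f3, mx, i)
    integer, intent(in) :: mx, i
    double precision, intent(in) :: f3(mx)
    dlookup = f3(i)
  end function dlookup

  ! Stand-in for the REAL-valued caller (tbl in the post).
  real function tbl_sketch(f3, mx, i)
    integer, intent(in) :: mx, i
    double precision, intent(in) :: f3(mx)
    double precision :: my_tbl
    ! dlookup comes from the module, so its result type is known here.
    my_tbl = dlookup(f3, mx, i)
    tbl_sketch = real(my_tbl)
  end function tbl_sketch
end module table_lookup
```

In fixed-form legacy code that cannot be moved into a module, the smallest equivalent change would be to add `double precision dtblu3` to the declarations in `tbl`, ideally together with `implicit none`.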

 

 

 

Thread Topic: Help Me

Error passing variables from Gnu C/C++ main to ifort F90 shared object subroutine

Hi Folks,
I have an F90 shared object library compiled with the Intel FORTRAN compiler - don't know what version.

I have a C++ driver compiled with the Gnu compiler (v4.4.7) which can link against the fortran library fine.

I'm trying to recompile the fortran library so I can trace through it with a debugger (and eventually to modify it).  I'm using ifort Version 16.0.0.109 Build 20150815.

However, when I attempt to run the driver with the recompiled library, I get an error passing arguments into a subroutine.  The function prototype is as follows:

subroutine set_some_params(Dir, bufSize,Mode, flag1, flag2, param1, param2, param3)
!DEC$ IF DEFINED(_WIN32)
!DEC$ ATTRIBUTES DLLEXPORT, DEFAULT, ALIAS:'set_some_params' :: set_some_params
!DEC$ ELSE
!DEC$ ATTRIBUTES ALIAS:'set_some_params' :: set_some_params
!DEC$ END IF

   implicit none

   character(FILEPATHLEN), intent(in) :: Dir ! directory path
   integer, value :: bufSize                    ! buffer size [5000]
   integer, value :: Mode                       ! run mode
   integer, value :: flag1
   integer, value :: flag2
   real(8), value :: param1
   integer, value :: param2
   integer, value :: param3

...

This is the first call of a fortran library function, and it crashes.  Loading the exe & core file into gdb shows the failure where the bufSize parameter is first used.

Printing the Dir variable in gdb shows the expected value; however, for the rest of the arguments I get the following:

(gdb) p bufSize
Cannot access memory at address 0x1388
(gdb) p Mode
Cannot access memory at address 0x0
(gdb) p flag1
Cannot access memory at address 0x0
(gdb) p flag2
Cannot access memory at address 0x0

etc.

The ifort compile command looks like this:

/usr/local/bin/ifort -D_LINUX64 -g -c -u -r8 -i4 -fPIC -fpp1 -reentrancy -threads -recursive -fopenmp ../../Source_F/some_file.f90 -o some_file.o

Which produces the following warning:
ifort: command line remark #10128: invalid value '1' for '-fpp'; ignoring

The library link line looks like this:

/usr/local/bin/ifort -D_LINUX64 -V -fPIC -shared --reentrancy -threads -recursive -fopenmp -o ../../../lib_x64/some_lib.dll ./some_file.o ./some_file2.o ./some_file3.o -L. -L/opt/intel/lib/intel64/ -i-dynamic -lifport -lifcore -limf -lsvml -lintlc

I get the following warnings here:
ifort: command line warning #10006: ignoring unknown option '-freentrancy'
ifort: command line remark #10148: option '-i-dynamic' not supported

And the -L path /opt/intel/lib/intel64/ doesn't exist on my system, but it apparently doesn't need anything there.

The C++ command line:

g++ -g -m64 driver.cc -o driver ./DllUtils.o ./DllMainDll.o -ldl -lc -lm

where DllUtils.o and DllMainDll.o are C++ object files that specify some helper functions and interfaces to the shared object library.  Sorry about the "Dll" notation - that's from the original developer.

The header file, DllUtils.h contains the following:

typedef void (STDCALL *fnPtrset_some_params)(char Dir[512], int bufSize, int Mode, int flag1, int flag2, double param1, int param2, int param3);

So to summarize - the original library works fine.  When I attempt to recompile the library using the original makefiles (modified for different path to ifort only) I get an error passing arguments.  First argument (pointer) is OK, the rest fail.

Sorry this is so vague - it's a huge project, I'm trying to extract what may be relevant here.  This is on a CentOS 6 box
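The gdb output suggests the callee is treating the by-value integers as addresses (0x1388 is 5000, the buffer size). Without BIND(C), how VALUE arguments and the hidden character length are passed is left to the compiler and may differ between versions. A sketch of a C-interoperable interface follows, assuming the C prototype shown above. The module name `params_c_api`, the binding name `set_some_params_c` and the `last_*` variables are invented for illustration.

```fortran
module params_c_api
  use, intrinsic :: iso_c_binding, only: c_char, c_int, c_double, c_null_char
  implicit none
  ! Last values received, kept only so the sketch has something to inspect.
  integer(c_int) :: last_buf_size = -1
  integer(c_int) :: last_dir_len = -1
  real(c_double) :: last_param1 = 0.0_c_double
contains
  ! Intended to match the C prototype
  !   void set_some_params_c(const char *dir, int bufSize, int mode,
  !        int flag1, int flag2, double param1, int param2, int param3);
  subroutine set_some_params_c(dir, bufSize, mode, flag1, flag2, &
                               param1, param2, param3) &
      bind(C, name='set_some_params_c')
    character(kind=c_char), intent(in) :: dir(*)  ! NUL-terminated C string
    integer(c_int), value :: bufSize, mode, flag1, flag2
    real(c_double), value :: param1
    integer(c_int), value :: param2, param3
    integer :: n
    ! A C string carries no hidden length, so scan for the terminator.
    n = 0
    do while (dir(n + 1) /= c_null_char)
      n = n + 1
    end do
    last_dir_len = int(n, c_int)
    last_buf_size = bufSize
    last_param1 = param1
  end subroutine set_some_params_c
end module params_c_api
```

With BIND(C) the argument passing follows the companion C compiler's convention, so the C++ side can keep passing `int` and `double` by value and the directory as a NUL-terminated `char` array.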

Thread Topic: Help Me

Failed to install Intel® Parallel Studio XE Composer Edition for Fortran OS X*

I used my email to apply for the student version of Composer for Fortran and successfully received the serial number. When I entered the serial number, it showed an authorization failure. I have checked that my serial number is active. Could anybody help me or give me some suggestions? Thanks.

ifort 16.0.3 and HDF5 segfault

All HDF5 versions I've tested (up to 1.8.17 and 1.10.0-patch1) segfault when built on Linux with icc and ifort 16.0.3. Any idea what's going on? Builds with gcc and gfortran work correctly.

What is wrong with my Makefile?

My makefile

FC = ifort
FCFLAGS= -O3 -xHost

TARGETS= clean birrp
OBJSOC= strlen.f diagnostic.f math.f rtpss.f zlinpack.f coherence.f fft.f rarfilt.f utils.f dataft.f filter.f response.f weight.f  birrp.f
   
all:  $(TARGETS)    
clean:$
        rm -f *.o *.mod
        rm -f birrp
        
birrp:$(OBJSOC)
        $(FC) $(FCFLAGS) -o $@ $(OBJSOC)

# General compile rules
.SUFFIXES: .f .o 

.f .o:    
    $(FC) $(FCFLAGS) -c -o $@ $< 

 

I now have the birrp executable, and when I try to run it

./birrp < n128.in

I get some results. The problem is that it does not read some of the input files at all. I have double-checked this by changing these files: the results are the same. I have inserted a write(*,*) line in dataft.f to print to the screen, but nothing happens. I have attached all the files (b.tar.gz).

Array Indexing Error in Intel Composer 2017 (ifort 17.0.0)

program Bug
  call main()

!    contains
end program Bug


Module errorModule

double precision,allocatable,dimension(:) :: readInArray, tmpWork
integer :: n,i
character(len=300) :: filout


end Module errorModule


subroutine main()

use errorModule

n = 59925

allocate(tmpWork(n+10))
do i=1,n+10
    tmpWork(i) = i
enddo

allocate(readInArray(62267))
 write(filout,'("PhysicalMapInput.txt")')

open(unit=1001,file=trim(filout),status="old")



read(1001,*) readInArray(tmpWork(1:n))


print *,readInArray

deallocate(readInArray)
deallocate(tmpWork)

end subroutine main

 

 

 

 

I believe I have found an error in the intel ifort compiler (the 2017 version). I have tested this with ifort 16 and the bug is not there.

I have written the above program to demonstrate this.

When I run the following:

ifort error.f90 -traceback -g -debug all -warn all -check bounds -check format -check output_conversion -check pointers  -ftrapuv -check all -gen-interfaces -warn interfaces 

I get the following error:

forrtl: severe (194): Run-Time Check Failure. The variable 'var$76' is being used in 'error.f90(36,1)' without being defined

Image              PC                Routine            Line        Source             
a.out              000000010D72464C  _main_                     36  error.f90
a.out              000000010D725574  _MAIN__                     2  error.f90
a.out              000000010D723CAE  Unknown               Unknown  Unknown

 

 

As already mentioned, this does not happen in the 2016 version.
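Separately from the behavior reported here, the standard requires a vector subscript to be of integer type, and `tmpWork` in the reproducer is double precision, so the read relies on an extension. A minimal sketch of the same kind of read with an integer index array follows. The names `scatter_read` and `read_scattered` are hypothetical, and an internal file stands in for unit 1001.

```fortran
module scatter_read
  implicit none
contains
  ! Reads size(idx) values from text and stores the k-th one in dest(idx(k)).
  ! The entries of idx must be distinct and within the bounds of dest.
  subroutine read_scattered(text, idx, dest)
    character(len=*), intent(in) :: text
    integer, intent(in) :: idx(:)
    double precision, intent(inout) :: dest(:)
    read(text, *) dest(idx)     ! integer vector subscript in the input list
  end subroutine read_scattered
end module scatter_read
```

For the reproducer, declaring `tmpWork` as an integer array, or converting it once with `nint(tmpWork(1:n))`, would give a conforming subscript; whether that changes the reported run-time check failure is not something this sketch can show.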

 


Seeking suggestions on OpenMP copyin of module variables...

Hi,

I am trying to hybridize a pure-MPI code with OpenMP. I am not an expert in it, and I face a problem with copyin of variables that are initialized in modules.

The arrays are initialized in 2 different modules, and they are then updated and used to generate the output array in a do loop. Therefore, I have to copy them in to the OMP threads. However, these arrays are private in the modules, while the OMP part is in a subroutine outside the modules, and the calling relationships are a bit complicated.

I would really appreciate any suggestion on how to solve this problem. Thanks a lot in advance!

Regards,

Liu

!---------------------------------------------------------------------------------------

The following is the structure of the code:

!---------------------------------------------------------------------------------------

subroutine main()

      use module1

      call initialize_module1 ! In module 1, a list of arrays are given initial values.

      call step1()

      call finalize_module1

end subroutine main

subroutine step1()

      use module2

      call initialize_module2 ! In module 2, a list of arrays are also given initial values...

      do loop (very big --> OpenMP) ! Here I think I have to copyin the arrays initialized in both module1 and module2, because in step2 they are revalued...

            call step2()

      end do

      call finalize_module2

end subroutine step1

subroutine step2()

      use module1

      use module2

      call work_module1()

      call work_module2()

      use_the_arrays_and_calculate_some_ouput_arrays

end subroutine step2

 

module module1

      private array11, array12, ...

      contains

      subroutine initialize_module1() ! Give initial values of array11, array12, etc

      subroutine work_module1() ! Here array11, array12 are revalued...

      subroutine finalize_module1()

end module1

module module2

      private array21, array22, ...

      contains

      subroutine initialize_module2()

      subroutine work_module2() ! revalue array21, array22,...

      subroutine finalize_module2()

end module2
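Since `copyin` applies to threadprivate variables and must be able to name them, one option is to do the copy inside the module that owns the private arrays. The sketch below is only an illustration under that assumption. `module1_sketch`, `broadcast_module1` and `first_value` are invented names, and the array size is arbitrary.

```fortran
module module1_sketch
  implicit none
  private
  public :: initialize_module1, broadcast_module1, work_module1, first_value
  integer, parameter :: n = 4
  double precision :: array11(n)
  ! Each thread gets its own copy of array11.
  !$omp threadprivate(array11)
contains
  subroutine initialize_module1()
    array11 = 1.0d0            ! called outside parallel regions
  end subroutine initialize_module1

  ! Copies the master thread's array11 into every thread's private copy.
  subroutine broadcast_module1()
    !$omp parallel copyin(array11)
    !$omp end parallel
  end subroutine broadcast_module1

  subroutine work_module1(i)
    integer, intent(in) :: i
    array11 = array11 + i      ! updates the calling thread's copy only
  end subroutine work_module1

  double precision function first_value()
    first_value = array11(1)
  end function first_value
end module module1_sketch
```

In `step1` this would be called as `call broadcast_module1()` just before the parallel loop. As far as I know, the threadprivate copies only keep their values between two parallel regions when the number of threads is the same and dynamic adjustment of threads is off, so that is worth checking in the real code.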

If the actual argument is scalar, the dummy argument shall be scalar

Following Steve Lionel's advice to use -warn interfaces, I got this message for the code I am compiling:

math.f(2079): error #8284: If the actual argument is scalar, the dummy argument shall be scalar unless the actual argument is of type character or is an element of an array that is not assumed shape, pointer, or polymorphic.   [IPAR]

 

In my subroutine fzero I have a vector of integer parameters, ipar. The person who wrote the code declared

 dimension rpar(*),ipar(*)

 

How should I change this so that it functions properly, not just to silence the compiler?
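The message says that the call at line 2079 passes a scalar where the callee declares an array such as `ipar(*)`. Assuming the callee really does index into `rpar` and `ipar`, one conforming option is to hold the values in one-element arrays on the caller's side. The names below (`par_demo`, `use_params`, `call_with_scalars`) are stand-ins, not the routines from math.f.

```fortran
module par_demo
  implicit none
contains
  ! Stand-in for a legacy routine with assumed-size parameter arrays.
  subroutine use_params(rpar, ipar, total)
    double precision, intent(in) :: rpar(*)
    integer, intent(in) :: ipar(*)
    double precision, intent(out) :: total
    total = rpar(1) * ipar(1)
  end subroutine use_params

  ! Caller that only has scalars: wrap them in one-element arrays.
  double precision function call_with_scalars(r, i) result(total)
    double precision, intent(in) :: r
    integer, intent(in) :: i
    double precision :: rpar(1)
    integer :: ipar(1)
    rpar(1) = r
    ipar(1) = i
    call use_params(rpar, ipar, total)
  end function call_with_scalars
end module par_demo
```

If the callee also writes to `ipar`, the caller needs to copy the array element back into its scalar after the call.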

 

Unspecified compile time error with function returning arrays

Hi there

here is the example:

Module Mod_1
  Type :: T_1
    integer :: n
    Integer, allocatable :: a(:)
  contains
    Procedure, Pass :: Init => SubInit
    Procedure, PAss :: getA => FunGetA
  End type T_1
  Interface
    Module Subroutine SubInit(this,n)
      Class(T_1), Intent(Inout) :: this
      Integer, Intent(in) :: n
    End Subroutine
    Module Function FunGetA(this)
      Class(T_1), Intent(Inout) :: this
      Integer, Dimension(size(this%a)) :: FunGetA
    End Function
  End Interface
End Module Mod_1

Submodule(Mod_1) Routines
contains
  Module Procedure SubInit
    Implicit None
    allocate(this%a(n))
  end Procedure
  Module Procedure FunGetA
    Implicit none
    if(allocated(this%a)) Then
      FunGetA=this%a
    End if
  End Procedure
End Submodule Routines

Program Test
  use Mod_1
  Implicit None
  Type(T_1) :: TA
  Integer, allocatable :: b(:)
  call TA%Init(5)
  b=TA%getA()
End Program Test

When compiling, it crashes with

SubMod_1.f90: catastrophic error: **Internal compiler error: segmentation violation signal raised** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.
compilation aborted for SubMod_1.f90 (code 1)
make: *** [makefile:31: SubMod_1.o] Error 1

The only workaround is to set the dimension specification in the interface definition of FunGetA to an integer value (e.g. 10). Setting it to this%n causes the same crash.

Since I usually prefer subroutines I am not an expert in functions, so I might do something wrong here. But if it were obviously wrong I would expect a more informative compiler message.

Any ideas?

Thanks

Karl

for_emit_diagnostic in backtrace

When I debug some problem like a segfault in a fortran program, the first three frames in the backtrace (0-2) always point to some functions in the intel library libifcoremt.so and in the system library libpthread.so, and only in frame 3, the actual cause in my code is shown:

raise ()                      libpthread-2.12.so             0x0000003DE2A0F6AB
for__issue_diagnostic ()      libifcoremt.so.5               0x00002AFBB1E83348
for_emit_diagnostic ()	      libifcoremt.so.5               0x00002AFBB1E83913
somefunc (somepar=somevalue)  somefile.f90:1364	somebinary   0x00000000029EAA3B

I am not completely sure but I believe it has not always been like that, but I don't know what could have changed in my setup.

Can someone help me figure out what is going on here, and whether there's a possibility to make _my_ relevant code become frame 0 again?

Thread Topic: Help Me

Memory Leak in OpenMP Task parallel application with Ifort 17

I am trying to parallelize an algorithm using DAG scheduling via OpenMP tasks. Many runs are killed by the Linux kernel due to out-of-memory after calls to the parallelized code, although the allocated memory is only 1% of the server's main memory. This happens only if I use the Intel compilers from 2015, 2016, or even the new 2017 edition.

Here is a small example building the same task dependency graph as the algorithm crashing:

    PROGRAM OMP_TASK_PROBLEM
        IMPLICIT NONE

        INTEGER M, N
        PARAMETER(M = 256, N=256)
        DOUBLE PRECISION X(M,N)

        X(1:M,1:N) = 0.0D0

        CALL COMPUTE_X(M, N, X, M)

        ! WRITE(*,*) X (1:M, 1:N)

    END PROGRAM


    SUBROUTINE COMPUTE_X(M,N, X, LDX)
        IMPLICIT NONE
        INTEGER M, N, LDX
        DOUBLE PRECISION X(LDX, N)
        INTEGER K, L, KOLD, LOLD


        !$omp parallel default(shared)
        !$omp master
        L = 1
        DO WHILE ( L .LE. N )
            K = M
            DO WHILE (K .GT. 0)
                IF ( K .EQ. M .AND. L .EQ. 1) THEN
                    !$omp  task depend(out:X(K,L)) firstprivate(K,L) default(shared)
                    X(K,L) = 0
                    !$omp end task
                ELSE IF ( K .EQ. M .AND. L .GT. 1) THEN
                    !$omp  task depend(out:X(K,L)) depend(in:X(K,LOLD)) firstprivate(K,L,LOLD) default(shared)
                    X(K,L) = 1 + X(K,LOLD)
                    !$omp end task
                ELSE IF ( K .LT. M .AND. L .EQ. 1) THEN
                    !$omp  task depend(out:X(K,L)) depend(in:X(KOLD,L)) firstprivate(K,L,KOLD) default(shared)
                    X(K,L) = 2 + X(KOLD,L)
                    !$omp end task
                ELSE
                    !$omp  task depend(out:X(K,L)) depend(in:X(KOLD,L),X(K,LOLD)) firstprivate(K,L,KOLD, LOLD) default(shared)
                    X(K,L) = X(KOLD, L) + X(K,LOLD)
                    !$omp end task
                END IF

                KOLD = K
                K = K - 1
            END DO

            LOLD = L
            L = L + 1
        END DO
        !$omp end master
        !$omp taskwait
        !$omp end parallel
    END SUBROUTINE

After compiling it using `ifort -qopenmp -g omp_test.f90` and running it via `valgrind` it reports:

==23255== 1,048,576 bytes in 1 blocks are possibly lost in loss record 22 of 27
==23255==    at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==23255==    by 0x516AF47: bget(kmp_info*, long) (kmp_alloc.c:741)
==23255==    by 0x516AC6D: ___kmp_fast_allocate (kmp_alloc.c:2012)
==23255==    by 0x51CFAC7: __kmp_task_alloc (kmp_tasking.c:997)
==23255==    by 0x51CFA36: __kmpc_omp_task_alloc (kmp_tasking.c:1134)
==23255==    by 0x40367E: compute_x_ (omp_task_problem.f90:31)
==23255==    by 0x51DC412: __kmp_invoke_microtask (in /scratch/software/intel-2017/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin/libiomp5.so)
==23255==    by 0x51AC186: __kmp_invoke_task_func (kmp_runtime.c:7055)
==23255==    by 0x51AD229: __kmp_fork_call (kmp_runtime.c:2361)
==23255==    by 0x5184EE7: __kmpc_fork_call (kmp_csupport.c:339)
==23255==    by 0x4034D3: compute_x_ (omp_task_problem.f90:24)
==23255==    by 0x4030CC: MAIN__ (omp_task_problem.f90:10)
==23255==
==23255== 1,048,576 bytes in 1 blocks are possibly lost in loss record 23 of 27
==23255==    at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==23255==    by 0x516AF47: bget(kmp_info*, long) (kmp_alloc.c:741)
==23255==    by 0x516AC6D: ___kmp_fast_allocate (kmp_alloc.c:2012)
==23255==    by 0x51CE30A: __kmp_add_node (kmp_taskdeps.cpp:204)
==23255==    by 0x51CE30A: __kmp_process_deps (kmp_taskdeps.cpp:320)
==23255==    by 0x51CE30A: __kmp_check_deps (kmp_taskdeps.cpp:365)
==23255==    by 0x51CE30A: __kmpc_omp_task_with_deps (kmp_taskdeps.cpp:523)
==23255==    by 0x403A11: compute_x_ (omp_task_problem.f90:35)
==23255==    by 0x51DC412: __kmp_invoke_microtask (in /scratch/software/intel-2017/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin/libiomp5.so)
==23255==    by 0x51AC186: __kmp_invoke_task_func (kmp_runtime.c:7055)
==23255==    by 0x51AD229: __kmp_fork_call (kmp_runtime.c:2361)
==23255==    by 0x5184EE7: __kmpc_fork_call (kmp_csupport.c:339)
==23255==    by 0x4034D3: compute_x_ (omp_task_problem.f90:24)
==23255==    by 0x4030CC: MAIN__ (omp_task_problem.f90:10)

 

The line numbers of the `compute_x_` function in the backtrace correspond to the `!$omp task` statements. These memory leaks accumulate rapidly until the program crashes.

Using gcc-6.2 for this, `valgrind` ends up with:

    ==21246== LEAK SUMMARY:
    ==21246==    definitely lost: 0 bytes in 0 blocks
    ==21246==    indirectly lost: 0 bytes in 0 blocks
    ==21246==      possibly lost: 8,640 bytes in 15 blocks
    ==21246==    still reachable: 4,624 bytes in 4 blocks
    ==21246==         suppressed: 0 bytes in 0 blocks
    ==21246==

where the leaks are only from the first initialization of the OpenMP runtime system.

So my question is: why does the Intel compiler / Intel OpenMP runtime system produce these leaks? Or, alternatively, is there an error in the way I have designed the task parallelism?

 

Thread Topic: Help Me