Channel: Intel® Fortran Compiler for Linux* and macOS*

Intel vectorization not efficient


Dear Intel developers,

I have a piece of Fortran code where my program spends a lot of time:

k=0

id = 1

do j = start, end  
  do i = 1, ns(j)
     k = k + 1  
     if(selectT(lx00(i), j, id) > 1.00) &
      tco(k) = 10.0 
  end do
end do

I'm using intel/cs-xe-2012 on an Intel Xeon E5645. I compiled with -O3 -ip -ipo -xHost -vec-report=3. The compiler reports that the nested loop is vectorized, but the execution time of that piece of code is the same as without vectorization. I tried to linearize selectT, without any improvement. I also tried to build a linearized "truth table":

do j = start, end  
  do i = 1, ns(j)
     k = k + 1   
     tco(k) = 10.0*select_cond(offset + lx00(i))
  end do
end do
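To make the truth-table variant concrete, here is a minimal self-contained sketch of how such a table could be built and used. The array sizes, the test data, the names nl and nj, and the choice offset = (j-1)*nl are assumptions made up for illustration; they are not taken from my real code. Note that this variant writes 0.0 to tco(k) when the condition is false, whereas the original loop leaves tco(k) untouched, so the two only agree if tco starts out as zero.

```fortran
program truth_table_sketch
  implicit none
  ! Hypothetical sizes and data, chosen only for illustration.
  integer, parameter :: nl = 4, nj = 2, id = 1
  real    :: selectT(nl, nj, 1)
  real    :: select_cond(nl*nj)      ! linearized "truth table": 1.0 or 0.0
  real    :: tco(nl*nj)
  integer :: ns(nj), lx00(nl)
  integer :: i, j, k, l, offset

  ns   = [4, 3]
  lx00 = [3, 1, 4, 2]                ! indirect index into the first dimension
  tco  = 0.0

  ! Fill selectT with deterministic test values.
  do j = 1, nj
    do l = 1, nl
      selectT(l, j, id) = 0.5 * real(l + j)
    end do
  end do

  ! Build the table once, outside the hot loop.
  do j = 1, nj
    do l = 1, nl
      select_cond((j - 1)*nl + l) = merge(1.0, 0.0, selectT(l, j, id) > 1.0)
    end do
  end do

  ! Hot loop: no branch, only a gather through lx00(i).
  k = 0
  do j = 1, nj
    offset = (j - 1)*nl              ! assumed layout of the linearized table
    do i = 1, ns(j)
      k = k + 1
      tco(k) = 10.0 * select_cond(offset + lx00(i))
    end do
  end do

  print *, tco(1:k)
end program truth_table_sketch
```

The branch is gone from the inner loop, but the load from select_cond still goes through lx00(i), so the access remains a gather.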

Do you have any idea how to get good vectorization here? I suspect the indirect addressing through lx00(i) breaks the vectorization, but it is unavoidable.

