*Serial Processing Programming [#d576a6ab]

This page summarizes [[porting your programs from the former system to the current system>Porting your programs]].

**Compiling Command [#cbaa47fc]
--for Fortran programs
--for C programs
--for C++ programs

We recommend the Intel compiler, which can achieve good performance on Xeon processors.
The GNU compiler can also be used.

**Compiler Options [#cb56fd9a]
-Optimization Options
--Recommended optimization options
The following optimization options are recommended for programs that are free of bugs.
---eic or eich
 -O3 -xAVX
---eicp
 -O3 -xCORE-AVX2

--Optimization Options
|-O0|Disables all optimizations|
|-O1|Enables optimizations for speed and disables some optimizations that increase code size and affect speed.|
|-O2|Enables optimizations for speed. This is the generally recommended optimization level (default).|
|-O3|Performs O2 optimizations and enables more aggressive loop transformations.|

--Code Generation Options
|-xAVX|May generate AVX instructions|
|-xCORE-AVX2|May generate AVX2 instructions|
The -xCORE-AVX2 target includes FMA (Fused Multiply-Add) instructions.
An FMA instruction calculates a multiply-add expression of the form a * b + c in a single instruction.

---eic and eich support the -xAVX option.
---eicp supports the -xAVX and -xCORE-AVX2 options.
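
As a minimal sketch of the kind of expression FMA targets, the loop below performs one multiply and one add per element. The program name fma1 and the array size are illustrative assumptions. Whether FMA instructions are actually generated depends on the target option (for example -O3 -xCORE-AVX2 on eicp) and on the optimizer.

```fortran
      program fma1
      integer n, i
      parameter (n = 8)
      real*8 a, x(n), y(n)
      a = 2.0d0
      do i = 1, n
         x(i) = dble(i)
         y(i) = 1.0d0
      end do
      ! each iteration is one multiply followed by one add
      ! (a*x + y), the pattern an FMA instruction can do in one step
      do i = 1, n
         y(i) = a * x(i) + y(i)
      end do
      write(6,*) 'y(n) = ', y(n)
      end
```

The same source can be compiled with -xAVX on eic or eich. In that case the multiply and the add are expected to be issued as separate instructions.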

--Floating-Point Operation Options
|-no-prec-div|[Improves performance] Enables optimization of floating-point divides|
|-fp-model fast [=1/2]|[Improves performance] Enables more aggressive optimizations on floating-point data|
|-fp-model precise|[Improves precision] Disables optimizations that are not value-safe on floating-point data|
Default: -fp-model fast=1
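
The following sketch illustrates why some floating-point optimizations are not value-safe: floating-point addition is not associative, so the order of operations can change the result. The program name fpsum and the chosen values are illustrative assumptions. When each sum is evaluated as written in IEEE double precision, the first is 1.0 and the second is 0.0, because 1.0 is lost when it is added to -1.0d16 first.

```fortran
      program fpsum
      real*8 a, b, c
      a = 1.0d16
      b = -1.0d16
      c = 1.0d0
      ! mathematically both sums equal 1, but rounding depends on
      ! the order in which the additions are performed
      write(6,*) '(a + b) + c = ', (a + b) + c
      write(6,*) 'a + (b + c) = ', a + (b + c)
      end
```

Options such as -fp-model fast allow the compiler to reorder operations like these, so results may differ slightly from a build with -fp-model precise.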

--Debug Options
---ifort Only
|-traceback -g|When a severe error occurs, the source file, routine name, and line number are displayed along with the call stack hexadecimal addresses (program counter trace).|
|-traceback -g -check bounds|Checks array subscript and character substring expressions against their bounds at run time.|
|-traceback -g -fpe0|If a floating-point invalid, divide-by-zero, or overflow exception occurs, execution is aborted.|
(*) Specifying -g turns off -O2 and makes -O0 the default unless -O2 is explicitly specified on the same command line.
&br;Debug options may slow down your programs, so remove them once debugging is done.
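
To try the bounds check, a small test program with a deliberate bug can be used. The sketch below is an assumption for illustration (the name bounds1 and the file name bounds1.f are hypothetical): the loop writes one element past the end of the array. When it is compiled with ifort -traceback -g -check bounds bounds1.f, the run is expected to stop with a run-time error at the assignment when i reaches 6. Without the check, the out-of-range write may go unnoticed.

```fortran
      program bounds1
      integer n, i
      parameter (n = 5)
      integer a(n)
      ! deliberate bug: the loop runs one element past the end of a
      do i = 1, n + 1
         a(i) = i
      end do
      write(6,*) 'a(n) = ', a(n)
      end
```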

**Specific Memory Model [#zd4b6f2b]
By default, the compiler restricts code and data to the first 2GB of address space.
If you do not use the appropriate memory model and dynamic library options, linking fails with an error message in this format:

 relocation  truncated  to  fit: R_X86_64_32S against  `.bss'
 relocation  truncated  to  fit: R_X86_64_32S  against  `.bss'
Specifying -mcmodel=medium or -mcmodel=large also sets the -shared-intel option.
|-mcmodel=small(default)|Tells the compiler to restrict code and data to the first 2GB of address space. |
|-mcmodel=medium|Tells the compiler to restrict code to the first 2GB; it places no memory restriction on data. |
|-mcmodel=large|Places no memory restriction on code or data. |
|-shared-intel|This option causes Intel-provided libraries to be linked in dynamically.|
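
The sketch below is one way to reproduce the situation, assuming a statically allocated array larger than 2GB (the name big1, the file name big1.f, and the array size are illustrative assumptions). With the default small model, linking is expected to fail with a relocation error like the one above. With ifort -mcmodel=medium big1.f it is expected to link.

```fortran
      program big1
      integer n
      parameter (n = 400000000)
      real*8 a(n)
      ! a common block makes the array static data (about 3.2 GB),
      ! which exceeds the 2GB limit of the default small model
      common /bigdat/ a
      a(1) = 1.0d0
      a(n) = 2.0d0
      write(6,*) 'a(1) + a(n) = ', a(1) + a(n)
      end
```

Only the first and last elements are touched, so the program itself runs quickly once it links.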

**Math Kernel Library (MKL) [#t799fdd4]
-MKL provides the following math processing routines.
--Sparse solvers
--Vector Math (VML)
--Vector Statistics (VSL)
--Fast Fourier Transform
--FFTW interface for Fast Fourier Transform

-How to link the serial or multi-threaded version
--Serial version
 $ ifort -o a.out -mkl=sequential
--Multi-threaded version
 $ ifort -o a.out -mkl=parallel

-ex.1) Vector inner product calculation using the SDOT routine
 $ cat test1.f
       program test1
       real x(10), y(10), sdot, res
       integer n, incx, incy, i
       external sdot
       ! take 5 elements: stride 2 through x, stride 1 through y
       n = 5
       incx = 2
       incy = 1
       do i = 1, 10
          x(i) = real(i)
          y(i) = 1.0e0
       end do
       res = sdot(n, x, incx, y, incy)
       print*,'SDOT = ', res
       end
 $ ifort  -O3 -xAVX test1.f  -mkl=sequential
 $ dplace ./a.out
 SDOT =    25.00000
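
To make the meaning of n, incx, and incy concrete, the following sketch computes the same inner product with an explicit loop instead of calling MKL. The program name dot1 and the loop variable k are illustrative assumptions, and the index formula shown applies to positive strides only. With the values above, the loop adds 1 + 3 + 5 + 7 + 9, which is 25 and agrees with the SDOT result.

```fortran
      program dot1
      real x(10), y(10), res
      integer n, incx, incy, i, k
      n = 5
      incx = 2
      incy = 1
      do i = 1, 10
         x(i) = real(i)
         y(i) = 1.0e0
      end do
      ! same sum as sdot(n, x, incx, y, incy) for positive strides:
      ! x(1)*y(1) + x(3)*y(2) + x(5)*y(3) + x(7)*y(4) + x(9)*y(5)
      res = 0.0e0
      do k = 0, n - 1
         res = res + x(1 + k*incx) * y(1 + k*incy)
      end do
      print*,'loop = ', res
      end
```

This version needs no -mkl option, which makes it a convenient cross-check when testing a new MKL link line.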

-ex.2) FFTW interface using the FFT in MKL
 $ cat test2.f
 .... FFTW source code ......
 $ ifort -O3 -xAVX test2.f -I${MKLROOT}/include/fftw  -mkl=sequential
 $ dplace ./a.out

**Time Functions [#sc735795]

The dclock function returns the elapsed time in seconds from 0:00 of the current day. The return value has the real(8) data type.
 real(8) time1, dclock
 time1 = dclock()
 $ cat test3.f
       program test3
       real*8 dclock, t1, t2
       ! measure the wall-clock time spent in sub
       t1 = dclock()
       call sub()
       t2 = dclock()
       write(6,*) "time :", t2 - t1
       end
       subroutine sub()
       ! stand-in workload: sleep for 3 seconds
       call system("sleep 3")
       end
 $ ifort -O3  -xAVX test3.f
 $ dplace ./a.out
  time :   3.01978499999677
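
dclock is an Intel compiler extension, so it may not be available with other compilers. As an alternative technique (not the method shown above), the sketch below measures elapsed time with the standard system_clock intrinsic. The program name clock1 and the summation loop used as a workload are illustrative assumptions.

```fortran
      program clock1
      integer*8 c1, c2, rate
      integer i
      real*8 s
      ! rate is the number of clock counts per second
      call system_clock(c1, rate)
      s = 0.0d0
      do i = 1, 10000000
         s = s + 1.0d0 / dble(i)
      end do
      call system_clock(c2)
      write(6,*) 'sum  :', s
      write(6,*) 'time :', dble(c2 - c1) / dble(rate)
      end
```

Printing the sum keeps the compiler from removing the loop as unused work, so the measured time reflects the computation.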

**Performance Analysis Tool [#rc555f9b]
Under Construction
