        Intel(R) Math Kernel Library User's Guide for Linux*
         Contents
1. libmkl_lapack64.so, libmkl.so, libvsl.so, libguide.so | libguide.a

Table 5 (continued). Intel MKL libraries are in the lib/em64t directory.

Function domain/Interface | Intel MKL, dynamic | Intel MKL, static | Linux, dynamic | Linux, static
Fourier Transform Functions | libmkl.so, libguide.so | libmkl_em64t.a, libguide.a | libpthread.so, libm.so (optionally) | libpthread.a, libm.a (optionally)
Interval Arithmetic | libmkl_lapack32.so and/or libmkl_lapack64.so, libmkl.so, libguide.so | libmkl_lapack.a, libmkl_em64t.a, libguide.a | libpthread.so | libpthread.a
Trigonometric Transform | libmkl.so, libvml.so, libguide.so | libmkl_em64t.a, libguide.a | libpthread.so | libpthread.a
Poisson Library | libmkl.so, libvml.so, libguide.so | libmkl_em64t.a, libguide.a | libpthread.so | libpthread.a

† Regardless of the function domain, when linking statically with libguide (discouraged), sometimes you may have to use the libguide version different from the one in the indicated directory (see Notes below).
‡ Not applicable.

Table 6. Link libraries for Itanium(R) 2-based applications by function domain

Intel MKL libraries are in the lib/64 directory.

Function domain/Interface | Intel MKL, dynamic | Intel MKL, static | Linux, dynamic | Linux, static
BLAS | libmkl.so | libmkl_ipf.a | libpthread.so | libpthread libguide
2. See Selecting libraries to link for your platform and function domain for specific recommendations on which libraries to link, depending on your Intel MKL usage scenario.

3.4 Selecting Libraries to Link for Your Platform and Function Domain

When using the link command (see Link command syntax and some examples), note that:

• libmkl_solver.a contains the sparse solver functions.
• libmkl_lapack.a, or libmkl_lapack32.so and libmkl_lapack64.so, contain the LAPACK functions.
• libmkl_ia32.a, libmkl_em64t.a, and libmkl_ipf.a contain the BLAS, Sparse BLAS, GMP, FFT/DFT, VML, VSL, interval arithmetic, Trigonometric Transform, and Poisson Library functions for IA-32, Intel(R) EM64T, and Intel(R) Itanium(R) processors, respectively.
• libmkl_lapack95.a and libmkl_blas95.a contain the LAPACK95 and BLAS95 interfaces, respectively. They are not included in the original distribution and must be built before you use the interface (see Fortran 95 interfaces and wrappers to LAPACK and BLAS for details on building the libraries).
• The libmkl.so file contains the dynamically loaded versions of these objects, except for VML/VSL, which are contained in libvml.so. In all cases, the appropriate libraries are loaded at run time.

The -lguide option links in the threading library libguide (see Intel MKL-specific linking recommendations).

If you want to use FFT/DFT, you may link in the Linux mathematics library libm by adding -lm.

In products for
3. 6.2f)\n", c.re, c.im);
    }

Appendix A Intel(R) Math Kernel Library Language Support

The following table lists the function domains that Intel(R) Math Kernel Library comprises, as well as programming language support for these domains.

Table 8. Intel MKL language support

Columns: Function Domain | Fortran 77 | Fortran 90/95 | C/C++

• Basic Linear Algebra Subprograms (BLAS): C/C++ support via CBLAS
• Sparse BLAS Level 1: C/C++ support via CBLAS
• Sparse BLAS Level 2 and 3
• LAPACK routines for solving systems of linear equations
• LAPACK routines for solving least-squares problems, eigenvalue and singular value problems, and Sylvester's equations
• Auxiliary and utility LAPACK routines
• PARDISO
• Other Direct and Iterative Sparse Solver routines
• Vector Mathematical Library (VML) functions
• Vector Statistical Library (VSL) functions
• Fourier Transform functions (DFT)
• Interval Solver routines
• Trigonometric Transform routines
• Fast Poisson, Laplace, and Helmholtz Solver (Poisson Library) routines

Appendix B Contents of the doc Directory

The table below shows the contents of the doc directory.

Table 9. Contents of the doc directory

File name: Comment
mklEULA.txt: Intel MKL license
Doc_Index.htm: Index of Intel MKL documentation
fftw2xmkl_notes
4. Intel(R) Math Kernel Library Language Support
Contents of the doc Directory
Index

Tables
What you need to know before you get started
High-level directory structure
Detailed Intel MKL directory structure
Link libraries for IA-32 applications by function domain
Link libraries for Intel(R) EM64T applications by function domain
Link libraries for Itanium(R) 2-based applications by function domain
How to avoid conflicts in the computation environment for your threading model
Intel MKL language support
Contents of the doc directory

1 Overview

Intel(R) Math Kernel Library (Intel(R) MKL) offers highly optimized, thread-safe math routines for science, engineering, and financial applications that require maximum performance.

About This Document

To get Intel(R) MKL reference information, see the Intel MKL Reference Manual, which describes routine functionality, parameters, interfaces and calling synt
5. microarchitecture

libvml.so: Library dispatcher for dynamic load of processor-specific VML kernels
libmkl_vml_def.so: VML part of default kernel
libmkl_vml_p4n.so: VML for Intel(R) Xeon(R) processor with Intel(R) EM64T
libmkl_vml_mc.so: VML for processors based on the Intel(R) Core(TM) microarchitecture
libmkl_ias.so: Interval arithmetic routines
libguide.so: Threading library for dynamic linking

lib/64: Contains all libraries for Intel(R) Itanium(R) 2-based applications

libmkl_ipf.a: Processor kernels for Intel(R) Itanium(R) 2 processor
libmkl_lapack.a: LAPACK routines and drivers
libmkl_solver.a: Sparse solver routines
libguide.a: Threading library for static linking
libmkl_lapack32.so: LAPACK routines and drivers, single precision data types
libmkl_lapack64.so: LAPACK routines and drivers, double precision data types
libguide.so: Threading library for dynamic linking
libmkl.so: Library dispatcher for dynamic load of processor-specific kernel
libmkl_i2p.so: Itanium(R) 2 processor kernel
libmkl_vml_i2p.so: Itanium(R) 2 processor VML kernel
libmkl_ias.so: Interval arithmetic routines
libvml.so: Library dispatcher for dynamic load of processor-specific VML kernel

Additionally, a number of interface libraries may be generated (see Using language-specifi
6. of two-dimensional arrays) are not power of two.

Using Intel MKL Memory Management

Intel(R) MKL has memory management software that controls memory buffers for use by the library functions. When a call is made to certain library functions (such as those in the Level 3 BLAS or DFTs), new buffers are allocated if no free buffers are currently available. These buffers are not deallocated until the program ends. If at some point your program needs to free memory, it may do so with a call to MKL_FreeBuffers(). If another call is then made to a library function that needs a memory buffer, the memory manager allocates the buffers again, and they remain allocated until either the program ends or the program deallocates the memory. The memory management software is turned on by default.

This behavior is by design and helps achieve better performance. However, some tools might report it as a memory leak. If you wish, you can release memory in your program by calling MKL_FreeBuffers(), or you can force memory to be released after each call by setting an environment variable.

To disable the memory management software, set the MKL_DISABLE_FAST_MM environment variable to any value, which causes memory to be allocated and freed from call to call. Disabling this feature negatively impacts the performance of routines such as the Level 3 BLAS, especially for small problem sizes.

Us
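The retain-and-reuse behavior described in the memory management section can be pictured with a toy buffer pool. The sketch below is only an illustration in plain C and does not reflect how Intel MKL is implemented; the names pool_acquire, pool_release, pool_free_all, and POOL_SLOTS are hypothetical. It shows why a buffer the program has finished with still counts as allocated until an explicit free-all call, which here plays the role that MKL_FreeBuffers() plays for the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SLOTS 8   /* hypothetical fixed pool size for this sketch */

typedef struct {
    void  *ptr;     /* cached buffer, NULL if the slot is empty */
    size_t size;    /* capacity of the cached buffer in bytes */
    int    in_use;  /* 1 while the buffer is handed out to a caller */
} slot_t;

static slot_t pool[POOL_SLOTS];

/* Return a cached idle buffer that is large enough, or allocate a new one. */
void *pool_acquire(size_t size)
{
    assert(size > 0);
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (pool[i].ptr != NULL && !pool[i].in_use && pool[i].size >= size) {
            pool[i].in_use = 1;
            return pool[i].ptr;
        }
    }
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (pool[i].ptr == NULL) {
            pool[i].ptr = malloc(size);
            if (pool[i].ptr == NULL)
                return NULL;
            pool[i].size = size;
            pool[i].in_use = 1;
            return pool[i].ptr;
        }
    }
    return NULL;  /* every slot is occupied */
}

/* Mark a buffer as idle so it can be reused; the memory stays allocated. */
void pool_release(void *p)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (pool[i].ptr == p)
            pool[i].in_use = 0;
    }
}

/* Give every idle buffer back to the system; returns how many were freed. */
int pool_free_all(void)
{
    int freed = 0;
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (pool[i].ptr != NULL && !pool[i].in_use) {
            free(pool[i].ptr);
            pool[i].ptr = NULL;
            pool[i].size = 0;
            freed++;
        }
    }
    return freed;
}
```

In this model, a leak checker that runs before pool_free_all() is called would see the idle buffers as still allocated, which mirrors the false memory-leak reports mentioned above.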
8. Linux*, it is necessary to link to the pthreads library by adding -lpthread. The pthread library is native to Linux, and Intel(R) MKL makes use of this library to support multi-threading. Depending on what functions you call, you may have to include this at the end of your link line (link order is important), regardless of whether you plan to use more than one thread.

Table 4, Table 5, and Table 6 illustrate the choice of libraries for dynamic and static linking and different architectures. To link in a library libxxx, include -lxxx in the link command.

Table 4. Link libraries for IA-32 applications by function domain

Intel MKL libraries are in the lib/32 directory.

Function domain/Interface | Intel MKL, dynamic | Intel MKL, static | Linux, dynamic | Linux, static
BLAS | libmkl.so, libguide.so | libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a
Sparse BLAS | libmkl.so, libguide.so | libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a
BLAS95 Interface | libmkl_blas95.a, libmkl.so, libguide.so | libmkl_blas95.a, libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a
CBLAS | libmkl.so, libguide.so | libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a
LAPACK | libmkl_lapack32.so and/or libmkl_lapack64.so | libmkl_lapack.a, libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a
libm
9. b[i*SIZE+j] = (double)(i*j);
        c[i*SIZE+j] = (double)0;
      }
    }
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
    printf("row\ta\tc\n");
    for (i = 0; i < 10; i++) {
      printf("%d:\t%f\t%f\n", i, a[i*SIZE], c[i*SIZE]);
    }
    delete [] a;
    delete [] b;
    delete [] c;
    }

Multi-core performance

You may find that an application using Intel MKL runs faster when the number of threads equals the number of sockets rather than the number of cores, and/or that parallel application performance is unstable on multi-core systems. Binding threads to the CPU cores may improve or stabilize the performance. This is done by setting an affinity mask for the threads. You can do this either with OpenMP facilities (recommended if available, for instance, via the KMP_AFFINITY environment variable when using Intel OpenMP) or with a system routine (see the example below).

Example of setting an affinity mask by operating system means using the Intel compiler

Suppose the system has two sockets with two cores each, and a 2-thread parallel application that calls Intel MKL DFT happens to run faster than with 4 threads, but the performance with 2 threads is very unstable. Put the following fragment into your code before the DFT call to bind the threads to cores on different sockets:

    // Set affinity mask
    #include <sched.h>
    #include <omp.h>
    #pragma omp parallel default(shared) private(mask)
    {
      unsigned long mask = (1
10. defects which may cause the product to deviate from published specifications. Current characterized software defects are available on request.

This document as well as the software described in it is furnished under license and may only be used or copied in accordance with the terms of the license. The information in this manual is furnished for informational use only, is subject to change without notice, and should not be construed as a commitment by Intel Corporation. Intel Corporation assumes no responsibility or liability for any errors or inaccuracies that may appear in this document or any software that may be provided in association with this document.

Except as permitted by such license, no part of this document may be reproduced, stored in a retrieval system, or transmitted in any form or by any means without the express written consent of Intel Corporation.

Developers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Improper use of reserved or undefined features or instructions may cause unpredictable behavior or failure in developer's software code when running on an Intel processor. Intel reserves these features or instructions for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from their unauthorized use.

BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino logo, Chips, Core Inside, Dialogic, EtherExpress, ETOX, Flas
11. i].im = (double)i * 2.0;
      b[i].re = (double)(n - i);
      b[i].im = (double)i * 2.0;
    }
    zdotc(&c, &n, a, &inca, b, &incb);
    printf("The complex dot product is: (%6.2f, %6.2f)\n", c.re, c.im);
    }

Example 2. Calling a Complex BLAS Level 1 Function from C++

    #include <stdio.h>
    #include "mkl.h"
    typedef struct { double re; double im; } complex16;
    extern "C" void zdotc(complex16*, int*, complex16*, int*, complex16*, int*);

    #define N 5
    int main()
    {
      int n, inca = 1, incb = 1, i;
      complex16 a[N], b[N], c;
      n = N;
      for (i = 0; i < n; i++) {
        a[i].re = (double)i;       a[i].im = (double)i * 2.0;
        b[i].re = (double)(n - i); b[i].im = (double)i * 2.0;
      }
      // The complex result is returned through the first argument.
      zdotc(&c, &n, a, &inca, b, &incb);
      printf("The complex dot product is: (%6.2f, %6.2f)\n", c.re, c.im);
      return 0;
    }

Example 3. Using the CBLAS Interface Instead of Calling BLAS Directly from C Programs

    #include <stdio.h>
    #include "mkl.h"
    typedef struct { double re; double im; } complex16;
    extern "C" void cblas_zdotc_sub(const int, const complex16*,
        const int, const complex16*, const int, const complex16*);

    #define N 5
    int main()
    {
      int n, inca = 1, incb = 1, i;
      complex16 a[N], b[N], c;
      n = N;
      for (i = 0; i < n; i++) {
        a[i].re = (double)i;       a[i].im = (double)i * 2.0;
        b[i].re = (double)(n - i); b[i].im = (double)i * 2.0;
      }
      // CBLAS takes scalars by value and returns the result through the last argument.
      cblas_zdotc_sub(n, a, inca, b, incb, &c);
      printf("The complex dot product is: (%6.2f
12. << (omp_get_thread_num() * 2));
      sched_setaffinity(0, sizeof(mask), &mask);
    }
    // Call MKL DFT routine

Then build your application and run it in 2 threads:

    env OMP_NUM_THREADS=2 ./a.out

Setting up data for better performance

To obtain the best performance, properly align the data arrays in your code. This section lists general conditions as well as FFT-specific ones. Additionally, in certain cases, a particular matrix format is recommended for LAPACK routines.

General recommendations on data alignment

To obtain the best performance with Intel(R) MKL, make sure the following conditions are met:

• Arrays are aligned on a 16-byte boundary.
• Leading dimension values (n*element_size) of two-dimensional arrays are divisible by 16.
• For two-dimensional arrays, leading dimension values divisible by 2048 are avoided.

LAPACK packed routines performance

The routines whose names contain the letters HP, OP, PP, SP, TP, UP in the matrix type and storage position (the second and third letters, respectively) operate on matrices in the packed format (see the "LAPACK Routine Naming Conventions" sections in the Intel MKL Reference Manual). Their functionality is strictly equivalent to that of the unpacked routines whose names contain the letters HE, OR, PO, SY, TR, UN in the corresponding positions, but their performance is significantly lower.

If the memory rest
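The general data alignment recommendations above can be sketched in plain C as follows. This is a minimal illustration, not part of Intel MKL: the helper names padded_ld and alloc_matrix are hypothetical, the leading dimension is interpreted as a row length measured in bytes (n*element_size), and the C11 function aligned_alloc is assumed to be available. The sketch pads the row length so that its size in bytes is a multiple of 16 but not a multiple of 2048, and allocates the array on a 16-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Smallest leading dimension (in elements) >= n whose size in bytes is
   divisible by 16 but not by 2048. elem_size must divide 16 (4, 8, 16). */
size_t padded_ld(size_t n, size_t elem_size)
{
    assert(elem_size > 0 && 16 % elem_size == 0);
    size_t bytes = (n * elem_size + 15) / 16 * 16;  /* round up to 16 bytes */
    if (bytes % 2048 == 0)
        bytes += 16;                                /* step off a 2048 multiple */
    return bytes / elem_size;
}

/* Allocate a rows x cols matrix of doubles on a 16-byte boundary.
   The padded leading dimension is returned through *ld.
   Release the result with free(). */
double *alloc_matrix(size_t rows, size_t cols, size_t *ld)
{
    assert(rows > 0 && cols > 0 && ld != NULL);
    *ld = padded_ld(cols, sizeof(double));
    /* The byte count is a multiple of 16, as C11 aligned_alloc expects. */
    return aligned_alloc(16, rows * *ld * sizeof(double));
}
```

With these helpers, element (i, j) of the matrix is stored at a[i * ld + j], and the padded ld is the value that would be passed as the leading dimension argument of a routine operating on the array.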
13. you may have to use the libguide version different from the one in the indicated directory (see Notes below).
‡ Not applicable.

Table 5. Link libraries for Intel(R) EM64T applications by function domain

Intel MKL libraries are in the lib/em64t directory.

Function domain/Interface | Intel MKL, dynamic | Intel MKL, static | Linux, dynamic | Linux, static
BLAS | libmkl.so, libguide.so | libmkl_em64t.a, libguide.a† | libpthread.so | libpthread.a
Sparse BLAS | libmkl.so, libguide.so | libmkl_em64t.a, libguide.a | libpthread.so | libpthread.a
BLAS95 Interface | libmkl_blas95.a, libmkl.so, libguide.so | libmkl_blas95.a, libmkl_em64t.a, libguide.a | libpthread.so | libpthread.a
CBLAS | libmkl.so, libguide.so | libmkl_em64t.a, libguide.a | libpthread.so | libpthread.a
LAPACK | libmkl_lapack32.so and/or libmkl_lapack64.so, libmkl.so, libguide.so | libmkl_lapack.a, libmkl_em64t.a, libguide.a | libpthread.so | libpthread.a
LAPACK95 Interface | n/a‡ | libmkl_lapack95.a, libmkl_lapack.a, libmkl_em64t.a, libguide.a | libpthread.so | libpthread.a
Sparse Solver | n/a‡ | libmkl_solver.a, libmkl_lapack.a, libmkl_em64t.a, libguide.a | libpthread.so | libpthread.a
Vector Math Library | libvml.so, libguide.so | libmkl_em64t.a, libguide.a | libpthread.so, libm.so | libpthread.a, libm.a
Vector Statistical Library | libmkl_lapack32.so and/or | libmkl_lapack.a, libmkl_em64t.a | libpthread.so, libm.so | libpthread.a, libm.a
14. 90 or ifort -c mkl_blas.f90

These files are in the include directory.

If you do not have administrator rights, do the following:

1. Copy the entire directory (mkl/9.0_beta/interfaces/blas95 or mkl/9.0_beta/interfaces/lapack95) into a user-defined directory <user_dir>.
2. Copy the corresponding file (mkl_blas.f90 or mkl_lapack.f90) from mkl/9.0_beta/include into the user-defined directory <user_dir>/blas95 or <user_dir>/lapack95, respectively.
3. Run one of the above commands in <user_dir>/blas95 or <user_dir>/lapack95 with an additional variable, for instance:

    make PLAT=lnx32 INTERFACE=mkl_blas.f90 lib
    make PLAT=lnx32 INTERFACE=mkl_lapack.f90 lib

The required library and the .mod file are then built and installed in the <user_dir>/blas95 or <user_dir>/lapack95 directory, respectively.

By default, the ifort compiler is assumed. You may change it with an additional parameter of make, FC=<compiler>. For instance:

    make PLAT=lnx64 FC=<compiler> lib

There is also a way to use the interfaces without building the libraries.

To delete the library from the building directory, use the following commands:

    make PLAT=lnx32 clean     (for IA-32)
    make PLAT=lnx32e clean    (for Intel(R) EM64T)
    make PLAT=lnx64 clean     (for Intel(R) Itanium(R) 2 processor platform)

3.7 Building custom shared objects

Custom shared objects enable red
15. H -lmkl_ia32 -lguide -lpthread -lm
static linking of user code myprog.c, BLAS, Sparse BLAS, GMP, VML/VSL, interval arithmetic, and FFT/DFT. The processor dispatcher calls the appropriate kernel for the system at run time.

    ifort myprog.f -L$MKLPATH -lmkl_solver -lmkl_lapack -lmkl_ia32 -lguide -lpthread
static linking of user code myprog.f, the sparse solver, and possibly other routines within Intel MKL (including the kernels needed to support the sparse solver).

    icc myprog.c -L$MKLPATH -lmkl -lguide -lpthread
dynamic linking of user code myprog.c, the BLAS or FFTs within Intel MKL.

Notes:

• When using the Intel(R) MKL shared libraries, do not forget to update the shared libraries environment path, that is, the system variable LD_LIBRARY_PATH, to include the libraries location. For example, if the Intel MKL libraries are in the /opt/intel/mkl/9.0/lib/32 directory, the following command line can be used (assuming a bash shell):

    export LD_LIBRARY_PATH=/opt/intel/mkl/9.0/lib/32:$LD_LIBRARY_PATH

• If you link libguide statically (discouraged) and use the Intel compilers, link in the libguide version that comes with the compiler, that is, use the -openmp option.
• If you link libguide statically but do not use the Intel compiler, link in the libguide version that comes with Intel MKL.
• If you use dynamic linking (libguide.so) of Intel MKL (recommended), make sure the LD
16. Intel(R) Math Kernel Library
User's Guide for Linux*

Copyright (C) 2006 Intel Corporation. All Rights Reserved.
Document Number: 314774-001US
World Wide Web: http://www.intel.com/cd/software/products/asmo-na/eng/perflib/mkl/index.htm

Disclaimer and Legal Information

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL(R) PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications.

Intel may make changes to specifications and product descriptions at any time, without notice.

Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

The software described in this document may contain software
17. _LIBRARY_PATH is defined so that exactly this version of libguide is found and used at run time.

For linking examples, see the Intel MKL support website at http://www.intel.com/support/performancetools/libraries/mkl/.

Using language-specific interfaces with Intel(R) MKL

The following interface libraries and modules may be generated by running the respective makefiles located in the interfaces folder.

File name: Comment
libmkl_blas95.a: Contains Fortran 95 wrappers for BLAS (BLAS95)
libmkl_lapack95.a: Contains Fortran 95 wrappers for LAPACK (LAPACK95)
libfftw2xc_gnu.a: Contains interfaces for FFTW version 2.x (C interface for GNU compiler) to call Intel MKL DFTI
libfftw2xc_intel.a: Contains interfaces for FFTW version 2.x (C interface for Intel(R) compiler) to call Intel MKL DFTI
libfftw2xf_gnu.a: Contains interfaces for FFTW version 2.x (Fortran interface for GNU compiler) to call Intel MKL DFTI
libfftw2xf_intel.a: Contains interfaces for FFTW version 2.x (Fortran interface for Intel compiler) to call Intel MKL DFTI
libfftw3xc_gnu.a: Contains interfaces for FFTW version 3.x (C interface for GNU compiler) to call Intel MKL DFTI
libfftw3xc_intel.a: Contains interfaces for FFTW version 3.x (C interface for Intel compiler) to call Intel MKL DFTI
libfftw3xf_gnu.a: Contains interfaces for FFTW version 3.x (Fortran interface for GNU compiler) to call Intel MKL DFTI
libfftw3xf_inte
18. ack32.so
    MKL_LAPACK64so = mkl_lapack64.so
    //
    // Serial or parallel mode
    //   YES - single-threaded
    //   NO  - multi-threaded
    //   OMP - control by OMP_NUM_THREADS
    MKL_SERIAL = YES
    //
    // Input parameters check
    //   ON  - checkers are used (default)
    //   OFF - checkers are not used
    MKL_INPUT_CHECK = ON

When any Intel(R) MKL function is first called, Intel MKL checks whether the configuration file exists and, if so, operates with the variables specified in it. The path to the configuration file is specified by the environment variable MKL_CFG_FILE. If this variable is not defined, the current directory is searched first, and then the directories specified in the PATH environment variable. If the Intel MKL configuration file does not exist, the library operates with default values of the variables (standard names of libraries, checkers on, non-threaded operation mode).

If a variable is not specified in the configuration file, or is specified incorrectly, the default value is used.

Redefining Names of Downloadable Dynamic Libraries

Below is an example of a configuration file that only redefines the library names:

    // SO redefinition
    MKL_X87so = matlab_x87.so
    MKL_SSE1so = matlab_sse1.so
    MKL_SSE2so = matlab_sse2.so
    MKL_SSE3so = matlab_sse2.so
    MKL_ITPso = matlab_ipt.so
    MKL_I2Pso = matlab_i2p.so

Coding Intel(R) Math Ke
19. aming SIMD Extensions 3 (SSE3)

libmkl_p4m.so: Kernel for processors based on the Intel(R) Core(TM) microarchitecture
libvml.so: Library dispatcher for dynamic load of processor-specific VML kernels
libmkl_vml_def.so: VML part of default kernel (Pentium(R), Pentium(R) Pro, Pentium(R) II processors)
libmkl_vml_p3.so: VML part of Pentium(R) III processor kernel
libmkl_vml_p4.so: VML part of Pentium(R) 4 processor kernel
libmkl_vml_p4p.so: VML for Pentium(R) 4 processor with Streaming SIMD Extensions 3 (SSE3)
libmkl_vml_p4m.so: VML for processors based on the Intel(R) Core(TM) microarchitecture
libmkl_ias.so: Interval arithmetic routines
libguide.so: Threading library for dynamic linking

lib/em64t: Contains all libraries for Intel(R) EM64T applications

libmkl_em64t.a: Optimized kernels for Intel(R) EM64T
libmkl_lapack.a: LAPACK routines and drivers
libmkl_solver.a: Sparse solver routines
libguide.a: Threading library for static linking
libmkl.so: Library dispatcher for dynamic load of processor-specific kernel
libmkl_lapack32.so: LAPACK routines and drivers, single precision data types
libmkl_lapack64.so: LAPACK routines and drivers, double precision data types
libmkl_def.so: Default kernel
libmkl_p4n.so: Kernel for Intel(R) Xeon(R) processor with Intel(R) EM64T
libmkl_mc.so: Kernel for processors based on the Intel(R) Core(TM)
20. atly help you get off to a good start. The table below summarizes some important things to think of before you start using Intel MKL.

Table 1. What you need to know before you get started

Target platform
Identify the processor inside your target machine:
• IA-32
• Processor with Intel(R) EM64T
• Intel(R) Itanium(R) processor family
Reason: When linking your application with the Intel MKL libraries, the directory corresponding to your particular architecture should be included in the link command (see Selecting libraries to link for your platform and function domain).

Mathematical problem
Identify all Intel MKL function domains that the problems you are solving require:
• BLAS
• Sparse BLAS
• LAPACK
• Sparse Solver routines
• Vector Mathematical Library functions
• Vector Statistical Library functions
• Fourier Transform functions
• Interval Solver routines
• Trigonometric Transform routines
• Fast Poisson, Laplace, and Helmholtz Solver routines
Reason: The function domain you intend to use determines the MKL libraries that your application must link with. For more information, see Selecting libraries to link for your platform and function domain.

Programming language
Though Intel MKL provides support for both Fortran and C/C++ programming, not all the function domains support a particular language environment, for example, C/C++ or Fortran 90/95. Identify the languages that your function domains support (see Appendix A, Intel MKL Language Support).
Rea
21. ax, as well as return values. However, many questions not answered in the Reference Manual arise when you try to call Intel MKL routines from your applications. For example, you need to know how the library is organized, how to configure Intel MKL for your particular platform and the problems you are solving, and how to compile and link your applications with Intel MKL. You also need to understand how to achieve the best performance and how to take advantage of Intel MKL threading and memory management. Other questions may deal with specifics of routine calls, for example, passing parameters in different programming languages or coding inter-language routine calls. You may be interested in ways of estimating and improving computation accuracy. These and similar issues make up Intel MKL usage information.

Purpose

This document focuses on the usage information needed to call Intel MKL routines from user's applications running on Linux*. Linux usage of Intel MKL has its particular features, which are described in this guide along with those that do not depend on a particular OS.

This guide contains usage information for plain (non-cluster) Intel MKL routines and functions, comprised in the function domains listed in Table 8 (in Appendix A). Usage information inherent to functions and routines available only with Intel MKL Cluster Edition will be included in the appropriate user's guide to be developed.

Audience

The guide is intended for Linux programmers 
22. NOTE: In the BLAS, Intel MKL provides both upper-case and lower-case names for the routines. Please see the Example in section 6.2.

An alternative to calling BLAS routines from a C-language program is to use the CBLAS interface.

CBLAS

CBLAS routines are provided as the C-style interface to the BLAS routines. Call CBLAS routines using regular C-style calls. When you use the CBLAS interface, the header file mkl.h simplifies program development, as it specifies enumerated values as well as prototypes of all the functions. The header determines whether the program is being compiled with a C++ compiler, and if it is, the included file will be correct for use with C++ compilation.

How to Call BLAS Functions That Return the Complex Values in C/C++ Code

You must be careful when handling a call from C to a complex BLAS function that returns complex values. The problem arises because these are Fortran functions, and complex return values are handled quite differently in the two languages, C and Fortran. However, because Fortran lets you call functions as though they were subroutines, there is a mechanism for returning the complex value correctly when the function is called from a C program. When a Fortran function is called as a subroutine, the return value shows up as the first parameter in the calling sequence. This feature can be exploited by the C programmer.

The following example fo
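The calling convention described above, in which the complex result comes back through the first parameter and every argument is passed by reference, can be sketched without Intel MKL at all. The function my_zdotc below is a hypothetical stand-in written in plain C for illustration; it is not the library's zdotc, and it assumes positive increments only. It computes the conjugated dot product, the sum of conj(x[i]) * y[i], and stores it where a C caller of a Fortran complex function would expect to find it.

```c
#include <assert.h>

typedef struct { double re; double im; } complex16;

/* Hypothetical stand-in that mimics a Fortran complex function called as a
   subroutine: the result is the first parameter, and all arguments are
   passed by reference. Only positive increments are handled. */
void my_zdotc(complex16 *result, const int *n,
              const complex16 *x, const int *incx,
              const complex16 *y, const int *incy)
{
    assert(*incx > 0 && *incy > 0);
    complex16 sum = {0.0, 0.0};
    for (int i = 0; i < *n; i++) {
        const complex16 a = x[i * *incx];
        const complex16 b = y[i * *incy];
        /* conj(a) * b = (a.re*b.re + a.im*b.im) + i*(a.re*b.im - a.im*b.re) */
        sum.re += a.re * b.re + a.im * b.im;
        sum.im += a.re * b.im - a.im * b.re;
    }
    *result = sum;
}
```

A caller declares a complex16 variable for the result and passes its address first, followed by the addresses of the length and increments, in the same shape as the zdotc calls in this section.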
23. c interfaces with Intel MKL

Selecting Between Static and Dynamic Linking

You can link your applications with Intel MKL libraries statically, using static library versions, or dynamically, using shared libraries.

Static linking

During static linking, all links are resolved at link time. Therefore, the behavior of statically built executables is absolutely predictable, as they do not depend upon a particular version of the libraries available on the system where the executables run. Such executables must behave exactly the same way as was observed during testing. The main disadvantage of static linking is that upgrading statically linked applications to higher library versions is troublesome and time-consuming, as you have to relink the entire application. Besides, static linking produces large-size executables and uses memory inefficiently, since if several executables are linked with the same library, each of them loads it into memory independently. However, this is hardly an issue for Intel MKL, used mainly for large-size problems. It matters only on shared-memory systems for executables having data size relatively small and comparable with the size of the executable.

Dynamic linking

During dynamic linking, resolving of some undefined symbols is postponed until run time. Dynamically built executables still contain undefined symbols along with lists of librari
24. cation with Intel Math Kernel Library

This chapter features linking of your applications with Intel Math Kernel Library (Intel MKL). The chapter discusses the library structure, as it determines much of the linking procedure; compares static and dynamic linking models; describes the general link line syntax to be used for linking with Intel MKL libraries; and finally provides comprehensive information, in tabular form, on the libraries that should be linked with your application for a particular platform and function domain. Generation of language-specific interface libraries and modules, as well as building of custom dynamic libraries, is also discussed.

Intel MKL Library Structure

The table below shows a high-level structure for Intel MKL after installation.

High-level directory structure

Directory | Comment
<mkl directory> | Main directory, for example: /opt/intel/mkl/9.0
<mkl directory>/doc | Documentation directory
<mkl directory>/man/man3 | Man pages for Intel MKL BLAS, Sparse BLAS, and LAPACK (without auxiliary functions)
<mkl directory>/examples | Source and data for examples
<mkl directory>/include | Contains INCLUDE files for both library routines and test and example programs
<mkl directory>/interfaces/blas95 | Contains f95 wrappers for BLAS and makefile to build the library
<mkl directory>/interfaces/lapack95 | Contains f95 wrappers for LAPACK and makefile to build th
25. ce problems (too many threads) and may also cause correctness problems if more than one copy is initialized. For more information, see Using Intel MKL parallelism.

Link Command Syntax

To link libraries with names libxxx.a or libxxx.so with your application, two options are available:

• Just list library names using relative or absolute paths, for example:

<ld> myprog.o /opt/intel/cmkl/9.0/lib/32/libmkl_scalapack.a /opt/intel/cmkl/9.0/lib/32/libmkl_blacs.a /opt/intel/cmkl/9.0/lib/32/libmkl_lapack.a /opt/intel/cmkl/9.0/lib/32/libmkl_ia32.a /opt/intel/cmkl/9.0/lib/32/libguide.so -lpthread

where <ld> is a linker, myprog.o is a user's object file, then Intel MKL libraries are listed, followed by the system library libpthread.

• In the link line, list library names (with absolute or relative paths, if needed) preceded with -L<path>, which indicates where to search for binaries. Discussion of linking with Intel MKL libraries employs this option.

To link with Intel MKL libraries, follow this general form of specifying the path and libraries in the link line:

-L<MKL path> [-lmkl_solver] [-lmkl_lapack95] [-lmkl_blas95] {[-lmkl_lapack] -lmkl_{ia32, em64t, ipf}, [-lmkl_lapack{32, 64}] -lmkl, -lvml} -lguide -lpthread [-lm]

NOTE: It is necessary to follow the order of listing libraries in the link command because Linux does not support multi-pass linking
26. e.a) for static linking (discouraged; please see Intel MKL-specific linking recommendations)

• dynamic link library (libguide.so) for linking dynamically to Intel MKL (recommended; please see Intel MKL-specific linking recommendations).

Directory structure in detail

The information in the table below shows the detailed structure of the architecture-specific directories of the library. For the detailed structure of the doc directory, see Appendix B.

Detailed Intel MKL directory structure

Directory/file | Contents
lib/32 | Contains all libraries for 32-bit applications
libmkl_ia32.a | Optimized kernels (BLAS, CBLAS, Sparse BLAS, GMP, FFTs, DFTs, VML, VSL, interval arithmetic) for 32-bit applications
libmkl_lapack.a | LAPACK routines and drivers
libmkl_solver.a | Sparse solver routines
libguide.a | Threading library for static linking
libmkl.so | Library dispatcher for dynamic load of processor-specific kernel
libmkl_lapack32.so | LAPACK routines and drivers, single precision data types
libmkl_lapack64.so | LAPACK routines and drivers, double precision data types
libmkl_def.so | Default kernel (Intel Pentium, Pentium Pro, and Pentium II processors)
libmkl_p3.so | Intel Pentium III processor kernel
libmkl_p4.so | Pentium 4 processor kernel
libmkl_p4p.so | Kernel for Intel Pentium 4 processor with Stre
27. e library

Directory | Comment
<mkl directory>/interfaces/fftw2xc | Contains wrappers for FFTW version 2.x (C interface) to call Intel MKL DFTI
<mkl directory>/interfaces/fftw2xf | Contains wrappers for FFTW version 2.x (Fortran interface) to call Intel MKL DFTI
<mkl directory>/interfaces/fftw3xc | Contains wrappers for FFTW version 3.x (C interface) to call Intel MKL DFTI
<mkl directory>/interfaces/fftw3xf | Contains wrappers for FFTW version 3.x (Fortran interface) to call Intel MKL DFTI
<mkl directory>/interfaces/fftc | Contains wrappers for FFT (C interface) to call Intel MKL DFTI
<mkl directory>/interfaces/fftf | Contains wrappers for FFT (Fortran interface) to call Intel MKL DFTI
<mkl directory>/tests | Source and data for tests
<mkl directory>/lib/32 | Contains static libraries and shared objects for IA-32 applications
<mkl directory>/lib/em64t | Contains static libraries and shared objects for applications running on processors with Intel EM64T
<mkl directory>/lib/64 | Contains static libraries and shared objects for Intel Itanium 2 processor
<mkl directory>/tools/builder | Contains tools for creating custom dynamically linkable libraries
<mkl directory>/tools/environment | Contains shell scripts to set environmental variables in the user shell
<mkl directory>/tools/support | C
28. e threaded execution environment, shows how to set up the number of threads and change it during run time, and gives recommendations on multi-core performance. The section also shows other ways to gain performance, for example, by proper data alignment.

Using Intel MKL parallelism

Intel MKL is threaded in a number of places: direct sparse solver, LAPACK (GETRF, POTRF, GBTRF, GEQRF, ORMQR, STEQR, BDSQR routines), all Level 3 BLAS, Sparse BLAS matrix-vector and matrix-matrix multiply routines for the compressed sparse row and diagonal formats, and all DFTs (except 1D transformations when DFTI_NUMBER_OF_TRANSFORMS=1 and sizes are not power of two).

There are situations in which conflicts can exist in the execution environment that make the use of threads in Intel MKL problematic. They are listed here with recommendations for dealing with them. First, a brief discussion of why the problem exists is appropriate.

If the user threads the program using OpenMP directives and uses the Intel compilers to compile the program, Intel MKL and the user program will both use the same threading library. Intel MKL tries to determine if it is in a parallel region in the program, and if it is, it does not spread its operations over multiple threads. But Intel MKL can be aware that it is in a parallel region only if the threaded program and Intel MKL are using the same threading library. If the user program is threaded by some other means, Intel MKL 
30. er to Setting up the number of threads.

After the installation of Intel MKL is complete, you can use the files mklvars32.sh, mklvarsem64t.sh, and mklvars64.sh in the tools/environment directory to set the environment variables INCLUDE and LD_LIBRARY_PATH in the user shell.

Customizing Intel MKL Using the Configuration File

The Intel MKL configuration file lets you customize several features of Intel MKL, namely:

• redefine names of downloadable dynamic libraries
• turn on/off checking of the input parameters for possible errors
• set Threaded/Non-Threaded operation mode.

The configuration file is mkl.cfg by default. The file contains several variables that can be changed. Below is an example of the configuration file containing all possible variables with default values:

//
// Default values for mkl.cfg file
//
// SO names for IA-32
MKL_X87so = mkl_def.so
MKL_SSE1so = mkl_p3.so
MKL_SSE2so = mkl_p4.so
MKL_SSE3so = mkl_p4p.so
MKL_VML_X87so = mkl_vml_def.so
MKL_VML_SSE1so = mkl_vml_p3.so
MKL_VML_SSE2so = mkl_vml_p4.so
MKL_VML_SSE3so = mkl_vml_p4p.so
// SO names for Intel(R) 64
MKL_EM64TDEFso = mkl_def.so
MKL_EM64TSSE3so = mkl_p4n.so
MKL_VML_EM64TDEFso = mkl_vml_def.so
MKL_VML_EM64TSSE3so = mkl_vml_p4n.so
// SO names for IA-64
MKL_I2Pso = mkl_i2p.so
MKL_VML_I2Pso = mkl_vml_i2p.so
// DLL names for LAPACK libraries
MKL_LAPACK32so = mkl_lap
31. es that provide definitions of the symbols. When the executable is loaded, final linking is done before the application starts running. If several dynamically built executables use the same library, the library loads to memory only once and the executables share it, thereby saving memory. Dynamic linking ensures consistency in using and upgrading libraries, as all the dynamically built applications share the same library. This way of linking enables you to separately update libraries and applications that use the libraries, which facilitates keeping applications up to date. The advantages of dynamic linking are achieved at the cost of run-time performance losses, as a part of linking is done at run time and every unresolved symbol has to be looked up in a dedicated table and resolved.

Making the choice

It is up to you to select whether to link in Intel MKL libraries dynamically or statically when building your application.

In most cases, users choose dynamic linking due to its strong advantages.

However, if you are developing applications to be shipped to a third party and want to ship nothing other than your application, you have to use static linking.

Intel MKL-specific linking recommendations

You are strongly encouraged to dynamically link in the Intel MKL threading library libguide. Linking to static OpenMP run-time libraries is not recommended, as it is very easy with layered software to link in more than one copy of them. This causes performan
32. formance and Memory

sample code demonstrates changing the number of threads during runtime using the omp_set_num_threads() routine:

#include "omp.h"
#include "mkl.h"
#include <stdio.h>
#define SIZE 1000
void main(int args, char *argv[]){
    double *a, *b, *c;
    a = new double [SIZE*SIZE];
    b = new double [SIZE*SIZE];
    c = new double [SIZE*SIZE];
    double alpha=1, beta=1;
    int m=SIZE, n=SIZE, k=SIZE, lda=SIZE, ldb=SIZE, ldc=SIZE, i=0, j=0;
    char transa='n', transb='n';
    for( i=0; i<SIZE; i++){
        for( j=0; j<SIZE; j++){
            a[i*SIZE+j]= (double)(i+j);
            b[i*SIZE+j]= (double)(i*j);
            c[i*SIZE+j]= (double)0;
        }
    }
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
    printf("row\ta\tc\n");
    for ( i=0;i<10;i++){
        printf("%d:\t%f\t%f\n", i, a[i*SIZE], c[i*SIZE]);
    }
    omp_set_num_threads(1);
    for( i=0; i<SIZE; i++){
        for( j=0; j<SIZE; j++){
            a[i*SIZE+j]= (double)(i+j);
            b[i*SIZE+j]= (double)(i*j);
            c[i*SIZE+j]= (double)0;
        }
    }
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
    printf("row\ta\tc\n");
    for ( i=0;i<10;i++){
        printf("%d:\t%f\t%f\n", i, a[i*SIZE], c[i*SIZE]);
    }
    omp_set_num_threads(2);
    for( i=0; i<SIZE; i++){
        for( j=0; j<SIZE; j++){
            a[i*SIZE+j]= (double)(i+j);
33. hFile, i386, i486, i960, iCOMP, InstantIP, Intel, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, IPLink, Itanium, Itanium Inside, MCS, MMX, MMX logo, Optimizer logo, OverDrive, Paragon, PDCharm, Pentium, Pentium II Xeon, Pentium III Xeon, Performance at Your Command, Pentium Inside, skoool, Sound Mark, The Computer Inside., The Journey Inside, VTune, Xeon, Xeon Inside and Xircom are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Other names and brands may be claimed as the property of others.

Copyright © 2006, Intel Corporation.

Version Information

Version | Version Information | Date

Contents

Appendix A
Appendix B
Table 1
Table 2
Table 3
Table 4
Table 5
Table 6
Table 7
Table 8
Table 9

Overview
Before You Begin
Linking Your Application with Intel Math Kernel Library
3.1 Intel MKL Library Structure
3.2 Selecting Between Static and Dyna
34. htm | FFTW 2.x Interface Support Technical User Notes
fftw3xmkl_notes.htm | FFTW 3.x Interface Support Technical User Notes
fft2dfti.pdf | Intel FFT to DFTI Wrappers Technical User Notes
Getting Started.htm | Getting Started with Intel MKL (this document)
Install.txt | Installation Guide
Readme.txt | Initial User Information
redist.txt | List of redistributable files
Release Notes.htm | Release Notes
Release Notes.txt | Release Notes (text format)
mklman.pdf | Intel MKL Reference Manual
mklman80_j.pdf | Intel MKL Reference Manual in Japanese
mklqref/index.htm | MKL Quick Reference
mklsupport.txt | Information on package number for customer support reference
mkluse.htm | Technical User notes for Intel MKL
vmlnotes.htm | General discussion of VML
vslnotes.pdf | General discussion of VSL
userguide.pdf | This document
35. ing one of these methods to release memory will not necessarily stop programs from reporting memory leaks and, in fact, may increase the number of such reports should you make multiple calls to the library, thereby requiring new allocations with each call. Memory not released by one of the methods described will be released by the system when the program ends.

Memory management has a restriction on the number of allocated buffers in each thread. Currently this number is 32. The maximum number of supported threads is 514. To avoid the default restriction, disable memory management.

Replacing Memory Functions

Intel MKL memory management uses standard C runtime memory functions to allocate or free memory. Since MKL 9.0, you can replace these memory functions with your own memory functions. The i_malloc.h header file contains all declarations required for an application developer to replace the memory allocation functions. This header file describes how memory allocation can be replaced in those Intel(R) libraries that support this feature.

5 Configuring Your Development Environment

This chapter explains how to configure your development environment for the use of Intel MKL and especially what features may be customized using the Intel MKL configuration file.

For information on how to set up the environment variables for threading, ref
36. kl.so, libguide.so

Function domain/Interface | Intel MKL libraries in lib/em64t directory, dynamic | Intel MKL libraries in lib/em64t directory, static | Linux libraries, dynamic | Linux libraries, static
LAPACK95 Interface | n/a | libmkl_lapack95.a, libmkl_lapack.a, libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a
Sparse Solver | n/a | libmkl_solver.a, libmkl_lapack.a, libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a
Vector Math Library | libvml.so, libguide.so | libmkl_ia32.a, libguide.a | libpthread.so, libm.so | libpthread.a, libm.a
Vector Statistical Library | libmkl_lapack32.so and/or libmkl_lapack64.so, libmkl.so, libvsl.so, libguide.so | libmkl_lapack.a, libmkl_ia32.a, libguide.a | libpthread.so, libm.so | libpthread.a, libm.a
Fourier Transform Functions | libmkl.so, libguide.so | libmkl_ia32.a, libguide.a | libpthread.so, libm.so (optionally) | libpthread.a, libm.a (optionally)
Interval Arithmetic | libmkl_lapack32.so and/or libmkl_lapack64.so, libmkl.so, libguide.so | libmkl_lapack.a, libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a
Trigonometric Transform | libmkl.so, libvml.so, libguide.so | libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a
Poisson Library | libmkl.so, libvml.so, libguide.so | libmkl_ia32.a, libguide.a | libpthread.so | libpthread.a

Regardless of the function domain, when linking statically with libguide (discouraged), sometimes 
37. l.a | Contains interfaces for FFTW version 3.x (Fortran interface for Intel compiler) to call Intel MKL DFTI
mkl95_blas.mod | Contains Fortran 95 interface module for BLAS (BLAS95)
mkl95_lapack.mod | Contains Fortran 95 interface module for LAPACK (LAPACK95)
mkl95_precision.mod | Contains Fortran 95 definition of precision parameters for BLAS95 and LAPACK95

Section Fortran 95 interfaces and wrappers to LAPACK and BLAS shows by example how these libraries and modules are generated.

Fortran 95 interfaces and wrappers to LAPACK and BLAS

Fortran 95 interfaces and wrappers are delivered as sources. The simplest way to use them is building corresponding libraries and linking them as user's libraries. To do this, you must have administrator rights. Provided the product directory is open for writing, the procedure is simple:

Go to the respective directory mkl/9.0_beta/interfaces/blas95 or mkl/9.0_beta/interfaces/lapack95 and type one of the following commands:

make PLAT=lnx32 lib      for IA-32
make PLAT=lnx32e lib     for Intel EM64T
make PLAT=lnx64 lib      for Intel Itanium 2 processor platform

As a result, the required library and a respective .mod file will be built and installed in the standard catalog of the release. The .mod files can also be obtained from files of interfaces, using the compiler command: ifort -c mkl_lapack.f  
38. may operate in multithreaded mode and the computations may be corrupted.

Here are several cases with recommendations depending on the threading model you employ:

Table 7 How to avoid conflicts in the computation environment for your threading model

Threading model | Discussion
You thread the program using OS threads (pthreads on Linux). | If more than one thread calls the library, and the function being called is threaded, it is important that you turn off Intel MKL threading. Set OMP_NUM_THREADS=1 in the environment. This is the default with Intel MKL except for the Direct Sparse Solver.
You thread the program using OpenMP directives and/or pragmas and compile the program using a compiler other than a compiler from Intel. | This is more problematic in that setting OMP_NUM_THREADS in the environment affects both the compiler's threading library and the threading library with Intel MKL. At this time, a safer approach is to set MKL_SERIAL=YES (or MKL_SERIAL=yes), which forces Intel MKL to serial mode regardless of the OMP_NUM_THREADS value. You can also obtain through your Intel Premier Support account a sequential version of MKL, containing no threading, which is the safest approach for this case.
There are multiple programs running on a multiple-cpu system, as in the case of a parallelized program running using MPI for commu
39. meters are not mandatory. For the simplest case, the command line could be make ia32, and the values of the remaining parameters will be taken by default. As a result, the mkl_custom.so library for IA-32 will be created, the functions list will be taken from the functions list file, and the standard MKL error handler xerbla will be used.

Another example for a more complex case is as follows:

make ia32 export=my_func_list.txt name=mkl_small xerbla=my_xerbla.o

In this case, the mkl_small.so library for IA-32 will be created, the functions list will be taken from the my_func_list.txt file, and the user's error handler my_xerbla.o will be used.

The process is similar for the Intel Itanium processor family applications and Intel EM64T applications.

Specifying list of functions

Entry points in the functions list file should be adjusted to interface. For example, Fortran functions get an underscore character "_" as a suffix when added to the library:

dgemm_
ddot_
dgetrf_

If selected functions have several processor-specific versions, they all will be included into the custom library and managed by dispatcher.

Managing Performance and Memory

The chapter discusses ways to obtain best performance with Intel MKL as well as Intel MKL memory management.

4.1 Performance

The section discusses Intel MKL parallelism and ways to avoid conflicts in th
40. mic Linking
3.3 Link Command Syntax
3.4 Selecting Libraries to Link for Your Platform and Function Domain
3.5 Linking Examples
3.6 Using language-specific interfaces with Intel MKL
3.7 Building custom shared objects
Managing Performance and Memory
4.1 Performance
4.1.1 Using Intel MKL parallelism
4.1.2 Multi-core performance
4.1.3 Setting up data for better performance
4.2 Using Intel MKL Memory Management
Configuring Your Development Environment
5.1 Customizing Intel MKL Using the Configuration File
5.2 Redefining Names of Downloadable Dynamic Libraries
Coding Intel Math Kernel Library Calls
6.1 Calling LAPACK, BLAS, and CBLAS Routines from C Language Environments
6.2 How to Call BLAS Functions That Return the Complex Values in C/C++ Code
41. nication in which each processor is treated as a node. | The threading software will see multiple processors on the system even though each processor has a separate process running on it. In this case OMP_NUM_THREADS should be set to 1. Again, however, the default behavior of Intel MKL is to run with one thread.

Setting up the number of threads

The OpenMP software responds to the environmental variable OMP_NUM_THREADS. The number of threads can be set in the shell the program is running in. To change the number of threads, in the command shell in which the program is going to run, enter:

export OMP_NUM_THREADS=<number of threads to use>

To force the library to serial mode, the environment variable MKL_SERIAL should be set to YES. It works regardless of the OMP_NUM_THREADS value.

If the variable OMP_NUM_THREADS is not set, Intel MKL software will run on the number of threads equal to 1. It is recommended that you always set OMP_NUM_THREADS to the number of processors you wish to use in your application.

Currently the default number of threads for Sparse Solver is the number of processors in the system.

Changing the number of processors for threading during run time

It is not possible to change the number of processors during run time using the environment variable OMP_NUM_THREADS. However, you can call OpenMP API functions from your program to change the number of threads during runtime. The following       Managing Per
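As a small illustration of the environment-variable behavior described above, the following C sketch reads OMP_NUM_THREADS and falls back to one thread when the variable is unset. The helper names are hypothetical and this is not how Intel MKL or the OpenMP runtime parses the variable internally; treating an unparsable value as 1 and capping the value at 1024 are assumptions made only for this example.

```c
#include <stdlib.h>

/* Hypothetical helper (not an Intel MKL or OpenMP function): interpret the
   text of OMP_NUM_THREADS. An unset value yields 1, matching the default
   described in the text. Returning 1 for an unparsable or out-of-range
   value, and the upper bound of 1024, are assumptions of this sketch. */
int threads_from_env_value(const char *value)
{
    char *end;
    long n;

    if (value == NULL || *value == '\0')
        return 1;
    n = strtol(value, &end, 10);
    if (*end != '\0' || n < 1 || n > 1024)
        return 1;
    return (int)n;
}

/* Read the setting from the environment of the current process. */
int requested_threads(void)
{
    return threads_from_env_value(getenv("OMP_NUM_THREADS"));
}
```

A program that wants to change the thread count later at run time could pass such a value to the OpenMP routine omp_set_num_threads(), which is the mechanism the text recommends over changing the environment variable.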
42. ontains a utility for reporting the package ID and license key information to Intel Premier Support
<mkl directory>/tools/plugins/com.intel.mkl.help | Contains an Eclipse plugin with Intel MKL Reference Manual in WebHelp format

Intel MKL separates IA-32 library versions, Intel EM64T versions, and versions for Intel Itanium 2 processor:

• The IA-32 versions are located in the lib/32 directory.
• Intel EM64T versions are located in the lib/em64t directory.
• Intel Itanium 2 processor versions are located in the lib/64 directory.

See detailed structure of these directories in Table 3.

Intel MKL consists of two parts:

• high-level libraries (LAPACK, sparse solver)
• processor-specific kernels in libmkl_ia32.a, libmkl_em64t.a, and libmkl_ipf.a.

High-level libraries

The high-level libraries are optimized without regard to the processor and can be used effectively on processors from Intel Pentium processor through Intel Core 2 Extreme processor family and Intel Itanium 2 processor.

Processor-specific kernels

Processor-specific kernels containing BLAS, Sparse BLAS, CBLAS, GMP, FFTs, DFTs, VSL, VML, interval arithmetic, Trigonometric Transform and Poisson Library routines are optimized for each specific processor.

Threading libraries

Threading software is supplied in two versions:

• separate library (libguid
43. r cdotc(), shows how this works. You call the function from Fortran as follows: result = cdotc( n, x, 1, y, 1 ). A call to this function as a subroutine looks like this: call cdotc( result, n, x, 1, y, 1 ). From C, this would look in this way: cdotc( &result, &n, x, &one, y, &one ), where the hidden parameter is exposed.

NOTE: Intel MKL has both upper-case and lower-case entry points in the BLAS, so all upper-case or all lower-case names are acceptable.

Using this form, you can call from C, and thus from C++, several level 1 BLAS functions that return complex values. However, it is still easier to use the CBLAS interface. For instance, you can call the same function using the CBLAS interface as follows:

cblas_cdotc_sub( n, x, 1, y, 1, &result )

The complex value comes back expressly in this case.

Example 1. Calling a Complex BLAS Level 1 Function from C

/* The following example illustrates a call from a C program to
   the complex BLAS Level 1 function zdotc(). This function computes
   the dot product of two double-precision complex vectors.

   In this example, the complex dot product is returned in the
   structure c. */

#include "mkl.h"
#define N 5
void main()
{
    int n, inca = 1, incb = 1, i;
    typedef struct{ double re; double im; } complex16;
    complex16 a[N], b[N], c;
    void zdotc();
    n = N;
    for( i = 0; i < n; i++ ){
        a[i].re = (double)i; a
44. riction is not too tight, use an unpacked routine for better performance. Note that in such a case, you need to allocate N^2/2 more memory than the memory required by a respective packed routine, where N is the problem size (the number of equations).

For example, solving a symmetric eigenproblem with an expert driver can be sped up by using an unpacked routine:

call dsyevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work, lwork, iwork, ifail, info)

where a is the dimension lda-by-n, which is at least N^2 elements, instead of

call dspevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work, iwork, ifail, info)

where ap is the dimension N*(N+1)/2.

Data alignment for FFT functions

There are additional conditions for the FFT functions.

On IA-32-based applications the addresses of the first elements of arrays and the leading dimension values, in bytes (n*element_size), of two-dimensional arrays should be divisible by cache line size (32 bytes for Pentium III processor, 64 bytes for Pentium 4 processor, and 128 bytes for Intel EM64T processor).

On Itanium-based applications the sufficient conditions are as follows:

• for the C-style FFT, the distance L between arrays that represent real and imaginary parts is not divisible by 64. The best case is when L=k*64 + 16
• leading dimension values, in bytes (n*element_size 
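The divisibility condition stated above for IA-32 (array addresses and leading dimensions in bytes divisible by the cache line size) can be checked with a few lines of C. This is a minimal sketch under stated assumptions: the helper names are hypothetical, they are not part of Intel MKL, and padded_leading_dim assumes the cache line size is a multiple of the element size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when addr is divisible by cache_line bytes. */
int is_cache_aligned(const void *addr, size_t cache_line)
{
    return ((uintptr_t)addr % cache_line) == 0;
}

/* Hypothetical helper: smallest leading dimension, in elements, that is
   at least n and whose size in bytes (ld * element_size) is divisible by
   cache_line. Assumes cache_line is a multiple of element_size. */
size_t padded_leading_dim(size_t n, size_t element_size, size_t cache_line)
{
    size_t per_line = cache_line / element_size;
    return ((n + per_line - 1) / per_line) * per_line;
}
```

For example, with 8-byte elements and a 64-byte cache line, a row of 100 elements would be padded to a leading dimension of 104 elements (832 bytes), while a row of 128 elements already satisfies the condition.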
45. rnel Library Calls

Though Intel(R) Math Kernel Library (Intel(R) MKL) provides support for Fortran and C/C++ programming, not all function domains support both interfaces (see Table "Intel MKL Language Support"). For example, LAPACK has no C interface.

To enable calling Intel MKL routines in your environment even if the function domain does not support it, mixed-language programming may be used.

See also Using language-specific interfaces with Intel MKL.

Calling LAPACK, BLAS, and CBLAS Routines from C Language Environments

The Intel MKL is provided in C and Fortran environments. Not all of the Intel MKL sub-libraries support both environments. In order to use these sub-libraries in both environments, you should observe some "rules".

LAPACK

When calling LAPACK routines from C language programs, make sure that you follow Fortran rules:

- Pass variables by address, as opposed to pass by value.
- Be sure to store your data Fortran-style, i.e. data stored in column-major rather than row-major order.

In LAPACK, routine names are upper-case only. Please see Example 1 in section 6.2.

BLAS

BLAS routines are Fortran-style routines. If you call BLAS routines from a C language program, you must follow the Fortran-style calling conventions:

- Pass variables by address as opposed to passing by value.
- Be sure to store data Fortran-style, i.e. data stored in column-major rather than row-major order.
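The column-major storage rule above can be made concrete with a small C sketch. The helpers colmajor_index and to_colmajor are hypothetical names introduced for illustration and are not part of Intel MKL. The sketch copies a matrix from the usual C row-major layout into the column-major layout that Fortran-style routines expect, with a leading dimension lda of at least the number of rows.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j), zero-based, in a column-major array
   whose leading dimension (distance between columns) is lda. */
size_t colmajor_index(size_t i, size_t j, size_t lda)
{
    return i + j * lda;
}

/* Copy an m-by-n matrix from row-major storage (C convention)
   into column-major storage (Fortran convention).
   Requires lda >= m and colmajor to hold at least lda*n elements. */
void to_colmajor(size_t m, size_t n,
                 const double *rowmajor, double *colmajor, size_t lda)
{
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            colmajor[colmajor_index(i, j, lda)] = rowmajor[i * n + j];
}
```

For a 2-by-3 matrix stored in C as {1, 2, 3, 4, 5, 6}, the column-major copy with lda = 2 is {1, 4, 2, 5, 3, 6}. The converted array, its dimensions, and lda would then be passed by address to a Fortran-style routine.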
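46. so libguide.a

Columns: Function domain/Interface; Intel MKL libraries in lib/64 directory (dynamic, static); Linux libraries (dynamic, static).

Sparse BLAS
    Intel MKL libraries, dynamic: libmkl.so, libguide.so
    Intel MKL libraries, static:  libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so
    Linux libraries, static:      libpthread.a

BLAS95 Interface
    Intel MKL libraries, dynamic: libmkl_blas95.a, libmkl.so, libguide.so
    Intel MKL libraries, static:  libmkl_blas95.a, libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so
    Linux libraries, static:      libpthread.a

CBLAS
    Intel MKL libraries, dynamic: libmkl.so, libguide.so
    Intel MKL libraries, static:  libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so
    Linux libraries, static:      libpthread.a

LAPACK
    Intel MKL libraries, dynamic: libmkl_lapack32.so and/or libmkl_lapack64.so, libmkl.so, libguide.so
    Intel MKL libraries, static:  libmkl_lapack.a, libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so
    Linux libraries, static:      libpthread.a

LAPACK95 Interface
    Intel MKL libraries, dynamic: n/a
    Intel MKL libraries, static:  libmkl_lapack95.a, libmkl_lapack.a, libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so
    Linux libraries, static:      libpthread.a

Sparse Solver
    Intel MKL libraries, dynamic: n/a
    Intel MKL libraries, static:  libmkl_solver.a, libmkl_lapack.a, libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so
    Linux libraries, static:      libpthread.a

Vector Math Library
    Intel MKL libraries, dynamic: libvml.so, libguide.so
    Intel MKL libraries, static:  libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so, libm.so
    Linux libraries, static:      libpthread.a, libm.a

Vector Statistical Library
    Intel MKL libraries, dynamic: libmkl_lapack32.so and/or libmkl_lapack64.so, libmkl.so, libvsl.so, libguide.so
    Intel MKL libraries, static:  libmkl_lapack.a, libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so, libm.so
    Linux libraries, static:      libpthread.a, libm.a

Fourier Transform Functions
    Intel MKL libraries, dynamic: libmkl.so, libguide.so
    Intel MKL libraries, static:  libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so, libm.so (optionally)
    Linux libraries, static:      libpthread.a, libm.a (optionally)

Interval Arithmetic  libmkl_lapack32.so  libmkl_lapack.a  libpthread.so  libp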
47. son  In case your function domain does not directly support the needed environment, you can use mixed-language programming. See Calling LAPACK, BLAS, and CBLAS routines from C language environments. See also Using language-specific interfaces with Intel MKL for a list of language-specific interface libraries and modules and an example of how to generate them.

Threading model

Select among the following options how you are going to thread your application:

- Your application is already threaded
- You want to use MKL threading capability (the libguide library)
- You do not want to thread your application

Reason: By default Intel MKL runs with a number of threads equal to one, except for Direct Sparse Solver. To utilize multi-threading, you will need to set the number of threads yourself. For more information, and especially how to avoid conflicts in the threaded execution environment, see Using Intel MKL parallelism.

Linking model

Decide which linking model is appropriate for linking your application with Intel MKL libraries:

- Static
- Dynamic

Reason: For information on the benefits of each linking model, link command syntax and examples, link libraries, as well as on other linking topics, like how to save disk space by creating a custom dynamic library, see Linking Your Application with Intel(R) Math Kernel Library.

3 Linking Your Appli
48. thread.a  and/or libmkl_ipf.a  libmkl_lapack64.so  libguide.a  libmkl.so  libguide.so

Trigonometric Transform
    Intel MKL libraries, dynamic: libmkl.so, libvml.so, libguide.so
    Intel MKL libraries, static:  libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so
    Linux libraries, static:      libpthread.a

Poisson Library
    Intel MKL libraries, dynamic: libmkl.so, libvml.so, libguide.so
    Intel MKL libraries, static:  libmkl_ipf.a, libguide.a
    Linux libraries, dynamic:     libpthread.so
    Linux libraries, static:      libpthread.a

1  Regardless of the function domain, when linking statically with libguide (discouraged), sometimes you may have to use the libguide version different from the one in the indicated directory. See Notes below.

Not applicable

3.5 Linking Examples

Below are some specific examples for linking on IA-32 systems with Intel(R) compilers:

ifort myprog.f -L$MKLPATH -lmkl_lapack -lmkl_ia32 -lguide -lpthread
    static linking of user code myprog.f, LAPACK, and kernels. Processor dispatcher will call the appropriate kernel for the system at runtime.

ifort myprog.f -L$MKLPATH -lmkl_lapack95 -lmkl_lapack -lmkl_ia32 -lguide -lpthread
    static linking of user code myprog.f, Fortran 95 LAPACK interface, and kernels. Processor dispatcher will call the appropriate kernel for the system at runtime.

ifort myprog.f -L$MKLPATH -lmkl_blas95 -lmkl_lapack -lmkl_ia32 -lguide -lpthread
    static linking of user code myprog.f, Fortran 95 BLAS interface, and kernels. Processor dispatcher will call the appropriate kernel for the system at runtime.

icc myprog.c -L$MKLPAT
49. ucing the collection of functions available in Intel MKL libraries to those required to solve your particular problems, which helps to save disk space and build your own dynamic libraries for distribution.

Intel MKL custom shared object builder

The custom shared object builder is intended for creating a dynamic library (shared object) with selected functions and is located in the tools/builder folder. The builder contains a makefile and a definition file with the list of functions. The makefile has three targets: "ia32", "ipf", and "em64t". The ia32 target is used for IA-32, ipf is used for the Intel(R) Itanium(R) processor family, and em64t is used for the Intel(R) Xeon(R) processor with Intel(R) EM64T.

Specifying makefile parameters

There are several macros (parameters) for the makefile:

export = functions_list
    determines the name of the file that contains the list of entry point functions that will be included into the shared object. This file is used for definition file creation and then for export table creation. Default name is functions_list.

name = mkl_custom
    specifies the name of the created library. By default, the library mkl_custom.so is built.

xerbla = user_xerbla.obj
    specifies the name of the object file that contains the user's error handler. This error handler will be added to the library and then will be used instead of the standard MKL error handler xerbla. By default, that is, when this parameter is not specified, the standard MKL xerbla is used.

All para
50. whose software development experience may vary from beginner to advanced.

Notational Conventions

The document employs the following font conventions and symbols:

Italic
    Italic is used for emphasis and also indicates document names in body text, for example, see Intel MKL Reference Manual.

Monospace lowercase
    Indicates filenames, directory names and pathnames, for example: libmkl_ia32.a, /opt/intel/mkl/9.0

Monospace lowercase mixed with uppercase
    Indicates commands and command-line options, for example: ifort myprog.f -L$MKLPATH -lmkl_blas95
    C/C++ code fragments, for example: complex16 a[N], b[N], c;

UPPERCASE MONOSPACE
    System variables, for example, $MKLPATH

Monospace italic
    Indicates a parameter of a makefile or in other contexts, for example, functions_list.
    When enclosed in angle brackets, indicates a placeholder for an identifier, an expression, a string, a symbol, or a value. Substitute one of these items for the placeholder, for example: <mkl directory>

[ items ]
    Square brackets indicate that the items enclosed in brackets are optional.

{ item | item }
    Braces indicate that only one of the items listed between braces should be selected. A vertical bar ( | ) separates the items.

2 Before You Begin

Before you get started using the Intel(R) Math Kernel Library (Intel(R) MKL), sorting out a few important basic concepts will gre
    