Intel(R) Math Kernel Library for Windows* OS User's Guide
Contents
1. CommitDescriptor: Performs all initialization for the actual FFT computation. Syntax (Fortran): Status = DftiCommitDescriptor( Desc_Handle )
Dynamic Help
Dynamic Help also provides access to topics relevant to the current selection or to the text being typed. Links to all relevant topics are displayed in the Dynamic Help window. To get the list of relevant topics each time you select an Intel MKL function name or type it in your code, open the Dynamic Help window by selecting Help > Dynamic Help from the menu. To open a topic from the list, click the appropriate link in the Dynamic Help window, shown in the above figure. Typically only one link corresponds to each Intel MKL function.
Using the IntelliSense Capability
IntelliSense is a set of native Visual Studio (VS) IDE features that make language references easily accessible. When programming with Intel MKL in the VS Code Editor, you can use two IntelliSense features: Parameter Info and Complete Word. Both features use header files. Therefore, to benefit from IntelliSense, make sure the path to the include files is specified in the VS or solution settings. For example, see Configuring the Microsoft Visual C/C++ Development System to Link with Intel MKL on how to do this.
Parameter Info
The Parameter Info feature displays the parameter list for a function to give information on the number and types of parameters. This feature requires adding the include s
2. See Also: Using the Single Dynamic Library; Layered Model Concept; Using the cdecl and stdcall Interfaces; Directory Structure in Detail
Linking with Interface Libraries
Using the cdecl and stdcall Interfaces
Intel MKL provides the following interfaces in its IA-32 architecture implementation:
• stdcall: Default Compaq Visual Fortran (CVF) interface. Use it with the Intel Fortran Compiler.
• cdecl: Default interface of the Microsoft Visual C/C++ application.
To use each of these interfaces, link with the appropriate library, as specified in the following table:
Interface | Library for Static Linking | Library for Dynamic Linking
cdecl | mkl_intel_c.lib | mkl_intel_c_dll.lib
stdcall | mkl_intel_s.lib | mkl_intel_s_dll.lib
To link with the cdecl or stdcall interface library, use appropriate calling syntax in C applications and appropriate compiler options for Fortran applications. If you are using a C compiler to link with the cdecl or stdcall interface library, call Intel MKL routines in your code as explained in the table below:
Interface Library: mkl_intel_s_dll.lib, mkl_intel_c_dll.lib
Calling Intel MKL Routines: Call a routine with the following statement: extern __stdcall name( <prototype variable1>, <prototype variable2>, .. ); where stdcall is actually the CVF compiler default compilation, which differs from the regular stdcall compilation in the way strings are passed to the routin
3. Italic is used for emphasis and also indicates document names in body text, for example: see Intel MKL Reference Manual.
Indicates:
• Commands and command-line options, for example: ifort myprog.f mkl_blas95.lib mkl_c.lib libiomp5md.lib
• Filenames, directory names, and pathnames, for example: C:\Program Files\Java\jdk1.5.0_09
• C/C++ code fragments, for example: a = new double [SIZE*SIZE];
Indicates system variables, for example: $MKLPATH.
Indicates a parameter in discussions, for example: lda. When enclosed in angle brackets, indicates a placeholder for an identifier, an expression, a string, a symbol, or a value, for example: <mkl directory>. Substitute one of these items for the placeholder.
Square brackets indicate that the items enclosed in brackets are optional.
{ item | item }: Braces indicate that only one of the items listed between braces should be selected. A vertical bar ( | ) separates the items.
Overview
Document Overview
The Intel Math Kernel Library (Intel MKL) User's Guide provides usage information for the library. The usage information covers the organization, configuration, performance, and accuracy of Intel MKL, specifics of routine calls in mixed-language programming, linking, and more. This guide describes OS-specific usage of Intel MKL, along with OS-independent features. The document contains us
4. Compiling an Application that Calls the Intel Math Kernel Library and Uses the CVF Calling Conventions; Include Files
Compiling an Application that Calls the Intel Math Kernel Library and Uses the CVF Calling Conventions
The IA-32 architecture implementation of Intel MKL supports the Compaq Visual Fortran (CVF) calling convention by providing the stdcall interface. Although Intel MKL does not provide the CVF interface in its Intel 64 architecture implementation, you can use the Intel Visual Fortran Compiler to compile your Intel 64 architecture application that calls Intel MKL and uses the CVF calling convention. To do this:
• Provide the following compiler options to enable compatibility with the CVF calling convention: /Gm or /iface:cvf
• Additionally provide the following options to enable calling Intel MKL from your application: /iface:nomixed_str_len_arg
See Also: Using the cdecl and stdcall Interfaces; Compiler Support
Mixed-language Programming with the Intel Math Kernel Library
Appendix A, Intel(R) Math Kernel Library Language Interfaces Support, lists the programming languages supported for each Intel MKL function domain. However, you can call Intel MKL routines from different language environments.
Calling LAPACK, BLAS, and CBLAS Routines from C/C++ Language Environments
Not all Intel MKL function domains support both C and Fortran environments. To use Intel MKL
5. • A set of vectorized transcendental functions, called the Vector Math Library (VML). For most of the supported processors, the Intel MKL VML functions offer greater performance than the libm (scalar) functions, while keeping the same high accuracy.
• The Vector Statistical Library (VSL), which offers high-performance vectorized random number generators for several probability distributions, convolution and correlation routines, and summary statistics functions.
• Data Fitting Library, which provides capabilities for spline-based approximation of functions, derivatives and integrals of functions, and search.
For details, see the Intel MKL Reference Manual.
Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804
Intel M
6. Composer XE compiler (see Using the /Qmkl Compiler Option)
• Microsoft Visual Studio Integrated Development Environment (IDE) (see Automatically Linking a Project in the Visual Studio IDE with Intel MKL)
Other options are independent of your development environment, but depend on the way you link:
• Explicit dynamic linking (see Using the Single Dynamic Library for how to simplify your link line)
• Explicitly listing libraries on your link line (see Selecting Libraries to Link with for a summary of the libraries)
• Using an interactive interface (see Using the Link-line Advisor to determine libraries and options to specify on your link or compilation line)
• Using an internally provided tool (see Using the Command-line Link Tool to determine libraries, options, and environment variables or even compile and build your application)
Using the /Qmkl Compiler Option
The Intel Composer XE compiler supports the following variants of the /Qmkl compiler option:
/Qmkl or /Qmkl:parallel: to link with standard threaded Intel MKL.
/Qmkl:sequential: to link with sequential version of Intel MKL.
/Qmkl:cluster: to link with Intel MKL cluster components (sequential) that use Intel MPI.
For more information on the /Qmkl compiler option, see the Intel Compiler User and Reference Guides. For each variant of the /Qmkl option, the compiler links your application using the following conventions:
• cdecl for the IA-32 architecture
• LP64 for the Intel 64 architect
7. Fortran 90 modules result in the compiler-specific code generation requiring RTL support. Therefore, Intel MKL delivers these modules compiled with the Intel compiler, along with source code, to be used with different compilers.
Using the stdcall Calling Convention in C/C++
Intel MKL supports the stdcall calling convention for the following function domains:
• BLAS Routines
• Sparse BLAS Routines
• LAPACK Routines
• Vector Mathematical Functions
• Vector Statistical Functions
• PARDISO
• Direct Sparse Solvers
• RCI Iterative Solvers
• Support Functions
To use the stdcall calling convention in C/C++, follow the guidelines below:
• In your function calls, pass lengths of character strings to the functions. For example, compare the following calls to dgemm:
cdecl: dgemm("N", "N", &n, &m, &k, &alpha, b, &ldb, a, &lda, &beta, c, &ldc);
stdcall: dgemm("N", 1, "N", 1, &n, &m, &k, &alpha, b, &ldb, a, &lda, &beta, c, &ldc);
• Define the MKL_STDCALL macro using either of the following techniques:
- Define the macro in your source code before including Intel MKL header files:
#define MKL_STDCALL
#include "mkl.h"
- Pass the macro to the compiler. For example: icl -DMKL_STDCALL foo.c
• Link your application with the following library:
- mkl_intel_s.lib for static linking
- mkl_intel_s_dll.lib for dynamic linking
See Also: Using the cdecl and stdcall Interfaces
8. However, the best way to set environment variables is using the Job Scheduler with the Microsoft Management Console (MMC) and/or the Command Line Interface (CLI) to submit a job and pass environment variables. For more information about MMC and CLI, see the Microsoft Help and Support page at the Microsoft Web site (http://www.microsoft.com/).
Building ScaLAPACK Tests
To build ScaLAPACK tests:
• For the IA-32 architecture, add mkl_scalapack_core.lib to your link command.
• For the Intel 64 architecture, add mkl_scalapack_lp64.lib or mkl_scalapack_ilp64.lib, depending on the desired interface.
Examples for Linking with ScaLAPACK and Cluster FFT
This section provides examples of linking with ScaLAPACK and Cluster FFT. Note that a binary linked with ScaLAPACK runs the same way as any other MPI application (refer to the documentation that comes with your MPI implementation). For further linking examples, see the support website for Intel products at http://www.intel.com/software/products/support/.
See Also: Directory Structure in Detail
Examples for Linking a C Application
These examples illustrate linking of an application whose main module is in C under the following conditions:
• MPICH2 1.0.x is installed in c:\mpich2x64.
• You use the Intel C++ Compiler 10.0 or higher.
To link with ScaLAPACK using LP64 interface for a cluster of Intel 64 architecture
9. Detailed Structure of the IA-32 Architecture Directories
Static Libraries in the lib\ia32 Directory
File: Contents
Interface layer
mkl_intel_c.lib: cdecl interface library
mkl_intel_s.lib: CVF default interface library
mkl_blas95.lib: Fortran 95 interface library for BLAS. Supports the Intel Fortran compiler
mkl_lapack95.lib: Fortran 95 interface library for LAPACK. Supports the Intel Fortran compiler
Threading layer
mkl_intel_thread.lib: Threading library for the Intel compilers
mkl_pgi_thread.lib: Threading library for the PGI compiler
mkl_sequential.lib: Sequential library
Computational layer
mkl_core.lib: Kernel library for IA-32 architecture
mkl_solver.lib: Deprecated. Empty library for backward compatibility
mkl_solver_sequential.lib: Deprecated. Empty library for backward compatibility
mkl_scalapack_core.lib: ScaLAPACK routines
mkl_cdft_core.lib: Cluster version of FFTs
Run-time Libraries (RTL)
mkl_blacs_intelmpi.lib: BLACS routines supporting
10. OS
Condition: You thread the program using OpenMP directives and/or pragmas and compile the program using a compiler other than a compiler from Intel.
Condition: There are multiple programs running on a multiple-CPU system, for example, a parallelized program that runs using MPI for communication in which each processor is treated as a node.
Discussion:
If more than one thread calls Intel MKL, and the function being called is threaded, it may be important that you turn off Intel MKL threading. Set the number of threads to one by any of the available means (see Techniques to Set the Number of Threads).
This is more problematic because setting of the OMP_NUM_THREADS environment variable affects both the compiler's threading library and libiomp. In this case, choose the threading library that matches the layered Intel MKL with the OpenMP compiler you employ (see Linking Examples on how to do this). If this is not possible, use Intel MKL in the sequential mode. To do this, you should link with the appropriate threading library: mkl_sequential.lib or mkl_sequential.dll (see High-level Directory Structure).
The threading software will see multiple processors on the system even though each processor has a separate MPI process running on it. In this case, one of the solutions is to set the number of threads to one by any of the available means (see Techniques to Set the Number of Threads).
Section Intel(R) Optimized MP LINPACK Benchmark for Clusters discus
11. Fortran Include Files: mkl_dss.f90, mkl_rci.fi, mkl_rci.fi, mkl_vml.f77, mkl_vml.f90, mkl_vsl.f77, mkl_vsl.f90, mkl_dfti.f90, mkl_cdft.f90
C/C++ Include Files: mkl.h, mkl_blas.h, mkl_trans.h, mkl_cblas.h, mkl_spblas.h, mkl_lapack.h, mkl_lapacke.h, mkl_scalapack.h, mkl_solver.h, mkl_pardiso.h, mkl_dss.h, mkl_rci.h, mkl_rci.h, mkl_vml.h, mkl_vsl_functions.h, mkl_dfti.h, mkl_cdft.h, mkl_trig_transforms.h, mkl_poisson.h, mkl_df.h, mkl_gmp.h, mkl_service.h
Function domain | Fortran Include Files | C/C++ Include Files
(Support functions, continued) | mkl_service.fi |
Memory allocation routines | | i_malloc.h
Intel MKL examples interface | | mkl_example.h
† GMP Arithmetic Functions are deprecated and will be removed in a future release.
See Also: Language Interfaces Support, by Function Domain
Support for Third-Party Interfaces
GMP Functions
Intel Math Kernel Library (Intel MKL) implementation of GMP arithmetic functions includes arbitrary precision arithmetic operations on integer numbers. The interfaces of such functions fully match the GNU Multiple Precision (GMP) Arithmetic Library. For specifications of these functions, please see http://software.intel.com/sites/products/documentation/hpc/mkl/gnump/index.htm.
NOTE: Intel MKL GMP Arithmetic Functions are deprecated and will be removed in a future release. If you curr
12. orz
The following routines are threaded for Intel Core 2 Duo and Intel Core i7 processors:
• Level1 BLAS: axpy, copy, swap, ddot/sdot, cdotc, drot/srot
• Level2 BLAS: gemv, trmv, dsyr/ssyr, dsyr2/ssyr2, dsymv/ssymv
Threaded FFT Problems
The following characteristics of a specific problem determine whether your FFT computation may be threaded:
• rank
• domain
• size/length
• precision (single or double)
• placement (in-place or out-of-place)
• strides
• number of transforms
• layout (for example, interleaved or split layout of complex data)
Most FFT problems are threaded. In particular, computation of multiple transforms in one call (number of transforms > 1) is threaded. Details of which transforms are threaded follow.
One-dimensional (1D) transforms
1D transforms are threaded in many cases. 1D complex-to-complex (c2c) transforms of size N using interleaved complex data layout are threaded under the following conditions, depending on the architecture:
Architecture | Conditions
Intel 64 | N is a power of 2, log2(N) > 9, the transform is double-precision out-of-place, and input/output strides equal 1.
IA-32 | N is a power of 2, log2(N) > 13, and the transform is single-precision.
IA-32 | N is a power of 2, log2(N) > 14, and the transform is double-precision.
Any | N is composite, log2(N) > 16, and input/output strides equal 1.
1D real to
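The conditions for 1D complex-to-complex transforms listed above can be read as a simple predicate. The following is a minimal sketch in plain C, not an Intel MKL API: the type and function names (fft_1d_c2c, c2c_1d_is_threaded, and so on) are hypothetical, "composite" is assumed to mean "has a nontrivial divisor", and log2(N) > k is tested as N > 2^k. It only encodes the rows quoted in this excerpt and says nothing about what the library does in other cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptors for illustration only; these are not MKL types. */
typedef enum { ARCH_IA32, ARCH_INTEL64 } fft_arch;
typedef enum { PREC_SINGLE, PREC_DOUBLE } fft_precision;

typedef struct {
    uint64_t      n;            /* transform size N */
    fft_precision precision;    /* single or double precision */
    bool          in_place;     /* true for in-place, false for out-of-place */
    bool          unit_strides; /* input and output strides both equal 1 */
} fft_1d_c2c;

/* True when n is 2^k for some k >= 0. */
bool is_power_of_two(uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Assumption: "composite" means n has a divisor d with 1 < d < n. */
bool is_composite(uint64_t n)
{
    for (uint64_t d = 2; d <= n / d; ++d) {
        if (n % d == 0)
            return true;
    }
    return false;
}

/* Encodes the four table rows for 1D c2c transforms with interleaved layout. */
bool c2c_1d_is_threaded(fft_arch arch, const fft_1d_c2c *t)
{
    assert(t != NULL && t->n > 0);

    /* Row "Any": N composite, log2(N) > 16, unit strides. */
    if (t->unit_strides && t->n > (UINT64_C(1) << 16) && is_composite(t->n))
        return true;

    /* All remaining rows require N to be a power of 2. */
    if (!is_power_of_two(t->n))
        return false;

    if (arch == ARCH_INTEL64) {
        /* Row "Intel 64": log2(N) > 9, double precision, out-of-place, unit strides. */
        return t->n > (UINT64_C(1) << 9) && t->precision == PREC_DOUBLE &&
               !t->in_place && t->unit_strides;
    }

    /* Rows "IA-32": log2(N) > 13 for single, log2(N) > 14 for double precision. */
    if (t->precision == PREC_SINGLE)
        return t->n > (UINT64_C(1) << 13);
    return t->n > (UINT64_C(1) << 14);
}
```

For instance, under these assumptions a double-precision, out-of-place transform of size 1024 with unit strides satisfies the Intel 64 row, while size 512 does not because log2(512) equals 9.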
13. 0
nmake libia32 boost_root=<your_path>\boost_1_37_0
Intel MKL ublas examples on default Boost uBLAS configuration support only:
• Microsoft Visual C++ Compiler versions 2005 and higher
• Intel C++ Compiler versions 11.1 and higher with Microsoft Visual Studio IDE versions 2005 and higher
See Also: Using Code Examples
Invoking Intel MKL Functions from Java Applications
Intel MKL Java Examples
To demonstrate binding with Java, Intel MKL includes a set of Java examples in the following directory: <mkl directory>\examples\java
The examples are provided for the following MKL functions:
• ?gemm, ?gemv, and ?dot families from CBLAS
• The complete set of non-cluster FFT functions
• ESSL-like functions for one-dimensional convolution and correlation
• VSL Random Number Generators (RNG), except user-defined ones and file subroutines
• VML functions, except GetErrorCallBack, SetErrorCallBack, and ClearErrorCallBack
You can see the example sources in the following directory: <mkl directory>\examples\java\examples
The examples are written in Java. They demonstrate usage of the MKL functions with the following variety of data:
• 1- and 2-dimensional data sequences
• Real and complex types of the data
• Single and double precision
However, the wrappers used in the examples do not:
• Demonstrate the use of large arrays (> 2
14. 2. Add the following string to the library path: %ProgramFiles%\Intel\MPI\<ver>\<arch>\lib, for example, %ProgramFiles%\Intel\MPI\3.1\intel64\lib.
3. Add impi.lib and impicxx.lib to your link command.
Check the documentation that comes with your MPI implementation for implementation-specific details of linking.
Linking with ScaLAPACK and Cluster FFTs
To link with Intel MKL ScaLAPACK and/or Cluster FFTs, use the following commands:
set lib=<path to MKL libraries>;<path to MPI libraries>;%lib%
<linker> <files to link> <MKL cluster library> <BLACS> <MKL core libraries> <MPI libraries>
where the placeholders stand for paths and libraries as explained in the following table:
<path to MKL libraries>: <mkl directory>\lib\{ia32|intel64}, depending on your architecture. If you performed the Setting Environment Variables step of the Getting Started process, you do not need to add this directory to the lib environment variable.
<path to MPI libraries>: Typically the lib subdirectory in the MPI installation directory. For example, C:\Program Files (x86)\Intel\MPI\3.2.0.005\ia32\lib for a default installation of Intel MPI 3.2.
<linker>: One of icl, ifort, xilink.
<MKL cluster library>: One of ScaLAPACK or Cluster FFT libraries for the appropriate architecture, which are listed in Directo
15. [Figure: Microsoft Visual Studio Help contents, unfiltered, showing the Intel Math Kernel Library Help collection with the Intel Math Kernel Library Reference Manual (Legal Information, Overview, BLAS and Sparse BLAS Routines, LAPACK Routines)]
You can filter Visual Studio Help collections to show only content related to installed Intel tools. To do this, select "Intel" from the "Filtered by" list. This hides the contents and index entries for all collections that do not refer to Intel.
[Figure: Microsoft Visual Studio Help contents filtered by Intel, with the Index Results and Dynamic Help windows]
16. Fortran-style functions in C/C++ environments, you should observe certain conventions, which are discussed for LAPACK and BLAS in the subsections below.
CAUTION: Avoid calling BLAS 95/LAPACK 95 from C/C++. Such calls require skills in manipulating the descriptor of a deferred-shape array, which is the Fortran 90 type. Moreover, BLAS95/LAPACK95 routines contain links to a Fortran RTL.
LAPACK and BLAS
Because LAPACK and BLAS routines are Fortran-style, when calling them from C-language programs, follow the Fortran-style calling conventions:
• Pass variables by address, not by value. Function calls in "Example: Calling a Complex BLAS Level 1 Function from C++" and "Example: Using CBLAS Interface Instead of Calling BLAS Directly from C" illustrate this.
• Store your data in Fortran style, that is, column-major rather than row-major order.
With row-major order, adopted in C, the last array index changes most quickly and the first one changes most slowly when traversing the memory segment where the array is stored. With Fortran-style column-major order, the last index changes most slowly whereas the first index changes most quickly (as illustrated by the figure below for a two-dimensional array).
[Figure: A. Column-major order (Fortran-style); B. Row-major order (C-style)]
For example, if a two-dimensional matrix A of size m x n is stored densely in a one dimension
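The difference between the two storage orders can be made concrete with index arithmetic. The sketch below is plain C with no Intel MKL dependency; the helper names (colmajor_index, rowmajor_index, rowmajor_to_colmajor) are hypothetical and chosen only for illustration. It assumes 0-based indices and densely stored matrices, so the leading dimension equals the number of rows for column-major storage and the number of columns for row-major storage.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j) in column-major (Fortran-style) storage.
 * lda is the leading dimension: the distance between consecutive columns. */
size_t colmajor_index(size_t i, size_t j, size_t lda)
{
    assert(i < lda);
    return i + j * lda;   /* first index changes most quickly */
}

/* Offset of element (i, j) in row-major (C-style) storage.
 * ldb is the leading dimension: the distance between consecutive rows. */
size_t rowmajor_index(size_t i, size_t j, size_t ldb)
{
    assert(j < ldb);
    return i * ldb + j;   /* last index changes most quickly */
}

/* Copy a dense m-by-n row-major matrix into dense column-major storage,
 * the layout a Fortran-style BLAS or LAPACK routine expects. */
void rowmajor_to_colmajor(const double *src, double *dst, size_t m, size_t n)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j)
            dst[colmajor_index(i, j, m)] = src[rowmajor_index(i, j, n)];
    }
}
```

With a 2 x 3 matrix whose rows are (1, 2, 3) and (4, 5, 6), the row-major buffer is 1 2 3 4 5 6 and the converted column-major buffer is 1 4 2 5 3 6.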
17. Getting Started
Checking Your Installation
After installing the Intel Math Kernel Library (Intel MKL), verify that the library is properly installed and configured:
1. Intel MKL installs in <Composer XE directory>. Check that the subdirectory of <Composer XE directory> referred to as <mkl directory> was created. Check that subdirectories for Intel MKL redistributable DLLs, redist\ia32\mkl and redist\intel64\mkl, were created in the <Composer XE directory> directory. (See redist.txt in the Intel MKL documentation directory for a list of files that can be redistributed.)
2. If you want to keep multiple versions of I
18. Intel Advanced Vector Extensions (Intel AVX); ScaLAPACK routines; Cluster FFT dynamic library; Dynamic library to support renaming of memory functions; BLACS routines; BLACS routines supporting Intel MPI; BLACS routines supporting MPICH2; Catalog of Intel Math Kernel Library (Intel MKL) messages in English; Catalog of Intel MKL messages in Japanese. Available only if the Intel MKL package provides Japanese localization. Please see the Release Notes for this information.
Detailed Structure of the Intel 64 Architecture Directories
Static Libraries in the lib\intel64 Directory
File
Interface layer: mkl_intel_lp64.lib, mkl_intel_ilp64.lib, mkl_intel_sp2dp.a, mkl_blas95_lp64.lib, mkl_blas95_ilp64.lib, mkl_lapack95_lp64.lib, mkl_lapack95_ilp64.lib
Threading layer: mkl_intel_thread.lib, mkl_pgi_thread.lib, mkl_sequential.lib
Computational layer: mkl_core.lib, mkl_solver_lp64.lib, mkl_solver_lp64_sequential.lib, mkl_solver_ilp64.lib, mkl_solver_ilp64_sequential.lib, mkl_scalapack_lp64.lib, mkl_scalapack_ilp64.lib, mkl_cdft_core.lib
Run-time Libraries (RTL): mkl_blacs_intelmpi_lp64.lib, mkl_blacs_intelmpi_ilp64.lib, mkl_blacs_mpich2_lp64.lib, mkl_blacs_mpich2_ilp64.lib, mkl_blacs_msmpi_lp64.lib, mkl_blacs_msmpi_ilp64.lib
Contents
LP64 interface
19. Intel MPI
mkl_blacs_mpich2.lib: BLACS routines supporting MPICH2
Dynamic Libraries in the lib\ia32 Directory
File: Contents
mkl_rt.lib: Single Dynamic Library to be used for linking
Interface layer
mkl_intel_c_dll.lib: cdecl interface library for dynamic linking
mkl_intel_s_dll.lib: CVF default interface library for dynamic linking
Threading layer
mkl_intel_thread_dll.lib: Threading library for dynamic linking with the Intel compilers
mkl_pgi_thread_dll.lib: Threading library for dynamic linking with the PGI compiler
mkl_sequential_dll.lib: Sequential library for dynamic linking
Computational layer
mkl_core_dll.lib: Core library for dynamic linking
mkl_scalapack_core_dll.lib: ScaLAPACK routine library for dynamic linking
mkl_cdft_core_dll.lib: Cluster FFT library for dynamic linking
Run-time Libraries (RTL)
mkl_blacs_dll.lib: BLACS interface library for dynamic linking
Contents of the redist\ia32\mkl Directory
File: Contents
mkl_rt.dll: Single Dynamic Library
Threading layer
mkl_intel_thread.dll: Dynamic threading library for the Intel compilers
mkl_pgi_thread.dll: Dynamic threading library for the PGI compiler
mkl_sequential.dll: Dynamic sequential library
Computational layer
mkl_core.dll: Core library containing processor-independent code and a dispatcher for dynamic loading of processor-specific code
mkl_def.dll: Default kernel (Intel Pentium, Pentium Pro, Pentium II, and Pentium III processors)
20. Laplace and Helmholtz Solver (Poisson Library) routines: Yes, Yes
Optimization (Trust-Region) Solver routines: Yes, Yes, Yes
Data Fitting functions: Yes, Yes, Yes
GMP arithmetic functions ††: Yes
Support functions (including memory allocation): Yes, Yes, Yes
† Supported using a mixed-language programming call. See Intel MKL Include Files for the respective header file.
†† GMP Arithmetic Functions are deprecated and will be removed in a future release.
Include Files
Function domain: All function domains; BLAS Routines; BLAS-like Extension Transposition Routines; CBLAS Interface to BLAS; Sparse BLAS Routines; LAPACK Routines; C Interface to LAPACK; ScaLAPACK Routines; All Sparse Solver Routines; PARDISO; DSS Interface; RCI Iterative Solvers; ILU Factorization; Optimization Solver Routines; Vector Mathematical Functions; Vector Statistical Functions; Fourier Transform Functions; Cluster Fourier Transform Functions; Partial Differential Equations Support Routines (Trigonometric Transforms, Poisson Solvers); Data Fitting functions; GMP interface; Support functions
Fortran Include Files: mkl.fi, blas.f90, mkl_blas.fi, mkl_trans.fi, mkl_spblas.fi, lapack.f90, mkl_lapack.fi, mkl_trig_transforms.f90, mkl_poisson.f90, mkl_df.f77, mkl_df.f90, mkl_service.f90, mkl_solver.f90, mkl_pardiso.f77, mkl_pardiso.f90, mkl_dss.f77
21. Microsoft Visual C++. That is, a program threaded with the Microsoft Visual C++ compiler can safely be linked with Intel MKL and libiomp.
The table below helps explain what threading library and RTL you should choose under different scenarios when using Intel MKL (static cases only):
Compiler | Application Threaded? | Threading Layer | RTL Recommended | Comment
Intel | Does not matter | mkl_intel_thread.lib | libiomp5md.lib |
PGI | Yes | mkl_pgi_thread.lib or mkl_sequential.lib | PGI supplied | Use of mkl_sequential.lib removes threading from Intel MKL calls.
PGI | No | mkl_intel_thread.lib | libiomp5md.lib |
PGI | No | mkl_pgi_thread.lib | PGI supplied |
PGI | No | mkl_sequential.lib | None |
Microsoft | Yes | mkl_intel_thread.lib | libiomp5md.lib | For the OpenMP library of the Microsoft Visual Studio IDE version 2005 or later.
Microsoft | Yes | mkl_sequential.lib | None | For Win32 threading.
Microsoft | No | mkl_intel_thread.lib | libiomp5md.lib |
other | Yes | mkl_sequential.lib | None |
other | No | mkl_intel_thread.lib | libiomp5md.lib |
TIP: To use the threaded Intel MKL, compile your code with the /MT option. The compiler driver will pass the option to the linker, and the latter will load multi-thread (MT) run-time libraries.
Linking with Computational Libraries
If you are not using the Intel MKL cluster software, you need to link your application with only one computational library, depending on the linking me
22. #define N 5
int main()
{
    int n, inca = 1, incb = 1, i;
    complex16 a[N], b[N], c;
    n = N;
    for( i = 0; i < n; i++ ){
        a[i].re = (double)i; a[i].im = (double)i * 2.0;
        b[i].re = (double)(n - i); b[i].im = (double)i * 2.0;
    }
    cblas_zdotc_sub( n, a, inca, b, incb, &c );
    printf( "The complex dot product is: ( %6.2f, %6.2f )\n", c.re, c.im );
    return 0;
}
Support for Boost uBLAS Matrix-matrix Multiplication
If you are used to uBLAS, you can perform BLAS matrix-matrix multiplication in C++ using Intel MKL substitution of Boost uBLAS functions. uBLAS is the Boost C++ open-source library that provides BLAS functionality for dense, packed, and sparse matrices. The library uses an expression template technique for passing expressions as function arguments, which enables evaluating vector and matrix expressions in one pass without temporary matrices. uBLAS provides two modes:
• Debug (safe) mode, default. Checks types and conformance.
• Release (fast) mode. Does not check types and conformance. To enable this mode, use the NDEBUG preprocessor symbol.
The documentation for the Boost uBLAS is available at www.boost.org.
Intel MKL provides overloaded prod() functions for substituting uBLAS dense matrix-matrix multiplication with the Intel MKL gemm calls. Though these functions break uBLAS expression templates and introduce temporary matrices, the performance advantage can be considerable
23. Known Limitations of the Intel Optimized LINPACK Benchmark
The following limitations are known for the Intel Optimized LINPACK Benchmark for Windows OS:
• Intel Optimized LINPACK Benchmark is threaded to effectively use multiple processors. So, in multi-processor systems, best performance will be obtained with the Intel Hyper-Threading Technology turned off, which ensures that the operating system assigns threads to physical processors only.
• If an incomplete data input file is given, the binaries may either hang or fault. See the sample data input files and/or the extended help for insight into creating a correc
24. THREADS | mkl_set_num_threads | Suggests the number of threads to use. | OMP_NUM_THREADS
MKL_DOMAIN_NUM_THREADS | mkl_domain_set_num_threads | Suggests the number of threads for a particular function domain. |
MKL_DYNAMIC | mkl_set_dynamic | Enables Intel MKL to dynamically change the number of threads. | OMP_DYNAMIC
NOTE: The functions take precedence over the respective environment variables. Therefore, if you want Intel MKL to use a given number of threads in your application and do not want users of your application to change this number using environment variables, set the number of threads by a call to mkl_set_num_threads(), which will have full precedence over any environment variables being set.
The example below illustrates the use of the Intel MKL function mkl_set_num_threads() to set one thread.
// ******* C language *******
#include <omp.h>
#include <mkl.h>
mkl_set_num_threads ( 1 );
// ******* Fortran language *******
call mkl_set_num_threads( 1 )
See the Intel MKL Reference Manual for the detailed description of the threading control functions, their parameters, calling syntax, and more code examples.
MKL_DYNAMIC
The MKL_DYNAMIC environment variable enables Intel MKL to dynamically change the number of threads. The default value of MKL_DYNAMIC is TRUE, regardless of OMP_DYNAMIC, whose default value may be FALSE. When MKL_DYNAMIC is TRUE, Intel MKL tries to use what it considers the best
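The precedence rule stated in the NOTE (a value set by a function call wins over the corresponding environment variable) can be modeled in a few lines. This is a minimal sketch in plain C and is not how Intel MKL is implemented: effective_thread_count and its parameters are hypothetical names, a non-positive api_value is assumed to mean "no call was made", and the upper bound of 1024 on the parsed value is an arbitrary sanity limit chosen for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the precedence rule; not an Intel MKL function.
 *   api_value: thread count requested through a function call, or <= 0 if none
 *   env_text:  text of the matching environment variable, or NULL if unset
 *   fallback:  count to use when neither source supplies a valid value */
int effective_thread_count(int api_value, const char *env_text, int fallback)
{
    assert(fallback > 0);

    /* A value set by a function call has full precedence. */
    if (api_value > 0)
        return api_value;

    /* Otherwise consult the environment variable, if it parses cleanly. */
    if (env_text != NULL) {
        char *end = NULL;
        long parsed = strtol(env_text, &end, 10);
        if (end != env_text && *end == '\0' && parsed > 0 && parsed <= 1024)
            return (int)parsed;
    }

    /* Neither source gave a usable value. */
    return fallback;
}
```

In an application, the env_text argument could come from getenv("MKL_NUM_THREADS"). Under this model, a program that requests one thread through the function call keeps one thread even if a user sets the variable to 8.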
25. The following example shows how a call to a Fortran function as a subroutine converts to a call from C, and the hidden parameter result gets exposed:
Normal Fortran function call:
    result = cdotc( n, x, 1, y, 1 )
A call to the function as a subroutine:
    call cdotc( result, n, x, 1, y, 1 )
A call to the function from C:
    cdotc( &result, &n, x, &one, y, &one );
NOTE: Intel MKL has both upper-case and lower-case entry points in the Fortran-style (case-insensitive) BLAS, with or without the trailing underscore. So, all these names are equivalent and acceptable: cdotc, CDOTC, cdotc_, and CDOTC_.
The above example shows one of the ways to call several level 1 BLAS functions that return complex values from your C and C++ applications. An easier way is to use the CBLAS interface. For instance, you can call the same function using the CBLAS interface as follows:
    cblas_cdotu( n, x, 1, y, 1, &result )
NOTE: The complex value comes last on the argument list in this case.
The following examples show use of the Fortran-style BLAS interface from C and C++, as well as the CBLAS (C language) interface:
• Example "Calling a Complex BLAS Level 1 Function from C"
• Example "Calling a Complex BLAS Level 1 Function from C++"
• Example "Using CBLAS Interface Instead of Calling BLAS Directly from C"
Example "Calling a Complex BLAS Level 1 Function from C"
The example below illustrates a call from a C program to the complex BLAS Lev
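To make the calling convention in this excerpt concrete, here is a minimal sketch in plain C. It does not call Intel MKL: sketch_cfloat and sketch_cdotc are hypothetical stand-ins written only to show the shape of a Fortran-style routine seen from C, where the function result becomes a leading output argument and every scalar is passed by address. The sketch assumes positive increments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-precision complex type: real part first, then the
   imaginary part, mirroring the layout of a Fortran complex value. */
typedef struct { float real; float imag; } sketch_cfloat;

/* Hypothetical stand-in for a Fortran-style conjugated dot product.
   This is NOT the Intel MKL cdotc; it only imitates the convention:
   the result is a hidden first argument and scalars arrive by reference. */
void sketch_cdotc(sketch_cfloat *result, const int *n,
                  const sketch_cfloat *x, const int *incx,
                  const sketch_cfloat *y, const int *incy)
{
    assert(*incx > 0 && *incy > 0);   /* assumption of this sketch */
    const size_t sx = (size_t)*incx;
    const size_t sy = (size_t)*incy;
    float re = 0.0f, im = 0.0f;
    for (int i = 0; i < *n; ++i) {
        const sketch_cfloat a = x[(size_t)i * sx];
        const sketch_cfloat b = y[(size_t)i * sy];
        /* conj(a) * b = (a.re*b.re + a.im*b.im) + i*(a.re*b.im - a.im*b.re) */
        re += a.real * b.real + a.imag * b.imag;
        im += a.real * b.imag - a.imag * b.real;
    }
    result->real = re;
    result->imag = im;
}
```

A caller declares int n and int one = 1, then writes sketch_cdotc(&result, &n, x, &one, y, &one), which has the same shape as the C call shown in the excerpt.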
26. Tools for creating custom dynamically linkable libraries
[...] XE directory>
DLLs for applications running on processors with the IA-32 architecture
DLLs for applications running on processors with Intel 64 architecture
Intel MKL documentation
Help2-format files for integration of the Intel MKL documentation with the Microsoft Visual Studio 2005/2008 IDE
Microsoft Help Viewer-format files for integration of the Intel MKL documentation with the Microsoft Visual Studio 2010 IDE
Layered Model Concept
Intel MKL is structured to support multiple compilers and interfaces, different OpenMP implementations, both serial and multiple threads, and a wide range of processors. Conceptually Intel MKL can be divided into distinct parts to support different interfaces, threading models, and core computations:
1. Interface Layer
2. Threading Layer
3. Computational Layer
You can combine Intel MKL libraries to meet your needs by linking with one library in each part layer-by-layer. Once the interface library is selected, the threading library you select picks up the chosen interface, and the computational library uses interfaces and OpenMP implementation (or non-threaded mode) chosen in the first two layers.
To support threading with different compilers, one more layer is needed, which contains libraries not included in Intel MKL:
• Compiler run-time libraries (RTL).
The following table provides m
27. billion elements
• Demonstrate processing of arrays in native memory
• Check correctness of function parameters
• Demonstrate performance optimizations
The examples use the Java Native Interface (JNI) developer framework to bind with Intel MKL. The JNI documentation is available from http://java.sun.com/javase/6/docs/technotes/guides/jni/.
The Java example set includes JNI wrappers that perform the binding. The wrappers do not depend on the examples and may be used in your Java applications. The wrappers for CBLAS, FFT, VML, VSL RNG, and ESSL-like convolution and correlation functions do not depend on each other.
To build the wrappers, just run the examples. The makefile builds the wrapper binaries. After running the makefile, you can run the examples, which will determine whether the wrappers were built correctly. As a result of running the examples, the following directories will be created in <mkl directory>\examples\java:
• docs
• include
• classes
• bin
• _results
The directories docs, include, classes, and bin will contain the wrapper binaries and documentation; the directory _results will contain the testing results.
For a Java programmer, the wrappers are the following Java classes:
• com.intel.mkl.CBLAS
• com.intel.mkl.DFTI
• com.intel.mkl.ESSL
• com.intel.mkl.VML
• com.intel.mkl.VSL
Documentation for the particular wrapper and example classes will be
28. billion operations that have been performed in DGEMM by one processor. Hence, the performance of processor 0 (in Gflops) in DGEMM is always DF/DT. Using the number of DGEMM flops as a basis instead of the number of LU flops, you get a lower bound on performance of the run by looking at DMF, which can be compared to Mflops above. (It uses the global LU time, but the DGEMM flops are computed under the assumption that the problem is evenly distributed amongst the nodes, as only HPL's node (0,0) returns any output.)
Note that when using the above performance monitoring tools to compare different HPL.dat input data sets, you should be aware that the pattern of performance drop-off that LU experiences is sensitive to some input data. For instance, when you try very small problems, the performance drop-off from the initial values to end values is very rapid. The larger the problem, the less the drop-off, and it is probably safe to use the first few performance values to estimate the difference between a problem size 700000 and 701000, for instance. Another factor that influences the performance drop-off is the grid dimensions (P and Q). For big problems, the performance tends to fall off less from the first few steps when P and Q are roughly equal in value. You can make use of a large number of parameters, such as broadcast types, and change them so that the final performance is determined very closely by the first few steps.
Using these tools will
29. cluster software, Intel(R) MKL
cluster software, linking with: commands; linking examples
code examples, use of
coding: data alignment; techniques to improve performance
compilation, Intel(R) MKL version-dependent
compiler run-time libraries, linking with
compiler support
compiler-dependent function
complex types in C and C++, Intel(R) MKL
computation results, consistency
computational libraries, linking with
conditional compilation
configuring: Intel(R) Visual Fortran; Microsoft Visual C/C++; project that runs Intel(R) MKL code example in Visual Studio 2008 IDE
consistent results
context-sensitive Help, for Intel(R) MKL, in Visual Studio IDE
conventions, notational
ctdcall interface, use of
custom DLL: building; composing list of functions; specifying function names
CVF calling convention, use with Intel(R) MKL
D
denormal number, performance
directory structure: documentation; high-level; in-detail
documentation: directories, contents
environment variables, setting
examples, linking: for cluster software; general
F
FFT interface: data alignment; optimised radices; threaded problems
FFTW interface support
Fortran 95 interface libraries
G
GNU Multiple Precision Arithmetic Library
H
header files, Intel(R) MKL
Help, for Intel(R) MKL in Visual Studio IDE
HT technology, configuration
30. Building a Custom Dynamic-link Library in the Visual Studio Development System
Distributing Your Custom Dynamic-link Library
Chapter 5: Managing Performance and Memory
  Using Parallelism of the Intel Math Kernel Library
    Threaded Functions and Problems
    Avoiding Conflicts in the Execution Environment
    Techniques to Set the Number of Threads
    Setting the Number of Threads Using an OpenMP Environment Variable
    Changing the Number of Threads at Run Time
    Using Additional Threading Control
      Intel MKL-specific Environment Variables for Threading Control
      MKL_DYNAMIC
      MKL_DOMAIN_NUM_THREADS
      Setting the Environment Variables for Threading Control
  Tips and Techniques to Improve Performance
    Coding Techniques
    Hardware Configuration Tips
31. for matrix sizes that are not too small (roughly, over 50).
You do not need to change your source code to use the functions. To call them:
• Include the header file mkl_boost_ublas_matrix_prod.hpp in your code (from the Intel MKL include directory)
• Add appropriate Intel MKL libraries to the link line.
The list of expressions that are substituted follows:
    prod( m1, m2 )
    prod( trans(m1), m2 )
    prod( trans(conj(m1)), m2 )
    prod( conj(trans(m1)), m2 )
    prod( m1, trans(m2) )
    prod( trans(m1), trans(m2) )
    prod( trans(conj(m1)), trans(m2) )
    prod( conj(trans(m1)), trans(m2) )
    prod( m1, trans(conj(m2)) )
    prod( trans(m1), trans(conj(m2)) )
    prod( trans(conj(m1)), trans(conj(m2)) )
    prod( conj(trans(m1)), trans(conj(m2)) )
    prod( m1, conj(trans(m2)) )
    prod( trans(m1), conj(trans(m2)) )
    prod( trans(conj(m1)), conj(trans(m2)) )
    prod( conj(trans(m1)), conj(trans(m2)) )
These expressions are substituted in the release mode only (with NDEBUG preprocessor symbol defined). Supported uBLAS versions are Boost 1.34.1 and higher. To get them, visit www.boost.org.
A code example provided in the <mkl directory>\examples\ublas\source\sylvester.cpp file illustrates usage of the Intel MKL uBLAS header file for solving a special case of the Sylvester equation.
To run the Intel MKL ublas examples, specify the boost_root parameter in the nmake command, for instance, when using Boost version 1.37
32. generated from the Java sources while building and running the examples. To browse the documentation, open the index file in the docs directory (created by the build script):
    <mkl directory>\examples\java\docs\index.html
The Java wrappers for CBLAS, VML, VSL RNG, and FFT establish the interface that directly corresponds to the underlying native functions, so you can refer to the Intel MKL Reference Manual for their functionality and parameters. Interfaces for the ESSL-like functions are described in the generated documentation for the com.intel.mkl.ESSL class.
Each wrapper consists of the interface part for Java and JNI stub written in C. You can find the sources in the following directory:
    <mkl directory>\examples\java\wrappers
Both Java and C parts of the wrapper for CBLAS and VML demonstrate the straightforward approach, which you may use to cover additional CBLAS functions.
The wrapper for FFT is more complicated because it needs to support the lifecycle for FFT descriptor objects. To compute a single Fourier transform, an application needs to call the FFT software several times with the same copy of the native FFT descriptor. The wrapper provides the handler class to hold the native descriptor, while the virtual machine runs Java bytecode.
The wrapper for VSL RNG is similar to the one for FFT. The wrapper provides the handler class to hold the native descriptor of the stream state.
The wrapper for the convolution and
33. > Existing Item from the drop-down menu. The Add Existing Item - <project name> window opens.
d. Browse to the <mkl directory>\include directory. Select the header files that appear in the use statements. For example, select the mkl_dfti.f90 and mkl_trig_transforms.f90 files. Click Add. The Add Existing Item - <project name> window closes, and the selected files appear in the Header Files folder in Solution Explorer.
The next steps adjust the properties of the project:
3. Select the <project name>.
4. On the main menu, select Project > Properties to open the <project name> Property Pages window.
5. Set the Intel MKL include dependencies:
a. Select Configuration Properties > Fortran > General. In the right-hand part of the window, select Additional Include Directories > <Edit>. The Additional Include Directories window opens.
b. Type the Intel MKL include directory in quotes: "<mkl directory>\include". Click OK to close the window.
6. Select Configuration Properties > Fortran > Preprocessor. In the right-hand part of the window, select Preprocess Source File > Yes (default is No). This step is recommended because some examples require preprocessing.
7. Set library dependencies:
a. Select Configuration Properties > Linker > General. In the right-hand part of the window, select Additional Library Directories > <Edit>. The Additi
34. ifort myprog.f mkl_intel_lp64.lib mkl_sequential.lib mkl_core.lib
• Dynamic linking of myprog.f and sequential version of Intel MKL supporting the LP64 interface:
    ifort myprog.f mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib
• Static linking of myprog.f and parallel Intel MKL supporting the ILP64 interface:
    ifort myprog.f mkl_intel_ilp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
• Dynamic linking of myprog.f and parallel Intel MKL supporting the ILP64 interface:
    ifort myprog.f mkl_intel_ilp64_dll.lib mkl_intel_thread_dll.lib mkl_core_dll.lib libiomp5md.lib
• Dynamic linking of user code myprog.f and parallel or sequential Intel MKL supporting the LP64 or ILP64 interface (call appropriate functions or set environment variables to choose threaded or sequential mode and to set the interface):
    ifort myprog.f mkl_rt.lib
• Static linking of myprog.f, Fortran 95 LAPACK interface, and parallel Intel MKL supporting the LP64 interface:
    ifort myprog.f mkl_lapack95_lp64.lib mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
• Static linking of myprog.f, Fortran 95 BLAS interface, and parallel Intel MKL supporting the LP64 interface:
    ifort myprog.f mkl_blas95_lp64.lib mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
See Also
Fortran 95 Interfaces to LAPACK and BLAS
Examples for Linking a C Application
Examples for Linking a Fortran Application
Using t
35. interface, respectively:
• mkl_intel_lp64.lib or mkl_intel_ilp64.lib for static linking
• mkl_intel_lp64_dll.lib or mkl_intel_ilp64_dll.lib for dynamic linking
The ILP64 interface provides for the following:
• Support large data arrays (with more than 2^31-1 elements)
• Enable compiling your Fortran code with the /4I8 compiler option
The LP64 interface provides compatibility with the previous Intel MKL versions because "LP64" is just a new name for the only interface that the Intel MKL versions lower than 9.1 provided. Choose the ILP64 interface if your application uses Intel MKL for calculations with large data arrays, or may do so in the future.
Intel MKL provides the same include directory for the ILP64 and LP64 interfaces.
Compiling for LP64/ILP64
The table below shows how to compile for the ILP64 and LP64 interfaces:
Fortran
    Compiling for ILP64:  ifort /4I8 /I<mkl directory>\include
    Compiling for LP64:   ifort /I<mkl directory>\include
C or C++
    Compiling for ILP64:  icl /DMKL_ILP64 /I<mkl directory>\include
    Compiling for LP64:   icl /I<mkl directory>\include
CAUTION: Linking of an application compiled with the /4I8 or /DMKL_ILP64 option to the LP64 libraries may result in unpredictable consequences and erroneous output.
Coding for ILP64
You do not need to change existing code if you are not using the ILP64 interface.
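The 2^31-1 limit mentioned in this excerpt can be illustrated with a short, self-contained C sketch. The names DEMO_ILP64, demo_int, and fits_32bit_index are hypothetical and are not part of Intel MKL; they only imitate the idea of choosing the integer width at compile time and checking whether an element count still fits a 32-bit signed index.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compile-time switch, in the spirit of building the same
   source for either integer width. Not an Intel MKL macro or type. */
#ifdef DEMO_ILP64
typedef int64_t demo_int;   /* 64-bit integer arguments */
#else
typedef int32_t demo_int;   /* 32-bit integer arguments */
#endif

/* Returns 1 when an array with `count` elements can be described by a
   32-bit signed integer (at most 2^31-1 elements), and 0 otherwise. */
int fits_32bit_index(int64_t count)
{
    assert(count >= 0);
    return count <= (int64_t)INT32_MAX;
}
```

An application could use such a check to decide whether its largest array still fits the 32-bit interface or needs the 64-bit one.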
36. library for the Intel compilers
ILP64 interface library for the Intel compilers
SP2DP interface library for the Intel compilers
Fortran 95 interface library for BLAS. Supports the Intel Fortran compiler and LP64 interface
Fortran 95 interface library for BLAS. Supports the Intel Fortran compiler and ILP64 interface
Fortran 95 interface library for LAPACK. Supports the Intel Fortran compiler and LP64 interface
Fortran 95 interface library for LAPACK. Supports the Intel Fortran compiler and ILP64 interface
Threading library for the Intel compilers
Threading library for the PGI compiler
Sequential library
Kernel library for the Intel 64 architecture
Deprecated. Empty library for backward compatibility
Deprecated. Empty library for backward compatibility
Deprecated. Empty library for backward compatibility
Deprecated. Empty library for backward compatibility
ScaLAPACK routine library supporting the LP64 interface
ScaLAPACK routine library supporting the ILP64 interface
Cluster version of FFTs
LP64 version of BLACS routines supporting Intel MPI
ILP64 version of BLACS routines supporting Intel MPI
LP64 version of BLACS routines supporting MPICH2
ILP64 version of BLACS routines supporting MPICH2
LP64 version of BLACS routines supporting Microsoft MPI
ILP64 version of BLACS routines supporting Microsoft MPI
Dynamic Libraries in the lib\intel64 Directory
File: mkl_rt.lib  Inter
37. mkl_intel_thread.lib mkl_core.lib libiomp5md.lib msmpi.lib bufferoverflowu.lib
To link with Cluster FFTs using LP64 interface for a cluster of Intel 64 architecture based systems, set the environment variable and use the link line as follows:
    set lib=c:\MS CCP SDK\Lib\AMD64\;<mkl directory>\lib\intel64\;%lib%
    ifort <user files to link> mkl_cdft_core.lib mkl_blacs_mpich2_lp64.lib mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib msmpi.lib bufferoverflowu.lib
See Also
Linking with ScaLAPACK and Cluster FFTs
Linking with System Libraries
Programming with Intel Math Kernel Library in Integrated Development Environments (IDE)
Configuring Your Integrated Development Environment to Link with Intel Math Kernel Library
Configuring the Microsoft Visual C/C++ Development System to Link with Intel MKL
Steps for configuring Microsoft Visual C/C++ Development System for linking with Intel Math Kernel Library (Intel MKL) depend on whether you installed the C++ Integration(s) in Microsoft Visual Studio component of the Intel Composer XE:
• If you installed the integration component, see Automatically Linking Your Microsoft Visual C/C++ Project with Intel MKL.
• If you did not install the integration component or need more control over Intel MKL libraries to link, you can configure the Microsoft Visual C++ 2005, Visu
38. performance, compile with -DASYOUGO2 and -DASYOUGO2_DISPLAY. These options provide a lot of useful DGEMM performance information at the cost of around 0.2% performance loss.
If you want to use the old HPL, simply omit these options and recompile from scratch. To do this, try "nmake arch=<arch> clean_arch_all".
-DASYOUGO
-DASYOUGO gives performance data as the run proceeds. The performance always starts off higher and then drops because this actually happens in LU decomposition (a decomposition of a matrix into a product of a lower (L) and upper (U) triangular matrices). The ASYOUGO performance estimate is usually an overestimate (because the LU decomposition slows down as it goes), but it gets more accurate as the problem proceeds. The greater the lookahead step, the less accurate the first number may be. ASYOUGO tries to estimate where one is in the LU decomposition that MP LINPACK performs, and this is always an overestimate as compared to ASYOUGO2, which measures actually achieved DGEMM performance. Note that the ASYOUGO output is a subset of the information that ASYOUGO2 provides. So, refer to the description of the -DASYOUGO2 option below for the details of the output.
-DENDEARLY
-DENDEARLY terminates the problem after a few steps, so that you can set up 10 or 20 HPL runs without monitoring them, see how they all do, and then only run the fastest ones to completion. -DENDEARLY assumes -DASYOUGO. You do not need to define both, alth
39. provided by Intel MKL.
The C interface to LAPACK is a C-style interface to the LAPACK routines. This interface supports matrices in row-major and column-major order, which you can define in the first function argument matrix_order. Use the mkl_lapacke.h header file with the C interface to LAPACK. The header file specifies constants and prototypes of all the functions. It also determines whether the program is being compiled with a C++ compiler, and if it is, the included file will be correct for use with C++ compilation. You can find examples of the C interface to LAPACK in the examples\lapacke subdirectory in the Intel MKL installation directory.
Using Complex Types in C/C++
As described in the documentation for the Intel Visual Fortran Compiler XE, C/C++ does not directly implement the Fortran types COMPLEX(4) and COMPLEX(8). However, you can write equivalent structures. The type COMPLEX(4) consists of two 4-byte floating-point numbers. The first of them is the real-number component, and the second one is the imaginary-number component. The type COMPLEX(8) is similar to COMPLEX(4) except that it contains two 8-byte floating-point numbers.
Intel MKL provides complex types MKL_Complex8 and MKL_Complex16, which are structures equivalent to the Fortran complex types COMPLEX(4) and COMPLEX(8), respectively. The MKL_Complex8 and MKL_Complex16 types are defined in the mkl_types.h header file. You can use these types to define compl
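The "equivalent structures" described in this excerpt can be sketched in a few lines of C. The type names demo_complex8 and demo_complex16 and the helper demo_cmul16 are hypothetical and chosen for this illustration only; the layout they use (real part first, imaginary part second) follows the description above. This is a sketch, not the contents of mkl_types.h.

```c
#include <assert.h>

/* Hypothetical equivalents of the two Fortran complex types described
   above: two 4-byte or two 8-byte floating-point numbers, with the real
   component first and the imaginary component second. */
typedef struct { float  real; float  imag; } demo_complex8;
typedef struct { double real; double imag; } demo_complex16;

/* Multiplies two double-precision complex values stored this way:
   (a.re + i*a.im) * (b.re + i*b.im). */
demo_complex16 demo_cmul16(demo_complex16 a, demo_complex16 b)
{
    demo_complex16 r;
    r.real = a.real * b.real - a.imag * b.imag;
    r.imag = a.real * b.imag + a.imag * b.real;
    return r;
}
```

Because each structure holds exactly two floating-point members of the same type, an array of such structures has the same memory layout as an array of interleaved real and imaginary parts.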
40. software products.
Contents of the Intel Optimized LINPACK Benchmark
The Intel Optimized LINPACK Benchmark for Windows OS contains the following files, located in the benchmarks\linpack\ subdirectory of the Intel Math Kernel Library (Intel MKL) directory:
File in benchmarks\linpack\ : Description
linpack_xeon32.exe : The 32-bit program executable for a system based on Intel Xeon processor or Intel Xeon processor MP with or without Streaming SIMD Extensions 3 (SSE3).
linpack_xeon64.exe : The 64-bit program executable for a system with Intel Xeon processor using Intel 64 architecture.
runme_xeon32.bat : A sample shell script for executing a pre-determined problem set for linpack_xeon32.exe. OMP_NUM_THREADS set to 2 processors.
runme_xeon64.bat : A sample shell script for executing a pre-determined problem set for linpack_xeon64.exe. OMP_NUM_THREADS set to 4 processors.
lininput_xeon32 : Input file for pre-determined problem for the runme_xeon32 script.
lininput_xeon64 : Input file for pre-determined problem for the runme_xeon64 script.
win_xeon32.txt : Result of the runme_xeon32 script execution.
win_xeon64.txt : Result of the runme_xeon64 script execution.
help.lpk : Simple help file.
xhelp.lpk : Extended help file.
See Also
High-level Directory Structure
Running the Software
To obtain results for the pr
41. the main menu, select File > New > Project to open the New Project window.
c. Select Project Types > Intel Fortran > Console Application, then select Templates > Empty Project. When done, in the Name field, type <project name>, for example, MKL_PDETTF_D_TRIG_TRANSFORM_BVP, and click OK. The New Project window closes.
The next steps are performed inside the Solution Explorer window. To open it, select View > Solution Explorer from the main menu.
2. Add sources of Intel MKL example to the project:
a. Right-click the Source Files folder under <project name> and select Add > Existing Item from the drop-down menu. The Add Existing Item - <project name> window opens.
b. Browse to the Intel MKL example directory, for example, <mkl directory>\examples\pdettf\source. Select the example file and supporting files with extension ".f" or ".f90" (Fortran sources). For example, select the d_trig_tforms_bvp.f90 file. For the list of supporting files in each example directory, see Support Files for Intel MKL Examples. Click Add. The Add Existing Item - <project name> window closes, and the selected files appear in the Source Files folder in Solution Explorer.
Some examples with the use statements require the next two steps:
c. Right-click the Header Files folder under <project name> and select Add
42. those required to solve your particular problems, which helps to save disk space and build your own dynamic libraries for distribution.
The Intel MKL custom DLL builder, located in the tools\builder directory, enables you to create a dynamic library containing the selected functions. The builder contains a makefile and a definition file with the list of functions.
Using the Custom Dynamic-link Library Builder in the Command-line Mode
To build a custom DLL, use the following command:
    nmake target [<options>]
The following table lists possible values of target and explains what the command does for each value:
Value : Comment
libia32 : The builder uses static Intel MKL interface, threading, and core libraries to build a custom DLL for the IA-32 architecture.
libintel64 : The builder uses static Intel MKL interface, threading, and core libraries to build a custom DLL for the Intel 64 architecture.
dllia32 : The builder uses the single dynamic library libmkl_rt.dll to build a custom DLL for the IA-32 architecture.
dllintel64 : The builder uses the single dynamic library libmkl_rt.dll to build a custom DLL for the Intel 64 architecture.
help : The command prints Help on the custom DLL builder.
The <options> placeholder stands for the list of parameters that define macros to be used by the makefile. The following table describes these parameters:
Parameter [Values] : Description
interface : Defines which programming
43. with an additional parameter of nmake: FC=<compiler>.
For example, the command
    nmake libintel64 FC=f95 install_dir=<userf95 dir> interface=lp64
builds the required library and .mod files and installs them in subdirectories of <userf95 dir>.
To delete the library from the building directory, use one of the following commands:
• For the IA-32 architecture:
    nmake cleania32 install_dir=<user dir>
• For the Intel 64 architecture:
    nmake cleanintel64 [interface=lp64|ilp64] install_dir=<user dir>
• For all the architectures:
    nmake clean install_dir=<user dir>
CAUTION: Even if you have administrative rights, avoid setting install_dir=<mkl directory>, or any equivalent value, in a build or clean command above because these settings replace or delete the Intel MKL prebuilt Fortran 95 library and modules.
Compiler-dependent Functions and Fortran 90 Modules
Compiler-dependent functions occur whenever the compiler inserts into the object code function calls that are resolved in its run-time library (RTL). Linking of such code without the appropriate RTL will result in undefined symbols. Intel MKL has been designed to minimize RTL dependencies.
In cases where RTL dependencies might arise, the functions are delivered as source code and you need to compile the code with whatever compiler you are using for your application.
In particular
44. File : Contents
mkl_p4.dll : Pentium 4 processor kernel
mkl_p4p.dll : Kernel for the Intel Pentium 4 processor with Streaming SIMD Extensions 3 (SSE3), including Intel Core Duo and Intel Core Solo processors
mkl_p4m.dll : Kernel for processors based on the Intel Core microarchitecture (except Intel Core Duo and Intel Core Solo processors, for which mkl_p4p.dll is intended)
mkl_p4m3.dll : Kernel for the Intel Core i7 processors
mkl_vml_def.dll : VML/VSL part of default kernel for old Intel Pentium processors
mkl_vml_ia.dll : VML/VSL default kernel for newer Intel architecture processors
mkl_vml_p4.dll : VML/VSL part of Pentium 4 processor kernel
mkl_vml_p4p.dll : VML/VSL for Pentium 4 processor with Streaming SIMD Extensions 3 (SSE3)
mkl_vml_p4m.dll : VML/VSL for processors based on the Intel Core microarchitecture (except Intel Core Duo and Intel Core Solo processors, for which mkl_vml_p4p.dll is intended)
mkl_vml_p4m2.dll : VML/VSL for 45nm Hi-k Intel Core 2 and Intel Xeon processor families
mkl_vml_p4m3.dll : VML/VSL for the Intel Core i7 processors
mkl_vml_avx.dll : VML/VSL optimized for the
Further files listed: mkl_scalapack_core.dll, mkl_cdft_core.dll, libimalloc.dll; Run-time Libraries (RTL): mkl_blacs.dll, mkl_blacs_intelmpi.dll, mkl_blacs_mpich2.dll, 1033\mkl_msg.dll, 1041\mkl_msg.dll
45. 3.x Fortran interface for Intel compilers to call Intel MKL FFTs.
Single-precision interfaces for MPI FFTW version 2.x (C interface) to call Intel MKL cluster FFTs.
Double-precision interfaces for MPI FFTW version 2.x (C interface) to call Intel MKL cluster FFTs.
Interfaces for MPI FFTW version 3.x (C interface) to call Intel MKL cluster FFTs.
Interfaces for MPI FFTW version 3.x (C interface) to call Intel MKL cluster FFTs, supporting the ILP64 interface.
Modules, in architecture- and interface-specific subdirectories of the Intel MKL include directory
blas95.mod (1) : Fortran 95 interface module for BLAS (BLAS95)
lapack95.mod (1) : Fortran 95 interface module for LAPACK (LAPACK95)
f95_precision.mod (1) : Fortran 95 definition of precision parameters for BLAS95 and LAPACK95
mkl95_blas.mod (1) : Fortran 95 interface module for BLAS (BLAS95), identical to blas95.mod. To be removed in one of the future releases
mkl95_lapack.mod (1) : Fortran 95 interface module for LAPACK (LAPACK95), identical to lapack95.mod. To be removed in one of the future releases
mkl95_precision.mod (1) : Fortran 95 definition of precision parameters for BLAS95 and LAPACK95, identical to f95_precision.mod. To be removed in one of the future releases
mkl_service.mod (1) : Fortran 95 interface module for Intel MKL support functions
(1) Prebuilt for the Intel Fortran compiler
(2) FFTW3 interfaces are integrated with Intel MKL. Look into <mkl directory>\interf
46. Chapter 10: LINPACK and MP LINPACK Benchmarks
  Intel Optimized LINPACK Benchmark for Windows OS
    Contents of the Intel Optimized LINPACK Benchmark
    Running the Software
    Known Limitations of the Intel Optimized LINPACK Benchmark
  Intel Optimized MP LINPACK Benchmark for Clusters
    Overview of the Intel Optimized MP LINPACK Benchmark for Clusters
    Contents of the Intel Optimized MP LINPACK Benchmark for Clusters
    Building the MP LINPACK
    New Features of Intel Optimized MP LINPACK Benchmark
    Benchmarking a Cluster
    Options to Reduce Search Time
Appendix A: Intel Math Kernel Library Language Interfaces Support
  Language Interfaces Support, by Function Domain
  Include Files
Appendix B: Support for Third-Party Interfaces
  GMP Functions
  FFTW Interface Support
47. LAPACK Packed Routines
The routines with the names that contain the letters HP, OP, PP, SP, TP, UP in the matrix type and storage position (the second and third letters respectively) operate on the matrices in the packed format (see LAPACK "Routine Naming Conventions" sections in the Intel MKL Reference Manual). Their functionality is strictly equivalent to the functionality of the unpacked routines with the names containing the letters HE, OR, PO, SY, TR, UN in the same positions, but the performance is significantly lower.
If the memory restriction is not too tight, use an unpacked routine for better performance. In this case, you need to allocate N^2/2 more memory than the memory required by a respective packed routine, where N is the problem size (the number of equations).
For example, to speed up solving a symmetric eigenproblem with an expert driver, use the unpacked routine:
    call dsyevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work, lwork, iwork, ifail, info)
where a is the dimension lda-by-n, which is at least N^2 elements, instead of the packed routine:
    call dspevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work, iwork, ifail, info)
where ap is the dimension N*(N+1)/2.
FFT Functions
Additional conditions can improve performance of the FFT functions.
The addresses of the first elements of arrays and the leading dimension values, in bytes (n*element_size), of two-dimensional arrays should
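The memory trade-off between packed and unpacked storage described in this excerpt can be checked with simple arithmetic. The sketch below is plain C and does not call LAPACK; the function names are hypothetical. It assumes the tightest full layout (leading dimension equal to N) and the conventional packed layout in which the upper triangle is stored column by column, with zero-based indices.

```c
#include <assert.h>
#include <stddef.h>

/* Elements in a full N-by-N array (leading dimension equal to N). */
size_t full_elems(size_t n)   { return n * n; }

/* Elements in packed storage of one triangle: N*(N+1)/2. */
size_t packed_elems(size_t n) { return n * (n + 1) / 2; }

/* Zero-based position of element (i, j), i <= j, when the upper triangle
   is packed column by column: column j starts after j*(j+1)/2 elements. */
size_t packed_upper_index(size_t i, size_t j)
{
    assert(i <= j);
    return i + j * (j + 1) / 2;
}
```

For N = 1000 the full array holds 1,000,000 elements and the packed array 500,500, so the unpacked routine needs 499,500 extra elements, which is N*(N-1)/2 and therefore close to the N^2/2 quoted above.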
48. Building a Custom Dynamic-link Library in the Visual Studio Development System
You can build a custom dynamic-link library (DLL) in the Microsoft Visual Studio Development System (VS). To do this, use projects available in the tools\builder\MSVS_Projects subdirectory of the Intel MKL directory. The directory contains the VS2005, VS2008, and VS2010 subdirectories with projects for the respective versions of the Visual Studio Development System. For each version of VS two solutions are available:
• libia32.sln builds a custom DLL for the IA-32 architecture.
• libintel64.sln builds a custom DLL for the Intel 64 architecture.
The builder uses the following default settings for the custom DLL:
Interface: cdecl for the IA-32 architecture and LP64 for the Intel 64 architecture
Error handler: Native Intel MKL xerbla
Create Microsoft manifest: yes
List of functions: in the project's source file examples.def
To build a custom DLL:
1. Open the libia32.sln or libintel64.sln solution depending on the architecture of your system. The solution includes the following projects:
• i_malloc_dll
• vml_dll_core
• cdecl_parallel (in libia32.sln) or lp64_parallel (in libintel64.sln)
• cdecl_sequential (in libia32.sln) or lp64_sequential (in libintel64.sln)
2. [Optional] To change any of the default settings, select the project depending on whether the DLL will use Intel MKL funct
49. Corporation in the United States and/or other countries. Java is a registered trademark of Oracle and/or its affiliates. Copyright 2007-2011, Intel Corporation. All rights reserved. Microsoft product screen shot(s) reprinted with permission from Microsoft Corporation.
Intel Math Kernel Library for Windows OS User's Guide
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision 20110804
Introducing the Intel Math Kernel Library
The Intel Math Kernel Library (Intel MKL) improves performance of scientific, engineering, and financial software that solves large computational problems. Among other functionality, Intel MKL provides linear algebra routines, fast Fourier transforms, as well as vectorized math and random num
50. EADING LAYER environment variable to choose threaded or sequential mode:
ifort myprog.f mkl_rt.lib
• Static linking of myprog.f, Fortran 95 LAPACK interface, and parallel Intel MKL supporting the cdecl interface:
ifort myprog.f mkl_lapack95.lib mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
• Static linking of myprog.f, Fortran 95 BLAS interface, and parallel Intel MKL supporting the cdecl interface:
ifort myprog.f mkl_blas95.lib mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
See Also: Fortran 95 Interfaces to LAPACK and BLAS; Examples for Linking a C Application; Examples for Linking a Fortran Application; Using the Single Dynamic Library
Linking on Intel(R) 64 Architecture Systems
The following examples illustrate linking that uses Intel(R) compilers. The examples use the .f Fortran source file. C/C++ users should instead specify a .cpp (C++) or .c (C) file and replace ifort with icc.
Static linking of myprog.f and parallel Intel MKL supporting the LP64 interface:
ifort myprog.f mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
Dynamic linking of myprog.f and parallel Intel MKL supporting the LP64 interface:
ifort myprog.f mkl_intel_lp64_dll.lib mkl_intel_thread_dll.lib mkl_core_dll.lib libiomp5md.lib
Static linking of myprog.f and sequential version of Intel MKL supporting the LP64 interface
51. GO2_DISPLAY: Displays the performance of all the significant DGEMMs inside the run.
ENDEARLY: Displays a few performance hints and then terminates the run early.
FASTSWAP: Inserts the LAPACK-optimized DLASWP into HPL's code. You can experiment with this to determine best results.
HYBRID: Establishes the Hybrid OpenMP/MPI mode of MP LINPACK, providing the possibility to use threaded Intel MKL and prebuilt MP LINPACK hybrid libraries.
CAUTION: Use this option only with an Intel compiler and the Intel MPI library version 3.1 or higher. You are also recommended to use the compiler version 10.0 or higher.
Benchmarking a Cluster
To benchmark a cluster, follow the sequence of steps below (some of them are optional). Pay special attention to the iterative steps 3 and 4. They make a loop that searches for HPL parameters (specified in HPL.dat) that enable you to reach the top performance of your cluster.
1. Install HPL and make sure HPL is functional on all the nodes.
2. You may run nodeperf.c (included in the distribution) to see the performance of DGEMM on all the nodes. Compile nodeperf.c with your MPI and Intel MKL. For example:
icl Za O3 w D WIN I <Home directory of MPI> include <Home directory of MPI libraries> <MPI library> ie ae <mkl directory> lib intel64 mkl_core lib <Composer XE directory> lib intel64 libiomp5md
52. HPL 2.0
HPL 2.0 code modified to do ASYOUGO and ENDEARLY modifications
HPL 2.0 code modified to do ASYOUGO, ASYOUGO2, and ENDEARLY modifications
HPL 2.0 sample HPL.dat modified
All the makefiles in this directory have been rebuilt in the Windows OS distribution
Some files in here have been modified in the Windows OS distribution
Some files in here have been modified in the Windows OS distribution
(New) Sample architecture makefile for nmake utility to be used on processors based on the IA-32 and Intel 64 architectures and Windows OS
(New) Prebuilt binary for the IA-32 architecture, Windows OS, and Intel MPI
(New) Prebuilt binary for the Intel 64 architecture, Windows OS, and Intel MPI
Directory/File in benchmarks\mp_linpack | Contents
lib_hybrid\ia32\libhpl_hybrid.lib | (New) Prebuilt library with the hybrid version of MP LINPACK for the IA-32 architecture and Intel MPI
lib_hybrid\intel64\libhpl_hybrid.lib | (New) Prebuilt library with the hybrid version of MP LINPACK for the Intel 64 architecture and Intel MPI
bin_intel\ia32\xhpl_hybrid_ia32.exe | (New) Prebuilt hybrid binary for the IA-32 architecture, Windows OS, and Intel MPI
bin_intel\intel64\xhpl_hybrid_intel64.exe | (New) Prebuilt hybrid binary for the Intel 64 architecture, Windows OS, and Intel MPI
nodeperf.c | (New) Sample utility that tests the DGEMM speed across the
53. IN FFT
Setting the Environment Variables for Threading Control
To set the environment variables used for threading control, in the command shell in which the program is going to run, enter:
set <VARIABLE NAME>=<value>
For example:
set MKL_NUM_THREADS=4
set MKL_DOMAIN_NUM_THREADS="MKL_DOMAIN_ALL=1, MKL_DOMAIN_BLAS=4"
set MKL_DYNAMIC=FALSE
Some shells require the variable and its value to be exported:
export <VARIABLE NAME>=<value>
For example:
export MKL_NUM_THREADS=4
export MKL_DOMAIN_NUM_THREADS="MKL_DOMAIN_ALL=1, MKL_DOMAIN_BLAS=4"
export MKL_DYNAMIC=FALSE
You can alternatively assign values to the environment variables using Microsoft Windows OS Control Panel.
Tips and Techniques to Improve Performance
Coding Techniques
To obtain the best performance with Intel MKL, ensure the following data alignment in your source code:
• Align arrays on 16-byte boundaries. See Aligning Addresses on 16-byte Boundaries for how to do it.
• Make sure leading dimension values (n*element_size) of two-dimensional arrays are divisible by 16, where element_size is the size of an array element in bytes.
• For two-dimensional arrays, avoid leading dimension values divisible by 2048 bytes. For example, for a double-precision array, with element_size = 8, avoid leading dimensions 256, 512, 768, 1024, ... (elements). L
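The two leading-dimension tips in the Coding Techniques list (byte size divisible by 16, but not divisible by 2048) can be combined into a small padding helper. The following C sketch is an illustration under stated assumptions, not Intel MKL code: ld_is_good and pad_leading_dimension are hypothetical names, and the search simply increments the leading dimension until both conditions hold.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a leading dimension of ld elements follows both tips:
   ld*element_size is divisible by 16 and not divisible by 2048. */
int ld_is_good(size_t ld, size_t element_size) {
    size_t bytes = ld * element_size;
    return bytes % 16 == 0 && bytes % 2048 != 0;
}

/* Smallest leading dimension >= n (in elements) that follows both tips.
   Assumes element_size itself is not a multiple of 2048; otherwise no
   valid leading dimension exists. */
size_t pad_leading_dimension(size_t n, size_t element_size) {
    assert(n > 0 && element_size > 0 && element_size % 2048 != 0);
    size_t ld = n;
    while (!ld_is_good(ld, element_size)) {
        ld++;
    }
    return ld;
}
```

For a double-precision matrix with 256 rows, the helper returns 258: 256 elements is exactly 2048 bytes, 257 elements is not a multiple of 16 bytes, and 258 elements (2064 bytes) satisfies both conditions.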
54. Intel Math Kernel Library for Windows OS User's Guide
Intel MKL Windows OS
Document Number: 315930-018US
Legal Information
Contents
Legal Information
Introducing the Intel Math Kernel Library
Getting Help and Support
Notational Conventions
Chapter 1 Overview
Document Overview
What's New
Related Information
Chapter 2 Getting Started
Checking Your Installation
Setting Environment Variables
Compiler Support
Using Code Examples
What You Need to Know Before You Begin Using the Intel Math Kernel Library
Chapter 3 Structure of the Intel Math Kernel Library
Architecture Support
High-level Directory Structure
Layered Model Concept
55. Intel MKL
Steps for configuring Intel Visual Fortran for linking with Intel Math Kernel Library (Intel MKL) depend on whether you installed the Visual Fortran Integration(s) in Microsoft Visual Studio component of the Intel Composer XE:
• If you installed the integration component, see Automatically Linking Your Intel Visual Fortran Project with Intel MKL.
• If you did not install the integration component or need more control over Intel MKL libraries to link, you can configure your project as follows:
1. Select Project > Properties > Linker > General > Additional Library Directories. Add architecture-specific directories for Intel MKL and OpenMP libraries, for example: <mkl directory>\lib\ia32 and <Composer XE directory>\compiler\lib\ia32
2. Select Project > Properties > Linker > Input > Additional Dependencies. Insert names of the required libraries, for example: mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
3. Select Project > Properties > Debugging > Environment. Add architecture-specific paths to dynamic-link libraries:
• For OpenMP support, for example, enter PATH=%PATH%;<Composer XE directory>\redist\ia32\compiler
• For Intel MKL (only if you link dynamically), for example, enter PATH=%PATH%;<Composer XE directory>\redist\ia32\mkl
See Also: Intel Software Documentation Library
Running
56. MPI is used for message passing. If you are using a non-default MPI, assign the same appropriate value to MKL_BLACS_MPI on all nodes.
See Also: Setting Environment Variables on a Cluster
Setting Environment Variables on a Cluster
If you are using MPICH2 or Intel MPI, to set an environment variable on the cluster, use the -env, -genv, and -genvlist keys of mpiexec. See the following MPICH2 examples on how to set the value of OMP_NUM_THREADS:
mpiexec -genv OMP_NUM_THREADS 2
mpiexec -genvlist OMP_NUM_THREADS
mpiexec -n 1 -host first -env OMP_NUM_THREADS 2 test.exe : -n 1 -host second -env OMP_NUM_THREADS 3 test.exe
See the following Intel MPI examples on how to set the value of MKL_BLACS_MPI:
mpiexec -genv MKL_BLACS_MPI INTELMPI
mpiexec -genvlist MKL_BLACS_MPI
mpiexec -n 1 -host first -env MKL_BLACS_MPI INTELMPI test.exe : -n 1 -host second -env MKL_BLACS_MPI INTELMPI test.exe
When using MPICH2, you may have problems with getting the global environment, such as MKL_BLACS_MPI, by the -genvlist key. In this case, set up user or system environments on each node as follows: from the Start menu, select Settings > Control Panel > System > Advanced > Environment Variables. If you are using Microsoft MPI, the above ways of setting environment variables are also applicable if the Microsoft Single Program Multiple Data (SPMD) process managers are running in a debug mode on all nodes of the cluster
57. Microsoft Visual C C project linking with Intel R MKL 28 mixed language programming 61 module Fortran 95 59 MP LINPACK benchmark 89 multi core performance 53 notational conventions 13 number of threads changing at run time 46 changing with OpenMP environment variable 46 Intel R MKL choice particular cases 49 setting for cluster 73 techniques to set 46 P parallel performance 45 108 parallelism of Intel R MKL 43 performance multi core 53 with denormals 54 with subnormals 54 S ScaLAPACK linking with 71 SDL 28 32 sequential mode of Intel R MKL 36 Single Dynamic Library 28 32 stdcall calling convention use in C C 60 structure high level 23 in detail model 25 support technical 11 supported architectures 23 system libraries linking with 38 T technical support 11 thread safety of Intel R MKL 43 threaded functions 43 threaded problems 43 threading control Intel R MKL specific 48 threading libraries linking with 36 U uBLAS matrix matrix multiplication substitution with Intel MKL functions 64 unstable output getting rid of 69 usage information 15 V Visual Studio 2008 IDE configuring a project that runs Intel R MKL code example 78 Visual Studio IDE IntelliSense with Intel R MKL 84 using Intel R MKL context sensitive Help in 83 Veiwing Intel R MKL documentation in 82
58. NTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to http://www.intel.com/design/literature.htm. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to http://www.intel.com/products/processor_number. Software and workloads used in performance tests may have been optimi
59. OK to close the window.
7. Set library dependencies:
a. Select Configuration Properties > Linker > General. In the right-hand part of the window, select Additional Library Directories > the browse button. The Additional Library Directories window opens.
b. Click the New Line button (the first button in the uppermost row). When the new line appears in the window, click the browse button. The Select Directory window opens.
c. Browse to the directory with the Intel MKL libraries, <mkl directory>\lib\<architecture>, where <architecture> is one of {ia32, intel64}, for example: <mkl directory>\lib\ia32. (For most laptop and desktop computers, <architecture> is ia32.) Click OK. The Select Directory window closes, and the full path to the Intel MKL libraries appears in the Additional Library Directories window.
d. Click the New Line button again. When the new line appears in the window, click the browse button. The Select Directory window opens.
e. Browse to the <Composer XE directory>\compiler\lib\<architecture>, where <architecture> is one of {ia32, intel64}, for example: <Composer XE directory>\compiler\lib\ia32. Click OK. The Select Directory window closes, and the specified full path appears in the Additional Library Directories window.
f. Click OK to close the Additional Library Directories window.
g. Select Configuration Properties > Linker > Input. In the right-hand p
60. ScaLAPACK routine library supporting the LP64 interface ScaLAPACK routine library supporting the ILP64 interface Cluster FFT dynamic library Dynamic library to support renaming of memory functions LP64 version of BLACS routines ILP64 version of BLACS routines LP64 version of BLACS routines supporting Intel MPI ILP64 version of BLACS routines supporting Intel MPI LP64 version of BLACS routines supporting MPICH2 ILP64 version of BLACS routines supporting MPICH2 LP64 version of BLACS routines supporting Microsoft MPI ILP64 version of BLACS routines supporting Microsoft MPI Catalog of Intel Math Kernel Library Intel MKL messages in English Catalog of Intel MKL messages in Japanese Available only if the Intel MKL package provides Japanese localization Please see the Release Notes for this information Index Index A affinity mask 53 aligning data 69 architecture support 23 BLAS calling routines from C 61 Fortran 95 interface to 59 threaded routines 43 building a custom DLL in Visual Studio IDE 41 C C interface to LAPACK use of 61 C calling LAPACK BLAS CBLAS from 61 C C Intel R MKL complex types 62 calling BLAS functions from C 63 CBLAS interface from C 63 complex BLAS Level 1 function from C 63 complex BLAS Level 1 function from C 63 Fortran style routines from C 61 calling convention cdecl and stdcall 19 CBLAS interface use of 61 cdecl interface use of 33 Cluster FFT linking with 71
61. To get rid of these warnings, go to Project > Properties, and when the <project name> Property Pages window opens, go to Configuration Properties > C/C++ > Preprocessor. In the right-hand part of the window, select Preprocessor Definitions, add _CRT_SECURE_NO_WARNINGS, and click OK.
12. To run the example, select Debug > Start Debugging. The Console window opens.
13. You can see the results of the example in the Console window. If you used the getchar statement to pause execution of the program, press Enter to complete the run. If you used a breakpoint to pause execution of the program, select Debug > Continue. The Console window closes.
See Also: Running an Intel MKL Example in the Visual Studio 2008 IDE
Creating, Configuring, and Running the Intel Visual Fortran Project
This section demonstrates how to create an Intel Visual Fortran project running an Intel MKL example in Microsoft Visual Studio 2008. The instructions below create a Win32/Debug project running one Intel MKL example in a Console window. For details on creation of different kinds of Microsoft Visual Studio projects, refer to MSDN Visual Studio documentation at http://www.microsoft.com. To create and configure a Win32/Debug project running the Intel MKL Fortran example with the Intel Visual Fortran Compiler integrated into Visual Studio, perform the following steps:
1. Create a Visual Fortran Project:
a. Open Visual Studio 2008.
b. On
62. Values in C/C++ Code
Support for Boost uBLAS Matrix-matrix Multiplication
Invoking Intel MKL Functions from Java Applications
Intel MKL Java Examples
Running the Java Examples
Known Limitations of the Java Examples
Chapter 7 Coding Tips
Aligning Data for Consistent Results
Using Predefined Preprocessor Symbols for Intel MKL Version-Dependent Compilation
Chapter 8 Working with the Intel Math Kernel Library Cluster Software
MPI Support
Linking with ScaLAPACK and Cluster FFTs
Determining the Number of Threads
Using DLLs
Setting Environment Variables on a Cluster
Building ScaLAPACK Tests
Examples for Linking with ScaLAPACK and Cluster FFT
Ex
63. Wy I alap ei ey Id intf row ta tc n i 0 i lt 10 i LINEE MACs iewin Wee Wal aly ot ala 6 CblasNoTrans CbhlasNoTrans Tab losicel Cy keke a i SIZE Fortran language PROGRAM DGEMM DIFF THREADS 47 5 Intel Math Kernel Library for Windows OS User s Guide INTEGER N I J PARAMETER N 100 REAL 8 A N N B N N C N N REAL 8 ALPHA BETA iol tae OHH ily 5 CALL DGEMM N N N N N ALPHA A N B N BETA C N print Row A C DO i 1 10 peite TA P20 i BAC eh T Al CTE END DO CALL OMP SET NUM THREADS 1 DO I 1 DORI ir A I J 1B I pad Clips END DO END DO CALL DGEMM N N N N N ALPHA A N B N BETA C N Disinigg 7a ROW ANC HH AE ug E E gh es c Oo DO i 1 10 meitelt 14 20 8 920 8 Ty ACL Ww Ca 1 END DO CALL OMP SET NUM THREADS 2 DO Tl E DO J 1 N A I J Itd ETa Tej C I J 0 0 END DO END DO CALL DGEMM N N N N N ALPHA A N B N BETA C N Deine ROW A CY DO i 1 10 piaite Y t BAO GTA a Ail CT END DO STOP END Using Additional Threading Control Intel MKL specific Environment Variables for Threading Control Intel MKL provides optional threading controls that is the environment variables and support functions that are independent of OpenMP They behave similar to their OpenMP equivalents but take precedence over them in the meaning that th
64. a path to Fortran 95 modules precompiled with the Intel Fortran compiler to the INCLUDE environment variable. Supply this parameter only if you are using the Intel Fortran compiler.
• Interface of the Fortran 95 modules. This parameter is needed only if you requested addition of a path to the modules.
Usage and values of these parameters depend on the script. The following table lists values of the script parameters.
Script | Architecture (required, when applicable) | Addition of a Path to Fortran 95 Modules (optional) | Interface (optional)
mklvars_ia32 | n/a† | mod | n/a
mklvars_intel64 | n/a | mod | lp64 (default), ilp64
mklvars | ia32, intel64 | mod | lp64 (default), ilp64
† Not applicable.
For example:
• The command mklvars_ia32 sets environment variables for the IA-32 architecture and adds no path to the Fortran 95 modules.
• The command mklvars_intel64 mod ilp64 sets environment variables for the Intel 64 architecture and adds the path to the Fortran 95 modules for the ILP64 interface to the INCLUDE environment variable.
• The command mklvars intel64 mod sets environment variables for the Intel 64 architecture and adds the path to the Fortran 95 modules for the LP64 interface to the INCLUDE environment variable.
NOTE: Supply the parameter specifying the architecture first, if it is needed. Values of the other two parameters can be listed in any order.
See Also: High-level Directory Structure; Interface Libraries and Modu
65. ace. For example, you can list the cdecl entry points as follows:
DGEMM DTRSM DDOT DGETRF DGETRS cblas_dgemm cblas_ddot
You can list the stdcall entry points as follows:
_DGEMM@60 _DDOT@20 _DGETRF@24
For more examples, see domain-specific lists of function names in the <mkl directory>\tools\builder folder. This folder contains lists of function names for both cdecl and stdcall interfaces.
NOTE: The lists of function names are provided in the <mkl directory>\tools\builder folder merely as examples. See Composing a List of Functions for how to compose lists of functions for your custom DLL.
TIP: Names of Fortran-style routines (BLAS, LAPACK, etc.) can be upper-case or lower-case, with or without the trailing underscore. For example, these names are equivalent:
BLAS: dgemm, DGEMM, dgemm_, DGEMM_
LAPACK: dgetrf, DGETRF, dgetrf_, DGETRF_
Properly capitalize names of C support functions in the function list. To do this, follow the guidelines below:
1. In the mkl_service.h include file, look up a #define directive for your function.
2. Take the function name from the replacement part of that directive.
For example, the #define directive for the mkl_disable_fast_mm function is
#define mkl_disable_fast_mm MKL_Disable_Fast_MM
Capitalize the name of this function in the list like this: MKL_Disable_Fast_MM. For the names of the Fortran support functions, see the tip
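The TIP in this excerpt says that a Fortran-style routine name has four equivalent spellings: lower case, upper case, and each of those with a trailing underscore. As a minimal sketch of that rule, the C helper below generates the four spellings from one base name. The function fortran_name_variants and the constant VARIANT_LEN are hypothetical names introduced for illustration; they are not part of Intel MKL or its builder tools.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define VARIANT_LEN 64  /* illustrative buffer size for one spelling */

/* Fills out[0..3] with the equivalent spellings of a Fortran-style
   routine name: lower, UPPER, lower_, UPPER_. */
void fortran_name_variants(const char *base, char out[4][VARIANT_LEN]) {
    size_t len = strlen(base);
    assert(len + 2 <= VARIANT_LEN);  /* room for '_' and the terminator */
    for (size_t k = 0; k < len; k++) {
        unsigned char ch = (unsigned char)base[k];
        out[0][k] = out[2][k] = (char)tolower(ch);
        out[1][k] = out[3][k] = (char)toupper(ch);
    }
    out[0][len] = '\0';
    out[1][len] = '\0';
    out[2][len] = '_';
    out[2][len + 1] = '\0';
    out[3][len] = '_';
    out[3][len + 1] = '\0';
}
```

Calling the helper with "dgemm" yields dgemm, DGEMM, dgemm_, and DGEMM_, matching the BLAS example in the TIP. The rule applies only to Fortran-style routines; C support functions such as MKL_Disable_Fast_MM keep the capitalization given in mkl_service.h.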
66. aces fftw3x makefile for options defining how to build and where to place the standalone library with the wrappers.
See Also: Fortran 95 Interfaces to LAPACK and BLAS
Fortran 95 Interfaces to LAPACK and BLAS
Fortran 95 interfaces are compiler-dependent. Intel MKL provides the interface libraries and modules precompiled with the Intel Fortran compiler. Additionally, the Fortran 95 interfaces and wrappers are delivered as sources. (For more information, see Compiler-dependent Functions and Fortran 90 Modules.) If you are using a different compiler, build the appropriate library and modules with your compiler and link the library as a user's library:
1. Go to the respective directory <mkl directory>\interfaces\blas95 or <mkl directory>\interfaces\lapack95
2. Type one of the following commands depending on your architecture:
• For the IA-32 architecture: nmake libia32 install_dir=<user dir>
• For the Intel 64 architecture: nmake libintel64 [interface=lp64|ilp64] install_dir=<user dir>
Important: The parameter install_dir is required.
As a result, the required library is built and installed in the <user dir>\lib directory, and the .mod files are built and installed in the <user dir>\include\<arch>[\{lp64|ilp64}] directory, where <arch> is one of {ia32, intel64}. By default, the ifort compiler is assumed. You may change the compiler
67. age information for all Intel MKL function domains. This User's Guide provides the following information:
• Describes post-installation steps to help you start using the library
• Shows you how to configure the library with your development environment
• Acquaints you with the library structure
• Explains how to link your application with the library and provides simple usage scenarios
• Describes how to code, compile, and run your application with Intel MKL
This guide is intended for Windows OS programmers with beginner to advanced experience in software development.
See Also: Language Interfaces Support, by Function Domain
What's New
This User's Guide documents the Intel Math Kernel Library (Intel MKL) 10.3 Update 8. The document was updated to reflect addition of Data Fitting Functions to the product and to describe how to build a custom dynamic-link library in the Visual Studio Development System (see Building a Custom Dynamic-link Library in the Visual Studio Development System).
Related Information
To reference how to use the library in your application, use this guide in conjunction with the following documents:
• The Intel Math Kernel Library Reference Manual, which provides reference information on routine functionalities, parameter descriptions, interfaces, calling syntaxes, and return values.
• The Intel Math Kernel Library for Windows OS Release Notes.
68. al C++ 2008 or Visual C++ 2010 development system by performing the following steps. Though some versions of the Visual C++ development system may vary slightly in the menu items mentioned below, the fundamental configuring steps are applicable to all these versions.
1. From the menu, select View > Solution Explorer (and make sure this window is active).
2. Select Tools > Options > Projects > VC++ Directories.
3. From the Show directories for list, select Include Files. Add the directory for the Intel MKL include files, that is, <mkl directory>\include.
4. From the Show directories for list, select Library Files. Add architecture-specific directories for Intel MKL and OpenMP libraries, for example: <mkl directory>\lib\ia32 and <Composer XE directory>\compiler\lib\ia32.
5. From the Show directories for list, select Executable Files. Add architecture-specific directories with dynamic-link libraries:
• For OpenMP support, for example: <Composer XE directory>\redist\ia32\compiler
• For Intel MKL (only if you link dynamically), for example: <Composer XE directory>\redist\ia32\mkl
6. Select Project > Properties > Configuration Properties > Linker > Input > Additional Dependencies. Add the libraries required, for example: mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
See Also: Intel Software Documentation Library; Linking in Detail
Configuring Intel Visual Fortran to Link with
69. al array B, you can access a matrix element like this:
A[i][j] = B[i*n+j] in C (i=0, ..., m-1, j=0, ..., n-1)
A(i,j) = B((j-1)*m+i) in Fortran (i=1, ..., m, j=1, ..., n).
When calling LAPACK or BLAS routines from C, be aware that because the Fortran language is case-insensitive, the routine names can be upper-case or lower-case, with or without the trailing underscore. For example, the following names are equivalent:
• LAPACK: dgetrf, DGETRF, dgetrf_, and DGETRF_
• BLAS: dgemm, DGEMM, dgemm_, and DGEMM_
See Example "Calling a Complex BLAS Level 1 Function from C" on how to call BLAS routines from C. See also the Intel(R) MKL Reference Manual for a description of the C interface to LAPACK functions.
CBLAS
Instead of calling BLAS routines from a C-language program, you can use the CBLAS interface. CBLAS is a C-style interface to the BLAS routines. You can call CBLAS routines using regular C-style calls. Use the mkl.h header file with the CBLAS interface. The header file specifies enumerated values and prototypes of all the functions. It also determines whether the program is being compiled with a C++ compiler, and if it is, the included file will be correct for use with C++ compilation. Example "Using CBLAS Interface Instead of Calling BLAS Directly from C" illustrates the use of the CBLAS interface.
C Interface to LAPACK
Instead of calling LAPACK routines from a C-language program, you can use the C interface to LAPACK
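The element-access formulas at the start of this excerpt contrast C row-major storage with Fortran column-major storage of an m-by-n matrix in a one-dimensional array B. The C sketch below illustrates both mappings and a conversion between them. All function names are hypothetical helpers rather than Intel MKL routines, and the Fortran position is computed under the assumption that B is indexed from 1.

```c
#include <assert.h>
#include <stddef.h>

/* C convention: 0-based position of A[i][j] in a row-major array,
   where n is the number of columns. */
size_t c_index(size_t i, size_t j, size_t n) { return i * n + j; }

/* Fortran convention: 1-based position of A(i,j) in a column-major
   array, where m is the number of rows and i, j start at 1. */
size_t fortran_index(size_t i, size_t j, size_t m) {
    assert(i >= 1 && j >= 1);
    return (j - 1) * m + i;
}

/* Copies an m-by-n row-major matrix into column-major order, which is
   the layout that Fortran-style BLAS and LAPACK routines expect. */
void row_to_col_major(const double *src, double *dst, size_t m, size_t n) {
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            dst[j * m + i] = src[c_index(i, j, n)];
        }
    }
}
```

For a 2-by-3 matrix stored row by row as {1, 2, 3, 4, 5, 6}, the column-major copy is {1, 4, 2, 5, 3, 6}. Copying is only one option: a row-major matrix passed unchanged to a Fortran-style routine is interpreted as its transpose, which is sometimes sufficient.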
70. amples for Linking a C Application
Examples for Linking a Fortran Application
Chapter 9 Programming with Intel Math Kernel Library in Integrated Development Environments (IDE)
Configuring Your Integrated Development Environment to Link with Intel Math Kernel Library
Configuring the Microsoft Visual C/C++ Development System to Link with Intel MKL
Configuring Intel Visual Fortran to Link with Intel MKL
Running an Intel MKL Example in the Visual Studio 2008 IDE
Creating, Configuring, and Running the Intel C/C++ and/or Visual C++ 2008 Project
Creating, Configuring, and Running the Intel Visual Fortran Project
Support Files for Intel Math Kernel Library Examples
Known Limitations of the Project Creation Procedure
Getting Assistance for Programming in the Microsoft Visual Studio IDE
Viewing Intel MKL Documentation in Visual Studio IDE
Using Context-Sensitive Help
Using the IntelliSense Capability
71. an Intel MKL Example in the Visual Studio 2008 IDE
This section explains how to create and configure projects with the Intel Math Kernel Library (Intel MKL) examples in Microsoft Visual Studio 2008. For Intel MKL examples where the instructions below do not work, see Known Limitations.
To run the Intel MKL C examples in Microsoft Visual Studio 2008:
1. Do either of the following:
• Install Intel C/C++ Compiler and integrate it into Visual Studio (recommended).
• Use the Microsoft Visual C++ 2008 Compiler integrated into Visual Studio.
2. Create, configure, and run the Intel C/C++ and/or Microsoft Visual C++ 2008 project.
To run the Intel MKL Fortran examples in Microsoft Visual Studio 2008:
1. Install Intel Visual Fortran Compiler and integrate it into Visual Studio. The default installation of the Intel Visual Fortran Compiler performs this integration. For more information, see the Intel Visual Fortran Compiler documentation.
2. Create, configure, and run the Intel Visual Fortran project.
Creating, Configuring, and Running the Intel C/C++ and/or Visual C++ 2008 Project
This section demonstrates how to create a Visual C/C++ project using an Intel Math Kernel Library (Intel MKL) example in Microsoft Visual Studio 2008. The instructions below create a Win32/Debug project running one Intel MKL example in a Console window. For details on creation of different kinds of Microsoft Visual Studio projects, refer to MSDN Visual Studio doc
72. ar solvers
• All mathematical VML functions
• FFT
For the list of FFT transforms that can be threaded, see Threaded FFT Problems.
Threaded LAPACK Routines
In the following list, ? stands for a precision prefix of each flavor of the respective routine and may have the value of s, d, c, or z. The following LAPACK routines are threaded:
• Linear equations, computational routines:
  Factorization: ?getrf, ?gbtrf, ?potrf, ?pptrf, ?sytrf, ?hetrf, ?sptrf, ?hptrf
  Solving: ?dttrsb, ?gbtrs, ?gttrs, ?pptrs, ?pbtrs, ?pttrs, ?sytrs, ?sptrs, ?hptrs, ?tptrs, ?tbtrs
• Orthogonal factorization, computational routines: ?geqrf, ?ormqr, ?unmqr, ?ormlq, ?unmlq, ?ormql, ?unmql, ?ormrq, ?unmrq
• Singular Value Decomposition, computational routines: ?gebrd, ?bdsqr
• Symmetric Eigenvalue Problems, computational routines: ?sytrd, ?hetrd, ?sptrd, ?hptrd, ?steqr, ?stedc
• Generalized Nonsymmetric Eigenvalue Problems, computational routines: chgeqz/zhgeqz
A number of other LAPACK routines, which are based on threaded LAPACK or BLAS routines, make effective use of parallelism: ?gesv, ?posv, ?gels, ?gesvd, ?syev, ?heev, cgegs/zgegs, cgegv/zgegv, cgges/zgges, cggesx/zggesx, cggev/zggev, cggevx/zggevx, and so on.
Threaded BLAS Level1 and Level2 Routines
In the following list, ? stands for a precision prefix of each flavor of the respective routine and may have the value of s, d, c
73. ariable. You can set the number of threads using the environment variable OMP_NUM_THREADS. To change the number of threads, in the command shell in which the program is going to run, enter: set OMP_NUM_THREADS=<number of threads to use>. Some shells require the variable and its value to be exported: export OMP_NUM_THREADS=<number of threads to use>. You can alternatively assign a value to the environment variable using the Microsoft Windows OS Control Panel. Note that you will not benefit from setting this variable on Microsoft Windows 98 or Windows ME because multiprocessing is not supported. See Also: Using Additional Threading Control. Changing the Number of Threads at Run Time. You cannot change the number of threads during run time using environment variables. However, you can call OpenMP API functions from your program to change the number of threads during run time. The following sample code shows how to change the number of threads during run time using the omp_set_num_threads() routine. See also Techniques to Set the Number of Threads. The following example shows both C and Fortran code examples. To run this example in the C language, use the omp.h header file from the Intel(R) compiler package. If you do not have the Intel compiler but wish to explore the functionality in the example, use the Fortran API for omp_set_num_threads() rather than the C version. For examp
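As a minimal sketch of the run-time approach described above (not the guide's own sample), the helper below requests a thread count through the OpenMP run-time API. The function name request_threads is hypothetical. omp_set_num_threads() and omp_get_max_threads() are standard OpenMP routines. The #ifdef guard is an assumption added so the sketch also compiles when OpenMP is not enabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>   /* OpenMP run-time API, present when OpenMP is enabled */
#endif

/* Hypothetical helper: ask the OpenMP runtime to use n threads for the
   parallel regions that follow, and return the maximum number of threads
   the runtime now reports. Without OpenMP support the request is ignored
   and 1 is returned. */
int request_threads(int n)
{
    assert(n > 0);
#ifdef _OPENMP
    omp_set_num_threads(n);
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

A program would call request_threads(2) before the threaded section whose thread count it wants to change. The value returned may differ from the value requested if the runtime limits it.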
74. art of the window, select Additional Dependencies > the browse button. The Additional Dependencies window opens. h. Type the libraries required; for example, if <architecture>=ia32, type mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib. For more details, see Linking in Detail. i. Click OK to close the Additional Dependencies window. j. If the Intel MKL example directory does not contain a data directory, skip the next step. 8. Set data dependencies for the Intel MKL example: a. Select Configuration Properties > Debugging. In the right-hand part of the window, select Command Arguments > <Edit>. The Command Arguments window opens. b. Type the path to the proper data file in quotes. The name of the data file is the same as the name of the example file, with a .d extension; for example, "<mkl directory>\examples\cblas\data\cblas_caxpyix.d". c. Click OK to close the Command Arguments window. 9. Click OK to close the <project name> Property Pages window. 10. Certain examples do not pause before the end of execution. To see the results printed in the Console window, set a breakpoint at the very last return 0; statement or add a call to getchar() before the last return 0; statement. 11. To build the solution, select Build > Build Solution. NOTE: You may see warnings about unsafe functions and variables
75. ath Kernel Library for Windows OS User's Guide. Getting Help and Support. Intel provides a support web site that contains a rich repository of self-help information, including getting started tips, known product issues, product errata, license information, user forums, and more. Visit the Intel MKL support website at http://www.intel.com/software/products/support/. The Intel MKL documentation integrates into the Microsoft Visual Studio integrated development environment (IDE). See Getting Assistance for Programming in the Microsoft Visual Studio IDE. Notational Conventions. The following term is used in reference to the operating system: Windows OS. This term refers to information that is valid on all supported Windows operating systems. The following notations are used to refer to Intel MKL directories: <Composer XE directory>: the installation directory for the Intel C++ Composer XE or Intel Visual Fortran Composer XE. <mkl directory>: the main directory where Intel MKL is installed: <mkl directory>=<Composer XE directory>\mkl. Replace this placeholder with the specific pathname in the configuring, linking, and building instructions. The following font conventions are used in this document: Italic; Monospace lowercase mixed with uppercase; UPPERCASE MONOSPACE; Monospace italic; items
76. based systems, set the environment variable and use the link line as follows:
set lib=c:\mpich2x64\lib;<mkl directory>\lib\intel64;%lib%
icl <user files to link> mkl_scalapack_lp64.lib mkl_blacs_mpich2_lp64.lib mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib mpi.lib cxx.lib bufferoverflowu.lib
To link with Cluster FFT using LP64 interface for a cluster of Intel 64 architecture based systems, set the environment variable and use the link line as follows:
set lib=c:\mpich2x64\lib;<mkl directory>\lib\intel64;%lib%
icl <user files to link> mkl_cdft_core.lib mkl_blacs_mpich2_lp64.lib mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib mpi.lib cxx.lib bufferoverflowu.lib
See Also: Linking with ScaLAPACK and Cluster FFTs; Linking with System Libraries. Examples for Linking a Fortran Application. These examples illustrate linking of an application whose main module is in Fortran under the following conditions: Microsoft Windows Compute Cluster Pack SDK is installed in c:\MS CCP SDK. You use the Intel Fortran Compiler 10.0 or higher. To link with ScaLAPACK using LP64 interface for a cluster of Intel 64 architecture based systems, set the environment variable and use the link line as follows:
set lib=c:\MS CCP SDK\Lib\AMD64;<mkl directory>\lib\intel64;%lib%
ifort <user files to link> mkl_scalapack_lp64.lib mkl_blacs_mpich2_lp64.lib mkl_intel_lp64.lib
77. be added to the project for respective examples: examples\cblas\source\common_func.c; examples\dftc\source\dfti_example_status_print.c, dfti_example_support.c. Known Limitations of the Project Creation Procedure. You cannot create a Visual Studio project using the instructions from Creating, Configuring, and Running the Intel C/C++ and/or Visual C++ 2008 Project or Creating, Configuring, and Running the Intel Visual Fortran Project for examples from the following directories: examples\blas, examples\blas95, examples\cdftc, examples\cdftf, examples\dftf, examples\fftw2x_cdft, examples\fftw2xc, examples\fftw2xf, examples\fftw3xc, examples\fftw3xf, examples\java, examples\lapack, examples\lapack95. Getting Assistance for Programming in the Microsoft Visual Studio IDE. Viewing Intel MKL Documentation in Visual Studio IDE. Viewing Intel MKL Documentation in Document Explorer (Visual Studio 2005/2008 IDE). Intel MKL documentation is integrated in the Visual Studio IDE (VS) help collection. To open Intel MKL help: 1. Select Help > Contents from the menu. This displays the list of VS Help collections. 2. Click Intel Math Kernel Library Help. 3. In the help tree that expands, click Intel MKL Reference Manual. To open the help index, select Help > Index from the menu. To search in the help, select Help > Search from the menu and enter a search string
78. be divisible by cache line size, which equals: 32 bytes for the Intel Pentium III processors; 64 bytes for the Intel Pentium 4 processors and processors using Intel 64 architecture. Hardware Configuration Tips. Dual-Core Intel Xeon processor 5100 series systems: To get the best performance with Intel MKL on Dual-Core Intel Xeon processor 5100 series systems, enable the Hardware DPL (streaming data) Prefetcher functionality of this processor. To configure this functionality, use the appropriate BIOS settings, as described in your BIOS documentation. Intel Hyper-Threading Technology: Intel Hyper-Threading Technology (Intel HT Technology) is especially effective when each thread performs different types of operations and when there are under-utilized resources on the processor. However, Intel MKL fits neither of these criteria because the threaded portions of the library execute at high efficiencies using most of the available resources and perform identical operations on each thread. You may obtain higher performance by disabling Intel HT Technology. If you run with Intel HT Technology enabled, performance may be especially impacted if you run on fewer threads than physical cores. Moreover, if, for example, there are two threads to every physical core, the thread scheduler may assign two threads to some cores and ignore the other cores altogether. If you are using the OpenMP li
79. ber generation functions, all optimized for the latest Intel processors, including processors with multiple cores (see the Intel MKL Release Notes for the full list of supported processors). Intel MKL also performs well on non-Intel processors. Intel MKL is thread-safe and extensively threaded using the OpenMP technology. Intel MKL provides the following major functionality: Linear algebra, implemented in LAPACK (solvers and eigensolvers) plus level 1, 2, and 3 BLAS, offering the vector, vector-matrix, and matrix-matrix operations needed for complex mathematical software. If you prefer the FORTRAN 90/95 programming language, you can call LAPACK driver and computational subroutines through specially designed interfaces with reduced numbers of arguments. A C interface to LAPACK is also available. ScaLAPACK (SCAlable LAPACK) with its support functionality, including the Basic Linear Algebra Communications Subprograms (BLACS) and the Parallel Basic Linear Algebra Subprograms (PBLAS). ScaLAPACK is available for Intel MKL for Linux and Windows operating systems. Direct sparse solver, an iterative sparse solver, and a supporting set of sparse BLAS (level 1, 2, and 3) for solving sparse systems of equations. Multidimensional discrete Fourier transforms (1D, 2D, 3D) with a mixed radix support (for sizes not limited to powers of 2). Distributed versions of these functions are provided for use on clusters on the Linux and Windows operating systems
80. brary of the Intel Compiler, read the respective User Guide on how to best set the thread affinity interface to avoid this situation. For Intel MKL, apply the following setting: set KMP_AFFINITY=granularity=fine,compact,1,0. See Also: Using Parallelism of the Intel Math Kernel Library. Managing Multi-core Performance. You can obtain best performance on systems with multi-core processors by requiring that threads do not migrate from core to core. To do this, bind threads to the CPU cores by setting an affinity mask to threads. Use one of the following options: OpenMP facilities (recommended, if available), for example, the KMP_AFFINITY environment variable using the Intel OpenMP library; a system function, as explained below. Consider the following performance issue: The system has two sockets with two cores each, for a total of four cores (CPUs). Performance of the four-thread parallel application using the Intel MKL LAPACK is unstable. The following code example shows how to resolve this issue by setting an affinity mask by operating system means using the Intel compiler. The code calls the system function SetThreadAffinityMask to bind the threads to appropriate cores, thus preventing migration of the threads. Then the Intel MKL LAPACK routine is called.
// Set affinity mask
#include <windows.h>
#include <omp.h>
int main(void) {
#pragma omp parallel default(shared)
{
int tid = omp_get_thread_num();
// 2 packag
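The mask computation at the heart of the binding technique described above can be sketched portably. This is only an illustration under stated assumptions: the function name affinity_mask_for_thread and the one-thread-per-logical-CPU mapping are hypothetical, and they are not the mapping used in the guide's truncated example. On Windows, the resulting value would be passed to SetThreadAffinityMask together with the handle from GetCurrentThread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mapping: thread tid is pinned to logical CPU (tid % ncpus).
   Bit k of the returned mask set to 1 means "may run on logical CPU k". */
uint64_t affinity_mask_for_thread(int tid, int ncpus)
{
    assert(tid >= 0);
    assert(ncpus > 0 && ncpus <= 64);   /* one mask bit per logical CPU */
    return (uint64_t)1 << (tid % ncpus);
}
```

For the four-core system discussed above, threads 0 to 3 receive masks 1, 2, 4, and 8, so each thread stays on its own core and the scheduler cannot migrate it.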
81. Directory: Contents
lib\ia32: Static libraries and static interfaces to DLLs for the IA-32 architecture
lib\intel64: Static libraries and static interfaces to DLLs for the Intel 64 architecture
examples: Examples directory. Each subdirectory has source and data files
include: INCLUDE files for the library routines, as well as for tests and examples
include\ia32: Fortran 95 .mod files for the IA-32 architecture and Intel Fortran compiler
include\intel64\lp64: Fortran 95 .mod files for the Intel 64 architecture, Intel Fortran compiler, and LP64 interface
include\intel64\ilp64: Fortran 95 .mod files for the Intel 64 architecture, Intel Fortran compiler, and ILP64 interface
include\fftw: Header files for the FFTW2 and FFTW3 interfaces
interfaces\blas95: Fortran 95 interfaces to BLAS and a makefile to build the library
interfaces\fftw2x_cdft: MPI FFTW 2.x interfaces to Intel MKL Cluster FFTs
interfaces\fftw3x_cdft: MPI FFTW 3.x interfaces to Intel MKL Cluster FFTs
interfaces\fftw2xc: FFTW 2.x interfaces to the Intel MKL FFTs (C interface)
interfaces\fftw2xf: FFTW 2.x interfaces to the Intel MKL FFTs (Fortran interface)
interfaces\fftw3xc: FFTW 3.x interfaces to the Intel MKL FFTs (C interface)
interfaces\fftw3xf: FFTW 3.x interfaces to the Intel MKL FFTs (Fortran interface)
interfaces\lapack95: Fortran 95 interfaces to LAPACK and a makefile to build the library
tests: Source and data files for tests
tools: Command-line link tool and tools for creating custom dynamically linkable libraries
tools\builder
redist\ia32\mkl; redist\intel64\mkl; Documentation\en_US\MKL; Documentation\vshelp\1033\intel.mkldocs; Documentation\msvhelp\1033\mkl
See Also: Notational Conventions
82. cale>\mkl. redist.txt: List of redistributable files. mkl_documentation.htm: Overview and links for the Intel MKL documentation. mkl_manual\index.htm: Intel MKL Reference Manual in an uncompressed HTML format. Release_Notes.htm: Intel MKL Release Notes. mkl_userguide\index.htm: Intel MKL User's Guide in an uncompressed HTML format (this document). mkl_link_line_advisor.htm: Intel MKL Link-line Advisor. Linking Your Application with the Intel Math Kernel Library. Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804. Linking Quick Start. Intel Math Kernel Library (Intel MKL) provides several options for quick linking of your application. The simplest options depend on your development environment Intel
83. ces directory. File name: Contains. Libraries in Intel MKL architecture-specific directories: mkl_blas95.lib: Fortran 95 wrappers for BLAS (BLAS95) for IA-32 architecture. mkl_blas95_ilp64.lib: Fortran 95 wrappers for BLAS (BLAS95) supporting ILP64 interface. mkl_blas95_lp64.lib: Fortran 95 wrappers for BLAS (BLAS95) supporting LP64 interface. mkl_lapack95.lib: Fortran 95 wrappers for LAPACK (LAPACK95) for IA-32 architecture. mkl_lapack95_lp64.lib: Fortran 95 wrappers for LAPACK (LAPACK95) supporting LP64 interface. mkl_lapack95_ilp64.lib: Fortran 95 wrappers for LAPACK (LAPACK95) supporting ILP64 interface. File names: fftw2xc_intel.lib, fftw2xc_ms.lib, fftw2xf_intel.lib, fftw3xc_intel.lib, fftw3xc_ms.lib, fftw3xf_intel.lib, fftw2x_cdft_SINGLE.lib, fftw2x_cdft_DOUBLE.lib, fftw3x_cdft.lib, fftw3x_cdft_ilp64.lib. Contains, in the same order: Interfaces for FFTW version 2.x (C interface for Intel compilers) to call Intel MKL FFTs. Interfaces for FFTW version 2.x (C interface for Microsoft compilers) to call Intel MKL FFTs. Interfaces for FFTW version 2.x (Fortran interface for Intel compilers) to call Intel MKL FFTs. Interfaces for FFTW version 3.x (C interface for Intel compiler) to call Intel MKL FFTs. Interfaces for FFTW version 3.x (C interface for Microsoft compilers) to call Intel MKL FFTs. Interfaces for FFTW version
84. Intel Optimized LINPACK Benchmark for Windows OS. Intel Optimized LINPACK Benchmark is a generalization of the LINPACK 1000 benchmark. It solves a dense (real*8) system of linear equations (Ax=b), measures the amount of time it takes to factor and solve the system, converts that time into a performance rate, and tests the results for accuracy. The generalization is in the number of equations (N) it can solve, which is not limited to 1000. It uses partial pivoting to assure the accuracy of the results. Do not use this benchmark to report LINPACK 100 performance because that is a compiled-code only benchmark. This is a shared-memory (SMP) implementation which runs on a single platform. Do not confuse this benchmark with: MP LINPACK, which is a distributed memory version of the same benchmark; LINPACK, the library, which has been expanded upon by the LAPACK library. Intel provides optimized versions of the LINPACK benchmarks to help you obtain high LINPACK benchmark results on your genuine Intel processor systems more easily than with the High Performance Linpack (HPL) benchmark. Use this package to benchmark your SMP machine. Additional information on this software as well as other Intel software performance products is available at http://www.intel.com
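The step "converts that time into a performance rate" can be illustrated with a short sketch. It assumes the operation count conventionally credited to a dense N-by-N solve in LINPACK reporting, 2N^3/3 + 2N^2. That formula is not stated in this guide, and the function names linpack_flops and linpack_gflops are hypothetical.

```c
#include <assert.h>

/* Assumed conventional operation count for factoring and solving a dense
   N x N system: 2/3 * N^3 + 2 * N^2 floating-point operations. */
double linpack_flops(double n)
{
    assert(n > 0.0);
    return 2.0 * n * n * n / 3.0 + 2.0 * n * n;
}

/* Performance rate in GFlop/s for a solve of size n that took `seconds`. */
double linpack_gflops(double n, double seconds)
{
    assert(seconds > 0.0);
    return linpack_flops(n) / seconds / 1.0e9;
}
```

Under this assumption, a run with N = 1000 that finishes in one second is credited with roughly 0.67 GFlop/s.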
85. cide what MPI you will use with the Intel MKL cluster software. You are strongly encouraged to use Intel MPI 3.2 or later. MPI used. Reason: To link your application with ScaLAPACK and/or Cluster FFT, the libraries corresponding to your particular MPI should be listed on the link line (see Working with the Cluster Software). Structure of the Intel Math Kernel Library. Architecture Support. Intel Math Kernel Library (Intel MKL) for Windows OS provides two architecture-specific implementations. The following table lists the supported architectures and directories where each archite
86. cluster. See Also: High-level Directory Structure. Building the MP LINPACK. The MP LINPACK Benchmark contains a few sample architecture makefiles. You can edit them to fit your specific configuration. Specifically: Set TOPdir to the directory that MP LINPACK is being built in. Set MPI variables, that is, MPdir, MPinc, and MPlib. Specify the location of Intel MKL and of files to be used (LAdir, LAinc, LAlib). Adjust compiler and compiler/linker options. Specify the version of MP LINPACK you are going to build (hybrid or non-hybrid) by setting the version parameter for the nmake command. For example: nmake arch=intel64 mpi=intelmpi version=hybrid install. For some sample cases, the makefiles contain values that must be common. However, you need to be familiar with building an HPL and picking appropriate values for these variables. New Features of Intel Optimized MP LINPACK Benchmark. The toolset is basically identical with the HPL 2.0 distribution. There are a few changes that are optionally compiled in and disabled until you specifically request them. These new features are: ASYOUGO: Provides non-intrusive performance information while runs proceed. There are only a few outputs and this information does not impact performance. This is especially useful because many runs can go for hours without any information. ASYOUGO2: Provides slightly intrusive additional performance information by intercepting every DGEMM call. ASYOU
87. complex and complex-to-real transforms are not threaded. 1D complex-to-complex transforms using split-complex layout are not threaded. Prime-size complex-to-complex 1D transforms are not threaded. Multidimensional transforms: All multidimensional transforms on large-volume data are threaded. Avoiding Conflicts in the Execution Environment. Certain situations can cause conflicts in the execution environment that make the use of threads in Intel MKL problematic. This section briefly discusses why these problems exist and how to avoid them. If you thread the program using OpenMP directives and compile the program with Intel compilers, Intel MKL and the program will both use the same threading library. Intel MKL tries to determine if it is in a parallel region in the program, and if it is, it does not spread its operations over multiple threads unless you specifically request Intel MKL to do so via the MKL_DYNAMIC functionality. However, Intel MKL can be aware that it is in a parallel region only if the threaded program and Intel MKL are using the same threading library. If your program is threaded by some other means, Intel MKL may operate in multithreaded mode, and the performance may suffer due to overuse of the resources. The following table considers several cases where the conflicts may arise and provides recommendations depending on your threading model. Threading model: You thread the program using OS threads (Win32 threads on Windows
88. correlation functions mitigates the same difficulty of the VSL interface, which assumes a similar lifecycle for "task descriptors". The wrapper utilizes the ESSL-like interface for those functions, which is simpler for the case of 1-dimensional data. The JNI stub additionally encapsulates the MKL functions into the ESSL-like wrappers written in C and so "packs" the lifecycle of a task descriptor into a single call to the native method. The wrappers meet the JNI Specification versions 1.1 and 5.0 and should work with virtually every modern implementation of Java. The examples and the Java part of the wrappers are written for the Java language described in "The Java Language Specification (First Edition)" and extended with the feature of "inner classes" (this refers to late 1990s). This level of language version is supported by all versions of the Sun Java Development Kit (JDK) developer toolkit and compatible implementations starting from version 1.1.5, or by all modern versions of Java. The level of C language is "Standard C" (that is, C89) with additional assumptions about integer and floating-point data types required by the Intel MKL interfaces and the JNI header files. That is, the native float and double data types must be the same as JNI jfloat and jdouble data types, respectively, and the native int must be 4 bytes long. Footnote 1: IBM Engineering Scientific Subroutine Library (ESSL). See Also: Running the Java Examples. Running the Java Exam
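The data-type assumptions listed above can be checked with a few lines of C. This is a hedged sketch, not part of the wrappers: the function names are hypothetical, and the check compares sizes only, relying on the fact that JNI defines jfloat as a 32-bit and jdouble as a 64-bit floating-point type.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 when the native types have the widths the wrappers assume:
   a 4-byte int, a float as wide as jfloat (32 bits), and a double as
   wide as jdouble (64 bits). Returns 0 otherwise. */
int native_types_match_jni(void)
{
    return sizeof(int) * CHAR_BIT == 32
        && sizeof(float) * CHAR_BIT == 32
        && sizeof(double) * CHAR_BIT == 64;
}

/* Hypothetical guard to call once before using the native stubs. */
void require_jni_compatible_types(void)
{
    assert(native_types_match_jni());
}
```

A size comparison does not prove that the floating-point formats are identical, so a check of this kind is a necessary condition rather than a complete one.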
89. cture-specific implementation is located. Architecture: Location. IA-32 or compatible: <mkl directory>\lib\ia32; <Composer XE directory>\redist\ia32\mkl (DLLs). Intel 64 or compatible: <mkl directory>\lib\intel64; <Composer XE directory>\redist\intel64\mkl (DLLs). See Also: High-level Directory Structure; Detailed Structure of the IA-32 Architecture Directories; Detailed Structure of the Intel 64 Architecture Directories. High-level Directory Structure. Directory: Contents. <mkl directory>: Installation directory of the Intel Math Kernel Library (Intel MKL). Subdirectories of <mkl directory>: bin: Batch files to set environmental variables in the user shell. bin\ia32: Batch files for the IA-32 architecture. bin\intel64: Batch files for the Intel 64 architecture. benchmarks\linpack: Shared-Memory (SMP) version of the LINPACK benchmark. benchmarks\mp_linpack: Message-passing interface (MPI) version of the LINPACK benchmark. Directory: lib\ia32, lib\intel64, examples, include, include\ia32, include\intel64\lp64, include\intel64\ilp64, include\fftw, interfaces\blas95, interfaces\fftw2x_cdft, interfaces\fftw3x_cdft, interfaces\fftw2x
90. details, see the Intel Composer XE documentation. See Also: Intel Software Documentation Library. Automatically Linking Your Intel Visual Fortran Project with Intel MKL. Configure your Intel Visual Fortran project for automatic linking with Intel MKL as follows: Go to Project > Properties > Libraries > Use Intel Math Kernel Library and select Parallel, Sequential, or Cluster as appropriate. Specific Intel MKL libraries that link with your application may depend on more project settings. For details, see the Intel Visual Fortran Compiler XE User and Reference Guides. See Also: Intel Software Documentation Library. Using the Single Dynamic Library. You can simplify your link line through the use of the Intel MKL Single Dynamic Library (SDL). To use SDL, place mkl_rt.lib on your link line. For example: icl.exe application.c mkl_rt.lib. mkl_rt.lib is the import library for mkl_rt.dll. SDL enables you to select the interface and threading library for Intel MKL at run time. By default, linking with SDL provides: LP64 interface on systems based on the Intel 64 architecture; Intel threading. To use other interfaces or change threading preferences, including use of the sequential version of Intel MKL, you need to specify your choices using functions or environment variables, as explained in section Dynamically Selecting the Interface and Threading Layer.
91. e. Because the default CVF format is not identical with stdcall, you must specially handle strings in the calling sequence. See how to do it in sections on interfaces in the CVF documentation. Use the following declaration: <type> name( <prototype variable1>, <prototype variable2>, ... ); If you are using a Fortran compiler to link with the cdecl or stdcall interface library, provide compiler options as explained in the table below. Interface Library: Compiler Options: Comment. CVF compiler: mkl_intel_s[_dll].lib: Default. mkl_intel_c[_dll].lib: /iface=(cref, nomixed_str_len_arg). Intel Fortran compiler: mkl_intel_c[_dll].lib: Default. mkl_intel_s[_dll].lib: /Gm or /iface:cvf. Comment: the /Gm and /iface:cvf options enable compatibility of the CVF and Powerstation calling conventions. See Also: Using the stdcall Calling Convention in C/C++; Compiling an Application that Calls the Intel Math Kernel Library and Uses the CVF Calling Conventions. Using the ILP64 Interface vs. LP64 Interface. The Intel MKL ILP64 libraries use the 64-bit integer type (necessary for indexing large arrays, with more than 2^31-1 elements), whereas the LP64 libraries index arrays with the 32-bit integer type. The LP64 and ILP64 interfaces are implemented in the Interface layer. Link with the following interface libraries for the LP64 or ILP64
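The 2^31-1 limit mentioned above suggests a simple sizing check. The sketch below is only an illustration: the function name needs_ilp64 and the macro LP64_MAX_ELEMENTS are hypothetical and are not part of Intel MKL.

```c
#include <assert.h>
#include <stdint.h>

/* Largest element count that a 32-bit signed index can address: 2^31 - 1. */
#define LP64_MAX_ELEMENTS INT32_MAX

/* Returns 1 if an array with n_elements entries cannot be indexed with the
   32-bit integers of the LP64 interface, and 0 if LP64 indexing suffices. */
int needs_ilp64(int64_t n_elements)
{
    assert(n_elements >= 0);
    return n_elements > (int64_t)LP64_MAX_ELEMENTS;
}
```

For example, a dense 50000-by-50000 matrix holds 2.5 billion elements, which exceeds the limit, so needs_ilp64 reports 1 for it.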
92. e Computational layer accommodates multiple architectures through identification of architecture features and chooses the appropriate binary code at run time. Compiler Run-time Libraries (RTL): To support threading with Intel compilers, Intel MKL uses RTLs of the Intel C++ Composer XE or Intel Visual Fortran Composer XE. To thread using third-party threading compilers, use libraries in the Threading layer or an appropriate compatibility library. See Also: Using the ILP64 Interface vs. LP64 Interface; Linking Your Application with the Intel Math Kernel Library; Linking with Threading Libraries. Contents of the Documentation Directories. Most of Intel MKL documentation is installed at <Composer XE directory>\Documentation\<locale>\mkl. For example, the documentation in English is installed at <Composer XE directory>\Documentation\en_US\mkl. However, some Intel MKL-related documents are installed one or two levels up. The following table lists MKL-related documentation. File name: Comment. Files in <Composer XE directory>\Documentation: <locale>\clicense.rtf or <locale>\flicense.rtf: Common end user license for the Intel C++ Composer XE 2011 or Intel Visual Fortran Composer XE 2011, respectively. mklsupport.txt: Information on package number for customer support reference. Contents of <Composer XE directory>\Documentation\<lo
93. e MKL-specific threading controls are inspected first. By using these controls along with OpenMP variables, you can thread the part of the application that does not call Intel MKL and the library independently from each other. These controls enable you to specify the number of threads for Intel MKL independently of the OpenMP settings. Although Intel MKL may actually use a different number of threads from the number suggested, the controls will also enable you to instruct the library to try using the suggested number when the number used in the calling application is unavailable. NOTE: Sometimes Intel MKL does not have a choice on the number of threads for certain reasons, such as system resources. Use of the Intel MKL threading controls in your application is optional. If you do not use them, the library will mainly behave the same way as Intel MKL 9.1 in what relates to threading, with the possible exception of a different default number of threads. Section "Number of User Threads" in the "Fourier Transform Functions" chapter of the Intel MKL Reference Manual shows how the Intel MKL threading controls help to set the number of threads for the FFT computation. The table below lists the Intel MKL environment variables for threading control, their equivalent functions, and OMP counterparts. Environment Variable: Support Function: Comment: Equivalent OpenMP Environment Variable. MKL_NUM
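The "inspected first" precedence described above can be sketched as a small lookup. This models only the ordering, not the library's actual logic: the helper names are hypothetical, and the variable names MKL_NUM_THREADS and OMP_NUM_THREADS in the usage note come from general Intel MKL and OpenMP documentation, since the table in this fragment is cut off.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a positive decimal thread count. Returns 0 for NULL, empty,
   non-numeric, or out-of-range text. */
static int parse_positive(const char *text)
{
    char *end;
    long value;
    if (text == NULL || *text == '\0')
        return 0;
    value = strtol(text, &end, 10);
    if (*end != '\0' || value <= 0 || value > 1048576L)
        return 0;
    return (int)value;
}

/* Hypothetical model of the precedence: the MKL-specific setting is
   consulted first, then the OpenMP setting, then the fallback. */
int suggested_threads(const char *mkl_value, const char *omp_value, int fallback)
{
    int n;
    assert(fallback > 0);
    if ((n = parse_positive(mkl_value)) > 0)
        return n;
    if ((n = parse_positive(omp_value)) > 0)
        return n;
    return fallback;
}
```

A caller could pass getenv("MKL_NUM_THREADS") and getenv("OMP_NUM_THREADS") as the first two arguments. As noted above, the library may still use a different number of threads from the one suggested.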
94. e determined sample problem sizes on a given system, type one of the following, as appropriate: runme_xeon32.bat, runme_xeon64.bat. To run the software for other problem sizes, see the extended help included with the program. Extended help can be viewed by running the program executable with the -e option: linpack_xeon32.exe -e, linpack_xeon64.exe -e. The pre-defined data input files lininput_xeon32 and lininput_xeon64 are provided merely as examples. Different systems have different numbers of processors or amounts of memory and thus require new input files. The extended help can be used for insight into proper ways to change the sample input files. Each input file requires at least the following amount of memory: lininput_xeon32: 2 GB; lininput_xeon64: 16 GB. If the system has less memory than the above sample data input requires, you may need to edit or create your own data input files, as explained in the extended help. Each sample script uses the OMP_NUM_THREADS environment variable to set the number of processors it is targeting. To optimize performance on a different number of physical processors, change that line appropriately. If you run the Intel Optimized LINPACK Benchmark without setting the number of threads, it will default to the number of cores according to the OS. You can find the settings for this environment variable in the runme_* sample scripts. If the settings do not yet match the situation for your machine, edit the script
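When preparing an input file for a machine with less memory, a rough size estimate helps. The sketch below assumes 8-byte double-precision elements and counts only the N-by-N coefficient matrix, so it underestimates what the benchmark actually needs. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to store an n x n matrix of 8-byte elements. */
uint64_t matrix_bytes(uint64_t n)
{
    assert(n <= 1000000u);   /* keeps 8 * n * n well within 64 bits */
    return 8u * n * n;
}

/* Largest n whose matrix fits into `bytes`, found by binary search. */
uint64_t max_problem_size(uint64_t bytes)
{
    uint64_t lo = 0, hi = 1000000u;
    while (lo < hi) {
        uint64_t mid = lo + (hi - lo + 1) / 2;
        if (matrix_bytes(mid) <= bytes)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}
```

Under these assumptions, 2 GB of memory holds a matrix of at most 16384 equations. A practical input file should use a smaller N to leave room for the operating system and the benchmark's work arrays.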
95. MPI Support. Intel MKL ScaLAPACK and Cluster FFTs support MPI implementations identified in the Intel Math Kernel Library (Intel MKL) Release Notes. To link applications with ScaLAPACK or Cluster FFTs, you need to configure your system depending on your message-passing interface (MPI) implementation, as explained below. If you are using MPICH2, do the following: 1. Add mpich2\include to the include path (assuming the default MPICH2 installation). 2. Add mpich2\lib to the library path. 3. Add mpi.lib to your link command. 4. Add fmpich2.lib to your Fortran link command. 5. Add cxx.lib to your Release target link command and cxxd.lib to your Debug target link command for C++ programs. If you are using the Microsoft MPI, do the following: 1. Add Microsoft Compute Cluster Pack\include to the include path (assuming the default installation of the Microsoft MPI). 2. Add Microsoft Compute Cluster Pack\Lib\AMD64 to the library path. 3. Add msmpi.lib to your link command. If you are using the Intel MPI, do the following: 1. Add the following string to the include path: %ProgramFiles%\Intel\MPI\<ver>\<arch>\include, where <ver> is the directory for a particular MPI version and <arch> is ia32 or intel64, for example, %ProgramFiles%\Intel\MPI\3.1\intel64\include
96. ecture. The command takes the list of functions from the functions list file and uses the native Intel MKL error handler xerbla. An example of a more complex case follows: nmake ia32 interface=stdcall export=my_func_list.txt name=mkl_small xerbla=my_xerbla.obj. In this case, the command creates the mkl_small.dll and mkl_small.lib libraries with the stdcall interface for processors using the IA-32 architecture. The command takes the list of functions from the my_func_list.txt file and uses the user's error handler my_xerbla.obj. The process is similar for processors using the Intel 64 architecture. See Also: Linking with System Libraries. Composing a List of Functions: To compose a list of functions for a minimal custom DLL needed for your application, you can use the following procedure: (1) Link your application with installed Intel MKL libraries to make sure the application builds. (2) Remove all Intel MKL libraries from the link line and start linking. Unresolved symbols indicate Intel MKL functions that your application uses. (3) Include these functions in the list. Important: Each time your application starts using more Intel MKL functions, update the list to include the new functions. See Also: Specifying Function Names. Specifying Function Names: In the file with the list of functions for your custom DLL, adjust function names to the required interf
97. el 1 function zdotc(). This function computes the dot product of two double-precision complex vectors. In this example, the complex dot product is returned in the structure c.
    #include <stdio.h>
    #include "mkl.h"
    #define N 5
    int main()
    {
        int n = N, inca = 1, incb = 1, i;
        MKL_Complex16 a[N], b[N], c;
        for (i = 0; i < n; i++) {
            a[i].real = (double)i;       a[i].imag = (double)i * 2.0;
            b[i].real = (double)(n - i); b[i].imag = (double)i * 2.0;
        }
        /* The result c is passed as the first argument */
        zdotc(&c, &n, a, &inca, b, &incb);
        printf("The complex dot product is: ( %6.2f, %6.2f)\n", c.real, c.imag);
        return 0;
    }
Example "Calling a Complex BLAS Level 1 Function from C++". Below is the C++ implementation:
    #include <complex>
    #include <iostream>
    #define MKL_Complex16 std::complex<double>
    #include "mkl.h"
    #define N 5
    int main()
    {
        int n, inca = 1, incb = 1, i;
        std::complex<double> a[N], b[N], c;
        n = N;
        for (i = 0; i < n; i++) {
            a[i] = std::complex<double>(i, i * 2.0);
            b[i] = std::complex<double>(n - i, i * 2.0);
        }
        zdotc(&c, &n, a, &inca, b, &incb);
        std::cout << "The complex dot product is: " << c << std::endl;
        return 0;
    }
Example "Using CBLAS Interface Instead of Calling BLAS Directly from C". This example uses CBLAS:
    #include <stdio.h>
    #include "mkl.h"
    typedef struct { double re; double im; } complex16;
    #define
98. ent to MKL_NUM_THREADS = 4. MKL_DOMAIN_ALL=1, MKL_DOMAIN_BLAS=4: All parts of Intel MKL should try one thread, except for BLAS, which is suggested to try four threads. MKL_DOMAIN_VML=2: VML should try two threads. The setting affects no other part of Intel MKL. Be aware that the domain-specific settings take precedence over the overall ones. For example, the "MKL_DOMAIN_BLAS=4" value of MKL_DOMAIN_NUM_THREADS suggests trying four threads for BLAS, regardless of a later setting of MKL_NUM_THREADS, and a function call "mkl_domain_set_num_threads(4, MKL_DOMAIN_BLAS);" suggests the same, regardless of later calls to mkl_set_num_threads(). However, a function call with input MKL_DOMAIN_ALL, such as "mkl_domain_set_num_threads(4, MKL_DOMAIN_ALL);", is equivalent to "mkl_set_num_threads(4)", and thus it will be overwritten by later calls to mkl_set_num_threads. Similarly, the environment setting of MKL_DOMAIN_NUM_THREADS with "MKL_DOMAIN_ALL=4" will be overwritten with MKL_NUM_THREADS = 2. Whereas the MKL_DOMAIN_NUM_THREADS environment variable enables you to set several variables at once, for example, "MKL_DOMAIN_BLAS=4,MKL_DOMAIN_FFT=2", the corresponding function does not take string syntax. So, to do the same with the function calls, you may need to make several calls, which in this example are as follows: mkl_domain_set_num_threads(4, MKL_DOMAIN_BLAS); mkl_domain_set_num_threads(2, MKL_DOMA
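The precedence rules in this excerpt can be pictured with a small toy model. The sketch below does not call Intel MKL; the type thread_settings, the DOMAIN_* constants, and the three functions are hypothetical names chosen only to mirror the behavior described in the text: a domain-specific setting survives later global settings, while a setting for "all domains" behaves like the global one.

```c
#include <assert.h>

/* Toy model only: none of these names belong to Intel MKL. */
enum domain { DOMAIN_ALL, DOMAIN_BLAS, DOMAIN_FFT, DOMAIN_VML, DOMAIN_COUNT };

struct thread_settings {
    int global;                    /* models the overall thread-count suggestion */
    int per_domain[DOMAIN_COUNT];  /* 0 means "no domain-specific setting" */
};

/* Models a global setting: replaces any earlier global value. */
void set_num_threads(struct thread_settings *s, int n)
{
    assert(n > 0);
    s->global = n;
}

/* Models a domain setting: "all domains" is the same as the global setting. */
void set_domain_num_threads(struct thread_settings *s, int n, enum domain d)
{
    assert(n > 0 && d < DOMAIN_COUNT);
    if (d == DOMAIN_ALL)
        s->global = n;          /* can be overwritten by a later global setting */
    else
        s->per_domain[d] = n;   /* kept regardless of later global settings */
}

/* Domain-specific value wins when present; otherwise the global value applies. */
int effective_threads(const struct thread_settings *s, enum domain d)
{
    if (d != DOMAIN_ALL && s->per_domain[d] > 0)
        return s->per_domain[d];
    return s->global;
}
```

In this model, setting four threads for DOMAIN_BLAS and then two threads globally leaves BLAS at four and every other domain at two, which matches the rule stated above.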
99. ently use the GMP library, you need to modify INCLUDE statements in your programs to mkl_gmp.h. FFTW Interface Support: Intel Math Kernel Library (Intel MKL) offers two collections of wrappers for the FFTW interface (www.fftw.org). The wrappers are the superstructure of FFTW to be used for calling the Intel MKL Fourier transform functions. These collections correspond to the FFTW versions 2.x and 3.x and the Intel MKL versions 7.0 and later. These wrappers enable using Intel MKL Fourier transforms to improve the performance of programs that use FFTW, without changing the program source code. See the "FFTW Interface to Intel Math Kernel Library" appendix in the Intel MKL Reference Manual for details on the use of the wrappers. Important: For ease of use, the FFTW3 interface is also integrated in Intel MKL. Directory Structure in Detail: Tables in this section show contents of the Intel(R) Math Kernel Library (Intel(R) MKL) architecture-specific directories. Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel
100. Dynamically Selecting the Interface and Threading Layer; Linking with Interface Libraries; Using the cdecl and stdcall Interfaces; Using the ILP64 Interface vs. LP64 Interface; Linking with Fortran 95 Interface Libraries; Linking with Threading Libraries; Sequential Mode of the Library; Selecting the Threading Layer; Linking with Computational Libraries; Linking with Compiler Run-time Libraries; Linking with System Libraries; Building Custom Dynamic-link Libraries; Using the Custom Dynamic-link Library Builder in the Command-line Mode; Composing a List of Functions; Specifying Function Names
101. ers on a distributed memory machine. On a shared memory machine, use the Intel Optimized LINPACK Benchmark. Intel provides optimized versions of the LINPACK benchmarks to help you obtain high LINPACK benchmark results on your systems based on genuine Intel processors more easily than with the HPL benchmark. Use the Intel Optimized MP LINPACK Benchmark to benchmark your cluster. The prebuilt binaries require that Intel MPI 3.x be installed on the cluster first. The run-time version of Intel MPI is free and can be downloaded from www.intel.com/software/products. The Intel package includes software developed at the University of Tennessee, Knoxville, Innovative Computing Laboratories, and neither the University nor ICL endorse or promote this product. Although HPL 2.0 is redistributable under certain conditions, this particular package is subject to the Intel MKL license. Intel MKL has introduced a new functionality into MP LINPACK, which is called a hybrid build, while continuing to support the older version. The term hybrid refers to special optimizations added to take advantage of mixed OpenMP/MPI parallelism. If you want to use one MPI process per node and to achieve further parallelism by means of OpenMP, use the hybrid build. In general, the hybrid build is useful when the number of MPI processes per core is less than one. If you want to rely exclusively on MPI for parallelism and use one MPI process per core, use the non-hybrid build. I
102. es x 2 cores/pkg x 1 threads/core (4 total cores)
        DWORD_PTR mask = (1 << (tid == 0 ? 0 : 2));
        SetThreadAffinityMask(GetCurrentThread(), mask);
    }
    // Call Intel MKL LAPACK routine
    return 0;
    }
Compile the application with the Intel compiler using the following command: icl /Qopenmp test_application.c, where test_application.c is the filename for the application. Build the application. Run it in four threads, for example, by using the environment variable to set the number of threads: set OMP_NUM_THREADS=4, then test_application.exe. See Windows API documentation at msdn.microsoft.com for the restrictions on the usage of Windows API routines and particulars of the SetThreadAffinityMask function used in the above example. See also a similar example at en.wikipedia.org/wiki/Affinity_mask. Operating on Denormals: The IEEE 754-2008 standard, "An IEEE Standard for Binary Floating-Point Arithmetic", defines denormal (or subnormal) numbers as non-zero numbers smaller than the smallest possible normalized numbers for a specific floating-point format. Floating-point operations on denormals are slower than on normalized operands because denormal operands and results are usually handled through a software assist mechanism rather than directly in hardware. This software processing causes Intel MKL functions that consume denormals to run slower than with normalized floating-poi
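To find out whether input data contains denormals before it reaches a library routine, you can classify the values with the standard C fpclassify macro. The following is a minimal sketch in plain C; the function names is_denormal and count_denormals are hypothetical helpers, not Intel MKL routines.

```c
#include <float.h>
#include <math.h>
#include <stddef.h>

/* Returns 1 if x is a denormal (subnormal) number, 0 otherwise.
   Zero, normalized numbers, infinities, and NaNs all return 0. */
int is_denormal(double x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Counts the denormal entries in an array of n doubles. */
size_t count_denormals(const double *v, size_t n)
{
    size_t i, count = 0;
    for (i = 0; i < n; i++) {
        if (is_denormal(v[i]))
            count++;
    }
    return count;
}
```

DBL_MIN from float.h is the smallest positive normalized double, so any non-zero value with a smaller magnitude is classified as denormal. The check assumes the compiler is not run in a mode that flushes denormals to zero.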
103. etween cdecl and stdcall at link time according to the function names. Setting the Threading Layer: To set the threading layer at run time, use the mkl_set_threading_layer function or the MKL_THREADING_LAYER environment variable. The following table lists available threading layers along with the values to be used to set each layer.
    Threading Layer | Value of MKL_THREADING_LAYER | Value of the Parameter of mkl_set_threading_layer
    Intel threading | INTEL | MKL_THREADING_INTEL
    Sequential mode of Intel MKL | SEQUENTIAL | MKL_THREADING_SEQUENTIAL
    PGI threading | PGI | MKL_THREADING_PGI
If the mkl_set_threading_layer function is called, the environment variable MKL_THREADING_LAYER is ignored. By default, Intel threading is used. See the Intel MKL Reference Manual for details of the mkl_set_threading_layer function. Replacing Error Handling and Progress Information Routines: You can replace the Intel MKL error handling routine xerbla or progress information routine mkl_progress with your own function. If you are using SDL, to replace xerbla or mkl_progress, call the mkl_set_xerbla or mkl_set_progress function, respectively. See the Intel MKL Reference Manual for details. NOTE: If you are using SDL, you cannot perform the replacement by linking the object file with your implementation of xerbla or mkl_progress
104. ex data. You can also redefine the types with your own types before including the mkl_types.h header file. The only requirement is that the types must be compatible with the Fortran complex layout, that is, the complex type must be a pair of real numbers for the values of real and imaginary parts. For example, you can use the following definitions in your C++ code: #define MKL_Complex8 std::complex<float> and #define MKL_Complex16 std::complex<double>. See Example "Calling a Complex BLAS Level 1 Function from C++" for details. You can also define these types in the command line: -DMKL_Complex8="std::complex<float>" -DMKL_Complex16="std::complex<double>". See Also: Intel Software Documentation Library. Calling BLAS Functions that Return the Complex Values in C/C++ Code: Complex values that functions return are handled differently in C and Fortran. Because BLAS is Fortran-style, you need to be careful when handling a call from C to a BLAS function that returns complex values. However, in addition to normal function calls, Fortran enables calling functions as though they were subroutines, which provides a mechanism for returning the complex value correctly when the function is called from a C program. When a Fortran function is called as a subroutine, the return value is the first parameter in the calling sequence. You can use this feature to call a BLAS function from C
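The calling convention described here can be illustrated without the library. The sketch below is a plain C stand-in, not the Intel MKL zdotc routine: demo_zdotc and complex16 are hypothetical names, and the function only mimics the shape of a Fortran function called as a subroutine, where the result is the first parameter and all arguments are passed by address. It assumes positive increments.

```c
/* A pair of doubles, matching the Fortran double-precision complex layout. */
typedef struct { double real; double imag; } complex16;

/* Stand-in for a Fortran-style complex function called as a subroutine:
   the return value comes back through the first parameter.
   Computes sum(conj(x[i]) * y[i]) for i = 0 .. n-1. */
void demo_zdotc(complex16 *result, const int *n,
                const complex16 *x, const int *incx,
                const complex16 *y, const int *incy)
{
    double re = 0.0, im = 0.0;
    int i;
    for (i = 0; i < *n; i++) {
        const complex16 *a = &x[i * *incx];
        const complex16 *b = &y[i * *incy];
        /* conj(a) * b = (ar*br + ai*bi) + i*(ar*bi - ai*br) */
        re += a->real * b->real + a->imag * b->imag;
        im += a->real * b->imag - a->imag * b->real;
    }
    result->real = re;
    result->imag = im;
}
```

A caller declares a complex16 variable for the result and passes its address first, followed by the addresses of the length, the vectors, and the increments, in the same argument order as the zdotc call shown in the C example elsewhere in this guide.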
105. face layer: mkl_intel_lp64_dll.lib; mkl_intel_ilp64_dll.lib. Threading layer: mkl_intel_thread_dll.lib; mkl_pgi_thread_dll.lib; mkl_sequential_dll.lib. Computational layer: mkl_core_dll.lib; mkl_scalapack_lp64_dll.lib; mkl_scalapack_ilp64_dll.lib; mkl_cdft_core_dll.lib. Run-time Libraries (RTL): mkl_blacs_lp64_dll.lib; mkl_blacs_ilp64_dll.lib. Contents: Single Dynamic Library to be used for linking; LP64 interface library for dynamic linking with the Intel compilers; ILP64 interface library for dynamic linking with the Intel compilers; Threading library for dynamic linking with the Intel compilers; Threading library for dynamic linking with the PGI compiler; Sequential library for dynamic linking; Core library for dynamic linking; ScaLAPACK routine library for dynamic linking supporting the LP64 interface; ScaLAPACK routine library for dynamic linking supporting the ILP64 interface; Cluster FFT library for dynamic linking; LP64 version of BLACS interface library for dynamic linking; ILP64 version of BLACS interface library for dynamic linking. Contents of the redist\intel64\mkl Directory. File: mkl_rt.dll. Threading layer: mkl_intel_thread.dll; mkl_pgi_thread.dll; mkl_sequential.dll. Computational layer: mkl_core.dll. Contents: Single Dynamic Library; Dynamic threading library for the Intel compilers; Dynamic threading library for the PGI compiler; Dynamic sequential libra
106. for C/C++; mkl.fi for Fortran. (2) [Optionally] Use the following preprocessor directives to check whether the macro is defined: #ifdef, #endif for C/C++; !DEC$IF DEFINED, !DEC$ENDIF for Fortran. (3) Use preprocessor directives for conditional inclusion of code: #if, #endif for C/C++; !DEC$IF, !DEC$ENDIF for Fortran. Example: Compile a part of the code if the Intel MKL version is MKL 10.3 update 4.
C/C++:
    #include "mkl.h"
    #ifdef INTEL_MKL_VERSION
    #if INTEL_MKL_VERSION == 100304
    // Code to be conditionally compiled
    #endif
    #endif
Fortran:
          include "mkl.fi"
    !DEC$IF DEFINED INTEL_MKL_VERSION
    !DEC$IF INTEL_MKL_VERSION .EQ. 100304
    *     Code to be conditionally compiled
    !DEC$ENDIF
    !DEC$ENDIF
Working with the Intel Math Kernel Library Cluster Software
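The value 100304 used for "10.3 update 4" suggests that the version macro packs the major version, minor version, and update number into two decimal digits each. The sketch below assumes that encoding, which is inferred from the example rather than stated in the text; the helper names mkl_version_code and mkl_version_at_least are hypothetical.

```c
/* Packs a version as major*10000 + minor*100 + update.
   Assumption: this mirrors how 10.3 update 4 becomes 100304. */
int mkl_version_code(int major, int minor, int update)
{
    return (major * 100 + minor) * 100 + update;
}

/* Returns 1 if the packed version `current` is the given release or newer.
   Works because the packed codes sort in release order as long as
   minor and update stay below 100. */
int mkl_version_at_least(int current, int major, int minor, int update)
{
    return current >= mkl_version_code(major, minor, update);
}
```

Under the same assumption, a preprocessor test such as "#if INTEL_MKL_VERSION >= 100304" would select that release and later ones, whereas the "==" test in the example selects exactly one release.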
107. greatly assist the amount of data you can test. See Also: Benchmarking a Cluster. Intel Math Kernel Library Language Interfaces Support. Language Interfaces Support, by Function Domain: The following table shows language interfaces that Intel Math Kernel Library (Intel MKL) provides for each function domain. However, Intel MKL routines can be called from other languages using mixed-language programming. See Mixed-language Programming with Intel MKL for an example of how to call Fortran routines from C/C++. Table columns: Function Domain; FORTRAN 77 interface; Fortran 90/95 interface; C/C++ interface.
    Basic Linear Algebra Subprograms (BLAS): Yes, Yes, via CBLAS
    BLAS-like extension transposition routines: Yes, Yes
    Sparse BLAS Level 1: Yes, Yes, via CBLAS
    Sparse BLAS Level 2 and 3: Yes, Yes, Yes
    LAPACK routines for solving systems of linear equations: Yes, Yes, Yes
    LAPACK routines for solving least squares problems, eigenvalue and singular value problems, and Sylvester's equations: Yes, Yes, Yes
    Auxiliary and utility LAPACK routines: Yes, Yes
    Parallel Basic Linear Algebra Subprograms (PBLAS): Yes
    ScaLAPACK routines: Yes †
    DSS/PARDISO solvers: Yes, Yes, Yes
    Other Direct and Iterative Sparse Solver routines: Yes, Yes, Yes
    Vector Mathematical Library (VML) functions: Yes, Yes, Yes
    Vector Statistical Library (VSL) functions: Yes, Yes, Yes
    Fourier Transform functions (FFT): Yes, Yes
    Cluster FFT functions: Yes, Yes
    Trigonometric Transform routines: Yes, Yes
    Fast Poisson
108. he Single Dynamic Library. Linking in Detail: This section recommends which libraries to link with depending on your Intel MKL usage scenario and provides details of the linking. Dynamically Selecting the Interface and Threading Layer: The Single Dynamic Library (SDL) enables you to dynamically select the interface and threading layer for Intel MKL. Setting the Interface Layer: Available interfaces depend on the architecture of your system. On systems based on the Intel 64 architecture, LP64 and ILP64 interfaces are available. To set one of these interfaces at run time, use the mkl_set_interface_layer function or the MKL_INTERFACE_LAYER environment variable. The following table provides values to be used to set each interface.
    Interface Layer | Value of MKL_INTERFACE_LAYER | Value of the Parameter of mkl_set_interface_layer
    LP64 | LP64 | MKL_INTERFACE_LP64
    ILP64 | ILP64 | MKL_INTERFACE_ILP64
If the mkl_set_interface_layer function is called, the environment variable MKL_INTERFACE_LAYER is ignored. By default, the LP64 interface is used. See the Intel MKL Reference Manual for details of the mkl_set_interface_layer function. On systems based on the IA-32 architecture, the cdecl and stdcall interfaces are available. These interfaces have different function naming conventions, and SDL selects b
109. ing examples illustrate linking that uses Intel(R) compilers. The examples use the .f Fortran source file. C/C++ users should instead specify a .cpp (C++) or .c (C) file and replace ifort with icl.
Static linking of myprog.f and parallel Intel MKL supporting the cdecl interface:
    ifort myprog.f mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
Dynamic linking of myprog.f and parallel Intel MKL supporting the cdecl interface:
    ifort myprog.f mkl_intel_c_dll.lib mkl_intel_thread_dll.lib mkl_core_dll.lib libiomp5md.lib
Static linking of myprog.f and sequential version of Intel MKL supporting the cdecl interface:
    ifort myprog.f mkl_intel_c.lib mkl_sequential.lib mkl_core.lib
Dynamic linking of myprog.f and sequential version of Intel MKL supporting the cdecl interface:
    ifort myprog.f mkl_intel_c_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib
Static linking of user code myprog.f and parallel Intel MKL supporting the stdcall interface:
    ifort myprog.f mkl_intel_s.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
Dynamic linking of user code myprog.f and parallel Intel MKL supporting the stdcall interface:
    ifort myprog.f mkl_intel_s_dll.lib mkl_intel_thread_dll.lib mkl_core_dll.lib libiomp5md.lib
Dynamic linking of user code myprog.f and parallel or sequential Intel MKL supporting the cdecl or stdcall interface (Call the mkl_set_threading_layer function or set value of the MKL_THR
110. interface to use. Possible values: for the IA-32 architecture, {cdecl|stdcall}, and the default value is cdecl; for the Intel 64 architecture, {lp64|ilp64}, and the default value is lp64.
    threading = {parallel|sequential}: Defines whether to use Intel MKL in the threaded or sequential mode. The default value is parallel.
    export = <file name>: Specifies the full name of the file that contains the list of entry-point functions to be included in the DLL. The default name is user_example_list (no extension).
    name = <dll name>: Specifies the name of the dll and interface library to be created. By default, the names of the created libraries are mkl_custom.dll and mkl_custom.lib.
    xerbla = <error handler>: Specifies the name of the object file <user_xerbla>.obj that contains the user's error handler. The makefile adds this error handler to the library for use instead of the default Intel MKL error handler xerbla. If you omit this parameter, the native Intel MKL xerbla is used. See the description of the xerbla function in the Intel MKL Reference Manual on how to develop your own error handler. For the IA-32 architecture, the object file should be in the interface defined by the interface macro (cdecl or stdcall).
    MKLROOT: Specifies the location of Intel MKL libraries used to build the custom DLL. By default, <mkl director
111. ions; Fourier Transform functions (FFT); Cluster FFT; Trigonometric Transform routines; Poisson, Laplace, and Helmholtz Solver routines; Optimization (Trust-Region) Solver routines; Data Fitting Functions; GMP arithmetic functions (deprecated and will be removed in a future release). Reason: The function domain you intend to use narrows the search in the Reference Manual for specific routines you need. Additionally, if you are using the Intel MKL cluster software, your link line is function-domain specific (see Working with the Cluster Software). Coding tips may also depend on the function domain (see Tips and Techniques to Improve Performance). Intel MKL provides support for both Fortran and C/C++ programming. Identify the language interfaces that your function domains support (see Intel Math Kernel Library Language Interfaces Support). Reason: Intel MKL provides language-specific include files for each function domain to simplify program development (see Language Interfaces Support, by Function Domain). For a list of language-specific interface libraries and modules and an example of how to generate them, see also Using Language-Specific Interfaces with Intel Math Kernel Library. If your system is based on the Intel 64 architecture, identify whether your application performs calculations with large data arrays (of more than 2^31 - 1 elements). Reason: To operate on large data arrays, you need to select the ILP64 interface, where in
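The 2^31 - 1 threshold is the largest value of a 32-bit signed integer. As a small sketch of how an application might test its own array sizes against that limit, the helpers below are written in plain C; the names needs_ilp64 and matrix_needs_ilp64 are hypothetical and not part of Intel MKL.

```c
#include <stdint.h>

/* Returns 1 if an array with n_elements entries exceeds 2^31 - 1 elements,
   the limit of a 32-bit signed index. */
int needs_ilp64(uint64_t n_elements)
{
    return n_elements > (uint64_t)INT32_MAX;
}

/* Same test for a rows x cols matrix, written with a division so that
   the product rows * cols is never formed and cannot overflow. */
int matrix_needs_ilp64(uint64_t rows, uint64_t cols)
{
    if (rows == 0 || cols == 0)
        return 0;
    return cols > (uint64_t)INT32_MAX / rows;
}
```

For example, a 46340 x 46340 matrix stays below the limit, while a 50000 x 50000 matrix holds 2.5 billion elements and exceeds it.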
112. ions in the sequential or multi-threaded mode: in the libia32 solution, select the cdecl_sequential or cdecl_parallel project; in the libintel64 solution, select the lp64_sequential or lp64_parallel project. (3) [Optional] To build the DLL that uses the stdcall interface for the IA-32 architecture or the ILP64 interface for the Intel 64 architecture: (a) Select Project > Properties > Configuration Properties > Linker > Input > Additional Dependencies. (b) In the libia32 solution, change mkl_intel_c.lib to mkl_intel_s.lib. In the libintel64 solution, change mkl_intel_lp64.lib to mkl_intel_ilp64.lib. (4) [Optional] To include your own error handler in the DLL: (a) Select Project > Properties > Configuration Properties > Linker > Input. (b) Add <user_xerbla>.obj. (5) [Optional] To turn off creation of the manifest: (a) Select Project > Properties > Configuration Properties > Linker > Manifest File > Generate Manifest. (b) Select: no. (6) [Optional] To change the list of functions to be included in the DLL: (a) Select Source Files. (b) Edit the examples.def file. Refer to Specifying Function Names for how to specify entry points. (7) To build the library: in VS2005 - VS2008, select Build > Project Only > Link Only and link projects in this order: i_malloc_dll, vml_dll_core, cdecl_sequential/lp64_sequential or cdecl_parallel/lp64_parallel; in VS2010, select Build > Build Solution. See Also: Using the Custom Dynamic
113. ix decomposition fractions: 0.005 0.010 0.015 0.020 0.025 0.030 0.035 0.040 0.045 0.050 0.055 0.060 0.065 0.070 0.075 0.080 0.085 0.090 0.095 0.100 0.105 0.110 0.115 0.120 0.125 0.130 0.135 0.140 0.145 0.150 0.155 0.160 0.165 0.170 0.175 0.180 0.185 0.190 0.195 0.200 0.205 0.210 0.215 0.220 0.225 0.230 0.235 0.240 0.245 0.250 0.255 0.260 0.265 0.270 0.275 0.280 0.285 0.290 0.295 0.300 0.305 0.310 0.315 0.320 0.325 0.330 0.335 0.340 0.345 0.350 0.355 0.360 0.365 0.370 0.375 0.380 0.385 0.390 0.395 0.400 0.405 0.410 0.415 0.420 0.425 0.430 0.435 0.440 0.445 0.450 0.455 0.460 0.465 0.470 0.475 0.480 0.485 0.490 0.495 0.515 0.535 0.555 0.575 0.595 0.615 0.635 0.655 0.675 0.695 0.795 0.895. However, this problem size is so small and the block size so big by comparison that as soon as it prints the value for 0.045, it was already through 0.08 fraction of the columns. On a really big problem, the fractional number will be more accurate. It never prints more than the 112 numbers above. So, smaller problems will have fewer than 112 updates, and the biggest problems will have precisely 112 updates. Mflops is an estimate based on 1280 columns of LU being completed. However, with lookahead steps, sometimes that work is not actually completed when the output is made. Nevertheless, this is a good estimate for comparing identical runs. The three numbers in parentheses are intrusive ASYOUGO2 add-ins. DT is the total time processor 0 has spent in DGEMM. DF is the number of
114. le, omp_set_num_threads_( &i_one );
    // ******* C language *******
    #include "omp.h"
    #include "mkl.h"
    #include <stdio.h>
    #include <stdlib.h>
    #define SIZE 1000
    int main(int args, char *argv[]){
        double *a, *b, *c;
        a = (double*)malloc(sizeof(double)*SIZE*SIZE);
        b = (double*)malloc(sizeof(double)*SIZE*SIZE);
        c = (double*)malloc(sizeof(double)*SIZE*SIZE);
        double alpha=1, beta=1;
        int m=SIZE, n=SIZE, k=SIZE, lda=SIZE, ldb=SIZE, ldc=SIZE, i=0, j=0;
        char transa='n', transb='n';
        for( i=0; i<SIZE; i++){
            for( j=0; j<SIZE; j++){
                a[i*SIZE+j]= (double)(i+j);
                b[i*SIZE+j]= (double)(i*j);
                c[i*SIZE+j]= (double)0;
            }
        }
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
        printf("row\ta\tc\n");
        for ( i=0;i<10;i++){
            printf("%d:\t%f\t%f\n", i, a[i*SIZE], c[i*SIZE]);
        }
        omp_set_num_threads(1);
        for( i=0; i<SIZE; i++){
            for( j=0; j<SIZE; j++){
                a[i*SIZE+j]= (double)(i+j);
                b[i*SIZE+j]= (double)(i*j);
                c[i*SIZE+j]= (double)0;
            }
        }
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
        printf("row\ta\tc\n");
        for ( i=0;i<10;i++){
            printf("%d:\t%f\t%f\n", i, a[i*SIZE], c[i*SIZE]);
        }
        omp_set_num_threads(2);
        for( i=0; i<SIZE; i++){
            for( j=0; j<SIZE; j++){
                a[i*SIZE+j]= (double)(i+j);
                b[i*SIZE+j]= (double)(i*j);
                c[i*SIZE+j]= (double)0;
            }
        }
        cblas_dgemm(CblasRowMajor,
115. les; Fortran 95 Interfaces to LAPACK and BLAS; Setting the Number of Threads Using an OpenMP Environment Variable. Compiler Support: Intel MKL supports compilers identified in the Release Notes. However, the library has been successfully used with other compilers as well. Although Compaq no longer supports the Compaq Visual Fortran (CVF) compiler, Intel MKL still preserves the CVF interface in the IA-32 architecture implementation. You can use this interface with the Intel Fortran Compiler. Intel MKL provides both stdcall (default CVF interface) and cdecl (default interface of the Microsoft Visual C application) interfaces for the IA-32 architecture. Intel MKL provides a set of include files to simplify program development by specifying enumerated values and prototypes for the respective functions. Calling Intel MKL functions from your application without an appropriate include file may lead to incorrect behavior of the functions. See Also: Compiling an Application that Calls the Intel Math Kernel Library and Uses the CVF Calling Conventions; Using the cdecl and stdcall Interfaces; Include Files. Using Code Examples: The Intel MKL package includes code examples, located in the examples subdirectory of the installation directory. Use the examples to determine whether Intel MKL is working on your system, how you should call the library, and how to link the library. The examples are grouped in subdirec
116. lib nodeperf.c, where <MPI library> is msmpi.lib in the case of Microsoft MPI and mpi.lib in the case of MPICH. Launching nodeperf.c on all the nodes is especially helpful in a very large cluster. nodeperf enables quick identification of the potential problem spot without numerous small MP LINPACK runs around the cluster in search of the bad node. It goes through all the nodes, one at a time, and reports the performance of DGEMM followed by some host identifier. Therefore, the higher the DGEMM performance, the faster that node was performing. (3) Edit HPL.dat to fit your cluster needs. Read through the HPL documentation for ideas on this. Note, however, that you should use at least 4 nodes. (4) Make an HPL run, using compile options such as ASYOUGO, ASYOUGO2, or ENDEARLY to aid in your search. These options enable you to gain insight into the performance sooner than HPL would normally give this insight. When doing so, follow these recommendations: Use MP LINPACK, which is a patched version of HPL, to save time in the search. All performance-intrusive features are compile-optional in MP LINPACK. That is, if you do not use the new options to reduce search time, these features are disabled. The primary purpose of the additions is to assist you in finding solutions. HPL requires a long time to search for many different parameters. In MP LINPACK, the goal is to get the best possible number. Given that the input is not fi
117. link Library Builder in the Command-line Mode. Distributing Your Custom Dynamic-link Library: To enable use of your custom DLL in a threaded mode, distribute libiomp5md.dll along with the custom DLL. Managing Performance and Memory. Using Parallelism of the Intel Math Kernel Library: Intel MKL is extensively parallelized. See Threaded Functions and Problems for lists of threaded functions and problems that can be threaded. Intel MKL is thread-safe, which means that all Intel MKL functions (except the LAPACK deprecated routine ?lacon) work correctly during simultaneous execution by multiple threads. In particular, any chunk of threaded Intel MKL code p
118. lled in a parallel region, it will use only one thread by default. If you want the library to use nested parallelism, and the thread within a parallel region is compiled with the same OpenMP compiler as Intel MKL is using, you may experiment with setting MKL_DYNAMIC to FALSE and manually increasing the number of threads. In general, set MKL_DYNAMIC to FALSE only under circumstances that Intel MKL is unable to detect, for example, to use nested parallelism where the library is already called from a parallel section. MKL_DOMAIN_NUM_THREADS: The MKL_DOMAIN_NUM_THREADS environment variable suggests the number of threads for a particular function domain. MKL_DOMAIN_NUM_THREADS accepts a string value <MKL-env-string>, which must have the following format:
    <MKL-env-string> ::= <MKL-domain-env-string> { <delimiter> <MKL-domain-env-string> }
    <delimiter> ::= [ <space-symbol>* ] ( <space-symbol> | <comma-symbol> | <semicolon-symbol> | <colon-symbol> ) [ <space-symbol>* ]
    <MKL-domain-env-string> ::= <MKL-domain-env-name> <uses> <number-of-threads>
    <MKL-domain-env-name> ::= MKL_DOMAIN_ALL | MKL_DOMAIN_BLAS | MKL_DOMAIN_FFT | MKL_DOMAIN_VML | MKL_DOMAIN_PARDISO
    <uses> ::= [ <space-symbol>* ] ( <space-symbol> | <equality-sign> | <comma-symbol> ) [ <space-symbol>* ]
    <number-of-threads> ::= <posi
119. ly. Otherwise, a pop-up list appears with the names specified in the header file. (3) Select the name from the list, if needed. [Figure: Microsoft Visual Studio code editor with mkl_dfti.h open, showing the Complete Word pop-up list of DFTI names such as DftiCommitDescriptor, DftiComputeBackward, DftiComputeForward, DftiCopyDescriptor, DftiCreateDescriptor, and DftiErrorClass.] LINPACK and MP LINPACK Benchmarks
120. mples, use the makefile found in the Intel MKL Java examples directory:
    nmake {dllia32|dllintel64|libia32|libintel64} [function=...] [compiler=...]
If you type the make command and omit the target (for example, dllia32), the makefile prints the help info, which explains the targets and parameters. For the examples list, see the examples.lst file in the Java examples directory.
Known Limitations of the Java Examples
This section explains limitations of Java examples.
Functionality. Some Intel MKL functions may fail to work if called from the Java environment by using a wrapper, like those provided with the Intel MKL Java examples. Only those specific CBLAS, FFT, VML, VSL RNG, and the convolution/correlation functions listed in the Intel MKL Java Examples section were tested with the Java environment. So, you may use the Java wrappers for these CBLAS, FFT, VML, VSL RNG, and convolution/correlation functions in your Java applications.
Performance. The Intel MKL functions are expected to run faster than similar functions written in pure Java. However, the main goal of these wrappers is to provide code examples, not maximum performance. So, an Intel MKL function called from a Java application will probably run slower than the same function called from a program written in C/C++ or Fortran.
Known bugs. There are a number of known bugs in Intel MKL (identified in the Release Notes), as well as incompatibilities between different versions of JDK. The exa
121. mples and wrappers include workarounds for these problems. Look at the source code in the examples and wrappers for comments that describe the workarounds.
Coding Tips
This section discusses programming with the Intel Math Kernel Library (Intel MKL) to provide coding tips that meet certain specific needs, such as consistent results of computations or conditional compilation.
Aligning Data for Consistent Results
Routines in Intel MKL may return different results from run to run on the same system. This is usually due to a change in the order in which floating-point operations are performed. The two most influential factors are array alignment and parallelism. Array alignment can determine how internal loops order floating-point operations. Non-deterministic parallelism may change the order in which computational tasks are executed. While these results may differ, they should still fall within acceptable computational error bounds. To better assure identical results from run to run, do the following:
• Align input arrays on 16-byte boundaries.
• Run Intel MKL in the sequential mode.
To align input arrays on 16-byte boundaries, use mkl_malloc() in place of system-provided memory allocators, as shown in the code example below. Sequential mode of Intel MKL removes the influence of non-deterministic parallelism.
Aligning Addresses on 16-byte Boundaries
    // ******* C language *******
    #include <stdlib.h>
    void *darray;
    int works
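As an illustration of what "aligned on a 16-byte boundary" means in the excerpt above, the following sketch uses only standard C so that it runs without Intel MKL. The functions `alloc_aligned16`, `free_aligned16`, and `is_aligned16` are hypothetical stand-ins written for this example; in an application that links with Intel MKL you would call mkl_malloc and mkl_free as the guide describes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if the address is a multiple of 16. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/*
 * Hypothetical stand-in for mkl_malloc(size, 16): over-allocates with
 * malloc, rounds the address up to the next multiple of 16, and stores
 * the raw pointer just below the aligned block so it can be freed later.
 */
void *alloc_aligned16(size_t size)
{
    void *raw = malloc(size + 15 + sizeof(void *));
    if (raw == NULL)
        return NULL;

    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + 15) & ~(uintptr_t)15;

    ((void **)aligned)[-1] = raw;   /* remember what malloc returned */
    return (void *)aligned;
}

/* Hypothetical stand-in for mkl_free: releases a block from alloc_aligned16. */
void free_aligned16(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

A caller would allocate the workspace with `alloc_aligned16(sizeof(double) * workspace)`, confirm the result with `is_aligned16`, and release it with `free_aligned16`, mirroring the allocate, use, free pattern of the guide's example.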
122. n addition to supplying certain hybrid prebuilt binaries, Intel MKL supplies some hybrid prebuilt libraries for Intel MPI to take advantage of the additional OpenMP optimizations. If you wish to use an MPI version other than Intel MPI, you can do so by using the MP LINPACK source provided. You can use the source to build a non-hybrid version that may be used in a hybrid mode, but it would be missing some of the optimizations added to the hybrid version. Non-hybrid builds are the default of the source code makefiles provided. In some cases, the use of the hybrid mode is required for external reasons. If there is a choice, the non-hybrid code may be faster. To use the non-hybrid code in a hybrid mode, use the threaded version of Intel MKL BLAS, link with a thread-safe MPI, and call the function MPI_Init_thread() so as to indicate a need for MPI to be thread-safe.
Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimization
123. nMP code. The sequential mode requires no compatibility OpenMP run-time library and does not respond to the environment variable OMP_NUM_THREADS or its Intel MKL equivalents. You should use the library in the sequential mode only if you have a particular reason not to use Intel MKL threading. The sequential mode may be helpful when using Intel MKL with programs threaded with some non-Intel compilers or in other situations where you need a non-threaded version of the library (for instance, in some MPI cases). To set the sequential mode, in the Threading layer, choose the sequential library.
See Also: Directory Structure in Detail; Using Parallelism of the Intel Math Kernel Library; Avoiding Conflicts in the Execution Environment; Linking Examples
Selecting the Threading Layer
Several compilers that Intel MKL supports use the OpenMP threading technology. Intel MKL supports implementations of the OpenMP technology that these compilers provide. To make use of this support, you need to link with the appropriate library in the Threading Layer and Compiler Support Run-time Library (RTL).
Threading Layer: Each Intel MKL threading library contains the same code compiled by the respective compiler (Intel and PGI compilers on Windows OS).
RTL: This layer includes libiomp, the compatibility OpenMP run-time library of the Intel compiler. In addition to the Intel compiler, libiomp provides support for one more threading compiler on Windows OS
124. nks per node equals the number of real processors or cores of a node. If the Intel Hyper-Threading Technology is enabled on the node, use only half the number of processors that are visible on Windows OS.
See Also: Setting Environment Variables on a Cluster
Using DLLs
All the needed DLLs must be visible on all the nodes at run time, and you should install Intel Math Kernel Library (Intel MKL) on each node of the cluster. You can use Remote Installation Services (RIS) provided by Microsoft to remotely install the library on each of the nodes that are part of your cluster. The best way to make the DLLs visible is to point to these libraries in the PATH environment variable. See Setting Environment Variables on a Cluster on how to set the value of the PATH environment variable.
The ScaLAPACK DLLs for the IA-32 and Intel 64 architectures (in the <Composer XE directory>\redist\ia32\mkl and <Composer XE directory>\redist\intel64\mkl directories, respectively) use the MPI dispatching mechanism. MPI dispatching is based on the MKL_BLACS_MPI environment variable. The BLACS DLL uses MKL_BLACS_MPI for choosing the needed MPI libraries. The table below lists possible values of the variable.
Value: Comment
MPICH2: Default value. MPICH2 1.0.x for Windows OS is used for message passing.
INTELMPI: Intel MPI is used for message passing.
MSMPI: Microsoft
125. nt numbers. You can mitigate this performance issue by setting the appropriate bit fields in the MXCSR floating-point control register to flush denormals to zero (FTZ) or to replace any denormals loaded from memory with zero (DAZ). Check your compiler documentation to determine whether it has options to control FTZ and DAZ. Note that these compiler options may slightly affect accuracy.
FFT Optimized Radices
You can improve the performance of Intel MKL FFT if the length of your data vector permits factorization into powers of optimized radices. In Intel MKL, the optimized radices are 2, 3, 5, 7, 11, and 13.
Using Memory Management
Intel MKL Memory Management Software
Intel MKL has memory management software that controls memory buffers for the use by the library functions. New buffers that the library allocates when your application calls Intel MKL are not deallocated until the program ends. To get the amount of memory allocated by the memory management software, call the mkl_mem_stat() function. If your program needs to free memory, call mkl_free_buffers(). If another call is made to a library function that needs a memory buffer, the memory manager again allocates the buffers, and they again remain allocated until either the program ends or the program deallocates the memory. This behavior facilitates better performance. However, some tools may report this behavior as a memory leak. The memory management software is turned on by default. To t
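The FFT Optimized Radices note in the excerpt above can be checked with a few lines of standard C. The sketch below tests whether a length factors entirely into the radices 2, 3, 5, 7, 11, and 13, and finds the next such length. Both function names are hypothetical, and whether choosing a longer transform length is acceptable depends on your application; the guide only states that such lengths can perform better.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if n is a product of powers of 2, 3, 5, 7, 11 and 13 only. */
int has_only_optimized_radices(unsigned long n)
{
    static const unsigned long radices[] = {2, 3, 5, 7, 11, 13};

    if (n == 0)
        return 0;
    for (size_t i = 0; i < sizeof radices / sizeof radices[0]; ++i)
        while (n % radices[i] == 0)
            n /= radices[i];        /* strip every factor of this radix */
    return n == 1;                  /* nothing else may remain */
}

/* Hypothetical helper: smallest length >= n that has only optimized radices. */
unsigned long next_optimized_length(unsigned long n)
{
    assert(n > 0);
    while (!has_only_optimized_radices(n))
        ++n;
    return n;
}
```

For instance, a length of 1021 is prime and therefore contains a radix outside the optimized set, while 1024 is a pure power of 2.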
126. ntel MKL installed on your system, update your build scripts to point to the correct Intel MKL version.
3. Check that the following files appear in the <mkl directory>\bin directory and its subdirectories: mklvars.bat, ia32\mklvars_ia32.bat, intel64\mklvars_intel64.bat. Use these files to assign Intel MKL-specific values to several environment variables, as explained in Setting Environment Variables.
4. To understand how the Intel MKL directories are structured, see Intel Math Kernel Library Structure.
5. To make sure that Intel MKL runs on your system, do one of the following:
• Launch an Intel MKL example, as explained in Using Code Examples.
• In the Visual Studio IDE, create and run a simple project that uses Intel MKL, as explained in Running an Intel MKL Example in the Visual Studio IDE.
See Also: Notational Conventions
Setting Environment Variables
When the installation of Intel MKL for Windows OS is complete, set the PATH, LIB, and INCLUDE environment variables in the command shell using one of the script files in the bin subdirectory of the Intel MKL installation directory:
• ia32\mklvars_ia32.bat for the IA-32 architecture
• intel64\mklvars_intel64.bat for the Intel 64 architecture
• mklvars.bat for the IA-32 and Intel 64 architectures
Running the Scripts
The scripts accept parameters to specify the following: • Architecture • Addition of
127. number of threads up to the maximum number you specify. For example, MKL_DYNAMIC set to TRUE enables optimal choice of the number of threads in the following cases:
• If the requested number of threads exceeds the number of physical cores (perhaps because of using the Intel Hyper-Threading Technology), and MKL_DYNAMIC is not changed from its default value of TRUE, Intel MKL will scale down the number of threads to the number of physical cores.
• If you are able to detect the presence of MPI, but cannot determine if it has been called in a thread-safe mode (it is impossible to detect this with MPICH 1.2.x, for instance), and MKL_DYNAMIC has not been changed from its default value of TRUE, Intel MKL will run one thread.
When MKL_DYNAMIC is FALSE, Intel MKL tries not to deviate from the number of threads the user requested. However, setting MKL_DYNAMIC=FALSE does not ensure that Intel MKL will use the number of threads that you request. The library may have no choice on this number for such reasons as system resources. Additionally, the library may examine the problem and use a different number of threads than the value suggested. For example, if you attempt to do a size-one matrix-matrix multiply across eight threads, the library may instead choose to use only one thread because it is impractical to use eight threads in this event. Note also that if Intel MKL is ca
128. oject name> and select Add > Existing Item from the drop-down menu. The Add Existing Item - <project name> window opens.
b. Browse to the Intel MKL example directory, for example, <mkl directory>\examples\cblas\source. Select the example file and supporting files with extension ".c" (C sources), for example, select files cblas_caxpyix.c and common_func.c. For the list of supporting files in each example directory, see Support Files for Intel MKL Examples. Click Add. The Add Existing Item - <project name> window closes, and selected files appear in the Source Files folder in Solution Explorer.
The next steps adjust the properties of the project.
4. Select <project name>.
5. On the main menu, select Project > Properties to open the <project name> Property Pages window.
6. Set Intel MKL Include dependencies:
a. Select Configuration Properties > C/C++ > General. In the right-hand part of the window, select Additional Include Directories > the browse button. The Additional Include Directories window opens.
b. Click the New Line button (the first button in the uppermost row). When the new line appears in the window, click the browse button. The Select Directory window opens.
c. Browse to the <mkl directory>\include directory and click OK. The Select Directory window closes, and the full path to the Intel MKL include directory appears in the Additional Include Directories window.
d. Click
129. onal Library Directories window opens.
b. Type the directory with the Intel MKL libraries in quotes, that is, "<mkl directory>\lib\<architecture>", where <architecture> is one of {ia32, intel64}, for example: "<mkl directory>\lib\ia32". (For most laptop and desktop computers, <architecture> is ia32.) Click OK to close the window.
c. Select Configuration Properties > Linker > Input. In the right-hand part of the window, select Additional Dependencies and type the libraries required, for example, if <architecture>=ia32, type mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
8. In the <project name> Property Pages window, click OK to close the window.
9. Some examples do not pause before the end of execution. To see the results printed in the Console window, set a breakpoint at the very end of the program or add the pause statement before the last end statement.
10. To build the solution, select Build > Build Solution.
11. To run the example, select Debug > Start Debugging. The Console window opens.
12. You can see the results of the example in the Console window. If you used the pause statement to pause execution of the program, press Enter to complete the run. If you used a breakpoint to pause execution of the program, select Debug > Continue. The Console window closes.
Support Files for Intel Math Kernel Library Examples
Below is the list of support files that have to
130. ore details of each layer.
Layer: Description
Interface Layer: This layer matches compiled code of your application with the threading and/or computational parts of the library. This layer provides:
• cdecl and CVF default interfaces.
• LP64 and ILP64 interfaces.
• Compatibility with compilers that return function values differently.
• A mapping between single-precision names and double-precision names for applications using Cray-style naming (SP2DP interface). SP2DP interface supports Cray-style naming in applications targeted for the Intel 64 architecture and using the ILP64 interface. SP2DP interface provides a mapping between single-precision names (for both real and complex types) in the application and double-precision names in Intel MKL BLAS and LAPACK. Function names are mapped as shown in the following example for BLAS functions ?GEMM: SGEMM -> DGEMM, DGEMM -> DGEMM, CGEMM -> ZGEMM, ZGEMM -> ZGEMM. Mind that no changes are made to double-precision names.
Threading Layer: This layer:
• Provides a way to link threaded Intel MKL with different threading compilers.
• Enables you to link with a threaded or sequential mode of the library.
This layer is compiled for different environments (threaded or sequential) and compilers (from Intel, Microsoft, and so on).
Computational Layer: This layer is the heart of Intel MKL. It has only one library for each combination of architecture and supported OS. Th
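The SP2DP name mapping shown for the GEMM functions in the excerpt above can be written down as a small, self-contained C function. This sketch covers only the first-letter pattern visible in that example (S to D, C to Z, with D and Z unchanged); the function name `sp2dp_name` is hypothetical, and routines whose names do not start with the precision letter are outside what the excerpt shows and outside this sketch.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: writes the double-precision counterpart of a
 * BLAS-style name into out. Returns 0 on success, -1 if the name does
 * not start with S, D, C or Z or does not fit into out.
 */
int sp2dp_name(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL);

    size_t len = strlen(name);
    if (len == 0 || len + 1 > out_size)
        return -1;

    char first;
    switch (name[0]) {
    case 'S': case 'D': first = 'D'; break;   /* real types    */
    case 'C': case 'Z': first = 'Z'; break;   /* complex types */
    default:  return -1;
    }

    memcpy(out, name, len + 1);   /* copy including the terminator */
    out[0] = first;               /* double-precision names stay as they are */
    return 0;
}
```

Calling the helper with "SGEMM" produces "DGEMM" and with "CGEMM" produces "ZGEMM", matching the mapping listed in the guide.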
131. ore_dll.lib Functions
† Also add the library with BLACS routines corresponding to the MPI used.
See Also: Linking with ScaLAPACK and Cluster FFTs; Using the Link-line Advisor; Using the ILP64 Interface vs. LP64 Interface
Linking with Compiler Run-time Libraries
Dynamically link libiomp, the compatibility OpenMP run-time library, even if you link other libraries statically. Linking to libiomp statically can be problematic because the more complex your operating environment or application, the more likely redundant copies of the library are included. This may result in performance issues (oversubscription of threads) and even incorrect results. To link libiomp dynamically, be sure the PATH environment variable is defined correctly.
See Also: Setting Environment Variables; Layered Model Concept
Linking with System Libraries
If your system is based on the Intel 64 architecture, be aware that Microsoft SDK builds 1289 or higher provide the bufferoverflowu.lib library to resolve the security cookie external references. Makefiles for examples and tests include this library by using the buf_lib=bufferoverflowu.lib macro. If you are using older SDKs, leave this macro empty on your command line as follows: buf_lib=
Building Custom Dynamic-link Libraries
Custom dynamic-link libraries (DLL) reduce the collection of functions available in Intel MKL libraries to
132. ough it doesn't hurt. To avoid the residual check for a problem that terminates early, set the "threshold" parameter in HPL.dat to a negative number when testing ENDEARLY. It also sometimes gives a better picture to compile with -DASYOUGO2 when using -DENDEARLY. Usage notes on -DENDEARLY follow:
• -DENDEARLY stops the problem after a few iterations of DGEMM on the block size (the bigger the blocksize, the further it gets). It prints only 5 or 6 "updates", whereas -DASYOUGO prints about 46 or so output elements before the problem completes.
• Performance for -DASYOUGO and -DENDEARLY always starts off at one speed, slowly increases, and then slows down toward the end (because that is what LU does). -DENDEARLY is likely to terminate before it starts to slow down.
• -DENDEARLY terminates the problem early with an HPL Error exit. It means that you need to ignore the missing residual results, which are wrong because the problem never completed. However, you can get an idea what the initial performance was, and if it looks good, then run the problem to completion without -DENDEARLY. To avoid the error check, you can set HPL's threshold parameter in HPL.dat to a negative number.
• Though -DENDEARLY terminates early, HPL treats the problem as completed and computes Gflop rating as though the problem ran to completion. Ignore this erroneously high rating.
• The bigger the problem
133. pace;
    // Allocate workspace aligned on 16-byte boundary
    darray = mkl_malloc( sizeof(double)*workspace, 16 );
    // call the program using MKL
    mkl_app( darray );
    // Free workspace
    mkl_free( darray );

    ! ******* Fortran language *******
    double precision darray
    pointer (p_wrk,darray(1))
    integer workspace
    ! Allocate workspace aligned on 16-byte boundary
    p_wrk = mkl_malloc( 8*workspace, 16 )
    ! call the program using MKL
    call mkl_app( darray )
    ! Free workspace
    call mkl_free(p_wrk)
Using Predefined Preprocessor Symbols for Intel MKL Version-Dependent Compilation
Preprocessor symbols (macros) substitute values in a program before it is compiled. The substitution is performed in the preprocessing phase. The following preprocessor symbols are available:
Predefined Preprocessor Symbol: Description
__INTEL_MKL__: Intel MKL major version
__INTEL_MKL_MINOR__: Intel MKL minor version
__INTEL_MKL_UPDATE__: Intel MKL update number
INTEL_MKL_VERSION: Intel MKL full version in the following format: INTEL_MKL_VERSION = (__INTEL_MKL__*100+__INTEL_MKL_MINOR__)*100+__INTEL_MKL_UPDATE__
These symbols enable conditional compilation of code that uses new features introduced in a particular version of the library. To perform conditional compilation:
1. Include in your code the file where the macros are defined: • mkl.h
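The version formula and the conditional-compilation idea in the excerpt above can be sketched as follows. To stay runnable without Intel MKL, the block does not include mkl.h; `full_version` and `built_against_10_3_or_newer` are hypothetical names, the threshold 100300 (version 10.3, update 0) is chosen only for illustration, and the assumption that the minor and update fields stay below 100 is mine, made so the fields cannot overlap.

```c
#include <assert.h>

/*
 * Combines major, minor and update numbers the way the guide's formula
 * for INTEL_MKL_VERSION does: (major*100 + minor)*100 + update.
 */
int full_version(int major, int minor, int update)
{
    /* Assumption of this sketch: each of these fields fits in two digits. */
    assert(minor >= 0 && minor < 100 && update >= 0 && update < 100);
    return (major * 100 + minor) * 100 + update;
}

/*
 * Conditional compilation on the full version. INTEL_MKL_VERSION is only
 * defined when an Intel MKL header such as mkl.h has been included, so
 * in a build without Intel MKL this function returns 0.
 */
int built_against_10_3_or_newer(void)
{
#if defined(INTEL_MKL_VERSION) && INTEL_MKL_VERSION >= 100300
    return 1;   /* code that relies on newer features would go here */
#else
    return 0;   /* fallback path for older versions or no Intel MKL */
#endif
}
```

With major version 10, minor version 3, and update 6, the formula gives 100306, which is the value a comparison such as `INTEL_MKL_VERSION >= 100300` would be tested against.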
134. ple, C:\Program Files (x86)\Intel\MPI\3.2.0.005\ia32\lib for a default installation of Intel MPI 3.2.
<MPI linker>: mpicl or mpiifort
See Also: Linking Your Application with the Intel Math Kernel Library; Examples for Linking with ScaLAPACK and Cluster FFT
Determining the Number of Threads
The OpenMP software responds to the environment variable OMP_NUM_THREADS. Intel MKL also has other mechanisms to set the number of threads, such as the MKL_NUM_THREADS or MKL_DOMAIN_NUM_THREADS environment variables (see Using Additional Threading Control). Make sure that the relevant environment variables have the same and correct values on all the nodes. Intel MKL versions 10.0 and higher no longer set the default number of threads to one, but depend on the OpenMP libraries used with the compiler to set the default number. For the threading layer based on the Intel compiler (mkl_intel_thread.lib), this value is the number of CPUs according to the OS.
CAUTION: Avoid over-prescribing the number of threads, which may occur, for instance, when the number of MPI ranks per node and the number of threads per node are both greater than one. The product of MPI ranks per node and the number of threads per node should not exceed the number of physical cores per node.
The OMP_NUM_THREADS environment variable is assumed in the discussion below. Set OMP_NUM_THREADS so that the product of its value and the number of MPI ra
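The CAUTION above is simple arithmetic, and a short sketch may make it easier to apply. The two functions below are hypothetical helpers, not part of Intel MKL or MPI: the first halves the visible processor count when Intel Hyper-Threading Technology is enabled, following the guide's advice, and the second picks a per-rank thread count so that ranks times threads does not exceed the physical cores.

```c
#include <assert.h>

/*
 * Estimates physical cores from the processors visible to the OS.
 * With Hyper-Threading enabled, only half of them are counted.
 */
int physical_cores(int visible_processors, int hyperthreading_enabled)
{
    assert(visible_processors > 0);
    int cores = hyperthreading_enabled ? visible_processors / 2
                                       : visible_processors;
    return cores > 0 ? cores : 1;
}

/*
 * Largest thread count per MPI rank such that
 * ranks_per_node * threads <= cores, but never less than one thread.
 * If there are more ranks than cores, the node is oversubscribed by the
 * ranks alone and one thread per rank is the best this sketch can do.
 */
int threads_per_rank(int cores, int ranks_per_node)
{
    assert(cores > 0 && ranks_per_node > 0);
    int threads = cores / ranks_per_node;
    return threads > 0 ? threads : 1;
}
```

On a node that shows 16 processors with Hyper-Threading enabled and runs 2 MPI ranks, this gives 8 physical cores and 4 threads per rank, which is the value you would then assign to OMP_NUM_THREADS.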
135. ples. The Java examples support all the C and C++ compilers that Intel MKL does. The makefile intended to run the examples also needs the nmake utility, which is typically provided with the C/C++ compiler package. To run Java examples, the JDK developer toolkit is required for compiling and running Java code. A Java implementation must be installed on the computer or available via the network. You may download the JDK from the vendor website. The examples should work for all versions of JDK. However, they were tested only with the following Java implementation(s) for all the supported architectures:
• J2SE SDK 1.4.2, JDK 5.0 and 6.0 from Sun Microsystems, Inc. (http://sun.com).
• JRockit JDK 1.4.2 and 5.0 from Oracle Corporation (http://oracle.com).
Note that the Java run-time environment (JRE) system, which may be pre-installed on your computer, is not enough. You need the JDK developer toolkit that supports the following set of tools: • java • javac • javah • javadoc
To make these tools available for the examples makefile, set the JAVA_HOME environment variable and add the JDK binaries directory to the system PATH, for example:
    SET JAVA_HOME=C:\Program Files\Java\jdk1.5.0_09
    SET PATH=%JAVA_HOME%\bin;%PATH%
You may also need to clear the JDK_HOME environment variable, if it is assigned a value:
    SET JDK_HOME=
To start the exa
136. rary
Selecting Libraries to Link with
To link with Intel MKL:
• Choose one library from the Interface layer and one library from the Threading layer.
• Add the only library from the Computational layer and run-time libraries (RTL).
The following table lists Intel MKL libraries to link with your application.
Columns: Interface layer, Threading layer, Computational layer, RTL
IA-32 architecture, static linking: mkl_intel_c.lib, mkl_intel_thread.lib, mkl_core.lib, libiomp5md.lib
IA-32 architecture, dynamic linking: mkl_intel_c_dll.lib, mkl_intel_thread_dll.lib, mkl_core_dll.lib, libiomp5md.lib
Intel 64 architecture, static linking: mkl_intel_lp64.lib, mkl_intel_thread.lib, mkl_core.lib, libiomp5md.lib
Intel 64 architecture, dynamic linking: mkl_intel_lp64_dll.lib, mkl_intel_thread_dll.lib, mkl_core_dll.lib, libiomp5md.lib
The Single Dynamic Library (SDL) automatically links interface, threading, and computational libraries and thus simplifies linking. The following table lists Intel MKL libraries for dynamic linking using SDL. See Dynamically Selecting the Interface and Threading Layer for how to set the interface and threading layers at run time through function calls or environment settings.
Columns: SDL, RTL†
IA-32 and Intel 64 architectures: mkl_rt.lib, libiomp5md.lib
† Linking with libiomp5md.lib is not required.
For exceptions and alternatives to the libraries listed above, see Linking in Detail.
See Also: Layered Model Concept; U
137. Managing Multi-core Performance
Operating on Denormals
FFT Optimized Radices
Using Memory Management
Intel MKL Memory Management Software
Redefining Memory Functions
Chapter 6 Language-specific Usage Options
Using Language-Specific Interfaces with Intel Math Kernel Library
Interface Libraries and Modules
Fortran 95 Interfaces to LAPACK and BLAS
Compiler-dependent Functions and Fortran 90 Modules
Using the stdcall Calling Convention in C/C++
Compiling an Application that Calls the Intel Math Kernel Library and Uses the CVF Calling Conventions
Mixed-language Programming with the Intel Math Kernel Library
Calling LAPACK, BLAS, and CBLAS Routines from C/C++ Language Environments
Using Complex Types in C/C++
Calling BLAS Functions that Return the Complex
138. rovides access for multiple threads to the same shared data, while permitting only one thread at any given time to access a shared piece of data. Therefore, you can call Intel MKL from multiple threads and not worry about the function instances interfering with each other. The library uses OpenMP threading software, so you can use the environment variable OMP_NUM_THREADS to specify the number of threads or the equivalent OpenMP run-time function calls. Intel MKL also offers variables that are independent of OpenMP, such as MKL_NUM_THREADS, and equivalent Intel MKL functions for thread management. The Intel MKL variables are always inspected first, then the OpenMP variables are examined, and if neither is used, the OpenMP software chooses the default number of threads. By default, Intel MKL uses the number of threads equal to the number of physical cores on the system. To achieve higher performance, set the number of threads to the number of real processors or physical cores, as summarized in Techniques to Set the Number of Threads.
See Also: Managing Multi-core Performance
Threaded Functions and Problems
The following Intel MKL function domains are threaded:
• Direct sparse solver.
• LAPACK. For the list of threaded routines, see Threaded LAPACK Routines.
• Level1 and Level2 BLAS. For the list of threaded routines, see Threaded BLAS Level1 and Level2 Routines.
• All Level 3 BLAS and all Sparse BLAS routines except Level 2 Sparse Triangul
139. rstand ILP64 interface details, see also examples and tests.
Limitations
All Intel MKL function domains support ILP64 programming with the following exceptions:
• FFTW interfaces to Intel MKL:
  • FFTW 2.x wrappers do not support ILP64.
  • FFTW 3.2 wrappers support ILP64 by a dedicated set of functions plan_guru64.
• GMP Arithmetic Functions do not support ILP64.
NOTE: GMP Arithmetic Functions are deprecated and will be removed in a future release.
See Also: High-level Directory Structure; Include Files; Language Interfaces Support, by Function Domain; Layered Model Concept; Directory Structure in Detail
Linking with Fortran 95 Interface Libraries
The mkl_blas95.lib and mkl_lapack95.lib libraries contain Fortran 95 interfaces for BLAS and LAPACK, respectively, which are compiler-dependent. In the Intel MKL package, they are prebuilt for the Intel Fortran compiler. If you are using a different compiler, build these libraries before using the interface.
See Also: Fortran 95 Interfaces to LAPACK and BLAS; Compiler-dependent Functions and Fortran 90 Modules
Linking with Threading Libraries
Sequential Mode of the Library
You can use Intel MKL in a sequential (non-threaded) mode. In this mode, Intel MKL runs unthreaded code. However, it is thread-safe (except the LAPACK deprecated routine ?lacon), which means that you can use it in a parallel region in your Ope
140. ry Core library containing processor independent code and a 105 C Intel Math Kernel Library for Windows OS User s Guide File K kl l _def dll l p4n dll 1_mc dll l mc3 dll l_avx dll l_def dll 1 _p4n dll 1_mc dll I me2 sall l mc3 dll l_avx dll l _ scalapack_lp64 dll l scalapack_ilp64 dll _cdft_core dll libimalloc dll Run time Libraries RTL kl kl _b lacs 1p64 dll lacs_ilpo4 dll lacs_intelmpi_1p64 dll lacs_intelmpi_ilp64 dll lacs _mpich2 1p64 dll lacs mpich2 ilpe4 dll lacs msmpi_ 1p64 dll b lacs msmpi_ ilp64 dll 1033 mkl_msg dll 1041 mk1l_msg dll 106 Contents dispatcher for dynamic loading of processor specific code Default kernel for the Intel 64 architecture Kernel for the Intel Xeon processor using the Intel 64 archi tecture Kernel for processors based on the Intel Core microarchitec ture Kernel for the Intel Core i7 processors Kernel optimized for the Intel Advanced Vector Extensions Intel AVX VML VSL part of default kernel VML VSL for the Intel Xeon processor using the Intel 64 architecture VML VSL for processors based on the Intel Core microarchi tecture VML VSL for 45nm Hi k Intel Core 2 and Intel Xeon processor families VML VSL for the Intel Core i7 processors VML VSL optimized for the Intel Advanced Vector Extensions Intel AVX
141. ry Structure in Detail. For example, for the IA-32 architecture, it is one of mkl_scalapack_core.lib or mkl_cdft_core.lib.
<BLACS>: The BLACS library corresponding to your architecture, programming interface (LP64 or ILP64), and MPI version. These libraries are listed in Directory Structure in Detail. For example, for the IA-32 architecture, choose one of mkl_blacs_mpich2.lib or mkl_blacs_intelmpi.lib in case of static linking or mkl_blacs_dll.lib in case of dynamic linking; specifically, for MPICH2, choose mkl_blacs_mpich2.lib in case of static linking.
<MKL core libraries>: Intel MKL libraries other than ScaLAPACK or Cluster FFTs libraries.
TIP: Use the Link-line Advisor to quickly choose the appropriate set of <MKL cluster Library>, <BLACS>, and <MKL core libraries>.
Intel MPI provides prepackaged scripts for its linkers to help you link using the respective linker. Therefore, if you are using Intel MPI, the best way to link is to use the following commands:
    <path to Intel MPI binaries>\mpivars.bat
    set lib=<path to MKL libraries>;%lib%
    <mpilinker> <files to link> <MKL cluster Library> <BLACS> <MKL core libraries>
where the placeholders that are not yet defined are explained in the following table:
<path to MPI binaries>: By default, the bin subdirectory in the MPI installation directory. For exam
142. s not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision 20110804
Contents of the Intel Optimized MP LINPACK Benchmark for Clusters
The Intel Optimized MP LINPACK Benchmark for Clusters (MP LINPACK Benchmark) includes the HPL 2.0 distribution in its entirety, as well as the modifications delivered in the files listed in the table below and located in the benchmarks\mp_linpack\ subdirectory of the Intel MKL directory.
NOTE: Because MP LINPACK Benchmark includes the entire HPL 2.0 distribution, which provides a configuration for Linux OS only, some Linux OS files remain in the directory.
Directory\File in benchmarks\mp_linpack\: testing\ptest\HPL_pdtest.c; src\blas\HPL_dgemm.c; src\grid\HPL_grid_init.c; src\pgesv\HPL_pdgesvK2.c; src\pgesv\HPL_pdgesv0.c; testing\ptest\HPL.dat; makes; testing\ptimer; testing\timer; Make; bin_intel\ia32\xhpl_ia32.exe; bin_intel\intel64\xhpl_intel64.exe
Contents: HPL 2.0 code modified to display captured DGEMM information in ASYOUGO2_DISPLAY if it was captured (for details, see New Features). HPL 2.0 code modified to capture DGEMM information, if desired, from ASYOUGO2_DISPLAY. HPL 2.0 code modified to do additional grid experiments originally not in
143. ses another solution for a hybrid OpenMP/MPI mode.
TIP: To get best performance with threaded Intel MKL, compile your code with the /MT option.
See Also: Using Additional Threading Control; Linking with Compiler Run-time Libraries
Techniques to Set the Number of Threads
Use one of the following techniques to change the number of threads to use in Intel MKL:
• Set one of the OpenMP or Intel MKL environment variables: OMP_NUM_THREADS, MKL_NUM_THREADS, MKL_DOMAIN_NUM_THREADS
• Call one of the OpenMP or Intel MKL functions: omp_set_num_threads(), mkl_set_num_threads(), mkl_domain_set_num_threads()
When choosing the appropriate technique, take into account the following rules:
• The Intel MKL threading controls take precedence over the OpenMP controls because they are inspected first.
• A function call takes precedence over any environment variables. The exception, which is a consequence of the previous rule, is the OpenMP subroutine omp_set_num_threads(), which does not have precedence over Intel MKL environment variables, such as MKL_NUM_THREADS. See Using Additional Threading Control for more details.
• You cannot change run-time behavior in the course of the run using the environment variables because they are read only once, at the first call to Intel MKL.
Setting the Number of Threads Using an OpenMP Environment V
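The precedence rules listed above can be sketched as a small lookup. This is only an illustrative model, not Intel MKL code: the thread_settings structure and the effective_threads function are hypothetical names, a value of 0 stands for "not set", and the sketch ignores domain-specific settings and any run-time adjustment of the thread count.

```c
/* Hypothetical record of the settings a program might have made; 0 means "not set". */
typedef struct {
    int mkl_call;   /* value passed to mkl_set_num_threads() */
    int mkl_env;    /* value of the MKL_NUM_THREADS environment variable */
    int omp_call;   /* value passed to omp_set_num_threads() */
    int omp_env;    /* value of the OMP_NUM_THREADS environment variable */
} thread_settings;

/* Returns the requested thread count under the stated precedence rules:
 * Intel MKL controls are inspected before OpenMP controls, and within each
 * group a function call overrides the environment variable. */
int effective_threads(const thread_settings *s, int fallback)
{
    if (s->mkl_call > 0) return s->mkl_call;  /* MKL function call wins */
    if (s->mkl_env  > 0) return s->mkl_env;   /* MKL env beats omp_set_num_threads */
    if (s->omp_call > 0) return s->omp_call;  /* OpenMP call beats OpenMP env */
    if (s->omp_env  > 0) return s->omp_env;
    return fallback;                          /* nothing set: caller's default */
}
```

With MKL_NUM_THREADS set to 4 and omp_set_num_threads(8) called, the model returns 4, which mirrors the exception described in the second rule.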
144. sing Intel MKL Documentation in Visual Studio 2010 IDE
To access the Intel MKL documentation in Visual Studio 2010 IDE:
• Configure the IDE to use local help (once). To do this, go to Help > Manage Help Settings and check I want to use local help.
• Use the Help > View Help menu item to view a list of available help collections and open the Intel MKL documentation.
Using Context-Sensitive Help
When typing your code in the Visual Studio (VS) IDE Code Editor, you can get context-sensitive help using the F1 Help and Dynamic Help features.
F1 Help
To open the help topic relevant to the current selection, press F1. In particular, to open the help topic describing an Intel MKL function called in your code, select the function name and press F1. The topic with the function description opens in the window that displays search results.
[Figure: Visual Studio Code Editor with Source1.cpp open, showing DFTI calls and the F1 Help result for DftiCommitDescriptor]
Dynamic Help
145. sing the Link-line Advisor; Using the /Qmkl Compiler Option; Working with the Intel Math Kernel Library Cluster Software
Using the Link-line Advisor
Use the Intel MKL Link-line Advisor to determine the libraries and options to specify on your link or compilation line. The latest version of the tool is available at http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor. The tool is also available in the product. The Advisor requests information about your system and on how you intend to use Intel MKL (link dynamically or statically, use threaded or sequential mode, etc.). The tool automatically generates the appropriate link line for your application.
See Also: Contents of the Documentation Directories
Using the Command-line Link Tool
Use the command-line Link tool provided by Intel MKL to simplify building your application with Intel MKL. The tool not only provides the options, libraries, and environment variables to use, but also performs compilation and building of your application. The tool mkl_link_tool.exe is installed in the <mkl directory>\tools directory. See the knowledge base article at http://software.intel.com/en-us/articles/mkl-command-line-link-tool for more information.
Linking Examples
See Also: Using the Link-line Advisor; Examples for Linking with ScaLAPACK and Cluster FFT
Linking on IA-32 Architecture Systems
The follow
146. Contents of the Documentation Directories
Chapter 4: Linking Your Application with the Intel Math Kernel Library
• Linking Quick Start
  • Using the /Qmkl Compiler Option
  • Automatically Linking a Project in the Visual Studio Integrated Development Environment with Intel MKL
    • Automatically Linking Your Microsoft Visual C/C++ Project with Intel MKL
    • Automatically Linking Your Intel Visual Fortran Project with Intel MKL
  • Using the Single Dynamic Library
  • Selecting Libraries to Link With
  • Using the Link-line Advisor
  • Using the Command-line Link Tool
• Linking Examples
  • Linking on IA-32 Architecture Systems
  • Linking on Intel(R) 64 Architecture Systems
• Linking in Detail
147. statically linked Intel MKL:
1. Include the i_malloc.h header file in your code. This header file contains all declarations required for replacing the memory allocation functions. The header file also describes how memory allocation can be replaced in those Intel libraries that support this feature.
2. Redefine values of pointers i_malloc, i_free, i_calloc, and i_realloc prior to the first call to MKL functions, as shown in the following example:
    #include "i_malloc.h"
    i_malloc = my_malloc;
    i_calloc = my_calloc;
    i_realloc = my_realloc;
    i_free = my_free;
    /* Now you may call Intel MKL functions */
If you are using the dynamically linked Intel MKL:
1. Include the i_malloc.h header file in your code.
2. Redefine values of pointers i_malloc_dll, i_free_dll, i_calloc_dll, and i_realloc_dll prior to the first call to MKL functions, as shown in the following example:
    #include "i_malloc.h"
    i_malloc_dll = my_malloc;
    i_calloc_dll = my_calloc;
    i_realloc_dll = my_realloc;
    i_free_dll = my_free;
    /* Now you may call Intel MKL functions */
Language-specific Usage Options
The Intel Math Kernel Library (Intel MKL) provides broad support for Fortran and C/C++ programming. However, not all functions support both Fortran and C interfaces. For example, some LAPACK functions have no C interface. You can call such functions from C using mixed-language programming. If you
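The procedure above leaves my_malloc, my_calloc, my_realloc, and my_free undefined. A minimal sketch of what such replacements might look like follows. It assumes the i_malloc pointers accept functions with the standard C run-time signatures, which is what the text implies but which should be checked against i_malloc.h (a Windows build may also require a specific calling convention). The counters, the USE_MKL macro, and install_allocators are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

#ifdef USE_MKL              /* hypothetical macro: define it when building against Intel MKL */
#include "i_malloc.h"
#endif

/* Hypothetical counters that make the redirection observable. */
size_t alloc_calls = 0;
size_t free_calls = 0;

/* Thin wrappers around the C run-time functions; a real application would
 * forward to its own memory manager here instead. */
void *my_malloc(size_t size)              { alloc_calls++; return malloc(size); }
void *my_calloc(size_t n, size_t size)    { alloc_calls++; return calloc(n, size); }
void *my_realloc(void *ptr, size_t size)  { alloc_calls++; return realloc(ptr, size); }
void  my_free(void *ptr)                  { free_calls++; free(ptr); }

/* Call once, before the first Intel MKL function is called. */
void install_allocators(void)
{
#ifdef USE_MKL
    i_malloc  = my_malloc;
    i_calloc  = my_calloc;
    i_realloc = my_realloc;
    i_free    = my_free;
#endif
}
```

For the dynamically linked library, the same wrappers would be assigned to i_malloc_dll, i_calloc_dll, i_realloc_dll, and i_free_dll instead.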
148. t data input file
Intel Optimized MP LINPACK Benchmark for Clusters
Overview of the Intel Optimized MP LINPACK Benchmark for Clusters
The Intel Optimized MP LINPACK Benchmark for Clusters is based on modifications and additions to HPL 2.0 from Innovative Computing Laboratories (ICL) at the University of Tennessee, Knoxville (UTK). The Intel Optimized MP LINPACK Benchmark for Clusters can be used for Top 500 runs (see http://www.top500.org). To use the benchmark, you need to be intimately familiar with the HPL distribution and usage. The Intel Optimized MP LINPACK Benchmark for Clusters provides some additional enhancements and bug fixes designed to make the HPL usage more convenient, and explains Intel Message-Passing Interface (MPI) settings that may enhance performance. The benchmarks\mp_linpack directory adds techniques to minimize search times frequently associated with long runs.
The Intel Optimized MP LINPACK Benchmark for Clusters is an implementation of the Massively Parallel MP LINPACK benchmark by means of HPL code. It solves a random dense (real*8) system of linear equations (Ax=b), measures the amount of time it takes to factor and solve the system, converts that time into a performance rate, and tests the results for accuracy. You can solve any size (N) system of equations that fits into memory. The benchmark uses full row pivoting to ensure the accuracy of the results.
Use the Intel Optimized MP LINPACK Benchmark for Clust
149. tatement with the appropriate Intel MKL header file to your code. To get the list of parameters of a function specified in the header file:
1. Type the function name.
2. Type the opening parenthesis.
This brings up the tooltip with the list of the function parameters.
[Figure: Visual Studio Code Editor showing the Parameter Info tooltip for DftiCommitDescriptor, declared in mkl_dfti.h]
Complete Word
For a software library, the Complete Word feature types or prompts for the rest of the name defined in the header file once you type the first few characters of the name in your code. This feature requires adding the include statement with the appropriate Intel MKL header file to your code. To complete the name of the function or named constant specified in the header file:
1. Type the first few characters of the name.
2. Press Alt+RIGHT ARROW or Ctrl+SPACEBAR.
If you have typed enough characters to disambiguate the name, the rest of the name is typed automatical
150. tegers are 64-bit; otherwise, use the default LP64 interface, where integers are 32-bit (see Using the ILP64 Interface vs. LP64 Interface).
Identify whether and how your application is threaded:
• Threaded with the Intel compiler
• Threaded with a third-party compiler
• Not threaded
Reason: The compiler you use to thread your application determines which threading library you should link with your application. For applications threaded with a third-party compiler, you may need to use Intel MKL in the sequential mode (for more information, see Sequential Mode of the Library and Linking with Threading Libraries).
Determine the number of threads you want Intel MKL to use.
Reason: Intel MKL is based on the OpenMP threading. By default, the OpenMP software sets the number of threads that Intel MKL uses. If you need a different number, you have to set it yourself using one of the available mechanisms. For more information, see Using Parallelism of the Intel Math Kernel Library.
Decide which linking model is appropriate for linking your application with Intel MKL libraries:
• Static
• Dynamic
Reason: The link libraries for static and dynamic linking are different. For the list of link libraries for static and dynamic models, linking examples, and other relevant topics, like how to save disk space by creating a custom dynamic library, see Linking Your Application with the Intel Math Kernel Library.
MPI used De
151. the more accurately the last update that -DENDEARLY returns is close to what happens when the problem runs to completion. -DENDEARLY is a poor approximation for small problems. For this reason, use ENDEARLY in conjunction with ASYOUGO2, because ASYOUGO2 reports actual DGEMM performance, which can be a closer approximation to problems just starting.
-DASYOUGO2
-DASYOUGO2 gives detailed single-node DGEMM performance information. It captures all DGEMM calls (if you use Fortran BLAS) and records their data. Because of this, the routine has a marginal intrusive overhead. Unlike -DASYOUGO, which is quite non-intrusive, -DASYOUGO2 interrupts every DGEMM call to monitor its performance. Be aware of this overhead, although for big problems it is less than 0.1%.
Here is a sample ASYOUGO2 output (the first 3 non-intrusive numbers can be found in ASYOUGO and ENDEARLY, so it suffices to describe these numbers here):
Col=001280 Fract=0.050 Mflops=42454.99 DT=9.5 DF=34.1 DMF=38322.78
The problem size was N=16000 with a block size of 128. After 10 blocks, that is, 1280 columns, an output was sent to the screen. Here, the fraction of columns completed is 1280/16000=0.08. Only up to 40 outputs are printed, at various places through the matr
152. thod: Static Linking: mkl_core.lib; Dynamic Linking: mkl_core_dll.lib
Computational Libraries for Applications that Use the Intel MKL Cluster Software
ScaLAPACK and Cluster Fourier Transform Functions (Cluster FFT) require more computational libraries, which may depend on your architecture. The following table lists computational libraries for IA-32 architecture applications that use ScaLAPACK or Cluster FFT.
Computational Libraries for IA-32 Architecture
Function domain | Static Linking | Dynamic Linking
ScaLAPACK | mkl_scalapack_core.lib, mkl_core.lib | mkl_scalapack_core_dll.lib, mkl_core_dll.lib
Cluster Fourier Transform Functions | mkl_cdft_core.lib, mkl_core.lib | mkl_cdft_core_dll.lib, mkl_core_dll.lib
Also add the library with BLACS routines corresponding to the MPI used.
The following table lists computational libraries for Intel 64 architecture applications that use ScaLAPACK or Cluster FFT.
Computational Libraries for the Intel 64 Architecture
Function domain | Static Linking | Dynamic Linking
ScaLAPACK, LP64 interface | mkl_scalapack_lp64.lib, mkl_core.lib | mkl_scalapack_lp64_dll.lib, mkl_core_dll.lib
ScaLAPACK, ILP64 interface | mkl_scalapack_ilp64.lib, mkl_core.lib | mkl_scalapack_ilp64_dll.lib, mkl_core_dll.lib
Cluster Fourier Transform | mkl_cdft_core.lib, mkl_core.lib | mkl_cdft_core_dll.lib, mkl_c
153. tip 53 hybrid version of MP LINPACK 89 I ILP64 programming support for 34 include files Intel R MKL 96 installation checking 17 Intel R Hyper Threading Technology configuration tip 53 Intel R Visual Fortran project linking with Intel R MKL 28 IntelliSense with Intel R MKL in Visual Studio IDE 84 interface cdecl and stdcall use of 33 Fortran 95 libraries 36 LP64 and ILP64 use of 34 interface libraries and modules Intel R MKL 57 interface libraries linking with 33 J Java examples 66 L language interfaces support 95 language specific interfaces interface libraries and modules 57 LAPACK 107 Intel Math Kernel Library for Windows OS User s Guide C interface to use of 61 calling routines from C 61 Fortran 95 interface to 59 performance of packed routines 52 threaded routines 43 layers Intel R MKL structure 25 libraries to link with computational 37 interface 33 run time 38 system libraries 38 threading 36 link tool command line 30 linking Intel R Visual Fortran project with Intel R MKL 28 Microsoft Visual C C project with Intel R MKL 28 linking examples cluster software 74 general 30 linking with compiler run time libraries 38 computational libraries 37 interface libraries 33 system libraries 38 threading libraries 36 linking quick start 27 linking Web based advisor 29 LINPACK benchmark 87 memory functions redefining 55 memory management 54 memory renaming 55
154. tive number>
<positive number> ::= <decimal positive number> | <octal number> | <hexadecimal number>
In the syntax above, values of <MKL domain env name> indicate function domains as follows:
MKL_DOMAIN_ALL: All function domains
MKL_DOMAIN_BLAS: BLAS Routines
MKL_DOMAIN_FFT: non-cluster Fourier Transform Functions
MKL_DOMAIN_VML: Vector Mathematical Functions
MKL_DOMAIN_PARDISO: PARDISO
For example:
MKL_DOMAIN_ALL 2 MKL_DOMAIN_BLAS 1 MKL_DOMAIN_FFT 4
The global variables MKL_DOMAIN_ALL, MKL_DOMAIN_BLAS, MKL_DOMAIN_FFT, MKL_DOMAIN_VML, and MKL_DOMAIN_PARDISO, as well as the interface for the Intel MKL threading control functions, can be found in the mkl.h header file.
The table below illustrates how values of MKL_DOMAIN_NUM_THREADS are interpreted.
Value of MKL_DOMAIN_NUM_THREADS | Interpretation
MKL_DOMAIN_ALL=4 | All parts of Intel MKL should try four threads. The actual number of threads may be still different because of the MKL_DYNAMIC setting or system resource issues. The setting is equival
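To make the format of an MKL_DOMAIN_NUM_THREADS value more concrete, here is a minimal parsing sketch. It is not how Intel MKL reads the variable: parse_domain_threads and domain_setting are hypothetical names, the accepted separator characters (space, tab, comma, semicolon, colon, equals sign) are an assumption, and domain names are not validated. Calling strtol with base 0 accepts decimal, octal, and hexadecimal numbers, matching the three number forms named in the syntax.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_DOMAINS 8

/* Hypothetical holder for one "<domain name> <number of threads>" pair. */
typedef struct {
    char name[32];
    long threads;
} domain_setting;

/* Parses s into out[]; returns the number of pairs found, or -1 on malformed input. */
int parse_domain_threads(const char *s, domain_setting *out, int max_out)
{
    static const char seps[] = " \t,;:=";    /* assumed separator characters */
    int count = 0;

    while (*s) {
        s += strspn(s, seps);                /* skip separators before a name */
        if (*s == '\0') break;

        size_t len = strcspn(s, seps);       /* length of the domain name */
        if (len >= sizeof out[0].name || count >= max_out) return -1;
        memcpy(out[count].name, s, len);
        out[count].name[len] = '\0';
        s += len;

        s += strspn(s, seps);                /* skip separators before the number */
        char *end;
        long n = strtol(s, &end, 0);         /* base 0: decimal, 0-octal, 0x-hex */
        if (end == s || n <= 0) return -1;   /* missing or non-positive number */

        out[count].threads = n;
        count++;
        s = end;
    }
    return count;
}
```

A caller would typically pass the result of getenv("MKL_DOMAIN_NUM_THREADS") after checking it for NULL.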
155. tories mainly by Intel MKL function domains and programming languages. For example, the examples\spblas subdirectory contains a makefile to build the Sparse BLAS examples, and the examples\vmlc subdirectory contains the makefile to build the C VML examples. Source code for the examples is in the next-level sources subdirectory.
See Also: High-level Directory Structure; Running an Intel MKL Example in the Visual Studio 2008 IDE
What You Need to Know Before You Begin Using the Intel Math Kernel Library
Target platform: Identify the architecture of your target machine:
• IA-32 or compatible
• Intel 64 or compatible
Reason: Because Intel MKL libraries are located in directories corresponding to your particular architecture (see Architecture Support), you should provide proper paths on your link lines (see Linking Examples). To configure your development environment for the use with Intel MKL, set your environment variables using the script corresponding to your architecture (see Setting Environment Variables for details).
Mathematical problem: Identify all Intel MKL function domains that you require:
• BLAS
• Sparse BLAS
• LAPACK
• PBLAS
• ScaLAPACK
• Sparse Solver routines
• Vector Mathematical Library functions (VML)
• Vector Statistical Library funct
156. Appendix C: Directory Structure in Detail
• Detailed Structure of the IA-32 Architecture Directories
  • Static Libraries in the lib\ia32 Directory
  • Dynamic Libraries in the lib\ia32 Directory
  • Contents of the redist\ia32\mkl Directory
• Detailed Structure of the Intel 64 Architecture Directories
  • Static Libraries in the lib\intel64 Directory
  • Dynamic Libraries in the lib\intel64 Directory
  • Contents of the redist\intel64\mkl Directory
Legal Information
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY RELATING TO SALE AND/OR USE OF INTEL PRODUCTS, INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT, OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR I
157. umentation at http://www.microsoft.com.
To create and configure the Win32/Debug project running an Intel MKL C example with the Intel C/C++ Compiler integrated into Visual Studio and/or Microsoft Visual C++ 2008, perform the following steps:
1. Create a C Project:
a. Open Visual Studio 2008.
b. On the main menu, select File > New > Project to open the New Project window.
c. Select Project Types > Visual C++ > Win32, then select Templates > Win32 Console Application. In the Name field, type <project name>, for example, MKL_CBLAS_CAXPYIX, and click OK. The New Project window closes, and the Win32 Application Wizard <project name> window opens.
d. Select Next, then select Application Settings, check Additional options > Empty project, and click Finish. The Win32 Application Wizard <project name> window closes.
The next steps are performed inside the Solution Explorer window. To open it, select View > Solution Explorer from the main menu.
2. (optional) To switch to the Intel C/C++ project, right-click <project name> and from the drop-down menu, select Convert to use Intel C++ Project System. (The menu item is available if the Intel C/C++ Compiler is integrated into Visual Studio.)
3. Add sources of the Intel MKL example to the project:
a. Right-click the Source Files folder under <pr
158. ure. If you specify any variant of the /Qmkl compiler option, the compiler automatically includes the Intel MKL libraries. In cases not covered by the option, use the Link-line Advisor or see Linking in Detail.
See Also: Using the ILP64 Interface vs. LP64 Interface; Using the Link-line Advisor; Intel Software Documentation Library
Automatically Linking a Project in the Visual Studio Integrated Development Environment with Intel MKL
After a default installation of the Intel Math Kernel Library (Intel MKL), Intel C++ Composer XE, or Intel Visual Fortran Composer XE, you can easily configure your project to automatically link with Intel MKL.
Automatically Linking Your Microsoft Visual C/C++ Project with Intel MKL
Configure your Microsoft Visual C/C++ project for automatic linking with Intel MKL as follows:
• For the Visual Studio 2010 development system:
1. Go to Project > Properties > Configuration Properties > Intel Performance Libraries.
2. Change the Use MKL property setting by selecting Parallel, Sequential, or Cluster as appropriate.
• For the Visual Studio 2005/2008 development system:
1. Go to Project > Intel C++ Composer XE 2011 > Select Build Components.
2. From the Use MKL drop-down menu, select Parallel, Sequential, or Cluster as appropriate.
Specific Intel MKL libraries that link with your application may depend on more project settings. For
159. urn it off, set the MKL_DISABLE_FAST_MM environment variable to any value or call the mkl_disable_fast_mm() function. Be aware that this change may negatively impact performance of some Intel MKL routines, especially for small problem sizes.
Redefining Memory Functions
In C/C++ programs, you can replace Intel MKL memory functions that the library uses by default with your own functions. To do this, use the memory renaming feature.
Memory Renaming
Intel MKL memory management by default uses standard C run-time memory functions to allocate or free memory. These functions can be replaced using memory renaming. Intel MKL accesses the memory functions by pointers i_malloc, i_free, i_calloc, and i_realloc, which are visible at the application level. These pointers initially hold addresses of the standard C run-time memory functions malloc, free, calloc, and realloc, respectively. You can programmatically redefine values of these pointers to the addresses of your application's memory management functions. Redirecting the pointers is the only correct way to use your own set of memory management functions. If you call your own memory functions without redirecting the pointers, the memory will get managed by two independent memory management packages, which may cause unexpected memory issues.
How to Redefine Memory Functions
To redefine memory functions, use the following procedure:
If you are using the
160. want to use LAPACK or BLAS functions that support Fortran 77 in the Fortran 95 environment, additional effort may be initially required to build compiler-specific interface libraries and modules from the source code provided with Intel MKL.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804
Using Language-Specific Interfaces with Intel Math Kernel Library
This section discusses mixed-language programming and the use of language-specific interfaces with Intel MKL. See also Appendix G in the Intel MKL Reference Manual for details of the FFTW interfaces to Intel MKL.
Interface Libraries and Modules
You can create the following interface libraries and modules using the respective makefiles located in the interfa
161. xed, there is a large parameter space you must search over. An exhaustive search of all possible inputs is improbably large even for a powerful cluster. MP LINPACK optionally prints information on performance as it proceeds. You can also terminate early.
• Save time by compiling with -DENDEARLY -DASYOUGO2 and using a negative threshold (do not use a negative threshold on the final run that you intend to submit as a Top500 entry). Set the threshold in line 13 of the HPL 2.0 input file HPL.dat.
• If you are going to run a problem to completion, do it with -DASYOUGO.
5. Using the quick performance feedback, return to step 3 and iterate until you are sure that the performance is as good as possible.
See Also: Options to Reduce Search Time
Options to Reduce Search Time
Running large problems to completion on large numbers of nodes can take many hours. The search space for MP LINPACK is also large: not only can you run any size problem, but over a number of block sizes, grid layouts, lookahead steps, using different factorization methods, and so on. It can be a large waste of time to run a large problem to completion only to discover it ran 0.01% slower than your previous best problem. Use the following options to reduce the search time:
• -DASYOUGO
• -DENDEARLY
• -DASYOUGO2
Use -DASYOUGO2 cautiously because it does have a marginal performance impact. To see DGEMM internal
162. To migrate to ILP64 or write new code for ILP64, use appropriate types for parameters of the Intel MKL functions and subroutines:
Integer Types | Fortran | C or C++
32-bit integers | INTEGER*4 or INTEGER(KIND=4) | int
Universal integers for ILP64/LP64 (64-bit for ILP64, 32-bit otherwise) | INTEGER without specifying KIND | MKL_INT
Universal integers for ILP64/LP64 (64-bit integers) | INTEGER*8 or INTEGER(KIND=8) | MKL_INT64
FFT interface integers for ILP64/LP64 | INTEGER without specifying KIND | MKL_LONG
To determine the type of an integer parameter of a function, use appropriate include files. For functions that support only a Fortran interface, use the C/C++ include files *.h.
The above table explains which integer parameters of functions become 64-bit and which remain 32-bit for ILP64. The table applies to most Intel MKL functions except some VML and VSL functions, which require integer parameters to be 64-bit or 32-bit regardless of the interface:
• VML: The mode parameter of VML functions is 64-bit.
• Random Number Generators (RNG): All discrete RNG except viRngUniformBits64 are 32-bit. The viRngUniformBits64 generator function and vslSkipAheadStream service function are 64-bit.
• Summary Statistics: The estimate parameter of the vslsSSCompute/vsldSSCompute function is 64-bit.
Refer to the Intel MKL Reference Manual for more information.
To better unde
163. y> the builder uses the Intel MKL installation directory.
buf_lib: Manages resolution of the security cookie external references in the custom DLL on systems based on the Intel 64 architecture. By default, the makefile uses the bufferoverflowu.lib library of Microsoft SDK builds 1289 or higher. This library resolves the __security_cookie external references. To avoid using this library, set the empty value of this parameter. Therefore, if you are using an older SDK, set buf_lib to the empty value.
CAUTION: Use the buf_lib parameter only with the empty value. Incorrect value of the parameter causes builder errors.
crt=<c run-time library>: Specifies the name of the Microsoft C run-time library to be used to build the custom DLL. By default, the builder uses msvcrt.lib.
manifest={yes|no|embed}: Manages the creation of a Microsoft manifest for the custom DLL:
• If manifest=yes, the manifest file with the name defined by the name parameter above and the manifest extension will be created.
• If manifest=no, the manifest file will not be created.
• If manifest=embed, the manifest will be embedded into the DLL.
By default, the builder does not use the manifest parameter.
All the above parameters are optional. In the simplest case, the command line is nmake ia32, and the missing options have default values. This command creates the mkl_custom.dll and mkl_custom.lib libraries with the cdecl interface for processors using the IA-32 archit
164. zed for performance only on Intel microprocessors Performance tests such as SYSmark and MobileMark are measured using specific computer systems components software operations and functions Any change to any of those factors may cause the results to vary You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases including the performance of that product when combined with other products BlueMoon BunnyPeople Celeron Celeron Inside Centrino Centrino Inside Cilk Core Inside E GOLD i960 Intel the Intel logo Intel AppUp Intel Atom Intel Atom Inside Intel Core Intel Inside Intel Insider the Intel Inside logo Intel NetBurst Intel NetMerge Intel NetStructure Intel SingleDriver Intel SpeedStep Intel Sponsors of Tomorrow the Intel Sponsors of Tomorrow logo Intel StrataFlash Intel vPro Intel XScale InTru the InTru logo the InTru Inside logo InTru soundmark Itanium Itanium Inside MCS MMX Moblin Pentium Pentium Inside Puma skoool the skoool logo SMARTi Sound Mark The Creators Project The Journey Inside Thunderbolt Ultrabook vPro Inside VTune Xeon Xeon Inside X GOLD XMM X PMU and XPOSYS are trademarks of Intel Corporation in the U S and or other countries Other names and brands may be claimed as the property of others Microsoft Windows Visual Studio Visual C and the Windows logo are trademarks or registered trademarks of Microsoft