VampirTrace User Manual
Contents
1. STL Cycles with no instruction issue FUL Cycles with maximum instruction issue CCY Cycles with no instructions completed FUL CCY Cycles with maximum instructions completed BR UCN Unconditional branch instructions I BR CN Conditional branch instructions I BR TKN Conditional branch instructions taken I BR NTK Conditional branch instructions not taken I BR MSP Conditional branch instructions mispredicted I BR PRC Conditional branch instructions correctly predicted FMA INS FMA instructions completed I TOT IIS Instructions issued TOT INS Instructions completed I INT INS Integer instructions FP INS Floating point instructions LD INS Load instructions SR INS Store instructions BR INS Branch instructions VEC INS Vector SIMD instructions LST INS Load store instructions completed SYC INS Synchronization instructions completed FML INS Floating point multiply instructions FAD INS Floating point add instructions I FDV INS Floating point divide instructions I FSO INS Floating point square root instructions I FNV INS Floating point inverse instructions RES Cycles stalled any resource FP STAL Cycles the FP unit s are stalled I_FP_OPS Floating point operations I_TOT_CYC Total cycles I HW INT Hardware interrupts APPENDIX VAMPIRTRACE INSTALLATION C VampirTrace
2. lt f gt lt f gt Exclude certain symbols from filtering A symbol may contain 25 4 TRACE FILTER TOOL VTFILTER 26 osc stats environment variables TRACEF LT ER EXCLUDEFI wildcards Force to include certain symbols into the filter A symbol may contain wildcards Automatically include children of included functions as well into the filter Prints out the desired and the xpected percentage of file size LE Specifies a file containing list TRACEFI ER INCLUDEFTI of symbols not to be filtered The list of members can be seperated by space comma tab newline and may contain wildcards LE Specifies a file containing a list of symbols to be filtered APPENDIX B COUNTER SPECIFICATIONS B PAPI Counter Specifications Available counter names can be queried with the PAPI commands 11 and papi_native_avail There are limitations to the combinations of coun ters To check whether your choice works properly use the command papi_event_chooser PAPI_L 1 2 3 _ D I T JC M H A R W Level 1 2 3 data instruction total cache misses hits accesses reads writes PAPT_L 1 2 3 _ LD ST M Level 1 2 3 load store misses PAPI SNP Requests fo
3. bargo foo foobar F90 o foobar DVTRACE or POMP directives have to be preprocessed by CPP APPENDIX COMMAND REFERENCE A 2 Local Trace Unifier vtunify vtunify local trace unifier for VampirTrace Syntax vtunify lt files gt lt iprefix gt o lt oprefix gt c compress lt on off gt k keeplocal 7v verbose Options h help Show this help message files number of local trace files equal to of uctl files iprefix prefix of input trace filename lt oprefix gt prefix of output trace filename lt statsofile gt statistics output filename default lt oprefix gt stats noshowstats Don t show statistics on stdout nocompress Don t compress output trace files 1 Don t remove input trace files verbos Enable verbose mode 23 DYNINST MUTATOR VTDYN A 3 Dyninst Mutator vtdyn vtdyn Dyninst Mutator for VampirTrace Syntax vtdyn v verbose s shlib lt shlib gt b blacklist bfile pl pid lt pid gt app appargs Options h help Show this help message V verbose Enable verbose mode s shlib Comma separated list of shared libraries lt shlib gt which should also be instrumented 24 b blacklrst lt bfile gt lt pid gt appargs Set path of blac
4. ZIH Center for Information Services amp High Performance Computing VampirTrace 5 4 6 User Manual EERE tt EELE EEE BEERS BE AA BEEBE E L1 EI EJ EJ E E ES EI EI EJ ET ET F1 TU Dresden Center for Information Services and High Performance Computing ZIH 01062 Dresden Germany http www tu dresden de zih http www tu dresden de zih vampirtrace Contents Contents 1 Introduction 2 Instrumentation 2 1 The Compiler Wrappers 2 2 Instrumentation Ty 2 3 Automatic Instrumentation 2 4 Manual Instrumentation using the VampirTrace 2 5 Manual Instrumentation using POMP 2 6 Binary instrumentation using Dyninst 3 Runtime Measurement 3 1 Environment Varia 9 2 Influencing Trace File 3 3 Unification of local Traces 4 Recording additional Events and Counters 4 1 PAPI Hardware Performance Counters 4 2 Memory Allocation Counter
5. event records in the trace file. This feature has to be activated for each tracing run by setting the environment variable VT_IOTRACE to yes.

4.4 User Defined Counters

In addition to the manual instrumentation (see Section 2.4), the VampirTrace API provides instrumentation calls which allow recording of program variable values (e.g. iteration counts, calculation results) or any other numerical quantity. A user-defined counter is identified by its name, the counter group it belongs to, the type of its value (integer or floating point), and the unit in which the value is quoted (e.g. "GFlop/sec").

The VT_COUNT_GROUP_DEF and VT_COUNT_DEF instrumentation calls can be used to define counter groups and counters:

Fortran:
    #include "vt_user.inc"
    integer id, gid
    VT_COUNT_GROUP_DEF('name', gid)
    VT_COUNT_DEF('name', 'unit', type, gid, id)

C/C++:
    #include "vt_user.h"
    unsigned int id, gid;
    gid = VT_COUNT_GROUP_DEF("name");
    id = VT_COUNT_DEF("name", "unit", type, gid);

The definition of a counter group is optional. If no special counter group is desired, the default group "User" can be used. In this case, set the parameter gid of VT_COUNT_DEF to VT_COUNT_DEFGROUP. The third parameter, type, of VT_COUNT_DEF specifies the data type of the counter value. To record a value for any of the defined co
6. BLACKLIST Name of blacklist file for Dyninst instrumentation (see Section 2.6)

    Variable          Purpose                                                                      Default
    VT_DYN_SHLIBS     Colon-separated list of shared libraries for Dyninst instrumentation (see Section 2.6)
    VT_FILTER_SPEC    Name of function/region filter file (see Section 5.1)
    VT_GROUPS_SPEC    Name of function grouping file (see Section 5.2)
    VT_UNIFY          Unify local trace files afterwards                                           yes
    VT_COMPRESSION    Write compressed trace files                                                 yes

The value for the first three variables can contain substrings of the form $XYZ or ${XYZ}, where XYZ is the name of another environment variable. Evaluation of the environment variable is done at measurement run time.

When you use these environment variables, make sure that they have the same value for all processes of your application on all nodes of your cluster. Some cluster environments do not automatically transfer your environment when executing parts of your job on remote nodes of the cluster, and you may need to explicitly set and export them in batch job submission scripts.

3.2 Influencing Trace File Size

The default values of the environment variables VT_BUFFER_SIZE and VT_MAX_FLUSHES limit the internal buffer of VampirTrace to 32 MB and the number of times that the buffer is flushed to 1. Events that should be recorded after the limit has been reached are no longer written into the trace file. The environment variables apply to every process of a parallel application, meaning that applications with
7. To use VampirTrace with Dyninst, you will also need to add the lib subdirectory to your LD_LIBRARY_PATH environment variable.

for csh and tcsh:
    > setenv PATH <vt-install>/bin:$PATH
    > setenv LD_LIBRARY_PATH <vt-install>/lib:$LD_LIBRARY_PATH

for bash and sh:
    export PATH=<vt-install>/bin:$PATH
    export LD_LIBRARY_PATH=<vt-install>/lib:$LD_LIBRARY_PATH

C.5 Notes for Developers

Build from CVS

If you have checked out a developer's copy of VampirTrace (i.e. checked out from CVS), you should first run:

    ./bootstrap

Note that GNU Autoconf (>= 2.60) and GNU Automake (>= 1.9.6) are required. You can download them from http://www.gnu.org/software/autoconf and http://www.gnu.org/software/automake.

Creating a distribution tarball (VampirTrace-X.X.X.tar.gz)

If you would like to create a new distribution tarball, run:

    ./makedist -o <otftarball> <major> <minor> <release>

instead of make dist. The script makedist adapts the version number <major>.<minor>.<release> in configure.in and extracts the given OTF tarball <otftarball> into extlib/otf.
8. lposix 29 C 2 CONFIGURE OPTIONS Installation Names By default make install will install the package s files in usr local bin usr local include etc You can specify an installation prefix other than usr local by giving configure the option prefix PATH Optional Features enable compinst COMPINSTLIST enable support for compiler instrumentation e g gnu intel phat 1 ft race A VampirTrace installation can handle different compilers The first item in the list is the run time default default automatically by configure enable mpi enable MPI support default enable if MPI found by configure enable omp enable OpenMP support default enable if compiler supports OpenMP enable hyb enable Hybrid support default enable if MPI found and compiler supports OpenMP enable memt race enable memory tracing support default enable if found by configure enable iotrace enable libc s I O tracing support default enable if libdl found by configure enable dyninst enable support for Dyninst instrumentation default enable if found by configure Note Requires Dyninst version 5 0 1 or higher http www dyninst org enable dyninst attlib build shared library which attaches dyninst to the running application default enable if dyninst found by configure and system supports shared libraries 1 enable PAPI hardware counter su
9. If you want to instrument MPI events only (creates smaller trace files and less overhead), use the option -vt:inst manual to disable automatic instrumentation of user functions (see also Section 2.4).

• OpenMP parallel programs: When VampirTrace detects OpenMP flags on the command line, OPARI is invoked for automatic source code instrumentation of OpenMP events:

    original:               ifort -openmp pi.f -o pi
    with instrumentation:   vtf77 -openmp pi.f -o pi

  For more information about OPARI, refer to share/vampirtrace/doc/opari/Readme.html in VampirTrace's installation directory.

• Hybrid MPI/OpenMP parallel programs: With a combination of the above-mentioned approaches, hybrid applications can be instrumented:

    original:               mpif90 -openmp hybrid.F90 -o hybrid
    with instrumentation:   vtf90 -vt:f90 mpif90 -openmp hybrid.F90 -o hybrid

The VampirTrace compiler wrappers try to detect automatically which parallelization method is used by means of the compiler flags (e.g. -openmp or -lmpi) and the compiler command (e.g. mpif90). If the compiler wrapper fails to detect this correctly, the instrumentation could be incomplete and an unsuitable VampirTrace library would be linked to the binary. In this case, you should tell the compiler wrapper which parallelization method your program uses by the switches -vt:mpi, -vt:omp, and -vt:hyb for MPI, OpenMP, and hybrid programs, respectively. Note that these switches do not change the underlying compiler or com
10. example, this is required on the BlueGene/L platform or when using Dyninst instrumentation.

4 Recording additional Events and Counters

4.1 PAPI Hardware Performance Counters

If VampirTrace has been built with hardware counter support enabled (see Section C), VampirTrace is capable of recording hardware counter information as part of the event records. To request the measurement of certain counters, the user must set the environment variable VT_METRICS. The variable should contain a colon-separated list of counter names or a predefined platform-specific group. Metric names can be any PAPI preset names or PAPI native counter names. For example, set

    VT_METRICS=PAPI_FP_OPS:PAPI_L2_TCM

to record the number of floating point instructions and level 2 cache misses. See Appendix B for a full list of PAPI preset counters.

The user can leave the environment variable unset to indicate that no counters are requested. If any of the requested counters are not recognized, or the full list of counters cannot be recorded due to hardware resource limits, program execution will be aborted with an error message.

4.2 Memory Allocation Counters

The GNU glibc implementation provides a special hook mechanism that allows intercepting all calls to allocation and free functions (e.g. malloc, realloc, free). This is independent from compi
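As a rough sketch of the VT_METRICS setting described in this passage, the following shell fragment joins a list of counter names with colons and exports the result. The two counter names are examples only (PAPI_L2_TCM is assumed here to be the level 2 cache miss preset), and ./myprog in the comment is a hypothetical instrumented binary. Which counters can actually be combined depends on your hardware.

```shell
#!/bin/sh
# Counter names to record; these two are examples only.
counters="PAPI_FP_OPS PAPI_L2_TCM"

# VT_METRICS expects a colon-separated list, so replace the spaces with ':'.
VT_METRICS=$(printf '%s' "$counters" | tr ' ' ':')
export VT_METRICS

echo "VT_METRICS=$VT_METRICS"
# Afterwards, run the instrumented program as usual, e.g. ./myprog (hypothetical name).
```

Keeping the names in a space-separated variable makes it easy to add or remove a counter without retyping the whole list.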
11. n processes will typically create trace files n times the size of a serial application.

To remove the limit and get a complete trace of an application, set VT_MAX_FLUSHES to 0. This causes VampirTrace to always write the buffer to disk when the buffer is full. To change the size of the buffer, use the variable VT_BUFFER_SIZE. The optimal value for this variable depends on the application that should be traced. Setting a small value will increase the memory that is available to the application but will trigger frequent buffer flushes by VampirTrace. These buffer flushes can significantly change the behavior of the application. On the other hand, setting a large value, like 2G, will minimize buffer flushes by VampirTrace but decrease the memory available to the application. If not enough memory is available to hold the VampirTrace buffer and the application data, this may cause parts of the application to be swapped to disk, leading also to a significant change in the behavior of the application.

3.3 Unification of local Traces

After a run of an instrumented application, the traces of the single processes need to be unified in terms of timestamps and event IDs. In most cases this happens automatically. But under certain circumstances it is necessary to perform unification of local traces manually. To do this, use the command:

    $ vtunify <no-of-traces> <prefix>

For
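The buffer settings discussed at the start of this passage can be sketched in a job script as follows. This is a minimal illustration, not a recommendation: the value 64M is arbitrary, its size-suffix notation is assumed by analogy with the 2G example in the text, and the launch line in the comment uses a hypothetical binary name.

```shell
#!/bin/sh
# Remove the flush limit so that the complete run is recorded.
export VT_MAX_FLUSHES=0

# Trace buffer size per process. "64M" is an arbitrary example value;
# the suffix notation is assumed from the "2G" example in the text.
export VT_BUFFER_SIZE=64M

echo "VT_MAX_FLUSHES=$VT_MAX_FLUSHES VT_BUFFER_SIZE=$VT_BUFFER_SIZE"
# mpirun -np 4 ./myprog   # hypothetical launch of an instrumented binary
```

Because the variables apply to every process, setting and exporting them in the batch script itself helps to keep the values identical on all nodes.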
12. times. The remaining functions will be recorded at most 3000000 times.

Besides creating filter files by hand, you can also use the vtfilter tool to generate them automatically. This tool reads the provided trace and decides whether a function should be filtered or not, based on the evaluation of certain parameters. For more information see Section A.4.

5.2 Function Grouping

VampirTrace allows assigning functions/regions to a group. Groups can, for instance, be highlighted by different colors in Vampir displays. The following standard groups are created by VampirTrace:

    Group name     Contained functions/regions
    MPI            MPI functions
    OMP            OpenMP constructs and functions
    MEM            Memory allocation functions (see 4.2)
    I/O            I/O functions (see 4.3)
    Application    remaining instrumented functions and source code regions

Additionally, you can create your own groups, e.g. to better distinguish different phases of an application. To use function/region grouping, set the environment variable VT_GROUPS_SPEC to the path of a file which contains the group assignments. Below there is an example of how to use group assignments:

    # VampirTrace region groups specification
    #
    # group definitions and region assignments
    #
    # syntax: <group>=<regions>
    #
    #    group      group name
    #    regions    semicolon-separated list of regions
    #               (can be wildcards)

    CALC=add;sub;mul
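A minimal sketch of setting up such a group file from the shell is shown below. The file location and name are hypothetical, and the "=" between group name and region list as well as the wildcard pattern app* are assumptions based on the syntax description and the example in the text.

```shell
#!/bin/sh
# Hypothetical location for the group specification file.
spec="${TMPDIR:-/tmp}/vt_groups.spec"

# One "<group>=<regions>" assignment per line; regions are separated by
# semicolons and may contain wildcards (the '=' separator is an assumption).
cat > "$spec" <<'EOF'
CALC=add;sub;mul;div
USER=app*
EOF

# Point VampirTrace at the file before running the instrumented program.
export VT_GROUPS_SPEC="$spec"
echo "VT_GROUPS_SPEC=$VT_GROUPS_SPEC"
```

The quoted here-document delimiter ('EOF') keeps the shell from expanding the wildcard or any other special characters while the file is written.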
13. to inject additional measurement calls during run time. The tracing part provides the current measurement functionality used by the instrumentation calls. By this means, a variety of detailed performance properties can be collected and recorded during run time. This includes:

• Function call enter and leave events
• MPI communication events
• OpenMP events
• Hardware performance counters
• Various special purpose events

After a successful trace run, VampirTrace writes all collected data to a trace in the Open Trace Format (OTF), see http://www.tu-dresden.de/zih/otf. As a result, the information is available for post-mortem analysis and visualization by various tools. Most notably, VampirTrace provides the input data for the Vampir analysis and visualization tool, see http://www.vampir.eu.

VampirTrace is included in Open MPI 1.3 and later. If not disabled explicitly, VampirTrace is built automatically when installing Open MPI. Refer to http://www.open-mpi.org/faq/?category=vampirtrace for more information.

Trace files can quickly become very large. With automatic instrumentation, even tracing applications that run only for a few seconds can result in trace files of several hundred megabytes. To protect users from creating trace files of several gigabytes, the default behavior of VampirTrace limits the internal buffer to 32 MB. This produces trace files that are not larger than 32 MB per process, typically a lot smaller. Please re
14. Installation

C.1 Basics

Building VampirTrace is typically a combination of running configure and make. Execute the following commands to install VampirTrace from within the directory at the top of the tree:

    ./configure --prefix=/where/to/install
    [...lots of output...]
    make all install

If you need special access for installing, then you can execute make all as a user with write permissions in the build tree and a separate make install as a user with write permissions to the install tree.

However, for more details, also read the following instructions. Sometimes it might be necessary to provide configure with options, e.g. specifications of paths or compilers. Please consult the CONFIG-EXAMPLES file to get an idea of how to configure VampirTrace for your platform.

VampirTrace comes with example programs written in C and Fortran. They can be used to test different instrumentation types of the VampirTrace installation. You can find them in the directory examples of the VampirTrace package.

C.2 Configure Options

Compilers and Options

Some systems require unusual options for compiling or linking that the configure script does not know about. Run configure --help for details on some of the pertinent environment variables.

You can pass initial values for configuration parameters to configure by setting variables in the command line or in the environment. Here is an example:

    ./configure CC=c89 CFLAGS=-O2 LIBS=
15. S VTCC VTCXX VTF77 VTF90 vt vt t verbose t showme showme_compile showme_link mpi parallel omp parallel hyb hybrid parallel default underlying compiler and Enable verbose mode Do not invoke the Instead show the would be executed uses MPI uses OpenMP MPI automatically determining by OpenMP flags underlying compiler command line that Do not invoke the underlying compiler Instead show the compiler flags that would be supplied to the compiler Do not invoke the underlying compiler Instead show the linker flags that would be supplied to the compiler See the man page for your underlying compiler for other options that can be passed through vt cc oxx f77 f90 Environment variables VI CC Equivalent to vt occ VT CXX Equivalent to vt cxx VT F77 Equivalent to vt f77 VT F90 Equivalent to vt f90 VT INST Equivalent to vt inst The corresponding command line options environment variable settings Exampl es overwrite the automatically instrumentation by using manually instrumentation by using VT s API IMPO 22 toc vtilnst tf90 RTANT tcc vt cc gcc vt inst toc vtilinst vt inst manual gnu c foo gnu c bar gnu Fortran source files instrumented by VT s API foo o bar o o GNU compiler 2G fO00 0
16. _USER_END name If a block has several exit points as it is often the case for functions all exit points have to be instrumented vT USER END too For C it is simpler as shown in the following example Only entry points into a scope need to be marked Exit points are detected automatically when deletes scope local variables Crt include vt_user h name For all three languages the instrumented sources have to be compiled with DVTRACE otherwise the calls are ignored Note that Fortran source files instrumented this way have to be preprocessed too In addition you can combine this instrumentation type with all other ones For example all user functions can be instrumented by a compiler while special source code regions e g loops can be instrumented by VT s API Use VT s compiler wrapper described above for compiling and linking the instrumented source code like e Without other instrumentation e g compiler vtcc vt inst manual myprogl c DVTRACE o myprog e combined with compiler instrumentation vtcc vt inst gnu myprogl c DVTRACE o myprog Note that you can also use the option vt inst manual with non instru mented sources Binaries created this way only contain and OpenMP in strumentation which might be desirable in some cases 2 5 Manual Instrumentation using POMP POMP OpenMP Profiling Tool instrumentation dire
17. ad Section 3.2 on how to remove or change the limit.

VampirTrace supports various Unix and Linux platforms common in HPC nowadays. It comes as open source software under a BSD License.

2 Instrumentation

To make measurements with VampirTrace, the user's application program needs to be instrumented, i.e., at specific important points (called "events") VampirTrace measurement calls have to be activated. As an example, common events are entering and leaving of function calls, as well as sending and receiving of MPI messages.

By default, VampirTrace handles this automatically. In order to enable instrumentation of function calls, the user only needs to replace the compiler and linker commands with VampirTrace's wrappers (see Section 2.1 below). VampirTrace supports different ways of instrumentation, as described in Section 2.2.

2.1 The Compiler Wrappers

All the necessary instrumentation of user functions as well as MPI and OpenMP events is handled by VampirTrace's compiler wrappers (vtcc, vtcxx, vtf77, and vtf90). In the script used to build the application (e.g. a makefile), all compile and link commands should be replaced by the VampirTrace compiler wrapper. The wrappers perform the necessary instrumentation of the program and link the suitable VampirTrace library. Note that the VampirTrace version
18. ctives are supported for For tran and C C The main advantage is that by using directives the instrumen tation is ignored during normal compilation 2 6 BINARY INSTRUMENTATION USING DYNINST The INST BEGIN and INST END directives can be used to mark any user defined sequence of statements If this block has several exit points all but the last exit point have to be instrumented by INST ALTEND Fortran POMPS INST BEGIN name d POMPS INST ALTEND name POMPS INST END name C C pragma pomp inst begin name pragma pomp inst altend name pragma pomp inst At least the main program function has to be instrumented in this way and ad ditionally the following must be inserted as the first executable statement of the main program Fortran POMPS INST INIT pragma pomp inst init 2 6 Binary instrumentation using Dyninst The option vt inst dyninst selects the compiler wrapper to instrument the application during run time binary instrumentation by using Dyninst http www dyninst org Recompiling is not necessary for this way of instru menting but relinking as shown 5 vtf90 vt inst dyninst myprogl o myprog2 o o myprog The compiler wrapper dynamically links the library 1ibvt dynatt so to the application This library attaches the Mutator program vtdyn during run time which invokes t
19. div USER app These group assignments make the functions add sub mul and div asso ciated with group CALC and all functions with the prefix app are associated with group USER 20 APPENDIX A COMMAND REFERENCE A Command Reference A 1 Compiler Wrappers vtcc vtcxx vtf77 vtf90 vtcc vtcxx vtf77 vtf90 compiler wrappers for C Fortran 77 Fortran 90 Syntax vt lt cc cxx f 77 90 gt vt lt cc cxx f 77 90 gt lt cmd gt vt inst lt insttype gt vt lt seq mpi omp hyb gt vt opari lt args gt vt verbose vt version vt showme vt showme_compile vt showme_link options vt help Show this help message vt cc cxx f77 90 cmd Set the underlying compiler command vt inst insttype Set the instrumentation type possible values gnu fully automatic by GNU compiler intel Intel version 10 x pgi Portland Group PGI phat SUN Fortran 90 ss ses ftrace 2 NEC SX manual manual by using VampirTrace s API pomp manual by using using POMP INST directives dyninst binary by using Dyninst www dyninst org vt opari args Set options for OPARI command see share vampirtrace doc opari Readme html vt seq mpilomp hyb Force application s parallelization type Necessary if this cannot be determined by underlying compiler and flags Seq sequential 21 A 1 COMPILER WRAPPER
20. he instrumenting by using the Dyninst API. Note that the application should have been compiled with the -g switch in order to have symbol names visible. After a trace run by using this way of instrumenting, the vtunify utility needs to be invoked manually (see Sections 3.3 and A.2).

To prevent certain functions from being instrumented, you can set the environment variable VT_DYN_BLACKLIST to a file containing a newline-separated list of function names. All additional overhead due to instrumentation of these functions will be removed.

VampirTrace also allows binary instrumentation of functions located in shared libraries. Ensure that the shared libraries have been compiled with -g and assign a colon-separated list of their names to the environment variable VT_DYN_SHLIBS, e.g.:

    VT_DYN_SHLIBS=libsupport.so:libmath.so

3 Runtime Measurement

By default, running a VampirTrace-instrumented application should result in an OTF trace file in the current working directory where the application was executed. Use the environment variables VT_FILE_PREFIX and VT_PFORM_GDIR described below to change the name of the trace file and its final location. In case a problem occurs, set the environment variable VT_VERBOSE to yes before executing the instrumented application in order to see control messages of
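The Dyninst-related settings described earlier in this passage could be prepared along the following lines. This is only a sketch: the blacklist file name and the two function names are made up for illustration, and the library names are the ones from the example in the text.

```shell
#!/bin/sh
# Hypothetical blacklist file: one function name per line.
blacklist="${TMPDIR:-/tmp}/vt_dyn_blacklist.txt"
printf '%s\n' helper_init helper_log > "$blacklist"   # example function names

export VT_DYN_BLACKLIST="$blacklist"

# Shared libraries that should be instrumented as well, joined with ':'.
export VT_DYN_SHLIBS="libsupport.so:libmath.so"

echo "VT_DYN_BLACKLIST=$VT_DYN_BLACKLIST"
echo "VT_DYN_SHLIBS=$VT_DYN_SHLIBS"
```

Both variables have to be set in the environment in which the instrumented application is started.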
21. in cluded in Open 1 3 has additional wrappers mpicc vt mpicxx vt mpif77 vt and mpif90 vt which are like the ordinary compiler wrappers mpicc and friends with the extension of automatic instrumentation The following list shows some examples depending on the parallelization type of the program e Serial programs Compiling serial code is the default behavior of the wrap pers Simply replace the compiler by VampirTrace s wrapper original gfortran a f90 b f90 o myprog with instrumentation vt 90 90 b 90 o myprog This will instrument user functions if supported by compiler and link the VampirTrace library e MPI parallel programs MPI instrumentation is always handled by means of the PMPI interface which is part of the MPI standard This requires the compiler wrapper to link with an MPl aware version of the Vampir Trace library If your MPI implementation uses MPI compilers e g mpicc Carer freon Sonne 8 2 1 THE COMPILER WRAPPERS mpxlf90 you need to tell VampirTrace s wrapper to use this compiler in stead of the serial one original mpice hello c o hello with instrumentation vtcc vt cc mpicc hello c o hello MPI implementations without own compilers require the user to link the MPI library manually In this case you simply replace the compiler by Vampir Trace s compiler wrapper original hello c o hello lImpi with instrumentation vtcc hello c o hello 1
22. klist file containing a newline separated list of functions which should not be instrumented application s process id attaches the mutator to a running process path of application executable application s arguments APPENDIX COMMAND REFERENCE A 4 Trace Filter Tool vtfilter vtfilter filter generator for VampirTrace Syntax Filter a trace file using an already existing filter file vtfilter filt filt options lt input trace file gt Generate a filter vtfilter gen gen options input trace file general options h help show this help message p show progress filt options to file output trace file name fi file input filter file nam z lt zlevel gt Set the compression level Level reaches from 0 to 9 where 0 is no compression and 9 is the highest level Standard is 4 f n Set max number of file handles available Standard is 256 gen options fo file output filter file name r n Reduce the trace size to n percent of the original size The program relies on the fact that the major part of the trace are function calls The approximation of size will get worse with a rising percentage of communication and other non function calling or performance counter records n Limit the number of accepted function calls for filtered functions to n Standard is O0
23. lation or source code access but relies on the underlying system library. If VampirTrace has been built with memory tracing support enabled (see Section C), VampirTrace is capable of recording memory allocation information as part of the event records. To request the measurement of the application's allocated memory, the user must set the environment variable VT_MEMTRACE to yes.

Note: This approach to get memory allocation information requires changing internal function pointers in a non-thread-safe way, so VampirTrace doesn't support memory tracing for OpenMP-parallelized programs.

4.3 Application I/O Calls

Calls to functions which reside in external libraries can be intercepted by implementing identical functions and linking them before the external library. Such "wrapper functions" can record the parameters and return values of the library functions. If VampirTrace has been built with I/O tracing support, it uses this technique for recording calls to I/O functions of the standard C library which are executed by the application. The following functions are intercepted by VampirTrace:

    open       read        fdopen      fread
    open64     write       fopen       fwrite
    creat      readv       fopen64     fgetc
    creat64    writev      fclose      getc
    close      pread       fseek       fputc
    dup        pwrite      fseeko      putc
    dup2       pread64     fseeko64    fgets
    lseek      pwrite64    rewind      fputs
    lseek64                fsetpos     fscanf
                           fsetpos64   fprintf

The gathered information will be saved as
24. lib 31 Cee for Senden C 3 CROSS COMPILATION with mpi 1lib use given mpi lib with pmpi 1ib use given pmpi lib If your system does not have an MPI Fortran library set nable fmpi lib see above otherwise set with fmpi 1ib use given fmpi lib C 3 Cross Compilation Building VampirTrace on cross compilation platforms needs some special atten tion The compiler wrappers and OPARI are built for the front end build system whereas the VampirTrace libraries vtdyn vtunify and vtfilter are built for the back end host system Some configure options which are of interest for cross compilation are shown below e Set CC CXX F77 and FC to the cross compilers installed on the front end e Set CXX FOR BUILD to the native compiler of the front end used to com pile compiler wrappers and OPARI only e Set host to the output of config guess on the back end Maybe you also need to set additional commands and flags for the back end e g RANLIB AR MPICC CXXFLAGS For example this configure command line works for an NEC SX6 system with an X86 64 based front end configure 77 5 90 FC sxf90 MPICC sxmpicc AR sxar RANLIB sxar st CXX_FOR_BUILD c host sx6 nec superuxl4 1 with otf lib lotf C 4 Environment Set Up Add the bin subdirectory of the installation directory to your PATH environment variable
25. nn like nm myprog gt myprog nm Note that the output format of nm must be written in BSD style See the manual page of nm for getting help about the output format setting Notes on instrumentation of inline functions Compilers have different be haviors when automatically instrumenting inlined functions By default the GNU and Intel gt 10 0 compilers instrument all functions when used with VampirTrace Thus they switch off inlining completely regardless of the optimization level cho sen By appending the following attribute to function declarations one can pre vent these particular functions from being instrumented making them able to be inlined attribute __no_instrument_function__ The PGI and IBM compilers prefer inlining over instrumentation when compil ing with inlining enabled Thus one needs to disable inlining to enable instru mentation of inline functions and vice versa The bottom line is that you cannot inline and instrument a function at the same time For more information on how to inline functions read your compiler s man ual 2 4 Manual Instrumentation using the VampirTrace API The VT USER START VT USER END instrumentation calls can be used to mark any user defined sequence of statements Fortran include vt user inc VT USER START name VT USER END name CHAPTER 2 INSTRUMENTATION include vt_user h VT_USER_START name VT
26. piler flags. Use the option -vt:verbose to see the command line the compiler wrapper executes. Refer to Appendix A.1 for a list of all compiler wrapper options.

The default settings of the compiler wrappers can be modified in the file share/vampirtrace/vtcc-wrapper-data.txt (and similar for the other languages) in the installation directory of VampirTrace. The settings include compilers, compiler flags, libraries, and instrumentation types. For example, you could change the default C compiler from gcc to mpicc by changing the line compiler=gcc to compiler=mpicc. This may be convenient if you instrument MPI parallel programs only.

2.2 Instrumentation Types

The wrapper's option -vt:inst <insttype> specifies the instrumentation type to use. The following values for <insttype> are possible:

- fully automatic instrumentation by the compiler (see Section 2.3):

    insttype   Compilers
    gnu        GNU (e.g. gcc, g++, gfortran, g95)
    intel      Intel version >= 10.0 (e.g. icc, icpc, ifort)
    pgi        Portland Group (e.g. pgcc, pgCC, pgf90, pgf77)
    phat       SUN Fortran 90 (e.g. cc, CC, f90)
    xl         IBM (e.g. xlCC, xlf90)
    ftrace     NEC SX (e.g. sxcc, sxc++, sxf90)

- manual instrumentation (needs source-code modifications):

    insttype
    manual     VampirTrace's API (see Section 2.4)
    pomp       POMP INST directives (see Section 2.5)

- special instrumentation types (use external tools):

    insttype
    dynin
27. pport, default: enable if found by configure

--enable-fmpi-lib
    build the MPI Fortran support library, in case your system does not have an MPI Fortran library
    default: enable if no MPI Fortran library found by configure

Important Optional Packages:

--with-local-tmp-dir=LTMPDIR
    give the path for the node-local temporary directory to store local traces to
    default: /tmp

If you would like to use an external version of the OTF library, set:

--with-extern-otf
    use external OTF library
    default: not set
--with-extern-otf-dir=OTFDIR
    give the path for OTF
    default: /usr/local
--with-otf-flags=FLAGS
    pass FLAGS to the OTF distribution configuration (only for internal OTF version)
--with-otf-lib=OTFLIB
    use given otf lib
    default: -lotf -lz
    If the OTF library used was built without zlib support, then OTFLIB will be set to -lotf.
--with-dyninst-dir=DYNIDIR
    give the path for DYNINST
    default: /usr/local
--with-papi-dir=PAPIDIR
    give the path for PAPI
    default: /usr

If you have not specified the environment variable MPICC (MPI compiler command), use the following options to set the location of your MPI installation:

--with-mpi-dir=MPIDIR
    give the path for MPI
    default: /usr
--with-mpi-inc-dir=MPIINCDIR
    give the path for MPI include files
    default: $MPIDIR/include
--with-mpi-lib-dir=MPILIBDIR
    give the path for MPI libraries
    default: $MPIDIR
28. r snoop

    PAPI_CA_SHR    Requests for exclusive access to shared cache line
    PAPI_CA_CLN    Requests for exclusive access to clean cache line
    PAPI_CA_INV    Requests for cache line invalidation
    PAPI_CA_ITV    Requests for cache line intervention
    PAPI_BRU_IDL   Cycles branch units are idle
    PAPI_FXU_IDL   Cycles integer units are idle
    PAPI_FPU_IDL   Cycles floating point units are idle
    PAPI_LSU_IDL   Cycles load/store units are idle
    PAPI_TLB_DM    Data translation lookaside buffer misses
    PAPI_TLB_IM    Instruction translation lookaside buffer misses
    PAPI_TLB_TL    Total translation lookaside buffer misses
    PAPI_BTAC_M    Branch target address cache misses
    PAPI_PRF_DM    Data prefetch cache misses
    PAPI_TLB_SD    Translation lookaside buffer shootdowns
    PAPI_CSR_FAL   Failed store conditional instructions
    PAPI_CSR_SUC   Successful store conditional instructions
    PAPI_CSR_TOT   Total store conditional instructions
    PAPI_MEM_SCY   Cycles stalled waiting for memory accesses
    PAPI_MEM_RCY   Cycles stalled waiting for memory reads
    PAPI_MEM_WCY   Cycles stalled waiting for memory writes
29. s
    4.3 Application I/O Calls
    4.4 User Defined Counters
5 Filtering & Grouping
    5.1 Function Filtering
    5.2 Function Grouping
A Command Reference
    A.1 Compiler Wrappers (vtcc, vtcxx, vtf77, vtf90)
    A.2 Local Trace Unifier (vtunify)
    A.3 Dyninst Mutator (vtdyn)
    A.4 Trace Filter Tool (vtfilter)
B PAPI Counter Specifications
C VampirTrace Installation
    C.1 Basics
    C.2 Configure Options
    C.3 Cross Compilation
    C.4 Environment Set-Up
    C.5 Notes for

This documentation describes how to prepare application programs in order to have traces generated when executed. This step is called instrumentation. Furthermore, it explains how to control the run-time measurement system during execution (tracing). This also includes hardware performance counter sampling, as well as selective filtering and grouping of functions.

1 Introduction

VampirTrace consists of a tool set and a run-time library for instrumentation and tracing of software applications. It is particularly tailored towards parallel and distributed High Performance Computing (HPC) applications. The instrumentation part modifies a given application in order
30. st     binary instrumentation with Dyninst (Section 2.6)

To determine which instrumentation type will be used by default and which others are available on your system, take a look at the entry inst_avail in the wrapper's configuration file (e.g. share/vampirtrace/vtcc-wrapper-data.txt in the installation directory of VampirTrace for the C compiler wrapper).

See Appendix A.1 or type vtcc -vt:help for other options that can be passed through VampirTrace's compiler wrapper.

2.3 Automatic Instrumentation

Automatic instrumentation is the most convenient way to instrument your program. Simply use the compiler wrappers without any parameters, e.g.:

    % vtf90 myprog1.f90 myprog2.f90 -o myprog

Important notes for using the GNU or Intel (>= 10.0) compiler: Both need the library BFD for getting symbol information of the running application executable. This library is part of the GNU Binutils, which can be downloaded from http://www.gnu.org/software/binutils.

To get the application executable for BFD at run time, VampirTrace uses the /proc file system, which is available on Linux. On non-Linux operating systems (e.g. MacOS), it is necessary to set the environment variable VT_APPPATH to the application executable. If there are any problems getting symbol information by using BFD, then the environment variable VT_NMFILE can be set to a symbol list file, which is created with the command
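The symbol-list fallback mentioned above could be set up along the following lines. This is only a sketch: it assumes GNU Binutils nm (whose --format=bsd option selects BSD-style output), and the executable name myprog is a hypothetical placeholder.

```shell
# Hypothetical executable name (assumption); replace with your program.
APP=./myprog

# Only proceed if nm is installed and the executable exists.
if command -v nm >/dev/null 2>&1 && [ -f "$APP" ]; then
    # Write the symbol list in BSD style next to the executable.
    nm --format=bsd "$APP" > "$APP.nm"

    # Tell VampirTrace to read symbols from this file instead of using BFD.
    VT_NMFILE="$APP.nm"
    export VT_NMFILE
fi
```

The symbol list has to be regenerated whenever the executable is rebuilt, since it reflects the symbols of one particular binary.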
31. the VampirTrace run-time system, which might help tracking down the problem.

The internal buffer of VampirTrace is limited to 32 MB. Use the environment variables VT_BUFFER_SIZE and VT_MAX_FLUSHES to increase this limit. Section 3.2 contains further information on influencing trace file size.

3.1 Environment Variables

The following environment variables can be used to control the measurement of a VampirTrace-instrumented executable:

VT_PFORM_GDIR
    Name of global directory to store final trace file in.
    Default: ./
VT_PFORM_LDIR
    Name of node-local directory that can be used to store temporary trace files.
    Default: /tmp
VT_FILE_PREFIX
    Prefix used for trace filenames.
    Default: a
VT_APPPATH
    Path to the application executable.
VT_BUFFER_SIZE
    Size of internal event trace buffer. This is the place where event records are stored before being written to a file.
    Default: 32M
VT_MAX_FLUSHES
    Maximum number of buffer flushes.
    Default: 1
VT_VERBOSE
    Print VampirTrace-related control information during measurement.
    Default: no
VT_METRICS
    Specify counter metrics to be recorded with trace events as a colon-separated list of names (for details see Appendix B).
VT_MEMTRACE
    Enable memory allocation counters (see Sec. 4.2).
VT_IOTRACE
    Enable tracing of application I/O calls (see Sec. 4.3).
VT_MPITRACE
    Enable tracing of MPI events.
    Default: yes
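As an illustration of how these variables might be set before a measurement run, here is a hedged sketch for a Bourne-compatible shell. Every value is an illustrative assumption rather than a recommendation, the directory and prefix names are hypothetical, and the two PAPI counters are only examples whose availability depends on the platform.

```shell
# Illustrative values only (assumptions); adjust to your application.
VT_BUFFER_SIZE=64M                  # larger event buffer than the 32M default
VT_MAX_FLUSHES=4                    # permit up to four buffer flushes
VT_PFORM_GDIR="$PWD/traces"         # hypothetical directory for the final trace
VT_FILE_PREFIX=myprog               # hypothetical trace file prefix
VT_METRICS=PAPI_FP_OPS:PAPI_L2_TCM  # colon-separated list of counter names
VT_VERBOSE=yes                      # print control information during the run
export VT_BUFFER_SIZE VT_MAX_FLUSHES VT_PFORM_GDIR VT_FILE_PREFIX
export VT_METRICS VT_VERBOSE

# Make sure the trace directory exists before the run starts.
mkdir -p "$VT_PFORM_GDIR"

# The instrumented executable would be started here, e.g. ./myprog
```

Because the settings are plain environment variables, they can be changed between runs without recompiling the instrumented program.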
32. to be compiled with -DVTRACE. Otherwise the VT_* calls are ignored.

If, additionally, any functions or regions are manually instrumented by VT's API (see Section 2.4) and only the instrumentation calls for user-defined counters should be disabled, then the sources have to be compiled with -DVTRACE_NO_COUNT, too.

5 Filtering & Grouping

5.1 Function Filtering

By default, all calls of instrumented functions will be traced, so that the resulting trace files can easily become very large. In order to decrease the size of a trace, VampirTrace allows the specification of filter directives before running an instrumented application. The user can decide how often an instrumented function/region is to be recorded to a trace file. To use a filter, the environment variable VT_FILTER_SPEC needs to be defined. It should contain the path and name of a file with filter directives.

Below is an example of a file containing filter directives:

    # VampirTrace region filter specification
    #
    # call limit definitions and region assignments
    #
    # syntax: <regions> -- <limit>
    #
    #   regions   semicolon-separated list of regions
    #             (can be wildcards)
    #   limit     assigned call limit
    #              0 = region(s) denied
    #             -1 = unlimited
    #
    add;sub;mul;div -- 1000
    * -- 3000000

These region filter directives cause the functions add, sub, mul, and div to be recorded at most 1000
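A filter file like the one in this section could be created and activated from the shell roughly as follows. This is a sketch under stated assumptions: the file name myfilter.spec is hypothetical, and the `--` separator and `#` comment lines reflect a reading of the flattened example rather than a confirmed grammar.

```shell
# Write the filter directives; the file name is a hypothetical choice.
cat > myfilter.spec <<'EOF'
# syntax: <regions> -- <limit>   (0 = denied, -1 = unlimited)
add;sub;mul;div -- 1000
* -- 3000000
EOF

# Point VampirTrace at the filter file before starting the program.
VT_FILTER_SPEC="$PWD/myfilter.spec"
export VT_FILTER_SPEC
```

An absolute path is used so that the filter is still found if the instrumented program is started from a different working directory.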
33. unters, the corresponding instrumentation call VT_COUNT_*_VAL must be invoked.

Fortran:

    Type                     Count call               Data type
    VT_COUNT_TYPE_INTEGER    VT_COUNT_INTEGER_VAL     integer (4 byte)
    VT_COUNT_TYPE_INTEGER8   VT_COUNT_INTEGER8_VAL    integer (8 byte)
    VT_COUNT_TYPE_REAL       VT_COUNT_REAL_VAL        real
    VT_COUNT_TYPE_DOUBLE     VT_COUNT_DOUBLE_VAL      double precision

C/C++:

    Type                     Count call               Data type
    VT_COUNT_TYPE_SIGNED     VT_COUNT_SIGNED_VAL      signed int (max. 64-bit)
    VT_COUNT_TYPE_UNSIGNED   VT_COUNT_UNSIGNED_VAL    unsigned int (max. 64-bit)
    VT_COUNT_TYPE_FLOAT      VT_COUNT_FLOAT_VAL       float
    VT_COUNT_TYPE_DOUBLE     VT_COUNT_DOUBLE_VAL      double

The following example records the loop index i:

Fortran:

    #include "vt_user.inc"

    program main
      integer :: i, cid, cgid

      VT_COUNT_GROUP_DEF('loopindex', cgid)
      VT_COUNT_DEF('i', '#', VT_COUNT_TYPE_INTEGER, cgid, cid)

      do i = 1, 100
        VT_COUNT_INTEGER_VAL(cid, i)
      end do
    end program main

C/C++:

    #include "vt_user.h"

    int main() {
      unsigned int i, cid, cgid;

      cgid = VT_COUNT_GROUP_DEF("loopindex");
      cid = VT_COUNT_DEF("i", "#", VT_COUNT_TYPE_UNSIGNED, cgid);

      for (i = 1; i <= 100; i++) {
        VT_COUNT_UNSIGNED_VAL(cid, i);
      }

      return 0;
    }

For all three languages, the instrumented sources have