CUPTI User's Guide

Contents

1. CUPTI_METRIC_ATTR_CATEGORY attribute.
   Enumerator:
     CUPTI_METRIC_CATEGORY_MEMORY: A memory-related metric.
     CUPTI_METRIC_CATEGORY_INSTRUCTION: An instruction-related metric.
     CUPTI_METRIC_CATEGORY_MULTIPROCESSOR: A multiprocessor-related metric.
     CUPTI_METRIC_CATEGORY_CACHE: A cache-related metric.
     CUPTI_METRIC_CATEGORY_TEXTURE: A texture-related metric.

   enum CUpti_MetricEvaluationMode
   A metric can be evaluated per hardware instance, to show the load balancing across instances of a domain, or in aggregate mode, when the events involved in the metric evaluation come from different event domains. Some metrics may support both modes for convenience. A metric's evaluation mode is accessed using CUpti_MetricEvaluationMode and the CUPTI_METRIC_ATTR_EVALUATION_MODE attribute.
   Enumerator:
     CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE: If the metric evaluation mode is per instance, the event value passed to cuptiMetricGetValue() should contain the value for one instance of the domain. In this mode cuptiMetricGetValue() should also be called for all available instances of the domain to get the overall status.
     (CUDA Toolkit CUPTI User's Guide, DA-05679-001_v01)
     CUPTI_METRIC_EVALUATION_MODE_AGGREGATE: If the metric evaluation mode is aggregate, the event value passed to cuptiMetricGetValue() should be the aggregated value of an event for all instances of the domain. In this mode cuptiMetricGetValue() should be c
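The practical difference between the two evaluation modes is which event value is handed to cuptiMetricGetValue(). The sketch below is a minimal illustration in C, assuming that "aggregated" means a plain sum over the instances of a domain; `aggregate_event_value` is a hypothetical helper and not part of CUPTI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CUPTI function): sums the per-instance values
 * of one event. Assumption: the aggregated event value is a plain sum. */
uint64_t aggregate_event_value(const uint64_t *perInstance, size_t numInstances)
{
    uint64_t total = 0;
    for (size_t i = 0; i < numInstances; ++i) {
        total += perInstance[i];
    }
    return total;
}
```

In per-instance mode each element of `perInstance` would be passed on its own, one call per instance; in aggregate mode only the total would be passed.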
2. enum CUpti_EventCollectionMode {
     CUPTI_EVENT_COLLECTION_MODE_CONTINUOUS = 0,
     CUPTI_EVENT_COLLECTION_MODE_KERNEL = 1 }
   Event collection modes.

   enum CUpti_EventDomainAttribute {
     CUPTI_EVENT_DOMAIN_ATTR_NAME = 0,
     CUPTI_EVENT_DOMAIN_ATTR_INSTANCE_COUNT = 1,
     CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT = 3 }
   Event domain attributes.

   enum CUpti_EventGroupAttribute {
     CUPTI_EVENT_GROUP_ATTR_EVENT_DOMAIN_ID = 0,
     CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES = 1,
     CUPTI_EVENT_GROUP_ATTR_USER_DATA = 2,
     CUPTI_EVENT_GROUP_ATTR_NUM_EVENTS = 3,
     CUPTI_EVENT_GROUP_ATTR_EVENTS = 4,
     CUPTI_EVENT_GROUP_ATTR_INSTANCE_COUNT = 5 }
   Event group attributes.

   enum CUpti_ReadEventFlags { CUPTI_EVENT_READ_FLAG_NONE = 0 }
   Flags for cuptiEventGroupReadEvent and cuptiEventGroupReadAllEvents.

   Functions

   CUptiResult cuptiDeviceEnumEventDomains(CUdevice device, size_t *arraySizeBytes, CUpti_EventDomainID *domainArray)
   Get the event domains for a device.

   CUptiResult cuptiDeviceGetAttribute(CUdevice device, CUpti_DeviceAttribute attrib, size_t *valueSize, void *value)
   Read a device attribute.

   CUptiResult cuptiDeviceGetEventDomainAttribute(CUdevice device, CUpti_EventDomainID eventDomain, CUpti_EventDomainAttribute attrib, size_t *valueSize, void *value)
   Read an event domain attribute.

   CUptiResult cuptiDeviceGetNumEventDomains(CUdevice device, uint32_t
3. Note: Only a single subscriber can be registered at a time.
   This function does not enable any callbacks.
   Thread safety: this function is thread safe.
   Parameters:
     subscriber: Returns handle to the initialized subscriber.
     callback: The callback function.
     userdata: A pointer to user data. This data will be passed to the callback function via the userdata parameter.
   Return values:
     CUPTI_SUCCESS: on success
     CUPTI_ERROR_NOT_INITIALIZED: if unable to initialize CUPTI
     CUPTI_ERROR_MAX_LIMIT_REACHED: if there is already a CUPTI subscriber
     CUPTI_ERROR_INVALID_PARAMETER: if subscriber is NULL

   CUptiResult cuptiSupportedDomains(size_t *domainCount, CUpti_DomainTable *domainTable)
   Returns in *domainTable an array of size *domainCount of all the available callback domains.
   Note: Thread safety: this function is thread safe.
   Parameters:
     domainCount: Returns number of callback domains.
     domainTable: Returns pointer to array of available callback domains.
   Return values:
     CUPTI_SUCCESS: on success
     CUPTI_ERROR_NOT_INITIALIZED: if unable to initialize CUPTI
     CUPTI_ERROR_INVALID_PARAMETER: if domainCount or domainTable are NULL

   CUptiResult cuptiUnsubscribe(CUpti_SubscriberHandle subscriber)
   Removes a callback subscriber so that no future callbacks will be issued to that subscriber.
   Note: Thread safety: this function is thread safe.
   Para
4. attrib is not a metric attribute
     CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT: For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value.

   CUptiResult cuptiMetricGetIdFromName(CUdevice device, const char *metricName, CUpti_MetricID *metric)
   Find a metric by name and return the metric ID in *metric.
   Parameters:
     device: The CUDA device.
     metricName: The name of the metric to find.
     metric: Returns the ID of the found metric, or undefined if unable to find the metric.
   Return values:
     CUPTI_SUCCESS
     CUPTI_ERROR_NOT_INITIALIZED
     CUPTI_ERROR_INVALID_DEVICE
     CUPTI_ERROR_INVALID_METRIC_NAME: if unable to find a metric with name metricName. In this case *metric is undefined.
     CUPTI_ERROR_INVALID_PARAMETER: if metricName or metric are NULL

   CUptiResult cuptiMetricGetNumEvents(CUpti_MetricID metric, uint32_t *numEvents)
   Returns in *numEvents the number of events that are required to calculate a metric.
   Parameters:
     metric: ID of the metric.
     numEvents: Returns the number of events required for the metric.
   Return values:
     CUPTI_SUCCESS
     CUPTI_ERROR_NOT_INITIALIZED
     CUPTI_ERROR_INVALID_METRIC_ID
     CUPTI_ERROR_INVALID_PARAMETER: if numEvents is NULL

   CUptiResult cuptiMetricGetValue(CUdevice device, CUpti_MetricID metric, size_t eventIdArraySizeBytes, CUpti_EventID *eventIdArray, size_t eventValueArraySizeBytes, uint64_t
5. bufferSizeBytes: The size of the buffer, in bytes. The size of the buffer must be at least 1024 bytes.
   Return values:
     CUPTI_SUCCESS
     CUPTI_ERROR_NOT_INITIALIZED
     CUPTI_ERROR_INVALID_PARAMETER: if buffer is NULL, does not have alignment of at least 8 bytes, or is not at least 1024 bytes in size

   CUptiResult cuptiActivityGetNextRecord(uint8_t *buffer, size_t validBufferSizeBytes, CUpti_Activity **record)
   This is a helper function to iterate over the activity records in a buffer. A buffer of activity records is typically obtained by using the cuptiActivityDequeueBuffer() function. An example of typical usage:

       CUpti_Activity *record = NULL;
       CUptiResult status = CUPTI_SUCCESS;
       do {
           status = cuptiActivityGetNextRecord(buffer, validSize, &record);
           if (status == CUPTI_SUCCESS) {
               // Use record here...
           }
           else if (status == CUPTI_ERROR_MAX_LIMIT_REACHED)
               break;  // no more records in the buffer
           else {
               goto Error;
           }
       } while (1);

   Parameters:
     buffer: The buffer containing activity records.
     record: Inputs the previous record returned by cuptiActivityGetNextRecord and returns the next activity record from the buffer. If the input value is NULL, returns the first activity record in the buffer.
     validBufferSizeBytes: The number of valid bytes in the buffer.
   Return values:
     CUPTI_SUCCESS
     CUPTI_ERROR_NOT_INITIALIZED
     CUPTI_ERROR_MAX_LIMIT_REACHED: if no more records in the buffer
     CUPTI_ERROR_INVALID_PARAMETER: if buffer
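The buffer constraints listed at the top of this excerpt (non-NULL, alignment of at least 8 bytes, size of at least 1024 bytes) can be checked before a buffer is handed to CUPTI. A minimal sketch in C follows; `activity_buffer_ok` and the two macros are hypothetical names chosen for illustration, not CUPTI symbols.

```c
#include <stddef.h>
#include <stdint.h>

#define MIN_ACTIVITY_BUFFER_BYTES 1024u /* minimum size stated in the text */
#define ACTIVITY_BUFFER_ALIGNMENT 8u    /* minimum alignment stated in the text */

/* Hypothetical pre-check (not a CUPTI function): returns 1 when the buffer
 * satisfies the documented constraints, 0 otherwise. */
int activity_buffer_ok(const uint8_t *buffer, size_t sizeBytes)
{
    if (buffer == NULL) {
        return 0;
    }
    if (((uintptr_t)buffer % ACTIVITY_BUFFER_ALIGNMENT) != 0) {
        return 0;
    }
    return sizeBytes >= MIN_ACTIVITY_BUFFER_BYTES;
}
```

A check like this only mirrors the conditions under which the text says CUPTI_ERROR_INVALID_PARAMETER is returned; the library performs its own validation regardless.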
6. group must belong to the same domain.

   typedef uint32_t CUpti_EventID
   An event represents a countable activity, action, or occurrence on the device.

   Enumeration Type Documentation

   enum CUpti_DeviceAttribute
   CUPTI device attributes. These attributes can be read using cuptiDeviceGetAttribute.
   Enumerator:
     CUPTI_DEVICE_ATTR_MAX_EVENT_ID: Number of event IDs for a device. Value is a uint32_t.
     CUPTI_DEVICE_ATTR_MAX_EVENT_DOMAIN_ID: Number of event domain IDs for a device. Value is a uint32_t.
     CUPTI_DEVICE_ATTR_GLOBAL_MEMORY_BANDWIDTH: Get global memory bandwidth in Kbytes/sec. Value is a uint64_t.
     CUPTI_DEVICE_ATTR_INSTRUCTION_PER_CYCLE: Get theoretical instructions per cycle. Value is a uint32_t.
     CUPTI_DEVICE_ATTR_INSTRUCTION_THROUGHPUT_SINGLE_PRECISION: Get theoretical number of single-precision instructions that can be executed per second. Value is a uint64_t.

   enum CUpti_EventAttribute
   Event attributes. These attributes can be read using cuptiEventGetAttribute.
   Enumerator:
     CUPTI_EVENT_ATTR_NAME: Event name. Value is a null-terminated const c-string.
     CUPTI_EVENT_ATTR_SHORT_DESCRIPTION: Short description of event. Value is a null-terminated const c-string.
     CUPTI_EVENT_ATTR_LONG_DESCRIPTION: Long description of event. Value is a null-terminated const c-string.
     CUPTI_EVENT_ATTR_CATEGORY: Category of event. Value is CUpti_EventCategory.

   enum CUpti_EventCatego
7. number of passes required to collect the events and the events to collect on each pass.
   Return values:
     CUPTI_SUCCESS
     CUPTI_ERROR_NOT_INITIALIZED
     CUPTI_ERROR_INVALID_CONTEXT
     CUPTI_ERROR_INVALID_EVENT_ID
     CUPTI_ERROR_INVALID_PARAMETER: if eventIdArray or eventGroupPasses is NULL

   CUptiResult cuptiEventGroupSetsDestroy(CUpti_EventGroupSets *eventGroupSets)
   Destroy a CUpti_EventGroupSets object.
   Note: Thread safety: this function is thread safe.
   Parameters:
     eventGroupSets: The object to destroy.
   Return values:
     CUPTI_SUCCESS
     CUPTI_ERROR_NOT_INITIALIZED
     CUPTI_ERROR_INVALID_OPERATION: if any of the event groups contained in the sets is enabled
     CUPTI_ERROR_INVALID_PARAMETER: if eventGroupSets is NULL

   CUptiResult cuptiGetNumEventDomains(uint32_t *numDomains)
   Returns the total number of event domains available on any CUDA-capable device.
   Note: Thread safety: this function is thread safe.
   Parameters:
     numDomains: Returns the number of domains.
   Return values:
     CUPTI_SUCCESS
     CUPTI_ERROR_INVALID_PARAMETER: if numDomains is NULL

   CUptiResult cuptiSetEventCollectionMode(CUcontext context, CUpti_EventCollectionMode mode)
   Set the event collection mode for a context. The mode controls the event collection behavior of all events in event groups created in the context.
   Note: Thread safety: this function is thread safe.
   Parameters:
     context: The context.
     mode: The event co
8. *numDomains)
   Get the number of domains for a device.

   CUptiResult cuptiDeviceGetTimestamp(CUcontext context, uint64_t *timestamp)
   Read a device timestamp.

   CUptiResult cuptiEnumEventDomains(size_t *arraySizeBytes, CUpti_EventDomainID *domainArray)
   Get the event domains available on any device.

   CUptiResult cuptiEventDomainEnumEvents(CUpti_EventDomainID eventDomain, size_t *arraySizeBytes, CUpti_EventID *eventArray)
   Get the events in a domain.

   CUptiResult cuptiEventDomainGetAttribute(CUpti_EventDomainID eventDomain, CUpti_EventDomainAttribute attrib, size_t *valueSize, void *value)
   Read an event domain attribute.

   CUptiResult cuptiEventDomainGetNumEvents(CUpti_EventDomainID eventDomain, uint32_t *numEvents)
   Get number of events in a domain.

   CUptiResult cuptiEventGetAttribute(CUpti_EventID event, CUpti_EventAttribute attrib, size_t *valueSize, void *value)
   Get an event attribute.

   CUptiResult cuptiEventGetIdFromName(CUdevice device, const char *eventName, CUpti_EventID *event)
   Find an event by name.

   CUptiResult cuptiEventGroupAddEvent(CUpti_EventGroup eventGroup, CUpti_EventID event)
   Add an event to an event group.

   CUptiResult cuptiEventGroupCreate(CUcontext context, CUpti_EventGroup *eventGroup, uint32_t flags)
   Create a new event group for a context.

   CUptiResult cuptiEventGroupDestroy(CUpti_EventGroup eventGro
9. The context for which activity is to be enabled.
     kind: The kind of activity record to collect.
   Return values:
     CUPTI_SUCCESS
     CUPTI_ERROR_NOT_INITIALIZED
     CUPTI_ERROR_NOT_COMPATIBLE: if the activity kind cannot be enabled

   CUptiResult cuptiActivityEnqueueBuffer(CUcontext context, uint32_t streamId, uint8_t *buffer, size_t bufferSizeBytes)
   Queue a buffer for activity record collection. Calling this function transfers ownership of the buffer to CUPTI. The buffer should not be accessed or modified until ownership is regained by calling cuptiActivityDequeueBuffer().
   There are three types of queues:
     Global Queue: The global queue collects all activity records that are not associated with a valid context. All device and API activity records are collected in the global queue. A buffer is enqueued in the global queue by specifying context == NULL.
     Context Queue: Each context queue collects activity records associated with that context that are not associated with a specific stream or that are associated with the default stream. A buffer is enqueued in a context queue by specifying the context and a streamId of 0.
     Stream Queue: Each stream queue collects memcpy, memset, and kernel activity records associated with the stream. A buffer is enqueued in a stream queue by specifying a context and a non-zero stream ID.
   Multiple buffers can be enqueued on each queue, and buffers can be enqueued on multip
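The three queue types are selected purely by the (context, streamId) pair passed to cuptiActivityEnqueueBuffer(). The selection rule can be sketched in C as follows; `QueueKind` and `select_queue` are hypothetical names used only for illustration, and the context is represented by an opaque pointer rather than a real CUcontext.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the three queue types described in the text. */
typedef enum { QUEUE_GLOBAL, QUEUE_CONTEXT, QUEUE_STREAM } QueueKind;

/* Mirrors the rule in the text: a NULL context selects the global queue,
 * a context with streamId 0 selects that context's queue, and a context
 * with a non-zero streamId selects a stream queue. */
QueueKind select_queue(const void *context, uint32_t streamId)
{
    if (context == NULL) {
        return QUEUE_GLOBAL;
    }
    return (streamId == 0) ? QUEUE_CONTEXT : QUEUE_STREAM;
}
```

The same pair identifies the queue in the other queue-related calls, such as cuptiActivityQueryBuffer() and cuptiActivityGetNumDroppedRecords().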
10. activity record providing a marker, which is an instantaneous point in time.
    struct CUpti_ActivityMarkerData: The activity record providing detailed information for a marker.
    struct CUpti_ActivityMemcpy: The activity record for memory copies.
    struct CUpti_ActivityMemset: The activity record for memset.
    struct CUpti_ActivityMetric: The activity record for a CUPTI metric.
    struct CUpti_ActivityName: The activity record providing a name.
    union CUpti_ActivityObjectKindId: Identifiers for object kinds as specified by CUpti_ActivityObjectKind.
    struct CUpti_ActivityOverhead: The activity record for CUPTI and driver overheads.
    struct CUpti_ActivitySourceLocator: The activity record for source locator.

    Defines

    #define CUPTI_SOURCE_LOCATOR_ID_UNKNOWN 0

    Enumerations

    enum CUpti_ActivityComputeApiKind {
      CUPTI_ACTIVITY_COMPUTE_API_UNKNOWN = 0,
      CUPTI_ACTIVITY_COMPUTE_CUDA = 1 }
    The kind of a compute API.

    enum CUpti_ActivityFlag {
      CUPTI_ACTIVITY_FLAG_NONE = 0,
      CUPTI_ACTIVITY_FLAG_DEVICE_CONCURRENT_KERNELS = 1 << 0,
      CUPTI_ACTIVITY_FLAG_MEMCPY_ASYNC = 1 << 0,
      CUPTI_ACTIVITY_FLAG_MARKER_INSTANTANEOUS = 1 << 0,
      CUPTI_ACTIVITY_FLAG_MARKER_START = 1 << 1,
      CUPTI_ACTIVITY_FLAG_MARKER_END = 1 << 2,
      CUPTI_ACTIVITY_FLAG_MARKER_COLOR_NONE = 1 << 0,
      CUPTI_ACTIVITY
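Because the activity flags are bit masks, and several of them share the same bit (the meaning depends on the kind of record that carries the flag), they are tested with a bitwise AND rather than compared for equality. A minimal sketch in C, using stand-in constants that copy the marker values from the enumeration above (real code would take them from the CUPTI headers):

```c
#include <stdint.h>

/* Stand-in constants copying the marker flag values listed in the text. */
enum {
    MARKER_FLAG_INSTANTANEOUS = 1 << 0,
    MARKER_FLAG_START         = 1 << 1,
    MARKER_FLAG_END           = 1 << 2
};

/* Hypothetical helpers: each returns 1 if the given bit is set in the
 * flags field of a marker record, 0 otherwise. */
int marker_is_instantaneous(uint32_t flags) { return (flags & MARKER_FLAG_INSTANTANEOUS) != 0; }
int marker_is_start(uint32_t flags)         { return (flags & MARKER_FLAG_START) != 0; }
int marker_is_end(uint32_t flags)           { return (flags & MARKER_FLAG_END) != 0; }
```

These helpers are only meaningful for marker records; the same bit positions carry different meanings on memcpy or device records.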
11. array memory copy.
      CUPTI_ACTIVITY_MEMCPY_KIND_ATOD: A device array to device memory copy.
      CUPTI_ACTIVITY_MEMCPY_KIND_DTOA: A device to device array memory copy.
      CUPTI_ACTIVITY_MEMCPY_KIND_DTOD: A device to device memory copy.
      CUPTI_ACTIVITY_MEMCPY_KIND_HTOH: A host to host memory copy.

    enum CUpti_ActivityMemoryKind
    Each kind represents the type of the source or destination memory accessed by a memory copy.
    Enumerator:
      CUPTI_ACTIVITY_MEMORY_KIND_UNKNOWN: The source or destination memory kind is unknown.
      CUPTI_ACTIVITY_MEMORY_KIND_PAGEABLE: The source or destination memory is pageable.
      CUPTI_ACTIVITY_MEMORY_KIND_PINNED: The source or destination memory is pinned.
      CUPTI_ACTIVITY_MEMORY_KIND_DEVICE: The source or destination memory is on the device.
      CUPTI_ACTIVITY_MEMORY_KIND_ARRAY: The source or destination memory is an array.

    enum CUpti_ActivityObjectKind
    See also: CUpti_ActivityObjectKindId
    Enumerator:
      CUPTI_ACTIVITY_OBJECT_UNKNOWN: The object kind is not known.
      CUPTI_ACTIVITY_OBJECT_PROCESS: A process.
      CUPTI_ACTIVITY_OBJECT_THREAD: A thread.
      CUPTI_ACTIVITY_OBJECT_DEVICE: A device.
      CUPTI_ACTIVITY_OBJECT_CONTEXT: A context.
      CUPTI_ACTIVITY_OBJECT_STREAM: A stream.

    enum CUpti_ActivityOverheadKind
    Enumerator:
      CUPTI_ACTIVITY_OVERHEAD_UNKNOWN: The overhead kind is not known.
      CUPTI_ACTIVITY_OVERHEAD_DRIVER_COMPILER: Compiler JIT o
12. attrib is not an event group attribute, or if attrib is not a writable attribute
      CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT: Indicates that the value buffer is too small to hold the attribute value.

    CUptiResult cuptiEventGroupSetsCreate(CUcontext context, size_t eventIdArraySizeBytes, CUpti_EventID *eventIdArray, CUpti_EventGroupSets **eventGroupPasses)
    The number of events that can be collected simultaneously varies by device and by the type of the events. When events can be collected simultaneously, they may need to be grouped into multiple event groups because they are from different event domains. This function takes a set of events and determines how many passes are required to collect all those events, and which events can be collected simultaneously in each pass.
    The CUpti_EventGroupSets returned in eventGroupPasses indicates how many passes are required to collect the events with the numSets field. Within each event group set, the sets array indicates the event groups that should be collected on each pass.
    Note: Thread safety: this function is thread safe, but client must guard against another thread simultaneously destroying context.
    Parameters:
      context: The context for event collection.
      eventIdArraySizeBytes: Size of eventIdArray in bytes.
      eventIdArray: Array of event IDs that need to be grouped.
      eventGroupPasses: Returns a CUpti_EventGroupSets object that indicates the
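The result of cuptiEventGroupSetsCreate() is read as "one entry of the sets array per pass". The sketch below walks such a structure in C. The types `PassSet` and `PassPlan` are simplified stand-ins invented for this example: only the field names numSets and sets come from the text, while numEventGroups is an assumed per-pass count.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in types for illustration only; not the real CUPTI structures. */
typedef struct {
    uint32_t numEventGroups;   /* assumed: event groups collected in this pass */
} PassSet;

typedef struct {
    uint32_t numSets;          /* number of passes required (named in the text) */
    const PassSet *sets;       /* one entry per pass (named in the text) */
} PassPlan;

/* Counts the event groups across all passes of a plan. */
uint32_t total_event_groups(const PassPlan *plan)
{
    uint32_t total = 0;
    for (uint32_t pass = 0; pass < plan->numSets; ++pass) {
        total += plan->sets[pass].numEventGroups;
    }
    return total;
}
```

A profiler built on this would typically replay the measured workload numSets times, enabling only the event groups of one pass per run.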
13. groups
      CUPTI_ERROR_INVALID_PARAMETER: if eventGroup is NULL

    CUptiResult cuptiEventGroupGetAttribute(CUpti_EventGroup eventGroup, CUpti_EventGroupAttribute attrib, size_t *valueSize, void *value)
    Read an event group attribute and return it in *value.
    Note: Thread safety: this function is thread safe, but client must guard against simultaneous destruction or modification of eventGroup (for example, client must guard against simultaneous calls to cuptiEventGroupDestroy, cuptiEventGroupAddEvent, etc.), and must guard against simultaneous destruction of the context in which eventGroup was created (for example, client must guard against simultaneous calls to cudaDeviceReset, cuCtxDestroy, etc.).
    Parameters:
      eventGroup: The event group.
      attrib: The attribute to read.
      valueSize: Size of the buffer pointed to by value; returns the number of bytes written to value.
      value: Returns the value of the attribute.
    Return values:
      CUPTI_SUCCESS
      CUPTI_ERROR_NOT_INITIALIZED
      CUPTI_ERROR_INVALID_PARAMETER: if valueSize or value is NULL, or if attrib is not an event group attribute
      CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT: For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value.

    CUptiResult cuptiEventGroupReadAllEvents(CUpti_EventGroup eventGroup, CUpti_ReadEventFlags flags, size_t *eventValueBufferSizeBytes, uint64_t *eventVa
14. is NULL

    CUptiResult cuptiActivityGetNumDroppedRecords(CUcontext context, uint32_t streamId, size_t *dropped)
    Get the number of records that were dropped from a queue because all the buffers in the queue are full. See cuptiActivityEnqueueBuffer() for a description of queues. Calling this function does not transfer ownership of the buffer. The dropped count maintained for the queue is reset to zero when this function is called.
    Parameters:
      context: The context, or NULL to get the dropped count from the global queue.
      streamId: The stream ID.
      dropped: The number of records that were dropped since the last call to this function.
    Return values:
      CUPTI_SUCCESS
      CUPTI_ERROR_NOT_INITIALIZED
      CUPTI_ERROR_INVALID_PARAMETER: if dropped is NULL

    CUptiResult cuptiActivityQueryBuffer(CUcontext context, uint32_t streamId, size_t *validBufferSizeBytes)
    Query the status of the buffer at the head of the queue. See cuptiActivityEnqueueBuffer() for a description of queues. Calling this function does not transfer ownership of the buffer.
    Parameters:
      context: The context, or NULL to query the global queue.
      streamId: The stream ID.
      validBufferSizeBytes: Returns the number of bytes in the buffer that contain activity records.
    Return values:
      CUPTI_SUCCESS
      CUPTI_ERROR_NOT_INITIALIZED
      CUPTI_ERROR_INVALID_PARAMETER: if buffer or validBufferSizeBytes are NULL
      CUPTI_ERROR_MAX_LIMIT_REACHED: if buffer is full
      CUPT
15. runtimeCorrelationId
    The runtime correlation ID of the memory set. Each memory set is assigned a unique runtime correlation ID that is identical to the correlation ID in the runtime API activity record that launched the memory set.

    uint64_t CUpti_ActivityMemset::start
    The start timestamp for the memory set, in ns.

    uint32_t CUpti_ActivityMemset::streamId
    The ID of the stream where the memory set is occurring.

    uint32_t CUpti_ActivityMemset::value
    The value being assigned to memory by the memory set.

    CUpti_ActivityMetric Type Reference
    The activity record for a CUPTI metric.

    Data Fields
      uint32_t correlationId
      CUpti_MetricID id
      CUpti_ActivityKind kind
      uint32_t pad
      CUpti_MetricValue value

    Detailed Description
    This activity record represents the collection of a CUPTI metric value (CUPTI_ACTIVITY_KIND_METRIC). This activity record kind is not produced by the activity API but is included for completeness and ease of use. Profile frameworks built on top of CUPTI that collect metric data may choose to use this type to store the collected metric data.

    Field Documentation

    uint32_t CUpti_ActivityMetric::correlationId
    The correlation ID of the metric. Use of this ID is user-defined, but typically this ID value will equal the correlation ID of the kernel for which the metric was gathered.

    CUpti_MetricID CUpti_ActivityMetric::id
    The metric ID.

    CU
16. safe, but client must guard against simultaneous destruction or modification of eventGroup (for example, client must guard against simultaneous calls to cuptiEventGroupDestroy, cuptiEventGroupAddEvent, etc.), and must guard against simultaneous destruction of the context in which eventGroup was created (for example, client must guard against simultaneous calls to cudaDeviceReset, cuCtxDestroy, etc.). If cuptiEventGroupResetAllEvents is called simultaneously with this function, then returned event values are undefined.
    Parameters:
      eventGroup: The event group.
      flags: Flags controlling the reading mode.
      eventValueBufferSizeBytes: The size of eventValueBuffer in bytes; returns the number of bytes written to eventValueBuffer.
      eventValueBuffer: Returns the event values.
      eventIdArraySizeBytes: The size of eventIdArray in bytes; returns the number of bytes written to eventIdArray.
      eventIdArray: Returns the IDs of the events in the same order as the values returned in eventValueBuffer.
      numEventIdsRead: Returns the number of event IDs returned in eventIdArray.
    Return values:
      CUPTI_SUCCESS
      CUPTI_ERROR_NOT_INITIALIZED
      CUPTI_ERROR_HARDWARE
      CUPTI_ERROR_INVALID_OPERATION: if eventGroup is disabled
      CUPTI_ERROR_INVALID_PARAMETER: if eventGroup, eventValueBufferSizeBytes, eventValueBuffer, eventIdArraySizeBytes, eventIdArray, or numEventIdsRead is NULL

    CUptiResult cuptiEventGroupReadEvent(CUpti_EventGroup eventGroup, CUpti_ReadEventFlags flags, CU
17. to the correlation ID in the runtime API activity record that launched the kernel.

    uint64_t CUpti_ActivityKernel::start
    The start timestamp for the kernel execution, in ns.

    int32_t CUpti_ActivityKernel::staticSharedMemory
    The static shared memory allocated for the kernel, in bytes.

    uint32_t CUpti_ActivityKernel::streamId
    The ID of the stream where the kernel is executing.

    CUpti_ActivityMemcpy Type Reference
    The activity record for memory copies.

    Data Fields
      uint64_t bytes
      uint32_t contextId
      uint8_t copyKind
      uint32_t correlationId
      uint32_t deviceId
      uint8_t dstKind
      uint64_t end
      uint8_t flags
      CUpti_ActivityKind kind
      void *reserved0
      uint32_t runtimeCorrelationId
      uint8_t srcKind
      uint64_t start
      uint32_t streamId

    Detailed Description
    This activity record represents a memory copy (CUPTI_ACTIVITY_KIND_MEMCPY).

    Field Documentation

    uint64_t CUpti_ActivityMemcpy::bytes
    The number of bytes transferred by the memory copy.

    uint32_t CUpti_ActivityMemcpy::contextId
    The ID of the context where the memory copy is occurring.

    uint8_t CUpti_ActivityMemcpy::copyKind
    The kind of the memory copy, stored as a byte to reduce record size. See also: CUpti_ActivityMemcpyKind

    uint32_t CUpti_ActivityMemcpy::correlationId
    The correlation ID of the memory copy. Each m
18. name: Returns pointer to the name string on success, NULL otherwise.
    Return values:
      CUPTI_SUCCESS: on success
      CUPTI_ERROR_INVALID_PARAMETER: if name is NULL, or if domain or cbid is invalid

    CUptiResult cuptiGetCallbackState(uint32_t *enable, CUpti_SubscriberHandle subscriber, CUpti_CallbackDomain domain, CUpti_CallbackId cbid)
    Returns non-zero in *enable if the callback for a domain and callback ID is enabled, and zero if not enabled.
    Note: Thread safety: a subscriber must serialize access to cuptiGetCallbackState, cuptiEnableCallback, cuptiEnableDomain, and cuptiEnableAllDomains. For example, if cuptiGetCallbackState(sub, d, c) and cuptiEnableCallback(sub, d, c) are called concurrently, the results are undefined.
    Parameters:
      enable: Returns non-zero if callback enabled, zero if not enabled.
      subscriber: Handle to the initialized subscriber.
      domain: The domain of the callback.
      cbid: The ID of the callback.
    Return values:
      CUPTI_SUCCESS: on success
      CUPTI_ERROR_NOT_INITIALIZED: if unable to initialize CUPTI
      CUPTI_ERROR_INVALID_PARAMETER: if enable is NULL, or if subscriber, domain, or cbid is invalid

    CUptiResult cuptiSubscribe(CUpti_SubscriberHandle *subscriber, CUpti_CallbackFunc callback, void *userdata)
    Initializes a callback subscriber with a callback function and (optionally) a pointer to user data. The returned subscriber handle can be used to enable and disable the callback for specific domains and callback IDs.
19. NVIDIA CUDA Toolkit CUPTI User's Guide (DA-05679-001_v01)

    Document Change History
      Ver   Date        Resp   Reason for change
      v01   2011/1/19   DG     Initial revision for CUDA Tools SDK 4.0
      v02   2012/1/5    DG     Revisions for CUDA Tools SDK 4.1
      v03   2012/2/13   DG     Revisions for CUDA Tools SDK 4.2
      v04   2012/5/1    DG     Revisions for CUDA Toolkit 5.0

    CUPTI Reference

    CUPTI Version

    Defines
      #define CUPTI_API_VERSION 3
      The API version for this implementation of CUPTI.

    Functions
      CUptiResult cuptiGetVersion(uint32_t *version)
      Get the CUPTI API version.

    Detailed Description
    Function and macro to determine the CUPTI version.

    Define Documentation

    #define CUPTI_API_VERSION
    The API version for this implementation of CUPTI. This define, along with cuptiGetVersion, can be used to dynamically detect if the version of CUPTI compiled against matches the version of the loaded CUPTI library.
      v1: CUDAToolsSDK 4.0
      v2: CUDAToolsSDK 4.1
      v3: CUDA Toolkit 5.0

    Function Documentation

    CUptiResult cuptiGetVersion(uint32_t *version)
    Return the API version in *version.
    Parameters:
      version: Returns the version.
    Return values:
      CUPTI_SUCCESS: on success
      CUPTI_ERROR_INVALID_PARAMETER: if version is NULL
    See also: CUPTI_API_VERSION

    CUPTI Result Cod
20. ABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.
    Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all other information previously supplied. NVIDIA Corporation products are not authorized as critical components in life support devices or systems without express written approval of NVIDIA Corporation.

    Trademarks
    NVIDIA and the NVIDIA logo are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.

    Copyright
    2012 NVIDIA Corporation. All rights reserved.
    www.nvidia.com
21. ARAMETER: if streamId is NULL
    See also: cuptiActivityEnqueueBuffer, cuptiActivityDequeueBuffer

    CUptiResult cuptiGetTimestamp(uint64_t *timestamp)
    Returns a timestamp normalized to correspond with the start and end timestamps reported in the CUPTI activity records. The timestamp is reported in nanoseconds.
    Parameters:
      timestamp: Returns the CUPTI timestamp.
    Return values:
      CUPTI_SUCCESS
      CUPTI_ERROR_INVALID_PARAMETER: if timestamp is NULL

    CUpti_Activity Type Reference
    The base activity record.

    Data Fields
      CUpti_ActivityKind kind

    Detailed Description
    The activity API uses a CUpti_Activity as a generic representation for any activity. The kind field is used to determine the specific activity kind, and from that the CUpti_Activity object can be cast to the specific activity record type appropriate for that kind.
    Note that all activity record types are padded and aligned to ensure that each member of the record is naturally aligned.
    See also: CUpti_ActivityKind

    Field Documentation

    CUpti_ActivityKind CUpti_Activity::kind
    The kind of this activity.

    CUpti_ActivityAPI Type Reference
    The activity record for a driver or runtime API invocation.

    Data Fields
      CUpti_CallbackId cbid
      uint32_t correlationId
      uint64_t end
      CUpti_Ac
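The "inspect kind, then cast" pattern described for CUpti_Activity can be sketched in C with stand-in types. Everything below is hypothetical and chosen for illustration: the `Kind` values, the record layouts, and `record_payload` are not CUPTI definitions. The sketch embeds the base record as the first member of each specific record, which is one well-defined way to make the cast valid in C.

```c
#include <stdint.h>

/* Stand-in kinds and records; not the real CUPTI definitions. */
typedef enum { KIND_MEMCPY = 1, KIND_KERNEL = 2 } Kind;

typedef struct { Kind kind; } Activity;                  /* generic base record */
typedef struct { Activity base; uint64_t bytes; } MemcpyRecord;
typedef struct { Activity base; uint64_t start, end; } KernelRecord;

/* Dispatches on the kind field, then casts to the matching record type.
 * Returns the byte count for a memcpy record, the duration in ns for a
 * kernel record, and 0 for any other kind. */
uint64_t record_payload(const Activity *record)
{
    switch (record->kind) {
    case KIND_MEMCPY:
        return ((const MemcpyRecord *)record)->bytes;
    case KIND_KERNEL: {
        const KernelRecord *k = (const KernelRecord *)record;
        return k->end - k->start;
    }
    default:
        return 0;
    }
}
```

The cast is only safe after the kind check, which is why every record type starts with the kind field.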
22. Array
      metricArray: Returns the IDs of the metrics for the device.
    Return values:
      CUPTI_SUCCESS
      CUPTI_ERROR_NOT_INITIALIZED
      CUPTI_ERROR_INVALID_DEVICE
      CUPTI_ERROR_INVALID_PARAMETER: if arraySizeBytes or metricArray are NULL

    CUptiResult cuptiDeviceGetNumMetrics(CUdevice device, uint32_t *numMetrics)
    Returns the number of metrics available for a device.
    Parameters:
      device: The CUDA device.
      numMetrics: Returns the number of metrics available for the device.
    Return values:
      CUPTI_SUCCESS
      CUPTI_ERROR_NOT_INITIALIZED
      CUPTI_ERROR_INVALID_DEVICE
      CUPTI_ERROR_INVALID_PARAMETER: if numMetrics is NULL

    CUptiResult cuptiEnumMetrics(size_t *arraySizeBytes, CUpti_MetricID *metricArray)
    Returns the metric IDs in metricArray for all CUDA-capable devices. The size of the metricArray buffer is given by *arraySizeBytes. The size of the metricArray buffer must be at least numMetrics * sizeof(CUpti_MetricID), or else not all metric IDs will be returned. The value returned in *arraySizeBytes contains the number of bytes returned in metricArray.
    Parameters:
      arraySizeBytes: The size of metricArray in bytes; returns the number of bytes written to metricArray.
      metricArray: Returns the IDs of the metrics.
    Return values:
      CUPTI_SUCCESS
      CUPTI_ERROR_INVALID_PARAMETER: if arraySizeBytes or metricArray are NULL

    CUptiResult cuptiGetNum
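The sizing rule for metricArray (at least numMetrics * sizeof(CUpti_MetricID) bytes in, number of bytes written out) reduces to two small conversions. The sketch below shows them in C; `MetricID` is a stand-in typedef assumed to match the uint32_t typedef of CUpti_MetricID, and both helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t MetricID; /* stand-in for CUpti_MetricID (a uint32_t typedef) */

/* Bytes to allocate so that all numMetrics IDs fit in the array. */
size_t metric_array_bytes(uint32_t numMetrics)
{
    return (size_t)numMetrics * sizeof(MetricID);
}

/* Number of IDs actually returned, given the byte count written back
 * through arraySizeBytes. */
size_t metrics_returned(size_t bytesWritten)
{
    return bytesWritten / sizeof(MetricID);
}
```

In use, the count would come from one of the "get number of metrics" calls, the first helper would size the allocation, and the second would interpret the value left in arraySizeBytes after the enumeration call.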
23. CB_DOMAIN_INVALID = 0,
      CUPTI_CB_DOMAIN_DRIVER_API = 1,
      CUPTI_CB_DOMAIN_RUNTIME_API = 2,
      CUPTI_CB_DOMAIN_RESOURCE = 3,
      CUPTI_CB_DOMAIN_SYNCHRONIZE = 4,
      CUPTI_CB_DOMAIN_NVTX = 5 }
    Callback domains.

    enum CUpti_CallbackIdResource {
      CUPTI_CBID_RESOURCE_INVALID = 0,
      CUPTI_CBID_RESOURCE_CONTEXT_CREATED = 1,
      CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING = 2,
      CUPTI_CBID_RESOURCE_STREAM_CREATED = 3,
      CUPTI_CBID_RESOURCE_STREAM_DESTROY_STARTING = 4 }
    Callback IDs for resource domain.

    enum CUpti_CallbackIdSync {
      CUPTI_CBID_SYNCHRONIZE_INVALID = 0,
      CUPTI_CBID_SYNCHRONIZE_STREAM_SYNCHRONIZED = 1,
      CUPTI_CBID_SYNCHRONIZE_CONTEXT_SYNCHRONIZED = 2 }
    Callback IDs for synchronization domain.

    Functions

    CUptiResult cuptiEnableAllDomains(uint32_t enable, CUpti_SubscriberHandle subscriber)
    Enable or disable all callbacks in all domains.

    CUptiResult cuptiEnableCallback(uint32_t enable, CUpti_SubscriberHandle subscriber, CUpti_CallbackDomain domain, CUpti_CallbackId cbid)
    Enable or disable callbacks for a specific domain and callback ID.

    CUptiResult cuptiEnableDomain(uint32_t enable, CUpti_SubscriberHandle subscriber, CUpti_CallbackDomain domain)
    Enable or disable all callbacks for a specific domain.

    CUptiResult cuptiGetCallbackName(CUpti_CallbackDomain domain, uint32_t cbid, const char
    Get the name of a callback for a specific domai
24. CUpti_MetricID metric, size_t eventIdArraySizeBytes, CUpti_EventID *eventIdArray, size_t eventValueArraySizeBytes, uint64_t *eventValueArray, uint64_t timeDuration, CUpti_MetricValue *metricValue)
    Calculate the value for a metric.

    Detailed Description
    Functions, types, and enums that implement the CUPTI Metric API.

    Typedef Documentation

    typedef uint32_t CUpti_MetricID
    A metric provides a measure of some aspect of the device.

    Enumeration Type Documentation

    enum CUpti_MetricAttribute
    Metric attributes describe properties of a metric. These attributes can be read using cuptiMetricGetAttribute.
    Enumerator:
      CUPTI_METRIC_ATTR_NAME: Metric name. Value is a null-terminated const c-string.
      CUPTI_METRIC_ATTR_SHORT_DESCRIPTION: Short description of metric. Value is a null-terminated const c-string.
      CUPTI_METRIC_ATTR_LONG_DESCRIPTION: Long description of metric. Value is a null-terminated const c-string.
      CUPTI_METRIC_ATTR_CATEGORY: Category of the metric. Value is of type CUpti_MetricCategory.
      CUPTI_METRIC_ATTR_VALUE_KIND: Value type of the metric. Value is of type CUpti_MetricValueKind.
      CUPTI_METRIC_ATTR_EVALUATION_MODE: Metric evaluation mode. Value is of type CUpti_MetricEvaluationMode.

    enum CUpti_MetricCategory
    Each metric is assigned to a category that represents the general type of the metric. A metric's category is accessed using cuptiMetricGetAttribute and the
25. ENT. This activity record kind is not produced by the activity API but is included for completeness and ease of use. Profile frameworks built on top of CUPTI that collect event data may choose to use this type to store the collected event data. Field Documentation: uint32_t CUpti_ActivityEvent::correlationId: The correlation ID of the event. Use of this ID is user-defined, but typically this ID value will equal the correlation ID of the kernel for which the event was gathered. CUpti_EventDomainID CUpti_ActivityEvent::domain: The event domain ID. CUpti_EventID CUpti_ActivityEvent::id: The event ID. CUpti_ActivityKind CUpti_ActivityEvent::kind: The activity record kind, must be CUPTI_ACTIVITY_KIND_EVENT. uint64_t CUpti_ActivityEvent::value: The event value. CUpti_ActivityKernel Reference: The activity record for kernel. Data Fields: int32_t blockX, int32_t blockY, int32_t blockZ, uint8_t cacheConfigExecuted, uint8_t cacheConfigRequested, uint32_t contextId, uint32_t correlationId, uint32_t deviceId, int32_t dynamicSharedMemory, uint64_t end, int32_t gridX, int32_t gridY, int32_t gridZ, CUpti_ActivityKind kind, uint32_t localMemoryPerThread, uint32_t localMemoryTotal, const char *name, uint32_t pad, uint16_t registersPerThread, void *reserved0, uint32_t runtimeCorrelationId, uint64_t start, in
26. EVICE: The device does not correspond to a valid CUDA device. CUPTI_ERROR_INVALID_CONTEXT: The context is NULL or not valid. CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID: The event domain id is invalid. CUPTI_ERROR_INVALID_EVENT_ID: The event id is invalid. CUPTI_ERROR_INVALID_EVENT_NAME: The event name is invalid. CUPTI_ERROR_INVALID_OPERATION: The current operation cannot be performed due to dependency on other factors. CUPTI_ERROR_OUT_OF_MEMORY: Unable to allocate enough memory to perform the requested operation. CUPTI_ERROR_HARDWARE: The performance monitoring hardware could not be reserved or some other hardware error occurred. CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT: The output buffer size is not sufficient to return all requested data. CUPTI_ERROR_NOT_IMPLEMENTED: API is not implemented. CUPTI_ERROR_MAX_LIMIT_REACHED: The maximum limit is reached. CUPTI_ERROR_NOT_READY: The object is not yet ready to perform the requested operation. CUPTI_ERROR_NOT_COMPATIBLE: The current operation is not compatible with the current state of the object. CUPTI_ERROR_NOT_INITIALIZED: CUPTI is unable to initialize its connection to the CUDA driver. CUPTI_ERROR_INVALID_METRIC_ID: The metric id is invalid. CUPTI_ERROR_INVALID_METRIC_NAME: The metric name is invalid. CUPTI_ERROR_QUEUE_EMPTY: The queue is empty. CUPTI_ERROR_INVALID_HANDLE: Invalid handle (internal). CUPTI_ERROR_INVALID_STREAM: In
27. FLAG_MARKER_COLOR_ARGB = 1 << 1, CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_SIZE_MASK = 0xFF << 0, CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_LOAD = 1 << 8, CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_CACHED = 1 << 9. Flags associated with activity records. enum CUpti_ActivityKind { CUPTI_ACTIVITY_KIND_INVALID = 0, CUPTI_ACTIVITY_KIND_MEMCPY = 1, CUPTI_ACTIVITY_KIND_MEMSET = 2, CUPTI_ACTIVITY_KIND_KERNEL = 3, CUPTI_ACTIVITY_KIND_DRIVER = 4, CUPTI_ACTIVITY_KIND_RUNTIME = 5, CUPTI_ACTIVITY_KIND_EVENT = 6, CUPTI_ACTIVITY_KIND_METRIC = 7, CUPTI_ACTIVITY_KIND_DEVICE = 8, CUPTI_ACTIVITY_KIND_CONTEXT = 9, CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL = 10, CUPTI_ACTIVITY_KIND_NAME = 11, CUPTI_ACTIVITY_KIND_MARKER = 12, CUPTI_ACTIVITY_KIND_MARKER_DATA = 13, CUPTI_ACTIVITY_KIND_SOURCE_LOCATOR = 14, CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS = 15, CUPTI_ACTIVITY_KIND_BRANCH = 16, CUPTI_ACTIVITY_KIND_OVERHEAD = 17 }: The kinds of activity records. enum CUpti_ActivityMemcpyKind { CUPTI_ACTIVITY_MEMCPY_KIND_UNKNOWN = 0, CUPTI_ACTIVITY_MEMCPY_KIND_HTOD = 1, CUPTI_ACTIVITY_MEMCPY_KIND_DTOH = 2, CUPTI_ACTIVITY_MEMCPY_KIND_HTOA = 3, CUPTI_ACTIVITY_MEMCPY_KIND_ATOH = 4, CUPTI_ACTIVITY_MEMCPY_KIND_ATOA = 5, CUPTI_ACTIVITY_MEMCPY_KIND_ATOD = 6, CUPTI_ACTIVITY_MEMCPY_KIND_DTOA = 7, CUPTI_ACTIVITY_MEMCPY_KIND_DTOD = 8, CUPTI_ACTIVITY_MEMCPY_KIND_HTOH = 9 }: The kind of a
28. GroupAttribute attrib, size_t valueSize, void *value): Write an event group attribute. CUptiResult cuptiEventGroupSetsCreate(CUcontext context, size_t eventIdArraySizeBytes, CUpti_EventID *eventIdArray, CUpti_EventGroupSets **eventGroupPasses): For a set of events, get the grouping that indicates the number of passes and the event groups necessary to collect the events. CUptiResult cuptiEventGroupSetsDestroy(CUpti_EventGroupSets *eventGroupSets): Destroy a CUpti_EventGroupSets object. CUptiResult cuptiGetNumEventDomains(uint32_t *numDomains): Get the number of event domains available on any device. CUptiResult cuptiSetEventCollectionMode(CUcontext context, CUpti_EventCollectionMode mode): Set the event collection mode. Detailed Description: Functions, types, and enums that implement the CUPTI Event API. Define Documentation: #define CUPTI_EVENT_OVERFLOW ((uint64_t)0xFFFFFFFFFFFFFFFFULL): The CUPTI event value that indicates an overflow. Typedef Documentation: typedef uint32_t CUpti_EventDomainID: ID for an event domain. An event domain represents a group of related events. A device may have multiple instances of a domain, indicating that the device can simultaneously record multiple instances of each event within that domain. typedef void *CUpti_EventGroup: An event group is a collection of events that are managed together. All events in an event
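The overflow sentinel described in this excerpt has to be filtered out before event values are summed or fed into a metric calculation. The following is a minimal sketch of that check, assuming per-instance values have already been read into an array. `SKETCH_EVENT_OVERFLOW` and `sum_event_values` are hypothetical names defined here so the example builds without `cupti.h`; they are not part of CUPTI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for CUPTI_EVENT_OVERFLOW (same bit pattern as in the guide). */
#define SKETCH_EVENT_OVERFLOW ((uint64_t)0xFFFFFFFFFFFFFFFFULL)

/*
 * Hypothetical helper, not a CUPTI call: sums per-instance event values.
 * Returns false if any value is the overflow sentinel or if the running
 * total would wrap around. *total is written only on success.
 */
static bool sum_event_values(const uint64_t *values, size_t count, uint64_t *total)
{
    assert(values != NULL && total != NULL);

    uint64_t sum = 0;
    for (size_t i = 0; i < count; ++i) {
        if (values[i] == SKETCH_EVENT_OVERFLOW) {
            return false;               /* counter overflowed during collection */
        }
        if (values[i] > UINT64_MAX - sum) {
            return false;               /* the sum itself would not fit in 64 bits */
        }
        sum += values[i];
    }
    *total = sum;
    return true;
}
```

A caller would typically discard the whole sample when this returns false rather than pass a partial total on.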
29. I_ERROR_QUEUE_EMPTY: the queue is empty, validBufferSizeBytes returns 0. CUptiResult cuptiGetDeviceId(CUcontext context, uint32_t *deviceId): If context is NULL, returns the ID of the device that contains the currently active context. If context is non-NULL, returns the ID of the device which contains that context. Operates in a similar manner to cudaGetDevice() or cuCtxGetDevice() but may be called from within callback functions. Parameters: context: The context, or NULL to indicate the current context. deviceId: Returns the ID of the device that is current for the calling thread. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_DEVICE if unable to get device ID; CUPTI_ERROR_INVALID_PARAMETER if deviceId is NULL. CUptiResult cuptiGetStreamId(CUcontext context, CUstream stream, uint32_t *streamId): Get the ID of a stream. The stream ID is unique within a context (i.e. all streams within a context will have unique stream IDs). Parameters: context: If non-NULL, then the stream is checked to ensure that it belongs to this context. Typically this parameter should be NULL. stream: The stream. streamId: Returns a context-unique ID for the stream. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_STREAM if unable to get stream ID, or if context is non-NULL and stream does not belong to the context; CUPTI_ERROR_INVALID_P
30. Metrics(uint32_t *numMetrics): Returns the total number of metrics available on any CUDA-capable devices. Parameters: numMetrics: Returns the number of metrics. Return values: CUPTI_SUCCESS; CUPTI_ERROR_INVALID_PARAMETER if numMetrics is NULL. CUptiResult cuptiMetricCreateEventGroupSets(CUcontext context, size_t metricIdArraySizeBytes, CUpti_MetricID *metricIdArray, CUpti_EventGroupSets **eventGroupPasses): For a set of metrics, get the grouping that indicates the number of passes and the event groups necessary to collect the events required for those metrics. See also: cuptiEventGroupSetsCreate for details on event group set creation. Parameters: context: The context for event collection. metricIdArraySizeBytes: Size of the metricIdArray in bytes. metricIdArray: Array of metric IDs. eventGroupPasses: Returns a CUpti_EventGroupSets object that indicates the number of passes required to collect the events and the events to collect on each pass. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_CONTEXT; CUPTI_ERROR_INVALID_METRIC_ID; CUPTI_ERROR_INVALID_PARAMETER if metricIdArray or eventGroupPasses is NULL. CUptiResult cuptiMetricEnumEvents(CUpti_MetricID metric, size_t *eventIdArraySizeBytes, CUpti_EventID *eventIdArray): Gets the event IDs in eventIdArray required to calculate a metric. The size of the eventIdArray buffer is given by *eventIdArraySizeBytes and must be at least
31. NITIALIZED; CUPTI_ERROR_INVALID_EVENT_ID; CUPTI_ERROR_INVALID_OPERATION if eventGroup is enabled; CUPTI_ERROR_INVALID_PARAMETER if eventGroup is NULL. CUptiResult cuptiEventGroupResetAllEvents(CUpti_EventGroup eventGroup): Zero all the event counts in an event group. Note: Thread-safety: this function is thread safe, but the client must guard against simultaneous destruction or modification of eventGroup (for example, the client must guard against simultaneous calls to cuptiEventGroupDestroy, cuptiEventGroupAddEvent, etc.), and must guard against simultaneous destruction of the context in which eventGroup was created (for example, the client must guard against simultaneous calls to cudaDeviceReset, cuCtxDestroy, etc.). Parameters: eventGroup: The event group. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_HARDWARE; CUPTI_ERROR_INVALID_PARAMETER if eventGroup is NULL. CUptiResult cuptiEventGroupSetAttribute(CUpti_EventGroup eventGroup, CUpti_EventGroupAttribute attrib, size_t valueSize, void *value): Write an event group attribute. Note: Thread-safety: this function is thread safe. Parameters: eventGroup: The event group. attrib: The attribute to write. valueSize: The size, in bytes, of the value. value: The attribute value to write. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_PARAMETER if valueSize or value is NULL, or if
32. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_DEVICE; CUPTI_ERROR_INVALID_PARAMETER if arraySizeBytes or domainArray are NULL. CUptiResult cuptiDeviceGetAttribute(CUdevice device, CUpti_DeviceAttribute attrib, size_t *valueSize, void *value): Read a device attribute and return it in *value. Note: Thread-safety: this function is thread safe. Parameters: device: The CUDA device. attrib: The attribute to read. valueSize: Size of the buffer pointed to by value; returns the number of bytes written to value. value: Returns the value of the attribute. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_DEVICE; CUPTI_ERROR_INVALID_PARAMETER if valueSize or value is NULL, or if attrib is not a device attribute; CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT: for non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value. CUptiResult cuptiDeviceGetEventDomainAttribute(CUdevice device, CUpti_EventDomainID eventDomain, CUpti_EventDomainAttribute attrib, size_t *valueSize, void *value): Returns an event domain attribute in *value. The size of the value buffer is given by *valueSize. The value returned in *valueSize contains the number of bytes returned in value. If the attribute value is a c-string that is longer tha
33. a region end marker. Valid for CUPTI_ACTIVITY_KIND_MARKER. CUPTI_ACTIVITY_FLAG_MARKER_COLOR_NONE: Indicates the activity represents a marker that does not specify a color. Valid for CUPTI_ACTIVITY_KIND_MARKER_DATA. CUPTI_ACTIVITY_FLAG_MARKER_COLOR_ARGB: Indicates the activity represents a marker that specifies a color in alpha-red-green-blue format. Valid for CUPTI_ACTIVITY_KIND_MARKER_DATA. CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_SIZE_MASK: The number of bytes requested by each thread. Valid for CUpti_ActivityGlobalAccess. CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_LOAD: If this bit in the flag is set, the access was a load; otherwise it is a store access. Valid for CUpti_ActivityGlobalAccess. CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_CACHED: If this bit in the flag is set, the load access was cached; otherwise it is uncached. Valid for CUpti_ActivityGlobalAccess. enum CUpti_ActivityKind: Each activity record kind represents information about a GPU or an activity occurring on a CPU or GPU. Each kind is associated with an activity record structure that holds the information associated with the kind. See also: CUpti_Activity, CUpti_ActivityAPI, CUpti_ActivityContext, CUpti_ActivityDevice, CUpti_ActivityEvent, CUpti_ActivityKernel, CUpti_ActivityMemcpy, CUpti_ActivityMemset, CUpti_ActivityMetric, CUpti_ActivityName, CUpti_ActivityMarker, CUpti_ActivityMarkerData, CUpti_ActivitySourceLocator, CUpti_ActivityGlobalAc
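The three global-access flags above pack a size field and two booleans into one flags word. As a rough sketch of how such a word could be unpacked, the code below uses local constants that mirror the values listed in the guide's enum summary (0xFF << 0, 1 << 8, 1 << 9). The `SKETCH_` names, `GlobalAccessInfo`, and `decode_global_access_flags` are hypothetical and only stand in for the real CUPTI declarations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_* values. */
enum {
    SKETCH_GLOBAL_ACCESS_SIZE_MASK = 0xFF << 0, /* bytes requested per thread */
    SKETCH_GLOBAL_ACCESS_LOAD      = 1 << 8,    /* set: load, clear: store */
    SKETCH_GLOBAL_ACCESS_CACHED    = 1 << 9     /* set: cached load */
};

static_assert((SKETCH_GLOBAL_ACCESS_SIZE_MASK &
               (SKETCH_GLOBAL_ACCESS_LOAD | SKETCH_GLOBAL_ACCESS_CACHED)) == 0,
              "size field must not overlap the load/cached bits");

/* Hypothetical decoded view of a global-access flags word. */
typedef struct {
    uint32_t bytesPerThread;
    bool     isLoad;
    bool     isCachedLoad;
} GlobalAccessInfo;

static GlobalAccessInfo decode_global_access_flags(uint32_t flags)
{
    GlobalAccessInfo info;
    info.bytesPerThread = flags & SKETCH_GLOBAL_ACCESS_SIZE_MASK;
    info.isLoad         = (flags & SKETCH_GLOBAL_ACCESS_LOAD) != 0;
    /* The cached bit is only described for loads, so ignore it for stores. */
    info.isCachedLoad   = info.isLoad && (flags & SKETCH_GLOBAL_ACCESS_CACHED) != 0;
    return info;
}
```

Treating the cached bit as meaningful only for loads is a design choice of this sketch, based on the wording "the load access was cached".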
34. alled only once. enum CUpti_MetricValueKind: Metric values can be one of several different kinds. Corresponding to each kind is a member of the CUpti_MetricValue union. The metric value returned by cuptiMetricGetValue should be accessed using the appropriate member of that union based on its value kind. Enumerator: CUPTI_METRIC_VALUE_KIND_DOUBLE: The metric value is a 64-bit double. CUPTI_METRIC_VALUE_KIND_UINT64: The metric value is a 64-bit unsigned integer. CUPTI_METRIC_VALUE_KIND_PERCENT: The metric value is a percentage represented by a 64-bit double. For example, 57.5% is represented by the value 57.5. CUPTI_METRIC_VALUE_KIND_THROUGHPUT: The metric value is a throughput represented by a 64-bit integer. The unit for throughput values is bytes/second. CUPTI_METRIC_VALUE_KIND_INT64: The metric value is a 64-bit signed integer. Function Documentation: CUptiResult cuptiDeviceEnumMetrics(CUdevice device, size_t *arraySizeBytes, CUpti_MetricID *metricArray): Returns the metric IDs in metricArray for a device. The size of the metricArray buffer is given by *arraySizeBytes. The size of the metricArray buffer must be at least numMetrics * sizeof(CUpti_MetricID); otherwise not all metric IDs will be returned. The value returned in *arraySizeBytes contains the number of bytes returned in metricArray. Parameters: device: The CUDA device. arraySizeBytes: The size of metricArray in bytes; returns the number of bytes written to metric
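Because the value kind decides which union member is valid, reading a metric value usually takes the form of a switch on the kind. The sketch below illustrates that pattern with a local enum and union that only mirror the five kinds listed above. `SketchMetricValueKind`, `SketchMetricValue`, the member names, and `format_metric_value` are assumptions made for illustration and are not the CUPTI definitions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-in for CUpti_MetricValueKind. */
typedef enum {
    SKETCH_KIND_DOUBLE,
    SKETCH_KIND_UINT64,
    SKETCH_KIND_PERCENT,
    SKETCH_KIND_THROUGHPUT,
    SKETCH_KIND_INT64
} SketchMetricValueKind;

/* Local stand-in for the CUpti_MetricValue union (member names are assumed). */
typedef union {
    double   asDouble;
    uint64_t asUint64;
    double   asPercent;     /* 57.5 means 57.5% */
    uint64_t asThroughput;  /* bytes/second */
    int64_t  asInt64;
} SketchMetricValue;

/* Formats the member selected by kind. Returns snprintf's result, or -1
 * for an unrecognized kind. */
static int format_metric_value(SketchMetricValueKind kind, SketchMetricValue v,
                               char *out, size_t outSize)
{
    assert(out != NULL && outSize > 0);
    switch (kind) {
    case SKETCH_KIND_DOUBLE:
        return snprintf(out, outSize, "%f", v.asDouble);
    case SKETCH_KIND_UINT64:
        return snprintf(out, outSize, "%" PRIu64, v.asUint64);
    case SKETCH_KIND_PERCENT:
        return snprintf(out, outSize, "%.1f%%", v.asPercent);
    case SKETCH_KIND_THROUGHPUT:
        return snprintf(out, outSize, "%" PRIu64 " bytes/second", v.asThroughput);
    case SKETCH_KIND_INT64:
        return snprintf(out, outSize, "%" PRId64, v.asInt64);
    }
    return -1;
}
```

In real code the kind would come from the metric's value-kind attribute rather than being passed in by hand.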
35. alue is an integer. CUPTI_EVENT_GROUP_ATTR_USER_DATA: [rw] Reserved for user data. CUPTI_EVENT_GROUP_ATTR_NUM_EVENTS: Number of events in the group. Value is a uint32_t. CUPTI_EVENT_GROUP_ATTR_EVENTS: Enumerates events in the group. Value is a pointer to a buffer of size sizeof(CUpti_EventID) * num_of_events in the event group. num_of_events can be queried using CUPTI_EVENT_GROUP_ATTR_NUM_EVENTS. CUPTI_EVENT_GROUP_ATTR_INSTANCE_COUNT: Number of instances of the domain bound to this event group that will be counted. Value is a uint32_t. enum CUpti_ReadEventFlags: Flags for cuptiEventGroupReadEvent and cuptiEventGroupReadAllEvents. Enumerator: CUPTI_EVENT_READ_FLAG_NONE: No flags. Function Documentation: CUptiResult cuptiDeviceEnumEventDomains(CUdevice device, size_t *arraySizeBytes, CUpti_EventDomainID *domainArray): Returns the event domain IDs in domainArray for a device. The size of the domainArray buffer is given by *arraySizeBytes. The size of the domainArray buffer must be at least numdomains * sizeof(CUpti_EventDomainID); otherwise not all domains will be returned. The value returned in *arraySizeBytes contains the number of bytes returned in domainArray. Note: Thread-safety: this function is thread safe. Parameters: device: The CUDA device. arraySizeBytes: The size of domainArray in bytes; returns the number of bytes written to domainArray. domainArray: Returns the IDs of the event domains for the device.
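The enumeration call above uses one parameter for both input (buffer capacity in bytes) and output (bytes actually written). The sketch below models that contract with a stand-in function so it can run without a GPU. `sketch_enum_domains`, its fixed list of three IDs, and `enumerate_domains` are hypothetical and are not CUPTI functions; only the in/out byte-count convention is taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t SketchDomainID; /* stand-in for CUpti_EventDomainID */

/*
 * Hypothetical stand-in for an enumeration call with the documented contract:
 * on entry *arraySizeBytes is the buffer capacity, on return it holds the
 * number of bytes written. IDs that do not fit are not returned.
 */
static int sketch_enum_domains(size_t *arraySizeBytes, SketchDomainID *domainArray)
{
    static const SketchDomainID available[] = { 10, 11, 12 }; /* made-up IDs */

    if (arraySizeBytes == NULL || domainArray == NULL) {
        return 1; /* models an invalid-parameter result */
    }
    size_t fit   = *arraySizeBytes / sizeof(SketchDomainID);
    size_t total = sizeof available / sizeof available[0];
    size_t n     = fit < total ? fit : total;

    memcpy(domainArray, available, n * sizeof(SketchDomainID));
    *arraySizeBytes = n * sizeof(SketchDomainID);
    return 0;
}

/* Caller side: size the request from the element capacity, then convert the
 * returned byte count back into an element count. */
static size_t enumerate_domains(SketchDomainID *ids, uint32_t capacity)
{
    assert(ids != NULL);
    size_t bytes = (size_t)capacity * sizeof(SketchDomainID);
    if (sketch_enum_domains(&bytes, ids) != 0) {
        return 0;
    }
    return bytes / sizeof(SketchDomainID);
}
```

With the real API, the capacity would normally come from the matching count query so that no domain IDs are dropped.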
36. ared memory reserved for the kernel, in bytes. uint64_t CUpti_ActivityKernel::end: The end timestamp for the kernel execution, in ns. int32_t CUpti_ActivityKernel::gridX: The X-dimension grid size for the kernel. int32_t CUpti_ActivityKernel::gridY: The Y-dimension grid size for the kernel. int32_t CUpti_ActivityKernel::gridZ: The Z-dimension grid size for the kernel. CUpti_ActivityKind CUpti_ActivityKernel::kind: The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL. uint32_t CUpti_ActivityKernel::localMemoryPerThread: The amount of local memory reserved for each thread, in bytes. uint32_t CUpti_ActivityKernel::localMemoryTotal: The total amount of local memory reserved for the kernel, in bytes. const char *CUpti_ActivityKernel::name: The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified. uint32_t CUpti_ActivityKernel::pad: Undefined. Reserved for internal use. uint16_t CUpti_ActivityKernel::registersPerThread: The number of registers required for each thread executing the kernel. void *CUpti_ActivityKernel::reserved0: Undefined. Reserved for internal use. uint32_t CUpti_ActivityKernel::runtimeCorrelationId: The runtime correlation ID of the kernel. Each kernel execution is assigned a unique runtime correlation ID that is identical
37. at implement the CUPTI Activity API. Define Documentation: #define CUPTI_SOURCE_LOCATOR_ID_UNKNOWN 0: The source locator ID that indicates an unknown source location. There is not an actual CUpti_ActivitySourceLocator object corresponding to this value. Enumeration Type Documentation: enum CUpti_ActivityComputeApiKind: Enumerator: CUPTI_ACTIVITY_COMPUTE_API_UNKNOWN: The compute API is not known. CUPTI_ACTIVITY_COMPUTE_CUDA: The compute APIs are for CUDA. enum CUpti_ActivityFlag: Activity record flags. Flags can be combined by bitwise OR to associate multiple flags with an activity record. Each flag is specific to a certain activity kind, as noted below. Enumerator: CUPTI_ACTIVITY_FLAG_NONE: Indicates the activity record has no flags. CUPTI_ACTIVITY_FLAG_DEVICE_CONCURRENT_KERNELS: Indicates the activity represents a device that supports concurrent kernel execution. Valid for CUPTI_ACTIVITY_KIND_DEVICE. CUPTI_ACTIVITY_FLAG_MEMCPY_ASYNC: Indicates the activity represents an asynchronous memcpy operation. Valid for CUPTI_ACTIVITY_KIND_MEMCPY. CUPTI_ACTIVITY_FLAG_MARKER_INSTANTANEOUS: Indicates the activity represents an instantaneous marker. Valid for CUPTI_ACTIVITY_KIND_MARKER. CUPTI_ACTIVITY_FLAG_MARKER_START: Indicates the activity represents a region start marker. Valid for CUPTI_ACTIVITY_KIND_MARKER. CUPTI_ACTIVITY_FLAG_MARKER_END: Indicates the activity represents
38. cess, CUpti_ActivityBranch, CUpti_ActivityOverhead. Enumerator: CUPTI_ACTIVITY_KIND_INVALID: The activity record is invalid. CUPTI_ACTIVITY_KIND_MEMCPY: A host<->host, host<->device, or device<->device memory copy. The corresponding activity record structure is CUpti_ActivityMemcpy. CUPTI_ACTIVITY_KIND_MEMSET: A memory set executing on the GPU. The corresponding activity record structure is CUpti_ActivityMemset. CUPTI_ACTIVITY_KIND_KERNEL: A kernel executing on the GPU. The corresponding activity record structure is CUpti_ActivityKernel. CUPTI_ACTIVITY_KIND_DRIVER: A CUDA driver API function execution. The corresponding activity record structure is CUpti_ActivityAPI. CUPTI_ACTIVITY_KIND_RUNTIME: A CUDA runtime API function execution. The corresponding activity record structure is CUpti_ActivityAPI. CUPTI_ACTIVITY_KIND_EVENT: An event value. The corresponding activity record structure is CUpti_ActivityEvent. CUPTI_ACTIVITY_KIND_METRIC: A metric value. The corresponding activity record structure is CUpti_ActivityMetric. CUPTI_ACTIVITY_KIND_DEVICE: Information about a device. The corresponding activity record structure is CUpti_ActivityDevice. CUPTI_ACTIVITY_KIND_CONTEXT: Information about a context. The corresponding activity record structure is CUpti_ActivityContext. CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL: A potentially concurrent kernel executing on the GPU. The corresponding activity record structure is CUpti_Ac
39. d Documentation: CUcontext CUpti_SynchronizeData::context: The context of the stream being synchronized. CUstream CUpti_SynchronizeData::stream: The stream being synchronized. CUPTI Event API. Data Structures: struct CUpti_EventGroupSet: A set of event groups. struct CUpti_EventGroupSets: A set of event group sets. Defines: #define CUPTI_EVENT_OVERFLOW ((uint64_t)0xFFFFFFFFFFFFFFFFULL): The overflow value for a CUPTI event. Typedefs: typedef uint32_t CUpti_EventDomainID: ID for an event domain. typedef void *CUpti_EventGroup: A group of events. typedef uint32_t CUpti_EventID: ID for an event. Enumerations: enum CUpti_DeviceAttribute { CUPTI_DEVICE_ATTR_MAX_EVENT_ID = 1, CUPTI_DEVICE_ATTR_MAX_EVENT_DOMAIN_ID = 2, CUPTI_DEVICE_ATTR_GLOBAL_MEMORY_BANDWIDTH = 3, CUPTI_DEVICE_ATTR_INSTRUCTION_PER_CYCLE = 4, CUPTI_DEVICE_ATTR_INSTRUCTION_THROUGHPUT_SINGLE_PRECISION }: Device attributes. enum CUpti_EventAttribute { CUPTI_EVENT_ATTR_NAME = 0, CUPTI_EVENT_ATTR_SHORT_DESCRIPTION = 1, CUPTI_EVENT_ATTR_LONG_DESCRIPTION = 2, CUPTI_EVENT_ATTR_CATEGORY = 3 }: Event attributes. enum CUpti_EventCategory { CUPTI_EVENT_CATEGORY_INSTRUCTION = 0, CUPTI_EVENT_CATEGORY_MEMORY = 1, CUPTI_EVENT_CATEGORY_CACHE = 2, CUPTI_EVENT_CATEGORY_PROFILE_TRIGGER = 3 }: An event category
40. driver API functions. CUPTI_CB_DOMAIN_RUNTIME_API: Domain containing callback points for all runtime API functions. CUPTI_CB_DOMAIN_RESOURCE: Domain containing callback points for CUDA resource tracking. CUPTI_CB_DOMAIN_SYNCHRONIZE: Domain containing callback points for CUDA synchronization. CUPTI_CB_DOMAIN_NVTX: Domain containing callback points for NVTX API functions. enum CUpti_CallbackIdResource: Callback IDs for resource domain, CUPTI_CB_DOMAIN_RESOURCE. This value is communicated to the callback function via the cbid parameter. Enumerator: CUPTI_CBID_RESOURCE_INVALID: Invalid resource callback ID. CUPTI_CBID_RESOURCE_CONTEXT_CREATED: A new context has been created. CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING: A context is about to be destroyed. CUPTI_CBID_RESOURCE_STREAM_CREATED: A new stream has been created. CUPTI_CBID_RESOURCE_STREAM_DESTROY_STARTING: A stream is about to be destroyed. enum CUpti_CallbackIdSync: Callback IDs for synchronization domain, CUPTI_CB_DOMAIN_SYNCHRONIZE. This value is communicated to the callback function via the cbid parameter. Enumerator: CUPTI_CBID_SYNCHRONIZE_INVALID: Invalid synchronize callback ID. CUPTI_CBID_SYNCHRONIZE_STREAM_SYNCHRONIZED: Stream synchronization has completed for the stream. CUPTI_CBID_SYNCHRONIZE_CONTEXT_SYNCHRONIZED: Context synchronization has completed for the context. Function Documenta
41. e: The core clock rate of the device, in kHz. CUpti_ActivityFlag CUpti_ActivityDevice::flags: The flags associated with the device. See also: CUpti_ActivityFlag. uint64_t CUpti_ActivityDevice::globalMemoryBandwidth: The global memory bandwidth available on the device, in kBytes/sec. uint64_t CUpti_ActivityDevice::globalMemorySize: The amount of global memory on the device, in bytes. uint32_t CUpti_ActivityDevice::id: The device ID. CUpti_ActivityKind CUpti_ActivityDevice::kind: The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE. uint32_t CUpti_ActivityDevice::l2CacheSize: The size of the L2 cache on the device, in bytes. uint32_t CUpti_ActivityDevice::maxBlockDimX: Maximum allowed X dimension for a block. uint32_t CUpti_ActivityDevice::maxBlockDimY: Maximum allowed Y dimension for a block. uint32_t CUpti_ActivityDevice::maxBlockDimZ: Maximum allowed Z dimension for a block. uint32_t CUpti_ActivityDevice::maxBlocksPerMultiprocessor: Maximum number of blocks that can be present on a multiprocessor at any given time. uint32_t CUpti_ActivityDevice::maxGridDimX: Maximum allowed X dimension for a grid. uint32_t CUpti_ActivityDevice::maxGridDimY: Maximum allowed Y dimension for a grid. uint32_t CUpti_ActivityDevice::maxGridDimZ: Maximum allowed Z dimension for a grid. uint32_t CUpti_Ac
42. emory copy is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the memory copy. uint32_t CUpti_ActivityMemcpy::deviceId: The ID of the device where the memory copy is occurring. uint8_t CUpti_ActivityMemcpy::dstKind: The destination memory kind written by the memory copy, stored as a byte to reduce record size. See also: CUpti_ActivityMemoryKind. uint64_t CUpti_ActivityMemcpy::end: The end timestamp for the memory copy, in ns. uint8_t CUpti_ActivityMemcpy::flags: The flags associated with the memory copy. See also: CUpti_ActivityFlag. CUpti_ActivityKind CUpti_ActivityMemcpy::kind: The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMCPY. void *CUpti_ActivityMemcpy::reserved0: Undefined. Reserved for internal use. uint32_t CUpti_ActivityMemcpy::runtimeCorrelationId: The runtime correlation ID of the memory copy. Each memory copy is assigned a unique runtime correlation ID that is identical to the correlation ID in the runtime API activity record that launched the memory copy. uint8_t CUpti_ActivityMemcpy::srcKind: The source memory kind read by the memory copy, stored as a byte to reduce record size. See also: CUpti_ActivityMemoryKind. uint64_t CUpti_ActivityMemcpy::start: The start timestamp for the memory copy, in ns. uint32_t CUpti_ActivityMemcpy::streamId: The ID of the
43. ent does not belong to the same event domain as the events that are already in the event group; device limitations on the events that can belong to the same group; the event group is full. Note: Thread-safety: this function is thread safe. Parameters: eventGroup: The event group. event: The event to add to the group. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_EVENT_ID; CUPTI_ERROR_OUT_OF_MEMORY; CUPTI_ERROR_INVALID_OPERATION if eventGroup is enabled; CUPTI_ERROR_NOT_COMPATIBLE if event belongs to a different event domain than the events already in eventGroup, or if a device limitation prevents event from being collected at the same time as the events already in eventGroup; CUPTI_ERROR_MAX_LIMIT_REACHED if eventGroup is full; CUPTI_ERROR_INVALID_PARAMETER if eventGroup is NULL. CUptiResult cuptiEventGroupCreate(CUcontext context, CUpti_EventGroup *eventGroup, uint32_t flags): Creates a new event group for context and returns the new group in *eventGroup. Note: flags are reserved for future use and should be set to zero. Thread-safety: this function is thread safe. Parameters: context: The context for the event group. eventGroup: Returns the new event group. flags: Reserved, must be zero. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_CONTEXT; CUPTI_ERROR_OUT_OF_MEMORY; CUPTI_ERROR_INVALID_PARAMETER if
44. es. Enumerations: enum CUptiResult { CUPTI_SUCCESS = 0, CUPTI_ERROR_INVALID_PARAMETER = 1, CUPTI_ERROR_INVALID_DEVICE = 2, CUPTI_ERROR_INVALID_CONTEXT = 3, CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID = 4, CUPTI_ERROR_INVALID_EVENT_ID = 5, CUPTI_ERROR_INVALID_EVENT_NAME = 6, CUPTI_ERROR_INVALID_OPERATION = 7, CUPTI_ERROR_OUT_OF_MEMORY = 8, CUPTI_ERROR_HARDWARE = 9, CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT = 10, CUPTI_ERROR_NOT_IMPLEMENTED = 11, CUPTI_ERROR_MAX_LIMIT_REACHED = 12, CUPTI_ERROR_NOT_READY = 13, CUPTI_ERROR_NOT_COMPATIBLE = 14, CUPTI_ERROR_NOT_INITIALIZED = 15, CUPTI_ERROR_INVALID_METRIC_ID = 16, CUPTI_ERROR_INVALID_METRIC_NAME = 17, CUPTI_ERROR_QUEUE_EMPTY = 18, CUPTI_ERROR_INVALID_HANDLE = 19, CUPTI_ERROR_INVALID_STREAM = 20, CUPTI_ERROR_INVALID_KIND = 21, CUPTI_ERROR_INVALID_EVENT_VALUE = 22, CUPTI_ERROR_DISABLED = 23, CUPTI_ERROR_INVALID_MODULE = 24, CUPTI_ERROR_UNKNOWN = 999 }: CUPTI result codes. Functions: CUptiResult cuptiGetResultString(CUptiResult result, const char **str): Get the descriptive string for a CUptiResult. Detailed Description: Error and result codes returned by CUPTI functions. Enumeration Type Documentation: enum CUptiResult: Error and result codes returned by CUPTI functions. Enumerator: CUPTI_SUCCESS: No error. CUPTI_ERROR_INVALID_PARAMETER: One or more of the parameters is invalid. CUPTI_ERROR_INVALID_D
45. es: The size of eventIdArray in bytes. eventIdArray: The event IDs required to calculate metric. eventValueArraySizeBytes: The size of eventValueArray in bytes. eventValueArray: The normalized event values required to calculate metric. The values must be ordered to match the order of events in eventIdArray. timeDuration: The duration over which the events were collected, in ns. metricValue: Returns the value for the metric. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_METRIC_ID; CUPTI_ERROR_INVALID_OPERATION; CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT if the eventIdArray does not contain all the events needed for metric; CUPTI_ERROR_INVALID_EVENT_VALUE if any of the event values required for the metric is CUPTI_EVENT_OVERFLOW; CUPTI_ERROR_NOT_COMPATIBLE if the computed metric value cannot be represented in the metric's value type (for example, if the metric value type is unsigned and the computed metric value is negative); CUPTI_ERROR_INVALID_PARAMETER if metricValue, eventIdArray, or eventValueArray is NULL. Notice: ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, "MATERIALS") ARE BEING PROVIDED "AS IS." NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANT
46. eters: event: ID of the event. attrib: The event attribute to read. valueSize: The size of the value buffer in bytes; returns the number of bytes written to value. value: Returns the attribute's value. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_EVENT_ID; CUPTI_ERROR_INVALID_PARAMETER if valueSize or value is NULL, or if attrib is not an event attribute; CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT: for non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value. CUptiResult cuptiEventGetIdFromName(CUdevice device, const char *eventName, CUpti_EventID *event): Find an event by name and return the event ID in *event. Note: Thread-safety: this function is thread safe. Parameters: device: The CUDA device. eventName: The name of the event to find. event: Returns the ID of the found event, or undefined if unable to find the event. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_DEVICE; CUPTI_ERROR_INVALID_EVENT_NAME if unable to find an event with name eventName (in this case *event is undefined); CUPTI_ERROR_INVALID_PARAMETER if eventName or event are NULL. CUptiResult cuptiEventGroupAddEvent(CUpti_EventGroup eventGroup, CUpti_EventID event): Add an event to an event group. Adding the event can fail for a number of reasons: The event group is enabled. The ev
47. etrics for a device. CUptiResult cuptiDeviceGetNumMetrics(CUdevice device, uint32_t *numMetrics): Get the number of metrics for a device. CUptiResult cuptiEnumMetrics(size_t *arraySizeBytes, CUpti_MetricID *metricArray): Get all the metrics available on any device. CUptiResult cuptiGetNumMetrics(uint32_t *numMetrics): Get the total number of metrics available on any device. CUptiResult cuptiMetricCreateEventGroupSets(CUcontext context, size_t metricIdArraySizeBytes, CUpti_MetricID *metricIdArray, CUpti_EventGroupSets **eventGroupPasses): For a set of metrics, get the grouping that indicates the number of passes and the event groups necessary to collect the events required for those metrics. CUptiResult cuptiMetricEnumEvents(CUpti_MetricID metric, size_t *eventIdArraySizeBytes, CUpti_EventID *eventIdArray): Get the events required to calculate a metric. CUptiResult cuptiMetricGetAttribute(CUpti_MetricID metric, CUpti_MetricAttribute attrib, size_t *valueSize, void *value): Get a metric attribute. CUptiResult cuptiMetricGetIdFromName(CUdevice device, const char *metricName, CUpti_MetricID *metric): Find a metric by name. CUptiResult cuptiMetricGetNumEvents(CUpti_MetricID metric, uint32_t *numEvents): Get the number of events required to calculate a metric. CUptiResult cuptiMetricGetValue(CUdevice device,
48. eventGroup is NULL. CUptiResult cuptiEventGroupDestroy(CUpti_EventGroup eventGroup): Destroy an eventGroup and free its resources. An event group cannot be destroyed if it is enabled. Note: Thread-safety: this function is thread safe. Parameters: eventGroup: The event group to destroy. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_OPERATION if the event group is enabled; CUPTI_ERROR_INVALID_PARAMETER if eventGroup is NULL. CUptiResult cuptiEventGroupDisable(CUpti_EventGroup eventGroup): Disable an event group. Disabling an event group stops collection of events contained in the group. Note: Thread-safety: this function is thread safe. Parameters: eventGroup: The event group. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_HARDWARE; CUPTI_ERROR_INVALID_PARAMETER if eventGroup is NULL. CUptiResult cuptiEventGroupEnable(CUpti_EventGroup eventGroup): Enable an event group. Enabling an event group zeros the value of all the events in the group and then starts collection of those events. Note: Thread-safety: this function is thread safe. Parameters: eventGroup: The event group. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_HARDWARE; CUPTI_ERROR_NOT_READY if eventGroup does not contain any events; CUPTI_ERROR_NOT_COMPATIBLE if eventGroup cannot be enabled due to other already enabled event
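The enable, disable, and destroy rules in this excerpt amount to a small state machine. The sketch below is a toy model of those rules, useful for reasoning about call order. Every `Model`/`model_` name is hypothetical, the capacity of four events is arbitrary, and the numeric result values are copied from the result-code list in this guide. The model deliberately ignores event domains, hardware errors, and initialization state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { MODEL_MAX_EVENTS = 4 }; /* arbitrary capacity chosen for this sketch */

/* Subset of result codes, numbered as in the guide's CUptiResult list. */
typedef enum {
    MODEL_SUCCESS           = 0,
    MODEL_INVALID_PARAMETER = 1,
    MODEL_INVALID_OPERATION = 7,
    MODEL_MAX_LIMIT_REACHED = 12,
    MODEL_NOT_READY         = 13
} ModelResult;

typedef struct {
    uint32_t events[MODEL_MAX_EVENTS];
    uint32_t numEvents;
    bool     enabled;
} ModelEventGroup;

/* Adding is rejected while the group is enabled or when it is full. */
static ModelResult model_add_event(ModelEventGroup *g, uint32_t event)
{
    if (g == NULL) return MODEL_INVALID_PARAMETER;
    if (g->enabled) return MODEL_INVALID_OPERATION;
    if (g->numEvents == MODEL_MAX_EVENTS) return MODEL_MAX_LIMIT_REACHED;
    g->events[g->numEvents++] = event;
    return MODEL_SUCCESS;
}

/* Enabling requires at least one event in the group. */
static ModelResult model_enable(ModelEventGroup *g)
{
    if (g == NULL) return MODEL_INVALID_PARAMETER;
    if (g->numEvents == 0) return MODEL_NOT_READY;
    g->enabled = true;
    return MODEL_SUCCESS;
}

static ModelResult model_disable(ModelEventGroup *g)
{
    if (g == NULL) return MODEL_INVALID_PARAMETER;
    g->enabled = false;
    return MODEL_SUCCESS;
}

/* An enabled group cannot be destroyed. */
static ModelResult model_destroy(ModelEventGroup *g)
{
    if (g == NULL) return MODEL_INVALID_PARAMETER;
    if (g->enabled) return MODEL_INVALID_OPERATION;
    g->numEvents = 0;
    return MODEL_SUCCESS;
}
```

The practical takeaway is the ordering: add events, enable, collect, disable, then destroy.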
49. executing

CUpti_ActivityDevice Type Reference
The activity record for a device.

Data Fields
  uint32_t computeCapabilityMajor
  uint32_t computeCapabilityMinor
  uint32_t constantMemorySize
  uint32_t coreClockRate
  CUpti_ActivityFlag flags
  uint64_t globalMemoryBandwidth
  uint64_t globalMemorySize
  uint32_t id
  CUpti_ActivityKind kind
  uint32_t l2CacheSize
  uint32_t maxBlockDimX
  uint32_t maxBlockDimY
  uint32_t maxBlockDimZ
  uint32_t maxBlocksPerMultiprocessor
  uint32_t maxGridDimX
  uint32_t maxGridDimY
  uint32_t maxGridDimZ
  uint32_t maxIPC
  uint32_t maxRegistersPerBlock
  uint32_t maxSharedMemoryPerBlock
  uint32_t maxThreadsPerBlock
  uint32_t maxWarpsPerMultiprocessor
  const char *name
  uint32_t numMemcpyEngines
  uint32_t numMultiprocessors
  uint32_t numThreadsPerWarp

Detailed Description
This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE).

Field Documentation
uint32_t CUpti_ActivityDevice::computeCapabilityMajor
  Compute capability for the device, major number.
uint32_t CUpti_ActivityDevice::computeCapabilityMinor
  Compute capability for the device, minor number.
uint32_t CUpti_ActivityDevice::constantMemorySize
  The amount of constant memory on the device, in bytes.
uint32_t CUpti_ActivityDevice::coreClockRat
50. numEvents * sizeof(CUpti_EventID) or all events will not be returned. The value returned in *eventIdArraySizeBytes contains the number of bytes returned in eventIdArray.
Parameters:
  metric: ID of the metric
  eventIdArraySizeBytes: The size of eventIdArray in bytes, and returns the number of bytes written to eventIdArray
  eventIdArray: Returns the IDs of the events required to calculate metric
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_METRIC_ID
  CUPTI_ERROR_INVALID_PARAMETER if eventIdArraySizeBytes or eventIdArray are NULL

CUptiResult cuptiMetricGetAttribute(CUpti_MetricID metric, CUpti_MetricAttribute attrib, size_t *valueSize, void *value)
Returns a metric attribute in *value. The size of the value buffer is given by *valueSize. The value returned in *valueSize contains the number of bytes returned in value. If the attribute value is a c-string that is longer than *valueSize, then only the first *valueSize characters will be returned and there will be no terminating null byte.
Parameters:
  metric: ID of the metric
  attrib: The metric attribute to read
  valueSize: The size of the value buffer in bytes, and returns the number of bytes written to value
  value: Returns the attribute's value
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_METRIC_ID
  CUPTI_ERROR_INVALID_PARAMETER if valueSize or value is NULL, or if
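Because a c-string attribute longer than the buffer comes back without a terminating null byte, a caller may want to terminate the buffer itself before treating it as a string. The following is a minimal sketch of that idea in plain C. The helper name terminate_attr_string and its parameters are hypothetical and not part of CUPTI; `written` stands for the byte count reported back through valueSize.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a CUPTI function): `buf` has room for `capacity`
 * bytes and an attribute query reported `written` bytes stored in it.
 * Ensures the buffer holds a valid C string and returns its length. */
static size_t terminate_attr_string(char *buf, size_t capacity, size_t written)
{
    assert(buf != NULL && capacity > 0);

    /* If the value filled the whole buffer, give up the last character
     * so that there is room for the terminator. */
    if (written >= capacity) {
        written = capacity - 1;
    }

    /* The value may already contain its own terminator. */
    const char *nul = memchr(buf, '\0', written);
    if (nul != NULL) {
        return (size_t)(nul - buf);
    }

    buf[written] = '\0';
    return written;
}
```

An alternative with the same effect is to pass one byte less than the real buffer size as the value size, so that a spare byte is always available for the terminator.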
51. le queues. When a new activity record needs to be recorded, CUPTI searches for a non-empty queue to hold the record in this order: 1) the appropriate stream queue, 2) the appropriate context queue. If the search does not find any queue with a buffer, then the activity record is dropped. If the search finds a queue containing a buffer, but that buffer is full, then the activity record is dropped and the dropped record count for the queue is incremented. If the search finds a queue containing a buffer with space available to hold the record, then the record is recorded in the buffer.

At a minimum, one or more buffers must be queued in the global queue and context queue at all times to avoid dropping activity records. Global queue will not store any activity records for gpu activity (kernel, memcpy, memset). It is also necessary to enqueue at least one buffer in the context queue of each context as it is created. The stream queues are optional and can be used to reduce or eliminate application perturbations caused by the need to process or save the activity records returned in the buffers. For example, if a stream queue is used, that queue can be flushed when the stream is synchronized.

Parameters:
  context: The context, or NULL to enqueue on the global queue
  streamId: The stream ID
  buffer: The pointer to user supplied buffer for storing activity records. The buffer must be at least 8 byte aligned, and the size of the buffer must be at least 1024 bytes
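The search order and drop rules described above can be illustrated with a small model. This is only a sketch of the documented behavior, not CUPTI's implementation; the ModelQueue type, the RecordOutcome enum and model_record are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one activity queue (not a CUPTI type). */
typedef struct {
    int    has_buffer;  /* non-zero if a buffer is enqueued */
    size_t capacity;    /* size of the head buffer in bytes */
    size_t used;        /* bytes already occupied by records */
    size_t dropped;     /* dropped record count for this queue */
} ModelQueue;

typedef enum {
    REC_STORED,           /* record written into a buffer */
    REC_DROPPED_FULL,     /* queue found, but its buffer was full */
    REC_DROPPED_NO_QUEUE  /* no queue with a buffer was found */
} RecordOutcome;

/* Search the stream queue first, then the context queue. The first queue
 * that holds a buffer decides the outcome. */
static RecordOutcome model_record(ModelQueue *stream_q, ModelQueue *context_q,
                                  size_t record_size)
{
    ModelQueue *order[2] = { stream_q, context_q };
    assert(record_size > 0);

    for (int i = 0; i < 2; ++i) {
        ModelQueue *q = order[i];
        if (q == NULL || !q->has_buffer) {
            continue;  /* queue is empty: keep searching */
        }
        if (q->capacity - q->used < record_size) {
            q->dropped++;  /* buffer full: drop and count */
            return REC_DROPPED_FULL;
        }
        q->used += record_size;
        return REC_STORED;
    }
    return REC_DROPPED_NO_QUEUE;  /* dropped without being counted */
}
```

In this model a record that finds no queue at all is lost without incrementing any counter, which is why keeping a buffer in each context queue matters.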
52. llback
  cbid: The ID of the callback
Return values:
  CUPTI_SUCCESS on success
  CUPTI_ERROR_NOT_INITIALIZED if unable to initialize CUPTI
  CUPTI_ERROR_INVALID_PARAMETER if subscriber, domain or cbid is invalid

CUptiResult cuptiEnableDomain(uint32_t enable, CUpti_SubscriberHandle subscriber, CUpti_CallbackDomain domain)
Enable or disable all callbacks for a specific domain.
Note: Thread-safety: a subscriber must serialize access to cuptiGetCallbackState, cuptiEnableCallback, cuptiEnableDomain, and cuptiEnableAllDomains. For example, if cuptiGetCallbackState(sub, d) and cuptiEnableDomain(sub, d) are called concurrently, the results are undefined.
Parameters:
  enable: New enable state for all callbacks in the domain. Zero disables all callbacks, non-zero enables all callbacks
  subscriber: Handle to callback subscription
  domain: The domain of the callback
Return values:
  CUPTI_SUCCESS on success
  CUPTI_ERROR_NOT_INITIALIZED if unable to initialize CUPTI
  CUPTI_ERROR_INVALID_PARAMETER if subscriber or domain is invalid

CUptiResult cuptiGetCallbackName(CUpti_CallbackDomain domain, uint32_t cbid, const char **name)
Returns a pointer to the name c-string in **name.
Note: Names are available only for the DRIVER and RUNTIME domains.
Parameters:
  domain: The domain of the callback
  cbid: The ID of the callback
53. llection mode
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_CONTEXT

CUPTI Metric API

Data Structures
union CUpti_MetricValue
  A metric value.

Typedefs
typedef uint32_t CUpti_MetricID
  ID for a metric.

Enumerations
enum CUpti_MetricAttribute { CUPTI_METRIC_ATTR_NAME = 0, CUPTI_METRIC_ATTR_SHORT_DESCRIPTION = 1, CUPTI_METRIC_ATTR_LONG_DESCRIPTION = 2, CUPTI_METRIC_ATTR_CATEGORY = 3, CUPTI_METRIC_ATTR_VALUE_KIND = 4, CUPTI_METRIC_ATTR_EVALUATION_MODE = 5 }
  Metric attributes.
enum CUpti_MetricCategory { CUPTI_METRIC_CATEGORY_MEMORY = 0, CUPTI_METRIC_CATEGORY_INSTRUCTION = 1, CUPTI_METRIC_CATEGORY_MULTIPROCESSOR = 2, CUPTI_METRIC_CATEGORY_CACHE = 3, CUPTI_METRIC_CATEGORY_TEXTURE = 4 }
  A metric category.
enum CUpti_MetricEvaluationMode { CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE = 1, CUPTI_METRIC_EVALUATION_MODE_AGGREGATE = 1 << 1 }
  A metric evaluation mode.
enum CUpti_MetricValueKind { CUPTI_METRIC_VALUE_KIND_DOUBLE = 0, CUPTI_METRIC_VALUE_KIND_UINT64 = 1, CUPTI_METRIC_VALUE_KIND_PERCENT = 2, CUPTI_METRIC_VALUE_KIND_THROUGHPUT = 3, CUPTI_METRIC_VALUE_KIND_INT64 = 4 }
  Kinds of metric values.

Functions
CUptiResult cuptiDeviceEnumMetrics(CUdevice device, size_t *arraySizeBytes, CUpti_MetricID *metricArray)
  Get the m
54. llection of a specific kind of activity record for a context.
CUptiResult cuptiActivityEnable(CUpti_ActivityKind kind)
  Enable collection of a specific kind of activity record.
CUptiResult cuptiActivityEnableContext(CUcontext context, CUpti_ActivityKind kind)
  Enable collection of a specific kind of activity record for a context.
CUptiResult cuptiActivityEnqueueBuffer(CUcontext context, uint32_t streamId, uint8_t *buffer, size_t bufferSizeBytes)
  Queue a buffer for activity record collection.
CUptiResult cuptiActivityGetNextRecord(uint8_t *buffer, size_t validBufferSizeBytes, CUpti_Activity **record)
  Iterate over the activity records in a buffer.
CUptiResult cuptiActivityGetNumDroppedRecords(CUcontext context, uint32_t streamId, size_t *dropped)
  Get the number of activity records that were dropped from a queue because of insufficient buffer space.
CUptiResult cuptiActivityQueryBuffer(CUcontext context, uint32_t streamId, size_t *validBufferSizeBytes)
  Query the status of the buffer at the head of a queue.
CUptiResult cuptiGetDeviceId(CUcontext context, uint32_t *deviceId)
  Get the ID of a device.
CUptiResult cuptiGetStreamId(CUcontext context, CUstream stream, uint32_t *streamId)
  Get the ID of a stream.
CUptiResult cuptiGetTimestamp(uint64_t *timestamp)
  Get the CUPTI timestamp.

Detailed Description
Functions, types, and enums th
55. lueBuffer, size_t *eventIdArraySizeBytes, CUpti_EventID *eventIdArray, size_t *numEventIdsRead)
Read the values for all the events in an event group. The event values are returned in the eventValueBuffer buffer. eventValueBufferSizeBytes indicates the size of eventValueBuffer. The buffer must be at least (sizeof(uint64_t) * number of events in group) if CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES is not set on the group containing the events. The buffer must be at least (sizeof(uint64_t) * number of domain instances * number of events in group) if CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES is set on the group.

The data format returned in eventValueBuffer is:
  domain instance 0: event0 event1 ... eventN
  domain instance 1: event0 event1 ... eventN
  ...
  domain instance M: event0 event1 ... eventN

The event order in eventValueBuffer is returned in eventIdArray. The size of eventIdArray is specified in eventIdArraySizeBytes. The size should be at least (sizeof(CUpti_EventID) * number of events in group).

If any instance of any event counter overflows, the value returned for that event instance will be CUPTI_EVENT_OVERFLOW.

The only allowed value for flags is CUPTI_EVENT_READ_FLAG_NONE.

Reading events from a disabled event group is not allowed. After being read, an event's value is reset to zero.
Note: Thread-safety: this function is thread
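The instance-major layout of eventValueBuffer described above can be indexed as in the following sketch. It is plain C with no CUPTI calls; the function names are hypothetical helpers, and the buffer is assumed to have been filled already by a successful read with one row per domain instance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum buffer size in bytes for reading all events of a group.
 * Use num_instances = 1 when all-instance profiling is not set. */
static size_t event_buffer_bytes(size_t num_instances, size_t num_events)
{
    return sizeof(uint64_t) * num_instances * num_events;
}

/* Value of the event at position `event_index` (its position in the
 * returned event ID array) for domain instance `instance`. */
static uint64_t event_value_at(const uint64_t *buf, size_t num_events,
                               size_t instance, size_t event_index)
{
    assert(event_index < num_events);
    return buf[instance * num_events + event_index];
}

/* Sum of one event over all domain instances in the buffer.
 * Overflow marker values are not handled in this sketch. */
static uint64_t event_sum_across_instances(const uint64_t *buf,
                                           size_t num_instances,
                                           size_t num_events,
                                           size_t event_index)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < num_instances; ++i) {
        sum += event_value_at(buf, num_events, i, event_index);
    }
    return sum;
}
```

A real caller would also check each value against the overflow marker before summing.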
56. memory copy, indicating the source and destination targets of the copy.
enum CUpti_ActivityMemoryKind { CUPTI_ACTIVITY_MEMORY_KIND_UNKNOWN = 0, CUPTI_ACTIVITY_MEMORY_KIND_PAGEABLE = 1, CUPTI_ACTIVITY_MEMORY_KIND_PINNED = 2, CUPTI_ACTIVITY_MEMORY_KIND_DEVICE = 3, CUPTI_ACTIVITY_MEMORY_KIND_ARRAY = 4 }
  The kinds of memory accessed by a memory copy.
enum CUpti_ActivityObjectKind { CUPTI_ACTIVITY_OBJECT_UNKNOWN = 0, CUPTI_ACTIVITY_OBJECT_PROCESS = 1, CUPTI_ACTIVITY_OBJECT_THREAD = 2, CUPTI_ACTIVITY_OBJECT_DEVICE = 3, CUPTI_ACTIVITY_OBJECT_CONTEXT = 4, CUPTI_ACTIVITY_OBJECT_STREAM = 5 }
  The kinds of activity objects.
enum CUpti_ActivityOverheadKind { CUPTI_ACTIVITY_OVERHEAD_UNKNOWN = 0, CUPTI_ACTIVITY_OVERHEAD_DRIVER_COMPILER = 1, CUPTI_ACTIVITY_OVERHEAD_CUPTI_BUFFER_FLUSH = 1 << 16, CUPTI_ACTIVITY_OVERHEAD_CUPTI_INSTRUMENTATION = 2 << 16, CUPTI_ACTIVITY_OVERHEAD_CUPTI_RESOURCE = 3 << 16 }
  The kinds of activity overhead.

Functions
CUptiResult cuptiActivityDequeueBuffer(CUcontext context, uint32_t streamId, uint8_t **buffer, size_t *validBufferSizeBytes)
  Dequeue a buffer containing activity records.
CUptiResult cuptiActivityDisable(CUpti_ActivityKind kind)
  Disable collection of a specific kind of activity record.
CUptiResult cuptiActivityDisableContext(CUcontext context, CUpti_ActivityKind kind)
  Disable co
57. meters:
  subscriber: Handle to the initialized subscriber
Return values:
  CUPTI_SUCCESS on success
  CUPTI_ERROR_NOT_INITIALIZED if unable to initialize CUPTI
  CUPTI_ERROR_INVALID_PARAMETER if subscriber is NULL or not initialized

CUpti_CallbackData Type Reference
Data passed into a runtime or driver API callback function.

Data Fields
  CUpti_ApiCallbackSite callbackSite
  CUcontext context
  uint32_t contextUid
  uint64_t *correlationData
  uint32_t correlationId
  const char *functionName
  const void *functionParams
  void *functionReturnValue
  const char *symbolName

Detailed Description
Data passed into a runtime or driver API callback function as the cbdata argument to CUpti_CallbackFunc. The cbdata will be this type for domain equal to CUPTI_CB_DOMAIN_DRIVER_API or CUPTI_CB_DOMAIN_RUNTIME_API. The callback data is valid only within the invocation of the callback function that is passed the data. If you need to retain some data for use outside of the callback, you must make a copy of that data. For example, if you make a shallow copy of CUpti_CallbackData within a callback, you cannot dereference functionParams outside of that callback to access the function parameters. functionName is an exception: the string pointed to by functionName is a global constant and so may be accessed outside of the callback.

Field Documentation
CUpti_ApiCallbackSi
58. n *valueSize, then only the first *valueSize characters will be returned and there will be no terminating null byte.
Note: Thread-safety: this function is thread safe.
Parameters:
  device: The CUDA device
  eventDomain: ID of the event domain
  attrib: The event domain attribute to read
  valueSize: The size of the value buffer in bytes, and returns the number of bytes written to value
  value: Returns the attribute's value
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_DEVICE
  CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID
  CUPTI_ERROR_INVALID_PARAMETER if valueSize or value is NULL, or if attrib is not an event domain attribute
  CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value

CUptiResult cuptiDeviceGetNumEventDomains(CUdevice device, uint32_t *numDomains)
Returns the number of domains in numDomains for a device.
Note: Thread-safety: this function is thread safe.
Parameters:
  device: The CUDA device
  numDomains: Returns the number of domains
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_DEVICE
  CUPTI_ERROR_INVALID_PARAMETER if numDomains is NULL

CUptiResult cuptiDeviceGetTimestamp(CUcontext context, uint64_t *timestamp)
Returns the device timestamp in *timestamp. The timestamp is reported in nanoseconds and i
59. n and callback ID.
CUptiResult cuptiGetCallbackState(uint32_t *enable, CUpti_SubscriberHandle subscriber, CUpti_CallbackDomain domain, CUpti_CallbackId cbid)
  Get the current enabled/disabled state of a callback for a specific domain and function ID.
CUptiResult cuptiSubscribe(CUpti_SubscriberHandle *subscriber, CUpti_CallbackFunc callback, void *userdata)
  Initialize a callback subscriber with a callback function and user data.
CUptiResult cuptiSupportedDomains(size_t *domainCount, CUpti_DomainTable *domainTable)
  Get the available callback domains.
CUptiResult cuptiUnsubscribe(CUpti_SubscriberHandle subscriber)
  Unregister a callback subscriber.

Detailed Description
Functions, types, and enums that implement the CUPTI Callback API.

Typedef Documentation
typedef void(* CUpti_CallbackFunc)(void *userdata, CUpti_CallbackDomain domain, CUpti_CallbackId cbid, const void *cbdata)
Function type for a callback. The type of the data passed to the callback in cbdata depends on the domain. If domain is CUPTI_CB_DOMAIN_DRIVER_API or CUPTI_CB_DOMAIN_RUNTIME_API, the type of cbdata will be CUpti_CallbackData. If domain is CUPTI_CB_DOMAIN_RESOURCE, the type of cbdata will be CUpti_ResourceData. If domain is CUPTI_CB_DOMAIN_SYNCHRONIZE, the type of cbdata will be CUpti_SynchronizeData. If domain is CUPTI_CB_DOMAIN_NVTX, the type of cbdata will be CUpti_Nv
60. nd: The kind of activity record to stop collecting
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED

CUptiResult cuptiActivityDisableContext(CUcontext context, CUpti_ActivityKind kind)
Disable collection of a specific kind of activity record for a context. The setting made by this API supersedes the global settings for activity records. Multiple kinds can be disabled by calling this function multiple times.
Parameters:
  context: The context for which activity is to be disabled
  kind: The kind of activity record to stop collecting
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED

CUptiResult cuptiActivityEnable(CUpti_ActivityKind kind)
Enable collection of a specific kind of activity record. Multiple kinds can be enabled by calling this function multiple times. By default, all activity kinds are disabled for collection.
Parameters:
  kind: The kind of activity record to collect
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_NOT_COMPATIBLE if the activity kind cannot be enabled

CUptiResult cuptiActivityEnableContext(CUcontext context, CUpti_ActivityKind kind)
Enable collection of a specific kind of activity record for a context. The setting made by this API supersedes the global settings for activity records enabled by cuptiActivityEnable. Multiple kinds can be enabled by calling this function multiple times.
Parameters:
  context
61. ndicates the time since the device was last reset.
Note: Thread-safety: this function is thread safe.
Parameters:
  context: A context on the device from which to get the timestamp
  timestamp: Returns the device timestamp
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_CONTEXT
  CUPTI_ERROR_INVALID_PARAMETER if timestamp is NULL

CUptiResult cuptiEnumEventDomains(size_t *arraySizeBytes, CUpti_EventDomainID *domainArray)
Returns all the event domains available on any CUDA-capable device. Event domain IDs are returned in domainArray. The size of the domainArray buffer is given by *arraySizeBytes. The size of the domainArray buffer must be at least numDomains * sizeof(CUpti_EventDomainID) or all domains will not be returned. The value returned in *arraySizeBytes contains the number of bytes returned in domainArray.
Note: Thread-safety: this function is thread safe.
Parameters:
  arraySizeBytes: The size of domainArray in bytes, and returns the number of bytes written to domainArray
  domainArray: Returns all the event domains
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_INVALID_PARAMETER if arraySizeBytes or domainArray are NULL

CUptiResult cuptiEventDomainEnumEvents(CUpti_EventDomainID eventDomain, size_t *arraySizeBytes, CUpti_EventID *eventArray)
Returns the event IDs in eventArray for a domain. The size of the eventArray buffer is gi
62. onst char* CUpti_CallbackData::functionName
  Name of the runtime or driver API function which issued the callback. This string is a global constant and so may be accessed outside of the callback.
const void* CUpti_CallbackData::functionParams
  Pointer to the arguments passed to the runtime or driver API call. See generated_cuda_runtime_api_meta.h and generated_cuda_meta.h for structure definitions for the parameters for each runtime and driver API function.
void* CUpti_CallbackData::functionReturnValue
  Pointer to the return value of the runtime or driver API call. This field is only valid within the EXIT callback. For a runtime API, functionReturnValue points to a cudaError_t. For a driver API, functionReturnValue points to a CUresult.
const char* CUpti_CallbackData::symbolName
  Name of the symbol operated on by the runtime or driver API function which issued the callback. This entry is valid only for driver and runtime launch callbacks, where it returns the name of the kernel.

CUpti_ResourceData Type Reference
Data passed into a resource callback function.

Data Fields
  CUcontext context
  void *resourceDescriptor
  CUstream stream

Detailed Description
Data passed into a resource callback function as the cbdata argument to CUpti_CallbackFunc. The cbdata will be this type fo
63. pti_ActivityKind CUpti_ActivityMetric::kind
  The activity record kind, must be CUPTI_ACTIVITY_KIND_METRIC.
uint32_t CUpti_ActivityMetric::pad
  Undefined. Reserved for internal use.
CUpti_MetricValue CUpti_ActivityMetric::value
  The metric value.

CUPTI Callback API

Data Structures
struct CUpti_CallbackData
  Data passed into a runtime or driver API callback function.
struct CUpti_NvtxData
  Data passed into a NVTX callback function.
struct CUpti_ResourceData
  Data passed into a resource callback function.
struct CUpti_SynchronizeData
  Data passed into a synchronize callback function.

Typedefs
typedef void(* CUpti_CallbackFunc)(void *userdata, CUpti_CallbackDomain domain, CUpti_CallbackId cbid, const void *cbdata)
  Function type for a callback.
typedef uint32_t CUpti_CallbackId
  An ID for a driver API, runtime API, resource or synchronization callback.
typedef CUpti_CallbackDomain *CUpti_DomainTable
  Pointer to an array of callback domains.
typedef struct CUpti_Subscriber_st *CUpti_SubscriberHandle
  A callback subscriber.

Enumerations
enum CUpti_ApiCallbackSite { CUPTI_API_ENTER = 0, CUPTI_API_EXIT = 1 }
  Specifies the point in an API call that a callback is issued.
enum CUpti_CallbackDomain { CUPTI_
64. pti_EventID event, size_t *eventValueBufferSizeBytes, uint64_t *eventValueBuffer)
Read the value for an event in an event group. The event value is returned in the eventValueBuffer buffer. eventValueBufferSizeBytes indicates the size of the eventValueBuffer buffer. The buffer must be at least sizeof(uint64_t) if CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES is not set on the group containing the event. The buffer must be at least (sizeof(uint64_t) * number of domain instances) if CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES is set on the group.

If any instance of an event counter overflows, the value returned for that event instance will be CUPTI_EVENT_OVERFLOW.

The only allowed value for flags is CUPTI_EVENT_READ_FLAG_NONE.

Reading an event from a disabled event group is not allowed. After being read, an event's value is reset to zero.
Note: Thread-safety: this function is thread safe but client must guard against simultaneous destruction or modification of eventGroup (for example, client must guard against simultaneous calls to cuptiEventGroupDestroy, cuptiEventGroupAddEvent, etc.), and must guard against simultaneous destruction of the context in which eventGroup was created (for example, client must guard against simultaneous calls to cudaDeviceReset, cuCtxDestroy, etc.). If cuptiEventGroupResetAllEvents is called simultaneously with this function, then
65. r domain equal to CUPTI_CB_DOMAIN_RESOURCE. The callback data is valid only within the invocation of the callback function that is passed the data. If you need to retain some data for use outside of the callback, you must make a copy of that data.

Field Documentation
CUcontext CUpti_ResourceData::context
  For CUPTI_CBID_RESOURCE_CONTEXT_CREATED and CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING, the context being created or destroyed. For CUPTI_CBID_RESOURCE_STREAM_CREATED and CUPTI_CBID_RESOURCE_STREAM_DESTROY_STARTING, the context containing the stream being created or destroyed.
void* CUpti_ResourceData::resourceDescriptor
  Reserved for future use.
CUstream CUpti_ResourceData::stream
  For CUPTI_CBID_RESOURCE_STREAM_CREATED and CUPTI_CBID_RESOURCE_STREAM_DESTROY_STARTING, the stream being created or destroyed.

CUpti_SynchronizeData Type Reference
Data passed into a synchronize callback function.

Data Fields
  CUcontext context
  CUstream stream

Detailed Description
Data passed into a synchronize callback function as the cbdata argument to CUpti_CallbackFunc. The cbdata will be this type for domain equal to CUPTI_CB_DOMAIN_SYNCHRONIZE. The callback data is valid only within the invocation of the callback function that is passed the data. If you need to retain some data for use outside of the callback, you must make a copy of that data.

Fiel
66. returned event values are undefined.
Parameters:
  eventGroup: The event group
  flags: Flags controlling the reading mode
  event: The event to read
  eventValueBufferSizeBytes: The size of eventValueBuffer in bytes, and returns the number of bytes written to eventValueBuffer
  eventValueBuffer: Returns the event value(s)
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_EVENT_ID
  CUPTI_ERROR_HARDWARE
  CUPTI_ERROR_INVALID_OPERATION if eventGroup is disabled
  CUPTI_ERROR_INVALID_PARAMETER if eventGroup, eventValueBufferSizeBytes or eventValueBuffer is NULL

CUptiResult cuptiEventGroupRemoveAllEvents(CUpti_EventGroup eventGroup)
Remove all events from an event group. Events cannot be removed if the event group is enabled.
Note: Thread-safety: this function is thread safe.
Parameters:
  eventGroup: The event group
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_OPERATION if eventGroup is enabled
  CUPTI_ERROR_INVALID_PARAMETER if eventGroup is NULL

CUptiResult cuptiEventGroupRemoveEvent(CUpti_EventGroup eventGroup, CUpti_EventID event)
Remove an event from an event group. The event cannot be removed if the event group is enabled.
Note: Thread-safety: this function is thread safe.
Parameters:
  eventGroup: The event group
  event: The event to remove from the group
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_I
67. ry
Each event is assigned to a category that represents the general type of the event. An event's category is accessed using cuptiEventGetAttribute and the CUPTI_EVENT_ATTR_CATEGORY attribute.
Enumerator:
  CUPTI_EVENT_CATEGORY_INSTRUCTION: An instruction related event.
  CUPTI_EVENT_CATEGORY_MEMORY: A memory related event.
  CUPTI_EVENT_CATEGORY_CACHE: A cache related event.
  CUPTI_EVENT_CATEGORY_PROFILE_TRIGGER: A profile-trigger event.

enum CUpti_EventCollectionMode
The event collection mode determines the period over which the events within the enabled event groups will be collected.
Enumerator:
  CUPTI_EVENT_COLLECTION_MODE_CONTINUOUS: Events are collected for the entire duration between the cuptiEventGroupEnable and cuptiEventGroupDisable calls. This is the default mode.
  CUPTI_EVENT_COLLECTION_MODE_KERNEL: Events are collected only for the durations of kernel executions that occur between the cuptiEventGroupEnable and cuptiEventGroupDisable calls. Event collection begins when a kernel execution begins, and stops when kernel execution completes. If multiple kernel executions occur between the cuptiEventGroupEnable and cuptiEventGroupDisable calls, then the event values must be read after each kernel launch if those events need to be associated with the specific kernel launch.

enum CUpti_EventDomainAttribute
Event domain attributes. Except where noted, all the attributes can be read u
68. s written to value
  value: Returns the attribute's value
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID
  CUPTI_ERROR_INVALID_PARAMETER if valueSize or value is NULL, or if attrib is not an event domain attribute
  CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value

CUptiResult cuptiEventDomainGetNumEvents(CUpti_EventDomainID eventDomain, uint32_t *numEvents)
Returns the number of events in numEvents for a domain.
Note: Thread-safety: this function is thread safe.
Parameters:
  eventDomain: ID of the event domain
  numEvents: Returns the number of events in the domain
Return values:
  CUPTI_SUCCESS
  CUPTI_ERROR_NOT_INITIALIZED
  CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID
  CUPTI_ERROR_INVALID_PARAMETER if numEvents is NULL

CUptiResult cuptiEventGetAttribute(CUpti_EventID event, CUpti_EventAttribute attrib, size_t *valueSize, void *value)
Returns an event attribute in *value. The size of the value buffer is given by *valueSize. The value returned in *valueSize contains the number of bytes returned in value. If the attribute value is a c-string that is longer than *valueSize, then only the first *valueSize characters will be returned and there will be no terminating null byte.
Note: Thread-safety: this function is thread safe.
Param
69. sing either cuptiDeviceGetEventDomainAttribute or cuptiEventDomainGetAttribute.
Enumerator:
  CUPTI_EVENT_DOMAIN_ATTR_NAME: Event domain name. Value is a null terminated const c-string.
  CUPTI_EVENT_DOMAIN_ATTR_INSTANCE_COUNT: Number of instances of the domain for which event counts will be collected. The domain may have additional instances that cannot be profiled (see CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT). Can be read only with cuptiDeviceGetEventDomainAttribute. Value is a uint32_t.
  CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT: Total number of instances of the domain, including instances that cannot be profiled. Use CUPTI_EVENT_DOMAIN_ATTR_INSTANCE_COUNT to get the number of instances that can be profiled. Can be read only with cuptiDeviceGetEventDomainAttribute. Value is a uint32_t.

enum CUpti_EventGroupAttribute
Event group attributes. These attributes can be read using cuptiEventGroupGetAttribute. Attributes marked [rw] can also be written using cuptiEventGroupSetAttribute.
Enumerator:
  CUPTI_EVENT_GROUP_ATTR_EVENT_DOMAIN_ID: The domain to which the event group is bound. This attribute is set when the first event is added to the group. Value is a CUpti_EventDomainID.
  CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES: [rw] Profile all the instances of the domain for this event group. This feature can be used to get load balancing across all instances of a domain. V
70. stream where the memory copy is occurring.

CUpti_ActivityMemset Type Reference
The activity record for memset.

Data Fields
  uint64_t bytes
  uint32_t contextId
  uint32_t correlationId
  uint32_t deviceId
  uint64_t end
  CUpti_ActivityKind kind
  void *reserved0
  uint32_t runtimeCorrelationId
  uint64_t start
  uint32_t streamId
  uint32_t value

Detailed Description
This activity record represents a memory set operation (CUPTI_ACTIVITY_KIND_MEMSET).

Field Documentation
uint64_t CUpti_ActivityMemset::bytes
  The number of bytes being set by the memory set.
uint32_t CUpti_ActivityMemset::contextId
  The ID of the context where the memory set is occurring.
uint32_t CUpti_ActivityMemset::correlationId
  The correlation ID of the memory set. Each memory set is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the memory set.
uint32_t CUpti_ActivityMemset::deviceId
  The ID of the device where the memory set is occurring.
uint64_t CUpti_ActivityMemset::end
  The end timestamp for the memory set, in ns.
CUpti_ActivityKind CUpti_ActivityMemset::kind
  The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMSET.
void* CUpti_ActivityMemset::reserved0
  Undefined. Reserved for internal use.
uint32_t CUpti_ActivityMemset
71. t *eventValueArray, uint64_t timeDuration, CUpti_MetricValue *metricValue)
Use the events collected for a metric to calculate the metric value. Metric value evaluation depends on the evaluation mode (CUpti_MetricEvaluationMode) that the metric supports. If a metric has evaluation mode CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE, then it assumes that the input event value is for one domain instance. If a metric has evaluation mode CUPTI_METRIC_EVALUATION_MODE_AGGREGATE, it assumes that input event values are normalized to represent all domain instances on a device.

For the most accurate metric collection, the events required for the metric should be collected for all profiled domain instances. For example, to collect all instances of an event, set the CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES attribute on the group containing the event to 1. The normalized value for the event is then:

  (sum_event_values * totalInstanceCount) / instanceCount

where sum_event_values is the summation of the event values across all profiled domain instances, totalInstanceCount is obtained from querying CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT, and instanceCount is obtained from querying CUPTI_EVENT_GROUP_ATTR_INSTANCE_COUNT (or CUPTI_EVENT_DOMAIN_ATTR_INSTANCE_COUNT).

Parameters:
  device: The CUDA device that the metric is being calculated for
  metric: The metric ID
  eventIdArraySizeByt
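The normalization above is simple integer arithmetic, shown here as a minimal sketch in plain C. The function name normalize_event_value and its parameters are hypothetical; the per-instance values and the two instance counts are assumed to have been obtained already through the attribute queries named in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scale the sum over the profiled instances up to all instances:
 *   (sum_event_values * totalInstanceCount) / instanceCount
 * Integer division truncates; overflow of the product is not checked. */
static uint64_t normalize_event_value(const uint64_t *instance_values,
                                      uint32_t instance_count,
                                      uint32_t total_instance_count)
{
    assert(instance_values != NULL && instance_count > 0);

    uint64_t sum = 0;
    for (uint32_t i = 0; i < instance_count; ++i) {
        sum += instance_values[i];
    }
    return (sum * total_instance_count) / instance_count;
}
```

For example, four profiled instances reporting 100, 110, 90 and 100 on a domain with eight instances in total give (400 * 8) / 4 = 800.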
72. t32_t staticSharedMemory
  uint32_t streamId

Detailed Description
This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL).

Field Documentation
int32_t CUpti_ActivityKernel::blockX
  The X-dimension block size for the kernel.
int32_t CUpti_ActivityKernel::blockY
  The Y-dimension block size for the kernel.
int32_t CUpti_ActivityKernel::blockZ
  The Z-dimension block size for the kernel.
uint8_t CUpti_ActivityKernel::cacheConfigExecuted
  The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
uint8_t CUpti_ActivityKernel::cacheConfigRequested
  The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
uint32_t CUpti_ActivityKernel::contextId
  The ID of the context where the kernel is executing.
uint32_t CUpti_ActivityKernel::correlationId
  The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the kernel.
uint32_t CUpti_ActivityKernel::deviceId
  The ID of the device where the kernel is executing.
int32_t CUpti_ActivityKernel::dynamicSharedMemory
  The dynamic sh
73. te CUpti_CallbackData::callbackSite Point in the runtime or driver function from where the callback was issued. CUcontext CUpti_CallbackData::context Driver context current to the thread, or null if no context is current. This value can change from the entry to exit callback of a runtime API function if the runtime initializes a context. uint32_t CUpti_CallbackData::contextUid Unique ID for the CUDA context associated with the thread. The UIDs are assigned sequentially as contexts are created and are unique within a process. uint64_t * CUpti_CallbackData::correlationData Pointer to data shared between the entry and exit callbacks of a given runtime or driver API function invocation. This field can be used to pass 64-bit values from the entry callback to the corresponding exit callback. uint32_t CUpti_CallbackData::correlationId The activity record correlation ID for this callback. For a driver domain callback (i.e. domain CUPTI_CB_DOMAIN_DRIVER_API), this ID will equal the correlation ID in the CUpti_ActivityAPI record corresponding to the CUDA driver function call. For a runtime domain callback (i.e. domain CUPTI_CB_DOMAIN_RUNTIME_API), this ID will equal the correlation ID in the CUpti_ActivityAPI record corresponding to the CUDA runtime function call. Within the callback, this ID can be recorded to correlate user data with the activity record. This field is new in 4.1. c
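The use of correlationData to carry a 64-bit value from the entry callback to the exit callback can be illustrated with a small stand-alone sketch. The types `Site` and `CallbackInfo` and the function `on_callback` below are hypothetical stand-ins that mirror only the two fields discussed in item 73; they are not the CUPTI definitions, and the timestamp is passed in as a parameter rather than read from a real clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the entry/exit indicator (callbackSite). */
typedef enum { SITE_ENTER, SITE_EXIT } Site;

/* Hypothetical stand-in carrying only the fields this sketch needs. */
typedef struct {
    Site      callbackSite;
    uint64_t *correlationData; /* shared between entry and exit callbacks */
} CallbackInfo;

/* On entry, stash the current timestamp in the shared 64-bit slot.
 * On exit, read it back and return the elapsed time in nanoseconds.
 * Returns 0 for the entry callback. */
uint64_t on_callback(const CallbackInfo *info, uint64_t now_ns)
{
    assert(info != NULL && info->correlationData != NULL);
    if (info->callbackSite == SITE_ENTER) {
        *info->correlationData = now_ns;
        return 0;
    }
    return now_ns - *info->correlationData;
}
```

Because the slot belongs to a single API invocation, the entry and exit callbacks need no global table keyed by thread or call to find each other's data.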
74. tion CUptiResult cuptiEnableAllDomains(uint32_t enable, CUpti_SubscriberHandle subscriber) Enable or disable all callbacks in all domains. Note: Thread safety: a subscriber must serialize access to cuptiGetCallbackState, cuptiEnableCallback, cuptiEnableDomain, and cuptiEnableAllDomains. For example, if cuptiGetCallbackState(sub, d, *) and cuptiEnableAllDomains(sub) are called concurrently, the results are undefined. Parameters: enable: New enable state for all callbacks in all domains. Zero disables all callbacks, non-zero enables all callbacks. subscriber: Handle to callback subscription. Return values: CUPTI_SUCCESS on success; CUPTI_ERROR_NOT_INITIALIZED if unable to initialize CUPTI; CUPTI_ERROR_INVALID_PARAMETER if subscriber is invalid. CUptiResult cuptiEnableCallback(uint32_t enable, CUpti_SubscriberHandle subscriber, CUpti_CallbackDomain domain, CUpti_CallbackId cbid) Enable or disable callbacks for a subscriber for a specific domain and callback ID. Note: Thread safety: a subscriber must serialize access to cuptiGetCallbackState, cuptiEnableCallback, cuptiEnableDomain, and cuptiEnableAllDomains. For example, if cuptiGetCallbackState(sub, d, c) and cuptiEnableCallback(sub, d, c) are called concurrently, the results are undefined. Parameters: enable: New enable state for the callback. Zero disables the callback, non-zero enables the callback. subscriber: Handle to callback subscription. domain: The domain of the ca
75. tivityDevice::maxIPC The maximum instructions per cycle possible on each device multiprocessor. uint32_t CUpti_ActivityDevice::maxRegistersPerBlock Maximum number of registers that can be allocated to a block. uint32_t CUpti_ActivityDevice::maxSharedMemoryPerBlock Maximum amount of shared memory that can be assigned to a block, in bytes. uint32_t CUpti_ActivityDevice::maxThreadsPerBlock Maximum number of threads allowed in a block. uint32_t CUpti_ActivityDevice::maxWarpsPerMultiprocessor Maximum number of warps that can be present on a multiprocessor at any given time. const char * CUpti_ActivityDevice::name The device name. This name is shared across all activity records representing instances of the device, and so should not be modified. uint32_t CUpti_ActivityDevice::numMemcpyEngines Number of memory copy engines on the device. uint32_t CUpti_ActivityDevice::numMultiprocessors Number of multiprocessors on the device. uint32_t CUpti_ActivityDevice::numThreadsPerWarp The number of threads per warp on the device. CUpti_ActivityEvent Type Reference: The activity record for a CUPTI event. Data Fields: uint32_t correlationId; CUpti_EventDomainID domain; CUpti_EventID id; CUpti_ActivityKind kind; uint64_t value. Detailed Description: This activity record represents the collection of a CUPTI event value (CUPTI_ACTIVITY_KIND_EV
76. tivityKernel. CUPTI_ACTIVITY_KIND_NAME: Thread, device, context, etc. name. The corresponding activity record structure is CUpti_ActivityName. CUPTI_ACTIVITY_KIND_MARKER: Instantaneous, start, or end marker. CUPTI_ACTIVITY_KIND_MARKER_DATA: Extended, optional data about a marker. CUPTI_ACTIVITY_KIND_SOURCE_LOCATOR: Source information about source level result. The corresponding activity record structure is CUpti_ActivitySourceLocator. CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS: Results for source-level global access. The corresponding activity record structure is CUpti_ActivityGlobalAccess. CUPTI_ACTIVITY_KIND_BRANCH: Results for source-level branch. The corresponding activity record structure is CUpti_ActivityBranch. CUPTI_ACTIVITY_KIND_OVERHEAD: Overhead activity records. The corresponding activity record structure is CUpti_ActivityOverhead. enum CUpti_ActivityMemcpyKind: Each kind represents the source and destination targets of a memory copy. Targets are host, device, and array. Enumerator: CUPTI_ACTIVITY_MEMCPY_KIND_UNKNOWN: The memory copy kind is not known. CUPTI_ACTIVITY_MEMCPY_KIND_HTOD: A host to device memory copy. CUPTI_ACTIVITY_MEMCPY_KIND_DTOH: A device to host memory copy. CUPTI_ACTIVITY_MEMCPY_KIND_HTOA: A host to device array memory copy. CUPTI_ACTIVITY_MEMCPY_KIND_ATOH: A device array to host memory copy. CUPTI_ACTIVITY_MEMCPY_KIND_ATOA: A device array to device
77. tivityKind kind; uint32_t processId; uint32_t returnValue; uint64_t start; uint32_t threadId. Detailed Description: This activity record represents an invocation of a driver or runtime API (CUPTI_ACTIVITY_KIND_DRIVER and CUPTI_ACTIVITY_KIND_RUNTIME). Field Documentation: CUpti_CallbackId CUpti_ActivityAPI::cbid The ID of the driver or runtime function. uint32_t CUpti_ActivityAPI::correlationId The correlation ID of the driver or runtime CUDA function. Each function invocation is assigned a unique correlation ID that is identical to the correlation ID in the memcpy, memset, or kernel activity record that is associated with this function. uint64_t CUpti_ActivityAPI::end The end timestamp for the function, in ns. CUpti_ActivityKind CUpti_ActivityAPI::kind The activity record kind, must be CUPTI_ACTIVITY_KIND_DRIVER or CUPTI_ACTIVITY_KIND_RUNTIME. uint32_t CUpti_ActivityAPI::processId The ID of the process where the driver or runtime CUDA function is executing. uint32_t CUpti_ActivityAPI::returnValue The return value for the function. For a CUDA driver function this will be a CUresult value, and for a CUDA runtime function this will be a cudaError_t value. uint64_t CUpti_ActivityAPI::start The start timestamp for the function, in ns. uint32_t CUpti_ActivityAPI::threadId The ID of the thread where the driver or runtime CUDA function is
78. txData Parameters: userdata: User data supplied at subscription of the callback. domain: The domain of the callback. cbid: The ID of the callback. cbdata: Data passed to the callback. typedef uint32_t CUpti_CallbackId: An ID for a driver API, runtime API, resource, or synchronization callback. Within a driver API callback this should be interpreted as a CUpti_driver_api_trace_cbid value (these values are defined in cupti_driver_cbid.h). Within a runtime API callback this should be interpreted as a CUpti_runtime_api_trace_cbid value (these values are defined in cupti_runtime_cbid.h). Within a resource API callback this should be interpreted as a CUpti_CallbackIdResource value. Within a synchronize API callback this should be interpreted as a CUpti_CallbackIdSync value. Enumeration Type Documentation: enum CUpti_ApiCallbackSite: Specifies the point in an API call that a callback is issued. This value is communicated to the callback function via CUpti_CallbackData::callbackSite. Enumerator: CUPTI_API_ENTER: The callback is at the entry of the API call. CUPTI_API_EXIT: The callback is at the exit of the API call. enum CUpti_CallbackDomain: Callback domains. Each domain represents callback points for a group of related API functions or CUDA driver activity. Enumerator: CUPTI_CB_DOMAIN_INVALID: Invalid domain. CUPTI_CB_DOMAIN_DRIVER_API: Domain containing callback points for all
79. up Destroy an event group. CUptiResult cuptiEventGroupDisable(CUpti_EventGroup eventGroup) Disable an event group. CUptiResult cuptiEventGroupEnable(CUpti_EventGroup eventGroup) Enable an event group. CUptiResult cuptiEventGroupGetAttribute(CUpti_EventGroup eventGroup, CUpti_EventGroupAttribute attrib, size_t *valueSize, void *value) Read an event group attribute. CUptiResult cuptiEventGroupReadAllEvents(CUpti_EventGroup eventGroup, CUpti_ReadEventFlags flags, size_t *eventValueBufferSizeBytes, uint64_t *eventValueBuffer, size_t *eventIdArraySizeBytes, CUpti_EventID *eventIdArray, size_t *numEventIdsRead) Read the values for all the events in an event group. CUptiResult cuptiEventGroupReadEvent(CUpti_EventGroup eventGroup, CUpti_ReadEventFlags flags, CUpti_EventID event, size_t *eventValueBufferSizeBytes, uint64_t *eventValueBuffer) Read the value for an event in an event group. CUptiResult cuptiEventGroupRemoveAllEvents(CUpti_EventGroup eventGroup) Remove all events from an event group. CUptiResult cuptiEventGroupRemoveEvent(CUpti_EventGroup eventGroup, CUpti_EventID event) Remove an event from an event group. CUptiResult cuptiEventGroupResetAllEvents(CUpti_EventGroup eventGroup) Zero all the event counts in an event group. CUptiResult cuptiEventGroupSetAttribute(CUpti_EventGroup eventGroup, CUpti_Event
80. valid stream. CUPTI_ERROR_INVALID_KIND: Invalid kind. CUPTI_ERROR_INVALID_EVENT_VALUE: Invalid event value. CUPTI_ERROR_DISABLED: CUPTI is disabled due to conflicts with other enabled profilers. CUPTI_ERROR_INVALID_MODULE: Invalid module. CUPTI_ERROR_UNKNOWN: An unknown internal error has occurred. Function Documentation: CUptiResult cuptiGetResultString(CUptiResult result, const char **str) Return the descriptive string for a CUptiResult in *str. Note: Thread safety: this function is thread safe. Parameters: result: The result to get the string for. str: Returns the string. Return values: CUPTI_SUCCESS on success; CUPTI_ERROR_INVALID_PARAMETER if str is NULL or result is not a valid CUptiResult. CUPTI Activity API Data Structures: struct CUpti_Activity: The base activity record. struct CUpti_ActivityAPI: The activity record for a driver or runtime API invocation. struct CUpti_ActivityBranch: The activity record for source level result branch. struct CUpti_ActivityContext: The activity record for a context. struct CUpti_ActivityDevice: The activity record for a device. struct CUpti_ActivityEvent: The activity record for a CUPTI event. struct CUpti_ActivityGlobalAccess: The activity record for source-level global access. struct CUpti_ActivityKernel: The activity record for kernel. struct CUpti_ActivityMarker: The
81. ven by arraySizeBytes. The size of the eventArray buffer must be at least numdomainevents * sizeof(CUpti_EventID), or else all events will not be returned. The value returned in arraySizeBytes contains the number of bytes returned in eventArray. Note: Thread safety: this function is thread safe. Parameters: eventDomain: ID of the event domain. arraySizeBytes: The size of eventArray in bytes, and returns the number of bytes written to eventArray. eventArray: Returns the IDs of the events in the domain. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID; CUPTI_ERROR_INVALID_PARAMETER if arraySizeBytes or eventArray are NULL. CUptiResult cuptiEventDomainGetAttribute(CUpti_EventDomainID eventDomain, CUpti_EventDomainAttribute attrib, size_t *valueSize, void *value) Returns an event domain attribute in *value. The size of the value buffer is given by *valueSize. The value returned in *valueSize contains the number of bytes returned in value. If the attribute value is a C string that is longer than *valueSize, then only the first *valueSize characters will be returned and there will be no terminating null byte. Note: Thread safety: this function is thread safe. Parameters: eventDomain: ID of the event domain. attrib: The event domain attribute to read. valueSize: The size of the value buffer in bytes, and returns the number of byte
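Item 81 notes that a string attribute longer than the supplied buffer comes back truncated and without a terminating null byte, so the caller cannot pass the raw buffer to string functions. One defensive pattern is sketched below in plain C. The helper name `copy_attr_string` is hypothetical and the sketch does not call CUPTI; it only shows how to turn "a buffer plus the number of bytes returned" into a safely terminated C string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most `returned` bytes of a possibly
 * unterminated attribute value into dst and always terminate dst.
 * Returns the length of the resulting string (excluding the null byte). */
size_t copy_attr_string(char *dst, size_t dstSize,
                        const char *src, size_t returned)
{
    assert(dst != NULL && dstSize > 0);

    /* Leave room for the terminator we add ourselves. */
    size_t n = returned < dstSize - 1 ? returned : dstSize - 1;
    if (src == NULL || n == 0) {
        dst[0] = '\0';
        return 0;
    }

    /* If the value did fit, it may already contain a null byte: stop there. */
    const char *nul = memchr(src, '\0', n);
    if (nul != NULL) {
        n = (size_t)(nul - src);
    }

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

An alternative under the same assumption is to allocate one byte more than the size passed to the query and zero that extra byte up front, which avoids the second copy.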
82. verhead. CUPTI_ACTIVITY_OVERHEAD_CUPTI_BUFFER_FLUSH: Activity buffer flush overhead. CUPTI_ACTIVITY_OVERHEAD_CUPTI_INSTRUMENTATION: CUPTI instrumentation overhead. CUPTI_ACTIVITY_OVERHEAD_CUPTI_RESOURCE: CUPTI resource creation and destruction overhead. Function Documentation: CUptiResult cuptiActivityDequeueBuffer(CUcontext context, uint32_t streamId, uint8_t **buffer, size_t *validBufferSizeBytes) Remove the buffer from the head of the specified queue. See cuptiActivityEnqueueBuffer for a description of queues. Calling this function transfers ownership of the buffer from CUPTI. CUPTI will not add any activity records to the buffer after it is dequeued. Parameters: context: The context, or NULL to dequeue from the global queue. streamId: The stream ID. buffer: Returns the dequeued buffer. validBufferSizeBytes: Returns the number of bytes in the buffer that contain activity records. Return values: CUPTI_SUCCESS; CUPTI_ERROR_NOT_INITIALIZED; CUPTI_ERROR_INVALID_PARAMETER if buffer or validBufferSizeBytes are NULL; CUPTI_ERROR_QUEUE_EMPTY if the queue is empty (buffer returns NULL and validBufferSizeBytes returns 0). CUptiResult cuptiActivityDisable(CUpti_ActivityKind kind) Disable collection of a specific kind of activity record. Multiple kinds can be disabled by calling this function multiple times. By default all activity kinds are disabled for collection. Parameters: ki
