Mellanox GPUDirect RDMA User Manual
Rev 1.0
www.mellanox.com

NOTE: THIS HARDWARE, SOFTWARE OR TEST SUITE PRODUCT ("PRODUCT(S)") AND ITS RELATED DOCUMENTATION ARE PROVIDED BY MELLANOX TECHNOLOGIES "AS-IS" WITH ALL FAULTS OF ANY KIND AND SOLELY FOR THE PURPOSE OF AIDING THE CUSTOMER IN TESTING APPLICATIONS THAT USE THE PRODUCTS IN DESIGNATED SOLUTIONS. THE CUSTOMER'S MANUFACTURING TEST ENVIRONMENT HAS NOT MET THE STANDARDS SET BY MELLANOX TECHNOLOGIES TO FULLY QUALIFY THE PRODUCT(S) AND/OR THE SYSTEM USING IT. THEREFORE, MELLANOX TECHNOLOGIES CANNOT AND DOES NOT GUARANTEE OR WARRANT THAT THE PRODUCTS WILL OPERATE WITH THE HIGHEST QUALITY. ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT ARE DISCLAIMED. IN NO EVENT SHALL MELLANOX BE LIABLE TO CUSTOMER OR ANY THIRD PARTIES FOR ANY DIRECT, INDIRECT, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES OF ANY KIND (INCLUDING, BUT NOT LIMITED TO, PAYMENT FOR PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE), ARISING IN ANY WAY FROM THE USE OF THE PRODUCT(S) AND RELATED DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Mellanox Technologies
350 Oakmead Parkway, Suite 100
Sunnyvale, CA 94085, U.S.A.
www.mellanox.com
Tel: (408) 970-3400
Fax: (408) 970-3403

Mellanox Technologies, Ltd.
Beit Mellanox
PO Box 586, Yokneam 20692, Israel
www.mellanox.com
Tel: +972 (0)74 723 7200
Fax: +972 (0)4 959 3245

© Copyright 2014. Mellanox Technologies. All Rights Reserved.

Mellanox®, the Mellanox logo, BridgeX®, ConnectX®, Connect-IB®, CORE-Direct®, InfiniBridge®, InfiniHost®, InfiniScale®, MetroX®, MLNX-OS®, PhyX®, ScalableHPC®, SwitchX®, UFM®, Virtual Protocol Interconnect® and Voltaire® are registered trademarks of Mellanox Technologies, Ltd. ExtendX™, FabricIT™, Mellanox Open Ethernet™, Mellanox Virtual Modular Switch™, MetroDX™, TestX™ and Unbreakable-Link™ are trademarks of Mellanox Technologies, Ltd. All other trademarks are property of their respective owners.

Document Number: MLNX-15-3878
Table of Contents

1 Overview
    1.1 System Requirements
    1.2 Important Notes
2 Installing GPUDirect RDMA
3 Benchmark Tests
    3.1 Testing GPUDirect RDMA with CUDA-Enabled Benchmarks
    3.2 Running GPUDirect RDMA with MVAPICH2-GDR 2.0b
    3.3 Running GPUDirect RDMA with OpenMPI 1.7.4

List of Tables

Table 1: GPUDirect RDMA System Requirements

1 Overview

GPUDirect RDMA is an API between IB CORE and peer memory clients, such as NVIDIA Kepler class GPUs. It provides access for the HCA to read/write peer memory data buffers; as a result, it allows RDMA-based applications to use the peer device's computing power with the RDMA interconnect, without the need to copy data to host memory. This capability is supported with Mellanox ConnectX-3 VPI or Connect-IB InfiniBand adapters. It will also work seamlessly using RoCE technology with the Mellanox ConnectX-3 VPI adapters.

1.1 System Requirements

The platform and server requirements for GPUDirect RDMA are detailed in the following table.

Table 1: GPUDirect RDMA System Requirements

Platform            Type and Version
HCAs                Mellanox ConnectX-3, Mellanox ConnectX-3 Pro, Mellanox Connect-IB
GPUs                NVIDIA Tesla K-Series (K10, K20, K40)
Software/Plugins    MLNX_OFED v2.1-x.x.x or later
                    (www.mellanox.com > Products > Software > InfiniBand/VPI Drivers > Linux SW/Drivers)
                    Plugin module to enable GPUDirect RDMA
                    (www.mellanox.com > Products > Software > InfiniBand/VPI Drivers > GPUDirect RDMA)
                    NVIDIA Driver 331.20 or later
                    (http://www.nvidia.com/Download/index.aspx?lang=en-us)
                    NVIDIA CUDA Runtime and Toolkit 6.0
                    (https://developer.nvidia.com/cuda-downloads)
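To verify that a given node already satisfies the software requirements above, the installed driver versions can be queried directly. The following is a minimal sketch; which of these commands is available depends on what is installed on the node:

    # Report the installed MLNX_OFED version (ofed_info ships with MLNX_OFED)
    ofed_info -s
    # Report the GPU model and NVIDIA driver version
    nvidia-smi --query-gpu=name,driver_version --format=csv
    # Report the CUDA toolkit version (assumes nvcc is on the PATH)
    nvcc --version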
1.2 Important Notes

Once the hardware and software components are installed, it is important to check that the GPUDirect kernel module is properly loaded on each of the compute systems where you plan to run the job that requires the GPUDirect RDMA feature.

To check whether the module is loaded:

    service nv_peer_mem status

Or, for some other flavors of Linux:

    lsmod | grep nv_peer_mem

Usually this kernel module is set to load by default by the system startup service. If it is not loaded, GPUDirect RDMA will not work, which results in very high latency for message communications. You can start the module with:

    service nv_peer_mem start

Or, for some other flavors of Linux:

    modprobe nv_peer_mem

To achieve the best performance for GPUDirect RDMA, it is required that both the HCA and the GPU be physically located on the same PCIe I/O root complex. To find out about the system architecture, either review the system manual or run:

    lspci -tv
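Because the module must be present on every compute node in the job, it can save time to check all nodes up front. Below is a minimal sketch, assuming passwordless ssh and hypothetical hostnames node01 and node02; adjust the host list and the privilege escalation to your site:

    # Hypothetical pre-flight check: confirm nv_peer_mem is loaded on each node,
    # loading it where it is missing.
    for host in node01 node02; do
        ssh "$host" '
            if lsmod | grep -q nv_peer_mem; then
                echo "$(hostname): nv_peer_mem already loaded"
            else
                echo "$(hostname): loading nv_peer_mem"
                sudo modprobe nv_peer_mem
            fi
        '
    done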
2 Installing GPUDirect RDMA

To install GPUDirect RDMA (excluding Ubuntu):

    rpmbuild --rebuild <path to srpm>
    rpm -ivh <path to generated binary rpm file>

Note: On SLES OSes, add --nodeps.

To install GPUDirect RDMA on Ubuntu, copy the tarball to a temporary directory and build the Debian packages:

    tar xzf <tarball>
    cd <extracted directory>
    dpkg-buildpackage -us -uc
    dpkg -i <path to generated deb files>

Example:

    dpkg -i nvidia-peer-memory_1.0-0_all.deb
    dpkg -i nvidia-peer-memory-dkms_1.0-0_all.deb

Please make sure this kernel module is installed and loaded on each of the GPU/InfiniBand compute nodes.
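After installing the package, one quick sanity check is to confirm that the module was registered with DKMS and can be loaded by hand. This is a minimal sketch; the exact package and module names may differ slightly between distributions:

    # List DKMS-managed modules and check that the peer-memory module built
    # against the running kernel
    dkms status
    # Load the module and verify it is present
    modprobe nv_peer_mem
    lsmod | grep nv_peer_mem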
3 Benchmark Tests

3.1 Testing GPUDirect RDMA with CUDA-Enabled Benchmarks

GPUDirect RDMA can be tested by running the micro-benchmarks from Ohio State University (OSU). The OSU benchmarks 4.x and above are CUDA-enabled benchmarks that can be downloaded from:
http://mvapich.cse.ohio-state.edu/benchmarks/

When building the OSU benchmarks, verify that the proper flags are set to enable the CUDA part of the tests; otherwise, the tests will only run using host memory, which is the default.

    ./configure CC=/path/to/mpicc --enable-cuda \
        --with-cuda-include=/path/to/cuda/include \
        --with-cuda-libpath=/path/to/cuda/lib
    make
    make install
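For reference, here is the same configure line with concrete but assumed locations filled in: the CUDA toolkit under /usr/local/cuda-6.0 and the mpicc wrapper from an MVAPICH2-GDR installation. Both paths are illustrative; substitute your own:

    # Assumed paths for illustration only; adjust the CUDA and MPI locations to your system
    ./configure CC=/opt/mvapich2/gdr/2.0/gnu/bin/mpicc --enable-cuda \
        --with-cuda-include=/usr/local/cuda-6.0/include \
        --with-cuda-libpath=/usr/local/cuda-6.0/lib64
    make && make install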
3.2 Running GPUDirect RDMA with MVAPICH2-GDR 2.0b

MVAPICH2-GDR is a release of MVAPICH2 that takes advantage of the new GPUDirect RDMA technology for inter-node data movement on NVIDIA GPU clusters with Mellanox InfiniBand interconnect. MVAPICH2-GDR 2.0b can be downloaded from:
http://mvapich.cse.ohio-state.edu/download/mvapich2gdr/

Below is an example of running one of the OSU benchmarks with GPUDirect RDMA enabled:

    [gdr@ops001 ~]$ mpirun_rsh -np 2 ops001 ops002 MV2_USE_CUDA=1 MV2_USE_GPUDIRECT=1 \
        /home/gdr/osu-micro-benchmarks-4.2-mvapich2/mpi/pt2pt/osu_bw -d cuda D D
    # OSU MPI-CUDA Bandwidth Test v4.2
    # Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
    # Size          Bandwidth (MB/s)
    ...
    2097152         6372.60
    4194304         6388.63

MV2_GPUDIRECT_LIMIT is a tunable parameter that controls the message size up to which GPUDirect RDMA is used. The following runtime parameters can be used for process-to-rail binding when the system has a multi-rail configuration:

    export MV2_USE_CUDA=1
    export MV2_USE_GPUDIRECT=1
    export MV2_RAIL_SHARING_POLICY=FIXED_MAPPING
    export MV2_PROCESS_TO_RAIL_MAPPING=mlx5_0:mlx5_1
    export MV2_RAIL_SHARING_LARGE_MSG_THRESHOLD=1G
    export MV2_CPU_BINDING_LEVEL=SOCKET
    export MV2_CPU_BINDING_POLICY=SCATTER

Additional tuning parameters related to CUDA and GPUDirect RDMA, such as MV2_CUDA_BLOCK_SIZE, can be found in the README installed on the node:
/opt/mvapich2/gdr/2.0/gnu/share/doc/mvapich2-gdr-gnu-2.0/README-GDR
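Since MV2_GPUDIRECT_LIMIT is an ordinary environment variable, it can be set per run alongside the other MV2_* parameters. A minimal sketch, reusing the hosts and benchmark path from the example above; the 131072-byte value is only an illustrative choice, not a recommendation:

    # Illustrative only: raise the GPUDirect RDMA message-size limit to 128 KB for this run
    [gdr@ops001 ~]$ mpirun_rsh -np 2 ops001 ops002 \
        MV2_USE_CUDA=1 MV2_USE_GPUDIRECT=1 MV2_GPUDIRECT_LIMIT=131072 \
        /home/gdr/osu-micro-benchmarks-4.2-mvapich2/mpi/pt2pt/osu_bw -d cuda D D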
3.3 Running GPUDirect RDMA with OpenMPI 1.7.4

GPUDirect RDMA support is available in OpenMPI 1.7.4rc1. Unlike MVAPICH2-GDR, which is available in RPM format, you download the OpenMPI source code and compile it with the flags below to enable GPUDirect RDMA support:

    [co-mell1@login-sand8 ~]$ ./configure --prefix=/path/to/openmpi-1.7.4rc1/install \
        --with-wrapper-ldflags=-Wl,-rpath,/lib --disable-vt \
        --enable-orterun-prefix-by-default --disable-io-romio --enable-picky \
        --with-cuda=/usr/local/cuda-5.5 \
        --with-cuda-include=/usr/local/cuda-6.0/include \
        --with-cuda-libpath=/usr/local/cuda-6.0/lib64
    [co-mell1@login-sand8 ~]$ make
    [co-mell1@login-sand8 ~]$ make install

To run OpenMPI with the flag that enables GPUDirect RDMA:

    [gdr@jupiter001 ~]$ mpirun -mca btl_openib_want_cuda_gdr 1 -np 2 -npernode 1 \
        -x LD_LIBRARY_PATH -mca btl_openib_if_include mlx5_0:1 \
        --bind-to-core --report-bindings -mca coll_fca_enable 0 \
        -x CUDA_VISIBLE_DEVICES=0 \
        /home/co-mell1/scratch/osu-micro-benchmarks-4.2/install/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency -d cuda D D
    # OSU MPI-CUDA Latency Test v4.2
    # Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
    # Size          Latency (us)
    ...

If the flag for GPUDirect RDMA is not enabled, the same test shows much higher latency. By default in OpenMPI 1.7.4, GPUDirect RDMA is used for message sizes between 0 and 30 KB; for messages above that limit, transfers switch to asynchronous copies through host memory instead. Sometimes better application performance can be seen by adjusting that limit. Here is an example of moving the switch-over point above 64 KB:

    -mca btl_openib_cuda_rdma_limit 65537
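The limit is passed as just another -mca parameter, so it can be appended to the run command shown above. A minimal sketch reusing the same hosts, interface and benchmark path:

    # Same latency run as above, with the GPUDirect RDMA switch-over point raised above 64 KB
    mpirun -mca btl_openib_want_cuda_gdr 1 -mca btl_openib_cuda_rdma_limit 65537 \
        -np 2 -npernode 1 -x LD_LIBRARY_PATH -mca btl_openib_if_include mlx5_0:1 \
        -x CUDA_VISIBLE_DEVICES=0 \
        /home/co-mell1/scratch/osu-micro-benchmarks-4.2/install/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency -d cuda D D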