
Mellanox FCA User Manual


Contents

1. There are two options for installing the FCA Manager when using MLNX_OFED v1.5.3-3.1.0:

• From an RPM (section 3.2.1): Select this option if you want to install the FCA Manager on the machine's local disk and let the RPM package handle all post-install tasks.
• From Tarball (section 3.2.2): Select this option if you wish to install the FCA Manager in any location (user's home directory, NFS shared folder, etc.). There are a number of post-install tasks that need to be applied as root on every cluster node after you install FCA from a tarball.

Select one of the installation options according to your site's installation policy.

3.2.1 Installing the FCA Manager from RPM

To install the FCA Manager on all cluster nodes from an RPM, as root, when using MLNX_OFED v1.5.3-3.1.0:

1. Enter the following commands:
     rpm -e fca
     rpm -e openmpi
     rpm -ihv fca-2.5.x86_64.rpm
2. Set the environment variable pointing to the installed location of FCA in the user login profile:
     export FCA_MGR_HOME=/opt/mellanox/fca
3. (Optional) Configure the FCA Manager to start automatically after boot:
     /etc/init.d/fca_managerd install_service

3.2.2 Installing the FCA Manager from Tarball

To install the FCA Manager from Tarball in the shared NFS location when using MLNX_OFED v1.5.3-3.1.0:

1. Enter the following commands:
     mkdir -p /usr/local/mellanox
     cd /usr/local/mellanox
     [extraction command illegible in source]
2. Run the following post-install script on all h
2. 1.6 FCA Installation Package Content
2 Installation and Initial Configuration
  2.1 Overview of Installation and Initial Configuration
    2.1.1 Downloading the FCA Software
3 Installing FCA
  3.1 Prerequisites
  3.2 Installing the FCA Manager on a Dedicated Node
    3.2.1 Installing the FCA Manager from RPM
    3.2.2 Installing the FCA Manager from Tarball
    3.2.3 Starting the FCA Manager
4 Installing FCA MPI Support Libraries
  4.1 Building OpenMPI 1.6.x with FCA Support
  4.2 Verifying the FCA Installation
  4.3 Running MPI Jobs with FCA
5 Configuring FCA
  5.1 FCA Manager Configuration Parameters
6 FCA MPI Runtime Library Configuration Parameters
3. HCA firmware version, go to Firmware Downloads.

The minimum system requirements for installing and running FCA are listed in the following table.

NOTE: Mellanox OFED 1.5.3-3.1.0 includes FCA 2.2 and OpenMPI, which is compiled with FCA v2.2. Both packages should be removed prior to installing FCA v2.5. To remove them, run:
     rpm -e fca
     rpm -e openmpi

Table 2: System Requirements

Item | Requirement
FCA 2.5 supported switches | Mellanox IB QDR/FDR switches
Linux distributions <OS> | RHEL 6.2
Supported HCAs | Mellanox ConnectX-2 HCA with firmware version 2.9.1000 or later; Mellanox ConnectX-3 HCA with firmware version 2.10.0000 or later
Open Message Passing Interface (MPI) Project | Open MPI 1.6.3 or later
OpenFabrics Enterprise Distribution (OFED) | 1.5.3-3.1.0 or later
Root permission | The installer should have root permissions for post-installation tasks
InfiniBand Subnet Management | All InfiniBand Subnet Management based software is supported in FCA version 2.5

3.2 Installing the FCA Manager on a Dedicated Node

FCA Manager must be installed on a dedicated machine which is not a part of the cluster nodes, with only a single instance of the FCA Manager in operation per fabric.

NOTE: We recommend that the FCA Manager be installed on the same node as OpenSM.
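The firmware minimums in Table 2 are dotted version numbers, so a plain string comparison can give the wrong answer (for example, 2.9.1000 sorts after 2.10.0000 as text). The following is a minimal sketch of a numeric, field-by-field comparison. The function names `version_ge` and `min_firmware` and the sample inputs are illustrative assumptions, not part of FCA or MLNX_OFED; on a real node the installed firmware version would have to come from whatever query tool your site uses.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is >= B, comparing each
# dot-separated field numerically (hypothetical helper, not an FCA tool).
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            # Missing fields count as 0; "+ 0" forces a numeric comparison.
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# min_firmware FAMILY: print the minimum firmware listed in Table 2.
min_firmware() {
    case "$1" in
        ConnectX-2) echo "2.9.1000" ;;
        ConnectX-3) echo "2.10.0000" ;;
        *) return 1 ;;
    esac
}

# Assumed sample inputs, for illustration only.
hca="ConnectX-3"
fw="2.11.0500"
if version_ge "$fw" "$(min_firmware "$hca")"; then
    echo "$hca firmware $fw meets the FCA 2.5 minimum"
else
    echo "$hca firmware $fw is too old for FCA 2.5"
fi
```

Comparing field by field avoids depending on GNU-specific tools such as `sort -V`, which may not be present on every node.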
4. Preface

Audience

The intended audience for the Mellanox Fabric Collective Accelerator (FCA) User Manual is the MPI implementer and the network administrator responsible for managing FCA on Mellanox InfiniBand switches. It is assumed that the administrator is familiar with advanced concepts in network management.

Related Documentation

The following document is part of the library for network administrators and installers supporting the Mellanox FCA:

Document Name | Part Number
Mellanox Fabric Collective Accelerator Release Notes | DOC-00984

Typographical Conventions

Before you start using this guide, it is important to understand the terms and typographical conventions used in the documentation. The following kinds of formatting in the text identify special information:

Formatting convention | Type of Information
Special Bold | Items you must select, such as menu options, command buttons, or items in a list.
Emphasis | Used to emphasize the importance of a point or for variable expressions such as parameters.
CAPITALS | Names of keys on the keyboard, for example SHIFT, CTRL, or ALT.
KEY+KEY | Key combinations for which the user must press and hold down one key and then press another, for example CTRL+P or ALT+F4.

Document Conventions

NO
5. list of OpenMPI FCA-related parameters:
     $MPI_HOME/bin/ompi_info --param coll fca

To provide MCA parameters to the OpenMPI mpirun command, use the following format:
     $MPI_HOME/bin/mpirun -mca <param> <value>

Example: [illegible in source]

The following is a list of MCA parameters for FCA:

Parameter | Description | Default
coll_fca_priority <int> | Priority of the fca coll component | 80
coll_fca_verbose <int> | Verbose level of the fca coll component | 0
coll_fca_enable <0|1> | Enable/Disable Fabric Collective Accelerator | 1
coll_fca_spec_file <string> | Path to the FCA configuration file | $FCA_HOME/etc/fca_mpi_spec.ini
coll_fca_library_path <string> | Path to FCA runtime library | $FCA_HOME/lib/libfca.so
coll_fca_np <int> | Minimal allowed job's NP to activate FCA | 64
coll_fca_enable_barrier <0|1> | Enable/Disable FCA Barrier support | 1
coll_fca_enable_bcast <0|1> | Enable/Disable FCA Bcast support | 1
coll_fca_enable_reduce <0|1> | Enable/Disable FCA Reduce support | 1
coll_fca_enable_allreduce <0|1> | Enable/Disable FCA Allreduce support | 1
coll_fca_enable_allgather <0|1> | Enable/Disable FCA Allgather support | 1
coll_fca_enable_allgatherv <0|1> | Enable/Disable FCA Allgatherv support | 1
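The `-mca <param> <value>` format above can be assembled programmatically when several FCA parameters are set at once. The sketch below only builds and prints the command line; it does not run mpirun. The helper name `mca_args`, the chosen parameter values, and the program name `./mpi_hello_world` are illustrative assumptions; the default `MPI_HOME` mirrors the install path used in this manual's own mpirun example.

```shell
#!/bin/sh
# Fall back to an example install prefix if MPI_HOME is not already set.
MPI_HOME="${MPI_HOME:-/opt/openmpi-1.6.3}"

# mca_args name=value ...: print "-mca name value" for every argument.
# Hypothetical helper for illustration; it is not part of Open MPI or FCA.
mca_args() {
    out=""
    for pair in "$@"; do
        name=${pair%%=*}    # text before the first "="
        value=${pair#*=}    # text after the first "="
        out="$out -mca $name $value"
    done
    # Drop the leading space left by the first concatenation.
    printf '%s\n' "${out# }"
}

# Assemble (but do not execute) a command line with three FCA parameters.
cmd="$MPI_HOME/bin/mpirun $(mca_args coll_fca_enable=1 coll_fca_np=32 coll_fca_verbose=1) -np 32 ./mpi_hello_world"
echo "$cmd"
```

Printing the command first makes it easy to review the parameter set before submitting the job through a scheduler.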
6. provided parameter file fca_manager_spec.ini will be located in the path described in the following table.

Table 3: Paths for FCA Manager INI File

Installation Method | Path to fca_manager_spec.ini
From RPM | /opt/mellanox/fca/etc
From Tarball | $FCA_HOME/etc

The FCA Manager configuration file is in INI format and contains two sections: fmm and ib.

To set parameter values in the fca_manager_spec.ini file, edit the file as necessary using the following format:
     variable = value

Example:
     [fmm]
     debug_level = 5
     log_file = fmm.log

Table 4: FCA Manager INI File Parameters

The following parameters may be changed in the INI file under the fmm section.

debug_level
    Description: Verbosity level of fca manager for debugging. The debug levels are:
        0: fatal
        1: error
        2: warn
        3: info
        4: debug
        5-7: detailed debug info
    Values: integer between 0-7. The default is 3.

log_file
    Description: FCA Manager log filename. The logfile name can contain printf-like tokens which are substituted during log file creation:
        %H: hostname where FCA Manager is running
        %D: current date in format DDMMYYYY
        %T: current thread ID
    Values: string representing log file name. The default is fmm %H %D log [separators illegible in source].

log_file_max_size
7. EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Mellanox Technologies
350 Oakmead Parkway, Suite 100
Sunnyvale, CA 94085
U.S.A.
www.mellanox.com
Tel: (408) 970-3400
Fax: (408) 970-3403

Mellanox Technologies, Ltd.
Beit Mellanox
PO Box 586, Yokneam 20692
Israel
www.mellanox.com
Tel: +972 (0)74 723 7200
Fax: +972 (0)4 959 3245

© Copyright 2014. Mellanox Technologies. All Rights Reserved.

Mellanox, Mellanox logo, BridgeX, ConnectX, Connect-IB, CoolBox, CORE-Direct, InfiniBridge, InfiniHost, InfiniScale, MetroX, MLNX-OS, PhyX, ScalableHPC, SwitchX, UFM, Virtual Protocol Interconnect and Voltaire are registered trademarks of Mellanox Technologies, Ltd.

ExtendX, FabricIT, Mellanox Open Ethernet, Mellanox Virtual Modular Switch, MetroDX, TestX, Unbreakable-Link are trademarks of Mellanox Technologies, Ltd.

All other trademarks are property of their respective owners.

Contents

Revision History
Preface
1 Introduction to Mellanox Fabric Collective Accelerator
  1.1 Overview
  1.2 Supported MPI Collectives
  1.3 Supported Topologies
  1.4 Planning the Server Configuration
  1.5 FCA Software Components
8. Mellanox Technologies

Fabric Collective Accelerator (FCA) User Manual
Version 2.5
Last Modified: December 23, 2014
www.mellanox.com

NOTE: THIS HARDWARE, SOFTWARE OR TEST SUITE PRODUCT ("PRODUCT(S)") AND ITS RELATED DOCUMENTATION ARE PROVIDED BY MELLANOX TECHNOLOGIES "AS-IS" WITH ALL FAULTS OF ANY KIND AND SOLELY FOR THE PURPOSE OF AIDING THE CUSTOMER IN TESTING APPLICATIONS THAT USE THE PRODUCTS IN DESIGNATED SOLUTIONS. THE CUSTOMER'S MANUFACTURING TEST ENVIRONMENT HAS NOT MET THE STANDARDS SET BY MELLANOX TECHNOLOGIES TO FULLY QUALIFY THE PRODUCT(S) AND/OR THE SYSTEM USING IT. THEREFORE, MELLANOX TECHNOLOGIES CANNOT AND DOES NOT GUARANTEE OR WARRANT THAT THE PRODUCTS WILL OPERATE WITH THE HIGHEST QUALITY. ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT ARE DISCLAIMED. IN NO EVENT SHALL MELLANOX BE LIABLE TO CUSTOMER OR ANY THIRD PARTIES FOR ANY DIRECT, INDIRECT, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES OF ANY KIND (INCLUDING, BUT NOT LIMITED TO, PAYMENT FOR PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY FROM THE USE OF THE PRODUCT(S) AND RELATED DOCUMENTATION
9. TE: Identifies important information that contains helpful suggestions.

CAUTION: Alerts you to risk of personal injury, system damage, or loss of data.

WARNING: Warns you that failure to take or avoid a specific action might result in personal injury or a malfunction of the hardware or software. Be aware of the hazards involved with electrical circuitry and be familiar with standard practices for preventing accidents before you work on any equipment.

1 Introduction to Mellanox Fabric Collective Accelerator

1.1 Overview

The Mellanox Fabric Collective Accelerator (FCA) is a unique solution for offloading collective operations from the Message Passing Interface (MPI) process to the server CPUs. As a system-wide solution, FCA does not require any additional hardware. The FCA manager creates a topology-based collective tree and orchestrates an efficient collective operation using the CPUs in the servers that are part of the collective operation.

FCA accelerates MPI collective operation performance by up to 100 times, providing a reduction in the overall job runtime. Implementation is simple and transparent during the job runtime.

FCA is built on the following main principles:

• Topology-aware Orchestration: The MPI collective logical tree is matched to the physical topology. The collective logical tree is constructed to assure:
    Maximum uti
10. This is a critical size (in MB) of the log file at which the file will be rolled. If set to zero, rolling is disabled and file size is unlimited.
    Values: any positive valid number. The default is 10.

log_file_max_backup_files
    Description: Denotes the number of backup files to be created. Effective only when the log_file_max_size parameter has a value greater than zero.
    Values: integer. The default is 20.

enable_stdout
    Description: Determines whether the log should also be written to the standard output. Valid values:
        y or 1: enable
        n or 0: disable
    Values: character. The default is enable.

The following parameters may be changed in the INI file under section ib.

dev_name
    Description: If set, the specified IB device will be used for communication. The name is as it appears in the /sys/class/infiniband directory. If not set, the first device with an ACTIVE port is used.
    Values: string representing active IB device name. The default is set.

port_num
    Description: If set, the selected port number is used on the matching device. If not set or zero, the first active port is used.
    Values: positive integer. The default is unset (use auto discover).

service_level
    Description: Quality of Service (QoS) is offered in IB as a means to offer some guarantees (minimum requirements) for certain applications on the fabric. SL2VL mapping should be configured in OpenSM. Valid values: 0-15. Note that OpenSM works by default with QoS values of 0-7.
    Values: integer. The default is 0.
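Because the FCA Manager reads this INI file only at startup, it can help to sanity-check edited values before restarting it. The following is a minimal sketch that reads a key from a named section and checks it against the ranges stated in the parameter tables (debug level 0-7, service level 0-15). The helper names `ini_get` and `in_range` are hypothetical, the sample file is written to a temporary location rather than the real configuration path, and the underscore-separated key names are reconstructed from the tables and should be checked against your installed fca_manager_spec.ini.

```shell
#!/bin/sh
# ini_get FILE SECTION KEY: print the value of KEY from [SECTION] in an INI file.
# Hypothetical helper for illustration; it is not shipped with FCA.
ini_get() {
    awk -v section="$2" -v key="$3" '
        /^[ \t]*\[/ {
            # Section header such as [fmm]: remember the name without brackets.
            current = $0
            sub(/^[ \t]*\[/, "", current)
            sub(/\].*$/, "", current)
            next
        }
        current == section {
            eq = index($0, "=")
            if (eq == 0) next
            k = substr($0, 1, eq - 1)
            v = substr($0, eq + 1)
            gsub(/[ \t]/, "", k)      # key: strip all blanks
            sub(/^[ \t]+/, "", v)     # value: trim leading blanks
            sub(/[ \t]+$/, "", v)     # value: trim trailing blanks
            if (k == key) { print v; exit }
        }
    ' "$1"
}

# in_range VALUE MIN MAX: succeed when VALUE is a non-negative integer in [MIN, MAX].
in_range() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Write a small sample file for the demonstration (assumed key names).
sample=$(mktemp)
cat > "$sample" <<'EOF'
[fmm]
debug_level = 5
log_file_max_size = 10

[ib]
service_level = 0
EOF

in_range "$(ini_get "$sample" fmm debug_level)" 0 7 && echo "debug_level ok"
in_range "$(ini_get "$sample" ib service_level)" 0 15 && echo "service_level ok"
rm -f "$sample"
```

To check a real installation, point `ini_get` at the fca_manager_spec.ini path for your installation method instead of the temporary sample.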
11. 2 Installation and Initial Configuration

2.1 Overview of Installation and Initial Configuration

FCA software includes the FCA Manager and the FCA MPI runtime support libraries. FCA Manager software should be installed on a central management node. For optimal performance, and to minimize interference with other applications, it is recommended to use a dedicated server for the FCA Manager installation.

The following sections provide step-by-step instructions for installing the FCA Server software and installing the FCA Agent.

2.1.1 Downloading the FCA Software

This software download process applies to software updates as well as to first-time installation.

To download the FCA software:

1. Go to the Mellanox website.
2. Click the Downloads tab and select the relevant version of the software to download.
3. Save the file on your local drive.
4. Click Close.

3 Installing FCA

3.1 Prerequisites

Before you begin, be certain that:

1. InfiniBand Subnet Management is installed and running on a dedicated node in the fabric.
2. Mellanox OFED 1.5.3-3.1.0 or later is installed. To download the latest MLNX_OFED version, go to Mellanox OpenFabrics Enterprise Distribution for Linux (MLNX_OFED).
3. Mellanox ConnectX-2 or ConnectX-3 HCA with firmware version 2.9.1000 or later. To download the latest ConnectX
12. 6 FCA MPI Runtime Library Configuration Parameters

6.1 Specifying FCA Parameters as mpirun Command Line Arguments

The FCA runtime library is used by MPI to offload collective operations into IB switches. You can supply configuration parameters to the FCA runtime library. The configuration parameters may be passed to the FCA library by loading the parameters from a configuration INI file, by entering the parameters in a command line to the MPI job, or by setting them from the shell environment.

• The default configuration file for the FCA MPI runtime parameters is located at $FCA_HOME/etc/fca_mpi_spec.ini.
• The FCA parameters can be entered as command-line parameters as part of the mpirun command.

To pass an FCA parameter from the shell environment:

• Enter the following command:
     export fca_<ini section name>_<ini section param name>=<value>
  Example:
     export fca_mpi_debug_level=5
  or provide the parameter to OpenMPI as a command-line argument:
     mpirun -x fca_mpi_debug_level=5 <... other mpirun parameters ...>

6.2 Specifying FCA Parameters in the INI File

At runtime, use mpirun's -x command switch to overwrite FCA parameters set in the fca_mpi_spec.ini file. Set the FCA parameters in an MPI job at runtime with the following syntax:

     $MPI_HOME/bin/mpirun -np 32 -machinefile hostfile -mca coll_fca_library_path $FCA_HOME/lib/libfca.so -mca btl sm,se
13. lf,openib -x <param>=<value> <other mpi options> mpi_hello_world

where:
• MPI_HOME represents the path to the MPI installation directory
• FCA_HOME represents the path to the FCA software directory

NOTE: The -x command switch is used as follows:
     -x <param>=<value>
Example:
     -x fca_mpi_collect_stats=y

Table 5: FCA Parameters in Open MPI at Run Time

The following parameters may be changed in the INI file under section mpi.

fca_mpi_debug_level
    Description: Verbosity level for MPI FCA debugging. The debug levels are:
        0: fatal
        1: error
        2: warn
        3: info
        4: debug
        5-7: detailed debug info
    Values: integer between 0-7. The default is 2.

fca_mpi_log_file
    Description: FCA log filename. The logfile name can contain printf-like tokens which are substituted during log file creation:
        %H: hostname of the process
        %u: current time in ms
        %T: current thread ID
        %s: time in sec
        %t: time in ticks
    Values: String representing log file name, or empty for none. The default is none.

fca_mpi_enable_stdout
    Description: Determines whether the FCA log should also be written to the standard output. Valid values: <y|n>
    Values: character. The default is y.

fca_mpi_fp_sum_fixedpoint
    Description: Use fixed point math whe
14. lization of fast inter-core communication
    Distribution of the results
• Communication Isolation: Collective communications are isolated from the rest of the traffic in the fabric using a private virtual network (VLane), eliminating contention with other types of traffic.

The following diagram summarizes the FCA architecture.

[Figure 1: FCA Architecture. Labels: inter-core communication optimized; collective tree and rank placement optimized to the topology; use of IB multicast for result distribution.]

The following diagram shows the FCA components and the role that each plays in the acceleration process.

[Figure 2: FCA Components. Labels: Job Scheduler; Mellanox UFM; FCA Manager (orchestrating fabric-wide collectives); GD 4700/4200 and GD 4036/4036E switches; compute nodes (CPUs offload collective computations; intra-node collective computation).]

1.2 Supported MPI Collectives

FCA addresses a wide range of applications, with out-of-the-box integration with leading MPI implementations such as Platform MPI and Open MPI, and requires no changes to the application.

The following MPI collectives are currently supported by FCA and accelerated:

• MPI_Reduce
• MPI_Allred
15. n performing floating point summation, to keep a consistent result regardless of the order of operations. Valid values: <y|n>
    Values: character. The default is n.

fca_mpi_collect_stat
    Description: Collect MPI application performance statistics. Valid values: <y|n>
    Values: character. The default is n.

fca_mpi_stats_max_ops
    Description: Max number of different MPI collective operations for which to collect statistics. This option is effective when collect_stat = y.
    Values: positive integer. The default is 1000.

fca_mpi_stats_file_name
    Description: File name in which to keep collected statistics. collect_stats must be enabled for this parameter to take effect.
    Values: string. The default is fca_stats.xml.

The following parameters may be changed in the INI file under section ib.

fca_ib_dev_name
    Description: If set, the specified IB device will be used for communication. The name is as it appears in the /sys/class/infiniband directory. If not set, the first device with an ACTIVE port is used.
    Values: string representing active IB device name. Default: leave empty or commented, then auto-discovery will be used.

fca_ib_port_num
    Description: If set, the selected port number is used on the matching device. If not set or zero, the first active port is use
    Values: positive integer. The default is unset (use auto discove
16. Revision History

Version 2.5 (Sep 30, 2014)
• Updated section "Building OpenMPI 1.6.x with FCA Support"

Version 2.5 (Sep 30, 2014)
• Removed the osm_type and ufm_url parameters from Table 4

Version 2.5 (Dec 2012)
• Removed section "Upgrading from FCA 2.0 or Later"
• Updated the following sections:
    FCA Installation Package Content
    Downloading the FCA Software
    Prerequisites
    Building OpenMPI 1.6.x with FCA Support
    Installing the FCA Manager on a Dedicated Node
    Installing the FCA Manager from RPM

Version 2.2 (May 2012)
• Removed section "Activating the Software License"
• Updated the following sections:
    Prerequisites
    Installing the FCA Manager on a Dedicated Node
    Configuring a Specific Rule
    Upgrading from FCA 2.0 or Later
    Starting the FCA Manager

Version 2.1.1 (December 2011)
• Updated the following sections to reflect offloading collective operations onto HCA:
    Overview (text and graphics)
    Supported Topologies
    FCA Installation Package Content
• Updated Prerequisites and Installation sections for 2.1.1
• Removed section on configuring Grid Director switches to enable FCA
• Added note for OpenMPI 1.5.x in section "Building OpenMPI 1.4.x with FCA Support"
17. osts:
     cd fca-2.5.xxxx.x86_64/scripts
     ./udev-update.sh
3. Set the environment variable pointing to the extracted location of FCA in the user login profile:
     export FCA_MGR_HOME=/usr/local/mellanox/fca-2.5.xxxx.x86_64

3.2.3 Starting the FCA Manager

To start the FCA Manager:

• Enter the following command:
     $FCA_MGR_HOME/scripts/fca_managerd start
• For RPM setup only, enter:
     /etc/init.d/fca_managerd start
• To configure the FCA Manager to start automatically after boot, run:
     /etc/init.d/fca_managerd install_service

4 Installing FCA MPI Support Libraries

You can install the FCA MPI support libraries either from an RPM or from Tarball on all cluster nodes, or to the shared NFS location using Tarball. For further information, see "Installing the FCA Manager from RPM" and "Installing the FCA Manager from Tarball".

4.1 Building OpenMPI 1.6.x with FCA Support

NOTE: If you use OpenMPI 1.6.x or later, no patch is required. OpenMPI 1.6.x supports FCA natively.

To build OpenMPI 1.6.x with FCA support:

1. Download OpenMPI 1.6.x from the OpenMPI site. Enter the following commands:
     mkdir -p $HOME/openmpi
     cd $HOME/openmpi
     wget http://www.open-mpi.org/software/ompi/v1.6/downloads/openmpi-1.6.3.tar.gz
18. r used

Example:
     /opt/openmpi-1.6.3/bin/mpirun -np 32 -machinefile hostfile -mca btl sm,self,openib -x fca_mpi_debug_level=4 mpi_hello_world

7 Configuring Rules for Offloading

The FCA system is provided with user-defined rules to select the most suitable offloading method for MPI communication. The user-defined rules consider the following MPI Communicator parameters:

• message size range (in bytes)
• communicator size (in ranks)
• offloading method (CD (CoreDirect), UD, MPI native)
• operation (for Reduce/AllReduce)
• data type (for Reduce/AllReduce)

7.1 Enabling Dynamic Rules Mechanism

To enable the dynamic rules mechanism:

• Enter the following setting in the fca_mpi_spec.ini file, in the section called rules:
     enable = <0|1>
  Example:
     enable = 1

7.2 Configuring a Specific Rule

User-defined offloading rules are added and enumerated in the fca_mpi_spec.ini file. Every user-defined rule is represented by a new INI file section named in the following format:
     [rule-<coll_name>-<SN>]

• coll_name can be one of the following values: reduce, allreduce, bcast, barrier, allgather, allgatherv
• SN is a rule serial number for the given coll_name

The default value for min/max params is -1 (no limit).

Valid offload types are:
• ud: use FCA in UD mode
• cd (default): use FCA in COREDirect mode
• none: do not u
19. s of the UFM machine.
• Only a single instance of FCA Manager should be running in the fabric.

1.5 FCA Software Components

The FCA-related software components are listed in the following table.

Table 1: FCA-Related Packages

Package | Description
FCA Manager | The FCA Manager is server software that responds to requests from the MPI application to set up new communicators.
FCA MPI Runtime Libraries | The FCA MPI Runtime library is a user-level shared library which is integrated with specific MPI distributions (IBM PE, OpenMPI, Platforms MPI, MVAPICH2) and is responsible for offloading MPI collective operations into the fabric.

1.6 FCA Installation Package Content

The FCA installation package includes the following items:

• FCA (Mellanox Fabric Collective Accelerator) installation files:
    fca-<version>.x86_64.<OS>.rpm
    fca-<version>.x86_64.<OS>.tar.gz
  where <version> is the version of this release and <OS> is one of the supported Linux distributions listed in Prerequisites.
• Mellanox Fabric Collective Accelerator (FCA) Software End-User License Agreement
• FCA Manager software
• FCA MPI runtime libraries
• Mellanox Fabric Collective Accelerator (FCA) User Manual
• Mellanox Fabric Collective Accelerator (FCA) Release Notes
20. se FCA

Rules are applied by the first match. If none of the rules match, the default is to use FCA with COREDirect mode.

The following is a list of valid rules parameters for FCA:

Parameter | Description | Default
msg_size_min <int> | Minimum message size | No limit
msg_size_max <int> | Maximum message size | No limit
comm_size_min <int> | Minimum communicator size | No limit
comm_size_max <int> | Maximum communicator size | No limit
offload_type <string> | FCA offload type | cd (COREDirect mode)
data_type <string> | Data type given as a parameter. Applicable for reduce/allreduce. |
reduce_op <string> | Reduce operation type requested. Applicable for reduce/allreduce. |

Examples of reduce rules:

     [rules]
     enable = 1

     [rule-reduce-1]
     msg_size_min = 256
     msg_size_max = 1024
     comm_size_min = 30
     comm_size_max = 85
     offload_type = ud
     data_type = MPI_CHAR
     reduce_op = MPI_LXOR

     [rule-reduce-2]
     (keys listed in the source: msg_size_min, msg_size_max, comm_size_min, comm_size_max, offload_type; values listed in the source: 2024, 10, none; the pairing of values to keys is not recoverable)

8 OpenMPI MCA Parameters to Control FCA Offload

The complete list of OpenMPI FCA-related parameters can be extracted using the ompi_info command.

To extract the complete
21.  tar zxvf openmpi-1.6.3.tar.gz
     cd openmpi-1.6.3
2. Install OpenMPI 1.6.x with FCA support:
     ./autogen.sh && ./configure --with-fca

4.2 Verifying the FCA Installation

To verify that OpenMPI is working with the FCA installation:

• Enter the following command:
     $MPI_HOME/bin/ompi_info --param coll fca | grep fca_enable
  The list of FCA parameters should be displayed as the command output.

4.3 Running MPI Jobs with FCA

Make sure that the FCA manager tarball is unpacked and available from all cluster nodes. Its opened location is referenced below as $FCA_HOME.

1. Use the following script examples, with the information provided on how to run MPI jobs with FCA for different MPI vendors:
     For OpenMPI: $FCA_HOME/scripts/run-ompi-fca.sh
     For Platforms MPI: $FCA_HOME/scripts/run-pmpi-fca.sh
     For MVAPICH2: $FCA_HOME/scripts/run-mvapich2-fca.sh
2. Check the $FCA_HOME/etc/fca_mpi_spec.ini file for various FCA tuning options.

5 Configuring FCA

5.1 FCA Manager Configuration Parameters

The fca_manager_spec.ini file is a configuration file containing FCA-related parameters which you can change or overwrite using the command line during runtime. The FCA Manager process reads its configuration on startup from the $FCA_MGR_HOME/etc/fca_manager_spec.ini file. Depending on the method used to install FCA, the Mellanox
22. uce
• MPI_Barrier
• MPI_Bcast
• MPI_AllGather
• MPI_AllGatherv

FCA supports an unlimited message size and advanced optimizations for Torus topologies. It can work with any InfiniBand Subnet Management based software (OpenSM, Embedded SM, Host SM).

FCA supports the following data types for Reduce and Allreduce operations:

• All data types for C language bindings, except MPI_LONG_DOUBLE
• All data types for C reduction functions (C reduction types)
• The following data types for FORTRAN language bindings:
    MPI_INTEGER
    MPI_INTEGER2
    MPI_INTEGER4
    MPI_INTEGER8
    MPI_REAL
    MPI_REAL4
    MPI_REAL8

FCA does not support data types for Fortran reduction functions (Fortran reduction types).

1.3 Supported Topologies

• FCA supports almost all fabric topologies (Fat Tree, HyperScale, Torus).
• FCA requires a Mellanox-based InfiniBand network.

1.4 Planning the Server Configuration

Following are points to consider when planning on which server to install the FCA Manager:

• The FCA Manager should be installed on a different server than the one where MPI jobs will run.
• If you do not have two servers running UFM for redundancy, you should install FCA Manager on the UFM server.
• If you do have two servers running UFM for redundancy, it is best to install FCA Manager on a non-UFM server and provide it with the virtual addres
23.   6.1 Specifying FCA Parameters as mpirun Command Line Arguments
  6.2 Specifying FCA Parameters in the INI File
7 Configuring Rules for Offloading
  7.1 Enabling Dynamic Rules Mechanism
  7.2 Configuring a Specific Rule
8 OpenMPI MCA Parameters to Control FCA Offload

List of Figures

Figure 1: FCA Architecture
Figure 2: FCA Components

List of Tables

Table 1: FCA-Related Packages
Table 2: System Requirements
Table 3: Paths for FCA Manager INI File
Table 4: FCA Manager INI File Parameters
Table 5: FCA Parameters in Open MPI at Run Time
