User Manual
Contents
1. Parameter, Description, Values

The following parameters may be changed in the INI file under the fmm section.

osm_type
  Description: Select the subnet manager service provider. When embedded OpenSM is used in the switch, the FCA module should be disabled in that specific switch.
  Values: <opensm|ufm|autodetect>. The default is autodetect.
    - ufm: Use Mellanox's UFM
    - opensm: Use OpenSM library
    - autodetect: Detect automatically from fabric

ufm_url
  Description: URL of OpenSM service. The default port is 8081. You can replace localhost with the IP or hostname of the machine on which OpenSM runs.
  Values: string. Valid values: http://<host>:<port>. The default is http://localhost:8081.

debug_level
  Description: Verbosity level of fca manager for debugging. The debug levels are:
    - 0: fatal
    - 1: error
    - 2: warn
    - 3: info
    - 4: debug
    - 5-7: detailed debug info
  Values: integer between 0-7. The default is 3.

log_file
  Description: FCA Manager log filename. The logfile name can contain printf-like tokens which are substituted during log file creation:
    - %H: hostname where FCA Manager is running
    - %D: current date in format DDMMYYYY
    - %T: current thread ID
  Values: string representing log file name. The default is fmm_%H_%D.log.

log_file_max_size
  Description: This is a critical size (in MB) of the log file to which the file will be rolled. If
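The token substitution described for log_file can be illustrated with a short shell sketch. It is not FCA code: the helper name expand_log_name is hypothetical, the token spelling (%H, %D) and the template are assumptions, and only the hostname and date tokens are handled.

```shell
#!/bin/sh
# Expand %H (hostname) and %D (date as DDMMYYYY) in a log file name template.
# The token spelling and the template are assumptions of this sketch.
expand_log_name() {
    template=$1
    host=$(uname -n)            # node name, as a stand-in for the hostname
    today=$(date +%d%m%Y)       # DDMMYYYY
    printf '%s\n' "$template" | sed -e "s/%H/$host/g" -e "s/%D/$today/g"
}

expand_log_name 'fmm_%H_%D.log'
```

Running it prints a name with the local node name and today's date in place of the tokens, which is the kind of file name the manager would be expected to create.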
2. NOTE: If you use OpenMPI 1.5.x or later, no patch is required. OpenMPI 1.5.x supports FCA natively.

To build OpenMPI 1.4.x with FCA support:

1. Download OpenMPI 1.4.x from the OpenMPI site. Enter the following commands:

    mkdir -p $HOME/openmpi
    cd $HOME/openmpi
    wget http://www.open-mpi.org/software/ompi/v1.4/downloads/openmpi-1.4.3.tar.gz
    tar xvf openmpi-1.4.3.tar.gz
    cd openmpi-1.4.3

2. Apply the FCA 2.2 patch to enable FCA in OpenMPI v1.4.x. Enter the following commands:

    patch -p1 < $FCA_HOME/share/fca/ompi-1.4.x-fca2.patch
    ./autogen.sh
    mkdir -p build-fca
    cd build-fca
    ../configure --prefix=$PWD/install --with-fca=$FCA_HOME --with-openib
    make
    make install
    export MPI_HOME=$HOME/openmpi/openmpi-1.4.3/build-fca/install

4.2 Verifying the FCA Installation

To verify that OpenMPI is working with the FCA installation, enter the following command:

    $MPI_HOME/bin/ompi_info --param coll fca | grep fca_enable

The list of FCA parameters should be displayed as the command output.

4.3 Running MPI Jobs with FCA

Make sure that the FCA manager tarball is unpacked and available from all cluster nodes. Its location is referenced below as $FCA_HOME.

1. Use the following script examples, with the information provided, on how to run MPI jobs with FCA for different MPI vendors.

For OpenMPI: $FCA_HOME/scripts/run_ompi_fca.sh
3. 6.1 Specifying FCA Parameters as mpirun Command Line Arguments
6.2 Specifying FCA Parameters in the INI File
7 Configuring Rules for Offloading
7.1 Enabling Dynamic Rules Mechanism
7.2 Configuring a Specific Rule
8 OpenMPI MCA Parameters to Control FCA Offload
9 Upgrading from FCA 2.0 or Later

List of Figures
Figure 1: FCA Architecture
Figure 2: FCA Components

List of Tables
Table 1: FCA Related Packages
Table 2: System Requirements
Table 3: Paths for FCA Manager INI File
Table 4: FCA Manager INI File Parameters
Table 5: FCA Parameters in Open MPI at Run Time
4. Mellanox Technologies

Mellanox Technologies, Inc.
350 Oakmead Parkway, Suite 100
Sunnyvale, CA 94085
U.S.A.
www.mellanox.com
Tel: (408) 970-3400
Fax: (408) 970-3403

Mellanox Technologies, Ltd.
Beit Mellanox
PO Box 586, Yokneam 20692
Israel
www.mellanox.com
Tel: +972 (0)4 909 7200; +972 (0)74 723 7200
Fax: +972 (0)4 959 3245

© Copyright 2012. Mellanox Technologies, Inc. All Rights Reserved.

Mellanox, Mellanox logo, BridgeX, ConnectX, CORE-Direct, InfiniBridge, InfiniHost, InfiniScale, PhyX, SwitchX, Virtual Protocol Interconnect and Voltaire are registered trademarks of Mellanox Technologies, Ltd. FabricIT, MLNX-OS, Unbreakable-Link, UFM and Unified Fabric Manager are trademarks of Mellanox Technologies, Ltd. All other trademarks are property of their respective owners.

Document Number: DOC-00983 A05

Contents

Revision History
Preface
1 Introduction to Mellanox Fabric Collective Accelerator
1.1 Overview
1.2 Supported MPI Collectives
1.3 Supported Topologies
1.4 Planning the Server Configuration
1.5 FCA Software Components
5. Part Number
Mellanox Fabric Collective Accelerator Release Notes: DOC-00984

Typographical Conventions

Before you start using this guide, it is important to understand the terms and typographical conventions used in the documentation. The following kinds of formatting in the text identify special information.

Formatting convention: Type of information
- Special Bold: Items you must select, such as menu options, command buttons, or items in a list.
- Emphasis: Used to emphasize the importance of a point or for variable expressions such as parameters.
- CAPITALS: Names of keys on the keyboard, for example SHIFT, CTRL, or ALT.
- KEY+KEY: Key combinations for which the user must press and hold down one key and then press another, for example CTRL+P or ALT+F4.

Document Conventions

- NOTE: Identifies important information that contains helpful suggestions.
- CAUTION: Alerts you to risk of personal injury, system damage, or loss of data.
- WARNING: Warns you that failure to take or avoid a specific action might result in personal injury or a malfunction of the hardware or software. Be aware of the hazards involved with electrical circuitry and be familiar with standard practices for preventing accidents before you work on any equipment.
6. We recommend that the FCA Manager be installed on the same node as OpenSM.

There are two options for installing the FCA Manager when using MLNX_OFED v1.5.3-3.0.0:
- From an RPM (on page 15): Select this option if you want to install the FCA Manager on the machine's local disk and let the RPM package handle all post-install tasks.
- From Tarball (on page 15): Select this option if you wish to install FCA Manager in any location (user's home directory, NFS shared folder, etc.). There are a number of post-install tasks that need to be applied as root on every cluster node after you install FCA from a tarball.

Select one of the installation options according to your site's installation policy.

NOTE: Mellanox OFED 1.8 includes FCA 2.2, which is installed under /opt/mellanox/fca. If you have installed OFED 1.8, you do not need to download and install FCA. However, you need to set the FCA manager location in your shell:

    export FCA_MGR_HOME=/opt/mellanox/fca

3.2.1 Installing the FCA Manager from RPM

To install the FCA Manager on all cluster nodes from an RPM (as root) when using MLNX_OFED v1.5.3-3.0.0:

1. Enter the following command:

    rpm -ihv fca-2.2.x86_64.rpm

2. Set the environment variable pointing to the installed location of FCA in the user login profile:

    export FCA_MGR_HOME=/opt/mellanox/fca

3. Configure the FCA Manager to start automatically after boot (optional):

    /etc/init.d/fca_managerd instal
7. set to zero, rolling is disabled and file size is unlimited.
  Values: any positive valid number. The default is 10.

log_file_max_backup_files
  Description: Denotes the number of backup files to be created. Effective only when the log_file_max_size parameter has a value greater than zero.
  Values: integer. The default is 20.

enable_stdout
  Description: Determines whether the log should also be written to the standard output. Valid values:
    - y or 1: enable
    - n or 0: disable
  Values: character. The default is enable.

The following parameters may be changed in the INI file under section ib.

dev_name
  Description: If set, the specified IB device will be used for communication. The name is as it appears in the /sys/class/infiniband directory. If not set, the first device with an ACTIVE port is used.
  Values: string representing active IB device name. The default is unset.

port_num
  Description: If set, the selected port number is used on the matching device. If not set or zero, the first active port is used.
  Values: positive integer. The default is unset (use auto-discover).

service_level
  Values: integer. The default is 0.
  Description: Quality of Service (QoS) is offered in IB as a means to offer some guarantees (minimum requirements) for certain applications on the fabric. SL2VL mapping should be configured in OpenSM. Valid values: 0-15. Note that OpenSM works
8. use FCA in UD mode
- cd (default): use FCA in COREDirect mode
- none: do not use FCA

Rules are applied by the first match. If none of the rules match, the default is to use FCA with COREDirect mode.

The following is a list of valid rules parameters for FCA:

Parameter: Description (Default)
- msg_size_min <int>: Minimum message size (No limit)
- msg_size_max <int>: Maximum message size (No limit)
- comm_size_min <int>: Minimum communicator size (No limit)
- comm_size_max <int>: Maximum communicator size (No limit)
- offload_type <string>: FCA offload type (cd, COREDirect mode)
- data_type <string>: Data type given as a parameter. Applicable for reduce/allreduce.
- reduce_op <string>: Reduce operation type requested. Applicable for reduce/allreduce.

Examples of reduce rules:

[Example INI listing. Recoverable content: a rules section with enable set; a first reduce rule listing msg_size_min, msg_size_max, comm_size_min, comm_size_max, offload_type, data_type and reduce_op, with the values 256, 1024, 30, 35, ud and MPI_LXOR; a second reduce rule, "rule reduce 2", with the values 2024, 10 and none.]

8 OpenMPI MCA Parameters to Control FCA Offload

The complete list of
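The first-match behaviour described above can be sketched in plain shell. This is only an illustration of the matching order, not FCA's implementation: the rule list, the one-rule-per-line layout, the function name pick_offload and the use of -1 for "no limit" are all assumptions made for the sketch.

```shell
#!/bin/sh
# Hypothetical rule list, one rule per line:
#   msg_size_min msg_size_max comm_size_min comm_size_max offload_type
# -1 stands for "no limit" (an assumption of this sketch).
RULES='256 1024 30 35 ud
-1 -1 2 8 none'

# Print the offload type of the first rule matching the message size ($1)
# and communicator size ($2); fall back to cd when nothing matches.
pick_offload() {
    msg=$1
    comm=$2
    result=cd
    while read -r mmin mmax cmin cmax offload; do
        [ "$mmin" -ne -1 ] && [ "$msg" -lt "$mmin" ] && continue
        [ "$mmax" -ne -1 ] && [ "$msg" -gt "$mmax" ] && continue
        [ "$cmin" -ne -1 ] && [ "$comm" -lt "$cmin" ] && continue
        [ "$cmax" -ne -1 ] && [ "$comm" -gt "$cmax" ] && continue
        result=$offload
        break
    done <<EOF
$RULES
EOF
    echo "$result"
}

pick_offload 512 32    # inside both ranges of the first rule
pick_offload 512 4     # skips the first rule, matches the second
pick_offload 4096 64   # matches no rule, so the default applies
```

Because evaluation stops at the first matching rule, more specific rules should be listed before broader ones.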
9. 32 -machinefile hostfile -mca coll_fca_library_path $FCA_HOME/lib/libfca.so -mca btl sm,self,openib -x <param>=<value> <other mpi options> mpi_hello_world

where
- $MPI_HOME represents the path to the MPI installation directory
- $FCA_HOME represents the path to the FCA software directory

NOTE: The -x command switch is used as follows: -x <param>=<value>
Example: -x fca_mpi_collect_stats=y

Table 5: FCA Parameters in Open MPI at Run Time

INI File Parameter, Description, Values

The following parameters may be changed in the INI file under section mpi.

fca_mpi_debug_level
  Description: Verbosity level for MPI FCA debugging. The debug levels are:
    - 0: fatal
    - 1: error
    - 2: warn
    - 3: info
    - 4: debug
    - 5-7: detailed debug info
  Values: integer between 0-7. The default is 2.

fca_mpi_log_file
  Description: FCA log filename. The logfile name can contain printf-like tokens which are substituted during log file creation:
    - %H: hostname of the process
    - %u: current time in ms
    - %T: current thread ID
    - %s: time in sec
    - %t: time in ticks
  Values: String representing log file name, or empty for none. The default is none.

fca_mpi_enable_stdout
  Values: character. The default is y.
  Description: Determines whether the FCA log should also be written to the standard output. V
10. Mellanox Technologies

Fabric Collective Accelerator (FCA) User Manual
Version 2.2

www.mellanox.com

Mellanox Technologies Confidential

NOTE: THIS HARDWARE, SOFTWARE OR TEST SUITE PRODUCT ("PRODUCT(S)") AND ITS RELATED DOCUMENTATION ARE PROVIDED BY MELLANOX TECHNOLOGIES "AS-IS" WITH ALL FAULTS OF ANY KIND AND SOLELY FOR THE PURPOSE OF AIDING THE CUSTOMER IN TESTING APPLICATIONS THAT USE THE PRODUCTS IN DESIGNATED SOLUTIONS. THE CUSTOMER'S MANUFACTURING TEST ENVIRONMENT HAS NOT MET THE STANDARDS SET BY MELLANOX TECHNOLOGIES TO FULLY QUALIFY THE PRODUCT(S) AND/OR THE SYSTEM USING IT. THEREFORE, MELLANOX TECHNOLOGIES CANNOT AND DOES NOT GUARANTEE OR WARRANT THAT THE PRODUCTS WILL OPERATE WITH THE HIGHEST QUALITY. ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT ARE DISCLAIMED. IN NO EVENT SHALL MELLANOX BE LIABLE TO CUSTOMER OR ANY THIRD PARTIES FOR ANY DIRECT, INDIRECT, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES OF ANY KIND (INCLUDING, BUT NOT LIMITED TO, PAYMENT FOR PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY FROM THE USE OF THE PRODUCT(S) AND RELATED DOCUMENTATION EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
11. OpenMPI FCA-related parameters can be extracted using the ompi_info command.

To extract the complete list of OpenMPI FCA-related parameters:

    $MPI_HOME/bin/ompi_info --param coll fca

To provide MCA parameters to the OpenMPI mpirun command, use the following format:

    $MPI_HOME/bin/mpirun -mca <param> <value>

Example:

    $MPI_HOME/bin/mpirun -mca coll_fca_verbose 1 <other mpirun args>

The following is a list of MCA parameters for FCA:

Parameter: Description (Default)
- coll_fca_priority <int>: Priority of the fca coll component (80)
- coll_fca_verbose <int>: Verbose level of the fca coll component (0)
- coll_fca_enable <0|1>: Enable/Disable Fabric Collective Accelerator (1)
- coll_fca_spec_file <string>: Path to the FCA configuration file fca_mpi_spec.ini ($FCA_HOME/etc/fca_mpi_spec.ini)
- coll_fca_library_path <string>: Path to FCA runtime library ($FCA_HOME/lib/libfca.so)
- coll_fca_np <int>: Minimal allowed job's NP to activate FCA (64)
- coll_fca_enable_barrier <0|1>: Enable/Disable FCA Barrier support
- coll_fca_enable_bcast <0|1>: Enable/Disable FCA Bcast support
- coll_fca_enable_reduce <0|1>: Enable/Disable FCA Reduce support
- coll_fca_enable_allreduce <0|1>: Enable/Disable FCA Allreduce support
- coll_fca_enable_allgather <0|1>: Enable/Disable FCA Allgather support
- coll_fc
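As a sketch of how several of the MCA parameters listed above might be combined on one command line, the following shell helper turns parameter/value pairs into -mca arguments. The helper name mca_args is hypothetical and the chosen values are illustrative only; the mpirun command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: turn "param value" pairs into mpirun -mca arguments.
mca_args() {
    out=""
    while [ "$#" -ge 2 ]; do
        out="$out -mca $1 $2"
        shift 2
    done
    # Strip the leading space before printing.
    printf '%s' "${out# }"
}

# Enable FCA, raise verbosity, and allow jobs of 32 ranks or more to use it.
args=$(mca_args coll_fca_enable 1 coll_fca_verbose 1 coll_fca_np 32)

# Print the resulting command line instead of launching a job.
echo "\$MPI_HOME/bin/mpirun $args <other mpirun args>"
```

Keeping the pairs in one place makes it easier to switch a single collective on or off, for example by adding coll_fca_enable_bcast 0 to the list.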
12. a_enable_allgatherv <0|1>: Enable/Disable FCA Allgatherv support

9 Upgrading from FCA 2.0 or Later

To upgrade FCA from FCA v2.0 or later:

1. Save the FCA Manager configuration files to a location different from where FCA was installed. Use the following command:

    cp $FCA_MGR_HOME/etc/*.ini $HOME

2. Save the FCA runtime library configuration files to a location different from where the FCA runtime was installed. Use the following command:

    cp $FCA_HOME/etc/*.ini $HOME

3. Stop the FCA Manager. Use the following command:

    $FCA_MGR_HOME/scripts/fca_managerd.sh stop

   For an RPM setup, enter the following:

    /etc/init.d/fca_managerd stop

4. Uninstall the FCA Manager if it was installed from RPM. Use the following command:

    rpm -e fca

5. Install FCA as described in Installing FCA (on page 14).
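Steps 1 and 2 above can be wrapped in a small helper so the same copy logic serves both the manager and the runtime configuration. This is a minimal sketch: backup_fca_ini and the backup directory are hypothetical, and the demonstration runs against a throwaway directory rather than a real FCA installation.

```shell
#!/bin/sh
# Sketch of upgrade steps 1 and 2: copy *.ini files out of an install tree.
# backup_fca_ini and the backup directory layout are hypothetical.
backup_fca_ini() {
    src=$1      # e.g. "$FCA_MGR_HOME/etc" or "$FCA_HOME/etc"
    dest=$2     # a directory outside the FCA installation
    mkdir -p "$dest" || return 1
    for f in "$src"/*.ini; do
        # Skip the unexpanded pattern when no .ini files exist.
        [ -e "$f" ] || continue
        cp -p "$f" "$dest"/ || return 1
    done
}

# Demonstration against a temporary tree instead of a real install.
demo=$(mktemp -d)
mkdir -p "$demo/fca/etc"
echo "[fmm]" > "$demo/fca/etc/fca_manager_spec.ini"
backup_fca_ini "$demo/fca/etc" "$demo/backup"
ls "$demo/backup"
```

On a real system the two calls would use "$FCA_MGR_HOME/etc" and "$FCA_HOME/etc" as sources, with separate destination directories so that files with the same name do not overwrite each other.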
13. al performance and to minimize interference with other applications, it is recommended to use a dedicated server for the FCA Manager installation.

The following sections provide step-by-step instructions for installing the FCA Server software and installing the FCA Agent.

2.1.1 Downloading the FCA Software

NOTE: Mellanox OFED 1.8 includes FCA 2.2, which is installed under /opt/mellanox/fca. If you have installed OFED 1.8, you do not need to download and install FCA.

This software download process applies to software updates as well as to first-time installation.

To download the FCA software:
1. Go to the Mellanox website.
2. Click the Downloads tab and select the relevant version of the software to download.
3. Save the file on your local drive.
4. Click Close.

3 Installing FCA

3.1 Prerequisites

Before you begin, be certain that:
1. InfiniBand Subnet Management is installed and running on a dedicated node in the fabric.
2. Mellanox OFED 1.5.3 or later is installed. To download the latest MLNX_OFED version, go to Mellanox OpenFabrics Enterprise Distribution for Linux (MLNX_OFED).
3. A Mellanox ConnectX-2 or ConnectX-3 HCA with firmware version 2.9.1000 or later is installed. To download the latest ConnectX HCA firmware version, go to Firmware Downloads.

The minimum system requirements for installing and running FCA are list
14. alid values: <y|n>

fca_mpi_fp_sum_fixedpoint
  Description: Use fixed point math when performing floating point summation, to keep a consistent result regardless of the order of operations. Valid values: <y|n>
  Values: character. The default is n.

fca_mpi_collect_stat
  Description: Collect MPI application performance statistics. Valid values: <y|n>
  Values: character. The default is n.

fca_mpi_stats_max_ops
  Description: Max number of different MPI collective operations for which to collect statistics. This option is effective when collect_stat=y.
  Values: positive integer. The default is 1000.

fca_mpi_stats_file_name
  Description: File name in which to keep collected statistics. collect_stats must be enabled for this parameter to take effect.
  Values: string. The default is fca_stats.xml.

The following parameters may be changed in the INI file under section ib.

fca_ib_dev_name
  Description: If set, the specified IB device will be used for communication. The name is as it appears in the /sys/class/infiniband directory. If not set, the first device with an ACTIVE port is used.
  Values: string representing active IB device name. Default: leave empty or commented, then auto discovery will be used.

fca_ib_port_num
  Description: If set, the selected port number is used on the matching device.
  Values: positive integer. The default is unset (use aut
15. by default with QoS values of 0-7.

6 FCA MPI Runtime Library Configuration Parameters

6.1 Specifying FCA Parameters as mpirun Command Line Arguments

The FCA runtime library is used by MPI to offload collective operations into IB switches. You can supply configuration parameters to the FCA runtime library. The configuration parameters may be passed to the FCA library by loading them from a configuration INI file, by entering them on the command line of the MPI job, or by setting them from the shell environment.

- The default configuration file for the FCA MPI runtime parameters is located at $FCA_HOME/etc/fca_mpi_spec.ini.
- The FCA parameters can be entered as command line parameters as part of the mpirun command.

To pass an FCA parameter from the shell environment, enter the following command:

    export fca_<ini section name>_<ini section param name>=<value>

Example:

    export fca_mpi_debug_level=5

or provide the parameter to OpenMPI as a command line argument:

    mpirun -x fca_mpi_debug_level=5 <... other mpirun parameters ...>

6.2 Specifying FCA Parameters in the INI File

At runtime, use mpirun's -x command switch to overwrite FCA parameters set in the fca_mpi_spec.ini file. Set the FCA parameters in an MPI job at runtime with the following syntax:

    $MPI_HOME/bin/mpirun -np
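The naming pattern described above, where the variable name is built from the INI section and the parameter name, can be sketched as a small shell helper. The helper name fca_env_name is hypothetical; the mpirun line is only printed, not executed, and the value 5 simply repeats the example from the text.

```shell
#!/bin/sh
# Hypothetical helper: compose the environment variable name that the text
# describes as fca_<ini section name>_<ini section param name>.
fca_env_name() {
    section=$1
    param=$2
    printf 'fca_%s_%s' "$section" "$param"
}

# Build NAME=VALUE for the debug_level parameter of the mpi section.
assignment="$(fca_env_name mpi debug_level)=5"

# Option 1: set it in the shell environment before launching the job.
export "$assignment"

# Option 2: hand it to Open MPI's mpirun with -x (printed here, not run).
echo "mpirun -x $assignment <other mpirun parameters>"
```

Either route passes the same name/value pair to the job; the -x form keeps the setting local to a single mpirun invocation.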
16. cific MPI distributions (IBM PE, OpenMPI, Platform MPI, Intel MPI, MVAPICH2) that is responsible for offloading MPI collective operations into Fabric.

1.6 FCA Installation Package Content

The FCA installation package includes the following items:
- FCA (Mellanox Fabric Collective Accelerator) installation files:
    fca-<version>.x86_64.<OS>.rpm
    fca-<version>.x86_64.<OS>.tar.gz
  where <version> is the version of this release and <OS> is one of the supported Linux distributions listed in Prerequisites (on page 14).
- Mellanox Fabric Collective Accelerator (FCA) Software: End-User License Agreement
- FCA Manager software
- FCA MPI runtime libraries
- Source patch for OpenMPI 1.4.x to enable FCA with OpenMPI software. Note: OpenMPI 1.5.2 and later distributions will include native FCA support.
- Mellanox Fabric Collective Accelerator (FCA) User Manual
- Mellanox Fabric Collective Accelerator (FCA) Release Notes

2 Installation and Initial Configuration

2.1 Overview of Installation and Initial Configuration

FCA software includes the FCA Manager and the FCA MPI runtime support libraries. FCA Manager software should be installed on a central management node. For optim
17. dings: MPI_INTEGER, MPI_INTEGER2, MPI_INTEGER4, MPI_INTEGER8, MPI_REAL, MPI_REAL4, MPI_REAL8
- FCA does not support data types for Fortran reduction functions (Fortran reduction types).

1.3 Supported Topologies

- FCA supports almost all fabric topologies (Fat Tree, HyperScale, Torus).
- FCA requires a Mellanox-based InfiniBand network.

1.4 Planning the Server Configuration

Following are points to consider when planning on which server to install the FCA Manager:
- The FCA Manager should be installed on a different server than the one where MPI jobs will run.
- If you do not have two servers running UFM for redundancy, you should install FCA Manager on the UFM server.
- If you do have two servers running UFM for redundancy, it is best to install FCA Manager on a non-UFM server and provide it with the virtual address of the UFM machine.
- Only a single instance of FCA Manager should be running in the fabric.

1.5 FCA Software Components

The FCA-related software components are listed in the following table.

Table 1: FCA Related Packages

Package: Description
- FCA Manager: The FCA Manager is server software that responds to requests from the MPI application to set up new communicators.
- FCA MPI Runtime Libraries: The FCA MPI Runtime library is a user-level shared library which is integrated with spe
18. ed in the following table.

NOTE: Mellanox OFED 1.8 includes FCA 2.2, which is installed under /opt/mellanox/fca. If you have installed OFED 1.8, you do not need to download and install FCA.

Table 2: System Requirements

Item: Requirement (FCA 2.2)
- Supported switches: Mellanox IB QDR/FDR switches
- Linux distributions (<OS>): RHEL 5.5, RHEL 5.6, RHEL 6.0, RHEL 6.1, RHEL 6.2, CentOS 5.5, CentOS 5.6, CentOS 5.7, CentOS 6.0, SLES 10 SP4, SLES 11, SLES 11 SP1
- Supported HCAs: Mellanox ConnectX-2 HCA with firmware version 2.9.1000 or later; Mellanox ConnectX-3 HCA with firmware version 2.10.0000 or later
- Open Message Passing Interface (MPI) Project: Open MPI 1.4.3 or later
- Open Fabrics Enterprise Distribution (OFED): 1.5.3 or later
- Root permission: The installer should have root permissions for post-installation tasks.
- InfiniBand Subnet Management: All InfiniBand Subnet Management based software is supported in FCA version 2.2.

NOTE: MLNX_OFED v1.8 comes with OpenMPI 1.4.6 and 1.6, which are already compiled to support FCA v2.2.

3.2 Installing the FCA Manager on a Dedicated Node

FCA Manager must be installed on a dedicated machine which is not a part of the cluster nodes, with only a single instance of the FCA Manager in operation per fabric.

NOTE:
19. For Platform MPI: $FCA_HOME/scripts/run_pmpi_fca.sh
For Intel MPI: $FCA_HOME/scripts/run_impi_fca.sh
For MVAPICH2: $FCA_HOME/scripts/run_mvapich2_fca.sh

2. Check the $FCA_HOME/etc/fca_mpi_spec.ini file for various FCA tuning options.

5 Configuring FCA

5.1 FCA Manager Configuration Parameters

The fca_manager_spec.ini file is a configuration file containing FCA-related parameters, which you can change or overwrite using the command line during runtime. The FCA Manager process reads its configuration on startup from the $FCA_MGR_HOME/etc/fca_manager_spec.ini file.

Depending on the method used to install FCA, the Mellanox-provided parameter file fca_manager_spec.ini will be located in the path described in the following table.

Table 3: Paths for FCA Manager INI File

Installation Method: Path to fca_manager_spec.ini
- From RPM: /opt/mellanox/fca/etc
- From Tarball: $FCA_HOME/etc

The FCA Manager configuration file is in INI format and contains two sections: [fmm] and [ib].

To set parameter values in the fca_manager_spec.ini file, edit the file as necessary using the following format:

    <variable> = <value>

Example:

    [fmm]
    debug_level = <value>
    log_file = fm.log

Table 4: FCA Manager INI File Parameters
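To check what a given setting resolves to in a file of this "variable = value" layout, a small reader can be sketched in shell with awk. The helper ini_get is hypothetical and not part of FCA, and the demonstration file holds illustrative values rather than shipped defaults.

```shell
#!/bin/sh
# Print the value of KEY inside [SECTION] of an INI file.
# Usage: ini_get FILE SECTION KEY   (ini_get is a hypothetical helper)
ini_get() {
    awk -v section="$2" -v key="$3" '
        /^[ \t]*\[/ {
            # Section header such as [fmm]: strip the brackets and remember it.
            gsub(/^[ \t]*\[|\][ \t]*$/, "")
            current = $0
            next
        }
        current == section {
            pos = index($0, "=")
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }
    ' "$1"
}

# Demonstration file; the values are illustrative, not shipped defaults.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[fmm]
debug_level = 4
log_file = fm.log

[ib]
port_num = 1
EOF

ini_get "$conf" fmm log_file
ini_get "$conf" ib port_num
```

Pointing the same helper at the installed fca_manager_spec.ini would show the value the FCA Manager reads at startup, assuming the file follows the plain section and "variable = value" layout shown above.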
20. l service

3.2.2 Installing the FCA Manager from Tarball

To install the FCA Manager from Tarball in the shared NFS location when using MLNX_OFED v1.5.3-3.0.0:

1. Enter the following commands:

    mkdir -p /usr/local/mellanox
    cd /usr/local/mellanox
    tar -xvf fca-2.2.xxxx.x86_64.tar.gz

2. Run the following post-install script on all hosts:

    cd fca-2.2.xxxx.x86_64
    scripts/udev-update.sh

3. Set the environment variable pointing to the extracted location of FCA in the user login profile:

    export FCA_MGR_HOME=/usr/local/mellanox/fca-2.2.xxxx.x86_64

3.2.3 Starting the FCA Manager

To start the FCA Manager:
- Enter the following command:

    $FCA_MGR_HOME/scripts/fca_managerd start

- For RPM setup only, enter:

    /etc/init.d/fca_managerd start

- To configure the FCA Manager to start automatically after boot, run:

    /etc/init.d/fca_managerd install service

4 Installing FCA MPI Support Libraries

You can install the FCA MPI support libraries from either an RPM or a tarball on all cluster nodes, or to the shared NFS location using a tarball. For further information, see Installing the FCA Manager from RPM (on page 15) and Installing the FCA Manager from Tarball (on page 15).

4.1 Building OpenMPI 1.4.x with FCA Support
21. Revision History

Version 2.2 (May 2012)
- Removed section "Activating the Software License"
- Updated the following sections for FCA 2.2:
  - Prerequisites (on page 14)
  - Installing the FCA Manager on a Dedicated Node (on page 15)
  - Configuring a Specific Rule (on page 25)
  - Upgrading from FCA 2.0 or Later (on page 28)
  - Starting the FCA Manager (on page 16)

Version 2.1.1 (December 2011)
- Updated the following sections to reflect offloading collective operations onto HCA:
  - Overview (text and graphics)
  - Supported Topologies
  - FCA Installation Package Content
- Updated Prerequisites and Installation sections for 2.1.1
- Removed section on configuring Grid Director switches to enable FCA
- Added note for OpenMPI 1.5.x in section "Building OpenMPI 1.4.x with FCA Support"

Preface

Audience

The intended audience for the Mellanox Fabric Collective Accelerator (FCA) User Manual is the MPI implementer and the network administrator responsible for managing FCA on Mellanox InfiniBand switches. It is assumed that the administrator is familiar with advanced concepts in network management.

Related Documentation

The following document is part of the library for network administrators and installers supporting the Mellanox FCA.

Document Name
22. o discover). If not set or zero, the first active port is used.

Example:

    /opt/openmpi/1.4.1/bin/mpirun -np 32 -machinefile hostfile -mca btl sm,self,openib -x fca_mpi_debug_level=4 mpi_hello_world

7 Configuring Rules for Offloading

The FCA system is provided with user-defined rules to select the most suitable offloading method for MPI communication. The user-defined rules consider the following MPI Communicator parameters:
- message size range (in bytes)
- communicator size (in ranks)
- offloading method (CD (CoreDirect), UD, MPI native)
- operation for Reduce/AllReduce
- data type for Reduce/AllReduce

7.1 Enabling Dynamic Rules Mechanism

To enable the dynamic rules mechanism, enter the following in the fca_mpi_spec.ini file, in the section called rules:

    enable = <0|1>

Example:

    enable = 1

7.2 Configuring a Specific Rule

User-defined offloading rules are added and enumerated in the fca_mpi_spec.ini file. Every user-defined rule is represented by a new INI file section named in the following format:

    rule-<coll name>-<SN>

- <coll name> can be one of the following values: reduce, allreduce, bcast, barrier, allgather, allgatherv
- <SN> is a rule serial number for the given <coll name>

The default value for min/max params is -1 (no limit).

Valid offload types are:
- ud
23. The following diagram shows the FCA components and the role that each plays in the acceleration process.

[Figure 2: FCA Components. Diagram labels: Job Scheduler; UFM; FCA Manager (orchestrating fabric-wide collectives); GD 4700/4200 and GD 4036/4036E switches; CPUs offload collective computations; FCA Agent (intra-node collective computation); compute nodes.]

1.2 Supported MPI Collectives

FCA addresses a wide range of applications with out-of-the-box integration with leading MPI implementations such as Platform MPI and Open MPI, and requires no changes to the application. The following MPI collectives are currently supported by FCA and accelerated:
- MPI_Reduce
- MPI_Allreduce
- MPI_Barrier
- MPI_Bcast
- MPI_Allgather
- MPI_Allgatherv

FCA supports an unlimited message size and advanced optimizations for Torus topologies. It can work with any InfiniBand Subnet Management based software (OpenSM, Embedded SM, Host SM).

FCA supports the following data types for Reduce and Allreduce operations:
- All data types for C language bindings except MPI_LONG_DOUBLE
- All data types for C reduction functions (C reduction types)
- The following data types for FORTRAN language bin
24. 1 Introduction to Mellanox Fabric Collective Accelerator

1.1 Overview

The Mellanox Fabric Collective Accelerator (FCA) is a unique solution for offloading collective operations from the Message Passing Interface (MPI) process to the server CPUs. As a system-wide solution, FCA does not require any additional hardware. The FCA manager creates a topology-based collective tree and orchestrates an efficient collective operation using the CPUs in the servers that are part of the collective operation. FCA accelerates MPI collective operation performance by up to 100 times, providing a reduction in the overall job runtime. Implementation is simple and transparent during the job runtime.

FCA is built on the following main principles:
- Topology-aware Orchestration: The MPI collective logical tree is matched to the physical topology. The collective logical tree is constructed to assure:
  - Maximum utilization of fast inter-core communication
  - Distribution of the results
- Communication Isolation: Collective communications are isolated from the rest of the traffic in the fabric using a private virtual network (VLane), eliminating contention with other types of traffic.

The following diagram summarizes the FCA architecture.

[Figure 1: FCA Architecture. Diagram labels: inter-core communication optimized; collective tree and rank placement optimized to the topology; use of IB multicast for result distribution.]
25. 1.6 FCA Installation Package Content
2 Installation and Initial Configuration
2.1 Overview of Installation and Initial Configuration
2.1.1 Downloading the FCA Software
3 Installing FCA
3.1 Prerequisites
3.2 Installing the FCA Manager on a Dedicated Node
3.2.1 Installing the FCA Manager from RPM
3.2.2 Installing the FCA Manager from Tarball
3.2.3 Starting the FCA Manager
4 Installing FCA MPI Support Libraries
4.1 Building OpenMPI 1.4.x with FCA Support
4.2 Verifying the FCA Installation
4.3 Running MPI Jobs with FCA
5 Configuring FCA
5.1 FCA Manager Configuration Parameters
6 FCA MPI Runtime Library Configuration Parameters