Mellanox OFED for Linux Release Notes

Contents

1. 22. When using SR-IOV, make sure to set the interface down and unbind it BEFORE unloading the driver, removing a VF or restarting the VM; otherwise the kernel will lock up and a reboot is needed. Cleanup might not work perfectly, so the user should do it manually. 23. Repeatedly changing the mlx5_num_vfs value from 0 to non-zero might cause a kernel panic in the PF driver. Mellanox Technologies, Rev 3.1-1.0.3. 3.8 Resiliency. 3.8.1 Reset Flow Known Issues. Table 38: Resiliency Known Issues (Index, Description, Workaround). 1. SR-IOV non-persistent configuration (such as VGT, VST, Host assigned GUIDs and QP0-enabled VFs) may be lost upon Reset Flow. Workaround: Reset the Admin configuration post Reset Flow. 2. Upon Reset Flow or after running restart driver, Ethernet VLANs are lost. Workaround: Reset the VLANs using the ifup command. 3. Restarting the driver or running connectx_port_config when Reset Flow is running might result in a kernel panic. 4. Networking configuration (e.g. VLANs, IPv6) should be statically defined in order to have it set after Reset Flow, as is the case after restart driver. 5. After recovering from an EEH event, mlx5_core/mlx4_core unload may fail due to a bug in some kernel versions. The bug is fixed in Kernel 3.15. 3.9 Miscellaneous Known Issues. 3.9.1 General Known Issues. Table 39: General Known Issues
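The cleanup order in known issue 22 above (interface down, then unbind, and only then unload the driver) could be sketched as a dry-run helper. This is a minimal sketch, not a procedure from the release notes: the function name `sriov_cleanup_plan` is hypothetical, the default driver name `mlx4_core` and the example PCI address are assumptions, and the helper only prints the commands so they can be reviewed before anything is run.

```shell
# Hypothetical dry-run helper: print the cleanup commands in the order the
# known issue asks for. Nothing is executed; the output is meant for review.
# Assumptions: the device is unbound through the generic PCI sysfs "unbind"
# file, and the driver name defaults to mlx4_core (pass mlx5_core if needed).
sriov_cleanup_plan() {
    ifname=${1:-}
    pci_addr=${2:-}
    driver=${3:-mlx4_core}
    if [ -z "$ifname" ] || [ -z "$pci_addr" ]; then
        echo "usage: sriov_cleanup_plan IFNAME PCI_ADDR [DRIVER]" >&2
        return 1
    fi
    # Step 1: bring the interface down.
    printf 'ip link set dev %s down\n' "$ifname"
    # Step 2: unbind the PCI function from its driver.
    printf 'echo %s > /sys/bus/pci/drivers/%s/unbind\n' "$pci_addr" "$driver"
}
```

For example, `sriov_cleanup_plan eth4 0000:07:00.1` prints the two commands for a hypothetical VF; only after they have been run would the driver be unloaded, the VF removed or the VM restarted.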
2. Bug Fixes History. Table 50: Fixed Bugs List (#, Issue, Description, Discovered in Release, Fixed in Release). 31. Installation: MLNX_OFED v3.0-1.0.1 installation using yum fails on RH7.1 (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0). 32. mlx5 driver: In PPC systems, when working with a ConnectX-4 adapter card configured as Ethernet, driver load fails with BAD INPUT LENGTH. dmesg: command failed, status bad input length (0x50), syndrome 0x9074aa (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0). 33. Error counters such as CRC error counters and the RX out of range length error counter are missing in the ConnectX-4 Ethernet driver (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0). 34. Changing the RX queues number is not supported in the Ethernet driver when connected to a ConnectX-4 card (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0). 35. Ethernet: Hardware checksum call trace may appear when receiving IPv6 traffic on PPC systems that use the CHECKSUM_COMPLETE method (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0). 36. mlx4_en: Fixed a ping/traffic issue that occurred when RXVLAN offload was disabled and CHECKSUM_COMPLETE was used on ingress packets (discovered in 2.4-1.0.4, fixed in 3.0-1.0.1). 37. Security: CVE-2014-8159 fix: prevented integer overflow in the IB core module during memory registration (discovered in 2.0-2.0.5, fixed in 2.4-1.0.4). 38. mlx5_ib: Fixed the return value of max inline received size in the created QP (discovered in 2.3-2.0.1, fixed in 2.4-1.0.0). 39. Resolved soft lock on massive amount of user mem (discovered in 2.3-2.0.1, fixed in 2.4-1.0
3. Operating System / Platform: Ubuntu 14.04: x86_64, PPC64le (Power 8). Ubuntu 14.10: x86_64, PPC64le (Power 8). Ubuntu 15.04: x86_64, PPC64le (Power 8). Debian 6.0.10: x86_64. Debian 7.6: x86_64. Debian 8.0: x86_64. Debian 8.1: x86_64. Windriver 6.0: x86_64. kernel 3.10 - 4.1 (a). a. This kernel is supported only when using the Operating Systems stated in the table above. For RPM based Distributions, if you wish to install OFED on a different kernel, you need to create a new ISO image using the mlnx_add_kernel_support.sh script. See the MLNX_OFED User Guide for instructions. Upgrading MLNX_OFED on your cluster requires upgrading all of its nodes to the newest version as well. 1.2.1 Supported Hypervisors. The following are the supported Hypervisors in MLNX_OFED Rev 3.1-1.0.3: KVM: RedHat 6.6, 6.7, 7.1; Ubuntu 14.10, 15.04; Sles11SP4; Sles12; Debian 6.0.10. 1.2.2 Supported Non-Linux Virtual Machines. The following are the supported Non-Linux (InfiniBand only) Virtual Machines in MLNX_OFED Rev 3.1-1.0.3: Windows Server 2012 R2, Windows Server 2012, Windows Server 2008 R2. 1.3 Hardware and Software Requirements. The following are the hardware and software requirements of MLNX_OFED Rev 3.1-1.0.3: Linux operating system. Administrator privileges on your machine(s). Disk Space: 1GB. For the OFED Distribution to compile on your machine, some soft
4. Table 51: Change Log History (Release, Category, Description). Rev 2.1-1.0.6. Content, Packages Updates: The following packages were updated: bupc to v2 2 407, mstflint to v3 5 0 1 1 g76e4acf, perftest to v2 0 0 76 gbf9a463, hcoll to v2 0 472 1, Openmpi to v1 6 5 440ad47, dapl to v2 0 40. Rev 2.1-1.0.0. EoIB: EoIB is supported only in SLES11SP2 and RHEL6.4. eIPoIB: eIPoIB is currently at GA level. Connect-IB: Added the ability to resize CQs. IPoIB: Reusing DMA mapped SKB buffers: performance improvements when IOMMU is enabled. mlnx_en: Added reporting autonegotiation support. Added Transmit Packet Steering (XPS) support. Added reporting 56Gbit/s link speed support. Added Low Latency Socket (LLS) support. Added check for dma_mapping errors. eIPoIB: Added non-virtual environment support. Rev 2.0-3.0.0. Operating Systems: Additional OS support: SLES11SP3, Fedora16, Fedora17. Drivers: Added Connect-IB support. Installation: Added ability to install MLNX_OFED with SR-IOV support. Added Yum installation support. EoIB: EoIB (at beta level) is supported only in SLES11SP2 and RHEL6.4. mlx4_core: Modified module parameters to associate configuration values with specific PCI devices identified by their bus/device/function value format. mlx4_en: Reusing DMA mapped buffers: major performance improveme
5. Joining a multicast group in the SM using the RDMA_CM API requires IPoIB to first join the broadcast group. Table 23: IPoIB Known Issues (Continued) (Index, Description, Workaround). 11. Whenever the IOMMU parameter is enabled in the kernel, it can decrease the number of child interfaces on the device according to resource limitation. The driver will get stuck after an unknown number of child interfaces are created. For further information, please see: https://access.redhat.com/site/articles/66747, http://support.citrix.com/article/CTX136517, http://www.novell.com/support/kb/doc.php?id=7012337, https://bugzilla.redhat.com/show_bug.cgi?id=1044595. Workaround: To avoid this issue: decrease the amount of the RX receive buffers (module parameter, the default is 512); decrease the number of RX rings (sys/fs or ethtool in new kernels); avoid using IOMMU if not required. For KVM users, run: echo 1 > /sys/module/kvm/parameters/allow_unsafe_assigned_interrupts. To make this change persist across reboots, add the following to the /etc/modprobe.d/kvm.conf file (or create this file, if it does not exist): options kvm allow_unsafe_assigned_interrupts=1 (kernel parameters). 12. System might crash in skb_checksum_help() while performing TCP retransmit involving packets with 64k packet size. Workaround: Use UD mode in IPoIB. An output similar to the below will be printed: kernel BUG
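The KVM workaround in issue 11 above has two parts: a runtime write to the kvm module parameter and a persistent line in /etc/modprobe.d/kvm.conf. The persistent part could be sketched as below. The helper name `ensure_modprobe_option` is hypothetical, and the file path is passed as an argument so the function can first be tried on a scratch file.

```shell
# Hypothetical helper: add an "options" line to a modprobe.d file only if the
# exact line is not already there, creating the file when it does not exist
# (as the workaround text allows).
ensure_modprobe_option() {
    conf_file=${1:-}
    option_line=${2:-}
    if [ -z "$conf_file" ] || [ -z "$option_line" ]; then
        echo "usage: ensure_modprobe_option FILE 'options MODULE PARAM=VALUE'" >&2
        return 1
    fi
    # Create an empty file if it is missing.
    [ -f "$conf_file" ] || : > "$conf_file"
    # -q quiet, -x whole-line match, -F fixed string (no regex interpretation).
    if ! grep -qxF "$option_line" "$conf_file"; then
        printf '%s\n' "$option_line" >> "$conf_file"
    fi
}
```

As root, the call would be `ensure_modprobe_option /etc/modprobe.d/kvm.conf "options kvm allow_unsafe_assigned_interrupts=1"`, alongside the runtime command `echo 1 > /sys/module/kvm/parameters/allow_unsafe_assigned_interrupts` quoted in the notes. Running the helper twice leaves a single copy of the line.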
6. 1.1 Content of Mellanox OFED for Linux. 1.2 Supported Platforms and Operating Systems. 1.2.1 Supported Hypervisors. 1.2.2 Supported Non-Linux Virtual Machines. 1.3 Hardware and Software Requirements. 1.4 Supported HCAs Firmware Versions. 1.5 Compatibility. 1.6 RoCE Modes Matrix. Chapter 2 Changes and New Features in Rev 3.1-1.0.3. Chapter 3 Known Issues. 3.1 Driver Installation/Loading/Unloading/Start Known Issues. 3.1.1 Installation Known Issues. 3.1.2 Driver Unload Known Issues. 3.1.3 Driver Start Known Issues. 3.1.4 System Time Known Issues. 3.1.5 UEFI Secure Boot Known Issues. 3.2 Performance Known Issues. 3.3 HCAs Known Issues. 3.3.1 ConnectX-3 (mlx4 Driver) Known Issues. 3.3.2 ConnectX-4 (mlx5 Driver) Known Issues. 3.4 Ethernet Network. 3.4.1 Ethernet Known Issues. 3.4.2 Port Type Management Known Issues. 3.4.3 Flow Steer
7. In ConnectX-2, when the debug_level module parameter for module mlx4_core is non-zero, if the driver load succeeds, the message below is presented: mlx4_core 0000:0d:00.0: command SET_PORT (0xc) failed: in_param=0x120064000, in_mod=0x2, op_mod=0x0, fw status = 0x40. This message is simply part of the learning process for setting the maximum port VLs compatible with a 4K port MTU, and should be ignored. "openibd start" unloads kernel modules that were loaded from initrd/initramfs upon boot. This affects only kernel modules which come with MLNX_OFED and are included in initrd/initramfs. If a Lustre storage is used, it must be fully unloaded before restarting the driver or rebooting the machine; otherwise the machine might get stuck or panic. Workaround: 1. Unmount any mounted Lustre storages: # umount <lustre_mount_point>. 2. Unload all Lustre modules: # lustre_rmmod. Table 12: Driver Start Known Issues (Continued) (Index, Description, Workaround). 8. Driver unload fails with the following error message: Unloading rdma_cm [FAILED] rmmod: ERROR: Module rdma_cm is in use by: xprtrdma. Workaround: Make sure that there are no mount points over NFS/RDMA prior to unloading the driver, and run: # modprobe -r xprtrdma. In case the xprtrdma module keeps getting loaded automatically even though it is not used, add a pre-stop hook for the openibd service script t
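The check asked for in issue 8 above (no mount points over NFS/RDMA before unloading) could be sketched as follows. This is an illustration under assumptions, not part of the release notes: the function names are hypothetical, and the sketch assumes that NFS over RDMA mounts appear in /proc/mounts with a filesystem type beginning with "nfs" and a "proto=rdma" mount option.

```shell
# Hypothetical helper: print the mount points that use NFS over RDMA.
# Reads a /proc/mounts-style file (device, mount point, type, options, ...).
# Assumption: such mounts carry "proto=rdma" in the options field.
list_nfs_rdma_mounts() {
    mounts_file=${1:-/proc/mounts}
    awk '$3 ~ /^nfs/ && $4 ~ /proto=rdma/ { print $2 }' "$mounts_file"
}

# Hypothetical wrapper: refuse to remove xprtrdma while such mounts exist,
# otherwise run the "modprobe -r xprtrdma" command from the workaround.
unload_xprtrdma_if_idle() {
    busy=$(list_nfs_rdma_mounts "${1:-/proc/mounts}")
    if [ -n "$busy" ]; then
        echo "NFS/RDMA mounts still present, unmount them first:" >&2
        echo "$busy" >&2
        return 1
    fi
    modprobe -r xprtrdma
}
```

Calling `list_nfs_rdma_mounts` with no argument inspects the running system; the wrapper needs root because it calls modprobe.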
8. (Index, Description, Workaround). 1. On ConnectX-2/ConnectX-3 Ethernet adapter cards, there is a mismatch between the GUID value returned by firmware management tools and that returned by fabric/driver utilities that read the GUID via device firmware (e.g., using ibstat). Mlxburn/flint return 0xffff as GUID, while the utilities return a value derived from the MAC address. For all driver/firmware/software purposes, the latter value should be used. Workaround: N/A. Please use the GUID value returned by the fabric/driver utilities (not 0xfffff). 2. On rare occasions, under extremely heavy MAD traffic, MAD (Management Datagram) storms might cause soft lockups in the UMAD layer. 3. Packets are dropped on the SM server on big clusters. Workaround: Increase the recv_queue_size of the ib_mad module parameter for the SM server to 8K. The recv_queue_size default size is 4K. 3.9.2 ABI Compatibility Known Issues. Table 40: ABI Compatibility Known Issues (Index, Description, Workaround). 1. MLNX_OFED v2.3-1.0.1 is not ABI compatible with previous MLNX_OFED/OFED versions. Workaround: Recompile the application over the new MLNX_OFED version. 3.9.3 Connection Manager (CM) Known Issues. Table 41: Connection Manager (CM) Known Issues (Index, Description, Workaround). 1. When 2 different ports have identical GIDs, the CM might send its packets on the wrong port. Workaround: All ports must have different GIDs. 3.9.4 Fork Support Known Issues. Table 42
9. Power Management Quality of Service: when the traffic is active, the Power Management QoS is enabled by disabling the CPU states for maximum performance. Ethernet: PTP Hardware Clock support on kernels/OSes that support it. Verbs: Added additional experimental verbs interface. This interface exposes new features which are not yet integrated into the upstream libibverbs. The Experimental API is an extended API, therefore it is backward compatible, meaning old applications are not required to be recompiled to use MLNX_OFED v2.2-1.0.1. Performance: Out of the box performance improvements: use of affinity hints (based on NUMA node of the device) to indicate to the IRQ balancer daemon the optimal IRQ affinity; improvement in buffers allocation schema (based on the hint above); improvement in the adaptive interrupt moderation algorithm. Rev 2.1-1.0.6. IB Core: Added allocation success verification process to ib_alloc_device. dapl: dapl is recompiled with no FCA support. openibd: Added the ability to bring up child interfaces even if the parent's ifcfg file is not configured. libmlx4: Unmapped the hca_clock_page parameter from mlx4_uninit_context. scsi_transport_srp: scsi_transport_srp cannot be cleared up when rport reconnecting fails. mlnxofedinstall: Added support for the following parameters: --umad-dev-na, --without-<package>
10. tate. Debian 8: apt-get install libnl-3-200 automake debhelper curl dkms logrotate libglib2.0-0 python libxml2 graphviz tk tcl libvirt-bin coreutils pkg-config autotools-dev flex autoconf pciutils quilt module-init-tools libvirt0 libstdc++6 dpkg libgfortran3 procps lsof libltdl-dev gcc dpatch chrpath grep m4 gfortran bison libnl-route-3-200 swig perl make. 1.4 Supported HCAs Firmware Versions. MLNX_OFED Rev 3.1-1.0.3 supports the following Mellanox network adapter cards firmware versions. Table 6: Supported HCAs Firmware Versions (HCA: Recommended Firmware Rev. / Additional Firmware Rev. Supported). Connect-IB: 10.12.1100 / 10.12.0780. ConnectX-4 Lx: 14.12.1100 / 14.12.0780. ConnectX-4: 12.12.1100 / 12.12.0780. ConnectX-3 Pro: 2.35.5000 / 2.34.5000. ConnectX-3: 2.35.5000 / 2.34.5000. ConnectX-2: 2.9.1000 / 2.9.1000. For official firmware versions, please see: http://www.mellanox.com/content/pages.php?pg=firmware_download. 1.5 Compatibility. MLNX_OFED Rev 3.1-1.0.3 is compatible with the following. Table 7: MLNX_OFED Rev 3.1-1.0.3 Compatibility Matrix (Mellanox Product, Description/Version). MLNX-OS: MSX6036 w/w MLNX-OS version 3.3.4304. Grid Director 4036: w/w Grid Direct
11. MAD prevention is achieved by assigning a threshold for each agent's RX. Agent's RX threshold provides a protection mechanism to the host memory by limiting the agents' RX with a threshold (feature: DOS MAD Prevention, cont.). QoS per VF (Rate Limit per VF): Virtualized QoS per VF (supported in ConnectX-3/ConnectX-3 Pro adapter cards only, with firmware v2.33.5100 and above) limits the chosen VFs' throughput rate (maximum throughput). The granularity of the rate limitation is 1 Mbits. Ignore Frame Check Sequence (FCS) Errors: Upon receiving packets, the packets go through a checksum validation process for the FCS field. If the validation fails, the received packets are dropped. Using this feature enables you to choose whether or not to drop the frames in case the FCS is wrong, and use the FCS field for other info. Sockets Direct Protocol (SDP): Sockets Direct Protocol (SDP) is a byte-stream transport protocol that provides TCP stream semantics and utilizes InfiniBand's advanced protocol offload capabilities. Scalable Subnet Administration (SSA): The Scalable Subnet Administration (SSA) solves Subnet Administrator (SA) scalability problems for InfiniBand clusters. It distributes the data needed to perform the path-record calculation required for a node to connect to another node, and caches it locally in the compute (client) nodes. SSA requires AF_IB address family support (3.12.28-4 kernel and later). SR-IOV in Con Changed the A
12. P_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR on QP burst family creation. ACLs. SR-IOV eSwitch offloads: priority and dscp forcing; Loopback decision; VLAN insertion; encapsulation (encap/decap); sniffer. Signature. 3.10 InfiniBand Fabric Utilities Known Issues. 3.10.1 Performance Tools Known Issues. Table 47: Performance Tools Known Issues (Index, Description, Workaround). 1. perftest package in MLNX_OFED v2.2-1.0.1 and onwards does not work with older versions of the driver. 3.10.2 Diagnostic Utilities Known Issues. Table 48: Diagnostic Utilities Known Issues (Index, Description, Workaround). 1. When running the ibdiagnet check nodes_info on the fabric, a warning specifying that the card does not support general info capabilities for all the HCAs in the fabric will be displayed. Workaround: Run ibdiagnet --skip nodes_info. 2. ibdump does not work when IPoIB device managed Flow Steering is OFF and at least one of the ports is configured as InfiniBand. Workaround: Enable IPoIB Flow Steering and restart the driver. For further information, please refer to MLNX_OFED User Manual section Enable/Disable Flow Steering. 3.10.3 Tools Known Issues. Table 49: Tools Known Issues (Index, Description, Workaround). 1. Workaround: Run ibdump with firmware v2.33.5000 and. Description: Running ibdump in InfiniBand mode with firmware older than v2.33.5000 may cau
13. list of general limitations and known issues of the various components of this Mellanox OFED for Linux release. 3.1 Driver Installation/Loading/Unloading/Start Known Issues. 3.1.1 Installation Known Issues. Table 10: Installation Known Issues (Index, Description, Workaround). 1. When upgrading from an earlier Mellanox OFED version, the installation script does not stop the earlier version prior to uninstalling it. Workaround: Stop the old OFED stack (/etc/init.d/openibd stop) before upgrading to this new version. Upgrading from the previous OFED installation to this release does not unload the kernel module ipoib_helper. Workaround: Reboot after installing the driver. Installation using Yum does not update HCA firmware. Workaround: See "Updating Firmware After Installation" in OFED User Manual. The --total-vfs <0-63> installation parameter is no longer supported. Workaround: Use the --enable-sriov installation parameter to burn firmware with SR-IOV support. The number of virtual functions (VFs) will be set to 16. For further information, please refer to the User Manual. When using bonding on Ubuntu OS, the ifenslave package must be installed. On PPC systems, the ib_srp module is not installed by default since it breaks the ibmvscsi module. Workaround: If your system does not require the ibmvscsi module, run the mlnxofedinstall script with the --with-srp flag. 3.1.2 Driver Unload Known Issues. Table 11: Driver Unlo
14. prevent process starvation due to MAD packet storm (discovered in 2.3-2.0.1, fixed in 2.3-2.0.5). 49. IPoIB: Fixed an issue which prevented the spread of events among the closest NUMA CPUs when only a single RX queue existed in the system (discovered in 2.3-1.0.1, fixed in 2.3-2.0.0). 50. Returned the CQ to its original state (armed) to prevent traffic from stopping (discovered in 2.3-1.0.1, fixed in 2.3-2.0.0). 51. Fixed a TX timeout issue in CM mode, which occurred under heavy stress combined with ifup/ifdown operation on the IPoIB interface (discovered in 2.1-1.0.0, fixed in 2.3-2.0.0). 52. mlx4_core: Fixed a "sleeping while atomic" error that occurred when the driver ran many firmware commands simultaneously (discovered in 2.3-1.0.1, fixed in 2.3-2.0.0). 53. mlx4_ib: Fixed an issue related to spreading of completion queues among multiple MSI-X vectors to allow better utilization of multiple cores (discovered in 2.1-1.0.0, fixed in 2.3-2.0.0). 54. Fixed an issue that caused an application to fail when attaching Shared Memory (discovered in 2.3-1.0.1, fixed in 2.3-2.0.0). 55. mlx4_en: Fixed dmesg warnings: NOHZ: local_softirq_pending 08 (discovered in 2.3-1.0.1, fixed in 2.3-2.0.0). 56. Fixed erratic report of hardware clock which caused bad report of PTP hardware Time Stamping (discovered in 2.1-1.0.0, fixed in 2.3-2.0.0). 57. mlx5_core: Fixed a race when async events arrived during driver load (discovered in 2.3-1.0.1, fixed in 2.3-2.0.0). 58. Fixed a race in mlx5_eq_int when events arrived before eq dev was set (discovered in 2.3-1.0.1, fixed in 2.3-2.0.0). 59. Enabled all pending interrupt handlers completion befor (discovered in 2.3-1.0.1, fixed in 2.3-2.0.0)
15. the adaptive interrupt moderation routine, which sets high values of interrupt coalescing, causing the driver to process a large number of packets in the same interrupt, leading UDP to drop packets due to overflow in its buffers. Workaround: ethtool -C <ethX> adaptive-rx off rx-usecs 64 rx-frames 24. The values above may need tuning, depending on the system, configuration and link speed. 4. Performance degradation might occur when bonding Ethernet interfaces on Centos 6.5. 3.3 HCAs Known Issues. 3.3.1 ConnectX-3 (mlx4 Driver) Known Issues. Table 16: ConnectX-3 (mlx4 Driver) Known Issues (Index, Description, Workaround). 1. Using RDMA READ with a higher value than 30 SGEs in the WR might lead to "local length error". Workaround: Do not set the value of SGEs higher than 30 when RDMA READ is used. 3.3.2 ConnectX-4 (mlx5 Driver) Known Issues. Table 17: ConnectX-4 (mlx5 Driver) Known Issues (Index, Description, Workaround). 1. Atomic Operations in Connect-IB are fully supported on big-endian machines (e.g. PPC). Their support is limited on little-endian machines (e.g. x86). EEH events that arrive while the mlx5 driver is loading may cause the driver to hang. The mlx5 driver can handle up to 5 EEH events per hour. Workaround: If more events are received, cold reboot the machine. When working with Connect-IB firmware
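The UDP receiver workaround at the start of this excerpt (adaptive interrupt moderation off, fixed coalescing values) could be wrapped in a small helper that builds the ethtool command for a given interface. This is a sketch: the function name `coalesce_cmd` is hypothetical, the defaults 64 and 24 are the values quoted in the notes, and the helper only prints the command so the values can be tuned before it is run.

```shell
# Hypothetical helper: print the ethtool coalescing command for an interface.
# rx-usecs and rx-frames default to the values quoted in the release notes,
# which the notes say may need tuning per system, configuration and link speed.
coalesce_cmd() {
    ifname=${1:-}
    rx_usecs=${2:-64}
    rx_frames=${3:-24}
    if [ -z "$ifname" ]; then
        echo "usage: coalesce_cmd IFNAME [RX_USECS] [RX_FRAMES]" >&2
        return 1
    fi
    printf 'ethtool -C %s adaptive-rx off rx-usecs %s rx-frames %s\n' \
        "$ifname" "$rx_usecs" "$rx_frames"
}
```

For instance, `coalesce_cmd eth2` prints the command with the documented defaults, and `coalesce_cmd eth2 32 16` prints a variant with lower, purely illustrative values.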
16. v10.10.5054, the following message would appear in driver start: command failed, status bad system state (0x4), syndrome 0x408b33. The message can be safely ignored. Workaround: Upgrade Connect-IB firmware to the latest available version. Changing the link speed is not supported in the Ethernet driver when connected to a ConnectX-4 card. Bonding "active-backup" mode does not function properly. Rate, speed and width using IB sysfs/tools are available in RoCE mode in ConnectX-4 only after port physical speed configuration is done. Since MLNX_OFED's openibd does not unload modules while OpenSM is running, removing mlx5_core manually while OpenSM is running may cause it to be out of sync when probed again. Workaround: Restart OpenSM. ConnectX-4 port GIDs table shows a duplicated RoCE v2 default GID. 3.4 Ethernet Network. 3.4.1 Ethernet Known Issues. Ethernet Known Issues are applicable to ConnectX-3/ConnectX-3 Pro only. Table 18: Ethernet Known Issues (Index, Description, Workaround). 1. When creating more than 125 VLANs and SR-IOV mode is enabled, a kernel warning message will be printed indicating that the native VLAN is created but will not work with RoCE traffic. kernel warning: mlx4_core 0000:07:00.0: vhcr command ALLOC_RES (0xf00) slave:0 in_param 0x7e in_mod=0x107, op_mod=0x1 failed with error:0, status -28. 2. Kernel panic might occu
17. which may cause ISER to log a stack trace. 3.7 Virtualization. 3.7.1 SR-IOV Known Issues. Table 37: SR-IOV Known Issues (Index, Description, Workaround). 1. When using legacy VMs with an MLNX_OFED 2.x hypervisor, you may need to set the enable_64b_cqe_eqe parameter to zero on the hypervisor. It should be set in the same way that other module parameters are set for mlx4_core at module load time. For example, add "options mlx4_core enable_64b_cqe_eqe=0" as a line in the file /etc/modprobe.d/mlx4_core.conf. The mlx4_port1_mtu sysfs entry shows a wrong MTU number in the VM. When at least one port is configured as InfiniBand, and the num_vfs is provided but the probe_vf is not, HCA initialization fails. Workaround: Use both the num_vfs and the probe_vf in the modprobe line. When working with a bonding device to enslave the Ethernet devices in active-backup mode and failover MAC policy in a Virtual Machine (VM), establishment of RoCE connections may fail. Workaround: Unload the module mlx4_ib and reload it in the VM. Attaching or detaching a Virtual Function on SLES11 SP3 to a guest Virtual Machine while the mlx4_core driver is loaded in the Virtual Machine may cause a kernel panic in the hypervisor. Workaround: Unload the mlx4_core module in the hypervisor before attaching or detaching a function to or from the guest. When detaching a VF without shutt
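The num_vfs/probe_vf workaround above says that both parameters should be given together. A minimal sketch of a helper that refuses to emit an mlx4_core options line with only one of them follows; the function name `mlx4_sriov_options` and the example values are hypothetical, and the output is only printed, not written to any file.

```shell
# Hypothetical helper: build an mlx4_core "options" line that always carries
# num_vfs and probe_vf together, since the notes report that HCA
# initialization fails when num_vfs is given without probe_vf on a device
# with at least one InfiniBand port.
mlx4_sriov_options() {
    num_vfs=${1:-}
    probe_vf=${2:-}
    if [ -z "$num_vfs" ] || [ -z "$probe_vf" ]; then
        echo "usage: mlx4_sriov_options NUM_VFS PROBE_VF" >&2
        return 1
    fi
    printf 'options mlx4_core num_vfs=%s probe_vf=%s\n' "$num_vfs" "$probe_vf"
}
```

With illustrative values, `mlx4_sriov_options 8 4` prints `options mlx4_core num_vfs=8 probe_vf=4`, which could then be placed in /etc/modprobe.d/mlx4_core.conf next to the enable_64b_cqe_eqe line mentioned in issue 1.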
18. 0 ory registrations. 40. InfiniBand Counters: Occasionally, port_rcv_data and port_xmit_data counters may not function properly (discovered in 2.3-1.0.1, fixed in 2.4-1.0.0). 41. mlx4_en: LRO fixes and improvements for jumbo MTU (discovered in 2.3-2.0.1, fixed in 2.4-1.0.0). 42. Fixed a crash that occurred when changing the number of rings (ethtool set-channels) while the interface was connected to netconsole (discovered in 2.2-1.0.1, fixed in 2.4-1.0.0). 43. Fixed ping issues with IP fragmented datagrams in MTUs 1600 1700 (discovered in 2.2-1.0.1, fixed in 2.4-1.0.0). 44. The default priority to TC mapping assigns all priorities to TC0. This configuration achieves fairness in transmission between priorities but may cause undesirable PFC behavior where a pause request for priority "n" affects all other priorities (discovered in 2.3-1.0.1, fixed in 2.4-1.0.0). 45. mlx5_ib: Fixed an issue related to large memory regions registration. The problem mainly occurred on PPC systems due to the large page size, and on non-PPC systems with large pages (contiguous pages) (discovered in 2.3-2.0.1, fixed in 2.3-2.0.5). 46. Fixed an issue in verbs API: fallback to glibc on contiguous memory allocation failure (discovered in 2.3-2.0.1, fixed in 2.3-2.0.5). 47. IPoIB: Fixed a memory corruption issue in multi-core systems due to intensive IPoIB transmit operation (discovered in 2.3-2.0.1, fixed in 2.3-2.0.5). 48. IB MAD: Fixed an issue to
19. 0. 21. mlx4_ib: Fixed mismatch between SL and VL in outgoing QP1 packets, which caused buffer overruns in attached switches at high MAD rates (discovered in 3.0-1.0.1, fixed in 3.1-1.0.0). 22. SR-IOV/RoCE: Fixed a problem on VFs where the RoCE driver registered a zero MAC into the port's MAC table (during QP1 creation) because the ETH driver had not yet generated a non-zero random MAC for the ETH port (discovered in 2.3-1.0.1, fixed in 3.1-1.0.0). 23. Removed BUG_ON assert when checking if the ring is full (discovered in 3.0-1.0.1, fixed in 3.1-1.0.0). 24. libvma: Added libvma support for Debian 8.0 x86_64 and Ubuntu 15.04 (discovered in 3.0-2.0.1, fixed in 3.1-1.0.0). 25. IPoIB: Fixed an issue which prevented the failure to destroy QP upon IPoIB unload on debug kernel (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0). 26. Configuration: Fixed an issue which prevented the driver version from being reported to the Remote Access Controller tools (such as iDRAC) (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0). 27. SR-IOV: Passed the correct port number in port-change event to single-port VFs, where the actual physical port used is port 2 (discovered in 2.4-1.0.0, fixed in 3.0-2.0.0). 28. Enabled OpenSM, running over a ConnectX-3 HCA, to manage a mixed ConnectX-3/ConnectX-4 network (by recognizing the "Well-known GID" in mad demux processing) (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0). 29. Fixed double-free memory corruption in case where SR-IOV enabling failed (error flow) (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0). 30. Start-up sequence: Fixed a crash in EQ's initialization error flow (discovered in 3.0-1.0.1, fixed in 3.0-2.0.0).
20. 3.9.1 General Known Issues. 3.9.2 ABI Compatibility Known Issues. 3.9.3 Connection Manager (CM) Known Issues. 3.9.4 Fork Support Known Issues. 3.9.5 MLNX_OFED Sources Known Issues. 3.9.6 Uplinks Known Issues. 3.9.7 Resources Limitation Known Issues. 3.9.8 Accelerated Verbs Known Issues. 3.10 InfiniBand Fabric Utilities Known Issues. 3.10.1 Performance Tools Known Issues. 3.10.2 Diagnostic Utilities Known Issues. 3.10.3 Tools Known Issues. Chapter 4 Bug Fixes History. Chapter 5 Change Log History. Chapter 6 API Change Log History. List Of Tables. Table 1: Release Update History. Table 2: Supported Uplinks to Servers. Table 3: Mellanox OFED for Linux Software Components. Table 4: Supported Platforms and Operating Systems. Table 5: Additional Software Packages. Table 6: Supported HCAs Firmwar
21. 4 0x90 [ib_ipoib]. 3.5.2 eIPoIB Known Issues. Table 24: eIPoIB Known Issues (Index, Description, Workaround). 1. On rare occasions, upon driver restart, the following message is shown in the dmesg: cannot create duplicate filename '/class/net/eth_ipoib_interfaces'. 2. No indication is received when eIPoIB is non-functional. Workaround: Run "ps -ef | grep ipoibd" to verify its functionality. 3. eIPoIB requires libvirtd, python. 4. eIPoIB supports only active-backup mode for bonding. 5. eIPoIB supports only VLAN Switch Tagging (VST) mode on guests. 6. IPv6 is currently not supported in eIPoIB. 7. eIPoIB cannot run when Flow Steering is enabled. 8. eIPoIB daemon requires the following libs in order to run: python-libxml2, libvirt-bin, libvirt0. 9. The eIPoIB driver in ConnectX-3 and Connect-IB is currently at beta level. 3.5.3 XRC Known Issues. Table 25: XRC Known Issues (Index, Description, Workaround). 1. Legacy API is deprecated, thus when recompiling applications over MLNX_OFED v2.0-3.x.x, warnings such as the below are displayed: rdma.c:1699: warning: 'ibv_open_xrc_domain' is deprecated (declared at /usr/include/infiniband/ofa_verbs.h:72); rdma.c:1706: warning: 'ibv_create_xrc_srq' is deprecated (declared at /usr/include/infiniband/ofa_verbs.h:89). These warnings can be safely ign
22. CAP_CQE_WAIT. From IBV_M_WQE_CALC_CAP to IBV_M_WQE_CAP_CALC_SEND
23. ECTED_MODE in its ifcfg file will cause the interface bring-up to fail. Workaround (fragment): restart or reboot the system. 17. Clone interfaces receive a duplicated IPv6 address when a child interface with the same PKey (a.k.a. clone interface) is used for all the clones. 18. The eth_ipoib module is not loaded after reloading the ib_ipoib module. Workaround: To restart the IPoIB driver, run /etc/init.d/openibd restart. Do not restart it by manually restarting each module. 19. In Ubuntu and Debian, the default of the recv_queue_size and send_queue_size is 128 according to the io_mmu issue. 20. In RHEL6.7, when the Network Manager service is enabled and an IPoIB interface is configured using the nm-connection-editor tool, the generated ifcfg file is missing the DEVICE=<interface name> parameter. Hence, changing the CONNECTED_MODE in the ifcfg file will cause a failure in the interface bring-up. Workaround: Either disable the Network Manager or add DEVICE=<interface name> to the interface's ifcfg file. 21. The ifdown command does not function in RH7.x. 22. In RHEL7.2, when creating a child interface for the IPoIB interface, in some cases you might get a trace or a panic such as the following: RIP: 0010:[<ffffffffa05e204c>] [<ffffffa05e204c>] ipoib_get_iflink+0x1c/0x30 [ib_ipoib] Call Trace: register_netdevice+0x140/0x430 ipoib_vlan_add+0x10d/0x290 [ib_ipoib] ipoib_vlan_add+0x1b5/0x240 [ib_ipoib] create_child+0x6
24. played in dmesg if the Mellanox's x.509 Public Key is not added to the system: [4671958.383506] Request for unknown module key 'Mellanox Technologies signing key: 61feb074fc7292f958419386ffdd9d5ca999e403' err -11. This error can be safely ignored as long as Secure Boot is disabled on the system. Workaround: For further information, please refer to the User Manual section "Enrolling Mellanox's x.509 Public Key On your Systems". 2. Ubuntu12 requires update of user space open-iscsi to v2.0.873. 3. The initiator does not respect the interface parameter while logging in. Workaround: Configure each interface on a different subnet. 3.2 Performance Known Issues. Table 15: Performance Known Issues (Index, Description, Workaround). 1. On machines with the irqbalancer daemon turned off, the default InfiniBand interrupts will be routed to a single core, which may cause overload and software/hardware lockups. Workaround: Execute the following script as root: set_irq_affinity.sh <interface or IB device> [2nd interface or IB device]. 2. Out of the box throughput performance in Ubuntu14.04 is not optimal and may achieve results below the line rate in 40GE link speed. Workaround: For additional performance tuning, please refer to the Performance Tuning Guide. 3. UDP receiver throughput may be lower than expected when running over the mlx4_en Ethernet driver. Workaround: Disable adaptive interrupt moderation and set lower values for the interrupt coalescing manually. This is caused by
25. Fork Support Known Issues (Index, Description, Workaround). 1. Fork support from kernel 2.6.12 and above is available provided that applications do not use threads. fork() is supported as long as the parent process does not run before the child exits or calls exec(). The former can be achieved by calling wait(childpid), and the latter can be achieved by application-specific means. The POSIX system() call is supported. 3.9.5 MLNX_OFED Sources Known Issues. Table 43: MLNX_OFED Sources Known Issues (Index, Description, Workaround). 1. MLNX_OFED includes the OFED source RPM packages used as a build platform for kernel code, but does not include the sources of Mellanox proprietary packages. 3.9.6 Uplinks Known Issues. Table 44: Uplinks Known Issues (Index, Description, Workaround). 1. On rare occasions, the ConnectX-3 Pro adapter card may fail to link up when performing parallel detect to 40GbE. Workaround: Restart the driver. 3.9.7 Resources Limitation Known Issues. Table 45: Resources Limitation Known Issues (Index, Description, Workaround). 1. The device capabilities reported may not be reached, as this depends on the system on which the device is installed and on whether the resource is allocated in the kernel or the userspace. 2. mlx4_core can allocate up to 64 MSI-X vectors, an MSI-X vector per CPU. 3. Setting more IP addre
26. Known Issues. Table 50: Fixed Bugs List. Table 51: Change Log History. Table 52: API Change Log History. Release Update History. Table 1: Release Update History (Release, Date, Description). Rev 3.1-1.0.3, December 10, 2015: Updated Table 4, "Supported Platforms and Operating Systems": added RHEL/CentOS 7.2. Added known issue 22 in Table 23, "IPoIB Known Issues". October 01, 2015: This is the initial release of this MLNX_OFED release. 1 Overview. These are the release notes of Mellanox OFED for Linux Driver, Rev 3.1-1.0.3. Mellanox OFED is a single Virtual Protocol Interconnect (VPI) software stack and operates across all Mellanox network adapter solutions supporting the following uplinks to servers. Table 2: Supported Uplinks to Servers (Uplink/HCAs, Uplink Speed). Connect-IB: InfiniBand: SDR, QDR, FDR, EDR. ConnectX-4: InfiniBand: SDR, QDR, FDR, EDR; Ethernet: 10GigE, 25GigE, 40GigE, 50GigE and 100GigE. ConnectX-4 Lx: Ethernet: 10GigE, 25GigE, 40GigE and 50GigE. ConnectX-3/ConnectX-3 Pro: InfiniBand: SDR, QDR, FDR10, FDR; Ethernet: 10GigE, 40GigE and 56GigE. ConnectX-2: InfiniBand: SDR, DDR; Ethe
27. Mellanox Technologies. Connect. Accelerate. Outperform.

    Mellanox OFED for Linux Release Notes
    Rev 3.1-1.0.3
    www.mellanox.com

    NOTE: THIS HARDWARE, SOFTWARE OR TEST SUITE PRODUCT ("PRODUCT(S)") AND ITS RELATED DOCUMENTATION ARE PROVIDED BY MELLANOX TECHNOLOGIES "AS-IS" WITH ALL FAULTS OF ANY KIND AND SOLELY FOR THE PURPOSE OF AIDING THE CUSTOMER IN TESTING APPLICATIONS THAT USE THE PRODUCTS IN DESIGNATED SOLUTIONS. THE CUSTOMER'S MANUFACTURING TEST ENVIRONMENT HAS NOT MET THE STANDARDS SET BY MELLANOX TECHNOLOGIES TO FULLY QUALIFY THE PRODUCT(S) AND/OR THE SYSTEM USING IT. THEREFORE, MELLANOX TECHNOLOGIES CANNOT AND DOES NOT GUARANTEE OR WARRANT THAT THE PRODUCTS WILL OPERATE WITH THE HIGHEST QUALITY. ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT ARE DISCLAIMED. IN NO EVENT SHALL MELLANOX BE LIABLE TO CUSTOMER OR ANY THIRD PARTIES FOR ANY DIRECT, INDIRECT, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES OF ANY KIND (INCLUDING, BUT NOT LIMITED TO, PAYMENT FOR PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY FROM THE USE OF THE PRODUCT(S) AND RELATED DOCUMENTATION EVEN IF ADVISED OF THE POSSIBILITY OF S
28. N tag stripping offload via ethtool

    - 128 Byte Completion Queue Entry (CQE)
    - Non-Linux Virtual Machines: Added Windows Virtual Machine over Linux KVM Hypervisor (SR-IOV with InfiniBand only) support.

    Rev 2.2-1.0.1

    - mlnxofedinstall: 32-bit libraries are no longer installed by default on 64-bit OS. To install 32-bit libraries, use the '--with-32bit' installation parameter.
    - openibd: Added pre/post start/stop scripts support. For further information, please refer to section "openibd Script" in the MLNX_OFED User Manual.
    - Reset Flow: Reset Flow is not activated by default. It is controlled by the mlx4_core 'internal_err_reset' module parameter.

    Table 51: Change Log History (continued)

    Rev 2.2-1.0.1

    - InfiniBand Core: Asymmetric MSI-X vectors allocation for the SR-IOV hypervisor and guest instead of allocating 4 default MSI-X vectors. The maximum number of MSI-X vectors is num_cpu for port.
      ConnectX-3 has 1024 MSI-X vectors; 28 MSI-X vectors are reserved.
      Physical Function: gets the number of MSI-X vectors according to the pf_msix_table_size (multiple of 4 - 1) INI parameter.
      Virtual Functions: the remaining MSI-X vectors are spread equally between all VFs, according to the num_vfs mlx4_core module parameter.
    - Ethernet: Ethernet VXLAN support for kernels 3.12.10 or higher
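The MSI-X split described in the InfiniBand Core entry can be sketched as simple arithmetic. The following Python snippet is an illustration under stated assumptions, not the driver's actual algorithm: the function name `vf_msix_vectors` is hypothetical, the division is assumed to round down, and the per-port num_cpu cap mentioned in the entry is ignored.

```python
TOTAL_MSIX_VECTORS = 1024    # ConnectX-3 total, as stated in the entry above
RESERVED_MSIX_VECTORS = 28   # reserved vectors, as stated in the entry above


def vf_msix_vectors(pf_vectors: int, num_vfs: int) -> int:
    """Return the number of MSI-X vectors each VF would get.

    pf_vectors stands for the value taken by the Physical Function
    (derived from pf_msix_table_size); num_vfs is the number of VFs.
    Rounding down is an assumption made for this sketch.
    """
    if num_vfs <= 0:
        raise ValueError("num_vfs must be positive")
    remaining = TOTAL_MSIX_VECTORS - RESERVED_MSIX_VECTORS - pf_vectors
    if remaining < 0:
        raise ValueError("pf_vectors exceeds the available vectors")
    # The remaining vectors are spread equally between all VFs.
    return remaining // num_vfs
```

For example, with 63 vectors taken by the PF and 8 VFs, 933 vectors remain and each VF would receive 116 under these assumptions.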
29. P Known Issues

    1. MLNX_OFED SRP installation breaks the ibmvstgt and ibmvscsi symbol resolution in RHEL7.0.

    3.6.3 SRP Interop Known Issues

    Table 31: SRP Interop Known Issues

    1. The driver is tested with Storage target vendors' recommendations for multipath.conf extensions (ZFS, DDN, TMS, Nimbus, NetApp).

    3.6.4 DDN Storage Fusion 10000 Target Known Issues

    Table 32: DDN Storage Fusion 10000 Target Known Issues

    1. DDN does not accept non-default P_Key connection establishment.

    3.6.5 Oracle Sun ZFS Storage 7420 Known Issues

    Table 33: Oracle Sun ZFS Storage 7420 Known Issues

    1. Ungraceful power cycle of an initiator connected with Targets DDN, Nimbus, NetApp may result in temporary "stale connection" messages when initiator reconnects.

    3.6.6 iSER Initiator Known Issues

    Table 34: iSER Initiator Known Issues

    1. On SLES OSs, the ib_iser module does not load on boot.
       Workaround: Add a dummy interface using iscsiadm:
       # iscsiadm -m iface -I ib_iser -o new
       # iscsiadm -m iface -I ib_iser -o update -n iface.transport_name -v ib_iser
    2. Ubuntu12 requires update of user space open-iscsi to v2.0.873.
    3. The initiat
30. RoCE does not support Multicast Listener Discovery (MLD), therefore multicast traffic over IPv6 may not work as expected.

    6. DIF: When running IO over FS over DM during unstable ports, block layer BIO merges may cause false DIF error.
    7. connectx_port_config configurations is not saved after unbind/bind.
       Workaround: Re-run "connectx_port_config".

    3.4.3 Flow Steering Known Issues

    Table 20: Flow Steering Known Issues

    1. Flow Steering is disabled by default in firmware version 2.32.5100.
       Workaround: To enable it, set the parameter below as follows: log_num_mgm_entry_size should be set to -1.
    2. IPv4 rule with source IP cannot be created in SLES 11.x OSes.
    3. RFS does not support UDP.
    4. When working in DMFS:A0 mode and VM/hypervisor is MLNX_OFED 2.3-x.x.x, the second side (hypervisor/VM respectively) should be MLNX_OFED 2.3-x.x.x as well.
    5. Setting ARP flow rules through ethtool is not allowed.

    3.4.4 Quality of Service Known Issues

    Table 21: Quality of Service Known Issues

    1. QoS is not supported in XenServer, Debian 6.0 and 6.2 with uek kernel.
    2. When QoS features are not supported by the kernel, mlnx_qos tool may crash.
    3. QoS default settings are not returned after configuring QoS.

    3.4.5 Ethernet Performan
31. The following verbs have become deprecated:
    - struct ibv_xrc_domain *ibv_open_xrc_domain
    - ibv_exp_create_qp
    - ibv_exp_query_device
    - struct ibv_srq *ibv_create_xrc_srq
    - int ibv_close_xrc_domain
    - int ibv_create_xrc_rcv_qp
    - int ibv_modify_xrc_rcv_qp
    - int ibv_query_xrc_rcv_qp
    - int ibv_reg_xrc_rcv_qp
    - int ibv_unreg_xrc_rcv_qp

    Table 52: API Change Log History (continued)

    Rev 2.0-2.0.5

    Libibverbs - Extended speeds:
    - Missing the ext_active_speed attribute from the struct ibv_port_attr
    - Removed function ibv_ext_rate_to_int
    - Added functions ibv_rate_to_mbps and mbps_to_ibv_rate

    Libibverbs - Raw QPs:
    - QP types IBV_QPT_RAW_PACKET and IBV_QPT_RAW_ETH are not supported

    Libibverbs - Contiguous pages:
    - Added Contiguous pages support
    - Added function ibv_reg_shared_mr

    Libmverbs:
    - The enumeration IBV_M_WR_CALC was renamed to IBV_M_WR_CALC_SEND
    - The enumeration IBV_M_WR_WRITE_WITH_IMM was added
    - In the structure ibv_m_send_wr, the union wr.send was renamed to wr.calc_send and wr.rdma was added
    - The enumeration IBV_M_WQE_CAP_CALC_RDMA_WRITE_WITH_IMM was added
    - The following enumerations were renamed:
      - From IBV_M_WQE_SQ_ENABLE_CAP to IBV_M_WQE_CAP_SQ_ENABLE
      - From IBV_M_WQE_RQ_ENABLE_CAP to IBV_M_WQE_CAP_RQ_ENABLE
      - From IBV_M_WQE_CQE_WAIT_CAP to IBV_M_WQE
32. UCH DAMAGE.

    Mellanox Technologies
    350 Oakmead Parkway, Suite 100
    Sunnyvale, CA 94085, U.S.A.
    www.mellanox.com
    Tel: (408) 970-3400
    Fax: (408) 970-3403

    © Copyright 2015. Mellanox Technologies. All Rights Reserved.

    Mellanox, Mellanox logo, BridgeX, CloudX logo, Connect-IB, ConnectX, CoolBox, CORE-Direct, GPUDirect, InfiniHost, InfiniScale, Kotura, Kotura logo, Mellanox Federal Systems, Mellanox Open Ethernet, Mellanox ScalableHPC, Mellanox Connect Accelerate Outperform logo, Mellanox Virtual Modular Switch, MetroDX, MetroX, MLNX-OS, Open Ethernet logo, PhyX, SwitchX, TestX, The Generation of Open Ethernet logo, UFM, Virtual Protocol Interconnect, Voltaire and Voltaire logo are registered trademarks of Mellanox Technologies, Ltd.

    Accelio, CyPU, FPGADirect, HPC-X, InfiniBridge, LinkX, Mellanox Care, Mellanox CloudX, Mellanox Multi-Host, Mellanox NEO, Mellanox PeerDirect, Mellanox Socket Direct, Mellanox Spectrum, NVMeDirect, StPU, Spectrum logo, Switch-IB, Unbreakable-Link are trademarks of Mellanox Technologies, Ltd.

    All other trademarks are property of their respective owners.

    Table of Contents
    List of Tables
    Chapter 1 Overview
33. Ubuntu14.04. The issue is within the bridge modules in those kernels. In Vanilla kernels above 3.16, the issue is fixed.

    11. In RH6.4, ping may not work over VLANs that are configured over Linux bridge when the bridge has a mlx4_en interface attached to it.
    12. The interfaces' LRO needs to be set to "OFF" manually when there is a bond configured on Mellanox interfaces with a Bridge over that bond.
        Workaround: Run: ethtool -K ethX lro off
    13. On SLES12, the bonding interface over Mellanox Ethernet slave interfaces does not get IP address after reboot.
        Workaround:
        1. Set "STARTMODE=hotplug" in the bonding slave's ifcfg files. More details can be found in the SUSE documentation page: https://www.suse.com/documentation/sles-12/book_sle_admin/?page=/documentation/sles-12/book_sle_admin/data/sec_bond.html
        2. Enable the "nanny" service to support hot-plugging: open the "/etc/wicked/common.xml" file and change
           <use-nanny>false</use-nanny>
           to
           <use-nanny>true</use-nanny>
        3. Run: # systemctl restart wickedd.service wicked
    14. ethtool -x command does not function in SLES OS.
    15. Ethertype proto 0x806 not supported by ethtool.
    16. ETS configuration is not supported in the following kernels:
        - 3.7
        - 3.8
        - 3.9
        - 3.10
        - 3.2.37-94_fbk17_01925_g8e3b329
        - 3.14
        - 3.2.55-106_fbk22_00877_g6902630
        - 3.2.28-76_fbk14_00230_g3c40d9e
    17. ETS is not supported in kernels that do not have MQPRIO as QDISC_KIND opti
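The "nanny" step of the SLES12 bonding workaround is a plain text substitution in the wicked configuration. Below is a minimal Python sketch of that edit, assuming the element appears in the file exactly as `<use-nanny>false</use-nanny>`; the function name `enable_nanny` is hypothetical, and the sketch works on a string so it can be tried without touching /etc/wicked/common.xml.

```python
def enable_nanny(xml_text: str) -> str:
    """Return the config text with the nanny service switched on.

    Hypothetical helper: it performs only the literal substitution
    described in the workaround and leaves everything else untouched.
    """
    return xml_text.replace(
        "<use-nanny>false</use-nanny>",
        "<use-nanny>true</use-nanny>",
    )
```

After writing the result back to the file, the workaround still requires restarting the wicked services as described in step 3.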
34. Table 28: ISCSI over IPoIB Known Issues
    Table 29: Storage Known Issues
    Table 30: SRP Known Issues
    Table 31: SRP Interop Known Issues
    Table 32: DDN Storage Fusion 10000 Target Known Issues
    Table 33: Oracle Sun ZFS Storage 7420 Known Issues
    Table 34: iSER Initiator Known Issues
    Table 35: iSER Target Known Issues
    Table 36: ZFS Appliance Known Issues
    Table 37: SR-IOV Known Issues
    Table 38: Resiliency Known Issues
    Table 39: General Known Issues
    Table 40: ABI Compatibility Known Issues
    Table 41: Connection Manager (CM) Known Issues
    Table 42: Fork Support Known Issues
    Table 43: MLNX_OFED Sources Known Issues
    Table 44: Uplinks Known Issues
    Table 45: Resources Limitation Known Issues
    Table 46: Accelerated Verbs Known Issues
    Table 47: Performance Tools Known Issues
    Table 48: Diagnostic Utilities Known Issues
    Table 49: Tools
35. ad Known Issues

    1. "openibd stop" can sometime fail with the error:
       Unloading ib_cm [FAILED]
       ERROR: Module ib_cm is in use by ib_ipoib
       Workaround: Re-run "openibd stop".

    3.1.3 Driver Start Known Issues

    Table 12: Driver Start Known Issues

    1. "Out of memory" issues may rise during drivers load depending on the values of the driver module parameters set (e.g. log_num_cq).
    2. When reloading/starting the driver using the /etc/init.d/openibd, the following messages are displayed if there is a third party RPM or driver installed:
       "Module mlx4_core does not belong to MLNX_OFED"
       or
       "Module mlx4_core belong to <rpm name> which is not a part of MLNX_OFED"
       Workaround: Remove the third party RPM/non MLNX_OFED drivers directory, run "depmod" and then rerun "/etc/init.d/openibd restart".
    3. Occasionally, when trying to repetitively reload the NES hardware driver on SLES11 SP2, a soft lockup occurs that requires a reboot.
    4. When downgrading from MLNX_OFED 3.0-x.x.x, driver reload might fail with the following errors in dmesg:
       [166271.886407] compat: exports duplicate symbol __ethtool_get_settings (owned by mlx_compat)
       Workaround: The issue will be resolved automatically after system reboot or by invoking the following commands:
       # rmmod mlx_compat
       # depmod -a
       # /etc/init.d/openibd restart
36. at net/core/dev.c:1707!
    invalid opcode: 0000 [#1] SMP
    RIP: 0010:[<ffffffff81448988>] skb_checksum_help+0x148/0x160
    Call Trace:
    <IRQ>
    [<ffffffff81448d83>] dev_hard_start_xmit+0x3e3/0x530
    [<ffffffff8144c805>] dev_queue_xmit+0x205/0x550
    [fffff8145247d] neigh_connected_output+0xbd/0x1

    13. When InfiniBand ports are removed from the host (e.g. when changing port type from IB to Eth or removing a card from the PCI bus), the remaining IPoIB interface might be renamed.
        Workaround: To avoid it and have persistent IPoIB network devices names for ConnectX ports, add to the /etc/udev/rules.d/70-persistent-net.rules file:
        SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="*<Port GID>", NAME="ibN"
        Where N is the IPoIB required interface index.
    14. After releasing a bond interface that contains IPoIB slaves, a call trace might be printed into the dmesg.
    15. IPoIB interfaces are loaded without an IP address on SLES 12.
        Workaround:
        1. Open the "/etc/wicked/common.xml" file.
        2. Change
           <use-nanny>false</use-nanny>
           to
           <use-nanny>true</use-nanny>
        3. Run:
           # systemctl restart wickedd.service wicked
           # ifup all
    16. In RHEL7.0, running ifdown then ifup on an interface after changing CONN
        Workaround: Reload the driver: /etc/init.d/openibd
37. ce Counters Known Issues

    Table 22: Ethernet Performance Counters Known Issues

    1. In ConnectX-3, in a system with more than 61 VFs, the 62nd VF and onwards is assigned with the SINKQP counter, and as a result will have no statistics, and loopback prevention functionality for SINK counter.
    2. In ConnectX-3, since each VF tries to allocate 2 more QP counter for its RoCE traffic statistics, in a system with less than 61 VFs, if there is free resources it receives new counter, otherwise it receives the default counter which is shared with Ethernet. In this case RoCE statistics is not available.
    3. In ConnectX-3, when we enable function-based loopback prevention for Ethernet port by default (i.e., based on the QP counter index), the dropped self-loopback packets increase the IfRxErrorFrames/Octets counters.

    3.5 InfiniBand Network

    3.5.1 IPoIB Known Issues

    Table 23: IPoIB Known Issues

    1. When user increases receive/send a buffer, it might consume all the memory when few child's interfaces are created.
    2. The size of send queue in Connect-IB cards cannot exceed 1K.
    3. In 32 bit devices, the maximum number of child interfaces that can be created is 16. Creating more than that might cause out-of-memory issues.
38. ctive operations acceleration library over InfiniBand

    - KNEM, Linux kernel module enabling high-performance intra-node MPI/PGAS communication for large messages
    - Extra packages:
      - ibutils2
      - ibdump
      - MFT
    - Sources of all software modules (under conditions mentioned in the modules' LICENSE files) except for MFT, OpenSM plugins, ibutils2, and ibdump
    - HCAs:
      - ConnectX-3 EN driver Rev 3.1-1.0.3
      - ConnectX-4 EN driver Rev 3.1-1.0.3
    - Documentation

    1.2 Supported Platforms and Operating Systems

    The following are the supported OSs in MLNX_OFED Rev 3.1-1.0.3:

    Table 4: Supported Platforms and Operating Systems

    Operating System    Platform
    RHEL/CentOS 6.5     x86_64
    RHEL/CentOS 6.6     x86_64
    RHEL/CentOS 6.7     x86_64, PPC64 (Power 7)
    RHEL/CentOS 7.0     x86_64, PPC64 (Power 7)
    RHEL/CentOS 7.1     x86_64, PPC64 (Power 7), PPC64le (Power 8), ARM64 (ARM is at beta level)
    RHEL/CentOS 7.2     x86_64
    SLES11 SP1          x86_64
    SLES11 SP2          x86_64
    SLES11 SP3          x86_64, PPC64 (Power 7)
    SLES11 SP4          x86_64, PPC64 (Power 7)
    SLES12              x86_64, PPC64le (Power 8)
    OEL 6.3             x86_64
    OEL 6.4             x86_64
    OEL 6.5             x86_64
    OEL 6.6             x86_64
    OEL 6.7             x86_64
    OEL 7.0             x86_64
    OEL 7.1             x86_64
    Fedora 19           x86_64, PPC64 (Power 7)
    Fedora 20           x86_64
    Fedora 21           x86_64, PPC64 (Power 7)
    Ubuntu 12.04        x86_64
39. Table 23: IPoIB Known Issues (continued)

    4. In RHEL7.0, the Network-Manager can detect when the carrier of one of the IPoIB interfaces is OFF and can decide to disable its IP address.
       Workaround: Set "ignore-carrier" for the corresponding device in NetworkManager.conf. For further information, please refer to "man NetworkManager.conf".
    5. IPoIB interface does not function properly if a third party application changes the PKey table. We recommend modifying PKey tables via OpenSM.
    6. Fallback to the primary slave of an IPoIB bond does not work with ARP monitoring. (https://bugs.openfabrics.org/show_bug.cgi?id=1990)
    7. Out-of-memory issue might occur due to overload of interfaces created.
       Workaround: To calculate the allowed memory per each IPoIB interface, check the following:
       - Num-rings = min(num-cores-on-that-device, 16)
       - Ring-size = 512 (by default, it is module parameter)
       - UD memory: 2 * num-rings * ring-size * 8K
       - CM memory: ring-size * 64k
       - Total memory = UD mem + CM mem
    8. Connect-IB does not reach the bidirectional line rate.
       Workaround: Optimize the IPoIB performance in Connect-IB:
       cat /sys/class/net/<interface>/device/local_cpus > /sys/class/net/<interface>/queues/rx-0/rps_cpus
    9. If the CONNECTED_MODE parameter is set to "no" or missing from the ifcfg file for Connect-IB IPoIB interface, then the "service network restart" will hang.
       Workaround: Set the CONNECTED_MODE=yes parameter in the ifcfg file for Connect-IB interface.
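The per-interface memory formula in the out-of-memory workaround can be evaluated directly. The following Python sketch assumes that "8K" and "64k" mean 8 KiB and 64 KiB and that the results are in bytes; the function name `ipoib_interface_memory` is hypothetical.

```python
KIB = 1024  # assumption: "K"/"k" in the formula denotes 1024 bytes


def ipoib_interface_memory(num_cores: int, ring_size: int = 512) -> tuple[int, int, int]:
    """Return (UD memory, CM memory, total memory) in bytes for one IPoIB interface.

    Follows the formula in the workaround: the number of rings is capped
    at 16, and ring_size defaults to 512 (a module parameter).
    """
    num_rings = min(num_cores, 16)
    ud_memory = 2 * num_rings * ring_size * 8 * KIB
    cm_memory = ring_size * 64 * KIB
    return ud_memory, cm_memory, ud_memory + cm_memory
```

Under these assumptions, a device with 8 cores and the default ring size needs 64 MiB for UD plus 32 MiB for CM, that is 96 MiB per interface; with 16 or more cores the UD share doubles to 128 MiB.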
40. e Versions
    Table 7: MLNX_OFED Rev 3.1-1.0.3 Compatibility Matrix
    Table 8: RoCE Modes Matrix
    Table 9: Changes in v3.1-1.0.3
    Table 10: Installation Known Issues
    Table 11: Driver Unload Known Issues
    Table 12: Driver Start Known Issues
    Table 13: System Time Known Issues
    Table 14: UEFI Secure Boot Known Issues
    Table 15: Performance Known Issues
    Table 16: ConnectX-3 (mlx4 Driver) Known Issues
    Table 17: ConnectX-4 (mlx5 Driver) Known Issues
    Table 18: Ethernet Known Issues
    Table 19: Port Type Management Known Issues
    Table 20: Flow Steering Known Issues
    Table 21: Quality of Service Known Issues
    Table 22: Ethernet Performance Counters Known Issues
    Table 23: IPoIB Known Issues
    Table 24: eIPoIB Known Issues
    Table 25: XRC Known Issues
    Table 26: Verbs Known Issues
    Table 27: RoCE Known Issues
41. e freeing EQ memory

    Table 50: Fixed Bugs List (continued)

    60. mlnx.conf: Defined mlnx.conf as a configuration file in mlnx-ofa_kernel RPM. (Discovered in 2.1-1.0.0, fixed in 2.3-2.0.0)
    61. SR-IOV: Fixed counter index allocation for VFs which enables Ethernet port statistics. (Discovered in 2.3-1.0.1, fixed in 2.3-2.0.0)
    62. iSER: Fixed iSER DIX sporadic false DIF errors caused in large transfers when block merges were enabled. (Discovered in 2.3-1.0.1, fixed in 2.3-2.0.0)
    63. RoCE v2: RoCE v2 was non-functional on big Endian machines. (Discovered in 2.3-1.0.1, fixed in 2.3-2.0.0)
    64. Verbs: Fixed registration memory failure when fork was enabled and contiguous pages or ODP were used. (Discovered in 2.3-1.0.1, fixed in 2.3-2.0.0)
    65. Installation: Using both '-c|--config' and '--add-kernel-support' flags simultaneously when running the mlnxofedinstall.sh script caused installation failure with the following on-screen message: "--config does not exist". (Discovered in 2.2-1.0.1, fixed in 2.3-2.0.0)
    66. IPoIB: Changing the GUID of a specific SR-IOV guest after the driver has been started causes the ping to fail. Hence, no traffic can go over that InfiniBand interface. (Discovered in 2.1-1.0.0, fixed in 2.3-1.0.1)
    67. XRC: XRC over ROCE in SR-IOV mode is not functional. (Discovered in 2.0-3.1.0, fixed in 2.2-1.0.1)
    68. mlx4_en: Fixed wrong calculation of packet true-size reporting in LRO flow. (Discovered in 2.1-1.0.0, fixed in 2.2-1.0.1)
    69. Fixed kernel panic on Debian 6.0.7
42. ect port rate and port speed values in RoCE mode in ConnectX-4. (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)

    11. IPoIB: In RedHat7.1 kernel 3.10.0-299, when sending ICMP/TCP/UDP traffic over Connect-IB/ConnectX-4 in UD mode, the packets were dropped with the following error: "UDP: bad checksum". (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)
    12. openibd: Fixed an issue which prevented openibd from starting correctly during boot. (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)
    13. Ethernet: Added a new module parameter to control the number of IRQs allocated to the device. (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)
    14. mlx5 driver: Fixed an issue on PPC servers which prevented PCI from reloading after EEH error recovery. (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)
    15. OpenSM: Fixed an issue which prevented the OpenSM pack (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)

    Table 50: Fixed Bugs List (continued)

    16. mlx5_en: Added the option to toggle LRO ON/OFF using the "-K" flags. The priv flag "hw_lro" determines the type of LRO to be used: if the flag is ON, the hardware LRO is used; otherwise, the software LRO is used. (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)
    17. Added the option to toggle LRO ON/OFF using the "-K" flags. (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)
    18. Fixed race when updating counters. (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)
    19. Fixed scheduling while sending atomic dmesg warning during bonding configuration. (Discovered in 3.0-2.0.1, fixed in 3.1-1.0.0)
    20. Added set_rx_csum callback implementation. (Discovered in 3.0-2.0.1, fixed in 3.1-1.0
43. ental verbs:
    - ibv_exp_arm_dct
    - ibv_exp_query_port
    - ibv_exp_create_flow
    - ibv_exp_destroy_flow
    - ibv_exp_post_send
    - ibv_exp_reg_mr
    - ibv_exp_get_provider_func

    Table 52: API Change Log History (continued)

    Rev 2.1-1.0.0

    Dynamically Connected (DC): The following verbs were added:
    - struct ibv_dct *ibv_exp_create_dct(struct ibv_context *context, struct ibv_exp_dct_init_attr *attr)
    - int ibv_exp_destroy_dct(struct ibv_dct *dct)
    - int ibv_exp_query_dct(struct ibv_dct *dct, struct ibv_exp_dct_attr *attr)

    Verbs Extension API: Verbs extension API defines OFA APIs extension scheme to detect ABI compatibility and enable backward and forward compatibility support.
    - ibv_post_task
    - ibv_query_values_ex
    - ibv_query_device_ex
    - ibv_create_flow
    - ibv_destroy_flow
    - ibv_poll_cq_ex
    - ibv_reg_shared_mr_ex
    - ibv_open_xrcd
    - ibv_close_xrcd
    - ibv_modify_cq
    - ibv_create_srq_ex
    - ibv_get_srq_num
    - ibv_create_qp_ex
    - ibv_create_cq_ex
    - ibv_open_qp
    - ibv_modify_qp_ex

    Rev 2.1-1.0.0

    Verbs Experimental API: Verbs experimental API defines MLNX_OFED APIs extension scheme which is similar to the "Verbs extension API". This extension provides a way to introduce new features before they are integrated into the formal OFA API and to the upstream kernel and libs.
    - ibv_exp_create_dct
    - ibv_exp_destroy_dct
    - ibv_exp_query_dct

    Rev 2.0-3.0.0

    XRC
44. ess

    10. Using IPv6 link-local address (GID0) when VLANs are configured is not supported.
    11. Using GID index 0 (the default GID) on port 2 is currently not supported on kernel 3.14 and below.
    12. Dynamically Connected (DC) in RoCE in ConnectX-4 is currently not supported.
    13. Enslaving a Mellanox device to a bond with already configured IPs (or configured upper devices) prevents these IPs from being configured as GIDs.
        Workaround:
        1. Enslave the Mellanox device.
        2. Configure IP devices.
    14. ibv_create_ah_from_wc is not supported for multicast messages.
    15. InfiniBand error counters that are found under /sys/class/infiniband/<mlx5_dev>/ports/<port>/ do not function properly in ConnectX-4 adapter cards.

    3.5.6 ISCSI over IPoIB Known Issues

    Table 28: ISCSI over IPoIB Known Issues

    1. When working with ISCSI over IPoIB, LRO must be disabled (even if IPoIB is set to connected mode) due to a bug in older kernels which causes a kernel panic.

    3.6 Storage Protocols Known Issues

    3.6.1 Storage Known Issues

    Table 29: Storage Known Issues

    1. Older versions of rescan-scsi-bus.sh may not recognize some newly created LUNs.
       Workaround: If encountering such issues, it is recommended to use the '-c' flag.

    3.6.2 SRP Known Issues

    Table 30: SR
45. imental API ibv_exp_post_send the following opcodes:
    - IBV_EXP_WR_EXT_MASKED_ATOMIC_CMP_AND_SWP
    - IBV_EXP_WR_EXT_MASKED_ATOMIC_FETCH_AND_ADD
    - IBV_EXP_WR_NOP
    and these completion opcodes:
    - IBV_EXP_WC_MASKED_COMP_SWAP
    - IBV_EXP_WC_MASKED_FETCH_ADD

    Table 52: API Change Log History (continued)

    Rev 2.2-1.0.1

    libibverbs: The following verbs changed to align with upstream libibverbs:
    - ibv_reg_mr: ibv_access_flags changed.
    - ibv_post_send: opcodes and send flags changed and wr fields removed (task, op, dc and bind_mw)
    - ibv_query_device: capability flags changed.
    - ibv_poll_cq: opcodes and wc flags changed.
    - ibv_modify_qp: mask bits changed
    - ibv_create_qp_ex: create_flags field removed.

    The following verbs removed to align with upstream libibverbs:
    - ibv_bind_mw
    - ibv_post_task
    - ibv_query_values_ex
    - ibv_query_device_ex
    - ibv_poll_cq_ex
    - ibv_reg_shared_mr_ex
    - ibv_reg_shared_mr
    - ibv_modify_cq
    - ibv_create_cq_ex
    - ibv_modify_qp_ex

    Rev 2.2-1.0.1

    Verbs Experimental API: The following experimental verbs added (replacing the removed extended verbs):
    - ibv_exp_bind_mw
    - ibv_exp_post_task
    - ibv_exp_query_values
    - ibv_exp_query_device
    - ibv_exp_poll_cq
    - ibv_exp_reg_shared_mr
    - ibv_exp_modify_cq
    - ibv_exp_create_cq
    - ibv_exp_modify_qp

    New experim
46. ing Known Issues
    3.4.4 Quality of Service Known Issues
    3.4.5 Ethernet Performance Counters Known Issues
    3.5 InfiniBand Network
    3.5.1 IPoIB Known Issues
    3.5.2 eIPoIB Known Issues
    3.5.3 XRC Known Issues
    3.5.4 Verbs Known Issues
    3.5.5 RoCE Known Issues
    3.5.6 ISCSI over IPoIB Known Issues
    3.6 Storage Protocols Known Issues
    3.6.1 Storage Known Issues
    3.6.2 SRP Known Issues
    3.6.3 SRP Interop Known Issues
    3.6.4 DDN Storage Fusion 10000 Target Known Issues
    3.6.5 Oracle Sun ZFS Storage 7420 Known Issues
    3.6.6 iSER Initiator Known Issues
    3.6.7 iSER Target Known Issues
    3.6.8 ZFS Appliance Known Issues
    3.7 Virtualization
    3.7.1 SR-IOV Known Issues
    3.8 Resiliency
    3.8.1 Reset Flow Known Issues
    3.9 Miscellaneous Known Issues
47. ing down the driver from a VM and reattaching it to another VM with the same IP address for the Mellanox NIC, RoCE connections will fail.
    Workaround: Shut down the driver in the VM before detaching the VF.

    - Enabling SR-IOV requires appending the "intel_iommu=on" option to the relevant OS in file /boot/grub/grub.conf. Without that, SR-IOV cannot be loaded.
    - On various combinations of Hypervisor/OSes and Guest/OSes, an issue might occur when attaching/detaching VFs to a guest while that guest is up and running.
      Workaround: Attach/detach VFs to/from a VM only while that VM is down.
    - The known PCI BDFs for all VFs in kernel command line should be specified by adding xen-pciback.hide. For further information, please refer to http://wiki.xen.org/wiki/Xen_PCI_Passthrough
    - The inbox qemu version (2.0) provided with Ubuntu 14.04 does not work properly when more than 2 VMs are run over an Ubuntu 14.04 Hypervisor.

    Table 37: SR-IOV Known Issues (continued)

    11. SR-IOV UD QPs are forced by the Hypervisor to use the base GID (i.e., the GID that the VF sees in its GID entry at its paravirtualized index 0). This is needed for security, since UD QPs use Address Vectors, and any GID index may be placed in such a vector, including indices not belonging to that VF.
    12. Attempting to attach a PF to a VM when SR-IOV is already enabled o
48. ions (Discovered in 2.0-2.0.5, fixed in 2.0-3.0.0)

    84. Fixed an issue of VLAN traffic over Virtual Machine in paravirtualized mode. (Discovered in 2.0-2.0.5, fixed in 2.0-3.0.0)
    85. Fixed ethtool operation crash while interface down. (Discovered in 2.0-2.0.5, fixed in 2.0-3.0.0)
    86. IPoIB: Fixed memory leak in Connected mode. (Discovered in 2.0-2.0.5, fixed in 2.0-3.0.0)
    87. Fixed an issue causing IPoIB to avoid pkey value 0 for child interfaces. (Discovered in 2.0-2.0.5, fixed in 2.0-3.0.0)

    5 Change Log History

    Table 51: Change Log History

    (ConnectX-3/ConnectX-3 Pro only)

    Rev 3.1-1.0.0

    - Wake-on-LAN (WOL): Wake-on-LAN (WOL) is a technology that allows a network professional to remotely power on a computer or to wake it up from sleep mode.
    - Hardware Accelerated 802.1ad VLAN (Q-in-Q Tunneling): Q-in-Q tunneling allows the user to create a Layer 2 Ethernet connection between two servers. The user can segregate a different VLAN traffic on a link or bundle different VLANs into a single VLAN.
    - ConnectX-4 ECN: ECN in ConnectX-4 enables end-to-end congestion notifications between two end-points when a congestion occurs, and works over Layer 3.
    - RSS Verbs Support for ConnectX-4 HCAs: Receive Side Scaling (RSS) technology allows spreading incoming traffic between different receive descriptor queues. Assigning each queue to different CPU cores allows better load balancing of the incoming traffic and improve perf
49. iver

    - Ethernet net device: New adaptive interrupt moderation scheme to improve CPU utilization. RSS support of fragmented IP datagram.
    - Connect-IB Virtual Function: Added Connect-IB Virtual Function to the list of supported devices.

    Rev 2.3-2.0.5

    - mlx5_core: Added the following files under /sys/class/infiniband/mlx5_0/mr_cache/:
      - rel_timeout: Defines the minimum allowed time between the last MR creation to the first MR released from the cache. When rel_timeout = -1, MRs are not released from the cache.
      - rel_imm: Triggers the immediate release of excess MRs from the cache when set to 1. When all excess MRs are released from the cache, rel_imm is reset back to 0.
    - Bug Fixes: See "Bug Fixes History".

    Rev 2.3-2.0.1

    - Bug Fixes: See "Bug Fixes History".

    Table 51: Change Log History (continued)

    Rev 2.3-2.0.0

    - Connect-IB: Added Suspend to RAM (S3).
    - Reset Flow: Added Enhanced Error Handling for PCI (EEH), a recovery strategy for I/O errors that occur on the PCI bus.
    - Register Contiguous Pages: Added the option to ask for a specific address when the register memory is using contiguous page.
    - mlx5_core: Moved the mr_cache subtree from debugfs to mlx5_ib while preserving all its semantics.
    - InfiniBand Utilities: Updated the ibutils package. Added to the ibdiagnet tool the "ibdiagnet2.mlnx_cntrs" option to enable reading of Mellanox diagnostic c
50. lias GUID support behavior in InfiniBand
    nectX-3 cards

    - LLR max retransmission rate: Added LLR max retransmission rate as specified in Vendor Specific MAD V1.1, Table 110, "PortLLRStatistics MAD Description". ibdiagnet presents the LLR max_retransmission_rate counter as part of the PM_INFO in db_csv file.
    - Experimental Verbs: Added the following verbs:
      - ibv_exp_create_res_domain
      - ibv_exp_destroy_res_domain
      - ibv_exp_query_intf
      - ibv_exp_release_intf
      Added the following interface families:
      - ibv_exp_qp_burst_family
      - ibv_exp_cq_family

    Rev 2.4-1.0.4

    - Bug Fixes: See "Bug Fixes History".

    Table 51: Change Log History (continued)

    Rev 2.4-1.0.0

    - mlx4_en net-device Ethtool:
      - Added support for Ethtool speed control and advertised link mode.
      - Added ethtool txvlan control for setting ON/OFF hardware TX VLAN insertion: ethtool -k txvlan [on/off]
      - Ethtool report on port parameters improvements.
      - Ethernet TX packet rate improvements.
    - RoCE: RoCE now uses all available EQs and not only the 3 legacy EQs.
    - InfiniBand: IRQ affinity hints are now set when working in InfiniBand mode.
    - Virtualization: VXLAN fixes and performance improvements.
    - libmlx4 & libmlx5: Improved message rate of short messages.
    - libmlx5: Added ConnectX-4 device (4114) to the list of supported devices (hca_table).
    - Storage: Added iSER Target dr
51. ly the following kernel configuration option CONFIG_MLX4_EN_VXLAN=y must be enabled.

    - MLNX_OFED no longer changes the OS sysctl TCP parameters.
    - Added Explicit Congestion Notification (ECN) support.
    - Added Flow Steering: A0 simplified steering support.
    - Added RoCE v2 support.

    Table 51: Change Log History (continued)

    Rev 2.3-1.0.1 (cont.)

    - InfiniBand Network:
      - Added Secure host to enable the device to protect itself and the subnet from malicious software.
      - Added User-Mode Memory Registration (UMR) to enable the usage of RDMA operations and to scatter the data at the remote side through the definition of appropriate memory keys on the remote side.
      - Added On-Demand Paging (ODP), a technique to alleviate much of the shortcomings of memory registration.
      - Added Masked Atomics operation support.
      - Added Checksum offload for packets without L4 header support.
      - Added Memory re-registration to allow the user to change attributes of the memory region.
    - Resiliency: Added Reset Flow for ConnectX-3 (+SR-IOV support).
    - SR-IOV: Added Virtual Guest Tagging (VGT+), an advanced mode of Virtual Guest Tagging (VGT), in which a VF is allowed to tag its own packets as in VGT, but is still subject to an administrative VLAN trunk policy.
    - Ethtool:
      - Added Cable EEPROM reporting support.
      - Disable/Enable ethernet RX VLA
52. n that PF may result in a kernel panic.
    13. osmtest on the Hypervisor fails when SR-IOV is enabled. However, only the test fails; OpenSM will operate correctly with the host. The failure reason is that if an mcg is already joined by the host, a subsequent join request for that group succeeds automatically, even if the join parameters in the request are not correct. This success does no harm.
    14. If a VM does not support PCI hot plug, detaching an mlx4 VF and probing it to the hypervisor may cause the hypervisor to crash.
    15. QPerftest is not supported on SR-IOV guests in Connect-IB cards.
    16. On ConnectX-3 HCAs with firmware version 2.32.5000 and later, SR-IOV VPI mode works only with Port 1 = ETH and Port 2 = IB.
    17. Occasionally, the lspci | grep Mellanox command shows incorrect or partial information due to the current pci.ids file on the machine. Workaround: (1) Locate the file: locate pci.ids. (2) Manually update the file according to the latest version available online at https://pci-ids.ucw.cz/v2.2/pci.ids. This file can also be downloaded.
    18. SR-IOV is not supported in AMD architecture.
    19. Setting 1 Mbit/s rate limit on Virtual Functions (QoS Per VF feature) may cause TX queue transmit timeout.
    20. DC transport type is not supported on SR-IOV VMs.
    21. Attaching a VF to a VM before unbinding it from the hypervisor and then attempting to destroy the VM may cause the system to hang for a few minutes.
53. nly x86_64 architecture.
    b. SR-IOV, Ethernet Time Stamping and Flow Steering are ConnectX-3 HCA capability.
    6 API Change Log History
    Table 52: API Change Log History (Release / Name / Description)
    Rev 3.1-1.0.3, libibverbs:
        • Added ibv_exp_wq_family interface family (supported only by ConnectX-4).
        • Added flag to the QP-burst family to enable Multi-Packet WR.
        • Added return error statuses to ibv_exp_query_intf to notify that common-flags/family-flags are not supported.
        • Added ibv_exp_query_gid_attr verb. For further information, please refer to the manual page of the verb.
    Rev 3.0-1.0.0, libibverbs:
        Added the following new APIs:
        • ibv_exp_create_res_domain: create resource domain
        • ibv_exp_destroy_res_domain: destroy resource domain
        • ibv_exp_query_intf: query for family of verbs interface for specific QP/CQ
        • ibv_exp_release_intf: release the queried interface
        Updated the following APIs:
        • ibv_exp_create_qp: Add resource domain to the verb parameters
        • ibv_exp_create_cq: Add resource domain to the verb parameters
    Rev 2.4-1.0.0, libibverbs:
        Added the following verbs interfaces:
        • ibv_create_flow
        • ibv_destroy_flow
        • ibv_exp_use_priv_env
        • ibv_exp_setenv
    Rev 2.3-1.0.1, libibverbs:
        • ibv_exp_rereg_mr: Added new API for memory region re-registration. For further information, please refer to the MLNX_OFED User Manual.
        • Added to the exper
54. nts when IOMMU is enabled Added Port level QoS support IPoIB Reduced memory consumption Limited the number TX and RX queues to 16 Default IPoIB mode is set to work in Datagram except for Con nect IBTM adapter card which uses IPoIB with Connected mode as default Mellanox Technologies 49 J Table 51 Change Log History Rev 3 1 1 0 3 Change Log History Release Category Description Rev 2 0 3 0 0 Storage SER at GA level Rev 2 0 2 0 55 Virtualization SR IOV for both Ethernet and InfiniBand at Beta level Ethernet Network RoCE over SR IOV at Beta level eIPoIB to enable IPoIB in a Para Virtualized environment at Alpha level Ethernet Performance Enhancements NUMA related and others for 10G and 40G Ethernet Time Stamping at Beta level Flow Steering for Ethernet and InfiniBand at Beta level Raw Eth QPs e Checksum TX RX Flow Steering InfiniBand Net work Contiguous pages Internal memory allocation improvements Register shared memory e Control objects QPs CQs Installation YUM update support VMA OFED VMA integration to a single branch Storage ISER at Beta level and SRP Operating Systems Errata Kernel upgrade support API VERSION query API library and headers Counters 64bit wide counters port xmit recv data packets unicast mcast a SSA is tested on SLES 12 o
55. o always unload it: Create an executable file /etc/infiniband/pre-stop-hook.sh with the following content:
        #!/bin/bash
        modprobe -r xprtrdma
    9. When loading or unloading the driver on HP ProLiant systems, you may see log messages like:
        dmar: DMAR:[DMA Write] Request device [07:00.0] fault addr 3df7f000
        DMAR:[fault reason 05] PTE Write access is not set
       This is a known issue with ProLiant systems; see their support notice for Emulex adapters: http://h20564.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c04446026&lang=en-us&cc=us. The messages are harmless and may be ignored.
       Workaround: If you are not running SR-IOV on your system, you may eliminate these messages by removing the term intel_iommu=on from the boot line in file /boot/grub/menu.lst. For SR-IOV systems, this term must remain; you can ignore the log messages.
    3.1.4 System Time Known Issues
    Table 13: System Time Known Issues (Index / Description / Workaround)
    1. Loading the driver using the openibd script when no InfiniBand vendor module is selected (for example mlx4_ib) may cause the execution of the /sbin/start_udev script. In RedHat 6.x and OEL6.x this may change the local system time.
    3.1.5 UEFI Secure Boot Known Issues
    Table 14: UEFI Secure Boot Known Issues (Index / Description / Workaround)
    1. On RHEL7 and SLES12, the following error is dis
56. ommunication problems.
    4. MLNX_OFED v2.1-1.0.0 and onwards is not interoperable with older versions of MLNX_OFED.
    5. Since the number of GIDs per port is limited to 128, there cannot be more than the allowed IP addresses configured to Ethernet devices that are associated with the port. Allowed number is:
        • 127 for a single function machine
        • 15 for a hypervisor in a multifunction machine
        • (127-15)/n for a guest in a multifunction machine (where n is the number of virtual functions)
       Note also that each IP address occupies 2 entries when RoCE mode is set to 4 (RoCE v1 + RoCE v2). This further reduces the number of allowed IP addresses.
    6. A working IP connectivity between the RoCE devices is required when creating an address handle or modifying a QP with an address vector.
    7. IPv4 multicast over RoCE requires the MGID format to be as follows: ::ffff:<Multicast IPv4 Address>
    Table 27: RoCE Known Issues (continued) (Index / Description / Workaround)
    8. IP RoCEv2 does not support Multicast Listener Discovery (MLD), therefore multicast traffic over IPv6 may not work as expected.
    9. Using GID index 0 (the default GID) is possible only if the matching IPv6 link-local address is configured on the net device of the port. This behavior is possible even though the default GID is configured regardless the presence of the IPv6 addr
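The GID-table limits in RoCE known issue 5 above amount to a small piece of arithmetic, sketched here in Python. This is an illustration only: the function name and role labels are hypothetical, the guest limit assumes the flattened "127 15 n" in the source reads as (127 - 15) / n, and rounding down to a whole number of addresses is a further assumption.

```python
def max_ip_addresses(role, num_vfs=0, dual_mode=False):
    """Estimate how many IP addresses fit in the per-port GID table.

    role      -- "single", "hypervisor" or "guest" (hypothetical labels)
    num_vfs   -- number of virtual functions (only used for "guest")
    dual_mode -- True when RoCE mode is 4 (RoCE v1 + RoCE v2), where each
                 IP address occupies 2 GID entries
    """
    if role == "single":
        entries = 127
    elif role == "hypervisor":
        entries = 15
    elif role == "guest":
        if num_vfs < 1:
            raise ValueError("a guest needs at least one virtual function")
        # ASSUMPTION: the source formula is (127 - 15) / n, rounded down.
        entries = (127 - 15) // num_vfs
    else:
        raise ValueError(f"unknown role: {role!r}")

    entries_per_ip = 2 if dual_mode else 1
    return entries // entries_per_ip


if __name__ == "__main__":
    for role, vfs in (("single", 0), ("hypervisor", 0), ("guest", 8)):
        print(role, max_ip_addresses(role, vfs),
              max_ip_addresses(role, vfs, dual_mode=True))
```

The point of the sketch is that running both RoCE versions halves the usable address count, so a hypervisor or guest can reach its limit with only a handful of configured IP addresses.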
57. on in the tc tool.
    18. When NC-SI is ON, the port's MTU cannot be set to lower than 1500.
    19. GRO is not functional when using VXLAN in ConnectX-3 adapter cards.
    20. ethtool -X: The driver supports only the 'equal' mode and cannot be set by using weight flags.
    21. Q-in-Q infrastructure in the kernel is supported only from kernel version 3.10 and up.
    Table 18: Ethernet Known Issues (continued) (Index / Description / Workaround)
    22. When SLES11 SP4 is used as a DHCP client over ConnectX-3 or ConnectX-3 adapters, it might fail to get an IP from the DHCP server.
    3.4.2 Port Type Management Known Issues
    Table 19: Port Type Management Known Issues (Index / Description / Workaround)
    1. OpenSM must be stopped prior to changing the port protocol from InfiniBand to Ethernet.
    2. After changing port type using connectx_port_config, interface ports names can be changed. For example: ib1 -> ib0 if port changed to be Ethernet port and port2 left IB. Workaround: Use udev rules for persistent naming configuration. For further information, please refer to the User Manual.
    3. A working IP connectivity between the RoCE devices is required when creating an address handle or modifying a QP with an address vector.
    4. IPv4 multicast over RoCE requires the MGID format to be as follows: ::ffff:<Multicast IPv4 Address>
    5. IP routable
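Item 4 of the Port Type Management table above gives the MGID format required for IPv4 multicast over RoCE. Assuming that format is the IPv4-mapped form ::ffff:<Multicast IPv4 Address>, the following Python sketch builds the 128-bit value with the standard ipaddress module. The function name is hypothetical and the snippet does not touch any MLNX_OFED interface.

```python
import ipaddress


def ipv4_multicast_to_mgid(addr):
    """Return the MGID for an IPv4 multicast address as an IPv6Address.

    Layout assumed: 80 zero bits, 16 one bits (0xffff), then the 32-bit
    IPv4 address, i.e. ::ffff:a.b.c.d.
    """
    ip = ipaddress.IPv4Address(addr)
    if not ip.is_multicast:
        raise ValueError(f"{addr} is not an IPv4 multicast address")
    return ipaddress.IPv6Address((0xFFFF << 32) | int(ip))


if __name__ == "__main__":
    mgid = ipv4_multicast_to_mgid("224.0.0.1")
    # The 16 raw bytes are what would be placed in a GID field.
    print(mgid.packed.hex())
```

Working from the integer value rather than string formatting keeps the result independent of how a given Python version prints IPv4-mapped addresses.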
58. or version 3 9 1 985 FabricIT EFM 185035 w w FabricIT EFM version 1 1 3000 FabricIT BXM MBX5020 w w FabricIT BXM version 2 1 2000 Unified Fabric Manager UFM v4 8 MXM v3 2 HPC X UPC v2 18 0 HPC X OpenSHMEM v1 8 3 FCA v2 5 and v3 1 HPC X MPI v1 8 3 MVAPICH v2 0 a MLNX OFED Rev 3 1 1 0 3 was tested with this switch however additional switches might be supported as well 1 6 RoCE Modes Matrix The following is RoCE modes matrix Table 8 RoCE Modes Matrix Software Stack RoCEv1 Layer 2 RoCEv2 Layer 3 RoCEv1 amp RoCEv2 Layer 3 Inbox Distribution Supported as of Version Supported as of Version Supported as of Version MLNX OFED 2 1 X X x 2 3 X X X 3 0 x x x Kernel org 3 14 RHEL 6 6 7 0 SLES 12 Ubuntu 14 04 10 Mellanox Technologies Rev 3 1 1 0 3 2 Changes and New Features in Rev 3 1 1 0 3 Table 9 Changes in v3 1 1 0 3 Category Description User Access Region UAR Allows the ConnectX 3 driver to operate on PPC machines without requir ing a change to the MMIO area size CQE Compression Saves PCIe bandwidth by compressing a few CQEs into a smaller amount of bytes on PCIe Bug fixes See Section 4 Bug Fixes History on page 36 For additional information on the new features please refer to the MLNX OFED User Manual Mellanox Technologies 11 J Rev 3 1 1 0 3 Known Issues 3 Known Issues The following is a
59. or does not respect interface parameter while logging in. Workaround: Configure each interface on a different subnet.
    4. iSCSID v2.0-873 can enter an endless loop on bind error.
    5. iSCSID may hang if target crashes during logout sequence (reproducible with TCP).
    6. SLES12: Logging in with PI disabled followed by a log out and re-log in with PI enabled without flushing multipath might cause the block layer to panic.
    7. Rarely, in InfiniBand devices, when a catastrophic error scenario occurs, iSCSI/iSER initiator might not fully recover and result in system hang.
    8. Ubuntu14.04: Stress login/logout might cause block layer to invoke a WARN trace.
    3.6.7 iSER Target Known Issues
    Table 35: iSER Target Known Issues (Index / Description / Workaround)
    1. Currently only the following OSs are supported: RHEL/CentOS 7.0, SLES12, Ubuntu14.04.
    2. Stress login/logout from multiple initiators may cause iSER target to panic.
    3. RHEL/CentOS 7.0: Discovery over RDMA is not supported.
    4. ib_isert is unavailable on custom kernels after running the mlnx_add_kernel_support.sh script. Workaround: (1) Add isert=y to the mlnx_add_kernel_support.sh script after "cat << EOF > ofed.conf". (2) Use the updated script to build MLNX_OFED for the custom kernel.
    3.6.8 ZFS Appliance Known Issues
    Table 36: ZFS Appliance Known Issues (Index / Description / Workaround)
    1. Connection establishment occurs twice
60. ored 2 XRC is not functional in heterogeneous clusters containing non Mellanox HCAs 3 XRC options do not work when using qperf tool Use perftest instead 4 Out of memory issue might occur due to overload of XRC receive QP with non zero receive queue size created XRC QPs do not have receive queues 24 Mellanox Technologies Rev 3 1 1 0 3 3 5 4 Verbs Known Issues Table 26 Verbs Known Issues Index Description Workaround l Using libnll 1 3 26 or earlier requires ibv cre ate ah protection by a lock for multi threaded appli cations 2 In MLNX OFED v2 4 1 0 0 if several CQEs are When getting an event poll the CQ until it is received on a CQ they will be coalesced and a user empty space event will be triggered only once 3 ibv task pingpong over ConnectX 2 adapter cards in not supported 3 5 5 RoCE Known Issues Table 27 RoCE Known Issues Index Description Workaround l Not configuring the Ethernet devices or independent Restart the driver VMs with a unique IP address in the physical port may result in RoCE GID table corruption 2 If RDMA CM is not used for connection manage ment then the source and destination GIDs used to modify a QP or create AH should be of the same type IPv4 or IPv6 3 On rare occasions the driver reports a wrong GID table read from sys class infiniband mlx4 ports gids This may cause c
61. ormance Minimal Band The amount of bandwidth BW left on the wire may be split width Guarantee among other TCs according to a minimal guarantee policy ETS SR IOV Ethernet SR IOV Ethernet at Beta level 3 0 2 0 1 Virtualization Added support for SR IOV for ConnectX 4 Connect IB adapter cards 3 0 1 0 1 HCAs Added support for ConnectX 4 Single Dual Port Adapter sup porting up to 100Gb s RoCE per GID RoCE per GID provides the ability to use different RoCE versions modes simultaneously RoCE Link Aggre RoCE Link Aggregation available in kernel 4 0 only gation RoCE provides failover and link aggregation capabilities for mlx4 device LAG physical ports In this mode only one IB port that represents the two physical ports is exposed to the application layer Resource Domain Experimental Verbs Resource domain is a verb object which may be associated with QP and or CQ objects on creation to enhance data path performance Alias GUID Sup port in InfiniBand Enables the query gid verb to return the admin desired value instead of the value that was approved by the SM to prevent a case where the SM is unreachable or a response is delayed or if the VF Is probed into a VM before their GUID is registered with the SM 42 Mellanox Technologies Rev 3 1 1 0 3 Table 51 Change Log History Release Category Description 3 0 1 0 1 Denial Of Service Denial Of Service
62. ounters
    Bug Fixes: See "Bug Fixes History" on page 36.
    Table 51: Change Log History (Release / Category / Description)
    Release 2.3-1.0.1
    OpenSM:
        • Added Routing Chains support with Minhop/UPDN/FTree/DOR/Torus-2QoS.
        • Added double failover elimination. When the Master SM is turned down for some reason, the Standby SM takes ownership over the fabric and remains the Master SM even when the old Master SM is brought up, to avoid any unnecessary re-registrations in the fabric. To enable this feature, set the master_sm_priority parameter to be greater than the sm_priority parameter in all SMs in the fabric. Once the Standby SM becomes the Master SM, its priority becomes equal to the master_sm_priority, so that additional SM handover is avoided. Default value of the master_sm_priority is 14. To disable this feature, set the master_sm_priority in opensm.conf to 0.
        • Added credit-loop free unicast/multicast updn/ftree routing.
        • Added multithreaded Minhop/UPDN/DOR routing.
    RoCE: Added IP routable RoCE modes. For further information, please refer to the MLNX_OFED User Manual.
    Installation: Added apt-get installation support.
    Ethernet: Added support for arbitrary UDP port for VXLAN. From upstream 3.15-rc1 and onward, it is possible to use arbitrary UDP port for VXLAN. This feature requires firmware version 2.32.5100 or higher. Additional
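The enable/disable rule for double failover elimination described above can be checked mechanically. The Python sketch below is a hedged illustration, not part of OpenSM: it assumes opensm.conf uses whitespace-separated "name value" lines with # comments, it reads "greater than the sm_priority parameter in all SMs" as "greater than the highest sm_priority in the fabric", and the helper names are hypothetical.

```python
DEFAULT_MASTER_SM_PRIORITY = 14  # default stated in the change log


def parse_priorities(conf_text):
    """Extract sm_priority / master_sm_priority from opensm.conf-style text."""
    found = {}
    for raw in conf_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, tokenize
        if len(fields) == 2 and fields[0] in ("sm_priority",
                                              "master_sm_priority"):
            found[fields[0]] = int(fields[1])
    return found


def failover_elimination_enabled(conf_texts):
    """True if every SM's master_sm_priority exceeds every sm_priority."""
    parsed = [parse_priorities(text) for text in conf_texts]
    if not parsed or any("sm_priority" not in p for p in parsed):
        raise ValueError("each SM configuration must set sm_priority")
    highest = max(p["sm_priority"] for p in parsed)
    return all(
        p.get("master_sm_priority", DEFAULT_MASTER_SM_PRIORITY) > highest
        for p in parsed
    )


if __name__ == "__main__":
    master = "sm_priority 3\nmaster_sm_priority 14\n"
    standby = "# standby SM\nsm_priority 5\n"
    print(failover_elimination_enabled([master, standby]))
```

Setting master_sm_priority to 0 in any of the configurations makes the check return False, which matches the documented way of disabling the feature.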
63. r during FIO splice in ker Use kernel v2 6 34 rc4 nels which provides the following solution before 2 6 34 rc4 baff42a net Fix oops from tcp collapse when using splice 3 In PPC systems when QoS is enabled a harmless Kernel DMA mapping error messages might appear in kernel log IOMMU related issue 4 Transmit timeout might occur on RH6 3 as a result of lost interrupt OS issue In this case the follow ing message will be shown in dmesg do IRO 0 203 No irq handler for vector irq 1 5 Mixing ETS and strict QoS policies for TCs in 40GbE ports may cause inaccurate results in band width division among TCs 6 Creating a VLAN with user priority gt 4 on ConnectX 2 HCA is not supported 7 Affinity hints are not supported in Xen Hypervisor To overcome this issues run an irqblancer issue This causes a non optimal IRQ set irq affinity sh eth lt x gt affinity 8 Reboot might hang in SR IOV when using the probe vf parameter with many Virtual Func tions The following message is logged in the kernel log waiting for eth to become free Usage count 1 9 In ConnectX amp 2 RoCE UD QP does not include VLAN tags in the Ethernet header Mellanox Technologies 17 J Rev 3 1 1 0 3 Known Issues Table 18 Ethernet Known Issues Continued Index Description Workaround 10 VXLAN may not be functional when configured over Linux bridge in RH7 0 or
64. rnet 10GigE 20GigE PCI Express 2 0 2 5 or 5 0 GT s PCI Express 3 0 8 GT s a 56 GbE is a Mellanox propriety link speed and can be achieved while connecting a Mellanox adapter cards to Mellanox SX10XX switch series or connecting a Mellanox adapter card to another Mellanox adapter card 1 1 Content of Mellanox OFED for Linux Mellanox OFED for Linux software contains the following components Table 3 Mellanox OFED for Linux Software Components Components Description OpenFabrics core and ULPs InfiniBand and Ethernet HCA drivers mlx4 mlx5 core Upper Layer Protocols IPoIB SRP SER and SER Initiator and Target OpenFabrics utilities e OpenSM IB Subnet Manager with Mellanox proprietary Adaptive Routing Diagnostic tools e Performance tests SSA SLESI2 libopensmssa plugin for OpenSM ibssa ibacm MPI e OSU MPI mvapich2 2 0 stack supporting the InfiniBand interface Open MPI stack 1 6 5 and later supporting the InfiniBand interface e MPI benchmark tests OSU benchmarks Intel MPI benchmarks Presta PGAS e HPC X OpenSHMEM v2 2 supporting InfiniBand MXM and FCA e HPC X UPC v2 2 supporting InfiniBand MXM and FCA 6 Mellanox Technologies Rev 3 1 1 0 3 Table 3 Mellanox OFED for Linux Software Components Components Description HPC Acceleration packages e Mellanox MXM v3 0 p2p transport library acceleration over Infini band Mellanox FCA v2 5 MPI PGAS colle
65. se the server to hang higher due to a firmware issue Mellanox Technologies 35 J Rev 3 1 1 0 3 Bug Fixes History 4 36 Mellanox Technologies Bug Fixes History Table 50 lists the bugs fixed in this release Table 50 Fixed Bugs List age from being fully removed when uninstalling MLNX_OFED v3 0 2 0 1 I Descrintion Discovered Fixed in AE SIR in Release Release 1 IB MAD Fixed an issue causing MADs to drop in large scale 3 1 1 0 0 3 1 1 0 3 clusters 2 SR IOV Fixed InfiniBand counters which were unavailable in 2 1 1 0 0 3 1 1 0 0 the VM 3 RoCE Fixed InfiniBand traffic counters that are found 3 0 1 0 1 3 1 1 0 0 under sys class infiniband lt mlx5_dev gt ports port which dis not function properly in ConnectX 4 adapter cards 4 Virtualization Fixed VXLAN functionality issues 3 0 2 0 1 3 1 1 0 0 5 Performance TCP UDP latency on ConnectX 4 was higher than 3 0 2 0 1 3 1 1 0 0 expected 6 TCP throughput on ConnectX 4 achieved full line 3 0 2 0 1 3 1 1 0 0 rate 7 Fixed an issue causing inconsistent performance 3 0 2 0 1 3 1 1 0 0 with ConnectX 3 and PowerKVM 2 1 1 8 Fixed ConnectX 4 traffic counters 3 0 2 0 1 3 1 1 0 0 9 num entries Updated the desired num entries in each iteration 3 0 1 0 1 3 1 1 0 0 and accordingly updated the offset of the WC in the given WC array 10 mlx5 driver Fixed incorr
66. sses than the available GID entries in the table results in failure, and the update_gid_table error message is displayed: "GID table of port 1 is full. Can't add address".
    4. Registering a large amount of Memory Regions (MR) may fail because of DMA mapping issues on RHEL 7.0.
    Table 45: Resources Limitation Known Issues (continued) (Index / Description / Workaround)
    5. Occasionally, a user process might experience some memory shortage and not function properly due to Linux kernel occupation of the system's free memory for its internal cache.
       Workaround: To free memory to allow it to be allocated in a user process, run the drop caches procedure below. Performing the following steps will cause the kernel to flush and free pages, dentries and inodes caches from memory, causing that memory to become free.
       Note: As this is a non-destructive operation and dirty objects are not freeable, run sync first.
        • To free the pagecache: echo 1 > /proc/sys/vm/drop_caches
        • To free dentries and inodes: echo 2 > /proc/sys/vm/drop_caches
        • To free pagecache, dentries and inodes: echo 3 > /proc/sys/vm/drop_caches
    3.9.8 Accelerated Verbs Known Issues
    Table 46: Accelerated Verbs Known Issues (Index / Description / Workaround)
    1. On ConnectX-4 Lx, the following may not be supported when using Multi-Packet WR flag IBV_EX
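The drop caches procedure in item 5 above (sync first, then write 1, 2 or 3 to /proc/sys/vm/drop_caches) could be wrapped as follows. This is a minimal Python sketch under stated assumptions: the function and constant names are hypothetical, writing to the real path requires root, and the path is a parameter only so the helper can be exercised against an ordinary file.

```python
import os

DROP_CACHES_PATH = "/proc/sys/vm/drop_caches"

# Values taken from the procedure in the release notes.
PAGECACHE = 1            # free the pagecache
DENTRIES_AND_INODES = 2  # free dentries and inodes
ALL_CACHES = 3           # free pagecache, dentries and inodes


def drop_caches(mode=ALL_CACHES, path=DROP_CACHES_PATH):
    """Flush dirty data, then ask the kernel to drop the selected caches."""
    if mode not in (PAGECACHE, DENTRIES_AND_INODES, ALL_CACHES):
        raise ValueError("mode must be 1, 2 or 3")
    os.sync()  # dirty objects are not freeable, so write them back first
    with open(path, "w") as handle:
        handle.write(f"{mode}\n")


if __name__ == "__main__":
    # Needs root on a real system; shown for illustration.
    drop_caches(ALL_CACHES)
```

Calling os.sync() before the write mirrors the note in the procedure: the kernel only frees clean objects, so syncing first lets more of the cache be released.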
67. ware packages of your operat ing system OS distribution are required To install the additional packages run the following commands per OS Table 5 Additional Software Packages Operating System Required Packages Installation Command RHEL OEL Fedora yum install perl pciutils python gcc gfortran libxml2 python tesh libnl 1686 libnl expat glib2 tcl libstdc bc tk gtk2 atk cairo numactl pkgconfig XenServer yum install perl pciutils python libxml2 python libnl expat glib2 tcl bc libstdc tk pkgconfig SLES 10 SP3 zypper install pkgconfig pciutils python libxml2 python libnl Isof expat glib2 tcl libstdc bc tk SLES 11 SP2 zypper install perl pciutils python libnl 32bit libxml2 python tesh libnl libstdc 46 expat glib2 tcl bc tklibcurl4 gtk2 atk cairo pkg config SLES 11 SP3 zypper install perl pciutils python libnl 32bit libxml2 python tesh libstdc 43 libnl expat glib2 tcl bc tk libcurl4 gtk2 atk cairo pkg config SLES 12 zypper install pkg config expat libstdc 6 libglib 2 0 0 libgtk 2 0 0 tcl libcairo2 tesh python bc pciutils libatk 1 0 0 tk python libxml2 Isof libnl1 Ubuntu Debian apt get install perl dpkg autotools dev autoconf libtool automake1 10 automake m4 dkms debhelper tcl tc18 4 chrpath swig graphviz tcl dev tcl8 4 dev tk dev tk8 4 dev bison flex dpatch zlib1 g dev curl libcurl4 gnutls dev python libxml2 libvirt bin lib virtO libnl dev libglib2 0 dev libgfortran3 automake m4 pkg config libnuma logro
68. which occurred 2 1 1 0 0 2 2 1 0 1 when the number of TX channels was set above the default value 70 Fixed a crash incidence which occurred when 2 0 2 0 5 22 1 0 1 enabling Ethernet Time stamping and running VLAN traffic 71 IB Core Fixed the QP attribute mask upon smac resolving 2 1 1 0 0 2 1 1 0 6 72 mlx5 ib Fixed a send WQE overhead issue 2 1 1 0 0 2 1 1 0 6 73 Fixed a NULL pointer de reference on the debug 2 1 1 0 0 2 1 1 0 6 print 74 Fixed arguments to kzalloc 2 1 1 0 0 2 1 1 0 6 TS mlx4_core Fixed the locks around completion handler 2 1 1 0 0 2 1 1 0 6 76 mlx4 core Restored port types as they were when recovering 2 0 2 0 5 2 1 1 0 0 from an internal error 77 Added an N A port type to support port type array 2 0 2 0 5 2 1 1 0 0 module param in an HCA with a single port 40 Mellanox Technologies Rev 3 1 1 0 3 Table 50 Fixed Bugs List Discovered Fixed in iue Description in Release Release 78 SR IOV Fixed memory leak in SR IOV flow 2 0 2 0 5 2 0 3 0 0 79 Fixed communication channel being stuck 2 0 2 0 5 2 0 3 0 0 80 mlx4 en Fixed ALB bonding mode failure when enslaving 2 0 3 0 0 2 1 1 0 0 Mellanox interfaces 81 Fixed leak of mapped memory 2 0 3 0 0 2 1 1 0 0 82 Fixed TX timeout in Ethernet driver 2 0 2 0 5 2 0 3 0 0 83 Fixed ethtool stats report for Virtual Funct

