InfiniBand basics

Abbreviations

MAD    management datagram
SMP    subnet management packet (MAD used by the SM)
GMP    general management packet (MAD used by general services)
SM     subnet manager
HCA    host channel adapter
SL     service level (QoS)
VL     virtual lane
LID    local identifier
GUID   globally unique identifier
QP     queue pair
UD     unreliable datagram
UC     unreliable connection
RC     reliable connection
verbs  IB software interface


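Notes on the abbreviations:
MAD - sent over UD
LID - assigned to each port by the SM during initialization
RC - maintains the packet flow and retransmits on loss (ACK/NAK)
SL - mapped to a VL; this mapping is the basis of QoS management. There are at least 2 VLs: data and management. VL15 is the management lane and is not subject to flow control.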
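
The SL-to-VL note can be illustrated with a small sketch. The round-robin policy and the function names below are hypothetical; on real hardware the SL2VL tables are programmed by the SM, so treat this as a simplified model of the idea only.

```python
MANAGEMENT_VL = 15   # reserved for subnet management traffic, no flow control
NUM_SLS = 16         # the SL field is 4 bits wide


def build_sl2vl(data_vls):
    """Hypothetical policy: spread SL 0-15 round-robin over VL0..VL(data_vls-1)."""
    if not 1 <= data_vls <= MANAGEMENT_VL:
        raise ValueError("data VLs must be in the range 1-15")
    return [sl % data_vls for sl in range(NUM_SLS)]


def vl_for_packet(sl, sl2vl, is_smp=False):
    """SMPs always travel on VL15; everything else follows the SL2VL table."""
    return MANAGEMENT_VL if is_smp else sl2vl[sl]


table = build_sl2vl(4)
print(table)
print(vl_for_packet(5, table))
print(vl_for_packet(5, table, is_smp=True))
```

With four data VLs, SL 5 lands on VL1, while the same packet marked as an SMP is forced onto VL15.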

Topology discovery

  • topology discovery -> initializes the forwarding database (FDB) on the switches
  • initialization uses directed-route MADs
  • every node must run an SMA (subnet management agent)
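
As a conceptual illustration of the discovery step, the sketch below walks a toy fabric breadth-first and records, for every node, the directed route by which it was first reached. A directed route is modelled here as the list of exit ports taken hop by hop from the SM's node. The topology dictionary, the node names and the function name are hypothetical, and this is a simplified model, not OpenSM's actual sweep code.

```python
from collections import deque

# Hypothetical toy fabric: node -> {local port number: neighbour node}.
FABRIC = {
    "sm-host": {1: "sw1"},
    "sw1": {1: "sm-host", 2: "sw2", 3: "host-a"},
    "sw2": {1: "sw1", 2: "host-b"},
    "host-a": {1: "sw1"},
    "host-b": {1: "sw2"},
}


def discover(fabric, start):
    """Return {node: directed route}; a route is the list of exit ports
    taken hop by hop, starting at the start node."""
    routes = {start: []}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for port, neighbour in sorted(fabric[node].items()):
            if neighbour not in routes:  # first time this node is seen
                routes[neighbour] = routes[node] + [port]
                queue.append(neighbour)
    return routes


routes = discover(FABRIC, "sm-host")
for node, path in routes.items():
    print(node, path)
```

Because the routes are expressed as port sequences, no LIDs or forwarding tables are needed yet, which is why this addressing style suits the initial sweep.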

InfiniBand diagnostics

iblinkinfo      reports link info for all links in the fabric
                root@array06 ~# iblinkinfo
ibportstate     reports port state (local or switch)
                root@array06 ~# ibportstate -G 0x0002c90200466eb0 2
ibroute         uses SMPs to display the forwarding tables
                root@array06 ~# ibroute -G 0x0002c90200466eb0
smpquery        allows a basic subset of standard SMP queries
                root@array06 ~# smpquery switchinfo -G 0x0002c90200466eb0
perfquery       uses PerfMgt GMPs to obtain the PortCounters
                root@array06 ~# perfquery -G 0x0002c90200466f80 1
ibtracert       uses SMPs to trace the path from a source GID/LID to a destination GID/LID
                root@array06 ~# ibtracert 8 12
ibdiagnet       scans the fabric using directed-route packets and extracts all the available information regarding its connectivity and devices
                root@array06 ~# ibdiagnet
ibnetdiscover   performs IB subnet discovery and outputs a human-readable topology file
                root@array06 ~# ibnetdiscover
ibhosts         shows InfiniBand host nodes in the topology
                root@array06 ~# ibhosts
ibswitches      shows InfiniBand switch nodes in the topology
                root@array06 ~# ibswitches
sminfo          queries SM info
                root@array06 ~# sminfo

Subnet manager

Important opensm.conf directives:

guid 0x0002c903004bdb33
        The port GUID on which OpenSM is running.
lmc 1
        The number of LIDs assigned to each port is 2^LMC. The LMC value must be in the range 0-7. LMC values > 0 allow multiple paths between ports. LMC values > 0 should only be used if the subnet topology actually provides multiple paths between ports, i.e. multiple interconnects between switches.
partition_config_file /etc/rdma/partitions.conf.1
        Partition configuration file (see the example that follows).
sm_priority 15
        SM priority, used for handover between SMs.
sminfo_polling_timeout 500
        Timeout in milliseconds between two polls of the active master SM.
polling_retry_number 2
        Number of failed polls of the remote SM after which it is declared dead.
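
The LMC arithmetic can be made concrete with a minimal sketch. The function name is hypothetical, and the alignment check assumes that the low LMC bits of a port's base LID are zero, so the port answers to a contiguous block of 2^LMC LIDs.

```python
def lids_for_port(base_lid, lmc):
    """Return the LIDs a port answers to for a given base LID and LMC."""
    if not 0 <= lmc <= 7:
        raise ValueError("LMC must be in the range 0-7")
    count = 1 << lmc  # 2^LMC LIDs per port
    if base_lid % count != 0:
        raise ValueError("base LID must be aligned to 2^LMC")
    return list(range(base_lid, base_lid + count))


print(lids_for_port(8, 0))
print(lids_for_port(8, 1))
print(len(lids_for_port(128, 7)))
```

With `lmc 1` as in the directive list, each port gets two LIDs, which lets the routing engine offer two distinct paths to the same port.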


Example partition configuration:

[root@array06 rdma]# cat partitions.conf.1 
# For reference:
# IPv4 IANA reserved multicast addresses:
#   http://www.iana.org/assignments/multicast-addresses/multicast-addresses.txt
# IPv6 IANA reserved multicast addresses:
#   http://www.iana.org/assignments/ipv6-multicast-addresses/ipv6-multicast-addresses.xml
#
# mtu =
#   1 = 256
#   2 = 512
#   3 = 1024
#   4 = 2048
#   5 = 4096
#
# rate =
#   2  =   2.5 GBit/s
#   3  =  10   GBit/s
#   4  =  30   GBit/s
#   5  =   5   GBit/s
#   6  =  20   GBit/s
#   7  =  40   GBit/s
#   8  =  60   GBit/s
#   9  =  80   GBit/s
#   10 = 120   GBit/s

Default=0x7fff,ipoib,rate=7,mtu=4:
        mgid=ff12:401b::ffff:ffff       # IPv4 Broadcast address
        mgid=ff12:401b::1               # IPv4 All Hosts group
        mgid=ff12:401b::2               # IPv4 All Routers group
        mgid=ff12:401b::16              # IPv4 IGMP group
        mgid=ff12:401b::fb              # IPv4 mDNS group
        mgid=ff12:401b::fc              # IPv4 Multicast Link Local Name Resolution group
        mgid=ff12:401b::101             # IPv4 NTP group
        mgid=ff12:401b::202             # IPv4 Sun RPC
        mgid=ff12:601b::1               # IPv6 All Hosts group
        mgid=ff12:601b::2               # IPv6 All Routers group
        mgid=ff12:601b::16              # IPv6 MLDv2-capable Routers group
        mgid=ff12:601b::fb              # IPv6 mDNS group
        mgid=ff12:601b::101             # IPv6 NTP group
        mgid=ff12:601b::202             # IPv6 Sun RPC group
        mgid=ff12:601b::1:3             # IPv6 Multicast Link Local Name Resolution group
        ALL=full, ALL_SWITCHES=full;
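
To read the partition header line without consulting the comment tables each time, the numeric `mtu` and `rate` codes can be decoded with a short sketch. The lookup tables are copied from the comments in the file; the function name and the returned dictionary layout are hypothetical, and the parser only handles the simple header form used in this example.

```python
# Code tables taken from the comments in partitions.conf.
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}
RATE_GBITS = {2: 2.5, 3: 10, 4: 30, 5: 5, 6: 20, 7: 40, 8: 60, 9: 80, 10: 120}


def decode_partition_header(line):
    """Parse a header such as 'Default=0x7fff,ipoib,rate=7,mtu=4:'."""
    head = line.strip().rstrip(":")
    name_pkey, *options = head.split(",")
    name, pkey = name_pkey.split("=")
    result = {"name": name, "pkey": int(pkey, 16), "flags": []}
    for opt in options:
        opt = opt.strip()
        if "=" not in opt:
            result["flags"].append(opt)  # bare flag such as 'ipoib'
            continue
        key, value = opt.split("=", 1)
        if key == "mtu":
            result["mtu_bytes"] = MTU_BYTES[int(value)]
        elif key == "rate":
            result["rate_gbits"] = RATE_GBITS[int(value)]
        else:
            result[key] = value  # keep unknown options verbatim
    return result


info = decode_partition_header("Default=0x7fff,ipoib,rate=7,mtu=4:")
print(info)
```

For the header in this file, the decoded values are a 40 Gbit/s rate and a 2048-byte MTU on partition key 0x7fff.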

IPoIB

Example IPoIB ifcfg settings:

[root@array06 network-scripts]# cat ifcfg-ib0 
DEVICE=ib0
HWADDR=80:00:00:48:FE:80:00:00:00:00:00:00:00:02:C9:03:00:4B:DB:33
TYPE=InfiniBand
UUID=b329f616-d4db-4977-829f-72f5fd140743
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=static
MASTER=bond0 
SLAVE=yes
CONNECTED_MODE="yes"
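
The 20-byte HWADDR is not arbitrary. Assuming the standard IPoIB link-layer address layout (1 byte of flags, a 3-byte QPN, then the 16-byte port GID made of an 8-byte subnet prefix and the 8-byte port GUID), it can be taken apart as sketched below. The function name and the returned keys are hypothetical.

```python
def decode_ipoib_hwaddr(hwaddr):
    """Split a 20-byte IPoIB hardware address into its fields."""
    raw = bytes.fromhex(hwaddr.replace(":", ""))
    if len(raw) != 20:
        raise ValueError("IPoIB hardware addresses are 20 bytes long")
    return {
        "flags": raw[0],
        "qpn": int.from_bytes(raw[1:4], "big"),
        "subnet_prefix": raw[4:12].hex(),
        "port_guid": "0x" + raw[12:20].hex(),
    }


addr = decode_ipoib_hwaddr(
    "80:00:00:48:FE:80:00:00:00:00:00:00:00:02:C9:03:00:4B:DB:33"
)
print(addr)
```

The port GUID recovered from this HWADDR is 0x0002c903004bdb33, the same value used for the `guid` directive in the opensm.conf example.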


Example IPoIB bonding settings:

[root@array06 network-scripts]# cat ifcfg-bond0 
DEVICE=bond0
IPADDR=172.25.0.6
NETMASK=255.255.255.0 
ONBOOT=yes
BOOTPROTO=static
USERCTL=no
TYPE=Bonding
MTU=65520
BONDING_OPTS="mode=active-backup use_carrier=1 miimon=100 primary=ib0"
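
The MTU=65520 setting goes together with CONNECTED_MODE="yes" on the slave interface. A minimal sketch of the relationship, assuming the limits used by the Linux IPoIB driver as far as I know them (65520 bytes in connected mode, and the IB MTU minus the 4-byte IPoIB header in datagram mode); the constant and function names are hypothetical:

```python
IPOIB_HEADER_BYTES = 4          # IPoIB encapsulation header (assumed)
CONNECTED_MODE_MAX_MTU = 65520  # connected-mode limit (assumed)


def max_ipoib_mtu(ib_mtu_bytes, connected_mode):
    """Largest IP MTU usable on an IPoIB interface in the given mode."""
    if connected_mode:
        return CONNECTED_MODE_MAX_MTU
    return ib_mtu_bytes - IPOIB_HEADER_BYTES


print(max_ipoib_mtu(2048, connected_mode=True))
print(max_ipoib_mtu(2048, connected_mode=False))
```

With the partition's `mtu=4` (2048 bytes), datagram mode would cap the interface at 2044 bytes, so the large bond MTU only works while the slaves stay in connected mode.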

Created by darek. Last modification: Wednesday, 15 July 2015 14:40:13 CEST by darek.