*Reposted with permission from Oracle’s Networking Blog and Neeraj Gupta
When you want to bring up a compute server in your environment and need InfiniBand connectivity, usually you go through various installation steps. This could involve operating systems like Linux, followed by a compatible InfiniBand software distribution, associated dependencies and configurations.
What if you just want to run some InfiniBand diagnostics or troubleshooting tools from a test machine? What if something happened to your primary machine and, while recovering in rescue mode, you also need access to your InfiniBand network? We often use small, community-supported open source Linux distributions for this, but they don't come with the required InfiniBand support and tools.
In this weblog, I am going to provide instructions on how to add InfiniBand support to a specific Linux image: Parted Magic. This is a free-to-use open source Linux distro often used to recover or rescue machines.
The distribution itself will not be changed at all. Yes, you heard it right! I have built an InfiniBand add-on package that is passed alongside the default kernel and initrd to get this all working.
You will need to have a PXE server ready on your Ethernet-based network. The compute server you are trying to PXE boot should have a compatible IB HCA and must be connected to an active IB network.
Download the Parted Magic small distribution for PXE from the Parted Magic website.
Download the InfiniBand PXE add-on package (right-click and download from here). Do not extract the contents of this file; you need to use it as is.
Extract the contents of the downloaded pmagic distribution into a temporary directory. Inside the directory structure, you will see a pmagic directory containing two files: bzImage and initrd.img. Copy this directory to your TFTP server's root directory, which is usually /tftpboot unless you have a different setup. For example:
# cp pmagic_pxe_2012_2_27_x86_64.zip /tmp
# cd /tmp
# unzip pmagic_pxe_2012_2_27_x86_64.zip
# cd pmagic_pxe_2012_2_27_x86_64
# ls -l
total 12
drwxr-xr-x 3 root root 4096 Feb 27 15:48 boot
drwxr-xr-x 2 root root 4096 Mar 17 22:19 pmagic
# cp -r pmagic /tftpboot
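The APPEND line used later in this post points at pmagic/ib-pxe-addon.cgz, so the add-on file has to sit next to bzImage and initrd.img under the TFTP root. A minimal sketch of that step follows. It assumes the download was saved under the name ib-pxe-addon.cgz (taken from the APPEND line); stage_addon is only a hypothetical helper name, and the chmod is a precaution rather than something the original instructions call for.

```shell
# Hypothetical helper: copy the unextracted add-on next to bzImage and initrd.img.
# Usage: stage_addon /path/to/ib-pxe-addon.cgz [tftp_root]
stage_addon() {
    addon="$1"
    tftp_root="${2:-/tftpboot}"   # default TFTP root used in this post

    # The add-on is used as is, so only check that the file exists.
    [ -f "$addon" ] || { echo "add-on not found: $addon" >&2; return 1; }
    # The pmagic directory should already have been copied in the previous step.
    [ -d "$tftp_root/pmagic" ] || { echo "missing $tftp_root/pmagic" >&2; return 1; }

    cp "$addon" "$tftp_root/pmagic/" || return 1
    # TFTP daemons often run unprivileged, so make the copy world-readable.
    chmod 644 "$tftp_root/pmagic/$(basename "$addon")"
}
```

On the PXE server this would be called as, for example, stage_addon /tmp/ib-pxe-addon.cgz, where /tmp is simply wherever you saved the download.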
As I mentioned earlier, we don't change anything in the default pmagic distro. We simply provide the add-on package via the PXE append options.
If you are using a menu-based PXE server, add an entry to your menu. For example, the following section can be appended to /tftpboot/pxelinux.cfg/default:
LABEL Diskless Boot With InfiniBand Support
MENU LABEL Diskless Boot With InfiniBand Support
KERNEL pmagic/bzImage
APPEND initrd=pmagic/initrd.img,pmagic/ib-pxe-addon.cgz edd=off load_ramdisk=1 prompt_ramdisk=0 rw vga=normal loglevel=9 max_loop=256
TEXT HELP
* A Linux Image which can be used to PXE Boot w/ IB tools
ENDTEXT
Note: Keep the line starting with "APPEND" as a single line.
If you use host-specific files in pxelinux.cfg, you can add the above entry to the file for that specific host instead.
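For host-specific files, PXELINUX looks for a file named after the client's MAC address: the prefix 01- (the ARP type for Ethernet) followed by the MAC in lowercase hex with dashes. A small sketch for deriving that name is below; pxe_cfg_name_for_mac is a hypothetical helper name and the MAC address in the usage note is a made-up example.

```shell
# Hypothetical helper: print the host-specific PXELINUX config file name
# for a MAC address given in the usual colon-separated form.
pxe_cfg_name_for_mac() {
    # Lowercase the hex digits, then swap colons for dashes.
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f' | tr ':' '-')
    printf '01-%s\n' "$mac"
}
```

For example, pxe_cfg_name_for_mac 88:99:AA:BB:CC:DD prints 01-88-99-aa-bb-cc-dd, so the entry would go into /tftpboot/pxelinux.cfg/01-88-99-aa-bb-cc-dd.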
Now boot your desired compute machine over PXE. This does not have to be over InfiniBand; just use your standard Ethernet interface and network. If you are using menus, pick the new entry that you created in the previous section. After a few minutes, you will be booted into the Parted Magic environment.
Well, I have made things a bit easy for you. The add-on package that we passed while booting starts IPoIB automatically. All you need to do is assign an IP address to the ib0 or ib1 interface. Open a terminal session and check the status. You can use commands like:
ifconfig -a
ibstat
ibv_devices
ibv_devinfo
If you are connected to an InfiniBand network with an active Subnet Manager, your IB interfaces should have come online by now. You can proceed and assign IP addresses to them, which gives you connectivity at the IPoIB layer.
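As a rough sketch of that last step, the helper below checks ibstat-style output for a port in the Active state before an address is assigned. ib_port_active is a hypothetical name, and the IP address and netmask in the comment are placeholders to be replaced with values from your own IPoIB subnet.

```shell
# Hypothetical helper: succeed if ibstat-style text on stdin reports
# at least one port with "State: Active".
ib_port_active() {
    grep -q 'State: Active'
}

# Example use on the booted machine (address and netmask are placeholders):
#   ibstat | ib_port_active && ifconfig ib0 192.168.10.5 netmask 255.255.255.0 up
```

If the check fails, the port is most likely still waiting for a Subnet Manager or a physical link, and assigning an address will not give you working IPoIB yet.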
I have also added several InfiniBand diagnostic tools to this add-on. You can use any from the following list:
ibstat, ibstatus, ibv_devinfo, ibv_devices
perfquery, smpquery
ibnetdiscover, iblinkinfo.pl
ibhosts, ibswitches, ibnodes
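A quick way to confirm that the add-on was picked up at boot is to check that these tools are on the PATH. The sketch below uses only the shell's own command -v lookup; missing_tools is a hypothetical helper name.

```shell
# Hypothetical helper: print every named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example use on the booted machine:
#   missing_tools ibstat ibstatus ibv_devinfo ibv_devices perfquery smpquery \
#       ibnetdiscover iblinkinfo.pl ibhosts ibswitches ibnodes
```

No output means every tool was found. Any name that is printed suggests the add-on was not loaded, in which case the APPEND line is the first thing to recheck.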
This concludes this weblog. Here we saw how to bring up a computer with IPoIB and InfiniBand diagnostic tools without installing anything on it. It's almost like running diskless!