Over the course of the last several years, I've noticed that on the forum, there are similar questions about both ConnectX NICs and Bluefields. Therefore, I've written a huge meta post that collects information about them.
I have had some experience working with ConnectX-4, 5, 6, and 7 NICs, as well as Bluefield-1 and 2. Most of my experience is with pure Linux, but that shouldn't matter for most things, and I'll be glad to accept any changes. To do so, please add them as a suggestion in this Google Doc. If you want to be credited, don't forget to include your username on the forum (or any other way you want to be credited in the Authors section). If you can't, PM me.
As Bluefield is essentially a ConnectX NIC with an ARM CPU (and more), there are many similarities in working with them.
Last update: 13 May 2025
Changelog:
NOTE: All server NICs require some airflow. You might be able to get away with little to no airflow for ConnectX-4 Lx and ConnectX-6 Lx (though this is already questionable), but for all other models, you must provide active airflow of some form; otherwise, even in idle mode, the NIC will overheat and shut down.
To get the full potential out of your Mellanox/Nvidia NIC, you need to install MLNX_OFED. Since December 2024, it has been distributed as part of the Host DOCA SDK bundle. As of writing, you can download it from this section of Nvidia's website. To download the latest DOCA, go to the website, click on "Host-Server", select "DOCA-Host", choose your OS, and follow the provided instructions.
DOCA 2.10 (latest as of writing) supports all NICs starting with ConnectX-4 and Bluefield-2 and newer (see below). For unsupported cards, you need to use OFED that still supports them. The archive is located here.
To reflash NICs, you should use the open-source mstflint. Most Linux distros come with mstflint prepackaged, but the packaged version might be too old. The latest version can be downloaded from here.
Unofficial list of Mellanox and Nvidia NICs
3. ConnectX
i. General
Working with ConnectX cards is mostly straightforward. There are a few common tools and commands that people use:
Some NICs have so-called "White" and "Black" connectors. Those are for the "Socket Direct" adapter and can be used to connect the NIC to PCIe lanes from 2 different sockets. You connect the cable labeled "white" to the "WHITE" connector and the cable labeled "black" to the connector that has "BLACK" written near it (note: the cables themselves can be any color, both white, one white and one black, or both black; that is normal). It might help performance when you have a multi-socket system and want to assign queues from the NIC to both CPUs without data passing over a slow cross-socket link (the same applies to sub-NUMA nodes, like NPS on AMD Epyc).
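To decide which CPU a port's queues belong on, it helps to know which NUMA node its PCIe function is attached to. A minimal sketch, assuming the usual Linux sysfs layout (/sys/class/net/<iface>/device/numa_node); the function name and the optional second argument (an alternate sysfs root, handy only for testing) are made up for this example:

```shell
# Print the NUMA node of a network interface, as reported by sysfs.
# $1 - interface name (whatever name your port got, e.g. from "ip link")
# $2 - optional sysfs root, defaults to /sys (assumption: standard layout)
nic_numa_node() (
    iface="$1"
    sysroot="${2:-/sys}"
    node_file="${sysroot}/class/net/${iface}/device/numa_node"
    if [ ! -r "${node_file}" ]; then
        echo "no NUMA information for ${iface}" >&2
        exit 1
    fi
    cat "${node_file}"
)
```

A value of -1 usually means the kernel has no NUMA affinity recorded for that device (single-socket systems, for example).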
a. Troubleshooting
Some cards, especially those running older firmware (from approximately 2018-2019), have severe compatibility issues with modern systems, to the extent that they won't initialize and are therefore inaccessible from within the OS. The solution is to put the card in flash recovery mode and reflash it, or to find a compatible system (I personally use a Gigabyte MC12-LE0-based system as a reflashing machine).
ii. Updating firmware
There is a fully official automated way to update firmware for supported non-OEM cards:
It will update the NIC firmware to the one bundled in the version of DOCA you have installed. That doesn't work for OEM cards (e.g., Dell, HP, Cisco, …) and won't allow you to cross-flash cards. OEM cards have different PSIDs allocated to them that don't start with MT_ but rather with company-allocated prefixes (DEL, HP_, etc).
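The prefix rule above is easy to script. A minimal sketch: the function name is made up, and it only encodes the one rule stated here (MT_ means stock, any other prefix means OEM):

```shell
# Classify a PSID by its prefix.
# Prints "stock" for Mellanox/Nvidia PSIDs (MT_...), "oem" for any other
# non-empty value, and "unknown" for an empty argument.
psid_kind() {
    case "$1" in
        MT_*) echo "stock" ;;
        "")   echo "unknown" ;;
        *)    echo "oem" ;;
    esac
}
```

On a real system you could feed it the PSID line from the query output, for example psid_kind "$(sudo mstflint -d "${PCI_ID}" query | awk '$1 == "PSID:" {print $2}')", assuming the output has a "PSID:" line as it does on the cards I have seen.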
If you want to flash specific firmware for the same PSID (one of the unique identifiers of a particular card):
a. Flashing / cross-flashing / recovering from "flash recovery" state
WARNING: This is a dangerous operation. While ConnectX-4 and newer cards are rather resilient and can be restored most of the time, in some cases recovery is hard, and a bad flash might cause permanent damage to the NIC or your computer. Do it at your own risk. Your chances of success are higher if the PCBs of the cards mostly match, and you must be sure they use the same chip (it is a bad idea to flash ConnectX-4 Lx with ConnectX-4 EN firmware). At the very least, make backups.
Newer cards (ConnectX-6, 7, 8, Bluefield-2, Bluefield-3) can run signed firmware (all CX-7 and BF-3 run signed firmware), some of them can still be cross-flashed, but it is not guaranteed they'll be able to boot; in some cases, overwriting VPD helps.
The following applies to cross-flashing a card with another PSID's firmware, for same-vendor non-signed cards or older ConnectX-4 and ConnectX-5 cards.
Note: Cross-flashing the Dell variant of ConnectX-4 Lx (DPN: 20NJD for the short bracket, MRT0D for the long bracket) with stock Mellanox firmware leaves the built-in LEDs (uplink/traffic) disabled, so it's advised to stay on the latest Dell firmware (14.32.20.04).
To check if your card runs signed firmware:
If the output contains "Security Attributes: secure-fw", then the firmware is signed and you need to cross-flash the card from within recovery mode. Otherwise, the normal way would work:
In rare cases, you need to pass '--no_fw_ctrl' option for some of the ConnectX-5 cards.
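The secure-fw check described above can be scripted instead of eyeballed. A minimal sketch, assuming the query output contains a "Security Attributes:" line as quoted above; the function name is made up:

```shell
# Read "mstflint ... query full" output on stdin.
# Exit status 0 if the Security Attributes line lists secure-fw, 1 otherwise.
is_secure_fw() {
    awk '/^[[:space:]]*Security Attributes:/ && /secure-fw/ { found = 1 }
         END { exit !found }'
}
```

Usage would look like: sudo mstflint -d "${PCI_ID}" query full | is_secure_fw && echo "signed firmware, use recovery mode".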
I personally like to use something like this:
For newer cards, if the card runs signed or encrypted firmware, you won't be able to take a backup from normal mode and won't be able to use the 'allow_psid_change' flag in normal mode. You need to put it into flash-recovery mode. To do that, turn off your server, remove the card, and short JP1/JP2/J7 (sometimes labeled "FNP") - that is a 2-hole unsoldered connector with just "JPx" written next to it. I personally use just a wire, but be careful not to short other boards around it.
For ConnectX-6/Bluefield-2, you can also try flashing another vendor's firmware by overwriting the VPD. You will need to put the card into flash recovery mode, and you'll need to use flint_oem from mft-oem package provided by nvidia. You can get one from here:
Note: reflashing VPD would set your MAC and GUID to 0, and you'll need to restore it from a pre-saved file:
After that, you can turn off your server and remove the jumper. It should boot normally, and the card should change the PSID and all displayed information.
TODO: Check if Diolan U2C works with the cards as well, as it seems to use the same chip and USB ID as mtusb-1.
Note on ConnectX-7 cards:
There are some ConnectX-7 Engineering Samples on the market that have pre-production unencrypted firmware (their production date is earlier than April 2022). In the "mstflint -d ${PCI_ID} query full" you will also see:
Nvidia's documentation lists a path to make those cards run production firmware: you need to obtain a special jump firmware, version 28.98.2406, flash it, and after a reboot, flash the production firmware. However, there is no known way to obtain the 28.98.2406 firmware, so that upgrade path has never been tested. Still, some cards apparently shipped to actual people with firmware version 28.98.xxxx, as there are multiple mentions of that on the internet.
If you have a card with firmware from the 28.98 range, please dump it and share for preservation purposes.
iii. Specialized SKUs
a. Innova
Those are ConnectX-4 Lx (Innova) or ConnectX-5 (Innova-2) NICs with on-board FPGAs for crypto acceleration. There isn't much information available, though, mostly just news from STH.
I haven't tried using Innova NICs, but there seems to be a lot of information available in one of the GitHub repos (have a look at that account, it has way more details about Innova-2) or there is a good longread on GitHub Gist.
b. Branded cards
Mellanox/Nvidia also customizes cards for big customers. One notable example is CX71343DAC-WEBF, a custom SKU for Facebook, which features a single QSFP-DD that can be split into two 200G QSFP56s. That card has a custom PSID that starts with 'FB_' and therefore, firmware can't be easily updated.
iv. MacGyvering firmware
It is possible to MacGyver your own firmware, but currently, the working methods are limited to ConnectX-5 and 6, as they use the FS4 firmware format, which is unencrypted. It is theoretically possible to explore other firmware formats, but no tooling exists, and newer firmwares are also encrypted, so there isn't much you can do with them, as there is no known method to work around the encryption.
There are some reasons why you might want to do that. For example, you might want to change the PSID to a custom one, or you might have a custom vendor card that works somewhat better (e.g., ATTO-branded ConnectX-5s for Mac that work with their driver) but whose vendor stopped updating the firmware and EOLed the product. However, note that if the firmware is secure and signed, the number of changes it accepts is limited.
One example is to add PCIe Gen4 initialization to the ATTO FastFrame N312 (ConnectX-5), which would utilize all available bandwidth in Thunderbolt 5/USB4 v2 enclosures.
The available tooling is limited in terms of functionality.
Main tools are:
i. General
All documentation assumes you have both mstflint and DOCA-Host installed on your host system.
Reflashing of the ARM part goes via rshim (you need to start the rshim service if you are using Linux). It creates devices in /dev/rshim${n}/ that can give you a serial console (/dev/rshim${n}/console, 115200n8) or allow you to control the card (/dev/rshim${n}/misc).
Some useful commands:
The OS for the ARM part comes packaged into bfb files (bluefield firmware bundle). To flash them, you should use bfb-install from the rshim package.
a. Working with DOCA SDK
NOTE: An example is based on the assumption that you are flashing DOCA 2.9.1, and all the file names are based on that.
To flash DOCA SDK to your card, you should use something similar to this command:
That would flash DOCA SDK 2.9.1 (that you've downloaded) to the NIC and also would use the provided config file with extra parameters.
In that config file, you can provide some useful parameters:
A password hash can be generated with something like 'openssl passwd -5'; you can also try other hash types that the OS of your choice supports. The official documentation suggests using type 1 for whatever reason.
It supports multiple options; refer to the documentation for a list of available parameters or a detailed process description.
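For the password hash mentioned above, a minimal sketch that turns a password read from stdin into a ready-to-paste bf.cfg line. The function name is made up, and the single quotes around the hash are my assumption (on the basis that bf.cfg is read as shell-style assignments, so the $ characters in the hash should not be expanded):

```shell
# Read a password from stdin and print a bf.cfg line with its SHA-256 crypt hash.
# Requires an openssl build that supports "passwd -5" (OpenSSL 1.1.1 or newer).
make_ubuntu_password_line() (
    hash=$(openssl passwd -5 -stdin) || exit 1
    # Assumption: single quotes keep the "$" in the hash literal inside bf.cfg.
    printf "ubuntu_PASSWORD='%s'\n" "${hash}"
)
```

Reading from stdin keeps the password out of the process list; something like printf '%s\n' "my password" | make_ubuntu_password_line >> bf.cfg would append the line.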
Another interesting part is the default network interface configuration, which manages the settings for all interfaces. Most notably, that is how you can change the default IP address: Deploying BlueField Software Using BFB from Host
b. Accessing Bluefield OS
If you load the rshim service, it will create a tmfifo_net# interface (where # is the number of the corresponding /dev/rshim# folder for the card). By default, Bluefield OS is available at 192.168.100.2 and will try to use 192.168.100.1 as its default gateway and nameserver.
By default, all of the tmfifo interfaces on all cards would have a static MAC address 00:1A:CA:FF:FF:01 and therefore should be accessible over a link-local IPv6 address: fe80::21a:caff:feff:ff01. That can be changed via one of the parameters in bf.cfg.
Alternatively, it would try to obtain an IP address over OOB RJ45 via DHCP (v4).
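The link-local address quoted above follows from the MAC by the usual modified EUI-64 rule (flip the universal/local bit of the first octet, insert ff:fe in the middle). A minimal sketch for when you change the MAC via bf.cfg and need the new address; the function name is made up:

```shell
# Derive the EUI-64 based IPv6 link-local address from a MAC address.
# $1 - MAC in the form aa:bb:cc:dd:ee:ff
mac_to_link_local() (
    old_ifs=$IFS
    IFS=':'
    # Word splitting on ':' is intended here.
    set -- $1
    IFS=$old_ifs
    [ "$#" -eq 6 ] || { echo "expected six octets" >&2; exit 1; }
    # Flip the universal/local bit of the first octet.
    first=$(( 0x$1 ^ 0x02 ))
    # Insert ff:fe between the third and fourth octets.
    printf 'fe80::%x:%x:%x:%x\n' \
        $(( (first << 8) | 0x$2 )) \
        $(( (0x$3 << 8) | 0xff )) \
        $(( 0xfe00 | 0x$4 )) \
        $(( (0x$5 << 8) | 0x$6 ))
)
```

For the default MAC, mac_to_link_local 00:1A:CA:FF:FF:01 prints fe80::21a:caff:feff:ff01. Remember that link-local addresses need a zone when used from the host, e.g. ssh ubuntu@fe80::21a:caff:feff:ff01%tmfifo_net0 (the ubuntu user is an assumption based on the default image).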
c. Bluefield Applications
Nvidia provides good information about sample applications of Bluefield. I suggest referring to the relevant section of the documentation.
One of the notable use-cases for Bluefield is to emulate an NVMe drive while actually performing NVMe-oF (e.x, NVMe over TCP): DOCA Storage Zero Copy
Another use case is a GPU Packet Processing application - that way you can abstract the GPU over the network: DOCA GPU Packet Processing Application Guide
ii. Bluefield-1
NOTE: Avoid those cards unless they are extremely cheap or you want one for your collection. They are unsupported by Nvidia, the ARM part requires an old DOCA (1.3.0 is the last one that officially supported BF1; DOCA 1.4.0 still contains firmware for those cards and might or might not work), and they have bugs that were never fixed. If you have one, you can still install the latest DOCA on the host for the newest versions of OFED.
One of the common problems on Bluefield-1 systems is PCIe compatibility. That is the same as with ConnectX-5, which is running older firmware. Still, there is no proper fix for Bluefields available and even with newest available firmware it won't be able to run in some of the newer systems (I personally had no problems with compatibility with Sapphire Rapids/Emerald Rapids server systems, but in SP3/SP5 AMD Epyc machines NIC failed to initialize). In theory, it might be possible to MacGyver fixed firmware by transplanting the PCIe init section from ConnectX-5. However, that was never tried and might brick the card, rendering it beyond repair.
a. General
There are a few different SKUs in Bluefield-1 generation available, and it is important to verify which one you got before inserting the card.
All the cards use the same ConnectX-5 EN NIC silicon as standalone ConnectX-5 cards. That might be important in terms of performance.
Also, all the cards, including 2x25G, are PCIe Gen4 versions. Therefore, it should be possible to reach full speed on a PCIe Gen4 x4 port.
Mostly full list of Bluefield-1 SKUs
There is a bit of logic in how they are named, mainly in the suffix of the card.
You would see the suffixes like AENAT, ASNAT, ASCAT, CSNAT, etc.
The first letter represents the number of ports on a card. A - 2x25, C or E - 2x100.
The second letter represents the number of cores - E is 8 ARM cores, S - 16.
The third letter represents crypto support - N - no crypto, C - crypto enabled.
"T" is just always there, I don't know if it has any special meaning, and it was never used on the older firmware download page.
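The naming logic above can be sketched as a small decoder. It only covers the three letters explained here; the fourth letter and the trailing "T" are ignored because their meaning isn't known to me. The function name is made up:

```shell
# Decode the first three letters of a Bluefield-1 SKU suffix (e.g. AENAT).
decode_bf1_suffix() (
    suffix="$1"
    ports=$(printf '%s' "${suffix}" | cut -c1)
    cores=$(printf '%s' "${suffix}" | cut -c2)
    crypto=$(printf '%s' "${suffix}" | cut -c3)
    case "${ports}" in
        A)   ports_desc="2x25G" ;;
        C|E) ports_desc="2x100G" ;;
        *)   ports_desc="unknown ports" ;;
    esac
    case "${cores}" in
        E) cores_desc="8 ARM cores" ;;
        S) cores_desc="16 ARM cores" ;;
        *) cores_desc="unknown core count" ;;
    esac
    case "${crypto}" in
        N) crypto_desc="no crypto" ;;
        C) crypto_desc="crypto enabled" ;;
        *) crypto_desc="unknown crypto" ;;
    esac
    echo "${ports_desc}, ${cores_desc}, ${crypto_desc}"
)
```

For example, decode_bf1_suffix AENAT prints "2x25G, 8 ARM cores, no crypto".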
Firmware updates for the cards are no longer available as standalone files, but the latest firmware is included in DOCA 1.3.0. On some cards (e.g., the MBF1L515B ones), you can install the SDK only by using a mini-USB to USB cable, as they don't expose rshim services over PCIe.
Follow the guide for DOCA 1.3.0 to install it on the card, as that is the latest version of DOCA that works.
iii. Bluefield-2
a. General
Even the 2x25G SKUs are based on ConnectX-6 Dx; therefore, they should outperform ConnectX-6 Lx, at the cost of higher power consumption, if that matters. As a drawback, none of the cards supports ASPM.
Mostly complete official list of bluefield-2s
As with BF1, there are separate SKUs that are JBoF, and just like with BF1, you cannot use those cards on a normal computer or server. Those SKUs are called BF2500 or BF2 VPI DPU Controllers. Unfortunately, it is unclear if they have different PCBs or share the same one with other Bluefield-2s, as I wasn't able to obtain them, and there are no reports on any successful attempts to flash cards with BF2500 firmware. Firmware for BF2500 can be found inside DOCA 1.3.0 and 1.4.0, but it is old.
Some SKUs are not mentioned in the documentation, but older versions of DOCA have firmware for those SKUs. For example, MBF2H536C-CEUO has Secure Boot, but UEFI and Crypto are disabled.
b. Special versions
Other special versions of Bluefield-2 exist, but I haven't personally seen them. Those are bluefields with integrated GPU - Bluefield-2X, which consists of 3 production models:
On the internet, you can also find the BF A10X Engineering Sample for sale, but it is not clear if it exists outside of Engineering Samples. It is unclear what kind of silicon the card uses, or whether it actually exists and is not just a typo in the listing (it is hard to verify, as firmware for BF-2X has been distributed only under NDA since approx. DOCA 1.4.0).
c. NC-SI Interface
Bluefield-2 has an NC-SI connector that can be used for debugging/recovery purposes. The connector is different depending on the card. Rule of thumb: if the card is an Engineering Sample, it will have a 30-pin connector for a flat cable. For production cards, a 20-pin Molex 5011892010 connector is used.
They all have UART pins exposed, which would provide UART to the BMC.
There is slightly more information in the manuals:
d. [IMPORTANT] Updating the DOCA
It is important not to try updating a Bluefield from within the pre-installed DOCA SDK if the version is not recent enough. That would result in a soft-bricked NIC that you won't be able to even detect on the PCIe bus, and the only way to bring it back to life would be to access BMC (via SSH, if it was enabled, or via NC-SI serial console). See below for the recovery instructions.
If you have a custom SKU for which DOCA doesn't have a firmware, at the end of the firmware update process, you'll get a message:
That is normal; it means that DOCA didn't have any firmware for that card.
e. Note on MBF2M345A-VENO Engineering Samples
Those are pre-production SKUs. They usually come with pre-production firmware 24.33.0356 or with an updated firmware version 24.40.1000. It is not possible to just cross-flash those cards to a production SKU like MBF2M345A-HECO or HESO because there is a physical difference between the cards: VENO uses 2-bank 8 Gbit RAM chips, while HE*O cards use 1-bank 16 Gbit RAM chips. If you just cross-flash the card, the ARM cores will fail to boot, complaining about ddr_init training errors.
There are several workarounds available. A temporary workaround involves modifying the bfb file and replacing the RAM configuration on the flash. You need to replace the config within the BFB file each time you are performing bfb-install, and you need to persist your changes on the eMMC. You can try to make it persistent - ini is located in /dev/mmcblk0boot0 and a copy is in /dev/mmcblk0boot1. Please make a full backup of a working boot0 before making any changes (and refer to the section on BFB structure for more information).
The second workaround is to modify the firmware of the NIC. If you want a modified firmware, please send me a PM, as I don't want it to be widely available for now, as that probably would increase the prices of the cards.
The third workaround is to find newer firmware in other places (there are multiple versions available). For detailed instructions on where to find it please send a PM (same reasons as for the second workaround).
The boot partition can be updated with a special bfb file, and the steps are officially described in BlueField BSP documentation.
Those cards don't have a BMC chip and instead run on a BMC simulator. You might soft-brick the card if you attempt to change any BMC parameters (like assigning an IP address). If you do that, you will need to install the latest DOCA that still supports that card (DOCA 1.4.0). You can install it with bfb-install. That will downgrade the EFI firmware and allow you to fix the BMC Simulator parameters.
TLDR: You need to prepare a bfb file that contains the bootloader and all the configuration files you want to change, and then you need to flash it from within Bluefield's OS using the /opt/mellanox/scripts/bfrec script.
Do not try to hexedit files manually; some of them are signed, and all of them have a CRC attached. In the event of a CRC mismatch, the system will not boot.
iv. BFB structure
You can use mlx-mkbfb script to make your own BFB or to extract existing ones.
would extract bfb from bf-bundle-2.10.0-147_25.01_ubuntu22.04_prod.bfb file.
A script like that can be used to repack partitions into bfb file.
Current firmware has the following partitions that are mostly self-descriptive (if partitions have v1 and v2 - that is for bluefield-2 and bluefield-3, respectively):
All installation is done from within initramfs.
The main script is located in scripts/initrd-install, and it then calls the installation script for the OS.
a. Extracting firmware from the bf-bundle
There are sample scripts that will try to extract firmware updates from bf-bundle: GitHub - Civil/bfb-extract-fw
Scripts are simplistic and might fail if anything changes in the format. They will produce a few directories where firmware would be named by a combination of model and PSID for all cards supported by the bfb file.
That works with older DOCAs, like 1.3.0 as well, but you need to modify the script to point to the correct URLs and files.
v. Troubleshooting
a. Failed setting eswitch to offloads
Full message:
Source: 1, 2
b. BMC on vendor cards
In some cases, it is disabled because Vendor Field Mode is enabled. It can be reenabled from within the OS on the card: Vendor Field Mode
c. The card is not detected as a PCIe device.
You can attempt to recover the card by logging in to BMC's serial console. For that, you need to have either a working BMC (responding on web interface/API) or, if that is not an option (card in VFM), you can get a physical serial console over an NC-SI Connector. On production cards, those are Molex 5011892010, pico-clasp. Cables from various sources work fine with those cards.
TODO: Brick one of the cards I have and provide instructions on how to recover it, step-by-step.
5. MTUSB-1
This device is used to modify the Mellanox NICs as an option when putting them into flash recovery does not work. It accesses the NIC via the I2C interface.
This only seems to be necessary on some OEM CX5 NICs and on the CX6 NICs. It doesn’t seem to be necessary on the CX7 NICs.
The device looks like the following:
Another pic of how the foot bone is connected to the knee bone!
You will need to get a couple of extra parts that don’t come with the kit.
That includes the gender changer shown below:
I also needed to rig the 3 pin connector that will go through the 3 holes on the nic:
And another shot:
I originally thought the green wire would go to the G (ground) on the NIC, but it is the white wire that goes to the G on the NIC!
Also, take note that the order of the 3 holes on a CX5 NIC may be different from the CX6 NIC!
Please note that this did not work in VMware ESXi. I did get it to work in Linux and Windows.
What it looks like without the mtusb-1 connected:
With the mtusb-1 connected:
Now you can run any of the normal commands. You just need to specify the mtusb-1 as the device:
Some additional pics of the complete setup on a benchtop:
6. Useful links
Mellanox OFED cheat sheet - MLXOFED Cheatsheet
bfscripts/mlx-mkbfb at master · Mellanox/bfscripts
NVIDIA BlueField-2 Ethernet DPU User Guide
Configuring NVIDIA BlueField2 SmartNIC
39. NVIDIA MLX5 Ethernet Driver — Data Plane Development Kit 25.03.0 documentation
Levente Csikor – Medium - very good series of articles on working with Bluefield, however, it requires an account on Medium.
Changelog:
- 4 May 2025 Initial document
- 5 May 2025 extra note about Dell's CX-4 Lx and LED.
- 5 May 2025 mention that there is a 3rd workaround for MBF2M345A-VENOT_ES
- 10 May 2025 added MTUSB-1 section 6
- 13 May 2025 Clarify the section about cross-flashing vendor cards
Authors:
- Civiloid
- pimposh
- jpmomo
3. ConnectX
i. General
Working with ConnectX cards is mostly straightforward. There are a few common tools and commands that people use:
Code:
# query current configuration
mlxconfig -d ${PCI_ID} q
Code:
# Change port 1 to Ethernet and port2 to Infiniband
mlxconfig -d ${PCI_ID} set LINK_TYPE_P1=2 LINK_TYPE_P2=1
Code:
# Enable SR-IOV and Change number of virtual functions to 8:
mlxconfig -d ${PCI_ID} set SRIOV_EN=1 NUM_OF_VFS=8
Code:
# Link-aggregation mode, queue affinity
mlxconfig -d ${PCI_ID} s LAG_RESOURCE_ALLOCATION=0
Code:
# Link-aggregation mode, hash mode
mlxconfig -d ${PCI_ID} s LAG_RESOURCE_ALLOCATION=1
Code:
# Enable aggressive CQE Compression - that might help with small packet performance, however, might make performance worse in some cases
mlxconfig -d ${PCI_ID} s CQE_COMPRESSION=1
ii. Updating firmware
There is a fully official automated way to update firmware for supported non-OEM cards:
Code:
mlxfwmanager
If you want to flash specific firmware for the same PSID (one of the unique identifiers of a particular card):
Code:
sudo mstflint -d "${PCI_ID}" -i "${NEW_FIRMWARE_BIN}" burn
To check if your card runs signed firmware:
Code:
mstflint -d ${PCI_ID} query full
Code:
sudo mstflint -d "${PCI_ID}" -i "${NEW_FIRMWARE_BIN}" -allow_psid_change burn
I personally like to use something like this:
Code:
echo "You are about to flash your Mellanox card to a different firmware without any validation. That can cause irreversible damage to your network card or PC. Please STOP IF YOU ARE NOT SURE YOU KNOW WHAT YOU ARE DOING, AND YOU TAKE FULL RESPONSIBILITY FOR WHAT IS ABOUT TO HAPPEN."
sleep 60
sudo apt install mstflint gawk
NEW_FIRMWARE_BIN="<set this to the path to your new unzipped firmware, bin file>"
# This will get only the ID of the first card; modify it if you need another one
PCI_ID=$(sudo lspci | gawk '($0 ~ /ConnectX/ && $1 ~ /\.0$/){print $1}' | head -n 1)
mkdir -p "mellanox_${PCI_ID}_backup"
sudo mlxconfig -d "${PCI_ID}" q > "mellanox_${PCI_ID}_backup"/mlxconfig.txt
sudo mstflint -d "${PCI_ID}" query full > "mellanox_${PCI_ID}_backup"/query_full.txt
sudo mstflint -d "${PCI_ID}" hw query > "mellanox_${PCI_ID}_backup"/hw_query.txt
sudo mstflint -d "${PCI_ID}" ri "mellanox_${PCI_ID}_backup"/orig_firmware.bin
sudo mstflint -d "${PCI_ID}" dc "mellanox_${PCI_ID}_backup"/orig_firmware.ini
sudo mstflint -d "${PCI_ID}" -i "${NEW_FIRMWARE_BIN}" -allow_psid_change burn
sudo mstfwreset -d "${PCI_ID}" reset
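The script above does not stop if one of the backup steps fails, so it may be worth checking the backup directory before running the burn step. A minimal sketch, assuming the file names used in the script above; the function name is made up:

```shell
# Check that every backup file written by the script above exists and is non-empty.
# $1 - backup directory, e.g. "mellanox_${PCI_ID}_backup"
backup_complete() (
    dir="$1"
    for f in mlxconfig.txt query_full.txt hw_query.txt orig_firmware.bin orig_firmware.ini; do
        if [ ! -s "${dir}/${f}" ]; then
            echo "missing or empty: ${dir}/${f}" >&2
            exit 1
        fi
    done
)
```

One way to use it is to put backup_complete "mellanox_${PCI_ID}_backup" || exit 1 right before the line with -allow_psid_change.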
Code:
sudo mstflint -d "${PCI_ID}" -i "${NEW_FIRMWARE_BIN}" -ocr --nofs --allow_psid_change burn
Note: reflashing VPD would set your MAC and GUID to 0, and you'll need to restore it from a pre-saved file:
Code:
sudo mstflint -d "${PCI_ID}" -ocr query full > query_full.txt
sudo mstflint -d "${PCI_ID}" -ocr hw query > hw_query.txt
# Make a backup of the firmware image and its config, just in case.
sudo mstflint -d "${PCI_ID}" -ocr ri orig_firmware.bin
sudo mstflint -d "${PCI_ID}" -ocr dc orig_firmware.ini
# Remove flash write protection. Open-source Flint doesn't support hw queries (might need to be compiled specially)
sudo flint_oem -d "${PCI_ID}" -ocr hw set Flash0.WriteProtected=Disabled
sudo flint_oem -d "${PCI_ID}" -i "${NEW_FIRMWARE_BIN}" -ocr --nofs --allow_psid_change --ignore_dev_data --use_image_ps burn
# For me, a reboot was required; until then, the GUID and MAC stayed 0, even though it said otherwise
reboot
# After the reboot, run the remaining commands from the directory that holds query_full.txt
PCI_ID=$(sudo lspci | gawk '($0 ~ /ConnectX/ && $1 ~ /\.0$/){print $1}' | head -n 1)
GUID=$(gawk '($1 == "Base" && $2 == "GUID:"){print $3}' query_full.txt)
MAC=$(gawk '($1 == "Base" && $2 == "MAC:"){print $3}' query_full.txt)
sudo mstflint -d "${PCI_ID}" -guid "${GUID}" -mac "${MAC}" -ocr sg
TODO: Check if Diolan U2C works with the cards as well, as it seems to use the same chip and USB ID as mtusb-1.
Note on ConnectX-7 cards:
There are some ConnectX-7 Engineering Samples on the market that have pre-production unencrypted firmware (their production date is earlier than April 2022). In the "mstflint -d ${PCI_ID} query full" you will also see:
Code:
Life cycle: PRODUCTION
<...>
Encryption: Disabled
iv. MacGyvering firmware
It is possible to MacGyver your own firmware, but currently, the working methods are limited to ConnectX-5 and 6, as they use the FS4 firmware format, which is unencrypted. It is theoretically possible to explore other firmware formats, but no tooling exists, and newer firmwares are also encrypted, so there isn't much you can do with them, as there is no known method to work around the encryption.
There are some reasons why you might want to do that: for example, you may want to change the PSID to a custom one, or you may have a custom vendor card that works somewhat better (e.g., ATTO-branded ConnectX-5s for Mac that work with their driver) but whose vendor stopped updating the firmware and EOLed the product. However, note that if the firmware is secure and signed, the number of changes it accepts is limited.
One example is to add PCIe Gen4 initialization to the ATTO FastFrame N312 (ConnectX-5), which would utilize all available bandwidth in Thunderbolt 5/USB4 v2 enclosures.
The available tooling is limited in terms of functionality.
Main tools are:
- GitHub - irisc-research-syndicate/mlx5fw: Tool for manipulating ConnectX-5 firmware (FS4?) - open-ish (unlicensed) tool that can extract ITOC sections of firmware (but not DTOC) and replace them if the size of the section hasn't changed.
- Open-source mstflint has a command to verify firmware; it will print most sections along with their offsets.
4. Bluefield
i. General
All documentation assumes you have both mstflint and DOCA-Host installed on your host system.
Reflashing of the ARM part goes via rshim (on Linux, you need to start the rshim service). It creates devices under /dev/rshim${n}/ that give you a serial console (/dev/rshim${n}/console, 115200n8) or let you control the card (/dev/rshim${n}/misc).
Some useful commands:
Code:
#reboot ARM CPU
echo "SW_RESET 1" > /dev/rshim${n}/misc
#increase verbosity of the console logs
echo "DISPLAY_LEVEL 2" > /dev/rshim${n}/misc
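If several cards are installed, it helps to see which rshim devices exist before sending commands. A minimal sketch, assuming the /dev/rshim${n}/ layout described above; the function names and the optional root-directory argument (handy for dry runs) are my own and not part of rshim.

```shell
# Hypothetical helper: list rshim device directories (rshim0, rshim1, ...)
# under a root directory, /dev by default.
list_rshim() {
    local root="${1:-/dev}" d
    for d in "$root"/rshim*; do
        [ -d "$d" ] && printf '%s\n' "$d"
    done
    return 0
}

# Hypothetical helper: send one "KEY VALUE" command to a card's misc file.
# $1 = device directory, $2 = key (e.g. SW_RESET), $3 = value.
rshim_cmd() {
    local dev="$1" key="$2" value="$3"
    printf '%s %s\n' "$key" "$value" > "$dev/misc"
}
```

Usage (as root): list_rshim, then rshim_cmd /dev/rshim0 DISPLAY_LEVEL 2. The console itself can be opened with any serial terminal, for example screen /dev/rshim0/console 115200.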
a. Working with DOCA SDK
NOTE: The example assumes that you are flashing DOCA 2.9.1, and all the file names are based on that.
To flash DOCA SDK to your card, you should use something similar to this command:
Code:
bfb-install --rshim rshim0 --bfb ./bf-bundle-2.9.1-30_24.11_ubuntu-22.04_prod.bfb --config ./bf.cfg
In that config file, you can provide some useful parameters:
Code:
ubuntu_PASSWORD=<hash of the password>
grub_admin_PASSWORD='<grub2 pbkdf2 hashed password>'
It supports multiple options; refer to the documentation for a list of available parameters or a detailed process description.
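A sketch of generating such a config file from pre-computed hashes. Only the ubuntu_PASSWORD and grub_admin_PASSWORD keys come from the text above; the function name and arguments are illustrative, and I am assuming the file is parsed as shell, which is why the values are single-quoted.

```shell
# Hypothetical helper: write a bf.cfg with pre-hashed passwords.
# $1 = output path, $2 = hash for the ubuntu user, $3 = optional grub2 pbkdf2 hash.
write_bf_cfg() {
    local out="$1" ubuntu_hash="$2" grub_hash="${3:-}"
    {
        # Single quotes keep the '$' characters of the hash literal
        # if the file is read as shell.
        printf "ubuntu_PASSWORD='%s'\n" "$ubuntu_hash"
        if [ -n "$grub_hash" ]; then
            printf "grub_admin_PASSWORD='%s'\n" "$grub_hash"
        fi
    } > "$out"
}
```

Usage: write_bf_cfg ./bf.cfg "$(openssl passwd -1)" prompts for the password and writes the hash. As far as I know, openssl passwd -1 is what Nvidia's docs use for the ubuntu password, and grub-mkpasswd-pbkdf2 produces the grub hash; check the documentation for your DOCA version before relying on either.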
Another interesting part is the default network interface configuration, which manages the settings for all interfaces. Most notably, that way you can change the default IP address: Deploying BlueField Software Using BFB from Host
b. Accessing Bluefield OS
If you load the rshim service, it will create a tmfifo_net# interface (where # is the number corresponding to the card's /dev/rshim folder). By default, Bluefield OS is available at 192.168.100.2 and will try to use 192.168.100.1 as its default gateway and nameserver.
By default, all of the tmfifo interfaces on all cards would have a static MAC address 00:1A:CA:FF:FF:01 and therefore should be accessible over a link-local IPv6 address: fe80::21a:caff:feff:ff01. That can be changed via one of the parameters in bf.cfg.
Alternatively, it would try to obtain an IP address over OOB RJ45 via DHCP (v4).
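The link-local address mentioned above follows from the MAC via the usual modified EUI-64 rule: flip the universal/local bit of the first octet and insert ff:fe in the middle. A small sketch of that derivation, useful if you changed the MAC in bf.cfg; the function name is my own.

```shell
# Hypothetical helper: derive the EUI-64 based IPv6 link-local address from a MAC.
# $1 = MAC address in aa:bb:cc:dd:ee:ff form.
mac_to_link_local() {
    local o1 o2 o3 o4 o5 o6
    # Split the MAC into its six octets.
    IFS=: read -r o1 o2 o3 o4 o5 o6 <<EOF
$1
EOF
    [ -n "$o6" ] || return 1
    # Flip bit 0x02 of the first octet, then insert ff:fe between octets 3 and 4.
    printf 'fe80::%x:%x:%x:%x\n' \
        "$(( ((0x$o1 ^ 0x02) << 8) | 0x$o2 ))" \
        "$(( (0x$o3 << 8) | 0xff ))" \
        "$(( 0xfe00 | 0x$o4 ))" \
        "$(( (0x$o5 << 8) | 0x$o6 ))"
}
```

mac_to_link_local 00:1A:CA:FF:FF:01 prints fe80::21a:caff:feff:ff01. Link-local addresses need an interface scope on the host, so with the first card you would use something like ping -6 fe80::21a:caff:feff:ff01%tmfifo_net0.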
c. Bluefield Applications
Nvidia provides good information about sample applications of Bluefield. I suggest referring to the relevant section of the documentation.
One of the notable use-cases for Bluefield is to emulate an NVMe drive while actually performing NVMe-oF (e.g., NVMe over TCP): DOCA Storage Zero Copy
Another use case is a GPU Packet Processing application - that way you can abstract the GPU over the network: DOCA GPU Packet Processing Application Guide
ii. Bluefield-1
NOTE: Avoid those cards unless they are extremely cheap or you want one for your collection. They are unsupported by Nvidia, the ARM part requires an old DOCA (1.3.0 is the last one that officially supported BF1; DOCA 1.4.0 still contains firmware for them and might or might not work), and they have bugs that were never fixed. If you have one, you can still install the latest DOCA on the host for the newest versions of OFED.
One of the common problems on Bluefield-1 systems is PCIe compatibility. It is the same issue that ConnectX-5 has when running older firmware. However, there is no proper fix available for Bluefields, and even with the newest available firmware the card won't run in some newer systems (I personally had no compatibility problems with Sapphire Rapids/Emerald Rapids server systems, but in SP3/SP5 AMD Epyc machines the NIC failed to initialize). In theory, it might be possible to MacGyver a fixed firmware by transplanting the PCIe init section from ConnectX-5. However, that has never been tried and might brick the card beyond repair.
a. General
There are a few different SKUs in Bluefield-1 generation available, and it is important to verify which one you got before inserting the card.
All the cards are built around the same ConnectX-5 En NIC as the standalone ConnectX-5 cards. That might be important in terms of performance.
Also, all the cards, including 2x25G, are PCIe Gen4 versions. Therefore, a 2x25G card should be able to reach full speed even on a PCIe Gen4 x4 port (roughly 63 Gbit/s of link bandwidth for 50 Gbit/s of ports).
Mostly full list of Bluefield-1 SKUs
There is a bit of logic in how they are named, mainly in the suffix of the card.
You would see the suffixes like AENAT, ASNAT, ASCAT, CSNAT, etc.
The first letter represents the number of ports on a card. A - 2x25, C or E - 2x100.
The second letter represents the number of cores - E is 8 ARM cores, S - 16.
The third letter represents crypto support - N - no crypto, C - crypto enabled.
"T" is just always there, I don't know if it has any special meaning, and it was never used on the older firmware download page.
- BF1500 (e.g., MBF1L516B-CSNAT) or 2x25G Bluefield (all MBF1M332A) - those are normal NICs with an ARM CPU on board. The ARM CPU is slow, as it is either an 8-core or 16-core Cortex-A72 that runs at just 800 MHz. 2x25G cards have a built-in fan that is rather loud.
- BF1600 (e.g., MBF1M606A-CSNAT or MBF1M636A-CSNAT) - those are special SKUs for JBoF systems. Instead of being PCIe client cards, they act as the PCIe host and therefore should not be inserted into a normal system: according to the docs, it might damage the motherboard and/or the card. They are also rare and usually extremely overpriced. They have different CPUs, but all of them are 16-core and run at either 1.1 GHz or 1.3 GHz. All but the lowest-end (606) have a 'PCIe Aux connector' that lets you use a second PCIe slot to reach all 32 PCIe lanes that those NICs have. It is not clear whether that requires a special card or whether a Socket Direct adapter can be used. Some of the BF1600 cards have a SODIMM DDR4 slot for RAM instead of soldered RAM.
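The suffix logic described above (first letter for ports, second for cores, third for crypto) can be written down as a small decoder. This is only a sketch of the naming rules as stated in this post; the function name and output format are made up, and letters not covered by those rules are reported as unknown.

```shell
# Hypothetical helper: decode the first three suffix letters of a Bluefield-1 SKU.
# $1 = full SKU (e.g. MBF1M332A-AENAT) or just the suffix (e.g. AENAT).
decode_bf1_suffix() {
    local suffix="${1##*-}" ports="unknown" cores="unknown" crypto="unknown"
    case "$suffix" in          # first letter: ports
        A*)    ports="2x25G" ;;
        C*|E*) ports="2x100G" ;;
    esac
    case "$suffix" in          # second letter: ARM core count
        ?E*) cores="8" ;;
        ?S*) cores="16" ;;
    esac
    case "$suffix" in          # third letter: crypto support
        ??N*) crypto="no" ;;
        ??C*) crypto="yes" ;;
    esac
    printf 'ports=%s cores=%s crypto=%s\n' "$ports" "$cores" "$crypto"
}
```

For example, decode_bf1_suffix MBF1L516B-CSNAT prints ports=2x100G cores=16 crypto=no. It says nothing about whether a card is a BF1500 or a BF1600, so still check the model number before inserting the card.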
Firmware updates for the cards are no longer available as standalone files, but the latest firmware is included in DOCA 1.3.0. On some cards (e.g., the MBF1L515B ones), you can install the SDK only over a mini-USB to USB cable, as they don't expose rshim over PCIe.
Follow the guide for DOCA 1.3.0 to install it on the card, as that is the latest version of DOCA that works.
iii. Bluefield-2
a. General
Even 2x25G SKUs are based on ConnectX-6 Dx, so their performance and power consumption should be in line with ConnectX-6 Dx rather than ConnectX-6 Lx (that is, faster, if that matters). As a drawback, none of the cards supports ASPM.
Mostly complete official list of Bluefield-2s
As with BF1, there are separate SKUs that are JBoF, and just like with BF1, you cannot use those cards on a normal computer or server. Those SKUs are called BF2500 or BF2 VPI DPU Controllers. Unfortunately, it is unclear if they have different PCBs or share the same one with other Bluefield-2s, as I wasn't able to obtain them, and there are no reports on any successful attempts to flash cards with BF2500 firmware. Firmware for BF2500 can be found inside DOCA 1.3.0 and 1.4.0, but it is old.
Some SKUs are not mentioned in the documentation, but older versions of DOCA have firmware for those SKUs. For example, MBF2H536C-CEUO has Secure Boot, but UEFI and Crypto are disabled.
b. Special versions
Other special versions of Bluefield-2 exist, but I haven't personally seen them. Those are bluefields with integrated GPU - Bluefield-2X, which consists of 3 production models:
- AX800 - actually Bluefield-3 with A100 class chip
- A100X - Bluefield-2 with A100
- A30X - Bluefield-2 with A30
On the internet, you can also find the BF A10X Engineering Sample for sale, but it is not clear if it exists outside of Engineering Samples. It is unclear what kind of silicon the card uses, or whether it actually exists and is not just a typo in the listing (it is hard to verify, as firmware for BF-2X has been distributed only under NDA since approx. DOCA 1.4.0).
c. NC-SI Interface
Bluefield-2 has an NC-SI connector that can be used for debugging/recovery purposes. The connector is different depending on the card. Rule of thumb: if the card is an Engineering Sample, it will have a 30-pin connector for a flat cable. For production cards, a 20-pin Molex 5011892010 connector is used.
They all have UART pins exposed, which would provide UART to the BMC.
There is slightly more information in the manuals:
d. [IMPORTANT] Updating the DOCA
It is important not to try updating a Bluefield from within the pre-installed DOCA SDK if the version is not recent enough. That would result in a soft-bricked NIC that you won't be able to even detect on the PCIe bus, and the only way to bring it back to life would be to access BMC (via SSH, if it was enabled, or via NC-SI serial console). See below for the recovery instructions.
If you have a custom SKU for which DOCA doesn't have a firmware, at the end of the firmware update process, you'll get a message:
Code:
INFO[MISC]: NIC firmware update failed
e. Note on MBF2M345A-VENO Engineering Samples
Those are pre-production SKUs. They usually come with pre-production firmware 24.33.0356 or with an updated firmware version 24.40.1000. It is not possible to simply cross-flash those cards to a production SKU like MBF2M345A-HECO or HESO, because the cards differ physically: VENO uses 2-bank 8 Gbit RAM chips, while HE*O cards use 1-bank 16 Gbit RAM chips. If you just cross-flash the card, the ARM cores will fail to boot, complaining about ddr_init training errors.
There are several workarounds available. A temporary workaround involves modifying the bfb file and replacing the RAM configuration on the flash. You need to replace the config within the BFB file each time you are performing bfb-install, and you need to persist your changes on the eMMC. You can try to make it persistent - ini is located in /dev/mmcblk0boot0 and a copy is in /dev/mmcblk0boot1. Please make a full backup of a working boot0 before making any changes (and refer to the section on BFB structure for more information).
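For the backup mentioned above, a minimal sketch that copies a boot partition to a file and checks the copy. It assumes GNU dd, cmp and sha256sum are available in the Bluefield OS; the function name and file names are placeholders.

```shell
# Hypothetical helper: copy a boot partition to a file and verify the copy.
# $1 = source device (e.g. /dev/mmcblk0boot0), $2 = destination file.
backup_boot_part() {
    local src="$1" dst="$2"
    dd if="$src" of="$dst" bs=1M status=none || return 1
    # cmp exits non-zero if the copy differs from the source.
    cmp -s "$src" "$dst" || return 1
    # Print a checksum so the backup can be verified again later.
    sha256sum "$dst"
}
```

Usage, as root on the card: backup_boot_part /dev/mmcblk0boot0 boot0.bak, and the same for /dev/mmcblk0boot1. Copy the resulting files off the card before changing anything.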
The second workaround is to modify the firmware of the NIC. If you want a modified firmware, please send me a PM, as I don't want it to be widely available for now, as that probably would increase the prices of the cards.
The third workaround is to find newer firmware in other places (there are multiple versions available). For detailed instructions on where to find it please send a PM (same reasons as for the second workaround).
The boot partition can be updated with a special bfb file, and the steps are officially described in BlueField BSP documentation.
Those cards don't have a BMC chip and instead run on a BMC simulator. You might soft-brick the card if you attempt to change any BMC parameters (like assigning an IP address). If you do that, you will need to install the latest DOCA that still supports that card (DOCA 1.4.0). You can install it with bfb-install. That will downgrade the EFI firmware and allow you to fix the BMC Simulator parameters.
TLDR: You need to prepare a bfb file that contains the bootloader and all the configuration files you want to change, and then you need to flash it from within Bluefield's OS using the /opt/mellanox/scripts/bfrec script.
Do not try to hexedit files manually; some of them are signed, and all of them have a CRC attached. In the event of a CRC mismatch, the system will not boot.
iv. BFB structure
You can use the mlx-mkbfb script to make your own BFB or to extract existing ones.
Code:
mlx-mkbfb -x ./bf-bundle-2.10.0-147_25.01_ubuntu-22.04_prod.bfb
A script like that can be used to repack partitions into a BFB file.
Current firmware has the following partitions, which are mostly self-descriptive (where a partition has v1 and v2 variants, they are for Bluefield-2 and Bluefield-3, respectively):
- bl2-cert-v1
- bl2-cert-v2
- bl2r-cert-v1
- bl2r-v1
- bl2-v0
- bl2-v1
- bl2-v2
- bl31-cert-v1
- bl31-cert-v2
- bl31-key-cert-v1
- bl31-key-cert-v2
- bl31-v0
- bl31-v1
- bl31-v2
- bl32-cert-v1
- bl32-cert-v2
- bl32-key-cert-v1
- bl32-key-cert-v2
- bl32-v0
- bl33-cert-v1
- bl33-cert-v2
- bl33-key-cert-v1
- bl33-key-cert-v2
- bl33-v0
- boot-acpi-v0
- boot-args-v0
- boot-args-v2
- boot-desc-v0
- boot-path-v0
- capsule-v0
- ddr_ate_dmem-v1
- ddr_ate_dmem-v2
- ddr_ate_imem-v1
- ddr_ate_imem-v2
- ddr-cert-v1
- ddr-cert-v2
- ddr_ini-v1
- ddr_ini-v2
- image-v0 – kernel
- initramfs-v0
- psc-app-v2
- psc-bl-v2
- psc-certs-v2
- psc-fw-v2
- snps_images-v1
- snps_images-v2
- trusted-key-cert-v1
- trusted-key-cert-v2
All installation is done from within initramfs.
The main script is located in scripts/initrd-install, and it then calls the installation script for the OS.
a. Extracting firmware from the bf-bundle
There are sample scripts that will try to extract firmware updates from bf-bundle: GitHub - Civil/bfb-extract-fw
Scripts are simplistic and might fail if anything changes in the format. They will produce a few directories where firmware would be named by a combination of model and PSID for all cards supported by the bfb file.
That works with older DOCAs, like 1.3.0 as well, but you need to modify the script to point to the correct URLs and files.
v. Troubleshooting
a. Failed setting eswitch to offloads
Full message:
Code:
[ 183.852908] mlx5_core 0000:03:00.1: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
[ 191.032911] mlx5_core 0000:03:00.1: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
[ 192.322688] mlx5_core 0000:03:00.1: mlx5_cmd_out_err:833:(pid 1249): CREATE_FLOW_GROUP(0x933) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x201c1c), err(-22)
[ 192.340343] mlx5_core 0000:03:00.1: mlx5_rdma_enable_roce_steering:71:(pid 1249): Failed to create RDMA RX flow group err(-22)
[ 192.356593] mlx5_core 0000:03:00.1: mlx5_rdma_enable_roce:164:(pid 1249): Failed to enable RoCE steering: -22
Code:
# That helps if you have strange errors in dmesg about ESWITCH
# PF_TOTAL_SF: maximum number of scalable functions you wish to configure for the given PF/ECPF. 252 is the max.
# PF_SF_BAR_SIZE: size of each SF at the BAR2. The size is in powers of 2 in KB. 12 seems to be the default.
# NUM_OF_PF: number of physical ports exposed to the host. NOTE: if you set it here to 0, no ports will be visible on the host and you'll need to log in to ARM to re-enable them
mlxconfig -d ${PCI_ID} s PF_TOTAL_SF=252 PF_SF_BAR_SIZE=12 NUM_OF_PF=${AMOUNT_OF_PORTS}
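To see which functions are affected before changing anything, the kernel log can be scanned for the failed command shown in the message above. A rough sketch, assuming the log lines have the same shape as in this post; the function name is my own.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the PCI addresses
# of mlx5_core devices that logged a failed CREATE_FLOW_GROUP command.
find_flow_group_errors() {
    awk '/mlx5_core/ && /CREATE_FLOW_GROUP/ && /failed/ {
        for (i = 1; i < NF; i++)
            if ($i == "mlx5_core") {
                addr = $(i + 1)
                sub(/:$/, "", addr)   # drop the trailing colon after the address
                print addr
            }
    }' | sort -u
}
```

Usage: sudo dmesg | find_flow_group_errors. With the log above it prints 0000:03:00.1. As far as I know, mlxconfig settings only take effect after a reboot or firmware reset, so rerun the check after that.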
b. BMC on vendor cards
In some cases, it is disabled because Vendor Field Mode is enabled. It can be reenabled from within the OS on the card: Vendor Field Mode
c. The card is not detected as a PCIe device.
You can attempt to recover the card by logging in to BMC's serial console. For that, you need to have either a working BMC (responding on web interface/API) or, if that is not an option (card in VFM), you can get a physical serial console over an NC-SI Connector. On production cards, those are Molex 5011892010, pico-clasp. Cables from various sources work fine with those cards.
TODO: Brick one of the cards I have and provide instructions on how to recover it, step-by-step.
5. MTUSB-1
This device is used to modify the Mellanox NICs as an option when putting them into flash recovery does not work. It accesses the NIC via the I2C interface.
This only seems to be necessary on some OEM CX5 NICs and the CX6 NICs. It doesn’t seem to be necessary on the CX7 NICs.
The device looks like the following:
You will need to get a couple of extra parts that don’t come with the kit.
That includes the gender changer shown below:
I also needed to rig the 3 pin connector that will go through the 3 holes on the nic:
I originally thought that the green wire would go to the G for ground on the NIC, but the white wire goes to the G on the NIC!
Also, note that the order of the 3 holes on a CX5 NIC may differ from that on a CX6 NIC!
Please note that this did not work in VMware ESXi. I did get it to work in Linux and Windows.
What it looks like without the mtusb-1 connected:
With the mtusb-1 connected:
Now you can run any of the normal commands. You just need to specify the mtusb-1 as the device:
6. Useful links
Mellanox OFED cheat sheet - MLXOFED Cheatsheet
bfscripts/mlx-mkbfb at master · Mellanox/bfscripts
NVIDIA BlueField-2 Ethernet DPU User Guide
Configuring NVIDIA BlueField2 SmartNIC
39. NVIDIA MLX5 Ethernet Driver — Data Plane Development Kit 25.03.0 documentation
Levente Csikor – Medium - very good series of articles on working with Bluefield, however, it requires an account on Medium.