Unique OEM Dual LSI 2308 + Mellanox ConnectX-3 VPI Combo Expansion Cards


Dave Corder

Active Member
Dec 21, 2015
Hi Dave!

Unfortunately, this card is locked down tighter than crazy glue. It simply won't work without having one of the proprietary boxes this company sells for this card. The hardware in the custom boxes is the only thing that properly enables this card.

That said, give it another go my friend! Who knows, you might unlock this bad boy!
Thanks. I've got a few ideas of things to at least investigate that I don't think have been tried yet (or at least not documented in this thread). For $20, it's worth it to me to have another thing to tinker with a little bit (just don't tell my wife :p), 'cause I love doing that sort of thing anyway. We'll see!
 
  • Like
Reactions: Sleyk

Dave Corder

Active Member
Dec 21, 2015
Woohoo, my card arrived today (was supposed to be Monday)!

Played with it a little bit... I was able to dump the SBR from both controllers using lsirec (will upload later). As expected, at the moment I'm able to hostboot both controllers on the card but not much else. Poked around with lsiutil but got lots of errors trying to look at the interesting menu options...

The most interesting result so far is that if I flash a controller with the 9207-8e SBR, it'll no longer hostboot any firmware I have handy. Flashing the original SBR back allowed it to resume "functioning".
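
For reference, the SBR dumping/restoring is just lsirec's readsbr/writesbr commands, roughly like this (sketch only - the PCI address is a placeholder, find yours with lspci):

Code:
# placeholder PCI address - find yours with: lspci -Dnn | grep -i LSI
./lsirec 0000:05:00.0 unbind
./lsirec 0000:05:00.0 halt
./lsirec 0000:05:00.0 readsbr sbr_ctrl0_orig.bin    # back up the original SBR
# ...and writesbr to put an SBR (original or 9207-8e) back on:
./lsirec 0000:05:00.0 writesbr sbr_ctrl0_orig.bin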

Will do some more tinkering this weekend and see if I can get anywhere.
 
  • Like
Reactions: Sleyk and dawsonkm

Dave Corder

Active Member
Dec 21, 2015
It occurs to me that one could probably bypass the firmware issues by just having a startup script that hostboots both of the controllers during system boot (lsirec on Linux would handle that fine, I think). Part of me wonders if that's all DDN did in their proprietary systems in the first place...
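
Something like this, run early at boot (untested sketch - the PCI addresses and firmware path are placeholders):

Code:
#!/bin/sh
# untested sketch: hostboot both SAS2308s with lsirec, then reload the driver
echo 16 > /proc/sys/vm/nr_hugepages          # lsirec stages the firmware in hugepages
rmmod mpt3sas 2>/dev/null
for dev in 0000:05:00.0 0000:06:00.0; do     # placeholder PCI addresses
    lsirec "$dev" unbind
    lsirec "$dev" halt
    lsirec "$dev" hostboot /usr/lib/firmware/9205-8e-P20.bin
done
modprobe mpt3sas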
 
  • Like
Reactions: Sleyk

Dave Corder

Active Member
Dec 21, 2015
I spent some more time after the kids went to bed tinkering. I had varying amounts of success reading and writing the SBRs and hostbooting the controllers with the 9205-8e P20 firmware. I was able to poke around with sas2flsh and megarec and lsiutil from a FreeDOS USB stick.

One of the first steps in most cross-flashing guides is to wipe the flash with megarec -cleanflash 0. That resulted in this error:

(photo attached: PXL_20210731_035940600.jpg, showing the megarec error)

CFI Query failed
Flash is Not Programmable

After much googling (to no avail), looking at the card, and scratching my head, I have come to the conclusion that the flash is not programmable because...it doesn't exist.

I'm not sure if the LSI 9xxx controllers have flash built into the chips themselves, or if the flash is external. But if you look at the pictures of this board, there are no chips anywhere that are actual flash memory chips (the SBRs are stored on a pair of 512KB two-wire serial memory SOIC8 chips). Compare that to other LSI 9xxx boards I've taken a look at, which all seem to have flash chips on them, usually right next to the controller package itself. I think it's either that the external flash is missing entirely, or there's a flag or fuse set that made the internal flash read-only (which, if that's actually the case, may be unlockable if someone were to find the right magic command for lsiutil or megaoem or something).

That would explain why @Sleyk and I kept dead-ending at flash write errors, and explains how DataDirect Networks made these cards proprietary - the LSI controllers are just space heaters unless you have their software and/or drivers that hostboot the card with a firmware image on bootup. After the controllers are hostbooted, they seem to be as functional as any other LSI HBA.

On Linux, you could certainly use lsirec to hostboot the controllers. I was hoping to use this card in my FreeNAS box, though, so that's probably not an option. At the very least, I think FreeNAS would be trying to online the ZFS pools from the disks before anything would be able to run and hostboot the cards (but I may be wrong...I might have to pay more attention to the boot sequence the next time I reboot my FreeNAS box).

For both Linux and FreeBSD, though, I suppose you could write a kernel module that mainly copies-and-pastes the hostboot code from lsirec. That might be an interesting academic exercise for anyone who has more free time than myself...

Ah, well...that was fun looking into regardless of the outcome.
 

Dave Corder

Active Member
Dec 21, 2015
Well, I was kinda bored today (and procrastinating housework), and my brain has been chewing on this in the background for a day or two, and I was able to come up with a (probably not terribly reliable - yet) way to host boot the LSI controllers on this card during system bootup.

Ta-da:

Code:
[   11.844173] mpt3sas version 36.100.00.00 loaded
[   11.846923] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (3904408 kB)
[   11.847518] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[   11.847572] mpt2sas_cm0: MSI-X vectors supported: 16
[   11.847574] mpt2sas_cm0:  0 4
[   11.847761] mpt2sas_cm0: High IOPs queues : disabled
[   11.847762] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 37
[   11.847763] mpt2sas0-msix1: PCI-MSI-X enabled: IRQ 38
[   11.847764] mpt2sas0-msix2: PCI-MSI-X enabled: IRQ 39
[   11.847765] mpt2sas0-msix3: PCI-MSI-X enabled: IRQ 40
[   11.847766] mpt2sas_cm0: iomem(0x00000000fbb40000), mapped(0x0000000075e16a13), size(65536)
[   11.847768] mpt2sas_cm0: ioport(0x000000000000e000), size(256)
[   11.848704] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[   11.848972] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[   11.849261] mpt2sas_cm0: request pool(0x00000000b5d6e163) - dma(0x121400000): depth(10368), frame_size(128), pool_size(1296 kB)
[   11.937527] mpt2sas_cm0: sense pool(0x00000000b6b62977)- dma(0x122a00000): depth(10107),element_size(96), pool_size(947 kB)
[   11.937605] mpt2sas_cm0: config page(0x0000000053b9c1ca) - dma(0x122943000): size(512)
[   11.937607] mpt2sas_cm0: Allocated physical memory: size(7454 kB)
[   11.937608] mpt2sas_cm0: Current Controller Queue Depth(10104),Max Controller Queue Depth(10240)
[   11.937609] mpt2sas_cm0: Scatter Gather Elements per IO(128)
[   11.938284] mpt2sas_cm0: overriding NVDATA EEDPTagMode setting
[   11.938710] mpt2sas_cm0: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[   11.938712] mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[   11.940633] mpt2sas_cm0: sending port enable !!
[   11.940906] mpt2sas_cm1: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (3904408 kB)
[   11.941678] mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k
[   11.941757] mpt2sas_cm1: MSI-X vectors supported: 16
[   11.941759] mpt2sas_cm1:  0 4
[   11.941985] mpt2sas_cm1: High IOPs queues : disabled
[   11.941986] mpt2sas1-msix0: PCI-MSI-X enabled: IRQ 41
[   11.941987] mpt2sas1-msix1: PCI-MSI-X enabled: IRQ 42
[   11.941988] mpt2sas1-msix2: PCI-MSI-X enabled: IRQ 43
[   11.941988] mpt2sas1-msix3: PCI-MSI-X enabled: IRQ 44
[   11.941989] mpt2sas_cm1: iomem(0x00000000fb940000), mapped(0x0000000062cc6a10), size(65536)
[   11.941992] mpt2sas_cm1: ioport(0x000000000000d000), size(256)
[   11.943231] mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k
[   11.943554] mpt2sas_cm1: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[   11.943846] mpt2sas_cm1: request pool(0x000000003c09a58c) - dma(0x123c00000): depth(10368), frame_size(128), pool_size(1296 kB)
[   12.031758] mpt2sas_cm1: sense pool(0x00000000fc8a2da0)- dma(0x125300000): depth(10107),element_size(96), pool_size(947 kB)
[   12.031843] mpt2sas_cm1: config page(0x0000000047fba6aa) - dma(0x125251000): size(512)
[   12.031845] mpt2sas_cm1: Allocated physical memory: size(7454 kB)
[   12.031846] mpt2sas_cm1: Current Controller Queue Depth(10104),Max Controller Queue Depth(10240)
[   12.031847] mpt2sas_cm1: Scatter Gather Elements per IO(128)
[   12.033932] mpt2sas_cm1: overriding NVDATA EEDPTagMode setting
[   12.034374] mpt2sas_cm1: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[   12.034377] mpt2sas_cm1: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[   12.036212] mpt2sas_cm1: sending port enable !!
[   14.476642] mpt2sas_cm0: hba_port entry: 00000000e50d7e4f, port: 255 is added to hba_port list
[   14.480323] mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x5000000080000000), phys(8)
[   14.621692] mpt2sas_cm1: hba_port entry: 00000000dcdc90fb, port: 255 is added to hba_port list
[   14.623682] mpt2sas_cm1: host_add: handle(0x0001), sas_addr(0x5000000080000000), phys(8)
[   19.610008] mpt2sas_cm0: port enable: SUCCESS
[   19.755025] mpt2sas_cm1: port enable: SUCCESS
Anyone interested in the details?
 

firworks

Member
May 7, 2021
If these could be made to reliably boot and work, they'd be fantastic for 1U servers. I've got two that are sorely lacking in PCIe slots, and getting both high-speed networking and DAS in one slot would be pretty great.
 
  • Like
Reactions: Samir

Dave Corder

Active Member
Dec 21, 2015
Life got busy last year and I never did get around to finishing the little how-to I was planning based on my last post. But, I've recently picked up this little side project again and have a few things of note.

One, since it's been so long since I last touched this, I am working on re-creating my test bench and the Linux module-loading trickery I did earlier to get the card to host boot under normal Linux operations. I am switching my test bench from Fedora to Ubuntu and I almost have it working. I have the main pieces in place, but I ran into an issue last night with not being able to actually host boot the card. Re-examining this thread, though, I think it's just that I was trying to boot the 9206-16e firmware instead of the 9205-8e firmware, so I'll make that tweak and try again tonight. There may be a further issue with the SAS addresses, so I have a to-do item to look at that as well once I get the host boot part working.

Two, I gave the card a whirl in TrueNAS Core 12 - I had noticed some messages in the boot log about checking LSI controller firmware, so I was curious if maybe its FreeBSD base had something to try to hostboot the card if it found invalid firmware or something, but sadly it would just hang with a message about the controller being in a reset state. So no-go there.

Three, TrueNAS Scale is based on Debian Linux, so if I can get the card to hostboot in Ubuntu, I am cautiously optimistic that I could make it work in TrueNAS Scale as well. I intend to try this out soon. I feel like this, if possible, would make this snowflake card actually useful (especially for me, since I currently have a pair of 9206-16e controllers and a two-port ConnectX-3 card in the Dell R720 I run TrueNAS on, so I could potentially switch to TrueNAS Scale, swap in the cards and swap over the network and DAS cables, and be up and running in no time...this is kind of an exciting possibility to me). In fact, I just bought a second card on eBay just in case this crazy scheme actually works...

More to come...
 
  • Like
Reactions: Sleyk

Dave Corder

Active Member
Dec 21, 2015
Ok, here's a very rough version. I'm waiting on an SFF-8644 to SFF-8482 cable to arrive so I can actually test this with some drives. Right now I can only confirm that the mpt3sas kernel module loads and sees the two controllers on the card, but I would appreciate any feedback if anyone wants to be a guinea pig...

These commands all need to be performed as root. This is currently tested on Ubuntu 20.04.1 with kernel 5.13.0-28-generic.

The basic premise is to have a script that utilizes lsirec to perform a hostboot of each controller, and then hook into the kernel module loading mechanism to run that script for each LSI controller found when the mpt3sas module is loaded.

Code:
# extract files
mkdir ddn-hostboot
cd ddn-hostboot

# replace with the path to the file attached to this post
tar xzf ~/Downloads/ddn-hostboot.tgz

# back to the parent dir - the later steps reference the files as ddn-hostboot/<file>
cd ..
Code:
# set up hugepages (lsirec needs them to stage the firmware image for hostboot)
echo "vm.nr_hugepages=16" >> /etc/sysctl.conf
sysctl -p
Code:
# copy firmware
cp ddn-hostboot/9205-8e.bin /usr/lib/firmware/9205-8e-P20.bin
Code:
# install tools necessary to build lsirec
apt install make gcc

# fetch and build lsirec
wget https://github.com/marcan/lsirec/archive/master.zip
unzip master.zip
cd lsirec-master
make
# no 'make install', so just copy...
cp -p lsirec /usr/sbin/
Code:
# scripts and modprobe conf file
cp ddn-hostboot/ddn-hostboot.sh /usr/sbin/
# edit: make script executable
chmod +x /usr/sbin/ddn-hostboot.sh
cp ddn-hostboot/mpt3sas.conf /etc/modprobe.d/
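If you just want to see the shape of things before running anything, the two files boil down to roughly this (simplified sketch - use the attached versions for real):

Code:
# /etc/modprobe.d/mpt3sas.conf - run the hostboot script, then load the real module
install mpt3sas /usr/sbin/ddn-hostboot.sh; /sbin/modprobe --ignore-install mpt3sas
Code:
#!/bin/sh
# /usr/sbin/ddn-hostboot.sh (simplified) - hostboot every LSI controller found
FW=/usr/lib/firmware/9205-8e-P20.bin
for dev in $(lspci -Dn -d 1000: | awk '{print $1}'); do   # every LSI (vendor 1000) device
    lsirec "$dev" unbind
    lsirec "$dev" halt
    lsirec "$dev" hostboot "$FW"
done
# (the attached script - especially the newer version - also checks the IOC state,
#  so it only touches controllers stuck in RESET and leaves other LSI cards alone)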
Code:
# rebuild module dependency information
depmod
Code:
# unload and reload the mpt3sas module
rmmod mpt3sas
modprobe mpt3sas
You should see something like this during the last step:

Code:
root@workbench-01:~# modprobe mpt3sas
Trying unlock in MPT mode...
Device in MPT mode
Device in MPT mode
Resetting adapter in HCB mode...
Trying unlock in MPT mode...
Device in MPT mode
IOC is RESET
Device in MPT mode
Resetting adapter in HCB mode...
Trying unlock in MPT mode...
Device in MPT mode
IOC is RESET
Setting up HCB...
HCDW virtual: 0x7f4b25800000
HCDW physical: 0x117000000
Loading firmware...
Loaded 809340 bytes
Booting IOC...
IOC is READY
IOC Host Boot successful.
Trying unlock in MPT mode...
Device in MPT mode
Device in MPT mode
Resetting adapter in HCB mode...
Trying unlock in MPT mode...
Device in MPT mode
IOC is RESET
Device in MPT mode
Resetting adapter in HCB mode...
Trying unlock in MPT mode...
Device in MPT mode
IOC is RESET
Setting up HCB...
HCDW virtual: 0x7ff83f000000
HCDW physical: 0x117000000
Loading firmware...
Loaded 809340 bytes
Booting IOC...
IOC is READY
IOC Host Boot successful.
If all goes well, you should see some stuff like this in the output of dmesg:

Code:
[  202.416975] mpt3sas version 38.100.00.00 loaded
[  202.417501] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (3905372 kB)
[  202.460042] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[  202.460060] mpt2sas_cm0: MSI-X vectors supported: 16
[  202.460062]      no of cores: 4, max_msix_vectors: -1
[  202.460064] mpt2sas_cm0:  0 4
[  202.460225] mpt2sas_cm0: High IOPs queues : disabled
[  202.460227] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 43
[  202.460229] mpt2sas0-msix1: PCI-MSI-X enabled: IRQ 44
[  202.460231] mpt2sas0-msix2: PCI-MSI-X enabled: IRQ 45
[  202.460232] mpt2sas0-msix3: PCI-MSI-X enabled: IRQ 46
[  202.460233] mpt2sas_cm0: iomem(0x00000000fbb40000), mapped(0x00000000d6ad4e57), size(65536)
[  202.460238] mpt2sas_cm0: ioport(0x000000000000e000), size(256)
[  202.513029] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[  202.540464] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[  202.541023] mpt2sas_cm0: request pool(0x00000000b915e915) - dma(0x32200000): depth(10368), frame_size(128), pool_size(1296 kB)
[  202.633830] mpt2sas_cm0: sense pool(0x0000000095b8810f) - dma(0x48d00000): depth(10107), element_size(96), pool_size (947 kB)
[  202.633837] mpt2sas_cm0: sense pool(0x0000000095b8810f)- dma(0x48d00000): depth(10107),element_size(96), pool_size(0 kB)
[  202.634076] mpt2sas_cm0: reply pool(0x000000000b8bfe78) - dma(0x48e00000): depth(10432), frame_size(128), pool_size(1304 kB)
[  202.634088] mpt2sas_cm0: config page(0x000000005bec4d55) - dma(0x48cea000): size(512)
[  202.634089] mpt2sas_cm0: Allocated physical memory: size(23190 kB)
[  202.634090] mpt2sas_cm0: Current Controller Queue Depth(10104),Max Controller Queue Depth(10240)
[  202.634091] mpt2sas_cm0: Scatter Gather Elements per IO(128)
[  202.678564] mpt2sas_cm0: overriding NVDATA EEDPTagMode setting
[  202.678982] mpt2sas_cm0: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[  202.678991] mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[  202.679070] scsi host0: Fusion MPT SAS Host
[  202.682524] mpt2sas_cm0: sending port enable !!
[  202.682897] mpt2sas_cm1: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (3905372 kB)
[  202.737339] mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k
[  202.737353] mpt2sas_cm1: MSI-X vectors supported: 16
[  202.737354]      no of cores: 4, max_msix_vectors: -1
[  202.737355] mpt2sas_cm1:  0 4
[  202.737466] mpt2sas_cm1: High IOPs queues : disabled
[  202.737467] mpt2sas1-msix0: PCI-MSI-X enabled: IRQ 47
[  202.737468] mpt2sas1-msix1: PCI-MSI-X enabled: IRQ 48
[  202.737469] mpt2sas1-msix2: PCI-MSI-X enabled: IRQ 49
[  202.737470] mpt2sas1-msix3: PCI-MSI-X enabled: IRQ 50
[  202.737471] mpt2sas_cm1: iomem(0x00000000fb940000), mapped(0x00000000d7a4db15), size(65536)
[  202.737473] mpt2sas_cm1: ioport(0x000000000000d000), size(256)
[  202.791878] mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k
[  202.815836] mpt2sas_cm1: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[  202.816364] mpt2sas_cm1: request pool(0x000000000e12c7bb) - dma(0x49e00000): depth(10368), frame_size(128), pool_size(1296 kB)
[  202.907466] mpt2sas_cm1: sense pool(0x00000000f9fc16f6) - dma(0x4b500000): depth(10107), element_size(96), pool_size (947 kB)
[  202.907473] mpt2sas_cm1: sense pool(0x00000000f9fc16f6)- dma(0x4b500000): depth(10107),element_size(96), pool_size(0 kB)
[  202.907710] mpt2sas_cm1: reply pool(0x0000000043703275) - dma(0x4b600000): depth(10432), frame_size(128), pool_size(1304 kB)
[  202.907723] mpt2sas_cm1: config page(0x000000004caf29f6) - dma(0x4b498000): size(512)
[  202.907724] mpt2sas_cm1: Allocated physical memory: size(23190 kB)
[  202.907725] mpt2sas_cm1: Current Controller Queue Depth(10104),Max Controller Queue Depth(10240)
[  202.907726] mpt2sas_cm1: Scatter Gather Elements per IO(128)
[  202.948656] mpt2sas_cm1: overriding NVDATA EEDPTagMode setting
[  202.949063] mpt2sas_cm1: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[  202.949080] mpt2sas_cm1: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[  202.949167] scsi host10: Fusion MPT SAS Host
[  202.951882] mpt2sas_cm1: sending port enable !!
[  205.217418] mpt2sas_cm0: hba_port entry: 00000000281db942, port: 255 is added to hba_port list
[  205.222079] mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x5000000080000000), phys(8)
[  205.459446] mpt2sas_cm1: hba_port entry: 0000000028266427, port: 255 is added to hba_port list
[  205.461168] mpt2sas_cm1: host_add: handle(0x0001), sas_addr(0x5000000080000000), phys(8)
[  210.354285] mpt2sas_cm0: port enable: SUCCESS
[  210.594282] mpt2sas_cm1: port enable: SUCCESS
Three known issues:
* SAS addresses are not set (this will probably require creating a custom SBR for each controller with the SAS address embedded in it)
* The mpt3sas module must be unloaded and reloaded after boot (fixing this will require building a new initrd image that incorporates the same files as above - see the rough sketch below)
* The ddn-hostboot.sh script originally attempted to halt and hostboot any LSI SAS controller installed; if you have multiple LSI cards, you can manually edit the script with your card's controller PCI addresses to avoid this. A new version of ddn-hostboot.sh is now attached that addresses the issue - it will only attempt to hostboot controllers that are in a RESET state.
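
For the second issue, on Ubuntu the idea would be an initramfs-tools hook that copies the same pieces into the initrd - something like this very rough, untested sketch (not part of the attachment):

Code:
#!/bin/sh
# /etc/initramfs-tools/hooks/ddn-hostboot (untested idea)
PREREQ=""
prereqs() { echo "$PREREQ"; }
case "$1" in prereqs) prereqs; exit 0;; esac

. /usr/share/initramfs-tools/hook-functions
copy_exec /usr/sbin/lsirec /usr/sbin
copy_exec /usr/sbin/ddn-hostboot.sh /usr/sbin
copy_file firmware /usr/lib/firmware/9205-8e-P20.bin
copy_file config /etc/modprobe.d/mpt3sas.conf
...followed by update-initramfs -u. The hostboot script would also need to set vm.nr_hugepages itself, since sysctl.conf hasn't been applied that early in boot.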
 

Attachments


Dave Corder

Active Member
Dec 21, 2015
I'll get this into GitHub or something in the next few days once I work out the three known issues.

Many thanks to @Sleyk for the initial work discovering what firmware will boot on these cards and to @fohdeesha for his work on the Dell Mini RAID controller cross-flashing, which is where most of the utils and tricks used to hostboot the card from userland Linux came from.
 

Dave Corder

Active Member
Dec 21, 2015
I am thinking the solution to the SAS address issue might be to embed the addresses in the SBRs. I played around with that a bit tonight, but my first attempt didn't work. That might be due to the SAS address position possibly being different in the SAS2208/SAS2308 SBR versus the SAS2008/SAS2108 SBR (hence my post in the SBR thread...). Or it may be that the 9205-8e-P20 firmware doesn't use the SAS address from the SBR and instead only pulls it from flash (which these cards don't have, hence the whole need for this process in the first place).
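
For reference, what I tried was the sbrtool.py that ships with lsirec, roughly like this (PCI address and SAS address are placeholders; as noted above, this hasn't actually worked on these cards yet):

Code:
./lsirec 0000:05:00.0 readsbr sbr0.bin
python3 sbrtool.py parse sbr0.bin sbr0.cfg
# edit sbr0.cfg and set a SASAddr = 0x5000XXXXXXXXXXXX line (placeholder address;
# whether the SAS2308 SBR even carries/honors this field is the open question)
python3 sbrtool.py build sbr0.cfg sbr0_new.bin
./lsirec 0000:05:00.0 writesbr sbr0_new.bin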
 
  • Like
Reactions: Sleyk

Dave Corder

Active Member
Dec 21, 2015
296
192
43
41
New version of ddn-hostboot.sh script attached to my prior post that should address an issue in systems with multiple LSI cards - it will only attempt to hostboot SAS2308 controllers that are in a RESET state.
 

Dave Corder

Active Member
Dec 21, 2015
@Dave Corder does the VPI NIC work with this script? Can I use IB?

Also, if you can help me understand: the four external SAS ports are for connecting drives, and the LSI controller will manage RAID and provide it to the network - fine. But where do I configure the RAID in Linux? How do I configure the server daemon?
I have yet to test the NIC portion - from prior reports, though, it appeared as a regular ConnectX-3 NIC, so I would assume it supports everything a standalone NIC would. I can test the Ethernet functionality later, but as for IB support I don't have the gear to test that personally. I can pull the mlx diagnostic info later.

The LSI chips are not RAID controllers, just SAS HBAs. You would need to use some sort of software RAID on top of it (potentially unRAID, maybe TrueNAS SCALE, or roll your own ZFS setup).
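
For example, a bare-bones ZFS setup on top of the HBAs would just be something like this (hypothetical disk names - use your own /dev/disk/by-id paths):

Code:
# hypothetical example: 4 disks in a RAIDZ2 pool
zpool create tank raidz2 \
    /dev/disk/by-id/scsi-35000c500aaaa0001 /dev/disk/by-id/scsi-35000c500aaaa0002 \
    /dev/disk/by-id/scsi-35000c500aaaa0003 /dev/disk/by-id/scsi-35000c500aaaa0004
zfs create tank/data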
 

Dave Corder

Active Member
Dec 21, 2015
Hi @Dave Corder!
Really intrigued about your work with this card!
Wondering if there was any progress or updates?
Wish I had something to report, but life has been hectic and this little experiment has been back-burnered for a while. I will definitely post in this thread if/when I have any updates.
 

R30730

New Member
Aug 20, 2020
Wish I had something to report, but life has been hectic and this little experiment has been back-burnered for a while. I will definitely post in this thread if/when I have any updates.
Thanks for the update! Will stay tuned!
 

richx

New Member
May 1, 2023
Some notes from playing with a couple of these cards. Each card pulled 37W at idle and the LSI chips run hot; I wouldn't run them without a nearby fan in any long-term setup.

Manually ran the following sequence of commands from the posted ddn-hostboot.sh script (lspci -nn shows this particular system has the LSI chips at 0000:05:00.0 and 0000:06:00.0):
Code:
# echo 16 > /proc/sys/vm/nr_hugepages
# rmmod mpt3sas
# ./lsirec 0000:05:00.0 unbind
# ./lsirec 0000:05:00.0 halt
# ./lsirec 0000:05:00.0 hostboot ./9205-8e.bin
# ./lsirec 0000:06:00.0 unbind
# ./lsirec 0000:06:00.0 halt
# ./lsirec 0000:06:00.0 hostboot ./9205-8e.bin
# modprobe mpt3sas
In BIOS/UEFI, had to disable VT-d/IOMMU to stop the lsirec hostboot command from failing with "IOC failed to become ready", which kind of sucks.

The external SAS ports appeared to work fine with a 1 meter SFF-8644 to 4X SFF-8482 cable testing both SATA and SAS drives. However, the rear SAS LEDs didn’t light up.

Spent much more time on the networking side. It looks like a Mellanox ConnectX-3 MCX353A-QCBT: 40Gb InfiniBand (IB) and 10Gb Ethernet (10GbE). The IP-over-InfiniBand (IPoIB) kernel module can be used to get faster-than-10Gb speeds, but with higher CPU usage and slower than native IB; modprobe ib_ipoib is needed before the network device shows up.
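
Bringing up IPoIB looked roughly like this (interface name and address are examples; with two cards cabled back-to-back, a subnet manager such as opensm has to be running on one host for the IB link to go active):

Code:
modprobe ib_ipoib
ip link                                    # an ib0 interface should appear
ip addr add 10.0.0.1/24 dev ib0            # example address
ip link set ib0 up
echo connected > /sys/class/net/ib0/mode   # optional: connected mode + larger MTU usually helps throughput
ip link set ib0 mtu 65520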

Found out that a QSFP+ to SFP+ splitter cable does not work with the Mellanox ConnectX-3, so no 4x 10GbE, only one of the SFP+ ends was active. The two cards were connected with a 5 meter QSFP+ Twinax cable.

The Mellanox is stuck at PCIe 2.0 speed as reported by the kernel. It could be due to a firmware INI setting (see below; didn't try changing it). The PCIe switch chip on the card should support PCIe 3.0, but there might be other reasons it's limited to 2.0.
Code:
mlx4_core 0000:04:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x8 link at 0000:03:00.0 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
Using IB, qperf -v 10.0.0.1 rc_bw performed close to the 32Gb/s limit:
Code:
rc_bw:
    bw              =  3.28 GB/sec
    msg_rate        =    50 K/sec
    send_cost       =   103 ms/GB
    recv_cost       =   107 ms/GB
    send_cpus_used  =    34 % cpus
    recv_cpus_used  =    35 % cpus
With TCP/IP over IB using the IPoIB module, the best I could get was ~2.83GB/sec (~24.4Gb/sec) using iperf2, but that depended on the CPU/system tested; with the slower system on one side it was only ~1.65GB/sec.
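
The iperf2 runs were just the usual server/client pair, e.g. (parallel stream count is an example):

Code:
# on one host
iperf -s
# on the other
iperf -c 10.0.0.1 -P 4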

The ibstat command is nice for quick stats, and ethtool (plus ethtool -i for driver info) checks link speed, whether using IPoIB or native Ethernet. The cards default to IB; switching the Mellanox from IB to Ethernet mode:
Code:
echo eth > /sys/bus/pci/devices/0000:04:00.0/mlx4_port1
The Mellanox cards have nice firmware tools available: GitHub - Mellanox/mstflint: Mstflint - an open source version of MFT (Mellanox Firmware Tools)
Rebuilding firmware BIN file with custom INI: GitHub - BeTeP-STH/mft-scripts: Mellanox firmware files and MFT related scripts
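
For anyone poking at the firmware themselves, the relevant mstflint commands are roughly these (PCI address from this system; back up the original image first):

Code:
mstflint -d 04:00.0 query                 # current firmware / PSID info
mstflint -d 04:00.0 ri ddn_orig_fw.bin    # read back (back up) the original card image
mstflint -d 04:00.0 dc ddn_orig_fw.ini    # dump the firmware configuration (INI)
# burning an official/rebuilt image; --allow_psid_change is likely needed since the card has a custom DDN PSID
mstflint -d 04:00.0 -i new_fw.bin --allow_psid_change burn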

The plan was to try to get 40GbE working using the latest 2.42.50xx FCBT firmware, but even with the 2.40.7000 10GbE QCBT firmware, the QSFP+ port link light would never come on and ibstat reported link down with any of the official Mellanox firmwares; tried both single-port MCX353A and dual-port MCX354A versions. The cards ship with the same 2.40.7000 firmware version, but there must be some customization to the firmware for these.

The official firmware did add the FlexBoot BIOS. Querying with the official firmware also appeared to add the LINK_TYPE_P1 config option, but changing it never worked.
Code:
mstconfig -d 04:00.0 s LINK_TYPE_P1=ETH   ## "-E- Device doesn't support LINK_TYPE_P1 configuration"
The cards still appeared to run (without link) with the FCBT firmware using the following INI settings, which clock the Mellanox faster; it looks like the chips are Rev 1 / A1 (the original INI has Name = DDN_SFA12K_A1):
Code:
core_f = 60
core_r = 14 
en_427_mhz = true
It would be nice to know why the card has an inactive QSFP+ port with any of the official firmwares. It does not appear to be INI related; tried changing various settings, except for obvious things like GPIO. It looks like the original card INI should have worked.