$67 DDR Infiniband on Windows - 1,920MB/S and 43K IOPS


dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
I purchased an LSI HBA on eBay but was shipped a Mellanox Infiniband card instead. I'm returning it, but not before giving it a test run. I got some nice results, especially considering the price.

The Infiniband card was listed on eBay for $67 including shipping, so I didn't expect much. It is a dual-port Mellanox card, DDR speed (20Gbit), and it says it's a ConnectX-2 VPI, but it's not a current Mellanox part number. It exists on the Mellanox site, but just barely. The part number, by the way, is MHRH2A-XSR. Does anyone know anything about this card?

I installed the card in a Dell c6100 node running Windows 2008R2 using the Mellanox Ethernet driver and 100% default settings. The card shows up as a 16Gbit Ethernet card - that's 20 Infiniband Gbits minus the overhead of 8b/10b encoding. 40Gbit QDR cards show up as 32Gbit when put into Ethernet mode, so this looks normal so far. After install, I assigned the card an IP address, plugged just one of the ports into my Infiniband network, and spun up a RAM disk using StarWind software. The RAM disk is capable of something like 7GB/s, so it won't be a bottleneck.
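For anyone who wants to check that arithmetic, here's a quick Python sketch using only the figures above - nothing card-specific, just the 8b/10b overhead:

Code:
# 8b/10b encoding means only 80% of the signaling rate carries data.
for name, signaling_gbit in [("DDR", 20), ("QDR", 40)]:
    usable_gbit = signaling_gbit * 8 / 10
    print(f"{name}: {signaling_gbit} Gbit signaling -> {usable_gbit:.0f} Gbit usable (~{usable_gbit / 8:.1f} GB/s)")
# DDR: 20 Gbit signaling -> 16 Gbit usable (~2.0 GB/s)
# QDR: 40 Gbit signaling -> 32 Gbit usable (~4.0 GB/s)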

I then used simple Windows file sharing to mount the RAM disk on another c6100 node that is connected to the Infiniband network with a faster QDR card.

With the above in place, I ran IOMeter on the QDR node across the IB/IP network over to the DDR node. I tested 1MB random reads and 4KB random reads with one worker and a queue depth of 32. I also tested writes, but those stress StarWind, not the IB card.

Surprise: Throughput was 1,920MB/Second. Maximum IOPS was 43,200. I really did not expect Windows 2008R2 to do so well with single-port Infiniband DDR, especially with a card that is not a current model. Perhaps I should test dual ports on Windows 2012 where, hopefully, SMB3 will utilize both channels and saturate the PCIe2 bus to at least 3.4GB/s.
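To put those numbers side by side, here's my back-of-the-envelope math in Python - the PCIe figures are just the usual published rates, not something I measured:

Code:
# Throughput and IOPS are two views of the same result.
large_mb_s = 1920               # 1MB random reads
small_iops = 43200              # 4KB random reads
print("1MB reads imply roughly", large_mb_s, "IOPS")            # ~1,920 IOPS
print("4KB reads move roughly", small_iops * 4 / 1024, "MB/s")  # ~169 MB/s
# PCIe 2.0 x8: 8 lanes * 5 GT/s * 8b/10b = 32 Gbit/s = 4 GB/s raw,
# so roughly 3.2-3.4 GB/s is a realistic ceiling for a dual-port test in this slot.
print("PCIe 2.0 x8 raw:", 8 * 5 * 8 / 10 / 8, "GB/s")           # 4.0 GB/s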
 
Last edited:

renderfarmer

Member
Feb 22, 2013
249
1
18
New Jersey
Sweet! Thanks for the info.

I tried testing my SDR bandwidth using Iometer's dynamo yesterday and I couldn't break 600MB/s. I'll give your RAM disk method a try.
 

renderfarmer

Member
Feb 22, 2013
249
1
18
New Jersey
The RAM disk method you outlined wouldn't yield more than 680MB/s between my workstation and file server, and 535MB/s between one of my render nodes and the file server. All are running the same InfiniHost III HCAs through a Topspin 90 switch on Win2008R2.

I'm expecting my ConnectX-3 EN 40GbE HCAs on Tuesday. They'll be direct-connecting my workstation and file server. I'll post those results when I have them.
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
Sweet! Thanks for the info.

I tried testing my SDR bandwidth using Iometer's dynamo yesterday and I couldn't break 600MB/s. I'll give your RAM disk method a try.
That's quite interesting. Given my results, I would have expected 950MB/S easily. You say that you are using InfiniHost III? That's one vote against that model of card.

For your new cards, why did you choose $1,600 worth of ConnectX-3 cards over a pair of ConnectX-2 cards for $250? It seems like a lot to spend for perhaps a 20% speed bump. Anyway, I'm anxious to see how your new cards test out. When you do post, include lots of test details and I'll try to run the same tests on ConnectX-2 on PCIe2. Maybe PCIe3 will impress once again.
 
Last edited:

renderfarmer

Member
Feb 22, 2013
249
1
18
New Jersey
Thanks. Yeah, I assumed it was my drive arrays holding back the speed all of this time but now it seems that wasn't the case.

As for the ConnectX-3 EN cards, they were "only" $477ea (MCX313A-BCBT). As for why I chose those cards and not second-hand ConnectX-2 cards: I'm migrating back to Ethernet.

I'll keep you posted.
 

Jeggs101

Well-Known Member
Dec 29, 2010
1,529
241
63
Question: Can you connect a QDR card to a DDR or SDR card directly?
 

Chuckleb

Moderator
Mar 5, 2013
1,017
331
83
Minnesota
This should work as long as you have the right cables. Most SDR and DDR cards used CX4 connectors. The link should sync at the highest speed that both ends support.
 

RimBlock

Active Member
Sep 18, 2011
837
28
28
Singapore
My ConnectX-2 QDR card is connecting to my ConnectX DDR cards via a DDR switch and running at 20Gbps (DDR) without any issues or special configuration.

I just used a QSFP+ -> CX4 cable.

I cannot see why a direct connect would be any different.

RB
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
I tried a very quick throughput test using both IB ports on the $67 card. While the single port pulled 1,920MB/S, both ports combined were good for... 2,247MB/S. I was hoping for something closer to 3GB/S.

UPDATE: The RAM disk may be the bottleneck. With two RAM disks running and testing locally, I'm getting too little throughput to rule it out. This is surprising since in other configurations I saw no such bottleneck from StarWind. So, for now, I have no conclusion regarding the maximum throughput of the card when using both ports.

UPDATED UPDATE: I was able to get around the StarWind limitation by spinning up multiple RAM disks on the same system. On a Dell c6100 with dual 5520 CPUs and 8x8GB RAM I saw 7,450MB/S in 1MB random reads when testing the RAM disks using a local copy of IOMeter - that's more than enough to test even an FDR Infiniband card.



renderfarmer

Member
Feb 22, 2013
249
1
18
New Jersey
I tried a very quick throughput test using both IB ports on the $67 card. While the single port pulled 1,920MB/S, both ports combined were good for... 2,247MB/S. I was hoping for something closer to 3GB/S.

UPDATE: The RAM disk may be the bottleneck. With two RAM disks running and testing locally, I'm getting too little throughput to rule it out. This is surprising since in other configurations I saw no such bottleneck from StarWind. So, for now, I have no conclusion regarding the maximum throughput of the card when using both ports.
Hmm. Well that sucks. I was hoping this would be a great way of testing my 40GbE cards. I'll contact Mellanox and ask them directly what they recommend.
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
I know from earlier tests on other configurations that I can get far more bandwidth from a RAM disk than any QDR card could handle. I'm not sure what's happening here - it might be that StarWind loses performance when you spin up multiple drives or it might be something more complex than that. I'll find out eventually.

Hmm. Well that sucks. I was hoping this would be a great way of testing my 40GbE cards. I'll contact Mellanox and ask them directly what they recommend.
 

renderfarmer

Member
Feb 22, 2013
249
1
18
New Jersey
I got my 40GbE cards and hooked them up with a Mellanox FDR optical cable, which Mellanox told me would work. Indeed it does.

The card appears in Device Manager as "Ethernet controller" prior to driver installation, and after the drivers are installed it seems to function as a purebred Ethernet controller. I was able to bridge a 1GbE port to the 40GbE port in Win2008R2 without a hitch.

When installing the drivers I deselected the "run as 10GbE" option. In Task Manager it appears as 40Gbps. So far so good.

I spun up a RAM drive as before but could only get 1.37GB/s...

I looked inside the driver properties and there are a boatload of performance-tuning options in there. I'll post on the Mellanox forums to see what they suggest for these settings and what they recommend for bandwidth testing.

If anyone here knows what the best settings are for a 40GbE controller, I'm all ears.
 

Smalldog

Member
Mar 18, 2013
62
2
8
Goodyear, AZ
dba-

I just want to say thanks for posting this information. I contacted the seller (pretty sure it was computersell11) and scooped up four at a decent price. I always love to get new gadgets to play around with and see if I can fit them into anything I do.

I did notice that someone else is selling the same card on eBay, but has it listed as a RAID controller. I made an offer, but they came back with some ridiculous amount for four cards - more than twice what I paid the other seller for four.

Again, thanks!
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
I found a workaround: create multiple RAM disks. On a Dell c6100 with 5520 CPUs, one RAM disk was good for a bit above 2,000MB/Second when tested using IOMeter running on the same machine as the RAM disks. If I spun up four RAM disks on the same machine and ran the tests again, I saw over 7,000MB/Second.
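If you're curious how well that scales, the arithmetic looks like this - round numbers taken from this post, nothing measured beyond what's written above:

Code:
single_disk_mb_s = 2000   # one StarWind RAM disk, tested locally
disks = 4
ideal = single_disk_mb_s * disks    # 8,000 MB/s if scaling were perfect
observed = 7000                     # "over 7,000MB/Second" with four disks
print(f"ideal {ideal} MB/s, observed >{observed} MB/s, scaling efficiency ~{observed / ideal:.0%}")
# roughly 88% scaling efficiency across four RAM disks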

Hmm. Well that sucks. I was hoping this would be a great way of testing my 40GbE cards. I'll contact Mellanox and ask them directly what they recommend.
 

renderfarmer

Member
Feb 22, 2013
249
1
18
New Jersey
Will give it a try. Still doesn't explain why I'm not getting close to your 2,000MB/s figure for one...

I posted the question on the Mellanox community forum. Waiting for a reply.
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
First try this series of tests:

1) Run the STREAM benchmark or SiSoft Sandra to see what your memory bandwidth is.
2) Spin up your RAM disks and test locally using IOMeter

Confirm that your raw memory bandwidth is huge (>100GB/S) and that your RAM disk bandwidth is significantly greater than that of the HBA you are looking to test.
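If you don't have STREAM or Sandra handy, a crude single-threaded copy test like the Python sketch below (it needs numpy) will at least give you an order-of-magnitude number - expect it to read well below what either tool reports, since it's just one core doing a simple copy:

Code:
import time
import numpy as np

# 512MB source buffer; each pass reads it and writes an equal-sized destination.
src = np.ones(512 * 1024 * 1024 // 8, dtype=np.float64)
dst = np.empty_like(src)

reps = 20
start = time.perf_counter()
for _ in range(reps):
    np.copyto(dst, src)
elapsed = time.perf_counter() - start

gb_moved = reps * 2 * src.nbytes / 1e9   # read + write per pass
print(f"~{gb_moved / elapsed:.1f} GB/s copy bandwidth (single-threaded)")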

Will give it a try. Still doesn't explain why I'm not getting close to your 2,000MB/s figure for one...

I posted the question on the Mellanox community forum. Waiting for a reply.
 
Last edited:

renderfarmer

Member
Feb 22, 2013
249
1
18
New Jersey
Thanks, dba. Using Sandra:


  1. I checked that both HCAs were running at PCIe 3.0 x8 and tested the memory:
  2. Benchmarked memory - Workstation: 75GB/s, File server: 67GB/s (maybe low because I'm using RDIMMs?)

I then ran Iometer on both machines for comparison (8M Sectors, 32 I/Os per target, 4MB Read):

Workstation: varied wildly from run to run, 1800MB/s-2300MB/s; it always starts out strong and tapers down.
File Server: Steady 2600MB/s

How big do you make your RAM disk, and how big do you set Iometer's Maximum Disk Size (in sectors)?

Sandra suggested I turn on Large Memory Pages. Is that advisable for my systems?
 

renderfarmer

Member
Feb 22, 2013
249
1
18
New Jersey
I just created a RAID-0 array in Windows using four 4GB RAM disks, ran Iometer on the array, and got a dismal 1,780MB/s on the file server that had been getting a pretty consistent 2400-2600MB/s on a single disk.

I don't know what gives.
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
It's not critical, but your memory bandwidth looks low for dual E5s. Perhaps you don't have DIMMs in all of the memory channels?

More importantly, was that 2,600MB/s for one RAM disk? If so then that looks OK, so try spinning up three more RAM disks on the same machine: Just use the StarWind UI to "Add Device" three more times. Try four 4096MB RAM disks and set IOMeter to 8,000,000 sectors.
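For reference, the sector math behind that suggestion - IOMeter's Maximum Disk Size is specified in sectors, and assuming the usual 512-byte sectors, 8,000,000 of them comfortably fits inside a 4096MB RAM disk:

Code:
sectors = 8_000_000
bytes_total = sectors * 512                 # assuming 512-byte sectors
print(bytes_total / 1e9, "GB")              # 4.096 GB
print(bytes_total / (1024 ** 2), "MiB")     # 3906.25 MiB, which fits in a 4096MB RAM disk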

Don't RAID the RAM disks - test them as four separate disks in IOMeter; you want to test throughput, not the efficiency of someone's RAID implementation. You can multi-select drives in IOMeter by control-clicking.

Large pages are a big deal for databases, but they should not impact your HBA testing significantly, if at all.

I recommend 1MB reads in IOMeter for basic throughput testing. Some systems will show slightly higher throughput with 512KB, 2MB, or 4MB reads, but 1MB is a good compromise that works well almost everywhere. With such large transfers, random versus sequential won't show very different behavior. If you have a specific use case in mind - say a RAID implementation that uses 64KB or 128KB chunks - run a second test using that transfer size.


Thanks, dba. Using Sandra:


  1. I checked that both HCAs were running at PCIe 3.0 x8 and tested the memory:
  2. Benchmarked memory - Workstation: 75GB/s, File server: 67GB/s (maybe low because I'm using RDIMMs?)

I then ran Iometer on both machines for comparison (8M Sectors, 32 I/Os per target, 4MB Read):

Workstation: varied wildly from run to run, 1800MB/s-2300MB/s; it always starts out strong and tapers down.
File Server: Steady 2600MB/s

How big do you make your RAM disk, and how big do you set Iometer's Maximum Disk Size (in sectors)?

Sandra suggested I turn on Large Memory Pages. Is that advisable for my systems?
 
Last edited: