LSI 9207-8i - Successor to 9211?


dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
I already have a Samsung 830 256GB, which is the only swap. At the moment it's too little and too slow, which is why I need more swap space and also faster swap.

Cheers
One last thing: If your Matlab code runs successfully now, with only 256GB of SWAP, then it is likely using your hard drive for temporary storage. If that is the case, then you have another performance knob to tweak: Your disk storage. It then makes sense to spend some, perhaps most, of your budget moving your data to SSD. Perhaps you could add a second Samsung 830 swap drive and also move your data to an SSD RAID?
 

normadize

New Member
Oct 11, 2012
6
0
0
Hi dba,

Thanks again for all the answers.

I was able to run one simulation with an SSD as swap by using all 32 GB of RAM plus about 90 GB of swap from the SSD. The other simulations require about 650 GB more RAM (and you don't want to know about the next one up). Using hard drives instead of SSDs for swap is a no go. I tried it and stopped it after a couple of days as it was barely advancing. With an SSD I was able to run it.

We can't afford more than around 1.5-2K for the whole shebang, so lots of actual RAM like you described is out of the question. Since we're going to reach 600+ GB of swap, SSDs in RAID are the only way I can think of to speed things up.

Do you have, or can you point me to, any comparative info on RAID 0 performance for HBA + mdadm (e.g. LSI 9207 IT) vs. h/w RAID (e.g. LSI 9207 IR) vs. true h/w RAID (e.g. LSI 9266)? I'll do tests of my own, but it would help to form a rough opinion first.

Too bad you can't remember the Linux tool to monitor swap paging. I'll see what I can find. I'll most probably need SSDs that have both high IOPS and high sequential throughput anyway.

Thanks again.
 

mobilenvidia

Moderator
Sep 25, 2011
1,952
225
63
New Zealand
What about CacheCade/FastPath on an LSI 9265/66/70/71, or even the older 9260/61, for spindles?
Set up 6x spindle HDDs.
Set up a smaller, cheaper SSD for CacheCade/FastPath to cache the spindle drives.
Set up an SSD large enough to hold the swap.
You could have fewer spindles and more SSDs for CacheCade (these you can RAID0/1), or have a RAID0 for swap.
You could forgo CacheCade and just use the SSDs to cache the spindles (FastPath), as CacheCade arrays are not visible to the OS.

With the 9265 and up you can set up JBOD, but I can't confirm whether CacheCade works with JBOD drives or only in RAID.
You could still set up 6x individual RAID0 arrays and then let the OS stripe them as well.
I must do a test on this.
 

ehorn

Active Member
Jun 21, 2012
342
52
28
What other area have you looked at for optimization?

From a MATLAB perspective: how much have you guys worked to optimize your MATLAB code (e.g. bounds, arrays, data access, etc.)? Can you achieve parallelism (e.g. Parallel MATLAB)?

From other hardware avenues: have you considered caching software (e.g. Flashcache, Bcache, etc.)? Have you considered a low-cost, clustered solution?
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
I found some of my notes. I was using iostat and looking only at the swap disks. Run iostat with a small interval (1-10 seconds) to look for transient issues or say 5 minute intervals for long-term gathering, collect the results in a file, and graph. You are looking to calculate:

- read-to-write ratio
- swap write rate in MB/s
- swap read rate in MB/s
- average read size in KB
- average write size in KB
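
The figures in the list above can be sketched in a few lines of Python. This is only an illustration, not what dba actually ran: instead of parsing iostat output it works on two snapshots of the kernel counters in /proc/diskstats (the same counters iostat reads). The function names are made up for this example, and "read to write ratio" is assumed here to mean completed read operations divided by completed write operations.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats_line(line):
    """Pull the counters needed here out of one /proc/diskstats line."""
    f = line.split()
    return {
        "name": f[2],
        "reads": int(f[3]),            # reads completed
        "sectors_read": int(f[5]),
        "writes": int(f[7]),           # writes completed
        "sectors_written": int(f[9]),
    }


def read_device(name, path="/proc/diskstats"):
    """Return the current counters for one block device (Linux only)."""
    with open(path) as fh:
        for line in fh:
            stats = parse_diskstats_line(line)
            if stats["name"] == name:
                return stats
    raise ValueError(f"device {name!r} not found in {path}")


def swap_io_profile(before, after, interval_s):
    """Compute the listed figures from two snapshots taken interval_s apart."""
    reads = after["reads"] - before["reads"]
    writes = after["writes"] - before["writes"]
    read_bytes = (after["sectors_read"] - before["sectors_read"]) * SECTOR_BYTES
    write_bytes = (after["sectors_written"] - before["sectors_written"]) * SECTOR_BYTES
    return {
        # Assumption: ratio of operation counts, not of bytes moved.
        "read_write_ratio": reads / writes if writes else float("inf"),
        "read_mb_s": read_bytes / 1e6 / interval_s,
        "write_mb_s": write_bytes / 1e6 / interval_s,
        "avg_read_kb": read_bytes / 1024 / reads if reads else 0.0,
        "avg_write_kb": write_bytes / 1024 / writes if writes else 0.0,
    }
```

To use it, call read_device() on the swap disk (for example a hypothetical "sdb"), sleep for the chosen interval, call it again, and pass both snapshots to swap_io_profile(). A short interval shows peaks; a long one gives averages for graphing.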

Apparently I was also looking at using sysstat and ksar: http://www.cyberciti.biz/tips/identifying-linux-bottlenecks-sar-graphs-with-ksar.html

I can't find any good mdadm versus x comparison benchmarks in my notes because I have been testing a very specific IO scenario (data warehousing) that looks nothing like OS swapping. You can search for more benchmarks yourself, but I maintain that they will be useless since you don't yet know the IO profile of your application. The solution that works best for low queue depth might be different than the one that works for high queue depth - and again for large versus small transfer sizes, random versus sequential operations, and how "scattered" the operations are over what size data set versus cache size. Then there is the fact that you want to know the performance of the Linux swap subsystem specifically, which nobody seems to have benchmarked using SSD RAID.
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
ehorn, you make what is probably the most important observation: It's usually the code! We don't know what normadize is doing, unfortunately, so I guess we'll have to stick to general IT advice.

 

normadize

New Member
Oct 11, 2012
6
0
0
@dba: thanks again for taking the time and digging up old notes! You've been really great! I'll try iostat as you suggested. I was already using sysstat and sar, but I found the report rate too low (it relies on cron), and it shows averages per interval whereas I'm also interested in peaks. iostat looks more promising.

@ehorn & @dba: I was so hoping people would not ask that, but I admit that it is a pertinent question and that it is the case many times. Sadly, not this time. Unfortunately, the code is already optimized memory-wise. Further parallelization is impossible due to its nature. Parts of it are parallelized already, but as you know (if you are an experienced MATLAB user), parallelization in MATLAB usually involves using more memory, not less. We only parallelized some of it to improve speed (it's slow regardless of memory usage). You're probably wondering whether, since the code is parallelizable, the different parallel streams could also be run sequentially and then aggregated in a loop, reducing memory consumption. That's a valid approach, and it already does that.

We're dealing with a real bitch of a simulation (projected memory usage for a desired scenario is 4.6 TB ... not in this decade I think).

Using hard disks is really out of the question, takes too long and we can't afford it. All forms of caching software to SSD or SSD caching techniques/hardware used in conjunction with spinning hard disks will hit the same problem, i.e. at some point cache will fill and then it will hit the hdd and when that happens, it crawls to a halt effectively. We're looking at 700GB-1TB of swap so it's best to put it all on SSDs as I understand it.
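
The cache-fill argument above can be put into rough numbers. The sketch below is only a back-of-the-envelope model, not a measurement: it assumes page accesses are spread uniformly over the swapped data (so the hit ratio is simply cache size divided by working set), and the 500 MB/s SSD and 1 MB/s random-paging HDD rates are hypothetical placeholders.

```python
def effective_rate_mb_s(hit_ratio, fast_mb_s, slow_mb_s):
    """Average throughput when a fraction hit_ratio of the data is served
    from the fast tier and the rest from the slow tier.

    The time to move 1 MB is the weighted sum of the per-tier times,
    so the result is a weighted harmonic mean of the two rates.
    """
    time_per_mb = hit_ratio / fast_mb_s + (1.0 - hit_ratio) / slow_mb_s
    return 1.0 / time_per_mb


# Hypothetical example: a 256 GB SSD cache in front of 700 GB of swap on HDD.
cache_gb = 256
working_set_gb = 700
hit_ratio = cache_gb / working_set_gb  # uniform-access assumption

rate = effective_rate_mb_s(hit_ratio, fast_mb_s=500.0, slow_mb_s=1.0)
print(f"hit ratio {hit_ratio:.2f}, effective rate {rate:.2f} MB/s")
```

Under these assumptions the slow tier dominates: even with more than a third of accesses hitting the SSD, the effective rate stays below 2 MB/s, which matches the "crawls to a halt" description. A real workload with good locality would do better, which is why the IO profile matters.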
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
OK, then I'll assume that the code is optimal and that it is not at all reasonable to make any simplifications to the model in order to gain speed. Wow - that's a big simulation on a low budget. Is this a commercial effort or academic?

Have you thought about "renting" some capacity for your simulation? With your swap needs, EC2 might not be the best choice, but there may be other options. Heck, for the right project I might be willing to rent you some server time. I have a 32 core machine with 128GB and 16 SSD drives and a 48 core with 256GB and 32 SSD drives.



 

normadize

New Member
Oct 11, 2012
6
0
0
OK, then I'll assume that the code is optimal and that it is not at all reasonable to make any simplifications to the model in order to gain speed. Wow - that's a big simulation on a low budget. Is this a commercial effort or academic?
Academic ... I was also thinking that if it were commercial, it would have been easier to make a case for a larger investment.

Have you thought about "renting" some capacity for your simulation? With your swap needs, EC2 might not be the best choice, but there may be other options. Heck, for the right project I might be willing to rent you some server time. I have a 32 core machine with 128GB and 16 SSD drives and a 48 core with 256GB and 32 SSD drives.
We thought of it, but the thing is that we plan to run many simulations of the same kind with different parameters, and each takes a while to execute as well. It seemed somewhat uneconomical.

Renting some server time from you, that I would have never expected! I'll PM you about this.
 

Mark_T

New Member
Apr 23, 2012
17
0
1
I just bought an LSI 9207-8i. I have 4x 256GB Samsung 830s set up in a RAID0 configuration on an M1015 HBA and I'm very satisfied. I work with big media files and I'm only interested in speed. At the end of my work session the files that matter are moved to safe locations. My plan is to buy another 4 Samsung SSDs and replace the M1015 with the LSI 9207-8i.
The problem is that I cannot set up RAID0 with the new LSI 9207-8i I just bought. I tried using the LSI Config Utility before booting into Windows. There is no RAID option anywhere. I tried using MegaRAID Storage Manager, but the options to create the virtual drive are greyed out.
Only after this nasty experience did I fully realize how well documented and convenient it was to flash the M1015. :)
Thanks mobilenvidia! :)
Now, tbh, I'm looking at my new superfast card like a chimp and I don't know, for the love of god, what to do. I have a feeling that the first step is to upgrade the firmware from P13 to P14. I downloaded the latest firmware from the LSI website, but it is a sas2flash.exe and I don't know how to use it. I already spent 2 days trying to figure it out, but in the end I decided to ask for help.
I can see all the drives in Windows and I was able to create a RAID0 drive, but I don't think it is working well. My C drive is also a RAID0 consisting of 2x 256GB Samsung 830. I tested copying a 40GB chunk of data (4K-pixel images, 32MB each) from the 4x Windows RAID0 to the 2x Intel RAID0, and after the first 10 GB the speed plunged to something like 20-30MB/s. It is almost impossible to transfer this amount of data.
I'll be grateful for any help I can get to sort this out, because I will need to do the upgrade in the end. I love my M1015, it was my first cool RAID setup, but I have to upgrade.
Oh, and one last detail. :)
Yesterday I upgraded to Windows 8 Pro and I have a feeling this might also be a problem, driver-wise. I don't know.
Thanks in advance for your time and for any help I can get.

Cheers,



 

Mark_T

New Member
Apr 23, 2012
17
0
1
Well, it's obvious this is not my field of expertise. And on top of that, I'm embarrassed to admit that I hadn't loaded the motherboard chipset drivers. After installing the drivers I was able to transfer 43.2 GB from drive C (Intel RAID0, 2x SSD) to drive D (Windows RAID0, 4x SSD) in 58 seconds. This is an average of over 700 MB/s.
And after reading a bit more, I finally understood that the card is in IT mode, and because of that I can't build the "Integrated RAID".
This is the difference between the 9207 (IT) and the 9217 (IR), right?
May I ask for help with flashing IR firmware to the adapter?
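
The transfer-rate arithmetic can be checked with a small Python sketch. It is only an illustration; the function name is made up, and since the post does not say whether "43.2 GB" is decimal (10^9 bytes) or binary (2^30 bytes, which is how Windows reports sizes), both readings are computed.

```python
def average_throughput_mb_s(size_gb, seconds, binary_units=False):
    """Average transfer rate in MB/s (1 MB = 10**6 bytes).

    binary_units=True treats size_gb as GiB (2**30 bytes), the way
    Windows Explorer reports sizes; otherwise 1 GB = 10**9 bytes.
    """
    bytes_per_gb = 2**30 if binary_units else 10**9
    return size_gb * bytes_per_gb / 1e6 / seconds


decimal_rate = average_throughput_mb_s(43.2, 58)
binary_rate = average_throughput_mb_s(43.2, 58, binary_units=True)
print(f"{decimal_rate:.0f} MB/s (decimal GB), {binary_rate:.0f} MB/s (binary GB)")
```

Either way the result is comfortably above the 700 MB/s quoted, roughly 745 MB/s with decimal units and roughly 800 MB/s with binary units.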
Thanks.
 

Mark_T

New Member
Apr 23, 2012
17
0
1
Flashing LSI 9207-8i in UEFI environment

It's almost half a year since I flashed my M1015 to LSI 2008 IR, and of course I forgot everything. I build my own PCs, but I don't do it every year. :)
And on top of that, my P9X79 WS likes to do all these flashing exercises in the UEFI environment, and this new LSI 9207-8i was no exception.
Luckily, I went back to my post about flashing the M1015 in UEFI. Now I know for sure it helped someone.

I followed the instructions found here: http://kb.lsi.com/KnowledgebaseArticle16266.aspx
I initially tried to do it in DOS, but again I got the "Failed to initialize PAL.Exiting program" error.
I re-downloaded the "x86_64 UEFI Shell 1.0 (Old)" from https://wiki.archlinux.org/index.php/Unified_Extensible_Firmware_Interface#UEFI_Shell.
I renamed "Shell_Full.efi" to "shellx64.efi" and copied it to the USB stick.
I downloaded the "SAS2_UEFI_BSD_P10" from: http://www.lsi.com/channel/support/...Host Bus Adapters&productname=LSI SAS 9211-8i
If you read the readme.txt file from: http://www.lsi.com/downloads/Public..._SATA_6G_P10/README_SAS2_UEFI_BSD_HII_P10.txt you will see that SAS2308_2 is there.
I also copied sas2flash.efi and the LSI 9207-8i IT firmware (9207-8.bin) to the stick. I skipped the BIOS.
I entered the UEFI environment from the motherboard BIOS and, instead of the DOS commands, used the UEFI ones, which are almost identical:

sas2flash.efi -o -e 6
sas2flash.efi -o -f 9207-8.bin

Now, the boot is a bit faster.
My current setup includes 4 x 256GB Samsung 830 in Windows 7 Raid0.
You can also see below the previous results I got with the M1015 in the same SSD setup, but in IR mode/hardware RAID0.