The Dirt Cheap Data Warehouse V3


dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
I'm starting to build V3 of my Dirt Cheap Data Warehouse - a server designed for maximum throughput in a database application at minimum cost. Version 2 of the DCDW is a 48 core HP DL585 G7 capable of >18,000MB/s of throughput in a single SQL query, some 270 million rows per second. The goal for V3 is to exceed 25,000MB/s.

The specs:
Chassis: Supermicro SuperServer 4047R-7JRFT. This is a 4U chassis with 48 2.5" bays and a quad Xeon E5 motherboard with 10GbE and dual LSI 2308 chips built in. The expander-based backplanes will be replaced with non-expander backplanes.
CPUs: Quad Intel Xeon E5-4640 C1 stepping. These are 8 core CPUs. I hate paying Intel prices, even ES prices, but I need PCIe3.
Disk IO: 2x LSI 2308 chips built in, plus 4x LSI 2308-based PCIe3 internal port cards and 4x LSI 2308-based PCIe3 external-port cards.
Disks: 2.5" 128GB SSD drives. I'll start with 56 drives and expand to 80 drives to explore the limits of performance.
Networking: Gigabit Ethernet plus QDR Infiniband running IPoIB.


Measuring raw CPU horsepower, the Xeon E5s aren't much better than the quad 12-core AMD 6172 CPUs in DCDW (Dirt Cheap Data Warehouse) V2, and in fact have fewer total cores. In their favor, however, is their support for PCIe3 with its improved throughput and very low latency. On balance, this should improve query throughput considerably for most cases, though this will need to be verified.

With 10 PCIe3 disk controllers, this setup should be good for ~40GB/s of raw disk IO and - I hope - around 28GB/s of SQL query throughput.
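
For a sense of the workload behind those numbers: the throughput figures come from scan-heavy star schema queries along these lines. The table and column names below are made up for illustration - not my actual schema - but the shape of the query is representative.

-- Illustrative only: full scan of a large fact table joined to a small dimension,
-- which is the kind of query that turns raw disk bandwidth into rows per second.
SELECT d.calendar_year,
       SUM(f.sale_amount) AS total_sales,
       COUNT(*)           AS fact_rows
FROM   sales_fact f
JOIN   date_dim d ON d.date_key = f.date_key
GROUP  BY d.calendar_year
ORDER  BY d.calendar_year;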
 

BlueLineSwinger

Active Member
Mar 11, 2013
176
66
28
FWIW, 120GB-class and lower SSDs typically have significantly lower throughput and IOPS numbers than those in the 240GB-class and higher. It may be worthwhile to go with fewer larger drives.
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
FWIW, 120GB-class and lower SSDs typically have significantly lower throughput and IOPS numbers than those in the 240GB-class and higher. It may be worthwhile to go with fewer larger drives.
Some 120GB drives show much slower read speeds than the 240GB models from the same product line, but others have a much smaller differential. In all cases, however, two 120GB drives provide more throughput per dollar than one 240GB drive - at least with the drives that I have tested.

IOPS are important, but I'm already seeing >2,000,000, which is about 50X what I actually need. I have stopped worrying much about IOPS given how many drives I am using!
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
Considering this is "dirt-cheap" what is your actual budget looking like?
I have to admit that the DCDW V3 isn't nearly as cheap as V2. V3 costs more and will deliver less performance per dollar, but more performance overall. With the data set for a normal query having grown to ~3TB, the time spent waiting for queries was starting to feel "expensive" to me, more expensive than upgrading. Here are the rough costs:

Chassis: $3,100
CPUs: $1,640
RAM: $1,300
LSI cards: $1,700
SSDs: $7,000
Total: expensive, until you price an Oracle Exadata!
 

Chuntzu

Active Member
Jun 30, 2013
383
98
28
Very cool! I was thinking a while back about what machine would give me the best disk throughput, and I was looking at the X9DRX+-F. Have you looked at this board before? I know at one point you mentioned to me that memory bandwidth was a determining factor in the platform you're working with. Would quad sockets have more memory bandwidth than dual sockets in this scenario? And would that one PCIe x4 slot slow overall system performance down compared to the quad socket? Though the quad socket may cost more... Anyway, just drumming up ideas; can't wait to see the numbers you post up.
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
I like that Supermicro motherboard for a dual CPU setup, but I decided against it as having too much PCIe IO compared to CPU grunt.

For example, Patrick let me play with one of his killer dual-CPU Xeon E5-2690 systems. I compared it to my quad AMD Opteron 6172 server, an HP DL585 G7. While the Xeon CPUs were three years newer and had PCIe3, they were - very surprisingly - not able to keep up with the AMDs in my star schema database use cases. The Xeon setup had far fewer cores in total, and in the end that made the difference; the 16 Intel cores just could not keep up with the 48 AMD cores when it came to database IO.

So my conclusion is that, to beat four AMD Opterons, I need four Xeon E5s.

Side note: I also tested an HP DL580 G7*, which had 40 Xeon E7 cores with 80 total threads. Xeon E7 sounds more impressive than Xeon E5, and 80 Intel threads seems like it would be better than 48 AMD cores, but the DL580 was pathetic. It had PCIe2 like the AMD version, but there were only two Intel IO chips for those four CPUs, which was absolutely insufficient. The AMD-based DL585, by comparison, has four IO chips and the Xeon E5s don't require separate IO chips at all.

*Cause for confusion: The HP 580 is an Intel box while the HP 585 is AMD. Similar names but different architectures.

 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
Had you looked into the new 12 core E5 v2's?
I did, and they are quite impressive, but I decided against them for some rather pedestrian reasons. I could have waited for the E5-4xxx V2s to come out, but I didn't think I'd find ES chips available at reasonable prices any time soon, so I eliminated that option. I could have gotten more than enough CPU power from a pair of top-tier Xeon E5-2xxx V2s with 12 cores each, but the cost would have been higher than buying four average E5 V1 chips, and I'd have had fewer total cores and less total PCIe IO bandwidth. Now if I were paying a per-CPU license fee, the decision would have come out differently.

The upcoming Xeon E7 V2s also look really fast - 15 cores and PCIe3 - but with no competition from AMD or anyone else, the pricing will be insane.
 

PigLover

Moderator
Jan 26, 2011
3,184
1,545
113
FWIW, 120GB-class and lower SSDs typically have significantly lower throughput and IOPS numbers than those in the 240GB-class and higher. It may be worthwhile to go with fewer larger drives.
You can't make generalizations like that without considering the requirements of the workload.

The advantage of the 240GB drives vs. the 120GB is largest on writes. For example, in the 840 Pro line that dba is using, the 120GB only has about 50% of the write speed that the 240GB does, but the read speeds are much, much closer. With the cost difference just over 2:1, and with a read-dominated workload like a data warehouse, dba's choice to go with the 120GB drives makes perfect sense. The price/performance ratio favors the smaller drive by a large margin.

If he were using it for a write-dominated workload, e.g. a journal disk, write cache, or ZFS ZIL, then maybe getting the larger drive makes sense. Even more so since those applications generally call for 1 to 8 drives total, not the 50+ used in the DCDW.
 

nry

Active Member
Feb 22, 2013
312
61
28
This sounds incredible. Certainly puts my setup to shame but I have never had to do SQL queries on that scale.

Assuming this is Microsoft SQL?
 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
I use Oracle, and my configuration is especially good for Oracle since that database has a feature called "Automatic Storage Management" (ASM) that turns any number of unformatted drives into a massive, database-optimized RAID set with two-way or three-way mirroring. ASM gives me mirroring for redundancy and is the most performant RAID implementation that I have found - by far.

Actually, ASM is a really great RAID implementation in other ways as well, and I wish (and hope) that someone implements these features in a more general-purpose manner. ZFS comes pretty close, but there are still features that you can only get with ASM.
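
To make that concrete, creating a two-way mirrored ASM diskgroup looks roughly like the sketch below. The disk paths are placeholders for illustration, not my actual device names.

-- Run from the ASM instance. NORMAL REDUNDANCY is two-way mirroring;
-- HIGH REDUNDANCY would give three-way.
CREATE DISKGROUP dw_data NORMAL REDUNDANCY
  DISK '/dev/mapper/ssd01',
       '/dev/mapper/ssd02',
       '/dev/mapper/ssd03',
       '/dev/mapper/ssd04';

-- Then, from the database instance, point new files at the diskgroup and
-- ASM handles the striping and mirroring itself.
ALTER SYSTEM SET db_create_file_dest = '+DW_DATA';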

 

dba

Moderator
Feb 20, 2012
1,477
184
63
San Francisco Bay Area, California, USA
You can't make generalizations like that without considering the requirements of the workload.

The advantage of the 240GB drives vs. the 120GB is largest on writes. For example, in the 840 Pro line that dba is using, the 120GB only has about 50% of the write speed that the 240GB does, but the read speeds are much, much closer. With the cost difference just over 2:1, and with a read-dominated workload like a data warehouse, dba's choice to go with the 120GB drives makes perfect sense. The price/performance ratio favors the smaller drive by a large margin.

If he were using it for a write-dominated workload, e.g. a journal disk, write cache, or ZFS ZIL, then maybe getting the larger drive makes sense. Even more so since those applications generally call for 1 to 8 drives total, not the 50+ used in the DCDW.
The advice from BlueLineSwinger is of course totally correct if you are buying a single SSD: Buy the 256GB+ version and you'll get better performance as well as more space - a double benefit. The math is different - as PigLover points out - when you are buying multiple drives. For example, imagine that you wanted to create an SSD array with 800GB of raw space. Should you use 8x 128GB drives formatted to 100GB each or 2x512GB drives formatted to 400GB each (both scenarios have the same percentage of overprovisioning, and you want lots of overprovisioning)?

8x 128GB drives at $140 each = $1,120
2x 512GB drives at $450 each = $900

Using the smaller drives costs 24% more, and the 128GB drives also have worse individual performance than the 512GB - 390MB/s writes compared to 520MB/s, for example. The difference in read performance is much smaller, so I'll ignore it.

But add up your theoretical aggregate performance and the balance shifts, with some surprising numbers:

2x 512GB drives: 1,040MB/s and 180K IOPS aggregate writes.
8x 128GB drives: 3,120MB/s and 720K IOPS aggregate writes.

Spend 24% more and you get three times the write throughput and four times the write IOPS.

Of course the delta is smaller if you account for HBA, cable, and chassis costs. Still, for a long time the 128GB drives have been the price/performance sweet spot, with the 256GB drives now coming out on top in some scenarios.
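
If you want to fiddle with the arithmetic yourself, it is nothing more than the multiplication below - the per-drive prices and specs are the assumed figures from this post, not benchmarks.

SELECT 8 * 140 AS cost_8x128_usd,      -- $1,120
       2 * 450 AS cost_2x512_usd,      -- $900
       8 * 390 AS write_mb_s_8x128,    -- 3,120MB/s aggregate
       2 * 520 AS write_mb_s_2x512,    -- 1,040MB/s aggregate
       8 * 90  AS write_kiops_8x128,   -- 720K IOPS aggregate
       2 * 90  AS write_kiops_2x512    -- 180K IOPS aggregate
FROM   dual;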
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,625
2,043
113
Very cool project. I'm putting together my first "mini" data warehouse system too, starting with an E5-1650 and 8x 480GB Intel 730s on an LSI RAID card. I wanted a high-GHz CPU to start, as it will be doing a lot of miscellaneous things as well as mostly writes, and the heavy query/analysis will most likely be done on another system.

I am interested to hear more about the AMD > Intel when it comes to cores and SQL, and where you think that crossover is.

Do you think 2x 12-core Intels or 2x 14-core Intels would be better than the AMD setup you have? One or both will be my analysis machine, with a P3600 or P3700 or whatever is available mid/late summer and fast :)

My per-project data sets aren't nearly as big as yours, but overall the data will grow rapidly, so I'm working on separate cold and hot storage, since the cold storage doesn't need to be fast; with most projects under 100MB, it may be worth using temp tables for analysis. Not 100% sure yet which way I'll go... all-SSD storage, or non-SSD cold and SSD hot for reads, with the 'live write' machine kept separate too.
 

mrkrad

Well-Known Member
Oct 13, 2012
1,244
52
48
You'd be better off splitting the 8 drives across two MegaRAID cards! That's what I've found. It allows for better distribution of I/O IRQ events as well!
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,625
2,043
113
Interesting! Which version cards were you testing with? I just got their new 12Gb/s model with BBU and cache; haven't gotten the system up and running yet, but the plan is to test various SSDs in MySQL just to see what I can learn :)
 

mrkrad

Well-Known Member
Oct 13, 2012
1,244
52
48
I run the 9271 in a DL380 G7 - two of them, with four 840 Pros each, 33% OP, RAID-1, under ESXi 5.1. Had I the opportunity, I would have gone for four of the 4-port versions so the dual X5650s could dispatch more IO/queue depth across the CPUs.

One MegaRAID card seems to peter out at around six drives on my old DL380!
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,625
2,043
113
What is your array setup like?

What tests did you use to measure performance?

What operating system were tests done on?

Did you ever test/compare baremetal vs esxi VM performance?
 

mrkrad

Well-Known Member
Oct 13, 2012
1,244
52
48
RAID-1 of two drives, 33% OP - 512GB drives presented as a 384GB RAID-1.

Performance? ETL/SSRS jobs running on SQL Server 2008 R2.

Windows 2008 R2, with SQL Server 2008 R2 running in VMs of 2-4 vCPUs max and 32GB to 64GB of RAM each.

No need to test bare metal - I don't ever plan to use bare-metal servers!

You do have to be vNUMA-aware and PCIe NUMA-aware with dual X5650s and SQL Server.

It has to do with ESXi divvying up resources with a scheduler that doesn't get used when one VM gets one full LUN/target/RAID card.

Avoiding the scheduler allows maximum queue-depth utilization. SQL Server loves having multiple LUNs attached to multiple filegroups!
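
Roughly what I mean by multiple LUNs and multiple filegroups, sketched out - the drive letters and file names below are placeholders, not my actual layout. Each drive letter would be its own LUN on its own RAID-1 pair.

-- Sketch only: spread data files over several LUNs/filegroups so SQL Server
-- can keep a deep queue going on each device independently.
CREATE DATABASE dw
ON PRIMARY
    (NAME = dw_primary, FILENAME = 'E:\sqldata\dw_primary.mdf'),
FILEGROUP fact_data
    (NAME = fact_01, FILENAME = 'F:\sqldata\fact_01.ndf'),
    (NAME = fact_02, FILENAME = 'G:\sqldata\fact_02.ndf'),
FILEGROUP index_data
    (NAME = idx_01, FILENAME = 'H:\sqldata\idx_01.ndf')
LOG ON
    (NAME = dw_log, FILENAME = 'I:\sqllog\dw_log.ldf');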