The "Dirt Cheap Data Warehouse" - Database and Storage Servers on the cheap

dba

Moderator
Feb 20, 2012
1,478
183
63
San Francisco Bay Area, California, USA
Others have shared photos of their servers and server racks, so I'll do the same.

I am creating an identity management software package based on data warehouse (DW), data mining, and business intelligence (BI) technologies. A partial definition of a data warehouse is this: A really big, really expensive database with really good disk IO. Companies often spend hundreds of thousands of dollars on their DW infrastructure, with big database servers and hundreds or thousands of overpriced SAN disk spindles.

I need a data warehouse for my project, but I certainly can't afford to spend hundreds of thousands of dollars. As a solution, I have created what I call the "Dirt Cheap Data Warehouse (tm)" - aka the DCDW.

The secret to the Dirt Cheap Data Warehouse is this: Used generation-old server technology from eBay, plus a large number of consumer-grade SSD drives, carefully selected, configured, and deployed to avoid resource-wasting bottlenecks and to maximize both query throughput and query throughput per dollar. Frankly, it's the details of the software and configuration that give the DCDW most of its speed, but the hardware portion is still interesting by itself. This post will of course focus on the hardware.

Let the cable-pocalypse begin! Here are images of the current version of the DCDW plus some in-progress experiments to define the next generation of the architecture. I'll add a few other posts to describe what you are seeing. My "rack" is actually two 12U racks bolted together, one on top of the other, which is why I talk about the "top half" and "bottom half" of the rack. Originally I was so sure that 12U would be enough...

Image 1 - the bottom half of the rack, front view. Shows the HP DL585 G7 DB server plus two JBOD chassis with 28 Samsung SSD drives each. At the bottom is the DLI web-enabled PDU.
RackFrontBottom1.jpg


Image 2 - the top half of the rack, front view. Shows (from top to bottom) c6100 "Corporation in a Box", two c6100 storage nodes, two c6145 db cluster nodes, HP MSA2000G2 SAN.
RackFrontTop.jpg

Image 3 - the bottom half of the rack, rear view. Too many cables! Mellanox QDR Infiniband switch and Dell Gigabit switch at the bottom.
RackBackBottom.jpg

Image 4 - the top half of the rack, rear view
RackBackTop.jpg



See posts below for details if you want to know more:
 

TheBay

New Member
Feb 25, 2013
220
1
0
UK
Some good kit in that rack ;)
What the hell does that pull at the wall? Are you using a 220~240V line?
 

dba

Moderator
Feb 20, 2012
1,478
183
63
San Francisco Bay Area, California, USA
System One: The Dirt Cheap Data Warehouse

Take a look at the front photo of the bottom portion of the rack. From top to bottom, you'll see the following DCDW components:

1) HP DL585 G7 Server, which is the main database server
2) Two Supermicro SC216 chassis, which are the SSD storage chassis, each configured to hold 28 SSD drives
3) The Digital Loggers Inc. web-based power distribution unit (PDU)

The DL585 G7 is a quad-socket AMD server with 11 PCIe slots**. I have it configured with four AMD Opteron 6174 CPUs for a total of 48 true cores. RAM is 384GB, with plenty of empty DIMM slots if I need them. The DL585 G7 is unique in that it has four AMD IO chips, whereas most quad-socket servers have only two. Each IO chip provides 42 PCIe2 lanes and, when the optional IO board is added, the four chips provide eleven PCIe slots: four x16 and seven x8. I have inexpensive LSI 9200 and dual-controller 9202 host bus adapters installed in ten of the PCIe slots to provide a total of 14 SAS controllers and 112 disk channels. I have one x16 PCIe slot left over for future expansion. The DL server also has a battery-backed HP P410 RAID card, which I use along with eight internal 1TB laptop drives to provide bulk storage for database backups.
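For anyone who wants to check the controller math, here is a small Python sketch. The split between single-controller and dual-controller cards and the lanes-per-controller figure are not stated in the post; both are inferred here from ten slots, 14 controllers, and 112 channels.

```python
# Sketch of the HBA arithmetic. The three inputs come from the post; the card
# split and lanes-per-controller are derived values, not the author's figures.
SLOTS_WITH_HBAS = 10      # PCIe slots populated with LSI HBAs
TOTAL_CONTROLLERS = 14    # SAS controllers across those cards
TOTAL_CHANNELS = 112      # disk channels quoted in the post

# A dual-controller 9202 adds one controller beyond the one-per-slot baseline.
dual_cards = TOTAL_CONTROLLERS - SLOTS_WITH_HBAS
single_cards = SLOTS_WITH_HBAS - dual_cards
lanes_per_controller = TOTAL_CHANNELS // TOTAL_CONTROLLERS

print(f"{single_cards} single-controller cards, {dual_cards} dual-controller cards")
print(f"{lanes_per_controller} SAS lanes per controller")
```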

The Supermicro SC216 chassis are the non-expander JBOD versions, stripped out and reconfigured with Supermicro JBOD power boards. Each chassis has 24 drive slots at the front and four at the rear, the four rear slots provided by an add-on mobile dock from StarTech. Each JBOD chassis presents seven SFF-8088 ports that connect to the HP server with 1-meter SFF-8088 SAS cables - you can see the giant bundle of cables in the photo.

The disks used in these JBOD chassis are Samsung 830 and 840 Pro drives. An earlier version used OCZ Vertex 3 drives, which worked perfectly and were by far the best deal at the time. When the Samsung 830s dropped below $100, however, they became the new go-to drive, until recently when the 840 Pros became reasonably priced. All of the drives are 128GB, which is the sweet spot for throughput per dollar. For almost any other use case I would have used drives with supercaps, but these are not required with Oracle.

All told, the DCDW has around 3 million IOPS and 6.9TB worth of SSD storage, formatted to 5.4TB with the remaining 1.5TB as overprovisioning. Yes, that's 1.5TB of "overprovisioning"! I use Oracle ASM, which is a form of software RAID1E, so I have 2.7TB worth of actual usable space.
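The capacity figures can be reproduced with a few lines of Python. This is only a sketch using the rounded numbers from the post; the two-copy mirroring factor is an assumption based on the "RAID1E" description and ASM's normal-redundancy behavior.

```python
# Capacity sketch using the rounded figures quoted in the post.
RAW_TB = 6.9        # total SSD capacity
FORMATTED_TB = 5.4  # capacity actually partitioned and given to ASM
MIRROR_COPIES = 2   # ASSUMPTION: two copies of every extent (mirrored)

overprovisioned_tb = round(RAW_TB - FORMATTED_TB, 1)
overprovisioned_pct = round(100 * overprovisioned_tb / RAW_TB, 1)
usable_tb = round(FORMATTED_TB / MIRROR_COPIES, 1)

print(f"overprovisioned: {overprovisioned_tb} TB ({overprovisioned_pct}% of raw)")
print(f"usable after mirroring: {usable_tb} TB")
```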

The Digital Loggers box gives me web-based remote control of power and is very helpful. I plan on writing up an STH article on this little gem some time soon.

You'd be surprised how little money I have invested in this machine. The DL585 server was bought from eBay for a fraction of its price when new, even though it still has two years of on-site warranty service left. The HBAs cost me between $40 and $200 each - the more expensive being the 9202-16e cards with 16 ports each. If you do the math, I only utilize half of the SAS ports per card. Each LSI card can handle almost six SSD drives before bottlenecking, but SAS chassis present disks in sets of four. With the availability of low-cost LSI HBAs and the relatively high cost of disk chassis, it has been cheaper to deploy units of four disks rather than five or six.
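Here is the "do the math" part as a rough Python sketch. It treats each SAS controller as the unit that bottlenecks at about six drives, which is my reading of "each LSI card" and may not match the dual-controller cards exactly; the eight ports per controller is derived from 112 channels over 14 controllers.

```python
# Port and bandwidth utilization sketch. Drive and controller counts are from
# the post; treating the six-drive limit as per controller is an assumption.
DRIVES = 56                        # 2 chassis x 28 SSDs
CONTROLLERS = 14
PORTS_PER_CONTROLLER = 8           # 112 channels / 14 controllers
MAX_DRIVES_BEFORE_BOTTLENECK = 6   # "almost six", used here as an upper bound

drives_per_controller = DRIVES // CONTROLLERS
port_utilization = drives_per_controller / PORTS_PER_CONTROLLER
bandwidth_headroom = 1 - drives_per_controller / MAX_DRIVES_BEFORE_BOTTLENECK

print(f"{drives_per_controller} drives per controller")
print(f"{port_utilization:.0%} of SAS ports in use")
print(f"roughly {bandwidth_headroom:.0%} controller bandwidth left unused")
```

Running four drives per controller leaves some bandwidth on the table, but it matches the four-lane grouping of the SFF-8088 cables and avoids buying more chassis.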

The database throughput for this beast is impressive. With indexes turned off, I can issue a single query and get 16,800MB/Second of actual Oracle database throughput - with compression that's around a quarter billion fact table rows per second!
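To put that number in per-drive terms, here is a quick Python sketch. It assumes 1MB = 1,000,000 bytes and takes "a quarter billion" literally as 250 million rows, so the bytes-per-row figure is only a ballpark for compressed data read from disk.

```python
# Throughput sketch: per-drive share and implied bytes read per fact row.
THROUGHPUT_MB_S = 16_800        # single-query database throughput from the post
DRIVES = 56                     # 2 chassis x 28 SSDs
ROWS_PER_SECOND = 250_000_000   # "around a quarter billion" (approximate)

per_drive_mb_s = THROUGHPUT_MB_S / DRIVES
bytes_per_row = THROUGHPUT_MB_S * 1_000_000 / ROWS_PER_SECOND

print(f"{per_drive_mb_s:.0f} MB/s per drive")
print(f"about {bytes_per_row:.1f} bytes read per row")
```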

**HP also sells the DL580 G7, which is an Intel-based version of the same box. In several ways it is even better than the DL585: it has 64 DIMM slots and can take Xeon E7 CPUs with 10 physical cores (20 threads with Hyper-Threading) each. It has, however, one big flaw: The PCIe bandwidth is awful. While it has eleven PCIe2 slots like the DL585, the DL580 has only two Intel IO chips and as a result you get six x8 slots and five x4 slots, which is far too little bandwidth for that much CPU horsepower. The DL585 showers you with four x16 slots and seven x8.
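The slot comparison is easier to see as lane totals. A minimal Python sketch follows; the 500MB/Second per lane figure is the theoretical PCIe 2.0 rate per direction, so real-world numbers will be lower, and the helper name is just for illustration.

```python
# Compare total slot lanes on the two servers, using the slot mixes in the post.
PCIE2_MB_S_PER_LANE = 500  # PCIe 2.0: 5 GT/s with 8b/10b encoding, per direction

def slot_lanes(slots):
    """Sum lanes over a {lane_width: slot_count} mapping."""
    return sum(width * count for width, count in slots.items())

dl585_lanes = slot_lanes({16: 4, 8: 7})   # four x16 + seven x8
dl580_lanes = slot_lanes({8: 6, 4: 5})    # six x8 + five x4

for name, lanes in (("DL585 G7", dl585_lanes), ("DL580 G7", dl580_lanes)):
    gb_s = lanes * PCIE2_MB_S_PER_LANE / 1000
    print(f"{name}: {lanes} slot lanes, ~{gb_s:.0f} GB/s theoretical")
```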
 

dba

Moderator
Feb 20, 2012
1,478
183
63
San Francisco Bay Area, California, USA
System Two: "Corporation in a Box", the Dell c6100-based VM Un-Cluster

The 12-bay Dell c6100 at the very top of the rack is my VM server, running all of the applications that any software company would need, each as its own VM. I call it my "Corporation in a Box".

Thanks to the tip from Patrick, I purchased this box at a great price from Vista Computer and then outfitted it with Xeon L5520 CPUs, lots of RAM, and a mix of standard and SSD drives.

Nodes 1-3 are dual-CPU Xeon L5520 nodes with 96GB of RAM each. Node 4 is a single-CPU node with 16GB of RAM. The three big nodes run Windows Server 2012 with Hyper-V. The VMs on those nodes replicate to the fourth node, which does nothing but receive those replicas. The VMs all run on 512GB Samsung 840 Pro SSD drives for speed, while a 2TB Hitachi drive attached to each node is used for daily backups. At last count, I had nearly 40 VMs spread across the three nodes - web, blog, mail server, source code control, bug tracking, continuous integration, code analysis, ETL, BI, databases of various brands, and lots and lots of development instances.

You'll also see a 12-bay disk chassis at the bottom of one of the photos. This is the HP MSA2312sa discussed in other posts and I use it for bulk storage. It is a SAS SAN device with dual controllers run in active-active mode. I have it configured with twelve 1TB drives (thanks to mrkrad for the interposers!) as RAID6 for around 10TB of storage. It's fairly fast - around 1,000MB/Second - and can connect to up to eight servers, or four with MPIO. Because of the redundant host connections, redundant controllers, flash-backed cache, and redundant power, it should be quite reliable. I would have gone with ZFS instead of an old-fashioned external SAN, except that I found an extraordinary deal on this box.

A side note: The secret to great VM performance is disk IO throughput and disk IOPS. A moderately robust server can easily host dozens of VMs, but I commonly see such servers with just a few spinning hard drives or perhaps a Gigabit iSCSI connection or two. Those dozens of VMs will be IO starved and will feel very sluggish. The common solution is to move VM storage to a fast SAN, which works well but is very expensive. With VM replication, there is another architecture: Fast local non-redundant SSD storage for VMs, with disaster recovery provided by replication to a second server using, for example, Veeam or Hyper-V 2012. A pair of SSD drives can provide 1TB of VM storage at 1,000MB/Second and 75K IOPS for less than the price of a Fibre Channel or 10GbE add-on card for a SAN.
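To make the IO starvation point concrete, here is a rough Python sketch that divides the available storage bandwidth evenly across a dozen VMs. The even split and the VM count are simplifying assumptions, and the 125MB/Second Gigabit figure is raw line rate before any iSCSI overhead.

```python
# Per-VM share of storage bandwidth under an even split (illustrative only).
VMS = 12                    # ASSUMPTION: a dozen VMs on one host
GIGABIT_ISCSI_MB_S = 125    # 1 Gbit/s line rate, before protocol overhead
SSD_PAIR_MB_S = 1_000       # figure quoted in the post for a pair of SSDs
SSD_PAIR_IOPS = 75_000      # figure quoted in the post

def per_vm(total, vms=VMS):
    """Evenly divide a shared resource across the VMs."""
    return total / vms

print(f"Gigabit iSCSI: {per_vm(GIGABIT_ISCSI_MB_S):.1f} MB/s per VM")
print(f"Local SSD pair: {per_vm(SSD_PAIR_MB_S):.1f} MB/s per VM")
print(f"Local SSD pair: {per_vm(SSD_PAIR_IOPS):.0f} IOPS per VM")
```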
 

dba

Moderator
Feb 20, 2012
1,478
183
63
San Francisco Bay Area, California, USA
System Three: Dell c6100 Storage Servers

You'll see two 24-bay Dell c6100 servers right below the Corporation in a Box. These are an experiment in progress. I plan to configure each node as a Solaris storage node serving up six SSD disks to Oracle instances via Infiniband. I'll be testing different protocols to see which provides the greatest query throughput. If the experiment works, I hope to retain the excellent throughput provided by the directly-attached SSD storage while gaining the ability to share that storage across database nodes, enabling a shared-storage database cluster without the expense of a large monolithic SAN. With each of the eight nodes capable of serving up at least 2GB/Second worth of data, that's a potential 16GB/Second worth of shared storage.
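A back-of-the-envelope Python sketch of that target follows. It reads the plan as six SSDs per node (24 bays shared by four nodes per chassis) and uses the theoretical 4x QDR Infiniband data rate, so treat the link utilization as a best case rather than a measurement.

```python
# Shared-storage throughput sketch for the eight c6100 storage nodes.
NODES = 8
SSDS_PER_NODE = 6             # ASSUMPTION: 24 bays split across 4 nodes per chassis
NODE_TARGET_MB_S = 2_000      # "at least 2GB/Second" per node
QDR_DATA_RATE_MB_S = 4_000    # 4x QDR: 40 Gbit/s signalling, 32 Gbit/s data (8b/10b)

aggregate_gb_s = NODES * NODE_TARGET_MB_S / 1000
per_ssd_mb_s = NODE_TARGET_MB_S / SSDS_PER_NODE
link_utilization = NODE_TARGET_MB_S / QDR_DATA_RATE_MB_S

print(f"aggregate: {aggregate_gb_s:.0f} GB/s")
print(f"each SSD must sustain about {per_ssd_mb_s:.0f} MB/s")
print(f"one QDR link would be about {link_utilization:.0%} utilized")
```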

Right now, each of the eight nodes in these two servers has dual Xeon L5520 CPUs - that's 16 CPUs total. Four of the nodes have LSI SAS2008 SAS mezzanine cards and PCIe QDR Mellanox cards, while the other four have Mellanox QDR Infiniband mezzanine cards and will have PCIe LSI SAS2008-based HBAs. I may eventually settle on a single configuration for all eight nodes, but I'm not sure which will win out.

As discussed in the forums, the disk caddies in these Dell 24-bay nodes are standard HP 2.5" disk caddies, not Dell caddies. The HPs fit absolutely perfectly and cost 1/3 as much as the Dell caddies. Since I have to buy 96 caddies, the savings are huge.
 

dba

Moderator
Feb 20, 2012
1,478
183
63
San Francisco Bay Area, California, USA
System Four: Dell c6145 Shared-Nothing Database Cluster

The two 2U servers just above the HP storage array are the new-to-me Dell c6145 machines from Vista Computer. These are high-density servers like the Dell c6100, but are based on AMD and use quad-CPU motherboards. Each c6145 houses two motherboards. Each motherboard has four AMD G34 CPU sockets, 32 DIMM slots, three x16 PCIe slots, an x8 PCIe mezzanine slot, and an external PCIe x16 connector that is pretty much useless unless you own a Dell C410x GPU chassis. These two systems combined have 16 CPUs, up to 256 server cores, 128 DIMM slots, and 288 PCIe lanes.
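Those combined totals fall straight out of the per-motherboard numbers, as this small Python sketch shows. The 16 cores per CPU is an assumption implied by "up to 256" cores across 16 sockets; everything else is from the description above.

```python
# Combined resources of two c6145 chassis, built up from per-board figures.
CHASSIS = 2
BOARDS_PER_CHASSIS = 2
CPUS_PER_BOARD = 4
CORES_PER_CPU = 16                  # ASSUMPTION: 16-core G34 parts ("up to 256")
DIMMS_PER_BOARD = 32
LANES_PER_BOARD = 3 * 16 + 8 + 16   # three x16 slots, x8 mezzanine, external x16

boards = CHASSIS * BOARDS_PER_CHASSIS
totals = {
    "cpus": boards * CPUS_PER_BOARD,
    "cores": boards * CPUS_PER_BOARD * CORES_PER_CPU,
    "dimm_slots": boards * DIMMS_PER_BOARD,
    "pcie_lanes": boards * LANES_PER_BOARD,
}

for name, value in totals.items():
    print(f"{name}: {value}")
```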

These machines look a bit silly now, barely connected to the network and with no PCIe storage cards, but that will soon change. I plan to build these servers into a shared-nothing database cluster using HP Vertica. While I'm getting excellent throughput from my Oracle database, I believe that the compression provided by a column-oriented database could easily double my effective IO throughput. With all four servers running, my "stretch goal" is 1 billion rows/second.
 

dba

Moderator
Feb 20, 2012
1,478
183
63
San Francisco Bay Area, California, USA
Infiniband and Ethernet Networking:

If you look at the bottom of Image 3, you'll see two networks: Ethernet and Infiniband.

The Ethernet network is a $71 Dell 2848 managed switch with 48 Gigabit ports. You'll see red and white cables - the former are for data and the latter are for IPMI. I use LACP wherever possible, for example the c6100 nodes. I have used up all 48 ports, and could use a few more, frankly.

The Infiniband network is a Mellanox Grid Director 4036 that I picked up from eBay. The price was excellent even before I did a lowball "best offer". For 36 ports of 40Gbit connectivity, it was a steal. I am a huge fan of Infiniband, and plan to move as much as possible to this new network; right now I'm patiently waiting for another deal on QSFP cables.

My only wish is that I had a good way to bridge the IB and Ethernet networks without using a server. If a Mellanox 4036E ever comes up for auction, and is cheap, I'll go for it.
 

dba

Moderator
Feb 20, 2012
1,478
183
63
San Francisco Bay Area, California, USA
I ran two dedicated 20A circuits to the rack and that provides plenty of power. If I were to run all of the servers at once and fully loaded - which happens never since this is essentially a lab - I would draw 3000 watts. It's more usual to have the "Corporation in a Box" running along with either the "DCDW" or one of the experimental systems, drawing 1400 watts or so. To save money I turn off the servers when they are not needed - having VMs really helps with this. Electricity is expensive in Northern California, and the rates escalate with increasing usage, so I save serious money by not leaving things running unnecessarily.
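For a rough sense of the headroom, here is a Python sketch of the power budget. The circuit voltage is not stated in this post, so the 120V figure and the 80% continuous-load derating are both assumptions; plug in your own values.

```python
# Power budget sketch for two dedicated 20A circuits.
CIRCUITS = 2
AMPS_PER_CIRCUIT = 20
VOLTS = 120               # ASSUMPTION: voltage is not stated in the post
CONTINUOUS_DERATE = 0.8   # ASSUMPTION: common 80% rule of thumb for continuous loads

budget_w = CIRCUITS * AMPS_PER_CIRCUIT * VOLTS * CONTINUOUS_DERATE

for label, draw_w in (("everything fully loaded", 3000), ("typical", 1400)):
    share = draw_w / budget_w
    print(f"{label}: {draw_w} W is {share:.0%} of a {budget_w:.0f} W budget")
```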

Some good kit in that rack ;)
What the hell does that pull at the wall? Are you using a 220~240V line?
 

marcoi

Well-Known Member
Apr 6, 2013
1,437
239
63
Gotha Florida
Is this all installed in your basement? Haha, or is it a lab at your workplace?
Either way, very nice. I'd love to play with all that hardware.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,168
5,234
113
Wow - this is an amazing post. My garage rack looks paltry in comparison (as you saw last weekend.)
 

PigLover

Moderator
Jan 26, 2011
3,012
1,314
113
Wow - this is an amazing post. My garage rack looks paltry in comparison (as you saw last weekend.)
Don't obsess over your rack. No matter how nice a rack you have you're going to meet someone with a nicer rack. Be happy with your rack or you might find yourself spending $thousands having your rack enhanced...and still not be happy. ;)
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,168
5,234
113
Don't obsess over your rack. No matter how nice a rack you have you're going to meet someone with a nicer rack. Be happy with your rack or you might find yourself spending $thousands having your rack enhanced...and still not be happy. ;)
He saw the rack... in the house. That doesn't include the colo and the bedroom that has been converted to a STH photography studio and now a working lab. I think the lab/studio room has at least 6 machines set up at this point for the STH 2013 Linux benchmarking extravaganza.
 

dba

Moderator
Feb 20, 2012
1,478
183
63
San Francisco Bay Area, California, USA
Very funny... or it would be if you were not absolutely positively talking just about server equipment... but you are... so it's not funny at all. Nope.

Don't obsess over your rack. No matter how nice a rack you have you're going to meet someone with a nicer rack. Be happy with your rack or you might find yourself spending $thousands having your rack enhanced...and still not be happy. ;)