CEPH redundant network design


charlie

Member
Jan 27, 2016
Budapest, HU
Hello,

I would like to ask for some advice on this setup.

We have a small all-HDD Ceph cluster (8x 2U nodes, each with 10 OSDs, plus 3 monitor nodes) and an 8-node (1U) compute cluster. All servers are HP.

Currently all servers have a dual-port 10GbE network card, and these are connected to two LB6M switches as follows:

There are 4 VLANs: one each for the Ceph public and private (cluster) networks, one for management traffic (configured as a tagged VLAN on top of the Ceph public network), and one for the public network (internet).
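For reference, the Ceph public/private split described above maps onto ceph.conf like this. This is only a sketch; the subnet addresses below are made up for illustration and would be whatever the two VLANs actually use:

```ini
[global]
# Hypothetical subnets - substitute the real subnets of your two VLANs.
public_network  = 10.0.10.0/24   ; Ceph public VLAN (client + monitor traffic)
cluster_network = 10.0.20.0/24   ; Ceph private VLAN (replication/recovery traffic)
```

With both options set, OSD replication and recovery traffic stays on the cluster network, which is why losing the switch carrying it takes down the whole storage cluster.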

The Ceph private network and the compute nodes connect to switch "a", which requires 16 ports (8 in the Ceph private VLAN and 8 in the internet VLAN), plus 2 ports as the internet uplink and 2 more ports interconnecting the two LB6Ms. So in total we use 20 ports. (There is also a Mikrotik router connected to a gigabit port, which does NAT for the management network, acts as a VPN server, etc. But it uses a copper 1G port, so it doesn't really matter here.)

Switch "b" is also connected to all servers; its ports' native VLAN is the Ceph public network, with the management VLAN tagged on top (this VLAN doesn't carry much traffic; it's used for internet access to upgrade servers, SSH connections, etc.). There are also the two interconnect ports, and one 10G iSCSI storage box is connected here as well, so in total we use 19 ports on this switch. And the monitor nodes are connected to this switch's gigabit ports.

This setup has been working for almost 6 months now, without any issue, downtime, etc.

However, as you can see, the network setup is not redundant: if one switch goes down, the entire storage cluster stops (redundant internet connectivity doesn't really matter).

I'm thinking about the best way forward. I could simply add an extra dual-port 10GbE card to the Ceph nodes and copy the existing layout, but in that case I'd need two new 48-port 10GbE switches, which aren't exactly cheap.

My other plan is to create a bond between the two interfaces and run the Ceph networks as tagged VLANs on top of it. The documentation doesn't recommend this (although the network capacity should be enough, even during recovery, given the HDDs' speed). In this case I wouldn't need to purchase new switches, but I would lose the 10Gbit public/internet network (I don't really want to mix public and storage traffic on the same 10Gbit links). I can live with that, though; 2x1Gbit internet would also be fine. But 10G is better :)
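The bond-plus-tagged-VLANs idea could look roughly like the sketch below (Debian-style /etc/network/interfaces; interface names, VLAN IDs, and addresses are all assumptions for illustration). One caveat worth noting: with two independent LB6M switches and no MLAG/stacking, an 802.3ad (LACP) bond across both switches won't work, so active-backup is the safe bond mode; it gives switch redundancy but not aggregated bandwidth.

```
# Sketch only - assumed NIC names (ens1f0/ens1f1), VLAN IDs, and subnets.
auto bond0
iface bond0 inet manual
    bond-slaves ens1f0 ens1f1
    bond-mode active-backup    # two independent switches, so no LACP
    bond-miimon 100

# Ceph public network, tagged VLAN 10 on top of the bond
auto bond0.10
iface bond0.10 inet static
    address 10.0.10.11/24

# Ceph cluster (private) network, tagged VLAN 20 on top of the bond
auto bond0.20
iface bond0.20 inet static
    address 10.0.20.11/24
```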

I'm open to any idea, case study, or anything else on what the best or optimal solution would be.
 

MiniKnight

Well-Known Member
Mar 30, 2012
NYC
Is your 10GbE switch Base-T or SFP+?

And if it's not SFP+ are you willing to think about something that isn't Base-T?

You can get one for around $800: Arista DCS-7050QX-32 Switch 32-Ports 40GbE QSFP+ 2xAC PWR 4x Fans Switch | eBay

That'll be cheaper than a 48-port 10Gbase-T switch.

Then get Mellanox ConnectX-3 40GbE cards for under $100 per server. Mellanox Dual 40 56GbE QSFP CX314A ConnectX 3 40GB MCX314A BCBT TQ1314 | eBay

Then a few cables.

You'd have a 4x-or-faster network. QSFP+ and SFP+ are lower latency and lower power than Base-T. For 8 nodes the entire setup is gonna cost you under $2,000 including cables.

If you just want another 10G network: a Mikrotik CRS317-1G-16S+RM Cloud Router Switch, 16x SFP+, 1x Gbit LAN, Rackmount | eBay plus 8x Mellanox MCX311A SFP+ cards (around $35-40 each) and cables will be under $800. I got ours two weeks ago. It seems way more basic than the Arista, but if you aren't going to do inter-VLAN routing, it'll be OK.
 

charlie

Member
Jan 27, 2016
Budapest, HU
The Quanta LB6M is a 24-port SFP+ switch with 3 copper gigabit ports. I've never used copper-based networking for storage because its latency is terrible (only in test environments where it doesn't matter).

I don't really want to drop my existing SFP+ NIC and cable infrastructure. I have a lot of HP dual-port 10GbE NICs, so I don't need to buy any.

If I want another 10G network, I can simply buy another LB6M; that's not too hard.

I just don't know what the best way is: use the two existing switches with a VLAN- and bonding-based design, simply add two extra switches to create the redundancy, or upgrade the 24-port switches to 48-port ones so that every port can keep its untagged VLAN, etc.
 

Jeggs101

Well-Known Member
Dec 29, 2010
@charlie the 48-port second switches are what I'd do. Limit complexity. On @MiniKnight 's suggestion, the other benefit he's hinting at is that a faster network is also a good thing for Ceph.

You can also use SFP+ breakout cables with that Arista switch. I think you can get 64-96 SFP+ ports from it, plus a few 40GbE-only ports. So that would cover your 48-port scenario too.
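The port math behind the 64-96 figure is just breakout arithmetic: the DCS-7050QX-32 has 32 QSFP+ ports, and each one broken out yields 4x SFP+. A quick sketch (the split between broken-out and native-40G ports is an assumption for illustration; the switch may limit which ports support breakout):

```python
# Breakout port math for a 32-port QSFP+ switch such as the DCS-7050QX-32.
TOTAL_QSFP = 32

def sfp_ports(qsfp_reserved_for_40g: int) -> int:
    """SFP+ ports available if the non-reserved QSFP+ ports are broken out 4x."""
    return (TOTAL_QSFP - qsfp_reserved_for_40g) * 4

print(sfp_ports(8))   # keep 8 native 40G ports -> 96 SFP+ ports
print(sfp_ports(16))  # keep 16 native 40G ports -> 64 SFP+ ports
```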

We're working on having 2x Aristas with a 16-node cluster, then using the remaining ports to uplink to other infrastructure at 40GbE and to provide breakout to 10G and 1G ports via SFP+.

Here's the thread on them: Arista DCS-7050QX-32

They're very popular.