Solarflare or Chelsio


fbcadmin

New Member
Nov 18, 2018
19
0
1
Hello

To increase the reliability of our Proxmox Ceph storage cluster, I'm looking to upgrade the Ceph network from Intel 10G Ethernet to SFP+.

We have an issue with slow requests, much like this person had in "Ceph, SolarFlare and Proxmox – slow requests are blocked". The issue has been going on since we started using Ceph in production 18 months ago. 99 times out of 100 it happens for about 20 seconds per day.

They solved the issue by using recent kernel modules from Solarflare. I had never heard of Solarflare until today, and I am impressed by their 'kernel bypass' method. However, each kernel upgrade would require a DKMS module build. I've done that in the past for ZFS. It is not hard, just a point of failure if not done prior to reboot.

Chelsio, on the other hand, has modules in Debian non-free. I am not sure how up to date those are. However, there are no debs for Solarflare.

Researching which is better is almost impossible due to all the marketing clutter. I suspect both companies have awesome products.

My question:

When comparing recent model cards, is Chelsio as good as Solarflare? By good I mean ultra-low latency, and not bothered if the Linux kernel is busy dealing with rsync or backups.


Thanks for reading and best regards, Rob.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,511
5,792
113
Just wondering what Intel driver you are using. If you are going to build drivers with each kernel update anyway, perhaps a newer Intel driver would help?

Chelsio cards have generally worked for me in firewalls, but I have not seen this Ceph issue.
 

fbcadmin

We are using the Intel driver/modules that come with the latest Proxmox/Debian.

I'll look into using the latest stable drivers from Intel.


Thank you for the suggestion.
 

fbcadmin

Also, for Ceph I use a cron job to check for slow requests. On average they occur for 15 seconds per day. I notice them when doing CLI work, as there will be a lag. They can affect data file writes, especially on a system that needs to run on Linux 2.6. The slow requests normally occur at random times and on random OSDs. However, I can make them occur by doing things like running a Proxmox backup from multiple nodes at the same time, so we schedule those so that only one node at a time is backed up.

Try this if you use Ceph.
Code:
grep "slow requests are blocked"   /var/log/ceph/ceph.log
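If you want a per-day tally from cron rather than raw grep output, something like the following might work. It is only a rough sketch, not the script I actually run: the function names are made up for illustration, and it assumes each line in ceph.log starts with a date field, which may not hold for every Ceph release.

```python
from collections import Counter

# Phrase and log path are taken from the grep command above.
PATTERN = "slow requests are blocked"
LOG_PATH = "/var/log/ceph/ceph.log"


def count_slow_requests(lines):
    """Return how many of the given log lines contain PATTERN."""
    return sum(1 for line in lines if PATTERN in line)


def count_by_first_field(lines):
    """Group matching lines by their first whitespace-separated field.

    Assumption: each log line starts with a date. If the log format
    differs, the keys are simply whatever field comes first.
    """
    counts = Counter()
    for line in lines:
        if PATTERN in line:
            counts[line.split()[0]] += 1
    return counts


if __name__ == "__main__":
    try:
        with open(LOG_PATH, errors="replace") as log:
            per_day = count_by_first_field(log)
    except OSError as exc:
        print(f"cannot read {LOG_PATH}: {exc}")
    else:
        for key, n in sorted(per_day.items()):
            print(f"{key}: {n} matching lines")
```

Note that this counts matching log lines, not seconds of blocked I/O, so treat the numbers as a rough indicator of which days were affected.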
 

aero

Active Member
Apr 27, 2016
346
86
28
54
As someone who has been using Solarflare NICs almost exclusively for going on 10 years now: they're fantastic. Kernel bypass with OpenOnload is excellent for latency.