FreeNAS 9.3 (CIFS poor 10GBe performance) Looking for tweaks / hints / whatever


Kristian

Introduction (Skip if you don't care)
After my xpenology box nearly lost all its data, it took me a while to decide on the next OS I wanted to use.
As I don't have much time to tinker around and learn things, it had to be something that has a GUI and doesn't involve too much learning.
It had to let you install plugins, because my hardware is not capable of VT-d (and I need Plex and some other small things). And as I have 12+ spinning disks, I wanted at least to be able to spin down the disks for the night. Another requirement was to achieve between 400 and 500 MB/s in sequential reads and writes.
I tried OMV, Rockstor, FreeNAS, Windows and OpenATTIC.

Well, FreeNAS seems to suit my needs the most.
So here's what I have:

Hardware
Supermicro A1SAM 2750C
Intel X520 10 GBe NIC (in the PCIe 2.0 x4 slot)
LSI 9211 8i (in the PCIe 2.0 x8 slot) -- connected to a Supermicro BPN SAS3 846 backplane (got it real cheap)
32GB ECC Kingston Value RAM
500W gold certified PSU

8x Western Digital Red 4TB (WD40EFRX)
4x HGST H3IKNAS40003272SE 4 TB

Performance
Performance RAM Disk to RAM Disk copy from another machine, both directions:
750 MB/s sustained

Performance iPerf:
Windows (after driver tweaks and mtu 9000)
400 MB/s sustained

FreeNAS (mtu 9000)
400 MB/s sustained (I would be okay with that)

Real World Performance:
Windows (after driver tweaks and mtu 9000) to a storage spaces stripe of 12 disks
starting at 600 MB/s dropping down to 400 MB/s (I would be okay with that)

FreeNAS (MTU 9000) to a share out of a stripe of 12 disks
between 130 MB/s and 190 MB/s (not okay!)

FreeNAS (MTU 9000) to a RAID Z2 of 12 disks
between 130 MB/s and 190 MB/s (not okay!)

FreeNAS (MTU 9000) to a RAID out of 2vdevs RAID Z2 of 6 disks
between 130 MB/s and 190 MB/s (not okay!)


What am I missing here?
I am completely without a clue...
If you have any ideas, please feel free to share.
 

archangel.dmitry

I might be wrong here but it does not seem that you have enough RAM (ZFS) for that amount of space. It is recommended to have 1GB of RAM per 1TB of storage + overhead of OS.

To make the situation simple, remove 8 WD drives and run test with ZFS using 4 HGSTs.
 

markarr

In BSD the CIFS implementation is only single-threaded, so for FreeNAS clock speed is crucial for CIFS transfer speed. I would watch top while transferring that file and see if you are maxing out a core.
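On FreeBSD/FreeNAS that's easy to check from a shell. A sketch (flags as in FreeBSD's top; run it while a large copy is in progress):

```shell
# One batch-mode snapshot of per-CPU and per-thread usage:
#   -P  per-CPU statistics instead of an aggregate
#   -H  show individual threads (one smbd process serves each CIFS connection)
#   -S  include system processes
#   -b  batch output (print once and exit)
top -b -PSH
# A single core pinned near 100% while the rest idle means the
# transfer is CPU-bound on that one smbd thread.
```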
 

PnoT

In BSD the CIFS implementation is only single-threaded, so for FreeNAS clock speed is crucial for CIFS transfer speed. I would watch top while transferring that file and see if you are maxing out a core.
I think they recommend staying away from the atom based SOC due to these types of issues as well. There are some posts that recommend Omni-OS + Napp-it for great iSCSI and CIFS performance.
 

Rain

I might be wrong here but it does not seem that you have enough RAM (ZFS) for that amount of space. It is recommended to have 1GB of RAM per 1TB of storage + overhead of OS.
This is really a rumor that needs to stop being spread. In an enterprise-like workload, with extremely high IOPs, it is indeed recommended to have a lot of RAM, but having a lot of RAM is recommended with any file system that's under high load! Every modern OS caches in RAM these days, even Windows, so the more RAM, the more can be cached! If @Kristian is using this for general storage, or even a light VM workload, he could probably get by with 16GB and be perfectly fine; maybe less!

On the single-threaded front: I've been testing 10GbE with some ConnectX-2 cards in various configurations recently. Samba4, ramdisk-to-ramdisk, 9000MTU can easily do nearly 10Gb/s sequential to and from a L5520 system. The L5520 is quite comparable to the C2750 in single-threaded performance, if I'm not mistaken.

iperf, even Windows-to-FreeNAS, should run at 10Gb/s. Before you do anything, I'd figure out why you're only getting ~4Gb/s; that's not right. What NICs are you using in your server and workstation? Switches?
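On the RAM question above: rather than guessing, FreeBSD exposes ZFS's actual cache usage via sysctl. A sketch (OID names as on FreeBSD; check them on your build):

```shell
# How much RAM the ZFS ARC is actually using vs. allowed to use (bytes):
sysctl kstat.zfs.misc.arcstats.size vfs.zfs.arc_max

# Hit/miss counters, to judge whether more RAM would even help
# for this workload:
sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses
```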
 

markarr

On the single-threaded front: I've been testing 10GbE with some ConnectX-2 cards in various configurations recently. Samba4, ramdisk-to-ramdisk, 9000MTU can easily do nearly 10Gb/s sequential to and from a L5520 system. The L5520 is quite comparable to the C2750 in single-threaded performance, if I'm not mistaken.
Some OSes do support multi-threaded CIFS, unlike FreeBSD. So which one were you using to test this?

Those ConnectX-2 cards do not like BSD.

Also, the single-thread performance of the C2750 is nearly half that of the L5520, so that would make a significant difference.
 

archangel.dmitry

This is really a rumor that needs to stop being spread. In an enterprise-like workload, with extremely high IOPs, it is indeed recommended to have a lot of RAM, but having a lot of RAM is recommended with any file system that's under high load! Every modern OS caches in RAM these days, even Windows, so the more RAM, the more can be cached! If @Kristian is using this for general storage, or even a light VM workload, he could probably get by with 16GB and be perfectly fine; maybe less!

On the single-threaded front: I've been testing 10GbE with some ConnectX-2 cards in various configurations recently. Samba4, ramdisk-to-ramdisk, 9000MTU can easily do nearly 10Gb/s sequential to and from a L5520 system. The L5520 is quite comparable to the C2750 in single-threaded performance, if I'm not mistaken.

iperf, even Windows-to-FreeNAS, should run at 10Gb/s. Before you do anything, I'd figure out why you're only getting ~4Gb/s; that's not right. What NICs are you using in your server and workstation? Switches?
It is not a rumor, it is explicitly written on freenas.org
 

Kristian

I might be wrong here but it does not seem that you have enough RAM (ZFS) for that amount of space. It is recommended to have 1GB of RAM per 1TB of storage + overhead of OS.

To make the situation simple, remove 8 WD drives and run test with ZFS using 4 HGSTs.
The 1GB-per-TB rule is no longer essential as far as I know.
Anyway: I removed the WDs and ran the tests with 5 HGSTs in RAID 0.

RAM Disk to ZFS over 10GBe with MTU 9000: 200MB/s decreasing to 180 MB/s sustained


In BSD the CIFS implementation is only single-threaded, so for FreeNAS clock speed is crucial for CIFS transfer speed. I would watch top while transferring that file and see if you are maxing out a core.
I don't think FreeNAS lets you see the utilisation of individual cores.
I could only find overall CPU usage.
But when using parallel streams in iPerf it makes a difference (more on that below)

I think they recommend staying away from the atom based SOC due to these types of issues as well. There are some posts that recommend Omni-OS + Napp-it for great iSCSI and CIFS performance.
The FreeNAS Mini (a commercial product sold by the developers) uses an Avoton C2750 board:
iXsystems, Inc. – Enterprise Storage & Servers – FreeNAS Mini

I will have a closer look at OmniOS and Napp-IT, but I think that is a little more complicated than I am comfortable with.


iperf, even Windows-to-FreeNAS, should run at 10Gb/s. Before you do anything, I'd figure out why you're only getting ~4Gb/s; that's not right. What NICs are you using in your server and workstation? Switches?
Topology is not really straightforward.

Server -> Intel X520 -> Intel Transceiver -> OM3 Fibre -> Intel Transceiver -> D-Link DGS-1510-28X -> Intel transceiver -> OM3 -> Intel Transceiver -> Microtik CRS226-24G-2S+IN -> Cisco DAC -> Intel X520 -> Workstation (Intel 3770k, 16GB RAM, Asus Board)

Yeah, too many uncertain parts...
Tomorrow I will simplify down to:
Server -> Intel X520 -> Intel Transceiver -> OM3 Fibre -> Intel Transceiver -> Intel X520 -> Workstation and see if that changes anything.
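With that many hops, it is also worth verifying that jumbo frames actually survive end-to-end before blaming CIFS. A sketch (the addresses are the ones from the iperf runs below; substitute your own):

```shell
# A 9000-byte MTU leaves 8972 bytes of ICMP payload
# (9000 - 20-byte IP header - 8-byte ICMP header).

# From the Windows workstation (-f = don't fragment, -l = payload size):
ping -f -l 8972 192.168.178.60

# From the FreeBSD/FreeNAS server (-D = don't fragment, -s = payload size):
ping -D -s 8972 192.168.178.30

# If these fail while plain pings work, some switch or transceiver in the
# path is silently dropping or fragmenting jumbo frames.
```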

Some more iPerf results (experimenting with the number of parallel streams).
As the number of streams increases, throughput improves.

Sounds to me like a problem with the network topology...
But does that explain why Windows-to-Windows gets around 600 MB/s while Windows-to-FreeNAS runs at 180 MB/s?

1 parallel stream

Code:
bin/iperf.exe -c 192.168.178.60 -P 1 -i 1 -p 5001 -f G -t 10 -T 1
------------------------------------------------------------
Client connecting to 192.168.178.60, TCP port 5001
TCP window size: 0.00 GByte (default)
------------------------------------------------------------
[240] local 192.168.178.30 port 58187 connected with 192.168.178.60 port 5001
[ ID] Interval  Transfer  Bandwidth
[240]  0.0- 1.0 sec  0.31 GBytes  0.31 GBytes/sec
[240]  1.0- 2.0 sec  0.29 GBytes  0.29 GBytes/sec
[240]  2.0- 3.0 sec  0.30 GBytes  0.30 GBytes/sec
[240]  3.0- 4.0 sec  0.29 GBytes  0.29 GBytes/sec
[240]  4.0- 5.0 sec  0.29 GBytes  0.29 GBytes/sec
[240]  5.0- 6.0 sec  0.29 GBytes  0.29 GBytes/sec
[240]  6.0- 7.0 sec  0.29 GBytes  0.29 GBytes/sec
[240]  7.0- 8.0 sec  0.29 GBytes  0.29 GBytes/sec
[240]  8.0- 9.0 sec  0.29 GBytes  0.29 GBytes/sec
[240]  9.0-10.0 sec  0.30 GBytes  0.30 GBytes/sec
[240]  0.0-10.0 sec  2.94 GBytes  0.29 GBytes/sec
Done.

2 parallel streams

Code:
bin/iperf.exe -c 192.168.178.60 -P 2 -i 1 -p 5001 -f G -t 10 -T 1
------------------------------------------------------------
Client connecting to 192.168.178.60, TCP port 5001
TCP window size: 0.00 GByte (default)
------------------------------------------------------------
[244] local 192.168.178.30 port 58241 connected with 192.168.178.60 port 5001
[236] local 192.168.178.30 port 58240 connected with 192.168.178.60 port 5001
[ ID] Interval  Transfer  Bandwidth
[236]  0.0- 1.0 sec  0.31 GBytes  0.31 GBytes/sec
[244]  0.0- 1.0 sec  0.33 GBytes  0.33 GBytes/sec
[SUM]  0.0- 1.0 sec  0.64 GBytes  0.64 GBytes/sec
[244]  1.0- 2.0 sec  0.30 GBytes  0.30 GBytes/sec
[236]  1.0- 2.0 sec  0.30 GBytes  0.30 GBytes/sec
[SUM]  1.0- 2.0 sec  0.59 GBytes  0.59 GBytes/sec
[244]  2.0- 3.0 sec  0.29 GBytes  0.29 GBytes/sec
[236]  2.0- 3.0 sec  0.29 GBytes  0.29 GBytes/sec
[SUM]  2.0- 3.0 sec  0.58 GBytes  0.58 GBytes/sec
[236]  3.0- 4.0 sec  0.29 GBytes  0.29 GBytes/sec
[244]  3.0- 4.0 sec  0.29 GBytes  0.29 GBytes/sec
[SUM]  3.0- 4.0 sec  0.58 GBytes  0.58 GBytes/sec
[236]  4.0- 5.0 sec  0.28 GBytes  0.28 GBytes/sec
[244]  4.0- 5.0 sec  0.29 GBytes  0.29 GBytes/sec
[SUM]  4.0- 5.0 sec  0.57 GBytes  0.57 GBytes/sec
[244]  5.0- 6.0 sec  0.29 GBytes  0.29 GBytes/sec
[236]  5.0- 6.0 sec  0.29 GBytes  0.29 GBytes/sec
[SUM]  5.0- 6.0 sec  0.58 GBytes  0.58 GBytes/sec
[244]  6.0- 7.0 sec  0.28 GBytes  0.28 GBytes/sec
[236]  6.0- 7.0 sec  0.29 GBytes  0.29 GBytes/sec
[ ID] Interval  Transfer  Bandwidth
[SUM]  6.0- 7.0 sec  0.57 GBytes  0.57 GBytes/sec
[244]  7.0- 8.0 sec  0.28 GBytes  0.28 GBytes/sec
[236]  7.0- 8.0 sec  0.29 GBytes  0.29 GBytes/sec
[SUM]  7.0- 8.0 sec  0.57 GBytes  0.57 GBytes/sec
[236]  8.0- 9.0 sec  0.29 GBytes  0.29 GBytes/sec
[244]  8.0- 9.0 sec  0.28 GBytes  0.28 GBytes/sec
[SUM]  8.0- 9.0 sec  0.57 GBytes  0.57 GBytes/sec
[236]  9.0-10.0 sec  0.29 GBytes  0.29 GBytes/sec
[244]  9.0-10.0 sec  0.29 GBytes  0.29 GBytes/sec
[SUM]  9.0-10.0 sec  0.57 GBytes  0.57 GBytes/sec
[236]  0.0-10.0 sec  2.91 GBytes  0.29 GBytes/sec
[244]  0.0-10.0 sec  2.93 GBytes  0.29 GBytes/sec
[SUM]  0.0-10.0 sec  5.83 GBytes  0.58 GBytes/sec
Done.
4 parallel streams

Code:
bin/iperf.exe -c 192.168.178.60 -P 4 -i 1 -p 5001 -f G -t 10 -T 1
------------------------------------------------------------
Client connecting to 192.168.178.60, TCP port 5001
TCP window size: 0.00 GByte (default)
------------------------------------------------------------
[260] local 192.168.178.30 port 58300 connected with 192.168.178.60 port 5001
[252] local 192.168.178.30 port 58299 connected with 192.168.178.60 port 5001
[244] local 192.168.178.30 port 58298 connected with 192.168.178.60 port 5001
[236] local 192.168.178.30 port 58297 connected with 192.168.178.60 port 5001
[ ID] Interval  Transfer  Bandwidth
[244]  0.0- 1.0 sec  0.26 GBytes  0.26 GBytes/sec
[252]  0.0- 1.0 sec  0.25 GBytes  0.25 GBytes/sec
[260]  0.0- 1.0 sec  0.26 GBytes  0.26 GBytes/sec
[236]  0.0- 1.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  0.0- 1.0 sec  1.03 GBytes  1.03 GBytes/sec
[236]  1.0- 2.0 sec  0.26 GBytes  0.26 GBytes/sec
[244]  1.0- 2.0 sec  0.25 GBytes  0.25 GBytes/sec
[260]  1.0- 2.0 sec  0.26 GBytes  0.26 GBytes/sec
[252]  1.0- 2.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  1.0- 2.0 sec  1.03 GBytes  1.03 GBytes/sec
[252]  2.0- 3.0 sec  0.26 GBytes  0.26 GBytes/sec
[236]  2.0- 3.0 sec  0.26 GBytes  0.26 GBytes/sec
[260]  2.0- 3.0 sec  0.26 GBytes  0.26 GBytes/sec
[244]  2.0- 3.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  2.0- 3.0 sec  1.03 GBytes  1.03 GBytes/sec
[260]  3.0- 4.0 sec  0.26 GBytes  0.26 GBytes/sec
[236]  3.0- 4.0 sec  0.26 GBytes  0.26 GBytes/sec
[252]  3.0- 4.0 sec  0.26 GBytes  0.26 GBytes/sec
[244]  3.0- 4.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  3.0- 4.0 sec  1.03 GBytes  1.03 GBytes/sec
[ ID] Interval  Transfer  Bandwidth
[260]  4.0- 5.0 sec  0.26 GBytes  0.26 GBytes/sec
[244]  4.0- 5.0 sec  0.26 GBytes  0.26 GBytes/sec
[236]  4.0- 5.0 sec  0.26 GBytes  0.26 GBytes/sec
[252]  4.0- 5.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  4.0- 5.0 sec  1.03 GBytes  1.03 GBytes/sec
[260]  5.0- 6.0 sec  0.26 GBytes  0.26 GBytes/sec
[244]  5.0- 6.0 sec  0.26 GBytes  0.26 GBytes/sec
[252]  5.0- 6.0 sec  0.26 GBytes  0.26 GBytes/sec
[236]  5.0- 6.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  5.0- 6.0 sec  1.03 GBytes  1.03 GBytes/sec
[252]  6.0- 7.0 sec  0.26 GBytes  0.26 GBytes/sec
[236]  6.0- 7.0 sec  0.26 GBytes  0.26 GBytes/sec
[260]  6.0- 7.0 sec  0.26 GBytes  0.26 GBytes/sec
[244]  6.0- 7.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  6.0- 7.0 sec  1.03 GBytes  1.03 GBytes/sec
[260]  7.0- 8.0 sec  0.26 GBytes  0.26 GBytes/sec
[244]  7.0- 8.0 sec  0.26 GBytes  0.26 GBytes/sec
[252]  7.0- 8.0 sec  0.26 GBytes  0.26 GBytes/sec
[236]  7.0- 8.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  7.0- 8.0 sec  1.03 GBytes  1.03 GBytes/sec
[ ID] Interval  Transfer  Bandwidth
[252]  8.0- 9.0 sec  0.26 GBytes  0.26 GBytes/sec
[244]  8.0- 9.0 sec  0.26 GBytes  0.26 GBytes/sec
[236]  8.0- 9.0 sec  0.26 GBytes  0.26 GBytes/sec
[260]  8.0- 9.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  8.0- 9.0 sec  1.03 GBytes  1.03 GBytes/sec
[236]  9.0-10.0 sec  0.26 GBytes  0.26 GBytes/sec
[252]  9.0-10.0 sec  0.26 GBytes  0.26 GBytes/sec
[260]  9.0-10.0 sec  0.26 GBytes  0.26 GBytes/sec
[244]  9.0-10.0 sec  0.26 GBytes  0.26 GBytes/sec
[SUM]  9.0-10.0 sec  1.03 GBytes  1.03 GBytes/sec
[236]  0.0-10.0 sec  2.59 GBytes  0.26 GBytes/sec
[244]  0.0-10.0 sec  2.56 GBytes  0.26 GBytes/sec
[252]  0.0-10.0 sec  2.60 GBytes  0.26 GBytes/sec
[260]  0.0-10.0 sec  2.56 GBytes  0.26 GBytes/sec
[SUM]  0.0-10.0 sec  10.3 GBytes  1.03 GBytes/sec
Done.
 

Rain

Some OSes do support multi-threaded CIFS, unlike FreeBSD. So which one were you using to test this?
This was tested with Samba4 on Linux, which I can confirm is definitely single-threaded. In fact, the Samba crew has said that they don't have any plans to make single-stream transfers multithreaded (at least in the near future) due to various architectural problems. Edit: Also, I made zero tweaks to the Samba4 config other than quickly adding a guest share pointing to /tmp; no tweaks to socket options or anything like that. Samba4 is so much better than Samba3 in the performance department!

Those Connectx-2 cards do not like BSD.
They don't like FreeBSD 9.x. Apparently everything has been sorted out in FreeBSD 10; FreeNAS just hasn't hopped on the bandwagon yet. No matter, though, as @Kristian is using Intel cards. That said, I've heard FreeBSD 9.x doesn't like Intel cards either (again, fixed in FreeBSD 10), so that could be part of the problem.

Also, the single-thread performance of the C2750 is nearly half that of the L5520, so that would make a significant difference.
Not exactly: http://www.servethehome.com/intel-atom-c2750-8-core-avoton-rangeley-benchmarks-fast-power/. Even if it were half though, based on my testing, it would be well more than enough for full speed, sequential Samba/CIFS performance.

It is not a rumor, it is explicitly written on freenas.org
This statement has been around a long time and has been debunked many times. It is very, very workload-dependent. Like I said, if you're going to be running a couple dozen VMs on your server, you'll want the RAM; ZFS will thank you for it. Otherwise, if you're storing pictures, home movies, and/or other media or backups, you'll be fine with far less than the so-called 1GB-per-1TB "rule." FreeNAS is really just covering their behinds with that recommendation in case someone uses FreeNAS in the enterprise space.

--------------------------

@Kristian Yeah, I'd cut as much out of your network path as you can and see if things improve. I suppose it's possible you could have a faulty switch or something, but that would indeed be a bit unexpected. You're sure you've got up-to-date Intel drivers installed on your workstation, correct? Just because Windows recognizes the card doesn't mean you'll get great performance straight out of the gate. (Windows 10 sees ConnectX-2 cards just fine, but without proper drivers they max out at about 6Gb/s, for example.)

You could also boot into a Linux LiveCD (Ubuntu would be fine), install Samba4, and create a share pointing to /tmp to test as well; that would narrow down whether it's BSD that needs tweaking or something else in your network.
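Setting up that throwaway share from an Ubuntu LiveCD could look something like this (a sketch; the share name `ramtest` is arbitrary):

```shell
# Install Samba (any recent Ubuntu ships Samba4)
sudo apt-get install -y samba

# Append a wide-open guest share backed by /tmp (RAM-backed on a LiveCD)
sudo tee -a /etc/samba/smb.conf > /dev/null <<'EOF'
[ramtest]
   path = /tmp
   guest ok = yes
   read only = no
EOF

sudo service smbd restart
# From the Windows box, copy a large file to \\<livecd-ip>\ramtest
```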
 

markarr

I don't think FreeNAS lets you see the utilisation of individual cores.
I could only find overall CPU usage.
But when using parallel streams in iPerf it makes a difference (more on that below)


The FreeNAS Mini (a commercial product sold by the developers) uses an Avoton C2750 board:
iXsystems, Inc. – Enterprise Storage & Servers – FreeNAS Mini
If you run top -P it will give you the per-core breakdown.

I would take any performance claims about the FreeNAS Mini with a grain of salt.
 

markarr

Not exactly: http://www.servethehome.com/intel-atom-c2750-8-core-avoton-rangeley-benchmarks-fast-power/. Even if it were half though, based on my testing, it would be well more than enough for full speed, sequential Samba/CIFS performance.
I doubt that CPU usage scales linearly with file-transfer speed. Not necessarily the same situation, but I had an E5-2418L and my CIFS transfers would cap out at ~90MB/s; I swapped the processor for an E5-1410 and then I could max out the gigabit connection between the two machines. That was a 2.0 vs 2.8 GHz difference, so single-thread speed matters more for CIFS on FreeBSD than anything else, provided the storage backend can keep up.
 

Rain

@markarr, I don't doubt your past experience, but for fun, let's do some science! The board I'm testing with happens to be an old Gigabyte X58-UD5 I had lying around, so it's got all the overclocking bells and whistles. I hopped into the BIOS and set the core multiplier to 12. Keeping the base clock at the stock 133MHz, this results in the L5520 being underclocked to ~1.6GHz. To further cripple things, I disabled all but two cores, disabled Hyper-Threading, and disabled Turbo Boost.

Target machine: Ubuntu 14.04 LTS, Samba4, sharing /tmp, no modifications to default Samba configuration. L5520 @ 1.6GHz as explained above.
Client: Windows 10 (Windows 8 & above support SMB3), i7-3930k @ 4.5GHz
MTU on both machines is set to 9000 and they're directly connected -- no switches. Single-port ConnectX-2 cards were used.

Results:
[benchmark screenshots were attached here; not reproduced]

The C2750 shouldn't have any problems with 10GbE & Samba4 -- rule that out. Edit: Unless FreeBSD/FreeNAS's Samba4 port is terrible, which is possible.
 

vikingboy

I use a 10-drive RAIDZ2 array with an E5-1630 processor and 32GB RAM and manage to pretty much max out the 10GbE line.
I found 10GbE to be far from plug-and-play though, and I needed to tune both ends of the link to raise performance from around where you are now (I use a Mac and a Thunderbolt > Intel X520 adapter). I think markarr above, who mentioned single-threaded CIFS under BSD, is right though; I imagine your processor is the bottleneck. Take a look at the number of interrupts in 'top' and you'll probably see a single core hitting the ceiling.
 

canta

Just a comment on Samba:
as already mentioned, Samba is single-threaded on all platforms: Linux, *BSD, and others.

An Atom is not really a good candidate for Samba; it is OK when the box is mostly idle and only serving Samba.
When other services are running, the Atom becomes a bottleneck, since Samba is single-threaded and relies on one CPU core that may be shared with other services.
 

gea

Just a comment on Samba:
as already mentioned, Samba is single-threaded on all platforms: Linux, *BSD, and others.

An Atom is not really a good candidate for Samba; it is OK when the box is mostly idle and only serving Samba.
When other services are running, the Atom becomes a bottleneck, since Samba is single-threaded and relies on one CPU core that may be shared with other services.
A multithreaded, kernel-based alternative to Samba is the Solaris CIFS server in Oracle Solaris, OmniOS or NexentaStor. OmniOS is currently SMB1; Nexenta and Solaris 11.3 are SMB 2.1. The most feature-rich is Solaris 11.3, as it includes real ZFS encryption, SMB 2.1, LZ4 compression and fast sequential resilvering. 11.3 is currently beta and may contain some debugging code, so it may not be as fast as possible. A stable 11.3 is expected this year.
 

canta

A multithreaded, kernel-based alternative to Samba is the Solaris CIFS server in Oracle Solaris, OmniOS or NexentaStor. OmniOS is currently SMB1; Nexenta and Solaris 11.3 are SMB 2.1. The most feature-rich is Solaris 11.3, as it includes real ZFS encryption, SMB 2.1, LZ4 compression and fast sequential resilvering. 11.3 is currently beta and may contain some debugging code, so it may not be as fast as possible. A stable 11.3 is expected this year.
Samba itself never runs a single transfer across multiple threads...
If you are talking about Solaris CIFS, you are talking about something outside the Samba world.

I was talking about Samba only.

Please point to any hints that Samba will be multithreaded in the future.
As usual, I don't buy marketing gimmicks; show me real results and I will agree.
 

Rain

Seriously, I'd like to see some proof/benchmarks of Samba4 (the key here being Samba4, which is much more performant than Samba3) on an Avoton CPU before I agree that it's really the issue here. Samba gets a bad rap for being a CPU hog (it really only is for random and small IO, because it isn't threaded), and the Avoton CPUs get a bad rap for being slow (and they really aren't that slow). I'm all ears if someone actually tests it, but blindly making statements like this based on past experience with Samba3, or on what you've heard others say, is meaningless if you haven't properly tested it yourself or lack a source that has and documented it.

We need to wait and hear back from @Kristian to see what his further testing reveals.
 

canta

Seriously, I'd like to see some proof/benchmarks of Samba4 (the key here being Samba4, which is much more performant than Samba3) on an Avoton CPU before I agree that it's really the issue here. Samba gets a bad rap for being a CPU hog (it really only is for random and small IO, because it isn't threaded), and the Avoton CPUs get a bad rap for being slow (and they really aren't that slow). I'm all ears if someone actually tests it, but blindly making statements like this based on past experience with Samba3, or on what you've heard others say, is meaningless if you haven't properly tested it yourself or lack a source that has.

We need to wait and hear back from @Kristian to see what his further testing reveals.
Atom has a bad reputation for CPU power :D...
The interesting thing is that Intel is pushing Atom-on-steroids cores, renamed Celeron/Pentium, in the latest Bay Trail and the new Braswell.
I am surprised how much more powerful the latest Bay Trail is than the older Atoms. Mine is running Proxmox with 1 router VM, 1 Ubuntu VM (with an OpenVPN client), and 1 light-duty VM that updates DDNS and handles minor tasks.

My speculation: Atom will end up in Android or embedded tablets (or phones) to compete with RISC or MIPS.
 

Rain

Atom has a bad reputation for CPU power :D...
The interesting thing is that Intel is pushing Atom-on-steroids cores, renamed Celeron/Pentium, in the latest Bay Trail and the new Braswell.
I am surprised how much more powerful the latest Bay Trail is than the older Atoms. Mine is running Proxmox with 1 router VM, 1 Ubuntu VM (with an OpenVPN client), and 1 light-duty VM that updates DDNS and handles minor tasks.

My speculation: Atom will end up in Android or embedded tablets (or phones) to compete with RISC or MIPS.
Right, but look at where the C2750 sits in the single-threaded benchmarks here: http://linux-bench.com/results.html. It's roughly two-thirds of an L5520; as shown above, I cut an L5520 down to 1.6GHz (roughly 2/3 of stock, when you consider it jumps between 2.26GHz & 2.4GHz) and it can still happily handle 10Gb speeds sequentially over Samba. I'm aware this is not perfectly scientific due to differences in cache, IPC, etc. (I'd test with a C2750 if I could), but I still can't see the C2750 performing much worse, if at all, than the configuration I tested above. Someone needs to actually test on a C2750 -- this ain't a D525 we're talking about here; times have changed.
 

markarr

Seriously, I'd like to see some proof/benchmarks of Samba4 (the key here being Samba4, which is much more performant than Samba3) on an Avoton CPU before I agree that it's really the issue here.
Samba 4 has been in FreeNAS since 9.2, which came out close to two years ago. I don't have an Avoton; this was just what I had running in my lab, with no tweaking.

Samba gets a bad rap for being a CPU hog (it really only is for random and small IO, because it isn't threaded), and the Avoton CPUs get a bad rap for being slow (and they really aren't that slow). I'm all ears if someone actually tests it, but blindly making statements like this based on past experience with Samba3, or on what you've heard others say, is meaningless if you haven't properly tested it yourself or lack a source that has and documented it.
Avotons are not yesterday's Celerons, true; Intel did a good job on performance per watt, but most E3s still walk all over it.

It's not really meaningless, since CIFS performance issues are plastered all over the FreeNAS forums. I don't have any 10G gear at home to test it with.