TrueNAS with new Intel P5800X Optane

jpmomo

Active Member
Aug 12, 2018
286
97
28
Hello,
I am looking for some help with optimizing a new single-server TrueNAS setup. The drives and hardware currently available include the following:
1. Intel P5800X 800GB Optane drive
2. Samsung PM1735 3.2TB PCIe 4.0 NVMe
3. 2 x Seagate IronWolf Pro 10TB
4. Server with 2 x AMD EPYC Rome 7742 64-core CPUs, 512GB DDR4 3200MHz RAM
5. Mellanox ConnectX-5 CDAT dual-port 100GbE PCIe 4.0 NIC

The goal is to optimize storage speed. We are not sure how best to utilize the Optane drive, or the Samsung PM1735, and we also need to make sure that the two Seagate HDDs don't slow everything down.
Thanks for any suggestions.
 

i386

Well-Known Member
Mar 18, 2016
2,658
774
113
32
Germany
Performance means all SSDs and SAME (Stripe And Mirror Everything, a.k.a. RAID 10/striped mirrors).

Which layout do you want to use? I'm asking because I see 2 SSDs, 2 HDDs, 128 CPU cores and a 100GbE NIC, and I can't see how a performant storage solution can be implemented (other than using the Optane for storage and not for caching).
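For reference, the "SAME"/striped-mirror layout would look roughly like this with four identical drives (a sketch only; device and pool names are hypothetical, and on TrueNAS you would normally build this through the pool-creation UI rather than the raw CLI):

```shell
# RAID 10 / striped mirrors: two mirror vdevs, striped together by ZFS.
# Reads and writes are spread across both mirrors; either drive in a
# mirror can fail without data loss.
zpool create -o ashift=12 tank \
  mirror /dev/nvme0n1 /dev/nvme1n1 \
  mirror /dev/nvme2n1 /dev/nvme3n1
```

Adding more mirror pairs later (`zpool add tank mirror ...`) scales both capacity and parallel throughput, which is why this layout is the usual answer for performance pools.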
 
  • Like
Reactions: T_Minus

Rand__

Well-Known Member
Mar 6, 2014
5,600
1,228
113
More details, please ;)

Optimizing for read or write? Number of users (parallel?), type of data (VMs, files, ...), size of dataset, access type (NFS/iSCSI), and what is the desired performance goal?
 
  • Like
Reactions: TrumanHW

jpmomo

Active Member
Aug 12, 2018
286
97
28
Thanks for both replies. I will try to answer both members' questions. I sort of understand the basic need to stripe/mirror with a lot of identical drives: SSDs would be better than HDDs, and NVMe would be best. The issue I am trying to solve is whether there is a way to optimize with the hardware that I have, which, as you mention, is not a lot of identical drives. I was hoping to leverage a tiered form of storage where most of the hits would land on RAM, then the Optane, then the Samsung, with data copied off to the slower Seagate HDDs in the background. One thought is to get another Samsung and use the pair as its own high-performance vdev. I still need to sort out how best to utilize the Optane; thoughts include using it as both an L2ARC and a SLOG.

As to Rand's questions, I will be using proprietary test software that exercises both reads and writes across a range of data sizes. The goal is to optimize the storage to keep pace with the network config.
Thanks again for your help.
 

Rand__

Well-Known Member
Mar 6, 2014
5,600
1,228
113
Well, there is no layered/tiered storage as such in TrueNAS, just cache (memory) and disks.

Unfortunately there is not much else to tell you without further information about what kind of data you want to store, as requirements vary wildly.
 
  • Like
Reactions: T_Minus

jpmomo

Active Member
Aug 12, 2018
286
97
28
Mostly Windows SMB file shares. It is good to know about the lack of tiered storage in TrueNAS. With the new Optane drives, I was hoping to leverage both r/w and latency performance in a storage-server setup. Maybe VMware's vSAN might be a better solution.
 

Rand__

Well-Known Member
Mar 6, 2014
5,600
1,228
113
You will need at least 2+1 boxes for vSAN, and it's not an SMB server at all.

What's the data size (hot/warm/cold)? How many users will access (read/write) files (of what size) at the same time?
 
  • Like
Reactions: T_Minus

jpmomo

Active Member
Aug 12, 2018
286
97
28
Thanks again for your help. The benchmarking software that I will use varies the size and type of data. That is why I was looking to use the Optane drive for multiple functions, to improve both read and write performance; its low latency should also help with smaller files. With TrueNAS, what options do you have to optimize the use of a cache drive like the Optane? I was told that if I dedicated that drive to be only a SLOG device, it would never actually be read from unless there was a power failure. In my use case that would probably never happen, so that doesn't seem like a good use for this drive. I also read that there might be a way to disable the cache on all of the normal vdev pools (with HDDs) and redirect that cache to the Optane. I could also go with an all-NVMe setup, but would still like to use the Optane to "front" the normal NVMe drives and take advantage of its 100 DWPD endurance.
Thanks.
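For what it's worth, a single Optane can serve both roles by partitioning it and attaching the pieces to an existing pool, roughly like this (a sketch; pool and partition names are hypothetical, and TrueNAS exposes the same thing in the pool UI):

```shell
# Attach a small partition as SLOG and the remainder as L2ARC.
zpool add tank log /dev/nvme4n1p1     # SLOG: ~16-32GB is typically plenty
zpool add tank cache /dev/nvme4n1p2   # L2ARC: rest of the drive

# Caveat: the SLOG is only in the write path for *synchronous* writes
# (NFS, iSCSI, databases). Plain SMB traffic is mostly async and will
# bypass it entirely.
```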
 

jpmomo

Active Member
Aug 12, 2018
286
97
28
See below for an excerpt describing the Optane's ability to improve performance in a VMware environment:

VMware vSAN
VMware’s vSAN uses Optane SSDs within its hyper-converged storage platform. It uses Optane to make the storage on remote nodes appear local, which requires the ability to commit changes locally within Optane storage before writing the change across the network. While this operation doesn’t require a massive amount of local storage, the process does need to occur at very low latency. This is so that the performance impact of writing data across to other servers doesn’t impact any writes happening locally.

The Intel Optane P5800X SSD boasts 2M+ I/O Operations per second (IOPs) in a single Intel Optane card when working with four kilobytes (kb) read/write IOPS. When using 512-byte random read IOPs, a single P5800X device will push 4.5-5 Million IOPs at a response time of four microseconds. With VMWare’s vSAN, these storage throughput numbers are a realistic possibility. The only limit is network bandwidth rather than storage throughput.
 

i386

Well-Known Member
Mar 18, 2016
2,658
774
113
32
Germany
The Intel Optane P5800X SSD boasts 2M+ I/O Operations per second (IOPs) in a single Intel Optane card when working with four kilobytes (kb) read/write IOPS. When using 512-byte random read IOPs, a single P5800X device will push 4.5-5 Million IOPs at a response time of four microseconds. With VMWare’s vSAN, these storage throughput numbers are a realistic possibility. The only limit is network bandwidth rather than storage throughput.
:oops:
I'm not sure if you mean the Intel Optane P5800X SSDs: Intel® Optane™ SSD DC P5800X Series (1.6TB, 2.5in PCIe x4, 3D XPoint™) Product Specifications
Random Read (100% Span): 1500000 IOPS
Random Write (100% Span): 1500000 IOPS
 

Rand__

Well-Known Member
Mar 6, 2014
5,600
1,228
113
Thanks again for your help. The benchmarking sw that I will use will vary the size and type of data. That is why I was looking to utilize the optane drive as multiple functions to improve both r/w performance. It's low latency which should help with smaller files. With truenas, what options do you have to optimize the use of a cache drive like the optane? I was told the if I dedicated that drive to only a slog device, it would not be used at all unless there was a power failure. In my use case, that would probably never happen so that doesn't seem like a good use for this drive. I also read that there might be a way to disable the cache on all of the normal vdev pools (with hdd) and redirect that cache to the optane. I can also go with an all nvme setup but would still like to utilize the optane to "front" the normal nvme drives to utilize it's 100 dwpd capability.
Thanks.
Regarding this - I would suggest reading up on the SLOG and what it's used for.
You might be able to use the Optane as a special vdev, but that will put your pool at risk with a single drive.

Otherwise, you either don't have a clear idea of what you need, or you're not telling us.
Either way, you will not get good results if you do not provide us with the necessary info to help you.

If you can't, then you should work on that first.

Just coming here with a bunch of already-bought hardware and asking "help me make this superfast for my totally unspecified use case" will just not work. Sorry.
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,248
1,701
113
CA
" The issue that I am trying to solve is to see if there is a way to optimize with the hw that I have. "

That's not an actual issue you're trying to solve, though.

If you describe your actual issue, then people can give advice based on the hardware you have.

If you don't have an issue and just want the "best performance" from what you have, then you need to define what you qualify as "best performance".

Do you need to max out 4K read/write (mixed workload)? Do you need to max out the 100Gig network card for big transfers?
Do you need to max out the 100Gig network for big transfers while also keeping IOPS available for something else?

Is this for file storage or VMs or both?

Either way, IMO you don't have enough drives for much of a high-performance storage setup if you need capacity too.

800GB p5800x --> Intel® Optane™ SSD DC P5800X Series (800GB, 2.5in PCIe x4, 3D XPoint™) Product Specifications
 

jpmomo

Active Member
Aug 12, 2018
286
97
28
Regarding this - I would suggest to read up on slog and what its used for.
You might be able to use it as special vdev, but that will put your pool at risk with a single drive.

Else you either don't have a clear idea what you need or you're not telling it.
Either way you will not get good results if you do not provide us with the necessary info to help you.

If you can't, then you should work on that first.

Just coming here with a bunch of bought hardware and ask "help me to make this superfast for my totally unspecified use case" will just not work. Sorry.
I understand your point. TrueNAS, and storage in general, is not my forte. I did do some research and ran across some articles that tried to clarify the different functions within TrueNAS. For the SLOG, they mentioned that this component should not be viewed as a form of cache, but more as a persistent write log that would be read back in the event of a power failure. I originally thought of a SLOG as more of a write cache. With regard to better describing my use case, it is trying to optimize TrueNAS against a synthetic suite of test cases. Unfortunately, I understand that is too generic for you to give specific advice. Maybe I can be more specific about some of my assumptions and see if those with experience can confirm or point me in another direction.
1. Is there any way to configure a drive in TrueNAS to offload the write hits from a separate vdev pool?
2. Same as above, but for reads?
3. Does the SLOG function only come into play during a power failure?
4. Is there any way to configure TrueNAS to replicate storage dynamically to local HDDs without impacting the primary I/O?
5. In general, is the main way to increase performance with mixed workloads to aggregate multiple fast identical drives in something similar to RAID 10 (a mix of striping and mirroring)?

Thanks again for your help.
 

jpmomo

Active Member
Aug 12, 2018
286
97
28
" The issue that I am trying to solve is to see if there is a way to optimize with the hw that I have. "

That's not an actual issue you're trying to solve though.

If you describe your actuall issue, then people can give advice based on the hardware you have.

If you don't have an issue and just want the "best performance" From what you have, then you need to define what you qualify as "best performance".

Do you need to max out 4K read\write (mixed work load)? Do you need to max out 100Gig network card for big transfers?
Do you need to max 100Gig network for big transfers while also keeping IOPs available for something else?

Is this for file storage or VMs or both?

Either way IMO you don't have enough drives for much of a performance storage anything if you need capacity too.

800GB p5800x --> Intel® Optane™ SSD DC P5800X Series (800GB, 2.5in PCIe x4, 3D XPoint™) Product Specifications
I understand your point. Max out the 100Gbps dual-port NIC with big transfers.
Thanks for helping me clarify.
 

Rand__

Well-Known Member
Mar 6, 2014
5,600
1,228
113
I understand you point. The truenas and storage in general is not my forte .
They are a possible solution to a problem that we don't know, so no idea whether that's the correct tool for your issues.

1. Is there any way to configure a drive in truenas to offload the write hits from a separate vdev pool?
Cache (memory) / special vdevs for non-sync traffic (if I understood your request correctly).

2. Same as above but with read?
Same, and of course ARC/L2ARC (if I understood your request correctly).

3. Does the slog function only come into play during a power failure?
No, also in case of a CPU, PSU, or user issue - anything that can cause a reboot/crash/bluescreen while data is still in flight.

4. Is there anyway to configure truenas to replicate storage dynamically to local hdd without impacting the primary I/O?
No. You will always need to read blocks from your primary storage. It might not be an issue for drives with good multi-process performance, though (SSD -> NVMe -> Optane).

5. In general, the main way to increase performance with mixed workloads, is to aggregate multiple fast identical drives in something similar to raid10 (a mix of striping and mirroring)?
Not necessarily; this totally depends on the use case we've been asking about all along (few large transfers behave differently than many small ones).
The best option to speed up transfers is a faster (single) device; if you have many parallel processes, you can distribute the load across more devices, so you get many times a (large) fraction of the great single-device speed.
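As an aside on question 3: the SLOG is *written* on every synchronous write and only *read back* after a crash, and whether a dataset issues sync writes at all can be checked or forced per dataset. A sketch (pool/dataset names hypothetical):

```shell
# Inspect the sync policy of a dataset: standard | always | disabled.
zfs get sync tank/share

# Force every write through the ZIL/SLOG (safest, slowest for async loads):
zfs set sync=always tank/share

# Default behavior: honor whatever the client application requests.
zfs set sync=standard tank/share
```

With `sync=standard` and a mostly-SMB workload, a SLOG device will see very little traffic, which matches the advice above.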



Max out 100Gbps dual port NIC with big transfers.
Forget it.
This sounds like quite a challenge, to be honest, with the hardware given...
 
Last edited:
  • Like
Reactions: T_Minus

jpmomo

Active Member
Aug 12, 2018
286
97
28
just a quick update on my "endeavor":) I did get some guidance from this forum and several others including the folks on the truenas forum. the current setup looks like the following:
asrock rack romed6 (recently reviewed here) currently with a 7502 rome cpu but waiting on milan (hoping to be able to optimize the 6 channels available on the romed6 mb as these cpus support 8 but due to the m-atx form factor of the mb, there are only 6 available.) 192GB 3200 ram. truenas 12U3 (with a couple of fixes for the funky dashboard!) I picked up a relatively cheap solution for my striped 4x1TB pny xlr8 using a gigabyte aorus gen4 aic. I have several of the latest gen m.2 and these had the best random 4k w performance. the 2TB version of these drives are faster but I am just testing this for now. I can scale out horizontally if needed by adding another gigabyte card with 4 additional m.2 drives. There is obviously no resiliency with this setup. I was told that for this type of setup, I could take advantage of either frequent/scheduled snapshots or rsync. I will have several of the seagate iron wolf pros that can be configured with some fault tolerance and be used for the snapshots/rsync target. I am still using the mellanox connectx-5 cdat (100G dual port pcie 4.0 x16) that I can setup with a lagg that is connected to an arista 100G switch. I am still trying to sort out some way to save the consumer based m.2 ssds from eating up their (poor) endurance. I also have not found a way to utilize the fancy new optane drive with 100 dwpd! Most of the test traffic is smb from windows client emulators. I still don't have a clear understanding of when the actual writes hit the m.2 striped vdev. ex. if i am transferring a 15GB iso file from a windows client/s to the zfs pool, will it always perform a write on the m.2 vdev (and thus reduce its lifetime endurance counters)? I am expecting both T_Minus and Rand__ to chime in with some suggestions (and no smiley faces;)) and btw, all of this is in a cerberus m-atx case painted HOK blue blood red!
 

Rand__

Well-Known Member
Mar 6, 2014
5,600
1,228
113
So this post is barely readable with no structure; maybe adjust it a bit.

What I think I understood is that you basically stripe some M.2s for a fast pool and hope they survive a while...

And yes, in the end all the data will need to land on the M.2s, or it's not stored permanently; so regardless of whether it lands in cache first or not, it will need to land there. Writes kill SSDs, not reads - so again, workload dependent.

So what's the SMB write speed on the M.2s for a 15G file? Most of that should land in memory initially, so not a realistic test of course, but curious anyway ;)
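One way to measure that locally, bypassing SMB entirely, is a quick fio run against the pool (a sketch; the dataset path is hypothetical, and the file size should exceed RAM if you want to see the drives rather than the ARC):

```shell
# Sequential 15G write test directly on the pool. --end_fsync=1 forces
# a final fsync so buffered writes actually reach the M.2 vdev before
# fio reports a number.
fio --name=seqwrite --directory=/mnt/fastpool/share \
    --rw=write --bs=1M --size=15G --numjobs=1 --end_fsync=1
```

Comparing that number against the speed seen over SMB from a Windows client would show how much the network path and Samba overhead cost.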