Nutanix CE

Skud

Active Member
Jan 3, 2012
122
63
28
Anyone have experience with lots of storage on Nutanix CE?

I manage and admin our Nutanix cluster at work and I want to run it in my home lab too. I see CE has an 18TB storage limit, but the docs give an example of 3 x 6TB disks. Is that per node? 3 x 6TB disks per cluster wouldn't make much sense.

They also allude to some users having “success” with running more drives.

I think I'm reading between the lines here that the limits are pretty soft, but before I run headfirst into building a cluster I'd like to know if anyone has run more than 18TB of storage in CE.

Thanks!!
Riley
 

BoredSysadmin

Active Member
Mar 2, 2019
564
172
43
I wouldn't call myself a Nutanix guru, but I did run a few smaller clusters a few years ago and built a single-node CE box.
As far as I remember there wasn't a hard limit on the amount of storage per se, but there was a hard stop at four-node clusters.
The other thing CE didn't do is VT-d passthrough of the storage controller to the CVM, which limits performance a bit but makes it much easier to find suitable hardware.
 

Skud

Active Member
Jan 3, 2012
122
63
28
So, some things that I've found out:

Specs:
Supermicro 4U four-node "FatTwin".
Each node:
2 x E5 2640 V2
8 x 16GB
10Gb Intel x520
LSI 3008
64GB SATA DOM for boot
2 x Samsung PM883 1.92TB
4 x 4TB HGST SAS

1) It doesn't support 4Kn disks. None of my HDDs would show up, but the SSDs did. I had reformatted the HDDs from 512e to 4Kn beforehand, so I suspected that was the cause. After formatting them back to 512e they all show.
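For reference, here's a quick way to check whether a drive presents as 512e or 4Kn on Linux. The sysfs attribute paths are standard kernel ones; the helper itself is just a sketch I'm adding for illustration:

```python
from pathlib import Path

def sector_sizes(dev: str, sys_root: str = "/sys/block") -> tuple[int, int]:
    """Return (logical, physical) sector sizes for a block device name like 'sda'.

    512/512  -> native 512
    512/4096 -> 512e (emulated, works with CE per my testing)
    4096/4096 -> 4Kn (won't show up in CE)
    """
    q = Path(sys_root) / dev / "queue"
    return (int((q / "logical_block_size").read_text()),
            int((q / "physical_block_size").read_text()))

# Example: sector_sizes("sda") -> (512, 4096) on a 512e drive
```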

2) There doesn't seem to be any hard limit on disks: 4 x 4TB HDD and 2 x 1.92TB Samsung PM883 SSD per node all show up.

(screenshot of the disk list attached)

3) CE does prevent HBA passthrough to the CVM (though it DOES support NVMe passthrough). In "real" Nutanix the HBA is passed through with all of its disks so the CVM has direct access. You *can* do the same in CE by editing some of the config scripts. I got it working, but I reverted because I'm unsure how it would affect future upgrades, and I don't want to bork my cluster with an update. Digging through the installation scripts, the support seems to be there. There is even CE-specific detection logic for the various HBAs, but a hard-coded "if COMMUNITY_EDITION then use LUN passthrough" branch overrides it. So maybe it's coming in the future. I hope so!!
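The override described above might look roughly like this. This is a purely illustrative Python sketch: the function and constant names are invented, and the real logic lives in Nutanix's installer scripts:

```python
# Illustrative sketch of the kind of edition gate described above.
# All names here are made up; the real check is in the phoenix installer.

COMMUNITY_EDITION = True  # assumption: set somewhere by the installer

def select_disk_mode(hba_model: str,
                     community_edition: bool = COMMUNITY_EDITION) -> str:
    """Pick how the CVM sees disks: full HBA passthrough or per-LUN."""
    supported_hbas = {"LSI3008", "LSI2308"}  # illustrative list only
    if community_edition:
        # Hard-coded override: CE always falls back to LUN passthrough,
        # even when the HBA itself is recognised by the detection logic.
        return "lun_passthrough"
    if hba_model in supported_hbas:
        return "hba_passthrough"
    return "lun_passthrough"

print(select_disk_mode("LSI3008"))                           # lun_passthrough
print(select_disk_mode("LSI3008", community_edition=False))  # hba_passthrough
```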

4) Don't expect any of the hardware-centric features of real Nutanix to work in CE: IPMI integration, the nice node diagrams/disk-position layouts, hardware alerts. You might be able to hack some of it together by editing the layout files in the installer image, but it would be some work AND your cluster may not survive an upgrade.

5) Performance is still "pretty good" for a lab setup. I'm running Nutanix X-Ray and doing some OLTP tests. Seems to be good enough for me...
(X-Ray OLTP benchmark screenshots attached)
 


BoredSysadmin

Active Member
Mar 2, 2019
564
172
43
I do vaguely remember that they prevented the storage controller from being passed through in the CE edition, likely to a) limit performance a bit and b) ensure wider hardware compatibility.
I'm curious if you could share more specifics on point 3 - which config files did you change to make it work?
How much memory did you give to the CVM?
 

Skud

Active Member
Jan 3, 2012
122
63
28
CVMs each have 32GB. This was a requirement for running dedup on the storage pools.

The three files you need to look at are:
/home/install/phx_iso/phoenix/imaging_util.py:486 (forces LUN mode/disables passthrough)

/home/install/phx_iso/phoenix/layout/modules/community_ed.py (you'll need to modify to support your actual hardware/HBA. You can see the other templates in this folder. It might actually be possible to get "full" hardware support by creating a custom layout file.)

/home/install/phx_iso/phoenix/firstboot/kvm/kvm_first_boot.py:262 (there seems to be some logic that breaks for NVMe drives after the above changes; I just commented it out.)

These are just rough notes. It's also possible that some of the changes aren't required - some of what I changed was from playing around trying to get the 4Kn drives working.
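If you'd rather script these tweaks than hand-edit before imaging, something like this would do it. The paths are the ones from above, but the line numbers will drift between CE releases, so verify them against your build first; the helper itself is just a sketch:

```python
# Illustrative helper: comment out specific lines in the installer
# scripts before imaging. Line numbers vary by release - check first.
from pathlib import Path

def comment_out_line(path: str, lineno: int) -> None:
    """Prefix the given 1-indexed line with '# ' so Python skips it."""
    p = Path(path)
    lines = p.read_text().splitlines(keepends=True)
    target = lines[lineno - 1]
    if not target.lstrip().startswith("#"):
        indent = target[: len(target) - len(target.lstrip())]
        lines[lineno - 1] = indent + "# " + target.lstrip()
    p.write_text("".join(lines))

# Example usage (line numbers from the 2020-era image; verify first):
# comment_out_line("/home/install/phx_iso/phoenix/imaging_util.py", 486)
# comment_out_line("/home/install/phx_iso/phoenix/firstboot/kvm/kvm_first_boot.py", 262)
```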

Riley
 

Skud

Active Member
Jan 3, 2012
122
63
28
So, I'm still playing around with this and trying to overcome an issue with very slow write speeds inside a VM - it looks like writes are being throttled. I can transfer data to the VM at ~650MB/s, but as soon as the cache fills, performance plummets to 45MB/s for the remainder of the transfer. Writing to local disk inside the VM runs at the same 45MB/s. The VM is Server 2019 with the latest VirtIO drivers and NGT installed. The same happens on an Ubuntu 20.04 VM, though performance is a bit better - around 120MB/s sequential. Transfers were done with native file copy and robocopy with the /MT option (Windows).

I don't seem to be hitting the spinning disks yet. They stay at 0 IOPS during these times.

Running some diagnostics shows that the hardware is more than capable of supporting more performance. I've been playing around with the X-Ray platform benchmarking and it shows really good performance:

(X-Ray charts attached: Sequential Write I/O Throughput, Random Write IOPS, Sequential Read I/O Throughput, Random Read IOPS)

Nutanix's diagnostics.py output is:

Code:
nutanix@NTNX-RJFNutanix-A-CVM:172.16.232.31:~/diagnostics$ ./diagnostics.py --display_latency_stats --run_iperf run
Running Iperf Test between CVMs
bandwidth between 172.16.232.31 and 172.16.232.32 is: 8.39 Gbits
bandwidth between 172.16.232.31 and 172.16.232.33 is: 8.23 Gbits
bandwidth between 172.16.232.31 and 172.16.232.34 is: 8.61 Gbits
Checking if an existing storage pool can be used ...
Using storage pool default-storage-pool-7712708351784 for the tests.
Checking if the diagnostics container/s exists ...
Container NTNX-NFS-DEFAULT-0 does not exist.
Creating a new container NTNX-NFS-DEFAULT-0 for the runs ... done.
Preparing 1 UVM/s on host 172.16.232.11 ...
  Importing diagnostics image ...  done.
  Deploying the UVM on host 172.16.232.11 ... done.
  Adding disks ... done.
Preparing 1 UVM/s on host 172.16.232.12 ...
  Deploying the UVM on host 172.16.232.12 ... done.
  Adding disks ... done.
Preparing 1 UVM/s on host 172.16.232.13 ...
  Deploying the UVM on host 172.16.232.13 ... done.
  Adding disks ... done.
Preparing 1 UVM/s on host 172.16.232.14 ...
  Deploying the UVM on host 172.16.232.14 ... done.
  Adding disks ... done.
Waiting for 1 VM on host 172.16.232.11 to bootup ... done. 3 remaining.
Waiting for 1 VM on host 172.16.232.12 to bootup ... done. 2 remaining.
Waiting for 1 VM on host 172.16.232.13 to bootup ... done. 1 remaining.
Waiting for 1 VM on host 172.16.232.14 to bootup ... done. 0 remaining.
Collect executable list.
Collect shared library list.
Start up persistent ssh connection to localhost:17000.
ERROR:root:Error connecting through control master
Connecting to localhost:17000 with password authentication.
Start up persistent ssh connection to localhost:17002.
ERROR:root:Error connecting through control master
Connecting to localhost:17002 with password authentication.
Start up persistent ssh connection to localhost:17004.
ERROR:root:Error connecting through control master
Connecting to localhost:17004 with password authentication.
Start up persistent ssh connection to localhost:17006.
ERROR:root:Error connecting through control master
Connecting to localhost:17006 with password authentication.
Prepare environment on SVMs.
Preserve original install in /home/nutanix/builds/stock.
Preserve original install in /home/nutanix/builds/stock.
Preserve original install in /home/nutanix/builds/stock.
Preserve original install in /home/nutanix/builds/stock.
Transfer shared libraries.
Transfer dynamic linker.
Transfer executables.
Run patchelf.
Transferring the build across the cluster.
Done!
Start running test prepare_disks
2020-05-21_19-02-02: Running setup "Prepare disks" ...
done.
Average SVM CPU: 172.16.232.31: 77%   172.16.232.32: 76%   172.16.232.33: 74%   172.16.232.34: 83%

SSD usage for disks:
     Disk id: 51, SSD used : 1074 GB,  SSD Capacity Bytes: 1523 GB, 70% used
     Disk id: 53, SSD used : 1022 GB,  SSD Capacity Bytes: 1463 GB, 69% used
     Disk id: 57, SSD used : 901 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 58, SSD used : 1031 GB,  SSD Capacity Bytes: 1463 GB, 70% used
     Disk id: 67, SSD used : 900 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 63, SSD used : 646 GB,  SSD Capacity Bytes: 1463 GB, 44% used
     Disk id: 74, SSD used : 1171 GB,  SSD Capacity Bytes: 1463 GB, 80% used
     Disk id: 75, SSD used : 1217 GB,  SSD Capacity Bytes: 1523 GB, 79% used
SSD usage for storage pool:
     SSD tier usage 7957 GB, SSD total capacity 11949 GB, SSD Usage 66%

Duration prepare_disks : 68 secs
*******************************************************************************

Start running test fio_seq_write
Waiting for the hot cache to flush ........... done.
2020-05-21_19-05-02: Running test "Sequential write bandwidth" ...
963 MBps , latency(msec): min=31, max=2820, median=414
Average SVM CPU: 172.16.232.31: 71%   172.16.232.32: 63%   172.16.232.33: 64%   172.16.232.34: 68%

SSD usage for disks:
     Disk id: 51, SSD used : 1074 GB,  SSD Capacity Bytes: 1523 GB, 70% used
     Disk id: 53, SSD used : 1023 GB,  SSD Capacity Bytes: 1463 GB, 69% used
     Disk id: 57, SSD used : 902 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 58, SSD used : 1032 GB,  SSD Capacity Bytes: 1463 GB, 70% used
     Disk id: 67, SSD used : 903 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 63, SSD used : 649 GB,  SSD Capacity Bytes: 1463 GB, 44% used
     Disk id: 74, SSD used : 1172 GB,  SSD Capacity Bytes: 1463 GB, 80% used
     Disk id: 75, SSD used : 1217 GB,  SSD Capacity Bytes: 1523 GB, 79% used
SSD usage for storage pool:
     SSD tier usage 7975 GB, SSD total capacity 11949 GB, SSD Usage 66%

Duration fio_seq_write : 65 secs
*******************************************************************************

Start running test fio_seq_read
Waiting for the hot cache to flush ........ done.
2020-05-21_19-07-33: Running test "Sequential read bandwidth" ...
3760 MBps , latency(msec): min=6, max=465, median=129
Average SVM CPU: 172.16.232.31: 50%   172.16.232.32: 44%   172.16.232.33: 43%   172.16.232.34: 37%

SSD usage for disks:
     Disk id: 51, SSD used : 1074 GB,  SSD Capacity Bytes: 1523 GB, 70% used
     Disk id: 53, SSD used : 1023 GB,  SSD Capacity Bytes: 1463 GB, 69% used
     Disk id: 57, SSD used : 902 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 58, SSD used : 1032 GB,  SSD Capacity Bytes: 1463 GB, 70% used
     Disk id: 67, SSD used : 903 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 63, SSD used : 649 GB,  SSD Capacity Bytes: 1463 GB, 44% used
     Disk id: 74, SSD used : 1172 GB,  SSD Capacity Bytes: 1463 GB, 80% used
     Disk id: 75, SSD used : 1217 GB,  SSD Capacity Bytes: 1523 GB, 79% used
SSD usage for storage pool:
     SSD tier usage 7975 GB, SSD total capacity 11949 GB, SSD Usage 66%

Duration fio_seq_read : 18 secs
*******************************************************************************

Start running test fio_rand_read
Waiting for the hot cache to flush .......... done.
2020-05-21_19-09-34: Running test "Random read IOPS" ...
169427 IOPS , latency(msec): min=0, max=312, median=2
Average SVM CPU: 172.16.232.31: 87%   172.16.232.32: 86%   172.16.232.33: 85%   172.16.232.34: 85%

SSD usage for disks:
     Disk id: 51, SSD used : 1074 GB,  SSD Capacity Bytes: 1523 GB, 70% used
     Disk id: 53, SSD used : 1023 GB,  SSD Capacity Bytes: 1463 GB, 69% used
     Disk id: 57, SSD used : 902 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 58, SSD used : 1032 GB,  SSD Capacity Bytes: 1463 GB, 70% used
     Disk id: 67, SSD used : 903 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 63, SSD used : 649 GB,  SSD Capacity Bytes: 1463 GB, 44% used
     Disk id: 74, SSD used : 1172 GB,  SSD Capacity Bytes: 1463 GB, 80% used
     Disk id: 75, SSD used : 1217 GB,  SSD Capacity Bytes: 1523 GB, 79% used
SSD usage for storage pool:
     SSD tier usage 7975 GB, SSD total capacity 11949 GB, SSD Usage 66%

Duration fio_rand_read : 106 secs
*******************************************************************************

Start running test fio_rand_write
Waiting for the hot cache to flush ......... done.
2020-05-21_19-12-55: Running test "Random write IOPS" ...
116727 IOPS , latency(msec): min=0, max=272, median=3
Average SVM CPU: 172.16.232.31: 88%   172.16.232.32: 82%   172.16.232.33: 81%   172.16.232.34: 82%

SSD usage for disks:
     Disk id: 51, SSD used : 1074 GB,  SSD Capacity Bytes: 1523 GB, 70% used
     Disk id: 53, SSD used : 1023 GB,  SSD Capacity Bytes: 1463 GB, 69% used
     Disk id: 57, SSD used : 902 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 58, SSD used : 1032 GB,  SSD Capacity Bytes: 1463 GB, 70% used
     Disk id: 67, SSD used : 903 GB,  SSD Capacity Bytes: 1523 GB, 59% used
     Disk id: 63, SSD used : 649 GB,  SSD Capacity Bytes: 1463 GB, 44% used
     Disk id: 74, SSD used : 1172 GB,  SSD Capacity Bytes: 1463 GB, 80% used
     Disk id: 75, SSD used : 1217 GB,  SSD Capacity Bytes: 1523 GB, 79% used
SSD usage for storage pool:
     SSD tier usage 7975 GB, SSD total capacity 11949 GB, SSD Usage 66%

Duration fio_rand_write : 106 secs
*******************************************************************************

Tests done.
 

ecosse

Active Member
Jul 2, 2013
393
75
28
I realise I am resurrecting this thread from beyond the grave but what is the feeling on Nutanix CE - worth a test drive?
 

Skud

Active Member
Jan 3, 2012
122
63
28
Would I recommend it for a reliable and fairly easy to use platform? Yes, I think I would. Mine has been 100% rock stable for almost a year.

However, there are some things to consider:

1) You need to throw a LOT of hardware at it (from a homelab perspective). For example, each host runs a controller VM (CVM). By default each one is assigned 8 cores and 16GB of RAM. To enable deduplication you need to up the RAM to 32GB. For me, a full 25% of each node is supporting the CVMs. There are ways to lower this, however.

You also need a fast network. 10Gb minimum. All the nodes replicate to one another and nodes will pull data from other nodes if it's not available locally or faster to get it from somewhere else.

2) I can't figure out how to get any sort of power management working. Under Nutanix each of my nodes idles around 175W; the same node under Windows or a CentOS/Ubuntu live environment idles at 80W. Multiply by four, throw in some switches and a firewall, and I'm almost hitting 1kW.

3) I manage a 20-node cluster at work running the full-blown thing. It's great. Everything is snappy and performance is good *in a multi-user, multi-threaded, multi-everything environment*. If you have tons of users or tons of VMs, or tons of I/O then it's *really* responsive - that's how it's tuned to work. However, if you need more performance for "single" things like file transfers then you'll be disappointed.

The best way I can describe it is like LACP on a switch: 4 x 1Gb connections don't mean you can transfer one file at 4Gb, but you can transfer four files at 1Gb no problem.

There is also a "gotcha" with sustained I/O performance dropping off. Nutanix uses something called an oplog. Each virtual disk you assign to a VM gets ~6GB of oplog, which acts as a write cache on the flash tier. If you overflow this cache, performance plummets. The workaround is to add more virtual disks - each disk gets you +6GB of oplog. In Linux that means LVM/mdraid; in Windows, Storage Spaces. I've created a few VMs with "simple" Storage Spaces (fault tolerance isn't necessary) spanning 5-10 disks to get better sustained random write performance.

4) You're tied to their upgrade schedule, and access to the management interface depends on your Nutanix CE account. When a new release comes out you need to upgrade, or it won't let you into the interface. Same goes for your CE account: no account, no access to your VMs.

Otherwise, it's a great system - especially for being free. As far as I can tell there aren't any real limitations in CE. You can use all the DR features too, for replicating to another box.
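The oplog discussion in point 3 can be turned into a quick rule of thumb. The ~6GB-per-vdisk figure is from my own digging and may vary by AOS version, so treat this as a rough estimator only:

```python
import math

OPLOG_GB_PER_VDISK = 6  # approximate per-vdisk oplog size; may vary by version

def vdisks_for_burst(burst_gb: float,
                     per_vdisk_gb: float = OPLOG_GB_PER_VDISK) -> int:
    """Minimum number of striped vdisks whose combined oplog can
    absorb a write burst of burst_gb before sustained-write speeds kick in."""
    return max(1, math.ceil(burst_gb / per_vdisk_gb))

print(vdisks_for_burst(50))  # a 50GB copy would want ~9 striped vdisks
```

So for a VM that regularly receives ~50GB file drops, a simple Storage Space (or mdraid stripe) across 9+ virtual disks keeps the whole burst inside the oplog.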

Riley