VMware shared storage optimization question (block sizes)


Rand__

Hi,

I've been wondering about this for a while and saw other people have the same question, but I didn't find a definitive answer, so here goes...

When building shared storage for an ESXi box (or cluster), what is the most appropriate workload to use when running synthetic benchmarks?

Of course one must differentiate between infrastructure activity (creating/moving VMs) and application activity (Exchange/database VMs).

The latter will usually depend on the application, so it's difficult to specify a universal value.

Therefore I'd like to restrict this to infrastructure, so:
A - vMotion
B - creating a new VM (thick provisioned)
C - maybe regular OS activity

A secondary aspect is of course the number of concurrent threads (do I need drives optimized for low queue depth, or deep QD?), but that's fairly individual too, so I'd like to look at low QD.
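For what it's worth, a minimal fio sketch for that kind of low-QD comparison (a sweep over block sizes at queue depth 1). The device path and block-size list are placeholders; point it at a scratch device or test file, since randwrite is destructive:

    for bs in 4k 64k 128k 1m; do
        fio --name=lowqd-$bs --filename=/dev/sdX --direct=1 \
            --ioengine=libaio --rw=randwrite --bs=$bs \
            --iodepth=1 --numjobs=1 --runtime=30 --time_based \
            --group_reporting
    done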


I always looked at 4K random writes, but nowadays I'm not sure that's it.
I ran some tests recently on a Compuverde cluster, and it has a nice GUI feature where you can (roughly) see the block size being used.
I have not checked B or C explicitly, but A landed in the <128K (but not <64K) bucket, so it might be 64K.

Does anyone have more specific info?
 

Rand__

Well, that describes the datastore block size, so it might be applicable for B and C:
1 MB standard (user-configurable at datastore creation time)

The other question is: what is the transfer chunk size on a vMotion?
Is the data read block-wise (say 1 MB), then split into TCP frames (1500 bytes or 9K) and sent out as it comes in? Or is it aggregated at the NIC/OS level and written in 1 MB chunks (only to be split into smaller chunks at the physical level again)?
 

i386

"1 MB standard (user-configurable at datastore creation time)"
That was true for VMFS versions before 5; since VMFS 5 it's 1 MB only.
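If you want to check on a given host, ESXi's own vmkfstools prints the filesystem info, including the file block size (the datastore name below is a placeholder):

    vmkfstools -Ph /vmfs/volumes/<datastore-name>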
"Is the data read block-wise (say 1 MB), then split into TCP frames (1500 bytes or 9K) and sent out as it comes in? Or is it aggregated at the NIC/OS level and written in 1 MB chunks (only to be split into smaller chunks at the physical level again)?"
If I remember correctly, the data (the 1 MB block) is split into smaller TCP segments (up to 64 KB) and then again at the Ethernet level (layer 2 in the OSI model: 1500 bytes, or ~9 KB with jumbo frames). On the other side the OS reassembles the TCP segments in the right order and writes the 1 MB block to the filesystem.
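As a rough back-of-the-envelope check (assuming IPv4 + TCP with no options, i.e. 40 bytes of headers per frame; the numbers are purely illustrative):

    BLOCK=$((1024 * 1024))   # one 1 MB block
    MTU=1500                 # standard Ethernet; 9000 with jumbo frames
    MSS=$((MTU - 40))        # TCP payload per frame
    echo "frames per 1 MB block: $(( (BLOCK + MSS - 1) / MSS ))"
    # ~719 frames at MTU 1500, ~118 at MTU 9000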
 

Rand__

So reads would be 64K blocks and writes 1 MB chunks?

Then it's weird that I saw (most likely) 64 KB block writes on Compuverde.
 

i386

No, the OS (ESXi) reads a 1 MB block from the block device (which could be an SSD, an HDD, or a LUN) into RAM.

At the TCP layer the 1 MB block can be split into 64 KB segments, and so on.
 

Rand__

So everything is read/written in 1 MB blocks? Hm, it did not look that way. Weird.
 

i386

Yes and no :D

ESXi reads and writes in 1 MB parts. If you have a LUN or an NFS server, the on-disk pattern will depend on the underlying storage and how it's implemented, and so on.

Let's say you have RAID 10 / ZFS mirrors with 8 SSDs and a stripe size of 128 KB. When ESXi writes a 1 MB block, the storage server splits the data into 128 KB stripes and distributes them over the four mirror pairs: 1 MB / 128 KB = 8 stripes, so each mirror has to write two 128 KB stripes.
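In shell arithmetic (using the hypothetical array geometry from above):

    BLOCK_KB=1024    # 1 MB write from ESXi
    STRIPE_KB=128    # configured stripe size
    MIRRORS=4        # 8 SSDs as 4 mirror pairs
    STRIPES=$((BLOCK_KB / STRIPE_KB))                  # 8 stripes
    echo "stripes per mirror: $((STRIPES / MIRRORS))"  # 2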

That's probably what you saw in the Compuverde cluster.

:D
 

Rand__

Hm, that might be.
But then the display would be kind of useless, as it would always show the internally decided/configured block size.

I'll run some tests to see if a 1 MB ATTO or fio run shows similar speeds to a vMotion :)
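Something along these lines, assuming vMotion behaves like a single-threaded 1 MB sequential writer (which is exactly the open question here). Run it from a VM on the datastore, or against the backing storage directly; the file path is a placeholder:

    fio --name=vmotion-like --filename=/path/on/datastore/testfile \
        --size=10g --direct=1 --ioengine=libaio --rw=write \
        --bs=1m --iodepth=1 --numjobs=1 --group_reporting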

Thanks