Is bigger always better? What size of hard drives for your archives?


UhClem

just another Bozo on the bus
Jun 26, 2012
[ You're making the assumption ... ]
...
24TB HC580: 1210 Gbits/sq in and 298 MB/s
10TB HC510: 816 Gbits/sq in and 249 MB/s

Approximately 48% higher density, but since we're not working in 1 dimension, take the square root and you're left with ~22%. Pretty close to the difference between the two drives (~20%). Of course the math isn't perfect because tracks and sectors are not square, nor are both axes shrinking at the same rate, but I think it fits.
Good try! ... But (now) you're making the assumption :).
The density you are referencing has NO direct relationship to sequential speed. The density you want is the Recording (or Bit) Density, and you need to look deeper for it, in the Product Specification. [Amusingly, in the section named "Data Sheet"]:
HC510: [datasheet image: max recording density 1929 kBPI]
HC580: [datasheet image: max recording density 2320 kBPI]
298/249 = 1.197 ≈ 2320/1929 = 1.202
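To check the arithmetic on the quoted figures yourself (a quick sketch, using only the numbers above):

```python
# Areal density alone over-predicts the speedup, its square root comes
# close, and the linear (recording) density matches almost exactly.
areal = 1210 / 816        # areal density ratio (Gbits/sq in)
linear = 2320 / 1929      # recording density ratio (kBPI, from the specs)
speed = 298 / 249         # sustained transfer ratio (MB/s)

print(f"areal ratio:       {areal:.3f}")        # ~1.483
print(f"sqrt(areal ratio): {areal**0.5:.3f}")   # ~1.218
print(f"linear ratio:      {linear:.3f}")       # ~1.203
print(f"speed ratio:       {speed:.3f}")        # ~1.197
```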
 

BlueFox

Legendary Member Spam Hunter Extraordinaire
Oct 26, 2015
Which I explained at the end of the post you just quoted? I didn't find the other attributes on the first datasheet I encountered, and the overall one explained it sufficiently. Either way, it's still purely based on density...
 

CyklonDX

Well-Known Member
Nov 8, 2022
AF specifies how the actual data bits on the platters are encoded, not just the logical sector composition (which explains why there are no advanced format SSDs).

PMR, SMR, HMR, CMR, and even TDMR all encode data onto the platters of the drive in the exact same way; if you were to take a GME and scan the entire surface of an HDD platter, they'd all look effectively the same across the different technologies, the only variance being that SMR disks would have their tracks spaced more closely together and would have some non-fixed blank "buffer tracks".
I lack the ability to prove otherwise. You are most likely right.

But wouldn't it be the case that if you had 512M of cache on the disk and just dropped in 450M, that should provide an immediate ack for the I/O request? (without waiting for the commit?)

I did play with this a lot by mounting a fresh disk (already pre-formatted on Linux) and dropping in files from RAM without forcing a commit. It was never immediate; it always took around ~2-3 sec ~ SAS3 16T WUH721816AL5204 with 512M of cache; I've got a couple of other ones with different sizes. From my local testing, only about 220-340M out of the 512M cache seemed to provide an immediate ack, suggesting there's more than meets the eye.
(those disks have the capability of all kinds of internal data-bit manipulations, and the bit-format compression that SED offers for encryption at rest)
(I excluded Windows as it has many caches of its own)

//
In reference to the stats provided above: most often those are only real when the disk is empty and you are writing the very first gigs, maybe the first TB.
(Realistically, halfway through they all end up at some 120-140MB/s, and at or below 120MB/s by 80% full.)
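The usual explanation for that falloff is zone bit recording: outer tracks hold more sectors per revolution, so sequential speed drops as the heads move inward. A minimal model of that, assuming speed scales with track radius and an inner/outer radius ratio of ~0.5 (both assumptions, not measured values):

```python
# Idealized zone-bit-recording falloff: bits per track and linear speed
# both scale with track radius, so speed drops nonlinearly with LBA.
def speed_at_fill(frac, outer_mb_s=298.0, ri_over_ro=0.5):
    """Sequential MB/s after writing `frac` (0..1) of capacity, outer-first."""
    # capacity fraction f maps to radius via f = (ro^2 - r^2) / (ro^2 - ri^2)
    r_over_ro = (1.0 - frac * (1.0 - ri_over_ro ** 2)) ** 0.5
    return outer_mb_s * r_over_ro

for f in (0.0, 0.5, 0.8, 1.0):
    print(f"{f:4.0%} full: {speed_at_fill(f):6.1f} MB/s")
# ~298 empty, ~236 at half, ~188 at 80%, ~149 at the innermost tracks;
# an observed 120-140 MB/s at half full would mean extra overhead on top.
```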
 

nexox

Well-Known Member
May 3, 2023
I did play with this a lot by mounting a fresh disk (already pre-formatted on Linux) and dropping in files from RAM without forcing a commit.
Depending on the filesystem you used, it might issue fsyncs on its own to protect against metadata corruption, and that may force some data to sync as well. To experiment on anything in the drive hardware, you want to eliminate as many layers as possible and operate on the raw device; you may need to write a program to do exactly what you want unless you can find something suitable (dd, for example, tends not to do what anyone wants for real benchmarks).
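For example, something along these lines writes straight to the block device with O_DIRECT, so only the drive's own cache is left in the picture (a minimal sketch; /dev/sdX is a placeholder for a scratch disk whose contents you can destroy):

```python
# Time raw writes to a block device, bypassing the page cache with
# O_DIRECT. Fast acks mean the drive's cache absorbed the data.
# WARNING: this destroys whatever /dev/sdX points at.
import mmap
import os
import time

DEV = "/dev/sdX"               # placeholder scratch device
CHUNK = 64 * 1024 * 1024       # 64 MiB per write, sector-aligned
TOTAL = 512 * 1024 * 1024      # push one cache-worth of data

buf = mmap.mmap(-1, CHUNK)     # mmap gives the page alignment O_DIRECT needs
buf.write(os.urandom(CHUNK))   # incompressible payload

fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT)
try:
    written = 0
    while written < TOTAL:
        t0 = time.monotonic()
        n = os.write(fd, buf)  # may be a short write; fine for a sketch
        dt = time.monotonic() - t0
        written += n
        print(f"{written >> 20:5d} MiB  {n / dt / 1e6:8.1f} MB/s")
    t0 = time.monotonic()
    os.fsync(fd)               # on a block device this should flush the drive cache
    print(f"flush: {time.monotonic() - t0:.2f}s")
finally:
    os.close(fd)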
 

CyklonDX

Well-Known Member
Nov 8, 2022
Depending on the filesystem you used, it might issue fsyncs on its own, to protect against metadata corruption ...
ext4 mounted with nobarrier, noatime, and a few such settings.
Got a bash script that:
- copies a real file into a ramdisk (e.g. V7-22.10.3-PVN-MDL-WHQL-Nemesis-NimeZ-DCH.7z, 399MiB)
- mounts the physical disk (which has a pre-created, clean ext4 partition)
- copies the file from the ramdisk to the mounted disk
- records the system I/O commit every second to see disk activity

(the disks do not have spin-down/spin-up that could potentially explain it)
The first second runs at around 300-350MB/s, then the 2nd or 3rd second gets the remainder of the file.
(In the first second the latency is small, suggesting it does indeed hit cache, but in the 2nd and 3rd it is much greater, suggesting the disk is dumping data from cache and sending more I/O busy-waits to the system.) ~ The commit size recorded from disk I/O is typically ~5M greater than the file size, potentially larger for larger files (partition data, etc...).
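For the per-second recording step, a sketch of one way to sample it (Python here for illustration rather than the actual bash script; sda is a placeholder device name):

```python
# Watch a disk's write throughput once per second via sysfs.
# Sectors in the stat file are always 512 bytes, regardless of the
# drive's logical sector size.
import time

DEV = "sda"  # placeholder device name

def sectors_written(dev: str) -> int:
    # /sys/block/<dev>/stat -- field 7 (index 6) is sectors written
    with open(f"/sys/block/{dev}/stat") as f:
        return int(f.read().split()[6])

prev = sectors_written(DEV)
for _ in range(30):            # sample for 30 seconds
    time.sleep(1)
    cur = sectors_written(DEV)
    print(f"{(cur - prev) * 512 / 1e6:8.1f} MB/s")
    prev = cur
```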

Are there better ways to do it? Potentially, yes. Is it indicative of something behind the scenes? Maybe it is, maybe not.
Potentially reverse-engineer / hack the logic board and its chip, or directly steal trade secrets... or "take a GME and scan the entire surface".
(At this time that's a little beyond me - I haven't played with anything to that point beyond the PS3 ~ and the SED adds even more complexity.)
 

nexox

Well-Known Member
May 3, 2023
There are a lot of possible explanations for that behavior within the kernel/filesystem code, too many to make any inferences about what the drive is doing with its cache. You really can't draw a lot of conclusions about the hardware if you've got a filesystem in the way; all you can really do is compare how different devices work with that filesystem, which is interesting, but it's a more complex system.
 

twin_savage

Member
Jan 26, 2018
This is very strange. I expected all platters to be used. Physically it should be possible, but the article says tracks cannot be aligned, though I thought read and write units were along the lines of a cylinder.

I don't quite understand why they are not investing in using multiple platters from a single head.

Now it is clear what the discrepancy is about. This also explains why they are putting in multiple heads; I thought that was for random I/O. It looks like it will possibly double sequential write and read speeds.
You are correct that it should be physically possible to use all the heads to read/write at once; if this were implemented, modern high-capacity HDDs with 10 platters could be doing 5GB/s sequential transfers. Perhaps this capability will eventually come to us: the new NVMe specification now allows spinning-platter hard drives on PCIe, so we have a bus that can handle this hypothetical bandwidth.
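As a back-of-the-envelope (assuming ~250MB/s per head, a round number in line with current top drives):

```python
# If every head could stream at once. The per-head rate is an assumption.
platters = 10
heads = platters * 2        # one head per surface
per_head_mb_s = 250         # assumed per-head sequential rate
print(f"{heads * per_head_mb_s / 1000:.1f} GB/s aggregate")  # 5.0 GB/s
```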

I had always assumed the reason this had not yet been implemented was that it would be a complex control scheme. The piezo stage of the head might need to be modified to allow for even more independent movement beyond the arm actuator's, but this shouldn't be that difficult.


Potentially reverse-engineer / hack the logic board and its chip, or directly steal trade secrets... or "take a GME and scan the entire surface"
Whoops, I thought that was a googleable term, but it is not.
GME = generalized magneto-optical ellipsometry; basically a way to take a picture of the magnetic state of an entire platter after it has been removed from the hard drive. It can resolve much finer detail than a hard drive's read head can.
 

Bert

Well-Known Member
Mar 31, 2018
You are correct that it should be physically possible to use all the heads to read/write at once; if this were implemented, modern high-capacity HDDs with 10 platters could be doing 5GB/s sequential transfers. ...
The doc says tracks will not be aligned. I don't know how hard drives work, but I assume each platter wiggles a little bit, so tracks will be shifting around all the time.

At least we should be able to read from both surfaces of a platter.

I don't think this is a bus issue; the SAS bus is already several times wider than what hard drives can feed into it. (SAS-4)
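Rough numbers, assuming SAS-4's 22.5Gbit/s line rate and 128b/150b encoding:

```python
# One SAS-4 lane versus one fast drive; the drive figure is the HC580's
# 298 MB/s quoted earlier in the thread.
line_rate_gbit = 22.5
usable_gb_s = line_rate_gbit * (128 / 150) / 8   # ~2.4 GB/s per lane
drive_mb_s = 298
print(f"{usable_gb_s:.1f} GB/s per lane, "
      f"~{usable_gb_s * 1000 / drive_mb_s:.0f}x one drive")  # ~8x
```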