300 consumer SSD's to wipe. Tips? Bootable USB? AUTONUKE?


Sable

Active Horse
Oct 19, 2016
Hi.

I have 300 SSDs here I need to clean. There is no important data on them that has to become irretrievable; a regular wipe or format would be fine. I am looking for the least amount of headaches.

Perhaps a bootable USB stick that automatically clears all drives as fast as possible would be ideal, but I am open to any suggestions!
 

i386

Well-Known Member
Mar 18, 2016
Germany
Steamroller them, or do you need the SSDs?
Form factor? (M.2 vs 2.5" SAS/SATA vs U.2/U.3 vs PCIe add-in card)
 

oneplane

Well-Known Member
Jul 23, 2021
DBAN would probably still work, but you shouldn't use it. To make the job faster, depending on manpower, get 2 systems per person that can each hold multiple SSDs.

Say you can do 4 SSDs at a time with a 60s turnaround per wipe (sending the erase command to the drive is the easiest approach; overwriting on SSDs is pointless as the FTL will ignore such 'writes'), you may need 5 minutes per batch including handling. If you have 2 systems you can cut that in half, because you can plug/unplug one batch while the other is booting, so you get 8 drives per 5 minutes. That means a morning's worth of time to get it done.

Now, if you have lower efficiency, say 10 minutes per 8 drives, it's suddenly most of a day of work. Adding a second person would probably reduce it by 33%, and one more by perhaps 20% (more people, more problems). That still means that 6 computers, 3 people and reasonable batch sizes make it a one-day job.

In other words: the boot-and-wipe isn't that hard, but for speed you'll need to be able to do multiple disks at once. If you have identical SSDs (say, all SATA or all NVMe), a small shell script and an auto-login-as-root Linux install on a USB drive will make the actual wipe action nearly automatic. If your SSDs are hot-pluggable, that's your best-case scenario. Consumer SATA SSDs can be wiped using a simple USB-SATA adapter as long as it supports UASP, and since those are cheap and hot-pluggable you could do this with a USB hub and a Raspberry Pi for that matter. The erase command is tiny bandwidth-wise, so this would technically let you do tens of drives at a time. Imagine doing 15 per batch: that's only 20 batches, and with two workstations that's 10 batches each, in parallel.
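A minimal sketch of that automatic wipe loop, assuming SATA drives and hdparm; the script name, the DRY_RUN guard, and the device arguments are my own illustration, so verify targets with lsblk before running anything for real:

```shell
#!/bin/sh
# Sketch of a batch ATA secure-erase loop. DRY_RUN=1 (the default) only
# prints what would happen; set DRY_RUN=0 on the actual wipe station.
DRY_RUN="${DRY_RUN:-1}"

wipe_drive() {
  dev="$1"
  if [ "$DRY_RUN" -eq 1 ]; then
    echo "would erase: $dev"
    return 0
  fi
  # ATA secure erase requires a user password to be set first; the drive
  # clears it automatically when the erase completes.
  hdparm --user-master u --security-set-pass p "$dev" &&
  hdparm --user-master u --security-erase p "$dev"
}

# Wipe every device passed on the command line, e.g.:
#   DRY_RUN=0 ./wipe.sh /dev/sdb /dev/sdc /dev/sdd
for dev in "$@"; do
  wipe_drive "$dev"
done
```

With hot-pluggable adapters this is the whole workflow: plug in a batch, run the script over the new device nodes, unplug, repeat.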
 

Sable

Active Horse
Oct 19, 2016
Thanks. I do need the drives after they are wiped.

As for DBAN, I thought I specifically read that program was only for hard drives.

Multiple systems could work; I'll probably have 1-3. I have 10 connections on this board, so that should speed things up.

The issue is that these programs offer erasure, which I believe (though I'm not sure) means overwriting, which takes time? Or am I missing something? When I format drives in Windows it usually takes a few seconds.

I'm probably misunderstanding some of this.
 

Wasmachineman_NL

Wittgenstein the Supercomputer FTW!
Aug 7, 2019
DO NOT USE DBAN ON SSDs.

It will ruin the NAND due to write amplification. Use something like KillDisk instead to Secure Erase them. If there's no PII-related data on them (e.g. games), just format them within Windows.

By the way, TS, where in the Netherlands do you live? I'd be happy to come by with my HAF XB and spend a day wiping drives if you're near Brabant.
 

Sable

Active Horse
Oct 19, 2016
No worries, I didn't think it would work anyway, and it probably takes too long. Perhaps the Windows fast-format route is the best way, like you say. Yes, there are only hundreds of game files on there, no user data. As for the offer: I have a few people here that can do the job, I just need to figure out the best and easiest way to do it so I can hand it over. Thanks a lot though.
 

oneplane

Well-Known Member
Jul 23, 2021
For wiping and Windows, let's do a bit of an explainer (wall of text ahead! Skip to the bottom for an easy solution):

The origin story

Operating systems and what you 'see' as a user in terms of files and directories (or folders) are just the tip of the iceberg of a big stack of abstractions. As a result, what is visually represented can stay the same for a long time, but the underlying layers can change independently. This is great, because normal operation of a computer doesn't have to be re-learned all that often, and you can support multiple variations of storage without having to write a new system and new documentation each time.

Many older storage technologies, especially ones with relatively little intelligence in them, simply had a bunch of addresses where data could be stored. Depending on how far back in time you go this varies, but for simplicity let's say that each address is a block of 512 bytes. So at address 1 you can store 512 bytes, and if you need to store more you go to address 2 and store more data there, and so on. That means that if you need to store 1 KB of data, you need 1024 bytes, and that fits nicely into two of those blocks (512 + 512 = 1024 = 1 KB after all). Those addresses that point to blocks of data on a disk are essentially how floppy disks and harddisks used to work: you have a round platter of media with teeny tiny magnetic areas that can each represent a 0 or a 1, grouped into blocks of 512 bytes. They are laid out in a circular pattern, so the disk spins around and a head waits until the right address passes underneath and 'reads' it. This is essentially what all software has been based on since then.

Your operating system asks the disk: hey, give me data from the blocks at address 478283, 938294 and 394804 and your disk then says, OK, I'll be busy with that for a bit, hold on. Then once the disk is rotated in position it reads the data and makes it available to the OS.

This means that your OS can have an idea of what's happening on the disk: it knows the disk has addresses 0 to 100000, for example, and that they are all neatly laid out on a round disk that is essentially a bunch of concentric circles of blocks of data, each block having an address one higher than the last. Everything around it, the interfaces, protocols, host bus adapters, memory models (in some cases) are designed around that concept: spinning disks with addresses that appear one by one. So ATA, and Serial ATA, and the firmware, the operating system, the storage drivers, any disk partitioning data, the filesystem itself are all assuming that this is how things work. This is important for the partitioning system, since it usually has a rather simple concept of what a partition is: a block address where the partition starts, and a block address where the partition ends. That's all a partition really is. It's like drawing a square on a piece of paper, and then dividing that square in two by drawing a line down the middle. No magic, just a 'line' to indicate that instead of having one big area, we decided to treat it as two independent areas.

The filesystem itself doesn't need to know about any of this, because just like a partition, it too just has a starting address and an ending address. Everything between those are 'for the filesystem'. A filesystem is what essentially makes it possible for you (the user) to not worry about addresses and data blocks, but instead you just tell the filesystem to make a file. The filesystem then makes sure to store it on the disk in blocks of data, and whenever you want to open the file and read it, it takes care of finding all the blocks that the file is using, and reads them in the right order so you see the file just as you expect it.

This layered approach means that your filesystem doesn't really care about what type of partitioning you have, because it just needs a start address and an end address. Your OS knows what the start and end address is, because you have a partition list (or table) on your disk (usually somewhere near the start address of the disk) and that just has a reference that says: "partition 1 starts at 32492 and ends at 9023842 and it contains a linux ext3 filesystem". Then your OS knows that to do anything useful with that, it needs to use the ext3 filesystem driver and tell it "check out anything between 32492 and 9023842, that's where you live".

So now, when you open a file, say "my_file.txt" you're actually asking the filesystem to get all the blocks for my_file.txt and it then looks at its own list of what blocks belong to that file, and then asks the disk for all those blocks. The disk spins around, and every time the right block appears at the read head, it reads that block and sends it to the operating system.


The cool thing beyond mechanical storage

Classic harddisks are of course not universally good at everything: they have to wait for the data to appear at the head as the disk spins around, and they have to do this many times because instead of reading everything at once it is likely that your computer is doing a bunch of stuff at the same time. So first it needs some data at block 14, and then at block 48223 and then at block 244 and then at block 239482 so it's constantly searching and waiting to access the block you asked for. If all it had to do is read blocks 200 to 400 in sequence, it would have been much simpler, wait until 200 appears, start reading and keep on reading as 201, 202 etc. appear while the disk is spinning. That'd be fast, and you'd be able to read everything at once. But you don't just need 1 file, and a file is generally spread all over the disk at different places (fragmented, if you will).

Disks are also mechanical devices, meaning, there are moving parts that are going to wear out, and if you were to shake or hit or drop them while the disk is busy, you might accidentally smash some of the moving parts together inside the disk. That would be bad.

Besides mechanical storage methods like harddisks, optical disks, floppies etc. we also have other means to store things, without moving parts. What do we call something monolithic with no movement? Solid! While there are various methods like magnetic core memory, ultrasonic delay line memory etc. a much more useful memory type is silicon memory using ICs. Those chips used to not be able to store a whole lot, they also used to be slow, expensive and wear out rather fast. This was right up until we invented NAND flash memory at which point it suddenly turned out to be amazing for general storage. It still wasn't perfect, and to use it you have to do a bit more thinking to get the right data in and out of it.

Older (much older) storage chips, and even some modern parallel NOR flash chips, used to be very simple: they would have a ton of pins where on one set you signal an address (i.e. 10101110) and on the other pins the data would appear (i.e. 1011101110110111). That works great if you don't have really large capacities, because with 16 pins (or bits) for an address you can have 65536 addresses to choose from, and if each address contains 16 bits of data, that means you can store a total of 1,048,576 bits of data in a little chip! Of course, you'll need a chip with 16 pins for the address and 16 pins for the data, plus two pins for power and maybe a few more for things like "go to sleep, chip, save some power". In other words, you'll end up with a huge chip with over 40 pins. And it stores only 128 KB of data... not a lot.

So now what? We have a really cool storage chip, but we can't store all that much, it needs a ton of pins, and every time we want to read a little bit of information we have to use a ton of addresses to get it all (512 bytes of data, like one block on an old harddisk, would not use 1 address in a chip, but 256!). In come a multitude of cool inventions: instead of using 16 individual pins to tell the chip what address we want to read or write, we use 2 pins: one pin we call 'signal' and the other we call 'next bit'. If we want address 1011, we set signal to 1 and pulse 'next bit', set signal to 0 and pulse it, set signal to 1 and pulse it, then set signal to 1 and pulse it one last time. Now we've told the chip, one bit at a time, that what we want is at 1011, and we only used 2 pins instead of 4! In reality those pins aren't really called signal or next, and there aren't just two of them, but the gist is the same: we transmit the thing we want in pieces over a small number of electrical connections instead of a lot. We also made other improvements, so instead of only having one question ("address 12354 please") and one answer ("that address contains the word: dog"), we can be smarter and say: "hey chip, give us 2000 bits starting from address 3824" and it will spew out that data like a firehose. So no more getting data bit by bit; instead we can get a whole block of data from the chip, just like from a harddisk!

Now, all of those smart things are completely new to the operating system and ATA and the like; all they know is that they used to ask for an address and get a block of 512 bytes of data back. To make those cool flash storage chips work with computers, a translator was invented. It sits between the flash chips and the rest of the computer, and whenever the computer says "block 1234 please" it translates that into the right chip addresses for those 512 bytes. The downside is that such a translator is kinda dumb, because now you can only get data in blocks of 512 bytes, even if you only wanted a little bit of it. Technically this was the case with harddisks too, but the problem with flash chips is that while they don't wear out mechanically, they do wear out a little bit every time a bit changes from 1 to 0 and back. Because there are so many bits in a chip that should be fine, since you generally only modify small parts of a file. If you write a letter and then change a few words, you really only want those changed words stored, not the entire letter shredded and re-written. But because the translator is somewhat dumb, that's exactly what it does.

The reality with SSDs

Back to the here and now: SSDs don't have just a single flash chip, and they don't have just one translator. They have many chips, each chip has address and data logic built in, the flash controller has a Flash Translation Layer (FTL), and there is a protocol translation layer on top (i.e. NVMe or ATA or SCSI). They all have different ideas of what an address, a block of data, or an operation (read, write etc.) is. This has a big downside: whenever you do something on your computer, all of those layers might turn the thing you did into something different on the actual flash chips. Changing a few bytes in a file might end up completely erasing and rewriting the file on the SSD.

To prevent this, the FTL has extra intelligence added, so that whenever it sees that what you really wanted was to change a few pieces of a file, instead of doing a complete rewrite it does the bare minimum to store what you intended and makes an internal note: "my_file.txt was changed; when the OS asks for it next time, substitute block 2342 for block 4982". That way, your operating system can pretend that the file is still just a bunch of blocks of data, while the chips in the SSD aren't hammered every time a small piece of data changes. Result: faster operation, longer life! Downside: what your OS thinks is happening is not really what is happening. So when your OS says "delete file XYZ", the SSD doesn't actually do this; instead it makes a note: "when the OS next asks for this, pretend it doesn't exist". Technically, the filesystem also does this. As does the partition table. This also means that whenever you actually DO need to write some new files, the FTL has to make a choice: do we use more space on the chips, or do we take the time to actually erase the blocks we marked as "pretend it's empty" so we can write those new files?

This is essentially where the world of garbage collection and TRIM fits in: the OS can assist the SSD and the other way around, sharing information about intent so each side knows what is happening behind the scenes. None of this was part of the ATA standard, or the filesystem, or the disk management software. So now we have a problem: more and more layers are added, and older layers don't actually know what is really happening anymore.
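That remapping trick can be sketched in a few lines of shell; a toy model only, tracking a single logical block (real firmware tracks thousands of blocks, wear-levels, and garbage-collects in the background):

```shell
#!/bin/sh
# Toy model of an FTL remap. A "rewrite" of a logical block goes to a fresh
# physical block; the old physical block is only marked stale, to be erased
# later by garbage collection, not erased immediately.
next_free=0     # next never-used physical block
stale=""        # physical blocks waiting for garbage collection
phys7="-"       # physical location of logical block 7 ("-" = unmapped)

write_block7() {
  # If an old copy exists, mark it stale instead of erasing it in place.
  if [ "$phys7" != "-" ]; then
    stale="$stale $phys7"
  fi
  phys7=$next_free
  next_free=$((next_free + 1))
}

write_block7    # first write of logical block 7 -> lands on physical 0
write_block7    # "overwrite" -> remapped to physical 1, block 0 marked stale
echo "logical 7 lives at physical $phys7; stale blocks:$stale"
```

The OS still thinks block 7 is block 7; only the FTL knows the data moved and that an old copy is lying around until garbage collection gets to it.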

The problem at hand

When you 'erase' a disk, it is highly likely that no erasing is actually happening. In practically all user-facing tools, like those in Windows, an 'erase' only records intent. If you intended not to see any of the data on the disk anymore, it will now show as empty (or "erased"). Only when you actually start writing new data, and that data needs to be stored on chips in the SSD, and the SSD has decided it needs to really erase some blocks to make room for it, does anything truly get erased.

For most people, this is really not an issue, it doesn't matter if the data is still actually on the chips, because unless someone steals your SSD, desolders all the chips, steals/buys the FTL software from the SSD manufacturer, and takes a lot of time to reconstruct whatever is on those flash chips that used to be on the SSD, no data will be readable.

In your specific case, we're talking about consumer SSDs with data that doesn't really need to be irrecoverable, you just want to re-use the disks without computers getting confused about existing data. There are two ways to do this:

1. Simply delete the partition table. This doesn't prevent anyone from recovering the data that is now invisible, but your computer will assume the disk is empty and happily write data all over the place.

2. Issue an ERASE or SECURE ERASE command. This is not something the user-facing interface of Windows (or macOS) will do for you, but other tools like hdparm will happily send this command to the drive. This tells the disk: pretend that you are empty. The disk will forget that any data exists, and it will have a mental note that says "all chips are free to be used for new data".

Both options can be fully automated, and in both cases, not using windows is the best option.
Using the ERASE or SECURE ERASE command is the 'nicest' since it will also let the SSD know about your intention to use it as a blank disk.
An example command would be "hdparm --user-master u --security-erase p /dev/sdb", where /dev/sdb is the disk you intend to wipe clean. Note that the drive only accepts the erase after a user password has been set first (e.g. "hdparm --user-master u --security-set-pass p /dev/sdb"), and that drives reported as "frozen" by hdparm -I will refuse the command until unfrozen (a suspend/resume cycle usually helps). The erase might take up to 2 minutes on some disks. Articles like Solid state drive/Memory cell clearing - ArchWiki can explain the finer details for you.

Now, if you were to insist on using Windows, I'm sure there are tools, maybe even with GUIs and buttons to click, that can do this. But if you are going to do it in Windows, option 1 is built in: you can use DISKPART's CLEAN command. Articles like this explain it: How Do I Use DiskPart to Delete All Partitions in Windows? Your 2022 Guide Is Here
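DISKPART can also read its commands from a file, so the Windows route can be scripted per disk. A hypothetical example (the disk number 1 is a placeholder; run "list disk" first and double-check it, because CLEAN is destructive):

```
rem wipe-disk1.txt -- run with: diskpart /s wipe-disk1.txt
select disk 1
clean
```

"clean" just zeroes the partition/boot information at the start of the disk, which is exactly option 1 above: fast, and the disk looks empty to the OS afterwards.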

For SSDs, as posted above, the classic "write the disk full of nothings" approach is generally not a good idea. Modern SSDs are fairly well protected against ruining the flash this way, but why do it the wrong, risky way if there are easy, non-risky ways to do it :)

If you are looking for something you can use without any specific experience or knowledge, Parted Magic might be for you. Secure Erase - Powerful, easy to use, and inexpensive.

You can put it on an optical disk or USB drive, boot from it, and use it as a 'wiping station'. It has a GUI and nice buttons to help you out, and if you want to use USB, ATA, SATA, or PCIe connected devices, it will let you use the best command in each case. NVMe connected PCIe drives for example, have a SANITIZE command instead of SECURE ERASE.
 