Last Updated on January 4, 2023 10:23 am
2022 September 10
“What is the best media for long-term storage?” is a question I see frequently on online forums. First and foremost, as far as I’m aware, there is no real “set it and forget it” long-term storage. In this digital age, data needs to be validated regularly and cycled to new media every 5-10 years to ensure it stays safe and easily accessible.
Hard drives have been around for about as long as the PC has existed, and continue to be a solid, inexpensive option for longer-term storage, although they have known longevity issues. While some hard drives may stand the test of time, maintaining data for 20 years or more, that is more the exception than the norm. Now, with SSDs slowly taking over the duties of hard drives, it makes one wonder how good they may be at storing data for extended periods.
SSD vs HDD Data Storage
While both SSDs and hard drives store binary data, how they do so is quite different.
Hard drives use one or more spinning platters. A tiny electromagnetic read/write head magnetizes small regions of the platter surface to represent 1’s or 0’s. Hard drives have a tried-and-true history, so we have a good idea of their longevity and how long they can store data: 5-7 years is typical before the risk of degradation increases significantly.
SSDs, on the other hand, store data in “cells” by trapping electrons in them using floating gate transistors. The trapped charge sets a voltage level, and that voltage level is read as a specific data state. SLC SSDs store a single bit per cell, so 2^1 = 2 data states: 1 or 0. MLC stores 2 bits, or 2^2 = 4 data states: 00, 01, 10, 11. TLC stores 3 bits, or 2^3 = 8 data states: 000, 001, 010, 011, 100, 101, 110, 111. The latest is QLC with 4 bits, or 2^4 = 16 data states: 0000, 0001, 0010, 0011, … don’t make me write them all out, but you get the idea.
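Since I refused to write out all 16 QLC states by hand, here is a small Python sketch that does it for me. The `BITS_PER_CELL` table and the `cell_states` helper are names I made up for illustration; the bit counts are the ones listed above.

```python
from itertools import product

# Bits stored per cell for each NAND type described above.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def cell_states(bits: int) -> list[str]:
    """Return every bit pattern a cell holding `bits` bits can represent."""
    return ["".join(pattern) for pattern in product("01", repeat=bits)]


for name, bits in BITS_PER_CELL.items():
    states = cell_states(bits)
    # 2**bits states in total, listed in ascending binary order.
    print(f"{name}: {bits} bit(s) -> {len(states)} states: {', '.join(states)}")
```

Each extra bit per cell doubles the number of states the cell has to tell apart.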
In a “fresh” or “new” SSD, the floating gate is very robust and the probability of electron leakage is small. In a well-used SSD, however, those floating gates are subject to the laws of physics and wear down with use as electrons pour through them over and over again. So in an idle, powered-off state, a worn cell may leak electrons, changing its voltage state. An SLC cell can tolerate a significant voltage drop because it only needs to distinguish a 1 from a 0. But with something like QLC, where the cell has to hold a voltage that maps to one of 16 data states, a slight loss of electrons causes a voltage dip that can read back as a different value than was originally stored. This results in data corruption.
Apparently, even plugging an SSD into a powered-on computer won’t necessarily keep the cells charged, but it may initiate a wear leveling routine or other algorithm that would keep the data fresh. What I’m looking to evaluate, though, is what happens if you keep it unplugged from a power source.
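To put rough numbers on that idea, here is a deliberately oversimplified toy model in Python. It assumes the voltage levels are evenly spaced across a normalized window of 1.0 and that a read simply picks the nearest level. Real NAND does not work this neatly (levels are not evenly spaced, and controllers apply error correction), and the 5% drift figure is an arbitrary value I picked for the demo, so treat this as an illustration of the margin argument only.

```python
def level_spacing(bits_per_cell: int, window: float = 1.0) -> float:
    """Gap between adjacent voltage levels, assuming even spacing (toy model)."""
    levels = 2 ** bits_per_cell
    return window / (levels - 1)


def read_back(stored_level: int, drift: float, bits_per_cell: int) -> int:
    """Level decoded after the cell voltage has dropped by `drift`."""
    step = level_spacing(bits_per_cell)
    voltage = stored_level * step - drift
    nearest = round(voltage / step)  # the read picks the closest level
    return max(0, min(2 ** bits_per_cell - 1, nearest))


DRIFT = 0.05  # hypothetical charge loss: 5% of the full voltage window

for name, bits in {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}.items():
    top = 2 ** bits - 1  # highest level, i.e. the most charge stored
    decoded = read_back(top, DRIFT, bits)
    status = "ok" if decoded == top else "CORRUPTED"
    print(f"{name}: stored level {top}, read level {decoded} ({status})")
```

In this model the same small drift that SLC, MLC, and TLC shrug off is enough to push a QLC cell into the neighbouring state, because its 16 levels are packed much closer together.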
Testing the Theory
Where this is all leading is a stupid little test I decided to run. I nabbed four extremely cheap Leven JS600 128GB TLC 2.5″ SATA SSDs from Amazon at about $13 each and split them into two sets: two SSDs kept “fresh” and two SSDs “worn”.
Two SSDs were each written with over 280TB of data, resulting in approximately 2200 write/erase cycles. TLC NAND can typically withstand 1000-3000 write/erase cycles before it starts to fail, so these SSDs are well worn.
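For anyone wondering where the roughly 2200 figure comes from, it is just total data written divided by drive capacity. The sketch below assumes decimal units (1TB = 1000GB) and ignores write amplification and over-provisioning, so the real number of cycles the NAND saw may differ; `estimated_pe_cycles` is a name I made up for this example.

```python
def estimated_pe_cycles(total_written_tb: float, capacity_gb: float) -> float:
    """Rough write/erase cycle count: full-drive writes = data written / capacity.

    Assumes 1 TB = 1000 GB and ignores write amplification.
    """
    return total_written_tb * 1000 / capacity_gb


CAPACITY_GB = 128  # Leven JS600 128GB

for disk, written_tb in (("Disk 1", 280), ("Disk 2", 284)):
    cycles = estimated_pe_cycles(written_tb, CAPACITY_GB)
    print(f"{disk}: ~{written_tb}TB written -> ~{cycles:.0f} cycles")
```

Both worn disks land in the upper part of the 1000-3000 range quoted for TLC.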
The other two SSDs were left new/fresh, without any writes except for a single pass of zeroes (Hard Disk Sentinel / Windows 10), which was read back to ensure nothing was glaringly wrong with the SSDs out of the box.
The “worn” SSDs were then wiped with a pass of zeroes across the disk using Hard Disk Sentinel in Windows 10, and read back to make sure there were no impending failures. CrystalDiskInfo confirmed all disks were in good health.
Each disk was formatted exFAT and filled with the exact same set of random data. MD5 hashes were taken to ensure they all matched the source data. This was completed on 2022 Sep 02. These SSDs will be stored in a drawer in the regular climate-controlled environment that I live in and will only be accessed at the following intervals:
Disk 1: “Worn” ~ 280TB written, 1 year: 2023SEP02, 3 year: 2025SEP02
Disk 2: “Worn” ~ 284TB written, 1 year: 2023SEP02, 3 year: 2025SEP02
Disk 3: “Fresh” ~ 214GB Written, 2 year: 2024SEP02, 4 year: 2026SEP02
Disk 4: “Fresh” ~ 214GB Written, 2 year: 2024SEP02, 4 year: 2026SEP02
They will be validated against the MD5 hashes taken from the source, using hashdeep64. The intent is to keep each disk plugged in for as short a time as possible, to minimize the chance of any SSD algorithm refreshing the cells by moving data around.
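I’m using hashdeep64 for the actual checks, but the idea is simple enough to sketch in Python for anyone who wants to try something similar. This is not what hashdeep does internally, just a minimal stand-in: the function names and the in-memory manifest (relative path mapped to MD5 hex digest) are my own assumptions for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large files never sit fully in memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its MD5 hash."""
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose hash no longer matches."""
    failed = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or md5_of_file(target) != expected:
            failed.append(rel_path)
    return failed
```

The manifest would be built once from the source data with `build_manifest`, then `verify` run against each SSD’s mount point at its scheduled check; an empty list means every file still matches.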
Let’s see if I can properly maintain this task, and if we get any data loss after 1, 2, 3, or 4 years’ time.
Edit: 04 January 2023 – added PCB images, changed CrystalDiskInfo images to Slideshow for easier viewing