32TB solid state drives (SSDs) are already sampling, and 50TB and 100TB models are expected next year. The new 3D NAND process, coupled with in-package die stacking, is increasing capacity per chip at a runaway pace, and new form factors are being explored.
As predicted, we are seeing servers with 12 or more M.2 slots, each only an inch or so wide, while Intel is offering a "ruler" drive that is essentially a foot-long M.2, with 32TB capacity projected for the near future. Intel's matching appliance design has 32 ruler slots, which would give a petabyte in 1U. Huawei has its own version of a petabyte unit, with a novel way of stacking drives in the appliance.
What do you do with all that capacity? It won't be as cheap as hard disk drives (HDDs) until late next year, but once the reduction in appliance count is considered, the numbers actually favor bulk SSDs. With HDDs stuck at around 10TB in a 3.5-inch form factor, we are looking at ten times the box count, and almost a full rack of HDD boxes for a petabyte. Those nine extra appliances aren't cheap!
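The arithmetic is easy to run yourself. The Python sketch below assumes an illustrative 4U HDD appliance holding ten 10TB drives, set against the 32-slot, 1U ruler design; both appliance configurations are my assumptions for the comparison, not vendor specs.

```python
# Back-of-the-envelope rack math behind the petabyte comparison.
# Appliance sizes are illustrative assumptions: a 4U HDD box with
# 10 x 10TB drives (~100TB raw) versus a 1U appliance with
# 32 x 32TB "ruler" SSDs (~1PB raw).

import math

TARGET_TB = 1_000                      # one petabyte, in terabytes

hdd_box_tb, hdd_box_u = 10 * 10, 4     # assumed 10-bay, 4U HDD appliance
ssd_box_tb, ssd_box_u = 32 * 32, 1     # 32-slot, 1U ruler appliance

hdd_boxes = math.ceil(TARGET_TB / hdd_box_tb)
ssd_boxes = math.ceil(TARGET_TB / ssd_box_tb)

print(f"HDD: {hdd_boxes} boxes, {hdd_boxes * hdd_box_u}U of rack space")
print(f"SSD: {ssd_boxes} box,  {ssd_boxes * ssd_box_u}U of rack space")
# -> HDD: 10 boxes, 40U of rack space; SSD: 1 box, 1U of rack space
```

Ten boxes at 4U each is 40U: nearly all of a standard 42U rack, versus a single rack unit for the SSD appliance.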
So what are those huge SSDs for? It's beginning to look like the drives will have NVM Express (NVMe) interfaces and, since all that capacity is achieved via internal parallelism, they'll have top-end performance. That puts them firmly in primary storage territory, except, of course, that flash and Optane non-volatile dual in-line memory modules (NVDIMMs) are poised to make a land grab for that space.
So the huge drives, and the not-so-huge ones too, will fit better into the role of shared capacity storage. In that space, where data is down-tiered from the fastest NVDIMMs, compression will be a major factor in operations, for two reasons: effective capacity is increased by factors of 3x or more, and network bandwidth gets the same 3x-plus boost, making transfers correspondingly shorter.
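To see why the same ratio pays off twice, here is a minimal Python sketch: it compresses a log-style payload with zlib and applies the measured ratio to both raw capacity and an assumed 100GbE link. The payload, the link speed, and the resulting ratio are all illustrative; real ratios depend entirely on the data mix.

```python
# A rough illustration of the double win from compression: the ratio
# that stretches raw capacity also stretches the wire, since only
# compressed bytes cross the network.

import random
import zlib

random.seed(0)
payload = "".join(                      # synthetic CSV-style log data
    f"2018-01-{day:02d},host{random.randrange(40)},OK,{random.randrange(500)}\n"
    for day in range(1, 29) for _ in range(5_000)
).encode()

compressed = zlib.compress(payload, 6)
ratio = len(payload) / len(compressed)

raw_tb, link_gbps = 1_000, 100          # 1 PB of raw flash, assumed 100GbE fabric
print(f"compression ratio:   {ratio:.1f}x")
print(f"effective capacity:  {raw_tb * ratio:,.0f} TB from {raw_tb:,} TB raw")
print(f"effective bandwidth: {link_gbps * ratio:,.0f} Gb/s of user data moved")
```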
The last couple of years have seen all-flash arrays acting as front ends to traditional networked storage, using the slow HDD-based gear as bulk, compressed capacity. With that gear already on site in many cases, this was a very inexpensive way to boost capacity. But legacy gear costs a lot to operate: it takes up a good deal of space and uses power, there are associated software licenses and, most importantly, a need to keep admins trained specifically for that gear.
The overall cost proposition of 1U, one-petabyte boxes with low maintenance is very compelling compared with physically big array farms. It's becoming clear that archiving will migrate to QLC flash drives, which may increase raw capacity further. Add deduplication to the storage flow to search out duplicate objects, couple it with deeper compression, and 10x data reduction may become the norm for non-media data. That's perhaps 10 to 15 effective petabytes in that 1U box.
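For readers who want the mechanics, here is a minimal Python sketch of that dedupe-plus-compress flow, assuming simple fixed-size chunks fingerprinted with SHA-256. Production systems typically use content-defined (variable) chunk boundaries, but the principle is the same: store each unique chunk once, compressed, and reference it thereafter.

```python
# Minimal dedupe-plus-compress sketch: split incoming objects into
# fixed-size chunks, keep one compressed copy per unique chunk
# (keyed by SHA-256), and record a recipe of references.

import hashlib
import zlib

CHUNK = 64 * 1024
store = {}                              # fingerprint -> compressed chunk

def ingest(data):
    """Store an object; return its recipe of chunk fingerprints."""
    recipe = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in store:             # new data: compress and keep it
            store[fp] = zlib.compress(chunk)
        recipe.append(fp)               # duplicate: reference only
    return recipe

# Two "objects" that share most of their content, e.g. VM images:
obj_a = b"base operating system image " * 100_000
obj_b = obj_a + b"small user delta " * 1_000

recipe_a, recipe_b = ingest(obj_a), ingest(obj_b)
logical = len(obj_a) + len(obj_b)
physical = sum(len(c) for c in store.values())
print(f"logical {logical:,} B -> physical {physical:,} B "
      f"({logical / physical:.0f}x reduction)")
```

Workloads full of near-duplicate objects, think VM images and backup generations, are exactly where dedupe stacks on top of compression to reach figures like 10x.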
I suspect that we’ll see a move to smaller appliances rather than putting all the petabytes under one roof. This fits in with object store clusters better, where the minimum node count is usually four. We still may get 1 PB/U, but it will be in the form of four boxes fitted side-by-side in 1U. The result should be a better match of networks bandwidth to drive performance.