IBM invented the hard disk in 1956. It stored the equivalent of 3.5MB, was 5’ high by 6’ wide, weighed a ton (literally, not figuratively) and cost $7,000/month (in 2020 dollars) to lease. And while that may seem light-years away from today’s situation, with 8TB drives available for $150, there is one important similarity: companies ran out of space almost instantly.
The story of how fast data is growing is an old one – 64 years old in fact. But recent events have fundamentally changed how we create data. In the emerging Data Economy, data is almost always machine-created and that’s a big deal.
First, there is a limit to how fast people can create data. Research instruments, design simulations, sensors, imaging systems and other kinds of machines, on the other hand, can quickly generate petabytes and even exabytes of data.
Second, this new class of data is unstructured. This is not the nice, ordered, well-mannered data that sits in massive databases. This is messy data that sits in billions of files, mostly in on-premises enterprise NAS systems. Unlike structured data, the unstructured data of the Data Economy is unruly and extremely difficult to manage.
Igneous recently set out to better understand the effects this explosion of unstructured data is having on enterprises. Last December we commissioned the Rise of the Data Economy survey at the AWS re:Invent show in Las Vegas. Here’s what we found.
An Unstructured Data Tsunami Is Here
The first thing we discovered is the massive scale of unstructured data across modern data enterprises. Most enterprises (60 percent) now manage more than one billion files, and that number is growing. The top 10 percent of that group manage at least 150 billion files comprising at least 83 petabytes of data. Good thing hard disks have advanced: at 3.5MB apiece, that would be roughly 24 billion of IBM’s original hard disks. Managing this much data is clearly going to require an innovative new approach.
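For readers who want to check the disk comparison, here is a back-of-the-envelope sketch. It assumes decimal units (1MB = 10^6 bytes, 1PB = 10^15 bytes) and uses the 3.5MB, 8TB and 83-petabyte figures quoted in this post; the function and constant names are ours, purely for illustration.

```python
# Rough check of how many disks 83 PB would need.
# Assumption: decimal (SI) units; figures are the ones quoted in this post.

MEGABYTE = 10**6
TERABYTE = 10**12
PETABYTE = 10**15

TOTAL_BYTES = 83 * PETABYTE          # top-10-percent data volume from the survey
ORIGINAL_DISK_BYTES = 3_500_000      # 3.5 MB, the 1956 disk capacity cited above
MODERN_DISK_BYTES = 8 * TERABYTE     # 8 TB, the modern drive cited above


def disks_needed(total_bytes: int, disk_bytes: int) -> int:
    """Whole disks required to hold total_bytes (ceiling division)."""
    return -(-total_bytes // disk_bytes)


original_count = disks_needed(TOTAL_BYTES, ORIGINAL_DISK_BYTES)
modern_count = disks_needed(TOTAL_BYTES, MODERN_DISK_BYTES)

print(f"1956-era disks needed: {original_count:,}")
print(f"8 TB disks needed:     {modern_count:,}")
```

Under these assumptions the 1956-era count comes out in the tens of billions (about 23.7 billion), while the same data fits on roughly ten thousand 8TB drives, before any allowance for redundancy or file system overhead.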
Most respondents (70 percent) told us that managing their unstructured data is “somewhat to extremely difficult.” The reason? Data-centric enterprises are finding it really hard to move massive stores of unstructured file data. They struggle to move it between cloud platforms, between on-prem tiers, and between on-prem and the cloud.
It’s even harder to gain visibility into billions of files. When you combine the difficulty of moving unstructured data with a fundamental lack of visibility, things can get downright ugly. If you cannot see and move data at scale, it’s impossible to back up or archive that data. In fact, half of our survey respondents said they aren’t even trying to back up or archive data to the cloud.
Curious how you compare to other enterprises preparing to weather this unstructured data tsunami? Check out our full survey report here, which includes Igneous’ best practices on managing unstructured data.