3 Questions Organizations Must Answer to Understand Their Unstructured Data

By Jeff Hughes on August 7, 2018

Topics: Data Management

Today, businesses are more data-centric than ever, and unstructured data sits at the heart of their business-critical information. For many organizations, understanding their unstructured data (knowing what’s there, what’s growing, and where it lives, and being able to make informed data management decisions) is of utmost importance.

Yet, understanding unstructured data, which includes images, videos, and other data that cannot be stored in rows and columns, is orders of magnitude harder than understanding structured data sets.

Read more: The difference between structured and unstructured data

Why? Unstructured data, created and used by a combination of machines and people, lacks the organizing principles of databases and virtual machines. Traditional enterprise data management solutions exist, but they are aligned to structured data, not to unstructured data management requirements.

As a result, organizations are left with big unknowns that prevent them from fully understanding their unstructured data and making data management and business decisions based on these insights. And ultimately, teams cannot unlock the full strategic value of their unstructured data assets.  

To truly understand their unstructured data, organizations must be able to answer the following questions across their entire infrastructure and at any level of granularity.

1. What is growing the fastest?

Many IT teams know only that their data is growing and that their spend increases year over year, without knowing where the growth comes from. In household budget terms, this is like watching your annual spending rise with no visibility into where the extra money is going.

Data management platforms must give IT insight into how their data is growing and which datasets are growing the most quickly.
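As a rough illustration of the growth question, the sketch below compares two size snapshots and ranks datasets by how many bytes each one added. The function name, the dataset names, and the idea of recording per-dataset sizes at two points in time are all assumptions made for this example, not a description of any particular product.

```python
from typing import Dict, List, Tuple


def rank_by_growth(
    previous: Dict[str, int], current: Dict[str, int]
) -> List[Tuple[str, int]]:
    """Return (dataset, bytes_added) pairs, fastest-growing first.

    Both arguments map a dataset name to its total size in bytes.
    A dataset missing from `previous` is treated as new (size 0).
    """
    growth = {
        name: size - previous.get(name, 0)
        for name, size in current.items()
    }
    # Sort by bytes added, largest first.
    return sorted(growth.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical snapshots taken a month apart.
    last_month = {"genomics": 100, "video": 50}
    this_month = {"genomics": 120, "video": 200, "logs": 10}
    for name, added in rank_by_growth(last_month, this_month):
        print(f"{name}: +{added} bytes")
```

Ranking by absolute bytes added is one choice among several; a percentage-based ranking would highlight small datasets that are growing quickly instead.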

2. How old is my data?

This is another way of asking how active each dataset is. Breaking down the age of each dataset by datacenter, and by each level beneath that down to the directory, lets organizations draw insights at whatever granularity they need.

For example, older datasets could be targeted for archive. It can also be helpful to know what datasets will continue to consume resources over time. Some datasets are very old but must be kept around for various reasons; these are like fixed costs in an organization’s budget.
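A minimal sketch of a directory-level age breakdown follows. It assumes the data is reachable as an ordinary file tree and uses each file's last-modified time as a stand-in for activity, which is only an approximation. The bucket boundaries and function names are hypothetical and would need to match your own retention policy.

```python
import os
import time
from collections import Counter, defaultdict

# Hypothetical age buckets (upper limit in days, label).
BUCKETS = [(30, "under 30 days"), (365, "30 days to 1 year")]
OLDEST = "over 1 year"


def bucket_for(age_days):
    """Return the label of the first bucket whose limit exceeds age_days."""
    for limit, label in BUCKETS:
        if age_days < limit:
            return label
    return OLDEST


def bytes_by_age(root, now=None):
    """Map each directory under root to total bytes per age bucket."""
    now = time.time() if now is None else now
    report = defaultdict(Counter)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                info = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_days = (now - info.st_mtime) / 86400
            report[dirpath][bucket_for(age_days)] += info.st_size
    return report
```

Directories whose bytes fall mostly in the oldest bucket are the natural candidates to review for archiving.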

Unfortunately, current solutions do not offer an easy way to know the age of your data, despite the value of this information. We’ve worked with customers that have built their own tools to obtain it, but this is a time-consuming approach. Modern data management platforms will need to enable IT to find out how old their data is quickly and easily.

3. Who is using this data?

As with managing a household budget, it’s important to know how much each party consumes and spends. Knowing which groups and workflows are using the most data arms IT with knowledge that can inform budgeting and resource distribution decisions. Usually, this is a question that enterprise IT can answer in a basic capacity with existing tools.
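The basic version of this answer can be sketched by totaling file sizes per owner. The example below assumes a POSIX-style file tree where each file records a numeric owner ID; the function name is hypothetical, and file ownership is only a coarse proxy for who actually uses the data.

```python
import os
from collections import Counter


def bytes_by_owner(root):
    """Total file size under root, keyed by numeric owner ID (st_uid)."""
    usage = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                # lstat so a symlink is counted as itself, not its target.
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            usage[info.st_uid] += info.st_size
    return usage
```

Calling `bytes_by_owner(path).most_common(5)` would list the five owner IDs holding the most bytes under a given path.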

However, a modern solution should provide more detailed insights, such as which clients or specific applications consume the most resources.

Which questions can your organization answer with your current data management tools? 

We'd love to hear from you. Contact us and let's chat about how our modern data management platform can help you understand your unstructured data.
