It’s a truism that companies have more data than at any other time in history. They have more ways of collecting, generating, and storing data than ever before. Because of cloud storage, data lakes can expand almost indefinitely and no one has to look at the hard drives piling up.
And that’s all awesome. Having more data means we know more about the world. Because we know more about the world, we can make better decisions. We can make better products because we know our markets. We can sell better because we know our customers.
Or at least that’s the theory. In reality, many companies have so much data they don’t know what to do with it. Perhaps that’s because they don’t have the in-house analytics expertise to properly categorize and analyze it. Perhaps it’s because they have no clear idea why they are collecting all that data in the first place, other than that “data is good to have.”
The result is dark data: data that sits inert in huge data lakes, with no one at the company having any real idea of its potential meaning or significance.
There are two main problems with dark data. Firstly, if you don’t know what it is, it might be something bad. If you happen to be storing private user details in a data lake that isn’t properly encrypted and firewalled, there’s a potential disaster looming, both in terms of regulatory compliance and customer trust.
Secondly, dark data represents a wasted opportunity. Some data is of no value at all, but some can yield hugely valuable insights of substantial benefit to the company. If you have little idea of what your data lakes contain or how that data can be leveraged, any value is wasted.
What can you do to resolve the knotty problem of dark data?
The easiest option is to delete it. If your business is storing gigabytes or terabytes of data that it isn’t using, that it isn’t required to store for regulatory compliance, and that isn’t well understood by anyone at the company, simply throwing it away is a reasonable choice.
Deleting the data solves two problems. Firstly, if you’re not storing data, it can’t leak from your network and embarrass your company. Secondly, data storage costs money, and if the data is unused, that money is wasted.
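The three criteria above (unused, not required for compliance, not understood by anyone) can be turned into a simple filter over a data inventory. The following is only a minimal sketch, assuming such an inventory exists. The `Dataset` fields, the one-year threshold, and the sample dataset names are all hypothetical, chosen for illustration rather than taken from any particular tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Dataset:
    # All fields are hypothetical inventory attributes, for illustration only.
    name: str
    last_accessed: date
    under_retention: bool  # must be kept for regulatory compliance
    has_owner: bool        # someone at the company understands this data


def deletion_candidates(datasets, today, unused_after_days=365):
    """Return names of datasets that are unused, unregulated, and unowned."""
    cutoff = today - timedelta(days=unused_after_days)
    return [
        d.name
        for d in datasets
        if d.last_accessed < cutoff
        and not d.under_retention
        and not d.has_owner
    ]


inventory = [
    Dataset("old_clickstream", date(2021, 3, 1), False, False),
    Dataset("tax_records", date(2020, 1, 1), True, False),
    Dataset("orders", date(2023, 12, 15), False, True),
    Dataset("legacy_crm_export", date(2020, 6, 1), False, True),
]

candidates = deletion_candidates(inventory, today=date(2024, 1, 1))
print(candidates)
```

A list like this is a starting point for human review, not an automatic delete job. Whether a dataset really falls outside a retention requirement is a question for whoever handles compliance.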
The other option is to analyze the data. If it is already stored on a cloud platform, it’s perfectly placed for analysis. This course of action requires an investment by the company and may well mean hiring someone who has the expertise to deploy and manage an analytics solution.
The benefits here are obvious. The company learns what value the data holds, becomes aware of any risks it poses, and can apply any insights gleaned to future decision-making.
Be Smart About Data Collection
Data itself is of no real value. It’s just ones and zeros until the business forms some plan for dealing with it. The best way to make sure that data doesn’t become dark data is to have clear goals in mind — a big data strategy — that informs what data is to be collected and what value the company hopes to reap from its collection.
Data is both an asset and a potential liability. The only way to know which of these your business’s collected data represents is to invest resources in shaping a coherent strategy that encompasses data collection, storage, and analysis.