How Well Are You Protecting Your Data Lake?

CIOs tell me they see the potential to create significant value for their business customers through big data and the so-called ‘data lake’. They say data lakes matter because they increase the availability and transparency of data and give new business users the ability to find answers. In particular, they see value in allowing users to create value without relying on IT.

Interestingly, most CIOs think of self-service business intelligence and the data lake as a single business requirement. Clearly, effective data lakes are not just about data storage but also about letting end users and data scientists explore the data within them on their own.

On the positive front, most CIOs see the establishment of new business intelligence options as broadening the community served by IT. Several CIOs, in fact, feel that delivering these options offers the potential to drive better business/IT alignment.

On the negative side, CIOs see big data and data lakes as creating additional business risk. One IT leader put it to me this way: “The big challenge is providing necessary information while maintaining appropriate access, privacy and confidentiality.” Therefore, CIOs say it is essential that data lakes build in appropriate data protections from the start.

How far along are most organizations?

Should organizations wait to make the investment in protecting the data in their data lake? For most, data lakes remain in the laboratory. In fact, Capgemini has found that only 27% of business executives say their big data projects have achieved profitability (“The Big Data Payoff”, Capgemini IDG. 2016). Most CIOs tell me their data lakes are still largely in the experimentation phase.

CIOs say they are still trying to figure out how to do data lakes properly. To some, the current state recalls the early days of ERP. It took a while, they say, before most organizations got ERP right, and there was a lot of failure early on. They expect the same thing here.

But while most are still figuring out what to do and how, it would be a mistake to wait to protect this data. Business risk exists even if there isn’t yet a strategy for what the data lake is meant to solve, or for which measures will answer the business questions behind the data lake initiative.

Regardless of whether your data lake is in production or experimentation, it is critical that you protect this data from the start. There is a tendency to say this is in the back room and no one is going to notice. However, data security needs to happen from the outset because, unlike ERP, you are loading all your valuable data into one place.

Why make data lakes secure

Frankly, the CIOs I have talked to worry about the “all your eggs in one basket” effect. They stress the importance of having data security and privacy in place from the beginning. Data lakes, they believe, will become targets for hackers or improper internal access. This means data protection governance needs to be done sooner rather than later.

At the same time, CIOs suggest that a big challenge for data lakes is providing necessary information while maintaining appropriate access control.

How to make data lakes secure

What is needed to do this well? CIOs say that they need to be systematic. They need to protect data wherever it pools and flows, especially at a collection point like a data lake. They believe you also need the ability to govern data access centrally and to enforce those policies everywhere the data flows, regardless of the nature of the data (structured, semi-structured, unstructured) or how it is stored. This holds whether the data sits in a traditional database or in a big data file system (HDFS), and it holds equally for cloud-based systems used for BI, such as Amazon Redshift.
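To make the idea of central governance concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular product: the names Policy, PolicyRegistry and DataStore, the roles, and the classification labels are all hypothetical. The point it shows is that access rules are defined once and every store consults the same rules, with anything unlabeled denied by default.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical rule: which roles may read data of one classification."""
    classification: str
    allowed_roles: frozenset


@dataclass
class PolicyRegistry:
    """Single, central place where access policies are defined."""
    policies: dict = field(default_factory=dict)

    def register(self, policy):
        self.policies[policy.classification] = policy

    def can_read(self, role, classification):
        policy = self.policies.get(classification)
        # Deny by default: data without a registered policy is not readable.
        return policy is not None and role in policy.allowed_roles


@dataclass
class DataStore:
    """Stand-in for any location where data pools (file system, warehouse)."""
    name: str
    registry: PolicyRegistry
    records: list = field(default_factory=list)  # (classification, payload) pairs

    def read(self, role):
        # Every store enforces the same central registry, not its own rules.
        return [
            payload
            for classification, payload in self.records
            if self.registry.can_read(role, classification)
        ]


# Illustrative policies and data; all values here are made up.
registry = PolicyRegistry()
registry.register(Policy("public", frozenset({"analyst", "privacy_officer"})))
registry.register(Policy("pii", frozenset({"privacy_officer"})))

lake = DataStore("hdfs_lake", registry, [
    ("public", "store sales totals"),
    ("pii", "customer email list"),
])
warehouse = DataStore("cloud_warehouse", registry, [
    ("pii", "customer addresses"),
    ("unlabeled", "raw clickstream dump"),
])

for store in (lake, warehouse):
    for role in ("analyst", "privacy_officer"):
        print(store.name, role, "->", store.read(role))
```

Because both stores share one registry, tightening a policy in that one place changes what is readable everywhere. In practice this role is usually played by a dedicated governance or security layer rather than application code like this.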

Today, regardless of the state of data (at rest, in use or in motion), it needs to be managed systematically rather than piecemeal.

Parting remarks

CIOs are clear that data lakes need to be protected from the very beginning; otherwise, unnecessary business risk is being created. They also see a need to view data security holistically and to give it appropriate business and IT priority. Otherwise, enterprises could find themselves facing public scrutiny and loss of income or business as a result of their attempts to create, for example, greater customer intimacy.

As one CIO put it, “A company's reputation is built over years but can be destroyed in minutes. Information security is business critical in digital times.” And that, of course, includes securing the data lake, the source of digital advantage.

Article written by Myles Suer
Image credit by Getty Images, Blend Images, John Lund