Data Management Fundamentals

Companies eager to maximize the value of their digital data are taking small steps to establish the strong foundational policies and practices needed for managing data.

Three areas of activity are emerging, according to a recent report from IT trade association CompTIA:

  1. First, companies are evaluating the sources of their data, how it flows through the organization, and how often it is processed for analysis.

  2. Next, companies are structuring data management teams of in-house staff, third-party providers of data services, or a combination of the two.

  3. Finally, some organizations are evaluating how blockchain, other digital ledger technologies, or data-driven artificial intelligence may fit in.


"For many organizations data management has only recently become a priority," said Seth Robinson, senior director of technology analysis at CompTIA. "So while there is much work to be done, they are unencumbered by legacy thinking and can begin with a blank slate."

The share of companies that feel they are exactly where they want to be with their data capabilities has fallen. One in four respondents said their company is exactly where it wants to be today, down from 31% in a 2015 survey. Another 41% said they are very close today, up marginally from 38% in 2015.

Three stages of data management

Components of data management

Companies must have a solid foundation of data management before taking on drastically new pieces. It is better to have a single overarching approach rather than treat big data as something separate.

The first component of a data management strategy is to fully understand the different sources of data. Data sources range from business operations (financial systems, ERP) to customer profiling (customer management, social media) to IT concerns (cybersecurity, help desk). Different departments may be finding new uses for their own datasets, but the best opportunity for insight and automation comes from pulling all the data together and finding nonintuitive connections. The types of applications producing data are currently slanted towards the most common business applications, but as cloud systems allow for greater application complexity, there will be a wider variety of data.

Common sources for data include:

  • Financial systems
  • Customer management applications
  • Social media
  • HR systems
  • Cybersecurity tools
  • Internet of Things devices
  • Help desk ticket systems
  • ERP systems

The second step of the data management flow is processing and organizing the data. This stage has expanded significantly in recent years, with companies moving beyond SQL tools and a standard Extract/Transform/Load (ETL) process into NoSQL, NewSQL, in-memory databases, and other tools that can handle unstructured data along with structured data.
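To make the Extract/Transform/Load idea concrete, here is a minimal sketch using only Python's standard library. The ticket data, column names, and function names are hypothetical and chosen purely for illustration; a production pipeline would likely look quite different.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a help desk system (illustrative data only).
RAW_CSV = """ticket_id,department,hours_open
101, Finance ,4.5
102,IT,
103,hr,12
"""

def extract(text):
    """Extract: read raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize department names and drop rows missing hours_open."""
    cleaned = []
    for row in rows:
        hours = row["hours_open"].strip()
        if not hours:
            continue  # skip incomplete records
        department = row["department"].strip().upper()
        cleaned.append((int(row["ticket_id"]), department, float(hours)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tickets "
        "(ticket_id INTEGER PRIMARY KEY, department TEXT, hours_open REAL)"
    )
    conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)", rows)
    conn.commit()

def run_etl(text):
    """Run all three steps against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn
```

Calling `run_etl(RAW_CSV)` returns a connection whose `tickets` table holds only the two complete records, with department names normalized to upper case.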

One of the most important metrics for the processing stage is the speed of moving from raw data to analysis. The default goal is real-time processing, but the devil is in the details. Just as there is increasing cost in moving from four nines to five nines in reliability, moving closer to real-time analysis demands sizable investment. Among those individuals who are aware of their company’s data processing approach, there is a relatively even split between those who say they are processing data as quickly as possible and those who currently rely more on batch jobs happening on a weekly or monthly basis.
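The "nines" comparison is easier to appreciate with the arithmetic spelled out. The short sketch below converts an availability target into a yearly downtime budget; the function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year

def allowed_downtime_minutes(nines):
    """Return the yearly downtime budget for N nines (for example, 4 means 99.99%)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability

for n in (3, 4, 5):
    print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes per year")
```

Four nines allows roughly 52.56 minutes of downtime per year, while five nines allows only about 5.26 minutes. Each additional nine cuts the budget by a factor of ten, which is why the cost of each step rises so steeply.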

The final step is data analysis, where the goal can be either understanding trends from the past or predicting the direction of the future. Again, speed is important here, along with matching analysis and visualization to the audience and the overall business flow. It is interesting to note that at this time, the heaviest concentration of stakeholders in the data management process is at the department level. Moving forward, there will be a stronger demand for data analytics at the highest levels of the organization.

Three major trends

Over the next 12 months, these trends will shape the data management process:

  1. Natural language processing will open the door to data analysis being performed by non-technical staff. Without having to know a specific coding language, employees can query the data using terms that make sense to them.

  2. Graph databases will rise in popularity with the increased interest in relationships between disparate data points. By far, the most popular tool used for data manipulation and analysis is the spreadsheet, which has many limitations in dealing with today’s data. Graph databases such as Amazon Neptune and Neo4j use a foundation of nodes and relationships to bring more performance, flexibility, and agility to data analysis.

  3. Artificial intelligence will bring more capability to data processing. Whether it is automation of certain steps or augmented analysis using machine learning algorithms, companies will leverage AI to handle the massive scope and complexity of their data and to assist their data management teams in driving value.
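The "nodes and relationships" foundation mentioned in the second trend can be illustrated with a small in-memory sketch. This is plain Python, not the API of any graph database product, and the node names and relationship labels are hypothetical. It shows how a traversal can surface a connection between two data points that live in different systems.

```python
from collections import deque

# Hypothetical nodes and relationships; names are illustrative only.
RELATIONSHIPS = [
    ("customer:alice", "PLACED", "order:1001"),
    ("order:1001", "CONTAINS", "product:router"),
    ("ticket:77", "ABOUT", "product:router"),
    ("ticket:77", "OPENED_BY", "customer:bob"),
]

def build_graph(relationships):
    """Index relationships by node, in both directions, for traversal."""
    graph = {}
    for source, label, target in relationships:
        graph.setdefault(source, []).append((label, target))
        graph.setdefault(target, []).append((label, source))
    return graph

def shortest_path(graph, start, goal):
    """Breadth-first search returning the nodes from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _label, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

With this data, `shortest_path(build_graph(RELATIONSHIPS), "customer:alice", "customer:bob")` links a sales record to a help desk ticket through the product both customers share, the kind of relationship that is awkward to express in a spreadsheet.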

Digital ledger technology

One of the innovations that's slowly beginning to take hold is the use of blockchain or other digital ledger technologies (DLT) as data structures.

Twenty-one percent of companies said they're developing tools that use DLT as an underlying technology, while 27% have purchased tools that use DLT. Among the applications companies are exploring are digital identity, smart contracts, distributed storage, regulatory compliance, asset management, and cryptocurrency and payments.

"While these results are likely skewed due to early lack of understanding around the technology, there is clearly high interest and potential for DLT solutions," Robinson said.

The top challenges companies expect to face when it comes to DLT solutions include a lack of skill in working with and knowledge about the technology; uncertainty over the benefits and use cases for the technology; and a lack of buy-in from other companies, which is a necessity to realize the true value of DLT.

Building strong data practices

Businesses looking to build out their data practices should focus on four areas.

  1. Data Sourcing to understand where data is coming from and ensure that it is "clean" and of high quality.

  2. Data Processing to ensure that different types of data are processed, stored, and integrated appropriately.

  3. Data Analytics to tie data to corporate objectives and present findings in a way that business leaders will understand.

  4. Data Security to ensure integrity and minimize risks from data loss or breaches.
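As a rough illustration of what checking for "clean" data in the sourcing step might involve, the sketch below audits a handful of records for missing values, malformed values, and duplicates. The records, field names, and rules are hypothetical; real data quality checks depend on the business and the systems involved.

```python
# Hypothetical customer records from several sources; field names are assumptions.
RECORDS = [
    {"id": "1", "email": "ana@example.com", "source": "crm"},
    {"id": "2", "email": "", "source": "crm"},
    {"id": "1", "email": "ana@example.com", "source": "helpdesk"},
    {"id": "3", "email": "not-an-email", "source": "social"},
]

def audit(records):
    """Split records into clean rows and (record, reason) rejects."""
    clean, rejected, seen_ids = [], [], set()
    for record in records:
        if not record["email"]:
            rejected.append((record, "missing email"))
        elif "@" not in record["email"]:
            rejected.append((record, "malformed email"))
        elif record["id"] in seen_ids:
            rejected.append((record, "duplicate id"))
        else:
            seen_ids.add(record["id"])
            clean.append(record)
    return clean, rejected
```

Keeping the `source` field on each rejected record makes it possible to see which upstream system is producing most of the problems.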

These areas are relevant to current operations and, more importantly, to companies' use of emerging technologies and trends to support their advanced data management efforts.

Robinson compared the so-far modest embrace of data management to the practices companies once used with cybersecurity.

"For some time 'good enough' was as far as many companies were willing to go with their cybersecurity investments and activities," he explained. "They soon learned that stance was not nearly good enough and cybersecurity was elevated from an IT function to a business priority. We may see the same trajectory when it comes to comprehensive data management."

The findings are based on a survey of workforce professionals at 400 U.S. companies covering a range of industry sectors and company sizes.

Article published by icrunchdata
Image credit by Getty Images, Moment, MR.Cole_Photographer