Why You Need Data Observability in the Digital Workplace
In early July, IBM announced it had acquired Tel Aviv-based Databand.ai, a provider of data observability software whose mission is to help organizations fix issues with their data. It wasn’t exactly a surprise. IBM is on a buying spree at the moment.
But the acquisition of this particular company, a startup with an observability platform for data and machine-learning pipelines, highlights a growing focus on the need for enhanced AI and transparency across data infrastructures. According to IBM — and most likely any company focused on the development of AI, machine learning or any technology that works with big data sets — as the volume of data continues to grow, organizations are struggling to manage the health and quality of their data sets.
One of the fixes for this, IBM believes, is data observability, which gives organizations a better view of the data in their systems and automatically identifies, troubleshoots and resolves problems like breaking data changes or pipeline failures in near real time.
Data Observability Technologies
According to Kunal Agarwal, founder and CEO of Palo Alto, Calif.-based cloud company Unravel Data, the technology behind observability shares the same foundational principles as monitoring. Telemetry data captured through lightweight instrumentation (agents and sensors) — logs, metrics, traces, events — is stitched together in a unified view to provide visibility into and understanding of what’s happening across various distributed systems and technologies.
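To make that idea concrete, here is a small, hypothetical sketch of the stitching step: a handful of invented log, metric and trace records for one pipeline run are joined on a shared run identifier to form a single view. The record shapes and the "run_id" key are illustrative assumptions, not any vendor's actual format.

```python
from collections import defaultdict

# Hypothetical telemetry records; in practice these would come from agents
# and sensors emitting logs, metrics, traces and events.
logs = [
    {"run_id": "job-42", "ts": 1690000010, "level": "ERROR", "msg": "schema mismatch on column 'price'"},
]
metrics = [
    {"run_id": "job-42", "ts": 1690000005, "name": "rows_written", "value": 0},
]
traces = [
    {"run_id": "job-42", "span": "load_to_warehouse", "duration_ms": 12400, "status": "failed"},
]

def unify(logs, metrics, traces):
    """Stitch telemetry from different sources into one view per pipeline run."""
    view = defaultdict(lambda: {"logs": [], "metrics": [], "traces": []})
    for key, stream in (("logs", logs), ("metrics", metrics), ("traces", traces)):
        for record in stream:
            view[record["run_id"]][key].append(record)
    return dict(view)

unified = unify(logs, metrics, traces)
print(unified["job-42"])  # everything known about this run, in one place
```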
Different observability solutions apply different analytics (dependency detection, visualization, anomaly detection, root cause analysis) with differing levels of sophistication to provide comprehensive, real-time insight into the behavior, performance and health of the systems under observation.
Where observability differs from monitoring is that while both can tell you what is going on, observability can also tell you why. Observability, therefore, makes it easier to find out what you need to know, although it is still up to the user to act on this information.
The better observability tools also identify, based on patterns, things organizations need to pay attention to or areas they may not need to investigate. By applying machine learning and statistical algorithms, they essentially throw math at the correlated data to identify significant patterns: what has changed, what hasn’t, what is different.
“It’s the same kind of analysis a human expert would do, only done automatically with the help of ML and statistical algorithms,” Agarwal said.
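As a rough illustration of the statistical side of that analysis, the sketch below applies a simple z-score test to a single metric series and flags a value that falls far outside its recent baseline. The metric, threshold and numbers are made up; real tools run much richer analysis across many correlated signals at once.

```python
from statistics import mean, stdev

def changed_significantly(history, latest, threshold=3.0):
    """Flag a metric whose latest value deviates sharply from its recent baseline.

    A simple z-score test stands in for the ML and statistical analysis an
    observability tool would run across many correlated signals at once.
    """
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Daily row counts for a table: stable for a week, then a sudden drop.
row_counts = [10_120, 10_098, 10_135, 10_087, 10_110, 10_143, 10_101]
print(changed_significantly(row_counts, 4_550))   # True: worth a human's attention
print(changed_significantly(row_counts, 10_125))  # False: nothing has changed
```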
There’s a next step beyond observability, where AI is applied to identify not just what went wrong and why but also what to do about it. This requires deep intelligence about the systems under observation (sophisticated modeling, granular details, comprehensive AI), which only a couple of solutions offer at this time.
3 Key Approaches to Data Observability
Data observability is key for any organization that has adopted DevOps as part of its technology strategy. It is, in fact, a crucial part of any DevOps or site reliability engineering (SRE) initiative, said Farzad Rashidi, co-founder of Rockville, Md.-based Respona, as it allows organizations to quickly identify and fix problems before they cause major service disruptions.
Data observability can be achieved through a variety of means, including logging, monitoring and tracing. Each has its own strengths and limitations.
1. Logging
Logging, the process of capturing events and messages generated by a system, is the most basic form of data observability. It is often the first step that organizations take when trying to improve their understanding of their systems. Logging can be very helpful in identifying issues, but it has some limitations. Of note, it can be difficult to determine the root cause of a problem from log data alone, and logs can quickly become overwhelming, making it difficult to find the information you need.
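For illustration, a minimal structured-logging setup along these lines might look like the sketch below, where each event is emitted as a JSON line with a bit of context. The pipeline name and run identifier are hypothetical, and the result still illustrates logging's limitation: the line records what happened, not why.

```python
import json
import logging

# A minimal structured-logging setup: each event is emitted as one JSON line,
# which keeps logs searchable even as their volume grows.
class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "event": record.getMessage(),
            **getattr(record, "context", {}),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("pipeline")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# The pipeline name and run id are illustrative; context like this helps later
# triage, but the log line alone still won't tell you *why* the load failed.
logger.error("load failed", extra={"context": {"pipeline": "orders_daily", "run_id": "job-42"}})
```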
2. Monitoring
Monitoring is the act of collecting data about system performance and health. The process can be used to identify trends and issues that would be difficult to spot in log data alone. Plus, it can provide a more complete picture of system health than logging, so it also helps overcome some of the earlier limitations. However, monitoring data can be overwhelming, and much like logging, it can be difficult to determine the root cause of a problem from monitoring data alone.
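A monitoring sketch under similar assumptions might periodically sample a couple of health metrics and look for trends across runs, something that is hard to see in individual log lines. The metric names, thresholds and values here are invented for illustration.

```python
import time

def collect_metrics(fetch_row_count, fetch_lag_seconds):
    """Sample a few health metrics for a dataset; the fetch_* callables stand in
    for whatever your warehouse, scheduler or agent actually exposes."""
    return {
        "ts": time.time(),
        "row_count": fetch_row_count(),
        "freshness_lag_s": fetch_lag_seconds(),
    }

def trending_up(samples, key, window=5):
    """Crude trend check: has the metric risen in every one of the last few samples?
    Trends like this are hard to spot in raw log lines but obvious in a metric series."""
    recent = [s[key] for s in samples[-window:]]
    return len(recent) == window and all(a < b for a, b in zip(recent, recent[1:]))

# Simulate five monitoring runs in which freshness lag creeps steadily upward.
samples = [collect_metrics(lambda: 10_000, lambda lag=lag: lag)
           for lag in (60, 90, 140, 220, 380)]
print(trending_up(samples, "freshness_lag_s"))  # True: investigate before it becomes an outage
```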
3. Tracing
Tracing is often used to supplement logging and monitoring data because it can be very helpful in identifying the root cause of a problem — where the other two approaches tend to fail. With this approach, companies can follow the flow of data through a system to identify bottlenecks and slowdowns. But tracing can be expensive and time-consuming, and it is often not practical for large systems.
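A bare-bones version of the idea, assuming a simple three-stage pipeline running in a single process, is to time each stage so the slowest one stands out. Production tracing propagates context across services and processes, which is where much of the cost and complexity comes from.

```python
import time
from contextlib import contextmanager

spans = []

@contextmanager
def span(name):
    """Record how long one stage of the pipeline takes, so the slow stage stands out."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append({"stage": name, "duration_s": time.perf_counter() - start})

# The stage names and sleeps are stand-ins for real extract/transform/load steps.
with span("extract"):
    time.sleep(0.05)
with span("transform"):
    time.sleep(0.30)   # the bottleneck
with span("load"):
    time.sleep(0.02)

bottleneck = max(spans, key=lambda s: s["duration_s"])
print(f"slowest stage: {bottleneck['stage']} ({bottleneck['duration_s']:.2f}s)")
```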
Data observability is a crucial part of any DevOps or SRE initiative, but it is only one piece of the puzzle. “In order to truly understand a system,” Rashidi said, “organizations need to take a holistic approach that includes data from all of these different sources.”
Business Value of Observability
While data observability strategies can be expensive to implement, the business case for them is strong. When organizations have the ability to streamline data observability with a well-defined toolkit, they can unlock several advantages, said Sharad Varshney, co-founder and CEO of data governance consultancy OvalEdge, including finding and eliminating bottlenecks, identifying where data is going and where it's been, understanding the relationships between data assets and supporting better data quality.
All of these benefits are crucial in the digital workplace because the most critical business decisions are data-driven. In addition, for data-driven decisions to gain support and buy-in, there needs to be trust in the data.
Most processes, departments, roles and responsibilities in a digital workplace incorporate integrated applications. These applications enable data-driven decision-making through the various access points and analysis tools. Organizations that can audit and observe the state of their data landscape have the ability to ensure the data from these applications is legitimate and suitable for use.
One of the benefits of having observable data company-wide is fewer errors. Fewer errors mean quicker delivery times for projects and workflows. This is particularly important in today's fast-paced disruptive environment, where speed of execution is often key.
“When analytics and dashboards are inaccurate, business leaders may not be able to solve problems and pursue opportunities," said Varshney. “When there's an outage in critical data, every second counts. Data observability enables you to ascertain the state of a system by observing its external outputs.”
An End-to-End View of Performance
As data moves from source to destination across complex data landscapes, incidents can occur that impact service reliability, performance and data accuracy. Data generated in one source system may feed multiple data pipelines, and those pipelines may feed other pipelines or applications that depend on their output, said Rohit Choudhary, CEO and co-founder of Campbell, Calif.-based Acceldata.
The ability to assess data quality at any point in the process is needed to prevent schema or data drift from source to destination. With an end-to-end view of data, multidimensional data observability provides visibility that tracks the data journey from origin to consumption across interconnected infrastructure, pipeline and data layers.
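One narrow slice of that end-to-end view is a schema drift check between two points in a pipeline. The sketch below compares column names and types held in plain dictionaries; the schemas and column names are invented, and a real tool would read them from the warehouse's metadata or the pipeline's contracts.

```python
def schema_drift(source_schema, destination_schema):
    """Compare column names and types between two points in a pipeline.

    The schemas here are plain dicts of column -> type; a real tool would pull
    them from the warehouse's information schema or the pipeline's contracts.
    """
    added = set(destination_schema) - set(source_schema)
    dropped = set(source_schema) - set(destination_schema)
    retyped = {c for c in set(source_schema) & set(destination_schema)
               if source_schema[c] != destination_schema[c]}
    return {"added": added, "dropped": dropped, "retyped": retyped}

source = {"order_id": "int", "amount": "decimal", "placed_at": "timestamp"}
dest = {"order_id": "int", "amount": "varchar", "placed_at": "timestamp", "discount": "decimal"}
print(schema_drift(source, dest))
# {'added': {'discount'}, 'dropped': set(), 'retyped': {'amount'}}
```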
Monitoring is sufficient for simply detecting when something failed. But observability goes further with analytics to gain deeper insights into usage, anomalies and trends in the data pipeline. An observability platform detects potential blind spots and provides alerts, recommendations and automation that give data teams the ability to rapidly fix issues of data validity and quality.
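To make the alerting idea concrete, here is a small, hypothetical pair of checks, one on volume and one on validity, that return alert messages when they fail. The thresholds and the customer_id column are assumptions rather than any platform's defaults.

```python
def run_quality_checks(rows, max_null_rate=0.01, expected_min_rows=1_000):
    """Run two basic checks on a batch of rows and return alert messages.

    The thresholds and the 'customer_id' column are illustrative; a platform
    would learn or configure these per dataset and route alerts to the data team.
    """
    alerts = []
    if len(rows) < expected_min_rows:
        alerts.append(f"volume check failed: only {len(rows)} rows arrived")
    null_rate = sum(1 for r in rows if r.get("customer_id") is None) / max(len(rows), 1)
    if null_rate > max_null_rate:
        alerts.append(f"validity check failed: {null_rate:.1%} of rows missing customer_id")
    return alerts

rows = [{"customer_id": None}] * 50 + [{"customer_id": i} for i in range(400)]
for alert in run_quality_checks(rows):
    print("ALERT:", alert)
```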
By reducing data risk and data cost, data observability helps maximize a company's return on data investment. Data observability solutions should also reduce the number of steps and level of complexity in managing incidents and restoring health while minimizing the impact on business operations.
“A unified data observability platform provides visibility and control at every layer of your data infrastructure, in every repository and pipeline, no matter how expansive,” said Choudhary.
About the Author
Mike Prokopeak is editor in chief at Reworked, the premier publication covering the r/evolution of work, where he leads content development focused on the transformation of the workplace.