Data Fabric Might Be the Answer to Data Management Struggles
Data has become one of the most precious assets in business, if not the most precious. But while the value of data has become clear, data management remains a challenge.
Companies are stockpiling data with limited knowledge of how it can be used, or whether it can be used at all. For many organizations, data is collected across silos and often stored in a way that renders it nearly unusable. To solve this data governance problem, data fabric technologies have emerged as a way to unify and integrate data and data silos. Gartner predicts that by 2024, data fabric deployments will have made data utilization four times more efficient and will have cut human-driven data management tasks in half.
The use cases, or advantages of data fabric, have also been noted by enterprise leaders. In fact, further Gartner research shows that data integration (49%) and data preparation (37%) are among the top three technologies that organizations would like to automate by the end of 2022. The result is a considerable rise in the demand for data architectures that can use metadata, semantics, artificial intelligence (AI) and machine learning (ML) algorithms, and knowledge graphs to push augmented data integration and data management. But is data fabric the right solution?
What Is Data Fabric?
Data fabric is a group of technologies intended to make it easier for users to access and share information in a distributed data environment. While this sounds straightforward, exactly what data fabric is, what it is not and what it should do remains a matter of debate.
According to Matt Wallace, chief technology officer of Denver-based Faction, there is still no clear industry consensus on the definition of a data fabric. To some, particularly storage vendors, data fabric is a term for providing data services, such as storage and capabilities, across on-premises and cloud environments. To others, it's a data management design for flexible, reusable and augmented data management through metadata.
But data that is spread across locations and backend platforms can amplify the pain that necessitates a data fabric approach in the first place. This is referred to as data gravity, a term coined by Dave McCrory, vice president of growth and global head of insights and analytics at Austin-based Digital Realty.
Data gravity refers to the idea that as data accumulates, there is greater likelihood that additional services and applications will be attracted to this data. By this logic, data gravity makes many aspects of data management more difficult. Yet, according to Wallace, data gravity provides an interesting lens into the benefits and requirements of a data fabric architecture. In Wallace's view, an approach that reduces the need for copies enables broader access to centralized data and allows for a more homogeneous approach across teams. This is where data fabric comes in.
Data Fabric Reconciles Data Management Technologies
To be clear, a data fabric is not a single tool. Instead, it is a specialized architecture. The primary role of this architecture is to support integration and connect dispersed data management technologies.
In a modern, data-driven organization, various data tools and cloud infrastructure work in a decentralized ecosystem that enables complex data management and analysis, Sharad Varshney, CEO of Atlanta-based OvalEdge, said. There are multiple advantages in deploying data fabric:
- Integration: When data systems are integrated, they are easier to access. Data access is a critical element of data literacy and vital for adopting data technologies.
- Governance: Using a data fabric, organizations can create unified data governance protocols that work across all data systems. Whereas data governance would previously have been a separate consideration for each system, dependent on the mechanics of its technology, a data fabric architecture enables organizations to entrench the same protocols in every system.
- Security: A data fabric architecture provides an extra layer of security. This security layer helps mitigate cyber attacks and enables organizations to instill more robust data privacy provisions.
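To make the governance point in the list above concrete, here is a minimal Python sketch of a single policy applied identically to records from two different systems. It is an illustration only: the `SENSITIVE_FIELDS` set, the `apply_policy` function and the sample records are assumptions invented for this example, not part of any data fabric product.

```python
from typing import Dict, List

Record = Dict[str, object]

# Assumed policy for this sketch: these field names are sensitive in every system.
SENSITIVE_FIELDS = {"email", "ssn"}


def apply_policy(records: List[Record]) -> List[Record]:
    """Return copies of the records with sensitive fields masked."""
    return [
        {key: "***" if key in SENSITIVE_FIELDS else value
         for key, value in record.items()}
        for record in records
    ]


# Hypothetical rows from two unrelated systems.
crm_rows = [{"name": "Ada", "email": "ada@example.com"}]
billing_rows = [{"account": 12, "ssn": "000-00-0000"}]

# The same protocol is enforced regardless of where the data came from.
masked_crm = apply_policy(crm_rows)
masked_billing = apply_policy(billing_rows)
```

Because the policy lives in one place rather than in each system, changing what counts as sensitive means changing one definition.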
Still, the ultimate benefit of a data fabric architecture may be that it provides organizations with a better understanding of the data in their possession. By gaining a greater understanding, users can make better-informed business decisions.
Hybrid Multi-cloud Data Integration
As computerized data management continues to progress, there will be new approaches that can further revolutionize the way we handle, store and share data, said Stefan Smulder of Netherlands-based Expandi. In his view, a data fabric is a hybrid multi-cloud data integration architecture with a wide range of data services that democratize data engineering, analytics and other data services across a variety of endpoints. It unifies data management techniques and usages in the cloud and on-premises. In simpler terms, it's a layer of abstraction to integrate all different resources into one entry point.
When multiple resources from different sources are in use, a fabric hides the details of each one and presents users and data consumers with a single interface: the fabric.
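The idea of one entry point in front of many sources can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular vendor implements a fabric: the `DataFabric` class, its `register` and `read` methods, and the two fetch functions are hypothetical names made up for this example.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


class DataFabric:
    """Hypothetical single entry point that hides where each dataset lives."""

    def __init__(self) -> None:
        # Maps a logical dataset name to a function that fetches its records.
        self._sources: Dict[str, Callable[[], List[Record]]] = {}

    def register(self, name: str, fetch: Callable[[], List[Record]]) -> None:
        self._sources[name] = fetch

    def read(self, name: str) -> List[Record]:
        if name not in self._sources:
            raise KeyError(f"unknown dataset: {name}")
        return self._sources[name]()


def fetch_on_prem_orders() -> List[Record]:
    # Stand-in for a query against an on-premises database.
    return [{"order_id": 1, "total": 40.0}]


def fetch_cloud_customers() -> List[Record]:
    # Stand-in for a call to a cloud object store or API.
    return [{"customer_id": 7, "name": "Acme"}]


fabric = DataFabric()
fabric.register("orders", fetch_on_prem_orders)
fabric.register("customers", fetch_cloud_customers)

# Consumers ask for data by name and never see the underlying source.
orders = fabric.read("orders")
customers = fabric.read("customers")
```

In this sketch a data consumer only ever calls `fabric.read(...)`; whether the records came from on-premises storage or the cloud is a detail kept behind the interface.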
“A data fabric is not any one thing, because the linguistic reality is that there isn’t enough consistency in how people are defining the term,” Smulder said. “The industry does not seem like it has been able to converge into a clear definition just yet. Regardless, the benefits of the undefined data fabric process are numerous.”
One such benefit is self-service data consumption. According to Smulder, with data fabric, data users are able to seek out relevant and high-quality data faster. Equally important, a data fabric allows for automated governance, data protection and security. It also optimizes and accelerates data delivery within an enterprise, thus virtually eliminating inefficient, repetitive and manual data integration processes. Real-time, continuous and automatic analysis aids in the delivery of high-quality data.
Data Mesh as an Alternative?
Dave Mariani, chief technology officer and co-founder of Boston-based semantic layer data company AtScale, said there's yet another use for data fabric. In his view, the fabric is a design pattern for building an analytics stack that emphasizes the importance of metadata sharing and automation.
While data fabric principles can serve as a guide when selecting technology vendors, Mariani argues that data mesh is more relevant and actionable for enterprises. Data mesh is built on a modern, distributed architecture for analytical data management and enables end users to access and query data where it lives without first transporting it to a data lake or data warehouse.
This data mesh approach has generated attention because it addresses the analytics supply chain by distributing the work of creating analytics products to business domain owners.
“Since it addresses the people side of the equation, it has a growing groundswell of support, and I see real companies starting to implement some of its core principles,” Mariani said. “In my opinion, moving to a data mesh style of delivering analytics has far more potential for changing the analytics landscape than data fabric.”