TL;DR: Microsoft Fabric has clearly been influenced by Data Mesh. We illustrate how Microsoft Fabric measures up to the four Data Mesh principles. If you are seeking a Data Mesh inspired vision for your organisation, Microsoft Fabric is a solid choice to help you drive that strategy forward. However, there are a number of gaps that you will need a strategy to address.
In May 2023, Microsoft announced Microsoft Fabric. It extends the promise of Azure Synapse Analytics integration to all analytics workloads from the data engineer to the business knowledge worker. It brings together reporting, analytics, data science and data engineering on a new generation of lake house infrastructure. Delivered as a unified SaaS offering, it aims to reduce cost and time to value, while enabling new "citizen data science" capabilities. See Ed Freeman's Introduction To Microsoft Fabric for more background.
The objective of this five-part blog series is to help technology leaders assess how much support Microsoft Fabric gives to a Data Mesh inspired vision. So, if you are a Chief Technology Officer (CTO), Chief Data Officer (CDO), Director of Data and Analytics, Director of Artificial Intelligence, Head of Data Science or in any similar senior data leadership role, and you are seeking to apply Data Mesh within your organisation, this article is for you!
How Microsoft Fabric measures up to Data Mesh
Microsoft Fabric has been heavily influenced by Data Mesh. Data Mesh is Zhamak Dehghani's thought leadership about how to "deliver data-driven value at scale". The fundamental thing to note about Data Mesh is that it is not just about technology or architecture. It also describes important cultural and organisational principles that need to be applied in order to drive value from data in a safe, scalable and secure manner.
Due to Conway's Law, we know that these cultural and organisational concerns will tend to override the technology architecture you are seeking to deliver.
Data Mesh is founded on four principles:
- Domain-orientated ownership
- Data as a product
- Federated computational governance
- Self-serve data platform
Data Mesh is a relatively new concept. It borrows from Eric Evans' theory of domain-driven design and Manuel Pais' and Matthew Skelton's theory of team topologies. In doing so it takes practices that have had significant traction in mainstream software engineering and applies them to data and analytics.
Whilst there is a significant amount of interest in adoption of these principles, there are limited resources available to enable the principles to be successfully interpreted and implemented in the real world.
Microsoft Fabric is seeking to close this gap by providing a platform that enables organisations to move closer to a Data Mesh driven vision. Data Mesh is very much a socio-technical endeavour. Microsoft Fabric can address the technical aspects, but the full achievement of each Data Mesh principle will also depend on organisational readiness. On this basis, here's our assessment of how Microsoft Fabric aligns to and supports the four Data Mesh principles above:
Self-serve data platform
The strongest alignment is with the "self-serve data platform" principle - you get this "out of the box" when you sign up for Microsoft Fabric.
Data as a product
There is also good support for the "data as a product" principle. We consider the dataset layer in Power BI to provide the foundations for serving data products, with wider capabilities in Microsoft Fabric that enable those data products to be standardised, shared, consumed and composed to meet different user needs. There is also the ability to promote and certify these datasets to support an internal "data product marketplace".
There's room for improvement in this area. Power BI Apps can be shared and endorsed within an organisation, but this currently applies only to Power BI reports. OneLake Data Hub currently lists all the data items that you either already have access to, or that have been made discoverable elsewhere in the organisation, but it doesn't yet offer a means to package these up as a suite of artifacts forming a "data product".
Another key area where Microsoft Fabric has a gap is the ability to write a comprehensive specification for a data product as part of the discovery, analysis and design phases of a project. Such a specification would allow a detailed "contract" to be written and agreed between the data product provider and the data product consumer before development begins, and would allow quality assurance checks to be built alongside the data product itself. This would be akin to the role that WSDL plays in the API economy.
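To make the idea of a data product "contract" more concrete, here is a minimal sketch in plain Python. It is not a Microsoft Fabric feature or API: the class names (`ColumnSpec`, `DataProductContract`), the example product `customer_orders` and its columns are all hypothetical, chosen purely for illustration. The point is that a contract agreed up front can double as an automated quality assurance check.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ColumnSpec:
    """One column promised by the data product provider (hypothetical shape)."""
    name: str
    dtype: type
    nullable: bool = False


@dataclass(frozen=True)
class DataProductContract:
    """Illustrative contract agreed between provider and consumer."""
    name: str
    owner_domain: str
    version: str
    columns: tuple

    def validate(self, rows: list) -> list:
        """Return human-readable violations; an empty list means compliant."""
        violations = []
        for i, row in enumerate(rows):
            for col in self.columns:
                if col.name not in row:
                    violations.append(f"row {i}: missing column '{col.name}'")
                    continue
                value: Any = row[col.name]
                if value is None:
                    if not col.nullable:
                        violations.append(f"row {i}: '{col.name}' must not be null")
                elif not isinstance(value, col.dtype):
                    violations.append(
                        f"row {i}: '{col.name}' expected {col.dtype.__name__}, "
                        f"got {type(value).__name__}"
                    )
        return violations


# Hypothetical contract for an imaginary "customer_orders" data product.
contract = DataProductContract(
    name="customer_orders",
    owner_domain="sales",
    version="1.0.0",
    columns=(
        ColumnSpec("order_id", str),
        ColumnSpec("amount", float),
        ColumnSpec("discount_code", str, nullable=True),
    ),
)

# Made-up sample rows: the second and third break the contract.
sample = [
    {"order_id": "A-1", "amount": 19.99, "discount_code": None},
    {"order_id": "A-2", "amount": None, "discount_code": "SPRING"},
    {"order_id": "A-3", "discount_code": "X"},
]

violations = contract.validate(sample)
for message in violations:
    print(message)
```

In practice a contract would likely also cover things such as freshness, service levels and semantics, and would be stored in a shareable format rather than in code. The sketch only shows the principle that the same agreed definition can be used by both the provider and the consumer to test the data.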
Domain-orientated ownership
The principle of "domain-orientated ownership" is concerned with making it easier for "citizen analysts" to discover, develop and own data products. Microsoft is in a unique position because it can exploit its dominance across adjacent products: Azure Synapse Analytics and Power BI (now integrated within Fabric) and Microsoft 365. By packaging Microsoft Fabric as a self-service SaaS offering, Microsoft has an opportunity to bring the tools normally reserved for specialists (data engineers and data scientists) to generalists.
We believe a significant proportion of achieving "domain-orientated ownership" is about organisational readiness, which is something that technologies such as Microsoft Fabric cannot help with.
However, in order to achieve domain-orientated ownership without opening the organisation up to significant risk, there will need to be governance "guard rails" in place in areas such as cost management, data access permissions and quality assurance to enable this principle to be achieved in a safe, scalable and secure manner. This brings us onto the next principle...
Federated computational governance
We feel the largest gap is around "federated computational governance" - this is the hardest problem to solve and no vendor has a complete solution for it at the moment. Tools such as Purview can play a role, but they are primarily about discovering what you have and trying to get it under control, rather than fundamentally transforming how data products are governed from the outset. These tools continue to evolve, but you will find there is a need to develop your own solutions in this area.
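As an illustration of what "computational" governance could look like when you build your own solution, here is a minimal policy-as-code sketch in plain Python. It does not use Purview or any Microsoft Fabric API: the `DataProductMetadata` fields, the three policies and the example product are all hypothetical assumptions. The idea is that centrally agreed rules are expressed as small functions that every domain's data products are checked against automatically.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DataProductMetadata:
    """Hypothetical metadata a domain team registers for a data product."""
    name: str
    owner: Optional[str]
    contains_personal_data: bool
    sensitivity_label: Optional[str]
    monthly_cost_budget: Optional[float]


# A policy returns None when satisfied, or a message describing the failure.
Policy = Callable[[DataProductMetadata], Optional[str]]


def must_have_owner(product: DataProductMetadata) -> Optional[str]:
    return None if product.owner else "no accountable owner assigned"


def personal_data_must_be_labelled(product: DataProductMetadata) -> Optional[str]:
    if product.contains_personal_data and product.sensitivity_label is None:
        return "personal data present but no sensitivity label"
    return None


def must_have_cost_budget(product: DataProductMetadata) -> Optional[str]:
    if product.monthly_cost_budget is None:
        return "no monthly cost budget set"
    return None


# Illustrative set of organisation-wide rules agreed by a federated group.
GLOBAL_POLICIES = (
    must_have_owner,
    personal_data_must_be_labelled,
    must_have_cost_budget,
)


def evaluate(product: DataProductMetadata, policies=GLOBAL_POLICIES) -> list:
    """Run every policy and collect the failure messages."""
    failures = []
    for policy in policies:
        message = policy(product)
        if message is not None:
            failures.append(message)
    return failures


# Made-up example: an owned, budgeted product holding unlabelled personal data.
product = DataProductMetadata(
    name="customer_orders",
    owner="sales-team",
    contains_personal_data=True,
    sensitivity_label=None,
    monthly_cost_budget=500.0,
)

failures = evaluate(product)
print(failures)
```

A check like this could run as part of a deployment pipeline so that a data product cannot be published until it passes. That keeps ownership with the domain team while the guard rails on cost, access and quality are applied consistently.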
One specific area where we feel Microsoft Fabric falls short is Master Data Management (and the related topic of Reference Data Management). There are significant opportunities for Microsoft to bring the power of Microsoft 365 more into the mix, by enabling simple master / reference data management use cases to be supported through tools such as Dataverse, SharePoint lists and even Excel tables. At the moment, integrating Microsoft 365 data sources into a Fabric based pipeline requires significant engineering effort - we hope that this can be streamlined in the future.
Microsoft Fabric will enable organisations to make a significant step towards a Data Mesh inspired vision, but it is not a complete solution
Microsoft Fabric will enable you to take a significant step towards a Data Mesh inspired vision. However, there are a number of gaps that you will still need to address, which Fabric either cannot help with (organisational change) or does not yet provide "out of the box". This presents an opportunity - organisations that can address these "hard to solve" problems can often use this to gain competitive advantage.
With regard to the technology related gaps, the most significant are:
- Data product marketplace
- Developing standards and patterns
- Master data management
- Tooling to support federated computational governance
- Data product "contract" specification
This can form the basis of a research and development backlog, with the relative priority of the items dependent on your specific requirements.