Barry Smart

Tutorial

In this video, Barry Smart, Director of Data & AI, examines how Microsoft Fabric can support a Data Mesh vision for data and analytics.

As we transition to a digital age powered by data, we discuss how data professionals can help their organizations thrive. Discover the capabilities of Microsoft Fabric, its role in creating data products, and how it aligns with the core principles of Data Mesh. Learn about the advancements in technology, the integration of open-source tools, and the importance of a socio-technical approach to data. Don't miss insights on federated computational governance, and DataOps. Watch to understand how to drive value and innovation in your organization with Microsoft Fabric.

The full transcript is available below.

The talk contains the following chapters:

  • 00:00 Introduction to Microsoft Fabric and Data Mesh
  • 01:33 The Evolution of Data Platforms
  • 02:57 Key Features of Microsoft Fabric
  • 05:00 Introduction to Data Mesh
  • 06:53 Principles of Data Mesh
  • 09:36 Combining Microsoft Fabric and Data Mesh
  • 11:00 Example Data Product in Fabric
  • 12:58 Conclusion and Final Thoughts

From Descriptive to Predictive Analytics with Microsoft Fabric:

Microsoft Fabric End to End Demo:

If you want to learn more about Fabric, take a look at some of our other content:

Decision Maker's Guide to Microsoft Fabric

and find all the rest of our content here.

Transcript

In this video, I'm going to talk about how Microsoft Fabric can be used to implement a Data Mesh inspired vision for data and analytics.

We're living in exciting times. We're leaving the industrial age and entering a new digital age.

The industrial age transformed the way we work and live by providing huge amounts of physical power. The digital age will also transform the way we work and live, but this time by providing huge amounts of thinking power. And this thinking power is fuelled by data.

So, as data professionals, we have a responsibility to help the organizations we work for to orientate and navigate their way in this new digital age.

In this short session, I want to help you think about the latest advances in technology, in this case Microsoft Fabric, and data architecture, in this case Data Mesh, and how they can be used to help your organization generate more value from data, and therefore thrive in this new digital age.

So let's have a look at Fabric. It's a self-serve data platform. Think of it as a Swiss Army knife for data that brings together Microsoft's discrete services into one cohesive offering. It's a SaaS platform, so you can be up and running in a few minutes. It's free to try, and there is a flexible and evolving commercial model which aims to ensure that advanced analytical workloads can be within the budget of any size of organization, from startup to global enterprise.

Fabric represents 20 years of data platform evolution. Firstly, we've moved from on-prem to the cloud, and then through the evolution of infrastructure as a service, platform as a service, serverless, and software as a service. Fabric is very much at the SaaS end of that spectrum.

In parallel, we've seen the rise of open source, which means we can leverage technologies such as Python for coding, Spark for compute, and Delta Lake for storage, without worrying about vendor lock-in or eye-watering license fees.

And more recently we've seen the emergence of intelligent agents, that work in collaboration with analysts and data engineers and data scientists, helping them to be more productive, more creative and to deliver better data products.

Now the net result is that the barrier to entry is lowering. This technology is now available to organizations, whatever their budget, size, and existing data skills. And with this new advanced tooling, it also means we can create data products more rapidly than ever, and with greater levels of sophistication.

And there's an ongoing arms race between the major vendors and this continues to drive massive investment in the underlying platforms with new features being added all of the time, which is great for those of us who can keep up!

Now, as a cloud-native analytics platform, Fabric separates the storage from the compute. The storage layer in Fabric is OneLake, where all of your organization's analytical data can be stored. It's cheap and infinitely scalable, and it adopts the open Delta Lake storage format for tabular data. Fabric then provides a number of compute engines that you can choose from to bring to the data and work with it on OneLake. And the polyglot development environment enables you to use the tools that best suit the skills of your team.
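To make that separation of storage and compute a little more concrete, here's a minimal sketch of working with Delta tables on OneLake from a Spark notebook. In a Fabric notebook the `spark` session and the attached lakehouse are provided for you, so the session setup below is only needed elsewhere; the table and column names are hypothetical.

```python
# Minimal sketch: one copy of the data in OneLake, with Spark brought to it as compute.
# In a Fabric notebook `spark` already exists and the lakehouse is attached;
# table and column names below are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Read an existing Delta table from the attached lakehouse.
orders = spark.read.table("sales_orders")

# Transform it with ordinary Spark operations.
daily_revenue = (
    orders
    .groupBy(F.to_date("order_timestamp").alias("order_date"))
    .agg(F.sum("order_value").alias("revenue"))
)

# Write the result back as another Delta table, which other engines
# (the SQL endpoint, Power BI semantic models) can then address without copying it.
daily_revenue.write.format("delta").mode("overwrite").saveAsTable("daily_revenue")
```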

Now Fabric's a Swiss Army knife. It does a lot, but it doesn't do everything. So there are integrations with services on Azure and with third-party vendors, and you've also got access to the wider open-source ecosystem. All of this means you can create end-to-end solutions, or what Data Mesh calls data products. An end-to-end solution often involves ingesting data from some operational source, then cleaning it, standardizing it and enriching it in order to deliver an actionable insight. That actionable insight could take the form of a dashboard, a report, some kind of predictive forecast, an alert or a data API.

If you use Fabric in the right way, you should expect to see at least a 10x improvement in time to value and cost of ownership compared to legacy platforms and methods. But let me repeat that: if you use it in the right way!

To deliver value from data using cloud-native analytics platforms such as Fabric "in the right way" means you're not only going to need to adopt new technology, but you're also going to need to apply new processes and develop new skills. So platforms such as Fabric will help you with the technology part, but they're not going to help you with the process and people parts, which in our experience are often the most significant barrier to success.

But not to worry, this is where Data Mesh comes in!

So let's have a look at Data Mesh. The concept of Data Mesh first took shape around 2018, when Zhamak Dehghani was working at ThoughtWorks. ThoughtWorks are a global leader in software engineering practices such as agile, user-centered design, and domain-driven design. Zhamak saw an opportunity to apply these practices to help organizations that were struggling to deliver value from data. Specifically, to overcome the challenges faced by traditional centralized architectures and teams, especially in large and complex organizations. She formalized her thinking in a book which was published in 2021, and today you'll find a large and growing community of organizations that are adopting Data Mesh as their vision for data and analytics.

It's also fair to say that Data Mesh has influenced the way that Fabric has been designed, with some features clearly intended to help organizations deliver a Data Mesh architecture.

Fundamentally, Data Mesh recognizes that data is a socio-technical endeavour. This simple model fits any project: we've got the data team on the left, who build a data product to meet the needs of the users on the right, to enable them to achieve a specific goal.

Now, if this socio-technical system is working well, value flows from left to right. And there are feedback loops in place that enable the data team to sustain that value over the long term.

Zhamak recognized that in many organizations this model had simply broken down, through issues such as centralized data teams becoming a bottleneck and being out of touch with the needs of end users and the wider goals of the organization. Or, at the other end, users being reluctant to learn new skills or to embrace new data products and put them to effective use.

So Data Mesh tackles these challenges through four principles. The first principle is data as a product. And what we mean here is delivering small units of functionality that are designed to achieve a specific goal in the organization.

The second principle is that ownership of these products is decentralised and distributed across the domains. In other words, distributed to the business functions and departments that understand the data and are therefore best equipped to discover, build, own and operate data products over the long term.

Now, because these data products are discoverable and addressable, they can be combined in new ways. For example, the finance team may use data products that they discover in other departments as the inputs to a new machine learning data product that they create to predict future revenue for the organization.
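As a hedged illustration of that finance scenario, the sketch below assumes the sales domain's revenue table has been surfaced in the finance team's lakehouse (for example via a OneLake shortcut), so Spark can read it like a local table. The table name, columns and the deliberately naive model are all hypothetical, not taken from the talk.

```python
# Hypothetical sketch: a finance data product consuming another domain's output.
# Assumes "sales_monthly_revenue" is visible in the finance lakehouse via a shortcut.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

spark = SparkSession.builder.getOrCreate()

# Input: a monthly revenue table owned and published by the sales domain.
monthly = spark.read.table("sales_monthly_revenue")

# Naive stand-in for a real forecasting model: regress revenue on a month index.
assembler = VectorAssembler(inputCols=["month_index"], outputCol="features")
training = assembler.transform(monthly.select("month_index", "revenue"))
model = LinearRegression(featuresCol="features", labelCol="revenue").fit(training)

print(f"Estimated monthly revenue trend: {model.coefficients[0]:.2f} per month")
```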

Now, these interconnections over time build up like a mesh across the organization, and this is, of course, where the mesh in Data Mesh comes from!

Now to empower the domains to build and own data products, the third principle is that a self-serve data platform is provided. And this is clearly where platforms such as Microsoft Fabric play a role.

And finally, to prevent all of this from descending into chaos, the fourth principle of federated computational governance is applied.

Now I think of Data Mesh as a microservices architecture for data.

Now, in all of this, the principle of data as a product alone is a really powerful one, and I'd encourage you to think about adopting it in the work that you do. Good products, after all, solve a problem, and the people who use them love them. They can generate huge amounts of value, and in extreme cases they can transform an organization or a whole industry.

Now, to build a successful product, we need to be in tune with users and to understand their needs. And this shift in mindset can really help data teams to be more outward looking, perhaps shifting from being an order taker to actively engaging and seeking to challenge the status quo in their organization.

They're therefore operating as an innovation engine, generating ideas and taking them through the life cycle you can see here. This gets them thinking about things like lifetime value versus total cost of ownership, so they're better equipped to pivot or fail fast if an idea for a data product is not commercially viable.

Now, could your organization benefit from this mindset and this thinking alone? I think it could!

So, let's put the two together. How does Fabric measure up to Data Mesh? This chart aims to sort of summarize our thinking about this.

Fundamentally, Fabric will enable organizations to make a significant step towards a Data Mesh inspired vision, but it's not a complete solution. There are some gaps. In particular, as you can see here in the area of federated computational governance. But this chart also reinforces the fact that technology forms only part of a Data Mesh vision. The organizational readiness gap you can see here represents the process and people parts of the socio-technical system we described earlier. And Fabric, as a technology, can't help you with that.

In other words, don't expect to adopt Fabric and then be able to claim you're also doing Data Mesh.

Now one gap we find with all data platforms is around their ability to support DataOps. And what we're aiming to do here is take all the best practices from DevOps in mainstream software engineering and apply them to data and analytics. This is clearly going to involve some heavy lifting from your platform and infrastructure teams. But the good news is, with Fabric being a SaaS platform, there's less infrastructure and there are fewer moving parts for your team to manage. So they should have more capacity to focus on these higher value DataOps initiatives.
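One way to picture what that DataOps heavy lifting looks like in practice is automated testing of data products in a CI pipeline. The sketch below is a hypothetical pytest-style quality gate over a silver table; the table name, columns and checks are assumptions for illustration, not features of Fabric itself.

```python
# Hypothetical DataOps quality gate, run by CI before a data product is promoted.
# "orders_silver" and its columns are illustrative assumptions.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

@pytest.fixture(scope="session")
def spark():
    # Reuse an existing Spark session (e.g. the one provided by the test environment).
    return SparkSession.builder.getOrCreate()

def test_no_duplicate_order_ids(spark):
    df = spark.read.table("orders_silver")
    assert df.count() == df.select("order_id").distinct().count()

def test_order_values_are_non_negative(spark):
    df = spark.read.table("orders_silver")
    assert df.filter(F.col("order_value") < 0).count() == 0
```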

So what does a data product look like in Fabric? Our thinking is evolving all the time, in part because Fabric's evolving all the time. But here's a good example. This is a data product we're building as part of a video series, so go and check that out if you want to look at this in more detail. The purpose of this product is to take open-source information about passengers on the Titanic and present an interactive report that lets end users explore passenger demographics and start to understand the survival rates of those different demographic groups.

So you can see here we're using the new task flow feature within this workspace to set our architectural intent. We're ingesting data from the internet onto a bronze area of the lake. We're cleaning it up and processing it into silver. And finally, we project out into the gold area of the lake to enable us to build a semantic model over the data, which looks like this. It's a mini star schema. It's intended to make reports intuitive and interactive, and that's exactly what it does.
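For a rough feel of what those medallion steps might look like in a Spark notebook, here's a hedged sketch; the file path, table names and columns are illustrative and not taken from the actual video series.

```python
# Illustrative medallion flow for the Titanic data product; names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: land the raw file exactly as ingested from the internet.
raw = spark.read.option("header", True).csv("Files/bronze/titanic.csv")
raw.write.format("delta").mode("overwrite").saveAsTable("bronze_titanic")

# Silver: clean and standardize - fix types, drop rows we can't use.
silver = (
    spark.read.table("bronze_titanic")
    .withColumn("age", F.col("age").cast("double"))
    .withColumn("survived", F.col("survived").cast("int"))
    .dropna(subset=["passenger_id"])
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver_passengers")

# Gold: project out a mini star schema - a small dimension plus a fact table
# for the semantic model to sit over.
dim_class = silver.select("passenger_class").distinct()
fact = silver.select("passenger_id", "passenger_class", "sex", "age", "survived")
dim_class.write.format("delta").mode("overwrite").saveAsTable("gold_dim_class")
fact.write.format("delta").mode("overwrite").saveAsTable("gold_fact_survival")
```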

But also in Fabric, we've built our data product inside the container of a workspace, and we can then choose to publish, promote or certify certain artifacts in that workspace, making them visible and addressable to the wider organization. So that's a feature of a data product.

So here you can see we've certified the gold layer in the lakehouse, and we've also promoted the Titanic diagnostic analytics semantic model. So perhaps others in the organization can discover the data and build these kinds of interactive reports and analytics, or perhaps put it to another use, like training a machine learning model!

So in conclusion, Fabric is a good fit with Data Mesh. It's all evolving at pace. Fabric's clearly being influenced by Data Mesh, so expect the gaps that we've identified to be addressed over time. For example, Microsoft are actively addressing gaps in federated computational governance through new features in Purview. So I encourage you to go and look at the latest features that are coming in that product.

A final thought: you don't have to go all in with Data Mesh. There are some brilliant fundamentals that you should definitely consider. For example, it's useful to think about the work we do as being a socio-technical endeavour. You can't succeed through adopting technology alone, like Microsoft Fabric; you're going to need to bring the wider organisation on a journey with you.

Focus on that flow of value and those feedback loops in the work that you do. And can you turn to things like DataOps to help you address those bottlenecks? And could Fabric help generate that space in your teams to do that DataOps work?

And that product mindset will enable your teams to optimize the work they do, encourage them to look outwards, and help them operate as an innovation engine, getting the most out of tools such as Fabric to deliver that game-changing impact for your organization.

I've covered a lot of ground, so please go and visit our website, where you'll find blogs and talks that cover all of these topics in more depth.

Now please don't forget to hit like if you enjoyed this video and also subscribe to our content to keep following what we're doing on this channel.

So thanks very much for watching. Bye bye.