Barry Smart

SQLBits 2024

Microsoft Fabric has been influenced by Data Mesh, which is Zhamak Dehghani's thought leadership about how to "deliver data-driven value at scale".

In this presentation, Barry will provide an overview of Data Mesh, show how Microsoft Fabric measures up to Data Mesh principles, and provide recommendations about how Fabric can be used to deliver a Data Mesh inspired vision for data and analytics in your organisation.

Chapters

  • 00:00 Introduction to Microsoft Fabric and Data Mesh
  • 00:33 Speaker's Background and Experience
  • 01:52 The Digital Age and Role of Data Professionals
  • 02:28 Overview of Microsoft Fabric
  • 03:38 Introduction to Data Mesh
  • 04:31 Principles of Data Mesh
  • 05:57 Challenges and Solutions in Data Mesh Implementation
  • 10:52 Data Mesh and Microsoft Fabric Integration
  • 12:52 Strategies for Adopting Data Mesh with Fabric
  • 14:55 Conclusion and Final Thoughts
  • 17:22 Closing Remarks

Transcript

So, I'm going to talk about Microsoft Fabric and Data Mesh. Data Mesh seems to be cropping up quite a bit during the conference. I've been to a couple of sessions that were specifically talking about Data Mesh, but also a few sessions where it's seemingly popped up. I'm sure you've heard people talking about it as well during the week.

So hopefully I'll give a slightly different perspective today that you can use to build your thinking around Data Mesh and how you could apply that on Fabric. So, my name's Barry Smart. I graduated as a physicist about 30 years ago, but I went into a role as a graduate software engineer in the water industry initially.

I was lucky enough to spend four years actually working in Australia for Sydney Water. Then I came back to Scotland, where I spent my formative years as chief architect in Scottish Power's energy trading business. Then I moved into the Financial Services sector, where I was promoted to CTO of the UK's largest independent firm of consulting actuaries, and I led their adoption of Public Cloud.

In fact, we were the first Financial Services organization in the UK to put personal data onto Azure. And during that time, I rediscovered my love for coding and software engineering and data, and I decided to take a career break. I went back to my alma mater, Strathclyde University, and I completed a master's in Artificial Intelligence.

And now I'm director of data and AI at Endjin. Endjin is a small technology consultancy. We're a fully remote company based in the UK, but with customers all around the world. We're small, but we're able to achieve big things, and that's because we leverage our expertise, our processes, and our intellectual property.

And we like to help our clients do exactly the same thing. We're in an exciting time. We're leaving the industrial age, and we're entering into a new digital age. The industrial age was personified by physical power, which unlocked and transformed the way that we work and live. And the new power that's going to be transforming the way that we work and live is thinking power. It's daunting for us, but it's even more daunting for the organizations that we work in and the people who could be impacted by all of this.

And it's our role as data professionals to think about this and help organizations navigate that new world successfully. Microsoft Fabric has come along. I'm sure you've all heard about it, but this is the kind of slide that I use to explain it. It's really been interesting over the last ten years to see the evolution of data platforms from on-prem to infrastructure as a service, platform as a service, serverless, and now software as a service.

And Fabric is very much leading that SaaS-ification of the data landscape. And more recently, we've seen the emergence of intelligent agents, things like GitHub Copilot that work alongside us as professionals and power us up. They help us to achieve more and act as that friendly buddy that can amplify our role as data professionals.

And these two things together are having a significant impact on our ability to deliver value at scale into organizations. It also means, through the SaaS-ification of data platforms, that the barrier to entry is lowering. Small and medium organizations that might not have been able to justify an investment in a data platform and the team to run a data platform can now embrace platforms like Fabric with a very low cost of entry.

Data Mesh was conceived in 2018 by Zhamak Dehghani. At the time she was working at ThoughtWorks, who were a globally recognized thought leader at the time in areas such as agile software development and user-centric design. And Zhamak very much seemed to lift thinking from the mainstream software engineering world into the data plane. She could see problems in the organizations that she was working with, where centralized data platforms and data teams were becoming bottlenecks. And the vision that she established through Data Mesh was very much about decentralizing and democratizing data ownership across the organization.

The way I like to personify Data Mesh is very much like bringing a kind of microservices architecture into the data plane itself. Data Mesh is founded on four principles. The one I really like is data as a product. I think it's a really useful way to think about how we deliver value into our organizations.

A data product is a small unit of value that's meeting a specific need within the business. Like any product, it's got users who love the data product. They put it into use to achieve some end goal and some purpose. And if your data products are achieving that, then you're succeeding. Once you've got data products that are small and recognized in this way, you can then start to distribute ownership of those data products from a centralized model.

And you can distribute ownership across the business, placing ownership of those data products with the domains or the departments that are best placed to own, evolve, and deliver those data products across the business. And there's this nice concept of interoperability between data products, a bit like microservices where you can plug them together.

So, you might have some data products that are close to the data and deliver value by presenting the data in some way, but then you can use those as foundations to build more sophisticated products on top. So, this is where the notion of the mesh comes from.

It's that interoperability between the products. And all of this is enabled, first of all, by a self-serve data platform. So, this is where we immediately start to think about Fabric. And also, by the concept of federated computational governance. Obviously, traditionally, centralization of data and data expertise has made governance easy.

As soon as you distribute ownership of data products across your organization, there's a risk there that things get out of control. So, Zhamak's solution to that is the concept of federated computational governance. It's using computational means of embedding the governance into the platform so that you don't have to manage it through humans alone.
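To make "computational means" a little more concrete, here is a minimal sketch of governance expressed as code rather than as a manual review. Everything in it is an assumption made for illustration: the `DataProduct` descriptor, its fields, and the three policies are hypothetical and are not part of Data Mesh, Microsoft Fabric, or any library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataProduct:
    """Hypothetical descriptor that a domain team publishes for its data product."""
    name: str
    owner_domain: str
    description: str = ""
    contains_personal_data: bool = False
    retention_days: Optional[int] = None


def check_policies(product: DataProduct) -> List[str]:
    """Return the policy violations for a product; an empty list means it passes."""
    violations = []
    # Illustrative policy 1: every product must have an accountable domain.
    if not product.owner_domain:
        violations.append("owner_domain must be set")
    # Illustrative policy 2: a description is needed so the product is discoverable.
    if not product.description:
        violations.append("description is required for discoverability")
    # Illustrative policy 3: personal data must declare how long it is retained.
    if product.contains_personal_data and product.retention_days is None:
        violations.append("personal data requires a retention period")
    return violations


orders = DataProduct(
    name="sales_orders",
    owner_domain="sales",
    description="Daily customer orders",
    contains_personal_data=True,
)
print(check_policies(orders))
```

A check like this could run automatically whenever a domain publishes or changes a data product, so the rules are agreed centrally but applied by the platform rather than by a review board.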

It's also about putting more responsibility onto the domains who own the data products. After all, with power comes great responsibility. It's about them stepping up and taking on responsibility for some of these things that may before have been centralized within the organization. The thing I like about Data Mesh is that it's very much about recognizing that data is a socio-technical endeavour. To succeed, you don't just have to overcome the technical challenges, you're going to have to overcome cultural and organizational challenges. And I think, to me, it's even more about that social piece than the technical piece.

And certainly, what we see is that if you can crack those organizational and cultural concerns within an organization, you've got a much higher chance of succeeding. And this simple model kind of brings this to life. You've got the data team on the left, who deliver the data product, which meets the needs of the users on the right.

So, value is flowing from left to right, and as importantly, you have feedback loops. Hopefully you can see that down the bottom there, but that's really important. And where the flow of value or the feedback loops break down, you start to have problems. And there's a thing called Conway's Law that can often get in the way of the flow of value and the feedback loops.

If you've got a disjointed, disconnected organizational structure where the flow of value is spanning multiple teams and they work in a sort of dysfunctional way, then the flow of value will be impacted accordingly. And the ability for you to deliver value will be undermined. So, the focus here is all about rapid and safe delivery of value and promoting these feedback loops, listening to what your feedback loops are telling you and being able to react to that.

And this virtuous cycle continues. Another great thing about the data product principle, that I love, is that it brings this product mindset along, and I think it challenges data teams to start acting as innovation centres within the organization. Traditionally, some data teams can turn into sort of order takers, where they sit back waiting for new requests to come in to them.

And this is about them actually partnering with the business and helping them to identify new ideas and ways in which data can be leveraged to generate value within the organization. And high-performance data teams really play to this life cycle. And they're prepared in the early stages of exploring or validating an idea to pivot or fail fast, because they recognize not every idea works or is going to be worthy of taking into production. They're aware of that sort of value curve, but also the total cost of ownership that goes with any data product. And they're very conscious about delivering value at a TCO that can be covered by the value that they're delivering.

A useful way of thinking about data as a product is to look at your own existing estate. Think about how you could identify data products in your current estate and weed out those products that might not be delivering value today. By getting rid of those products, retiring them, you're creating space to go through this innovation cycle and find new sources of value you can deliver into your organization.
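As a rough sketch of that weeding-out exercise, the comparison of value against total cost of ownership can be written down as simple arithmetic. The product names and figures below are invented purely for illustration, and in practice estimating the value of a data product is the hard part that this sketch assumes has already been done.

```python
from __future__ import annotations


def net_value(annual_value: float, annual_tco: float) -> float:
    """Value delivered per year minus the total cost of ownership per year."""
    return annual_value - annual_tco


def retirement_candidates(estate: dict[str, tuple[float, float]]) -> list[str]:
    """Names of products whose value does not cover their TCO, sorted by name."""
    return sorted(
        name
        for name, (annual_value, annual_tco) in estate.items()
        if net_value(annual_value, annual_tco) < 0
    )


# Hypothetical estate: product name -> (estimated annual value, annual TCO).
estate = {
    "weekly_sales_report": (50_000.0, 12_000.0),
    "legacy_stock_extract": (2_000.0, 9_000.0),
    "churn_dashboard": (8_000.0, 15_000.0),
}
print(retirement_candidates(estate))
```

Products that show up in this list are not automatically retired; they are the ones worth a conversation with their users about whether the value can be raised, the cost lowered, or the space freed up for something new.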

Another interesting way to think about Data Mesh is that, if you remove the self-service data platform from the equation, you're left with the three other principles that I talked about earlier. And there's natural tension between them. If you think about domain-oriented ownership, as we discussed, doing that in an environment where you're highly governed can be difficult, because it plays against your constraints of governance.

If you're in an organization that isn't ready to treat data as a product, in terms of providing a platform that can make data products discoverable and interoperable, domain-orientated ownership can really put stress on that as well, because you're going to end up with data existing in silos and being locked away from being put into use.

So, these three dimensions aren't orthogonal. They're all constrained. And what we find is that it's quite useful when we talk to clients to try and identify the dimension here that they are most constrained with today, and to focus on unlocking that.

Because by unlocking that, like an elastic band being released, you can then push forward in the other dimensions as well. So, it's an interesting way of thinking about where are we today? Where do we want to get to and how can we get there? What's the best strategy to get there? So that's Data Mesh.

How does Fabric measure up to Data Mesh? This is a very high-level view of that. So, each of the principles that underpin Data Mesh are bars on this chart, and you can see how Fabric today meets those requirements. What's interesting about this chart is it's not saying that Fabric should solely be responsible, for example, for delivering domain-orientated ownership.

There's organizational readiness that's a major factor in that area as well, but that sort of dark blue area represents the gap as we perceive it today. So obviously Fabric's pretty strong on the self-service data platform. The biggest gap is around federated computational governance. How do you address that?

As data professionals, as fabricators, if you like, we have an opportunity, and I guess a responsibility, if our organization wants to embrace a Data Mesh inspired vision, to plug that gap. And we believe strongly in DataOps as being a great approach for plugging some of these gaps. To us, DataOps is basically just DevOps, but applied to data projects.

And this landscape is evolving, really due to the forces I talked about before: the SaaS-ification of data platforms and that role that AI is starting to play in our day-to-day activities as software developers and data professionals. And some of these activities will very much remain a human endeavour, though.

But through DataOps, a lot of the activities in here around observability, discoverability, quality, and the like all very much play to that gap currently within Fabric. But the good news is you can bring other services in Azure to bear to plug these gaps. So, it presents a bit of work that we need to do, but that's also an opportunity for us to help our organization differentiate, potentially, in this environment.

If you want to go for a Data Mesh inspired vision and you like the look of Fabric, how do you plot a strategy? How do you find a way forward? There's a lot of things to consider, a lot of information out there that you're being presented with. And I love this quote from David McCandless: "when you're lost in information, an information map is useful". And in this case, we like to use a tool called a Wardley Map.

It's a two-dimensional map, but it's specifically designed to give you situational awareness and help you to understand where you are today and the strategies that you could adopt in the future through movement on the map. The Wardley Map is anchored at the top with the business goal, or the end user need that you're seeking to achieve.

And the vertical axis maps out your value chain. So, you can see an example here with the activities that are most visible to the user at the top and those that are less visible down towards the bottom. The really interesting axis is the horizontal axis. It plots the evolution, the natural evolution of technologies and activities over time.
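The two axes described here can be captured in a small data structure, which can be a handy way to draft a map before drawing it. This is only a sketch: the four evolution stages are the ones commonly used on Wardley Maps, but the component names, visibility scores, and placements are hypothetical and are not taken from the map shown in the talk.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Evolution(IntEnum):
    """Horizontal axis: the stages a technology or activity evolves through."""
    GENESIS = 1
    CUSTOM_BUILT = 2
    PRODUCT = 3
    COMMODITY = 4


@dataclass(frozen=True)
class Component:
    name: str
    visibility: float  # vertical axis: 1.0 is most visible to the user, 0.0 least
    evolution: Evolution


def by_value_chain(components: List[Component]) -> List[Component]:
    """Order components from the user need at the top down to the least visible."""
    return sorted(components, key=lambda c: c.visibility, reverse=True)


# Hypothetical components, placed for illustration only.
components = [
    Component("data platform", 0.2, Evolution.PRODUCT),
    Component("timely business insight", 1.0, Evolution.CUSTOM_BUILT),
    Component("reports and dashboards", 0.7, Evolution.PRODUCT),
]
for component in by_value_chain(components):
    print(f"{component.name}: {component.evolution.name}")
```

Listing the components from the top down in this way mirrors the advice to start with the user need and only then work down to the technology that supports it.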

And it's that evolution that you can explore through the maps and how that can unlock new opportunities within your organization. So, here's an example map that we put together when Fabric was released. We were keen to explore how moving from an Azure platform to a Fabric platform could unlock value generically within any organization.

And this is very generic, but I would encourage you, if you want to explore Wardley Maps, to use this as a template and start at the top. That's the primary advice I would give you. Don't start with the technology. It's not about a solution looking for a problem to solve, it's the other way around.

Understand the business, the goals it's trying to achieve, map that value chain out, and then in the lower regions, understand how Fabric could help you to deliver that goal or that objective in some way. So, in conclusion, there's great promise. Fabric is well aligned to a Data Mesh vision. Fabric isn't an off-the-shelf solution, there are technology barriers there, but we see the biggest barriers to success actually being more on the socio side of the socio-technical system that we're trying to deliver.

For example, the idea of domain ownership of data products is great, but are your users able and willing? Have they got the skills, and have they got the willingness, to build and own data products through their full life cycle? The SaaS-ification of these platforms can present quite a bit of a threat to traditional IT and technology teams.

Are they willing to embrace this new world and adapt their skills and responsibilities accordingly? And is the wider organization ready to adopt data and leverage the value of data? Is it willing to be agile and transform itself to embrace this brave new world? And that view is very much backed up by this quote.

Professor Dame Diane Coyle from the University of Cambridge isn't a technologist. She's a kind of public policy analyst. And she recognizes that the current barriers to productivity, certainly in the UK, aren't technology problems. They're about the ability of organizations to understand how to leverage technology and put it into use.

So, my recommendations are: don't start with the technology. Focus on the business goals. Think about that flow of value and those feedback loops in your organization today. What are they telling you, and how could a Data Mesh or a Fabric vision perhaps help you to unblock the barriers that you're seeing?

Use the Wardley Maps to envisage that future and then explore it incrementally. I think the product mindset is a great mindset to adopt. Be more innovative, drive innovation within your organization. Be prepared to pivot or fail fast. Don't fall into committing to ideas that aren't going to deliver sufficient value to cover their total cost of ownership.

And recognize, in all of this, that data is a socio-technical endeavour and that people will be impacted. If it's scary to us, this world that we're coming into and the scale of change we're seeing, it's going to be ten times scarier for those that aren't data professionals. And it's our role to help them understand, and adapt, and embrace this new world in a responsible and ethical manner.

I've covered a huge amount of ground there. If you want to explore any of the topics I've covered, there's a range of blogs on our website that I and some of my colleagues have authored on these topics. So please, if you want to explore any of this in more depth, please visit the Endjin blog.

And there's only one thing I would ask that you do. Thanks for coming along to the session, but it'd be really great to get your feedback, not least because every piece of feedback generates a contribution to a charity of your choice, so there's one reason to do it. The other reason is your name will get entered in the prize draw this afternoon.

And finally, I'd just love to hear what you thought of the session. Thanks very much for your attention, and I don't think we've got time for questions, but please come and find me at the end if you've got any questions. Thank you.