Perspectives on Microsoft Fabric
Briefing
Microsoft Fabric is a third-generation data and analytics platform, building on the strong foundations established by Azure Synapse Analytics and Power BI. It is a SaaS solution that further lowers barriers to adoption. Endjin has been on the Private Preview for Microsoft Fabric since late 2022, and has put the platform through its paces using real-world data and scenarios.
In this 20-minute chat, Microsoft MVP Ian Griffiths interviews Barry Smart, Director of Data & AI, and Ed Freeman, Senior Data Engineer, about their experiences of this new unified data platform. The full transcript is available below.
The talk contains the following chapters:
- 00:50 What is Microsoft Fabric?
- 02:12 How does that compare with other offerings like Databricks or Snowflake?
- 03:43 What does the transition from PaaS to SaaS mean?
- 05:41 Does Fabric represent a shift in maturity and how would that look on a Wardley Map?
- 07:41 What have you been doing with Fabric?
- 08:30 Do you have a favourite feature?
- 09:40 Is Fabric ready for prime time?
- 12:19 Could you sketch out how you might approach building a spike to explore Fabric?
- 15:20 Ian's final thoughts
- 16:02 Barry's final thoughts
- 17:23 Ed's final thoughts
From Descriptive to Predictive Analytics with Microsoft Fabric:
- Part 1 - Overview
- Part 2 - Data Validation with Great Expectations
- Part 3 - Testing Notebooks
- Part 4 - Task Flows
- Part 5 - Observability
Microsoft Fabric End to End Demo Series:
- Part 1 - Lakehouse & Medallion Architecture
- Part 2 - Plan and Architect a Data Project
- Part 3 - Ingest Data
- Part 4 - Creating a shortcut to ADLS Gen2 in Fabric
- Part 5 - Local OneLake Tools
- Part 6 - Role of the Silver Layer in the Medallion Architecture
- Part 7 - Processing Bronze to Silver using Fabric Notebooks
- Part 8 - Good Notebook Development Practices
Microsoft Fabric First Impressions:
Decision Maker's Guide to Microsoft Fabric
Transcript
Ian Griffiths: Microsoft recently announced Fabric, their new data platform. Now here at endjin, we have had access to the preview of Fabric for about half a year now. You'll have seen the demos of the product launch at Build, but I wanted to get an independent view of what Fabric is like from people who've spent a significant amount of time building real systems on it.
So today I have with me two of my colleagues who've been working with the preview for some time now. We have Barry, our Director of Data Engineering and AI, and we also have Ed, a Senior Data Engineer. I am mainly involved in the software development side of things here at endjin. So that means that I know very little about Fabric, either in a fashion sense, because I'm a developer, or in terms of new Microsoft products.
So Ed, could you start by telling me what Fabric is and what it's about?
Ed Freeman: Sure. So Microsoft Fabric can be thought of as the third generation of Microsoft data platforms. The first generation, for example HDInsight or SQL Data Warehouse, consisted of somewhat isolated takes on traditional data products.
The second generation was Synapse which integrated platforms at a UX level, but still felt a little disjointed at the data level. And now we have Microsoft Fabric, which builds on the Synapse unification vision, but with a particular focus on enabling deep data level interoperability. Now to do that, there's been a huge investment in standardizing the foundations of the platform to enable compute models to integrate and interoperate seamlessly.
Then these compute models are surfaced in an experience-oriented UX that maps to common data personas like Data Engineers, Data Scientists, Citizen Analysts, and so on.
The other big shift is that Microsoft Fabric is SaaS rather than PaaS. That brings with it a more opinionated view of how a modern data platform should take shape and enables a shift further towards data democratization, which is a fancy way of saying more self-service. And that's for all personas. It also introduces a standardized, capacity-based commercial model where users don't need to manage billing for compute separately.
Ian Griffiths: So how does that compare with other offerings like Databricks or Snowflake?
Ed Freeman: That's a very big question indeed and one that I'm sure lots of people will be asking.
So these products are all great, but we're not quite yet comparing apples with apples. Today, Databricks and Snowflake have a somewhat narrower, deeper focus, at least with regards to how they are perceived. So rightly or wrongly, Databricks is still widely seen by customers as the cross-cloud platform for Data Science and Machine Learning workloads.
Though its marketing strategy would suggest it's now the home of the lakehouse, and it has even more recently branched into a fully fledged data warehouse offering. Snowflake is maybe best known as an innovative SaaS cloud data warehouse, which has more recently started releasing features that would appeal to Data Engineer type personas.
Now Fabric from the outset has a much broader vision, innovating all the way up and down the stack from the low level data and compute nuts and bolts through to these kind of Knowledge Worker experiences that you've seen in the Microsoft demos.
Now, I wouldn't expect any of these products to stand still though. There's general convergence to this unified analytics platform happening across the whole industry. But we're going to go into more depth and produce some more content that goes into the detail, which we'll be putting on the website and sharing more videos about that.
Ian Griffiths: So Barry, Ed's talked a bit about how Microsoft's data platforms in the past have tended to be PaaS, platform as a service, but Fabric moves more in the direction of SaaS.
Could you tell me a bit more about what that means?
Barry Smart: The SaaS-ification that they talk about with Fabric is really a major aspect of the new service. And I think it can have a significant impact going forward because it removes a lot of the barriers potentially to adopting a data platform.
It's great to see Microsoft investing in this platform. It has been driven by industry trends. Businesses want to be more data driven but they don't want to wait six to 18 months to get their first data product delivered. A big part of what they're doing with Fabric is to reduce that "time to value" and also reduce the Total Cost of Ownership.
Ed Freeman: What that does also mean, though, if we look at how Power BI has grown over the last five years, is that because it started off as a SaaS platform, lots of the enterprise features like networking, DevOps principles, and Application Lifecycle Management didn't come until further down the line, because those types of things aren't as commonplace in SaaS platforms as in PaaS platforms.
Barry Smart: It's all been driven by this desire to decentralize data and analytics within organizations because the current model relying on centralized Data teams is just proving not to be scalable.
And as organizations choose to become more and more data driven, Microsoft has responded, largely, with a lot of the features and capability that they've built into Fabric.
Ian Griffiths: One of the things we like to think about at endjin is Wardley Maps, which provide a way of visualizing where technologies sit on a kind of maturity landscape as they move from Genesis, as brand new technologies, gradually maturing over time to the point that they eventually become like Commodities.
So these days, very few people run their own power generators, for example. Does Fabric represent any sort of shift of that kind towards commoditization of the sorts of things that might once have been quite specialized technical work?
Barry Smart: Yeah, very much. We've actually analyzed that shift from products such as Synapse and Data Lake Gen2 and their various component parts to what Fabric is offering on a Wardley Map, Ian, and it's definitely, in Wardley Mapping terms, a big shift to the right and upwards. So, it's not just about the commoditization of the services, but also moving some of these services further up the Value Chain so they're more visible to the end users. And I think that's part and parcel of this overall vision with Fabric to democratize data and analytics.
Ed Freeman: And it is very much more aligned with Data Mesh, so decentralized analytics, data as a product, domain ownership, etc. There are lots of things within Microsoft Fabric that are aligned with that methodology.
Barry Smart: Yeah. And that's right, one of the key features of Fabric, actually, is to have different "experiences" for those different personas. So, there's a Data Science experience, there's a Data Engineering experience and so on. So, they're very much thinking about empowering these different individuals within the organization to deliver more value, and deliver that value faster.
Ian Griffiths: Okay. That's great. Thank you. I find if I want to understand something, I need to get to grips with some specifics. Ed, can you tell me what you've been doing with Fabric since you got your hands on it?
Ed Freeman: Yeah, absolutely. As you know, Ian, we work for a Microsoft Partner. So we were fortunate enough to actually get access to the private preview for Microsoft Fabric. And as a Data Consultancy and an Azure Consultancy, we wanted to put ourselves in our clients' shoes. What's quite commonplace amongst our clients are these kinds of regular-cadence batch refreshes, with ETL or ELT workloads that can be quite complex. So, we've been trying to put that through its paces, along with other things that we're quite passionate about internally, more from the software engineering background: the DevOps, the testing, the Git integration.
Ian Griffiths: And Barry, do you have a favourite feature of the things you've looked at so far?
Barry Smart: I'm a huge fan of Notebooks and I love what they've been doing with the Notebooks experience. So one of the bugbears with Synapse, for example, is that when you open a Notebook up, if you're not connecting to an existing Spark cluster that's been spun up, it can take two, maybe three minutes for that to happen. In Fabric, it's dialled down to seconds, which is amazing.
Another thing about Fabric: it feels to me like it's moving more and more towards Microsoft's productivity suite. It's closer to that, and less close to Azure now, in terms of the whole experience.
So they've been able to bring in things like commenting on Notebooks, so the features you enjoy with Word, where you can comment on a Word Document, raise comments and have people reply to those comments. And then the other major feature for me is the co-editing of Notebooks: you can have multiple parties all editing a Notebook at the same time, again like you can in the Office tooling.
Ian Griffiths: Okay, so this sounds great. So, I'm going to go to my Azure subscription and delete all of my existing data lakes and databases. And so, what status is Fabric in? Is it good to go or should I be a little more cautious than that?
Ed Freeman: I think you should be very cautious with that. The downside of releasing this new product with a huge array of services is that there is so much being released, and all of it is new in some way. They have created new functionality, or removed some functionality, or added new functionality, and we certainly have seen quite a few blockers along the way. We've had a few hurdles, and I think it's fair to say that, being such a significant new release, the Public Preview will last at least a number of months. We don't know exactly when GA is, but it's going to be in Public Preview for a while whilst they gather feedback and fix bugs.
What we would say is don't start migrating over your current solutions, and certainly don't fully migrate any current solutions to it.
Barry Smart: What occurred to me when you were talking there is that this is definitely not a rebranding of existing technologies like Synapse, Data Factory, and Power BI. There's significant change going on here, which is brilliant to see. This is definitely a new generation of technology from Microsoft. Although a lot of it will seem familiar, what Ed's alluding to is that they are investing in redesigning a lot of the compute engines and introducing a lot of new tech to make this a fully integrated experience.
That's great, but it's not ready for production workloads. You wouldn't want to run a production workload on something that's Public Preview anyway. But I would definitely encourage you to invest in some technical spikes, some R&D, and I would encourage you to make those kind of business focused because this is a business orientated platform.
So think about what kind of business-driven workload you would want to put on the platform, do some R&D around that to evaluate the platform, and have clear success criteria about what you want to achieve from that. That will inform you, and have you ready and prepared, should you choose to do so, to adopt the platform once it goes into GA.
Ian Griffiths: But Barry, just to give us a rough idea of what a reasonable approach into this might look like, could you sketch out an example of something that you think might make a reasonable spike for someone who wants to try something with Fabric for the first time? Sort of a concrete example?
Barry Smart: Yeah, so I think it comes back to one of the core principles we have around data and analytics, and that's to be driven by business goals. We do a process called Insight Discovery around a business goal to understand what the department is trying to achieve: who's involved in delivering that, which personas, what the current barriers are to them, how we could help them accelerate delivering their goals, and how data ultimately could help in supporting that.
So, I would encourage you to take that sort of top down approach. And a lot of what Ed's been talking about is exactly that. Fabric, if it delivers on its promise, is going to allow organizations to focus much more on that sort of top-down view of the world and really target their brain power at that rather than managing infrastructure, which has been a bit of a distraction up until now.
And then just take that example. I would suggest keeping it as simple as possible initially. You don't need to launch into complex data science to really understand a lot about Fabric. Taking a single data set right the way from raw, through Fabric, into some actionable insight in Power BI, for example, would deliver a lot of useful insights for you and your team around how it could help you deliver value to the wider organization. So, it's that kind of approach.
Ed Freeman: And it's worth trying to mimic the terminology Fabric is using with regards to architecting a warehouse or a lakehouse. We've standardized on this medallion architecture, which again Databricks have commonly used throughout the last few years. That's where you've got a Bronze Zone, a Silver Zone, and a Gold Zone, also known as your Raw, your Cleansed or Enriched, and your Curated or Projected zones. Star schemas are certainly still in play here, on your Gold side. But try and think about aligning with the Microsoft architectural guidelines as much as possible, because that's what they're developing the tool around internally. So, it's wise to keep things not too far away from their recommended practices.
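Editor's aside, not part of the conversation: the Bronze, Silver and Gold zones Ed describes can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions. The record fields, the function names `to_silver` and `to_gold`, and the cleansing rules are all hypothetical, and in Fabric these stages would more typically be lakehouse tables processed with notebooks rather than in-memory lists.

```python
from collections import defaultdict

# Bronze (Raw): data exactly as it landed, including bad rows.
# The fields "region" and "amount" are hypothetical.
bronze = [
    {"region": " North ", "amount": "100.50"},
    {"region": "south", "amount": "20"},
    {"region": "North", "amount": "not-a-number"},  # invalid amount
    {"region": "", "amount": "5"},                  # missing region
    {"region": "South", "amount": "30.25"},
]


def to_silver(rows):
    """Silver (Cleansed): normalise region names, parse amounts, drop invalid rows."""
    silver = []
    for row in rows:
        region = row["region"].strip().title()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if not region:
            continue  # skip rows with no region
        silver.append({"region": region, "amount": amount})
    return silver


def to_gold(rows):
    """Gold (Curated): aggregate cleansed rows into a reporting-friendly summary."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'North': 100.5, 'South': 50.25}
```

Keeping each zone as the output of a small, separate step is the point of the pattern: the raw data stays untouched, and the cleansing and curation rules can be tested on their own.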
Ian Griffiths: Okay. Thank you very much, both of you. So, it sounds to me that, with this, Microsoft in a sense is getting back to their roots, because they have always been about the democratization of access to computing power one way or another. And it seems like Fabric is doing that for the sorts of data platforms that have until recently been in the hands only of data wizards. And so, it's now making that available to everyone, like they have with their older products. I think the challenge for us, then, is going to be how you balance that with the engineering that's required to make reliable solutions. And I think that's going to be the challenge for organizations like endjin and their customers going forward.
Any thoughts Barry?
Barry Smart: Yeah, I think the way to think about it, or certainly the way I've been thinking about it, is really of Fabric as part of a wider socio-technical endeavour. Fabric means that the technology part of that socio-technical endeavour is becoming easier, and it's going to shine a light even more on the social part.
Really what I mean is: is the organization ready for this kind of product, to get the most out of it? And Fabric's going to put pressure on those kinds of things. So I'm talking about things like leadership and culture and even the organization's operating model. We've talked a lot about decentralizing data and analytics and Citizen Analysts and all of that. But you can't enable that without some organizational change. Evaluating Fabric is not just about evaluating the technology; I think it's actually about starting to think about those wider concerns.
Because Fabric's amazing, you can spin it up and start using it potentially within minutes, but the kind of organizational change that I'm thinking about here is going to take years potentially. So the sooner you understand that and start to move towards it the better.
Ed Freeman: And I think, on a similar note, realistically it's going to be down to people's budgets as well, right? If they've recently re-platformed onto, say, Azure Synapse, then the appetite to migrate to yet another data platform might not be there. The budget just might not be there, and I think it's fair to say that we will still be proponents of Azure Synapse. We will still have clients and customers that are planning to re-platform onto Azure Synapse, and that's still a perfectly valid decision.
I think it's fair to say that Microsoft Fabric will be the future of data platforms within Microsoft, but certainly don't expect Azure Synapse to be going away any time soon.
But I think from my perspective, the future is bright. There's a long way to go with regards to certain functionality and features within Microsoft Fabric. But the vision is there. We've seen it and we believe in it too. It's going to be more about how it lands with the wider community and how the marketing lands, because we know that with Azure Synapse it didn't go entirely to plan: not many people understood it to be much more than the Data Warehouse V2. It is fair to say Microsoft Fabric is much more than that. It builds on top of Synapse and brings new things. But there's the mindset shift of going from PaaS to SaaS, which I think is the most fundamental thing.
Ian Griffiths: Okay. Barry and Ed, thank you both very much. Now endjin is publishing a wealth of content on Microsoft Fabric, so if you would like more depth, please check out the link in the description. Meanwhile, thank you very much for listening.