Decision Makers Guide to Microsoft Fabric - Hedging your Fabric Bet | endjin

Ian Griffiths Ed Freeman 11th July 2023

Decision Makers Guide

Welcome to the first part in a new series of interviews with real-world Decision Makers (CTOs, CIOs, Heads of / Directors of Software Engineering, Data & Analytics) about how they manage their strategic roadmap and evaluate new technologies to simplify their portfolio, deliver better outcomes for stakeholders, or give them a competitive advantage. In a 3-part interview we talk to Tom Peplow about his assessment of Microsoft Fabric.

TL;DR

Tom Peplow, Principal & Senior Director Product Strategy at Milliman, chats with Ian Griffiths & Ed Freeman from endjin about how Microsoft Fabric is a disruptive technology. Milliman has been using Azure since 2010, building a bespoke actuarial solution which incorporates big compute and big data technologies to perform petabyte scale data processing. Milliman are currently evaluating whether Fabric can solve their problems more efficiently than their current implementation. Tom is impressed with Microsoft's internal Finance Team's digital transformation case study, which exemplifies the improvements and productivity gains his customers desire, and believes Fabric could deliver the same type of outcome. However, there are concerns that this transformation may lead to historic headaches (like Microsoft Access) all over again, causing apprehension on the IT side. Tom believes that standards are a significant enabler and that openness is key.

The talk contains the following chapters:

00:00 Introduction
00:57 What does Fabric offer?
03:11 Costs and benefits of change
05:11 Fabric, Power BI and the end user experience
07:02 New enabling technology in Fabric (OneLake)
08:55 Open Table Format
10:20 Fabric's end user focus
11:53 Fear of Fabric
13:15 Guard rails for end users
13:55 DevOps & FinOps
16:16 Version Control

Transcript

Ian Griffiths: Have you built systems on top of Microsoft Azure's data and analytics services? Are you wondering what Microsoft Fabric means for your existing investments and skills? I recently spoke to someone who is uniquely well placed to talk about these things. In fact, he had so much to say about Microsoft Fabric that we've split this recording into three parts.

If you want to hear the other two, please make sure that you're subscribed to this channel. So with me today, I have Tom Peplow, who is a Principal at Milliman, and we also have endjin's very own Ed Freeman, who is a Senior Data Engineer. Now I'm really excited to be able to talk to Tom today about Microsoft Fabric because Milliman have been pioneers in the world of high performance cloud based computation and analytics.

They've been building systems in this way since 2010, so they know a great deal about how to do this. Tom, don't you already have everything you need? Does Fabric bring anything to the party for you?

Tom Peplow: Yeah, we do have a lot, and we've built a lot over a long time. But one of the challenges that we have is that our differentiated value in the market is actuarial computations. We want to make it easy for Actuaries to do their job, and that means that we need to give them a breadth of tools to do all the things that are important to Actuaries.

A big part of that is data analytics. They run models to project how insurance companies are going to perform out into the future, which generates huge amounts of data, huge amounts of valuable insight which they can use to better plan for the future, design better products, and help keep people safe.

It's been really difficult to handle that much data, and to give them tools to report on it well. It's required us to build and glue together lots of capabilities from Azure, and we've evolved it over the past decade. When we started we used Hadoop; now we use Apache Spark, so that was one big technology transition.

Our analytics uses Power BI Embedded, which is fantastic. And we embed ETL capabilities from Azure Data Factory too, so they can self-serve these things, but we've put together a lot of capabilities ourselves and we have to look after them. Where Fabric is helpful is that it could be disruptive to our strategy of how we glue all these things together.

If Fabric had existed 10 years ago we would have used it. Now we have to decide whether we're going to use it or not. We're in the process of evaluating whether it solves the problems better than we've solved them. And there are obviously other players in the mix too; there are other vendors who have similar solutions. But we're very much looking at Fabric as a mechanism for us to deliver more value to our customers without having to write or support as much code for the long run.

Ed Freeman: And I assume, Tom, it's weighing up the cost of kind of upskilling on a new Microsoft Data Platform versus the benefits that you might get from this kind of new SaaS-ification and self-service flavour of a Microsoft Data Platform. When you've done these kinds of platform migrations in the past, how easy has it been for the Actuaries to actually get used to a new platform?

Tom Peplow: So one of the interesting things we've done is to try and shield the customer from as much change as possible. For example, the change from Azure Data Factory V1 to Azure Data Factory V2 was transparent; they didn't notice. We ran both side-by-side so that we could evaluate when V2 was performing as we'd like, at the scale which we pushed it.

So while we have engineering ways that we can do that safely, it becomes difficult when the user experience is the thing that changes, because they've got to click the buttons and the screens to do the things they want to do. For example, we moved from writing Pig in Hadoop to a visual designer in Mapping Data Flows.

That was a very visible change to the users, so they could choose when they wanted to do it. And we enabled both capabilities to co-exist in our platform, still today. The customer can choose what they want to use. Over time it became evident that the newer technology we brought to bear was gaining traction, and customers just used more of it. We try to provide options to keep people feeling comfortable that they can do their job today whilst providing them better options to do their job in the future.

Ed Freeman: One of the benefits with Microsoft Fabric, especially from an end user perspective, is the foundational blocks are built on the Power BI service and the Power BI UI, so there should be a level of familiarity for existing Power BI users.

That being said, it's still new types of artifacts laid out in a slightly different way. There's always going to be some sort of upskilling exercise needed.

Tom Peplow: Microsoft have done quite a nice job of trying to keep a similar user experience across the new services that have come in. The way notebooks integrate with the other components shows they've thought about the integration, which is really important because of that handoff between needing to do data transformation and then looking at information. You don't want to have to jump through lots of different technologies. We tried to make that easy for our users. It was very easy to do some transformation and then look at results. But having that really stuck together by the vendor is critical because it just simplifies the things you need to do to be efficient, and it lowers that inner-loop development cycle time.

You're going from doing your work to checking your work to automating your work. Making that a really tight loop is where you get value, because people struggle to retain context.

We know as developers that if our unit tests are slow, it stresses us out; so we want to offer that same productive experience for our customers. We want our customers to be able to build models and analyze the results in as tight a loop as possible, which requires us to run large computations at scale to generate huge amounts of data for them to look at. We're focused on making that really fast and really seamless. We're going to have to host our data and calculations in a platform to make that happen because no one else has our IP.

But what we want is that, when the customer is in that platform, everything is really seamlessly integrated so they're doing their job as quickly and as easily as they possibly can.

Ian Griffiths: Do you think that Microsoft Fabric is mainly going to just reduce your engineering and operational costs because it does more of the things that you used to have to do yourself? Or do you think there are actually any fundamentally enabling features in it that you couldn't, in principle, build for yourself?

Tom Peplow: OneLake is huge. One of the biggest challenges we have is how we federate our information with our customers in both directions. We are given every insurance policy holder they've got, across all the blocks of business they've ever issued; every asset they've invested in, every asset provider they've got, economic scenario data, Bloomberg feeds etc.

All these different types of data come from different places, from different vendors, and it's hard to manage that. You know what it's like when you do these Big Data integration projects, and we've got customers with hundreds of separate feeds of information coming through to us. We then obviously take that and run actuarial models to produce really valuable insight to the business. The result is a huge data set. We have to heavily aggregate it before we can give it back to them. Being able to have a federated environment where all of our data sits together and can be queried together with super low latency is a game changer. What we could do with a technology like OneLake is really compelling because cloud providers can do this, but ISVs who sit on top of a cloud couldn't. My "bits and bytes" are stored on the physical hard disk in Azure and Ed's "bits and bytes" are also stored in the same place, but they're all muddled up. So when I query my data, it's not really any different to me querying Ed's data; I just don't have the security permission to access it. That means that Microsoft have the opportunity to optimize queries in ways that others couldn't because they own the security close to the data.

Ed Freeman: Do you think the standardization on this "Open" table format in Delta Lake is something you or even your actuaries care about? Or will that actually enable you to do anything different with kind of cross-cloud, or using existing, say, Databricks workspaces to read and write to those tables? Do you see any benefit in that?

Tom Peplow: I think the openness is key. Like I described earlier, people's transitions are going to be incremental. It remains to be seen if Microsoft Fabric's going to win or be adopted. So being able to make a bet that will work if technology changes on us is huge. So not being locked into a particular technology because of open standards is very important.

The internet's probably the most successful information system; it has no challenge of sharing information. You can click one link and land on another page, and that was the beauty of the idea: it was completely standards based. Nobody owned that technology. It's owned by the community and you can improve it together.

So if we start thinking about the challenges inside of organizations, federating data across departments is hard. Federating data across businesses is hard. There's technology and standards that have achieved this at super big scale. Standards are a big enabler for this to start to happen much more smoothly within, and across organizations.

Ian Griffiths: So one of the things that seems to change with Microsoft Fabric compared to Azure Synapse Analytics, for example, and you've touched on this a little bit, is it seems to be much more end user focused. It feels like it's more a Microsoft Office Suite thing than a kind of developer oriented tool. Do you see both pros and cons of that shift more to that end?

Tom Peplow: I do. It's interesting because my job has me out on the road seeing our customers. And one of the things I've been doing is asking the customers what they think of this now it's public. We were able to bring some of our customers in on the private preview, so we've been able to have some dialogue with them.

There's a huge amount of excitement on the business side. I saw Amy Hood talk once about the finance transformation inside of Microsoft's accounting department. And that's what we do. We do actuarial transformations very similar to what Amy would have had to have done with Microsoft's finance team. A key point she made was that the accountants could do a lot themselves without needing to rely on IT, which freed IT up to do a huge amount of other valuable things for the business, not having to worry about how to build a balance sheet for an accountant.

Actuaries are like accountants on steroids. The types of things they want to do with data are not easily explained to people in IT. They're very excited because it's "Oh, I get more data. I get to do more things with that data. I can do my job more easily". So that's good. And that was a big enabler for Microsoft's accounting transformation.

On the other side you've got fear; everybody remembers Microsoft Access Databases. I've literally heard from a customer, "This is just going to be like Microsoft Access all over again!" The other concern is around FinOps; how much are these people going to spend doing this type of analytics? Are they going to be strict at making sure there's a good ROI on all this stuff they do? You can spend a lot of money. There's apprehension on the IT side.

There's also apprehension because they're in a similar position to me. We have an existing platform that works and we need to decide what to do with it. Do we continue to invest? Do we look to disrupt it? My dilemma is the same as everyone else's dilemma, because it's not like when we started this journey in 2010, when no-one was in the cloud and we were bringing them there. Now it's like we're there and we've got platforms they like and that work. This is a particularly disruptive change to that because Microsoft have glued all the bits together for you, so you don't have to glue them together yourself. But then what do you do? What do you do with all the glue you've built? I hear both sides; the business likes it, and then there's me and the IT side, trying to figure out what to do with it.

Ed Freeman: From that perspective; IT in a tug of war with the actual users, how difficult is it in a large enterprise like yours to put in those guardrails to make sure that things don't turn into a mess, and the actuaries don't start burning loads of money, and they don't start creating loads of artifacts that they end up abandoning? It's a non-trivial exercise to try and put those guardrails in place, keeping kind of IT and the governance team happy with how people are using the platform.

Tom Peplow: We are trying to bring a software engineering approach to actuarial science. Software engineers have a development environment where they build and test their software. When they're done, they put it into a production environment. That simple separation of duties between "I'm doing my work" and "I'm building the things", versus "I'm running the things" and "I'm looking at the numbers", does introduce rigor because you can put DevOps and FinOps around the process of being "in production".

I built a model. I want to put it in production. You can ask them the questions, "How much does it cost to run?", "Do you have reports around it?", "What's the impact of the change you just made?" And you can control the changes from one place to the other. And then you don't frustrate them because you have a robust CI/CD process for promoting changes from development into production.

What you get is a user who isn't frustrated. They're not saying there's a whole load of governance stopping them getting their job done. And you don't get a frustrated IT team either, because you're not putting untested workloads that are very expensive to run into production.
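As an editorial illustration of the promotion gate described above, here is a minimal sketch. It is not Milliman's or Fabric's implementation; the `PromotionRequest` type, its fields, and the `promotion_blockers` function are hypothetical names, and the checks simply mirror the questions quoted in the interview (cost to run, reports, tested changes).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PromotionRequest:
    """Hypothetical record describing a model someone wants in production."""
    model_name: str
    estimated_run_cost: Optional[float]  # None means nobody has measured it yet
    has_reports: bool                    # "Do you have reports around it?"
    tests_passed: bool                   # was it tested in development?


def promotion_blockers(request: PromotionRequest, cost_budget: float) -> List[str]:
    """Return the reasons a request cannot be promoted; empty means it can."""
    blockers = []
    if not request.tests_passed:
        blockers.append("workload has not passed its tests in development")
    if request.estimated_run_cost is None:
        blockers.append("no cost estimate for a production run")
    elif request.estimated_run_cost > cost_budget:
        blockers.append("estimated run cost exceeds the agreed budget")
    if not request.has_reports:
        blockers.append("no reports defined around the model")
    return blockers


if __name__ == "__main__":
    request = PromotionRequest("example-model", estimated_run_cost=None,
                               has_reports=True, tests_passed=True)
    for reason in promotion_blockers(request, cost_budget=500.0):
        print("blocked:", reason)
```

A gate like this would typically run as one step of the CI/CD pipeline, so the user gets a specific list of things to fix rather than a blanket refusal.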

Conversely, sometimes you've got to answer questions quickly because your boss comes to you and says, "What's the impact of this thing that's just happened yesterday?" We do some really clever stuff with Machine Learning to train models so they can evaluate market conditions now without having to run these big, heavy, expensive models. That's really powerful.

Sometimes you just have to run a model. Sometimes you must change some assumptions. Sometimes you've got to go spend money. When it's aligned to value, if the CFO of a large insurance company is asking you to do something quick, they're not going to care how much it costs if they get the answer on time. It's just creating the accountability to track all that back through. Being robust and mature is not a technology problem. It's a process problem.

Ed Freeman: Microsoft Fabric does seem to have taken that on board, whereas Power BI in the early days didn't really think about CI/CD, DevOps and FinOps; that all came much more recently. But Microsoft Fabric seems, from the outset, to be bringing in principles like version control and Git integration at the Fabric portal layer. And soon there will be a swathe of APIs to drive all of these things. From your perspective, you want that rigor, and that is surely a very good thing.

Tom Peplow: Definitely. Using Git is huge for two reasons: first, the fact they had to do a lot of work to decompose Power BI artifacts so they can be version controlled; second, being standards based. It's Git, so you can integrate with it easily. We have mechanisms for putting our modelling technology alongside technology that we don't own the IP of. Microsoft are listening. They've taken some time to get to this point because it was a big investment. But all the things the community have been asking for in Fabric (better controls, better governance, better automation, more openness, version control), all those things are now just out of the box.

There are lots of features we had to build. We do support version controlling Power BI artifacts in our platform because it's important for us. When the regulator comes to us and asks, "Why are these numbers, these numbers?" we have to be able to answer that question. At the time Microsoft didn't give us a mechanism for version controlling Power BI, so we essentially version controlled it as a black box. This isn't a particularly elegant solution because we can't show you what changed. But things are getting better. And this is an obvious place where we can reinvest in modernizing, so that we have a better experience than we already have.
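To make the "black box" point concrete, here is a small editorial sketch, using only the Python standard library, of the difference between versioning an opaque binary artifact and versioning a decomposed, text-based one. The function names are hypothetical and this is not how Milliman or Fabric implement versioning; it only shows why a hash can say that something changed while a text diff can say what changed.

```python
import difflib
import hashlib
from typing import List


def fingerprint(artifact_bytes: bytes) -> str:
    """Identify an opaque artifact by the SHA-256 hash of its bytes."""
    return hashlib.sha256(artifact_bytes).hexdigest()


def black_box_changed(old: bytes, new: bytes) -> bool:
    """Black-box versioning: we learn that something changed, not what."""
    return fingerprint(old) != fingerprint(new)


def describe_change(old_text: str, new_text: str) -> List[str]:
    """Text-based versioning: list the lines that were removed or added."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]


if __name__ == "__main__":
    # Illustrative content only; real report definitions look different.
    old_definition = "measure: total_reserves\nformat: currency"
    new_definition = "measure: total_reserves\nformat: percentage"

    print(black_box_changed(old_definition.encode(), new_definition.encode()))
    for line in describe_change(old_definition, new_definition):
        print(line)
```

Once artifacts are stored as text in Git, the second kind of answer comes largely for free from ordinary diff tooling.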