Ed Freeman

Tutorial

In this video, Ed Freeman, Senior Data Engineer, who has spent the last 6 months in the Microsoft Fabric Private Preview, takes you on a tour around Microsoft Fabric. Ed covers: the Fabric Portal, Fabric "Experiences", the Fabric Workspace, Notebooks, SQL Endpoints, Data Factory Pipelines and much, much more! The full transcript is available below.

The talk contains the following chapters:

  • 00:00 Brief introduction to Microsoft Fabric
  • 00:38 Fabric portal landing page and "experience" overview
  • 02:32 Navigating through Fabric experiences
  • 03:58 Fabric Workspace view - exploring Fabric artifact types
  • 04:25 Quick look at Notebooks within Fabric
  • 05:49 Brief introduction to SQL Endpoints in Fabric
  • 06:35 Data Factory pipelines in Fabric
  • 07:10 Azure Synapse vs Microsoft Fabric
  • 08:45 Artifact tabs - multi-tasking experience in Fabric
  • 09:20 Fabric Workspace UI - filtering and changing views
  • 10:00 Additional workspace features
  • 10:42 Workspace roles
  • 11:10 Monitoring Hub in Microsoft Fabric
  • 11:58 Wrap-up

Microsoft Fabric End to End Demo:

If you want to learn more about Fabric, take a look at some of our other content:

Decision Maker's Guide to Microsoft Fabric

and find all the rest of our content here.

Transcript

Ed Freeman: If you've been tuning into Microsoft Build, you'll have heard of Microsoft Fabric: Microsoft's new data platform, which is meant to unify all of the different data products that Microsoft has offered up to now. It's kind of the evolution of Synapse, the fabled Synapse Gen 3, but it adds a few more things than that.

Now, you've probably seen some of the demo videos. You've maybe watched the keynote and the breakout sessions, but there's nothing like actually seeing it in real life without pristine demos. So that's what I'm going to show you today.

So you'll be logging in to a new endpoint. Instead of what we know as app.powerbi.com to access Power BI, you now access app.fabric.microsoft.com, and you'll see this new landing page. We have a bunch of what they call "experiences" within Microsoft Fabric. These are essentially different facades over your data estate.

So we have the familiar one top left in Power BI, but we also have Data Factory, Data Engineering, Data Science, Data Warehouse, and Real-Time Analytics experiences. Those last four use the same Synapse blue branding and the same Synapse terminology, so those of you already using Synapse will be quite familiar with the artifacts we're dealing with here. The same goes for Data Factory: we all know Synapse had a pipelines element, which was essentially a lift and shift of Data Factory. Data Factory has kept its name in Fabric, and it's a similar pipelines experience, which we'll see in a moment. But fundamentally, we have these six different experiences currently.

And the idea is these are aligned to some of the personas that you would have working within Fabric. So, remember, Fabric is the shift even further towards self-service, where we're bringing these experiences closer and closer to the personas that need to be doing these tasks.

So, I can click on any of these, but I also want to take the opportunity to show you the experience selector down here. These six experiences are just stacked up on top of one another. We can click, for example, Data Engineering, and that brings me to a Data Engineering-specific view where I can create new artifacts related to Data Engineering.

So, we've got a Lakehouse, a Notebook, a Spark Job Definition, or a Data Pipeline. Similarly, if I change to the Data Science experience, I can create Data Science related artifacts like Models, Experiments, and Notebooks. Then if we move to the Data Warehouse experience, there's currently only one thing we can create, and that's a Data Warehouse.

And then finally, with Real-Time Analytics, we've got a KQL Database, a KQL Queryset and an Event Stream. The first two are much the same as Data Explorer, the preview feature over in Synapse which never quite made it out of preview. Event Stream is a new feature within Fabric, which is very similar to the no-code editor that you find in Azure Stream Analytics, if you're familiar with that.

So those are all the Synapse experiences. For Power BI, we have our usual Power BI experience, which I'm sure a lot of you are already familiar with. And then in Data Factory, there's only a single thing we can do at the moment, and that's create a Data Pipeline. So those are all the different experiences, and if I navigate to a workspace that I've got some artifacts in, you'll see that there are various types of artifacts within it. We've got Notebooks, Reports, Datasets, SQL Endpoints, Lakehouses. What are all these things? We're not going to dive into exactly what all of these are in loads of detail in this video, but what I will show you is a couple of the experiences.

So, if I go into a Notebook, for example (it just takes a few seconds for that to load), this is quite a familiar Notebook experience, right? We've got our code cells on the right, and we've got some markdown cells as well. But what we've also got now is this mount on the left-hand side, and that's a Lakehouse. Without going into too much detail, a Lakehouse is essentially a warehouse that lives in your Data Lake as files, and potentially as table formats, which are also persisted as files but include the metadata so that they can be read and treated as tables, using storage formats such as Delta Lake. With this Lakehouse pane, we can do things like drag and drop our files onto the canvas to automatically generate the Spark code to read them.
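As a rough illustration of that workflow, here's a minimal PySpark sketch of the kind of code that drag-and-drop produces, assuming a CSV file has already been uploaded to the attached Lakehouse's Files area. The path and file name are placeholders, and `spark` and `display` are provided by the Fabric notebook environment.

```python
# Minimal sketch: read a CSV from the attached Lakehouse's Files area into
# a Spark DataFrame. "Files/sales/orders.csv" is a placeholder path; in a
# Fabric notebook, relative "Files/..." paths resolve against the default
# Lakehouse attached to the notebook (assumption for this example).
df = (
    spark.read.format("csv")
    .option("header", "true")
    .option("inferSchema", "true")
    .load("Files/sales/orders.csv")
)

display(df)  # render the DataFrame in the notebook output
```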

Notebooks have also gained more of the productivity-suite-type features, like commenting and collaborating on Notebooks, which is really cool. But if I go back to the workspace: that's a Notebook, and I've also got these other artifact types.

Datasets are the usual Power BI Datasets that we all know and love. Now, a SQL Endpoint is a SQL endpoint over your Lakehouse tables. Any tables you write with Spark into a Lakehouse, you can query using the SQL endpoint. It's very similar to how, with SQL Serverless in Azure Synapse, you could query over a table stored in Delta format in your Data Lake.

That's what the SQL endpoints do. All the tables in Fabric are stored in the open-source Delta table format, and the SQL endpoint enables you to query those Delta tables and join them together or wrangle them however you want. It is a read-only SQL endpoint.
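To make the read-only endpoint idea concrete, here's a minimal sketch of querying it from Python with pyodbc, assuming you've copied the SQL connection string for your Lakehouse's SQL endpoint from the Fabric portal. The server name, database name, table name, and authentication method shown here are all placeholders/assumptions; use whatever your environment provides.

```python
import pyodbc

# Placeholder connection details: copy the real SQL connection string for
# your Lakehouse's SQL endpoint from the Fabric portal. Azure AD interactive
# auth via ODBC Driver 18 is one option; your organisation may use another.
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-sql-endpoint>.datawarehouse.fabric.microsoft.com;"
    "Database=<your-lakehouse>;"
    "Authentication=ActiveDirectoryInteractive;"
)

# Read-only T-SQL over the Delta tables exposed by the endpoint.
cursor = conn.cursor()
cursor.execute("SELECT TOP 10 * FROM dbo.orders")  # 'orders' is a hypothetical table
for row in cursor.fetchall():
    print(row)
```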

Other things that we have in here are Data Pipelines, as I suggested earlier. If I go into one of these pipelines (it just takes a few seconds to load up), this is a familiar Data Pipelines experience. It's a very simple one I've got here: I've got parameters, a switch statement, and a copy activity with slightly different configuration depending on the parameters passed in. But it's a familiar experience. So, Fabric is bringing in a couple of new bits of functionality, which we'll highlight in separate videos, but it's also resurfacing some existing functionality that we found in Azure Synapse and other tools, like Stream Analytics with Event Stream.
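Purely to illustrate the control flow of that pipeline, here's the equivalent logic sketched in plain Python. This is not the Data Factory pipeline definition format, and the parameter values and copy settings are hypothetical; it just shows the "parameter drives a switch which drives a copy" shape described above.

```python
# Hypothetical illustration of the pipeline's logic: a parameter drives a
# switch, and the copy step runs with slightly different configuration.
# This is NOT Data Factory pipeline JSON, just the equivalent control flow.
def run_pipeline(source_system: str) -> None:
    # Switch activity stand-in: pick the copy configuration from the parameter.
    copy_configs = {
        "sales": {"source_path": "landing/sales", "sink_table": "sales_raw"},
        "finance": {"source_path": "landing/finance", "sink_table": "finance_raw"},
    }
    config = copy_configs.get(source_system)
    if config is None:
        raise ValueError(f"No copy configuration for parameter '{source_system}'")

    # Copy activity stand-in: in the real pipeline this moves data into the
    # Lakehouse; here we just show which configuration was chosen.
    print(f"Copying {config['source_path']} -> {config['sink_table']}")

run_pipeline("sales")
```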

And you might be thinking: wasn't Azure Synapse meant to unify all of these things? Yes and no. Azure Synapse was a PaaS service, Platform as a Service. You still had to provision it in Azure, you had to provision a Data Lake, and you had to hook a few things together. It integrated at the UX level, but at the data level, being able to query any files with any compute engine wasn't possible. Fabric, as well as adding a couple of features that didn't exist in Azure Synapse, is also integrating things at the data level: everything's stored in the open Delta Lake format, so that multiple compute engines can read the same data that was created by a different compute engine.

Now, you can't have multiple engines write to the same table. But once a table has been written by any one of the engines, it can be read by any of the other engines, because the same table format is used across the board. That's one of the unique selling points of Fabric: this interoperability between engines.
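As a minimal sketch of that "one writer, many readers" pattern, here's what persisting a table from Spark might look like in a Fabric notebook with a Lakehouse attached. The table name is hypothetical, and `df` is assumed to be the DataFrame loaded in the earlier example.

```python
# Minimal sketch: persist a DataFrame as a managed Delta table in the
# attached Lakehouse. "orders_clean" is a hypothetical table name; `df`
# is assumed to be the DataFrame loaded earlier in the notebook.
(df.write
   .format("delta")
   .mode("overwrite")
   .saveAsTable("orders_clean"))

# Written once by the Spark engine, the same Delta table can then be read
# by other engines, e.g. via the read-only SQL endpoint:
#   SELECT * FROM dbo.orders_clean
```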

Now if we go back to the UI, you'll see that after I've opened these two artifacts, they appear below the workspace selector on the left, so I can actually navigate between the two quite quickly. This pivots the Synapse view of the world: where we had horizontal tabs, we now have vertical tabs. That'll take a bit of getting used to if you're familiar with Azure Synapse, but as you can see, the UI is very similar to the Power BI experience now. I know when I look at this, I get a little bit overwhelmed by the lack of organization here, because we've got loads of different types of things all living in a single list.

But as with Power BI, we can filter for the type of artifact we're after (a Notebook, a Report, or Datasets), and we can even see lineage views showing how some of these things depend on one another and tie together, just like we can in Power BI.

Some other things that are possible: we've still got deployment pipelines, and if I go to the workspace settings, we also have Git integration, which is great. Git integration from the get-go, rather than as an afterthought. It's still got some places to improve and some features that haven't landed yet, but it is there, and you should definitely start using it.

On the Data Engineering side, we've got some configuration options around the Spark compute that's used by Notebooks and Spark Job Definitions, including some library management tools. So we can add feeds, and we can add custom libraries that we may have built locally.
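Alongside those workspace-level library settings, Fabric notebooks also support inline, session-scoped installs for quick experiments. A minimal sketch follows; the package name and wheel path are just examples, and the `/lakehouse/default/Files/...` mount path is an assumption about where an uploaded wheel might live.

```python
# Session-scoped install inside a notebook cell (a quick-experiment
# alternative to the workspace library settings). Package name is an example.
%pip install great-expectations

# Custom wheels built locally can be uploaded and installed too; the path
# below is a placeholder pointing at the attached Lakehouse's Files area.
%pip install /lakehouse/default/Files/libs/my_custom_lib-0.1.0-py3-none-any.whl
```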

The workspace roles are currently very similar to Power BI workspace roles. You've got your Admins, your Members, your Contributors, and your Viewers. Again, we won't go into the details of exactly which role can do what in this video; we'll have separate resources on our blog and our YouTube channel for that. But it should feel relatively familiar for those of you who know Power BI.

We also have a Monitoring Hub. Much like the monitoring area in Azure Synapse, the Monitoring Hub in Fabric shows the things I've run and that have been monitored recently.

So on the Data Engineering tab I have a bunch of Notebooks and Spark Job Definitions that I've run over the last couple of weeks. We can see those all in a nice list, and we can drill through to some of them if we want more detail. That takes me to the actual Spark application that was created when I ran it, and we can access the Spark History Server as well.

Now, I don't really want to go too much deeper into this. This was meant to be a whizz around Microsoft Fabric. I hope that's proven useful. We'll have a load more content going into deeper detail for lots of these artifacts. So please stay tuned. In the meantime, have fun playing around with Microsoft Fabric.

It's certainly exciting for it to now be available in public preview.

Let us know if you have any comments. Otherwise, watch out for the next videos. Thank you.