Microsoft Fabric Machine Learning Tutorial - Part 5 - Observability | endjin

Barry Smart 11th September 2024


In part 5 of this course Barry Smart, Director of Data and AI, walks through a demo showing how to improve observability of your data engineering processes using readily available technology and platforms within Microsoft Fabric and Azure.

Barry begins the video by explaining that one of the pitfalls of using Fabric Notebooks in an operational setting is that it can impede observability of what is going on within the notebook. To address this he shows how standard Python logging, combined with Azure Application Insights, can be used to emit telemetry from key points in your notebook and capture it centrally. We demonstrate this approach with a create_logger function, showcasing its application in a Fabric notebook and the benefits of centralizing logs.

This episode helps ensure continuous monitoring and rapid issue detection for optimal data product performance. Application Insights then provides the tools to enable you to proactively monitor the behaviour of your entire data estate to sustain value over the long term. It also provides a valuable source of troubleshooting information when things do go wrong to help you quickly diagnose and resolve issues.

Stay tuned for our next episode on deploying resources through infrastructure as code.

Chapters:

00:00 Series Recap
00:34 Focus on Observability
02:33 Implementing Logging in Fabric
04:01 Demo: Logging in Action
05:26 Monitoring with Application Insights
06:02 Conclusion and Next Steps
08:45 Final Thoughts and Next Steps


Transcript

Hello everyone!

Welcome to the fifth episode in a series of videos we are creating to provide an end to end demo of Microsoft Fabric. Across this series, we're aiming to show off as many features as possible across both the data engineering and data science experiences in Fabric.

We also want to show how Fabric can be combined with selected services on Azure to deliver DevOps principles with the objective of releasing features rapidly and safely, lowering the cost of ownership and sustaining value over the long term.

Now, in this video, we're gonna focus specifically on the observability of data engineering processes in Fabric.

As a reminder, our goal for this data product is to create a Power BI report that allows users to interactively explore passenger data from the Titanic to understand patterns in survival rates.

To achieve this goal, we are adopting a medallion architecture to ingest, process and project the data by promoting it through the bronze, silver and gold areas of the lake.

Now we're doing the majority of the data engineering work in Fabric Notebooks, and in a previous video we described all the reasons we love notebooks. But we also pointed out there are some potential pitfalls to avoid, and one of those pitfalls is observability.

What we mean by this is that notebooks, when they're used in an operational setting, can be doing a lot of work but there's little visibility of what's going on inside the notebook. This lack of a feedback loop can mean that you don't have that reassurance that the notebook is performing its intended role.

And then when things go wrong, it can also be a struggle to determine why the notebook has failed.

To address this, we're going to adopt the following approach:

Firstly, we want to adopt the principle of consciously generating telemetry at key stages in our notebooks to provide that visibility that we're looking for.
To do this, we're gonna use standard logging functionality and then gather that logging and telemetry that we're generating into a central repository, so that we can observe activity across our whole data estate in one place.
The tools we're going to use in this example are the Python logging module, which comes out of the box with Python, and Azure Application Insights.

So let's head over to Fabric, and we can walk you through this approach and demo the solution in action.

Firstly, to help us do this logging in a consistent way, we've defined a "create logger" function in a central utility notebook.

A few things to observe here:

Firstly, we're using the built-in Python logging package to generate our telemetry.
We're also using the OpenCensus log handler for Azure to enable any logs we generate to be streamed directly into a central instance of Application Insights on Azure. We'll show you that later on.
The connection to Application Insights is a secret which we have stored in another Azure service called Key Vault.
We use this very handy "notebook utils" utility to connect to Key Vault and retrieve the secret.
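
As a rough sketch of the secret retrieval step just described, the snippet below wraps the lookup in a small helper. The helper name `get_connection_string`, the vault URL and the secret name are hypothetical and not taken from the video. It assumes the `notebookutils.credentials.getSecret` utility, which is only available inside the Fabric notebook runtime, so the helper also accepts a substitute callable for use elsewhere.

```python
def get_connection_string(vault_url, secret_name, get_secret=None):
    """Return the Application Insights connection string held in Key Vault.

    vault_url and secret_name are placeholders supplied by the caller.
    get_secret lets you swap in another lookup outside Fabric (assumption).
    """
    if get_secret is None:
        # Assumption: running inside a Fabric notebook, where the
        # notebookutils module is provided by the runtime.
        import notebookutils
        get_secret = notebookutils.credentials.getSecret
    return get_secret(vault_url, secret_name)


# Example outside Fabric, using a stand-in lookup instead of Key Vault:
fake_vault = {"appinsights-connection": "InstrumentationKey=<your-key>"}
connection_string = get_connection_string(
    "https://<your-vault>.vault.azure.net/",
    "appinsights-connection",
    get_secret=lambda url, name: fake_vault[name],
)
```

Keeping the connection string in Key Vault means it never has to appear in the notebook source itself.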

Once we've done that, we can then set up the logger. We've got a function here called "create logger" - and what we do in there is:

Make sure the name of the source of the logs is the same as the name of the notebook from which the log has been generated. So it creates that traceability.
We also set a consistent logging level. In this case, we've chosen to log up to INFO.
We set up the Azure log handler so that all of the logs are being streamed into that central instance of Application Insights on Azure.
Finally we set up a consistent format for the log messages so that, wherever we're logging from, it's got the same look and feel.
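
The steps above could be sketched roughly as follows. This is not the code shown in the video: the format string, the parameter names and the console handler are assumptions added for illustration. The Azure handler is the `AzureLogHandler` class from the `opencensus-ext-azure` package, and it is only attached when a connection string is supplied, so the function also works without that package installed.

```python
import logging

# Assumed format; the video does not show the exact format string.
LOG_FORMAT = "%(asctime)s | %(name)s | %(levelname)s | %(message)s"


def create_logger(notebook_name, connection_string=None, level=logging.INFO):
    """Create a logger named after the calling notebook."""
    logger = logging.getLogger(notebook_name)
    logger.setLevel(level)

    # Re-running a notebook cell would otherwise add duplicate handlers.
    if logger.handlers:
        return logger

    formatter = logging.Formatter(LOG_FORMAT)

    # Console output for local visibility (an addition, not from the video).
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)
    logger.addHandler(console_handler)

    if connection_string:
        # Imported here so the package is only needed when streaming to Azure.
        from opencensus.ext.azure.log_exporter import AzureLogHandler

        azure_handler = AzureLogHandler(connection_string=connection_string)
        azure_handler.setFormatter(formatter)
        logger.addHandler(azure_handler)

    return logger


logger = create_logger("Validate Location")
logger.info("Logger created")
```

Naming the logger after the notebook is what lets you trace each record in Application Insights back to its source.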

Then we can put this into use. So let's move over to one of our notebooks, in this case we're looking at the "Validate Location" notebook. We've looked at this in the past - if you want to see more details about this, go and have a look at some of the previous videos.

But the purpose of this notebook is to read raw data from the bronze area of the lakehouse, validate it, clean it, standardise it, and then write it to the silver area of the lakehouse.

And what we do here is to use the functionality I showed you earlier. We use the %run magic command to run that utility logging notebook, and that gives us access to that "create logger" function.

We use this to set up a logger, and all that complexity is abstracted away. We know that if we're using this utility "create logger" method, we're going to be:

Streaming to Application Insights.
Our logs will automatically contain the name of this notebook.
Logs will be in that sort of consistent format that we want to see across all of our logging.

Then we can see how we use this at various points in this notebook.

Here's a great example where we're logging inside a try/except structure to catch any exceptions we encounter. In this case, loading the raw data from bronze.

Just to add that sort of extra context and information, we've got a log there. Even if the process succeeds, we also generate telemetry just to enhance that overall observability of the process.
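
A minimal sketch of this pattern, using only the standard library, might look like the following. The function name `load_raw_data`, the `reader` argument and the file path are hypothetical stand-ins for whatever the notebook actually uses to read from bronze.

```python
import logging

logger = logging.getLogger("Validate Location")
logger.setLevel(logging.INFO)


def load_raw_data(reader, path):
    """Load raw data, logging both the failure and the success case."""
    try:
        data = reader(path)
    except Exception:
        # logger.exception logs at ERROR level and attaches the traceback.
        logger.exception("Failed to load raw data from %s", path)
        raise
    # Telemetry on success too, so a healthy run is also visible.
    logger.info("Loaded %d rows from %s", len(data), path)
    return data


# Example with a stand-in reader that returns three rows:
rows = load_raw_data(lambda p: ["row1", "row2", "row3"], "Files/bronze/example.csv")
```

Re-raising after logging keeps the failure visible to whatever is orchestrating the notebook, while the log record carries the detail needed for troubleshooting.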

And then if we navigate now over to Application Insights on Azure and we search for these transactions in Azure, we can see that they've been captured centrally:

They've all got the consistent format.
All the logs are captured here, with a severity level of Information, which is what we chose when we created these logs.

That provides the continuous monitoring and feedback that we're looking for.

It allows us to ensure that our data products are performing well.

It allows us to identify issues that may be cropping up early and make, you know, informed decisions for ongoing continuous improvement.

As you've seen, this approach is founded on integrating Fabric with some underlying services on Azure. In this case, Application Insights and Key Vault.

So, in the next video, what we're gonna do is show you how to deploy these resources by applying another DataOps principle, which is infrastructure as code.

So that's it for now.

Please don't forget to hit like if you've enjoyed this video.

Subscribe to our channel if you want to keep following our content.

Thanks very much for watching. Bye bye.