Microsoft Fabric - Inspecting a 28 million row dataset
Tutorial
In this video Ed Freeman continues the Microsoft Fabric End-to-End demo series by looking at the dataset we'll be using, and the problem statement we're trying to solve.
For a data platform, we need some data! In this series we're going to be using Land Registry data, published by HM Land Registry, the UK government body that registers the ownership of land and property in England and Wales. The dataset is almost 5GB in size, and provides different types of files for complete or incremental processing. This will allow us to benefit from UPSERT-like functionality enabled by #DeltaLake without having to load all the data every time we receive new information.
In this video we'll take a quick look at the data, where it comes from and what format it's in, and we'll also frame up the insights we're aiming to achieve from the analysis taking place in this series. We'll finish by stepping through a sample architecture diagram - a powerful way to visualize a complex data platform at a high level.
The talk contains the following chapters:
- 00:00 Introduction
- 00:17 Sample data introduction
- 00:58 Sample data inspection
- 03:54 Insight Discovery and Defining Goals
- 06:46 Fabric architecture walkthrough
- 11:32 Outro
Useful links:
Microsoft Fabric End to End Demo Series:
- Part 1 - Lakehouse & Medallion Architecture
- Part 2 - Plan and Architect a Data Project
- Part 3 - Ingest Data
- Part 4 - Creating a shortcut to ADLS Gen2 in Fabric
- Part 5 - Local OneLake Tools
- Part 6 - Role of the Silver Layer in the Medallion Architecture
- Part 7 - Processing Bronze to Silver using Fabric Notebooks
- Part 8 - Good Notebook Development Practices
From Descriptive to Predictive Analytics with Microsoft Fabric:
- Part 1 - Overview
- Part 2 - Data Validation with Great Expectations
- Part 3 - Testing Notebooks
- Part 4 - Task Flows
- Part 5 - Observability
Microsoft Fabric First Impressions:
Decision Maker's Guide to Microsoft Fabric
and find all the rest of our content here.
Transcript
In this video, we're going to take a peek at the four gigabyte dataset that forms the foundation of the data pipeline that we're going to be building in this demo series. Then we're going to outline the goals and insights we actually want to achieve off the back of this data. So let's get started.
Sample Data Introduction
So the data we're going to be using is called Land Registry data, and essentially it registers the ownership of land and property in England and Wales. So it gives information like the price that's been paid, the location of the property, what style of property it is, whether it's a commercial or residential property, et cetera, et cetera. And it actually has quite a good amount of data, as you can imagine. So I think it's roughly 28 million rows. It goes back to 1995, all the way until the current month.
Sample Data Inspection
So if we actually look at what this data looks like—well, here I am on the British Government's website, and we have this price paid data, this dataset, and they provide kind of a whole portal for this that we can go in and find out more about the data. I won't go through every part of this, but what we'll see is we get monthly data. So we've got April 2023, which is the most recent month at the time of recording.
And then we also have a single file, which actually has the whole historical dataset, which as you can see is over four gigabytes—actually a little bit more than what it says here, but it's between four and five gigabytes. So not a huge amount of data by today's standards, but still a good amount. And you generally need some more heavyweight cloud processes to transfer and wrangle this type of data.
Now this dataset also comes with kind of a "how to use it" guide, and that gives us information about the schema. So actually this dataset itself doesn't have a header row, so we need to know which columns relate to what, and we have some information about each of the columns and the explanation. Particularly interesting is this record status column, which actually is only for the monthly file, but that gives us an indication of what's been added, what's been changed, what's been deleted. And as you can imagine, with regards to data warehousing in our lakehouse, we want to be able to perform upserts or merges, if you will, where we can add the data that's new, change the data that exists but has been updated, and delete the data that needs to be deleted. So this will help us with that control flow, that logic of deleting those rows or updating those rows.
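In a Fabric notebook this control flow would typically be expressed as a Delta Lake MERGE, but the logic the record status column drives can be sketched in plain Python. This is a minimal illustration, assuming the status codes described in the guide ("A" = added, "C" = changed, "D" = deleted); the table shape and field names here are invented for the example:

```python
# Sketch of the upsert/delete logic driven by the record status column.
# In Fabric this would normally be a Delta Lake MERGE in a Spark notebook;
# plain Python is used here so the control flow is easy to follow.
# Status codes per the dataset guide: A = add, C = change, D = delete.

def apply_monthly_update(table: dict, monthly_rows: list) -> dict:
    """table maps transaction id -> row; monthly_rows each carry a 'status' key."""
    for row in monthly_rows:
        tx_id, status = row["id"], row["status"]
        if status in ("A", "C"):
            # Insert new rows and overwrite changed ones (the "upsert" half).
            table[tx_id] = {k: v for k, v in row.items() if k != "status"}
        elif status == "D":
            # Remove transactions flagged as deleted.
            table.pop(tx_id, None)
    return table

# Hypothetical existing data plus one monthly file with an add, a change and a delete.
existing = {
    "T1": {"id": "T1", "price": 243000},
    "T3": {"id": "T3", "price": 99000},
}
update = [
    {"id": "T2", "price": 180000, "status": "A"},
    {"id": "T1", "price": 250000, "status": "C"},
    {"id": "T3", "price": 0, "status": "D"},
]
result = apply_monthly_update(existing, update)
```

The Delta Lake equivalent would match on the transaction identifier and use `whenMatchedUpdate`, `whenMatchedDelete` and `whenNotMatchedInsert` clauses to the same effect.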
If I quickly actually open up one of these files—I'm not going to open up the 4.5 gigabyte file because I'll make this video very long—but if I open up this text file, for example, and I zoom in a couple of times, you'll be able to see that we have varying styles of data here. So we have just strings. This is an interesting unique identifier that's actually wrapped in curly braces. So we might want to do something with that further downstream. We've got numerical data, so that's just our price paid—so 243,000. Everything will be in pounds because we're talking about England and Wales here. And we've got some dates, and then we've got some normal string fields. So we've got a good amount of data here, different varying data types.
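Because the file has no header row, any processing has to supply column names from the "how to use" guide. A minimal sketch of reading one line, using an abbreviated, illustrative subset of the documented schema and an invented sample row:

```python
import csv
import io

# No header row in the file, so names come from the dataset guide.
# This is a shortened, illustrative subset of the documented columns.
COLUMNS = ["transaction_id", "price", "date_of_transfer", "postcode",
           "property_type", "old_new", "duration"]

# A made-up line in the same shape as the real data
# (note the unique identifier wrapped in curly braces).
sample = ('"{0A1B2C3D-1111-2222-3333-444455556666}","243000",'
          '"2023-04-21 00:00","AB1 2CD","D","N","F"')

row = next(csv.reader(io.StringIO(sample)))
record = dict(zip(COLUMNS, row))
record["transaction_id"] = record["transaction_id"].strip("{}")  # drop the braces
record["price"] = int(record["price"])  # numeric column arrives as a string
```

Stripping the curly braces and casting the price are exactly the kind of light cleanup we'd push further downstream into the silver layer.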
So this will all be really interesting and a common use case for a sample data pipeline that we're going to be building in this series.
Insight Discovery and Defining Goals
So let's take a step back though. What problems are we wanting to solve? Because in a modern data project, you want to start right to left. So you want to start with discovering the insights that you want to glean off of your data, and then work backwards to really understand what data it is you need to be able to get to those insights.
Now, in this instance, the dataset is not huge. It's not like we're having to trawl through loads and loads of datasets to try and find what we need and the columns that we need. It's only a relatively small dataset. But oftentimes, we've found this with our clients: they have a huge data estate, a huge data landscape, and starting from the data and then working towards trying to figure out what the insights are is the wrong way around. You need to start by understanding what's going to benefit the business.
So in our case here—again, this is just a sample—but we want to be looking for things like spikes and anomalies. So can we attribute certain strangeness, so to speak, in our data to actual events that may have happened? And I'm thinking things like Covid in the last few years, or the recession back in 2008, other national or international events that might have had an impact on people's propensity to buy, the prices that might have been paid at certain points in time. So that's one thing that we'd like to find out from this data.
Now there's also the geographical element. So this data comes with addresses for all of the transactions that have occurred over the last 28 years or so. So we want to figure out what the regional differences are, even on a city by city basis. You can imagine that London, being the capital city, tends to have inflated prices compared to smaller towns, or potentially even the north of England versus the south of England. So we want to see where's cheaper to buy, or see if there's any indication of which region has had the fastest-growing property market over the last however many years. So insights like that will be useful.
And another thing that we can do pretty easily with this type of data is trend analysis. So what can we expect next year from region to region or from property type to property type? That's going to be some interesting analysis.
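The city-by-city comparison described above boils down to a group-and-aggregate over the price paid rows. A tiny sketch with invented values (in practice this would be a Spark or SQL aggregation over the full 28 million rows):

```python
from collections import defaultdict
from statistics import mean

# Illustrative rows in the shape of the price paid data (values invented).
sales = [
    {"town": "LONDON", "year": 2022, "price": 650000},
    {"town": "LONDON", "year": 2022, "price": 710000},
    {"town": "LEEDS",  "year": 2022, "price": 230000},
    {"town": "LEEDS",  "year": 2022, "price": 210000},
]

# Average price paid per town -- the regional comparison described above.
by_town = defaultdict(list)
for sale in sales:
    by_town[sale["town"]].append(sale["price"])
avg_by_town = {town: mean(prices) for town, prices in by_town.items()}
```

Grouping by year as well as town would give the time series needed for the spike/anomaly and trend analyses.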
So that's really the overview of the problem statement, the goals that we're trying to achieve with this.
Fabric Architecture Walkthrough
But let's take a look at an architecture diagram. So how will this fit into an actual Fabric architecture?
So if I go over to this tab here—now, you can't see much of this at the moment, but this is the higher level view, and I'm going to zoom in to various elements of this.
So we're starting off with a Fabric workspace. That's going to be orchestrated by an end-to-end orchestration pipeline, which is a Data Factory pipeline that's probably going to have a few notebooks, potentially even updates of datasets, for example. We will see. But also ingesting the data itself.
So our data sources are going to be twofold. We're going to have the HM Land Registry data that I just showed you, but we're also going to be creating a shortcut—a OneLake shortcut—to some reference data that we've got stored in ADLS Gen 2. And we'll see how we can retrieve that data in a couple of episodes' time.
But the pipeline is going to ingest that data from the web, and it's going to ingest this data from the shortcut—i.e., not actually do any ingesting of data. The shortcut is essentially virtualisation over an area of a data lake, so we don't really need to do anything there.
But then we have this bronze zone, and hopefully you've watched the previous episode where we talked about the medallion architecture and we talked about lakehouses. So essentially the bronze zone is where we want to store the raw copy of the data, or in the case of the shortcut, we just want to store the pointer to that data in a directory structure that we specified for the bronze lakehouse.
So actually the bronze lakehouse, if we remember back to the previous episode—we have our lakehouse split into two areas: tables and files. The tables are going to store our managed Delta tables that are generally created through Spark or Dataflows Gen2, for example. But we're not going to have any managed Delta tables at this level. Instead, we only care about storing those CSV files in our data lake so that we can query them in a subsequent artefact.
And that subsequent artefact is going to be a handful of notebooks, and these notebooks are going to take that data and actually transform them into the lakehouse where they are going to be turned into managed Delta tables. So we're going to have a couple of tables: price paid data and postcode directory. So the price paid data is generally coming from the Land Registry data that I just showed you, and then the postcode directory is going to come from that shortcut. But these are now queryable Delta tables that we can perform queries on in our lakehouse.
And the final step is to project to gold. So we want to create the house purchase projections, which means doing some dimensional modelling—so creating a star schema, so to speak. So in our gold lakehouse, we have our house prices, dates, locations, and property types. So that's where we've optimised for the reporting, the serving layer, so to speak.
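The dimensional-modelling step can be sketched as splitting the flat silver rows into a fact table plus dimension tables with surrogate keys. This is a minimal illustration; the table and column names are invented, not the exact gold-layer schema from the series:

```python
# Sketch of the silver-to-gold star-schema step: flat rows become a fact
# table referencing a location dimension via surrogate keys.
# Row shape and names are illustrative, not the series' actual schema.

silver_rows = [
    {"price": 243000, "town": "LONDON", "property_type": "D"},
    {"price": 180000, "town": "LEEDS",  "property_type": "S"},
    {"price": 310000, "town": "LONDON", "property_type": "F"},
]

dim_location = {}        # town -> surrogate key
fact_house_prices = []
for row in silver_rows:
    # Assign each distinct town a surrogate key the first time it appears.
    key = dim_location.setdefault(row["town"], len(dim_location) + 1)
    fact_house_prices.append({"location_key": key, "price": row["price"]})
```

The same pattern, repeated for dates and property types, yields the star schema the reporting layer queries against.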
And then actually, with the beauty of Direct Lake, this arrow here that goes to the semantic model—this is our Power BI dataset. There's no ingestion going on there; the tables in the Power BI dataset point directly at those files in the lake.
So we don't really need to do anything there other than the things that we care about in our semantic model. And that's the relationships between those tables, the building of measures or calculated columns, for example, and all the semantics around naming. This is meant to be an end-consumer artefact. We need to make sure that the naming is as we would like it to be for people who are trying to navigate this dataset, or reports built off this dataset, and find their insights.
And obviously what builds on those datasets are Power BI reports and all the associated artefacts with that—like apps, dashboards, just individual pages and visualisations.
So that's our architecture. It's a pretty standard medallion architecture. If I zoom back out so we see the higher level colours and shapes rather than the text itself, we have our bronze area, our silver area, and our gold area. All serving different purposes, but it's helping us separate and segregate that data in different states, so that our data pipeline can be manageable, maintainable, and modular, so that we can develop it over time.
Outro
So that's it for this video. This has just been an introductory video to the actual problem statement, the goals that we're trying to achieve in this demo series. But stay tuned for the next episode. We're actually going to be diving into Fabric and ingesting that data from the web source. So we've got 4.3 gigabytes of data that we want to ingest into our OneLake lakehouse. How is that going to perform? Well, come to the next episode and figure that one out. See you then.