Microsoft Fabric - Inspecting 28 Million row dataset
Tutorial
In this video Ed Freeman continues the Microsoft Fabric End-to-End demo series by looking at the dataset we'll be using, and the problem statement we're trying to solve.
For a data platform, we need some data! In this series we're going to be using data provided by the UK government's Land Registry, which registers the ownership of land and property in England and Wales. The dataset is almost 5GB in size, and is published as different types of files for either complete or incremental processing. This will allow us to benefit from the UPSERT-like functionality enabled by Delta Lake, without having to reload all the data every time we receive new information.
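To make the UPSERT idea concrete, here is a minimal sketch in plain Python of what applying an incremental file to an existing table means: rows whose key already exists are updated, and rows with a new key are inserted. This is only an illustration of the semantics, not Delta Lake code (in Delta Lake this would typically be expressed as a merge operation). The `upsert` helper, the `id` key, and the sample columns and values are all hypothetical and are not taken from the actual dataset.

```python
from typing import Dict, Iterable

Record = Dict[str, object]


def upsert(table: Dict[str, Record], changes: Iterable[Record], key: str = "id") -> Dict[str, Record]:
    """Return a new table with the changes applied.

    Rows whose key already exists are updated; rows with a new key are inserted.
    The input table is left untouched.
    """
    merged = dict(table)
    for row in changes:
        row_key = row[key]
        # Overlay the incoming row on top of any existing row with the same key.
        merged[row_key] = {**merged.get(row_key, {}), **row}
    return merged


# Hypothetical "full" load, keyed by a made-up transaction id.
existing = {
    "tx-1": {"id": "tx-1", "price": 250000, "postcode": "AB1 2CD"},
    "tx-2": {"id": "tx-2", "price": 410000, "postcode": "EF3 4GH"},
}

# Hypothetical incremental file: one corrected row and one brand new row.
incremental = [
    {"id": "tx-2", "price": 415000, "postcode": "EF3 4GH"},
    {"id": "tx-3", "price": 180000, "postcode": "IJ5 6KL"},
]

result = upsert(existing, incremental)
```

The point of the sketch is that only the two incremental rows are processed, rather than the whole table being rebuilt, which is the benefit the incremental files give us at a much larger scale.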
In this video we'll take a quick look at the data, where it comes from and what format it's in, and we'll also frame up the insights we're aiming to achieve from the analysis taking place in this series. We'll finish by stepping through a sample architecture diagram - a powerful way to visualize involved data platforms at a high level.
The video contains the following chapters:
- 00:00 Introduction
- 00:17 Sample data introduction
- 00:58 Sample data inspection
- 03:54 Insight Discovery and defining goals
- 06:46 Fabric architecture walkthrough
- 11:32 Outro
Useful links:
Microsoft Fabric End to End Demo Series:
- Part 1 - Lakehouse & Medallion Architecture
- Part 2 - Plan and Architect a Data Project
- Part 3 - Ingest Data
- Part 4 - Creating a shortcut to ADLS Gen2 in Fabric
- Part 5 - Local OneLake Tools
- Part 6 - Role of the Silver Layer in the Medallion Architecture
- Part 7 - Processing Bronze to Silver using Fabric Notebooks
- Part 8 - Good Notebook Development Practices
From Descriptive to Predictive Analytics with Microsoft Fabric:
- Part 1 - Overview
- Part 2 - Data Validation with Great Expectations
- Part 3 - Testing Notebooks
- Part 4 - Task Flows
- Part 5 - Observability
Microsoft Fabric First Impressions:
Decision Maker's Guide to Microsoft Fabric
Find all the rest of our content here.