Microsoft Fabric
SQLBits 2026: A Conference Recap
SQLBits is one of the largest data platform conferences in Europe. Here's a recap of my experience at SQLBits 2026, held at the ICC Wales.
Fabric Performance Benchmarking - Spark versus Python Notebooks
Benchmarking Pandas, PySpark, Polars, and DuckDB on Microsoft Fabric: in-process Python engines run 4-5x cheaper and faster than Spark for common workloads.
Scaling API Ingestion with the Queue-of-Work Pattern
The queue-of-work pattern enables massive parallelism for API ingestion by breaking large jobs into thousands of independent work items processed by concurrent workers. This approach reduced data ingestion time for our use case from 15 hours to under 2 hours while providing automatic retry handling and fault tolerance at a fraction of the cost of traditional orchestration tools.
Building data quality into Microsoft Fabric
Data quality issues are one of the biggest silent killers of analytics initiatives. This post explores how to build data quality into Microsoft Fabric from the ground up.
Top Features of Notebooks in Microsoft Fabric
Lakehouse integration, built-in notebook resources, and collaboration features that set Microsoft Fabric notebooks apart from Jupyter and Databricks.
FabCon Vienna 2025: Day 3
FabCon is a conference dedicated to everything Microsoft Fabric. Day 3's sessions included migration, Databricks, Spark optimisation, and more.
FabCon Vienna 2025: Day 2
FabCon is a conference dedicated to everything Microsoft Fabric. Day 2 featured deep dives into OneLake, Maps in Fabric, and multi-agent AI systems.
Batch Processing Triggered Pipeline Runs in Azure Synapse
Bursty event triggers in Azure Synapse can fire the same pipeline many times in quick succession. A batched-trigger orchestrator collapses them into a single run.
Reliably refreshing a Semantic Model from Microsoft Fabric Pipelines
This post describes a pattern for reliably refreshing Power BI semantic models from Microsoft Fabric Pipelines.
FabCon Vienna 2025: Day 1
FabCon is a conference dedicated to everything Microsoft Fabric. Day 1 focused mostly on the hundreds of new feature announcements.
Writing structured data to SharePoint from Synapse Notebooks
Write data back to SharePoint from Synapse Notebooks using PySpark, the Microsoft Graph API, and Service Principal auth — Drive IDs, tokens, and upload patterns.
Reading structured data from SharePoint in Synapse Notebooks
This post describes an approach to copy files and data from SharePoint into Azure using Synapse Notebooks.
DuckLake in Perspective: Advanced Features and Future Implications
Explore DuckLake's advanced capabilities including built-in encryption, sophisticated conflict resolution, and the strategic implications for future data architecture. Understand how DuckLake enables new business models and positions itself against established lakehouse formats.
DuckLake in Practice: Hands-On Tutorial and Core Features
Get hands-on with DuckLake through a comprehensive tutorial covering installation, basic operations, file organization, snapshots, and time travel functionality. Learn how DuckLake's database-backed metadata management works in practice.
Introducing DuckLake: Lakehouse Architecture for the Modern Era
DuckDB Labs introduces DuckLake, a revolutionary approach to lakehouse architecture that solves fundamental problems with existing formats by bringing database principles back to data lake metadata management.
Reading structured data from SharePoint in Synapse Pipelines
This post describes an approach to copy files and data from SharePoint into Azure using Synapse Pipelines.
Synapse & Service Principal SharePoint Integration
The interactive notebook shared in this post walks through the process of granting Service Principals (including Synapse managed identities) access to SharePoint sites.
DuckDB in Practice: Enterprise Integration and Architectural Patterns
DuckDB comes pre-installed in Microsoft Fabric Python notebooks, so code developed locally deploys straight to production with enterprise monitoring, governance, and OneLake integration.
DuckDB in Depth: How It Works and What Makes It Fast
Dive deep into the technical details of DuckDB, exploring its columnar architecture, vectorized execution, SQL enhancements, and the performance optimizations that make it exceptionally fast on a single machine.
DuckDB: the Rise of In-Process Analytics and Data Singularity
Modern laptops can now handle datasets up to a billion rows, yet 94% of query spending goes on big-data compute that isn't needed. DuckDB brings analytical SQL directly into your process.
Working locally with Spark dev containers
Running Spark locally in a dev container can significantly improve development feedback loops. This first article explains why, and the rest of the series will show how.
Carbon Optimised Data Pipelines: Next Steps
Extending carbon-optimised pipelines: choose between Azure regions at runtime, work around Wait activity limits, and adapt the pattern beyond the UK.
Carbon Optimised Data Pipelines: Pipeline Definition
A portable Data Factory, Synapse, or Fabric pipeline that calls the Carbon Intensity API and waits for the greenest scheduling window — no custom code.
Carbon Optimised Data Pipelines: Architecture Overview
Translating carbon-optimised scheduling into a modern data pipeline architecture for Microsoft Fabric, Azure Synapse and Azure Data Factory.
Carbon Optimised Data Pipelines: Introduction
Cloud data pipelines often have flexibility in when they run. Using the UK National Grid Carbon Intensity API, you can schedule them for the greenest window.
Per-Property Rows from JSON in Spark on Microsoft Fabric
Spark doesn't always interpret JSON how we'd like. For example, if each key/value pair in a JSON object is conceptually one item, Spark won't give you a row per item by default. This article shows how to nudge Spark in the right direction.
Introduction to Python Logging in Synapse Notebooks
The first step on the road to implementing observability in your Python notebooks is basic logging. In this post, we look at how you can use Python's built-in logging module inside a Synapse notebook.
Star Schemas: unleashing value from data in Microsoft Fabric
Ralph Kimball's 1996 Star Schema principles still underpin modern cloud-native analytics — why dimensional modelling unlocks value from data in Microsoft Fabric.
Adopt A Product Mindset To Maximise Value From Microsoft Fabric
Treating data as a product turns data teams from order takers into innovation engines. A product mindset gives you a framework to fail fast, build user empathy, and focus resources on high-value work.
Exploring Strategies Enabled By Microsoft Fabric
Explore building situational awareness and leveraging strategic opportunities with Microsoft Fabric in this concise overview.
Developing a Data Mesh Inspired Vision Using Microsoft Fabric
Explore how Microsoft Fabric, which draws on Data Mesh ideas, can support a data-driven strategy, and learn how to approach a Data Mesh vision with it.
How Does Microsoft Fabric Measure Up To Data Mesh?
Explore Data Mesh's influence on Microsoft Fabric, addressing gaps in data product marketplace, standards, master data management, and governance.
Microsoft Fabric Is A Socio-Technical Endeavour
Creating a successful organisation-wide data and analytics platform isn't just about architecture, schemas and semantic models. It's also about culture, organisational design and people. This blog explores the socio-technical nature of data and analytics and how this should influence your approach to adoption of Microsoft Fabric.
Copilot: Unleashing AI in Self-Service Analytics
Explore AI-powered self-service reporting with tools like Copilot in Power BI and Microsoft Fabric, balancing benefits and pitfalls.
Microsoft Fabric: Announced
Microsoft Fabric unifies Power BI, Data Factory and Data Lake on Synapse infrastructure, reducing cost and time while enabling citizen data science.
What is OneLake?
Explore OneLake, Microsoft Fabric's core storage layer for data in Azure and other clouds, and the OneDrive equivalent for data storage. Discover its role in Fabric workloads.
Azure Synapse Analytics vs Microsoft Fabric: side-by-side comparison
In this Microsoft Fabric vs Synapse comparison we examine how features map from Azure Synapse to Fabric.
Intro to Microsoft Fabric
Microsoft Fabric unifies data and analytics, building on Azure Synapse Analytics for improved data-level interoperability. Explore its offerings and their pros and cons.