Data

The Data Product Canvas: Deep Dive into the Building Blocks
Explore the nine building blocks that make up the Data Product Canvas. Learn how to approach each component to design data products that deliver real value and avoid common pitfalls.

The Data Product Canvas: Stop Building Data Products That Fail
Turn data initiatives into business success stories with the Data Product Canvas. This practical framework helps teams design data products that deliver real value, avoid common pitfalls, and align with business objectives.

Big Data London 2025
AI agents dominated Big Data LDN 2025, but the real story wasn't the technology; it was which organisations could actually deploy it successfully. After five years of tracking industry evolution through this event, one pattern emerged clearly: the winners had built their foundations first. For CTOs making platform decisions now, the strategic imperative isn't choosing between innovation and governance; it's recognising that governance enables innovation at scale.

FabCon Vienna 2025: Day 3
FabCon is a conference dedicated to everything Microsoft Fabric. Day 3's sessions included migration, Databricks, Spark optimisation, and more.

FabCon Vienna 2025: Day 2
FabCon is a conference dedicated to everything Microsoft Fabric. Day 2 featured deep dives into OneLake, Maps in Fabric, and multi-agent AI systems.

FabCon Vienna 2025: Day 1
FabCon is a conference dedicated to everything Microsoft Fabric. Day 1 focused mostly on the hundreds of new feature announcements.

What is the Medallion Architecture?
The Medallion Architecture consists of three data tiers: Bronze (raw), Silver (clean), and Gold (projected). Data moves through these three tiers and becomes more opinionated at each stage.

Encoding categorical data for Power BI: Using label encoded data vs one-hot encoded data in Power BI
Understand why label encoding is preferred over one-hot encoding when preparing categorical data for analysis in Power BI.

Encoding categorical data for Power BI: Label encoding vs one-hot encoding - which encoding technique to use?
One-hot encoding and label encoding are two methods used to encode categorical data. Understand the specific advantages and disadvantages of these techniques.
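
To make the distinction concrete, here is a minimal plain-Python sketch of the two techniques. It is an illustration only, not code from the article: the function names `label_encode` and `one_hot_encode` and the sample `colours` list are hypothetical, and codes are assigned in order of first appearance as an assumption.

```python
def label_encode(values):
    """Map each distinct category to an integer code (order of first appearance)."""
    codes = {}
    for v in values:
        # Assign the next free integer the first time a category is seen.
        codes.setdefault(v, len(codes))
    return [codes[v] for v in values], codes


def one_hot_encode(values):
    """Produce one 0/1 indicator column per distinct category."""
    categories = list(dict.fromkeys(values))  # distinct values, order preserved
    return [{c: int(v == c) for c in categories} for v in values]


# Hypothetical sample data for illustration.
colours = ["red", "green", "red", "blue"]
labels, mapping = label_encode(colours)   # a single integer column
one_hot = one_hot_encode(colours)         # one column per category
```

Label encoding keeps a single column however many categories there are, while one-hot encoding adds a column per category.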

There's something wrong with the Pandas API on Spark
Fix the following issues: errors when converting large datasets to pandas, slow performance in pandas for Spark, and column reduction in pandas for Spark that doesn't reduce data.

Per-Property Rows from JSON in Spark on Microsoft Fabric
Spark doesn't always interpret JSON how we'd like. For example, if each key/value pair in a JSON object is conceptually one item, Spark won't give you a row per item by default. This article shows how to nudge Spark in the right direction.
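
As a rough picture of the target shape, the sketch below produces one row per key/value pair from a JSON object. It uses plain Python rather than Spark, so it shows only the desired output, not the article's method; the function name `per_property_rows`, the `key`/`value` column names, and the sample document are all hypothetical.

```python
import json


def per_property_rows(json_text):
    """Return one row (a dict) per key/value pair in a JSON object."""
    obj = json.loads(json_text)
    return [{"key": k, "value": v} for k, v in obj.items()]


# Hypothetical sample document: each property is conceptually one item.
doc = '{"sensor_a": 1.5, "sensor_b": 2.0}'
rows = per_property_rows(doc)
```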