Blog

What is the Medallion Architecture?
The Medallion Architecture consists of three data tiers: Bronze (raw), Silver (clean), and Gold (projected). Data moves through these three tiers and becomes more opinionated at each stage.
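
As a rough illustration of the three tiers, here is a minimal sketch in plain Python. The sales records, field names, and the `to_silver` / `to_gold` helpers are all hypothetical, invented for this example; a real implementation would more likely use Spark or a similar engine over lake storage.

```python
from collections import defaultdict

# Bronze: hypothetical raw records, kept exactly as they arrived (messy strings).
bronze = [
    {"region": " North ", "amount": "10.50"},
    {"region": "south", "amount": "4.00"},
    {"region": "North", "amount": "not-a-number"},  # unparseable record
    {"region": "South", "amount": "6.00"},
]


def to_silver(rows):
    """Silver: clean and type the raw rows, dropping records that fail to parse."""
    silver = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # in practice you might quarantine these instead
        silver.append({"region": row["region"].strip().title(), "amount": amount})
    return silver


def to_gold(rows):
    """Gold: project clean rows into a report-ready shape (total per region)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each step bakes in more decisions (how to normalise names, what counts as invalid, which aggregate matters), which is one way to read "more opinionated at each stage".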

Learning from Disaster - A Creative Walkthrough of the Titanic Power BI Report
In his final, posthumously published blog post, Paul Waller gives a creative walkthrough of the Titanic Power BI Report he created with Barry Smart.

Retrospecting on my career at endjin
Liam joined endjin as part of the Software Engineering Apprenticeship 2021 cohort. In this post he looks back on his time at endjin before moving on.

How to Build Mobile Navigation in Power BI
This guide covers designing mobile navigation in Power BI, including form, icons, states, and actions, with a view to enhancing report design and UI.

How do Data Lakehouses Work? An Intro to Delta Lake
With new technologies, such as Delta Lake and other open table formats, there have been huge improvements in the performance of Data Lakehouses. But what is Delta Lake and how does it work?

What is a Data Lakehouse?
What exactly is a Data Lakehouse? This post gives a general introduction to Data Lakehouses: their history, their functionality, and what they might mean for you!

DuckDB in Practice: Enterprise Integration and Architectural Patterns
Learn how to integrate DuckDB into enterprise environments, including Microsoft Fabric deployment, and explore the architectural patterns it enables for modern data processing workflows.

DuckDB in Depth: How It Works and What Makes It Fast
Dive deep into the technical details of DuckDB, exploring its columnar architecture, vectorized execution, SQL enhancements, and the performance optimizations that make it exceptionally fast on a single machine.

DuckDB: the Rise of In-Process Analytics and Data Singularity
Explore the concept of the 'data singularity' and how in-process analytics tools like DuckDB are transforming how we work with data by leveraging modern hardware capabilities.

Creating Quality Gates in the Medallion Architecture with Pandera
This blog explores how to implement robust validation strategies within the medallion architecture using Pandera, helping you catch issues early and maintain clean, trustworthy data.

What are record types in C# / .NET?
Records are primarily meant for representing data. They are usually immutable and allow you to copy, equate, and print object properties.

C# 12.0: ref readonly
C# 12.0 adds a new way to annotate parameters: ref readonly. This seems like it should mean exactly the same as the older in annotation. This post explains why this new syntax is useful.

Encoding categorical data for Power BI: Using label encoded data vs one-hot encoded data in Power BI
Understand why label encoding is the preferred technique for encoding categorical data for analysis in Power BI over one-hot encoding.

Co/Contravariance in C# Interfaces
This post explains how covariance and contravariance in C# interfaces work.

Encoding categorical data for Power BI: Label encoding vs one-hot encoding - which encoding technique to use?
One-hot encoding and label encoding are two methods used to encode categorical data. Understand the specific advantages and disadvantages of these techniques.
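
To make the two techniques concrete, here is a minimal sketch in plain Python. The `label_encode` and `one_hot_encode` helpers and the colour data are hypothetical, written only for this illustration; they are not taken from the posts, which work in Power BI.

```python
def label_encode(values):
    """Map each distinct category to an integer code, in order of first appearance."""
    codes = {}
    for value in values:
        codes.setdefault(value, len(codes))
    return [codes[value] for value in values], codes


def one_hot_encode(values):
    """Produce one 0/1 indicator column per distinct category."""
    categories = sorted(set(values))
    rows = [{c: int(value == c) for c in categories} for value in values]
    return rows, categories


colours = ["red", "green", "red", "blue"]

labels, mapping = label_encode(colours)
one_hot_rows, columns = one_hot_encode(colours)

print(labels)        # a single integer column
print(one_hot_rows)  # one indicator column per category
```

Label encoding keeps the data in a single column plus a lookup table, whereas one-hot encoding adds a column for every category, so the table widens as the number of categories grows.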

Power BI Images That Pop: A Guide to Intuitive, Easy-to-Maintain Reports
Explore integrating icons, pictograms and images into Power BI in the optimal way to enhance the user experience and minimise effort required to build and maintain reports.

Spark dev containers: packaging code for testability
Once you've thoroughly tested your code against the local Spark service in your dev container, you'll want to run it in a real Spark cluster. This post shows how to deploy such code to Microsoft Fabric.

Spark dev containers: writing tests
Having seen earlier in the series how to configure a dev container to run Spark locally, this post shows how to write tests that use that local Spark service.

Spark dev containers: running Spark locally
See how to configure a dev container to run Spark locally, to improve development feedback loops.

Working locally with Spark dev containers
Running Spark locally in a dev container can significantly improve development feedback loops. This first article explains why, and the rest of the series will show how.

C# 12.0: collection expressions
C# 12.0 provides a new, simpler syntax for initializing collections. It typically generates the most efficient code possible, although as you'll see, it's useful to understand the choices it makes.

Why Power BI developers should care about the Tabular Model Definition Language (TMDL)
Power BI's adoption of TMDL improves the readability of the semantic model, enables version control and enhances collaboration and efficiency for developers.

Women of Silicon Roundabout: Day 2
Women of Silicon Roundabout is the UK's largest women in tech event. Day two topics included: green tech, burnout, and Python!

Women of Silicon Roundabout: Day 1
Women of Silicon Roundabout is the UK's largest women in tech event. Day one topics included: AI, career pathways, and generations of Women in Tech.

C# 12.0: inline arrays
A new feature in C# 12.0 enables data types to define fixed-size arrays that don't require separate array objects on the heap. Learn how this is useful in performance-oriented and interop scenarios.