Blog

DuckLake in Perspective: Advanced Features and Future Implications
Explore DuckLake's advanced capabilities including built-in encryption, sophisticated conflict resolution, and the strategic implications for future data architecture. Understand how DuckLake enables new business models and positions itself against established lakehouse formats.

DuckLake in Practice: Hands-On Tutorial and Core Features
Get hands-on with DuckLake through a comprehensive tutorial covering installation, basic operations, file organization, snapshots, and time travel functionality. Learn how DuckLake's database-backed metadata management works in practice.

Introducing DuckLake: Lakehouse Architecture Reimagined for the Modern Era
DuckDB Labs introduces DuckLake, a revolutionary approach to lakehouse architecture that solves fundamental problems with existing formats by bringing database principles back to data lake metadata management.

Reading structured data from SharePoint in Synapse Pipelines
This post describes an approach to copy files and data from SharePoint into Azure using Synapse Pipelines.

Synapse & Service Principal SharePoint Integration
The interactive notebook shared in this post defines the process of granting Service Principals (including Synapse managed identities) access to SharePoint sites.

.NET Aspire: SQL persistence
.NET Aspire can create new local instances of services such as SQL Server from scratch each time you run. While this guarantees repeatability and isolation, it can be time-consuming, so this post explores alternatives.

.NET Aspire: using SqlConnection in integration tests
.NET Aspire can create all of the resources an integration test requires, but the test will often need direct access to the same SQL database as the code under test. This post shows how to do that.

.NET Aspire SQL Server integration tests and local development
.NET Aspire can create local services such as a SQL Server to stand in for cloud resources under local development and testing. This post shows how to ensure such a database is suitably initialized.

.NET Aspire SQL Server integration tests
.NET Aspire offers features enabling integration tests to use dev-time orchestration. This post shows how a test can use these.

.NET Aspire dev-time orchestration for SQL Server integration tests
.NET Aspire's dev-time orchestration can be used to build integration tests that depend on a SQL Server database. This blog series explains how.

What is the Medallion Architecture?
The Medallion Architecture consists of three data tiers: Bronze (raw), Silver (clean), and Gold (projected). Data moves through these three tiers and becomes more opinionated at each stage.

Learning from Disaster - A Creative Walkthrough of the Titanic Power BI Report
In his final, posthumously published blog post, Paul Waller takes you through a creative walkthrough of the Titanic Power BI Report he created with Barry Smart.

How to Build Mobile Navigation in Power BI
This guide covers designing mobile navigation in Power BI, including form, icons, states, and actions, with a view to enhancing report design & UI.

Retrospecting on my career at endjin
Liam joined endjin as part of the Software Engineering Apprenticeship 2021 cohort. In this post he looks back on his time at endjin before moving on.

How do Data Lakehouses Work? An Intro to Delta Lake
With new technologies - such as Delta Lake and other open table formats - there have been huge improvements in the performance of Data Lakehouses. But what is Delta Lake and how does it work?

What is a Data Lakehouse?
What exactly is a Data Lakehouse? This blog gives a general introduction to their history, functionality, and what they might mean for you!

DuckDB in Practice: Enterprise Integration and Architectural Patterns
Learn how to integrate DuckDB into enterprise environments, including Microsoft Fabric deployment, and explore the architectural patterns it enables for modern data processing workflows.

DuckDB in Depth: How It Works and What Makes It Fast
Dive deep into the technical details of DuckDB, exploring its columnar architecture, vectorized execution, SQL enhancements, and the performance optimizations that make it exceptionally fast on a single machine.

DuckDB: the Rise of In-Process Analytics and Data Singularity
Explore the concept of the 'data singularity' and how in-process analytics tools like DuckDB are transforming how we work with data by leveraging modern hardware capabilities.

Creating Quality Gates in the Medallion Architecture with Pandera
This blog explores how to implement robust validation strategies within the medallion architecture using Pandera, helping you catch issues early and maintain clean, trustworthy data.

What are record types in C# / .NET?
Records are primarily meant for representing data. They are usually immutable and allow you to copy, equate, and print object properties.

C# 12.0: ref readonly
C# 12.0 adds a new way to annotate parameters: ref readonly. This seems like it should mean exactly the same as the older in annotation. This post explains why this new syntax is useful.

Encoding categorical data for Power BI: Using label encoded data vs one-hot encoded data in Power BI
Understand why label encoding is the preferred technique for encoding categorical data for analysis in Power BI over one-hot encoding.

Co/Contravariance in C# Interfaces
This post explains how covariance and contravariance in C# interfaces work.

Encoding categorical data for Power BI: Label encoding vs one-hot encoding - which encoding technique to use?
One-hot encoding and label encoding are two methods used to encode categorical data. Understand the specific advantages and disadvantages of these techniques.