Browse our archives by topic…
Editions
.NET Development
View all (360)
.NET Aspire: SQL persistence
.NET Aspire can create new local instances of services such as SQL Server from scratch each time you run. While this guarantees repeatability and isolation, it can be time consuming, so this post explores alternatives.

.NET Aspire: using SqlConnection in integration tests
.NET Aspire can create all of the resources an integration test requires, but the test will often need direct access to the same SQL database as the code under test. This post shows how to do that.

.NET Aspire SQL Server integration tests and local development
.NET Aspire can create local services such as a SQL Server to stand in for cloud resources under local development and testing. This post shows how to ensure such a database is suitably initialized.
Analytics
View all (155)
Reading structured data from SharePoint in Synapse Pipelines
This post describes an approach to copy files and data from SharePoint into Azure using Synapse Pipelines.

Synapse & Service Principal SharePoint Integration
The interactive notebook shared in this post defines the process of granting Service Principals (inc. Synapse managed identities) access to SharePoint sites.

What is the Medallion Architecture?
The Medallion Architecture consists of three data tiers: Bronze (raw), Silver (clean), and Gold (projected). Data moves through these three tiers and becomes more opinionated at each stage.
Apprenticeship
View all (58)
Retrospecting on my career at endjin
Liam joined endjin as part of the Software Engineering Apprenticeship 2021 cohort. In this post he looks back on his time at endjin before moving on.

Life as an Apprentice Engineer at endjin
Eli joined endjin as part of the Software Engineering Apprenticeship 2021 cohort. In this post she reflects on her first two years.

My year in industry as a whole
As Charlotte's placement comes to an end, she reflects on her year at endjin, highlighting the experiences she will take back to university with her.
Architecture
View all (71)
What are record types in C# / .NET?
Records are primarily meant for representing data. They are usually immutable and allow you to copy, equate, and print object properties.

Carbon Optimised Data Pipelines - minimise CO2 emissions through intelligent scheduling (Next Steps)
Intelligently scheduling cloud data pipelines based on carbon impact can optimize both environmental sustainability and operational efficiency.

Carbon Optimised Data Pipelines - minimise CO2 emissions through intelligent scheduling (Pipeline Definition)
Intelligently scheduling cloud data pipelines based on carbon impact can optimize both environmental sustainability and operational efficiency.
Automation
View all (6)
Power Query - Where can you use it? - Power BI
In this series of posts, we look at all the places where you can integrate Power Query as part of your data solutions. Here we look at Power BI.

Power Query - Where can you use it? - Microsoft 365
In this series of posts, we look at all the places where you can integrate Power Query as part of your data solutions. Here we look at Microsoft 365.

Using the Playwright C# SDK to automate 2FA authentication for AAD and MSA
Learn to configure AAD or MSA 2FA profiles for UI automation testing with Time-based One-Time Passwords.
Azure
View all (191)
Reading structured data from SharePoint in Synapse Pipelines
This post describes an approach to copy files and data from SharePoint into Azure using Synapse Pipelines.

Synapse & Service Principal SharePoint Integration
The interactive notebook shared in this post defines the process of granting Service Principals (inc. Synapse managed identities) access to SharePoint sites.

How do Data Lakehouses Work? An Intro to Delta Lake
With new technologies - such as Delta Lake and other open table formats - there have been huge improvements in the performance of Data Lakehouses. But what is Delta Lake and how does it work?
Azure Synapse Analytics
View all (49)
Reading structured data from SharePoint in Synapse Pipelines
This post describes an approach to copy files and data from SharePoint into Azure using Synapse Pipelines.

Synapse & Service Principal SharePoint Integration
The interactive notebook shared in this post defines the process of granting Service Principals (inc. Synapse managed identities) access to SharePoint sites.

Working locally with spark dev containers
Running Spark locally in a dev container can significantly improve development feedback loops. This first article explains why, and the rest of the series will show how.
Big Compute
View all (28)
What is a Data Lakehouse?
What exactly is a Data Lakehouse? This blog gives a general introduction to their history, functionality, and what they might mean for you!

Azure Synapse Analytics: How serverless is replacing the data warehouse
Serverless data architectures enable leaner data insights and operations. How do you reap the rewards while avoiding the potential pitfalls?

Benchmarking Azure Synapse Analytics - SQL Serverless, using Polyglot Notebooks
New Azure Synapse Analytics service offers SQL Serverless for on-demand data lake queries. We tested its potential as a Data Lake Analytics replacement.
Big Data
View all (109)
DuckLake in Perspective: Advanced Features and Future Implications
Explore DuckLake's advanced capabilities including built-in encryption, sophisticated conflict resolution, and the strategic implications for future data architecture. Understand how DuckLake enables new business models and positions itself against established lakehouse formats.

DuckLake in Practice: Hands-On Tutorial and Core Features
Get hands-on with DuckLake through a comprehensive tutorial covering installation, basic operations, file organization, snapshots, and time travel functionality. Learn how DuckLake's database-backed metadata management works in practice.

Introducing DuckLake: Lakehouse Architecture Reimagined for the Modern Era
DuckDB Labs introduces DuckLake, a revolutionary approach to lakehouse architecture that solves fundamental problems with existing formats by bringing database principles back to data lake metadata management.
Cloud
View all (6)
Reading structured data from SharePoint in Synapse Pipelines
This post describes an approach to copy files and data from SharePoint into Azure using Synapse Pipelines.

Synapse & Service Principal SharePoint Integration
The interactive notebook shared in this post defines the process of granting Service Principals (inc. Synapse managed identities) access to SharePoint sites.

Carbon Optimised Data Pipelines - minimise CO2 emissions through intelligent scheduling (Next Steps)
Intelligently scheduling cloud data pipelines based on carbon impact can optimize both environmental sustainability and operational efficiency.
Culture
View all (132)
Retrospecting on my career at endjin
Liam joined endjin as part of the Software Engineering Apprenticeship 2021 cohort. In this post he looks back on his time at endjin before moving on.

Women of Silicon Roundabout: Day 2
Women of Silicon Roundabout is the UK's largest women in tech event. Day two topics included: green tech, burnout, and Python!

Women of Silicon Roundabout: Day 1
Women of Silicon Roundabout is the UK's largest women in tech event. Day one topics included: AI, career pathways, and generations of Women in Tech.
Data
View all (5)
What is the Medallion Architecture?
The Medallion Architecture consists of three data tiers: Bronze (raw), Silver (clean), and Gold (projected). Data moves through these three tiers and becomes more opinionated at each stage.

Encoding categorical data for Power BI: Using label encoded data vs one-hot encoded data in Power BI
Understand why label encoding is the preferred technique for encoding categorical data for analysis in Power BI over one-hot encoding.

Encoding categorical data for Power BI: Label encoding vs one-hot encoding - which encoding technique to use?
One-hot encoding and label encoding are two methods used to encode categorical data. Understand the specific advantages and disadvantages of these techniques.
Data Engineering
View all (36)
DuckLake in Perspective: Advanced Features and Future Implications
Explore DuckLake's advanced capabilities including built-in encryption, sophisticated conflict resolution, and the strategic implications for future data architecture. Understand how DuckLake enables new business models and positions itself against established lakehouse formats.

DuckLake in Practice: Hands-On Tutorial and Core Features
Get hands-on with DuckLake through a comprehensive tutorial covering installation, basic operations, file organization, snapshots, and time travel functionality. Learn how DuckLake's database-backed metadata management works in practice.

Introducing DuckLake: Lakehouse Architecture Reimagined for the Modern Era
DuckDB Labs introduces DuckLake, a revolutionary approach to lakehouse architecture that solves fundamental problems with existing formats by bringing database principles back to data lake metadata management.
Data Storytelling
View all (1)
How to Build Mobile Navigation in Power BI
This is a guide to designing mobile navigation in Power BI, covering form, icons, states, and actions, with a view to enhancing report design & UI.
Databricks
View all (9)
What is a Data Lakehouse?
What exactly is a Data Lakehouse? This blog gives a general introduction to their history, functionality, and what they might mean for you!

Intro to Microsoft Fabric
Microsoft Fabric unifies data & analytics, building on Azure Synapse Analytics for improved data-level interoperability. Explore its offerings & pros/cons.

Version Control in Databricks
Explore how to implement source control in Databricks notebooks, promoting software engineering best practices.
Dataverse
View all (3)
How to access multi-select choice column choice labels from Azure Synapse Link for Dataverse with PySpark or SQL
Learn how to access multi-select choice column choice labels from Azure Synapse Link for Dataverse using PySpark or SQL.

How to access choice labels from Azure Synapse Link for Dataverse with SQL
Learn how to access the choice labels from Azure Synapse Link for Dataverse using T-SQL through SQL Serverless and by using Spark SQL in a Synapse Notebook.

How to access choice labels from Azure Synapse Link for Dataverse with PySpark
Learn how to access the choice labels from Azure Synapse Link for Dataverse using PySpark.
DevOps
View all (33)
Polyglot Notebooks for Ops
Polyglot Notebooks' PowerShell support enhances IT Ops with robust, repeatable processes via 'executable documentation'.

Exploring OpenChain: From License Compliance to Security Assurance
Open-source software has become an essential part of many organisations' software supply chains. However, this poses challenges for license compliance and security assurance.

Data validation in Python: a look into Pandera and Great Expectations
Implement Python data validation with Pandera & Great Expectations in this comparison of their features and use cases.
Engineering Practices
View all (146)
C# 11.0 new features: ref fields and the scoped keyword
C# 11.0 expanded high-performance, low-allocation functionality. This post describes the importance of the added support for ref fields, and how the scoped keyword relates.

adr - A .NET Tool for Creating & Managing Architecture Decision Records
Architectural Decision Records (ADRs) capture context, options, decisions, and consequences. dotnet-adr is a .NET tool for managing ADRs.

Data and AI Engineering Maturity - Fix our problems before we hit the buffers
As data and AI become the engine of business change, we need to learn the lessons of the past to avoid expensive failures.
Innovation
View all (30)
How to Monetize APIs with Azure API Management
Explore monetizing APIs with our guide. We offer strategies, videos, and code via Azure API Management to fast-track your business model.

Do robots dream of counting sheep?
Some of my thoughts inspired whilst helping out on the farm over the weekend. What is the future of work given the increasing presence of machines in our day to day lives? In which situations can AI deliver greatest value? How can we ease the stress of digital transformation on people who are impacted by it?

Azure Synapse Analytics: How serverless is replacing the data warehouse
Serverless data architectures enable leaner data insights and operations. How do you reap the rewards while avoiding the potential pitfalls?
Internet of Things
View all (15)
Do robots dream of counting sheep?
Some of my thoughts inspired whilst helping out on the farm over the weekend. What is the future of work given the increasing presence of machines in our day to day lives? In which situations can AI deliver greatest value? How can we ease the stress of digital transformation on people who are impacted by it?

How to use SQL Notebooks to access Azure Synapse SQL Pools & SQL on demand
Wishing Azure Synapse Analytics had support for SQL notebooks? Fear not, it's easy to take advantage of rich interactive notebooks for SQL Pools and SQL on Demand.

ArrayPool vs MemoryPool—minimizing allocations in AIS.NET
Tracking down unexpected allocations in a high-performance .NET parsing library.
Machine Learning
View all (33)
What is a Data Lakehouse?
What exactly is a Data Lakehouse? This blog gives a general introduction to their history, functionality, and what they might mean for you!

SQLbits 2024 - The Best Bits
This is a summary of the sessions I attended at SQLbits 2024 - Europe's largest expert-led data conference. This year SQLbits was hosted at Farnborough IECC, Hampshire.

SQLbits 2023 - The Best Bits
This is a summary of the sessions I attended at SQLbits 2023, Europe's largest expert-led data conference, held in Newport, Wales.
Microsoft Fabric
View all (26)
DuckLake in Perspective: Advanced Features and Future Implications
Explore DuckLake's advanced capabilities including built-in encryption, sophisticated conflict resolution, and the strategic implications for future data architecture. Understand how DuckLake enables new business models and positions itself against established lakehouse formats.

DuckLake in Practice: Hands-On Tutorial and Core Features
Get hands-on with DuckLake through a comprehensive tutorial covering installation, basic operations, file organization, snapshots, and time travel functionality. Learn how DuckLake's database-backed metadata management works in practice.

Introducing DuckLake: Lakehouse Architecture Reimagined for the Modern Era
DuckDB Labs introduces DuckLake, a revolutionary approach to lakehouse architecture that solves fundamental problems with existing formats by bringing database principles back to data lake metadata management.
Modern Compute
View all (3)
Modern Compute: Compute-Intensive Workloads
We have a wide range of computational mechanisms at our disposal, some of which emerged thanks to recent advances in AI. In this post, we look at the kinds of workloads that can take advantage of these.

Modern Compute: Unavoidable Practicalities
Thanks in part to recent advances in AI, we have a range of computational mechanisms at our disposal. However, certain universal truths apply to all of them.

After the AI Storm: Modern Compute
Recent huge investment in AI has changed the modern computational landscape. Whatever the value of recent AI developments ultimately proves to be, we have some new hardware capabilities as a side effect. What else do these enable?
Open Source
View all (64)
What is a Data Lakehouse?
What exactly is a Data Lakehouse? This blog gives a general introduction to their history, functionality, and what they might mean for you!

Learn Reactive Programming for FREE: Introduction to Rx.NET 2nd Edition (2024)
Learn Reactive Programming with our free book, Introduction to Rx.NET 2nd Edition (2024), available in PDF, EPUB, online, and GitHub.

Implementing the OpenChain Specification
After a year of working on implementing the OpenChain specification, this blog takes you through the processes we created to track and manage our open-source licenses.
OpenChain
View all (4)
Exploring OpenChain: From License Compliance to Security Assurance
Open-source software has become an essential part of many organisations' software supply chains. However, this poses challenges for license compliance and security assurance.

The OpenChain specification explained
When implementing OpenChain, understanding the specification will help guide your organisation towards having processes in place to review and manage open-source software.

What are the risks with open-source software?
The key risks associated with open-source software, whether you use it minimally or throughout all your systems.
Power BI
View all (75)
Learning from Disaster - A Creative Walkthrough of the Titanic Power BI Report
In his final, posthumously published blog post, Paul Waller takes you through a creative walkthrough of the Titanic Power BI Report he created with Barry Smart.

How to Build Mobile Navigation in Power BI
This is a guide to designing mobile navigation in Power BI, covering form, icons, states, and actions, with a view to enhancing report design & UI.

Encoding categorical data for Power BI: Using label encoded data vs one-hot encoded data in Power BI
Understand why label encoding is the preferred technique for encoding categorical data for analysis in Power BI over one-hot encoding.
Python
View all (21)
What is a Data Lakehouse?
What exactly is a Data Lakehouse? This blog gives a general introduction to their history, functionality, and what they might mean for you!

Creating Quality Gates in the Medallion Architecture with Pandera
This blog explores how to implement robust validation strategies within the medallion architecture using Pandera, helping you catch issues early and maintain clean, trustworthy data.

How to step into external code when debugging a Python Behave test in VS Code
Learn how to configure VS Code to enable stepping into external code when debugging a Python Behave test.
Security and Compliance
View all (28)
No-code/Low-code is software DIY - how do you avoid DIY disaster?
No-code/Low-code democratizes software development with little to no coding skills needed. But how do you evaluate if software DIY is the right choice for you?

Exploring OpenChain: From License Compliance to Security Assurance
Open-source software has become an essential part of many organisations' software supply chains. However, this poses challenges for license compliance and security assurance.

The OpenChain specification explained
When implementing OpenChain, understanding the specification will help guide your organisation towards having processes in place to review and manage open-source software.
Spark
View all (3)
Spark dev containers: packaging code for testability
Once you've thoroughly tested your code against the local Spark service in your dev container, you'll want to run it in a real Spark cluster. This post shows how to deploy such code to Microsoft Fabric.

Spark dev containers: writing tests
Having seen earlier in the series how to configure a dev container to run Spark locally, this post shows how to write tests that use that local Spark service.

Spark dev containers: running Spark locally
See how to configure a dev container to run Spark locally, to improve development feedback loops.
Startups
View all (15)
How to Monetize APIs with Azure API Management
Explore monetizing APIs with our guide. We offer strategies, videos, and code via Azure API Management to fast-track your business model.

10 ways working with Microsoft helped endjin grow since 2010
Microsoft recently shot a video interviewing endjin co-founder, Howard van Rooijen, and Director of Engineering, James Broome, about how Microsoft has helped endjin grow over the past decade. This post lists the top 10 ways in which Microsoft helped - from providing access to valuable software and services, to opening up sales channels, to helping to navigate the minefield of UK Financial Services regulations around cloud adoption.

What makes a successful FinTech start-up?
In this post we discuss the characteristics of a great FinTech startup, and the importance of the API Economy to innovation in Financial Services.
Strategy
View all (70)
Launchpad to Success: Building and Leading Your Data Team
This guide captures the essential points that leaders should consider when setting up a new data team.

Data is a socio-technical endeavour
Our experience shows that the most successful data projects rely heavily on building a multi-disciplinary team.

Data and AI Engineering Maturity - Fix our problems before we hit the buffers
As data and AI become the engine of business change, we need to learn the lessons of the past to avoid expensive failures.
UX
View all (25)
Learning from Disaster - A Creative Walkthrough of the Titanic Power BI Report
In his final, posthumously published blog post, Paul Waller takes you through a creative walkthrough of the Titanic Power BI Report he created with Barry Smart.

How to Build Mobile Navigation in Power BI
This is a guide to designing mobile navigation in Power BI, covering form, icons, states, and actions, with a view to enhancing report design & UI.

Power BI Images That Pop: A Guide to Intuitive, Easy-to-Maintain Reports
Explore integrating icons, pictograms and images into Power BI in the optimal way to enhance the user experience and minimise effort required to build and maintain reports.
Visualisation
View all (18)
Learning from Disaster - A Creative Walkthrough of the Titanic Power BI Report
In his final, posthumously published blog post, Paul Waller takes you through a creative walkthrough of the Titanic Power BI Report he created with Barry Smart.

How to Build Mobile Navigation in Power BI
This is a guide to designing mobile navigation in Power BI, covering form, icons, states, and actions, with a view to enhancing report design & UI.

Power BI Images That Pop: A Guide to Intuitive, Easy-to-Maintain Reports
Explore integrating icons, pictograms and images into Power BI in the optimal way to enhance the user experience and minimise effort required to build and maintain reports.