Blog
Spark dev containers: running Spark locally
See how to configure a dev container to run Spark locally, to improve development feedback loops.
Working locally with Spark dev containers
Running Spark locally in a dev container can significantly improve development feedback loops. This first article explains why, and the rest of the series will show how.
C# 12.0: collection expressions
C# 12.0 provides a new, simpler syntax for initializing collections. It typically generates the most efficient code possible, although as you'll see, it's useful to understand the choices it makes.
Why Power BI developers should care about the Tabular Model Definition Language (TMDL)
Power BI's adoption of TMDL improves the readability of the semantic model, enables version control and enhances collaboration and efficiency for developers.
Women of Silicon Roundabout: Day 2
Women of Silicon Roundabout is the UK's largest women in tech event. Day two topics included: green tech, burnout, and Python!
Women of Silicon Roundabout: Day 1
Women of Silicon Roundabout is the UK's largest women in tech event. Day one topics included: AI, career pathways, and generations of Women in Tech.
C# 12.0: inline arrays
A new feature in C# 12.0 enables data types to define fixed-size arrays that don't require separate array objects on the heap. Learn how this is useful in performance-oriented and interop scenarios.
There's something wrong with the Pandas API on Spark
Fix the following issues: errors converting large datasets to pandas, the Pandas API on Spark running very slowly, and column reduction in the Pandas API on Spark not reducing data.
How .NET 9.0 boosted JSON Schema performance by 32%
We benchmarked endjin's JSON Schema library on .NET 9.0 and saw large performance gains. There are even more gains to be had with new System.Text.Json features.
How .NET 9.0 boosted AIS.NET performance by 9%
.NET 9.0 has shipped, and for the fourth year running, we benchmarked endjin's AIS.NET library and were very happy to see substantial performance gains, with no extra work required.
Carbon Optimised Data Pipelines - minimise CO2 emissions through intelligent scheduling (Next Steps)
Intelligently scheduling cloud data pipelines based on carbon impact can optimize both environmental sustainability and operational efficiency.
Carbon Optimised Data Pipelines - minimise CO2 emissions through intelligent scheduling (Pipeline Definition)
Intelligently scheduling cloud data pipelines based on carbon impact can optimize both environmental sustainability and operational efficiency.
Modern Compute: Compute-Intensive Workloads
We have a wide range of computational mechanisms at our disposal, some of which emerged thanks to recent advances in AI. In this post, we look at the kinds of workloads that can take advantage of these.
C# 12.0: primary constructors
C# 12.0's most prominent new feature is the primary constructor syntax. This post describes how it works, and looks at some pros and cons.
Carbon Optimised Data Pipelines - minimise CO2 emissions through intelligent scheduling (Architecture Overview)
Intelligently scheduling cloud data pipelines based on carbon impact can optimize both environmental sustainability and operational efficiency.
Carbon Optimised Data Pipelines - minimise CO2 emissions through intelligent scheduling (Introduction)
Intelligently scheduling cloud data pipelines based on carbon impact can optimize both environmental sustainability and operational efficiency.
Modern Compute: Unavoidable Practicalities
Thanks in part to recent advances in AI, we have a range of computational mechanisms at our disposal. However, certain universal truths apply to all of them.
How to step into external code when debugging a Python Behave test in VS Code
Learn how to configure VS Code to enable stepping into external code when debugging a Python Behave test.
C# 11.0 new features: ref fields and the scoped keyword
C# 11.0 expanded high-performance, low-allocation functionality. This post describes the importance of the added support for ref fields, and how the scoped keyword relates.
After the AI Storm: Modern Compute
Recent huge investment in AI has changed the modern computational landscape. Whatever the value of recent AI developments ultimately proves to be, we have some new hardware capabilities as a side effect. What else do these enable?
Why Power BI developers should care about the Power BI enhanced report format (PBIR)
Power BI's new PBIR format enhances collaboration, version control, and efficiency for developers. Learn key benefits and future implications.
Why Power BI developers should care about Power BI projects (PBIP)
Power BI Projects are a game changer for teams building reports, offering a source-control-friendly format, CI/CD support, and the ability to edit in a code editor.
Per-Property Rows from JSON in Spark on Microsoft Fabric
Spark doesn't always interpret JSON how we'd like. For example, if each key/value pair in a JSON object is conceptually one item, Spark won't give you a row per item by default. This article shows how to nudge Spark in the right direction.
C# Design Patterns - Iterator - Language Features
This post examines .NET's native support for iterators: IEnumerator<T>, IEnumerable<T>, and IAsyncEnumerable<T>.
Launchpad to Success: Building and Leading Your Data Team
This guide captures the essential points that leaders should consider when setting up a new data team.