Skip to content
Carmel Eve By Carmel Eve Software Engineer I
How can I improve my data model in Power BI?

I have already written a blog around loading and shaping data in Power BI. In that post I ran through how to build up a model in Power BI to support report visualisations. This included merging and manipulating tables, cleaning and shaping the data, and building relationships between different parts of the model. All of this forms the base for the high-level reporting which allows you to deliver insight using Power BI.

There are however some other things which can be done in order to improve your data model. The first of these is to configure model properties.

There are a few different model properties which can be configured. Some of these I have already touched on, but are used to make the model more discoverable:

  • Object names and descriptions – setting these helps users to understand what different values contained in the model relate to.
  • Hidden objects – Hiding objects which aren't useful for reporting means that the mode is not cluttered with objects which are meaningless outside of relationships and internal calculations.

And some are used to improve the reports produced by the model:

Data types and data categories

Ensuring your columns are the correct data type allows you to perform specific operations over the data. There are many data types: Whole number, decimal numbers, dates and times, text, true/false, binary. Each of these unlocks specific functionality.

Power BI can also infer things from within a data type. For example, it can infer that text data can relate to a specific geographic location which allows you to visualise your data on a map. There is sometimes a little more complexity in getting location data into the correct format, but this can be a powerful tool in bringing your data to life.

For example, if we use our data model seen in previous posts and display the number of children by city on a map:

Image of heat map of the UK.

We can get a clear visual idea of the distribution around the country.

Within this, you can also use data categories to further enrich the data. For example, you could specify that the above data relates to cities, and if you had worldwide data you could have a second column which you could categorise as "country". This allows you to present richer map views over the data. Other categories include image and web URLs, which can allow you to display this kind of content within your reports.

Numeric column format

Formatting your numeric columns means that you can display the data in a way which gives the viewer some context. For example, displaying your data as a percentage or a monetary amount means that anyone viewing the report can quickly and easily tell what they're looking at.

Sort order

Another thing which can be useful is defining a sort order for your data. This can be especially useful if say, for instance, you have a column which contains the month that someone is born. When you sort by that column it will automatically sort the text in alphabetical order, whereas clearly in the case of months that wouldn't be the expected behaviour.

So instead you can define a sort order based on another column, i.e. if you don't define a sort order:

Column of months in alphabetical order.

But if we add an additional column:

Additional month number column added.

And then define the sort order of the birth month:

Setting the month column to be sorted by the month number.

Then when we sort that column, we get the expected behaviour:

Showing month column in correct order.

Default summarization

Another thing that can be done to improve the model is to add a default summarization to columns. When you display aggregated numeric columns in your reports, the default aggregation method is to sum the values. For instance, if we create a visual which aggregates the number of children by city:

Showing the total number of children by city.

The total for each city is calculated by summing up the children in that city. If we instead wanted this to display the average children in each city, we could update the default summarization for the number of children column:

Updating the default summarization behaviour to be "average".

And we can then see that the table visualisation instead displays the averages:

Visualisation showing the average number of children by city.

This is especially useful if you have data which it does not make sense to sum – for example ratings.

Using these techniques we have a model which is far more discoverable, and is able to better support the visualisations we need.

Carmel Eve

Software Engineer I

Carmel Eve

Carmel is a software engineer and LinkedIn Learning instructor. She worked at endjin from 2016 to 2021, focused on delivering cloud-first solutions to a variety of problems. These included highly performant serverless architectures, web applications, reporting and insight pipelines, and data analytics engines. After a three-year career break spent travelling around the world, she rejoined endjin in 2024.

Carmel has written many blog posts covering a huge range of topics, including deconstructing Rx operators, agile estimation and planning and mental well-being and managing remote working.

Carmel has released two courses on LinkedIn Learning - one on the Az-204 exam (developing solutions for Microsoft Azure) and one on Azure Data Lake. She has also spoken at NDC, APISpecs, and SQLBits, covering a range of topics from reactive big-data processing to secure Azure architectures.

She is passionate about diversity and inclusivity in tech. She spent two years as a STEM ambassador in her local community and taking part in a local mentorship scheme. Through this work she hopes to be a part of positive change in the industry.

Carmel won "Apprentice Engineer of the Year" at the Computing Rising Star Awards 2019.