Recently, Jess and I were looking at a Power BI report we produced for one of our clients. On the 'overview' page, we have numerous high-level visuals: Total Sales, Previous Year Sales, % Change YoY etc. Something struck me as odd, though.
As I'm sure many of you who have spent days on a report using the same data model can vouch for, one gets used to the high level statistics of that data model pretty quickly, making it pretty easy to spot something that's not quite right.
Our report has a slicer on the 'Year' part of our date table, where the date table was manually generated and has a one-to-many relationship with a date column in our main dataset. By default, this slicer has only a single selection, e.g. '2018'. This means that measures like 'Total Sales' and 'Previous Year Sales' can't really be misconstrued.
In this scenario, an easy way to verify that the 'Previous Year Sales' measure is working as expected, is by going through each of the years in the slicer individually, and making sure the 'Total Sales' for one year is the 'Previous Year Sales' for the next. Right? Well, no – not necessarily.
This verification method works fine for 'complete' years, I.e. years for which there are date entries every day in the year in your date table. It is also fine for the earliest dates in your date table, even when your date entries only begin, say, in June of that year.
The 'Previous Year Sales' will, of course, be blank – because there are no data entries for the previous year. However, care needs to be taken for the final date entries in your date table. This is where it becomes important to understand the possible differences between generating a date table using the CALENDAR function and CALENDARAUTO function.
The CALENDAR function generates a date table, outputting every day between two dates you specify as parameters (inclusive). Of course, these parameters can be hard-coded, or a function that outputs a date based on your chosen date column can be used (which is probably more preferable for dynamic data models).
For example, the FIRSTDATE and LASTDATE functions can be used in the arguments to extract the earliest and latest dates that appear in whatever date column you wish to use to generate your date table.
CALENDARAUTO works slightly differently. CALENDARAUTO will output a date table with every day from the first day of the fiscal year based on the earliest date in your chosen date column, all the way up to the last day of the fiscal year of the latest date in your chosen date column. It takes one optional parameter ([fiscal_year_end_month]) whereby you can enter an integer between 1 and 12 to choose as the end of your fiscal year. The default is 12 (31st December).
So why does any of this matter in relation to YOY comparisons? This is where I'll switch to an example. Here's some data:
The first date in the dataset is 04/04/2014. The last date in the dataset is 30/06/2017 (I.e. end of Q2). I created two different date tables based off this date column and created a one-to-many relationship between the date table and the date column in my dataset:
CALENDAR() DATE = CALENDAR( FIRSTDATE(Dataset[OrderDate]), LASTDATE(Dataset[OrderDate]) )
CALENDARAUTO() DATE = CALENDARAUTO()
The measures are also quite simple:
TotalSales = SUM('Dataset'[Sales])
PreviousYearSalesForSamePeriodCD = CALCULATE( [TotalSales], SAMEPERIODLASTYEAR( 'CALENDAR() DATE'[Date] ) )
PreviousYearSalesForSamePeriodCAD = CALCULATE( [TotalSales], SAMEPERIODLASTYEAR( 'CALENDARAUTO() DATE'[Date] ) )
%ComparisonToPYSFSP_CD = [TotalSales] / [PreviousYearSalesForSamePeriodCD]
%ComparisonToPYSFSP_CAD = [TotalSales] / [PreviousYearSalesForSamePeriodCAD]
Don't worry – I've handled the 'divide-by-zero' possibility in real-life, but omitted it in this blog so that the measure looks cleaner :)
So, first of all, let's choose the year 2016 as the slicer. As expected, we get identical results:
Now let's change the slicer to 2017. Here's what we see:
Why are these 'Previous Year' values different? They're different because the 'SamePeriodLastYear' in question differs between the two. Remember, our actual data only goes up to the end of Q2 – and our CALENDAR generated date table corresponds to this. So although the page filter is set to 2017, the 'period' in question is 1/1/2017 - 30/06/2017 in the current context.
There are no entries after June, therefore there are no dates in our date table after June. Therefore the 'TotalSales' comparison is made between the total sales between 1/1 - 30/06 in 2017 and 1/1 - 30/06 in 2016. Effectively, we're doing a like-for-like comparison. Here's a way this can be verbalized: "We have had a 12% rise in sales compared to the first two quarters of last year."
Conversely, our CALENDARAUTO generated date table generated dates up until the end of the fiscal year (31st December), even though there are no dates in our main dataset associated with the latter half of the year. Since we have dates in 2017 all the way up until the end of the year, the 'SamePeriodLastYear' calculation spans the whole of the previous year's sales, as opposed to just those in the first two quarters. Hence the higher value for 'PreviousYearSales' and the lower value for '% Comparison'.
Here, we're effectively using the 'PreviousYearSales' as a fixed target, and we can see our progress towards reaching that target by using the '% comparison' KPI. "We are now halfway through the year, and we have received only 42% of last year's total sales."
Neither of these scenarios is 'wrong', they are just different ways of looking at the YOY comparison. The choice really depends on what insights you want to get from the data. There's nothing stopping you from implementing both. In fact, it's probably the best option, as I think there is value in both of these insights - just make sure you name appropriately, so you know what you're looking at!
Aside: It is possible to achieve a dynamic date table which spans the whole year using CALENDAR. Setting its arguments to
DATE ( YEAR ( FIRSTDATE ( Dataset[OrderDate] ) ), 1, 1 ) and
DATE ( YEAR ( LASTDATE ( Dataset[OrderDate] ) ), 12, 31 ) will achieve this. But, let's be honest, using CALENDARAUTO is much easier in this scenario!