In my last tutorial here, you learned how to apply a dynamic running average in DAX. In this part, you will learn how to offset the daily new confirmed cases vertically and horizontally.
Horizontal offset – Normalize cases by population
Different countries were affected differently by COVID-19. The most common analytic challenge in comparing between different countries is that they have different populations. A country with a smaller population but a high number of cases per capita may require more attention than a country with a small number of cases per capita.
Vertical offset – Align cases on the X axis since the day the country “got infected”
Since the pandemic had reached different countries at different times, it would make sense to analyze the spread and compare between countries by offsetting their curves by the relative day since COVID-19 was confirmed in each country. In the screenshot below, in the first line chart, you can see the running average daily confirmed cases for 5 countries by date. In the second line chart, we offset each country starting on the day it had 100 confirmed cases. So, instead of showing the trends for all countries by actual date, we show each country with its relevant offset.
Before you start this tutorial, I recommend you check out the entire series that will guide you on how to build your own COVID-19 dashboard in Power BI.
Before you start
To start this tutorial you can download this Power BI report file which includes the solution for the last tutorial and the baseline for this tutorial. Open the report file using the latest Power BI Desktop and refresh it to get the latest confirmed and death cases of COVID-19 worldwide (based on Johns Hopkins University dataset).
After you click Refresh in the Home tab of Power BI Desktop. You may notice the Privacy levels dialog box. Set the privacy level to Public as shown below and click Save.
Divide By Population
The report file includes a table which I imported from two Wikipedia pages with the population and density by country. Due to mismatching country names between the original World Population tables and Johns Hopkins University’s dataset, I decided to clean this data and provide it to you ready for use in its static format (using the Enter Data feature in Power BI). You can find the population information under the Population column.
Let’s connect between our population table (which will also serve as a dimension table for the countries) and our fact table. Go to the Model view on the Power BI report and drag and drop the Country field from World Population table to Country/Region column in Covid-19 Cases. This relationship will allow you to normalize any measure and divide it by the sum of the population as long as you use the Country field in World Population in the relevant visual of the report instead of Country/Region in Covid-19 Cases table.
You can now normalize our dynamic running-average daily confirmed cases that we built in the last tutorial by creating a measure. Click the ellipsis icon next to the Covid-19 Cases table in the Fields pane. In the shortcut menu, select New measure and enter this formula:
Daily Confirmed Running Avg per 1M people = DIVIDE ( [Daily New Cases Dynamic-Day-Avg], SUM ( 'World Population'[Population] ) ) * 1000000
The new measure is quite simple. It divides the Daily New Cases Dynamic-Day-Avg by the sum of population. Then it multiply the result by 1,000,000 to reflect a ratio per 1M people instead of per capita which will result in a very low fraction.
Note: You can use the formula above to apply any other measure and normalize per 1M people by placing the measure as the first agument for DIVIDE.
Now, we can create a new line chart below the first visual with the normalized measure. Go to Report view. Copy and Paste the top line chart as a second line chart below the first.
Remove Country/Region from Legend and remove Daily New Cases Dynamic-Day-Avg from Values.
Drag and drop Daily Confirmed Running Avg per 1M people to Values and Country from World Population to Legend
And here are the results. The second visual is now normalized by population.
It’s time to move to the next challenge and offset each country to start on the day it had 100 confirmed cases. So, instead of showing the trends for all countries by actual date, we will show each country with its relevant offset. The starting point is to create a disconnected table with numbers for the days that will be represented in the X-axis of the line chart. We can use the What-if parameter feature to create such a table.
In Modeling tab, select New parameter. Enter Days Since Start as Name. Enter 1 as Minimum and uncheck Add slicer to this page. Then click OK.
You can see the new table in the Fields pane with the formula:
Days Since Start = GENERATESERIES(1, 20, 1)
This table contains the numbers from 1 to 20. We need to change this table to keep incrementing days beyond 20 and accommodate an X-axis that will include the range of days from 1 to the number of days in our dataset. Fortunately, our Calendar table was built in the previous tutorials using a MAX on the dates in Covid-19 Cases. By applying COUNTROWS on the Calendar table we can find the last number we need in Days Since Start. Replace the 20 in the formula above with COUNTROWS(Calendar)
Days Since Start = GENERATESERIES(1, COUNTROWS(Calendar), 1)
How to offset our measure
Click the ellipsis icon next to the Covid-19 Cases table in the Fields pane. In the shortcut menu, select New measure and enter this formula:
Daily Cases Running Avg per 1M since 100 Cases = VAR _100_cases_day = CALCULATE ( MIN ( 'Covid-19 Cases'[Date] ), 'Covid-19 Cases'[Confirmed] >= 100 ) RETURN CALCULATE ( [Daily Confirmed Running Avg per 1M people], FILTER ( 'Calendar', 'Calendar'[Date] = _100_cases_day + [Days Since Start Value] ) )
The measure Daily Cases Running Avg per 1M since 100 Cases starts by declaring a variable _100_cases_day to calculate the date the country reached a cumulative 100 confirmed cases. This is done by applying a CALCULATE function with the expression MINon the Date column in Covid-19 Cases table with a filter on all the records with confirmed cases that are greater or equal to 100.
Following the RETURN statement, we apply another CALCULATE on the measure we want to offset. In our case, we use the previous measure Daily Confirmed Running Avg per 1M people. And here comes the trick. This measure will be applied on a line chart with the disconnected days in the new Days Since Start table we created above. We need to write a smart filter that will map between the values of days in the X-axis to the actual dates we have in the calendar table.
Remember the Days Since Start disconnected table we created. Notice that Power BI created for us also the measure Days Since Start Value. This measure returns the value that is selected in Days Since Start. If we use this measure as a value of a visual with the Days Since Start in the X-axis, we will get the days in the X-axis in this measure. You can see that in the last FILTER function in the formula, we read the days that are selected in Days Since Start Value and add it to the actual date in which the country reached 100 cases. The result is the required offset we need. When Days Since Start will be used in the X-axis, we can then map the corresponding date since 100 cases reached.
FILTER ( 'Calendar', 'Calendar'[Date] = _100_cases_day + [Days Since Start Value] )
We can now create a third line chart in the report page. Drag and drop the new measure Daily Cases Running Avg per 1M since 100 Cases to Values. Drag and drop Days Since Start to the Axis, and drag and drop the Country field of World Population table to the Legend.
Mission accomplished. It’s time to rename the titles of the 2 new line charts and start analyzing different countries.
You can download the solution file here and apply the vertical and horizontal offsets on other measures.