Better late than never. Prior to 2017 Power BI Data Insights Summit, Microsoft published here three challenges, including a complex data preparation. So, I had to try it.

At the time, I was too busy to submit my solution, but I planned to share it at DataChant. You can also learn how the challenge was solve by the winner of the challenge, Rishi Sapra, here.

Download my solution pbix file here.

Here is the challenge in a nutshell:
Hawaii Tourism Authority provides detailed data for visitors. Unfortunately, this data is buried in somewhat cumbersome table layouts. For this challenge, prep the data to visualize trends in Hawaii tourism:

  • Where are people visiting from?
  • Has there been any interesting shifts in what regions visit Hawaii in general or specific islands in particular?
  • What months are the busiest by Island?

The source data is an Excel workbook (available here), which contains a stack of summarized tables per year on a single worksheet. While the initial obstacle is to find out how to append all the tables together, and unpivot the tables, you will soon, discover that many values in this workbook need a proper normalization, as each island and some countries are represented with different names.

To solve the mismatching island names, I created a conversion table for the islands, which contains the pairs of source islands and their normalized name. By transposing the table conversion, as shown in the diagram below, we can quickly convert any island value in our fact table to its proper normalized value.

Hawaii Data Prep Challenge

And here is the function that normalize the island values, based on the conversion table #”Island Conversion Table” (the right table in the diagram above).

(txt) =>
    try Table.Column(#"Island Conversion Table", Text.From(txt)){0} otherwise txt

To create a good star schema in the Data Model, I used Reference + Remove Duplicates in Query Editor on the fact tables to create the tables Countries and Flight Types. As an alternative approach, I also used the Enter Data feature, to create the Islands lookup table.

Hawaii Tourism Data Challenge in Power BI

And here is the Relationships view of the star schema. The two fact tables are located at the bottom. Both tables are connected in many to one relationship to Calendar, Islands, Flight Types. The Countries table is connected to Visitors by Origin.

Hawaii Data Prep Challenge in Power BI - The Data Model

I intended to write a more detailed post with the steps I used to resolve this challenge, but I am sure you will figure it out by yourself, after you give a good look at the pbix file. Download it here.

Have questions, or better ideas how to prepare this data for analysis? Please share it in the comments below.