Expand Table Column – Pitfall #9


It’s time to expand table columns today, and get you acquainted with the ninth pitfall, which can be quite risky, as you’ll soon find out. For those of you who didn’t read my previous articles of the Pitfalls Series, you are welcome to bookmark this link, it can help you avoiding common mistakes, we always do in Power BI and Power Query in Excel.

So, what is Expand Table Column?

Expand Table Column is a transformation step which happens when you click on the highlighted button.

Expand Table Column

The highlighted button appears in a variety of scenarios in Power BI and Power Query in Excel. When you click on the expand button, the tables in the relevant column will get expanded.

Let’s go over one of the main scenarios – Merging queries.

In the following table, we have a list of employees and their bonuses over the years.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns
Bonuses Table

The column Employee ID, can be used to find the corresponding employee name and gender in the Employees table below.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns
Employees Table

Using Query Editor’s Merge Queries in Home tab, we can combine between Bonuses and Employees, using the column Employee ID to map between each row in Employees and the multiple rows in Bonuses. If you are new to the Merge functionality, read more about it here.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Following the merge step, we can now expand the table column NewColumn.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

In the expand pane, we can uncheck Employee ID, as we already have that column in the Bonuses table. Clicking OK, will reveal the 9th pitfall.

If you followed the first pitfall here, your formula bar is now active, and you can see the following code:

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

In the previous step, we unchecked Employee ID. As a result, the Query Editor auto-generated a code that refers to the other column names in table EmployeesFirst Name, Last Name and Gender.

The 9th Pitfall

Whenever we extract table columns, as we did above, the auto-generated Table.ExpandTableColumn formula is hard-coded with the specific column names, that were checked. So, when there is a new column name, it will be missed in the merge operation. For example, if the column First Name in Employees table, will change to FirstName, it will not be included in the expanded results. Instead, First Name will have nulls, as shown here:

The Solution: List.Difference and Table.ColumnNames

As you may recall, in the example above, we unchecked Employee ID. We didn’t check each of the columns we need. We wanted them all. Luckily for us, there is a formula to do it.

Instead of calling for columns one by one:

We can call for all the columns except of Employee ID:

And now, we can use the formula above as the second argument of Table.ExpandTableColumn as follows:

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

With this change, whenever the table Employees will have new or different column names, we will have them expanded in the resulting table. Isn’t it cool?

Combining CSVs with varying columns

A special case to consider the 9th pitfall, is when we import multiple files from a folder. Let’s examine the basic case of CSVs, though it is relevant for other file types as well.

Imagine we have multiple CSVs from a folder. Most of the CSV files share the same format, but in few of those files, we have extra columns. Let’s import the files from a folder and see how Power BI and Power Query in Excel handle this scenario.

You can download the ZIP file from here, and follow with me.

The ZIP file contains 4 CSV files. Most of them have the columns Date, Product and Sales:

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Let’s import the entire folder after you unzip the files into a folder. To import the CSV files, select Get Data in Power BI Desktop, click File, then click Folder and click Connect.

Note: This scenario is based on the recent improved combine binaries experience, read more here. If you are using Excel 2010/2013 with Power Query latest Add-In, or in Excel 2016 version 1612 Build 7668.2048 or later, you can follow the steps as well (From Data –> Get & Transform –> New Query –> From File –> From Folder).

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

In the Folder dialog, click Browse and select the folder with the CSV files. After you select the folder, click Edit in the preview window.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Click on Combine Binaries (highlighted below)

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

In CSV window, click OK

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

The results are as expected. We got Date, Product and Sales from all the CSV files.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

But the file Tokyo.csv has sub products in the fourth column. We missed it.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Let’s understand how we missed it, by looking at the formula bar.

From the code below, you may see that Table.ExpandTableColumn expanded the column “Transform Binary from CSVs3″ by the column names that exist in the first CSV file. How do we know it? Look at the highlighted “{0}”, which refers to the first item in the table column. That is the first table we’ve expanded.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Try to change the {0} part to {3}. Do you see now that Sub-Product was just expanded? You can scroll down till you reach Tokyo.csv under the first column to find sub products under Sub-Product.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

And here is when pitfall #9 have manifested itself. We’ve imported many CSV files from a folder, and Power BI / Power Query was only looking at the first file to determine which columns to expand. Apparently, it wasn’t sufficient to detect the sub-products in the fourth file.

How to combine a folder with varying columns in the files

When we have multiple files with varying column names, we would want to expand all the files into a unified table. Fortunately, there is a way to do it in the formula. Let’s see how we can build the solution.

Go one step back in APPLIED STEPS, before expand table column step. Select the table column, right click on its header, and click Drill Down.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

The result is a list of tables:

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Now, let’s wrap the formula with List.Transform which will convert each table to its list column names.

The result is a list of lists. Each item in the main list is a list of column names from the corresponding table.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Now let’s wrap the formula above with List.Union:

The result is a unified list of all the column names:

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Now, we can use the List.Union formula as a second argument of the original Table.ExpandTableColumn

And voila, all the columns are detected and expanded.

So, when I’ve added the column Discount in Tokyo.csv it is detected and expanded on refresh, without further changes.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Conclusions

Power Query Formula Language (M) can be your friend. You can use List.Difference and List.Union on Table.ColumnNames as shown above to avoid the ninth pitfall, and automatically detect new columns when you expand table columns.

Stay tuned for the 10th and last pitfall – Remove Duplicates…

7 comments

  1. JulianReply

    When it comes to Power Query, personally I would say your posts are the most valuable. I leaned from you a lot of stuffs I never knew. I do appreciate it.

  2. Gil Raviv Post authorReply

    Thank you Julian. Perhaps you are ready to share your knowledge on DataChant. Want to be a guest author?

  3. Pingback: The Ten Pitfalls of the Data Wrangler (aka the Wielder of M/Power Query) - PowerPivotPro

  4. Lisa SReply

    Can the List.Difference be used to exclude more than 1 column? I find in many cases that I have multiple columns that I want to exclude from expansion.

    • Gil Raviv Post authorReply

      Sure. You can add multiple columns as the second argument: List.Difference(all_columns, {“col1”, “col2”})

  5. cocis48Reply

    Gil,
    Excellent posts just started reading and your ability to show detailed steps is amazing.
    I found a small typo at the top ‘expend’ vs ‘expand’ “When you click on the expend button, the tables in the relevant column will get expended.” <–

    • Gil Raviv Post authorReply

      Thank you for the great feedback, and for detecting the typo. Fixed 🙂 Hope you will find the posts useful.

Leave a Reply