Expand Table Column – Pitfall #9


It’s time to expand table columns today, and get you acquainted with the ninth pitfall, which can be quite risky, as you’ll soon find out. For those of you who didn’t read my previous articles of the Pitfalls Series, you are welcome to bookmark this link, it can help you avoiding common mistakes, we always do in Power BI and Power Query in Excel.

So, what is Expand Table Column?

Expand Table Column is a transformation step which happens when you click on the highlighted button.

Expand Table Column

The highlighted button appears in a variety of scenarios in Power BI and Power Query in Excel. When you click on the expend button, the tables in the relevant column will get expended.

Let’s go over one of the main scenarios – Merging queries.

In the following table, we have a list of employees and their bonuses over the years.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns
Bonuses Table

The column Employee ID, can be used to find the corresponding employee name and gender in the Employees table below.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns
Employees Table

Using Query Editor’s Merge Queries in Home tab, we can combine between Bonuses and Employees, using the column Employee ID to map between each row in Employees and the multiple rows in Bonuses. If you are new to the Merge functionality, read more about it here.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Following the merge step, we can now expand the table column NewColumn.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

In the expand pane, we can uncheck Employee ID, as we already have that column in the Bonuses table. Clicking OK, will reveal the 9th pitfall.

If you followed the first pitfall here, your formula bar is now active, and you can see the following code:

= Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"First Name", "Last Name", "Gender"}, {"First Name", "Last Name", "Gender"})

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

In the previous step, we unchecked Employee ID. As a result, the Query Editor auto-generated a code that refers to the other column names in table EmployeesFirst Name, Last Name and Gender.

The 9th Pitfall

Whenever we extract table columns, as we did above, the auto-generated Table.ExpandTableColumn formula is hard-coded with the specific column names, that were checked. So, when there is a new column name, it will be missed in the merge operation. For example, if the column First Name in Employees table, will change to FirstName, it will not be included in the expended results. Instead, First Name will have nulls, as shown here:

The Solution: List.Difference and Table.ColumnNames

As you may recall, in the example above, we unchecked Employee ID. We didn’t check each of the columns we need. We wanted them all. Luckily for us, there is a formula to do it.

Instead of calling for columns one by one:

{"First Name", "Last Name", "Gender"}

We can call for all the columns except of Employee ID:

List.Difference(Table.ColumnNames(Employees), {"Employee ID"})

And now, we can use the formula above as the second argument of Table.ExpandTableColumn as follows:

= Table.ExpandTableColumn(#"Merged Queries", "NewColumn", List.Difference(Table.ColumnNames(Employees), {"Employee ID"}))

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

With this change, whenever the table Employees will have new or different column names, we will have them expended in the resulting table. Isn’t it cool?

Combining CSVs with varying columns

A special case to consider the 9th pitfall, is when we import multiple files from a folder. Let’s examine the basic case of CSVs, though it is relevant for other file types as well.

Imagine we have multiple CSVs from a folder. Most of the CSV files share the same format, but in few of those files, we have extra columns. Let’s import the files from a folder and see how Power BI and Power Query in Excel handle this scenario.

You can download the ZIP file from here, and follow with me.

The ZIP file contains 4 CSV files. Most of them have the columns Date, Product and Sales:

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Let’s import the entire folder after you unzip the files into a folder. To import the CSV files, select Get Data in Power BI Desktop, click File, then click Folder and click Connect.

Note: This scenario is based on the recent improved combine binaries experience, read more here. If you are using Excel 2010/2013 with Power Query latest Add-In, or in Excel 2016 version 1612 Build 7668.2048 or later, you can follow the steps as well (From Data –> Get & Transform –> New Query –> From File –> From Folder).

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

In the Folder dialog, click Browse and select the folder with the CSV files. After you select the folder, click Edit in the preview window.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Click on Combine Binaries (highlighted below)

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

In CSV window, click OK

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

The results are as expected. We got Date, Product and Sales from all the CSV files.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

But the file Tokyo.csv has sub products in the fourth column. We missed it.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Let’s understand how we missed it, by looking at the formula bar.

From the code below, you may see that Table.ExpandTableColumn expanded the column “Transform Binary from CSVs3″ by the column names that exist in the first CSV file. How do we know it? Look at the highlighted “{0}”, which refers to the first item in the table column. That is the first table we’ve expanded.

= Table.ExpandTableColumn(#"Removed Other Columns1", "Transform Binary from CSVs3", Table.ColumnNames(#"Removed Other Columns1"[#"Transform Binary from CSVs3"]{0}))

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Try to change the {0} part to {3}. Do you see now that Sub-Product was just expanded? You can scroll down till you reach Tokyo.csv under the first column to find sub products under Sub-Product.

= Table.ExpandTableColumn(#"Removed Other Columns1", "Transform Binary from CSVs3", Table.ColumnNames(#"Removed Other Columns1"[#"Transform Binary from CSVs3"]{3}))

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

And here is when pitfall #9 have manifested itself. We’ve imported many CSV files from a folder, and Power BI / Power Query was only looking at the first file to determine which columns to expand. Apparently, it wasn’t sufficient to detect the sub-products in the fourth file.

How to combine a folder with varying columns in the files

When we have multiple files with varying column names, we would want to expand all the files into a unified table. Fortunately, there is a way to do it in the formula. Let’s see how we can build the solution.

Go one step back in APPLIED STEPS, before expand table column step. Select the table column, right click on its header, and click Drill Down.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

The result is a list of tables:

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Now, let’s wrap the formula with List.Transform which will convert each table to its list column names.

= List.Transform(#"Removed Other Columns1"[Transform Binary from CSVs3], each Table.ColumnNames(_))

The result is a list of lists. Each item in the main list is a list of column names from the corresponding table.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Now let’s wrap the formula above with List.Union:

= List.Union(List.Transform(#"Removed Other Columns1"[Transform Binary from CSVs3], each Table.ColumnNames(_)))

The result is a unified list of all the column names:

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Now, we can use the List.Union formula as a second argument of the original Table.ExpandTableColumn

= Table.ExpandTableColumn(#"Removed Other Columns1", "Transform Binary from CSVs3", List.Union(List.Transform(#"Removed Other Columns1"[Transform Binary from CSVs3], each Table.ColumnNames(_))))

And voila, all the columns are detected and expanded.

So, when I’ve added the column Discount in Tokyo.csv it is detected and expanded on refresh, without further changes.

Pitfall 9 in Power BI and Power Query in Excel - Expand Table Columns

Conclusions

Power Query Formula Language (M) can be your friend. You can use List.Difference and List.Union on Table.ColumnNames as shown above to avoid the ninth pitfall, and automatically detect new columns when you expand table columns.

Coming Next: The 10th and last pitfall – Remove Duplicates.

45 comments

  1. Julian Reply

    When it comes to Power Query, personally I would say your posts are the most valuable. I leaned from you a lot of stuffs I never knew. I do appreciate it.

  2. Gil Raviv Post authorReply

    Thank you Julian. Perhaps you are ready to share your knowledge on DataChant. Want to be a guest author?

  3. Pingback: The Ten Pitfalls of the Data Wrangler (aka the Wielder of M/Power Query) - PowerPivotPro

  4. Lisa S Reply

    Can the List.Difference be used to exclude more than 1 column? I find in many cases that I have multiple columns that I want to exclude from expansion.

    • Gil Raviv Post authorReply

      Sure. You can add multiple columns as the second argument: List.Difference(all_columns, {“col1”, “col2”})

  5. cocis48 Reply

    Gil,
    Excellent posts just started reading and your ability to show detailed steps is amazing.
    I found a small typo at the top ‘expend’ vs ‘expand’ “When you click on the expend button, the tables in the relevant column will get expended.” <–

    • Gil Raviv Post authorReply

      Thank you for the great feedback, and for detecting the typo. Fixed 🙂 Hope you will find the posts useful.

  6. Eiki Reply

    Hi, Gil

    Nice trick to solve the annoying error. I tried to reconstruct what you did here but couldn’t get expected result, auto refreshed columns. When I try to see the column names from thes list of Tokyo after typing List.Transform(…), I only see 3 column names out there. I think this should be because of “Transform Sample Binary from CSVs3” query where the Source hardcoded [column = 3] that prevents users from seeing the other 2 columns when the first file with 3 column names was assigned for sample file…

    Just wonder if I missed something or not

    • Gil Raviv Post authorReply

      Thank you for sharing. You may try removing the [column = 3] altogether and see if that will create the right number of columns.

  7. funlife Reply

    In above post you have shown Table.ExpandColumns with merge queries. I had a query in which I used group by and selected all rows based on column. Then I used Table.Transpose function on column containing All Rows with “table” link. I now wanted to expand it and was using List.Difference(Table.ColumnNames… but it did not work. Can you please help.

  8. Steve Reply

    Respected Gil Raviv. Wonderful post as I was expecting similar solution for a problem mentioned by “Funlife” above.

    M expression
    let
    Source = Excel.CurrentWorkbook(){[Name=”StateSchoolRanking”]}[Content],
    State.Group = Table.Group(Source, {“State”}, {{“AllRows”, each _, type table}}),
    State.Rank = Table.AddColumn(State.Group, “fnRanking”, each fnRanking([AllRows], “Strength”, “Residential”, “Grd.GQPI”, “Grd.Average”, “Grd.QPI”, “Grd.Percent”, “State.Rk”)),
    #”Works” = Table.ExpandTableColumn(State.Rank, “fnRanking”, {“SchoolCode”, “BlockCode”, “DistCode”, “Block_E”, “District”, “Strength”, “School”, “Management”, “Area”, “Residential”, “Medium”, “HM_Name”, “Mobile”, “eMail”, “Block”, “No.01”, “Grd.A+”, “Grd.A”, “Grd.B+”, “Grd.B”, “Grd.C+”, “Grd.C”, “Grd.NC”, “Grd.Tot”, “Grd.Marks”, “Grd.Average”, “Grd.QPI”, “Grd.Pass”, “Grd.Percent”, “Grd.GSr”, “Grd.GQPI”, “Grd.Boy.A+”, “Grd.Boy.A”, “Grd.Boy.B+”, “Grd.Boy.B”, “Grd.Boy.C+”, “Grd.Boy.C”, “Grd.Boy.NC”, “Grd.Boy.Tot”, “Grd.Boy.Marks”, “Grd.Boy.Average”, “Grd.Boy.QPI”, “Grd.Boy.Pass”, “Grd.Boy.Percent”, “Grd.Boy.GSr”, “Grd.Boy.GQPI”, “Grd.Girl.A+”, “Grd.Girl.A”, “Grd.Girl.B+”, “Grd.Girl.B”, “Grd.Girl.C+”, “Grd.Girl.C”, “Grd.Girl.NC”, “Grd.Girl.Tot”, “Grd.Girl.Marks”, “Grd.Girl.Average”, “State.Rk”}),
    #”NotWorking” = Table.ExpandTableColumn(#”Works”, “fnRanking”, List.Difference(Table.ColumnNames(#”Expanded fnRanking1″), {“State”,”AllRows”}))
    in
    #”NotWorking”

    In the step #”Works” I am expanding columns from function “fnRanking”

    I am trying similar thing using Your method so that I can mention only columns not to be displayed
    Thank you in advance

    • Gil Raviv Post authorReply

      Can the function fnRanking return different column names for each call in Table.AddColumn? If not, you can apply this line, which gets the column names from the first result of fnRanking, and excludes Column1 and Column2, instead of #”Works” above:
      #”Works” = Table.ExpandTableColumn(State.Rank, “fnRanking”, List.Difference(Table.ColumnNames(State.Rank[fnRanking]{0}), {“Column1”, “Column2”}))),

  9. Steve Reply

    Sir you are simple brilliant. Your code works like a charm.
    Sir just one more doubt
    How to find minimum or maximum value in a row (record) and also column header
    Eg: Name Lang1, Lang2, Lang 3, Mat, Science, Social
    xyz 70, 80, 65, 55, 82, 90
    abc 75, 65, 85, 100, 80, 60
    For xyz min is 55 (Mat), max is 90 (Social)
    For abc min is 65 (Lang2) max is 100 (Mat)
    Thank you sir

    • Gil Raviv Reply

      Hi Steve,

      The fastest place to get these answers is the Power BI Desktop forum. Since your question is not related to this blog post, it can take me a while till I answer here.

    • Gil Raviv Post authorReply

      Probably the best way is to Unpivot all columns but Name, and then use DAX/Measures

  10. Luke Reply

    Hi there,

    I toiled over this myself for hours, but this is exactly what I was looking for thank you.

    However, in my situation I have over 80 columns, and for each column it seems to have to pull all the data across, and so is really really slow. When I remove all but the top row in the union list, it loads all the data once and when keep two rows does it all twice. As that increases the time exponentially increases, so by the time I get to 80 rows it just doesn’t finish.

    Any advice?

    Thanks

  11. Afzal khan Reply

    This is great site for power query lovers… Pitfall series is great one .. thanks for the efforts

  12. tyw214 Reply

    Hello! I am trying to do the List.Difference trick, but it’s throwing an error says we cannot convert … to type Table. 🙁

  13. Nigel Barrett Reply

    I am having the same problem as tyw214. I am loading an Excel workbook and loading the range from one worksheet.
    Below is the is the code I am using. The commented out bit works fine on the number of columns can change.

    Source = Excel.Workbook(File.Contents(File_Path), null, true),
    #”Removed Other Columns” = Table.SelectColumns(Source,{“Data”}),
    MyTable = #”Removed Other Columns”,
    #”Expanded Data” = Table.ExpandTableColumn(MyTable, “Data”, Table.ColumnNames(MyTable[Data])),
    // #”Expanded Data” = Table.ExpandTableColumn(MyTable, “Data”, {“Column1”, “Column2”, “Column3”, “Column4”, “Column5”}),

    • Gil Raviv Post authorReply

      Hi Nigel,
      I think you are missing the index for the table object in the row you would like to use to expand all tables. MyTable[Data] returns a list of tables (all the tables in Data). But you should provide here a single table for Table.ColumnNames function.
      So, let’s say you want to expand the Data column by the column names in the first table object in Data column. You can use the {0} portion to achieve it, as shown here:
      #”Expanded Data” = Table.ExpandTableColumn(MyTable, “Data”, Table.ColumnNames(MyTable[Data]{0}))

  14. Fredrik Reply

    Thanks Gil, this is great!. Is it possible to add a prefix to the added columns like Table.NewColumn instead of just NewColumn. When you do this via the GUI you can choose the prefix.

  15. Jack Reply

    Hi Gil,

    Thanks for this really useful post.

    I have a basic working knowledge of M and have followed your instructions in the this post.

    My two tables that I am combining each have about 100 columns (with different headers in each), most don’t have data in them (at this development stage). I am finding that all columns (both with and without data) of “Table 1” are pulling in to the ‘final’ dataset but only about 5-6 columns of “Table 2” are being pulled in.

    I have placed data into one of the columns not being pulled in to test if it is blanks that are excluded – this column remains excluded. Have I reached a column limit or something like that?

    Thank in advance for any light you could shed.

    Jack

  16. retamapark Reply

    Hi Gil,

    I am trying to reproduce the steps above in “How To Combine A Folder With Varying Columns In The Files” using Power Query in PowerBI, April 2019. The steps do not correlate. Is this due to the new version? Thanks.

    Steve Ross

  17. Daniel Huang Reply

    When I do this on a folder with only a few files it works well, but I have a folder that will result in 100+ files, It throws me a DataSource.Error: SharePoint: Request failed. Have you encountered this issue?

  18. Daniel Huang Reply

    Hi,

    Hi I am using the How To Combine A Folder With Varying Columns In The Files, it works on folders with very few files, but when I try with a folder with 50+, it throws me a DataSource.Error: SharePoint: Request failed. Is this a size issue? Any way to resolve?

  19. Steven Elster Reply

    Since I reuse the following, over and over, I am wondering if I can turn it into a function that could be invoked via add column, in power bi? Since I have many excel files to process and with different numbers of columns, is there a way to modify the following so that it would work with all columns that a spreadsheet might contain?

    #”Expanded table.Data” = Table.ExpandTableColumn(#”Kept First Rows”, “table.Data”, {“Column1”, “Column2”, “Column3”, “Column4”, “Column5”, “Column6”, “Column7”, “Column8”}, {“table.Data.Column1”, “table.Data.Column2”, “table.Data.Column3”, “table.Data.Column4”, “table.Data.Column5”, “table.Data.Column6”, “table.Data.Column7”, “table.Data.Column8″}),

    Should this approach work although modified? Is MyTable a user defined variable?
    #”Expanded Data” = Table.ExpandTableColumn(MyTable, “Data”, Table.ColumnNames(MyTable[Data]{0}))

    I get the feeling that you might suggest somehow doing something like this, but I’m not sure how to do it.

    = Table.ExpandTableColumn(#”Removed Other Columns1”, “Transform Binary from CSVs3”, List.Union(List.Transform(#”Removed Other Columns1″[Transform Binary from CSVs3], each Table.ColumnNames(_))))

  20. nickburns42 Reply

    I’ve noticed that there is a bit of a performance hit when using the List.Union() as opposed to the generated list of columns. Not sure why.

  21. Mahesh Reply

    Hi Gil,

    Thank you so much for this post

    You solved my very old problem of expanding columns

    You are just amazing!!!!

    From now on, I will follow you for all power query related questions

  22. DataShark Reply

    “The highlighted button appears in a variety of scenarios in Power BI and Power Query in Excel.”

    Could you explain some occasions where these buttons appear? I can see those buttons with tables in one database but not the other and don’t understand how I could make them appear.

  23. Luis J Azuaje Reply

    Mr Raviv… this was an excellent trick ( Varying Columns)… Just your book in Amazon a week ago… and I will follow your exercise. Do you have a blog so that i can post a problem where i am trying to unwind a very messy unstructured data?

  24. Cinesource Reply

    Mr Raviv… this was an excellent trick ( Varying Columns)… Just your book in Amazon a week ago… and I will follow your exercise. Do you have a blog so that i can post a problem where i am trying to unwind a very messy unstructured data?

  25. Benson Reply

    Thank you Gil, its has been an exciting experience studying your book. I have learnt alot from it already.

  26. Ben Reply

    Amazing solution! I got this to work in desktop but am having trouble duplcating it in the service in a dataflow. It works until the List.Union portion and then it get stuck and runs out of memory. This is my code:

    let
    Source = #”SharePoint Folder New”,
    #”Filtered Hidden Files1″ = Table.SelectRows(Source, each [Attributes]?[Hidden]? true),
    #”Invoke Custom Function1″ = Table.AddColumn(#”Filtered Hidden Files1″, “Transform File (2)”, each #”Transform File (2)”([Content])),
    #”Renamed Columns1″ = Table.RenameColumns(#”Invoke Custom Function1″, {“Name”, “Source.Name”}),
    #”Removed Other Columns1″ = Table.SelectColumns(#”Renamed Columns1″, {“Source.Name”, “Transform File (2)”}),
    #”Drill down” = List.Union(List.Transform(#”Removed Other Columns1″[#”Transform File (2)”], each Table.ColumnNames(_)))
    in
    #”Drill down”

    The only difference from desktop is that when you drilldown in the service it surrounds the custom function with #” “. This is the code generated when drilling down in the service when creating a dataflow”:

    #”Removed Other Columns1″[#”Transform File (2)”]

    And in desktop it is: #”Removed Other Columns1″[Transform File (2)]

    Could this be the reason why it’s not working?

    Any direction much appreciated!

    Thank you

    • Gil Raviv Post authorReply

      Thank you for sharing the feedback. List.Union is a memory-intense operation. When you work with Dataflows, you can try moving the List.Union into a separate dataflow that will operate separately. Try to refresh less frequently then the main dataflow. If it doesn’t work, consider adding more memory to your dataflow or move to Premium. Are the values unknown to you? How often do they change? Ideally, you will check them only once in desktop, and then apply them statically in dataflows.

Leave a Reply