Column Reordering Disorder – Pitfall #4 in #PowerBI #PowerQuery


Order is a crucial force in life, especially if you are a data analyst. But as we will see today, the misuse of order might cause an unexpected chaos. In today’s post, we will discuss the chaotic nature of column ordering in the Query Editor of Power BI and Power Query / Get & Transform in Excel, and we will learn how to avoid the fourth pitfall of column ordering.

You can follow the Pitfalls series here, and subscribe to DataChant below to get the file sample.

Background

In this article, we will discuss the reordering of columns. We’ll start with the rationale of column reordering (in this section), and will then describe the mistake we do when we are using it (The next section).

Note: To avoid confusion, we will not discuss the sorting of rows by columns, but the rearranging of columns’ positions in the queries and tables.

Reordering of columns is not supported as part of the Data Modeling (PowerPivot), but can be achieved in the data preparation phase in the Query Editor of Power BI and Excel Power Query / Get & Transform.

To reorder columns, you can simply select the columns and drag and drop them to their new position.

Pitfall 4 - Reorder columns in Power BI and Power Query in Excel

(This is how the dragging experience looks like):

Pitfall 4 - Reorder columns in Power BI and Power Query in Excel

Why do we need column reordering?

Column reordering in the Query Editor is quite useful. Here is a summary of the scenarios that support column reordering, and those that don’t support it.

In Excel, you can load the data into a table in the worksheet and control the order of the columns with the Query Editor. You can also control the order of the fields in the PivotTable / PivotChart field list.

Pitfall 3 in Power BI and Power Query in Excel
Using the Query Editor in Excel to reorder the columns works in Excel PivotTable Fields 🙂

Unfortunately, in Power BI, you will not maintain the columns’ order in the Fields list. Power BI automatically sorts the fields by alphabetic order.

Pitfall 3 - Column reordering in Power BI
Using the Query Editor in Power BI to reorder the columns doesn’t affect the Fields list, which is alphabetically ordered 🙁

The only view in Power BI Desktop where you can see the correct column order is the Data view.

Pitfall 4 in Power BI and Power Query in Excel
Column reordering is maintained in the Data view

Where else can we take advantage of column reordering? Inside the Query Editor to help us to author further steps. For example, when you want to merge multiple columns, or remove duplicates, you may reorder the relevant columns, to ease your multi-column selection.

You can also reorder a custom column for debugging purposes, and see how the calculated values in the new column are affected by values in other columns.

The screenshot below illustrates the need to reorder columns to debug custom columns. In the screenshot below, I created a new column which adds 1000 to the value in Column2.

Pitfall 4 in Power BI and Power Query in Excel

Unfortunately, the new custom column, is always added at the end of the table, which makes it quite challenging to move the new column to the left, if we have dozens of columns. But we always move the custom column to a better location, to help us testing our calculation.

Note: Vote here for this idea: When we add Custom Column, allow us to choose its position.

Pitfall 4 - Mistakes we do in Power BI and Power Query in Excel - Reordering of columns

Trick: If you have hundreds of columns in the preview of the Query Editor, and you want to move the last column to the right, you can move that column, one step to the left, and then edit the formula bar, to place the column in its right position. This trick, is a mistake, as any naive reordering you do in the Query Editor. And now, finally, let’s explain why.

Pitfall #4

So now when we know all the great advantages of column reordering in the Query Editor, let’s demonstrate how we always take a big risk when we use it as-is.

Let’s say, we want to reorder the last two columns as the second and third columns.

Pitfall 4 - Reorder columns in Power BI and Power Query in Excel

Following the reordering of the last two columns, Power Query will auto-generate the formula which is needed for the reordering step. If you have the formula bar open (Which you do, if you have avoided Pitfall #1 here), you can now see that formula refers to ALL of the columns names, and this is the pitfall!!!

Imagine that our external data source always includes the last four years. When we created the query, we had columns 2013, 2014, 2015 and 2016, but in 2017, we will have 2014, 2015, 2016 and 2017. Column 2013 will not exist anymore. 

And the result –  The auto-generated code will fail to refresh.

The example we use here is relatively naive. But let’s continue to use it to illustrate the challenge at hand. So, how can we avoid the pitfall, and create a robust query that will not fail to refresh when the data changes.

We can always avoid the reordering altogether. In our naive case, we can simply select the columns Region, RegionManager and HeadCount (Using the CTRL key, as the columns are not adjacent), and apply Unpivot Other Columns to have a table with the Year and Amount.

Tip for new-comers: If you have years or dates as column names in your tables, you are missing a great potential of insights. Unpivot your table (Here is my unpivot series).

How can we fix the formula below to prevent refresh errors?

We can add the optional attributes MissingField.Ignore, or MissingField.UseNull which will avoid refresh errors (here is the official Table.ReorderColumns page, which doesn’t mention MissingField.Ignore).

Here is the fixed formula that will avoid refresh errors and will ignore column 2013:

Hurray, the refresh didn’t fail. But the reordering ignores column 2017.

On many scenarios, you would prefer to keep having the missing column (especially if you need it in later transformation steps). To keep the missing columns you can use MissingField.UseNull, as shown in this formula:

With MissingField.UseNull we were able to keep column 2013 with null values. Still, the column reordering wasn’t achieved properly (Column 2017 is before 2014).

Could we somehow reorder only a subset of the columns, and keep the remaining columns intact?

Ideally, we would use the following formula to move the specified columns to the start of the table:

But the formula above has no effect on the column order. It remained the same.

Here is another failed attempt to specify a subset of the columns:

You can see that HeadCount remained at the end, with 2014 as the new last column. The fact that we need to refer to column 2014 is not reasonable, given the potential refresh failure.

Here is another failed attempt, which is mainly bad, as we refer to two columns.

 

So what is the rationale of using Table.ReorderColumns with a subset of the columns? To shuffle and switch between columns. So for a table with columns Col1, Col2, Col3, Col4Col5, Table.ReorderColumns(MyTable, {“Col2“, “Col1“}) will switch between Col1 and Col2, and Table.ReorderColumns(MyTable, {“Col5“, “Col3“, “Col2“}) will reshuffle the three columns, but will leave Col1 and Col4 untouched.

While this method can be handy to shuffle between few selected columns, it is useless for moving few columns over a long subset of columns, as we need to refer to all the column names in the code.

So here is the ideal (yet advanced) solution that requires the use of a

The Robust Query

Finally, we have reached our best solution, that will allow us gain the following objectives:

  • Reorder the specificed crucial columns
  • Maintain the original order of the unspecified columns
  • Position the newly ordered subset in ANY specified position
  • Don’t refer to any unnecessary columns

By fulfilling the objectives above, the fourth pitfall is avoided.

And here is the solution:

Create a blank query with the code below, and rename the query to FnReorderColumnsSubset.

You can copy the code from here, or download the PBIX sample if you are a DataChant subscriber (Subscribe form below).

The function receives the table, a subset list of the ordered column names, and a zero-based offset. The main transformation is Table.ReorderColumns, with a dynamic list of columns that is created using a combo of List.InsertRange and List.Difference.

Here is how you can use it to solve our challenge:

To arrange the first N columns in your table, use this formula, with an offset of zero. In our case we will use Region, RegionManager and HeadCount with offset of zero.

To be more precise, in our example, we could also just arrange RegionManager and HeadCount in after the first column (offset 1):

You can use different offsets to position the specificed ordered columns anywhare in the table. In the example below, I positioned Region, RegionManager and HeadCount in offset 2 (Third column).

Subscribe to DataChant to get Pitfall 4 Sample.PBIX file with the example, and query.

Subscribe ToDataChant

Join our mailing list to receive exclusive content that was used in the preparation of this article.

Thank you for subscribing to DataChant. An email will be sent to you shortly with more details.

 

 

11 comments

  1. Dominik Petri Reply

    Another great help to build robust queries! Thanks for sharing!
    Check the second to last screenshot: You call a function called fnReorderTABLEBySubset. Shouldn’t it be fnReorderCOLUMNSBySubset?

    • Gil Raviv Post authorReply

      Hi Dominik,
      Thank you for the feedback. You are right. It should be fnReorderColumnsSBySubset. I fixed the it above.
      Hope you find the next six pitfalls as helpful 🙂

  2. Pingback: Removing/Selecting Columns - Pitfall #5 in #PowerBI and #PowerQuery - DataChant

  3. Pingback: Column Renaming and List.Zip To The Rescue - Pitfall #6 in #PowerBI #PowerQuery - DataChant

    • Gil Raviv Post authorReply

      Hi Julian,
      The file is a PBIX file, not a ZIP one. Try to open it with Power BI Desktop.

  4. Julian Reply

    Yes, it’s truly a PBIX file with .zip file extension. I could open it with Power BI Desktop as you said.

  5. Pingback: The Ten Pitfalls of the Data Wrangler (aka the Wielder of M/Power Query) - PowerPivotPro

    • Gil Raviv Post authorReply

      Thank you Drigloff. I am glad you find it useful.

Leave a Reply