Power Query Transpose Bomb

Earlier in my days as a cyber security software developer, there was a term called Zip Bomb here, for malicious Zip files that were meant to consume excessive memory when they are decompressed, and as a result create hangs and crashes to the process that decompressed them. In today’s blog post, you will learn how you may unintentionally create similar “bomb” in Power Query using Transpose.

Power Query Transpose Bomb

Transpose in Power Query

The Transpose transformation is available in the Query Editor’s Transform tab, and as an M function Table.Transpose.

Transpose is a very useful transformation. It enables you to transpose your table, and switch between your columns and rows. So, each cell in coordinates X and Y in the original table, will now be in coordinates Y and X.

A common scenario when you may want to use Transpose, is when you need to apply transformations on rows or headers, but you are missing the UI option to do so. For example: There is no Fill Right in Power Query, only Fill Down. Transpose will help you to apply Fill Down instead of Fill Right, and then you can transpose the table back to its original state. Here are some of the most popular techniques that were published on DataChant, and require a transpose step: Unpivot any nested table here, and Automatically remove empty columns here.

Unfortunately, when you work with very large tables that contains a lot of rows, transposing these tables will lead to trouble. According to this support page, Power Query cannot handle more than 16,384 columns. So, if you have a table with more than 16,384 rows, transposing that table will lead to errors, when you will try to load it to your Power BI report.

Transpose Bomb

To test the impact of transposing large tables, I created the following query:

You can easily create it using the UI as well. Start with a blank query, and the formula = {1..16384}, which will create a list with running numbers between 1 to 16,384. Then click To Table in the List Tool, Transform contextual ribbon, and finally apply the Transpose in Transform tab.

This query will create a table with the columns Column1, Column2, till Column16384, and a single row with the numbers 1, 2 … 16,384 in the corresponding columns.

When you will load the query, you will end up getting the following error after a very long load which gradually consume a lot of memory:

We were unable to import the data into the model. This could be due to the large number of columns in ‘Zip Bomb’. Try reducing the number of columns before loading again.
Microsoft.AnalysisServices.OperationException: Failed to save modifications to the server. Error returned: ‘The total number of columns in the model is 16397, which exceeds the limit of 16000.
‘.
at Microsoft.AnalysisServices.Tabular.Model.SaveChanges(SaveOptions saveOptions)

It’s interesting to note here, that Power Query engine didn’t block this operation, it was the Power Pivot engine (Vertipaq), as the term “Microsoft.AnalysisServices.Tabular.Model” is used in the stack traces. From this error, we can deduce that Power BI has a limit of 16,000 columns per table.

Note that the error message was wrong. We didn’t have a total number of 16,397 columns, only 16,384. Did Power BI added some implicit columns, and wrongly counted our explicit columns?

I repeated the load, this time loading 17,000 columns, hoping to get a different error by the M engine, is the documented limit of columns were exceeded. Instead, I got the same error, the only change was that this time I was notified of 17,013 columns. It’s the same surplus of 13 columns. Do we have an extra 13 hidden columns in every model?

One last test – Loading 15,000 columns in Power BI Desktop led to this error, after a very long load time.

Trying to load a query of 16,384 to a table in Excel (without Data Model), works quite fast. Trying to load more than 16,384 columns leads to the following error:

The result of this query is too large to be loaded to the specified location on the worksheet. Worksheets have a limit of 1,048,576 rows and 16,384 columns. Please load the query to the Data Model instead.

Don’t follow the recommendation in the error massage, to try to load the query to the Data Model. It will lead to a different error after a very long wait, and very high memory consumption.

Finally, to wrap up our Transpose Bomb – and its relevance to Zip Bomb. A very small Excel workbook or a Power BI report, with a simple query as the one above, which can be saved before you load it to the report, can “explode” and excessively consume the computer memory, when you refresh it.

A Potential Denial of Service?

This section is intended for the curious white-hat hackers among us, and for Power BI team. The following test scenario can be used for stress testing on the Power BI service. Imagine a report that uses the query above. But instead of using a static number like 16,834 in the query, we can load this number from an external data source, such as an Excel workbook on OneDrive for Business, or a public web page. Now, we can configure this number to be very small (eg.10), to load the report, and publish it to Power BI service. Next, after applying a schedule refresh, we can increase the number in the external data source to 15,000, and see how Power BI service will handle it.

Recommendations

Finally, here is my recommendations if you intend to use Transpose on reports in production enviornments.

Be careful when you use Transpose. Ensure that your tables are small enough when you apply it as part of a sequence – such as the unpivot any nested table here, or the automatic removal of empty columns here. If you need the Transpose to work only on few rows, create a reference or duplicate of the original table, then remove all irrelevant rows, and transpose.

The following diagram flow, as an example, shows the steps to normalize column names by replacing underscore with space, and capitalize each word of the headers. But as the normalization of column names should be done on the transposed headers, we duplicate the original table, and transpose the columns only.

(This diagram is taken from my draft chapter from the Microsoft Press book I am working on).

Hope you find it useful.

3 comments

  1. vossf Reply

    Nice article. I have experienced issues while using transpose, but have never tested its limits.
    One way of avoiding transposing to normalize column names is using the Table.TransformColumnNames function. It lets you specify a function to do the transformations just as you would with List.Transform.

    • Gil Raviv Post authorReply

      Thank you for sharing. I will cover this function soon. M does simplify a lot of things, when you are open to it.

  2. zedleb Reply

    Thanks for this warning Gil, it is extremely pertinent. I will have to explore alternatives to using Transpose as I’m already at over 10k columns! I was astonished that it coped with that many – to be fair.

    Great site – thanks for all your postings.

Leave a Reply