Matrix Multiplication using the Cartesian product in Power Query

Guest Author: Henk-Jan van Well

As a follow up to two older Matrix Multiplication posts here and here, I would like to share with you an alternative query for matrix multiplication using the Cartesian product-functionality within Power Query.

This post was contributed by guest author Henk-Jan van Well.  When reading through Gil’s book I got triggered by a reference on p.244 to matrix multiplication using the List.Generate(.)-function as an iterator or the M-equivalent for the good-old “for-next”-loop. This example was exactly what I needed to get a better understanding of some of the data-structures and their specific syntax in M. A bit further in his book Gil touches upon the Cartesian product in Power Query. I recalled from linear algebra that matrix multiplication can be based on the Cartesian product. Out of curiosity and for my own understanding and learning of M I created the matrix multiplication from this blog. 

The two older posts referenced above were aimed at implementing a dynamic matrix multiplication without having to bother as a user about the dimensions of the resulting product matrix. With the current dynamic arrays and their spill-behavior, available in the latest versions of Excel (365 and 2021), this is no longer an issue. 

Nevertheless, the matrix implementation in this blog differs from the two older ones by choosing dedicated list-data structures and using the Cartesian product functionality from Power Query. All the information needed to perform the matrix multiplication are embedded in the chosen data structures in combination with the Cartesian product operation, i.e. no need for defining additional helper index-variables for iterating, or to merge/join information from one matrix to the other, to be able to perform the proper underlying calculations. For users who don’t have yet the latest versions of Excel available, this solution could still be an alternative for implementing a dynamic matrix multiplication. If not, it is at least a good example to get more familiar with manipulating list-data structures in M.

Let’s start with understanding the Cartesian product. In set theory, a Cartesian product can be applied on multiple sets to produce a set of all the combinations of members from the different sets – one from each set. In Power Query you can apply a Cartesian product on any two tables by adding to the first table a custom column with a reference to the second table. Then, by expanding the new column, you will reach the Cartesian product. For a quick visual demonstration you can go to RADACAD’s blog on this topic here.

Next we have to choose the dedicated data structures to be able to apply the correct Cartesian product. Before doing so let’s have a closer look at the mathematical representation of a matrix multiplication:

Let matrix A be (m×n), i.e. m-rows by n-columns:

Eq. 1

Let matrix B be (n×k), i.e. n-rows by k-columns:

Eq. 2

The resulting product matrix C = A·B will be (m×k), i.e. m-rows by k-columns:

Eq. 3

Where each of the elements cij of the product matrix C can be written as:

Eq. 4

The careful reader might already observe the Cartesian product in this notation as each i is combined with all j’s and vice versa. Now each i represents a row in the matrix A, while each j represents a column in the matrix B.

From linear algebra you might recall that the equation for cij can be written as an inner-product:

Eq. 5

Where:

Eq. 6

and

Eq. 7


Note: ai is (1×n) matrix, while bj is (n×1) matrix, hence the dimensions of the product matrix is (1×1), or a scalar. Using the inner-product notation for cij the matrix C can be written as:

Eq. 8

From this it can be seen that matrix multiplication is a Cartesian product of inner-products of all rows from the first matrix A with all columns of the second matrix B. The total number of inner-products equals m·k.

Although the inner-product notation of the elements cij strictly define a matrix-multiplication it can be seen from the initial sum-notation for these elements in Eq. 4 that the actual calculation is similar to the SUMPRODUCT()-function in regular Excel.

Both insights will help us in our choice of the data-structures to use when implementing the matrix-multiplication using the Cartesian product in M.

Download this workbook to see Power Query’s version for this specific matrix multiplication in action.

In the workbook we have matrix A and matrix B which have been defined as two Excel tables with names MatrixA resp. MatrixB. 

 Note that the dimension of MatrixA is (4×6), while the dimension of MatrixB is (6×3).

The product matrix C will therefore be of dimension (4×3), i.e. we will have to calculate 12 inner-products to arrive at the final matrix C.

Below the M-code from the MMULTPQ-query is shown implementing the matrix multiplication using the Cartesian product.

Using the Power Query Editor we will go through the APPLIED STEPS from the MMULTPQ-query explaining the data-structures.

 The first line of code will load the Excel table MatrixA into a Power Query variable named Amat of type table.

The next APPLIED STEP A (see Figure 2 below) executes three steps at once. The most inner step Table.ToRows(Amat) will create a list of lists from the table Amat. One should realize that the list data structure in M is basically a column-vector. The Table.ToRows(.)-function will transform each row of the table Amat into a list and next to that creates a new list out of these lists. Using the curly bracket notation for lists-data structures the transformation performed by the Table.ToRows()-function is demonstrated below:

Eq. 9

The middle step Table.FromList(.) is only used to transform this list of lists into a table format, without any additional transformations of the inner-lists. The following notation demonstrates this subtle difference:

Eq. 10

Note: the use of round brackets indicate the table-structure. This table has exactly one column and each of the m rows – yes, there are only m rows (!) –  holds a list. In M each column of a table is a list. If we would extract the first column from the table above, the list of lists from Eq. 9 is returned.

The third and final step Table.RenameColumns(.) is only cosmetic as it helps me to understand what the data is actually representing within the PQ-editor.

The next two APPLIED STEPS do exactly the same using MatrixB.

 The next APPLIED STEP CartesianProduct (see Figure 3 below) adds to the first table A a custom column with a reference to the second table B. Then, by expanding the new column the Cartesian product results automatically. This Cartesian product is a table with 12 rows and 2 columns. 

Each row (which is actually a record in M) holds two elements in list-form. The first list holds a (transposed) row of matrix A and the second list holds a column of matrix B. Throughout the rows of the CartesianProduct all combinations of (transposed) rows from matrix A (in list form) with columns of matrix B (in list form) are included.
Looking at the first row/record in more detail one will observe the following:

The first element of the first record represents the first (transposed) row of the initial matrix A (in list form), while the second element of the first record represents the first column of the initial matrix B (in list form).

To access the first element of the first record directly in M one should write CartesianProduct{0}[A]. As this is a list one could access the first element of this list via CartesianProduct{0}[A]{0} = 10.

Similar when looking at the last row/record in more detail:

The first element of the last record represents the last (transposed) row of the initial matrix A, while the second element of the last record represents the last column of the initial matrix B.

To access the second element of the last record directly in M one should write CartesianProduct{11}[B]. As this is a list one could access the last element of this list via CartesianProduct{11}[B]{5} = 1.

By scrolling through each row/record one will observe that all possible combinations of rows of matrix A with columns of matrix B are present. Starting with the first (transposed) row of matrix A, each column of matrix B is added, starting with the first column of matrix B. Next the second (transposed) row of matrix A is combined with each column of matrix B, again starting with the first column of matrix B, etc.

Going back to our earlier introduced inner-product-notation from Eq. 5, we would have now prepared all possible combinations of ai and bj to be able to calculate the elements of the final product matrix C. In order to calculate the elements of product matrix C we will actually use the sumproduct-approach from Eq. 4, although in a slightly different way.

The next APPLIED STEP AxB (see Figure 6 below) executes 5 steps at once aimed at the product-part of the sumproduct-approach. The first two steps transform each column of the CartesianProduct into one big list each, using the List.Combine(.) function. Each list holds exactly 72 numbers, as each of the original 12 nested lists in them hold 6 numbers, i.e. the 6 columns from matrix A, resp. the 6 rows from matrix B. The third step creates a table out of these two lists using the Table.FromColumns(.) function. The fourth step adds a column to this new table where each element of the new column is the product of the elements on the same row of the original two columns. This new column is named AxB. As we are only interested in this last column (the product part!) the fifth and final step defines the result equal to this last column only, i.e. the final [AxB] notation at the end, which is actually a list.

The result of this step is shown below in Figure 6. Please note that the last 6 numbers 18,10,8,8,18 and 7 are exactly the elementwise products of the elements from both lists from the last record of the former step, i.e. 6×3, 2×5, 8×1, 1×8, 6×3 and 7×1.

The final APPLIED STEP C (see Figure 7 below) executes 4 steps at once aimed at the sum-part of the sumproduct-approach and reshaping the final result. The first step splits the list from the former step in 12 sub-lists of 6 elements each, using the List.Split(.) function. The specific number of 6 elements relates to the s-index in Eq. 4, or the number of columns in matrix A, which can be accessed via the Table.ColumnCount(Amat) function. The second step transforms the list of 12 sub-lists using the List.Transform(.) function into a list of 12 scalars (the sum-part of the sumproduct-approach!), where each scalar is the sum of the 6 elements in each respective sub-list as a result of applying the List.Sum(.) function as the transform function on each of the sub-lists. These 12 scalars represent the cij elements from Eq. 3, although still in list-form. We already know that the final product matrix holds 3 columns, i.e. the number of columns in matrix B, which can be accessed via the Table.ColumnCount(Bmat) function. Therefore we split the list of 12 scalars into a list of 4 sub-lists (i.e. 4 rows!) of 3 scalars each, using the List.Split(.) function in combination with the Table.ColumnCount(Bmat) function. These 4 rows are the actual rows of our final C matrix, although still in list form. Applying the Table.FromRows(.) function onto this list will result in our final product matrix C as demonstrated below.

That’s it for my guest blog. To conclude, this implementation is not about speed of processing, but about using proper data structures to handle a problem as much as possible as mathematically meant to be. I will be glad to hear your feedback and might have a challenge for you as well: as each element of the product matrix C is an inner product, i.e. a matrix multiplication itself, would there  be an opportunity for implementing the full matrix multiplication by a kind of recursion?

*****

Henk-Jan van Well – April/May 2022

[email protected]

Leave a Reply