Book Excerpt: Loops, Recursions, Fibonacci and more from Chapter 9: M #PowerQuery

The draft of my book is almost ready (I even have few Excel formulas that are showing me, that I have already wrote 93% of the books, and a total of 359 draft pages so far). This weekend, after finishing 44 pages on M, I wanted to share with you, in this post, a 3-page sneak peek of Chapter 9. You can pre-order the book on Amazon here.

Excerpted from the draft manuscript of Collect, Transform and Combine Data using Power BI and Power Query in Excel (Pearson Education, 2018).

Loops

If you have a software programming background, you may have been wondering why M is missing built-in loops. The answer is not simple. M was designed to perform sequences of transformation, and iterations are already built-in into many transformation steps. For example, to replace a value in a column, you don’t need to write code that will iterate over each of the rows in the table. M does the iteration for you when you apply any of its built-in Table functions.

Nevertheless, there are some special cases, where you would need to create an explicit logic, and iterate over a sequence to aggregate data. One of the most common scenario, is when you import data from websites, and need to use a page number, or an offset to import a small portion of the data. To be able to paginate over the entire dataset, you would need to “loop” over a range of numbers.

To perform the pagination, you can start with a custom function that imports a certain page, by its number. Then you can create a list of numbers on a separate query, for example:

= {1..10}

Next, you can convert the list into a table, and invoke the custom function in a new column, to import the different pages (You can do it via Invoke Custom Function in Add Column tab). Finally, you can expand the new column, and the table objects, will be appended together.

(Hi, it’s Gil. This comment is not part of the book, but check out the Star Wars Jedi webinar here, that will guide you through some cool web pagination in Power Query).

Recursions

Another way to repeat a sequence of transformation steps, is via recursions. Recursions in programming languages enables you to resolve complex computation tasks, by a divide-and-conquer approach. Here is a classic example: Fibonacci numbers.

A Fibonacci number characterized by the fact that every number except 0 and 1, is the sum of the two preceding numbers in the sequence. So, 0+1=1 will be the next number, then 1+1=2, then 1+2=3, then 2+3=5, then 3+5=8, then 5+8=13, and so forth.

In M, you can implement a recursion, using the @ sign to re-invoke the function, inside the function, and avoid the inner scope limitations. Here is the implementation of the Fibonacci recursion in M.

The use of recursions is not recommended in M. The implementation of recursions is taxing, and will consume high memory in any programming language.

List.Generate

List.Generate enables you to implement endless types iterations, including nested loops. This function generates a list of values, from an initial state, and a set of functions. Here is the definition of List.Generate:

List.Generate(initial as function, condition as function, next as function, optional selector as nullable function) as list

Let’s explain the inner-working of this function, using a simple example that returns the Fibonacci sequence below 100: {1, 2, 3, 5, 8, 13, 21, 34, 55, 89}

The List.Generate starts with an initial function that returns a record of previous and current numbers. Previous is set to 0, and Current is set to 1. It’s second argument, the condition function (each [Previous] + [Current] < 100), will make sure that the list is being generated, if the condition is true. When Previous and Current will reach a number above or equal to 100, the generation of the list will end.

The third argument is the next function, which defines the transition from the current state to the next, and progresses the values in Current and Previous fields of the record. So, in the initial state, Previous was 0 and Current was 1. In the next phase, Current will advance to [Previous] + [Current] which is 0+1=1, and Previous will be assigned as 1. In the next step, Current will be assigned as 1+1=2, and Previous will be assigned as 1. Next, Current will be assigned as 2+1=3, and Previous as 2, as so forth.

The fourth argument, assigns the element in the output list during each iteration, the outcome is the Fibonacci sequence.

List.Generate enables you to resolve advanced challenges, and work-around the lack of loops. Nevertheless, you should always try first to use the built-in table functions to implement the iteration logic, since List.Generate, with its lazy evaluation, may be slower, and consume more memory. Still, on small datasets, you should feel comfortable in using it. To learn more examples on List.Generate, such as nested loops, cursor-based pagination, and matrix multiplication, go here.

List.Accumulate

The List.Accumulate function accumulates a result from an input list. Starting from the initial value seed, this function applies an accumulator function, and returns the final aggregated result. You can use this function to iterate over elements in a list, and aggregate the result.

Here is the definition of the function:
List.Accumulate(list as list, seed as any, accumulator as function) as any

Let’s explain the different arguments, using a simple function that goes over a list of incremental numbers, and sums them up.

The result of this expression is 10. Let’s explain step by step. The function starts with the input list of {1,2,3,4}, and 0 as the initial state. Then in the first step, which is implemented in the accumulator function, it receives the value 0 as state, and the value 1 as current, and sums them. In the next step, value 1 is the state, and the value 2 is current. The result, 1+2=3 is returned as the next state, and this time 3 is the third item in the list. 3+3=6. Finally, 6 is propagated as state, and 4 is current. The result is 6+4=10.

Here in a List.Accumulate implementation of Fibonacci number, that returns the Fibonacci number 89. Unlike List.Generate, you cannot define a dynamic break condition. The function will iterate over all the elements in the list. We initialized the list to numbers starting from 0 to 9, but the actual values of the list don’t take part in the calculation.

In the preceding example, you can see that seed is defined as a record of two numerical fields, PreviousNum and CurrentNum. The accumulator, returns an updated version of this record, by passing the CurrentNum value to PreviousNum, and summing up CurrentNum and PreviousNum in the new value of CurrentNum.

In Chapter 11, we will apply this function to perform iterative value replacements on the same column. To learn additional implementations of List.Accumulate go here.

Hope you liked the 3 out of 44 pages of Chapter 9 of my book.

8 comments

    • Gil Raviv Post authorReply

      Thank you Afzal. The book is planned to be out in September.

  1. pato lobos Reply

    Book title is wrong? It should be “… using Power Query in Power BI an Excel” or do I need more sleep (just 4 hours today)….

  2. André de Lange Reply

    Strange.. I was just researching recursion in PP/PQ last night, and this morning this blog post appeared in my inbox! Are you sure you don’t work for Facebook…? 😉 Placed my pre-order.

  3. jeffrey Weir Reply

    Great content. This sentence doesn’t make sense to me: The third argument is the next function, which uses defines the next state, and progress the values in Current and Previous fields of the record. Did you mean something like …the use of which defines the next state, and progresses the values in…?

    • Gil Raviv Post authorReply

      Thank you Jeffrey for the correction. I fixed that sentence: “The third argument is the next function, which defines the transition from the current state to the next, and progresses the values in Current and Previous fields of the record.”

Leave a Reply