Power Query List.Accumulate – Unleashed

Following a reader’s request, today we will unleash the power of List.Accumulate.

The official documentation on List.Accumulate here was very confusing for me:

ArgumentDescription
listThe List to check.
seedThe initial value seed.
accumulatorThe value accumulator function.

OK. I understood the list argument quite right, but the other two arguments were post-nuclear-bomb science fiction.

So let’s try to understand the example that was used in the official page:

Oh, so the code above sums up all the elements in the list. That is nice. Let’s make sure the calculation was done right. 1+2+3+4+5 = 15. Yes, this is right 🙂

To prove that the code works, we can paste it to the Query Editor after creating a blank query:

List.Accumulate Unleashed in Power Query and Get & Transform in Excel and Power BI

But wait a minute, why should we bother using List.Accumulate when we can already sum up the numbers in a list using the function List.Sum?

List.Accumulate Unleashed in Power Query and Get & Transform in Excel and Power BI

So there must be something more powerful here… We will try to reveal it today and prove that List.Accumulate: More than meets the eye…

Before we start, let’s try to understand the seed and accumulator. In the example above, zero is used as the value for the seed argument, and a function that receives state and current is used as the accumulator.

The terms state and current can better help us to understand List.Accumulate.

The state is an object in Power Query which can be of any type. In our example it is a number. Defined as zero in its initial state (This is the seed, the initial state), and transformed on each iteration of the list. The value in state will increase by the number that is represented by current at each step (current represents the current element in the list). To make sure we understand this example, let’s review each step.

In the first step state is zero (as defined by the seed argument) and current is 1 (the first element in the list). During this step Power Query implements the function, and performs the sum operation state + current.

Iteration 1: state = 0, current = 1, Assign new value to state = state + current = 1

Iteration 1: state = 1, current = 2, Assign new value to state = state + current = 3

Iteration 1:state = 3, current = 3, …

state = 6, current = 4

state = 10, current = 5, Return state+current as the result = 15

 

Now that we better understand the flow, here are few important insights about List.Accumulate.

You must use two-argument function as the accumulator.

The seed argument (or as we call it the initial state), should be of the same type as the first argument of your accumulator function and of the same type as of the function’s output.

Let’s try to demonstrate it with pseudo code:

So in the pseudo code above, myList is a list whose elements are of type TYPE_X (e.g. a record, list, number or text), and our initial state is of type TYPE_Y, we must define our accumulator function as a function that receives a state argument whose type is TYPE_Y and a current argument whose type is TYPE_X, and returns a value whose type is TYPE_Y.

If you are completely lost at this stage, that’s fine. Perhaps few examples will List.Accunmulate to sink in.

First exmaple – implementing List.Max

This is still a basic example. We are warming up. This function will return the number 100 which is the maximal number in the list {1, 2, 100, 4, 5}.

But again, it is not so useful as we already have List.Max.

Let’s try to use List.Accumulate to calculate both the max and the min values on the same “pipeline”.

Here is the code:

The result is a record: [min=-40, max=100]List.Accumulate Unleashed in Power Query and Get & Transform in Excel and Power BI

Let’s elaborate what we did here. Unlike the first example where we used a number as a state that will “remember” the maximal number in the list at each iteration, in this example we needed to “remember” two numbers as we go over the list. To pass two numbers as the state for List.Accumulate we defined a record whose keys are min and max. The accumulator function also calculates the new record of min and max according to the previous min and max and the current number in the list.

You can scale up and reuse the example above to perform multiple “online” calculations on your data with a single pass. Online means that you don’t require to store all the data to perform the calculation, and that you can gradually get your result by calculating an interim result at each step based on the previous results and the “online” data of the current step.

For example, in the code below we perform multiple calculations on the data which is serialized in myList. The calculations are performed at each step of the list iteration by the functions fnForResult1, fnForResult2, etc.

And here is a useful function that uses List.Accumulate to create a list of column names from a list of their column indices.

An additional way to think about List.Accumulate is as a pipeline of machines that transform your data in a sequence of operations. Each machine in its turn, gets the data from its previous machine, transforms it, and passes it to the next machine in the pipeline. The definitions for each machine are serialized in the list, so each machine can be customized to perform a unique flavor of the same transformation. The processed data is initially defined as the seed argument, and it will “evolve” into the final product when the last machine will apply its transformation.

In conclusion, List.Accumulate is quite a powerful function in Power Query, and I hope that be now I motivated you to seek for interesting new scenarios to explore it.

Looking forward to hearing from you how you use List.Accumulate.

7 comments

  1. Maxim ZelenskyReply

    I think that real power of this function could be revealed when using it with not-so-usual tasks, for example, when result needed is record or table. Or, when constructing some kind of progressions. For example, to calculate a progression of Sum(X/n) where n = 1 to i,
    = List.Accumulate({1,1,1,1,1}, [X=0,n=0],(state, current) => [X= state[X]+current/(state[n]+1), b=state[n]+1])[X]
    gives us 2,2833333333333332
    When it comes to simple operations, it is easy to replace List.Accumulate with one-two simple functions, for example, [min=List.Min(myList),max=List.Max(myList)]

    It is really powerful when it comes to ITERATIONS

    Thanks, Gil, it is nice inspiration

  2. ImkeFReply

    Very nice explanation – thanks a lot Gil!

    I was always thinking of List.Accumulate and List.Generate as recursive functions, because this is actually the only reason I would use them for: As they are costly in terms of performance (due to the iterative nature) and (at least for me) a bit tricky to edit/write, as they don’t reveal their results stepwise like M normally does.

    What I didn’t realize before is that List.Accumulate can actually return more than one value. Could it also return lists or just one record only? Wondering then when to use List.Generate and when List.Accumulate…

    • DataChantReply

      Thank you Imke. List.Accumulate can return lists of it is invoked with lists as its initial state. The use of a record is not mandatory. You can use any data type.
      The main differences between List.Accumulate and List.Generate is that the latter gets a list and returns a kind of an accumulation of that list, while the former gets instructions to generate a list from scratch.
      BTW, I don’t think that these functions are implemented by Microsoft as recursions (too costly), but as iterations. That is why it is recommended to use these functions instead of recursions whenever it is possible.

  3. Pingback: BIN2DEC in Power Query using List.Accumulate | Excel Inside

  4. Maxim ZelenskyReply

    oops, there an error in my code
    should be
    = List.Accumulate({1,1,1,1,1}, [X=0,n=0],(state, current) => [X=state[X]+current/(state[n]+1), n=state[n]+1])[X]

Leave a Reply