The Fill Down Hierarchies -Solutions

In today’s blog post, I will share the simple solutions for the problem that I posted here.

For the recursion approach, stay tuned for Part 2 of this post, or download the solution workbook here.

As you recall, we had a table of backlog items with three levels of hierarchies (Scenario, Feature and Backlog Item). In each row, the source data included the unique ID of the lowest level but had blank values in the columns that represented higher levels in the hierarchy, as shown at the left table.

The challenge was to fill down the table using Power Query in Excel or Power BI and reach the format of the right table above. I also asked if you can you take your solution one step further and write it in a way that will support dynamic number of levels and different header names (For example: Continent, Country, Country, City, Street). This approach allows you to have a dynamic solution that doesn’t require any changes to the queries (except for one change – the number of hierarchies in the table).

Special Thank You

Andrew Lindfield, Alex Powers, Nick M., Mehdi Hamadi, James Fancke, Frank T, Cristian Angyal, Kunle Sopeju, Shungu, Stevej, and Jeremy Thiele – Thank you for sharing your solutions for the Fill Down Hierarchies challenge. Except one, you all nailed it. Few of you even addressed the bigger challenge and created a dynamic solution.

Let’s start with the simple solution that fills down the specific three hierarchies.

The Simple Solution

Download the workbook here to start. If you use Power BI Desktop, start from a blank report, and load the Backlogs table using Get Data. If you prefer using Excel open the workbook above and select From Table/Range in the Data ribbon.

In the Power Query Editor, you can see that the first column should be filled down.

To do it, right click on Scenario column and select Fill and Down.

If you now proceed with another Fill Down step on the Feature column, you would get wrong results. The feature F4 will be filled down and will “leak” into scenario S2. The feature F7 will be wrongly filled down to scenarios S3 and S4. The wrong results happen because the basic Fill Down transformation cannot look at the context in the Scenario column and stop filling down when the scenario changes.

So, how can we perform a smart Fill Down? It’s relatively easy. We will somehow mark the null values that should not be filled with temporary values, fill down the Feature column and then replace back the temporary values into nulls.

Let’s start with the most direct approach to do it. But first, delete the previous Fill Down. We need to take advantage of the null values in the Scenario column.

After you delete the Filled Down step in Applied Steps, select Conditional Column in Add Column tab. In the Add Conditional Column dialog box, follow these steps:

  • Enter Feature – Temp in New column name.
  • Select Scenario in Column Name drop-down menu.
  • Enter null in Value.
  • Select Select a Column in the drop-down menu below Output.
  • Select Feature in the second Output drop down menu.
  • Enter Don’t fill down in Otherwise and click OK.

You can now see that the Feature – Temp column has the values “Don’t fill down” whenever the value in Scenario column is not null, and the original value in Feature when Scenario is null.

Now we are ready for the Fill down. Select both Scenario and Feature – Temp, and right click on of their headers. In the shortcut menu, select Fill, and then select Down. You can now remove the Feature column and rename Feature – Temp to Feature.

It’s time to replace back the “Don’t fill down” values into nulls. Right click on Feature column, and select Replace Values..

In the Replace Values dialog box, enter Don’t fill down in the Value To Find, and null in Replace With. Then, click OK.

Finally, if you use Excel and intend to load the transformed table back into Excel, you would need to change the order of the columns, and reorder Feature as the second column.

The Simplest Solution – Conditional Replace Values

Now that we got the gist of it, I wanted to share the solution that was provided by Frank T. Instead of adding a new column, we could manipulate the null values using the Replace Values transformation. Here are the steps:

Import the Backlogs table into Power Query Editor. Right click on the Feature column, and select Replace Values..

In the Replace Values dialog box, enter a as Value To Find and b as Replace With. Then expand the Advanced options and select Match entire cell contents. When you are done, click OK.

We used the previous step to generate the base code we need for the conditional replace values step. Let’s look at the code:

= Table.ReplaceValue(Source,"a","b",Replacer.ReplaceValue,{"Feature"})

We will replace “a” and “b” as follows:

= Table.ReplaceValue(Source, each [Feature], each if [Scenario]=null then [Feature] else "Don't fill down", Replacer.ReplaceValue, {"Feature"})

“a” was replaced to each [Feature] which returns the current value in the Feature column.

“b” was replaced with the same condition we used in the Conditional Column step earlier.

From here, you can apply the Fill Down on the first two columns and then replace the values Don’t fill down with nulls, as we did earlier.

You can review the two solutions here.

In the next blog post we will move to the bigger challenge: How can we fill down multiple columns with any number of hierarchies and unknown column names?

9 comments

  1. Micah Dail Reply

    Sorry I’m a little late to the party, but there is a solution to the problem that can handle a dynamic number of columns and also do so without recursion. See below:

    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    Demote_Header = Table.DemoteHeaders(Source),
    Set_Row_Nbr = Table.AddIndexColumn(Demote_Header, “Row_Nbr”, 0, 1),
    Unpivot_Data = Table.UnpivotOtherColumns(Set_Row_Nbr, {“Row_Nbr”}, “Column”, “Value”),
    Good_Data = Table.TransformColumns(Unpivot_Data, {{“Column”, each Number.From(Text.Replace(_, “Column”, “”)), Int64.Type}, {“Row_Nbr”, each Number.From(_), Int64.Type}}),
    Remove_Header = Table.SelectRows(Good_Data, each ([Row_Nbr] 0)),
    Repivot_Without_Header = Table.Pivot(Table.TransformColumnTypes(Remove_Header, {{“Column”, type text}}, “en-US”), List.Distinct(Table.TransformColumnTypes(Remove_Header, {{“Column”, type text}}, “en-US”)[Column]), “Column”, “Value”),
    Dumb_Fill_Down = Table.FillDown(Repivot_Without_Header,Table.ColumnNames(Repivot_Without_Header)),
    Unpivot_Dumb_Fill = Table.UnpivotOtherColumns(Dumb_Fill_Down, {“Row_Nbr”}, “Column”, “Value”),
    Dumb_Data = Table.TransformColumnTypes(Unpivot_Dumb_Fill,{{“Column”, Int64.Type}}),
    Combine_Good_And_Dumb = Table.NestedJoin(Good_Data,{“Row_Nbr”},Dumb_Data,{“Row_Nbr”},”Dumb_Data”,JoinKind.Inner),
    Expand_Dumb_Data = Table.ExpandTableColumn(Combine_Good_And_Dumb, “Dumb_Data”, {“Column”, “Value”}, {“Dumb_Data.Column”, “Dumb_Data.Value”}),
    Filter_Dumb_Records_For_Brilliance = Table.SelectRows(Expand_Dumb_Data, each ([Dumb_Data.Column] <= [Column])),
    Finalize_Data = Table.SelectColumns(Filter_Dumb_Records_For_Brilliance,{"Row_Nbr", "Dumb_Data.Column", "Dumb_Data.Value"}),
    Rename_Columns = Table.RenameColumns(Finalize_Data,{{"Dumb_Data.Column", "Column"}, {"Dumb_Data.Value", "Value"}}),
    Isolate_Header = Table.SelectRows(Good_Data, each ([Row_Nbr] = 0)),
    Append_Finalized_To_Header = Table.Combine({Isolate_Header, Rename_Columns}),
    Repivot_Data = Table.Pivot(Table.TransformColumnTypes(Append_Finalized_To_Header, {{"Column", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(Append_Finalized_To_Header, {{"Column", type text}}, "en-US")[Column]), "Column", "Value"),
    Ensure_Original_Sort = Table.Sort(Repivot_Data,{{"Row_Nbr", Order.Ascending}}),
    Remove_Row_Nbr = Table.RemoveColumns(Ensure_Original_Sort,{"Row_Nbr"}),
    Restore_Original_Header = Table.PromoteHeaders(Remove_Row_Nbr, [PromoteAllScalars=true])
    in
    Restore_Original_Header

    • Micah Dail Reply

      I think that thinking about it in the following way will help:

      A simple fill down will result in both VALID as well as INVALID fills.

      So how do we determine if a fill is invalid? Well, a fill is invalid if it’s filling down into a depth beyond what the original data element existed at. For example, element #10 “F4” had in the original data a depth of 2 (it was in the second column) so any fills at a lower depth (i.e. 3+) are invalid. Similarly element #12 “S2” had a depth of 1 (it was in the first column) so any fills at a lower depth (i.e. 2+) are invalid.

      That’s all my code does. It performs a dumb fill and then joins back to the original data depths and filters out the dumb fill depths that exist below the original data element.

  2. Prinsloo Potgieter Reply

    Thank you Gil. Very good article.
    And thank you Micah. Couldn’t quite work out your code – still above my pay grade. Maybe post your file as well? I got confused with the combine part – this might need a Good_Data and Dumb_Data table, or not? Like I said – I still have some way to go. Thank you in any case. Prinsloo

  3. Colin Banfield Reply

    “In the next blog post we will move to the bigger challenge: How can we fill down multiple columns with any number of hierarchies and unknown column names?”

    Hi Gil,

    I came late to party, so didn’t post a solution in the previous thread. Apparently I mentally extrapolated your specific scenario into a general solution, because ultimately, it becomes a reusable pattern.

    Since you’ve revealed the topic of your next post, and since we don’t know when it’s coming, I thought I might as post my general solution here. It would interesting to compare it with your own.

    let
    Source = Excel.CurrentWorkbook(){[Name=”Table1″]}[Content],
    ColumnNames = Table.ColumnNames(Source),
    ReplacedSelectedNullValues =
    List.Accumulate(
    List.Positions(ColumnNames),
    Source,
    (accum, curr) =>
    if curr = 0 then accum
    else
    Table.ReplaceValue(
    accum,
    each Record.Field(_, ColumnNames{curr}),
    each if Record.Field(_, ColumnNames{curr – 1}) = null then
    Record.Field(_, ColumnNames{curr})
    else “****”,
    Replacer.ReplaceValue,
    {ColumnNames{curr}}
    )
    ),
    FilledDown = Table.FillDown(ReplacedSelectedNullValues, ColumnNames),
    ReplacedValue = Table.ReplaceValue(FilledDown, “****”, null, Replacer.ReplaceValue, List.Skip(ColumnNames))
    in
    ReplacedValue

    Regards, Colin

    • Jeff Weir Reply

      @Colin Banfield: That’s freakin awesome. I love the way you replace the current value with itself if the value to the left is null, but otherwise put a placeholder in there. I’ve got another approach in mind, but it remains to be seen if I’ve got the smarts to pull it off!

  4. jeffrey Weir Reply

    Here’s a fun way to do the non-dynamic challenge:

    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    #”Changed Type” = Table.TransformColumnTypes(Source,{{“Scenario”, type text}, {“Feature”, type text}, {“Backlog item”, type text}}),
    #”Replaced Value” = Table.ReplaceValue(#”Changed Type”,null,”|”,Replacer.ReplaceValue,{“Scenario”, “Feature”}),
    #”Inserted Merged Column” = Table.AddColumn(#”Replaced Value”, “Merged”, each Text.Combine({[Scenario], [Feature]}, “”), type text),
    #”Replaced Value1″ = Table.ReplaceValue(#”Inserted Merged Column”,”|”,null,Replacer.ReplaceValue,{“Scenario”, “Feature”}),
    #”Filled Down” = Table.FillDown(#”Replaced Value1″,{“Scenario”, “Feature”}),
    #”Added Conditional Column” = Table.AddColumn(#”Filled Down”, “Custom”, each if Text.PositionOf([Merged], “|”) = 0 then [Feature] else null),
    #”Removed Columns” = Table.RemoveColumns(#”Added Conditional Column”,{“Feature”, “Merged”}),
    #”Renamed Columns” = Table.RenameColumns(#”Removed Columns”,{{“Custom”, “Feature”}}),
    #”Reordered Columns” = Table.ReorderColumns(#”Renamed Columns”,{“Scenario”, “Feature”, “Backlog item”})
    in
    #”Reordered Columns”

  5. Jeff Weir Reply

    And here’s another way to do it, using a Self Join:

    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    #”Changed Type” = Table.TransformColumnTypes(Source,{{“Scenario”, type text}, {“Feature”, type text}, {“Backlog item”, type text}}),
    #”Duplicated Column” = Table.DuplicateColumn(#”Changed Type”, “Feature”, “Feature – Copy”),
    #”Filled Down” = Table.FillDown(#”Duplicated Column”,{“Feature – Copy”}),
    #”Added Index” = Table.AddIndexColumn(#”Filled Down”, “Index”, 0, 1),
    #”Added Index1″ = Table.AddIndexColumn(#”Added Index”, “Index.1″, 1, 1),
    #”Merged Queries” = Table.NestedJoin(#”Added Index1″,{“Index”},#”Added Index1″,{“Index.1″},”Added Index1″,JoinKind.LeftOuter),
    #”Expanded Added Index1″ = Table.ExpandTableColumn(#”Merged Queries”, “Added Index1”, {“Feature – Copy”}, {“Feature – Copy.1″}),
    #”Sorted Rows” = Table.Sort(#”Expanded Added Index1″,{{“Index”, Order.Ascending}}),
    #”Added Conditional Column” = Table.AddColumn(#”Sorted Rows”, “Custom”, each if [Backlog item] null then [#”Feature – Copy.1″] else [Feature]),
    #”Removed Columns” = Table.RemoveColumns(#”Added Conditional Column”,{“Feature”, “Feature – Copy”, “Index”, “Index.1”, “Feature – Copy.1″}),
    #”Renamed Columns” = Table.RenameColumns(#”Removed Columns”,{{“Custom”, “Feature”}}),
    #”Reordered Columns” = Table.ReorderColumns(#”Renamed Columns”,{“Scenario”, “Feature”, “Backlog item”}),
    #”Filled Down1″ = Table.FillDown(#”Reordered Columns”,{“Scenario”})
    in
    #”Filled Down1″

  6. Jeff Weir Reply

    Here’s my dynamic code.

    let
    Source = Excel.CurrentWorkbook(){[Name=”Table1″]}[Content],
    Index = Table.AddIndexColumn(Source, “Index”, 0, 1),
    ColumnNames = Table.ColumnNames(Index),
    Custom1 = ColumnNames,
    #”Converted to Table” = Table.FromList(Custom1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #”Added Prefix” = Table.TransformColumns(#”Converted to Table”, {{“Column1”, each “copy.” & _, type text}}),
    PrefixedNames = #”Added Prefix”[Column1],
    #”Merged Queries” = Table.NestedJoin(Index,{“Index”},Index,{“Index”},”Added Index1″,JoinKind.LeftOuter),
    #”Expanded Added Index1″ = Table.ExpandTableColumn(#”Merged Queries”, “Added Index1″, ColumnNames, PrefixedNames),
    #”Filled Down1″ = Table.FillDown(#”Expanded Added Index1″,PrefixedNames),
    Positions = List.RemoveFirstN(List.Reverse(List.Positions(ColumnNames)),2),
    Replaced =
    List.Accumulate(
    Positions,
    #”Filled Down1”,
    (state, current) =>
    Table.ReplaceValue(
    state,
    each Record.Field(_, ColumnNames{current}),
    each
    if Record.Field(_, ColumnNames{current+1}) null
    then Record.Field(_, “copy.” & ColumnNames{current} ) //Here I want to reference the previous record
    else Record.Field(_, ColumnNames{current}),
    Replacer.ReplaceValue,
    {ColumnNames{current}}
    )
    ),
    #”Removed Other Columns” = Table.SelectColumns(Replaced,List.RemoveLastN(ColumnNames,1))
    in
    #”Removed Other Columns”

Leave a Reply