The Fill Down Hierarchies -Generic Solutions

In today’s blog post, I share the generic solutions for the Fill Down challenge that I posted here.

As you recall, we had a table of backlog items with three levels of hierarchies (Scenario, Feature and Backlog Item). In each row, the source data included the unique ID of the lowest level but had blank values in the columns that represented higher levels in the hierarchy, as shown at the left table.

The challenge was to fill down the table using Power Query in Excel or Power BI and reach the format of the right table above. In the last blog post in this series here, you were introduced to the simple solutions.

While it is relatively easy to resolve this challenge when the hierarchies and their corresponding column names are known, it takes a different skill set to resolve this challenge in a generic way and apply the fill down dynamically on any number of levels and column names. Such generic approach allows you to have a dynamic solution that doesn’t require many changes in the queries whenever the hierarchies change.

All the techniques below will refer to the Source query which imports the original table.

Thank you, Jeremy Thiel, Colin Banfield, Jeff Weir, Steve J, Micah Dail for sharing your solutions.

Jeremy Thiele’s Recursions

Jeremy’s solution is similar to the one I originally shared in the solution file. It contains two recursions. The recursiveFillDown function recursively drills down to the lowest sub groups and fills down the lowest hierarchy in each sub group. Then it moves upward and fills down the parent hierarchy levels till it reaches the first hierarchy.

The recursiveTableExpand function performs the expanding of subgroups which were grouped by recursiveFillDown. Here is the code:

let
    recursiveFillDown =
        (table, depth) =>
        let
            Headers = Table.ColumnNames(table),
            #"Filled Down" = Table.FillDown(table,{Headers{depth}}),
            #"Grouped Rows" = Table.Group(
                #"Filled Down",
                List.FirstN(Headers,depth + 1),
                {{"Table", each _, type table}}
            ),
            nestedTables = Table.SelectColumns(#"Grouped Rows",{"Table"}),
            recursion =
                Table.TransformColumns(
                    nestedTables,
                    {{"Table",
                      each @recursiveFillDown(
                          _,
                          depth + 1
                      )}}
                ),

            result =
                if depth >= (Table.ColumnCount(table) - 1) then
                    nestedTables
                else
                    recursion
        in
            result,
        
    recursiveTableExpand =
        (table, depth, headers) =>
        let
            expandTable =
                Table.ExpandTableColumn(
                    table, "Table", {"Table"}
                ),
            intermediateTable =
                @recursiveTableExpand(expandTable, depth + 1, headers),
            finalTable = Table.ExpandTableColumn(table, "Table", headers),
            result = if depth >= (List.Count(headers) - 1) then
                finalTable
            else
                intermediateTable
        in
            result,

    FillDown = recursiveFillDown(Source, 0),
    Headers = Table.ColumnNames(Source),
    TableExpand = recursiveTableExpand(FillDown, 0, Headers)
in
    TableExpand

The List.Accumulate Approach

While Recursions are easier to write and understand, you can usually find an alternative solution that will run in a loop and achieve the same goals – without the costly consumption of the call stack (If you don’t understand what is a call stack, ignore what I wrote – Just remember that recursions can quickly reach a limit and will not work well on large datasets – Not relevant in this case if the number of hierarchies is small). To achieve the loop that is necessary here, Colin Banfield, Steve J. and Jeff Weir implemented the solution using List.Accumulate. Read more about List.Accumulate here.


Colin Banfield’s List.Accumulate

let
    ColumnNames = Table.ColumnNames(Source),
    ReplacedSelectedNullValues =
        List.Accumulate(
            List.Positions(ColumnNames),
            Source,
            (accum, curr) =>
                if curr = 0 then
                    accum
                else
                    Table.ReplaceValue(
                        accum,
                        each Record.Field(_, ColumnNames{curr}),
                        each if Record.Field(
                            _, ColumnNames{curr - 1}) = null then
                            Record.Field(_, ColumnNames{curr})
                        else "****",
                        Replacer.ReplaceValue,
                        {ColumnNames{curr}}
                    )
        ),
    FilledDown = Table.FillDown(ReplacedSelectedNullValues, ColumnNames),
    ReplacedValue =
        Table.ReplaceValue(
            FilledDown, "****", null, Replacer.ReplaceValue, 
            List.Skip(ColumnNames)
        )
in
    ReplacedValue

Steve J’s List.Accumulate

(Data as table, HierarchyColumns as list) as table =>
let
    AllColumns = Table.ColumnNames(Data),
    HierIndexes = List.Transform(
        HierarchyColumns,
        each List.PositionOf(AllColumns, _)
    ),
    HierData = List.Transform(
        HierarchyColumns,
        each Table.Column(Data, _)
    ),
    ValidateInput =
    let
        N = List.Count(HierarchyColumns),
        Names = Text.Combine(HierarchyColumns, ", "),
        ListTooSmall =
            error "Hierarchy must be 2 or more columns: " & Names,
        NeedsTwo = N > 1 or ListTooSmall,
        NameMismatch =
            error "Hierarchy column(s) not found in input table: " &
                Names,
        ColumnsCheck = List.Count(HierIndexes) = N or NameMismatch
    in
        NeedsTwo and ColumnsCheck,

    NullToEmpty = each if _ = null or Text.Trim(_) = "" then "" else _,
    EmptyToNull = each if _ = null or Text.Trim(_) = "" then null else _,

    PrepLevel = (Prv as list, Index as number) as list =>
    let
        IsHier = not List.IsEmpty(Prv),
        HierCol = HierData{Index},
        PrepHier = List.Transform(HierCol, EmptyToNull),
        New =
            List.Transform(
                List.Zip({PrepHier, Prv{0}}),
                each
                    if Value.Is(_{0}, type text) then
                        _{0}
                    else if _{1} <> "" then
                        null
                    else ""
            ),
        Result =
            if IsHier then
                {New} & Prv
            else
                {List.Transform(HierCol, NullToEmpty)}
    in Result,

    DenseHierarchy = 
        List.Accumulate(
            List.Reverse(List.Positions(HierData)),
            {},
            PrepLevel
        ),
    PreppedDataTable =
        if List.Count(AllColumns) = List.Count(HierarchyColumns) then
            Table.FromColumns(DenseHierarchy, HierarchyColumns)
        else
            Table.FromColumns(
                List.ReplaceRange(
                    Table.ToColumns(Data),
                    HierIndexes{0},
                    List.Count(HierIndexes),
                    DenseHierarchy
                ),
                AllColumns
            ),
    Result = Table.FillDown(
        PreppedDataTable,
        List.RemoveLastN(HierarchyColumns, 1)
    )
    in
        if ValidateInput then Result else ""

Jeff Weir’s List.Accumulate

let
    Index = Table.AddIndexColumn(Source, "Index", 0, 1),
    ColumnNames = Table.ColumnNames(Index),
    Custom1 = ColumnNames,
    #"Converted to Table" =
        Table.FromList(
            Custom1,
            Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Added Prefix" =
        Table.TransformColumns(#"Converted to Table",
            {{"Column1", each "copy." & _, type text}}),
    PrefixedNames = #"Added Prefix"[Column1],
    #"Merged Queries" =
        Table.NestedJoin(
            Index,
            {"Index"},
            Index,
            {"Index"},
            "Added Index1",
            JoinKind.LeftOuter),
    #"Expanded Added Index1" =
        Table.ExpandTableColumn(
            #"Merged Queries",
            "Added Index1",
            ColumnNames,
            PrefixedNames),
    #"Filled Down1" =
        Table.FillDown(
            #"Expanded Added Index1",
            PrefixedNames
        ),
    Positions = 
        List.RemoveFirstN(List.Reverse(List.Positions(ColumnNames)),2),
    Replaced =
        List.Accumulate(
            Positions, 
            #"Filled Down1", 
            (state, current) => 
                Table.ReplaceValue(
                    state,
                    each Record.Field(_, ColumnNames{current}),
                    each 
                        if Record.Field(
                            _,
                            ColumnNames{current+1}
                        ) <> null 
                        then
                            Record.Field(
                                _,
                                "copy." & ColumnNames{current}
                            )
                        else
                            Record.Field(_, ColumnNames{current}),
                Replacer.ReplaceValue,
                {ColumnNames{current}}
            )
        ),
    #"Removed Other Columns" =
        Table.SelectColumns(Replaced,List.RemoveLastN(ColumnNames,1))
in
    #"Removed Other Columns"

Micah Dail’s Solution –

I really liked Micah’s creative approach to this challenge, as it didn’t use any advanced recursions or List.Accumulate.

In Micah’s own words:

“A simple fill down will result in both VALID as well as INVALID fills. So how do we determine if a fill is invalid? Well, a fill is invalid if it’s filling down into a depth beyond what the original data element existed at. For example, element #10 “F4” had in the original data a depth of 2 (it was in the second column) so any fills at a lower depth (i.e. 3+) are invalid. Similarly element #12 “S2” had a depth of 1 (it was in the first column) so any fills at a lower depth (i.e. 2+) are invalid.

That’s all my code does. It performs a dumb fill and then joins back to the original data depths and filters out the dumb fill depths that exist below the original data element.”

let
    Demote_Header = Table.DemoteHeaders(Source),
    Set_Row_Nbr = Table.AddIndexColumn(Demote_Header, "Row_Nbr", 0, 1),
    Unpivot_Data =
        Table.UnpivotOtherColumns(
            Set_Row_Nbr, {"Row_Nbr"}, "Column", "Value"
        ),
    Good_Data =
        Table.TransformColumns(
            Unpivot_Data,
            {
                {
                    "Column",
                    each Number.From(Text.Replace(_, "Column", "")),
                    Int64.Type
                },
                {
                    "Row_Nbr",
                    each Number.From(_),
                    Int64.Type
                }
            }
        ),
    Remove_Header = Table.SelectRows(Good_Data, each ([Row_Nbr] >0)),
    Repivot_Without_Header =
        Table.Pivot(
            Table.TransformColumnTypes(
                Remove_Header, {{"Column", type text}},"en-US"
            ),
            List.Distinct(
                Table.TransformColumnTypes(
                    Remove_Header,
                    {{"Column", type text}}, "en-US"
                )[Column]
            ), "Column", "Value"
        ),
        Dumb_Fill_Down =
            Table.FillDown(
                Repivot_Without_Header,
                Table.ColumnNames(Repivot_Without_Header)
            ),
        Unpivot_Dumb_Fill =
            Table.UnpivotOtherColumns(
                Dumb_Fill_Down, {"Row_Nbr"}, "Column", "Value"
            ),
            Dumb_Data = Table.TransformColumnTypes(
                Unpivot_Dumb_Fill,{{"Column", Int64.Type}}
            ),
        Combine_Good_And_Dumb =
            Table.NestedJoin(
                Good_Data, {"Row_Nbr"},
                Dumb_Data, {"Row_Nbr"},
                "Dumb_Data", JoinKind.Inner
            ),
        Expand_Dumb_Data =
            Table.ExpandTableColumn(
                Combine_Good_And_Dumb,
                "Dumb_Data",
                {"Column", "Value"},
                {"Dumb_Data.Column", "Dumb_Data.Value"}
            ),
        Filter_Dumb_Records_For_Brilliance =
            Table.SelectRows(
                Expand_Dumb_Data, each ([Dumb_Data.Column] <= [Column])),
        Finalize_Data =
            Table.SelectColumns(
                Filter_Dumb_Records_For_Brilliance,
                {"Row_Nbr", "Dumb_Data.Column", "Dumb_Data.Value"}
            ),
        Rename_Columns =
            Table.RenameColumns(
                Finalize_Data,
                {{"Dumb_Data.Column", "Column"},
                 {"Dumb_Data.Value", "Value"}}
            ),
        Isolate_Header =
            Table.SelectRows(Good_Data, each ([Row_Nbr] = 0)),
        Append_Finalized_To_Header =
            Table.Combine({Isolate_Header, Rename_Columns}),
        Repivot_Data =
            Table.Pivot(
                Table.TransformColumnTypes(
                    Append_Finalized_To_Header,
                    {{"Column", type text}}, "en-US"
                ),
                List.Distinct(
                    Table.TransformColumnTypes(
                        Append_Finalized_To_Header,
                        {{"Column", type text}},
                        "en-US"
                    )[Column]
                ),
                "Column",
                "Value"
            ),
        Ensure_Original_Sort =
            Table.Sort(Repivot_Data,{{"Row_Nbr", Order.Ascending}}),
        Remove_Row_Nbr =
            Table.RemoveColumns(Ensure_Original_Sort,{"Row_Nbr"}),
        Restore_Original_Header =
            Table.PromoteHeaders(Remove_Row_Nbr, [PromoteAllScalars=true])
in
    Restore_Original_Header

You can download the solution file here.

Jeremy, Colin, Steve and Micah, you are welcome to contact me to publish a guest blog post that explains your code and how you have come up with your solution.

Leave a Reply