The Fill Down Hierarchies Challenge

A weekend challenge

I hope you are enjoying your second weekend of 2019. I was helping a friend recently figuring out the following data challenge:

You have a table of Backlog items with three levels of hierarchies (Scenario, Feature and Backlog Item). In each row, the source data includes the unique ID of the lowest level, but has blank values in the columns that represent the higher levels, as shown at the left table.

The format on the left is not ideal for analytics. You may think that it is not common to find such format. Unfortunately, it is a common format. For example, here is a screenshot from an output of Team Foundation Add-In in Excel (source is here).

Can you fill down the table using Power Query in Excel or Power BI to reach the format of the right table above? Can you take your solution one step further, and write it in a way it will support dynamic number of levels and different header names? (For example: Continent, Country, Country, City, Street).

You can download the source table here. The solutions will be published soon.

Update (Jan 14, 2019): Thank you for the solutions you proposed below. Here is a solution that receives the table and the number of hierarchies in and uses recursions to resolve the challenge, without knowing in advance how many levels of hierarchies or column names are used. In the next blog post, I will describe it in details along with the best queries that were shared below.

21 comments

  1. Andrew Lindfield (@ajlind25) Reply

    Here’s one solution. It prevents the fill down by replacing the nulls with a temporary value – in this case “(prevent filldown)”.

    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    #”Added tmpFeature” = Table.AddColumn(Source, “tmpFeature”, each if ([Feature] = null and [Scenario] null) then “(prevent filldown)” else [Feature], type text),
    #”Added tmpBacklog item” = Table.AddColumn(#”Added tmpFeature”, “tmpBacklog item”, each if ([Backlog item] = null and [tmpFeature] null) then “(prevent filldown)” else [Backlog item], type text),
    #”Removed Columns” = Table.RemoveColumns(#”Added tmpBacklog item”,{“Feature”, “Backlog item”}),
    #”Renamed Columns” = Table.RenameColumns(#”Removed Columns”,{{“tmpFeature”, “Feature”}, {“tmpBacklog item”, “Backlog item”}}),
    #”Filled Down” = Table.FillDown(#”Renamed Columns”,{“Scenario”, “Feature”, “Backlog item”}),
    #”Replaced Value” = Table.ReplaceValue(#”Filled Down”,”(prevent filldown)”,null,Replacer.ReplaceValue,{“Scenario”, “Feature”, “Backlog item”})
    in
    #”Replaced Value”

  2. Nick M. Reply

    Here’s how I accomplished the first step (still a little shaky on the custom functions right now)

    1) Duplicated the Feature column
    2) Filled Down the Scenario and original Feature columns
    3) Added a custom column that gave null if Backlog Item and the copy of the Feature columns were null, if not give me the value of the feature column that was filled down:

    =if
    [Backlog item] = null and [#”Feature – Copy”] = null
    then
    null
    else
    [Feature]

    4) Selected Scenario, Backlog Item and the Custom Column from above and removed the other columns
    5) Renamed the custom column to Feature
    6) Reordered the columns
    7) Set data types

    Final table matches the output requirement.

    -Nick

  3. James Fancke Reply

    I didn’t quite make it dynamic but I was getting close! I look forward to your solution.

    I took a subtractive approach where I replaced values with nulls after completing a filldown on the whole table.

    Solution also can be found here: https://1drv.ms/x/s!AgoytmOFSCGLgaxvOMh1htExcXKixg

    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    InsertIndex = Table.AddIndexColumn(Source, “Index”, 0, 1),
    InsertLevel = Table.AddColumn(InsertIndex, “Level”, each Table.SelectRows(Table.AddIndexColumn(Record.ToTable(Source{[Index]}), “SubIndex”, 0, 1), each [Value] null)[SubIndex]{0}),
    FilledDown = Table.FillDown(InsertLevel,Table.ColumnNames(InsertLevel)),

    ReplaceScenario = Table.ReplaceValue(FilledDown, each [Scenario], each if [Level] < 0 then null else [Scenario], Replacer.ReplaceValue,{"Scenario"}),
    ReplaceFeature = Table.ReplaceValue(ReplaceScenario, each [Feature], each if [Level] < 1 then null else [Feature], Replacer.ReplaceValue,{"Feature"}),
    ReplaceBacklog = Table.ReplaceValue(ReplaceFeature, each [Backlog item], each if [Level] < 2 then null else [Backlog item], Replacer.ReplaceValue,{"Backlog item"}),
    SelectColumns = Table.SelectColumns(ReplaceBacklog,{"Scenario", "Feature", "Backlog item"})

    in
    SelectColumns

  4. FrankT Reply

    As a one-off:

    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    ReplacedValue = Table.ReplaceValue(Source, each [Feature], each if [Scenario]=null then [Feature] else “”, Replacer.ReplaceValue, {“Feature”}),
    FilledDown = Table.FillDown(ReplacedValue,{“Scenario”, “Feature”})
    in
    FilledDown

  5. cristianangyal Reply

    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    #”Changed Type” = Table.TransformColumnTypes(Source,{{“Scenario”, type text}, {“Feature”, type text}, {“Backlog item”, type text}}),
    #”Filled Down” = Table.FillDown(#”Changed Type”,{“Scenario”}),
    #”Added Custom” = Table.AddColumn(#”Filled Down”, “Feature_Solved”, each if [Backlog item]null then Table.FillDown(#”Filled Down”,{“Feature”}) else null),
    #”Expanded Feature_Solved” = Table.ExpandTableColumn(#”Added Custom”, “Feature_Solved”, {“Feature”, “Backlog item”}, {“Feature.1”, “Backlog item.1″}),
    #”Added Custom1″ = Table.AddColumn(#”Expanded Feature_Solved”, “Custom”, each [Backlog item]=[Backlog item.1]),
    #”Filtered Rows” = Table.SelectRows(#”Added Custom1″, each ([Custom] = true)),
    #”Merged Columns” = Table.CombineColumns(#”Filtered Rows”,{“Feature”, “Feature.1”},Combiner.CombineTextByDelimiter(“”, QuoteStyle.None),”Feature.2″),
    #”Reordered Columns” = Table.ReorderColumns(#”Merged Columns”,{“Scenario”, “Feature.2”, “Backlog item”, “Backlog item.1”, “Custom”}),
    #”Removed Columns” = Table.RemoveColumns(#”Reordered Columns”,{“Backlog item.1”, “Custom”}),
    #”Renamed Columns” = Table.RenameColumns(#”Removed Columns”,{{“Feature.2”, “Feature”}})
    in
    #”Renamed Columns”

    • Gil Raviv Post authorReply

      Thank you for sharing Cristian. Can you make it shorter? Look at the other solutions that were shared here.

  6. Kunle SOPEJU Reply

    Thanks for the challenge, Gil.

    My suggested solution for the first part is as follows:

    Import data into Power BI or Power Query in Excel.

    1. Fill down the Scenario column.
    2. Group the table by Scenario column using All rows.
    3. Fill down the [Features] table column within the filtered table ( ala Bill Szysz, Gil Ravi , et al)
    4. Select last column and remove Other Columns.
    5. Expand the selected column

    Pronto !!! You’re done !

  7. Tom Allan Reply

    I like FrankT’s solution, but think it would be better if ended with a step to clean up remaining empty strings that were introduced into Feature column. Looking forward to learning recursive approach.

  8. Shungu Reply

    This is the M function I used to achieve the filled down table with the demoted headers step accommodating different column names assuming a similar Table structure ,( ie 3 columns) and hierarchy. About the different level of hierarchies, that became a little tricky but was almost there.

    (Table as table) => let

    #”Demoted Headers” = Table.DemoteHeaders(Table),
    #”Added Custom” = Table.AddColumn(#”Demoted Headers”, “Parent”, each if [Column2] = null and [Column3] = null then 0 else null),
    #”Filled Down” = Table.FillDown(#”Added Custom”,{“Column1”, “Column2″}),
    #”Added Custom1″ = Table.AddColumn(#”Filled Down”, “Column2.2″, each if [Parent] = 0 then null else [Column2]),
    #”Removed Other Columns” = Table.SelectColumns(#”Added Custom1″,{“Column1”, “Column2.2”, “Column3″}),
    #”Promoted Headers” = Table.PromoteHeaders(#”Removed Other Columns”, [PromoteAllScalars=true])

    in

    #”Promoted Headers”

  9. Shungu Reply

    As a function that can take a table of similar structure and hierarchy thanks to the Demoted Headers Step.

    (Table as table) => let

    #”Demoted Headers” = Table.DemoteHeaders(Table),
    #”Added Custom” = Table.AddColumn(#”Demoted Headers”, “Parent”, each if [Column2] = null and [Column3] = null then 0 else null),
    #”Filled Down” = Table.FillDown(#”Added Custom”,{“Column1”, “Column2″}),
    #”Added Custom1″ = Table.AddColumn(#”Filled Down”, “Column2.2″, each if [Parent] = 0 then null else [Column2]),
    #”Removed Other Columns” = Table.SelectColumns(#”Added Custom1″,{“Column1”, “Column2.2”, “Column3″}),
    #”Promoted Headers” = Table.PromoteHeaders(#”Removed Other Columns”, [PromoteAllScalars=true])

    in

    #”Promoted Headers”

    • Gil Raviv Post authorReply

      Hi Shungu,
      Thank you for sharing. Try to modify this function to accept another parameter for the number of hierarchies and operate on multiple columns.
      I suspect that you would need to use List.Accumulate to iterate over the creation of new columns.

  10. stevej Reply

    //replace the hierarchy columns with versions prepped for a Table.FillDown, then do the FillDown
    // (also minimal args validation)

    (Data as table, HierarchyColumns as list) as table =>
    let
    AllColumns = Table.ColumnNames(Data),
    HierIndexes = List.Transform(HierarchyColumns, each List.PositionOf(AllColumns, _)),
    HierData = List.Transform(HierarchyColumns, each Table.Column(Data, _)),

    ValidateInput =
    let
    N = List.Count(HierarchyColumns),
    Names = Text.Combine(HierarchyColumns, “, “),
    ListTooSmall = error “Hierarchy must be 2 or more columns: ” & Names,
    NeedsTwo = N > 1 or ListTooSmall,

    NameMismatch = error “Hierarchy column(s) not found in input table: ” & Names,
    ColumnsCheck = List.Count(HierIndexes) = N or NameMismatch
    in
    NeedsTwo and ColumnsCheck,

    NullToEmpty = each if _=null or Text.Trim(_)=”” then “” else _,
    EmptyToNull = each if _=null or Text.Trim(_)=”” then null else _,

    PrepLevel = (Prv as list, Index as number) as list =>
    let
    IsHier = not List.IsEmpty(Prv),
    HierCol = HierData{Index},
    PrepHier = List.Transform(HierCol, EmptyToNull),
    New = List.Transform(List.Zip({PrepHier, Prv{0}}),
    each if Value.Is(_{0}, type text) then _{0} else if _{1} “” then null else “”),
    Result = if IsHier
    then {New} & Prv
    else {List.Transform(HierCol, NullToEmpty)}
    in Result,

    DenseHierarchy = List.Accumulate(List.Reverse(List.Positions(HierData)), {}, PrepLevel),
    PreppedDataTable =
    if List.Count(AllColumns) = List.Count(HierarchyColumns)
    then Table.FromColumns(DenseHierarchy, HierarchyColumns)
    else Table.FromColumns(
    List.ReplaceRange(Table.ToColumns(Data), HierIndexes{0}, List.Count(HierIndexes), DenseHierarchy),
    AllColumns),

    Result = Table.FillDown(PreppedDataTable, List.RemoveLastN(HierarchyColumns, 1))

    in if ValidateInput then Result else “”

    • stevej Reply

      //There’s a “not equals” in the code that got ripped because…HTML
      //Here’s the line with a C language “not equals” (!=) where the equivalent M operator would be:

      New = List.Transform(List.Zip({PrepHier, Prv{0}}),
      each if Value.Is(_{0}, type text) then _{0} else if _{1} != “” then null else “”),

      //Also to use the code:
      let
      Source = Excel.CurrentWorkbook(){[Name=”Sample1″]}[Content],
      Result = SparseToDense(Source, {“Scenario”, “Feature”, “Backlog Item”})
      in
      Result

      //Tested for both examples in the post, but nothing else :/

      • Gil Raviv Post author

        SteveJ Thank you for sharing. I will publish your solution in my next blog post. You are welcome to contact me and publish a guest blog post to explain your code.

  11. Jeremy Thiele Reply

    Neat problem. Here is a solution that used recursion and group by.

    // Backlogs_Sln
    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    /* Source = table, */
    FillDown = recursiveFillDown(Source, 0),
    Headers = Table.ColumnNames(Source),
    TableExpand = recursiveTableExpand(FillDown, 0, Headers)
    in
    TableExpand

    // recursiveFillDown
    (table, depth) =>

    let
    Headers = Table.ColumnNames(table),
    #”Filled Down” = Table.FillDown(table,{Headers{depth}}),
    #”Grouped Rows” = Table.Group(#”Filled Down”, List.FirstN(Headers,depth + 1), {{“Table”, each _, type table}}),
    nestedTables = Table.SelectColumns(#”Grouped Rows”,{“Table”}),
    recursion = Table.TransformColumns(nestedTables, {{“Table”, each @recursiveFillDown(_, depth + 1)}}),

    result = if depth >= (Table.ColumnCount(table) – 1) then nestedTables else recursion

    in
    result

    // recursiveTableExpand
    (table, depth, headers) =>

    let
    expandTable = Table.ExpandTableColumn(table, “Table”, {“Table”}),
    intermediateTable = @recursiveTableExpand(expandTable, depth + 1, headers),

    finalTable = Table.ExpandTableColumn(table, “Table”, headers),

    result = if depth >= (List.Count(headers) – 1) then finalTable else intermediateTable
    in
    result

    • Gil Raviv Post authorReply

      Jeremy, Thank you for sharing. I will publish your solution in my next blog post. You are welcome to contact me and publish a guest blog post to explain your code.

  12. Micah Dail Reply

    Here you go. No recursion or custom functions necessary. Essentially it uses a dumb fill to populate all of the possible cells, but then gets the actual column depth for a record and removes any fills outside of that depth.

    // Smart Hierarchy Fill
    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    Demote_Header = Table.DemoteHeaders(Source),
    Set_Row_Nbr = Table.AddIndexColumn(Demote_Header, “Row_Nbr”, 0, 1),
    Unpivot_Data = Table.UnpivotOtherColumns(Set_Row_Nbr, {“Row_Nbr”}, “Column”, “Value”),
    Good_Data = Table.TransformColumns(Unpivot_Data, {{“Column”, each Number.From(Text.Replace(_, “Column”, “”)), Int64.Type}, {“Row_Nbr”, each Number.From(_), Int64.Type}}),
    Remove_Header = Table.SelectRows(Good_Data, each ([Row_Nbr] 0)),
    Repivot_Without_Header = Table.Pivot(Table.TransformColumnTypes(Remove_Header, {{“Column”, type text}}, “en-US”), List.Distinct(Table.TransformColumnTypes(Remove_Header, {{“Column”, type text}}, “en-US”)[Column]), “Column”, “Value”),
    Dumb_Fill_Down = Table.FillDown(Repivot_Without_Header,Table.ColumnNames(Repivot_Without_Header)),
    Unpivot_Dumb_Fill = Table.UnpivotOtherColumns(Dumb_Fill_Down, {“Row_Nbr”}, “Column”, “Value”),
    Dumb_Data = Table.TransformColumnTypes(Unpivot_Dumb_Fill,{{“Column”, Int64.Type}}),
    Combine_Good_And_Dumb = Table.NestedJoin(Good_Data,{“Row_Nbr”},Dumb_Data,{“Row_Nbr”},”Dumb_Data”,JoinKind.Inner),
    Expand_Dumb_Data = Table.ExpandTableColumn(Combine_Good_And_Dumb, “Dumb_Data”, {“Column”, “Value”}, {“Dumb_Data.Column”, “Dumb_Data.Value”}),
    Filter_Dumb_Records_For_Brilliance = Table.SelectRows(Expand_Dumb_Data, each ([Dumb_Data.Column] <= [Column])),
    Finalize_Data = Table.SelectColumns(Filter_Dumb_Records_For_Brilliance,{"Row_Nbr", "Dumb_Data.Column", "Dumb_Data.Value"}),
    Rename_Columns = Table.RenameColumns(Finalize_Data,{{"Dumb_Data.Column", "Column"}, {"Dumb_Data.Value", "Value"}}),
    Isolate_Header = Table.SelectRows(Good_Data, each ([Row_Nbr] = 0)),
    Append_Finalized_To_Header = Table.Combine({Isolate_Header, Rename_Columns}),
    Repivot_Data = Table.Pivot(Table.TransformColumnTypes(Append_Finalized_To_Header, {{"Column", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(Append_Finalized_To_Header, {{"Column", type text}}, "en-US")[Column]), "Column", "Value"),
    Ensure_Original_Sort = Table.Sort(Repivot_Data,{{"Row_Nbr", Order.Ascending}}),
    Remove_Row_Nbr = Table.RemoveColumns(Ensure_Original_Sort,{"Row_Nbr"}),
    Restore_Original_Header = Table.PromoteHeaders(Remove_Row_Nbr, [PromoteAllScalars=true])
    in
    Restore_Original_Header

    • Gil Raviv Post authorReply

      Micah, Thank you for sharing. I will publish your solution in my next blog post. You are welcome to contact me and publish a guest blog post to explain your code.

  13. John BIAN Reply

    Here’s my solutions fyi:

    Solution 1: Table.Group. The 5th parameter was used.

    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
    Group = Table.Group( Source,
    “Scenario”,
    {“a”,each Table.FillDown(_,{“Scenario”,”Feature”})},
    0,
    (x,y)=>Byte.From(ynull)),
    Result = Table.Combine(Group[a])
    in
    Result

    Solution 2: List.Generate

    let
    Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],

    gen = let lst = List.Buffer(Table.ToRows(Source)),
    num = List.Count(lst)

    in List.Generate( ()=>{{},0},
    each _{1}<=num,
    each let l = List.Buffer(lst{_{1}}?),
    a = List.FirstN(l,each _=null),
    n = List.Count(a),
    b = List.Buffer(List.FirstN(_{0},n)),
    v = List.ReplaceRange(l,0,n,b)
    in {v,_{1}+1},
    each _{0} ),

    rlt = #table(3,List.Skip(gen))
    in
    rlt

Leave a Reply