A weekend challenge
I hope you are enjoying your second weekend of 2019. I was helping a friend recently figuring out the following data challenge:
You have a table of Backlog items with three levels of hierarchies (Scenario, Feature and Backlog Item). In each row, the source data includes the unique ID of the lowest level, but has blank values in the columns that represent the higher levels, as shown at the left table.

The format on the left is not ideal for analytics. You may think that it is not common to find such format. Unfortunately, it is a common format. For example, here is a screenshot from an output of Team Foundation Add-In in Excel (source is here).

Can you fill down the table using Power Query in Excel or Power BI to reach the format of the right table above? Can you take your solution one step further, and write it in a way it will support dynamic number of levels and different header names? (For example: Continent, Country, Country, City, Street).
You can download the source table here. The solutions will be published soon.
Update (Jan 14, 2019): Thank you for the solutions you proposed below. Here is a solution that receives the table and the number of hierarchies in and uses recursions to resolve the challenge, without knowing in advance how many levels of hierarchies or column names are used. In the next blog post, I will describe it in details along with the best queries that were shared below.
Here’s one solution. It prevents the fill down by replacing the nulls with a temporary value – in this case “(prevent filldown)”.
let
Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
#”Added tmpFeature” = Table.AddColumn(Source, “tmpFeature”, each if ([Feature] = null and [Scenario] null) then “(prevent filldown)” else [Feature], type text),
#”Added tmpBacklog item” = Table.AddColumn(#”Added tmpFeature”, “tmpBacklog item”, each if ([Backlog item] = null and [tmpFeature] null) then “(prevent filldown)” else [Backlog item], type text),
#”Removed Columns” = Table.RemoveColumns(#”Added tmpBacklog item”,{“Feature”, “Backlog item”}),
#”Renamed Columns” = Table.RenameColumns(#”Removed Columns”,{{“tmpFeature”, “Feature”}, {“tmpBacklog item”, “Backlog item”}}),
#”Filled Down” = Table.FillDown(#”Renamed Columns”,{“Scenario”, “Feature”, “Backlog item”}),
#”Replaced Value” = Table.ReplaceValue(#”Filled Down”,”(prevent filldown)”,null,Replacer.ReplaceValue,{“Scenario”, “Feature”, “Backlog item”})
in
#”Replaced Value”
Thanks for the challenge Gil. Looking forward to reading what your proposed solution is.
My Solution: https://pastebin.com/djVrDHAj
Here’s how I accomplished the first step (still a little shaky on the custom functions right now)
1) Duplicated the Feature column
2) Filled Down the Scenario and original Feature columns
3) Added a custom column that gave null if Backlog Item and the copy of the Feature columns were null, if not give me the value of the feature column that was filled down:
=if
[Backlog item] = null and [#”Feature – Copy”] = null
then
null
else
[Feature]
4) Selected Scenario, Backlog Item and the Custom Column from above and removed the other columns
5) Renamed the custom column to Feature
6) Reordered the columns
7) Set data types
Final table matches the output requirement.
-Nick
Thank you for the challenge Gil. The solution I propose can be found here https://1drv.ms/x/s!At0U3V37GjTFhEjFWcuCMtnN-86j
I didn’t quite make it dynamic but I was getting close! I look forward to your solution.
I took a subtractive approach where I replaced values with nulls after completing a filldown on the whole table.
Solution also can be found here: https://1drv.ms/x/s!AgoytmOFSCGLgaxvOMh1htExcXKixg
let
Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
InsertIndex = Table.AddIndexColumn(Source, “Index”, 0, 1),
InsertLevel = Table.AddColumn(InsertIndex, “Level”, each Table.SelectRows(Table.AddIndexColumn(Record.ToTable(Source{[Index]}), “SubIndex”, 0, 1), each [Value] null)[SubIndex]{0}),
FilledDown = Table.FillDown(InsertLevel,Table.ColumnNames(InsertLevel)),
ReplaceScenario = Table.ReplaceValue(FilledDown, each [Scenario], each if [Level] < 0 then null else [Scenario], Replacer.ReplaceValue,{"Scenario"}),
ReplaceFeature = Table.ReplaceValue(ReplaceScenario, each [Feature], each if [Level] < 1 then null else [Feature], Replacer.ReplaceValue,{"Feature"}),
ReplaceBacklog = Table.ReplaceValue(ReplaceFeature, each [Backlog item], each if [Level] < 2 then null else [Backlog item], Replacer.ReplaceValue,{"Backlog item"}),
SelectColumns = Table.SelectColumns(ReplaceBacklog,{"Scenario", "Feature", "Backlog item"})
in
SelectColumns
As a one-off:
let
Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
ReplacedValue = Table.ReplaceValue(Source, each [Feature], each if [Scenario]=null then [Feature] else “”, Replacer.ReplaceValue, {“Feature”}),
FilledDown = Table.FillDown(ReplacedValue,{“Scenario”, “Feature”})
in
FilledDown
let
Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
#”Changed Type” = Table.TransformColumnTypes(Source,{{“Scenario”, type text}, {“Feature”, type text}, {“Backlog item”, type text}}),
#”Filled Down” = Table.FillDown(#”Changed Type”,{“Scenario”}),
#”Added Custom” = Table.AddColumn(#”Filled Down”, “Feature_Solved”, each if [Backlog item]null then Table.FillDown(#”Filled Down”,{“Feature”}) else null),
#”Expanded Feature_Solved” = Table.ExpandTableColumn(#”Added Custom”, “Feature_Solved”, {“Feature”, “Backlog item”}, {“Feature.1”, “Backlog item.1″}),
#”Added Custom1″ = Table.AddColumn(#”Expanded Feature_Solved”, “Custom”, each [Backlog item]=[Backlog item.1]),
#”Filtered Rows” = Table.SelectRows(#”Added Custom1″, each ([Custom] = true)),
#”Merged Columns” = Table.CombineColumns(#”Filtered Rows”,{“Feature”, “Feature.1”},Combiner.CombineTextByDelimiter(“”, QuoteStyle.None),”Feature.2″),
#”Reordered Columns” = Table.ReorderColumns(#”Merged Columns”,{“Scenario”, “Feature.2”, “Backlog item”, “Backlog item.1”, “Custom”}),
#”Removed Columns” = Table.RemoveColumns(#”Reordered Columns”,{“Backlog item.1”, “Custom”}),
#”Renamed Columns” = Table.RenameColumns(#”Removed Columns”,{{“Feature.2”, “Feature”}})
in
#”Renamed Columns”
Thank you for sharing Cristian. Can you make it shorter? Look at the other solutions that were shared here.
Thanks for the challenge, Gil.
My suggested solution for the first part is as follows:
Import data into Power BI or Power Query in Excel.
1. Fill down the Scenario column.
2. Group the table by Scenario column using All rows.
3. Fill down the [Features] table column within the filtered table ( ala Bill Szysz, Gil Ravi , et al)
4. Select last column and remove Other Columns.
5. Expand the selected column
Pronto !!! You’re done !
I like FrankT’s solution, but think it would be better if ended with a step to clean up remaining empty strings that were introduced into Feature column. Looking forward to learning recursive approach.
This is the M function I used to achieve the filled down table with the demoted headers step accommodating different column names assuming a similar Table structure ,( ie 3 columns) and hierarchy. About the different level of hierarchies, that became a little tricky but was almost there.
(Table as table) => let
#”Demoted Headers” = Table.DemoteHeaders(Table),
#”Added Custom” = Table.AddColumn(#”Demoted Headers”, “Parent”, each if [Column2] = null and [Column3] = null then 0 else null),
#”Filled Down” = Table.FillDown(#”Added Custom”,{“Column1”, “Column2″}),
#”Added Custom1″ = Table.AddColumn(#”Filled Down”, “Column2.2″, each if [Parent] = 0 then null else [Column2]),
#”Removed Other Columns” = Table.SelectColumns(#”Added Custom1″,{“Column1”, “Column2.2”, “Column3″}),
#”Promoted Headers” = Table.PromoteHeaders(#”Removed Other Columns”, [PromoteAllScalars=true])
in
#”Promoted Headers”
As a function that can take a table of similar structure and hierarchy thanks to the Demoted Headers Step.
(Table as table) => let
#”Demoted Headers” = Table.DemoteHeaders(Table),
#”Added Custom” = Table.AddColumn(#”Demoted Headers”, “Parent”, each if [Column2] = null and [Column3] = null then 0 else null),
#”Filled Down” = Table.FillDown(#”Added Custom”,{“Column1”, “Column2″}),
#”Added Custom1″ = Table.AddColumn(#”Filled Down”, “Column2.2″, each if [Parent] = 0 then null else [Column2]),
#”Removed Other Columns” = Table.SelectColumns(#”Added Custom1″,{“Column1”, “Column2.2”, “Column3″}),
#”Promoted Headers” = Table.PromoteHeaders(#”Removed Other Columns”, [PromoteAllScalars=true])
in
#”Promoted Headers”
Hi Shungu,
Thank you for sharing. Try to modify this function to accept another parameter for the number of hierarchies and operate on multiple columns.
I suspect that you would need to use List.Accumulate to iterate over the creation of new columns.
//replace the hierarchy columns with versions prepped for a Table.FillDown, then do the FillDown
// (also minimal args validation)
(Data as table, HierarchyColumns as list) as table =>
let
AllColumns = Table.ColumnNames(Data),
HierIndexes = List.Transform(HierarchyColumns, each List.PositionOf(AllColumns, _)),
HierData = List.Transform(HierarchyColumns, each Table.Column(Data, _)),
ValidateInput =
let
N = List.Count(HierarchyColumns),
Names = Text.Combine(HierarchyColumns, “, “),
ListTooSmall = error “Hierarchy must be 2 or more columns: ” & Names,
NeedsTwo = N > 1 or ListTooSmall,
NameMismatch = error “Hierarchy column(s) not found in input table: ” & Names,
ColumnsCheck = List.Count(HierIndexes) = N or NameMismatch
in
NeedsTwo and ColumnsCheck,
NullToEmpty = each if _=null or Text.Trim(_)=”” then “” else _,
EmptyToNull = each if _=null or Text.Trim(_)=”” then null else _,
PrepLevel = (Prv as list, Index as number) as list =>
let
IsHier = not List.IsEmpty(Prv),
HierCol = HierData{Index},
PrepHier = List.Transform(HierCol, EmptyToNull),
New = List.Transform(List.Zip({PrepHier, Prv{0}}),
each if Value.Is(_{0}, type text) then _{0} else if _{1} “” then null else “”),
Result = if IsHier
then {New} & Prv
else {List.Transform(HierCol, NullToEmpty)}
in Result,
DenseHierarchy = List.Accumulate(List.Reverse(List.Positions(HierData)), {}, PrepLevel),
PreppedDataTable =
if List.Count(AllColumns) = List.Count(HierarchyColumns)
then Table.FromColumns(DenseHierarchy, HierarchyColumns)
else Table.FromColumns(
List.ReplaceRange(Table.ToColumns(Data), HierIndexes{0}, List.Count(HierIndexes), DenseHierarchy),
AllColumns),
Result = Table.FillDown(PreppedDataTable, List.RemoveLastN(HierarchyColumns, 1))
in if ValidateInput then Result else “”
//There’s a “not equals” in the code that got ripped because…HTML
//Here’s the line with a C language “not equals” (!=) where the equivalent M operator would be:
New = List.Transform(List.Zip({PrepHier, Prv{0}}),
each if Value.Is(_{0}, type text) then _{0} else if _{1} != “” then null else “”),
//Also to use the code:
let
Source = Excel.CurrentWorkbook(){[Name=”Sample1″]}[Content],
Result = SparseToDense(Source, {“Scenario”, “Feature”, “Backlog Item”})
in
Result
//Tested for both examples in the post, but nothing else :/
SteveJ Thank you for sharing. I will publish your solution in my next blog post. You are welcome to contact me and publish a guest blog post to explain your code.
Neat problem. Here is a solution that used recursion and group by.
// Backlogs_Sln
let
Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
/* Source = table, */
FillDown = recursiveFillDown(Source, 0),
Headers = Table.ColumnNames(Source),
TableExpand = recursiveTableExpand(FillDown, 0, Headers)
in
TableExpand
// recursiveFillDown
(table, depth) =>
let
Headers = Table.ColumnNames(table),
#”Filled Down” = Table.FillDown(table,{Headers{depth}}),
#”Grouped Rows” = Table.Group(#”Filled Down”, List.FirstN(Headers,depth + 1), {{“Table”, each _, type table}}),
nestedTables = Table.SelectColumns(#”Grouped Rows”,{“Table”}),
recursion = Table.TransformColumns(nestedTables, {{“Table”, each @recursiveFillDown(_, depth + 1)}}),
result = if depth >= (Table.ColumnCount(table) – 1) then nestedTables else recursion
in
result
// recursiveTableExpand
(table, depth, headers) =>
let
expandTable = Table.ExpandTableColumn(table, “Table”, {“Table”}),
intermediateTable = @recursiveTableExpand(expandTable, depth + 1, headers),
finalTable = Table.ExpandTableColumn(table, “Table”, headers),
result = if depth >= (List.Count(headers) – 1) then finalTable else intermediateTable
in
result
Jeremy, Thank you for sharing. I will publish your solution in my next blog post. You are welcome to contact me and publish a guest blog post to explain your code.
Here you go. No recursion or custom functions necessary. Essentially it uses a dumb fill to populate all of the possible cells, but then gets the actual column depth for a record and removes any fills outside of that depth.
// Smart Hierarchy Fill
let
Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
Demote_Header = Table.DemoteHeaders(Source),
Set_Row_Nbr = Table.AddIndexColumn(Demote_Header, “Row_Nbr”, 0, 1),
Unpivot_Data = Table.UnpivotOtherColumns(Set_Row_Nbr, {“Row_Nbr”}, “Column”, “Value”),
Good_Data = Table.TransformColumns(Unpivot_Data, {{“Column”, each Number.From(Text.Replace(_, “Column”, “”)), Int64.Type}, {“Row_Nbr”, each Number.From(_), Int64.Type}}),
Remove_Header = Table.SelectRows(Good_Data, each ([Row_Nbr] 0)),
Repivot_Without_Header = Table.Pivot(Table.TransformColumnTypes(Remove_Header, {{“Column”, type text}}, “en-US”), List.Distinct(Table.TransformColumnTypes(Remove_Header, {{“Column”, type text}}, “en-US”)[Column]), “Column”, “Value”),
Dumb_Fill_Down = Table.FillDown(Repivot_Without_Header,Table.ColumnNames(Repivot_Without_Header)),
Unpivot_Dumb_Fill = Table.UnpivotOtherColumns(Dumb_Fill_Down, {“Row_Nbr”}, “Column”, “Value”),
Dumb_Data = Table.TransformColumnTypes(Unpivot_Dumb_Fill,{{“Column”, Int64.Type}}),
Combine_Good_And_Dumb = Table.NestedJoin(Good_Data,{“Row_Nbr”},Dumb_Data,{“Row_Nbr”},”Dumb_Data”,JoinKind.Inner),
Expand_Dumb_Data = Table.ExpandTableColumn(Combine_Good_And_Dumb, “Dumb_Data”, {“Column”, “Value”}, {“Dumb_Data.Column”, “Dumb_Data.Value”}),
Filter_Dumb_Records_For_Brilliance = Table.SelectRows(Expand_Dumb_Data, each ([Dumb_Data.Column] <= [Column])),
Finalize_Data = Table.SelectColumns(Filter_Dumb_Records_For_Brilliance,{"Row_Nbr", "Dumb_Data.Column", "Dumb_Data.Value"}),
Rename_Columns = Table.RenameColumns(Finalize_Data,{{"Dumb_Data.Column", "Column"}, {"Dumb_Data.Value", "Value"}}),
Isolate_Header = Table.SelectRows(Good_Data, each ([Row_Nbr] = 0)),
Append_Finalized_To_Header = Table.Combine({Isolate_Header, Rename_Columns}),
Repivot_Data = Table.Pivot(Table.TransformColumnTypes(Append_Finalized_To_Header, {{"Column", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(Append_Finalized_To_Header, {{"Column", type text}}, "en-US")[Column]), "Column", "Value"),
Ensure_Original_Sort = Table.Sort(Repivot_Data,{{"Row_Nbr", Order.Ascending}}),
Remove_Row_Nbr = Table.RemoveColumns(Ensure_Original_Sort,{"Row_Nbr"}),
Restore_Original_Header = Table.PromoteHeaders(Remove_Row_Nbr, [PromoteAllScalars=true])
in
Restore_Original_Header
Micah, Thank you for sharing. I will publish your solution in my next blog post. You are welcome to contact me and publish a guest blog post to explain your code.
Here’s my solutions fyi:
Solution 1: Table.Group. The 5th parameter was used.
let
Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
Group = Table.Group( Source,
“Scenario”,
{“a”,each Table.FillDown(_,{“Scenario”,”Feature”})},
0,
(x,y)=>Byte.From(ynull)),
Result = Table.Combine(Group[a])
in
Result
Solution 2: List.Generate
let
Source = Excel.CurrentWorkbook(){[Name=”Backlogs”]}[Content],
gen = let lst = List.Buffer(Table.ToRows(Source)),
num = List.Count(lst)
in List.Generate( ()=>{{},0},
each _{1}<=num,
each let l = List.Buffer(lst{_{1}}?),
a = List.FirstN(l,each _=null),
n = List.Count(a),
b = List.Buffer(List.FirstN(_{0},n)),
v = List.ReplaceRange(l,0,n,b)
in {v,_{1}+1},
each _{0} ),
rlt = #table(3,List.Skip(gen))
in
rlt