Compare Tables in Excel & Power BI – Part 2

As a follow-up for the tutorial here, in today’s blog post, we will show you how to use an improved version of the Table Comparison implementation, and apply a comparison of rows with matching keys. The solution can be used for many purposes:

  • Audit repetitive changes on the same table
  • Create an automated revision control for your data
  • Automate regression tests in your reporting solution

Compare two tables in Excel

 

To follow this tutorial, and create an automated comparison of two tables, we will use Power Query, AKA Get & Transform in Excel 2016. You can follow the steps below with Excel 2010 and 2013 after installing Power Query Add-In, or in Power BI Desktop.

Want to download the solution?

You can now purchase the full solution

Buy Now

Want to create the solution from scratch?

Create a query with the name FnCompareTables, and the following M expression (copy and paste it into the Advanced Editor):

(Source1 as table, Source2 as table, UniqueKeys as list, Delimiter as text) as table =>
let
    //UniqueKeys = {"ID1", "ID2"},
    //Delimiter = ",",

    fnPrepTable = (src, name, uniqueKeys as list, delimiter)=>
    let
        //uniqueKeys = {"ID"},
        #"Replaced Value" = Table.ReplaceValue(src,null,"",Replacer.ReplaceValue,Table.ColumnNames(src)),
        #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Replaced Value", uniqueKeys, "Attribute", "Value"),
        #"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Attribute", "Column"}, {"Value", name & " Value"}}),

        columnTypesList = List.Transform(uniqueKeys, each {_, type text}),

        #"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Renamed Columns", columnTypesList, "en-US"), uniqueKeys, Combiner.CombineTextByDelimiter(delimiter, QuoteStyle.None),"Keys")
    in
        #"Merged Columns",

    Table1 = fnPrepTable(Source1, "Table1", UniqueKeys, Delimiter),
    Table2 = fnPrepTable(Source2, "Table2", UniqueKeys, Delimiter),
    Merged = Table.NestedJoin(Table1,{"Keys", "Column"} ,Table2,{"Keys", "Column"},"NewColumn",JoinKind.FullOuter),

    #"Expanded NewColumn" = Table.ExpandTableColumn(Merged, "NewColumn",  {"Keys", "Column", "Table2 Value"}, {"Keys.1", "Column.1", "Table2 Value"}),
    #"Added Conditional Column" = Table.AddColumn(#"Expanded NewColumn", "Diff", each [Table1 Value] <> [Table2 Value] ),
    #"Replaced Value1" = Table.ReplaceValue(#"Added Conditional Column",null,"",Replacer.ReplaceValue,{"Table1 Value", "Table2 Value"}),
    #"Inserted Merged Column" = Table.AddColumn(#"Replaced Value1", "Change", each if [Diff] then Text.Combine({Text.From([Table1 Value]), Text.From([Table2 Value])}, " --> ") else [Table1 Value], type text),
    #"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"Table1 Value", "Table2 Value"}),
    #"Added Conditional Column2" = Table.AddColumn(#"Removed Columns", "Row", each if [Keys] = null then [Keys.1] else [Keys] ),
    #"Removed Columns2" = Table.RemoveColumns(#"Added Conditional Column2",{"Keys", "Keys.1"}),
    #"Added Conditional Column3" = Table.AddColumn(#"Removed Columns2", "New Column", each if [Column] = null then [Column.1] else [Column] ),
    #"Removed Columns3" = Table.RemoveColumns(#"Added Conditional Column3",{"Column", "Column.1"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns3",{{"New Column", "Column"}}),
    #"Pivoted Column" = Table.Pivot(#"Renamed Columns", List.Distinct(#"Renamed Columns"[Column]), "Column", "Change"),
    #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([Diff] = true)),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"Diff"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns1",{{"Amount", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Row", Splitter.SplitTextByDelimiter(Delimiter, QuoteStyle.Csv), UniqueKeys)
in
    #"Split Column by Delimiter"

Import your tables as two separate queries: Source1 and Source2.

Create a blank query and use the function as follows:

= FnCompareTables(Source1, Source2, [single or multiple columns key], [delimiter])

[single or multiple columns key] is a list of column headers, for example: { “ID1”, “ID2” }

[delimiter] is a text used to merge multiple columns. Don’t use a character that is already used in the headers. For example: If your multiple columns key is the combination of the columns Transaction-ID and Product-ID {“Transaction-ID”, “Product-ID”}, don’t use dash “-” as a delimiter.

Here are few examples that invoke FnCompareTables:

The example below can be used if you have multiple columns key as shown in screenshot below.

= FnCompareTables(Source1, Source2, {"ID1", "ID2"}, ",")

The example below can be used if you don’t have any unique IDs, and you want to ensure that the tables are matching by their row order. In this case, you can create the Index column and use it as the unique key.

= FnCompareTables(Source1, Source2, {"Index"}, ",")

The example below can be used if you have a unique key in a single column Transaction ID.

= FnCompareTables(Source1, Source2, {"Transaction ID"}, ",")

Next step is to load the query to a table and view the highlighted differences.

Matching rows (by multiple columns keys) will highlight the differences as shown in this screenshot:

 

Missing rows in the first table will be highlighted as follows:

Missing rows in the second table will be highlighted as follows:

Hope you find this tutorial helpful.

Subscribe to DataChant and download the solution workbook from here.

Subscribe ToDataChant

Join our mailing list to receive exclusive content that was used in the preparation of this article.

Thank you for subscribing to DataChant. An email will be sent to you shortly with more details.

Leave a Reply