Compare Tables in Excel & Power BI – Part 2

As a follow-up for the tutorial here, in today’s blog post, we will show you how to use an improved version of the Table Comparison implementation, and apply a comparison of rows with matching keys. The solution can be used for many purposes:

  • Audit repetitive changes on the same table
  • Create an automated revision control for your data
  • Automate regression tests in your reporting solution

Compare two tables in Excel

To follow this tutorial, and create an automated comparison of two tables, we will use Power Query, AKA Get & Transform in Excel 2016. You can follow the steps below with Excel 2010 and 2013 after installing Power Query Add-In, or in Power BI Desktop.

Want to download the solution?

You can now purchase the full solution

Want to create the solution from scratch?

Create a query with the name FnCompareTables, and the following M expression (copy and paste it into the Advanced Editor):

(Source1 as table, Source2 as table, UniqueKeys as list, Delimiter as text) as table =>
let
    //UniqueKeys = {"ID1", "ID2"},
    //Delimiter = ",",

    fnPrepTable = (src, name, uniqueKeys as list, delimiter)=>
    let
        //uniqueKeys = {"ID"},
        #"Replaced Value" = Table.ReplaceValue(src,null,"",Replacer.ReplaceValue,Table.ColumnNames(src)),
        #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Replaced Value", uniqueKeys, "Attribute", "Value"),
        #"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Attribute", "Column"}, {"Value", name & " Value"}}),

        columnTypesList = List.Transform(uniqueKeys, each {_, type text}),

        #"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Renamed Columns", columnTypesList, "en-US"), uniqueKeys, Combiner.CombineTextByDelimiter(delimiter, QuoteStyle.None),"Keys")
    in
        #"Merged Columns",

    Table1 = fnPrepTable(Source1, "Table1", UniqueKeys, Delimiter),
    Table2 = fnPrepTable(Source2, "Table2", UniqueKeys, Delimiter),
    Merged = Table.NestedJoin(Table1,{"Keys", "Column"} ,Table2,{"Keys", "Column"},"NewColumn",JoinKind.FullOuter),

    #"Expanded NewColumn" = Table.ExpandTableColumn(Merged, "NewColumn",  {"Keys", "Column", "Table2 Value"}, {"Keys.1", "Column.1", "Table2 Value"}),
    #"Added Conditional Column" = Table.AddColumn(#"Expanded NewColumn", "Diff", each [Table1 Value] <> [Table2 Value] ),
    #"Replaced Value1" = Table.ReplaceValue(#"Added Conditional Column",null,"",Replacer.ReplaceValue,{"Table1 Value", "Table2 Value"}),
    #"Inserted Merged Column" = Table.AddColumn(#"Replaced Value1", "Change", each if [Diff] then Text.Combine({Text.From([Table1 Value]), Text.From([Table2 Value])}, " --> ") else [Table1 Value], type text),
    #"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"Table1 Value", "Table2 Value"}),
    #"Added Conditional Column2" = Table.AddColumn(#"Removed Columns", "Row", each if [Keys] = null then [Keys.1] else [Keys] ),
    #"Removed Columns2" = Table.RemoveColumns(#"Added Conditional Column2",{"Keys", "Keys.1"}),
    #"Added Conditional Column3" = Table.AddColumn(#"Removed Columns2", "New Column", each if [Column] = null then [Column.1] else [Column] ),
    #"Removed Columns3" = Table.RemoveColumns(#"Added Conditional Column3",{"Column", "Column.1"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns3",{{"New Column", "Column"}}),
    #"Pivoted Column" = Table.Pivot(#"Renamed Columns", List.Distinct(#"Renamed Columns"[Column]), "Column", "Change"),
    #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([Diff] = true)),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"Diff"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns1",{{"Amount", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Row", Splitter.SplitTextByDelimiter(Delimiter, QuoteStyle.Csv), UniqueKeys)
in
    #"Split Column by Delimiter"

Import your tables as two separate queries: Source1 and Source2.

Create a blank query and use the function as follows:

= FnCompareTables(Source1, Source2, [single or multiple columns key], [delimiter])

[single or multiple columns key] is a list of column headers, for example: { “ID1”, “ID2” }

[delimiter] is a text used to merge multiple columns. Don’t use a character that is already used in the headers. For example: If your multiple columns key is the combination of the columns Transaction-ID and Product-ID {“Transaction-ID”, “Product-ID”}, don’t use dash “-” as a delimiter.

Here are few examples that invoke FnCompareTables:

The example below can be used if you have multiple columns key as shown in screenshot below.

= FnCompareTables(Source1, Source2, {"ID1", "ID2"}, ",")

The example below can be used if you don’t have any unique IDs, and you want to ensure that the tables are matching by their row order. In this case, you can create the Index column and use it as the unique key.

= FnCompareTables(Source1, Source2, {"Index"}, ",")

The example below can be used if you have a unique key in a single column Transaction ID.

= FnCompareTables(Source1, Source2, {"Transaction ID"}, ",")

The next step is to load the query to a table and view the highlighted differences.

Matching rows (by multiple columns keys) will highlight the differences as shown in this screenshot:

Missing rows in the first table will be highlighted as follows:

Missing rows in the second table will be highlighted as follows:

Hope you find this tutorial helpful.

5 comments

  1. Lidou Reply

    Hello

    I just bought this solution.
    Is it possible to use it with 3 differents sources ?

    Best regards

    • Gil Raviv Post authorReply

      It was designed to support two tables. Of course you can run it multiple times for each pair. Do you want to share the requirements and desired output? Contact me at gilra@datachant.com

  2. Marc Reply

    Hi Gil, this looks like a great functionality, but I get an error message on the column “Amount” not being found when the function is executed. Indeed the code seems to be trying to convert the format of a non-created column. Is this me or is this something that need to be fixed? I’m no expert, sorry if this is not a real issue!

    • Gil Raviv Post authorReply

      Hi Marc
      Can you send me your report by email? (gilra@datachant.com)

  3. Marc Reply

    Hi Gil,
    Many thanks for the ultra-fast feedback!
    I just sent the report by mail. Any hint you can provide will be much appreciated!

    Marc

Leave a Reply