Masking confidential data in Power BI reports

4 years ago, I published on my blog a technique to mask data in Power BI. In today’s post, I wanted to re-share it and illustrate how effective it can for demo purposes – Let’s say you work with confidential data, and you would like to share your Power BI report with an external audience but hide some items. How can you take away the confidential elements in your report?

In my last post, I announced a new Power BI app that I released on AppSource. To be able to release the app quickly, I needed to provide demo data in the report that will allow you to install it without seeing any client-specific data.

The highlighted sections in my report are masked. Every refresh a new set of randmoized letters will be generated.

Instead of creating fake data, or manually mask the data on the external source, I used Power Query to randomize the data during the refresh time. In today’s post, I will go through the main steps that will allow you to repeat this technique and mask real data in a way you will be able to present it.

Here are some of the advantages of this technique:

  1. No need to change the data on the external data source
  2. Faster implementation time
  3. Confidential data is not loaded into the model, and cannot be reversed engineered by users
  4. Every refresh the characters’ mapping will be randomized differently, so, for hackers, to reverse engineer the text may not be a practical effort to try.
  5. The masked values keep their dimensional identities in the model. For example, you can scramble a client name or an email address, and the client will have the same amounts in your fact tables.
  6. If you are in the consulting business or work with external partners, you can use this technique on any report and omit and confidential data from the reports.
  7. If you build Power BI apps, you can use this technique to generate real data into demo data without spending too much time on creating completely fake datasets.
  8. If you want to show the same report to both internal and external users, you can combine the masked data and the unmasked data on the same column and add a new column to differentiate between masked and unmasked values. Then apply Row Level Security to show the masked or unmasked records to relevant audiences.

The M Code

In this section I will share the M code that I use, without further explanations on how it works.

This query performs the randomization of characters. We will refer to it as ConversionTableEnglish in the next blocks of code. You can change it to accept other non-ascii supported languages. I may share an example in another blog post in the future.

    FnConvertRange = (StartingChararcterNum, Count) =>
        Source = List.Random(Count),
        #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "To"}}),
        #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", StartingChararcterNum, 1),
        #"Sorted Rows" = Table.Sort(#"Added Index",{{"To", Order.Ascending}}),
        #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index.1", StartingChararcterNum, 1),
        #"Removed Columns" = Table.RemoveColumns(#"Added Index1",{"To"}),
        #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Index.1", "Index"}),
        #"Renamed Columns1" = Table.RenameColumns(#"Reordered Columns",{{"Index.1", "From"}, {"Index", "To"}}),
        #"Transposed Table" = Table.Transpose(#"Renamed Columns1"),
        ConvertedResult = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true])

    Uppercase = FnConvertRange(65, 26),
    Lowercase = FnConvertRange(97, 26),

    Source = Record.Combine(List.Combine({Table.ToRecords(Uppercase),Table.ToRecords(Lowercase)})),
    #"Converted to Table" = Record.ToTable(Source),
    #"Transposed Table" = Table.Transpose(#"Converted to Table"),
    #"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true])
    #"Promoted Headers"

The next M expression is for a custom function which translates a given string into its masked version. We will refer to this function FnMaskText in the last block of code.

    Source = (txt, convTable) =>
    FnConvertCharacter = (c, convTable) =>
            num = Character.ToNumber(c),
            mod_num = Number.Mod(num, 2048),                
            converted_num = try Table.Column(convTable, Text.From(mod_num)){0} otherwise num,
            converted_char = Character.FromNumber(converted_num)

    ObfuscatedList = List.Transform(Text.ToList(txt), each try FnConvertCharacter(_, convTable) otherwise _),
    res = Text.Combine(List.Transform(ObfuscatedList, Text.From))

You can try the code below to run on this table of 3 strings and see how we use Table.AddColumn to generate the new Masked Column. Note, every refresh you will get a different text.

Every refresh you get a different randomized masking
    ConversionTableBuf = Table.Buffer(ConversionTableEnglish),
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCskvUChOTS5KLVGK1YlW8i8tgnIV0vKLcktzEq0UXG19neOMwNKeJak5OanJJaWJOQoBRfkFqUUllUqxsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Confidential Column" = _t]),
    #"Invoked Custom Function" = Table.AddColumn(Source, "Masked column", each FnMaskText([Confidential Column], ConversionTableBuf))
    #"Invoked Custom Function"

In this example, you can add a new parameter Is Demo, to allow you to toggler between a demo and a real report.

And here is the generic M code you can use to mask a column in your existing code using the Is Demo parameter. In this example, the column we want to mask is called ColumnToMask.

    Source = ...,


    LastStepWithConfiedntialData = ...,

    ConversionTableBuf = Table.Buffer(ConversionTableEnglish),
    Mask1 = Table.TransformColumns(
        {{"ColumnToMask", each FnMaskText(_, ConversionTableBuf), type text}}
    FinalTable = if #"Is Demo" then Mask1 else LastStepWithConfiedntialData

I hope you find this post useful. If you are looking for help in generating masked data in your reports, and you have an interesting use-case, please don’t hesitate to reach out.

Leave a Reply