Text Obfuscation in Power BI and Power Query in Excel

Power BI and Excel offer a wide variety of collaboration and sharing experiences. In today’s blog post, we will share a useful method to easily obfuscate / mask confidential text (e.g. client names) in Power BI, and share the reports and dashboards without exposing the original text to others. With the method below, you can streamline the obfuscation phase into your queries in Power BI and Excel, so the actual data that is stored in the report is already obfuscated when the data is refreshed. And everything is automatic!!!

To download the entire Power BI solution, subscribe to DataChant, get the link to the subscribers’ folder by email, and download the file: Text Obfuscation.pbix.

How can we obfuscate text in our Power  BI reports?

We will build queries that will randomize our text messages, by converting each English letter to a different one.

Note: In this blog post we will share a method that convert English letters, or any language that shares the English characters. The solution that is shared with DataChant subscribers has another method to obfuscate any language and any text.

So, the text message: “This is a top secret text”, can randomly be converted to “Nlri ri s vqt imzhmv vmwv”, or any other combination of letters. Each refresh, different conversion rules are generated, so the likelihood to decode the obfuscated text is low (Especially if you frequently refresh the report).

Ready to start?

Let’s assume you have a Power BI report with table that contains confidential textual column.

Obfuscate text in Power BI

Create a blank query, and paste the following M expression in the Advanced Editor:

let
    
    FnConvertRange = (StartingChararcterNum, Count) =>
    let
        Source = List.Random(Count),
        #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "To"}}),
        #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", StartingChararcterNum, 1),
        #"Sorted Rows" = Table.Sort(#"Added Index",{{"To", Order.Ascending}}),
        #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index.1", StartingChararcterNum, 1),
        #"Removed Columns" = Table.RemoveColumns(#"Added Index1",{"To"}),
        #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Index.1", "Index"}),
        #"Renamed Columns1" = Table.RenameColumns(#"Reordered Columns",{{"Index.1", "From"}, {"Index", "To"}}),
        #"Transposed Table" = Table.Transpose(#"Renamed Columns1"),
        ConvertedResult = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true])
    in
        ConvertedResult,

    Uppercase = FnConvertRange(65, 26),
    Lowercase = FnConvertRange(97, 26),

    Source = Record.Combine(List.Combine({Table.ToRecords(Uppercase),Table.ToRecords(Lowercase)})),
    #"Converted to Table" = Record.ToTable(Source),
    #"Transposed Table" = Table.Transpose(#"Converted to Table"),
    #"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true])
in
    #"Promoted Headers"

Rename the new query to ConversionTableEnglish.

Explaining the code:

The code above, receives two ranges of numbers, that represent the ASCII codes for the uppercase letters ABC..Z and the lowercase ones: abc…z. Next, we create two tables that translate each ASCII number in the uppercase and lowercase subsets into a different random number in the subset. Next, we append the two tables into a single record of conversions, as shown in the screenshot below. Each key-value pair is a conversion rule between a letter (as a key), and the converted letter (as a value).

Obfuscate Text in Power BI and Excel

Next, in the code above, we convert the record to a table:

Text obfuscation in Power BI

Next, we transpose the table and use the first row as headers, to allow us to easily access each letter by its column name and find the new letter we should use.

And here is the final preview of the query ConversionTableEnglish:

Now, let’s create a new blank query, and paste the following expression in the Advanced Editor:

(txt, convTable) =>
let

    FnConvertCharacter = (c, convTable) =>
        let
            num = Character.ToNumber(c),
            converted_num = try Table.Column(convTable, Text.From(num)){0} otherwise num,
            converted_char = Character.FromNumber(converted_num)
        in
            converted_char,

    ObfuscatedList = List.Transform(Text.ToList(txt), each FnConvertCharacter(_, convTable)),
    res = Text.Combine(List.Transform(ObfuscatedList, Text.From))
in
    res

Rename the new query to FnObfuscateText.

Select the query with the textual column you want to obfuscate, and click on the fx icon to add a new step.

Next, copy and paste the formula below into the formula bar

= Table.TransformColumns(Source,{"Confidential Column", each FnObfuscateText(_, ConversionTableEnglish)})

Inside the formula, you should modify Source to reflect the actual name you have in previous step of Applied Steps, Then, modify Confidential Column to reflect the actual column name that needs to be obfuscated.

Now you can close the Query Editor by clicking Close & Apply, and all your visuals who used the textual column will now show obfuscated text.

Now you can publish the report, create a dashboard and share it with colleagues and clients with better means to hide confidential data. If you are using the Publish to Web functionality to share Power BI report on your website, this is a great method to automate and streamline an obfuscation step, which will allow you to share reports without the manual masking of your confidential data.

Hope you’ll find this blog post useful. Don’t forget to subsribe to gain access to the Power BI report. You will be able to reuse it from now one, when you need to mask data before you share your reports.

19 comments

  1. Cj Reply

    I think I would be inclined to leave the field off my report or simply replace it with some standard text like “[customer name removed]” than to scramble it with random letters. While this is a creative use of M, I’m not sure that it is something that would likely be used at my company. Am I missing a reason why this would be the preferred approach?

    Thanks for your great insight into PowerQuery/M!

    • Gil Raviv Post authorReply

      Hi Cj,
      I don’t think that the obfuscation approach is better than your approach. It is a matter of design. In your approach, you may still need to assign unique IDs as a suffix for “[customer name removed]” otherwise some of your visuals and slicers will show a single customer.

  2. Cj Reply

    Nice! Hadn’t thought through how the dynamic aspects of this might work. I was simply seeing rows on a report. Very good!

  3. Julian Reply

    I tried to refresh many times but the text in Japanese just remained unchaged without obfuscating, why?

    • Gil Raviv Post authorReply

      For non-English languages you can find my solution in DataChant subscription folder that I send by email.

  4. Julian Reply

    I’ve not received any email from you and I checked your folder the “Text Obfuscation.pbix” remained unchanged – last modified on 03/01?

    • Gil Raviv Post authorReply

      Hi Julian,
      The “Text Obfuscation.pbix” file contains a query that obfuscate all languages. If you cannot find it, please let me know. I will reply with detailed example soon.

    • Gil Raviv Post authorReply

      Hi Julian.
      I have fixed the report. Japanese characters are now obfuscated. My original code didn’t handle Japanese.

  5. Johnnyan Reply

    Hi, could you also let us know how to do a random replacement for numbers in a column ?
    I tried using Number.RandomBeteen but I am getting the following error:

    Expression.Error: We cannot convert the value xxxxxxxxxxxxx to type Function.

  6. Kyle Reply

    Hi, thanks for this – this is exactly what I’m after. However if I try to obfuscate multiple columns in the same table, it only retains the last step. The previous columns go back to their original state. Is there a way to obfuscate multiple columns in the same table?

    • Gil Raviv Post authorReply

      Thank you Kyle.
      You can obfuscate column by column in two separate steps. Not sure I understand why it didn’t work for you. Can you share a sample report?

      • Kyle

        Hi Gil, I figured out my mistake. I wasn’t modifying the “Source” with the previous step, therefore it wasn’t keeping that step. It’s working now! Thanks

  7. Priyadarshini Patwardhan Reply

    Also I subscribed but didnt receive the Text Obfuscation.pbix file. What could be the issue. I am new to power BI and I will be working on a Telecom client. I have 2 tables which is linked on the confidencial field (Customer Telephone no.). I want to mask that field. since that column is also used for linking two table so i need to just scarmble the number in the dashboard.

  8. PJ Brumit Reply

    Say I have a list of clients names as aaa, bbb, ccc, and ddd. I want to apply the obfuscation to all except client bbb. How would I go about doing that?

  9. Semi Faceless Reply

    say I want to implement role based masking. Manager of team A should see the employee names of team A, and masked employee names of team B. Similarly for manager of team B. Additionally employee of team A should see other members of team A as masked (own name not masked).

    • Gil Raviv Post authorReply

      I would have two disconnected tables. One with all the employees with unique ID and masked names. The other table with Employee unique ID, Email and unmasked name. On the second table you would apply RLS (dynamic). From here you can start building measures and use LOOKUPVALUE whenever you need to show the employee names for the relevant manager’s team. As an alternative to LOOKUPVALUE you may also use inactive relationship between the two tables (to avoid having the unmasked second table filtering the first table). And use USERELATIONSHIP to activate the filter when you need to show the names of your team in a measure.

Leave a Reply