Text Obfuscation in Power BI and Power Query in Excel

Power BI and Excel offer a wide variety of collaboration and sharing experiences. In today’s blog post, we will share a useful method to easily obfuscate / mask confidential text (e.g. client names) in Power BI, and share the reports and dashboards without exposing the original text to others. With the method below, you can streamline the obfuscation phase into your queries in Power BI and Excel, so the actual data that is stored in the report is already obfuscated when the data is refreshed. And everything is automatic!!!

To download the entire Power BI solution, subscribe to DataChant, get the link to the subscribers’ folder by email, and download the file: Text Obfuscation.pbix.

Subscribe To DataChant

Join our mailing list to receive exclusive content that was used in the preparation of this article.

Thank you for subscribing to DataChant. An email will be sent to you shortly with more details.

How can we obfuscate text in our Power  BI reports?

We will build queries that will randomize our text messages, by converting each English letter to a different one.

Note: In this blog post we will share a method that convert English letters, or any language that shares the English characters. The solution that is shared with DataChant subscribers has another method to obfuscate any language and any text.

So, the text message: “This is a top secret text”, can randomly be converted to “Nlri ri s vqt imzhmv vmwv”, or any other combination of letters. Each refresh, different conversion rules are generated, so the likelihood to decode the obfuscated text is low (Especially if you frequently refresh the report).

Ready to start?

Let’s assume you have a Power BI report with table that contains confidential textual column.

Obfuscate text in Power BI

Create a blank query, and paste the following M expression in the Advanced Editor:

Rename the new query to ConversionTableEnglish.

Explaining the code:

The code above, receives two ranges of numbers, that represent the ASCII codes for the uppercase letters ABC..Z and the lowercase ones: abc…z. Next, we create two tables that translate each ASCII number in the uppercase and lowercase subsets into a different random number in the subset. Next, we append the two tables into a single record of conversions, as shown in the screenshot below. Each key-value pair is a conversion rule between a letter (as a key), and the converted letter (as a value).

Obfuscate Text in Power BI and Excel

Next, in the code above, we convert the record to a table:

Text obfuscation in Power BI

Next, we transpose the table and use the first row as headers, to allow us to easily access each letter by its column name and find the new letter we should use.

And here is the final preview of the query ConversionTableEnglish:

Now, let’s create a new blank query, and paste the following expression in the Advanced Editor:

Rename the new query to FnObfuscateText.

Select the query with the textual column you want to obfuscate, and click on the fx icon to add a new step.

Next, copy and paste the formula below into the formula bar

Inside the formula, you should modify Source to reflect the actual name you have in previous step of Applied Steps, Then, modify Confidential Column to reflect the actual column name that needs to be obfuscated.

Now you can close the Query Editor by clicking Close & Apply, and all your visuals who used the textual column will now show obfuscated text.

Now you can publish the report, create a dashboard and share it with colleagues and clients with better means to hide confidential data. If you are using the Publish to Web functionality to share Power BI report on your website, this is a great method to automate and streamline an obfuscation step, which will allow you to share reports without the manual masking of your confidential data.

Hope you’ll find this blog post useful. Don’t forget to subsribe to gain access to the Power BI report. You will be able to reuse it from now one, when you need to mask data before you share your reports.

 

11 comments

  1. CjReply

    I think I would be inclined to leave the field off my report or simply replace it with some standard text like “[customer name removed]” than to scramble it with random letters. While this is a creative use of M, I’m not sure that it is something that would likely be used at my company. Am I missing a reason why this would be the preferred approach?

    Thanks for your great insight into PowerQuery/M!

    • Gil Raviv Post authorReply

      Hi Cj,
      I don’t think that the obfuscation approach is better than your approach. It is a matter of design. In your approach, you may still need to assign unique IDs as a suffix for “[customer name removed]” otherwise some of your visuals and slicers will show a single customer.

  2. CjReply

    Nice! Hadn’t thought through how the dynamic aspects of this might work. I was simply seeing rows on a report. Very good!

  3. JulianReply

    I tried to refresh many times but the text in Japanese just remained unchaged without obfuscating, why?

    • Gil Raviv Post authorReply

      For non-English languages you can find my solution in DataChant subscription folder that I send by email.

  4. JulianReply

    I’ve not received any email from you and I checked your folder the “Text Obfuscation.pbix” remained unchanged – last modified on 03/01?

    • Gil Raviv Post authorReply

      Hi Julian,
      The “Text Obfuscation.pbix” file contains a query that obfuscate all languages. If you cannot find it, please let me know. I will reply with detailed example soon.

    • Gil Raviv Post authorReply

      Hi Julian.
      I have fixed the report. Japanese characters are now obfuscated. My original code didn’t handle Japanese.

  5. JohnnyanReply

    Hi, could you also let us know how to do a random replacement for numbers in a column ?
    I tried using Number.RandomBeteen but I am getting the following error:

    Expression.Error: We cannot convert the value xxxxxxxxxxxxx to type Function.

Leave a Reply