For my first blog post I wanted to write about one of my favorite topics: Power BI Dataflows. If it’s the first time you read about this product or if it’s a product you have been trying to learn more about, this is meant for you.
What Are Power BI Dataflows?
Power BI Dataflows is Microsoft Power BI’s Self-Service ETL (Extract-Transform-Load) technology designed for all users. You can author Dataflows in the Power BI Service and the technology used to create them is Power Query. If you have experience with Power Query, you will easily recognize the same User Interface and will be able to utilize almost all the same functions as you would in the Power BI Desktop in the Transform data (Power Query) window.
Power BI Dataflows – Power Query Editor
Power BI Desktop – Power Query Editor
Where Do Power BI Dataflows Fit In Your Power BI Solution?
Types Of Entities Available In Power BI Dataflows
When you are authoring a solution using Power BI Dataflows, you will create Dataflows and tables. Let’s break it down.
A Dataflow is made up of 1 or more table. A table is made up of columns and rows.
There are multiple options for creating tables:
- New Table – Same experience as a new Table in the Power BI Desktop Power Query Editor. You select the data source, your tables and apply any transformations you would like. Same experience as creating a new table in the PBI Desktop App.
- Linked Table – Similar to the Reference function in the PBI Desktop App, you can reference entities previously created in other Dataflows and apply additional logic.
- Computed Table – New table that is created when you apply transformations to linked entities. This means that the transformations run in the Power BI Service memory, instead of the Gateway or datasource (depending on the design of your solution).
When Should You Use Dataflows?
Single Source Of Truth
The keyword is Governance. As your organization grows and more people start adopting Power BI as the technology to create BI solutions, you may need to think about Governance and the best way to make sure people across teams are looking at the same data and KPIs. With Power BI Dataflows, you create that single source of truth with the necessary transformations to create Power BI Reports. There is only one owner and you can provide Reader access to other users to author reports using the Dataflows you create. This gives you control on the data the teams use to report on and also the data and sources people have access. When sharing Dataflows, the users don’t require data source access, they can read the Dataflows you authored and use that version instead.
Reusable Transformation Logic
If you feel like you are applying the same transformations for tables on different reports in the Power BI Desktop using Power Query, it is time to start using Dataflows. You can create Dataflows that apply the transformation logic that you need for multiple reports and then you can connect to the Dataflows entities directly from the Power BI Desktop app. You can even continue to apply more transformations specific to the reports from the Desktop app. This will not only help you reduce the time when you need to modify or update the logic on a table, it will also reduce the load on the data source that may help avoiding any refresh issues. Think about only having to refresh a Dataflow once, instead of refreshing 6 reports at the same time.
Share Data To Other Services In Azure
Maybe you are interested in starting to work with other services in Azure such as Cognitive Services. With Power BI Dataflows you can write your data to an Azure Data Lake Gen 2 instance and it will be saved in the CDM format.
CDM stands for the Common Data Model and it represents a schema that is shared across Microsoft services.
Working With Large Volumes Of Data
Previously we introduced the types of entities, Linked and Computed, these entities are available to you when you use Power BI Premium (Dedicated Capacity or Premium Per User plans). Dataflows support a growing number of on-premises and cloud sources and may reduce the refresh times for complex transformations when you leverage Linked and Computed entities. However, it’s important to mention that you may need to consider another technology to work alongside Dataflows, such as Azure Synapse or Databricks, in situations when your users need the data refreshed multiple times a day and the transformations are fairly complex and you are dealing with over millions of rows of data. Now, there is no specific limit on the number of rows or transformations, it will depend on your specific needs and the needs of your end users.
I hope you enjoyed this article and that after reading, you have a better understanding of what are Power BI Dataflows. Stay tuned, this is the first blog post but more to come.
Follow me on Twitter