Find overlapping time periods using #PowerQuery

In Chapter 11 of my book, I introduced the Cartesian Product technique that enables you to combine two datasets in order to apply a calculation or filtering on each combination of paired records. In this article, you will learn how to apply this technique to find overlapping time periods.

The Overlapping Time Periods Challenge:

Imagine you are a project manager responsible for tracking the progress of a mission-critical project in your organization. You have a list of resources that are assigned to tasks. Your job is to find the over-allocated resources that are assigned to multiple tasks with overlapping time periods (as demonstrated in the figure below).

Find overlapping time periods using Power Query
Can you find the overlapping time periods and the over-allocated resources?

In the example below, you can find 5 tasks allocated to 3 resources. The figure above highlights the overlapping time periods. While it is easy to detect the overlaps in a Gantt chart on a small dataset, it is much more interesting to assume that the number of tasks and resources can be significant high. Can you automate the detection of overlaps in Power BI or Excel? Before you read my solution, I encourage you to try it yourself using the sample data below:

TaskResourceStartEnd
Task 1Resource 17/2/2019 9:00:00 AM7/2/2019 5:00:00 PM
Task 2Resource 27/2/2019 11:00:00 AM7/2/2019 9:00:00 PM
Task 3Resource 27/2/2019 2:00:00 PM7/2/2019 8:00:00 PM
Task 4Resource 37/2/2019 4:00:00 PM7/2/2019 11:00:00 PM
Task 5Resource 37/2/2019 1:00:00 PM7/2/2019 7:00:00 PM

The Solution

In the next article I will provide step-by-step walkthrough of the solution. For now, here is the M code. You can download my solution here. If you have a different solution, please contact me and I may publish it along my detailed solution.

The M Code

Import the table above and name the query Tasks. To do it you can copy and paste the code below into a blank query in Power Query:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCkkszlYwVNJRCkotzi8tSk4Fc8z1jfSNDAwtFSytDAyASMHRF1nUFCoa4KsUqwM1wwjZDCNk1YaGWA2xxGKIMU5DjBCqkUQtsJhhgmyGMbJqE6xmwJ2HbIgpTkMMsRpijmRGLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Task = _t, Resource = _t, Start = _t, End = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Task", type text}, {"Resource", type text}, {"Start", type datetime}, {"End", type datetime}})
in
    #"Changed Type"

Create a function using a blank query and name it MinDate. This function compares between two dates and return the earliest one.

(date1, date2)=>
    if date1 < date2 then date1 else date2

Create a function using a blank query and name it MaxDate. This function compares between two dates and return the latest one.

(date1, date2)=>
    if date1 >= date2 then date1 else date2

Create a new blank query and name it Overlapping Tasks. Paste the following code:

let
    Source = Tasks,
    BufferedSource = Table.Buffer(Source),
    #"Added Custom" = Table.AddColumn(Source, "Source2", each BufferedSource),
    #"Expanded Source2" = Table.ExpandTableColumn(#"Added Custom", "Source2", {"Task", "Resource", "Start", "End"}, {"Source2.Task", "Source2.Resource", "Source2.Start", "Source2.End"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Source2",{{"Source2.Start", type datetime}, {"Source2.End", type datetime}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Task] <> [Source2.Task] and [Resource] = [Source2.Resource])),
    #"Added Custom1" = Table.AddColumn(#"Filtered Rows", "Overlapping Duration", each Duration.TotalHours(
        MinDate([End], [Source2.End]) -
        MaxDate([Start], [Source2.Start])
)),
    #"Filtered Rows1" = Table.SelectRows(#"Added Custom1", each [Overlapping Duration] > 0),
    #"Invoked Custom Function" = Table.AddColumn(#"Filtered Rows1", "Overlap Start", each MaxDate([Start], [Source2.Start])),
    #"Invoked Custom Function1" = Table.AddColumn(#"Invoked Custom Function", "Overlap End", each MinDate([End], [Source2.End])),
    #"Removed Other Columns" = Table.SelectColumns(#"Invoked Custom Function1",{"Task", "Resource", "Overlapping Duration", "Overlap Start", "Overlap End"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Overlapping Duration", Int64.Type}, {"Overlap Start", type datetime}, {"Overlap End", type datetime}})
in
    #"Changed Type1"

Visualizing overlapping tasks

In this PBIX file, you can find the Power Query solution and a simple visualization of the overlapping tasks. The report loads the tasks into a custom visual (as Timeline by Antti Suanto), and shows the over-allocated resource on a separate table with their corresponding tasks and the duration of the time overlaps.

When you click on one of the tasks in the as Timeline visual, you can see which tasks are overlapping. For example, in this screenshot, I clicked on Task 4 which is assigned to Resource 3 and found out that this resource is over allocated as it is also assigned to Task 5 with an overlap of 3 hours, starting from 7/2/2019 7PM.

I hope you find this technique useful. There are several other solutions that doesn’t require Cartesian Product (e.g. using List.DateTimes and many variations of DAX). But I like this one the most. Stay tuned to the next article to learn step-by-step how I built the query for this solution and contact me if you have solved it differently.

6 comments

  1. Den Reply

    Some time ago I ran into a similar task.
    I had 2 tables
    1) changes in the positions of employees for the period (Personnel Number, Position, Start of Period, End of Period).
    2) changes in the salaries of employees for the period (Personnel Number, Salary, Start of period, End of period).
    It was necessary to determine all the intersections of the salary table with the positions table and get a table with all the time periods where the position and salary remain unchanged, as well as determine the intervals within a given year in which there are no records for the employee. This could happen if an employee was hired, for example, May 1, 2018. Then, in the analysis of 2018, I will not have data on the position and salary for the period from 01/01/2018 to 04/30/2018.
    Initially I tried to get a list of dates in the interval for each row of 1 table, then collect all the lists by 1 personnel number and cross them with dates of the whole year. But in the end I did not think of how to get lists of continuous intervals from the list of dates. For example from the list:
    04/01/2018
    04/02/2018
    04/03/2018
    04/05/2018
    04/06/2018
    Get 2 intervals
    1) from 04/01/2018 to 04/03/2018
    2) from 04/05/2018 to 04/06/2018
    Perhaps you tell me how you can implement a similar function.

    • Gil Raviv Post authorReply

      Hi Den
      Thank you for sharing this challenge. Can you please send me sample data and the desired format of the output by email to [email protected]?

  2. Flavio Reply

    Thanks for the post. With some adjustments it is really helping me to pre-filter overlaping periods source files and maintain one single, most recent registry for each date!

  3. Flavio Reply

    Thanks for the post Gil! I’m working with multiple CSV with overlaping periods. In the import routing, before expanding the source files, I’m using your approach to keep a single, most recent registry for each date. It’s working fine!

    • Gil Raviv Post authorReply

      Thank you Flavio for sharing your scenario. I am glad you find this method useful.

Leave a Reply