Sentiment Analysis in Power BI – Part 2

Beyond 1000 Messages

Today’s blog post will take you one step further with Sentiment Analysis in Power BI and will show you how easy it is to extend the solution beyond 1000 messages. Read and follow the steps of Part 1 before we proceed, or subscribe below to DataChant to receive the Power BI Template file that can be used as a starting point.

My recent article on Power BI Community blog (referred here as Part 1) demonstrated how easy it is to apply Microsoft Cognitive Services on messages, and perform Sentiment Analysis inside Power BI Desktop, and without any help from Data Scientists, or big data developers.

As Microsoft Cognitive service allows 1000 messages per API call, I decided to keep the last article short and simple to follow, and narrowed down our analysis to a single API call with the first 1000 messages.

So today we will learn how to handle thousands of messages and feed them to the Sentiment Analysis API in groups of 1000 messages. The great thing here is that we still do everything for free, as we are allowed to make 5000 API calls (5 million messages) per month.

Subscribe to DataChant and take the shortcut

Subscribe To DataChant

Join our mailing list to receive exclusive content that was used in the preparation of this article.

Thank you for subscribing to DataChant. An email will be sent to you shortly with more details.

If you didn’t follow all the steps in Part 1, don’t worry, you can take a shortcut and subscribe to my blog in the form above, and you’ll get the Power BI Template file of Part 1. When you get the template and open it, you will be asked to provide the Sentiment Analysis API key (You can learn in Part 1 how to get the API Key).

If you already followed Part 1, you should consider changing the sample data. Last time, we used an Excel workbook that had 1000 messages. This time you can use a larger data set. You can download it here, and change your query sources to this file.

Ready to Start?

Click Edit Queries in Power Bi Desktop (after you followed the steps in Part 1, or received the Power Bi Template, and entered the API key).

Click on the SentimentsResults query.

Screenshot_26

Delete all the steps starting from Kept First Rows. You can start by deleting the last step and proceed backwards till you delete Kept First Rows.

Screenshot_27

Click Add Index Column, in Add Column tab, and select From 0.

Screenshot_30

Divide & Conquer

Following the last step, we now have a running index. If we divide it by 1000 and get the integer part of the result, we will be able to get a running index of bulks of 1000 rows. First 1000 rows will receive a zero, second 1000 rows will receive a one, etc.

So let’s perform Integer-Divide on the index.

Select column Index and then in Transform tab, click Standard then select Integer-Divide.Screenshot_32

In the Integer-Divide dialog, set Value to 1000 and click OK.

Screenshot_33

You will now notice that the running index is transformed as expected. We have a zero index in the first 1000 rows,  a one in the second 1000 rows, a two in the third 1000 rows.Screenshot_34

So we divided. Now let’s move to the conquering part …

Select the column Index, and click Group By in Transform tab.

Screenshot_35

In the Group By dialog, make sure Index is selected in the first drop down menu. Type Data in New column name, select All Rows in Operation, and click OK.Screenshot_36

The last step, groups the original rows into smaller tables according to Index. You will now see in the preview pane three rows with the original Tables as cells of the column Data.

Make sure that the formula bar is visible, and notice that Power BI Desktop refers addresses each grouped table by using the function each _, which returns the result in the type of a table.

Screenshot_37

As you recall, in Part 1, we created the query function GetSentimentResults. As a reminder, here is the code that we used.

As you may notice, the function returns a record object, and not a table. Let’s use the function on each table that was grouped earlier.

We will change the formula:

To this new formula:

And the changes are highlighted in this screenshot:

Screenshot_41

Now, lets expand the column Data by clicking on the highlighted expand icon. Ensure all columns are selected, mark off Use original column name as prefix, and click OK.

Screenshot_42

The preview pane will show you two new columns. Assuming we don’t get errors from the Sentiment Analysis service, you can ignore the next two steps. I am showing these steps to stress out how useful Power Query is to work and troubleshoot your data, even when the steps are only used temporarily.

Expand the column errors.

Screenshot_43

Make sure that all the errors are null. Here you may find error messages that will help you to refine your query.

Screenshot_44

After we confirmed that there are no errors in column errors, remove that column.

Note: In the future you can go again to this step and use the preview pane to find new errors. We could also duplicate the query at this stage to load errors into a separate table (But that will double the API calls, and you may not want to do it). There are also more advanced tricks to append the errors to the table, and avoid the multiple calls, but this is beyond our scope today.

Second Note: The Query Editor is quite “greedy” in the way it works to cache a preview for each transformation step. Each time you select a step or create a new one, Power Query may trigger new API calls to the service. This is true to all types of external data sources you use. But in our case, you may reach your free 5000 messages/month API calls sooner than expected.

Let’s continue. Few more steps and we are done.

After removing the column errors, and column Index, we can expand column documents.

Screenshot_45

Expand it again.

Screenshot_46

One last expand…

In the expand pane, make sure that all columns are selected and mark off Use original column name as prefix. Then click OK.

Screenshot_47

Now let’s change the type of column score to Decimal Number.

Screenshot_48

Finally, rename the columns to Sentiment Score and Post ID for better context and we can now click Close & Apply in Query Editor Home.

Screenshot_2

That’s it for today. Stay tuned for our next post we will drill down on the main visualizations we can create for the Sentiment Scores; We will create a cool Chiclet Slicer and measures for Positive, Neutral and Negative Posts, and will also apply a disconnected table to help us reaching our goal.

If you do choose to be a DataChant subscriber, you will get access to two Power BI Templates: The first template will be used as the starting point for today’s post. The second template will include today’s steps. You will also get all the cool visuals and DAX tricks that I will share next time, included in these templates.

Screenshot_3
Screenshot of the template you will receive as a subscriber

Subscribe To DataChant

Join our mailing list to receive exclusive content that was used in the preparation of this article.

Thank you for subscribing to DataChant. An email will be sent to you shortly with more details.

37 comments

  1. Anonymous Reply

    Hey Gil,

    After I group the table after I change the formula some of my records in the data column have errors. When I expand the data column to get the error column I get an error that says

    “An error occurred in the ‘’ query. DataSource.Error: Web.Contents failed to get contents from ‘https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment’ (400): Bad Request
    Details:
    DataSourceKind=Web
    DataSourcePath=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment
    Url=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment”.

    My data set isn’t huge it is just under 5000 rows. Is this a problem on my end? How do I go about fixing this?

    Thank you,

    Jordan

    Also I tried doing the sentiment analysis with a parameter and a blank query like you did for the keyword extraction and kept bombing out when I would hit close and apply. Is it possible to do it for the sentiment?

    • Gil Raviv Post authorReply

      Hi Jordan,

      To prevent the first error, please ensure that column “id” has unique values (trim, lowercase and remove duplicates to ensure it). You should also remove empty values in column “text”. Hope it will help.
      Regarding your second question, please elaborate. The keyword extraction and Sentiment Analysis are quite similar.

  2. Pingback: Updates for Power BI Desktop and service and more... - Roundup #56 | Guy in a Cube

  3. Julian Reply

    What’s wrong with the following message when I open the pbix file? How can I fix it?

    CALENDARAUTO function can not find a valid date in base columns of DateTime type in the model.

    • Gil Raviv Post authorReply

      Make sure that you have a date column with a date type in your main fact table.

      • Julian

        1. I opened the file “Sentiment Analysis Sample (Handling thousands of messages).pbix”, all the charts in the report view were pending fix due to CalendarTable malfunction – I guessed.
        2. I went back to data view then found except the column heads none of the contents of all tables were visible.
        3. I deleted all the tables then run Get Data from “Facebook Reactions Data Sample.xlsx” but that excel file only have 998 posts – unable to do Integer-Divide by 1000.
        4. I tried to use a larger data set as you guided but the file was not downloadable even I clicked the link several times.

        Were there something I did wrong? Please advise. Thanks.

  4. Julian Reply

    When clicking Refresh I was asked to Access web content with credentials check.

  5. Julian Reply

    Yes, I did – it’s the first option but I got the error message saying “Access to the resource is forbidden”.

    • Gil Raviv Post authorReply

      I am sorry for your experience. I have tried to reproduce this error, but couldn’t. Can you clear the permissions of the relevant data sources, and try to refresh again?

      • Julian

        Clearing the permissions didn’t work either. However, in order not to bother you too much, I think I’d better let it go for now. Anyway, thank you very much for the great support.

      • Gil Raviv Post author

        I am sorry it doesn’t work for you. The last recommendation I can think of is to retry again from scratch, and try refreshing from the Query Editor instead of the main dialog. I had similar experiences in the past when working with templates. Usually after few attempts it was resolved.

  6. Marco Bajaña Reply

    I get this error when I Generate the sentiment analysis in the moment to invoke the function. “Formula. Firewall: Query “All Candidates(2) (Step ‘Invoke Custom Function’) references other queries or steps, so it may not directly access a data sorce. Please rebuild this data combination”.

  7. Steven Reply

    HI Gil

    I have been through the instructions and even watching your YouTube video on “Facebook & Sentiment Analysis By Gil Raviv,” I can’t get this to work past the step after grouping into 1000 and then applying the invoke custom function I get errors:

    An error occurred in the ‘GetSentimentResults’ query. DataSource.Error: Web.Contents failed to get contents from ‘https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment’ (400): Bad Request
    Details:
    DataSourceKind=Web
    DataSourcePath=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment
    Url=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment

    I have cleaned, trimmed lower case every possible option and it still occurs!

    It seems it is not only me having an issue with this. Its there a problem with Power BI? If not how is this fixed?

    Cheer

    Steven

    • Gil Raviv Post authorReply

      Did you remove empty values and remove duplicate IDs?

  8. Steven Reply

    WOW, I must have done something because it worked! It must have been empty values, as I removed duplicate. Thanks for the tip !!

  9. Letitia Reply

    Thanks very much for this useful post.

    When I change the formula after grouping rows to add in GetSentimentResults I get the following error message:

    “Formula.Firewall: Query ‘Sentiment results’ (step ‘Grouped Rows’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.”

    Any ideas what went wrong?

    • Gil Raviv Post authorReply

      Thank you Letitia,
      The solution is to set Privacy in File -> Options to ignore privacy levels. Then click Refresh.

  10. Letitia Reply

    Hi Gil, thanks for this post.

    When I try to change the formula after grouping rows into a table based on the index I get the following error:

    “Formula.Firewall: Query ‘Sentiment results’ (step ‘Grouped Rows’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.”

    Do you have any advice?

    Thanks,

  11. Letitia Reply

    Hi again, I removed a set of brackets { } which solved the formula firewall error (none of the privacy settings worked for me), but now I’m unable to authenticate credentials (set to anonymous)

    • Gil Raviv Post authorReply

      Thank you Letitia, I suggest you return the brackets. Not sure which brackets you refer to, but these are probably needed.
      The solution to your previous issue is to set Privacy in File -> Options to ignore privacy levels. Then click Refresh.

  12. Letitia Reply

    Hi I get the following error after entering the formula:

    An error occurred in the ‘’ query. DataSource.Error: Web.Contents failed to get contents from ‘https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment’ (413): Request Entity Too Large
    Details:
    DataSourceKind=Web
    DataSourcePath=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment
    Url=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment

    Any tips for how to fix? Thanks

    • Gil Raviv Post authorReply

      This error may occur when the dataset is beyond 1000 text messages. If you follow the steps including the grouping, you should not get his error.

    • Gil Raviv Post authorReply

      Hi Tim,
      I was able to use your report and get the sentiment results for the first 3500 comments. Did you use the APIkey in GentSentimentResults? I suspect that your key is invalid, or that you have used all your quota.

  13. Tim Reply

    It finally worked after changing credentials to anonymous. I think its critical to do that. Thanks.

  14. Kok Wei Khong Reply

    Works like a charm. What I normally do is to slice the integer to 500 to see problematic rows. I also make it a point that I do a keep top values so no ‘null’ happens at the end of the dataset.

  15. Juan Grijalba Reply

    Hello, this post is amazing, thank you. One question on a production report I created there are new comments that need to be sent to Azure for processing. My question is does every time I hit refresh on the report and power query goes through the steps, am I sending the same old comments plus new ones to be analyzed and so am I reusing API calls on the same messages every time the report refreshes?
    Or does Azure know that messages with the same ID column have been processed before and therefore they don’t get analyzed again? Hope this question makes sense..

    Thank you for your posts

    • Gil Raviv Post authorReply

      Thank you Juan. I am glad that you find this article helpful. You are correct. Every refresh will apply the API on all the dataset. Incremental refresh may be supported by Power BI in the future (As part of Power BI Premium). So hopefully, we will be able to implement your scenario.

  16. Javier Recasens Reply

    Thank you Gil.

    To make this a viable solution for large datasets, we not only need incremental data loads but also queries should run only once and not twice when refreshing a table.

    Since there is no isolation of queries, API calls are repeated when we have dependencies. For example, if I first use language detection on the text and reference those results to do sentiment analysis, the calls will happen again.

  17. Andrew Clark Reply

    Is there a maximum character length for each message which is sent for analysis?

    I’m getting this error:

    An error occurred in the ‘’ query. DataSource.Error: Web.Contents failed to get contents from ‘https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment’ (413): Request Entity Too Large
    Details:
    DataSourceKind=Web
    DataSourcePath=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment
    Url=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment

    BUT, my groupings are fine (no group has more than 1000 messages in it). I’m analysing product reviews so some messages are quite long.

    • Gil Raviv Post authorReply

      I think that the maximal length is 10240 characters, but last time I checked you should get a valid response to the 1000 messages, and only the specific long messages will be omitted from the results (and included in the errors field).

  18. Andrew Clark Reply

    So after running some tests on my data set, it appears that the query fails on my dataset when more than 600 characters are sent in a single message. This seems quite limiting for key phase extraction and sentiment analysis. The documentation states that messages can be up to 10kb in size which seems to be ~10,000 characters. Any ideas why there is such a gulf in terms of what the documentation states and what’s working in Power BI?

  19. Andrew Clark Reply

    I’ve found a solution – it must have been throwing the 413 error based on the TOTAL size of the 1000 records which were being sent for analysis. I changed the grouping to 100 instead of 1000 and it worked perfectly. Just posting this up in case it helps others with the same issue. In my example each comment was around 2-3kb so the 1000 record stack was too large (over the 1mb limit).

Leave a Reply