Improve refresh time by batching Web API calls – Yahoo Finance example


In today’s blog post we will learn a neat Power Query trick that is extremely useful in improving refresh time on certain Web API calls (Like Yahoo Finance, Quandl and Twitter).

Whenever you use Power Query to import data from a Web API that accepts certain object IDs, if you apply a single object ID in each API call, you will eventually experience a slow refresh, and reach the API limits, especially if the number of object IDs is high.

Some API calls allow you to better utilize their resources, by working with batches of object IDs. For example, on Yahoo Finance you can get the current Ask and Bid for AAPL, GOOG and MSFT, by calling these three API calls (learn more here):

http://finance.yahoo.com/d/quotes.csv?f=snab&s=AAPL

http://finance.yahoo.com/d/quotes.csv?f=snab&s=GOOG

http://finance.yahoo.com/d/quotes.csv?f=snab&s=MSFT

But you can get the data faster by calling all quotes at once:

http://finance.yahoo.com/d/quotes.csv?f=snab&s=AAPL+GOOG+MSFT

Imagine that you have thousands of symbols you wish to monitor. In the past you probably applied a single API call for each object ID using a function query and the Custom Column in Power Query. Today we’ll show you how to create batches of object IDs to reduce the number of API calls, as shown in this diagram. We will walk through the Yahoo Finance example.

Merging multiple IDs to improve Web API Refresh time

Subscribe to DataChant to get access to the Excel solution file.

[rad_rapidology_inline optin_id=optin_5]

Download this Excel workbook to start the tutorial. The workbook contains thousands of NASDAQ symbols.

Select any cell in table Symbols and click From Table in the section Get & Transform of Data ribbon (Note: if you are not on Excel 2016, you will find From Table in Power Query ribbon after you install the add-in. You can also follow this tutorial with Power BI Desktop).

In Create Table select My table has headers and click OK.

screenshot_58

In the Query Editor window, select the column Symbol and click Add Index Column in Add Column tab.

screenshot_39

Duplicate the column Index, by clicking Duplicate Column in Add Column tab.

screenshot_40

Select column Index, and in Transform tab, select Standard, then click Modulo.

screenshot_41

In Modulo, set 100 as Value and click OK.

screenshot_13

Rename the second column to Mod.

This column now contains values from 0 to 99 in rows 1-100. Each 100 rows, the value in column Mod resets to 0 and is incremented by 1 on each row till the next 100 row.

Next step – Select the last column, and while you are still in Transform tab, select Standard and click Integer-Divide.

screenshot_42

In Integer-Divide set 100 as Value, and click OK.

screenshot_16

Rename the last column to Integer-Divide. This column now contains zero values for the first 100 rows and is incremented by one for each 100 rows. We will now use the Mod and Integer-Divide to pivot the table. Mod will represent the column, and Integer-Divide will represent the row.

screenshot_59

Select column Mod and click Pivot Column in Transform tab.

screenshot_43

In Pivot Column window select Symbol as Values Column and select Don’t Aggregate in Advanced options, Aggregate Value Function. Then click OK.

screenshot_44

If you are new to Power Query in Excel or Power BI, this may be a magical moment for you. We now have a new table with 100 symbols on each row. We can now remove the column Integer-Divide.

screenshot_45

Select column 0 and then press CTRL+A to select all columns. Then, right click on any column header and select Merge Columns.

screenshot_46

In Merge Columns window select –Custom– as Separator, set + as the separator, and then set Symbols as New column name. When you’re done, click OK.

screenshot_47

The last step will merge all the columns together with a “+” as a separator.

Scroll down to the last row to see an edge case we should now fix. The last row contains a trailing “++++++++++” string which was resulted by the fact that the last row had less than 100 symbols. We can easily fix it.screenshot_48

Select the column Symbols and right click on the header, then select Split Column and click By Delimiter…

screenshot_49

In the next window, select –Custom– and set ++ as the delimiter. Then, select At the left-most delimiter and click OK. This will split the column to two, with null values if there are no “++” in the text, and the trailing “+++++++” will be “pushed” to the new column.

screenshot_50

We can now remove the second column and rename the first one to Symbols.

screenshot_51

Now you can create a blank query and paste the following formula (Not sure how, read here how). Rename the query as FnGetYahooData.

(Symbols as text)=>
    Csv.Document(Web.Contents("http://finance.yahoo.com/d/quotes.csv?f=snab&s="& Symbols),[Delimiter=",",Encoding=1252])

In Add Column tab, click Invoke Custom Function.

screenshot_52

In the Invoke Custom Function window, select FnGetYahooData as Function query, then ensure the column Symbols is selected and click OK.

screenshot_53

Note: The same function can read individual symbols, but the refresh will probably fail after a long waiting.

Next, expand the column FnGetYahooData and click OK in the expand pane to expand all columns.

screenshot_55

You can now remove the column Symbols (We made sure we get the symbols on the result), and rename the columns to: Symbol, Name, Ask, Bid.screenshot_56

Hope you enjoyed this tutorial. Check out Part 2 here for an improved version that requires minor M tweaks.

2 comments

  1. Ricardo F. Mota Reply

    Hi, great work Gil Raviv. I tried to add also the tags into the function so we can promote headers and instead of change the column 1 2 … we can get the tags in the headers, but I haven’t the knowledge to do that:

    Is it possible

    • Gil Raviv Post authorReply

      Hi Ricardo, I am not sure I understand the exact need. Could you please elaborate?

Leave a Reply