Follow Me
In today’s blog post we will learn a neat Power Query trick that is extremely useful in improving refresh time on certain Web API calls (Like Yahoo Finance, Quandl and Twitter).
Whenever you use Power Query to import data from a Web API that accepts certain object IDs, if you apply a single object ID in each API call, you will eventually experience a slow refresh, and reach the API limits, especially if the number of object IDs is high.
Some API calls allow you to better utilize their resources, by working with batches of object IDs. For example, on Yahoo Finance you can get the current Ask and Bid for AAPL, GOOG and MSFT, by calling these three API calls (learn more here):
http://finance.yahoo.com/d/quotes.csv?f=snab&s=AAPL
http://finance.yahoo.com/d/quotes.csv?f=snab&s=GOOG
http://finance.yahoo.com/d/quotes.csv?f=snab&s=MSFT
But you can get the data faster by calling all quotes at once:
http://finance.yahoo.com/d/quotes.csv?f=snab&s=AAPL+GOOG+MSFT
Imagine that you have thousands of symbols you wish to monitor. In the past you probably applied a single API call for each object ID using a function query and the Custom Column in Power Query. Today we’ll show you how to create batches of object IDs to reduce the number of API calls, as shown in this diagram. We will walk through the Yahoo Finance example.
Subscribe to DataChant to get access to the Excel solution file.
[rad_rapidology_inline optin_id=optin_5]
Download this Excel workbook to start the tutorial. The workbook contains thousands of NASDAQ symbols.
Select any cell in table Symbols and click From Table in the section Get & Transform of Data ribbon (Note: if you are not on Excel 2016, you will find From Table in Power Query ribbon after you install the add-in. You can also follow this tutorial with Power BI Desktop).
In Create Table select My table has headers and click OK.
In the Query Editor window, select the column Symbol and click Add Index Column in Add Column tab.
Duplicate the column Index, by clicking Duplicate Column in Add Column tab.
Select column Index, and in Transform tab, select Standard, then click Modulo.
In Modulo, set 100 as Value and click OK.
Rename the second column to Mod.
This column now contains values from 0 to 99 in rows 1-100. Each 100 rows, the value in column Mod resets to 0 and is incremented by 1 on each row till the next 100 row.
Next step – Select the last column, and while you are still in Transform tab, select Standard and click Integer-Divide.
In Integer-Divide set 100 as Value, and click OK.
Rename the last column to Integer-Divide. This column now contains zero values for the first 100 rows and is incremented by one for each 100 rows. We will now use the Mod and Integer-Divide to pivot the table. Mod will represent the column, and Integer-Divide will represent the row.
Select column Mod and click Pivot Column in Transform tab.
In Pivot Column window select Symbol as Values Column and select Don’t Aggregate in Advanced options, Aggregate Value Function. Then click OK.
If you are new to Power Query in Excel or Power BI, this may be a magical moment for you. We now have a new table with 100 symbols on each row. We can now remove the column Integer-Divide.
Select column 0 and then press CTRL+A to select all columns. Then, right click on any column header and select Merge Columns.
In Merge Columns window select –Custom– as Separator, set + as the separator, and then set Symbols as New column name. When you’re done, click OK.
The last step will merge all the columns together with a “+” as a separator.
Scroll down to the last row to see an edge case we should now fix. The last row contains a trailing “++++++++++” string which was resulted by the fact that the last row had less than 100 symbols. We can easily fix it.
Select the column Symbols and right click on the header, then select Split Column and click By Delimiter…
In the next window, select –Custom– and set ++ as the delimiter. Then, select At the left-most delimiter and click OK. This will split the column to two, with null values if there are no “++” in the text, and the trailing “+++++++” will be “pushed” to the new column.
We can now remove the second column and rename the first one to Symbols.
Now you can create a blank query and paste the following formula (Not sure how, read here how). Rename the query as FnGetYahooData.
(Symbols as text)=> Csv.Document(Web.Contents("http://finance.yahoo.com/d/quotes.csv?f=snab&s="& Symbols),[Delimiter=",",Encoding=1252])
In Add Column tab, click Invoke Custom Function.
In the Invoke Custom Function window, select FnGetYahooData as Function query, then ensure the column Symbols is selected and click OK.
Note: The same function can read individual symbols, but the refresh will probably fail after a long waiting.
Next, expand the column FnGetYahooData and click OK in the expand pane to expand all columns.
You can now remove the column Symbols (We made sure we get the symbols on the result), and rename the columns to: Symbol, Name, Ask, Bid.
Hope you enjoyed this tutorial. Check out Part 2 here for an improved version that requires minor M tweaks.
Hi, great work Gil Raviv. I tried to add also the tags into the function so we can promote headers and instead of change the column 1 2 … we can get the tags in the headers, but I haven’t the knowledge to do that:
Is it possible
Hi Ricardo, I am not sure I understand the exact need. Could you please elaborate?