How to Read Excel File from URL into a Pandas DataFrame


How can we read an XLS file from a URL into a Pandas DataFrame?

Example scenario

Let’s see a real-life example of how we might come across a XLS file to download.

Suppose we want to grab the Chicago Home Price Index data from Fred Economic Data.

There is an option to DOWNLOAD the Excel (data) on that page, which will download the Excel data locally.

If we right click Excel (data) and select Copy link address, we’ll find the URL that will directly download the Excel data onto our machine.

This URL is quite long, but it can be reduced down to the following URL.

https://fred.stlouisfed.org/graph/fredgraph.xls?id=CHXRSA

Read XLS files using requests

We can use requests to read an XLS file from a URL.

import requests
import pandas as pd
url = 'https://fred.stlouisfed.org/graph/fredgraph.xls?id=CHXRSA'
r = requests.get(url)
open('temp.xls', 'wb').write(r.content)
df = pd.read_excel('temp.xls')

Read up on the requests library in Python.

We can read specific sheets in the Excel file using sheet_name.

df = pd.read_excel('temp.xls', sheet_name="Sheet Name")

We can also skip the first n rows or last n rows.

df = pd.read_excel('temp.xls', skiprows=n, skipfooter=n)

Read more on Panda’s read_excel() function. There are lots of parameters we can change.