How to Read Excel File from URL into a Pandas DataFrame
How can we read an XLS file from a URL into a Pandas DataFrame?
Example scenario
Let’s see a real-life example of how we might come across a XLS file to download.
Suppose we want to grab the Chicago Home Price Index data from Fred Economic Data.
There is an option to DOWNLOAD
the Excel (data)
on that page, which will download the Excel data locally.
If we right click Excel (data)
and select Copy link address
, we’ll find the URL that will directly download the Excel data onto our machine.
This URL is quite long, but it can be reduced down to the following URL.
https://fred.stlouisfed.org/graph/fredgraph.xls?id=CHXRSA
Read XLS files using requests
We can use requests
to read an XLS file from a URL.
import requests
import pandas as pd
url = 'https://fred.stlouisfed.org/graph/fredgraph.xls?id=CHXRSA'
r = requests.get(url)
open('temp.xls', 'wb').write(r.content)
df = pd.read_excel('temp.xls')
Read up on the
requests
library in Python.
We can read specific sheets in the Excel file using sheet_name
.
df = pd.read_excel('temp.xls', sheet_name="Sheet Name")
We can also skip the first n
rows or last n
rows.
df = pd.read_excel('temp.xls', skiprows=n, skipfooter=n)
Read more on Panda’s
read_excel()
function. There are lots of parameters we can change.