How to Export a DataFrame to CSV with Pandas in Python
How can we store a DataFrame object into a .csv
file in Pandas?
We can achieve this with the to_csv()
function, which provides plenty of parameters to suit our use cases.
df.to_csv("filename.csv")
Suppose we have a DataFrame object df
that looks like this.
Dog Age
0 corgi 7
1 shih tzu 5
This to_csv()
function will produce a CSV file with the following:
,Dog,Age
0,corgi,7
1,shih tzu,5
Write without index
By default, the first column of our CSV will contain the index of each row in df
.
If we don’t want to include the index in our CSV, we can set the index
parameter to be False
.
df.to_csv("filename.csv", index=False)
Our CSV file would then look something like this.
Dog,Age
corgi,7
shih tzu,5
All subsequent examples will assume
index=False
.
Use a different separator
The default separator is a comma ,
.
We can change this using the sep
parameter. For instance, we can delimit by tab or semicolon.
df.to_csv("filename.csv", sep='\t') # Tab
df.to_csv("filename.csv", sep=';') # Semicolon
Here’s what the semicolon delimiter would look like.
Dog;Age
corgi;7
shih tzu;5
Write without header
If we don’t want to include the header in our CSV file, we can set header
to False
.
df.to_csv("filename.csv", header=False)
Let’s take out that header.
corgi,7
shih tzu,5
Write specific columns
If we want to only write a subset of columns to our CSV, we can specify the columns as a list of strings in the columns
parameter.
df.to_csv("filename.csv", columns=['Dog'])
Only dogs, no age.
Dog
corgi
shih tzu
Change file format encoding
To be safe, we can set the encoding
parameter so other applications know how to read our CSV file.
df.to_csv("filename.csv", encoding='utf-8')
Compress a CSV
If we’re writing hundreds of thousands of rows into a CSV file, it might be best to compress the CSV.
The output zip
or gzip
file will be smaller, but the write and read times will involve compressing and decompressing, making the process take a bit longer.
df.to_csv("filename.gzip", compression='gzip')
Replace NaN
with string
We can replace all instances of NaN
in df
with a string like "N/A"
using the na_rep
parameter.
df.to_csv("filename.csv", na_rep='N/A')
Format datetime
columns
We can specify a format for all datetime
columns using the date_format
column.
df.to_csv("filename.csv", date_format='%m/%d/%Y')