How to Fix "ValueError" While Merging DataFrames in Pandas


Suppose we’re trying to merge two DataFrames on a shared column.

Example Scenario

We have DataFrame df1:

   year  col1
0  2010     1
1  2011     2
2  2012     3
3  2013     4
4  2014     5

And DataFrame df2:

   year  col2 
0  2010     6  
1  2011     7  
2  2012     8  
3  2013     9  
4  2014    10  

We’ll try an inner merge on the year column.

merged_df = df1.merge(df2, on=['year'], how='inner')

We might end up with an error message like this.

ValueError: You are trying to merge on object and int64 columns. 
If you wish to proceed you should use pd.concat
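
To reproduce the error, here is a minimal sketch. It assumes the mismatch comes from df1 storing year as strings while df2 stores it as integers; the printed tables alone do not reveal this, so the construction below is an illustration rather than the only way to hit the error. The exact wording of the message can vary between pandas versions.

```python
import pandas as pd

# Assumption: df1 holds year as strings, df2 holds year as integers.
df1 = pd.DataFrame({
    "year": ["2010", "2011", "2012", "2013", "2014"],
    "col1": [1, 2, 3, 4, 5],
})
df2 = pd.DataFrame({
    "year": [2010, 2011, 2012, 2013, 2014],
    "col2": [6, 7, 8, 9, 10],
})

# Catch the error so we can inspect the message instead of crashing.
try:
    merged_df = df1.merge(df2, on=["year"], how="inner")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```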

Change the column type

The error message is telling us that the column we are merging on has a different dtype in each DataFrame: object (typically strings) on one side and int64 on the other.

The first step is to look at the columns listed in on=[] in our call to merge(), and check their dtypes in both DataFrames.

print(df1.dtypes)
print(df2.dtypes)
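
When the DataFrames have many columns, it can be easier to compare only the merge keys. The sketch below does that, using small stand-in DataFrames and a hypothetical merge_keys list that mirrors the on=[] argument.

```python
import pandas as pd

# Stand-in data: year is a string in df1 and an integer in df2.
df1 = pd.DataFrame({"year": ["2010", "2011"], "col1": [1, 2]})
df2 = pd.DataFrame({"year": [2010, 2011], "col2": [6, 7]})

# merge_keys is a hypothetical name; it should match the on=[] list.
merge_keys = ["year"]

# Collect every key whose dtype differs between the two DataFrames.
mismatched = [key for key in merge_keys if df1[key].dtype != df2[key].dtype]

for key in mismatched:
    print(key, df1[key].dtype, df2[key].dtype)
```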

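In our scenario, year is probably stored as strings (dtype object) in df1 and as integers (int64) in df2, since the message lists the left DataFrame's dtype first.

We can cast that column to a common dtype in both DataFrames, and then merge. Casting with astype(int) assumes every value is a valid integer; it raises an error on missing values or non-numeric strings.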

df1['year'] = df1['year'].astype(int)
df2['year'] = df2['year'].astype(int)
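
Putting the cast and the merge together, a minimal end-to-end sketch might look like this. The DataFrames are rebuilt here with year as strings in df1, which is an assumption about where the mismatch comes from.

```python
import pandas as pd

# Assumption: df1 holds year as strings, df2 holds year as integers.
df1 = pd.DataFrame({
    "year": ["2010", "2011", "2012", "2013", "2014"],
    "col1": [1, 2, 3, 4, 5],
})
df2 = pd.DataFrame({
    "year": [2010, 2011, 2012, 2013, 2014],
    "col2": [6, 7, 8, 9, 10],
})

# Cast the key to an integer dtype on both sides.
df1["year"] = df1["year"].astype(int)
df2["year"] = df2["year"].astype(int)

# With matching key dtypes, the merge no longer raises.
merged_df = df1.merge(df2, on=["year"], how="inner")
print(merged_df)
```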

Note that we need to repeat this check for every column listed in on=[], since a mismatch in any one of them can trigger the error.
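
With several merge keys, the cast can be applied in a loop. In the sketch below, the month column and the merge_keys name are hypothetical, added only to illustrate a second key; it also assumes every key can be cast to int.

```python
import pandas as pd

# Hypothetical data: each DataFrame has one key stored as strings.
df1 = pd.DataFrame({"year": ["2010", "2010"], "month": [1, 2], "col1": [1, 2]})
df2 = pd.DataFrame({"year": [2010, 2010], "month": ["1", "2"], "col2": [6, 7]})

merge_keys = ["year", "month"]

# Cast every merge key to the same dtype on both sides.
for key in merge_keys:
    df1[key] = df1[key].astype(int)
    df2[key] = df2[key].astype(int)

merged_df = df1.merge(df2, on=merge_keys, how="inner")
print(merged_df)
```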