Merge Dataframes in Pandas

This tutorial shows examples of how to merge dataframes in Pandas in Python. In particular, merge() can be used to merge two dataframes. The basic syntax is as follows.

pd.merge(df_1, df_2, left_index=True, right_index=True)

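Here, left_index=True and right_index=True tell merge() to join on the row index of each dataframe. Below are the two dataframes to be merged.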

import pandas as pd

# create a dataframe called car_location
car_location = pd.DataFrame({'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
                             'Location': ['CA', 'CA', 'NY', 'MA']},
                            index=list('abcd'))
print("dataframe 1: car_location : \n", car_location)

# create a dataframe called car_name
car_name = pd.DataFrame({'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
                        index=list('abdek'))
print("dataframe 2: car_name: \n", car_name)

The following is the output.

dataframe 1: car_location : 
     Brand Location
a   Tesla       CA
b  Toyota       CA
c   Tesla       NY
d    Ford       MA

dataframe 2: car_name: 
     Name
a   Jake
b  Jacob
d   John
e   Jess
k  James

Method 1: how='inner'

By default, merge() uses how='inner', which keeps only the index labels that appear in both dataframes. The output is therefore the same with or without how='inner'.

# without how='inner'
merged_df = pd.merge(car_location, car_name, left_index=True, right_index=True)
print(merged_df)

The following is the merged dataframe without how='inner'.

    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John

The following Python code passes how='inner' explicitly.

# with how='inner'
merged_df = pd.merge(car_location, car_name, how='inner', left_index=True, right_index=True)
print(merged_df)

The following is the output.

    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
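
The claim that the two calls give the same result can also be checked in code rather than by eye. The following is a minimal sketch that rebuilds the two dataframes from above and compares the results with DataFrame.equals(); the variable names default_merge, inner_merge and same_result are only illustrative.

```python
import pandas as pd

# Rebuild the two dataframes used throughout this tutorial
car_location = pd.DataFrame({'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
                             'Location': ['CA', 'CA', 'NY', 'MA']},
                            index=list('abcd'))
car_name = pd.DataFrame({'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
                        index=list('abdek'))

# Merge once without how= and once with how='inner'
default_merge = pd.merge(car_location, car_name,
                         left_index=True, right_index=True)
inner_merge = pd.merge(car_location, car_name, how='inner',
                       left_index=True, right_index=True)

# equals() compares index, columns and values of the two results
same_result = default_merge.equals(inner_merge)
print(same_result)
```

If the default is indeed how='inner', this prints True.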

Method 2: how='outer'

The following Python code shows an example of merge() with how='outer', which keeps every index label from both dataframes and fills the missing values with NaN.

# outer merge() in Pandas in Python
merged_df = pd.merge(car_location, car_name, how='outer', left_index=True, right_index=True)
print(merged_df)

The following is the output for the outer merge in Pandas in Python.

    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John
e     NaN      NaN   Jess
k     NaN      NaN  James
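
In an outer merge it is often useful to know which dataframe each row came from. One way to see this, sketched below, is the indicator parameter of merge(), which adds a column named _merge with the values both, left_only or right_only. The variable name outer_df is only illustrative.

```python
import pandas as pd

# Rebuild the two dataframes used throughout this tutorial
car_location = pd.DataFrame({'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
                             'Location': ['CA', 'CA', 'NY', 'MA']},
                            index=list('abcd'))
car_name = pd.DataFrame({'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
                        index=list('abdek'))

# indicator=True adds a '_merge' column describing the source of each row
outer_df = pd.merge(car_location, car_name, how='outer',
                    left_index=True, right_index=True, indicator=True)
print(outer_df)
```

Here, row c should be marked left_only because it exists only in car_location, and rows e and k should be marked right_only because they exist only in car_name.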

Method 3: how='left'

The following Python code shows an example of merge() with how='left', which keeps every row of the left dataframe (car_location) and fills Name with NaN where car_name has no matching index label.

# left merge() in Pandas in Python
merged_df = pd.merge(car_location, car_name, how='left', left_index=True, right_index=True)
print(merged_df)

The following is the output for the left merge in Pandas in Python.

    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John
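
After a left merge, the rows of the left dataframe that found no match can be picked out by looking for NaN in a column that came from the right dataframe. The following is a minimal sketch of that idea; it assumes the Name column has no missing values of its own in car_name, and the names left_df and unmatched are only illustrative.

```python
import pandas as pd

# Rebuild the two dataframes used throughout this tutorial
car_location = pd.DataFrame({'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
                             'Location': ['CA', 'CA', 'NY', 'MA']},
                            index=list('abcd'))
car_name = pd.DataFrame({'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
                        index=list('abdek'))

left_df = pd.merge(car_location, car_name, how='left',
                   left_index=True, right_index=True)

# Rows where 'Name' is NaN had no matching index label in car_name
unmatched = left_df[left_df['Name'].isna()]
print(unmatched)
```

With the data above, the only unmatched row is c, and the merged dataframe has the same number of rows as car_location.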

Method 4: how='right'

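The following Python code shows an example of merge() with how='right', which keeps every row of the right dataframe (car_name) and fills Brand and Location with NaN where car_location has no matching index label.

# right merge() in Pandas in Python
merged_df = pd.merge(car_location, car_name, how='right', left_index=True, right_index=True)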
print(merged_df)

The following is the output for the right merge in Pandas in Python.

    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
e     NaN      NaN   Jess
k     NaN      NaN  James
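
A right merge can also be read as a left merge with the two dataframes swapped. The sketch below checks this on the tutorial's data. The names right_df, swapped_left and same_rows are only illustrative, and the columns of the swapped result are reordered before comparing because swapping the dataframes puts Name first.

```python
import pandas as pd

# Rebuild the two dataframes used throughout this tutorial
car_location = pd.DataFrame({'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
                             'Location': ['CA', 'CA', 'NY', 'MA']},
                            index=list('abcd'))
car_name = pd.DataFrame({'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
                        index=list('abdek'))

# Right merge: keep every row of car_name
right_df = pd.merge(car_location, car_name, how='right',
                    left_index=True, right_index=True)

# Left merge with the dataframes swapped: also keeps every row of car_name
swapped_left = pd.merge(car_name, car_location, how='left',
                        left_index=True, right_index=True)

# Put the columns in the same order, then compare the two results
same_rows = right_df.equals(swapped_left[right_df.columns])
print(same_rows)
```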

Method 5: another way of writing the function

There are two equivalent ways of merging df_1 and df_2: the function pd.merge() and the dataframe method merge(). Both return the same result.

pd.merge(df_1, df_2)
df_1.merge(df_2)

The following Python code shows an example of writing the merge as a dataframe method.

# df_1.merge(df_2)
merged_df = car_location.merge(car_name, left_index=True, right_index=True)
print(merged_df)

The following is the output.

    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
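
The equivalence of the two forms can be checked with DataFrame.equals(). The Python operator == is not the right tool for this, because on dataframes it compares element by element and returns a dataframe of booleans rather than a single True or False. The following is a minimal sketch; the names func_result, method_result and is_equivalent are only illustrative.

```python
import pandas as pd

# Rebuild the two dataframes used throughout this tutorial
car_location = pd.DataFrame({'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
                             'Location': ['CA', 'CA', 'NY', 'MA']},
                            index=list('abcd'))
car_name = pd.DataFrame({'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
                        index=list('abdek'))

# Function form: pd.merge(df_1, df_2)
func_result = pd.merge(car_location, car_name,
                       left_index=True, right_index=True)

# Method form: df_1.merge(df_2)
method_result = car_location.merge(car_name,
                                   left_index=True, right_index=True)

# equals() returns a single boolean for the whole comparison
is_equivalent = func_result.equals(method_result)
print(is_equivalent)
```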

Further Reading