Merge Dataframes in Pandas
This tutorial shows how to merge dataframes in Pandas (Python). In particular, merge() can be used to merge two dataframes. The basic syntax is as follows.
pd.merge(df_1, df_2, left_index=True, right_index=True)
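Setting left_index=True and right_index=True tells merge() to match rows on the index of each dataframe. Below are the two dataframes to be merged.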
import pandas as pd
# create a dataframe called car_location
car_location = pd.DataFrame(
    {'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
     'Location': ['CA', 'CA', 'NY', 'MA']},
    index=list('abcd'))
print("dataframe 1: car_location : \n", car_location)
# create a dataframe called car_name
car_name = pd.DataFrame(
    {'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
    index=list('abdek'))
print("dataframe 2: car_name: \n", car_name)
dataframe 1: car_location :
    Brand Location
a   Tesla       CA
b  Toyota       CA
c   Tesla       NY
d    Ford       MA
dataframe 2: car_name:
    Name
a   Jake
b  Jacob
d   John
e   Jess
k  James
Method 1: how='inner'
By default, how='inner' in merge(), which keeps only the index labels that appear in both dataframes. Thus, the output is the same with or without how='inner'.
# without how='inner'
merged_df=pd.merge(car_location,car_name,left_index=True,right_index=True)
print(merged_df)
The following is the merged dataframe without how='inner'.
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
The following Python code uses the parameter how='inner'.
# with how='inner'
merged_df=pd.merge(car_location,car_name,how='inner',left_index=True,right_index=True)
print(merged_df)
The following is the output.
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
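Rather than comparing the two printed outputs by eye, the claim that how='inner' is the default can be checked in code. The following is a minimal sketch using DataFrame.equals(); the variable names default_merge, inner_merge, and same_result are illustrative and not part of the original example.

```python
import pandas as pd

# Rebuild the two dataframes from this tutorial so the snippet runs on its own.
car_location = pd.DataFrame(
    {'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
     'Location': ['CA', 'CA', 'NY', 'MA']},
    index=list('abcd'))
car_name = pd.DataFrame(
    {'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
    index=list('abdek'))

# Merge once without `how` and once with how='inner'.
default_merge = pd.merge(car_location, car_name,
                         left_index=True, right_index=True)
inner_merge = pd.merge(car_location, car_name, how='inner',
                       left_index=True, right_index=True)

# equals() compares index, columns, and values of the two results.
same_result = default_merge.equals(inner_merge)
print(same_result)
```

If the two calls behave the same way, same_result should be True and both results contain only the labels shared by the two dataframes.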
Method 2: how='outer'
The following Python code shows an example of the merge() function with the parameter how='outer'. An outer merge keeps the index labels from both dataframes and fills the missing values with NaN.
# outer merge() in Pandas in Python
merged_df=pd.merge(car_location,car_name,how='outer',left_index=True,right_index=True)
print(merged_df)
The following is the output of the outer merge.
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John
e     NaN      NaN   Jess
k     NaN      NaN  James
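In an outer merge it can be hard to tell from the NaN values alone which dataframe each row came from. As a sketch, merge() accepts an indicator=True argument that adds a _merge column with the values both, left_only, or right_only. The name source_by_label below is an illustrative helper variable, not part of the original example.

```python
import pandas as pd

# Rebuild the two dataframes from this tutorial so the snippet runs on its own.
car_location = pd.DataFrame(
    {'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
     'Location': ['CA', 'CA', 'NY', 'MA']},
    index=list('abcd'))
car_name = pd.DataFrame(
    {'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
    index=list('abdek'))

# indicator=True adds a '_merge' column recording the origin of each row.
merged_df = pd.merge(car_location, car_name, how='outer',
                     left_index=True, right_index=True, indicator=True)
print(merged_df)

# Map each index label to the side(s) it was found in.
source_by_label = merged_df['_merge'].astype(str).to_dict()
print(source_by_label)
```

Here the label c is marked left_only, e and k are marked right_only, and the remaining labels are marked both.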
Method 3: how='left'
The following Python code shows an example of the merge() function with the parameter how='left'. A left merge keeps every index label from the left dataframe (car_location).
# left merge() in Pandas in Python
merged_df=pd.merge(car_location,car_name,how='left',left_index=True,right_index=True)
print(merged_df)
The following is the output of the left merge.
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John
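A quick way to confirm what the left merge did is to check that the result has exactly the index of car_location, and then pick out the rows that found no match in car_name. The following is a small sketch; left_merged, kept_all_left_rows, and unmatched are illustrative names.

```python
import pandas as pd

# Rebuild the two dataframes from this tutorial so the snippet runs on its own.
car_location = pd.DataFrame(
    {'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
     'Location': ['CA', 'CA', 'NY', 'MA']},
    index=list('abcd'))
car_name = pd.DataFrame(
    {'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
    index=list('abdek'))

left_merged = pd.merge(car_location, car_name, how='left',
                       left_index=True, right_index=True)

# A left merge should keep the labels of the left dataframe, in order.
kept_all_left_rows = list(left_merged.index) == list(car_location.index)
print(kept_all_left_rows)

# Rows whose Name is missing had no matching label in car_name.
unmatched = left_merged[left_merged['Name'].isna()]
print(unmatched)
```

With the data above, only the label c has no entry in car_name, so it is the single row in unmatched.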
Method 4: how='right'
The following Python code shows an example of the merge() function with the parameter how='right'. A right merge keeps every index label from the right dataframe (car_name).
# right merge() in Pandas in Python
merged_df=pd.merge(car_location,car_name,how='right',left_index=True,right_index=True)
print(merged_df)
The following is the output of the right merge.
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
e     NaN      NaN   Jess
k     NaN      NaN  James
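One way to think about a right merge is as a left merge with the two dataframes swapped. The sketch below checks this for the example data; the only expected difference is the column order, so the columns are reordered before comparing. The names right_merged, swapped_left, and same_rows are illustrative.

```python
import pandas as pd

# Rebuild the two dataframes from this tutorial so the snippet runs on its own.
car_location = pd.DataFrame(
    {'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
     'Location': ['CA', 'CA', 'NY', 'MA']},
    index=list('abcd'))
car_name = pd.DataFrame(
    {'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
    index=list('abdek'))

# Right merge: keep every label of car_name.
right_merged = pd.merge(car_location, car_name, how='right',
                        left_index=True, right_index=True)

# Left merge with the arguments swapped: car_name is now the left dataframe.
swapped_left = pd.merge(car_name, car_location, how='left',
                        left_index=True, right_index=True)

# Put the columns in the same order before comparing.
swapped_left = swapped_left[right_merged.columns]
same_rows = right_merged.equals(swapped_left)
print(same_rows)
```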
Method 5: another way of writing the function
There are two ways of passing df_1 and df_2 to merge(): as a function on the pandas module or as a method on the first dataframe. They are equivalent.
pd.merge(df_1, df_2) is equivalent to df_1.merge(df_2)
The following Python code shows an example of writing the merge() function in the second way.
# df_1.merge(df_2)
merged_df=car_location.merge(car_name,left_index=True,right_index=True)
print(merged_df)
The following is the output.
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
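The equivalence of the two spellings can also be checked directly. Note that writing == between two dataframes compares them element by element, so this sketch uses DataFrame.equals() to get a single True or False. The names from_function, from_method, and equivalent are illustrative.

```python
import pandas as pd

# Rebuild the two dataframes from this tutorial so the snippet runs on its own.
car_location = pd.DataFrame(
    {'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
     'Location': ['CA', 'CA', 'NY', 'MA']},
    index=list('abcd'))
car_name = pd.DataFrame(
    {'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
    index=list('abdek'))

# Function form: both dataframes are passed as arguments.
from_function = pd.merge(car_location, car_name,
                         left_index=True, right_index=True)

# Method form: the left dataframe calls merge() on the right one.
from_method = car_location.merge(car_name,
                                 left_index=True, right_index=True)

equivalent = from_function.equals(from_method)
print(equivalent)
```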