Pandas: Join Two Dataframes

We can use join() to combine dataframes in pandas. By default, join() matches rows on their index labels. The basic syntax is as follows, where df_1 and df_2 are the two dataframes to be joined.

df_1.join(df_2)

The examples below show how to join two dataframes using join(). The two dataframes are created as follows.

import pandas as pd

# dataframe 1
car_data = pd.DataFrame(
    {'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
     'Location': ['CA', 'CA', 'NY', 'MA']},
    index=list('abcd'))
print("dataframe 1: car_data: \n", car_data)

# dataframe 2
car_name = pd.DataFrame(
    {'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
    index=list('abdek'))
print("dataframe 2: car_name: \n", car_name)
dataframe 1: car_data: 
     Brand Location
a   Tesla       CA
b  Toyota       CA
c   Tesla       NY
d    Ford       MA

dataframe 2: car_name: 
     Name
a   Jake
b  Jacob
d   John
e   Jess
k  James

Example 1: how='left' in join()

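The how parameter tells join() which index labels to keep. Its default value is 'left', which keeps every index label of the calling dataframe (df_1) and fills in NaN where df_2 has no matching label. Calling join() with or without how='left' therefore gives exactly the same result.

df_1.join(df_2) is equivalent to df_1.join(df_2, how='left')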

# without how='left'
joined_dataframe = car_data.join(car_name)
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John
# with how='left'
joined_dataframe = car_data.join(car_name, how='left')
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John

As we can see, the results with and without how='left' are identical. Row c has NaN in the Name column because car_name has no row with index label c.
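
The equivalence of the two calls can also be checked in code rather than by eye. The following is a minimal sketch that rebuilds the two dataframes from above; the variable names default_join, left_join, same_result, and elementwise are chosen here for illustration only. It uses DataFrame.equals(), which treats NaN values in the same position as equal, whereas an element-wise == comparison reports False wherever a NaN is involved.

```python
import pandas as pd

# Same two dataframes as in the examples above
car_data = pd.DataFrame(
    {'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
     'Location': ['CA', 'CA', 'NY', 'MA']},
    index=list('abcd'))
car_name = pd.DataFrame(
    {'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
    index=list('abdek'))

default_join = car_data.join(car_name)            # how defaults to 'left'
left_join = car_data.join(car_name, how='left')   # how given explicitly

# equals() compares shape and values, treating NaN == NaN as a match
same_result = default_join.equals(left_join)
print(same_result)

# Element-wise == gives a dataframe of booleans; the NaN cell compares as False
elementwise = default_join == left_join
print(elementwise)
```

For a whole-dataframe check like this one, equals() is usually the more convenient choice, because the element-wise result still has to be reduced to a single value and is affected by missing values.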


Example 2: how='outer' in join()

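how='outer' keeps every index label that appears in either dataframe and fills the missing values with NaN.

df_1.join(df_2, how='outer')

# with how='outer'
joined_dataframe = car_data.join(car_name, how='outer')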
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John
e     NaN      NaN   Jess
k     NaN      NaN  James

Example 3: how='inner' in join()

how='inner' keeps only the index labels that appear in both dataframes.

df_1.join(df_2, how='inner')

# with how='inner'
joined_dataframe = car_data.join(car_name, how='inner')
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John

Example 4: how='right' in join()

how='right' keeps every index label of df_2 and fills in NaN where df_1 has no matching label.

df_1.join(df_2, how='right')

# with how='right'
joined_dataframe = car_data.join(car_name, how='right')
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
e     NaN      NaN   Jess
k     NaN      NaN  James
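
To compare the four options side by side, the sketch below runs join() once for each value of how and records which index labels survive. It rebuilds the same two dataframes; the dictionary name index_by_how is an assumption made for this illustration and is not part of pandas.

```python
import pandas as pd

# Same two dataframes as in the examples above
car_data = pd.DataFrame(
    {'Brand': ['Tesla', 'Toyota', 'Tesla', 'Ford'],
     'Location': ['CA', 'CA', 'NY', 'MA']},
    index=list('abcd'))
car_name = pd.DataFrame(
    {'Name': ['Jake', 'Jacob', 'John', 'Jess', 'James']},
    index=list('abdek'))

# Collect the index labels kept by each join type
index_by_how = {}
for how in ['left', 'right', 'inner', 'outer']:
    joined = car_data.join(car_name, how=how)
    index_by_how[how] = list(joined.index)
    print(how, index_by_how[how])
```

The printed labels should match the outputs of Examples 1 to 4: 'left' keeps the labels of car_data, 'right' keeps the labels of car_name, 'inner' keeps only the labels shared by both, and 'outer' keeps the labels found in either.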

Further Reading