We can use join() to combine dataframes in Python Pandas. The basic syntax is as follows, where df_1 and df_2 are the two dataframes to be joined.
df_1.join(df_2)
The following examples show how to join two dataframes using join(). Below are the two dataframes.
import pandas as pd
# dataframe 1
car_data = pd.DataFrame({'Brand': ['Tesla', 'Toyota','Tesla','Ford'],
'Location': ['CA', 'CA','NY','MA']},index=list('abcd'))
print("dataframe 1: car_data: \n",car_data)
# dataframe 2
car_name = pd.DataFrame({
'Name': ['Jake', 'Jacob','John','Jess','James']},index=list('abdek'))
print("dataframe 2: car_name: \n",car_name)
dataframe 1: car_data: 
    Brand Location
a   Tesla       CA
b  Toyota       CA
c   Tesla       NY
d    Ford       MA
dataframe 2: car_name: 
    Name
a   Jake
b  Jacob
d   John
e   Jess
k  James
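Because join() matches rows by index label, it can help to check which labels the two dataframes share before joining. The following is a minimal sketch that rebuilds the two dataframes and compares their indexes; the variable names shared, only_in_car_data, and only_in_car_name are chosen here for illustration only.

```python
import pandas as pd

# Same two dataframes as in the setup above.
car_data = pd.DataFrame(
    {"Brand": ["Tesla", "Toyota", "Tesla", "Ford"],
     "Location": ["CA", "CA", "NY", "MA"]},
    index=list("abcd"),
)
car_name = pd.DataFrame(
    {"Name": ["Jake", "Jacob", "John", "Jess", "James"]},
    index=list("abdek"),
)

# Labels present in both indexes: these rows can be matched by join().
shared = car_data.index.intersection(car_name.index)
# Labels present in only one of the two dataframes.
only_in_car_data = car_data.index.difference(car_name.index)
only_in_car_name = car_name.index.difference(car_data.index)

print(list(shared))            # ['a', 'b', 'd']
print(list(only_in_car_data))  # ['c']
print(list(only_in_car_name))  # ['e', 'k']
```

Rows a, b, and d exist on both sides, while c, e, and k are the rows whose treatment depends on the how argument.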
Example 1: how='left' in join()
how='left' tells join() to keep every row of the calling (left) dataframe and to attach the rows of the other dataframe whose index labels match. how='left' is the default in join(), so calling join() with or without it gives exactly the same result.
df_1.join(df_2) is equivalent to df_1.join(df_2, how='left')
# without how='left'
joined_dataframe=car_data.join(car_name)
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John
# with how='left'
joined_dataframe=car_data.join(car_name,how='left')
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John
As we can see, the results with and without how='left' are identical.
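Rather than comparing the two printed tables by eye, the match can also be checked in code. This is a small sketch, assuming the same two dataframes as above; it uses DataFrame.equals(), which treats NaN values in the same position as equal. The names default_join and explicit_left are illustrative.

```python
import pandas as pd

car_data = pd.DataFrame(
    {"Brand": ["Tesla", "Toyota", "Tesla", "Ford"],
     "Location": ["CA", "CA", "NY", "MA"]},
    index=list("abcd"),
)
car_name = pd.DataFrame(
    {"Name": ["Jake", "Jacob", "John", "Jess", "James"]},
    index=list("abdek"),
)

# Join once with the default and once with how='left' spelled out.
default_join = car_data.join(car_name)
explicit_left = car_data.join(car_name, how="left")

# equals() compares shape, labels, and values, with NaN == NaN in place.
print(default_join.equals(explicit_left))  # True
```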
Example 2: how='outer' in join()
df_1.join(df_2, how='outer')
# with how='outer'
joined_dataframe=car_data.join(car_name,how='outer')
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
c   Tesla       NY    NaN
d    Ford       MA   John
e     NaN      NaN   Jess
k     NaN      NaN  James
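The outer join keeps every index label that appears in either dataframe and fills the missing cells with NaN. The sketch below, which rebuilds the same two dataframes, checks this by comparing the joined index with the union of the two original indexes and by counting the missing values per column. The names outer and union are illustrative.

```python
import pandas as pd

car_data = pd.DataFrame(
    {"Brand": ["Tesla", "Toyota", "Tesla", "Ford"],
     "Location": ["CA", "CA", "NY", "MA"]},
    index=list("abcd"),
)
car_name = pd.DataFrame(
    {"Name": ["Jake", "Jacob", "John", "Jess", "James"]},
    index=list("abdek"),
)

outer = car_data.join(car_name, how="outer")
# Union of the index labels of both dataframes.
union = car_data.index.union(car_name.index)

print(list(outer.index))                 # ['a', 'b', 'c', 'd', 'e', 'k']
print(list(outer.index) == list(union))  # True
# Rows e and k have no car data; row c has no name.
print(outer["Brand"].isna().sum())       # 2
print(outer["Name"].isna().sum())        # 1
```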
Example 3: how='inner' in join()
df_1.join(df_2, how='inner')
# setting how='inner'
joined_dataframe=car_data.join(car_name,how='inner')
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
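The inner join keeps only the index labels found in both dataframes. One way to see this, sketched below for the same two dataframes, is to compare it with the outer join after dropping every row that has a missing value. This shortcut only works here because neither input dataframe contains NaN to begin with; the names inner and outer_complete are illustrative.

```python
import pandas as pd

car_data = pd.DataFrame(
    {"Brand": ["Tesla", "Toyota", "Tesla", "Ford"],
     "Location": ["CA", "CA", "NY", "MA"]},
    index=list("abcd"),
)
car_name = pd.DataFrame(
    {"Name": ["Jake", "Jacob", "John", "Jess", "James"]},
    index=list("abdek"),
)

inner = car_data.join(car_name, how="inner")
# Outer join, then keep only the rows that are complete on both sides.
outer_complete = car_data.join(car_name, how="outer").dropna()

print(list(inner.index))                                        # ['a', 'b', 'd']
print(list(inner.index) == list(outer_complete.index))          # True
print(inner.values.tolist() == outer_complete.values.tolist())  # True
```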
Example 4: how='right' in join()
df_1.join(df_2, how='right')
# setting how='right'
joined_dataframe=car_data.join(car_name,how='right')
print(joined_dataframe)
    Brand Location   Name
a   Tesla       CA   Jake
b  Toyota       CA  Jacob
d    Ford       MA   John
e     NaN      NaN   Jess
k     NaN      NaN  James
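The right join keeps every row of the dataframe passed to join(), so it mirrors a left join called from the other dataframe. The following sketch, again using the same two dataframes, checks that the two results hold the same rows and differ only in column order. The names right and swapped_left are illustrative.

```python
import pandas as pd

car_data = pd.DataFrame(
    {"Brand": ["Tesla", "Toyota", "Tesla", "Ford"],
     "Location": ["CA", "CA", "NY", "MA"]},
    index=list("abcd"),
)
car_name = pd.DataFrame(
    {"Name": ["Jake", "Jacob", "John", "Jess", "James"]},
    index=list("abdek"),
)

right = car_data.join(car_name, how="right")
# Left join (the default) called from car_name instead of car_data.
swapped_left = car_name.join(car_data)

# The right join uses the index of car_name.
print(list(right.index) == list(car_name.index))  # True
print(list(right.columns))         # ['Brand', 'Location', 'Name']
print(list(swapped_left.columns))  # ['Name', 'Brand', 'Location']
# After reordering the columns, both results hold the same data.
print(right.equals(swapped_left[right.columns]))  # True
```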