This tutorial shows how to select columns to form a new dataframe in Python Pandas.
The following figure illustrates the goal: the original dataframe has 4 columns, and you want to select only 2 of them to form a new dataframe. The figure is taken from the Pandas manual.
The following shows the steps of building a dataframe and creating a new dataframe based on the original one.
Step 1: Create a dataframe
# importing Pandas
import pandas as pd
# Create a dataframe
car_data = {'Brand': ['Tesla', 'Tesla', 'Tesla', 'Ford', 'Ford'],
            'Location': ['CA', 'CA', 'NY', 'MA', 'CA'],
            'Year': ['2019', '2018', '2020', '2019', '2019']}
car_data = pd.DataFrame(data=car_data)
# Print out the original dataframe
print('Original Dataframe: \n', car_data)
Original Dataframe:
   Brand Location  Year
0  Tesla       CA  2019
1  Tesla       CA  2018
2  Tesla       NY  2020
3   Ford       MA  2019
4   Ford       CA  2019
Step 2: Select columns and save
The following code selects two columns, Brand and Year, and saves them as a new dataframe called selected_df.
# subset two columns, Brand and Year
selected_df = car_data[["Brand", "Year"]]
# print out the new dataframe
print(selected_df)
The following is the new dataframe.
   Brand  Year
0  Tesla  2019
1  Tesla  2018
2  Tesla  2020
3   Ford  2019
4   Ford  2019
The inner square brackets define a Python list of column names. The outer brackets select those columns from the dataframe. Because a list is passed, the result is a dataframe rather than a Series.
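To make the role of the two pairs of brackets concrete, here is a minimal sketch that rebuilds the same car_data and compares single brackets with double brackets. The variable names brand_series and brand_df are illustrative assumptions and are not part of the steps above.

```python
import pandas as pd

# Same data as in Step 1
car_data = pd.DataFrame({
    'Brand': ['Tesla', 'Tesla', 'Tesla', 'Ford', 'Ford'],
    'Location': ['CA', 'CA', 'NY', 'MA', 'CA'],
    'Year': ['2019', '2018', '2020', '2019', '2019'],
})

# Single brackets with one column name return a Series
brand_series = car_data['Brand']

# Double brackets pass a list of names and return a DataFrame,
# even when the list holds only one column name
brand_df = car_data[['Brand']]

print(type(brand_series).__name__)  # Series
print(type(brand_df).__name__)      # DataFrame
print(brand_df.shape)               # (5, 1)
```

If the result should stay a dataframe (for example, to keep the column header when printing or saving), use the double-bracket form even for a single column.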