In pandas, you can call groupby() and pass the name of the column that you want to group on. Then, you select the column to aggregate and apply an aggregation function to it. The basic syntax is as follows.
df.groupby("column_name1")["column_name2"].function()
The following code generates a sample DataFrame.
# Generate the sample DataFrame
import pandas as pd

car_data = {
    'Brand': ['Brand 1', 'Brand 1', 'Brand 2', 'Brand 1', 'Brand 3'],
    'Location': ['CA', 'CA', 'CA', 'NY', 'MA'],
    'Number': [200, 20, 300, 400, 500],
}
car_data = pd.DataFrame(data=car_data)
print(car_data)
     Brand Location  Number
0  Brand 1       CA     200
1  Brand 1       CA      20
2  Brand 2       CA     300
3  Brand 1       NY     400
4  Brand 3       MA     500
# Group by "Location" and sum the "Number" column within each group
car_data.groupby("Location")["Number"].sum()
Location
CA    520
MA    500
NY    400
Name: Number, dtype: int64
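The same pattern may extend to several aggregations at once by passing a list of function names to agg(). The following is a minimal sketch, assuming the sample data from above (rebuilt here so that the block runs on its own); the variable name summary is only illustrative.

```python
import pandas as pd

# Rebuild the sample DataFrame so this block is self-contained
car_data = pd.DataFrame({
    'Brand': ['Brand 1', 'Brand 1', 'Brand 2', 'Brand 1', 'Brand 3'],
    'Location': ['CA', 'CA', 'CA', 'NY', 'MA'],
    'Number': [200, 20, 300, 400, 500],
})

# Group by "Location" and compute sum, mean, and max of "Number" in one call.
# The result is a DataFrame with one row per location and one column per function.
summary = car_data.groupby("Location")["Number"].agg(["sum", "mean", "max"])
print(summary)
```

Each function name in the list becomes a column of the result, so the totals from the sum() example appear in the "sum" column.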
You can also use the groupby() function to count the rows in each group. Note that count() counts only the non-null values in the selected column.
car_data.groupby("Location")["Number"].count()
Location
CA    3
MA    1
NY    1
Name: Number, dtype: int64
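Because count() skips missing values, it can differ from the number of rows in a group. The sketch below illustrates this with a hypothetical variant of the sample data in which one "Number" value is missing; the names cars_with_gap, non_null_counts, and row_counts are assumptions made for this example. size() is used here as one way to count rows regardless of missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the sample data: the second CA row has no "Number"
cars_with_gap = pd.DataFrame({
    'Location': ['CA', 'CA', 'CA', 'NY', 'MA'],
    'Number': [200, np.nan, 300, 400, 500],
})

# count() ignores the missing value, so CA is reported as 2
non_null_counts = cars_with_gap.groupby("Location")["Number"].count()

# size() counts every row in the group, so CA is reported as 3
row_counts = cars_with_gap.groupby("Location").size()

print(non_null_counts)
print(row_counts)
```

When the selected column has no missing values, as in the original sample data, the two approaches give the same numbers.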