How to Use groupby() in Pandas Dataframes

In pandas, call groupby() on a DataFrame and pass the name of the column you want to group on. Then select the column you want to aggregate and apply an aggregation function to it. The basic syntax is as follows.

df.groupby("column_name1")["column_name2"].function()

The following code creates the sample DataFrame.

# Generate the sample DataFrame
import pandas as pd

car_data = {'Brand': ['Brand 1', 'Brand 1', 'Brand 2', 'Brand 1', 'Brand 3'],
            'Location': ['CA', 'CA', 'CA', 'NY', 'MA'],
            'Number': [200, 20, 300, 400, 500]}
car_data = pd.DataFrame(data=car_data)
print(car_data)
     Brand Location  Number
0  Brand 1       CA     200
1  Brand 1       CA      20
2  Brand 2       CA     300
3  Brand 1       NY     400
4  Brand 3       MA     500
# Group by "Location" and sum the "Number" column
car_data.groupby("Location")["Number"].sum()
Location
CA    520
MA    500
NY    400
Name: Number, dtype: int64
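
The same pattern extends to more than one aggregation at a time. The following is a minimal sketch, assuming the same sample data as above; the variable name summary is introduced here only for illustration. It passes a list of function names to agg(), which returns a DataFrame with one column per aggregation.

```python
import pandas as pd

# Same sample data as in the example above
car_data = pd.DataFrame({
    "Brand": ["Brand 1", "Brand 1", "Brand 2", "Brand 1", "Brand 3"],
    "Location": ["CA", "CA", "CA", "NY", "MA"],
    "Number": [200, 20, 300, 400, 500],
})

# Apply several aggregations to "Number" within each "Location" group.
# "summary" is an illustrative name, not part of the original example.
summary = car_data.groupby("Location")["Number"].agg(["sum", "mean", "max"])
print(summary)
```

Each row of summary is one location, and the columns are named after the functions in the list, so summary.loc["CA", "sum"] gives the same 520 as the sum() call above.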

You can also use groupby() with count() to count the rows in each group. Note that count() counts only the non-null values in the selected column.

car_data.groupby("Location")["Number"].count()
Location
CA    3
MA    1
NY    1
Name: Number, dtype: int64
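
In the sample data, every value is present, so count() matches the number of rows per group. When a column contains missing values, the two can differ. The sketch below assumes a modified copy of the sample data with one missing value (the DataFrame df and the names counts and sizes are illustrative) and compares count() with size(), which counts all rows in each group.

```python
import numpy as np
import pandas as pd

# Illustrative data: the second CA row has a missing "Number"
df = pd.DataFrame({
    "Location": ["CA", "CA", "CA", "NY", "MA"],
    "Number": [200, np.nan, 300, 400, 500],
})

# count() skips missing values in the selected column
counts = df.groupby("Location")["Number"].count()

# size() counts every row in the group, missing or not
sizes = df.groupby("Location")["Number"].size()

print(counts)
print(sizes)
```

Here CA has a count of 2 but a size of 3, so size() is usually the safer choice when the goal is the row frequency per group.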