In this post, we will learn how to group by multiple columns and count values in a Pandas DataFrame. The Pandas groupby() function splits a DataFrame into groups of rows that share the same values in one or more columns. It returns a DataFrameGroupBy object. We can then apply aggregate functions such as sum(), mean(), max(), min(), median(), and count() to the grouped data. To apply one or several functions at once, pass them to agg() either as a single name such as 'sum' or as a list such as ['sum', 'max', 'min'].
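As a quick illustration of passing a list of function names to agg(), here is a minimal sketch. The small DataFrame and the variable names `df` and `summary` are assumptions made up for this example, not part of the later sections.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Fee': [100, 100, 300, 100],
})

# A list of function names produces one result column per function.
summary = df.groupby('Name')['Fee'].agg(['sum', 'max', 'min'])
print(summary)
```

For 'Rama' this gives a sum of 300, a max of 100, and a min of 100; for 'Max' all three values are 300.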
1. Pandas groupby single column count
In this example, we group the DataFrame by the column 'Name' and count the values in the column 'Fee'. The result of count() is indexed by 'Name', so reset_index() is used to move 'Name' back into a regular column and restore a default integer index. The result then prints as an ordinary DataFrame.
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
dfobj = dfobj.groupby('Name')['Fee'].count().reset_index()
print(dfobj)
Output
   Name  Fee
0   Max    1
1  Rama    3
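To make the role of reset_index() concrete, the sketch below compares the grouped result before and after the call. The variable names `counts`, `flat`, and `alt` are illustrative assumptions. The last line shows as_index=False, a groupby() parameter that is one possible alternative to calling reset_index().

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Fee': [100, 100, 300, 100],
})

# Without reset_index(): a Series whose index holds the group keys.
counts = df.groupby('Name')['Fee'].count()

# With reset_index(): a DataFrame with 'Name' as a regular column.
flat = counts.reset_index()

# Alternative: keep the group key as a column from the start.
alt = df.groupby('Name', as_index=False)['Fee'].count()

print(type(counts).__name__, list(counts.index))
print(type(flat).__name__, list(flat.columns))
```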
2. Pandas groupby multiple columns count using agg()
In this example, we will learn how to count multiple columns per group in a Pandas DataFrame. First, we group the DataFrame by the columns 'Name' and 'Marks', then apply the count function to the columns 'Fee' and 'Tution_Fee' with the help of the agg() method. The agg() method applies one or more aggregate functions to the grouped data. Commonly used aggregate functions are:
- mode : most frequent value
- var : variance
- count : number of non-null values
- min : minimum
- max : maximum
- std : standard deviation
- sum : sum of values
- mean : mean
- median : median
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
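# Select several columns with a list (double brackets); a bare tuple of names raises an error in recent pandas versions.
dfobj = dfobj.groupby(['Name','Marks'])[['Fee','Tution_Fee']].agg('count').reset_index()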
print(dfobj)
Output
   Name  Marks  Fee  Tution_Fee
0   Max    100    1           1
1  Rama     97    3           3
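Because count only counts non-null values, it can differ from the number of rows in a group. The sketch below shows this by comparing count() with size() on a copy of the data in which one 'Fee' value has been replaced by NaN. The missing value is an assumption added purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, np.nan, 300, 100],  # one missing fee, added for illustration
})

grouped = df.groupby(['Name', 'Marks'])

non_null = grouped['Fee'].count()  # skips the NaN entry
rows = grouped.size()              # counts every row in the group

print(non_null)
print(rows)
```

For the group ('Rama', 97), count() reports 2 while size() reports 3.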
3. Pandas groupby multiple columns count by column value
In this example, we count values per group by calling count() directly instead of going through agg(). We group the DataFrame by the columns 'Name' and 'Marks', count the non-null values in the columns 'Fee' and 'Tution_Fee' for each group, and call reset_index() to turn the group keys back into regular columns with a new integer index.
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
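# Select several columns with a list (double brackets); a bare tuple of names raises an error in recent pandas versions.
dfobj = dfobj.groupby(['Name','Marks'])[['Fee','Tution_Fee']].count().reset_index()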
print(dfobj)
Output
   Name  Marks  Fee  Tution_Fee
0   Max    100    1           1
1  Rama     97    3           3
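4. Pandas groupby multiple columns count distinct
In this example, we group the DataFrame by the column 'Name' and collect the distinct 'Fee' values of each group with the unique() method. Note that unique() returns an array of the distinct values rather than a number; to get the number of distinct values, use nunique() instead. The reset_index() method then moves 'Name' back into a regular column.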
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
dfobj = dfobj.groupby('Name')['Fee'].unique().reset_index()
print(dfobj)
Output
   Name    Fee
0   Max  [300]
1  Rama  [100]
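If the goal is the number of distinct values per group rather than the values themselves, nunique() can be used. The following is a minimal sketch on slightly modified data: the 'Fee' value 150 is an assumption introduced so that one group has more than one distinct value.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 150, 300, 100],  # 150 added for illustration
})

# nunique() counts the distinct non-null values in each selected column.
distinct = df.groupby('Name')[['Marks', 'Fee']].nunique().reset_index()
print(distinct)
```

Here 'Rama' has 1 distinct 'Marks' value and 2 distinct 'Fee' values, while 'Max' has 1 of each.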
5. Pandas groupby multiple columns sum and count
In this example, we will learn how to compute both a count and a sum per group in a Pandas DataFrame. First, we group the DataFrame by the column 'Name', then find the count of one column and the sum of another.
- The agg() function counts the 'Marks' column and sums the 'Tution_Fee' column.
- The reset_index() function turns the group key 'Name' back into a regular column.
- The rename() function renames the column 'Marks' to 'Marks count'.
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
dfobj = dfobj.groupby('Name').agg({'Marks':'count', 'Tution_Fee': 'sum'}).reset_index().rename(columns={'Marks':'Marks count'})
print(dfobj)
Output
   Name  Marks count  Tution_Fee
0   Max            1         600
1  Rama            3        1200
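As an alternative to renaming columns after the aggregation, pandas also supports named aggregation, where each output column is declared as a keyword argument of agg(). The sketch below is one possible way to write the example above; the output column names `marks_count` and `tuition_fee_sum` are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400],
})

# Each keyword is an output column: (input column, aggregate function).
result = df.groupby('Name').agg(
    marks_count=('Marks', 'count'),
    tuition_fee_sum=('Tution_Fee', 'sum'),
).reset_index()
print(result)
```

This names the result columns in the same step as the aggregation, so no separate rename() call is needed.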