In this post, we will learn how to group by and count unique values in Pandas. Install Pandas on your local system with the pip command "pip install pandas", then import it into your code with "import pandas as pd" to use its DataFrame functions.
The Pandas groupby() function groups rows that share the same values in the given column(s) and splits the dataframe into groups. It returns a DataFrameGroupBy object. We can apply aggregate functions such as sum(), mean(), max(), min(), median(), and count() to the grouped dataframe. To apply one or several functions at once, pass them to agg() either as a single name, e.g. 'sum', or as a list of names such as ['sum', 'max', 'min'].
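As a quick illustration of passing one function or a list of functions to agg(), here is a minimal sketch. The sample data and the variable names (df, total_fee, summary) are made up for this example and are not part of the examples that follow.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Fee': [100, 200, 300, 100],
})

# A single aggregate function passed by name
total_fee = df.groupby('Name')['Fee'].agg('sum')
print(total_fee)

# Several aggregate functions passed as a list;
# each function becomes one column of the result
summary = df.groupby('Name')['Fee'].agg(['sum', 'max', 'min'])
print(summary)
```

With a single function name the result is a Series indexed by group; with a list, it is a dataframe with one column per function.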
1. Pandas groupby multiple columns count distinct
Sometimes we need to group by multiple columns and count the distinct values within each resulting group. In the example below, we group the dataframe by the columns ['Name', 'Marks'] and count the distinct values of 'Fee' and 'Tution_Fee' in each group with the nunique() method. reset_index() then turns the group keys back into regular columns so the result displays as a flat dataframe.
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
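# Select several columns with a list (double brackets); recent pandas versions reject a bare tuple of names
dfobj = dfobj.groupby(['Name','Marks'])[['Fee', 'Tution_Fee']].nunique().reset_index()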
print(dfobj)
Output
Name Marks Fee Tution_Fee
0 Max 100 1 1
1 Rama 97 1 1
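In the data above every group holds a single distinct value, so all counts are 1. The sketch below uses a slightly modified, hypothetical version of the data (one 'Fee' changed to 200) to show a count greater than 1. The names df and counts are illustrative only.

```python
import pandas as pd

# Hypothetical data: the group ('Rama', 97) now has two different fees
data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 200, 300, 100],
    'Tution_Fee': [400, 400, 600, 400],
}
df = pd.DataFrame(data)

# Count distinct values of each selected column per (Name, Marks) group
counts = df.groupby(['Name', 'Marks'])[['Fee', 'Tution_Fee']].nunique().reset_index()
print(counts)
```

Here the ('Rama', 97) group should report 2 distinct 'Fee' values and 1 distinct 'Tution_Fee' value, while the ('Max', 100) group reports 1 for both.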
2. Pandas groupby single column count distinct
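In this example, we group the dataframe by the single column 'Name' and collect the distinct 'Fee' values of each group with the unique() method. Note that unique() returns an array of the distinct values themselves rather than their count; to get the count, use nunique() instead. reset_index() resets the index of the resulting dataframe.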
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
dfobj = dfobj.groupby('Name')['Fee'].unique().reset_index()
print(dfobj)
Output
Name Fee
0 Max [300]
1 Rama [100]
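To make the difference between listing and counting distinct values concrete, here is a minimal sketch that applies both unique() and nunique() to the same single-column grouping. The data is a hypothetical variation of the example above (one 'Fee' changed to 200), and the variable names are illustrative.

```python
import pandas as pd

# Hypothetical data: 'Rama' has two different fees
data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Fee': [100, 200, 300, 100],
}
df = pd.DataFrame(data)

# unique() lists the distinct values of each group as an array
distinct_values = df.groupby('Name')['Fee'].unique().reset_index()
print(distinct_values)

# nunique() counts the distinct values of each group
distinct_counts = df.groupby('Name')['Fee'].nunique().reset_index()
print(distinct_counts)
```

Use unique() when you need to see which values occur in each group, and nunique() when only the number of distinct values matters.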