In this post, we are going to learn how to calculate a Pandas groupby median over multiple columns. The Pandas groupby() function groups the rows that share the same values in the given columns and splits the DataFrame into groups. It returns a DataFrameGroupBy object. We can apply functions such as sum(), mean(), max(), min(), count(), and median() to the result of groupby(). Alternatively, we can pass a single function name such as 'sum', or a list of names such as ['sum', 'max', 'min'], as an argument to the agg() function.
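As a minimal sketch of the two styles described above, the snippet below calls median() directly on the grouped object and then passes a list of function names to agg(). The sample values are hypothetical and were chosen so that each group holds several different numbers.

```python
import pandas as pd

# Hypothetical sample data: each group has several distinct values.
df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Rama', 'Max', 'Max'],
    'Fee': [100, 200, 600, 300, 500],
    'Tution_Fee': [400, 500, 900, 600, 800],
})

# Style 1: call median() directly on the selected columns of each group.
direct = df.groupby('Name')[['Fee', 'Tution_Fee']].median()

# Style 2: pass a list of function names to agg() to get several statistics at once.
several = df.groupby('Name')[['Fee', 'Tution_Fee']].agg(['median', 'max'])

print(direct)
print(several)
```

With a list of functions, the result columns become a two-level index of (column, function) pairs.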
1. Pandas Groupby median multiple columns using agg()
In this example, we group the DataFrame on multiple columns and apply the function 'median' by passing it to the agg() function for each column whose median we need. Here we calculate the median of the columns 'Fee' and 'Tution_Fee'.
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
dfobj = dfobj.groupby(['Name','Marks']).agg({'Fee':'median','Tution_Fee':'median'})
print(dfobj)
Output
              Fee  Tution_Fee
Name Marks
Max  100    300.0       600.0
Rama 97     100.0       400.0
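The shape of the result columns depends on how the functions are passed to agg(). The following sketch, which reuses the sample data from this post, contrasts a dict of function names with a dict of lists. The variable names flat and nested are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400],
})
grouped = df.groupby(['Name', 'Marks'])

# A plain string per column keeps single-level column labels.
flat = grouped.agg({'Fee': 'median', 'Tution_Fee': 'median'})

# A list per column adds a second column level holding the function name.
nested = grouped.agg({'Fee': ['median'], 'Tution_Fee': ['median']})

print(flat.columns.tolist())
print(nested.columns.tolist())
```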
2. Pandas Groupby median multiple column rename using agg()
In this example, we calculate the median of multiple columns by using the agg() function and rename the resulting columns as well.
- First, group the DataFrame on multiple columns as per requirement.
- Apply the function 'median' by passing it to the agg() function for the columns whose median we need, which are 'Fee' and 'Tution_Fee'.
- Finally, use the rename() function to give the aggregated columns descriptive names.
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
dfobj = dfobj.groupby(['Name','Marks']).agg({'Fee':'median','Tution_Fee':'median'}).rename(columns={'Fee':'F_median','Tution_Fee': 'Tution_Fee_median'})
print(dfobj)
Output
            F_median  Tution_Fee_median
Name Marks
Max  100       300.0              600.0
Rama 97        100.0              400.0
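Aggregating and renaming can also be done in one step with named aggregation, where each keyword passed to agg() becomes an output column name. This is an alternative to the agg() plus rename() chain used above, not the approach of the original example. A minimal sketch on the same data:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400],
})

# Each keyword is the new column name; the tuple is (source column, function).
result = df.groupby(['Name', 'Marks']).agg(
    F_median=('Fee', 'median'),
    Tution_Fee_median=('Tution_Fee', 'median'),
)
print(result)
```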
3. Pandas Groupby median multiple columns by selecting Column name
We group the Pandas DataFrame by multiple columns and select the columns whose median we need. Then we apply the median() function to the selected columns, which are ['Fee', 'Tution_Fee'].
- The reset_index() function turns the group keys back into ordinary columns, and the rename() function gives the median columns new names.
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
dfobj = dfobj.groupby(['Name','Marks'])[['Fee','Tution_Fee']].median().reset_index().rename(columns={'Fee':'Fee_Median','Tution_Fee': 'T_Fee_Median'})
print(dfobj)
Output
   Name  Marks  Fee_Median  T_Fee_Median
0   Max    100       300.0         600.0
1  Rama     97       100.0         400.0
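If the only purpose of reset_index() is to keep the group keys as regular columns, passing as_index=False to groupby() gives the same layout. The sketch below shows that variant on the same data; it is an alternative to the reset_index() call above.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400],
})

# as_index=False keeps 'Name' and 'Marks' as columns instead of moving them to the index.
result = df.groupby(['Name', 'Marks'], as_index=False)[['Fee', 'Tution_Fee']].median()
print(result)
```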
4. Pandas Groupby median multiple columns Using pivot
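In this example, we group the DataFrame by multiple columns, calculate the median, and then reshape the result with the pivot() function, which returns the data organized by the given index and column values. Combinations that do not occur in the data are filled with 0. Let us understand with the example below.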
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
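# numeric_only=True skips the text column 'Name'; recent pandas versions require keyword arguments for pivot()
dfobj = dfobj.groupby(['Fee','Tution_Fee'], as_index=False).median(numeric_only=True).pivot(index='Fee', columns='Tution_Fee').fillna(0)
print(dfobj)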
Output
            Marks
Tution_Fee    400    600
Fee
100          97.0    0.0
300           0.0  100.0
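The group, aggregate, and reshape steps can also be combined with pivot_table(), which accepts an aggregation function directly. This is a sketch of an alternative, not the method used above. Because a single values column is named, the result has one column level instead of two.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400],
})

# Median of 'Marks' for every (Fee, Tution_Fee) pair; missing pairs become 0.
table = df.pivot_table(index='Fee', columns='Tution_Fee', values='Marks',
                       aggfunc='median', fill_value=0)
print(table)
```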
5. Pandas Groupby median multiple column rename using join()
In this example, we calculate the median of multiple columns using agg() and rename the columns with the help of the string join() method.
- First, group the DataFrame on multiple columns as per requirement.
- Apply the function 'median' by passing it in a list to the agg() function for the columns whose median we need, which are 'Fee' and 'Tution_Fee'. This produces two-level column labels such as ('Fee', 'median').
- Map '_'.join over the column labels to flatten each pair into a single name such as 'Fee_median'.
import pandas as pd
data = {
'Name': ['Rama', 'Rama', 'Max', 'Rama'],
'Marks':[97,97,100,97],
'Fee':[100,100,300,100],
'Tution_Fee':[400,400,600,400]
}
dfobj = pd.DataFrame(data)
dfobj = dfobj.groupby(['Name','Marks']).agg({'Fee':['median'],'Tution_Fee':['median']})
dfobj.columns = dfobj.columns.map('_'.join)
print(dfobj)
Output
            Fee_median  Tution_Fee_median
Name Marks
Max  100         300.0              600.0
Rama 97          100.0              400.0
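The join() approach is most useful when a column has more than one aggregate, because every (column, function) pair then gets its own flat name. The sketch below adds 'max' for 'Fee' purely as an illustration. That extra function is an assumption and is not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400],
})

result = df.groupby(['Name', 'Marks']).agg({'Fee': ['median', 'max'], 'Tution_Fee': ['median']})

# Before flattening, every column label is a (column, function) tuple.
labels_before = result.columns.tolist()

# '_'.join turns ('Fee', 'median') into 'Fee_median', and so on.
result.columns = result.columns.map('_'.join)

print(labels_before)
print(result.columns.tolist())
```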
Summary
In this post, we have learned how to calculate a Pandas groupby median over multiple columns, with examples using the agg(), reset_index(), median(), pivot(), and join() functions.