Pandas Groupby median multiple columns (5 examples)

In this post, we are going to learn how to calculate the median of multiple columns with Pandas groupby(). The groupby() function groups rows that share the same values in one or more columns and splits the DataFrame into those groups. It returns a DataFrameGroupBy object, on which we can call aggregation functions such as sum(), mean(), max(), min(), count(), and median(). Alternatively, we can pass a single function name, for example 'sum', or a list of names such as ['sum', 'max', 'min'], to agg() to apply one or several functions to the grouped DataFrame.
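The two calling styles described above can be sketched as follows. This is a minimal illustration on made-up data (the DataFrame and the variable names grouped, medians, and summary are assumptions, not part of the examples below).

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
df = pd.DataFrame({
    'Name': ['Rama', 'Rama', 'Max', 'Max'],
    'Fee': [100, 200, 300, 500],
})

# groupby() returns a DataFrameGroupBy object; nothing is computed yet.
grouped = df.groupby('Name')

# Style 1: call a single aggregation directly on the grouped column.
medians = grouped['Fee'].median()

# Style 2: pass a list of function names to agg() to get one column per function.
summary = grouped['Fee'].agg(['sum', 'max', 'min', 'median'])

print(medians)
print(summary)
```

With an even number of rows in a group, median() averages the two middle values, which is one reason the result is always returned as a float.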

1. Pandas Groupby median multiple columns using agg()


In this example, we group the DataFrame on multiple columns, 'Name' and 'Marks', and pass a dictionary to agg() that maps each column whose median we need to the function name 'median'. Here we calculate the median of the columns 'Fee' and 'Tution_Fee'.

import pandas as pd

data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400]
}

dfobj = pd.DataFrame(data)

# Group by 'Name' and 'Marks', then take the median of each fee column
dfobj = dfobj.groupby(['Name', 'Marks']).agg({'Fee': 'median', 'Tution_Fee': 'median'})

print(dfobj)



Output

              Fee  Tution_Fee
Name Marks                   
Max  100    300.0       600.0
Rama 97     100.0       400.0

2. Pandas Groupby median multiple column rename using agg()


In this example, we calculate the median of multiple columns with the agg() function and rename the resulting columns.

  • First, group the DataFrame on multiple columns, here 'Name' and 'Marks'.
  • Apply the function 'median' by passing it to agg() for the columns whose median is needed, 'Fee' and 'Tution_Fee'.
  • Call rename() to rename the result columns 'Fee' to 'F_median' and 'Tution_Fee' to 'Tution_Fee_median'. Since reset_index() is not called, 'Name' and 'Marks' stay in the index.

import pandas as pd

data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400]
}

dfobj = pd.DataFrame(data)

# Aggregate, then rename the aggregated columns
dfobj = dfobj.groupby(['Name', 'Marks']).agg({'Fee': 'median', 'Tution_Fee': 'median'}).rename(columns={'Fee': 'F_median', 'Tution_Fee': 'Tution_Fee_median'})

print(dfobj)

Output

            F_median  Tution_Fee_median
Name Marks                             
Max  100       300.0              600.0
Rama 97        100.0              400.0
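The aggregate-then-rename step can also be written as a single call with pandas named aggregation, where each keyword argument to agg() is the new column name and its value is a (column, function) pair. The following is a sketch of that alternative, not the method used in this example; the variable names df and result are chosen here for illustration.

```python
import pandas as pd

data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400]
}
df = pd.DataFrame(data)

# Named aggregation: new_column_name=(source_column, function_name)
result = df.groupby(['Name', 'Marks']).agg(
    F_median=('Fee', 'median'),
    Tution_Fee_median=('Tution_Fee', 'median'),
)

print(result)
```

This avoids the separate rename() call and keeps the output column names next to the functions that produce them.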

3. Pandas Groupby median multiple columns by selecting column names


We group the Pandas DataFrame by multiple columns, select the columns whose median is needed, [['Fee', 'Tution_Fee']], and call median() on that selection.

  • reset_index() turns the group keys 'Name' and 'Marks' back into regular columns, and rename() gives the median columns descriptive names.

import pandas as pd

data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400]
}

dfobj = pd.DataFrame(data)

# Select the fee columns, take the median, flatten the index, and rename
dfobj = dfobj.groupby(['Name', 'Marks'])[['Fee', 'Tution_Fee']].median().reset_index().rename(columns={'Fee': 'Fee_Median', 'Tution_Fee': 'T_Fee_Median'})

print(dfobj)

Output

   Name  Marks  Fee_Median  T_Fee_Median
0   Max    100       300.0         600.0
1  Rama     97       100.0         400.0
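As a variation on this example, passing as_index=False to groupby() keeps the group keys as ordinary columns, so the separate reset_index() call is not needed. The sketch below omits the rename step and uses the illustrative names df and result.

```python
import pandas as pd

data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400]
}
df = pd.DataFrame(data)

# as_index=False keeps 'Name' and 'Marks' as regular columns in the result
result = df.groupby(['Name', 'Marks'], as_index=False)[['Fee', 'Tution_Fee']].median()

print(result)
```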

4. Pandas Groupby median multiple columns using pivot()


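In this example, we group the DataFrame on 'Fee' and 'Tution_Fee', calculate the median of the remaining numeric column, and then reshape the result with the pivot() function. pivot() returns the data reorganized by the given index and column values; combinations that do not occur in the data are NaN, which fillna(0) replaces with 0. Let us understand with the below example.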

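import pandas as pd

data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400]
}

dfobj = pd.DataFrame(data)

# numeric_only=True skips the non-numeric 'Name' column;
# recent pandas versions require keyword arguments for pivot()
dfobj = dfobj.groupby(['Fee', 'Tution_Fee'], as_index=False).median(numeric_only=True).pivot(index='Fee', columns='Tution_Fee').fillna(0)
print(dfobj)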

Output

           Marks       
Tution_Fee   400    600
Fee                    
100         97.0    0.0
300          0.0  100.0
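The group, aggregate, and reshape steps can also be combined with pivot_table(), which accepts an aggregation function directly. This is a sketch of an alternative, not the approach used above; because a single values column is named, the result has one level of column labels (400 and 600) rather than the two-level header shown in the output above.

```python
import pandas as pd

data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400]
}
df = pd.DataFrame(data)

# Median of 'Marks' for each (Fee, Tution_Fee) pair; missing pairs become 0
table = df.pivot_table(
    index='Fee',
    columns='Tution_Fee',
    values='Marks',
    aggfunc='median',
    fill_value=0,
)

print(table)
```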

5. Pandas Groupby median multiple column rename using join()


In this example, we calculate the median of multiple columns using agg() and rename the columns with the help of the join() function.

  • First, group the DataFrame on multiple columns, here 'Name' and 'Marks'.
  • Apply the function 'median' by passing it to agg() as a list for the columns whose median is needed, 'Fee' and 'Tution_Fee'. This produces two-level column labels such as ('Fee', 'median').
  • Join the two levels of each label with '_'.join to get flat names such as 'Fee_median'.

import pandas as pd

data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400]
}

dfobj = pd.DataFrame(data)

# Passing lists to agg() creates two-level column labels
dfobj = dfobj.groupby(['Name', 'Marks']).agg({'Fee': ['median'], 'Tution_Fee': ['median']})
# Join each (column, function) label into a single flat name
dfobj.columns = dfobj.columns.map('_'.join)

print(dfobj)

Output

            Fee_median  Tution_Fee_median
Name Marks                               
Max  100         300.0              600.0
Rama 97          100.0              400.0
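To see what the join step operates on, the sketch below prints the column labels before and after flattening. It uses a list comprehension in place of columns.map() purely to make each step visible; the names df, result, before, and flat are illustrative.

```python
import pandas as pd

data = {
    'Name': ['Rama', 'Rama', 'Max', 'Rama'],
    'Marks': [97, 97, 100, 97],
    'Fee': [100, 100, 300, 100],
    'Tution_Fee': [400, 400, 600, 400]
}
df = pd.DataFrame(data)

result = df.groupby(['Name', 'Marks']).agg({'Fee': ['median'], 'Tution_Fee': ['median']})

# Each column label is a tuple: (original column, function name)
before = result.columns.to_list()
print(before)

# '_'.join glues the parts of each tuple into one string
flat = ['_'.join(label) for label in before]
print(flat)

result.columns = flat
```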

Summary

In this post, we have learned how to calculate the Pandas groupby median of multiple columns using agg(), median(), reset_index(), rename(), pivot(), and join(), with examples.