How to Sum rows by condition Pandas Dataframe

In this post, we will learn how to Sum rows by a condition in Pandas Dataframe or Pandas rows sum on condition with example. To use the Pandas library first we have to install it on the local system by using the pip command “pip install pandas” and import it into our code by using “import pandas as pd” to use its functions.

1. How to Sum rows by condition Pandas Dataframe using loc[]


The Python Pandas sum() function returns the sum of the given values over the requested axis. If the parameter is axis=0 for summing the rows of Pandas dataframe it is the default value. axis=1 means to Sum the columns of Pandas dataframe.In this example, we are aggregating Pandas rows by condition. Where values of Marks columns > 97.When we sum all rows of Pandas dataframe along with axis =0 column-wise. The non-numeric values are appended on the header, to avoid that we have used numeric_only=True.

import pandas as pd

   
data = {
    'Name': ['Rama', 'Rack', 'Max', 'David'],     
    'Marks':[97,97,100,100],    
    'Fee':[100,200,300,400],    
    'Tution_Fee':[400,500,600,700]
}

 
 
dfobj = pd.DataFrame(data)


#to avoid concatenation of the non numeric values in the row header numeric_only=True
row_Sum = dfobj.loc[dfobj['Marks']>97].sum(axis=0,numeric_only=True)



print(row_Sum)

Output

Marks          200
Fee            700
Tution_Fee    1300
dtype: int64

2. How to Sum rows by Mutiple conditions Pandas Dataframe


In this example, we have used the sum() function with axis=0 along with loc[] to Sum rows of the dataframe under multiple conditions.We are Summing the rows that have ‘Marks’]>97)&(dfobj[‘Fee’] >=200.

  • When we sum all rows of Pandas dataframe along with axis =0 column-wise. The non-numeric values are appended on the header, to avoid that we have used numeric_only=True.
import pandas as pd

   
data = {
    'Name': ['Rama', 'Rack', 'Max', 'David'],     
    'Marks':[97,97,100,100],    
    'Fee':[100,200,500,400],    
    'Tution_Fee':[400,500,600,700]
}

 
 
dfobj = pd.DataFrame(data)


#to avoid concatenation of the non numeric values in the row header numeric_only=True


row_Sum = dfobj.loc[(dfobj['Marks']>97)&(dfobj['Fee'] >=200)].sum(axis=0,numeric_only=True)


print(row_Sum)

Output

Marks          200
Fee            900
Tution_Fee    1300
dtype: int64

3. How to Sum rows by condition Pandas Dataframe using query()


The pandas.DataFrame.query() function is used to evaluate a string query .In the below example we have passed a string query ‘Marks>97’ to sum the rows of Pandas dataframe based on condition.

import pandas as pd

   
data = {
    'Name': ['Rama', 'Rack', 'Max', 'David'],     
    'Marks':[97,97,100,100],    
    'Fee':[100,200,300,400],    
    'Tution_Fee':[400,500,600,700]
}

 
 
dfobj = pd.DataFrame(data)


#to avoid concatenation of the non numeric values in the row header numeric_only=True

row_Sum = dfobj.query('Marks>97').sum(axis=0,numeric_only=True)



print(row_Sum)

Output

Marks          200
Fee            700
Tution_Fee    1300
dtype: int64

4. Pandas Sum rows by multiple conditions using query()


In this python program we have used the pandas.DataFrame.query() function and passed a query string ‘Marks>97 and Fee >=200’ with multiple condition to Sum rows based multiple condition.

import pandas as pd

   
data = {
    'Name': ['Rama', 'Rack', 'Max', 'David'],     
    'Marks':[97,97,100,100],    
    'Fee':[100,200,500,400],    
    'Tution_Fee':[400,500,600,700]
}

 
 
dfobj = pd.DataFrame(data)


#to avoid concatenation of the non numeric values in the row header numeric_only=True

row_Sum = dfobj.query('Marks>97 and Fee >=200').sum(axis=0,numeric_only=True)

print(row_Sum)

Output

Marks          200
Fee            900
Tution_Fee    1300
dtype: int64

5. Pandas sum rows by multiple conditions using group by


Sometimes we have to sum rows of Pandas dataframe by group.The groupby() function is used to group the identical repeated values in given data and return a Dataframegroupby that contain result after applying aggregate function like sum() and mean.In this example we are using the groupby() function by multiple conditions.

import pandas as pd

   
data = {
    'Name': ['Rama', 'Rack', 'Max', 'David'],     
    'Marks':[97,97,100,100],    
    'Fee':[100,200,300,400],    
    'Tution_Fee':[400,500,600,700]
}

 
 
dfobj = pd.DataFrame(data)


#to avoid concatenation of the non numeric values in the row header numeric_only=True


row_Sum  = dfobj.groupby('Marks').apply(lambda x: x[x['Marks']> 97].sum(axis=0,numeric_only=True))



print(row_Sum)

Output

 Marks  Fee  Tution_Fee
Marks                        
97         0    0           0
100      200  700        1300

6. How to Sum rows by condition Pandas Dataframe


In this example We have used the simple way, column label to select rows based on condition and apply sum() to sum rows of Pandas dataframe based on condition.

import pandas as pd
   
data = {
    'Name': ['Jack', 'Jack', 'Max', 'Jack'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,201,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,400,600,700]
}
   
dfobj = pd.DataFrame(data)


#to avoid concatenation of the non numeric values in the row header numeric_only=True


row_Sum = dfobj[dfobj['Marks']>97].sum(axis=0,numeric_only=True)

print(row_Sum)

Output

Marks          200
Admit_fee      411
Fee            700
Tution_Fee    1300
dtype: int64