Pandas sum columns by multiple conditions

In this post, we will learn how to do the Pandas sum columns by multiple conditions or Pandas sum columns with conditions. The Pandas is an analytical data library that stores data in tabular form and the table in Pandas is called a dataframe that contains rows and columns. To use a Pandas library first we have to install it on the local system by using the pip command “pip install pandas” and import it into our code by using “import pandas as pd” to use its functions.

1. Pandas sum columns by single condition


In this below example, we are summing columns of Pandas dataframe by a single condition and appending the resulting output to the new column name ‘Total’.

import pandas as pd
  
data = {
    'Name': ['Jack', 'Rack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)

dfobj["Total"] = dfobj[dfobj["Marks"] > 97].sum(axis=1)

print(dfobj)


Output

    Name  Marks  Admit_fee  Fee  Tution_Fee   Total
0   Jack     97        201  100         400     NaN
1   Rack     97        203  200         500     NaN
2    Max    100        205  300         600  1205.0
3  David    100        206  400         700  1406.0

2. Pandas sum columns by multiple conditions


In this example, we are finding the sum of columns ‘Fee’ by applying multiple conditions using the dataframe loc[] property it will return the sum of rows where [‘Marks’]>= 97 and dfobj[‘Fee’]>=300.

import pandas as pd
  
data = {
    'Name': ['Jack', 'Rack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)

Col_fee_sum = dfobj.loc[(dfobj['Marks']>=97) & (dfobj['Fee']>=300 ), 'Fee'].sum()

print(Col_fee_sum)

Output

700

3. Pandas sum columns by multiple conditions


In this example, we are finding the sum of multiple columns by multiple conditions it will return the sum of rows where values of ‘Fee’ columns is in 200,300 by using the loc[] property along with isin() of Pandas dataframe function.

import pandas as pd
  
data = {
    'Name': ['Jack', 'Jack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)

Col_fee_sum = dfobj.loc[dfobj['Fee'].isin([200,300]), 'Fee'].sum()

print(Col_fee_sum)


Output

500

4. Pandas sum columns by multiple conditions using loc[]


In this example, we are summing the column of the dataframe by using the loc[] property, Where we are validating the values with condition where dfobj[‘Marks’] >97 the sum function is applied to get sum based on given condition. The result output is appending into a new column Total_Fee of dataframe.

import pandas as pd
  
data = {
    'Name': ['Jack', 'Rack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)

dfobj['Total_Fee'] = dfobj.loc[dfobj['Marks'] >97,['Fee','Tution_Fee']].sum(axis=1)
dfobj['Total_Fee'].fillna(0, inplace=True)

print(dfobj)

Output

     Name  Marks  Admit_fee  Fee  Tution_Fee  Total_Fee
0   Jack     97        201  100         400        0.0
1   Rack     97        203  200         500        0.0
2    Max    100        205  300         600      900.0
3  David    100        206  400         700     1100.0

5. Pandas sum columns by multiple conditions


In this example, we are summing the values of multiple columns [‘Fee’,’Admit_fee’,’Tution_Fee’] by checking with conditions where dfobj[‘Marks’]>97 with the help of loc[] property of Pandas dataframe.

import pandas as pd
  
data = {
    'Name': ['Jack', 'Rack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)

Col_fee_sum = dfobj.loc[dfobj['Marks']>97, ['Fee','Admit_fee','Tution_Fee']].sum(axis=1)

print(Col_fee_sum)

Output

2    1105
3    1306
dtype: int64

6.Pandas sum columns by multiple conditions using where


Another way to sum the column of Pandas dataframe by multiple condition is by using the where() method, is a short and one-line code. In this example, we are summing the column of the dataframe by using the loc[] property, Where we are checking if values of column dfobj[‘Marks’] >97 the sum function will return the sum of a list columns [‘Fee’, ‘Tution_Fee’] and the result output is appending into a new column Total_Fee of dataframe.

import pandas as pd
  
data = {
    'Name': ['Jack', 'Rack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)

dfobj['Total_Fee'] = dfobj[['Fee','Tution_Fee']].sum(axis=1).where(dfobj['Marks'] >97, 0)
print(dfobj)

Output

  Name  Marks  Admit_fee  Fee  Tution_Fee  Total_Fee
0   Jack     97        201  100         400          0
1   Rack     97        203  200         500          0
2    Max    100        205  300         600        900
3  David    100        206  400         700       1100