Pandas sum rows by columns(6 ways)

In this post, we will learn how to Pandas sum of rows by columns or Pandas sum of rows by index. To use a Pandas library first we have to install it on the local system by using the pip command “pip install pandas” and import it into our code by using “import pandas as pd” to use its functions.

1. Pandas Sum rows by column with condition


The Python Pandas sum() function returns the sum of a given value over the requested axis. if the parameter is axis=0 that means summing the rows of Pandas dataframe, In the case of axis=1 means summing the columns of Pandas dataframe.In this example, we are aggregating Pandas rows by condition. Where values of Marks columns > 97.When we sum all rows of Pandas dataframe along with axis =0 column-wise. The non-numeric values are appended on the header, to avoid that we have used numeric_only=True.

import pandas as pd
  
data = {
    'Name': ['Jack', 'Rack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)


#avoid concatenation of the non numeric values in the row header numeric_only=True


Sum = dfobj.loc[dfobj['Marks']>97].sum(axis=0,numeric_only=True)
print(Sum)

Output

Marks          200
Admit_fee      411
Fee            700
Tution_Fee    1300
dtype: int64

2. Pandas sum rows by columns of all rows


In this example, The sum is called along with axis=0 that added all rows column-wise and returns a series object that contains Total of Marks, Admit, Fee, and Tuition_Fee columns. The sum of all columns is added as new to the dataframe with the index label “Total”.

  • The series object is returned by applying the sum() function that has been transposing to create a Dataframe with a single row. So all the indexes of Series became the columns in the new dataframe.
  • Finally, we have opened this new dataframe to the original dataframe using the append() function
import pandas as pd
  
data = {
    'Name': ['Jack', 'Rack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)

dfobj.set_index('Name', inplace=True)

total_row = dfobj.sum()
total_row.name = 'Total'

#new row appended to dataframe
dfobj = dfobj.append(total_row.transpose())

print(dfobj)

Output

            Marks  Admit_fee   Fee  Tution_Fee
Name                                     
Jack      97        201   100         400
Rack      97        203   200         500
Max      100        205   300         600
David    100        206   400         700
Total    394        815  1000        2200

3. Pandas sum rows by columns all rows ignore header string


In the above example, We have summed all rows by columns and append them as a new row in the dataframe. But instead of adding a new row, we can result in the output Total of all columns by passing axis=0 to the sum() function. When we sum all rows of Pandas dataframe along with axis =0 column-wise. The non-numeric values are appended on the header, to avoid that we have used numeric_only=True.

  
dfobj = pd.DataFrame(data)
print(dfobj.sum(axis=0))



#avoid concatenation of the non numeric values in the row header numeric_only=True
dfobj = pd.DataFrame(data)
print('Avoid  non numeric values in the row header:'',dfobj.sum(axis=0,numeric_only=True))

Output

Name          JackRackMaxDavid
Marks                      394
Admit_fee                  815
Fee                       1000
Tution_Fee                2200
dtype: object

Avoid  non numeric values in the row header:
Marks          394
Admit_fee      815
Fee           1000
Tution_Fee    2200
dtype: int64

4. Pandas sum rows by columns a range of indexes


In this example we have added rows by a range of indices, Instead of adding all rows, there is a need to add a range of top 3 or last 3 rows column-wise. We have rows from range 0:4 that can be changed as per the requirement.

  • The series object is returned by applying the sum() function that has been transposed to create a Dataframe with a single row. So all the indexes of Series became the columns in the new dataframe.
  • Finally, we have appended this new dataframe to the original dataframe using the append() function.
import pandas as pd
  
data = {
    'Name': ['Jack', 'Rack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)

dfobj.set_index('Name', inplace=True)

total_row = dfobj.iloc[0:4].sum(axis=0)
total_row.name = 'Total_Fee'

#added new row with name Total_Fee

dfobj = dfobj.append(total_row.transpose())

print(dfobj)


Output

            Marks  Admit_fee   Fee  Tution_Fee
Name                                         
Jack          97        201   100         400
Rack          97        203   200         500
Max          100        205   300         600
David        100        206   400         700
Total_Fee    394        815  1000        2200

5. Pandas sum rows by columns value


In this example, we have added rows of dataframe by column values with the help of df. loc[] property.We are adding rows values column-wise where names are ‘Jack’, ‘Rack’, ‘Max’.

import pandas as pd
  
data = {
    'Name': ['Jack', 'Rack', 'Max', 'David'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,203,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,500,600,700]
}
  
dfobj = pd.DataFrame(data)

dfobj.set_index('Name', inplace=True)


total_row = dfobj.loc[['Jack', 'Rack', 'Max']].sum(axis=0)


print(total_row)

Output

Marks          294
Admit_fee      609
Fee            600
Tution_Fee    1500
dtype: int64

6. Pandas sum rows using group by


The groupby() function is used to group the identical repeated values in given data and return a Dataframegroupby that contain result after applying aggregate function like sum() and mean.

In this example, we have applied group by on column ‘Name’ to sum the column values that are identical and reset the index using reset_index() and reindex the dataframe using reindex()

import pandas as pd
  
data = {
    'Name': ['Jack', 'Jack', 'Max', 'Jack'],
    'Marks':[97,97,100,100],
    'Admit_fee':[201,201,205,206],
    'Fee':[100,200,300,400],
    'Tution_Fee':[400,400,600,700]
}
  
dfobj = pd.DataFrame(data)



dfobj = dfobj.groupby(['Name']).sum().reset_index().reindex(columns=dfobj.columns)

print(dfobj

Output

   Name  Marks  Admit_fee  Fee  Tution_Fee
0  Jack    294        608  700        1500
1   Max    100        205  300         600