How to sum columns with NaN in a Pandas DataFrame

In this post, we will learn how to sum columns that contain NaN values in a Pandas DataFrame. The DataFrame fillna() method replaces NaN (missing) values with a value of our choice, such as zero. Calling fillna(0) returns a new DataFrame in which every NaN is replaced by zero; the original DataFrame is left unchanged.

1. How to sum columns with NaN in a Pandas DataFrame using fillna()


We use NumPy's np.nan to put missing values into the DataFrame. NaN represents an undefined or missing value. The fillna() method of a Pandas DataFrame replaces each NaN with a specified value. In the example below, we use fillna(0) to replace NaN with zero before adding the Admit_fee and Fee columns, so that a missing value counts as zero instead of turning the whole sum into NaN.

import pandas as pd
import numpy as np
   
data = {
    'Name': ['Jack', 'Jack', 'Max', 'Jack'],
    'Marks':[97,97,100,np.nan],
    'Admit_fee':[np.nan,201,205,206],
    'Fee':[np.nan,200,300,np.nan],
   
}   
df = pd.DataFrame(data)

df['Sum'] = df.fillna(0)['Admit_fee'] + df.fillna(0)['Fee']
print(df)

# Another way to achieve the same result: fill each column separately
df['Sum'] = df.Admit_fee.fillna(0) + df.Fee.fillna(0)
print(df)


Output

   Name  Marks  Admit_fee    Fee    Sum
0  Jack   97.0        NaN    NaN    0.0
1  Jack   97.0      201.0  200.0  401.0
2   Max  100.0      205.0  300.0  505.0
3  Jack    NaN      206.0    NaN  206.0
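
For comparison, the short sketch below shows what happens without fillna(): adding two columns with + propagates NaN, so any row with a missing value gets a NaN sum. The two-column frame and the names plain_sum and filled_sum are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Admit_fee": [np.nan, 201, 205, 206],
    "Fee": [np.nan, 200, 300, np.nan],
})

# Plain addition: NaN plus anything is NaN, so rows 0 and 3 come out as NaN.
plain_sum = df["Admit_fee"] + df["Fee"]

# fillna(0) returns new Series; df itself still holds the NaN values.
filled_sum = df["Admit_fee"].fillna(0) + df["Fee"].fillna(0)

print(plain_sum.tolist())   # [nan, 401.0, 505.0, nan]
print(filled_sum.tolist())  # [0.0, 401.0, 505.0, 206.0]
```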

2. How to sum specific columns with NaN in a Pandas DataFrame using min_count


By default, the sum of an empty or all-NaN selection is zero. The min_count parameter of the sum() method sets the minimum number of non-NaN values required to produce a result. With min_count=1, a row is summed as long as at least one of its values is present, and the result is NaN when every value in the row is NaN.

import pandas as pd
import numpy as np
   
data = {
    'Name': ['Jack', 'Jack', 'Max', 'Jack'],
    'Marks':[97,97,100,np.nan],
    'Admit_fee':[np.nan,201,205,206],
    'Fee':[np.nan,200,300,np.nan],
   
}
   
df = pd.DataFrame(data)


df['Sum'] = df[["Admit_fee","Fee"]].sum(axis=1, min_count=1)
print(df)


Output

   Name  Marks  Admit_fee    Fee    Sum
0  Jack   97.0        NaN    NaN    NaN
1  Jack   97.0      201.0  200.0  401.0
2   Max  100.0      205.0  300.0  505.0
3  Jack    NaN      206.0    NaN  206.0
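
The effect of min_count is easier to see on a single Series. The sketch below uses two small made-up Series (all_missing and one_present are illustrative names, not part of the example above) to show how the threshold changes the result.

```python
import numpy as np
import pandas as pd

all_missing = pd.Series([np.nan, np.nan])
one_present = pd.Series([206.0, np.nan])

# Default (min_count=0): no valid values are needed, so the result is 0.0.
default_total = all_missing.sum()

# min_count=1: at least one valid value is required, so the result is NaN.
strict_total = all_missing.sum(min_count=1)

# One valid value satisfies min_count=1, so the result is 206.0.
partial_total = one_present.sum(min_count=1)

# min_count=2 needs two valid values; only one is present, so the result is NaN.
two_required = one_present.sum(min_count=2)

print(default_total, strict_total, partial_total, two_required)
```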

3. How to sum columns with NaN in a Pandas DataFrame using loc[]


In this example, we use the loc[] indexer, which selects rows and columns by label. Here we select all rows of the Admit_fee and Fee columns and sum them row-wise. Because sum() skips NaN values by default, a row in which both values are missing sums to 0.0.

import pandas as pd
import numpy as np
   
data = {
    'Name': ['Jack', 'Jack', 'Max', 'Jack'],
    'Marks':[97,97,100,np.nan],
    'Admit_fee':[np.nan,201,205,206],
    'Fee':[np.nan,200,300,np.nan],
   
}
   
df = pd.DataFrame(data)


df.loc[:,'Sum'] = df.loc[:,["Admit_fee","Fee"]].sum(axis=1)
print(df)

Output

   Name  Marks  Admit_fee    Fee    Sum
0  Jack   97.0        NaN    NaN    0.0
1  Jack   97.0      201.0  200.0  401.0
2   Max  100.0      205.0  300.0  505.0
3  Jack    NaN      206.0    NaN  206.0
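
If a row with any missing value should produce NaN instead of a partial total, the skipna parameter of sum() can be set to False. The sketch below is a small variation on the example above; the names cols, skipped, and strict are chosen only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jack", "Jack", "Max", "Jack"],
    "Admit_fee": [np.nan, 201, 205, 206],
    "Fee": [np.nan, 200, 300, np.nan],
})

cols = ["Admit_fee", "Fee"]

# Default skipna=True: NaN values are ignored within each row.
skipped = df.loc[:, cols].sum(axis=1)

# skipna=False: a single NaN in a row makes that row's sum NaN.
strict = df.loc[:, cols].sum(axis=1, skipna=False)

print(skipped.tolist())  # [0.0, 401.0, 505.0, 206.0]
print(strict.tolist())   # [nan, 401.0, 505.0, nan]
```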

4. How to sum all columns with NaN in a Pandas DataFrame using mask()


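The pandas.DataFrame.mask() method replaces values where a condition is True, using NaN as the default replacement. In this example, we sum every numeric column row-wise and then mask the result for rows in which every column is NaN. Note that the condition checks every column, including Name, so no row is masked in this data set.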

import pandas as pd
import numpy as np
   
data = {
    'Name': ['Jack', 'Jack', 'Max', 'Jack'],
    'Marks':[97,97,99.99,np.nan],
    'Admit_fee':[np.nan,201,205,206],
    'Fee':[np.nan,200,300,np.nan],
   
}
   
df = pd.DataFrame(data)


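# numeric_only=True restricts the sum to numeric columns; pandas 2.x no longer
# drops the text column Name silently and would raise a TypeError without it.
df.loc[:,'Sum'] = df.sum(axis=1, numeric_only=True).mask(df.isna().all(axis=1))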
print(df)

Output

   Name  Marks  Admit_fee    Fee     Sum
0  Jack  97.00        NaN    NaN   97.00
1  Jack  97.00      201.0  200.0  498.00
2   Max  99.99      205.0  300.0  604.99
3  Jack    NaN      206.0    NaN  206.00
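
To see the mask actually take effect, the all-NaN check can be limited to the numeric columns. The sketch below is one possible variation, not the method used above: it assumes a frame without the Marks column and uses select_dtypes() to pick the numeric columns; the names numeric and all_missing are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jack", "Jack", "Max", "Jack"],
    "Admit_fee": [np.nan, 201, 205, 206],
    "Fee": [np.nan, 200, 300, np.nan],
})

# Keep only the numeric columns so that Name does not affect the check.
numeric = df.select_dtypes(include="number")

# True for rows where every numeric value is missing (row 0 here).
all_missing = numeric.isna().all(axis=1)

# Sum row-wise, then replace the sum with NaN wherever all_missing is True.
df["Sum"] = numeric.sum(axis=1).mask(all_missing)

print(df["Sum"].tolist())  # [nan, 401.0, 505.0, 206.0]
```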

5. How to sum columns with NaN in a Pandas DataFrame using dropna()


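In this example, we use the pandas.DataFrame.dropna() method before summing. Calling dropna(how='all') removes rows in which every value is NaN; the remaining rows are summed, and any dropped row gets NaN in the Sum column when the result is assigned back. In this data set no row is dropped, because Name is never missing.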

import pandas as pd
import numpy as np
   
data = {
    'Name': ['Jack', 'Jack', 'Max', 'Jack'],
    'Marks':[97,97,100,np.nan],
    'Admit_fee':[np.nan,201,205,206],
    'Fee':[np.nan,200,300,np.nan],
   
}
   
df = pd.DataFrame(data)


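# numeric_only=True restricts the sum to numeric columns; pandas 2.x no longer
# drops the text column Name silently and would raise a TypeError without it.
df['Sum'] = df.dropna(how='all').sum(axis=1, numeric_only=True)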
print(df)

Output

   Name  Marks  Admit_fee    Fee    Sum
0  Jack   97.0        NaN    NaN   97.0
1  Jack   97.0      201.0  200.0  498.0
2   Max  100.0      205.0  300.0  605.0
3  Jack    NaN      206.0    NaN  206.0
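
The sketch below shows dropna(how='all') actually removing a row. It is a variation that assumes only the two fee columns are passed to dropna(); the names fees and kept are illustrative. When the shorter result is assigned back, pandas aligns it on the index, so the dropped row receives NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jack", "Jack", "Max", "Jack"],
    "Admit_fee": [np.nan, 201, 205, 206],
    "Fee": [np.nan, 200, 300, np.nan],
})

fees = df[["Admit_fee", "Fee"]]

# Row 0 has NaN in both fee columns, so how="all" drops it.
kept = fees.dropna(how="all")

# The sum has index [1, 2, 3]; assignment aligns on the index,
# leaving NaN in row 0 of the new column.
df["Sum"] = kept.sum(axis=1)

print(kept.index.tolist())  # [1, 2, 3]
print(df["Sum"].tolist())   # [nan, 401.0, 505.0, 206.0]
```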

Summary

In this post, we have learned how to sum columns with NaN values in a Pandas DataFrame, with examples using fillna(), sum() with min_count, loc[], mask(), and dropna().