Convert string column to int in Pandas

In this post, we will look at how to convert a string column to int in Pandas using built-in methods, for both a single column and multiple columns.

1. astype(int) to Convert string column to int in Pandas


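The astype() method casts a column to the datatype we pass explicitly. We can also pass a Python dictionary to change several columns at once, where the keys specify the columns and the values specify the new datatypes. Note that astype(int) uses the platform's default integer type, so the result may show as int32 or int64 depending on the operating system and NumPy version.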

Python Pandas Program to Convert string column to int

import pandas as pd
 
Student_dict = {
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks':['100','100', '100'],
    'Subject': ['Math', 'Math', 'Music']
}
 


dfobj = pd.DataFrame(Student_dict)


dfobj['Marks'] = dfobj['Marks'].astype(int)

print ('\n string to int:\n',dfobj)
print ('\n converted datatype :\n',dfobj.dtypes)

Output

 string to int:
    Name  Marks Subject
0  Jack    100    Math
1  Rack    100    Math
2   Max    100   Music

 converted datatype :
 Name       object
Marks       int32
Subject    object
dtype: object
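
If the same integer width is needed on every platform, one option is to name the type explicitly instead of passing int. The snippet below is a minimal sketch of that idea; the choice of 'int64' is an assumption made for illustration, and any other NumPy integer type name could be used instead.

```python
import pandas as pd

dfobj = pd.DataFrame({
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks': ['100', '100', '100'],
})

# Ask for a fixed-width integer type rather than the platform default
dfobj['Marks'] = dfobj['Marks'].astype('int64')

print(dfobj.dtypes)
```

With an explicit type name, the Marks column should report int64 regardless of the operating system.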

2. astype(int) to Convert multiple string columns to int in Pandas


In this example, we convert multiple columns containing numeric strings to int by passing a Python dictionary to the astype() method, where the keys specify the columns and the values specify the new datatype.

Python Pandas Program to Convert multiple string columns to int

import pandas as pd
 
Student_dict = {
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks':['100','100', '100'],
    'Fee':['100','200','300'],
    'Subject': ['Math', 'Math', 'Music']
}
 


dfobj = pd.DataFrame(Student_dict)

dict_columns_type = {'Marks': int,
                'Fee': int
               }
  
dfobj = dfobj.astype(dict_columns_type)
print('dataframe str to int:\n',dfobj)

print(f'\n {dfobj.dtypes}')

Output

dataframe str to int:
    Name  Marks  Fee Subject
0  Jack    100  100    Math
1  Rack    100  200    Math
2   Max    100  300   Music

 Name       object
Marks       int32
Fee         int32
Subject    object
dtype: object
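
The dictionary does not have to map every column to the same type. As a rough sketch, the variation below converts one column to an integer type and another to a float type in a single call; the Fee value '100.5' is made up for this illustration.

```python
import pandas as pd

dfobj = pd.DataFrame({
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks': ['100', '100', '100'],
    'Fee': ['100.5', '200', '300'],  # illustrative value with a decimal part
})

# Each key is a column name, each value is the target datatype for that column
dfobj = dfobj.astype({'Marks': 'int64', 'Fee': 'float64'})

print(dfobj.dtypes)
```

Columns that are not listed in the dictionary, such as Name here, are left unchanged.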

3. to_numeric() to convert single string column to int


The to_numeric() function converts values, such as numeric strings, to a suitable numeric type. If a value cannot be parsed, it raises “ValueError: Unable to parse string” by default. The errors parameter of to_numeric() controls how this case is handled.

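The errors parameter accepts these values

  • errors='raise' (the default) raises the ValueError
  • errors='coerce' converts values that cannot be parsed to NaN
  • errors='ignore' returns the input unchanged instead of raising (deprecated in recent Pandas releases)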
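
To make the difference concrete, here is a small sketch that runs the same data through the default behaviour and through errors='coerce'. The value 'z100' is a deliberately invalid entry, and the variable names are chosen only for this illustration.

```python
import pandas as pd

marks = pd.Series(['100', '100', 'z100'])

# Default behaviour: the invalid value stops the conversion with a ValueError
try:
    pd.to_numeric(marks)
    raised = False
except ValueError as exc:
    raised = True
    print('conversion failed:', exc)

# errors='coerce': the invalid value becomes NaN, so the result is a float column
coerced = pd.to_numeric(marks, errors='coerce')
print(coerced)
```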

Python Pandas Program to convert single string column to int

import pandas as pd
  
Student_dict = {
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks':['100','100', '100'],
    'Subject': ['Math', 'Math', 'Music']
}
  
 
 
dfobj = pd.DataFrame(Student_dict)
 
 
dfobj['Marks'] = pd.to_numeric(dfobj['Marks'], errors='coerce')
 
 
print ('\n string to int:\n',dfobj)
print ('\n converted datatype :\n',dfobj.dtypes)

Output

  string to int:
    Name  Marks Subject
0  Jack    100    Math
1  Rack    100    Math
2   Max    100   Music

 converted datatype :
 Name       object
Marks       int64
Subject    object
dtype: object

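With errors='coerce', values that cannot be parsed become NaN, which also makes the column float instead of int. We can change the NaN values to 0 by using the replace() method, as in the example below.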

Program to replace NaN values with 0


import pandas as pd
import numpy as np
 
Student_dict = {
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks':['100','100', 'z100'],
    'Subject': ['Math', 'Math', 'Music']
}
 


dfobj = pd.DataFrame(Student_dict)


dfobj['Marks'] = pd.to_numeric(dfobj['Marks'], errors='coerce')
dfobj = dfobj.replace(np.nan, 0)


print ('\n string to int :\n',dfobj)

Output

 string to int :
    Name  Marks Subject
0  Jack  100.0    Math
1  Rack  100.0    Math
2   Max    0.0   Music
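
The output above is still a float column (100.0, 0.0). If an integer column is the goal, one possible follow-up is sketched below: either fill the missing values and cast, or use the nullable 'Int64' type so that missing values are kept. The column names Marks_filled and Marks_nullable are hypothetical and used only for this illustration.

```python
import pandas as pd

dfobj = pd.DataFrame({
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks': ['100', '100', 'z100'],
})

marks = pd.to_numeric(dfobj['Marks'], errors='coerce')

# Option 1: replace NaN with 0, then cast to a plain integer type
dfobj['Marks_filled'] = marks.fillna(0).astype('int64')

# Option 2: keep the missing value as <NA> using the nullable integer type
dfobj['Marks_nullable'] = marks.astype('Int64')

print(dfobj)
print(dfobj.dtypes)
```

The first option treats an invalid mark as 0, which may or may not be appropriate for the data; the second keeps the information that the value was missing.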

4. to_numeric() to convert multiple string columns to int


In this example, we use the apply() method and pass the to_numeric function as its argument, so that it is applied to each selected column and converts the numeric strings to integers.

Python Pandas Program to convert multiple columns to int

import pandas as pd
 
Student_dict = {
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks':['100','100', '100'],
    'Fee':['100','200','300'],
    'Subject': ['Math', 'Math', 'Music']
}
 


dfobj = pd.DataFrame(Student_dict)

dfobj[['Marks','Fee']]= dfobj[['Marks','Fee']].apply(pd.to_numeric)
  

print('dataframe str to int:\n',dfobj)

print(f'\n {dfobj.dtypes}')

Output

dataframe str to int:
    Name  Marks  Fee Subject
0  Jack    100  100    Math
1  Rack    100  200    Math
2   Max    100  300   Music

 Name       object
Marks       int64
Fee         int64
Subject    object
dtype: object
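
Extra keyword arguments given to apply() are forwarded to the function it calls, so the errors option can be used here as well. The following is a minimal sketch under that assumption, with 'z100' added as a deliberately invalid value and cols as an illustrative variable name.

```python
import pandas as pd

dfobj = pd.DataFrame({
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks': ['100', '100', 'z100'],  # 'z100' cannot be parsed
    'Fee': ['100', '200', '300'],
})

cols = ['Marks', 'Fee']

# errors='coerce' is passed through apply() to pd.to_numeric for each column
dfobj[cols] = dfobj[cols].apply(pd.to_numeric, errors='coerce')

print(dfobj)
print(dfobj.dtypes)
```

Here Fee converts cleanly to an integer column, while Marks contains a NaN and therefore ends up as a float column.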

5. Pandas Convert entire dataframe to int


To convert every column of a dataframe from string to int, we call the astype() method on the dataframe object and pass the target datatype. This only works when every column contains numeric strings.

Program Example

import pandas as pd
 
Student_dict = {
    'StudID': ['12', '13', '14'],    
    'Marks':['100','100', '100'],
    'Fee':['100','200','300']
    
}
 


dfobj = pd.DataFrame(Student_dict)

dfobj= dfobj.astype(int)
  

print('dataframe str to int:\n',dfobj)

print(f'\n {dfobj.dtypes}')



Output

dataframe str to int:
    StudID  Marks  Fee
0      12    100  100
1      13    100  200
2      14    100  300

 StudID    int32
Marks     int32
Fee       int32
dtype: object
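
When the dataframe also holds a text column, calling astype(int) on the whole frame fails. The sketch below shows that failure and one way around it, converting only the columns known to hold numeric strings; the numeric_cols list is an assumption made for this illustration.

```python
import pandas as pd

dfobj = pd.DataFrame({
    'Name': ['Jack', 'Rack', 'Max'],
    'StudID': ['12', '13', '14'],
    'Marks': ['100', '100', '100'],
})

# Converting the whole frame fails because 'Name' holds non-numeric text
try:
    dfobj.astype('int64')
    failed = False
except ValueError as exc:
    failed = True
    print('astype failed:', exc)

# Convert only the columns that are known to contain numeric strings
numeric_cols = ['StudID', 'Marks']
dfobj = dfobj.astype({col: 'int64' for col in numeric_cols})

print(dfobj.dtypes)
```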

Summary

In this post, we have looked at multiple ways to convert string columns to int in Pandas using the built-in astype() and to_numeric() methods, with examples. The same methods can also be used to convert strings to float.