In this post, we will learn how to convert a string column to int in Pandas using built-in methods, for a single column or for multiple columns.
1. astype(int) to Convert column string to int in Pandas
The astype() method lets us pass the target datatype explicitly. We can also pass a Python dictionary to change several columns at once, where the keys are the column names and the values are the new datatypes.
Python Pandas Program to Convert string column to int
import pandas as pd
Student_dict = {
'Name': ['Jack', 'Rack', 'Max'],
'Marks':['100','100', '100'],
'Subject': ['Math', 'Math', 'Music']
}
dfobj = pd.DataFrame(Student_dict)
dfobj['Marks'] = dfobj['Marks'].astype(int)
print ('\n string to int:\n',dfobj)
print ('\n converted datatype :\n',dfobj.dtypes)
Output
string to int:
Name Marks Subject
0 Jack 100 Math
1 Rack 100 Math
2 Max 100 Music
converted datatype :
Name object
Marks int32
Subject object
dtype: object
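The int32 in the output above is what NumPy's default integer maps to on some platforms (for example Windows with older NumPy versions); elsewhere the same code reports int64. The sketch below, which uses made-up sample data, shows how naming the width explicitly avoids that difference, and what happens when a value is not a valid integer string.

```python
import pandas as pd

marks = pd.Series(['100', '95', '87'], name='Marks')

# Naming the width explicitly gives the same dtype on every platform.
marks_int = marks.astype('int64')
print(marks_int.dtype)

# astype() raises ValueError as soon as one value is not a valid integer string.
bad_marks = pd.Series(['100', 'z100'], name='Marks')
try:
    bad_marks.astype('int64')
    conversion_failed = False
except ValueError as exc:
    conversion_failed = True
    print('conversion failed:', exc)
```

When the data may contain such values, to_numeric() with errors='coerce' is usually the safer choice.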
2. astype(int) to Convert multiple string columns to int in Pandas
In this example, we convert multiple columns containing numeric strings to int by passing a Python dictionary to the astype() method, where the keys are the column names and the values are the new datatypes.
Python Pandas Program to Convert multiple string columns to int
import pandas as pd
Student_dict = {
'Name': ['Jack', 'Rack', 'Max'],
'Marks':['100','100', '100'],
'Fee':['100','200','300'],
'Subject': ['Math', 'Math', 'Music']
}
dfobj = pd.DataFrame(Student_dict)
dict_columns_type = {'Marks': int,
'Fee': int
}
dfobj = dfobj.astype(dict_columns_type)
print('dataframe str to int:\n',dfobj)
print(f'\n {dfobj.dtypes}')
Output
dataframe str to int:
Name Marks Fee Subject
0 Jack 100 100 Math
1 Rack 100 200 Math
2 Max 100 300 Music
Name object
Marks int32
Fee int32
Subject object
dtype: object
3. to_numeric() to convert single string column to int
The to_numeric() function converts its argument to a suitable numeric type. If a value cannot be parsed, it raises “ValueError: Unable to parse string” by default; the errors parameter of to_numeric() controls this behavior.
The errors parameter accepts the following values
- errors='raise' is the default and raises the exception
- errors='coerce' converts values that cannot be parsed to NaN
- errors='ignore' returns the input unchanged when parsing fails (deprecated since pandas 2.2)
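As a minimal sketch of the difference between these modes, the following uses a small made-up Series with one unparseable value and compares the default behavior with errors='coerce'.

```python
import pandas as pd

raw = pd.Series(['100', '200', 'z100'])

# Default (errors='raise'): the unparseable value raises ValueError.
try:
    pd.to_numeric(raw)
    raised = False
except ValueError:
    raised = True

# errors='coerce': parsing continues and the bad value becomes NaN.
coerced = pd.to_numeric(raw, errors='coerce')

print(raised)
print(coerced)
```

Because NaN is a float, the coerced result has dtype float64 rather than an integer dtype.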
Python Pandas Program to convert single string column to int
import pandas as pd
Student_dict = {
'Name': ['Jack', 'Rack', 'Max'],
'Marks':['100','100', '100'],
'Subject': ['Math', 'Math', 'Music']
}
dfobj = pd.DataFrame(Student_dict)
dfobj['Marks'] = pd.to_numeric(dfobj['Marks'], errors='coerce')
print ('\n string to int:\n',dfobj)
print ('\n converted datatype :\n',dfobj.dtypes)
Output
string to int:
Name Marks Subject
0 Jack 100 Math
1 Rack 100 Math
2 Max 100 Music
converted datatype :
Name object
Marks int64
Subject object
dtype: object
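When a value such as 'z100' cannot be parsed, errors='coerce' turns it into NaN. We can change the NaN values to 0 by using the replace() method, as in the example below. Note that the column stays float64, so the output shows 100.0 rather than 100.
Program to Replace NaN values with 0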
import pandas as pd
import numpy as np
Student_dict = {
'Name': ['Jack', 'Rack', 'Max'],
'Marks':['100','100', 'z100'],
'Subject': ['Math', 'Math', 'Music']
}
dfobj = pd.DataFrame(Student_dict)
dfobj['Marks'] = pd.to_numeric(dfobj['Marks'], errors='coerce')
dfobj = dfobj.replace(np.nan, 0)
print ('\n string to int :\n',dfobj)
Output
string to int :
Name Marks Subject
0 Jack 100.0 Math
1 Rack 100.0 Math
2 Max 0.0 Music
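The column above is still float64. One way to end up with real integers, sketched here on the same sample values, is to fill the missing entries and then cast; another is pandas' nullable 'Int64' dtype, which keeps the missing value instead of replacing it. Which of the two fits depends on whether 0 is a meaningful stand-in for "missing" in your data.

```python
import pandas as pd

marks = pd.Series(['100', '100', 'z100'], name='Marks')
numeric = pd.to_numeric(marks, errors='coerce')  # float64 because of the NaN

# Option 1: replace missing values with 0, then cast to a true integer dtype.
filled = numeric.fillna(0).astype('int64')

# Option 2: keep the missing value by using the nullable integer dtype.
nullable = numeric.astype('Int64')

print(filled.tolist())
print(nullable.dtype)
```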
4. to_numeric() to convert multiple string column to int
In this example, we use the apply() method and pass the to_numeric function as its argument, so that the numeric strings in each selected column are converted to integers.
Python Pandas Program to convert multiple columns to int
import pandas as pd
Student_dict = {
'Name': ['Jack', 'Rack', 'Max'],
'Marks':['100','100', '100'],
'Fee':['100','200','300'],
'Subject': ['Math', 'Math', 'Music']
}
dfobj = pd.DataFrame(Student_dict)
dfobj[['Marks', 'Fee']] = dfobj[['Marks', 'Fee']].apply(pd.to_numeric)
print('dataframe str to int:\n',dfobj)
print(f'\n {dfobj.dtypes}')
Output
dataframe str to int:
Name Marks Fee Subject
0 Jack 100 100 Math
1 Rack 100 200 Math
2 Max 100 300 Music
Name object
Marks int64
Fee int64
Subject object
dtype: object
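If some of the columns may contain invalid values, keyword arguments given to apply() are forwarded to the function it calls, so errors='coerce' can be passed along. The sketch below uses made-up data with one bad value in 'Marks'.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks': ['100', '100', 'z100'],
    'Fee': ['100', '200', '300'],
})

cols = ['Marks', 'Fee']
# Extra keyword arguments to apply() are passed on to pd.to_numeric.
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

print(df.dtypes)
```

Here 'Fee' becomes int64, while 'Marks' becomes float64 because it now holds a NaN.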
5. Pandas Convert entire dataframe to int
To convert every column of a dataframe from string to int, we call the astype() method on the dataframe object and pass the target datatype. This works only when every column contains numeric strings.
Program Example
import pandas as pd
Student_dict = {
'StudID': ['12', '13', '14'],
'Marks':['100','100', '100'],
'Fee':['100','200','300']
}
dfobj = pd.DataFrame(Student_dict)
dfobj = dfobj.astype(int)
print('dataframe str to int:\n',dfobj)
print(f'\n {dfobj.dtypes}')
Output
dataframe str to int:
StudID Marks Fee
0 12 100 100
1 13 100 200
2 14 100 300
StudID int32
Marks int32
Fee int32
dtype: object
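Calling astype(int) on the whole dataframe fails if any column holds non-numeric text, such as a 'Name' column. One possible workaround, sketched below under the assumption that the numeric columns contain only unsigned digit strings, is to detect those columns first and convert just them. The numeric_cols helper list is illustrative, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks': ['100', '100', '100'],
    'Fee': ['100', '200', '300'],
})

# Columns in which every value consists only of digits.
# (str.isdigit() is False for signs and decimals, so this is a simple check.)
numeric_cols = [col for col in df.columns if df[col].str.isdigit().all()]

# Convert only those columns; 'Name' is left untouched.
df = df.astype({col: 'int64' for col in numeric_cols})

print(numeric_cols)
print(df.dtypes)
```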
Summary
In this post, we have seen multiple ways to convert string columns to int in Pandas, with examples using the built-in methods astype() and to_numeric(). The same methods can also be used to convert strings to float.
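As a brief illustration of the float case, using made-up price strings, both methods work the same way as before:

```python
import pandas as pd

prices = pd.Series(['10.5', '20', '7.25'])

as_float = prices.astype(float)   # explicit cast to float64
inferred = pd.to_numeric(prices)  # float64 is chosen because of the decimals

print(as_float.tolist())
print(inferred.dtype)
```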