5 Methods to change column types in Pandas


In this post, we look at 5 methods to change column types in Pandas. We will learn how to change column types using built-in methods, with examples.

5 methods to change column types in Pandas


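  • DataFrame.astype() : Change any type to another type
  • pd.to_numeric() : Change a non-numeric type to a numeric type
  • pd.to_datetime() : Change a string to datetime
  • DataFrame.convert_dtypes() : Convert columns to the best dtype that supports pd.NA
  • DataFrame.infer_objects() : Infer a more specific type for object columns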

1. Using DataFrame.astype()


The DataFrame.astype() method casts a pandas object to the given datatype, for example astype(float).

Syntax

DataFrame.astype(dtype, copy=True, errors='raise')

Parameters

  • dtype : a single type (a Python type, NumPy dtype, or pandas dtype name) applied to the entire dataframe, or a dictionary of {col: dtype} where keys specify the columns and values specify the new datatypes.
  • copy : if True (the default), a copy of the data is returned; if False, the result may share data with the original when no conversion is needed.
  • errors : controls whether an exception is raised on invalid data
    • raise : allow exceptions to be raised (the default)
    • ignore : suppress exceptions and return the original object on error
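
As a rough sketch of how these parameters behave (the dataframe and its column names here are made up for illustration), the dictionary form converts only the listed columns, while the default errors='raise' surfaces a ValueError when a value cannot be cast:

```python
import pandas as pd

# Hypothetical data: 'qty' holds numeric strings, 'code' does not
df = pd.DataFrame({'qty': ['1', '2', '3'], 'code': ['a1', 'b2', 'c3']})

# Dictionary form: only 'qty' is converted, 'code' is left untouched
converted = df.astype({'qty': 'int64'})
print(converted.dtypes)

# Default errors='raise': casting 'code' to int fails with a ValueError
try:
    df.astype({'code': 'int64'})
    cast_failed = False
except ValueError as exc:
    cast_failed = True
    print(f'Conversion failed: {exc}')
```

Catching the exception explicitly, as above, is usually easier to reason about than errors='ignore', which silently hands back the unconverted data.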


1.1 astype() to Change single column type string to float


In this example, we are converting the ‘Marks’ column string value to float.

Program Example to Convert string column to float

import pandas as pd
 
Student_dict = {
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks':['100','100', '100'],
    'Subject': ['Math', 'Math', 'Music']
}
 


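# Build the DataFrame; 'Marks' holds numbers stored as strings
dfobj = pd.DataFrame(Student_dict)

# Convert only the 'Marks' column from string to float
dfobj['Marks'] = dfobj['Marks'].astype(float)

print(dfobj.dtypes)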

Output

Name        object
Marks      float64
Subject     object
dtype: object

1.2 astype() to Change multiple columns type in Pandas


In this example, we convert multiple columns, ‘StudId’, ‘Marks’ and ‘Fee’, to different types by passing astype() a dictionary that maps each column name to its new datatype. The ‘Subject’ column is not listed, so it keeps its type.

Python Program Example

import pandas as pd
  
Student_dict = {
    'StudId': ['1', '2', '3'],
    'Marks':['100','100', '100'],
    'Fee':['100','200','300'],
    'Subject': ['Math', 'Math', 'Music']
}
  
 
 
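# Build the DataFrame; every column holds strings at this point
dfobj = pd.DataFrame(Student_dict)

# Map each column to its new dtype; 'Subject' is not listed and stays unchanged
dfobj = dfobj.astype({'StudId':'int64','Fee': 'float64', 'Marks': 'int64'})

print(f'\n {dfobj.dtypes}')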


Output

StudId       int64
Marks        int64
Fee        float64
Subject     object
dtype: object

1.3 astype() to Change entire dataframe type to int


To convert every column of a dataframe to int, we call the astype() method on the dataframe object itself. The integer width shown in the output (int32 or int64) depends on the platform and NumPy version.

Program Example

import pandas as pd
 
Student_dict = {
    'StudID': ['12', '13', '14'],    
    'Marks':['100','100', '100'],
    'Fee':['100','200','300']
    
}
 


dfobj = pd.DataFrame(Student_dict)

dfobj= dfobj.astype(int)


print(f'\n {dfobj.dtypes}')



Output

StudID    int32
Marks     int32
Fee       int32
dtype: object

2. to_numeric() to convert a single string column to a numeric type


The to_numeric() function converts non-numeric values, such as numbers stored as strings, to a suitable numeric type. When a value cannot be parsed, it raises “ValueError: Unable to parse string” by default; the errors parameter of to_numeric() controls how this error is handled.

The errors parameter accepts three values:

  • errors=’raise’ (the default) raises an exception on invalid values
  • errors=’coerce’ converts invalid values to NaN; because NaN is a float, the resulting column is float64
  • errors=’ignore’ returns the input unchanged (deprecated in recent pandas versions)
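
A minimal sketch of these options, using a made-up Series with one unparseable value. The final cast to the nullable 'Int64' dtype is an optional extra, not something the example below depends on:

```python
import pandas as pd

marks = pd.Series(['100', '100', 'z100'])

# Default behaviour (errors='raise'): the bad value triggers a ValueError
try:
    pd.to_numeric(marks)
    parse_failed = False
except ValueError:
    parse_failed = True

# errors='coerce': the bad value becomes NaN, so the result is float64
coerced = pd.to_numeric(marks, errors='coerce')
print(coerced)

# Optional: cast to the nullable integer dtype to keep whole numbers
as_int = coerced.astype('Int64')
print(as_int.dtype)
```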

Program Example

import pandas as pd
 
Student_dict = {
    'Name': ['Jack', 'Rack', 'Max'],
    'Marks':['100','100', 'z100'],
    'Subject': ['Math', 'Math', 'Music']
}
 


dfobj = pd.DataFrame(Student_dict)


dfobj['Marks'] = pd.to_numeric(dfobj['Marks'], errors='coerce')

print (dfobj.dtypes)

Output

Name        object
Marks      float64
Subject     object
dtype: object

3. infer_objects() to Change columns type in Pandas


The infer_objects() method attempts to convert dataframe columns of dtype object to a more specific type. It was introduced in pandas version 0.21.0.

  • We have created a dataframe by specifying dtype ‘object’ for all columns.
  • infer_objects() converts the ‘StudId’ column to int64. The other columns hold strings, so they remain object.

Python Program Example

import pandas as pd
  
Student_dict = {
    'StudId': [1, 2, 3],
    'Marks':['100','100', '100'],
    'Fee':['100','200','300'],
    'Subject': ['Math', 'Math', 'Music']
}
  
 
 
dfobj = pd.DataFrame(Student_dict,dtype = 'object')


print(dfobj.dtypes)

dfobj = dfobj.infer_objects()
 

print(f'\n---After using infer_objects() method----: \n{dfobj.dtypes}')

Output

StudId     object
Marks      object
Fee        object
Subject    object
dtype: object

---After using infer_objects() method----: 
StudId      int64
Marks      object
Fee        object
Subject    object
dtype: object
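
As the output above shows, infer_objects() leaves ‘Marks’ and ‘Fee’ as object because their values are strings rather than numbers. One possible follow-up, sketched here with the same column names, is to pass those columns through pd.to_numeric():

```python
import pandas as pd

dfobj = pd.DataFrame({
    'StudId': [1, 2, 3],
    'Marks': ['100', '100', '100'],
    'Fee': ['100', '200', '300'],
}, dtype='object')

# infer_objects() only recovers the type of real Python numbers
dfobj = dfobj.infer_objects()

# Numeric strings need an explicit conversion, column by column
for col in ['Marks', 'Fee']:
    dfobj[col] = pd.to_numeric(dfobj[col])

print(dfobj.dtypes)
```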

4. pd.to_datetime() to Convert string column to datetime


In this example, the pd.to_datetime() function is used to convert a column of date strings to the datetime type.

Steps to convert string column to datetime in pandas

  • Import the pandas module: import pandas as pd
  • Call the to_datetime() function, passing the column to convert
  • check the type of column by using dtypes property
  • Print the final result using print() method

Python Program Example

import pandas as pd


dfobj = pd.DataFrame({'Date':['11/8/2014', '21/04/2020', '10/2/2017'],
                'Name':['rack', 'David', 'Max'],
                'Fee':[12000, 15000, 15000]})


print('before conversion :\n',dfobj.dtypes)


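# The dates are written day first, so the format is passed explicitly;
# pandas 2.x infers one format from the first value and can fail on '21/04/2020'
dfobj['Date'] = pd.to_datetime(dfobj['Date'], format='%d/%m/%Y')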

print('\n string to date conversion:\n',dfobj.dtypes)

Output

before conversion :
 Date    object
Name    object
Fee      int64
dtype: object

string to date conversion:
Date    datetime64[ns]
Name            object
Fee              int64
dtype: object
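
Date strings like the ones above are ambiguous (‘11/8/2014’ could be read as November 8 or August 11), so it can help to state the format explicitly. The sketch below assumes day-first dates and adds a deliberately invalid entry to show errors='coerce'; the sample values are made up:

```python
import pandas as pd

# Hypothetical day-first date strings, including one invalid entry
dates = pd.Series(['11/8/2014', '21/04/2020', 'not a date'])

# format spells out day/month/year; errors='coerce' maps failures to NaT
parsed = pd.to_datetime(dates, format='%d/%m/%Y', errors='coerce')
print(parsed)
```

Here the first value is read as 11 August 2014, and the unparseable entry becomes NaT instead of raising an exception.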

5. df.convert_dtypes() to change type in Pandas


The df.convert_dtypes() method converts each column to the best possible datatype that supports pd.NA.

In this example, all columns store string data, but the data type of every column is ‘object’. The df.convert_dtypes() method changes each column to the best matching type, which here is string.

Python Program Example

import pandas as pd
  
Student_dict = {
    'StudId': ['1', '2', '3'],
    'Marks':['100','100', '100'],
    'Fee':['100','200','300'],
    'Subject': ['Math', 'Math', 'Music']
}
  
 
 
dfobj = pd.DataFrame(Student_dict)


print(dfobj.dtypes)

dfobj = dfobj.convert_dtypes()
 

print(f'\n---After using convert_dtypes() method----: \n{dfobj.dtypes}')

Output

StudId     object
Marks      object
Fee        object
Subject    object
dtype: object

---After using convert_dtypes() method----: 
StudId     string
Marks      string
Fee        string
Subject    string
dtype: object
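
convert_dtypes() is most useful when columns contain missing values. The following sketch uses made-up data (the ‘Passed’ column is hypothetical) to show whole-number floats and booleans with gaps being mapped to the nullable Int64 and boolean types:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'StudId': [1.0, 2.0, np.nan],                                # float64 only because of NaN
    'Passed': pd.Series([True, False, np.nan], dtype='object'),  # booleans with a gap
    'Subject': ['Math', None, 'Music'],                          # strings with a gap
})

converted = df.convert_dtypes()
print(converted.dtypes)
```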

Summary

In this post, we have learned 5 methods to change column types in Pandas, with examples.