How to Convert text file into Pandas DataFrame

In this post, we will look at how to convert a text file into a Pandas DataFrame, with examples. We will use the built-in pandas functions listed below.

Methods to convert text file to DataFrame


  • read_csv() function
  • read_table() function
  • read_fwf() function

Pandas read_csv() Function


The pandas library has a built-in read_csv() function for reading CSV (comma-separated values) text files, and it can also read other delimited text files into a DataFrame. It reads the file at the given path and loads its contents into a DataFrame. A comma is the default separator; a different delimiter, or a regular expression, can be passed through the sep parameter.

Syntax

pandas.read_csv(filepath_or_buffer, sep=<no_default>, delimiter=None, header='infer', names=<no_default>, index_col=None, ...)

Parameters

  • filepath_or_buffer: The path of the file (or an open file-like object).
  • sep: The delimiter used to split each line into columns. Defaults to a comma.
  • header: The row number to use as the column names. By default the first row is treated as the header; pass header=None if the file has no header row.
  • names: A list of column names to use.
  • index_col: The column (or columns) to use as the row labels (index) of the DataFrame.
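
To see how header, names, and index_col work together, here is a minimal sketch. It assumes a file with no header row, and it reads from an in-memory string (io.StringIO) rather than a file on disk so that it runs on its own; the data and the variable names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical data with no header row, held in memory instead of a file
raw = io.StringIO("Alex,Phy,100\nBen,Chem,100\nJack,Math,100\n")

# header=None: the first line is data, not column names
# names: supply the column names ourselves
# index_col: use the "Name" column as the row labels
students = pd.read_csv(
    raw,
    sep=",",
    header=None,
    names=["Name", "Subjs", "Marks"],
    index_col="Name",
)

print(students)
```

With index_col set, rows can be looked up by name, for example students.loc["Ben", "Subjs"].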

1. read_csv() to convert a text file to a DataFrame


File contents

Name:Subjs:Marks
Alex:Phy:100
Ben:Chem:100
Jack:Math:100

In this example, we read a text file into a DataFrame with the read_csv() function, using a colon (:) as a custom delimiter. The file is in the current directory, so we pass only the file name rather than the full path.

Program Example

import pandas as pd
 
# Read a text file to a dataframe using a colon delimiter
student_csv = pd.read_csv('students.txt', sep=':', engine='python')

print(student_csv)

Output

   Name Subjs  Marks
0  Alex   Phy    100
1   Ben  Chem    100
2  Jack  Math    100

2. read_csv() with a regular expression for multiple delimiters


The file below is used in the following program example. It is present in the current directory.

File contents

Name,Subjs;Marks
Alex:Phy|100
Ben;Chem_100
Jack,Math|100

In this example, we read a text file whose fields are separated by several different delimiters (: , ; | _) into a DataFrame. A regular expression is passed as the separator, so each line is split wherever any one of these characters occurs.

Program Example

import pandas as pd
 
# Read a text file to a dataframe using multiple delimiters
student_csv = pd.read_csv('students.txt', sep='[:,;|_]', engine='python')

print(student_csv)

Output

   Name Subjs  Marks
0  Alex   Phy    100
1   Ben  Chem    100
2  Jack  Math    100
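
The same idea can be tried without creating a file. The sketch below assumes the mixed-delimiter rows are held in an in-memory string (io.StringIO) and checks that every line was split into three columns. Regular-expression separators (other than \s+) are handled by the Python parsing engine, so engine='python' is passed explicitly to avoid a parser warning.

```python
import io

import pandas as pd

# The same mixed-delimiter rows as above, held in memory for the demo
raw = io.StringIO(
    "Name,Subjs;Marks\n"
    "Alex:Phy|100\n"
    "Ben;Chem_100\n"
    "Jack,Math|100\n"
)

# The character class [:,;|_] matches any single one of the five delimiters
students = pd.read_csv(raw, sep=r"[:,;|_]", engine="python")

print(list(students.columns))   # column names taken from the first line
print(students["Marks"].sum())  # Marks is parsed as a number, so it can be summed
```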

3. read_table() to convert a text file to a DataFrame


The read_table() function reads a general delimited text file into a DataFrame. It uses a tab (\t) as the delimiter by default, so any other delimiter must be passed explicitly. Let us see how to use it with an example.

File Contents

Name Subjs Marks
Alex Phy  100
Ben  Chem 100
Jack Math 100

Program Example

import pandas as pd
 
# Read a text file to a dataframe using the read_table function.
# The columns are separated by one or more spaces, so split on the regex \s+
student_csv = pd.read_table('students.txt', sep=r'\s+')

print(student_csv)

Output

   Name Subjs  Marks
0  Alex   Phy    100
1   Ben  Chem    100
2  Jack  Math    100
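
The default tab delimiter of read_table() can be illustrated with a short sketch. It assumes tab-separated data held in an in-memory string (the data and variable names are made up for the demo) and shows that read_table() with no separator gives the same result as read_csv() with sep='\t'.

```python
import io

import pandas as pd

# Hypothetical tab-separated data, held in memory instead of a file
text = "Name\tSubjs\tMarks\nAlex\tPhy\t100\nBen\tChem\t100\nJack\tMath\t100\n"

# No separator is passed: read_table splits on tabs by default
from_table = pd.read_table(io.StringIO(text))

# The same data read with read_csv and an explicit tab separator
from_csv = pd.read_csv(io.StringIO(text), sep="\t")

print(from_table)
print(from_table.equals(from_csv))
```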

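4. read_fwf() to convert a text file to a DataFrame


The read_fwf() function reads a file of fixed-width formatted lines into a DataFrame. Instead of splitting each line on a delimiter, it takes each column from a fixed range of character positions. These ranges are inferred from the data unless they are given explicitly.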

Program Example

import pandas as pd

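# Read a fixed-width text file to a dataframe.
# Assumes every column in students.txt starts at the same character position on each line
student_csv = pd.read_fwf('students.txt')
print(student_csv)

Output

   Name Subjs  Marks
0  Alex   Phy    100
1   Ben  Chem    100
2  Jack  Math    100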
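
When the inferred column positions are not what you want, the widths can be given explicitly. The following is a minimal sketch, assuming a layout where Name and Subjs each take 6 characters and Marks takes 5; the data is held in an in-memory string and the layout is made up for illustration.

```python
import io

import pandas as pd

# Hypothetical fixed-width data: Name (6 chars), Subjs (6 chars), Marks (5 chars)
raw = io.StringIO(
    "Name  Subjs Marks\n"
    "Alex  Phy   100\n"
    "Ben   Chem  100\n"
    "Jack  Math  100\n"
)

# widths gives the number of characters in each column, left to right
students = pd.read_fwf(raw, widths=[6, 6, 5])

print(students)
```

The colspecs parameter is an alternative to widths; it takes a list of (start, end) character positions for each column.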

Summary

We have covered three built-in functions for converting a text file into a Pandas DataFrame: read_csv(), read_table(), and read_fwf().