In this post, we will learn how to skip rows with read_csv while reading a CSV file into a Pandas DataFrame. We cover skipping rows at the start, at the end, at specific positions, by condition, after the header, and more.
Pandas read_csv() method
The Pandas library has a built-in read_csv() method that reads the CSV file at the given path and loads its contents into a DataFrame.
Syntax
pandas.read_csv(filepath_or_buffer, sep=',', skiprows=N)
Parameters
- filepath_or_buffer: Path of the file.
- skiprows: The rows to skip.
  - If an int, skip that many lines from the top of the file (the header line counts as one of them).
  - If a list of indexes, skip the lines at those 0-based positions.
  - If a callback function, it is called with each line index, and the line is skipped when it returns True.
- sep: The separator. The default is a comma (,). A custom separator can be passed as needed.
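The three forms of skiprows can be compared side by side. The sketch below is a minimal illustration, assuming the sample data is held in an in-memory string (CSV_TEXT) rather than in student.csv; the load() helper is a hypothetical name introduced only for this example.

```python
import io
import pandas as pd

# Inline copy of the sample data so the sketch runs without a file on disk (assumption).
CSV_TEXT = """Name,Subjs,Marks
Alex,Phy,100
Ben,Chem,100
Jack,Math,100
Max,Phy,100
Tawn,Chem,100
Bruise,Math,100
"""

def load(skiprows):
    """Hypothetical helper: parse the inline CSV with the given skiprows value."""
    return pd.read_csv(io.StringIO(CSV_TEXT), skiprows=skiprows)

by_count = load(2)                    # int: drop the first 2 lines, header included
by_index = load([1, 3])               # list: drop line 1 (Alex) and line 3 (Jack)
by_rule = load(lambda i: i % 2 == 1)  # callable: drop every odd-numbered line

print(by_count.columns.tolist())  # the 'Ben' line has become the header
print(by_index["Name"].tolist())
print(by_rule["Name"].tolist())
```

Note that the int form counts lines from the very top of the file, so the original header is lost unless column names are supplied separately.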
Sample CSV File
Name,Subjs,Marks
Alex,Phy,100
Ben,Chem,100
Jack,Math,100
Max,Phy,100
Tawn,Chem,100
Bruise,Math,100
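1. Pandas read_csv skip rows at start
In this example, we are skipping the first 4 lines of the CSV file. Because the header line is one of those 4 lines, the next line (Max,Phy,100) is used as the column names.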
Program Example
import pandas as pd
studf = pd.read_csv('student.csv', skiprows = 4)
print(studf)
Output
Max Phy 100
0 Tawn Chem 100
1 Bruise Math 100
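If the original column names should survive an integer skiprows, one option is to pass them explicitly. The following is a small sketch under the assumption that the data is available as an in-memory string (CSV_TEXT) and that the column names are known in advance; it uses the names and header arguments of read_csv().

```python
import io
import pandas as pd

# Inline copy of the sample data (assumption: same content as student.csv).
CSV_TEXT = """Name,Subjs,Marks
Alex,Phy,100
Ben,Chem,100
Jack,Math,100
Max,Phy,100
Tawn,Chem,100
Bruise,Math,100
"""

# Skip the first 4 lines (header + 3 data rows), treat the rest as data only,
# and supply the column names by hand.
studf = pd.read_csv(
    io.StringIO(CSV_TEXT),
    skiprows=4,
    header=None,
    names=["Name", "Subjs", "Marks"],
)
print(studf)
```

With header=None, the Max line is kept as a data row instead of being consumed as the header.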
2. Pandas read_csv skip rows by condition
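A callback function or lambda function can be passed to the skiprows argument of read_csv(). It is called with the index of each line (starting at 0), and the line is skipped when the function returns True. In this example, every even-indexed line is skipped, including the header line at index 0.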
Program Example
# Python 3 program: Pandas read_csv skip rows by condition
import pandas as pd
def fun_skiprows(index):
    # Return True for the line indexes to skip (every even-numbered line)
    return index % 2 == 0
# The function can be passed directly; no lambda wrapper is needed
studf = pd.read_csv('student.csv', skiprows=fun_skiprows)
print(studf)
Output
Alex Phy 100
0 Jack Math 100
1 Tawn Chem 100
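A condition on the line index also applies to index 0, which is the header line. The variant below is a sketch that keeps the header and skips only even-numbered data lines; it assumes the sample data is held in an in-memory string, and skip_even_data_rows is a hypothetical name chosen for this illustration.

```python
import io
import pandas as pd

# Inline copy of the sample data (assumption: same content as student.csv).
CSV_TEXT = """Name,Subjs,Marks
Alex,Phy,100
Ben,Chem,100
Jack,Math,100
Max,Phy,100
Tawn,Chem,100
Bruise,Math,100
"""

def skip_even_data_rows(index):
    # Index 0 is the header line, so it is never skipped.
    return index != 0 and index % 2 == 0

studf = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=skip_even_data_rows)
print(studf)
```

Here lines 2, 4 and 6 (Ben, Max, Bruise) are skipped, and the column names stay Name, Subjs, Marks.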
3. Pandas read_csv skip rows at specific position
In this example, we pass a list of 0-based line indexes to skiprows. Lines 1, 2 and 4 (Alex, Ben and Max) are skipped.
Program Example
# Python 3 program: Pandas read_csv skip rows at specific position
import pandas as pd
studf = pd.read_csv('student.csv', skiprows = [1, 2, 4])
print(studf)
Output
Name Subjs Marks
0 Jack Math 100
1 Tawn Chem 100
2 Bruise Math 100
4. Pandas read_csv skip N rows from end
The skipfooter argument of the pandas read_csv() method specifies the number of rows to skip from the end (footer) of the file. In this example, we are skipping 4 rows from the end of the CSV file.
We pass engine='python' because the default 'c' engine does not support skipfooter. Without it, pandas falls back to the python engine and emits this warning:
ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support skipfooter; you can avoid this warning by specifying engine='python'.
Program Example
import pandas as pd
studf = pd.read_csv('student.csv', skipfooter=4, engine='python')
print(studf)
Output
Name Subjs Marks
0 Alex Phy 100
1 Ben Chem 100
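The footer skip can be checked without a file on disk. This is a minimal sketch, assuming the sample data is held in an in-memory string (CSV_TEXT); the variable name trimmed is only illustrative.

```python
import io
import pandas as pd

# Inline copy of the sample data (assumption: same content as student.csv).
CSV_TEXT = """Name,Subjs,Marks
Alex,Phy,100
Ben,Chem,100
Jack,Math,100
Max,Phy,100
Tawn,Chem,100
Bruise,Math,100
"""

# skipfooter needs the python engine; passing it explicitly avoids the ParserWarning.
trimmed = pd.read_csv(io.StringIO(CSV_TEXT), skipfooter=4, engine="python")
print(trimmed)
```

Only the header and the first two data rows (Alex and Ben) remain, since the last four lines of the six data rows are dropped.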
5. Pandas read_csv skip N rows after header
In this example, we are skipping the 3 rows that follow the header (line indexes 1 to 3) while keeping the column names.
Program Example
import pandas as pd
studf = pd.read_csv('student.csv', skiprows = [x for x in range(1, 4)])
print(studf)
Output
Name Subjs Marks
0 Max Phy 100
1 Tawn Chem 100
2 Bruise Math 100
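The same idea can be wrapped in a small function that takes N as a parameter. The sketch below assumes the data is passed as an in-memory buffer; skip_after_header is a hypothetical helper name, not part of pandas.

```python
import io
import pandas as pd

# Inline copy of the sample data (assumption: same content as student.csv).
CSV_TEXT = """Name,Subjs,Marks
Alex,Phy,100
Ben,Chem,100
Jack,Math,100
Max,Phy,100
Tawn,Chem,100
Bruise,Math,100
"""

def skip_after_header(buffer, n):
    """Hypothetical helper: keep the header (line 0) and skip the next n lines."""
    return pd.read_csv(buffer, skiprows=list(range(1, n + 1)))

studf = skip_after_header(io.StringIO(CSV_TEXT), 3)
print(studf)
```

Starting the range at 1 rather than 0 is what preserves the header line.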
6. Pandas read_csv skip rows and Data Column
Sometimes we do not want to load every column. To load only the required columns, we use usecols to specify their indexes.
In this example, we are skipping rows [1, 2] from the start and loading the first two columns using usecols=[0, 1]. The header=0 argument tells read_csv() to treat the first line as the header.
Program Example
import pandas as pd
studf = pd.read_csv(
    'student.csv', sep=',', skiprows=[1, 2], header=0, usecols=[0, 1]
)
print(studf.head(10))
Output
Name Subjs
0 Jack Math
1 Max Phy
2 Tawn Chem
3 Bruise Math
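usecols also accepts column names instead of positions, which can be easier to read when the header is known. The following is a sketch under the assumption that the data is held in an in-memory string and that the header is spelled Name,Subjs,Marks without stray spaces.

```python
import io
import pandas as pd

# Inline copy of the sample data (assumption: same content as student.csv).
CSV_TEXT = """Name,Subjs,Marks
Alex,Phy,100
Ben,Chem,100
Jack,Math,100
Max,Phy,100
Tawn,Chem,100
Bruise,Math,100
"""

# Skip lines 1 and 2 (Alex, Ben) and load only two columns, selected by name.
studf = pd.read_csv(
    io.StringIO(CSV_TEXT),
    skiprows=[1, 2],
    header=0,
    usecols=["Name", "Subjs"],
)
print(studf)
```

The names passed to usecols must match the header exactly, so a header written as "Name, Subjs" would need " Subjs" with the leading space.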
7. Pandas read_csv skip empty rows
In this example, we remove the empty rows after reading the CSV file into the DataFrame. Lines that contain only separators (,,) are not blank lines, so read_csv() loads them as rows of NaN values.
Sample File with empty rows
Name,Subjs,Marks
Alex,Phy,100
,,
,,
Ben,Chem,100
Jack,Math,100
Max,Phy,100
Program Example
- First, read the CSV file into a DataFrame using read_csv().
- Then call the dropna() method on the DataFrame to drop the empty rows.
- Finally, reset the index using the reset_index() method.
import pandas as pd
dfobj = pd.read_csv('student.csv')
dfobj = dfobj.dropna()
#to reset the index
dfobj = dfobj.reset_index(drop=True)
print(dfobj)
Output
Name Subjs Marks
0 Alex Phy 100.0
1 Ben Chem 100.0
2 Jack Math 100.0
3 Max Phy 100.0
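Called without arguments, dropna() removes any row that has at least one missing value, not only fully empty rows. The sketch below compares it with dropna(how="all"); it assumes an in-memory copy of the file in which the Jack row has been changed to leave Marks empty, purely to show the difference.

```python
import io
import pandas as pd

# Modified sample (assumption): the Jack row is missing its Marks value.
CSV_TEXT = """Name,Subjs,Marks
Alex,Phy,100
,,
,,
Ben,Chem,100
Jack,Math,
Max,Phy,100
"""

dfobj = pd.read_csv(io.StringIO(CSV_TEXT))

# how="all" drops a row only when every column is NaN.
only_empty_dropped = dfobj.dropna(how="all").reset_index(drop=True)

# The default (how="any") also drops the partially filled Jack row.
any_missing_dropped = dfobj.dropna().reset_index(drop=True)

print(only_empty_dropped)
print(any_missing_dropped)
```

Which variant fits depends on whether partially filled rows should be kept for later cleaning.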
Summary
In this post, we have learned different ways to skip rows with read_csv while reading a CSV file into a DataFrame.