How to convert text file to NumPy array in Python


In this article, we will learn how to convert a text file to a NumPy array in Python using the NumPy functions loadtxt() and genfromtxt(), including how to specify a delimiter, skip header rows, set the data type, and handle missing values, with program examples.

1. Convert text file to NumPy array using loadtxt()



Steps to read a text file into a NumPy array


First, import numpy so that its functions are available in the program.

  • Load the data with the numpy function loadtxt(), passing the name of the file. If the file is not in the current working directory, include the correct path to it.
  • The data is now loaded into an ndarray, so we can print it and confirm the contents.
  • We can check the size of the array, that is, the total number of elements.
  • We can check the shape of the array, that is, the number of rows and columns.

Sample File content : numpydata-1.txt

Assume we have the text file with the below content in it.

1 2 3
4 5 6
7 8 9

Python Program to convert text file to NumPy array

In this Python program, we first import the NumPy module with “import numpy as np”. Here is the full example that combines the steps explained above:

import numpy as np

# Load the whitespace-separated file into a 2-D ndarray
loadedndarray = np.loadtxt('numpydata-1.txt')
print(loadedndarray)
print("Number of data elements", loadedndarray.size)
print("Shape of data", loadedndarray.shape)

Output

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
Number of data elements 9
Shape of data (3, 3)
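
To try the example without creating a file on disk, the same content can be wrapped in io.StringIO, since loadtxt() also accepts file-like objects. This is only a minimal sketch: the names sample_text and data are illustrative and do not appear in the program above.

```python
import io
import numpy as np

# Same content as numpydata-1.txt, kept in memory for illustration
sample_text = "1 2 3\n4 5 6\n7 8 9\n"

# loadtxt() accepts a file name or a file-like object
data = np.loadtxt(io.StringIO(sample_text))

print(data.size)    # total number of elements
print(data.shape)   # (rows, columns)
print(data.dtype)   # loadtxt() uses float unless dtype is given
```

Printing the dtype is a quick way to see why the values in the output appear as 1., 2., 3. rather than as integers.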

2. Convert text file to NumPy array using a delimiter


File numpydata-2.txt contents

Assume we have the text file with the below content in it.

1,1,2
2,4,5
3,7,8
4,10,11
5,13,14

Python Program to convert text file to NumPy array using a delimiter

Now let’s look at another variant of loadtxt(), where the text file contains comma-separated data. To load such data into a NumPy array, we pass a comma (‘,’) to loadtxt() through the delimiter argument.

import numpy as np
loadedndarray1 = np.loadtxt('numpydata-2.txt', delimiter=',')
print(loadedndarray1)

Output :

[[ 1.  1.  2.]
 [ 2.  4.  5.]
 [ 3.  7.  8.]
 [ 4. 10. 11.]
 [ 5. 13. 14.]]
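
To see why the delimiter matters, the sketch below loads the same kind of comma-separated text with and without it. By default loadtxt() splits on whitespace, so a line such as 1,1,2 is treated as a single value that cannot be parsed as a number. The data is held in memory with io.StringIO, and the names csv_text, failed and parsed are illustrative assumptions.

```python
import io
import numpy as np

csv_text = "1,1,2\n2,4,5\n3,7,8\n"

# Without a delimiter, each line is one whitespace-separated token
# such as "1,1,2", which cannot be converted to a float.
try:
    np.loadtxt(io.StringIO(csv_text))
    failed = False
except ValueError as err:
    failed = True
    print("Could not parse:", err)

# With delimiter=',' the same text loads as a 3x3 array.
parsed = np.loadtxt(io.StringIO(csv_text), delimiter=',')
print(parsed.shape)
```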

3. Convert text file to NumPy array by ignoring the header


numpydata-3.txt contents

Assume we have the text file with headers.

No,Students,Marks
1,1,30
2,2,45
3,7,50
4,5,60
5,8,66

Python Program to convert text file to NumPy array by ignoring the header

The file above contains a header row along with comma-delimited data. When reading the data into a NumPy array, we need to skip the header row. This can be achieved by passing the skiprows argument to loadtxt(); skiprows=1 skips the first line of the file.

import numpy as np
loadedndarray2 = np.loadtxt('numpydata-3.txt', delimiter=',', skiprows=1)
print("Data after skipping the header row",loadedndarray2)


Output:

Data after skipping the header row [[ 1.  1. 30.]
 [ 2.  2. 45.]
 [ 3.  7. 50.]
 [ 4.  5. 60.]
 [ 5.  8. 66.]]
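
Skipping the header discards the column names. If they are still needed, one possible approach (an assumption here, not something loadtxt() does for you) is to split the first line yourself and load the remaining rows with skiprows=1. The sketch uses in-memory text, and the names text, column_names and values are illustrative.

```python
import io
import numpy as np

text = "No,Students,Marks\n1,1,30\n2,2,45\n"

# Take the header line and split it into column names
header_line = text.splitlines()[0]
column_names = header_line.split(',')

# Load only the numeric rows by skipping the header line
values = np.loadtxt(io.StringIO(text), delimiter=',', skiprows=1)

print(column_names)
print(values.shape)
```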

4. Convert text file to NumPy array by specifying the data type


We can load the data as strings by adding the dtype (data type) argument. Setting it to str converts each element of the data to a string.

import numpy as np
loadedndarray3 = np.loadtxt('numpydata-3.txt', delimiter=',', skiprows=1, dtype=str)
print("Data with string type",loadedndarray3)

Output:

Data with string type [['1' '1' '30']
 ['2' '2' '45']
 ['3' '7' '50']
 ['4' '5' '60']
 ['5' '8' '66']]
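
The dtype argument is not limited to str. The following sketch loads the same rows three times and compares the resulting element types. It assumes in-memory sample text in place of numpydata-3.txt, and the names as_float, as_int and as_str are illustrative.

```python
import io
import numpy as np

text = "No,Students,Marks\n1,1,30\n2,2,45\n3,7,50\n"

# Default: every value is parsed as a float
as_float = np.loadtxt(io.StringIO(text), delimiter=',', skiprows=1)
# dtype=int: values are parsed as integers
as_int = np.loadtxt(io.StringIO(text), delimiter=',', skiprows=1, dtype=int)
# dtype=str: values are kept as text
as_str = np.loadtxt(io.StringIO(text), delimiter=',', skiprows=1, dtype=str)

print(as_float.dtype)
print(as_int.dtype)   # exact integer width depends on the platform
print(as_str.dtype)
```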

We can also apply custom logic to any column by providing a function to loadtxt() through the converters argument. converters is a dictionary that maps a column index to a function; loadtxt() applies the function to every value in that column before storing the result in the ndarray. Column indices start at 0, so in the example below the values in column 1 (the second column) are multiplied by 10.

import numpy as np

def multiply_by_ten(value):
    # Convert the raw field to int and multiply it by 10
    return int(value) * 10

# converters maps column index 1 (the second column) to the function
loadedndarray4 = np.loadtxt('numpydata-2.txt', delimiter=',', dtype=int, converters={1: multiply_by_ten})
print("Data with modified column 1", loadedndarray4)


Output:

Data with modified column 1 [[  1  10   2]
 [  2  40   5]
 [  3  70   8]
 [  4 100  11]
 [  5 130  14]]

5. Convert text file to NumPy array with nan


To read a file that has missing values, use the genfromtxt() function. If some values are missing in any row or column, genfromtxt() loads the data and fills each missing value with nan, which means “not a number”. It can also fill missing values with a custom value. Let’s understand this with the example below:

numpydata-4.txt content


Assume we have the text file with the below content in it.

No,Students,Marks
1,1,30
2,2,45
3,7,
4,5,
5,8,66

Python program to read text file to NumPy array with nan

import numpy as np
loadedndarray5 = np.genfromtxt('numpydata-4.txt', delimiter=',', skip_header=1)
print("Missing Values filled with nan",loadedndarray5)

Output:

Missing Values filled with nan [[ 1.  1. 30.]
 [ 2.  2. 45.]
 [ 3.  7. nan]
 [ 4.  5. nan]
 [ 5.  8. 66.]]
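
Once the data is loaded, it can be useful to find out where the missing values are. The sketch below uses np.isnan() for that; it is one possible way to inspect the result rather than part of genfromtxt() itself. The in-memory text stands in for numpydata-4.txt, and the names missing_mask, missing_count and rows_with_missing are illustrative.

```python
import io
import numpy as np

text = "No,Students,Marks\n1,1,30\n2,2,45\n3,7,\n4,5,\n5,8,66\n"
loaded = np.genfromtxt(io.StringIO(text), delimiter=',', skip_header=1)

# Boolean array that is True wherever a value is nan
missing_mask = np.isnan(loaded)

# Total number of missing values
missing_count = int(missing_mask.sum())

# Zero-based indices of the rows that contain at least one nan
rows_with_missing = np.where(missing_mask.any(axis=1))[0]

print(missing_count)
print(rows_with_missing)
```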

If you would like to fill the missing values with a specific value instead, use the filling_values argument and pass the value to use wherever data is missing.

import numpy as np
loadedndarray5 = np.genfromtxt('numpydata-4.txt', delimiter=',', skip_header=1, filling_values=0)
print("Missing values filled with 0", loadedndarray5)


Output:

Missing values filled with 0 [[ 1.  1. 30.]
 [ 2.  2. 45.]
 [ 3.  7.  0.]
 [ 4.  5.  0.]
 [ 5.  8. 66.]]