Splitting NumPy Arrays using split(), hsplit() & vsplit() functions

Today we are going to learn how to split a NumPy array. We will look at splitting an array into n equal parts, at splitting it at chosen index positions, and at splitting it horizontally and vertically along its axes.

So let us begin with our examples:

1. Using split() function


We can split an array into multiple sub-arrays by using the split() function of the NumPy library. In this section, we will see how to do this by specifying the number of equally sized sub-arrays to return from an input array. We can also split an array by specifying the index positions at which we want the splits to happen.

Let us see the syntax of the split() function:

numpy.split(ary, indices_or_sections, axis=0)

To understand this function, we will first create a one-dimensional array to use as input and then split it as each example requires.

Here we create an array of the numbers 0 to 8 and then split it into 3 equal-sized arrays using the split() function.

import numpy as np
linarray = np.arange(9)
print(linarray)
print(np.split(linarray, 3))

As the output shows, the array is split into 3 arrays of equal size.

Output:

[0 1 2 3 4 5 6 7 8]
[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]

An important point to note is:

The number of sections that we request must divide the length of the array along the split axis evenly (for a one-dimensional array, that is the total number of elements). If we do not follow this rule, NumPy raises a ValueError.
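The rule above can be checked with a small sketch. It asks for 4 equal parts of a 9-element array and catches the resulting error. As a possible alternative when uneven pieces are acceptable, it also calls np.array_split(), a related NumPy function that this article does not otherwise cover; the variable names are only illustrative.

```python
import numpy as np

linarray = np.arange(9)

# 9 elements cannot be divided into 4 equal parts, so split() raises ValueError
try:
    np.split(linarray, 4)
    split_failed = False
except ValueError as err:
    split_failed = True
    print("split() failed:", err)

# array_split() accepts any number of sections; the leading sub-arrays
# receive the extra elements (here the sizes are 3, 2, 2, 2)
uneven_parts = np.array_split(linarray, 4)
print(uneven_parts)
```

If equal-sized parts are a hard requirement for your program, catching the ValueError as shown is probably the safer choice than silently accepting uneven parts.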

We can also split the array at specific index positions by passing a list of indices to the split() function. Each index marks the position where a new sub-array begins, so the list [3, 6] produces the slices [0:3], [3:6] and [6:].

Here is an example where we provide the index positions at which we want the splits to happen.

import numpy as np
linarray = np.arange(9)

print('Split the array with position indexes' )
print(np.split(linarray,[3,6]))

Output:

Split the array with position indexes
[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]
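The index positions do not have to produce equal parts. The following sketch, which uses illustrative variable names, splits the same array at positions 2 and 7, and then shows what happens when an index lies beyond the end of the array.

```python
import numpy as np

linarray = np.arange(9)

# Split before index 2 and before index 7: slices [0:2], [2:7] and [7:]
first, middle, last = np.split(linarray, [2, 7])
print(first)
print(middle)
print(last)

# An index past the end of the array yields an empty trailing sub-array
parts = np.split(linarray, [3, 12])
print([p.size for p in parts])
```

Unlike the sections form, the index form never raises an error for uneven sizes, since each sub-array is just a slice between two consecutive positions.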

2. Using hsplit() function


We can split an array horizontally by using the numpy.hsplit() function. A horizontal split divides the array column-wise; for a two-dimensional array this is the same as calling split() with axis=1.

We first create an array with 2 rows and 4 columns and split it into 2 arrays. Then we split the same array into 4 arrays.

import numpy as np

# 2 rows x 4 columns of fruit names
mdarray = np.array([("Apple", "Banana", "Grapes", "Pears"),
                    ("Guava", "Cherry", "Mango", "Peach")])

# Split into 2 sub-arrays of 2 columns each
mdarray1, mdarray2 = np.hsplit(mdarray, 2)
print("Array Before Split", mdarray)
print("Horizontal split array 1:", mdarray1)
print("Horizontal split array 2:", mdarray2)

# Split into 4 sub-arrays of 1 column each
print("Horizontal split arrays:", np.hsplit(mdarray, 4))

Output:

Array Before Split [['Apple' 'Banana' 'Grapes' 'Pears']
 ['Guava' 'Cherry' 'Mango' 'Peach']]
Horizontal split array 1: [['Apple' 'Banana']
 ['Guava' 'Cherry']]
Horizontal split array 2: [['Grapes' 'Pears']
 ['Mango' 'Peach']]
Horizontal split arrays: [array([['Apple'],
       ['Guava']], dtype='<U6'), array([['Banana'],
       ['Cherry']], dtype='<U6'), array([['Grapes'],
       ['Mango']], dtype='<U6'), array([['Pears'],
       ['Peach']], dtype='<U6')]
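To make the direction of a horizontal split concrete, here is a minimal sketch on a small numeric array (the array and variable names are illustrative, not from the example above). It compares hsplit() with split() called with axis=1, and also passes a list of column indices instead of a section count.

```python
import numpy as np

grid = np.arange(8).reshape(2, 4)  # 2 rows, 4 columns

# hsplit() cuts between columns, so every piece keeps both rows
left, right = np.hsplit(grid, 2)
print(left.shape, right.shape)

# For a 2-D array, the same result comes from split() along axis 1
left_axis1, right_axis1 = np.split(grid, 2, axis=1)
same_result = (np.array_equal(left, left_axis1)
               and np.array_equal(right, right_axis1))
print(same_result)

# A list of indices also works: first column, then the remaining three
first_col, rest = np.hsplit(grid, [1])
print(first_col.shape, rest.shape)
```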

3. Using vsplit() function


Another option is to split the array vertically by using the numpy.vsplit() function. A vertical split divides the array row-wise, along the first axis (axis=0). Here we split the array into 2 arrays, one for each row.

import numpy as np

# 2 rows x 4 columns of fruit names
mdarray = np.array([("Apple", "Banana", "Grapes", "Pears"),
                    ("Guava", "Cherry", "Mango", "Peach")])

# Split into 2 sub-arrays of 1 row each
mdarray3, mdarray4 = np.vsplit(mdarray, 2)

print("Vertical split array 1:", mdarray3)
print("Vertical split array 2:", mdarray4)

Output:

Vertical split array 1: [['Apple' 'Banana' 'Grapes' 'Pears']]
Vertical split array 2: [['Guava' 'Cherry' 'Mango' 'Peach']]
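A vertical split is easier to see on an array with more than two rows. The sketch below uses an illustrative 4 x 3 numeric array to compare vsplit() with split() along axis=0, and then separates the first row from the rest by passing a list of row indices.

```python
import numpy as np

table = np.arange(12).reshape(4, 3)  # 4 rows, 3 columns

# vsplit() cuts between rows, so every piece keeps all 3 columns
top, bottom = np.vsplit(table, 2)
print(top.shape, bottom.shape)

# The same result comes from split() along axis 0 (the default axis)
top_axis0, bottom_axis0 = np.split(table, 2, axis=0)
same_result = (np.array_equal(top, top_axis0)
               and np.array_equal(bottom, bottom_axis0))
print(same_result)

# A list of indices also works: first row, then the remaining three
header, body = np.vsplit(table, [1])
print(header.shape, body.shape)
```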

When a data set is too large to work with at once, we need to split it into several smaller data sets. This lets us operate on a controlled amount of data at a time. Data representation also becomes easier with smaller data samples.

These functions are really helpful in statistical analysis and other data science applications where we want to reduce the scope of the data. They offer a great deal of flexibility that we can use according to the needs of our program.

I hope you are now clear about the array split concept of the NumPy library and ready to make use of these functions!