Find unique values in a column of a Pandas DataFrame

In this post, we will look at how to find unique values in a column of a pandas DataFrame, with examples. During data manipulation in Python, we often need either the distinct values in a column or a count of them. pandas provides the built-in DataFrame.nunique(), Series.unique(), and Series.nunique() methods for this.

1. Pandas DataFrame.nunique() method


The pandas DataFrame.nunique() method returns a Series holding the number of distinct elements along the given axis: by default, one count per column.

Syntax

DataFrame.nunique(axis=0, dropna=True)

Parameters

  • axis=0 : count the unique elements in each column. This is the default.
  • axis=1 : count the unique elements in each row.
  • dropna : if True (the default), NaN values are not counted; if False, NaN is counted as a distinct value.
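
To make the axis and dropna parameters concrete, here is a minimal sketch on a small made-up frame. The column names a and b and their values are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "b" contains one missing value.
df = pd.DataFrame({"a": [1, 1, 2], "b": [1.0, np.nan, 3.0]})

per_column = df.nunique()              # axis=0: distinct values in each column
per_row = df.nunique(axis=1)           # axis=1: distinct values in each row
with_nan = df.nunique(dropna=False)    # NaN is counted as a distinct value

print(per_column)
print(per_row)
print(with_nan)
```

Only column b changes between the first and the last call, because it is the only column that contains NaN.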

1.1 Count unique values in each column of a DataFrame


To count the unique elements in each column of a DataFrame, we call the DataFrame.nunique() method. If the axis is not specified, it defaults to 0, so the result holds one count per column. Let us understand with an example.

Program Example

# Python 3 program to count unique values in each column of a DataFrame

import pandas as pd

# Each tuple is one row: (Name, Marks, Subject)
students = [
    ('Jack', 100, 'Math'),
    ('Rack', 100, 'Math'),
    ('Max', 100, 'Music'),
    ('Jack', 100, 'Math'),
    ('Jack', 100, 'Math'),
]

df = pd.DataFrame(students, columns=['Name', 'Marks', 'Subject'])

print(df)

# Count the distinct values in each column (axis=0 is the default)
unique_val = df.nunique()
print("\nUnique values in each column of dataframe:\n")
print(unique_val)

Output

   Name  Marks Subject
0  Jack    100    Math
1  Rack    100    Math
2   Max    100   Music
3  Jack    100    Math
4  Jack    100    Math

Unique values in each column of dataframe:

Name       3
Marks      1
Subject    2
dtype: int64
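
If only some of the columns are of interest, one option is to select them before calling nunique(). The following is a brief sketch that reuses the sample data from this section; the variable names are illustrative.

```python
import pandas as pd

students = [
    ("Jack", 100, "Math"),
    ("Rack", 100, "Math"),
    ("Max", 100, "Music"),
    ("Jack", 100, "Math"),
    ("Jack", 100, "Math"),
]
df = pd.DataFrame(students, columns=["Name", "Marks", "Subject"])

# Selecting a list of columns keeps a DataFrame, so the result is a Series of counts.
subset_counts = df[["Name", "Subject"]].nunique()
print(subset_counts)
```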

1.2 Count unique values in a column including NaN


By default, DataFrame.nunique() uses dropna=True and does not count NaN values in a column. To count NaN as a distinct value, we call nunique(dropna=False), as in the example below. An empty string '' is an ordinary value rather than NaN, so the example marks the missing names with np.nan. With the default dropna=True, the count for the Name column would be 2 instead of 3.

Program Example

# Python 3 program to count unique values in each column, including NaN

import numpy as np
import pandas as pd

# Each tuple is one row: (Name, Marks, Subject); np.nan marks a missing name
students = [
    (np.nan, 100, 'Math'),
    ('Rack', 100, 'Math'),
    ('Max', 100, 'Music'),
    (np.nan, 100, 'Math'),
    (np.nan, 100, 'Math'),
]

df = pd.DataFrame(students, columns=['Name', 'Marks', 'Subject'])

print(df)

# dropna=False counts NaN as one distinct value
unique_val = df.nunique(dropna=False)
print("\nUnique values in each column of dataframe:\n")
print(unique_val)

Output

   Name  Marks Subject
0   NaN    100    Math
1  Rack    100    Math
2   Max    100   Music
3   NaN    100    Math
4   NaN    100    Math

Unique values in each column of dataframe:

Name       3
Marks      1
Subject    2
dtype: int64
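
To see the two dropna settings side by side, here is a small sketch on a single column. The values are made up for illustration: missing entries are written as np.nan, while the empty string is treated by pandas as a regular value.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two missing entries and one empty string.
names = pd.Series([np.nan, "Rack", "Max", np.nan, ""], name="Name")

without_nan = names.nunique()             # counts 'Rack', 'Max' and ''
with_nan = names.nunique(dropna=False)    # additionally counts NaN once

print(without_nan)
print(with_nan)
```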

1.3 Find unique values in multiple columns of a DataFrame


To find the unique values across multiple columns of a DataFrame, we merge the columns into a single Series with pd.concat() and call the unique() method on the result. Series.append(), which older code uses for this step, was removed in pandas 2.0.

Program Example

import pandas as pd

# Each tuple is one row: (Name, Marks, Subject)
students = [
    ('Jack', 100, 'Math'),
    ('Rack', 100, 'Math'),
    ('Max', 100, 'Music'),
    ('Jack', 100, 'Math'),
    ('Jack', 100, 'Math'),
]

df = pd.DataFrame(students, columns=['Name', 'Marks', 'Subject'])

print(df)

# Stack both columns into one Series, then take the distinct values
unique_val = pd.concat([df['Name'], df['Subject']]).unique()
print("\nUnique values in Name and subject columns:\n")
print(unique_val)

Output

   Name  Marks Subject
0  Jack    100    Math
1  Rack    100    Math
2   Max    100   Music
3  Jack    100    Math
4  Jack    100    Math

Unique values in Name and subject columns:

['Jack' 'Rack' 'Max' 'Math' 'Music']
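
Another way to get the same values, sketched here as an alternative rather than as the method used above, is to flatten the selected columns into one NumPy array and pass it to pd.unique(). Flattening in column-major order (order="F") is a choice made here so that all names come before all subjects.

```python
import pandas as pd

students = [
    ("Jack", 100, "Math"),
    ("Rack", 100, "Math"),
    ("Max", 100, "Music"),
    ("Jack", 100, "Math"),
    ("Jack", 100, "Math"),
]
df = pd.DataFrame(students, columns=["Name", "Marks", "Subject"])

# 2-D array of the two columns, flattened column by column into 1-D.
flat = df[["Name", "Subject"]].to_numpy().ravel(order="F")

# pd.unique keeps values in order of first appearance.
unique_vals = pd.unique(flat)
print(unique_vals)
```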

2. Find and count unique values in a Series


To find the unique values in a Series, we can use the Series.unique() and Series.nunique() methods.

1. Series.unique() method

It returns a NumPy array of the unique elements, in order of first appearance.

Syntax

Series.unique()

2. Series.nunique() method

It returns the number of unique elements in the Series as an integer.

Syntax

Series.nunique(dropna=True)
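
The relationship between the two methods can be checked directly. In this small sketch (the sample marks are made up), unique() keeps the missing value in its result, while nunique() skips it unless dropna=False is passed.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with repeated values and one missing entry.
marks = pd.Series([90, 75, 90, np.nan, 75])

distinct = marks.unique()                       # distinct values, NaN included
count = marks.nunique()                         # NaN excluded by default
count_with_nan = marks.nunique(dropna=False)    # NaN counted once

print(distinct)
print(count, count_with_nan, len(distinct))
```

When NaN is counted, the result of nunique() matches the length of the array returned by unique().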

2.1 Find unique values in a single column


To find all the unique values in a column, we select the ‘Name’ column, which is a Series, and call the Series.unique() method on it.

Program Example

import pandas as pd

# Each tuple is one row: (Name, Marks, Subject)
students = [
    ('Jack', 100, 'Math'),
    ('Rack', 100, 'Math'),
    ('Max', 100, 'Music'),
    ('Jack', 100, 'Math'),
    ('Jack', 100, 'Math'),
]

df = pd.DataFrame(students, columns=['Name', 'Marks', 'Subject'])

print(df)

# Distinct values of the Name column, in order of first appearance
unique_val = df['Name'].unique()
print("\nUnique values in Name column:\n")
print(unique_val)

Output

   Name  Marks Subject
0  Jack    100    Math
1  Rack    100    Math
2   Max    100   Music
3  Jack    100    Math
4  Jack    100    Math

Unique values in Name column:

['Jack' 'Rack' 'Max']

2.2 Count unique values in a single column


To count the unique values in a column, we call the Series.nunique() method on the ‘Name’ column. It returns the number of unique values in that column.

Program Example

import pandas as pd

# Each tuple is one row: (Name, Marks, Subject)
students = [
    ('Jack', 100, 'Math'),
    ('Rack', 100, 'Math'),
    ('Max', 100, 'Music'),
    ('Jack', 100, 'Math'),
    ('Jack', 100, 'Math'),
]

df = pd.DataFrame(students, columns=['Name', 'Marks', 'Subject'])

print(df)

# Number of distinct values in the Name column
unique_val = df['Name'].nunique()
print("\nUnique values in Name column:\n")
print(unique_val)

Output

   Name  Marks Subject
0  Jack    100    Math
1  Rack    100    Math
2   Max    100   Music
3  Jack    100    Math
4  Jack    100    Math

Unique values in Name column:

3

Summary

In this post, we have learned multiple ways to find and count unique values in a DataFrame and a Series by using the built-in DataFrame.nunique(), Series.unique(), and Series.nunique() methods.