In this post, we will learn how to find unique values in a column of a pandas DataFrame, with examples. When manipulating data in Python, we can find and count the distinct values in a column using the built-in pandas methods DataFrame.nunique(), Series.unique() and Series.nunique().
1. Pandas DataFrame.nunique() method
The pandas DataFrame.nunique() method returns a Series containing the number of unique elements along the chosen axis of the DataFrame.
Syntax
DataFrame.nunique(axis=0, dropna=True)
Parameters
- axis=0 : count the unique elements in each column. This is the default value of axis.
- axis=1 : count the unique elements in each row.
- dropna : if True (the default), NaN values are not counted; if False, NaN is counted as a value.
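The effect of the axis parameter can be sketched as follows. The small DataFrame and its column names A, B and C are made up purely for illustration and are not part of the examples in this post.

```python
import pandas as pd

# Hypothetical sample data, chosen only to illustrate the axis parameter.
df = pd.DataFrame({"A": [1, 1, 2], "B": [1, 3, 2], "C": [1, 3, 5]})

per_column = df.nunique()        # axis=0 (default): one count per column
per_row = df.nunique(axis=1)     # axis=1: one count per row

print(per_column)
print(per_row)
```

With axis=0 the result is indexed by column name, and with axis=1 it is indexed by the row labels.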
1.1 Count unique values in a column of DataFrame
To count the number of unique elements in each column of a DataFrame, we use the DataFrame.nunique() method. The axis defaults to 0 when it is not specified, so we get one count per column. Let us understand with an example.
Program Example
#python 3 program to Count unique values in a column of DataFrame
import pandas as pd
Student_dict = [('Jack',100,'Math'),
('Rack',100,'Math'),
( 'Max',100,'Music'),
('Jack',100,'Math'),
('Jack',100,'Math'),
]
df = pd.DataFrame(Student_dict,columns=['Name','Marks','Subject'])
print(df)
unique_val = df.nunique()
print("\nUnique values in each column of dataframe:\n")
print(unique_val)
Output
Name Marks Subject
0 Jack 100 Math
1 Rack 100 Math
2 Max 100 Music
3 Jack 100 Math
4 Jack 100 Math
Unique values in each column of dataframe:
Name 3
Marks 1
Subject 2
dtype: int64
1.2 Count unique values in a column, including NaN
By default, DataFrame.nunique(dropna=True) does not count NaN values in a column. To count NaN as a unique value, we have to call nunique(dropna=False), as in the example below. Note that an empty string is an ordinary value, not NaN, so the example uses numpy.nan for the missing names.
Program Example
#python 3 program to count unique values in a column of DataFrame, including NaN
import numpy as np
import pandas as pd
Student_dict = [(np.nan,100,'Math'),
('Rack',100,'Math'),
( 'Max',100,'Music'),
(np.nan,100,'Math'),
(np.nan,100,'Math'),
]
df = pd.DataFrame(Student_dict,columns=['Name','Marks','Subject'])
print(df)
unique_val = df.nunique(dropna=False)
print("\nUnique values in each column of dataframe:\n")
print(unique_val)
Output
Name Marks Subject
0 NaN 100 Math
1 Rack 100 Math
2 Max 100 Music
3 NaN 100 Math
4 NaN 100 Math
Unique values in each column of dataframe:
Name 3
Marks 1
Subject 2
dtype: int64
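To make the effect of dropna visible, the following minimal sketch compares both settings on a column that contains real missing values. It assumes the missing names are stored as numpy.nan; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# A Name column with three missing entries stored as NaN.
names = pd.DataFrame({"Name": [np.nan, "Rack", "Max", np.nan, np.nan]})

without_nan = names.nunique()               # NaN is ignored
with_nan = names.nunique(dropna=False)      # NaN counts as one value

print(without_nan["Name"], with_nan["Name"])

# An empty string is not NaN, so dropna makes no difference here.
blanks = pd.Series(["", "Rack", "Max", "", ""])
print(blanks.nunique(), blanks.nunique(dropna=False))
```

The two counts differ by one only when the column really contains NaN. For empty strings both calls give the same result.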
1.3 Find unique values in multiple columns of a DataFrame
To find the unique values in multiple columns of the DataFrame, we merge the columns into a single Series object using pd.concat() and call the unique() method on the result. (Series.append() was removed in pandas 2.0, so pd.concat() is used instead.)
Program Example
import pandas as pd
Student_dict = [('Jack',100,'Math'),
('Rack',100,'Math'),
( 'Max',100,'Music'),
('Jack',100,'Math'),
('Jack',100,'Math'),
]
df = pd.DataFrame(Student_dict,columns=['Name','Marks','Subject'])
print(df)
# combine both columns into one Series, then keep the distinct values
unique_val = pd.concat([df['Name'], df['Subject']]).unique()
print("\nUnique values in Name and subject columns:\n")
print(unique_val)
Output
Name Marks Subject
0 Jack 100 Math
1 Rack 100 Math
2 Max 100 Music
3 Jack 100 Math
4 Jack 100 Math
Unique values in Name and subject columns:
['Jack' 'Rack' 'Max' 'Math' 'Music']
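As an alternative sketch, the selected columns can be flattened into one array and passed to pd.unique(). This is one possible approach rather than the only one. The order="F" argument is an assumption made here so that all Name values are visited before the Subject values.

```python
import pandas as pd

df = pd.DataFrame(
    [("Jack", 100, "Math"), ("Rack", 100, "Math"), ("Max", 100, "Music"),
     ("Jack", 100, "Math"), ("Jack", 100, "Math")],
    columns=["Name", "Marks", "Subject"],
)

# Flatten the two columns column by column (Name first, then Subject).
flat = df[["Name", "Subject"]].to_numpy().ravel(order="F")

# pd.unique keeps values in order of first appearance.
unique_val = pd.unique(flat)
print(unique_val)
```

Without order="F" the array is flattened row by row, so the same values come back in a different order.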
2. Find and count unique values in a Series
To find or count the unique values in a Series, we can use the Series.unique() and Series.nunique() methods.
1. Series.unique() method
It returns the unique elements as a NumPy array (or a pandas extension array for extension dtypes), in order of first appearance.
Syntax
Series.unique()
2. Series.nunique() method
It returns the number of unique elements in the Series.
Syntax
Series.nunique(dropna=True)
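The difference between the two methods can be sketched on a small Series. The data below is hypothetical and includes one missing value to show how each method treats it.

```python
import pandas as pd

# Hypothetical Series with a repeated name and one missing value.
s = pd.Series(["Jack", "Rack", "Max", "Jack", None])

values = s.unique()                       # distinct values, missing value included
count = s.nunique()                       # missing value not counted by default
count_with_nan = s.nunique(dropna=False)  # missing value counted once

print(values)
print(count, count_with_nan)
```

Note that unique() keeps the missing value in its result, while nunique() skips it unless dropna=False is passed.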
2.1 Find unique values in a single column
To find all the unique values in a single column, we select the ‘Name’ column as a Series and call the Series.unique() method on it.
Program Example
import pandas as pd
Student_dict = [('Jack',100,'Math'),
('Rack',100,'Math'),
( 'Max',100,'Music'),
('Jack',100,'Math'),
('Jack',100,'Math'),
]
df = pd.DataFrame(Student_dict,columns=['Name','Marks','Subject'])
print(df)
unique_val = (df['Name'].unique())
print("\nUnique values in Name column:\n")
print(unique_val)
Output
Name Marks Subject
0 Jack 100 Math
1 Rack 100 Math
2 Max 100 Music
3 Jack 100 Math
4 Jack 100 Math
Unique values in Name column:
['Jack' 'Rack' 'Max']
2.2 Count unique values in a single column
To count the unique values in a single column, we call the Series.nunique() method on the ‘Name’ column. It returns the number of unique values in that column as an integer.
Program Example
import pandas as pd
Student_dict = [('Jack',100,'Math'),
('Rack',100,'Math'),
( 'Max',100,'Music'),
('Jack',100,'Math'),
('Jack',100,'Math'),
]
df = pd.DataFrame(Student_dict,columns=['Name','Marks','Subject'])
print(df)
unique_val = (df['Name'].nunique())
print("\nUnique values in Name column:\n")
print(unique_val)
Output
Name Marks Subject
0 Jack 100 Math
1 Rack 100 Math
2 Max 100 Music
3 Jack 100 Math
4 Jack 100 Math
Unique values in Name column:
3
Summary
In this post, we have learned multiple ways to find and count unique values in a DataFrame and a Series by using the built-in DataFrame.nunique(), Series.unique() and Series.nunique() methods.