Intro to Pandas DataFrame: How It Works

In this article, we introduce the pandas DataFrame and show how it works, with examples. A DataFrame is a 2-dimensional labeled data structure whose columns can hold different data types. It resembles an Excel spreadsheet, an SQL table, or a dictionary of Series objects, and it is the most commonly used pandas object.

The pandas DataFrame constructor is used to create a DataFrame. To better understand the DataFrame, let us first go through the DataFrame() constructor.

What is the DataFrame() constructor


The DataFrame class constructor creates a DataFrame from the data passed to it.

pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=None)

Parameters

  • data : optional. An ndarray, an iterable (such as a list or tuple), a dictionary, or another DataFrame.
  • index : optional. An Index or array-like of row labels.
  • columns : optional. An Index or array-like of column labels.
  • dtype : optional. A single data type to force on the data.
  • copy : optional. Whether to copy the data from the input.
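To make the parameters above concrete, here is a minimal sketch that passes data, index, columns and dtype to the constructor. The sample values and the labels ("apples", "mon", "a" and so on) are made up for illustration and do not come from the article.

```python
import pandas as pd

# Hypothetical sample data: a dictionary mapping column names to lists of values.
data = {"apples": [3, 2, 0], "oranges": [0, 3, 7]}

# index supplies the row labels; dtype forces one data type on all columns.
df = pd.DataFrame(data, index=["mon", "tue", "wed"], dtype="float64")
print(df)

# With a list of rows, columns supplies the column labels.
rows = [[1, 2], [3, 4]]
df2 = pd.DataFrame(rows, columns=["a", "b"])
print(df2)
```

When the data is a dictionary, its keys become the column labels, so columns is usually only needed for array-like or list input.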

1. Create a DataFrame from a 2-D array


First, we create a 2-D array using NumPy's random.rand() function. We then pass the ndarray to the DataFrame constructor and get a DataFrame as output.

import numpy as np
import pandas as pd
nd_array = np.random.rand(3,3)
print(nd_array)
df = pd.DataFrame(nd_array)
print(df)

Output:

[[0.57661755 0.54761212 0.3084379 ]
 [0.39885032 0.66864916 0.99126693]
 [0.4320829  0.59168224 0.70226786]]

         0         1         2
0  0.576618  0.547612  0.308438
1  0.398850  0.668649  0.991267
2  0.432083  0.591682  0.702268
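Because the array is random, the numbers differ on every run. If repeatable output is wanted, one option is a seeded NumPy generator. This is a sketch under that assumption; the seed value 0 is an arbitrary choice, not something the article prescribes.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed (0 here) is chosen only to make the values repeatable.
rng = np.random.default_rng(0)
nd_array = rng.random((3, 3))  # 3x3 array of floats in [0, 1)

df = pd.DataFrame(nd_array)
print(df.shape)  # (3, 3)
```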

2. How to Check the type of a DataFrame


In this example, we create a DataFrame from an array generated by the NumPy rand() function and check the type of the resulting object using Python's built-in type() function.

import numpy as np
import pandas as pd
nd_array = np.random.rand(3,3)
df = pd.DataFrame(nd_array)
print(type(df))

The output shows that df is of type pandas.core.frame.DataFrame.

<class 'pandas.core.frame.DataFrame'>
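Note that type() reports the class of the object, not the data types of the values it holds. As a small sketch, the per-column data types can be inspected through the dtypes attribute:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 3))

# isinstance() is the usual way to test the class of an object.
print(isinstance(df, pd.DataFrame))

# dtypes returns a Series holding one data type per column.
print(df.dtypes)
```

Since np.random.rand() produces floats, every column here has the dtype float64.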

3. How to Access an element of a DataFrame


Having created a DataFrame from ndarray data, you might try to access its elements with ndarray-style indexes such as df[0,1]. This raises an error: the square brackets on a DataFrame look up column labels, so (0, 1) is treated as a single column label that does not exist.

import numpy as np
import pandas as pd
nd_array = np.random.rand(3,3)
df = pd.DataFrame(nd_array)
print(df[0,1])

Output:

KeyError: (0, 1)
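A single element can still be read with the indexers pandas provides for that purpose. The following sketch uses iloc for positions and loc and at for labels; with the default labels 0, 1, 2 the two approaches happen to give the same result.

```python
import numpy as np
import pandas as pd

nd_array = np.random.rand(3, 3)
df = pd.DataFrame(nd_array)

# iloc selects by integer position: row 0, column 1.
print(df.iloc[0, 1])

# loc and at select by label; the default labels here are also 0, 1, 2.
print(df.loc[0, 1])
print(df.at[0, 1])
```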

A DataFrame has a columns attribute that holds its column labels, as the example below shows.

import numpy as np
import pandas as pd
nd_array = np.random.rand(3,3)
df = pd.DataFrame(nd_array)
print(df.columns)

The output will show you the column details:

RangeIndex(start=0, stop=3, step=1)

4. How to Access column-wise data of a DataFrame


You can access the data of a DataFrame column by column; each column is returned as a Series.

import numpy as np
import pandas as pd
nd_array = np.random.rand(3,3)
df = pd.DataFrame(nd_array)
print(df[0])

The Series output will look like this:

0    0.318376
1    0.039742
2    0.924020
Name: 0, dtype: float64
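To confirm that a selected column really is a Series, and to see that it keeps the column label as its name, a short check along these lines can help:

```python
import numpy as np
import pandas as pd

nd_array = np.random.rand(3, 3)
df = pd.DataFrame(nd_array)

col = df[0]                        # select the column labelled 0
print(isinstance(col, pd.Series))  # True
print(col.name)                    # 0
print(col.iloc[2])                 # third value in the column
```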

5. How to Access data by index from a DataFrame


As with a Series, you can assign your own labels to a DataFrame, here as column names, and then access the column data using those labels.

import numpy as np
import pandas as pd
nd_array = np.random.rand(3,3)
df = pd.DataFrame(nd_array)
df.columns = ["zero","one","two"]
print(df)

Output:

      zero       one       two
0  0.673679  0.042306  0.350949
1  0.422954  0.370127  0.018281
2  0.103772  0.170156  0.389879

Now we can access the columns using the names we assigned.

import numpy as np
import pandas as pd
nd_array = np.random.rand(3,3)
df = pd.DataFrame(nd_array)
df.columns = ["zero","one","two"]
print(df["zero"])
print(df["one"])
print(df["two"])

The output is:

0    0.673679
1    0.422954
2    0.103772
Name: zero, dtype: float64
0    0.042306
1    0.370127
2    0.170156
Name: one, dtype: float64
0    0.350949
1    0.018281
2    0.389879
Name: two, dtype: float64
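The same idea extends to selecting several columns at once and to labelling rows. The sketch below passes the column names directly to the constructor; the row labels "a", "b", "c" are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 3), columns=["zero", "one", "two"])

# A list of names selects several columns and returns a DataFrame.
subset = df[["zero", "two"]]
print(subset)

# Row labels can be assigned through df.index in the same way
# (the labels "a", "b", "c" are hypothetical).
df.index = ["a", "b", "c"]
print(df.loc["b", "one"])  # value at row "b", column "one"
```

Selecting with a single name gives a Series, while selecting with a list of names gives a DataFrame, even if the list contains only one name.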