Python pandas dataframe indexing

Pandas DataFrame Indexing: Set the Index of a Pandas Dataframe

Hello Readers! In this tutorial, we are going to discuss the different ways to set the index of a Pandas DataFrame object in Python.

What do we mean by indexing of a Pandas Dataframe?

In Python, when we create a Pandas DataFrame object using the pd.DataFrame() function defined in the Pandas module, row labels and column labels are generated automatically (by default) so that every data element in the DataFrame can be addressed. These labels are what we mean by indexing.

More precisely, the row labels are called the index of the DataFrame, and the column labels are simply called columns. The index of a Pandas DataFrame object identifies its rows; the labels are usually unique, although Pandas does not require this. Let’s start our core discussion about the different ways to set the index of a Pandas DataFrame object in Python.
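As a quick illustration of the default labels described above, here is a minimal sketch. The two-row sample data is hypothetical and is not taken from the examples in this tutorial.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"Name": ["Rajan", "Raman"], "Marks": [93, 88]})

# Row labels: with no index given, Pandas creates a RangeIndex starting at 0
print(df.index)

# Column labels: taken from the dictionary keys
print(df.columns)
```

The row labels are available through df.index and the column labels through df.columns.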

Set index of the DataFrame while creating

In Python, we can set the index of the DataFrame while creating it using the index parameter. In this method, we create a Python list and pass it to the index parameter of the pd.DataFrame() function, which uses the list as the row labels. The list must have the same length as the number of rows. Let’s implement this through Python code.

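    # Import Pandas module
    import pandas as pd

    # Create a Python dictionary
    # (sample values matching the output shown later in this tutorial)
    data = {'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
            'Marks': [93, 88, 95, 75, 99],
            'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}

    # Create a Python list of Roll NOs
    Roll = [11, 12, 13, 14, 15]

    # Create a DataFrame from the dictionary
    # and set the Roll list as the index
    # using DataFrame() function with index parameter
    df = pd.DataFrame(data, index=Roll)
    print(df)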

Set Index Using Index Parameter
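Once the index is set at creation time, rows can be looked up by their label. The sketch below uses a shortened, hypothetical version of the data. It also shows what happens when the index list has the wrong length: Pandas raises a ValueError.

```python
import pandas as pd

# Shortened, hypothetical sample data for illustration
data = {"Name": ["Rajan", "Raman", "Deepak"], "Marks": [93, 88, 95]}
roll = [11, 12, 13]

df = pd.DataFrame(data, index=roll)

# Label-based lookup: fetch the row whose index label is 12
row = df.loc[12]
print(row["Name"])

# An index list whose length differs from the number of rows is rejected
try:
    pd.DataFrame(data, index=[11, 12])
    mismatch_error = False
except ValueError:
    mismatch_error = True
print(mismatch_error)
```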

Set index of the DataFrame using existing columns

In Python, we can easily set any existing column or columns of a Pandas DataFrame object as its index in the following ways.

1. Set column as the index (without keeping the column)

In this method, we pass the column name to the set_index() method of the DataFrame. Because the drop parameter is True by default, the column is removed from the columns once it becomes the index. The set_index() method also has an optional inplace parameter, which is False by default: the method then returns a new DataFrame, so we assign the result back to df. Setting inplace to True would instead modify the existing DataFrame and return None. Let’s implement this through Python code.

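    # Import Pandas module
    import pandas as pd

    # Create a Python dictionary
    # (sample values matching the output shown later in this tutorial)
    data = {'Roll': [111, 112, 113, 114, 115],
            'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
            'Marks': [93, 88, 95, 75, 99],
            'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}

    # Create a DataFrame from the dictionary
    df = pd.DataFrame(data)
    print("\nThis is the initial DataFrame:")
    print(df)

    # Set the Roll column as the index
    # using set_index() function
    df = df.set_index('Roll')
    print("\nThis is the final DataFrame:")
    print(df)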

Set Column As Index
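The difference between the default behaviour and inplace=True can be seen in a small side-by-side sketch. The shortened sample data and the variable names (df_a, df_b, indexed, result) are only illustrative.

```python
import pandas as pd

# Shortened, hypothetical sample data for illustration
data = {"Roll": [111, 112, 113], "Name": ["Rajan", "Raman", "Deepak"]}

# Default (inplace=False): a new DataFrame is returned,
# and the original DataFrame keeps its Roll column
df_a = pd.DataFrame(data)
indexed = df_a.set_index("Roll")
print(list(df_a.columns))
print(indexed.index.name)

# inplace=True: df_b itself is modified and the call returns None
df_b = pd.DataFrame(data)
result = df_b.set_index("Roll", inplace=True)
print(result)
print(df_b.index.name)
```

Reassigning the result, as in the example above this sketch, is generally the easier style to read, since the returned DataFrame can be chained with further calls.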

2. Set column as the index (keeping the column)

In this method, we will make use of the drop parameter, which is an optional parameter of the set_index() method of the DataFrame. By default, the value of the drop parameter is True. Here we set it to False, so that the column which becomes the new index is also kept as a regular column of the DataFrame. Let’s implement this through Python code.

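    # Import Pandas module
    import pandas as pd

    # Create a Python dictionary
    # (sample values matching the output shown later in this tutorial)
    data = {'Roll': [111, 112, 113, 114, 115],
            'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
            'Marks': [93, 88, 95, 75, 99],
            'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}

    # Create a DataFrame from the dictionary
    df = pd.DataFrame(data)
    print("\nThis is the initial DataFrame:")
    print(df)

    # Set the Name column as the index
    # using set_index() function with drop
    df = df.set_index('Name', drop=False)
    print("\nThis is the final DataFrame:")
    print(df)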

Set Index Using Drop Parameter
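To make the effect of drop visible, the following sketch compares the columns that remain with drop=True (the default) and with drop=False. The shortened sample data and the names dropped and kept are only illustrative.

```python
import pandas as pd

# Shortened, hypothetical sample data for illustration
data = {"Name": ["Rajan", "Raman", "Deepak"], "Marks": [93, 88, 95]}
df = pd.DataFrame(data)

# drop=True (default): Name moves to the index and leaves the columns
dropped = df.set_index("Name")
print(list(dropped.columns))

# drop=False: Name is used as the index and also stays as a column
kept = df.set_index("Name", drop=False)
print(list(kept.columns))
```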

3. Set multiple columns as the index of the DataFrame

In this method, we set multiple columns of the Pandas DataFrame object as its index by creating a list of column names and passing it to the set_index() function. Because the resulting index has one level per column, it is called a multi-index (a Pandas MultiIndex). Let’s implement this through Python code.

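    # Import Pandas module
    import pandas as pd

    # Create a Python dictionary
    # (sample values matching the output shown later in this tutorial)
    data = {'Roll': [111, 112, 113, 114, 115],
            'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
            'Marks': [93, 88, 95, 75, 99],
            'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}

    # Create a DataFrame from the dictionary
    df = pd.DataFrame(data)
    print("\nThis is the initial DataFrame:")
    print(df)

    # Set the Roll & Name columns as the multi-index
    # using set_index() function and list of column names
    df = df.set_index(['Roll', 'Name'])
    print("\nThis is the final DataFrame:")
    print(df)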

Set Columns As Multi Index
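With a multi-index in place, a row is addressed by a tuple containing one label per level. The sketch below, using shortened hypothetical data, inspects the levels and looks up a single value.

```python
import pandas as pd

# Shortened, hypothetical sample data for illustration
data = {"Roll": [111, 112, 113],
        "Name": ["Rajan", "Raman", "Deepak"],
        "Marks": [93, 88, 95]}

df = pd.DataFrame(data).set_index(["Roll", "Name"])

# The index now has two named levels
print(df.index.nlevels)
print(list(df.index.names))

# Look up one value using a (Roll, Name) tuple as the row label
marks = df.loc[(112, "Raman"), "Marks"]
print(marks)
```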

Set index of the DataFrame using Python objects

In Python, we can set an object such as a Python list, a Python range, or a Pandas Series as the index of the Pandas DataFrame object in the following ways.

1. Python list as the index of the DataFrame

In this method, we set the index of the Pandas DataFrame object using the pd.Index() and set_index() functions. First, we create a Python list and pass it to the pd.Index() function, which returns a Pandas Index object. Then we pass the returned Index object to the set_index() function to set it as the new index of the DataFrame. Let’s implement this through Python code.

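    # Import Pandas module
    import pandas as pd

    # Create a Python dictionary
    # (sample values matching the output shown later in this tutorial)
    data = {'Roll': [111, 112, 113, 114, 115],
            'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
            'Marks': [93, 88, 95, 75, 99],
            'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}

    # Create a DataFrame from the dictionary
    df = pd.DataFrame(data)
    print("\nThis is the initial DataFrame:")
    print(df)

    # Create a Python list
    # (named labels so that the built-in list is not shadowed)
    labels = ['I', 'II', 'III', 'IV', 'V']

    # Create a Pandas Index object
    # using pd.Index() function
    idx = pd.Index(labels)

    # Set the above Index object as the index
    # using set_index() function
    df = df.set_index(idx)
    print("\nThis is the final DataFrame:")
    print(df)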

Set List As Index
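Wrapping the list in pd.Index() matters here. When set_index() receives a plain list of strings, it treats each string as a column name, and a KeyError is raised if those columns do not exist. The sketch below, using shortened hypothetical data, demonstrates both cases.

```python
import pandas as pd

# Shortened, hypothetical sample data for illustration
df = pd.DataFrame({"Name": ["Rajan", "Raman", "Deepak"],
                   "Marks": [93, 88, 95]})
labels = ["I", "II", "III"]

# A plain list of strings is read as a list of column names,
# and no column is called 'I', 'II' or 'III'
try:
    df.set_index(labels)
    raised = False
except KeyError:
    raised = True
print(raised)

# Wrapped in pd.Index(), the same values are used as the row labels
df_indexed = df.set_index(pd.Index(labels))
print(list(df_indexed.index))
```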

2. Python range as the index of the DataFrame

In this method, we set the index of the Pandas DataFrame object using the pd.Index(), range(), and set_index() functions. First, we create a sequence of numbers using the range() function and pass it to the pd.Index() function, which returns a Pandas Index object. Then we pass the returned Index object to the set_index() function to set it as the new index of the DataFrame. Let’s implement this through Python code.

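    # Import Pandas module
    import pandas as pd

    # Create a Python dictionary
    # (sample values matching the output shown later in this tutorial)
    data = {'Roll': [111, 112, 113, 114, 115],
            'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
            'Marks': [93, 88, 95, 75, 99],
            'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}

    # Create a DataFrame from the dictionary
    df = pd.DataFrame(data)
    print("\nThis is the initial DataFrame:")
    print(df)

    # Create a Pandas Index object
    # using pd.Index() & range() function
    idx = pd.Index(range(1, 6, 1))

    # Set the above Index object as the index
    # using set_index() function
    df = df.set_index(idx)
    print("\nThis is the final DataFrame:")
    print(df)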

Set Range As Index

3. Pandas Series as the index of the DataFrame

In this method, we set the index of the Pandas DataFrame object using the pd.Series() and set_index() functions. First, we create a Python list and pass it to the pd.Series() function, which returns a Pandas Series whose values can be used as the index. Then we pass the returned Pandas Series to the set_index() function to set it as the new index of the DataFrame. Let’s implement this through Python code.

    # Import Pandas module
    import pandas as pd

    # Create a Python dictionary
    # (sample values matching the output shown below)
    data = {'Roll': [111, 112, 113, 114, 115],
            'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
            'Marks': [93, 88, 95, 75, 99],
            'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}

    # Create a DataFrame from the dictionary
    df = pd.DataFrame(data)
    print("\nThis is the initial DataFrame:")
    print(df)

    # Create a Pandas series
    # using pd.Series() function & Python list
    series_idx = pd.Series([5, 4, 3, 2, 1])

    # Set the above Pandas series as the index
    # using set_index() function
    df = df.set_index(series_idx)
    print("\nThis is the final DataFrame:")
    print(df)

Output:

    This is the initial DataFrame:
       Roll    Name  Marks   City
    0   111   Rajan     93   Agra
    1   112   Raman     88   Pune
    2   113  Deepak     95  Delhi
    3   114   David     75  Sivan
    4   115  Shivam     99  Delhi

    This is the final DataFrame:
       Roll    Name  Marks   City
    5   111   Rajan     93   Agra
    4   112   Raman     88   Pune
    3   113  Deepak     95  Delhi
    2   114   David     75  Sivan
    1   115  Shivam     99  Delhi

4. Set index of the DataFrame keeping the old index

In this method, we will make use of the append parameter, which is an optional parameter of the set_index() method of the DataFrame. By default, the value of the append parameter is False, and the new index replaces the old one. Here we set it to True, so that the new index passed to set_index() is appended to the existing index as an additional level, which gives a multi-index. Let’s implement this through Python code.

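    # Import Pandas module
    import pandas as pd

    # Create a Python dictionary
    # (sample values matching the output shown earlier in this tutorial)
    data = {'Roll': [111, 112, 113, 114, 115],
            'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
            'Marks': [93, 88, 95, 75, 99],
            'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}

    # Create a DataFrame from the dictionary
    df = pd.DataFrame(data)
    print("\nThis is the initial DataFrame:")
    print(df)

    # Set Roll column as the index of the DataFrame
    # using set_index() function & append
    df = df.set_index('Roll', append=True)
    print("\nThis is the final DataFrame:")
    print(df)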

Set Index Using Append Parameter
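The resulting index can be inspected to confirm that the old index was kept as the first level. This is a minimal sketch with shortened hypothetical data; the names appended and replaced are only illustrative.

```python
import pandas as pd

# Shortened, hypothetical sample data for illustration
data = {"Roll": [111, 112, 113], "Name": ["Rajan", "Raman", "Deepak"]}
df = pd.DataFrame(data)

# append=True: the old 0..n-1 index stays as the first (unnamed) level
# and Roll becomes the second level
appended = df.set_index("Roll", append=True)
print(appended.index.nlevels)
print(list(appended.index.names))
print(appended.index[0])

# append=False (default): the old index is replaced by Roll
replaced = df.set_index("Roll")
print(replaced.index.nlevels)
```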

Conclusion

In this tutorial we have learned the following things:

  • What is the index of a Pandas DataFrame object?
  • How to set index while creating a DataFrame?
  • How to set existing columns of DataFrame as index or multi-index?
  • How to set the Python objects like list, range, or Pandas series as index?
  • How to set new index keeping the older one?
