- Pandas DataFrame Indexing: Set the Index of a Pandas Dataframe
- What do we mean by indexing of a Pandas Dataframe?
- Set index of the DataFrame while creating
- Set index of the DataFrame using existing columns
- 1. Set column as the index (without keeping the column)
- 2. Set column as the index (keeping the column)
- 3. Set multiple columns as the index of the DataFrame
- Set index of the DataFrame using Python objects
- 1. Python list as the index of the DataFrame
- 2. Python range as the index of the DataFrame
- 3. Pandas series as the index of the DataFrame
- 4. Set index of the DataFrame keeping the old index
- Conclusion
Pandas DataFrame Indexing: Set the Index of a Pandas Dataframe
Hello Readers! In this tutorial, we are going to discuss the different ways to set the index of a Pandas DataFrame object in Python.
What do we mean by indexing of a Pandas Dataframe?
In Python, when we create a Pandas DataFrame object using the pd.DataFrame() function defined in the Pandas module, an address in the form of row indices and column indices is generated automatically (by default) for each data element/point in the DataFrame. This addressing is what we mean by indexing.
Strictly speaking, the row indices are called the index of the DataFrame, and the column indices are simply called columns. The index of a Pandas DataFrame object labels its rows. Pandas does allow duplicate labels, but the index is usually chosen so that each label identifies a single row. Let’s start our core discussion about the different ways to set the index of a Pandas DataFrame object in Python.
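As a quick illustration of the default index, the following minimal sketch builds a small DataFrame without specifying an index and inspects its row and column labels. The two-row dictionary is a cut-down sample used only for this illustration.

```python
import pandas as pd

# Small sample dictionary (illustrative data only)
df = pd.DataFrame({"Name": ["Rajan", "Raman"], "Marks": [93, 88]})

# Row labels (the index) default to 0, 1, 2, ...
row_labels = list(df.index)
# Column labels come from the dictionary keys
column_labels = list(df.columns)

print(df.index)
print(row_labels)
print(column_labels)
```

The attributes df.index and df.columns are the two sets of labels referred to above.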
Set index of the DataFrame while creating
In Python, we can set the index of the DataFrame while creating it using the index parameter. In this method, we create a Python list with one label per row and pass it to the index parameter of the pd.DataFrame() function to set its index. Let’s implement this through Python code.
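# Import Pandas module
import pandas as pd

# Create a Python dictionary
# (sample values match the output shown later in this tutorial)
data = {
    'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
    'Marks': [93, 88, 95, 75, 99],
    'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']
}

# Create a Python list of Roll NOs
Roll = [11, 12, 13, 14, 15]

# Create a DataFrame from the dictionary
# and set the Roll list as the index
# using DataFrame() function with index parameter
df = pd.DataFrame(data, index = Roll)
print(df)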
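The index created this way has no name. If we also want the index to carry a name, one option is to wrap the list in pd.Index() with its name argument, as in this minimal sketch. The three-row data and the variable name roll are chosen only for illustration.

```python
import pandas as pd

# Illustrative sample data
data = {"Name": ["Rajan", "Raman", "Deepak"], "Marks": [93, 88, 95]}

# Build a named Index object and pass it to the index parameter
roll = pd.Index([11, 12, 13], name="Roll")
df = pd.DataFrame(data, index=roll)

print(df)
print(df.index.name)
```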
Set index of the DataFrame using existing columns
In Python, we can easily set any existing column or columns of a Pandas DataFrame object as its index in the following ways.
1. Set column as the index (without keeping the column)
In this method, we pass the column name to the set_index() method of the DataFrame. Its optional drop parameter is True by default, so the column that becomes the new index is removed from the columns of the DataFrame. The set_index() method also has an optional inplace parameter whose default value is False, which means it returns a new DataFrame and leaves the original unchanged. That is why we assign the result back to df here. Passing inplace = True instead would replace the old index of df directly. Let’s implement this through Python code.
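# Import Pandas module
import pandas as pd

# Create a Python dictionary
# (sample values match the output shown later in this tutorial)
data = {
    'Roll': [111, 112, 113, 114, 115],
    'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
    'Marks': [93, 88, 95, 75, 99],
    'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']
}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)

# Set the Roll column as the index
# using set_index() function
df = df.set_index('Roll')
print("\nThis is the final DataFrame:")
print(df)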
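For comparison, here is a minimal sketch of the inplace variant of the same operation. The three-row data is illustrative only. Note that with inplace=True the method returns None, so the result should not be assigned back to df.

```python
import pandas as pd

# Illustrative sample data
df = pd.DataFrame({
    "Roll": [111, 112, 113],
    "Name": ["Rajan", "Raman", "Deepak"],
})

# With inplace=True the DataFrame is modified directly
# and the method itself returns None
result = df.set_index("Roll", inplace=True)

print(result)
print(df)
```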
2. Set column as the index (keeping the column)
In this method, we will make use of the drop parameter, which is an optional parameter of the set_index() method of the DataFrame. By default, the value of the drop parameter is True. Here we will set it to False, so that the column which has been set as the new index is not dropped from the DataFrame. Let’s implement this through Python code.
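# Import Pandas module
import pandas as pd

# Create a Python dictionary
# (sample values match the output shown later in this tutorial)
data = {
    'Roll': [111, 112, 113, 114, 115],
    'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
    'Marks': [93, 88, 95, 75, 99],
    'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']
}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)

# Set the Name column as the index
# using set_index() function with drop
df = df.set_index('Name', drop = False)
print("\nThis is the final DataFrame:")
print(df)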
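To see the effect of the drop parameter side by side, this minimal sketch compares the remaining columns for both settings. The three-row data and the variable names dropped and kept are illustrative only.

```python
import pandas as pd

# Illustrative sample data
df = pd.DataFrame({
    "Roll": [111, 112, 113],
    "Name": ["Rajan", "Raman", "Deepak"],
    "Marks": [93, 88, 95],
})

# Default behaviour: the Name column is removed from the columns
dropped = df.set_index("Name")
# drop=False: Name is used as the index and also stays as a column
kept = df.set_index("Name", drop=False)

print(list(dropped.columns))
print(list(kept.columns))
```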
3. Set multiple columns as the index of the DataFrame
In this method, we can set multiple columns of the Pandas DataFrame object as its index by creating a list of column names of the DataFrame and then passing it to the set_index() function. The resulting index has one level per column, which is why it is called a multi-index (MultiIndex in Pandas). Let’s implement this through Python code.
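# Import Pandas module
import pandas as pd

# Create a Python dictionary
# (sample values match the output shown later in this tutorial)
data = {
    'Roll': [111, 112, 113, 114, 115],
    'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
    'Marks': [93, 88, 95, 75, 99],
    'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']
}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)

# Set the Roll & Name column as the multi-index
# using set_index() function and list of column names
df = df.set_index(['Roll', 'Name'])
print("\nThis is the final DataFrame:")
print(df)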
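Once a multi-index is set, a row is addressed by a tuple with one label per level. The following minimal sketch, using illustrative three-row data, inspects the levels and looks up a single value with .loc.

```python
import pandas as pd

# Illustrative sample data
df = pd.DataFrame({
    "Roll": [111, 112, 113],
    "Name": ["Rajan", "Raman", "Deepak"],
    "Marks": [93, 88, 95],
})
df = df.set_index(["Roll", "Name"])

# The index now has two levels, named after the columns
level_names = list(df.index.names)
level_count = df.index.nlevels

# A row is addressed by a (Roll, Name) tuple
marks = df.loc[(113, "Deepak"), "Marks"]

print(level_names, level_count)
print(marks)
```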
Set index of the DataFrame using Python objects
In Python, we can also set an object that is not a column of the DataFrame, such as a Python list, a Python range, or a Pandas series, as the index of the Pandas DataFrame object in the following ways.
1. Python list as the index of the DataFrame
In this method, we can set the index of the Pandas DataFrame object using the pd.Index() and set_index() functions. First, we will create a Python list with one label per row, then pass it to the pd.Index() function, which returns a Pandas Index object. Then we pass the returned Index object to the set_index() function to set it as the new index of the DataFrame. The list is wrapped in pd.Index() because set_index() treats a plain Python list as a list of column names. Let’s implement this through Python code.
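# Import Pandas module
import pandas as pd

# Create a Python dictionary
# (sample values match the output shown later in this tutorial)
data = {
    'Roll': [111, 112, 113, 114, 115],
    'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
    'Marks': [93, 88, 95, 75, 99],
    'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']
}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)

# Create a Python list
# (named labels so that the built-in list is not shadowed)
labels = ['I', 'II', 'III', 'IV', 'V']

# Create a DataFrame index object
# using pd.Index() function
idx = pd.Index(labels)

# Set the above DataFrame index object as the index
# using set_index() function
df = df.set_index(idx)
print("\nThis is the final DataFrame:")
print(df)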
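The following minimal sketch shows why the list is wrapped in pd.Index(): passed directly, the list is looked up as column names and a KeyError is raised. It also shows direct assignment to df.index as one alternative. The three-row data and the variable names are illustrative only.

```python
import pandas as pd

# Illustrative sample data
df = pd.DataFrame({"Name": ["Rajan", "Raman", "Deepak"], "Marks": [93, 88, 95]})
labels = ["I", "II", "III"]

# A plain list is interpreted as column names, so this fails
try:
    df.set_index(labels)
    raw_list_worked = True
except KeyError as err:
    raw_list_worked = False
    print("KeyError:", err)

# Wrapping the list in pd.Index() sets the labels as the index
by_index_object = df.set_index(pd.Index(labels))

# Alternative: assign the labels to the index attribute of a copy
by_assignment = df.copy()
by_assignment.index = labels

print(by_index_object)
```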
2. Python range as the index of the DataFrame
In this method, we can set the index of the Pandas DataFrame object using the pd.Index(), range(), and set_index() functions. First, we will create a Python sequence of numbers using the range() function, then pass it to the pd.Index() function, which returns a Pandas Index object. Then we pass the returned Index object to the set_index() function to set it as the new index of the DataFrame. The range must produce exactly one number per row. Let’s implement this through Python code.
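# Import Pandas module
import pandas as pd

# Create a Python dictionary
# (sample values match the output shown later in this tutorial)
data = {
    'Roll': [111, 112, 113, 114, 115],
    'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
    'Marks': [93, 88, 95, 75, 99],
    'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']
}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)

# Create a DataFrame index object
# using pd.Index() & range() function
idx = pd.Index(range(1, 6, 1))

# Set the above DataFrame index object as the index
# using set_index() function
df = df.set_index(idx)
print("\nThis is the final DataFrame:")
print(df)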
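Instead of hard-coding the end of the range, the range can be derived from the number of rows. This is a minimal sketch with illustrative three-row data. It also shows that pd.Index() turns a range into a RangeIndex.

```python
import pandas as pd

# Illustrative sample data
df = pd.DataFrame({"Name": ["Rajan", "Raman", "Deepak"], "Marks": [93, 88, 95]})

# Build a 1-based index whose length always matches the DataFrame
idx = pd.Index(range(1, len(df) + 1))
df = df.set_index(idx)

print(type(idx).__name__)
print(df)
```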
3. Pandas series as the index of the DataFrame
In this method, we can set the index of the Pandas DataFrame object using the pd.Series() and set_index() functions. First, we will create a Python list and pass it to the pd.Series() function, which returns a Pandas series that can be used as the index of the DataFrame. Then we pass the returned Pandas series to the set_index() function to set it as the new index of the DataFrame. Let’s implement this through Python code.
# Import Pandas module
import pandas as pd

# Create a Python dictionary
data = {
    'Roll': [111, 112, 113, 114, 115],
    'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
    'Marks': [93, 88, 95, 75, 99],
    'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']
}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)

# Create a Pandas series
# using pd.Series() function & Python list
series_idx = pd.Series([5, 4, 3, 2, 1])

# Set the above Pandas series as the index
# using set_index() function
df = df.set_index(series_idx)
print("\nThis is the final DataFrame:")
print(df)
Output:

This is the initial DataFrame:
   Roll    Name  Marks   City
0   111   Rajan     93   Agra
1   112   Raman     88   Pune
2   113  Deepak     95  Delhi
3   114   David     75  Sivan
4   115  Shivam     99  Delhi

This is the final DataFrame:
   Roll    Name  Marks   City
5   111   Rajan     93   Agra
4   112   Raman     88   Pune
3   113  Deepak     95  Delhi
2   114   David     75  Sivan
1   115  Shivam     99  Delhi
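Two details of this method are worth checking: the values of the series are matched to the rows by position, not by the labels of the series itself, and the name of the series becomes the name of the index. The following minimal sketch illustrates both. The three-row data, the labels x, y, z, and the name Rank are chosen only for this illustration.

```python
import pandas as pd

# Illustrative sample data
df = pd.DataFrame({"Name": ["Rajan", "Raman", "Deepak"], "Marks": [93, 88, 95]})

# A named series whose own labels differ from the DataFrame's index
ranks = pd.Series([3, 2, 1], index=["x", "y", "z"], name="Rank")

# The series values are applied row by row, in order
ranked = df.set_index(ranks)

print(ranked)
print(ranked.index.name)
```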
4. Set index of the DataFrame keeping the old index
In this method, we will make use of the append parameter, which is an optional parameter of the set_index() method of the DataFrame. By default, the value of the append parameter is False. Here we will set it to True, so that the new index passed to the set_index() function is appended to the old index of the DataFrame instead of replacing it. The result is a multi-index whose first level is the old index. Let’s implement this through Python code.
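# Import Pandas module
import pandas as pd

# Create a Python dictionary
# (sample values match the output shown earlier in this tutorial)
data = {
    'Roll': [111, 112, 113, 114, 115],
    'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
    'Marks': [93, 88, 95, 75, 99],
    'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']
}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)

# Set Roll column as the index of the DataFrame
# using set_index() function & append
df = df.set_index('Roll', append = True)
print("\nThis is the final DataFrame:")
print(df)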
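To confirm what append = True produces, this minimal sketch inspects the levels of the resulting index and looks up a row by its old and new labels together. The three-row data is illustrative only.

```python
import pandas as pd

# Illustrative sample data
df = pd.DataFrame({
    "Roll": [111, 112, 113],
    "Name": ["Rajan", "Raman", "Deepak"],
})
df = df.set_index("Roll", append=True)

# The old (unnamed) index is the first level, Roll is the second
level_names = list(df.index.names)
level_count = df.index.nlevels

# A row is addressed by (old label, Roll)
name = df.loc[(2, 113), "Name"]

print(level_names, level_count)
print(name)
```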
Conclusion
In this tutorial we have learned the following things:
- What is the index of a Pandas DataFrame object?
- How to set index while creating a DataFrame?
- How to set existing columns of DataFrame as index or multi-index?
- How to set the Python objects like list, range, or Pandas series as index?
- How to set new index keeping the older one?