pandas map() Function – Examples
The pandas Series.map() function substitutes each value in a Series with another value, which may be derived from a function, a dict, or another Series. Since each DataFrame column is a Series, you can apply map() to a column and assign the result back to the DataFrame.
A pandas Series is a one-dimensional array-like object containing a sequence of values. Each of these values is associated with a label called the index. You can create a Series from an array-like object (e.g., a list) or a dictionary.
- Series.map() is defined on Series. DataFrame has a separate element-wise method, DataFrame.map(), which was named applymap() before pandas 2.1.0.
- map() accepts a dict, a Series, or a callable.
- You can use it to operate on a specific column of a DataFrame, since each column in a DataFrame is a Series.
- When passed a dictionary or Series, map() looks up each element among the keys of that dictionary (or the index of that Series). Elements with no matching key become NaN in the output.
- Series.map() operates on one element at a time.
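The points above can be seen on a small Series. This is a minimal sketch; the Series values and the `codes` dictionary are made up for illustration and do not come from the article.

```python
import pandas as pd

# Hypothetical data for illustration only
s = pd.Series(["a", "b", "c"])

# Callable: applied to one element at a time
upper = s.map(str.upper)

# dict: each element is looked up by key; "c" has no key, so it becomes NaN
codes = {"a": 1, "b": 2}
mapped = s.map(codes)

print(upper.tolist())
print(mapped.tolist())
```

Because one element has no match, the mapped result holds NaN and its integers are shown as floats.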
1. Syntax of pandas map()
The syntax of the pandas map() function is Series.map(arg, na_action=None). It accepts arg and na_action as parameters and returns a Series.
- arg – Accepts a function, dict, or Series.
- na_action – Accepts 'ignore' or None. The default is None.
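When arg is a Series, its index plays the role that keys play in a dictionary. The following sketch illustrates this with made-up values; `lookup` is a hypothetical name.

```python
import pandas as pd

s = pd.Series(["cat", "dog", "cat"])

# Hypothetical lookup Series: index = old value, data = new value
lookup = pd.Series({"cat": "feline", "dog": "canine"})

result = s.map(lookup)
print(result.tolist())
```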
Let’s create a DataFrame and use map() to update one of its columns.
import pandas as pd
# technologies is a dict of column name -> list of values (its definition is not shown here)
df = pd.DataFrame(technologies)
print(df)
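Since the contents of `technologies` are not shown, here is one possible stand-in so the later examples can be tried. The column names Fee and Duration come from the article; every value below is an assumption, including the NaN placed at index 3.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the article's `technologies` dict
technologies = {
    "Fee": [20000.0, 25000.0, 22000.0, np.nan],
    "Duration": ["30days", "40days", "35days", "60days"],
}

df = pd.DataFrame(technologies)
print(df)
```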
2. Series.map() Example
Series.map() is called on a Series, so with a DataFrame you apply it to one column at a time. Every column in a DataFrame is a Series; for example, df['Fee'] returns a Series object. Let’s see how to apply map() to one of the DataFrame columns and assign the result back to the DataFrame.
This example subtracts 10% from each value in the Fee column.
You can also pass a lambda instead of a named function; the result is the same.
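A short sketch of both forms follows. The lambda expression is the one used in the complete example of this article; the Fee values and the helper name `discount` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical Fee values
df = pd.DataFrame({"Fee": [20000.0, 25000.0, 22000.0]})

def discount(x):
    """Return x reduced by 10%."""
    return x - (x * 10 / 100)

# Named function and lambda give the same Series
by_function = df["Fee"].map(discount)
by_lambda = df["Fee"].map(lambda x: x - (x * 10 / 100))

# Assign the mapped column back to the DataFrame
df["Fee"] = by_function
print(df)
```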
3. Handling NaN by using na_action param
The na_action parameter controls how NaN values are handled. The default is None, in which case NaN values are passed to the mapping function, which may produce incorrect results. With 'ignore', NaN values are not passed to the function and remain NaN in the output.
Mapping the Fee column with '{} RS'.format and the default na_action turns the value at index 3 into 'nan RS', which doesn’t make sense.
Now let’s use na_action='ignore'. This skips the NaN value and leaves it unchanged.
df['Fee'] = df['Fee'].map('{} RS'.format, na_action='ignore')
print(df)
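To compare the two settings side by side, here is a minimal sketch on a two-element Series. The values are made up; the `'{} RS'.format` mapping is the one used in the article.

```python
import numpy as np
import pandas as pd

# Hypothetical Fee values, one of them missing
fee = pd.Series([20000.0, np.nan])

# Default (na_action=None): NaN is passed to the function and formatted as text
default = fee.map("{} RS".format)

# na_action='ignore': NaN is skipped and stays NaN
ignored = fee.map("{} RS".format, na_action="ignore")

print(default.tolist())
print(ignored.tolist())
```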
4. Using map() with Dictionary
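Alternatively, you can pass a dictionary as the mapping. Each value in the column is replaced by the dictionary value stored under that key.
# dict_map is a dict of old value -> new value (its definition is not shown here)
updateSer = df['Duration'].map(dict_map)
df['Duration'] = updateSer
print(df)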
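The contents of `dict_map` are not shown, so the sketch below uses a hypothetical dictionary and made-up Duration values. It also shows one way to keep values that have no key, by falling back to the original column with fillna().

```python
import pandas as pd

# Hypothetical Duration values
df = pd.DataFrame({"Duration": ["30days", "40days", "35days"]})

# Hypothetical mapping; "35days" is deliberately left out
dict_map = {"30days": "30 Days", "40days": "40 Days"}

# Values without a key become NaN
mapped = df["Duration"].map(dict_map)

# Optional: keep the original value where no key matched
kept = mapped.fillna(df["Duration"])

print(mapped.tolist())
print(kept.tolist())
```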
5. Complete Example of pandas map() Function
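import pandas as pd

# technologies is a dict of column name -> list of values (its definition is not shown here)
df = pd.DataFrame(technologies)
print(df)

# Using a lambda function
df['Fee'] = df['Fee'].map(lambda x: x - (x*10/100))
print(df)

# Using a custom function
def fun1(x):
    return x/100

ser = df['Fee'].map(fun1)
print(ser)

# Let's add the currency to the Fee (NaN becomes 'nan RS')
df['Fee'] = df['Fee'].map('{} RS'.format)
print(df)

# Same mapping with na_action='ignore'; use it in place of the call above so NaN stays NaN
df['Fee'] = df['Fee'].map('{} RS'.format, na_action='ignore')
print(df)

# Using a dictionary for mapping
# dict_map is a dict of old value -> new value (its definition is not shown here)
updateSer = df['Duration'].map(dict_map)
df['Duration'] = updateSer
print(df)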
Conclusion
In this article, I have explained that map() is a Series function that substitutes each value in a Series with another value and returns a Series object. Since a DataFrame is a collection of Series, you can use map() to update a DataFrame column.
pandas.DataFrame.map
New in version 2.1.0: DataFrame.applymap was deprecated and renamed to DataFrame.map.
This method applies a function that accepts and returns a scalar to every element of a DataFrame.
Parameters:
- func – callable. Python function, returns a single value from a single value.
- na_action – {None, 'ignore'}, default None. If 'ignore', propagate NaN values, without passing them to func.
- **kwargs – Additional keyword arguments to pass as keyword arguments to func.
See also:
- DataFrame.apply – Apply a function along input axis of DataFrame.
- DataFrame.replace – Replace values given in to_replace with value.
- Series.map – Apply a function elementwise on a Series.
>>> df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])
>>> df
       0      1
0  1.000  2.120
1  3.356  4.567
>>> df.map(lambda x: len(str(x)))
   0  1
0  3  4
1  5  5
Like Series.map, NA values can be ignored:
>>> df_copy = df.copy()
>>> df_copy.iloc[0, 0] = pd.NA
>>> df_copy.map(lambda x: len(str(x)), na_action='ignore')
     0  1
0  NaN  4
1  5.0  5
Note that a vectorized version of func often exists, which will be much faster. You could square each number elementwise.
>>> df.map(lambda x: x**2)
           0          1
0   1.000000   4.494400
1  11.262736  20.857489
But it’s better to avoid map in that case.
>>> df ** 2
           0          1
0   1.000000   4.494400
1  11.262736  20.857489
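The keyword-argument forwarding described in the parameter list can be sketched as follows. This is an illustration under assumptions: the `elementwise` name is hypothetical, the fallback to applymap is only there so the snippet also runs on pandas versions older than 2.1.0, and on very old versions applymap may not accept extra keyword arguments.

```python
import pandas as pd

df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])

# DataFrame.map exists from pandas 2.1.0; earlier versions only have applymap
elementwise = df.map if hasattr(df, "map") else df.applymap

# Extra keyword arguments (here ndigits) are forwarded to func (here round)
rounded = elementwise(round, ndigits=1)
print(rounded)
```

For plain rounding, the vectorized df.round(1) is the better choice, in line with the note above about preferring vectorized operations.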