Python map for DataFrame

pandas map() Function – Examples

The pandas Series.map() function substitutes each value in a Series with another value, which may be derived from a function, a dict, or another Series. Since each DataFrame column is a Series, you can use map() to transform a column and assign the result back to the DataFrame.

A pandas Series is a one-dimensional array-like object containing a sequence of values. Each value is associated with a label, and the labels together form the index. You can create a Series from an array-like object (e.g., a list) or from a dictionary.
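As a minimal sketch of the two construction routes mentioned above (the values and labels are made up for illustration):

```python
import pandas as pd

# From a list: pandas assigns a default integer index (0, 1, 2)
from_list = pd.Series([10, 20, 30])

# From a dict: the keys become the index labels
from_dict = pd.Series({"a": 10, "b": 20, "c": 30})

print(from_list)
print(from_dict)
```

In both cases each value can be retrieved through its label, for example from_dict["b"].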

  • The map() described here is a Series method. DataFrame has its own element-wise DataFrame.map() only from pandas 2.1.0 onward (previously applymap()).
  • map() accepts a dict, a Series, or a callable.
  • Because each column of a DataFrame is a Series, you can use map() to operate on a specific column.
  • When passed a dict or a Series, map() looks up each element among the keys of the dict (or the index of the Series). Elements with no match become NaN in the output.
  • Series.map() operates on one element at a time.
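The three kinds of argument listed above can be sketched side by side. The animal names and the lookup table are hypothetical and serve only as an illustration:

```python
import pandas as pd

s = pd.Series(["cat", "dog", "cow"])

# Callable: called once per element
lengths = s.map(len)

# dict: each element is looked up as a key; "cow" has no key, so it becomes NaN
by_dict = s.map({"cat": "kitten", "dog": "puppy"})

# Series: each element is looked up among the index labels of the other Series
lookup = pd.Series({"cat": "kitten", "dog": "puppy"})
by_series = s.map(lookup)

print(lengths)
print(by_dict)
print(by_series)
```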

1. Syntax of pandas map()

The syntax of the pandas map() function is Series.map(arg, na_action=None). It accepts arg and na_action as parameters and returns a Series.

  • arg – A function, dict, or Series.
  • na_action – Either None or 'ignore'. The default is None.

Let’s create a DataFrame and use map() to update one of its columns.

    import pandas as pd

    # technologies is a dict of column name -> list of values (its definition is not shown here)
    df = pd.DataFrame(technologies)
    print(df)
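The snippet above relies on a technologies dictionary whose definition does not appear in the text. The sketch below uses hypothetical values: the column names Fee and Duration come from the later examples, but the numbers and strings are assumptions. The NaN at index 3 of Fee matches the remark made in the na_action section.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the original values are not shown in the text
technologies = {
    "Fee": [20000, 25000, 26000, np.nan],
    "Duration": ["30days", "40days", "35days", "50days"],
}

df = pd.DataFrame(technologies)
print(df)
```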

2. Series.map() Example

Series.map() works on a single column of a pandas DataFrame, because every DataFrame column is a Series. For example, df['Fee'] returns a Series object. Let’s apply map() to one of the DataFrame columns and assign the result back to the DataFrame.


This example subtracts 10% from each value in the Fee column by mapping a function over it.

You can also pass a lambda, for example df['Fee'].map(lambda x: x - (x*10/100)). This yields the same output as a named function.
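A minimal sketch of both variants, a named function and a lambda. The fee values and the function name discount are hypothetical; the formula is the one used in the complete example further down.

```python
import pandas as pd

# Hypothetical fees, chosen so that the arithmetic is easy to follow
df = pd.DataFrame({"Fee": [20000.0, 25000.0, 26000.0]})

def discount(x):
    """Return x reduced by 10%."""
    return x - (x * 10 / 100)

# Named function and lambda produce the same Series
with_function = df["Fee"].map(discount)
with_lambda = df["Fee"].map(lambda x: x - (x * 10 / 100))

# Assign the result back to the DataFrame column
df["Fee"] = with_lambda
print(df)
```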

3. Handling NaN by using na_action param

The na_action parameter controls how NaN values are handled. The default is None, in which case NaN values are passed to the mapping function like any other value, which may produce incorrect results. With 'ignore', NaN values are propagated to the output without being passed to the mapping function.

With the default, df['Fee'].map('{} RS'.format) turns the NaN at index 3 of the Fee column into 'nan RS', which doesn’t make sense.
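The default behaviour can be reproduced with a small stand-alone Series. The fee values are hypothetical, and the NaN is placed at index 3 to match the description above:

```python
import numpy as np
import pandas as pd

fee = pd.Series([20000.0, 25000.0, 26000.0, np.nan])

# Default na_action=None: NaN is passed to the formatter like any other value
labelled = fee.map("{} RS".format)
print(labelled)
```

The last element is the string 'nan RS' rather than a missing value, so later checks such as isna() no longer detect it.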

Now let’s use na_action='ignore'. This leaves NaN values unchanged instead of passing them to the function.

    df['Fee'] = df['Fee'].map('{} RS'.format, na_action='ignore')
    print(df)

4. Using map() with Dictionary

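Alternatively, you can pass a dictionary as the mapping. Here dict_map is a dict whose keys are existing Duration values and whose values are their replacements.

    # dict_map: existing Duration value -> replacement value
    updateSer = df['Duration'].map(dict_map)
    df['Duration'] = updateSer
    print(df)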
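The contents of dict_map are not shown above, so the following sketch uses a hypothetical mapping. One Duration value is deliberately left out of the dictionary to show that unmatched elements become NaN.

```python
import pandas as pd

# Hypothetical Duration values
df = pd.DataFrame({"Duration": ["30days", "40days", "35days", "50days"]})

# Hypothetical mapping; "50days" is left out on purpose
dict_map = {"30days": "1 month", "40days": "6 weeks", "35days": "5 weeks"}

df["Duration"] = df["Duration"].map(dict_map)
print(df)
```

If unmatched values should be kept rather than replaced by NaN, one option is to pass a function such as lambda x: dict_map.get(x, x) instead of the dictionary itself.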

5. Complete Example of pandas map() Function

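    import numpy as np
    import pandas as pd

    # Illustrative data: the original values are not shown in this article
    technologies = {
        'Fee': [20000, 25000, 26000, np.nan],
        'Duration': ['30days', '40days', '35days', '50days'],
    }
    df = pd.DataFrame(technologies)
    print(df)

    # Using a lambda function
    df['Fee'] = df['Fee'].map(lambda x: x - (x*10/100))
    print(df)

    # Using a custom function
    def fun1(x):
        return x/100

    ser = df['Fee'].map(fun1)
    print(ser)

    # Add the currency to Fee; with the default na_action, NaN becomes 'nan RS'
    print(df['Fee'].map('{} RS'.format))

    # Same mapping, but NaN values are skipped
    df['Fee'] = df['Fee'].map('{} RS'.format, na_action='ignore')
    print(df)

    # Using a dictionary for mapping (illustrative entries; '35days' has no key and becomes NaN)
    dict_map = {'30days': '35 Days', '40days': '45 Days', '50days': '55 Days'}
    updateSer = df['Duration'].map(dict_map)
    df['Duration'] = updateSer
    print(df)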

Conclusion

In this article, I have explained that map() is a Series method that substitutes each value in a Series with another value and returns a Series. Since a DataFrame is a collection of Series, you can use map() to update DataFrame columns.


pandas.DataFrame.map

New in version 2.1.0: DataFrame.applymap was deprecated and renamed to DataFrame.map.

This method applies a function that accepts and returns a scalar to every element of a DataFrame.

Parameters:

  • func – callable. Python function, returns a single value from a single value.
  • na_action – {None, 'ignore'}, default None. If 'ignore', propagate NaN values, without passing them to func.
  • **kwargs – Additional keyword arguments to pass as keyword arguments to func.

See also:

  • DataFrame.apply – Apply a function along input axis of DataFrame.
  • DataFrame.replace – Replace values given in to_replace with value.
  • Series.map – Apply a function elementwise on a Series.
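The forwarding of extra keyword arguments to func can be sketched as follows. The hasattr fallback to applymap is an addition for readers on pandas older than 2.1.0 and is not part of the documented API description; the name elementwise is hypothetical.

```python
import pandas as pd

df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])

# DataFrame.map exists from pandas 2.1.0; older releases only have applymap
elementwise = df.map if hasattr(df, "map") else df.applymap

# ndigits is forwarded to round() for every element
rounded = elementwise(round, ndigits=1)
print(rounded)
```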

>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])
>>> df
       0      1
0  1.000  2.120
1  3.356  4.567

>>> df.map(lambda x: len(str(x)))
   0  1
0  3  4
1  5  5

Like Series.map, NA values can be ignored:

>>> df_copy = df.copy()
>>> df_copy.iloc[0, 0] = pd.NA
>>> df_copy.map(lambda x: len(str(x)), na_action='ignore')
     0  1
0  NaN  4
1  5.0  5

Note that a vectorized version of func often exists, which will be much faster. You could square each number elementwise.

>>> df.map(lambda x: x**2)
           0          1
0   1.000000   4.494400
1  11.262736  20.857489

But it’s better to avoid map in that case.

>>> df ** 2
           0          1
0   1.000000   4.494400
1  11.262736  20.857489
