Apply function to dataframe column python

Contents

Pandas apply() Function to Single & Multiple Column(s)
1. Quick Examples of pandas Apply Function to a Column
2. pandas.DataFrame.apply() Function Syntax
3. Pandas Apply Function to Single Column
4. Pandas Apply Function to All Columns
5. Pandas Apply Function to Multiple List of Columns
6. Apply Lambda Function to Each Column
7. Apply Lambda Function to Single Column
8. Using pandas.DataFrame.transform() to Apply Function Column
9. Using pandas.DataFrame.map() to Single Column
10. DataFrame.assign() to Apply Lambda Function
11. Using Numpy function on single Column
12. Using NumPy.square() Method
13. Multiple columns Using NumPy.square() and Lambda Function
Conclusion
References
pandas.DataFrame.apply

Pandas apply() Function to Single & Multiple Column(s)

With the pandas.DataFrame.apply() method you can run a function on a single column, on all columns, or on a list of several columns (two or more). In this article, I will cover how to apply() a function to the values of a selected single column, multiple columns, or all columns. For example, say we have three columns and would like to apply a function to one column without touching the other two, returning a DataFrame that still has three columns.

1. Quick Examples of pandas Apply Function to a Column

If you are in a hurry, below are some quick examples of how to apply a function to a single column and to multiple columns (two or more) in a pandas DataFrame.

    # Below are some quick examples

    # Using DataFrame.apply() to apply a function to every column
    def add_3(x):
        return x+3
    df2 = df.apply(add_3)

    # Using apply() on a single column
    def add_4(x):
        return x+4
    df["B"] = df["B"].apply(add_4)

    # Apply to multiple columns
    df[['A','B']] = df[['A','B']].apply(add_3)

    # Apply a lambda function to each column
    df2 = df.apply(lambda x : x + 10)

    # Using apply() and a lambda function on a single column
    df["A"] = df["A"].apply(lambda x: x-2)

    # Using apply() & [] operator with a NumPy function
    df['A'] = df['A'].apply(np.square)

    # Using numpy.square() and [] operator
    df['A'] = np.square(df['A'])

    # Apply numpy.square() to square the values of two columns, 'A' and 'B'
    df2 = df.apply(lambda x: np.square(x) if x.name in ['A','B'] else x)

    # Apply a function to all columns using transform()
    def add_2(x):
        return x+2
    df = df.transform(add_2)

    # Using map() on a single column (this is Series.map())
    df['A'] = df['A'].map(lambda A: A/2.)

    # Using DataFrame.assign() and a lambda
    df2 = df.assign(B=lambda df: df.B/2)

2. pandas.DataFrame.apply() Function Syntax

If you are new to apply(), let's look at its syntax and then run some examples that apply it to a single column, multiple columns, and all columns. Our DataFrame contains the columns A, B, and C.


Below is the syntax of pandas.DataFrame.apply().

    # Syntax of apply() function
    DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)

Let’s create a sample DataFrame to work with some examples.

    import pandas as pd
    import numpy as np
    data = [(3,5,7), (2,4,6),(5,8,9)]
    df = pd.DataFrame(data, columns = ['A','B','C'])
    print(df)

    # Output:
       A  B  C
    0  3  5  7
    1  2  4  6
    2  5  8  9

3. Pandas Apply Function to Single Column

We will create a function add_4() that adds 4 to each value and pass it to apply(). To apply it to a single column, select the column by name using df["col_name"]. The example below applies the function to column B.

    # Using apply() on a single column
    def add_4(x):
        return x+4
    df["B"] = df["B"].apply(add_4)
    print(df)

This yields the output below. The function is applied to every value in the specified column.

    # Output:
       A   B  C
    0  3   9  7
    1  2   8  6
    2  5  12  9
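For simple arithmetic like this, the same result can usually be obtained with a vectorized expression on the column, which avoids calling a Python function once per value. The following is a minimal sketch that rebuilds the sample DataFrame so it runs on its own; the names via_apply and via_vectorized are illustrative only.

```python
import pandas as pd

data = [(3, 5, 7), (2, 4, 6), (5, 8, 9)]
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

def add_4(x):
    # Series.apply() calls this once per value, so x is a single number here
    return x + 4

via_apply = df["B"].apply(add_4)
via_vectorized = df["B"] + 4  # same arithmetic on the whole column at once

# Both approaches produce the same values
print(via_apply.tolist())
print(via_vectorized.tolist())
```

apply() remains useful when the per-value logic cannot be expressed as a column operation.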

4. Pandas Apply Function to All Columns

In some cases you may want to apply a function to all columns of a DataFrame; you can do this by calling apply() on the DataFrame itself. Here the add_3() function is applied to every column of the sample DataFrame created in section 2.

    # Using DataFrame.apply() to apply a function to every column
    def add_3(x):
        return x+3
    df2 = df.apply(add_3)
    print(df2)

    # Output:
       A   B   C
    0  6   8  10
    1  5   7   9
    2  8  11  12

5. Pandas Apply Function to Multiple List of Columns

Similarly, using the apply() method you can apply a function to a selected list of columns. In this case, the function is applied only to the two selected columns, without touching the rest.

    # apply() on a selected list of columns
    df = pd.DataFrame(data, columns = ['A','B','C'])
    df[['A','B']] = df[['A','B']].apply(add_3)
    print(df)

    # Output:
       A   B  C
    0  6   8  7
    1  5   7  6
    2  8  11  9

6. Apply Lambda Function to Each Column

You can also pass a lambda expression to the apply() method. The example below adds 10 to every value in every column of the sample DataFrame.

    # Apply a lambda function to each column
    df2 = df.apply(lambda x : x + 10)
    print(df2)

    # Output:
        A   B   C
    0  13  15  17
    1  12  14  16
    2  15  18  19

7. Apply Lambda Function to Single Column

You can also apply a lambda function to a single column of the DataFrame. The following example subtracts 2 from every value in column A: df["A"] = df["A"].apply(lambda x: x-2).

    # Using apply() and a lambda function on a single column
    df["A"] = df["A"].apply(lambda x: x-2)
    print(df)

    # Output:
       A  B  C
    0  1  5  7
    1  0  4  6
    2  3  8  9

Similarly, you can apply a lambda function to all columns or to several columns in pandas; I will leave this for you to explore.
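As a starting point for that exploration, here is a minimal sketch that applies a lambda to a list of columns. It rebuilds the sample DataFrame so it is self-contained, and the choice of columns A and B is only for illustration.

```python
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

# Lambda applied to a list of columns: only A and B change, C is untouched
df[['A', 'B']] = df[['A', 'B']].apply(lambda x: x - 2)
print(df)
```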

8. Using pandas.DataFrame.transform() to Apply Function Column

With the DataFrame.apply() method and lambda functions, the resulting object can have a different shape from the input (for example, a reducing function returns one value per column), whereas with the transform() function the result must have the same length as the input DataFrame.

    # Using DataFrame.transform()
    def add_2(x):
        return x+2
    df = df.transform(add_2)
    print(df)

    # Output:
       A   B   C
    0  5   7   9
    1  4   6   8
    2  7  10  11
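The difference between the two methods shows up when the function reduces a column to a single value. The sketch below assumes a reasonably recent pandas version, where transform() raises a ValueError for such a function; the exact error message may differ between versions.

```python
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

# apply() accepts a reducing function and returns one value per column
totals = df.apply(lambda col: col.sum())
print(totals)

# transform() rejects the same function because the result
# does not have the same length as the input
try:
    df.transform(lambda col: col.sum())
    transform_failed = False
except ValueError as err:
    transform_failed = True
    print("transform() raised ValueError:", err)
```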

9. Using pandas.DataFrame.map() to Single Column

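Here is another alternative using the map() method. Because df['A'] is a Series, this calls Series.map(), which applies the function to each value of the column.

    # Using map() on a single column (this is Series.map())
    df['A'] = df['A'].map(lambda A: A/2.)
    print(df)

    # Output:
         A  B  C
    0  1.5  5  7
    1  1.0  4  6
    2  2.5  8  9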

10. DataFrame.assign() to Apply Lambda Function

    # Using DataFrame.assign() and a lambda
    df2 = df.assign(B=lambda df: df.B/2)
    print(df2)

    # Output:
       A    B  C
    0  3  2.5  7
    1  2  2.0  6
    2  5  4.0  9
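Unlike the assignments in the earlier sections, assign() returns a new DataFrame and leaves the original unchanged. It can also set several columns in one call. The sketch below illustrates both points; the extra column D is a hypothetical addition and not part of the article's sample data.

```python
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

# assign() returns a new DataFrame; df itself is left unchanged.
# D is a hypothetical new column used only for illustration.
df2 = df.assign(B=lambda d: d.B / 2, D=lambda d: d.A + d.C)

print(df2)
print(df.columns.tolist())  # the original still has only A, B, C
```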

11. Using Numpy function on single Column

Use df['A'] = df['A'].apply(np.square) to select the column from the DataFrame as a Series using the [] operator and apply the numpy.square() function to it.

    # Using apply() & [] operator with a NumPy function
    df['A'] = df['A'].apply(np.square)
    print(df)

    # Output:
        A  B  C
    0   9  5  7
    1   4  4  6
    2  25  8  9

12. Using NumPy.square() Method

You can also do the same without apply() by calling the NumPy function directly on the column.

    # Using numpy.square() and [] operator
    df['A'] = np.square(df['A'])
    print(df)

This yields the same output as above.
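To check that the two approaches agree, the following sketch compares apply(np.square), a direct np.square() call, and the ** operator on the same column. The variable names are illustrative, and the ** form is an additional equivalent that the article itself does not use.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

with_apply = df['A'].apply(np.square)   # NumPy function passed to apply()
with_numpy = np.square(df['A'])         # NumPy function called directly on the column
with_operator = df['A'] ** 2            # plain arithmetic operator (not used in the article)

print(with_apply.tolist())
print(with_numpy.tolist())
print(with_operator.tolist())
```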

13. Multiple columns Using NumPy.square() and Lambda Function

Apply a function to multiple columns of a DataFrame by combining DataFrame.apply(), a lambda, and a NumPy function. The lambda checks each column's name and squares only columns A and B.

    # Apply numpy.square() to square the values of two columns, 'A' and 'B'
    df2 = df.apply(lambda x: np.square(x) if x.name in ['A','B'] else x)
    print(df2)

    # Output:
        A   B  C
    0   9  25  7
    1   4  16  6
    2  25  64  9

Conclusion

In this article, you have learned how to apply a function to a single column, to all columns, and to multiple columns (two or more) of a pandas DataFrame using the apply(), transform(), map(), and assign() methods, as well as numpy.square().

References

pandas.DataFrame.apply

Objects passed to the function are Series objects whose index is either the DataFrame’s index ( axis=0 ) or the DataFrame’s columns ( axis=1 ). By default ( result_type=None ), the final return type is inferred from the return type of the applied function. Otherwise, it depends on the result_type argument.
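To make the description above concrete, the sketch below joins the index labels of each Series that the function receives, so you can see which labels it is given on each axis. The variable names col_view and row_view are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

def labels(s):
    # s is a Series; report the labels of its index as one string
    return ",".join(str(label) for label in s.index)

# axis=0 (default): each column is passed, indexed by the DataFrame's index
col_view = df.apply(labels, axis=0)
# axis=1: each row is passed, indexed by the DataFrame's columns
row_view = df.apply(labels, axis=1)

print(col_view)
print(row_view)
```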

Parameters

func : function

Function to apply to each column or row.

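axis : {0 or 'index', 1 or 'columns'}, default 0

Axis along which the function is applied:

0 or 'index' : apply the function to each column.
1 or 'columns' : apply the function to each row.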

raw : bool, default False

Determines if row or column is passed as a Series or ndarray object:

False : passes each row or column as a Series to the function.
True : the passed function will receive ndarray objects instead. If you are just applying a NumPy reduction function this will achieve much better performance.
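A minimal sketch of the raw=True case described above: the function records the type of the object it receives and returns the range of each column. The helper value_range and the list seen_types are hypothetical names introduced for this illustration.

```python
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

seen_types = []

def value_range(values):
    # With raw=True, values is a NumPy ndarray rather than a Series
    seen_types.append(type(values).__name__)
    return values.max() - values.min()

ranges = df.apply(value_range, raw=True)

print(ranges)
print(set(seen_types))
```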

result_type : {'expand', 'reduce', 'broadcast', None}, default None

These only act when axis=1 (columns):

'expand' : list-like results will be turned into columns.
'reduce' : returns a Series if possible rather than expanding list-like results. This is the opposite of 'expand'.
'broadcast' : results will be broadcast to the original shape of the DataFrame, the original index and columns will be retained.

The default behaviour (None) depends on the return value of the applied function: list-like results will be returned as a Series of those. However if the apply function returns a Series these are expanded to columns.

args : tuple

Positional arguments to pass to func in addition to the array/series.

**kwargs

Additional keyword arguments to pass as keyword arguments to func.
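The args and **kwargs parameters let the applied function take extra inputs besides the column or row itself. The sketch below uses a hypothetical helper, scale_and_shift, whose factor and offset parameters are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

def scale_and_shift(col, factor, offset=0):
    # col is the Series passed by apply(); factor and offset come from args/kwargs
    return col * factor + offset

# factor is passed positionally through args, offset as a keyword argument
result = df.apply(scale_and_shift, args=(2,), offset=1)
print(result)
```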

Returns

Series or DataFrame

Result of applying func along the given axis of the DataFrame.

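See also

DataFrame.applymap (renamed DataFrame.map in pandas 2.1) : For elementwise operations.
DataFrame.aggregate : Only perform aggregating type operations.
DataFrame.transform : Only perform transforming type operations.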

Functions that mutate the passed object can produce unexpected behavior or errors and are not supported. See Mutating with User Defined Function (UDF) methods for more details.

>>> df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
>>> df
   A  B
0  4  9
1  4  9
2  4  9

Using a numpy universal function (in this case the same as np.sqrt(df) ):

>>> df.apply(np.sqrt)
     A    B
0  2.0  3.0
1  2.0  3.0
2  2.0  3.0

Using a reducing function on either axis

>>> df.apply(np.sum, axis=0)
A    12
B    27
dtype: int64

>>> df.apply(np.sum, axis=1)
0    13
1    13
2    13
dtype: int64

Returning a list-like will result in a Series

>>> df.apply(lambda x: [1, 2], axis=1)
0    [1, 2]
1    [1, 2]
2    [1, 2]
dtype: object

Passing result_type='expand' will expand list-like results to columns of a DataFrame

>>> df.apply(lambda x: [1, 2], axis=1, result_type='expand')
   0  1
0  1  2
1  1  2
2  1  2

Returning a Series inside the function is similar to passing result_type='expand'. The resulting column names will be the Series index.

>>> df.apply(lambda x: pd.Series([1, 2], index=['foo', 'bar']), axis=1)
   foo  bar
0    1    2
1    1    2
2    1    2

Passing result_type='broadcast' will ensure the same shape result, whether list-like or scalar is returned by the function, and broadcast it along the axis. The resulting column names will be the originals.

>>> df.apply(lambda x: [1, 2], axis=1, result_type='broadcast')
   A  B
0  1  2
1  1  2
2  1  2
