Apply Function to Every Row in a Pandas DataFrame
There are several ways to apply a function to every row of a Pandas DataFrame. This article works through the following examples:
- Applying User-Defined Function to Every Row of Pandas DataFrame
- Apply Lambda to Every Row of DataFrame
- Apply NumPy.sum() to Every Row
- Normalizing DataFrame Column Values Using Custom Function in Pandas
- Applying Range Generation Function to DataFrame Rows in Pandas
The apply() method, called with axis=1, applies a function to every row of a DataFrame. The sections below go through each approach in turn.
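As a quick orientation, the sketch below contrasts the two directions in which apply() can work. The small DataFrame and the variable names row_totals and column_totals are made up for illustration and are not part of the examples that follow.

```python
import pandas as pd

# Illustrative data, not taken from the article's examples
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# axis=1: the function receives one row at a time
row_totals = df.apply(lambda row: row['A'] + row['B'], axis=1)

# axis=0 (the default): the function receives one column at a time
column_totals = df.apply(lambda col: col.sum())

print(row_totals.tolist())     # [5, 7, 9]
print(column_totals.tolist())  # [6, 15]
```

Leaving out axis=1 is an easy mistake to make, because the call still runs but operates on columns instead of rows.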
Applying User-Defined Function to Every Row of Pandas DataFrame
In this example, we define a function add_values(row) that calculates the sum of the values in the ‘A’, ‘B’, and ‘C’ columns for each row. In the main() function, a DataFrame is created from a dictionary, and the function is applied to every row using the apply() method with axis=1, producing a new column ‘add’ that contains the sums. The original and modified DataFrames are then printed.
Python3
import pandas as pd


# Function to add the values in columns 'A', 'B', and 'C' of one row
def add_values(row):
    return row['A'] + row['B'] + row['C']


def main():
    # Create a dictionary with three fields each
    data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}

    # Convert the dictionary into DataFrame
    df = pd.DataFrame(data)
    print("Original DataFrame:\n", df)

    # Apply the user-defined function to every row
    df['add'] = df.apply(add_values, axis=1)

    print('\nAfter Applying Function: ')
    # Print the new DataFrame
    print(df)


if __name__ == '__main__':
    main()
Output
Original DataFrame:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
After Applying Function:
   A  B  C  add
0  1  4  7   12
1  2  5  8   15
2  3  6  9   18
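To make it clearer what the function receives, here is a minimal sketch that inspects a row inside apply(). The helper describe_row is hypothetical and exists only for this illustration; it relies on each row being passed as a Series whose index holds the column names and whose name holds the row label.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})


def describe_row(row):
    # Hypothetical helper: row.index lists the column names,
    # row.name is the row's label in the DataFrame index
    total = row['A'] + row['B'] + row['C']
    return f"row {row.name}: {list(row.index)} -> {total}"


labels = df.apply(describe_row, axis=1)
print(labels[0])  # row 0: ['A', 'B', 'C'] -> 12
```

This is why add_values can look up values with row['A']: the column names act as keys on the row.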
Apply Lambda to Every Row of DataFrame
In this example, we define a function add(a, b, c) that returns the sum of its three arguments. In the main() function, a DataFrame is created from a dictionary, and a new column ‘add’ is added using the apply() method with a lambda function. For every row, the lambda passes the values of the ‘A’, ‘B’, and ‘C’ columns to add(). The DataFrame is printed before and after the function is applied.
Python3
# Import pandas package
import pandas as pd


# Function to add
def add(a, b, c):
    return a + b + c


def main():
    # Create a dictionary with three fields each
    data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}

    # Convert the dictionary into DataFrame
    df = pd.DataFrame(data)
    print("Original DataFrame:\n", df)

    # Pass the three column values of each row to add()
    df['add'] = df.apply(lambda row: add(row['A'], row['B'], row['C']), axis=1)

    print('\nAfter Applying Function: ')
    # Print the new DataFrame
    print(df)


if __name__ == '__main__':
    main()
Output
Original DataFrame:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
After Applying Function:
   A  B  C  add
0  1  4  7   12
1  2  5  8   15
2  3  6  9   18
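A lambda is one way to feed extra values into a row function. Another option, sketched below, is the args parameter of apply(), which forwards additional positional arguments to the function. The function add_scaled and its factor parameter are hypothetical names introduced only for this illustration.

```python
import pandas as pd


def add_scaled(row, factor):
    # Hypothetical function: sum the three columns, then scale the result
    return (row['A'] + row['B'] + row['C']) * factor


df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# args=(2,) is passed to add_scaled after the row itself
df['scaled'] = df.apply(add_scaled, axis=1, args=(2,))

print(df['scaled'].tolist())  # [24, 30, 36]
```

Whether a lambda or args reads better is largely a matter of taste; the lambda keeps everything in one expression, while args keeps the function call explicit.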
Apply NumPy.sum() to Every Row
You can also pass a NumPy function directly to apply(). In this example, we create a DataFrame from a dictionary and then apply the NumPy sum function to each row using the apply() method with axis=1, producing a new column ‘add’ that contains the sum of the values in each row. The original and modified DataFrames are then printed.
Python3
import pandas as pd
import numpy as np


def main():
    # Create a dictionary with three fields each
    data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}

    # Convert the dictionary into DataFrame
    df = pd.DataFrame(data)
    print("Original DataFrame:\n", df)

    # Apply np.sum to each row of the DataFrame
    # and store the result in a new column
    df['add'] = df.apply(np.sum, axis=1)

    print('\nAfter Applying Function: ')
    # Print the new DataFrame
    print(df)


if __name__ == '__main__':
    main()
Output
Original DataFrame:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
After Applying Function:
   A  B  C  add
0  1  4  7   12
1  2  5  8   15
2  3  6  9   18
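For a plain row sum, the DataFrame's own sum() method with axis=1 gives the same numbers without going through apply(). The comparison below is a minimal sketch added for reference, not part of the original example; the variable names via_apply and via_sum are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Row sums through apply() with a NumPy function
via_apply = df.apply(np.sum, axis=1)

# Row sums through the built-in DataFrame method
via_sum = df.sum(axis=1)

print(via_apply.tolist())  # [12, 15, 18]
print(via_sum.tolist())    # [12, 15, 18]
```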
Normalizing DataFrame Column Values Using Custom Function in Pandas
Here, we define a normalize function that takes two arguments and calculates a normalized value from their mean and range. In the main() function, a DataFrame is created from a dictionary, and the normalize function is applied to each row using the apply() method with a lambda function. The result replaces the values in column ‘X’, and both the original and modified DataFrames are printed.
Python3
# Import pandas and NumPy (np.mean is used in normalize)
import pandas as pd
import numpy as np


def normalize(x, y):
    # Subtract the mean of the pair from x, then divide by the pair's range
    x_new = (x - np.mean([x, y])) / (max(x, y) - min(x, y))
    return x_new


def main():
    # Create a dictionary with two fields
    data = {'X': [1, 2, 3], 'Y': [45, 65, 89]}

    # Convert the dictionary into DataFrame
    df = pd.DataFrame(data)
    print("Original DataFrame:\n", df)

    # Replace column 'X' with the normalized value computed for each row
    df['X'] = df.apply(lambda row: normalize(row['X'], row['Y']), axis=1)

    print('\nNormalized:')
    print(df)


if __name__ == '__main__':
    main()
Output
Original DataFrame:
   X   Y
0  1  45
1  2  65
2  3  89
Normalized:
     X   Y
0 -0.5  45
1 -0.5  65
2 -0.5  89
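Every value in the output is -0.5, which follows from the formula itself: x minus the mean of x and y equals (x - y) / 2, and when x is smaller than y the range is y - x, so the ratio is always -1/2. The sketch below checks this arithmetic on the same pairs used above; the results list is an illustrative name.

```python
import numpy as np


def normalize(x, y):
    # Same formula as in the example above
    return (x - np.mean([x, y])) / (max(x, y) - min(x, y))


# The (X, Y) pairs from the example DataFrame
pairs = [(1, 45), (2, 65), (3, 89)]
results = [float(normalize(x, y)) for x, y in pairs]
print(results)  # [-0.5, -0.5, -0.5]

# Swapping the arguments flips the sign
print(float(normalize(45, 1)))  # 0.5
```

With only two values per row, this normalization can therefore only tell you which of the two is larger.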
Applying Range Generation Function to DataFrame Rows in Pandas
In this example, we define a generate_range function that returns the range of ten containing a given integer, and a replace function that applies generate_range to each value in a row of a DataFrame. In the main() function, a DataFrame is created from a dictionary, and the replace function is applied to each row using the apply() method with a lambda function and axis=1, producing a new DataFrame in which every value is replaced by its range. The original and modified DataFrames are then printed.
Python3
import pandas as pd


# Function to generate range
def generate_range(n):
    # Return the range of ten that contains n,
    # e.g. input 67 gives output '60-70'
    n = int(n)
    lower_limit = n // 10 * 10
    upper_limit = lower_limit + 10
    return str(lower_limit) + '-' + str(upper_limit)


def replace(row):
    # Build a new row in which every value is replaced by its range
    return row.map(generate_range)


def main():
    # Create a dictionary with three fields each
    data = {'A': [0, 2, 3], 'B': [4, 15, 6], 'C': [47, 8, 19]}

    # Convert the dictionary into DataFrame
    df = pd.DataFrame(data)
    print('Before applying function: ')
    print(df)

    # Apply the function to each row (axis=1)
    # and replace the DataFrame with the result
    df = df.apply(lambda row: replace(row), axis=1)

    print('After Applying Function: ')
    # Print the new DataFrame
    print(df)


if __name__ == '__main__':
    main()
Output
Before applying function:
   A   B   C
0  0   4  47
1  2  15   8
2  3   6  19
After Applying Function:
      A      B      C
0  0-10   0-10  40-50
1  0-10  10-20   0-10
2  0-10   0-10  10-20
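The bucketing arithmetic can be checked on its own, outside any DataFrame. The sketch below reuses the logic of generate_range on a few sample inputs chosen for illustration, including values that sit exactly on a boundary.

```python
def generate_range(n):
    # Floor-divide by 10 to find the lower limit of the bucket
    n = int(n)
    lower_limit = n // 10 * 10
    upper_limit = lower_limit + 10
    return str(lower_limit) + '-' + str(upper_limit)


# Sample inputs chosen for illustration
samples = [0, 9, 10, 47, 67]
buckets = [generate_range(n) for n in samples]
print(buckets)  # ['0-10', '0-10', '10-20', '40-50', '60-70']
```

A value on a boundary, such as 10, falls into the bucket that starts at that value.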
Python is a great language for data analysis. It provides a large number of classes and functions that make analyzing and manipulating data easier. This article has shown how to apply a function to every row of a Pandas DataFrame.