How to use withColumnRenamed() In Python

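withColumnRenamed() takes the name of an existing column and a new name, and returns a DataFrame with that column renamed. After a dictionary-style aggregation, Spark names each result column aggregate_operation(old_column), for example sum(FEE), so we can use withColumnRenamed() to replace that generated name with a name of our own.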

Syntax:

dataframe.groupBy("column_name_group").agg({"column_name": "aggregate_operation"}).withColumnRenamed("aggregate_operation(column_name)", "new_column_name")

Example: Grouping by the DEPT column, summing FEE, and renaming the result to Total Fee

Python3

# import SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession

# create a SparkSession and give the app a name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# list of student data
data = [["1", "sravan", "IT", 45000],
        ["2", "ojaswi", "CS", 85000],
        ["3", "rohith", "CS", 41000],
        ["4", "sridevi", "IT", 56000],
        ["5", "bobby", "ECE", 45000],
        ["6", "gayatri", "ECE", 49000],
        ["7", "gnanesh", "CS", 45000],
        ["8", "bhanu", "Mech", 21000]
        ]

# specify column names
columns = ['ID', 'NAME', 'DEPT', 'FEE']

# create a dataframe from the list of data
dataframe = spark.createDataFrame(data, columns)

# group by DEPT, sum FEE, and rename the generated
# "sum(FEE)" column to "Total Fee"
dataframe.groupBy("DEPT").agg({"FEE": "sum"}).withColumnRenamed(
    "sum(FEE)", "Total Fee").show()

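Output (row order may vary):

+----+---------+
|DEPT|Total Fee|
+----+---------+
|  IT|   101000|
|  CS|   171000|
| ECE|    94000|
|Mech|    21000|
+----+---------+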

Renaming columns for PySpark DataFrames Aggregates

In this article, we will discuss how to rename the columns produced by aggregations on a PySpark DataFrame.


In PySpark, groupBy() collects rows with identical values in the given column(s) into groups, so that aggregate functions can be applied to each group. These aggregate functions are available in the pyspark.sql.functions module.
