How to use orderBy() in Python

This function returns a new dataframe sorted by the given columns. Rows are ordered by the first column listed; ties are broken by the second column, and so on.


  • Ascending order: dataframe.orderBy(['column1', 'column2', ..., 'column n'], ascending=True).show()
  • Descending order: dataframe.orderBy(['column1', 'column2', ..., 'column n'], ascending=False).show()


  • dataframe is the input PySpark dataframe
  • ascending=True specifies to sort the dataframe in ascending order
  • ascending=False specifies to sort the dataframe in descending order

Example 1: Sort the PySpark dataframe in ascending order with orderBy().


# importing module
import pyspark
# importing sparksession from pyspark.sql module
from pyspark.sql import SparkSession
# creating sparksession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
# list  of employee data
data = [["1", "sravan", "company 1"],
        ["2", "ojaswi", "company 1"],
        ["3", "rohith", "company 2"],
        ["4", "sridevi", "company 1"],
        ["5", "bobby", "company 1"]]
# specify column names
columns = ['ID', 'NAME', 'Company']
# creating a dataframe from the lists of data
dataframe = spark.createDataFrame(data, columns)
# orderBy dataframe in asc order
dataframe.orderBy(['NAME', 'ID', 'Company'],
                  ascending=True).show()


Example 2: Sort the PySpark dataframe in descending order with orderBy().


# importing module
import pyspark
# importing sparksession from pyspark.sql module
from pyspark.sql import SparkSession
# creating sparksession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
# list  of employee data
data = [["1", "sravan", "company 1"],
        ["2", "ojaswi", "company 1"],
        ["3", "rohith", "company 2"],
        ["4", "sridevi", "company 1"],
        ["5", "bobby", "company 1"]]
# specify column names
columns = ['ID', 'NAME', 'Company']
# creating a dataframe from the lists of data
dataframe = spark.createDataFrame(data, columns)
# orderBy dataframe in desc order
dataframe.orderBy(['NAME', 'ID', 'Company'],
                  ascending=False).show()


PySpark – Order by multiple columns

In this article, we are going to see how to order a PySpark DataFrame by multiple columns in Python.

Create the dataframe for demonstration:


# importing module
import pyspark
# importing sparksession from pyspark.sql module
from pyspark.sql import SparkSession
# creating sparksession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
# list  of employee data
data = [["1", "sravan", "company 1"],
        ["2", "ojaswi", "company 1"],
        ["3", "rohith", "company 2"],
        ["4", "sridevi", "company 1"],
        ["5", "bobby", "company 1"]]
# specify column names
columns = ['ID', 'NAME', 'Company']
# creating a dataframe from the lists of data
dataframe = spark.createDataFrame(data, columns)


orderBy means sorting the dataframe by multiple columns in ascending or descending order. We can do this using the following methods.

Method 1: Using orderBy()


Method 2: Using sort()

This function will return the dataframe after ordering the multiple columns. It will sort first based on the column name given....