How to use the where() method in PySpark

where() checks a condition against each row of a dataframe and returns only the rows that satisfy it.

Syntax: dataframe.where(condition)

where condition is a Boolean expression built from the dataframe's columns

Overall Syntax with where clause:

dataframe.where((dataframe.column_name).isin([elements])).show()

where,

  • column_name is the column to test
  • elements are the values to match against the column
  • show() displays the resulting dataframe

Example: Get the rows for a particular college with the where() clause

Python3




# keep only the rows whose college is 'vignan'
dataframe.where(dataframe.college.isin(['vignan'])).show()


Output:



Filtering a row in PySpark DataFrame based on matching values from a list

In this article, we are going to filter the rows of a PySpark dataframe based on matching values from a list, using isin().

isin(): This column method checks whether each value in a column matches any of the given elements. It takes the elements to look for and returns a Boolean condition that can be used to filter the dataframe.

Syntax: isin([element1, element2, ..., element n])

Create Dataframe for demonstration:

Python3




# importing module
import pyspark
  
# importing sparksession
from pyspark.sql import SparkSession
  
# creating sparksession
# and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
  
# list of student records: ID, name and college
data = [[1, "sravan", "vignan"],
        [2, "ramya", "vvit"],
        [3, "rohith", "klu"],
        [4, "sridevi", "vignan"],
        [5, "gnanesh", "iit"]]
  
# specify column names
columns = ['ID', 'NAME', 'college']
  
# creating a dataframe from the lists of data
dataframe = spark.createDataFrame(data, columns)
  
dataframe.show()


Output:

Method 1: Using filter() method

...

Method 2: Using where() method

It checks a condition and returns the rows that satisfy it. Both are similar...