How to use the type() function in Python

The type() function returns the type (class) of the given object.

Syntax: type(data_object)

Here, data_object is the object to inspect, such as an RDD or a DataFrame.
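Before moving to PySpark objects, it can help to see type() on plain Python values. The snippet below is a minimal sketch, not part of the original examples; the variable names (samples, is_exact_int, flag_is_exact_int) are chosen only for illustration. It also shows that comparing with type() is an exact-class check, so it does not match subclasses.

```python
# A few ordinary Python values to inspect (illustrative data only)
samples = [42, "spark", [1, 2], (1, 2), {"id": 1}]

# type() returns the class object of each value
for value in samples:
    print(value, "->", type(value))

# type() performs an exact-class comparison
is_exact_int = type(42) is int

# bool is a subclass of int, so type(True) is bool, not int
flag_is_exact_int = type(True) is int

print(is_exact_int)       # True
print(flag_is_exact_int)  # False
```

Because the comparison is exact, a check such as type(obj) is SomeClass will be False when obj is an instance of a subclass of SomeClass.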

Example 1: Python program to create an RDD and check its type

Python3




# SparkSession is needed to create the session
from pyspark.sql import SparkSession

# create the Spark session (or reuse an existing one)
spark = SparkSession.builder.getOrCreate()

# create an RDD with some data
rdd = spark.sparkContext.parallelize([(1, "Sravan", "vignan", 98),
                                      (2, "bobby", "bsc", 87)])

# check the type using type()
print(type(rdd))


Output:

<class 'pyspark.rdd.RDD'>

Example 2: Python program to create a DataFrame and check its type

Python3




# import SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession

# create the Spark session and give it an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# list of employee data
data = [[1, "sravan", "company 1"],
        [2, "ojaswi", "company 1"],
        [3, "rohith", "company 2"],
        [4, "sridevi", "company 1"],
        [1, "sravan", "company 1"],
        [4, "sridevi", "company 1"]]

# specify column names
columns = ['ID', 'NAME', 'Company']

# create a DataFrame from the list of rows
dataframe = spark.createDataFrame(data, columns)

# check the type of the data using type()
print(type(dataframe))


Output:

<class 'pyspark.sql.dataframe.DataFrame'>
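The text printed in the outputs above is the representation of a class object. The same information can be read from the class as a string, which is handy for logging. The sketch below uses a hypothetical helper, type_label (not a PySpark or standard-library function), and demonstrates it on standard-library objects so it runs without a Spark session.

```python
from collections import OrderedDict


def type_label(obj):
    """Return the dotted 'module.ClassName' path of obj's class.

    type_label is a hypothetical helper written for this example.
    """
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


print(type_label(42))             # builtins.int
print(type_label(OrderedDict()))  # collections.OrderedDict
```

Passing the rdd or dataframe objects from the examples to type_label should give the same dotted path that appears inside the printed <class '...'> text. The exact module path may differ between PySpark versions or when Spark Connect is used, so it is safer not to hard-code it.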
