Iteration through the Row list
In this method, we will traverse the Row list, convert each Row object to a single-row Spark DataFrame using createDataFrame(), and append() its Pandas version to an accumulating final DataFrame, which will be our answer. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, where pandas.concat() is the replacement. The details of append() are given below:
Syntax: df.append(other, ignore_index=False, verify_integrity=False, sort=None)
df : Pandas DataFrame
Parameters :
- other : Pandas DataFrame or Series/dict-like object, or a list of these.
- ignore_index : If True, the resulting axis is labeled 0, 1, ..., n-1, ignoring the index labels of the inputs.
- verify_integrity : If True, raise ValueError on creating index with duplicates.
- sort : Sort columns if the columns of df and other are unaligned.
Returns: A new appended DataFrame
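The semantics above can be sketched with plain Pandas alone. Because DataFrame.append() no longer exists on pandas 2.0 and later, this minimal sketch uses pandas.concat(), which produces the same result for this case; the sample column names are illustrative only.

```python
import pandas as pd

# two single-row frames standing in for df and other
df = pd.DataFrame({'Topic': ['Arrays'], 'Difficulty': [5]})
other = pd.DataFrame({'Topic': ['Sorting'], 'Difficulty': [6]})

# ignore_index=True relabels the result 0..n-1 instead of
# keeping each input's original index labels
result = pd.concat([df, other], ignore_index=True)
print(result)
```

With ignore_index=False, the result would instead keep both original index labels (here, 0 and 0).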
Example:
In this example, we will use createDataFrame() to create single-row PySpark DataFrames and then use append() to combine their Pandas versions into one Pandas DataFrame.
Python
# Importing PySpark
import pyspark
# Importing Pandas for append()
import pandas
from pyspark.sql import SparkSession
from pyspark.sql import Row

# PySpark session
row_pandas_session = SparkSession.builder.appName(
    'row_pandas_session'
).getOrCreate()

# List of sample Row objects
row_object_list = [Row(Topic='Dynamic Programming', Difficulty=10),
                   Row(Topic='Arrays', Difficulty=5),
                   Row(Topic='Sorting', Difficulty=6),
                   Row(Topic='Binary Search', Difficulty=7)]

# Our final DataFrame, initially empty
mega_df = pandas.DataFrame()

# Traversing through the list
for row_object in row_object_list:
    # Creating a Spark DataFrame of a single row
    small_df = row_pandas_session.createDataFrame([row_object])

    # Appending the Pandas version of small_df to mega_df.
    # Note: DataFrame.append() was removed in pandas 2.0;
    # on pandas >= 2.0 use pandas.concat() instead.
    mega_df = mega_df.append(small_df.toPandas(), ignore_index=True)

# Printing our desired DataFrame
print(mega_df)
Output:
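For pandas 2.0 and later, where DataFrame.append() no longer exists, the same loop can collect the per-row frames in a list and concatenate them once at the end, which also avoids rebuilding the accumulator on every iteration. The sketch below is pandas-only so it runs without a Spark session: plain dicts stand in for the Row objects, and pd.DataFrame([row]) stands in for small_df.toPandas().

```python
import pandas as pd

# dicts standing in for the PySpark Row objects above
row_object_list = [{'Topic': 'Dynamic Programming', 'Difficulty': 10},
                   {'Topic': 'Arrays', 'Difficulty': 5},
                   {'Topic': 'Sorting', 'Difficulty': 6},
                   {'Topic': 'Binary Search', 'Difficulty': 7}]

frames = []
for row in row_object_list:
    # in the Spark version this would be small_df.toPandas()
    frames.append(pd.DataFrame([row]))

# one concat at the end instead of append() per iteration
mega_df = pd.concat(frames, ignore_index=True)
print(mega_df)
```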
Convert PySpark Row List to Pandas DataFrame
In this article, we will convert a PySpark Row list to a Pandas DataFrame. A Row object represents a single row of a PySpark DataFrame, so a DataFrame can be easily represented as a Python list of Row objects.