Read All CSV Files in Directory
To read all CSV files in a directory, we use the wildcard * in the path so that every matching file in the directory is picked up.
Python3

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName(
    'Read All CSV Files in Directory').getOrCreate()

# * matches every CSV file in the directory
file2 = spark.read.csv('/content/*.csv', sep=',',
                       inferSchema=True, header=True)

df1 = file2.toPandas()
display(df1.head())
display(df1.tail())
```
Output:
This reads every CSV file in the current working directory, using a comma ',' as the delimiter and treating the first row as the header.
PySpark – Read CSV file into DataFrame
In this article, we are going to see how to read CSV files into a DataFrame using PySpark and Python.
Files Used:
- authors
- book_author
- books