Read All CSV Files in Directory
To read all CSV files in a directory, we use the wildcard * in the path so that every matching file in the directory is picked up.
Python3

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName(
    'Read All CSV Files in Directory').getOrCreate()

# * matches every CSV file in the directory
file2 = spark.read.csv('/content/*.csv', sep=',',
                       inferSchema=True, header=True)

df1 = file2.toPandas()
display(df1.head())
display(df1.tail())
```
Output:
This reads every CSV file in the current working directory, using a comma ',' as the delimiter and treating the first row as the header.
PySpark – Read CSV file into DataFrame
In this article, we are going to see how to read CSV files into a DataFrame using PySpark and Python.
Files Used:
- authors
- book_author
- books