Reading Tabular Data
Pandas provides the read_csv() function to read data stored in a CSV file into a Pandas DataFrame. Pandas supports many file formats and data sources out of the box (CSV, Excel, SQL, JSON, Parquet, …), and each one has a reader function whose name starts with the prefix read_*.
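As a minimal sketch of the shared read_* naming, the snippet below loads the same small table from CSV text and from JSON text. The sample data and the variable names (csv_text, json_text, from_csv, from_json) are made up for illustration; in-memory buffers stand in for real files so the example runs without any dataset on disk.

```python
import io

import pandas as pd

# Hypothetical in-memory sample data; a real workflow would pass a file path or URL.
csv_text = "name,score\nAda,91\nLinus,84\n"
json_text = '[{"name": "Ada", "score": 91}, {"name": "Linus", "score": 84}]'

# Each reader follows the read_* naming pattern and returns a DataFrame.
from_csv = pd.read_csv(io.StringIO(csv_text))
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
print(from_json)
```

Both calls return a DataFrame with the columns name and score, so downstream code does not need to care which format the data came from.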
Importing Necessary libraries
Python3
import pandas as pd
CSV file
1. Reading a CSV file
Dataset link : dataset.csv
Python
# Load the dataset from the 'dataset.csv' file using Pandas
data = pd.read_csv('dataset.csv')
# Display the first few rows of the loaded dataset
print(data.head())
Output:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
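If dataset.csv is not at hand, the same read can be tried on a throwaway file. The sketch below is an illustration under assumptions: the sample rows are a hypothetical subset modeled on the output above, and the file name sample.csv is made up. It writes a small DataFrame with to_csv(), reads it back with read_csv(), and uses the usecols parameter to load only some of the columns.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample rows, modeled on the columns shown in the output above.
sample = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01],
        "tip": [1.01, 1.66, 3.50],
        "sex": ["Female", "Male", "Male"],
        "size": [2, 3, 3],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "sample.csv")

    # index=False keeps the row labels out of the file.
    sample.to_csv(csv_path, index=False)

    # Read the whole file back, then only two of its columns.
    loaded = pd.read_csv(csv_path)
    subset = pd.read_csv(csv_path, usecols=["total_bill", "tip"])

print(loaded.head())
print(subset.shape)  # (3, 2)
```

Writing with index=False avoids an extra unnamed column appearing when the file is read back.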
2. Reading an Excel file
Dataset link : data.xlsx
Python
# Load the dataset from the 'data.xlsx' file using Pandas
data = pd.read_excel('data.xlsx')
# Display the first few rows of the loaded dataset
print(data.head())
Output:
   Column1 Column2  Column3
0        1       A     10.5
1        2       B     20.3
2        3       C     15.8
3        4       D      8.2
Read And Write Tabular Data using Pandas
Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental, high-level building block for doing practical, real-world data analysis in Python.
The two primary data structures of Pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides, and much more. Pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other third-party libraries.
Data structures
Dimension | Name | Description |
---|---|---|
1 | Series | 1D-labeled homogeneously-typed array |
2 | DataFrame | General 2D labeled, size-mutable tabular structure with potentially heterogeneously-typed column |
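The two structures in the table can be illustrated with a short sketch. The values and the names tips, df, and tip_per_person are hypothetical and chosen only for illustration.

```python
import pandas as pd

# A Series is a 1D labeled array holding a single dtype.
tips = pd.Series([1.01, 1.66, 3.50], index=["a", "b", "c"], name="tip")
print(tips.dtype)   # float64
print(tips["b"])    # 1.66

# A DataFrame is a 2D labeled table; each column may have its own dtype.
df = pd.DataFrame({"tip": [1.01, 1.66, 3.50], "size": [2, 3, 3]})
print(df.dtypes)

# Size-mutable: columns can be added after construction.
df["tip_per_person"] = df["tip"] / df["size"]
print(df.shape)     # (3, 3)

# Each column of a DataFrame is itself a Series.
print(type(df["tip"]).__name__)  # Series
```

Here the tip column holds floats and the size column holds integers, which is what "heterogeneously-typed" refers to: types can differ between columns, while each individual column stays homogeneous.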