Sweetviz

Sweetviz is mainly known for visualizing target values and comparing datasets. It is a good tool for comparing distinct datasets, such as training and test sets, or different parts of the same dataset (for example, a dataset divided into two groups by a categorical feature such as gender).

Key features of this library are:

Target Analysis: Investigates the relationship between a target value (e.g., “Survived” in the Titanic dataset) and other features.
Visualization and Comparison: Visualizes the target variable against the other features to uncover patterns, trends, and associations.
Distinct Datasets: Allows comparison between distinct datasets, such as training and test data, to assess consistency or differences in target-related characteristics.
Intra-set Characteristics: Analyzes intra-set characteristics, such as comparing the target variable across different groups (e.g., male versus female) within the same dataset.
Summary Information: Provides summary information for each feature, including its type, unique values, missing values, duplicate rows, and most frequent values.
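
As a rough illustration of the distinct-datasets comparison listed above, the sketch below splits a small hand-made DataFrame into train and test parts and wraps a call to sv.compare. The toy data, the helper names split_train_test and compare_train_test, and the 80/20 split are assumptions made for this example; only sv.compare comes from Sweetviz.

```python
import pandas as pd


def split_train_test(df, train_fraction=0.8, seed=0):
    """Shuffle the rows and cut them into a train part and a test part."""
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    cut = int(len(shuffled) * train_fraction)
    return shuffled.iloc[:cut], shuffled.iloc[cut:]


def compare_train_test(train_df, test_df, target="Survived"):
    """Build a Sweetviz report comparing two DataFrames on a target column."""
    import sweetviz as sv  # imported here so the split helper works without Sweetviz

    # Each dataset is passed as [DataFrame, display name].
    return sv.compare([train_df, "Train"], [test_df, "Test"], target_feat=target)


# Hypothetical toy data standing in for the Titanic DataFrame.
toy_df = pd.DataFrame({
    "Sex": ["male", "female"] * 5,
    "Age": [22, 38, 26, 35, 35, 27, 54, 2, 14, 4],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 1, 1, 0],
})

train_df, test_df = split_train_test(toy_df)
print(len(train_df), len(test_df))
```

On a real dataset with Sweetviz installed, compare_train_test(train_df, test_df).show_html() would write the comparison report to disk.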

Install the library

!pip install sweetviz

Implementation

Let us use this library to compare two subsets of our data frame (male vs. female).

Here, a FeatureConfig object is created to configure how Sweetviz analyzes features. In this specific configuration:
- The feature named "PassengerId" will be skipped during the analysis.
- The feature "Age" will be forced to the text type (force_text), so Sweetviz will treat it as text rather than as a numerical feature.

The compare_intra function is used to generate a comparative analysis report. Here is a breakdown of the parameters:
- df: The pandas DataFrame that you want to analyze.
- df["Sex"] == "male": A boolean condition that splits the dataset into two groups: rows where "Sex" is "male" and all remaining rows.
- ["Male", "Female"]: The names assigned to the two groups, in order: first the rows where the condition is True, then the rows where it is False.
- "Survived": The target variable for the analysis.
- feature_config: The configuration object created earlier.
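
To make the role of the boolean condition concrete, here is a plain pandas sketch of the split it describes: rows where the condition is True go to the first named group, and the remaining rows go to the second. This only illustrates the idea on hypothetical toy data and is not Sweetviz's internal code; the groups dictionary and the survival-rate summary are assumptions for the example.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic DataFrame.
toy_df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Survived": [0, 1, 1, 0, 1, 1],
})

# Boolean Series: True for male passengers, False for everyone else.
condition = toy_df["Sex"] == "male"
names = ["Male", "Female"]

# The first name labels the True rows, the second name labels the False rows.
groups = {
    names[0]: toy_df[condition],
    names[1]: toy_df[~condition],
}

# Mean of the target column per group, i.e. the survival rate in each subset.
survival_rates = {name: float(group["Survived"].mean()) for name, group in groups.items()}
print(survival_rates)
```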

Python3

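import sweetviz as sv

# df is assumed to be the Titanic DataFrame that has already been loaded.
# Skip the ID column and treat "Age" as text instead of a numeric feature.
feature_config = sv.FeatureConfig(skip="PassengerId", force_text=["Age"])

# Rows where the condition is True form the "Male" group; the rest form "Female".
my_report = sv.compare_intra(df, df["Sex"] == "male", ["Male", "Female"], "Survived", feature_config)

my_report.show_notebook()  # render the report inline in the notebook
my_report.show_html()  # with default arguments, writes "SWEETVIZ_REPORT.html"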

Output:

Sweetviz Output
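
The report calls in the example rely on default arguments. A hedged sketch of passing them explicitly follows; filepath, open_browser, and layout are show_html parameters as documented in the Sweetviz README, while the helper name save_report, the output file name, and the stub class used to exercise the helper are assumptions for illustration.

```python
def save_report(report, path="titanic_male_vs_female.html"):
    """Write a Sweetviz report to `path` without opening a browser tab."""
    report.show_html(filepath=path, open_browser=False, layout="vertical")
    return path


class _StubReport:
    """Stand-in for a Sweetviz report so the helper can be tried without Sweetviz."""

    def __init__(self):
        self.calls = []

    def show_html(self, filepath, open_browser, layout):
        # Record the arguments instead of rendering anything.
        self.calls.append((filepath, open_browser, layout))


stub = _StubReport()
saved_path = save_report(stub)
print(saved_path)
```

With a real report, save_report(my_report) could be called in place of my_report.show_html().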

Tools to Automate EDA

Exploratory Data Analysis (EDA) is a critical phase in the data analysis process, where analysts and data scientists examine and explore the characteristics and patterns within a dataset. In this article, we'll learn how to automate this process with Python libraries.

Table of Contents

Exploratory Data Analysis
Python Libraries for Exploratory Data Analysis
1. Ydata-Profiling
2. AutoViz
3. Sweetviz
4. DataPrep
5. D-Tale
Comparing Data Exploration Libraries
