ROC Curve in Python

Let’s implement roc curve in python using breast cancer in-built dataset. The breast cancer dataset is a commonly used dataset in machine learning, for binary classification tasks.

Step 1: Importing the required libraries

In scikit-learn, the roc_curve function is used to compute Receiver Operating Characteristic (ROC) curve points. On the other hand, the auc function calculates the Area Under the Curve (AUC) from the ROC curve.

AUC is a scalar value representing the area under the ROC curve quantifing the classifier’s ability to distinguish between positive and negative examples across all possible classification thresholds.

Python3

import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc

Step 2: Loading the dataset

Python3

data = load_breast_cancer()
X = data.data
y = data.target # Split the data into features (X) and target variable (y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

Step 3: Training and testing the model

Python3

# Train a logistic regression model
model = LogisticRegression() 
model.fit(X_train, y_train)
# Predict probabilities on the test set
y_pred_proba = model.predict_proba(X_test)[:, 1]

Step 4: Plot the ROC Curve

The roc_curve function is used to calculate the False Positive Rates (FPR), True Positive Rates (TPR), and corresponding thresholds with true labels and the predicted probabilities of belonging to the positive class as inputs.
plt.plot([0, 1], [0, 1], 'k--', label='No Skill') is used to plot a diagonal dashed line representing a classifier with no discriminative power (random guessing).

Python3

# Calculate ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba) 
roc_auc = auc(fpr, tpr)
# Plot the ROC curve
plt.figure()  
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--', label='No Skill')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve for Breast Cancer Classification')
plt.legend()
plt.show()

Output:

The dashed line represents the ROC Curve for the classifier. The AUC as 1.00, signifies perfect classification, meaning the model can distinguish malignant from benign tumors flawlessly at any threshold.

How to plot ROC curve in Python

The Receiver Operating Characteristic (ROC) curve is a fundamental tool in the field of machine learning for evaluating the performance of classification models. In this context, we’ll explore the ROC curve and its associated metrics using the breast cancer dataset, a widely used dataset for binary classification tasks.

ROC Curve in Python

Step 1: Importing the required libraries

Python3

Step 2: Loading the dataset

Python3

Step 3: Training and testing the model

Python3

Step 4: Plot the ROC Curve

Python3

How to plot ROC curve in Python

Categories

Contact US

ROC Curve in Python

Step 1: Importing the required libraries

Python3

Step 2: Loading the dataset

Python3

Step 3: Training and testing the model

Python3

Step 4: Plot the ROC Curve

Python3

How to plot ROC curve in Python

Similar Reads

Categories

Contact US