Implementation of a Voting Ensemble with Support Vector Machines (SVM) and Decision Trees

In this implementation, we build a Voting Classifier that uses a Support Vector Machine (SVM) and a Decision Tree (DT) as base estimators, and we evaluate it on the breast cancer dataset.

Importing Necessary Libraries

Python3




from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.ensemble import VotingClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


Loading and splitting the dataset

Python3




# Load the breast cancer dataset
breast_cancer = load_breast_cancer()
X_bc, y_bc = breast_cancer.data, breast_cancer.target
 
# Split the dataset into training and test sets
X_train_bc, X_test_bc, y_train_bc, y_test_bc = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42)
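
As a quick sanity check on the split, the following sketch prints the shapes of the resulting arrays and the class counts in the training set. It reloads the data so that it runs on its own, and it assumes NumPy is importable (it is installed alongside scikit-learn).

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Same data and split parameters as in the article
X_bc, y_bc = load_breast_cancer(return_X_y=True)
X_train_bc, X_test_bc, y_train_bc, y_test_bc = train_test_split(
    X_bc, y_bc, test_size=0.2, random_state=42
)

print(X_train_bc.shape)  # (455, 30)
print(X_test_bc.shape)   # (114, 30)

# Number of training samples in class 0 (malignant) and class 1 (benign)
print(np.bincount(y_train_bc))
```

With 114 test rows, each misclassified sample changes the accuracy by 1/114, or roughly 0.0088.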


Creating Base Estimators

  • SVC (Support Vector Classifier): Setting probability=True makes the model expose predict_proba, so it can return a probability for each class. Soft voting in the VotingClassifier requires this, because it combines the class probabilities of the base estimators.
  • DecisionTreeClassifier: This classifier creates a model that predicts the value of a target variable by learning simple decision rules inferred from the data features. Each internal node represents a “test” on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label. In the context of the VotingClassifier, the decision tree serves as another base estimator for voting.

Python3




# Create the base estimators
svm_bc = SVC(probability=True)
dt_bc = DecisionTreeClassifier()
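
To make the description of internal nodes and leaves more concrete, the sketch below fits a deliberately shallow tree and prints its learned rules with export_text. The max_depth=2 and random_state=0 settings are illustrative assumptions chosen to keep the output short; they are not the settings used in this article.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()

# A shallow tree (illustrative choice) so the printed rules stay readable
shallow_tree = DecisionTreeClassifier(max_depth=2, random_state=0)
shallow_tree.fit(data.data, data.target)

# Lines containing a threshold are internal-node tests on a feature;
# lines starting with "class:" are leaves holding the predicted label.
rules = export_text(shallow_tree, feature_names=list(data.feature_names))
print(rules)
```

The unrestricted DecisionTreeClassifier() used above learns rules of the same form; it simply keeps splitting to a greater depth.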


Ensemble Learning

  • VotingClassifier creation: The VotingClassifier is created with estimators=[('svm', svm_bc), ('dt', dt_bc)], which specifies the named base estimators that take part in the vote. The voting='soft' parameter selects soft voting: the predicted class is the argmax of the sums of the predicted probabilities, that is, the class with the highest combined probability across the base estimators.
  • Training the voting classifier: The fit method is called on the voting_clf_bc object with the training data X_train_bc and y_train_bc to train the classifier on the breast cancer dataset.

Python3




# Create the voting classifier
voting_clf_bc = VotingClassifier(estimators=[('svm', svm_bc), ('dt', dt_bc)], voting='soft')
 
# Train the voting classifier
voting_clf_bc.fit(X_train_bc, y_train_bc)
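
Soft voting can be reproduced by hand, which may help clarify what the classifier does internally. The sketch below averages the predict_proba outputs of the fitted base estimators and takes the argmax, then compares the result with the classifier's own predictions. It is a self-contained illustration; the variable names and the random_state=0 seeds are assumptions added for repeatability and do not appear in the article's code.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

clf = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    voting="soft",
)
clf.fit(X_train, y_train)

# Class probabilities from each fitted base estimator,
# stacked to shape (n_estimators, n_samples, n_classes)
probas = np.stack([est.predict_proba(X_test) for est in clf.estimators_])

# Soft voting: average over the estimators, then pick the most probable class
avg_proba = probas.mean(axis=0)
manual_pred = clf.classes_[np.argmax(avg_proba, axis=1)]

# Compare the hand-computed labels with the classifier's predictions
print(np.array_equal(manual_pred, clf.predict(X_test)))
```

Because no weights are passed, every estimator contributes equally, so the comparison is expected to print True. Averaging and summing the probabilities select the same class.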


Evaluation of the Model

  • Making predictions: The predict method is called on the voting_clf_bc object with the test data X_test_bc to make predictions for the breast cancer dataset.
  • Evaluating accuracy: The accuracy_score function is used to compare the predicted labels y_pred_bc with the actual labels y_test_bc from the test set. The accuracy is then printed to the console using f-string formatting.

Python3




# Make predictions
y_pred_bc = voting_clf_bc.predict(X_test_bc)
 
# Evaluate the accuracy
accuracy_bc = accuracy_score(y_test_bc, y_pred_bc)
print(f'Accuracy on breast cancer dataset: {accuracy_bc}')


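Output (the exact value can vary slightly between runs, because neither base estimator is given a random_state):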

Accuracy on breast cancer dataset: 0.9385964912280702
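
To see how the ensemble compares with its members, one option is to score each fitted base estimator on the same test set. The sketch below does this through the named_estimators_ attribute of the fitted VotingClassifier. The random_state=0 seeds and the variable names are assumptions added so that repeated runs give the same numbers; no expected values are shown because they depend on those seeds and on the library version.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

voting_clf = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    voting="soft",
)
voting_clf.fit(X_train, y_train)

# named_estimators_ maps each name to the fitted clone of that base estimator.
# The clones are trained on label-encoded targets; the labels here are already
# 0 and 1, so they can be scored against y_test directly.
scores = {
    name: est.score(X_test, y_test)
    for name, est in voting_clf.named_estimators_.items()
}
scores["voting"] = voting_clf.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name}: {acc:.4f}")
```

The ensemble is not guaranteed to beat both members on a single split; comparing the three numbers shows whether voting helps for this particular partition of the data.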


