CART (Classification and Regression Trees)
CART is a decision tree algorithm that can be used for both classification and regression tasks. For classification, it works by finding splits that minimize the Gini impurity, a measure of how mixed the class labels in a subset are. When selecting a feature to split on, it calculates the Gini impurity for each possible split and chooses the one with the lowest impurity.
Gini impurity measures the likelihood of incorrectly classifying an element selected at random from the set, if it were labeled at random according to the distribution of labels in the set.
- Gini Impurity (for Classification): CART uses Gini impurity as the criterion to measure how pure or impure a dataset is. For a dataset D containing n classes, the Gini impurity Gini(D) is calculated as:
$Gini(D) = 1 - \sum_{i=1}^{n} p_i^2$

where $p_i$ is the proportion of samples in D that belong to class i.
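As a minimal sketch of this formula (the helper name `gini_impurity` is ours for illustration, not part of any library), the impurity can be computed directly from the class counts:

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini_impurity([0, 0, 0, 0]))  # 0.0 (pure node)
print(gini_impurity([0, 0, 1, 1]))  # 0.5 (maximally mixed binary node)
```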
In addition to classification, CART can build regression trees for continuous target variables. In this case, the algorithm chooses splits that minimize the variance of the target variable within each resulting subset.
- Mean Squared Error (for Regression): For regression tasks, CART evaluates splits using mean squared error (MSE), the average squared difference between the actual target values and the node's prediction (the mean target value in that node). The split with the lowest MSE is chosen:
$MSE(D) = \frac{1}{|D|}\sum_{i=1}^{|D|} (y_i - \bar{y})^2$

where $y_i$ are the target values, $\bar{y}$ is the mean of the target values in dataset D, and $|D|$ is the number of data points in D.
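A minimal sketch of this criterion (the function name `node_mse` is illustrative):

```python
import numpy as np

def node_mse(y):
    """MSE of a node: mean squared deviation of targets from the node mean,
    which is the constant prediction a regression tree makes at that node."""
    y = np.asarray(y, dtype=float)
    return np.mean((y - y.mean()) ** 2)

print(node_mse([3.0, 3.0, 3.0]))  # 0.0 (node predicts its mean exactly)
print(node_mse([1.0, 2.0, 3.0]))  # ~0.667
```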
CART uses a greedy strategy: at each stage it recursively partitions the dataset on the attribute that yields the purest subsets, minimizing Gini impurity for classification or MSE for regression. It examines all potential split points for each attribute and selects the one that produces the lowest weighted impurity across the resulting subsets, as sketched below.
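The exhaustive search can be sketched as follows, reusing the `gini_impurity` helper defined above (`best_split` is a hypothetical name for illustration, not a library function):

```python
import numpy as np

def best_split(X, y, impurity=gini_impurity):
    """Try every feature and every observed threshold; return the split
    with the lowest weighted impurity of the two child subsets."""
    y = np.asarray(y)
    best_feature, best_threshold, best_score = None, None, float("inf")
    n = len(y)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            mask = X[:, j] <= t
            left, right = y[mask], y[~mask]
            if len(left) == 0 or len(right) == 0:
                continue  # degenerate split, skip
            score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
            if score < best_score:
                best_feature, best_threshold, best_score = j, t, score
    return best_feature, best_threshold, best_score

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([0, 0, 1, 1])
print(best_split(X, y))  # best: feature 0, threshold 2.0, weighted impurity 0.0
```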
To reduce overfitting, CART applies a method known as cost-complexity pruning once the decision tree is constructed. A complexity parameter is added to the impurity measure, and the algorithm finds the subtree that minimizes the total cost: the sum of the tree's impurity and the complexity penalty. An example is shown below.
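scikit-learn exposes this directly: `cost_complexity_pruning_path` computes the effective alpha values at which subtrees get pruned, and the `ccp_alpha` parameter controls how aggressively a fitted tree is pruned. A sketch on the Iris dataset (assuming scikit-learn is installed):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Effective alphas at which subtrees are pruned away, from full tree to root.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    X_train, y_train)

# Refit at each alpha and keep the pruned tree that generalizes best.
best_alpha, best_score = 0.0, 0.0
for alpha in path.ccp_alphas:
    clf = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
    score = clf.fit(X_train, y_train).score(X_test, y_test)
    if score > best_score:
        best_alpha, best_score = alpha, score

print(f"best ccp_alpha={best_alpha:.4f}, test accuracy={best_score:.3f}")
```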
CART produces binary trees: every internal node has exactly two child nodes. This simplifies the splitting procedure and makes the resulting trees easier to interpret.
Decision Tree Algorithms
Decision trees are a type of machine learning algorithm that can be used for both classification and regression tasks. They work by learning simple decision rules inferred from the data features, and these rules can then be used to predict the value of the target variable for new data samples.
Decision trees are represented as tree structures, where each internal node represents a feature, each branch represents a decision rule, and each leaf node represents a prediction. The algorithm works by recursively splitting the data into smaller and smaller subsets based on the feature values. At each node, it chooses the feature that best separates the data with respect to the target variable, as the short example below illustrates.
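As a minimal, self-contained illustration using scikit-learn (one common implementation of CART-style trees):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(data.data, data.target)

# The learned rules are human-readable: each node tests a single feature.
print(export_text(tree, feature_names=data.feature_names))
print(tree.predict(data.data[:2]))  # class predictions for new samples
```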
Table of Contents
- Understanding Decision Trees
- Components of a Decision Tree
- Working of the Decision Tree Algorithm
- Understanding the Key Mathematical Concepts Behind Decision Trees
- Types of Decision Tree Algorithms
  - ID3 (Iterative Dichotomiser 3)
  - C4.5
  - CART (Classification and Regression Trees)
  - CHAID (Chi-Square Automatic Interaction Detection)
  - MARS (Multivariate Adaptive Regression Splines)
- Implementation of Decision Tree Algorithms