The Random Forest algorithm is an extension of the bagging method.
Here’s how it differs and why it is more effective.
Random Forests use bagging (bootstrap aggregating), which means creating multiple training sets by randomly sampling the original data with replacement.
Each set trains a separate decision tree.
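The bootstrap sampling step can be sketched in a few lines. This is a minimal illustration using NumPy (not part of the original text): each bootstrap set draws as many indices as there are samples, with replacement, so some samples repeat and others are left out entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.arange(10)  # toy dataset of 10 samples

# Draw three bootstrap sets: each picks 10 indices with replacement
bootstrap_sets = [rng.choice(len(X), size=len(X), replace=True) for _ in range(3)]

# With replacement, indices repeat and some samples are never drawn
# (the left-out "out-of-bag" samples can later be used for validation)
for idx in bootstrap_sets:
    print(sorted(idx))
```

Each of these index sets would then be used to train one decision tree in the ensemble.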
Unlike regular decision trees that consider all features for each split, Random Forests use only a random subset of features.
This technique is called feature bagging or the random subspace method.
- Decision Trees - consider all features for each split, which can lead to highly correlated trees.
- Random Forests - select a random subset of features for each split, making the trees less correlated and more diverse.
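In scikit-learn (assumed here as an illustration, not prescribed by the original text), this difference is exposed through the `max_features` parameter: a plain decision tree considers every feature at each split, while a Random Forest can be told to consider only a random subset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy dataset with 20 features
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# A single tree evaluates all 20 features at every split
tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# The forest restricts each split to a random feature subset:
# max_features="sqrt" means roughly 4 of the 20 features per split
forest = RandomForestClassifier(
    n_estimators=100, max_features="sqrt", random_state=0
).fit(X, y)

print(tree.score(X, y), forest.score(X, y))
```

Because each tree in the forest sees a different bootstrap sample and a different feature subset at every split, the individual trees disagree more, which is exactly what makes averaging them effective.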
Using different subsets of features ensures that the trees in the forest are not too similar.
This reduces the overall variance and improves the model’s performance.
Random Forests enhance the bagging method by adding feature randomness, which leads to a more diverse and powerful collection of decision trees.
This results in more accurate and reliable predictions.
Get in touch at jain.van@northeastern.edu