Bagging
- It's an ensemble learning technique where multiple models are trained in parallel (each on a bootstrap sample of the data) and a democratic voting (or averaging) scheme combines their outputs for the final prediction.
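A minimal sketch of bagging with scikit-learn, assuming a synthetic dataset and the default decision-tree base estimator (parameter values are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Synthetic dataset purely for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 50 trees is trained on an independent bootstrap sample;
# the final prediction is a majority vote across the trees.
bagger = BaggingClassifier(n_estimators=50, random_state=0)
bagger.fit(X_train, y_train)
print("bagging accuracy:", bagger.score(X_test, y_test))
```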
Boosting
- Uses weighted sampling: samples that previous learners got wrong are given more weight, so each new model is built to address the weaknesses of its predecessors.
- It's sequential in nature (see the sketch below).
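A minimal sketch of the sequential re-weighting idea using AdaBoost (one common boosting algorithm; dataset and parameters are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Learners are fit one after another; after each round the weights of
# misclassified samples are increased so the next learner focuses on them.
booster = AdaBoostClassifier(n_estimators=50, random_state=0)
booster.fit(X_train, y_train)
print("boosting accuracy:", booster.score(X_test, y_test))
```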

What is the difference between Bagging and Boosting?
Random Forest (Bagging Technique)
- Combines multiple weak learners, i.e. decision trees that are only slightly better than random guessing, into a stronger learner.
- Ensemble learning technique
- Each tree only considers a random subset of features, which decorrelates the trees (see the sketch below).
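A minimal random-forest sketch with scikit-learn; max_features="sqrt" is the per-split random feature subset mentioned above (dataset and parameter values are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 200 trees, each grown on a bootstrap sample and restricted to a random
# subset of about sqrt(n_features) candidate features at every split.
forest = RandomForestClassifier(n_estimators=200, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)
print("random forest accuracy:", forest.score(X_test, y_test))
```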
Gradient Boosting
- The idea comes from boosting: take a single weak model and improve it by combining many weak models into a strong model.
(Reference: Gradient Boosting, Decision Trees and XGBoost with CUDA, NVIDIA Technical Blog)
- Each successive model is fit on the errors (negative gradient) of the combined previous models.
- The ensemble is updated with a scaled, gradient-based step, where the learning rate controls the scale (see the sketch below).
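A from-scratch sketch of the fit-on-residuals idea for squared-error regression (for this loss the residuals are the negative gradient, and learning_rate is the scaling factor; dataset and parameter values are illustrative):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)

learning_rate = 0.1
n_rounds = 100

# Start from a constant prediction (the mean), then repeatedly fit a small
# tree to the current residuals and take a scaled step in that direction.
prediction = np.full(y.shape, y.mean())
trees = []
for _ in range(n_rounds):
    residuals = y - prediction          # negative gradient of the squared loss
    tree = DecisionTreeRegressor(max_depth=3, random_state=0)
    tree.fit(X, residuals)
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

print("final training MSE:", np.mean((y - prediction) ** 2))
```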
Gradient Boosted Trees
- Key idea: the loss is approximated with a second-order Taylor series expansion around the current predictions.
- XGBoost implementation has penalisation (regularisation) terms to address overfitting.
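A sketch of the regularised second-order objective, in the notation of the XGBoost paper (assuming the standard penalty on the number of leaves and the leaf weights):

$$
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n}\Big[g_i\, f_t(x_i) + \tfrac{1}{2} h_i\, f_t(x_i)^2\Big] + \Omega(f_t),
\qquad
\Omega(f_t) = \gamma T + \tfrac{1}{2}\lambda \lVert w \rVert^2
$$

where $g_i$ and $h_i$ are the first and second derivatives of the loss with respect to the previous prediction $\hat{y}_i^{(t-1)}$ (the second-order Taylor terms), $f_t$ is the new tree, $T$ its number of leaves, and $w$ its leaf weights.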
Extreme Gradient Boosting
- Well known for winning many Kaggle competitions and for state-of-the-art performance on structured (tabular) data.
- Can work on CUDA.
- Can handle data on the order of billions of rows.
- As long as the data fits in your RAM it tends to work well, but it is typically used for medium-sized datasets, on the order of 10k rows.
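A minimal XGBoost sketch, assuming the xgboost Python package is installed; the commented device="cuda" option (xgboost >= 2.0) is only needed for GPU training, and all parameter values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=10_000, n_features=30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(
    n_estimators=300,
    learning_rate=0.1,
    max_depth=6,
    reg_lambda=1.0,      # L2 penalty on leaf weights (the regularisation term above)
    tree_method="hist",
    # device="cuda",     # uncomment to train on a CUDA-capable GPU (xgboost >= 2.0)
    random_state=0,
)
model.fit(X_train, y_train)
print("xgboost accuracy:", model.score(X_test, y_test))
```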