Introduction to ML
Contributors: Razi
Machine learning is a subset of artificial intelligence (AI): rather than being explicitly programmed for a task, a machine-learning system improves at the task by learning from data.
Definition
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. (Tom Mitchell, 1998)
An automated process that extracts patterns from data. (Kelleher et al., 2015)
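For example, in a spam filter the task T is classifying incoming emails, the performance measure P is the fraction of emails classified correctly, and the experience E is a corpus of emails already labeled as spam or not.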
Types of Learning
- Supervised Learning: the model learns from labeled examples (input-output pairs), as in classification and regression; see the sketch after this list.
- Unsupervised Learning: the model finds structure in unlabeled data, as in clustering and dimensionality reduction.
- Reinforcement Learning: an agent learns by trial and error, choosing actions to maximize a reward signal from its environment.
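To make the first of these concrete, here is a minimal supervised-learning sketch (our illustration, not part of the original notes), fitting a classifier to scikit-learn's bundled iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Supervised learning: each row of X is paired with a label in y.
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Demo only: predicting on (training) samples the model has seen.
print(clf.predict(X[:3]))
```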
One family of techniques worth knowing in this space is ensemble methods, which work by aggregating predictions from multiple models. The key insight is that a group of weak learners can be combined to form a strong learner.
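As a back-of-the-envelope illustration (ours, and it assumes the voters err independently, which real models never quite do), the probability that a majority vote of weak classifiers is correct grows quickly with the number of voters:

```python
from math import comb

def majority_accuracy(n, p):
    """Probability that a majority vote of n independent classifiers,
    each correct with probability p, gives the right answer."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# Each voter is only 60% accurate, but the ensemble approaches certainty.
for n in (1, 11, 101):
    print(f"{n:>3} voters: {majority_accuracy(n, 0.6):.3f}")
```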
Types of Ensemble Methods
Bagging (Bootstrap Aggregating)
- Creates multiple versions of a model using bootstrap sampling
- Example: Random Forest
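To show what bootstrap sampling actually does, here is a small NumPy sketch (an illustration of ours, not from the original):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10  # pretend the training set has 10 rows, indexed 0..9

# Each bootstrap sample draws n rows *with replacement*, so some rows
# repeat and roughly a third are left out ("out-of-bag") each time.
for i in range(3):
    sample = rng.choice(n, size=n, replace=True)
    print(f"bootstrap sample {i}: {sorted(sample)}")
```

Each bagged model is trained on one such resample, and their predictions are then averaged or voted on.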
Boosting
- Sequentially trains models, with each new model focusing on the errors of previous ones
- Examples: AdaBoost, Gradient Boosting, XGBoost
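A minimal AdaBoost sketch with scikit-learn (ours; the dataset and hyperparameters are arbitrary choices for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# AdaBoost trains weak learners (decision stumps by default) in sequence,
# upweighting the points that previous rounds got wrong.
ada = AdaBoostClassifier(n_estimators=50, random_state=0)
ada.fit(X_train, y_train)
print(f"AdaBoost accuracy: {ada.score(X_test, y_test):.3f}")
```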
Stacking
- Combines predictions from multiple models using a meta-learner
- Can use different types of base models
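scikit-learn ships a StackingClassifier that implements this pattern; here is a minimal sketch (ours, with arbitrarily chosen base models):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two different kinds of base model; a logistic-regression meta-learner
# combines their (internally cross-validated) predictions.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svc", SVC(random_state=0))],
    final_estimator=LogisticRegression(),
)
stack.fit(X_train, y_train)
print(f"Stacking accuracy: {stack.score(X_test, y_test):.3f}")
```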
Why Use Ensemble Methods?
- Improved Accuracy: Combining models often leads to better predictions than any single model
- Reduced Overfitting: averaging reduces the variance of predictions (see the toy simulation after this list)
- Robustness: Less sensitive to outliers and noise in the data
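A toy simulation (ours, assuming independent, unbiased prediction errors) makes the variance-reduction point concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
true_value, noise = 1.0, 0.5

# 10,000 trials: one noisy prediction vs. the mean of 25 noisy predictions.
single = rng.normal(true_value, noise, size=10_000)
averaged = rng.normal(true_value, noise, size=(10_000, 25)).mean(axis=1)

print(f"variance of a single model: {single.var():.4f}")   # ~ 0.25
print(f"variance of 25-model mean : {averaged.var():.4f}")  # ~ 0.25 / 25
```

Real ensemble members are correlated, so the reduction is smaller in practice, but the direction of the effect is the same.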
Practical Implementation
Here's a simple example using scikit-learn:

```python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Create a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42  # fixed seed for reproducibility
)

# Random Forest (bagging)
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
print(f"Random Forest Accuracy: {rf.score(X_test, y_test):.3f}")

# Gradient Boosting
gb = GradientBoostingClassifier(n_estimators=100, random_state=42)
gb.fit(X_train, y_train)
print(f"Gradient Boosting Accuracy: {gb.score(X_test, y_test):.3f}")
```

Key Considerations
- Computational Cost: Ensemble methods require training multiple models
- Interpretability: Can be harder to understand than single models
- Hyperparameter Tuning: More parameters to optimize
Conclusion
Ensemble methods are essential tools in the machine learning toolkit. By combining multiple models, we can achieve better performance and more robust predictions.