Learn to handle missing values, non-numeric values, data leakage, and more. Your models will be more accurate and useful.
You will accelerate your machine learning expertise by learning how to:
- tackle data types often found in real-world datasets (missing values, categorical variables),
- design pipelines to improve the quality of your machine learning code,
- use advanced techniques for model validation (cross-validation),
- build state-of-the-art models that are widely used to win Kaggle competitions (XGBoost), and
- avoid common and important data science mistakes (leakage).
from sklearn.ensemble import RandomForestRegressor
# Define the models
model_1 = RandomForestRegressor(n_estimators=50, random_state=0)
model_2 = RandomForestRegressor(n_estimators=100, random_state=0)
model_3 = RandomForestRegressor(n_estimators=100, criterion='mae', random_state=0)  # note: 'mae' was renamed 'absolute_error' in newer scikit-learn releases
model_4 = RandomForestRegressor(n_estimators=200, min_samples_split=20, random_state=0)
model_5 = RandomForestRegressor(n_estimators=100, max_depth=7, random_state=0)
models = [model_1, model_2, model_3, model_4, model_5]
After building this variety of models, measure the MAE of each one and pick the model that produces the lowest MAE.
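A minimal sketch of how that comparison might look, assuming a placeholder dataset from make_regression in place of the course's housing data. The split variables X_train, X_valid, y_train, y_valid and the score_model helper are illustrative assumptions, not part of the original post; also note model_3's criterion='mae' needs to be 'absolute_error' on scikit-learn 1.2 or later.

from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the course's housing dataset.
X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, train_size=0.8, random_state=0)

def score_model(model):
    """Fit on the training split and return the validation MAE."""
    model.fit(X_train, y_train)
    preds = model.predict(X_valid)
    return mean_absolute_error(y_valid, preds)

# Score every candidate and keep the one with the lowest MAE.
mae_scores = [score_model(m) for m in models]
for i, mae in enumerate(mae_scores, start=1):
    print(f"Model {i} MAE: {mae:.2f}")

best_model = models[mae_scores.index(min(mae_scores))]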