Prediction of New House Price in Melbourne
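Let's build a model of how a house's Price depends on ['Rooms', 'Bathroom', 'Landsize', 'Lattitude', 'Longtitude']. (The column names 'Lattitude' and 'Longtitude' are spelled this way in the dataset.)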
In [6]:
import pandas as pd  # pandas provides the DataFrame, a table structure similar to a SQL table
melbourne_file_path = r"C:\Users\32mou\Desktop\melb_data.csv\melb_data.csv"
melbourne_data = pd.read_csv(melbourne_file_path)
melbourne_data.describe()
# Checking for missing values is important: in the describe() output, a column whose count is lower than the others has missing values
Out[6]:
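To see the missing values directly instead of reading them off the count row, one option is to count them per column. This is a minimal sketch on a small made-up DataFrame (the `toy` frame and its values are hypothetical and stand in for melb_data.csv):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for melb_data.csv (values are made up).
toy = pd.DataFrame({
    "Rooms": [2, 3, 4, 3],
    "Price": [1_000_000.0, 1_500_000.0, np.nan, 900_000.0],
    "BuildingArea": [np.nan, 120.0, np.nan, 95.0],
})

# isnull() marks each missing cell as True; sum() counts them per column.
missing_per_column = toy.isnull().sum()
print(missing_per_column)
```

The same call on `melbourne_data` should show which columns would cause rows to be dropped later.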
In [7]:
melbourne_data.columns
Out[7]:
In [9]:
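# Drop rows with missing values. dropna returns a new DataFrame, so assign the result back.
melbourne_data = melbourne_data.dropna(axis=0)
melbourne_data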
Out[9]:
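One thing to watch with `dropna`: it returns a new DataFrame and leaves the original untouched unless the result is assigned. A small sketch with made-up data (the `toy` frame is hypothetical) shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (made-up values); one row has a missing Price.
toy = pd.DataFrame({"Rooms": [2, 3, 4], "Price": [1.0e6, np.nan, 9.0e5]})

# Calling dropna without assigning discards the result; toy is unchanged.
toy.dropna(axis=0)
rows_before = len(toy)

# Assigning the result keeps the cleaned frame.
cleaned = toy.dropna(axis=0)
rows_after = len(cleaned)

print(rows_before, rows_after)
```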
In [10]:
y = melbourne_data.Price
Choosing "Features"
In [17]:
melbourne_features = ['Rooms', 'Bathroom', 'Landsize', 'Lattitude', 'Longtitude']
X = melbourne_data[melbourne_features]
X.describe()
Out[17]:
In [16]:
X.head()
Out[16]:
In [18]:
from sklearn.tree import DecisionTreeRegressor
# Define the model.
# Specify random_state so that every run produces the same result.
melbourne_model = DecisionTreeRegressor(random_state=1)
#Fit model
melbourne_model.fit(X,y)
Out[18]:
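To illustrate what fixing `random_state` buys, here is a rough sketch on made-up data (`X_toy` and `y_toy` are hypothetical, not the Melbourne dataset). Two trees built with the same seed on the same data should give identical predictions:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data (made-up values), not the Melbourne dataset.
rng = np.random.default_rng(0)
X_toy = rng.integers(1, 6, size=(50, 3)).astype(float)
y_toy = rng.normal(loc=1_000_000, scale=200_000, size=50)

# Two models with the same seed, fitted on the same data.
model_a = DecisionTreeRegressor(random_state=1).fit(X_toy, y_toy)
model_b = DecisionTreeRegressor(random_state=1).fit(X_toy, y_toy)

# Their predictions match element by element.
same_predictions = np.array_equal(model_a.predict(X_toy), model_b.predict(X_toy))
print(same_predictions)
```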
Before predicting on all of the data, let's use just the first few rows of the training data above to check that prediction works.
In [19]:
print("Making predictions for the following 5 houses:")
print(X.head())
Let's compare the actual values with the predicted values to see how well the model predicted.
- Actual values: y.head()
- Predicted values: melbourne_model.predict(X.head())
In [20]:
print("The predictions are")
print(melbourne_model.predict(X.head()))
In [21]:
y.head()
Out[21]:
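The two outputs can also be lined up in one table. Below is a minimal sketch on made-up data (the `toy` frame, its values, and the `comparison` table are hypothetical). Because the rows being predicted are the same rows the tree was trained on, a near-perfect match is expected here and does not say much about how the model does on new houses.

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data (made-up values), not the Melbourne dataset.
toy = pd.DataFrame({
    "Rooms": [2, 3, 4, 3, 5],
    "Landsize": [150.0, 200.0, 400.0, 250.0, 600.0],
    "Price": [800_000.0, 1_000_000.0, 1_500_000.0, 1_100_000.0, 2_000_000.0],
})
X_toy = toy[["Rooms", "Landsize"]]
y_toy = toy["Price"]

model = DecisionTreeRegressor(random_state=1).fit(X_toy, y_toy)

# Put actual and predicted prices side by side, with the absolute error.
comparison = pd.DataFrame({"actual": y_toy, "predicted": model.predict(X_toy)})
comparison["abs_error"] = (comparison["actual"] - comparison["predicted"]).abs()

# Mean absolute error on the training rows themselves (in-sample).
in_sample_mae = mean_absolute_error(y_toy, model.predict(X_toy))
print(comparison)
print("In-sample MAE:", in_sample_mae)
```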