Environment: IDE: VS Code, Language: Python
After model construction, the workflow is Training → Evaluation → Predict.
➕ Training → 'Validation' → Evaluation → Predict
: Adding a validation step makes training more systematic and its results more predictable
(e.g., like solving practice problems after studying)
⚠️ Performing validation does not guarantee better evaluation or prediction results
⭐ Ways to split off validation data
1. Splitting the data manually (x_val, y_val)
# validation_data_make.py
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# 1. Data
# For train
x_train = np.array(range(1, 11)) # 1,2,3,4,5,6,7,8,9,10
y_train = np.array(range(1, 11))
# For evaluate
x_test = np.array([11,12,13])
y_test = np.array([11,12,13])
# For validation
x_validation = np.array([14,15,16])
y_validation = np.array([14,15,16])
# 2. Model
model = Sequential()
model.add(Dense(32, input_dim=1, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))
# 3. compile and train
model.compile(loss='mse', optimizer='adam')
model.fit(x_train, y_train, epochs=128, batch_size=32,
validation_data=(x_validation, y_validation))
# Passing validation_data adds val_loss to the training output
# Training + 'Validation' + Evaluation (fit + 'validation' + evaluate)
# 4. evaluate and predict
loss = model.evaluate(x_test, y_test)
result = model.predict(np.array([17]))
print("predict [17]: ", result)
'''
Result(make val_data)
training loss - loss: 0.5571, val_loss: 5.2239
evaluate loss - loss: 2.8699
predict [17] - [14.324451]
'''
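For reference, fit() returns a History object, so the final loss/val_loss values shown above can also be read programmatically. A minimal sketch reusing the model and arrays from the script above (the history variable is illustrative):
# Sketch: reading per-epoch losses from the History object returned by fit()
history = model.fit(x_train, y_train, epochs=128, batch_size=32,
                    validation_data=(x_validation, y_validation))
print("last loss:    ", history.history['loss'][-1])      # training loss of the final epoch
print("last val_loss:", history.history['val_loss'][-1])  # validation loss of the final epoch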
2. Slicing the full dataset
# validation_data_slice.py
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# 1. Data
x = np.array(range(1, 17)) # 1~16
y = np.array(range(1, 17))
x_train = x[:10] # 1~10
y_train = y[:10]
x_test = x[10:13] # 11~13
y_test = y[10:13]
x_validation = x[13:] # 14~16
y_validation = y[13:]
# slicing: [start:end], start index inclusive, end index exclusive
# 2. Model
model = Sequential()
model.add(Dense(32, input_dim=1, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))
# 3. compile and train
model.compile(loss='mse', optimizer='adam')
model.fit(x_train, y_train, epochs=128, batch_size=32,
validation_data=(x_validation, y_validation))
# Passing validation_data adds val_loss to the training output
# Training + 'Validation' + Evaluation (fit + 'validation' + evaluate)
# 4. evaluate and predict
loss = model.evaluate(x_test, y_test)
result = model.predict(np.array([17]))
print("predict [17]: ", result)
'''
Result(slice val_data)
training loss - loss: 0.1758, val_loss: 0.0382
evaluate loss - loss: 0.0025
predict [17] - [16.688446]
'''
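A quick way to confirm the slice boundaries before training (a small check, not part of the original script):
# Sanity check of the slices: train 1~10, test 11~13, validation 14~16
print(x_train)       # [ 1  2  3  4  5  6  7  8  9 10]
print(x_test)        # [11 12 13]
print(x_validation)  # [14 15 16]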
3. Using train_test_split()
# validation_data_split.py
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
# 1. Data
x = np.array(range(1, 17))
y = np.array(range(1, 17))
x_train, x_test_tmp, y_train, y_test_tmp = train_test_split(
x, y,
shuffle=False,
train_size=0.625,
random_state=123
)
x_test, x_val, y_test, y_val = train_test_split(
x_test_tmp, y_test_tmp,
shuffle=False,
train_size=0.5,
random_state=123
)
# 2. Model
model = Sequential()
model.add(Dense(32, input_dim=1, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))
# 3. compile and train
model.compile(loss='mse', optimizer='adam')
model.fit(x_train, y_train, epochs=128, batch_size=32,
validation_data=(x_val, y_val))
# Passing validation_data adds val_loss to the training output
# Training + 'Validation' + Evaluation (fit + 'validation' + evaluate)
# 4. evaluate and predict
loss = model.evaluate(x_test, y_test)
result = model.predict(np.array([17]))
print("predict [17]: ", result)
'''
Result(split val_data)
training loss - loss: 0.0505, val_loss: 0.2850
evaluate loss - loss: 0.1105
predict [17] - [16.333452]
'''
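Note that with shuffle=False, train_test_split just cuts the arrays in order, so random_state has no effect here. If a shuffled split is wanted instead, the same two-step split can be done with shuffle=True (a sketch; the numbers above were produced with the unshuffled version):
# Sketch: the same two-step split with shuffling (10 train / 3 test / 3 validation)
x_train, x_tmp, y_train, y_tmp = train_test_split(x, y, train_size=0.625,
                                                  shuffle=True, random_state=123)
x_test, x_val, y_test, y_val = train_test_split(x_tmp, y_tmp, train_size=0.5,
                                                shuffle=True, random_state=123)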
4. Using validation_split in fit()
# validation_data_split(fit).py
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
# 1. Data
x = np.array(range(1, 17))
y = np.array(range(1, 17))
x_train, x_test, y_train, y_test = train_test_split(
x, y,
test_size=0.2,
random_state=1234
)
# 2. Model
model = Sequential()
model.add(Dense(32, input_dim=1, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))
# 3. compile and train
model.compile(loss='mse', optimizer='adam')
model.fit(x_train, y_train, epochs=128, batch_size=32,
validation_split=0.25)
# Passing validation_split adds val_loss to the training output
# Training + 'Validation' + Evaluation (fit + 'validation' + evaluate)
# 4. evaluate and predict
loss = model.evaluate(x_test, y_test)
result = model.predict(np.array([17]))
print("predict [17]: ", result)
'''
Result(split(fit) val_data)
training loss - loss: 0.1299, val_loss: 0.1981
evaluate loss - loss: 0.2424
predict [17] - [16.16812]
'''
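Keras draws the validation samples for validation_split from the last fraction of the training data (before shuffling), so validation_split=0.25 here is roughly equivalent to slicing x_train/y_train manually. A sketch of that equivalence, assuming the arrays from the script above (variable names are illustrative):
# Sketch: what validation_split=0.25 roughly does internally
n_val = int(len(x_train) * 0.25)                          # number of held-out samples
x_fit, x_val_manual = x_train[:-n_val], x_train[-n_val:]  # last 25% becomes validation data
y_fit, y_val_manual = y_train[:-n_val], y_train[-n_val:]
model.fit(x_fit, y_fit, epochs=128, batch_size=32,
          validation_data=(x_val_manual, y_val_manual))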
Source code