Naver Cloud with BitCamp/Artificial Intelligence

Validation Data

by HJ0216 2023. 1. 22.

Environment: IDE: VS Code, Language: Python

 

After building a model, the usual workflow is Training → Evaluation → Predict

➕ Training → 'Validation' → Evaluation → Predict

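: Adding a validation step makes training more systematic and its outcome more predictable, because the model is checked during training on data it does not learn from
(Analogy: working through practice problems after studying)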

⚠️ Running validation does not guarantee better evaluation or prediction results
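
One way to see what the validation step contributes is to compare `loss` and `val_loss` epoch by epoch. Keras's `fit()` returns a `History` object whose `history` attribute is a dict of per-epoch lists. The sketch below uses a plain dict with made-up numbers as a stand-in for it, and `best_epoch` is a hypothetical helper name, not part of Keras.

```python
# Stand-in for `history.history` as returned by model.fit(...).
# The numbers are invented for illustration, not measured results.
history = {
    "loss":     [4.0, 2.0, 1.0, 0.5, 0.3],
    "val_loss": [5.0, 3.0, 2.5, 2.7, 3.1],
}


def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    lowest = min(val_losses)
    return val_losses.index(lowest) + 1


epoch = best_epoch(history["val_loss"])
print("lowest val_loss at epoch", epoch)

# Compare the final epoch against the best validation epoch.
still_improving = history["loss"][-1] < history["loss"][epoch - 1]
val_getting_worse = history["val_loss"][-1] > history["val_loss"][epoch - 1]

# Training loss still falling while val_loss rises suggests that
# further training is no longer helping on unseen data.
print(still_improving and val_getting_worse)
```

A falling `loss` alone says little about unseen data, which is why a lower training loss does not automatically mean a better evaluation result.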

 

How to split off validation data

 

1. Define it by hand (x_val, y_val)

# validation_data_make.py

import numpy as np

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense


# 1. Data
# For train
x_train = np.array(range(1, 11)) # 1,2,3,4,5,6,7,8,9,10
y_train = np.array(range(1, 11))

# For evaluate
x_test = np.array([11,12,13])
y_test = np.array([11,12,13])

# For validation
x_validation = np.array([14,15,16])
y_validation = np.array([14,15,16])


# 2. Model
model = Sequential()
model.add(Dense(32, input_dim=1, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))


# 3. compile and train
model.compile(loss='mse', optimizer='adam')
model.fit(x_train, y_train, epochs=128, batch_size=32,
          validation_data=(x_validation, y_validation))
# validation_data adds val_loss to the training log
# train + 'validate' + evaluate (fit + 'validation' + evaluate)


# 4. evaluate and predict
loss = model.evaluate(x_test, y_test)

result = model.predict(np.array([17]))  # pass a NumPy array; newer Keras versions may reject a plain list
print("predict [17]: ", result)



'''
Result(make val_data)
training loss - loss: 0.5571, val_loss: 5.2239
evaluate loss - loss: 2.8699
predict [17] - [14.324451]

'''

 

2. Slice the full dataset

# validation_data_slice.py

import numpy as np

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense


# 1. Data
x = np.array(range(1, 17)) # 1~16
y = np.array(range(1, 17))

x_train = x[:10] # 1~10
y_train = y[:10]
x_test = x[10:13] # 11~13
y_test = y[10:13]
x_validation = x[13:] # 14~16
y_validation = y[13:]
# slicing [start:stop]: 0-based, start index inclusive, stop index exclusive


# 2. Model
model = Sequential()
model.add(Dense(32, input_dim=1, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))


# 3. compile and train
model.compile(loss='mse', optimizer='adam')
model.fit(x_train, y_train, epochs=128, batch_size=32,
          validation_data=(x_validation, y_validation))
# validation_data adds val_loss to the training log
# train + 'validate' + evaluate (fit + 'validation' + evaluate)


# 4. evaluate and predict
loss = model.evaluate(x_test, y_test)

result = model.predict(np.array([17]))  # pass a NumPy array; newer Keras versions may reject a plain list
print("predict [17]: ", result)



'''
Result(slice val_data)
training loss - loss: 0.1758, val_loss: 0.0382
evaluate loss - loss: 0.0025
predict [17] - [16.688446]

'''
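
The slice boundaries above can be checked without training anything. This minimal NumPy sketch confirms that the three slices do not overlap and together cover the whole array. The variable names simply mirror the script.

```python
import numpy as np

x = np.array(range(1, 17))  # 1~16

x_train = x[:10]        # indices 0..9   -> values 1~10
x_test = x[10:13]       # indices 10..12 -> values 11~13
x_validation = x[13:]   # indices 13..15 -> values 14~16

# The stop index is exclusive, so adjacent slices never share an element.
parts = np.concatenate([x_train, x_test, x_validation])
print(len(x_train), len(x_test), len(x_validation))
print(np.array_equal(parts, x))
```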

 

3. Use train_test_split()

# validation_data_split.py

import numpy as np

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

from sklearn.model_selection import train_test_split


# 1. Data
x = np.array(range(1, 17))
y = np.array(range(1, 17))


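# First split: 16 * 0.625 = 10 samples for training, 6 left over
# shuffle=False keeps the original order, so random_state has no effect here
x_train, x_test_tmp, y_train, y_test_tmp = train_test_split(
    x, y,
    shuffle=False,
    train_size=0.625,
    random_state=123
)

# Second split: halve the remaining 6 samples into test (3) and validation (3)
x_test, x_val, y_test, y_val = train_test_split(
    x_test_tmp, y_test_tmp,
    shuffle=False,
    train_size=0.5,
    random_state=123
)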


# 2. Model
model = Sequential()
model.add(Dense(32, input_dim=1, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))


# 3. compile and train
model.compile(loss='mse', optimizer='adam')
model.fit(x_train, y_train, epochs=128, batch_size=32,
          validation_data=(x_val, y_val))
# validation_data adds val_loss to the training log
# train + 'validate' + evaluate (fit + 'validation' + evaluate)


# 4. evaluate and predict
loss = model.evaluate(x_test, y_test)

result = model.predict(np.array([17]))  # pass a NumPy array; newer Keras versions may reject a plain list
print("predict [17]: ", result)



'''
Result(split val_data)
training loss - loss: 0.0505, val_loss: 0.2850
evaluate loss - loss: 0.1105
predict [17] - [16.333452]

'''
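
The two `train_size` values in this script follow from the target proportions: 10/16 = 0.625 of the data for training, then half of the remaining 6 samples for testing. A minimal sketch of that arithmetic, assuming a hypothetical helper `three_way_split` (not part of scikit-learn) that takes the desired train/test/validation fractions of the whole dataset:

```python
import numpy as np
from sklearn.model_selection import train_test_split


def three_way_split(x, y, train_frac, test_frac, val_frac,
                    shuffle=False, seed=None):
    """Hypothetical helper: split into train/test/val by overall fractions."""
    # First cut: training set vs. everything else.
    x_train, x_rest, y_train, y_rest = train_test_split(
        x, y, train_size=train_frac, shuffle=shuffle, random_state=seed)
    # Second cut: the share of the remainder that goes to the test set.
    test_share = test_frac / (test_frac + val_frac)
    x_test, x_val, y_test, y_val = train_test_split(
        x_rest, y_rest, train_size=test_share, shuffle=shuffle,
        random_state=seed)
    return (x_train, y_train), (x_test, y_test), (x_val, y_val)


x = np.array(range(1, 17))
y = np.array(range(1, 17))

(x_train, y_train), (x_test, y_test), (x_val, y_val) = three_way_split(
    x, y, train_frac=0.625, test_frac=0.1875, val_frac=0.1875)

print(x_train, x_test, x_val)
```

With `shuffle=False` the result matches the slicing approach. Passing `shuffle=True` and a `seed` would give a random but reproducible split instead.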

 

4. Use validation_split in fit()

# validation_data_split(fit).py

import numpy as np

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

from sklearn.model_selection import train_test_split


# 1. Data
x = np.array(range(1, 17))
y = np.array(range(1, 17))

x_train, x_test, y_train, y_test = train_test_split(
    x,y,
    test_size=0.2,
    random_state=1234
)


# 2. Model
model = Sequential()
model.add(Dense(32, input_dim=1, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))


# 3. compile and train
model.compile(loss='mse', optimizer='adam')
model.fit(x_train, y_train, epochs=128, batch_size=32,
          validation_split=0.25)
# validation_split adds val_loss to the training log (25% of x_train is held out for validation)
# train + 'validate' + evaluate (fit + 'validation' + evaluate)


# 4. evaluate and predict
loss = model.evaluate(x_test, y_test)

result = model.predict(np.array([17]))  # pass a NumPy array; newer Keras versions may reject a plain list
print("predict [17]: ", result)



'''
Result(split(fit) val_data)
training loss - loss: 0.1299, val_loss: 0.1981
evaluate loss - loss: 0.2424
predict [17] - [16.16812]

'''
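
According to the Keras documentation, `validation_split` holds out the last fraction of the arrays passed to `fit()`, taken before any shuffling. Below is a rough NumPy imitation of that behavior. `split_last_fraction` is a hypothetical name, and the floor-based rounding of the cut point is an assumption.

```python
import math

import numpy as np


def split_last_fraction(x, y, validation_split):
    """Hold out the trailing fraction of (x, y) as validation data."""
    split_at = math.floor(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])


# 12 samples stand in for x_train after the 80/20 train/test split of 16.
x_train = np.arange(1, 13)
y_train = np.arange(1, 13)

(x_fit, y_fit), (x_val, y_val) = split_last_fraction(x_train, y_train, 0.25)
print(len(x_fit), len(x_val))
print(x_val)
```

In the script above, `x_train` has already been shuffled by `train_test_split`, so the held-out tail is effectively a random subset rather than the largest values.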

 

 

 

Source code

🔗 HJ0216/TIL