research notes

Is the validation data set used for model training?

forest62590 2022. 2. 5. 23:12

If you want to build a solid model, you have to follow the standard protocol of splitting your data into three sets: one for training, one for validation, and one for final evaluation, which is the test set.
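A minimal sketch of such a three-way split, using only the Python standard library. The function name `three_way_split`, the 60/20/20 proportions, and the fixed seed are illustrative assumptions, not something the protocol prescribes.

```python
import random

def three_way_split(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then cut the data into train / validation / test parts."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # fixed seed so the split is reproducible

    n_test = int(len(items) * test_fraction)
    n_val = int(len(items) * val_fraction)

    test = items[:n_test]                # held back until the very end
    val = items[n_test:n_test + n_val]   # used for tuning decisions
    train = items[n_test + n_val:]       # the only part the model is fitted on
    return train, val, test

train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

The three parts are disjoint by construction, which is the property the rest of the protocol relies on.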

 

The idea is that you train on your training data and tune your model using the metrics (accuracy, loss, etc.) that you get from your validation set.

 

Your model doesn't "see" your validation set and isn't in any way trained on it, but you, as the architect and master of the hyperparameters, tune the model according to this data. Therefore it indirectly influences your model, because it directly influences your design decisions. You nudge your model to work well on the validation data, and that can introduce a bias toward that particular data.
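The division of roles can be sketched with a toy example. Everything here is an assumption made for illustration: the synthetic data from `make_data`, the hand-written k-nearest-neighbour classifier, and the candidate values for `k`. The point is only that the model is built from `train`, while the hyperparameter `k` is picked by looking at scores on `val`.

```python
import random

def make_data(n, rng):
    """Hypothetical toy data: label is 1 when x > 0.5, with 15% label noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.15:
            y = 1 - y
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (k is odd)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)

def accuracy(train, data, k):
    """Fraction of points in `data` that the model built from `train` gets right."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

rng = random.Random(0)
train, val = make_data(200, rng), make_data(100, rng)

# The model only ever uses `train` to make predictions. The validation set
# is used by us, the designers, to choose between candidate values of k.
candidates = [1, 3, 5, 9, 15, 25]
val_scores = {k: accuracy(train, val, k) for k in candidates}
best_k = max(val_scores, key=val_scores.get)
print(val_scores)
print("selected k:", best_k)
```

The selected `k` depends on which points happened to land in `val`, which is exactly the indirect influence described above.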

 

That is exactly why you evaluate your model's final score only on data that you have never used: the third chunk of data, your test set.

 

Only this procedure makes sure you get an unbiased view of your model's quality and its ability to generalize what it has learned to totally unseen data.
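The effect of selecting on the validation set can be shown with a deliberately extreme, hypothetical setup: labels are coin flips and every "model" guesses at random, so no model has real skill. The sizes (`n_val`, `n_test`, `n_models`) and the seed are arbitrary assumptions. Picking the best of many such models by validation accuracy yields a validation score that looks better than chance, while the score of that same model on the untouched test set is not inflated by the selection.

```python
import random

rng = random.Random(0)
n_val, n_test, n_models = 50, 50, 200

# Labels are coin flips, so the true accuracy of any model is 50%.
val_labels = [rng.randint(0, 1) for _ in range(n_val)]
test_labels = [rng.randint(0, 1) for _ in range(n_test)]

def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Each "model" is just a set of random guesses on both data sets.
results = []
for _ in range(n_models):
    val_preds = [rng.randint(0, 1) for _ in range(n_val)]
    test_preds = [rng.randint(0, 1) for _ in range(n_test)]
    results.append((accuracy(val_preds, val_labels),
                    accuracy(test_preds, test_labels)))

# Select the model with the highest validation accuracy, then read off
# its accuracy on the test set, which played no part in the selection.
best_val, best_test = max(results, key=lambda r: r[0])
print(f"validation accuracy of selected model: {best_val:.2f}")
print(f"test accuracy of selected model:       {best_test:.2f}")
```

Real hyperparameter tuning is far less extreme than choosing among random guessers, but the mechanism is the same: the validation score of the selected model has been optimized by the selection itself, so only the test score remains an honest estimate.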

 

References:

[1] https://stackoverflow.com/questions/46308374/what-is-validation-data-used-for-in-a-keras-sequential-model

[2] https://www.brainstobytes.com/test-training-and-validation-sets/
