research notes
Is the validation data set used to train the model?
If you want to build a solid model, you have to follow a specific protocol: split your data into three sets, one for training, one for validation, and one for final evaluation (the test set).
The idea is that you train on your training data and tune your model using the metrics (accuracy, loss, etc.) you get from your validation set.
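The three-way split described above can be sketched with scikit-learn's `train_test_split` applied twice; the 60/20/20 ratios and toy arrays here are illustrative assumptions, not part of the original text:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)  # toy features: 50 samples, 2 columns
y = np.arange(50)                  # toy labels

# First carve off the test set (20%); it stays untouched until the very end.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Split the remainder into train and validation
# (25% of the remaining 80% = 20% of the total).
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.25, random_state=42
)

print(len(X_train), len(X_val), len(X_test))  # 30 10 10
```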
Your model doesn't "see" your validation set and isn't trained on it in any way, but you, as the architect and master of the hyperparameters, tune the model according to this data. It therefore indirectly influences your model, because it directly influences your design decisions: you nudge your model to work well with the validation data, and that can introduce bias.
That is exactly why you only evaluate your model's final score on data you have never used, and that is the third chunk of data, your test set.
Only this procedure gives you an unbiased view of your model's quality and its ability to generalize what it has learned to totally unseen data.
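The full procedure, tuning on the validation set and scoring the test set exactly once, might look like the following sketch. The logistic-regression model, the candidate `C` values, and the synthetic data are hypothetical choices for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic classification data, split 60/20/20 into train/val/test.
X, y = make_classification(n_samples=300, random_state=0)
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

# Hyperparameter tuning: every candidate is compared on the VALIDATION set only.
best_C, best_score = None, -1.0
for C in (0.01, 0.1, 1.0, 10.0):  # candidate regularization strengths
    model = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    score = model.score(X_val, y_val)  # validation accuracy drives the choice
    if score > best_score:
        best_C, best_score = C, score

# Final evaluation: the TEST set is scored once, after all decisions are made.
final = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
print("test accuracy:", final.score(X_test, y_test))
```

Because the loop repeatedly consults the validation score, that score is optimistically biased, which is precisely why the held-out test score is the only number worth reporting.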
References:
[2] https://www.brainstobytes.com/test-training-and-validation-sets/