Once the model has been built, Evaluation checks how well it performs. This involves testing the model on new, unseen data and measuring accuracy, precision, recall, or other metrics relevant to the problem. Evaluation is critical because it shows whether the AI system is reliable and performs as intended. It can also expose weaknesses, such as bias in predictions or poor generalization to new cases.
In the school example, the model’s predicted exam results would be compared with the actual results to see how accurately it identifies students at risk. If the model performs poorly, adjustments can be made, for example retraining it with more data or different features, to improve its accuracy before deployment.
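As a minimal sketch of what that comparison looks like in practice (the Python code, the 0/1 "at risk" encoding, and the toy labels below are illustrative assumptions, not details from the example itself), accuracy, precision, and recall can all be computed directly from the predicted and actual outcomes:

```python
# Illustrative evaluation of a classifier that flags students "at risk".
# 1 = at risk, 0 = not at risk; these labels are made-up sample data.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # real exam outcomes (held-out test data)
predicted = [1, 0, 0, 1, 0, 1, 1, 0]  # model's predictions for the same students

# Tally the confusion-matrix counts.
tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # true positives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # false negatives

accuracy  = (tp + tn) / len(actual)  # share of all predictions that were correct
precision = tp / (tp + fp)           # of students flagged, how many were truly at risk
recall    = tp / (tp + fn)           # of students truly at risk, how many were flagged

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}")
```

Looking at precision and recall alongside accuracy matters here: a school would care both about not flagging students unnecessarily (precision) and about not missing students who genuinely need support (recall).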