Model Metrics Evaluation (Part 3)

In our introductory post here, my teammates and I covered the data preparation steps (acquiring the dataset, importing the required libraries, importing the dataset, and so on) as well as model creation.

In this section, we will analyze the models using a few model metrics as discussed below.


What is model metric evaluation? Building machine learning models works on a constructive feedback principle. You build a model, get feedback from metrics, make adjustments, and repeat until a desirable accuracy is achieved. Evaluation metrics quantify how well a model performs. The ability to discriminate between model results is an essential part of assessing model performance.

I’ve seen many aspiring data scientists not even bother to verify how robust their model is. Once a model is built, they hastily use it to predict values for unseen data. This hasty application of machine learning models is not a good strategy.

Simply building a classification, generative, or predictive model should not be the goal. The goal should be to design and select a model that gives high accuracy on out-of-sample data. It is, therefore, important to verify the performance of your model before using it to compute predictions. Remember: garbage in, garbage out.

Metrics used in our models

Below we cover the metrics that we applied to our models. There are many metrics one can use; however, to choose the right one, consider both the data you have and the model.

The F1 score is the harmonic mean of precision and recall for a classification problem. The F1 score formula is shown below.

F1 score formula: F1 = 2 × (precision × recall) / (precision + recall)

We will implement the same in Python and get scores for our models.
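
As a minimal sketch of the calculation (not the code we ran on our models), the F1 score can be computed by hand from true and predicted labels. The function name `f1_from_labels` and the toy labels below are made up for illustration; for binary labels, scikit-learn's `sklearn.metrics.f1_score` should give the same value.

```python
def f1_from_labels(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells that precision and recall depend on.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels (hypothetical, not taken from our dataset).
toy_true = [1, 0, 1, 1, 0, 1, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 1, 1, 0]

precision, recall, f1 = f1_from_labels(toy_true, toy_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# precision=0.600 recall=0.750 f1=0.667
```

Here there are 3 true positives, 2 false positives and 1 false negative, so precision is 3/5 and recall is 3/4.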

As noted above, the F1 score uses the harmonic mean rather than the arithmetic mean. The harmonic mean matters because it punishes extreme values more: if either precision or recall is very low, the F1 score is pulled down toward it.
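
To make the difference concrete, here is a small sketch comparing the two means on made-up precision and recall pairs. The helper names and the numbers are purely illustrative assumptions.

```python
def arithmetic_mean(a, b):
    return (a + b) / 2


def harmonic_mean(a, b):
    # The harmonic mean of two values; defined as 0 when both are 0.
    return 2 * a * b / (a + b) if (a + b) else 0.0


# Hypothetical (precision, recall) pairs.
balanced = (0.8, 0.8)
lopsided = (1.0, 0.1)

for name, (p, r) in [("balanced", balanced), ("lopsided", lopsided)]:
    print(f"{name}: arithmetic={arithmetic_mean(p, r):.3f} "
          f"harmonic={harmonic_mean(p, r):.3f}")
# balanced: arithmetic=0.800 harmonic=0.800
# lopsided: arithmetic=0.550 harmonic=0.182
```

With the lopsided pair, the arithmetic mean still looks acceptable at 0.55, while the harmonic mean drops to about 0.18 because recall is so low.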

The mean absolute error of a model with respect to a test set is the mean of the absolute values of the individual prediction errors over all instances in the test set.
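
The definition above can be sketched in a few lines of plain Python. The toy values are hypothetical and unrelated to our Random Forest results; scikit-learn offers the same calculation as `sklearn.metrics.mean_absolute_error`.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of |actual - predicted| over all instances."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one instance is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values (hypothetical, for illustration only).
toy_actual = [3.0, 5.0, 2.0, 7.0]
toy_predicted = [2.5, 5.0, 4.0, 8.0]

# Absolute errors are 0.5, 0.0, 2.0 and 1.0, so the mean is 3.5 / 4.
print("Mean Absolute Error:", mean_absolute_error(toy_actual, toy_predicted))
# Mean Absolute Error: 0.875
```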


We will implement the same in Python, applying it to our Random Forest model as an example.


Mean Absolute Error: 0.42857142857142855


Model metrics analysis is important for any machine learning model. It is not ideal to create a model without analyzing how it performs, because the model may hide a lot, and trusting its results blindly may be disastrous.

GitHub link to full code

Data scientist | Artificial intelligence | Applied problem-solving enthusiast | Computational Sciences major