Build your first Machine Learning Model

A Machine Learning model tries to learn an equation that fits the data it is given.

For example, y = mx + c is an equation that predicts the value of y for a given value of x.

Let’s build a model that can learn the coefficients of such an equation. Consider the following table:

a   b   c   y
1   2   3   20
4   3   6   41
5   3   8   51

We have taken the output value y as the sum of 2*a, 3*b and 4*c, that is, y = 2a + 3b + 4c. We’ll train our model on a training dataset and see whether it correctly recovers these coefficients.
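As a quick sanity check, the rows of the table can be verified against the equation by hand or with a few lines of Python. This is only a small sketch; the helper name target is our own choice for illustration.

```python
# Rows from the table above: (a, b, c, y)
rows = [
    (1, 2, 3, 20),
    (4, 3, 6, 41),
    (5, 3, 8, 51),
]

def target(a, b, c):
    """The equation used in this article: y = 2a + 3b + 4c."""
    return 2 * a + 3 * b + 4 * c

# Compare the computed value with the value listed in the table
for a, b, c, y in rows:
    print(a, b, c, "->", target(a, b, c), "table value:", y)
```

Each computed value should match the y column of the table.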

Training Dataset

We’ll prepare a dummy dataset with the coefficients we have just discussed.

from random import randint

X = []
Y = []

for i in range(100):
    a = randint(0,50)
    b = randint(0,50)
    c = randint(0,50)

    f = (2*a) + (3*b) + (4*c)

    X.append([a,b,c])
    Y.append(f)

Our dataset contains 100 data points.

In the above code, X and Y are first defined as empty lists.

In each iteration of the loop, we pick random integer values between 0 and 50 for the variables a, b and c. We then calculate the output value from our own equation and store it in the variable ‘f’. Note that the coefficients here are 2, 3 and 4.

We then append these data points to the X and Y lists.
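Because the values are random, every run produces a different dataset. If you want repeatable results, one option is to use a seeded generator. The sketch below assumes a seed of 42 (an arbitrary choice of ours) and adds a couple of checks on the generated data.

```python
from random import Random

rng = Random(42)  # fixed seed: an assumption, chosen only for reproducibility

X, Y = [], []
for _ in range(100):
    # Same ranges and same equation as in the article
    a, b, c = (rng.randint(0, 50) for _ in range(3))
    X.append([a, b, c])
    Y.append(2 * a + 3 * b + 4 * c)

print(len(X), len(Y))   # number of inputs and outputs
print(X[0], Y[0])       # first data point and its output
```

Both lists should have 100 entries, and every entry of Y should equal 2*a + 3*b + 4*c for the matching row of X.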

Model

We’ll use a simple linear regression model to learn the coefficients of the equation.

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X,Y)

# Example: for x = [[1, 2, 3]]
# y = 2*1 + 3*2 + 4*3 = 2 + 6 + 12 = 20


We import the LinearRegression class from the sklearn package and create an instance of it, stored in a variable called model.

We then call the fit method, which trains the model on our dataset.

After training, the model holds its own estimates of the coefficients of our equation.
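To get a feel for what fitting means, here is a rough sketch of ordinary least squares in plain Python. It solves the so-called normal equations with a small Gauss-Jordan routine. This is not how scikit-learn implements fit internally; the function names solve and least_squares and the seed are our own assumptions for illustration.

```python
from random import Random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * p for x, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def least_squares(X, Y):
    """Return [intercept, w1, w2, ...] minimising the squared error."""
    rows = [[1.0] + [float(v) for v in x] for x in X]  # leading 1 for the intercept
    k = len(rows[0])
    # Normal equations: (R^T R) w = R^T Y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * y for r, y in zip(rows, Y)) for i in range(k)]
    return solve(A, b)

rng = Random(0)  # arbitrary seed, for repeatability
X = [[rng.randint(0, 50) for _ in range(3)] for _ in range(100)]
Y = [2 * a + 3 * b + 4 * c for a, b, c in X]

intercept, *coef = least_squares(X, Y)
print([round(w, 6) for w in coef], round(intercept, 6))
```

Since our data follow the equation exactly, the recovered coefficients should be very close to 2, 3 and 4, with an intercept very close to 0.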

Test the Model

We’ll check whether the model has learnt the equation correctly. Since we already know the equation, let’s take the input values 1, 2, 3. According to our equation, the output should be 20.

X_test = [[1,2,3]]
Y_pred = model.predict(X_test)
coef = model.coef_

print(f'Predicted value is {Y_pred} and coefficients are {coef}')

Output : Predicted value is [20.] and coefficients are [2. 3. 4.]

We can see that the model also predicts an output of 20, and the learnt coefficients match the 2, 3 and 4 we used. So, we can say that the model has learnt the coefficients of the equation very well.
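A single test point is a thin check, so it can help to try several fresh inputs and to look at the intercept as well. The sketch below rebuilds the dataset with a seeded generator (the seed and the test inputs are our own choices) and compares predictions against the known equation.

```python
from random import Random
from sklearn.linear_model import LinearRegression

rng = Random(1)  # arbitrary seed, for repeatability
X = [[rng.randint(0, 50) for _ in range(3)] for _ in range(100)]
Y = [2 * a + 3 * b + 4 * c for a, b, c in X]

model = LinearRegression()
model.fit(X, Y)

# Fresh inputs, including values outside the 0-50 training range
X_new = [[0, 0, 0], [10, 20, 30], [100, 200, 300]]
expected = [2 * a + 3 * b + 4 * c for a, b, c in X_new]
predicted = model.predict(X_new)

for x, want, got in zip(X_new, expected, predicted):
    print(x, "expected", want, "predicted", round(float(got), 6))

# The equation has no constant term, so the intercept should be close to 0
print("intercept:", round(float(model.intercept_), 6))
```

The predictions hold up even outside the training range here only because our data are exactly linear with no noise. With real, noisy data you should not expect this.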

Conclusion

We first built a training dataset using our own equation. We then created a simple Linear Regression model and trained it on that dataset. Finally, we checked the model’s prediction and coefficients against our own equation.

Feel free to tinker with the code and see if you get the expected output.