APITROVE BLOG

Supervised learning with scikit-learn

In supervised learning, we train a model on defined inputs and their known outputs.

In the simplest terms: in y=f(x), we know x and y, and we train the model to find the function f that maps x to y correctly.

So the training data must be known and accurate.

Supervised learning has two types:

1- Classification (predicting the category of the data, e.g. an image is either adult content or it is not)

2- Regression (predicting continuous values for given data, e.g. predicting tomorrow's dollar price from the data of the previous decade)


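scikit-learn has a basic syntax for training a model and predicting results for a given input (here "module" and "Model" are placeholders for a real scikit-learn module and model class):

-> from sklearn.module import Model

-> model = Model()

-> model.fit(x, y)

-> model.predict(target_x)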
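
As a concrete instance of the pattern above, here is a minimal sketch that fills in the placeholders with LinearRegression from sklearn.linear_model. The toy data (points on y = 2x + 1) is made up for illustration and is not from any real dataset.

```python
from sklearn.linear_model import LinearRegression

# Toy training data that follows y = 2x + 1 (assumed, for illustration only).
x = [[1], [2], [3], [4]]  # inputs: one row per sample, one column per feature
y = [3, 5, 7, 9]          # known outputs for those inputs

model = LinearRegression()
model.fit(x, y)           # learn the mapping from x to y

target_x = [[5]]          # a new input the model has not seen
prediction = model.predict(target_x)
print(prediction)         # an array with one predicted value, close to 11
```

Note that fit and predict expect the inputs as a 2D structure (samples by features), which is why each x value is wrapped in its own list.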


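Terms: labeled data (inputs with known outputs, such as the training data), unlabeled data (inputs without known outputs, such as new data we want the model to predict)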


K-nearest neighbours (KNN) classification model:

This algorithm predicts the result for a given target from its neighbouring data by majority vote. For example, let's say we want to know the color of a point in an xy-plane. If most of its k nearest points are red (k is the total number of neighbours to check), then the target will be marked as red, and so on.
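
The majority vote described above can be sketched in plain Python. This is only an illustration of the idea, not how scikit-learn implements KNN; the function name knn_predict and the sample points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(points, labels, target, k=3):
    """Label `target` by majority vote among its k nearest points."""
    # Pair each training point's distance to the target with that point's label.
    distances = [(math.dist(p, target), label) for p, label in zip(points, labels)]
    # Keep only the k closest points.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Count the labels among those neighbours and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up points in an xy-plane: a red cluster and a blue cluster.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["red", "red", "red", "blue", "blue", "blue"]

predicted_color = knn_predict(points, labels, target=(2, 2), k=3)
print(predicted_color)  # the 3 nearest points are all red, so this prints "red"
```

With two classes, an odd k avoids tied votes.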


*The code is quite simple. You can check the official docs for KNN to see an example.


accuracy = correct_predictions / total_tests
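
The formula can be worked through by hand in a few lines. The labels below are made up, and the helper name accuracy is just for this sketch.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct_predictions = sum(t == p for t, p in zip(true_labels, predicted_labels))
    total_tests = len(true_labels)
    return correct_predictions / total_tests

true_labels      = ["red", "blue", "red",  "red", "blue"]
predicted_labels = ["red", "blue", "blue", "red", "blue"]

score = accuracy(true_labels, predicted_labels)
print(score)  # 4 of 5 predictions are correct, so this prints 0.8
```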

To measure accuracy, we divide the data into training data and test data. The ratio is generally 7:3 (training to test) of the total data.
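
A minimal sketch of a 7:3 split with scikit-learn's train_test_split, followed by an accuracy check on the held-out test data. The iris dataset, n_neighbors=5 and random_state=0 are assumptions chosen only to make the example runnable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Small built-in dataset: 150 flowers, 4 features each, 3 classes.
X, y = load_iris(return_X_y=True)

# test_size=0.3 keeps 70% of the data for training and 30% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# For classifiers, score() returns correct_predictions / total_tests.
test_accuracy = model.score(X_test, y_test)
print(len(X_train), len(X_test))  # 105 45
print(test_accuracy)
```

The model never sees the test data during fit, so the score reflects how well it handles inputs it was not trained on.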