Classification is a machine learning task that assigns a label value to each observation and uses these labelled observations to learn to identify which class a new example belongs to. An example is the classification of email as either spam or not spam.
To build any classification model, you need a training dataset with many examples of inputs (feature variables) and outputs (target variables) from which the model learns. The training data should cover the possible scenarios of the problem and contain sufficient examples of each label for the model to train correctly. Class labels are often given as string values and hence need to be encoded as integers, for example 0 for "not spam" and 1 for "spam".
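As a minimal sketch of this encoding step, scikit-learn's `LabelEncoder` maps string labels to integer codes (the example labels below are illustrative):

```python
# Encode string class labels as integers with scikit-learn's LabelEncoder.
from sklearn.preprocessing import LabelEncoder

labels = ["spam", "not spam", "not spam", "spam"]
encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

print(list(encoded))           # [1, 0, 0, 1] - codes assigned alphabetically
print(list(encoder.classes_))  # ['not spam', 'spam']
```

Note that `LabelEncoder` assigns codes by sorting the class names alphabetically, so here "not spam" happens to get 0 and "spam" gets 1, matching the usual normal/abnormal convention.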
Types of classification tasks
Binary classification
Multi-class classification
Multi-label classification
Imbalanced classification
Binary Classification for Machine Learning
Binary classification refers to tasks that predict one of exactly two classes as output
One class is normally considered the normal state, and the other the abnormal state
Email spam detection: Normal State - Not Spam, Abnormal State - Spam
Churn prediction: Normal State - Not Churned, Abnormal State - Churned
The most commonly followed notation assigns 0 to the normal state and 1 to the abnormal state
Instead of predicting a class label directly, a model can also predict a Bernoulli probability for the output, i.e. the probability that the example belongs to the abnormal class
The most popular algorithms for binary classification tasks are:
K-Nearest Neighbours
Logistic Regression
Support Vector Machine
Decision Trees
Naive Bayes
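A minimal binary-classification sketch using scikit-learn's `LogisticRegression` on a synthetic dataset (the dataset and parameter choices here are illustrative, not from the text). It shows both hard 0/1 predictions and the Bernoulli probabilities mentioned above:

```python
# Binary classification: hard labels vs. Bernoulli probabilities.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-class dataset: 0 = normal state, 1 = abnormal state.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)

print(model.predict(X_test[:3]))        # hard 0/1 class labels
print(model.predict_proba(X_test[:3]))  # probability per class, rows sum to 1
```

The same pattern works with the other listed algorithms (e.g. `KNeighborsClassifier`, `DecisionTreeClassifier`), since scikit-learn classifiers share the `fit`/`predict`/`predict_proba` interface.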
Multi-class Classification for Machine Learning
Multi-class classification tasks can have any number of class labels, with a minimum of three
Examples are:
Plant species classification
Sentiment analysis (happy, sad, neutral)
Optical character recognition
These models are normally based on a categorical distribution
The model predicts a probability for the input with respect to each of the output labels
The most common algorithms used for multi-class classification are:
K-Nearest Neighbours
Naive Bayes
Decision Trees
Gradient Boosting
Random Forest
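A minimal multi-class sketch using a Random Forest on the Iris dataset (three plant species, matching the plant-species example above); the choice of dataset and model is illustrative:

```python
# Multi-class classification: one probability per class (categorical).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)  # three species, labels 0, 1, 2
clf = RandomForestClassifier(random_state=0).fit(X, y)

proba = clf.predict_proba(X[:1])
print(proba.shape)  # (1, 3): one probability column per species
```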
Multi-label Classification for Machine Learning
In these tasks, two or more class labels may be predicted simultaneously for each example
An example is a single photo in which the model must identify all the objects present
The commonly used algorithms are:
Multi-label Random Forest
Multi-label Decision Trees
Multi-label Gradient Boosting
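As a minimal multi-label sketch, scikit-learn's `RandomForestClassifier` natively supports a label-indicator target matrix, so it can predict several labels per example at once (the synthetic dataset and its parameters are illustrative):

```python
# Multi-label classification: each row of Y is a 0/1 indicator per label.
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier

# 100 examples, 4 possible labels; each example may have several labels.
X, Y = make_multilabel_classification(n_samples=100, n_classes=4,
                                      random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, Y)

pred = clf.predict(X[:2])
print(pred.shape)  # (2, 4): one 0/1 indicator per label, per example
```

For algorithms without native multi-label support, scikit-learn's `MultiOutputClassifier` wrapper fits one classifier per label.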
Imbalanced Classification for Machine Learning
This refers to tasks where the number of examples in each class is unequally distributed
Typically, imbalanced classification tasks are binary classification problems where the majority of the training dataset belongs to the normal class and only a minority belongs to the abnormal class
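A minimal imbalanced-classification sketch: generating a skewed binary dataset and using the `class_weight="balanced"` option so the minority class is not ignored (the 95/5 split and model choice are illustrative assumptions, not from the text):

```python
# Imbalanced binary classification: 95% normal (0), 5% abnormal (1).
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, weights=[0.95], flip_y=0,
                           random_state=0)
counts = Counter(y)
print(counts)  # heavily skewed toward class 0

# class_weight="balanced" reweights examples inversely to class frequency,
# one common way to counter the imbalance during training.
clf = LogisticRegression(class_weight="balanced").fit(X, y)
```

Other common remedies include resampling the training data (over-sampling the minority class or under-sampling the majority class) and evaluating with metrics such as precision and recall rather than plain accuracy.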