An Overview of Fundamental Machine Learning Models

Chapter 1: Introduction to Machine Learning Models

Greetings, readers! In this post, we aim to provide an overview of the most fundamental Machine Learning models, often referred to as 'traditional' models. We will explore each model briefly, discussing their strengths, optimal applications, and practical examples. This guide can serve as a reference for choosing the appropriate ML model in your future projects.

To kick things off, let’s begin with the simplest model: Linear Regression.

Linear Regression

Linear Regression is typically the first algorithm introduced in ML courses and textbooks. This straightforward model accepts a vector of features (the characteristics of our dataset) and produces a continuous numeric output. As indicated by its name, it is a regression model and serves as the cornerstone for a family of linear algorithms, including Generalised Linear Models (GLMs).

It can be trained using a closed-form solution or, more commonly in the ML realm, through iterative optimization algorithms like Gradient Descent. Being a parametric model, Linear Regression has a fixed number of parameters — one weight per input feature, plus a bias term — which allows for rapid training. It performs well when there is a linear relationship between the input features and the target variable, making it easy to interpret and understand.

For instance, a Linear Regression model might predict housing prices based on factors such as square footage, location, number of bedrooms, or the presence of an elevator. The accompanying figure illustrates how Linear Regression estimates the price of a house using just one feature: its square footage.

Linear Regression predicting house prices based on area

The line shown in the figure is fitted during training using an optimization algorithm like Gradient Descent, which iteratively adjusts the slope until the best possible line is achieved.
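To make the training step above concrete, here is a minimal sketch of fitting a one-feature line (price from area) by Gradient Descent. The data points, learning rate, and epoch count are all illustrative, not taken from the article's figure.

```python
# Minimal sketch: fitting price = w * area + b by gradient descent.

def fit_line(xs, ys, lr=0.0001, epochs=5000):
    """Iteratively adjust slope w and intercept b to minimise mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data: area in square metres vs. price in thousands, lying on price = 3 * area.
areas = [50, 70, 90, 110]
prices = [150, 210, 270, 330]
w, b = fit_line(areas, prices)
# The slope w converges toward 3 and the intercept b toward 0.
```

Each epoch nudges the slope and intercept a small step against the error gradient, which is exactly the iterative adjustment the figure's fitted line illustrates.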

To delve deeper into Linear Regression, check out the following resources:

  • StatQuest's YouTube Video: 'Linear Regression'
  • Towards Data Science Article: 'Linear Regression Explained'
  • Machine Learning Mastery: 'Linear Regression for Machine Learning'
  • An Introduction to Statistical Learning: with Applications in R, Chapter 3.

Now that we've covered Linear Regression, let's shift our focus to its counterpart: Logistic Regression!

Logistic Regression

Logistic Regression is the sibling of Linear Regression, designed for classification tasks rather than regression. Similar to its linear counterpart, it processes an input feature vector but outputs a class label instead of a continuous value.

A practical example of Logistic Regression could involve predicting whether a patient has a specific illness based on their medical history. This model utilizes a sigmoid function, enabling it not only to provide class labels (e.g., sick or not) but also to estimate the probability of these outcomes.

This capability allows for more nuanced decision-making based on the output probabilities, rather than relying solely on binary predictions. Like Linear Regression, this model is intuitive and explainable; by examining the model parameters, we can identify which features are most influential in the predictions.

The figure below illustrates a Logistic Regression model that estimates the likelihood of a patient being obese based on their weight.

Logistic Regression predicting obesity probability based on weight

In the figure, we see a sigmoid curve indicating the probability of a 60 kg patient being obese (approximately 10%) compared to a 120 kg patient (about 93%). If additional features were included, the horizontal axis would represent a weighted combination of those features while the vertical axis would continue to show probabilities.
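The sigmoid mapping described above can be sketched in a few lines. The coefficient and intercept below are hypothetical values chosen so the curve roughly reproduces the probabilities in the figure (~10% at 60 kg, ~93% at 120 kg); a real model would learn them from data.

```python
import math

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))

def predict_proba(weight_kg, coef=0.08, intercept=-7.0):
    # Linear combination of the input, passed through the sigmoid
    # to produce a probability rather than a raw score.
    return sigmoid(coef * weight_kg + intercept)

print(round(predict_proba(60), 2))   # ~0.10
print(round(predict_proba(120), 2))  # ~0.93
```

Thresholding this probability (commonly at 0.5) yields the class label, while the raw probability supports the more nuanced decision-making mentioned above.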

For further insights into Logistic Regression, refer to these resources:

  • StatQuest's YouTube Video: 'Logistic Regression'
  • Towards Data Science Article: 'Logistic Regression Explained'
  • KD Nuggets: 'Linear and Logistic Regression Gold Post'
  • Andriy Burkov's explanation in 'The Hundred-Page Machine Learning Book', Chapter 3.

Let’s now explore another intuitive model: Decision Trees!

Decision Trees

Decision Trees are versatile models suitable for both regression and classification tasks. They consist of nodes and branches, where each node evaluates one of the data features, splitting observations during training or guiding predictions.

In essence, Decision Trees create paths by assessing various variables, leading to leaf nodes where similar observations are grouped. During training, the tree evaluates candidate features and split values, choosing the splits that most reduce error — impurity for classification, variance for regression.

The following figure provides a conceptual overview of a Decision Tree built using the Iris dataset.

Decision Tree structure for classification

While Decision Trees are straightforward and easy to interpret, they can sometimes produce less powerful results compared to more complex models like Neural Networks. They also have a tendency to overfit, making them less effective on new data. Nevertheless, they remain one of the most accessible ML models, allowing users to easily understand the reasoning behind predictions.

In classification scenarios, the predicted class corresponds to the most frequent class among training data points in the leaf node where a new observation falls. For regression, it represents the mean of the target values for those points.
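The two leaf-prediction rules just described can be sketched directly; the leaf contents below are made-up examples, not from the Iris figure.

```python
from collections import Counter
from statistics import mean

def leaf_predict_classification(leaf_labels):
    # Most frequent class among the training points that fell into this leaf.
    return Counter(leaf_labels).most_common(1)[0][0]

def leaf_predict_regression(leaf_targets):
    # Mean of the target values of the training points in this leaf.
    return mean(leaf_targets)

print(leaf_predict_classification(["setosa", "setosa", "versicolor"]))  # setosa
print(leaf_predict_regression([200, 250, 300]))  # 250
```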

For more information about Decision Trees, check out:

  • StatQuest Video: 'Decision Trees Explained'
  • Towards Data Science Article: 'Decision Trees Explained'
  • Towardsai post: 'Decision Trees Explained with a Practical Example'
  • Sebastian Raschka's book 'Python Machine Learning', Chapter 3.

Next, let’s see how multiple Decision Trees can be combined to form a Random Forest!

Random Forest

Random Forest models are non-parametric and can be used for both regression and classification. They are among the most popular ensemble methods, classified under Bagging techniques.

Ensemble methods harness multiple learners to improve the performance of individual models. In this case, a Random Forest is an ensemble of numerous individual Decision Trees. By introducing randomness, these models address issues like overfitting that can occur with single Decision Trees.

In a Random Forest, each tree is constructed using a subset of the training data and typically considers only a subset of available features. As multiple trees are generated, a broader range of data and features are utilized, resulting in a robust aggregated model.

Each individual tree operates independently, following the same procedures as a standard Decision Tree but with a randomly selected portion of the data and features at each node. Predictions from all trees are aggregated: for classification, the most common prediction is chosen, and for regression, the mean of all predictions is calculated.
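The two sources of randomness described above can be sketched as follows; the row and feature names are illustrative, and a real implementation would feed these samples into the tree-building routine.

```python
import random

def bootstrap_sample(rows, rng):
    # Each tree trains on a sample drawn with replacement,
    # the same size as the original dataset.
    return [rng.choice(rows) for _ in rows]

def feature_subset(features, k, rng):
    # At each split, only k randomly chosen features are considered.
    return rng.sample(features, k)

rng = random.Random(0)
rows = list(range(10))
features = ["area", "bedrooms", "location", "elevator"]
sample = bootstrap_sample(rows, rng)       # some rows repeat, some are left out
candidates = feature_subset(features, 2, rng)
```

Because every tree sees a different slice of rows and features, their errors are only weakly correlated, and aggregating their predictions averages much of that error away.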

The following figure illustrates how predictions are made using a Random Forest model.

Aggregated predictions from a Random Forest model

For further reading on Random Forests, explore:

  • StatQuest's Video on Random Forest
  • Towards Data Science Post: Random Forest Explained
  • Built In: A Complete Guide to Random Forest

Random Forests, along with Boosting methods, are among the most widely used traditional ML models in industry due to their ease of use and effectiveness.

Now, let’s discuss Boosting methods!

Boosting Methods

Boosting methods differ from Bagging techniques like Random Forests in that trees are built sequentially rather than in parallel. Each tree learns from the previous one, meaning Tree 4 is influenced by Tree 3, and so forth.

Initially known as Hypothesis Boosting, this approach focuses on filtering or weighting the training data. Each new learner is trained with observations that previous learners struggled to classify accurately. This method enables the ensemble to make precise predictions across various data types, compensating for any weak individual models.
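The reweighting idea can be sketched in the AdaBoost style: points the current learner got wrong receive a larger weight, so the next learner focuses on them. The weights and learner accuracy below are illustrative.

```python
import math

def reweight(weights, correct, learner_weight):
    # Scale up the weight of misclassified points, then renormalise
    # so the weights again sum to 1.
    new = [w * (1.0 if ok else math.exp(learner_weight))
           for w, ok in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new]

weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, False, True]  # the learner misclassified the third point
weights = reweight(weights, correct, learner_weight=1.0)
# The third point now carries the largest weight, so the next
# learner in the sequence concentrates on getting it right.
```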

Boosting methods, including LightGBM, AdaBoost, and XGBoost, are among the most effective for standard tabular data in the Machine Learning/Data Science domain.

For more insights into boosting, consider these resources:

  • What is Boosting in Machine Learning? @Towards Data Science
  • Boosting by Udacity
  • Bagging vs Boosting Explained on the QuantDare Blog

Finally, let’s wrap up with the powerful Support Vector Machines!

Support Vector Machines

Support Vector Machines (SVMs) are a widely utilized family of Machine Learning models adept at addressing various ML challenges, including linear and non-linear classification, regression, and outlier detection. They are particularly effective with small to medium-sized, complex datasets.

For classification tasks, SVMs create a decision boundary that aims to optimally separate the data, as depicted in the following figure.

Decision boundary for a Support Vector Classifier

In cases where a linear boundary is insufficient, SVMs employ a technique known as the Kernel Trick: the data is implicitly mapped into a higher-dimensional space where a linear separator can be found, without ever computing that mapping explicitly. Separately, the soft-margin formulation allows some data points to be misclassified, with a regularization parameter controlling the trade-off between a wide margin and classification errors.
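The kernel idea can be illustrated with the widely used RBF (Gaussian) kernel, which scores the similarity of two points as if they lived in a much higher-dimensional space, without ever constructing that space. The points and the `gamma` tuning parameter below are illustrative.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    # Similarity decays with squared distance: 1.0 for identical points,
    # approaching 0 for distant ones.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical points -> 1.0
print(rbf_kernel([1.0, 2.0], [4.0, 6.0]))  # distant points -> near 0
```

An SVM replaces every dot product between points with such a kernel value, which is what lets it draw non-linear boundaries in the original feature space.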

Despite their effectiveness with high-dimensional data, SVMs have training times that scale poorly with the number of samples, which is why they are best suited to small and medium-sized datasets.

For additional resources on SVMs, check out:

  • Support Vector Machines Explained on Towards Data Science
  • SVMs Video by Josh Starmer
  • An Introduction to SVMs on MonkeyLearn

In Conclusion

Thank you for reading! I hope this overview of basic Machine Learning models has been informative and that you feel equipped to explore these concepts further.

While we omitted Artificial Neural Networks and some probabilistic methods from this discussion, you can find extensive information in my other articles on Towards Data Science. Additionally, we did not cover unsupervised models here.

To enhance this article, here are a few resources that can help you select the right Machine Learning algorithm for various scenarios:

  • Machine Learning Model Cheat-Sheet for Scikit-Learn
  • Machine Learning Algorithms are Your Friends
  • SaS Machine Learning Algorithms Cheat-Sheet

For even more resources on Machine Learning and AI, feel free to explore the repository provided at the end of this post!

The first video titled "All Machine Learning Models Explained in 5 Minutes" provides a succinct overview of various ML models, making it a great starting point for beginners.

The second video, "Basic Machine Learning Algorithms Overview - Data Science Crash Course Mini-series," offers a comprehensive introduction to essential ML algorithms, perfect for those looking to deepen their understanding.

Cover Image from Unsplash. Icons sourced from Flaticon.com. All other images created by the author.
