Machine learning (ML) is a subfield of artificial intelligence (AI) that focuses on building computer systems that learn from data, adapt, make predictions and find correlations, all without being explicitly programmed for each task.
The goal of machine learning is to process large amounts of data with algorithms and to build generalized models whose outputs are useful and easy to interpret.
Machine learning commonly works by following the steps below:
- Gathering data from various sources
- Cleaning data to have homogeneity
- Building a model using an ML algorithm
- Gaining insights from the model's results
- Data visualization and transforming results into visual graphs
1. Gathering data from various sources
Machine learning requires a lot of data to make a production-ready model.
Data gathering for ML is done in two ways: automated and manual.
- Automated data gathering utilizes programs and scripts that scrape data from the web.
- Manual data gathering relies on people collecting the data by hand and preparing it in a homogeneous format.
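As a rough illustration of automated gathering, the sketch below extracts table rows from an HTML page using only Python's standard library. The names TableCellCollector, scrape_rows and SAMPLE_PAGE are hypothetical, and the page is an in-memory string standing in for a document that a real script would download from the web.

```python
from html.parser import HTMLParser


class TableCellCollector(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []            # finished rows, each a list of cell strings
        self._current_row = None  # row being built, or None outside <tr>
        self._in_cell = False     # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._current_row = []
        elif tag == "td" and self._current_row is not None:
            self._in_cell = True
            self._current_row.append("")

    def handle_data(self, data):
        # Only text inside a cell is kept; whitespace between tags is ignored.
        if self._in_cell:
            self._current_row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._current_row is not None:
            if self._current_row:
                self.rows.append(self._current_row)
            self._current_row = None


def scrape_rows(html):
    """Return the table rows found in an HTML string."""
    parser = TableCellCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


# Assumed sample page; a real scraper would fetch this text over HTTP.
SAMPLE_PAGE = """
<table>
  <tr><td>5.1</td><td>3.5</td><td>setosa</td></tr>
  <tr><td>6.2</td><td>2.9</td><td>versicolor</td></tr>
</table>
"""

rows = scrape_rows(SAMPLE_PAGE)
print(rows)
```

The scraped rows are still plain strings at this point, which is why a cleaning step normally follows.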
2. Cleaning data to have homogeneity
Ensuring data homogeneity is a crucial step to make machine learning work and generate results.
Data cleaning for ML is done either manually or automatically with the help of algorithms. It consists of fixing or removing incorrect, corrupted, wrongly formatted, duplicate and incomplete data within the dataset.
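The cleaning rules listed above can be sketched as a single pass over the records. This is a minimal example under stated assumptions: the field names label and height_cm, the plausible height range, and the function name clean_records are all invented for illustration.

```python
def clean_records(records):
    """Return normalized records, dropping bad, incomplete and duplicate ones."""
    cleaned = []
    seen = set()
    for record in records:
        label = (record.get("label") or "").strip().lower()
        raw_height = record.get("height_cm")

        # Incomplete data: drop records that lack either field.
        if not label or raw_height is None:
            continue

        # Wrongly formatted data: accept "172", " 172.0 " or 172, reject the rest.
        try:
            height = float(str(raw_height).strip())
        except ValueError:
            continue

        # Incorrect data: drop values outside an assumed plausible range.
        if not 30.0 <= height <= 300.0:
            continue

        # Duplicate data: keep only the first occurrence after normalization.
        key = (label, height)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"label": label, "height_cm": height})
    return cleaned


raw = [
    {"label": "Adult", "height_cm": "172"},
    {"label": "adult ", "height_cm": 172.0},     # duplicate once normalized
    {"label": "child", "height_cm": "n/a"},      # wrongly formatted
    {"label": "", "height_cm": "120"},           # incomplete (no label)
    {"label": "child", "height_cm": None},       # incomplete (no height)
    {"label": "adult", "height_cm": "-5"},       # incorrect (out of range)
    {"label": "child", "height_cm": " 118.5 "},
]

print(clean_records(raw))
```

Every surviving record has the same keys and value types, which is the homogeneity this step aims for.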
3. Building a model using an ML algorithm
An ML model is a file that contains the results of running a machine learning algorithm on data and is used to reason over new, dynamic input.
The model stores the patterns found in that data. Real-time input is matched against these patterns, and the model produces the output that corresponds to the matched pattern.
ML models come in various types, the most common being binary classification, multiclass classification, and regression.
- The binary classification model predicts a binary outcome, meaning one of two possible outcomes.
- The multiclass classification model predicts one outcome out of three or more possible outcomes.
- The regression model predicts numeric values.
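The difference between the three model types is mostly a difference in what the output looks like. The sketch below shows this with hand-picked weights instead of trained ones. The function names, weights and class labels are assumptions made for the example, and the logistic function used for the binary case is one common choice rather than the only one.

```python
import math


def predict_regression(features, weights, bias):
    """Regression: the output is a number (here a weighted sum)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def predict_binary(features, weights, bias, threshold=0.5):
    """Binary classification: the output is one of two outcomes."""
    score = predict_regression(features, weights, bias)
    probability = 1.0 / (1.0 + math.exp(-score))  # squash score into (0, 1)
    return probability >= threshold


def predict_multiclass(features, per_class_params):
    """Multiclass classification: the output is the best-scoring label."""
    scores = {
        label: predict_regression(features, weights, bias)
        for label, (weights, bias) in per_class_params.items()
    }
    return max(scores, key=scores.get)


features = [2.0, 1.0]

# Hand-picked parameters for illustration; training would normally set these.
value = predict_regression(features, [0.5, -1.0], 0.25)
is_positive = predict_binary(features, [0.5, -1.0], 0.25)
label = predict_multiclass(
    features,
    {
        "cat": ([1.0, 0.0], 0.0),
        "dog": ([0.0, 1.0], 0.0),
        "bird": ([-1.0, -1.0], 0.0),
    },
)

print(value, is_positive, label)
```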
The process of building a machine learning model is called training.
Machine learning training is done with the help of algorithms and is divided into two main categories: supervised learning and unsupervised learning.
- Supervised learning (SL) is when the ML model is trained using labeled data, that is, data that has both input and output values.
- Unsupervised learning (UL) is when the ML model is trained using unlabeled data, that is, data that has no tags or known results.
Neural networks (NNs) are used in both supervised and unsupervised learning. They consist of layers of interconnected nodes that learn mappings between the data points in the dataset, which allows them to find correlations.
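The contrast between training on labeled and unlabeled data can be sketched with two deliberately simple algorithms in place of a neural network: a nearest-centroid classifier for the supervised case and one-dimensional k-means for the unsupervised case. Both are stand-ins chosen for brevity, and the data and function names are made up for this example.

```python
def train_supervised(samples, labels):
    """Nearest-centroid training: average the inputs that share a label."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda name: abs(centroids[name] - x))


def train_unsupervised(samples, k=2, iterations=10):
    """1-D k-means: finds k group centers without seeing any labels."""
    ordered = sorted(samples)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda i: abs(centers[i] - x))
            groups[nearest].append(x)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            sum(group) / len(group) if group else centers[i]
            for i, group in enumerate(groups)
        ]
    return centers


samples = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["small", "small", "small", "large", "large", "large"]

centroids = train_supervised(samples, labels)  # uses inputs and outputs
centers = train_unsupervised(samples)          # uses inputs only

print(centroids, predict(centroids, 4.0))
print(centers)
```

The supervised model can name its predictions because it saw the labels, while the unsupervised one only discovers that two groups exist and where their centers lie.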
4. Gaining insights from the model's results
Gaining insights from an ML model means understanding the previously unknown patterns it has found and testing its ability to make predictions and draw conclusions.
Gaining insights is important for verifying the model's validity and for determining whether any changes need to be made to the learning algorithm(s).
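One common way to test a model's ability to make predictions is to hold back part of the labeled data and measure accuracy on it. The sketch below assumes a toy threshold model and made-up measurements. The names split_dataset, fit_threshold and accuracy are illustrative only.

```python
def split_dataset(examples, test_fraction=0.25):
    """Hold out the last part of the examples for testing (no shuffling)."""
    cut = int(len(examples) * (1 - test_fraction))
    return examples[:cut], examples[cut:]


def fit_threshold(train_examples):
    """Toy model: the midpoint between the mean of each class."""
    small = [x for x, label in train_examples if label == "small"]
    large = [x for x, label in train_examples if label == "large"]
    return (sum(small) / len(small) + sum(large) / len(large)) / 2


def accuracy(threshold, test_examples):
    """Fraction of held-out examples that the threshold classifies correctly."""
    correct = sum(
        1
        for x, expected in test_examples
        if ("large" if x >= threshold else "small") == expected
    )
    return correct / len(test_examples)


# Made-up measurements, interleaved so both halves contain both classes.
examples = [
    (1.0, "small"), (5.0, "large"), (1.4, "small"), (4.6, "large"),
    (0.8, "small"), (5.4, "large"), (3.2, "small"), (4.9, "large"),
]

train, test = split_dataset(examples, test_fraction=0.5)
threshold = fit_threshold(train)
score = accuracy(threshold, test)

print(threshold, score)  # score is 0.75: one of four test examples is misclassified
```

The misclassified example (3.2, labeled "small") is the kind of finding that prompts a change to the data or the learning algorithm.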
5. Data visualization and transforming results into visual graphs
Data visualization for an ML model consists of plotting the output data on a graph and providing an interactive API for exploring it.
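As a dependency-free sketch of putting output data on a graph, the example below renders prediction counts as a plain-text bar chart. A real project would more likely use a plotting library. The function name render_bar_chart and the counts are assumptions for illustration.

```python
def render_bar_chart(counts, width=20):
    """Return one line per category, with bars scaled to the largest count."""
    largest = max(counts.values())
    lines = []
    for name, value in counts.items():
        bar = "#" * round(width * value / largest) if largest else ""
        lines.append(f"{name:<10} {bar} {value}")
    return "\n".join(lines)


# Assumed model output: how many inputs were predicted as each class.
prediction_counts = {"small": 5, "large": 10}

chart = render_bar_chart(prediction_counts)
print(chart)
```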