Are you trying to solve a task that’s way too complex to code in a traditional way? Show your data to a machine learning model instead and let it figure out how to give the correct answer! This is how it works:
During training, a machine learning model learns to give correct predictions from examples whose answers are already known, called labels (the training data). At first the model outputs essentially random answers, but over time its predictions become more and more accurate. The model uses the error between its predictions and the known labels as an indicator of what needs to change to improve performance. The goal is to give correct predictions not only for the known data but also for new, unseen input after training.
That’s machine learning in a nutshell.
When to use ML
Machine learning is often used when writing an algorithm the traditional way would be too complicated. For example, it is very hard to write code from scratch that reliably classifies handwritten digits: every sample is different, and fixed rules are quickly pushed to their limits.
Another reason is that machine learning nowadays outperforms even humans in certain tasks. It can detect tiny changes in input data that are invisible even to experts in the field. 1
How it works
As mentioned before, machine learning uses models to learn the right output for any given input. A model can be seen as a mathematical formula that gives an approximation 2 of the desired answers.
Usually, a model architecture is designed for a specific task. Classical ML models shine at predicting number-to-number relationships, for example forecasting the sale price of a diamond given its size, color, clarity and other numerical factors. Complex tasks such as Natural Language Processing or Computer Vision require Deep Neural Networks for state-of-the-art results.
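As a sketch of such a number-to-number task, here is a tiny regression with scikit-learn. The diamond features and prices below are made up for illustration, not real market data:

```python
# A minimal sketch of a classical ML regression with scikit-learn.
# The diamond features and prices are invented for illustration.
from sklearn.linear_model import LinearRegression

# Each row: [carat, clarity grade (1 = worst, 8 = best)]
X = [[0.5, 3], [0.8, 6], [1.0, 5], [1.5, 4], [2.0, 7]]
y = [1500, 3500, 5000, 7000, 16000]  # fictional sale prices in USD

model = LinearRegression()
model.fit(X, y)

# Ask the trained model for the price of an unseen diamond
predicted = model.predict([[1.2, 5]])
print(f"Predicted price: {predicted[0]:.0f} USD")
```

With only five training examples this model would generalize poorly; it merely shows the fit/predict workflow on numerical features.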
Deep Learning is a special kind of machine learning that refers to a deep model architecture with stacked layers. A layer can be thought of as a simple machine learning model on its own.
In recent years deep learning has become very popular because of its ability to learn very complex tasks. It performs particularly well with unstructured data such as images, text or sound.
In its purest form, each layer of a deep neural network has a given number of neurones which transform the input using learned parameters. The result is piped through an activation function f before being passed on to the next layer. The activation function is a mathematical necessity, but understanding it is not required to follow this topic.
Here is a fancy formula:

y = f(w · x + b)
This formula shows that the most important parameters of each neurone are the weight w and the bias b. These parameters are optimized for every neurone the data runs through, and they are ultimately how the model arrives at its final result.
This is where the magic happens. ✨
Each layer contains many neurones, and in a fully connected layer each neurone is connected to every neurone of the previous and the next layer. The weight w defines how much the output x of a previous neurone contributes to the output of the current neurone.
This is somewhat similar to what biological neurones do, hence the name.
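To make this concrete, here is a minimal sketch of a single neurone in plain Python. The inputs, weights and bias are made-up values (in a real network they would be learned), and a sigmoid stands in for the activation function f:

```python
import math

def sigmoid(z):
    # A common choice for the activation function f
    return 1.0 / (1.0 + math.exp(-z))

def neurone(x, w, b):
    # Weighted sum of the previous layer's outputs, plus the bias,
    # piped through the activation function: f(w · x + b)
    z = sum(wi * xi for wi, xi in zip(w, x))
    return sigmoid(z + b)

x = [0.5, -1.2, 3.0]   # outputs of the previous layer
w = [0.8, 0.1, -0.4]   # learned weights (invented here)
b = 0.2                # learned bias (invented here)

output = neurone(x, w, b)
print(output)  # a value between 0 and 1, thanks to the sigmoid
```

A fully connected layer is just many of these neurones side by side, each with its own weights and bias, all reading the same inputs.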
The error function is an important part of the training process because it makes it possible to calculate how the network's weights and biases should be changed to improve the predictions. In particular, the gradient (slope) of the error function is used to guide the network towards a smaller error and therefore a better prediction. This method is called gradient descent.
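A toy example of gradient descent, fitting a single weight w so that y ≈ w · x. The data, starting value and learning rate are all invented for illustration:

```python
# A minimal sketch of gradient descent on a one-parameter model.
# We fit y = w * x to toy data by repeatedly stepping w against
# the gradient (slope) of the mean squared error.

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # the true relationship here is y = 2x

w = 0.0                 # initial guess
learning_rate = 0.05

for _ in range(100):
    # Gradient of the mean squared error with respect to w
    grad = sum(2 * (w * xi - yi) * xi for xi, yi in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad  # step "downhill" along the slope

print(round(w, 3))
```

After enough steps w settles near 2.0, the value that minimizes the error. Real networks do the same thing simultaneously for millions of weights and biases.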
For machine learning algorithms to perform well the input data needs to meet certain criteria. The training data should be as diverse as possible, showing many cases of what it will predict later. This implies that the data set also needs to have a sufficient size.
Remember: The machine learning model has no prior knowledge of the world. All it knows and ever will know is what it was shown during the training process. Even the best model architecture and training methods can't compensate for bad data quality.
Machine learning is a complex but powerful tool for automating tasks that were previously out of reach. Learned parameters enable a machine learning model to transform its input into the desired output. Data is the foundation of machine learning, with data quality and quantity being the most important factors.
Here are some links to popular resources:
Python 3 - the most popular programming language for machine learning
Jupyter Notebooks - the functional (but ugly) code editor for Data Scientists
scikit-learn - a powerful ML library with a ton of useful functions
TensorFlow - the most popular deep learning framework
PyTorch - the most pythonic deep learning framework
1. Google has found that it’s possible to predict age, gender, smoking, and blood pressure (in addition to various illnesses) just by looking at retinal scans. https://ai.googleblog.com/2018/02/assessing-cardiovascular-risk-factors.html
2. There is no such thing as 100% accuracy in the real world.