Many professionals view machine learning as advanced technology that you deploy when other approaches fail. In their minds, it’s a sophisticated tool reserved for problems too complex for conventional methods.
So, when I talk to businesses, I often hear comments like, “We don’t need to use machine learning for that” or “We solved that problem years ago before machine learning was a thing.”
Sometimes those comments are valid, and machine learning really is the wrong technology. More often, however, they stem from a false belief that the traditional approach is the more straightforward alternative, and that misconception leads businesses to stick with and maintain outdated solutions.
The reality is that machine learning is a fantastic tool for creating robust and easy-to-maintain solutions to simple problems. You can often replace thousands of lines of code with a single line, performing the task more accurately and reliably. I’ve worked with several companies that were astonished at how much easier their solution became when we replaced their manual rules with a machine learning algorithm.
Let’s explore this topic together and learn a few valuable lessons you can apply to your work using real examples.
Example of a Simple Task
My company develops deep learning algorithms that detect damage and essential components in visual data. Currently, we focus primarily on the railway industry, so many of my examples center on computer vision with railway data.
However, these examples directly apply to other datasets, industries, and use cases. For a machine learning engineer, there’s little difference when you switch to another use case or industry as long as the data type remains similar and you have access to someone with domain expertise.
Detecting Rail Heads in Track Imagery
I want to explain this lesson by showing you an example where I use machine learning instead of traditional computer vision to create a reliable solution in one hour of work. Let’s start by looking at these three 2D images of a railway sleeper taken from a measurement train on different dates.
The sleeper is the same in each image, and we wanted to develop a fingerprint algorithm that creates an ID for each sleeper to track it over time. However, to make that task more manageable, I first wanted to align the images to ensure the rail is in the same place each time.
The Conventional Approach
Detecting and aligning the rail is a simple task that most people would try to solve using traditional methods. These methods involve rules utilizing the variation in pixel values on the y-axis or edge detection. The rules are easy to define, and you can quickly make something that works most of the time.
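To make this concrete, a rule of this kind might look like the sketch below. It is an illustration, not the rules from any real project: it assumes a grayscale image stored as a NumPy array, a roughly vertical rail, and that the rail shows less variation along the y-axis than the ballast around it. The function name and the `rail_width` parameter are hypothetical.

```python
import numpy as np


def find_rail_center_rule_based(image: np.ndarray, rail_width: int = 10) -> int:
    """Estimate the rail's center column with a hand-written rule.

    Assumption (hypothetical): rail columns vary less along the y-axis
    than the surrounding columns, and the rail is about `rail_width` wide.
    """
    # Variation of pixel values along the y-axis, one value per column.
    column_variation = image.astype(np.float64).std(axis=0)

    # Average the variation over every window that is one rail wide.
    kernel = np.ones(rail_width) / rail_width
    window_variation = np.convolve(column_variation, kernel, mode="valid")

    # Rule: the rail is the window with the least vertical variation.
    window_start = int(np.argmin(window_variation))
    return window_start + rail_width // 2
```

A rule like this holds until shadows, dirt, or another smooth object in the frame breaks the assumption, and each such case tends to need its own extra rule.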
The problems start when you must update your rules to deal with failed examples. As you add new clever ways to detect the rail, the complexity of your code grows, and you start spending a ton of time dealing with edge cases.
These edge cases pile up quickly if you have a lot of data with varying quality, and if that happens, creating rules manually can become a rabbit hole. The more rules you make, the more difficult they are to maintain and change.
A Simple Solution Using Machine Learning
To solve the problem with machine learning, I created a tiny CNN with 80,000 parameters using PyTorch. The algorithm takes an image as input and returns a probability for each column, telling me if it’s the center of the rail or not.
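As a rough illustration of what such a network can look like, here is a minimal PyTorch sketch. The layer layout, the channel count, the class name `RailCenterNet`, the 64x128 (height x width) input orientation, and the use of a per-column sigmoid are all assumptions made for this example. It is not sized to match the 80,000 parameters mentioned above.

```python
import torch
from torch import nn


class RailCenterNet(nn.Module):
    """Tiny CNN that outputs one logit per image column (hypothetical layout)."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # A 1x1 convolution turns the feature channels into one score map.
        self.head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, 1, height, width).
        scores = self.head(self.features(x))  # (batch, 1, height, width)
        # Average over the vertical axis so each column gets a single logit.
        return scores.mean(dim=2).squeeze(1)  # (batch, width)


model = RailCenterNet()
n_params = sum(p.numel() for p in model.parameters())

# Two random grayscale crops, assumed to be 64 pixels high and 128 wide.
logits = model(torch.rand(2, 1, 64, 128))
probs = torch.sigmoid(logits)  # one independent probability per column
```

Padding keeps the width unchanged through every layer, so the output has exactly one value per input column.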
I took the original image and extracted two areas large enough to know that they include the rail, downscaled each area to 128x64 pixels, and marked the center of the rail.
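One way to turn a marked rail center into something a network can learn from is sketched below. Both helper names are hypothetical, and the Gaussian-shaped target with `sigma` is an assumption for this example; a single "1" at the center column would be another reasonable choice.

```python
import numpy as np


def scale_center(center_px: float, crop_width: int, target_width: int = 128) -> int:
    """Map a center marked in the original crop to the downscaled width."""
    scaled = round(center_px * target_width / crop_width)
    # Keep the result inside the valid column range.
    return min(max(scaled, 0), target_width - 1)


def make_column_target(center_col: int, width: int = 128, sigma: float = 2.0) -> np.ndarray:
    """Build a per-column target that peaks at the annotated center."""
    cols = np.arange(width)
    # Soft target: 1.0 at the center, falling off over a few columns.
    return np.exp(-0.5 * ((cols - center_col) / sigma) ** 2)


# Usage: a center marked at pixel 300 in a crop that is 600 pixels wide.
center_col = scale_center(300, crop_width=600)
target = make_column_target(center_col)
```

A soft target rewards predictions that land a pixel or two off, which suits a task where a small miss is acceptable.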
To create a solution that worked almost perfectly, I only needed to annotate 20 images, which took me 5 minutes. I also made a validation set with 50 images to ensure that my algorithm works on previously unseen data. The belief that you need a lot of data before you can train a machine learning algorithm is one of the most common misconceptions among businesses.
Since my algorithm is tiny, I don’t need a GPU and can train the algorithm directly on my MacBook. The training took approximately 10 minutes, and the fans of my computer barely made a sound.
The entire solution took me one hour to create. It works almost perfectly, missing the center by only 2–3 pixels, which is good enough for my intended use case. Here are some examples from my validation data where the red line is the truth, and the blue line is the output from my algorithm.
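A CPU-only training loop for a model this small can be very short. The sketch below is a minimal example under stated assumptions: the tensors are random placeholders standing in for 20 annotated crops, the stand-in network is even smaller than the one described above, and the optimizer (Adam), learning rate, loss (binary cross-entropy per column), and epoch count are choices made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Random placeholders for 20 annotated crops (not real data).
images = torch.rand(20, 1, 64, 128)
centers = torch.randint(10, 118, (20,))
targets = torch.zeros(20, 128)
targets[torch.arange(20), centers] = 1.0  # mark the center column per image

# Deliberately small stand-in network that keeps the image width intact.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=3, padding=1),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

losses = []
for epoch in range(5):
    optimizer.zero_grad()
    # Collapse the vertical axis to get one logit per column: (20, 128).
    logits = model(images).mean(dim=2).squeeze(1)
    loss = loss_fn(logits, targets)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

With only 20 small images, the whole dataset fits in one batch, so no data loader is needed.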
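The pixel error on a validation set can be measured with a few lines. This sketch assumes the model output is available as a NumPy array of per-column probabilities and that the predicted center is simply the most probable column; the function name is hypothetical.

```python
import numpy as np


def center_error_px(probs: np.ndarray, true_centers: np.ndarray) -> np.ndarray:
    """Absolute distance in pixels between predicted and annotated centers.

    probs: (n_images, width) per-column probabilities.
    true_centers: (n_images,) annotated center columns.
    """
    # Take the most probable column as the predicted center.
    predicted = probs.argmax(axis=1)
    return np.abs(predicted - true_centers)


# Usage with three hand-made predictions.
probs = np.zeros((3, 128))
probs[0, 50] = 0.9
probs[1, 60] = 0.8
probs[2, 70] = 0.7
errors = center_error_px(probs, np.array([50, 62, 67]))
```

Tracking the mean and the worst case of this error on the same validation images makes it easy to compare two versions of the algorithm.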
Improving the Solution
I could have solved the problem using conventional methods in a similar amount of time, but the best part of my machine learning solution is how easy it is to improve.
The only thing I need to do to get a better solution is to create more training data. That’s fantastic since it only took 5 minutes of annotation to get a working solution. If I add examples where the algorithm struggles, I will get a significant improvement with just 10 more data points.
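One simple way to find examples where the algorithm struggles is to look for images where its best guess has a low probability. The sketch below assumes that a low peak probability signals difficulty, which is a heuristic rather than a guarantee; the function name and the default of 10 images are illustrative.

```python
import numpy as np


def pick_hard_examples(probs: np.ndarray, k: int = 10) -> np.ndarray:
    """Return the indices of the k images with the least confident peak.

    probs: (n_images, width) per-column probabilities from the model.
    """
    # Confidence is the probability of the most likely column.
    confidence = probs.max(axis=1)
    # Sort ascending so the least confident images come first.
    return np.argsort(confidence)[:k]


# Usage: four images whose peak probabilities are 0.9, 0.2, 0.6, and 0.1.
probs = np.zeros((4, 128))
probs[0, 40] = 0.9
probs[1, 41] = 0.2
probs[2, 42] = 0.6
probs[3, 43] = 0.1
to_annotate = pick_hard_examples(probs, k=2)
```

The selected images are the ones worth annotating next, since confident failures still need a manual look at the results.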
Lessons to Learn
The primary lesson is to think about machine learning as a versatile technology that you can use to solve both simple and complex problems. Training and deploying an algorithm is just as easy as any other approach, if not easier. Here are some key takeaways that everyone should understand.
Machine learning solutions to simple problems are:
- Fast to develop: A skilled machine learning engineer should be able to create the first solution to a simple problem in less than one day. You label a few data points, make a small neural network (or other algorithm), and focus on expressing the task in a way that helps the algorithm learn the right things.
- More robust to edge cases: When you train a machine learning algorithm, you constantly make minor adjustments to the data, such as changing brightness. That’s called data augmentation, and if you use it well, the algorithm can learn to deal with almost any edge case.
- Easier to maintain: Many traditional approaches quickly become complicated. Maintaining a simple machine learning solution only involves labeling new examples that the current algorithm finds difficult. If you keep your training and testing data consistent and the performance has improved, you can deploy the new algorithm without worrying about bugs or deteriorating performance.
- Less code complexity: Simple machine learning solutions can replace thousands of lines of code with a single one. We move the complexity inside the algorithm and trust our testing data. Getting rid of complicated rules is often a surprising benefit for people new to machine learning.
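The brightness adjustment mentioned in the list above can be sketched in a few lines. This example assumes grayscale images stored as NumPy arrays with values between 0 and 1; the function name and the `max_shift` range are hypothetical.

```python
import numpy as np


def augment_brightness(
    image: np.ndarray, rng: np.random.Generator, max_shift: float = 0.2
) -> np.ndarray:
    """Return a copy of the image that is randomly brighter or darker."""
    # Draw one brightness offset for the whole image.
    shift = rng.uniform(-max_shift, max_shift)
    # Clip so the pixel values stay in the assumed [0, 1] range.
    return np.clip(image + shift, 0.0, 1.0)


# Usage: each call produces a slightly different version of the same image.
rng = np.random.default_rng(0)
image = np.full((64, 128), 0.5)
augmented = augment_brightness(image, rng)
```

Because brightness changes do not move the rail, the annotated center stays valid for every augmented copy.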
In Closing
I don’t think about machine learning as a hammer and every problem as a nail, but it’s clear to me that very few people understand how often this technology can and should be used. To me, it’s a tool for software development that I use all the time.
Instead of asking yourself if you can solve a problem “without machine learning,” you should include it as an alternative on the same grounds as anything else.
For many simple problems, machine learning doesn’t require any advanced hardware, and you can often develop a better and more reliable solution in less time than with conventional methods.
Originally published in ML Lessons for Managers and Engineers.