Is your use case a viable ML product from a traditional ML and production perspective?
Have you ever thought about building a data application, but don’t know the requirements for building an ML system? Or, maybe you’re a senior manager at your company with ambitions to use ML, but you’re not quite sure if your use case is ML-friendly.
Lots of businesses are struggling to keep up with the exponential growth of AI/ML technology, with many aware that the implications of not factoring AI/ML into their roadmap may be existential.
Companies see the power of Large Language Models (LLMs) and think that AI/ML is a 'silver bullet' for their problems. Most businesses are spending money on new data teams, computing power, and the latest database technology, but do they know if their problem can actually be solved using ML?
I have distilled a checklist to validate whether your ML idea is viable from a traditional ML perspective, including:
1. Do you have the appropriate features to make a prediction?
2. Are there patterns to learn from in your data?
3. Do you have enough data for ML to be effective, or can you collect data from sources?
4. Can your use case be framed as a prediction problem?
5. Does the data you wish to predict have associated patterns with the training data?
And, from the viewpoint of productionising ML solutions:
1. Is your use case repetitive?
2. Will wrong predictions have drastic consequences for end users?
3. Is your use case scalable?
4. Is your use case a problem where patterns continually evolve?
Traditional Considerations
Arthur Samuel first popularised the phrase 'Machine Learning' in 1959, defining it as "the field of study that gives computers the ability to learn without being explicitly programmed".
A more systematic definition of ML is given by Chip Huyen, an AI/ML leader and entrepreneur, in her book 'Designing Machine Learning Systems', a must-read for anyone interested in production ML:
“Machine learning is an approach to (1) learn (2) complex patterns from (3) existing data and use these patterns to make (4) predictions on (5) unseen data.”
Chip breaks down the components of ML into five chunks, and expands on them by including four modern reasons for ML adoption which we’re going to dissect further below.
Opportunity to Learn
Do you have the appropriate features to make a prediction?
Data is fundamental to ML. It provides both the inputs and the outputs: a model learns the patterns in the data that map one to the other, and uses those patterns to make predictions.
For example, you might be an avid football fan, and you want to predict Premier League player market values based on past performance.
The input data would involve player statistics like goals and assists, and the associated player value. An ML model can learn the patterns from this input data to predict unseen player data.
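To make this concrete, here is a minimal sketch of how those inputs and outputs might be arranged before training. All the player numbers are invented for illustration:

```python
# Hypothetical training rows: per-season player stats (the inputs)
# and the market value in £m we want to predict (the output).
training_data = [
    {"goals": 23, "assists": 12, "minutes": 3100, "value_m": 100.0},
    {"goals": 5,  "assists": 3,  "minutes": 1200, "value_m": 12.0},
    {"goals": 14, "assists": 9,  "minutes": 2800, "value_m": 55.0},
]

# Split into a feature matrix X and a target vector y -- the shape
# most ML libraries (e.g. scikit-learn) expect as training input.
X = [[row["goals"], row["assists"], row["minutes"]] for row in training_data]
y = [row["value_m"] for row in training_data]
```

A model fitted on `X` and `y` can then be asked to predict a value for a player it has never seen, given the same three statistics.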
Complex Patterns
Are there patterns to learn from in your data?
ML is at its best when data is complicated, and a human cannot easily identify the patterns needed to predict an output.
In the football player market value example, it is difficult to say precisely what a footballer is worth, given the many variables that value depends on. ML models can take valuations (the output) and performance statistics (the inputs), and figure out the relationship automatically.
Data Availability
Do you have enough data for ML to be effective, or can you collect data from sources?
There is an ongoing debate as to whether more data or better algorithms lead to greater predictive power. This debate has quietened lately, though, given the enormous performance leaps made by LLMs as models grow into the hundreds of billions of parameters and training datasets into the trillions of tokens.
Data needs to be readily available for your ML application to learn from. If data is scarce, then ML is likely not the best approach.
In football, data is constantly being generated on player performance by data vendors such as Opta, Fbref, and Transfermarkt as teams look to apply data-driven decisions to all club aspects from player performance to recruitment.
However, obtaining data from third parties like Opta is expensive due to the intense data collection process and the high demand for detailed stats to give teams an advantage.
Problem Solved by Prediction
Can your use case be framed as a prediction problem?
We can frame the football player market value example as a prediction problem in several ways.
Two common strands of ML prediction are regression and classification. Regression returns a continuous prediction (i.e. a number) on the same scale as the target variable (here, player value). Classification, by contrast, returns a binary (1 or 0), multi-class (1, 2, 3…n), or multi-label (e.g. [1, 0, 1, 0, 1]) prediction.
The player value prediction problem can be framed as either a regression or a multi-class classification problem. Regression simply returns a number, such as predicting £100 million for Jude Bellingham's value based on his season performance.
Conversely, if we address this as a classification problem, we can bin valuations into buckets and predict which valuation bucket a player resides in. For instance, prediction buckets could be £1m-£10m, £10m-£30m, and £30m+.
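The classification framing amounts to a simple binning step applied to the target before training. A sketch, using the illustrative bucket edges from the text:

```python
def value_bucket(value_m: float) -> str:
    """Map a market value in £m to one of the example buckets.
    The bucket edges here are illustrative, not a real pricing scheme."""
    if value_m < 10:
        return "£1m-£10m"
    elif value_m < 30:
        return "£10m-£30m"
    return "£30m+"

# A regression target like 100.0 becomes a class label for a classifier:
label = value_bucket(100.0)  # "£30m+"
```

Training a classifier on these labels trades precision (you no longer get an exact figure) for a simpler, often more robust prediction task.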
Similar Unseen Data
Does the data you wish to predict have associated patterns with the training data?
The unseen data that you want to predict must share similar patterns with the data used to train the ML model.
For example, suppose I train an ML model on player data from 2004 to predict player valuations. If the unseen data is from 2020, the predictions will not reflect how market valuations changed over the 16 years between training and predicting.
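A crude way to catch this mismatch is to compare the distribution of the unseen data against the training data before trusting the model. This sketch (with invented valuation figures) just compares means; real systems use proper statistical tests such as Kolmogorov–Smirnov, but the idea is the same:

```python
def mean(xs):
    return sum(xs) / len(xs)

def shift_ratio(train_values, new_values):
    """Crude drift signal: how far the unseen data's mean has moved,
    relative to the training mean. A large ratio suggests the model
    was trained on a different world than the one it is predicting."""
    return abs(mean(new_values) - mean(train_values)) / mean(train_values)

# Invented market values (£m) for a 2004 squad vs a 2020 squad:
values_2004 = [5, 8, 12, 20, 30]
values_2020 = [20, 35, 50, 80, 120]

ratio = shift_ratio(values_2004, values_2020)  # ~3.07: a huge shift
```

A ratio this large is a strong hint that the 2004-trained model needs retraining before it can price 2020 players.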
Production Considerations
ML model development is only a small component of a much larger system needed to bring ML to life.
If you build a model in isolation without an understanding of how it will perform at scale, when it comes to production you may find that your model is not viable.
It's important that your ML use case meets production-level criteria.
Repetitive Task
Is your use case repetitive?
ML needs repeated patterns to learn from. Models must see a large number of samples to learn patterns adequately, so if your prediction target occurs frequently, you will likely have enough data for ML to learn from.
For example, if your use case involves trying to predict something that occurs rarely, like an uncommon medical condition, then there’s likely not enough signal in your data for an ML model to pick up on, leading to a poor prediction.
This problem is referred to as class imbalance, and strategies such as over-sampling and under-sampling have been developed to overcome it.
Travis Tang's article does a good job of explaining class imbalance and its remedies in more detail.
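The simplest over-sampling remedy is to duplicate minority-class rows until the classes are balanced. A minimal sketch (libraries like imbalanced-learn offer more principled versions, such as SMOTE):

```python
import random

def oversample(rows, labels, minority_label, seed=0):
    """Naive random over-sampling: duplicate randomly chosen
    minority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    minority = [r for r, l in zip(rows, labels) if l == minority_label]
    majority = [r for r, l in zip(rows, labels) if l != minority_label]
    # Draw extra minority rows (with replacement) to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return rows + extra, labels + [minority_label] * len(extra)

# 4 healthy samples vs 1 rare-condition sample -> balanced 4 vs 4:
rows, labels = oversample([1, 2, 3, 4, 5], [0, 0, 0, 0, 1], minority_label=1)
```

Duplicating rows does not add new information, which is why it only helps up to a point; if the rare event barely appears at all, no resampling trick can conjure the missing signal.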
Small Consequence for Wrong Prediction
Will wrong predictions have drastic consequences for end users?
ML models will rarely predict with 100% accuracy, so the question becomes: when your model makes a wrong prediction, how bad is the impact?
This is a common problem experienced in the medical sector where false-positive and false-negative rates are a concern.
A false-positive prediction indicates the presence of a condition when it does not exist. This can lead to inefficient allocation of resources and undue stress on patients.
Perhaps even worse, a false-negative does not indicate the presence of a condition when it does exist. This can lead to patient misdiagnosis and delay of treatment which may lead to medical complications, and increased long-run costs to treat more severe conditions.
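Because the two error types carry such different costs, they should be counted separately rather than folded into a single accuracy number. A minimal sketch for a binary task (1 = condition present):

```python
def error_breakdown(y_true, y_pred):
    """Count false positives and false negatives separately for a
    binary prediction task, since each error type has its own cost."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {"false_positives": fp, "false_negatives": fn}

# Example: one healthy patient flagged (FP), one condition missed (FN).
report = error_breakdown([1, 0, 1, 0], [0, 1, 1, 0])
```

In practice this is the top-right and bottom-left of a confusion matrix; scikit-learn's `confusion_matrix` computes the same quantities for larger evaluations.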
Scale
Is your use case scalable?
Production costs can be steep. I found this myself when I hosted an XGBRegressor model on Google's Vertex AI: it cost me £11 for 2 days! Admittedly, I should not have left it running, but imagine the costs for large-scale applications.
A well-known example of a scalable ML solution is Amazon’s product recommendation system which generates 35% of the company’s revenue.
Although it’s an extreme example, this system leverages and justifies the cost of computing power, data, infrastructure, and talented workers, illustrating the fundamentals of building a scalable ML solution that generates value.
Evolving Patterns
Is your use case a problem where patterns continually evolve?
ML is flexible enough to fit new patterns easily, removing the need to endlessly hard-code new rules every time the data changes.
Football player values are constantly changing as tactics evolve and what teams want from players shifts, so the weight each feature carries in predicting value will change too.
To monitor changes, tools like MLflow and Weights & Biases help track and log the performance of your models, so you can update them to match the evolving data patterns.
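Tracking tools aside, the core monitoring decision can be sketched in a few lines: keep a rolling window of recent prediction errors and flag retraining when the average drifts past a threshold. The window size and threshold below are arbitrary illustrations:

```python
from collections import deque

class DriftMonitor:
    """Toy performance monitor. Tools like MLflow or Weights & Biases
    log these metrics for you; the retraining decision itself can be
    as simple as a rolling-average threshold check."""

    def __init__(self, window=100, threshold=0.2):
        self.errors = deque(maxlen=window)  # only the most recent errors
        self.threshold = threshold

    def record(self, error: float) -> bool:
        """Record one prediction error; return True if retraining is due."""
        self.errors.append(error)
        avg = sum(self.errors) / len(self.errors)
        return avg > self.threshold
```

When player valuations shift with new tactics, recent errors climb, the rolling average crosses the threshold, and the monitor signals that the model no longer matches the data.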
Conclusion
Deciding to use ML for your use case should involve much more than taking some historical data you've got, slapping a fancy algorithm on it, and hoping for the best.
It requires thinking about whether your data contains complex patterns and whether data will be available now and in the future, as well as production concerns: Is the cost of a wrong prediction small? Is my use case scalable? Are the patterns constantly evolving?
There are reasons you should NOT use ML, including ethics, cost-effectiveness, and whether a simpler solution will suffice, but we can leave that for another time.
That’s all for now!
Thanks for reading! Let me know if I’ve missed anything, and I would love to hear from people about their ML use cases!
Connect with me on LinkedIn
References
Huyen, C. (2022). Designing Machine Learning Systems. Sebastopol, CA: O'Reilly.
Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd ed.). Sebastopol, CA: O'Reilly.
Essential Considerations for Implementing Machine Learning was originally published in Towards Data Science on Medium.