Choosing the Right Machine Learning Use Cases


How do you get started with machine learning? A good first step is choosing the right use cases for machine learning (ML).


By one widely cited estimate, some 87% of machine learning projects fail or never make it to production. Will choosing the right use case help you avoid the many pitfalls these projects face? It isn’t a guarantee, but it certainly pays to identify, evaluate, and validate your options.

So, where should you start looking for good machine learning use cases across your organization? It seems these unicorns exist at the magical intersection of value, data, suitability, and cost.


Most organizations quantify value in economic terms: cost reduction, risk reduction, or profit improvement. However, if a portrait of Milton Friedman doesn’t sit on your desk, you can generalize value to be anything you want to make better (increase or decrease). Use cases suitable for machine learning might also relate to environmental, social, and governance targets, scheduling, wellness, and the like.


The second leg is “Data.” Do you have enough of the right type of data to generate the required results? There are some rules of thumb around the number of examples needed to train different machine learning models. However, “enough” for your use case hinges on your required results. In other words, the amount of data you need is related to your desired accuracy or some other threshold goal tied to your model output. Typically, the higher you set the bar for model performance, the more quality data you’ll need to achieve it. Be careful here: a “good enough” solution can still yield valuable and profitable results, and additional data is rarely free.
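
The link between data volume and model performance can be illustrated with a small learning-curve experiment. The sketch below is only a toy under stated assumptions: the data is synthetic (two overlapping bell curves), the "model" is a single threshold standing in for a real algorithm, and every function name is hypothetical. On real data you would run the same kind of experiment with your own model and holdout set.

```python
import random
import statistics

def make_examples(n, rng):
    """Draw n labeled examples: class 0 ~ N(0, 1), class 1 ~ N(2, 1).
    Labels alternate so both classes are always present (assumes n >= 2)."""
    examples = []
    for i in range(n):
        label = i % 2
        examples.append((rng.gauss(2.0 * label, 1.0), label))
    return examples

def fit_threshold(train):
    """Toy stand-in for a model: the midpoint between the two class means."""
    mean0 = statistics.fmean([v for v, y in train if y == 0])
    mean1 = statistics.fmean([v for v, y in train if y == 1])
    return (mean0 + mean1) / 2.0

def accuracy(threshold, test):
    """Share of test examples classified correctly by the threshold rule."""
    correct = sum((v > threshold) == bool(y) for v, y in test)
    return correct / len(test)

def learning_curve(sizes, seed=0):
    """Train on increasingly large samples and score each on one fixed test set."""
    rng = random.Random(seed)
    test = make_examples(2000, rng)
    curve = []
    for n in sizes:
        train = make_examples(n, rng)
        curve.append((n, accuracy(fit_threshold(train), test)))
    return curve

curve = learning_curve([10, 100, 1000])
for n, acc in curve:
    print(f"{n:>5} training examples -> accuracy {acc:.3f}")
```

If the curve flattens well before you run out of data, more examples are unlikely to buy much additional accuracy, which is worth knowing before you pay for them.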

What is the “right type” of data? There are two equally important considerations.

  • First, do you have plenty of (reliable) observations of what your model is attempting to learn? For machine learning, this means you have captured enough examples (e.g., events, measures, behaviors, statuses, outcomes, or categories) and recorded which of the possible results occurred. Examples include daily temperatures in your city’s zip code, residential mortgage defaults in the U.S. from 2006 to 2009, time to failure for a new LED light bulb, inbound emails that are spam, and which customers responded to a promotion.
  • Second, the data should enable your ML algorithm to recognize the patterns that lead to the observed outcome. Put another way, the data can’t be purely random or contain too much “noise.” Data scientists and analysts apply various techniques to explore these aspects of your data. They’ll look for biases, missing data, erroneous data, etc. Even relatively clean data may still be poor at modeling the desired outcome, so it is also important to consider how well your data relates to your target, and even how elements of that data relate to each other. For example, attendance at a baseball stadium isn’t going to help you build a successful attendance forecast for a football stadium. The two aren’t completely unrelated, but the relationship is weak.
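
A few of the checks described in the two points above can be sketched in a handful of lines. This is a minimal illustration, not a full data audit: the records, field names (tenure_months, responded), and helper functions are all hypothetical, and a real exploration would cover many more fields and bias checks.

```python
from collections import Counter
import math

# Hypothetical toy records; None marks a missing value.
records = [
    {"tenure_months": 2, "responded": 0},
    {"tenure_months": 5, "responded": 0},
    {"tenure_months": None, "responded": 0},
    {"tenure_months": 8, "responded": 0},
    {"tenure_months": 12, "responded": 0},
    {"tenure_months": 20, "responded": 0},
    {"tenure_months": 24, "responded": 1},
    {"tenure_months": 30, "responded": 0},
    {"tenure_months": 36, "responded": 1},
    {"tenure_months": 48, "responded": 1},
]

def missing_share(rows, field):
    """Fraction of rows where the field is missing."""
    return sum(r[field] is None for r in rows) / len(rows)

def class_balance(rows, target):
    """Share of rows per target value; a lopsided split signals imbalance."""
    counts = Counter(r[target] for r in rows)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

def pearson(xs, ys):
    """Pearson correlation, a rough gauge of linear signal between two fields."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant field carries no linear signal
    return cov / (sx * sy)

# Only rows with a known feature value can be used for the correlation.
complete = [r for r in records if r["tenure_months"] is not None]
signal = pearson([r["tenure_months"] for r in complete],
                 [r["responded"] for r in complete])

print("missing tenure share:", missing_share(records, "tenure_months"))
print("class balance:", class_balance(records, "responded"))
print(f"tenure vs. response correlation: {signal:.2f}")
```

A correlation near zero does not prove a field is useless (the relationship may be non-linear), but it is a quick first signal of whether the data relates to the target at all.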

Suitability (A Good Fit for ML?)

How suitable is your use case for ML? Your data scientist or other data professionals will provide good insight here. Machine learning can learn and generate solutions across many domains and problems. However, both ML in general and individual algorithms have limitations. In the previous section, we alluded to some important watch-outs in your data. Insufficient, unbalanced, or biased data might arise from your observations or from how they are collected. Even the most skilled modelers can’t always mitigate these real-world challenges. Training yourself to spot common machine learning obstacles at inception will help you avoid a potential dead end.

Even if you avoid common data issues, some problems are still difficult for machines. Ambiguous outcomes and a lack of separation among categories are two examples. If humans would have difficulty agreeing on the answer, machines will too. By contrast, areas well suited to machine learning include classification; anomaly detection; image analysis; reading, recognizing, understanding, and translating written or verbal communication; learning associations; segmentation; prediction; forecasting; and many others.
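
One way to test whether humans agree on the answer is to have two people label the same sample and measure their agreement. The sketch below computes Cohen's kappa, a standard agreement statistic that corrects for chance; it is offered as one possible check rather than a method prescribed here, and the reviewer labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two labelers, corrected for chance agreement.
    1.0 means perfect agreement; 0.0 means no better than chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum((counts_a[label] / n) * (counts_b[label] / n)
                   for label in counts_a.keys() | counts_b.keys())
    if expected == 1.0:
        return 1.0  # both labelers used a single identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical labels from two reviewers on the same ten emails.
reviewer_1 = ["spam", "spam", "ham", "ham", "spam", "ham", "ham", "spam", "ham", "ham"]
reviewer_2 = ["spam", "ham", "ham", "ham", "spam", "ham", "spam", "spam", "ham", "ham"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

A low kappa on a sample of your own data suggests the outcome itself is ambiguous, and a model trained on those labels will inherit that ambiguity.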


You also want to be sure to choose a problem where the benefit outweighs the costs of your machine learning model, and where the solution will generate sufficient value for a decent-sized group or sub-group.

As previously mentioned, capitalizing on your model’s output might involve capturing more revenue (behavior-based insurance pricing), reducing costs through automation or other process improvements (automated credit scoring), or both (increasing conversion rates with personalized offers). However, there ain’t no such thing as a free lunch. Many ML applications require a lot of data ($) as well as technical and human resources ($). Accurately estimating these costs and benefits, along with the scale of your project, should be an essential part of evaluating any potential use case. It also pays to take a closer look at your KPIs, such as cost per lead, engagement, customer lifetime value, revenue forecasts, cash management cycles, fulfillment times, and outages; improving them can often become a machine learning project in and of itself.
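
The cost-benefit comparison described above can be made concrete with a back-of-the-envelope calculation. The sketch below is a simplified illustration: the class, its fields, and all dollar figures are hypothetical, and it ignores discounting, risk, and ramp-up time that a real business case would include.

```python
from dataclasses import dataclass

@dataclass
class UseCaseEstimate:
    """Rough annualized estimate for one candidate use case (all figures hypothetical)."""
    name: str
    annual_benefit: float    # added revenue or avoided cost per year
    data_cost: float         # one-time cost to acquire and prepare data
    build_cost: float        # one-time technical and human resources
    annual_run_cost: float   # hosting, monitoring, retraining per year

    def net_value(self, years):
        """Total benefit minus total cost over the given horizon."""
        total_benefit = self.annual_benefit * years
        total_cost = self.data_cost + self.build_cost + self.annual_run_cost * years
        return total_benefit - total_cost

    def payback_years(self):
        """Years until upfront costs are recovered, or None if they never are."""
        annual_net = self.annual_benefit - self.annual_run_cost
        if annual_net <= 0:
            return None
        return (self.data_cost + self.build_cost) / annual_net

# Made-up numbers for two candidate use cases.
candidates = [
    UseCaseEstimate("automated credit scoring", 400_000, 50_000, 250_000, 100_000),
    UseCaseEstimate("personalized offers", 150_000, 120_000, 200_000, 90_000),
]

for uc in sorted(candidates, key=lambda u: u.net_value(3), reverse=True):
    print(f"{uc.name}: 3-year net value {uc.net_value(3):,.0f}, "
          f"payback {uc.payback_years():.1f} years")
```

Even rough numbers like these make it easier to rank candidates and to spot a use case whose data or running costs would swallow its benefit.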

In summary, there are four essential concepts to consider when examining potential machine learning use cases: value, data, suitability, and cost. In addition, you need to have a clear understanding of how the outcome of your machine learning project will create your desired benefit – and what inputs you need to get you there. This will help you avoid the common fate of most machine learning projects.

At CoStrategix, we help organizations identify machine learning use cases in our discovery workshop engagement. We not only brainstorm high-value use cases but also validate them, using your data to experiment and build Proof of Concept (POC) models. Contact us if you would like to get started with applying machine learning to some of your strategic priorities or key opportunities.