Machine Learning Lifecycle
We have reached the fourth and final article covering the machine learning project lifecycle. In this series, I cover all four steps presented by Andrew Ng in DeepLearning.AI’s course Machine Learning Engineering for Production (MLOps). Please see the sidebar for Parts 1-3, which explore deployment, modeling, and data.
In today’s story, we’ll cover the scoping stage. This stage involves defining the project’s goals and objectives and establishing a baseline for measuring success.
Picking the Right Project
Thoughtful planning before starting is crucial.
The initial process of scoping a machine learning project involves carefully examining the current problems the business is facing and studying potential solutions.
Selecting the right project is essential to avoid frustration and unmet expectations. You need a project that is both feasible and impactful. That requires a deep understanding of the business needs and of the data available to meet them. By thinking through the problem and potential solutions before starting work, you can avoid wasting time and resources on projects that are unlikely to succeed.
Consider the following example: An e-commerce retailer.
What are the potential projects for this case study? There are several candidates to choose from:
- Implement or improve a product recommendation system.
- Improve search functionality.
- Optimize inventory management.
- Optimize pricing.
By understanding the business needs, you will be able to identify projects with a substantial potential for impact.
Once you have identified potential solutions to the current problems, you must define how to measure the project’s success and determine all the required resources, such as data, time, costs, and personnel.
Also, be careful not to fall into the novelty trap. Not all problems are solvable by AI!
Define the What and the How
The first step in successful scoping is starting with the what. Finding your what requires a good understanding of the business’s needs and core goals.
What is it that we must do?
- Increase conversions
- Reduce inventory
- Improve profit margins
Once you have your what, it is time to brainstorm the solution. Generate and explore multiple AI-based solutions for the business problem.
How will we do it?
- Increase conversions → Product recommendation system.
- Reduce inventory → Build a model to predict increases and decreases in demand.
- Improve profit margins → Optimize product selection.
You now have your what and your how: the business problem and a potential solution. The next step is to assess technical feasibility and analyze the potential return on investment (ROI) for this specific project.
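To make the ROI comparison concrete, here is a minimal sketch that ranks candidate projects using the common definition ROI = (gain - cost) / cost. The project names echo the examples above, but every figure is a hypothetical placeholder, not an estimate from the course or from a real business.

```python
def roi(expected_gain: float, total_cost: float) -> float:
    """Return ROI as a fraction: (gain - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (expected_gain - total_cost) / total_cost


# Hypothetical candidates: (expected annual gain, estimated total cost).
candidates = {
    "product recommendations": (300_000, 120_000),
    "demand forecasting": (150_000, 100_000),
    "product selection": (80_000, 90_000),
}

# Rank projects from highest to lowest ROI.
ranked = sorted(candidates, key=lambda name: roi(*candidates[name]), reverse=True)

for name in ranked:
    print(f"{name}: ROI = {roi(*candidates[name]):.0%}")
```

A negative ROI, as for the last candidate here, is a signal to rescope or drop the project before any modeling work begins.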
Brainstorming: A Collaborative Step Between Engineers and Business Owners
Business stakeholders and ML engineers must collaborate during brainstorming to identify and develop new AI projects. Input from both groups is crucial. Stakeholders bring the business needs, whereas engineers assess technical feasibility.
You might consult external and internal benchmarks to determine if a project is feasible. Look at published research and analyze what competitors have done. If possible, look for existing systems in the company and work to improve their performance.
If there is no other existing system within the business, you can build a proof of concept, a smaller-scale project to gain insights and create realistic estimates.
The next step is to determine milestones. We need to identify the metrics we will use to measure success, including machine learning and business metrics alike.
For machine learning metrics, you have human-level performance (HLP), which tells you how well humans can do the task with the same data the AI would have. This is an ideal baseline for unstructured data applications, such as images and audio. For structured data, you have accuracy, precision, recall, and F1-score.
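As a quick reference for the classification metrics named above, the following sketch computes accuracy, precision, recall, and F1 for a binary task from scratch. The function name and the toy labels are illustrative assumptions; in practice you would likely rely on an established library.

```python
def classification_metrics(y_true, y_pred):
    """Compute binary classification metrics, treating label 1 as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The same scores can be compared against a baseline such as HLP to judge how much room for improvement remains.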
Be aware that optimizing accuracy on a test set will not solve every business problem. Translate accuracy into business metrics, like revenue and user engagement. If you struggle to find suitable business metrics, it may be a sign you need to learn more about the problem and the business itself.
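One rough way to express a model improvement in business terms is a back-of-the-envelope revenue estimate. The sketch below assumes a recommendation system whose effect shows up as a relative uplift in conversion rate; the function, its parameters, and all numbers are hypothetical and would need to be replaced with figures agreed with the business owners.

```python
def expected_monthly_revenue_lift(
    sessions: int,
    baseline_conversion: float,
    relative_uplift: float,
    avg_order_value: float,
) -> float:
    """Estimate extra monthly revenue from a relative conversion uplift."""
    extra_orders = sessions * baseline_conversion * relative_uplift
    return extra_orders * avg_order_value


# Hypothetical inputs: 200k sessions, 2% conversion, 5% relative uplift, $40 orders.
lift = expected_monthly_revenue_lift(
    sessions=200_000,
    baseline_conversion=0.02,
    relative_uplift=0.05,
    avg_order_value=40.0,
)
print(f"Estimated extra revenue per month: ${lift:,.0f}")
```

The hard part is usually the uplift assumption itself, which is typically validated with a controlled experiment rather than inferred from test-set accuracy.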
Budgeting and Resources
Budgeting and resource allocation are critical aspects of the project. You must carefully consider all of the resources required to achieve the project’s goals, including the necessary data, personnel, development time, and any integrations or support needed from other teams.
Data, for instance, can be an expensive and time-consuming resource to acquire. There might be several costs involved in purchasing or collecting data, as well as cleaning and preparing it for use in a production environment.
Personnel costs can become particularly significant, especially if you need to hire specialized experts to work on the project.
Time is another core asset to consider. AI projects can take months or even years to complete, so be realistic about the time commitment required and be transparent about it with business and application owners.
It is also important not to underestimate the costs of integrations or support that you may need from other teams. For example, when integrating your model with an existing software application, you must consider the costs of development and testing.
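The resource categories above can be rolled up into a first-pass budget. This is a minimal sketch under stated assumptions: the `LineItem` structure, the cost figures, and the 15% contingency margin are all hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    name: str
    monthly_cost: float
    months: int


def total_budget(items, contingency: float = 0.15) -> float:
    """Sum all line items and add a contingency margin on top."""
    base = sum(item.monthly_cost * item.months for item in items)
    return base * (1 + contingency)


# Hypothetical line items covering data, personnel, and integration work.
items = [
    LineItem("data acquisition and cleaning", 5_000, 3),
    LineItem("ML engineers", 20_000, 6),
    LineItem("integration and testing", 8_000, 2),
]

budget = total_budget(items)
print(f"Estimated budget with contingency: ${budget:,.0f}")
```

Keeping the estimate as explicit line items makes it easier to revisit individual assumptions as the project progresses.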
Summarizing Everything
The scoping process of any machine learning project involves carefully considering the business problems and potential solutions, as well as estimating the resources required to complete the project.
We can break down this process into the following steps:
Step 1: Problem Definition
- Identify the business problem. What are the current challenges facing the business?
- Explore potential solutions. How can AI solutions be used to address these challenges?
Step 2: Feasibility Analysis
- Assess technical feasibility. Is it possible to build an AI solution that meets the business needs?
- Analyze potential ROI. What is the expected return on investment for this project?
Step 3: Brainstorming
- Collaborate with business stakeholders and ML engineers. Generate and explore multiple AI-based solutions for the business problem.
- Consult external and internal benchmarks. Determine if the project is feasible based on existing research and competitors.
- Build a proof of concept (if necessary). An initial small-scale model can help you gain insights and create realistic expectations for the project.
Step 4: Metrics and Milestones
- Define success metrics. What metrics will we use to measure the success of the project? These should include both machine learning metrics and business metrics.
- Determine milestones. What are the key milestones we need to achieve throughout the project lifecycle?
Step 5: Budgeting and Resources
- Identify required resources. These include data, personnel, time, and any integrations or support required.
- Develop a budget. Allocate resources to each stage of the project lifecycle.
It is important to note that the overall machine learning process is iterative. As you progress through the project, you may need to revisit earlier stages and adjust your plans based on new information and insights.
By following these stages and steps, you can increase the likelihood that your machine learning project will succeed and deliver real value to the business.
In Closing
This article concludes our four-part series on the machine learning project lifecycle. We have covered all the stages in developing machine learning applications, from scoping and data collection to model training and deployment.
In this final piece, we have focused on the scoping stage, where you define the project’s goals, assess feasibility, and determine the resources required to complete it successfully. By carefully considering all of these factors upfront, you can increase the chances that your AI-based solution will be successful and deliver value to real-world business problems.
These four articles provide a comprehensive overview of the entire project lifecycle, so feel free to go back to any of the previous articles whenever you believe it is necessary.
It is crucial to remember that machine learning is a highly iterative process. As you progress through the project, you will most likely need to revisit and adjust the earlier steps based on new challenges and information you gather.
Originally published in Artificial Intelligence in Plain English.