Generative AI opens the door to faster development cycles, reduced technical and maintenance effort, and innovative use cases that previously seemed out of reach. At the same time, it brings new risks, such as hallucinations and dependencies on third-party APIs.
For data scientists and machine learning teams, this evolution has a direct impact on their roles. A new type of AI project has appeared, with part of the AI already implemented by external model providers (OpenAI, Anthropic, Meta, and so on). Non-AI-expert teams can now integrate AI solutions with relative ease.
In this blog post we’ll discuss what all this means for data science and machine learning teams:
- A wider variety of problems can now be solved, but not all problems are AI problems.
- Traditional ML is not dead but is augmented through GenAI.
- Some problems are best solved with GenAI, but still require ML expertise to run evaluations and mitigate ethical risks.
- AI literacy is becoming more important within companies, and data scientists play a key role in making AI literacy a reality.
Not All Problems Are AI Problems
GenAI has unlocked the potential to solve a much broader range of problems, but this doesn’t mean that every problem is an AI problem. Data scientists and AI experts remain key to identifying when AI makes sense, selecting the appropriate AI techniques, and designing and implementing reliable solutions to solve the given problems (regardless of the solution being GenAI, traditional ML, or a hybrid approach).
However, while the breadth of problems AI can address has grown, two things need to be taken into consideration to select the right use cases and to ensure solutions will be future-proof:
- At any given moment, GenAI models will have limitations that can negatively impact a solution. This will always hold true: we are dealing with predictions and probabilities, which always carry a degree of error and uncertainty.
- At the same time, the field is advancing fast and will keep evolving in the near future, reducing or shifting the limitations and weaknesses of GenAI models while adding new capabilities and features.
If current large language model (LLM) versions cannot solve specific issues but future versions likely will, it may be more strategic to wait or to develop an interim solution rather than invest in complex in-house developments to address current limitations. Again, data scientists and AI experts can track the direction of progress and help distinguish between issues that model providers are likely to address and those that require internal solutions. For instance, incorporating features that let users edit or supervise the output of an LLM can be more effective than aiming for full automation with complex logic or fine-tuning.
Differentiation in the market won’t come from merely using LLMs, as these are now accessible to everyone, but from the unique experiences, functionalities, and value products can provide through them. With GenAI solutions, data science teams might need to focus less on the model development part, and more on the whole AI system.
Traditional ML Augmented Through GenAI
While GenAI has revolutionized the field of AI and many industries, traditional ML remains indispensable. Many use cases, especially those not involving text or images, still require traditional ML solutions, while other problems may be more efficiently solved with ML rather than GenAI.
Far from replacing traditional ML, GenAI often complements it: GenAI allows for faster prototyping and experimentation and can augment certain use cases through hybrid ML and GenAI solutions.
In traditional ML workflows, developing a natural language processing (NLP) classifier involves obtaining and possibly labeling training data, preparing the data, training and fine-tuning the model, evaluating performance, and deploying, monitoring, and maintaining the system. This process often takes months and requires significant resources for development and ongoing maintenance.
By contrast, with GenAI, the workflow simplifies dramatically: select the appropriate LLM, engage in prompt engineering or prompt iteration, perform offline evaluation, and use an API to integrate the model into production. This greatly reduces the time from idea to deployment, often taking just weeks instead of months. Moreover, much of the maintenance burden is managed by the LLM provider, further decreasing operational costs and complexity.
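To make the contrast concrete, here is a minimal sketch of that GenAI workflow as a zero-shot text classifier built with the OpenAI Python SDK. The task, label set, and model name are illustrative assumptions, not recommendations; any provider with a chat-completion API would work similarly.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify_ticket(text: str) -> str:
    """Zero-shot classifier via prompting: no training data, no training loop."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; use whatever your provider offers
        messages=[
            {"role": "system",
             "content": "Classify the support ticket into exactly one of: "
                        "billing, technical, account, other. Reply with the label only."},
            {"role": "user", "content": text},
        ],
        temperature=0,  # keep outputs as deterministic as possible for classification
    )
    return response.choices[0].message.content.strip().lower()

print(classify_ticket("I was charged twice for my subscription this month."))
# expected output: "billing"
```

Note what is missing: there is no labeled dataset, no training pipeline, and no model hosting. The entire model development step collapses into a prompt, which is exactly why the idea-to-deployment time shrinks from months to weeks.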
For this reason, GenAI makes it possible to test ideas and prove value quickly, without collecting labeled data or investing in training and deploying in-house models. Once value is proven, ML teams might decide it makes sense to transition to traditional ML solutions to decrease costs or latency, potentially leveraging labeled data from the initial GenAI system. Similarly, many companies are now moving to small language models once value is proven, as these can be fine-tuned and more easily deployed while achieving comparable or even superior performance to LLMs.
In other cases, the optimal solution combines GenAI and traditional ML into hybrid systems that leverage the best of both worlds. A good example is “Building DoorDash’s product knowledge graph with large language models,” where they explain how traditional ML models are used alongside LLMs to refine classification tasks, such as tagging product brands. An LLM is used when the traditional ML model can’t confidently classify something, and if the LLM is able to do so, the traditional ML model is retrained with the new annotations (which is a great feedback loop!). Regardless, ML teams will continue to develop traditional ML solutions, fine-tune, and deploy predictive models while recognizing GenAI's potential to enhance the speed and quality of these solutions.
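The routing logic behind such a hybrid can be surprisingly small. The sketch below is a conceptual illustration of the pattern described above, not DoorDash's actual implementation: a scikit-learn-style model handles confident predictions, the LLM classifier from the earlier sketch handles the rest, and the LLM's answers are collected as future training data. The confidence threshold and helper names are assumptions.

```python
import numpy as np

CONFIDENCE_THRESHOLD = 0.85   # illustrative; tune against your own error costs
llm_annotations = []          # (text, label) pairs used to periodically retrain the ML model

def classify(text: str, ml_model, vectorizer) -> str:
    """Route to the cheap in-house model first; fall back to the LLM when unsure."""
    probs = ml_model.predict_proba(vectorizer.transform([text]))[0]
    if probs.max() >= CONFIDENCE_THRESHOLD:
        return ml_model.classes_[np.argmax(probs)]
    label = classify_ticket(text)            # LLM fallback (see the earlier sketch)
    llm_annotations.append((text, label))    # feedback loop: new annotations for retraining
    return label
```

The design choice here is cost-driven: the traditional model absorbs the bulk of the traffic cheaply and with low latency, while the LLM is reserved for the hard tail, and each fallback call pays for itself twice by also producing a training example.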
Some Problems Are Better Solved with GenAI
The AI field is shifting from numerous in-house specialized models to a few huge multi-task models owned by external companies. ML teams need to embrace this change and be ready to include GenAI solutions in their toolbox to stay competitive. Even though the model training phase is already handled by the provider, it is important to keep in mind that ML and AI solutions remain probabilistic, in sharp contrast with the determinism of traditional software development.
Despite all the benefits that come with GenAI, ML teams will have to address its own set of challenges and risks. The main risks added by choosing a GenAI-based solution over an in-house traditional ML one are:
- Dependency on third-party models: This introduces new costs per call, higher latency that might impact the performance of real-time systems, and reduced control due to limited knowledge of training data and design decisions, with provider updates potentially causing unforeseen issues in production.
- GenAI-specific risks: GenAI systems take free-form input and produce free-form output. Free input introduces new privacy and security risks (e.g., through data leakage or prompt injections), while free output introduces risks of hallucination, toxicity, or amplified bias and discrimination (a minimal guardrail sketch follows this list).
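Below is a deliberately simple sketch of what such guardrails can look like in code: a heuristic pre-filter for obvious prompt-injection attempts on the input side, and a closed label set on the output side. The patterns and label set are illustrative assumptions; a production system would layer dedicated guardrail frameworks and provider moderation endpoints on top of checks like these.

```python
import re

# Crude, assumption-laden patterns for demonstration only; real attacks are far more varied.
INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"system prompt",
    r"you are now",
]

def screen_input(user_text: str) -> bool:
    """Return False for inputs that look like obvious prompt-injection attempts."""
    lowered = user_text.lower()
    return not any(re.search(pattern, lowered) for pattern in INJECTION_PATTERNS)

ALLOWED_LABELS = {"billing", "technical", "account", "other"}

def screen_output(llm_label: str) -> str:
    """Constrain free-form LLM output to a closed set; anything unexpected is coerced."""
    return llm_label if llm_label in ALLOWED_LABELS else "other"
```

Constraining free output to a closed set is one of the cheapest mitigations available: it cannot prevent a wrong label, but it does prevent hallucinated categories, toxic text, or leaked instructions from reaching downstream systems.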
GenAI Needs ML Expertise for Safe and Effective Use
Although GenAI solutions are often easier to implement than traditional ML models, their deployment still requires ML expertise, especially in evaluation, monitoring, and ethical risk management.
Just as with traditional ML, the success of GenAI relies on robust evaluation. Because of their free-form output, these solutions require assessment from multiple perspectives, including answer relevancy, correctness, tone, hallucinations, and risk of harm. It is important to run this step before deployment (see the ML vs. GenAI project phases above), a step usually referred to as "offline evaluation," since it gives a picture of how the system will behave and perform once deployed. Make sure to check this great overview of LLM evaluation metrics, which differentiates between statistical scorers (quantitative metrics like BLEU or ROUGE for text relevance) and model-based scorers (e.g., embedding-based similarity measures). DS teams excel at designing and evaluating metrics, even when those metrics are somewhat abstract (e.g., how do you measure usefulness or relevancy?).
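As a small illustration of the two scorer families, the snippet below computes a statistical score (ROUGE-L, via the rouge-score package) and a model-based score (embedding cosine similarity, via sentence-transformers) for the same reference/candidate pair. The example strings and embedding model are arbitrary choices for demonstration.

```python
# pip install rouge-score sentence-transformers
from rouge_score import rouge_scorer
from sentence_transformers import SentenceTransformer, util

reference = "Refunds are processed within 5 business days."
candidate = "You should receive your refund in about five working days."

# Statistical scorer: n-gram overlap, which penalizes paraphrases like the one above
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
print("ROUGE-L F1:", scorer.score(reference, candidate)["rougeL"].fmeasure)

# Model-based scorer: embedding similarity, which is robust to paraphrasing
model = SentenceTransformer("all-MiniLM-L6-v2")
emb_ref, emb_cand = model.encode([reference, candidate], convert_to_tensor=True)
print("Cosine similarity:", util.cos_sim(emb_ref, emb_cand).item())
```

A paraphrase like this one typically scores low on n-gram overlap but high on embedding similarity, which is precisely why both families are worth tracking side by side.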
Once a GenAI solution is deployed, monitoring becomes critical to ensure it works as intended over time. The same metrics used for offline evaluation can be tracked in production to verify that its conclusions still hold once the solution is working with real data. Monitoring tools like Datadog already offer LLM-specific observability metrics. It can also be valuable to enrich these quantitative insights with qualitative feedback, by working closely with user research teams that can ask users directly (e.g., "Do you find these suggestions useful, and if not, why?").
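In its simplest form, this kind of monitoring is just a rolling comparison against the offline baseline. The sketch below is a minimal illustration with assumed parameter values; in practice the alert would be wired into an observability platform such as the Datadog features mentioned above.

```python
from collections import deque
from statistics import mean

class QualityMonitor:
    """Track a rolling window of per-response quality scores and flag drift
    relative to the baseline established during offline evaluation."""

    def __init__(self, baseline: float, window: int = 500, tolerance: float = 0.1):
        self.baseline = baseline          # score achieved in offline evaluation
        self.scores = deque(maxlen=window)
        self.tolerance = tolerance        # how much degradation to accept

    def record(self, score: float) -> None:
        self.scores.append(score)
        # Only alert once the window is full, to avoid noise from small samples
        if len(self.scores) == self.scores.maxlen and \
           mean(self.scores) < self.baseline - self.tolerance:
            self.alert()

    def alert(self) -> None:
        print("ALERT: online quality dropped below offline baseline")
        # hook into your paging / observability stack here

monitor = QualityMonitor(baseline=0.82)  # baseline value is illustrative
```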
The complexity and black-box design of GenAI models amplify the ethical risks they carry. ML teams play a crucial role here by bringing their expertise in trustworthy AI to the table, recognizing potential issues, and mitigating these risks. This work can include running risk assessments, choosing less biased foundation models (COMPL-AI is an interesting new framework for evaluating and benchmarking LLMs on ethical dimensions), defining and evaluating fairness and non-discrimination metrics, and applying techniques and guardrails to keep outputs aligned with societal and organizational values.
AI Literacy Is Becoming More Important Within Companies
A company’s competitive advantage will depend not just on its internal AI projects but on how effectively its workforce understands and uses AI. Data scientists play a key role in fostering AI literacy across teams, enabling employees to leverage AI while understanding its limitations and risks. With their help, AI can become not just a tool for technical teams but a core competency across the organization.
To build AI literacy, organizations can host internal trainings, workshops, meetups, and hackathons led by data scientists and AI experts. This awareness can help:
- Augment internal teams and improve their productivity, by encouraging the use of general-purpose AI or specific AI-based features in tools the teams are already using.
- Identify high-potential opportunities from within the teams and their expertise. Business and product experts can propose valuable projects on topics previously deemed too complex or impossible, now made viable with GenAI.
In Closing: The Ever-Evolving Role of Data Scientists
It is indisputable that the field of data science and artificial intelligence is changing fast, and with it the role of data scientists and machine learning teams. While it’s true that GenAI APIs enable teams with little ML knowledge to implement AI solutions, the expertise of data science and ML teams is crucial for developing robust, reliable, and ethically sound solutions. The redefined role of data scientists in this new context includes:
- Staying up to date with AI progress in order to select the best techniques for each problem, design and implement effective solutions, and make those solutions future-proof while acknowledging their limitations.
- Adopting a system-wide perspective rather than focusing solely on the predictive model, encompassing end-to-end responsibilities and enhancing collaboration with other roles to influence user interactions and system supervision.
- Continuing to develop traditional ML solutions, while recognizing how GenAI can enhance the speed and quality of these solutions.
- Gaining a deep understanding of GenAI limitations and risks to build reliable and trustworthy AI systems, including evaluation, monitoring, and risk management.
- Acting as an AI Champion across the organization to promote AI literacy and assist non-technical teams in leveraging AI and identifying the right opportunities.
The data scientist role is not being replaced; it is being redefined. By embracing this evolution, data scientists will remain indispensable, guiding organizations toward leveraging AI effectively and responsibly.
Originally published in Towards Data Science.