
Deploying agentic AI applications

Last updated: May 27, 2025

Agentic AI is a type of artificial intelligence that uses sophisticated reasoning and iterative planning to autonomously solve complex, multi-step problems. It enables AI agents to think and act like humans, making decisions and taking actions in a dynamic environment.

Key differences: generative and agentic AI

Agentic AI systems collect and process large amounts of data from various sources to identify challenges, create plans, and take action on their own. Unlike traditional generative AI systems, agentic AI can take on tasks that require multiple steps and interactions across text, images, and data. These applications use LLMs to make decisions, generate text, and perform tasks. Businesses use agentic AI to enhance customer service, streamline software development processes, and improve patient interactions.

In contrast, generative AI applications focus on creating content rather than autonomously making decisions, taking actions, and adapting to their environment. Generative AI is a type of artificial intelligence that creates new content, such as text, images, music, or video, by learning from data. It uses machine learning algorithms to produce new outputs based on patterns and structures learned from its training data.

The following summary describes the key differences between generative AI and agentic AI applications:

Generative AI
  • Description: Creates new content, such as text, images, music, or video
  • Primary focus: Content generation
  • Approach: Creative
  • Autonomy: Limited (follows instructions or prompts)
  • Examples: Chatbots, language models, image generators, music composers, video editors

Agentic AI
  • Description: Uses LLMs to augment decision-making and action-taking
  • Primary focus: Decision-making and action-taking
  • Approach: Analytical (with LLM assistance)
  • Autonomy: High (operates autonomously)
  • Examples: Virtual assistants, autonomous vehicles, robotics, complex decision-making systems

Deploying agentic AI applications

You can deploy single-agent or multi-agent applications with watsonx.ai:

  • Deploying single-agent applications: In a single-agent system, the agent operates independently, making decisions and taking actions based on its own goals and objectives. It is a self-contained system that does not interact with other agents or systems.

    The agent accesses memory for storing and retrieving past information, and uses tools, including external resources such as APIs or databases. There is no coordination with other agents, which keeps the system simpler but limits its ability to handle complex tasks.

    Single-agent AI systems are often used in applications such as chatbots, virtual assistants, and autonomous vehicles.

  • Deploying multi-agent applications: A multi-agent system consists of multiple agents that interact with each other and their environment to achieve a common goal. Each agent can have its own goals, motivations, and behaviors, and the agents must coordinate and interact with each other to achieve the overall goal.

    These agents collaborate, exchanging information and distributing tasks among themselves. Each agent can access memory and tools independently, enabling parallel processing for efficiency. The agents communicate with each other, making the system more adaptive and capable of handling complex, distributed tasks.

    Multi-agent AI systems are often used in applications such as robotics, autonomous systems, and complex decision-making systems.

The following graphic shows the difference between single-agent and multi-agent systems. In a single-agent system, the AI agent independently processes information and makes decisions. In contrast, in a multi-agent system, the user interacts with multiple AI agents instead of just one.

Single-agent and Multi-agent systems

Methods for deploying agentic AI applications

Agentic AI applications are deployed as an AI service. An AI service is a deployable unit of code that captures the logic of your generative AI use cases. When your AI services are successfully deployed, you can use the endpoint for inferencing from your application.
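
To make the shape of an AI service concrete, the following minimal sketch shows the common pattern of an outer Python function that performs one-time setup and returns a request handler. The function name, the context.get_json() accessor, and the response format are assumptions based on typical watsonx.ai AI service samples; check the documentation for your release before relying on them.

```python
# Minimal sketch of an AI service (assumed shape, not an exact contract):
# an outer function that runs setup once and returns a per-request handler.

def deployable_ai_service(context, **custom):
    # One-time setup goes here: create clients, load tools, read parameters.
    # `context` is assumed to expose deployment-scoped credentials and tokens.

    def generate(context) -> dict:
        # Called for each inference request. Reading the payload with
        # context.get_json() and returning {"body": ...} follow the pattern
        # used in watsonx.ai samples; verify both for your release.
        payload = context.get_json()
        question = payload.get("question", "")
        answer = f"Echo from the agent: {question}"  # placeholder agent logic
        return {"body": {"answer": answer}}

    return generate
```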

Deploying single-agent applications

You can deploy single-agent AI applications with watsonx.ai by following these approaches:

  • No-code approach: Single-agent applications can be built in the Agent Lab by following a no-code approach. If you used the Agent Lab to build your agentic AI solution, you can simply deploy your application from the interface to create an online deployment automatically. Use this method of direct deployment from the Agent Lab if your application does not require much customization.

    To learn more about deploying single-agent applications as AI services with a no-code approach, see Deploying AI services with visual tools.

  • Low-code approach: If you built your agentic AI solution in the Agent Lab but require some customization before deployment, you can save your work in a deployment notebook. When you do, watsonx.ai automatically generates a notebook that captures the logic of your agentic AI application as an AI service, which is a deployable unit of code. The deployment notebook contains auto-generated code to promote your AI service asset to a deployment space and create a deployment for the asset.

    To learn more about deploying single-agent applications as AI services with a low-code approach, see Deploying AI services with visual tools.

  • Full-code approach: If you used frameworks such as LlamaIndex to build your application, or you need complete customization or integration of your agentic AI application with your own workflows or existing pipelines, you can adopt a full-code approach for deployment. Coding your solution provides complete flexibility for customization and high scalability.

    To code your application manually, write your own AI service to deploy your agentic AI application (see the sketch after this list).

    Alternatively, if you are building your agentic AI application with frameworks such as CrewAI or LangGraph, you can use pre-defined sample templates to deploy your application as an AI service. These templates provide a pre-built foundation for AI applications, enabling you to focus on the core logic of your application rather than starting from scratch.

    To learn more about deploying your application as an AI service, see Deploying AI services with code.
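
For the full-code path, the following sketch outlines how storing and deploying an AI service with the ibm_watsonx_ai Python client might look. The repository.store_ai_service, get_ai_service_id, and AIServiceMetaNames names reflect recent client releases but should be verified against the client documentation for your version; the credentials, space ID, and software specification name are placeholders.

```python
from ibm_watsonx_ai import APIClient, Credentials

# Connect to watsonx.ai and target a deployment space (placeholder values).
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="YOUR_API_KEY")
client = APIClient(credentials, space_id="YOUR_SPACE_ID")

# Stand-in for the AI service function from the earlier sketch.
def deployable_ai_service(context, **custom):
    def generate(context) -> dict:
        return {"body": {"answer": "placeholder"}}
    return generate

# Store the AI service as a repository asset. The metadata keys are assumptions
# based on recent client releases; check client.repository for the exact schema.
sw_spec_id = client.software_specifications.get_id_by_name("runtime-24.1-py3.11")
meta_props = {
    client.repository.AIServiceMetaNames.NAME: "single-agent-ai-service",
    client.repository.AIServiceMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
}
stored = client.repository.store_ai_service(deployable_ai_service, meta_props)
asset_id = client.repository.get_ai_service_id(stored)

# Create an online deployment for the stored AI service asset.
deployment = client.deployments.create(
    asset_id,
    meta_props={
        client.deployments.ConfigurationMetaNames.NAME: "single-agent-deployment",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
    },
)
print("Deployment ID:", client.deployments.get_id(deployment))
```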

Deploying multi-agent applications

If you built a multi-agent application with frameworks such as CrewAI or LangGraph, you must deploy your application by following a programmatic approach. Create an AI service to capture the programming logic of your application and deploy the AI service to get an endpoint for inferencing.

To deploy multi-agent systems programmatically, you can adopt the manual coding approach by creating the AI service in a notebook. Manual coding provides full customization and high scalability for deployment.
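
As a rough sketch of the manual coding approach, the following example wires two cooperating agents into a LangGraph graph and exposes the graph through the same assumed AI service shape used earlier. The node logic is a placeholder; in a real application each node would call a model or tools, and the context.get_json() accessor remains an assumption to verify for your release.

```python
from langgraph.graph import StateGraph, MessagesState, START, END
from langchain_core.messages import AIMessage

# Two placeholder agents: in practice each node would call an LLM or tools.
def researcher(state: MessagesState):
    question = state["messages"][-1].content
    return {"messages": [AIMessage(content=f"Research notes for: {question}")]}

def writer(state: MessagesState):
    notes = state["messages"][-1].content
    return {"messages": [AIMessage(content=f"Final answer based on: {notes}")]}

# Build a simple sequential multi-agent graph: researcher -> writer.
builder = StateGraph(MessagesState)
builder.add_node("researcher", researcher)
builder.add_node("writer", writer)
builder.add_edge(START, "researcher")
builder.add_edge("researcher", "writer")
builder.add_edge("writer", END)

def deployable_multi_agent_service(context, **custom):
    graph = builder.compile()

    def generate(context) -> dict:
        # Payload access via context.get_json() is an assumption; check the
        # runtime context API for your watsonx.ai release.
        payload = context.get_json()
        result = graph.invoke({"messages": [("user", payload.get("question", ""))]})
        return {"body": {"answer": result["messages"][-1].content}}

    return generate
```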

Alternatively, you can use pre-defined templates to deploy your AI services in watsonx.ai. These templates provide a pre-built foundation for AI applications, enabling you to focus on the core logic of your application, rather than starting from scratch.

You can also use CPDCTL, which is a command-line tool for deploying and managing AI services on the IBM Cloud Pak for Data (CPD) platform.

To learn more about deploying multi-agent applications as an AI service, see Deploying AI services with code.

Inferencing deployed agentic AI applications

After deploying your agentic AI application as an AI service, you can get the API endpoint for inferencing or interact with the deployed application.

Inferencing single-agent applications

Depending on the approach you chose for deployment, single-agent applications can be inferenced from the Agent Lab or programmatically.

Because single-agent AI applications are deployed as AI services, you can also use the chat method to interact with your deployed application when you use the Agent Lab. Chatting with a deployed application provides a more natural and intuitive way to interact with your deployment, making it easier to ask questions, receive answers, and complete tasks.

You can also inference your deployed application programmatically for text generation or streaming by using the watsonx.ai Python client library or REST API.
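
As an illustrative sketch, programmatic inferencing with the Python client could look like the following. The deployments.run_ai_service call and the payload shape are assumptions based on recent ibm_watsonx_ai releases and on the request handler sketched earlier; adjust them to match how your AI service reads its input.

```python
from ibm_watsonx_ai import APIClient, Credentials

# Placeholder credentials, space ID, and the deployment ID from the deployment step.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="YOUR_API_KEY")
client = APIClient(credentials, space_id="YOUR_SPACE_ID")

# Send a request to the deployed AI service. The payload keys must match
# whatever your generate handler expects (here, the "question" field from the
# earlier sketch).
response = client.deployments.run_ai_service(
    "YOUR_DEPLOYMENT_ID",
    {"question": "Summarize the open support tickets from yesterday."},
)
print(response)
```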

To learn how to get the inferencing endpoint for a single-agent AI application that is deployed as an AI service, see Testing AI service deployments.

Inferencing multi-agent applications

When you deploy multi-agent applications as an AI service programmatically, you can inference your deployed application programmatically by using the watsonx.ai Python client library or REST API.
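
For example, a REST call to a deployed AI service might look roughly like the sketch below. The /ml/v4/deployments/{id}/ai_service path, the version query parameter, and the payload shape are assumptions; confirm the exact endpoint, version date, and authentication flow in the watsonx.ai REST API reference for your region.

```python
import requests

# Placeholder values: region-specific host, deployment ID, and a bearer token
# obtained from IBM Cloud IAM.
base_url = "https://us-south.ml.cloud.ibm.com"
deployment_id = "YOUR_DEPLOYMENT_ID"
token = "YOUR_IAM_ACCESS_TOKEN"

# Assumed AI service inferencing endpoint and version parameter.
url = f"{base_url}/ml/v4/deployments/{deployment_id}/ai_service"
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
payload = {"question": "Plan the next sprint across the research and writing agents."}

response = requests.post(url, headers=headers, params={"version": "2024-10-10"}, json=payload)
response.raise_for_status()
print(response.json())
```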

To learn how to get the inferencing endpoint for a multi-agent AI application that is deployed as an AI service, see Testing AI service deployments.

Managing deployed agentic AI applications

After you deploy an agentic AI application, you can manage the deployment by updating details, scaling, or deleting the deployment. To manage agentic AI applications that are deployed as an AI service, see Managing AI service deployments.
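
As a brief sketch, common management operations with the Python client might look like the following. The get_details, update, and delete methods exist in recent client releases, but treat the metadata keys passed to update as assumptions, and assume that client is an authenticated APIClient scoped to your deployment space.

```python
deployment_id = "YOUR_DEPLOYMENT_ID"

# Inspect the current deployment details.
details = client.deployments.get_details(deployment_id)

# Update deployment metadata (for example, rename the deployment).
client.deployments.update(
    deployment_id,
    {client.deployments.ConfigurationMetaNames.NAME: "agentic-app-v2"},
)

# Delete the deployment when it is no longer needed.
client.deployments.delete(deployment_id)
```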

Learn more

Parent topic: Deploying generative and agentic AI applications