You can choose from a collection of third-party and IBM foundation models to inference in IBM watsonx.ai. Find foundation models that best suit the needs of your generative AI application and your budget.
You can host foundation models in watsonx.ai in various ways.
If you want to deploy foundation models in your own data center, you can purchase watsonx.ai software. For more information, see Overview of IBM watsonx.ai and IBM watsonx.governance software.
Based on how foundation models are hosted in watsonx.ai, they are categorized as:
- Provided foundation models
- Deploy on demand foundation models
- Custom foundation models
- Prompt-tuned foundation models
Deployment methods comparison
To help you choose the deployment method that best fits your use case, review the comparison table.
Deployment type | Available from | Deployment mechanism | Hosting environment | Billing method | Deprecation policy |
---|---|---|---|---|---|
Foundation models provided with watsonx.ai | • Resource hub > Pay per token • Prompt Lab | Curated and deployed by IBM | Multitenant hardware | By tokens used | Deprecated according to the published lifecycle. See Foundation model lifecycle. |
Deploy on demand foundation models | • Resource hub > Pay by the hour • Prompt Lab | Curated and deployed by IBM at your request | Dedicated hardware | By hour deployed | Your deployed model is not deprecated |
Custom foundation models | • Prompt Lab | Curated and deployed by you | Dedicated hardware | By hour deployed | Not deprecated |
Prompt-tuned foundation models | • Prompt Lab | Tuned and deployed by you | Multitenant hardware | • Training is billed by CUH • Inferencing is billed by tokens used | Deprecated when the underlying model is deprecated, unless you add the underlying model as a custom foundation model |
For details on how model pricing is calculated and monitored, see Billing details for generative AI assets.
Provided foundation models that are ready to use
IBM deploys a collection of third-party and IBM foundation models on multitenant hardware in IBM watsonx.ai. You can prompt these foundation models in the Prompt Lab or programmatically. You pay based on the number of tokens used.
To start inferencing a provided foundation model, complete these steps:
- From the main menu, select Resource hub.
- Click View all in the Pay per token section.
- Click a foundation model tile, and then click Open in Prompt Lab.
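To prompt a provided foundation model programmatically, you can use the watsonx.ai Python SDK. The following is a minimal sketch, assuming the ibm-watsonx-ai package and placeholder values for the API key, project ID, and model ID; substitute values from your own account and the Resource hub.

```python
# Minimal sketch: inference a provided foundation model with the
# ibm-watsonx-ai Python SDK. The API key, project ID, and model ID
# are placeholders; replace them with values from your account.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",  # your regional endpoint
    api_key="YOUR_IBM_CLOUD_API_KEY",         # placeholder
)

model = ModelInference(
    model_id="ibm/granite-13b-instruct-v2",   # example provided model ID from the Resource hub
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",             # placeholder
    params={"max_new_tokens": 200},
)

# The call is billed by the number of input and output tokens it consumes.
print(model.generate_text(prompt="Summarize the benefits of multitenant hosting."))
```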
Deploy on demand foundation models
A deploy on demand model is an instance of an IBM-curated foundation model that you deploy and that is dedicated for the exclusive use of your organization. Only colleagues who are granted access to the deployment can inference the foundation model. A dedicated deployment means faster and more responsive interactions without rate limits.
To work with a deploy on demand foundation model, complete these steps:
- From the main menu, select Resource hub.
- Click View all in the Pay by the hour section.
- Click a foundation model tile, and then click Deploy.
For more information, see Deploying foundation models on-demand.
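After IBM deploys the model at your request, you can prompt it through its dedicated deployment. The following is a minimal sketch, assuming the ibm-watsonx-ai Python SDK and placeholder deployment and space IDs that you copy from your deployment details.

```python
# Minimal sketch: prompt a deploy on demand foundation model through its
# dedicated deployment. The deployment ID and space ID are placeholders;
# copy the real values from the deployment details page.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="YOUR_IBM_CLOUD_API_KEY",         # placeholder
)

deployed_model = ModelInference(
    deployment_id="YOUR_DEPLOYMENT_ID",       # placeholder
    credentials=credentials,
    space_id="YOUR_DEPLOYMENT_SPACE_ID",      # placeholder
)

# Hosting is billed by the hour that the deployment runs, not by tokens.
print(deployed_model.generate_text(prompt="Draft a short product announcement."))
```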
Custom foundation models
In addition to working with foundation models that are curated by IBM, you can upload and deploy your own foundation models. After the custom models are deployed and registered with watsonx.ai, you can create prompts that inference the custom models from the Prompt Lab or the watsonx.ai API.
The instance of the custom foundation model that you deploy is dedicated for your use. A dedicated deployment means faster and more responsive interactions. You pay for hosting the foundation model by the hour.
To learn more about how to upload, register, and deploy a custom foundation model, see Deploying a custom foundation model.
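The following is a minimal sketch of inferencing a deployed custom foundation model from the watsonx.ai REST API, assuming a Python environment with the requests package; the deployment ID, regional endpoint, and API version date are placeholders to replace with values from your deployment details and the current API reference.

```python
# Minimal sketch: call a deployed custom foundation model over the
# watsonx.ai REST API. IDs, the region URL, and the version date are
# placeholders.
import requests

API_KEY = "YOUR_IBM_CLOUD_API_KEY"        # placeholder
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID"      # placeholder
BASE_URL = "https://us-south.ml.cloud.ibm.com"

# Exchange the IBM Cloud API key for a bearer token.
iam = requests.post(
    "https://iam.cloud.ibm.com/identity/token",
    data={"grant_type": "urn:ibm:params:oauth:grant-type:apikey", "apikey": API_KEY},
)
token = iam.json()["access_token"]

# Send a text generation request to the dedicated custom model deployment.
response = requests.post(
    f"{BASE_URL}/ml/v1/deployments/{DEPLOYMENT_ID}/text/generation",
    params={"version": "2023-05-29"},     # assumed API version date
    headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    json={
        "input": "Explain what a dedicated deployment is.",
        "parameters": {"max_new_tokens": 150},
    },
)
print(response.json()["results"][0]["generated_text"])
```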
Prompt-tuned foundation models
A subset of the provided foundation models can be customized for your needs by prompt tuning the model from the watsonx.ai API or the Tuning Studio. A prompt-tuned foundation model relies on the underlying deployed foundation model, which can be deprecated. You pay for the resources that you consume to tune the model. After the model is tuned, you pay by tokens used to inference the model.
For the list of foundation models that you can customize by prompt tuning in watsonx.ai, and for more information, see Tuning Studio.
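The following is a minimal sketch of starting a prompt-tuning experiment programmatically, assuming the TuneExperiment interface of the ibm-watsonx-ai Python SDK and a training data asset that already exists in your project; the base model, task type, and IDs shown are placeholders, and the set of tunable models is documented in Tuning Studio.

```python
# Minimal sketch of prompt tuning through the watsonx.ai Python SDK.
# Names, IDs, and the base model are placeholders.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.experiment import TuneExperiment
from ibm_watsonx_ai.helpers import DataConnection

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="YOUR_IBM_CLOUD_API_KEY",            # placeholder
)

experiment = TuneExperiment(credentials, project_id="YOUR_PROJECT_ID")

prompt_tuner = experiment.prompt_tuner(
    name="my-prompt-tuning-run",
    task_id="classification",                    # example task type
    base_model="google/flan-t5-xl",              # example model that supports prompt tuning
    num_epochs=10,
)

# Training is billed by CUH; inferencing the tuned model is billed by tokens used.
tuning_details = prompt_tuner.run(
    training_data_references=[
        DataConnection(data_asset_id="YOUR_TRAINING_DATA_ASSET_ID")  # placeholder
    ],
    background_mode=False,
)
print(tuning_details)
```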
Learn more
For the complete list of models you can work with in watsonx.ai, see Supported foundation models.
Parent topic: Gen AI solutions