
Prompting a custom foundation model

Last updated: May 13, 2025

After a custom foundation model is installed, registered, and deployed, use the capabilities of watsonx.ai to prompt the model.

Note: Only members of the project or space where the custom foundation model is deployed can prompt it. The model is not available to users of other projects or spaces.

When your deployed custom foundation model is available, you can prompt it by using one of these methods:

Prompt a custom model by using Prompt Lab

Open the custom model from the list of available foundation models. You can then work with the model as you do with foundation models that are provided with watsonx.ai.

Prompt a custom model by using the API

Refer to the following examples to prompt the custom foundation model programmatically:

Generating a text response

The following code sample shows how to generate a text response by using the /ml/v1/deployments/<deployment ID>/text/generation API endpoint:

curl -X POST "https://<cluster_url>/ml/v1/deployments/<your deployment ID>/text/generation?version=2024-01-29" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
--data '{
 "input": "Hello, what is your name",
 "parameters": {
    "max_new_tokens": 200,
    "min_new_tokens": 20
 }
}'
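
The same request can be made from Python. The following is a minimal sketch that uses the requests library; the cluster URL, deployment ID, and bearer token are placeholders that you must replace with your own values, and the exact response schema can vary by model:

import requests

CLUSTER_URL = "https://<cluster_url>"       # placeholder: your cluster URL
DEPLOYMENT_ID = "<your deployment ID>"      # placeholder: your deployment ID
TOKEN = "<your bearer token>"               # placeholder: a valid bearer token

url = f"{CLUSTER_URL}/ml/v1/deployments/{DEPLOYMENT_ID}/text/generation"

response = requests.post(
    url,
    params={"version": "2024-01-29"},
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    json={
        "input": "Hello, what is your name",
        "parameters": {"max_new_tokens": 200, "min_new_tokens": 20},
    },
)
response.raise_for_status()

# The generated text is typically returned in results[0].generated_text.
print(response.json()["results"][0]["generated_text"])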

You can also use the chat API to prompt your deployed custom foundation model. To enable chat API support, the model content must include a chat template in its tokenizer_config.json file. The chat API can be used only with models that run on the vLLM runtime; therefore, the model must be deployed with the watsonx-cfm-caikit-1.1 software specification.

Note:

You can use the chat API only to prompt new deployments. Chat API support cannot be patched into existing deployments.
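
Before you deploy, you can check whether the model content includes the required chat template. The following is a minimal sketch, assuming the model files are available locally; the directory path is a placeholder:

import json

# Placeholder path to the model content that you plan to upload.
with open("<model-content-dir>/tokenizer_config.json") as f:
    config = json.load(f)

# Hugging Face-style tokenizer configs store the template under the
# "chat_template" key.
if "chat_template" in config:
    print("Chat template found: the deployment can support the chat API.")
else:
    print("No chat template: the deployment cannot support the chat API.")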

The following code sample shows how to generate a chat response by using the /ml/v1/deployments/<deployment ID>/text/chat API endpoint:

curl --request POST \
--url 'https://{region}.ml.cloud.ibm.com/ml/v1/deployments/<deployment-id>/text/chat?version=2020-10-10' \
--header "Authorization: Bearer $token" \
--header 'Content-Type: application/json' \
--data '{
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is the capital of India?"
        }
      ]
    },
    {
      "role": "assistant",
      "content": "New Delhi."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Which continent?"
        }
      ]
    }
  ]
}'
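
The same multi-turn request can be made from Python. The following is a minimal sketch with placeholder values; it assumes the reply is returned in a chat-completion format under choices[0].message.content, which you should verify against your deployment:

import requests

REGION = "<region>"                  # placeholder: for example, us-south
DEPLOYMENT_ID = "<deployment-id>"    # placeholder: your deployment ID
TOKEN = "<your bearer token>"        # placeholder: a valid bearer token

url = f"https://{REGION}.ml.cloud.ibm.com/ml/v1/deployments/{DEPLOYMENT_ID}/text/chat"

# The messages array carries the conversation history in order.
messages = [
    {"role": "user", "content": [{"type": "text", "text": "What is the capital of India?"}]},
    {"role": "assistant", "content": "New Delhi."},
    {"role": "user", "content": [{"type": "text", "text": "Which continent?"}]},
]

response = requests.post(
    url,
    params={"version": "2020-10-10"},
    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
    json={"messages": messages},
)
response.raise_for_status()

# The assistant reply is assumed to be in choices[0].message.content.
print(response.json()["choices"][0]["message"]["content"])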

Generating a streamed response

The following code sample shows how to generate a streamed response by using the /ml/v1/deployments/<deployment ID>/text/generation_stream API endpoint:

curl -X POST "https://<cluster_url>/ml/v1/deployments/<your deployment ID>/text/generation_stream?version=2024-01-29" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
--data '{
 "input": "Hello, what is your name",
 "parameters": {
    "max_new_tokens": 200,
    "min_new_tokens": 20
 }
}'
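
The streaming endpoint returns the response incrementally as server-sent events. The following is a minimal Python sketch that consumes the stream with the requests library; the placeholder values and the per-event payload shape (results[0].generated_text) are assumptions that you should verify against your deployment:

import json
import requests

CLUSTER_URL = "https://<cluster_url>"       # placeholder: your cluster URL
DEPLOYMENT_ID = "<your deployment ID>"      # placeholder: your deployment ID
TOKEN = "<your bearer token>"               # placeholder: a valid bearer token

url = f"{CLUSTER_URL}/ml/v1/deployments/{DEPLOYMENT_ID}/text/generation_stream"

with requests.post(
    url,
    params={"version": "2024-01-29"},
    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
    json={
        "input": "Hello, what is your name",
        "parameters": {"max_new_tokens": 200, "min_new_tokens": 20},
    },
    stream=True,
) as response:
    response.raise_for_status()
    for line in response.iter_lines(decode_unicode=True):
        # Server-sent events carry the JSON payload on lines that start with "data:".
        if line and line.startswith("data:"):
            event = json.loads(line[len("data:"):].strip())
            # Each event is assumed to carry a chunk of text in results[0].generated_text.
            print(event["results"][0]["generated_text"], end="", flush=True)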

You can also stream chat responses, which lets you interact with the deployed custom foundation model in a conversational way. The same requirements apply as for the chat API: the model content must include a chat template in its tokenizer_config.json file, the model must be deployed on the vLLM runtime with the watsonx-cfm-caikit-1.1 software specification, and only new deployments are supported.

The following code sample shows how to generate a streamed chat response by using the /ml/v1/deployments/<deployment ID>/text/chat_stream API endpoint:

curl --request POST \
  --url 'https://{region}.ml.cloud.ibm.com/ml/v1/deployments/<deployment-id>/text/chat_stream?version=2020-10-10' \
  --header "Authorization: Bearer $token" \
  --header 'Content-Type: application/json' \
  --data '{
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is the capital of the USA?"
        }
      ]
    }
  ]
}'
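
Consuming the chat stream from Python is similar. The following is a hedged sketch with placeholder values; it assumes each event carries an incremental delta under choices[0].delta.content, which you should verify against your deployment:

import json
import requests

REGION = "<region>"                  # placeholder: for example, us-south
DEPLOYMENT_ID = "<deployment-id>"    # placeholder: your deployment ID
TOKEN = "<your bearer token>"        # placeholder: a valid bearer token

url = f"https://{REGION}.ml.cloud.ibm.com/ml/v1/deployments/{DEPLOYMENT_ID}/text/chat_stream"

payload = {
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "What is the capital of the USA?"}]}
    ]
}

with requests.post(
    url,
    params={"version": "2020-10-10"},
    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
    json=payload,
    stream=True,
) as response:
    response.raise_for_status()
    for line in response.iter_lines(decode_unicode=True):
        if line and line.startswith("data:"):
            data = line[len("data:"):].strip()
            if data == "[DONE]":  # some chat streams send a terminal marker
                break
            chunk = json.loads(data)
            # Each chunk is assumed to carry new text in choices[0].delta.content.
            delta = chunk["choices"][0].get("delta", {})
            print(delta.get("content", ""), end="", flush=True)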

Parent topic: Deploying a custom foundation model