Prompting a custom foundation model
After a custom foundation model is installed, registered, and deployed, use the capabilities of watsonx.ai to prompt the model.
When your deployed custom foundation model is available, you can prompt it by using one of these methods:
Prompt a custom model by using Prompt Lab
Open the custom model from the list of available foundation models. You can then work with the model as you do with foundation models that are provided with watsonx.ai.
- Use the Prompt Lab to create and review prompts for your custom foundation model
- Build and save reusable prompts as prompt templates with variables
- Deploy and test prompt templates
Prompt a custom model by using the API
Refer to the following examples to code a prompt for the custom foundation model.
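All of the examples pass a bearer token in the Authorization header. As a minimal sketch, assuming your deployment runs in IBM Cloud, you can exchange an IBM Cloud API key for a token at the IAM identity endpoint (on self-managed clusters, obtain the token from your platform's authorization route instead):

curl -X POST "https://iam.cloud.ibm.com/identity/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
--data-urlencode "grant_type=urn:ibm:params:oauth:grant-type:apikey" \
--data-urlencode "apikey=<your API key>"

The response JSON contains an access_token field; export its value as the TOKEN environment variable that the following examples reference.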
Generating text response
The following code sample shows how to generate a text response by using the /ml/v1/deployments/<deployment ID>/text/generation API endpoint:
curl -X POST "https://<cluster_url>/ml/v1/deployments/<your deployment ID>/text/generation?version=2024-01-29" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
--data '{
"input": "Hello, what is your name",
"parameters": {
"max_new_tokens": 200,
"min_new_tokens": 20
}
}'
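A successful request returns a JSON body whose results array holds the generated output. To print only the generated text, you can pipe the response through jq (a sketch, assuming the standard watsonx.ai text generation response schema with a results[0].generated_text field):

curl -s -X POST "https://<cluster_url>/ml/v1/deployments/<your deployment ID>/text/generation?version=2024-01-29" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
--data '{"input": "Hello, what is your name", "parameters": {"max_new_tokens": 200, "min_new_tokens": 20}}' \
| jq -r '.results[0].generated_text'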
You can also use the chat API to prompt your deployed custom foundation model. To enable chat API support, the model content (tokenizer_config.json file in model content) must include a chat template. Chat API support can only be used to prompt models that use the vLLM runtime for deployment, therefore they must be deployed with the watsonx-cfm-caikit-1.1 software specification.
You can only use the chat API to prompt new deployments. The chat API cannot be used to patch support for existing deployments.
The following code sample shows how to generate a chat response by using the /ml/v1/deployments/<deployment ID>/text/chat API endpoint:
curl --request POST \
--url 'https://{region}.ml.cloud.ibm.com/ml/v1/deployments/<deployment-id>/text/chat?version=2020-10-10' \
--header "Authorization: Bearer $token" \
--header 'Content-Type: application/json' \
--data '{
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What is the capital of India"
}
]
},
{
"role": "assistant",
"content": "New Delhi."
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "Which continent"
}
]
}
]
}'
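Assuming the deployment chat endpoint follows the watsonx.ai chat response schema, the reply is returned as an assistant message inside a choices array, which you can extract with jq:

curl -s --request POST \
--url 'https://{region}.ml.cloud.ibm.com/ml/v1/deployments/<deployment-id>/text/chat?version=2020-10-10' \
--header "Authorization: Bearer $token" \
--header 'Content-Type: application/json' \
--data '{"messages": [{"role": "user", "content": [{"type": "text", "text": "What is the capital of India"}]}]}' \
| jq -r '.choices[0].message.content'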
Generating stream response
The following code sample shows how to generate a stream response by using the /ml/v1/deployments/<deployment ID>/text/generation_stream API endpoint:
curl -X POST "https://<cluster_url>/ml/v1/deployments/<your deployment ID>/text/generation_stream?version=2024-01-29" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
--data '{
"input": "Hello, what is your name",
"parameters": {
"max_new_tokens": 200,
"min_new_tokens": 20
}
}'
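The stream endpoint returns the output incrementally as server-sent events rather than a single JSON document; add the -N flag to curl to disable buffering. An illustrative (not verbatim) event stream might look like this, with each data: line carrying a JSON chunk of partial output:

id: 1
event: message
data: {"results": [{"generated_text": "Hello, ", "stop_reason": "not_finished"}]}

id: 2
event: message
data: {"results": [{"generated_text": "my name is", "stop_reason": "not_finished"}]}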
You can use the chat API to prompt your deployed custom foundation model. The chat API implements methods for interacting with foundation models in a conversational way. To enable chat API support, the model content (tokenizer_config.json file in model content) must include a chat template. Chat API support can only be used to prompt models that use the vLLM runtime for deployment, therefore they must be deployed with the watsonx-cfm-caikit-1.1 software specification.
You can only use the chat API to prompt new deployments. The chat API cannot be used to patch support for existing deployments.
The following code sample shows how to generate a stream response by using the /ml/v1/deployments/<deployment ID>/text/chat_stream API endpoint:
curl --request POST \
--url 'https://{region}.ml.cloud.ibm.com/ml/v1/deployments/<deployment-id>/text/chat_stream?version=2020-10-10' \
--header "Authorization: Bearer $token" \
--header 'Content-Type: application/json' \
--data '{
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What is the capital of USA"
}
]
}
]
}'
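As with the generation stream, the chat stream endpoint returns server-sent events. The following is a sketch of the chunked output, assuming an incremental choices/delta schema in which each data: payload carries the next fragment of the assistant message:

data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "The capital of the USA"}}]}

data: {"choices": [{"index": 0, "delta": {"content": " is Washington, D.C."}}]}

data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}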
Parent topic: Deploying a custom foundation model