Customize foundation models to return better results for tasks that meet the unique needs of your business.
Foundation models are AI models that are pretrained on terabytes of data from across the internet and other public resources. They excel at predicting the next word and generating language. While language generation can be useful for brainstorming and spurring creativity, foundation models typically need guidance to accomplish concrete tasks. Model tuning and other techniques, such as retrieval-augmented generation, help you use foundation models in meaningful ways for your business.
With the Tuning Studio, you can tune a smaller foundation model to improve its performance on natural language processing tasks such as classification, summarization, and generation. Tuning can help a smaller foundation model achieve results comparable to larger models in the same model family. By tuning and deploying the smaller model, you can reduce long-term inference costs.
Much like prompt engineering, tuning a foundation model helps you to influence the content and format of the foundation model output. Knowing what to expect from a foundation model is essential if you want to add a foundation model inference step to a business workflow.
The following diagram illustrates how tuning a foundation model can help you guide the model to generate useful output. You provide labeled data that illustrates the format and type of output that you want the model to return. The data can include proprietary information that is unknown to the foundation model. Your examples give the foundation model a pattern to follow and apply to future output.
An untuned foundation model generalizes across traditional tasks such as text translation and text or answer generation. After you tune a foundation model, its output is more tailored to your needs:
- Generated text or answers can follow a specific style
- The tuned model can summarize or extract information in the way you want
- With much smaller prompts, the tuned model can classify text effectively
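As a minimal sketch of the labeled data described above, the following Python snippet serializes a few classification examples as JSON Lines, one input and output pair per line. The `input` and `output` field names and the example records are assumptions for illustration; check the training data requirements for the tuning method that you use.

```python
import json

# Hypothetical labeled examples: each pairs an input text with the
# exact output that the tuned model should learn to return.
EXAMPLES = [
    {"input": "The package arrived two weeks late.", "output": "complaint"},
    {"input": "How do I reset my password?", "output": "question"},
    {"input": "Thanks, the new dashboard is great!", "output": "praise"},
]


def to_jsonl(examples):
    """Return the examples as JSON Lines text, one object per line."""
    lines = []
    for i, example in enumerate(examples):
        # Every record needs a non-empty input string and output string.
        for key in ("input", "output"):
            value = example.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"example {i} has a missing or empty '{key}'")
        record = {"input": example["input"], "output": example["output"]}
        lines.append(json.dumps(record))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_jsonl(EXAMPLES), end="")
```

Keeping the output values short and consistent (here, one of three labels) is what gives the model a clear pattern to follow.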
To learn more about when tuning a model is the right approach, see When to tune a foundation model.
Ways to work
Watsonx.ai offers various ways for you to customize foundation models, including:
- Tuning Studio: A graphical user interface tool for tuning a foundation model
- Model tuning with code: Programmatic methods for tuning a foundation model
Workflow
Whichever way you choose to work, the workflow for tuning a foundation model remains the same. Tuning a foundation model involves the following tasks:
- Engineer prompts that work well with the model you want to use.
  Tuning does not mean you can skip prompt engineering altogether. Experimentation is necessary to find the right foundation model for your use case. Experiment until you understand which prompt formats show the most potential for getting good results from the model. You can use the Prompt Lab to submit test prompts. For help, see Prompt Lab.
  Find the largest foundation model that works best for the task.
- Create training data to use for model tuning.
- Create a tuning experiment to tune the model.
- Evaluate the tuned model.
  If necessary, change the training data or the experiment parameters and run more experiments until you're satisfied with the results.
- Deploy the tuned model.
- Submit inference requests to the tuned model.
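The evaluation step can be as simple as comparing the tuned model's output with the expected output on examples that were held out from training. The sketch below computes exact-match accuracy. The `fake_predict` function is a hypothetical stand-in for whatever call sends a prompt to your deployed model, and the held-out records are invented for illustration.

```python
from typing import Callable, Iterable


def exact_match_accuracy(
    examples: Iterable[dict], predict: Callable[[str], str]
) -> float:
    """Return the fraction of examples whose prediction equals the expected output."""
    total = 0
    correct = 0
    for example in examples:
        total += 1
        # Normalize whitespace and case so trivial differences are not errors.
        predicted = predict(example["input"]).strip().lower()
        expected = example["output"].strip().lower()
        if predicted == expected:
            correct += 1
    if total == 0:
        raise ValueError("no examples to evaluate")
    return correct / total


def fake_predict(text: str) -> str:
    """Hypothetical stand-in for an inference request to the tuned model."""
    return "question" if text.rstrip().endswith("?") else "complaint"


# Invented held-out examples that were not used for training.
HELD_OUT = [
    {"input": "Where is my invoice?", "output": "question"},
    {"input": "The app crashes on login.", "output": "complaint"},
    {"input": "Great support team!", "output": "praise"},
    {"input": "Can I change my plan?", "output": "question"},
]

if __name__ == "__main__":
    print(f"accuracy: {exact_match_accuracy(HELD_OUT, fake_predict):.2f}")
```

If the score is too low, that is the signal to revisit the training data or the experiment parameters and run another experiment.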
Foundation model tuning costs
The cost of tuning a foundation model is measured in capacity unit hours, which measure the compute resource consumption of the tuning experiment. For more information, see Capacity Unit Hours metering.
The cost of inferencing a tuned model is measured in resource units. The rate depends on the model's billing class. A prompt-tuned foundation model has the same billing class as the foundation model that it tunes. For more information, see Resource unit metering.
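For rough planning, the two cost measures above can be estimated with simple arithmetic. The sketch below assumes that capacity unit hours equal a per-hour rate multiplied by the experiment run time, and that one resource unit corresponds to 1,000 tokens. The rate, run time, and token volume are hypothetical placeholders, so take real values from the metering documentation for your plan.

```python
def tuning_capacity_unit_hours(cuh_rate_per_hour: float, run_hours: float) -> float:
    """Estimate capacity unit hours consumed by one tuning experiment.

    Assumption: consumption is a flat hourly rate times the run time.
    """
    if cuh_rate_per_hour < 0 or run_hours < 0:
        raise ValueError("rate and run time must be non-negative")
    return cuh_rate_per_hour * run_hours


def inference_resource_units(total_tokens: int, tokens_per_unit: int = 1000) -> float:
    """Estimate resource units for inference from a token count.

    Assumption: one resource unit corresponds to 1,000 tokens.
    """
    if total_tokens < 0 or tokens_per_unit <= 0:
        raise ValueError("token counts must be non-negative and unit size positive")
    return total_tokens / tokens_per_unit


if __name__ == "__main__":
    # Hypothetical numbers: a 1.5 hour experiment at 40 capacity units per hour,
    # followed by 250,000 tokens of inference traffic.
    print(tuning_capacity_unit_hours(40, 1.5))
    print(inference_resource_units(250_000))
```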
Learn more
Get started
- Quick start: Tune a foundation model
- Sample notebook: Tune a model to classify CFPB documents in watsonx
- Sample notebook: Prompt tuning for multi-class classification with watsonx