Choosing compute resources for running tools in projects

You use compute resources in projects when you run jobs and most tools. Depending on the tool, you might have a choice of compute resources for the runtime for the tool.

Compute resources are known as either environment templates or hardware and software specifications. In general, compute resources with larger hardware configurations incur larger usage costs. Many tools in projects use the Watson Studio service for compute resources, but some tools use other services. Each service tracks and bills compute usage separately.

These tools offer multiple runtime configurations to choose from:

These tools have one runtime configuration that is assigned automatically:

The following tools do not consume compute resources:

  • Metadata import
  • Master Data configuration

Profiling data assets

Profiling a data asset in a project or a catalog consumes 6 CUH per hour from the IBM Knowledge Catalog, with a minimum amount of 0.96 CUH per profiling session. Profiling requires the IBM Knowledge Catalog service.
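The rate and minimum above can be turned into a rough estimate. The following is a minimal sketch, assuming that usage accrues linearly with runtime and that the 0.96 CUH minimum acts as a floor per session. The function name and the constants are hypothetical helpers for illustration and are not part of any IBM API.

```python
# Values taken from the text above; the linear billing model is an assumption.
PROFILING_CUH_PER_HOUR = 6.0
PROFILING_MIN_CUH = 0.96


def estimate_profiling_cuh(runtime_minutes: float) -> float:
    """Estimate the CUH billed for one profiling session (hypothetical helper)."""
    if runtime_minutes < 0:
        raise ValueError("runtime_minutes must be non-negative")
    usage = PROFILING_CUH_PER_HOUR * runtime_minutes / 60.0
    # Apply the per-session minimum as a floor.
    return max(usage, PROFILING_MIN_CUH)


if __name__ == "__main__":
    for minutes in (5, 30, 90):
        print(minutes, "min ->", estimate_profiling_cuh(minutes), "CUH")
```

Under these assumptions, any session shorter than 9.6 minutes (0.96 CUH at 6 CUH per hour) is billed at the minimum.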

The runtime for profiling does not appear on the Resource usage page of the Manage tab of the project. You can't track compute usage for profiling.

Metadata enrichment

Metadata enrichment requires the IBM Knowledge Catalog service. The amount of CUH per hour from IBM Knowledge Catalog that metadata enrichment jobs consume depends on the enrichment objectives that you select.

Table 1. CUH usage in specific metadata enrichment objectives

Metadata enrichment objectives     Capacity units per hour (CUH)
Profile data                       6
Profile data and assign terms      8

When you run metadata enrichment, one or more jobs are started. Each job handles a maximum of 200 tables. When you enrich more than 200 tables at a time, you start multiple jobs. For example, if you run metadata enrichment on 500 tables, you start three jobs. The minimum amount of CUH that is billed for each metadata enrichment job is 0.96 CUH.

Jobs for metadata enrichment with the Expand metadata option or semantic term assignment are limited to 10 tables per job.
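The job count follows from dividing the number of tables by the per-job limit and rounding up. The following is a minimal sketch of that arithmetic. The function and constant names are hypothetical, and the sketch assumes that tables are packed into as few jobs as the limit allows.

```python
import math

# Per-job table limits stated in the text above.
DEFAULT_TABLES_PER_JOB = 200
EXPAND_OR_SEMANTIC_TABLES_PER_JOB = 10


def enrichment_job_count(table_count: int,
                         tables_per_job: int = DEFAULT_TABLES_PER_JOB) -> int:
    """Return how many jobs are needed for table_count tables (hypothetical helper)."""
    if table_count < 0:
        raise ValueError("table_count must be non-negative")
    if tables_per_job <= 0:
        raise ValueError("tables_per_job must be positive")
    # Round up: a partially filled job still counts as one job.
    return math.ceil(table_count / tables_per_job)


if __name__ == "__main__":
    print(enrichment_job_count(500))
    print(enrichment_job_count(500, EXPAND_OR_SEMANTIC_TABLES_PER_JOB))
```

With the default limit, 500 tables yield three jobs, which matches the example in the text. With the 10-table limit, the same 500 tables would yield 50 jobs.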

The amount of CUH consumed by metadata enrichment depends on the number of tables and the number of columns in those tables. Other factors, such as the structure of the data, can also affect CUH consumption. For example:

  • The three jobs for profiling data for 500 tables with 500 columns might consume a total of approximately 24 CUH.
  • The three jobs for profiling data and assigning terms for 500 tables with 500 columns might consume a total of approximately 30 CUH.
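To see how totals like these can arise, the following minimal sketch multiplies each job's runtime by the hourly rate for the selected objective and applies the 0.96 CUH minimum per job. The per-job runtimes are invented for illustration only; actual runtimes depend on the data. The function name, the dictionary keys, and the linear billing model are assumptions, not an IBM API.

```python
# Hourly rates from Table 1; the dictionary keys are hypothetical labels.
CUH_PER_HOUR = {
    "profile": 6.0,
    "profile_and_assign_terms": 8.0,
}
MIN_CUH_PER_JOB = 0.96  # minimum billed per metadata enrichment job


def estimate_enrichment_cuh(objective: str, job_runtimes_hours: list[float]) -> float:
    """Sum the estimated CUH over all jobs of one enrichment run (hypothetical helper)."""
    rate = CUH_PER_HOUR[objective]
    # Each job is billed at least the minimum, regardless of how short it is.
    return sum(max(rate * hours, MIN_CUH_PER_JOB) for hours in job_runtimes_hours)


if __name__ == "__main__":
    # Invented runtimes for three jobs; they are not measurements.
    print(estimate_enrichment_cuh("profile", [1.5, 1.5, 1.0]))
    print(estimate_enrichment_cuh("profile_and_assign_terms", [1.25, 1.25, 1.25]))
```

With these invented runtimes, the first call totals 24 CUH and the second totals 30 CUH. In other words, the approximate figures above correspond to about 4 hours of combined job runtime at 6 CUH per hour, or about 3.75 hours at 8 CUH per hour.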

The runtimes for metadata enrichment do not appear on the Resource usage page of the Manage tab of the project. You can't track compute usage for metadata enrichment.

Data quality rules

A data quality rule job runs as a DataStage flow in the Default DataStage PX S environment, which consumes 1 CUH per hour and is billed for a minimum of 1 minute. Data quality rules require the IBM Knowledge Catalog and DataStage services.
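As a rough illustration, the following minimal sketch estimates the CUH for one data quality rule job. It reads the one-minute minimum as a floor on the billed runtime and assumes that usage accrues linearly; both are assumptions about the billing model. The function and constant names are hypothetical.

```python
# Rate and minimum from the text above; the billing model is an assumption.
DQ_RULE_CUH_PER_HOUR = 1.0
DQ_RULE_MIN_MINUTES = 1.0


def estimate_dq_rule_cuh(runtime_minutes: float) -> float:
    """Estimate the CUH billed for one data quality rule job (hypothetical helper)."""
    if runtime_minutes < 0:
        raise ValueError("runtime_minutes must be non-negative")
    # Runs shorter than the minimum are billed as if they ran for the minimum.
    billed_minutes = max(runtime_minutes, DQ_RULE_MIN_MINUTES)
    return DQ_RULE_CUH_PER_HOUR * billed_minutes / 60.0


if __name__ == "__main__":
    for minutes in (0.5, 30, 60):
        print(minutes, "min ->", estimate_dq_rule_cuh(minutes), "CUH")
```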

The runtime for data quality rules appears as a DataStage flow on the Resource usage page of the Manage tab of the project.
