Troubleshoot Watson Machine Learning
The following are the answers to common troubleshooting questions about using IBM Watson Machine Learning.
Getting help and support for Watson Machine Learning
If you have problems or questions when you use Watson Machine Learning, you can get help by searching for information or by asking questions through a forum. You can also open a support ticket.
When you use the forums to ask a question, tag your question so that it is seen by the Watson Machine Learning development teams.
If you have technical questions about Watson Machine Learning, post your question on Stack Overflow and tag your question with ibm-bluemix and machine-learning.
For questions about the service and getting started instructions, use the IBM developerWorks dW Answers forum. You must include the machine-learning and bluemix tags.
Contents
- Training an AutoAI experiment fails with service ID credentials
- Creating a job for an SPSS Modeler flow in a deployment space fails
- Inactive Watson Machine Learning instance
- The authorization token is not provided
- Invalid authorization token
- The authorization token and instance_id that was used in the request are not the same
- Authorization token is expired
- The public key that is needed for authentication is not available
- Operation that is timed out after {{timeout}}
- Unhandled exception of type {{type}} with {{status}}
- Unhandled exception of type {{type}} with {{response}}
- Unhandled exception of type {{type}} with {{json}}
- Unhandled exception of type {{type}} with {{message}}
- The requested object is not found
- The underlying database reported too many requests
- The definition of the evaluation is not defined either in the artifactModelVersion or in the deployment. It needs to be specified at least in one of the places
- Data module not found in IBM Federated Learning
- Evaluation requires a learning configuration that is specified for the model
- Evaluation requires spark instance to be provided in X-Spark-Service-Instance header
- Model does not contain any version
- Patch operation can modify existing learning configuration only
- Patch operation expects exactly one replace operation
- The payload is missing the required fields: FIELD or the values of the fields are corrupted
- Provided evaluation method: METHOD is not supported. Supported values: VALUE
- You can have only one active evaluation per model. The request cannot be completed because of existing active evaluation: {{url}}
- The deployment type {{type}} is not supported
- Incorrect input: ({{message}})
- Insufficient data - metric {{name}} cannot be calculated
- For type {{type}} spark instance must be provided in X-Spark-Service-Instance header
- The action {{action}} failed with message {{message}}
- Path {{path}} is not allowed. The only allowed path for Patch stream is /status
- Patch operation is not allowed, for instance, of type {{$type}}
- Data connection {{data}} is invalid for feedback_data_ref
- Path {{path}} is not allowed. The only allowed path for Patch model is /deployed_version/url or /deployed_version/href for V2
- Parsing failure: {{msg}}
- Runtime environment for selected model: {{env}} is not supported for learning configuration. Supported environments: [{{supported_envs}}]
- Current plan '{{plan}}' allows {{limit}} deployments only
- Database connection definition is not valid ({{code}})
- Problems connecting underlying {{system}}
- Error extracting X-Spark-Service-Instance header: ({{message}})
- This function is forbidden for nonbeta users
- {{code}} {{message}}
- Rate limit exceeded
- Invalid query parameter {{paramName}} value: {{value}}
- Invalid token type: {{type}}
- Invalid token format. You must use bearer token format.
- Input JSON file is missing or invalid: 400
- The authorization token expired: 401
- Unknown deployment identification: 404
- Internal server error: 500
- Invalid type for ml_artifact: Pipeline
- ValueError: Training_data_ref name and connection cannot be None, if Pipeline Artifact is not given.
Follow these tips to resolve common problems you might encounter when you work with Watson Machine Learning.
Training an AutoAI experiment fails with service ID credentials
If you train an AutoAI experiment by using the API key for a service ID, training might fail with this error:
User specified in query parameters does not match user from token.
One way to resolve this issue is to run the experiment with your user credentials. If you want to run the experiment with credentials for the service, follow these steps to update the roles and policies for the service ID.
- Open your service ID on IBM Cloud.
- Create a new service ID or update the existing ID with the following access policy:
  - All IAM Account Management services with the roles API key reviewer, User API key creator, Viewer, Operator, and Editor. Ideally, create a new API key for this service ID.
- Run the training again with the credentials for the updated service ID.
Creating a job for an SPSS Modeler flow in a deployment space fails
During the process of configuring a batch job for your SPSS Modeler flow in a deployment space, the automatic mapping of data assets with their respective connection might fail.
To fix the error with the automatic mapping of data assets and connections, follow these steps:
- Click Create and save to save your progress and exit from the New job configuration dialog box.
- In your deployment space, click the Jobs tab and select your SPSS Modeler flow job to review the details of your job.
- In the job details page, click the Edit icon to manually update the mapping of your data assets and connections.
- After you update the mapping of data assets and connections, you can resume configuring the settings for your job in the New job dialog box. For more information, see Creating deployment jobs for SPSS Modeler flows.
Inactive Watson Machine Learning instance
Symptoms
After you try to submit an inference request to a foundation model by clicking the Generate button in the Prompt Lab, the following error message is displayed:
'code': 'no_associated_service_instance_error',
'message': 'WML instance {instance_id} status is not active, current status: Inactive'
Possible causes
The association between your watsonx.ai project and the related Watson Machine Learning service instance was lost.
Possible solutions
Recreate or refresh the association between your watsonx.ai project and the related Watson Machine Learning service instance. To do so, complete the following steps:
- From the main menu, expand Projects, and then click View all projects.
- Click your watsonx.ai project.
- From the Manage tab, click Services & integrations.
- If the appropriate Watson Machine Learning service instance is listed, disassociate it temporarily by selecting the instance, and then clicking Remove. Confirm the removal.
- Click Associate service.
- Choose the appropriate Watson Machine Learning service instance from the list, and then click Associate.
The public key that is needed for authentication is not available.
What's happening
The REST API cannot be invoked successfully.
Why it's happening
This problem can happen due to internal service issues.
How to fix it
Contact the support team.
Operation that is timed out after {{timeout}}
What's happening
The REST API cannot be invoked successfully.
Why it's happening
The timeout occurred while performing the requested operation.
How to fix it
Try to invoke the operation again.
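One way to retry a timed-out call is to wait a little longer before each new attempt. The following is a minimal sketch and is not part of any Watson Machine Learning client: call_with_retry and its parameters are hypothetical names, and operation stands for whatever function issues your REST request.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(); after a failure, wait base_delay * 2**n and retry.

    Hypothetical helper: `operation` is any zero-argument callable that
    raises an exception when the request fails or times out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as error:  # narrow this to your client's timeout error
            last_error = error
            if attempt < attempts - 1:
                # Exponential backoff: 1x, 2x, 4x ... the base delay.
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

If every attempt fails, the helper re-raises the last error so that you can include it when you contact the support team.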
Unhandled exception of type {{type}} with {{status}}
What's happening
The REST API cannot be invoked successfully.
Why it's happening
This problem can happen due to internal service issues.
How to fix it
Try to invoke the operation again. If it happens again, contact the support team.
Unhandled exception of type {{type}} with {{response}}
What's happening
The REST API cannot be invoked successfully.
Why it's happening
This problem can happen due to internal service issues.
How to fix it
Try to invoke the operation again. If it happens again, contact the support team.
Unhandled exception of type {{type}} with {{json}}
What's happening
The REST API cannot be invoked successfully.
Why it's happening
This problem can happen due to internal service issues.
How to fix it
Try to invoke the operation again. If it happens again, contact the support team.
Unhandled exception of type {{type}} with {{message}}
What's happening
The REST API cannot be invoked successfully.
Why it's happening
This problem can happen due to internal service issues.
How to fix it
Try to invoke the operation again. If it happens again, contact the support team.
The requested object is not found.
What's happening
The REST API cannot be invoked successfully.
Why it's happening
The requested resource is not found.
How to fix it
Make sure that you are referring to the existing resource.
The underlying database reported too many requests.
What's happening
The REST API cannot be invoked successfully.
Why it's happening
The user sent too many requests in a specific time.
How to fix it
Try to invoke the operation again.
The definition of the evaluation is not defined in the artifactModelVersion or deployment. It must be specified at least in one of the places.
What's happening
The REST API cannot be invoked successfully.
Why it's happening
The learning configuration does not contain all the required information.
How to fix it
Provide definition in the learning configuration.
Evaluation requires a learning configuration that is specified for the model.
What's happening
It is not possible to create learning iteration.
Why it's happening
learning configuration isn't defined for the model.
How to fix it
Create learning configuration and try to create learning iteration again.
Evaluation requires spark instance to be provided in X-Spark-Service-Instance header
What's happening
The REST API cannot be invoked successfully.
Why it's happening
learning configuration does not have the required information.
How to fix it
Provide spark_service in Learning Configuration or in X-Spark-Service-Instance header.
Model does not contain any version.
What's happening
It is not possible to create deployment or set the learning configuration.
Why it's happening
This problem can happen due to inconsistency that is related to the persistence of the model.
How to fix it
Try to persist the model again and try to perform the action again.
Data module not found in IBM Federated Learning.
What's happening
The data handler for IBM Federated Learning is trying to extract a data module from the FL library but is unable to find it. You might see the following error message:
ModuleNotFoundError: No module named 'ibmfl.util.datasets'
Why it's happening
Possibly an outdated DataHandler.
How to fix it
Review and update your DataHandler to conform to the most recent MNIST data handler or make sure that your sample versions are up to date.
Patch operation can modify existing learning configuration only.
What's happening
It is not possible to invoke the patch REST API method to patch the learning configuration.
Why it's happening
learning configuration isn't set for this model or the model does not exist.
How to fix it
Ensure that the model exists and already has a learning configuration set.
Patch operation expects exactly one replace operation.
What's happening
The deployment cannot be patched.
Why it's happening
The patch payload contains more than one operation or the patch operation is different than replace.
How to fix it
Use only one operation in the patch payload, which is the replace operation.
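As an illustration, a patch payload with a single replace operation might look like the following sketch. It assumes the JSON Patch style of an array of operation objects with op, path, and value keys. The helper name check_patch_payload and the example value are hypothetical, so check the API reference for the exact fields that your deployment accepts.

```python
import json


def check_patch_payload(payload):
    """Return True when payload holds exactly one 'replace' operation."""
    return (
        isinstance(payload, list)
        and len(payload) == 1
        and isinstance(payload[0], dict)
        and payload[0].get("op") == "replace"
    )


# Assumed example: the value is a placeholder, not a documented setting.
patch = [{"op": "replace", "path": "/status", "value": "stopped"}]

if check_patch_payload(patch):
    body = json.dumps(patch)  # request body to send with the PATCH call
```

Running the check before you send the request catches both causes of this error: more than one operation, or an operation other than replace.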
The payload is missing the required fields: FIELD or the values of the fields are corrupted.
What's happening
It is not possible to process action that is related to access to the underlying data set.
Why it's happening
The access to the data set is not properly defined.
How to fix it
Correct the access definition for the data set.
Provided evaluation method: METHOD is not supported. Supported values: VALUE.
What's happening
It is not possible to create learning configuration.
Why it's happening
The wrong evaluation method was used to create learning configuration.
How to fix it
Use a supported evaluation method, which is one of: regression, binary, multiclass.
You can have only one active evaluation per model. The request cannot be completed because of existing active evaluation: {{url}}
What's happening
It is not possible to create another learning iteration.
Why it's happening
You can have only one running evaluation for the model.
How to fix it
See the already running evaluation or wait for the evaluation to end and start the new one.
The deployment type {{type}} is not supported.
What's happening
It is not possible to create the deployment.
Why it's happening
An unsupported deployment type was used.
How to fix it
A supported deployment type must be used.
Incorrect input: ({{message}})
What's happening
The REST API cannot be invoked successfully.
Why it's happening
This problem happens due to an issue with parsing JSON.
How to fix it
Make sure that the correct JSON is passed in the request.
Insufficient data - metric {{name}} cannot be calculated
What's happening
Learning iteration failed.
Why it's happening
Value for metric with defined threshold cannot be calculated because of insufficient feedback data.
How to fix it
Review and improve data in data source feedback_data_ref in learning configuration.
For type {{type}} spark instance must be provided in X-Spark-Service-Instance header
What's happening
Deployment cannot be created.
Why it's happening
batch and streaming deployments require spark instance to be provided.
How to fix it
Provide spark instance in X-Spark-Service-Instance header.
Action {{action}} failed with message {{message}}
What's happening
The REST API cannot be invoked successfully.
Why it's happening
This problem happens due to an issue with invoking underlying service.
How to fix it
If the message provides a suggestion to fix the issue, follow the suggestion. Otherwise, contact the support team.
Path {{path}} is not allowed. The only allowed path for patch stream is /status
What's happening
Stream deployment cannot be patched.
Why it's happening
The wrong path was used to patch the stream deployment.
How to fix it
Patch the stream deployment with the supported path option, which is /status. This path allows you to start or stop stream processing.
Patch operation is not allowed, for instance, of type {{$type}}
What's happening
Deployment cannot be patched.
Why it's happening
The wrong deployment type is being patched.
How to fix it
Patch the stream deployment type.
Data connection {{data}} is invalid for feedback_data_ref
What's happening
learning configuration cannot be created for the model.
Why it's happening
Supported data source was not used when feedback_data_ref was defined.
How to fix it
Use only the supported data source type dashdb.
Path {{path}} is not allowed. The only allowed path for patch model is /deployed_version/url or /deployed_version/href for V2
What's happening
It is not possible to patch the model.
Why it's happening
The wrong path was used during patching of the model.
How to fix it
Patch the model with a supported path, which you can use to update the version of the deployed model.
Parsing failure: {{msg}}
What's happening
The REST API cannot be invoked successfully.
Why it's happening
The requested payload cannot be parsed successfully.
How to fix it
Make sure that your request payload is correct and can be parsed correctly.
Runtime environment for selected model: {{env}} is not supported for learning configuration. Supported environments: [{{supported_envs}}].
What's happening
It is not possible to create learning configuration.
Why it's happening
The model for which you tried to create the learning_configuration is not supported.
How to fix it
Create learning configuration for a model that has a supported runtime.
Current plan '{{plan}}' allows {{limit}} deployments only
What's happening
It is not possible to create the deployment.
Why it's happening
The limit for number of deployments was reached for the current plan.
How to fix it
Upgrade to a plan that does not have this limitation.
Database connection definition is not valid ({{code}})
What's happening
It is not possible to use the learning configuration function.
Why it's happening
The database connection definition is invalid.
How to fix it
Try to fix the issue that is described by the code returned by the underlying database.
Problems connecting underlying {{system}}
What's happening
The REST API cannot be invoked successfully.
Why it's happening
This problem might happen due to an issue during connection to the underlying system. It might be a temporary network issue.
How to fix it
Try to invoke the operation again. If you get an error again, contact the support team.
Error extracting X-Spark-Service-Instance header: ({{message}})
What's happening
The REST API that requires Spark credentials cannot be invoked.
Why it's happening
This problem might happen due to an issue with base-64 decoding or parsing Spark credentials.
How to fix it
Make sure that the Spark credentials are correct and were correctly base-64 encoded. For more information, see the documentation.
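The encoding step can be sketched as follows. This sketch assumes that the header value is the base-64 encoding of a JSON document that holds the Spark credentials. The keys in spark_credentials are placeholders and the helper name encode_spark_header is hypothetical, so use the fields from your own Spark service credentials.

```python
import base64
import json


def encode_spark_header(credentials):
    """Serialize credentials to JSON and return the base-64 text."""
    raw = json.dumps(credentials).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


# Placeholder values: replace them with your own Spark service credentials.
spark_credentials = {
    "tenant_id": "<your-tenant-id>",
    "instance_id": "<your-instance-id>",
}
headers = {"X-Spark-Service-Instance": encode_spark_header(spark_credentials)}

# Decoding the header again is a quick way to confirm that it parses.
decoded = json.loads(base64.b64decode(headers["X-Spark-Service-Instance"]))
```

If the final decoding step raises an error on your own header value, the service is likely to fail on it in the same way.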
This function is forbidden for non-beta users.
What's happening
The REST API cannot be invoked successfully.
Why it's happening
The REST API that was invoked is in beta.
How to fix it
If you are interested in participating, add yourself to the wait list. The details can be found in the documentation.
{{code}} {{message}}
What's happening
The REST API cannot be invoked successfully.
Why it's happening
This problem might happen due to an issue with invoking underlying service.
How to fix it
If the message provides a suggestion to fix the issue, follow the suggestion. Otherwise, contact the support team.
Rate limit exceeded.
What's happening
Rate limit exceeded.
Why it's happening
The rate limit for current plan is exceeded.
How to fix it
To solve this problem, acquire another plan with a greater rate limit.
Invalid query parameter {{paramName}} value: {{value}}
What's happening
A validation error occurs because an incorrect value was passed for a query parameter.
Why it's happening
Error in getting result for query.
How to fix it
Correct the query parameter value. The details can be found in the documentation.
Invalid token type: {{type}}
What's happening
Error regarding token type.
Why it's happening
Error in authorization.
How to fix it
The token must start with the Bearer prefix.
Invalid token format. You must use bearer token format.
What's happening
Error regarding token format.
Why it's happening
Error in authorization.
How to fix it
The token must be a bearer token and must start with the Bearer prefix.
Input JSON file is missing or invalid: 400
What's happening
The following message displays when you try to score online: Input JSON file is missing or invalid.
Why it's happening
This message displays when the scoring input payload doesn't match the expected input type that is required for scoring the model. Specifically, the following reasons might apply:
- The input payload is empty.
- The input payload schema is not valid.
- The input data types do not match the expected data types.
How to fix it
Correct the input payload. Make sure that the payload has correct syntax, a valid schema, and proper data types. After you make corrections, try to score online again. For syntax issues, verify the JSON file by using the jsonlint command.
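If the jsonlint command is not available, the syntax check and the empty-payload check can also be done with the Python standard library. The sketch below makes assumptions: the function name is hypothetical, and it verifies only that the text parses as a non-empty JSON object. It does not check the schema or the data types that your model expects.

```python
import json


def check_scoring_payload(text):
    """Return (ok, reason) for a scoring payload given as JSON text."""
    if not text or not text.strip():
        return False, "payload is empty"
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as error:
        return False, f"invalid JSON: {error.msg} at line {error.lineno}"
    # Assumption: the scoring payload is a JSON object with at least one key.
    if not isinstance(payload, dict) or not payload:
        return False, "payload must be a non-empty JSON object"
    return True, "ok"
```

For example, check_scoring_payload(open("payload.json").read()) reports the line of the first syntax error, where payload.json is a placeholder file name.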
Unknown deployment identification: 404
What's happening
The following message displays when you try to score online: Unknown deployment identification.
Why it's happening
This message displays when the deployment ID that is used for scoring does not exist.
How to fix it
Make sure you are providing the correct deployment ID. If not, deploy the model with the deployment ID and then try scoring it again.
Internal server error: 500
What's happening
The following message displays when you try to score online: Internal server error
Why it's happening
This message displays if the downstream data flow on which the online scoring depends fails.
How to fix it
Wait for some time and try to score online again. If it fails again, contact IBM Support.
Invalid type for ml_artifact: Pipeline
What's happening
The following message displays when you try to publish a Spark model by using Common API client library on your workstation.
Why it's happening
This message displays if you have an invalid pyspark setup in the operating system.
How to fix it
Set up the system environment paths as follows:
SPARK_HOME={installed_spark_path}
JAVA_HOME={installed_java_path}
PYTHONPATH=$SPARK_HOME/python/
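To confirm that the variables are visible to the Python process that runs the client, a quick check like the following sketch might help. The function name is hypothetical, and the check verifies only that each variable is set. It does not verify that the paths point to working Spark and Java installations.

```python
import os


def missing_spark_settings(env=os.environ):
    """Return the names of the required variables that are unset or empty."""
    required = ("SPARK_HOME", "JAVA_HOME", "PYTHONPATH")
    return [name for name in required if not env.get(name)]


missing = missing_spark_settings()
if missing:
    print("Set these environment variables first:", ", ".join(missing))
```

An empty result means that all three variables are set in the current environment.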
ValueError: Training_data_ref name and connection cannot be None, if Pipeline Artifact is not given.
What's happening
The training data set is missing or is not referenced properly.
Why it's happening
The Pipeline Artifact is a training data set in this instance.
How to fix it
You must supply a training data set when you persist a Spark PipelineModel. If you don't, the client says it doesn't support PipelineModels, rather than saying a PipelineModel must be accompanied by the training set.