Troubleshoot IBM DataStage
Use these solutions to help resolve problems that you might encounter with IBM® DataStage®.
Getting help and support for DataStage
If you have problems or questions when you use DataStage, you can get help by searching for information or by asking questions through a forum. You can also open a support ticket.
When you ask a question on the forums, tag your question so that it is seen by the DataStage development teams.
For questions about the service and getting started instructions, use the forum at https://stackoverflow.com/questions/tagged/datastage.
If you still can't resolve the problem, you can open an IBM support ticket. For information about opening an IBM support ticket or about support levels, see "Contact support" at https://cloud.ibm.com/unifiedsupport/supportcenter.
- General
-
- Jobs fail because SQL and Before SQL statements run in incorrect order
- The mailx command fails to run in before-job and after-job subroutines without SMTP server info
- Properties selections not preserved if you deselect "Use DataStage properties"
- Routine fails when CEL function ds.getUserStatus is run on an external flow
- Job fails when loading a large Excel file
- Exported flows generate JSON connection files that contain plaintext passwords
- Issues browsing database tables with columns that contain special characters
- Incorrect inferences assigned to a schema read by the Asset Browser
- Using sequential files as a source
- Error running jobs with a parquet file format
- Migration pod getting evicted for exceeding its ephemeral storage limits
- Error occurs during the upgrade of the Cloud Pak for Data from 5.0.0 to 5.0.1 version
- Compilation of a flow that contains the transformer stage is timed out
- Connectors
-
- Netezza® connector: Duplicate records occur when partitioned reads are enabled
- MySQL connector: Jobs might fail if you use Write mode "Update" for the target without a primary key
- FTP connector: The home directory path is prepended to the path
- Jobs fail with error "The connector could not establish a connection to Db2® database"
- Job with source data from a SAP OData connector fails
- Cannot run transactional SQL on data from Apache Hive version 2.0 or earlier
- IBM Db2 for DataStage connection with SSL certificate fails with "Protocol specific error code 414" error
- Error parameterizing the credential field for a flow connection in IBM Cloud® Object Storage
- PostgreSQL connector times out on large tables
- Schema changes that originate in data from the HTTP connector can cause the job to fail
General
- Jobs fail because SQL and Before SQL statements run in incorrect order
-
On the Teradata connector set to ANSI transaction mode, the Before SQL statement may run after the SQL statement instead of before, causing the job to fail.
Workaround: Add a commit statement after each Before SQL statement.
- The mailx command fails to run in before-job and after-job subroutines without SMTP server info
-
If the
mailx
command is used in a before-job or after-job subroutine, you must provide the SMTP server info or it will forward execution tosendmail
and fail.
- Properties selections not preserved if you deselect "Use DataStage properties"
-
If you enter other properties (for example for tables or schemas) with the default Use DataStage properties option selected, and then deselect Use DataStage properties, the properties are not preserved.
Workaround: Deselect the default Use DataStage properties if you intend not to use them before you enter other properties. Otherwise, reselect the properties.
- Routine fails when CEL function ds.getUserStatus is run on an external flow
- When built-in CEL function
ds.getUserStatus
is run on a target that is not within the same pipeline, it fails and cannot retrieve the user status. Use thedsjob
CLI in your Run Bash script node instead. For an example of how to rewrite this, see thedsjob
command used byDSGetUserStatus()
in Routine replacement examples in DataStage.
- Job fails when loading a large Excel file
-
A job with a connector that is processing a large Excel file might fail with this error:
"CDICO9999E: Internal error occurred: IO error: The Excel file is too large. (error code: DATA_IO_ERROR)"
Try increasing the heap size. The Heap size properties option is in the Other properties section of the connector's Stage tab.
- Exported flows generate JSON connection files that contain plaintext passwords
-
Downloaded flows might include connection assets that have credentials or other sensitive information. You can run the following command to change the export behavior so that all future exports remove credentials by default.
oc -n ${PROJECT_CPD_INST_OPERANDS} patch datastage datastage --patch '{"spec":{"migration_export_remove_secrets":true}}' --type=merge
- Issues browsing database tables with columns that contain special characters
-
You might have issues when you use the Asset Browser to browse database tables if the selected table contains a column with special characters such as ., $, or #, and you add that table into a DataStage flow. DataStage does not support column names that contain special characters. DataStage flows that reference columns with names that include these special characters will not work.
To work around this problem, create a view over the database table and redefine the column name in the view. For example:
create view view1 as select column1$ as column1, column2# as column2 ... from table
Then, when you use the Asset Browser, find the view and add it to the DataStage flow.
- Incorrect inferences assigned to a schema read by the Asset Browser
-
The Asset Browser will read the first 1000 records and infer the schema, such as column name, length, data type, and nullable, based on these first 1000 records in the files in IBM Cloud Object Storage, Amazon S3, Google Cloud Storage, Azure File Storage, Azure Blob Storage, or the Azure Data Lake service. For instance, the Asset Browser might identify a column as an integer based on what is detected in the first 1000 records, however, later records in the file might show that this column ought to be treated as varchar data type. Similarly, the Asset Browser might infer a column as varchar(20) even though later records show that the column ought to be varchar(100).
Workaround:- Profile the source data to generate better metadata.
- Change all columns to be varchar(1024) and gradually narrow down the data type.
- Using sequential files as a source
-
To use sequential files as a source, you must load files into a project bucket in a specific location. To determine the project bucket location:
- Find the project Cloud Object Storage instance.
- In the project instance, find the bucket corresponding to the current project. The location is
usually:
<lowercase-project-name>-donotdelete-<random-string>
For example:
project2021mar01-donotdelete-pr-ifpkjcbk71s36j
Then, upload the files to by specifying
DataStage/files/
in the Prefix for object field.
- Error running jobs with a parquet file format
- You might receive the following error when you try to run a job with a parquet file format:
The unsigned 32-bit integer(uint32) and unsigned 64-bit integer(uint64) data types are not supported in the Parquet format that DataStage is using for all the file connectors.Error: CDICO9999E: Internal error occurred: Illegal state error: INTEGER(32,false) can only annotate INT32.
Workaround: You must use supported data types.
- Migration pod getting evicted for exceeding its ephemeral storage limits
-
During import, pod usage of ephemeral local storage can exceed the total limit of containers. You might receive the following message:
Workaround: To avoid this problem, you need to increase the ephemeral storage limit to 4Gi from default of 900Mi by running the following command:Status: Failed Reason: Evicted Message: Pod ephemeral local storage usage exceeds the total limit of containers 900Mi.
oc -n ${PROJECT_CPD_INST_OPERANDS} patch datastage datastage --type merge -p '{"spec": {"custom": {"resources":{"components":{"migration":{"limits":{"ephemeral":"4Gi"}}}}}}}'
- Error occurs during the upgrade of the Cloud Pak for Data from 5.0.0 to 5.0.1 version
-
You may face this error while performing the upgrade of Cloud Pak for Data from 5.0.0 to 5.0.1 The upgrade fails in new upgrade tasks for the remote instances.
Workaround: When the DataStage CR is alternating between Failed and InProgress during the 5.0.1 upgrade, go through the following steps:- Log in to the Red Hat® OpenShift Container
Platform cluster by
oc
and set the default project path to where the Cloud Pak for Data is installed.oc project $PROJECT_CPD_INST_OPERANDS
- Verify if the PXRruntime instances have been successfully upgraded to version
5.0.1.
oc get pxruntime
- If the PXRruntime CR is not successfully upgraded to version 5.0.1, then run the following
commands:
echo "Adding installedVersion to DataStage CR" oc patch datastage datastage --type='json' -p='[{"op": "add", "path": "/spec/installedVersion", "value": "5.0.1" }]' while true; do echo "Waiting for DataStage CR to be in Completed state"; sleep 30; if [ $(oc get datastage datastage -o=jsonpath="{.status.dsStatus}") = "Completed" ]; then break; fi; done echo "Removing installedVersion from DataStage CR" oc patch datastage datastage --type='json' -p='[{"op": "remove", "path": "/spec/installedVersion"}]' while true; do echo "Waiting for DataStage CR to be in Completed state"; sleep 30; if [ $(oc get datastage datastage -o=jsonpath="{.status.dsStatus}") = "Completed" ]; then break; fi; done
- Log in to the Red Hat® OpenShift Container
Platform cluster by
- Flows that contain the Transformer stage time out during compilation
-
A timeout may occur while compiling flows that contain the transformer stage.
Workaround: Disable compile time optimization by adding
-O0
to compile options in theAPT_COMPILEOPT
environment variable.
Connectors
- Netezza connector: Duplicate records occur when partitioned reads are enabled
-
When partitioned reads are enabled on the Netezza connector in parallel execution mode, duplicate records may occur. To avoid duplicate records, add partition placeholders into the SQL or set the execution mode to sequential. To add partition placeholders, add the string
mod(datasliceid,[[node-count]])=[[node-number]]
, as in the following example.SELECT * FROM table WHERE mod(datasliceid,[[node-count]])=[[node-number]]
- MySQL connector: Jobs might fail if you use Write mode "Update" for the target without a primary key
-
If you create a table in a MySQL database without specifying a primary key in the WHERE column, and then you try to run a job that uses that table with the Write mode Update for the target, the job might fail.
Solution: Specify a primary key name in the Key column names field. If the table is big and does not have a primary column, you can create a separate column with auto-increment values to use as the primary key.
- FTP connector: The home directory path is prepended to the path
-
When you run a job that uses data from an FTP data source, the home or login directory is prepended to the path that you specified. This action happens regardless if you specify an absolute path (with a leading forward slash) or a relative path (without a leading forward slash). For example, if you specify the directory as
/tmp/SampleData.txt
, the path resolves to/home/username/tmp/SampleData.txt
.Workaround: Edit the File name in the FTP connector. Specify the absolute path to the source or target file.
- Jobs fail with error "The connector could not establish a connection to Db2 database"
-
Jobs may fail with error "The connector could not establish a connection to Db2 database".
Workaround: Go to the connection properties and set the Options property to
connectTimeout=0
.
- Job with source data from a SAP OData connector fails
-
If your flow includes source data from SAP OData, the flow might fail if you created the flow by manually adding columns that do not follow the SAP naming convention.
Workaround: Update the flow by or adding the columns with the Asset browser or by renaming the columns according to the SAP naming convention. The SAP naming convention follows the SAP object hierarchy with two underscore characters (
__
) as a separator. For example, if the PurchaseOrder column belongs to PurchaseOrderNote, then the column name should be specified as PurchaseOrderNote__PurchaseOrder.
- Cannot run transactional SQL on data from Apache Hive version 2.0 or earlier
-
If your data is from Apache Hive version 2.0 or earlier and your DataStage flow executes UPDATE or DELETE statements, the job might fail. Make sure that the target table has been created according to Hive transactions requirements and that the Apache Hive server is configured to support ACID operations.
The minimum set of parameters (configured in the hive-site.xml file) that you must enable for ACID tables in Apache Hive is:
hive.support.concurrency = true hive.enforce.bucketing = true (not required as of Hive 2.0) hive.exec.dynamic.partition.mode = nonstrict hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager hive.compactor.initiator.on = true hive.compactor.worker.threads = <a positive number>
For more information, refer to Hive Transactions.
- IBM Db2 for DataStage connection with SSL certificate fails with "Protocol specific error code 414" error
-
If you use an SSL certificate in the IBM Db2 for DataStage connection and the connection fails with a "Protocol specific error code 414" error, use this workaround:
- Identify the root certificate on the Db2
server. You can use this command to view the certificate
chain:
openssl s_client -connect <hostname>:<port> -showcerts
- Ensure that the certificate has the same subject and issuer.
- In the Create connection: IBM IBM Db2 for DataStage page, enter the root certificate in the SSL certificate (arm) field.
- Identify the root certificate on the Db2
server. You can use this command to view the certificate
chain:
- Error parameterizing the credential field for a flow connection in IBM Cloud Object Storage
-
When the Authentication method property is set to Service credentials (full JSON snippet), do not parameterize the Service credentials field. If a parameter is provided for that field, the flow will not compile
- PostgreSQL connector times out on large tables
-
The PostgreSQL connector might fail with a timeout error when a large table (100,000+ rows) is used as a source. To fix this error, try setting a higher timeout value for the APT_RECORD_TIMEOUT environment variable. See Managing environment variables in DataStage.
- Schema changes that originate in data from the HTTP connector can cause the job to fail
-
When you use the HTTP connector to download a file and then upload the same file into IBM Cloud Object Storage or a database, if the file's schema changes over time, the job might fail.
Workaround: Re-create the stage.
Runtime
- Out of memory issue for the DataStage operator
-
When more than 5 PX runtime instances are deployed on the cluster, the operator may run out of memory. To resolve this issue, update the CSV to increase the memory limits:
Retrieve the DataStage CSV:oc -n ${PROJECT_CPD_INST_OPERATORS} oc get csv | grep datastage
Patch the DataStage CSV to increase the operator memory from 1Gi to 2Gi:oc -n ${PROJECT_CPD_INST_OPERATORS} patch csv <DataStage-CSV-name> --type='json' -p='[{"op": "replace", "path": "/spec/install/spec/deployments/0/spec/template/spec/containers/0/resources/limits/memory", "value": "2Gi" }]'
- Jobs queue when compute pods fail to start
-
If the runtime instance compute pods don't start, all jobs will run on the px-runtime pod. Resource limits lead to jobs being queued.
Workaround: Fix any issues that are preventing the compute pods from starting.
- PXRuntime instance that is deployed on physical location cannot appear
-
While deploying remote data plane on tech zone clusters, you can run into the issue of not appearing PXRruntime instance, which was deployed on physical location. It happens because the connection is being closed before completing the request.
Workaround: The timeout setting for the load balancer needs to be increased. For more information, see: Changing load balancer timeout settings.