Migrating connections in DataStage

Complete the following tasks after migrating jobs that contain connections.

Update connection information

The following table provides an overview of the actions that are required after migrating jobs that contain connections.
Table 1. Connections and actions

Scenario: The connection information or credentials are directly embedded inside the job.
Actions:
  • The migration service creates a connection with the credentials. You can then create a DataStage® flow, compile the flow, and create a DataStage job; the job is ready to run after migration.
  • With flow connections, when you migrate a job from traditional DataStage and the properties that are embedded in the job differ from the data connection object properties, the connection properties that are embedded in the traditional DataStage job are added to the job definition.

Scenario: The connection is parameterized.
Actions:
  • If no credentials are on the source system: for each unique type of database that is referenced with a parameter reference in the source ISX file, DataStage creates a new connection with the parameters. The properties are migrated as-is.

    Compatibility with features that are not present in traditional DataStage: Although you can compile and run the job, the following features will not work: Test connection, the Asset browser, data preview, and importing a connected data asset. You also cannot use the connection with other project services or tools such as notebooks, Data Refinery, or Watson Machine Learning. If you want to use these features or use the connection in other tools, edit the project connection and change the parameters to the actual values.

Scenario: The properties do not have one-to-one mappings between traditional DataStage and DataStage in IBM Cloud Pak® for Data.
Actions:
  • Auto-generated parameters from a migrated flow: Some properties don't have one-to-one mappings. These properties are added to auto-generated parameter sets. To address these parameters, complete the following steps:
    1. After migration, open each parameter set, provide default values, and then compile the flow.
    2. Address any issues that arose during compilation, and then run the job that was created during migration for the corresponding migrated DataStage flow.
  • If the flow connection feature is enabled, DataStage creates a job that contains a flow connection with the parameter sets. Most connection properties that were parameterized in the traditional DataStage job are also parameterized in the imported flow. To enable flow connections, see Enable flow connections.

Specify hostname and port number information

In jobs in the traditional version of DataStage, data connection information might be specified in the form of a parameter reference.

When you import one of these jobs into the modern version of DataStage, the connection information is not recognized, and the import process produces warnings. To address the warnings, manually specify the information in the DataStage job settings. Then, recompile the job.
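For example, after you update the job settings, you can recompile the flow from the command line. The following is a minimal sketch; the project and flow names are placeholders, and the exact options of the cpdctl dsjob compile command might differ by release:

  # Recompile the migrated flow after fixing the connection information
  cpdctl dsjob compile --project my-project --name my-migrated-flow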

The following table shows the connections that might have this issue, the default values of the parameters, and the fields in traditional DataStage that the parameter information is derived from.

Note: Some traditional connection names map to multiple modern connection names and some modern connection names map to multiple traditional connection names.

The port is parameterized for all connections.

Table 2. Connections and parameters

Sybase Enterprise (SAP ASE in modern DataStage)
  • Field that the original value was derived from: Server
  • Parameter that receives the parameter reference: host
  • Default values: port = 5000

Sybase OC (SAP ASE in modern DataStage)
  • Field that the original value was derived from: Server
  • Parameter that receives the parameter reference: host
  • Default values: port = 5000

Db2® (IBM Db2 for DataStage in modern DataStage)
  • Field that the original value was derived from: Instance
  • Parameter that receives the parameter reference: advanced.host
  • Default values: advanced.port = 50000

Db2 Enterprise (IBM Db2 for DataStage in modern DataStage)
  • Field that the original value was derived from: Server
  • Parameter that receives the parameter reference: advanced.host
  • Default values: advanced.port = 50000

Oracle (Oracle Database for DataStage in modern DataStage)
  • Field that the original value was derived from: Server
  • Parameter that receives the parameter reference: oracle_service_name
  • Default values: oracle_db_port = 1522

Oracle Enterprise (Oracle Database for DataStage in modern DataStage)
  • Field that the original value was derived from: Server
  • Parameter that receives the parameter reference: oracle_service_name
  • Default values: oracle_db_port = 1522

Hive (Apache Hive in modern DataStage)
  • Field that the original value was derived from: Server
  • Parameter that receives the parameter reference: host
  • Default values: port = 10000, database = default

Netezza® Enterprise (IBM Netezza Performance Server for DataStage in modern DataStage)
  • Field that the original value was derived from: Database
  • Parameter that receives the parameter reference: database
  • Default values: hostname = <hostname>, port = 5480

FTP Enterprise (FTP in modern DataStage)
  • Field that the original value was derived from: URI
  • Parameter that receives the parameter reference: host
  • Default values: connection_mode = SSH, port = 22
  • Notes: The file_name property is also set to a default value, <file_name>, in the associated flow stage. In addition to manually updating the connection properties and parameter value, you must also change this usage property within the stage, and then compile the job. The default connection mode is restricted to SSH.

Azure Storage Connector (Microsoft Azure File Storage in modern DataStage)
  • Fields that the original value was derived from: accountName, accountKey
  • Parameter that receives the parameter reference: N/A
  • Default value: https;AccountName=accountName;AccountKey=accountKey
  • Notes: If either accountName or accountKey is a parameter reference, the connection string is replaced with a parameter reference for that field.

Update job settings for missing properties

When you import certain jobs from the traditional version of DataStage into the modern version, some properties are not specified in the original job and you might need to specify them manually. DataStage automatically parameterizes some properties. For example, host is parameterized to #ConnParamSet_[connection type].host#. By default, parameter sets are created for imported connections unless the connection uses an existing parameter set, in which case parameterized properties are appended to the set.

Oracle, Db2, ODBC, and JDBC: A special case for generating parameter sets applies to Oracle, Db2, ODBC, and JDBC. When these connection details are parameterized, parameter sets are generated with a unique identifier tag appended to the name, in the following format: ConnParamSet_[connection_type].[unique identifier tag]. A unique parameter set is generated for each unique identifier tag. For example, when two Oracle connections are migrated and each has a different Host value, two parameter sets are generated: ConnParamSet_Oracle.host1 and ConnParamSet_Oracle.host2. If both connections have the same Host value, they share the parameter set ConnParamSet_Oracle.host. The following properties provide the unique identifier tag for their corresponding connections (see the example after the list):

  • Db2: Database
  • Oracle: Host
  • ODBC: Datasource type
  • JDBC: Vendor
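As an illustration, suppose an import contains two Oracle connections with the Host values dbhost01 and dbhost02, a Db2 connection with the Database value SAMPLEDB, and a JDBC connection with the Vendor value postgresql (all of these values are hypothetical). The generated parameter set names might look like the following:

  ConnParamSet_Oracle.dbhost01
  ConnParamSet_Oracle.dbhost02
  ConnParamSet_Db2.SAMPLEDB
  ConnParamSet_JDBC.postgresql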

ODBC and JDBC: A special case for parameter naming applies to the ODBCConnectorPx and PxODBC connectors, which might insert dsn_type into the name, in a format similar to ConnParamSet_[connectorName]_[dsn_type].hostname. A different pattern applies to JDBC, which takes the form ConnParamSet_JDBC.[vendor]_[property]. If the protocol portion of the connection string is readable, it is carried over; otherwise, it is cut from the string and the name takes the form ConnParamSet_JDBC.[property]. These cases apply only when dsn_type or protocol is available.

You can also set the --create-connection-paramsets flag to false on the cpdctl dsjob migrate command to parameterize connection details as job parameters instead of a parameter set asset. For example, host is parameterized to #ConnProp_[connection type]_host#. This option allows you to set parameter values for each individual job. For more information on the CLI command, see DataStage command-line tools.
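For example, the following is a minimal sketch of the migrate command with this flag. The project name and ISX file name are placeholders; verify the exact flag syntax for your cpdctl version:

  # Parameterize connection details as job parameters instead of parameter set assets
  cpdctl dsjob migrate --project my-project --file-name legacy_jobs.isx --create-connection-paramsets=false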

After being migrated, most connections share a parameter set with all connections of the same type in the migrated flow. For example, all Teradata connections in a migrated flow will use the parameter set ConnParamSet_Teradata.

The following properties have default values when parameterized, whether in the ConnProp_[connection type]_[property] job parameter format or the ConnParamSet_[connection type].[property] parameter set format:
Apache Cassandra
local_datacenter parameterized to #ConnParamSet_Cassandra.local_datacenter# will have a default value of "datacenter1"
Apache Hive
database parameterized to #ConnParamSet_Hive.database# will have a default value of "default"
DRS as Oracle
service_name parameterized to #ConnParamSet_Oracle.service_name# will have a default value of "orcl"
Informix® CLI
database parameterized to #ConnParamSet_Informix.database# will have a default value of "default"
Informix Enterprise
database parameterized to #ConnParamSet_Informix.database# will have a default value of "default"
Informix Load
database parameterized to #ConnParamSet_Informix.database# will have a default value of "default"
Informix XPS Load
database parameterized to #ConnParamSet_Informix.database# will have a default value of "default"
Oracle
service_name parameterized to #ConnParamSet_Oracle.service_name# will have a default value of "orcl"
Oracle Enterprise
service_name parameterized to #ConnParamSet_Oracle.service_name# will have a default value of "orcl"
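If you keep these defaults at migration time, you can override them when you run the job. The following is a hedged sketch that assumes the --param option of the cpdctl dsjob run command accepts parameter-set-qualified names; the project name, job name, and value are placeholders:

  # Override the migrated default service name at run time
  cpdctl dsjob run --project my-project --name my-job --param ConnParamSet_Oracle.service_name=SALESDB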

If the missing property is port, the import process produces warnings. To address the warnings, manually specify the information in the DataStage job settings. Then, recompile the job.

The following table shows the connections that might have missing properties, the missing properties that are automatically given a default parameter, and the missing properties that are automatically given a default value.

Note: Some traditional connection names map to multiple modern connection names and some modern connection names map to multiple traditional connection names.
Table 3. Connections and properties

ODBC Connector (ODBC in modern DataStage)
  • Missing properties that are automatically given a default parameter: hostname, database
  • Missing properties that are automatically given a default value: port = 0, dsn_type = DB2

ODBC Enterprise Connector (ODBC Enterprise in modern DataStage)
  • Missing properties that are automatically given a default parameter: hostname, database
  • Missing properties that are automatically given a default value: port = 0, dsn_type = DB2
  • Notes: If dsn_type is not in the list of allowed values, the default is DB2.

DRS Connector (ODBC, Oracle, or IBM Db2 for DataStage in modern DataStage; the DRS connector itself is not available in the modern version)
  • Missing property that is automatically given a default parameter: advanced.hostname
  • Missing property that is automatically given a default value: advanced.port = 0
  • Notes:
    • By default, the connection is migrated as a Db2 connector.
    • If you use a parameter for the Database type, DRS is migrated as Db2 by default.
    • The RDBMS property cannot be parameterized. Specify one of ODBC, Oracle, or Db2; otherwise, the flow is imported with a random stage as a placeholder and flow compilation is skipped. You must manually delete the random stage and add the stage that you want.

Informix Load (Informix in modern DataStage)
  • Missing properties that are automatically given a default parameter: host, password, username
  • Missing property that is automatically given a default value: port = 1526

Informix Enterprise, if remote server = true (Informix in modern DataStage)
  • Missing properties that are automatically given a default parameter: host, server
  • Missing property that is automatically given a default value: port = 1526

Informix Enterprise, if remote server = false (Informix in modern DataStage)
  • Missing properties that are automatically given a default parameter: username, password, host, server
  • Missing property that is automatically given a default value: port = 1526

Informix CLI (Informix in modern DataStage)
  • Missing properties that are automatically given a default parameter: host, username, password
  • Missing property that is automatically given a default value: port = 1526

Informix XPS Load (Informix in modern DataStage)
  • Missing property that is automatically given a default parameter: host
  • Missing property that is automatically given a default value: port = 1526

Db2 Z Bulk Load (PxDB2Z) (Db2 for z/OS® in modern DataStage)
  • Missing property that is automatically given a default parameter: host
  • Missing property that is automatically given a default value: port = 446
  • Notes: In migrated jobs that contain this connection, DataStage automatically converts other properties to comparable ones in modern DataStage. After you migrate jobs, check all connection properties for their proper values and provide them where necessary.

Update credential information

For the following stages, you must replace some properties after the import process is finished:
  • Google BigQuery
  • Google Cloud Storage
  • Google Cloud Pub/Sub
Specify the correct values for the following fields:
  • client_id
  • client_secret
  • access_token
  • refresh_token

Enable flow connections

If a job that you want to import from traditional DataStage contains a connection, you can include the connection properties in the flow definition when you migrate the job to modern DataStage. To migrate the job this way, you must use the dsjob CLI command to migrate the ISX file that contains the jobs, and you must enable flow connections by adding the --enable-local-connection parameter to the command.
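For example, the following is a minimal sketch of the command. The project name and file name are placeholders; verify the exact syntax for your cpdctl version:

  # Migrate the ISX file with connection properties included in the flow definition
  cpdctl dsjob migrate --project my-project --file-name legacy_jobs.isx --enable-local-connection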
