IBM Cloud Data Engine connection
To access your data in IBM Cloud Data Engine, create a connection asset for it.
The IBM Cloud Data Engine connector is deprecated and will be discontinued in a future release. For more information, see Deprecation of Data Engine.
IBM Cloud Data Engine is a service on IBM Cloud that you use to build, manage, and consume data lakes and their table assets in IBM Cloud Object Storage (COS). IBM Cloud Data Engine provides functions to load, prepare, and query big data that is stored in various formats. It also includes a metastore with table definitions. IBM Cloud Data Engine was formerly named "IBM Cloud SQL Query."
Prerequisites
- An IBM Cloud Data Engine Standard plan instance is required in order to create tables or views.
- Before you can run SQL queries, you need to have one or more Cloud Object Storage buckets to hold the data to be analyzed and to hold the query results. You have two choices for Cloud Object Storage:
- Use the Cloud Object Storage instance that you associated with your projects, deployment spaces, or catalogs in Cloud Pak for Data as a Service. For information, see Cloud Object Storage on Cloud Pak for Data as a Service.
- Provision a new instance of Cloud Object Storage. For instructions, see Provisioning storage.
Create a connection to IBM Cloud Data Engine
If you have set up an integrated cloud service, select the service instance to automatically fill in the fields in the connection form. Confirm that all the fields are complete.
To create the connection asset, you need these connection details:
- The Cloud Resource Name (CRN) of the IBM Cloud Data Engine instance. Go to the IBM Cloud Data Engine service instance in your resources list in your IBM Cloud dashboard and copy the value of the CRN from the deployment details.
- Target Cloud Object Storage: The default location where IBM Cloud Data Engine stores query results. You can specify any Cloud Object Storage bucket that you have access to, or you can select the default bucket that is created when you open the IBM Cloud Data Engine web console for the first time from the IBM Cloud dashboard. To find this value, see the Target location field in the IBM Cloud Data Engine web console.
- IBM Cloud API key: An API key for a user or service ID that has access to your IBM Cloud Data Engine and Cloud Object Storage services (for both the Cloud Object Storage data that you want to query and the default target Cloud Object Storage location).
You can create a new API key for your own user:
- In the IBM Cloud console, go to Manage > Access (IAM).
- In the left navigation, select API keys.
- Select Create an IBM Cloud API Key.
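Before you paste the connection details into the form, you can check that they are well formed. The following is a minimal sketch, not part of the connector. It assumes the general IBM Cloud CRN layout (the literal `crn` followed by nine colon-separated segments) and assumes that the target location is written as `cos://<endpoint>/<bucket>/<prefix>`. The function names `parse_crn` and `check_target_location` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Crn:
    """The nine segments that follow the literal 'crn' prefix."""
    version: str
    cname: str
    ctype: str
    service_name: str
    location: str
    scope: str
    service_instance: str
    resource_type: str
    resource: str


def parse_crn(value: str) -> Crn:
    """Split a CRN into its segments; raise ValueError if it is malformed."""
    parts = value.strip().split(":")
    # Trailing segments may be empty, but all ten fields must be present.
    if len(parts) != 10 or parts[0] != "crn":
        raise ValueError(f"not a well-formed CRN: {value!r}")
    return Crn(*parts[1:])


def check_target_location(value: str) -> Tuple[str, str, str]:
    """Split a cos://<endpoint>/<bucket>/<prefix> URI (assumed format)."""
    scheme = "cos://"
    if not value.startswith(scheme):
        raise ValueError("target location should start with cos://")
    endpoint, _, rest = value[len(scheme):].partition("/")
    bucket, _, prefix = rest.partition("/")
    if not endpoint or not bucket:
        raise ValueError("target location needs an endpoint and a bucket")
    return endpoint, bucket, prefix
```

For example, `check_target_location("cos://us-south/my-bucket/results")` returns the endpoint, bucket, and prefix as separate strings. The bucket name here is a placeholder. These checks catch copy-and-paste errors only; they do not confirm that the instance or bucket exists.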
Credentials
IBM Cloud Data Engine uses SSO credentials, which you specify as a single API key that authenticates a user or service ID.
The API key must have the following permissions:
- Manage permission for the IBM Cloud Data Engine instance
- Read access to all Cloud Object Storage locations that you want to read from
- Write access to the default Cloud Object Storage target location
- Write access to the IBM Cloud Data Engine instance
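The connection handles authentication for you, but it can be useful to confirm that an API key is valid before you save it. The following sketch shows how an API key is typically exchanged for an IAM access token at the public IBM Cloud IAM endpoint. It uses only the Python standard library. The function names are hypothetical, and a successful exchange shows only that the key is valid, not that it has the permissions listed above.

```python
import json
import urllib.parse
import urllib.request

# Public IBM Cloud IAM token endpoint.
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the form-encoded token request."""
    body = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("ascii")
    return urllib.request.Request(
        IAM_TOKEN_URL,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


def fetch_access_token(api_key: str, timeout: float = 30.0) -> str:
    """Send the request and return the access token from the JSON reply."""
    request = build_token_request(api_key)
    # urlopen raises urllib.error.HTTPError if IAM rejects the key.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["access_token"]
```

Building the request separately from sending it keeps the key handling easy to inspect. Read the API key from an environment variable or a secrets store rather than hardcoding it.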
Choose the method for creating a connection based on where you are in the platform:
- In a project: Click Assets > New asset > Connect to a data source. See Adding a connection to a project.
- In a catalog: Click Add to catalog > Connection. See Adding a connection asset to a catalog.
- In a deployment space: Click Import assets > Data access > Connection. See Adding data assets to a deployment space.
- In the Platform assets catalog: Click New connection. See Adding platform connections.
Next step: Add data assets from the connection
Where you can use this connection
You can use IBM Cloud Data Engine connections in the following workspaces and tools:
Projects
- Data Refinery (Watson Studio or IBM Knowledge Catalog)
- DataStage (DataStage service). See Connecting to a data source in DataStage.
- Metadata enrichment (IBM Knowledge Catalog)
- Metadata import (IBM Knowledge Catalog)
- Notebooks (Watson Studio). See the Notebook tutorial for using the IBM Cloud Data Engine (SQL Query) API to run SQL statements.
- SPSS Modeler (Watson Studio)
Catalogs
- Platform assets catalog
- Other catalogs (IBM Knowledge Catalog)
Note: Preview, profile, and masking are not certified for this connection in IBM Knowledge Catalog.
Restrictions
You can use this connection only for source data. You cannot write data or export data with this connection.
IBM Cloud Data Engine setup
To set up IBM Cloud Data Engine on IBM Cloud Object Storage, see Getting started with IBM Cloud Data Engine.
Supported encryption
By default, all objects that are stored in IBM Cloud Object Storage are encrypted by using randomly generated keys and an all-or-nothing-transform (AONT). For details, see Encrypting your data. Additionally, you can use managed keys to encrypt the SQL query texts and error messages that are stored in the job information. See Encrypting SQL queries with Key Protect.
Running SQL statements