Snowflake connection
Snowflake is a cloud-based data storage and analytics service. To access your data in Snowflake, create a connection asset for it.
Create a connection to Snowflake
To create the connection asset, you need the following connection details:
- Account name: The full name of your account
- Database name
- Role: The default access control role to use in the Snowflake session
- Warehouse: The virtual warehouse
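If you connect to Snowflake programmatically, the same details map onto connector parameters. A minimal sketch, assuming the snowflake-connector-python package; every value below is a placeholder, not a real account:

```python
"""Minimal sketch: opening a Snowflake session with snowflake-connector-python.

All values are placeholders; substitute your own connection details.
"""

conn_params = {
    "account": "myorg-myaccount",  # full name of your account
    "user": "my_user",
    "password": "my_password",
    "database": "MYDB",
    "role": "ANALYST",             # default access control role for the session
    "warehouse": "COMPUTE_WH",     # virtual warehouse
}


def open_session(params):
    # Deferred import so the sketch is readable without the package installed.
    import snowflake.connector  # pip install snowflake-connector-python

    return snowflake.connector.connect(**params)
```

Call `open_session(conn_params)` to obtain a connection object; the keys mirror the connection details listed above.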
Credentials
Authentication method:
- Username and password
- Key-Pair: Enter the contents of the private key and the key passphrase (if configured). These properties must be set up by the Snowflake administrator. For information, see Key Pair Authentication & Key Pair Rotation in the Snowflake documentation.
- Okta URL endpoint: If your company uses native Okta SSO authentication, enter the Okta URL endpoint for your Okta account, for example https://<okta_account_name>.okta.com. Leave this field blank to use the default Snowflake authentication. For information about federated authentication provided by Okta, see Native SSO in the Snowflake documentation.
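For key-pair authentication, the Snowflake Python connector expects the private key in unencrypted DER form. A minimal sketch, assuming the snowflake-connector-python and cryptography packages; the file path and passphrase are placeholders:

```python
def load_private_key_der(pem_path, passphrase=None):
    """Read a PEM private key and return unencrypted DER (PKCS#8) bytes.

    The path and passphrase are placeholders; the key itself must be the
    one that your Snowflake administrator configured for the user.
    """
    # pip install cryptography
    from cryptography.hazmat.primitives import serialization

    with open(pem_path, "rb") as f:
        key = serialization.load_pem_private_key(f.read(), password=passphrase)
    return key.private_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )


# Usage sketch (not executed here; account and user are placeholders):
# import snowflake.connector
# conn = snowflake.connector.connect(
#     account="myorg-myaccount",
#     user="my_user",
#     private_key=load_private_key_der("rsa_key.p8", b"my_passphrase"),
# )
```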
Choose the method for creating a connection based on where you are in the platform:
- In a project
- Click Assets > New asset > Connect to a data source. See Adding a connection to a project.
- In a catalog
- Click Add to catalog > Connection. See Adding a connection asset to a catalog.
- In a deployment space
- Click Import assets > Data access > Connection. See Adding data assets to a deployment space.
- In the Platform assets catalog
- Click New connection. See Adding platform connections.
Next step: Add data assets from the connection
Where you can use this connection
You can use Snowflake connections in the following workspaces and tools:
Projects
- Data quality rules (IBM Knowledge Catalog)
- Data Refinery (watsonx.ai Studio or IBM Knowledge Catalog)
- DataStage (DataStage service). For more information, see Connecting to a data source in DataStage.
- Decision Optimization (watsonx.ai Studio and watsonx.ai Runtime)
- Metadata enrichment (IBM Knowledge Catalog)
- Metadata import (IBM Knowledge Catalog)
- SPSS Modeler (watsonx.ai Studio)
Catalogs
- Platform assets catalog
- Other catalogs (IBM Knowledge Catalog)
Data lineage
- Metadata import (lineage) (IBM Knowledge Catalog and Manta Data Lineage)
- Watson Query service
- You can connect to this data source from Watson Query. This connection requires special consideration in Watson Query. For more information, see Connecting to Snowflake in Watson Query.
Snowflake setup
Running SQL statements
To ensure that your SQL statements run correctly, use the syntax that is documented in the Snowflake SQL Command Reference.
Configuring lineage metadata import for Snowflake
When you create a metadata import for the Snowflake connection, you can set options specific to this data source, and define the scope of data for which lineage is generated. For details about metadata import, see Designing metadata imports.
Scope of lineage metadata import
- Include and exclude lists
- You can include or exclude assets up to the schema level. Provide databases and schemas in the format database/schema. Each part is evaluated as a regular expression. Assets that are added to the data source later will also be included or excluded if they match the conditions specified in the lists. Example values:
  - myDB/: all schemas in the myDB database.
  - myDB2/.*: all schemas in the myDB2 database.
  - myDB3/mySchema1: the mySchema1 schema in the myDB3 database.
  - myDB4/mySchema[1-5]: any schema in the myDB4 database with a name that starts with mySchema and ends with a digit between 1 and 5.
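The behavior of such patterns can be illustrated with ordinary regular-expression matching. This is only a sketch of per-part evaluation under the stated format, not the product's actual matching code; the treatment of a trailing slash as match-all is an assumption:

```python
import re


def matches(pattern: str, asset: str) -> bool:
    """Illustrative only: evaluate a database/schema pattern against an asset.

    Both strings use the database/schema format; each part of the pattern is
    treated as a regular expression, as described above. An empty part (from
    a trailing slash, e.g. "myDB/") is assumed here to match any schema.
    """
    pat_parts = pattern.split("/")
    asset_parts = asset.split("/")
    if len(asset_parts) < len(pat_parts):
        return False
    return all(
        p == "" or re.fullmatch(p, a) is not None
        for p, a in zip(pat_parts, asset_parts)
    )


# The example patterns from the list above (schema names are placeholders):
assert matches("myDB/", "myDB/anySchema")
assert matches("myDB3/mySchema1", "myDB3/mySchema1")
assert matches("myDB4/mySchema[1-5]", "myDB4/mySchema3")
assert not matches("myDB4/mySchema[1-5]", "myDB4/mySchema9")
```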
- External inputs
- If you use external Snowflake SQL scripts that are not extracted directly from the connected Snowflake server, you can add them in a ZIP file as an external input. You can organize the structure of the ZIP file as subfolders that represent databases and schemas. After the scripts are scanned, they are added under the respective databases and schemas in the selected catalog or project. The ZIP file can have the following structure:
  <database_name>
    <schema_name>
      <script_name.sql>
  <database_name>
    <script_name.sql>
  <script_name.sql>
  replace.csv
The replace.csv file contains placeholder replacements for the scripts that are added in the ZIP file. For more information about the format, see Placeholder replacements.
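One way to assemble such an archive is with Python's zipfile module, writing each script under its database and schema subfolder. The database, schema, and script names below are placeholders:

```python
import io
import zipfile

# Scripts keyed by their archive path: <database>/<schema>/<script>.
# All names and SQL contents are placeholders.
scripts = {
    "SALES_DB/PUBLIC/load_orders.sql": "INSERT INTO orders SELECT * FROM staged_orders;",
    "SALES_DB/cleanup.sql": "DELETE FROM staged_orders;",  # schema-less script
    "top_level.sql": "SELECT 1;",                          # no database folder
    "replace.csv": "placeholder,value\n",                  # placeholder replacements
}

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for path, sql in scripts.items():
        zf.writestr(path, sql)

# buf now holds a ZIP whose subfolders represent databases and schemas,
# ready to be supplied as an external input.
```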
Advanced metadata import options
- TableStage Regular Expression
- You can enable extraction of staged files from table stages and specify the scope by using a regular expression. Use fully qualified names and enclose each segment in double quotation marks. If you do not want to extract staged files from table stages, leave the value empty. This setting is useful, for example, when a stage contains more than one million files: extraction would take a very long time, and many of the filesystem entries are not useful for lineage. Use this setting to define which files to include in the extraction. Example value:
"mydb"\."schema1"\.".*"|"mydb"\."myschema"\."abc.*"
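The escaping in the example value can be checked with any regular-expression engine. A Python rendering of the pattern as a raw string, matched against hypothetical fully qualified stage-file names:

```python
import re

# The example pattern from above: two alternatives over fully qualified
# names whose segments are enclosed in double quotation marks.
pattern = r'"mydb"\."schema1"\.".*"|"mydb"\."myschema"\."abc.*"'

# The file and schema names below are placeholders for illustration.
assert re.fullmatch(pattern, '"mydb"."schema1"."any_stage_file"')
assert re.fullmatch(pattern, '"mydb"."myschema"."abc_123"')
assert not re.fullmatch(pattern, '"otherdb"."schema1"."abc_123"')
```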
- Transformation logic extraction
- You can enable building transformation logic descriptions from SQL code in SQL scripts.
Learn more
Parent topic: Supported connections