Governing virtual data with data protection rules in Watson Query
You can govern your virtual data by defining data protection rules.
Before you begin
Ensure that you have completed the following tasks:
- Created data protection rules in IBM Knowledge Catalog. For more information, see Managing data protection rules.
- Enabled policy enforcement in Watson Query if you want to force the use of data protection rules. For more information, see Enabling policy enforcement.
- Published and annotated the objects that you want to be subject to data protection rules to a governed catalog. For more information, see Publishing virtual data to the catalog in Watson Query.
Important: Watson Query enforces data protection rules only if the data asset is cataloged in a catalog that has the Enforce data protection rules option enabled. For more information, see Changing catalog settings.
About this task
You can use the following types of data protection rules:
- Deny of access
  Prevents users from accessing all the data of a Watson Query asset. For example, if a data steward does not want to expose the entire asset to a particular user, they can define this rule with a condition that matches the username.
- Data masking
  Hides sensitive data but still allows users to use the asset. There are three types of data masking rules: redact, substitute, and obfuscate. Choose one of these rules based on how the data is used in the upstream application.
  - Redact replaces all or a subset of characters in a data cell.
  - Substitute replaces data with the salted hashes of the original values. This method is the most likely to maintain referential integrity.
  - Obfuscate replaces data with formatted values that are similar to the original data.
- Row-level filtering
  Includes or excludes rows in your virtualized data to limit the rows that users can see. For example, you can define a rule so that employees can see customer data that is associated only with their department.
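To make the difference between the three masking methods and row-level filtering more concrete, the following Python sketch imitates each behavior on plain values. It is an illustration only and does not show how Watson Query or IBM Knowledge Catalog implement masking. The function names, the `X` mask character, the SHA-256 hash, the `salt` and `seed` parameters, and the `department` column are all assumptions made for this example.

```python
import hashlib
import random
import string


def redact(value: str, keep_last: int = 0) -> str:
    """Replace every character, or all but the last `keep_last`, with 'X'."""
    if keep_last <= 0:
        return "X" * len(value)
    return "X" * max(len(value) - keep_last, 0) + value[-keep_last:]


def substitute(value: str, salt: str) -> str:
    """Replace a value with a salted hash (SHA-256 is an assumption here).

    Equal inputs always produce equal outputs, which is why joins on a
    substituted column can still line up.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def obfuscate(value: str, seed: int = 0) -> str:
    """Replace letters with random letters and digits with random digits.

    Punctuation and length are preserved, so the result keeps the format
    of the original value.
    """
    rng = random.Random(seed)
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            repl = rng.choice(string.ascii_lowercase)
            out.append(repl.upper() if ch.isupper() else repl)
        else:
            out.append(ch)
    return "".join(out)


def filter_rows(rows, user_department):
    """Row-level filtering: keep only rows for the user's department."""
    return [row for row in rows if row["department"] == user_department]
```

In this sketch, `redact("123-45-6789", keep_last=4)` keeps only the last four characters readable, `substitute` returns the same hash whenever the same value and salt are supplied, and `obfuscate` returns a value with the same length and separators as its input.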
You cannot apply data masking and row filtering rules to views directly. The result set of a view is masked according to the data protection rules that apply to the objects that are referenced by the view. You can filter rows or mask identifying details from tables that are referenced in the view definition.
Access to the tables that are referenced in row-level filter expressions is not evaluated, and neither are data masking rules on those tables.
When data masking or row-level filtering applies to a preview in Watson™ services other than Watson Query, the Watson Query internal access controls, which you set by using Manage access in the Watson Query UI, do not apply to that preview. To control access in the other Watson services, you must define rules that manage access to the catalogs, projects, data assets, or connections.
When you publish virtualized data assets to a catalog, they are treated like any other data asset and are subject to data protection rules. Data protection rules can deny or mask access to assets based on criteria that can include governance artifacts, such as business terms and data classes.
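As a rough mental model of how a rule's criteria can lead to a deny or mask outcome, the following Python sketch evaluates a list of rules against a cataloged asset. It is a simplified, hypothetical model and not the IBM Knowledge Catalog rule engine: the `Asset` and `Rule` classes, the `evaluate` function, and the choice to let the first matching deny rule win are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Asset:
    """Hypothetical cataloged asset with governance annotations."""
    name: str
    business_terms: Set[str] = field(default_factory=set)
    # Maps column name -> data class assigned to that column.
    column_data_classes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Rule:
    """Hypothetical rule: optional criteria plus an action."""
    action: str                           # "deny" or "mask"
    user: Optional[str] = None            # match a specific username
    business_term: Optional[str] = None   # match a term on the asset
    data_class: Optional[str] = None      # columns to mask (mask rules only)


def evaluate(asset: Asset, user: str, rules: List[Rule]) -> Tuple[str, List[str]]:
    """Return ("deny", []) or ("allow", [columns to mask])."""
    masked = set()
    for rule in rules:
        # Skip rules whose criteria do not match this user or asset.
        if rule.user is not None and rule.user != user:
            continue
        if (rule.business_term is not None
                and rule.business_term not in asset.business_terms):
            continue
        if rule.action == "deny":
            # Assumption: a matching deny rule blocks the whole asset.
            return ("deny", [])
        if rule.action == "mask":
            masked.update(
                col for col, data_class in asset.column_data_classes.items()
                if data_class == rule.data_class
            )
    return ("allow", sorted(masked))
```

In this model, a deny rule with a `user` criterion blocks only that user, while a mask rule with a `data_class` criterion lets everyone else query the asset with the matching columns masked.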
Procedure
To govern your virtual data with data protection rules: