Governing virtual data in Data Virtualization
Data Virtualization can integrate with IBM Knowledge Catalog to govern the virtual data that you publish to governed catalogs. Data governance involves applying business context, data policies, and data protection rules to your virtual data.
Before you begin, review the end-to-end process.
- Understand catalogs. See Catalogs.
- Choose a method for publishing your assets to a catalog. See Methods for publishing assets to the catalog.
- Understand the ownership of an asset that is published to a catalog. See Virtual object owner and data asset owner.
- Decide whether to use business terms or data protection rules. See Applying business terms and data protection rules.
Catalogs
IBM Knowledge Catalog is a secure enterprise data catalog management platform. With IBM Knowledge Catalog, you use catalogs to easily find and share your data assets. A catalog is a way to organize, label, and search for data assets. An asset in a catalog consists of metadata about a data asset. Data protection rules are enforced only on data that is published or added to a catalog. For more information about catalogs and data assets, see Catalogs.
- The enforced publishing method is normally used for control and governance of assets.
- The standard publishing method is used to facilitate sharing of virtualized data for easier collaboration.
Methods for publishing assets to the catalog
In Data Virtualization, you can use either of two methods to publish assets to a catalog. You can enforce publishing of all assets to a primary catalog, or you can allow users to publish to any catalog on which they have the Manager or Editor role.
- Enforced publishing method
If you want to enforce publishing to a primary catalog, a Data Virtualization Manager must enable Enforce publishing to a governed catalog in and choose a primary catalog that all virtualized objects that are created with the user interface will be published to. If this setting is enabled, users will not be able to choose the catalog that they publish to when they virtualize data. All assets will be published to the primary catalog automatically.
To change a primary catalog, a Data Virtualization Manager must satisfy the following requirements:
- They must be a Manager on the current primary catalog.
- They must be a Manager on the newly selected primary catalog.
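The two requirements above amount to a simple role check. The following is a minimal sketch of that check, not a Data Virtualization or IBM Knowledge Catalog API: the function name, the `is_dv_manager` flag, and the dictionary of catalog roles are all hypothetical and exist only for illustration.

```python
from typing import Dict


def can_change_primary_catalog(
    is_dv_manager: bool,
    catalog_roles: Dict[str, str],
    current_catalog: str,
    new_catalog: str,
) -> bool:
    """Return True only if the user meets every requirement.

    catalog_roles is a hypothetical mapping of catalog name to the
    role that the user holds on that catalog.
    """
    if not is_dv_manager:
        # Only a Data Virtualization Manager can change the primary catalog.
        return False
    # The user must be a Manager on both the current and the new catalog.
    return (
        catalog_roles.get(current_catalog) == "Manager"
        and catalog_roles.get(new_catalog) == "Manager"
    )


if __name__ == "__main__":
    roles = {"finance": "Manager", "sales": "Editor", "hr": "Manager"}
    print(can_change_primary_catalog(True, roles, "finance", "hr"))
    print(can_change_primary_catalog(True, roles, "finance", "sales"))
```

In this sketch, a user who is only an Editor on either catalog, or who has no role on it at all, fails the check.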
Note: If you enforce publishing to a primary catalog, the service ID is added as a collaborator on the catalog in the background. The service ID performs the automatic publishing, so it takes up one catalog collaborator spot from your plan quota. Do not remove this service ID from the catalog. It is required for automatic publishing to the primary catalog. The service ID appears as Unavailable user in the collaborators list of your selected primary catalog and has the Manager role assigned.
- Standard publishing method
If publishing to a primary catalog is not enforced, a user can choose to publish to any catalog that they have the Manager or Editor role for. The user can choose the catalog from the drop-down list on the Virtualize page.
For more information, see Publishing virtual data to the catalog in Data Virtualization.
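The difference between the two publishing methods can be sketched as a function that returns the catalogs a user may publish to. This is an illustrative model only: `publish_targets`, its parameters, and the `Viewer` role used in the sample data are assumptions made for the example, not part of any product API.

```python
from typing import Dict, List, Optional

# Roles that allow publishing under the standard method, per the text above.
PUBLISHING_ROLES = {"Manager", "Editor"}


def publish_targets(
    catalog_roles: Dict[str, str],
    enforced: bool,
    primary_catalog: Optional[str] = None,
) -> List[str]:
    """Return the catalogs that a user can publish to.

    catalog_roles is a hypothetical mapping of catalog name to the
    role that the user holds on that catalog.
    """
    if enforced:
        if primary_catalog is None:
            raise ValueError("enforced publishing requires a primary catalog")
        # Enforced method: the user has no choice of catalog.
        return [primary_catalog]
    # Standard method: any catalog where the user is a Manager or Editor.
    return sorted(
        name for name, role in catalog_roles.items() if role in PUBLISHING_ROLES
    )


if __name__ == "__main__":
    roles = {"finance": "Manager", "sales": "Editor", "audit": "Viewer"}
    print(publish_targets(roles, enforced=False))
    print(publish_targets(roles, enforced=True, primary_catalog="governed"))
```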
Virtual object owner and data asset owner
- Virtual object owner
- The user that created the virtual object in Data Virtualization.
- Data asset owner
- The user that owns the asset for a virtual object in a catalog.
- For example, a user might choose not to publish a Data Virtualization object when it is virtualized. Or the object might have been created by a method that does not automatically attempt to publish the object, such as when the user runs SQL to create a view. The object is then shared with other users. One of those users might publish the object and then that user would become the asset owner instead of the original object creator.
- Alternatively, the asset owner might be changed directly in the catalog.
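The distinction between the two owners can be modeled as two separate fields on one object. The sketch below is a simplified illustration under that assumption. The class, its field names, and the helper functions are hypothetical and do not reflect how Data Virtualization or the catalog stores ownership.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualObject:
    name: str
    object_owner: str                   # user who created the virtual object
    asset_owner: Optional[str] = None   # stays None until the object is published


def publish(obj: VirtualObject, publisher: str) -> None:
    """Publishing makes the publishing user the data asset owner."""
    obj.asset_owner = publisher


def reassign_asset_owner(obj: VirtualObject, new_owner: str) -> None:
    """Model a change of asset owner made in the catalog."""
    if obj.asset_owner is None:
        raise ValueError("object has not been published to a catalog")
    obj.asset_owner = new_owner


if __name__ == "__main__":
    # alice creates a view with SQL, so nothing is published automatically.
    view = VirtualObject(name="SALES_VIEW", object_owner="alice")
    # bob, who has access to the shared object, publishes it.
    publish(view, publisher="bob")
    print(view.object_owner, view.asset_owner)
```

Here the virtual object owner never changes, while the asset owner depends on who publishes the object or on a later change in the catalog.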
Applying business terms and data protection rules
You can create virtual tables in Data Virtualization from existing data assets that have business term assignments. Data Virtualization can use the business terms that are assigned to tables in the catalog to rename tables and columns while they are being virtualized.
A catalog data asset contains a set of properties that includes business terms and tags. After your virtual data is in a catalog, you can:
- Assign business terms, data classes, and tags that are authored in IBM Knowledge Catalog to tables and columns, and thus form a logical structure of your virtual data.
For more information, see Virtualizing data with business terms.
- Use data protection rules to deny access to your virtual data or mask it. These data protection rules can be based on the assigned tags and business terms. For more information, see Data protection rules.
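To illustrate how a rule based on tags and business terms might decide between denying and masking, here is a deliberately simplified sketch. It is not the IBM Knowledge Catalog rule engine: the `ProtectionRule` class, the `protect` function, the precedence of deny over mask, and the character-replacement masking are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class ProtectionRule:
    """Hypothetical rule that fires when a column carries a listed tag or term."""
    action: str                           # "deny" or "mask"
    tags: FrozenSet[str] = frozenset()
    terms: FrozenSet[str] = frozenset()

    def matches(self, col_tags: FrozenSet[str], col_terms: FrozenSet[str]) -> bool:
        return bool(self.tags & col_tags) or bool(self.terms & col_terms)


def protect(
    value: str,
    col_tags: FrozenSet[str],
    col_terms: FrozenSet[str],
    rules: Iterable[ProtectionRule],
) -> Optional[str]:
    """Return the value, a masked value, or None when access is denied."""
    actions = {r.action for r in rules if r.matches(col_tags, col_terms)}
    if "deny" in actions:
        # Assumption in this sketch: deny takes precedence over mask.
        return None
    if "mask" in actions:
        # Simple redaction; real masking options may differ.
        return "X" * len(value)
    return value


if __name__ == "__main__":
    rules = [
        ProtectionRule(action="mask", tags=frozenset({"pii"})),
        ProtectionRule(action="deny", terms=frozenset({"Salary"})),
    ]
    print(protect("alice@example.com", frozenset({"pii"}), frozenset(), rules))
    print(protect("90000", frozenset(), frozenset({"Salary"}), rules))
    print(protect("Boston", frozenset(), frozenset(), rules))
```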