Validating your data in Data Refinery

Last updated: Jul 18, 2025

At any time after you've added data to Data Refinery, you can validate your data. Typically, you'll want to do this at multiple points in the refinement process.

To validate your data:

  1. From Data Refinery, click the Profile tab.

  2. Review the metrics for each column.

  3. Take appropriate actions, as described in the following sections, depending on what you learn.

Frequency

Frequency is the number of times that a value, or a value in a specified range, occurs. Each bar in the frequency distribution shows how many times that value or range occurs in the column.

Review the frequency distribution to find anomalies in your data. If you want to cleanse your data of those anomalies, simply remove the values.
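If you want to reproduce this kind of check outside Data Refinery, the following minimal Python sketch counts how often each value occurs and then drops rare values. It is an illustration only, not how Data Refinery computes its profile. The column contents, the function names, and the min_count threshold are all hypothetical.

```python
from collections import Counter


def frequency(values):
    """Return a mapping of each distinct value to its number of occurrences."""
    return Counter(values)


def drop_rare(values, min_count):
    """Keep only the values that occur at least min_count times."""
    counts = Counter(values)
    return [v for v in values if counts[v] >= min_count]


# Hypothetical column in which "N.Y" is a likely data-entry anomaly.
states = ["NY", "CA", "NY", "TX", "CA", "NY", "TX", "N.Y"]

freq = frequency(states)
cleaned = drop_rare(states, min_count=2)

# Print the distribution from most to least frequent.
for value, count in freq.most_common():
    print(f"{value}: {count}")
```

A low count does not always mean bad data, so inspect rare values before you remove them.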

Basic statistics

Basic statistics are a collection of quantitative data. For each column, these statistics include the minimum, maximum, mean, and other measures.

The statistics shown depend on a column's data type. For example, statistics for a column with an integer data type include the minimum, maximum, median, mean, sum, mode, and other relevant measures, whereas statistics for a column with a string data type include the minimum, maximum, number of unique values, mode, and other applicable metrics.
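As a rough illustration of how the set of measures differs by data type, the following Python sketch computes the statistics named above with the standard library. It is not Data Refinery's implementation. The function names and sample columns are hypothetical, and the sketch assumes that the minimum and maximum of a string column are taken in lexicographic order, which this topic does not specify.

```python
import statistics


def integer_stats(values):
    """Summary measures for a column of integers."""
    return {
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "sum": sum(values),
        "mode": statistics.mode(values),
    }


def string_stats(values):
    """Summary measures for a column of strings."""
    return {
        "min": min(values),  # assumption: lexicographic ordering
        "max": max(values),  # assumption: lexicographic ordering
        "unique": len(set(values)),
        "mode": statistics.mode(values),
    }


# Hypothetical sample columns.
ages = [23, 35, 35, 41, 58]
cities = ["Austin", "Boston", "Austin", "Denver"]

int_summary = integer_stats(ages)
str_summary = string_stats(cities)
print(int_summary)
print(str_summary)
```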

Advanced insights

For columns with numerical data, you can also see more advanced statistics such as percentiles, standard deviation, covariance, skewness, and other measures.
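To make these measures concrete, the following Python sketch computes quartiles, standard deviation, skewness, and covariance for small hypothetical columns. The formulas are assumptions chosen for illustration (population standard deviation, moment-based skewness, sample covariance). This topic does not state which definitions Data Refinery uses, so results in the tool might differ.

```python
import statistics


def skewness(values):
    """Population skewness: third central moment over the 1.5 power of the variance."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def sample_covariance(xs, ys):
    """Sample covariance of two equal-length columns (divides by n - 1)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


# Hypothetical numeric columns.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
hours = [1, 2, 3, 4]
output = [2, 4, 6, 8]

quartiles = statistics.quantiles(scores, n=4)  # three cut points: Q1, median, Q3
std_dev = statistics.pstdev(scores)            # population standard deviation
skew = skewness(scores)                        # positive means a longer right tail
cov = sample_covariance(hours, output)

print(quartiles, std_dev, skew, cov)
```

Passing n=100 to statistics.quantiles returns percentile cut points instead of quartiles.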
