Preparing data

Last updated: Nov 26, 2024

After you create or join a project, the next step is to add data to the project and prepare that data for analysis.

Required permissions
You must have the Admin or Editor role in a project to add or prepare data.

Methods to add data to a project

You can add data assets from your local system, from a catalog, from the Resource hub, or from connections to data sources. See Adding data to a project.

You can add these types of data assets to a project:

  • Data assets from files from your local system, including structured data, unstructured data, and images. The files are stored in the project's IBM Cloud Object Storage bucket.
  • Connection assets that contain information for connecting to data sources. You can add connections to IBM or third-party data sources. See Connectors.
  • Connected data assets that specify a table, view, or file that is accessed through a connection to a data source.
  • Connected folder assets that specify a path in IBM Cloud Object Storage.
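The difference between a connection asset and a connected data asset can be sketched with a small data model. This is an illustration only, not the product's API: the class names, fields, host name, and table paths below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionAsset:
    """Hypothetical stand-in: holds only the details needed to reach a data source."""
    name: str
    host: str
    database: str


@dataclass(frozen=True)
class ConnectedDataAsset:
    """Hypothetical stand-in: points at one table, view, or file through a connection."""
    name: str
    connection: ConnectionAsset
    path: str  # for example, a schema-qualified table name or a file path


def describe(asset: ConnectedDataAsset) -> str:
    """Summarize which object the asset points at and which connection it uses."""
    return f"{asset.name}: {asset.path} via {asset.connection.name}"


# One connection can back many connected data assets.
warehouse = ConnectionAsset(name="warehouse", host="db.example.com", database="SALES")
orders = ConnectedDataAsset(name="orders", connection=warehouse, path="SALES.ORDERS")
customers = ConnectedDataAsset(name="customers", connection=warehouse, path="SALES.CUSTOMERS")

print(describe(orders))
```

In this sketch the connection is defined once and reused, which mirrors the idea that a connected data asset stores a reference to the data rather than a copy of it.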

To get started quickly, take a tutorial. See Quick start tutorials.

Methods for preparing structured data

You can choose from these methods to prepare structured data, such as relational tables.

Prepare structured data
Task                       Method
Cleanse and shape data     Data Refinery
Generate synthetic data    Synthetic Data Generator
Preserve features          Feature groups
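As a rough picture of what cleansing and shaping structured data typically involves, the following sketch trims whitespace, drops incomplete and duplicate rows, and converts a column to a numeric type. It uses only the Python standard library and does not represent how Data Refinery works; the sample data and the `cleanse` function are hypothetical.

```python
import csv
import io

# Hypothetical sample data: stray whitespace, a missing value, and a duplicate row.
RAW = """name,city,amount
 Alice ,Boston,10.5
Bob,,7
 Alice ,Boston,10.5
"""


def cleanse(text):
    """Return cleaned rows: trimmed, complete, de-duplicated, with a numeric amount."""
    rows = []
    seen = set()
    for row in csv.DictReader(io.StringIO(text)):
        # Cleanse: strip surrounding whitespace from every value.
        cleaned = {key: value.strip() for key, value in row.items()}
        # Cleanse: skip rows that have any empty value.
        if not all(cleaned.values()):
            continue
        # Cleanse: skip rows that were already seen.
        fingerprint = tuple(cleaned.values())
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        # Shape: convert the amount column from text to a number.
        cleaned["amount"] = float(cleaned["amount"])
        rows.append(cleaned)
    return rows


rows = cleanse(RAW)
print(rows)
```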

Methods for preparing unstructured data

You can choose from these methods to prepare unstructured data, such as documents.

Prepare unstructured data
Task                       Method
Vectorize documents        Vector index
Generate synthetic data    Synthetic data generation API
