Virtual Datasets


Overview

The Virtual Datasets feature is a data catalog that simplifies access to diverse data sources. Virtual datasets provide a way to organize, catalog, and control access to specific data in CData Connect. You can group related data items in a simple and scalable way, using a customizable organization scheme that supports data analysis across business functions.

Virtual Datasets Structure

Virtual datasets can be stored inside workspaces and folders. Folders can exist only one level down, directly inside a workspace. In the CData Connect data catalog, you can view a preview of the virtual dataset’s column metadata.

Virtual datasets can be:

  • Tables and Views–from connected CData Connect data sources
  • Derived Views–derived from one or many data sources in CData Connect

Tables, views, and derived views are exposed directly without transformations, so the column names and other properties remain the same. Any table, view, or derived view can be added to any number of workspaces. They can be referenced by a unique Alias. Tables that are virtual datasets support full CRUD capabilities.

Workspaces

Workspaces are the highest level of organization. They are customizable to adjust to various business needs. A business unit like marketing or data science can have its own separate workspace. Sales, for example, may choose to organize its workspace into a folder for each client. Teams can choose the structure that supports them best.

To get started, you can create a workspace using the following steps:

  1. Click Virtual Datasets on the left side of the dashboard.
  2. Click Add on the top right side.
  3. Enter a Workspace Name.
  4. Click Confirm to create your new workspace.

After you set a name and description for a workspace, SQL Server and OData endpoints are generated automatically. These endpoints are how client applications connect to the virtual datasets in the workspace.

Inbound Connections

When connecting to CData Connect from a client application, you can connect directly to a workspace via workspace-specific endpoints. CData Connect supports different endpoints for inbound connections, including a REST API, Virtual SQL Server, and OData. You can identify endpoints quickly via a configurable naming structure that by default places periods between the workspace, folder, and virtual dataset names, for example, Workspace.Folder.VirtualDataset.

To view connection details, click View Endpoints. The details include the following parameters:

  • SQL Server Host Name
  • Port
  • OData URL

Connect to the REST API

To connect to the REST API, add a URL parameter named “workspace” to the REST API URL and set it to the name of the workspace you want to query. This works with both the GET operation on the metadata endpoint and the POST operation on the query and batch endpoints. Note that the /exec endpoint does not support workspaces.
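
As a minimal sketch of the workspace parameter described above, the following Python helper appends `workspace` to a REST API URL. The base URL `https://example.invalid/api/query` and the function name `with_workspace` are placeholders for illustration and are not taken from the CData documentation; substitute the REST API URL for your own account.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_workspace(url: str, workspace: str) -> str:
    """Return `url` with a `workspace` query parameter set to `workspace`."""
    parts = urlsplit(url)
    # Keep existing query parameters, dropping any previous workspace value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "workspace"
    ]
    query.append(("workspace", workspace))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical base URL, for illustration only.
print(with_workspace("https://example.invalid/api/query", "Sales"))
```

The helper only builds the URL. Sending the GET or POST request, and the request body for the query and batch endpoints, is left to whichever HTTP client you use.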

Connect to Virtual SQL Server

To connect using Virtual SQL Server, use the SQL Server Host Name and Port. The username for connecting to a workspace has the format [your_email_address]@[workspace_name].
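
The username format can be illustrated with a short Python sketch. It assumes the square brackets in the format above are placeholders and not literal characters; the function name `workspace_username` and the sample values are hypothetical.

```python
def workspace_username(email: str, workspace: str) -> str:
    """Build the workspace username: email address, then '@', then workspace name."""
    if not email or not workspace:
        raise ValueError("email and workspace are both required")
    return f"{email}@{workspace}"


# Sample values for illustration only.
print(workspace_username("user@example.com", "Sales"))
```

Pass the resulting string as the username in your SQL Server client, together with the SQL Server Host Name and Port shown under View Endpoints.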

Connect to OData

To connect using OData, use the OData URL and append the name of the workspace you want to connect to. The URL looks like https://cloud.cdata.com/api/odata/[workspace_name].
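
A small Python sketch of how such a URL might be assembled is shown here. The base URL comes from the pattern above; percent-encoding the workspace name is an assumption made so that the name stays a single path segment, and the function name `odata_workspace_url` is hypothetical.

```python
from urllib.parse import quote

# Base URL taken from the pattern shown above.
ODATA_BASE_URL = "https://cloud.cdata.com/api/odata"


def odata_workspace_url(workspace: str, base_url: str = ODATA_BASE_URL) -> str:
    """Append the workspace name to the OData base URL."""
    # Assumption: percent-encode the name so it is safe as one path segment.
    return f"{base_url.rstrip('/')}/{quote(workspace, safe='')}"


print(odata_workspace_url("Sales"))
```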

Connection Authentication

To authenticate, use your CData Connect email address as your username. You need a Personal Access Token (PAT) to use as your password. You can generate a PAT on the Settings page.
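
For HTTP-based endpoints, one common way to send a username and password is HTTP Basic authentication. The sketch below assumes the endpoint you call accepts Basic authentication, which this page does not state explicitly; the function name `basic_auth_header` and the sample credentials are hypothetical.

```python
import base64


def basic_auth_header(email: str, pat: str) -> dict:
    """Build an HTTP Basic Authorization header from an email address and a PAT."""
    # Basic authentication encodes "username:password" as Base64.
    token = base64.b64encode(f"{email}:{pat}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials; never hard-code a real PAT in source code.
headers = basic_auth_header("user@example.com", "my-personal-access-token")
print(sorted(headers))
```

Treat the PAT like a password: load it from an environment variable or a secrets store and do not commit it to source control.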

Virtual Datasets

The virtual datasets created within workspaces link to specific data items and provide a second layer of organization. To create a virtual dataset or a folder inside a workspace:

  1. Click Add on the top right side.
  2. Choose from the menu what you want to add:
    • Folder
    • Tables and Views
    • Derived Views
  3. Click Save.

When you click on an asset in a workspace, the horizontal tabs offer the following functionality:

  • Columns–a table with the column metadata
  • Preview–the connected data item’s contents in a preview table
  • SQL–a generated reference SQL query for a table, view, or derived view, including the configurable Alias, which is an important reference when querying

Data Model

The data model follows a Workspace.Folder.Table design. If no folder is present in the hierarchy, it displays as [ROOT]. For more information about the data model, refer to SQL Reference.
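
The naming rule can be sketched in Python as follows. It assumes the placeholder for a missing folder is the literal text [ROOT] as written above and that names are joined with periods; any quoting or escaping of names that the SQL dialect requires is not covered here (see SQL Reference). The function name `qualified_name` and the sample names are hypothetical.

```python
from typing import Optional

# Assumption: the placeholder appears exactly as written in the text above.
ROOT_PLACEHOLDER = "[ROOT]"


def qualified_name(workspace: str, table: str, folder: Optional[str] = None) -> str:
    """Join the parts as Workspace.Folder.Table, using [ROOT] when there is no folder."""
    return ".".join([workspace, folder or ROOT_PLACEHOLDER, table])


print(qualified_name("Sales", "Orders", folder="Clients"))
print(qualified_name("Sales", "Orders"))
```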