Why a Data Access Layer?
2022/06/27

Introduction

Data innovation has accelerated in recent years, driven largely by thriving cloud data warehouses such as Snowflake, Redshift, and BigQuery, which focus primarily on SQL users and business intelligence use cases. Lakehouse architecture brings analytics closer to data lakes, allowing heterogeneous, distributed processing engines to ingest many kinds of sources and to support diverse workloads such as data science, machine learning, and near real-time analytics. This growth has also spawned integrated data services, such as dbt and LookML, that automate and unify data modeling, transformation, and metrics. Collectively, these tools lay the foundation on which next-generation operational and analytical data applications can be built for data consumers of different personas.

The Data Access Challenges

First of all, let's look at the ultimate goal for data access in a company:

Allow data consumers to find and access their data quickly and simply.

Now consider a typical company's current data access workflow. For a data consumer requesting and accessing data, it looks like this:

  1. Data consumers need to find out who owns the data.
  2. They request access from the data owners, who must then filter or mask certain rows and columns for specific groups or users before exposing the data.
  3. Data owners evaluate the best data access method for each data application or tool, such as Excel, BI, AI, or a RESTful API.
  4. Finally, the process is automated: data is periodically updated and delivered to end applications, and access must remain secure and auditable.
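
Step 2 of this workflow, filtering rows and masking columns for a specific group, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `AccessPolicy`, `apply_policy`, the `orders` rows, and the `eu_analysts` group are hypothetical names invented for the example, not part of any product described here.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, object]


@dataclass
class AccessPolicy:
    """Hypothetical policy: which rows a group may see, which columns are masked."""
    row_filter: Callable[[Row], bool]
    masked_columns: set[str]


def apply_policy(rows: list[Row], policy: AccessPolicy) -> list[Row]:
    """Return a filtered, masked copy; the source rows are never modified."""
    visible = []
    for row in rows:
        if not policy.row_filter(row):
            continue  # row-level filter: this group may not see the row
        visible.append({
            # column-level mask: replace sensitive values with a placeholder
            col: "***" if col in policy.masked_columns else value
            for col, value in row.items()
        })
    return visible


# Illustrative data and policy (assumptions, not real datasets).
orders = [
    {"region": "EU", "customer_email": "a@example.com", "amount": 120},
    {"region": "US", "customer_email": "b@example.com", "amount": 80},
]
eu_analysts = AccessPolicy(
    row_filter=lambda row: row["region"] == "EU",
    masked_columns={"customer_email"},
)
result = apply_policy(orders, eu_analysts)
print(result)
```

Because the policy produces a view rather than a copy of the source table, the same dataset can be exposed differently to different groups without duplicating it.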

Data access in an enterprise encounters several challenges.

[Figure: data access without Canner]

From left to right: heterogeneous data must be homogenized at the metadata and logical levels; data authorization requires sophisticated access control for domain-specific datasets and their associated data applications; datasets must be given semantic meaning through data productization; and finally, endpoints must be provided to suit each data consumer persona.

A complete data access layer must solve data heterogeneity, usability, and authorization while enabling consistency and scalability across different data applications.

The Data Access Layer

As a company grows, requests from data consumers pile up by the hundreds and thousands, and the backlog can take days or even weeks to resolve.

The bottleneck is obvious: the "data access workflow". We need a data access layer that gives data consumers a secure, efficient, automated, and intelligent way to access data themselves, avoiding delays and misalignment, while data owners authorize access and audit datasets to make sure the right person accesses the right dataset.

We need a data access layer that is secure, efficient, and intelligent.

Ultimately, the aim is to achieve equilibrium between data, people, and applications.

Four core design principles

The data access layer is built on four core design principles.

1. Data Connectors

The data access layer is collaborative and distributed by nature: each silo or data source can scale independently, or together as an aggregate.
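
One way to picture this principle is a common connector interface with a catalog that aggregates independent sources. The sketch below is an assumption-laden illustration: `DataConnector`, `InMemoryConnector`, and `Catalog` are hypothetical names, and the in-memory tables stand in for real warehouses or lakes.

```python
from typing import Iterable, Protocol


class DataConnector(Protocol):
    """Hypothetical interface every source-specific connector would satisfy."""
    name: str

    def list_datasets(self) -> list[str]: ...

    def read(self, dataset: str) -> Iterable[dict]: ...


class InMemoryConnector:
    """Stand-in for a real source (warehouse, lake, API); purely illustrative."""

    def __init__(self, name: str, tables: dict[str, list[dict]]):
        self.name = name
        self._tables = tables

    def list_datasets(self) -> list[str]:
        return sorted(self._tables)

    def read(self, dataset: str) -> Iterable[dict]:
        return iter(self._tables[dataset])


class Catalog:
    """Aggregates independent connectors under one '<source>.<dataset>' namespace."""

    def __init__(self, connectors: list[DataConnector]):
        self._connectors = {c.name: c for c in connectors}

    def list_datasets(self) -> list[str]:
        return [
            f"{name}.{dataset}"
            for name, connector in sorted(self._connectors.items())
            for dataset in connector.list_datasets()
        ]

    def read(self, qualified_name: str) -> list[dict]:
        # Split on the first dot only, so dataset names may contain dots.
        source, dataset = qualified_name.split(".", 1)
        return list(self._connectors[source].read(dataset))


warehouse = InMemoryConnector("warehouse", {"orders": [{"id": 1}]})
lake = InMemoryConnector("lake", {"events": [{"id": 7}, {"id": 8}]})
catalog = Catalog([warehouse, lake])
print(catalog.list_datasets())
```

Each connector can be added, removed, or scaled without touching the others, while consumers see a single namespace.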

2. Data Product

Transform data models into domain-oriented datasets. Owned by data owners, these datasets can be shared and governed through open APIs, with the flexibility of interchangeable metadata and access rules, so that data speaks your business language.
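
As a rough illustration of "business language", a data product can be modeled as metadata that maps physical columns to business-facing names. Everything in this sketch is hypothetical: the `DataProduct` class, the `raw.ord_tbl` table, and the column names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical domain-oriented dataset: an owner plus a business-name mapping."""
    name: str
    owner: str
    source_table: str
    columns: dict[str, str]  # business name -> physical column

    def to_sql(self) -> str:
        """Render a SELECT that exposes physical columns under business names."""
        select_list = ", ".join(
            f'{physical} AS "{business}"'
            for business, physical in self.columns.items()
        )
        return f"SELECT {select_list} FROM {self.source_table}"


# Illustrative product; table and column names are assumptions.
customer_orders = DataProduct(
    name="customer_orders",
    owner="sales-domain-team",
    source_table="raw.ord_tbl",
    columns={"customer_id": "cust_id", "order_amount": "amt_usd"},
)
print(customer_orders.to_sql())
```

Since the mapping is plain metadata, it can be swapped or extended without changing the underlying table.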

3. Data Authorization

A consistent data authorization framework spans data sources through to data applications and integrates with existing Identity and Access Management (IAM), keeping authorization consistent across data sources, IAM, and data applications.
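
Consistency here can be read as "one authorization decision, reused by every endpoint". The following sketch assumes a hypothetical `GROUP_GRANTS` table keyed by IAM group names and two made-up endpoints; in a real deployment the groups would come from the IAM system rather than a hard-coded dictionary.

```python
# Hypothetical mapping from IAM groups to the datasets each group may read.
GROUP_GRANTS: dict[str, set[str]] = {
    "finance-analysts": {"sales.orders", "sales.refunds"},
    "ml-engineers": {"sales.orders"},
}


def is_authorized(user_groups: list[str], dataset: str) -> bool:
    """Single decision point: true if any of the user's groups grants the dataset."""
    return any(dataset in GROUP_GRANTS.get(group, set()) for group in user_groups)


def sql_endpoint(user_groups: list[str], dataset: str) -> str:
    """Illustrative SQL-facing endpoint that defers to the shared check."""
    if not is_authorized(user_groups, dataset):
        raise PermissionError(f"access to {dataset} denied")
    return f"SELECT * FROM {dataset}"


def rest_endpoint(user_groups: list[str], dataset: str) -> dict:
    """Illustrative REST-facing endpoint that defers to the same check."""
    if not is_authorized(user_groups, dataset):
        return {"status": 403}
    return {"status": 200, "dataset": dataset}


print(rest_endpoint(["ml-engineers"], "sales.orders"))
print(rest_endpoint(["ml-engineers"], "sales.refunds"))
```

Because both endpoints call `is_authorized`, a change to a group's grants takes effect everywhere at once instead of being re-implemented per tool.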

4. Data Consumption

Data consumers can generate queries and APIs by declaring their intent and contextual settings. These are applied to the corresponding datasets, and the results are delivered to the target consumers, where the final analytics are performed and displayed.

[Figure: Canner structure]

The Result

With a data access layer, enterprises can significantly reduce data complexity and communication overhead while improving productivity.

  1. Data access from days to minutes: Reduce data integration cost by 60% with up-to-date data delivery.
  2. Reduce duplicate datasets: Create masked and filtered datasets without physically moving data.
  3. Achieve self-service analytics: Improve data productivity across analytical and operational data applications and tools.

No reproduction without permission. If authorized, please indicate the source.
