It’s the end of 2023! I'm excited, and a little overwhelmed, by how many recent blog posts talk about the “Semantic Layer”… 2023 has definitely been a big year for the “Semantic Layer”.
The Definite Future
Earlier in 2023, there was big news about dbt's acquisition. Since then, the data ecosystem has been eagerly waiting for dbt's next move into the semantic layer. In October 2023, at dbt Coalesce, dbt announced its next-generation "dbt semantic layer," which lays out a vision of what the semantic layer will look like. Recently, Jen Grant, the COO of Cube, dubbed 2023 the year of the semantic layer in a post. She stated that "While Cube and the semantic layer have been around for a long time, it was only in 2023 when the stand-alone semantic layer category turned from a cool idea to a necessary ingredient in any data stack."
The macro statistics agree: Google Trends shows that the term “Semantic Layer” started to gain global attention in 2022 and reached its peak so far in 2023. We expect 2024 to be even more exciting for the “Semantic Layer”!
Not only the US, but also Asia
Many US-based companies have shared their customers’ enthusiasm around the “Semantic Layer”.
At Canner, we see the same trend not only in the US market but also in Asia. This year, we were fortunate to implement our solution for banks, insurance companies, manufacturers, and gaming and retail companies across the Asia region. From delivering Canner Enterprise to these clients, we can see that the “Semantic Layer” is still at an early stage, with great potential in the coming years.
Many new initiatives and innovations are still under way at Canner, so today I would like to share our vision and focus for the “Semantic Layer” heading into 2024.
Our 5 Key Focus Areas for Innovating the Universal Semantic Layer in 2024
1. Query Virtualization Compatibility
The semantic layer applies the same fundamental logic across all data applications, including BI, AI, and advanced analytics tools. Semantic Layer solutions achieve this with technology like Query Virtualization, which is an essential part of a semantic layer. Through virtualization, enterprise data consumers can access data in a vendor-agnostic way. Behind the scenes, the semantic layer automatically rewrites queries into source-specific SQL, optimizes them, and pushes them down to each source.
As the semantic layer moves into mainstream adoption, the compatibility problem across both traditional and modern sources and data applications must be solved gracefully. At Canner, we already support many sources and data applications that can connect to the Semantic Layer: sources like Informix, Sybase, Oracle, and BigQuery, and applications such as SSIS, SSAS, SAS, Tableau, Power BI, and Metabase.
In 2024, we will continue to expand compatibility across data applications and sources.
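To make the query rewriting idea concrete, here is a minimal Python sketch, not Canner's implementation. It takes one vendor-agnostic query and rewrites only the row-limiting clause for a few of the sources named above. The `LogicalQuery` class and the `to_source_sql` function are hypothetical names made up for this illustration, and a real virtualization engine would also handle types, functions, joins, and optimization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalQuery:
    """A vendor-agnostic query, as a data consumer might express it (hypothetical)."""
    table: str
    columns: tuple[str, ...]
    limit: int


def to_source_sql(query: LogicalQuery, dialect: str) -> str:
    """Rewrite the logical query using the row-limiting syntax of one source."""
    cols = ", ".join(query.columns)
    if dialect == "bigquery":
        return f"SELECT {cols} FROM {query.table} LIMIT {query.limit}"
    if dialect == "oracle":
        # Row-limiting clause available in Oracle 12c and later.
        return f"SELECT {cols} FROM {query.table} FETCH FIRST {query.limit} ROWS ONLY"
    if dialect == "sybase":
        return f"SELECT TOP {query.limit} {cols} FROM {query.table}"
    if dialect == "informix":
        return f"SELECT FIRST {query.limit} {cols} FROM {query.table}"
    raise ValueError(f"unsupported dialect: {dialect}")


if __name__ == "__main__":
    q = LogicalQuery(table="orders", columns=("order_id", "amount"), limit=10)
    for dialect in ("bigquery", "oracle", "sybase", "informix"):
        print(dialect, "->", to_source_sql(q, dialect))
```

The consumer only ever writes the logical query; which dialect it becomes is decided by the layer, which is what keeps applications vendor-agnostic.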
2. Context-aware Metrics Access
We believe the future of metrics is dynamic and composable, which means that metrics should be aware of their context. This allows metrics to be composed automatically for different users or queries, without the need to create views and tables. To achieve composability, a metrics interface for SQL and APIs is required.
We need a metrics interface that acts as a well-defined shared boundary between metrics producers and data consumers. Semantic models are defined using the Model Definition Language (MDL), which allows the data team to expose a consistent, strongly typed, and extensible interface. The client-driven architecture retrieves data tailored to users' needs without requiring them to know the data structures behind the metrics. It prunes and filters down to the correct data, gracefully decoupling computation logic from data serving. This enforces data consistency and improves performance by greatly reducing over-fetching in application tooling.
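As a rough illustration of composing a metric per request, the Python sketch below defines a metric once and compiles a different query depending on which dimensions and filters the consumer asks for, so only the requested slice is fetched. This is not MDL syntax: the `Metric` class, its fields, and `compile_metric` are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A metric defined once by the data team (hypothetical structure, not MDL)."""
    name: str
    expression: str             # aggregate expression, e.g. "SUM(amount)"
    source: str                 # model the metric is computed from
    dimensions: frozenset[str]  # dimensions consumers may group or filter by


def compile_metric(metric, group_by=(), filters=None):
    """Compose SQL for one request; returns (sql, params) with ? placeholders."""
    filters = filters or {}
    unknown = (set(group_by) | set(filters)) - metric.dimensions
    if unknown:
        raise ValueError(f"not exposed by {metric.name}: {sorted(unknown)}")

    select = list(group_by) + [f"{metric.expression} AS {metric.name}"]
    sql = f"SELECT {', '.join(select)} FROM {metric.source}"
    if filters:
        # Filter values travel as parameters instead of being inlined in the SQL.
        sql += " WHERE " + " AND ".join(f"{dim} = ?" for dim in filters)
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql, list(filters.values())


if __name__ == "__main__":
    revenue = Metric("revenue", "SUM(amount)", "orders",
                     frozenset({"region", "order_month"}))
    # Two consumers, two contexts, one definition.
    print(compile_metric(revenue))
    print(compile_metric(revenue, group_by=("region",),
                         filters={"order_month": "2023-12"}))
```

Because the aggregate expression lives only in the metric definition, every consumer gets the same computation, and no view or table has to be created for each combination of dimensions.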
3. Governance in Business Collaboration
Governance is crucial for any data-related tools used by enterprise customers. The Semantic Layer plays a unique role in the modern enterprise data stack by bridging the gap between data and business teams. This means that many complex governance issues can only be resolved through collaboration between these two teams.
In the future, the governance of the Semantic Layer will be fully integrated with workflow processes, including data-sharing processes and approval workflows between departments and cross-functional teams. This integration will build on basic functions that are also incorporated into the Semantic Layer, such as audit trails, granular data access control, lineage, notifications, and data policies. As a result, teams and departments can collaborate seamlessly without switching between different platforms.
4. Enable AI in Data Analytics
According to recent studies, the Semantic Layer can help prevent AI hallucinations. To achieve this, the semantic layer must provide adequate context and semantics for the data, and define standard data and metric definitions across sources, so that the AI is given the information it needs.
Giving AI direct access to raw data can lead to security vulnerabilities. To mitigate this, SQL can be generated through the semantic layer, which ensures that granular access control policies are applied.
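The idea of routing AI-generated requests through the semantic layer can be sketched as follows. This is a simplified Python illustration under stated assumptions, not a description of Canner's product: the `Policy` class, the `POLICIES` table, the role and model names, and `governed_sql` are all hypothetical. The AI agent never writes SQL against raw tables; it asks for a model and columns, and the layer checks the request and appends the row filter before any SQL exists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Access policy for one (role, model) pair (hypothetical structure)."""
    allowed_columns: frozenset
    row_filter: Optional[str] = None  # predicate added to every query, if set


# Hypothetical policy table: analysts see APAC customers, and never emails.
POLICIES = {
    ("analyst", "customers"): Policy(
        allowed_columns=frozenset({"customer_id", "region", "segment"}),
        row_filter="region = 'APAC'",
    ),
}


def governed_sql(role, model, columns):
    """Turn a request from an AI agent into SQL only if the policy allows it."""
    policy = POLICIES.get((role, model))
    if policy is None:
        raise PermissionError(f"role {role!r} has no access to {model!r}")
    denied = [c for c in columns if c not in policy.allowed_columns]
    if denied:
        raise PermissionError(f"columns not allowed for {role!r}: {denied}")

    # Model and column names have been checked against the policy above.
    sql = f"SELECT {', '.join(columns)} FROM {model}"
    if policy.row_filter:
        sql += f" WHERE {policy.row_filter}"
    return sql


if __name__ == "__main__":
    print(governed_sql("analyst", "customers", ["customer_id", "region"]))
```

In this sketch a request for a column outside the policy, such as an email address, raises `PermissionError` instead of reaching the source.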
5. Open-source Semantic Layer
We believe that openness is crucial for the ecosystem, and some people in the community are calling for open standards for the semantic layer. We plan to open-source our core, the Open Semantic Layer. Our goal is to standardize queries across BI, data science, AI, and advanced analysis tools through open definition standards and a query engine. This will help enterprises solve the inconsistency that arises when the same data is defined differently in different application tools.
Our plan includes three main design components: Semantic Modeling, Semantic Definition Language, and Standard Protocol. Engineers can define the Semantic Layer, including metric definitions, in a syntax similar to GraphQL. In addition, we provide a self-service UI for non-technical personnel to establish semantic standards across different application tools. We will also provide a standard SQL interface over the PostgreSQL wire protocol, which most data applications can query natively, as well as an API accessible from any application tool, ensuring that all applications can be managed under a unified data governance framework.
We're excited to get started in 2024 and beyond! We will share more details in upcoming blog posts and sharing sessions. See you there!
No reproduction without permission. If authorized, please indicate the source.