Data Vault Ensemble Modeling

Building a Real-time Cloud Data Warehouse in Four Simple Steps

Remco Broekmans, Genesee Academy

Learn how to model, load, and discover insights from a real-time cloud data warehouse in four simple steps. This session focuses on the modeling part (Step 1) and shows the other three steps. The presentation draws on several events at which the modeling, building, streaming, and analyzing were all completed within four hours. The complete program of these events was as follows:

  • Step 1: Using bike sharing data, we will design a data warehouse data model via the Data Vault modeling pattern, the next-generation data modeling methodology. This design is agile-friendly, supports massively parallel ETL, and by its nature leads to a fast turnaround on requirements. 80% of data warehouses in the Netherlands are implemented using the Data Vault approach.
  • Step 2: We will show how the Data Vault data model is deployed on Snowflake, including a virtual data mart for reporting and dashboards. Snowflake is a true data warehouse as a service, purpose-built for the cloud. With its scalability, decoupling of storage and compute, and a pay-per-use billing model, Snowflake is the #1 cloud data platform.
  • Step 3: We will explain how to stream data from our SQL Server source database into the new Data Vault on Snowflake using Fivetran. Fivetran is a cloud-based heterogeneous data pipeline that supports batch processing or streaming, adjusts to source schema changes automatically, and provides a wide range of connectors.
  • Step 4: We will load the data into the DataChemist Knowledge Graph to find and visualize hidden connections and derive new insights. The DataChemist Knowledge Graph acquires and integrates information and applies a reasoner to infer new knowledge. With DataChemist you can easily traverse vast datasets and answer the connection-oriented questions that matter. DataChemist’s Knowledge Graph dovetails with the Data Vault data model.
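
To make the modeling pattern in Step 1 more concrete, the following is a minimal Python sketch of how one bike sharing record might be split into Data Vault structures: hubs (business keys), a link (a relationship between hubs), and a satellite (descriptive attributes). All table names, column names, and the sample record are hypothetical and are not taken from the session material. Hashing the business keys is a common Data Vault 2.0 convention assumed here, not something the abstract prescribes; MD5 is used purely for illustration.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(*business_keys: str) -> str:
    """Build a deterministic key from one or more business keys.

    Keys are trimmed and upper-cased so that the same business key
    always yields the same hash, regardless of source formatting.
    """
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def to_vault_rows(record: dict, record_source: str, load_ts: datetime) -> dict:
    """Split one flat source record into hub, link, and satellite rows.

    The field names (bike_id, station_id, station_name) are assumptions
    made for this sketch only.
    """
    bike_hk = hash_key(record["bike_id"])
    station_hk = hash_key(record["station_id"])
    meta = {"load_ts": load_ts, "record_source": record_source}
    return {
        # Hubs: one row per unique business key.
        "hub_bike": {"bike_hk": bike_hk, "bike_id": record["bike_id"], **meta},
        "hub_station": {
            "station_hk": station_hk,
            "station_id": record["station_id"],
            **meta,
        },
        # Link: the relationship between the two hubs.
        "link_bike_station": {
            "bike_station_hk": hash_key(record["bike_id"], record["station_id"]),
            "bike_hk": bike_hk,
            "station_hk": station_hk,
            **meta,
        },
        # Satellite: descriptive context, with a hash_diff for change detection.
        "sat_station": {
            "station_hk": station_hk,
            "station_name": record["station_name"],
            "hash_diff": hash_key(record["station_name"]),
            **meta,
        },
    }


sample = {"bike_id": "B-1001", "station_id": "S-42", "station_name": "Central Park"}
rows = to_vault_rows(sample, "bikeshare.trips", datetime.now(timezone.utc))
for table, row in rows.items():
    print(table, row)
```

Because every key in this sketch is computed from the source record alone, the hub, link, and satellite rows can be produced without looking up keys in other tables. That independence is one reason the pattern is often described as friendly to parallel loading.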

Remco Broekmans is the vice president of international programs for Genesee Academy. He works in Business Intelligence and Enterprise Data Warehousing as a trainer, advisor, and speaker with a focus on modeling and architecture, including Ensemble and Data Vault modeling. He works internationally and is based in the Netherlands.

Over the last 15 years, Remco has taught a variety of classes on business intelligence and data warehousing. He worked for several consulting companies before starting his own companies, Coarem and BI Academy.

Specialties: Information Management and Modeling, Ensemble Modeling, Data Vault Modeling, Agile Data Warehousing, Education and Business Development.