Data Vault Ensemble Modeling

Using the Chebotko Method to Design Sound and Scalable Data Models for Apache Cassandra

Dr. Artem Chebotko, DataStax

Apache Cassandra is a leading open-source distributed database for modern cloud applications. It is known for low-latency reads and writes, linear scalability to very large datasets and millions of transactions per second, always-on availability, and seamless multi-datacenter support. Data modeling is the primary challenge in adopting Apache Cassandra in your data layer. In this talk, we present a rigorous and practical data modeling approach that ensures sound and efficient Apache Cassandra database schema design. Using a selected use case, we demonstrate key techniques for designing conceptual, logical, and physical data models for a Cassandra database. You will learn about the Apache Cassandra data model, its query language, the query-driven data modeling methodology, data model visualization techniques, and relevant tools.

Dr. Artem Chebotko is a Solutions Architect at DataStax, Inc. His core expertise is in data modeling, data management, data mining, and data analytics. For over 15 years, he has led and participated in research and development projects on NoSQL, graph, XML, relational, and provenance databases. His current focus is on distributed data management technologies, including Apache Cassandra, Apache Spark, Apache Solr, Apache Kafka, Apache TinkerPop, and DataStax Enterprise. He is the inventor of the Big Data Modeling Methodology for Apache Cassandra and the author of over 50 peer-reviewed research and technical papers published in international journals and conference proceedings. He is an educator with extensive experience in both industry and academic training. He received his Ph.D. in Computer Science from Wayne State University in 2008.