Data Vault Ensemble Modeling

Bigger is often Better: Data Modeling in the Age of Big Data

Dr. Raja Sooriamurthi, Carnegie Mellon University

Since the early 2000s the nature of data has morphed. Big data is differentiated from traditional data in terms of many ‘V’s, the three most prominent of which are volume, velocity, and variety. This raises some foundational questions. For example, when we process data at the terabyte and petabyte scale and beyond, what fundamental shift occurs in our approach to solving problems? Given the fast transmission and computational speeds of current systems, what new capabilities are enabled by processing huge amounts of data in real time? Estimates are that more than 90% of the world’s data is not structured (i.e., not in classical relational databases amenable to SQL queries). What types of new actionable insights are facilitated by the processing of semi-structured (e.g., CSV, JSON) and unstructured (e.g., text, images, audio) data? In this talk, we’ll discuss the nature of big data, various tools and techniques for harnessing its power, and how the role of the data modeler is evolving.

Dr. Raja Sooriamurthi is a Teaching Professor with the Information Systems Program at Carnegie Mellon University, Pittsburgh. His research and teaching interests span the fields of artificial intelligence and software development, with a current focus on data-driven decision making. Along with his co-authors, he has investigated a novel approach to teaching critical thinking and problem solving termed puzzle-based learning, resulting in the book Guide to Teaching Puzzle-based Learning (Springer, 2014). In addition to his university courses, Raja has taught several conference and industry workshops in the US, Australia, the Middle East (Qatar, the United Arab Emirates), and India. Since his time as a graduate student, his pedagogical efforts have been recognized with several awards for teaching excellence.