Data Vault Ensemble Modeling

Elections Are Complex: A Ballot Data Model

David Hay

In 2018, David Hay had the opportunity to be an Election Day Judge in Harris County, Texas (roughly the Houston area). In this role, he supervised the acceptance of 769 voters and their routing to booths.

Among other things, this introduced him to the origin and structure of the ballots that appeared in each booth.

Each polling location is in a mapped neighborhood “precinct”, and supports the voters living in that neighborhood. Each precinct has its own ballot template, determined by the congressional, judicial, and various legislative districts that the precinct’s map overlays. The ballot each voter sees in the booth is based on that template, and the templates differ from precinct to precinct.

The task of managing this collection of ballots with their overlapping election districts is incredibly complex. Since he has data modeling in his blood, Mr. Hay saw that the best way to make sense out of it all would be to portray it as a (conceptual) data model—which he proceeded to do.

Viewers of this presentation will have the opportunity to see that model. It is divided into the following sections:

  • Geopolitical Area – Each precinct—a neighborhood area—is a well-defined geopolitical area, as are the multiple electoral districts that are the basis for candidates for office. Among others, these include US Congressional Districts, state Senate Districts and state Legislative Districts.
  • Precinct Structure – Each voting place represents one precinct. The precinct is the place where voters register and candidates organize their campaigns. This part of the model shows how each precinct, as a geopolitical area, is related to the other geopolitical areas that overlap it.
  • Voter Registration – This involves recording information about each prospective voter. Among other things, each person is placed in a single precinct, based on their residence address.
  • Candidacy – Each candidate runs in a county, election district, or the state as a whole. Each ballot must show just those candidates relevant to a particular precinct.
  • Election – An election consists of an array of ballots, one constructed for each precinct according to the election districts that overlap that precinct.
  • Voter Action – When a registered voter votes, they use a ballot that lists the candidacies that apply to their precinct.
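
To make the relationships in these sections concrete, here is a minimal Python sketch of how a precinct's ballot could be derived from the districts that overlap it. It is an illustration only, not Mr. Hay's model: the class names, the `ballot_for` function, and all sample data are hypothetical, and the real model covers far more (voter registration, voter action, and so on).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GeopoliticalArea:
    """A precinct, election district, county, or state (hypothetical type)."""
    name: str
    kind: str


@dataclass(frozen=True)
class Candidacy:
    """One candidate running for one office in one geopolitical area."""
    candidate: str
    office: str
    area: GeopoliticalArea


@dataclass
class Precinct:
    """A precinct plus the other geopolitical areas that overlap it."""
    area: GeopoliticalArea
    overlapping_areas: set = field(default_factory=set)


def ballot_for(precinct, candidacies):
    """Select the candidacies whose area overlaps the given precinct."""
    return [c for c in candidacies if c.area in precinct.overlapping_areas]


# Made-up sample data: two precincts share a state but sit in different districts.
state = GeopoliticalArea("State", "state")
district_a = GeopoliticalArea("District A", "congressional district")
district_b = GeopoliticalArea("District B", "congressional district")

precinct_1 = Precinct(GeopoliticalArea("Precinct 1", "precinct"), {state, district_a})
precinct_2 = Precinct(GeopoliticalArea("Precinct 2", "precinct"), {state, district_b})

candidacies = [
    Candidacy("Candidate X", "Governor", state),
    Candidacy("Candidate Y", "Representative", district_a),
    Candidacy("Candidate Z", "Representative", district_b),
]

for precinct in (precinct_1, precinct_2):
    names = [c.candidate for c in ballot_for(precinct, candidacies)]
    print(precinct.area.name, names)
```

In this sketch both precincts list the statewide candidacy, but each lists only the representative race for its own district, which is why ballots differ from precinct to precinct.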

This process is conducted by two different agencies in the County, with various bits of hardware and software. Data quality? Uh, yes, that’s important. But, as always, achieving it requires a good understanding of the underlying structure of the data involved. This model is intended to contribute to that understanding.

David Hay was in fact born the year that the transistor was invented. Some thirty years ago he took up the production of data models to support strategic information planning and requirements analysis. He has worked in a variety of industries, including, among others, power generation, clinical pharmaceutical research, oil refining, banking, and broadcast. He is President of Essential Strategies International, a consulting firm dedicated to helping clients define corporate information architecture, identify requirements, and plan strategies for the implementation of new systems. 

Mr. Hay is the author of the ground-breaking 1995 book, Data Model Patterns: Conventions of Thought, as well as Requirements Analysis: From Business Views to Architecture, and Data Model Patterns: A Metadata Map.

Since then, he has written Enterprise Model Patterns: Describing the World, and UML and Data Modeling: A Reconciliation, both published by Technics Publications. Also published by Technics Publications is his latest book, Achieving Buzzword Compliance: Data Architecture Language and Vocabulary.

Mr. Hay has spoken numerous times at “The Data Modeling Zone”, annual DAMA International Conferences (both in the United States and overseas), annual conferences for various vendor user groups, and numerous local chapters of both data administration and database management system groups.

He may be reached at or (713) 464-8316. Many of his works can be found at