This project posed significant infrastructure challenges. The core data source, on the order of terabytes in size, had to be prepared for large-scale statistical analysis in a computationally efficient manner and integrated with a heterogeneous set of supplemental data sources that provide context.