County Health Explorer: Full-Stack Spatial Data Science App

Richard Donohue

🗂️ County Health Explorer

A minimalist, reproducible, full-stack spatial data science application to explore U.S. county-level health data. This project showcases backend-to-frontend integration using DuckDB, FastAPI, and vanilla JavaScript with Observable Plot for cartographically accurate mapping and statistical charting.
⚠️ Status: Backend and frontend servers are fully functional, but other features are still in development

🧱 Tech Stack

Backend

Language: Python 3.10+
Framework: FastAPI
Database: DuckDB (with spatial extension)
Spatial Libraries: GeoPandas, PySAL

Frontend

Language: Vanilla JavaScript (ES6)
Mapping & Charts: Observable Plot (UMD) with D3 projections
Projections: Albers Equal Area, which preserves the relative area of counties
UI: Plain HTML/CSS (no frameworks)
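
For readers unfamiliar with the projection named above, here is a minimal sketch of the spherical Albers Equal Area Conic forward formulas, written in Python for illustration. The standard parallels (29.5°N, 45.5°N), central meridian (96°W), and origin latitude (23°N) are values commonly used for the conterminous United States and are assumptions, not settings taken from this project. The frontend delegates projection to Observable Plot rather than computing it by hand.

```python
import math


def albers_equal_area(lon_deg, lat_deg, lon0=-96.0, lat0=23.0, lat1=29.5, lat2=45.5):
    """Project lon/lat in degrees to x, y on a unit sphere (Albers Equal Area Conic).

    All default parameters are assumed values for the conterminous U.S.
    """
    lam, phi = math.radians(lon_deg), math.radians(lat_deg)
    lam0, phi0 = math.radians(lon0), math.radians(lat0)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)

    n = (math.sin(phi1) + math.sin(phi2)) / 2        # cone constant
    c = math.cos(phi1) ** 2 + 2 * n * math.sin(phi1)
    rho = math.sqrt(c - 2 * n * math.sin(phi)) / n    # radius of the point's parallel
    rho0 = math.sqrt(c - 2 * n * math.sin(phi0)) / n  # radius of the origin parallel
    theta = n * (lam - lam0)

    x = rho * math.sin(theta)
    y = rho0 - rho * math.cos(theta)
    return x, y


if __name__ == "__main__":
    # Approximate lon/lat for Denver, used only as sample input.
    print(albers_equal_area(-104.99, 39.74))
```

Multiplying x and y by the Earth's radius gives planar distances. The projection origin maps to (0, 0), points east of the central meridian have positive x, and points north of the origin latitude have positive y.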

📁 Project Structure

county-health-explorer/
├── backend/
│   ├── app/
│   │   ├── main.py          # FastAPI application
│   │   ├── database.py      # DuckDB setup with spatial extension
│   │   ├── etl.py           # Data extraction, transformation, loading
│   │   └── routes/
│   │       ├── api.py       # Core API endpoints
│   │       └── tiles.py     # Spatial tile endpoints
│   ├── templates/           # Jinja2 HTML templates
│   └── tests/               # Test suite
├── frontend/
│   ├── css/                 # Vanilla CSS styles
│   ├── js/                  # Vanilla JavaScript modules
│   └── index.html           # Main application page
├── data/                    # Data files and DuckDB database
├── scripts/
│   ├── etl.sh               # ETL pipeline script
│   └── dev.sh               # Development setup script
└── README.md

🎯 Features

Cartographically Accurate Maps: County-level choropleth maps with Albers Equal Area projection via Observable Plot
Dynamic Variable Selection: Switch between health indicators via dropdown
Statistical Analysis: Summary statistics, spatial autocorrelation (Moran's I), correlations
Real-time Charts: Histograms and scatter plots with Observable Plot
Spatial Analysis: County neighbors and local spatial statistics
Responsive Design: Mobile-friendly interface with progressive enhancement
Performance: Optimized spatial queries and SVG rendering

🚀 Quick Start

Prerequisites

Python 3.10+
pip

Installation

Clone or navigate to the project directory
Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Install dependencies inside the virtual environment:
pip install -r requirements.txt
Run ETL pipeline (first time only):
PYTHONPATH=backend python -m app.etl

Development (Recommended)

Start both servers individually:
Terminal 1 - Backend server:
cd backend && ../venv/bin/python3 -m uvicorn app.main:app --reload --port 8000
Terminal 2 - Frontend server:
python serve_frontend.py
Alternative: Use the development script (may require debugging):
python start_dev.py

Production

Start the main application server:
cd backend && uvicorn app.main:app --port 8000

Access Points

🌐 Frontend Application: http://localhost:3000
📚 API Documentation: http://localhost:8000/docs
🔍 Health Check: http://localhost:8000/health

🔗 API Endpoints

Core Endpoints

GET /api/vars - List available health variables and metadata
GET /api/variables/categories - Variables grouped by health domain
GET /api/choropleth?var=<variable> - GeoJSON with joined data and class breaks
GET /api/stats?var=<variable> - Summary statistics (count, mean, std, min, max)
GET /api/moran?var=<variable> - Moran's I spatial autocorrelation
GET /api/corr?vars=var1,var2 - Correlation between variables

County-Specific

GET /api/counties/{fips} - Individual county details
GET /api/neighbors/{fips} - Spatial neighbors for local analysis
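
To make the link between the neighbors endpoint and Moran's I concrete, the following sketch computes global Moran's I from a neighbor list in plain Python. It assumes binary contiguity weights (1 for each neighbor pair, 0 otherwise). The project lists PySAL as a dependency, and a PySAL-based result may differ if the weights are row-standardized. The function name and input format are hypothetical.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary weights (an assumption of this sketch).

    values:    dict mapping a county id to its value
    neighbors: dict mapping a county id to a list of neighboring county ids
    """
    n = len(values)
    mean = sum(values.values()) / n
    z = {key: value - mean for key, value in values.items()}  # deviations from mean

    s0 = 0        # sum of all weights
    cross = 0.0   # sum of w_ij * z_i * z_j
    for i, nbrs in neighbors.items():
        for j in nbrs:
            if i in z and j in z:
                s0 += 1
                cross += z[i] * z[j]

    denom = sum(d * d for d in z.values())
    if s0 == 0 or denom == 0:
        raise ValueError("Moran's I is undefined without neighbors or variance")
    return (n / s0) * (cross / denom)


if __name__ == "__main__":
    # Four hypothetical counties arranged in a row: a - b - c - d
    values = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
    neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(morans_i(values, neighbors))
```

Values near +1 indicate that similar values cluster among neighbors, values near 0 suggest no spatial pattern, and negative values indicate that neighbors tend to differ.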

Example API Responses

// GET /api/stats?var=premature_death
{
  "variable": "premature_death",
  "count": 3142,
  "mean": 7605.2,
  "std": 1212.3,
  "min": 4321,
  "max": 10234
}
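
The fields in this response can be reproduced with the Python standard library. The sketch below is illustrative only: the function name is hypothetical, and the README does not say whether "std" is the sample or the population standard deviation, so the sample form is assumed here.

```python
from statistics import mean, stdev


def summarize(variable, values):
    """Build a stats payload shaped like the example response."""
    clean = [v for v in values if v is not None]  # ignore missing counties
    return {
        "variable": variable,
        "count": len(clean),
        "mean": mean(clean),
        "std": stdev(clean),  # assumption: sample std; use pstdev for population
        "min": min(clean),
        "max": max(clean),
    }


if __name__ == "__main__":
    print(summarize("premature_death", [5200, 7100, None, 8300, 6400]))
```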

// GET /api/choropleth?var=obesity
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"fips": "08031", "value": 24.3, "class": 5},
      "geometry": { ... }
    }
  ]
}
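
The "class" property above implies that each county value is assigned to a class break. The classification method is not stated in this README. The sketch below assumes quantile classification with five 1-based classes, and all function names are hypothetical.

```python
from bisect import bisect_left


def quantile_breaks(values, k=5):
    """Upper bounds of k quantile classes; the last break is the maximum."""
    ordered = sorted(values)
    n = len(ordered)
    # (i * n + k - 1) // k is the integer ceiling of i * n / k
    return [ordered[(i * n + k - 1) // k - 1] for i in range(1, k + 1)]


def classify(value, breaks):
    """Return the 1-based class whose upper bound is the first break >= value."""
    return min(bisect_left(breaks, value) + 1, len(breaks))


def to_feature(fips, value, breaks, geometry=None):
    """Assemble one GeoJSON Feature shaped like the example response."""
    return {
        "type": "Feature",
        "properties": {"fips": fips, "value": value, "class": classify(value, breaks)},
        "geometry": geometry,  # geometry would come from the county boundary data
    }


if __name__ == "__main__":
    sample = [18.2, 21.5, 24.3, 27.9, 30.1, 33.4, 35.0, 36.8, 38.2, 41.7]
    breaks = quantile_breaks(sample)
    print(breaks)
    print(to_feature("08031", 24.3, breaks))
```

Quantile breaks put roughly the same number of counties in each class, which keeps every color in the map legend in use. Other schemes such as equal interval or natural breaks would change only quantile_breaks.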

🏗️ Development

Architecture

The application is built from a small set of components:
DuckDB with spatial extension for fast analytical queries
FastAPI for RESTful API with automatic OpenAPI documentation
Jinja2 for server-side HTML templating
Vanilla JavaScript with ES6 modules for frontend interactions
Observable Plot for cartographically accurate mapping with proper projections
Observable Plot for statistical visualizations (histograms, scatter plots)

State Management

Frontend uses a central AppState object:
const AppState = {
  currentVariable: 'premature_death',
  selectedCounty: null,
  mapLoaded: false,
  projection: 'albers-usa' // Albers Equal Area projection
};

ETL Pipeline

Loads county_health.csv into DuckDB
Normalizes column names
Joins county GeoJSON via fips_code
Creates spatial WKB views
Validates output (3142 counties, data integrity)
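
The normalization and validation steps above can be sketched with the standard library alone. This is not the project's etl.py: the helper names, the exact normalization rule, and the row format are assumptions. Only the fips_code key and the expected count of 3142 counties come from this README.

```python
import re

EXPECTED_COUNTIES = 3142  # county count named in the validation step


def normalize_column(name):
    """Lowercase a header and collapse non-alphanumeric runs to underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def normalize_fips(code):
    """Zero-pad to five characters so that 8031 joins to '08031'."""
    return str(code).strip().zfill(5)


def validate(rows, expected=EXPECTED_COUNTIES):
    """Return a list of problems; an empty list means the checks passed."""
    fips = [row["fips_code"] for row in rows]
    problems = []
    if len(fips) != expected:
        problems.append(f"expected {expected} counties, found {len(fips)}")
    if len(set(fips)) != len(fips):
        problems.append("duplicate fips_code values")
    malformed = [f for f in fips if not re.fullmatch(r"\d{5}", f)]
    if malformed:
        problems.append(f"{len(malformed)} malformed fips_code values")
    return problems


if __name__ == "__main__":
    print(normalize_column("Premature Death (raw value)"))
    rows = [{"fips_code": normalize_fips(c)} for c in (8031, "01001")]
    print(validate(rows, expected=2))
```

Returning a list of problems instead of raising on the first failure lets a single ETL run report every issue at once.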

📊 Data Sources

This application utilizes data from the following authoritative sources:

Health Data

County Health Rankings & Roadmaps: https://www.countyhealthrankings.org/
Provides comprehensive county-level health indicators including health outcomes, health factors, and social determinants
2025 Annual Data Release with measures for all U.S. counties
Data includes premature death rates, preventable hospital stays, health behaviors, clinical care access, and social & economic factors

Geographic Data

U.S. Census Bureau Cartographic Boundary Files
County boundary shapefiles optimized for thematic mapping
1:500,000 scale resolution for detailed visualization
Simplified representations from the Census Bureau's MAF/TIGER geographic database

🎨 Design Principles

Minimalist: No external JS frameworks or build tools
Reproducible: Fixed seeds and idempotent ETL
Progressive: Graceful degradation without JavaScript
Cartographically Accurate: Albers Equal Area projection for proper spatial representation
Performant: Optimized spatial queries and efficient SVG rendering
Accessible: Mobile-friendly with accessible color palettes

🧪 Testing

Run the test suite:
python -m pytest backend/tests/

📈 Performance Metrics

Time-to-First-Map: <2s
API Latency: <150ms
DuckDB File Size: <50MB
Codebase: <2,000 LOC

🤝 Contributing

This project is designed for agentic development and reproducible science. Contributions should maintain:
Zero external JS frameworks
Minimal dependencies
Full test coverage for core logic
API documentation via OpenAPI

📄 License

MIT License
"Simplicity is an antidote to confusion; when the mind is clear, action is precise."
Built for public-serving clarity and repeatable science.
Posted Jul 21, 2025

Developed a full-stack app for exploring U.S. county health data using DuckDB, FastAPI, and JavaScript.