A static-site dashboard mapping every NYC restaurant and bar, built from two open data sources.
| Source | Records | API |
|---|---|---|
| DOHMH: NYC Dept. of Health Restaurant Inspection Results | ~30 k | Socrata 43nn-pn8j |
| SLA: NYS Liquor Authority Active Licenses (NYC only) | ~22 k | ArcGIS FeatureServer |
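For orientation, paging through a Socrata dataset might look like the sketch below. It is not taken from `sources/dohmh.py`: the endpoint URL, the page size, and the helper names `http_get_page` and `fetch_all` are assumptions, and the project itself lists `requests` rather than `urllib` as a dependency.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

# Assumed endpoint for dataset 43nn-pn8j on the NYC open data portal.
SOCRATA_URL = "https://data.cityofnewyork.us/resource/43nn-pn8j.json"


def http_get_page(limit: int, offset: int) -> list[dict]:
    """Fetch one page of rows using the SODA $limit/$offset parameters."""
    query = urllib.parse.urlencode(
        {"$limit": limit, "$offset": offset, "$order": ":id"}  # stable order for paging
    )
    with urllib.request.urlopen(f"{SOCRATA_URL}?{query}", timeout=60) as resp:
        return json.load(resp)


def fetch_all(
    get_page: Callable[[int, int], list[dict]] = http_get_page,
    page_size: int = 50_000,  # hypothetical page size
) -> Iterator[dict]:
    """Yield every row, requesting pages until a short page signals the end."""
    offset = 0
    while True:
        page = get_page(page_size, offset)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size
```

Passing `get_page` as a parameter keeps the paging loop testable without network access.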
Sources are pluggable: drop a new `DataSource` subclass into `sources/` and
it is automatically discovered at build time.
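One common way to implement this kind of discovery is to import every module in the package and then ask the base class for its subclasses. The sketch below illustrates that idea only; the function name `discover` and its signature are hypothetical, and the actual logic in `sources/base.py` may differ.

```python
import importlib
import pkgutil


def discover(package_name: str, base: type) -> list[type]:
    """Import every module in a package and return the direct subclasses of base."""
    package = importlib.import_module(package_name)
    for info in pkgutil.iter_modules(package.__path__):
        # Importing a module runs its class definitions, which registers
        # each subclass with the base class.
        importlib.import_module(f"{package_name}.{info.name}")
    # Only direct subclasses are returned; sorted by name for a stable order.
    return sorted(base.__subclasses__(), key=lambda cls: cls.__name__)
```

Under these assumptions, a call such as `discover("sources", DataSource)` would return one class per source module.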
Venues appearing in both datasets are merged into a single marker with
source="both". The pipeline runs three passes:
- Exact address + borough: normalised addresses (suffix canonicalization, ordinal stripping, AKA/unit removal) are compared by borough.
- Address range containment: SLA range addresses like `"77 79 HUDSON ST"` match a DOHMH entry at `77 HUDSON STREET` if the number falls within the range. Queens block-lot addresses (`30-12 20TH AVE`) are excluded.
- Geo-proximity ≤ 30 m: remaining unmatched venues within 30 metres are merged using a grid-based spatial index. Catches typos, AKA addresses, and other text variations.
Before merging, DOHMH venues are deduplicated:
- First by `camis` (unique restaurant ID; the dataset has one row per inspection).
- Then by normalised name + BIN (Building Identification Number) to collapse duplicate registrations in the same building.
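The two deduplication steps could be sketched roughly as follows. The field names (`camis`, `dba`, `bin`, `inspection_date`) are assumed from the public dataset, and two choices are assumptions rather than documented behaviour: keeping the most recent inspection per `camis`, and leaving rows without a BIN uncollapsed.

```python
def dedupe(rows: list[dict]) -> list[dict]:
    """Collapse inspection rows to one row per venue."""
    # Step 1: one row per camis, keeping the latest inspection
    # (ISO date strings compare correctly as plain strings).
    latest: dict[str, dict] = {}
    for row in rows:
        kept = latest.get(row["camis"])
        if kept is None or row["inspection_date"] > kept["inspection_date"]:
            latest[row["camis"]] = row

    # Step 2: one row per (normalised name, BIN); rows with no BIN
    # fall back to their camis so they are never merged by name alone.
    by_building: dict[tuple, dict] = {}
    for row in latest.values():
        name = " ".join(row["dba"].upper().split())
        key = (name, row["bin"]) if row.get("bin") else ("camis", row["camis"])
        by_building.setdefault(key, row)
    return list(by_building.values())
```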
Typical result: ~51 k raw → ~37 k post-merge, with ~14 k merged pairs. Exact counts are shown in the sidebar's Pipeline table.
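To make the address-range and geo-proximity passes of the merge more concrete, here is a minimal sketch of both. It is not the code in `build.py`: the suffix table is a small illustrative subset, the cell size is a hypothetical choice, and all function names are invented for this example.

```python
import math
import re
from collections import defaultdict

# Illustrative subset of suffix canonicalisation.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD", "ROAD": "RD"}
# 0.0005 degrees is roughly 55 m north-south and about 42 m east-west at NYC's
# latitude, so any point within 30 m lies in the same or an adjacent cell.
CELL_DEG = 0.0005


def normalise_street(street: str) -> str:
    """Upper-case, canonicalise suffixes, strip ordinals (20TH becomes 20)."""
    tokens = [SUFFIXES.get(t, t) for t in street.upper().split()]
    return " ".join(re.sub(r"^(\d+)(ST|ND|RD|TH)$", r"\1", t) for t in tokens)


def range_contains(range_addr: str, single_addr: str) -> bool:
    """Range pass: does an address of the form LO HI STREET contain N STREET?"""
    rng = re.match(r"^(\d+)\s+(\d+)\s+(\S.*)$", range_addr.strip())
    one = re.match(r"^(\d+)\s+(\S.*)$", single_addr.strip())
    if not rng or not one:
        return False  # hyphenated Queens numbers such as 30-12 never match
    lo, hi = sorted((int(rng.group(1)), int(rng.group(2))))
    same_street = normalise_street(rng.group(3)) == normalise_street(one.group(2))
    return same_street and lo <= int(one.group(1)) <= hi


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6_371_000.0 * math.asin(math.sqrt(a))


def cell(lat: float, lon: float) -> tuple[int, int]:
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def build_index(points):
    """Bucket (id, lat, lon) tuples by grid cell."""
    grid = defaultdict(list)
    for pid, lat, lon in points:
        grid[cell(lat, lon)].append((pid, lat, lon))
    return grid


def nearest_within(grid, lat: float, lon: float, radius_m: float = 30.0):
    """Proximity pass: closest indexed point within radius_m, or None."""
    cx, cy = cell(lat, lon)
    best = None
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for pid, plat, plon in grid.get((cx + dx, cy + dy), ()):
                d = haversine_m(lat, lon, plat, plon)
                if d <= radius_m and (best is None or d < best[1]):
                    best = (pid, d)
    return best
```

Checking only the 3×3 block of cells around a query point avoids comparing every SLA venue against every DOHMH venue.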
- Leaflet 1.9 with CARTO light base tiles and MarkerCluster
- Grade-letter markers (A/B/C/P/Z) for DOHMH & merged venues
- Martini-glass markers for SLA-only venues
- Purple merged markers showing both names in the popup
- Spiderfying on every zoom level for overlapping markers
- Filter by source, cuisine tag, and search by name
- Cache-busted `venues.js` via SHA-256 content hash
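Content-hash cache busting can be illustrated with a few lines of Python. This is a sketch only: the helper names, the 12-character truncation, and the `?v=` query-string form are assumptions, and the build may embed the hash differently (for example in the file name).

```python
import hashlib


def content_hash(data: bytes, length: int = 12) -> str:
    """Return the first `length` hex characters of the SHA-256 digest."""
    return hashlib.sha256(data).hexdigest()[:length]


def cache_busted_url(filename: str, data: bytes) -> str:
    """Append the content hash so the URL changes whenever the content does."""
    return f"{filename}?v={content_hash(data)}"
```

Because the hash depends only on the file content, an unchanged `venues.js` keeps its URL across rebuilds and stays cached in the browser.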
nyc-eats/
├── build.py                 # Static site generator (fetch → merge → render)
├── Makefile                 # Build, deploy, cron targets
├── requirements.txt         # Python deps (requests, jinja2)
├── nginx.conf.in            # nginx config template (values from .env)
├── .env.example             # Server-specific settings template
├── cron/
│   ├── nyc-eats-refresh              # Refresh script (runs on server)
│   ├── nyc-eats-refresh.service.in   # systemd service template
│   ├── nyc-eats-refresh.timer        # systemd timer unit
│   └── README.md                     # Cron/timer setup instructions
├── sources/
│   ├── base.py              # Venue dataclass + DataSource ABC + auto-discovery
│   ├── dohmh.py             # DOHMH Socrata source
│   └── sla.py               # SLA ArcGIS FeatureServer source
├── templates/
│   └── index.html           # Jinja2 template (Leaflet map + sidebar)
├── static/
│   ├── style.css            # Dashboard styles
│   └── favicon.svg          # 🍴 favicon
└── dist/                    # Generated site (git-ignored)
# Clone & set up
git clone https://github.com/YOUR_USER/nyc-eats.git
cd nyc-eats
cp .env.example .env # edit with your server details
make install # creates .venv, installs deps
# Build (fetches live data, ~30s)
make build
# Or use cached data if already fetched in the last 24h
make build-cached
# Local dev server
make serve # http://localhost:8000
The site can be deployed to any VPS with nginx. Server-specific settings
(hostname, paths) live in .env (see .env.example).
make deploy # build + rsync + reload nginx
make deploy-only # rsync + reload (skip rebuild)
# On the VPS, create the serving directory:
mkdir -p ~/your-domain.com/dist
# Symlink nginx config (done automatically by `make deploy`)
sudo ln -sf ~/your-domain.com/nginx.conf \
/etc/nginx/sites-enabled/your-domain.com.conf
sudo nginx -t && sudo systemctl reload nginx
# HTTPS
sudo certbot --nginx -d your-domain.com
The data is refreshed every Sunday at 3:00 AM ET via a systemd timer on the
server. See cron/README.md for full details.
make timer-install # uploads units + enables timer on VPS
The refresh script lives at cron/nyc-eats-refresh;
it rebuilds the site and rsyncs the output to the serving directory.
Logs go to $VPS_PATH/refresh.log.
- Create `sources/my_source.py`
- Subclass `DataSource` (from `sources.base`)
- Implement `name`, `description` properties and `fetch() → list[Venue]`
- Run `make build`; it's auto-discovered
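A new source following these steps might look like the sketch below. So that the example runs on its own, `Venue` and `DataSource` are defined here as simplified stand-ins; the real classes come from `sources.base` and their fields and signatures may differ. `MySource` and its hard-coded venue are purely illustrative.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Venue:  # stand-in; the real dataclass lives in sources/base.py
    name: str
    lat: float
    lon: float
    source: str


class DataSource(ABC):  # stand-in for the ABC in sources/base.py
    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def description(self) -> str: ...

    @abstractmethod
    def fetch(self) -> list[Venue]: ...


class MySource(DataSource):
    """Hypothetical source that returns a single hard-coded venue."""

    @property
    def name(self) -> str:
        return "my_source"

    @property
    def description(self) -> str:
        return "Example source with one hard-coded venue"

    def fetch(self) -> list[Venue]:
        return [Venue("Example Cafe", 40.7200, -74.0000, self.name)]
```

In the real project, `sources/my_source.py` would import `DataSource` and `Venue` from `sources.base` instead of redefining them.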
Unlicensed