jareklupinski/nyc-eats

NYC Eats 🍴

A static-site dashboard mapping every NYC restaurant and bar, built from two open data sources.

Leaflet + MarkerCluster map · Python 3.12+

Data sources

Source | Records | API
DOHMH: NYC Dept. of Health Restaurant Inspection Results | ~30 k | Socrata 43nn-pn8j
SLA: NYS Liquor Authority Active Licenses (NYC only) | ~22 k | ArcGIS FeatureServer

Sources are pluggable: drop a new DataSource subclass into sources/ and it is automatically discovered at build time.
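One common way to get this kind of drop-in discovery is to import every module in the package and then ask the base class for its subclasses. The sketch below assumes that approach; the function name discover_sources is hypothetical, and the real logic in sources/base.py may work differently.

```python
import importlib
import pkgutil


def discover_sources(package_name: str = "sources") -> list[type]:
    """Import every module in the package and return the DataSource subclasses found.

    Assumes the package has a `base` module that defines `DataSource`,
    as the project layout suggests. Only direct subclasses are returned.
    """
    package = importlib.import_module(package_name)
    base = importlib.import_module(f"{package_name}.base")

    # Importing a module runs its class definitions, which registers
    # each subclass with its parent class.
    for info in pkgutil.iter_modules(package.__path__):
        importlib.import_module(f"{package_name}.{info.name}")

    return sorted(base.DataSource.__subclasses__(), key=lambda cls: cls.__name__)
```

With this pattern, adding a file such as sources/my_source.py is enough for its class to show up in the returned list; no registry needs to be edited by hand.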

Merge pipeline

Venues appearing in both datasets are merged into a single marker with source="both". The pipeline runs three passes:

  1. Exact address + borough: normalised addresses (suffix canonicalization, ordinal stripping, AKA/unit removal) are compared by borough.
  2. Address range containment: SLA range addresses like "77 79 HUDSON ST" match a DOHMH entry at 77 HUDSON STREET if the number falls within the range. Queens block-lot addresses (30-12 20TH AVE) are excluded.
  3. Geo-proximity ≤ 30 m: remaining unmatched venues within 30 metres are merged using a grid-based spatial index. This catches typos, AKA addresses, and other text variations.
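The three passes can be sketched roughly as follows. This is a minimal illustration, not the code in build.py: the function names, the tiny suffix table, and the flat-earth projection are all assumptions, AKA/unit removal is omitted, and the range check ignores odd/even street sides.

```python
import math
import re
from collections import defaultdict

# Hypothetical, deliberately small suffix table; real rules would be longer.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD", "ROAD": "RD"}
M_PER_DEG_LAT = 111_320.0  # approximate metres per degree of latitude


def normalize_address(addr: str) -> str:
    """Pass 1 helper: upper-case, drop punctuation, strip ordinals, shorten suffixes."""
    tokens = re.sub(r"[^A-Z0-9\- ]", " ", addr.upper()).split()
    tokens = [re.sub(r"^(\d+)(ST|ND|RD|TH)$", r"\1", t) for t in tokens]
    return " ".join(SUFFIXES.get(t, t) for t in tokens)


def range_matches(range_addr: str, single_addr: str) -> bool:
    """Pass 2: does '77 79 HUDSON ST' contain '77 HUDSON STREET'?

    Hyphenated Queens-style numbers such as '30-12' never match the
    two-number pattern, so they are excluded automatically.
    """
    m = re.match(r"^\s*(\d+)\s+(\d+)\s+(.+)$", range_addr.upper())
    s = re.match(r"^\s*(\d+)\s+(.+)$", single_addr.upper())
    if not m or not s:
        return False
    low, high = sorted((int(m.group(1)), int(m.group(2))))
    same_street = normalize_address(m.group(3)) == normalize_address(s.group(2))
    return same_street and low <= int(s.group(1)) <= high


def to_metres(lat: float, lon: float, ref_lat: float = 40.7) -> tuple[float, float]:
    """Equirectangular projection around NYC's latitude; adequate at city scale."""
    x = lon * M_PER_DEG_LAT * math.cos(math.radians(ref_lat))
    return x, lat * M_PER_DEG_LAT


def nearby_pairs(points_a, points_b, radius_m: float = 30.0) -> list[tuple[int, int]]:
    """Pass 3: return (i, j) where points_a[i] lies within radius_m of points_b[j]."""
    grid = defaultdict(list)  # cell -> [(index, x, y)], cell size equals the radius
    for j, (lat, lon) in enumerate(points_b):
        x, y = to_metres(lat, lon)
        grid[(int(x // radius_m), int(y // radius_m))].append((j, x, y))

    pairs = []
    for i, (lat, lon) in enumerate(points_a):
        x, y = to_metres(lat, lon)
        cx, cy = int(x // radius_m), int(y // radius_m)
        # Any point within radius_m must sit in the same or an adjacent cell.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j, bx, by in grid.get((cx + dx, cy + dy), ()):
                    if math.hypot(x - bx, y - by) <= radius_m:
                        pairs.append((i, j))
    return pairs
```

Because the grid cell size equals the search radius, each lookup touches at most nine cells instead of comparing every venue against every other venue. A real merge step would also need to pick a single best partner per venue; this sketch simply returns every candidate pair.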

Before merging, DOHMH venues are deduplicated:

  • First by camis (unique restaurant ID; the dataset has one row per inspection).
  • Then by normalised name + BIN (Building Identification Number) to collapse duplicate registrations in the same building.
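The two deduplication steps could look something like the sketch below. The Row class and its field names are hypothetical stand-ins for a DOHMH record, the name normalisation is simplified, and leaving rows without a BIN uncollapsed is an assumption rather than documented behaviour.

```python
import re
from dataclasses import dataclass


@dataclass
class Row:
    """Hypothetical stand-in for one DOHMH inspection row."""
    camis: str
    name: str
    bin: str  # Building Identification Number; may be empty


def normalize_name(name: str) -> str:
    """Upper-case, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^A-Z0-9 ]", "", name.upper()).split())


def dedupe(rows: list[Row]) -> list[Row]:
    # Step 1: one entry per CAMIS (keeps the first inspection row seen).
    by_camis: dict[str, Row] = {}
    for row in rows:
        by_camis.setdefault(row.camis, row)

    # Step 2: collapse duplicate registrations sharing a name and a building.
    # Rows without a BIN get a unique 3-tuple key so they are never collapsed.
    by_name_bin: dict[tuple, Row] = {}
    for row in by_camis.values():
        key = (normalize_name(row.name), row.bin) if row.bin else ("", "", row.camis)
        by_name_bin.setdefault(key, row)
    return list(by_name_bin.values())
```

For example, two inspection rows for CAMIS 1 plus a second registration of the same name in the same building reduce to a single venue, while the same name in a different building is kept.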

Typical result: ~51 k raw → ~37 k post-merge, with ~14 k merged pairs. Exact counts are shown in the sidebar's Pipeline table.

Map features

  • Leaflet 1.9 with CARTO light base tiles and MarkerCluster
  • Grade-letter markers (A/B/C/P/Z) for DOHMH & merged venues
  • Martini-glass markers for SLA-only venues
  • Purple merged markers showing both names in the popup
  • Spiderfying on every zoom level for overlapping markers
  • Filter by source, cuisine tag, and search by name
  • Cache-busted venues.js via SHA-256 content hash
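Content-hash cache busting can be sketched as below. The helper names, the 12-character digest length, and the use of a ?v= query string (rather than a hashed filename) are assumptions; the README only states that a SHA-256 content hash is used.

```python
import hashlib


def content_hash(data: bytes, length: int = 12) -> str:
    """Return a short hex prefix of the SHA-256 digest of the file contents."""
    return hashlib.sha256(data).hexdigest()[:length]


def cache_busted_url(filename: str, data: bytes) -> str:
    """Build a URL that changes whenever the file contents change."""
    return f"{filename}?v={content_hash(data)}"
```

Since the hash depends only on the bytes of venues.js, browsers keep their cached copy until a rebuild actually changes the data, at which point the URL changes and the new file is fetched.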

Project structure

nyc-eats/
├── build.py              # Static site generator (fetch → merge → render)
├── Makefile              # Build, deploy, cron targets
├── requirements.txt      # Python deps (requests, jinja2)
├── nginx.conf.in         # nginx config template (values from .env)
├── .env.example          # Server-specific settings template
├── cron/
│   ├── nyc-eats-refresh            # Refresh script (runs on server)
│   ├── nyc-eats-refresh.service.in # systemd service template
│   ├── nyc-eats-refresh.timer      # systemd timer unit
│   └── README.md                   # Cron/timer setup instructions
├── sources/
│   ├── base.py           # Venue dataclass + DataSource ABC + auto-discovery
│   ├── dohmh.py          # DOHMH Socrata source
│   └── sla.py            # SLA ArcGIS FeatureServer source
├── templates/
│   └── index.html        # Jinja2 template (Leaflet map + sidebar)
├── static/
│   ├── style.css         # Dashboard styles
│   └── favicon.svg       # 🍴 favicon
└── dist/                 # Generated site (git-ignored)

Quick start

# Clone & set up
git clone https://github.com/YOUR_USER/nyc-eats.git
cd nyc-eats
cp .env.example .env     # edit with your server details
make install              # creates .venv, installs deps

# Build (fetches live data, ~30s)
make build

# Or use cached data if already fetched in the last 24h
make build-cached

# Local dev server
make serve            # http://localhost:8000

Deployment

The site can be deployed to any VPS with nginx. Server-specific settings (hostname, paths) live in .env (see .env.example).

make deploy           # build + rsync + reload nginx
make deploy-only      # rsync + reload (skip rebuild)

Server setup (one-time)

# On the VPS, create the serving directory:
mkdir -p ~/your-domain.com/dist

# Symlink nginx config (done automatically by `make deploy`)
sudo ln -sf ~/your-domain.com/nginx.conf \
  /etc/nginx/sites-enabled/your-domain.com.conf
sudo nginx -t && sudo systemctl reload nginx

# HTTPS
sudo certbot --nginx -d your-domain.com

Automated refresh

The data is refreshed every Sunday at 3:00 AM ET via a systemd timer on the server. See cron/README.md for full details.

make timer-install    # uploads units + enables timer on VPS

The refresh script lives at cron/nyc-eats-refresh. It rebuilds the site and rsyncs the output to the serving directory. Logs go to $VPS_PATH/refresh.log.

Adding a data source

  1. Create sources/my_source.py
  2. Subclass DataSource (from sources.base)
  3. Implement the name and description properties and fetch() → list[Venue]
  4. Run make build; the new source is auto-discovered
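A new source might look roughly like the sketch below. To keep the example self-contained, it defines stand-in versions of Venue and DataSource; in the project these would be imported from sources.base, and their actual fields and signatures may differ. FoodTruckSource and its sample data are purely hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Venue:
    """Stand-in for the Venue dataclass; the real fields may differ."""
    name: str
    address: str
    lat: float
    lon: float
    source: str


class DataSource(ABC):
    """Stand-in for the DataSource ABC, based on the steps listed above."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def description(self) -> str: ...

    @abstractmethod
    def fetch(self) -> list[Venue]: ...


class FoodTruckSource(DataSource):
    """Hypothetical source; a real one would call an open-data API in fetch()."""

    @property
    def name(self) -> str:
        return "food_trucks"

    @property
    def description(self) -> str:
        return "Example food truck permits (illustrative data only)"

    def fetch(self) -> list[Venue]:
        # Hard-coded sample record in place of a network request.
        return [Venue("Sample Truck", "1 EXAMPLE ST", 40.7128, -74.0060, self.name)]
```

In a real sources/my_source.py, the two stand-in classes would be replaced by a single import from sources.base, leaving only the subclass in the file.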

License

Unlicensed

About

nyc-eats provides an unbiased map of all the places to get a bite in this one horse town 🐴
