Building production-grade ML and simulation systems for large-scale supply chain operations — not just notebooks.
At Amazon, I own end-to-end systems that drive real planning decisions:
Demand Forecasting: Replaced per-lane time-series models with multivariate XGBoost/LightGBM ensembles across 10+ regions. Features are engineered from shipment flows, lead time distributions (mean + P95), and event proximity signals. MAPE improved from 5.2% to 3.1%, informing $450K+ in inventory allocation decisions.
THANOS (Network Simulation): Monte Carlo simulation platform modelling stochastic demand, capacity distributions, and routing logic across hundreds of supply chain lanes. Produces risk distributions over network outcomes, rather than point estimates, to stress-test strategic plans. Informed 3 FC expansion decisions ($2M+ projected impact).
RAG-Based Operational Intelligence: Built a retrieval system over 100K+ heterogeneous operational documents using hybrid retrieval (dense embeddings + BM25) and hierarchical chunking. Reduced decision error rate by 40%. Designed a three-dimensional evaluation framework (retrieval precision, stakeholder adoption rate, decision velocity), adopted org-wide as the standard for LLM system validation.
Workforce Spillover Optimization: Reformulated shift staffing as a stochastic allocation problem. Applied asymmetric cost weighting to derive quantile-based targets for high-variance periods.
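
To illustrate how asymmetric cost weighting leads to a quantile-based target, here is a minimal sketch using the standard newsvendor critical ratio. It is not the production formulation: the function names, the cost values, and the sampled demand are all hypothetical, chosen only to show the idea.

```python
import numpy as np

def critical_quantile(under_cost: float, over_cost: float) -> float:
    """Quantile that minimises expected cost when understaffing costs
    `under_cost` per unit and overstaffing costs `over_cost` per unit
    (the classic newsvendor critical ratio)."""
    return under_cost / (under_cost + over_cost)

def staffing_target(demand_samples, under_cost: float, over_cost: float) -> float:
    """Staffing level set at the cost-optimal quantile of sampled demand."""
    q = critical_quantile(under_cost, over_cost)
    return float(np.quantile(demand_samples, q))

# Hypothetical example: understaffing is assumed to be 3x as costly as
# overstaffing, so the target sits at the 75th percentile of demand.
samples = np.arange(1, 101)  # stand-in for simulated or historical demand
target = staffing_target(samples, under_cost=3.0, over_cost=1.0)
```

When understaffing is the more expensive error, the target moves above the median, which is why high-variance periods get a more conservative staffing level than a mean forecast would suggest.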
End-to-end NLP pipeline for task detection and responsible-person
assignment in email threads. BERT sentence-pair classifier fine-tuned
on the Enron EPA dataset. LR baseline F1=0.925, DistilBERT F1=0.703.
Served via FastAPI.
PyTorch BERT HuggingFace FastAPI Scikit-learn
Multi-lane demand forecasting simulator — XGBoost/LightGBM vs
baselines, with shipment flow, lead time (mean + P95), and event
proximity features. Walk-forward evaluation at daily and weekly
(cycle-level) granularity.
XGBoost LightGBM Statsmodels Python
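
As a rough illustration of walk-forward evaluation, the sketch below scores a forecaster on successive holdout windows with an expanding training set. It is a simplified stand-in, not the simulator's code: the helper names, the last-value baseline, and the window sizes are assumptions, and a gradient-boosted model would plug in through the same `fit_predict` callable.

```python
import numpy as np

def mape(actual, forecast) -> float:
    """Mean absolute percentage error, in percent (assumes non-zero actuals)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

def walk_forward_splits(n_obs: int, initial_train: int, horizon: int):
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    start = initial_train
    while start + horizon <= n_obs:
        yield np.arange(0, start), np.arange(start, start + horizon)
        start += horizon

def evaluate(series, fit_predict, initial_train: int, horizon: int):
    """Score `fit_predict(train, horizon)` on each walk-forward fold."""
    series = np.asarray(series, dtype=float)
    scores = []
    for train_idx, test_idx in walk_forward_splits(len(series), initial_train, horizon):
        forecast = fit_predict(series[train_idx], horizon)
        scores.append(mape(series[test_idx], forecast))
    return scores

def naive_last_value(train, horizon: int):
    """Baseline: repeat the last observed value over the forecast horizon."""
    return np.full(horizon, train[-1])

# Hypothetical daily series evaluated in weekly (7-day) folds.
daily = 100.0 + 10.0 * np.sin(np.arange(56) / 7.0)
fold_scores = evaluate(daily, naive_last_value, initial_train=28, horizon=7)
```

Because each fold trains only on data that precedes its test window, the scores reflect how the model would have performed if deployed at that point in time.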
Monte Carlo simulation engine for supply chain network planning. Models
stochastic demand (log-normal), transit disruptions, and capacity
constraints. Compares constrained vs unconstrained scenarios via cost
and spillover risk distributions.
Monte Carlo NumPy SciPy NetworkX
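
The core idea can be sketched for a single lane as follows. This is a deliberately reduced illustration under stated assumptions: one lane, log-normal demand parameterised by a hypothetical mean and coefficient of variation, a fixed capacity, and no transit disruptions or network routing. The function name and all numbers are invented for the example.

```python
import numpy as np

def simulate_spillover(mean_demand, cv, capacity, n_trials=10_000, seed=0):
    """Sample log-normal demand and return (demand, spillover) arrays.

    Spillover is the demand that exceeds capacity in each trial.
    """
    rng = np.random.default_rng(seed)
    # Convert the mean and coefficient of variation into the parameters
    # of the underlying normal distribution.
    sigma2 = np.log(1.0 + cv**2)
    mu = np.log(mean_demand) - 0.5 * sigma2
    demand = rng.lognormal(mean=mu, sigma=np.sqrt(sigma2), size=n_trials)
    spillover = np.maximum(demand - capacity, 0.0)
    return demand, spillover

# Constrained scenario (hypothetical capacity) vs unconstrained scenario.
demand, spillover = simulate_spillover(mean_demand=1000.0, cv=0.3, capacity=1200.0)
_, unconstrained = simulate_spillover(mean_demand=1000.0, cv=0.3, capacity=np.inf)

summary = {
    "p_spill": float(np.mean(spillover > 0)),          # probability of any spillover
    "spill_p95": float(np.quantile(spillover, 0.95)),  # tail spillover volume
}
```

Reporting the probability and tail size of spillover, instead of a single expected value, is what makes the constrained and unconstrained scenarios comparable as risk distributions.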
| Domain | Tools |
|---|---|
| ML & Modeling | XGBoost, LightGBM, PyTorch, TensorFlow, Scikit-learn |
| LLMs & AI | RAG Pipelines, LangChain, HuggingFace Transformers, Prompt Engineering |
| Simulation | Monte Carlo, SciPy, NumPy, NetworkX |
| MLOps | SageMaker, MLflow, Docker, Model Monitoring |
| Data | Python, SQL (PostgreSQL, Redshift), A/B Testing, Causal Inference |
| Cloud | AWS (SageMaker, Lambda, S3, Redshift, EC2) |

| Metric | Description |
|---|---|
| $2M+ | Projected annual impact from THANOS network planning |
| $450K+ | Inventory allocation decisions informed by forecasting system |
| 40% | Relative MAPE reduction (5.2% → 3.1%) on demand forecasting |
| 40% | Error rate reduction on RAG-based decision support |
| 8+ | Production ML models owned end-to-end |
| 10K+ | Weekly supply chain events handled via automated NLP triaging |
Building ML systems is easy. Building ML systems people trust is the real work.
