Language Model Loss Captures Mutual Intelligibility Gradients in Turkic Languages
Moldir Baidildinova, Shiva Upadhye, Austin Wagner (UC Irvine)
- Accepted at HSP 2026 (39th Annual Conference on Human Sentence Processing, MIT) and Tu+11 (11th Workshop on Turkic and Languages in Contact with Turkic, MIT)
- Character-level LSTM trained on 6 Turkic languages + Finnish (control), all transliterated to broad IPA
- Built on Turkic API, a rule-based IPA transliterator whose phonetic rules are cited from peer-reviewed research
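As a rough illustration of the idea (not the project's code), the sketch below scores how well a character model trained on one language predicts text in another, using mean per-character cross-entropy. It swaps the LSTM for a smoothed character bigram model so it runs with the standard library alone. The function names and the toy strings are hypothetical placeholders, not real IPA data.

```python
import math
from collections import Counter


def train_bigram(text):
    """Count character bigrams and their left contexts in the training text."""
    pairs = Counter(zip(text, text[1:]))
    contexts = Counter(text[:-1])
    return pairs, contexts


def cross_entropy(model, text, vocab_size):
    """Mean per-character loss in bits, using add-one smoothing."""
    pairs, contexts = model
    total = 0.0
    for prev, cur in zip(text, text[1:]):
        p = (pairs[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total -= math.log2(p)
    return total / (len(text) - 1)


# Toy placeholder strings (assumption): one "training" language and two
# "test" languages, one similar to it and one less similar.
train_text = "abababab"
near_text = "ababab"
far_text = "bbaabb"
vocab_size = len(set(train_text + near_text + far_text))

model = train_bigram(train_text)
loss_near = cross_entropy(model, near_text, vocab_size)
loss_far = cross_entropy(model, far_text, vocab_size)
```

Under this setup, a lower loss on the test text means the model trained on the source language predicts it more easily, which is the quantity the paper relates to mutual intelligibility.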
Plant Metabolomics & BVOCs
- Research on biogenic volatile organic compounds from tree leaf and root tissues across 9 California field sites
- GC-MS data processing pipeline: Tree Bot — peak detection, co-elution analysis, compound classification
- Interactive Metabolomics Dashboard — Welch's t-test with FDR correction, D3.js visualizations
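A minimal sketch of the statistics named above, assuming the FDR correction is Benjamini-Hochberg (the README does not say which procedure the dashboard uses). It computes Welch's t statistic with Welch-Satterthwaite degrees of freedom, then adjusts a list of p-values. Turning the t statistic into a p-value needs a t-distribution CDF, which is left to a statistics library. Function names are illustrative, not the dashboard's API.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With one test per compound, a compound would be flagged when its adjusted p-value falls below the chosen FDR level, for example 0.05.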
All services live in my API monorepo — 13 microservices, 21 shared libraries, FastAPI + Redis + RQ architecture.
All Services (13) & Shared Libraries (21)
See the full API monorepo for every service, including Data Bank, Music Wrapped, Transcript API, Handwriting AI, QR API, GitHub Stats, Opportunity Radar, and Procart API, plus the 21 shared libraries covering ML, NLP, workers, persistence, and more.
Interactive data dashboards at austinwagner.org — see the Dashboards repo.
- City council data for 34 Orange County cities: YAML-driven, SQLite-backed, auto-rebuilt with GitHub Actions
- OC Flock Safety ALPR surveillance research: FOIA documents, 19 cities, 5M+ plate scans
Contributor to microsoft/LightGBM — 9 PRs merged across Python type safety, C++ networking fixes, and test coverage.
| PR | Description |
|---|---|
| #7137 | Fix socket timeout on POSIX systems (wrong type for setsockopt) |
| #7131 | Fix test_numeric_split_direction to test all parameter combinations |
| #7133 | Add test for Booster.rollback_one_iter() |
| #7130 | Fix numpy integer cast in plot_importance |
| #7115–7119 | Type annotations — TypeGuard, Literal, DTypeLike, return types for sklearn predict |

