ML Systems & Edge AI Engineer | Building production AI that runs on real hardware, not slides.
I spend most of my time optimizing models until they stop complaining: cutting FP32 bloat down to efficient INT8 inference, squeezing detection pipelines onto edge accelerators, and architecting memory systems that let AI agents reason without token-bombing the context window.
3 years in production. Currently at NXP, ex-OnePlus. Published at ICPR and ACCV. Now more into agentic AI, LLM reasoning, and context management infrastructure.
- Model Optimization: PTQ, QAT, mixed precision, deployment-aware quantization
- Edge Deployment: NPUs, ONNX, TensorRT, hardware-aware inference pipelines
- LLM & Agentic Infrastructure: Memory systems, context management, reasoning workflows
- Computer Vision: Object detection, classification, segmentation (mostly everything)
- MemoryClaw: Open-source hierarchical memory layer for AI agents. Four-tier model (recent, important, consolidated, search index) with hybrid keyword + vector retrieval. Cuts token overhead by avoiding brute-force context dumps. Built for the OpenClaw framework. Started as a personal itch that became useful to others.
- Maestro: Interactive voice AI tutor that generates real-time visual aids (mind maps, diagrams, timelines) during lessons. Warm stone/amber design, horizontal scroll interface. Companion effects library (maestro-effects) for dynamic visual rendering.
- More coming soon.
ICPR · ACCV: Computer vision publications. Detection and classification on constrained hardware. The work that came before "everyone" decided AI was easy.
ONNX Runtime · TensorFlow · CUDA optimization · Hardware profiling



