Writing

Engineering notes, mostly about systems and applied ML. I try to name what broke, not only what worked.

Building a Real-Time Bus Prediction System for Madison Metro
2026-03-04
Live ML that corrects the transit API's ETAs with a 47-feature XGBoost model and Mondrian conformal prediction, retrained nightly behind a hard deploy gate.
machine learning, conformal prediction, transit

Deploying RAG in AWS Bedrock: Benchmarking 9 LLMs on the WattBot Challenge
2026-02-17
Ensemble majority voting beat every individual model. The highest-citation model finished last. A serverless RAG pipeline on Bedrock with full cost tracking.
RAG, AWS Bedrock, LLM eval

Building a Speculative Decoding Engine from Scratch
2026-02-12
Custom Triton kernels, tree-structured attention, four bugs, and an honest negative result: 0.66x baseline. The full arc from 0.08x to 0.66x and what it taught me.
CUDA, speculative decoding, inference