Projects
Things I've built.
Futbol Report — Multi-Model LLM Comparison
The same soccer-digest prompt, sent to four LLMs (Claude, Kimi, Qwen, Gemma) with identical web-search context and rendered side by side. A live eval of how each model handles the same task, compared on format adherence, context filtering, and information density.
Learn more →