Best AI News — Updated Every 3 Hours

We fit a 24M-parameter LLM into 15MB with per-row MSE quantization

Via r/LocalLlama
Wednesday, Mar 25, 2026 · 3:27AM
Summary

Working on OpenAI's Parameter Golf challenge (train the best LLM you can that fits in 16MB), we hit Top-3 on the leaderboard. The quantization trick: instead of clipping every weight row at one fixed percentile before INT8 quantization, we search 5 candidate clip values per row and keep whichever gives the lowest reconstruction MSE. This costs 5x the quantization compute.
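A minimal sketch of the per-row search described above, assuming symmetric INT8 quantization and percentile-based clip candidates (the exact candidate set and scheme used by the authors are not given, so these are illustrative choices):

```python
import numpy as np

def quantize_row_mse(row, percentiles=(99.0, 99.5, 99.9, 99.99, 100.0)):
    """Try several clip thresholds for one weight row and keep the one
    with the lowest INT8 reconstruction MSE."""
    best = None
    for p in percentiles:
        clip = np.percentile(np.abs(row), p)
        if clip == 0.0:
            clip = 1e-8  # avoid division by zero for all-zero rows
        scale = clip / 127.0
        # Symmetric quantize to int8, then dequantize to measure error.
        q = np.clip(np.round(row / scale), -127, 127).astype(np.int8)
        recon = q.astype(np.float32) * scale
        mse = float(np.mean((row - recon) ** 2))
        if best is None or mse < best[0]:
            best = (mse, q, np.float32(scale))
    return best[1], best[2], best[0]

def quantize_matrix(W):
    """Apply the per-row clip search to a full weight matrix.

    Returns the int8 matrix and one float32 scale per row, which is
    the per-row metadata the 16MB budget has to absorb."""
    qs, scales = [], []
    for row in W:
        q, s, _ = quantize_row_mse(row)
        qs.append(q)
        scales.append(s)
    return np.stack(qs), np.array(scales, dtype=np.float32)
```

Searching 5 candidates per row is what makes quantization 5x slower; the win is that outlier-heavy rows get a tighter clip (lower percentile) while well-behaved rows keep full range, and the search can never do worse than any single fixed percentile.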

Read at r/LocalLlama
www.reddit.com