Best AI News — Updated Every 3 Hours

Designed a photonic chip for O(1) KV cache block selection — 944x faster, 18,000x less energy than GPU scan at 1M context

Via r/LocalLlama
Monday, Mar 23, 2026 · 12:17PM
Summary

I’m a nanophotonics PhD student, and I think photonic chips can solve the KV cache scanning bottleneck. Block-sparse methods like Quest and RocketKV reduce the number of blocks fetched, but they still scan all N block signatures from HBM on every decode step. That scan is O(N): at 1M context on an H100, it takes ~8.5μs per query.
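
To make the O(N) scan concrete, here is a minimal software sketch of block selection in the style of Quest's per-block min/max key signatures, as I understand that method. The function names, the list-based data layout, and the block size are all assumptions for illustration; this is not the poster's chip design, and it is not how any GPU kernel is actually written.

```python
from heapq import nlargest

def build_signatures(keys, block_size):
    """Summarize each block of key vectors by its per-dimension min and max.

    `keys` is a list of equal-length float lists (hypothetical layout).
    """
    sigs = []
    for start in range(0, len(keys), block_size):
        block = keys[start:start + block_size]
        lo = [min(col) for col in zip(*block)]
        hi = [max(col) for col in zip(*block)]
        sigs.append((lo, hi))
    return sigs

def upper_bound(query, sig):
    """Upper bound on the dot product of `query` with any key in the block."""
    lo, hi = sig
    return sum(max(q * l, q * h) for q, l, h in zip(query, lo, hi))

def select_blocks(query, sigs, k):
    """Score every signature (the O(N) scan), then keep the k best blocks."""
    scores = [upper_bound(query, sig) for sig in sigs]  # touches all N signatures
    return nlargest(k, range(len(sigs)), key=scores.__getitem__)
```

The list comprehension in `select_blocks` is the part that grows with context length: every decode step reads all N signatures before any block is fetched, and that read is the step the post proposes to replace.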
