
Request: Training a pretrained, MoE version of Mistral Nemo

Via r/LocalLlama

Tuesday, Mar 24, 2026 · 9:45AM

Summary

I converted Mistral Nemo from a dense model into a sixteen-expert MoE model, available at https://huggingface.co/blascotobasco/Mistral-NeMoE-12B-16E
The core problem is that I am a student with budget constraints and can’t afford full-parameter or extended fine-tuning. I did my best to restore coherence, and it

