If you run Ollama, vLLM, TGI, or any custom model server that loads and unloads models, you've probably seen RSS creep up over hours until the Linux OOM killer kills the process. It's not a Python leak. It's not PyTorch. It's glibc's heap allocator fragmenting and not returning freed pages to the OS. Fix: export MALLO