A German research team lets Transformer models decide for themselves how many times they think about a problem. Combined with additional memory, the approach outperforms larger models on math problems. The article Math needs thinking time, everyday knowledge needs memory, and a new Transfor