Our company uses Hugging Face TGI as the default engine on AWS SageMaker AI. I have had really bad experiences with TGI compared to my home setup using llama.cpp and vLLM. I just saw that Hugging Face has ended new development of TGI: https://huggingface.co/docs/text-generation-inference/index There were deba