llama.cpp
is a library for performing LLM inference with minimal setup and state-of-the-art performance on a wide variety of hardware, both locally and in the cloud.
It also supports various multimodal models (e.g., LLaVA and BakLLaVA).
It has bindings for several programming languages (Python, Go, PHP, Ruby, etc.).
It is also used by various UI tools, including Ollama and LM Studio.
LM Studio allows users to install and run LLMs on their own computers.
llama.cpp has the following features: