llama.cpp
is a library for performing LLM inference with minimal setup and state-of-the-art performance on a wide variety of hardware, both locally and in the cloud.
It also supports various multimodal models (e.g., LLaVA and BakLLaVA).
It has bindings for several programming languages (Python, Go, PHP, Ruby, etc.).
It is also used by various UI tools, including Ollama and LM Studio.
LM Studio allows users to install and run LLMs on their own computers.
llama.cpp has the following features: