Local + cloud LLM gateway

Stop playing with AI
and start engineering it.

Lattis is a desktop manager and gateway for LLMs. One local endpoint speaks both the OpenAI and Anthropic APIs, routing every request to local or cloud models — with usage and cost you can actually account for.

Download Read the docs

Open source · macOS, Linux, Windows · Rust + iced

~/lattis

$ curl localhost:1234/v1/chat/completions \
  -d '"model": "qwen3-4b-instruct-2507", ...'   # local GGUF

$ curl localhost:1234/v1/messages \
  -d '"model": "claude-opus-4-8", ...'          # cloud, OAuth

$ curl localhost:1234/v1/chat/completions \
  -d '"model": "gpt-5.5", ...'                   # cloud, API key

→ one endpoint · OpenAI + Anthropic APIs · local + cloud

What it does

A gateway, not another playground.

One endpoint, two dialects

A single local HTTP server speaks both the OpenAI and Anthropic APIs. Point any existing SDK or tool at localhost — no code changes.

Local models, resident

Download GGUF models from Hugging Face and serve them with a bundled llama-server in router mode. Keep several resident; swap on demand.

Cloud, when you want it

Connect Anthropic and OpenAI / Codex via OAuth or API key. Requests are translated between OpenAI, Anthropic, and Responses formats automatically.

MLX on Apple Silicon

Detects Apple's mlx_lm if installed and serves MLX models alongside the rest — Metal-accelerated, hidden when unavailable.

Usage you can account for

Per-app, per-model, and per-project token usage with estimated cost. Know exactly what each project and provider is spending.

Runs as a daemon

The daemon owns storage, downloads, the inference engine, and cloud connections. Launch on login and it serves the API 24/7 — app optional.

How it routes

One hub. Every model.

Behind a single endpoint, Lattis routes each request to the right engine — a resident local GGUF, an MLX model on Apple Silicon, or a connected cloud provider — translating API formats as needed so your client never has to care.

See the API →

Get going

Three steps to a local AI backbone.

01

Install

Grab the desktop app (coming soon) or build from source.
02

Add a model

Download a GGUF from the Library, or connect a cloud provider.
03

Point & ship

Aim your client at localhost:1234 and build.

Get Lattis Quick start

Stop playing with AIand start engineering it.