Local + cloud LLM gateway

Stop playing with AI
and start engineering it.

Lattis is a desktop manager and gateway for LLMs. One local endpoint speaks both the OpenAI and Anthropic APIs, routing every request to local or cloud models — with usage and cost you can actually account for.

Open source · macOS, Linux, Windows · Rust + iced

What it does

A gateway, not another playground.

One endpoint, two dialects

A single local HTTP server speaks both the OpenAI and Anthropic APIs. Point any existing SDK or tool at localhost — no code changes.

Local models, resident

Download GGUF models from Hugging Face and serve them with a bundled llama-server in router mode. Keep several resident; swap on demand.

Cloud, when you want it

Connect Anthropic and OpenAI / Codex via OAuth or API key. Requests are translated between OpenAI, Anthropic, and Responses formats automatically.

MLX on Apple Silicon

Detects Apple's mlx_lm if installed and serves MLX models alongside the rest — Metal-accelerated, hidden when unavailable.

Usage you can account for

Per-app, per-model, and per-project token usage with estimated cost. Know exactly what each project and provider is spending.

Runs as a daemon

The daemon owns storage, downloads, the inference engine, and cloud connections. Launch on login and it serves the API 24/7 — app optional.

How it routes

One hub. Every model.

Behind a single endpoint, Lattis routes each request to the right engine — a resident local GGUF, an MLX model on Apple Silicon, or a connected cloud provider — translating API formats as needed so your client never has to care.

See the API →

Get going

Three steps to a local AI backbone.

  1. 01

    Install

    Grab the desktop app (coming soon) or build from source.

  2. 02

    Add a model

    Download a GGUF from the Library, or connect a cloud provider.

  3. 03

    Point & ship

    Aim your client at localhost:1234 and build.