GoModel - AI Gateway in Go

# GoModel

**Repository Path**: one_check/GoModel

## Basic Information

- **Project Name**: GoModel
- **Description**: No description available
- **Primary Language**: Go
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-06-16
- **Last Updated**: 2026-06-16

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

<p align="center">
  <img alt="GoModel logo" src="docs/logo.svg" width="96">
</p>

<h1 align="center">
  GoModel - AI Gateway in Go
</h1>

<p align="center">
  <a href="https://github.com/ENTERPILOT/GoModel/actions/workflows/test.yml"><img alt="CI" src="https://github.com/ENTERPILOT/GoModel/actions/workflows/test.yml/badge.svg"></a>
  <a href="https://github.com/ENTERPILOT/GoModel/blob/main/go.mod"><img alt="GO Version" src="https://img.shields.io/github/go-mod/go-version/ENTERPILOT/GoModel?label=GO"></a>
  <a href="https://hub.docker.com/r/enterpilot/gomodel"><img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/enterpilot/gomodel?label=Docker%20Pulls"></a>
  <a href="https://discord.gg/gaEB9BQSPH"><img alt="Discord" src="https://img.shields.io/badge/Discord-Join-5865F2?logo=discord&logoColor=white"></a>
</p>

<p align="center">
  <a href="https://news.ycombinator.com/item?id=47849097"><img alt="Hacker News" src="https://img.shields.io/badge/Hacker%20News-Apr%2021%20%2726%20%7C%20%234-brightgreen?logo=ycombinator&logoColor=white"></a>
  <a href="https://gomodel.enterpilot.io/docs"><img alt="docs GoModel" src="https://img.shields.io/badge/Docs-GoModel-blue"></a>
</p>

<p align="center">
  <a href="https://news.ycombinator.com/item?id=47849097"><img alt="GoModel on Hacker News" src="https://hackerbadge.vercel.app/api?id=47849097"></a>
</p>

<p align="center">
  A fast and lightweight AI gateway written in Go, providing a unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, DeepSeek, xAI, Groq, OpenRouter, Z.ai, Azure OpenAI, Oracle, Ollama, and more.
</p>

<a href="docs/dashboard.gif">
  <img src="docs/dashboard.gif" alt="GoModel AI gateway dashboard showing AI usage analytics, observability panel, token and costs tracking, and estimated cost monitoring" width="100%">
</a>

## Quick Start with Docker

**Step 1:** Start GoModel container

```bash
docker run --rm -p 8080:8080 \
  -e LOGGING_ENABLED=true \
  -e LOGGING_LOG_BODIES=true \
  -e LOG_FORMAT=text \
  -e LOGGING_LOG_HEADERS=true \
  -e OPENAI_API_KEY="your-openai-key" \
  enterpilot/gomodel
```

Pass only the provider credentials or base URL you need (at least one required):

```bash
docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY="your-openai-key" \
  -e ANTHROPIC_API_KEY="your-anthropic-key" \
  -e GEMINI_API_KEY="your-gemini-key" \
  -e VERTEX_PROJECT="your-gcp-project" \
  -e VERTEX_LOCATION="us-central1" \
  -e VERTEX_AUTH_TYPE="gcp_adc" \
  -e DEEPSEEK_API_KEY="your-deepseek-key" \
  -e GROQ_API_KEY="your-groq-key" \
  -e OPENROUTER_API_KEY="your-openrouter-key" \
  -e ZAI_API_KEY="your-zai-key" \
  -e XAI_API_KEY="your-xai-key" \
  -e AZURE_API_KEY="your-azure-key" \
  -e AZURE_BASE_URL="https://your-resource.openai.azure.com/openai/deployments/your-deployment" \
  -e AZURE_API_VERSION="2024-10-21" \
  -e ORACLE_API_KEY="your-oracle-key" \
  -e ORACLE_BASE_URL="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/v1" \
  -e ORACLE_MODELS="openai.gpt-oss-120b,xai.grok-3" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434/v1" \
  -e VLLM_BASE_URL="http://host.docker.internal:8000/v1" \
  enterpilot/gomodel
```

⚠️ Avoid passing secrets via `-e` on the command line - they can leak via shell history and process lists. For production, use `docker run --env-file .env` to load API keys from a file instead.

**Step 2:** Make your first API call

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-chat-latest",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

**That's it!** GoModel automatically detects which providers are available based on the credentials you supply.

### Supported LLM Providers

Example model identifiers are illustrative and subject to change; consult provider catalogs for current models. Feature columns reflect gateway API support, not every individual model capability exposed by an upstream provider.

| Provider                             | Credential                                                        | Example Model                              | Chat | `/responses` | Embed | Files | Batches | Passthru |
| ------------------------------------ | ----------------------------------------------------------------- | ------------------------------------------ | :--: | :----------: | :---: | :---: | :-----: | :------: |
| OpenAI                               | `OPENAI_API_KEY`                                                  | `gpt-5.5`                                  |  ✅  |      ✅      |  ✅   |  ✅   |   ✅    |    ✅    |
| Anthropic                            | `ANTHROPIC_API_KEY`                                               | `claude-sonnet-4-20250514`                 |  ✅  |      ✅      |  ❌   |  ❌   |   ✅    |    ✅    |
| Google Gemini                        | `GEMINI_API_KEY`                                                  | `gemini-2.5-flash`                         |  ✅  |      ✅      |  ✅   |  ✅   |   ✅    |    ❌    |
| Vertex AI                            | `VERTEX_PROJECT` + `VERTEX_LOCATION`                              | `google/gemini-2.5-flash`                  |  ✅  |      ✅      |  ✅   |  ❌   |   ❌    |    ❌    |
| DeepSeek                             | `DEEPSEEK_API_KEY`                                                | `deepseek-v4-pro`                          |  ✅  |      ✅      |  ❌   |  ❌   |   ❌    |    ✅    |
| Groq                                 | `GROQ_API_KEY`                                                    | `llama-3.3-70b-versatile`                  |  ✅  |      ✅      |  ✅   |  ✅   |   ✅    |    ❌    |
| OpenRouter                           | `OPENROUTER_API_KEY`                                              | `google/gemini-2.5-flash`                  |  ✅  |      ✅      |  ✅   |  ✅   |   ✅    |    ✅    |
| Z.ai                                 | `ZAI_API_KEY` (`ZAI_BASE_URL` optional)                           | `glm-5.1`                                  |  ✅  |      ✅      |  ✅   |  ❌   |   ❌    |    ✅    |
| xAI (Grok)                           | `XAI_API_KEY`                                                     | `grok-4`                                   |  ✅  |      ✅      |  ✅   |  ✅   |   ✅    |    ❌    |
| Alibaba Cloud Model Studio (Bailian) | `BAILIAN_API_KEY` (`BAILIAN_BASE_URL` optional)                   | `qwen3-max`                                |  ✅  |      ✅      |  ✅   |  ✅   |   ✅    |    ✅    |
| MiniMax                              | `MINIMAX_API_KEY` (`MINIMAX_BASE_URL` optional)                   | `MiniMax-M3`                               |  ✅  |      ✅      |  ✅   |  ❌   |   ❌    |    ✅    |
| Xiaomi MiMo                          | `XIAOMI_API_KEY` (`XIAOMI_BASE_URL` optional)                     | `mimo-v2.5-pro`                            |  ✅  |      ✅      |  ❌   |  ❌   |   ❌    |    ✅    |
| OpenCode Go                          | `OPENCODE_GO_API_KEY` (`OPENCODE_GO_BASE_URL` optional)           | `glm-5.1`                                  |  ✅  |      ✅      |  ❌   |  ❌   |   ❌    |    ❌    |
| Azure OpenAI                         | `AZURE_API_KEY` + `AZURE_BASE_URL` (`AZURE_API_VERSION` optional) | `gpt-5`                                    |  ✅  |      ✅      |  ✅   |  ✅   |   ✅    |    ✅    |
| Oracle                               | `ORACLE_API_KEY` + `ORACLE_BASE_URL`                              | `openai.gpt-oss-120b`                      |  ✅  |      ✅      |  ❌   |  ❌   |   ❌    |    ❌    |
| Ollama                               | `OLLAMA_BASE_URL`                                                 | `llama3.2`                                 |  ✅  |      ✅      |  ✅   |  ❌   |   ❌    |    ❌    |
| vLLM                                 | `VLLM_BASE_URL` (`VLLM_API_KEY` optional)                         | `meta-llama/Llama-3.1-8B-Instruct`         |  ✅  |      ✅      |  ✅   |  ❌   |   ❌    |    ✅    |
| Amazon Bedrock                       | `BEDROCK_BASE_URL` (region or endpoint) + AWS credentials         | `anthropic.claude-3-5-haiku-20241022-v1:0` |  ✅  |      ✅      |  ❌   |  ❌   |   ❌    |    ❌    |

✅ Supported ❌ Unsupported

For Z.ai's GLM Coding Plan, set `ZAI_BASE_URL=https://api.z.ai/api/coding/paas/v4`.
Xiaomi MiMo TTS (`mimo-v2.5-tts*`) and ASR (`mimo-v2.5-asr`) are served through
`/v1/audio/speech` and `/v1/audio/transcriptions` (translated to MiMo's
chat-completions audio dialect) as well as directly via chat completions; for
1M context append `[1m]` to the model ID and list it in `XIAOMI_MODELS`.
OpenCode Go (OpenCode Zen) routes per model — most models use OpenAI-style
`/chat/completions`, while `/messages`-only models (default `qwen3.7-max`,
override with `OPENCODE_GO_MESSAGES_MODELS`) are sent to the Anthropic-native
endpoint. Set `OPENCODE_GO_API_KEY`; the base URL defaults to
`https://opencode.ai/zen/go/v1`.
Configured model lists are available for every provider with
`<PROVIDER>_MODELS`, for example
`OPENROUTER_MODELS=openai/gpt-oss-120b,anthropic/claude-sonnet-4` or
`ORACLE_MODELS=openai.gpt-oss-120b,xai.grok-3`. DeepSeek defaults to
`https://api.deepseek.com`; set `DEEPSEEK_BASE_URL` only when using a compatible
proxy or alternate DeepSeek endpoint. By default,
`CONFIGURED_PROVIDER_MODELS_MODE=fallback` uses those lists only when upstream
`/models` is unavailable or empty. Set `CONFIGURED_PROVIDER_MODELS_MODE=allowlist`
to expose only configured models for providers that define a list, skipping
their upstream `/models` calls.
For vLLM, set `VLLM_API_KEY` only if the upstream server was started with
`--api-key`.
To register multiple instances of the same provider type without `config.yaml`,
use suffixed env vars such as `OPENAI_EAST_API_KEY` and
`OPENAI_EAST_BASE_URL`; add `OPENAI_EAST_MODELS` to configure that instance's
model list. This registers provider `openai-east` with type `openai`.
Vertex AI follows the same suffix pattern — `VERTEX_US_PROJECT` registers
provider `vertex-us`. Vertex project and location env vars must match the
instance prefix: for a suffixed instance such as `VERTEX_US_PROJECT`, also set
`VERTEX_US_LOCATION` and any other suffixed settings for that instance, rather
than the generic `VERTEX_PROJECT` / `VERTEX_LOCATION`. `VERTEX_AUTH_TYPE`
defaults to Application Default Credentials (`gcp_adc`).

---

## Alternative Setup Methods

### Running from Source

**Prerequisites:** Go 1.26.4+

1. Create a `.env` file:

   ```bash
   cp .env.template .env
   ```

2. Add your API keys to `.env` (at least one required).

3. Start the server:

   ```bash
   make run
   ```

### Docker Compose

**Infrastructure only** (Redis, PostgreSQL, MongoDB, Adminer - no image build):

```bash
docker compose up -d
# or: make infra
```

**Full stack** (adds GoModel + Prometheus; builds the app image):

```bash
cp .env.template .env
# Add your API keys to .env
docker compose --profile app up -d
# or: make image
```

| Service         | URL                   |
| --------------- | --------------------- |
| GoModel API     | http://localhost:8080 |
| Adminer (DB UI) | http://localhost:8081 |
| Prometheus      | http://localhost:9090 |

### Building the Docker Image Locally

```bash
docker build -t gomodel .
docker run --rm -p 8080:8080 --env-file .env gomodel
```

---

## API Endpoints

### OpenAI-Compatible API

| Endpoint                   | Method | Description                                                                                                  |
| -------------------------- | ------ | ------------------------------------------------------------------------------------------------------------ |
| `/v1/chat/completions`     | POST   | Chat completions (streaming supported)                                                                       |
| `/v1/responses`            | POST   | OpenAI Responses API                                                                                         |
| `/v1/conversations`        | POST   | Create a conversation (gateway-managed)                                                                      |
| `/v1/conversations/{id}`   | GET    | Retrieve a conversation                                                                                      |
| `/v1/conversations/{id}`   | POST   | Replace conversation metadata in full                                                                        |
| `/v1/conversations/{id}`   | DELETE | Delete a conversation                                                                                        |
| `/v1/embeddings`           | POST   | Text embeddings                                                                                              |
| `/v1/models`               | GET    | List available models                                                                                        |
| `/v1/files`                | POST   | Upload a file (OpenAI-compatible multipart)                                                                  |
| `/v1/files`                | GET    | List files                                                                                                   |
| `/v1/files/{id}`           | GET    | Retrieve file metadata                                                                                       |
| `/v1/files/{id}`           | DELETE | Delete a file                                                                                                |
| `/v1/files/{id}/content`   | GET    | Retrieve raw file content                                                                                    |
| `/v1/batches`              | POST   | Create a native provider batch (OpenAI-compatible schema; inline `requests` supported where provider-native) |
| `/v1/batches`              | GET    | List stored batches                                                                                          |
| `/v1/batches/{id}`         | GET    | Retrieve one stored batch                                                                                    |
| `/v1/batches/{id}/cancel`  | POST   | Cancel a pending batch                                                                                       |
| `/v1/batches/{id}/results` | GET    | Retrieve native batch results when available                                                                 |

### Anthropic-Compatible API

| Endpoint                    | Method | Description                                                                   |
| --------------------------- | ------ | ----------------------------------------------------------------------------- |
| `/v1/messages`              | POST   | Anthropic Messages API through translated model routing (streaming supported) |
| `/v1/messages/count_tokens` | POST   | Heuristic Anthropic Messages input token estimate                             |

### Provider Passthrough

| Endpoint            | Method                                       | Description                                                |
| ------------------- | -------------------------------------------- | ---------------------------------------------------------- |
| `/p/{provider}/...` | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses |

### Admin Endpoints

| Endpoint                    | Method | Description                                |
| --------------------------- | ------ | ------------------------------------------ |
| `/admin/dashboard`          | GET    | Admin dashboard UI                         |
| `/admin/runtime/config`     | GET    | Admin runtime configuration                |
| `/admin/cache/overview`     | GET    | Cache statistics overview                  |
| `/admin/usage/summary`      | GET    | Aggregate token usage statistics           |
| `/admin/usage/daily`        | GET    | Per-period token usage breakdown           |
| `/admin/usage/models`       | GET    | Usage breakdown by model                   |
| `/admin/usage/user-paths`   | GET    | Usage breakdown by user path               |
| `/admin/usage/log`          | GET    | Paginated usage log entries                |
| `/admin/audit/detail`       | GET    | Detailed audit entry information           |
| `/admin/audit/log`          | GET    | Paginated audit log entries                |
| `/admin/audit/conversation` | GET    | Conversation thread around one audit entry |
| `/admin/providers/status`   | GET    | Provider availability status               |
| `/admin/runtime/refresh`    | POST   | Refresh runtime configuration              |
| `/admin/models`             | GET    | List models with provider type             |
| `/admin/models/categories`  | GET    | List model categories                      |
| `/admin/model-overrides`    | GET    | List model overrides                       |
| `/admin/model-overrides`    | PUT    | Create/update model override               |
| `/admin/model-overrides`    | DELETE | Remove model override                      |
| `/admin/auth-keys`          | GET    | List authentication keys                   |

> **Legacy alias:** Until **2026-08-09**, all admin endpoints are also
> reachable under `/admin/api/v1/*`. Legacy responses include
> `Deprecation: true` and `Sunset: Sun, 09 Aug 2026 00:00:00 GMT` headers.
> The endpoint formerly at `/admin/api/v1/dashboard/config` moved to
> `/admin/runtime/config` on the new prefix.

### Operations Endpoints

| Endpoint              | Method | Description                                     |
| --------------------- | ------ | ----------------------------------------------- |
| `/health`             | GET    | Health check                                    |
| `/metrics`            | GET    | Prometheus metrics (experimental, when enabled) |
| `/swagger/index.html` | GET    | Swagger UI (when enabled)                       |

---

## Gateway Configuration

GoModel is configured through environment variables and an optional `config.yaml`. Environment variables override YAML values. See [`.env.template`](.env.template) and [`config/config.example.yaml`](config/config.example.yaml) for the available options.

Key settings:

| Variable                                | Default                                         | Description                                                                         |
| --------------------------------------- | ----------------------------------------------- | ----------------------------------------------------------------------------------- |
| `PORT`                                  | `8080`                                          | Server port                                                                         |
| `BASE_PATH`                             | `/`                                             | Mount the gateway under a path prefix such as `/g`                                  |
| `GOMODEL_MASTER_KEY`                    | (none)                                          | API key for authentication                                                          |
| `USER_PATH_HEADER`                      | `X-GoModel-User-Path`                           | Header used to read/write request `user_path` values                                |
| `ENABLE_PASSTHROUGH_ROUTES`             | `true`                                          | Enable provider-native passthrough routes under `/p/{provider}/...`                 |
| `ALLOW_PASSTHROUGH_V1_ALIAS`            | `true`                                          | Allow `/p/{provider}/v1/...` aliases while keeping `/p/{provider}/...` canonical    |
| `ENABLED_PASSTHROUGH_PROVIDERS`         | `openai,anthropic,openrouter,zai,vllm,deepseek` | Comma-separated list of enabled passthrough providers                               |
| `GEMINI_API_MODE`                       | `native`                                        | Gemini AI Studio upstream mode: `native` or `openai_compatible`                     |
| `VERTEX_API_MODE`                       | `native`                                        | Vertex AI Gemini upstream mode: `native` or `openai_compatible`                     |
| `USE_GOOGLE_GEMINI_NATIVE_API`          | `true`                                          | Legacy global Gemini mode toggle used when per-provider `*_API_MODE` is unset       |
| `STORAGE_TYPE`                          | `sqlite`                                        | Storage backend (`sqlite`, `postgresql`, `mongodb`)                                 |
| `METRICS_ENABLED`                       | `false`                                         | Enable Prometheus metrics (experimental)                                            |
| `LOGGING_ENABLED`                       | `false`                                         | Enable audit logging                                                                |
| `DASHBOARD_LIVE_LOGS_ENABLED`           | `true`                                          | Stream realtime dashboard log previews with bounded replay                          |
| `DASHBOARD_LIVE_LOGS_BUFFER_SIZE`       | `10000`                                         | Max in-memory live events retained; increase above ~1000 msgs/sec or bursty traffic |
| `DASHBOARD_LIVE_LOGS_REPLAY_LIMIT`      | `1000`                                          | Max events replayed after reconnect; increase for longer reconnect windows          |
| `DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS` | `15`                                            | SSE heartbeat interval; lower when proxies need faster liveness checks              |
| `GUARDRAILS_ENABLED`                    | `false`                                         | Enable the configured guardrails pipeline                                           |

**Quick Start - Authentication:** By default `GOMODEL_MASTER_KEY` is unset. Without this key, API endpoints are unprotected and anyone can call them. This is insecure for production. **Strongly recommend** setting a strong secret before exposing the service. Add `GOMODEL_MASTER_KEY` to your `.env` or environment for production deployments.

---

## Response Caching

GoModel has a two-layer response cache that reduces LLM API costs and latency for repeated or semantically similar requests.

### Layer 1 - Exact-match cache

Hashes the full request body (path + `Workflow` + body) and returns a stored response on byte-identical requests. Sub-millisecond lookup. Activate by environment variables: `RESPONSE_CACHE_SIMPLE_ENABLED` and `REDIS_URL`.

Responses served from this layer carry `X-Cache: HIT (exact)`.

### Layer 2 - Semantic cache

Embeds the last user message via your configured provider’s OpenAI-compatible `/v1/embeddings` API (`cache.response.semantic.embedder.provider` must name a key in the top-level `providers` map) and performs a KNN vector search. Semantically equivalent queries - e.g. _"What's the capital of France?"_ vs _"Which city is France's capital?"_ - can return the same cached response without an upstream LLM call.

Expected hit rates: ~60–70% in high-repetition workloads vs. ~18% for exact-match alone.

Responses served from this layer carry `X-Cache: HIT (semantic)`.

Supported vector backends: `qdrant`, `pgvector`, `pinecone`, `weaviate` (set `cache.response.semantic.vector_store.type` and the matching nested block).

Both cache layers run **after** guardrail/workflow patching so they always see the final prompt. Use `Cache-Control: no-cache` or `Cache-Control: no-store` to bypass caching per-request.

---

See [DEVELOPMENT.md](docs/DEVELOPMENT.md) for testing, linting, and pre-commit setup.

---

# Roadmap

## Commercial features

- [ ] Intelligent routing
- [ ] Context window compression
- [ ] Cluster mode

## Roadmap to 0.2.0

- [ ] UI visibility for prompt caching and local cache usage
- [ ] Broader provider support, including Cohere, Command A, and Operational
- [ ] Full support for the OpenAI `/conversations` lifecycle
- [ ] Guardrails hardening: better UI, simpler architecture, easier custom guardrails, and response-side guardrails before output reaches the client
- [ ] Provider-native passthrough for all providers, beyond the current beta coverage
- [x] Budget management with limits per `user_path` and/or API key
- [x] Editable model pricing for accurate cost tracking and budgeting
- [x] Full support for the OpenAI `/responses` lifecycle
- [x] Anthropic-compatible `/messages` ingress and `/messages/count_tokens`
- [x] Prompt cache visibility showing how much of each prompt was cached by the provider
- [x] Fix failover charts in the dashboard

## Community

Join our [Discord](https://discord.gg/gaEB9BQSPH) to connect with other GoModel users.

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=enterpilot/gomodel&type=date&legend=top-left)](https://www.star-history.com/#enterpilot/gomodel&type=date&legend=top-left)