# GoModel

**Repository Path**: one_check/GoModel

## Basic Information

- **Project Name**: GoModel
- **Description**: No description available
- **Primary Language**: Go
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-06-16
- **Last Updated**: 2026-06-16

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README


GoModel - AI Gateway in Go


A fast and lightweight AI gateway written in Go, providing a unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, DeepSeek, xAI, Groq, OpenRouter, Z.ai, Azure OpenAI, Oracle, Ollama, and more.

_Screenshot: GoModel AI gateway dashboard showing AI usage analytics, observability panel, token and costs tracking, and estimated cost monitoring._

## Quick Start with Docker

**Step 1:** Start GoModel container

```bash
docker run --rm -p 8080:8080 \
  -e LOGGING_ENABLED=true \
  -e LOGGING_LOG_BODIES=true \
  -e LOG_FORMAT=text \
  -e LOGGING_LOG_HEADERS=true \
  -e OPENAI_API_KEY="your-openai-key" \
  enterpilot/gomodel
```

Pass only the provider credentials or base URL you need (at least one required):

```bash
docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY="your-openai-key" \
  -e ANTHROPIC_API_KEY="your-anthropic-key" \
  -e GEMINI_API_KEY="your-gemini-key" \
  -e VERTEX_PROJECT="your-gcp-project" \
  -e VERTEX_LOCATION="us-central1" \
  -e VERTEX_AUTH_TYPE="gcp_adc" \
  -e DEEPSEEK_API_KEY="your-deepseek-key" \
  -e GROQ_API_KEY="your-groq-key" \
  -e OPENROUTER_API_KEY="your-openrouter-key" \
  -e ZAI_API_KEY="your-zai-key" \
  -e XAI_API_KEY="your-xai-key" \
  -e AZURE_API_KEY="your-azure-key" \
  -e AZURE_BASE_URL="https://your-resource.openai.azure.com/openai/deployments/your-deployment" \
  -e AZURE_API_VERSION="2024-10-21" \
  -e ORACLE_API_KEY="your-oracle-key" \
  -e ORACLE_BASE_URL="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/v1" \
  -e ORACLE_MODELS="openai.gpt-oss-120b,xai.grok-3" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434/v1" \
  -e VLLM_BASE_URL="http://host.docker.internal:8000/v1" \
  enterpilot/gomodel
```

⚠️ Avoid passing secrets via `-e` on the command line - they can leak via shell history and process lists. For production, use `docker run --env-file .env` to load API keys from a file instead.

**Step 2:** Make your first API call

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-chat-latest",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

**That's it!** GoModel automatically detects which providers are available based on the credentials you supply.
### Supported LLM Providers

Example model identifiers are illustrative and subject to change; consult provider catalogs for current models. Feature columns reflect gateway API support, not every individual model capability exposed by an upstream provider.

| Provider | Credential | Example Model | Chat | `/responses` | Embed | Files | Batches | Passthrough |
| --- | --- | --- | :--: | :--: | :--: | :--: | :--: | :--: |
| OpenAI | `OPENAI_API_KEY` | `gpt-5.5` | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Anthropic | `ANTHROPIC_API_KEY` | `claude-sonnet-4-20250514` | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Google Gemini | `GEMINI_API_KEY` | `gemini-2.5-flash` | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Vertex AI | `VERTEX_PROJECT` + `VERTEX_LOCATION` | `google/gemini-2.5-flash` | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| DeepSeek | `DEEPSEEK_API_KEY` | `deepseek-v4-pro` | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| Groq | `GROQ_API_KEY` | `llama-3.3-70b-versatile` | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| OpenRouter | `OPENROUTER_API_KEY` | `google/gemini-2.5-flash` | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Z.ai | `ZAI_API_KEY` (`ZAI_BASE_URL` optional) | `glm-5.1` | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| xAI (Grok) | `XAI_API_KEY` | `grok-4` | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Alibaba Cloud Model Studio (Bailian) | `BAILIAN_API_KEY` (`BAILIAN_BASE_URL` optional) | `qwen3-max` | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| MiniMax | `MINIMAX_API_KEY` (`MINIMAX_BASE_URL` optional) | `MiniMax-M3` | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| Xiaomi MiMo | `XIAOMI_API_KEY` (`XIAOMI_BASE_URL` optional) | `mimo-v2.5-pro` | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| OpenCode Go | `OPENCODE_GO_API_KEY` (`OPENCODE_GO_BASE_URL` optional) | `glm-5.1` | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Azure OpenAI | `AZURE_API_KEY` + `AZURE_BASE_URL` (`AZURE_API_VERSION` optional) | `gpt-5` | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Oracle | `ORACLE_API_KEY` + `ORACLE_BASE_URL` | `openai.gpt-oss-120b` | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Ollama | `OLLAMA_BASE_URL` | `llama3.2` | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| vLLM | `VLLM_BASE_URL` (`VLLM_API_KEY` optional) | `meta-llama/Llama-3.1-8B-Instruct` | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| Amazon Bedrock | `BEDROCK_BASE_URL` (region or endpoint) + AWS credentials | `anthropic.claude-3-5-haiku-20241022-v1:0` | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |

✅ Supported ❌ Unsupported

For Z.ai's GLM Coding Plan, set `ZAI_BASE_URL=https://api.z.ai/api/coding/paas/v4`.

Xiaomi MiMo TTS (`mimo-v2.5-tts*`) and ASR (`mimo-v2.5-asr`) are served through `/v1/audio/speech` and `/v1/audio/transcriptions` (translated to MiMo's chat-completions audio dialect) as well as directly via chat completions; for 1M context append `[1m]` to the model ID and list it in `XIAOMI_MODELS`.

OpenCode Go (OpenCode Zen) routes per model: most models use OpenAI-style `/chat/completions`, while `/messages`-only models (default `qwen3.7-max`, override with `OPENCODE_GO_MESSAGES_MODELS`) are sent to the Anthropic-native endpoint. Set `OPENCODE_GO_API_KEY`; the base URL defaults to `https://opencode.ai/zen/go/v1`.

DeepSeek defaults to `https://api.deepseek.com`; set `DEEPSEEK_BASE_URL` only when using a compatible proxy or alternate DeepSeek endpoint.

Configured model lists are available for every provider with `_MODELS`, for example `OPENROUTER_MODELS=openai/gpt-oss-120b,anthropic/claude-sonnet-4` or `ORACLE_MODELS=openai.gpt-oss-120b,xai.grok-3`. By default, `CONFIGURED_PROVIDER_MODELS_MODE=fallback` uses those lists only when upstream `/models` is unavailable or empty. Set `CONFIGURED_PROVIDER_MODELS_MODE=allowlist` to expose only configured models for providers that define a list, skipping their upstream `/models` calls.

For vLLM, set `VLLM_API_KEY` only if the upstream server was started with `--api-key`.

To register multiple instances of the same provider type without `config.yaml`, use suffixed env vars such as `OPENAI_EAST_API_KEY` and `OPENAI_EAST_BASE_URL`; add `OPENAI_EAST_MODELS` to configure that instance's model list.
This registers provider `openai-east` with type `openai`. Vertex AI follows the same suffix pattern: `VERTEX_US_PROJECT` registers provider `vertex-us`. Vertex project and location env vars must match the instance prefix: for a suffixed instance such as `VERTEX_US_PROJECT`, also set `VERTEX_US_LOCATION` and any other suffixed settings for that instance, rather than the generic `VERTEX_PROJECT` / `VERTEX_LOCATION`. `VERTEX_AUTH_TYPE` defaults to Application Default Credentials (`gcp_adc`).

---

## Alternative Setup Methods

### Running from Source

**Prerequisites:** Go 1.26.4+

1. Create a `.env` file:

   ```bash
   cp .env.template .env
   ```

2. Add your API keys to `.env` (at least one required).

3. Start the server:

   ```bash
   make run
   ```

### Docker Compose

**Infrastructure only** (Redis, PostgreSQL, MongoDB, Adminer - no image build):

```bash
docker compose up -d
# or: make infra
```

**Full stack** (adds GoModel + Prometheus; builds the app image):

```bash
cp .env.template .env
# Add your API keys to .env
docker compose --profile app up -d
# or: make image
```

| Service | URL |
| --- | --- |
| GoModel API | http://localhost:8080 |
| Adminer (DB UI) | http://localhost:8081 |
| Prometheus | http://localhost:9090 |

### Building the Docker Image Locally

```bash
docker build -t gomodel .
docker run --rm -p 8080:8080 --env-file .env gomodel
```

---

## API Endpoints

### OpenAI-Compatible API

| Endpoint | Method | Description |
| --- | --- | --- |
| `/v1/chat/completions` | POST | Chat completions (streaming supported) |
| `/v1/responses` | POST | OpenAI Responses API |
| `/v1/conversations` | POST | Create a conversation (gateway-managed) |
| `/v1/conversations/{id}` | GET | Retrieve a conversation |
| `/v1/conversations/{id}` | POST | Replace conversation metadata in full |
| `/v1/conversations/{id}` | DELETE | Delete a conversation |
| `/v1/embeddings` | POST | Text embeddings |
| `/v1/models` | GET | List available models |
| `/v1/files` | POST | Upload a file (OpenAI-compatible multipart) |
| `/v1/files` | GET | List files |
| `/v1/files/{id}` | GET | Retrieve file metadata |
| `/v1/files/{id}` | DELETE | Delete a file |
| `/v1/files/{id}/content` | GET | Retrieve raw file content |
| `/v1/batches` | POST | Create a native provider batch (OpenAI-compatible schema; inline `requests` supported where provider-native) |
| `/v1/batches` | GET | List stored batches |
| `/v1/batches/{id}` | GET | Retrieve one stored batch |
| `/v1/batches/{id}/cancel` | POST | Cancel a pending batch |
| `/v1/batches/{id}/results` | GET | Retrieve native batch results when available |

### Anthropic-Compatible API

| Endpoint | Method | Description |
| --- | --- | --- |
| `/v1/messages` | POST | Anthropic Messages API through translated model routing (streaming supported) |
| `/v1/messages/count_tokens` | POST | Heuristic Anthropic Messages input token estimate |

### Provider Passthrough

| Endpoint | Method | Description |
| --- | --- | --- |
| `/p/{provider}/...` | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses |

### Admin Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| `/admin/dashboard` | GET | Admin dashboard UI |
| `/admin/runtime/config` | GET | Admin runtime configuration |
| `/admin/cache/overview` | GET | Cache statistics overview |
| `/admin/usage/summary` | GET | Aggregate token usage statistics |
| `/admin/usage/daily` | GET | Per-period token usage breakdown |
| `/admin/usage/models` | GET | Usage breakdown by model |
| `/admin/usage/user-paths` | GET | Usage breakdown by user path |
| `/admin/usage/log` | GET | Paginated usage log entries |
| `/admin/audit/detail` | GET | Detailed audit entry information |
| `/admin/audit/log` | GET | Paginated audit log entries |
| `/admin/audit/conversation` | GET | Conversation thread around one audit entry |
| `/admin/providers/status` | GET | Provider availability status |
| `/admin/runtime/refresh` | POST | Refresh runtime configuration |
| `/admin/models` | GET | List models with provider type |
| `/admin/models/categories` | GET | List model categories |
| `/admin/model-overrides` | GET | List model overrides |
| `/admin/model-overrides` | PUT | Create/update model override |
| `/admin/model-overrides` | DELETE | Remove model override |
| `/admin/auth-keys` | GET | List authentication keys |

> **Legacy alias:** Until **2026-08-09**, all admin endpoints are also
> reachable under `/admin/api/v1/*`. Legacy responses include
> `Deprecation: true` and `Sunset: Sun, 09 Aug 2026 00:00:00 GMT` headers.
> The endpoint formerly at `/admin/api/v1/dashboard/config` moved to
> `/admin/runtime/config` on the new prefix.
### Operations Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| `/health` | GET | Health check |
| `/metrics` | GET | Prometheus metrics (experimental, when enabled) |
| `/swagger/index.html` | GET | Swagger UI (when enabled) |

---

## Gateway Configuration

GoModel is configured through environment variables and an optional `config.yaml`. Environment variables override YAML values. See [`.env.template`](.env.template) and [`config/config.example.yaml`](config/config.example.yaml) for the available options.

Key settings:

| Variable | Default | Description |
| --- | --- | --- |
| `PORT` | `8080` | Server port |
| `BASE_PATH` | `/` | Mount the gateway under a path prefix such as `/g` |
| `GOMODEL_MASTER_KEY` | (none) | API key for authentication |
| `USER_PATH_HEADER` | `X-GoModel-User-Path` | Header used to read/write request `user_path` values |
| `ENABLE_PASSTHROUGH_ROUTES` | `true` | Enable provider-native passthrough routes under `/p/{provider}/...` |
| `ALLOW_PASSTHROUGH_V1_ALIAS` | `true` | Allow `/p/{provider}/v1/...` aliases while keeping `/p/{provider}/...` canonical |
| `ENABLED_PASSTHROUGH_PROVIDERS` | `openai,anthropic,openrouter,zai,vllm,deepseek` | Comma-separated list of enabled passthrough providers |
| `GEMINI_API_MODE` | `native` | Gemini AI Studio upstream mode: `native` or `openai_compatible` |
| `VERTEX_API_MODE` | `native` | Vertex AI Gemini upstream mode: `native` or `openai_compatible` |
| `USE_GOOGLE_GEMINI_NATIVE_API` | `true` | Legacy global Gemini mode toggle used when per-provider `*_API_MODE` is unset |
| `STORAGE_TYPE` | `sqlite` | Storage backend (`sqlite`, `postgresql`, `mongodb`) |
| `METRICS_ENABLED` | `false` | Enable Prometheus metrics (experimental) |
| `LOGGING_ENABLED` | `false` | Enable audit logging |
| `DASHBOARD_LIVE_LOGS_ENABLED` | `true` | Stream realtime dashboard log previews with bounded replay |
| `DASHBOARD_LIVE_LOGS_BUFFER_SIZE` | `10000` | Max in-memory live events retained; increase for traffic above ~1000 msgs/sec or bursty traffic |
| `DASHBOARD_LIVE_LOGS_REPLAY_LIMIT` | `1000` | Max events replayed after reconnect; increase for longer reconnect windows |
| `DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS` | `15` | SSE heartbeat interval; lower when proxies need faster liveness checks |
| `GUARDRAILS_ENABLED` | `false` | Enable the configured guardrails pipeline |

**Quick Start - Authentication:** By default `GOMODEL_MASTER_KEY` is unset. Without this key, API endpoints are unprotected and anyone can call them, which is insecure for production. We **strongly recommend** setting a strong secret before exposing the service: add `GOMODEL_MASTER_KEY` to your `.env` or environment for production deployments.

---

## Response Caching

GoModel has a two-layer response cache that reduces LLM API costs and latency for repeated or semantically similar requests.

### Layer 1 - Exact-match cache

Hashes the full request (path + `Workflow` + body) and returns a stored response on byte-identical requests. Lookups are sub-millisecond. Enable it with the environment variables `RESPONSE_CACHE_SIMPLE_ENABLED` and `REDIS_URL`. Responses served from this layer carry `X-Cache: HIT (exact)`.

### Layer 2 - Semantic cache

Embeds the last user message via your configured provider’s OpenAI-compatible `/v1/embeddings` API (`cache.response.semantic.embedder.provider` must name a key in the top-level `providers` map) and performs a KNN vector search. Semantically equivalent queries - e.g. _"What's the capital of France?"_ vs _"Which city is France's capital?"_ - can return the same cached response without an upstream LLM call.

Expected hit rates: ~60–70% in high-repetition workloads vs. ~18% for exact-match alone. Responses served from this layer carry `X-Cache: HIT (semantic)`.
Supported vector backends: `qdrant`, `pgvector`, `pinecone`, `weaviate` (set `cache.response.semantic.vector_store.type` and the matching nested block).

Both cache layers run **after** guardrail/workflow patching so they always see the final prompt. Use `Cache-Control: no-cache` or `Cache-Control: no-store` to bypass caching per-request.

---

See [DEVELOPMENT.md](docs/DEVELOPMENT.md) for testing, linting, and pre-commit setup.

---

# Roadmap

## Commercial features

- [ ] Intelligent routing
- [ ] Context window compression
- [ ] Cluster mode

## Roadmap to 0.2.0

- [ ] UI visibility for prompt caching and local cache usage
- [ ] Broader provider support, including Cohere, Command A, and Operational
- [ ] Full support for the OpenAI `/conversations` lifecycle
- [ ] Guardrails hardening: better UI, simpler architecture, easier custom guardrails, and response-side guardrails before output reaches the client
- [ ] Provider-native passthrough for all providers, beyond the current beta coverage
- [x] Budget management with limits per `user_path` and/or API key
- [x] Editable model pricing for accurate cost tracking and budgeting
- [x] Full support for the OpenAI `/responses` lifecycle
- [x] Anthropic-compatible `/messages` ingress and `/messages/count_tokens`
- [x] Prompt cache visibility showing how much of each prompt was cached by the provider
- [x] Fix failover charts in the dashboard

## Community

Join our [Discord](https://discord.gg/gaEB9BQSPH) to connect with other GoModel users.

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=enterpilot/gomodel&type=date&legend=top-left)](https://www.star-history.com/#enterpilot/gomodel&type=date&legend=top-left)