v1.80.5-stable
Deploy this version

Docker:

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.80.5-stable
```

Pip:

```shell
pip install litellm==1.80.5
```
Key Highlights
- Prompt Management - Full prompt versioning support with UI for editing, testing, and version history
- MCP Hub - Publish and discover MCP servers within your organization
- Model Compare UI - Side-by-side model comparison interface for testing
- Gemini 3 - Day-0 support with thought signatures in Responses API
- Azure GPT-5.1 Models - Complete Azure GPT-5.1 family support with EU region pricing
- Performance Improvements - Realtime endpoint optimizations and SSL context caching
New Providers and Endpoints
New Providers
| Provider | Supported Endpoints | Description |
|---|---|---|
| Docker Model Runner | /v1/chat/completions | Run LLM models in Docker containers |
New Models / Updated Models
New Model Support
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Azure | azure/gpt-5.1 | 272K | $1.38 | $11.00 | Reasoning, vision, PDF input, responses API |
| Azure | azure/gpt-5.1-2025-11-13 | 272K | $1.38 | $11.00 | Reasoning, vision, PDF input, responses API |
| Azure | azure/gpt-5.1-codex | 272K | $1.38 | $11.00 | Responses API, reasoning, vision |
| Azure | azure/gpt-5.1-codex-2025-11-13 | 272K | $1.38 | $11.00 | Responses API, reasoning, vision |
| Azure | azure/gpt-5.1-codex-mini | 272K | $0.275 | $2.20 | Responses API, reasoning, vision |
| Azure | azure/gpt-5.1-codex-mini-2025-11-13 | 272K | $0.275 | $2.20 | Responses API, reasoning, vision |
| Azure EU | azure/eu/gpt-5-2025-08-07 | 272K | $1.375 | $11.00 | Reasoning, vision, PDF input |
| Azure EU | azure/eu/gpt-5-mini-2025-08-07 | 272K | $0.275 | $2.20 | Reasoning, vision, PDF input |
| Azure EU | azure/eu/gpt-5-nano-2025-08-07 | 272K | $0.055 | $0.44 | Reasoning, vision, PDF input |
| Azure EU | azure/eu/gpt-5.1 | 272K | $1.38 | $11.00 | Reasoning, vision, PDF input, responses API |
| Azure EU | azure/eu/gpt-5.1-codex | 272K | $1.38 | $11.00 | Responses API, reasoning, vision |
| Azure EU | azure/eu/gpt-5.1-codex-mini | 272K | $0.275 | $2.20 | Responses API, reasoning, vision |
| Gemini | gemini-3-pro-preview | 2M | $1.25 | $5.00 | Reasoning, vision, function calling |
| Gemini | gemini-3-pro-image | 2M | $1.25 | $5.00 | Image generation, reasoning |
| OpenRouter | openrouter/deepseek/deepseek-v3p1-terminus | 164K | $0.20 | $0.40 | Function calling, reasoning |
| OpenRouter | openrouter/moonshot/kimi-k2-instruct | 262K | $0.60 | $2.50 | Function calling, web search |
| OpenRouter | openrouter/gemini/gemini-3-pro-preview | 2M | $1.25 | $5.00 | Reasoning, vision, function calling |
| XAI | xai/grok-4.1-fast | 2M | $0.20 | $0.50 | Reasoning, function calling |
| Together AI | together_ai/z-ai/glm-4.6 | 203K | $0.40 | $1.75 | Function calling, reasoning |
| Cerebras | cerebras/gpt-oss-120b | 131K | $0.60 | $0.60 | Function calling |
| Bedrock | anthropic.claude-sonnet-4-5-20250929-v1:0 | 200K | $3.00 | $15.00 | Computer use, reasoning, vision |
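As a quick sanity check on the pricing table above, per-call cost can be estimated from the per-1M-token rates. The sketch below uses the azure/gpt-5.1 rates ($1.38 input / $11.00 output per 1M tokens); LiteLLM's built-in cost tracking performs this calculation automatically, so this is only illustrative:

```python
# Per-1M-token rates for azure/gpt-5.1, taken from the table above.
INPUT_PER_M = 1.38
OUTPUT_PER_M = 11.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost of a single call from token counts."""
    return (input_tokens / 1_000_000) * INPUT_PER_M + (output_tokens / 1_000_000) * OUTPUT_PER_M

# e.g. 10K input tokens + 1K output tokens:
print(round(estimate_cost(10_000, 1_000), 6))  # 0.0248
```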
Features
- Gemini (Google AI Studio + Vertex AI)
  - Add Day 0 gemini-3-pro-preview support - PR #16719
  - Add support for Gemini 3 Pro Image model - PR #16938
  - Add reasoning_content to streaming responses with tools enabled - PR #16854
  - Add includeThoughts=True for Gemini 3 reasoning_effort - PR #16838
  - Support thought signatures for Gemini 3 in responses API - PR #16872
  - Fix incorrect system message handling for Gemma - PR #16767
  - Gemini 3 Pro Image: capture image_tokens and support cost_per_output_image - PR #16912
  - Fix missing costs for gemini-2.5-flash-image - PR #16882
  - Gemini 3 thought signatures in tool call id - PR #16895
- Snowflake
  - Snowflake provider support: added embeddings, PAT, account_id - PR #15727
- OCI
  - Add oci_endpoint_id parameter for OCI Dedicated Endpoints - PR #16723
- xAI
  - Add support for Grok 4.1 Fast models - PR #16936
- Together AI
  - Add GLM 4.6 from together.ai - PR #16942
- Cerebras
  - Fix Cerebras GPT-OSS-120B model name - PR #16939
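The Day-0 Gemini 3 support can be exercised through LiteLLM's standard `completion` interface. A minimal sketch, assuming a Google AI Studio key in `GEMINI_API_KEY` (the model name comes from the table above; `reasoning_effort` pass-through is the behavior added in PR #16838, and the call is guarded so nothing runs without credentials):

```python
import os

# Request for the new Gemini 3 Pro preview model via LiteLLM's unified API.
params = {
    "model": "gemini/gemini-3-pro-preview",
    "messages": [{"role": "user", "content": "Summarize thought signatures in one sentence."}],
    "reasoning_effort": "low",  # mapped to Gemini's includeThoughts thinking config (PR #16838)
}

# Only send the request when a Google AI Studio key is configured.
if os.environ.get("GEMINI_API_KEY"):
    import litellm

    response = litellm.completion(**params)
    print(response.choices[0].message.content)
```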
Bug Fixes
- General
LLM API Endpoints
Features
- Search APIs
  - Fix "Invalid request body" error in firecrawl-search - PR #16943
- Videos
  - Fix videos tagging - PR #16770
Bugs
- General
Management Endpoints / UI
Features
- Proxy CLI Auth
  - Allow using JWTs for signing in with Proxy CLI - PR #16756
- Virtual Keys
  - Fix Key Model Alias Not Working - PR #16896
- Teams
  - Teams table empty state - PR #16738
- Fallbacks
  - Fallbacks icon button tooltips and delete with friction - PR #16737
- MCP Servers
  - Delete user and MCP Server Modal, MCP Table Tooltips - PR #16751
Bugs
- UI Fixes
  - Fix flaky tests due to antd Notification Manager - PR #16740
  - Fix UI MCP Tool Test Regression - PR #16695
  - Fix edit logging settings not appearing - PR #16798
  - Add CSS to truncate long request IDs in request viewer - PR #16665
  - Remove azure/ prefix in placeholder for Azure in Add Model - PR #16597
  - Remove UI Session Token from user/info return - PR #16851
  - Remove console logs and errors from model tab - PR #16455
  - Change Bulk Invite User Roles to match backend - PR #16906
  - Mock Tremor's Tooltip to fix flaky UI tests - PR #16786
  - Fix e2e UI Playwright test - PR #16799
  - Fix tests in CI/CD - PR #16972
- Swagger UI
  - Fix Swagger UI resolver errors for chat completion endpoints caused by Pydantic v2 $defs not being properly exposed in the OpenAPI schema - PR #16784
AI Integrations
Logging
- Langfuse
  - Filter secret fields from Langfuse - PR #16842
- General
Guardrails
- IBM Guardrails
  - Fix IBM Guardrails optional params, add extra_headers field - PR #16771
- Grayswan
  - Grayswan guardrail passthrough on flagged - PR #16891
- General Guardrails
  - Fix prompt injection not working - PR #16701
Prompt Management
- Allow specifying just prompt_id in a request to a model - PR #16834
- Add support for versioning prompts - PR #16836
- Allow storing prompt version in DB - PR #16848
- Add UI for editing the prompts - PR #16853
- Allow testing prompts with Chat UI - PR #16898
- Allow viewing version history - PR #16901
- Allow specifying prompt version in code - PR #16929
- UI, allow seeing model, prompt id for Prompt - PR #16932
- Show "get code" section for prompt management + minor polish of showing version history - PR #16941
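Based on the items above, a stored prompt can be referenced from code roughly as follows. This is a sketch only: the `prompt_id`, `prompt_version`, and `prompt_variables` field names mirror PRs #16834/#16929, but the prompt ID, variable names, and overall payload shape shown here are illustrative assumptions, and the call is guarded so it only runs when a key is configured:

```python
import os

# Hypothetical payload: reference a stored, versioned prompt instead of
# sending the prompt text inline. "my-prompt" and the variable names are
# placeholders, not values from this release.
payload = {
    "model": "gpt-4o",
    "prompt_id": "my-prompt",      # stored prompt rendered server-side (PR #16834)
    "prompt_version": 2,           # pin a specific version in code (PR #16929)
    "prompt_variables": {"customer_name": "Ada"},
}

# Only send the request when credentials are present.
if os.environ.get("LITELLM_API_KEY"):
    import litellm

    response = litellm.completion(api_key=os.environ["LITELLM_API_KEY"], **payload)
    print(response.choices[0].message.content)
```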
Secret Managers
- AWS Secrets Manager
  - Add IAM role assumption support for AWS Secrets Manager - PR #16887
MCP Gateway
- MCP Hub - Publish/discover MCP Servers within a company - PR #16857
- MCP Resources - MCP resources support - PR #16800
- MCP OAuth - Docs - mcp oauth flow details - PR #16742
- MCP Lifecycle - Drop MCPClient.connect and use run_with_session lifecycle - PR #16696
- MCP Server IDs - Add mcp server ids - PR #16904
- MCP URL Format - Fix mcp url format - PR #16940
Performance / Loadbalancing / Reliability improvements
- Realtime Endpoint Performance - Fix bottlenecks degrading realtime endpoint performance - PR #16670
- SSL Context Caching - Cache SSL contexts to prevent excessive memory allocation - PR #16955
- Cache Optimization - Fix cache cooldown key generation - PR #16954
- Router Cache - Fix routing for requests with same cacheable prefix but different user messages - PR #16951
- Redis Event Loop - Fix redis event loop closed at first call - PR #16913
- Dependency Management - Upgrade pydantic to version 2.11.0 - PR #16909
Documentation Updates
- General Documentation
  - Add mini-swe-agent to Projects built on LiteLLM - PR #16971
Infrastructure / CI/CD
- Migration
  - Migration job labels - PR #16831
- Config
  - This yaml actually works - PR #16757
- Investigation
  - Investigate issue root cause - PR #16859
Model Compare UI
A new interactive playground UI enables side-by-side comparison of multiple LLM models, making it easy to evaluate and compare model responses.
Features:
- Compare responses from multiple models in real-time
- Side-by-side view with synchronized scrolling
- Support for all LiteLLM-supported models
- Cost tracking per model
- Response time comparison
- Pre-configured prompts for quick and easy testing
Details:
- Parameterization: Configure API keys, endpoints, models, and model parameters, as well as interaction types (chat completions, embeddings, etc.)
- Model Comparison: Compare up to 3 different models simultaneously with side-by-side response views
- Comparison Metrics: View detailed comparison information including:
  - Time To First Token
  - Input / Output / Reasoning Tokens
  - Total Latency
  - Cost (if enabled in config)
- Safety Filters: Configure and test guardrails (safety filters) directly in the playground interface
Get Started with Model Compare - PR #16855
New Contributors
- @mattmorgis made their first contribution in PR #16371
- @mmandic-coatue made their first contribution in PR #16732
- @Bradley-Butcher made their first contribution in PR #16725
- @BenjaminLevy made their first contribution in PR #16757
- @CatBraaain made their first contribution in PR #16767
- @tushar8408 made their first contribution in PR #16831
- @nbsp1221 made their first contribution in PR #16845
- @idola9 made their first contribution in PR #16832
- @nkukard made their first contribution in PR #16864
- @alhuang10 made their first contribution in PR #16852
- @sebslight made their first contribution in PR #16838
- @TsurumaruTsuyoshi made their first contribution in PR #16905
- @cyberjunk made their first contribution in PR #16492
- @colinlin-stripe made their first contribution in PR #16895
- @sureshdsk made their first contribution in PR #16883
- @eiliyaabedini made their first contribution in PR #16875
- @justin-tahara made their first contribution in PR #16957
- @wangsoft made their first contribution in PR #16913
- @dsduenas made their first contribution in PR #16891

