feat: Async Journey: LiteLLM Removal from Async Engine #310
base: async/async-facade
Conversation
Comprehensive analysis of removing the `litellm` dependency from Data Designer. Covers blast radius (per-phase), provider SDK research (OpenAI, Anthropic, Bedrock), risk assessment, and a 4-phase implementation plan using the `models_v2/` parallel stack approach. Co-Authored-By: Remi <noreply@anthropic.com>
Greptile Summary

Adds a comprehensive planning document for removing the LiteLLM dependency from Data Designer. The document is thorough, well-structured, and demonstrates deep understanding of the codebase through 10 independent reviewer validations. Key strengths:

Implementation strategy: The document correctly identifies that Data Designer underuses LiteLLM (a single-deployment Router instead of load balancing), making removal feasible. The parallel stack approach via `models_v2/` keeps the existing `models/` stack available as a fallback throughout the migration.
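For context, a hedged sketch of the single-deployment Router pattern described above; the `model_list` contents are illustrative, not Data Designer's actual wiring. Note the `"openai/gpt-4o"` string, which is the `provider_type/model_name` format LiteLLM expects (see the review thread below).

```python
# Rough sketch of a single-deployment LiteLLM Router; config values are
# illustrative assumptions, not the repository's actual setup.
import asyncio
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4o",  # exactly one deployment per model name
            "litellm_params": {"model": "openai/gpt-4o", "api_key": "sk-..."},
        }
    ]
)

async def main() -> None:
    # With a single deployment, the Router's load-balancing and failover
    # machinery adds indirection but no benefit over the provider SDK.
    response = await router.acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hello"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```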
| Filename | Overview |
|---|---|
| LITELLM_REMOVAL_ANALYSIS.md | New comprehensive planning document analyzing LiteLLM removal strategy with 4-phase implementation plan |
Sequence Diagram

```mermaid
sequenceDiagram
    participant Config as Config Layer
    participant Factory as models_v2/factory.py
    participant Facade as models_v2/facade.py
    participant Client as ModelClient (OpenAI/Anthropic/Bedrock)
    participant SDK as Provider SDK
    participant API as Provider API

    Note over Config,Factory: Phase 1: OpenAI Adapter
    Config->>Factory: create_model_registry(model_configs)
    Factory->>Factory: Construct OpenAIModelClient
    Factory->>Client: Initialize with api_key, base_url
    Factory->>Facade: ModelFacade(client=OpenAIModelClient)

    Note over Facade,API: Inference Request Flow
    Facade->>Facade: completion(messages, **params)
    Facade->>Client: client.completion(messages, **kwargs)
    Client->>Client: Translate DD params → SDK params
    Client->>SDK: await sdk.chat.completions.create(...)
    SDK->>SDK: Built-in retry/backoff
    SDK->>API: HTTPS POST /v1/chat/completions
    API-->>SDK: 200 OK with response
    SDK-->>Client: OpenAI response object
    Client->>Client: Extract content, tool_calls, usage
    Client-->>Facade: CompletionResponse
    Facade-->>Config: Generated text

    Note over Factory,Client: Phase 3: Multi-Provider
    Factory->>Factory: match provider_type
    alt provider_type == "openai"
        Factory->>Client: OpenAIModelClient
    else provider_type == "anthropic"
        Factory->>Client: AnthropicModelClient
        Note over Client: Translates content blocks → string
    else provider_type == "bedrock"
        Factory->>Client: BedrockModelClient
        Note over Client: Manual retry for throttling
    end

    Note over Facade,SDK: Error Handling
    SDK-->>Client: SDK-specific exception (e.g., RateLimitError)
    Client->>Client: Map to DD error types
    Client-->>Facade: ModelRateLimitError
    Facade-->>Config: Propagate with FormattedLLMErrorMessage
```
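To make the Phase 1 flow concrete, here is a minimal, hypothetical sketch of what an `OpenAIModelClient` could look like. Only the class, method, and error names come from the diagram above; the parameter mapping and `CompletionResponse` fields are assumptions, not the PR's actual code.

```python
# Hypothetical Phase 1 adapter sketch; names beyond the diagram's are assumptions.
from dataclasses import dataclass, field

import openai
from openai import AsyncOpenAI


class ModelRateLimitError(Exception):
    """DD-side error type the facade handles (per the diagram's error mapping)."""


@dataclass
class CompletionResponse:
    content: str
    tool_calls: list = field(default_factory=list)
    usage: dict = field(default_factory=dict)


class OpenAIModelClient:
    def __init__(self, api_key: str, base_url: str | None = None):
        # The OpenAI SDK retries transient failures itself (max_retries),
        # which is the "Built-in retry/backoff" step in the diagram.
        self._sdk = AsyncOpenAI(api_key=api_key, base_url=base_url, max_retries=3)

    async def completion(self, messages: list[dict], **params) -> CompletionResponse:
        # Translate DD params -> SDK params, dropping unset values.
        sdk_params = {
            k: v
            for k, v in {
                "temperature": params.get("temperature"),
                "max_tokens": params.get("max_tokens"),
            }.items()
            if v is not None
        }
        try:
            resp = await self._sdk.chat.completions.create(
                model=params["model"], messages=messages, **sdk_params
            )
        except openai.RateLimitError as exc:
            # Map the SDK-specific exception to a DD error type.
            raise ModelRateLimitError(str(exc)) from exc

        choice = resp.choices[0].message
        return CompletionResponse(
            content=choice.content or "",
            tool_calls=list(choice.tool_calls or []),
            usage=resp.usage.model_dump() if resp.usage else {},
        )
```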
### Config migration is clean
`provider_type` is only consumed in one place (`_get_litellm_deployment()` → `f"{provider_type}/{model_name}"`). Pydantic handles string → enum coercion automatically, so existing YAML/JSON configs and programmatic construction continue to work. The CLI text field for `provider_type` would become a select/choice field. All predefined providers and examples use `"openai"` — no existing users need migration.
This might be the only breaking change UX-wise? I don't see a need for this as long as we require users to provide `ModelConfig.model` with the full model path as understood by the inference stack they are using.
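For reference, a minimal sketch of the string → enum coercion the quoted passage relies on. Only `ModelConfig.model` and `provider_type` appear in the discussion above; the enum and its members are assumptions.

```python
# Hypothetical config models; ProviderType and its members are assumed names.
from enum import Enum

from pydantic import BaseModel


class ProviderType(str, Enum):
    OPENAI = "openai"
    ANTHROPIC = "anthropic"
    BEDROCK = "bedrock"


class ModelConfig(BaseModel):
    model: str
    provider_type: ProviderType = ProviderType.OPENAI


# Existing configs that pass plain strings keep validating unchanged,
# which is why moving the field from str to an enum is non-breaking.
cfg = ModelConfig(model="gpt-4o", provider_type="openai")
assert cfg.provider_type is ProviderType.OPENAI
```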
Summary
Adds a comprehensive analysis of removing the `litellm` dependency from Data Designer. This is a planning document — no code changes.

Key findings

- Data Designer underuses LiteLLM's Router (one deployment per model, no load balancing), so direct provider SDKs can replace it.
- The migration runs two parallel stacks (`engine/models/` and `engine/models_v2/`), keeping the existing path available as a fallback.

Implementation plan (4 phases)

1. Phase 1: build `models_v2/` with an OpenAI SDK adapter; keep the OpenAI response format as the canonical type.
2. Phase 2: switch the async engine to `models_v2/`, leaving `models/` untouched as fallback.
3. Phase 3: add the Anthropic and Bedrock adapters; the `models/` fallback is still available. A dispatch sketch follows this list.
4. Phase 4: delete `models/` and remove `litellm` from deps, only after all adapters are proven.
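To illustrate the Phase 3 dispatch (the `match provider_type` branch in the sequence diagram), a hedged sketch follows. It reuses the hypothetical `ProviderType`, `ModelConfig`, and `OpenAIModelClient` from the sketches above; the Anthropic and Bedrock client classes are only named in the diagram and are stubbed here.

```python
# Hypothetical Phase 3 factory dispatch; builds on the ProviderType,
# ModelConfig, and OpenAIModelClient sketches earlier on this page.
class AnthropicModelClient:  # stub: would translate content blocks -> string
    def __init__(self, api_key: str): ...


class BedrockModelClient:  # stub: would add manual retry for throttling
    def __init__(self, region: str): ...


def create_model_client(config: ModelConfig, api_key: str, region: str = "us-east-1"):
    # Mirrors the diagram's "match provider_type" step: one adapter per provider.
    match config.provider_type:
        case ProviderType.OPENAI:
            return OpenAIModelClient(api_key=api_key)
        case ProviderType.ANTHROPIC:
            return AnthropicModelClient(api_key=api_key)
        case ProviderType.BEDROCK:
            return BedrockModelClient(region=region)
        case _:
            raise ValueError(f"Unsupported provider_type: {config.provider_type}")
```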
Reviewed by

10 independent code reviewers examined the report against the actual codebase. Corrections incorporated: expanded test blast radius (4 files, ~56 functions), upgraded Anthropic risk to HIGH, added MCP facade cross-layer caveat, corrected dependency impact analysis.