Skip to content

Remove modelRouter and add model_providers concept#224

Open
philipph-askui wants to merge 56 commits intomainfrom
chore/modelrouter
Open

Remove modelRouter and add model_providers concept#224
philipph-askui wants to merge 56 commits intomainfrom
chore/modelrouter

Conversation

@philipph-askui
Copy link
Contributor

@philipph-askui philipph-askui commented Jan 21, 2026

[Edited] PR Description

To get a quick idea of the new API, I added an example under examples/model_providers.py that you can run
Note: You will need an anthropic API Key for that

This PR:

  • Removes ModelRouter, ModelRegistry, and model_store
  • Introduces a provider-based configuration system (AgentSettings)
  • Renames VisionAgent → ComputerAgent and AndroidVisionAgent → AndroidAgent
  • Updates docs and examples

Summary

  • Replaced the ModelRouter/model_store abstraction with three typed provider slots (vlm_provider, image_qa_provider, detection_provider) configured via AgentSettings
  • Providers own their endpoint, credentials, and model ID — validated lazily on first API call
  • get() and locate() are now backed by GetTool/LocateTool, which are also available to the LLM during act() — no separate model injection path
  • Significantly reduced codebase complexity (~4600 lines removed)
  • Updated all docs and examples to reflect the new API

Key Changes

  • AgentSettings: single configuration object with provider slots; defaults to AskUI-hosted providers reading credentials from env vars
  • Built-in providers: AskUIVlmProvider, AskUIImageQAProvider, AskUIDetectionProvider, AnthropicVlmProvider, AnthropicImageQAProvider, GoogleImageQAProvider, OpenAICompatibleProvider
  • GetTool / LocateTool: wired into the act loop as ToolWithAgentOS — LLM can call them directly during act()
  • Deleted: entire src/askui/model_store/ directory
  • Renamed: VisionAgent → ComputerAgent, AndroidVisionAgent → AndroidAgent
  • Docs: 03_Using-Models-and-BYOM.md fully rewritten; VisionAgent replaced with ComputerAgent across all docs

Further Changes

  • removes chat-api related code as it was deprecated

  • removes UI-TARS related code as it was deprecated

    Breaking Changes

    • VisionAgent is removed — use ComputerAgent
    • AndroidVisionAgent is removed — use AndroidAgent
    • act_model, get_model, locate_model constructor parameters are removed — use AgentSettings(vlm_provider=..., image_qa_provider=..., detection_provider=...)
    • model_store factory functions are removed
    • String-based model selection is removed

…del store

BREAKING CHANGE: Removed ModelRouter and ModelRegistry classes. Users must now use direct model injection.
@philipph-askui philipph-askui changed the title Chore/modelrouter Remove Modelrouter Jan 21, 2026
@philipph-askui philipph-askui changed the title Remove Modelrouter Remove modelRouter and add mode_store Jan 22, 2026
@philipph-askui philipph-askui marked this pull request as ready for review January 26, 2026 06:53
Copy link
Collaborator

@programminx-askui programminx-askui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @philipph-askui ,

I started to review the code. General remarks:

  1. Changes are to big to review. -> We need to test this heavly
  2. Creating new files, instead of rename/move files -> we don't know which code was already reviewed, which code is new.

Overall it is going in the right direction.

I've reviewed only view files, so you can start working on it. A deeper review is outstanding.

@philipph-askui
Copy link
Contributor Author

I added a new optional dependency group "tracing", that contains all the OTEL packages.

validate_by_alias=True,
)

base_name: str = Field(alias="name", description="Name of the tool")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this renaming has influence of any seralized data? e.g. Caching?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, should still work as the tool name provided by to the model should not change

self._model_id_value = model_id
self._askui_settings = askui_settings or AskUiInferenceApiSettings()
self._model_id = model_id
self._injected_client = client
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here


logger = logging.getLogger(__name__)

_MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the limit coming from? Can we link to the API docs from google in a comment? Otherwise I wouldn't check the file size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I found it here: https://ai.google.dev/gemini-api/docs/file-input-methods

According to a quick research, the limit was increased to 50MB recently. Hence, I updated the limit in the code.

Comment on lines +70 to +74
if len(data) > _MAX_FILE_SIZE_BYTES:
err_msg = (
f"PDF file size exceeds the limit of {_MAX_FILE_SIZE_BYTES} bytes."
)
raise ValueError(err_msg)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated code. please wrap this an a function check_file_size()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the error messages are different, hence all we could wrap in a function here is the if comparison

from askui.models.shared.tools import ToolCollection


def _to_openai_messages(messages: list[MessageParam]) -> list[dict[str, Any]]: # noqa: C901
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I review this?

I would expect here an integration tests. Which coveres all message types.

Comment on lines 1 to 18
"""OpenAICompatibleMessagesApi —
MessagesApi implementation for OpenAI-compatible endpoints."""

import json as json_lib
from typing import Any

from openai import OpenAI
from typing_extensions import override

from askui.models.shared.agent_message_param import (
Base64ImageSourceParam,
ContentBlockParam,
ImageBlockParam,
MessageParam,
TextBlockParam,
ThinkingConfigParam,
ToolChoiceParam,
ToolResultBlockParam,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please delete OpenAI Act integration. -> Let's move this to another issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants