feat(engine): env-var switch for async-first models experiment #280
base: main
Conversation
Force-pushed from de634c0 to 1129ed6
Adds an opt-in async execution path (`DATA_DESIGNER_ASYNC_ENGINE=1`) for the cell-by-cell generation pipeline. Replaces thread-pool concurrency with native asyncio `TaskGroup` + `Semaphore` for bounded concurrent LLM calls, while keeping the sync path as the default.

Key changes:
- `ModelFacade`: `acompletion()`, `agenerate_text_embeddings()`, `agenerate()`
- `acatch_llm_exceptions` decorator (async mirror of `catch_llm_exceptions`)
- `AsyncConcurrentExecutor` with persistent background event loop
- `ColumnWiseBuilder` branches on env var to fan out via async or threads
- Benchmark updated with async mock support

Co-Authored-By: Remi <noreply@anthropic.com>
Resolved conflicts:
- `llm_completion.py`: kept the `agenerate()` async method plus main's new `_extract_reasoning_content()`, `TraceType` handling, and `extract_reasoning_content` config. Updated `agenerate()` to match main's trace-handling patterns.
- `column_wise_builder.py`: kept the `DATA_DESIGNER_ASYNC_ENGINE` env var and adopted main's `get_library_version()`, replacing `importlib.metadata`.

Co-Authored-By: Remi <noreply@anthropic.com>
…models/

Delete the `models_v2/` package (~2,500 lines) that was a near-complete copy of `models/` with only ~250 lines of actual async additions. Instead:
- Add `acatch_llm_exceptions` to `models/errors.py`
- Add `acompletion`, `agenerate`, `agenerate_text_embeddings` to `ModelFacade`
- Fix `agenerate()` to include `total_tool_calls` tracking (missing in the v2 fork)
- Fix `agenerate()` parser default to use `_identity` (missing in the v2 fork)
- Remove the `__path__` swap machinery from `models/__init__.py`
- The `DATA_DESIGNER_ASYNC_ENGINE` env var now gates at the right level: `column_wise_builder` choosing between `AsyncConcurrentExecutor` and `ConcurrentThreadExecutor`, not swapping entire module trees

Also deduplicate:
- `llm_completion.py`: extract `_prepare_generation_kwargs`/`_process_generation_result`
- `column_wise_builder.py`: extract `_setup_fan_out`/`_finalize_fan_out`

All tests pass (`make test`, `make lint`, `make format`, `make update-license-headers`).

Co-Authored-By: Remi <noreply@anthropic.com>
Address Codex review findings:
- Add 11 async behavior tests for `ModelFacade` (`acompletion`, `agenerate`, `agenerate_text_embeddings`) mirroring existing sync test patterns
- Add a default `agenerate()` to the `ColumnGenerator` base class that delegates to sync `generate()` via `asyncio.to_thread`, fixing an `AttributeError` for `EmbeddingCellGenerator` and `CustomColumnGenerator` under the async engine
- Add `coro.close()` cleanup in the early returns of `AsyncConcurrentExecutor._run_task` to prevent "coroutine was never awaited" warnings
- Tighten types: `list[ChatMessage]` for traces, `list[dict[str, Any]]` for `multi_modal_context`, `dict[str, Any]` for executor kwargs

Co-Authored-By: Remi <noreply@anthropic.com>
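The base-class fallback described above (sync-only generators still working under the async engine) can be sketched as follows. The class and method names mirror the commit message, but the bodies and the `UppercaseGenerator` example are illustrative, not the actual Data Designer implementation.

```python
# Sketch: a default agenerate() that delegates to the sync generate()
# on a worker thread, so a blocking generator never stalls the event loop.
import asyncio


class ColumnGenerator:
    def generate(self, record: dict) -> dict:
        raise NotImplementedError

    async def agenerate(self, record: dict) -> dict:
        # Subclasses with a native async path override this; everyone else
        # gets correct (if not faster) behavior via a thread hop.
        return await asyncio.to_thread(self.generate, record)


class UppercaseGenerator(ColumnGenerator):
    """Hypothetical sync-only generator standing in for e.g. a custom column."""

    def generate(self, record: dict) -> dict:
        return {**record, "name": record["name"].upper()}


print(asyncio.run(UppercaseGenerator().agenerate({"name": "ada"})))  # → {'name': 'ADA'}
```

This is why the fix removes the `AttributeError`: the async engine can call `agenerate()` on any generator, and only the ones that benefit need to implement it natively.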
Greptile Summary

This PR adds async engine support to Data Designer's model inference layer, enabling true async concurrency via `asyncio`.

Key changes and architecture highlights:
| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/models/facade.py | Added three async methods (acompletion, agenerate, agenerate_text_embeddings) mirroring sync methods, with proper async/await and error handling via @acatch_llm_exceptions decorator |
| packages/data-designer-engine/src/data_designer/engine/models/errors.py | Added acatch_llm_exceptions async decorator that properly wraps async functions and reuses existing exception handling logic |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/utils/async_concurrency.py | New AsyncConcurrentExecutor mirrors ConcurrentThreadExecutor API with async task execution, bounded concurrency via semaphore, and early shutdown on error rate threshold |
| packages/data-designer-engine/src/data_designer/engine/column_generators/generators/llm_completion.py | Refactored to extract shared logic into _prepare_generation_kwargs and _process_generation_result, added async agenerate method that reuses these helpers |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/column_wise_builder.py | Added env var gating (DATA_DESIGNER_ASYNC_ENGINE) to switch between thread and async executors, refactored fan-out setup/teardown into shared helpers |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant ColumnWiseBuilder
    participant AsyncExecutor
    participant Generator
    participant ModelFacade
    participant LiteLLM
    User->>ColumnWiseBuilder: build() with DATA_DESIGNER_ASYNC_ENGINE=1
    ColumnWiseBuilder->>ColumnWiseBuilder: _run_cell_by_cell_generator()
    ColumnWiseBuilder->>AsyncExecutor: AsyncConcurrentExecutor(max_workers=N)
    loop For each record
        ColumnWiseBuilder->>AsyncExecutor: add work_item(generator.agenerate(record))
    end
    ColumnWiseBuilder->>AsyncExecutor: run(work_items)
    AsyncExecutor->>AsyncExecutor: _ensure_async_engine_loop()
    AsyncExecutor->>AsyncExecutor: _run_all() on event loop
    par Concurrent async tasks (max N)
        AsyncExecutor->>Generator: agenerate(record_1)
        Generator->>ModelFacade: agenerate(prompt, parser, ...)
        ModelFacade->>ModelFacade: acompletion(messages)
        ModelFacade->>LiteLLM: router.acompletion()
        LiteLLM-->>ModelFacade: response
        ModelFacade->>ModelFacade: parser(response)
        ModelFacade-->>Generator: (output, trace)
        Generator-->>AsyncExecutor: result_dict
    and
        AsyncExecutor->>Generator: agenerate(record_N)
        Generator->>ModelFacade: agenerate(prompt, parser, ...)
        ModelFacade->>LiteLLM: router.acompletion()
        LiteLLM-->>ModelFacade: response
        ModelFacade-->>Generator: (output, trace)
        Generator-->>AsyncExecutor: result_dict
    end
    AsyncExecutor-->>ColumnWiseBuilder: execution complete
    ColumnWiseBuilder-->>User: dataset built
```
```python
@patch.object(litellm_overrides, "quiet_noisy_logger", autospec=True)
def test_apply_litellm_patches(mock_quiet_noisy_logger: object) -> None:
    litellm_overrides.apply_litellm_patches()
```
Was the previous test breaking without having to patch the object directly? What changed?
Same question for the other tests where we switched from `@patch("...")` to `@patch.object(...)`.
No runtime behavior changed in `litellm_overrides` that forced this switch. `patch.object` patches the exact object reference we import and use, which avoids brittle string targets that can silently break during refactors. If a module gets reorganized, a string target can quietly stop pointing at the object under test, while `patch.object` fails loudly at the patch site. We switched for robustness.
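The tradeoff described above can be shown with a small self-contained example. The `mymod` namespace and `quiet_noisy_logger`/`apply_patches` names below are throwaway stand-ins for the real module, used only to make the snippet runnable.

```python
# patch.object takes a direct reference to the holder object, so it patches
# (and later restores) exactly the attribute the code under test reads.
import types
from unittest import mock

mymod = types.SimpleNamespace(quiet_noisy_logger=lambda: "real")


def apply_patches():
    return mymod.quiet_noisy_logger()


with mock.patch.object(mymod, "quiet_noisy_logger", return_value="mocked"):
    assert apply_patches() == "mocked"

# Outside the context the original attribute is restored.
assert apply_patches() == "real"

# A string target such as mock.patch("package.module.quiet_noisy_logger")
# is re-resolved by importing the dotted path at patch time, so after a
# module move or re-export shuffle it may patch a name the code under test
# no longer reads. patch.object fails immediately if the attribute is gone.
```

`autospec=True` (used in the actual test) additionally makes the mock enforce the real callable's signature.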
remi
```python
    to a specific event loop.
    """
    global _loop, _thread
    with _lock:
```
should _lock also be declared as global?
`_lock` doesn't need `global`. In Python, the `global` keyword is only required when you reassign a module-level name inside a function. Without it, `_loop = asyncio.new_event_loop()` would create a local variable shadowing the module-level one. `_lock` is only ever read (`with _lock:`), never reassigned, so Python resolves it to the module scope automatically. `_loop` and `_thread` need `global` because both are reassigned on lines 60-61; `_lock` doesn't because nothing ever rebinds it.
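The scoping rule above in a minimal runnable form (the string `"loop"` stands in for `asyncio.new_event_loop()`):

```python
# global is needed only for names a function rebinds, not names it reads.
import threading

_lock = threading.Lock()
_loop = None


def ensure():
    global _loop      # required: _loop is reassigned below
    with _lock:       # no global needed: _lock is only read, never rebound
        if _loop is None:
            _loop = "loop"  # stand-in for asyncio.new_event_loop()
    return _loop


print(ensure())  # → loop
```

Dropping the `global _loop` line would make `_loop = "loop"` bind a function-local variable, leaving the module-level `_loop` forever `None`.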
Separately, looking more closely at this function, there's a subtle startup race. Between _thread.start() returning and the background thread actually entering run_forever(), _loop.is_running() returns False. If a second caller enters the lock during that window, it creates a second loop, orphaning the first and splitting work across two event loops. That's the loop-affinity bug this singleton pattern exists to prevent.
Low probability for single-build usage, but since this is a shared singleton path it's worth hardening. A minimal fix would be a threading.Event readiness handshake where the background thread sets it right before run_forever(), and _ensure_async_engine_loop holds the lock until the event fires. We can pick that up in a follow-up.
remi
```python
    return _loop


class AsyncConcurrentExecutor:
```
Would this class also encapsulate auto-throttling of max_workers based on failures it is seeing? It will need to unpack and look for 429s, I assume?
Yes, this class would have to act on the 429s for that. One of those actions could be to adjust the effective concurrency of the semaphore, by some means of plumbing.
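One possible shape of that plumbing is sketched below: rather than mutating the `Semaphore` internals, permanently park permits when 429s are observed. This is purely illustrative; `AdaptiveLimiter` and its methods are hypothetical and not part of this PR.

```python
# Sketch: shrink effective concurrency on rate limiting by acquiring
# semaphore permits that are never released.
import asyncio


class AdaptiveLimiter:
    def __init__(self, max_workers: int, floor: int = 1):
        self._sem = asyncio.Semaphore(max_workers)
        self._limit = max_workers
        self._floor = floor
        self._reserved = 0  # permits parked after observed 429s

    async def __aenter__(self):
        await self._sem.acquire()
        return self

    async def __aexit__(self, *exc):
        self._sem.release()

    def on_rate_limited(self) -> None:
        # Called when a response unpacks to HTTP 429: park one permit,
        # never dropping below the configured floor.
        if self._limit - self._reserved > self._floor:
            self._reserved += 1
            asyncio.ensure_future(self._park_one())

    async def _park_one(self) -> None:
        await self._sem.acquire()  # acquired and intentionally never released

    @property
    def effective_limit(self) -> int:
        return self._limit - self._reserved


async def demo():
    limiter = AdaptiveLimiter(max_workers=4)
    async with limiter:
        limiter.on_rate_limited()  # pretend this call returned HTTP 429
    await asyncio.sleep(0)  # let the parking task run
    return limiter.effective_limit


print(asyncio.run(demo()))  # → 3
```

Growing the limit back after a quiet period would need the inverse (releasing a parked permit), which is why a wrapper object is easier to reason about than poking at `Semaphore` attributes directly.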
```python
if DATA_DESIGNER_ASYNC_ENGINE:
    logger.info("⚡ Using async engine for concurrent execution")
    self._fan_out_with_async(generator, max_workers=max_workers)
else:
    self._fan_out_with_threads(generator, max_workers=max_workers)
```
Perhaps we leave this dataset builder untouched, since the perf gain with async isn't there. When we work on the new builder, we can start to use it there.
This switch is critical for being able to test end-to-end correctness of the async implementations. Without this we do not have a clean way to say that the async stack is correct. However, yes, once moving to async tasks we would need to hoist this context to some higher step and it wouldn't exist within column_wise_builder.
Summary
Add async engine support to Data Designer's model inference layer, enabling true async concurrency via `asyncio` as an alternative to the existing thread-based fan-out.

What changed
**Async ModelFacade methods**: three new async methods on `ModelFacade`:
- `acompletion()`: calls `Router.acompletion()` (LiteLLM native async)
- `agenerate()`: async generation loop with correction/restart logic, MCP tool calling, and usage tracking
- `agenerate_text_embeddings()`: async embedding generation

**AsyncConcurrentExecutor**: new executor that runs coroutines on an event loop instead of dispatching to a thread pool. Used when `DATA_DESIGNER_ASYNC_ENGINE=1`.

**Environment variable gating**: `DATA_DESIGNER_ASYNC_ENGINE=1` controls whether `column_wise_builder.py` dispatches to `AsyncConcurrentExecutor` (async) or `ConcurrentThreadExecutor` (threads). The gate is at the executor-selection level, not at the module import level.

Architecture
The async methods live directly on `ModelFacade` alongside the sync methods: no separate package, no import redirection. The only difference at the call site is `self.model.generate(...)` vs `await self.model.agenerate(...)`.

Shared logic in `llm_completion.py` is factored into `_prepare_generation_kwargs()` and `_process_generation_result()`, so `generate()` and `agenerate()` are each ~3 lines.

Shared fan-out setup in `column_wise_builder.py` is factored into `_setup_fan_out()` and `_finalize_fan_out()`, so each fan-out method is ~7 lines.

Files changed
- `engine/models/errors.py`: `acatch_llm_exceptions` async decorator
- `engine/models/facade.py`: `acompletion`, `agenerate`, `agenerate_text_embeddings`
- `engine/models/__init__.py`
- `engine/.../llm_completion.py`
- `engine/.../column_wise_builder.py`
- `engine/.../utils/async_concurrency.py`: `AsyncConcurrentExecutor`
- `tests/.../test_async_engine_switch.py`
- `make test`: all tests pass (487 across 3 packages)
- `make lint-fix`: all checks passed
- `make format`: no changes needed
- `make update-license-headers`: all headers current
- No references to `models_v2` in the codebase
- `ModelFacade` has all 3 async methods, confirmed via import check

🤖 Generated with Claude Code