Merged
73 commits
dc041f7
Add generation type to ModelConfig
nabinchha Nov 25, 2025
0d6b830
pass tests
nabinchha Nov 25, 2025
254fd8a
added generate_text_embeddings
nabinchha Nov 25, 2025
1126ea1
tests
nabinchha Nov 25, 2025
744bc8f
remove sensitive=True old artifact no longer needed
nabinchha Nov 25, 2025
b913f8d
Slight refactor
nabinchha Nov 26, 2025
052db7a
slight refactor
nabinchha Nov 26, 2025
5504c8d
Added embedding generator
nabinchha Nov 26, 2025
4b6f877
chunk_separator -> chunk_pattern
nabinchha Nov 26, 2025
04fc0f3
update tests
nabinchha Nov 26, 2025
26d6da1
rename for consistency
nabinchha Nov 26, 2025
6facbd2
Restructure InferenceParameters -> CompletionInferenceParameters, Bas…
nabinchha Nov 26, 2025
2c1b267
Remove purpose from consolidated kwargs
nabinchha Nov 26, 2025
4b1492b
WithModelConfiguration.inference_parameters should should be typed wi…
nabinchha Dec 2, 2025
c445caf
Type as WithModelGeneration
nabinchha Dec 2, 2025
4b8aa2b
Add image generation modality
nabinchha Dec 2, 2025
2c5933f
update return type for generate_kwargs
nabinchha Dec 3, 2025
c6c29d4
make generation_type a field of ModelConfig as opposed to a prop reso…
nabinchha Dec 3, 2025
06a724b
remove regex based chunking from embedding generator
nabinchha Dec 3, 2025
6b9733f
Merge branch 'main' into nmulepati/feat/support-embedding-and-image-g…
nabinchha Dec 23, 2025
81949e6
Merge branch 'main' into nmulepati/feat/support-embedding-and-image-g…
nabinchha Feb 3, 2026
f291033
save progress
nabinchha Feb 4, 2026
e0a4657
Merge branch 'main' into nmulepati/feat/125-support-image-generation
nabinchha Feb 5, 2026
1506ab5
Simplify to ImageInferenceParams. Persist images in create mode to disk
nabinchha Feb 6, 2026
ed9787b
support generation of multiple images
nabinchha Feb 6, 2026
7dea87a
clean up visualization
nabinchha Feb 6, 2026
31cc24e
clean up some util methods + add tests
nabinchha Feb 6, 2026
0f07f7b
Streamline integration for image generation
nabinchha Feb 6, 2026
2aae6cc
streamline generation
nabinchha Feb 7, 2026
1677f06
track images generated in usage
nabinchha Feb 9, 2026
3b4acf1
fix image usage tracking
nabinchha Feb 9, 2026
33b4211
test clean up
nabinchha Feb 9, 2026
fad791e
Small refactor for simplicity
nabinchha Feb 9, 2026
54ebcc8
update ImageInferenceParams
nabinchha Feb 9, 2026
3aad608
add example tutorial for image generation
nabinchha Feb 9, 2026
f252c37
support multi-modal context in ImageColumnConfig
nabinchha Feb 10, 2026
d6a0f2f
updated tutorial notebook
nabinchha Feb 10, 2026
f5c6cf9
organize image artifacts by column name
nabinchha Feb 10, 2026
71e2bac
address pr comments
nabinchha Feb 10, 2026
46138d8
fix license headers
nabinchha Feb 10, 2026
b187ff4
Merge branch 'main' into nmulepati/feat/125-support-image-generation
nabinchha Feb 10, 2026
deb5fc2
generate collab notebooks
nabinchha Feb 10, 2026
d11d049
move pillow to lib dep from notebook
nabinchha Feb 10, 2026
511e1f2
update uv lock"
nabinchha Feb 10, 2026
2b22df8
remove legacy flag from display_sample_record
nabinchha Feb 10, 2026
9239544
remove unnecessary override of generate kwargs
nabinchha Feb 10, 2026
3a779aa
Restore some changes not needed
nabinchha Feb 10, 2026
33b6cd9
use a specific image generation exception instead of generic ModelAPI…
nabinchha Feb 10, 2026
3a98caf
more cleanup
nabinchha Feb 10, 2026
cd39941
more tests for hf image folder upload
nabinchha Feb 10, 2026
52e023d
Fix test
nabinchha Feb 10, 2026
c53a1dc
set init=False for media_storage
nabinchha Feb 10, 2026
8f813b1
handle image url in _display_image_if_in_notebook
nabinchha Feb 10, 2026
281859b
Merge branch 'main' into nmulepati/feat/125-support-image-generation
nabinchha Feb 10, 2026
782a346
Fix path traversal vulnerability in MediaStorage subfolder handling
nabinchha Feb 10, 2026
2d7a202
Fix PIL format detection in detect_image_format
nabinchha Feb 10, 2026
5fca3a6
Fix Pydantic v2 compatibility in ArtifactStorage
nabinchha Feb 10, 2026
a5fab8a
Move ArtifactStorage to engine/storage/ module
nabinchha Feb 10, 2026
cf2b364
Address PR review comments
nabinchha Feb 10, 2026
5aa7e10
Use regex for base64 character validation in is_base64_image
nabinchha Feb 10, 2026
ecaeb72
move to a constant
nabinchha Feb 10, 2026
622b1c4
fix pyproject.toml
nabinchha Feb 10, 2026
400e97b
regen colab notebooks
nabinchha Feb 10, 2026
469a3d2
raise a ValueError if we fail to detect image format
nabinchha Feb 10, 2026
1e43394
Fix diffusion image gen
nabinchha Feb 11, 2026
8f6be9b
Add requests to config pyproject.toml
nabinchha Feb 11, 2026
a277dc1
Merge branch 'nmulepati/feat/125-support-image-generation' into nmule…
nabinchha Feb 11, 2026
d85ccb3
Merge branch 'main' into nmulepati/feat/125-support-image-generation
nabinchha Feb 11, 2026
87dcab1
address pr feedback from andre
nabinchha Feb 11, 2026
12cc1fe
Merge branch 'main' into nmulepati/feat/125-support-image-generation
nabinchha Feb 11, 2026
a0ea92b
Merge branch 'main' into nmulepati/feat/125-support-image-generation
nabinchha Feb 12, 2026
7590778
Merge branch 'nmulepati/feat/125-support-image-generation' into nmule…
nabinchha Feb 12, 2026
f0ea307
Merge branch 'main' into nmulepati/chore/move-artifact-storage
nabinchha Feb 12, 2026
@@ -8,8 +8,8 @@
from typing import TYPE_CHECKING, Generic, TypeVar, get_origin

from data_designer.config.base import ConfigBase
from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.resources.resource_provider import ResourceProvider
from data_designer.engine.storage.artifact_storage import ArtifactStorage
from data_designer.lazy_heavy_imports import pd

if TYPE_CHECKING:
@@ -27,7 +27,6 @@
)
from data_designer.engine.column_generators.utils.generator_classification import column_type_is_model_generated
from data_designer.engine.compiler import compile_data_designer_config
from data_designer.engine.dataset_builders.artifact_storage import SDG_CONFIG_FILENAME, ArtifactStorage
from data_designer.engine.dataset_builders.errors import DatasetGenerationError
from data_designer.engine.dataset_builders.multi_column_configs import MultiColumnConfig
from data_designer.engine.dataset_builders.utils.concurrency import ConcurrentThreadExecutor
@@ -40,6 +39,7 @@
from data_designer.engine.processing.processors.drop_columns import DropColumnsProcessor
from data_designer.engine.registry.data_designer_registry import DataDesignerRegistry
from data_designer.engine.resources.resource_provider import ResourceProvider
from data_designer.engine.storage.artifact_storage import SDG_CONFIG_FILENAME, ArtifactStorage
from data_designer.engine.storage.media_storage import StorageMode
from data_designer.lazy_heavy_imports import pd

@@ -182,10 +182,6 @@ def _has_image_columns(self) -> bool:
"""Check if config has any image generation columns."""
return any(col.column_type == DataDesignerColumnType.IMAGE for col in self.single_column_configs)

def _has_image_columns(self) -> bool:
"""Check if config has any image generation columns."""
return any(col.column_type == DataDesignerColumnType.IMAGE for col in self.single_column_configs)

def _initialize_generators(self) -> list[ColumnGenerator]:
return [
self._registry.column_generators.get_for_config_type(type(config))(
@@ -8,8 +8,8 @@
from pathlib import Path
from typing import TYPE_CHECKING, Callable, Container, Iterator

from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage, BatchStage
from data_designer.engine.dataset_builders.utils.errors import DatasetBatchManagementError
from data_designer.engine.storage.artifact_storage import ArtifactStorage, BatchStage
from data_designer.lazy_heavy_imports import pd, pq

if TYPE_CHECKING:
@@ -8,15 +8,15 @@
from enum import Enum
from typing import TYPE_CHECKING

from data_designer.engine.dataset_builders.artifact_storage import BatchStage
from data_designer.engine.dataset_builders.errors import DatasetProcessingError
from data_designer.engine.storage.artifact_storage import BatchStage

if TYPE_CHECKING:
import pandas as pd

from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.dataset_builders.utils.dataset_batch_manager import DatasetBatchManager
from data_designer.engine.processing.processors.base import Processor
from data_designer.engine.storage.artifact_storage import ArtifactStorage

logger = logging.getLogger(__name__)

@@ -7,8 +7,8 @@
from typing import TYPE_CHECKING

from data_designer.config.processors import DropColumnsProcessorConfig
from data_designer.engine.dataset_builders.artifact_storage import BatchStage
from data_designer.engine.processing.processors.base import Processor
from data_designer.engine.storage.artifact_storage import BatchStage
from data_designer.lazy_heavy_imports import pd

if TYPE_CHECKING:
@@ -8,10 +8,10 @@
from typing import TYPE_CHECKING, Any

from data_designer.config.processors import SchemaTransformProcessorConfig
from data_designer.engine.dataset_builders.artifact_storage import BatchStage
from data_designer.engine.processing.ginja.environment import WithJinja2UserTemplateRendering
from data_designer.engine.processing.processors.base import Processor
from data_designer.engine.processing.utils import deserialize_json_values
from data_designer.engine.storage.artifact_storage import BatchStage
from data_designer.lazy_heavy_imports import pd

if TYPE_CHECKING:
@@ -10,7 +10,6 @@
from data_designer.config.run_config import RunConfig
from data_designer.config.seed_source import SeedSource
from data_designer.config.utils.type_helpers import StrEnum
from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.mcp.factory import create_mcp_registry
from data_designer.engine.mcp.registry import MCPRegistry
from data_designer.engine.model_provider import (
@@ -22,6 +21,7 @@
from data_designer.engine.resources.managed_storage import ManagedBlobStorage
from data_designer.engine.resources.seed_reader import SeedReader, SeedReaderRegistry
from data_designer.engine.secret_resolver import SecretResolver
from data_designer.engine.storage.artifact_storage import ArtifactStorage


class ResourceType(StrEnum):
@@ -23,10 +23,10 @@
DatasetProfilerConfig,
)
from data_designer.engine.analysis.utils.judge_score_processing import JudgeScoreDistributions
from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.models.registry import ModelRegistry
from data_designer.engine.registry.data_designer_registry import DataDesignerRegistry
from data_designer.engine.resources.resource_provider import ResourceProvider
from data_designer.engine.storage.artifact_storage import ArtifactStorage
from data_designer.lazy_heavy_imports import pa, pd

if TYPE_CHECKING:
2 changes: 1 addition & 1 deletion packages/data-designer-engine/tests/engine/conftest.py
@@ -9,11 +9,11 @@
import pytest

from data_designer.config.run_config import RunConfig
from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.models.facade import ModelFacade
from data_designer.engine.models.registry import ModelRegistry
from data_designer.engine.resources.managed_storage import ManagedBlobStorage
from data_designer.engine.resources.resource_provider import ResourceProvider
from data_designer.engine.storage.artifact_storage import ArtifactStorage
from data_designer.lazy_heavy_imports import pd

if TYPE_CHECKING:
@@ -9,9 +9,9 @@

import pytest

from data_designer.engine.dataset_builders.artifact_storage import BatchStage
from data_designer.engine.dataset_builders.utils.dataset_batch_manager import DatasetBatchManager
from data_designer.engine.dataset_builders.utils.errors import DatasetBatchManagementError
from data_designer.engine.storage.artifact_storage import BatchStage
from data_designer.lazy_heavy_imports import pd

if TYPE_CHECKING:
@@ -9,8 +9,8 @@
import pytest

from data_designer.config.processors import DropColumnsProcessorConfig
from data_designer.engine.dataset_builders.artifact_storage import BatchStage
from data_designer.engine.processing.processors.drop_columns import DropColumnsProcessor
from data_designer.engine.storage.artifact_storage import BatchStage
from data_designer.lazy_heavy_imports import pd

if TYPE_CHECKING:
@@ -10,9 +10,9 @@
import pytest

from data_designer.config.processors import SchemaTransformProcessorConfig
from data_designer.engine.dataset_builders.artifact_storage import BatchStage
from data_designer.engine.processing.processors.schema_transform import SchemaTransformProcessor
from data_designer.engine.resources.resource_provider import ResourceProvider
from data_designer.engine.storage.artifact_storage import BatchStage
from data_designer.lazy_heavy_imports import pd

if TYPE_CHECKING:
@@ -6,13 +6,13 @@
import pytest

from data_designer.config.mcp import LocalStdioMCPProvider, MCPProvider, ToolConfig
from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.models.registry import ModelRegistry
from data_designer.engine.resources.resource_provider import (
ResourceProvider,
_validate_tool_configs_against_providers,
create_resource_provider,
)
from data_designer.engine.storage.artifact_storage import ArtifactStorage


def _stub_model_registry() -> ModelRegistry:
@@ -11,8 +11,8 @@
import pytest
from pyarrow import ArrowNotImplementedError

from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage, BatchStage
from data_designer.engine.dataset_builders.errors import ArtifactStorageError
from data_designer.engine.storage.artifact_storage import ArtifactStorage, BatchStage
from data_designer.lazy_heavy_imports import pd

if TYPE_CHECKING:
@@ -201,7 +201,7 @@ def test_artifact_storage_batch_numbering(stub_artifact_storage, batch_number):
assert path.name == expected_name


@patch("data_designer.engine.dataset_builders.artifact_storage.datetime")
@patch("data_designer.engine.storage.artifact_storage.datetime")
def test_artifact_storage_resolved_dataset_name(mock_datetime, tmp_path):
mock_datetime.now.return_value = datetime(2025, 1, 1, 12, 3, 4)
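
Note on the `@patch` target change above: `unittest.mock.patch` replaces a name in the module where it is looked up, so the target string most likely had to follow `artifact_storage` to its new package for the mock to take effect. The sketch below illustrates that behavior under stated assumptions. `fake_storage` and `resolved_name` are hypothetical stand-ins built at runtime, not part of `data_designer`, and the name format is made up for the example.

```python
import sys
import types
from datetime import datetime
from unittest.mock import patch

# Hypothetical stand-in for a module that holds its own `datetime` reference
# (as a module doing `from datetime import datetime` would). It is NOT the
# real artifact_storage module.
fake_storage = types.ModuleType("fake_storage")
fake_storage.datetime = datetime


def resolved_name(prefix: str) -> str:
    # Looks up `datetime` through the stand-in module's namespace at call time.
    return f"{prefix}_{fake_storage.datetime.now():%m-%d-%Y_%H%M%S}"


fake_storage.resolved_name = resolved_name
sys.modules["fake_storage"] = fake_storage  # make the dotted path resolvable


def demo() -> str:
    # The patch target names the module that holds the reference being replaced.
    with patch("fake_storage.datetime") as mock_datetime:
        mock_datetime.now.return_value = datetime(2025, 1, 1, 12, 3, 4)
        return fake_storage.resolved_name("dataset")
```

If the stand-in module were renamed, the string passed to `patch` would have to be updated to match. That mirrors the one-line change to the decorator in this hunk.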

@@ -10,9 +10,9 @@

from data_designer.config.base import ConfigBase
from data_designer.engine.configurable_task import ConfigurableTask, DataT, TaskConfigT
from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.models.registry import ModelRegistry
from data_designer.engine.resources.resource_provider import ResourceProvider
from data_designer.engine.storage.artifact_storage import ArtifactStorage
from data_designer.lazy_heavy_imports import pd

if TYPE_CHECKING:
@@ -13,7 +13,7 @@
from huggingface_hub.utils import HfHubHTTPError, validate_repo_id

from data_designer.config.utils.constants import HUGGINGFACE_HUB_DATASET_URL_PREFIX
from data_designer.engine.dataset_builders.artifact_storage import (
from data_designer.engine.storage.artifact_storage import (
FINAL_DATASET_FOLDER_NAME,
METADATA_FILENAME,
PROCESSORS_OUTPUTS_FOLDER_NAME,
@@ -33,7 +33,6 @@
from data_designer.config.utils.info import InfoType, InterfaceInfo
from data_designer.engine.analysis.dataset_profiler import DataDesignerDatasetProfiler, DatasetProfilerConfig
from data_designer.engine.compiler import compile_data_designer_config
from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.dataset_builders.column_wise_builder import ColumnWiseDatasetBuilder
from data_designer.engine.model_provider import resolve_model_provider_registry
from data_designer.engine.resources.managed_storage import init_managed_blob_storage
@@ -51,6 +50,7 @@
PlaintextResolver,
SecretResolver,
)
from data_designer.engine.storage.artifact_storage import ArtifactStorage
from data_designer.interface.errors import (
DataDesignerGenerationError,
DataDesignerProfilingError,
@@ -10,8 +10,8 @@
from data_designer.config.config_builder import DataDesignerConfigBuilder
from data_designer.config.dataset_metadata import DatasetMetadata
from data_designer.config.utils.visualization import WithRecordSamplerMixin
from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.dataset_builders.errors import ArtifactStorageError
from data_designer.engine.storage.artifact_storage import ArtifactStorage
from data_designer.integrations.huggingface.client import HuggingFaceHubClient
from data_designer.lazy_heavy_imports import pd

2 changes: 1 addition & 1 deletion packages/data-designer/tests/interface/test_results.py
@@ -14,7 +14,7 @@
from data_designer.config.preview_results import PreviewResults
from data_designer.config.utils.errors import DatasetSampleDisplayError
from data_designer.config.utils.visualization import display_sample_record as display_fn
from data_designer.engine.dataset_builders.artifact_storage import ArtifactStorage
from data_designer.engine.storage.artifact_storage import ArtifactStorage
from data_designer.interface.results import DatasetCreationResults
from data_designer.lazy_heavy_imports import pd
