add subStore support #3

Open
bupt-lmy wants to merge 1 commit into ob-labs:main from bupt-lmy:sub_store

bupt-lmy (Contributor) commented Feb 7, 2026

Summary

  • Add sub-store routing support (Python sub_stores / SubStorageAdapter concept) to the Java SDK.
  • Allow routing memories/searches to different VectorStore + Embedder pairs based on metadata (add) / filters (search).
  • Extend ConfigLoader to parse sub-store configuration from .env / env vars (format inspired by Python .env.example style).
  • Add unit tests to validate routing behavior and config parsing.

Motivation

Python PowerMem supports sub_stores to partition data across multiple physical collections/tables and/or multiple embedding models, routing by metadata filters. This is useful for:

  • multi-tenant routing (e.g. tenant=A → store A)
  • multi-model routing (different embedding dims/models)
  • gradual migration to new collections

This PR brings the same capability to the Java SDK.

What’s included

1) Storage adapter routing (SubStorageAdapter)

  • SubStorageAdapter now implements real routing instead of being a stub.
  • Routing rule: a sub store is selected only when every (key, value) pair in its routingFilter matches the incoming metadata (on add) or filters (on search).
  • Adds Python-parity helper methods:
    • listSubStores()
    • getTargetStoreName(filtersOrMetadata)
    • isSubStoreReady(storeName) / setSubStoreReady(storeName, ready)

Files:

  • src/main/java/com/oceanbase/powermem/sdk/storage/adapter/SubStorageAdapter.java
  • src/main/java/com/oceanbase/powermem/sdk/storage/adapter/StorageAdapter.java (adds embed(text, action, ctx) hook)
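
To make the "all conditions match" rule concrete, here is a minimal sketch of how such routing could look. It is not the code in SubStorageAdapter.java: the class name SubStoreRoutingSketch, the "main" fallback name, the first-match-wins ordering, and the choice to never match an empty filter are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Illustrative sketch only; not the SDK's SubStorageAdapter. */
final class SubStoreRoutingSketch {
    /** Name used for the main store in this sketch (assumed). */
    static final String MAIN_STORE = "main";

    // Sub-store name -> routing filter, kept in registration order.
    private final Map<String, Map<String, Object>> routingFilters = new LinkedHashMap<>();

    void addSubStore(String name, Map<String, Object> routingFilter) {
        routingFilters.put(name, routingFilter);
    }

    /** True when every (key, value) pair of the filter is present in the input. */
    static boolean matches(Map<String, Object> routingFilter,
                           Map<String, Object> filtersOrMetadata) {
        if (routingFilter.isEmpty() || filtersOrMetadata == null) {
            return false; // assumption: an empty filter never captures traffic
        }
        for (Map.Entry<String, Object> pair : routingFilter.entrySet()) {
            Object actual = filtersOrMetadata.get(pair.getKey());
            if (!Objects.equals(pair.getValue(), actual)) {
                return false; // one mismatch is enough to reject this sub store
            }
        }
        return true;
    }

    /** First registered sub store whose filter matches wins; otherwise the main store. */
    String getTargetStoreName(Map<String, Object> filtersOrMetadata) {
        for (Map.Entry<String, Map<String, Object>> store : routingFilters.entrySet()) {
            if (matches(store.getValue(), filtersOrMetadata)) {
                return store.getKey();
            }
        }
        return MAIN_STORE;
    }
}
```

Extra keys in the incoming map do not prevent a match; only the keys named in the routing filter are compared.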

2) Memory wiring (use routed embedder + store)

Memory now:

  • builds SubStorageAdapter when MemoryConfig.subStores is configured
  • routes embedding generation through storage.embed(...) so the right embedder is used per store
  • routes get() through storage.getMemory(...) so records can be found in sub stores

File:

  • src/main/java/com/oceanbase/powermem/sdk/core/Memory.java
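
The reason embedding has to go through the storage layer is that each sub store may use a different model or vector dimension. The sketch below illustrates that idea only. RoutedEmbeddingSketch and its methods are hypothetical, embedders are modelled as plain functions, and the target store is passed in directly, whereas the real hook is embed(text, action, ctx).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative sketch only; the types here are hypothetical stand-ins. */
final class RoutedEmbeddingSketch {
    private final Function<String, float[]> mainEmbedder;
    private final Map<String, Function<String, float[]>> subStoreEmbedders = new HashMap<>();

    RoutedEmbeddingSketch(Function<String, float[]> mainEmbedder) {
        this.mainEmbedder = mainEmbedder;
    }

    /** Registers an embedder override for one sub store. */
    void registerSubStoreEmbedder(String storeName, Function<String, float[]> embedder) {
        subStoreEmbedders.put(storeName, embedder);
    }

    /** Uses the sub store's own embedder when one is registered, else the main embedder. */
    float[] embed(String text, String targetStore) {
        return subStoreEmbedders.getOrDefault(targetStore, mainEmbedder).apply(text);
    }
}
```

With this shape, a sub store that has no embedder override inherits the main embedder, which matches the behavior described for SUB_STORE_0_EMBEDDING_PROVIDER in the usage example.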

3) Configuration model

  • New config model SubStoreConfig for a sub store definition.
  • Adds MemoryConfig.subStores.
  • Adds SubStoreConfig.ready to emulate Python’s “migration readiness” gating (see parity notes below).

Files:

  • src/main/java/com/oceanbase/powermem/sdk/config/SubStoreConfig.java
  • src/main/java/com/oceanbase/powermem/sdk/config/MemoryConfig.java

4) ConfigLoader: .env / env var parsing

ConfigLoader now supports:

  • Indexed sub-store configuration:
    • SUB_STORES_COUNT=... (optional; indices can also be auto-detected)
    • SUB_STORE_0_COLLECTION=...
    • routing filter keys via SUB_STORE_0_ROUTE_<KEY>=<VALUE> (keys are normalized to lower-case)
    • optional overrides for vector store + embedder, all prefixed by SUB_STORE_0_...
  • Optional advanced input:
    • SUB_STORES_JSON=[{...}, {...}] (best-effort)

Files:

  • src/main/java/com/oceanbase/powermem/sdk/config/ConfigLoader.java
  • .env.example
  • src/main/resources/.env.example
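
As a rough illustration of the indexed format, the following sketch reads routing keys and a sub-store count from an environment map. It is not the ConfigLoader implementation: SubStoreEnvSketch is a hypothetical name, and the auto-detection strategy (counting contiguous SUB_STORE_<i>_COLLECTION entries from index 0) is an assumption, since the PR does not say how detection works.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; not the SDK's ConfigLoader. */
final class SubStoreEnvSketch {

    /** Collects SUB_STORE_<index>_ROUTE_<KEY>=<VALUE> entries, lower-casing each key. */
    static Map<String, String> routingFilter(Map<String, String> env, int index) {
        String prefix = "SUB_STORE_" + index + "_ROUTE_";
        Map<String, String> filter = new TreeMap<>();
        for (Map.Entry<String, String> entry : env.entrySet()) {
            String name = entry.getKey();
            if (name.startsWith(prefix) && name.length() > prefix.length()) {
                String key = name.substring(prefix.length()).toLowerCase(Locale.ROOT);
                filter.put(key, entry.getValue());
            }
        }
        return filter;
    }

    /**
     * Returns SUB_STORES_COUNT when it is set. Otherwise counts contiguous
     * indices starting at 0 that define a collection (assumed strategy).
     */
    static int subStoreCount(Map<String, String> env) {
        String declared = env.get("SUB_STORES_COUNT");
        if (declared != null && !declared.trim().isEmpty()) {
            return Integer.parseInt(declared.trim());
        }
        int count = 0;
        while (env.containsKey("SUB_STORE_" + count + "_COLLECTION")) {
            count++;
        }
        return count;
    }
}
```

Passing System.getenv() or a parsed .env map as env would work equally well here, since both are plain string-to-string maps.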

Usage example (dotenv)

# Enable sub stores
SUB_STORES_COUNT=1

# Route metadata/filters.category=pref to a sub collection/store
SUB_STORE_0_COLLECTION=memories_pref
SUB_STORE_0_ROUTE_CATEGORY=pref

# Optional readiness gating (default is true)
SUB_STORE_0_READY=true

# Optional: sub store uses its own VectorStore config
SUB_STORE_0_DATABASE_PROVIDER=sqlite
SUB_STORE_0_SQLITE_PATH=./data/powermem_sub_pref.db

# Optional: sub store uses its own embedder config (otherwise inherits main embedder)
SUB_STORE_0_EMBEDDING_PROVIDER=mock

Python parity notes (differences and alignment)

Aligned / implemented

  • Routing by metadata/filters: implemented with “all conditions match” semantics.
  • CRUD/search routing: add/search/get/update/delete operations route to the right store.
  • Helper methods: listSubStores/getTargetStoreName/isSubStoreReady equivalents exist in Java.

Intentional differences / current gaps

  • OceanBase-only restriction:
    • Python currently only enables sub_stores for OceanBase (non-OB configs are ignored with a warning).
    • Java implementation allows sub-store routing for any VectorStore (including SQLite), which is useful for local/offline testing.
  • DB-backed migration status:
    • Python uses a DB table sub_store_migration_status to persist readiness and supports migrate_to_sub_store(...).
    • Java does not implement DB-backed migration status yet. Instead:
      • SubStoreConfig.ready + setSubStoreReady(...) provide a lightweight readiness gate.
      • Full migrate_to_sub_store(...) parity is a follow-up.
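
A lightweight readiness gate of this kind could be as small as the sketch below. This is an assumption-laden illustration, not the PR's code: ReadinessGateSketch and resolve(...) are hypothetical names, and falling back to the main store when a sub store is not ready is inferred from the "ready=false fallback behavior" test description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only; readiness is held in memory, not in a DB table. */
final class ReadinessGateSketch {
    private final Map<String, Boolean> ready = new ConcurrentHashMap<>();

    void setSubStoreReady(String storeName, boolean isReady) {
        ready.put(storeName, isReady);
    }

    /** Stores with no recorded state count as ready, mirroring the default of true. */
    boolean isSubStoreReady(String storeName) {
        return ready.getOrDefault(storeName, Boolean.TRUE);
    }

    /** Returns the routed store when it is ready, otherwise the main store (assumed fallback). */
    String resolve(String routedStore, String mainStore) {
        return isSubStoreReady(routedStore) ? routedStore : mainStore;
    }
}
```

Because the state lives only in memory, it is lost on restart, which is the gap a DB-backed status table would close.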

Test plan

  • Run all tests:
    • mvn test
  • Added tests:
    • SubStorageAdapterRoutingTest (routing + CRUD/search end-to-end on sqlite)
    • SubStorageAdapterReadyTest (ready=false fallback behavior)
    • SubStoreConfigLoaderTest (dotenv/env map parsing)

Follow-ups (optional)

  • Implement OceanBase DB-backed migration status table + migrate_to_sub_store to fully match Python behavior.
  • Consider exposing a stable public API for sub-store administration (migration status, listing, diagnostics).
