refactor: Add dynamic plugin loading for enterprise components #1736
base: main

Conversation
## What
- Add dynamic plugin loading support to the OSS codebase
- Enable enterprise components to be loaded at runtime without modifying tracked files

## Why
- Enterprise code was overwriting git-tracked OSS files, causing a dirty git state
- Need clean separation between the OSS and enterprise codebases
- OSS should work independently, without enterprise components

## How
- `unstract_migrations.py`: uses try/except `ImportError` to load from `pluggable_apps.migrations_ext`
- `api_hub_usage_utils.py`: uses try/except `ImportError` to load from `plugins.verticals_usage`
- `utils.py`: uses try/except `ImportError` to load from `pluggable_apps.manual_review_v2` and `plugins.workflow_manager.workflow_v2.rule_engine`
- `backend.Dockerfile`: conditionally installs `requirements.txt` if present

## Can this PR break any existing features? If yes, please list possible items. If no, please explain why.
- No. The changes add optional plugin loading that gracefully falls back to default behavior when plugins are not present. Existing OSS functionality is preserved.

## Database Migrations
- None

## Env Config
- None

## Relevant Docs
- None

## Related Issues or PRs
- None

## Dependencies Versions
- None

## Notes on Testing
- OSS build: verify the app starts and works without enterprise plugins
- Enterprise build: verify plugins are loaded and function correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Summary by CodeRabbit
### Walkthrough
Adds an extensible migrations class, replaces OSS no-op API Hub usage functions with plugin-driven implementations, implements workflow utility functions that delegate to pluggable helpers, and conditionally installs extra Python deps in the backend Docker production stage.
### Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant APIHubUtil as APIHubUsageUtil
    participant Plugin as verticals_usage plugin
    participant HeadersCache as headers_cache_class
    participant UsageTracker as usage_tracker
    APIHubUtil->>Plugin: obtain plugin (verticals_usage)
    APIHubUtil->>Plugin: create headers_cache and usage_tracker
    APIHubUtil->>Plugin: call extract_api_hub_headers(request)
    Plugin-->>APIHubUtil: headers (or None / raises)
    alt headers present
        APIHubUtil->>HeadersCache: store_headers(headers)
        HeadersCache-->>APIHubUtil: success/failure
        APIHubUtil->>UsageTracker: store usage(headers, metadata)
        UsageTracker-->>APIHubUtil: success/failure
        APIHubUtil-->>Caller: return True
    else no headers or error
        APIHubUtil-->>Caller: return False
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py (1)
**79-106**: Remove unused `ttl_seconds` parameter or pass it to `store_headers`.

The `ttl_seconds` parameter (line 83) is not passed to `store_headers` (line 103). Either remove the parameter from the method signature or pass it to the underlying implementation. Additionally, use `logger.exception` instead of `logger.error` to capture stack traces.

🔧 Proposed fix

```diff
 try:
-    return api_hub_headers_cache.store_headers(execution_id, headers)
+    return api_hub_headers_cache.store_headers(execution_id, headers, ttl_seconds)
 except Exception as e:
-    logger.error(f"Error caching API hub headers: {e}")
+    logger.exception(f"Error caching API hub headers: {e}")
     return False
```
🤖 Fix all issues with AI agents
In @backend/plugins/workflow_manager/workflow_v2/utils.py:
- Around line 38-46: Remove the unused import by deleting
get_db_rules_by_workflow_id from the try block inside _mrq_files; keep the
random import intact and ensure the try/except still only catches ImportError
for missing dependencies, so the function uses random.sample as before without
importing the unused helper.
- Around line 93-95: The FileHash DTO's file_destination is typed as
tuple[str,str] | None but code assigns a plain string
(WorkflowEndpoint.ConnectionType.MANUALREVIEW) to file_hash.file_destination;
update the code to match the DTO by assigning a tuple (e.g.,
(WorkflowEndpoint.ConnectionType.MANUALREVIEW, "<optional-second>") or a
meaningful second element) wherever file_destination is set, or change the
FileHash type to str | None if the design intends a single string; ensure
consistency by updating all initializations and comparisons that currently use
empty strings "" and references to file_hash.file_destination and
WorkflowEndpoint.ConnectionType.MANUALREVIEW accordingly.
🧹 Nitpick comments (2)
backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py (2)
**50-54**: Use `logger.exception` for better error diagnostics.

When catching exceptions during usage tracking, `logger.exception` automatically includes the stack trace, which aids debugging in production environments.

♻️ Proposed fix

```diff
 except Exception as e:
-    logger.error(
+    logger.exception(
         f"Failed to track API hub usage for execution {workflow_execution_id}: {e}"
     )
     return False
```

**75-77**: Use `logger.exception` for better error diagnostics.

♻️ Proposed fix

```diff
 except Exception as e:
-    logger.error(f"Error extracting API hub headers: {e}")
+    logger.exception(f"Error extracting API hub headers: {e}")
     return None
```
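As a quick standard-library illustration of the difference between the two calls: `logger.error` records only the formatted message, while `logger.exception` (called from inside an `except` block) appends the full traceback.

```python
import io
import logging

# Capture log output in memory so it can be inspected.
stream = io.StringIO()
logging.basicConfig(stream=stream, level=logging.INFO, force=True)
logger = logging.getLogger("demo")

try:
    1 / 0
except Exception as e:
    logger.error(f"error only: {e}")      # message, no traceback
    logger.exception(f"with trace: {e}")  # message plus full traceback

output = stream.getvalue()
```

Running this shows the `logger.exception` record carries the `Traceback (most recent call last):` block that `logger.error` omits, which is exactly what the review is asking for.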
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews > Disable Knowledge Base setting
📒 Files selected for processing (4)
- backend/migrating/v2/unstract_migrations.py
- backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py
- backend/plugins/workflow_manager/workflow_v2/utils.py
- docker/dockerfiles/backend.Dockerfile
🧰 Additional context used
🧬 Code graph analysis (3)
backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py (1)
- backend/workflow_manager/workflow_v2/models/execution.py (1)
  - organization_id (266-274)

backend/migrating/v2/unstract_migrations.py (1)
- backend/migrating/v2/query.py (3)
  - MigrationQuery (1-765)
  - get_public_schema_migrations (9-252)
  - get_organization_migrations (254-765)

backend/plugins/workflow_manager/workflow_v2/utils.py (4)
- backend/workflow_manager/endpoint_v2/dto.py (1)
  - FileHash (11-54)
- workers/shared/clients/manual_review_stub.py (1)
  - get_q_no_list (52-72)
- workers/shared/utils/manual_review_factory.py (9)
  - get_q_no_list (121-123, 313-314, 402-403)
  - add_file_destination_filehash (116-118, 309-310, 398-399)
  - get_hitl_ttl_seconds (150-152, 348-349, 456-457)
- backend/workflow_manager/endpoint_v2/models.py (1)
  - WorkflowEndpoint (17-66)
🪛 Ruff (0.14.10)
backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py
48-48: Consider moving this statement to an else block
(TRY300)
50-50: Do not catch blind exception: Exception
(BLE001)
51-53: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
75-75: Do not catch blind exception: Exception
(BLE001)
76-76: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
104-104: Do not catch blind exception: Exception
(BLE001)
105-105: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backend/plugins/workflow_manager/workflow_v2/utils.py
156-156: Consider moving this statement to an else block
(TRY300)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (8)
docker/dockerfiles/backend.Dockerfile (1)
**76-79**: LGTM! The conditional install pattern properly handles the optional `requirements.txt` for enterprise dependencies. The file existence check prevents build failures when the file is absent in the OSS build.

backend/migrating/v2/unstract_migrations.py (2)

**22-39**: LGTM! The extension pattern correctly loads additional migrations when available while preserving core migration behavior. The fallback to core-only migrations ensures OSS compatibility.

**41-66**: LGTM! The organization migrations extension follows the same pattern and correctly passes all required parameters to the extended migrations function.
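The migrations extension pattern reviewed above can be sketched like this. The class names, the hardcoded core migration list, and the `extra_migrations` helper are illustrative; the real `MigrationQuery` API in `backend/migrating/v2/query.py` differs:

```python
class CoreMigrations:
    """Stand-in for the OSS migration query class."""

    def get_public_schema_migrations(self):
        return ["0001_core_initial", "0002_core_indexes"]


class ExtensibleMigrations(CoreMigrations):
    """Appends enterprise migrations when the plugin package exists."""

    def get_public_schema_migrations(self):
        migrations = super().get_public_schema_migrations()
        try:
            # Hypothetical enterprise module; present only in enterprise builds.
            from pluggable_apps.migrations_ext import extra_migrations  # type: ignore

            migrations += extra_migrations()
        except ImportError:
            pass  # OSS build: core migrations only
        return migrations
```

Because the fallback path returns the unmodified core list, the OSS build behaves exactly as before the refactor.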
backend/plugins/workflow_manager/workflow_v2/utils.py (5)

**48-67**: LGTM! The method correctly retrieves DB rules and delegates to `_mrq_files` when a valid percentage is configured. The ImportError fallback ensures OSS compatibility.

**97-126**: LGTM! The rule engine validation correctly delegates to the plugin when available and returns `False` as a safe fallback for OSS builds.

**128-140**: LGTM! The backward-compatible wrapper maintains the existing API while deprecating it in favor of `validate_rule_engine`.

**142-160**: LGTM! The API rules check correctly queries the DB rules configuration. The naming suggests DB rules may contain API-specific configuration (`rule_string`).

**162-181**: LGTM! The HITL TTL retrieval follows the established plugin loading pattern with a safe `None` fallback for unlimited TTL in OSS.
Removed redundant import of random and exception handling for manual_review_v2. Signed-off-by: Hari John Kuriakose <hari@zipstack.com>
Test Results: Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In @backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py:
- Around line 98-103: The ttl_seconds parameter accepted by the function is
never forwarded to the cache call; update the call to
headers_cache.store_headers(execution_id, headers, ttl_seconds) so the
underlying headers_cache_class implementation receives the TTL (or, if TTL is
intentionally unused, remove ttl_seconds from the function signature and
docstring). Ensure you modify the invocation using headers_cache_class and the
store_headers method to include ttl_seconds (or remove parameter and update docs
accordingly).
In @backend/plugins/workflow_manager/workflow_v2/utils.py:
- Around line 39-40: The current logic can raise ValueError when n == 0 or
num_to_select > n; update the selection logic to first handle n == 0 by
returning an empty set, clamp percentage to the 0–100 range, compute
num_to_select = max(0, min(n, int(n * (percentage / 100)))), and then call
random.sample on range(1, n + 1) only when num_to_select > 0; reference the
variables num_to_select, n, percentage and the random.sample call in the return
expression to locate and modify the code.
- Around line 87-89: The FileHash.file_destination field is currently typed as
tuple[str, str] | None in the DTO but is assigned string values like
WorkflowEndpoint.ConnectionType.MANUALREVIEW elsewhere; update the type
annotation of FileHash.file_destination in the FileHash definition (in
endpoint_v2/dto.py) from tuple[str, str] | None to str | None, and run a quick
grep for FileHash.file_destination uses to ensure no code expects a tuple
(adjust any typed usages/imports accordingly) so that assignments like
WorkflowEndpoint.ConnectionType.MANUALREVIEW are type-safe.
🧹 Nitpick comments (4)
backend/plugins/workflow_manager/workflow_v2/utils.py (1)
**146-154**: Consider restructuring the try/except for clarity. The static analysis tool (TRY300) suggests moving the success return to an `else` block. This is a minor style preference that separates the "happy path" from exception handling.

Optional refactor

```diff
 try:
     from pluggable_apps.manual_review_v2.helper import get_db_rules_by_workflow_id
-
-    db_rule = get_db_rules_by_workflow_id(workflow=workflow)
-    return db_rule is not None and db_rule.rule_string is not None
 except ImportError:
-    pass
-
-    return False
+    return False
+else:
+    db_rule = get_db_rules_by_workflow_id(workflow=workflow)
+    return db_rule is not None and db_rule.rule_string is not None
```

backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py (3)

**51-55**: Use `logger.exception` to preserve the stack trace. When catching exceptions, `logger.exception` automatically includes the traceback, which aids debugging plugin issues.

Proposed fix

```diff
 except Exception as e:
-    logger.error(
+    logger.exception(
         f"Failed to track API hub usage for execution {workflow_execution_id}: {e}"
     )
     return False
```

**74-76**: Use `logger.exception` for better debugging. Same recommendation as above: use `logger.exception` to include the traceback.

Proposed fix

```diff
 except Exception as e:
-    logger.error(f"Error extracting API hub headers: {e}")
+    logger.exception(f"Error extracting API hub headers: {e}")
     return None
```

**101-103**: Use `logger.exception` for better debugging. Same recommendation: use `logger.exception` to preserve the traceback.

Proposed fix

```diff
 except Exception as e:
-    logger.error(f"Error caching API hub headers: {e}")
+    logger.exception(f"Error caching API hub headers: {e}")
     return False
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews > Disable Knowledge Base setting
📒 Files selected for processing (2)
- backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py
- backend/plugins/workflow_manager/workflow_v2/utils.py
🧰 Additional context used
🧬 Code graph analysis (2)
backend/plugins/workflow_manager/workflow_v2/utils.py (4)
- backend/workflow_manager/endpoint_v2/dto.py (1)
  - FileHash (11-54)
- workers/shared/clients/manual_review_stub.py (1)
  - get_q_no_list (52-72)
- workers/shared/utils/manual_review_factory.py (6)
  - get_q_no_list (121-123, 313-314, 402-403)
  - get_hitl_ttl_seconds (150-152, 348-349, 456-457)
- backend/workflow_manager/endpoint_v2/models.py (1)
  - WorkflowEndpoint (17-66)

backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py (1)
- backend/workflow_manager/workflow_v2/models/execution.py (1)
  - organization_id (266-274)
🪛 Ruff (0.14.10)
backend/plugins/workflow_manager/workflow_v2/utils.py
150-150: Consider moving this statement to an else block
(TRY300)
backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py
49-49: Consider moving this statement to an else block
(TRY300)
51-51: Do not catch blind exception: Exception
(BLE001)
52-54: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
74-74: Do not catch blind exception: Exception
(BLE001)
75-75: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
101-101: Do not catch blind exception: Exception
(BLE001)
102-102: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (5)
backend/plugins/workflow_manager/workflow_v2/utils.py (4)

**1-16**: LGTM! Module docstring clearly describes the plugin-based architecture, and imports are appropriate.

**42-61**: LGTM! Clean implementation of optional plugin loading with appropriate fallback to `None`.

**91-120**: LGTM! Clean plugin loading with proper fallback to `False` when the rule engine is unavailable.

**156-175**: LGTM! Consistent plugin loading pattern with appropriate return type.

backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py (1)

**1-11**: LGTM! Clean module structure with plugin-based architecture.
```python
try:
    headers_cache = verticals_usage_plugin["headers_cache_class"]()
    return headers_cache.store_headers(execution_id, headers)
except Exception as e:
    logger.error(f"Error caching API hub headers: {e}")
    return False
```
🧩 Analysis chain
🏁 Script executed:

```shell
cat -n backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py | sed -n '78,103p'
```

Repository: Zipstack/unstract
Length of output: 1124

🏁 Script executed:

```shell
# Find the store_headers method definition to check its signature
rg -A 5 "def store_headers" backend/
```

Repository: Zipstack/unstract
Length of output: 43

🏁 Script executed:

```shell
# Check if ttl_seconds is used anywhere in the file
rg "ttl_seconds" backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py
```

Repository: Zipstack/unstract
Length of output: 160

🏁 Script executed:

```shell
# Search for store_headers method definition without pattern restrictions
rg "def store_headers" --type py -A 3
```

Repository: Zipstack/unstract
Length of output: 43

🏁 Script executed:

```shell
# Find where headers_cache_class is defined or used
rg "headers_cache_class" --type py -B 2 -A 2
```

Repository: Zipstack/unstract
Length of output: 43

🏁 Script executed:

```shell
# Look for the verticals_usage plugin to understand the cache interface
fd -e py | xargs rg "verticals_usage" | head -20
```

Repository: Zipstack/unstract
Length of output: 190

🏁 Script executed:

```shell
# List the repository structure to understand the layout
git ls-files | grep -E "(plugin|verticals)" | head -20
```

Repository: Zipstack/unstract
Length of output: 992

🏁 Script executed:

```shell
# Search for any reference to store_headers in comments or strings
rg "store_headers" --type py -i
```

Repository: Zipstack/unstract
Length of output: 43

🏁 Script executed:

```shell
# Check if there are any type hints or comments about what store_headers expects
rg -B 3 -A 3 "cache\(" backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py
```

Repository: Zipstack/unstract
Length of output: 43

🏁 Script executed:

```shell
# Search for all method definitions in the file to see context
rg -E "^\s+def " backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py -A 1
```

Repository: Zipstack/unstract
Length of output: 106

🏁 Script executed:

```shell
# Check the full file to understand all methods and their parameter handling
cat -n backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py | head -150
```

Repository: Zipstack/unstract
Length of output: 4192

🏁 Script executed:

```shell
# Look for any imports or hints about the cache class
head -50 backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py
```

Repository: Zipstack/unstract
Length of output: 1668
**`ttl_seconds` parameter is unused and should be passed to the underlying cache.**

The `ttl_seconds` parameter is accepted at line 82 and documented in the docstring, but is never passed to `store_headers` at line 100. This should either be passed to the cache implementation or removed from the method signature.
Proposed fix
```diff
-    return headers_cache.store_headers(execution_id, headers)
+    return headers_cache.store_headers(execution_id, headers, ttl_seconds)
```

🧰 Tools
🪛 Ruff (0.14.10)
101-101: Do not catch blind exception: Exception
(BLE001)
102-102: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🤖 Prompt for AI Agents
In @backend/plugins/workflow_manager/workflow_v2/api_hub_usage_utils.py around
lines 98 - 103, The ttl_seconds parameter accepted by the function is never
forwarded to the cache call; update the call to
headers_cache.store_headers(execution_id, headers, ttl_seconds) so the
underlying headers_cache_class implementation receives the TTL (or, if TTL is
intentionally unused, remove ttl_seconds from the function signature and
docstring). Ensure you modify the invocation using headers_cache_class and the
store_headers method to include ttl_seconds (or remove parameter and update docs
accordingly).
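For illustration, a minimal TTL-aware headers cache along the lines the review suggests could look like the sketch below. The class and method names are hypothetical stand-ins, not the actual `headers_cache_class` implementation from the plugin:

```python
import time


class InMemoryHeadersCache:
    """Hypothetical stand-in for the plugin's headers cache."""

    def __init__(self):
        # execution_id -> (headers, expires_at or None for no expiry)
        self._store = {}

    def store_headers(self, execution_id, headers, ttl_seconds=None):
        expires_at = time.monotonic() + ttl_seconds if ttl_seconds else None
        self._store[execution_id] = (headers, expires_at)
        return True

    def get_headers(self, execution_id):
        entry = self._store.get(execution_id)
        if entry is None:
            return None
        headers, expires_at = entry
        if expires_at is not None and time.monotonic() > expires_at:
            del self._store[execution_id]  # lazily evict expired entries
            return None
        return headers
```

The point of the review comment is simply that once `store_headers` accepts a third argument like this, the caller must actually forward `ttl_seconds`; otherwise every entry is cached with no expiry regardless of what the caller requested.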
```python
num_to_select = max(1, int(n * (percentage / 100)))
return set(random.sample(range(1, n + 1), num_to_select))
```
**Potential `ValueError` when `num_to_select > n` or `n == 0`.**

`random.sample` raises `ValueError` if the sample size exceeds the population. This can occur when:

- `n == 0` (empty range)
- `percentage > 100`, or rounding causes `num_to_select > n`

Consider adding bounds checking:

Proposed fix
```diff
-    num_to_select = max(1, int(n * (percentage / 100)))
-    return set(random.sample(range(1, n + 1), num_to_select))
+    if n <= 0:
+        return set()
+    num_to_select = max(1, min(n, int(n * (percentage / 100))))
+    return set(random.sample(range(1, n + 1), num_to_select))
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if n <= 0:
    return set()
num_to_select = max(1, min(n, int(n * (percentage / 100))))
return set(random.sample(range(1, n + 1), num_to_select))
```
🤖 Prompt for AI Agents
In @backend/plugins/workflow_manager/workflow_v2/utils.py around lines 39 - 40,
The current logic can raise ValueError when n == 0 or num_to_select > n; update
the selection logic to first handle n == 0 by returning an empty set, clamp
percentage to the 0–100 range, compute num_to_select = max(0, min(n, int(n *
(percentage / 100)))), and then call random.sample on range(1, n + 1) only when
num_to_select > 0; reference the variables num_to_select, n, percentage and the
random.sample call in the return expression to locate and modify the code.
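A self-contained version of the guarded sampling helper the review proposes could look like this. The function name `select_review_files` is illustrative, not the name used in `utils.py`:

```python
import random


def select_review_files(n: int, percentage: float) -> set[int]:
    """Pick a random subset of 1-based file indices for manual review.

    Guards against n == 0 and out-of-range percentages so that
    random.sample never receives a sample size larger than the population.
    """
    if n <= 0:
        return set()
    percentage = max(0.0, min(100.0, percentage))
    num_to_select = max(1, min(n, int(n * (percentage / 100))))
    return set(random.sample(range(1, n + 1), num_to_select))
```

Note that, mirroring the review's `max(1, ...)` suggestion, this still selects at least one file whenever `n > 0`, even at `percentage = 0`; whether that floor is desirable is a product decision.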
```python
if index in q_file_no_list:
    file_hash.file_destination = WorkflowEndpoint.ConnectionType.MANUALREVIEW
return file_hash
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# First, find the dto.py file and check FileHash.file_destination type
find . -name "dto.py" -type f | head -20
```

Repository: Zipstack/unstract
Length of output: 850

🏁 Script executed:

```shell
#!/bin/bash
# Search for FileHash class definition and file_destination field
rg -n "class FileHash" --type=py -A 30
```

Repository: Zipstack/unstract
Length of output: 9370

🏁 Script executed:

```shell
#!/bin/bash
# Search for WorkflowEndpoint.ConnectionType definition
rg -n "class ConnectionType\|ConnectionType\s*=" --type=py -B 2 -A 10
```

Repository: Zipstack/unstract
Length of output: 43

🏁 Script executed:

```shell
#!/bin/bash
# Check how file_destination is used throughout the codebase
rg -n "file_destination\s*=" --type=py -B 2 -A 2
```

Repository: Zipstack/unstract
Length of output: 5435

🏁 Script executed:

```shell
#!/bin/bash
# Search for WorkflowEndpoint class and ConnectionType
rg -n "class WorkflowEndpoint\|WorkflowEndpoint.*ConnectionType" --type=py -A 15
```

Repository: Zipstack/unstract
Length of output: 43

🏁 Script executed:

```shell
#!/bin/bash
# Search for FileDestinationType definition
rg -n "class FileDestinationType\|FileDestinationType\s*=" --type=py -A 10
```

Repository: Zipstack/unstract
Length of output: 43

🏁 Script executed:

```shell
#!/bin/bash
# Read the actual utils.py file around lines 63-89
cat -n backend/workflow_manager/workflow_v2/utils.py | sed -n '63,89p'
```

Repository: Zipstack/unstract
Length of output: 138

🏁 Script executed:

```shell
#!/bin/bash
# Find the correct path to utils.py
find . -path "*/workflow_v2/utils.py" -type f
```

Repository: Zipstack/unstract
Length of output: 116

🏁 Script executed:

```shell
#!/bin/bash
# Search for add_file_destination_filehash function to locate the file
rg -n "add_file_destination_filehash" --type=py -B 2 -A 25
```

Repository: Zipstack/unstract
Length of output: 12110

🏁 Script executed:

```shell
#!/bin/bash
# Also search for WorkflowEndpoint and ConnectionType more broadly
rg -n "WorkflowEndpoint" --type=py | head -20
```

Repository: Zipstack/unstract
Length of output: 2104

🏁 Script executed:

```shell
#!/bin/bash
# Read the actual utils.py file at the correct path
cat -n ./backend/plugins/workflow_manager/workflow_v2/utils.py | sed -n '60,95p'
```

Repository: Zipstack/unstract
Length of output: 1431

🏁 Script executed:

```shell
#!/bin/bash
# Search for WorkflowEndpoint.ConnectionType.MANUALREVIEW
rg -n "MANUALREVIEW" --type=py -B 2 -A 2
```

Repository: Zipstack/unstract
Length of output: 9207

🏁 Script executed:

```shell
#!/bin/bash
# Search for WorkflowEndpoint class definition
rg -n "class WorkflowEndpoint\|enum.*ConnectionType" --type=py -A 10
```

Repository: Zipstack/unstract
Length of output: 43

🏁 Script executed:

```shell
#!/bin/bash
# Get the WorkflowEndpoint model definition to see ConnectionType enum
cat -n backend/workflow_manager/endpoint_v2/models.py | head -50
```

Repository: Zipstack/unstract
Length of output: 2166

🏁 Script executed:

```shell
#!/bin/bash
# Check the FileHash DTO definition more carefully
cat -n backend/workflow_manager/endpoint_v2/dto.py | head -45
```

Repository: Zipstack/unstract
Length of output: 1772
**Type mismatch: assigning `str` to a `tuple[str, str] | None` field.**

`FileHash.file_destination` is typed as `tuple[str, str] | None` in backend/workflow_manager/endpoint_v2/dto.py (lines 20-22), but line 88 assigns `WorkflowEndpoint.ConnectionType.MANUALREVIEW`, which is the string value "MANUALREVIEW". This type mismatch will cause type checker failures.

The codebase consistently uses `file_destination` as a string (e.g., empty strings in views, string comparisons in processor.py), so the type annotation should be `str | None` rather than `tuple[str, str] | None`. Update the type annotation to match actual usage.
🤖 Prompt for AI Agents
In @backend/plugins/workflow_manager/workflow_v2/utils.py around lines 87 - 89,
The FileHash.file_destination field is currently typed as tuple[str, str] | None
in the DTO but is assigned string values like
WorkflowEndpoint.ConnectionType.MANUALREVIEW elsewhere; update the type
annotation of FileHash.file_destination in the FileHash definition (in
endpoint_v2/dto.py) from tuple[str, str] | None to str | None, and run a quick
grep for FileHash.file_destination uses to ensure no code expects a tuple
(adjust any typed usages/imports accordingly) so that assignments like
WorkflowEndpoint.ConnectionType.MANUALREVIEW are type-safe.
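A reduced sketch of the corrected DTO typing is below. The field set and the `ConnectionType` enum here are illustrative stand-ins; the real `FileHash` in `endpoint_v2/dto.py` and `WorkflowEndpoint.ConnectionType` in `endpoint_v2/models.py` carry more members:

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class ConnectionType(str, Enum):
    """Hypothetical stand-in for WorkflowEndpoint.ConnectionType."""

    MANUALREVIEW = "MANUALREVIEW"


@dataclass
class FileHash:
    file_name: str
    file_hash: str
    # str | None matches how the codebase actually uses this field;
    # the previous tuple[str, str] | None annotation did not.
    file_destination: str | None = None


fh = FileHash(file_name="doc.pdf", file_hash="abc123")
fh.file_destination = ConnectionType.MANUALREVIEW
```

Because `ConnectionType` subclasses `str`, the assignment satisfies the `str | None` annotation and remains compatible with the existing string comparisons elsewhere in the codebase.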