Feat: persist AQC reasoning! #37

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

amindadgar merged 10 commits into main from feat/34-aqc-reasoning

Jun 18, 2025

.env.example

-Original file line number
+Diff line change
@@ Expand Up / @@ -8,4 +8,9 @@ OPENAI_API_KEY= @@
     REDIS_HOST=
     REDIS_PORT=
-    REDIS_PASSWORD=
+    REDIS_PASSWORD=
+    MONGODB_HOST=
+    MONGODB_PORT=
+    MONGODB_USER=
+    MONGODB_PASS=

README.md

-Original file line number
+Diff line change
@@ -1 +1,164 @@
-    # agents-workflow
+    # Agents Workflow with MongoDB Persistence
+    This project implements a CrewAI-based workflow system with comprehensive MongoDB persistence for tracking every step of the workflow execution.
+    ## Features
+    - **MongoDB Persistence**: Every step of the workflow is persisted to MongoDB for audit trails and debugging
+    - **Workflow Tracking**: Complete visibility into the execution flow with timestamps and step data
+    - **Error Handling**: Comprehensive error tracking and recovery mechanisms
+    - **Chat History**: Redis-based chat history management
+    - **RAG Integration**: Retrieval-Augmented Generation pipeline for data source queries
+    ## Architecture
+    ### Components
+. **MongoPersistence**: Handles all MongoDB operations for workflow state tracking
+. **AgenticHivemindFlow**: CrewAI flow that orchestrates the agent interactions
+. **run_hivemind_agent_activity**: Temporal activity that manages the workflow execution
+. **QueryDataSources**: Handles RAG queries with workflow ID tracking
+    ### Workflow Steps Tracked
+    The system tracks the following steps in MongoDB:
+. **initialization**: Initial workflow setup with parameters
+. **chat_history_retrieval**: Redis chat history retrieval (if applicable)
+. **no_chat_history**: When no chat history is available
+. **flow_initialization**: AgenticHivemindFlow setup
+. **flow_execution_start**: Beginning of CrewAI flow execution
+. **local_model_classification**: Local transformer model classification result
+. **question_classification**: Language model question classification with reasoning
+. **rag_classification**: RAG question classification with score and reasoning
+. **history_query_classification**: History vs RAG query classification (if applicable)
+. **flow_execution_complete**: Completion of CrewAI flow
+. **answer_processing**: Processing of the final answer
+. **error_handling**: Any error handling steps
+. **memory_update**: Redis memory updates (if applicable)
+. **error_occurred**: Any errors during execution
+    ## Environment Variables
+    Use the `.env.example` to prepare your `.env` file.
+    ## Classification Data Persistence
+    The system now persists detailed classification reasoning and results for better audit trails and debugging:
+    ### Local Model Classification
+    - **Step Name**: `local_model_classification`
+    - **Data**: Result from local transformer model
+    - **Model**: `local_transformer`
+    ### Question Classification
+    - **Step Name**: `question_classification`
+    - **Data**:
+      - `result`: Boolean indicating if the message is a question
+      - `reasoning`: Detailed explanation for the classification
+      - `model`: `language_model`
+      - `query`: Original user query
+    ### RAG Classification
+    - **Step Name**: `rag_classification`
+    - **Data**:
+      - `result`: Boolean indicating if RAG is needed
+      - `score`: Sensitivity score (0-1)
+      - `reasoning`: Detailed explanation for the score
+      - `model`: `language_model`
+      - `query`: Original user query
+    ### History Query Classification
+    - **Step Name**: `history_query_classification`
+    - **Data**:
+      - `result`: Boolean indicating if it's a history query
+      - `model`: `openai_gpt4`
+      - `query`: Original user query
+      - `hasChatHistory`: Boolean indicating if chat history was available
+    ## MongoDB Schema
+    The workflow states are stored in the `internal_messages` collection with the following structure:
+    ```json
+    {
+      "_id": "ObjectId",
+      "communityId": "string",
+      "route": {
+        "source": "string",
+        "destination": {
+          "queue": "string",
+          "event": "string"
+        }
+      },
+      "question": {
+        "message": "string",
+        "filters": "object (optional)"
+      },
+      "response": {
+        "message": "string"
+      },
+      "metadata": "object",
+      "createdAt": "datetime",
+      "updatedAt": "datetime",
+      "steps": [
+        {
+          "stepName": "string",
+          "timestamp": "datetime",
+          "data": "object"
+        }
+      ],
+      "currentStep": "string",
+      "status": "string",
+      "chatId": "string (optional)",
+      "enableAnswerSkipping": "boolean"
+    }
+    ```
+    ## Usage
+    ### Running the Worker
+    ```bash
+    python worker.py
+    ```
+    ### Querying Workflow States
+    You can query the MongoDB collection to inspect workflow execution:
+    ```python
+    from tasks.mongo_persistence import MongoPersistence
+    persistence = MongoPersistence()
+    workflow_state = persistence.get_workflow_state("workflow_id_here")
+    print(workflow_state)
+    ```
+    ## Testing
+    Run the unit tests:
+    ```bash
+    python -m pytest tests/unit/test_mongo_persistence.py
+    ```
+    ## Dependencies
+    - `pymongo==4.8.0`: MongoDB driver
+    - `redis==5.2.0`: Redis client
+    - `crewai==0.105.0`: AI agent framework
+    - `temporalio`: Temporal workflow engine
+    - `openai==1.66.3`: OpenAI API client
+    ## Workflow ID Tracking
+    The workflow ID is passed through the entire execution chain:
+. Created in `run_hivemind_agent_activity`
+. Passed to `AgenticHivemindFlow`
+. Passed to `RAGPipelineTool`
+. Passed to `QueryDataSources`
+. Included in `HivemindQueryPayload` for the `HivemindWorkflow`
+    This ensures complete traceability from the initial query to the final response.

requirements.txt

-Original file line number
+Diff line change
@@ Expand Up / @@ -2,7 +2,8 @@ python-dotenv>=1.0.0, <2.0.0 @@
     redis==5.2.0
     pydantic==2.9.2
     crewai==0.105.0
-    tc-temporal-backend==1.1.2
+    tc-temporal-backend==1.1.3
     transformers[torch]==4.49.0
     nest-asyncio==1.6.0
     openai==1.66.3
+    tc-hivemind-backend==1.4.3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat: persist AQC reasoning! #37

Uh oh!

Diff view

Diff view

There are no files selected for viewing

Uh oh!

Uh oh!