flo_ai aws bedrock integration #228
base: develop
**New file:** `flo_ai/llm/aws_bedrock_llm.py` (+265 lines; excerpt shown)

```python
import json
import re
from typing import Dict, Any, List, AsyncIterator, Optional
import boto3
import asyncio
from .base_llm import BaseLLM
from flo_ai.models.chat_message import ImageMessageContent
from flo_ai.tool.base_tool import Tool
from flo_ai.telemetry.instrumentation import (
    trace_llm_call,
    trace_llm_stream,
    llm_metrics,
    add_span_attributes,
)
from flo_ai.telemetry import get_tracer
from opentelemetry import trace


class AWSBedrock(BaseLLM):
    def __init__(
        self,
        model: str = 'openai.gpt-oss-20b-1:0',
        temperature: float = 0.7,
        **kwargs,
    ):
        super().__init__(model=model, temperature=temperature, **kwargs)
        self.boto_client = boto3.client('bedrock-runtime')
        self.model = model
        self.kwargs = kwargs

    @staticmethod
    def _strip_reasoning(text: str) -> str:
        return re.sub(r'<reasoning>.*?</reasoning>', '', text, flags=re.DOTALL).strip()

    def _convert_messages(
        self, messages: list[dict], output_schema: dict = None
    ) -> list[dict]:
        result = []

        if output_schema:
            result.append(
                {
                    'role': 'system',
                    'content': f'Provide output in the following JSON schema:\n{json.dumps(output_schema, indent=2)}',
                }
            )

        for msg in messages:
            if msg['role'] == 'function':
                result.append(
                    {
                        'role': 'tool',
                        'tool_call_id': msg.get('tool_use_id', 'unknown'),
                        'content': msg['content'],
                        'name': msg.get('name', ''),
                    }
                )
            else:
                result.append(msg)

        return result

    @trace_llm_call(provider='bedrock')
    async def generate(
        self,
        messages: list[dict],
        functions: Optional[List[Dict[str, Any]]] = None,
        output_schema: Optional[Dict[str, Any]] = None,
        **kwargs,
    ) -> Any:
        converted = self._convert_messages(messages, output_schema)

        request_body: Dict[str, Any] = {
            'model': self.model,
            'messages': converted,
            'temperature': self.temperature,
        }
        if 'max_tokens' in self.kwargs:
            request_body['max_completion_tokens'] = self.kwargs['max_tokens']
        if functions:
            request_body['tools'] = functions

        response = await asyncio.to_thread(
            self.boto_client.invoke_model,
            modelId=self.model,
            body=json.dumps(request_body),
        )
        response_body = json.loads(response['body'].read().decode('utf-8'))
```
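As a standalone check, the reasoning-stripping regex above behaves as follows (a sketch mirroring `_strip_reasoning` outside the class):

```python
import re

def strip_reasoning(text: str) -> str:
    # Non-greedy match so each <reasoning>...</reasoning> block is removed
    # individually; DOTALL lets a block span newlines.
    return re.sub(r'<reasoning>.*?</reasoning>', '', text, flags=re.DOTALL).strip()

print(strip_reasoning('<reasoning>step 1\nstep 2</reasoning>Hello'))  # prints "Hello"
```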
**Review comment (lines 63–88):**

This implementation is coupled to OpenAI-compatible Bedrock models, not general Bedrock support. The request body sent to `invoke_model` uses the OpenAI chat-completions schema (`model`, `messages`, `temperature`, `max_completion_tokens`, `tools`), and the response parsing expects `choices[0].message`. Native Bedrock models take provider-specific request bodies: Anthropic Claude expects the Messages format with an `anthropic_version` field, and Amazon Titan expects `inputText` plus a `textGenerationConfig` block. Users passing model IDs such as `anthropic.claude-*` or `amazon.titan-*` will therefore hit request validation errors. Bedrock's Converse API exists precisely to provide a uniform request/response shape across providers.

Recommended fixes: either document clearly that this class supports only OpenAI-compatible Bedrock models, or migrate to the Converse API (`converse` / `converse_stream`) for genuine cross-model support.

🧰 Tools: 🪛 Ruff (0.15.2) warning, line 69: unused method argument (ARG002).

**Member:** This looks like a valid issue. Maybe you can add a comment on whether it's true.

**Follow-up (verified against `flo_ai/llm/aws_bedrock_llm.py` in rootflo/wavefront):** Confirmed. Both `generate` and `stream` build OpenAI-style bodies and parse OpenAI-style responses (`choices`, `delta`, `tool_calls`, SSE `data:` lines), none of which native Bedrock models produce. Recommendation: either add a clear docstring/comment stating "This class only supports OpenAI-compatible Bedrock models (openai.*)" or rename the class to make that scope explicit.
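To illustrate the reviewer's suggestion, a hypothetical adapter toward the Converse API message shape could look like this. `to_converse_messages` is not part of the PR; it is only a sketch of the direction:

```python
from typing import Any, Dict, List

def to_converse_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Map OpenAI-style {'role', 'content': str} messages to the Converse
    shape: {'role', 'content': [{'text': ...}]}. System prompts are skipped
    here because converse() takes them via its separate `system` parameter."""
    converted = []
    for msg in messages:
        if msg['role'] in ('user', 'assistant'):
            converted.append(
                {'role': msg['role'], 'content': [{'text': msg['content']}]}
            )
    return converted

# The result would be passed as the `messages` argument of
# bedrock_runtime.converse(modelId=..., messages=..., system=..., inferenceConfig=...)
print(to_converse_messages([{'role': 'user', 'content': 'Hello'}]))
```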
```python
        usage = response_body.get('usage', {})
        if usage:
            llm_metrics.record_tokens(
                total_tokens=usage.get('total_tokens', 0),
                prompt_tokens=usage.get('prompt_tokens', 0),
                completion_tokens=usage.get('completion_tokens', 0),
                model=self.model,
                provider='bedrock',
            )
            tracer = get_tracer()
            if tracer:
                add_span_attributes(
                    trace.get_current_span(),
                    {
                        'llm.tokens.prompt': usage.get('prompt_tokens', 0),
                        'llm.tokens.completion': usage.get('completion_tokens', 0),
                        'llm.tokens.total': usage.get('total_tokens', 0),
                    },
                )

        choices = response_body.get('choices', [])
        if not choices:
            return {'content': '', 'raw_message': response_body}

        message = choices[0].get('message', {})
        if 'content' in message and message['content']:
            message['content'] = self._strip_reasoning(message['content'])
        text_content = message.get('content', '') or ''
        tool_call = None

        tool_calls = message.get('tool_calls', [])
        if tool_calls:
            tc = tool_calls[0]
            tool_call = {
                'name': tc['function']['name'],
                'arguments': tc['function']['arguments'],
                'id': tc['id'],
            }

        if tool_call:
            return {
                'content': text_content,
                'function_call': tool_call,
                'raw_message': message,
            }
        return {'content': text_content, 'raw_message': message}

    @trace_llm_stream(provider='bedrock')
    async def stream(
        self,
        messages: List[Dict[str, Any]],
        functions: Optional[List[Dict[str, Any]]] = None,
        **kwargs: Any,
    ) -> AsyncIterator[Dict[str, Any]]:
        converted = self._convert_messages(messages)

        request_body: Dict[str, Any] = {
            'model': self.model,
            'messages': converted,
            'temperature': self.temperature,
            'stream': True,
        }
        if 'max_tokens' in self.kwargs:
            request_body['max_completion_tokens'] = self.kwargs['max_tokens']
        if functions:
            request_body['tools'] = functions

        response = await asyncio.to_thread(
            self.boto_client.invoke_model_with_response_stream,
            modelId=self.model,
            body=json.dumps(request_body),
        )

        buffer = ''

        for event in response['body']:
            chunk_bytes = event.get('chunk', {}).get('bytes', b'')
            if not chunk_bytes:
                continue
            text = chunk_bytes.decode('utf-8').strip()
            # Try direct JSON first (some Bedrock models skip SSE envelope)
            try:
                data = json.loads(text)
                content = data.get('choices', [{}])[0].get('delta', {}).get('content')
                if content:
                    buffer += content
                    continue
            except json.JSONDecodeError:
                pass
```

**Member:** Please address this; we shouldn't `pass` without a log.
```python
            # Fall back to SSE format: "data: {...}"
            for line in text.split('\n'):
                line = line.strip()
                if line.startswith('data: ') and line != 'data: [DONE]':
                    try:
                        data = json.loads(line[6:])
                        content = (
                            data.get('choices', [{}])[0].get('delta', {}).get('content')
                        )
                        if content:
                            buffer += content
                    except json.JSONDecodeError:
                        pass
```

**Member:** Same here.
|
|
||
| clean = self._strip_reasoning(buffer) | ||
| if clean: | ||
| yield {'content': clean} | ||
|
**Review comment (lines 163–195):**

The method collects the entire response into `buffer` and yields a single chunk at the end, which defeats the purpose of streaming: callers see nothing until the whole response has arrived. Additionally, the `for event in response['body']:` loop performs blocking network iteration directly on the event loop.

🐛 Proposed fix: yield incremental chunks and offload the blocking iteration.

```python
        queue: asyncio.Queue = asyncio.Queue()

        def _iter_events():
            for event in response['body']:
                chunk_bytes = event.get('chunk', {}).get('bytes', b'')
                if chunk_bytes:
                    queue.put_nowait(chunk_bytes)
            queue.put_nowait(None)  # sentinel

        loop = asyncio.get_event_loop()
        loop.run_in_executor(None, _iter_events)

        while True:
            chunk_bytes = await queue.get()
            if chunk_bytes is None:
                break
            text = chunk_bytes.decode('utf-8').strip()
            content = None
            try:
                data = json.loads(text)
                content = data.get('choices', [{}])[0].get('delta', {}).get('content')
            except json.JSONDecodeError:
                for line in text.split('\n'):
                    line = line.strip()
                    if line.startswith('data: ') and line != 'data: [DONE]':
                        try:
                            data = json.loads(line[6:])
                            content = data.get('choices', [{}])[0].get('delta', {}).get('content')
                        except json.JSONDecodeError:
                            logger.debug('Skipping malformed SSE line: %s', line)
            if content:
                clean = self._strip_reasoning(content)
                if clean:
                    yield {'content': clean}
```

Note: this sketch uses a queue plus an executor to avoid blocking the event loop. The exact approach may need refinement depending on the framework's async patterns, but the key point stands: yield each chunk as it arrives.

**Member:** This too.
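The reviewer's caveat that the sketch "may need refinement" is worth heeding: calling `asyncio.Queue.put_nowait` from a worker thread is not thread-safe. Below is a minimal runnable pattern that bridges a blocking iterator into an async generator via `call_soon_threadsafe`. It is independent of Bedrock, and the names are illustrative only:

```python
import asyncio
import threading

async def consume_stream(blocking_iter_factory):
    """Run a blocking iterator in a thread and yield its items asynchronously."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def _producer():
        for item in blocking_iter_factory():
            # Schedule the put on the event-loop thread; put_nowait alone is
            # not safe to call from another thread.
            loop.call_soon_threadsafe(queue.put_nowait, item)
        loop.call_soon_threadsafe(queue.put_nowait, None)  # sentinel

    threading.Thread(target=_producer, daemon=True).start()
    while True:
        item = await queue.get()
        if item is None:
            break
        yield item

async def main():
    def blocking_source():  # stands in for iterating response['body']
        yield from ['a', 'b', 'c']
    return [chunk async for chunk in consume_stream(blocking_source)]

print(asyncio.run(main()))  # prints "['a', 'b', 'c']"
```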
```python
    def get_message_content(self, response: Dict[str, Any]) -> str:
        content = (
            response.get('content', '') if isinstance(response, dict) else str(response)
        )
        return self._strip_reasoning(content)

    def format_tool_for_llm(self, tool: 'Tool') -> Dict[str, Any]:
        return {
            'type': 'function',
            'function': {
                'name': tool.name,
                'description': tool.description,
                'parameters': {
                    'type': 'object',
                    'properties': {
                        name: {
                            'type': info.get('type', 'string'),
                            'description': info.get('description', ''),
                            **(
                                {'items': info['items']}
                                if info.get('type') == 'array' and 'items' in info
                                else {}
                            ),
                        }
                        for name, info in tool.parameters.items()
                    },
                    'required': [
                        name
                        for name, info in tool.parameters.items()
                        if info.get('required', True)
                    ],
                },
            },
        }

    def format_tools_for_llm(self, tools: List['Tool']) -> List[Dict[str, Any]]:
        return [self.format_tool_for_llm(tool) for tool in tools]

    def format_image_in_message(self, image: ImageMessageContent) -> dict:
        if image.base64:
            return {
                'type': 'image_url',
                'image_url': {
                    'url': f'data:{image.mime_type or "image/jpeg"};base64,{image.base64}'
                },
            }
        raise NotImplementedError(
            'AWS Bedrock LLM requires image base64 data to format image content.'
        )

    def get_assistant_message_for_tool_call(
        self, response: Dict[str, Any]
    ) -> Optional[Any]:
        if isinstance(response, dict) and 'raw_message' in response:
            return response['raw_message']
        return None

    def get_tool_use_id(self, function_call: Dict[str, Any]) -> Optional[str]:
        return function_call.get('id')

    def format_function_result_message(
        self, function_name: str, content: str, tool_use_id: Optional[str] = None
    ) -> Dict[str, Any]:
        return {
            'role': 'tool',
            'tool_call_id': tool_use_id or 'unknown',
            'content': content,
            'name': function_name,
        }
```
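The schema emitted by `format_tool_for_llm` follows the OpenAI function-calling shape. A standalone sketch with a plain parameters dict standing in for the `Tool` class:

```python
from typing import Any, Dict

def build_tool_schema(name: str, description: str,
                      parameters: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    return {
        'type': 'function',
        'function': {
            'name': name,
            'description': description,
            'parameters': {
                'type': 'object',
                'properties': {
                    p: {
                        'type': info.get('type', 'string'),
                        'description': info.get('description', ''),
                    }
                    for p, info in parameters.items()
                },
                # Parameters default to required unless marked otherwise,
                # matching the PR's format_tool_for_llm behavior.
                'required': [
                    p for p, info in parameters.items()
                    if info.get('required', True)
                ],
            },
        },
    }

schema = build_tool_schema(
    'get_weather', 'Look up the weather',
    {'city': {'type': 'string'}, 'units': {'type': 'string', 'required': False}},
)
print(schema['function']['parameters']['required'])  # prints "['city']"
```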
**Review comment:** Add `boto3` to production dependencies and fix the suspicious default model ID.

`boto3` is missing from the production `dependencies` list in `pyproject.toml`; it currently appears only in the `[dependency-groups]` dev section. Users installing this package will hit an `ImportError` at runtime when using `AWSBedrock`. Add `"boto3>=1.36.1,<2"` to the main `dependencies` list.

The default model `'openai.gpt-oss-20b-1:0'` does not match AWS Bedrock model ID patterns (e.g., `anthropic.claude-3-sonnet-20240229-v1:0`, `amazon.titan-text-express-v1`). This appears to be a placeholder or incorrect value; provide a valid Bedrock model ID or document why this unusual format is intentional.

Minor: `self.model` and `self.kwargs` on lines 28–29 are redundantly assigned; `BaseLLM.__init__()` already sets these (harmless but unnecessary).
**Member:** On point 1: yes, it needs to be added; the current code only works because some of our libraries depend on boto3 internally. Check point 3 as well.
**Follow-up:** @vizsatiz, confirmed on both points.

1. `boto3` dependency: it should be explicitly added to the production `dependencies` list in `pyproject.toml`, even though it currently works through transitive dependencies. Add `"boto3>=1.36.1,<2"`.
2. Redundant assignments on lines 28–29: confirmed. `BaseLLM.__init__()` already sets both `self.model` and `self.kwargs` (lines 17 and 20 in base_llm.py), so these lines can be safely removed.
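Assuming a standard PEP 621 layout, the dependency fix discussed above would land in `pyproject.toml` roughly as follows (the surrounding entries are placeholders):

```toml
[project]
dependencies = [
    # ... existing runtime dependencies ...
    "boto3>=1.36.1,<2",
]
```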