@fbalicchia
Contributor

Summary

  • Implemented prompt caching for Anthropic Claude models on AWS Bedrock to reduce costs.
  • Introduced a cache point placement strategy that stays within AWS Bedrock’s limit of four cache points.
  • Added the BEDROCK_ENABLE_CACHING configuration parameter.
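
For readers who want a concrete picture of the placement idea, here is a minimal sketch in Python. It is not the code from this PR: the function names `place_cache_points` and `count_cache_points` are hypothetical, and the strategy (one cache point after the system prompt, the rest of the budget on the most recent messages) is only an assumption about what a placement within the four-point limit could look like. The `{"cachePoint": {"type": "default"}}` content block follows the shape used by the Bedrock Converse API.

```python
import copy

# Content block shape used by the Bedrock Converse API for cache checkpoints.
CACHE_POINT = {"cachePoint": {"type": "default"}}
MAX_CACHE_POINTS = 4  # limit stated in the PR summary


def place_cache_points(system, messages, max_points=MAX_CACHE_POINTS):
    """Return copies of (system, messages) with at most max_points cache points.

    Hypothetical strategy: one point after the system prompt, then the
    remaining budget goes to the most recent messages.
    """
    system = copy.deepcopy(system)
    messages = copy.deepcopy(messages)
    budget = max_points

    if system and budget > 0:
        system.append(copy.deepcopy(CACHE_POINT))
        budget -= 1

    # Walk backwards so the newest turns receive cache points first.
    for message in reversed(messages):
        if budget == 0:
            break
        if message["content"]:
            message["content"].append(copy.deepcopy(CACHE_POINT))
            budget -= 1

    return system, messages


def count_cache_points(system, messages):
    """Count cache point blocks across the system prompt and all messages."""
    blocks = list(system)
    for message in messages:
        blocks.extend(message["content"])
    return sum(1 for block in blocks if "cachePoint" in block)
```

Because the budget is decremented on every placement, the total can never exceed `max_points`, however long the conversation grows. The inputs are deep-copied, so the caller's request is left untouched.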

Testing

  • Verify caching is enabled for Claude models by default
  • Test that cache points are correctly placed in system prompts
  • Test that cache points are correctly placed in user/assistant messages
  • Verify cache point limit (4 max) is respected
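
The first check above could be exercised against something like the following sketch. It is an illustration only: the helper name `caching_enabled`, the set of values treated as "off", and the choice to disable caching for non-Claude models are all assumptions, not behavior confirmed by this PR. Only the BEDROCK_ENABLE_CACHING name and the "enabled by default for Claude models" expectation come from the description.

```python
import os

# Assumed spellings that switch the feature off; the PR does not list them.
FALSE_VALUES = {"0", "false", "no", "off"}


def caching_enabled(model_id, env=None):
    """Decide whether prompt caching applies to a Bedrock model ID.

    Hypothetical rule: Claude models cache by default, and
    BEDROCK_ENABLE_CACHING can turn caching off explicitly.
    """
    env = os.environ if env is None else env
    if "claude" not in model_id.lower():
        return False  # assumption: only Claude models are in scope
    raw = env.get("BEDROCK_ENABLE_CACHING")
    if raw is None:
        return True  # default: enabled for Claude models
    return raw.strip().lower() not in FALSE_VALUES
```

Passing the environment as a plain mapping keeps the check easy to test without mutating `os.environ`.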

@fbalicchia fbalicchia requested a review from a team as a code owner December 17, 2025 11:34
@fbalicchia fbalicchia changed the title from "add prompt caching" to "add bedrock prompt caching" Dec 17, 2025
@fbalicchia fbalicchia requested a review from DOsinga December 18, 2025 15:31
Signed-off-by: fbalicchia <fbalicchia@cuebiq.com>
Signed-off-by: fbalicchia <fbalicchia@cuebiq.com>
Signed-off-by: fbalicchia <fbalicchia@cuebiq.com>
Signed-off-by: fbalicchia <fbalicchia@cuebiq.com>
@fbalicchia fbalicchia force-pushed the add-prompt-bedrock-cache branch from 381bf24 to fcac5be January 7, 2026 08:57
@fbalicchia
Contributor Author

Hi @DOsinga, I’ve pushed some updates to this PR. When you have a moment, could you please take another look? Thanks!

Abhijay007 and others added 20 commits January 13, 2026 09:52
Signed-off-by: Abhijay007 <Abhijay007j@gmail.com>
…ositive rates (block#6390)

Signed-off-by: Dorien Koelemeijer <dkoelemeijer@squareup.com>
Co-authored-by: Douwe Osinga <douwe@squareup.com>
thanks from shanto12 (PR#6349 which has original code)
dianed-square and others added 26 commits January 13, 2026 09:52
Signed-off-by: Yelsin Sepulveda <yelsinsepulveda@gmail.com>
Signed-off-by: Yelsin Sepulveda <yelsinsepulveda@gmail.com>
Co-authored-by: Douwe Osinga <douwe@squareup.com>
Co-authored-by: Douwe Osinga <douwe@squareup.com>
Signed-off-by: Rodolfo Olivieri <rodolfo.olivieri3@gmail.com>
Signed-off-by: Fatih Kadir Akın <fatihkadirakin@gmail.com>
Signed-off-by: Rodolfo Olivieri <rodolfo.olivieri3@gmail.com>
Signed-off-by: Sai Karthik <kskarthik@disroot.org>
Co-authored-by: Douwe Osinga <douwe@squareup.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Douwe Osinga <douwe@squareup.com>
Signed-off-by: rabi <ramishra@redhat.com>
… in /ui/desktop (block#6375)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…lock#6408)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: The-Best-Codes <bestcodes.official@gmail.com>
Signed-off-by: Adrian Cole <adrian@tetrate.io>
@fbalicchia
Contributor Author

After making a bit of a mess of the branch, I’m closing this PR in favor of #6463.

@fbalicchia fbalicchia closed this Jan 13, 2026