An unofficial OpenAI-compatible wrapper in front of chatjimmy.ai (by Taalas), serving the llama-3.1:8B model at a claimed ~17,000 tokens per second per user.
Topics: api, chat, wrapper, proxy, inference, speed, tokens, openai, endpoint, completions, v1, fastify, openai-api, llm, hc1, chatjimmy, taalas
Updated Feb 23, 2026 - TypeScript
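An "OpenAI-compatible" wrapper of this kind typically accepts requests in the shape of OpenAI's `/v1/chat/completions` body and translates them for the upstream service. The sketch below is a hypothetical illustration of that translation step, not the repo's actual implementation; the type names, the `toUpstreamPrompt` helper, and the flattening strategy are all assumptions.

```typescript
// Hypothetical sketch: the message types mirror OpenAI's
// chat-completions request body; the flattening below is one common
// strategy when an upstream API accepts plain text rather than a
// structured message list.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
}

// Flatten the OpenAI-style message list into a single prompt string,
// one "role: content" line per message.
function toUpstreamPrompt(req: ChatCompletionRequest): string {
  return req.messages.map((m) => `${m.role}: ${m.content}`).join("\n");
}

// Example request body (model name taken from the repo description).
const body: ChatCompletionRequest = {
  model: "llama-3.1:8b",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Hello!" },
  ],
};

console.log(toUpstreamPrompt(body));
```

In a real proxy this translation would sit inside a Fastify route handler (the repo's topics mention fastify) that forwards the flattened prompt upstream and reshapes the response into OpenAI's completion format.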