Skip to content
/ llmops Public

πŸš€ The Ultimate Curated List of LLMOps Tools, Frameworks, and Resources - A comprehensive collection of the best tools for Large Language Model Operations

License

Notifications You must be signed in to change notification settings

pmady/llmops

Awesome LLMOps Awesome

GitHub stars GitHub forks GitHub issues License Contributors

πŸš€ The Ultimate Curated List of LLMOps Tools, Frameworks, and Resources

A comprehensive collection of the best tools, frameworks, models, and resources for Large Language Model Operations (LLMOps)


πŸ“‹ Table of Contents


What's New

πŸ†• Recently Added (January 2026)

Infrastructure & Deployment:

  • Skypilot - Run LLMs on any cloud with one command
  • Modal - Serverless platform for AI/ML workloads

Evaluation & Testing:

  • Ragas - Evaluation framework for RAG pipelines
  • PromptFoo - Test and evaluate LLM outputs

Agent Frameworks:

  • Phidata - Build AI assistants with memory and knowledge
  • Composio - Integration platform for AI agents

Monitoring & Observability:

πŸ“ˆ Trending This Month

  • vLLM continues to dominate high-throughput inference
  • LangGraph gaining traction for stateful agent workflows
  • Ollama becoming the go-to for local LLM deployment
  • DeepSeek models showing impressive cost-performance ratios

What is LLMOps?

LLMOps (Large Language Model Operations) is a set of practices, tools, and workflows designed to deploy, monitor, and maintain large language models in production environments. It encompasses the entire lifecycle of LLM applications, from development and training to deployment, monitoring, and continuous improvement.

Key Components of LLMOps:

  • Model Development: Training, fine-tuning, and optimizing LLMs
  • Deployment: Serving models efficiently at scale
  • Monitoring: Tracking performance, costs, and quality
  • Prompt Management: Version control and optimization of prompts
  • Security: Ensuring safe and responsible AI usage
  • Evaluation: Testing and validating model outputs
  • Data Management: Handling training data and embeddings

LLMOps vs MLOps

Aspect MLOps LLMOps
Model Size Typically smaller models Very large models (billions of parameters)
Training Full model training common Fine-tuning and prompt engineering preferred
Deployment Standard serving infrastructure Specialized inference optimization required
Monitoring Metrics-focused Quality, safety, and cost-focused
Versioning Model versions Model + prompt + configuration versions
Cost Moderate compute costs High compute and inference costs
Latency Milliseconds Seconds (streaming helps)
Data Structured/tabular data Unstructured text, multimodal data

Models

Large Language Models

Model Description Stars License
LLaMA Meta's foundational large language models Stars Research
Mistral High-performance open models from Mistral AI Stars Apache 2.0
Gemma Google's lightweight open models N/A Gemma License
Qwen Alibaba's multilingual LLM series Stars Apache 2.0
DeepSeek Cost-effective open-source LLMs Stars MIT
Phi Microsoft's small language models N/A MIT
ChatGLM Bilingual conversational language model Stars Apache 2.0
Alpaca Stanford's instruction-following model Stars Apache 2.0
Vicuna Open chatbot trained by fine-tuning LLaMA Stars Apache 2.0
BELLE Chinese language model based on LLaMA Stars Apache 2.0
Falcon TII's high-performance open models N/A Apache 2.0
Bloom Multilingual LLM from BigScience Stars RAIL

Multimodal Models

Model Description Stars
LLaVA Large Language and Vision Assistant Stars
MiniCPM-V Efficient multimodal model Stars
Qwen-VL Vision-language model from Alibaba Stars

Audio Foundation Models

Model Description Stars
Whisper OpenAI's speech recognition model Stars
Faster Whisper Fast inference engine for Whisper Stars

Inference & Serving

Inference Engines

Tool Description Stars
vLLM High-throughput and memory-efficient inference engine Stars
llama.cpp LLM inference in C/C++ Stars
TensorRT-LLM NVIDIA's optimized inference library Stars
LMDeploy Toolkit for compressing and deploying LLMs Stars
DeepSpeed-MII Low-latency inference powered by DeepSpeed Stars
CTranslate2 Fast inference engine for Transformer models Stars
Cortex.cpp Local AI API Platform Stars
LoRAX Multi-LoRA inference server Stars
MInference Speed up long-context LLM inference Stars
ipex-llm Accelerate LLM inference on Intel hardware Stars

Inference Platforms

Platform Description Stars
Ollama Run LLMs locally with ease Stars
LocalAI OpenAI-compatible API for local models Stars
LM Studio Desktop app for running LLMs locally N/A
GPUStack Manage GPU clusters for LLM inference Stars
OpenLLM Operating LLMs in production Stars
Ray Serve Scalable model serving with Ray Stars

Model Serving Frameworks

Framework Description Stars
BentoML Unified model serving framework Stars
Triton Inference Server NVIDIA's optimized inference solution Stars
TorchServe Serve PyTorch models in production Stars
TensorFlow Serving Flexible ML serving system Stars
Jina Build multimodal AI services Stars
Mosec Model serving with dynamic batching Stars
Infinity REST API for text embeddings Stars

Orchestration

Application Frameworks

Framework Description Stars
LangChain Framework for developing LLM applications Stars
LlamaIndex Data framework for LLM applications Stars
Haystack End-to-end NLP framework Stars
Semantic Kernel Microsoft's SDK for AI orchestration Stars
Langfuse Open-source LLM engineering platform Stars
Neurolink Universal AI development platform Stars

Agent Frameworks

Framework Description Stars
AutoGPT Autonomous AI agent framework Stars
CrewAI Framework for orchestrating AI agents Stars
AutoGen Multi-agent conversation framework Stars
LangGraph Build stateful multi-actor applications Stars
AgentMark Type-safe Markdown-based agents Stars

Workflow Management

Tool Description Stars
Prefect Modern workflow orchestration Stars
Airflow Platform to programmatically author workflows Stars
Flyte Kubernetes-native workflow automation Stars
Flowise Drag & drop UI for LLM flows Stars

Training & Fine-Tuning

Training Frameworks

Framework Description Stars
DeepSpeed Deep learning optimization library Stars
Megatron-LM Large-scale transformer training Stars
PyTorch FSDP Fully Sharded Data Parallel N/A
Colossal-AI Unified deep learning system Stars
Accelerate Simple way to train on distributed setups Stars

Fine-Tuning Tools

Tool Description Stars
Axolotl Streamlined LLM fine-tuning Stars
LLaMA-Factory Unified fine-tuning framework Stars
PEFT Parameter-Efficient Fine-Tuning Stars
Unsloth 2x faster LLM fine-tuning Stars
TRL Transformer Reinforcement Learning Stars
LitGPT Pretrain, fine-tune, deploy LLMs Stars

Experiment Tracking

Tool Description Stars
Weights & Biases ML experiment tracking Stars
MLflow Open-source ML lifecycle platform Stars
TensorBoard TensorFlow's visualization toolkit Stars
Aim Easy-to-use experiment tracker Stars

Prompt Engineering

Tools & Platforms

Tool Description Link
PromptBase Marketplace for prompt engineering πŸ”—
PromptHero Prompt engineering resources πŸ”—
Prompt Perfect Auto prompt optimizer πŸ”—
Learn Prompting Prompt engineering tutorials πŸ”—
LangSmith Debug and test LLM applications πŸ”—
PromptLayer Prompt engineering platform πŸ”—

Resources


Vector Search & RAG

Tool Description Stars
Chroma AI-native embedding database Stars
Weaviate Vector search engine Stars
Qdrant Vector similarity search engine Stars
Milvus Cloud-native vector database Stars
Pinecone Managed vector database N/A
FAISS Efficient similarity search library Stars
pgvector Vector similarity search for Postgres Stars
LanceDB Developer-friendly vector database Stars

Observability & Monitoring

Tool Description Stars
Langfuse Open-source LLM observability Stars
Phoenix AI observability & evaluation Stars
Helicone Open-source LLM observability Stars
Lunary Production toolkit for LLMs N/A
OpenLIT OpenTelemetry-native LLM observability Stars
Evidently ML and LLM observability framework Stars
DeepEval LLM evaluation framework Stars
PostHog Product analytics and feature flags Stars

Security & Safety

Tool Description Stars
NeMo Guardrails Programmable guardrails for LLM apps Stars
Guardrails AI Add guardrails to LLM applications Stars
LLM Guard Security toolkit for LLM interactions Stars
Rebuff Prompt injection detection Stars
LangKit LLM monitoring toolkit Stars

Data Management

Tool Description Stars
DVC Data version control Stars
LakeFS Git for data lakes Stars
Pachyderm Data versioning and pipelines Stars
Delta Lake Storage framework for data lakes Stars

Optimization & Performance

Tool Description Stars
ONNX Runtime Cross-platform ML accelerator Stars
TVM ML compiler framework Stars
BitsAndBytes 8-bit optimizers and quantization Stars
AutoGPTQ Easy-to-use LLM quantization Stars
GPTQ-for-LLaMa 4-bit quantization for LLaMA Stars

Development Tools

IDEs & Code Assistants

Tool Description Stars
GitHub Copilot AI pair programmer N/A
Cursor AI-first code editor N/A
Continue Open-source AI code assistant Stars
Cody AI coding assistant N/A
Tabby Self-hosted AI coding assistant Stars

Notebooks & Workspaces

Tool Description Stars
Jupyter Interactive computing environment Stars
Google Colab Free cloud notebooks N/A
Gradient Managed notebooks and workflows N/A

LLMOps Platforms

Platform Description Stars
Agenta LLMOps platform for building robust apps Stars
Dify LLM app development platform Stars
Pezzo Open-source LLMOps platform Stars
Humanloop Prompt management and evaluation N/A
PromptLayer Prompt engineering platform N/A
Weights & Biases ML platform with LLM support N/A

Resources & Learning

Documentation & Guides

Awesome Lists

Papers & Research


Contributing

We welcome contributions from the community! Here's how you can help:

How to Contribute

  1. Fork the repository
  2. Create a new branch (git checkout -b feature/amazing-tool)
  3. Add your contribution following our guidelines
  4. Commit your changes (git commit -m 'Add amazing tool')
  5. Push to the branch (git push origin feature/amazing-tool)
  6. Open a Pull Request

Contribution Guidelines

  • Quality over quantity: Only add tools/resources you've personally used or thoroughly researched
  • Keep descriptions concise: 1-2 sentences maximum
  • Include GitHub stars badge: Use the format shown in existing entries
  • Maintain alphabetical order: Within each category
  • Check for duplicates: Search before adding
  • Update the Table of Contents: If adding new sections
  • Follow the existing format: Match the style of current entries

What to Contribute

  • βœ… New tools, frameworks, or platforms
  • βœ… Useful resources, tutorials, or guides
  • βœ… Bug fixes or improvements to existing entries
  • βœ… Better descriptions or categorizations
  • ❌ Promotional content or spam
  • ❌ Outdated or unmaintained projects (unless historically significant)

See CONTRIBUTING.md for detailed guidelines.


License

CC0

This project is licensed under CC0 1.0 Universal. See LICENSE for details.


Star History

Star History Chart


Acknowledgments

This repository is inspired by and builds upon several excellent awesome lists:

Special thanks to all contributors who help maintain and improve this resource!


If you find this repository helpful, please consider giving it a ⭐️

Made with ❀️ by the community

Releases

No releases published

Packages

No packages published

Languages