Awesome LLMOps

🚀 The Ultimate Curated List of LLMOps Tools, Frameworks, and Resources

A comprehensive collection of the best tools, frameworks, models, and resources for Large Language Model Operations (LLMOps)

📋 Table of Contents

What's New
What is LLMOps?
LLMOps vs MLOps
Models
Inference & Serving
Orchestration
Training & Fine-Tuning
Prompt Engineering
Vector Search & RAG
Observability & Monitoring
Security & Safety
Data Management
Optimization & Performance
Development Tools
LLMOps Platforms
Resources & Learning
Contributing

What's New

🆕 Recently Added (January 2026)

Infrastructure & Deployment:

Skypilot - Run LLMs on any cloud with one command
Modal - Serverless platform for AI/ML workloads

Evaluation & Testing:

Ragas - Evaluation framework for RAG pipelines
PromptFoo - Test and evaluate LLM outputs

Agent Frameworks:

Phidata - Build AI assistants with memory and knowledge
Composio - Integration platform for AI agents

Monitoring & Observability:

Traceloop - OpenTelemetry for LLMs
LangWatch - LLM monitoring and analytics

📈 Trending This Month

vLLM continues to dominate high-throughput inference
LangGraph gaining traction for stateful agent workflows
Ollama becoming the go-to for local LLM deployment
DeepSeek models showing impressive cost-performance ratios

What is LLMOps?

LLMOps (Large Language Model Operations) is a set of practices, tools, and workflows designed to deploy, monitor, and maintain large language models in production environments. It encompasses the entire lifecycle of LLM applications, from development and training to deployment, monitoring, and continuous improvement.

Key Components of LLMOps:

Model Development: Training, fine-tuning, and optimizing LLMs
Deployment: Serving models efficiently at scale
Monitoring: Tracking performance, costs, and quality
Prompt Management: Version control and optimization of prompts
Security: Ensuring safe and responsible AI usage
Evaluation: Testing and validating model outputs
Data Management: Handling training data and embeddings

LLMOps vs MLOps

Aspect	MLOps	LLMOps
Model Size	Typically smaller models	Very large models (billions of parameters)
Training	Full model training common	Fine-tuning and prompt engineering preferred
Deployment	Standard serving infrastructure	Specialized inference optimization required
Monitoring	Metrics-focused	Quality, safety, and cost-focused
Versioning	Model versions	Model + prompt + configuration versions
Cost	Moderate compute costs	High compute and inference costs
Latency	Milliseconds	Seconds (streaming helps)
Data	Structured/tabular data	Unstructured text, multimodal data

Models

Large Language Models

Model	Description	Stars	License
LLaMA	Meta's foundational large language models		Research
Mistral	High-performance open models from Mistral AI		Apache 2.0
Gemma	Google's lightweight open models	N/A	Gemma License
Qwen	Alibaba's multilingual LLM series		Apache 2.0
DeepSeek	Cost-effective open-source LLMs		MIT
Phi	Microsoft's small language models	N/A	MIT
ChatGLM	Bilingual conversational language model		Apache 2.0
Alpaca	Stanford's instruction-following model		Apache 2.0
Vicuna	Open chatbot trained by fine-tuning LLaMA		Apache 2.0
BELLE	Chinese language model based on LLaMA		Apache 2.0
Falcon	TII's high-performance open models	N/A	Apache 2.0
Bloom	Multilingual LLM from BigScience		RAIL

Multimodal Models

Model	Description	Stars
LLaVA	Large Language and Vision Assistant
MiniCPM-V	Efficient multimodal model
Qwen-VL	Vision-language model from Alibaba

Audio Foundation Models

Model	Description	Stars
Whisper	OpenAI's speech recognition model
Faster Whisper	Fast inference engine for Whisper

Inference & Serving

Inference Engines

Tool	Description	Stars
vLLM	High-throughput and memory-efficient inference engine
llama.cpp	LLM inference in C/C++
TensorRT-LLM	NVIDIA's optimized inference library
LMDeploy	Toolkit for compressing and deploying LLMs
DeepSpeed-MII	Low-latency inference powered by DeepSpeed
CTranslate2	Fast inference engine for Transformer models
Cortex.cpp	Local AI API Platform
LoRAX	Multi-LoRA inference server
MInference	Speed up long-context LLM inference
ipex-llm	Accelerate LLM inference on Intel hardware

Inference Platforms

Platform	Description	Stars
Ollama	Run LLMs locally with ease
LocalAI	OpenAI-compatible API for local models
LM Studio	Desktop app for running LLMs locally	N/A
GPUStack	Manage GPU clusters for LLM inference
OpenLLM	Operating LLMs in production
Ray Serve	Scalable model serving with Ray

Model Serving Frameworks

Framework	Description	Stars
BentoML	Unified model serving framework
Triton Inference Server	NVIDIA's optimized inference solution
TorchServe	Serve PyTorch models in production
TensorFlow Serving	Flexible ML serving system
Jina	Build multimodal AI services
Mosec	Model serving with dynamic batching
Infinity	REST API for text embeddings

Orchestration

Application Frameworks

Framework	Description	Stars
LangChain	Framework for developing LLM applications
LlamaIndex	Data framework for LLM applications
Haystack	End-to-end NLP framework
Semantic Kernel	Microsoft's SDK for AI orchestration
Langfuse	Open-source LLM engineering platform
Neurolink	Universal AI development platform

Agent Frameworks

Framework	Description	Stars
AutoGPT	Autonomous AI agent framework
CrewAI	Framework for orchestrating AI agents
AutoGen	Multi-agent conversation framework
LangGraph	Build stateful multi-actor applications
AgentMark	Type-safe Markdown-based agents

Workflow Management

Tool	Description	Stars
Prefect	Modern workflow orchestration
Airflow	Platform to programmatically author workflows
Flyte	Kubernetes-native workflow automation
Flowise	Drag & drop UI for LLM flows

Training & Fine-Tuning

Training Frameworks

Framework	Description	Stars
DeepSpeed	Deep learning optimization library
Megatron-LM	Large-scale transformer training
PyTorch FSDP	Fully Sharded Data Parallel	N/A
Colossal-AI	Unified deep learning system
Accelerate	Simple way to train on distributed setups

Fine-Tuning Tools

Tool	Description	Stars
Axolotl	Streamlined LLM fine-tuning
LLaMA-Factory	Unified fine-tuning framework
PEFT	Parameter-Efficient Fine-Tuning
Unsloth	2x faster LLM fine-tuning
TRL	Transformer Reinforcement Learning
LitGPT	Pretrain, fine-tune, deploy LLMs

Experiment Tracking

Tool	Description	Stars
Weights & Biases	ML experiment tracking
MLflow	Open-source ML lifecycle platform
TensorBoard	TensorFlow's visualization toolkit
Aim	Easy-to-use experiment tracker

Prompt Engineering

Tools & Platforms

Tool	Description	Link
PromptBase	Marketplace for prompt engineering	🔗
PromptHero	Prompt engineering resources	🔗
Prompt Perfect	Auto prompt optimizer	🔗
Learn Prompting	Prompt engineering tutorials	🔗
LangSmith	Debug and test LLM applications	🔗
PromptLayer	Prompt engineering platform	🔗

Resources

Vector Search & RAG

Tool	Description	Stars
Chroma	AI-native embedding database
Weaviate	Vector search engine
Qdrant	Vector similarity search engine
Milvus	Cloud-native vector database
Pinecone	Managed vector database	N/A
FAISS	Efficient similarity search library
pgvector	Vector similarity search for Postgres
LanceDB	Developer-friendly vector database

Observability & Monitoring

Tool	Description	Stars
Langfuse	Open-source LLM observability
Phoenix	AI observability & evaluation
Helicone	Open-source LLM observability
Lunary	Production toolkit for LLMs	N/A
OpenLIT	OpenTelemetry-native LLM observability
Evidently	ML and LLM observability framework
DeepEval	LLM evaluation framework
PostHog	Product analytics and feature flags

Security & Safety

Tool	Description	Stars
NeMo Guardrails	Programmable guardrails for LLM apps
Guardrails AI	Add guardrails to LLM applications
LLM Guard	Security toolkit for LLM interactions
Rebuff	Prompt injection detection
LangKit	LLM monitoring toolkit

Data Management

Tool	Description	Stars
DVC	Data version control
LakeFS	Git for data lakes
Pachyderm	Data versioning and pipelines
Delta Lake	Storage framework for data lakes

Optimization & Performance

Tool	Description	Stars
ONNX Runtime	Cross-platform ML accelerator
TVM	ML compiler framework
BitsAndBytes	8-bit optimizers and quantization
AutoGPTQ	Easy-to-use LLM quantization
GPTQ-for-LLaMa	4-bit quantization for LLaMA

Development Tools

IDEs & Code Assistants

Tool	Description	Stars
GitHub Copilot	AI pair programmer	N/A
Cursor	AI-first code editor	N/A
Continue	Open-source AI code assistant
Cody	AI coding assistant	N/A
Tabby	Self-hosted AI coding assistant

Notebooks & Workspaces

Tool	Description	Stars
Jupyter	Interactive computing environment
Google Colab	Free cloud notebooks	N/A
Gradient	Managed notebooks and workflows	N/A

LLMOps Platforms

Platform	Description	Stars
Agenta	LLMOps platform for building robust apps
Dify	LLM app development platform
Pezzo	Open-source LLMOps platform
Humanloop	Prompt management and evaluation	N/A
PromptLayer	Prompt engineering platform	N/A
Weights & Biases	ML platform with LLM support	N/A

Resources & Learning

Documentation & Guides

OpenAI Cookbook - Examples and guides for OpenAI API
LLM University - Cohere's LLM learning resources
Hugging Face Course - NLP with Transformers
Full Stack LLM Bootcamp - Comprehensive LLM course

Awesome Lists

Awesome LLM - Curated list of LLM resources
Awesome ChatGPT Prompts - Prompt examples
Awesome AI Agents - AI agent resources
Awesome LangChain - LangChain resources

Papers & Research

Contributing

We welcome contributions from the community! Here's how you can help:

How to Contribute

Fork the repository
Create a new branch (git checkout -b feature/amazing-tool)
Add your contribution following our guidelines
Commit your changes (git commit -m 'Add amazing tool')
Push to the branch (git push origin feature/amazing-tool)
Open a Pull Request

Contribution Guidelines

Quality over quantity: Only add tools/resources you've personally used or thoroughly researched
Keep descriptions concise: 1-2 sentences maximum
Include GitHub stars badge: Use the format shown in existing entries
Maintain alphabetical order: Within each category
Check for duplicates: Search before adding
Update the Table of Contents: If adding new sections
Follow the existing format: Match the style of current entries

What to Contribute

✅ New tools, frameworks, or platforms
✅ Useful resources, tutorials, or guides
✅ Bug fixes or improvements to existing entries
✅ Better descriptions or categorizations
❌ Promotional content or spam
❌ Outdated or unmaintained projects (unless historically significant)

See CONTRIBUTING.md for detailed guidelines.

License

This project is licensed under CC0 1.0 Universal. See LICENSE for details.

Star History

Acknowledgments

This repository is inspired by and builds upon several excellent awesome lists:

Special thanks to all contributors who help maintain and improve this resource!

If you find this repository helpful, please consider giving it a ⭐️

Made with ❤️ by the community

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
.github		.github
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
GOOD_FIRST_ISSUES.md		GOOD_FIRST_ISSUES.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
SETUP_INSTRUCTIONS.md		SETUP_INSTRUCTIONS.md
setup-github-repo.sh		setup-github-repo.sh

License

pmady/llmops

Folders and files

Latest commit

History

Repository files navigation

Awesome LLMOps

🚀 The Ultimate Curated List of LLMOps Tools, Frameworks, and Resources

📋 Table of Contents

What's New

🆕 Recently Added (January 2026)

📈 Trending This Month

What is LLMOps?

Key Components of LLMOps:

LLMOps vs MLOps

Models

Large Language Models

Multimodal Models

Audio Foundation Models

Inference & Serving

Inference Engines

Inference Platforms

Model Serving Frameworks

Orchestration

Application Frameworks

Agent Frameworks

Workflow Management

Training & Fine-Tuning

Training Frameworks

Fine-Tuning Tools

Experiment Tracking

Prompt Engineering

Tools & Platforms

Resources

Vector Search & RAG

Observability & Monitoring

Security & Safety

Data Management

Optimization & Performance

Development Tools

IDEs & Code Assistants

Notebooks & Workspaces

LLMOps Platforms

Resources & Learning

Documentation & Guides

Awesome Lists

Papers & Research

Contributing

How to Contribute

Contribution Guidelines

What to Contribute

License

Star History

Acknowledgments

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages