This project implements Corrective Retrieval-Augmented Generation (CRAG), a framework designed to enhance the robustness of RAG systems by self-correcting retrieval results. It addresses the common issue where Large Language Models (LLMs) may hallucinate when provided with irrelevant or low-quality retrieved documents.
Standard Retrieval-Augmented Generation (RAG) often incorporates retrieved documents into the generation process indiscriminately, regardless of their relevance. CRAG counters this by introducing a retrieval evaluator that assesses document quality and triggers corrective actions, such as knowledge refinement or external web searches, so that the generator receives the most accurate knowledge available.
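As a concrete illustration of the evaluator idea, the sketch below grades a single document's relevance with an LLM. This is a minimal sketch, not the project's exact implementation: the model name, prompt wording, and `RelevanceGrade` schema are illustrative assumptions.

```python
# Hypothetical retrieval-evaluator sketch: an LLM returns a binary
# relevance verdict for one retrieved document.
from pydantic import BaseModel, Field
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


class RelevanceGrade(BaseModel):
    """Binary relevance verdict for a single document."""
    binary_score: str = Field(
        description="'yes' if the document is relevant to the question, else 'no'"
    )


llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model name is an assumption
prompt = ChatPromptTemplate.from_messages([
    ("system", "Grade whether the retrieved document is relevant to the user question."),
    ("human", "Document:\n{document}\n\nQuestion: {question}"),
])
grader = prompt | llm.with_structured_output(RelevanceGrade)

grade = grader.invoke({
    "document": "CRAG adds a retrieval evaluator in front of the generator.",
    "question": "What is Corrective RAG?",
})
# A "no" verdict is what sets the web-search flag downstream.
```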
- **Lightweight Retrieval Evaluation:** Evaluates the relevance of retrieved documents to the user query before they are used for generation.
- **Knowledge Correction:** Discards irrelevant documents and triggers a web search extension to fill knowledge gaps.
- **Adaptive Workflow:** Uses LangGraph to manage a stateful workflow that can route queries between the local vector store and the web (a minimal state sketch follows this list).
- **Post-Generation Verification:** Includes self-reflection steps that grade answers for hallucinations and utility, drawing inspiration from Self-RAG.
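A LangGraph workflow of this shape revolves around a shared state object that every node reads and updates. The sketch below shows one plausible schema; the field names are assumptions for illustration, not the project's exact state.

```python
# Hypothetical shared state passed between LangGraph nodes;
# field names are illustrative.
from typing import List, TypedDict


class GraphState(TypedDict):
    question: str            # the user query
    documents: List[str]     # retrieved, then filtered, context
    web_search_needed: bool  # set by the grader when context is insufficient
    generation: str          # the draft answer
```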
The workflow follows a directed graph managed by `langgraph` (a wiring sketch follows the list):

- **Routing:** Determines whether the query should go to the vector store or directly to web search.
- **Retrieval:** Fetches relevant documents from a Qdrant vector store.
- **Grading:** A "Retrieval Evaluator" checks the relevance of each document. If a document is deemed irrelevant, a web-search flag is set.
- **Web Search:** If needed, the system uses Tavily to supplement the knowledge base with fresh web data.
- **Generation:** Produces an answer using the refined context.
- **Self-Correction:** Verifies the answer against the documents (Hallucination Grade) and the question (Answer Grade).
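The sketch below wires these steps together with `langgraph`, reusing the `GraphState` from the earlier sketch. It is a minimal sketch under stated assumptions: the node bodies are placeholders (the real nodes call Qdrant, the grader LLM, Tavily, and the generation chain), and the node and route names are illustrative.

```python
# Hypothetical CRAG graph wiring; GraphState is defined in the sketch above.
from langgraph.graph import StateGraph, START, END


def retrieve(state: GraphState) -> dict:
    return {"documents": ["<documents from the Qdrant vector store>"]}

def grade_documents(state: GraphState) -> dict:
    # The real node runs the retrieval evaluator over each document.
    return {"documents": state["documents"], "web_search_needed": False}

def web_search(state: GraphState) -> dict:
    return {"documents": state["documents"] + ["<fresh Tavily results>"]}

def generate(state: GraphState) -> dict:
    return {"generation": "<answer grounded in the refined context>"}

def route_question(state: GraphState) -> str:
    return "vectorstore"  # or "web_search" for queries the store cannot answer

def decide_to_generate(state: GraphState) -> str:
    return "web_search" if state["web_search_needed"] else "generate"

def grade_generation(state: GraphState) -> str:
    # Hallucination grade plus answer grade; "useful" ends the run.
    return "useful"


workflow = StateGraph(GraphState)
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("web_search", web_search)
workflow.add_node("generate", generate)

workflow.add_conditional_edges(
    START, route_question, {"vectorstore": "retrieve", "web_search": "web_search"}
)
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents", decide_to_generate,
    {"web_search": "web_search", "generate": "generate"},
)
workflow.add_edge("web_search", "generate")
workflow.add_conditional_edges(
    "generate", grade_generation,
    {"useful": END, "not_supported": "generate", "not_useful": "web_search"},
)

app = workflow.compile()
print(app.invoke({"question": "What is Corrective RAG?"})["generation"])
```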
- **Environment:** Requires Python 3.10+ and a tool such as uv or pip for dependency management.
- **Configuration:** Set the following environment variables:

| Variable | Description |
| --- | --- |
| `OPENAI_API_KEY` | For the LLM and embeddings. |
| `QDRANT_URL` | For the vector store (defaults to `http://localhost:6333`). |
| `TAVILY_API_KEY` | For web search capabilities. |
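For local runs, the variables can be read straight from the process environment. The snippet below is a minimal sketch; the use of `python-dotenv` is an assumption, and any mechanism that sets these variables works just as well.

```python
# Minimal configuration-loading sketch; python-dotenv is an assumption.
import os

from dotenv import load_dotenv

load_dotenv()  # pick up a local .env file if present
openai_api_key = os.environ["OPENAI_API_KEY"]                  # required
qdrant_url = os.getenv("QDRANT_URL", "http://localhost:6333")  # optional, has a default
tavily_api_key = os.environ["TAVILY_API_KEY"]                  # required for web search
```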
The project includes a CLI via `main.py`:

- **Ingest Documents:** Load data into the vector store.

  ```bash
  python main.py ingest ./path/to/docs
  ```

- **Invoke Agent:** Ask a question.

  ```bash
  python main.py invoke "What is Corrective RAG?"
  ```

- **Visualize Graph:** Save the LangGraph workflow as an image.

  ```bash
  python main.py visualize ./graph.png
  ```
- *Corrective Retrieval Augmented Generation*, Anonymous Authors, under review at ICLR 2025.
- *Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection*, Akari Asai et al., ICLR 2024.
- *Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity*, Soyeong Jeong et al., NAACL 2024.
