A hands-on workshop for building Retrieval-Augmented Generation (RAG) systems, progressing from simple word-overlap retrieval to agentic RAG with ReAct agents.
- uv package manager (handles Python installation automatically)
- OpenAI API key (see below)
1. Create an OpenAI account at https://platform.openai.com/
2. Give us your OpenAI email so we can invite you to the organization.
3. Accept the invitation to the "MW Hackathon" organization.
   - Check your email for the subject: "You were invited to the organization MW Hackathon on OpenAI"
4. Generate your API key:
   - Go to https://platform.openai.com/
   - In the top left, change the organization to MW Hackathon (you should see "MW Hackathon / RAG Workshop" at the top)
   - On the left sidebar, click API keys (under Organization)
   - Click Create new secret key
   - Fill in:
     - Name: `<your name>`
     - Project: RAG Workshop
     - Permissions: All
   - Click Create secret key and copy it to your `.env` file

1. Install uv (if not already installed):

   Follow the official installation guide for your operating system.

   Verify the installation:

       uv --version

2. Clone the repository:

       git clone https://github.com/marshallwace/rag-workshop.git
       cd rag-workshop

3. Install dependencies:

       uv sync

   This will automatically download Python 3.14 and install all dependencies.

4. Configure environment:

       cp .env.sample .env

   Then edit `.env` and add your OpenAI API key (see the API key steps above).

5. Verify setup:

       uv run python -c "from utils import get_embedding, generate_completion; print('Setup OK')"

   If this fails, check that your `.env` file exists and contains a valid `OPENAI_API_KEY`.
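
If the verify step fails, a quick check like the one below can confirm whether the key is present in `.env`. This is a minimal sketch, not part of the workshop code: `read_env_value` is a hypothetical helper, it assumes simple `KEY=value` lines, and it only checks that a value exists, not that OpenAI accepts it.

```python
from pathlib import Path


def read_env_value(path, name):
    """Return the value assigned to `name` in a .env-style file, or None.

    Hypothetical helper: assumes plain KEY=value lines, optional quotes,
    and '#' comments. It does not handle multi-line values or `export`.
    """
    env_file = Path(path)
    if not env_file.is_file():
        return None
    for line in env_file.read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop surrounding whitespace and quotes; treat empty as missing.
            return value.strip().strip("'\"") or None
    return None


key = read_env_value(".env", "OPENAI_API_KEY")
print("OPENAI_API_KEY found" if key else "OPENAI_API_KEY missing or empty")
```

Run it from the repository root so that the relative `.env` path resolves to the file created in step 4.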
Note: Exercises build on each other. Complete them in order (Step 1 → Step 2).

Step 1: Retrieval

| Exercise | Description | Command |
|---|---|---|
| Exercise 1 | Simple word-overlap retrieval | uv run pytest step1_retrieval/exercise_1/test_retrieval.py |
| Exercise 2 | Embedding-based retrieval | uv run pytest step1_retrieval/exercise_2/test_retrieval.py |
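
To make the starting point concrete, word-overlap retrieval can be sketched as below. This is an illustration under simple assumptions (lowercased alphanumeric tokens, score equals the number of distinct shared words), not the workshop's reference solution; the function names and sample documents are made up for the example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, document: str) -> int:
    """Count the distinct words shared by the query and the document."""
    return len(tokenize(query) & tokenize(document))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents with the highest word overlap."""
    ranked = sorted(
        documents,
        key=lambda doc: overlap_score(query, doc),
        reverse=True,
    )
    return ranked[:top_k]


# Illustrative corpus, not workshop data.
documents = [
    "The cat sat on the mat.",
    "Stock markets fell sharply on Monday.",
    "A dog played in a garden.",
]
print(retrieve("where is the cat", documents, top_k=1))
```

Word overlap has no notion of meaning: a query about "felines" would score zero against the first document, which is the gap embedding-based retrieval is meant to close.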

Step 2: Generation

| Exercise | Description | Command |
|---|---|---|
| Exercise 1 | Simple RAG (retrieve + generate) | uv run python -m step2_generation.exercise_1.demo_rag |
| Exercise 2 | Agentic RAG with ReAct | uv run pytest step2_generation/exercise_2/test_retrieval.py |
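
The ReAct pattern in Exercise 2 alternates model output (Thought, Action) with tool results (Observation) until the model emits a final answer. The loop below is a rough sketch of that control flow. Everything in it is an assumption made for illustration: the `Action: tool[input]` text format, the `search` tool, and `scripted_llm`, which stands in for a real model call so the example runs offline.

```python
import re
from typing import Callable


def search(query: str) -> str:
    """Hypothetical retrieval tool backed by a tiny in-memory corpus."""
    corpus = {"capital of france": "Paris is the capital of France."}
    return corpus.get(query.lower(), "No results found.")


TOOLS: dict[str, Callable[[str], str]] = {"search": search}


def react_loop(question: str, llm: Callable[[str], str], max_steps: int = 5) -> str:
    """Run Thought/Action/Observation rounds until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"

        # Stop as soon as the model commits to an answer.
        final = re.search(r"Final Answer:\s*(.+)", reply)
        if final:
            return final.group(1).strip()

        # Otherwise run the requested tool and feed the result back in.
        action = re.search(r"Action:\s*(\w+)\[(.*?)\]", reply)
        if action:
            name, arg = action.groups()
            tool = TOOLS.get(name)
            observation = tool(arg) if tool else f"Unknown tool: {name}"
            transcript += f"Observation: {observation}\n"
    return "No answer within the step limit."


def scripted_llm(transcript: str) -> str:
    """Stand-in for a real model: searches once, then answers."""
    if "Observation:" not in transcript:
        return "Thought: I should look this up.\nAction: search[capital of France]"
    return "Thought: The observation answers the question.\nFinal Answer: Paris"


print(react_loop("What is the capital of France?", scripted_llm))
```

The `max_steps` cap matters in practice: without it, a model that never produces a final answer would loop indefinitely.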

Run all tests:

    uv run pytest

Run a specific exercise:

    uv run pytest step1_retrieval/exercise_1/test_retrieval.py -v

Use -s to see console output (print statements, logs, etc.):

    uv run pytest step1_retrieval/exercise_1/test_retrieval.py -vs

The utils module provides helper functions for exercises:

    from utils import get_embedding, generate_completion, cosine_similarity_batch

    # Get embeddings
    embedding = await get_embedding("some text")  # returns list[float]

    # Generate completions
    answer = await generate_completion("your prompt here")

    # Compute similarity scores
    scores = cosine_similarity_batch(query_emb, [emb1, emb2, ...])  # returns list[float]
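
The three helpers combine naturally into a retrieve-then-generate pipeline. The sketch below shows one way that could look, using offline stand-ins so it runs without an API key. `fake_get_embedding`, `fake_generate_completion`, and `cosine_similarity_batch_sketch` are hypothetical substitutes that only mirror the call shapes shown above; they are not the workshop's implementations, and the keyword-count "embedding" is a toy.

```python
import asyncio
import math
import re


async def fake_get_embedding(text: str) -> list[float]:
    """Stand-in for get_embedding: counts of a few hand-picked keywords."""
    vocab = ["cat", "dog", "market"]
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(word)) for word in vocab]


async def fake_generate_completion(prompt: str) -> str:
    """Stand-in for generate_completion: echoes the context it was given."""
    context = prompt.split("Context:\n", 1)[1].split("\n\nQuestion:", 1)[0]
    return f"(stub answer) {context}"


def cosine_similarity_batch_sketch(
    query_emb: list[float], doc_embs: list[list[float]]
) -> list[float]:
    """Cosine similarity of one query vector against each document vector."""
    def norm(vec: list[float]) -> float:
        return math.sqrt(sum(x * x for x in vec))

    scores = []
    for emb in doc_embs:
        denom = norm(query_emb) * norm(emb)
        dot = sum(q * d for q, d in zip(query_emb, emb))
        # A zero vector has no direction, so score it as 0.0.
        scores.append(dot / denom if denom else 0.0)
    return scores


async def simple_rag(question: str, documents: list[str], top_k: int = 1) -> str:
    """Embed, rank by cosine similarity, then generate from the top documents."""
    query_emb = await fake_get_embedding(question)
    doc_embs = await asyncio.gather(*(fake_get_embedding(d) for d in documents))
    scores = cosine_similarity_batch_sketch(query_emb, doc_embs)
    ranked = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)
    context = "\n".join(doc for _, doc in ranked[:top_k])
    prompt = (
        "Answer using only this context.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return await fake_generate_completion(prompt)


# Illustrative corpus, not workshop data.
documents = [
    "the cat chased another cat",
    "the market closed higher",
    "a dog barked at the dog park",
]
print(asyncio.run(simple_rag("which cat is this", documents)))
```

Because the helpers are coroutines, they have to be awaited inside an `async` function and started with something like `asyncio.run`, as the last line does.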