A simple, mini hands-on course on language models. Learn by doing.
```sh
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and set up
git clone https://github.com/yourusername/lm-course.git
cd lm-course
uv sync

# Start learning (or any other experiment, e.g. default_cpu for a CPU-only job)
uv run python -m scripts.pretrain --experiment=default
```

You only need to do steps 1-3 once.
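Under the hood, pretraining minimizes next-token cross-entropy on text. As a rough, framework-free sketch of that objective, here is a toy count-based bigram "language model" (everything in this snippet is illustrative; it is not the course's model or API):

```python
import math
from collections import Counter

# Toy corpus; a real run uses the course's pretraining data.
text = "hello world, hello language models"

# "Pretraining" a bigram model = estimating next-character
# probabilities from co-occurrence counts.
pairs = Counter(zip(text, text[1:]))
context = Counter(text[:-1])
prob = {(a, b): c / context[a] for (a, b), c in pairs.items()}

# Average next-character cross-entropy: the loss pretraining minimizes.
nll = -sum(math.log(prob[(a, b)]) for a, b in zip(text, text[1:]))
loss = nll / (len(text) - 1)
print(f"loss: {loss:.3f} nats/char")
```

A transformer replaces the count table with a learned network, but the quantity it optimizes is the same.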
- Fork this repository to your own account.
- Clone it locally to your computer so you can edit as you wish, then push your changes to your fork.
- Run `scripts/launch_colab.sh` to set up your SSH config for Colab (works only on macOS with Homebrew).
- Open a new notebook on Colab and start a new runtime (GPU or CPU; GPU recommended).
- Run the following code in a cell (replace {your-username}, {your-name} and {your-email}):
```python
from google.colab import drive
import os

drive.mount('/content/drive')
work_dir = '/content/drive/MyDrive/colab-projects'
os.makedirs(work_dir, exist_ok=True)
os.chdir(work_dir)

# Clone your fork into Drive the first time; symlink it so it is reachable from /root.
if not os.path.exists('lm-course'):
    !git clone https://github.com/{your-username}/lm-course.git
!ln -sf /content/drive/MyDrive/colab-projects/lm-course /root/lm-course
os.chdir('lm-course')

!git config --global user.email "your-email@example.com"
!git config --global user.name "Your Name"

# Set up SSH access to the runtime.
!pip install colab-ssh
# Make sure any downloaded cloudflared binary is executable.
!find /content -name "cloudflared" -exec chmod +x {} \; 2>/dev/null || true

from colab_ssh import launch_ssh_cloudflared
import getpass
password = getpass.getpass("Enter SSH password: ")
launch_ssh_cloudflared(password=password)
```

- Now you can use the SSH connection to run your own code: edit locally, push to your fork, pull on the machine, and run it there.
```
lm-course/
├── notes/   # Lecturer's notes
└── ttlm/    # Code and utilities
```
A full manuscript with notes is in preparation. For now, you can find slides supporting our discussion in `notes/`.
- Python/torch programming
- Basic linear algebra
- Some prior experience with neural networks (helpful)
A (too short and biased) reading list that can help you get started:
- Transformers:
- Data and Pretraining Pipeline:
- Post-training:
- Inference and Deployment:
- Implement BPE and compare performance with a character-level tokenizer;
- Implement a KV cache and compare performance with naive inference;
- Implement simple RLHF tuning using an LLM-generated preference dataset (e.g., make the model speak like a pirate);
- Implement a forward pass with flash attention for the model.
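For the first exercise, a minimal sketch of BPE training may help you get oriented. Assumptions here (not prescribed by the course): whitespace pre-tokenization and character-level initial symbols; a real tokenizer also needs an encode/decode step, special tokens, and byte-level handling.

```python
from collections import Counter

def bpe_train(text, num_merges):
    """Learn BPE merges: start from characters, then repeatedly merge
    the most frequent adjacent symbol pair into a new symbol."""
    # Each word is a tuple of symbols, weighted by its frequency.
    words = Counter(tuple(w) for w in text.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across all words.
        pair_counts = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pair_counts[(a, b)] += freq
        if not pair_counts:
            break
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite every word, replacing occurrences of the best pair.
        new_words = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_words[tuple(out)] += freq
        words = new_words
    return merges

merges = bpe_train("low low low lower lowest", 2)
print(merges)  # -> [('l', 'o'), ('lo', 'w')]
```

From here, the exercise is to apply the learned merges at encode time and benchmark sequence lengths against the character-level baseline.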
MIT License - use freely!