
Hybrid DDQN + A* agent for autonomous search and rescue in partially observable environments.


ResQ: Hybrid Double DQN & A* Agent for Disaster Navigation

🧭 Overview

ResQ is an autonomous agent designed to solve the search-and-rescue problem in partially observable disaster environments. It implements a hybrid architecture combining:

  • Double DQN: Handles high-level decision-making. The agent processes limited local observations ($3 \times 3$ grid) to decide between step-by-step exploration and triggering the A* macro-action.

  • A* Macro-action: Invoked by the agent to rapidly secure victims detected within the local observation grid and transport them to the drop-off zone along optimal paths.
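At its core, the A* macro-action is shortest-path planning on the grid. A minimal grid A* with a Manhattan-distance heuristic might look like the sketch below; this is illustrative only, not the actual contents of `a_star.py`:

```python
import heapq

def a_star(grid, start, goal):
    """Shortest path on a 2D grid (0 = free, 1 = obstacle).

    Uses a Manhattan-distance heuristic, which is admissible for
    4-connected movement. Illustrative sketch, not the repo's a_star.py.
    """
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_heap = [(h(start), 0, start, [start])]  # (f, g, node, path)
    seen = set()
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                heapq.heappush(
                    open_heap,
                    (g + 1 + h((nr, nc)), g + 1, (nr, nc), path + [(nr, nc)]),
                )
    return None  # goal unreachable
```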

The ResQ agent is trained and evaluated against two realistic Search-and-Rescue baselines:

  • The Lawnmower Baseline: A systematic, non-adaptive strategy that executes a snake-like sweep pattern to exhaustively cover the map row-by-row.

  • The Greedy Observation Baseline: A heuristic agent that mimics human intuition in low-visibility conditions. It prioritizes immediate rescue upon victim detection and otherwise navigates to least-visited adjacent cells to explore new areas.
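The least-visited-neighbour rule at the heart of the Greedy Observation baseline can be sketched as follows (function and variable names are illustrative, not the repo's API):

```python
import random

def greedy_step(pos, visit_counts, grid_shape):
    """One exploration step of the greedy-observation heuristic:
    move to the least-visited adjacent cell, breaking ties at random.
    Illustrative sketch; visit_counts maps (row, col) -> visit count."""
    r, c = pos
    rows, cols = grid_shape
    neighbours = [
        (r + dr, c + dc)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if 0 <= r + dr < rows and 0 <= c + dc < cols
    ]
    least = min(visit_counts.get(n, 0) for n in neighbours)
    return random.choice(
        [n for n in neighbours if visit_counts.get(n, 0) == least]
    )
```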

This project evaluates whether reinforcement learning can outperform hand-crafted heuristic strategies in a partially observable environment.
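For reference, what distinguishes Double DQN from vanilla DQN is the bootstrap target: the online network *selects* the next action while the target network *evaluates* it, which reduces Q-value overestimation. A NumPy sketch (array names are illustrative):

```python
import numpy as np

def double_dqn_target(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Double DQN bootstrap target for a batch of transitions.

    next_q_online / next_q_target: (batch, n_actions) Q-value arrays
    from the online and target networks. Illustrative sketch.
    """
    best_actions = np.argmax(next_q_online, axis=1)                      # selection
    next_values = next_q_target[np.arange(len(rewards)), best_actions]   # evaluation
    return rewards + gamma * (1.0 - dones) * next_values
```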


📁 Repository Structure

resq-agent/
│── baseline/
│     ├── lawnmower.py
│     ├── greedy_observation.py
│── environment/
│     └── ResQEnv/...
│── images/
│     └── demo.gif
│── model/
│── a_star.py
│── callback.py
│── config.py
│── evaluate.py
│── train.py
│── test.py
│── requirements.txt
│── .gitignore

🏆 Results & Benchmarks

We evaluated the ResQ Agent against two baselines over 2000 randomized episodes/maps.

1. Performance Summary

The table below compares the navigational efficiency and reliability of each strategy.

| Strategy | Avg. Steps (lower = faster) | Performance Gap | Stuck Rate |
| --- | --- | --- | --- |
| Lawnmower Baseline | 322.95 | Lower bound | 0.0% (guaranteed) |
| ResQ Agent | 333.78 | +3.3% (near-optimal) | 0.0% (n = 2000 maps) |
| Greedy Baseline | 350.10 | +8.4% | >0% |

2. Visualization

*(Benchmark bar chart comparing average steps per strategy.)*

🚀 Installation & Environment Setup

1. Clone the repository

git clone https://github.com/johnnyau19/resq-agent
cd resq-agent

2. Create & activate a virtual environment

python3 -m venv venv

# macOS/Linux
source venv/bin/activate 

# Windows:
venv\Scripts\activate

3. Install dependencies

pip install -r requirements.txt

🧠 Training the DQN Agent

Training is controlled by train.py, using Stable-Baselines3 DQN with custom callbacks and TensorBoard logging.

Run training

python train.py

You will be prompted to enter the desired number of training timesteps when the script starts.

This will:

  • Train a Double DQN agent
  • Log learning curves to ./logs/tensorboard/
  • Save periodic checkpoints to ./logs/checkpoints/
  • Use our BaselineCompareCallback, which:
    • Evaluates the agent against the Lawnmower baseline
    • Automatically saves the best-performing model to /models/best_overall_model/

📊 Evaluation (DQN vs Baselines)

Run full evaluation:

python evaluate.py

This will:

  • Run the trained DQN model alongside the Lawnmower and Greedy baselines
  • Evaluate performance over 2000 randomized episodes (unique seed per run)
  • Display real-time Matplotlib performance curves
  • Print summary statistics
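The per-strategy averages reported above reduce to a seeded rollout loop. A minimal sketch, where `run_episode` is a hypothetical stand-in for rolling one policy out on one randomized map and returning its step count:

```python
def average_steps(run_episode, n_episodes=2000):
    """Average step count over seeded episodes.

    run_episode(seed) -> int is a hypothetical stand-in for a single
    policy rollout; each seed generates a unique randomized map, so all
    strategies are compared on identical episode sets.
    """
    total = sum(run_episode(seed) for seed in range(n_episodes))
    return total / n_episodes
```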

Example output:

DQN average steps: XXX
Lawnmower average steps: XXX
Greedy average steps: XXX

🧑‍💻 Tests

python test.py

Runs simple validation checks for environment and agent logic.
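One representative check of the kind test.py might perform is verifying that the environment emits the expected 3×3 local observation. `DummyEnv` and the check below are illustrative stand-ins, not the repo's actual test code:

```python
import numpy as np

class DummyEnv:
    """Hypothetical stand-in exposing the observation shape the agent expects."""
    def reset(self):
        return np.zeros((3, 3))

def check_observation_shape(env):
    """Sanity check: reset() must yield the 3x3 local observation window."""
    obs = env.reset()
    return obs.shape == (3, 3)
```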

