This document explains how to use the aic4-eval.py script to evaluate image quality.
- What Does This Tool Do?
- Web Application
- Prerequisites
- Environment Setup
- Understanding the Feature Extractor
- Dataset Preparation
- Running the Script
- Understanding the Output
- Advanced Options
- Troubleshooting
This repository provides two evaluation scripts for image quality assessment:
The main evaluation script that uses variance-guided feature selection. It evaluates the quality of distorted images by comparing them to their original (reference) versions. The higher the score, the better the quality of the distorted image.
An advanced evaluation script that uses a weighted patch-based approach. It divides feature maps into patches and computes spatially-weighted quality scores, providing more detailed spatial quality assessment.
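To illustrate the patch idea only (this toy sketch is not code from patching-eval.py), dividing a feature map into non-overlapping patches might look like:

```python
def split_into_patches(fmap, patch=2):
    """Toy sketch: split a 2D feature map (a list of rows) into
    non-overlapping patch x patch blocks, each of which would be
    scored separately in a patch-based approach."""
    h, w = len(fmap), len(fmap[0])
    return [
        [row[c:c + patch] for row in fmap[r:r + patch]]
        for r in range(0, h, patch)
        for c in range(0, w, patch)
    ]

fmap = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(len(split_into_patches(fmap)))  # a 4x4 map yields 4 patches of 2x2
```

Per-patch scores can then be combined with spatial weights, which is the intuition behind the "spatially-weighted" description above.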
Both tools can work in two ways:
- Single mode: Compare one reference image to one distorted image
- Dataset mode: Compare many reference images to their corresponding distorted versions all at once
You can try the single image pair evaluation mode directly in your browser without any installation:
🌐 Live Demo: https://idfiqa.ivp-lab.ir
The web app provides an easy-to-use interface for uploading a reference image and a distorted image to get instant quality scores.
Before you can use this tool, you need:
- A computer with:
  - Linux, macOS, or Windows operating system
  - Optional but recommended: NVIDIA GPU with CUDA support (makes processing much faster)
- Python installed:
  - Python version 3.10 or higher
  - You can check by running `python --version` or `python3 --version`
- Basic command-line knowledge:
  - How to open a terminal/command prompt
  - How to navigate folders using the `cd` command
  - How to run Python scripts
Follow these steps carefully to set up your environment. Each step is important!
On Linux/Ubuntu:

```
sudo apt update
sudo apt install python3 python3-pip python3-venv
```

On macOS:

```
# Install Homebrew first if you don't have it (visit https://brew.sh)
brew install python3
```

On Windows: Download and install Python from python.org
A virtual environment is like a separate workspace for this project. It keeps all the required packages isolated from other Python projects on your computer.
Navigate to the project directory:

```
cd /path/to/variance_guided_iqa
```

Create the virtual environment:

```
python3 -m venv venv
```

This creates a folder called `venv` in your project directory.

Activate it. On Linux/macOS:

```
source venv/bin/activate
```

On Windows:

```
venv\Scripts\activate
```

After activation, you should see `(venv)` at the beginning of your command-line prompt. This means you're now working inside the virtual environment.
The script needs several Python packages to work. All of them are listed in the requirements.txt file.
Install all packages at once:

```
pip install -r requirements.txt
```

This will download and install:
- PyTorch: The deep learning framework that powers the model
- torchvision: Provides pre-trained models and image processing tools
- Pillow: For loading and handling images
- tqdm: Shows progress bars so you know how long processing will take
- And several other supporting packages
Note: This installation might take 5-15 minutes depending on your internet speed. PyTorch is a large package.
Check that PyTorch is installed correctly:

```
python -c "import torch; print(f'PyTorch version: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}')"
```

Expected output:

```
PyTorch version: 2.8.0 (or similar)
CUDA available: True (if you have an NVIDIA GPU) or False (if you don't)
```
The feature extractor is a pre-trained neural network that the script uses to analyze images. Don't worry - you don't need to download this manually! The script handles it automatically.
When you run the script for the first time, PyTorch will:

- Download the pre-trained model from the internet
  - The script uses EfficientNet-B4 by default
  - The download is about 70-80 MB
  - It is saved in your home directory under `.cache/torch/hub/checkpoints/`
- Save it for future use
  - You only download it once
  - Future runs will use the cached version
  - This makes subsequent runs much faster
On Linux/macOS:
~/.cache/torch/hub/checkpoints/efficientnet_b4-*.pth
On Windows:
C:\Users\YourUsername\.cache\torch\hub\checkpoints\efficientnet_b4-*.pth
- First run: Takes longer because the model needs to be downloaded
- Subsequent runs: Much faster since the model is already cached
You don't need to do anything special - just make sure you have an internet connection the first time you run the script!
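If you want to confirm the cache state yourself, a short stdlib check works. This is a sketch; the filename pattern is an assumption based on the paths listed above:

```python
from pathlib import Path

def cached_checkpoints(pattern: str = "efficientnet_b4-*.pth"):
    """Return any cached torchvision checkpoint files matching `pattern`."""
    cache_dir = Path.home() / ".cache" / "torch" / "hub" / "checkpoints"
    return sorted(cache_dir.glob(pattern)) if cache_dir.is_dir() else []

if cached_checkpoints():
    print("Model cached: the next run will skip the download.")
else:
    print("No cached model: the first run will download it (~70-80 MB).")
```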
If you want to use the dataset mode (evaluating AIC4-Evaluation dataset), follow these steps:
Download the evaluation dataset from:
https://drive.google.com/drive/folders/1TlmLVFTIv7bobxInKnBrd8Nmu4T0AalZ?usp=sharing
This will be an `aic4-evaluation-png.zip` file containing reference and distorted images.
Extract the ZIP file to a location on your computer. For example:

```
unzip aic4-evaluation-png.zip -d /path/to/your/datasets/
```

Both scripts have two modes: single (for comparing one pair of images) and dataset (for comparing many images).
Use this mode when you want to quickly check the quality of one distorted image compared to its reference.
Basic syntax:

```
python aic4-eval.py --mode single --ref-img /path/to/reference.png --dist-img /path/to/distorted.png
```

Example:

```
python aic4-eval.py --mode single --ref-img ./images/original.png --dist-img ./images/compressed.png
```

What happens:
- The script loads both images
- Processes them through the quality evaluation model
- Prints the quality score to the terminal
Example output:

```
Quality Score: 0.847523
Reference: ./images/original.png
Distorted: ./images/compressed.png
```
Interpreting the score:
- 0.9 - 1.0: Excellent quality, barely any difference
- 0.8 - 0.9: Good quality, small differences
- 0.7 - 0.8: Moderate quality, noticeable differences
- Below 0.7: Poor quality, significant differences
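The bands above can be wrapped in a tiny helper if you script against the tool's output. This is a hypothetical convenience function, not part of aic4-eval.py:

```python
def interpret_score(score: float) -> str:
    """Map a quality score in [0, 1] to the rough quality bands above."""
    if score >= 0.9:
        return "Excellent quality, barely any difference"
    if score >= 0.8:
        return "Good quality, small differences"
    if score >= 0.7:
        return "Moderate quality, noticeable differences"
    return "Poor quality, significant differences"

print(interpret_score(0.847523))  # the example score above falls in the "Good" band
```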
Use this mode when you have many images to evaluate and want results saved to a file.
Basic syntax:

```
python aic4-eval.py --mode dataset --root-dir /path/to/dataset-folder --output results.csv
```

Example:

```
python aic4-eval.py --mode dataset --root-dir ./dataset-folder --output results.csv
```

What happens:
- The script finds all reference images in the `source/` folder
- For each reference image, it finds all corresponding distorted images
- Calculates quality score for each pair
- Saves results to a CSV file
- Shows a progress bar so you can track completion
Example output during processing:

```
Evaluating images: 100%|████████████████| 5600/5600 [05:23<00:00, 4.64it/s]
Results saved to results.csv
```
The output CSV file contains:

```
ref_img_name,dis_img_name,quality_score,jnd_mapped
src_001.png,src_001.png-01-01.png,0.9993844032287598,0.007380344183170351
src_001.png,src_001.png-01-02.png,0.9991859793663024,0.009756329974884181
src_001.png,src_001.png-01-03.png,0.9991020560264589,0.010760827245163362
src_001.png,src_001.png-01-04.png,0.9989848732948304,0.012162990596872536
src_001.png,src_001.png-01-05.png,0.9986413717269896,0.016270358697388243
...
```

By default, the script uses 2 worker processes to load images. You can adjust this based on your computer's CPU:
For faster computers (4+ CPU cores):

```
python aic4-eval.py --mode dataset --root-dir ./dataset-folder --num-workers 4
```

For slower computers (2 CPU cores):

```
python aic4-eval.py --mode dataset --root-dir ./dataset-folder --num-workers 1
```

Note: more workers mean faster image loading but higher RAM use. Start with 2 and increase if your computer can handle it.
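Once a dataset run has produced a CSV, the standard library is enough to summarize it. This sketch assumes the column names shown in the sample output (`ref_img_name`, `quality_score`):

```python
import csv
from statistics import mean

def mean_score_per_reference(csv_path):
    """Group result rows by reference image and average their quality scores."""
    scores = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            scores.setdefault(row["ref_img_name"], []).append(
                float(row["quality_score"])
            )
    return {ref: mean(vals) for ref, vals in scores.items()}

# Example usage: print(mean_score_per_reference("results.csv"))
```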
The patching-eval.py script uses the same modes and arguments as aic4-eval.py, with an additional option for patch size.
Basic syntax:

```
python patching-eval.py --mode single --ref-img /path/to/reference.png --dist-img /path/to/distorted.png
```

Example:

```
python patching-eval.py --mode single --ref-img ./images/original.png --dist-img ./images/compressed.png
```

Example output:

```
WeightedPatchIDFIQA Score: 0.847523
Reference: ./images/original.png
Distorted: ./images/compressed.png
```
Basic syntax:

```
python patching-eval.py --mode dataset --root-dir /path/to/dataset-folder --output results.csv
```

Example:

```
python patching-eval.py --mode dataset --root-dir ./dataset-folder --output patching-results.csv
```

The patching script includes an additional --patch-size parameter:

```
python patching-eval.py --mode single \
    --ref-img ref.png \
    --dist-img dist.png \
    --patch-size 8   # Size of patches in feature space (default: 8)
```

When you run single mode, you get a simple text output:
```
Quality Score: 0.847523
Reference: ./images/original.png
Distorted: ./images/compressed.png
```
- Quality Score: The quality metric (0.0 to 1.0)
- Reference: Path to the original image you provided
- Distorted: Path to the distorted image you provided
When you run dataset mode, you get a CSV file that looks like this:

```
ref_img_name,dis_img_name,quality_score,jnd_mapped
src_001.png,src_001.png-01-01.png,0.9993844032287598,0.007380344183170351
src_001.png,src_001.png-01-02.png,0.9991859793663024,0.009756329974884181
src_001.png,src_001.png-01-03.png,0.9991020560264589,0.010760827245163362
src_001.png,src_001.png-01-04.png,0.9989848732948304,0.012162990596872536
src_001.png,src_001.png-01-05.png,0.9986413717269896,0.016270358697388243
...
```

Columns explained:
- ref_img_name: The name of the reference (original) image
- dis_img_name: The name of the distorted image being compared
- quality_score: The quality score for this pair (0-1)
- jnd_mapped: the quality score mapped to JND units, computed as `a * max(0, b - x)` with `b = 1.0` and `a = 15.0`, where `x` is the quality score

| Argument | Description | Default | Required |
|---|---|---|---|
| `--mode` | Evaluation mode: `single` or `dataset` | `single` | No |
| `--ref-img` | Path to reference image (single mode only) | None | Yes (for single mode) |
| `--dist-img` | Path to distorted image (single mode only) | None | Yes (for single mode) |
| `--root-dir` | Root directory of dataset (dataset mode only) | (none - must specify) | Yes (for dataset mode) |
| `--output` | Output CSV file path (dataset mode only) | `results.csv` | No |
| `--num-workers` | Number of worker processes for data loading | `2` | No |
| `--percent-features` | Percentage of feature maps to keep (0.0 to 1.0) | `0.7` | No |
| Argument | Description | Default | Required |
|---|---|---|---|
| `--patch-size` | Size of patches in feature space | `8` | No |
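As a quick sanity check, the JND mapping described earlier (`a * max(0, b - x)` with `a = 15.0` and `b = 1.0`) can be computed in a few lines. This is just the stated formula, not code from either script:

```python
def jnd_mapped(quality_score: float, a: float = 15.0, b: float = 1.0) -> float:
    """Map a quality score x in [0, 1] to JND units via a * max(0, b - x)."""
    return a * max(0.0, b - quality_score)

print(jnd_mapped(1.0))  # a perfect score maps to 0.0 JND
```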
The --percent-features option controls an internal parameter of the quality evaluation algorithm. The default value of 0.7 (70%) has been carefully chosen and works well for most image quality evaluation tasks.
Default behavior:

```
python aic4-eval.py --mode single --ref-img ref.png --dist-img dist.png
# Uses --percent-features 0.7 automatically
```

This parameter is included for completeness and research purposes, but most users should not modify it. The default value provides a good balance between accuracy and computational efficiency.
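To give intuition for what "percentage of feature maps to keep" means, here is a toy, self-contained sketch of variance-based selection. It is illustrative only and is not the script's actual implementation:

```python
def keep_top_variance(feature_maps, percent=0.7):
    """Toy sketch: keep the `percent` fraction of feature maps with the
    highest variance (mirrors the idea behind --percent-features)."""
    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    k = max(1, round(percent * len(feature_maps)))
    # Higher-variance maps are assumed to carry more useful signal.
    return sorted(feature_maps, key=variance, reverse=True)[:k]

maps = [[0.0, 0.0, 0.0], [0.0, 5.0, 10.0], [1.0, 1.2, 0.8]]
print(len(keep_top_variance(maps, percent=0.7)))  # keeps 2 of the 3 maps
```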
Solution: Try using python3 instead:

```
python3 aic4-eval.py --mode single --ref-img ref.png --dist-img dist.png
```

Cause: You either didn't activate the virtual environment or didn't install the requirements.

Solution:

```
# Activate the virtual environment
source venv/bin/activate    # On Linux/macOS
# or
venv\Scripts\activate       # On Windows

# Install requirements
pip install -r requirements.txt
```

If you encounter an error not listed here:
- Read the error message carefully - it usually tells you what's wrong
- Check your Python version: `python --version` (should be 3.10+)
- Verify all packages are installed: `pip list | grep torch`
- Try the single mode first - it's simpler and helps isolate issues
- Check file permissions - make sure you can read the input files and write to the output location
Use this checklist to make sure you've completed all setup steps:
- Python 3.10+ installed
- Virtual environment created (`python3 -m venv venv`)
- Virtual environment activated (`source venv/bin/activate`)
- Requirements installed (`pip install -r requirements.txt`)
- Dataset downloaded and extracted (for dataset mode)
- Ran test with single mode first
- Internet connection available (for first run model download)
Once all items are checked, you're ready to use the tool!
```
# 1. Activate environment
source venv/bin/activate

# 2. Test with one pair
python aic4-eval.py --mode single \
    --ref-img ./test_images/original.png \
    --dist-img ./test_images/compressed.png
# Output: IDFIQA Score: 0.847523
```

```
# 1. Activate environment
source venv/bin/activate

# 2. Test with one pair using patching approach
python patching-eval.py --mode single \
    --ref-img ./test_images/original.png \
    --dist-img ./test_images/compressed.png
# Output: WeightedPatchIDFIQA Score: 0.847523
```

```
# 1. Activate environment
source venv/bin/activate

# 2. Run full dataset evaluation (choose one)
python aic4-eval.py --mode dataset \
    --root-dir ./dataset-folder \
    --output results.csv \
    --num-workers 4

# Or use the patching approach
python patching-eval.py --mode dataset \
    --root-dir ./dataset-folder \
    --output patching-results.csv \
    --num-workers 4

# 3. Check results
head -n 10 results.csv

# 4. Deactivate when done
deactivate
```