GitHub - chenxi-Guo/TransGOP

TransGOP: Transformer-based Gaze Object Prediction

This repository is the official implementation of TransGOP, which studies the gaze object prediction task. In this work, we introduce the Transformer into the field of gaze object prediction and propose an end-to-end Transformer-based gaze object prediction method named TransGOP. Specifically, TransGOP uses an off-the-shelf Transformer-based object detector to detect the locations of objects, and feeds the fused features of the head image and scene image into a Transformer-based gaze prediction branch to regress the gaze heatmap. Moreover, to enhance the gaze regressor with informative knowledge from the object detector, we propose an object-to-gaze attention mechanism that lets the queries in the gaze regressor receive the encoded features of the object detector. Finally, to make the whole framework end-to-end trainable, we propose a Gaze Box loss that jointly optimizes the object detector and gaze regressor by enhancing the gaze energy inside the box of the stared object. Extensive experiments on the GOO-Synth and GOO-Real datasets demonstrate that TransGOP achieves state-of-the-art performance on all tracks, i.e., object detection, gaze estimation, and gaze object prediction.
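To make the idea of "gaze energy in the box of the stared object" concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the loss implemented in this repository: the function name `gaze_box_energy_loss`, the box convention, and the normalization by total heatmap energy are all hypothetical choices made for this example.

```python
def gaze_box_energy_loss(heatmap, box):
    """Return 1 - (fraction of heatmap energy that falls inside the box).

    heatmap: 2D list of non-negative floats (rows are y, columns are x).
    box: (x_min, y_min, x_max, y_max) in heatmap pixels, max values exclusive.
    Illustrative formulation only; the repository's loss may differ.
    """
    x_min, y_min, x_max, y_max = box
    total = sum(sum(row) for row in heatmap)
    if total <= 0:
        return 1.0  # no energy anywhere: treat as the worst case
    inside = sum(sum(row[x_min:x_max]) for row in heatmap[y_min:y_max])
    return 1.0 - inside / total


# Toy 4x4 heatmap with all energy in the top-left 2x2 corner.
heatmap = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
loss_on_target = gaze_box_energy_loss(heatmap, (0, 0, 2, 2))   # box covers the peak
loss_off_target = gaze_box_energy_loss(heatmap, (2, 2, 4, 4))  # box misses the peak
```

In this toy setup the loss is lowest when the heatmap energy sits inside the box of the stared object, so minimizing it pushes the gaze regressor and the detected box toward agreement.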

Data Preparation

The GOO dataset contains two subsets: GOO-Synth and GOO-Real.

You can download the GOO-Synth dataset and annotations from Baidu Netdisk:

GOOsynth-data and annotations (code: 166j)

You can download the GOO-Real dataset and annotations from Baidu Netdisk:

GOOreal-data and annotations (code: pfni)

Please ensure the data is organized as follows:

Datasets
├── goosynth
│   ├── annotations
│   │   ├── gop_train.json
│   │   └── gop_val.json
│   ├── val
│   │   ├── 0.png
│   │   ├── 1.png
│   │   └── ...
│   └── train
│       ├── 0.png
│       ├── 1.png
│       └── ...
└── gooreal
    ├── annotations
    │   ├── gop_train.json
    │   └── gop_val.json
    ├── val
    │   ├── 0.png
    │   ├── 1.png
    │   └── ...
    └── train
        ├── 0.png
        ├── 1.png
        └── ...
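
Before training, it can help to confirm that the layout above is in place. The following is a small sketch using only the Python standard library; the helper name `missing_entries` is hypothetical and not part of this repository, and it checks only the directories and annotation files named in the tree, not the individual images.

```python
from pathlib import Path

# Expected entries relative to each subset directory, taken from the tree above.
EXPECTED = ("annotations/gop_train.json", "annotations/gop_val.json", "train", "val")


def missing_entries(datasets_root, subsets=("goosynth", "gooreal")):
    """Return the expected paths (relative to datasets_root) that do not exist."""
    root = Path(datasets_root)
    missing = []
    for subset in subsets:
        for rel in EXPECTED:
            path = root / subset / rel
            if not path.exists():
                missing.append(path.relative_to(root).as_posix())
    return missing
```

For example, `missing_entries("Datasets")` returns an empty list when both subsets are complete, and `missing_entries("Datasets", subsets=("gooreal",))` checks only GOO-Real.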

Environment Preparation

conda env create -n TransGOP -f environment.yaml

Compiling CUDA operators

cd models/dino/ops
python setup.py build install
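
After the build finishes, a quick import check can confirm that the compiled extension is visible to Python. This is a sketch under an assumption: the module name `MultiScaleDeformableAttention` is the one used by the upstream DINO and Deformable DETR operator builds, and it is assumed here rather than confirmed for this repository. If the check reports `False`, look at the extension name in `models/dino/ops/setup.py`.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: extension name as used by upstream DINO / Deformable DETR.
ASSUMED_OP_MODULE = "MultiScaleDeformableAttention"
print(ASSUMED_OP_MODULE, "importable:", is_importable(ASSUMED_OP_MODULE))
```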

Training & Inference

To carry out experiments on the GOO dataset, please follow these commands:

Experiments on GOO-Synth:

bash scripts/TransGOP_train.sh /Datasets/goosynth/

Experiments on GOO-Real:

bash scripts/TransGOP_train.sh /Datasets/gooreal/

Pre-trained Models

You can download the pre-trained models from Baidu Netdisk:

Pre-trained Models on GOO-Synth dataset (code: pk58)

Pre-trained Models on GOO-Real dataset (code: e570)

Get Results

Test on GOO-Synth:

bash scripts/TransGOP_eval.sh /Datasets/goosynth/ /path/to/your/checkpoint

Test on GOO-Real:

bash scripts/TransGOP_eval.sh /Datasets/gooreal/ /path/to/your/checkpoint

Results

Our model achieves the following performance on the GOO-Synth dataset:

AUC	Dist.	Ang.	AP	AP50	AP75	Gaze object prediction mSoC (%)
0.963	0.079	13.30	87.6	99.0	97.3	92.8
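
For readers unfamiliar with the Dist. and Ang. columns, the sketch below shows how these two gaze estimation metrics are commonly defined in the gaze-following literature. It assumes Dist. is the L2 distance between predicted and ground-truth gaze points in normalized image coordinates, and Ang. is the angle in degrees between the two gaze vectors drawn from the head position. The function names are hypothetical and this is not the repository's evaluation code.

```python
import math


def l2_distance(pred, target):
    """Euclidean distance between two (x, y) points in normalized coordinates."""
    return math.hypot(pred[0] - target[0], pred[1] - target[1])


def angular_error_deg(head, pred, target):
    """Angle in degrees between the head->pred and head->target vectors."""
    v_pred = (pred[0] - head[0], pred[1] - head[1])
    v_gt = (target[0] - head[0], target[1] - head[1])
    norm = math.hypot(*v_pred) * math.hypot(*v_gt)
    if norm == 0:
        return 0.0  # angle is undefined for a zero-length vector; 0.0 is this sketch's convention
    cos = (v_pred[0] * v_gt[0] + v_pred[1] * v_gt[1]) / norm
    cos = max(-1.0, min(1.0, cos))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos))


# Toy example: head at the image center, prediction and target at right angles.
head, pred, target = (0.5, 0.5), (1.0, 0.5), (0.5, 1.0)
dist = l2_distance(pred, target)
ang = angular_error_deg(head, pred, target)
```

Lower is better for both metrics, while higher is better for AUC, the AP columns, and mSoC.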

