Dataset and Code for the paper:
Unveiling the Power of Visible-Thermal Video Object Segmentation [paper], IEEE Transactions on Circuits and Systems for Video Technology, 2023.
If you find our work useful for your research, please consider citing the paper:
@article{yang2023unveiling,
  title={Unveiling the Power of Visible-Thermal Video Object Segmentation},
  author={Yang, Jinyu and Gao, Mingqi and Cong, Runmin and Wang, Chengjie and Zheng, Feng and Leonardis, Ale{\v{s}}},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  year={2023},
  publisher={IEEE}
}
VisT300
├── train
│   ├── RGBImages
│   │   └── video1
│   │       ├── 00000.jpg
│   │       ├── 00005.jpg
│   │       ├── xxxxx.jpg
│   │       └── ...
│   ├── ThermalImages
│   │   └── video1
│   │       ├── 00000.jpg
│   │       ├── 00005.jpg
│   │       ├── xxxxx.jpg
│   │       └── ...
│   └── Annotations
│       └── video1
│           ├── 00000.png
│           ├── 00005.png
│           ├── xxxxx.png
│           └── ...
└── test (same organization as the train set)
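Given this layout, RGB frames, thermal frames, and annotation masks for the same video share file stems (`00000`, `00005`, ...) across the three subfolders. As a minimal sketch of how the frames can be paired for a custom loader (the function name `collect_frames` is hypothetical, not part of this repo):

```python
from pathlib import Path

def collect_frames(split_root):
    """Return {video: [(rgb_path, thermal_path, annotation_path), ...]}
    for one VisT300 split (train or test), pairing frames by file stem."""
    root = Path(split_root)
    samples = {}
    for rgb_dir in sorted((root / "RGBImages").iterdir()):
        video = rgb_dir.name
        triplets = []
        for rgb in sorted(rgb_dir.glob("*.jpg")):
            stem = rgb.stem  # e.g. "00000"
            thermal = root / "ThermalImages" / video / f"{stem}.jpg"
            annotation = root / "Annotations" / video / f"{stem}.png"
            triplets.append((rgb, thermal, annotation))
        samples[video] = triplets
    return samples
```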
PyTorch implementation of VTiNet. We tested the code in the following environment; other versions may also be compatible: Python 3.9, PyTorch 1.10.1, CUDA 11.3.
- Install
pip install -r requirements.txt
- Train
torchrun --master_port 10010 --nproc_per_node=2 train.py --exp_id vist300 --rgbt_root [path to VisT300/train] --save_path [path to save checkpoints] --load_network [path to pretrained xmem]
- Test
python test.py --model [path to vtinet checkpoint] --rgbt_path [path to VisT300/test] --save_path [path to results]
- Evaluate
python eval.py -g [path to VisT300/test/Annotations] -r [path to results]
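The evaluation compares predicted masks against the ground-truth annotations. As a rough illustration of the region-similarity (J) component commonly used in VOS benchmarks, here is a minimal per-frame mask IoU sketch (a simplified stand-in, not the repo's `eval.py`):

```python
import numpy as np

def mask_iou(pred, gt):
    """Region similarity J: intersection-over-union of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, gt).sum() / union
```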