COCO-CN is a bilingual image description dataset enriching MS-COCO with manually written Chinese sentences and tags. The new dataset can be used for multiple tasks including image tagging, captioning and retrieval, all in a cross-lingual setting.
Chinese sentences | COCO-CN train | COCO-CN val | COCO-CN test |
---|---|---|---|
human written | ✅ | ✅ | ✅ |
human translation | ❌ | ❌ | ✅ |
machine translation (baidu) | ✅ | ✅ | ✅ |
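
The availability table above can also be written down as a small lookup, which may be handy when deciding which sentence types to use for a given split. This is only a minimal sketch: the dictionary layout, the short split keys (`train`, `val`, `test`) and the helper name `sentence_types` are illustrative assumptions and not part of the released COCO-CN code.

```python
# Availability of Chinese sentence types per COCO-CN split,
# transcribed from the table above. The layout and names here are
# hypothetical, chosen for illustration only.
AVAILABILITY = {
    "human written": {"train": True, "val": True, "test": True},
    "human translation": {"train": False, "val": False, "test": True},
    "machine translation (baidu)": {"train": True, "val": True, "test": True},
}


def sentence_types(split):
    """Return the sentence types that the table marks as available for a split."""
    return [kind for kind, splits in AVAILABILITY.items() if splits[split]]


if __name__ == "__main__":
    for split in ("train", "val", "test"):
        print(split, sentence_types(split))
```

According to the table, human translations exist only for the test split, so the lookup returns all three sentence types for `test` and two for `train` and `val`.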
- version 201805: 20,341 images (training / validation / test: 18,341 / 1,000 / 1,000), associated with 22,218 manually written Chinese sentences and 5,000 manually translated sentences. The data is freely available on Hugging Face.
- Precomputed image features: ResNext-101
- COCO-CN-Results-Viewer: A lightweight tool to inspect the results of different image captioning systems on the COCO-CN test set, developed by Emiel van Miltenburg at Tilburg University.
- NUS-WIDE100: An extra test set.
- 2018-12-16: Code for cross-lingual image tagging and captioning released.
- 2018-12-20: Code for cross-lingual image retrieval and our image annotation system released.
- 2019-01-13: The COCO-CN paper accepted as a regular paper by the T-MM journal.
- 2021-02-03: Release of new annotations (4,573 images and 4,712 manually written sentences) collected via our iCap interactive image captioning system. The images have no overlap with the previously released dataset.
- 2025-02-12: Given the growing number of dataset applications, we have released COCO-CN on Hugging Face.
If you find COCO-CN useful, please consider citing the following paper:
- Xirong Li, Chaoxi Xu, Xiaoxu Wang, Weiyu Lan, Zhengxiong Jia, Gang Yang, Jieping Xu, COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval, IEEE Transactions on Multimedia, Volume 21, Number 9, pages 2347-2360, 2019