COCO-CN

COCO-CN is a bilingual image description dataset enriching MS-COCO with manually written Chinese sentences and tags. The new dataset can be used for multiple tasks including image tagging, captioning and retrieval, all in a cross-lingual setting.

Chinese sentences	COCO-CN train	COCO-CN val	COCO-CN test
human written	✅	✅	✅
human translation	❌	❌	✅
machine translation (Baidu)	✅	✅	✅
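
The availability table above can be encoded as a small lookup for scripts that need to pick a sentence source per split. This is only an illustrative sketch: the `AVAILABILITY` dictionary and the `available_sources` helper are hypothetical names, not part of the released COCO-CN code.

```python
# Hypothetical encoding of the availability table above.
# The names here are illustrative and do not come from the COCO-CN code base.
AVAILABILITY = {
    "human written": {"train": True, "val": True, "test": True},
    "human translation": {"train": False, "val": False, "test": True},
    "machine translation (Baidu)": {"train": True, "val": True, "test": True},
}


def available_sources(split):
    """Return the Chinese sentence sources provided for a COCO-CN split."""
    if split not in ("train", "val", "test"):
        raise ValueError(f"unknown split: {split}")
    return [source for source, splits in AVAILABILITY.items() if splits[split]]


for split in ("train", "val", "test"):
    print(split, available_sources(split))
```

A lookup like this makes it explicit that human translations exist only for the test split, so a training pipeline should not expect them.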

Progress

version 201805: 20,341 images (training / validation / test: 18,341 / 1,000 / 1,000), associated with 22,218 manually written Chinese sentences and 5,000 manually translated sentences. Data is freely available at Hugging Face.
Precomputed image features: ResNeXt-101
COCO-CN-Results-Viewer: A lightweight tool to inspect the results of different image captioning systems on the COCO-CN test set, developed by Emiel van Miltenburg at Tilburg University.
NUS-WIDE100: An extra test set.

2018-12-16: Code for cross-lingual image tagging and captioning released.
2018-12-20: Code for cross-lingual image retrieval and our image annotation system released.
2019-01-13: The COCO-CN paper was accepted as a regular paper by the T-MM journal (IEEE Transactions on Multimedia).
2021-02-03: Release of new annotations (4,573 images and 4,712 manually written sentences) collected via our iCap interactive image captioning system. The images have no overlap with the previously released dataset.
2025-02-12: Given the growing number of dataset applications, we have released COCO-CN on Hugging Face.
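
As a quick sanity check on the version 201805 figures listed in this section, the sketch below confirms that the three split sizes add up to the stated image total, then derives the split proportions and the average number of manually written sentences per image. The constant and function names are illustrative assumptions; only the numbers come from the text above.

```python
# Figures quoted from the version 201805 entry; names are illustrative only.
SPLITS = {"train": 18_341, "val": 1_000, "test": 1_000}
TOTAL_IMAGES = 20_341
WRITTEN_SENTENCES = 22_218


def split_fractions(splits):
    """Return each split's share of the total number of images."""
    total = sum(splits.values())
    return {name: size / total for name, size in splits.items()}


# The three splits should account for every image in the release.
assert sum(SPLITS.values()) == TOTAL_IMAGES

fractions = split_fractions(SPLITS)
sentences_per_image = WRITTEN_SENTENCES / TOTAL_IMAGES

for name, share in fractions.items():
    print(f"{name}: {share:.1%} of images")
print(f"manually written sentences per image: {sentences_per_image:.2f}")
```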

Citation

If you find COCO-CN useful, please consider citing the following paper:

Xirong Li, Chaoxi Xu, Xiaoxu Wang, Weiyu Lan, Zhengxiong Jia, Gang Yang, Jieping Xu, COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval, IEEE Transactions on Multimedia, Volume 21, Number 9, pages 2347-2360, 2019
