Introduction

We provide extra annotations beyond the original ImageNet classification dataset (ILSVRC CLS 2012). Compared with the 2012 dataset (1000 classes), the new EIC dataset contains nearly 3000 classes, where the added categories are mostly fine-grained sub-classes of the original 1000. The source images come directly from the official ImageNet website; we annotate these images, each of which is presumed to contain a single class but may contain multiple instances of it. Note that some of these images are already labelled, and for those we simply use the annotations provided by the official source. The EIC dataset is introduced in the paper:

Do We Really Need More Training Data for Object Localization [abstract] [paper]
Hongyang Li, Yu Liu, Xin Zhang*, Zhecheng An, Jingjing Wang, Yibo Chen and Jihong Tong
IEEE International Conference on Image Processing (ICIP), 2017

The key factor for training a good neural network lies in both model capacity and large-scale training data. 
As more datasets are available nowadays, one may wonder whether the success of deep learning descends from
data augmentation only. In this paper, we propose a new dataset, namely, Extended ImageNet Classification (EIC)
dataset based on the original ILSVRC CLS 2012 set to investigate if more training data is a crucial step. We address
the problem of object localization where given an image, some boxes (also called anchors) are generated to localize
multiple instances. Different from previous work to place all anchors at the last layer, we split boxes of different
sizes at various resolutions in the network, since small anchors are more prone to be identified at larger spatial
location in the shallow layers. Inspired by the hourglass work, we apply a conv-deconv network architecture to
generate object proposals. The motivation is to fully leverage high-level summarized semantics and to utilize their up
sampling version to help guide local details in the low-level maps. Experimental results demonstrate the effectiveness
of such a design. Based on the newly proposed dataset, we find more data could enhance the average recall, but a more
balanced data distribution among categories could obtain better results at the cost of fewer training samples.

* X. Zhang is the corresponding author.
For enquiries about the paper and dataset, please email him at:
zhangxin15@mails.tsinghua.edu.cn


The Extended ImageNet Classification dataset at a glance

Grab and Go

Disclaimer

The source images and partial annotations belong to the official ImageNet challenge; please refer to the copyright and terms of use on their website.
For our part (annotations), EIC can only be used for non-commercial and research purposes. If you use this dataset, please cite:

@inproceedings{li_icip17,
  title={Do We Really Need More Training Data for Object Localization},
  author={Li, Hongyang and Liu, Yu and Zhang, Xin and An, Zhecheng and Wang, Jingjing and Chen, Yibo and Tong, Jihong},
  booktitle={Proceedings of the IEEE International Conference on Image Processing (ICIP)},
  year={2017}
}


EIC Annotation Download

[imagenet_cls_3k_EIC_release_icip17.tar.gz] (Dropbox link, 39.32MB)
All annotations are stored in the .mat file. Please refer to readme.txt for details.
There are 2686 classes; each image has at least one instance that belongs to its class.
Bounding box labels are stored as an [n x 4] array, where n is the number of instances; each row is [x1, y1, x2, y2], with coordinates indexed from 1 rather than 0.
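
Since the coordinates start from 1, tools that expect 0-based pixel indices need a shift of one. Below is a minimal sketch of that conversion in plain Python. It assumes the boxes have already been read out of the .mat file (for instance with scipy.io.loadmat) into a list of [x1, y1, x2, y2] rows. The function name to_zero_based and the sample boxes are hypothetical and made up for illustration; the actual variable names in the .mat file are described in readme.txt and are not assumed here.

```python
from typing import List, Sequence


def to_zero_based(boxes: Sequence[Sequence[float]]) -> List[List[float]]:
    """Shift 1-based [x1, y1, x2, y2] rows to 0-based coordinates.

    Hypothetical helper, not part of the EIC release.
    """
    converted = []
    for row in boxes:
        if len(row) != 4:
            raise ValueError(f"expected 4 coordinates per box, got {len(row)}")
        x1, y1, x2, y2 = row
        if min(x1, y1, x2, y2) < 1:
            # A value below 1 suggests the box is not 1-based to begin with.
            raise ValueError(f"box {list(row)} is not 1-based")
        converted.append([x1 - 1, y1 - 1, x2 - 1, y2 - 1])
    return converted


# Made-up boxes for one image with n = 2 instances (not real EIC annotations).
example_boxes = [[1, 1, 100, 50], [20, 30, 60, 90]]
zero_based = to_zero_based(example_boxes)
```

An image without any boxes simply yields an empty list. If the annotations are kept as a NumPy array instead, subtracting 1 from the whole array achieves the same shift.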

EIC Image Download

We do not provide direct access to the EIC images; instead, please download them from the ImageNet website.
Here is a reference link. Please refer to our readme file for mapping the images to our database.

Update: Due to many requests, we now release our version of EIC images.
Please fill in the form to get access to the download links.

Facts and Statistics

We are just too lazy to list them here. Please refer to Table 1 in the paper.