ELEVATER

Various Datasets over Representative Tasks

20 image classification datasets / 35 object detection datasets.

Toolkit

Automatic hyper-parameter tuning and strong language-augmented, efficient adaptation methods.

Diverse Knowledge Source

Each dataset concept is augmented with diverse knowledge sources, including WordNet, Wiktionary, and GPT-3.

Leaderboard!

Tracks research advances in language-image models.

The ELEVATER benchmark is a collection of resources for training, evaluating, and analyzing language-image models on image classification and object detection. ELEVATER consists of:

Benchmark: A benchmark suite that consists of 20 image classification datasets and 35 object detection datasets, augmented with external knowledge.
Toolkit: An automatic hyper-parameter tuning toolkit and strong language-augmented, efficient model adaptation methods.
Baseline: Pre-trained language-free and language-augmented visual models.
Knowledge: A platform to study the benefit of external knowledge for vision problems.
Evaluation Metrics: Sample efficiency (zero-, few-, and full-shot) and parameter efficiency.
Leaderboard: A public leaderboard to track performance on the benchmark.
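The two evaluation axes above can be illustrated with a minimal Python sketch. It is not the ELEVATER toolkit's code: `sample_k_shot` and `trainable_fraction` are hypothetical helpers, and the toy labels and parameter counts are made up for illustration. The sketch assumes that a few-shot setting means drawing at most k labeled examples per class with a fixed seed, and that parameter efficiency can be summarized as the fraction of parameters that are updated during adaptation.

```python
import random
from collections import defaultdict


def sample_k_shot(labels, k, seed=0):
    """Pick at most k example indices per class (hypothetical helper)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Classes with fewer than k examples contribute all of them.
        chosen.extend(rng.sample(indices, min(k, len(indices))))
    return sorted(chosen)


def trainable_fraction(param_counts, trainable_names):
    """Fraction of all parameters that are updated (hypothetical helper)."""
    total = sum(param_counts.values())
    trainable = sum(param_counts[name] for name in trainable_names)
    return trainable / total


# Toy data, assumed for illustration only.
labels = ["cat", "dog", "cat", "bird", "dog", "cat"]
subset = sample_k_shot(labels, k=2, seed=0)

# A linear-probe-style setup: only the head is trained.
fraction = trainable_fraction({"backbone": 900, "head": 100}, ["head"])
```

In this sketch, zero-shot corresponds to k = 0 (no labeled examples are drawn) and full-shot to using every available index, so one helper covers all three sample-efficiency settings.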

The ultimate goal of ELEVATER is to drive research in the development of language-image models to tackle core computer vision problems in the wild.

[Quick introduction with slides]

[ December 3, 2022 ] The Chinese version of the ELEVATER image classification benchmark has been released by the Alibaba OFA team, a step towards evaluating Chinese and multilingual language-image models for classification. Please check out their Chinese CLIP project page.
[ December 1, 2022 ] Presenting ELEVATER at NeurIPS 2022; please check out our poster.
[ October 23, 2022 ] Virtual meeting of the ECCV workshop "Computer Vision in the Wild". Please check out the fabulous presentations from invited speakers, workshop papers, and challenge winners. The CVinW Challenge summary is presented.
[ Sep 28, 2022 ] For those who are new to the topic of "Computer Vision in the Wild", please check out the CVinW Reading List.
[ Sep 16, 2022 ] Paper accepted to the NeurIPS 2022 Datasets and Benchmarks Track. [OpenReview]

[ Sep 1, 2022 ] Call for Papers & Participation: ECCV Workshop and Challenge on Computer Vision in the Wild (CVinW).

[Workshop]

[IC Challenge]

[OD Challenge]

[ Summer, 2022 ] Interested in learning what "Computer Vision in the Wild" is?
- Talks: Please check out an overview of our team effort, "A Vision-and-Language Approach to Computer Vision in the Wild: Modeling and Benchmark". Talks were given at Apple AI/ML, NIST, Xiaoice, and The AI Talks. [YouTube]
- Demos: Vision systems equipped with the mechanism to recognize any concept in any given image. Check out the demos on image classification with UniCL, and on object detection with RegionCLIP and GLIP.
- Challenge: Have a better idea? Join the community.

[ June 19, 2022 ] CVPR tutorial on knowledge and benchmarks for CVinW. [Slides] [YouTube, Bilibili]
[ Apr 19, 2022 ] The first version is on arXiv.

A more diverse set of CV tasks

Please cite our paper as below if you use the ELEVATER benchmark or our toolkit.


@article{li2022elevater,
    title={ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models},
    author={Li, Chunyuan and Liu, Haotian and Li, Liunian Harold and Zhang, Pengchuan and Aneja, Jyoti and Yang, Jianwei and Jin, Ping and Hu, Houdong and Liu, Zicheng and Lee, Yong Jae and Gao, Jianfeng},
    journal={Neural Information Processing Systems},
    year={2022}
}

Hello ELEVATER! A Platform for Computer Vision in the Wild

Why ELEVATER?

What is ELEVATER?

News

Paper

Contact

Have any questions or suggestions? Feel free to reach us by opening a GitHub issue!