ECCV 2022 Workshop on Computer Vision in the Wild

Overview

State-of-the-art CV systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept.

Recent works, such as CLIP, ALIGN, and Florence for image classification, and ViLD, RegionCLIP, and GLIP for object detection, show that learning from large-scale image-text data is a promising approach to building open-vocabulary CV models. These vision models with a language interface also show superior zero-shot and few-shot adaptation performance on various real-world tasks.
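As a concrete illustration of the open-vocabulary property, below is a minimal sketch of zero-shot image classification with the open-source CLIP package; the class names, prompt template, and image path are placeholders chosen for this example, not part of any challenge toolkit.

    # Minimal sketch: zero-shot classification with a CLIP-style image-text model.
    # Requires the open-source "clip" package (github.com/openai/CLIP) and PyTorch.
    import torch
    import clip
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    # Any label set can be supplied as free-form text at inference time.
    class_names = ["cat", "dog", "aquarium fish"]          # placeholder labels
    text = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)
    image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(text)
        # Cosine similarity between the image and each class prompt, then softmax.
        image_features /= image_features.norm(dim=-1, keepdim=True)
        text_features /= text_features.norm(dim=-1, keepdim=True)
        probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

    print(dict(zip(class_names, probs[0].tolist())))

Because the label set is just text, the same model can be pointed at a new task simply by changing the prompts, which is what makes such models candidates for "in the wild" evaluation.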

We propose this "Computer Vision in the Wild" workshop, aiming to gather the academic and industry communities to work on CV problems in real-world scenarios, with a focus on the challenges of open-vocabulary vision tasks and efficient domain adaptation.

Since there are no established benchmarks to measure the progress of "CV in the wild", we develop two new benchmarks, one for classification and one for detection, to measure the adaptation efficiency of various models/methods over diverse real-world datasets. This workshop will host two challenges, one based on each benchmark.

CV in the Wild Challenges

There are two challenges associated with this workshop: "Image Classification in the Wild" (ICinW) and "Object Detection in the Wild" (ODinW). We summarize their evaluation datasets and metrics in the table below.

To prevent a race purely in pre-training data and model size, each challenge has two tracks.

For the academic track, pre-training data is limited to ImageNet21k, Objects365 and CC15M.

For the industry track, there is no limitation on pre-training data or model size. Teams are required to disclose meta information about their model and data if extra data is used.

Our evaluation code will be released and can be downloaded here for local use by participants. We use CodaLab as our evaluation server (link).

  Evaluation datasets and metrics for the two challenges:

    Challenge | Eval Datasets  | Eval Metrics             | Pre-training data*
    ICinW     | 21 IC datasets | averaged n-shot accuracy | ImageNet21k+CC15M
    ODinW     | 36 OD datasets | averaged n-shot accuracy | Objects365+CC15M

    *Pre-training data for the academic track; meta information of any extra data beyond this should be disclosed.
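For intuition only, the sketch below shows one way an averaged score over many evaluation datasets could be computed: one score per dataset under the same n-shot setting, then an unweighted mean. The dataset names and scores are placeholders, and this is an assumption for illustration; the released evaluation code is the source of truth.

    # Hypothetical sketch: averaging per-dataset scores for an "in the wild" benchmark.
    from statistics import mean

    def averaged_score(per_dataset_scores: dict[str, float]) -> float:
        """Unweighted mean of one score per evaluation dataset."""
        return mean(per_dataset_scores.values())

    # Placeholder values, e.g. accuracy (ICinW) under a fixed n-shot setting.
    scores = {
        "dataset_01": 0.72,
        "dataset_02": 0.41,
        "dataset_03": 0.88,
    }
    print(f"averaged score: {averaged_score(scores):.3f}")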

Important dates

  • June 15th: Workshop registration available
  • July 25th: Competition starts, testing phase begins
  • Sep 30th: Competition ends (challenge paper submission - optional)
  • TBD: Workshop paper submission deadline
  • TBD: Workshop paper acceptance decision to authors

Invited Speakers

To be announced.

Organizers

Pengchuan Zhang, Microsoft

Chunyuan Li, Microsoft

Jyoti Aneja, Microsoft

Ping Jin, Microsoft

Jianwei Yang, Microsoft

Xin Wang, Microsoft

Houdong Hu, Microsoft

Zicheng Liu, Microsoft

Haotian Liu, Univ. of Wisconsin at Madison

Liunian Li, UCLA

Jianfeng Gao, Microsoft
