Workshop on Computer Vision in the Wild
@ ECCV 2022, October 23



Overview

State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept.

Recent works show that learning from large-scale image-text data is a promising approach to building transferable visual models that can effortlessly adapt to a wide range of downstream computer vision (CV) and multimodal (MM) tasks, for example, CLIP, ALIGN, and Florence for image classification, and ViLD, RegionCLIP, and GLIP for object detection. These vision models with a language interface are naturally open-vocabulary recognition models, showing superior zero-shot and few-shot adaptation performance in various real-world scenarios.
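Such open-vocabulary models classify an image by comparing its embedding against text embeddings of arbitrary class names. A minimal sketch of this zero-shot usage with the public OpenAI CLIP package (github.com/openai/CLIP) is shown below; the image path, class names, prompt template, and checkpoint are placeholders for illustration only.

    # Zero-shot image classification with CLIP: no task-specific training,
    # the label set is specified purely through natural-language prompts.
    import torch
    import clip  # pip install git+https://github.com/openai/CLIP.git
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    class_names = ["cat", "dog", "horse"]  # placeholder label set
    image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)  # placeholder image
    text = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(text)
        # Cosine similarity between the image and each class prompt acts as the classifier.
        image_features /= image_features.norm(dim=-1, keepdim=True)
        text_features /= text_features.norm(dim=-1, keepdim=True)
        probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

    print({c: round(p.item(), 3) for c, p in zip(class_names, probs[0])})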

We propose this "Computer Vision in the Wild" workshop, aiming to gather the academic and industry communities to work on CV problems in real-world scenarios, focusing on the challenges of open-set/domain visual recognition and efficient task-level transfer. Since there are no established benchmarks to measure the progress of "CV in the Wild", we develop new benchmarks for image classification and object detection that measure the task-level transfer ability of various models/methods over diverse real-world datasets, in terms of both prediction accuracy and adaptation efficiency. This workshop will also host two challenges based on these benchmarks.

Call for Papers

    Topics of interest include but are not limited to:
  • Open-set visual recognition methods, including classification, object detection, and segmentation in images and videos
  • Zero/Few-shot text-to-image generation/editing; Open-domain visual QA & image captioning
  • Unified neural network architectures and training objectives across different CV & MM tasks
  • Large-scale pre-training, with images/videos only, image/video-text pairs, and external knowledge
  • Efficient large visual model adaptation methods, measured by #training samples (zero-shot and few-shot), #trainable parameters, throughput, and training cost
  • New metrics / benchmarks / datasets to evaluate task-level transfer and open-domain visual recognition

    We accept abstract submissions to our workshop. All submissions should be at most 8 pages (excluding references), following the ECCV 2022 author guidelines. All submissions will be reviewed by the Program Committee on the basis of technical quality, relevance to the scope of the workshop, originality, significance, and clarity.

    Submission Portal: [CMT]

CV in the Wild Challenges

    There are two challenges associated with this workshop: "Image Classification in the Wild" (ICinW) and "Object Detection in the Wild" (ODinW). We summarize their evaluation datasets and metrics in the table below.

    Challenge | Eval Datasets                    | Eval Metrics         | Make a Submission
    ICinW     | 20 Image Classification Datasets | Zero, few, full-shot |
    ODinW     | 35 Object Detection Datasets     | Zero, few, full-shot | Open Soon
    To prevent a race purely in pre-training data and model size, we will have two tracks:
  • For the academic track, pre-training data is limited to ImageNet21k, Objects365, CC15M, and YFCC15M.
  • For the industry track, there is no limitation on pre-training data and model size. Teams are required to disclose meta information about their models and data if extra data is used.

    More information about the challenges is available: [Benchmark] [Document]. Our evaluation server will be online soon.
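    Beyond zero-shot evaluation, the few-shot and full-shot settings measure how efficiently a pre-trained model adapts with limited labeled data. A minimal sketch of one common few-shot baseline, a linear probe on frozen image features, is given below; the feature dimensions, shot counts, and random features are placeholders and do not reflect the official evaluation protocol.

        # Few-shot adaptation via a linear probe: fit a lightweight classifier
        # on frozen image features and report top-1 accuracy on a held-out split.
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def linear_probe_accuracy(train_feats, train_labels, test_feats, test_labels):
            clf = LogisticRegression(max_iter=1000)
            clf.fit(train_feats, train_labels)          # only the linear head is trained
            return clf.score(test_feats, test_labels)   # top-1 accuracy

        # Placeholder "features": 5 shots x 20 classes for training, random vectors
        # standing in for frozen embeddings from a pre-trained image encoder.
        rng = np.random.default_rng(0)
        train_feats, train_labels = rng.normal(size=(100, 512)), rng.integers(0, 20, 100)
        test_feats, test_labels = rng.normal(size=(500, 512)), rng.integers(0, 20, 500)
        acc = linear_probe_accuracy(train_feats, train_labels, test_feats, test_labels)
        print(f"few-shot top-1 accuracy: {acc:.3f}")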

Dates

July 25, 2022 Competition starts, testing phase begins
September 30, 2022 Competition ends (challenge paper submission)
September 16, 2022 Workshop paper submission deadline
October 9, 2022 Workshop paper acceptance decision to authors
October 16, 2022 Camera-ready submission deadline


Invited Speakers (TBD)

Program (TBD)


Workshop Organizers



Pengchuan Zhang
Meta AI



Chunyuan Li
Microsoft



Jyoti Aneja
Microsoft



Ping Jin
Microsoft



Jianwei Yang
Microsoft



Xin Wang
Microsoft



Haotian Liu
UW Madison



Liunian Li
UCLA



Haotian Zhang
University of Washington



Shohei Ono
Microsoft


Challenge Organizers (TBD)



Yinfei Yang
Apple



Yi-Ting Chen
Google



Ye Xia
Google



Yangguang Li
SenseTime



Feng Liang
UT Austin



Yufeng Cui
SenseTime



Kuniaki Saito
Google



Kihyuk Sohn
Google



Xiang Zhang
Google



Chun-Liang Li
Google



Chen-Yu Lee
Google



Houwen Peng
MSRA


Advisory Committee



Trevor Darrell
UC Berkeley



Lei Zhang
IDEA



Jenq-Neng Hwang
University of Washington



Yong Jae Lee
UW Madison



Houdong Hu
Microsoft



Zicheng Liu
Microsoft



Ce Liu
Microsoft



Xuedong Huang
Microsoft



Kai-Wei Chang
UCLA



Jingdong Wang
Baidu



Zhuowen Tu
UCSD



Jianfeng Gao
Microsoft


Workshop and Challenge Questions?
Reach out: https://github.com/Computer-Vision-in-the-Wild/eccv-2022
Workshop Organizing Team