CVPR 2022

In 2019 I hit a career goal and had a paper published in CVPR. Today, three years and several papers later, I find that the blog post I wrote back then is still valuable to me and to the readers of this blog. So in this post I will try to recreate past glory and share my CVPR 2022 experience.

Preparing

  1. The first thing I do before attending a conference is make a schedule. It is so easy to lose yourself in all of the talks happening at the same time. I first filter by topic (which usually leaves me with 2-3 candidates) and then by subject/speaker. I take note of the location and time of each selected workshop, and I make backup plans in case one doesn’t meet my expectations. I made a special Google Calendar for it and shared it with the world.
  2. Make plans to meet people! I usually email colleagues and friends to squeeze in at least a lunch together (beer is always better). I am a people person and these conferences are great opportunities to strengthen these networks.
  3. Arrive a day early to fix your jetlag, pick up your badge, and avoid the queue.


Day 1 – Workshops and tutorials

I started off the day with a talk by Jitendra Malik on vision and action. He talked about how we see to move and move to see, and how these two are interconnected. His citation of a 1999 paper was interesting and amusing: it turns out that ego-exo views were possible even before digital cameras (and were used to make tea!).

I then went to the workshop on synthetic data in machine learning. The first talk, by Erroll Wood, demonstrated remarkable work on generating synthetic portraits of humans. He pointed out that what makes synthetic data so good is realism, diversity, and richness. He then detailed how finding the Gaussian of more markers works better for matching poses and expressions, and how they created a strand-level hair dataset.

Sanja Fidler was up next, showcasing the amazing work NVIDIA is doing to push synthetic data to the very edge. She stressed the difficulty of getting high-quality ground-truth data and showed what a powerful tool Omniverse is. She demonstrated its capabilities in a plethora of papers covering graphics, mixed reality, neural rendering, and generative models.

By the way, the room was far from empty: the hall was huge and packed. I simply sat in the front rows, like a proud nerd.

At the ScanNet workshop, Gim Hee Lee and Francis Engelmann gave a nice overview of their papers on instance-level 3D segmentation and detection. Francis’ demo at the end was a really nice touch. The workshop ended with a panel session that included Or Litany and Matthias Niessner and discussed the future challenges of the field. Out-of-distribution samples and classes seem particularly interesting, as does finding a way around the requirement for large-scale datasets with per-point segmentation labels.


We finished off the day by bringing the ANU gang back together: a great dinner and some drinks with Chamin, Yicong, Dylan, and Steve.


Day 2 – Workshops and tutorials

On the second day, I decided to go to the full-day tutorial on Neural Fields. I already knew many of the topics but went anyway, since it never hurts to get someone else’s perspective. I particularly liked how they structured the tutorial: an intro, techniques, applications, and an invited-speakers section. The most impressive aspect was the explosion of papers within only 2-3 years of the field’s existence. I also appreciated the effort of building a community, starting with a website that curates all of the literature and a Slack channel.

Day 3 – Main conference Day 1

The day started with the best paper and award ceremony. One of the things I like most about this ceremony is the numbers: How many attendees? From which countries? How many authors? I am always a bit surprised, but mostly proud, that Israel has substantial numbers. This year, in particular, two of the best papers had an Israeli first author.

And the winners are:

Best paper: Learning to solve hard minimal problems

Best paper honourable mention: Dual-shutter optical vibration sensing

Best student paper: EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

Best student paper honourable mention: Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields

Thomas Huang Memorial Prize: Fei-Fei Li

Young researcher award: Bharath Hariharan, Olga Russakovsky
Longuet-Higgins Prize: Are we ready for autonomous driving? The KITTI vision benchmark suite

Here are a few cool posters I got to check out during the poster session:

The day ended at the Microsoft party, where I even got a chance to take a picture with the first author of the best paper, Petr Hruby.

Days 4-5 – Main conference days 2-3

Unfortunately, I hurt my knee and couldn’t walk, so I attended these days virtually and missed the poster sessions. However, I am happy to share a few posters that I liked and found on social media, mainly on Jared Heinly’s Twitter (@JaredHeinly); we share an interest in 3D point clouds.

Day 6 – Main conference day 4 – Our poster day

On the last day of the conference, I dragged my bum knee to present our DiGS poster. Talking about research is a great distraction from the pain. Before the session started, I had the chance to catch up with David Lindell, who was recently a guest on my Talking Papers Podcast, discussing the paper he presented in the morning session, “BACON: Band-Limited Coordinate Networks for Multiscale Scene Representation“.


Also, I got to touch a little bit of Twitter stardust when I met @AK in person. I used to think he was a bot; I couldn’t imagine a human spending so much time and effort reading papers from so many subfields and posting them on Twitter. So, yes, he is a human… and a pretty nice one too. I took a picture with him but this happened (see below)… seems to be a known bug in many cameras.

A few minutes before the session started I took this picture:

It was one of those amazing moments where dream meets reality. My life goal is to inspire young minds. Having this X-year-old look up at my poster was a reminder and a motivation boost to continue what I am doing. It seems to be working.


Presenting a poster in the last session is never great: most attendees are on their way to catch their flights back. However, plenty of people still came by to chat, and the smaller crowd allowed the interactions to be in-depth, more of a conversation than a monologue.

Summary

So what is hot and what is not? As Hamid Rezatofighi tweeted, his next paper title is going to be “Diffusion-based Neural Implicit Function for Egocentric Perception for Embodied AI using Transformers”.

If you look at the majority of the papers, a few themes stand out:

  1. Neural Fields – Since DeepSDF and NeRF this field has exploded but I feel there are still many open questions and plenty of work to be done there.
  2. Ego view – Seems like trying to see the world with more human eyes (at least the view) is an up-and-coming trend with interesting challenges and new datasets.
  3. Transformers – Yes, they are useful in many cases.
  4. Diffusion models – show great potential.
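
For readers new to the first theme: a neural field is simply a network that maps coordinates to a signal value (a signed distance in DeepSDF, colour and density in NeRF). Here is a minimal illustrative sketch of my own, an untrained coordinate MLP mapping 3D points to scalar values, not code from any of the papers above:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a small fully connected network."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def neural_field(params, coords):
    """Evaluate the field at a batch of coordinates."""
    h = coords
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU hidden layers
    W, b = params[-1]
    return h @ W + b                    # linear output, e.g. an SDF value

params = init_mlp([3, 64, 64, 1])       # 3D point -> scalar
points = rng.standard_normal((5, 3))    # 5 query points
values = neural_field(params, points)   # shape (5, 1): one value per point
```

Training such a network on supervision (distances, pixels) is what the hundreds of neural-field papers at the conference are all about; this sketch only shows the coordinates-in, values-out interface that unites them.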

It is always very difficult to know which paper is going to be the next AlexNet or PointNet (for the 3D folks) but it is important to remember that research is an infinite game. You can’t win research, the goal is to stay in the game and CVPR does exactly that with a sprinkle of FOMO (fear of missing out).

Finally, I would like to thank all of the organizers for an amazing event! It takes such an incredible amount of work to pull off something on this scale. Thank you.