CVPR 2023


In this blog post, I endeavoured to recreate the past glory of the CVPR 2019 and CVPR 2022 blog posts by sharing the essence of my CVPR 2023 experience with you. From arriving early to fix jetlag and skip queues, to crafting a strategic schedule that maximized my learning potential, to indulging in culinary delights and fostering valuable connections, every aspect of the conference contributed to an unforgettable journey.

Preparing

Stepping into a conference is like entering a world of endless possibilities. With a multitude of captivating talks vying for your attention, it’s easy to feel lost in the sea of knowledge. That’s why it’s crucial to embark on your conference journey armed with a well-crafted schedule.

  • Illuminating the Path: Filtering by Topic

In the conference chaos, finding your way starts with filtering sessions by topic. By focusing on areas that truly resonate with your interests and goals, you narrow down the overwhelming options. As you immerse yourself in this process, let your passion guide you, allowing only the most captivating topics to shine through. The result? A curated selection of 2-3 sessions that will quench your thirst for knowledge.

  • The Spark of Inspiration: Subject and Speaker Selection

Once you’ve homed in on the most promising sessions, it’s time to delve deeper. Explore the subjects and speakers connected to each talk on your shortlist. Look beyond surface-level information and uncover the hidden gems: the stories, insights, and expertise that will leave you inspired. Let the magnetic pull of captivating subjects and influential speakers shape your final choices, ensuring a transformative conference experience.

  • The Power of Practicality: Location and Timing

To make the most of every moment, it’s crucial to consider the logistics. Take note of the location and timing of your chosen workshops. Map out your conference journey, paying attention to travel distances between sessions and planning for efficient transitions. Remember, time is a precious resource, so optimize it wisely. By strategizing your schedule, you’ll be free to immerse yourself fully in the conference magic.

  • Preparing for Serendipity: The Role of Backup Plans

Even with meticulous planning, surprises may lurk around the corner. What if a session doesn’t meet your expectations or a change in circumstances arises? That’s where the power of backup plans comes into play. Stay flexible and prepared by identifying alternative sessions or workshops that align with your interests. Embrace the serendipitous moments that may unexpectedly shape your conference journey.

  • Make plans to meet people!

Reach out to colleagues and friends ahead of time, inviting them to share a lunch or grab a drink together. Conferences provide the perfect setting to strengthen relationships and forge new connections. If you are a people person, these interactions will enhance your conference experience and expand your professional network.

  • Arrive early!

Arrive a day early to fix your jetlag and pick up your badge. Give yourself ample time to adjust to the new time zone, ensuring you’re well-rested and fully present for the conference. Arriving early also allows you to avoid long queues for registration, giving you a head start and a smooth entry into the conference. Make the most of this extra day to explore the surroundings, familiarize yourself with the venue, and mentally prepare for the exciting days ahead.

  • Don’t be shy—say Hi!

Step out of your comfort zone and embrace the power of connection. Seize this golden opportunity to meet new people, expand your network, and create lasting professional relationships. Let go of your fears, ignite conversations, and unlock new possibilities. Embrace the magic of networking at its finest!

  • Find a good place to eat!

Take advantage of the conference location to explore local culinary gems. Seek recommendations from locals or fellow attendees, and venture out to savour delicious meals that reflect the essence of the city. From cozy cafes to vibrant restaurants, discovering new flavours can be an unforgettable part of your conference experience.
Here is a list of recommendations I got from a local (shout out to Derek Liu):

  1. Meat & Bread: Sandwiches with very chewy bread (a very convenient place for a quick take-out meal between sessions).
  2. Ramen Danbo: A famous ramen place.
  3. Sura: A popular Korean restaurant (note the opening hours).
  4. Saku: Tonkatsu place.
  5. Namsan: Korean place inside a small, old food court.
  6. Maruhachi Ramen.

Day 1: Workshops and Tutorials

I spent all day at the “Generative Models for Computer Vision” workshop. It had an amazing lineup of speakers, with Phillip Isola, Angjoo Kanazawa, Angela Dai, Christian Rupprecht, Bjorn Ommer, Andrea Tagliasacchi, and Alan Yuille presenting in person, and Gordon Wetzstein, Vincent Sitzmann, and Shubham Tulsiani presenting remotely. For my detailed notes, you can check out my Twitter threads (threads #1, #2, #3, #4).

Generative models have emerged as a powerful tool in computer vision, changing the way we perceive and use visual data. The workshop answered the question of whether generative models are useful for vision with an unequivocal yes. It is no longer a matter of debate; their utility is now evident. However, a new question has taken center stage: how can we best harness their potential?


One intriguing revelation from the workshop was the growing evidence that training vision models on generated data can lead to significant performance boosts. This opens up exciting possibilities for enhancing our understanding of the visual world. During the panel discussion, a follow-up question arose: should we sample from these generative models, or is there a more effective way to leverage their capabilities directly?

Moreover, the workshop shed light on another important aspect of generative models—they can serve as valuable priors. By utilizing these models as prior knowledge, we gain a deeper understanding of the underlying data distribution, enabling more informed analysis.

The workshop was a captivating exploration of the potential of generative models in computer vision. It affirmed their usefulness, whether for boosting performance or serving as priors, and sparked thought-provoking discussions on how best to use them. The journey of leveraging generative models in computer vision has just begun, and we eagerly anticipate the breakthroughs and discoveries that lie ahead.

The quote of the day award goes to Alan Yuille: “Here are some tables, I don’t really mind them, we need them to get papers published”. This statement serves as a powerful reminder that amidst our obsession with bold numbers and statistical significance, we mustn’t lose sight of the ideas, science, and research that underpin them.



Day 2: Workshops and Tutorials

The day kicked off with a whirlwind of excitement as I attended three compelling workshops. In “Deep Learning for Geometric Computing,” Yifan Wang’s talk on “Geometric structures and why (and how) to use them in deep models?” offered profound insights into the integration of geometry and deep learning. Next, in the “AI for Content Creation” workshop, Ben Mildenhall delved into the captivating realm of Neural Radiance Fields (NeRFs), unraveling their potential for revolutionizing content creation. Lastly, at the “3DMV: Learning 3D with Multi-View Supervision” workshop, Ben Poole’s discussion on DreamFusion captivated the audience, shedding light on the exciting possibilities of learning 3D from multiple viewpoints.

As I journeyed through these workshops, I couldn’t help but capture my thoughts in a Twitter thread, summarizing the key takeaways and sharing the knowledge with a wider audience. It was a moment of synergy, where the digital realm intersected with the physical, amplifying the impact of the workshops beyond the conference walls.

However, the true highlight of my day was the “Scholars & Big Models” workshop, where luminaries from academia and industry gathered to discuss the future of the field. The list of attendees reads like a who’s who of computer vision: Jitendra Malik, Alyosha Efros, Angjoo Kanazawa, David Crandall, Georgia Gkioxari, Derek Hoiem, Jon Barron, Devi Parikh, Cordelia Schmid, David Forsyth, and Erik Learned-Miller. The discussions were intellectually stimulating, covering a range of topics, including the dynamics between industry and academia, the existence of open problems, and the challenges of conducting research without large-scale computing.

Throughout the workshop, three key messages resonated: adapt, follow your passion, and leverage your strategic advantage. These words of wisdom underscored the ever-evolving nature of computer vision and emphasized the importance of staying curious and agile in the face of rapid advancements. Among the remarkable talks, Jon Barron’s forecast left a lasting impression, drawing parallels between computer vision’s current state and the past trajectory of graphics. It sparked both curiosity and concern, raising questions about the potential saturation of computer vision or the existence of unexplored dimensions that lie beyond our current projections. Are we truly at the end of the sigmoid or rather on the verge of an explosion of exponential growth? Or maybe, the sigmoid simply has more than one dimension?

I eagerly anticipate the forthcoming days, filled with more discoveries, connections, and revelations that will shape the future of computer vision.


Day 3: Main Conference

The opening ceremony, a personal favourite of mine, kicked off the proceedings with a delightful dose of numbers and statistics. It was an opportunity to grasp the sheer magnitude of the conference’s impact. This year, a staggering 9,155 papers were submitted, and out of those, 2,359 made the cut, including 235 highlight papers and 12 award candidates. The global reach of CVPR was evident, with 8,337 attendees representing 75 countries, including 118 from Israel and 115 from Australia. The program featured 100 workshops, 33 tutorials, and a whirlwind of approximately 400 posters per session.
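For a sense of scale, the submission figures quoted above can be turned into rates with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the numbers are the ones reported at the ceremony, while the helper name `share` and the variable names are my own and purely illustrative.

```python
# Figures as reported at the opening ceremony (see the paragraph above).
submitted = 9155
accepted = 2359
highlights = 235
award_candidates = 12


def share(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


acceptance_rate = share(accepted, submitted)        # accepted vs. submitted
highlight_rate = share(highlights, accepted)        # highlights vs. accepted
candidate_rate = share(award_candidates, accepted)  # award candidates vs. accepted

print(f"Acceptance rate: {acceptance_rate:.1f}% of submissions")
print(f"Highlights: {highlight_rate:.1f}% of accepted papers")
print(f"Award candidates: {candidate_rate:.2f}% of accepted papers")
```

In other words, roughly one submission in four was accepted, about one accepted paper in ten was a highlight, and only about one accepted paper in two hundred was an award candidate.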

The keynote address by Rodney Brooks added another layer of excitement to the day. Titled “Revisiting Old Ideas with Modern Hardware,” Brooks took us on a captivating journey through the history of neural networks. From the foundational work on backpropagation in 1974 to the resurgence of deep learning with AlexNet winning the ImageNet challenge in 2012, his talk encapsulated the evolution of this groundbreaking field. He also touched on the Seven Deadly Sins of Predicting the Future of AI, shedding light on common pitfalls such as overestimating short-term effects, Hollywood scenarios, and the dangers of overgeneralization.

Brooks’s thought-provoking presentation left a lasting impression, and his emphasis on reducing hype around current advancements resonated deeply. As he concluded his talk, he recommended “The Promise of Artificial Intelligence” by Brian Cantwell Smith, a book that delves into the potential and limitations of AI.

The energy continued to surge as the day unfolded, and I had the pleasure of supporting Jiahao in presenting our work on “Aligning Step-by-Step Instructional Diagrams to Video Demonstrations” during the first poster session. It was an invaluable opportunity to share our research, exchange ideas, and receive feedback from the vibrant community of computer vision enthusiasts.

In the second poster session, I immersed myself in the ocean of knowledge displayed by fellow researchers. With each poster I visited, I discovered novel insights and innovative approaches, expanding my understanding of the diverse facets of computer vision. From action recognition to NeRFs, object detection, and beyond, the passion and ingenuity showcased in every poster were inspiring. Here are some I visited today:



Day 4: Main Conference

Twitter summary thread link.
The day commenced with the much-anticipated award ceremony, evoking mixed emotions as it became apparent that a few of the award-winning authors were unable to attend due to visa issues. Nevertheless, some coauthors showcased their ingenuity by devising a creative solution to bring those absent authors up to the stage in a memorable and inclusive manner.


The award-winning papers:

Best student paper honourable mention: “DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation”.

Best student paper: “3D Registration with Maximal Cliques”.

Best paper honourable mention: “DynIBaR: Neural Dynamic Image-Based Rendering”.

Best paper 1: “Visual Programming: Compositional Visual Reasoning without Training”.

Best paper 2: “Planning-oriented Autonomous Driving”.

PAMI-TC awards:

Longuet-Higgins Prize (CVPR 2013 paper): “Online Object Tracking: A Benchmark”.

Young Researcher Award: Christoph Feichtenhofer and Judy Hoffman.

Thomas S. Huang Memorial Award: Alyosha Efros.

The plenary talk was delivered by Yejin Choi on the intriguing topic of “2050: an AI Odyssey: The Dark Matter of Intelligence.” The session commenced with thought-provoking speculations regarding CVPR 2025, pondering over questions such as its prospective venue: the metaverse or even Mars. Choi elaborated on three key points throughout her presentation: the realm of possible impossibilities, the exploration of impossible possibilities, and the enigmatic concept of paradox. She eloquently concluded her talk by emphasizing the idea that talent is not an inherent trait but rather a product of nurture and development.



I collected some posters from the morning and afternoon sessions and unfortunately did not attend the panel due to exhaustion.

Day 5: Main Conference – Final day

Twitter summary thread link.

Larry Zitnick delivered an impactful plenary talk titled “Modeling Atoms to Address Our Climate Crisis,” emphasizing the significance of chemistry and its potential to tackle pressing global challenges. He highlighted the historical success of chemists in addressing societal concerns, such as inventing modern fertilizer in the early 1900s to combat food scarcity. Zitnick then turned to the urgency of global warming, acknowledging the progress made in power generation but pointing out the mismatch between energy demand and supply, particularly with intermittent renewable energy sources. He identified the cost of energy storage and transportation, which comes down to chemical reactions and catalyst-related challenges, as a crucial problem.
Zitnick introduced the Open Catalyst project as a solution and walked through its key components. The project provides a vast dataset of 140 million training examples, with 3D atom positions and atomic numbers as inputs and energy and per-atom forces as outputs. The models are graph neural networks built around equivariance, and Zitnick explained why representing atom orientation properly matters, using spherical channels and spherical harmonics to do so. The project aims to let chemists use this approach through the Open Catalyst API service, streamlining the exploration of new catalyst materials using generative AI.
Remaining challenges were acknowledged, including the need for continuous improvement of models, determining accurate relaxations, and developing models to predict experimental results. Zitnick highlighted potential applications in areas like batteries, proteins, and drug discovery, emphasizing the importance of responsible research and being conscious of any adverse effects. He shared valuable advice for those entering new fields, stressing the importance of building connections, collaborating on problem-solving, and finding solutions collectively.
The decision to focus on chemistry was justified by its relevance, the unique contributions that the community can make, and the potential for enlightening “aha moments” achieved by approaching the field with fresh perspectives. Zitnick’s talk encouraged an understanding that modeling atoms and harnessing the power of AI in chemistry can play a vital role in shaping a better future while urging responsible and ethical practices.
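To make the input/output setup and the notion of equivariance from the talk a bit more concrete, here is a toy sketch in plain Python. It is not the Open Catalyst model or its API: the "model" is a hand-written Coulomb-like pair potential that I made up for illustration, atomic numbers simply stand in for charges, and every function name is hypothetical. It only shows the shape of the problem (positions and atomic numbers in, energy and per-atom forces out) and the symmetry an equivariant network is designed to respect: rotating the atoms leaves the energy unchanged and rotates the forces with them.

```python
import math


def energy_and_forces(positions, charges):
    """Toy pair potential E = sum_{i<j} q_i * q_j / r_ij (illustrative only).

    Returns the scalar energy and one force vector per atom, F = -dE/dx.
    """
    n = len(positions)
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            qq = charges[i] * charges[j]
            energy += qq / r
            for k in range(3):
                f = qq * d[k] / r**3  # force on atom i due to atom j
                forces[i][k] += f
                forces[j][k] -= f     # equal and opposite force on atom j
    return energy, forces


def rotate_z(vec, angle):
    """Rotate a 3D vector about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return [c * x - s * y, s * x + c * y, z]


# Made-up three-atom system: 3D positions plus atomic numbers.
positions = [[0.0, 0.0, 0.0], [1.1, 0.2, 0.0], [0.3, 1.4, 0.8]]
atomic_numbers = [8, 1, 1]
angle = 0.7

e, f = energy_and_forces(positions, atomic_numbers)
rot_positions = [rotate_z(p, angle) for p in positions]
e_rot, f_rot = energy_and_forces(rot_positions, atomic_numbers)

# Invariance: the energy does not change under rotation.
energy_invariant = math.isclose(e, e_rot, rel_tol=1e-9)
# Equivariance: rotating the original forces matches the forces of the rotated system.
forces_equivariant = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for fi, fri in zip(f, f_rot)
    for a, b in zip(rotate_z(fi, angle), fri)
)
print(energy_invariant, forces_equivariant)
```

A physical potential satisfies both checks by construction. As I understood the talk, the point of an equivariant architecture is to build the same guarantee into a learned model instead of hoping it emerges from the data.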

Throughout the day, I had the opportunity to present my poster in both the morning and afternoon sessions, which kept me occupied and engaged. As a result, my day was primarily dedicated to this activity, leaving me with fewer updates to share except for a handful of captivating poster shots. Today I presented our GraVoS work and OGINR.

Reflections on CVPR 2023: Insights and Musings

Amidst an existential crisis within the field, marked by a pervasive sense of uncertainty and the apprehension of exponential (or sigmoid) growth, one truth emerges—adaptation to this evolving landscape is key to flourishing. Despite the trepidation, we possess the very tools needed to overcome any challenges that arise; all it requires is leveraging our strategic advantage. Reflecting upon an awe-inspiring experience, both socially and intellectually, it becomes evident that embracing this new reality is the catalyst for future growth. With excitement and anticipation, I eagerly look forward to the countless adventures that lie ahead.