Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model

In the latest episode of the Talking Papers Podcast, I had the pleasure of hosting Jiahao Li, a talented PhD student at the Toyota Technological Institute at Chicago (TTIC). Our conversation centered around his paper titled “Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model,” which was recently published at ICLR 2024.

The paper addresses the challenges of text-to-3D generation, specifically the slow inference, low diversity, and quality issues faced by existing methods. Jiahao introduced Instant3D, a novel approach that generates high-quality and diverse 3D assets from text prompts in a feed-forward manner. This is achieved through a two-stage paradigm involving the generation of a sparse set of four structured and consistent views from text, followed by the direct regression of the NeRF (Neural Radiance Field) from the generated images using a transformer-based sparse-view reconstructor.
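The two-stage paradigm can be sketched at a high level in Python. To be clear, this is an illustrative outline only: all function names, data structures, and return values below are hypothetical placeholders, not the authors' actual (unreleased) implementation.

```python
# Illustrative sketch of Instant3D's two-stage, feed-forward pipeline.
# All names here are invented for clarity; the real model is not public.

def generate_four_views(prompt: str) -> list:
    """Stage 1: a fine-tuned 2D text-to-image diffusion model produces,
    in one shot, a sparse set of four structured and consistent views
    (in the paper, rendered together as a 2x2 grid image).
    Placeholder strings stand in for the generated images."""
    return [f"{prompt} [view {i}]" for i in range(4)]

def regress_nerf(views: list) -> dict:
    """Stage 2: a transformer-based sparse-view reconstructor directly
    regresses a NeRF from the four views, with no per-asset optimization
    loop (this is what makes the method feed-forward and fast)."""
    return {"nerf_params": views, "num_views": len(views)}

def instant3d(prompt: str) -> dict:
    views = generate_four_views(prompt)  # text -> 4 consistent views
    return regress_nerf(views)           # views -> NeRF, feed-forward

asset = instant3d("a ceramic teapot")
```

The key design choice this sketch highlights is that both stages are single forward passes, which is why the full pipeline runs in about 20 seconds rather than the hours required by score-distillation methods.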

I must say that the results are truly impressive, especially given the remarkable speed at which they are generated. As someone with a strong inclination towards 3D, the notion of going through a 2D projection initially felt peculiar to me. However, I can’t argue with the visually striking outputs that Instant3D produces. This research underscores the importance of obtaining more and better 3D data, further pushing the boundaries of text-to-3D conversion.

It’s worth mentioning that I was introduced to Jiahao through Yicong Hong, another guest on the podcast who happens to be a co-author on this paper as well. Yicong was a PhD student at the Australian National University (ANU) while I was doing my postdoc there, and he also interned at Adobe alongside Jiahao. It’s always interesting to see connections and collaborations come full circle in the research community.

While I regret that the Instant3D model has not been made public, I understand Adobe’s decision, considering the substantial computational resources required to train such models as well as copyright concerns. Nevertheless, I am excited to see what future research Jiahao and his collaborators will bring to the field of text-to-3D conversion. Stay tuned for more exciting episodes of the Talking Papers Podcast, where we continue to delve into the latest research and discoveries in academia.

AUTHORS


Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, Sai Bi

ABSTRACT


Text-to-3D with diffusion models has achieved remarkable progress in recent years. However, existing methods either rely on score distillation-based optimization which suffer from slow inference, low diversity and Janus problems, or are feed-forward methods that generate low-quality results due to the scarcity of 3D training data. In this paper, we propose Instant3D, a novel method that generates high-quality and diverse 3D assets from text prompts in a feed-forward manner. We adopt a two-stage paradigm, which first generates a sparse set of four structured and consistent views from text in one shot with a fine-tuned 2D text-to-image diffusion model, and then directly regresses the NeRF from the generated images with a novel transformer-based sparse-view reconstructor. Through extensive experiments, we demonstrate that our method can generate diverse 3D assets of high visual quality within 20 seconds, which is two orders of magnitude faster than previous optimization-based methods that can take 1 to 10 hours. Our project webpage: this https URL.

RELATED WORKS

📚 DreamFusion

📚 Shap-E

📚 ProlificDreamer

LINKS AND RESOURCES

📚 Preprint

💻 Project page

To stay up to date with his latest research, follow him on:

๐Ÿ‘จ๐Ÿปโ€๐ŸŽ“Personal website

๐Ÿ‘จ๐Ÿปโ€๐ŸŽ“Google scholar

๐ŸฆTwitter

๐Ÿ‘จ๐Ÿปโ€๐ŸŽ“LinkedIn

This episode was recorded on January 24th, 2024.

CONTACT


If you would like to be a guest, sponsor or share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

SUBSCRIBE AND FOLLOW


🎧 Subscribe on your favourite podcast app

📧 Subscribe to our mailing list

🐦 Follow us on Twitter

🎥 Subscribe to our