Minghao Guo



Bio

I am a fourth-year Ph.D. student in the Computational Design & Fabrication Group at MIT CSAIL, supervised by Prof. Wojciech Matusik. Before coming to MIT, I received my M.Phil. from The Chinese University of Hong Kong, supervised by Prof. Dahua Lin. I obtained my Bachelor's degree from Tsinghua University.

My research addresses scientific discovery and 3D content creation in data-sparse settings, where the validity, functionality, and efficiency of generated results are paramount. Drawing on insights from procedural modeling and geometry processing, I develop fundamental representations and generative models, typically combining discrete graphs with associated continuous parameters, that inherently enforce the necessary constraints without excluding any valid solution. This ensures both data efficiency and the preservation of critical properties.

Publications

Physically Compatible 3D Object Modeling from a Single Image
Minghao Guo, Bohan Wang, Pingchuan Ma, Tianyuan Zhang, Crystal Elaine Owens, Chuang Gan, Joshua B. Tenenbaum, Kaiming He, Wojciech Matusik
NeurIPS 2024
Spotlight
[PDF] [Project Page]

Summary

A computational framework that transforms single images into 3D physical objects by considering mechanical properties, external forces, and rest-shape geometry. Unlike existing single-view 3D reconstruction methods that overlook these factors, our approach ensures physical compatibility by embedding these attributes into the reconstruction process and results in 3D shapes that exhibit desired physical behaviors and withstand real-world forces. Our framework is particularly useful for dynamic simulations and 3D printing, where physical compatibility is crucial.

TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes
Minghao Guo*, Bohan Wang*, Kaiming He, Wojciech Matusik
Preprint
[PDF] [Code]

Summary

A Lagrangian representation for reconstructing 3D shapes with high-quality geometry. Unlike conventional methods that rely on Eulerian representations, TetSphere splatting uses tetrahedral meshes to achieve superior mesh quality without neural networks or post-processing. This method deforms multiple initial tetrahedral spheres through differentiable rendering and geometric energy optimization, ensuring computational efficiency. TetSphere splatting excels in applications like single-view 3D reconstruction and image-/text-to-3D content generation, outperforming existing methods in optimization speed, mesh quality, and preservation of thin structures.

Medial Skeletal Diagram: A Generalized Medial Axis Approach for 3D Shape Representation
Minghao Guo*, Bohan Wang*, Wojciech Matusik
SIGGRAPH Asia 2024 (Journal Track)
[PDF]

Summary

A novel skeletal representation that tackles the prevailing issues around compactness and reconstruction accuracy in existing skeletal representations. Our approach augments the continuous elements in the medial axis representation to effectively shift the complexity away from discrete elements. We introduce generalized enveloping primitives, an enhancement of the standard primitives in medial axis, which ensures efficient coverage of intricate local features of the input shape and substantially reduces the number of discrete elements required. Moreover, we present a computational framework that constructs a medial skeletal diagram from an arbitrary closed manifold mesh. We exemplify the versatility of our representation in downstream applications such as shape optimization, shape generation, mesh decomposition, mesh alignment, mesh compression, and user-interactive design.

Representing Molecules as Random Walks Over Interpretable Grammars
Michael Sun, Minghao Guo, Weize Yuan, Veronika Thost, Crystal Elaine Owens, Aristotle Franklin Grosz, Sharvaa Selvan, Katelyn Zhou, Hassan Mohiuddin, Benjamin J Pedretti, Zachary P Smith, Jie Chen, Wojciech Matusik
International Conference on Machine Learning (ICML) 2024
Spotlight
[PDF] [Code] [Video]

Summary

A graph grammar-based model that efficiently represents and reasons over molecules, leveraging hierarchical design spaces and motifs. By utilizing random walks over the design space, our method excels in molecule generation and property prediction. We demonstrate superior performance, efficiency, and synthesizability of predicted molecules compared to existing methods, while offering detailed chemical interpretability.

LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
Pingchuan Ma, Tsun-Hsuan Wang, Minghao Guo, Zhiqing Sun, Joshua B. Tenenbaum, Daniela Rus, Chuang Gan, Wojciech Matusik
International Conference on Machine Learning (ICML) 2024
[PDF] [Code]

Summary

A bilevel optimization framework that combines LLMs’ reasoning abilities with the computational power of simulations. In this framework, LLMs propose hypotheses and reason about discrete components, while simulations provide feedback and optimize continuous parameters. Our experiments demonstrate the efficacy of the framework, which we call SGA, in discovering constitutive laws and designing molecules, revealing novel and coherent solutions beyond conventional human expectations.

Variational Quasi-harmonic Maps for Computing Diffeomorphisms
Yu Wang, Minghao Guo, Justin Solomon
ACM Transactions on Graphics 2023 (SIGGRAPH 2023)
[PDF]

Summary

A novel method for computing smooth, inversion-free maps based on variational quasi-harmonic maps and a special Cauchy boundary condition. Transforming the mapping problem into an optimal control problem defined by an elliptic PDE, with minimization within a family of functionals, yields an efficient numerical procedure that significantly improves speed and quality over current methods and allows optimization of a generic energy in the framework of quasi-harmonic maps.

Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction
Minghao Guo, Veronika Thost, Samuel Song, Adithya Balachandran, Payel Das, Jie Chen, Wojciech Matusik
International Conference on Machine Learning (ICML) 2023
[PDF] [Code]

Summary

A data-efficient molecular property predictor that uses a learnable hierarchical molecular grammar and graph neural diffusion over the grammar-induced geometry. Despite limited labeled data, the proposed method significantly outperforms various baselines, including supervised and pre-trained graph neural networks, on both small and large datasets.

Data-Efficient Graph Grammar Learning for Molecular Generation
Minghao Guo, Veronika Thost, Beichen Li, Payel Das, Jie Chen, Wojciech Matusik
International Conference on Learning Representations (ICLR) 2022
Oral Presentation [acceptance rate: 1.6%]
[PDF] [Code]

Summary

A data-efficient generative model that can be learned from datasets orders of magnitude smaller than common benchmarks. At the heart of this method is a learnable graph grammar that generates molecules from a sequence of production rules. Our learned graph grammar yields state-of-the-art results on generating high-quality molecules for three monomer datasets that contain only ∼20 samples each.

PolyGrammar: Grammar for Digital Polymer Representation and Generation
Minghao Guo, Liane Makatura, Wan Shou, Timothy Erps, Michael Foshey, Wojciech Matusik
Advanced Science
[PDF]

Summary

A formal grammar designed specifically for polymers, providing a concise, explicit, and robust representation for any polymer within a given class. This project serves as a foundational blueprint for constructing general chemical design models, with closely integrated representation and generative capabilities.

Pareto Gamuts: Exploring Optimal Designs Across Varying Contexts
Liane Makatura, Minghao Guo, Adriana Schulz, Justin Solomon, Wojciech Matusik
ACM Transactions on Graphics 2021 (SIGGRAPH 2021)
[PDF] [Project Page] [Video]

Summary

A framework for variable-context multi-objective optimization that captures Pareto fronts over a range of contexts (the Pareto gamut). A global-local optimization algorithm is developed to discover the Pareto gamut, and its practical utility is demonstrated on several engineering design problems.

Texture Memory-Augmented Deep Patch-Based Image Inpainting
Rui Xu, Minghao Guo, Jiaqi Wang, Xiaoxiao Li, Bolei Zhou, Chen Change Loy
Transactions on Image Processing (TIP)
[PDF]

When NAS Meets Robustness: In Search of Robust Architectures against Adversarial Attacks
Minghao Guo*, Yuzhe Yang*, Rui Xu, Ziwei Liu, Dahua Lin
Conference on Computer Vision and Pattern Recognition (CVPR) 2020
[PDF] [Project Page] [Code] [Video]

Summary

A first step toward understanding the adversarial robustness of neural networks from an architectural perspective. Based on our “robust architecture Odyssey”, we design RobNets, a family of robust architectures that exhibit superior robustness compared with other widely used architectures.

Online Hyper-parameter Learning for Auto-Augmentation Strategy
Chen Lin*, Minghao Guo*, Chuming Li, Xin Yuan, Wei Wu, Junjie Yan, Dahua Lin, Wanli Ouyang
International Conference on Computer Vision (ICCV) 2019
[PDF]

Summary

An economical solution that learns the data augmentation policy along with network training. The augmentation policy is formulated as a parameterized probability distribution, allowing its parameters to be optimized jointly with the network parameters. Compared with state-of-the-art methods, this framework is 60× faster on CIFAR-10 and 24× faster on ImageNet, while maintaining competitive accuracy.

IRLAS: Inverse Reinforcement Learning for Architecture Search
Minghao Guo, Zhao Zhong, Wei Wu, Dahua Lin, Junjie Yan
Conference on Computer Vision and Pattern Recognition (CVPR) 2019
[PDF] [Code]

Summary

An inverse reinforcement learning framework that trains an agent to search for network structures topologically inspired by human-designed networks. A mirror stimuli function, motivated by biological cognition theory, is proposed to extract abstract topological knowledge and facilitate the discovery of architectures that strike a satisfactory balance between accuracy and efficiency.

Dual-Agent Deep Reinforcement Learning for Deformable Face Tracking
Minghao Guo, Jiwen Lu, Jie Zhou
European Conference on Computer Vision (ECCV) 2018
Oral Presentation [acceptance rate: 2.4%]
[PDF] [Video]

Summary

A deformable face tracking system that automatically balances efficiency and effectiveness according to the image quality and tracking complexity of each frame. Dual-agent deep reinforcement learning is used to make adaptive decisions about position updates for bounding boxes and landmarks. The interaction between these two subtasks is explicitly exploited following a Bayesian model.

Learning Reasoning-Decision Networks for Robust Face Alignment
Hao Liu, Jiwen Lu, Minghao Guo, Suping Wu, Jie Zhou
Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
[PDF]

Education & Experiences