ShapeMOD:
Macro Operation Discovery for 3D Shape Programs


R. Kenny Jones1      David Charatan1      Paul Guerrero2      Niloy J. Mitra2,3      Daniel Ritchie1     
1Brown University      2Adobe Research      3University College London     

Paper | Code | Video | Supplemental
SIGGRAPH 2021





We propose ShapeMOD, an algorithm which takes as input a collection of 3D shape programs and makes them more compact by automatically discovering common macros which can be re-used across the collection.We apply ShapeMOD to datasets of ShapeAssembly programs and find that generative models which train on refactored programs containing these macros produce more plausible output shapes than those trained on the original programs. The discovered macros also facilitate shape editing by exposing only a small number of meaningful parameters for manipulating shape attributes. For example, the four_leg_base macro exposes two parameters (visualized as sliders with red handles); one parameter controls leg size, while the other controls leg spacing.



Abstract

A popular way to create detailed yet easily controllable 3D shapes is via procedural modeling, i.e. generating geometry using programs. Such programs consist of a series of instructions along with their associated parameter values. To fully realize the benefits of this representation, a shape program should be compact and only expose degrees of freedom that allow for meaningful manipulation of output geometry. One way to achieve this goal is to design higher-level macro operators that, when executed, expand into a series of commands from the base shape modeling language. However, manually authoring such macros, much like shape programs themselves, is difficult and largely restricted to domain experts.

In this work, we present ShapeMOD, an algorithm for automatically discovering macros that are useful across large datasets of 3D shape programs. ShapeMOD operates on shape programs expressed in an imperative, statement-based language. It is designed to discover macros that make programs more compact by minimizing the number of function calls and free parameters required to represent an input shape collection. We run ShapeMOD on multiple collections of programs expressed in a domain-specific language for 3D shape structures. We show that it automatically discovers a concise set of macros that abstract out common structural and parametric patterns that generalize over large shape collections. We also demonstrate that the macros found by ShapeMOD improve performance on downstream tasks including shape generative modeling and inferring programs from point clouds. Finally, we conduct a user study that indicates that ShapeMOD's discovered macros make interactive shape editing more efficient.




ShapeMOD Algorithm


ShapeMOD’s goal is to take a dataset of programs D and the library of DSL functions used to express them L, and return a new library (with additional macros) which is able to express the programs in D with fewer free parameters. The motivation here is that macros should remove free parameters that correspond to extraneous degrees of freedom, i.e. degrees of freedom that can create implausible output shapes, such as independently changing the length of each leg of a table. At the same time, we want to keep the number of functions in our library relatively small, so as not to remove necessary degrees of freedom that can create meaningful shape manipulations. We formalize this trade-off in an objective function which the algorithm attempts to minimize:




The ShapeMOD algorithm has two phases. First, a proposal phase finds clusters of similar programs and uses these clusters to propose a set of candidate macros. Then, an integration phase greedily iterates through a ranked list of these candidate macros and adds them to the library L whenever it would improve the objective function . These phases can be alternated one after the other for multiple rounds, with the output of one phase treated as the input for the next.




Discovered Macros



We experimentally evaluate ShapeMOD’s effectiveness at compressing shape programs and at supporting downstream tasks. Our experiments use three categories of manufactured shapes (Chairs, Tables, Storage) from CAD models in PartNet, using the ShapeAssembly DSL.




Above, we show some macros (top-middle) that ShapeMOD discovered when run on the Table dataset, and program refactors that use these macros to significantly compress the number of exposed free parameters (ShapeMOD arrows from outside to inside). We show program edits (down arrows) of corresponding parameters in both programs with macros (green) and without macros (red). The discovered macros capture parametric relationships that better preserve shape plausibility under manipulation; for example, all chair legs remain the same size in the third column (macros), while the shape in the fourth column (no macros) becomes disconnected and physically implausible .




In the above table, we also measure how well different discovered libraries can compress a dataset of shape programs. For all compression metrics, lower values are better, as our goal is to find a small collection of functions that remove many degrees of freedom from the underlying shape programs. ShapeMOD operates by attempting to minimize f , and we show that it does in fact improve f compared to the No Macros version. CC, cross-category, denotes a variant of ShapeMOD where a single library is discovered across multiple datasets of shape programs at once. Interestingly, the library of functions discovered across multiple categories led to better program compression statistics, but slightly degraded performance on novel shape generation and program inference tasks, compared with libraries discovered by category specific ShapeMOD runs.




Above we show how the symbolic representation of a single PartNet shape (top-right) evolves with our method. From left to right, StructureNet json files describing OBB part-graphs are converted into ShapeAssembly programs that can be further compressed with the macros discovered by ShapeMOD.




Visual Program Induction



Qualitative (above) and quantitative (below) results from our visual program induction experiment, where we train encoder-decoder models that learn to infer ShapeAssembly programs from point clouds. ShapeMOD macros regularize the output program space, leading to significant and consistent improvement in both reconstruction accuracy and physical validity, especially for the heterogenous Storage category. Note that while the cross-category discovered macros do lead to better performance compared with No Macros, for program inference it is better to use a macro library tailored to a particular category of objects.





Generative Modeling



Some example outputs of generative models trained to produce ShapeAssembly programs expressed with macros discovered by ShapeMOD, along with their training set nearest neighbors (NN) by geometric and program similarity. Each cuboid represents a part proxy bounding volume. Structures are formed through attaching parts to one another (red dots). The generative models produce a variety of plausible structures without memorizing their training data.




Interactive Shape Editing




Our final downstream task is interactive shape editing. We hypothesize that programs with macros will support easier, more efficient shape editing. To test this hypothesis, we built an interactive ShapeAssembly editor and conducted a user study with it. Above, we show a video demonstration of such an editing sequence.





Above we show the results from our user study. Top row: the initial program output shape (gray) and target shape (yellow) for each task in our goal-directed editing study. Bottom row: plots of how quickly participants were able to edit a program’s parameters to match the target shape, with 95% confidence intervals shown. The x axis is time elapsed in minutes, while the y axis is the mean of the running minimum of each participant’s corner distance to the target shape. In general, participants using ShapeMOD macros more quickly converged to the target shape and achieved a closer fit. To allow users to take breaks between tasks, time starts when the user makes their first edit for each task.



Related Publications

ShapeAssembly: Learning to Generate Programs for 3D Shape Structure Synthesis
R. Kenny Jones, Theresa Barton, Xianghao Xu, Kai Wang, Ellen Jiang, Paul Guerrero, Niloy J. Mitra, and Daniel Ritchie
Siggraph Asia 2020 Paper | Project Page | Code

StructureNet: Hierarchical Graph Networks for 3D Shape Generation
Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy J. Mitra, and Leonidas J. Guibas
Siggraph Asia 2019 Paper | Project Page | Code

PartNet: A Large-scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding
Kaichun Mo, Shilin Zhu, Angel X. Chang, Li Yi, Subarna Tripathi, Leonidas J. Guibas, and Hao Su
CVPR 2019 Paper | Project Page | Code



Bibtex

@article{jones2021shapeMOD,
  title={ShapeMOD: Macro Operation Discovery for 3D Shape Programs},
  author={Jones, R. Kenny and Charatan, David and Guerrero, Paul and Mitra, Niloy J. and Ritchie, Daniel},
  journal={ACM Transactions on Graphics (TOG)},
  volume={40},
  number={4},
  year={2021},
  publisher={ACM},
  address = {New York, NY, USA},
  articleno = {153}
}


Acknowledgements

We would like to thank the participants in our user study for their contribution to our research. We would also like to thank the anonymous reviewers for their helpful suggestions. Renderings of part cuboids and point clouds were produced using the Blender Cycles renderer. This work was funded in parts by NSF award #1941808, a Brown University Presidential Fellowship, an ERC grant (SmartGeometry), and gifts from Adobe. Daniel Ritchie is an advisor to Geopipe and owns equity in the company. Geopipe is a start-up that is developing 3D technology to build immersive virtual copies of the real world with applications in various fields, including games and architecture.