Envisioning 3D design systems for the Metaverse

How to scale 3D content design for the next-generation internet

Puzzle BLOCKHAUS

The metaverse seems like a glamorous futurist proposition: that “one day” we will have virtual worlds that billions of people are willing to “live in”. Obviously, we haven’t seen a real “metaverse” product yet, and we are unlikely to see one within the next 3–5 years. So, what is preventing the metaverse from becoming a reality in the short term? I believe the biggest bottleneck is the content creation ecosystem. The metaverse needs massive amounts of 3D content, and 3D is complicated and expensive. We need to simplify authoring tools, standardize the content creation pipeline, and leverage well-tested design solutions to empower users.

In this article, I’d like to sketch out my vision for 3D design systems, from high-level pattern theory down to practical techniques such as generative art and artificial intelligence applications.


A design system is a collection of reusable components, guided by clear standards, that can be assembled together to build any number of applications. Design systems provide systematic solutions to guide and leverage collective efforts across different teams and stakeholders. With unified visual languages and reusable components, designers and developers can build cohesive experiences across all platforms better and faster.

Companies like Google, Airbnb, and Alibaba have all developed their own design systems. Creating unique open-source design systems is not only about making a bold brand statement but also about cultivating an ecosystem of adopters for community growth and iteration.

For 2D design systems, we have many great examples, such as Google’s Material Design, Alibaba’s Ant Design, and Shopify’s Polaris. In terms of design system tooling, there are Figma, Sketch, InVision, and others.

  • Make 3D content creation easier for everyone

The internet has evolved from a text-based world to 2D catalogues of images and videos, and will presumably move towards the 3D era — the metaverse. Every digital media era has spawned tools that let users easily produce, edit, and share content, such as the photo and video editing apps on your phone, so the general public can actively participate in content creation.

The metaverse needs massive amounts of 3D content to engage users, and 3D is expensive — to make, to understand, and to store. The production of 3D content is a painstaking task that is typically delegated to trained experts. Many authoring tools are expensive and have a steep learning curve, and you might also need to learn programming and game engines. Each tool is huge and complex, and even after learning all of them, can users really produce high-quality scenes?

  • Empower users for quick 3D prototyping

I have seen a lot of emerging web-based 3D design software that solves the problems of accessibility and real-time collaboration, but that alone just turns a desktop tool into a web app, the way Microsoft Office became Google Docs. Users who don’t understand 3D design still don’t know where to start or how to build high-quality scenes. I think we should consider developing 3D design systems: crafting building blocks and rules (relationships) so that users can build things on their own. These building blocks include not only spatial forms, materials, and shaders, but also generation logic, interaction methods, and data interfaces to other modules. General users can generate 3D content simply by dragging, dropping, and clicking. For users with some technical skills, we can provide development tools that let them customize and extend functionality, or create new modules.

The indie game Townscaper is, in my opinion, a design system. The developer pre-defined the modules of a medieval town and their rules; users can quickly generate scenes just by clicking, and the learning cost is basically zero.

Users of Townscaper can create 3D content by simply clicking
  • 3D design systems are about variety and community

Due to the richness of the three-dimensional world, 3D design systems will inevitably come in great variety, and it is unlikely that a few big-tech standards will dominate the market. We could have a gothic design system, a cyberpunk design system, or specific traffic planning systems, vegetation systems, character systems... endless possibilities. If you develop a design system in your own style, you can build a community around it, and then build a virtual world of your own.

Pattern theory

“A person with a pattern language does not need to be an “expert”. The expertise is in the language. He/she can contribute to planning and design because they know relevant patterns, how to combine them, and how the particular piece fits into the larger whole.” — Christopher Alexander, A Pattern Language

When it comes to scaling design, the first inspiration that popped into my mind is Christopher Alexander. He was an architect and design theorist whose influence reached far beyond architecture, into urban design, software, and sociology. Ironically, Alexander was considered controversial among some mainstream architects and critics in his own field, while his work gained greater recognition elsewhere. In software, Alexander is regarded as the father of the pattern language movement, and his work also influenced the development of agile software development.

Alexander is perhaps best known for his 1977 book A Pattern Language, which intended to give ordinary people, not only professionals, a way to design nurturing environments and build at any scale. His book directly inspired Will Wright, the creator of the city-building simulation game SimCity: “I’m interested in the process and strategies for design. The architect Christopher Alexander, in his book A Pattern Language, formalized a lot of spatial relationships into a grammar for design. I’d really like to work toward a grammar for complex systems and present someone with tools for designing complex things.”

As A Pattern Language explains: “Each pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in a way that you can use this solution a million times over, without ever doing it the same way twice.”

To UX designers, this concept sounds wonderfully familiar — it’s the basis of every design system. Design patterns are a universal resource to align best practices, describe the elements of good design, and, most importantly, provide a repository so that other people can easily reuse these solutions. Why reinvent the wheel when a perfectly good one already exists?¹

“Life” in the built environment

What makes a design good, and how do we evaluate designs? Can we study “feelings” objectively? Alexander aimed to create a scientific view of the world in which this concept — that everything has its degree of life — is well defined. In his later book The Nature of Order, Alexander attempts to define “life” in the built environment and determine why one built environment may have more life than another.

Sadly, he believed we have lost the ability to produce “living structures” in the world, especially after World War II. Much modern architecture is inert and makes people feel dead inside. The “living structure” that is needed to sustain and nurture us, and that did exist to some degree in traditional societies, rural communities, and early urban settlements, has disappeared.

What went wrong, and how might architecture correct its course? Beginning with his design patterns, he discovered that the designs that stirred up the most feeling in people, what he called living structures, shared certain qualities. This wasn’t just a hunch but a testable empirical theory, one that he validated and refined from the late 1970s until the turn of the century. He identified 15 qualities, each with a technical definition and many examples. As additions to pattern theory, these 15 geometric properties appeared to exist recursively in space whenever buildings had life. They seemed to define a more fundamental kind of stuff: similar to the patterns mentioned earlier, but more condensed and more essential — the stuff that all good patterns are made of².

The qualities are:

  • Levels of scale
  • Strong centers
  • Boundaries
  • Alternating repetition
  • Positive space
  • Good shape
  • Local symmetries
  • Deep interlock and ambiguity
  • Contrast
  • Gradients
  • Roughness
  • Echoes
  • The void
  • Simplicity and inner calm
  • Not-separateness

These 15 properties turned out to be a substrate of all patterns and began showing up more and more clearly in his works as the main correlates of living structures in places, buildings, things, space and so forth³.

Generation of a whole, living world

In Alexander’s view, a thing is not an isolated entity but part of an interconnected whole. To judge how ‘good’ a thing is, it must be assessed holistically, in its context. For example, a room is a substructure of a building, which is a substructure of a street, which is a substructure of a city, and so on up through the country to the entire Earth or the cosmos. On the other hand, that same room also contains far more small substructures than large ones at different levels of scale (or hierarchy); a wall with a painting, for example, may further contain far more ‘smalls’ than ‘larges’. Therefore, the goodness of that room depends on its adjacent rooms and things, on the smaller things within that room, and on the larger thing (the building) that contains it. It is essentially a recursive definition of the goodness of a space⁴.

Generative art

“Generative art refers to any art practice where the artist uses a system, such as a set of natural language rules, a computer program, a machine, or other procedural invention, which is set into motion with some degree of autonomy contributing to or resulting in a completed work of art.”

— Philip Galanter

There are many well-established generative techniques, and each of them could fill endless papers and research. In this chapter, I will give a brief introduction to these concepts, with examples in 3D content generation.

Randomness and noise

Using randomness and noise is probably the oldest, simplest, and most common method of generative art. The artist uses various parameters to define a space of variation, each taking values within a certain range, and the randomly selected values affect the manipulated elements to form the final work.

Noise is built on random numbers. Random numbers are discrete, and the value at each point has nothing to do with its neighbours; noise turns those discrete random values into a continuous function.

There are different kinds of noise, such as white noise, red noise, and Perlin noise. For more detail about noise theory, you can review this article:

Perlin noise is a procedural generation algorithm invented by Ken Perlin. It can be used to generate various effects with natural qualities, such as clouds, flame, terrain, landscapes, and patterned textures like marble. The algorithm can be implemented in any number of dimensions.
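To make this concrete, here is a minimal, self-contained sketch of 2D gradient (Perlin-style) noise in Python with NumPy; the output array can be used directly as a terrain heightmap. The function and parameter names are my own illustrative choices rather than any particular engine’s API.

```python
import numpy as np

def perlin2d(width, height, scale=16, seed=0):
    """Sample classic 2D gradient (Perlin-style) noise on a width x height grid."""
    rng = np.random.default_rng(seed)
    gx = int(np.ceil(width / scale)) + 1        # lattice size in x
    gy = int(np.ceil(height / scale)) + 1       # lattice size in y
    angles = rng.uniform(0.0, 2.0 * np.pi, (gy, gx))
    grads = np.stack([np.cos(angles), np.sin(angles)], axis=-1)   # random unit gradients

    xs = np.arange(width) / scale
    ys = np.arange(height) / scale
    X, Y = np.meshgrid(xs, ys)
    x0, y0 = X.astype(int), Y.astype(int)       # lattice cell of each sample
    fx, fy = X - x0, Y - y0                     # position inside the cell, in [0, 1)

    def corner(ix, iy, dx, dy):                 # dot(gradient at corner, offset to sample)
        g = grads[iy, ix]
        return g[..., 0] * dx + g[..., 1] * dy

    n00 = corner(x0,     y0,     fx,       fy)
    n10 = corner(x0 + 1, y0,     fx - 1.0, fy)
    n01 = corner(x0,     y0 + 1, fx,       fy - 1.0)
    n11 = corner(x0 + 1, y0 + 1, fx - 1.0, fy - 1.0)

    # Perlin's fade curve 6t^5 - 15t^4 + 10t^3 gives smooth interpolation between corners.
    u = fx ** 3 * (fx * (fx * 6 - 15) + 10)
    v = fy ** 3 * (fy * (fy * 6 - 15) + 10)
    nx0 = n00 * (1 - u) + n10 * u
    nx1 = n01 * (1 - u) + n11 * u
    return nx0 * (1 - v) + nx1 * v

heightmap = perlin2d(256, 256, scale=32, seed=42)   # 2D array of smoothly varying heights
```

Summing several octaves of this function at increasing frequencies and decreasing amplitudes gives the fractal-looking terrain familiar from games like Minecraft.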

Minecraft terrain generation with Perlin noise

You can also play with the noise-based generative biomes below; click randomize to create new ones or customize them yourself.

Chaos theory and fractals

Chaos is the science of surprises, of the nonlinear and the unpredictable. It teaches us to expect the unexpected. While most traditional science deals with supposedly predictable phenomena like gravity, electricity, or chemical reactions, chaos theory deals with nonlinear things that are effectively impossible to predict or control, like turbulence, weather, the stock market, our brain states, and so on. These phenomena are often described by fractal mathematics, which captures the infinite complexity of nature. Many natural objects exhibit fractal properties, including landscapes, clouds, trees, organs, and rivers, and many of the systems in which we live exhibit complex, chaotic behaviour.

A fractal is a never-ending pattern. Fractals are infinitely complex patterns that are self-similar across different scales. They are created by repeating a simple process over and over in an ongoing feedback loop. Driven by recursion, fractals are images of dynamic systems — the pictures of Chaos. Geometrically, they exist in between our familiar dimensions. Fractal patterns are extremely familiar since nature is full of fractals. For instance: trees, rivers, coastlines, mountains, clouds, seashells, hurricanes, etc.⁵

When extending fractals to 3D, there are many great examples and tools. Hindu temples feature self-similar, fractal-like structures. The Mandelbulb is a three-dimensional fractal, constructed for the first time in 1997 by Jules Ruis and further developed in 2009 by Daniel White and Paul Nylander. Mandelbulb 3D is a free software application created for 3D fractal imaging.

Hindu temples feature self-similar, fractal-like structures, where parts resemble the whole.
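For intuition, here is a small escape-time sketch of the power-8 Mandelbulb in Python, following the White/Nylander idea of raising a 3D point to a power in spherical coordinates; the grid resolution, iteration count, and bailout radius are arbitrary illustrative values.

```python
import numpy as np

def in_mandelbulb(c, power=8, max_iter=20, bailout=2.0):
    """Escape-time membership test for the power-8 Mandelbulb at point c = (cx, cy, cz)."""
    x = y = z = 0.0
    for _ in range(max_iter):
        r = np.sqrt(x * x + y * y + z * z)
        if r > bailout:
            return False                      # the orbit escaped: c is outside
        # White/Nylander "triplex" power: raise to `power` in spherical coordinates.
        theta = np.arctan2(np.sqrt(x * x + y * y), z)
        phi = np.arctan2(y, x)
        rn = r ** power
        x = rn * np.sin(power * theta) * np.cos(power * phi) + c[0]
        y = rn * np.sin(power * theta) * np.sin(power * phi) + c[1]
        z = rn * np.cos(power * theta) + c[2]
    return True                               # never escaped: treat c as inside

# Sample a coarse grid to get a voxel point cloud of the fractal.
grid = np.linspace(-1.2, 1.2, 40)
points = [(i, j, k) for i in grid for j in grid for k in grid if in_mandelbulb((i, j, k))]
print(len(points), "voxels inside the bulb")
```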

L-system

Biologist Aristid Lindenmayer created Lindenmayer systems, or L-systems, in 1968 to formalize bacteria growth patterns. L-systems are a recursive, string-rewriting framework, commonly used today in computer graphics to visualize and simulate organic growth, with applications in plant development, procedural content generation, and fractal-like art.⁶

L-systems are often used by artists to generate plant forms or to simulate how plants grow.
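At its core, an L-system is just repeated string rewriting. The sketch below expands the classic “fractal plant” rules in Python; the resulting string is then interpreted as turtle-graphics drawing commands. This is a generic illustration, not tied to any specific tool mentioned here.

```python
# Rewrite rules for the classic "fractal plant" L-system.
RULES = {"X": "F+[[X]-X]-F[-FX]+X", "F": "FF"}

def expand(axiom: str, iterations: int) -> str:
    """Apply the rewrite rules to every symbol in parallel, `iterations` times."""
    s = axiom
    for _ in range(iterations):
        s = "".join(RULES.get(ch, ch) for ch in s)
    return s

# The expanded string is then read as turtle commands:
# F = draw forward, + / - = turn left / right, [ / ] = push / pop the turtle state.
print(expand("X", 3)[:80], "...")
```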

Trees generated using an L-system in 3D

Here is an interactive online tool to tweak and play with 2D L-system generators:

Michael Hansmeyer is an architect and programmer who explores the use of algorithms and computation to generate architectural forms. Inspired by cell division, he wrote a design algorithm that produces stunningly flamboyant shapes with countless facets. No one could draw them by hand, but they can be built — and they challenge the way we think about conventional architectural form.

Michael Hansmeyer subdivided columns

Here’s Michael Hansmeyer’s TED 2012 talk Building unimaginable shapes:

Shape grammars

Shape grammars, in computation, are a specific class of production systems that generate geometric shapes. Typically, the shapes are two- or three-dimensional, and thus shape grammars are a way to study two- and three-dimensional languages. Shape grammars were first introduced in a seminal article by George Stiny and James Gips in 1971.

Shape grammars are “a set of shape rules that apply in a step-by-step way to generate a set, or language, of designs” (Shape grammars in education and practice, by Terry Knight, 1999) in both 2D and 3D space. They can serve as a simple starting point for sophisticated and complex designs.⁷

Here is a shape grammar that recreates Palladian villas; the Palladian grammar was originally proposed by Stiny and Mitchell.

In Unreal’s The Matrix Awakens, the building generator uses a shape grammar language to style the building volume. Each different building style has a different set of rules.

Screenshot of Unreal tech talk
Screenshot of Unreal tech talk

You can also watch the tech talk of The Matrix Awakens: Generate a world.

CityEngine is a tool that lets you generate shapes procedurally, i.e. by writing the rules that describe them instead of modelling their geometry directly. This strategy, also known as grammar-based modelling, is particularly useful when shapes that obey certain standardized rules need to be created in large numbers, which makes it very suitable for generating urban environments. CityEngine offers its own programming language, called CGA shape grammar, created specifically for writing rules that generate architectural 3D content.
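To show the flavour of such rules without any particular tool, here is a toy split grammar in Python; it is my own simplified sketch, not actual CGA syntax. A labelled facade rectangle is rewritten into floors, and each floor into window bays, one derivation step at a time.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Shape:
    label: str
    x: float
    y: float
    w: float
    h: float

def apply_rules(shape: Shape) -> List[Shape]:
    """One derivation step: rewrite a labelled shape into child shapes."""
    if shape.label == "facade":          # facade -> stack of floors
        floor_h = 3.0
        n = int(shape.h // floor_h)
        return [Shape("floor", shape.x, shape.y + i * floor_h, shape.w, floor_h)
                for i in range(n)]
    if shape.label == "floor":           # floor -> row of window bays
        bay_w = 2.5
        n = int(shape.w // bay_w)
        return [Shape("bay", shape.x + i * bay_w, shape.y, bay_w, shape.h)
                for i in range(n)]
    return [shape]                        # terminal shape: keep as-is

def derive(shapes: List[Shape], steps: int) -> List[Shape]:
    for _ in range(steps):
        shapes = [child for s in shapes for child in apply_rules(s)]
    return shapes

facade = Shape("facade", 0.0, 0.0, 20.0, 12.0)
print(len(derive([facade], 2)), "terminal shapes")  # 4 floors x 8 bays = 32
```

In a real grammar the terminal shapes would be replaced by window, door, and wall assets, but the rewriting structure stays the same.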

Wave function collapse

Wave function collapse is an algorithm developed by Maxim Gumin as a texture synthesis method based on simple configurations or sample images. It is a constraint-based procedural algorithm, inspired by and named after the concept of wave function collapse from quantum physics. In quantum physics, wave function collapse is the idea that the unobserved state of a particle can be anything; as soon as the particle is observed, the possibilities disappear and the wave function collapses. The same idea is the backbone of the procedural algorithm.
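Below is a heavily simplified, one-dimensional “simple tiled” sketch of the observe-and-propagate loop in Python. It is my own toy version, not Gumin’s reference implementation: tiles are single characters, and the only constraint is which tiles may sit next to each other.

```python
import random

TILES = "SCL"                                   # Sea, Coast, Land
ALLOWED = {("S", "S"), ("S", "C"), ("C", "S"), ("C", "C"),
           ("C", "L"), ("L", "C"), ("L", "L")}  # sea never touches land directly

def collapse_row(n, seed=0):
    random.seed(seed)
    wave = [set(TILES) for _ in range(n)]       # every cell starts fully "unobserved"
    while any(len(cell) > 1 for cell in wave):
        # Observe: pick the undecided cell with the fewest remaining options (lowest entropy).
        i = min((k for k, cell in enumerate(wave) if len(cell) > 1),
                key=lambda k: len(wave[k]))
        wave[i] = {random.choice(sorted(wave[i]))}
        # Propagate: prune neighbours until nothing changes any more.
        changed = True
        while changed:
            changed = False
            for a in range(n):
                for b in (a - 1, a + 1):
                    if 0 <= b < n:
                        keep = {t for t in wave[a]
                                if any((t, u) in ALLOWED for u in wave[b])}
                        if keep != wave[a]:
                            wave[a], changed = keep, True
    return "".join(next(iter(cell)) for cell in wave)

print(collapse_row(40))    # e.g. a strip like "LLLCCSSSSCCLLL...", sea never next to land
```

The real algorithm works on 2D or 3D grids, weights tiles by how often they appear in the sample input, and handles contradictions by restarting or backtracking.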

This article below did a great job explaining the wave function collapse algorithm:

Developer Marian used the wave function collapse algorithm to build an infinite city generator:

One of the best-known games developed with the wave function collapse algorithm is Townscaper. It is a small game with big ambitions: a city builder in which players simply add or remove blocks in the game world. With this limited toolset, you can craft everything from idyllic seaside towns to a horizon-spanning metropolis.

You can try and play Townscaper’s web version here:

Markov algorithm

In theoretical computer science, a Markov algorithm is a string rewriting system that uses grammar-like rules to operate on strings of symbols. Markov algorithms have been shown to be Turing-complete, which means that they are suitable as a general model of computation and can represent any mathematical expression from its simple notation. Markov algorithms are named after the Soviet mathematician Andrey Markov, Jr.

Based on Markov algorithms, developer Maxim Gumin created MarkovJunior, a probabilistic language built on pattern matching and constraint propagation.

Probabilistic inference in MarkovJunior makes it possible to impose constraints on the future state and to generate only those runs that lead to the constrained future. For example, inference over the Sokoban rules {RWB=BRW RB=BR} makes a group of (red) agents organize (white) crates into specified shapes.

Using these ideas, Gumin constructs many probabilistic generators of dungeons, architecture, puzzles, and fun simulations.
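The core rewrite loop is simple enough to sketch. Here is a bare-bones, generic string-rewriting version in Python, not Gumin’s actual implementation: rules are tried in priority order, the first rule that matches rewrites one occurrence, and the process repeats until nothing matches.

```python
import random

def run_markov(s, rules, max_steps=10_000, seed=0):
    """Apply string-rewriting rules until none matches (or we hit max_steps)."""
    random.seed(seed)
    for _ in range(max_steps):
        for lhs, rhs in rules:                   # rules are tried in priority order
            matches = [i for i in range(len(s) - len(lhs) + 1) if s[i:i + len(lhs)] == lhs]
            if matches:
                i = random.choice(matches)       # MarkovJunior-style: rewrite a random match
                s = s[:i] + rhs + s[i + len(lhs):]
                break
        else:
            return s                             # no rule matched: the algorithm halts
    return s

# One rule, "RB=WR": a (R)ed agent walks through (B)lack cells, leaving a (W)hite trail.
print(run_markov("R" + "B" * 40, [("RB", "WR")]))
```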

Here is the source code of MarkovJunior:

Tessellation

A tessellation or tiling is the covering of a surface, often a plane, using one or more geometric shapes, called tiles, with no overlaps and no gaps. In mathematics, tessellation can be generalized to higher dimensions and a variety of geometries. Tessellations have given rise to many types of tiling puzzles, from traditional jigsaw puzzles (with irregular pieces of wood or cardboard) and the tangram to more modern puzzles which often have a mathematical basis.

In architecture, tessellations have been used to create decorative motifs since ancient times. Mosaic tilings often had geometric patterns. Tessellations frequently appeared in the graphic art of M. C. Escher; he was inspired by the Moorish use of symmetry in places such as the Alhambra when he visited Spain in 1936.⁹

Lizard, M.C. Escher

Tessellation can be extended to three dimensions. Certain polyhedra can be stacked in a regular crystal pattern to fill (or tile) three-dimensional space, including the cube (the only Platonic polyhedron to do so), the rhombic dodecahedron, the truncated octahedron, triangular, quadrilateral, and hexagonal prisms, among others.

Spiralling close-packers of 3D space
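As a small concrete example of a 3D tessellation, the truncated octahedra mentioned above are the Voronoi cells of a body-centred cubic (BCC) lattice, so listing BCC points is enough to lay out the tiling; instancing one cell mesh at each point fills space with no gaps. The function name and spacing below are illustrative.

```python
import numpy as np

def bcc_centres(n, spacing=1.0):
    """Centres of an n x n x n block of a body-centred cubic (BCC) lattice."""
    corners = np.array([(x, y, z)
                        for x in range(n) for y in range(n) for z in range(n)], dtype=float)
    body = corners + 0.5            # one extra point at the centre of each cube cell
    return np.concatenate([corners, body]) * spacing

centres = bcc_centres(4)
print(centres.shape)                # (128, 3): 64 corner points + 64 body-centre points
```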

Genetic algorithm

A genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on biologically inspired operators such as mutation, crossover and selection.¹⁰

A single loop of a typical genetic algorithm runs from initial Selection, through Crossover, to Mutation. If the result does not meet the criteria, the loop restarts, using the new offspring as the next population of chromosomes. The picture below shows how a GA works.

GAs are especially suitable for optimization problems with a very large number of possible solutions. I believe the multiplicity of candidates and the selection mechanism of genetic algorithms can benefit an architectural evolution process.

In the example below, Nathaniel Louis Jones used GAs as an evolution tool to create fitter generations of houses, ranked by lighting, heating, and functional criteria. “After a few runs of the algorithm, the architect has many fit design options to choose from… A few solutions discovered by the GA are notable for their display of machine creativity, adaptations that seem particularly well thought-out even though no human intelligence is behind them,” said Jones.¹¹
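As a minimal, generic illustration of the Selection → Crossover → Mutation loop described above (not the tool used in Jones’s study), here is a GA in Python that evolves bitstrings, with a toy fitness function standing in for real lighting or heating scores.

```python
import random

random.seed(1)
GENES, POP, GENERATIONS, MUTATION = 20, 30, 40, 0.02

def fitness(individual):                  # toy stand-in for lighting/heating scoring
    return sum(individual)

def select(population):                   # tournament selection of one parent
    return max(random.sample(population, 3), key=fitness)

def crossover(a, b):                      # single-point crossover
    cut = random.randrange(1, GENES)
    return a[:cut] + b[cut:]

def mutate(individual):                   # flip each gene with a small probability
    return [g ^ 1 if random.random() < MUTATION else g for g in individual]

population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
for _ in range(GENERATIONS):
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(POP)]

best = max(population, key=fitness)
print("best fitness:", fitness(best), "/", GENES)
```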

Cellular automaton

Cellular automata were studied in the early 1950s as a possible model for biological systems. A cellular automaton is a collection of “coloured” cells on a grid of specified shape that evolves through a number of discrete time steps according to a set of rules based on the states of neighbouring cells. The rules are applied iteratively for as many time steps as desired. John von Neumann was one of the first people to consider such a model, and he incorporated a cellular model into his “universal constructor”.¹²

In two dimensions, the best-known cellular automaton is Conway’s Game of Life, devised by J. H. Conway in 1970.

Conway’s Game of Life
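The rules fit in a few lines. Here is one update step of the Game of Life in Python/NumPy on a wrapping grid: a dead cell with exactly three live neighbours is born, and a live cell with two or three live neighbours survives.

```python
import numpy as np

def life_step(grid: np.ndarray) -> np.ndarray:
    """grid: 2D array of 0/1 cells; returns the next generation (toroidal edges)."""
    # Count the eight neighbours of every cell by shifting the grid around.
    neighbours = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if (dy, dx) != (0, 0))
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(int)

rng = np.random.default_rng(0)
world = rng.integers(0, 2, (32, 32))
for _ in range(10):
    world = life_step(world)
print(world.sum(), "live cells after 10 steps")
```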

Extending cellular automata to 3D, the project below demonstrates growing Minecraft-like 3D artifacts with neural cellular automata.

Emergence and swarm intelligence

Swarm intelligence systems consist typically of a population of simple agents or boids interacting locally with one another and with their environment. The inspiration often comes from nature, especially biological systems. The agents follow very simple rules, and although there is no centralized control structure dictating how individual agents should behave, local, and to a certain degree random, interactions between such agents lead to the emergence of “intelligent” global behaviour, unknown to the individual agents. Examples of swarm intelligence in natural systems include ant colonies, bee colonies, bird flocking, hawks hunting, animal herding, bacterial growth, fish schooling and microbial intelligence.

As with most artificial life simulations, Boids is an example of emergent behaviour; that is, the complexity of Boids arises from the interaction of individual agents (the boids, in this case) adhering to a set of simple rules. The rules applied in the simplest Boids world are as follows:

  • separation: steer to avoid crowding local flockmates
  • alignment: steer towards the average heading of local flockmates
  • cohesion: steer to move toward the average position (center of mass) of local flockmates

More complex rules can be added, such as obstacle avoidance and goal-seeking.¹³
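The three rules translate almost directly into code. Below is a bare-bones 2D boids update in Python/NumPy; the neighbourhood radius, weights, and speed limit are arbitrary illustrative values.

```python
import numpy as np

def boids_step(pos, vel, radius=2.0, max_speed=0.5,
               w_sep=0.05, w_ali=0.05, w_coh=0.01):
    """One update of the classic three boids rules for positions `pos` and velocities `vel`."""
    new_vel = vel.copy()
    for i in range(len(pos)):
        offsets = pos - pos[i]
        dist = np.linalg.norm(offsets, axis=1)
        near = (dist < radius) & (dist > 0)          # local flockmates, excluding self
        if not near.any():
            continue
        separation = -offsets[near].sum(axis=0)       # steer away from crowding
        alignment = vel[near].mean(axis=0) - vel[i]   # match neighbours' average heading
        cohesion = pos[near].mean(axis=0) - pos[i]    # move toward local centre of mass
        new_vel[i] += w_sep * separation + w_ali * alignment + w_coh * cohesion
        speed = np.linalg.norm(new_vel[i])
        if speed > max_speed:                         # clamp speed so the flock stays stable
            new_vel[i] *= max_speed / speed
    return pos + new_vel, new_vel

rng = np.random.default_rng(0)
pos = rng.uniform(0, 20, (50, 2))
vel = rng.uniform(-0.2, 0.2, (50, 2))
for _ in range(100):
    pos, vel = boids_step(pos, vel)
```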

In the modern entertainment industry, large scenes in movies and games, such as crowds, battles, and animal herds, are increasingly generated with the help of swarm simulation.

Screenshot of Horizon Zero Dawn

Artificial intelligence

AI is a rapidly evolving field. What I’m trying right now might be completely obsolete in six months. As scary as that sounds, I believe AI-backed 3D content creation tools will take off and enable users to populate the metaverse; it is just a matter of time.

For text-to-2D-image generation, AI tools like DALL·E and Midjourney can already do an amazing job. Users can just type text prompts to generate stunning images.

Image generated by Midjourney, text prompt: beautiful, fantasy city unreal engine

Trained using only 2D images, NVIDIA GET3D generates 3D shapes with high-fidelity textures and complex geometric details.

Google Research has unveiled DreamFusion, a new method of generating 3D models from text prompts. The approach, which combines a text-to-2D-image diffusion model with Neural Radiance Fields (NeRF), generates textured 3D models of a quality suitable for use in AR projects, or as base meshes for sculpting.¹⁴

And crucially, it does not require a set of real 3D models to use as training data — potentially paving the way to the development of practical, mass-market AI-based text-to-3D tools.

You can also use AI tools to animate 3D assets, do motion tracking, and automate rigging.

Maybe in the near future, users will be able to combine a set of AI tools to describe a scene in text prompts and generate high-quality 3D assets with ease. This would fundamentally change the content creation landscape and make the immersive 3D metaverse a reality.

Easy to use and access

The products of 3D design systems need to be intuitive for users without design experience: they should expose minimal complexity, yet remain flexible enough that users can get the most out of them while learning only what they need to know.

Visual programming languages are a user-friendly way to carry out content creation tasks. Node-based interfaces are very common in design and development tools, such as Scratch, Houdini, Unreal Blueprint, Grasshopper, and Blender geometry nodes. Users can combine and adjust pre-defined design packages to build scenes and apply interactions and visual effects.
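Under the hood, node-based tools are essentially dependency graphs evaluated lazily. Here is a minimal, generic sketch of that idea in Python; the names and structure are my own illustration, not any particular tool’s API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

@dataclass
class Node:
    op: Callable[..., Any]                              # the node's function
    inputs: List[str] = field(default_factory=list)     # names of upstream nodes

def evaluate(graph: Dict[str, Node], name: str, cache=None) -> Any:
    """Pull-based evaluation: compute upstream nodes first and memoize results."""
    cache = {} if cache is None else cache
    if name not in cache:
        node = graph[name]
        args = [evaluate(graph, upstream, cache) for upstream in node.inputs]
        cache[name] = node.op(*args)
    return cache[name]

# Toy graph: a grid of points -> jitter them -> count them.
graph = {
    "grid":   Node(lambda: [(x, y) for x in range(4) for y in range(4)]),
    "jitter": Node(lambda pts: [(x + 0.1, y - 0.1) for x, y in pts], ["grid"]),
    "count":  Node(lambda pts: len(pts), ["jitter"]),
}
print(evaluate(graph, "count"))                          # 16
```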

Block-based Scratch
Procedural and VFX tool, node-based Houdini

The tools should be easy to access and easy to collaborate on with your team, so they need to be web-based. We need to craft browser-based geometry engines.

Here are some web-based 3D design products that I found promising:

Easy to integrate, extend, and interoperate

3D design systems need to leverage design solutions built with existing tools, and also give advanced users options to extend functionality. 3D asset interoperability is key to making sure everyone builds on top of the same foundation.

“The Metaverse Standards Forum is a unique venue for coordination between standards organizations and industry, with a mission to foster the pragmatic and timely standardization that will be essential to an open and inclusive metaverse.” Technology and standards are the bricks and cement of the metaverse.

Design-driven, systems-thinking, and community-maintained

3D design systems are about exploring design processes and strategies that can scale. Systems thinking is a way of making sense of the world’s complexity by looking at it in terms of wholes and relationships rather than by splitting it into its parts. There are tons of design solutions we can leverage and integrate; there is no need to reinvent wheels from scratch. We just need to make the system easy to use and easy to evolve.

Design systems won’t flourish without the community’s support. We need a variety of design solutions, styles, motions, and interactions to populate the “system library” to empower end users.
