Bela Bohlender

@bbohlender

⚡️ Rendering User Interfaces on the GPU

Rendering user interfaces on the GPU allows mixing user interfaces with 3D content and offloads work from the CPU by leveraging the speed of the GPU. In this article, we share the lessons learned from building pmndrs/uikit, a three.js library for rendering user interfaces on the GPU. First, we discuss how to render modern user interfaces on the GPU. Then, we explain what we do to make rendering user interfaces fast ⚡️.

Rendering User Interfaces on the GPU

Modern user interfaces are typically composed of rounded panels with borders, texts, and icons. The following image highlights the panels 🟨, texts 🟥, and icons 🟦 that make up the YouTube web user interface.

YouTube web user interface with highlighted panels, texts, and icons

Rendering Panels

We render a panel as a rectangle that can have borders and rounded corners. Panels can have a background color or display an image. uikit renders a panel using only two triangles. A custom shader draws the border and discards the pixels in the corners to make the panel rounded. The shader used in pmndrs/uikit receives the information required for rendering via two 4x4 matrices containing the

  • transformation matrix
  • background color
  • background opacity
  • border sizes for all 4 sides
  • border radii for all 4 corners
  • border color
  • border bend
  • border opacity

The border bend value smoothly bends the normals of the border, creating an efficient 3D effect without adding additional geometry.

Example image of a panel using border bend
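To make the matrix encoding concrete, the following sketch packs the per-panel properties listed above into two 4x4 matrices (32 floats) for a single instanced attribute. The layout, the `packRGB` trick of storing a color in one float slot, and all function names are illustrative assumptions; pmndrs/uikit's actual packing differs in detail.

```typescript
// Hypothetical packing of per-panel properties into two 4x4 matrices
// (32 floats). NOT uikit's real layout - an illustration of the idea.

// Pack an RGB color (components in 0..1) into one integer so a color
// occupies a single matrix slot. Values below 2^24 are exact in float32.
const packRGB = (r: number, g: number, b: number): number =>
  (Math.round(r * 255) << 16) | (Math.round(g * 255) << 8) | Math.round(b * 255)

function packPanel(
  transform: Float32Array, // column-major 4x4 transformation matrix
  backgroundColor: [number, number, number],
  backgroundOpacity: number,
  borderSizes: [number, number, number, number], // top, right, bottom, left
  borderRadii: [number, number, number, number], // one per corner
  borderColor: [number, number, number],
  borderBend: number,
  borderOpacity: number,
): Float32Array {
  const data = new Float32Array(32)
  data.set(transform, 0) // matrix 1: the transformation matrix
  data[16] = packRGB(...backgroundColor) // matrix 2: remaining properties
  data[17] = backgroundOpacity
  data.set(borderSizes, 18)
  data.set(borderRadii, 22)
  data[26] = packRGB(...borderColor)
  data[27] = borderBend
  data[28] = borderOpacity
  return data
}
```

The shader then reads both matrices per instance and unpacks the colors back into RGB in the fragment stage.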

To support scrolling and hiding overflowing elements, panels need to support clipping. Clipping occurs when the parent container instructs its children to hide overflowing content. We use an additional 4x4 matrix to store the 3D clipping planes required for clipping, which are computed based on the ancestors in the user interface. 3D clipping planes are required when a user interface allows transforming (rotating, translating, scaling) individual user interface elements.
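The clipping test itself is cheap: each of the four planes is a normal plus a distance, and a fragment is kept only if it lies on the positive side of all of them. The following sketch shows this test on the CPU for clarity (in practice it runs in the fragment shader); the types and the axis-aligned example planes are assumptions for illustration.

```typescript
// A clipped region is described by four 3D planes, one per row of a 4x4
// matrix. A point is visible when it lies on the positive side of all
// four planes: dot(normal, point) + distance >= 0.
type Vec3 = [number, number, number]
type Plane = [number, number, number, number] // normal xyz + distance

function isVisible(planes: [Plane, Plane, Plane, Plane], p: Vec3): boolean {
  return planes.every(
    ([nx, ny, nz, d]) => nx * p[0] + ny * p[1] + nz * p[2] + d >= 0,
  )
}

// Axis-aligned example: planes bounding x in [-1, 1] and y in [-1, 1].
// Rotating or scaling an ancestor would rotate/scale these normals too,
// which is why general 3D planes are needed instead of a 2D scissor rect.
const clipPlanes: [Plane, Plane, Plane, Plane] = [
  [1, 0, 0, 1],  // x >= -1
  [-1, 0, 0, 1], // x <= 1
  [0, 1, 0, 1],  // y >= -1
  [0, -1, 0, 1], // y <= 1
]
```

In the shader, fragments failing this test are discarded, which hides overflowing content.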

Rendering Texts and Icons

In contrast to simple panels, texts and icons are typically stored as vector graphics composed of multiple mathematical forms. Rendering these mathematical forms directly on the GPU is inefficient. Instead, vector graphics are typically encoded into textures. The textures contain the distance to the vector graphic at each pixel, which is called a signed distance field (SDF). An SDF allows vector graphics to be rendered efficiently on the GPU while preserving their clarity. Since textures typically provide at least 3 color channels, we use a multi-channel SDF (MSDF), an extension of the SDF technique proposed by Valve, which improves visual quality significantly, especially at sharp corners, without losing performance compared to an SDF using a single channel.
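When sampling an MSDF, the three channels are combined with a median, and the resulting distance is mapped to pixel coverage with a smoothstep around the 0.5 isoline. The sketch below shows this core computation in TypeScript for readability; in uikit it runs as fragment shader code, and the `smoothing` width is an assumed constant (real implementations derive it from screen-space derivatives).

```typescript
// Median of the three MSDF channels approximates the true signed
// distance while keeping sharp corners that a single channel would round.
const median = (r: number, g: number, b: number): number =>
  Math.max(Math.min(r, g), Math.min(Math.max(r, g), b))

// Distances are encoded so that 0.5 lies exactly on the glyph outline:
// > 0.5 is inside, < 0.5 is outside. A smoothstep around 0.5 yields an
// antialiased coverage value in 0..1.
function coverage(r: number, g: number, b: number, smoothing = 0.05): number {
  const d = median(r, g, b)
  const t = Math.min(Math.max((d - (0.5 - smoothing)) / (2 * smoothing), 0), 1)
  return t * t * (3 - 2 * t) // smoothstep
}
```

The coverage value is multiplied with the text color's alpha to produce the final fragment.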

Similar to the panels, texts and icons need to support clipping. Therefore, they also receive a 4x4 matrix that encodes four 3D clipping planes.

An implementation for creating an MSDF and the shader code for rendering it on the GPU can be found here.

Transparency and Ordering

When rendering user interfaces on the GPU, child elements must be rendered after their parents for two reasons. First, the colors of stacked, potentially transparent user interface elements must be blended correctly, which requires rendering them from back to front. Second, 3D engines often use a depth buffer to allow order-independent rendering. However, user interfaces rarely contain depth and often require rendering elements at the same depth. Therefore, we need to manually sort the user interface elements based on their parent-child relationship and disable depth writing in the 3D engine (e.g., depthWrite = false in three.js).
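One simple way to derive this manual order is to sort elements by their depth in the UI hierarchy, so parents always precede their children. The sketch below uses a hypothetical element shape for illustration; in three.js the resulting index would typically be assigned to each mesh's renderOrder, together with material.depthWrite = false.

```typescript
// Hypothetical minimal element type: each element knows only its parent.
interface UIElement {
  id: string
  parent?: UIElement
}

// Depth in the hierarchy: the root is 0, its children 1, and so on.
const hierarchyDepth = (e: UIElement): number =>
  e.parent === undefined ? 0 : hierarchyDepth(e.parent) + 1

// Back-to-front order for a flat UI: shallower elements are rendered
// first so that children blend on top of their (possibly transparent)
// parents.
function backToFront(elements: UIElement[]): UIElement[] {
  return [...elements].sort((a, b) => hierarchyDepth(a) - hierarchyDepth(b))
}

// The example from the article: panel A contains text, which contains
// panel B.
const panelA: UIElement = { id: 'A' }
const text: UIElement = { id: 'Hello World', parent: panelA }
const panelB: UIElement = { id: 'B', parent: text }
```

Sorting by hierarchy depth alone is enough here because siblings never overlap in typical layouts, so their relative order does not matter.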

Speeding up the rendering ⚡️

Now that we can render modern user interfaces with panels, texts, icons, transparency, and correct ordering, we can start thinking about performance. User interfaces often consist of hundreds of individual elements. If the CPU instructs the GPU to draw each element individually, the CPU constantly waits for the GPU to draw while the GPU constantly waits for the CPU to provide the next instruction. Instead, to render user interfaces efficiently, the GPU should be instructed to draw the whole user interface using as few draw calls as possible (see Instancing). Since each draw call can only use one shader, we render all panels using the same shader in one draw call and all texts with another shader in a second draw call. Just as before, we need to render elements from back to front. Therefore, rendering user interface elements with as few draw calls as possible requires grouping elements based on their type and their parent-child relationship. The following image illustrates a user interface where a panel (A) contains text (Hello World), which contains another panel (B).

Image illustrating the example with nested panel A, text, and panel B

In this example, panels A and B use the same shader code, but they cannot be rendered in the same draw call since the text is placed between panel A and panel B and, therefore, must be rendered in front of panel A and behind panel B. Luckily, in most user interfaces, texts sit in front of the panels, allowing the whole interface to be rendered using two draw calls.
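The grouping rule can be stated simply: given the elements already sorted back to front, one draw call can cover each run of consecutive elements that share a shader. The following sketch, using assumed types for illustration, makes the A / Hello World / B example concrete.

```typescript
// Each element has a kind that determines its shader.
type Kind = 'panel' | 'text'
interface Item {
  id: string
  kind: Kind
}

// Group a back-to-front sorted element list into draw-call batches:
// consecutive elements of the same kind share one batch (one draw call).
function batchDrawCalls(sorted: Item[]): { kind: Kind; ids: string[] }[] {
  const batches: { kind: Kind; ids: string[] }[] = []
  for (const item of sorted) {
    const last = batches[batches.length - 1]
    if (last !== undefined && last.kind === item.kind) last.ids.push(item.id)
    else batches.push({ kind: item.kind, ids: [item.id] })
  }
  return batches
}
```

For the nested example (panel, text, panel) this yields three batches, while a layout with all texts in front of all panels collapses to two.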

The instructions that are sent to the GPU to render multiple elements at once are stored in a manually allocated buffer. Modifying parts of the buffer requires re-sending those parts to the GPU. Therefore, the goal is to reduce modifications to the buffer. When inserting elements into the buffer, the rendering order, which is based on the parent-child relationship, must be enforced. However, the rendering order between sibling elements (e.g., two panels placed side by side) is not relevant for rendering. Therefore, the position inside the buffer for inserting, replacing, and removing elements has some freedom. To reduce the number of buffer modifications, we use the concept of buckets, where each bucket contains all elements with the same hierarchical depth. Each element type uses a different set of buckets and a different buffer. The following image illustrates how user interface elements are allocated into buckets and how these buckets are allocated into the buffer.

Example of a user interface, the corresponding graph, the buckets derived from the graph, and the buffer allocation

Note that the example only illustrates the buckets and the buffer for the panels.
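The bucket idea can be sketched as follows. Elements of one type are grouped by hierarchical depth; the buffer is laid out as the buckets in depth order. Appending at the end of a bucket leaves all shallower buckets untouched, so only the trailing part of the buffer must be re-sent. The class below tracks element ids instead of raw floats to keep the sketch short; the names and API are assumptions, not uikit's internals.

```typescript
// Simplified bucket allocator: one instance per element type.
class BucketBuffer {
  private buckets = new Map<number, string[]>() // depth -> element ids

  // Inserting at the END of the element's bucket exploits the freedom
  // in sibling ordering: siblings share a depth, so their relative
  // position is irrelevant, and no earlier bucket needs to move.
  insert(depth: number, id: string): void {
    const bucket = this.buckets.get(depth)
    if (bucket === undefined) this.buckets.set(depth, [id])
    else bucket.push(id)
  }

  // Buffer layout: buckets ordered by depth (back to front), elements
  // within a bucket in insertion order.
  layout(): string[] {
    return [...this.buckets.entries()]
      .sort(([a], [b]) => a - b)
      .flatMap(([, ids]) => ids)
  }
}
```

Removal works symmetrically: the last element of a bucket is swapped into the freed slot, again touching only a small buffer range.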

Conclusion

We showed how modern user interfaces can be rendered on the GPU and how techniques like instancing, efficient buffer allocation, and ordering make the rendering performant. Our library pmndrs/uikit demonstrates that even complex, interactive user interfaces, built with designs based on the shadcn component library, can be rendered efficiently on the GPU on the web. → Demo

pmndrs/uikit is free and open source; implementations of all described concepts can be found in the GitHub repository.

↳ If this article or pmndrs/uikit is helpful for you, consider supporting this project ❤️.

Example user interface built with pmndrs/uikit, copied from shadcn