QtWebEngine/Rendering

From Qt Wiki

This article is out of date and not updated for rendering in Qt 5.15 or the major change in Qt 6.5.

Qt WebEngine uses the Qt Quick scene graph for rendering to take advantage of HW acceleration as much as possible. Chromium, which is used in WebEngine to load and render web content, is also designed to make use of hardware acceleration as much as possible.

This document describes the overall concepts of how web content actually gets rendered into a web engine view and how the Qt and Chromium parts interact. It also describes the basic concepts of rendering in Chromium and provides further links to the Chromium documentation for reference.

Broad overview

A very simplified description of how the rendering works is this:

  • The Web Engine views tie into the scene graph as a QQuickItem, i.e. as a scene graph node including a tree of subnodes.
  • Chromium renders the web content and uploads the results either as textures to the GPU or as bitmaps to shared memory (depending on whether HW acceleration is used or not).
  • When a complete frame is ready to be displayed, Chromium notifies the WebEngine part by calling RenderWidgetHostViewQt::OnSwapCompositorFrame with the data that is needed to display the frame (e.g., textures and the geometries that they should be applied to)
  • WebEngine saves this frame data and tells the scene graph that it needs to be repainted
  • When the scene graph prepares rendering the next frame, it will update all the dirty nodes, which will end up calling DelegatedFrameNode::commit in WebEngine
  • DelegatedFrameNode::commit will go through all the geometries and textures for the frame, create corresponding scene graph nodes and apply the textures to them
  • When the scene graph renders the frame, it will call preprocess on all dirty nodes, which will end up in DelegatedFrameNode::preprocess, which will actually fetch the textures and wait for them to be generated.
  • The scene graph will then render using the nodes that have been prepared before.

The code doing most of what is described above can be found in src/core/delegated_frame_node.cpp and src/core/render_widget_host_view_qt.cpp
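The order of the steps above can be illustrated with a small self-contained model. This is only a sketch: `FrameData`, `ViewModel` and the `log` member are hypothetical names invented for illustration and are not the Qt WebEngine classes. The model only tracks which step runs when and which state each step depends on.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the data Chromium hands over with a frame.
struct FrameData {
    std::vector<std::string> textures;
};

// Hypothetical model of a view together with its scene graph node.
class ViewModel {
public:
    // Chromium side: a complete frame is ready. Save it and mark the node dirty.
    void onSwapCompositorFrame(FrameData frame) {
        m_pending = std::move(frame);
        m_dirty = true;
        log.push_back("swap");
    }

    // Scene graph, preparing the next frame: rebuild the nodes of a dirty item.
    void commit() {
        if (!m_dirty)
            return;
        m_nodes = std::move(m_pending.textures);
        m_pending.textures.clear();
        m_dirty = false;
        m_needsFetch = true;
        log.push_back("commit");
    }

    // Scene graph, right before rendering: make sure the textures are available.
    void preprocess() {
        if (!m_needsFetch)
            return;
        m_needsFetch = false;
        log.push_back("preprocess");
    }

    // Scene graph: draw with the nodes prepared earlier; returns how many were drawn.
    std::size_t render() {
        log.push_back("render");
        return m_nodes.size();
    }

    std::vector<std::string> log; // order in which the steps actually ran

private:
    FrameData m_pending;
    std::vector<std::string> m_nodes;
    bool m_dirty = false;
    bool m_needsFetch = false;
};
```

Calling `commit()` or `preprocess()` without a pending frame does nothing in this model, which mirrors the idea that only dirty nodes are updated.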

Rendering architecture in Chromium

Chromium is designed to make use of hardware acceleration for rendering and painting as much as possible. To do this, Blink moved away from the traditional model of painting directly into software bitmaps to a model of recording the painting calls in Skia Pictures (SkPictures), which record the draw commands and can be used later to actually paint the RenderObject they were created for. Each RenderObject is associated with a RenderLayer either directly or through its parent RenderObject. Through a series of compositing steps, RenderLayers are converted to GraphicsLayers, which are then associated with compositing layers that can be GPU backed and drawn using hardware acceleration.

The compositor then divides the layers into tiles and only rasterizes the tiles that are either in the viewport or adjacent to tiles in the viewport. This enables faster scrolling while still saving memory and resources by leaving out tiles that are further away from the viewport. These tiles and layers are then composited into a compositor frame, which mainly consists of RenderPasses and TransferableResources. This data is what is handed off to Qt WebEngine and used to generate nodes for the Qt Quick scene graph.
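The tile selection rule (tiles in the viewport plus their neighbours) can be sketched as follows. This is a simplification under stated assumptions: `Rect`, `tilesToRasterize` and the fixed one-tile border are hypothetical, and Chromium's real tile management is considerably more involved than this.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Rect { int x, y, width, height; };

// Returns (column, row) pairs of the tiles that intersect the viewport or are
// adjacent to such a tile. The one-tile border is an illustrative assumption.
std::vector<std::pair<int, int>> tilesToRasterize(const Rect &viewport, int tileSize,
                                                  int layerWidth, int layerHeight)
{
    assert(tileSize > 0 && viewport.width > 0 && viewport.height > 0);
    assert(viewport.x >= 0 && viewport.y >= 0);
    const int cols = (layerWidth + tileSize - 1) / tileSize;
    const int rows = (layerHeight + tileSize - 1) / tileSize;

    // Tiles covered by the viewport (inclusive indices) ...
    int firstCol = viewport.x / tileSize;
    int lastCol = (viewport.x + viewport.width - 1) / tileSize;
    int firstRow = viewport.y / tileSize;
    int lastRow = (viewport.y + viewport.height - 1) / tileSize;

    // ... grown by one tile in every direction and clamped to the layer.
    firstCol = std::max(firstCol - 1, 0);
    firstRow = std::max(firstRow - 1, 0);
    lastCol = std::min(lastCol + 1, cols - 1);
    lastRow = std::min(lastRow + 1, rows - 1);

    std::vector<std::pair<int, int>> tiles;
    for (int row = firstRow; row <= lastRow; ++row)
        for (int col = firstCol; col <= lastCol; ++col)
            tiles.emplace_back(col, row);
    return tiles;
}
```

For a 1024x4096 layer with 256 pixel tiles and a 512x512 viewport at (0, 1024), this selects 12 of the 64 tiles, so most of the layer is never rasterized until it is scrolled closer to the viewport.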

How does the conversion in DelegatedFrameNode::commit actually look?

The data that Chromium passes over to Qt basically consists of a list of RenderPasses and an array of TransferableResources that are used by the RenderPasses. These resources are, for example, textures that Chromium has uploaded to the GPU if hardware acceleration is used, or handles to bitmaps in shared memory if software rendering is used. The resources are associated with synchronization points so that they can be used across threads.

What happens in DelegatedFrameNode::commit is that first, the objects that were created for the previous frame (except the scene graph nodes themselves) are saved. The scene graph nodes used for the previous frame are deleted.

Then, the list of resources is traversed. If a resource was already in use previously, its reference count is increased; otherwise, a new ResourceHolder wrapper is created for it.

Then, the list of RenderPasses is traversed, with the root RenderPass being the last one. Each RenderPass except for the root RenderPass will be put on a QSGLayer and have its own QSGRootNode assigned to it. These objects are reused from the previous compositor frame if a RenderPass with the same ID was already present there.
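The reuse of per-pass objects by RenderPass ID can be sketched like this. `Layer` and `LayerCache` are hypothetical stand-ins for the QSGLayer bookkeeping; the only assumptions taken from the text are that the root pass is last in the list and that a layer is reused when its ID was present in the previous frame.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for the layer a non-root render pass is drawn into.
struct Layer {
    int passId;
};

class LayerCache {
public:
    // 'passIds' lists the render passes of one frame, root pass last.
    // Returns one layer per non-root pass, reusing layers from the previous frame.
    std::vector<std::shared_ptr<Layer>> assign(const std::vector<int> &passIds)
    {
        assert(!passIds.empty());
        std::map<int, std::shared_ptr<Layer>> current;
        std::vector<std::shared_ptr<Layer>> layers;
        for (std::size_t i = 0; i + 1 < passIds.size(); ++i) { // skip the root pass
            const int id = passIds[i];
            std::shared_ptr<Layer> layer;
            auto it = m_previous.find(id);
            if (it != m_previous.end()) {
                layer = it->second;              // same ID as last frame: reuse
            } else {
                layer = std::make_shared<Layer>();
                layer->passId = id;              // new pass: create a fresh layer
            }
            current[id] = layer;
            layers.push_back(layer);
        }
        m_previous.swap(current); // layers not seen in this frame are dropped
        return layers;
    }

private:
    std::map<int, std::shared_ptr<Layer>> m_previous;
};
```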

Then, the list of DrawQuads in the RenderPass is traversed and QSGNodes are created for each DrawQuad, with the actual subclass of QSGNode being instantiated depending on the DrawQuad type. After applying the DrawQuad's geometry and other resources such as color or texture to the scene graph node and initializing the texture for it, the node is appended to the list of children. If a DrawQuad or a collection of DrawQuads has clip rects, opacities or transforms applied to them, the corresponding scene graph nodes will be prepended to them in the scene graph node tree. The resources used by a quad will be removed from the list of resource candidates and be saved in the DelegatedFrameNode's collection of currently used resources.
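How state nodes end up in front of the content nodes can be sketched with a miniature node tree. `Node`, `QuadState` and `buildChain` are hypothetical names, and the nesting order chosen here (clip, then opacity, then transform) is an assumption made for illustration rather than something the text specifies.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a scene graph node.
struct Node {
    std::string kind;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string k) : kind(std::move(k)) {}
};

// Hypothetical subset of the state shared by a group of quads.
struct QuadState {
    bool clipped = false;
    float opacity = 1.0f;
    bool hasTransform = false;
};

// Inserts the required state nodes below 'parent' and returns the node that
// the content nodes of the quads should be appended to.
Node *buildChain(Node *parent, const QuadState &state)
{
    assert(parent != nullptr);
    Node *tail = parent;
    auto push = [&tail](const char *kind) {
        tail->children.push_back(std::unique_ptr<Node>(new Node(kind)));
        tail = tail->children.back().get();
    };
    if (state.clipped)
        push("clip");
    if (state.opacity < 1.0f)
        push("opacity");
    if (state.hasTransform)
        push("transform");
    return tail;
}
```

A group of quads without any clip, opacity or transform gets no extra nodes, so `buildChain` simply returns the parent in that case.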

After this has been done for all RenderPasses, the resources that were in the resource candidate list, but not used in this frame, are returned to Chromium in order to free them.
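The resource bookkeeping across one commit can be summarized in a short sketch. `Holder` and `ResourceTracker` are hypothetical, heavily reduced stand-ins: resources are plain integer IDs, and details such as how many references are handed back to the producer are left out.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical, reduced stand-in for the ResourceHolder wrapper.
struct Holder {
    int refCount = 0;
};

class ResourceTracker {
public:
    // 'received' lists the resource IDs sent with the frame, 'usedByQuads'
    // the IDs that the quads of the frame reference.
    // Returns the IDs that should be handed back to the producer.
    std::vector<int> commit(const std::vector<int> &received,
                            const std::vector<int> &usedByQuads)
    {
        // Everything held for the previous frame starts out as a candidate.
        std::map<int, Holder> candidates;
        candidates.swap(m_inUse);

        // Known resources get another reference, new ones a fresh holder.
        for (int id : received)
            ++candidates[id].refCount;

        // Resources referenced by a quad move from the candidates to the in-use set.
        for (int id : usedByQuads) {
            auto it = candidates.find(id);
            if (it == candidates.end())
                continue; // already moved by an earlier quad, or unknown ID
            m_inUse.insert(*it);
            candidates.erase(it);
        }

        // Whatever is left was not used by this frame and can be returned.
        std::vector<int> toReturn;
        for (const auto &entry : candidates)
            toReturn.push_back(entry.first);
        return toReturn;
    }

    int refCount(int id) const
    {
        auto it = m_inUse.find(id);
        return it == m_inUse.end() ? 0 : it->second.refCount;
    }

private:
    std::map<int, Holder> m_inUse;
};
```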

What happens in DelegatedFrameNode::preprocess

DelegatedFrameNode::preprocess goes through the list of resources that have been marked as in use by DelegatedFrameNode::commit and assembles a list that contains the resources whose textures have not been fetched from the Chromium GPU thread where they were generated to the main thread.

After assembling the list, it will go through it and post a task for each of the textures to run pullTexture for it on the GPU thread and wait until all these tasks have completed on the GPU thread. pullTexture will create a GL fence for each texture, which will be transferred to the main thread and be used to wait for the texture to be generated.
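The collect, post and wait pattern described here can be sketched with standard C++ threading primitives. This is an analogy only: a `std::promise`/`std::future` pair plays the role of the GL fence, a short-lived `std::thread` plays the role of the GPU thread, and `Resource` and both function bodies are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a resource whose texture may still need fetching.
struct Resource {
    bool fetched = false;
};

// Stand-in for pullTexture: runs on the other thread and signals completion.
static void pullTexture(Resource *resource, std::promise<void> fence)
{
    resource->fetched = true;
    fence.set_value();
}

// Returns the number of textures that had to be fetched.
std::size_t preprocess(std::vector<Resource> &inUse)
{
    // 1. Collect the resources whose textures have not been fetched yet.
    std::vector<Resource *> toFetch;
    for (Resource &resource : inUse)
        if (!resource.fetched)
            toFetch.push_back(&resource);
    if (toFetch.empty())
        return 0;

    // 2. Create one "fence" per texture and hand the work to the other thread.
    std::vector<std::promise<void>> promises(toFetch.size());
    std::vector<std::future<void>> fences;
    for (std::promise<void> &promise : promises)
        fences.push_back(promise.get_future());

    std::thread gpuThread([&toFetch, &promises] {
        for (std::size_t i = 0; i < toFetch.size(); ++i)
            pullTexture(toFetch[i], std::move(promises[i]));
    });

    // 3. Block the calling thread until every fence has been signalled.
    for (std::future<void> &fence : fences)
        fence.wait();
    gpuThread.join();
    return toFetch.size();
}
```

A second call on the same resources returns 0 because nothing is left to fetch, which matches the idea that only textures not yet fetched are put on the list.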

At the end, the list of QSGLayers that have been created for the (non-root) RenderPasses will be traversed and the scene graph will be requested to update each layer's rendering.

Rendering without the GPU

In a number of scenarios, we disable using the GPU (e.g. when using ANGLE on Windows or if the OpenGL driver is known to have problems). In this case, the GPU thread will not be initialized at all and the renderer processes will use Skia's software rasterizer to generate bitmaps that are copied to the browser process using shared memory. Qt WebEngine then creates copies of these bitmaps and uses them to generate QImages, which are in turn used to create QSGTextures. This happens during the creation of scene graph nodes in DelegatedFrameNode::commit. The code that does the actual copying can be found in ResourceHolder::initTexture.
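The copying step can be sketched as below. `Image` and `copyFromSharedBitmap` are hypothetical, and the sketch assumes tightly packed 32-bit pixels; it is not the code in ResourceHolder::initTexture. It shows one plausible benefit of copying: the image no longer depends on the shared buffer afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an image that owns its pixel data.
struct Image {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Copies a tightly packed 32-bit bitmap out of a shared buffer.
Image copyFromSharedBitmap(const std::uint8_t *shared, int width, int height)
{
    assert(shared != nullptr && width > 0 && height > 0);
    const std::size_t bytes =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height) * 4;
    Image image;
    image.width = width;
    image.height = height;
    image.pixels.assign(shared, shared + bytes); // deep copy, independent of 'shared'
    return image;
}
```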

Where do Qt and Chromium rendering code interface directly?

RenderWidgetHostViewQt::OnSwapCompositorFrame

This method is called by Chromium when a new frame has been composited and should be swapped with the one currently displayed. It copies the data for the frame to be rendered and triggers an update of the DelegatedFrameNode by the scene graph.

GLSurfaceQt

These classes derive from Chromium's GLSurface class and wrap the platform specific parts for the GL implementation in use, e.g. GLX on Linux/X11 or WGL on Windows. Successful initialization of a platform GLSurface is part of Chromium's startup sequence. Later, the GLSurfaceQt instances are created by the GPUChannelManager on the GPU thread to create offscreen surfaces for the GPU thread to render into when decoding and executing streamed GL commands.

QtShareGLContext

In order to be able to share the OpenGL resources across different threads, the contexts that are used to issue the GL commands have to be created using a context that enables resource sharing. Chromium uses the concept of the GLShareGroup that keeps track of all the initialized contexts that share the same resource id namespace. The Qt WebEngine class ShareGroupQtQuick derives from this and makes sure that resource sharing is also enabled on the Qt side. It will also be used to provide an instance of the QtShareGLContext, which derives from Chromium's gl::GLContext class. This context will only be used to provide a native context handle for creating other GL contexts that are in the same share group and hence can access the same resources.

Explanation of terms

GPU thread

Chromium usually runs all code that has access to the GPU in a separate process. On Android, GPU access is made from a thread in the browser process. Qt WebEngine uses the latter model, with a thread that runs in the application process. This thread has the identifier "Chrome_InProcGPUThread". The actual OpenGL ES 2 commands that Chromium uses for HW accelerated rendering are issued from this thread. However, the logic that determines which GL commands are generated runs in the renderer process, and the commands are then streamed to the GPU thread in the browser process. This happens via GLES2CommandEncoders (renderer process), CommandBuffers and GLES2CommandDecoders (browser process).

CommandBuffer / GLES2CommandDecoder

CommandBuffers are used by the GPU thread to buffer the rendering commands that the renderer processes stream to the browser process. These commands closely match OpenGL ES 2.0 commands; they are encoded by the renderer process and then decoded into actual OpenGL ES 2.0 commands on the browser process side. Each CommandBuffer has its own command decoder and its own GL context that it can use for rendering. For this to work, the GL contexts have to allow resource sharing, which has to be enabled when creating the context. These contexts render into offscreen framebuffer objects, whose contents are then handed off to the compositor in order to composite everything into a compositor frame, which is then rendered into the back buffer and swapped when ready.
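The encode, buffer and decode round trip can be sketched with a toy command stream. Everything here is hypothetical: the two commands, the `[command, argument count, arguments...]` layout and the `Encoder`/`decode` names are invented for illustration and do not reflect Chromium's actual wire format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <string>
#include <vector>

enum Command : std::uint32_t { BindTexture = 1, DrawArrays = 2 };
using CommandBuffer = std::vector<std::uint32_t>;

// Producer side: serializes calls instead of executing them.
class Encoder {
public:
    explicit Encoder(CommandBuffer &buffer) : m_buffer(buffer) {}
    void bindTexture(std::uint32_t id) { put(BindTexture, {id}); }
    void drawArrays(std::uint32_t first, std::uint32_t count) { put(DrawArrays, {first, count}); }

private:
    void put(Command command, std::initializer_list<std::uint32_t> args)
    {
        m_buffer.push_back(command);
        m_buffer.push_back(static_cast<std::uint32_t>(args.size()));
        m_buffer.insert(m_buffer.end(), args.begin(), args.end());
    }
    CommandBuffer &m_buffer;
};

// Consumer side: walks the buffer and turns each entry back into a call.
// The calls are only recorded as text here instead of being issued to GL.
std::vector<std::string> decode(const CommandBuffer &buffer)
{
    std::vector<std::string> calls;
    std::size_t pos = 0;
    while (pos + 2 <= buffer.size()) {
        const std::uint32_t command = buffer[pos];
        const std::uint32_t argc = buffer[pos + 1];
        if (argc > buffer.size() - pos - 2)
            break; // truncated entry, stop decoding
        const std::uint32_t *args = buffer.data() + pos + 2;
        if (command == BindTexture && argc == 1)
            calls.push_back("bindTexture(" + std::to_string(args[0]) + ")");
        else if (command == DrawArrays && argc == 2)
            calls.push_back("drawArrays(" + std::to_string(args[0]) + ", "
                            + std::to_string(args[1]) + ")");
        else
            calls.push_back("unknown");
        pos += 2 + argc;
    }
    return calls;
}
```

The producer never touches the graphics API in this scheme; only the decoding side would need a GL context, which is the property that lets the renderer processes stay away from the GPU.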

RenderPass

A RenderPass is an aggregation of information to draw a layer in the frame. It contains a list of DrawQuads that are to be drawn, as well as the output and damage rect of the pass, and the transform to the root render pass. Each RenderPass, except for the root RenderPass, will be put on a separate QSGLayer.

DrawQuad

A DrawQuad is a rectangular shape that should be drawn on the screen with a certain material applied. This material could be a texture, a solid color etc. It also contains information on the rectangle it should be drawn to, the blend mode / opacity, clip state and some more. Based on the material, the quad also has specific properties such as the color or texture ID. In practice, this is implemented by deriving material specific classes from DrawQuad and casting to the subclass to access specific properties.
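The pattern of material-specific subclasses plus a checked downcast can be sketched as follows. The class layout, the two materials and the `cast` helpers are hypothetical simplifications and are not the Chromium declarations.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, reduced base class: common geometry plus a material tag.
struct DrawQuad {
    enum class Material { SolidColor, Texture };
    Material material;
    int x, y, width, height;
    float opacity;

protected:
    DrawQuad(Material m, int x_, int y_, int w, int h, float o)
        : material(m), x(x_), y(y_), width(w), height(h), opacity(o) {}
};

struct SolidColorQuad : DrawQuad {
    std::uint32_t color;
    SolidColorQuad(int x_, int y_, int w, int h, std::uint32_t c)
        : DrawQuad(Material::SolidColor, x_, y_, w, h, 1.0f), color(c) {}
    // Downcast that is only valid when the material tag matches.
    static const SolidColorQuad *cast(const DrawQuad *quad)
    {
        assert(quad->material == Material::SolidColor);
        return static_cast<const SolidColorQuad *>(quad);
    }
};

struct TextureQuad : DrawQuad {
    std::uint32_t resourceId;
    TextureQuad(int x_, int y_, int w, int h, std::uint32_t id)
        : DrawQuad(Material::Texture, x_, y_, w, h, 1.0f), resourceId(id) {}
    static const TextureQuad *cast(const DrawQuad *quad)
    {
        assert(quad->material == Material::Texture);
        return static_cast<const TextureQuad *>(quad);
    }
};

// Dispatches on the material and reads the material-specific property.
std::string describe(const DrawQuad &quad)
{
    switch (quad.material) {
    case DrawQuad::Material::SolidColor:
        return "color " + std::to_string(SolidColorQuad::cast(&quad)->color);
    case DrawQuad::Material::Texture:
        return "texture " + std::to_string(TextureQuad::cast(&quad)->resourceId);
    }
    return "unknown";
}
```

A consumer that creates scene graph nodes could use the same switch to decide which node type to instantiate for each quad.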

gpu::Mailbox and gpu::MailboxHolder

A gpu::Mailbox is an identifier for texture data on the GPU, or for shared bitmaps in the case of software rendering. This identifier can be used to share texture data across separate OpenGL contexts. A MailboxHolder is used to provide synchronized access to a Mailbox. It consists of the Mailbox, a SyncToken that can be used to make sure that all commands necessary to create the texture data have been executed before using it, and possibly the texture type in case the Mailbox refers to an OpenGL texture.

cc::TransferableResource

A TransferableResource is used to share resources from one compositing context to another (e.g., from Chromium to Qt WebEngine / the scene graph). It contains a reference to a MailboxHolder that the resource is connected to, an ID for the resource, as well as information on the size and format of the resource, and whether the resource is a software resource or backed by hardware acceleration.

After the resource is no longer used, it needs to be returned to its producer in order to be released.
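How these pieces nest can be shown with plain structs. The field names, the 16-byte mailbox name and the single release counter in `SyncToken` are assumptions made for this sketch; the real Chromium types carry more information. `isReady` is a hypothetical helper that expresses the rule that a resource may only be consumed once the producer has progressed far enough.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Opaque name identifying texture data (size is an assumption of this sketch).
struct Mailbox {
    std::array<std::int8_t, 16> name;
};

// Marks a point in the producer's command stream.
struct SyncToken {
    std::uint64_t releaseCount;
};

// A mailbox plus what is needed to access it safely.
struct MailboxHolder {
    Mailbox mailbox;
    SyncToken syncToken;
    std::uint32_t textureTarget; // only meaningful for GL textures
};

// A resource that is handed from one compositing context to another.
struct TransferableResource {
    std::uint32_t id;
    MailboxHolder mailboxHolder;
    int width;
    int height;
    bool isSoftware;
};

// The consumer may use the data once the producer has completed at least
// the release that the sync token refers to.
bool isReady(const MailboxHolder &holder, std::uint64_t completedReleaseCount)
{
    return completedReleaseCount >= holder.syncToken.releaseCount;
}
```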

ResourceHolder

A Qt WebEngine wrapper around a TransferableResource and the texture that it represents. It is used to keep track of the number of users of this texture as well as of the texture's state, that is, whether the texture has already been fetched or not. It also contains methods to return the resource to Chromium after it is no longer used in the scene graph.


Further references

Chromium Design Docs

  • GPU accelerated compositing in Chromium
  • GPU CommandBuffers