This page is about the internals of script injection and extensions in Qt WebEngine and Chromium. Qt WebEngine currently does not provide any API for extensions, although internally some code exists for supporting the PDF viewer extension bundled with Chromium.
- 1 Introduction to Script Injection
- 2 Introduction to User Scripts, User Styles, and Extensions
- 3 Script Injection in the Bowels of Blink
- 4 Script Injection in Qt WebEngine
- 5 Script Injection in Chromium's Extensions Module
- 6 Implementation of Chromium Extensions
Introduction to Script Injection
Two major types of use cases can be distinguished. On the one hand, applications can use script injection to implement a specific feature, such as Emacs-style text editing. In this case, the use of injection is an internal implementation detail of the application that is invisible to the end user. On the other hand, applications can provide extension mechanisms that allow end users to modify the behavior of web pages via script injection. In this case, the application must provide a stable public interface for script injection; many such interfaces already exist, including Greasemonkey's user scripts, Stylish's user styles, and Chromium's extensions. This is, of course, only a distinction between use cases, and it does not necessarily apply cleanly to the implementation, because the implementations of these extension mechanisms can be generally useful even if used only internally. Chromium, for example, uses extensions internally to support the PDF viewer, the media router, and other features. Likewise, Qt WebEngine implements some support for Chromium extensions in order to use the PDF viewer, without providing a public API for installing or managing extensions.
Introduction to User Scripts, User Styles, and Extensions
Chromium supports user scripts natively by automatically converting them into extensions when the user clicks on a link that ends with .user.js. However, there is also the extension Tampermonkey, which implements user scripts without this conversion, and provides a more complete feature set along with a specialized user interface. User styles are not natively supported by Chromium but can be used via extensions such as Stylish and Stylus.
Support for user scripts was added to Qt WebEngine in 2016. This is not a full implementation of the Greasemonkey spec, but only of the part that cannot be implemented in terms of the existing API, namely conditional injection using pattern matching rules. This feature is used by the open-source browsers qutebrowser and Falkon to implement the full Greasemonkey spec. Both browsers use their own Greasemonkey metadata parsers to extract the metadata, then pass the same script along to Qt WebEngine, which parses it a second time.
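To make the notion of a Greasemonkey metadata parser more concrete, here is a minimal sketch of one in standard C++. It is not the parser used by Qt WebEngine, qutebrowser, or Falkon; the struct UserScriptMetadata and the function parseMetadata are hypothetical names invented for this illustration, and only a handful of commonly used metadata keys are handled.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the few metadata keys this sketch understands.
struct UserScriptMetadata {
    std::string name;
    std::vector<std::string> includes; // patterns from @include and @match
    std::vector<std::string> excludes; // patterns from @exclude
    std::string runAt;                 // value of @run-at, empty if absent
};

// Strip leading and trailing spaces, tabs, and carriage returns.
static std::string trim(const std::string &s)
{
    const std::string::size_type first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos)
        return std::string();
    const std::string::size_type last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Scan the "// ==UserScript==" comment block and collect known keys.
UserScriptMetadata parseMetadata(const std::string &source)
{
    UserScriptMetadata meta;
    std::istringstream in(source);
    std::string line;
    bool inBlock = false;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.compare(0, 2, "//") != 0)
            continue; // metadata lives only in line comments
        const std::string body = trim(line.substr(2));
        if (body == "==UserScript==") {
            inBlock = true;
            continue;
        }
        if (body == "==/UserScript==")
            break; // anything after the closing marker is ignored
        if (!inBlock || body.empty() || body[0] != '@')
            continue;
        // Split "@key value" at the first whitespace character.
        const std::string::size_type sep = body.find_first_of(" \t");
        const std::string key = body.substr(
            1, sep == std::string::npos ? std::string::npos : sep - 1);
        const std::string value =
            sep == std::string::npos ? std::string() : trim(body.substr(sep));
        if (key == "name")
            meta.name = value;
        else if (key == "include" || key == "match")
            meta.includes.push_back(value);
        else if (key == "exclude")
            meta.excludes.push_back(value);
        else if (key == "run-at")
            meta.runAt = value;
    }
    return meta;
}
```

A browser built along these lines would call parseMetadata on the script source to populate its own user interface and then hand the unmodified source to the engine, which explains why the metadata ends up being parsed twice.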
In general, user scripts are not a particularly popular feature, since they are mainly aimed at advanced users; their main advantage is that they are easy to create and are portable across browsers. Extensions, on the other hand, can have their own user interface elements and are overall a much more feature-rich mechanism for extending the browser's functionality.
Extensions must declare a list of permission keywords in their manifest. These must be presented to the user for approval during installation. Additionally, there is a list of optional permissions, which can be requested programmatically.
The Background Script
At the core of an extension is the background script. This is the central brain of the extension, responsible for reacting to events from the browser and from other parts of the extension: events such as opening or closing tabs, navigations, downloads, clicks on the extension's icon, etc. The background script is executed in the context of the extension's background page, which is a lot like an ordinary web page, except that it is invisible and has special privileges. The background page can be lazy or persistent. A lazy background page is suspended as soon as possible and awakened only when events are fired that the background script has registered an interest in. A persistent background page is always kept around in memory and is intended to be used with the webRequest API for intercepting network traffic.
User Interface Elements
As for user interface elements, these consist of browser actions and page actions with popup HTML pages, context menus, omnibox keywords with optional icons, override pages that can replace the new tab page and others, keyboard shortcuts that can potentially be global, devtools extensions for adding new tabs to devtools, and an options page that can be a separate page or a popup in the chrome://extensions page. Browser and page actions refer to the extension icon displayed in Chromium; in manifest v3, these two will be merged into just one type of action. Actions have a lot of features, like rendering the icon with an HTML5 canvas, but most importantly they can open a popup page, which is an HTML file declared in the manifest and rendered in the extension process. The popup page can communicate with the background page via message passing and the storage API, as can the options page.
Concerning incognito mode, the extension can be spanned or split. In split mode, the extension will have a separate process for incognito windows, and pages in the two processes cannot communicate except through the storage API, nor access each other's tabs or other browser features. Conversely, in spanned mode, the extension will share its pages between incognito and ordinary windows.
Before descending into the depths of Chromium, it might be a good idea to recall how script injection worked in Qt WebKit. The matter is much simpler in a single-process architecture, and, since Blink is a fork of WebKit, the APIs are naturally similar.
Aside from the methods ExecuteScript and ExecuteScriptInIsolatedWorld of blink::WebLocalFrame, there are also the asynchronous variants, RequestExecute*, which use blink::PausableScriptExecutor to execute scripts asynchronously while optionally delaying the window.onload event until their execution has finished. This is supposed to reduce UI jank.
Script Injection in Qt WebEngine
Now, there is a general solution to this sort of problem. Instead of a flurry of little back-and-forth messages, where the application is notified of events and sends back commands, the application must declare upfront what it wants to happen in reaction to which event; then no communication is needed, and therefore there is also no problem with communication overhead. This is the difference, for example, between client-side and server-side filtering in a database query: in client-side filtering the database sends each row to the application, which then implements the filtering predicate in its native programming language, while in server-side filtering the application encodes the filtering predicate in some domain-specific language and hands it off to the server, which can then send back only the matching rows. Likewise, with script injection, we can employ a declarative approach where the application decides upfront which scripts should be injected in reaction to which loading events, encodes this in some data structure or other, and hands it off to WebEngine. This allows the main process to distribute the serialized injection logic ahead of time to all its subprocesses, and each subprocess can then carry out the injections independently, that is, without waiting on the main process.
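The declarative approach can be illustrated with a small model in standard C++. This is only a sketch under simplifying assumptions: ScriptRule and scriptsToInject are hypothetical names, not Qt WebEngine or Chromium API, and the InjectionPoint enum merely borrows the three injection point names that QWebEngineScript exposes. The idea is that the application fills in the rule list once, and the subprocess later answers the question of what to inject now purely from its local copy of that list.

```cpp
#include <string>
#include <vector>

// Loading events at which a script may be injected (hypothetical model).
enum class InjectionPoint { DocumentCreation, DocumentReady, Deferred };

// One declarative rule: what to inject, when, and into which frames.
struct ScriptRule {
    std::string name;
    std::string source;
    InjectionPoint point;
    bool runsOnSubFrames;
};

// Subprocess-side decision: given a loading event in some frame, return the
// names of all scripts whose rule matches. No round trip to the main
// process is needed, because the rules were distributed ahead of time.
std::vector<std::string> scriptsToInject(const std::vector<ScriptRule> &rules,
                                         InjectionPoint event,
                                         bool isMainFrame)
{
    std::vector<std::string> names;
    for (const ScriptRule &rule : rules) {
        const bool frameMatches = isMainFrame || rule.runsOnSubFrames;
        if (rule.point == event && frameMatches)
            names.push_back(rule.name);
    }
    return names;
}
```

In the imperative alternative, the loop body would instead be a message to the main process followed by a wait for its reply, once per event and per frame, which is exactly the overhead the declarative design avoids.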
The API was implemented incrementally. In 2015, QWebEngineScript and QWebEngineScriptCollection were added to the Widgets-based API. In a follow-up patch, QQuickWebEngineScript was added to the Quick-based API. Fun fact: in a last-minute API review cleanup, the script collection getter in QWebEngineProfile was changed from returning a reference to returning a pointer, to be more idiomatic Qt. The getter in QWebEnginePage was forgotten, which is how we now have inconsistent return types for the two getters. In 2016, support was added for Greasemonkey pattern rules in metadata comments of the script source code. In early 2017, profile-wide scripts were added to the Quick-based API, bringing it to feature parity with the Widgets-based API. At the same time in 2017, QQuickWebEngineScript was made public to allow scripts to be managed from C++ in a Qt Quick application.
Script Injection in Chromium's Extensions Module
Script injection forms only a small part of Chromium's implementation of extensions; however, this small part is certainly large enough to deserve its own section.
Scripts are loaded in the browser process by objects of the class UserScriptLoader. Each loader owns a set of UserScript objects and triggers loading whenever scripts are added to or removed from this set. The actual loading is delegated to subclasses, either ExtensionUserScriptLoader or WebUIUserScriptLoader. These subclasses go over all of the UserScript::File objects in each UserScript, load the script contents for the file path, and store the contents inside the UserScript::File object as a std::string. They then serialize the whole set of UserScript objects and copy the result into a newly created shared memory region. Then UserScriptLoader takes over again and broadcasts a handle to the shared memory region to each renderer process. The corresponding object on the renderer side is the UserScriptSetManager.
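The serialize-then-share step can be pictured with the following simplified model in standard C++. It is a sketch only: Chromium's real code relies on its own serialization helpers and shared memory classes, none of which appear here, and the types Script and ScriptFile as well as the length-prefixed byte layout are assumptions made up for this example. A plain std::vector<char> stands in for the shared memory region.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for UserScript::File and UserScript.
struct ScriptFile { std::string path; std::string content; };
struct Script { std::string id; std::vector<ScriptFile> files; };

static void writeU32(std::vector<char> &buf, std::uint32_t v)
{
    char bytes[sizeof v];
    std::memcpy(bytes, &v, sizeof v);
    buf.insert(buf.end(), bytes, bytes + sizeof v);
}

static void writeString(std::vector<char> &buf, const std::string &s)
{
    writeU32(buf, static_cast<std::uint32_t>(s.size())); // length prefix
    buf.insert(buf.end(), s.begin(), s.end());
}

// Browser side: flatten all scripts, including file contents, into one buffer.
std::vector<char> serialize(const std::vector<Script> &scripts)
{
    std::vector<char> buf;
    writeU32(buf, static_cast<std::uint32_t>(scripts.size()));
    for (const Script &script : scripts) {
        writeString(buf, script.id);
        writeU32(buf, static_cast<std::uint32_t>(script.files.size()));
        for (const ScriptFile &file : script.files) {
            writeString(buf, file.path);
            writeString(buf, file.content);
        }
    }
    return buf;
}

// Bounds-checked cursor over the received bytes.
class Reader {
public:
    explicit Reader(const std::vector<char> &buf) : m_buf(buf), m_pos(0) {}
    std::uint32_t readU32()
    {
        std::uint32_t v;
        if (m_buf.size() - m_pos < sizeof v)
            throw std::runtime_error("truncated buffer");
        std::memcpy(&v, m_buf.data() + m_pos, sizeof v);
        m_pos += sizeof v;
        return v;
    }
    std::string readString()
    {
        const std::uint32_t n = readU32();
        if (m_buf.size() - m_pos < n)
            throw std::runtime_error("truncated buffer");
        std::string s(m_buf.data() + m_pos, n);
        m_pos += n;
        return s;
    }
private:
    const std::vector<char> &m_buf;
    std::size_t m_pos;
};

// Renderer side: rebuild the script objects from the shared bytes.
std::vector<Script> deserialize(const std::vector<char> &buf)
{
    Reader reader(buf);
    std::vector<Script> scripts(reader.readU32());
    for (Script &script : scripts) {
        script.id = reader.readString();
        script.files.resize(reader.readU32());
        for (ScriptFile &file : script.files) {
            file.path = reader.readString();
            file.content = reader.readString();
        }
    }
    return scripts;
}
```

Because the file contents travel inside the buffer, a renderer in this model never has to touch the file system or ask the browser process for a script body; everything it needs arrives in a single region.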
The ExtensionUserScriptLoader subclass is used for loading script files of extensions. If the extension is a component extension, then files are loaded from the resource bundle, otherwise from inside the profile directory where the files of an installed extension are stored. The WebUIUserScriptLoader subclass, by contrast, loads script files over the network service; this is apparently used when a WebView is embedded within a WebUI, to load the embedder's content scripts.
The UserScriptSetManager on the renderer side receives the shared memory region from the UserScriptLoader and hands it off to a UserScriptSet. The manager has one UserScriptSet for all scripts that have been statically defined in extension manifests, and separate sets for the programmatically defined scripts of each extension. The UserScriptSet maintains the deserialized UserScript objects on the renderer side and creates ScriptInjection objects to perform injections.
* ProgrammaticScriptInjector
* UserScriptInjector
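The renderer-side bookkeeping described above, with one set for statically declared scripts plus one set per extension for programmatically defined scripts, can be modeled roughly as follows. This is a sketch in standard C++ under simplifying assumptions: ScriptSetManager and its methods are hypothetical names, and a script set is reduced to a list of script identifiers.

```cpp
#include <map>
#include <string>
#include <vector>

// In this model a script set is just a list of script identifiers.
typedef std::vector<std::string> ScriptSet;

// Hypothetical, simplified counterpart of a renderer-side set manager.
class ScriptSetManager {
public:
    // Replace the single set shared by all manifest-declared scripts.
    void setStaticScripts(const ScriptSet &set) { m_static = set; }

    // Replace the programmatic set belonging to one extension.
    void setProgrammaticScripts(const std::string &extensionId,
                                const ScriptSet &set)
    {
        m_programmatic[extensionId] = set;
    }

    // Collect every known script: static ones first, then each
    // extension's programmatic scripts in order of extension id.
    ScriptSet allScripts() const
    {
        ScriptSet all = m_static;
        std::map<std::string, ScriptSet>::const_iterator it;
        for (it = m_programmatic.begin(); it != m_programmatic.end(); ++it)
            all.insert(all.end(), it->second.begin(), it->second.end());
        return all;
    }

private:
    ScriptSet m_static;
    std::map<std::string, ScriptSet> m_programmatic;
};
```

Keeping the programmatic sets separate per extension means that an update for one extension replaces only that extension's entry and leaves all other sets untouched.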
Implementation of Chromium Extensions