Performance Tip Startup Time: Difference between revisions

From Qt Wiki
Latest revision as of 14:29, 22 April 2016


How to optimize application startup time

This page lists approaches for optimizing application startup time. Feel free to discuss the page and to improve it with your own knowledge.

The approaches fall into three areas:

  • Toolchain level - this includes various optimizations in the linker, the compiler and the dynamic linker.
  • Platform level - this includes various approaches that platforms offer.
  • Application level - this includes everything that the app itself can do to start up faster.

Toolchain level

Linker

  • Link Time Optimization (LTO) or Whole Program Optimization (WPO) can be used to improve startup times.
    • gcc documentation
    • MSVC documentation
    • Background: At link time the whole program is visible, so the toolchain can rearrange the object code to improve loading time. Combined with profile-guided optimization, it can also use usage statistics collected from your application.
    • To build Qt itself with LTO, configure it with -reduce-relocations -ltcg (LTCG stands for Link-Time Code Generation).
  • (GNU ld) Use -Bsymbolic-functions for your shared libraries. This tells the linker to use direct local jumps to symbols within your library instead of trying to resolve them by the usual means. The effect is that every function call within your library will be initially faster since there's no lookup required. This leads to faster load times.
    • Note: A side effect is that LD_PRELOAD can no longer be used to override a symbol in the library.
    • There's a qmake variable QMAKE_LFLAGS_BSYMBOLIC_FUNC that expands to the corresponding linker flag if the linker supports symbolic functions.
    • Note: You can create a whitelist of symbols that are exempt from the -Bsymbolic-functions switch by using the --dynamic-list parameter. See, for example, QtCore.dynlist in the QtCore source tree.
  • (GNU ld) Make sure to use GNU style hashes for symbol lookup (--hash-style=gnu). This is the default on Linux; however, some toolchains might still default to the old sysv hash style, which has slower symbol lookup when using shared libraries. The GNU hash style shortens startup by reducing the time needed to resolve symbols.
  • Profiling startup time optimizations
    • valgrind (http://valgrind.org/) can measure, via callgrind/cachegrind, the time spent before your main function, which includes symbol resolving and the time spent in initialization code for dependent libraries.
    • (GNU ld) Use the LD_DEBUG environment variable to output statistics from the dynamic linker.

Platform level

  • (MeeGo) MeeGo supports "boosted" applications. See Harmattan Booster for Qt Quick Applications on how to enable your Qt Quick applications to be boosted and the MeeGo launcher documentation on how to boost Qt and generic apps.
    • Note: Although the boosters are part of MeeGo, the core parts are written in plain C++ and can be re-used on other platforms as well.
    • Note: You have to rewrite a small portion of your app, and you need to compile your app as a position independent executable.
    • Background: MeeGo pre-launches a few processes in the background that wait for the actual app to launch. Since all initialization is already done (e.g. the QApplication constructor has already run), app startup is perceived as considerably faster.
  • (KDE) kdeinit is used to start applications
    • Note: This approach can be adapted to other platforms as well to improve the startup times of multiple apps.
    • Background: kdeinit is a pre-started process that links to various core libraries, so symbol resolving and library mapping into memory are already partially done. When the actual application is started, kdeinit forks and executes it. The time required to resolve symbols goes down, thus startup time goes down as well.
  • Cache things
    • Example: MeeGo shader cache compiles OpenGL shaders into a binary representation and puts them in a shared memory area for other apps to use. Only the first application startup will be slow, since that has to populate the cache. All further apps start faster.
    • Example (KDE): KDE uses an icon cache to avoid loading and processing every icon over and over again.
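
The pre-launching idea behind kdeinit and the MeeGo boosters can be illustrated with a small POSIX sketch. This is not the code of either project: warmUp(), appMain() and launchFromWarmProcess() are hypothetical names invented for this example, and the "expensive initialization" is a stand-in table. It only shows the principle that a process which has already done the costly work can fork(), so that the child starts with everything in place.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <vector>

// Hypothetical stand-in for expensive startup work (library loading,
// symbol resolution, toolkit initialization, ...).
struct WarmState {
    std::vector<int> table;
    bool ready = false;
};

WarmState &warmState()
{
    static WarmState state;
    return state;
}

// Done once in the long-running launcher process.
void warmUp()
{
    WarmState &s = warmState();
    if (s.ready)
        return;
    s.table.resize(1000);
    for (int i = 0; i < 1000; ++i)
        s.table[i] = i * i;
    s.ready = true;
}

// Hypothetical application entry point. It returns 0 if it found the
// state already initialized and 1 otherwise.
int appMain(int index)
{
    const WarmState &s = warmState();
    return (s.ready && s.table[index] == index * index) ? 0 : 1;
}

// Forks the current process and runs the entry point in the child.
// The child inherits the parent's memory, so work done by warmUp()
// is not repeated. Returns the child's exit code, or -1 on error.
int launchFromWarmProcess(int (*entry)(int), int arg)
{
    const pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(entry(arg)); // child: run the app, skip parent cleanup
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling warmUp() once in the launcher and then launchFromWarmProcess(appMain, 7) for each launch request means every child sees the initialized table immediately. A real launcher would additionally need a way to receive launch requests and to hand the application's arguments and environment to the child.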

Application level

  • QML apps: See Performance tip Use Loaders
  • Lazy initialization
    • Load things only when you need them, not on application startup.
    • Don't use static global objects. The code that initializes such a global object runs before the main() function, which increases startup time. Instead, use the Singleton pattern to create your global static object the first time that it is used.
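
As a minimal sketch of the last point, a function-local static is one common way to get construct-on-first-use behavior in plain C++. IconTable and iconTable() are hypothetical names made up for this example, and the constructor merely simulates expensive work. Qt also provides the Q_GLOBAL_STATIC macro for the same purpose, which is not shown here.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical resource whose construction is expensive.
class IconTable {
public:
    IconTable()
    {
        ++constructions;
        for (int i = 0; i < 1000; ++i)
            sizes_["icon" + std::to_string(i)] = 16 + (i % 4) * 16;
    }

    // Returns the stored size, or -1 for an unknown name.
    int sizeOf(const std::string &name) const
    {
        const auto it = sizes_.find(name);
        return it == sizes_.end() ? -1 : it->second;
    }

    // Counts constructor runs so the laziness can be observed.
    static int constructions;

private:
    std::unordered_map<std::string, int> sizes_;
};

// A plain int with a constant initializer needs no code to run at
// startup, unlike an object with a non-trivial constructor.
int IconTable::constructions = 0;

// Avoid this: the constructor would run before main().
// static IconTable g_iconTable;

// Prefer this: the object is constructed on the first call only.
// Since C++11 this initialization is thread-safe.
IconTable &iconTable()
{
    static IconTable instance;
    return instance;
}
```

Code that never calls iconTable() never pays for building the table, and code that does call it pays once, at the point of first use rather than during startup.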