Judging from experience, performance gains from networking improvements are orders of magnitude larger than gains from reducing library size, avoiding memcpy etc. In the particular case of code issuing a network request, parsing the HTTP replies, computing SSL keys etc. takes less than 5% of the overall time, i.e. more than 95% of the time in the breakdown below is spent waiting for network I/O.
In other words: When trying to improve QtNetwork performance, try to shave off network roundtrips and reduce latency rather than trying to improve parsing of HTTP replies / avoiding memcpy etc.
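To make this argument concrete, here is a back-of-the-envelope sketch. The 2000 ms total and the 250 ms roundtrip used in the note below are illustrative assumptions in the range of the figures quoted on this page, and both function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Time left after speeding up only the CPU-bound share of a request.
// totalMs:    end-to-end time of the request
// cpuShare:   fraction spent on parsing, crypto, memcpy etc. (e.g. 0.05)
// cpuSpeedup: factor by which that share gets faster (2.0 = twice as fast)
double afterCpuSpeedup(double totalMs, double cpuShare, double cpuSpeedup)
{
    const double cpuMs = totalMs * cpuShare;
    const double networkMs = totalMs - cpuMs; // untouched by CPU optimizations
    return networkMs + cpuMs / cpuSpeedup;
}

// Time left after removing a number of network roundtrips.
double afterSavedRoundtrips(double totalMs, int roundtrips, double rttMs)
{
    return totalMs - roundtrips * rttMs;
}
```

With these assumed numbers, `afterCpuSpeedup(2000.0, 0.05, 2.0)` yields about 1950 ms, whereas `afterSavedRoundtrips(2000.0, 1, 250.0)` yields 1750 ms: saving a single roundtrip beats making all CPU-bound work twice as fast.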
When e.g. opening the Facebook app on BlackBerry 10, it takes 1.5 to 2 seconds (worst case) until data is retrieved; this can be broken down into several parts (numbers measured with tcpdump on a Wi-Fi network):
To improve performance for the different types of network traffic, the following things can be done; note that some of the tasks do not help at app startup but while the app is already running.
1. DNS lookup:
- enable global DNS cache. If there is a global DNS cache on your system, you probably want to enable it; this will most likely provide a massive performance boost. For examples of DNS TTL times, see the DNS TTL table in Appendix A.
- [more concurrent requests]. Currently, Qt supports only 5 concurrent DNS requests, which is too low: sometimes a DNS request stalls while waiting for another one to finish.
- [Qt DNS cache]. Already implemented in Qt, but cached entries only live as long as the app is running, i.e. the first DNS lookup will always go to the system DNS cache or the network.
- [do not lookup the same DNS name twice]
- [pre-DNS lookup], i.e. do the DNS lookup as early as possible rather than at connection time. This will not help right when the app is started, but it will for subsequent requests. Even better: pre-TCP handshake, see below. Better still: pre-SSL handshake, see below.
- [asynchronous DNS lookup]. Since getaddrinfo() is a synchronous call, Qt executes it in a thread (through QRunnable / QThreadPool). This is not costly by itself, but might be so when there are many runnables / threads competing for CPU power and the thread pool queues DNS lookups. Whether there are asynchronous calls for DNS lookups available depends on the platform (e.g. Linux: getaddrinfo_a()).
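The caching, "do not look up the same name twice" and pre-DNS lookup items can be illustrated with a small in-process cache. This is a minimal single-threaded sketch, not Qt's implementation: the `DnsCache` class, its injected resolver and the fixed TTL are all assumptions made for illustration (a real resolver would wrap getaddrinfo() or an asynchronous equivalent and honor the TTL of each record).

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;
using Addresses = std::vector<std::string>;

// Hypothetical in-process DNS cache with one fixed TTL for all entries.
class DnsCache {
public:
    // The resolver is injected so the sketch needs no network access.
    using Resolver = std::function<Addresses(const std::string &)>;

    DnsCache(Resolver resolver, std::chrono::seconds ttl)
        : m_resolver(std::move(resolver)), m_ttl(ttl) {}

    // Returns cached addresses while the entry is fresh, so the same
    // name is not resolved twice within the TTL.
    Addresses lookup(const std::string &host,
                     Clock::time_point now = Clock::now())
    {
        const auto it = m_entries.find(host);
        if (it != m_entries.end() && now < it->second.expires)
            return it->second.addresses;

        Entry entry{m_resolver(host), now + m_ttl};
        m_entries[host] = entry;
        return entry.addresses;
    }

    // Pre-DNS lookup: warm the cache before a connection is needed.
    void prefetch(const std::string &host,
                  Clock::time_point now = Clock::now())
    {
        lookup(host, now);
    }

private:
    struct Entry {
        Addresses addresses;
        Clock::time_point expires;
    };

    Resolver m_resolver;
    std::chrono::seconds m_ttl;
    std::unordered_map<std::string, Entry> m_entries;
};
```

An app would call `prefetch()` for the hosts it knows it will contact, so that the later `lookup()` at connection time is answered from memory. The `now` parameter exists only to make expiry testable without waiting.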
2. TCP handshake:
- [pre-TCP handshake], i.e. do the TCP handshake as early as possible rather than at connection time. This will not help right when the app is started, but it will for subsequent requests. Even better: pre-SSL handshake, see below.
- [pre-connect more sockets per host]. We don't necessarily need smart heuristics for that (although we could use some by inspecting the cache), but if an app knows it needs more connections anyhow, it could tell us and we could pre-allocate sockets.
- persistent connections, i.e. TCP connections are cached for 1 minute and can be reused; Qt has always implemented this. Reusing a socket also helps because it already has a large TCP congestion window, which avoids TCP slow start. Works best if the app only uses 1 instance of QNetworkAccessManager, and there is no reason to use more than 1 instance anyhow.
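The idea behind persistent connections can be sketched as a pool of idle connections keyed by host and port. This is an illustrative model only and not how QNetworkAccessManager is implemented: `Connection` is a hypothetical stand-in for a real socket, and the `ConnectionPool` class and its API are assumptions. Only the 1 minute idle timeout comes from the text above.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an open TCP socket.
struct Connection {
    int id;
    Clock::time_point idleSince;
};

class ConnectionPool {
public:
    explicit ConnectionPool(std::chrono::seconds idleTimeout)
        : m_idleTimeout(idleTimeout) {}

    // Hands out a cached connection to host:port if one is still fresh;
    // otherwise creates a new one, which in reality costs a TCP handshake.
    // *reused tells the caller which of the two happened.
    Connection acquire(const std::string &host, int port,
                       Clock::time_point now, bool *reused)
    {
        const auto it = m_idle.find(std::make_pair(host, port));
        if (it != m_idle.end()) {
            const Connection cached = it->second;
            m_idle.erase(it);
            if (now - cached.idleSince <= m_idleTimeout) {
                *reused = true;
                return cached;
            }
            // Expired: drop it and fall through to a fresh connection.
        }
        *reused = false;
        return Connection{m_nextId++, now};
    }

    // Returns a connection to the pool once a request has finished.
    void release(const std::string &host, int port, Connection connection,
                 Clock::time_point now)
    {
        connection.idleSince = now;
        m_idle.emplace(std::make_pair(host, port), connection);
    }

private:
    std::chrono::seconds m_idleTimeout;
    int m_nextId = 1;
    // multimap: several idle connections per host are allowed.
    std::multimap<std::pair<std::string, int>, Connection> m_idle;
};
```

The sketch also shows why a single shared manager works best: two separate pools cannot hand each other's idle connections to a new request.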
3. SSL handshake:
- [pre-SSL handshake]. Would not help at app start, but if an app knows a server will be used, it should do the SSL handshake as early as possible.
- [try to persist / re-use SSL session IDs]. If we find servers with a high session re-use lifetime, it might be worth persisting the session ID across app restarts.
- [SSL: enable TLS session tickets (Stateless TLS Session Resumption)]. Probably preferable to re-using session IDs because of the longer lifetime.
- [use SSL False Start]. Currently not implemented in OpenSSL (but in NSS), so nothing we can do unfortunately, but it would save us 1 roundtrip for all SSL handshakes, which would be a massive performance boost.
- [enable SSL session sharing]. Already implemented, saving 1 network roundtrip (~ 200-300 ms) for sockets re-using an SSL connection. Works best if the app only uses 1 instance of QNetworkAccessManager, and there is no reason to use more than 1 instance anyhow.
- [use faster SSL ciphers]. Probably not much gain, not important for now from a performance point of view.
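To see how the DNS, TCP and SSL items add up, the following sketch counts network roundtrips until the first HTTP reply arrives. It is a simplified model under stated assumptions: an uncached DNS lookup costs 1 roundtrip, the TCP handshake 1, a full SSL handshake 2, and an abbreviated (session re-use) or False Start handshake 1. The struct and function names are hypothetical.

```cpp
#include <cassert>

// What is already in place when the request is issued.
struct ConnectionState {
    bool dnsCached;        // name already resolved (cache or pre-DNS lookup)
    bool tcpConnected;     // socket already connected (persistent or pre-connected)
    bool sslEstablished;   // SSL handshake already done on that socket
    bool sslSessionReused; // abbreviated handshake via session ID or ticket
    bool sslFalseStart;    // data is sent before the handshake fully completes
};

// Roundtrips until the first HTTPS reply, under the assumptions above.
int roundtripsUntilReply(const ConnectionState &s)
{
    int roundtrips = 1; // the HTTP request/reply itself

    if (s.sslEstablished)
        return roundtrips; // an encrypted connection is ready to use

    if (!s.tcpConnected) {
        if (!s.dnsCached)
            roundtrips += 1; // DNS lookup
        roundtrips += 1;     // TCP handshake
    }

    // Full handshake: 2 roundtrips; session re-use or False Start: 1.
    roundtrips += (s.sslSessionReused || s.sslFalseStart) ? 1 : 2;
    return roundtrips;
}
```

In this model a completely cold HTTPS request costs 5 roundtrips, session re-use brings it down to 4, and a pre-established SSL connection leaves only the single roundtrip of the request itself.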
4. HTTP request/reply:
- [improve HTTP cache]. The cache currently reads everything from disk instead of keeping items in memory, and IIRC does not load as much from cache as it could. Would be really good for performance, but is quite some work as well.
- [prioritize requests]. Needs to be done by the app, but we should definitely prioritize requests that contain links to other resources (for apps: JSON and XML, just like browsers prioritize HTML/CSS/JS over JPG etc.) over requests to "end" resources like images.
- [enable HTTP pipelining by default]. Could also be done by the app on a per-request basis; we need to check whether this is fine for SSL requests. Might be a gain but is a bit risky, even though I think the servers of big sites like Facebook, Twitter etc. should be fine with it.
- [implement SPDY]. Supported by Google, Facebook, Twitter and probably others. No idea how much gain this would be.
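The request prioritization item can be sketched on the app side as a priority queue. This is a minimal sketch under assumptions: the content-type heuristic, the `RequestQueue` class and all other names are hypothetical, and the app is assumed to know the expected content type before sending. In Qt itself, a priority can be attached to a request with QNetworkRequest::setPriority().

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

enum class Priority { Low = 0, Normal = 1, High = 2 };

struct Request {
    std::string url;
    Priority priority;
    unsigned long sequence; // keeps FIFO order within one priority
};

// Hypothetical heuristic: replies that link to further resources go
// first, "end" resources like images go last.
Priority priorityForContentType(const std::string &contentType)
{
    if (contentType == "application/json" || contentType == "application/xml"
        || contentType == "text/xml" || contentType == "text/html")
        return Priority::High;
    if (contentType.compare(0, 6, "image/") == 0)
        return Priority::Low;
    return Priority::Normal;
}

// Orders the queue so that top() is the highest priority, oldest request.
struct ByPriority {
    bool operator()(const Request &a, const Request &b) const
    {
        if (a.priority != b.priority)
            return a.priority < b.priority;
        return a.sequence > b.sequence;
    }
};

class RequestQueue {
public:
    void enqueue(const std::string &url, const std::string &expectedContentType)
    {
        m_queue.push(Request{url, priorityForContentType(expectedContentType),
                             m_nextSequence++});
    }

    bool empty() const { return m_queue.empty(); }

    // Removes and returns the next request to send; the queue must not be empty.
    Request dequeue()
    {
        Request next = m_queue.top();
        m_queue.pop();
        return next;
    }

private:
    std::priority_queue<Request, std::vector<Request>, ByPriority> m_queue;
    unsigned long m_nextSequence = 0;
};
```

If an image, a JSON document, a second image and an XML document are enqueued in that order, the queue hands out the JSON and XML requests first and then the two images in their original order.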
There are more items listed at [QtNetwork performance improvements on Jira], but I think these are the most important ones. Feel free to help implement them :)
Questions, suggestions etc.: Please contact us on irc.freenode.net #qt-earth-team
Appendix A: DNS TTL table
|app||# DNS requests||average DNS TTL|
|Facebook||4+||api.facebook.com: <= 1 hour; graph.facebook.com: <= 1 hour; others (for images etc.): <= 5 mins|
|Twitter||? (app was broken on my build)||api.twitter.com: ~ 3 mins|
|LinkedIn||~ 10 (2 for LinkedIn and rest for other servers)||api.linkedin.com: only a few seconds; touch.www.linkedin.com: only a few seconds; rest: 5-35 mins|
|Google Mail through Hub||6 (5 of them to google.com (??))||www.google.com: <= 5 mins; google.com: <= 5 mins|
Appendix B: SSL session ticket and ID lifetime
|site||session ticket lifetime||session ID lifetime|
|api.facebook.com||~ 24 hours||>= 60 min|
|graph.facebook.com||~ 24 hours||>= 60 min|
|fbcdn-profile-a.akamaihd.net||2 hours||>= 45 min, < 60 min|
|api.twitter.com||4 hours||>= 60 min|
|si0.twimg.com||5 minutes||not supported|
|api.linkedin.com||not supported||< 5 min|
|touch.www.linkedin.com||not supported||< 5 min|
|media.licdn.com||not supported||< 5 min|
|foursquare.com||0||< 5 min|