QtNetwork performance

Latest revision as of 09:38, 1 April 2015


Judging from experience, performance improvements in networking are orders of magnitude larger than those gained from reducing library size, avoiding memcpy etc. In the particular case of code issuing a network request, parsing the HTTP replies, computing SSL keys etc. takes less than 5% of the overall time, i.e. more than 95% of the time shown below is spent waiting for network I/O.

In other words: when trying to improve QtNetwork performance, try to shave off network round trips and reduce latency rather than trying to speed up the parsing of HTTP replies, avoid memcpy etc.



For example, when opening the Facebook app on BlackBerry 10, it takes 1.5 to 2 seconds (worst case) until data is retrieved. This time can be broken down into several parts (numbers measured with tcpdump on a Wi-Fi network):

[Image: connecting to Facebook on BB10, http://s1.directupload.net/images/130424/gupt9gmt.png]
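As a rough illustration of why round trips dominate, the breakdown can be modelled as a sum of round trips multiplied by the round-trip time. This is only a sketch: the type and function names are made up for this example, and the round-trip counts per phase (1 for DNS, 1 for TCP, 2 for a full SSL handshake, 1 for HTTP) are assumptions for a cold connection, not numbers read from the measurement above.

```cpp
#include <vector>

// One phase of fetching data. The round-trip counts used with this type
// are illustrative assumptions, not measurements from this page.
struct Phase {
    const char *name;
    int roundTrips;
};

// Time spent waiting on the network: each round trip costs one RTT.
inline int networkWaitMs(const std::vector<Phase> &phases, int rttMs)
{
    int total = 0;
    for (const Phase &p : phases)
        total += p.roundTrips * rttMs;
    return total;
}

// A cold HTTPS request: DNS, TCP handshake, full SSL handshake, HTTP.
inline std::vector<Phase> coldHttpsRequest()
{
    return { {"DNS lookup", 1}, {"TCP handshake", 1},
             {"SSL handshake", 2}, {"HTTP request/reply", 1} };
}
```

With an assumed round-trip time of 300 ms, `networkWaitMs(coldHttpsRequest(), 300)` yields 1500 ms, which is in the same range as the 1.5 to 2 seconds quoted above; a request on an already established connection only pays for the HTTP round trip.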

To improve performance for the different types of network traffic, the following things can be done. Note that some of them do not help at app startup, but only while the app is already running.

1. DNS lookup:

  • enable global DNS cache. If there is a global DNS cache on your system, you probably want to enable it; this will most likely provide a massive performance boost. For examples of DNS TTL times, see Appendix A at the bottom of this page.
  • [more concurrent requests] (https://bugreports.qt.io/browse/QTBUG-30866). Currently, Qt supports 5 concurrent requests, which is too low; sometimes DNS requests are stalled waiting for another one to finish.
  • [Qt DNS cache] (https://bugreports.qt.io/browse/QTBUG-30807). Already implemented in Qt, but cached entries only live as long as the app is running, i.e. the first DNS lookup will always go to the system DNS cache or the network.
  • [do not look up the same DNS name twice] (https://bugreports.qt.io/browse/QTBUG-30867)
  • [pre-DNS lookup] (https://bugreports.qt.io/browse/QTBUG-30771), i.e. make the DNS lookup not upon connection, but as early as possible. This will not help right when the app is started, but it will for subsequent requests. Even better: pre-TCP handshake, see below. Better still: pre-SSL handshake, see below.
  • [asynchronous DNS lookup] (https://bugreports.qt.io/browse/QTBUG-30795). Since getaddrinfo() is a synchronous call, Qt executes it in a thread (through QRunnable / QThreadPool). This is not costly by itself, but can become so when many runnables / threads compete for CPU time and the thread pool queues DNS lookups. Whether asynchronous calls for DNS lookups are available depends on the platform (e.g. Linux: getaddrinfo_a()).
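The caching items above can be pictured with a small TTL-based cache. This is a minimal sketch in plain C++, not Qt's internal cache: the class name `DnsTtlCache`, its methods and the address strings are hypothetical, and the clock is passed in explicitly so the expiry logic is easy to follow.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical host-name cache illustrating TTL-based expiry.
// This is not Qt's internal cache; all names here are made up.
class DnsTtlCache {
public:
    using Clock = std::chrono::steady_clock;

    // Remember 'address' for 'host' until 'now + ttl'.
    void insert(const std::string &host, const std::string &address,
                std::chrono::seconds ttl, Clock::time_point now)
    {
        entries_[host] = Entry{address, now + ttl};
    }

    // Returns true and fills 'address' if a non-expired entry exists.
    bool lookup(const std::string &host, Clock::time_point now,
                std::string *address)
    {
        auto it = entries_.find(host);
        if (it == entries_.end())
            return false;              // never resolved: go to the network
        if (now >= it->second.expires) {
            entries_.erase(it);        // stale: caller must resolve again
            return false;
        }
        *address = it->second.address;
        return true;
    }

private:
    struct Entry {
        std::string address;
        Clock::time_point expires;
    };
    std::map<std::string, Entry> entries_;
};
```

A cache hit saves one round trip, but only for as long as the TTL allows: with a TTL of about 3 minutes (as listed for api.twitter.com in Appendix A), an entry is gone again shortly after it was stored, which is why looking names up early and sharing a system-wide cache matter. In application code, a lookup can be started ahead of time with QHostInfo::lookupHost().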

2. TCP handshake:

  • [pre-TCP handshake] (https://bugreports.qt.io/browse/QTBUG-28762), i.e. make the TCP handshake not upon connection, but as early as possible. This will not help right when the app is started, but it will for subsequent requests. Even better: pre-SSL handshake, see below.
  • [pre-connect more sockets per host] (https://bugreports.qt.io/browse/QTBUG-28762). We don't necessarily need smart heuristics for that (although we could add some by inspecting the cache), but if an app knows it needs more connections anyhow, it could tell us and we could pre-allocate sockets.
  • persistent connections, i.e. TCP connections are cached for 1 minute and can be reused; Qt has always implemented this. Reuse is also good because such sockets already have a large TCP congestion window, which avoids TCP slow start. Works best if the app only uses 1 instance of QNetworkAccessManager, and there is no reason to use more than 1 instance anyhow.
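To see why a single QNetworkAccessManager instance helps, persistent connections can be modelled as a pool of idle connections owned by one manager. The sketch below is a simplified model, not Qt's implementation: `IdleConnectionPool` and its methods are hypothetical, it tracks at most one idle connection per host, and only the 1 minute idle timeout is taken from the text above.

```cpp
#include <map>
#include <string>

// Hypothetical model of the idle connections kept by one manager.
// Tracks at most one idle connection per host for simplicity.
class IdleConnectionPool {
public:
    explicit IdleConnectionPool(int idleTimeoutSec = 60)
        : idleTimeoutSec_(idleTimeoutSec) {}

    // Called when a request finishes and its connection becomes idle.
    void release(const std::string &host, int nowSec)
    {
        idleSince_[host] = nowSec;
    }

    // Returns true if an idle connection to 'host' can be reused,
    // i.e. no new TCP (and SSL) handshake is needed.
    bool acquire(const std::string &host, int nowSec)
    {
        auto it = idleSince_.find(host);
        if (it == idleSince_.end())
            return false;
        const bool fresh = nowSec - it->second < idleTimeoutSec_;
        idleSince_.erase(it);   // either handed out or timed out
        return fresh;
    }

private:
    int idleTimeoutSec_;
    std::map<std::string, int> idleSince_;
};
```

In this model, a connection released by one pool can never be acquired from another pool, which mirrors the point above: two manager instances talking to the same host each pay for their own handshakes.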

3. SSL handshake:

  • [pre-SSL handshake] (https://bugreports.qt.io/browse/QTBUG-30771). Would not help at app start, but if an app knows a server will be used, it should do the SSL handshake as early as possible.
  • [try to persist / re-use SSL session IDs] (https://bugreports.qt.io/browse/QTBUG-30878). If we find servers with a long session re-use lifetime, it might be worth persisting the session ID across app restarts. For measured lifetimes, see Appendix B.
  • [SSL: enable TLS session tickets (Stateless TLS Session Resumption)] (https://bugreports.qt.io/browse/QTBUG-20668). Probably preferable to re-using session IDs because of the longer lifetime.
  • [use SSL False Start] (https://bugreports.qt.io/browse/QTBUG-15452). Currently not implemented in OpenSSL (but in NSS), so unfortunately there is nothing we can do, but it would save us 1 round trip for all SSL handshakes, which would be a massive performance boost.
  • [enable SSL session sharing] (https://bugreports.qt.io/browse/QTBUG-14983). Already implemented, saving 1 network round trip (~ 200-300 ms) for sockets re-using an SSL session. Works best if the app only uses 1 instance of QNetworkAccessManager, and there is no reason to use more than 1 instance anyhow.
  • [use faster SSL ciphers] (https://bugreports.qt.io/browse/QTBUG-28786). Probably not much gain, not important for now from a performance point of view.
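The effect of session re-use can be sketched as a decision between a full and a resumed handshake. The code below is an illustration only: `StoredSession`, `planHandshake()` and `handshakeRoundTrips()` are hypothetical names, and the round-trip counts assume TLS 1.2 or earlier without False Start (2 round trips for a full handshake, 1 for a resumed one).

```cpp
// Kind of SSL handshake a new connection has to perform.
enum class Handshake { Full, Resumed };

// Round trips for the handshake, assuming TLS 1.2 or earlier
// without False Start: a full handshake takes 2, a resumed one 1.
inline int handshakeRoundTrips(Handshake h)
{
    return h == Handshake::Full ? 2 : 1;
}

// Hypothetical record of a session ticket or session ID kept by the app.
struct StoredSession {
    long long storedAtSec;   // when the ticket or session ID was received
    long long lifetimeSec;   // how long the server honours it, 0 = not supported
};

// A stored session only helps while the server still accepts it.
inline Handshake planHandshake(const StoredSession &s, long long nowSec)
{
    const bool usable = s.lifetimeSec > 0
                        && nowSec - s.storedAtSec < s.lifetimeSec;
    return usable ? Handshake::Resumed : Handshake::Full;
}
```

With the lifetimes from Appendix B, a ticket valid for about 24 hours is still worth keeping after an app restart an hour later, whereas a ticket valid for 5 minutes has long expired and the connection falls back to the full handshake.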

4. HTTP request/reply:

  • [improve HTTP cache] (https://bugreports.qt.io/browse/QTBUG-18751). The cache currently reads everything from disk instead of keeping items in memory, and IIRC does not load as much from the cache as it could. Improving this would be really good for performance, but is quite some work as well.
  • [prioritize requests] (https://bugreports.qt.io/browse/QTBUG-30732). Needs to be done by the app, but we should definitely prioritize requests that contain links to other resources (for apps: JSON and XML, just like browsers prioritize HTML/CSS/JS over JPG etc.) over requests for "end" resources like images.
  • [enable HTTP pipelining by default] (https://bugreports.qt.io/browse/QTBUG-19052). Could also be done by the app on a per-request basis; we also need to check whether this is fine for SSL requests. Might be a gain, but is also a bit risky, even though I think the servers of big sites like Facebook, Twitter etc. should be fine with it.
  • [implement SPDY] (https://bugreports.qt.io/browse/QTBUG-18714). Supported by Google, Facebook, Twitter and probably others. No idea how much gain this would be.
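Request prioritization can be sketched as a priority queue in which requests for linking resources overtake requests for "end" resources. This is a minimal model in plain C++, not part of QtNetwork: `PendingRequest`, `priorityFor()` and `RequestQueue` are hypothetical, and the mapping from content type to priority is an assumption based on the rule described above.

```cpp
#include <queue>
#include <string>
#include <vector>

enum class Priority { High = 0, Normal = 1, Low = 2 };

struct PendingRequest {
    std::string url;
    Priority priority;
    unsigned sequence;   // submission order, used to break ties
};

// Assumed rule: resources that link to other resources go first,
// "end" resources such as images go last.
inline Priority priorityFor(const std::string &contentType)
{
    if (contentType == "application/json" || contentType == "application/xml")
        return Priority::High;
    if (contentType.compare(0, 6, "image/") == 0)
        return Priority::Low;
    return Priority::Normal;
}

// Orders the queue so that top() is the most urgent request and,
// among equally urgent ones, the one submitted first.
struct LessUrgent {
    bool operator()(const PendingRequest &a, const PendingRequest &b) const
    {
        if (a.priority != b.priority)
            return a.priority > b.priority;
        return a.sequence > b.sequence;
    }
};

using RequestQueue =
    std::priority_queue<PendingRequest, std::vector<PendingRequest>, LessUrgent>;
```

Pushing an image, a JSON feed and an HTML page in that order and then draining the queue yields the JSON feed first and the image last. In a Qt application, the corresponding hint is given per request with QNetworkRequest::setPriority().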

There are more items listed at [QtNetwork performance improvements on Jira] (https://bugreports.qt.io/browse/QTBUG-28762), but I think these are the most important ones. Feel free to help implement them :)

Questions, suggestions etc.: Please contact us on irc.freenode.net #qt-earth-team

Appendix A: DNS TTL table
{| class="wikitable"
! app
! # DNS requests
! average DNS TTL
|-
| Facebook
| 4+
| api.facebook.com: <= 1 hour<br /> graph.facebook.com: <= 1 hour<br /> others (for images etc.): <= 5 mins
|-
| Twitter
| ? (app was broken on my build)
| api.twitter.com: ~ 3 mins
|-
| LinkedIn
| ~ 10 (2 for LinkedIn and the rest for other servers)
| api.linkedin.com: only a few seconds<br /> touch.www.linkedin.com: only a few seconds<br /> rest: 5-35 mins
|-
| Google Mail through Hub
| 6 (5 of them to google.com (??))
| www.google.com: <= 5 mins<br /> google.com: <= 5 mins
|}

Appendix B: SSL session ticket and ID lifetime
{| class="wikitable"
! site
! session ticket lifetime
! session ID lifetime
|-
| api.facebook.com
| ~ 24 hours
| >= 60 min
|-
| graph.facebook.com
| ~ 24 hours
| >= 60 min
|-
| fbcdn-profile-a.akamaihd.net
| 2 hours
| >= 45 min, < 60 min
|-
| api.twitter.com
| 4 hours
| >= 60 min
|-
| si0.twimg.com
| 5 minutes
| not supported
|-
| api.linkedin.com
| not supported
| < 5 min
|-
| touch.www.linkedin.com
| not supported
| < 5 min
|-
| media.licdn.com
| not supported
| < 5 min
|-
| foursquare.com
| 0
| < 5 min
|-
| api.foursquare.com
| 0
| not supported
|}