QtCS2017 Discuss Qt Logging enhancements: Difference between revisions
Arnaud Clere (talk | contribs) (added presentation summary) |
Arnaud Clere (talk | contribs) (cosmetic change) |
||
(13 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
== Discussion == | |||
* | * Structured logging: | ||
* | ** We want to extract information from logs and detect common messages. For example: | ||
and how to | {| | ||
! From | |||
! To | |||
|- | |||
| | |||
Got message "Hello" | |||
| | |||
Got message "%s" Hello | |||
|- | |||
| | |||
Got message "World" | |||
| | |||
Got message "%s" World | |||
|} | |||
** We want to allow logging to store ancillary data, possibly in machine-readable format, likely not part of the message itself | |||
*** Like journald API can do | |||
*** How do we allow different formatting depending on backend? Needs to support formatting user types too. | |||
* Structured output | |||
** Store to databases, for example | |||
** Select backend with environment variable | |||
Need to review with [https://codereview.qt-project.org/185287 LTTNG and ETW tracing] | |||
ML discussion: http://lists.qt-project.org/pipermail/development/2017-October/031047.html | |||
== Summary == | |||
Qt Logging lacks regarding 2 different use cases: | |||
1/ High-performance code (say HW handling, big data processing) needs a way to inject a few structured tracepoints to generate large binary traces (say, for profiling) | |||
2/ Existing tracepoints (think about nb of qDebug in Qt user code) should generate more structured traces to: | |||
* facilitate exploration and diagnostic, and | |||
* allow immediate use of analysis tools like Python | |||
Service and Device developers need #1. Medical Devices would benefit from #2 for post-market surveillance. Device designers operating in an uncontrolled environment would probably benefit from #2 as well | |||
=== Prototypes === | |||
* For #1: [https://codereview.qt-project.org/185287 LTTNG and ETW tracing] use external description of Statically-Defined Tracepoints (SDT) to generate code for LTTNG and ETW backends (or no code at all) | |||
* For #2: Qt-based modmedLog library ([https://drive.google.com/file/d/0B_ZGW8rLIfE3aHVxR3loMThTbzA/view?usp=sharing presentation], [https://forge.imag.fr/frs/?group_id=982 forge]) uses Bind<T> and IData implementations to bind tracepoint data to JSON/XML/Console outputs and possibly read them back for replay | |||
=== Conclusion === | |||
We should sit down and see how these 2 use cases intersect and the best things that are provided by each prototype | |||
== Requirements == | |||
For #1: | |||
* as little generated code as possible at the tracepoint: | |||
** no binary or numeric formatting | |||
** completely removable from release code (not just bypassing like QNoDebug::operator<<) | |||
For #2: | |||
* source compatibility with qDebug tracepoints | |||
** do not require non trivial changes such as changing from qDebug()<< to qDebug(...) | |||
** allow changing qDebug() to qTrace() or whatever to provide choice of old vs structured tracing | |||
** extensible to any user data type | |||
* offer structured trace formats (TSV+JSON, XML, sqlite?) to avoid low-level parsing | |||
== Questions == | |||
* Do we still obey QT_MESSAGE_PATTERN? YES and it could be extended with, say, %{format} %{arg0} ... | |||
* Can we generate efficient tracepoint code without an external tracepoint description and tool? | |||
* Should we provide a format+args handler delegating by default to the existing handler? | |||
(*QtMessageHandler2)(...,const QString & format, const QStringList & args); | |||
* Should we provide a formatting-independant handler based on Bind<T>: | |||
(*QtMessageHandler3)(...,const QString & format, const QList<BindableRef> & args); | |||
* What about modmedLog performance? | |||
* When variadic macros and templates are available, can they be used to get the best of both prototypes? | |||
== Tracing/Logging survey == | |||
=== Tracepoint kinds === | |||
==== Dynamically-Defined ==== | |||
Data layout built at run-time (usually text format but binary is possible too) : | |||
* qDebug()<<... (QString msg assembled by QDebug) | |||
* LTTng tracef(format,...) (null-terminated string formatted by printf) | |||
* sd_journal_send(format,...) | |||
* MacOS os_log(level,format,....) | |||
* ETW TraceEvent(...,buffer) (buffer containing assembled data and metadata) | |||
PROS: allows any kind of event data (nullable, dynamically-sized structures and arrays, variants, nested data) | |||
CONS: prohibits bare metal performance (runtime function calls for assembling event data, data structure must be explicitely repeated for all event occurences) | |||
USAGE: maximum acceptable effort for most app developer traces | |||
(see rsyslog author's remark on developers still not identifying their events 15+years after MS event log offered the possibility: http://blog.gerhards.net/2011/11/what-i-dont-like-about-journald.html ) | |||
==== Statically-Defined ==== | |||
(data layout fixed at compile-time) | |||
* External definition | |||
** ETW manifest for uniquely identified events | |||
** Proposed Qt tool https://codereview.qt-project.org/#/c/185287/ | |||
* Macro-based assembling | |||
** MS TraceLoggingWrite() and related macros | |||
** LTTng tracepoint() | |||
PROS: allows bare metal performance (no heap allocation, fixed instruction pipeline, data structure removed from event occurences) | |||
CONS: | |||
* prohibits or complicates some dynamic event data (nullable, dynamically-sized structures and arrays, variants, nested data) | |||
* app developers will generally not take the burden to generate an UUID, add a prebuild step to their build and describe trace points data when they can just append it to a generic tracepoint (see above) | |||
USAGE: system troubleshooting for sysadmin | |||
==== Probes ==== | |||
(no-ops that can be dynamically changed to collect data with DTrace or similar tools) | |||
* DTrace (on MacOS for instance) | |||
* Linux bcc/eBPF | |||
PROS: allows many live analysis with limited performance impact (no impact when unused) | |||
CONS: probe safety is a concern (solved by DTrace in general but still undesirable for certified software IMO) | |||
USAGE: performance analysis graal for experts only (DTrace author admits most DTrace users just run existing D scripts) | |||
=== Tracepoint data formats === | |||
==== Binary ==== | |||
* LTTng CTF (public, though it remains unclear how to pass dynamically-encoded CTF to LTTng library) | |||
* MS EtwSelfDescribingFormat (private, though TraceLogging macros may be used to understand it) | |||
* CBOR (https://tools.ietf.org/html/rfc7049, https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml) | |||
PROS: write performance mainly depends on memory copies, sometime on memory allocations and function calls | |||
CONS: difficult choice between interoperability and native toolsets | |||
==== Text ==== | |||
* printf(format,...) | |||
* qDebug(format,...) | |||
* qDebug()<<... | |||
* LTTng tracef(format,...) | |||
CONS: write performance impaired by encoding of numbers, QString, QByteArray | |||
PROS: interoperability and evolutions facilitated | |||
==== Configurable ==== | |||
* modmedLog LogOutput + IData | |||
* log4xx | |||
* ... | |||
=== Trace backends === | |||
==== File ==== | |||
PROS: interoperability, simple toolset | |||
CONS: unknown throughput, lacks basic logging features (log rotation) | |||
==== MS ETW / LTTng ==== | |||
PROS: best throughput, async IO, unlogged events accessible from crash dump | |||
CONS: specialised and platform-dependent toolset unknown from most app developers | |||
==== MS EventViewer / Journald / syslog ==== | |||
PROS: good throughput | |||
CONS: platform-dependent toolset | |||
==== Configurable ==== | |||
* QtMessageHandler | |||
* modmedLog LogOutput | |||
* log4xx | |||
* ... |
Latest revision as of 17:22, 30 October 2017
Discussion
- Structured logging:
- We want to extract information from logs and detect common messages. For example:
From | To |
---|---|
Got message "Hello" |
Got message "%s" Hello |
Got message "World" |
Got message "%s" World |
- We want to allow logging to store ancillary data, possibly in machine-readable format, likely not part of the message itself
- Like journald API can do
- How do we allow different formatting depending on backend? Needs to support formatting user types too.
- We want to allow logging to store ancillary data, possibly in machine-readable format, likely not part of the message itself
- Structured output
- Store to databases, for example
- Select backend with environment variable
Need to review with LTTNG and ETW tracing
ML discussion: http://lists.qt-project.org/pipermail/development/2017-October/031047.html
Summary
Qt Logging lacks regarding 2 different use cases:
1/ High-performance code (say HW handling, big data processing) needs a way to inject a few structured tracepoints to generate large binary traces (say, for profiling)
2/ Existing tracepoints (think about nb of qDebug in Qt user code) should generate more structured traces to:
- facilitate exploration and diagnostic, and
- allow immediate use of analysis tools like Python
Service and Device developers need #1. Medical Devices would benefit from #2 for post-market surveillance. Device designers operating in an uncontrolled environment would probably benefit from #2 as well
Prototypes
- For #1: LTTNG and ETW tracing use external description of Statically-Defined Tracepoints (SDT) to generate code for LTTNG and ETW backends (or no code at all)
- For #2: Qt-based modmedLog library (presentation, forge) uses Bind<T> and IData implementations to bind tracepoint data to JSON/XML/Console outputs and possibly read them back for replay
Conclusion
We should sit down and see how these 2 use cases intersect and the best things that are provided by each prototype
Requirements
For #1:
- as little generated code as possible at the tracepoint:
- no binary or numeric formatting
- completely removable from release code (not just bypassing like QNoDebug::operator<<)
For #2:
- source compatibility with qDebug tracepoints
- do not require non trivial changes such as changing from qDebug()<< to qDebug(...)
- allow changing qDebug() to qTrace() or whatever to provide choice of old vs structured tracing
- extensible to any user data type
- offer structured trace formats (TSV+JSON, XML, sqlite?) to avoid low-level parsing
Questions
- Do we still obey QT_MESSAGE_PATTERN? YES and it could be extended with, say, %{format} %{arg0} ...
- Can we generate efficient tracepoint code without an external tracepoint description and tool?
- Should we provide a format+args handler delegating by default to the existing handler?
(*QtMessageHandler2)(...,const QString & format, const QStringList & args);
- Should we provide a formatting-independant handler based on Bind<T>:
(*QtMessageHandler3)(...,const QString & format, const QList<BindableRef> & args);
- What about modmedLog performance?
- When variadic macros and templates are available, can they be used to get the best of both prototypes?
Tracing/Logging survey
Tracepoint kinds
Dynamically-Defined
Data layout built at run-time (usually text format but binary is possible too) :
- qDebug()<<... (QString msg assembled by QDebug)
- LTTng tracef(format,...) (null-terminated string formatted by printf)
- sd_journal_send(format,...)
- MacOS os_log(level,format,....)
- ETW TraceEvent(...,buffer) (buffer containing assembled data and metadata)
PROS: allows any kind of event data (nullable, dynamically-sized structures and arrays, variants, nested data)
CONS: prohibits bare metal performance (runtime function calls for assembling event data, data structure must be explicitely repeated for all event occurences)
USAGE: maximum acceptable effort for most app developer traces (see rsyslog author's remark on developers still not identifying their events 15+years after MS event log offered the possibility: http://blog.gerhards.net/2011/11/what-i-dont-like-about-journald.html )
Statically-Defined
(data layout fixed at compile-time)
- External definition
- ETW manifest for uniquely identified events
- Proposed Qt tool https://codereview.qt-project.org/#/c/185287/
- Macro-based assembling
- MS TraceLoggingWrite() and related macros
- LTTng tracepoint()
PROS: allows bare metal performance (no heap allocation, fixed instruction pipeline, data structure removed from event occurences)
CONS:
- prohibits or complicates some dynamic event data (nullable, dynamically-sized structures and arrays, variants, nested data)
- app developers will generally not take the burden to generate an UUID, add a prebuild step to their build and describe trace points data when they can just append it to a generic tracepoint (see above)
USAGE: system troubleshooting for sysadmin
Probes
(no-ops that can be dynamically changed to collect data with DTrace or similar tools)
- DTrace (on MacOS for instance)
- Linux bcc/eBPF
PROS: allows many live analysis with limited performance impact (no impact when unused)
CONS: probe safety is a concern (solved by DTrace in general but still undesirable for certified software IMO)
USAGE: performance analysis graal for experts only (DTrace author admits most DTrace users just run existing D scripts)
Tracepoint data formats
Binary
- LTTng CTF (public, though it remains unclear how to pass dynamically-encoded CTF to LTTng library)
- MS EtwSelfDescribingFormat (private, though TraceLogging macros may be used to understand it)
- CBOR (https://tools.ietf.org/html/rfc7049, https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml)
PROS: write performance mainly depends on memory copies, sometime on memory allocations and function calls
CONS: difficult choice between interoperability and native toolsets
Text
- printf(format,...)
- qDebug(format,...)
- qDebug()<<...
- LTTng tracef(format,...)
CONS: write performance impaired by encoding of numbers, QString, QByteArray
PROS: interoperability and evolutions facilitated
Configurable
- modmedLog LogOutput + IData
- log4xx
- ...
Trace backends
File
PROS: interoperability, simple toolset
CONS: unknown throughput, lacks basic logging features (log rotation)
MS ETW / LTTng
PROS: best throughput, async IO, unlogged events accessible from crash dump
CONS: specialised and platform-dependent toolset unknown from most app developers
MS EventViewer / Journald / syslog
PROS: good throughput
CONS: platform-dependent toolset
Configurable
- QtMessageHandler
- modmedLog LogOutput
- log4xx
- ...