QtCS2017 Discuss Qt Logging enhancements: Difference between revisions

Latest revision as of 17:22, 30 October 2017

Discussion

Structured logging:
- We want to extract information from logs and detect common messages. For example:

From	To
Got message "Hello"	Got message "%s" Hello
Got message "World"	Got message "%s" World

- We want to allow logging to store ancillary data, possibly in machine-readable format, likely not part of the message itself
  - Like journald API can do
  - How do we allow different formatting depending on backend? Needs to support formatting user types too.
Structured output
- Store to databases, for example
- Select backend with environment variable

Need to review with LTTNG and ETW tracing

ML discussion: http://lists.qt-project.org/pipermail/development/2017-October/031047.html

Summary

Qt Logging lacks regarding 2 different use cases:

1/ High-performance code (say HW handling, big data processing) needs a way to inject a few structured tracepoints to generate large binary traces (say, for profiling)

2/ Existing tracepoints (think about nb of qDebug in Qt user code) should generate more structured traces to:

facilitate exploration and diagnostic, and
allow immediate use of analysis tools like Python

Service and Device developers need #1. Medical Devices would benefit from #2 for post-market surveillance. Device designers operating in an uncontrolled environment would probably benefit from #2 as well

Prototypes

For #1: LTTNG and ETW tracing use external description of Statically-Defined Tracepoints (SDT) to generate code for LTTNG and ETW backends (or no code at all)
For #2: Qt-based modmedLog library (presentation, forge) uses Bind<T> and IData implementations to bind tracepoint data to JSON/XML/Console outputs and possibly read them back for replay

Conclusion

We should sit down and see how these 2 use cases intersect and the best things that are provided by each prototype

Requirements

For #1:

as little generated code as possible at the tracepoint:
- no binary or numeric formatting
- completely removable from release code (not just bypassing like QNoDebug::operator<<)

For #2:

source compatibility with qDebug tracepoints
- do not require non trivial changes such as changing from qDebug()<< to qDebug(...)
- allow changing qDebug() to qTrace() or whatever to provide choice of old vs structured tracing
- extensible to any user data type
offer structured trace formats (TSV+JSON, XML, sqlite?) to avoid low-level parsing

Questions

Do we still obey QT_MESSAGE_PATTERN? YES and it could be extended with, say, %{format} %{arg0} ...
Can we generate efficient tracepoint code without an external tracepoint description and tool?
Should we provide a format+args handler delegating by default to the existing handler?

(*QtMessageHandler2)(...,const QString & format, const QStringList & args);

Should we provide a formatting-independant handler based on Bind<T>:

(*QtMessageHandler3)(...,const QString & format, const QList<BindableRef> & args);

What about modmedLog performance?
When variadic macros and templates are available, can they be used to get the best of both prototypes?

Tracing/Logging survey

Tracepoint kinds

Dynamically-Defined

Data layout built at run-time (usually text format but binary is possible too) :

qDebug()<<... (QString msg assembled by QDebug)
LTTng tracef(format,...) (null-terminated string formatted by printf)
sd_journal_send(format,...)
MacOS os_log(level,format,....)
ETW TraceEvent(...,buffer) (buffer containing assembled data and metadata)

PROS: allows any kind of event data (nullable, dynamically-sized structures and arrays, variants, nested data)

CONS: prohibits bare metal performance (runtime function calls for assembling event data, data structure must be explicitely repeated for all event occurences)

USAGE: maximum acceptable effort for most app developer traces (see rsyslog author's remark on developers still not identifying their events 15+years after MS event log offered the possibility: http://blog.gerhards.net/2011/11/what-i-dont-like-about-journald.html )

Statically-Defined

(data layout fixed at compile-time)

External definition
- ETW manifest for uniquely identified events
- Proposed Qt tool https://codereview.qt-project.org/#/c/185287/
Macro-based assembling
- MS TraceLoggingWrite() and related macros
- LTTng tracepoint()

PROS: allows bare metal performance (no heap allocation, fixed instruction pipeline, data structure removed from event occurences)

CONS:

prohibits or complicates some dynamic event data (nullable, dynamically-sized structures and arrays, variants, nested data)
app developers will generally not take the burden to generate an UUID, add a prebuild step to their build and describe trace points data when they can just append it to a generic tracepoint (see above)

USAGE: system troubleshooting for sysadmin

Probes

(no-ops that can be dynamically changed to collect data with DTrace or similar tools)

DTrace (on MacOS for instance)
Linux bcc/eBPF

PROS: allows many live analysis with limited performance impact (no impact when unused)

CONS: probe safety is a concern (solved by DTrace in general but still undesirable for certified software IMO)

USAGE: performance analysis graal for experts only (DTrace author admits most DTrace users just run existing D scripts)

Tracepoint data formats

Binary

LTTng CTF (public, though it remains unclear how to pass dynamically-encoded CTF to LTTng library)
MS EtwSelfDescribingFormat (private, though TraceLogging macros may be used to understand it)
CBOR (https://tools.ietf.org/html/rfc7049, https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml)

PROS: write performance mainly depends on memory copies, sometime on memory allocations and function calls

CONS: difficult choice between interoperability and native toolsets

Text

printf(format,...)
qDebug(format,...)
qDebug()<<...
LTTng tracef(format,...)

CONS: write performance impaired by encoding of numbers, QString, QByteArray

PROS: interoperability and evolutions facilitated

Configurable

modmedLog LogOutput + IData
log4xx
...

Trace backends

File

PROS: interoperability, simple toolset

CONS: unknown throughput, lacks basic logging features (log rotation)

MS ETW / LTTng

PROS: best throughput, async IO, unlogged events accessible from crash dump

CONS: specialised and platform-dependent toolset unknown from most app developers

MS EventViewer / Journald / syslog

PROS: good throughput

CONS: platform-dependent toolset

Configurable

QtMessageHandler
modmedLog LogOutput
log4xx
...

@@ Line 21: / Line 21: @@
 * Structured output
 ** Store to databases, for example
-** Do we still obey QT_MESSAGE_PATTERN?
 ** Select backend with environment variable
 Need to review with [https://codereview.qt-project.org/185287 LTTNG and ETW tracing]
-==== Requirements? ====
+ML discussion: http://lists.qt-project.org/pipermail/development/2017-October/031047.html
-performance similar to the current state (or provide choice)
-streaming generates a lot of code
+== Summary ==
-do not change qdebug but provide structured tracing as a new facility
-== Executive Summary ==
 Qt Logging lacks regarding 2 different use cases:
@@ Line 46: / Line 40: @@
 === Prototypes ===
-* LTTNG/ETW support currently under Gerrit uses external tracepoint data description to generate code for LTTNG and ETW backends (or no code at all)
+* For #1: [https://codereview.qt-project.org/185287 LTTNG and ETW tracing] use external description of Statically-Defined Tracepoints (SDT) to generate code for LTTNG and ETW backends (or no code at all)
-* Qt-based modmedLog library uses Bind<T> and IData implementations to bind tracepoint data to JSON/XML/Console outputs and possibly read them back for replay
+* For #2: Qt-based modmedLog library ([https://drive.google.com/file/d/0B_ZGW8rLIfE3aHVxR3loMThTbzA/view?usp=sharing presentation], [https://forge.imag.fr/frs/?group_id=982 forge]) uses Bind<T> and IData implementations to bind tracepoint data to JSON/XML/Console outputs and possibly read them back for replay
 === Conclusion ===
 We should sit down and see how these 2 use cases intersect and the best things that are provided by each prototype
-=== Requirements ===
+== Requirements ==
 For #1:
 * as little generated code as possible at the tracepoint:
 ** no binary or numeric formatting
-** completely removable from release code
+** completely removable from release code (not just bypassing like QNoDebug::operator<<)
 For #2:
 * source compatibility with qDebug tracepoints
 ** do not require non trivial changes such as changing from qDebug()<< to qDebug(...)
+** allow changing qDebug() to qTrace() or whatever to provide choice of old vs structured tracing
 ** extensible to any user data type
 * offer structured trace formats (TSV+JSON, XML, sqlite?) to avoid low-level parsing
-=== Open Questions ===
+== Questions ==
+* Do we still obey QT_MESSAGE_PATTERN? YES and it could be extended with, say, %{format} %{arg0} ...
 * Can we generate efficient tracepoint code without an external tracepoint description and tool?
 * Should we provide a format+args handler delegating by default to the existing handler?
@@ Line 71: / Line 67: @@
 * What about modmedLog performance?
 * When variadic macros and templates are available, can they be used to get the best of both prototypes?
+== Tracing/Logging survey ==
+=== Tracepoint kinds ===
+==== Dynamically-Defined ====
+Data layout built at run-time (usually text format but binary is possible too) :
+* qDebug()<<... (QString msg assembled by QDebug)
+* LTTng tracef(format,...) (null-terminated string formatted by printf)
+* sd_journal_send(format,...)
+* MacOS os_log(level,format,....)
+* ETW TraceEvent(...,buffer) (buffer containing assembled data and metadata)
+PROS: allows any kind of event data (nullable, dynamically-sized structures and arrays, variants, nested data)
+CONS: prohibits bare metal performance (runtime function calls for assembling event data, data structure must be explicitely repeated for all event occurences)
+USAGE: maximum acceptable effort for most app developer traces
+(see rsyslog author's remark on developers still not identifying their events 15+years after MS event log offered the possibility: http://blog.gerhards.net/2011/11/what-i-dont-like-about-journald.html )
+==== Statically-Defined ====
+(data layout fixed at compile-time)
+* External definition
+** ETW manifest for uniquely identified events
+** Proposed Qt tool https://codereview.qt-project.org/#/c/185287/
+* Macro-based assembling
+** MS TraceLoggingWrite() and related macros
+** LTTng tracepoint()
+PROS: allows bare metal performance (no heap allocation, fixed instruction pipeline, data structure removed from event occurences)
+CONS:
+* prohibits or complicates some dynamic event data (nullable, dynamically-sized structures and arrays, variants, nested data)
+* app developers will generally not take the burden to generate an UUID, add a prebuild step to their build and describe trace points data when they can just append it to a generic tracepoint (see above)
+USAGE: system troubleshooting for sysadmin
+==== Probes ====
+(no-ops that can be dynamically changed to collect data with DTrace or similar tools)
+* DTrace (on MacOS for instance)
+* Linux bcc/eBPF
+PROS: allows many live analysis with limited performance impact (no impact when unused)
+CONS: probe safety is a concern (solved by DTrace in general but still undesirable for certified software IMO)
+USAGE: performance analysis graal for experts only (DTrace author admits most DTrace users just run existing D scripts)
+=== Tracepoint data formats ===
+==== Binary ====
+* LTTng CTF (public, though it remains unclear how to pass dynamically-encoded CTF to LTTng library)
+* MS EtwSelfDescribingFormat (private, though TraceLogging macros may be used to understand it)
+* CBOR (https://tools.ietf.org/html/rfc7049, https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml)
+PROS: write performance mainly depends on memory copies, sometime on memory allocations and function calls
+CONS: difficult choice between interoperability and native toolsets
+==== Text ====
+* printf(format,...)
+* qDebug(format,...)
+* qDebug()<<...
+* LTTng tracef(format,...)
+CONS: write performance impaired by encoding of numbers, QString, QByteArray
+PROS: interoperability and evolutions facilitated
+==== Configurable ====
+* modmedLog LogOutput + IData
+* log4xx
+* ...
+=== Trace backends ===
+==== File ====
+PROS: interoperability, simple toolset
+CONS: unknown throughput, lacks basic logging features (log rotation)
+==== MS ETW / LTTng ====
+PROS: best throughput, async IO, unlogged events accessible from crash dump
+CONS: specialised and platform-dependent toolset unknown from most app developers
+==== MS EventViewer / Journald / syslog ====
+PROS: good throughput
+CONS: platform-dependent toolset
+==== Configurable ====
+* QtMessageHandler
+* modmedLog LogOutput
+* log4xx
+* ...

QtCS2017 Discuss Qt Logging enhancements: Difference between revisions

Latest revision as of 17:22, 30 October 2017

Contents

Discussion

Summary

Prototypes

Conclusion

Requirements

Questions

Tracing/Logging survey

Tracepoint kinds

Dynamically-Defined

Statically-Defined

Probes

Tracepoint data formats

Binary

Text

Configurable

Trace backends

File

MS ETW / LTTng

MS EventViewer / Journald / syslog

Configurable

Navigation menu

QtCS2017 Discuss Qt Logging enhancements: Difference between revisions

Latest revision as of 17:22, 30 October 2017

Discussion

Summary

Prototypes

Conclusion

Requirements

Questions

Tracing/Logging survey

Tracepoint kinds

Dynamically-Defined

Statically-Defined

Probes

Tracepoint data formats

Binary

Text

Configurable

Trace backends

File

MS ETW / LTTng

MS EventViewer / Journald / syslog

Configurable

Navigation menu

Search