Handling Document Formats

From Qt Wiki

Many use cases require Qt applications to deal with document formats, usually either parsing and writing documents behind the scenes or displaying documents to the user. This page covers some general considerations and provides an overview of the wiki pages that discuss the available options for specific formats.

General Considerations

The Scribe Framework

Although Qt cannot provide built-in functionality for every imaginable document-handling use case, it does ship with a generic rich text document framework, nicknamed "Scribe".

It revolves around the class QTextDocument, which provides an object-oriented frame-based representation of a document consisting of blocks (sub-frames, paragraphs, tables, lists, …) which in turn can contain strings of styled text fragments.

An API is included for loading from HTML and saving to HTML and ODT (see QTextDocumentWriter), as well as for displaying documents to the user (in read-only or interactively editable mode) through QTextEdit.

The Scribe framework's built-in document feature set (which all built-in loading/saving/displaying/editing operations are limited to) covers only the basics and doesn't come anywhere close to what modern full-featured document formats and authoring tools (like Microsoft Word) support, although it is sufficient for many tasks such as generating reports. Most parts of the framework are extensible through subclassing, so application authors can implement additional document features or save/load formats as they see fit.

XML Processing

Many modern document formats are based on XML, so depending on the kind of processing you wish to perform, manually parsing or writing documents with Qt's XML handling classes might be a viable option.

  • The efficient XML Streaming classes available in QtCore are recommended for most purposes.
  • In some cases the SAX and DOM classes from the QtXml module can be a useful alternative.
  • If your application needs to repeatedly extract a certain piece of information from, or apply a certain transformation to, many documents with a similar structure, then the QtXmlPatterns module might provide an elegant solution.

Individual Formats

For information/tips (gathered by the community) on how to work with a specific document format in your Qt application, click on the name of the format in the list below:

Text Documents

HTML .html .htm .xhtml
PDF .pdf
Microsoft Word .doc .docx (native format of Microsoft Word)
OpenDocument Text .odt (native format of OpenOffice/LibreOffice Writer, among others)
Rich Text Format .rtf (here referring specifically to Microsoft's "RTF" format, not rich text in general)
LaTeX .tex


Spreadsheets

Microsoft Excel .xls .xlsx (native format of Microsoft Excel)
OpenDocument Spreadsheet .ods (native format of OpenOffice/LibreOffice Calc, among others)
Comma-separated values .csv (simple file format that is widely supported by consumer, business, and scientific applications)


Presentations

Microsoft PowerPoint .ppt .pptx (native format of Microsoft PowerPoint)
OpenDocument Presentation .odp (native format of OpenOffice/LibreOffice Impress, among others)


Math Formulas

MathML .mml
OpenDocument Formula .odf (native format of OpenOffice/LibreOffice Math, among others)
LaTeX Math

See Also