Handling Microsoft Excel file format: Difference between revisions

From Qt Wiki
Jump to navigation Jump to search
No edit summary
 
(Added xlnt library)
 
(26 intermediate revisions by 10 users not shown)
Line 1: Line 1:
=Handling Microsoft Excel (file format)=
{{Ambox|text=TODO: If you know more about the container format, and whether it really needs a specialized library for processing, please expand this section.}}


This page discusses various available options for working with [http://en.wikipedia.org/wiki/Microsoft_Excel#File_formats Microsoft Excel] ''[en.wikipedia.org]'' documents in your Qt application. Please also read the general considerations outlined on the [[Handling Document Formats]] page.
{{Ambox|text=TODO: Tips for displaying Excel documents which were manually parsed using one of the methods described.}}


''<font size="0.9em"><font color="#335">Note that this information is collaboratively collected by the community, with no promise of completeness or correctness. In particular, use your own research and judgment when evaluating third-party libraries or tools!</font></font>''
{{Ambox|text=TODO: If you know whether Excel provides a "viewer" ActiveX control that can be embedded in a Qt application through ActiveQt, please fill out this section (including links to relevant resources).}}
 
{{LangSwitch}}
[[Category:Developing_with_Qt]]
 
This page discusses various available options for working with [http://en.wikipedia.org/wiki/Microsoft_Excel#File_formats Microsoft Excel] documents in your Qt application. Please also read the general considerations outlined on the [[Handling_Document_Formats | Handling Document Formats]] page.


One needs to distinguish between two different formats (this page deals with both of them):
One needs to distinguish between two different formats (this page deals with both of them):


{| class="infotable line" style="width: 95%; margin-left: 2.5%"
{| class="wikitable"
|
|
! Legacy “Excel Spreadsheet” format
! style="text-align:left;" | Legacy "Excel Spreadsheet" format
! “Office Open <span class="caps">XML</span> Workbook” format
! style="text-align:left;" | "Office Open XML Workbook" format
|-
|-
| ''classification:''
| classification:
| binary <span class="caps">BIFF</span>-based
| binary BIFF-based
| <span class="caps">XML</span>-based
| XML-based
|-
|-
| ''main filename extension:''
| main filename extension:
| <span style="font: 1em monospace">.xls</span>
| .xls
| <font face="monospace">.xlsx</font>
| .xlsx
|-
|-
| ''main internet media type:''
| main internet media type:
| <span style="font: 0.9em monospace">application/vnd.ms-excel</span>
| <small> application/vnd.ms-excel </small>
| <span style="font: 0.9em monospace">application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</span>
| <small> application/vnd.openxmlformats-officedocument.spreadsheetml.sheet </small>
|-
|-
| ''default format of Excel:''
| default format of Excel:
| until Excel 2003
| until Excel 2003
| since Excel 2007
| since Excel 2007
|}
|}


==Reading / Writing==
== Reading / Writing ==


===Using Excel itself===
=== Using Excel itself ===


If you are exclusively targeting the Windows platform and Microsoft Excel will be installed on all target machines, then you can use [http://doc.qt.io/qt-4.8/activeqt.html Qt’s ActiveX framework] ''[qt.io]'' to access Excel’s spreadsheet processing functionality through <span class="caps">OLE</span> automation. For an introductory code example (and a way to list the <span class="caps">API</span> provided by the Excel <span class="caps">COM</span> object), consult [[Using ActiveX Object in QT|this how to]].
If you are exclusively targeting the Windows platform and Microsoft Excel will be installed on all target machines, then you can use [http://doc.qt.io/qt-4.8/activeqt.html Qt's ActiveX framework] to access Excel's spreadsheet processing functionality through OLE automation. For an introductory code example (and a way to list the API provided by the Excel COM object), consult [[Using_ActiveX_Object_in_QT | this how-to]].


{| class="infotable line" style="width: 95%; margin-left: 2.5%"
{| class="wikitable"
|
|
! <span class="caps">DLL</span> file name
! style="text-align:left;" | DLL file name
! <span class="caps">COM</span> object name
! style="text-align:left;" | COM object name
! platforms
! style="text-align:left;" | platforms
! license
! style="text-align:left;" | license
|-
|-
|
| [http://office.microsoft.com/excel/ '''Microsoft Excel''']
[http://office.microsoft.com/excel/ '''Microsoft Excel'''] ''[office.microsoft.com]''
| ?
| ?
| <font face="monospace">Excel.Application</font>
| Excel.Application
| Windows
| Windows
| <font color="#458">commercial</font>
| <span style="color:Navy"> commercial </span>
|}
|}


===Using <span class="caps">ODBC</span>===
=== Using ODBC ===
 
To read an Excel file with ODBC (tested on Windows 7 with QT 4.7.1 and Windows 10 with QT 5.7) :


''<font size="0.9em"><font color="#530"><span class="caps">TODO</span>: Info on using <span class="caps">ODBC</span> drivers (via QSqlDatabase) for accessing Excel spreadsheets – please fill out this section if you know more. (What is the <span class="caps">ODBC</span> driver called? Where does it come from? Windows only or also Mac/Linux? Link to sample code snippet?)</font></font>''
<code>
QSqlDatabase db = QSqlDatabase::addDatabase("QODBC", "xlsx_connection");
db.setDatabaseName("DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=" + QString("c:\\path\\to\\your\\file\\file.xlsx"));
if(db.open())
{
QSqlQuery query("select * from [" + QString("Sheet1") + "$]",db); // Select range, place A1:B5 after $
while (query.next())
{
QString column1= query.value(0).toString();
qDebug() << column1;
}
db.close();
QSqlDatabase::removeDatabase("xlsx_connection");
}
</code>


To read an Excel file with <span class="caps">ODBC</span> (tested on Windows 7 with QT 4.7.1) :
The above code print all of column1's values to the debug output. It works for <span style="color:SeaGreen">*.xls</span> and <span style="color:SeaGreen">*.xlsx</span> and the other excel file formats.


This sample print in the console all column1’s values. It works for <font color="#580">.xls</font> and <font color="#580">.xlsx</font>
By default OBDC uses the first row as names for the columns, you are supposed to be able to change this with the 'FirstRowHasNames' option in the connection settings, however there is a bug (see [https://support.microsoft.com/en-us/kb/288343 KB288343]). Keep in mind that you are using a database and that each column has his own datatype, so if your second row contains text and your third row contains numbers, sql will pick one of these datatypes. If a few rows contain text and the rest of them contain floating numbers, sql will make the text appear and will make the numbers disappear.


By default <span class="caps">OBDC</span> uses the first row as names for the columns, you can change this whith the ‘FirstRowHasNames’ option in the connection settings. Keep in mind that you are using a database and that each column has his own datatype. So if your second row contains text and your third row contains numbers, sql wil pick one of these datatypes. If a few rows contain text and the rest of them contains floating numbers, sql wil make the text appear and will make the numbers disappear.
'''NOTE:''' To use ODBC on Windows, the MS Access Database Engine has to be installed. You can find it here: [https://www.microsoft.com/en-us/download/details.aspx?id=13255 Microsoft Access Database Engine 2010]. The Engine is maybe distributed with a MS Office Access installation, but on this should not be relied on. In Addition, you should regard that a 64 bit application can only use the 64 bit Engine and so for 32 bit accordingly. That’s why you maybe install both versions to avoid problems. Furthermore, the Engine should not be confused with the MS Access Runtime which contains the Engine.


===Using independent parser/writer libraries===
=== Using independent parser/writer libraries ===


For a more portable solution, you could take a look at some of the available third-party C/C++ libraries for parsing/writing Excel files:
For a more portable solution, you could take a look at some of the available third-party C/C++ libraries for parsing/writing Excel files:


{| class="infotable line" style="width: 95%; margin-left: 2.5%"
{| class="wikitable"
|
|
! <span class="caps">API</span>
! style="text-align:left;" | API
! <font face="monospace">.xls</font>
! style="text-align:left;" | .xls
! <font face="monospace">.xlsx</font>
! style="text-align:left;" | .xlsx
! reading
! style="text-align:left;" | reading
! writing
! style="text-align:left;" | writing
! platforms
! style="text-align:left;" | platforms
! license
! style="text-align:left;" | license
|-
| [https://github.com/QtExcel/QXlsx '''QXlsx''']
| C++
Qt
| <span style="color:DarkRed">no</span>
| <span style="color:Green">yes</span>
| <span style="color:Green">yes</span>
| <span style="color:Green">yes</span>
| Win, Mac, Linux, …
| MIT <span style="color:Navy">[weak copyleft]</span>
|-
|-
|
| [https://github.com/tfussell/xlnt '''xlnt''']
[https://github.com/dbzhang800/QtXlsxWriter '''Qt Xlsx'''] ''[github.com]''
| C++
| C++ Qt
| <span style="color:DarkRed">no</span>
| <font color="#920">no</font>
| <span style="color:Green">yes</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#580">yes</font>
| Win, Mac, Linux, …
| Win, Mac, Linux, …
| <span class="caps">MIT</span> <font color="#458">[weak copyleft]</font>
| MIT <span style="color:Navy">[weak copyleft]</span>
|-
|-
|
| [http://xlslib.sourceforge.net/ '''xlsLib''']
[http://xlslib.sourceforge.net/ '''xlsLib'''] ''[xlslib.sourceforge.net]''
| C++
| C++
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#920">no</font>
| <span style="color:DarkRed">no</span>
| <font color="#920">no</font>
| <span style="color:DarkRed">no</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| Win, Mac, Linux, …
| Win, Mac, Linux, …
| <span class="caps">LGPL</span> v3 <font color="#458">[weak copyleft]</font>
| LGPL v3 <span style="color:Navy">[weak copyleft]</span>
|-
|-
|
| [http://libxls.sourceforge.net '''libxls''']
[http://libxls.sourceforge.net '''libxls'''] ''[libxls.sourceforge.net]''
| C
| C
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#920">no</font>
| <span style="color:DarkRed">no</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#920">no</font>
| <span style="color:DarkRed">no</span>
| Win, Mac, Linux, …
| Win, Mac, Linux, …
| <span class="caps">LGPL</span> <font color="#458">[weak copyleft]</font>
| LGPL <span style="color:Navy">[weak copyleft]</span>
|-
|-
|
| [http://www.libxl.com/ '''LibXL''']
[http://www.libxl.com/ '''LibXL'''] ''[libxl.com]''
| C++
| C++
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| Win, Mac, Linux, …
| Win, Mac, Linux, …
| <font color="#458">commercial</font>
| <span style="color:Navy">commercial</span>
|-
|-
|
| [http://www.qtsoftware.de/vertrieb/db/qtxls_e.htm '''qtXLS''']
[http://www.qtsoftware.de/vertrieb/db/qtxls_e.htm '''qtXLS'''] ''[qtsoftware.de]''
| C
| C
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#920">no</font>
| <span style="color:DarkRed">no</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| Win, ?
| Win, ?
| <font color="#458">commercial</font>
| <span style="color:Navy">commercial</span>
|-
|-
|
| [https://www.gaia-gis.it/fossil/freexl '''FreeXL''']
[https://www.gaia-gis.it/fossil/freexl '''FreeXL'''] ''[gaia-gis.it]''
| C
| C
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#920">no</font>
| <span style="color:DarkRed">no</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#920">no</font>
| <span style="color:DarkRed">no</span>
| Linux, ?
| Linux, ?
| <span class="caps">LGPL</span> / <span class="caps">MPL</span> <font color="#458">[weak copyleft]</font>
| LGPL / MPL <span style="color:Navy">[weak copyleft]</span>
|-
|-
|
| [http://www.codeproject.com/Articles/13852/BasicExcel-A-Class-to-Read-and-Write-to-Microsoft '''BasicExcel''']
[http://www.codeproject.com/Articles/13852/BasicExcel-A-Class-to-Read-and-Write-to-Microsoft '''BasicExcel'''] ''[codeproject.com]''
| C++
| C++
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#920">no</font>
| <span style="color:DarkRed">no</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| ?
| ?
| ?
| ?
|-
|-
|
| [https://numberduck.com/ '''Number Duck''']
[https://numberduck.com/ '''Number Duck'''] ''[numberduck.com]''
| C++
| C++
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#920">no</font>
| <span style="color:DarkRed">no</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| <font color="#580">yes</font>
| <span style="color:Green">yes</span>
| Win, Linux
| Win, Linux
| <font color="#458">commercial</font>
| <span style="color:Navy">commercial</span>
|-
| [https://github.com/VSRonin/QtXlsxWriter '''Qt Xlsx'''] (Unmaintained)
| C++
Qt
| <span style="color:DarkRed">no</span>
| <span style="color:Green">yes</span>
| <span style="color:Green">yes</span>
| <span style="color:Green">yes</span>
| Win, Mac, Linux, …
| MIT <span style="color:Navy">[weak copyleft]</span>
|}
|}


Note that these libraries differ in their scope and general approach to the problem.
Note that these libraries differ in their scope and general approach to the problem.


===Using manual <span class="caps">XML</span> processing===
=== Using manual XML processing ===


Files using the <span class="caps">XML</span>-based (.xlsx) format could be processed using Qt’s <span class="caps">XML</span> handling classes (see [[Handling Document Formats]]). Third-party libraries can help you in dealing with the container format that wraps the actual <span class="caps">XML</span> files:
Files using the XML-based (.xlsx) format could be processed using Qt's XML handling classes (see [[Handling_Document_Formats | Handling Document Formats]]). Third-party libraries can help you in dealing with the container format that wraps the actual XML files:


{| class="infotable line" style="width: 95%; margin-left: 2.5%"
{| class="wikitable"
|
|
! <span class="caps">API</span>
! style="text-align:left;" | API
! supported platforms
! style="text-align:left;" | supported platforms
! license
! style="text-align:left;" | license
|-
|-
|
| [http://libopc.codeplex.com '''libopc''']
[http://libopc.codeplex.com '''libopc'''] ''[libopc.codeplex.com]''
| C
| C
| Win, Mac, Linux, …
| Win, Mac, Linux, …
| <font color="#458">permissive</font>
| <span style="color:Navy">permissive</span>
|}
|}


''<font size="0.9em"><font color="#530"><span class="caps">TODO</span>: If you know more about the container format, and whether it really needs a specialized library for processing, please expand this section.</font></font>''
=== Using batch conversion tools ===


===Using batch conversion tools===
If all else fails, there is always the option of using an existing tool to automatically convert between Excel files and a more manageable format, and let your Qt application deal with that format instead. The conversion tool could be bundled with your application or specified as a prerequisite, and controlled via [[Doc:QProcess]]. Some possibilities are:


If all else fails, there is always the option of using an existing tool to automatically convert between Excel files and a more manageable format, and let your Qt application deal with that format instead. The conversion tool could be bundled with your application or specified as a prerequisite, and controlled via [[doc/QProcess|QProcess]]. Some possibilities are:
{| class="wikitable"
 
|
{| class="infotable line" style="width: 95%; margin-left: 2.5%"
! style="text-align:left;" | .xls to *  
!
! style="text-align:left;" | .xlsx to *
! <font face="monospace">.xls</font> to *
! style="text-align:left;" | *to .xls
! <font face="monospace">.xlsx</font> to *
! style="text-align:left;" | *to .xlsx  
! * to <font face="monospace">.xls</font>
! style="text-align:left;" | platforms
! * to <font face="monospace">.xlsx</font>
! style="text-align:left;" | license
! platforms
! license
|-
|-
|
| [http://www.libreoffice.org/ '''LibreOffice''']
[http://www.libreoffice.org/ '''LibreOffice'''] ''[libreoffice.org]''
| .ods .csv …
| <font face="monospace">.ods .csv …</font>
| .ods .csv …
| <font face="monospace">.ods .csv …</font>
| .ods .csv …
| <font face="monospace">.ods .csv …</font>
| .ods .csv …
| <font face="monospace">.ods .csv …</font>
| Win, Mac, Linux, …
| Win, Mac, Linux, …
| <span class="caps">GPL</span> v3 <font color="#458">[strong copyleft]</font>
| GPL v3 <span style="color:Navy">[strong copyleft]</span>
|-
|-
|
| [http://'''…''']
[http:// '''…''']
| …
| …
| …
| …
Line 208: Line 235:
|}
|}


''Notes:''<br />
''Notes:''
 
LibreOffice can be used like this for batch conversion (it's slow, though): <code>soffice —invisible -convert-to xls test.ods</code>
==Displaying / User Interaction==
 
===Using Excel itself===
 
''<font size="0.9em"><font color="#530"><span class="caps">TODO</span>: If you know whether Excel provides a “viewer” ActiveX control that can be embedded in a Qt application through ActiveQT, please fill out this section (including links to relevant resources).</font></font>''
 
===Manual solution===
 
''<font size="0.9em"><font color="#530"><span class="caps">TODO</span>: Tips for displaying Excel documents which were manually parsed using one of the methods described above.</font></font>''


<font color="#fff">.</font>
----


==See Also==
== See Also ==


* [[Handling Document Formats]]
* [[Handling Document Formats]]
** ''other Microsoft Office formats:''
* [[Handling Microsoft Word file format]]
*** [[Handling Microsoft Word file format|Microsoft Word]]
* [[Handling Microsoft PowerPoint file format]]
*** [[Handling Microsoft PowerPoint file format|Microsoft Powerpoint]]
* [[Handling HTML]]
** ''other “spreadsheet” formats:''
* [[Handling PDF]]
*** [[Handling OpenDocument Spreadsheet|OpenDocument Spreadsheet]]
 
===Categories:===
 
* [[:Category:Developing with Qt|Developing_with_Qt]]

Latest revision as of 09:15, 11 June 2021

TODO: If you know more about the container format, and whether it really needs a specialized library for processing, please expand this section.
TODO: Tips for displaying Excel documents which were manually parsed using one of the methods described.
TODO: If you know whether Excel provides a "viewer" ActiveX control that can be embedded in a Qt application through ActiveQt, please fill out this section (including links to relevant resources).

En Ar Bg De El Es Fa Fi Fr Hi Hu It Ja Kn Ko Ms Nl Pl Pt Ru Sq Th Tr Uk Zh

This page discusses various available options for working with Microsoft Excel documents in your Qt application. Please also read the general considerations outlined on the Handling Document Formats page.

One needs to distinguish between two different formats (this page deals with both of them):

Legacy "Excel Spreadsheet" format "Office Open XML Workbook" format
classification: binary BIFF-based XML-based
main filename extension: .xls .xlsx
main internet media type: application/vnd.ms-excel application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
default format of Excel: until Excel 2003 since Excel 2007

Reading / Writing

Using Excel itself

If you are exclusively targeting the Windows platform and Microsoft Excel will be installed on all target machines, then you can use Qt's ActiveX framework to access Excel's spreadsheet processing functionality through OLE automation. For an introductory code example (and a way to list the API provided by the Excel COM object), consult this how-to.

DLL file name COM object name platforms license
Microsoft Excel ? Excel.Application Windows commercial

Using ODBC

To read an Excel file with ODBC (tested on Windows 7 with QT 4.7.1 and Windows 10 with QT 5.7) :

QSqlDatabase db = QSqlDatabase::addDatabase("QODBC", "xlsx_connection");
db.setDatabaseName("DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=" + QString("c:\\path\\to\\your\\file\\file.xlsx"));
if(db.open())
{
 QSqlQuery query("select * from [" + QString("Sheet1") + "$]",db); // Select range, place A1:B5 after $
 while (query.next())
 {
 QString column1= query.value(0).toString();
 qDebug() << column1;
 }
db.close();
QSqlDatabase::removeDatabase("xlsx_connection");
}

The above code print all of column1's values to the debug output. It works for *.xls and *.xlsx and the other excel file formats.

By default OBDC uses the first row as names for the columns, you are supposed to be able to change this with the 'FirstRowHasNames' option in the connection settings, however there is a bug (see KB288343). Keep in mind that you are using a database and that each column has his own datatype, so if your second row contains text and your third row contains numbers, sql will pick one of these datatypes. If a few rows contain text and the rest of them contain floating numbers, sql will make the text appear and will make the numbers disappear.

NOTE: To use ODBC on Windows, the MS Access Database Engine has to be installed. You can find it here: Microsoft Access Database Engine 2010. The Engine is maybe distributed with a MS Office Access installation, but on this should not be relied on. In Addition, you should regard that a 64 bit application can only use the 64 bit Engine and so for 32 bit accordingly. That’s why you maybe install both versions to avoid problems. Furthermore, the Engine should not be confused with the MS Access Runtime which contains the Engine.

Using independent parser/writer libraries

For a more portable solution, you could take a look at some of the available third-party C/C++ libraries for parsing/writing Excel files:

API .xls .xlsx reading writing platforms license
QXlsx C++

Qt

no yes yes yes Win, Mac, Linux, … MIT [weak copyleft]
xlnt C++ no yes yes yes Win, Mac, Linux, … MIT [weak copyleft]
xlsLib C++ yes no no yes Win, Mac, Linux, … LGPL v3 [weak copyleft]
libxls C yes no yes no Win, Mac, Linux, … LGPL [weak copyleft]
LibXL C++ yes yes yes yes Win, Mac, Linux, … commercial
qtXLS C yes no yes yes Win, ? commercial
FreeXL C yes no yes no Linux, ? LGPL / MPL [weak copyleft]
BasicExcel C++ yes no yes yes ? ?
Number Duck C++ yes no yes yes Win, Linux commercial
Qt Xlsx (Unmaintained) C++

Qt

no yes yes yes Win, Mac, Linux, … MIT [weak copyleft]

Note that these libraries differ in their scope and general approach to the problem.

Using manual XML processing

Files using the XML-based (.xlsx) format could be processed using Qt's XML handling classes (see Handling Document Formats). Third-party libraries can help you in dealing with the container format that wraps the actual XML files:

API supported platforms license
libopc C Win, Mac, Linux, … permissive

Using batch conversion tools

If all else fails, there is always the option of using an existing tool to automatically convert between Excel files and a more manageable format, and let your Qt application deal with that format instead. The conversion tool could be bundled with your application or specified as a prerequisite, and controlled via Doc:QProcess. Some possibilities are:

.xls to * .xlsx to * *to .xls *to .xlsx platforms license
LibreOffice .ods .csv … .ods .csv … .ods .csv … .ods .csv … Win, Mac, Linux, … GPL v3 [strong copyleft]

Notes:

LibreOffice can be used like this for batch conversion (it's slow, though):

soffice invisible -convert-to xls test.ods

See Also