Handling Microsoft Excel file format: Difference between revisions
Henri Vikki (talk | contribs) No edit summary |
(Added xlnt library) |
||
(16 intermediate revisions by 8 users not shown) | |||
Line 1: | Line 1: | ||
{{Ambox|text=TODO: If you know more about the container format, and whether it really needs a specialized library for processing, please expand this section.}} | |||
{{Ambox|text=TODO: Tips for displaying Excel documents which were manually parsed using one of the methods described.}} | |||
{{Ambox|text=TODO: If you know whether Excel provides a "viewer" ActiveX control that can be embedded in a Qt application through ActiveQt, please fill out this section (including links to relevant resources).}} | |||
{{LangSwitch}} | |||
[[Category:Developing_with_Qt]] | [[Category:Developing_with_Qt]] | ||
This page discusses various available options for working with [http://en.wikipedia.org/wiki/Microsoft_Excel#File_formats Microsoft Excel] documents in your Qt application. Please also read the general considerations outlined on the [[Handling_Document_Formats | Handling Document Formats]] page. | This page discusses various available options for working with [http://en.wikipedia.org/wiki/Microsoft_Excel#File_formats Microsoft Excel] documents in your Qt application. Please also read the general considerations outlined on the [[Handling_Document_Formats | Handling Document Formats]] page. | ||
One needs to distinguish between two different formats (this page deals with both of them): | One needs to distinguish between two different formats (this page deals with both of them): | ||
Line 16: | Line 14: | ||
{| class="wikitable" | {| class="wikitable" | ||
| | | | ||
! style="text-align:left;"| Legacy "Excel Spreadsheet" format | ! style="text-align:left;" | Legacy "Excel Spreadsheet" format | ||
! style="text-align:left;"| "Office Open XML Workbook" format | ! style="text-align:left;" | "Office Open XML Workbook" format | ||
|- | |- | ||
| classification: | | classification: | ||
Line 44: | Line 42: | ||
{| class="wikitable" | {| class="wikitable" | ||
| | | | ||
! style="text-align:left;"| DLL file name | ! style="text-align:left;" | DLL file name | ||
! style="text-align:left;"| COM object name | ! style="text-align:left;" | COM object name | ||
! style="text-align:left;"| platforms | ! style="text-align:left;" | platforms | ||
! style="text-align:left;"| license | ! style="text-align:left;" | license | ||
|- | |- | ||
| [http://office.microsoft.com/excel/ '''Microsoft Excel'''] | | [http://office.microsoft.com/excel/ '''Microsoft Excel'''] | ||
Line 58: | Line 56: | ||
=== Using ODBC === | === Using ODBC === | ||
To read an Excel file with ODBC (tested on Windows 7 with QT 4.7.1 and Windows 10 with QT 5.7) : | |||
To read an Excel file with ODBC (tested on Windows 7 with QT 4.7.1) : | |||
<code> | <code> | ||
QSqlDatabase db = QSqlDatabase::addDatabase("QODBC"); | QSqlDatabase db = QSqlDatabase::addDatabase("QODBC", "xlsx_connection"); | ||
db.setDatabaseName("DRIVER={Microsoft Excel Driver ( | db.setDatabaseName("DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=" + QString("c:\\path\\to\\your\\file\\file.xlsx")); | ||
if(db.open()) | if(db.open()) | ||
{ | { | ||
QSqlQuery query("select | QSqlQuery query("select * from [" + QString("Sheet1") + "$]",db); // Select range, place A1:B5 after $ | ||
while (query.next()) | while (query.next()) | ||
{ | { | ||
Line 74: | Line 69: | ||
qDebug() << column1; | qDebug() << column1; | ||
} | } | ||
db.close(); | |||
QSqlDatabase::removeDatabase("xlsx_connection"); | |||
} | } | ||
</code> | </code> | ||
The above code print all of column1's values to the debug output. It works for <span style="color:SeaGreen">*.xls</span> and <span style="color:SeaGreen">*.xlsx</span> and the other excel file formats. | |||
By default OBDC uses the first row as names for the columns, you are supposed to be able to change this with the 'FirstRowHasNames' option in the connection settings, however there is a bug (see [https://support.microsoft.com/en-us/kb/288343 KB288343]). Keep in mind that you are using a database and that each column has his own datatype, so if your second row contains text and your third row contains numbers, sql will pick one of these datatypes. If a few rows contain text and the rest of them contain floating numbers, sql will make the text appear and will make the numbers disappear. | |||
'''NOTE:''' To use ODBC on Windows, the MS Access Database Engine has to be installed. You can find it here: [https://www.microsoft.com/en-us/download/details.aspx?id=13255 Microsoft Access Database Engine 2010]. The Engine is maybe distributed with a MS Office Access installation, but on this should not be relied on. In Addition, you should regard that a 64 bit application can only use the 64 bit Engine and so for 32 bit accordingly. That’s why you maybe install both versions to avoid problems. Furthermore, the Engine should not be confused with the MS Access Runtime which contains the Engine. | |||
=== Using independent parser/writer libraries === | === Using independent parser/writer libraries === | ||
Line 87: | Line 86: | ||
{| class="wikitable" | {| class="wikitable" | ||
| | | | ||
! style="text-align:left;"| API | ! style="text-align:left;" | API | ||
! style="text-align:left;"| .xls | ! style="text-align:left;" | .xls | ||
! style="text-align:left;"| .xlsx | ! style="text-align:left;" | .xlsx | ||
! style="text-align:left;"| reading | ! style="text-align:left;" | reading | ||
! style="text-align:left;"| writing | ! style="text-align:left;" | writing | ||
! style="text-align:left;"| platforms | ! style="text-align:left;" | platforms | ||
! style="text-align:left;"| license | ! style="text-align:left;" | license | ||
|- | |- | ||
| [https://github.com/ | | [https://github.com/QtExcel/QXlsx '''QXlsx'''] | ||
| C++ | | C++ | ||
Qt | Qt | ||
| <span style="color:DarkRed">no</span> | |||
| <span style="color:Green">yes</span> | |||
| <span style="color:Green">yes</span> | |||
| <span style="color:Green">yes</span> | |||
| Win, Mac, Linux, … | |||
| MIT <span style="color:Navy">[weak copyleft]</span> | |||
|- | |||
| [https://github.com/tfussell/xlnt '''xlnt'''] | |||
| C++ | |||
| <span style="color:DarkRed">no</span> | | <span style="color:DarkRed">no</span> | ||
| <span style="color:Green">yes</span> | | <span style="color:Green">yes</span> | ||
Line 167: | Line 175: | ||
| Win, Linux | | Win, Linux | ||
| <span style="color:Navy">commercial</span> | | <span style="color:Navy">commercial</span> | ||
|- | |||
| [https://github.com/VSRonin/QtXlsxWriter '''Qt Xlsx'''] (Unmaintained) | |||
| C++ | |||
Qt | |||
| <span style="color:DarkRed">no</span> | |||
| <span style="color:Green">yes</span> | |||
| <span style="color:Green">yes</span> | |||
| <span style="color:Green">yes</span> | |||
| Win, Mac, Linux, … | |||
| MIT <span style="color:Navy">[weak copyleft]</span> | |||
|} | |} | ||
Line 177: | Line 195: | ||
{| class="wikitable" | {| class="wikitable" | ||
| | | | ||
! style="text-align:left;"| API | ! style="text-align:left;" | API | ||
! style="text-align:left;"| supported platforms | ! style="text-align:left;" | supported platforms | ||
! style="text-align:left;"| license | ! style="text-align:left;" | license | ||
|- | |- | ||
| [http://libopc.codeplex.com '''libopc'''] | | [http://libopc.codeplex.com '''libopc'''] | ||
Line 186: | Line 204: | ||
| <span style="color:Navy">permissive</span> | | <span style="color:Navy">permissive</span> | ||
|} | |} | ||
=== Using batch conversion tools === | === Using batch conversion tools === | ||
Line 195: | Line 211: | ||
{| class="wikitable" | {| class="wikitable" | ||
| | | | ||
! style="text-align:left;"| .xls to * | ! style="text-align:left;" | .xls to * | ||
! style="text-align:left;"| .xlsx to * | ! style="text-align:left;" | .xlsx to * | ||
! style="text-align:left;"| *to .xls | ! style="text-align:left;" | *to .xls | ||
! style="text-align:left;"| *to .xlsx | ! style="text-align:left;" | *to .xlsx | ||
! style="text-align:left;"| platforms | ! style="text-align:left;" | platforms | ||
! style="text-align:left;"| license | ! style="text-align:left;" | license | ||
|- | |- | ||
| [http://www.libreoffice.org/ '''LibreOffice'''] | | [http://www.libreoffice.org/ '''LibreOffice'''] | ||
Line 221: | Line 237: | ||
''Notes:'' | ''Notes:'' | ||
LibreOffice can be used like this for batch conversion (it's slow, though): <code>soffice —invisible -convert-to xls test.ods</code> | LibreOffice can be used like this for batch conversion (it's slow, though): <code>soffice —invisible -convert-to xls test.ods</code> | ||
---- | ---- | ||
Line 238: | Line 242: | ||
== See Also == | == See Also == | ||
* [[ | * [[Handling Document Formats]] | ||
* | * [[Handling Microsoft Word file format]] | ||
* [[Handling Microsoft PowerPoint file format]] | |||
* [[Handling HTML]] | |||
* | * [[Handling PDF]] |
Latest revision as of 09:15, 11 June 2021
TODO: If you know more about the container format, and whether it really needs a specialized library for processing, please expand this section. |
TODO: Tips for displaying Excel documents which were manually parsed using one of the methods described. |
TODO: If you know whether Excel provides a "viewer" ActiveX control that can be embedded in a Qt application through ActiveQt, please fill out this section (including links to relevant resources). |
En Ar Bg De El Es Fa Fi Fr Hi Hu It Ja Kn Ko Ms Nl Pl Pt Ru Sq Th Tr Uk Zh
This page discusses various available options for working with Microsoft Excel documents in your Qt application. Please also read the general considerations outlined on the Handling Document Formats page.
One needs to distinguish between two different formats (this page deals with both of them):
Legacy "Excel Spreadsheet" format | "Office Open XML Workbook" format | |
---|---|---|
classification: | binary BIFF-based | XML-based |
main filename extension: | .xls | .xlsx |
main internet media type: | application/vnd.ms-excel | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet |
default format of Excel: | until Excel 2003 | since Excel 2007 |
Reading / Writing
Using Excel itself
If you are exclusively targeting the Windows platform and Microsoft Excel will be installed on all target machines, then you can use Qt's ActiveX framework to access Excel's spreadsheet processing functionality through OLE automation. For an introductory code example (and a way to list the API provided by the Excel COM object), consult this how-to.
DLL file name | COM object name | platforms | license | |
---|---|---|---|---|
Microsoft Excel | ? | Excel.Application | Windows | commercial |
Using ODBC
To read an Excel file with ODBC (tested on Windows 7 with QT 4.7.1 and Windows 10 with QT 5.7) :
QSqlDatabase db = QSqlDatabase::addDatabase("QODBC", "xlsx_connection");
db.setDatabaseName("DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=" + QString("c:\\path\\to\\your\\file\\file.xlsx"));
if(db.open())
{
QSqlQuery query("select * from [" + QString("Sheet1") + "$]",db); // Select range, place A1:B5 after $
while (query.next())
{
QString column1= query.value(0).toString();
qDebug() << column1;
}
db.close();
QSqlDatabase::removeDatabase("xlsx_connection");
}
The above code print all of column1's values to the debug output. It works for *.xls and *.xlsx and the other excel file formats.
By default OBDC uses the first row as names for the columns, you are supposed to be able to change this with the 'FirstRowHasNames' option in the connection settings, however there is a bug (see KB288343). Keep in mind that you are using a database and that each column has his own datatype, so if your second row contains text and your third row contains numbers, sql will pick one of these datatypes. If a few rows contain text and the rest of them contain floating numbers, sql will make the text appear and will make the numbers disappear.
NOTE: To use ODBC on Windows, the MS Access Database Engine has to be installed. You can find it here: Microsoft Access Database Engine 2010. The Engine is maybe distributed with a MS Office Access installation, but on this should not be relied on. In Addition, you should regard that a 64 bit application can only use the 64 bit Engine and so for 32 bit accordingly. That’s why you maybe install both versions to avoid problems. Furthermore, the Engine should not be confused with the MS Access Runtime which contains the Engine.
Using independent parser/writer libraries
For a more portable solution, you could take a look at some of the available third-party C/C++ libraries for parsing/writing Excel files:
API | .xls | .xlsx | reading | writing | platforms | license | |
---|---|---|---|---|---|---|---|
QXlsx | C++
Qt |
no | yes | yes | yes | Win, Mac, Linux, … | MIT [weak copyleft] |
xlnt | C++ | no | yes | yes | yes | Win, Mac, Linux, … | MIT [weak copyleft] |
xlsLib | C++ | yes | no | no | yes | Win, Mac, Linux, … | LGPL v3 [weak copyleft] |
libxls | C | yes | no | yes | no | Win, Mac, Linux, … | LGPL [weak copyleft] |
LibXL | C++ | yes | yes | yes | yes | Win, Mac, Linux, … | commercial |
qtXLS | C | yes | no | yes | yes | Win, ? | commercial |
FreeXL | C | yes | no | yes | no | Linux, ? | LGPL / MPL [weak copyleft] |
BasicExcel | C++ | yes | no | yes | yes | ? | ? |
Number Duck | C++ | yes | no | yes | yes | Win, Linux | commercial |
Qt Xlsx (Unmaintained) | C++
Qt |
no | yes | yes | yes | Win, Mac, Linux, … | MIT [weak copyleft] |
Note that these libraries differ in their scope and general approach to the problem.
Using manual XML processing
Files using the XML-based (.xlsx) format could be processed using Qt's XML handling classes (see Handling Document Formats). Third-party libraries can help you in dealing with the container format that wraps the actual XML files:
API | supported platforms | license | |
---|---|---|---|
libopc | C | Win, Mac, Linux, … | permissive |
Using batch conversion tools
If all else fails, there is always the option of using an existing tool to automatically convert between Excel files and a more manageable format, and let your Qt application deal with that format instead. The conversion tool could be bundled with your application or specified as a prerequisite, and controlled via Doc:QProcess. Some possibilities are:
.xls to * | .xlsx to * | *to .xls | *to .xlsx | platforms | license | |
---|---|---|---|---|---|---|
LibreOffice | .ods .csv … | .ods .csv … | .ods .csv … | .ods .csv … | Win, Mac, Linux, … | GPL v3 [strong copyleft] |
… | … | … | … | … | … | … |
Notes:
LibreOffice can be used like this for batch conversion (it's slow, though):
soffice —invisible -convert-to xls test.ods