Qt-contributors-summit-2013-Qt ICU

From Qt Wiki
'''<span class="caps">WIP</span>'''


=Agenda=
* Review decision on using <span class="caps">ICU</span> everywhere in Qt
* Discuss QTimeZone integration into QDateTime: initialization, validation and daylight saving transitions
=Minutes=
Taken by David Faure
QSystemLocale picks up the user settings (bypassing <span class="caps">ICU</span>, which doesn’t do that).<br /> So it needs platform backends to do that.
Same for the user-selected timezone, etc.<br /> =&gt; <span class="caps">ICU</span> is really just the database.
We need to complete that code, to get all settings on all OSes.
<span class="caps">ICU</span> data is 26 MB on disk (on Windows). The bulk is timezones. Lots of translations. 4.6 MB with just the English locale and language.
ICU4C lib = 5MB on top.
Idea from Thiago:<br /> QLocale – basic, based on host apis<br /> Something more complete based on <span class="caps">ICU</span> -&gt; in a separate module<br /> Could be done with QCollator, calendaring and other future stuff, but not very consistent
Idea from Lars: a stripped <span class="caps">ICU</span>. Translations would come from Qt po files.
Another option: load <span class="caps">ICU</span> dynamically, and use system apis if it’s not available.
Dropped idea: a plugin (loaded with native <span class="caps">API</span>s, no QString), which would also exist in<br /> minimal, small, and full versions. Creates a deployment problem.
Conclusions:
Mac: we’ll use the Mac <span class="caps">API</span> wrapper
Android:<br /> <span class="caps">TODO</span>: check if we can use the <span class="caps">ICU</span> data files from ICU4J (same as the ones from ICU4C).<br /> <span class="caps">TODO</span>: check if we can use that data, even if the lib isn’t available. It should be the same version of <span class="caps">ICU</span> though!<br /> Or we could use <span class="caps">JNI</span> (everyone is horrified by the idea though).
Idea from Lars: an <span class="caps">ICU</span>-compatible replacement <span class="caps">API</span>, that can be shipped instead of <span class="caps">ICU</span> to save size.<br /> QtCore uses 26 symbols from <span class="caps">ICU</span>.<br /> So the user could choose which <span class="caps">ICU</span> to use: 1) minimal (just C), 2) small (C and one selected locale?), 3) full.<br /> Whenever using QtWebKit, the full one is required. It’s really mutually exclusive (webkit uses 85 symbols from icu)
==<span class="caps">ICU</span> and Qt==
At QtCS 2012 it was decided to migrate to using <span class="caps">ICU</span> for all localization services to reduce our own code and data, and possibly for code conversion tables, effectively making it a hard requirement for Qt5. Since then a number of technical and political objections have been raised which need to be addressed. This session is to discuss the problems and try to come up with a solution.
Primary issues:
* <span class="caps">ICU</span> ignores host data and any user custom settings, so may not appear ‘native’
* BC issues with using system libraries limit us to using the C api which lacks features we may need
* Mac doesn’t ship system library headers, App Store bans direct linking to libicu, out-dated system versions
* Complaints from Windows devs that <span class="caps">ICU</span> download is too big to ship and has no debug version, but too hard to build / shrink the build themselves
* Android doesn’t guarantee ICU4C will be installed(?), but offers no other <span class="caps">NDK</span> localization option(?)
* Tizen also doesn’t ship <span class="caps">ICU</span> headers or allow linking to system <span class="caps">ICU</span> in its app store
* QtWebKit has hard requirement that is not easily removed
This essentially means we must either use only native api on Win and Mac, or ship our own minimal version of <span class="caps">ICU</span>.
A default <span class="caps">ICU</span> 50.1 build with full data is 26.5 MB on disk or 10.5 MB zipped. Reduced to the minimum library and only the English locale, translation and mapping data the build is 10 MB on disk or 4.6 MB zipped. Further optimizations possible. An app choosing to ship <span class="caps">ICU</span> with a reasonable number of supported language translations can expect a build of about 12 MB on disk or 6 MB zipped. The big decision is whether this is an acceptable size for downloads on Win, Mac and Android. (see below).
Practically, there are three options depending on what features we need to use from <span class="caps">ICU</span>:<br /> 1) Only use the host api on Mac and Windows, use <span class="caps">ICU</span> as the host api on Linux, and don’t provide any features that are not common to all three api’s.<br /> 2) Use Win/Mac host system api wherever possible, only require <span class="caps">ICU</span> for optional features where devs will be motivated to build and ship <span class="caps">ICU</span>.<br /> 3) Make <span class="caps">ICU</span> a hard requirement, always built and shipped by all devs on Mac and Windows.
Major questions:
* Are the host system apis sufficient for our requirements? Mac yes for localization, as it is a thin wrapper around <span class="caps">ICU</span>; WinRT appears to be modelled on <span class="caps">ICU</span>; Win32 needs research but seems doubtful. Not clear if Windows allows opening resource files for non-system/custom locales.
* Android / BB10 / <span class="caps">QNX</span> / Tizen details?
Side note: The Chromium/Blink project has debated these same issues and decided to always build and ship their own version of <span class="caps">ICU</span> on all platforms including Linux and Android, wrapped in a thin abstraction layer.
It is proposed to follow option 2) until such time as we need features only <span class="caps">ICU</span> or the C++ api can provide. An extension of option 3) is to also force our own build on Linux which would allow us to use the C++ api.
* New build script in qtbase/3rdparty to checkout, configure and build minimal required <span class="caps">ICU</span> as part of qtbase standard build system
* Build script able to be customized to choose what locales and translations to ship
* Localization to always use host api for system locale, any features needing <span class="caps">ICU</span> must be optional build-time flag
* Script detects if <span class="caps">ICU</span> features are needed, Win/Mac/iOS/Android build and link own copy of libicu, Linux/QNX defaults to system library but can choose own copy if needed
* Only use <span class="caps">ICU</span> C api for now, but require a recent enough version on all platforms to be useful, if not available from system then must build own
* Minimal non-<span class="caps">ICU</span> C locale for embedded if required?
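The decision rules in the list above can be sketched as a small function. This is only an illustration of the proposal, not an existing configure script: the names <code>Platform</code>, <code>IcuSource</code>, <code>BuildRequest</code> and <code>chooseIcuSource</code> are hypothetical, and the "recent enough" check is assumed to be supplied by a configure-time test.

```cpp
#include <cassert>

// Hypothetical names, used only to illustrate the proposed rules.
enum class Platform { Windows, Mac, IOS, Android, Linux, QNX };
enum class IcuSource { NotNeeded, BundledCopy, SystemLibrary };

struct BuildRequest {
    Platform platform;
    bool icuFeaturesEnabled;     // optional build-time flag for ICU-only features
    bool systemIcuRecentEnough;  // assumed result of a configure-time version check
    bool forceBundledCopy;       // explicit opt-in on Linux/QNX
};

IcuSource chooseIcuSource(const BuildRequest &req)
{
    // The system locale always goes through the host api, so without
    // ICU-only features nothing needs to be built or linked.
    if (!req.icuFeaturesEnabled)
        return IcuSource::NotNeeded;

    switch (req.platform) {
    case Platform::Windows:
    case Platform::Mac:
    case Platform::IOS:
    case Platform::Android:
        // These platforms build and link their own copy of libicu.
        return IcuSource::BundledCopy;
    case Platform::Linux:
    case Platform::QNX:
        // Default to the system library unless it is too old or the
        // developer asked for a private copy.
        if (req.forceBundledCopy || !req.systemIcuRecentEnough)
            return IcuSource::BundledCopy;
        return IcuSource::SystemLibrary;
    }
    assert(false && "unhandled platform");
    return IcuSource::NotNeeded;
}
```

For example, a Linux build with ICU features enabled and a sufficiently recent system ICU resolves to <code>IcuSource::SystemLibrary</code>, while the same request on Windows resolves to <code>IcuSource::BundledCopy</code>.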
==QTimeZone / QDateTime integration issues==
The current implementation of QDateTime with the system time zone has a number of implementation issues around initialization, validity checking, and maths that will also affect QTimeZone so should have their behaviour defined or fixed before QTimeZone is integrated.
* QTimeZone uses a lazy initialization that accepts any date, time and spec, validity is only checked when used
* isValid() does not take the time zone into account, only if QDate and QTime are individually valid
* Date-only math functions (add day/month/year) only done in QDate, i.e. validity check and maths applied is on date only and doesn’t consider time and time zone, nor whether the result is valid in the tz.
* Time math functions do use QDateTime::isValid() and convert to <span class="caps">UTC</span> to calculate
* Changing spec (i.e. to <span class="caps">UTC</span> in calculations) calls mktime to calculate and validate
* The transition from Standard Time to Daylight Time (e.g. 2am becomes 3am) leaves a ‘hole’ of 1 hour that should be considered invalid but isValid() returns true and the date and time maths functions are still applied
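The ‘hole’ can be detected with a round trip through the C library. The sketch below assumes a <span class="caps">POSIX</span> system; <code>existsInLocalZone</code> and <code>useCentralEuropeanRules</code> are hypothetical helper names, and the TZ rule string is just an example zone chosen so the demo does not depend on the machine's settings. It is not how QDateTime currently validates.

```cpp
#include <cassert>
#include <stdlib.h>
#include <time.h>

// Returns true if the wall-clock time exists in the current local zone.
// mktime() normalises a non-existent time instead of rejecting it, so we
// convert the result back with localtime_r() and compare the fields.
// Out-of-range input (e.g. 31 February) is normalised too and thus also
// reported as not existing.
bool existsInLocalZone(int year, int month, int day, int hour, int minute)
{
    assert(month >= 1 && month <= 12);
    struct tm in = {};
    in.tm_year = year - 1900;
    in.tm_mon = month - 1;
    in.tm_mday = day;
    in.tm_hour = hour;
    in.tm_min = minute;
    in.tm_isdst = -1;               // let the C library decide

    struct tm probe = in;           // mktime() may modify its argument
    const time_t t = mktime(&probe);
    if (t == static_cast<time_t>(-1))
        return false;               // sketch: treats the error value as invalid

    struct tm back = {};
    if (!localtime_r(&t, &back))
        return false;
    return back.tm_year == in.tm_year && back.tm_mon == in.tm_mon
        && back.tm_mday == in.tm_mday && back.tm_hour == in.tm_hour
        && back.tm_min == in.tm_min;
}

// Demo helper: pin the process to Central European rules, where daylight
// time starts on the last Sunday of March at 02:00 local time.
// Note that this changes process-wide state.
void useCentralEuropeanRules()
{
    setenv("TZ", "CET-1CEST,M3.5.0,M10.5.0/3", 1);
    tzset();
}
```

With those rules active, 02:30 on 31 March 2013 falls in the hole and is reported as not existing, while 01:30 and 03:30 on the same day are fine.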
It has been proposed to re-write QDateTime to internally store as an absolute msecs since epoch which would inherently solve many of these issues, but this would radically change the behaviour of QDateTime which currently treats the ymd/hms values as “fixed” and the time spec is used to interpret that value. This would mostly affect the default SystemTime spec where the system time zone can change underneath QDateTime causing the absolute <span class="caps">UTC</span> value to change. A considerable re-write would be required to keep the behaviour consistent. There is also the behaviour that you can store an invalid date but valid time in QDateTime, and later fix the date to be valid, which may not be possible if using a single qint64. Other likely effects would be:
* Creation would be slower as has to validate and convert
* Accessing most commonly used functions would be slower, e.g. calling dt.date().day() or dt.time().hour()
* Maths and conversion functions would be faster and simpler and more accurate
* Memory footprint would be reduced by one-third
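The trade-off in the list above can be made concrete with a sketch of the msecs-since-epoch representation. This assumes <span class="caps">UTC</span> only and the proleptic Gregorian calendar, and uses a well-known days-from-civil computation rather than any actual Qt code; all names (<code>CivilDate</code>, <code>toMSecsSinceEpoch</code>, <code>dayOfMonth</code>, <code>addDays</code>) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct CivilDate { std::int64_t year; unsigned month; unsigned day; };

// Days between 1970-01-01 and the given proleptic Gregorian date.
std::int64_t daysFromCivil(std::int64_t y, unsigned m, unsigned d)
{
    y -= (m <= 2) ? 1 : 0;                          // year starts in March
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Inverse of daysFromCivil().
CivilDate civilFromDays(std::int64_t z)
{
    z += 719468;
    const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const unsigned doe = static_cast<unsigned>(z - era * 146097);
    const unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const std::int64_t y = static_cast<std::int64_t>(yoe) + era * 400;
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    const unsigned mp = (5 * doy + 2) / 153;
    const unsigned d = doy - (153 * mp + 2) / 5 + 1;
    const unsigned m = mp < 10 ? mp + 3 : mp - 9;
    return CivilDate{ y + (m <= 2 ? 1 : 0), m, d };
}

const std::int64_t MSecsPerDay = 86400000;

// Creation has to validate and convert: this is the slower path.
std::int64_t toMSecsSinceEpoch(const CivilDate &date, std::int64_t msecsIntoDay)
{
    assert(msecsIntoDay >= 0 && msecsIntoDay < MSecsPerDay);
    return daysFromCivil(date.year, date.month, date.day) * MSecsPerDay + msecsIntoDay;
}

// Field access has to convert back: also slower than reading a stored field.
unsigned dayOfMonth(std::int64_t msecs)
{
    std::int64_t days = msecs / MSecsPerDay;
    if (msecs % MSecsPerDay < 0)
        --days;                                     // floor for pre-1970 values
    return civilFromDays(days).day;
}

// Maths becomes a single addition.
std::int64_t addDays(std::int64_t msecs, std::int64_t n)
{
    return msecs + n * MSecsPerDay;
}
```

Note that <code>addDays</code> here is pure <span class="caps">UTC</span> arithmetic; applying it to a local time would still need the time zone handling discussed in this section.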
If we keep storage in QDate/QTime format we have to ensure validity is checked properly. This solution is a lot less code change and keeps current behaviour, but
* QDateTime::isValid() must check if valid in tz, i.e. call mktime the first time called then cache result in QDateTimePrivate::Spec
* All date-only maths needs to be converted to <span class="caps">UTC</span> first then converted back, same as for time
A further major issue is the Second Occurrence:
* The transition from Daylight Time to Standard Time (e.g. 3am becomes 2am) has an hour that ‘occurs’ twice and is thus ambiguous, e.g. 2:30am. We currently have no api to indicate or set which occurrence.
* Windows, Linux and Mac implementations of mktime have different assumptions for ambiguous times: Linux assumes first occurrence, Windows assumes second occurrence, Mac assumes first for about the first 40 minutes and second for the last 20 minutes (probably a bug).
* The mktime tm_isdst flag is not sufficient for certain scenarios
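For the simple case, the two occurrences can be told apart by passing tm_isdst as a hint to mktime. The sketch below assumes a <span class="caps">POSIX</span> system whose mktime honours that hint (as the minutes note, platforms differ and the flag does not cover every scenario); <code>Occurrence</code>, <code>toEpochSeconds</code> and <code>useCentralEuropeanRules</code> are hypothetical names, and the TZ rule is only an example zone.

```cpp
#include <cassert>
#include <stdlib.h>
#include <time.h>

enum class Occurrence { First, Second };

// Converts a possibly ambiguous local time to seconds since the epoch.
// At a Daylight -> Standard transition the first occurrence is still in
// daylight time and the second one is in standard time, so tm_isdst is
// set accordingly instead of leaving the choice to the C library.
time_t toEpochSeconds(int year, int month, int day, int hour, int minute,
                      Occurrence which)
{
    assert(month >= 1 && month <= 12);
    struct tm local = {};
    local.tm_year = year - 1900;
    local.tm_mon = month - 1;
    local.tm_mday = day;
    local.tm_hour = hour;
    local.tm_min = minute;
    local.tm_isdst = (which == Occurrence::First) ? 1 : 0;
    return mktime(&local);
}

// Demo helper: Central European rules, where daylight time ends on the
// last Sunday of October at 03:00 local time (clocks go back to 02:00).
// Note that this changes process-wide state.
void useCentralEuropeanRules()
{
    setenv("TZ", "CET-1CEST,M3.5.0,M10.5.0/3", 1);
    tzset();
}
```

With those rules, asking for 02:30 on 27 October 2013 with <code>Occurrence::First</code> and then <code>Occurrence::Second</code> should give two values one hour apart. An api on QDateTime for setting the occurrence would need something equivalent underneath.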
The current time zone patches have api for setting and reading the occurrence, but issues with mktime have prevented working code so far.
Development plan:
* Re-submit patches for offsetFromUtc and cleaning up format/parse code
* Implement chosen solution for QDateTime internals, other clean-ups
* Implement second occurrence support for SystemTime only
* Re-submit QTimeZone patches
=Detailed notes on <span class="caps">ICU</span> in Qt5=
==Current use of <span class="caps">ICU</span> in Qt5==
* QtWebKit – required on all platforms for localization and text layout.
* QtCore / QIcuCodec – private, optional
* QtCore / QCollator – private, optional
* QtCore / QLocale for toUpper() and toLower() – private, optional
* sqlite3?
==Issues with <span class="caps">ICU</span>==
===Common===
<span class="caps">ICU</span> are notoriously bad at building libraries that can be reliably linked against, they offer no BC guarantees for C++, changing so names, dat files tightly coupled to library version, etc. For compatibility this leaves us to use the C api which lacks many of the advanced features in the C++ api that make using it desirable. C <span class="caps">API</span> additions to match can be requested and some have been added as a result, but old system versions shipped in <span class="caps">OSX</span> and Linux won’t have these available. To use the C++ api would require strictly controlling the version of <span class="caps">ICU</span> linked to, i.e. building it ourselves.
<span class="caps">ICU</span> respects the user’s locale code (e.g. en_GB) but doesn’t use the host system data or the user’s custom settings (e.g. different date format), so <span class="caps">ICU</span> apps may not fully fit in with a user’s environment. This may especially be a problem on Windows where settings and behaviour are very different from the <span class="caps">POSIX</span> world.
The default <span class="caps">ICU</span> data is fairly large, up to 21.3 MB on disk or 8.5 MB zipped, but not all data is actually required, such as code tables or translations for languages not supported by an app, and could be easily shrunk by apps to about 7MB on disk or 3.5 MB zipped.
===Windows===
See http://thread.gmane.org/gmane.comp.lib.qt.devel/9226
<span class="caps">ICU</span> is not shipped with Windows so all apps need to build and distribute their own copy of <span class="caps">ICU</span> including data. Devs have complained that <span class="caps">ICU</span> is too big (!) and too hard to build.
A binary download of 11.4 MB is available from <span class="caps">ICU</span>, but only for msvc10 and without debug versions, which causes issues.
Work required to determine if sufficient host api functionality available for system locale, and if locale resource bundles can be opened to use for custom locales.
For localization it seems unlikely the native <span class="caps">API</span> will provide a sufficient solution (especially on XP) so we need to make the build easier and the data smaller, or accept a lesser feature set for apps that don’t wish to ship <span class="caps">ICU</span>. This may not be an issue for most apps that don’t require the more advanced locale features or custom locales. Those that do need the features will be more willing to make the effort.
===Mac===
<span class="caps">ICU</span> ships as standard on <span class="caps">OSX</span> and iOS, with the official <span class="caps">API</span> classes effectively thin wrappers around <span class="caps">ICU</span> that allow for user customisations. Shipped versions of <span class="caps">ICU</span> tend to be rather old, and the headers are not included to discourage the direct use of <span class="caps">ICU</span>. Currently Qt5 requires installing MacPorts and linking to their version of <span class="caps">ICU</span>, but this is a bad solution as it is not portable or distributable (it also causes build problems if MacPorts Qt4 is also installed). It is possible to download the headers from opensource.apple.com with some effort and use those instead (WebKit, Chrome and others do this by including a copy of the headers). However apps are rejected from the App Store if they directly link to <span class="caps">ICU</span>, which effectively rules out using the system <span class="caps">ICU</span> for anything on iOS and thus, for simplicity, on <span class="caps">OSX</span> too. It is not clear if shipping a self-built copy of <span class="caps">ICU</span> is acceptable to the App Store, but the extra 20 MB added to the download is not likely to be acceptable to iOS developers so would need to be trimmed down.
<span class="caps">ICU</span> provides a 64bit binary download of 11 MB.
For localization the native <span class="caps">API</span> will probably be sufficient. Code tables require investigation.
QtWebKit is not permitted on iOS by the App Store rules, apps must use the native WebKit install. Therefore iOS does not need to be considered in any QtWebKit solution.
===Linux===
<span class="caps">ICU</span> is available on all distros and likely to always be installed due to other projects depending on it, so availability and download size are not a problem. Distros are used to the issues involved with using <span class="caps">ICU</span> so it might not be unreasonable to make it a requirement to rebuild Qt whenever a major <span class="caps">ICU</span> update is made. In fact, this seems to be existing policy for other packages using <span class="caps">ICU</span>.
===Android===
Android ships <span class="caps">ICU</span>? Or just the Java version by default? Needs more investigation on how it is used and if there are any problems with linking to it directly instead of the native <span class="caps">API</span>.
ICU4C is not a standard part of Android, but is supported in Android External and builds fine. Appears to be no native C/C++ api to use in <span class="caps">NDK</span>, only Java native api. Not yet clear if should use Android src repo or <span class="caps">ICU</span> master repo. Appears we will need to build and ship our own copy.
Android repo at https://android.googlesource.com/platform/external/icu4c but doc at http://source.android.com/source/submit-patches.html#icu4c makes it clear upstream is considered the master.
See https://groups.google.com/a/chromium.org/forum/#!msg/blink-dev/eSiHBND2rAQ/X9LyMHImNjgJ for an interesting discussion on using <span class="caps">ICU</span> in WebKit / Blink / Android which seems to suggest <span class="caps">ICU</span> system will become standard install?
===BB10 / <span class="caps">QNX</span>===
<span class="caps">QNX</span> ships <span class="caps">ICU</span>. Needs more investigation on how it is used and if there are any problems with linking to it directly instead of the native <span class="caps">API</span>.
BB10 ships <span class="caps">ICU</span> in firmware to use for <span class="caps">NDK</span> localization and Qt4 in firmware to use for GUIs (and thus uses old QLocale). Qt5 can be built and used, but is not yet a standard install in firmware. This means Qt5 can use the system <span class="caps">ICU</span> as the locale back-end, i.e. same as Linux. Initially it will only be able to use the C api, but once both are in firmware it will always be a monolithic system build so can probably use the C++ api.
===QtWebKit===
WebKit provides a localization, text layout and string encoding abstraction layer. WebKit ports such as QtWebKit can provide their own backend but most choose to use the existing <span class="caps">ICU</span> back-end for convenience. QtWebKit4 apparently used to use QLocale/QString but switched to <span class="caps">ICU</span> for Qt5? This means QtWebKit needs and uses <span class="caps">ICU</span> on all platforms and so may not always properly fit in, whereas other system-provided WebKit ports may actually use the native api and so fit in better. Need to determine exactly what QtWebKit uses from <span class="caps">ICU</span> and whether the current approach is still best. Another issue is that using <span class="caps">ICU</span> for localization in QtWebKit may give different results from those QLocale provides for the rest of the app.
WebKit / QtWebKit has 4 copies of the <span class="caps">ICU</span> headers included in its source tree, used to build on Mac 10.4.
WebKit / QtWebKit mostly uses the <span class="caps">ICU</span> C api, but does occasionally use the C++ api in port specific code.
See https://groups.google.com/a/chromium.org/forum/#!msg/blink-dev/eSiHBND2rAQ/X9LyMHImNjgJ for an interesting discussion on using <span class="caps">ICU</span> in WebKit / Blink.
Build appears to only link against core and i18n libraries. See bottom for detailed breakdown of <span class="caps">ICU</span> includes used in WebKit.
QtWebKit is not permitted on iOS by the App Store rules, apps must use the native WebKit install. Therefore iOS does not need to be considered in any QtWebKit solution.
==Possible solutions==
It seems clear we cannot rely on the system <span class="caps">ICU</span> on Mac, and there is no system install on Windows, which swings the platform balance to 2 to 1 on devs having to build and ship <span class="caps">ICU</span> themselves. While forcing a self-build is extra work for devs, it does have the benefit of allowing us to use the C++ api.
Practically, there are three options depending on what features we choose to use from <span class="caps">ICU</span>:<br /> 1) Only use <span class="caps">ICU</span> on Linux for the host system localization, use the host api on Mac and Windows, and don’t provide any advanced features that are not common to all three api’s. The same would apply to Code Tables.<br /> 2) Keep <span class="caps">ICU</span> optional on Win/Mac, use Win/Mac host system api wherever possible and only require <span class="caps">ICU</span> for optional advanced features when devs will be motivated to build and ship <span class="caps">ICU</span>.<br /> 3) Make <span class="caps">ICU</span> a hard requirement to be built and shipped by all devs on Mac and Windows, and require Linux devs to either build and ship themselves or use the system install and always rebuild Qt on system <span class="caps">ICU</span> upgrades.
Choosing 1) denies advanced features and doesn’t solve the QtWebKit issue. In either of 2) or 3) devs will be faced with the need to build and ship <span class="caps">ICU</span> themselves and we need to make this easy and lightweight for them. Shipping all of <span class="caps">ICU</span> inside qtbase/3rdparty with a default config is not desirable, but nor can we expect all devs to suddenly become experts on building <span class="caps">ICU</span>. We should provide a simple build script that configures and enables those features in <span class="caps">ICU</span> that Qt uses, and allow the devs to choose what locales and code tables to ship.
One advantage of requiring our own copy of <span class="caps">ICU</span> is we can set a minimum version that has all the features we want to use on all platforms, and can use the C++ <span class="caps">API</span>.
===Localization===
The QTimeZone code provides a template for the solution. Keep the concept of a default system locale that uses the host facilities directly, but allow the build to choose at compile time whether to use <span class="caps">ICU</span> instead. This means maintaining more code but seems the only practical solution.
On Linux: Use <span class="caps">ICU</span> for system and custom locale.<br /> On Mac: Use standard <span class="caps">API</span> for system and custom locale.<br /> On Windows: Work required to determine if sufficient functionality available for system locale, and if locale resource bundles can be opened for custom locales, otherwise will have to use <span class="caps">ICU</span>.
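A minimal sketch of that compile-time choice follows. It is only an outline of the pattern, not the QTimeZone or QLocale implementation: the macro <code>SKETCH_USE_ICU</code> and the classes <code>LocaleBackend</code>, <code>HostLocaleBackend</code> and <code>IcuLocaleBackend</code> are hypothetical, and the backends are stubs that make no real host or <span class="caps">ICU</span> calls.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Common interface seen by the rest of the library.
class LocaleBackend
{
public:
    virtual ~LocaleBackend() = default;
    virtual std::string name() const = 0;
    // Whether the backend on its own sees the user's customised settings.
    virtual bool honoursUserSettings() const = 0;
};

// Stub for a backend built on the host facilities (Mac / Windows apis).
class HostLocaleBackend : public LocaleBackend
{
public:
    std::string name() const override { return "host"; }
    bool honoursUserSettings() const override { return true; }
};

#if defined(SKETCH_USE_ICU)
// Stub for a backend built on the ICU C api; only compiled when requested.
class IcuLocaleBackend : public LocaleBackend
{
public:
    std::string name() const override { return "icu"; }
    bool honoursUserSettings() const override { return false; }
};
#endif

// The choice is made once, at compile time.
std::unique_ptr<LocaleBackend> makeSystemLocaleBackend()
{
#if defined(SKETCH_USE_ICU)
    return std::unique_ptr<LocaleBackend>(new IcuLocaleBackend);
#else
    return std::unique_ptr<LocaleBackend>(new HostLocaleBackend);
#endif
}

std::string describe(const LocaleBackend *backend)
{
    assert(backend != nullptr);
    return backend->name() + (backend->honoursUserSettings()
                                  ? " (user settings honoured)"
                                  : " (locale data only)");
}
```

Compiled without the macro, <code>makeSystemLocaleBackend()</code> yields the host backend; defining the macro swaps in the <span class="caps">ICU</span> one without touching callers.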
===Data size / build===
A number of options are available to reduce the size of both the library and the data, by not building some features and reducing the data shipped for those features that are enabled.
Work was started by Kai to determine what data resources were required and not required, but no results have been published.
The data can be built in 2 ways:<br /> 1) The default is built as a shared data library that is linked to and loaded alongside the main <span class="caps">ICU</span> library. This library is generated from a copy of the data files in the source tree. This is the fastest option but means the library must be updated if the data is to be updated, and is not portable across platforms.<br /> 2) The shared data library is built as a stub and the data is loaded from a .dat file located in a defined directory. The .dat file is specific to a given major and minor release, but maintains BC for point releases and is portable across some platforms that have the same endianness.
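The version and endianness coupling shows up in the .dat file name itself. The sketch below builds the name following the icudt&lt;version&gt;&lt;l|b&gt;.dat pattern used by recent <span class="caps">ICU</span> releases for <span class="caps">ASCII</span> platforms; the function names are hypothetical and <span class="caps">EBCDIC</span> platforms are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Detects the byte order of the machine running the code.
bool hostIsLittleEndian()
{
    const std::uint16_t probe = 1;
    unsigned char firstByte = 0;
    std::memcpy(&firstByte, &probe, 1);
    return firstByte == 1;
}

// Builds the data archive name for a given ICU version number:
// 'l' marks little-endian data, 'b' big-endian data.
std::string icuDataFileName(int icuVersion, bool littleEndian)
{
    assert(icuVersion > 0);
    return "icudt" + std::to_string(icuVersion)
         + (littleEndian ? "l" : "b") + ".dat";
}
```

For <span class="caps">ICU</span> 50 on a little-endian machine this gives icudt50l.dat, so a data file shipped for one <span class="caps">ICU</span> version or byte order will simply not be found by a library built for another.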
Data is mmapped so memory usage is not affected by how much data is shipped.
Features can be disabled at build time by either editing the uconfig.h, uversion.h and utypes.h files, or more practically by passing -D flags.
Data resources that are not required can be removed to reduce the download size. This is done at build time by either directly modifying the original .mk files, or more practically by saving the modified options in new reslocal.mk files which the <span class="caps">ICU</span> build system will then use to override the original settings. Another option is to manually use the online data customiser to build a custom .dat file, but this is a manual interactive process not easily automated and may be prone to human error.
Most data is the mapping conversion tables, removing these will have the greatest effect. <span class="caps">ICU</span> notes “<span class="caps">ICU</span> provides full internationalization functionality without any conversion table data. The common library contains code to handle several important encodings algorithmically: US-<span class="caps">ASCII</span>, <span class="caps">ISO</span>-8859-1, <span class="caps">UTF</span>-7/8/16/32, <span class="caps">SCSU</span>, <span class="caps">BOCU</span>-1, <span class="caps">CESU</span>-8, and <span class="caps">IMAP</span>-mailbox-name (i.e., US-<span class="caps">ASCII</span>, <span class="caps">ISO</span>-8859-1, and all Unicode charsets; see source/data/mappings/convrtrs.txt for the current list).” As such even if Qt uses <span class="caps">ICU</span> for conversions we may not need all/any of the conversion tables.
Locale data takes a small proportion, but may also be reduced by removing uncommon locales, or allowing devs to choose which they want.
Collation data uses a significant amount of data. Removing the Asian collation files would greatly reduce this but is possibly undesirable. Another option is to remove the tailoring rule strings from which the data is built which are rarely used at runtime.
Build sizes<br />
{| class="infotable line"
! Build
! Disk Size
! Zipped Size
|-
| Default full build, full data
| 26.5 MB
| 10.5 MB
|-
| Core build, full data
| 25.9 MB
| 10.3 MB
|-
| Core build, all locales, only en translations
| 11.2 MB
| 5.0 MB
|-
| Core build, only en locale and translations
| 10.0 MB
| 4.6 MB
|}
Core build excludes the optional I/O, Font Layout and Tool Utility libraries. The difference of only 0.2 MB in zipped size means there is little to be gained from code/functionality reductions, but other flags may further reduce size.
{| class="infotable line"
! Library
! Linux Filename
! Linux Size
|-
| Common Library
| libicuuc
| 1.8 MB
|-
| i18n Library
| libicui18n
| 2.6 MB
|-
| Data Library
| libicudata
| 21.3 MB
|-
| I/O Library (optional)
| libicuio
| 66 KB
|-
| Font Layout Library (optional)
| libicule
| 440 KB
|-
| Font Layout Extension Library (optional)
| libiculx
| 66 KB
|-
| Tool Utility Library (optional)
| libicutu
| 203 KB
|}
{| class="infotable line"
! Data
! Disk Size (MB)
! Zipped Size (MB)
! Zipped %
|-
| Total Data
| 21.3
| 8.5
| 100%
|-
| Code Table Mappings
| 4.4
| 2.3
| 27%
|-
| Collation Rules
| 3.3
| 0.8
| 9.5%
|-
| Language &amp; Region Names
| 2.8
| 0.9
| 10.5%
|-
| Time Zone Names
| 2.1
| 0.6
| 7%
|-
| Currency Names &amp; Plurals
| 1.9
| 0.6
| 7%
|-
| Locale Formats
| 1.2
| 0.4
| 5%
|-
| Transliteration Rules and Names
| 0.6
| 0.2
| 2.4%
|-
| Rule Based Number Formatting
| 0.3
| 0.1
| 1.2%
|-
| String Preparation (<span class="caps">RFC</span>’s)
| 0.2
| 0.1
| 1.2%
|-
| Root data?
| 4.5
| 2.5
| 29.5%
|}
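The breakdown above can be used for a rough estimate of what removing whole data categories would save. The sketch below simply copies the table figures (in MB) and subtracts the dropped rows; <code>DataCategory</code>, <code>Size</code> and <code>sizeWithout</code> are hypothetical names, and a real build would normally keep part of a category, so the result is a lower bound on the size, not a measurement.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

struct DataCategory { std::string name; double diskMB; double zippedMB; };
struct Size { double diskMB; double zippedMB; };

// Figures copied from the data breakdown table.
std::vector<DataCategory> icuDataBreakdown()
{
    return {
        {"Code Table Mappings", 4.4, 2.3},
        {"Collation Rules", 3.3, 0.8},
        {"Language & Region Names", 2.8, 0.9},
        {"Time Zone Names", 2.1, 0.6},
        {"Currency Names & Plurals", 1.9, 0.6},
        {"Locale Formats", 1.2, 0.4},
        {"Transliteration Rules and Names", 0.6, 0.2},
        {"Rule Based Number Formatting", 0.3, 0.1},
        {"String Preparation", 0.2, 0.1},
        {"Root data", 4.5, 2.5},
    };
}

// Sums every category that is not listed in 'dropped', rounded to the
// 0.1 MB precision of the table.
Size sizeWithout(const std::vector<std::string> &dropped)
{
    Size total = {0.0, 0.0};
    for (const DataCategory &c : icuDataBreakdown()) {
        if (std::find(dropped.begin(), dropped.end(), c.name) != dropped.end())
            continue;
        total.diskMB += c.diskMB;
        total.zippedMB += c.zippedMB;
    }
    total.diskMB = std::round(total.diskMB * 10.0) / 10.0;
    total.zippedMB = std::round(total.zippedMB * 10.0) / 10.0;
    return total;
}
```

With nothing dropped the rows add up to the stated totals of 21.3 MB on disk and 8.5 MB zipped; dropping only the code table mappings leaves about 16.9 MB on disk and 6.2 MB zipped.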
Proposal:<br /> 1) Include a new build script in qtbase/3rdparty/icu<br /> 2) Script is run as part of configure depending on platform and flags<br /> 3) Script downloads the src tarball recommended for a given version of Qt<br /> 4) Script defaults to building only those features used by Qt on a given platform and removes data resources that are not needed by most clients.<br /> 5) Script can either be manually modified to include or exclude more features and data, or can interactively ask during configure step.<br /> 6) Script writes modified data options to icu/src/data/*/reslocal.mk build files which override the main *.mk files<br /> 7) Script runs configure with build-time options required<br /> 8) Build happens as part of normal Qt build.
Suggested flags from <span class="caps">ICU</span> readme:<br /> U_USING_ICU_NAMESPACE=0<br /> U_CHARSET_IS_UTF8=1 – On UTF8 platforms<br /> UNISTR_FROM_CHAR_EXPLICIT=explicit<br /> UNISTR_FROM_STRING_EXPLICIT=explicit<br /> U_NO_DEFAULT_INCLUDE_UTF_HEADERS=1<br /> U_HIDE_DRAFT_API<br /> U_HIDE_INTERNAL_API<br /> U_HIDE_SYSTEM_API<br /> --with-library-suffix – Add Qt as a suffix to name
Other options available:<br /> --with-data-packaging=archive – To use .dat file instead<br /> --enable-static --disable-shared – For static builds
See http://thebugfreeblog.blogspot.co.uk/2013/05/cross-building-icu-for-applications-on.html
export CPPFLAGS="-DU_USING_ICU_NAMESPACE=0 -DU_CHARSET_IS_UTF8=1 -DUNISTR_FROM_CHAR_EXPLICIT=explicit -DUNISTR_FROM_STRING_EXPLICIT=explicit -DU_NO_DEFAULT_INCLUDE_UTF_HEADERS=1"<br /> ./runConfigureICU Linux --with-library-suffix=qt --disable-draft --disable-extras --disable-icuio --disable-layout --disable-test --disable-samples
===QtWebKit===
Two options:<br /> 1) Continue using <span class="caps">ICU</span>, use new QtCore built copy of <span class="caps">ICU</span>, assumes iOS will accept shipping extra copy of <span class="caps">ICU</span>.<br /> 2) Write new platform back-end using new Qt classes for locale and text layout, but this is a lot of work.
Option 1) is the only practical solution until such time as QtCore can provide all the required functions. The locale functions will come as a result of the new QLocale <span class="caps">ICU</span> backend and wrapper classes, and as the design matches <span class="caps">ICU</span> closely should be straightforward to implement. The difficulty of the text layout and encoding back-ends is an open question.
==<span class="caps">ICU</span> includes in QtWebKit==
<span class="caps">ICU</span> Backend in qtwebkit/Source/WebCore/platform/text/<br /> LineBreakIteratorPoolICU.h<br /> LocaleICU.h/.cpp<br /> LocaleToScriptMappingICU.cpp<br /> TextBreakIteratorICU.h/.cpp<br /> TextCodecICU.h/.cpp<br /> TextEncodingDetector.cpp
<span class="caps">ICU</span> Backend in Source/WTF/wtf/:<br /> url/src/URLCanonICU.cpp<br /> unicode/icu/UnicodeIcu.h<br /> unicode/icu/CollatorICU.cpp<br /> unicode/qt4/UnicodeQt4.h
All <span class="caps">ICU</span> includes used in QtWebKit source tree, not all are built by Qt port:<br />
{| class="infotable line"
! Include
! Language
! Function
! Used in WebKit By
|-
| &lt;unicode/locid.h&gt;
| C++
| Locale
| skia, chromium
|-
| &lt;unicode/normlzr.h&gt;
| C++
| Normalization
| freetype, harfbuzz, skia
|-
| &lt;unicode/uniset.h&gt;
| C++
| Sets of Unicode Code Points and Strings
| chromium
|-
| &lt;unicode/ubrk.h&gt;
| C
| Text Boundary Analysis (Break Iteration)
| text
|-
| &lt;unicode/uchar.h&gt;
| C
| Unicode Character Properties and Names
| win, harfbuzz, wx, chromium, wtf
|-
| &lt;unicode/ucnv_cb.h&gt;
| C
| Codepage Conversion and Unicode Text Compression
| text, wtf
|-
| &lt;unicode/ucnv.h&gt;
| C
| Codepage Conversion and Unicode Text Compression
| text, wtf
|-
| &lt;unicode/udat.h&gt;
| C
| Date api
| JavaScriptCore/runtime, text
|-
| &lt;unicode/udatpg.h&gt;
| C
| Date Pattern Generator
| text
|-
| &lt;unicode/uidna.h&gt;
| C
| International Domain Names in Applications
| WebCore/platform/KURL.cpp, wtf
|-
| &lt;unicode/uloc.h&gt;
| C
| Locales
| text
|-
| &lt;unicode/unorm.h&gt;
| C
| Normalization
| graphics/SurrogatePairAwareTextIterator.cpp, win, wx, chromium, text
|-
| &lt;unicode/unum.h&gt;
| C
| Number Formatting
| text
|-
| &lt;unicode/uscript.h&gt;
| C
| Unicode Character Properties and Names
| chromium, wtf
|-
| &lt;unicode/usearch.h&gt;
| C
| String Searching
| WebCore/editing
|-
| &lt;unicode/uset.h&gt;
| C
| Sets of Unicode Code Points and Strings
| WebCore/editing
|-
| &lt;unicode/ustring.h&gt;
| C
| Strings and Character Iteration
| blackberry, wtf
|-
| &lt;unicode/utf16.h&gt;
| C
| Strings and Character Iteration
| blackberry, wx, linux, wtf
|-
| &lt;unicode/utypes.h&gt;
| C
| Basic Types and Constants
| text
|}
==<span class="caps">ICU</span> Documentation==
http://site.icu-project.org/charts/charset<br />http://site.icu-project.org/charts/icu4c-footprint<br />http://userguide.icu-project.org/packaging<br />http://userguide.icu-project.org/design<br />http://userguide.icu-project.org/design#TOC-ICU-Binary-Compatibility:-Using-ICU-as-an-Operating-System-Level-Library<br />http://userguide.icu-project.org/icudata<br />http://apps.icu-project.org/datacustom/<br />http://www.icu-project.org/docs/demo/datacustom_help.html<br />http://source.icu-project.org/repos/icu/icu/trunk/readme.html

Revision as of 13:58, 24 February 2015