String Handling: Difference between revisions

Category: QtCS2023
Revision as of 15:06, 30 November 2023


Session Summary

Session Owners

Marc Mutz

Notes

Issues:

  • QByteArray doubles as QUtf8String
  • UTF-8 is not a good in-memory format:
    • Comparison cannot early-exit based on size (when comparing to Latin-1 or UTF-16)
    • Indexing (too many multi-code-unit encodings; UTF-16 has fewer)
    • Searching (no Boyer-Moore for non-ASCII)
  • Suggestion of where to go:
    • complete mixed string operations (comparison, searching, tokenisation)
    • add UTF-8 searching and tokenisation
    • add UTF-32 (for Python compat)
    • add owning versions of all
  • Can QString get more constexpr support?
    • Unlikely, it requires more C++ support
    • QString has implicit sharing and a lot of out-of-line API
  • QAnyStringView is missing a lot of API
    • because the underlying Q*StringView classes lack it as well
  • Thiago requests that we make a proof-of-concept of the final state of this API
    • Avoid breaking user code (e.g., implicit conversions to QString)
    • Create a library API design document describing what to create
      • Needs a plan for getting there from where we are
      • Return types
      • Where to keep simple QStrings (for other developers not working on QtCore)
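
The point about size-based early exit can be illustrated with a small sketch in plain standard C++ (no Qt types; the helper names `utf16UnitsForUtf8` and `latin1CanEqualUtf16` are hypothetical and not part of any Qt API). Latin-1 and UTF-16 both use one code unit per Latin-1 character, so differing sizes already prove inequality. For UTF-8, the number of UTF-16 code units is only known after scanning the data, so a size check alone cannot reject a mismatch.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: number of UTF-16 code units needed to represent
// a UTF-8 string. Assumes well-formed input; performs no validation.
std::size_t utf16UnitsForUtf8(std::string_view utf8)
{
    std::size_t units = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) == 0x80)
            continue;                  // continuation byte, already counted via its lead byte
        units += (c >= 0xF0) ? 2 : 1;  // 4-byte sequences become a surrogate pair
    }
    return units;
}

// Hypothetical helper: Latin-1 vs UTF-16 can be rejected in O(1),
// because equal strings must have the same number of code units.
bool latin1CanEqualUtf16(std::string_view latin1, std::u16string_view utf16)
{
    return latin1.size() == utf16.size();
}
```

Under these assumptions, the two-byte UTF-8 sequence `"\xC3\xA9"` (é) maps to a single UTF-16 code unit, so a UTF-8 string and a UTF-16 string of different sizes may still be equal. The same linear scan is what makes indexing by character position costly in UTF-8.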

QString considers two strings equal only if they consist of exactly the same code units. Equivalence under Unicode normalization (NFD, NFC) is not taken into account, but there is API to perform the conversions if needed.
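
As a rough illustration of code-unit equality, the sketch below uses `std::u16string` as a stand-in for QString's UTF-16 storage (an assumption made to keep the example free of Qt dependencies; `sameCodeUnits` is a hypothetical name). The two strings render identically as "é" but differ in their code units, so a code-unit comparison reports them as unequal. In Qt, `QString::normalized()` with a form such as `QString::NormalizationForm_C` is the kind of conversion one would apply to both sides before comparing.

```cpp
#include <string>

// "é" as a single precomposed code point (NFC form).
const std::u16string precomposed = u"\u00E9";
// "é" as 'e' followed by U+0301 COMBINING ACUTE ACCENT (NFD form).
const std::u16string decomposed = u"e\u0301";

// Hypothetical helper: compares code unit by code unit,
// with no Unicode normalization applied.
bool sameCodeUnits(const std::u16string &a, const std::u16string &b)
{
    return a == b;
}
```

Here `sameCodeUnits(precomposed, decomposed)` yields `false`, even though the two strings are canonically equivalent.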