String Handling

From Qt Wiki
Latest revision as of 15:38, 30 November 2023


Session Summary

Session Owners

Marc Mutz

Notes

Issues:

  • QByteArray doubles as QUtf8String
  • UTF-8 is not a good in-memory format:
    • Comparison cannot early-exit based on size (when comparing to Latin-1 or UTF-16)
    • Indexing (too many multi-code-unit encodings; UTF-16 has fewer)
    • Searching (no Boyer-Moore for general UTF-8)
  • Suggestion of where to go:
    • complete mixed string operations (comparison, searching, tokenisation)
    • add UTF-8 searching and tokenisation
    • add UTF-32 (for Python compat)
    • add owning versions of all
  • Can QString get more constexpr support?
    • Unlikely; it requires more C++ support
    • QString has implicit sharing and a lot of out-of-line API
  • QAnyStringView is missing a lot of API
    • because QUtf8StringView also lacks it
  • Thiago requests that we make a proof-of-concept of the final state of this API
    • Avoid breaking user code (e.g., implicit conversions to QString)
    • Create a library API design document describing what to create
      • Needs a plan for getting there from where we are
      • Return types
      • Where to keep simple QStrings (for other developers not working on QtCore)
  • QString considers equal only strings that have the exact same code units.
    • Equivalence based on Unicode transforms (NFD, NFC) is not taken into account
      • but there's API to do the conversions if needed.
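
The last note, and the earlier point about size-based early exit, can be illustrated with a small sketch. It uses plain standard C++ (std::u16string and std::string) as a stand-in for QString and QByteArray, so that it compiles without Qt; the names sameCodeUnits, precomposed, decomposed, utf8 and runExamples are hypothetical and exist only for this illustration. In Qt itself, the conversion API mentioned above is QString::normalized().

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Code-unit equality: two strings are equal only if every UTF-16
// code unit matches. This mirrors the rule the notes describe for
// QString; it is a stand-in, not Qt's implementation.
bool sameCodeUnits(std::u16string_view a, std::u16string_view b)
{
    return a == b;
}

// "é" as a single precomposed code point (NFC): U+00E9.
const std::u16string precomposed = u"\u00E9";

// "é" as a base letter plus a combining acute accent (NFD): U+0065 U+0301.
const std::u16string decomposed = u"e\u0301";

// The precomposed "é" encoded as UTF-8: two code units (0xC3 0xA9).
const std::string utf8 = "\xC3\xA9";

void runExamples()
{
    // Canonically equivalent text, but different code units: not equal.
    assert(!sameCodeUnits(precomposed, decomposed));

    // The same character occupies 1 UTF-16 code unit but 2 UTF-8 code
    // units, so a size mismatch across encodings proves nothing and
    // cannot be used as an early exit.
    assert(precomposed.size() == 1);
    assert(utf8.size() == 2);
}
```

Under these assumptions, a caller that wants the two spellings of "é" to compare equal has to normalize both sides to the same form first; the comparison itself never does so.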