String Handling

From Qt Wiki
Revision as of 15:38, 30 November 2023 by Marc Mutz (talk | contribs) (bullet-ize the last item)


Session Summary

Session Owners

Marc Mutz

Notes

Issues:

  • QByteArray doubles as QUtf8String
  • UTF-8 is not a good in-memory format:
    • Comparison cannot early-exit based on size (when comparing to Latin-1 or UTF-16)
    • Indexing (too many multi-code-unit encodings; UTF-16 has fewer)
    • Searching (no Boyer-Moore for general UTF-8)
  • Suggestion of where to go:
    • complete mixed string operations (comparison, searching, tokenisation)
    • add UTF-8 searching and tokenisation
    • add UTF-32 (for Python compat)
    • add owning versions of all
  • Can QString get more constexpr support?
    • Unlikely, it requires more C++ support
    • QString has implicit sharing and a lot of out-of-line API
  • QAnyStringView is missing a lot of API
    • because QUtf8StringView also lacks that API
  • Thiago requests that we make a proof-of-concept of the final state of this API
    • Avoid breaking user code (e.g., implicit conversions to QString)
    • Create a Library API Design document describing what to create
      • Needs a plan for getting there from where we are
      • Return types
      • Where to keep simple QStrings (for other developers not working on QtCore)
  • QString considers equal only strings that have the exact same code units.
    • Equivalence based on Unicode transforms (NFD, NFC) is not taken into account
      • but there's API to do the conversions if needed.
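
To illustrate the last point, here is a minimal sketch in plain C++17 (no Qt dependency) of why code-unit equality differs from canonical equivalence. The names sameCodeUnits and composeAcuteE are hypothetical helpers invented for this example. composeAcuteE is a toy that handles a single character pair, not a real normalizer. In Qt itself, the conversion API mentioned above is QString::normalized().

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// "é" as a single precomposed code point (the NFC form): U+00E9.
inline constexpr std::u16string_view precomposed = u"\u00E9";
// "é" as "e" followed by U+0301 COMBINING ACUTE ACCENT (the NFD form).
inline constexpr std::u16string_view decomposed = u"e\u0301";

// Code-unit equality: same length and identical code units, nothing more.
constexpr bool sameCodeUnits(std::u16string_view a, std::u16string_view b)
{
    return a == b;
}

// Toy stand-in for a normalization step (hypothetical, illustration only):
// composes "e" + U+0301 into U+00E9 and copies everything else unchanged.
inline std::u16string composeAcuteE(std::u16string_view in)
{
    constexpr char16_t CombiningAcute = 0x0301;
    constexpr char16_t EAcute = 0x00E9;

    std::u16string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == u'e' && i + 1 < in.size() && in[i + 1] == CombiningAcute) {
            out.push_back(EAcute);
            ++i; // skip the combining mark that was just consumed
        } else {
            out.push_back(in[i]);
        }
    }
    return out;
}
```

The two spellings render identically but differ in length (one code unit versus two), so sameCodeUnits(precomposed, decomposed) is false. They only compare equal after both sides are brought to the same normalization form.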
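
The note above about UTF-8 comparison not being able to early-exit on size can be sketched the same way, again in plain C++17 with hypothetical helper names (equalLatin1Utf16, utf16LengthOfUtf8) that are not part of any Qt API. The sketch assumes well-formed UTF-8 input. Latin-1 and UTF-16 map one-to-one for Latin-1 text, so a size mismatch settles the comparison immediately. A UTF-8 string's byte count says nothing definite about its UTF-16 length until the bytes have been scanned.

```cpp
#include <cstddef>
#include <string_view>

// Latin-1 vs UTF-16: every Latin-1 byte corresponds to exactly one UTF-16
// code unit, so strings of different sizes cannot be equal.
constexpr bool equalLatin1Utf16(std::string_view l1, std::u16string_view u16)
{
    if (l1.size() != u16.size())
        return false; // early exit based on size alone
    for (std::size_t i = 0; i < l1.size(); ++i) {
        if (static_cast<unsigned char>(l1[i]) != u16[i])
            return false;
    }
    return true;
}

// UTF-8 vs UTF-16: the number of UTF-16 code units is only known after
// walking the bytes (assumes the input is well-formed UTF-8).
constexpr std::size_t utf16LengthOfUtf8(std::string_view utf8)
{
    std::size_t units = 0;
    for (char c : utf8) {
        const auto b = static_cast<unsigned char>(c);
        if ((b & 0xC0) == 0x80)
            continue;                 // continuation byte, part of the previous character
        units += (b >= 0xF0) ? 2 : 1; // 4-byte sequences need a surrogate pair
    }
    return units;
}
```

For example, "é" is the two bytes "\xC3\xA9" in UTF-8 but a single code unit in UTF-16, so comparing sizes up front would wrongly reject an equal pair.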