String Handling

From Qt Wiki
Category: QtCS2023


Latest revision as of 15:38, 30 November 2023


Session Summary

Session Owners

Marc Mutz

Notes

Issues:

  • QByteArray doubles as QUtf8String
  • UTF-8 is not a good in-memory format:
    • Comparison cannot early-exit based on size (when comparing to Latin-1 or UTF-16)
    • Indexing (too many multi-code-unit encodings; UTF-16 has fewer)
    • Searching (no Boyer-Moore for general UTF-8)
  • Suggestion of where to go:
    • complete mixed string operations (comparison, searching, tokenisation)
    • add UTF-8 searching and tokenisation
    • add UTF-32 (for Python compat)
    • add owning versions of all
  • Can QString get more constexpr support?
    • Unlikely; it would require more support from the C++ language
    • QString has implicit sharing and a lot of out-of-line API
  • QAnyStringView is missing a lot of API
    • because QUtf8StringView is also lacking it
  • Thiago requests that we make a proof of concept of the final state of this API
    • Avoid breaking user code (e.g., implicit conversions to QString)
    • Create a library API design document describing what to create
      • Needs a plan for getting there from where we are
      • Return types
      • Where to keep simple QStrings (for other developers not working on QtCore)
  • QString considers equal only strings that have the exact same code units.
    • Equivalence based on Unicode transforms (NFD, NFC) is not taken into account
      • but there's API to do the conversions if needed.
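
Two of the points above (no size-based early exit when comparing UTF-8 against Latin-1, and equality defined on exact code units) can be illustrated with a minimal sketch. It deliberately uses only the C++ standard library: std::string and std::u16string stand in for the Qt string types, and utf8EqualsLatin1, sameCodeUnits and demo are hypothetical names made up for this illustration, not Qt API. The UTF-8 input is assumed to be well-formed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not a Qt function): compares UTF-8 text with Latin-1
// text by encoding each Latin-1 character to UTF-8 on the fly.
// Assumes `utf8` is well-formed UTF-8.
bool utf8EqualsLatin1(std::string_view utf8, std::string_view latin1)
{
    // There is no `utf8.size() != latin1.size()` shortcut here: a Latin-1
    // character at or above 0x80 occupies two UTF-8 code units.
    std::size_t i = 0;
    for (char ch : latin1) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c < 0x80) {
            // ASCII range: one code unit in both encodings.
            if (i >= utf8.size() || static_cast<unsigned char>(utf8[i]) != c)
                return false;
            i += 1;
        } else {
            // U+0080..U+00FF: lead byte 110xxxxx, continuation byte 10xxxxxx.
            if (i + 1 >= utf8.size()
                || static_cast<unsigned char>(utf8[i]) != (0xC0 | (c >> 6))
                || static_cast<unsigned char>(utf8[i + 1]) != (0x80 | (c & 0x3F)))
                return false;
            i += 2;
        }
    }
    return i == utf8.size(); // no trailing UTF-8 data left over
}

// Code-unit equality, with std::u16string standing in for a UTF-16 string.
bool sameCodeUnits(const std::u16string &a, const std::u16string &b)
{
    return a == b;
}

void demo()
{
    const std::string latin1 = "caf\xE9";     // 4 code units, U+00E9 as one byte
    const std::string utf8 = "caf\xC3\xA9";   // 5 code units, U+00E9 as two bytes
    assert(latin1.size() != utf8.size());     // sizes differ ...
    assert(utf8EqualsLatin1(utf8, latin1));   // ... yet the text is the same
    assert(!utf8EqualsLatin1("cafe", latin1));

    const std::u16string nfc = u"caf\u00E9";   // precomposed U+00E9
    const std::u16string nfd = u"cafe\u0301";  // 'e' followed by combining acute
    assert(nfc.size() == 4 && nfd.size() == 5);
    assert(!sameCodeUnits(nfc, nfd));          // canonically equivalent, but not equal
}
```

Calling demo() runs the checks. Between Latin-1 and UTF-16 the code-unit counts always match for equal text, which is why a size mismatch can end that comparison immediately, whereas the UTF-8 case has to walk the data. Making the two UTF-16 strings compare equal would require normalizing both to the same form first.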