QtCS2024 AI tooling for Qt developers

From Qt Wiki
Revision as of 08:18, 5 September 2024 by Dasmith (talk | contribs) (Qt Contributor Summit 2024 Session: AI tooling for Qt developers)

Qt already has a network of bots that augment development workflows: Cherry-Pick Bot, Submodule Update Bot, Flake8 Bot for Python, and so on.

With the rise of LLMs, a couple of bots in this vein have been implemented with fairly good success, even considering their limitations.

  • API Header Review Bot - Identifies changes to public headers, summarizes them, and flags the change for review before the next release
    • Uses GPT-4 for analysis. Results are generally good, but in its current state the inputs are not comprehensive and do not capture a full "API change" that spans multiple change reviews.
    • Still useful enough to at least flag changes.
  • CI Failure Analysis Bot - Analyzes the failure log, test sources, and change diff to determine whether the change caused the failure. May suggest a fix if one is obvious.
    • Very good results during a proof-of-concept trial run in the Qt Company bugfix Sprint H2 2024.
    • Estimated accuracy of at least 90% in determining whether a change caused the CI failure, based on manual sampling and review of outputs.
    • Can identify infrastructure issues as the cause of a failure.
    • Can identify flaky tests as the cause of a failure.
    • Limited to a 128k context window, which covers all but the largest changes.
    • Limited to analyzing atomic changes; it cannot take in a full relation chain or topic.
      • This sometimes results in multiple changes being blamed for the failure, with ambiguous analysis results, but even then the bot is usually correct that the changes are related.
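
The 128k context limitation noted for the CI Failure Analysis Bot can be illustrated with a minimal sketch. This is not the bot's actual code: the names `AnalysisInputs`, `estimate_tokens`, `fits_in_context`, and `build_prompt` are hypothetical, "128k" is assumed to mean tokens, and the 4-characters-per-token estimate is a rough heuristic rather than a real tokenizer. It only shows how the failure log, test sources, and change diff might be checked against a budget before being sent for analysis.

```python
from dataclasses import dataclass

# Assumptions (not from the session notes): "128k" is a token limit, and
# 4 characters per token is a rough heuristic rather than a real tokenizer.
CONTEXT_LIMIT_TOKENS = 128_000
CHARS_PER_TOKEN = 4


@dataclass
class AnalysisInputs:
    """Hypothetical container for the three inputs the bot analyzes."""
    failure_log: str
    test_sources: str
    change_diff: str


def estimate_tokens(text: str) -> int:
    """Estimate the token count as ceil(len(text) / CHARS_PER_TOKEN)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(inputs: AnalysisInputs, reserved_for_reply: int = 4_000) -> bool:
    """Return True if all inputs plus a reply allowance fit in the window."""
    total = sum(
        estimate_tokens(part)
        for part in (inputs.failure_log, inputs.test_sources, inputs.change_diff)
    )
    return total + reserved_for_reply <= CONTEXT_LIMIT_TOKENS


def build_prompt(inputs: AnalysisInputs) -> str:
    """Join the inputs into one prompt, or raise if the change is too large."""
    if not fits_in_context(inputs):
        raise ValueError("inputs exceed the assumed context window")
    return (
        f"## Failure log\n{inputs.failure_log}\n\n"
        f"## Test sources\n{inputs.test_sources}\n\n"
        f"## Change diff\n{inputs.change_diff}\n"
    )
```

Under these assumptions, a typical change passes the check and yields a single prompt, while a very large diff is rejected up front rather than silently truncated. A real implementation would likely count tokens with the model's own tokenizer.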