QtCS2024 AI tooling for Qt developers

Session Summary

Qt already has a network of bots that augment development workflows: Cherry-Pick Bot, Submodule Update Bot, Flake8 Bot for Python, and so on.

With the rise of LLMs, a couple of bots in this vein have been implemented with good success, even considering their limitations.

API Header Review Bot - Identifies changes to public headers, summarizes them, and flags the change for review before the next release.
- Uses GPT-4 for analysis. Generally good results, but in its current state the inputs are not comprehensive and do not represent a full "API change" spread across multiple change reviews.
- Useful enough to at least flag changes.
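
As a rough illustration of the "identifies changes to public headers" step, here is a minimal sketch. It is not the bot's actual implementation. It assumes the changed file paths are already available as a list, relies on the Qt convention that private headers end in `_p.h`, and uses a hypothetical set of excluded directory names.

```python
from pathlib import PurePosixPath

# Hypothetical directory names treated as non-public; adjust for the real repository.
EXCLUDED_PARTS = {"3rdparty", "tests", "examples"}


def is_public_header(path: str) -> bool:
    """Heuristic: a .h file that is not a private (_p.h) header
    and does not live under an excluded directory."""
    p = PurePosixPath(path)
    if p.suffix != ".h":
        return False
    if p.stem.endswith("_p"):
        return False
    return not (EXCLUDED_PARTS & set(p.parts))


def public_header_changes(changed_paths):
    """Return the subset of changed paths that look like public headers."""
    return [path for path in changed_paths if is_public_header(path)]
```

A change touching any path returned by `public_header_changes` would then be a candidate for summarizing and flagging.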
CI Failure Analysis Bot - Analyzes failure log, test sources, and change diff to determine if the change caused the failure. May suggest fixes if obvious.
- Very good results during a proof-of-concept trial run in the Qt Company bugfix sprint, H2 2024.
- Estimated at least 90% accuracy in deciding whether a change caused the CI failure, based on manual sampling and review of outputs.
- Identifies infrastructure issues as the cause of failure.
- Identifies flaky tests as the cause of failure.
- Limited to a 128k-token context, which covers all but the largest changes.
- Limited to analyzing atomic changes; it cannot take in a full relation chain or topic.
  - This sometimes results in multiple changes being blamed for the failure, with ambiguous analysis results. Even so, the bot is usually correct that the changes are related.
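
To make the context limitation concrete, here is a hedged sketch of how the three inputs (failure log, test sources, change diff) might be fitted into a 128k-token budget. Everything in it is an assumption for illustration: the 4-characters-per-token estimate is a crude stand-in for a real tokenizer, the reserved answer headroom is hypothetical, and keeping only the tail of the log is one possible policy, not necessarily what the bot does.

```python
CONTEXT_LIMIT_TOKENS = 128_000  # the limit mentioned in the notes above
CHARS_PER_TOKEN = 4             # crude assumption, not a real tokenizer
RESERVED_FOR_ANSWER = 4_000     # hypothetical headroom for the model's reply


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def build_prompt(failure_log: str, test_sources: str, change_diff: str):
    """Combine the inputs into one prompt.

    Returns None when the diff and test sources alone exceed the budget
    (the "largest changes" case). Otherwise the log is trimmed to its
    tail so that the whole prompt roughly fits.
    """
    budget = CONTEXT_LIMIT_TOKENS - RESERVED_FOR_ANSWER
    fixed = estimate_tokens(test_sources) + estimate_tokens(change_diff)
    remaining = budget - fixed
    if remaining <= 0:
        return None
    # Keep the end of the log, where failures are usually reported.
    log_tail = failure_log[-(remaining * CHARS_PER_TOKEN):]
    return "\n\n".join([
        "FAILURE LOG:\n" + log_tail,
        "TEST SOURCES:\n" + test_sources,
        "CHANGE DIFF:\n" + change_diff,
    ])
```

Under this sketch, a change whose diff alone overflows the budget is skipped rather than analyzed with truncated input, which matches the note that only the largest changes fall outside the limit.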