Qt Contributors Summit 2022 - Program/Qt Infrastructure for CI

From Qt Wiki
Revision as of 10:36, 8 June 2022 by SGaist (talk | contribs) (Added notes from session)


Session Summary

A presentation of the hardware side of the CI and of how the system is monitored

Presentation / Discussion (30 min)

Session Owners

  • Tony Sarajärvi

Notes

(Taken by Samuel Gaist)

Delivery and Automation team

Coin CI: 8 members

Oslo: 3 members

Stages in short

- Code review

- Coin gets notified

- Build

- Release artifacts

- Packaging

- Release

What is COIN? (COntinuous INtegration)

- Written from scratch in Python

Does

- Archive and reuse built modules

- Read instructions from files within the git repositories

- Customize the size of the VMs needed to run specific jobs (qtbase requires more resources than smaller modules, and test jobs do not need the same resources as build jobs)

- A GUI that lets users select what is to be done from the options currently available

COIN makes it possible to reuse already built modules rather than rebuilding every module from scratch each time a patch is added to a module.
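The reuse idea can be sketched as a lookup keyed on what uniquely identifies a build. This is only an illustration, not COIN's actual code: the names `artifact_key`, `ArtifactStore` and `fake_build`, and the choice of module, commit SHA and platform configuration as the key, are all assumptions.

```python
import hashlib


def artifact_key(module: str, commit_sha: str, platform_config: str) -> str:
    """Return a key identifying one built module on one platform configuration."""
    raw = f"{module}@{commit_sha}#{platform_config}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class ArtifactStore:
    """In-memory stand-in for an archive of built modules (hypothetical)."""

    def __init__(self):
        self._archive = {}
        self.builds = 0  # number of real builds that had to run

    def get_or_build(self, module, commit_sha, platform_config, build):
        """Reuse the archived artifact if present, otherwise build and archive it."""
        key = artifact_key(module, commit_sha, platform_config)
        if key not in self._archive:
            self._archive[key] = build(module, commit_sha, platform_config)
            self.builds += 1
        return self._archive[key]


def fake_build(module, commit_sha, platform_config):
    """Placeholder for an expensive build; returns a made-up artifact name."""
    return f"{module}-{commit_sha[:7]}-{platform_config}.tar.gz"


store = ArtifactStore()
first = store.get_or_build("qtbase", "abc1234def", "linux-gcc", fake_build)
second = store.get_or_build("qtbase", "abc1234def", "linux-gcc", fake_build)
# The second call is served from the archive, so only one build ran.
```

With such a key, a patch to one module only triggers a build of that module; its unchanged dependencies are fetched from the archive.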

COIN itself

~50 platform configurations for each integration

4 million VMs created last year

1700+ VMs for a full qt5 integration

1200+ VMs to test the build results

The infrastructure changes every month

- Hardware gets old and support ends

- New hardware replaces old

- New hardware expands the current setup

- Old hardware is upgraded

- Can't just replace enterprise-grade items with "standard" ones (e.g. a Dell SSD with a commercial-grade SSD)

- Old hardware is used until it literally dies, then kept for spare parts

This also includes planning the electrical capacity of the infrastructure (Dell power sources are not the same as others). The networking must also be adapted and evolved to stay efficient, but the hardware needed might either not exist yet or not be available.

Cloud service providers have come down in price; however, running and maintaining their own hardware is still cost-efficient.

It also makes it possible to connect custom hardware such as Apple systems, embedded systems, etc.

Monitoring is done on several unrelated layers

- Health

- Bandwidth

- Loads (CPU, memory, swap, disk space, latency)

- Warranties

- Certificates

- PowerStore/Compellent, hosts, VMs, switches, firewall, iDRACs

All through Grafana

For example, there's a service that regularly downloads a big package to check that the bandwidth is correctly used.
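A bandwidth probe of this kind could look roughly like the sketch below. It is not the team's actual service: the function names, the chunk size and the timeout are assumptions, and the URL to download is left to the caller.

```python
import time
import urllib.request


def throughput_mbit_per_s(num_bytes: int, seconds: float) -> float:
    """Convert a transferred byte count and a duration into Mbit/s."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_bytes * 8 / seconds / 1_000_000


def probe(url: str, chunk_size: int = 1 << 20, timeout: float = 30.0) -> float:
    """Download `url` in chunks, discard the data, and return the measured Mbit/s."""
    start = time.monotonic()
    total = 0
    with urllib.request.urlopen(url, timeout=timeout) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.monotonic() - start
    return throughput_mbit_per_s(total, elapsed)
```

Run on a schedule, the value returned by `probe` could be pushed to whatever data source feeds the dashboards, so that a drop in throughput shows up next to the other metrics.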

Setup

- Ansible is slowly replacing many manual tasks

- Security:

  - SSH keys

  - Segmentation

  - Automation

The usual question

- Why not AWS or Azure?

  - COIN would be very complicated to set up there (though not impossible)

  - The current system is pretty monolithic

  - Offloading some tasks would be possible; however, the main issue would be data transfer

  - 7 TB of data is distributed daily
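To put the 7 TB figure in perspective, here is a rough conversion into a sustained transfer rate. It assumes decimal terabytes and traffic spread evenly over 24 hours, which real CI load will not be; peaks would be considerably higher. The function name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mbit_per_s(terabytes_per_day: float) -> float:
    """Average rate in Mbit/s needed to move the given volume in one day."""
    bits_per_day = terabytes_per_day * 1e12 * 8  # decimal TB assumed
    return bits_per_day / SECONDS_PER_DAY / 1e6


print(round(sustained_mbit_per_s(7)))  # → 648
```

That is roughly 650 Mbit/s around the clock, which illustrates why data transfer is named as the main issue with moving to a cloud provider.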

Q&A

- Distribute images to approvers and maintainers for debugging?

  - Doable for Linux images

  - Not with Windows

  - Likely not for Apple either

- What about an SSH / VNC session?

  - Doable, but there has not been much demand for it

- Currently this would require being a subcontractor because of NDAs and other internal requirements.

- Some work still needs to be done to be able to grant external people secure access to specific machines.

- Apple arm64 is not yet virtualized, so each Apple machine must be cleaned after each build.