# Development Principles

_cockpit-project/cockpit GitHub Wiki_

## Development
- keep the master branch working on as many (current) OSes as possible; use run-time feature/API detection
- every change comes with a test; some OS conditionals in tests
- code is easy (or at least possible and documented) to run straight out of the git tree; no permanent system modifications
- test VMs double as a development environment for testing intrusive changes; iterate faster with `scp` instead of `image-prepare`
- tests are easy to run and debug locally
## Upstream CI
- test on every supported OS
- offline build and tests
- provide our own versions of third-party services: FreeIPA, Samba AD, Candlepin, OpenShift, oVirt, Selenium containers
- provide mechanics for creating rpms, debs, and entire repositories from scratch locally
- separate OS image refreshes
- test robustness: touched tests must succeed 3× in a row, untouched tests need to succeed only 1 out of 3 attempts; a database of known test flakes
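The "3× in a row" gate is a simple statistical filter against flaky tests. A back-of-the-envelope sketch (the flake rate below is an illustrative assumption, not a number from this page):

```python
# Illustrative math for the "touched tests must pass 3x in a row" policy.
# A test that flakes with probability f passes a single run with 1 - f.

def passes_3x(f: float) -> float:
    """Probability a test with flake rate f passes three consecutive runs."""
    return (1 - f) ** 3

def passes_1_of_3(f: float) -> float:
    """Probability an untouched test passes at least one of three attempts."""
    return 1 - f ** 3

# Hypothetical 10%-flaky test: it still slips past the 3x gate ~73% of
# the time, but repeated PRs make eventual detection very likely.
print(f"{passes_3x(0.10):.3f}")      # 0.729
print(f"{passes_1_of_3(0.10):.3f}")  # 0.999
```

The asymmetry is the point: new or modified tests face a strict bar, while known-good tests get retries so that rare flakes do not block unrelated PRs.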
## Fedora/RHEL
- run upstream integration tests in downstream gating
- this approach lets us upload the current master until the latest freeze
## Releases
- automate everything: GitHub, Fedora, COPR, PPA, Docker Hub, home page (docs)
- the process, in principle: create a tag, write a blog post
## Our tests/CI Error Budget
### High-level goal: what keeps our velocity and motivation up?
- PRs get validated in a reasonable time (queue + test run time)
- We don’t waste time on interpreting unstable test results
- We are not afraid of touching code
- Test failures are relevant and meaningful, relieving us from having to decide "unrelated or not?" every. single. time.
### Service Level Objectives
When the following objectives are fulfilled, we operate normally and happily. Once these drop below the mark (“exceeding error budget”), a part of the team (discussed in daily standups) stops feature development and non-urgent changes, and fixes our infrastructure and tests to get back into the agreed service level.
Objectives that support the high-level goal, in descending importance:
- A merged PR becomes fully green on the first attempt with a 75% chance, and with a 95% chance after one retry
- Every individual test succeeds at least 90% of the time
- 95% of all PRs are merged without failed tests
- 95% of test runs take no more than 1 hour between pushing a PR and getting all results
- 95% of scheduled tests run through to completion (all tests ran and status got reported to PR)
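The per-test and whole-run objectives are linked: many individually reliable tests still multiply into a noticeable whole-run failure rate. A rough consistency sketch (the test count and per-test pass rate are hypothetical assumptions, not numbers from this page):

```python
# Back-of-the-envelope check of how a per-test pass rate relates to the
# whole-run "green" objectives, assuming n independent tests.

def run_green(p: float, n: int) -> float:
    """Probability that all n tests pass on a single attempt."""
    return p ** n

def green_after_one_retry(p: float, n: int) -> float:
    """Probability of a fully green run when each failed test is retried once."""
    per_test = 1 - (1 - p) ** 2   # test passes within two tries
    return per_test ** n

p, n = 0.999, 300                 # hypothetical: 300 tests at 99.9% each
print(round(run_green(p, n), 2))              # ~0.74, near the 75% objective
print(round(green_after_one_retry(p, n), 4))  # well above the 95% objective
```

The takeaway: hitting the 75%/95% run-level objectives requires per-test reliability far above the 90% floor for most tests; the 90% objective is a minimum for the worst offenders, not a target.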