RegressionTest - coq/coq GitHub Wiki
Systems currently running some sort of benchmarking, regression testing for Coq.
label | URL | access | tests what | comment |
---|---|---|---|---|
lix-bench | bench | lix | contribs (from git) | |
lix-newbench | newbench | lix | - | down very often |
inria-ci | jenkins | inria | coq (from git) | |
opam-bench | opam-bench | jobs on gihub, infrastructure ??? | opam packages |
The problems we have w.r.t. regression testing, that apply to one or many of the above. "we" means people working on Coq, not necessarily in PPS, Inria, France, Europe...
- 4 systems, overlap, which is the reference? Maybe 5, wasn't pyrolyse intended to be the bench system?
- access: we need to be able to access the compilation logs (possible with Jenkins)
- access: we need to be able to fix the bench system, at least for trivial errors (like it picks the wrong branch, coq_makefile refresh)
- reproducibility: understand why it fails (access to logs, maybe more), replicate on more hardware
- too slow (wait 24h for a run)
- ability to start a bench on demand and to start custom benches
- are the tests representative?
- only 1 big test (all contribs), all or nothing
- test a personal experiment without impacting the other developers
- single point of failure (only 1 gatekeeper)
Things that are desirable, not necessarily a problem:
- extend the bench with extra tools/scripts (graphs, etc...)
- extend the bench to produced precompiled artifacts (installers...)
- windows (ci.inria has it)
- osx (ci.inria does not have it)