Code Coverage - amosproj/amos2025ss04-ai-driven-testing GitHub Wiki
General definition
Test adequacy criteria measure the quality of a test suite, with higher-quality test suites detecting more faults. Code coverage is one subcategory of these criteria.
Control Flow Code coverage
This focuses on the individual decision nodes, the atomic sections, not the entire block. CodeCover, a Java library, is used for this in the cited study.
Statement Coverage (block coverage)
- Each statement is executed at least once. Here a statement is any code snippet in which information is written or evaluated: `a=5` is 1 statement; `if x > 0:` is 1 statement; `print(x)` is 1 statement
- The most basic form of code coverage
Branch Coverage
- Examines all branches
- Here both the `true` and `false` cases of a decision point are examined
- Better measure than Statement Coverage
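The difference between statement and branch coverage can be seen on a tiny function. The following is a minimal sketch, not part of our tooling: `clamp_negative` and `executed_arcs` are hypothetical names, and the tracer relies only on the standard `sys.settrace` hook. It records the jumps between lines that a single call takes.

```python
import sys

def clamp_negative(x):
    result = x
    if x < 0:
        result = 0
    return result

def executed_arcs(func, *args):
    """Run func(*args) and return the set of (previous_line, line) jumps taken.

    Line numbers are relative to the 'def' line of func.
    """
    code = func.__code__
    first = code.co_firstlineno
    arcs = set()
    prev = None

    def tracer(frame, event, arg):
        nonlocal prev
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            line = frame.f_lineno - first
            if prev is not None:
                arcs.add((prev, line))
            prev = line
        return tracer

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(old)
    return arcs

# One test with a negative input executes every statement (lines 1 to 4) ...
arcs = executed_arcs(clamp_negative, -1)
lines = {line for arc in arcs for line in arc}
print("lines executed:", sorted(lines))

# ... but the false branch of the 'if' (jump from line 2 to line 4) is missing.
print("false branch taken:", (2, 4) in arcs)

# A second test with a non-negative input adds the missing branch.
arcs |= executed_arcs(clamp_negative, 5)
print("false branch taken:", (2, 4) in arcs)
```

A single test can therefore reach full statement coverage while leaving a branch untested, which is why branch coverage is the stricter measure.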
MC/DC Coverage (Modified Condition-Decision Coverage)
- Tests that each atomic condition can independently affect the decision outcome
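To illustrate what "independently affect" means, the sketch below searches for pairs of inputs that differ in exactly one condition and flip the decision outcome (the "unique cause" reading of MC/DC). The decision `a and (b or c)` and the helper `independence_pairs` are made-up examples, not taken from our code base.

```python
from itertools import product

def decision(a, b, c):
    # Example decision with three atomic conditions.
    return a and (b or c)

def independence_pairs(func, n_conditions):
    """For each condition index, list the input pairs that differ only in
    that condition and produce different decision outcomes."""
    pairs = {i: [] for i in range(n_conditions)}
    for vec in product([False, True], repeat=n_conditions):
        for i in range(n_conditions):
            if vec[i]:
                continue  # visit each pair once, starting from its False side
            flipped = vec[:i] + (True,) + vec[i + 1:]
            if bool(func(*vec)) != bool(func(*flipped)):
                pairs[i].append((vec, flipped))
    return pairs

pairs = independence_pairs(decision, 3)
for index, found in pairs.items():
    print(f"condition {index}: {len(found)} independence pair(s)")
```

A test suite satisfies MC/DC for this decision if, for every condition, it contains both inputs of at least one of that condition's pairs.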
Loop Coverage
- Tests whether loop bodies are executed more than once.
[Hemmati]
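Loop coverage can be approximated by counting how often the loop body runs per call. The sketch below is only an illustration under assumptions: `total`, `line_hits` and `loop_status` are hypothetical names, and the relative line number of the loop body (3) is hard-coded for this example.

```python
import sys
from collections import Counter

def total(values):
    acc = 0
    for v in values:
        acc += v  # loop body: line 3 relative to 'def'
    return acc

def line_hits(func, *args):
    """Run func(*args) and count how often each line (relative to 'def') runs."""
    code = func.__code__
    first = code.co_firstlineno
    hits = Counter()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None
        if event == "line":
            hits[frame.f_lineno - first] += 1
        return tracer

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(old)
    return hits

def loop_status(body_hits):
    """Classify one call by the number of loop-body executions."""
    if body_hits == 0:
        return "skipped"
    if body_hits == 1:
        return "executed once"
    return "executed more than once"

BODY_LINE = 3
for test_input in ([], [1], [1, 2, 3]):
    hits = line_hits(total, test_input)
    print(test_input, "->", loop_status(hits[BODY_LINE]))
```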
Implementation Ideas:
I think it should be very simple to implement branch coverage with most testing libraries. It should measure how much of the prompt code is covered by the response code (the generated tests). This is very basic but very necessary. Loop coverage is also standard to have in addition to branch coverage. If it is possible and easy, we can probably also add MC/DC, but this would be "going the extra mile" and is not strictly necessary.
Data Flow code coverage
Evaluates variable occurrences: a definition (def) $d$ is a location where a value is assigned to a variable $v$, and a use $u$ is a location where $v$ is referred to / read.
def-use pair coverage
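- A DUA (definition-use association) $(d, u, v)$ is marked as satisfied when a test executes $d$ and then reaches $u$ without $v$ being redefined in between. At runtime, the last $d$ recorded is taken as the definition that reaches $u$.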
- Use DUA-FORENSICS [Santelices]
- Basically, these DUAs can be used to check which values can be assigned to $v$ at $d$ and whether all of these values are handled / valid at $u$. The coverage should
static analysis - before execution of test
dynamic analysis - during execution
reporting - determine DUA coverage from the static & dynamic analysis
[Hemmati][Santelices]
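The "last definition reaches the use" rule can be sketched in a few lines. This is not how DUA-FORENSICS is implemented; it is a simplified illustration in which the execution trace and the statically determined DUA set are hand-written, hypothetical inputs.

```python
def covered_duas(events):
    """Return the DUAs (d, u, v) exercised by one execution.

    events: iterable of (kind, variable, location) in execution order,
    where kind is "def" or "use".
    """
    last_def = {}   # variable -> location of its most recent definition
    covered = set()
    for kind, var, loc in events:
        if kind == "def":
            last_def[var] = loc
        elif kind == "use" and var in last_def:
            # The last recorded definition is the one that reaches this use.
            covered.add((last_def[var], loc, var))
    return covered

# Hypothetical result of the static analysis: all DUAs that could occur.
all_duas = {(1, 3, "x"), (1, 6, "x"), (4, 6, "x")}

# Hypothetical dynamic trace of one test run.
trace = [("def", "x", 1), ("use", "x", 3), ("def", "x", 4), ("use", "x", 6)]

covered = covered_duas(trace)
print(f"{len(covered & all_duas)}/{len(all_duas)} DUAs covered")
# prints "2/3 DUAs covered"
```

The uncovered DUA `(1, 6, "x")` would need a test in which the redefinition at location 4 is skipped.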
Static Data Flow Testing
- A flow diagram is created, tracking the static definitions and usages of variables.
- The anomalies discovered here include defined, but unused variables, used but undefined variables and variables that are defined twice before use. [Geeksforgeeks] [Lambdatest]
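The three anomalies can be detected with Python's standard `ast` module. The sketch below is deliberately limited and rests on assumptions: it handles only straight-line code with plain assignments (no branches, loops, functions or augmented assignments), and `find_anomalies` is a hypothetical helper name.

```python
import ast
import builtins

def find_anomalies(source):
    """Report data-flow anomalies in straight-line code as
    sorted (line, variable, description) tuples."""
    known = set(dir(builtins))  # names such as print are always available
    defined = set()
    pending = {}                # variable -> line of a definition not yet used
    anomalies = []
    for stmt in ast.parse(source).body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # Uses are evaluated before the statement's own assignment takes effect.
        for n in names:
            if isinstance(n.ctx, ast.Load) and n.id not in known:
                if n.id not in defined:
                    anomalies.append((n.lineno, n.id, "used but undefined"))
                pending.pop(n.id, None)
        for n in names:
            if isinstance(n.ctx, ast.Store):
                if n.id in pending:
                    anomalies.append((n.lineno, n.id, "defined twice before use"))
                pending[n.id] = n.lineno
                defined.add(n.id)
    for name, line in pending.items():
        anomalies.append((line, name, "defined but unused"))
    return sorted(anomalies)

example = """\
a = 1
b = 2
b = 3
c = a + d
"""
for line, name, problem in find_anomalies(example):
    print(f"line {line}: {name} is {problem}")
```

A real implementation would need the control-flow graph to handle branches and loops; this version only shows the bookkeeping per variable.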
Dynamic Data Flow Testing
- During (test-)code execution, create the flow diagram and analyze it for the same anomalies as in Static Data Flow Testing [Lambdatest]
Implementation Ideas:
The most beneficial would be to test Static Data Flow on the response code, since we want to measure how good the responses are.
- Create the flow diagram for the response-code
- Check if any of the anomalies appear, note how many variables are created and how many are used, note how
Personal Opinion: Since the unit tests in our test_cases have few to no variables, this type of coverage test is a sanity check on the response. It is better suited for tests that are more complex (for example integration tests, UI tests, ...), where many variables have to be created in the test. In my experience, the models we use do not write code with these anomalies.
Path-Based Coverage
Tests entire execution sequences (paths). The number of paths usually grows exponentially with the number of decision points.
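The growth can be made concrete by counting paths in a control-flow graph. The sketch below builds a hypothetical graph for `k` consecutive `if` statements without `else` (each one either runs its body or skips it) and counts the entry-to-exit paths; `sequential_ifs` and `count_paths` are illustrative names only.

```python
def sequential_ifs(k):
    """Control-flow graph (adjacency dict) of k consecutive if-statements."""
    graph = {"exit": []}
    for i in range(k):
        following = f"if{i + 1}" if i + 1 < k else "exit"
        graph[f"if{i}"] = [f"body{i}", following]  # take or skip the body
        graph[f"body{i}"] = [following]
    return graph

def count_paths(graph, node, exit_node):
    """Count all paths from node to exit_node in an acyclic graph."""
    if node == exit_node:
        return 1
    return sum(count_paths(graph, nxt, exit_node) for nxt in graph[node])

for k in (1, 2, 3, 10):
    print(k, "decisions ->", count_paths(sequential_ifs(k), "if0", "exit"), "paths")
```

Each additional independent decision doubles the number of paths, so `k` decisions give `2**k` paths in this example.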
Intra-method paths (IMP)
- Tests each path of a method, starting at the beginning of the method and ending at a return. Nested method calls are not included.
- Loops and recursion would add an infinite number of paths to test. => this is a theoretical idea of path-based coverage
Acyclic intra-method paths (AIMP)
- Bound the number of paths by considering only the acyclic paths of IMP
- treat loops as single decision points. Do not consider recursion as different paths. [Gligoric]
Implementation Ideas:
AIMP is the best we can measure in this category. Here we track how many of the possible paths have been traversed. It might be possible to measure this with an existing Python library.
pytest_cov.plugin PytestCovPlugin
If we cannot use a library, this is the approach I would take:
- Create the control-flow graph of the prompt-code. This may use the same mechanisms as our mcc or ccc implementation. Importantly: a decision node is any form of `if-else`. If there is an `elseif`, then count `if-elseif` and `elseif-else` as two individual decision nodes. Loops count as one decision node: entering or not entering the loop-body. Recursions are not counted as decision points and may be disregarded.
- Split the graph into individual paths
- During the execution of the test-code (extracted response) "count" how many of the paths are traversed with the test and mark them
- Display the findings.
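The steps above can be sketched as follows. Everything here is an assumption for illustration: the control-flow graph and the executed traces are written by hand (in practice they would come from the graph construction and from a tracer), and the loop is modelled as one decision node whose body leads on to the statement after the loop. Traces with a branch inside a loop body would need a more careful reduction than the one shown.

```python
def acyclic_paths(cfg, node, exit_node, seen=()):
    """Yield every path from node to exit_node that visits no node twice."""
    seen = seen + (node,)
    if node == exit_node:
        yield seen
        return
    for nxt in cfg[node]:
        if nxt not in seen:
            yield from acyclic_paths(cfg, nxt, exit_node, seen)

def to_acyclic(trace):
    """Reduce an executed node sequence by keeping each node's first visit,
    so repeated loop iterations collapse into a single loop entry."""
    return tuple(dict.fromkeys(trace))

# Hypothetical graph of a function with one if-else followed by one loop.
cfg = {
    "if": ["then", "else"],
    "then": ["loop"],
    "else": ["loop"],
    "loop": ["body", "return"],  # decision: enter the loop body or not
    "body": ["return"],          # back edge replaced by the loop exit
    "return": [],
}

# Step 2: split the graph into individual paths.
all_paths = list(acyclic_paths(cfg, "if", "return"))

# Step 3: mark the paths traversed by the (hypothetical) test executions.
traces = [
    ["if", "then", "loop", "body", "loop", "body", "loop", "return"],
    ["if", "else", "loop", "return"],
]
traversed = {to_acyclic(t) for t in traces}

# Step 4: display the findings.
for path in all_paths:
    mark = "x" if path in traversed else " "
    print(f"[{mark}] {' -> '.join(path)}")
covered = sum(path in traversed for path in all_paths)
print(f"{covered}/{len(all_paths)} acyclic paths traversed")
```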
Alternatives / extensions / problems
State Coverage
- measure how well the code executes specifications [Vanoverberghe]
Test Coverage
- A qualitative metric that shows how well the requirements are being tested. The focus is on how well the software does what it is supposed to do, not on how many lines of the code are tested. This is a holistic view of the product. It is usually part of integration testing (testing whether things work together). [Gireesh]
Research
- Even with 100% code coverage, 7% to 35% of faults may remain undetected [Hemmati]
- Other or complementary techniques should also be considered.
- Code coverage is not as prevalent in industry, since it adds computational cost [Ivankovio]
Sources
Vanoverberghe Hemmati Santelices Santelices2 Gligoric Lambdatest Geeksforgeeks Ivankovio Gireesh