Research Protocol

In this appendix, we describe in detail how we applied the protocol for performing systematic literature reviews by @kitchenham2004procedures. In order, we go over the search strategy, study selection, study quality assessment, data extraction, and data synthesis. The last subsection lists the studies we included in this review and the studies we found but excluded, with the reason for each exclusion.

Search Strategy

Since release engineering is a relatively new research topic, we took an exploratory approach and collected any literature on release engineering from the perspective of software analytics. This helped us determine a narrower scope for our survey, which in turn allowed us to find additional literature fitting this scope.

At the start of this project, we were provided with an initial seed of five papers as a starting point for our literature survey [@adams2016a; @da2014a; @da2016a; @khomh2012a; @khomh2015a].

We collected other publications using two search engines: Scopus and Google Scholar. Each of the two search engines comprises several databases such as ACM Digital Library, Springer, IEEE Xplore and ScienceDirect. The main query that we constructed is displayed in Figure 1. The publications found using this query were:

  • @kaur2019a
  • @kerzazi2013a
  • @castelluccio2017a
  • @karvonen2017a
  • @claes2017a
  • @fujibayashi2017a
  • @souza2015a
  • @laukkanen2018a
  • @dyck2015a

TITLE-ABS-KEY(
  (
    "continuous release" OR "rapid release" OR "frequent release"
    OR "quick release" OR "speedy release" OR "accelerated release"
    OR "agile release" OR "short release" OR "shorter release"
    OR "lightning release" OR "brisk release" OR "hasty release"
    OR "compressed release" OR "release length" OR "release size"
    OR "release cadence" OR "release frequency"
    OR "continuous delivery" OR "rapid delivery" OR "frequent delivery"
    OR "fast delivery" OR "quick delivery" OR "speedy delivery"
    OR "accelerated delivery" OR "agile delivery" OR "short delivery"
    OR "lightning delivery" OR "brisk delivery" OR "hasty delivery"
    OR "compressed delivery" OR "delivery length" OR "delivery size"
    OR "delivery cadence" OR "continuous deployment" OR "rapid deployment"
    OR "frequent deployment" OR "fast deployment" OR "quick deployment"
    OR "speedy deployment" OR "accelerated deployment" OR "agile deployment"
    OR "short deployment" OR "lightning deployment" OR "brisk deployment"
    OR "hasty deployment" OR "compressed deployment" OR "deployment length"
    OR "deployment size" OR "deployment cadence"
  ) AND (
    "release schedule" OR "release management" OR "release engineering"
    OR "release cycle" OR "release pipeline" OR "release process"
    OR "release model" OR "release strategy" OR "release strategies"
    OR "release infrastructure"
  )
  AND software
) AND (
  LIMIT-TO(SUBJAREA, "COMP") OR LIMIT-TO(SUBJAREA, "ENGI")
)
AND PUBYEAR AFT 2014

Figure 1. Query used for retrieving release engineering publications via Scopus.
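
A query of this shape is tedious to maintain by hand. As a minimal sketch, assuming the phrases are kept in plain Python lists, the two OR groups could be generated as follows. The helper names `or_group` and `build_query` are hypothetical, the phrase lists are small subsets of Figure 1, and the subject-area limits are left out for brevity.

```python
def or_group(phrases):
    """Join phrases into a parenthesised OR group of quoted search terms."""
    return "(" + " OR ".join(f'"{p}"' for p in phrases) + ")"


def build_query(speed_phrases, process_phrases, after_year):
    """Combine two OR groups with AND, mirroring the shape of Figure 1."""
    return (
        f"TITLE-ABS-KEY({or_group(speed_phrases)} "
        f"AND {or_group(process_phrases)} AND software) "
        f"AND PUBYEAR AFT {after_year}"
    )


# Small subsets of the phrases in Figure 1, for illustration only.
speed_phrases = ["rapid release", "continuous delivery", "continuous deployment"]
process_phrases = ["release engineering", "release pipeline"]

query = build_query(speed_phrases, process_phrases, 2014)
print(query)
```

Keeping the phrases in lists makes it easier to add or drop a term without unbalancing the parentheses.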

In addition to querying search engines as described above, we analyzed the references related to the retrieved papers. These reference lists were obtained from Google Scholar and from the References section of the papers themselves. We selected all papers on release engineering that cite or are cited by the papers already collected. Using this approach, we found seven additional papers. The results of the reference analysis are listed in Table 1.

Table 1. Papers found indirectly by investigating citations of/by other papers.

Starting point     Type          Result
@souza2015a        has cited     @plewnia2014a, @mantyla2015a
@khomh2015a        is cited by   @poo-caamano2016a, @teixeira2017a
@mantyla2015a      is cited by   @rodriguez2017a, @cesar2017a
@laukkanen2018a    has cited     @laukkanen2017a
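
The rows of Table 1 can also be recorded as structured data, which makes it easy to count the distinct papers found per direction of citation. The sketch below is illustrative only: the variable names are hypothetical, and it assumes that each row of Table 1 lists one or two resulting papers.

```python
from collections import defaultdict

# (starting point, relation, papers found), transcribed from Table 1.
reference_analysis = [
    ("souza2015a", "has cited", ["plewnia2014a", "mantyla2015a"]),
    ("khomh2015a", "is cited by", ["poo-caamano2016a", "teixeira2017a"]),
    ("mantyla2015a", "is cited by", ["rodriguez2017a", "cesar2017a"]),
    ("laukkanen2018a", "has cited", ["laukkanen2017a"]),
]

# Group the resulting papers by relation type, ignoring duplicates.
by_relation = defaultdict(set)
for _start, relation, papers in reference_analysis:
    by_relation[relation].update(papers)

all_found = set().union(*by_relation.values())

for relation, papers in sorted(by_relation.items()):
    print(relation, len(papers), sorted(papers))
print("distinct papers:", len(all_found))
```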

All papers that were found were stored in a custom-built, web-based tool for conducting literature reviews. The source code of this tool is published in a GitHub repository. The tool was hosted on a virtual private server, so that all retrieved publications were stored centrally and were accessible to all reviewers.

Study Selection

We selected the studies to include in the survey with the aid of the aforementioned tool for storing the papers. In this tool, it is possible to label papers with tags and to leave comments and ratings. Every paper was reviewed against the selection criteria. Based on these reviews, the tool allowed us to filter out all papers that appeared not to be relevant for this literature survey.

The selection criteria are as follows:

  1. The study must show (at least) one release engineering technique.
  2. The study must not just show a release engineering technique, but analyze its performance compared to other techniques.

The last subsection of this appendix lists which studies were selected and which were discarded.

Study Quality Assessment

Following @kitchenham2004procedures, the quality of a paper will be assessed by the evidence it provides, using the scale below. All levels of quality in this scale will be accepted, except for level 5 (evidence obtained from expert opinion).

  1. Evidence obtained from at least one properly-designed randomised controlled trial.
  2. Evidence obtained from well-designed pseudo-randomised controlled trials (i.e. non-random allocation to treatment).
  3. Comparative studies in a real-world setting:
    1. Evidence obtained from comparative studies with concurrent controls and allocation not randomised, cohort studies, case-control studies or interrupted time series with a control group.
    2. Evidence obtained from comparative studies with historical control, two or more single arm studies, or interrupted time series without a parallel control group.
  4. Experiments in artificial settings:
    1. Evidence obtained from a randomised experiment performed in an artificial setting.
    2. Evidence obtained from case series, either post-test or pre-test/post-test.
    3. Evidence obtained from a quasi-random experiment performed in an artificial setting.
  5. Evidence obtained from expert opinion based on theory or consensus.

Also, the studies will be examined to see if they contain any type of bias. For this, the same types of biases will be used as described by @kitchenham2004procedures:

  • Selection/Allocation bias: Systematic difference between comparison groups with respect to treatment.
  • Performance bias: Systematic difference in the conduct of comparison groups apart from the treatment being evaluated.
  • Measurement/Detection bias: Systematic difference between the groups in how outcomes are ascertained.
  • Attrition/Exclusion bias: Systematic differences between comparison groups in terms of withdrawals or exclusions of participants from the study sample.

The studies will be labeled by their quality level and possible biases. This information can be used during the Data Synthesis phase to weigh the importance of individual studies [@kitchenham2004procedures].
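
One way to record these labels is sketched below. This is a minimal illustration under stated assumptions, not the data model of the review tool: the `Assessment` class, the `Bias` enum, the string encoding of sub-levels (such as "3-2") and the paper key `example2020a` are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Bias(Enum):
    SELECTION = "selection/allocation"
    PERFORMANCE = "performance"
    MEASUREMENT = "measurement/detection"
    ATTRITION = "attrition/exclusion"


# Levels of the evidence scale; sub-levels are written as "3-1", "4-2", etc.
QUALITY_LEVELS = ("1", "2", "3-1", "3-2", "4-1", "4-2", "4-3", "5")


@dataclass
class Assessment:
    paper: str
    level: str
    biases: set = field(default_factory=set)

    def __post_init__(self):
        # Reject labels that are not on the scale.
        if self.level not in QUALITY_LEVELS:
            raise ValueError(f"unknown quality level: {self.level}")

    @property
    def accepted(self):
        # Every level is accepted except level 5 (expert opinion).
        return self.level != "5"


# Hypothetical paper key, used only to show the labelling.
example = Assessment("example2020a", "3-2", {Bias.SELECTION})
print(example.paper, example.level, example.accepted)
```

Storing the level and the biases together keeps both available when individual studies are weighed during data synthesis.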

Data Extraction

To accurately capture the information contributed by each publication in our survey, we will use a systematic approach to extracting data. To guide this process, we will be using a data extraction form which describes what aspects of a publication are crucial to record. Besides general publication information (title, author etc.), the form contains questions that are based on our defined research questions. Furthermore, the form contains a section for quantitative research, where aspects such as population and evaluation will be documented. The form that is used for this is shown below:

General information:

- Name of person extracting data:
- Date form completed (dd/mm/yyyy):
- Publication title:
- Author information:
- Publication type:
- Conference/Journal:
- Type of study:

What practices in release engineering does this publication mention?

Are these practices to be classified under dated, state of the art or state of the practice? Why?

What open challenges in release engineering does this publication mention?

What research gaps does this publication contain?

Are these research gaps filled by any other publications in this survey?

Quantitative research publications:

- Study start date:
- Study end date or duration:
- Population description:
- Method(s) of recruitment of participants:
- Sample size:
- Evaluation/measurement description:
- Outcomes:
- Limitations:
- Future research:

Notes:
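
If the form is filled in electronically, its fields could be mirrored in a small record type so that incomplete forms are easy to spot. The sketch below is an assumption about how this might be done, not a description of the tool we used. The class `ExtractionForm` and its field names are hypothetical, and the section for quantitative research is omitted for brevity.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ExtractionForm:
    # General information
    extractor: Optional[str] = None
    date_completed: Optional[str] = None  # dd/mm/yyyy, as on the form
    publication_title: Optional[str] = None
    author_information: Optional[str] = None
    publication_type: Optional[str] = None
    venue: Optional[str] = None  # Conference/Journal
    study_type: Optional[str] = None
    # Questions derived from the research questions
    practices: Optional[str] = None
    practice_classification: Optional[str] = None
    open_challenges: Optional[str] = None
    research_gaps: Optional[str] = None
    gaps_filled_elsewhere: Optional[str] = None
    notes: Optional[str] = None

    def missing(self):
        """Return the names of fields that have not been filled in yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


form = ExtractionForm(extractor="Reviewer A", publication_title="Some title")
print(form.missing())
```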

Data Synthesis

To summarize the contributions and limitations of each of the included publications, we will apply a descriptive synthesis approach. In this part of our survey, we will compare the data that was extracted from the included publications. Publications with similar findings will be grouped and evaluated, and differences between groups of publications will be structured and elaborated on. We will compare the publications on specifics such as their study types, time of publication and study quality.

If the extracted data allows for a structured tabular visualization of similarities and differences between publications, this will serve as an additional form of synthesis. However, this depends on the final set of publications included in this survey.

Included and Excluded Studies

Included:

  • @adams2016a
  • @castelluccio2017a
  • @cesar2017a
  • @claes2017a
  • @da2014a
  • @da2016a
  • @dyck2015a
  • @fujibayashi2017a
  • @karvonen2017a
  • @kerzazi2013a
  • @khomh2015a
  • @laukkanen2017a
  • @laukkanen2018a
  • @mantyla2015a
  • @plewnia2014a
  • @poo-caamano2016a
  • @rodriguez2017a
  • @souza2015a
  • @teixeira2017a

Excluded:

  • @khomh2012a has been excluded because it presents the same results as @khomh2015a. The latter is a journal article rather than a conference article and is therefore more extensive.
  • @kaur2019a has been excluded because we could not obtain the paper itself, as it had not yet been officially released.
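
As a consistency check, the included list can be recomputed from the seed papers, the query results, the papers found through reference analysis, and the exclusions. The sketch below transcribes the citation keys from this appendix. The variable names are illustrative.

```python
seed = {"adams2016a", "da2014a", "da2016a", "khomh2012a", "khomh2015a"}
query_hits = {
    "kaur2019a", "kerzazi2013a", "castelluccio2017a", "karvonen2017a",
    "claes2017a", "fujibayashi2017a", "souza2015a", "laukkanen2018a",
    "dyck2015a",
}
reference_analysis = {
    "plewnia2014a", "mantyla2015a", "poo-caamano2016a", "teixeira2017a",
    "rodriguez2017a", "cesar2017a", "laukkanen2017a",
}
excluded = {"khomh2012a", "kaur2019a"}

# Everything that was found, minus the two exclusions.
included = (seed | query_hits | reference_analysis) - excluded

print(len(included))
for key in sorted(included):
    print(key)
```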