Raw Extracted Data - nsalminen/software-analytics-book GitHub Wiki

Understanding the impact of rapid releases on software quality -- The Case of Firefox

Reference: @khomh2015a

General information:

Name of person extracting data: Maarten Sijm
Date form completed: 27-09-2018
Author information: Foutse Khomh, Bram Adams, Tejinder Dhaliwal, Ying Zou
Publication type: Paper in Conference Proceedings
Conference: Mining Software Repositories (MSR)
Type of study: Quantitative, empirical case study

What practices in release engineering does this publication mention?

Changing from traditional to rapid release cycles in Mozilla Firefox

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the practice, because they study Firefox and Firefox is still using rapid release cycles. However, it is dated because the data is six years old.

What open challenges in release engineering does this publication mention?

More case studies are needed

What research gaps does this publication contain?

More case studies are needed

Are these research gaps filled by any other publications in this survey?

Not yet known TODO

Quantitative research publications:

Study start date: 01-01-2010 (Firefox 3.6)
Study end date or duration: 20-12-2011 (Firefox 9.0)
Population description: Mozilla Wiki, VCS, Crash Repository, Bug Repository
Method(s) of recruitment of participants: N/A (case study)
Sample size: 25 alpha versions, 25 beta versions, 29 minor versions and 7 major versions. The number of bugs, commits, etc. is not specified.
Evaluation/measurement description: Wilcoxon rank sum test
Outcomes:
- With shorter release cycles, users do not experience significantly more post-release bugs
- Bugs are fixed faster
- Users experience these bugs earlier during software execution (the program crashes earlier)
Limitations: Results are specific to Firefox
Future research: More case studies are needed
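The Wilcoxon rank-sum test named in the evaluation field compares two independent samples without assuming normality. The following is a minimal pure-Python sketch, assuming the normal approximation, average ranks for ties, and no tie correction of the variance. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def rank_sum_test(xs, ys):
    """Two-sided rank-sum test (normal approximation, no tie correction).

    Returns (U statistic of xs, z score, two-sided p-value).
    """
    n1, n2 = len(xs), len(ys)
    ranks = average_ranks(list(xs) + list(ys))
    w = sum(ranks[:n1])                # rank sum of the first sample
    u = w - n1 * (n1 + 1) / 2          # Mann-Whitney U of the first sample
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


# Made-up metric values per version, for illustration only.
traditional = [1.0, 2.0, 3.0]
rapid = [4.0, 5.0, 6.0]
u, z, p = rank_sum_test(traditional, rapid)
```

For samples as small as the made-up ones here, an exact test would be more appropriate than the normal approximation.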

On the influence of release engineering on software reputation

Reference: @plewnia2014a

General information:

Name of person extracting data: Maarten Sijm
Date form completed: 27-09-2018
Author information: Christian Plewnia, Andrej Dyck, Horst Lichter
Publication type: Paper in Conference Proceedings
Conference: 2nd International Workshop on Release Engineering
Type of study: Quantitative, empirical case study on multiple software

What practices in release engineering does this publication mention?

Rapid releases

Are these practices to be classified under dated, state of the art or state of the practice? Why?

Dated practice, data is from before 2014

What open challenges in release engineering does this publication mention?

Identifying software reputation can better be done using a qualitative study.

What research gaps does this publication contain?

Identifying software reputation can better be done using a qualitative study.

Are these research gaps filled by any other publications in this survey?

Not yet known TODO

Quantitative research publications:

Study start date: Q3 2008
Study end date or duration: Q4 2013
Population description: Chrome, Firefox, Internet Explorer
Method(s) of recruitment of participants: N/A (case study)
Sample size: 3 browsers
Evaluation/measurement description: No statistical analysis, just presenting market share results
Outcomes:
- Chrome's market share increased after adopting rapid releases
- Firefox's market share decreased after adopting rapid releases
- IE's market share decreased
Limitations:
- Identifying software reputation can better be done using a qualitative study.
Future research:
- Identifying software reputation can better be done using a qualitative study.

On rapid releases and software testing: a case study and a semi-systematic literature review

Reference: @mantyla2015a

General information:

Name of person extracting data: Maarten Sijm
Date form completed: 28-09-2018
Author information: Mäntylä, Mika V. and Adams, Bram and Khomh, Foutse and Engström, Emelie and Petersen, Kai
Publication type: Journal/Magazine Article
Journal: Empirical Software Engineering
Type of study: Empirical case study and semi-systematic literature review

What practices in release engineering does this publication mention?

Impact of rapid releases on testing effort

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the practice for the case study
State of the art for the literature review

What open challenges in release engineering does this publication mention?

Future work should focus on empirical studies of these factors that complement the existing qualitative observations and perceptions of rapid releases.

What research gaps does this publication contain?

See open challenges

Are these research gaps filled by any other publications in this survey?

Not yet known TODO

Quantitative research publications:

Study start date: June 2006 (Firefox 2.0)
Study end date or duration: June 2012 (Firefox 13.0)
Population description: System-level test execution data
Method(s) of recruitment of participants: N/A (case study)
Sample size: 1,547 unique test cases, 312,502 executions, performed by 6,058 individuals on 2,009 software builds, 22 OS versions and 78 locales.
Evaluation/measurement description: Wilcoxon rank-sum test, Cliff's delta, Cohen's Kappa for Firefox Research Question (FF-RQ) 5.
Outcomes (FF-RQs; RR = rapid release; TR = traditional release):
1. RRs perform more test executions per day, but these tests focus on a smaller subset of the test case corpus.
2. RRs have fewer testers, but they have a higher workload.
3. RRs test fewer, but larger builds.
4. RRs test fewer platforms in total, but test each supported platform more thoroughly.
5. RRs have higher similarity of test suites and testers within a release series than TRs had.
6. RR testing happens closer to the release date and is more continuous, yet these findings were not confirmed by the QA engineer.
Limitations:
- Study measures correlation, not causation
- Not generalizable, as it is a case study on FF
Future research: More empirical studies
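Cliff's delta, listed in the evaluation field above, is a non-parametric effect size: the probability that a value from one sample exceeds a value from the other, minus the reverse probability. A minimal sketch, with a hypothetical function name and made-up data:

```python
def cliffs_delta(xs, ys):
    """Return Cliff's delta in [-1, 1]; positive means xs tends to be larger than ys."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Made-up test executions per day, for illustration only.
rapid_release = [12, 15, 20]
traditional_release = [5, 7, 9]
delta = cliffs_delta(rapid_release, traditional_release)
```

A value near 0 indicates heavily overlapping samples, while values near -1 or 1 indicate almost no overlap.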

Semi-systematic literature survey:

Study date: Unknown (before 2015)
Population description: Papers with main focus on:
- Rapid Releases (RRs)
- Aspect of software engineering largely impacted by RRs
- An agile, lean or open source process having results of RRs
- Excluding: opinion papers without empirical data on RRs
Method(s) of recruitment of participants: Scopus queries
Sample size: 24 papers
Outcomes:
- Evidence is scarce. Often RRs are implemented as part of agile adoption. This makes it difficult to separate the impact of RRs from other process changes.
- Originates from several software development paradigms: Agile, FOSS, Lean, internet-speed software development
- Prevalence
  - Practiced in many software engineering domains, not just web applications
  - Between 23% and 83% of practitioners do RRs
- (Perceived) Problems:
  - Increased technical debt
  - RRs are in conflict with high reliability and high test coverage
  - Customers might be displeased with RRs (many updates)
  - Time-pressure / Deadline oriented work
- (Perceived) Benefits:
  - Rapid feedback leading to increased quality focus of the devs and testers
  - Easier monitoring of progress and quality
  - Customer satisfaction
  - Shorter time-to-market
  - Continuous work / testing
- Enablers:
  - Sequential development where multiple releases are under work simultaneously
  - Tools for automated testing and efficient deployment
  - Involvement of product management and productive customers
Limitations:
- Not all papers that present results about RRs mention "rapid release" in the abstract.
Future research:
- Systematically search for agile and lean adoption papers

Release management in free and open source software ecosystems

Reference: @poo-caamano2016a

General information:

Name of person extracting data: Maarten Sijm
Date form completed: 28-09-2018
Author information: Germán Poo-Caamaño
Publication type: PhD Thesis
Type of study: Empirical case study on two large-scale FOSSs: GNOME and OpenStack

What practices in release engineering does this publication mention?

Communication in release engineering

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the practice, because case study

What open challenges in release engineering does this publication mention?

Is the ecosystem [around the studied software] shrinking or expanding?
How have communications in the ecosystem changed over time?

What research gaps does this publication contain?

More case studies are needed

Are these research gaps filled by any other publications in this survey?

Not yet known TODO

Quantitative research publications (GNOME):

Study start date: January 2009 (GNOME 2.x)
Study end date or duration: August 2011 (GNOME 3.x)
Population description: Mailing lists
Method(s) of recruitment of participants: GNOME's website recommends this channel of communication. IRC is also recommended, but its history is not stored.
Sample size: 285 mailing lists, 6947 messages, grouped into 945 discussions.
Evaluation/measurement description: Counting
Outcomes:
- Developers also communicate via blogs, bug trackers, conferences, and hackfests.
- The Release Team has direct contact with almost all participants in the mailing list
- The tasks of the Release Team:
  - defining requirements of GNOME releases
  - coordinating and communicating with projects and teams
  - shipping a release within defined quality and time specifications
- Major challenges of the Release Team:
  - coordinate projects and teams of volunteers without direct power over them
  - keep the build process manageable
  - monitor for unplanned changes
  - monitor for changes during the stabilization phase
  - test the GNOME release
Limitations:
- Only mailing list was investigated, other channels were not
- Possible subjective bias in manually categorizing email subjects
- Not very generalizable, as it's just one case study
Future research:
- Fix the limitations

Quantitative research publications (OpenStack):

Study start date: May 2012
Study end date or duration: July 2014
Population description: Mailing lists
Method(s) of recruitment of participants: Found on OpenStack's website
Sample size: 47 mailing lists, 24,643 messages, grouped into 7,650 discussions. Filtered data: 14,486 messages grouped into 2,682 discussions.
Evaluation/measurement description: Counting
Outcomes:
- Developers communicate via email, blogs, launchpad, wiki, gerrit, face-to-face, IRC, video-conferences, and etherpad.
- Project Team Leaders and the Release Team members are the key players in the communication and coordination across projects in the context of release management
- The tasks for the Release Team and Project Team Leaders:
  - defining the requirements of an OpenStack release
  - coordinating and communicating with projects and teams to reach the objectives of each milestone
  - coordinating feature freeze exceptions at the end of a release
  - shipping a release within defined quality and time specifications
- Major challenges of these teams:
  - coordinate projects and teams without direct power over them
  - keep everyone informed and engaged
  - decide what becomes part of the integrated release
  - monitor changes
  - set priorities in cross-project coordination
  - overcome limitations of the communication infrastructure
Limitations:
- Only studies mailing list, to compare with GNOME case study
- Possible subjective bias in manually categorizing email subjects
- Not very generalizable, as it's just one case study
Future research:
- Fix the limitations

Notes:

Since there are two case studies, the results become a bit more generalizable
The author set up a theory that encapsulates the communication and coordination regarding release management in FOSS ecosystems, and can be summarized as:
1. The size and complexity of the integrated product are constrained by the release managers' capacity
2. The release management should reach the whole ecosystem to increase awareness and participation
3. The release managers need social and technical skills

Release Early, Release Often and Release on Time. An Empirical Case Study of Release Management

Reference: @teixeira2017a

General information:

Name of person extracting data: Maarten Sijm
Date form completed: 28-09-2018
Author information: Jose Teixeira
Publication type: Paper in Conference Proceedings
Conference: Open Source Systems: Towards Robust Practices
Type of study: Empirical case study

What practices in release engineering does this publication mention?

Shifting towards rapid releases in OpenStack

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the practice, because it is a recent case study on OpenStack

What open challenges in release engineering does this publication mention?

More case studies are needed.

What research gaps does this publication contain?

More case studies are needed.

Are these research gaps filled by any other publications in this survey?

Not yet known TODO

Quantitative research publications:

Study start date: Not specified
Study end date or duration: Not specified
Population description: Websites and blogs
Method(s) of recruitment of participants: Random clicking through OpenStack websites
Sample size: Not specified
Evaluation/measurement description: Not specified
Outcomes:
- OpenStack releases in a cycle of six months
- The release management process is a hybrid of feature-based and time-based
- Having a time-based release strategy is a challenging cooperative task involving multiple people and technology
Limitations:
- Study is not completed yet, these are preliminary results
Future research:
- Not indicated

Kanbanize the release engineering process

Reference: @kerzazi2013a

General information:

Name of person extracting data: Jesse Tilro
Date form completed: 29-09-2018
Author information: Kerzazi, N. and Robillard, P.N.
Publication type: Paper in Conference Proceedings
Journal: 2013 1st International Workshop on Release Engineering, RELENG 2013 - Proceedings
Type of study: Action research

What practices in release engineering does this publication mention?

Following principles of the Kanban agile software development life-cycle model that implicitly describe the release process
(Switching to) more frequent (daily) release cycles
(Transitioning to) a structured release process

Are these practices to be classified under dated, state of the art or state of the practice? Why?

Either dated or state of the practice, not sure. Would have to do some additional research on the adoption of Kanban

What open challenges in release engineering does this publication mention?

Release effectiveness: minimize system failure and customer impact
Problems with releasing encountered in practice
- TODO list problems if of interest

What research gaps does this publication contain?

Are these research gaps filled by any other publications in this survey?

Quantitative research publications:

Study start date:
Study end date or duration:
Population description:
Method(s) of recruitment of participants:
Sample size:
Evaluation/measurement description:
Outcomes: 1.
Limitations:
Future research:

Notes:

Is it safe to uplift this patch? An empirical study on mozilla firefox

Reference: @castelluccio2017a

General information:

Name of person extracting data: Jesse Tilro
Date form completed: 29-09-2018
Author information: Castelluccio, M. and An, L. and Khomh, F.
Publication type: Paper in Conference Proceedings
Journal: Proceedings - 2017 IEEE International Conference on Software Maintenance and Evolution, ICSME 2017
Type of study: Case study, both quantitative (data analysis) and qualitative (interviews)

What practices in release engineering does this publication mention?

Patch uplift (meaning the promotion of patches from development directly to a stabilization channel, potentially skipping several channels)

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the practice: case study of what is being done in the field, quite recently (2017).

What open challenges in release engineering does this publication mention?

Exploring possibilities to leverage this research by building classifiers capable of automatically assessing the risk associated with patch uplift candidates and recommend patches that can be uplifted safely.
Validate and extend results of this study for generalizability.

What research gaps does this publication contain?

The study aimed to fill two gaps identified in the literature:
- How do urgent patches in rapid release models affect software quality (in terms of fault proneness)?
- How can the reliability of the integration of urgent patches be improved?

Are these research gaps filled by any other publications in this survey?

The paper itself

Quantitative research publications:

Study start date:
Study end date or duration:
Population description:
Method(s) of recruitment of participants:
Sample size:
Evaluation/measurement description:
Outcomes: 1.
Limitations:
Future research:

Notes:

Systematic literature review on the impacts of agile release engineering practices

Reference: @karvonen2017a

General information:

Name of person extracting data: Jesse Tilro
Date form completed: 29-09-2018
Author information: Karvonen, T. and Behutiye, W. and Oivo, M. and Kuvaja, P.
Publication type: Journal/Magazine Article
Journal: Information and Software Technology
Type of study: Systematic literature review

What practices in release engineering does this publication mention?

Agile release engineering (ARE) practices
- Continuous integration (CI)
- Continuous delivery (CD)
- Rapid Release (RR)
- Continuous deployment
- DevOps (similar to CD, congruent with release engineering practices)

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the art, for it concerns a state of the art report and was published recently (2017).

What open challenges in release engineering does this publication mention?

Claims that modern release engineering practices allow for software to be delivered faster and cheaper should be further empirically validated.
This analysis could be extended with industry case studies, to develop a checklist for analyzing company and ecosystem readiness for continuous delivery and continuous deployment.
The comprehensive reporting of the context and how the practice is implemented instead of merely referring to usage of the practice should be considered by future research.
Different stakeholders' points of view, such as customer perceptions regarding practices require further research.
Research on DevOps would be highly relevant for release engineering and the continuous software engineering research domain.
Future research on the impact of RE practices could benefit from more extensive use of quantitative methodologies from case studies, and the combination of quantitative with qualitative (e.g. interviews) methods.

What research gaps does this publication contain?

Refer to challenges

Are these research gaps filled by any other publications in this survey?

TODO

Quantitative research publications:

Study start date: N/A
Study end date or duration: N/A
Population description: N/A
Method(s) of recruitment of participants: N/A
Sample size: N/A
Evaluation/measurement description: N/A
Outcomes: N/A
Limitations: N/A
Future research: N/A

Notes:

Abnormal Working Hours: Effect of Rapid Releases and Implications to Work Content

Reference: @claes2017a

General information:

Name of person extracting data: Jesse Tilro
Date form completed: 29-09-2018
Author information: Claes, M. and Mantyla, M. and Kuutila, M. and Adams, B.
Publication type: Paper in Conference Proceedings
Journal: IEEE International Working Conference on Mining Software Repositories
Type of study: Quantitative case study

What practices in release engineering does this publication mention?

Faster release cycles

Are these practices to be classified under dated, state of the art or state of the practice? Why?

What open challenges in release engineering does this publication mention?

Future research might further study the impact of time pressure and work patterns - indirectly release practices - on software developers.

What research gaps does this publication contain?

Are these research gaps filled by any other publications in this survey?

Quantitative research publications:

Study start date: first data item 2012-12-21
Study end date or duration: last data item 2016-01-03
Population description: N/A
Method(s) of recruitment of participants: N/A
Sample size: 145691 bug tracker contributors (1.8% timezone), 11.11 million comments (53% author with timezone)
Evaluation/measurement description: measure distributions on number of comments per day of the week and time of the day, before and after transition to rapid release cycles. Test distribution difference using Mann-Whitney U test and test effect size using Cohen's d and Cliff's delta. Also evaluate general development of number of comments, working day against weekend and day against night.
Outcomes:
1. Switching to rapid releases has reduced the amount of work performed outside of office hours. (Supported by results in psychology.)
2. Thus, rapid release cycles seem to have a positive effect on occupational health.
3. Comments posted during the weekend contained more technical terms.
4. Comments posted during weekdays contained more positive and polite vocabulary.
Limitations:
Future research:
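The measurement described above rests on classifying each comment by day of the week and time of day. The following sketch computes the share of comments posted outside office hours. It assumes timestamps are already in the author's local time, and that office hours are 09:00 to 17:00 on weekdays. Both assumptions, as well as the function names and the data, are hypothetical and not taken from the paper.

```python
from datetime import datetime


def is_off_hours(ts, start_hour=9, end_hour=17):
    """True for weekend timestamps and for weekday timestamps outside office hours."""
    if ts.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (start_hour <= ts.hour < end_hour)


def off_hours_share(timestamps, start_hour=9, end_hour=17):
    """Fraction of timestamps that fall outside office hours."""
    timestamps = list(timestamps)
    if not timestamps:
        raise ValueError("no timestamps given")
    off = sum(is_off_hours(ts, start_hour, end_hour) for ts in timestamps)
    return off / len(timestamps)


# Made-up comment timestamps, for illustration only.
comments = [
    datetime(2015, 1, 5, 10, 0),  # Monday morning
    datetime(2015, 1, 5, 22, 0),  # Monday night
    datetime(2015, 1, 3, 12, 0),  # Saturday noon
]
share = off_hours_share(comments)
```

Computing this share separately for the periods before and after the transition gives the two distributions that the statistical tests then compare.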

Notes:

Does the release cycle of a library project influence when it is adopted by a client project?

Reference: @fujibayashi2017a

General information:

Name of person extracting data: Jesse Tilro
Date form completed: 29-09-2018
Author information: Fujibayashi, D. and Ihara, A. and Suwa, H. and Kula, R.G. and Matsumoto, K.
Publication type: Paper in Conference Proceedings
Journal: SANER 2017 - 24th IEEE International Conference on Software Analysis, Evolution, and Reengineering
Type of study: Quantitative study

What practices in release engineering does this publication mention?

Rapid release cycles

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the art and practice: practitioners currently practice it, researchers currently research it.

What open challenges in release engineering does this publication mention?

Gaining an understanding of the effect of a library's release cycle on its adoption.

What research gaps does this publication contain?

First step towards solving the above challenge.

Are these research gaps filled by any other publications in this survey?

This paper

Quantitative research publications:

Study start date: 21-07-2016 (data extraction)
Study end date or duration:
Population description:
Method(s) of recruitment of participants:
Sample size: 23 libraries, 415 client projects
Evaluation/measurement description:
- Scott-Knott test to group libraries with similar release cycle.
Outcomes:
1. There is a relationship between release cycle of a library project and the time for clients to adopt it: quicker release seems to be associated with quicker adoption.
Limitations:
- Small sample size
- Not controlled for many factors
- No statistical significance tests?
Future research:

Notes:

Very short, probably not very strong evidence, refer to limitations
Nice that the focus is libraries here, very interesting population because most studies focus on end-user targeting software systems

Rapid releases and patch backouts: A software analytics approach

Reference: @souza2015a

General information:

Name of person extracting data: Jesse Tilro
Date form completed: 29-09-2018
Author information: Souza, R. and Chavez, C. and Bittencourt, R.A.
Publication type: Journal/Magazine Article
Journal: IEEE Software
Type of study: Quantitative case study (Mozilla Firefox)

What practices in release engineering does this publication mention?

Rapid release
Backing out of broken patches (patch backouts)
Stabilization channels / monitored integration repository

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the practice (case study)

What open challenges in release engineering does this publication mention?

How rapid release cycles affect code integration, where patch backouts are a proxy for studying code integration
Integrate backout rate analysis in an analytics tool to provide release engineers with up-to-date information on the process

Quantitative research publications:

Study start date: first data item 30 June 2009
Study end date or duration: last data item 17 September 2013
Population description:
Method(s) of recruitment of participants:
Sample size: 43198 bug fixes, no further sample sizes of the raw data mentioned anywhere unfortunately. (Data from Mozilla Firefox project.)
Evaluation/measurement description: Associate commit log, bug reports and releases. Classify backouts. Measure rate of backouts against all fixed bugs, per month and per release strategy period. Test for statistical significance using Fisher's exact test and Wilcoxon signed-rank test.
Outcomes:
1. Absolute numbers of bug fixes and backouts increased under rapid releases (probably the increase in regular contributors played a role, cannot conclude anything about workload.)
2. Backout rate increased under rapid releases (sheriff managed integration repositories may have increased the prevalence of backout culture)
3. A higher early backout rate and a lower late backout rate indicate a shift towards earlier problem detection (the proportion of early backouts rose from 57% to 88%). The time-to-backout also dropped.
Limitations:
- Sample size not mentioned
- Quite trivial statistics
Future research:
- Integrate backout rate analysis in an analytics tool to provide release engineers with up-to-date information on the process
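The backout rate described in the evaluation field is the number of backed-out fixes divided by all bug fixes in a period, and two periods can be compared with Fisher's exact test on a 2x2 table. The following is a minimal pure-Python sketch. The function names and the counts are hypothetical, not figures from the article.

```python
from math import comb


def backout_rate(backouts, fixes):
    """Share of bug fixes that were backed out."""
    if fixes <= 0:
        raise ValueError("fixes must be positive")
    return backouts / fixes


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are at most as likely as the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(p for p in map(prob, range(low, high + 1)) if p <= p_observed * tolerance)


# Made-up counts: (backed out, not backed out) per release strategy period.
traditional = (8, 92)
rapid = (20, 80)
rates = (backout_rate(8, 100), backout_rate(20, 100))
p_value = fisher_exact_two_sided(*traditional, *rapid)
```

Tracking these rates per month, as the future research item suggests, would amount to calling `backout_rate` on each month's counts.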

Interview triangulation

Explanations of quantitative outcomes:
- larger code base and more products -> more conflicts
- evolution of automated testing toolset -> earlier and more backouts
- sheriff managed integration repos -> earlier and more backouts
Explanations of impact
- cultural shift reduced testing efforts beforehand, and higher early backout rate eventually reduced the effort to integrate bug fixes for developers
- given the many stabilization channels and the rarity of very late backouts both in traditional and rapid release cycles, changes in backouts do not seem to influence users' perception of quality (even though frequent update notifications and broken compatibilities caused upset users)

Notes:

Also reviews existing literature well.
Treats transitional period from traditional to rapid releases as a separate period.

Comparison of release engineering practices in a large mature company and a startup

Reference: @laukkanen2018a

General information:

Name of person extracting data: Jesse Tilro
Date form completed: 29-09-2018
Author information: Laukkanen, E. and Paasivaara, M. and Itkonen, J. and Lassenius, C.
Publication type: Journal/Magazine Article
Journal: Empirical Software Engineering
Type of study: Case study (2 cases)

What practices in release engineering does this publication mention?

Continuous Integration (mainly)
Code review
Internal Verification Scope
Domain Expert Testing
Testing with customers

Are these practices to be classified under dated, state of the art or state of the practice? Why?

What open challenges in release engineering does this publication mention?

The results of this study can be verified by additional case studies or even surveys, to close the gap in empirical research on release engineering

Quantitative research publications:

Study start date:
Study end date or duration:
Data acquisition period: 22 weeks (BigCorp) and 24 weeks (SmallCorp)
Population description:
Method(s) of recruitment of participants:
Sample size: 1889 builds (BigCorp) and 760 builds (SmallCorp)
Evaluation/measurement description:
Outcomes:
- High internal quality standards combined with the large distributed organizational context of BigCorp slowed the verification process down and therefore had a negative impact on release capability
- In SmallCorp, the lack of internal verification measures due to a lack of resources was mitigated by code review, disciplined CI and external verification by customers in customer environments. This allowed for fast release capability and gaining feedback from production.
- Variables
  - Multiple customers -> High quality standards
  - High quality standards -> Complex CI
  - High quality standards -> Slow verification
  - Complex CI -> Undisciplined CI
  - Large distributed organization -> Undisciplined CI
  - Undisciplined CI -> Slow verification
  - Slow verification -> Slow release capability
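The variable relations listed above form a small directed graph, so the factors that ultimately feed into slow release capability can be found by walking it backwards. The sketch below assumes each arrow reads as "contributes to" and normalizes the capitalization of the variable names. The function names are hypothetical.

```python
# Edges copied from the list above as (cause, effect) pairs.
EDGES = [
    ("Multiple customers", "High quality standards"),
    ("High quality standards", "Complex CI"),
    ("High quality standards", "Slow verification"),
    ("Complex CI", "Undisciplined CI"),
    ("Large distributed organization", "Undisciplined CI"),
    ("Undisciplined CI", "Slow verification"),
    ("Slow verification", "Slow release capability"),
]


def upstream_factors(target, edges):
    """Return every variable with a directed path to target."""
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        for cause in parents.get(node, ()):
            if cause not in seen:
                seen.add(cause)
                stack.append(cause)
    return seen


def root_causes(target, edges):
    """Upstream variables that are not themselves the effect of another variable."""
    effects = {effect for _, effect in edges}
    return {f for f in upstream_factors(target, edges) if f not in effects}


roots = root_causes("Slow release capability", EDGES)
```

In this reading, the two root factors are the organizational context variables, while everything else sits on a path between them and release capability.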
Limitations:
- Only a case study, so difficult to generalize
Future research:

Notes:

Quantitative results triangulated with interviews

Modern Release Engineering in a Nutshell

Reference: @adams2016a

General information:

Name of person extracting data: Nels Numan
Date form completed (dd/mm/yyyy): 28/09/2018
Publication title: Modern Release Engineering in a Nutshell
Author information: Bram Adams and Shane McIntosh
Journal: 23rd International Conference on Software Analysis, Evolution, and Reengineering (2016)
Publication type: Conference paper
Type of study: Survey

What practices in release engineering does this publication mention?

Branching and merging
- Software teams rely on Version Control Systems
- Quality assurance activities like code reviews are used before doing a merge or even allowing a code change to be committed into a branch
- Keep branches short-lived and merge often. If this is impossible, a rebase can be done.
- "trunk-based development" can be applied to eliminate most branches below the master branch.
- Feature toggles are used to provide isolation for new features in case of the absence of branches.
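A feature toggle, as mentioned in the last item, keeps unfinished code on the main branch but guards it behind a runtime switch. The following is a minimal sketch. The class, the toggle name `new_checkout`, and the function are hypothetical illustrations, not an API from the paper.

```python
class FeatureToggles:
    """Holds the set of feature names that are currently switched on."""

    def __init__(self, enabled=None):
        self._enabled = set(enabled or [])

    def is_enabled(self, name):
        return name in self._enabled


def render_checkout(toggles):
    """Pick a code path based on the toggle; both paths live on the same branch."""
    if toggles.is_enabled("new_checkout"):
        return "new checkout flow"
    return "legacy checkout flow"


# Production keeps the feature off; a test environment switches it on.
production = FeatureToggles()
staging = FeatureToggles(enabled=["new_checkout"])
```

Because the switch is data rather than a branch, the new code is integrated continuously and can be turned on without a merge.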
Building and testing
- To help assess build and test conflicts, many projects also provide "try" servers to development teams, which automatically run a build and test process referred to as CI.
- The CI process often does not run the full test suite, but a representative subset.
- The more intensive tests, such as integration, system, or performance tests, typically run nightly or on weekends.
Build system:
- GNU Make is the most popular file-based build system technology. Ant is the prototypical task-based build system technology. Lifecycle-based build technologies like Maven consider the build system of a project to have a sequence of standard build activities that together form a "build lifecycle."
- "Reproducible builds" mean that, for a given feature and hardware configuration of the code base, every build invocation should yield bit-to-bit identical build results.
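Bit-to-bit identical results are usually checked by comparing hashes of the build output. The sketch below packs files into a tar archive and shows how two common sources of variation (file order and timestamps) can be pinned down. The function names are hypothetical, and a real build has many more sources of non-determinism than this toy archive.

```python
import hashlib
import io
import tarfile


def build_archive(files, mtime=0):
    """Pack a {name: bytes} mapping into a tar archive with fixed ordering and mtime."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed order, independent of dict insertion order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime      # fixed timestamp instead of the current time
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def digest(artifact):
    """SHA-256 hex digest used to compare two build results."""
    return hashlib.sha256(artifact).hexdigest()


first = digest(build_archive({"a.txt": b"hello", "b.txt": b"world"}))
second = digest(build_archive({"b.txt": b"world", "a.txt": b"hello"}))
reproducible = first == second
```

Letting `mtime` follow the wall clock instead would already be enough to make two otherwise identical invocations produce different digests.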
Infrastructure-as-code
- Containers or virtual machines are used to deploy new versions of the system for testing or even production.
- It has been recommended that infrastructure code be stored in a VCS repository separate from the source code, in order to restrict access to infrastructure code.
Deployment
- The term "dark launching" corresponds to deploying new features without releasing them to the public, in which parts of the system automatically make calls to the hidden features in a way invisible to end users.
- "Blue green deployment" deploys the next software version on a copy of the production environment, and changes this to be the main environment on release.
- In "canary deployment" a prospective release of the software system is loaded onto a subset of the production environments for only a subset of users.
- "A/B testing" deploys alternative A of a feature to the environment of a subset of the user base, while alternative B is deployed to the environment of another subset.
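Canary deployment and A/B testing both need a stable way to decide which users see which version. One common approach, sketched here as an assumption rather than as the paper's method, is to hash the user id into a fixed number of buckets. All names and salts below are hypothetical.

```python
import hashlib


def bucket(user_id, salt, buckets=100):
    """Map a user id to a stable bucket number in [0, buckets)."""
    raw = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(raw[:8], "big") % buckets


def in_canary(user_id, percent, salt="canary-release"):
    """True if the user belongs to the first `percent` buckets of the canary rollout."""
    return bucket(user_id, salt) < percent


def ab_variant(user_id, salt="experiment-1"):
    """Assign the user to alternative A or B, each covering half of the buckets."""
    return "A" if bucket(user_id, salt) < 50 else "B"


users = [f"user-{i}" for i in range(1000)]
canary_users = [u for u in users if in_canary(u, percent=5)]
```

Because the assignment depends only on the id and the salt, a user keeps the same version across requests, and raising `percent` only ever adds users to the canary group.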
Release
- Once a deployed version of a system is released, the release engineers monitor telemetry data and crash logs to track the performance and quality of releases. Several frameworks and applications have been introduced for this.

Are these practices to be classified under dated, state of the art or state of the practice? Why?

The majority of these practices are classified by the paper as state of the practice, but state of the art practices are also mentioned.

What open challenges in release engineering does this publication mention?

Branching and merging
- No methodology or insight exists on how to empirically validate the best branching structure for a given organization or project, or on which structure results in the fewest merge conflicts.
- Release engineers need to pay particular attention to conflicts and incompatibilities caused by evolving library and API dependencies.
Building and testing
- Speeding up CI might be the major concern of practitioners. This speed-up can be achieved by predicting whether a code change will break the build, or by "chunking" code changes into groups and compiling and testing each group only once.
- The concept of "green builds" is slowly becoming an issue, in the sense that frequent triggering of the CI server consumes energy.
- Security of the release engineering pipeline in general, and the CI server in particular, also has become a major concern.
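
The "chunking" idea can be sketched as follows. The `passes` callback is a hypothetical hook that builds and tests a group of changes together. Splitting a failing group in half to locate the culprit is an assumption of this sketch, not something the paper prescribes, and it presumes that changes break the build independently of each other:

```python
from typing import Callable, List, Sequence


def find_breaking_changes(
    changes: Sequence[str],
    passes: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the changes that break the build, testing groups instead of single changes."""
    if not changes or passes(changes):
        return []  # one green run clears the whole group
    if len(changes) == 1:
        return list(changes)  # a failing group of one is a culprit
    mid = len(changes) // 2
    return (find_breaking_changes(changes[:mid], passes)
            + find_breaking_changes(changes[mid:], passes))
```

When most groups are green, a group costs a single CI run instead of one run per change.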
Build system
- Qualitative studies are not only essential to understand the rationale behind quantitative findings, but also to identify design patterns and best practices for build systems.
  - How can developers make their builds more maintainable and of higher quality?
  - What refactorings should be performed for which build system anti-patterns?
- Identification and resolution of build bugs, i.e., source code or build specification changes that cause build breakage, possibly on a subset of the supported platforms.
- Basic tools have a hard time determining what part of the system is necessary to build.
- Studies on non-GNU Make build systems are missing.
- Apart from identifying bottlenecks, build analysis approaches should also suggest concrete refactorings of the build system specifications or source code.
Infrastructure-as-code
- Research on differences between infrastructure languages is lacking.
- Best practices and design patterns for infrastructure-as-code need to be documented.
- Qualitative analysis of infrastructure code will be necessary to understand how developers address different infrastructure needs.
- Quantitative analysis of the version control and bug report systems can then help to determine which patterns were beneficial in terms of maintenance effort and/or quality.
Deployment
- More empirical studies are needed to answer questions such as:
  - Is blue-green deployment the fastest means to deploy a new version of a web app?
  - Are A/B testing and dark launching worth the investment and risk?
  - Should one use containers or virtual machines for a medium-sized web app in order to meet application performance and robustness criteria?
  - If an app is part of a suite of apps built around a common database, should each app be deployed in a different container?
- Better tools for quality assurance are required to prevent showstopper bugs from slipping through and requiring re-deployment of a mobile app version (with corresponding vetting). These include:
  - Defect prediction (either file- or commit-based)
  - Smarter/safer update mechanisms
  - Tools for improving code review
  - Generating tests
  - Filtering and interpreting crash reports
  - Prioritization and triaging of defect reports
Release
- More research is needed on determining which code change is the right one to trigger a release, or whether a canary is good enough to be released to another data centre.
- Questions such as the following should be investigated:
  - Should one release on all platforms at the same time?
  - In the case of defects, which platform should receive priority?
  - Should all platforms use the same version numbering, or should that be feature-dependent?
- Research on continuous delivery and rapid releases in other systems should be explored.

What research gaps does this publication contain?

As is common with surveys, it does not reflect the state of the field today. More quantitative and qualitative research has been done since its publication, which it could not possibly include.

Are these research gaps filled by any other publications in this survey?

An example of further research that expands on this study is @da2016a.

The Impact of Switching to a Rapid Release Cycle on the Integration Delay of Addressed Issues

Reference: @da2016a

General information:

Name of person extracting data: Nels Numan
Date form completed (dd/mm/yyyy): 28/09/2018
Publication title: The Impact of Switching to a Rapid Release Cycle on the Integration Delay of Addressed Issues
Author information: Daniel Alencar da Costa, Shane McIntosh, Uira Kulesza, Ahmed E. Hassan
Journal: 13th Working Conference on Mining Software Repositories (2016)
Publication type: Conference paper
Type of study: Empirical study

What practices in release engineering does this publication mention?

To give a context to the study, the paper describes the concept of traditional releases, rapid releases, their differences, and how issue reports are structured.

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the practice. The paper describes common practices that were in use at the time of the publication.

What open challenges in release engineering does this publication mention?

The study mentions that comparing systems with different release structures is difficult, since one has to distinguish to what extent the results are due to the release strategy and to what extent they are due to intricacies of the systems or organizations themselves.

What research gaps does this publication contain?

The main gap in this study is the specificity of the data. Only Mozilla has been considered, and external factors such as other organizational challenges which could have an effect on release time could not be included. More research that looks further into comparing this case to that of other organizations is needed.

Are these research gaps filled by any other publications in this survey?

Quantitative research publications:

Study start date: Used data starts from 1999
Study end date or duration: Used data ends in 2010
Population description: The paper describes its data collection approach in multiple steps. It collected the date and version number of each Firefox release. Tags within the VCS were used to link issue IDs to releases: since the commit logs are linked to the VCS tags, the paper is able to link the issue IDs found within these commit logs to the releases that correspond to those tags. The paper discards issue IDs that are potential false positives: IDs that have fewer than five digits, issues that refer to tests instead of bugfixes, and any potential ID that is the name of a file.
Method(s) of recruitment of participants: Firefox release history wiki and VCS logs
Sample size: 72114 issue reports from the Firefox system (34673 for traditional releases and 37441 for rapid releases)
Evaluation/measurement description: The paper aims to answer three research questions:
- Are addressed issues integrated more quickly in rapid releases?
  - Approach: Using beanplots to compare the distributions, the paper first observes the lifetime of the issues of traditional and rapid releases. Next, it looks at the time span of the triaging, fixing, and integration phases within the lifetime of an issue.
- Why can traditional releases integrate addressed issues more quickly?
  - Approach: the paper groups traditional and rapid releases into major and minor releases and studies their integration delay through beanplots, Mann-Whitney-Wilcoxon tests, Cliff's delta, and MAD.
- Did the change in the release strategy have an impact on the characteristics of delayed issues?
  - Approach: the paper builds linear regression models for both release approaches. It first estimates the degrees of freedom that can be spent on the models. Second, it checks for highly correlated metrics using Spearman rank correlation tests and performs a redundancy check to remove redundant metrics. The paper then assesses the fit of the models using the ROC area and the Brier score. The ROC area is used to evaluate the degree of discrimination achieved by the model. The Brier score is used to evaluate the accuracy of probabilistic predictions. The metrics used include reporter experience, resolver experience, issue severity, issue priority, project queue rank, number of impacted files and fix time. A full list of metrics can be found in Table 2 of the paper.
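
For readers unfamiliar with the measures named in this approach, the sketch below gives textbook definitions of Cliff's delta, MAD and the Brier score. It assumes MAD stands for median absolute deviation and uses it without a scaling constant. The function names are illustrative, and this is not the authors' analysis code:

```python
from statistics import median
from typing import Sequence


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Effect size in [-1, 1]: P(x > y) - P(x < y) over all pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def mad(values: Sequence[float]) -> float:
    """Median absolute deviation from the median (no scaling constant)."""
    center = median(values)
    return median(abs(v - center) for v in values)


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)
```

A Brier score of 0 means perfect probabilistic predictions, and always predicting 0.5 yields 0.25.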
Outcomes:
- Are addressed issues integrated more quickly in rapid releases?
  - Results: There is no significant difference between traditional and rapid releases regarding issue lifetime.
- Why can traditional releases integrate addressed issues more quickly?
  - Results: Minor-traditional releases tend to have less integration delay than major/minor-rapid releases.
- Did the change in the release strategy have an impact on the characteristics of delayed issues?
  - Results: The models achieve a Brier score of 0.05-0.16 and ROC areas of 0.81-0.83. Traditional releases prioritize the integration of backlog issues, while rapid releases prioritize the integration of issues of the current release cycle.
Limitations: Defects in the tools that were developed to perform the data collection and evaluation could have an effect on the outcomes. Furthermore, the way that issue IDs are linked to releases may not represent the total addressed issues per release. The results cannot be generalized as the evaluation was solely done on the Firefox system.
Future research: Further research can look into applying the same evaluation strategy to other organizations that switched from traditional to rapid release.

Notes:

An Empirical Study of Delays in the Integration of Addressed Issues

Reference: @da2014a

General information:

Name of person extracting data: Nels Numan
Date form completed (dd/mm/yyyy): 29/09/2018
Publication title: An Empirical Study of Delays in the Integration of Addressed Issues
Author information: Daniel Alencar da Costa, Surafel Lemma Abebe, Shane McIntosh, Uira Kulesza, Ahmed E. Hassan
Journal: 2014 IEEE International Conference on Software Maintenance and Evolution
Publication type: Conference paper
Type of study: Empirical study

What practices in release engineering does this publication mention?

This publication discusses the use of issue tracking systems and defines the term "issue" to provide context for the study.

Are these practices to be classified under dated, state of the art or state of the practice? Why?

State of the practice.

What open challenges in release engineering does this publication mention?

The results based on the investigated open source projects may not be generalizable and replication of the study is required on a larger set of projects to form a more general conclusion. Another challenge is finding metrics that are truly correlated with the integration delay of issues.

What research gaps does this publication contain?

See the open challenges above.

Are these research gaps filled by any other publications in this survey?

@da2016a

Quantitative research publications:

Study start date:
Used data start dates:
- ArgoUML: 18/08/2003
- Eclipse: 03/11/2003
- Firefox: 05/06/2012
Used data end dates:
- ArgoUML: 15/12/2011
- Eclipse: 12/02/2007
- Firefox: 04/02/2014
Population description:
Method(s) of recruitment of participants: The data was collected from both ITSs and VCSs of the studied systems.
Sample size: 20,995 issues from ArgoUML, Eclipse and Firefox projects
Evaluation/measurement description:
- How long are addressed issues typically delayed by the integration process?
  - Approach: models are created using metrics from four dimensions: reporter, issue, project, and history. Please refer to Table 2 in the paper for all of the metrics considered. The models are trained using the random forest technique. Precision, recall, F-measure, and ROC area are used to evaluate the models.
Outcomes:
- How long are addressed issues typically delayed by the integration process?
  - Addressed issues are usually delayed in a rapid release cycle. Many delayed issues were addressed well before releases from which they were omitted.
- Can we accurately predict when an addressed issue will be integrated?
  - The prediction models achieve a weighted average precision between 0.59 and 0.88 and a recall between 0.62 and 0.88, with ROC areas above 0.74. The models achieve better F-measure values than Zero-R.
- What are the most influential attributes for estimating integration delay?
  - The integrator workload has a bigger influence on integration delay than the other attributes. Severity and priority have little influence on issue integration delay.
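
The Zero-R baseline and the evaluation measures mentioned in these outcomes can be sketched as follows. The class labels are hypothetical, and the code computes per-class values rather than the weighted averages reported in the paper:

```python
from collections import Counter
from typing import Sequence, Tuple


def zero_r(train_labels: Sequence[str]) -> str:
    """Zero-R ignores all attributes and always predicts the majority class."""
    return Counter(train_labels).most_common(1)[0][0]


def precision_recall_f1(
    actual: Sequence[str], predicted: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Precision, recall and F-measure for one class treated as positive."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Zero-R scores well on the majority class and zero on every other class, which is why it serves as a floor that a trained model should beat.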
Limitations: See open challenges.
Future research: See open challenges.

Notes:

Towards Definitions for Release Engineering and DevOps

Reference: @dyck2015a

General information:

Name of person extracting data: Nels Numan
Date form completed (dd/mm/yyyy): 30/09/2018
Publication title: Towards Definitions for Release Engineering and DevOps
Author information: Andrej Dyck, Ralf Penners, Horst Lichter
Journal:
Publication type:
Type of study: Survey

What practices in release engineering does this publication mention?

This paper talks about approaches to improve the collaboration between development and IT operations teams, in order to streamline software engineering processes. The paper proposes definitions for release engineering and DevOps.

Are these practices to be classified under dated, state of the art or state of the practice? Why?

Not applicable.

What open challenges in release engineering does this publication mention?

The paper mentions that a definition that is uniform and valid for many situations is difficult to create, and that further research is needed.

What research gaps does this publication contain?

This paper aims to form a uniform definition for release engineering and DevOps, in collaboration with experts. It is unclear how many experts were consulted for this definition, and more consultations and research could be done to further improve the definition.

Are these research gaps filled by any other publications in this survey?

Quantitative research publications:

Study start date:
Study end date or duration:
Population description:
Method(s) of recruitment of participants:
Sample size:
Evaluation/measurement description:
Outcomes:
Limitations:
Future research:

Notes:

Continuous deployment of software intensive products and services: A systematic mapping study

Reference: @rodriguez2017a

General information:

Name of person extracting data: Nels Numan
Date form completed (dd/mm/yyyy): 30/09/2018
Publication title: Continuous deployment of software intensive products and services: A systematic mapping study
Author information: Pilar Rodríguez, Alireza Haghighatkhah, Lucy Ellen Lwakatare, Susanna Teppola, Tanja Suomalainen, Juho Eskeli, Teemu Karvonen, Pasi Kuvaja, June M. Verner, Markku Oivo
Journal:
Publication type:
Type of study: Systematic mapping study

What practices in release engineering does this publication mention?

This paper discusses the developments in continuous deployment (CD) over the years until June 2014. It performs a systematic mapping study to identify, classify and analyze primary studies related to continuous deployment. The paper has found the following major points:
- Almost all primary studies refer in one way or another to accelerating the release cycle by shortening the release cadence and turning it into a continuous flow.
- Some reviewed publications claim that accelerating the release cycle can make it harder to perform re-engineering activities.
- CD challenges and changes traditional planning towards continuous planning in order to achieve fast and frequent releases.
- Tighter integration between planning and execution is required in order to achieve a more holistic view on planning in CD.
- It is important for the engineering and QA teams to ensure backward compatibility of enhancements, so that users perceive only improvements rather than experience any loss of functionality.
- Code change activities tend to focus more on bug fixing and maintenance than on functionality expansion.
- The architecture must be robust enough to allow the organization to invest its resources in offensive initiatives such as new functionality, product enhancements and innovation rather than defensive efforts such as bugfixes.
- A major challenge in CD is to retain the balance between speed and quality. Some approaches reviewed by this study propose a focus on measuring and monitoring source code and architectural quality.
- To avoid issues such as duplicated testing efforts and slow feedback loops it is important to make all testing activities transparent to individual developers.

What open challenges in release engineering does this publication mention?

Continuous and rapid experimentation is an emerging research topic with many possibilities for future work. This is why it's important to keep up with the newly contributed studies and add them to future reviews to compare their findings.

What research gaps does this publication contain?

Notes:

Frequent Releases in Open Source Software: A Systematic Review

Reference: @cesar2017a

General information:

Name of person extracting data: Nels Numan
Date form completed (dd/mm/yyyy): 30/09/2018
Publication title: Frequent Releases in Open Source Software: A Systematic Review
Author information: Antonio Cesar Brandão Gomes da Silva, Glauco de Figueiredo Carneiro, Fernando Brito e Abreu and Miguel Pessoa Monteiro
Journal: Information
Publication type: Journal
Type of study: Survey

What practices in release engineering does this publication mention?

This paper is a systematic review that identifies, classifies and analyzes primary studies on frequent releases in open source software. The paper finds:
- Two main motivations for the implementation of frequent software releases in the context of OSS projects: project attractiveness/increase of participants, and maintenance and increase of market share.
- Four main strategies are adopted by practitioners to implement frequent software releases in the context of OSS projects: time-based release, automated release, test-driven development and continuous delivery/deployment.
- The main positive points associated with rapid releases are: quick return on customer needs, rapid delivery of new features, quick bug fixes, immediate release of security patches, increased efficiency, entry of new collaborators, and greater focus on quality on the part of developers and testers.
- The main negative points associated with rapid releases are reliability of new versions, increase in the "technical debt", pressure felt by employees and community dependence.

Are these practices to be classified under dated, state of the art or state of the practice? Why?

The practices discussed are a combination of state of the art and state of the practice approaches.

What open challenges in release engineering does this publication mention?

A meta-model for the mining of open source bases, with a view to gathering data that leads to an assessment of the quality of projects adopting the frequent release approach.

What research gaps does this publication contain?

Are these research gaps filled by any other publications in this survey?
