Github - yizhihenpidehou/yzhpdh-s-bookcase GitHub Wiki
-
Xiao, Wenxin, et al. "Recommending good first issues in GitHub OSS projects." Proceedings of the 44th International Conference on Software Engineering. 2022. PDF 【Predicts which issues project maintainers will label as Good First Issues】
-
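Golzadeh, Mehdi, et al. "A ground-truth dataset and classification model for detecting bots in GitHub issue and PR comments." Journal of Systems and Software 175 (2021): 110911. PDF 【Bots on GitHub can distort empirical social analyses, so the paper proposes a classification model that distinguishes humans from bots and builds a dataset (for privacy reasons the specific commit content is not released, only account names and whether each account is a bot). The problem posed: given a contributor who has commented on an issue or PR, decide whether that contributor is a human or a bot】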
-
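Abdellatif, Ahmad, et al. "BotHunter: an approach to detect software bots in GitHub." Proceedings of the 19th International Conference on Mining Software Repositories. 2022. PDF 【The large number of software bots places an extra burden on practitioners and researchers who need to tell human accounts from bot accounts. To avoid bias in data-driven studies, and because earlier work only targeted specific types of bots and did not generalize, the paper selects and extracts 19 features related to a GitHub account's profile information, activity, and comment similarity. It evaluates five machine learning classifiers on a dataset of more than 5,000 GitHub accounts; the random forest classifier performs best. The work closest to ours is the one that proposes identifying the account type (bot or human) at the account level on social coding platforms】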
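The note above mentions comment similarity as one feature family. A minimal sketch of how such a feature might be computed, assuming a plain average of pairwise `difflib` similarity ratios; the function name `mean_comment_similarity` and the sample comments are hypothetical, and this is not claimed to be the definition used in the BotHunter paper:

```python
from difflib import SequenceMatcher
from itertools import combinations


def mean_comment_similarity(comments):
    """Average pairwise similarity ratio (0..1) across one account's comments.

    Assumption: similarity is difflib's SequenceMatcher ratio; the paper
    may define its comment-similarity features differently.
    """
    pairs = list(combinations(comments, 2))
    if not pairs:
        return 0.0  # fewer than two comments: nothing to compare
    total = sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs)
    return total / len(pairs)


# Made-up comments for illustration only.
bot_like = [
    "Build passed for commit abc123.",
    "Build passed for commit def456.",
    "Build passed for commit 0a1b2c.",
]
human_like = [
    "Thanks, this fixes the crash on startup.",
    "Could you add a unit test?",
    "I think the API name is confusing.",
]

bot_score = mean_comment_similarity(bot_like)
human_score = mean_comment_similarity(human_like)
```

Templated comments should score noticeably higher than free-form ones, which is the intuition behind feeding a value like this to a classifier alongside profile and activity features.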
-
Dey, Tapajit, et al. "Detecting and characterizing bots that commit code." Proceedings of the 17th international conference on mining software repositories. 2020. 【The paper's features center on the commit message, commit association, and author name. BIMAN consists of three sub-modules, BIN (name), BIM (commit message), and BICA (commit association), each of which can run on its own; the paper feeds each module's output into the final prediction module to decide whether the author is a bot】
-
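Bao, Lingfeng, et al. "A large scale study of long-time contributor prediction for GitHub projects." IEEE Transactions on Software Engineering 47.6 (2019): 1277-1298. 【Builds a prediction model on a newcomer's development activity during their first month to predict whether the developer will become a long-time contributor to the repository. It extracts 63 features from GitHub data along five dimensions: developer profile, repository profile, developer monthly activity, repository monthly activity, and collaboration network】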
-
Dey, Tapajit, Bogdan Vasilescu, and Audris Mockus. "An exploratory study of bot commits." Proceedings of the IEEE/ACM 42nd international conference on software engineering workshops. 2020. (PDF)
Background: Bots help automate many of the tasks performed by software developers and are widely used to commit code in various social coding platforms.
Motivation: At present, it is not clear what types of activities these bots perform and understanding it may help design better bots, and find application areas which might benefit from bot adoption.
Method: 12,326,137 commits made by 461 popular bots (each with at least 1000 commits) were examined to identify the frequency and type of files added, deleted, or modified by the commits, and association rule mining was used to identify the types of files modified together.
Result:
- Bots mostly take care of file updates, and update a small number of files per commit.
- The majority of bot commits comprises frequent updates to configuration, documentation, and data.
- Bots seem to be more active in Web-interface-related projects, since the majority of bot commits involve changes to HTML, JavaScript, and JSON files.
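The association rule mining step in the Method can be illustrated with a small sketch. It assumes each commit is a list of changed file paths, treats the file extension as the "file type", and mines only pairwise rules with support and confidence; the names `file_types` and `pair_rules`, the threshold, and the sample commits are hypothetical and not taken from the paper:

```python
from collections import Counter
from itertools import combinations
from pathlib import PurePosixPath


def file_types(paths):
    """Return the set of lowercase file extensions touched by one commit."""
    return {PurePosixPath(p).suffix.lower() or "<none>" for p in paths}


def pair_rules(commits, min_support=0.3):
    """Mine rules A -> B between file types that change in the same commit.

    Returns (antecedent, consequent, support, confidence) tuples, where
    support = share of commits containing both types and
    confidence = share of commits with A that also contain B.
    """
    transactions = [file_types(c) for c in commits]
    n = len(transactions)
    item_counts = Counter(t for tx in transactions for t in tx)
    pair_counts = Counter(
        pair for tx in transactions for pair in combinations(sorted(tx), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Highest confidence first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Made-up commits for illustration only.
commits = [
    ["package.json", "package-lock.json"],
    ["index.html", "app.js"],
    ["data/stats.json", "index.html"],
    ["README.md"],
]
rules = pair_rules(commits, min_support=0.25)
```

On real data a dedicated implementation (for example Apriori with larger itemsets) would be a better fit; the pairwise version is only meant to show what "types of files modified together" means in terms of support and confidence.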
-
Gong, Qingyuan, et al. "Detecting malicious accounts in online developer communities using deep learning." Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2019. (PDF)
-
Liao, Zhifang, et al. "BDGOA: A bot detection approach for GitHub OAuth Apps." Intelligent and Converged Networks (2023) (PDF) 【Adds a few more features on top of BotHunter】
-
R. He, H. He, Y. Zhang and M. Zhou, "Automating Dependency Updates in Practice: An Exploratory Study on GitHub Dependabot," in IEEE Transactions on Software Engineering, vol. 49, no. 8, pp. 4004-4022, Aug. 2023, doi: 10.1109/TSE.2023.3278129. (PDF) 【An in-depth analysis of Dependabot】
-
A. Ghorbani et al., "Autonomy Is An Acquired Taste: Exploring Developer Preferences for GitHub Bots," 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), Melbourne, Australia, 2023, pp. 1405-1417, doi: 10.1109/ICSE48619.2023.00123. (PDF) 【This paper examines the factors influencing developer perceptions of GitHub bots】
-
Chidambaram, Natarajan, Alexandre Decan, and Tom Mens. "Distinguishing Bots From Human Developers Based on Their GitHub Activity Types." Seminar on Advanced Techniques and Tools for Software Evolution (SATToSE). CEUR. 2023. (PDF) 【This paper proposes features, based on GitHub activity types, for distinguishing bots from human developers】