AI software engineer - chunhualiao/public-docs GitHub Wiki

Claude Code


Common challenges: keep in mind the issues encountered with CompilerGPT, as well as the key steps in software engineering.

Benchmarks and leaderboards

https://liveswebench.ai/

SWE-bench

https://www.swebench.com/

  • focus on the Verified Leaderboard

Challenges

AI software engineer:challenge

  • for existing code bases: any changes involve several distinct tasks
    • find relevant code portions: search from issue descriptions or requirements
    • decide what changes to apply: generating patches
    • ensure correctness of the changes
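
The three tasks above can be sketched as a toy pipeline. This is only an illustration under strong assumptions: the in-memory `files` dictionary, the keyword-overlap ranking, the hand-written fix standing in for a model-generated patch, and the helper names (`rank_files`, `make_patch`, `passes_checks`) are all hypothetical and are not taken from any of the systems listed on this page.

```python
import difflib
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_files(issue_text, files):
    """Task 1: rank files by how many tokens they share with the issue (naive localization)."""
    issue_tokens = tokenize(issue_text)
    scores = {path: len(issue_tokens & tokenize(src)) for path, src in files.items()}
    # Highest score first; ties are broken alphabetically by path.
    return sorted(scores, key=lambda p: (-scores[p], p))


def make_patch(path, old_src, new_src):
    """Task 2: express a proposed change as a unified diff."""
    return "".join(difflib.unified_diff(
        old_src.splitlines(keepends=True),
        new_src.splitlines(keepends=True),
        fromfile=f"a/{path}", tofile=f"b/{path}"))


def passes_checks(src, checks):
    """Task 3: execute the candidate source and run each check against its namespace."""
    namespace = {}
    try:
        exec(src, namespace)
        return all(check(namespace) for check in checks)
    except Exception:
        return False


# Hypothetical two-file code base with a bug in mean().
files = {
    "mathutil.py": "def mean(values):\n    return sum(values) / (len(values) - 1)\n",
    "strutil.py": "def shout(text):\n    return text.upper()\n",
}
issue = "mean() returns the wrong average for a list of values"

target = rank_files(issue, files)[0]
# A hand-written fix stands in for the patch an LLM would propose.
fixed = files[target].replace("(len(values) - 1)", "len(values)")
patch = make_patch(target, files[target], fixed)
checks = [lambda ns: ns["mean"]([1, 2, 3]) == 2]

print(target)
print(patch)
print(passes_checks(files[target], checks), passes_checks(fixed, checks))
```

In a real system each step is much harder: localization usually needs code search or embeddings rather than token overlap, and correctness is usually judged by running the project's own test suite in a sandbox.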

Solutions

  • first, narrow down the scope by focusing on one or two key tasks from the full software engineering task list, such as doc generation, bug fixes, or feature additions
  • then, within a single task, narrow down the scope further

Example systems

OpenHands + CodeAct v2.1 (claude-3-5-sonnet-20241022)

AutoCodeRover

https://arxiv.org/pdf/2411.01114

OpenDevin: https://github.com/OpenDevin/OpenDevin

SWE-bench

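On March 13, Microsoft submitted a paper on AutoDev, a framework for fully automated software development, which also introduced the concept of multi-agent collaboration.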

AutoCoder is a command-line version of Devin, first released on March 18 by the team of Chinese developer Zhu Weilian (祝威廉).

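Devika is a proactive AI software engineer built on Claude 3. It can understand high-level human instructions, break them down into concrete steps, gather the information it needs, and write code accordingly to accomplish the stated goal.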

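On April 3, the NLP team at Princeton University released SWE-agent, an open-source AI programmer that uses the GPT-4 model to automatically resolve issues in GitHub repositories.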