DeepCoder
Important Facts
- Available with 1.5 or 14 billion parameters; the 14B variant is likely too heavy for most local setups, while the 1.5B variant should run on almost any hardware (see the quick-start sketch below)
- Trained with focus on coding
- Results look less promising than those of e.g. qwen2.5-coder, but it is fast
- MIT License
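As a quick start, here is a minimal sketch of querying DeepCoder locally, assuming the model is served via Ollama (as with the other models evaluated in this project) and accessed through the Python `ollama` client. The tag names `deepcoder:1.5b` and `deepcoder:14b` are assumptions; check them against your local model registry.

```python
# Minimal sketch: querying a locally served DeepCoder model via Ollama.
# Assumes Ollama is running and the model has been pulled beforehand,
# e.g. with `ollama pull deepcoder:1.5b` (tag name is an assumption).
import ollama

response = ollama.chat(
    model="deepcoder:1.5b",  # assumed tag; "deepcoder:14b" for the larger variant
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that checks whether a string is a palindrome.",
        }
    ],
)
print(response["message"]["content"])
```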
DeepCoder is a code reasoning model developed through a collaboration between Agentica and Together AI. It is finetuned from DeepSeek-R1-Distilled-Qwen-14B using distributed reinforcement learning (RL) methods, and is released as a preview model for evaluation and research purposes.
🔍 Overview
DeepCoder is based on a 14-billion parameter architecture and focuses on improving code understanding and reasoning tasks. It reaches 60.6% Pass@1 accuracy on LiveCodeBench, an 8% gain over its base model performance.
Its results are comparable to those of models like o3-mini-2025-01-31 (Low) and o1-2024-12-17, even though DeepCoder has far fewer parameters.
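For reference on how a Pass@1 score is computed: the standard unbiased pass@k estimator generates n samples per problem, counts the c correct ones, and averages 1 - C(n-c, k)/C(n, k) over all problems. A small sketch with hypothetical per-problem counts (not the actual LiveCodeBench harness):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn from n generations (c of them correct) passes."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical per-problem results: (samples generated, samples correct)
results = [(16, 10), (16, 0), (16, 4)]
score = sum(pass_at_k(n, c, k=1) for n, c in results) / len(results)
print(f"Pass@1 ~ {score:.3f}")  # ~0.292 for these made-up counts
```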
🔧 Key Features
- **Code Reasoning Improvements:** Designed to handle multi-step programming tasks, algorithmic challenges, and more complex code generation cases.
- **LiveCodeBench Performance:** Achieves 60.6% Pass@1 on LiveCodeBench, a notable improvement over baseline 14B models.
- **Finetuning Strategy:** Built on DeepSeek-R1-Distilled-Qwen-14B and finetuned via distributed RL, focusing on better alignment with coding tasks.
- **Resource Considerations:** With 14 billion parameters, the model targets a middle ground, aiming to deliver higher reasoning ability without the higher resource demands of 30B+ models (see the memory sketch after this list).
- **Preview Status:** Currently released as a preview, with potential for further refinements based on broader testing and feedback.
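To make the resource point concrete, a back-of-the-envelope estimate of weight memory is simply parameter count times bytes per parameter; real usage adds KV cache, activations, and runtime overhead on top. A rough sketch:

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Rough lower bound on weight memory: parameters x bytes per parameter.
    Ignores KV cache, activations, and runtime overhead."""
    return params_billion * 1e9 * (bits_per_param / 8) / 1e9

for params in (1.5, 14):
    for bits in (16, 4):  # fp16 weights vs. 4-bit quantization
        print(f"{params}B @ {bits}-bit: ~{weight_memory_gb(params, bits):.1f} GB")
# 1.5B: ~3.0 GB (fp16) / ~0.8 GB (4-bit); 14B: ~28.0 GB (fp16) / ~7.0 GB (4-bit)
```

This is why the 1.5B variant runs comfortably on ordinary hardware, while the 14B variant generally wants a quantized build or a large GPU.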
🧠 Architecture
DeepCoder uses a standard transformer-based architecture, initially optimized via distillation and later refined using reinforcement learning techniques.
The focus during training was on improving code reasoning chains rather than purely enhancing token prediction accuracy.
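Because DeepCoder is distilled from DeepSeek-R1, its raw output typically wraps the reasoning chain in `<think>...</think>` tags before the final answer. That format is an assumption carried over from the base model and should be verified against actual outputs; assuming it holds, a minimal sketch for separating reasoning from answer could look like this:

```python
import re

def split_reasoning(output: str) -> tuple[str, str]:
    """Split an R1-style response into (reasoning, answer).
    Assumes the chain of thought is wrapped in <think>...</think> tags."""
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match is None:
        return "", output.strip()  # no reasoning block found
    reasoning = match.group(1).strip()
    answer = output[match.end():].strip()
    return reasoning, answer

raw = "<think>Check edge cases first...</think>\ndef is_palindrome(s): ..."
thoughts, answer = split_reasoning(raw)
print(answer)  # only the final code, without the reasoning chain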