# Phi‑4 Reasoning
Phi‑4 Reasoning is an MIT‑licensed, 14‑billion‑parameter, decoder‑only Transformer model from Microsoft Research, fine‑tuned for complex, multi‑step reasoning across domains such as math, science, coding, planning, and spatial tasks. It outputs explicit chain‑of‑thought reasoning followed by a concise answer.
- Base model: Built on Phi‑4, a 14B dense Transformer pre‑trained with a data‑centric focus on high‑quality synthetic and organic datasets.
- Phi‑4 Reasoning: Fine‑tuned via supervised learning using "teachable" prompts and reasoning chains.
- Phi‑4 Reasoning‑Plus: Further enhanced with reinforcement learning via Group Relative Policy Optimization (GRPO), optimized for longer reasoning traces and higher accuracy.
- Context window: Supports up to 32K tokens, allowing in-depth problem exploration.
- Logic markers: Uses `<think> … </think>` blocks to structure reasoning steps, followed by the final answer; a parsing sketch appears at the end of this page.
- Supervised fine-tuning:
  - Trained on ~1.4 million high‑quality reasoning examples.
  - Balanced curriculum targeting borderline‑solvable prompts.
  - Adjusted hyperparameters: small batch sizes, moderate learning rates, rotary position embeddings.
- Reinforcement learning (Plus variant):
  - Added an RL phase using ~6,400 math‑focused problems.
  - Employed a GRPO reward: +1 for a correct answer, –0.5 for an incorrect one, with penalties for hallucinations (a sketch follows this list).
  - Yields ~50% longer outputs and a ~15% AIME performance boost.
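
For illustration, here is a minimal Python sketch of the reward scheme described above. The function name, the exact-match grading, and the size of the hallucination penalty are assumptions made for this example, not the published training setup.

```python
def grpo_reward(answer: str, reference: str, hallucinated: bool) -> float:
    """Toy reward in the spirit of the GRPO scheme described above.

    Assumptions (not from the paper): exact-match grading and a flat
    -0.3 hallucination penalty stacked on top of the base reward.
    """
    base = 1.0 if answer.strip() == reference.strip() else -0.5
    penalty = -0.3 if hallucinated else 0.0
    return base + penalty


# A correct answer with no hallucinated content scores +1.0;
# an incorrect, hallucinated one scores -0.8.
print(grpo_reward("42", "42", hallucinated=False))  # 1.0
print(grpo_reward("41", "42", hallucinated=True))   # -0.8
```
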
| Task / Benchmark | Phi‑4 | Phi‑4 Reasoning | Phi‑4 Reasoning‑Plus |
| --- | --- | --- | --- |
| AIME 25 | 63.1 % | 78.0 % | 82.5 % |
| HMMT Feb 25 | 43.8 % | 53.6 % | 67.5 % |
| OmniMath | 76.6 % | 81.9 % | 85.0 % |
| GPQA | 67.1 % | 69.3 % | 77.7 % |
| LiveCodeBench | 53.8 % | 53.1 % | 68.8 % |
- Transparency: Chain‑of‑thought outputs enhance interpretability and debuggability.
- Efficiency: Combines high performance with a manageable compute and memory footprint at 14B parameters.
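
Because the model wraps its reasoning in `<think>` tags, downstream tooling can separate the trace from the final answer. The sketch below assumes a local Ollama server with a phi4-reasoning model already pulled; the model tag, endpoint URL, and prompt are assumptions about the local setup, not part of the model release.

```python
import re

import requests  # assumes the requests package is installed

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint
MODEL = "phi4-reasoning"  # assumed local model tag; adjust to your pull


def ask(prompt: str) -> tuple[str, str]:
    """Send a prompt and return (reasoning, answer), split on the <think> block."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "prompt": prompt, "stream": False},
        timeout=600,  # reasoning traces can take a while to generate
    )
    resp.raise_for_status()
    text = resp.json()["response"]
    # Extract the chain-of-thought block; everything after it is the answer.
    match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    answer = text[match.end():].strip() if match else text.strip()
    return reasoning, answer


if __name__ == "__main__":
    reasoning, answer = ask("How many prime numbers are below 20?")
    print("REASONING:\n", reasoning)
    print("ANSWER:\n", answer)
```

Splitting on the `<think>` block this way lets test tooling log the full trace for debugging while asserting only against the concise final answer.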