AI coding - ProkopHapala/FireCore GitHub Wiki
Update June 2025
- Continue.dev: no longer kept up to date (support for 3rd-party models lags behind) => uninstalled
- Cody offers free Claude 4 Sonnet
AI model Comparisons / Leaderboards
Chat platforms
Free API providers (aider key setup: see the sketch at the end of this list)
- Google Vertex
- $300 free credit for 3 months ($150 without a credit card)
- Google Studio
- Gemini 1.5_002 Pro / Flash with generous limits. Flash is super fast, Pro is quite good. Super big context window (1M tok)
- Gemini 1.5_002 Pro : 2 req/min, 50 req/day, 32,000 tok/min
- Gemini 1.5_002 Flash : 15 req/min, 1,500 req/day, 1,000,000 tok/min
- Mistral
- AICodeKing: Mistral FREE API : This is the BEST FREE WAY to do AI CODING
- Codestral API (unlimited rate limit)
- Mistral Large 2 free API key (very generous rate limits)
- mistral-large-2407 : 1 req/s, 500,000 tok/min, 1,000,000,000 tok/month
- GitHub marketplace
- GPT-4o and GPT-4o-mini seem to be free at the moment
- hyperbolic
- $10 free credit
- cerebras
- Llama3.1-8B ( 2204 T/s )
- Llama3.1-70B ( 2525 T/s )
- sambanova
- Llama-3.1-8B ( 1111.59 T/s )
- Llama-3.1-70B ( 531.56 T/s )
- Llama-3.1-405B ( 124.95 T/s )
- Llama-3.2-90B-Vision ( 606.06 T/s )
- groq
- deepseek-r1-distill-llama-70b ( 279 T/s )
- llama3-groq-70b-8192-tool-use-preview ( 313 T/s )
- llama-3.1-70B-versatile ( 250 T/s )
- llama-3.2-90b-vision-preview ( 202 T/s )
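To use the free providers above from aider, the corresponding API keys have to be exported. A minimal sketch, assuming the LiteLLM-style variable names that aider reads (double-check the exact names in the aider docs for your version):

```bash
# Minimal key setup for the free providers listed above (variable names assumed to
# follow the LiteLLM conventions that aider uses -- verify in the aider/LiteLLM docs).
export GEMINI_API_KEY="..."      # Google AI Studio (gemini/... models)
export MISTRAL_API_KEY="..."     # Mistral La Plateforme (mistral/... models)
export CODESTRAL_API_KEY="..."   # Codestral endpoint (codestral/... models)
export GITHUB_API_KEY="..."      # GitHub Models marketplace token (github/... models)
export CEREBRAS_API_KEY="..."    # Cerebras (cerebras/... models)
export GROQ_API_KEY="..."        # Groq (groq/... models)

# Quick sanity check of one key against Groq's OpenAI-compatible endpoint:
curl -s https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-70b-versatile", "messages": [{"role": "user", "content": "hello"}]}'
```

The curl call is just a quick way to confirm a key works before wiring it into aider.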
Good Local Models
- Codestral 25
- Phi-4 14B
- qwq 32B
- qwen2.5-coder (0.5B, 1.5B, 3B, 7B, 14B, 32B)
- ollama/qwen2.5-coder
- aider leaderboards : Qwen2.5-Coder-32B-Instruct beats gpt-4o-2024-05-13 and DeepSeek V2.5, just behind claude-3.5-sonnet & Haiku
- OpenCoder - Best 8B coding model (November 2024)
ollama (usage sketch after this list)
- Dolphin3.0-R1-Mistral-24B
- deepseek-r1
- 8b, 14b, 32b, 70b
- phi4
- qwq
- deepscaler - A fine-tuned version of Deepseek-R1-Distilled-Qwen-1.5B that surpasses the performance of OpenAI’s o1-preview with just 1.5B parameters on popular math evaluations.
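A minimal sketch of pulling one of the ollama models above and pointing aider at the local server; the model tag `qwen2.5-coder:14b` and the default port 11434 are assumptions, adjust to your install:

```bash
# Pull and serve a local coding model with ollama, then use it from aider.
ollama pull qwen2.5-coder:14b    # any of the sizes listed above works
ollama serve &                   # skip if the ollama service is already running

# aider talks to the local ollama server via this base URL (default port shown).
export OLLAMA_API_BASE=http://127.0.0.1:11434
aider --model ollama/qwen2.5-coder:14b
```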
Cost Effective Models
| model | provider | context [tok] | price in/out [$/Mtok] | speed [tok/s] | latency [s] |
|---|---|---|---|---|---|
| qwen-2.5-coder-32b-instruct | DeepInfra | 32,768 | 0.18 / 0.18 | 62.91 | 0.42 |
| DeepSeek-2.5-236b | DeepSeek | 8,192 | 0.14 / 0.28 | 15.49 | 1.10 |

(A minimal API-call sketch against one of these endpoints follows at the end of this section.)
- DeepSeek thinking model (like OpenAI's o1)
- MiniMax-Text-01
- [AICodeKing: MiniMax-01: This OPENSOURCE Model HAS LONGEST 4M CONTEXT & BEATS OTHERS!](https://www.youtube.com/watch?v=NKnRPykTIJs)
- MiniMax-Text-01 is a powerful language model with 456 billion total parameters, of which 45.9 billion are activated per token. To better unlock the long context capabilities of the model, MiniMax-Text-01 adopts a hybrid architecture that combines Lightning Attention, Softmax Attention and Mixture-of-Experts (MoE).
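As noted after the table above, a minimal sanity-check call to one of these cost-effective hosted models; DeepSeek's OpenAI-compatible API is sketched here (endpoint and model name as DeepSeek currently documents them, adjust if they change):

```bash
# Minimal OpenAI-compatible request to DeepSeek (cheap hosted model from the table above).
export DEEPSEEK_API_KEY="..."
curl -s https://api.deepseek.com/chat/completions \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepseek-chat",
        "messages": [{"role": "user", "content": "Write a C function that reverses a string."}]
      }'
```

At the 0.14/0.28 $/Mtok pricing from the table, a 10,000-token prompt plus a 2,000-token reply comes to roughly 10,000·0.14/10⁶ + 2,000·0.28/10⁶ ≈ $0.002 per request.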
Coding Tools / Agents
- Aider: blog
- VS Code Extensions
- AI Toolkit for VS Code
- Cody
- https://www.continue.dev/
- Cline (Claude-dev)
- SuperMaven - code completion
- Editors
Aider Architect (more)
- Aider.chat
- Reasonable Architect settings
- Architect
- o1-preview ( best scientific reasoning, good coding )
- Claude-3.5-Sonnet ( good scientific reasoning, best coding, sticks to format )
- Gemini 1.5 Pro ( Good scientific reasoning, Good Context window )
- Coder - Aider-coder : google-doc
- GPT-4o via GitHub (Azure) - Fast, good reasoning and coding (almost as good as Sonnet)
- GPT-4o-mini
- DeepSeek ( good coding, but slow 16 T/s :-( )
- Gemini 1.5 Flash ( Fast, Decent coding )
- Mistral Large ( decent coding and reasoning, almost free at low rate )
- Grok beta ( $25 free credit )
- Free Model Options
- Fully Free (but with limits)
aider --model github/gpt-4o
aider --model gemini/gemini-1.5-pro-002 --map-tokens 2048
aider --model gemini/gemini-1.5-flash-002 --map-tokens 2048
aider --model mistral/mistral-large-latest --map-tokens 2048
aider --model codestral/codestral-latest --map-tokens 2048
aider --model cerebras/llama3.1-70b
- cheap (but not fully free)
aider --model openrouter/qwen/qwen-2.5-coder-32b-instruct
aider --model deepseek/deepseek-coder
- Aider: usage
- Architect Model Options (a small wrapper sketch follows the command lists in this section)
- Separating code reasoning and editing
- Claude 3.5 Sonnet
aider --architect --sonnet --editor-model deepseek/deepseek-coder
aider --architect --sonnet --editor-model openrouter/qwen/qwen-2.5-coder-32b-instruct
aider --architect --sonnet --editor-model github/gpt-4o
aider --architect --sonnet --editor-model github/gpt-4o-mini
aider --architect --sonnet --editor-model mistral/mistral-large-latest
aider --architect --sonnet --editor-model codestral/codestral-latest
aider --architect --sonnet --editor-model gemini/gemini-1.5-flash-002
- GPT-4o
aider --architect --model github/gpt-4o --editor-model github/gpt-4o-mini
aider --architect --model github/gpt-4o --editor-model mistral/mistral-large-latest
aider --architect --model github/gpt-4o --editor-model openrouter/qwen/qwen-2.5-coder-32b-instruct
aider --architect --model github/gpt-4o --editor-model deepseek/deepseek-coder
aider --architect --model github/gpt-4o --editor-model gemini/gemini-1.5-flash-002
aider --architect --model github/gpt-4o --editor-model cerebras/llama3.1-70b
- Gemini 1.5-pro-002
aider --architect --model gemini/gemini-1.5-pro-002 --editor-model gemini/gemini-1.5-flash-002
aider --architect --model gemini/gemini-1.5-pro-002 --editor-model github/gpt-4o-mini
aider --architect --model gemini/gemini-1.5-pro-002 --editor-model openrouter/qwen/qwen-2.5-coder-32b-instruct
aider --architect --model gemini/gemini-1.5-pro-002 --editor-model deepseek/deepseek-coder
aider --architect --model gemini/gemini-1.5-pro-002 --editor-model mistral/mistral-large-latest
aider --architect --model gemini/gemini-1.5-pro-002 --editor-model codestral/codestral-latest
aider --architect --model gemini/gemini-1.5-pro-002 --editor-model cerebras/llama3.1-70b
- Gemini exp-1206
aider --model gemini/gemini-exp-1206
aider --architect --model gemini/gemini-exp-1206 --editor-model github/gpt-4o-mini
aider --architect --model gemini/gemini-exp-1206 --editor-model openrouter/qwen/qwen-2.5-coder-32b-instruct
- Gemini 2.0 flash
aider --model gemini/gemini-2.0-flash-exp
- Qwen QwQ
aider --model openrouter/qwen/qwq-32b-preview --editor-model openrouter/qwen/qwen-2.5-coder-32b-instruct --editor-edit-format editor-whole
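To avoid retyping a favourite architect/editor pair from the lists above, one option is a small shell wrapper. A sketch with a hypothetical helper name `aider_arch`, using one of the free combinations shown earlier; it assumes the relevant API keys are already exported (see the key setup near the top of the page):

```bash
# Hypothetical wrapper around one of the architect/editor combinations listed above.
# Assumes GEMINI_API_KEY and GITHUB_API_KEY are already exported.
aider_arch() {
    aider --architect \
          --model gemini/gemini-1.5-pro-002 \
          --editor-model github/gpt-4o-mini \
          --map-tokens 2048 \
          "$@"                 # pass through files and any extra aider flags
}

# usage:
#   aider_arch src/main.py tests/test_main.py
```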
Youtube
- AICodeKing: 21 FREE AI Coding Tools THAT I USE
- AICodeKing: Mistral FREE API : This is the BEST FREE WAY to do AI CODING
- AICodeKing: Github Models FREE API : AI Coding with GPT-4O for FULLY FREE
Context Building
RAG for Coding
- Mistral RAG docs
- generative-ai/code_rag.ipynb : Code RAG - Reuse your already created codebase to generate more code
- Improving Retrieval Performance in RAG Pipelines with Hybrid Search
- Implementing RAG in Refact.ai AI Coding Assistant
- Hybrid search: RAG for real-life production-grade applications
- CodeRAG-Bench: Can Retrieval Augment Code Generation?
- Exploring the Combination of Full-Text Index with Cohere’s Reranker for RAG over a Knowledge Graph.
- Code Generation using Retrieval Augmented Generation + LangChain
- Building Hybrid Search Platforms: Combining Vector and Full-Text Search in RAG Pipelines
- https://python.langchain.com/docs/tutorials/local_rag/
- Advanced RAG techniques part 2: Querying and testing
- youtube : RAG for Code Generation, an AI Hacker Cup example
- youtube : Local GraphRAG with LLaMa 3.1 - LangChain, Ollama & Neo4j