Meta prompting

AlphaEvolve evolves its own instructions to LLMs.

Let's walk through how meta-prompting works in AlphaEvolve, using a concrete example: optimizing an image classification model written in JAX (as shown in Figure 3 of the paper).


🧩 What Is Meta-Prompting?

Meta-prompting in AlphaEvolve means:

Evolving not just solutions (like code), but also the prompts that are used to generate those solutions.

In other words, AlphaEvolve learns to improve the way it asks LLMs for code changes, just as it evolves the code itself.


🧪 Concrete Example: Evolving a CNN in JAX

Let's say a user is trying to improve a ConvNet architecture for image classification. The user has:

  • An initial ConvNet using 3 ResNet blocks
  • An evaluate function scoring accuracy and log loss
  • Regions of the code marked for evolution with # EVOLVE-BLOCK-START / # EVOLVE-BLOCK-END comments (see the sketch below)
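
To make this setup concrete, here is a minimal sketch of what such a user-supplied file might look like, assuming Haiku on top of JAX. The block widths, the 10-class output, and the exact metrics returned by `evaluate` are illustrative assumptions, not details from the paper.

```python
import haiku as hk
import jax
import jax.numpy as jnp


class ResNetBlock(hk.Module):
    """Minimal residual block: two 3x3 convs plus a projected skip connection."""

    def __init__(self, channels, name=None):
        super().__init__(name=name)
        self._channels = channels

    def __call__(self, x):
        h = jax.nn.relu(hk.Conv2D(self._channels, kernel_shape=3)(x))
        h = hk.Conv2D(self._channels, kernel_shape=3)(h)
        skip = hk.Conv2D(self._channels, kernel_shape=1)(x)
        return jax.nn.relu(h + skip)


# EVOLVE-BLOCK-START
class ConvNet(hk.Module):
    """Initial architecture: three residual blocks, then a linear classifier."""

    def __call__(self, x):
        x = ResNetBlock(32)(x)
        x = ResNetBlock(64)(x)
        x = ResNetBlock(128)(x)
        x = jnp.mean(x, axis=(1, 2))   # global average pooling
        return hk.Linear(10)(x)        # 10-way logits (e.g. CIFAR-10)
# EVOLVE-BLOCK-END


net = hk.without_apply_rng(hk.transform(lambda x: ConvNet()(x)))


def evaluate(params, images, labels):
    """Scores a candidate program; AlphaEvolve reads back these metrics."""
    logits = net.apply(params, images)
    log_probs = jax.nn.log_softmax(logits)
    log_loss = -jnp.mean(log_probs[jnp.arange(labels.shape[0]), labels])
    accuracy = jnp.mean(jnp.argmax(logits, axis=-1) == labels)
    return {"accuracy": float(accuracy), "log_loss": float(log_loss)}
```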

🔁 Step-by-Step: How Meta-Prompting Works


Step 1: AlphaEvolve builds a prompt template

The base prompt to the LLM might look like:

Act as an expert software developer. Your task is to iteratively improve the provided codebase.

Here is the current program (accuracy: 0.862, log-loss: 0.387):

class ConvNet(hk.Module):
    def __init__(...):
        self._block1 = ResNetBlock(...)
        self._block2 = ResNetBlock(...)
        self._block3 = ResNetBlock(...)
    ...

Suggest a modification that improves performance.

Use the following diff format:
<<<<<<< SEARCH
(original code)
=======
(modified code)
>>>>>>> REPLACE
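
Assembling this prompt and applying the LLM's reply can be pictured as plain string templating plus SEARCH/REPLACE substitution. The sketch below is a simplified guess at that plumbing, not AlphaEvolve's actual implementation; `build_prompt` and `apply_search_replace` are hypothetical helpers.

```python
import re

PROMPT_TEMPLATE = """Act as an expert software developer. Your task is to \
iteratively improve the provided codebase.

Here is the current program (accuracy: {accuracy:.3f}, log-loss: {log_loss:.3f}):

{current_code}

Suggest a modification that improves performance.

Use the following diff format:
<<<<<<< SEARCH
(original code)
=======
(modified code)
>>>>>>> REPLACE
"""


def build_prompt(current_code, metrics):
    """Fills the template with the current program and its latest scores."""
    return PROMPT_TEMPLATE.format(
        accuracy=metrics["accuracy"],
        log_loss=metrics["log_loss"],
        current_code=current_code,
    )


DIFF_RE = re.compile(
    r"<<<<<<< SEARCH\n(.*?)\n=======\n(.*?)\n>>>>>>> REPLACE", re.DOTALL)


def apply_search_replace(code, llm_reply):
    """Applies every SEARCH/REPLACE block found in the LLM's reply to the code."""
    for original, modified in DIFF_RE.findall(llm_reply):
        code = code.replace(original, modified)
    return code
```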

Step 2: AlphaEvolve uses an LLM to generate this prompt

AlphaEvolve asks another LLM (or another call to the same model) to generate or improve the prompt template above. For example:

“Write a prompt for a coding LLM that will help it generate a better version of the current ConvNet architecture to improve accuracy and reduce overfitting.”

The LLM might output:

As a deep learning optimization expert, your goal is to enhance model generalization and learning dynamics. Consider adding layers, changing activation functions, modifying optimizer settings, or tuning regularization techniques. Reflect on recent successful architectures like EfficientNet or DenseNet.

Use diff format. Ensure changes are consistent.

This becomes the meta-prompt, used in future iterations.
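
In code, this step amounts to one more LLM call whose output is stored as a candidate meta-prompt. A minimal sketch, where `llm_generate` stands in for whatever text-generation call is available (the helper name and request wording are assumptions):

```python
META_PROMPT_REQUEST = (
    "Write a prompt for a coding LLM that will help it generate a better "
    "version of the current ConvNet architecture to improve accuracy and "
    "reduce overfitting.\n\nCurrent prompt:\n{current_prompt}"
)


def propose_meta_prompt(llm_generate, current_prompt):
    """Asks an LLM to rewrite the instructions that will later be sent to the
    coding LLM. `llm_generate` is any callable mapping a string to a string."""
    request = META_PROMPT_REQUEST.format(current_prompt=current_prompt)
    return llm_generate(request)
```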


Step 3: Store and evolve meta-prompts

AlphaEvolve keeps a meta-prompt database, just like it stores programs.

Each time a new meta-prompt leads to a better solution, it gets scored and stored. Over time, AlphaEvolve learns which types of prompts help the LLM generate better diffs.
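
A simple way to picture the scoring side: after each run, the score of the resulting program is credited back to the meta-prompt that produced it. This bookkeeping sketch is illustrative, not the paper's actual mechanism; here `db` maps prompt IDs to small dicts of scores seen so far.

```python
def credit_meta_prompt(db, prompt_id, run_score):
    """Attributes a finished code-evolution run's score back to the meta-prompt
    that produced it, so future sampling can favour effective prompts."""
    entry = db[prompt_id]                      # e.g. {"scores": [0.85, 0.86]}
    entry["scores"].append(run_score)
    entry["mean_score"] = sum(entry["scores"]) / len(entry["scores"])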


Step 4: Use meta-prompts in the main loop

Now, when AlphaEvolve is generating new solutions, it can:

  • Select a high-performing meta-prompt from the database
  • Combine it with context (previous programs, scores, code blocks)
  • Use this composite prompt to query the LLM for new diffs (a sketch of this assembly appears below)
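
Here is one way the composite prompt might be assembled each generation, assuming meta-prompts carry the mean_score from the bookkeeping sketch above; the weighting scheme and field names are illustrative assumptions.

```python
import random


def compose_prompt(meta_prompts, parent_programs, rng=random):
    """Builds one composite prompt: a sampled meta-prompt plus program context.

    meta_prompts: list of dicts with 'prompt_text' and 'mean_score'.
    parent_programs: list of (code, metrics) pairs drawn from the program database.
    """
    # Favour meta-prompts that have produced well-scoring programs so far.
    weights = [max(p["mean_score"], 1e-6) for p in meta_prompts]
    chosen = rng.choices(meta_prompts, weights=weights, k=1)[0]

    context = "\n\n".join(
        f"# accuracy={m['accuracy']:.3f}, log_loss={m['log_loss']:.3f}\n{code}"
        for code, m in parent_programs
    )
    return f"{chosen['prompt_text']}\n\nPrevious programs and their scores:\n\n{context}"
```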

Example of an Evolved Prompt (vs. Initial)

| Prompt Type | Prompt Snippet |
|---|---|
| Initial | “Act as an expert software developer. Improve the code below.” |
| Evolved | “You are optimizing a CNN for CIFAR-10. Consider adding dropout, using GELU instead of ReLU, and switching to AdamW with weight decay. Target improvements in top-1 accuracy and stability over 5 epochs.” |

Step 5: Co-Evolution of Prompts and Programs

This creates a feedback loop:

  1. Better meta-prompts → better code modifications
  2. Better code modifications → higher evaluation scores
  3. Higher scores → better feedback on which meta-prompts are effective

Over time, AlphaEvolve co-evolves both the solutions and the strategies for producing them.


🧠 Why Meta-Prompting Matters

  • It reduces reliance on human-crafted prompts.
  • It adapts to task-specific best practices, e.g., tuning CNNs vs. optimizing hardware circuits.
  • It leverages the LLM’s own reasoning ability to bootstrap better instructions.

This is like having a researcher who not only proposes ideas but also learns how to ask better questions over time.


The next sections go deeper into how AlphaEvolve maintains and evolves its meta-prompt database, and what impact meta-prompting has in the paper's ablation studies.


🧱 How the Meta-Prompt Evolution Database Works

📦 1. Storage Format

Each meta-prompt entry includes:

| Field | Description |
|---|---|
| prompt_text | Full text of the meta-prompt (instructions to the LLM) |
| origin | How it was generated (e.g., human-authored seed, LLM-generated from scratch, or a mutation of a previous prompt) |
| associated_runs | List of code-evolution runs that used this meta-prompt |
| score_stats | Evaluation metrics from those runs (e.g., accuracy, loss, runtime) |
| last_used | Timestamp of last use (for freshness tracking) |
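
As a concrete, purely illustrative reading of this schema, each record could be a small dataclass:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetaPromptEntry:
    """One record in the meta-prompt database. Field names mirror the table
    above; the concrete types and defaults are illustrative assumptions."""
    prompt_text: str                                          # instructions sent to the LLM
    origin: str                                               # "seed", "llm_generated", or "mutation"
    associated_runs: List[str] = field(default_factory=list)  # identifiers of runs that used it
    score_stats: dict = field(default_factory=dict)           # e.g. {"mean": 0.86, "best": 0.89}
    last_used: float = 0.0                                    # timestamp for freshness tracking
```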

🔄 2. Evolution Process

  • New meta-prompts are generated by mutating or combining high-scoring prompts.

  • AlphaEvolve might ask an LLM something like:

    “Given that this prompt helped improve matrix multiplication efficiency, how would you rephrase it to be more specific to sparsity and numerical stability?”

  • Fitness of a meta-prompt is based on the average or peak performance of code it helped generate.

  • Periodically, AlphaEvolve prunes stale or ineffective prompts (see the maintenance sketch below).
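
A minimal sketch of that maintenance pass, reusing the MetaPromptEntry record sketched above; the staleness window, fitness threshold, and mutation wording are assumptions, not details from the paper.

```python
import time


def prune_and_mutate(db, llm_generate, max_age=7 * 24 * 3600, min_mean=0.0):
    """One maintenance pass: drop stale or low-fitness meta-prompts, then ask the
    LLM to specialise the current best one into a new candidate."""
    now = time.time()
    survivors = [
        e for e in db
        if now - e.last_used < max_age and e.score_stats.get("mean", 0.0) >= min_mean
    ]
    if not survivors:
        return survivors

    # Mutate the current best prompt by asking the LLM to rephrase it.
    best = max(survivors, key=lambda e: e.score_stats.get("mean", 0.0))
    mutated_text = llm_generate(
        "Given that this prompt helped improve the target metric, rephrase it to "
        "be more specific to the current task:\n\n" + best.prompt_text
    )
    survivors.append(
        MetaPromptEntry(prompt_text=mutated_text, origin="mutation", last_used=now)
    )
    return survivors
```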


📊 Results: Impact of Meta-Prompting (from Ablation)

In Figure 8 of the paper, AlphaEvolve is tested with and without meta-prompting on two benchmark tasks:

🔁 Task 1: Matrix Multiplication (Tensor Decomposition)

Target: minimize the number of scalar multiplications.
Result: meta-prompting consistently helps AlphaEvolve reach lower-rank decompositions faster, especially in early generations.

| Variant | Performance (lower is better) |
|---|---|
| Full AlphaEvolve | 🟢 Best performance |
| No meta-prompting | 🔴 Slower convergence and higher final ranks |

🔁 Task 2: Kissing Number Problem (Geometry)

Target: find better sphere packings.
Result: meta-prompting improves early- and mid-stage exploration.

| Variant | Performance (higher is better) |
|---|---|
| Full AlphaEvolve | 🟢 Best |
| No meta-prompting | 🔴 Plateaued early |

🧠 Interpretation

  • Meta-prompts act like evolving research strategies.
  • Instead of blindly modifying code, AlphaEvolve learns how to guide the LLM to be more creative, precise, or domain-aware.
  • Even small phrasing changes in prompts can dramatically change the LLM’s code generation behavior—AlphaEvolve captures and exploits this systematically.

🧰 Example: Meta-Prompt Evolution for Matrix Multiplication

| Generation | Prompt Snippet |
|---|---|
| Gen 0 | “Optimize the following tensor decomposition for speed.” |
| Gen 2 | “You are designing a low-rank decomposition algorithm for complex-valued 4×4 matrix multiplication. Minimize scalar multiplications and favor near-half-integer coefficients.” |
| Gen 5 | “Incorporate stochastic gradient steps and discretization loss to encourage concise, exact decompositions with minimal numerical noise.” |

📌 Summary

  • AlphaEvolve’s meta-prompting system enables autonomous discovery of better prompting strategies, a key to unlocking high-performance LLM behavior.
  • It treats prompt construction as a first-class evolvable asset, not a static input.
  • The approach works across domains (ML, math, hardware design) and significantly boosts sample efficiency and solution quality.
