SUIP - breadboard-ai/breadboard GitHub Wiki

Semantic UI Protocol (SUIP)

1. Abstract

This document specifies the Semantic UI Protocol (SUIP), a format and protocol for generating user interfaces with Large Language Models (LLMs) or other untrusted code. The protocol uses JSON-RPC 2.0 to establish an asynchronous, bidirectional communication channel between a sandboxed Generator (the LLM-driven code) and a host Renderer (the UI presentation layer). The core principle is the separation of UI semantics from UI rendering. The Generator emits a stream of semantic data structures, called Particles, which the Renderer interprets and presents to the user. This allows the host environment to retain full control over rendering, security, and performance, while enabling the LLM to dynamically drive the UI's content and structure.

2. Guiding Principles

The protocol design is guided by the following principles:

Security via Sandboxing: The environment executing the Generator's logic MUST be isolated from the environment that renders the UI. The Generator MUST NOT have direct access to the DOM or other sensitive browser APIs.
Semantic Communication: The Generator describes what it intends to convey, not how it should be displayed. The Renderer has the final say on presentation.
Reactivity by Default: The protocol is designed around a reactive data flow. The Renderer should be able to efficiently update the UI in response to a stream of changes from the Generator.
Bidirectional Interaction: The protocol MUST support communication from the Renderer back to the Generator, enabling user interactions (e.g., button clicks, form submissions) to trigger logic within the sandbox.

2.1 The Core Idea: Semantic UI

The central principle of SUIP is to separate the meaning (semantics) of a UI from its presentation (rendering). The Generator does not tell the Renderer to "draw a 2px bordered box with a drop shadow." Instead, it makes a semantic request, like "display a card."

Let's take that "card" component as an example. Semantically, a card is a container for a distinct piece of content with an established structure.

The Generator's Request (The "What"): The Generator describes the semantic elements of the card. It might request a card containing a title, an image, a block of text, and a "submit" button. It defines the content and the user interaction (submit), but not the appearance.
The Renderer's Interpretation (The "How"): The Renderer receives this semantic request. It has complete freedom over the presentation. It might render the card with rounded corners, specific padding, and a blue, branded "submit" button on a desktop view. On a mobile view, it might render the same semantic card with less padding and a full-width button for better usability.

It is crucial to understand that SUIP is the protocol for using an established semantic vocabulary, not for defining it. Both the Generator and the Renderer must agree on the set of semantic types (like "card") and their expected structures beforehand. The process of defining, negotiating, or distributing this shared knowledge is outside the scope of this protocol.

2.2 The "Why": Control and Constraint

As LLMs become increasingly effective at generating raw UI code (e.g., HTML, CSS, JavaScript), the primary challenge shifts from capability to control. An unconstrained LLM might generate UI that is insecure, off-brand, inaccessible, or simply undesirable. The key motivation for SUIP is to address this by fundamentally constraining the LLM's output. By forcing the Generator to communicate in a limited, semantic vocabulary instead of a Turing-complete rendering language, the host environment retains ultimate authority: no matter what the LLM tries to do, it can only do what the Renderer allows.

Furthermore, this model of control extends to the development of the Renderer itself. Because the Renderer only needs to handle a known, finite set of semantic types, the task of creating the UI components that render these semantics can also be LLM-assisted.

For instance, a developer could prompt an LLM to "generate the code for a UI component that renders a 'card' particle," knowing the output will be a self-contained, reviewable, and testable component. This allows for rapid UI development while still maintaining a crucial separation of concerns: the untrusted, sandboxed LLM generates semantic structures, while a trusted (potentially LLM-generated but vetted) Renderer implements the presentation.

3. Architecture

The protocol defines three core components:

The Generator: Resides inside the sandbox. This component, driven by LLM-generated code, produces and manipulates a semantic representation of the UI. It acts as a JSON-RPC client to send UI changes and as a JSON-RPC server to receive UI events.
The Renderer: Resides outside the sandbox. This component receives semantic information from the Generator, renders the actual UI, and forwards user interaction events back to the Generator. It acts as a JSON-RPC server to receive UI changes and as a JSON-RPC client to send UI events.
The Channel: A bidirectional communication layer that guarantees reliable and ordered message delivery and enforces the sandbox boundary by transmitting serialized messages between the Generator and the Renderer.

flowchart LR
    subgraph "Host Environment"  
        Renderer("Renderer (View)")  
        UI("UI Framework (e.g., Lit, React)")  
    end

    subgraph "Sandboxed Environment"  
        Generator("Generator (Model + Controller)")
        LLM("LLM-generated code")  
    end

    Channel("Channel (Message Bus)")

    LLM -- "Drives" --> Generator  
    Generator -- "Sends Operation Messages" --> Channel  
    Channel -- "Delivers Operation Messages" --> Renderer  
    Renderer -- "Updates" --> UI  
    UI -- "User Interactions" --> Renderer  
    Renderer -- "Sends Event Messages" --> Channel  
    Channel -- "Delivers Event Messages" --> Generator
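
The client/server roles above can be sketched as JSON-RPC 2.0 envelopes. The source does not name the JSON-RPC methods or say whether messages are requests or notifications, so the method names `suip/operation` and `suip/event` below are hypothetical, and the use of notifications (a request object without an `id`, which expects no response) is an assumption made for illustration.

```typescript
// A JSON-RPC 2.0 notification: a request object with no "id" member.
type JsonRpcNotification<P> = {
  jsonrpc: "2.0";
  method: string;
  params: P;
};

// Hypothetical method names; the source does not define them.
const METHOD_OPERATION = "suip/operation"; // Generator -> Renderer
const METHOD_EVENT = "suip/event"; // Renderer -> Generator

// Wraps any SUIP message in a JSON-RPC 2.0 notification envelope.
function notification<P>(method: string, params: P): JsonRpcNotification<P> {
  return { jsonrpc: "2.0", method, params };
}

// Generator side: announce a UI change.
const operationEnvelope = notification(METHOD_OPERATION, {
  op: "remove",
  path: ["parent", "toRemove"],
});

// Renderer side: report a user interaction.
const eventEnvelope = notification(METHOD_EVENT, {
  type: "event",
  handler: "hnd_1",
  path: ["p1", "p2"],
});
```

If an implementation needs acknowledgements or error reporting per operation, full JSON-RPC requests carrying an `id` would be the alternative.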

3.1. The Tree-Stream-Tree Data Flow

The fundamental data flow of SUIP follows a Tree-Stream-Tree pattern.

Generator Tree(s): A Generator constructs a semantic representation of the UI portion it controls as a tree of Particles. This may be a partial tree, representing only a specific component or section of the overall UI. For security and modularity, a Generator might not have access to the complete UI tree managed by the Renderer.
Operation Stream(s): Instead of sending its entire tree state at once, each Generator calculates the difference between its current tree state and the last known state of the Renderer. It translates these differences into a sequence of granular Operation Messages (append, insert, remove, replace). This sequence becomes a stream of instructions sent across the Channel. Because only changes cross the Channel, unchanged parts of the tree are not retransmitted, and the UI can be updated incrementally as the Generator produces output.
Renderer Tree: The Renderer receives streams of operations from one or more Generators. It integrates these operations to construct and maintain its own single, coherent local copy of the Particle tree. This consolidated tree acts as a view model, which the Renderer then uses to draw and update the actual UI presented to the user.

This pattern ensures that Generators and the Renderer are decoupled. A Generator only needs to reason about its portion of the semantic tree, while the Renderer processes simple, explicit instructions from multiple sources to keep the UI in sync.
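
As a rough illustration of the "tree to stream" step, the sketch below diffs two snapshots of a Generator's tree and emits operations. It is not an algorithm taken from the specification: it ignores sibling order, never emits insert, leaves out DataParticle and the on property for brevity, and uses an empty path for the root group, all of which are assumptions. The function names are hypothetical.

```typescript
type ParticleIdentifier = string;
type ParticlePath = ParticleIdentifier[];
type TextParticle = { text: string; mimeType?: string };
type GroupParticle = { group: Map<ParticleIdentifier, Particle>; type?: string };
type Particle = TextParticle | GroupParticle;

type Operation =
  | { op: "append"; path: ParticlePath; id: ParticleIdentifier; particle: Particle }
  | { op: "remove"; path: ParticlePath }
  | { op: "replace"; path: ParticlePath; particle: Particle };

const isGroup = (p: Particle): p is GroupParticle => "group" in p;

// Two particles match as leaves when both are text particles with equal fields.
function sameLeaf(a: Particle, b: Particle): boolean {
  return !isGroup(a) && !isGroup(b) && a.text === b.text && a.mimeType === b.mimeType;
}

// Computes the operations that turn `prev` (the last state sent to the
// Renderer) into `next` (the Generator's current state).
function diff(prev: GroupParticle, next: GroupParticle, path: ParticlePath = []): Operation[] {
  const ops: Operation[] = [];
  // Children that disappeared are removed.
  for (const id of prev.group.keys()) {
    if (!next.group.has(id)) ops.push({ op: "remove", path: [...path, id] });
  }
  for (const [id, particle] of next.group) {
    const old = prev.group.get(id);
    if (old === undefined) {
      // New child: append it to the group at `path`.
      ops.push({ op: "append", path, id, particle });
    } else if (isGroup(old) && isGroup(particle) && old.type === particle.type) {
      // Same kind of group: descend and diff the children.
      ops.push(...diff(old, particle, [...path, id]));
    } else if (!sameLeaf(old, particle)) {
      // Anything else that changed is replaced wholesale.
      ops.push({ op: "replace", path: [...path, id], particle });
    }
  }
  return ops;
}
```

After sending the resulting operations, the Generator would keep `next` as its new record of the Renderer's state.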

4. The Particle Format

To implement the semantic concepts described above, SUIP uses a lightweight data structure called the Particle. Particles form a tree. The collection of all defined particle types and the rendering expectations for each one forms the "semantic language" of the protocol: this language is used to communicate the semantics of the UI.

In particular, the type property of a GroupParticle dictates its expected structure and the roles of its children. This is how the abstract idea of a "card" is made concrete: as a GroupParticle with type: "card". The Generator uses this language to describe what it wants to show, and the Renderer uses its understanding of this language to decide how to show it.

4.1. Particle Types

There are three primary particle types.

TextParticle: Represents textual information.

  type TextParticle = {  
    text: string;  
    mimeType?: string; // Defaults to "text/markdown"  
  };

DataParticle: Represents binary or rich media information.

  type DataParticle = {  
    data: string; // A URL or data URI  
    mimeType: string;  
  };

GroupParticle: Represents a logical grouping of other particles. This is the basis for creating semantic structures.

  type GroupParticle = {  
    group: Map<ParticleIdentifier, Particle>;  
    type?: string; // Optional semantic type, e.g., "card", "form", "list"  
    on?: Record<string, EventHandler>; // Event handlers for the group  
  };
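
To make the three types concrete, here is a sketch that combines them into a single Particle union and builds the "card" from Section 2.1. The child identifiers (`title`, `image`, `body`, `submit`), the `"button"` type, the handler id, and the example URL are all hypothetical; the actual vocabulary is whatever the Generator and Renderer have agreed on.

```typescript
type ParticleIdentifier = string;
type EventHandler = {
  handler: string;
  preventDefault?: boolean;
  data?: "none" | "value" | "formData";
};
type TextParticle = { text: string; mimeType?: string };
type DataParticle = { data: string; mimeType: string };
type GroupParticle = {
  group: Map<ParticleIdentifier, Particle>;
  type?: string;
  on?: Record<string, EventHandler>;
};
type Particle = TextParticle | DataParticle | GroupParticle;

// Distinguishes the three particle types by their defining property.
function kindOf(p: Particle): "text" | "data" | "group" {
  if ("group" in p) return "group";
  if ("data" in p) return "data";
  return "text";
}

// A semantic card: content and interaction only, no styling.
const card: GroupParticle = {
  type: "card",
  group: new Map<ParticleIdentifier, Particle>([
    ["title", { text: "Weekly report" }],
    ["image", { data: "https://example.com/chart.png", mimeType: "image/png" }],
    ["body", { text: "Traffic grew **12%** this week." }], // mimeType defaults to text/markdown
    [
      "submit",
      {
        type: "button",
        group: new Map<ParticleIdentifier, Particle>([["label", { text: "Submit" }]]),
        on: { click: { handler: "hnd_submit" } },
      },
    ],
  ]),
};
```

Note that a Map is not preserved by `JSON.stringify` as is; how a group is serialized for the Channel is not stated here, so an implementation would need to pick an encoding (for example, an array of `[id, particle]` pairs).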

4.2. Interactivity: The EventHandler

To enable bidirectional communication, a particle can include an on property. (Of the types defined in Section 4.1, only GroupParticle declares it.) This property maps event names (e.g., click, submit) to an EventHandler object.

type EventHandler = {  
  /**  
   * A unique identifier for the handler logic within the Generator.  
   */  
  handler: string;  
  /**  
   * If true, the default browser action for this event should be prevented.  
   */  
  preventDefault?: boolean;  
  /**  
   * Specifies which data from the event to send back to the Generator.  
   * - "none": (Default) No payload is sent.  
   * - "value": Sends the element's value (e.g., for inputs).  
   * - "formData": For a form, sends all named input values.  
   */  
  data?: "none" | "value" | "formData";  
};

When a user triggers an event on a rendered element, the Renderer MUST send an EventMessage back through the Channel.
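
The sketch below shows one way a Renderer might turn an EventHandler into an outgoing EventMessage, honoring the data setting. The EventMessage shape follows Section 5.3; the payload layout and the `ElementSnapshot` type (a stand-in for whatever the Renderer reads from its UI framework) are assumptions, as is the function name.

```typescript
type EventHandler = {
  handler: string;
  preventDefault?: boolean;
  data?: "none" | "value" | "formData";
};
type EventMessage = {
  type: "event";
  handler: string;
  path: string[];
  payload?: unknown;
};

// Hypothetical snapshot of the rendered element at the time of the event.
type ElementSnapshot = { value?: string; fields?: Record<string, string> };

// Builds the message to send through the Channel for a triggered event.
function toEventMessage(h: EventHandler, path: string[], el: ElementSnapshot): EventMessage {
  const message: EventMessage = { type: "event", handler: h.handler, path };
  switch (h.data ?? "none") {
    case "value":
      // Send only the element's own value (assumed payload layout).
      message.payload = { value: el.value ?? "" };
      break;
    case "formData":
      // Send every named field of the form (assumed payload layout).
      message.payload = { ...(el.fields ?? {}) };
      break;
    case "none":
      // Default: no payload is attached.
      break;
  }
  return message;
}
```

Handling of preventDefault is left out because it depends on the UI framework in use.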

5. The Communication Protocol

Communication occurs via asynchronous, serialized messages sent over the Channel.

5.1. The Particle Tree

The Renderer maintains the state of the UI as a tree of Particles, rooted in a top-level GroupParticle. The Generator never sends the whole tree, but rather a series of Operation Messages to manipulate it. A ParticleIdentifier is a string that uniquely identifies a particle within its parent GroupParticle. A ParticlePath is an array of identifiers representing the path from the root to a specific particle.

5.2. Generator-to-Renderer: Operation Messages

The Generator manipulates the particle tree by sending one of the following message types.

append: Adds a particle to a group.

  { "op": "append", "path": ["parent"], "id": "new", "particle": { ... } }

insert: Inserts a particle into a group before a specific sibling.

  { "op": "insert", "path": ["parent"], "id": "new", "particle": { ... }, "before": "sibling" }

remove: Removes a particle.

  { "op": "remove", "path": ["parent", "toRemove"] }

replace: Replaces an existing particle.

  { "op": "replace", "path": ["parent", "toReplace"], "particle": { ... } }
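
On the receiving side, a Renderer could apply these messages to its local tree roughly as follows. This is a minimal sketch under several assumptions the text does not settle: the root group is addressed by an empty path, an insert whose sibling is missing falls back to appending, and errors are raised for unknown paths. DataParticle and the on property are omitted to keep the example short, and the function names are hypothetical.

```typescript
type ParticleIdentifier = string;
type ParticlePath = ParticleIdentifier[];
type TextParticle = { text: string; mimeType?: string };
type GroupParticle = { group: Map<ParticleIdentifier, Particle>; type?: string };
type Particle = TextParticle | GroupParticle;

type Operation =
  | { op: "append"; path: ParticlePath; id: ParticleIdentifier; particle: Particle }
  | { op: "insert"; path: ParticlePath; id: ParticleIdentifier; particle: Particle; before: ParticleIdentifier }
  | { op: "remove"; path: ParticlePath }
  | { op: "replace"; path: ParticlePath; particle: Particle };

// Walks from the root to the group named by `path`.
function resolveGroup(root: GroupParticle, path: ParticlePath): GroupParticle {
  let current = root;
  for (const id of path) {
    const child = current.group.get(id);
    if (child === undefined || !("group" in child)) throw new Error(`No group at "${id}"`);
    current = child;
  }
  return current;
}

function applyOperation(root: GroupParticle, op: Operation): void {
  switch (op.op) {
    case "append": {
      const parent = resolveGroup(root, op.path);
      parent.group.delete(op.id); // ensure the particle lands at the end
      parent.group.set(op.id, op.particle);
      return;
    }
    case "insert": {
      const parent = resolveGroup(root, op.path);
      // Maps keep insertion order, so rebuild the map with the new entry in place.
      const reordered = new Map<ParticleIdentifier, Particle>();
      for (const [id, particle] of parent.group) {
        if (id === op.before) reordered.set(op.id, op.particle);
        if (id !== op.id) reordered.set(id, particle);
      }
      if (!reordered.has(op.id)) reordered.set(op.id, op.particle); // sibling not found
      parent.group = reordered;
      return;
    }
    case "remove":
    case "replace": {
      const id = op.path[op.path.length - 1];
      if (id === undefined) throw new Error("Path must name a particle");
      const parent = resolveGroup(root, op.path.slice(0, -1));
      if (!parent.group.has(id)) throw new Error(`No particle "${id}"`);
      if (op.op === "remove") parent.group.delete(id);
      else parent.group.set(id, op.particle);
      return;
    }
  }
}
```

Replacing a particle with `Map.set` keeps its position among its siblings, which is the behavior one would expect from replace.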

5.3. Renderer-to-Generator: Event Messages

When a user interaction occurs on an element with a defined EventHandler, the Renderer sends an EventMessage.

{ "type": "event", "handler": "hnd_1", "path": ["p1", "p2"], "payload": { ... } }
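
Inside the sandbox, the Generator needs to map the handler string in an incoming EventMessage back to its own logic. A small registry is one way to do that. The class and its `hnd_N` id scheme below are hypothetical; the source only requires that handler identifiers be unique within the Generator.

```typescript
type EventMessage = {
  type: "event";
  handler: string;
  path: string[];
  payload?: unknown;
};
type HandlerFn = (path: string[], payload: unknown) => void;

// Keeps the Generator's callbacks addressable by an opaque string id.
class HandlerRegistry {
  private handlers = new Map<string, HandlerFn>();
  private nextId = 1;

  // Stores the callback and returns the id to place in an EventHandler.
  register(fn: HandlerFn): string {
    const id = `hnd_${this.nextId++}`;
    this.handlers.set(id, fn);
    return id;
  }

  // Routes an incoming message; returns false when the id is unknown.
  dispatch(message: EventMessage): boolean {
    const fn = this.handlers.get(message.handler);
    if (fn === undefined) return false;
    fn(message.path, message.payload);
    return true;
  }
}
```

Because only the string id crosses the Channel, no executable code from the sandbox ever reaches the Renderer.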

5.4. Transport Agnosticism

This specification defines the format of the messages and the behavior of the Generator and Renderer. It does not prescribe a specific transport mechanism. Implementations are free to choose any transport that delivers serialized messages reliably and in order and that suits their architecture (e.g., window.postMessage or WebSockets).
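
One way to keep Generator and Renderer code independent of the transport is to program against a small channel interface. The interface and the in-memory implementation below are hypothetical and intended for tests or illustration only. Messages are round-tripped through JSON to mimic crossing the sandbox boundary, and delivery is synchronous here for simplicity, whereas a real transport would deliver asynchronously.

```typescript
type Listener = (message: unknown) => void;

// Hypothetical transport abstraction; not defined by the specification.
interface Channel {
  send(message: unknown): void;
  onMessage(listener: Listener): void;
}

class InMemoryEndpoint implements Channel {
  peer?: InMemoryEndpoint;
  private listeners: Listener[] = [];

  send(message: unknown): void {
    // Serialize so that no object references leak across the boundary.
    const wire = JSON.stringify(message);
    this.peer?.deliver(wire);
  }

  onMessage(listener: Listener): void {
    this.listeners.push(listener);
  }

  private deliver(wire: string): void {
    const message: unknown = JSON.parse(wire);
    for (const listener of this.listeners) listener(message);
  }
}

// Returns two connected endpoints: one for the Generator, one for the Renderer.
function createChannelPair(): [Channel, Channel] {
  const a = new InMemoryEndpoint();
  const b = new InMemoryEndpoint();
  a.peer = b;
  b.peer = a;
  return [a, b];
}
```

An adapter over window.postMessage or a WebSocket could implement the same two methods, leaving the rest of the code unchanged.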