Imagen 3 Guide: Different Modes and Prompt Setup - vertesia/llumiverse GitHub Wiki

This guide covers the different modes available in Imagen 3 and how to structure prompts for each mode.

Common Options

Available for most modes:

number_of_images: Number of variants to generate
seed: For reproducible results
safety_setting: Safety filter level
person_generation: Whether to allow the generation of people. Can be set to adults only, all, or off.
add_watermark: Adds an invisible watermark for AI image detection (cannot be used with seed).
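The restriction that add_watermark cannot be used with seed can be checked before a request is sent. The following is a minimal sketch, not llumiverse code: the `CommonImagenOptions` interface and the `validateCommonOptions` function are hypothetical names, the field types are assumptions, and the check that number_of_images is a positive integer is an assumption rather than something this guide states.

```typescript
// Hypothetical shape for the common options listed above. Field names follow
// the guide; the types are assumptions made for illustration.
interface CommonImagenOptions {
  number_of_images?: number;
  seed?: number;
  safety_setting?: string;
  person_generation?: string;
  add_watermark?: boolean;
}

// Returns a list of problems; an empty list means no conflict was found.
function validateCommonOptions(options: CommonImagenOptions): string[] {
  const problems: string[] = [];

  // From the guide: add_watermark cannot be used with seed.
  if (options.add_watermark === true && options.seed !== undefined) {
    problems.push("add_watermark cannot be used together with seed");
  }

  // Assumption: a request should ask for at least one whole image.
  const count = options.number_of_images;
  if (count !== undefined && (!Number.isInteger(count) || count < 1)) {
    problems.push("number_of_images must be a positive integer");
  }

  return problems;
}
```

Returning a list of messages instead of throwing lets a caller report every problem at once.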

1. TEXT_IMAGE (Default)

Basic text-to-image generation without reference images.

Prompt Structure:

user: Detailed description of the desired image
negative: Elements to avoid in the image
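The two-part structure above can be pictured as a list of role-tagged segments. The sketch below is illustrative only: `Segment`, `Role`, and `textImagePrompt` are hypothetical names defined here, not the llumiverse prompt types, and treating the negative part as optional is an assumption.

```typescript
// Hypothetical role-tagged segment; the role names mirror the labels in this guide.
type Role = "user" | "negative";

interface Segment {
  role: Role;
  content: string;
}

// Builds the segments for a plain text-to-image request.
// The negative segment is only added when a non-blank value is given.
function textImagePrompt(description: string, avoid?: string): Segment[] {
  const segments: Segment[] = [{ role: "user", content: description }];
  if (avoid !== undefined && avoid.trim().length > 0) {
    segments.push({ role: "negative", content: avoid });
  }
  return segments;
}
```

For example, `textImagePrompt("A lighthouse at dusk, oil painting", "people, text")` yields one user segment followed by one negative segment.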

2. Edit Modes

EDIT_MODE_INPAINT_REMOVAL

Removes elements from an image based on a mask.

Prompt Structure:

user: Description of what to remove + image attachment
mask: Either auto-generated or user-provided mask image
negative: Elements to avoid in the edited area

Options:

Use MASK_MODE_USER_PROVIDED when you supply your own mask; otherwise use one of the other mask modes to auto-generate a mask from the foreground, the background, or an object class.
mask_dilation recommended: 0.01
edit_steps controls the editing process quality

EDIT_MODE_INPAINT_INSERTION

Adds new elements to an image based on a mask area.

Prompt Structure:

user: Description of what to add + image attachment
mask: Either auto-generated or user-provided mask image
negative: Elements to avoid in the inserted content

Options:

Use MASK_MODE_USER_PROVIDED when you supply your own mask; otherwise use one of the other mask modes to auto-generate a mask from the foreground, the background, or an object class.
mask_dilation recommended: 0.01
edit_steps controls the editing process quality

EDIT_MODE_BGSWAP

Replaces the background of an image while keeping the foreground intact.

Prompt Structure:

user: Description of new background + image attachment
mask: Auto-generated (typically use MASK_MODE_BACKGROUND) or user-provided
negative: Elements to avoid in the new background

Options:

mask_dilation recommended: 0.0
edit_steps controls the editing process quality

EDIT_MODE_OUTPAINT

Extends an image beyond its original boundaries.

Prompt Structure:

user: Description of extended content + image attachment
mask: Generated mask for extension area
negative: Elements to avoid in the extended areas

Options:

mask_dilation recommended: 0.01-0.03
edit_steps controls the editing process quality
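The recommended mask_dilation values for the four edit modes can be collected in one lookup table. This is a minimal sketch: the mode names and numbers come from this guide, while the `RECOMMENDED_MASK_DILATION` table, the `defaultMaskDilation` helper, and the choice to fall back to the lower end of a range are assumptions made for illustration.

```typescript
type EditMode =
  | "EDIT_MODE_INPAINT_REMOVAL"
  | "EDIT_MODE_INPAINT_INSERTION"
  | "EDIT_MODE_BGSWAP"
  | "EDIT_MODE_OUTPAINT";

// Recommended mask_dilation per edit mode, stored as [min, max].
// A single recommended value is stored as a range with equal bounds.
const RECOMMENDED_MASK_DILATION: Record<EditMode, [number, number]> = {
  EDIT_MODE_INPAINT_REMOVAL: [0.01, 0.01],
  EDIT_MODE_INPAINT_INSERTION: [0.01, 0.01],
  EDIT_MODE_BGSWAP: [0.0, 0.0],
  EDIT_MODE_OUTPAINT: [0.01, 0.03],
};

// Uses the caller's value when one is given; otherwise falls back to the
// lower bound of the recommended range (an arbitrary choice for this sketch).
function defaultMaskDilation(mode: EditMode, requested?: number): number {
  return requested !== undefined ? requested : RECOMMENDED_MASK_DILATION[mode][0];
}
```

Keeping the values in a table keyed by mode means a new mode cannot be added to `EditMode` without the compiler asking for its recommended dilation.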

3. Customization Modes

CUSTOMIZATION_SUBJECT

Creates new images featuring subjects from reference images.

Prompt Structure:

user: Prompt taken from text content.
user: First image is the control image. Additional images as subjects + subject descriptions
negative: Elements to avoid

Note: The first image is always processed as the control image, while subsequent images are treated as subjects.
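The ordering rule in the note can be made explicit in code. The sketch below is an illustration under stated assumptions: `ImageRef`, `SubjectReferences`, and `splitSubjectImages` are hypothetical names, and rejecting an empty image list is an assumption rather than documented behavior.

```typescript
// Hypothetical minimal image handle; a real integration would carry image data.
interface ImageRef {
  name: string;
}

interface SubjectReferences {
  control: ImageRef;
  subjects: ImageRef[];
}

// Splits attached images by position: the first is the control image,
// every later image is treated as a subject.
function splitSubjectImages(images: ImageRef[]): SubjectReferences {
  const control = images[0];
  if (control === undefined) {
    throw new Error("CUSTOMIZATION_SUBJECT needs at least one image (the control image)");
  }
  return { control, subjects: images.slice(1) };
}
```

Because position decides each image's role, attachment order matters when assembling the prompt.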

CUSTOMIZATION_STYLE

Creates new images in the style of reference images.

Prompt Structure:

user: Prompt taken from text content.
user: Style reference images + style description
negative: Elements to avoid

Options:

styleDescription can be provided to clarify the style

CUSTOMIZATION_CONTROLLED

Creates images guided by control images (such as face meshes, edge maps, or scribbles).

Prompt Structure:

user: Prompt taken from text content.
user: Control image attachment or image to generate control image from.
negative: Elements to avoid

Options:

controlType: Choose between CONTROL_TYPE_FACE_MESH, CONTROL_TYPE_CANNY, or CONTROL_TYPE_SCRIBBLE
enableControlImageComputation: Set to true to let the model compute the control image
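The two options interact: if the attachment is an ordinary image rather than a ready-made control image, the model has to compute the control image itself. The following is a minimal sketch of that decision. The three control type names come from this guide; `ControlledOptions`, `isControlType`, and `buildControlledOptions` are hypothetical helpers, not part of the llumiverse API.

```typescript
const CONTROL_TYPES = [
  "CONTROL_TYPE_FACE_MESH",
  "CONTROL_TYPE_CANNY",
  "CONTROL_TYPE_SCRIBBLE",
] as const;

type ControlType = (typeof CONTROL_TYPES)[number];

interface ControlledOptions {
  controlType: ControlType;
  enableControlImageComputation: boolean;
}

// Type guard: narrows an arbitrary string to one of the three control types.
function isControlType(value: string): value is ControlType {
  return (CONTROL_TYPES as readonly string[]).indexOf(value) !== -1;
}

// hasPrecomputedControlImage is true when the attachment is already a control
// image (for example an edge map), and false when it is an ordinary image the
// model should derive the control image from.
function buildControlledOptions(
  controlType: string,
  hasPrecomputedControlImage: boolean
): ControlledOptions {
  if (!isControlType(controlType)) {
    throw new Error(`Unknown controlType: ${controlType}`);
  }
  return {
    controlType,
    enableControlImageComputation: !hasPrecomputedControlImage,
  };
}
```

For example, `buildControlledOptions("CONTROL_TYPE_CANNY", false)` requests that the control image be computed from the attached image.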

CUSTOMIZATION_INSTRUCT

Modifies reference images according to instructions.

Prompt Structure:

user: Reference image + specific modification instructions
negative: Elements to avoid in the modification