Imagen 3 Guide: Different Modes and Prompt Setup

This guide covers the different modes available in Imagen 3 and how to structure prompts for each mode.

Common Options

Available for most modes:

  • number_of_images: Number of variants to generate
  • seed: For reproducible results
  • safety_setting: Safety filter level
  • person_generation: Whether to allow the generation of people. Can be set to adults only, all, or off.
  • add_watermark: Adds an invisible watermark for AI image detection (cannot be used with seed).
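
The options above can be sketched as a plain object with a small consistency check. This is only an illustration: the interface CommonImagenOptions, the value types, and the helper checkCommonOptions are hypothetical and are not part of llumiverse. The only rule taken from this guide is that add_watermark cannot be combined with seed; the positive-integer check on number_of_images is an assumption.

```typescript
// Hypothetical shape for the common options; field names follow this guide,
// value types are assumptions.
interface CommonImagenOptions {
  number_of_images?: number;
  seed?: number;
  safety_setting?: string;
  person_generation?: string;
  add_watermark?: boolean;
}

// Returns a list of problems; an empty list means the options look consistent.
function checkCommonOptions(opts: CommonImagenOptions): string[] {
  const problems: string[] = [];
  // Rule stated in this guide: add_watermark cannot be used with seed.
  if (opts.add_watermark === true && opts.seed !== undefined) {
    problems.push("add_watermark cannot be used together with seed");
  }
  // Assumption: the number of variants must be a positive integer.
  if (
    opts.number_of_images !== undefined &&
    (!Number.isInteger(opts.number_of_images) || opts.number_of_images < 1)
  ) {
    problems.push("number_of_images must be a positive integer");
  }
  return problems;
}

const reproducible: CommonImagenOptions = { number_of_images: 2, seed: 42, add_watermark: false };
const conflicting: CommonImagenOptions = { seed: 42, add_watermark: true };

console.log(checkCommonOptions(reproducible));
console.log(checkCommonOptions(conflicting));
```

Running a check like this before sending a request surfaces the seed/watermark conflict early rather than as a provider error.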

Available Modes

1. TEXT_IMAGE (Default)

Basic text-to-image generation without reference images.

Prompt Structure:

  • user: Detailed description of the desired image
  • negative: Elements to avoid in the image
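
As a rough sketch, a TEXT_IMAGE prompt can be thought of as one user segment plus an optional negative segment. The PromptSegment type and buildTextImagePrompt helper below are hypothetical names for illustration; only the role names user and negative come from this guide, and joining the negative elements with commas is an assumption.

```typescript
// Hypothetical prompt segment; the two role names are the ones used in this guide.
type PromptRole = "user" | "negative";

interface PromptSegment {
  role: PromptRole;
  content: string;
}

// Builds the segments for a text-to-image request.
// The negative segment is omitted when there is nothing to avoid.
function buildTextImagePrompt(description: string, avoid: string[] = []): PromptSegment[] {
  const segments: PromptSegment[] = [{ role: "user", content: description }];
  if (avoid.length > 0) {
    segments.push({ role: "negative", content: avoid.join(", ") });
  }
  return segments;
}

const textImagePrompt = buildTextImagePrompt(
  "A lighthouse on a rocky coast at dusk, oil painting, warm light",
  ["people", "text"],
);
console.log(textImagePrompt);
```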

2. Edit Modes

EDIT_MODE_INPAINT_REMOVAL

Removes elements from an image based on a mask.

Prompt Structure:

  • user: Description of what to remove + image attachment
  • mask: Either auto-generated or user-provided mask image
  • negative: Elements to avoid in the edited area

Options:

  • Use MASK_MODE_USER_PROVIDED when you supply your own mask; otherwise use one of the other mask modes to auto-generate a mask based on the foreground, the background, or an object class.
  • mask_dilation recommended: 0.01
  • edit_steps controls the editing process quality
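
The mask choice described above might be expressed as follows. This is a sketch under assumptions: InpaintRemovalOptions and buildRemovalOptions are hypothetical, and the auto-mask names MASK_MODE_FOREGROUND and MASK_MODE_SEMANTIC are assumed by analogy with MASK_MODE_USER_PROVIDED and MASK_MODE_BACKGROUND, which are the only mask modes named in this guide.

```typescript
// MASK_MODE_USER_PROVIDED and MASK_MODE_BACKGROUND appear in this guide;
// the other two names are assumptions.
type MaskMode =
  | "MASK_MODE_USER_PROVIDED"
  | "MASK_MODE_FOREGROUND"
  | "MASK_MODE_BACKGROUND"
  | "MASK_MODE_SEMANTIC";

type AutoMaskTarget = "foreground" | "background" | "object_class";

// Hypothetical options object for an inpaint-removal request.
interface InpaintRemovalOptions {
  edit_mode: "EDIT_MODE_INPAINT_REMOVAL";
  mask_mode: MaskMode;
  mask_dilation: number;
  edit_steps?: number;
}

// Picks the user-provided mode when a mask image is attached,
// otherwise an auto-generated mask for the requested target.
function buildRemovalOptions(
  hasUserMask: boolean,
  autoTarget: AutoMaskTarget = "foreground",
): InpaintRemovalOptions {
  const autoModes: Record<AutoMaskTarget, MaskMode> = {
    foreground: "MASK_MODE_FOREGROUND",
    background: "MASK_MODE_BACKGROUND",
    object_class: "MASK_MODE_SEMANTIC",
  };
  return {
    edit_mode: "EDIT_MODE_INPAINT_REMOVAL",
    mask_mode: hasUserMask ? "MASK_MODE_USER_PROVIDED" : autoModes[autoTarget],
    mask_dilation: 0.01, // recommended value from this guide
  };
}

console.log(buildRemovalOptions(true));
console.log(buildRemovalOptions(false, "object_class"));
```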

EDIT_MODE_INPAINT_INSERTION

Adds new elements to an image based on a mask area.

Prompt Structure:

  • user: Description of what to add + image attachment
  • mask: Either auto-generated or user-provided mask image
  • negative: Elements to avoid in the inserted content

Options:

  • Use MASK_MODE_USER_PROVIDED when you supply your own mask; otherwise use one of the other mask modes to auto-generate a mask based on the foreground, the background, or an object class.
  • mask_dilation recommended: 0.01
  • edit_steps controls the editing process quality

EDIT_MODE_BGSWAP

Replaces the background of an image while keeping the foreground intact.

Prompt Structure:

  • user: Description of new background + image attachment
  • mask: Auto-generated (typically use MASK_MODE_BACKGROUND) or user-provided
  • negative: Elements to avoid in the new background

Options:

  • mask_dilation recommended: 0.0
  • edit_steps controls the editing process quality

EDIT_MODE_OUTPAINT

Extends an image beyond its original boundaries.

Prompt Structure:

  • user: Description of extended content + image attachment
  • mask: Generated mask for extension area
  • negative: Elements to avoid in the extended areas

Options:

  • mask_dilation recommended: 0.01-0.03
  • edit_steps controls the editing process quality
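
The recommended mask_dilation values for the four edit modes can be collected in one lookup table. The table below restates the values from this guide; the names RECOMMENDED_DILATION and isRecommendedDilation are hypothetical helpers, not llumiverse APIs.

```typescript
type EditMode =
  | "EDIT_MODE_INPAINT_REMOVAL"
  | "EDIT_MODE_INPAINT_INSERTION"
  | "EDIT_MODE_BGSWAP"
  | "EDIT_MODE_OUTPAINT";

// [min, max] recommended mask_dilation per edit mode, as listed in this guide.
const RECOMMENDED_DILATION: Record<EditMode, [number, number]> = {
  EDIT_MODE_INPAINT_REMOVAL: [0.01, 0.01],
  EDIT_MODE_INPAINT_INSERTION: [0.01, 0.01],
  EDIT_MODE_BGSWAP: [0.0, 0.0],
  EDIT_MODE_OUTPAINT: [0.01, 0.03],
};

// True when the given dilation falls inside the recommended range for the mode.
function isRecommendedDilation(mode: EditMode, dilation: number): boolean {
  const [min, max] = RECOMMENDED_DILATION[mode];
  return dilation >= min && dilation <= max;
}

console.log(isRecommendedDilation("EDIT_MODE_OUTPAINT", 0.02));
console.log(isRecommendedDilation("EDIT_MODE_BGSWAP", 0.01));
```

Values outside the range are not necessarily invalid; the table only records the recommendations.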

3. Customization Modes

CUSTOMIZATION_SUBJECT

Creates new images featuring subjects from reference images.

Prompt Structure:

  • user: Prompt taken from text content.
  • user: Image attachments. The first image is the control image; additional images are the subjects, each with a subject description
  • negative: Elements to avoid

Note: The first image is always processed as the control image, while subsequent images are treated as subjects.
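
The ordering rule in the note can be illustrated with a small helper that separates the control image from the subject images. ImageAttachment, SubjectReferences, and splitSubjectImages are hypothetical names, and requiring at least one subject image is an assumption made for this sketch.

```typescript
// Hypothetical attachment record; only the ordering rule comes from this guide.
interface ImageAttachment {
  name: string;
  description?: string;
}

interface SubjectReferences {
  control: ImageAttachment;
  subjects: ImageAttachment[];
}

// The first image is the control image; every later image is a subject.
function splitSubjectImages(images: ImageAttachment[]): SubjectReferences {
  const control = images[0];
  if (control === undefined || images.length < 2) {
    // Assumption: at least one subject image must follow the control image.
    throw new Error("expected a control image followed by at least one subject image");
  }
  return { control, subjects: images.slice(1) };
}

const subjectRefs = splitSubjectImages([
  { name: "pose-edges.png" },
  { name: "dog-1.png", description: "a golden retriever" },
  { name: "dog-2.png", description: "a golden retriever" },
]);
console.log(subjectRefs.control.name, subjectRefs.subjects.length);
```

Because position decides the role, attaching a subject image first would cause it to be treated as the control image.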

CUSTOMIZATION_STYLE

Creates new images in the style of reference images.

Prompt Structure:

  • user: Prompt taken from text content.
  • user: Style reference images + style description
  • negative: Elements to avoid

Options:

  • styleDescription can be provided to clarify the style

CUSTOMIZATION_CONTROLLED

Creates images guided by control images (such as face meshes, Canny edges, or scribbles).

Prompt Structure:

  • user: Prompt taken from text content.
  • user: Control image attachment, or a source image from which the control image is generated.
  • negative: Elements to avoid

Options:

  • controlType: Choose between CONTROL_TYPE_FACE_MESH, CONTROL_TYPE_CANNY, or CONTROL_TYPE_SCRIBBLE
  • enableControlImageComputation: Set to true to let the model compute the control image
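
A minimal sketch of how these two options relate to the attached image: if the attachment is already a control image (for example an edge map), computation stays off; if it is an ordinary photo, the model is asked to compute the control image. The ControlledOptions interface and buildControlledOptions helper are hypothetical; the option names and control type values are the ones listed above.

```typescript
type ControlType =
  | "CONTROL_TYPE_FACE_MESH"
  | "CONTROL_TYPE_CANNY"
  | "CONTROL_TYPE_SCRIBBLE";

// Hypothetical options object using the option names from this guide.
interface ControlledOptions {
  controlType: ControlType;
  enableControlImageComputation: boolean;
}

// attachmentIsControlImage: true when the attached image is already a
// face mesh, edge map, or scribble; false when it is a regular source image.
function buildControlledOptions(
  controlType: ControlType,
  attachmentIsControlImage: boolean,
): ControlledOptions {
  return {
    controlType,
    enableControlImageComputation: !attachmentIsControlImage,
  };
}

console.log(buildControlledOptions("CONTROL_TYPE_CANNY", false));
console.log(buildControlledOptions("CONTROL_TYPE_SCRIBBLE", true));
```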

CUSTOMIZATION_INSTRUCT

Modifies reference images according to instructions.

Prompt Structure:

  • user: Reference image + specific modification instructions
  • negative: Elements to avoid in the modification