Planning Phase 13 - huqianghui/AI-Coach-vibe-coding GitHub Wiki
Auto-generated from `.planning/phases/13-voice-live-instance-agent-voice-management`
Last synced: 2026-04-13
| # | Plan File | Status |
|---|---|---|
| 13-01 | 13-01-PLAN.md | Complete |
| 13-02 | 13-02-PLAN.md | Complete |
| 13-03 | 13-03-PLAN.md | Complete |
Click to expand research notes
Researched: 2026-04-03 Domain: Azure Voice Live instance management, Agent voice mode configuration, AI Foundry portal workflow automation Confidence: HIGH
Phase 13 aims to provide admin-level management of Voice Live instances (model selection) and Agent voice mode configuration, matching the AI Foundry portal's end-to-end workflow. The critical discovery from research is that Azure Voice Live is a fully managed service with no separate "instance" REST API. There is no Azure REST endpoint to "create a Voice Live instance" or "bind Voice Live to an agent" -- these are UI-only concepts in the AI Foundry portal that translate to:
1. Model selection: The generative AI model (gpt-4o, gpt-4.1, gpt-5, etc.) is specified as a query parameter at WebSocket connection time. The "instance" is the active WebSocket session itself.
2. Bind to agent: Writing `microsoft.voice-live.configuration` metadata on the agent (already implemented in `agent_sync_service.py` via `build_voice_live_metadata()`).
3. Enable Voice mode: Presence of the `microsoft.voice-live.configuration` metadata key on the agent signals that voice mode is enabled.
4. Speech/Avatar config: Stored as JSON in the agent metadata, sent to Voice Live API via `session.update` at connection time.
The existing codebase (Phase 11-12) already implements items 2-4 via build_voice_live_metadata() and per-HCP token broker. What Phase 13 adds is: (a) explicit admin UI for selecting the Voice Live generative AI model per-HCP, (b) a dedicated Voice Live management page showing the full HCP-to-Voice-Live binding chain, (c) upgrading azure-ai-projects from 1.0.0b12 to 2.0.1 (the stable release), and (d) comprehensive testing of the automated full chain.
Primary recommendation: Since no new Azure APIs need to be called (Voice Live has no instance management API), this phase is primarily a UI/UX enhancement phase: add model selection to HCP profiles, build a Voice Live management overview admin page, and ensure the existing agent metadata sync covers all Voice Live configuration fields.
<phase_requirements>
| ID | Description | Research Support |
|---|---|---|
| VOICE-13-01 | Admin can create/manage Voice Live instances (select generative AI model) | No separate API for "instances" -- model selection is a config choice stored per-HCP and sent as WebSocket param. Add voice_live_model column to HCP profile, admin selects from supported model list. |
| VOICE-13-02 | Admin can bind Voice Live to HCP Agents | Already implemented via build_voice_live_metadata() in agent_sync_service.py. Phase 13 adds admin UI visibility: show binding status on management page, allow re-binding. |
| VOICE-13-03 | Admin can enable Voice mode on agents and configure speech input/output/avatar parameters | Per-HCP voice/avatar/speech config already stored (Phase 12). Voice mode toggle = voice_live_enabled flag. Phase 13 adds management overview showing enabled/disabled status per HCP. |
| VOICE-13-04 | Platform automates HCP Profile -> Agent -> Voice Live -> Voice mode -> Speech/Avatar config chain | Automation exists in sync_agent_for_profile(). Phase 13 adds: (a) model selection in the chain, (b) admin visibility of the full chain status, (c) batch re-sync capability. |
| VOICE-13-05 | Matching AI Foundry portal workflow end-to-end | Admin management page mirrors AI Foundry portal steps: model selection, agent binding status, voice mode status, speech/avatar config summary per HCP. |
</phase_requirements>
| Library | Version | Purpose | Why Standard |
|---|---|---|---|
| azure-ai-projects | 2.0.1 | Agent CRUD with metadata (stable release) | Project standard, currently on 1.0.0b12 -- must upgrade |
| SQLAlchemy 2.0 (async) | >=2.0.0 | ORM model extension for voice_live_model field | Already in use |
| Alembic | >=1.13.0 | Migration for new voice_live_model column | Required by project rules |
| Pydantic v2 | >=2.0.0 | Schema extension | Already in use |
| @azure/ai-voicelive | 1.0.0-beta.3 | Frontend Voice Live SDK | Already installed |
| React 18 + TypeScript | strict | Admin management UI | Already in use |
| Library | Version | Purpose | When to Use |
|---|---|---|---|
| TanStack Query v5 | ^5.60.0 | Server state for management page queries | Admin page data fetching |
| react-i18next | existing | i18n for new admin UI text | All new UI strings |
| sonner | existing | Toast notifications for sync operations | Batch re-sync feedback |
| lucide-react | >=0.460.0 | Icons for management page | Status indicators |
| Instead of | Could Use | Tradeoff |
|---|---|---|
| Per-HCP model column | Global model config only | Global is simpler but doesn't match AI Foundry portal where each agent can have a different model |
| Management overview page | Extend existing HCP table | Separate page provides clearer workflow visualization matching AI Foundry portal steps |
Installation:
# Backend -- upgrade azure-ai-projects to stable 2.0.1
cd backend
pip install "azure-ai-projects>=2.0.1"
# Frontend -- no new packages needed

Version verification:
- `azure-ai-projects`: Currently installed `1.0.0b12`, must upgrade to `2.0.1` (verified via `pip index versions`)
- All other packages already installed from previous phases
backend/
alembic/versions/
k14a_add_voice_live_model_to_hcp_profile.py # NEW: migration
app/
models/hcp_profile.py # EXTEND: voice_live_model column
schemas/hcp_profile.py # EXTEND: voice_live_model field
schemas/voice_live.py # EXTEND: model list response
services/voice_live_service.py # EXTEND: per-HCP model in token
services/agent_sync_service.py # EXTEND: model in metadata, SDK upgrade compat
api/voice_live.py # EXTEND: management endpoints
api/hcp_profiles.py # NO CHANGE (already handles all fields)
frontend/
src/
types/hcp.ts # EXTEND: voice_live_model field
types/voice-live.ts # EXTEND: model list type
pages/admin/voice-live-management.tsx # NEW: Voice Live management overview page
components/admin/voice-live-chain-card.tsx # NEW: Per-HCP chain status card
components/admin/voice-avatar-tab.tsx # EXTEND: model selection dropdown
hooks/use-voice-live-management.ts # NEW: TanStack Query hooks for management
api/voice-live.ts # EXTEND: management API calls
public/locales/en-US/admin.json # EXTEND: management page strings
public/locales/zh-CN/admin.json # EXTEND: management page strings
What: Each HCP profile stores the generative AI model to use for Voice Live sessions. This model is passed as a query parameter when connecting the WebSocket.
When to use: Admin configures per-HCP Voice Live settings.
Example:
# Source: Azure Voice Live docs -- model is a WebSocket connection parameter
# wss://<resource>.services.ai.azure.com/voice-live/realtime?api-version=2025-10-01&model=gpt-4.1
# HCP profile stores the model choice
voice_live_model: Mapped[str] = mapped_column(
    String(50), default="gpt-4o"
)  # gpt-4o, gpt-4.1, gpt-5, gpt-realtime, phi4-mini, etc.

What: Admin page showing the 4-step workflow chain for each HCP: Profile -> Agent -> Voice Live Config -> Speech/Avatar.
When to use: Voice Live management admin page.
Example:
// Each HCP shows its chain status:
// Step 1: HCP Profile (always exists)
// Step 2: Agent (synced/pending/failed/none)
// Step 3: Voice Live Config (voice_live_enabled + model selection)
// Step 4: Speech/Avatar (voice_name + avatar_character configured)
interface ChainStatus {
  hcpId: string;
  hcpName: string;
  agentStatus: "synced" | "pending" | "failed" | "none";
  agentId: string;
  voiceLiveEnabled: boolean;
  voiceLiveModel: string;
  voiceName: string;
  avatarCharacter: string;
  avatarStyle: string;
}

What: Admin can trigger batch re-sync of all HCP agents to update Voice Live metadata after changing model or config.
When to use: After changing Voice Live model for multiple HCPs, or after SDK upgrade.
Example:
# Source: Existing sync_agent_for_profile() pattern in agent_sync_service.py
async def batch_resync_agents(
    db: AsyncSession,
    hcp_profile_ids: list[str] | None = None,
) -> dict:
    """Re-sync all (or selected) HCP agents with current metadata.
    Useful after changing Voice Live model settings or upgrading SDK.
    Returns summary of results: {synced: N, failed: N, errors: [...]}
    """
    # Prefetch config once for all profiles
    endpoint, api_key, model = await prefetch_sync_config(db)
    results = {"synced": 0, "failed": 0, "errors": []}
    # ... iterate profiles and call sync_agent_for_profile()
    return results

- Creating a separate "Voice Live Instance" table: Azure has no concept of persistent instances -- Voice Live is session-based. Store the model choice on the HCP profile.
- Calling a non-existent "create Voice Live instance" API: No such API exists. The portal UI creates the WebSocket session on-demand.
- Ignoring the SDK version upgrade: The installed `1.0.0b12` is a beta; `2.0.1` is stable and may include breaking changes to the agent API surface.
- Storing model config globally only: Each HCP can use a different model in the AI Foundry portal. Per-HCP model selection is required.
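The batch re-sync pattern described earlier can be sketched end to end as follows. This is a minimal, self-contained illustration and not the project's implementation: `batch_resync` and the injected `sync_one` callable are hypothetical stand-ins for `batch_resync_agents()` and `sync_agent_for_profile()`, whose real signatures are not shown in these notes.

```python
import asyncio
from typing import Awaitable, Callable


async def batch_resync(
    profile_ids: list[str],
    sync_one: Callable[[str], Awaitable[None]],
) -> dict:
    """Re-sync each profile in turn and collect a summary.

    sync_one is a hypothetical stand-in for the per-profile sync call.
    """
    results: dict = {"synced": 0, "failed": 0, "errors": []}
    for profile_id in profile_ids:
        try:
            await sync_one(profile_id)
            results["synced"] += 1
        except Exception as exc:  # one failing profile must not abort the batch
            results["failed"] += 1
            results["errors"].append(
                {"hcp_profile_id": profile_id, "error": str(exc)}
            )
    return results


async def _demo_sync(profile_id: str) -> None:
    # Simulated sync that fails for one specific id (illustration only)
    if profile_id == "hcp-2":
        raise RuntimeError("agent update rejected")


summary = asyncio.run(batch_resync(["hcp-1", "hcp-2", "hcp-3"], _demo_sync))
print(summary)
```

Collecting errors per profile rather than raising on the first failure is what lets the admin UI report a partial result in a single toast.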
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| Voice Live model list | Hardcoded incomplete list | Const from Azure docs (12 models) | Official list with pricing tiers |
| Agent metadata sync | Direct REST calls | agent_sync_service.sync_agent_for_profile() | Already handles create/update with metadata |
| Voice Live config JSON | Manual JSON construction | build_voice_live_metadata() | Already builds the microsoft.voice-live.configuration JSON |
| Chain status visualization | Custom complex component | Card grid with status badges | Simple, reuses existing badge/card patterns |
| Admin page routing | New router config | Add to existing admin routes | Follow existing admin page pattern |
What goes wrong: The agent_sync_service.py uses client.agents.create_version(), client.agents.get(), client.agents.delete() which may have different signatures in v2.0.1.
Why it happens: The project is on beta 1.0.0b12, but the user memory note calls for >=2.0.1. The stable release 2.0.1 may have different class/method names.
How to avoid: Before upgrading, check the v2.0 migration guide. Test create_version, get, delete methods against the new API surface. The PromptAgentDefinition import path may change.
Warning signs: ImportError or AttributeError on server startup after pip upgrade.
What goes wrong: The Voice Live model (e.g., gpt-4.1) is confused with the Agent model (e.g., gpt-4o used for agent instructions). These are different.
Why it happens: The ServiceConfig.model_or_deployment for azure_voice_live already stores a model/agent-mode config. The agent's own model (set in PromptAgentDefinition) is separate from the Voice Live WebSocket model.
How to avoid: Voice Live model is the model parameter passed in the WebSocket URL. In agent mode, the agent already has its own LLM model. The Voice Live model may be used for non-agent sessions or as the orchestration model. Store as voice_live_model on HCP profile separately from the agent's model field.
Warning signs: Admin selects gpt-4.1 for Voice Live but the agent still uses gpt-4o for its own responses.
What goes wrong: When connecting in agent mode, the model query parameter is not used -- the agent's own model is used instead. Passing both model and agent_id may cause errors.
Why it happens: Azure Voice Live docs say: "The only difference is the required model query parameter, or, when using the Agent service, the agent_id and project_id parameters." This is an either/or -- not both.
How to avoid: When HCP has a synced agent (agent_id), use agent mode (no model param). When HCP has no agent, use model mode with the selected voice_live_model. The token broker already handles this distinction.
Warning signs: WebSocket connection fails with "invalid parameters" when both model and agent_id are provided.
What goes wrong: Agent metadata values are limited to 512 characters per key. Complex Voice Live config JSON may exceed this.
Why it happens: Azure AI agent metadata has a 512-character value limit.
How to avoid: Already handled by _chunk_metadata_value() in agent_sync_service.py, which splits long values into key, key.1, key.2, etc.
Warning signs: Agent creation/update fails with metadata size error.
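To illustrate the `key`, `key.1`, `key.2` scheme, here is a minimal sketch of splitting and rejoining a long metadata value. It is not the body of `_chunk_metadata_value()`, which is not reproduced in these notes; `chunk_metadata_value` and `join_metadata_value` are hypothetical names, and the `instructions` field exists only to pad the JSON past the 512-character limit.

```python
import json

MAX_VALUE_LEN = 512  # per-key value limit quoted above
KEY = "microsoft.voice-live.configuration"


def chunk_metadata_value(key: str, value: str, limit: int = MAX_VALUE_LEN) -> dict[str, str]:
    """Split value into pieces of at most `limit` chars under key, key.1, key.2, ..."""
    pieces = [value[i : i + limit] for i in range(0, len(value), limit)] or [""]
    return {
        (key if index == 0 else f"{key}.{index}"): piece
        for index, piece in enumerate(pieces)
    }


def join_metadata_value(key: str, metadata: dict[str, str]) -> str:
    """Reassemble a value previously split by chunk_metadata_value."""
    parts = [metadata[key]]
    index = 1
    while f"{key}.{index}" in metadata:
        parts.append(metadata[f"{key}.{index}"])
        index += 1
    return "".join(parts)


config = {
    "voice": {"type": "azure-standard", "name": "en-US-AvaNeural", "temperature": 0.9},
    "turn_detection": {"type": "server_vad"},
    "instructions": "x" * 1200,  # padding for the demo only
}
raw = json.dumps(config, separators=(",", ":"))
chunks = chunk_metadata_value(KEY, raw)
print(sorted(chunks))
```

Whatever reads the metadata back has to apply the matching join before parsing the JSON.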
What goes wrong: Alembic migration fails on SQLite when adding the voice_live_model column.
Why it happens: Standard project gotcha -- SQLite needs batch_alter_table with server_default.
How to avoid: Wrap the column addition in `with op.batch_alter_table("hcp_profiles") as batch_op:` and pass `server_default=sa.text("'gpt-4o'")` to the new column.
Warning signs: Migration fails locally on SQLite but works on PostgreSQL.
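The underlying SQLite behaviour can be reproduced with the standard-library `sqlite3` module. This is not the Alembic migration itself, only a sketch of why a server-side default matters. It assumes the new column is `NOT NULL` (implied by the `Mapped[str]` annotation), and the table here is a cut-down stand-in for `hcp_profiles`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hcp_profiles (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO hcp_profiles (id, name) VALUES ('hcp-1', 'Dr. Example')")

# Without a default, SQLite refuses to add a NOT NULL column.
try:
    conn.execute(
        "ALTER TABLE hcp_profiles ADD COLUMN voice_live_model VARCHAR(50) NOT NULL"
    )
    added_without_default = True
except sqlite3.OperationalError as exc:
    added_without_default = False
    print("rejected:", exc)

# With a server-side default the statement succeeds and existing rows get the value.
conn.execute(
    "ALTER TABLE hcp_profiles "
    "ADD COLUMN voice_live_model VARCHAR(50) NOT NULL DEFAULT 'gpt-4o'"
)
backfilled = conn.execute("SELECT voice_live_model FROM hcp_profiles").fetchone()[0]
print("backfilled value:", backfilled)
```

A Python-side `default="gpt-4o"` on the ORM column only applies to rows inserted through SQLAlchemy, which is why the migration needs `server_default` as well.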
# Source: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live
# Verified: 2026-04-03, docs updated 2026-02-04
VOICE_LIVE_MODELS = {
    # Pro tier
    "gpt-realtime": {"tier": "pro", "description": "GPT real-time + Azure TTS"},
    "gpt-4o": {"tier": "pro", "description": "GPT-4o + Azure STT/TTS"},
    "gpt-4.1": {"tier": "pro", "description": "GPT-4.1 + Azure STT/TTS"},
    "gpt-5": {"tier": "pro", "description": "GPT-5 + Azure STT/TTS"},
    "gpt-5-chat": {"tier": "pro", "description": "GPT-5 chat + Azure STT/TTS"},
    # Basic tier
    "gpt-realtime-mini": {"tier": "basic", "description": "GPT mini real-time + Azure TTS"},
    "gpt-4o-mini": {"tier": "basic", "description": "GPT-4o mini + Azure STT/TTS"},
    "gpt-4.1-mini": {"tier": "basic", "description": "GPT-4.1 mini + Azure STT/TTS"},
    "gpt-5-mini": {"tier": "basic", "description": "GPT-5 mini + Azure STT/TTS"},
    # Lite tier
    "gpt-5-nano": {"tier": "lite", "description": "GPT-5 nano + Azure STT/TTS"},
    "phi4-mm-realtime": {"tier": "lite", "description": "Phi4-mm realtime + Azure TTS"},
    "phi4-mini": {"tier": "lite", "description": "Phi4-mini + Azure STT/TTS"},
}

// Source: Azure Voice Live overview docs, organized by pricing tier
const VOICE_LIVE_MODEL_OPTIONS = [
  // Pro tier
  { value: "gpt-realtime", label: "GPT Realtime", tier: "pro" },
  { value: "gpt-4o", label: "GPT-4o", tier: "pro" },
  { value: "gpt-4.1", label: "GPT-4.1", tier: "pro" },
  { value: "gpt-5", label: "GPT-5", tier: "pro" },
  { value: "gpt-5-chat", label: "GPT-5 Chat", tier: "pro" },
  // Basic tier
  { value: "gpt-realtime-mini", label: "GPT Realtime Mini", tier: "basic" },
  { value: "gpt-4o-mini", label: "GPT-4o Mini", tier: "basic" },
  { value: "gpt-4.1-mini", label: "GPT-4.1 Mini", tier: "basic" },
  { value: "gpt-5-mini", label: "GPT-5 Mini", tier: "basic" },
  // Lite tier
  { value: "gpt-5-nano", label: "GPT-5 Nano", tier: "lite" },
  { value: "phi4-mm-realtime", label: "Phi4-MM Realtime", tier: "lite" },
  { value: "phi4-mini", label: "Phi4 Mini", tier: "lite" },
] as const;

# Source: Extend existing voice_live_service.py
# In model mode: use HCP's voice_live_model (default "gpt-4o")
# In agent mode: model is NOT sent (agent has its own model)
# In get_voice_live_token():
if is_agent:
    # Agent mode: no model param needed -- agent has its own LLM
    model_for_session = ""
else:
    # Model mode: use HCP-level or global config model
    model_for_session = (
        profile.voice_live_model
        if hcp_profile_id and profile
        else mode_info.get("model", "gpt-4o")
    )

// Each HCP shows its full chain status
// Reuses existing Card, Badge, and cn() patterns
<Card>
  <CardHeader>
    <CardTitle>{hcp.name} - {hcp.specialty}</CardTitle>
  </CardHeader>
  <CardContent>
    <div className="flex items-center gap-3">
      {/* Step 1: Agent */}
      <Badge variant={agentOk ? "default" : "destructive"}>
        Agent: {hcp.agent_sync_status}
      </Badge>
      {/* Step 2: Voice Live */}
      <Badge variant={hcp.voice_live_enabled ? "default" : "secondary"}>
        Voice: {hcp.voice_live_enabled ? hcp.voice_live_model : "Disabled"}
      </Badge>
      {/* Step 3: Speech */}
      <Badge variant="outline">
        {hcp.voice_name}
      </Badge>
      {/* Step 4: Avatar */}
      <Badge variant="outline">
        {hcp.avatar_character}/{hcp.avatar_style}
      </Badge>
    </div>
  </CardContent>
</Card>

# Source: Extend build_voice_live_metadata() in agent_sync_service.py
# The model is NOT stored in agent metadata -- it's a runtime connection parameter.
# Agent metadata stores speech/avatar config that the Voice Live session needs.
# Model selection is handled by the token broker at session start.
#
# Existing metadata format (already works):
# {
# "microsoft.voice-live.configuration": '{"voice":{"type":"azure-standard","name":"en-US-AvaNeural","temperature":0.9},"turn_detection":{"type":"server_vad"}}'
# }

| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| Single global model for Voice Live | Per-HCP model selection | Phase 13 | Each HCP can use different model tier |
| azure-ai-projects 1.0.0b12 (beta) | azure-ai-projects 2.0.1 (stable) | Phase 13 | Stable Agent Registry API, breaking changes possible |
| No admin visibility of Voice Live chain | Management overview page | Phase 13 | Admin sees full HCP -> Agent -> VL -> Speech chain |
| Implicit Voice Live "instance" creation | Explicit model selection in admin | Phase 13 | Matches AI Foundry portal workflow |
Azure Voice Live API supported models (current, verified 2026-04-03):
- Pro: gpt-realtime, gpt-4o, gpt-4.1, gpt-5, gpt-5-chat
- Basic: gpt-realtime-mini, gpt-4o-mini, gpt-4.1-mini, gpt-5-mini
- Lite: gpt-5-nano, phi4-mm-realtime, phi4-mini
Critical insight: Voice Live has no instance management API. The AI Foundry portal "Voice Live playground" is a UI experience that connects a WebSocket with model selection. There is no REST API to "create a Voice Live instance" that persists on Azure. The "instance" is simply a configuration choice (model + speech + avatar) that exists in the platform's database and is applied at WebSocket connection time.
- SDK upgrade impact on agent_sync_service.py
  - What we know: `azure-ai-projects` 2.0.1 is stable; installed version is 1.0.0b12 (beta). The `PromptAgentDefinition`, `client.agents.create_version()`, `client.agents.get()`, `client.agents.delete()` are the methods used.
  - What's unclear: Whether v2.0.1 has the same API surface as v1.0.0b12. Method signatures may have changed.
  - Recommendation: Upgrade in a dedicated plan. Test each method. If API surface changed, adapt `agent_sync_service.py`. The memory note says v2.0+ uses `client.agents.create_version()` which matches current code -- likely compatible.
- Per-HCP model vs global config model
  - What we know: Currently `model_or_deployment` on the `azure_voice_live` ServiceConfig stores the global model/agent config. HCP profiles don't have a `voice_live_model` field.
  - What's unclear: Whether the global config model should serve as a default that HCP-level overrides, or whether per-HCP model should be the only source.
  - Recommendation: Add `voice_live_model` to HcpProfile with default "gpt-4o". Global `model_or_deployment` remains as system-level agent/model mode config. Per-HCP `voice_live_model` is used for model-mode sessions. Agent-mode sessions ignore it (agent has its own LLM).
- Management page as separate route or tab in existing HCP page
  - What we know: AI Foundry portal has a separate "Voice Live" section showing all Voice Live configurations per agent.
  - What's unclear: Whether a new admin page is needed or the existing HCP table is sufficient.
  - Recommendation: New admin route `/admin/voice-live` showing chain overview across all HCPs, with links to individual HCP editors. The HCP editor already has the Voice & Avatar tab for per-HCP config.
- Async everywhere: all backend functions must be `async def`
- Pydantic v2 schemas with `model_config = ConfigDict(from_attributes=True)`
- Route ordering: static paths before parameterized (`/{id}`)
- Service layer holds business logic, routers only handle HTTP
- No raw SQL -- use SQLAlchemy ORM
- TypeScript strict mode: no `any`, no unused variables
- TanStack Query hooks per domain, no inline useQuery
- `cn()` for conditional class composition
- i18n: all UI text externalized via react-i18next
- Conventional commits: `feat:`, `fix:`, `docs:`, `test:`
- NEVER modify schema without Alembic migration
- All models use TimestampMixin
- batch_alter_table with server_default for SQLite compatibility
- Backend: `ruff check .`, `ruff format --check .`, `pytest -v`
- Frontend: `npx tsc -b`, `npm run build`
- Use `python3` not `python` in local commands
- Unit tests MUST use real .env credentials when available
- Always complete full workflow: fix -> commit -> push -> CI verify
- Each phase needs >=95% test coverage
- All UI communication in Chinese with user, English in code/commits
| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| azure-ai-projects | Agent sync service | Installed (wrong version) | 1.0.0b12 | Must upgrade to 2.0.1 |
| @azure/ai-voicelive | Frontend Voice Live SDK | Installed | 1.0.0-beta.3 | -- |
| azure-identity | Entra ID auth | Installed | -- | API key fallback |
| Python 3.11+ | Backend | Available | 3.11+ | -- |
| Node 20+ | Frontend | Available | 20+ | -- |
Missing dependencies with no fallback:
- `azure-ai-projects` must be upgraded from `1.0.0b12` to `>=2.0.1` (stable release)
Missing dependencies with fallback:
- None -- all runtime dependencies are already available
- Azure Voice Live how-to: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-how-to -- Session configuration, authentication, model selection, avatar config, voice config. Updated 2026-03-16.
- Azure Voice Live overview: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live -- Supported models list (12 models in 3 tiers), pricing, architecture. Updated 2026-02-04.
- Azure Voice Live customization: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-how-to-customize -- Custom speech, custom voice, personal voice, custom avatar config. Updated 2026-04-02.
- Existing codebase files (all read directly):
  - `backend/app/services/agent_sync_service.py` -- Agent CRUD with `build_voice_live_metadata()`, `sync_agent_for_profile()`
  - `backend/app/services/voice_live_service.py` -- Token broker with per-HCP resolution
  - `backend/app/models/hcp_profile.py` -- ORM model with 13 voice/avatar columns
  - `backend/app/schemas/hcp_profile.py` -- Pydantic schemas with all voice/avatar fields
  - `backend/app/schemas/voice_live.py` -- Token response and status schemas
  - `backend/app/api/voice_live.py` -- Token broker and status endpoints
  - `backend/app/models/service_config.py` -- Global config with model_or_deployment
  - `backend/app/services/agents/adapters/azure_voice_live.py` -- parse_voice_live_mode(), encode_voice_live_mode()
  - `backend/app/services/region_capabilities.py` -- Region availability maps
  - `frontend/src/hooks/use-voice-live.ts` -- Voice Live WebSocket session with buildSessionConfig()
  - `frontend/src/hooks/use-avatar-stream.ts` -- Avatar WebRTC connection
  - `frontend/src/components/admin/voice-avatar-tab.tsx` -- Voice/Avatar admin settings with live test
  - `frontend/src/pages/admin/hcp-profile-editor.tsx` -- Tabbed HCP editor
  - `frontend/src/types/hcp.ts` -- HCP TypeScript types
  - `frontend/src/types/voice-live.ts` -- Voice Live types
- pip index verification: `azure-ai-projects` latest stable is `2.0.1`, installed is `1.0.0b12`
- User memory (2026-04-03): AI Foundry Voice Live complete workflow with model selection and agent binding steps
- SDK v2.0.1 API surface compatibility with v1.0.0b12 -- needs runtime verification during upgrade plan
Confidence breakdown:
- Standard stack: HIGH - all libraries already in project, SDK upgrade path clear
- Architecture: HIGH - Voice Live has no instance management API (verified from official docs), pattern is config-driven
- Pitfalls: HIGH - SDK upgrade risk identified, model/agent-mode distinction well-understood from docs
- Azure API structure: HIGH - verified from official docs updated 2026-03-16 and 2026-04-02
Research date: 2026-04-03 Valid until: 2026-05-03 (stable -- Azure Voice Live API is GA, model list may expand but existing models remain)
Click to expand UI spec
Visual and interaction contract for the Voice Live Instance & Agent Voice Management phase. Generated by gsd-ui-researcher, verified by gsd-ui-checker.
Design Reference: Azure AI Foundry portal's 3-step Voice Live workflow (user-provided screenshots 2026-04-03):
- Voice Live Playground — Select Generative AI Model (Pro/Standard/Lite tiers), configure Response instruction, Temperature, Proactive engagement, Speech input/output
- Add to Agent — Dropdown binding Voice Live to existing Agents (HCP profiles)
- Agent Page — Voice mode toggle, Instructions, Tools, Knowledge, Memory, Guardrail, Avatar preview, Speech input (Language + Auto-detect), Speech output (Voice selection)
Our admin UI should mirror this workflow while adapting to our HCP-centric data model.
| Property | Value |
|---|---|
| Tool | none (Tailwind CSS v4 with @theme inline custom properties) |
| Preset | not applicable |
| Component library | Radix UI (via project @/components/ui/* wrappers) |
| Icon library | lucide-react >=0.460.0 |
| Font | Inter + Noto Sans SC (sans-serif), JetBrains Mono (monospace) |
Source: Existing frontend/src/styles/index.css @theme inline block, established in Phase 01. No new design system installations required. Continuity from Phase 12 UI-SPEC.
Declared values (must be multiples of 4):
| Token | Value | Usage in Phase 13 |
|---|---|---|
| xs | 4px | Icon gaps, inline badge padding within chain status cards (gap-1), tier label-to-model gap |
| sm | 8px | Compact element spacing, chain step icon-to-text gap (gap-2), model dropdown item padding |
| md | 16px | Default element spacing, card content padding, form field vertical gaps (space-y-4) |
| lg | 24px | Gap between chain cards on management page (gap-6), section padding within cards |
| xl | 32px | Page header-to-content gap, gap between major sections |
| 2xl | 48px | Page-level top/bottom padding |
| 3xl | 64px | Not used in this phase |
Exceptions: none
| Role | Size | Weight | Line Height | Phase 13 Usage |
|---|---|---|---|---|
| Badge/Tier | 12px (text-xs) | 400 or 600 | 1.5 | Model tier labels (Pro/Basic/Lite), chain step status badges, Voice Live model badge in HCP table |
| Body | 14px (text-sm) | 400 (normal) | 1.5 | Chain card description text, form field values, management page table cells, Select dropdown items |
| Label | 14px (text-sm) | 400 (normal) | 1.5 | FormLabel for Voice Live Model select, chain step labels, management page column headers |
| Heading | 16px (text-base) | 600 (semibold) | 1.5 | CardTitle for chain cards, page section headings, management page title |
| Page Title | 24px (text-2xl) | 600 (semibold) | 1.5 | Voice Live Management page title |
Two weights only: 400 (normal) for body text and labels, 600 (semibold) for headings. Follows Phase 12 contract.
| Role | Value | Usage in Phase 13 |
|---|---|---|
| Dominant (60%) | var(--background) #FFFFFF | Page background, card backgrounds, management overview background |
| Secondary (30%) | var(--card) #FFFFFF / var(--muted) #ececf0 | Chain status cards, table header row bg-slate-50/50, tier group separator backgrounds, Select trigger background |
| Accent (10%) | var(--primary) #1E40AF | Batch Re-sync primary button (bg-primary), active chain step indicator, "View Details" link text |
| Destructive | var(--destructive) #EF4444 | Failed chain step indicator, error status in chain card |
Accent reserved for:
- Batch Re-sync All button (`bg-primary`)
- Active/complete chain step connector line accent
- "Edit HCP" link text in chain cards
Additional semantic colors used in this phase (already established):
| Token | Value | Usage |
|---|---|---|
| Green | bg-green-500 (dot) / bg-green-100 text-green-700 (badge) | Complete chain step indicator, "synced" agent status, "enabled" Voice Live status |
| Amber | bg-amber-500 (dot) / bg-amber-100 text-amber-700 (badge) | Partial chain completion, "pending" agent status |
| Red | bg-destructive (dot) / bg-red-100 text-red-700 (badge) | Failed chain step, "failed" agent status |
| Muted foreground | var(--muted-foreground) #717182 | "Disabled" Voice Live status text, "none" agent status, empty model selection |
| Blue tint | bg-blue-50 text-blue-700 | Pro tier model badge background |
| Slate tint | bg-slate-100 text-slate-700 | Basic tier model badge background |
| Neutral tint | bg-neutral-100 text-neutral-600 | Lite tier model badge background |
Source: Existing CSS custom properties in index.css, Phase 10 theme system. Tier-specific tints are new but follow the established badge color pattern (e.g., personality badge uses bg-slate-100 text-slate-700).
| Screen | Primary Focal Point | Rationale |
|---|---|---|
| Voice Live Management Page | Chain status overview grid | Admin needs to see all HCPs and their Voice Live chain status at a glance; this is the primary purpose of the page |
| HCP Profile Editor (Voice & Avatar tab) | Voice Live Model dropdown | New field added in this phase; the model selection is the core new capability that differentiates Phase 13 from Phase 12 |
| HCP Table | Voice Live Model badge (added to existing Voice & Avatar column) | Shows per-HCP model selection at list level for quick reference |
| Component | Location | Description |
|---|---|---|
| VoiceLiveChainCard | frontend/src/components/admin/voice-live-chain-card.tsx | Card showing the 4-step workflow chain for a single HCP: (1) HCP Profile, (2) Agent Sync Status, (3) Voice Live Config (enabled + model), (4) Speech/Avatar Config. Each step shows a colored status dot (green/amber/red/muted), label, and value. Steps are connected by a vertical or horizontal line. Card header shows HCP name + specialty. Card footer has "Edit HCP" link (navigates to /admin/hcp-profiles/{id}). Uses existing Card, Badge, and Tooltip components. |
| VoiceLiveManagementPage | frontend/src/pages/admin/voice-live-management.tsx | Admin page at route /admin/voice-live. Page header: title + description + Batch Re-sync All button. Body: responsive grid of VoiceLiveChainCard components (one per HCP with voice_live_enabled === true). Summary stats row at top: total HCPs, agents synced, Voice Live enabled, fully configured count. Loading state: 6 Skeleton cards in grid. Empty state: when no HCPs exist. Error state: standard error display with retry. |
| VoiceLiveModelSelect | frontend/src/components/admin/voice-live-model-select.tsx | Grouped Select dropdown for Voice Live generative AI model selection. Groups: Pro tier, Basic tier, Lite tier. Each group has a non-selectable group label. Each option shows model name. Uses existing Select/SelectContent/SelectItem components with SelectGroup and SelectLabel for tier grouping. Default: "gpt-4o". |
| Component | Changes |
|---|---|
| voice-avatar-tab.tsx | Add VoiceLiveModelSelect field inside the Voice Settings Card, positioned after the Voice Live Config toggle and before the Voice Name field. New FormField for voice_live_model using VoiceLiveModelSelect component. |
| hcp-profile-editor.tsx | Extend hcpSchema with voice_live_model: z.string().default("gpt-4o"). Update HcpFormValues type (auto-derived from schema). Map voice_live_model field in form reset and submit handlers. |
| hcp-table.tsx | Add model badge to the existing Voice & Avatar column. When voice_live_model is set, show it as a third inline Badge (variant="outline", text-xs) alongside voice name and avatar badges. Badge text shows short model name (e.g., "GPT-4o", "GPT-4.1"). |
| admin-layout.tsx | Add Voice Live management page to the sidebar. New item in "Configuration" group: { path: "/admin/voice-live", labelKey: "voiceLive", icon: Radio } (using Radio icon from lucide-react, representing live/broadcast). Positioned after "Azure Services" and before "Settings". |
| Component | Usage in Phase 13 |
|---|---|
| Card / CardHeader / CardTitle / CardContent / CardFooter | Chain card containers, management page summary cards |
| Badge | Chain step status badges, model tier badges, model name in HCP table |
| Select / SelectTrigger / SelectContent / SelectItem / SelectGroup / SelectLabel | Grouped model selection dropdown |
| Button | Batch Re-sync All, Edit HCP link, individual Re-sync per card |
| Tooltip / TooltipTrigger / TooltipContent | Agent ID display in chain cards, model tier description |
| Skeleton | Loading state for chain cards grid |
| Switch | Voice Live Config toggle (existing, unchanged) |
| Form / FormField / FormItem / FormLabel / FormControl / FormMessage | Model select form integration |
| toast (sonner) | Batch re-sync success/failure notifications |
| EmptyState | Management page when no HCPs have Voice Live enabled |
Trigger: Admin opens Voice & Avatar tab in HCP Profile Editor.
Behavior: A new FormField "Voice Live Model" appears inside the Voice Settings card, between the Voice Live Config toggle and the Voice Name field. The field renders VoiceLiveModelSelect, which is a grouped Select dropdown. Groups: Pro (gpt-realtime, gpt-4o, gpt-4.1, gpt-5, gpt-5-chat), Basic (gpt-realtime-mini, gpt-4o-mini, gpt-4.1-mini, gpt-5-mini), Lite (gpt-5-nano, phi4-mm-realtime, phi4-mini). Default: "gpt-4o". Selection persists as part of the HCP form and is saved with Save Profile.
Visual: Select trigger h-8 text-xs matching existing voice-avatar-tab dropdowns. Group labels rendered as non-interactive tier headers inside SelectContent: "Pro", "Basic", "Lite" each in text-xs font-semibold text-muted-foreground px-2 py-1.5. SelectItems show model display name (e.g., "GPT-4o", "GPT-4.1 Mini").
Constraint: Model selection is independent of agent mode vs model mode. In agent mode, the selected model is NOT sent to the WebSocket (agent has its own LLM). In model mode, the selected model IS sent as the WebSocket query parameter. This distinction is handled by the token broker, not the UI.
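The tier grouping and the agent-mode vs model-mode rule can be sketched as plain data plus two helpers. This is a minimal sketch, not the project's implementation: the names VOICE_LIVE_MODEL_GROUPS, findVoiceLiveTier, and resolveSessionModel are hypothetical, the "GPT-5 Mini" display label is an assumption (the spec only lists the id), and in the actual design the mode rule lives in the token broker rather than in frontend code.

```typescript
// Tier membership and the "gpt-4o" default come from the spec above.
// Labels the spec does not spell out (e.g. "GPT-5 Mini") are assumptions.
type VoiceLiveTier = "pro" | "basic" | "lite";

interface VoiceLiveModelOption {
  id: string; // value stored in voice_live_model
  label: string; // text shown in the SelectItem
}

interface VoiceLiveModelGroup {
  tier: VoiceLiveTier;
  models: VoiceLiveModelOption[];
}

const DEFAULT_VOICE_LIVE_MODEL = "gpt-4o";

const VOICE_LIVE_MODEL_GROUPS: VoiceLiveModelGroup[] = [
  {
    tier: "pro",
    models: [
      { id: "gpt-realtime", label: "GPT Realtime" },
      { id: "gpt-4o", label: "GPT-4o" },
      { id: "gpt-4.1", label: "GPT-4.1" },
      { id: "gpt-5", label: "GPT-5" },
      { id: "gpt-5-chat", label: "GPT-5 Chat" },
    ],
  },
  {
    tier: "basic",
    models: [
      { id: "gpt-realtime-mini", label: "GPT Realtime Mini" },
      { id: "gpt-4o-mini", label: "GPT-4o Mini" },
      { id: "gpt-4.1-mini", label: "GPT-4.1 Mini" },
      { id: "gpt-5-mini", label: "GPT-5 Mini" },
    ],
  },
  {
    tier: "lite",
    models: [
      { id: "gpt-5-nano", label: "GPT-5 Nano" },
      { id: "phi4-mm-realtime", label: "Phi4-MM Realtime" },
      { id: "phi4-mini", label: "Phi4 Mini" },
    ],
  },
];

// Returns the tier a model id belongs to, or undefined for unknown ids.
function findVoiceLiveTier(modelId: string): VoiceLiveTier | undefined {
  for (const group of VOICE_LIVE_MODEL_GROUPS) {
    for (const option of group.models) {
      if (option.id === modelId) return group.tier;
    }
  }
  return undefined;
}

// Agent mode: no model is sent (the agent has its own LLM).
// Model mode: the selected model, or the default when none is stored, is sent.
function resolveSessionModel(
  mode: "agent" | "model",
  selectedModel?: string,
): string | undefined {
  if (mode === "agent") return undefined;
  return selectedModel ?? DEFAULT_VOICE_LIVE_MODEL;
}
```

Keeping the groups as data lets the dropdown render one SelectGroup per entry, and the same list can back the short labels shown elsewhere in the admin UI.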
Trigger: Admin clicks "Voice Live" in the admin sidebar.
Behavior: Navigates to /admin/voice-live. Page loads all HCP profiles via existing useHcpProfiles() hook. Displays summary statistics row and chain card grid. Page uses AdminLayout with standard sidebar and header.
Visual: Page title "Voice Live Management" / "Voice Live 管理" at text-2xl font-semibold. Description text below at text-sm text-muted-foreground. Summary stats as 4 small stat cards in a row (grid grid-cols-2 md:grid-cols-4 gap-4). Chain cards grid below: grid grid-cols-1 md:grid-cols-2 xl:grid-cols-3 gap-6.
Trigger: Management page renders a VoiceLiveChainCard for each HCP.
Behavior: Each card shows 4 chain steps in order:
- HCP Profile -- always green (profile exists). Shows HCP name.
- Agent -- reflects agent_sync_status: green (synced), amber (pending), red (failed), muted (none). Shows agent_id or status label.
- Voice Live -- reflects voice_live_enabled + voice_live_model: green (enabled + model selected), muted (disabled). Shows model name or "Disabled".
- Speech/Avatar -- reflects voice_name + avatar_character: green (both configured), amber (partial -- only voice or only avatar), muted (defaults only). Shows voice short name + avatar character/style.
Steps are connected by a vertical line on the left side. Each step is a row with: status dot (size-2 rounded-full) | step label (text-xs text-muted-foreground) | step value (text-sm).
Visual: Card uses standard Card component. Steps stacked vertically with space-y-3. Vertical connector line: border-l-2 between steps, color matches overall chain health (green if all complete, amber if partial, muted if incomplete). Card header: HCP name (font-medium) + specialty (text-xs text-muted-foreground). Card footer: "Edit" Button link (variant="ghost", text-xs) navigating to /admin/hcp-profiles/{id}.
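The four step states above reduce to a pure function over one HCP profile. The following is a sketch under stated assumptions: hcp_name is a hypothetical field name for the display name, an unset voice_name or avatar_character is treated as "defaults only", and "Voice Live enabled without a stored model" is treated as muted because the spec does not cover that case.

```typescript
type StepTone = "green" | "amber" | "red" | "muted";

interface ChainProfile {
  hcp_name: string; // hypothetical field name for the HCP display name
  agent_sync_status: "synced" | "pending" | "failed" | "none";
  agent_id?: string;
  voice_live_enabled: boolean;
  voice_live_model?: string;
  voice_name?: string;
  avatar_character?: string;
}

interface ChainStep {
  step: "profile" | "agent" | "voiceLive" | "speechAvatar";
  tone: StepTone;
  value: string;
}

// Agent step colour per sync status, as listed in the spec.
const AGENT_TONES: Record<ChainProfile["agent_sync_status"], StepTone> = {
  synced: "green",
  pending: "amber",
  failed: "red",
  none: "muted",
};

function deriveChainSteps(profile: ChainProfile): ChainStep[] {
  // The model only counts when Voice Live is enabled.
  const activeModel = profile.voice_live_enabled ? profile.voice_live_model : undefined;

  const hasVoice = !!profile.voice_name;
  const hasAvatar = !!profile.avatar_character;
  const speechTone: StepTone =
    hasVoice && hasAvatar ? "green" : hasVoice || hasAvatar ? "amber" : "muted";
  const speechParts = [profile.voice_name, profile.avatar_character].filter(
    (part): part is string => !!part,
  );

  return [
    { step: "profile", tone: "green", value: profile.hcp_name },
    {
      step: "agent",
      tone: AGENT_TONES[profile.agent_sync_status],
      value: profile.agent_id ?? profile.agent_sync_status,
    },
    {
      step: "voiceLive",
      tone: activeModel ? "green" : "muted",
      value: activeModel ? activeModel : "Disabled",
    },
    {
      step: "speechAvatar",
      tone: speechTone,
      value: speechParts.length > 0 ? speechParts.join(" / ") : "--",
    },
  ];
}
```

Deriving the steps outside the component keeps VoiceLiveChainCard a thin renderer and makes the colour rules unit-testable without mounting React.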
Trigger: Management page loads HCP profile data.
Behavior: Four stat cards display computed counts:
- Total HCPs: count of all active HCP profiles
- Agents Synced: count where agent_sync_status === "synced"
- Voice Live Enabled: count where voice_live_enabled === true
- Fully Configured: count where agent synced AND voice_live_enabled AND voice_name set AND avatar_character set
Visual: Each stat card: small Card with p-4. Number at text-2xl font-bold. Label below at text-xs text-muted-foreground. Icon in top-right corner (size-4 text-muted-foreground). Grid: grid grid-cols-2 md:grid-cols-4 gap-4.
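The four counts can be computed in a single pass over the profile list, which is the kind of function the page could wrap in useMemo. A minimal sketch, assuming the list passed in already contains only active profiles; computeVoiceLiveStats and the interface names are hypothetical.

```typescript
interface StatsProfile {
  agent_sync_status: string;
  voice_live_enabled: boolean;
  voice_name?: string;
  avatar_character?: string;
}

interface VoiceLiveStats {
  totalHcps: number;
  agentsSynced: number;
  voiceLiveEnabled: number;
  fullyConfigured: number;
}

// Single pass over the (already active-only) profile list.
function computeVoiceLiveStats(profiles: StatsProfile[]): VoiceLiveStats {
  const stats: VoiceLiveStats = {
    totalHcps: 0,
    agentsSynced: 0,
    voiceLiveEnabled: 0,
    fullyConfigured: 0,
  };
  for (const profile of profiles) {
    const synced = profile.agent_sync_status === "synced";
    stats.totalHcps += 1;
    if (synced) stats.agentsSynced += 1;
    if (profile.voice_live_enabled) stats.voiceLiveEnabled += 1;
    // Fully configured requires all four conditions from the spec.
    if (
      synced &&
      profile.voice_live_enabled &&
      !!profile.voice_name &&
      !!profile.avatar_character
    ) {
      stats.fullyConfigured += 1;
    }
  }
  return stats;
}
```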
Trigger: Admin clicks "Batch Re-sync" button in management page header.
Behavior: Calls POST /api/v1/voice-live/batch-resync (new endpoint). Shows loading spinner on button during request. On success: toast.success() with count of synced/failed agents. On error: toast.error() with error message. Refetches HCP profile list after completion to update chain status cards.
Visual: Button in page header area: variant="default" (primary), size="sm", with RefreshCw icon (size-4 mr-2). When loading: icon spins via animate-spin. Button text: "Batch Re-sync" / "批量同步".
Constraint: Button disabled when no HCP profiles exist or when a batch sync is already in progress. Confirmation not required (re-sync is idempotent, non-destructive).
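The batch re-sync flow (disabled rule, request, toast, refetch) can be sketched without React by injecting its side effects. Everything here is illustrative: the { synced, failed } response shape is an assumption inferred from the success message placeholders, and BatchResyncDeps, runBatchResync, and the helper names are hypothetical. In the real page the request would go through the TanStack Query mutation and the notifications through sonner's toast.

```typescript
interface BatchResyncResult {
  synced: number; // assumed response field
  failed: number; // assumed response field
}

interface BatchResyncDeps {
  post: (url: string) => Promise<BatchResyncResult>;
  notifySuccess: (message: string) => void;
  notifyError: (message: string) => void;
  refetchProfiles: () => Promise<unknown>;
}

// Disabled when there is nothing to sync or a sync is already running.
function isBatchResyncDisabled(profileCount: number, isPending: boolean): boolean {
  return profileCount === 0 || isPending;
}

// Mirrors the en-US success copy with its two counts filled in.
function formatBatchResyncSuccess(result: BatchResyncResult): string {
  return `Re-synced ${result.synced} agents (${result.failed} failed)`;
}

async function runBatchResync(deps: BatchResyncDeps): Promise<void> {
  try {
    const result = await deps.post("/api/v1/voice-live/batch-resync");
    deps.notifySuccess(formatBatchResyncSuccess(result));
  } catch (err) {
    deps.notifyError(err instanceof Error ? err.message : String(err));
  } finally {
    // Refetch after completion, whether the request succeeded or failed,
    // so the chain cards reflect the latest sync status.
    await deps.refetchProfiles();
  }
}
```

Because the endpoint is described as idempotent and non-destructive, the sketch performs no confirmation step before posting.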
Trigger: HCP table renders rows with profiles that have voice_live_model set.
Behavior: The existing Voice & Avatar column shows a third Badge when voice_live_enabled is true and voice_live_model is set. The Badge shows the short model label (e.g., "GPT-4o", "GPT-4.1"). When voice_live_enabled is false, the model badge is not shown, regardless of the stored model.
Visual: Third Badge after existing voice name and avatar badges. Badge variant="outline", text-xs. Displayed on a new line below the existing badges using flex flex-wrap items-center gap-1 to handle the wider content. When model is the default "gpt-4o" and other badges are shown, model badge uses standard outline. Non-default models show in outline with slightly bolder text.
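One way to read the badge rule is: render the model badge only when Voice Live is enabled and a model is stored, and give non-default models slightly heavier text. The sketch below encodes that reading. getModelBadge is a hypothetical helper, the label map is deliberately partial, and font-medium is an assumed choice for "slightly bolder".

```typescript
const DEFAULT_BADGE_MODEL = "gpt-4o";

// Partial map of model id to short label; unknown ids fall back to the raw id.
const SHORT_MODEL_LABELS: { [modelId: string]: string | undefined } = {
  "gpt-4o": "GPT-4o",
  "gpt-4.1": "GPT-4.1",
  "gpt-4.1-mini": "GPT-4.1 Mini",
};

interface BadgeProfile {
  voice_live_enabled: boolean;
  voice_live_model?: string;
}

interface ModelBadge {
  label: string;
  className: string; // applied to a Badge with variant="outline"
}

// Returns null when no badge should be rendered.
function getModelBadge(profile: BadgeProfile): ModelBadge | null {
  if (!profile.voice_live_enabled || !profile.voice_live_model) return null;
  const label = SHORT_MODEL_LABELS[profile.voice_live_model] ?? profile.voice_live_model;
  // Assumption: "slightly bolder" is expressed with Tailwind's font-medium.
  const emphasis = profile.voice_live_model === DEFAULT_BADGE_MODEL ? "" : " font-medium";
  return { label, className: `text-xs${emphasis}` };
}
```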
Trigger: Admin clicks the re-sync icon button within a VoiceLiveChainCard that shows a failed or none agent status.
Behavior: Calls existing POST /api/v1/hcp-profiles/{id}/retry-sync endpoint. Shows loading spinner on the icon button. On success: toast.success() with sync result. On failure: toast.error(). Card updates automatically via TanStack Query invalidation.
Visual: Small icon button (variant="ghost", size="icon", className="size-7") with RefreshCw icon (size-3.5). Positioned in the Agent step row. Only visible when agent_sync_status is "failed" or "none". When loading: icon spins via animate-spin.
All copy externalized via react-i18next. English (en-US) and Chinese (zh-CN) values.
| Element | i18n Key | en-US Copy | zh-CN Copy |
|---|---|---|---|
| Page title | admin:voiceLive.title | Voice Live Management | Voice Live 管理 |
| Page description | admin:voiceLive.description | Manage Voice Live configurations across all HCP profiles. View the full chain from profile to agent to voice settings. | 管理所有 HCP 配置的 Voice Live 设置。查看从配置到代理到语音设置的完整链路。 |
| Batch re-sync button | admin:voiceLive.batchResync | Batch Re-sync | 批量同步 |
| Batch re-sync success | admin:voiceLive.batchResyncSuccess | Re-synced {{synced}} agents ({{failed}} failed) | 已同步 {{synced}} 个代理({{failed}} 个失败) |
| Batch re-sync error | admin:voiceLive.batchResyncError | Failed to batch re-sync agents. Please try again. | 批量同步代理失败,请重试。 |
| Stat: Total HCPs | admin:voiceLive.statTotalHcps | Total HCPs | HCP 总数 |
| Stat: Agents Synced | admin:voiceLive.statAgentsSynced | Agents Synced | 已同步代理 |
| Stat: Voice Live Enabled | admin:voiceLive.statVoiceLiveEnabled | Voice Live Enabled | Voice Live 已启用 |
| Stat: Fully Configured | admin:voiceLive.statFullyConfigured | Fully Configured | 完全配置 |
| Chain step: HCP Profile | admin:voiceLive.stepProfile | HCP Profile | HCP 配置 |
| Chain step: Agent | admin:voiceLive.stepAgent | Agent Sync | 代理同步 |
| Chain step: Voice Live | admin:voiceLive.stepVoiceLive | Voice Live Config | Voice Live 配置 |
| Chain step: Speech/Avatar | admin:voiceLive.stepSpeechAvatar | Speech & Avatar | 语音和数字人 |
| Chain status: Complete | admin:voiceLive.statusComplete | Complete | 已完成 |
| Chain status: Partial | admin:voiceLive.statusPartial | Partial | 部分完成 |
| Chain status: Not configured | admin:voiceLive.statusNotConfigured | Not configured | 未配置 |
| Chain status: Disabled | admin:voiceLive.statusDisabled | Disabled | 已禁用 |
| Chain card: Edit | admin:voiceLive.editHcp | Edit HCP | 编辑 HCP |
| Empty state heading | admin:voiceLive.emptyTitle | No Voice Live Configurations | 暂无 Voice Live 配置 |
| Empty state body | admin:voiceLive.emptyBody | Create HCP profiles and enable Voice Live to see configurations here. | 创建 HCP 配置并启用 Voice Live 以在此处查看配置。 |
| Error state | admin:voiceLive.loadError | Failed to load Voice Live configurations. Click retry to try again. | 加载 Voice Live 配置失败。点击重试。 |
| Sidebar nav label | admin:voiceLive.nav | Voice Live | Voice Live |
| Element | i18n Key | en-US Copy | zh-CN Copy |
|---|---|---|---|
| Voice Live Model label | admin:hcp.voiceLiveModel | Voice Live Model | Voice Live 模型 |
| Voice Live Model description | admin:hcp.voiceLiveModelDesc | Generative AI model for Voice Live sessions (model mode only) | Voice Live 会话的生成式 AI 模型(仅模型模式) |
| Model tier: Pro | admin:hcp.modelTierPro | Pro | 专业版 |
| Model tier: Basic | admin:hcp.modelTierStandard | Standard | 标准版 |
| Model tier: Lite | admin:hcp.modelTierLite | Lite | 轻量版 |
Focal point: Chain status grid. Admin scans all HCPs and their full Voice Live pipeline status.
+-----------------------------------------------------------+
| Voice Live Management [Batch Re-sync] | <- Page header (text-2xl + button)
| Manage Voice Live configurations... | <- Description (text-sm text-muted-foreground)
+-----------------------------------------------------------+
| |
| +----------+ +----------+ +----------+ +----------+ | <- Summary stats row
| | Total | | Agents | | Voice | | Fully | | (grid grid-cols-2 md:grid-cols-4 gap-4)
| | HCPs | | Synced | | Live On | | Config'd | |
| | 8 | | 6 | | 5 | | 4 | |
| +----------+ +----------+ +----------+ +----------+ |
| |
| +------------------+ +------------------+ +-------------+ | <- Chain cards grid
| | Dr. Zhang | | Dr. Li | | Dr. Wang | | (grid grid-cols-1 md:grid-cols-2
| | Oncology | | Hematology | | Neurology | | xl:grid-cols-3 gap-6)
| | | | | | | |
| | * HCP Profile | | * HCP Profile | | * HCP Pro | |
| | * Agent: Synced | | * Agent: Failed | | * Agent: | |
| | * VL: GPT-4o | | * VL: Disabled | | None | |
| | * Speech: Ava | | * Speech: -- | | * VL: -- | |
| | Avatar: Lori-c | | | | * -- | |
| | | | | | | |
| | [Edit HCP] | | [Retry] [Edit] | | [Edit HCP] | |
| +------------------+ +------------------+ +-------------+ |
+-----------------------------------------------------------+
+-------------------------------------------+
| CardHeader |
| Dr. Zhang - Oncology [Re-sync?] |
+-------------------------------------------+
| CardContent |
| |
| [*] HCP Profile ............ Dr. Zhang | <- Step 1 (always green)
| | | Vertical line connector
| [*] Agent Sync ............. Synced | <- Step 2 (color by status)
| | asst_xxx | Tooltip for full agent_id
| [*] Voice Live Config ...... GPT-4o | <- Step 3 (green/muted)
| | Enabled |
| [*] Speech & Avatar ........ Ava / Lori | <- Step 4 (green/amber/muted)
| casual |
| |
+-------------------------------------------+
| CardFooter |
| [Edit HCP] | <- Ghost button, navigates to editor
+-------------------------------------------+
Each step row layout:
[dot size-2] [step-label text-xs text-muted-foreground w-32] [value text-sm flex-1]
Vertical connector: ml-1 border-l-2 h-3 between step rows. Connector color: border-green-300 when both adjacent steps are green, border-border otherwise.
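The per-connector colour rule stated here (green only when both neighbouring steps are green) can be expressed as a small helper that returns one class string per gap between steps. A sketch only: connectorClasses and DotTone are hypothetical names, and the helper follows this adjacent-step rule rather than an overall chain-health colour.

```typescript
type DotTone = "green" | "amber" | "red" | "muted";

// For N steps, returns N - 1 connector class strings, one per gap.
function connectorClasses(tones: DotTone[]): string[] {
  const classes: string[] = [];
  for (let i = 0; i + 1 < tones.length; i += 1) {
    const bothGreen = tones[i] === "green" && tones[i + 1] === "green";
    const color = bothGreen ? "border-green-300" : "border-border";
    classes.push(`ml-1 border-l-2 h-3 ${color}`);
  }
  return classes;
}
```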
Position: Inside the Voice Settings Card in voice-avatar-tab.tsx, after the Voice Live Config toggle, before the Custom Voice toggle.
+-----------------------------------------------------------+
| Card: Voice Live Config Toggle |
| Voice Live Config: [ON --------] [Switch] |
| Sync voice settings as agent metadata |
+-----------------------------------------------------------+
| space-y-4 |
+-----------------------------------------------------------+
| Card: Voice Settings |
| Voice Live Model: | <- NEW FIELD
| [Select dropdown \/] |
| +---------------------------+ |
| | Pro | <- SelectGroup label |
| | GPT Realtime | |
| | GPT-4o | <- default selected |
| | GPT-4.1 | |
| | GPT-5 | |
| | GPT-5 Chat | |
| | Basic | <- SelectGroup label |
| | GPT Realtime Mini | |
| | GPT-4o Mini | |
| | ... | |
| | Lite | <- SelectGroup label |
| | GPT-5 Nano | |
| | Phi4-MM Realtime | |
| | Phi4 Mini | |
| +---------------------------+ |
| |
| Custom voice: [OFF ----] [Switch] |
| Voice Name: [Select dropdown \/] |
+-----------------------------------------------------------+
| Voice & Avatar |
|-------------------------|
| [Ava] [Lori-casual] | <- Existing badges
| [GPT-4o] | <- NEW model badge (when voice_live_enabled)
Badges use flex flex-wrap items-center gap-1 to wrap naturally when column is narrow.
| State | Type | Location | Purpose |
|---|---|---|---|
| HCP form values (extended) | useForm<HcpFormValues> (react-hook-form + zod) | hcp-profile-editor.tsx | Extended with voice_live_model field. Default: "gpt-4o". |
| Management page HCP list | TanStack Query via useHcpProfiles() | voice-live-management.tsx | Reuses existing hook. No new query key. |
| Batch re-sync mutation | TanStack Query mutation | use-voice-live-management.ts (new) | Mutation for POST /api/v1/voice-live/batch-resync. Invalidates HCP profiles query on success. |
| Summary stats | useMemo derived from HCP list | voice-live-management.tsx | Computed counts for total, synced, enabled, fully-configured. |
| Individual re-sync mutation | useRetrySyncHcpProfile() (existing) | voice-live-chain-card.tsx | Reuses Phase 11 hook. |
| Requirement | Implementation |
|---|---|
| Model select keyboard navigation | Radix Select handles Arrow key navigation, Enter/Space activation, Escape to close |
| Group labels in select | SelectGroup + SelectLabel provide semantic grouping for screen readers (tier names announced) |
| Chain status dots | Each dot has adjacent text label (not color-only). Screen readers read the text label, not the dot |
| Batch re-sync loading state | Button sets aria-busy="true" while loading, the spinner icon is aria-hidden="true", and the button text changes to indicate loading |
| Chain card links | "Edit HCP" renders as <a> or <Button> with clear aria-label including HCP name |
| Summary stat cards | Each stat card value uses aria-label combining the number and the label text |
| Management page empty state | EmptyState component is focusable and describes the next action |
| Breakpoint | Management Page | Chain Cards | HCP Table Model Badge |
|---|---|---|---|
| Desktop (>=1280px) | 3-column chain card grid, 4-column stat row, full page header | Full card with all 4 steps visible | Model badge visible alongside voice/avatar badges |
| Tablet (768-1279px) | 2-column chain card grid, 2x2 stat row grid | Same card layout, narrower | Model badge wraps below voice/avatar badges |
| Mobile (<768px) | 1-column chain card grid, 2x2 stat row grid, page title stacks above button | Full card, steps use more horizontal space | Model badge hidden (column too narrow), visible on hover/tap or in editor |
| Registry | Blocks Used | Safety Gate |
|---|---|---|
| shadcn official | Not applicable (components already installed manually as Radix wrappers) | not required |
| Third-party | none | not applicable |
No new component installations needed. SelectGroup and SelectLabel are already available in the existing @/components/ui/select.tsx (standard Radix Select primitives). All required UI primitives (Card, Badge, Select, Button, Tooltip, Skeleton, Form, Switch) are already present in frontend/src/components/ui/.
- Dimension 1 Copywriting: PASS
- Dimension 2 Visuals: PASS
- Dimension 3 Color: PASS
- Dimension 4 Typography: PASS
- Dimension 5 Spacing: PASS
- Dimension 6 Registry Safety: PASS
Approval: pending