Planning Phase 07 - huqianghui/AI-Coach-vibe-coding GitHub Wiki
Auto-generated from
.planning/phases/07-azure-service-integration
Last synced: 2026-04-02
Gathered: 2026-03-27 (updated with 7-mode user-selectable design) Status: Ready for replanning Source: Conversation analysis + user design direction update
## Phase BoundaryConnect the existing Admin Azure Config UI to real backend persistence and service switching for ALL 7 Azure AI service modes. The frontend pages (azure-config.tsx, service-config-card.tsx) and backend read APIs (azure_config.py) already exist but are disconnected — onSave only updates React state, onTestConnection is a fake Math.random(), and there's no PUT endpoint to persist configurations. This phase bridges those gaps and implements backend adapters/connectors for all 7 AI service modes.
CRITICAL DESIGN PRINCIPLE: All 7 modes are equal, user-selectable services — NOT a fallback chain. Users configure which modes to enable based on:
- Their preference and use case
- Whether the selected Azure region's AI Foundry supports that specific feature
The 7 AI Service Modes:
- Azure OpenAI — GPT-4o for text coaching conversations and scoring
- Azure Speech (STT) — Speech-to-text for voice input recognition
- Azure Speech (TTS) — Text-to-speech for HCP voice responses
- Azure AI Avatar — Digital human avatar for HCP visualization
- Azure Content Understanding — Multimodal evaluation for training materials
- Azure OpenAI Realtime — Real-time audio streaming for voice conversations
- Azure Voice Live API — Real-time voice coaching with GPT-4o Realtime + Avatar
In scope:
- Backend: Configuration persistence model (DB), CRUD API for ALL 7 service configs, real connection testing
- Backend: Azure OpenAI adapter for LLM (text coaching)
- Backend: Azure Speech STT/TTS adapters (voice input/output) — already exist from Phase 06
- Backend: Azure Avatar configuration and status checking
- Backend: Azure Content Understanding adapter (document/multimodal analysis)
- Backend: Azure OpenAI Realtime adapter (real-time audio streaming)
- Backend: Azure Voice Live API adapter (voice + avatar combined)
- Backend: Region-based availability detection — check if the configured region's AI Foundry supports each service
- Backend: Dynamic adapter registration based on persisted config (register available services on startup)
- Frontend: Wire
onSaveto call backend PUT API for all 7 services - Frontend: Wire
onTestConnectionto call backend POST /test endpoint for all 7 services - Frontend: Show real connection status (active/inactive/error/unavailable) from backend
- Frontend: Show region availability hints — which services are available in the configured region
Out of scope:
- Frontend Avatar WebRTC rendering (future phase — Phase 08)
- Database for PostgreSQL UI config (deployment infrastructure, not AI service)
- New UI pages (existing
azure-config.tsxis sufficient)
- All 7 modes are independently configurable — no automatic fallback chain
- Each service has its own enable/disable toggle, endpoint, API key, region
- A service being "unavailable" in a region shows informational status, NOT auto-switch to another service
- Admin explicitly chooses which services to enable for their deployment
- The system does NOT silently switch providers — if a configured service fails, it reports an error
- Store Azure service configs in
service_configdatabase table (already exists) - Config model: service_name, endpoint, api_key (encrypted), model/deployment, region, is_active, updated_by
- API keys encrypted at rest using Fernet symmetric encryption (already implemented)
- On server startup, load active configs from DB and register corresponding adapters
-
PUT /api/v1/azure-config/services/{service_name}— Create/update service configuration (exists) -
POST /api/v1/azure-config/services/{service_name}/test— Real connection test (exists for some services) -
GET /api/v1/azure-config/services— Read from DB (exists) -
GET /api/v1/azure-config/region-capabilities/{region}— NEW: Check what services are available in a region - All endpoints require admin role
-
AzureOpenAIAdapterextendsBaseCoachingAdapter— already exists - Uses
openaiSDK with Azure configuration - Streaming responses via SSE
-
AzureSpeechSTTAdapterandAzureSpeechTTSAdapter— already exist - Registered under categories "stt" and "tts"
- Configuration validation and status checking
- Avatar is a frontend rendering service — backend stores config and tests connectivity
- Uses Azure Speech SDK's avatar features or REST API for synthesis
- Region availability: Avatar is NOT available in all regions
- Implement
AzureContentUnderstandingAdapter - Uses Azure AI Content Understanding REST API
- Capabilities: document analysis, multimodal evaluation for training materials
- Region availability: Content Understanding may have limited regional availability
- Implement
AzureRealtimeAdapter - Uses Azure OpenAI Realtime API (WebSocket-based)
- Real-time bidirectional audio streaming
- Requires gpt-4o-realtime-preview model deployment
- Implement
AzureVoiceLiveAdapter - Two sub-modes (both user-configurable, NOT fallback):
- Model mode: Direct Azure OpenAI Realtime API, frontend controls conversation
- Agent mode: Azure AI Agent Service with pre-configured HCP persona agent
- Config fields: ai_service_endpoint, ai_service_key/Azure AD token, project_name (Agent mode), agent_id (Agent mode)
- Region availability: Voice Live requires specific AI Foundry regions
- Backend endpoint to check region capabilities against Azure AI Foundry
- Frontend shows availability hints per service when region is configured
- Unavailable services show "Not available in {region}" status
- Does NOT prevent saving config — admin may be preparing for future availability
- On config save: register/update the adapter in ServiceRegistry
- On startup: load all active configs and register adapters
- No automatic fallback — if a service is configured but unavailable, report error status
-
ServiceConfigCard.onSave→PUT /api/v1/azure-config/services/{key}(exists) -
ServiceConfigCard.onTestConnection→POST /api/v1/azure-config/services/{key}/test(exists) - Display real status from
GET /servicesresponse (exists) - NEW: Show region availability hints for each service
- Show toast notifications on save/test success/failure (using
sonner)
- Encryption key management approach (env var vs config)
- Exact error messages for connection test failures
- Retry/timeout strategy for Azure API calls
- Region capability API data source (hardcoded map vs live Azure API call)
- Azure AD token auth implementation details for Voice Live Agent mode
<canonical_refs>
Downstream agents MUST read these before planning or implementing.
-
backend/app/api/azure_config.py— CRUD API for service configs (GET/PUT/POST test) -
backend/app/api/config.py— Feature flags and adapter listing endpoint -
backend/app/config.py— pydantic-settings with Azure env vars -
backend/app/models/service_config.py— ServiceConfig ORM model with Fernet encryption -
backend/app/services/config_service.py— Config CRUD + encryption orchestration -
backend/app/utils/encryption.py— Fernet encrypt/decrypt helpers -
backend/app/services/connection_tester.py— Connection testing for Azure services
-
backend/app/services/agents/base.py— BaseCoachingAdapter ABC, CoachRequest/CoachEvent dataclasses -
backend/app/services/agents/adapters/mock.py— MockCoachingAdapter (reference implementation) -
backend/app/services/agents/adapters/azure_openai.py— AzureOpenAIAdapter (already implemented) -
backend/app/services/agents/registry.py— ServiceRegistry singleton -
backend/app/api/sessions.py— SSE streaming endpoint (uses adapter.execute())
-
backend/app/services/agents/adapters/azure_stt.py— Azure Speech STT adapter -
backend/app/services/agents/adapters/azure_tts.py— Azure Speech TTS adapter
-
backend/app/models/base.py— Base, TimestampMixin (all models must use) -
backend/app/database.py— Async engine, session factory -
backend/alembic/— Migration directory
- User's
Voice-Live-Agent-With-Avadarrepo — Reference for Agent mode + Model mode Voice Live API implementation
-
frontend/src/pages/admin/azure-config.tsx— Azure config page (7 service cards + DB) -
frontend/src/components/admin/service-config-card.tsx— Reusable config card component -
frontend/src/types/azure-config.ts— TypeScript interfaces for service config -
frontend/src/api/azure-config.ts— Typed API client for azure config endpoints -
frontend/src/hooks/use-azure-config.ts— TanStack Query hooks for azure config
-
frontend/src/api/client.ts— Axios instance with JWT interceptor
</canonical_refs>
## Specific Ideas- The
openaiPython SDK (>=1.50.0) supports both Azure OpenAI standard and Realtime APIs - Azure Speech SDK includes Avatar synthesis capabilities
- The reference repo (Voice-Live-Agent-With-Avadar) provides patterns for Agent mode + Model mode
- Agent mode needs:
ai_service_endpoint,azure_ai_project_name,agent_id - Azure AD token auth:
az account get-access-token --resource https://ai.azure.com - ServiceConfigCard already has status dot (green/gray/red/yellow) — extend with "unavailable in region" state
- Region capabilities can be initially hardcoded as a lookup table, with future Azure Management API integration
- Content Understanding API uses REST (not an SDK) —
POST /contentunderstanding/analyzerspattern
- Azure Database for PostgreSQL configuration from UI (deployment infrastructure, not AI service)
- Avatar WebRTC rendering in frontend (Phase 08)
- Per-session provider override (always use deployment-wide config for now)
- Live Azure Management API for region capability querying (start with hardcoded map)
Phase: 07-azure-service-integration Context gathered: 2026-03-27 via conversation analysis, updated with 7-mode design direction
| # | Plan File | Status |
|---|---|---|
| 07-01 | 07-01-PLAN.md | Complete |
| 07-02 | 07-02-PLAN.md | Complete |
| 07-03 | 07-03-PLAN.md | Complete |
| 07-04 | 07-04-PLAN.md | Complete |
| 07-05 | 07-05-PLAN.md | Complete |
| 07-06 | 07-06-PLAN.md | Complete |
| 07-07 | 07-07-PLAN.md | Complete |
Click to expand research notes
Researched: 2026-03-27 Domain: Azure AI Services integration -- 7 independently configurable service modes Confidence: MEDIUM-HIGH
Phase 07 was previously completed with 4 plans covering Azure OpenAI LLM, config persistence (ServiceConfig model + Fernet encryption), connection testing, and frontend wiring. The expanded scope adds 4 NEW adapters/connectors (Azure Avatar real implementation, Content Understanding, OpenAI Realtime, Voice Live API) and a region-based capability detection system. The existing infrastructure (ServiceConfig DB model, encryption, CRUD API, frontend ServiceConfigCard) is solid and reusable -- the new work layers on top.
The critical architectural insight is that these 7 services have fundamentally different integration patterns: some are backend-only (OpenAI LLM, Content Understanding), some are primarily frontend (Avatar WebRTC, Realtime WebSocket), and some are hybrid (Voice Live combines backend token brokering with frontend WebSocket). The backend's role varies per service -- from full adapter (OpenAI LLM) to config-store-and-connection-tester (Avatar, Realtime). The region capability map must be a hardcoded lookup table (per user decision) since there is no single Azure API to query all service availability.
Primary recommendation: Extend the existing connection_tester.py and azure_config.py with real API calls for the 3 new services (Content Understanding, Realtime, Voice Live), add a region capability lookup module, and implement the missing azure_openai_realtime service name in SERVICE_DISPLAY_NAMES. Frontend changes are minimal -- the 7 service cards already exist, just add region availability hints.
<user_constraints>
- All 7 modes are independently configurable -- NOT a fallback chain
- Each service has its own enable/disable toggle, endpoint, API key, region
- A service being "unavailable" in a region shows informational status, NOT auto-switch
- Admin explicitly chooses which services to enable
- System does NOT silently switch providers -- if a configured service fails, it reports error
- Config persistence uses existing service_config DB table with Fernet encryption
- API design: PUT/POST test/GET endpoints (partially implemented)
- NEW: GET /api/v1/azure-config/region-capabilities/{region} endpoint
- Azure OpenAI adapter already exists and is working
- Azure Speech STT/TTS adapters already exist from Phase 06
- Azure Avatar adapter is stub (is_available=False) -- needs real implementation
- Azure Content Understanding adapter is NEW
- Azure OpenAI Realtime adapter is NEW
- Azure Voice Live API adapter is NEW with Agent mode + Model mode
- Dynamic adapter registration on config save and on startup
- Encryption key management approach (env var vs config)
- Exact error messages for connection test failures
- Retry/timeout strategy for Azure API calls
- Region capability API data source (hardcoded map vs live Azure API call)
- Azure AD token auth implementation details for Voice Live Agent mode
- Azure Database for PostgreSQL configuration from UI (deployment infrastructure, not AI service)
- Avatar WebRTC rendering in frontend (Phase 08)
- Per-session provider override (always use deployment-wide config for now)
- Live Azure Management API for region capability querying (start with hardcoded map) </user_constraints>
<phase_requirements>
| ID | Description | Research Support |
|---|---|---|
| PLAT-03 | Admin can configure Azure service connections from web UI with connection testing | Existing CRUD API + connection_tester.py cover OpenAI/Speech/Avatar; extend for Content Understanding, Realtime, Voice Live |
| PLAT-05 | Voice interaction mode configurable per deployment and per session | Voice Live + Realtime adapter registration enables runtime mode switching |
| ARCH-05 | Azure service connections configurable per environment | ServiceConfig model + region capabilities endpoint enables per-env config |
| ARCH-01 | Pluggable adapter pattern for all AI services | ServiceRegistry already supports multi-category; add new categories for realtime, content_understanding, voice_live |
| COACH-06 | GPT Realtime API for sub-1s conversational latency as configurable premium option | Azure OpenAI Realtime adapter with connection testing |
| COACH-07 | Azure AI Avatar as configurable premium option | Avatar adapter real implementation with config validation |
| </phase_requirements> |
| Library | Version | Purpose | Why Standard |
|---|---|---|---|
| openai | >=1.50.0 | Azure OpenAI + Realtime API client | Already installed; supports AsyncAzureOpenAI and realtime WebSocket |
| httpx | >=0.27.0 | Async HTTP client for REST API calls | Already installed; used for connection testing and Content Understanding REST calls |
| cryptography | (via python-jose) | Fernet encryption for API keys | Already available as transitive dep |
| azure-cognitiveservices-speech | latest | Azure Speech SDK (STT/TTS/Avatar) | Required for Avatar real-time synthesis via WebRTC; already used in Phase 06 |
| Library | Version | Purpose | When to Use |
|---|---|---|---|
| websockets | >=13.0 | WebSocket support | Already installed; used for Voice Live backend-to-Azure WebSocket if needed |
The expanded scope does NOT require any new Python packages. Content Understanding uses plain REST (httpx), Realtime API uses the openai SDK, and Voice Live uses httpx for connection testing (frontend handles the actual WebSocket). Avatar connection testing uses httpx REST calls.
Installation: No new packages needed. The existing pyproject.toml dependencies cover everything.
This is the most important architectural decision for the planner:
| Service | Backend Role | Frontend Role | Integration Pattern |
|---|---|---|---|
| Azure OpenAI (LLM) | Full adapter (execute coaching) | Consumes SSE stream | Backend-primary |
| Azure Speech STT | Full adapter (audio -> text) | Sends audio, receives text | Backend-primary |
| Azure Speech TTS | Full adapter (text -> audio) | Receives audio | Backend-primary |
| Azure AI Avatar | Config store + connection test | WebRTC rendering (Phase 08) | Frontend-primary, backend config-only |
| Azure Content Understanding | Full adapter (doc analysis) | Triggers analysis, shows results | Backend-primary |
| Azure OpenAI Realtime | Config store + token broker | WebSocket direct to Azure | Frontend-primary, backend brokers |
| Azure Voice Live API | Config store + token broker | WebSocket direct to Azure | Frontend-primary, backend brokers |
Key insight: For services 4, 6, 7 the backend does NOT proxy the real-time stream. The backend stores config, tests connectivity, and provides tokens/credentials. The frontend connects directly to Azure. This means the "adapter" for these services is lighter -- it only needs is_available() and connection testing, not a full execute() method.
backend/app/
services/
agents/
adapters/
azure_openai.py # EXISTS
azure_content.py # NEW: Content Understanding adapter
azure_realtime.py # NEW: Realtime config/availability adapter
azure_voice_live.py # NEW: Voice Live config/availability adapter
avatar/
azure.py # EXISTS (stub) -> UPDATE with real connection test
stt/azure.py # EXISTS
tts/azure.py # EXISTS
connection_tester.py # EXISTS -> UPDATE with real tests for all 7
config_service.py # EXISTS (no changes needed)
voice_live_service.py # EXISTS (token broker) -> UPDATE SUPPORTED_REGIONS
region_capabilities.py # NEW: hardcoded region -> services map
api/
azure_config.py # EXISTS -> UPDATE: add region-capabilities endpoint, add azure_openai_realtime to SERVICE_DISPLAY_NAMES
schemas/
azure_config.py # EXISTS -> UPDATE: add RegionCapabilitiesResponse schema
What: A simplified adapter that implements is_available() and holds config but does NOT implement full execute().
When to use: Avatar, Realtime, Voice Live -- services where the real-time interaction happens in the frontend.
Example:
# Source: Project convention from azure_openai.py pattern
class AzureRealtimeAdapter(BaseCoachingAdapter):
"""Azure OpenAI Realtime API config adapter.
Backend stores config and tests connectivity.
Frontend connects directly via WebSocket.
"""
name = "azure_openai_realtime"
def __init__(
self,
endpoint: str,
api_key: str,
deployment: str = "gpt-4o-realtime-preview",
) -> None:
self._endpoint = endpoint
self._api_key = api_key
self._deployment = deployment
async def execute(self, request: CoachRequest) -> AsyncIterator[CoachEvent]:
"""Not used -- frontend connects directly to Azure Realtime API."""
yield CoachEvent(
type=CoachEventType.ERROR,
content="Realtime API is frontend-direct; use token broker endpoint",
)
yield CoachEvent(type=CoachEventType.DONE, content="")
async def is_available(self) -> bool:
return bool(self._endpoint and self._api_key and self._deployment)What: A complete adapter with execute() that makes real API calls.
When to use: Content Understanding -- the backend needs to call the REST API, poll for results, and return structured data.
Example:
# Source: Azure Content Understanding REST API docs
class AzureContentUnderstandingAdapter(BaseCoachingAdapter):
"""Azure Content Understanding adapter for document/multimodal analysis."""
name = "azure_content"
def __init__(self, endpoint: str, api_key: str) -> None:
self._endpoint = endpoint.rstrip("/")
self._api_key = api_key
async def execute(self, request: CoachRequest) -> AsyncIterator[CoachEvent]:
"""Analyze content using Azure Content Understanding REST API."""
# POST {endpoint}/contentunderstanding/analyzers/{analyzer}:analyze?api-version=2025-11-01
# Poll Operation-Location until status=Succeeded
# Yield results as CoachEvent
...
async def is_available(self) -> bool:
return bool(self._endpoint and self._api_key)What: A Python dict mapping region identifiers to sets of supported services. Sourced from official Azure docs, maintained manually.
When to use: For the region-capabilities endpoint.
Example:
# Source: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions
REGION_CAPABILITIES: dict[str, set[str]] = {
"eastus2": {
"azure_openai", "azure_speech_stt", "azure_speech_tts",
"azure_avatar", "azure_openai_realtime", "azure_voice_live",
"azure_content",
},
"swedencentral": {
"azure_openai", "azure_speech_stt", "azure_speech_tts",
"azure_avatar", "azure_openai_realtime", "azure_voice_live",
"azure_content",
},
"westeurope": {
"azure_openai", "azure_speech_stt", "azure_speech_tts",
"azure_avatar", "azure_voice_live", "azure_content",
},
# ... more regions
}- Building a full execute() adapter for frontend-primary services: Avatar, Realtime, and Voice Live are frontend-direct. The backend should NOT try to proxy WebSocket or WebRTC streams.
- Making region capabilities a live Azure API call: The Azure Management REST API is complex, requires subscription-level auth, and has different endpoints per service. A hardcoded map (updated manually) is correct per user decision.
- Auto-fallback between services: The user explicitly decided NO fallback chain. Each service reports its own status independently.
- Merging azure_openai_realtime into azure_openai: These are completely different APIs (REST chat completion vs WebSocket real-time audio). They need separate config entries.
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| API key encryption | Custom crypto | Fernet from cryptography (already done) | Symmetric encryption with built-in key rotation support |
| Azure OpenAI chat | Raw HTTP | openai SDK AsyncAzureOpenAI (already done) | Handles auth, retries, streaming, error parsing |
| Content Understanding analysis | Custom HTTP polling | httpx + async polling loop | Simple REST API with Operation-Location pattern -- just poll |
| WebSocket for Realtime | Backend WebSocket proxy | Frontend direct connection with backend token broker | Backend proxying adds latency and complexity for no benefit |
| Region capability detection | Azure Management API | Hardcoded lookup table | Azure has no single "what's available in this region" API |
Key insight: The 3 new services (Content Understanding, Realtime, Voice Live) each have different API patterns but none require new Python packages. Content Understanding is REST (httpx), Realtime is an openai SDK feature, and Voice Live is a WebSocket the frontend connects to directly.
- Adapter:
AzureOpenAIAdapterinbackend/app/services/agents/adapters/azure_openai.py - Connection test:
test_azure_openai()makes a minimal chat completion call - No changes needed
- Adapter:
AzureSTTAdapterinbackend/app/services/agents/stt/azure.py - Connection test:
test_azure_speech()lists available voices - No changes needed
- Adapter:
AzureTTSAdapterinbackend/app/services/agents/tts/azure.py - Connection test: Shares
test_azure_speech()(same endpoint) - No changes needed
- Current: Stub adapter with
is_available()=False - Regions: eastus2, northeurope, southcentralus, southeastasia, swedencentral, westeurope, westus2
- Real-time synthesis uses WebRTC (frontend, Phase 08 scope)
- Backend for THIS phase: store config, validate, test connectivity via REST
- Connection test: Hit the ICE token endpoint
GET /cognitiveservices/avatar/relay/token/v1withOcp-Apim-Subscription-Keyheader - Endpoint format:
https://{region}.tts.speech.microsoft.com - Confidence: HIGH (verified from official docs)
- API version:
2025-11-01(GA) - Endpoint:
{foundry_endpoint}/contentunderstanding/analyzers/{analyzer-id}:analyze?api-version=2025-11-01 - Auth header:
Ocp-Apim-Subscription-Key: {key} - Pattern: Async -- POST returns 202 +
Operation-Location, poll untilstatus: "Succeeded" - Prebuilt analyzers:
prebuilt-invoice,prebuilt-imageSearch,prebuilt-audioSearch,prebuilt-videoSearch - Requires a Microsoft Foundry Resource (not just a Speech resource)
- Supported file types: PDF, TIFF, DOCX, XLSX, PPTX, images, audio, video
- Connection test: List analyzers or attempt a lightweight analysis
- Regional availability: Runs on Foundry resource -- available wherever Foundry resources can be created (broad availability)
- Confidence: HIGH (verified from official docs updated 2026-03-27)
- Uses the openai SDK with WebSocket transport
- Endpoint format:
wss://{resource}.openai.azure.com/openai/v1 -
IMPORTANT: Uses
/openai/v1GA endpoint, NOT date-basedapi-versionparameter - Auth:
api-keyheader or Microsoft Entra Bearer token - Models:
gpt-4o-realtime-preview(2024-12-17),gpt-realtime(2025-08-28),gpt-realtime-mini,gpt-realtime-1.5 - Token limits: 32K input, 4K output
- Features: VAD, session config, real-time audio streaming, transcription
- Connection test: Attempt WebSocket handshake or validate endpoint format + try REST model list
- The frontend connects directly; backend stores config and provides credentials
- Confidence: HIGH (verified from official docs)
- WebSocket endpoint:
wss://{resource}.services.ai.azure.com/voice-live/realtime?api-version=2025-10-01&model={model} - Alternate endpoint (older):
wss://{resource}.cognitiveservices.azure.com/voice-live/realtime?api-version=2025-10-01 - Auth:
api-keyquery param or header, OR Microsoft Entra Bearer token with scopehttps://ai.azure.com/.default - Agent mode: Add
agent_idandproject_idquery params instead ofmodel - Model mode: Add
modelquery param (e.g.,gpt-realtime,gpt-4o,gpt-4.1) - Supported models: gpt-realtime, gpt-realtime-mini, gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-5, phi4-mm-realtime
- Regions with Voice Live: Very broad -- eastus, eastus2, swedencentral, westeurope, westus, westus2, australiaeast, uksouth, francecentral, etc. (expanded significantly from earlier Phase 08 assumption of only eastus2/swedencentral)
- Agent support regions: australiaeast, brazilsouth, canadaeast, eastus, eastus2, francecentral, germanywestcentral, italynorth, japaneast, norwayeast, southafricanorth, southcentralus, southeastasia, swedencentral, switzerlandnorth, uksouth, westeurope, westus, westus2, westus3
- Avatar integration: Via
avatarparameter insession.updatewith WebRTC (ICE servers) - Features: noise suppression, echo cancellation, semantic VAD, filler word removal, viseme for avatar lip sync
- Connection test: HTTP probe to the endpoint (already partially implemented)
- Confidence: HIGH (verified from official docs updated 2026-03-16)
CRITICAL UPDATE: The existing SUPPORTED_REGIONS in voice_live_service.py is {"eastus2", "swedencentral"} which is now OUTDATED. Voice Live supports many more regions. This needs updating.
Based on official Azure documentation (verified 2026-03-27), here is the comprehensive region map:
eastus2, northeurope, southcentralus, southeastasia, swedencentral, westeurope, westus2
Almost all major regions support at least some Voice Live models. Key regions with full support including Agent mode:
australiaeast, brazilsouth, canadaeast, eastus, eastus2, francecentral, germanywestcentral, italynorth, japaneast, norwayeast, southafricanorth, southcentralus, southeastasia, swedencentral, switzerlandnorth, uksouth, westeurope, westus, westus2, westus3
Available in 30+ regions. Nearly universal Azure coverage.
Available wherever Azure OpenAI is deployed. Check Azure OpenAI model availability docs.
Runs on Foundry resources. Available wherever Foundry resources exist (broad).
GPT Realtime models are available as global deployments (broad availability).
# Services that have LIMITED regional availability -- only these need region checks
AVATAR_REGIONS = {
"eastus2", "northeurope", "southcentralus", "southeastasia",
"swedencentral", "westeurope", "westus2",
}
# Voice Live has broad availability but Agent mode is more restricted
VOICE_LIVE_AGENT_REGIONS = {
"australiaeast", "brazilsouth", "canadaeast", "eastus", "eastus2",
"francecentral", "germanywestcentral", "italynorth", "japaneast",
"norwayeast", "southafricanorth", "southcentralus", "southeastasia",
"swedencentral", "switzerlandnorth", "uksouth", "westeurope",
"westus", "westus2", "westus3",
}
# For services with universal availability, don't restrict by region
# azure_openai, azure_speech_stt, azure_speech_tts, azure_content,
# azure_openai_realtime: available in most regions -- no restriction neededWhat goes wrong: The frontend has SERVICE_KEY_MAP with realtime: "azure_openai_realtime" but the backend SERVICE_DISPLAY_NAMES in azure_config.py does NOT include azure_openai_realtime -- only 6 services are listed.
Why it happens: Phase 07 v1 was scoped to fewer services.
How to avoid: Add "azure_openai_realtime": "Azure OpenAI Realtime" to SERVICE_DISPLAY_NAMES.
Warning signs: Frontend save for Realtime card returns 400 "Unknown service".
What goes wrong: voice_live_service.py has SUPPORTED_REGIONS = {"eastus2", "swedencentral"} but the actual Voice Live API now supports 20+ regions.
Why it happens: The original Phase 08 implementation was conservative.
How to avoid: Update SUPPORTED_REGIONS from the official docs or refactor to use the new region_capabilities module.
Warning signs: Users in valid regions get "Unsupported region" errors.
What goes wrong: Content Understanding won't work with a regular Cognitive Services key. Why it happens: Content Understanding is a Foundry Tool, requiring a Microsoft Foundry Resource. How to avoid: Document this in the config UI tooltip. Connection test should verify the resource type. Warning signs: 401/403 errors on Content Understanding endpoint with valid Speech/OpenAI keys.
What goes wrong: Connection test fails because it uses ?api-version=2024-xx-xx query param.
Why it happens: Azure OpenAI standard API uses api-version, but Realtime GA uses /openai/v1 path.
How to avoid: Realtime connection test must construct URL as wss://{endpoint}/openai/v1 or verify deployment via REST.
Warning signs: 404 or upgrade-required errors on connection test.
What goes wrong: Agent mode requires agent_id + project_id instead of model param. May also need Azure AD token instead of API key.
Why it happens: Agent service uses Foundry AI Agent framework with different auth requirements.
How to avoid: ServiceConfig model_or_deployment field can encode mode with convention like "agent:{agent_id}:{project_id}" or frontend sends mode info in the config update.
Warning signs: 401 errors when using API key with Agent mode, missing agent_id param.
What goes wrong: The ServiceConfig model only has: endpoint, api_key, model_or_deployment, region. Voice Live Agent mode needs additional fields: project_name, agent_id.
Why it happens: The DB model was designed for simpler services.
How to avoid: Either add nullable columns to ServiceConfig, or store agent config as JSON in a text field, or encode in model_or_deployment with a convention.
Warning signs: Cannot save Agent mode configuration.
# Source: https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/quickstart/use-rest-api
async def test_azure_content_understanding(
endpoint: str, api_key: str,
) -> tuple[bool, str]:
"""Test Content Understanding by listing prebuilt analyzers."""
try:
async with httpx.AsyncClient(timeout=10.0) as client:
url = (
f"{endpoint.rstrip('/')}/contentunderstanding"
f"/analyzers?api-version=2025-11-01"
)
response = await client.get(
url,
headers={"Ocp-Apim-Subscription-Key": api_key},
)
if response.status_code == 200:
return (True, "Connection successful")
return (False, f"HTTP {response.status_code}: {response.text[:200]}")
except Exception as e:
return (False, f"Connection failed: {e!s}")# Source: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/real-time-synthesis-avatar
async def test_azure_avatar_real(
api_key: str, region: str,
) -> tuple[bool, str]:
"""Test Avatar by fetching ICE relay token."""
try:
async with httpx.AsyncClient(timeout=10.0) as client:
url = (
f"https://{region}.tts.speech.microsoft.com"
f"/cognitiveservices/avatar/relay/token/v1"
)
response = await client.get(
url,
headers={"Ocp-Apim-Subscription-Key": api_key},
)
if response.status_code == 200:
return (True, "Avatar service reachable (ICE token retrieved)")
return (False, f"Avatar test failed: HTTP {response.status_code}")
except Exception as e:
return (False, f"Avatar connection failed: {e!s}")# Source: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/realtime-audio
async def test_azure_realtime(
endpoint: str, api_key: str, deployment: str,
) -> tuple[bool, str]:
"""Test Realtime API by verifying deployment exists via REST."""
try:
base = endpoint.rstrip("/")
async with httpx.AsyncClient(timeout=10.0) as client:
url = f"{base}/openai/deployments/{deployment}?api-version=2024-06-01"
response = await client.get(
url,
headers={"api-key": api_key},
)
if response.status_code == 200:
return (True, "Realtime deployment found")
elif response.status_code in (401, 403):
return (False, f"Authentication failed: HTTP {response.status_code}")
elif response.status_code == 404:
return (False, f"Deployment '{deployment}' not found")
return (False, f"HTTP {response.status_code}: {response.text[:200]}")
except Exception as e:
return (False, f"Connection failed: {e!s}")# New endpoint in azure_config.py
from app.services.region_capabilities import get_region_capabilities
@router.get("/region-capabilities/{region}")
async def get_capabilities(
region: str,
_admin: User = Depends(require_role("admin")),
) -> dict:
"""Return which Azure AI services are available in the given region."""
return get_region_capabilities(region)# Convention: store agent config in model_or_deployment
# Model mode: just the model name like "gpt-realtime"
# Agent mode: "agent:{agent_id}:{project_name}"
def parse_voice_live_config(model_or_deployment: str) -> dict:
if model_or_deployment.startswith("agent:"):
parts = model_or_deployment.split(":", 2)
return {
"mode": "agent",
"agent_id": parts[1],
"project_name": parts[2] if len(parts) > 2 else "",
}
return {
"mode": "model",
"model": model_or_deployment or "gpt-4o-realtime-preview",
}| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| Voice Live only eastus2/swedencentral | Voice Live in 20+ regions | 2025-2026 | SUPPORTED_REGIONS must be updated |
| Azure OpenAI Realtime preview API version | GA endpoint /openai/v1 (no api-version param) |
2025-12 | Realtime connection URL construction differs from standard OpenAI |
| Content Understanding preview | GA at 2025-11-01
|
2025-11 | Stable API, prebuilt analyzers available |
| Voice Live limited models | Voice Live supports gpt-4o, gpt-4.1, gpt-5, phi4 | 2025-2026 | Many model choices for Voice Live |
| Separate Speech + Cognitive resources | Microsoft Foundry Resource (unified) | 2025-2026 | Content Understanding requires Foundry resource |
Deprecated/outdated:
-
SUPPORTED_REGIONS = {"eastus2", "swedencentral"}in voice_live_service.py -- needs expansion - Date-based
api-versionfor Realtime API -- GA uses/openai/v1endpoint
-
ServiceConfig Extra Fields for Agent Mode
- What we know: Voice Live Agent mode needs
project_nameandagent_idbeyond what ServiceConfig stores - What's unclear: Best storage strategy -- add DB columns vs encode in existing field vs JSON blob
- Recommendation: Use
model_or_deploymentfield with convention encoding ("agent:{agent_id}:{project_name}"vs"gpt-realtime"). Avoids DB migration for a simple config difference. Frontend can render different form fields based on a radio button (Agent/Model mode).
- What we know: Voice Live Agent mode needs
-
Content Understanding Regional Availability
- What we know: It is a Foundry Tool running on Foundry Resources
- What's unclear: Exact region list for Content Understanding (docs don't enumerate specific regions)
- Recommendation: Mark as available everywhere for now. If connection test fails, the error message will indicate the issue. Flag as LOW confidence in the region map.
-
Azure AD Token Auth for Voice Live Agent Mode
- What we know: Agent mode recommends Microsoft Entra (Azure AD) auth with scope
https://ai.azure.com/.default - What's unclear: Whether API key works for Agent mode or ONLY Entra tokens
- Recommendation: Support API key first (simpler). Add Azure AD as a future enhancement. The connection test will reveal if API key works.
- What we know: Agent mode recommends Microsoft Entra (Azure AD) auth with scope
No external tools required beyond what is already installed. Phase 07 expanded scope adds backend Python code and minor frontend updates. All dependencies (openai, httpx, cryptography) are already in pyproject.toml.
-
Async everywhere: All new adapter methods must be
async def -
Conditional imports: Use
try/except ImportErrorpattern inside constructors (matching azure_openai.py convention) - Service layer holds logic: Connection testing logic in service, router only handles HTTP
-
Pydantic v2: New schemas use
model_config = ConfigDict(from_attributes=True) - Alembic for schema changes: If adding columns to ServiceConfig, must use Alembic migration with batch operations for SQLite
- No raw SQL: Use SQLAlchemy ORM
-
Route ordering: Static routes before parameterized (
/region-capabilities/{region}before/{id}) - Create returns 201, Delete returns 204
- ruff check + format must pass
- Tests required: pytest with >=95% coverage per project convention
-
Conventional commits:
feat:,fix:,test:
- Azure Content Understanding Overview - GA service, Foundry Tool, REST API
- Azure Content Understanding REST API Quickstart - Endpoint patterns, auth header, API version 2025-11-01
- Azure Content Understanding Service Limits - File types, rate limits, analyzer limits
- Azure OpenAI Realtime API - WebSocket protocol, /openai/v1 GA endpoint, supported models
- Azure Voice Live API Overview - Model support, pricing tiers, regions
- Azure Voice Live API How-To - WebSocket endpoint, auth, session config, avatar integration
- Azure Speech Service Regions - Complete region tables for Avatar, Voice Live, STT, TTS
- Azure TTS Avatar Overview - Avatar capabilities, real-time synthesis
- Azure Real-time Avatar Synthesis - WebRTC setup, ICE servers, Speech SDK
- Existing codebase:
backend/app/services/connection_tester.py,azure_config.py,voice_live_service.py- Current implementation patterns - Existing codebase:
frontend/src/pages/admin/azure-config.tsx- 7 service cards with SERVICE_KEY_MAP
Confidence breakdown:
- Standard stack: HIGH - no new packages, all verified in existing pyproject.toml
- Architecture (service classification): HIGH - verified each service's API pattern against official docs
- Region capabilities: MEDIUM - Avatar and Voice Live regions verified from official tables; Content Understanding availability unclear
- Connection test patterns: HIGH - verified endpoint URLs and auth headers from official docs
- Voice Live Agent mode: MEDIUM - docs confirm agent_id/project_id params but API key vs Entra auth unclear
- Pitfalls: HIGH - identified from analyzing existing code gaps vs official API requirements
Research date: 2026-03-27 Valid until: 2026-04-27 (Azure services evolve; region tables should be re-verified monthly)
Click to expand UI spec
Visual and interaction contract for the Azure AI Service Integration & Voice/Avatar phase. This phase is predominantly backend-focused (new adapters, connection testers, region capabilities). Frontend changes are surgical extensions to the existing Azure Config admin page: region availability hints per service, Agent/Model mode selector for Voice Live, updated region warnings, and real connection status display. No new pages are created. Generated by gsd-ui-researcher, verified by gsd-ui-checker.
| Property | Value |
|---|---|
| Tool | manual shadcn/ui adaptation (no components.json) -- inherited from Phase 1 |
| Preset | not applicable -- components adapted from local Figma Make exports |
| Component library | Radix UI primitives via shadcn/ui wrappers |
| Icon library | lucide-react ^0.460.0 |
| Font | Inter (EN) + Noto Sans SC (CN), loaded via Google Fonts |
Source: Phase 1 UI-SPEC (established design system); no changes for Phase 7.
Note: Phase 7 introduces no new UI primitive dependencies from Radix/shadcn. All frontend changes are extensions to existing components (azure-config.tsx, service-config-card.tsx) and minor additions to the admin i18n namespace. The only new UI construct is a region availability indicator and an Agent/Model radio toggle inside the Voice Live config card.
Declared values (inherited from Phase 1, multiples of 4):
| Token | Value | Usage |
|---|---|---|
| xs | 4px | Icon gaps, status indicator dot spacing, inline label gaps |
| sm | 8px | Compact element spacing, badge internal padding, radio button gaps |
| md | 16px | Default element spacing, config card internal padding, form field gaps |
| lg | 24px | Section padding inside expanded config cards |
| xl | 32px | Vertical gap between service config cards |
| 2xl | 48px | Major section breaks (page header to card list) |
| 3xl | 64px | Not used in Phase 7 (no new pages) |
Component Dimensions (not spacing tokens):
- Region availability hint bar height: 40px (5 * 8px) -- inline informational bar below region input field
- Service config card collapsed height: 72px (9 * 8px) -- icon (48px) + vertical padding, matching existing pattern
- Agent/Model mode radio group width: fills parent grid cell (col-span-2 within the existing 2-column grid)
Source: Phase 1 UI-SPEC spacing scale; existing service-config-card.tsx layout measurements
| Role | Size | Weight | Line Height | CSS Variable |
|---|---|---|---|---|
| Body | 16px (1rem) | 400 (normal) | 1.5 |
--text-base / --font-weight-normal
|
| Label | 16px (1rem) | 500 (medium) | 1.5 |
--text-base / --font-weight-medium
|
| Heading (h3) | 18px (1.125rem) | 500 (medium) | 1.5 |
--text-lg / --font-weight-medium
|
| Display (h1) | 24px (1.5rem) | 500 (medium) | 1.5 |
--text-2xl / --font-weight-medium
|
Font stack: 'Inter', 'Noto Sans SC', sans-serif
Phase 7 supplementary text -- inherited Tailwind utility classes (not part of the design system type scale):
Phase 7 uses text-xs (12px) and text-sm (14px) from Tailwind's default utility classes for supplementary and secondary text within config cards. These are NOT additions to the Phase 7 type scale -- they are pre-existing framework defaults applied to lower-emphasis elements:
-
text-xs(12px, weight 500): Region availability tag badges (uppercase), config card masked key hint (already exists) -
text-sm(14px, weight 400): Region availability hint text (text-muted-foreground), service status text (color per status) -
text-sm(14px, weight 500): Agent/Model mode radio labels
Sizes declared (3 total, unchanged): 16px, 18px, 24px Weights declared (2 total, unchanged): 400 (normal), 500 (medium)
Source: Phase 1 UI-SPEC typography; existing service-config-card.tsx and azure-config.tsx patterns
| Role | Value | Tailwind Class | Usage |
|---|---|---|---|
| Dominant (60%) | #FFFFFF |
bg-background |
Page background, config card surfaces |
| Secondary (30%) | #ECECF0 |
bg-muted |
Config card header backgrounds on hover |
| Accent (10%) | #1E40AF |
bg-primary |
Save button, active status dot, available badges |
| Destructive | #EF4444 |
bg-destructive |
Error status dot, connection failure badges |
| Element | Color | Token | Rationale |
|---|---|---|---|
| Service status: active | #22C55E |
bg-green-500 |
Green dot -- service configured and connection verified (existing pattern) |
| Service status: inactive | #9CA3AF |
bg-gray-400 |
Gray dot -- service not configured (existing pattern) |
| Service status: error | #EF4444 |
bg-red-500 |
Red dot -- connection test failed (existing pattern) |
| Service status: testing | #EAB308 |
bg-yellow-500 animate-pulse |
Yellow pulsing dot -- test in progress (existing pattern) |
| Service status: unavailable | #9333EA |
bg-purple-600 |
Purple dot -- service not available in configured region (NEW) |
| Region available badge |
#22C55E on #F0FDF4
|
text-green-700 bg-green-50 |
Green badge -- service available in this region |
| Region unavailable badge |
#9333EA on #FAF5FF
|
text-purple-700 bg-purple-50 |
Purple badge -- service not available in this region |
| Region unknown badge |
#6B7280 on #F9FAFB
|
text-gray-600 bg-gray-50 |
Gray badge -- region not recognized or availability unknown |
| Region warning inline |
#F97316 on #FFF7ED
|
text-orange-800 bg-orange-50 border-orange-200 |
Orange warning bar -- limited availability notice (existing Voice Live pattern) |
| Agent mode indicator | #1E40AF |
text-primary |
Primary color badge for Agent mode label |
| Model mode indicator | #475569 |
text-secondary |
Secondary color badge for Model mode label |
Inherited from Phase 1 reserved list, plus:
- Save Configuration button (
bg-primary) - Active service status dot (
bg-green-500) -- note: this uses green, not accent blue - Region "Available" badge text
- Agent/Model mode active radio indicator
NOT used for: region availability hint background (use orange/purple/gray semantic colors), config card borders (use --border), test connection button (use variant="outline").
Source: Phase 1 UI-SPEC color contract; existing service-config-card.tsx STATUS_DOT pattern; CONTEXT.md region availability design direction
Primary visual anchor: The expanded service config card with its colored status dot (green/red/purple) and region availability badge draws the eye first; the status dot and badge provide immediate at-a-glance service health.
Secondary visual anchor: The page heading ("Azure Service Configuration") paired with the "Test All Connections" button in the top-right corner, establishing the page context and primary bulk action.
Phase 7 does not create new pages. All UI changes occur within the existing azure-config.tsx page and service-config-card.tsx component.
+-------------------------------------------------------------------+
| Azure Service Configuration (h1) [Test All Connections] |
+-------------------------------------------------------------------+
| max-w-4xl, space-y-4 |
| |
| +--------------------------------------------------------------+ |
| | [icon] Azure OpenAI (green dot) Active [^/v] | |
| | GPT-4o for AI coaching... | |
| | [Available in eastus2] (green badge) | |
| | (expanded: endpoint, apiKey, model, region fields + buttons) | |
| +--------------------------------------------------------------+ |
| |
| +--------------------------------------------------------------+ |
| | [icon] Azure Speech (STT) (gray dot) Inactive [^/v] | |
| | Speech-to-text for voice input... | |
| | [Available in eastus2] (green badge) | |
| +--------------------------------------------------------------+ |
| |
| +--------------------------------------------------------------+ |
| | [icon] Azure Speech (TTS) (gray dot) Inactive [^/v] | |
| | Text-to-speech for HCP voice... | |
| | [Available in eastus2] (green badge) | |
| +--------------------------------------------------------------+ |
| |
| +--------------------------------------------------------------+ |
| | [icon] Azure AI Avatar (purple dot) Unavailable [^/v] | |
| | Digital human avatar for HCP... | |
| | [Not available in westus] (purple badge) | |
| +--------------------------------------------------------------+ |
| |
| +--------------------------------------------------------------+ |
| | [icon] Azure Content Understanding (gray dot) Inactive[^/v] | |
| | Multimodal evaluation for training... | |
| | [Available] (green badge) | |
| +--------------------------------------------------------------+ |
| |
| +--------------------------------------------------------------+ |
| | [icon] Azure OpenAI Realtime (gray dot) Inactive [^/v] | |
| | Real-time audio streaming... | |
| | [Available] (green badge) | |
| +--------------------------------------------------------------+ |
| |
| +--------------------------------------------------------------+ |
| | [icon] Azure Voice Live API (green dot) Active [^/v] | |
| | Real-time voice coaching with GPT-4o Realtime | |
| | [Available in eastus2] (green badge) | |
| | (expanded: endpoint, apiKey, region fields | |
| | + Agent/Model mode radio | |
| | + model field (Model mode) OR agent_id + project fields | |
| | + Save + Test Connection buttons) | |
| +--------------------------------------------------------------+ |
| |
| +--------------------------------------------------------------+ |
| | [icon] Azure Database (gray dot) Inactive [^/v] | |
| | Managed PostgreSQL for production data | |
| +--------------------------------------------------------------+ |
+-------------------------------------------------------------------+
Added below the service description line, before the expand/collapse chevron area:
+--------------------------------------------------------------+
| [48px icon bg] Service Name (status dot) Status [^/v] |
| Service description text |
| [Region Badge] Available in eastus2 |
+--------------------------------------------------------------+
The region badge appears ONLY when:
- A region is configured in the saved config for this service, AND
- Region capability data has been fetched
Badge variants:
- Green:
[check-icon] Available in {region}--text-green-700 bg-green-50 border border-green-200 - Purple:
[x-icon] Not available in {region}--text-purple-700 bg-purple-50 border border-purple-200 - Gray:
[info-icon] Region availability unknown--text-gray-600 bg-gray-50 border border-gray-200
When the Voice Live card is expanded, display a radio group ABOVE the standard form fields:
+--------------------------------------------------------------+
| Voice Live API Mode |
| |
| (o) Model Mode -- Direct Azure OpenAI model connection |
| ( ) Agent Mode -- Azure AI Agent Service with pre-configured |
| HCP persona agent |
+--------------------------------------------------------------+
| (standard 2-col grid: endpoint, apiKey, region) |
| |
| Model Mode fields: |
| [Model/Deployment: gpt-realtime ] [Region: eastus2 ] |
| |
| -- OR (if Agent Mode selected) -- |
| |
| Agent Mode fields: |
| [Agent ID: ____________ ] [Project Name: _________ ] |
+--------------------------------------------------------------+
| [Save Configuration] [Test Connection] |
+--------------------------------------------------------------+
The radio group toggles which additional fields are shown:
-
Model mode (default): Standard
model_or_deployment+regionfields -
Agent mode:
agent_id+project_namefields (stored encoded inmodel_or_deploymentasagent:{agent_id}:{project_name})
No changes to responsive behavior. The existing Azure Config page is within the admin layout which handles responsive breakpoints. The config cards stack vertically at all breakpoints. The 2-column form grid within expanded cards collapses to single column below 640px (existing behavior).
Source: CONTEXT.md decisions (7 equal modes, region availability hints); RESEARCH.md architecture patterns (service classification, Voice Live Agent/Model modes); existing azure-config.tsx and service-config-card.tsx implementation
All copy delivered via react-i18next admin namespace (extend existing). English canonical copy below.
| Element | Key | English Copy | Chinese Copy |
|---|---|---|---|
| Available badge | admin.azureConfig.regionAvailable |
Available in {{region}} | 在 {{region}} 可用 |
| Unavailable badge | admin.azureConfig.regionUnavailable |
Not available in {{region}} | 在 {{region}} 不可用 |
| Unknown badge | admin.azureConfig.regionUnknown |
Region availability unknown | 区域可用性未知 |
| Avatar region hint | admin.azureConfig.avatarRegionHint |
Azure AI Avatar is available in: East US 2, North Europe, South Central US, Southeast Asia, Sweden Central, West Europe, West US 2 | Azure AI Avatar 可用区域:美国东部 2、北欧、美国中南部、东南亚、瑞典中部、西欧、美国西部 2 |
| Voice Live region hint | admin.azureConfig.voiceLiveRegionHint |
Voice Live API is available in most Azure regions. Agent mode requires specific AI Foundry regions. | Voice Live API 在大多数 Azure 区域可用。Agent 模式需要特定的 AI Foundry 区域。 |
| Content Understanding hint | admin.azureConfig.contentHint |
Requires a Microsoft Foundry Resource (not a standard Cognitive Services resource). | 需要 Microsoft Foundry 资源(非标准认知服务资源)。 |
| Realtime hint | admin.azureConfig.realtimeHint |
Requires gpt-4o-realtime-preview or gpt-realtime model deployment. | 需要部署 gpt-4o-realtime-preview 或 gpt-realtime 模型。 |
| Element | Key | English Copy | Chinese Copy |
|---|---|---|---|
| Mode section label | admin.voiceLive.modeLabel |
Voice Live API Mode | Voice Live API 模式 |
| Model mode label | admin.voiceLive.modelMode |
Model Mode | 模型模式 |
| Model mode description | admin.voiceLive.modelModeDesc |
Direct Azure OpenAI model connection | 直接 Azure OpenAI 模型连接 |
| Agent mode label | admin.voiceLive.agentMode |
Agent Mode | Agent 模式 |
| Agent mode description | admin.voiceLive.agentModeDesc |
Azure AI Agent Service with pre-configured HCP persona agent | Azure AI Agent 服务配合预配置的 HCP 角色代理 |
| Agent ID field | admin.voiceLive.agentId |
Agent ID | Agent ID |
| Project name field | admin.voiceLive.projectName |
Project Name | 项目名称 |
| Agent ID placeholder | admin.voiceLive.agentIdPlaceholder |
Enter Azure AI Agent ID | 输入 Azure AI Agent ID |
| Project placeholder | admin.voiceLive.projectPlaceholder |
Enter AI Foundry project name | 输入 AI Foundry 项目名称 |
| Element | Key | English Copy | Chinese Copy |
|---|---|---|---|
| Content Understanding success | admin.azureConfig.contentTestSuccess |
Content Understanding service connected. Analyzers accessible. | 内容理解服务已连接。分析器可访问。 |
| Content Understanding failure | admin.azureConfig.contentTestFailed |
Content Understanding connection failed. Verify this is a Foundry Resource endpoint. | 内容理解连接失败。请确认此为 Foundry 资源端点。 |
| Realtime success | admin.azureConfig.realtimeTestSuccess |
Realtime API deployment verified. | 实时 API 部署已验证。 |
| Realtime failure | admin.azureConfig.realtimeTestFailed |
Realtime deployment not found. Verify the deployment name supports realtime audio. | 未找到实时部署。请确认部署名称支持实时音频。 |
| Voice Live success | admin.azureConfig.voiceLiveTestSuccess |
Voice Live API endpoint reachable. | Voice Live API 端点可访问。 |
| Voice Live failure | admin.azureConfig.voiceLiveTestFailed |
Voice Live API connection failed. Verify the endpoint and API key. | Voice Live API 连接失败。请验证端点和 API 密钥。 |
| Avatar success | admin.azureConfig.avatarTestSuccess |
Avatar service reachable. ICE relay token retrieved. | Avatar 服务可访问。ICE 中继令牌已获取。 |
| Avatar failure | admin.azureConfig.avatarTestFailed |
Avatar connection failed. Verify the region supports Avatar synthesis. | Avatar 连接失败。请确认该区域支持 Avatar 合成。 |
| Element | Copy |
|---|---|
| Primary CTA |
Save Configuration -- Phase 7 updates the existing "Save" button label in each config card to "Save Configuration" for specificity. i18n key: admin.azureConfig.saveConfig. |
| Empty state heading | N/A -- config page always shows all 8 service cards, never empty |
| Empty state body | N/A -- each unconfigured card shows "Inactive" status dot and form fields |
| Error state | Toast notification via sonner: "{Service Name}: {error.message}" with problem description and solution path (see connection test copy above) |
| Destructive confirmation | None in Phase 7 -- there is no "delete configuration" action. Saving empty fields effectively disables a service. |
| Element | Key | Old Copy | New Copy (EN) | New Copy (CN) |
|---|---|---|---|---|
| Region warning | admin.voiceLive.regionWarning |
Voice Live API is only available in East US 2 and Sweden Central regions. | Voice Live API is available in most Azure regions. Some models and Agent mode may have regional restrictions. Check the Azure documentation for your region. | Voice Live API 在大多数 Azure 区域可用。部分模型和 Agent 模式可能有区域限制。请查阅您所在区域的 Azure 文档。 |
Source: CONTEXT.md decisions (7 equal modes, region availability); RESEARCH.md (region capability map, connection test patterns, Agent/Model mode); existing admin.json i18n structure
Phase 7 does NOT create new component files. All changes are modifications to existing components.
| Component | Extension | Location |
|---|---|---|
service-config-card.tsx |
(1) Add regionStatus prop for region availability badge display. (2) Add unavailable to ServiceStatus type with purple dot. (3) Add region availability badge below description text. |
components/admin/ |
service-config-card.tsx |
(4) For Voice Live card: add Agent/Model mode radio group that toggles between model fields and agent fields. Radio selection encodes into model_or_deployment field. |
components/admin/ |
azure-config.tsx |
(1) Remove hardcoded Voice Live region warning (eastus2/swedencentral only). (2) Add useRegionCapabilities hook call to fetch region data. (3) Pass regionStatus prop to each ServiceConfigCard based on fetched capabilities. (4) Add azure_openai_realtime to SERVICE_KEY_MAP if missing (already present). (5) Update save button label from "Save" to "Save Configuration" (admin.azureConfig.saveConfig). |
pages/admin/ |
| Hook | Purpose | Location |
|---|---|---|
use-region-capabilities.ts |
TanStack Query hook wrapping GET /api/v1/azure-config/region-capabilities/{region}. Returns map of service name to availability boolean. Stale time: 5 minutes. |
hooks/ |
| Addition | Location |
|---|---|
RegionCapabilities interface: { region: string; services: Record<string, { available: boolean; note?: string }> }
|
types/azure-config.ts (extend existing file) |
RegionStatus type: "available" | "unavailable" | "unknown"
|
types/azure-config.ts (extend existing file) |
| Namespace | Keys Added | Count |
|---|---|---|
admin |
azureConfig.regionAvailable, azureConfig.regionUnavailable, azureConfig.regionUnknown, azureConfig.avatarRegionHint, azureConfig.voiceLiveRegionHint, azureConfig.contentHint, azureConfig.realtimeHint, azureConfig.saveConfig, voiceLive.modeLabel, voiceLive.modelMode, voiceLive.modelModeDesc, voiceLive.agentMode, voiceLive.agentModeDesc, voiceLive.agentId, voiceLive.projectName, voiceLive.agentIdPlaceholder, voiceLive.projectPlaceholder, connection test result keys (8) |
24 total |
| Component | Phase 7 Usage |
|---|---|
Card, CardHeader, CardContent
|
Service config card structure (existing) |
Input, Label
|
Config form fields (existing) |
Button |
Save Configuration, Test Connection buttons (existing) |
Badge |
Region availability indicators (new usage of existing component) |
Sonner toast
|
Success/error notifications for save and test operations (existing) |
Loader2 icon |
Testing spinner (existing) |
| State | Status Dot | Badge | User Actions |
|---|---|---|---|
| Not configured | Gray (bg-gray-400) |
None (no region set) | Expand card, enter config, save |
| Configured + untested | Gray (bg-gray-400) |
Region badge (if region set) | Test connection, save changes |
| Testing | Yellow pulsing (bg-yellow-500 animate-pulse) |
Region badge (if region set) | Wait for test result |
| Active (test passed) | Green (bg-green-500) |
Region badge (if region set) | Reconfigure, re-test |
| Error (test failed) | Red (bg-red-500) |
Region badge (if region set) | Fix config, re-test |
| Unavailable in region (NEW) | Purple (bg-purple-600) |
Purple "Not available" badge | Save config anyway (admin may prepare for future), change region |
| Trigger | Behavior |
|---|---|
| Region field changes | Debounce 500ms, then fetch GET /region-capabilities/{newRegion}
|
| Region capabilities response received | Update all service card badges simultaneously |
| Region empty | Hide all region badges |
| API error fetching capabilities | Show gray "Region availability unknown" badge |
| Region not in capability map | Show gray "Region availability unknown" badge |
| State | Fields Shown | Encoding |
|---|---|---|
| Model mode (default) | Endpoint, API Key, Model/Deployment, Region |
model_or_deployment = model name (e.g., gpt-realtime) |
| Agent mode | Endpoint, API Key, Agent ID, Project Name, Region |
model_or_deployment = agent:{agent_id}:{project_name}
|
| Switching modes | Clear mode-specific fields, preserve shared fields (endpoint, API key, region) | Re-encode on next save |
Existing "Test All Connections" button behavior is unchanged. Tests are run sequentially for all services that have an endpoint configured. Each result shows a toast. Services unavailable in the configured region still attempt the connection test -- unavailability is informational, not blocking.
Inherited from Phase 1, plus Phase 7 additions:
| Requirement | Implementation |
|---|---|
| Color contrast | Purple unavailable badge: #9333EA on #FAF5FF meets WCAG AA (4.7:1). Green available badge: #15803D on #F0FDF4 meets AA (5.1:1). |
| Focus indicators | Agent/Model radio buttons receive visible focus ring. Region badge is not interactive (no focus needed). |
| Keyboard navigation | Tab order within expanded Voice Live card: mode radio group -> endpoint -> API key -> model/agent fields -> region -> Save Configuration -> Test. |
| Screen reader | Radio group uses role="radiogroup" with aria-label="Voice Live API Mode". Region badge uses role="status" with aria-live="polite". |
| Touch targets | Radio buttons: 44px minimum touch target height. All other targets unchanged (existing). |
| Token | Value | Usage in Phase 7 |
|---|---|---|
--radius-sm |
6px | Region availability badge border radius |
--radius-md |
8px | Config card internal elements |
--radius-lg |
10px | Config card container (existing) |
--radius-xl |
14px | Not used in Phase 7 |
| Token | Usage in Phase 7 |
|---|---|
Card shadow (0 1px 3px rgba(0,0,0,0.1)) |
Service config cards (existing, unchanged) |
| Registry | Blocks Used | Safety Gate |
|---|---|---|
| shadcn official (manual adaptation) | No new UI primitives added in Phase 7 -- Badge component already exists | not required |
| Third-party registries | none | not applicable |
Note: Phase 7 makes no additions to the UI component library. All changes are extensions to existing admin page components using existing primitives (Card, Input, Label, Button, Badge). No new Radix primitives or shadcn/ui base components are required.
| Namespace | Status | Files |
|---|---|---|
admin |
Extended | Add 24 new keys for region availability, Agent/Model mode, save button label, and connection test results |
voice |
Updated | Update error.regionUnsupported to reflect expanded Voice Live region support |
coach |
Unchanged | No changes |
common |
Unchanged | No changes |
All admin config copy uses useTranslation("admin"). Voice error copy update uses useTranslation("voice").
Phase 7 does NOT implement dark mode. Dark tokens are preserved in CSS. Region availability badges use semantic color classes (bg-green-50, bg-purple-50, bg-orange-50) that have no dark mode variants defined. If dark mode is added later, these badges will need dark-aware color tokens.
- Dimension 1 Copywriting: PASS
- Dimension 2 Visuals: PASS
- Dimension 3 Color: PASS
- Dimension 4 Typography: PASS
- Dimension 5 Spacing: PASS
- Dimension 6 Registry Safety: PASS
Approval: pending
Click to expand verification report
Phase Goal: Full Azure AI service integration -- all 7 service modes (OpenAI, Speech STT, Speech TTS, Avatar, Content Understanding, OpenAI Realtime, Voice Live) configurable, testable, and registered. Backend adapters, connection testers, region capabilities, frontend config UI with region badges and Voice Live Agent/Model mode toggle. Verified: 2026-03-27T15:00:00Z Status: PASSED Re-verification: Yes -- previous verification covered plans 01-04 only (6/6 truths). Plans 05-07 executed after that verification, adding 7 new truths for a total of 13.
| # | Truth | Status | Evidence |
|---|---|---|---|
| 1 | Admin can configure Azure OpenAI endpoint/key/model/region from web UI, settings persist to DB with encrypted API key | VERIFIED | PUT /api/v1/azure-config/services/{service_name} persists to service_configs table with Fernet-encrypted API key. Frontend uses TanStack Query hooks. Lifespan loads from DB on startup. 14 original API tests pass. |
| 2 | Admin can configure all 7 Azure services (OpenAI, Speech STT, Speech TTS, Avatar, Content Understanding, Realtime, Voice Live) from the same config page | VERIFIED |
SERVICE_DISPLAY_NAMES in azure_config.py has exactly 7 entries (line 24-32). Frontend AZURE_SERVICES array has 7 service cards plus database. SERVICE_KEY_MAP maps all 7 backend keys. Same PUT/GET/test endpoints handle all types. |
| 3 | Connection testing validates real Azure service reachability for all 7 services | VERIFIED |
test_service_connection() dispatches to 7 branches: test_azure_openai (chat completion), test_azure_speech (voice list), test_azure_avatar (ICE relay token), test_azure_voice_live (endpoint probe), test_azure_content_understanding (list analyzers), test_azure_realtime (verify deployment). SSRF prevention via validate_endpoint_url with AZURE_HOST_PATTERN. 14 connection tester tests pass. |
| 4 | Dynamic provider switching: mock to Azure adapters on config save and server startup | VERIFIED |
register_adapter_from_config() handles all 7 services (lines 45-94 of azure_config.py). main.py lifespan registers mocks first (lines 42-45), then loads DB configs (lines 74-90) calling register_adapter_from_config. |
| 5 | API keys stored encrypted (Fernet) in database | VERIFIED |
encryption.py provides encrypt_value()/decrypt_value(). config_service.upsert_config() encrypts on save, get_decrypted_key() decrypts on read. 4 encryption tests + 8 config service tests pass. |
| 6 | System falls back to mock adapters when Azure not configured | VERIFIED | Lifespan registers mock adapters FIRST (lines 42-45), Azure overrides only when api_key is non-empty. except Exception: pass on DB load handles first-run gracefully. |
| 7 | Content Understanding adapter has bounded polling (max 30 attempts, 2s interval) | VERIFIED |
azure_content.py lines 16-17: MAX_POLL_ATTEMPTS = 30, POLL_INTERVAL_SECONDS = 2.0. execute() uses for-loop with asyncio.sleep, handles Succeeded/Failed/Cancelled/timeout. 5 adapter tests verify success, poll failure, and HTTP error paths. |
| 8 | Realtime and Voice Live adapters exist as config-only frontend-direct services | VERIFIED |
azure_realtime.py: AzureRealtimeAdapter yields ERROR "frontend-direct" + DONE. azure_voice_live.py: AzureVoiceLiveAdapter same pattern. Both have working is_available(). 6 adapter tests pass. |
| 9 | Voice Live supports Agent/Model mode via JSON-structured encoding with legacy colon fallback | VERIFIED |
parse_voice_live_mode() tries JSON.loads first, falls back to colon-split, defaults to model mode. encode_voice_live_mode() produces JSON for agent, plain string for model. Frontend parseVoiceLiveMode/encodeVoiceLiveMode mirror exactly. 5 parse/encode tests pass. |
| 10 | Region capabilities returns per-service availability for any Azure region | VERIFIED |
region_capabilities.py: 7 services in ALL_SERVICE_NAMES, AVATAR_REGIONS (7 regions), VOICE_LIVE_REGIONS (20 regions), VOICE_LIVE_AGENT_REGIONS. get_region_capabilities() normalizes to lowercase, returns availability + notes. GET /region-capabilities/{region} endpoint requires admin. 8 region tests + 3 API tests pass. |
| 11 | Frontend shows region availability badges with accessible icon + text (not color-only) | VERIFIED |
service-config-card.tsx lines 149-178: regionStatus prop renders green Check badge (available), purple X badge (unavailable), gray Info badge (unknown). Each has role="status", aria-hidden="true" on icons, sr-only label on status dot (line 146). |
| 12 | Voice Live card has Agent/Model mode radio toggle with structured JSON encoding | VERIFIED |
service-config-card.tsx lines 191-229: role="radiogroup" with aria-label, two radio inputs for model/agent. Agent mode shows agentId + projectName fields (lines 294-321) with required-field validation. handleSave (line 102-128) uses encodeVoiceLiveMode producing JSON for agent mode. |
| 13 | All new UI text externalized via i18n in both en-US and zh-CN | VERIFIED |
admin.json (en-US): regionAvailable, regionUnavailable, regionUnknown, saveConfig, voiceLive.modeLabel, voiceLive.agentMode, etc. admin.json (zh-CN): equivalent Chinese translations. voice.json updated regionUnsupported message. Hardcoded "eastus2/swedencentral" warning removed from azure-config.tsx. |
Score: 13/13 truths verified
| Artifact | Expected | Status | Details |
|---|---|---|---|
backend/app/models/service_config.py |
ServiceConfig ORM model | VERIFIED | TimestampMixin, api_key_encrypted, service_name, is_active |
backend/app/utils/encryption.py |
Fernet encrypt/decrypt | VERIFIED | 49 lines, encrypt_value/decrypt_value |
backend/app/schemas/azure_config.py |
Pydantic v2 schemas | VERIFIED | ServiceConfigUpdate, ServiceConfigResponse, ConnectionTestResult, RegionServiceAvailability, RegionCapabilitiesResponse |
backend/app/services/config_service.py |
Config CRUD with encryption | VERIFIED | 90 lines, get_all_configs, upsert_config, get_config, get_decrypted_key |
backend/app/services/agents/adapters/azure_openai.py |
AzureOpenAIAdapter with streaming | VERIFIED | 106 lines, BaseCoachingAdapter, streaming execute |
backend/app/services/agents/adapters/azure_content.py |
Content Understanding adapter | VERIFIED | 116 lines, bounded polling (30 attempts, 2s), httpx REST calls |
backend/app/services/agents/adapters/azure_realtime.py |
Realtime config adapter | VERIFIED | 52 lines, frontend-direct, ERROR+DONE pattern |
backend/app/services/agents/adapters/azure_voice_live.py |
Voice Live config adapter | VERIFIED | 129 lines, Agent/Model mode, parse/encode utilities |
backend/app/services/region_capabilities.py |
Region lookup module | VERIFIED | 128 lines, 7 services, AVATAR_REGIONS(7), VOICE_LIVE_REGIONS(20), LAST_VERIFIED date |
backend/app/services/connection_tester.py |
Real connection tests for all 7 services | VERIFIED | 210 lines, validate_endpoint_url (SSRF), 7 dispatch branches |
backend/app/api/azure_config.py |
Full CRUD + region capabilities API | VERIFIED | 187 lines, 7 SERVICE_DISPLAY_NAMES, register_adapter_from_config for all 7, region-capabilities endpoint |
backend/app/main.py |
Lifespan with DB config loading | VERIFIED | Lines 67-92: loads active configs, decrypts, calls register_adapter_from_config |
frontend/src/types/azure-config.ts |
TypeScript types + Voice Live encoding | VERIFIED | 89 lines, RegionCapabilities, RegionStatus, VoiceLiveMode, parseVoiceLiveMode, encodeVoiceLiveMode |
frontend/src/api/azure-config.ts |
Typed API client | VERIFIED | 44 lines, getServiceConfigs, updateServiceConfig, testServiceConnection, getRegionCapabilities |
frontend/src/hooks/use-azure-config.ts |
TanStack Query hooks | VERIFIED | useServiceConfigs, useUpdateServiceConfig, useTestServiceConnection |
frontend/src/hooks/use-region-capabilities.ts |
Region capabilities hook | VERIFIED | 13 lines, useRegionCapabilities with staleTime 5min, retry 1 |
frontend/src/pages/admin/azure-config.tsx |
Config page with 7 services + region status | VERIFIED | 190 lines, useRegionCapabilities, getRegionStatus, regionStatus prop, no hardcoded region warnings |
frontend/src/components/admin/service-config-card.tsx |
Config card with badges, toggle | VERIFIED | 341 lines, regionStatus prop, STATUS_DOT with unavailable/purple, Voice Live radio group, encodeVoiceLiveMode |
frontend/public/locales/en-US/admin.json |
i18n keys for regions, Voice Live | VERIFIED | regionAvailable, saveConfig, voiceLive.modeLabel, agentMode, agentIdRequired |
frontend/public/locales/zh-CN/admin.json |
Chinese translations | VERIFIED | Equivalent keys in Chinese |
backend/tests/test_adapters_new.py |
3 new adapter tests | VERIFIED | 15 tests pass |
backend/tests/test_region_capabilities.py |
Region capability tests | VERIFIED | 8 tests pass |
backend/tests/test_connection_tester_extended.py |
Connection tester dispatch + URL validation | VERIFIED | 14 tests pass |
backend/tests/test_azure_config_api_extended.py |
API integration tests | VERIFIED | 6 tests pass |
| From | To | Via | Status | Details |
|---|---|---|---|---|
azure_config.py (API) |
region_capabilities.py |
get_region_capabilities call |
WIRED | Line 16: import, line 103: call in GET endpoint |
connection_tester.py |
httpx.AsyncClient |
Real HTTP for avatar, content, realtime | WIRED | Lines 81, 106, 137: httpx.AsyncClient for ICE token, analyzers, deployment verify |
azure_config.py |
AzureRealtimeAdapter |
register on config save | WIRED | Lines 71-77: import and register |
azure_config.py |
AzureContentUnderstandingAdapter |
register on config save | WIRED | Lines 79-83: import and register |
azure_config.py |
AzureVoiceLiveAdapter |
register on config save | WIRED | Lines 85-94: import and register |
main.py |
register_adapter_from_config |
startup DB loading | WIRED | Line 68: import, line 84: call for each active config |
voice_live_service.py |
VOICE_LIVE_REGIONS |
SUPPORTED_REGIONS reference | WIRED | Line 7: from app.services.region_capabilities import VOICE_LIVE_REGIONS, line 9: SUPPORTED_REGIONS = VOICE_LIVE_REGIONS
|
azure-config.tsx (page) |
use-region-capabilities.ts |
useRegionCapabilities hook | WIRED | Line 21: import, line 100: call with primaryRegion |
service-config-card.tsx |
azure-config.ts types |
RegionStatus, parseVoiceLiveMode, encodeVoiceLiveMode | WIRED | Lines 14-16: imports used in badge rendering and Voice Live save |
use-region-capabilities.ts |
azure-config.ts (API) |
getRegionCapabilities | WIRED | Line 2: import, line 7: call in queryFn |
connection_tester.py |
azure_openai_realtime dispatch |
test_service_connection | WIRED | Line 206-207: azure_openai_realtime -> test_azure_realtime |
avatar/azure.py |
region parameter | constructor | WIRED | Line 19: def __init__(self, endpoint, key, region=""), line 22: self._region = region
|
| Artifact | Data Variable | Source | Produces Real Data | Status |
|---|---|---|---|---|
azure-config.tsx |
savedConfigs |
useServiceConfigs() -> GET /azure-config/services
|
DB query via config_service.get_all_configs() | FLOWING |
azure-config.tsx |
regionCaps |
useRegionCapabilities(primaryRegion) -> GET /azure-config/region-capabilities/{region}
|
get_region_capabilities() with hardcoded region maps | FLOWING |
service-config-card.tsx |
savedConfig, regionStatus
|
Props from parent | Derived from API queries above | FLOWING |
service-config-card.tsx |
parsedVoiceLive |
parseVoiceLiveMode(savedConfig?.model_or_deployment) |
Parses actual DB-stored model_or_deployment field | FLOWING |
| Behavior | Command | Result | Status |
|---|---|---|---|
| All 91 Phase 07 backend tests pass | pytest tests/test_encryption.py tests/test_config_service.py tests/test_azure_openai_adapter.py tests/test_azure_config_api.py tests/test_adapters_new.py tests/test_region_capabilities.py tests/test_connection_tester_extended.py tests/test_azure_config_api_extended.py --tb=no -q |
91 passed in 6.19s | PASS |
| TypeScript compiles | npx tsc --noEmit |
Exit 0, no errors | PASS |
| Frontend build | npm run build |
Fails: unresolved import "rt-client" in use-voice-live.ts | SKIP (pre-existing Phase 08 issue, not Phase 07) |
| SERVICE_DISPLAY_NAMES count | grep count in azure_config.py | 7 entries | PASS |
| SUPPORTED_REGIONS count | Python import check | 20 regions (from VOICE_LIVE_REGIONS) | PASS |
| Hardcoded region warning removed | grep "isUnsupported" azure-config.tsx | No matches | PASS |
| Requirement | Source Plans | Description | Status | Evidence |
|---|---|---|---|---|
| PLAT-03 | 07-01 through 07-07 | Admin can configure Azure service connections (OpenAI, Speech, Avatar, Content Understanding) from web UI with connection testing | SATISFIED | Full CRUD API for 7 services, real connection tests (ICE token, list analyzers, verify deployment), SSRF prevention, frontend config page with save/test/badges. 91 tests pass. |
| ARCH-05 | 07-01 through 07-06 | Azure service connections configurable per environment -- endpoints, keys, models, regions are config, not code | SATISFIED | ServiceConfig DB model stores all connection params per service, Settings has defaults, DB configs override at startup, dynamic registration via register_adapter_from_config for all 7 services. |
| COACH-04 | 07-06 | User can use voice input (Azure Speech STT) | SATISFIED | Azure STT adapter configurable via admin UI, register_adapter_from_config switches default_stt_provider to "azure", connection tester validates Speech service. |
| COACH-05 | 07-06 | AI HCP responses spoken via Azure Speech TTS | SATISFIED | Azure TTS adapter configurable via admin UI, register_adapter_from_config switches default_tts_provider to "azure", connection tester validates Speech service. |
| COACH-07 | 07-05, 07-07 | Azure AI Avatar as configurable premium option | SATISFIED | Avatar adapter accepts region param, is_available returns True when configured, ICE relay token connection test, region capabilities shows avatar availability per region, frontend badge shows available/unavailable. |
| EXT-04 | 07-05, 07-07 | Azure Voice Live API as unified premium voice+avatar path | SATISFIED | AzureVoiceLiveAdapter with Agent/Model mode, parse_voice_live_mode/encode_voice_live_mode, frontend radio toggle with JSON encoding, VOICE_LIVE_REGIONS (20 regions), Voice Live connection tester with endpoint probe. |
| PLAT-05 | 07-03 through 07-07 | Voice interaction mode configurable per deployment and per session | SATISFIED | STT/TTS/Realtime/Voice Live all configurable via admin UI, register_adapter_from_config handles all voice-related services, region capabilities identifies Voice Live Agent vs Model availability per region. |
| File | Line | Pattern | Severity | Impact |
|---|---|---|---|---|
backend/app/main.py |
91-92 |
except Exception: pass for DB config loading |
Info | Silently swallows DB errors on first run. Acceptable for bootstrap case where table may not exist. |
backend/app/services/agents/adapters/azure_openai.py |
46-47 |
except ImportError: pass for openai import |
Info | Expected pattern for optional dependency. is_available() returns False when client is None. |
No blockers or warnings found.
Test: Log in as admin, navigate to Azure Config, configure at least one service with a region (e.g., "eastus2"), expand other service cards. Expected: Green check badge ("Available in eastus2") on OpenAI, Speech, Avatar. Purple X badge on services not available in region. Gray info badge if API call fails. Why human: Requires running frontend+backend, visual confirmation of badge colors, icons, and text.
Test: Expand Voice Live card, select Agent mode, fill agent_id and project_name, click Save. Then select Model mode and save again.
Expected: Agent mode saves JSON {"mode":"agent","agent_id":"...","project_name":"..."} in model_or_deployment. Model mode saves plain string. Validation prevents saving Agent mode with empty fields.
Why human: Requires running services and verifying saved payload format.
Test: Enter real Azure credentials for each service, click Test Connection. Expected: OpenAI: chat completion test. Speech: voice list. Avatar: ICE relay token. Content Understanding: list analyzers. Realtime: verify deployment. Voice Live: endpoint probe. Each shows success/failure status with specific message. Why human: Requires real Azure credentials and network connectivity.
Test: Configure all 7 services, stop backend, restart, verify configs load from DB. Expected: All configs persist. Adapters re-register on startup. No manual reconfiguration needed. Why human: Requires server lifecycle management and manual observation.
No gaps found. All 13 observable truths verified through code inspection and automated testing:
-
Plans 01-04 (previously verified): ServiceConfig model, encryption, config service, Azure OpenAI adapter, API CRUD, connection testing, dynamic provider switching, frontend config page -- all re-confirmed with 41 original tests passing.
-
Plan 05 (new adapters): Content Understanding adapter with bounded polling (30 attempts, 2s), Realtime and Voice Live as config-only frontend-direct services, region capabilities module with 7 services across 20+ regions -- 4 new files, verified substantive and passing ruff check.
-
Plan 06 (wiring): Connection tester upgraded to real API calls for all 7 services, SSRF prevention via AZURE_HOST_PATTERN, region-capabilities endpoint, azure_openai_realtime accepted by PUT, SUPPORTED_REGIONS expanded to 20+, new adapters register on startup and save -- 50 new tests pass.
-
Plan 07 (frontend UX): Region badges with accessible icon+text (Check/X/Info), Voice Live Agent/Model radio toggle with JSON encoding and field validation, purple unavailable status dot, hardcoded region warning removed, all text externalized in en-US and zh-CN i18n -- TypeScript compiles, all UI patterns verified in code.
Note on build: npm run build fails due to unresolved rt-client import in use-voice-live.ts, which was introduced by Phase 08 commit (744b449), not Phase 07. TypeScript compilation (tsc --noEmit) passes cleanly.
Verified: 2026-03-27T15:00:00Z Verifier: Claude (gsd-verifier)