T5: NLP Chatbot Module Design - congsinhv/fluxion GitHub Wiki

NLP Chatbot Module Design

Issue: #27 (NLP Chatbot module design) Week: 5 | 21/04 – 27/04/2026

Official numbering: Section 3.5 in the Master TOC

Module positioning: The NLP Chatbot module in Fluxion is built as a Tool-Augmented LLM Application. It focuses on prompt engineering, tool schema design, and safety mechanisms, not on fundamental NLP research. Main contributions: a system prompt specialized for the MDM domain, 5 tool definitions, and a two-step confirmation mechanism for dangerous commands.

3.5 NLP Chatbot Module Design

3.5.1 Conversation Flow Diagrams

3.5.1.1 Simple Query Flow

User: "list the active devices"
  → LLM: intent = query, entity = state:active
    → tool_call: query_devices(state="active")
      → Lambda: SELECT * FROM devices WHERE state='active'
        → result: [{id, name, serial, state}, ...]
          → LLM: format response
            → User: "There are 12 active devices: ..."

3.5.1.2 Command Flow (with Confirmation)

User: "lock device iPhone-001"
  → LLM: intent = lock, entity = iPhone-001
    → tool_call: send_device_command(device_id="...", command="lock", confirmed=false)
      → Lambda: confirmed=false → return {requiresConfirmation: true}
        → LLM → User: "Confirm locking device iPhone-001?"
          → User: "confirm"
            → tool_call: send_device_command(device_id="...", command="lock", confirmed=true)
              → Lambda: dispatch via command pipeline
                → User: "Lock command sent to device iPhone-001. Status: ACTION_PENDING"

3.5.1.3 Disambiguation Flow

User: "lock An's device"
  → LLM: device_id unclear
    → tool_call: query_devices(search="An")
      → result: 2 devices match "An": iPhone-An-001, iPad-An-002
        → LLM → User: "Found 2 devices belonging to An:
           1. iPhone-An-001 (active)
           2. iPad-An-002 (locked)
           Which device do you want to lock?"
          → User: "1"
            → tool_call: send_device_command(device_id="iPhone-An-001", command="lock", confirmed=false)
              → ... confirmation flow ...
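
The disambiguation step above can be sketched as a small resolver. It assumes query_devices returns a list of dicts with id, name, and state keys; the function name resolve_device and the shape of its result are illustrative and not part of Fluxion.

```python
from typing import Optional


def resolve_device(matches: list[dict], choice: Optional[int] = None) -> dict:
    """Decide whether a search result identifies exactly one device.

    `matches` is assumed to be the list returned by query_devices;
    `choice` is the 1-based option the user picked, if any.
    """
    if not matches:
        return {"status": "not_found"}
    if len(matches) == 1:
        return {"status": "resolved", "device_id": matches[0]["id"]}
    if choice is not None and 1 <= choice <= len(matches):
        return {"status": "resolved", "device_id": matches[choice - 1]["id"]}
    # Several candidates and no valid pick yet: ask the user to choose.
    options = [f"{i}. {d['name']} ({d['state']})" for i, d in enumerate(matches, 1)]
    return {"status": "ambiguous", "options": options}


devices = [
    {"id": "iPhone-An-001", "name": "iPhone-An-001", "state": "active"},
    {"id": "iPad-An-002", "name": "iPad-An-002", "state": "locked"},
]
first = resolve_device(devices)             # ambiguous: two candidates
second = resolve_device(devices, choice=1)  # resolved after the user answers "1"
```

Only after the result is "resolved" would the flow continue into the confirmation step with a concrete device_id.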

3.5.2 System Prompt

You are the Fluxion MDM assistant. You help administrators manage iOS devices
via natural language commands in Vietnamese and English.

## Device States (FSM)
Available states: idle, registered, enrolled, active, locked, released

## Available Commands
- lock: Lock the device (only when active)
- unlock: Unlock the device (only when locked)
- send_message: Send a push notification (only when active)
- lock_message: Show a message on the lock screen (only when locked)
- release: Release (reclaim) the device (from any state; admin or operator)

## Rules
1. ALL command actions require explicit user confirmation before execution.
   Always call send_device_command with confirmed=false first.
   Only set confirmed=true AFTER user explicitly confirms.
2. If device_id is ambiguous or a name, call query_devices first to resolve.
3. Never guess device IDs — always verify via query.
4. Respond in the same language as the user.
5. When showing device info, include: name, serial, current state.
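
The state preconditions listed under Available Commands exist only as prompt text. A minimal sketch of the same table as data, which a handler could use to reject a command before dispatch, is shown here. The names ALLOWED_STATES and can_run are hypothetical, and the source does not say whether the Lambda performs this check.

```python
# Device states and per-command preconditions, copied from the system prompt.
ALL_STATES = {"idle", "registered", "enrolled", "active", "locked", "released"}

ALLOWED_STATES = {
    "lock": {"active"},
    "unlock": {"locked"},
    "send_message": {"active"},
    "lock_message": {"locked"},
    "release": ALL_STATES,  # the prompt allows release from any state
}


def can_run(command: str, device_state: str) -> bool:
    """Return True when `command` is permitted in `device_state`."""
    if device_state not in ALL_STATES:
        raise ValueError(f"unknown state: {device_state}")
    if command not in ALLOWED_STATES:
        raise ValueError(f"unknown command: {command}")
    return device_state in ALLOWED_STATES[command]
```

Keeping the table in code as well as in the prompt means a wrong guess by the model about the current state fails with a clear error rather than reaching the device.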

3.5.3 Tool Definitions (5 Tools)

3.5.3.1 query_devices

{
  "name": "query_devices",
  "description": "List or filter devices by FSM state.",
  "parameters": {
    "type": "object",
    "properties": {
      "state": {
        "type": "string",
        "enum": ["idle","registered","enrolled","active","locked","released"]
      },
      "search": {
        "type": "string",
        "description": "Search by device name or serial number"
      },
      "limit": { "type": "integer", "default": 20 }
    }
  }
}
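
A minimal sketch of how the three optional parameters could translate into one parameterized query. The column names (name, serial, state) come from the flow in 3.5.1.1; the ILIKE search, the %s placeholder style, and the helper name build_device_query are assumptions.

```python
from typing import Optional

VALID_STATES = {"idle", "registered", "enrolled", "active", "locked", "released"}


def build_device_query(state: Optional[str] = None,
                       search: Optional[str] = None,
                       limit: int = 20) -> tuple[str, list]:
    """Build a parameterized SELECT; user input never enters the SQL text."""
    clauses, params = [], []
    if state is not None:
        if state not in VALID_STATES:
            raise ValueError(f"invalid state: {state}")
        clauses.append("state = %s")
        params.append(state)
    if search:
        # Case-insensitive substring match on name or serial (PostgreSQL ILIKE).
        clauses.append("(name ILIKE %s OR serial ILIKE %s)")
        pattern = f"%{search}%"
        params.extend([pattern, pattern])
    sql = "SELECT id, name, serial, state FROM devices"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY name LIMIT %s"
    params.append(limit)
    return sql, params


sql, params = build_device_query(state="active", search="An")
```

Values supplied by the LLM travel only in params, so a crafted search string cannot change the structure of the statement.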

3.5.3.2 get_device

{
  "name": "get_device",
  "description": "Get full info for a single device by ID or serial.",
  "parameters": {
    "type": "object",
    "properties": {
      "device_id": { "type": "string", "description": "UUID or serial number" }
    },
    "required": ["device_id"]
  }
}

3.5.3.3 send_device_command

{
  "name": "send_device_command",
  "description": "Send MDM command to a device. ALL commands require confirmed=true after user confirms.",
  "parameters": {
    "type": "object",
    "properties": {
      "device_id": { "type": "string" },
      "command": {
        "type": "string",
        "enum": ["lock", "unlock", "send_message", "lock_message", "release"]
      },
      "message": {
        "type": "string",
        "description": "Message content for send_message or lock_message commands"
      },
      "confirmed": {
        "type": "boolean",
        "description": "Must be true to execute. Always call with false first, then true after user confirms."
      }
    },
    "required": ["device_id", "command", "confirmed"]
  }
}
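
The schema leaves message optional for every command. A hedged sketch of an argument check that a handler might run before the confirmation gate follows; the rule that send_message and lock_message need a non-empty message is an assumption, as is the function name validate_command_args.

```python
COMMANDS = {"lock", "unlock", "send_message", "lock_message", "release"}
NEEDS_MESSAGE = {"send_message", "lock_message"}  # assumption, not in the schema


def validate_command_args(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the call is well formed."""
    errors = []
    for field in ("device_id", "command", "confirmed"):
        if field not in args:
            errors.append(f"missing required field: {field}")
    command = args.get("command")
    if command is not None and command not in COMMANDS:
        errors.append(f"unsupported command: {command}")
    # bool is checked explicitly so that strings like "true" are rejected
    if "confirmed" in args and not isinstance(args["confirmed"], bool):
        errors.append("confirmed must be a boolean")
    if command in NEEDS_MESSAGE and not args.get("message"):
        errors.append(f"{command} requires a non-empty message")
    return errors
```

Returning the problems as tool output lets the LLM ask the user for the missing piece instead of failing silently.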

3.5.3.4 get_command_history

{
  "name": "get_command_history",
  "description": "Get command execution history for a device.",
  "parameters": {
    "type": "object",
    "properties": {
      "device_id": { "type": "string" },
      "limit": { "type": "integer", "default": 10 }
    },
    "required": ["device_id"]
  }
}

3.5.3.5 get_device_stats

{
  "name": "get_device_stats",
  "description": "Get aggregate device counts grouped by FSM state.",
  "parameters": {
    "type": "object",
    "properties": {}
  }
}
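
The grouping behind get_device_stats can be sketched in a few lines. This version counts in memory for illustration and reports zero for states that have no devices; a database-backed implementation would more likely use a GROUP BY state query. The function name count_by_state is hypothetical.

```python
from collections import Counter

STATES = ["idle", "registered", "enrolled", "active", "locked", "released"]


def count_by_state(devices: list[dict]) -> dict[str, int]:
    """Count devices per FSM state, including states that have none."""
    counts = Counter(d["state"] for d in devices)
    return {state: counts.get(state, 0) for state in STATES}


stats = count_by_state([
    {"id": "a", "state": "active"},
    {"id": "b", "state": "active"},
    {"id": "c", "state": "locked"},
])
```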

3.5.4 Safety & Authorization

3.5.4.1 Confirmation Gate

Every command action → confirmed=false (this call must come first)
  → Lambda: return {requiresConfirmation: true, summary: "..."}
    → LLM shows a confirmation prompt to the user
      → User confirms → confirmed=true → Lambda executes
      → User cancels → nothing is executed, the cancellation is logged

There are no exceptions: all 5 commands (lock, unlock, send_message, lock_message, release) go through the confirmation gate.
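
As described, the gate relies on the confirmed flag that the LLM supplies. One possible hardening, which the source does not describe, is for the handler to remember each confirmed=false request per session and honor confirmed=true only when it matches. The sketch below keeps pending requests in a dict for illustration; a Lambda would need durable storage, and every name here is hypothetical.

```python
class ConfirmationGate:
    """Tracks the pending confirmation request of each session."""

    def __init__(self):
        self._pending: dict[str, tuple[str, str]] = {}

    def check(self, session_id: str, args: dict) -> dict:
        request = (args["device_id"], args["command"])
        if not args.get("confirmed"):
            # First step: remember what was asked and request confirmation.
            self._pending[session_id] = request
            return {"requiresConfirmation": True,
                    "summary": f"{args['command']} device {args['device_id']}"}
        # Second step: allow only if it matches the remembered request.
        # pop() also consumes the entry, so a confirmation cannot be replayed.
        if self._pending.pop(session_id, None) != request:
            return {"error": "NOT_CONFIRMED",
                    "message": "no matching confirmation request"}
        return {"allowed": True}


gate = ConfirmationGate()
lock = {"device_id": "iPhone-001", "command": "lock"}
skipped = gate.check("s1", {**lock, "confirmed": True})   # first step missing
asked = gate.check("s1", {**lock, "confirmed": False})
granted = gate.check("s1", {**lock, "confirmed": True})
```

This guards against a skipped first step. It does not prove that the user typed the confirmation, so it complements the prompt rules and does not replace them.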

3.5.4.2 Role-Based Authorization

Command	Admin	Operator
lock	✅	✅
unlock	✅	✅
send_message	✅	✅
lock_message	✅	✅
release	✅	✅

The Lambda checks the JWT role claim before every command execution. If the caller is unauthorized:

Return error: UNAUTHORIZED (action requires an appropriate role)
The LLM tells the user: "You do not have permission to run this command"
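
The permission table and the role check can be sketched together. The claim key role and the assumption that claims sit under identity["claims"] are illustrative; the actual claim name depends on how the identity provider is configured.

```python
from typing import Optional

# Roles allowed per command, mirroring the table above.
PERMISSIONS = {
    "lock": {"admin", "operator"},
    "unlock": {"admin", "operator"},
    "send_message": {"admin", "operator"},
    "lock_message": {"admin", "operator"},
    "release": {"admin", "operator"},
}


def authorize(identity: dict, command: str) -> Optional[dict]:
    """Return None when allowed, or an UNAUTHORIZED error object."""
    role = (identity.get("claims") or {}).get("role")  # assumed claim key
    if role in PERMISSIONS.get(command, set()):
        return None
    return {"error": "UNAUTHORIZED",
            "message": "action requires appropriate role"}
```

Keeping the table as data makes it easy to restrict a single command later (for example, release to admin only) without touching the check itself.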

3.5.4.3 Audit Logging

Every command attempt is written to CloudWatch Logs:

{
  "timestamp": "2026-04-20T10:30:00Z",
  "user_id": "uuid",
  "device_id": "uuid",
  "command": "lock",
  "confirmed": true,
  "result": "dispatched",
  "session_id": "chat-session-uuid"
}
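
A minimal sketch of producing such a record. In Lambda, text written to standard output ends up in the function's CloudWatch Logs stream, so printing one JSON object per line is enough for illustration; the helper name audit_record is hypothetical.

```python
import json
from datetime import datetime, timezone


def audit_record(user_id: str, device_id: str, command: str,
                 confirmed: bool, result: str, session_id: str) -> str:
    """Serialize one command attempt as a single-line JSON log entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "user_id": user_id,
        "device_id": device_id,
        "command": command,
        "confirmed": confirmed,
        "result": result,
        "session_id": session_id,
    }
    return json.dumps(entry)


line = audit_record("user-uuid", "device-uuid", "lock",
                    True, "dispatched", "chat-session-uuid")
print(line)
```

One object per line keeps the entries easy to filter by field when searching the logs.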

3.5.5 Architecture — chat-handler Lambda

3.5.5.1 Sequence Diagram

sequenceDiagram
    participant U as User (Chat UI)
    participant AS as AppSync
    participant CH as chat-handler Lambda
    participant DB as PostgreSQL
    participant LLM as GPT-4o mini
    participant AR as action-resolver

    U->>AS: mutation sendChatMessage(sessionId, message)
    AS->>CH: invoke with JWT context

    CH->>DB: SELECT last 10 messages WHERE session_id
    CH->>CH: Build messages array: [system_prompt] + history + user_msg

    CH->>LLM: POST /chat/completions (messages + tools)
    LLM-->>CH: tool_call: query_devices(state="active")

    CH->>DB: SELECT * FROM devices WHERE state='active'
    CH->>CH: Append tool_result to messages

    CH->>LLM: POST /chat/completions (messages + tool_result)
    LLM-->>CH: text response: "There are 12 active devices..."

    CH->>DB: INSERT chat_messages (user_msg + assistant_msg)
    CH-->>AS: return ChatResponse {message, toolCalls}
    AS-->>U: response

3.5.5.2 Lambda Flow (Pseudocode)

def handler(event, context):
    # 1. Auth: AppSync passes the caller identity in the resolver event
    user = validate_jwt(event["identity"])

    # 2. Load history (newest 10 rows; reversed into chronological order below)
    session_id = event["arguments"]["sessionId"]
    user_message = event["arguments"]["message"]
    history = db.query(
        "SELECT role, content FROM chat_messages "
        "WHERE session_id = %s ORDER BY created_at DESC LIMIT 10",
        session_id
    )

    # 3. Build messages
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        *reversed(history),
        {"role": "user", "content": user_message}
    ]

    # 4. Call LLM with tools
    response = llm_client.chat(messages=messages, tools=TOOLS)

    # 5. Handle tool calls (max 1 round)
    if response.has_tool_calls():
        # Append the assistant turn that requested the tools once,
        # then one result message per tool call.
        # (assistant_tool_call_message and tool_result_message are placeholder helpers.)
        messages.append(assistant_tool_call_message(response))
        for tool_call in response.tool_calls:
            result = execute_tool(tool_call, user)
            messages.append(tool_result_message(tool_call, result))

        # Follow-up call: the LLM turns the tool results into a text reply
        response = llm_client.chat(messages=messages, tools=TOOLS)

    # 6. Persist
    db.insert_chat_messages(session_id, [
        {"role": "user", "content": user_message},
        {"role": "assistant", "content": response.text}
    ])

    return {"message": response.text}
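
For reference, this is how one tool round is laid out in the message list when the provider follows the OpenAI Chat Completions format: the assistant turn carrying tool_calls is appended once, followed by one role=tool message per call, each echoing its tool_call_id. The helper name append_tool_round and the call id call_1 are illustrative.

```python
import json


def append_tool_round(messages: list[dict], tool_calls: list[dict],
                      results: list) -> None:
    """Append the assistant tool request and the matching tool results."""
    messages.append({"role": "assistant", "content": None,
                     "tool_calls": tool_calls})
    for call, result in zip(tool_calls, results):
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            # Tool output is passed back to the model as a JSON string.
            "content": json.dumps(result, ensure_ascii=False),
        })


messages = [{"role": "user", "content": "list the active devices"}]
calls = [{
    "id": "call_1",
    "type": "function",
    "function": {"name": "query_devices",
                 "arguments": json.dumps({"state": "active"})},
}]
append_tool_round(messages, calls, [[{"id": "d1", "state": "active"}]])
```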

3.5.5.3 Tool Execution

import json

def execute_tool(tool_call, user):
    name = tool_call.function.name
    # Tool arguments arrive from the LLM as a JSON string
    args = json.loads(tool_call.function.arguments)

    if name == "query_devices":
        return db.query_devices(state=args.get("state"), search=args.get("search"),
                                limit=args.get("limit", 20))

    if name == "get_device":
        return db.get_device(args["device_id"])

    if name == "send_device_command":
        # Confirmation gate: nothing is dispatched until confirmed=true
        if not args.get("confirmed"):
            return {"requiresConfirmation": True,
                    "summary": f"{args['command']} device {args['device_id']}"}

        # Role check: all commands require admin or operator
        if user.role not in ("admin", "operator"):
            return {"error": "UNAUTHORIZED", "message": "requires admin or operator role"}

        # Dispatch via AppSync mutation (internal)
        return appsync_client.mutate(
            mutation=DISPATCH_COMMAND,
            variables={"deviceId": args["device_id"], "command": args["command"],
                        "message": args.get("message")}
        )

    if name == "get_command_history":
        return db.get_command_history(args["device_id"], args.get("limit", 10))

    if name == "get_device_stats":
        return db.get_device_stats()

    # Unknown tool name: return an error object instead of an implicit None
    return {"error": "UNKNOWN_TOOL", "message": f"unsupported tool: {name}"}

3.5.6 Context Management

Aspect	Strategy	Detail
History	Sliding window	The 10 most recent messages per session
Device context	Tool call on-demand	The LLM calls query_devices/get_device when needed; nothing is preloaded
Session	Per-user	Each user has one active session; session_id is carried in the JWT or managed by the client
Token budget	~2000 tokens	System prompt (~500) + history (~1200) + user msg (~300)
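
The window and the history share of the budget can be approximated as follows. The four-characters-per-token estimate is a rough heuristic assumed here for illustration (real counts depend on the tokenizer and the language); trim_history and the constants are hypothetical names.

```python
MAX_MESSAGES = 10
HISTORY_TOKEN_BUDGET = 1200  # history share of the ~2000-token budget


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (assumption)."""
    return max(1, len(text) // 4)


def trim_history(history: list[dict]) -> list[dict]:
    """Keep the newest messages that fit both the count and token limits.

    `history` is in chronological order (oldest first).
    """
    kept, used = [], 0
    for msg in reversed(history[-MAX_MESSAGES:]):
        cost = estimate_tokens(msg["content"])
        if used + cost > HISTORY_TOKEN_BUDGET:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [{"role": "user", "content": f"message {i}"} for i in range(15)]
window = trim_history(history)  # the 10 newest messages, oldest first
```

Walking from the newest message backwards means that when the budget runs out, the oldest context is what gets dropped.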

3.5.7 LLM Provider Config

Setting	Value	Rationale
Model	`gpt-4o-mini`	Best cost/performance trade-off for the thesis
Temperature	0.1	Near-deterministic tool calls; minimal creativity needed
Max tokens	500	MDM responses are short
Tool choice	`auto`	The LLM decides whether to call a tool or reply with text
Streaming	Off (sync)	Responses are short; a synchronous call is sufficient for admin tooling
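
The settings above map onto a Chat Completions request body roughly as follows. The tool definitions in 3.5.3 use the bare name/description/parameters shape, which the tools parameter expects wrapped in a {"type": "function", "function": ...} envelope. build_request is a hypothetical helper and sends nothing over the network.

```python
def build_request(messages: list[dict], tool_defs: list[dict]) -> dict:
    """Assemble the JSON body for POST /chat/completions."""
    return {
        "model": "gpt-4o-mini",
        "temperature": 0.1,
        "max_tokens": 500,
        "tool_choice": "auto",
        "stream": False,
        "messages": messages,
        # Wrap each bare definition in the envelope the tools field expects.
        "tools": [{"type": "function", "function": d} for d in tool_defs],
    }


stats_tool = {
    "name": "get_device_stats",
    "description": "Get aggregate device counts grouped by FSM state.",
    "parameters": {"type": "object", "properties": {}},
}
body = build_request([{"role": "user", "content": "device stats"}], [stats_tool])
```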

3.5.8 Class Diagram: NLP Chatbot Module

NLP Chatbot class diagram

Figure: Class diagram of the NLP Chatbot module. Green = main handler (Entry point), blue = tool executor (Executor), purple = abstract class (Abstract), yellow = concrete tool (Concrete Tool), red = dangerous tool that requires confirmation (Safety Gate).

Conclusion

The Fluxion NLP Chatbot module is positioned as a Tool-Augmented LLM Application following the ReAct (Reasoning + Acting) pattern, not as a traditional NLP system. Its technical contribution centers on three points: (1) a system prompt specialized for the MDM domain, with FSM state constraints and bilingual Vietnamese-English natural language; (2) 5 tool definitions with clear schemas covering all operations (query, get, command, history, stats); and (3) a two-step confirmation mechanism (confirmed=false → user confirm → confirmed=true) that ensures no command is executed without the user's explicit consent.

The safety gate built on the confirmed parameter is the most important design element of the module: the LLM must first call send_device_command with confirmed=false, receive requiresConfirmation: true, show the summary to the user, and execute only after the user confirms in natural language. The gate is enforced in the Lambda layer (not only at the prompt level), where nothing is dispatched unless confirmed=true. Combined with the RBAC check in the same layer, it provides defense-in-depth for dangerous commands.

Context management uses a sliding window of 10 messages (~2000 tokens), which is enough for typical multi-turn conversations in an MDM environment. The choice of gpt-4o-mini with temperature=0.1 reflects a preference for deterministic tool calling over creativity, which suits admin tooling that requires high accuracy in intent recognition and in calling tools with the right parameters.

References

[1] Vaswani, A. et al. Attention Is All You Need. NeurIPS, 2017.

[2] Schick, T. et al. Toolformer: Language Models Can Teach Themselves to Use Tools. NeurIPS, 2023.

[3] Yao, S. et al. ReAct: Synergizing Reasoning and Acting in Language Models. ICLR, 2023.

[4] OpenAI. Function Calling Documentation. 2024.

[5] Patil, S. G. et al. Gorilla: Large Language Model Connected with Massive APIs. 2023.