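[AI] 11_Test_Strategy - 100-hours-a-week/9-team-Devths-WIKI GitHub Wiki

AI Server Test Strategy

A guide to the unit, integration, and E2E test strategy for the AI server

📚 Table of Contents

1. Test Overview
2. Test Types and Scope
3. Test Directory Structure
4. Unit Tests
5. Integration Tests
6. E2E Tests
7. LLM/VLM Test Strategy
8. Test Tools and Environment
9. CI/CD Integration
10. Test Roadmap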

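1. Test Overview

1.1. Test Goals

Goal	Description
Functional correctness	Verify that each API behaves as specified
Response format	Verify that LLM output matches the expected JSON schema
Performance targets	Verify that response time and TTFT (time to first token) stay within target
Reliability	Verify error handling, timeouts, and retry logic
Regression prevention	Verify that code changes do not break existing behavior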
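
The performance goal lists TTFT (time to first token) alongside total response time. One way a test could measure both is sketched below, assuming the streamed response can be consumed as a Python iterable of text chunks. `measure_stream_timing` and `fake_stream` are hypothetical names, not part of the AI server.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_stream_timing(chunks: Iterable[str]) -> Tuple[float, float, str]:
    """Consume a stream and return (ttft_seconds, total_seconds, full_text).

    TTFT is measured from the start of iteration to the first non-empty chunk.
    """
    start = time.perf_counter()
    ttft = None
    parts = []
    for chunk in chunks:
        if ttft is None and chunk:
            ttft = time.perf_counter() - start
        parts.append(chunk)
    total = time.perf_counter() - start
    if ttft is None:
        raise ValueError("stream produced no non-empty chunk")
    return ttft, total, "".join(parts)


def fake_stream() -> Iterator[str]:
    """Stand-in for a streaming LLM response (the delays are made up)."""
    time.sleep(0.05)
    yield "Hello"
    time.sleep(0.05)
    yield ", world"


ttft, total, text = measure_stream_timing(fake_stream())
assert ttft <= total
```

A test could then compare `ttft` and `total` against whatever targets the team sets; this guide does not state the target values.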

1.2. APIs Under Test (9)

#	API	Test priority	Complexity
1	`/ai/ocr/extract`	🟡 Medium	Async + VLM
2	`/ai/file/embed`	🟢 Low	Simple embedding
3	`/ai/analyze`	🔴 High	Streaming + RAG
4	`/ai/interview/question`	🔴 High	DB history + RAG
5	`/ai/interview/save`	🟢 Low	Simple save
6	`/ai/interview/report`	🔴 High	Streaming + complex analysis
7	`/ai/chat`	🔴 High	Streaming + RAG + Tool Calling
8	`/ai/calendar/parse`	🟡 Medium	JSON parsing
9	`/ai/masking/draft`	🟡 Medium	Async + VLM

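2. Test Types and Scope

2.1. Test Pyramid

                    ▲
                   /│\
                  / │ \
                 /  │  \      E2E tests (10%)
                /   │   \     - Real LLM API calls
               /    │    \    - Full-scenario validation
              /─────┼─────\
             /      │      \
            /       │       \    Integration tests (30%)
           /        │        \   - API endpoint validation
          /         │         \  - Service-to-service integration
         /──────────┼──────────\
        /           │           \
       /            │            \   Unit tests (60%)
      /             │             \  - Individual functions/classes
     /              │              \ - Uses mocks
    /───────────────┴───────────────\

2.2. Characteristics by Test Type

Type	Scope	LLM calls	Run time	Purpose
Unit	Function/class	❌ Mock	< 1 s	Logic verification
Integration	API endpoint	❌ Mock or ✅ real	1-10 s	Integration verification
E2E	Full scenario	✅ Real	10-60 s	Scenario verification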

3. Test Directory Structure

ai_server/
├── app/
│   ├── api/
│   ├── services/
│   ├── chains/
│   └── ...
│
├── tests/
│   ├── conftest.py                 # pytest fixtures (shared)
│   ├── __init__.py
│   │
│   ├── unit/                       # Unit tests
│   │   ├── __init__.py
│   │   ├── test_prompt_templates.py
│   │   ├── test_output_parsers.py
│   │   ├── test_embedding_service.py
│   │   ├── test_calendar_parser.py
│   │   ├── test_masking_coords.py
│   │   └── test_interview_logic.py
│   │
│   ├── integration/                # Integration tests
│   │   ├── __init__.py
│   │   ├── test_ocr_api.py
│   │   ├── test_embed_api.py
│   │   ├── test_analyze_api.py
│   │   ├── test_interview_api.py
│   │   ├── test_chat_api.py
│   │   ├── test_calendar_api.py
│   │   └── test_masking_api.py
│   │
│   ├── e2e/                        # E2E tests
│   │   ├── __init__.py
│   │   ├── test_resume_flow.py     # Resume upload → full analysis flow
│   │   ├── test_interview_flow.py  # Interview start → questions → report
│   │   └── test_chat_scenario.py   # Chat + tool calling
│   │
│   ├── fixtures/                   # Test data
│   │   ├── sample_resume.txt
│   │   ├── sample_job_posting.txt
│   │   ├── sample_image.png
│   │   └── expected_responses.json
│   │
│   └── mocks/                      # Mock classes
│       ├── mock_llm.py
│       ├── mock_vectordb.py
│       └── mock_external_api.py
│
├── pytest.ini                      # pytest configuration
└── requirements-test.txt           # Test dependencies

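4. Unit Tests

4.1. Test Targets

Target	File	What is tested
Prompt templates	`test_prompt_templates.py`	Variable substitution, format validation
Output parsers	`test_output_parsers.py`	JSON parsing, schema validation
Embedding processing	`test_embedding_service.py`	Chunking, vector conversion logic
Calendar parser	`test_calendar_parser.py`	Date extraction, schedule structuring
Masking coordinates	`test_masking_coords.py`	Coordinate conversion, region calculation
Interview logic	`test_interview_logic.py`	Question-count checks, session management

4.2. Example Code

Prompt template tests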

# tests/unit/test_prompt_templates.py
import pytest
from app.prompts.analyze_prompts import create_analyze_prompt

class TestAnalyzePrompt:
    """Tests for the analysis prompt template."""
    
    def test_create_analyze_prompt_with_valid_input(self):
        """A valid input produces a prompt containing both texts."""
        resume = "3년차 백엔드 개발자, Python, FastAPI 경험"
        posting = "백엔드 개발자 채용, Python 필수"
        
        prompt = create_analyze_prompt(resume, posting)
        
        assert resume in prompt
        assert posting in prompt
        assert "분석" in prompt or "analyze" in prompt.lower()
    
    def test_create_analyze_prompt_with_empty_resume(self):
        """An empty resume raises ValueError."""
        with pytest.raises(ValueError, match="이력서"):
            create_analyze_prompt("", "채용공고")
    
    def test_prompt_token_limit(self):
        """The prompt stays within the token limit."""
        long_resume = "A" * 50000  # very long resume
        posting = "채용공고"
        
        prompt = create_analyze_prompt(long_resume, posting)
        
        # Rough token estimate (about 4 characters per token)
        estimated_tokens = len(prompt) / 4
        assert estimated_tokens < 30000  # within the context limit
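
For reference, here is a minimal sketch of a `create_analyze_prompt` that would satisfy the three tests above. The real implementation in `app.prompts.analyze_prompts` is not shown in this guide, so the template text, the `MAX_PROMPT_CHARS` budget, and the truncation policy are all assumptions.

```python
# Assumed budget: about 25k tokens at roughly 4 characters per token.
MAX_PROMPT_CHARS = 100_000

# Hypothetical template. The Korean text reads: "Analyze the following
# resume and job posting." followed by [Resume] and [Job posting] sections.
TEMPLATE = (
    "다음 이력서와 채용공고를 분석하세요.\n\n"
    "[이력서]\n{resume}\n\n[채용공고]\n{posting}\n"
)


def create_analyze_prompt(resume: str, posting: str) -> str:
    """Build the analysis prompt, truncating the resume to fit the budget."""
    if not resume.strip():
        # Message contains "이력서" (resume) so pytest's match="이력서" succeeds.
        raise ValueError("이력서 내용이 비어 있습니다")
    if not posting.strip():
        raise ValueError("채용공고 내용이 비어 있습니다")
    # Characters used by everything except the resume itself.
    overhead = len(TEMPLATE.format(resume="", posting=posting))
    budget = max(MAX_PROMPT_CHARS - overhead, 0)
    return TEMPLATE.format(resume=resume[:budget], posting=posting)
```

Truncating only the resume keeps the posting intact; a production version might instead summarize or chunk the resume, which this sketch does not attempt.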

Output parser tests

# tests/unit/test_output_parsers.py
import pytest
from app.services.output_parsers import parse_analyze_response

class TestAnalyzeParser:
    """Tests for the analysis result parser."""
    
    def test_parse_valid_response(self):
        """A valid JSON response is parsed."""
        llm_output = '''
        {
            "resume_analysis": {
                "strengths": ["Python 경험 3년"],
                "weaknesses": ["클라우드 경험 부족"],
                "suggestions": ["AWS 학습 추천"]
            },
            "matching": {
                "score": 85,
                "grade": "A"
            }
        }
        '''
        
        result = parse_analyze_response(llm_output)
        
        assert result["matching"]["score"] == 85
        assert result["matching"]["grade"] == "A"
        assert len(result["resume_analysis"]["strengths"]) > 0
    
    def test_parse_malformed_json(self):
        """Malformed JSON raises ValueError."""
        malformed = "이건 JSON이 아닙니다"
        
        with pytest.raises(ValueError):
            parse_analyze_response(malformed)
    
    def test_parse_missing_required_fields(self):
        """Missing required fields raise KeyError."""
        incomplete = '{"resume_analysis": {}}'
        
        with pytest.raises(KeyError):
            parse_analyze_response(incomplete)
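
The tests above pin down the parser's contract: `ValueError` for non-JSON output and `KeyError` for missing required fields. A minimal sketch of a parser with that contract follows. The set of required keys and the code-fence unwrapping are assumptions, since the real `app.services.output_parsers` module is not shown here.

```python
import json
import re

# Assumed required fields, based on the keys the tests read.
REQUIRED_KEYS = ("resume_analysis", "matching")
REQUIRED_MATCHING_KEYS = ("score", "grade")

_FENCE = "`" * 3  # Markdown code fence, built indirectly to keep this block readable
_FENCED_JSON = re.compile(rf"^{_FENCE}(?:json)?\s*(.*?)\s*{_FENCE}$", re.DOTALL)


def parse_analyze_response(llm_output: str) -> dict:
    """Parse LLM output into a dict and check the fields the tests rely on."""
    text = llm_output.strip()
    # LLMs often wrap JSON in a Markdown code fence; unwrap it if present.
    fenced = _FENCED_JSON.match(text)
    if fenced:
        text = fenced.group(1)
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("LLM output must be a JSON object")
    for key in REQUIRED_KEYS:
        if key not in data:
            raise KeyError(key)
    for key in REQUIRED_MATCHING_KEYS:
        if key not in data["matching"]:
            raise KeyError(f"matching.{key}")
    return data
```

`json.JSONDecodeError` already subclasses `ValueError`; it is re-raised here only to give a clearer message.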

Interview logic tests

# tests/unit/test_interview_logic.py
import pytest
from app.services.interview_service import InterviewSession

class TestInterviewSession:
    """Tests for interview session logic."""
    
    def test_session_creation(self):
        """A new session starts in progress with no questions."""
        session = InterviewSession(
            room_id="room_001",
            interview_type="technical"
        )
        
        assert session.session_id is not None
        assert session.question_count == 0
        assert session.status == "in_progress"
    
    def test_max_questions_limit(self):
        """Reaching the maximum of 5 questions triggers auto-end."""
        session = InterviewSession("room_001", "technical")
        
        for i in range(5):
            session.add_qa(f"질문 {i+1}", f"답변 {i+1}")
        
        assert session.question_count == 5
        assert session.should_auto_end() is True
    
    def test_manual_end(self):
        """A session can be ended manually."""
        session = InterviewSession("room_001", "personality")
        session.add_qa("질문 1", "답변 1")
        
        session.end(ended_by="manual")
        
        assert session.status == "completed"
        assert session.ended_by == "manual"
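
The session tests imply a small state machine. A minimal sketch of an `InterviewSession` that would pass them is shown below. The attribute and method names come from the tests; the `qa_pairs` storage and the guards in `add_qa` are assumptions, not the actual `app.services.interview_service` code.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

MAX_QUESTIONS = 5  # limit taken from test_max_questions_limit


@dataclass
class InterviewSession:
    room_id: str
    interview_type: str
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "in_progress"
    ended_by: Optional[str] = None
    qa_pairs: List[Tuple[str, str]] = field(default_factory=list)

    @property
    def question_count(self) -> int:
        return len(self.qa_pairs)

    def add_qa(self, question: str, answer: str) -> None:
        """Record one question/answer pair while the session is open."""
        if self.status != "in_progress":
            raise RuntimeError("cannot add Q&A to a finished session")
        if self.question_count >= MAX_QUESTIONS:
            raise RuntimeError("question limit reached")
        self.qa_pairs.append((question, answer))

    def should_auto_end(self) -> bool:
        """True once the question limit has been reached."""
        return self.question_count >= MAX_QUESTIONS

    def end(self, ended_by: str) -> None:
        """Mark the session completed and record who or what ended it."""
        self.status = "completed"
        self.ended_by = ended_by
```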

5. Integration Tests

5.1. Test Targets

API	Test file	What is verified
`/ai/analyze`	`test_analyze_api.py`	Streaming response, JSON schema
`/ai/interview/*`	`test_interview_api.py`	Question generation, saving, report integration
`/ai/chat`	`test_chat_api.py`	RAG retrieval, tool calling
`/ai/calendar/parse`	`test_calendar_api.py`	Schedule extraction accuracy

5.2. Example Code

Analyze API integration tests

# tests/integration/test_analyze_api.py
import pytest
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

class TestAnalyzeAPI:
    """Integration tests for the analyze API."""
    
    @pytest.fixture
    def sample_request(self):
        return {
            "resume_id": "resume_123",
            "posting_id": "posting_456",
            "resume_text": "3년차 백엔드 개발자...",
            "posting_text": "백엔드 개발자 채용..."
        }
    
    def test_analyze_returns_streaming_response(self, sample_request, mock_llm):
        """The response uses the streaming (SSE) content type."""
        response = client.post(
            "/ai/analyze",
            json=sample_request,
            headers={"Accept": "text/event-stream"}
        )
        
        assert response.status_code == 200
        assert response.headers["content-type"].startswith("text/event-stream")
    
    def test_analyze_response_schema(self, sample_request, mock_llm):
        """The response JSON matches the expected schema."""
        response = client.post("/ai/analyze", json=sample_request)
        data = response.json()
        
        # Check required fields
        assert "resume_analysis" in data
        assert "posting_analysis" in data
        assert "matching" in data
        
        # Check the matching schema
        matching = data["matching"]
        assert 0 <= matching["score"] <= 100
        assert matching["grade"] in ["S", "A", "B", "C", "D"]
    
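    def test_analyze_with_missing_resume(self):
        """A request without the resume is rejected with a client error."""
        response = client.post(
            "/ai/analyze",
            json={"posting_text": "채용공고"}
        )
        
        # FastAPI's default validation failure is 422; 400 applies only if
        # the app registers a handler that maps validation errors to 400.
        assert response.status_code in (400, 422)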
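
The streaming test above checks only the status code and content type. To assert on the streamed payload itself, the body has to be split into events. The helper below is a sketch that assumes the endpoint emits single-line SSE `data:` fields carrying JSON and ends with a `[DONE]` sentinel. Neither the event shape nor the sentinel is confirmed by this guide, and `iter_sse_json` is a hypothetical name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str], done_marker: str = "[DONE]") -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line in an SSE body.

    Multi-line `data:` events are not handled; each event is assumed to
    fit on one line.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == done_marker:
            break
        yield json.loads(payload)


# Example body with a made-up "delta" field per event.
body = 'data: {"delta": "Hel"}\n\ndata: {"delta": "lo"}\n\ndata: [DONE]\n\n'
events = list(iter_sse_json(body.splitlines()))
text = "".join(event["delta"] for event in events)
assert text == "Hello"
```

With an httpx-based `TestClient`, the lines could come from `response.iter_lines()` inside a `client.stream("POST", "/ai/analyze", json=...)` context.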

Interview API integration tests

# tests/integration/test_interview_api.py
import pytest
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

class TestInterviewAPI:
    """Integration tests for the interview API."""
    
    def test_interview_full_flow(self, mock_llm, mock_db):
        """Full interview flow: generate question → save → report."""
        
        # 1. Generate the first question
        question_response = client.post(
            "/ai/interview/question",
            json={
                "room_id": "room_001",
                "session_id": "session_123",
                "interview_type": "technical",
                "resume_text": "이력서...",
                "posting_text": "채용공고..."
            }
        )
        assert question_response.status_code == 200
        question_data = question_response.json()
        assert "question" in question_data
        assert question_data["question_number"] == 1
        
        # 2. Save the Q&A
        save_response = client.post(
            "/ai/interview/save",
            json={
                "room_id": "room_001",
                "session_id": "session_123",
                "question_id": question_data["question_id"],
                "question": question_data["question"],
                "answer": "사용자 답변...",
                "is_followup": False,
                "question_number": 1
            }
        )
        assert save_response.status_code == 200
        
        # 3. Generate the report (assumes all 5 questions are complete)
        report_response = client.post(
            "/ai/interview/report",
            json={
                "room_id": "room_001",
                "session_id": "session_123",
                "interview_type": "technical",
                "ended_by": "manual"
            }
        )
        assert report_response.status_code == 200
        report_data = report_response.json()
        assert "evaluations" in report_data
        assert "report" in report_data

Chat API tool calling tests

# tests/integration/test_chat_api.py
import pytest
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

class TestChatAPI:
    """Integration tests for the chat API."""
    
    def test_chat_with_calendar_tool(self, mock_llm_with_tool_call):
        """The calendar tool call is returned in the expected shape."""
        response = client.post(
            "/ai/chat",
            json={
                "room_id": "room_001",
                "user_id": "user_456",
                "message": "이번 주 일정 알려줘",
                "history": []
            }
        )
        
        data = response.json()
        
        # Verify the tool-call response format
        assert data["tool_used"] is not None
        assert data["tool_used"]["tool"] == "get_schedule"
        assert "start_date" in data["tool_used"]["params"]
    
    def test_chat_rag_response(self, mock_llm, mock_vectordb):
        """A RAG-based chat response is returned without a tool call."""
        response = client.post(
            "/ai/chat",
            json={
                "room_id": "room_001",
                "user_id": "user_456",
                "message": "내 이력서 강점이 뭐야?",
                "history": []
            }
        )
        
        data = response.json()
        
        assert data["success"] is True
        assert data["response"] is not None
        assert data["tool_used"] is None  # RAG is not a tool
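
The tool-calling test checks the `tool_used` payload field by field. If several tests need the same checks, they could share a validator along these lines. This is a sketch: `validate_tool_call` is a hypothetical helper, and the required parameters mirror the `get_schedule` payload used by the mock in section 7.2 rather than a confirmed tool schema.

```python
from datetime import date

# Tool name and parameter names taken from the mock payload in this guide;
# the mapping would grow as more tools are added.
TOOL_REQUIRED_PARAMS = {"get_schedule": ("start_date", "end_date")}


def validate_tool_call(tool_used: dict) -> None:
    """Raise ValueError if a tool-call payload is malformed."""
    tool = tool_used.get("tool")
    if tool not in TOOL_REQUIRED_PARAMS:
        raise ValueError(f"unknown tool: {tool!r}")
    params = tool_used.get("params") or {}
    for name in TOOL_REQUIRED_PARAMS[tool]:
        if name not in params:
            raise ValueError(f"missing parameter: {name}")
    if tool == "get_schedule":
        # date.fromisoformat raises ValueError for non-ISO date strings.
        start = date.fromisoformat(params["start_date"])
        end = date.fromisoformat(params["end_date"])
        if end < start:
            raise ValueError("end_date precedes start_date")


validate_tool_call(
    {"tool": "get_schedule",
     "params": {"start_date": "2026-01-06", "end_date": "2026-01-12"}}
)
```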

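6. E2E Tests

6.1. Test Scenarios

Scenario	Flow	What is verified
Resume analysis	File upload → OCR → embedding → analysis	Full pipeline
Mock interview	Start → 5 questions → save → report	Session management, consistency
Chat + add schedule	"Add a Kakao interview for tomorrow" → tool → save	Tool calling integration

6.2. Example Code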

# tests/e2e/test_resume_flow.py
import pytest
import time
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

class TestResumeAnalysisFlow:
    """Resume analysis E2E test (calls the real LLM)."""
    
    @pytest.mark.e2e
    @pytest.mark.slow
    def test_full_resume_analysis_flow(self):
        """Full flow: resume upload → OCR → embedding → analysis."""
        
        # 1. Request OCR (asynchronous)
        ocr_response = client.post(
            "/ai/ocr/extract",
            json={
                "file_url": "https://s3.../test_resume.pdf",
                "file_type": "pdf"
            }
        )
        assert ocr_response.status_code == 200
        task_id = ocr_response.json()["task_id"]
        
        # 2. Wait for OCR to finish (polling)
        for _ in range(30):  # wait up to about 30 seconds
            status_response = client.get(f"/ai/task/{task_id}")
            status = status_response.json()["status"]
            if status == "completed":
                break
            time.sleep(1)
        
        assert status == "completed"
        extracted_text = status_response.json()["result"]["extracted_text"]
        
        # 3. Store the embedding
        embed_response = client.post(
            "/ai/file/embed",
            json={
                "type": "resume",
                "id": "resume_e2e_test",
                "user_id": "user_e2e",
                "text": extracted_text
            }
        )
        assert embed_response.status_code == 200
        
        # 4. Request the analysis
        analyze_response = client.post(
            "/ai/analyze",
            json={
                "resume_id": "resume_e2e_test",
                "posting_id": "posting_e2e",
                "resume_text": extracted_text,
                "posting_text": "백엔드 개발자 채용..."
            }
        )
        assert analyze_response.status_code == 200
        
        # 5. Verify the response
        result = analyze_response.json()
        assert "resume_analysis" in result
        assert "matching" in result
        assert 0 <= result["matching"]["score"] <= 100
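
The polling loop in step 2 keeps waiting even if the task has already failed, and its 30 iterations only approximate a 30-second limit. A reusable helper with an explicit deadline could look like the sketch below. `wait_for_status` is a hypothetical name, and the `"failed"` status value is an assumption, since this guide only shows `"completed"`.

```python
import time
from typing import Callable


def wait_for_status(
    fetch_status: Callable[[], str],
    done: str = "completed",
    failed: str = "failed",  # assumed status value; adjust to the real task API
    timeout: float = 30.0,
    interval: float = 1.0,
) -> str:
    """Poll fetch_status() until it returns `done`.

    Raises RuntimeError as soon as the task reports `failed`, and
    TimeoutError once `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status == done:
            return status
        if status == failed:
            raise RuntimeError("task reported failure")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still '{status}' after {timeout}s")
        time.sleep(interval)


# Usage with a canned sequence of statuses instead of a real HTTP call.
statuses = iter(["pending", "processing", "completed"])
assert wait_for_status(lambda: next(statuses), interval=0.0) == "completed"
```

In the E2E test, `fetch_status` could be a lambda wrapping `client.get(f"/ai/task/{task_id}").json()["status"]`.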

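7. LLM/VLM Test Strategy

7.1. Mock vs. Real Calls

Situation	Strategy	Reason
Unit tests	✅ Mock	Fast execution, deterministic results
Integration tests (CI)	✅ Mock	Lower cost, stability
Integration tests (local)	⚠️ Optional real calls	When real behavior needs to be confirmed
E2E tests	✅ Real calls	Validates real scenarios

7.2. LLM Mock Implementation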

# tests/mocks/mock_llm.py
from unittest.mock import MagicMock, AsyncMock

class MockLLM:
    """Mock of the LLM API."""
    
    def __init__(self, response_type: str = "analyze"):
        self.response_type = response_type
        self.call_count = 0
    
    async def generate(self, prompt: str) -> str:
        """Return a canned response for the configured response type."""
        self.call_count += 1
        
        if self.response_type == "analyze":
            return '''
            {
                "resume_analysis": {
                    "strengths": ["Python 경험 풍부"],
                    "weaknesses": ["클라우드 경험 부족"],
                    "suggestions": ["AWS 학습 추천"]
                },
                "matching": {"score": 85, "grade": "A"}
            }
            '''
        elif self.response_type == "interview_question":
            return '''
            {
                "question": "Python의 GIL에 대해 설명해주세요.",
                "is_followup": false
            }
            '''
        elif self.response_type == "tool_call":
            return '''
            {
                "tool": "get_schedule",
                "params": {"start_date": "2026-01-06", "end_date": "2026-01-12"}
            }
            '''
        
        return '{"message": "Mock response"}'


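# tests/conftest.py (registers the mock as a fixture)
import pytest

# Import path assumes tests/ is importable as a package.
from tests.mocks.mock_llm import MockLLM


@pytest.fixture
def mock_llm():
    """LLM mock fixture: swaps in MockLLM and restores the original afterwards."""
    from app.services import llm_service
    original = llm_service.llm
    llm_service.llm = MockLLM()
    yield llm_service.llm
    llm_service.llm = original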
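
`MockLLM.generate` returns the whole response at once, while `/ai/analyze`, `/ai/interview/report`, and `/ai/chat` are listed as streaming endpoints. A streaming counterpart might look like the sketch below. The class name `MockStreamingLLM`, the `stream` method, and the chunk size are assumptions, because the real LLM client interface is not shown in this guide.

```python
import asyncio
from typing import AsyncIterator, List


class MockStreamingLLM:
    """Mock that yields a canned response in fixed-size chunks."""

    def __init__(self, response: str, chunk_size: int = 8):
        self.response = response
        self.chunk_size = chunk_size
        self.prompts: List[str] = []  # recorded so tests can assert on inputs

    async def stream(self, prompt: str) -> AsyncIterator[str]:
        self.prompts.append(prompt)
        for i in range(0, len(self.response), self.chunk_size):
            await asyncio.sleep(0)  # yield control, as a network stream would
            yield self.response[i:i + self.chunk_size]


async def collect(llm: MockStreamingLLM, prompt: str) -> List[str]:
    """Drain the stream into a list of chunks."""
    return [chunk async for chunk in llm.stream(prompt)]


canned = '{"matching": {"score": 85}}'
chunks = asyncio.run(collect(MockStreamingLLM(canned), "prompt"))
assert "".join(chunks) == canned
```

Because the chunks are deterministic, a test can assert both on the reassembled text and on how the endpoint handles chunk boundaries.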

7.3. Response Quality Verification (E2E only)

# tests/e2e/test_response_quality.py
import pytest
from app.services.quality_evaluator import evaluate_response

class TestResponseQuality:
    """LLM response quality checks (real calls)."""
    
    @pytest.mark.e2e
    def test_analyze_response_quality(self, real_llm):
        """Quality check for the analysis response."""
        result = real_llm.analyze(
            resume="3년차 백엔드 개발자...",
            posting="백엔드 개발자 채용..."
        )
        
        # Evaluate quality
        quality = evaluate_response(result)
        
        assert quality["completeness"] >= 0.8  # at least 80% of required fields
        assert quality["relevance"] >= 0.7     # relevance of at least 70%
        assert quality["format_valid"] is True  # valid JSON format
    
    @pytest.mark.e2e
    def test_interview_question_quality(self, real_llm):
        """Quality check for a generated interview question."""
        question = real_llm.generate_question(
            interview_type="technical",
            resume="Python 백엔드 개발자..."
        )
        
        # Check question quality
        assert len(question) >= 10  # minimum length
        assert "?" in question      # phrased as a question
        assert "Python" in question or "백엔드" in question  # relevant to the context
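
The quality test relies on `evaluate_response` returning `completeness`, `relevance`, and `format_valid`, but this guide does not show how those scores are computed. The sketch below is one simple possibility, not the project's evaluator: completeness is the share of assumed required fields that are present and non-empty, and relevance is plain keyword overlap, a crude stand-in for whatever relevance measure the team actually uses. The `keywords` parameter is an addition for this sketch.

```python
import json
from typing import Iterable, Sequence, Union

# Fields of the example analysis response, treated here as required.
REQUIRED_PATHS = (
    ("resume_analysis", "strengths"),
    ("resume_analysis", "weaknesses"),
    ("resume_analysis", "suggestions"),
    ("matching", "score"),
    ("matching", "grade"),
)


def _has_value(data: dict, path: Sequence[str]) -> bool:
    """True if the nested path exists and holds a non-empty value."""
    node = data
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return node not in (None, "", [], {})


def evaluate_response(result: Union[str, dict], keywords: Iterable[str] = ()) -> dict:
    """Score a response for format validity, completeness, and relevance."""
    invalid = {"completeness": 0.0, "relevance": 0.0, "format_valid": False}
    if isinstance(result, str):
        try:
            result = json.loads(result)
        except json.JSONDecodeError:
            return invalid
    if not isinstance(result, dict):
        return invalid
    present = sum(_has_value(result, path) for path in REQUIRED_PATHS)
    text = json.dumps(result, ensure_ascii=False).lower()
    wanted = [k.lower() for k in keywords]
    hits = sum(k in text for k in wanted)
    return {
        "completeness": present / len(REQUIRED_PATHS),
        # With no keywords supplied there is nothing to miss, so score 1.0.
        "relevance": hits / len(wanted) if wanted else 1.0,
        "format_valid": True,
    }
```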

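8. Test Tools and Environment

8.1. Required Tools

Tool	Purpose	Install
pytest	Test framework	`pip install pytest`
pytest-asyncio	Async tests	`pip install pytest-asyncio`
pytest-cov	Code coverage	`pip install pytest-cov`
pytest-timeout	Per-test timeouts	`pip install pytest-timeout`
pytest-xdist	Parallel execution	`pip install pytest-xdist`
httpx	Async HTTP client	`pip install httpx`
respx	HTTP mocking	`pip install respx`
faker	Test data generation	`pip install faker`

8.2. Test Dependency File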

# requirements-test.txt
pytest>=7.0.0
pytest-asyncio>=0.21.0
pytest-cov>=4.0.0
pytest-timeout>=2.1.0
pytest-xdist>=3.0.0  # parallel execution
httpx>=0.24.0
respx>=0.20.0
faker>=18.0.0

8.3. pytest Configuration

# pytest.ini
[pytest]
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*

asyncio_mode = auto

markers =
    unit: unit tests
    integration: integration tests
    e2e: E2E tests (real LLM calls)
    slow: slow tests

filterwarnings =
    ignore::DeprecationWarning

# Timeout settings
timeout = 60
timeout_method = thread

# Coverage settings
addopts = --cov=app --cov-report=html --cov-report=term-missing

8.4. Test Commands

# Run all tests
pytest

# Run unit tests only
pytest -m unit

# Run integration tests only
pytest -m integration

# Run E2E tests (slow, incurs API cost)
pytest -m e2e

# Run a specific test file
pytest tests/integration/test_analyze_api.py

# Run in parallel (4 processes)
pytest -n 4

# Generate a coverage report
pytest --cov=app --cov-report=html

# Re-run only the tests that failed last time
pytest --lf
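
`pytest -m unit` and `pytest -m integration` select tests by marker, yet the unit and integration examples in this guide carry no `@pytest.mark.unit` or `@pytest.mark.integration` decorators, so those commands would select nothing as written. One option is to apply markers automatically from the directory layout in `tests/conftest.py`. The sketch below assumes the `tests/unit`, `tests/integration`, and `tests/e2e` layout from section 3; `marker_for_path` is a hypothetical helper, while `pytest_collection_modifyitems` is a standard pytest hook.

```python
from pathlib import Path
from typing import Optional

# Directory directly under tests/ -> marker declared in pytest.ini
DIR_MARKERS = {"unit": "unit", "integration": "integration", "e2e": "e2e"}


def marker_for_path(path: Path) -> Optional[str]:
    """Return the marker for a test file based on its directory under tests/."""
    parts = path.parts
    if "tests" not in parts:
        return None
    # Use the last "tests" component so parent folders cannot interfere.
    idx = len(parts) - 1 - parts[::-1].index("tests")
    if idx + 1 >= len(parts):
        return None
    return DIR_MARKERS.get(parts[idx + 1])


def pytest_collection_modifyitems(config, items):
    """pytest hook: tag each collected test with its directory's marker."""
    for item in items:
        # item.path is a pathlib.Path in pytest >= 7, as pinned in requirements-test.txt
        name = marker_for_path(Path(str(item.path)))
        if name is not None:
            item.add_marker(name)  # add_marker accepts a marker name as a string
```

Explicit decorators on each test class would work equally well; the hook just avoids repeating them in every file.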

9. CI/CD Integration

9.1. GitHub Actions Workflow

# .github/workflows/ai-server-test.yml
name: AI Server Tests

on:
  push:
    branches: [main, develop]
    paths:
      - 'ai_server/**'
  pull_request:
    branches: [main, develop]
    paths:
      - 'ai_server/**'
  workflow_dispatch:  # required for the manually triggered e2e-test job

jobs:
  unit-test:
    name: Unit Tests
    runs-on: ubuntu-latest
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      
      - name: Install dependencies
        run: |
          cd ai_server
          pip install -r requirements.txt
          pip install -r requirements-test.txt
      
      - name: Run unit tests
        run: |
          cd ai_server
          pytest -m unit --cov=app --cov-report=xml
      
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./ai_server/coverage.xml

  integration-test:
    name: Integration Tests
    runs-on: ubuntu-latest
    needs: unit-test
    
    services:
      chromadb:
        image: chromadb/chroma:latest
        ports:
          - 8000:8000
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      
      - name: Install dependencies
        run: |
          cd ai_server
          pip install -r requirements.txt
          pip install -r requirements-test.txt
      
      - name: Run integration tests (with mocks)
        env:
          CHROMA_HOST: localhost
          CHROMA_PORT: 8000
          USE_MOCK_LLM: true  # use the LLM mock
        run: |
          cd ai_server
          pytest -m integration

  e2e-test:
    name: E2E Tests (Manual)
    runs-on: ubuntu-latest
    if: github.event_name == 'workflow_dispatch'  # manual trigger only
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      
      - name: Install dependencies
        run: |
          cd ai_server
          pip install -r requirements.txt
          pip install -r requirements-test.txt
      
      - name: Run E2E tests
        env:
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
          USE_MOCK_LLM: false  # use the real LLM
        run: |
          cd ai_server
          pytest -m e2e --timeout=120

9.2. Execution Policy by Test Stage

Stage	Trigger	LLM calls	Expected time
Unit tests	Every PR/push	❌ Mock	~1 min
Integration tests	Every PR/push	❌ Mock	~3 min
E2E tests	Manual / weekly schedule	✅ Real	~10 min

10. Test Roadmap

10.1. Phased Implementation Plan

┌─────────────────────────────────────────────────────────────────────────────┐
│                                Test Roadmap                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   Phase 1: MVP (2 weeks)                                                    │
│   ├── Set up the pytest environment                                         │
│   ├── Unit tests for core APIs (analyze, interview, chat)                   │
│   ├── Implement the LLM mock                                                │
│   └── Basic CI integration (GitHub Actions)                                 │
│                                                                             │
│   Phase 2: First release (3 weeks)                                          │
│   ├── Complete unit tests for all APIs                                      │
│   ├── Implement integration tests (API endpoints)                           │
│   ├── Reach 60% test coverage                                               │
│   └── Implement the VectorDB mock                                           │
│                                                                             │
│   Phase 3: Second release (4 weeks)                                         │
│   ├── Implement E2E test scenarios                                          │
│   ├── Response quality verification tests                                   │
│   ├── Performance tests (load tests)                                        │
│   └── Reach 80% test coverage                                               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

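10.2. Coverage Targets

Stage	Target coverage	Measured scope
MVP	40%	Core service logic
First release	60%	Including the API layer
Second release	80%	Entire codebase

10.3. Test Priorities

Priority	API	Reason
⭐⭐⭐	`/ai/analyze`	Core feature, high complexity
⭐⭐⭐	`/ai/interview/*`	Session management, state persistence
⭐⭐⭐	`/ai/chat`	Combines tool calling and RAG
⭐⭐	`/ai/calendar/parse`	JSON parsing accuracy
⭐⭐	`/ai/masking/draft`	Coordinate extraction accuracy
⭐	`/ai/ocr/extract`	Depends on the VLM, hard to mock
⭐	`/ai/file/embed`	Simple logic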
