πŸ€– Stage 4: Review of a LangChain-Based Multi-Step AI Implementation

πŸ“˜ Design of a Place Recommendation Chain Based on a LangChain Structured Agent

βœ… Why: Why is a structured multi-step Agent chain needed?

This service recommends places based on user input, and accurate recommendations require several reasoning steps as well as external resources.
In addition, the final recommendation result must be returned in the following fixed JSON schema:

{
  "data": [
    {"place_id": 21, "similarity_score": 0.92},
    ...
  ]
}

To achieve this, each reasoning step is separated into a structured Tool based on LangChain's AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION, and the Agent is configured to execute these Tools automatically.

πŸ› οΈ 체인 ꡬ성 흐름도

flowchart TD
    A[User input] --> B[Tool 1: Extract keywords]
    B --> C[Tool 2: Search similar keywords]
    C --> D[Tool 3: Compute weighted average embedding]
    D --> E[Tool 4: Search place vector DB]
    E --> F[Return structured recommendation result]

πŸ”§ Tool Definitions

πŸ”Ή Tool 1: extract_keywords

  • κΈ°λŠ₯: LLM μ‚¬μš©μž μžμ—°μ–΄ μž…λ ₯μ—μ„œ μž₯μ†Œ μΆ”μ²œμ— ν•„μš”ν•œ ν‚€μ›Œλ“œ 및 이용 μ‹œκ°„ μΆ”μΆœ
  • μž…λ ₯ μŠ€ν‚€λ§ˆ:
{
  "user_input": "내일 저녁에 λΆ„μœ„κΈ° 쒋은 데이트 μž₯μ†Œ μΆ”μ²œν•΄μ€˜"
}
  • 좜λ ₯ μŠ€ν‚€λ§ˆ:
{
  "keywords": ["데이트", "감성", "μ•Όκ²½"]
}

πŸ”Ή Tool 2: search_similar_keywords

  • κΈ°λŠ₯: μΆ”μΆœλœ ν‚€μ›Œλ“œλ₯Ό λ²‘ν„°ν™”ν•˜μ—¬ 벑터 DBμ—μ„œ μœ μ‚¬ ν‚€μ›Œλ“œλ₯Ό 검색
  • μž…λ ₯ μŠ€ν‚€λ§ˆ:
{
  "keywords": ["데이트", "감성", "μ•Όκ²½"]
}
  • 좜λ ₯ μŠ€ν‚€λ§ˆ:
{
  "similar_keywords": [
    {"keyword": "λ‘œλ§¨ν‹±", "score": 0.87},
    {"keyword": "λΆ„μœ„κΈ°", "score": 0.85}
  ]
}

πŸ”Ή Tool 3: compute_user_embedding

  • κΈ°λŠ₯: μœ μ‚¬ ν‚€μ›Œλ“œμ˜ μœ μ‚¬λ„λ₯Ό κ°€μ€‘μΉ˜λ‘œ μ‚¬μš©ν•΄ 평균 μž„λ² λ”© 벑터 생성
  • μž…λ ₯ μŠ€ν‚€λ§ˆ:
{
  "similar_keywords": [
    {"keyword": "λ‘œλ§¨ν‹±", "score": 0.87},
    {"keyword": "λΆ„μœ„κΈ°", "score": 0.85}
  ]
}
  • 좜λ ₯ μŠ€ν‚€λ§ˆ:
{
  "vector": [0.125, -0.098, 0.234, ...]
}
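
For intuition, the weighted average simply scales each keyword embedding by its similarity score and divides by the sum of the scores. A minimal toy sketch with 2-dimensional vectors (the numbers and dimensions are illustrative, not real KR-SBERT embeddings):

# Toy illustration of the Tool 3 weighting (illustrative values only)
import numpy as np

vectors = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # toy embeddings for "λ‘œλ§¨ν‹±", "λΆ„μœ„κΈ°"
scores = [0.87, 0.85]

weighted_sum = sum(v * s for v, s in zip(vectors, scores))
avg_vector = weighted_sum / sum(scores)  # β‰ˆ [0.506, 0.494]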

πŸ”Ή Tool 4: recommend_places

  • κΈ°λŠ₯: μ‚¬μš©μž 벑터와 이용 μ‹œκ°„μ„ κΈ°μ€€μœΌλ‘œ μž₯μ†Œ 벑터 DBμ—μ„œ μœ μ‚¬ μž₯μ†Œ μΆ”μ²œ
  • μž…λ ₯ μŠ€ν‚€λ§ˆ:
{
  "vector": [...]
}
  • 좜λ ₯ μŠ€ν‚€λ§ˆ (μ΅œμ’… 응닡):
{
  "data": [
    {"place_id": 21, "similarity_score": 0.92},
    {"place_id": 36, "similarity_score": 0.86},
    ...
  ]
}

🧩 Tech Stack Summary

| Item | Technology | Description |
| --- | --- | --- |
| LLM | Gemini (via GoogleGenerativeAI) | Understands user input and extracts keywords |
| Embedding | KR-SBERT | Embeds keywords and places into vectors |
| Vector DB | Chroma | Similar keyword and place search (the reference code below uses FAISS) |
| LangChain setup | Agent + StructuredTool + Pydantic | Automatic execution of the Tool-based chain |

βœ… Agent Configuration

  • AgentType.STRUCTURED_CHAT_ZERO_SHOT μ‚¬μš©
  • 각 Tool은 StructuredTool.from_function(...)으둜 등둝
  • LLM은 μžμ—°μ–΄ μž…λ ₯을 λ°”νƒ•μœΌλ‘œ Tool을 순차적으둜 호좜
  • 각 Tool의 μž…λ ₯/좜λ ₯은 λͺ…μ‹œμ μœΌλ‘œ μ •μ˜λœ JSON μŠ€ν‚€λ§ˆλ₯Ό 따름

πŸš€ Expected Benefits

| Item | Benefit |
| --- | --- |
| 🎯 Response accuracy | Intermediate step-by-step reasoning improves the precision of place recommendations |
| πŸ“¦ API compatibility | The structured response schema makes integration with the backend straightforward |
| πŸ›  Maintainability | Each Tool is modular, so debugging, replacement, and testing are easy |
| πŸ”„ Extensibility | Features can be extended just by adding new Tools (e.g., a user-location filter) |

🧭 Future Extension Plans

  • μœ„μΉ˜ 기반 ν•„ν„° Tool μΆ”κ°€
  • 쑰건별 μΆ”μ²œ 이유 생성 Tool (LLM 기반)
  • Tool 호좜 κ²°κ³Ό 캐싱 λ˜λŠ” 둜그 μ €μž₯ κΈ°λŠ₯ μΆ”κ°€

πŸ“ μ°Έκ³ : 일뢀 κ΅¬ν˜„ μ˜ˆμ‹œ μ½”λ“œ

Planned file structure (example)

/place_recommendation_chain
β”œβ”€β”€ schemas.py            # Pydantic request/response definitions
β”œβ”€β”€ tools.py              # All Tool function definitions
β”œβ”€β”€ agent.py              # Agent + Tool assembly and execution
└── run.py                # Test entry point (user_input -> final JSON response)
schemas.py
# schemas.py

from pydantic import BaseModel, Field
from typing import List, Literal


# βœ… Tool 1: user input β†’ keyword extraction

class ExtractKeywordsInput(BaseModel):
    user_input: str

class ExtractKeywordsOutput(BaseModel):
    keywords: List[str]


# βœ… Tool 2: keywords β†’ similar keyword search

class SearchSimilarKeywordsInput(BaseModel):
    keywords: List[str]

class SimilarKeyword(BaseModel):
    keyword: str
    score: float

class SearchSimilarKeywordsOutput(BaseModel):
    similar_keywords: List[SimilarKeyword]


# βœ… Tool 3: similar keywords β†’ weighted average vector

class ComputeUserEmbeddingInput(BaseModel):
    similar_keywords: List[SimilarKeyword]

class ComputeUserEmbeddingOutput(BaseModel):
    vector: List[float]


# βœ… Tool 4: user vector β†’ place recommendation

class RecommendPlacesInput(BaseModel):
    vector: List[float]

class RecommendedPlace(BaseModel):
    place_id: int
    similarity_score: float

class RecommendPlacesOutput(BaseModel):
    data: List[RecommendedPlace]
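
As a quick sanity check on these models, the sketch below builds the final response object and serializes it to the JSON shape shown at the top of this page (the values are illustrative):

# schema_check.py (illustrative only)

from schemas import RecommendedPlace, RecommendPlacesOutput

response = RecommendPlacesOutput(
    data=[
        RecommendedPlace(place_id=21, similarity_score=0.92),
        RecommendedPlace(place_id=36, similarity_score=0.86),
    ]
)

print(response.json())  # on Pydantic v2, prefer response.model_dump_json()
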
tools.py
# tools.py

from typing import List

from schemas import (
    ExtractKeywordsOutput,
    SearchSimilarKeywordsOutput, SimilarKeyword,
    ComputeUserEmbeddingOutput,
    RecommendPlacesOutput
)
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_google_genai import GoogleGenerativeAI
import json
import faiss
import numpy as np
import pickle

# πŸ”§ Shared embedding model
embedding_model = HuggingFaceEmbeddings(model_name="snunlp/KR-SBERT-V40K-klueNLI-augSTS")

# πŸ”§ Gemini LLM (example)
llm = GoogleGenerativeAI(
    model="models/gemini-2.0-flash-lite",
    google_api_key="YOUR_GEMINI_API_KEY"
)


# βœ… Tool 1: keyword extraction
# StructuredTool passes the args_schema fields as keyword arguments,
# so each tool function's signature mirrors its input schema.
def extract_keywords_fn(user_input: str) -> ExtractKeywordsOutput:
    prompt = f"""
    λ‹€μŒ μ‚¬μš©μž μž…λ ₯μ—μ„œ μž₯μ†Œ μΆ”μ²œμ„ μœ„ν•œ ν‚€μ›Œλ“œλ₯Ό JSON ν˜•μ‹μœΌλ‘œ μΆ”μΆœν•˜μ„Έμš”.

    μž…λ ₯: "{user_input}"

    좜λ ₯ μ˜ˆμ‹œ:
    {{
        "keywords": ["데이트", "감성", "μ•Όκ²½"]
    }}
    """
    response = llm.invoke(prompt)
    parsed = json.loads(response)
    return ExtractKeywordsOutput(**parsed)


# βœ… Tool 2: similar keyword search
def search_similar_keywords_fn(keywords: List[str]) -> SearchSimilarKeywordsOutput:
    # Load the prebuilt keyword index and its position -> keyword mapping
    index = faiss.read_index("keyword_vectors.index")
    with open("keyword_meta.pkl", "rb") as f:
        keyword_meta = pickle.load(f)

    # Embed the keywords and average them into a single query vector
    vectors = embedding_model.embed_documents(keywords)
    query = np.mean(vectors, axis=0).astype("float32").reshape(1, -1)

    # Assumes an inner-product index over normalized vectors, so D holds similarity scores
    D, I = index.search(query, 5)
    similar = [{"keyword": keyword_meta[i], "score": float(D[0][j])} for j, i in enumerate(I[0])]
    return SearchSimilarKeywordsOutput(similar_keywords=similar)


# βœ… Tool 3: weighted average vector
def compute_user_embedding_fn(similar_keywords: List[SimilarKeyword]) -> ComputeUserEmbeddingOutput:
    # Embed each similar keyword and weight it by its similarity score
    vectors = [embedding_model.embed_query(kw.keyword) for kw in similar_keywords]
    sims = [kw.score for kw in similar_keywords]

    weighted = sum(np.array(vec) * sim for vec, sim in zip(vectors, sims))
    avg_vector = weighted / sum(sims)

    return ComputeUserEmbeddingOutput(vector=avg_vector.tolist())


# βœ… Tool 4: place recommendation
def recommend_places_fn(vector: List[float]) -> RecommendPlacesOutput:
    # Load the prebuilt place index and its metadata (place_id lookup)
    index = faiss.read_index("place_vectors.index")
    with open("place_meta.pkl", "rb") as f:
        place_meta = pickle.load(f)

    vec = np.array(vector).astype("float32").reshape(1, -1)
    D, I = index.search(vec, 5)

    recommended = [
        {"place_id": int(place_meta[i]["id"]), "similarity_score": float(D[0][j])}
        for j, i in enumerate(I[0])
    ]

    return RecommendPlacesOutput(data=recommended)
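
The tools above read prebuilt index files (keyword_vectors.index, keyword_meta.pkl, and the place equivalents) that are not shown on this page. As a rough sketch only, assuming a plain keyword list and an inner-product index over normalized vectors (so search scores behave like cosine similarities), the keyword index could be built as follows; the file names match the ones read above, everything else is an assumption:

# build_keyword_index.py (illustrative sketch, not part of the chain)

import pickle

import faiss
import numpy as np
from langchain_community.embeddings import HuggingFaceEmbeddings

embedding_model = HuggingFaceEmbeddings(model_name="snunlp/KR-SBERT-V40K-klueNLI-augSTS")

keywords = ["λ‘œλ§¨ν‹±", "λΆ„μœ„κΈ°", "μ•Όκ²½", "데이트"]  # assumed sample keyword list

# Embed, normalize, and index with inner product so search scores are cosine similarities
vectors = np.array(embedding_model.embed_documents(keywords), dtype="float32")
faiss.normalize_L2(vectors)

index = faiss.IndexFlatIP(vectors.shape[1])
index.add(vectors)

faiss.write_index(index, "keyword_vectors.index")
with open("keyword_meta.pkl", "wb") as f:
    pickle.dump(keywords, f)  # keyword_meta[i] maps index position i back to its keyword
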
agent.py
# agent.py

from langchain.agents import initialize_agent
from langchain.agents.agent_types import AgentType
from langchain_google_genai import GoogleGenerativeAI
from langchain.tools import StructuredTool

from tools import (
    extract_keywords_fn,
    search_similar_keywords_fn,
    compute_user_embedding_fn,
    recommend_places_fn
)
from schemas import (
    ExtractKeywordsInput,
    SearchSimilarKeywordsInput,
    ComputeUserEmbeddingInput,
    RecommendPlacesInput
)

# βœ… LLM (Gemini)
llm = GoogleGenerativeAI(
    model="models/gemini-pro",
    google_api_key="YOUR_GEMINI_API_KEY"
)

# βœ… Tool list (structured input schemas applied)
# Note: StructuredTool.from_function has no return_schema argument; each tool's output
# shape is enforced by the Pydantic model returned from its function.
tools = [
    StructuredTool.from_function(
        func=extract_keywords_fn,
        name="extract_keywords",
        description="Extracts the keywords needed for place recommendation from the user input.",
        args_schema=ExtractKeywordsInput
    ),
    StructuredTool.from_function(
        func=search_similar_keywords_fn,
        name="search_similar_keywords",
        description="Searches the vector DB for keywords similar to the extracted keywords.",
        args_schema=SearchSimilarKeywordsInput
    ),
    StructuredTool.from_function(
        func=compute_user_embedding_fn,
        name="compute_user_embedding",
        description="Computes a weighted average embedding vector from the similar keywords and their similarity scores.",
        args_schema=ComputeUserEmbeddingInput
    ),
    StructuredTool.from_function(
        func=recommend_places_fn,
        name="recommend_places",
        description="Returns recommendations from the place vector DB based on the user embedding vector.",
        args_schema=RecommendPlacesInput
    )
]

# βœ… Create the Agent
agent_executor = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)
run.py
# run.py

from agent import agent_executor

if __name__ == "__main__":
    # πŸ“ ν…ŒμŠ€νŠΈμš© μ‚¬μš©μž μž…λ ₯
    user_query = "저녁에 λΆ„μœ„κΈ° 쒋은 데이트 μž₯μ†Œ μΆ”μ²œν•΄μ€˜"

    print("\n==============================")
    print("πŸ” μ‚¬μš©μž μš”μ²­:", user_query)
    print("==============================\n")

    # 🧠 Run the LangChain Agent
    result = agent_executor.run(user_query)

    print("\n==============================")
    print("πŸ† μ΅œμ’… μΆ”μ²œ κ²°κ³Ό (κ΅¬μ‘°ν™”λœ JSON):")
    print(result)
    print("==============================")