Page Index - 100-hours-a-week/6-nemo-ai GitHub Wiki

51 page(s) in this GitHub Wiki:

Home
[V1.5] [Model Deployment Method] Colab vLLM
[V1.5] [Model Deployment Method] Deploying to GCP
[V1.5] [Embedding Model] Model Comparison and Reasons for the Change
[V1.5] [Fine-Tuning] Per-Feature vs. General Fine-Tuning
[V1] Vertex AI Load Status Update
[V2] AI Infrastructure Setup (Separate Operation of the vLLM Server and FastAPI Server)
[V2] Gemma 3 4B Text Generation Failure Issue Analysis Report
[V2] Whether to Adopt LangChain
[V2] LLM Response Streaming Method Comparison Report
[V2] Problems with the RAG-Based Chatbot and Directions for Improvement
[V2] SSE (Server-Sent Events) Adoption Report
[V2] vLLM-Based Gemma 3 4B Model Serving Report: GPU Memory Usage, KV Cache, and the Need for Quantization
[V2] vLLM Streaming Responses to WebSocket
[V2] WebSocket-Based Chatbot Streaming Technical Specification (AI → BE)
[V2] Local Model Fine-Tuning - Per-Feature Design for Text Generation
[V2] Model Deployment Method: Colab vLLM
[V2] Model Deployment Method: Deploying to GCP
[V2] Vector DB RAG Architecture Design Document
[V2] Chatbot Phase 1: Free-Input RAG Chatbot Design
[V2] Chatbot Phase 2 Design
[V2] Chatbot LoRA Model Comparison Document (v1 ~ v4)
[V2] Chatbot QLoRA Model Comparison Document (v1 ~ v3)
[V2] Chatbot v3 Design Document
LLM Inference Performance Optimization
Transition Process to a Local LLM