RAG Knowledge Base - IgorGanapolsky/trading GitHub Wiki

🧠 RAG Knowledge Base

Last Updated: 2025-12-01 08:23 AM ET
Auto-Updated: Daily via GitHub Actions


📊 Knowledge Base Overview

| Source | Records | Status | Last Update |
|---|---|---|---|
| Sentiment RAG | 10 tickers | ✅ Active | 2025-11-09 |
| Berkshire Letters | 14 PDFs (4.15 MB) | ✅ Downloaded | 2010-2023 |
| Bogleheads Forum | 0 insights | ⏳ Pending data collection | Daily |
| YouTube Transcripts | 5 videos (100 KB) | ✅ Active | Daily |
| Reddit Sentiment | 3 files | ✅ Active | Daily |
| News Sentiment | 2 files | ✅ Active | Daily |

🎯 Sentiment by Ticker

| Ticker | Sentiment | Signal | Regime | Confidence |
|---|---|---|---|---|
| AMZN | 🟢 +64.0 | BULLISH | neutral | medium |
| NVDA | 🟢 +60.0 | BULLISH | neutral | high |
| QQQ | 🟡 +41.0 | BULLISH | neutral | medium |
| SPY | 🟡 +35.0 | BULLISH | neutral | high |
| AAPL | 🟡 +35.0 | BULLISH | neutral | medium |
| GME | 🟡 +28.0 | BULLISH | neutral | low |
| AMD | 🟡 +23.0 | BULLISH | neutral | low |
| TSLA | ⚪ +5.0 | NEUTRAL | neutral | medium |
| GOOGL | 🟠 -30.0 | BEARISH | neutral | medium |
| PLTR | 🟠 -34.0 | BEARISH | neutral | medium |
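
The table does not say how a numeric sentiment score becomes a BULLISH, NEUTRAL, or BEARISH signal. The sketch below shows one way such a mapping could work. The `SignalThresholds` class, the `classify_signal` function, and the ±20 cut-offs are all hypothetical: they were chosen only because they reproduce the rows shown above, and the real system may use different rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalThresholds:
    """Hypothetical cut-offs; the wiki does not state the real ones."""
    bullish_min: float = 20.0   # assumed: scores at or above this are BULLISH
    bearish_max: float = -20.0  # assumed: scores at or below this are BEARISH


def classify_signal(score: float,
                    thresholds: SignalThresholds = SignalThresholds()) -> str:
    """Map a sentiment score to a coarse signal label."""
    if score >= thresholds.bullish_min:
        return "BULLISH"
    if score <= thresholds.bearish_max:
        return "BEARISH"
    return "NEUTRAL"


# Scores copied from the sentiment table.
table_scores = {"AMZN": 64.0, "AMD": 23.0, "TSLA": 5.0,
                "GOOGL": -30.0, "PLTR": -34.0}
signals = {ticker: classify_signal(score)
           for ticker, score in table_scores.items()}
```

With these assumed thresholds, every ticker in `table_scores` receives the same signal the table reports.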

📚 Warren Buffett's Wisdom (Berkshire Letters)

Years Available: 2010-2023
Total Letters: 14 PDFs
Total Size: 4.15 MB

Recent Letters

  • 📄 2023 Annual Letter
  • 📄 2022 Annual Letter
  • 📄 2021 Annual Letter
  • 📄 2020 Annual Letter
  • 📄 2019 Annual Letter

How to Query Buffett's Wisdom

from src.rag.collectors.berkshire_collector import BerkshireLettersCollector

collector = BerkshireLettersCollector()

# Search for investment advice
results = collector.search("index funds vs stock picking")

# Get stock mentions
apple_wisdom = collector.get_stock_mentions("AAPL")

🗣️ Bogleheads Forum Insights

Status: Pending data collection
Total Insights: 0
Data Files: 0

Forums Monitored

  • Personal Investments
  • Investing - Theory, News & General

Topics Tracked

  • Market timing, rebalancing, risk
  • Diversification, asset allocation
  • Index funds, ETFs (SPY, QQQ, VOO)

🎬 YouTube Financial Analysis

Transcripts Cached: 5
Videos Processed: 0
Total Size: 100 KB

Channels Monitored

  • Parkev Tatevosian, CFA
  • Joseph Carlson
  • Let's Talk Money! with Joseph Hogue
  • Financial Education
  • Everything Money

🔌 Data Collectors Status

| Collector | Source | Status |
|---|---|---|
| Reddit | r/wallstreetbets, r/stocks, r/investing | ✅ Installed |
| Yahoo Finance | Yahoo Finance API | ✅ Installed |
| Alpha Vantage | Alpha Vantage News API | ✅ Installed |
| Seeking Alpha | Seeking Alpha RSS | ✅ Installed |
| LinkedIn | LinkedIn Posts API | ✅ Installed |
| TikTok | TikTok Research API | ✅ Installed |
| Berkshire Letters | berkshirehathaway.com | ✅ Installed |

📁 Data Storage Structure

data/
├── rag/
│   ├── sentiment_rag.db          # SQLite: Ticker sentiment embeddings
│   ├── sentiment.db              # SQLite: Sentiment cache
│   ├── berkshire_letters/
│   │   ├── raw/                  # Original PDF files
│   │   └── parsed/               # Extracted text
│   ├── bogleheads/               # Forum insights JSON
│   ├── chroma_db/                # ChromaDB vector store
│   └── vector_store/             # FAISS indices
├── sentiment/
│   ├── reddit_*.json             # Daily Reddit sentiment
│   └── news_*.json               # Daily news sentiment
└── youtube_cache/
    └── *_transcript.txt          # Video transcripts
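
Given the layout above, a consumer often needs the most recent daily file for one source. The following is a minimal sketch using only the standard library. The helper name `latest_sentiment_file` is hypothetical, and it assumes the `*` in `reddit_*.json` and `news_*.json` is a sortable date stamp (for example `2025-11-09`), which the wiki does not confirm.

```python
from pathlib import Path
from typing import Optional


def latest_sentiment_file(data_dir: Path, source: str) -> Optional[Path]:
    """Return the last <source>_*.json file under data_dir/sentiment, or None.

    Assumes the file-name suffix sorts chronologically (e.g. an ISO date),
    so lexicographic order matches time order.
    """
    candidates = sorted((data_dir / "sentiment").glob(f"{source}_*.json"))
    return candidates[-1] if candidates else None
```

A call such as `latest_sentiment_file(Path("data"), "reddit")` would then return the newest Reddit file, or `None` if nothing has been collected yet.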

🔄 Data Flow

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Data Sources   │────▶│   Collectors    │────▶│   RAG Store     │
│                 │     │                 │     │                 │
│ • Reddit        │     │ • Parse         │     │ • Embeddings    │
│ • YouTube       │     │ • Extract       │     │ • Vector Index  │
│ • Seeking Alpha │     │ • Normalize     │     │ • SQLite Cache  │
│ • LinkedIn      │     │ • Score         │     │                 │
│ • TikTok        │     │                 │     │                 │
│ • Bogleheads    │     │                 │     │                 │
│ • Berkshire     │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                        │
                                                        ▼
                                               ┌─────────────────┐
                                               │ Trading System  │
                                               │                 │
                                               │ • Unified       │
                                               │   Sentiment     │
                                               │ • Trade         │
                                               │   Decisions     │
                                               └─────────────────┘
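
The flow above can be illustrated end to end with a small, self-contained sketch: normalize per-source scores, cache them in SQLite, and combine them into a unified sentiment. Everything here is an assumption made for illustration. The function names, the `sentiment` table schema, the -100 to +100 target range, and the plain average used as "unified sentiment" are hypothetical and do not come from the repository.

```python
import sqlite3
from statistics import mean
from typing import Iterable, Optional, Tuple


def normalize(raw_score: float, low: float, high: float) -> float:
    """Rescale a source-specific score from [low, high] to -100..+100 (assumed range)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(raw_score, low), high)
    return (clipped - low) / (high - low) * 200.0 - 100.0


def store_scores(conn: sqlite3.Connection,
                 rows: Iterable[Tuple[str, str, float]]) -> None:
    """Cache (ticker, source, score) rows in a hypothetical sentiment table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sentiment "
        "(ticker TEXT, source TEXT, score REAL)"
    )
    conn.executemany("INSERT INTO sentiment VALUES (?, ?, ?)", rows)
    conn.commit()


def unified_sentiment(conn: sqlite3.Connection, ticker: str) -> Optional[float]:
    """Average all cached scores for a ticker; None if there are none."""
    cur = conn.execute(
        "SELECT score FROM sentiment WHERE ticker = ?", (ticker,)
    )
    scores = [row[0] for row in cur.fetchall()]
    return mean(scores) if scores else None


# Illustrative inputs only; these are not real collector outputs.
conn = sqlite3.connect(":memory:")
store_scores(conn, [
    ("SPY", "reddit", normalize(0.5, -1.0, 1.0)),    # source scored on -1..1
    ("SPY", "news", normalize(62.5, 0.0, 100.0)),    # source scored on 0..100
])
spy_score = unified_sentiment(conn, "SPY")
```

A simple average treats every source equally. A production system would more likely weight sources by reliability or recency, which this sketch leaves out.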

🔗 Quick Links


This page is automatically updated daily by GitHub Actions.