architecture_overview - thesavant42/retrorecon GitHub Wiki

System Architecture Overview

This document summarizes the major runtime components of Retrorecon and illustrates how the database tables relate to each feature. The diagrams are written in Mermaid so they can be rendered directly in Markdown viewers that support it.

flowchart TB
    A((Browser UI)) --> B(Flask Server)
    subgraph B [Flask Application]
        C1[CDXFetcher]
        C2[JSONImporter]
        C3[ImportProgressTracker]
        C4[TagManager]
        C5[BulkActionAgent]
        C6[Notes API]
        C7[ScreenshotAgent]
        C8[SiteZipAgent]
        C9[SubdomainFetcher]
        C10[WebpackExploder]
        C11[DatabaseManager]
    end
    B --> D[(SQLite Database)]
    C1 -->|insert| urls
    C2 -->|insert| urls
    C3 -->|update| import_status
    C4 -->|update| urls
    C5 -->|update/delete| urls
    C6 -->|read/write| notes
    C7 -->|create| screenshots
    C8 -->|create| sitezips
    C9 -->|create| domains
    C10 -->|create| jobs
    C11 -->|load/save| D
Loading

The tables themselves are connected via the relationships shown below.

erDiagram
    urls {
        INTEGER id PK
        TEXT url
        TEXT domain
        TEXT timestamp
        INTEGER status_code
        TEXT mime_type
        TEXT tags
    }
    jobs {
        INTEGER id PK
        TEXT type
        TEXT domain
        TEXT status
        INTEGER progress
        TEXT result
        TIMESTAMP created_at
    }
    import_status {
        INTEGER id PK
        TEXT status
        TEXT detail
        INTEGER progress
        INTEGER total
    }
    notes {
        INTEGER id PK
        INTEGER url_id FK
        TEXT content
        TIMESTAMP created_at
        TIMESTAMP updated_at
    }
    jwt_cookies {
        INTEGER id PK
        TEXT token
        TEXT header
        TEXT payload
        TEXT notes
        TIMESTAMP created_at
    }
    screenshots {
        INTEGER id PK
        TEXT url
        TEXT method
        TEXT screenshot_path
        TEXT thumbnail_path
        INTEGER status_code
        TEXT ip_addresses
        TIMESTAMP created_at
    }
    sitezips {
        INTEGER id PK
        TEXT url
        TEXT method
        TEXT zip_path
        TEXT screenshot_path
        TEXT thumbnail_path
        INTEGER status_code
        TEXT ip_addresses
        TIMESTAMP created_at
    }
    domains {
        INTEGER id PK
        TEXT root_domain
        TEXT subdomain
        TEXT source
        TEXT tags
        INTEGER cdx_indexed
        TIMESTAMP fetched_at
    }

    urls ||--o{ notes : has
    urls ||--|{ screenshots : captures
    urls ||--|{ sitezips : archives
    notes }o--|| urls : references
    domains ||--o{ urls : contains
Loading
⚠️ **GitHub.com Fallback** ⚠️