Architecture Overview.md - himent12/FlashGenie GitHub Wiki

🏗️ Architecture Overview

This document provides a comprehensive overview of FlashGenie's system architecture, design patterns, and component interactions.

🎯 System Architecture

FlashGenie follows a modular, layered architecture designed for extensibility and maintainability:

graph TD
    A[User Interfaces] --> B[Application Layer]
    B --> C[Core Domain]
    B --> D[Extension System]
    C --> E[Data Access Layer]
    D --> E
    E --> F[Storage Backends]
    
    subgraph "User Interfaces"
    A1[CLI Interface]
    A2[Future: GUI]
    A3[Future: Web Interface]
    A4[Future: Mobile Apps]
    end
    
    subgraph "Application Layer"
    B1[Command Handlers]
    B2[Service Orchestration]
    B3[Event System]
    end
    
    subgraph "Core Domain"
    C1[Flashcard System]
    C2[Spaced Repetition]
    C3[Quiz Engine]
    C4[Analytics Engine]
    C5[Tag Management]
    end
    
    subgraph "Extension System"
    D1[Plugin Manager]
    D2[Custom Algorithms]
    D3[Importers/Exporters]
    D4[Visualization Tools]
    end
    
    subgraph "Data Access Layer"
    E1[Repository Pattern]
    E2[Unit of Work]
    E3[Data Mappers]
    end
    
    subgraph "Storage Backends"
    F1[JSON Files]
    F2[SQLite]
    F3[Future: Cloud Storage]
    end

Key Architectural Principles

  1. Domain-Driven Design: Core business logic is isolated in the domain layer
  2. Clean Architecture: Dependencies point inward, with domain at the center
  3. SOLID Principles: Single responsibility, Open-closed, Liskov substitution, Interface segregation, Dependency inversion
  4. Command Query Responsibility Segregation (CQRS): Separate read and write operations
  5. Event-Driven Architecture: Components communicate via events for loose coupling

🧩 Core Components

Flashcard System

The fundamental data model representing learning content:

classDiagram
    class Flashcard {
        +UUID id
        +String question
        +String answer
        +List~String~ tags
        +Metadata metadata
        +DateTime created_at
        +DateTime updated_at
        +create()
        +update()
        +addTag()
        +removeTag()
    }
    
    class Deck {
        +UUID id
        +String name
        +String description
        +List~Flashcard~ cards
        +DateTime created_at
        +DateTime updated_at
        +addCard()
        +removeCard()
        +findCardsByTag()
        +getStatistics()
    }
    
    class Tag {
        +String name
        +String description
        +Tag parent
        +List~Tag~ children
        +addChild()
        +removeChild()
        +getFullPath()
    }
    
    Deck "1" *-- "many" Flashcard
    Flashcard "many" o-- "many" Tag
    Tag "1" *-- "many" Tag

Key classes:

  • Flashcard: Core learning unit with question/answer pairs
  • Deck: Collection of flashcards with metadata
  • Tag: Hierarchical organization system
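To make the class diagram concrete, here is a minimal sketch of the three entities as Python dataclasses. It covers only a subset of the fields shown above. The snake_case method names, the `utc_now` helper, and the `/` path separator are assumptions made for illustration, not FlashGenie's actual implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional
from uuid import UUID, uuid4


def utc_now() -> datetime:
    """Hypothetical helper: timezone-aware creation timestamp."""
    return datetime.now(timezone.utc)


@dataclass
class Tag:
    name: str
    description: str = ""
    # Excluded from repr/eq so parent <-> child links cannot recurse forever.
    parent: Optional["Tag"] = field(default=None, repr=False, compare=False)
    children: List["Tag"] = field(default_factory=list)

    def add_child(self, child: "Tag") -> None:
        child.parent = self
        self.children.append(child)

    def get_full_path(self, separator: str = "/") -> str:
        # Walk up to the root, then join the names from root to leaf.
        parts: List[str] = []
        node: Optional[Tag] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return separator.join(reversed(parts))


@dataclass
class Flashcard:
    question: str
    answer: str
    tags: List[str] = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=utc_now)

    def add_tag(self, tag: str) -> None:
        # Keep tags unique while preserving insertion order.
        if tag not in self.tags:
            self.tags.append(tag)


@dataclass
class Deck:
    name: str
    cards: List[Flashcard] = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)

    def add_card(self, card: Flashcard) -> None:
        self.cards.append(card)

    def find_cards_by_tag(self, tag: str) -> List[Flashcard]:
        return [card for card in self.cards if tag in card.tags]
```

Flashcards reference tags by name here, which mirrors the `List~String~ tags` field in the diagram; the `Tag` tree carries the hierarchy separately.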

Spaced Repetition System

The learning algorithm that optimizes review scheduling:

classDiagram
    class SpacedRepetitionAlgorithm {
        +calculateNextReview(card, performance)
        +calculateDifficulty(card, performance)
        +getReviewSchedule(deck)
    }
    
    class SM2Algorithm {
        +calculateEaseFactor(card, performance)
        +calculateInterval(card, performance)
        +calculateNextReview(card, performance)
    }
    
    class AdaptiveAlgorithm {
        +learningRate
        +forgettingCurveModel
        +calculateNextReview(card, performance)
        +adjustDifficulty(card, performance)
    }
    
    class CardSchedule {
        +Flashcard card
        +DateTime nextReview
        +Float easeFactor
        +Int interval
        +Int repetitions
        +updateAfterReview(performance)
    }
    
    SpacedRepetitionAlgorithm <|-- SM2Algorithm
    SpacedRepetitionAlgorithm <|-- AdaptiveAlgorithm
    CardSchedule o-- SpacedRepetitionAlgorithm

Key classes:

  • SpacedRepetitionAlgorithm: Base algorithm interface
  • SM2Algorithm: Classic SuperMemo 2 implementation
  • AdaptiveAlgorithm: Enhanced algorithm with machine learning
  • CardSchedule: Tracks review history and scheduling
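For readers unfamiliar with SM-2, the following sketch shows one common reading of the published SuperMemo 2 rules: a failed recall (grade below 3) restarts the sequence without changing the ease factor, and the first two successful reviews use fixed intervals of 1 and 6 days. The `Schedule` class and `sm2_review` function are hypothetical names used only for this example; FlashGenie's `SM2Algorithm` may differ in details such as rounding or which ease factor feeds the interval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MIN_EASE_FACTOR = 1.3  # SM-2 never lets the ease factor drop below this


@dataclass(frozen=True)
class Schedule:
    ease_factor: float = 2.5
    interval: int = 0       # days until the next review
    repetitions: int = 0    # consecutive successful reviews
    next_review: Optional[datetime] = None


def sm2_review(schedule: Schedule, quality: int, now: datetime) -> Schedule:
    """Return a new schedule after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality < 3:
        # Failed recall: restart the sequence, keep the ease factor.
        repetitions, interval, ease = 0, 1, schedule.ease_factor
    else:
        repetitions = schedule.repetitions + 1
        if repetitions == 1:
            interval = 1
        elif repetitions == 2:
            interval = 6
        else:
            interval = round(schedule.interval * schedule.ease_factor)
        # Ease factor grows for grade 5, is unchanged for 4, shrinks for 3.
        delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
        ease = max(MIN_EASE_FACTOR, schedule.ease_factor + delta)

    return Schedule(ease, interval, repetitions, now + timedelta(days=interval))
```

Returning a new immutable `Schedule` rather than mutating the old one makes before/after comparisons in tests straightforward.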

Quiz Engine

Manages study sessions and user interactions:

classDiagram
    class QuizEngine {
        +startSession(deck, options)
        +endSession(session)
        +getNextCard(session)
        +recordAnswer(session, card, performance)
    }
    
    class QuizSession {
        +UUID id
        +Deck deck
        +DateTime startTime
        +DateTime endTime
        +List~QuizResponse~ responses
        +QuizOptions options
        +getProgress()
        +getStatistics()
    }
    
    class QuizResponse {
        +Flashcard card
        +Int performance
        +Float confidence
        +Int timeToAnswer
        +DateTime timestamp
    }
    
    class QuizOptions {
        +String mode
        +Int cardLimit
        +List~String~ includeTags
        +List~String~ excludeTags
        +Boolean shuffleCards
    }
    
    QuizEngine -- QuizSession
    QuizSession "1" *-- "many" QuizResponse
    QuizSession -- QuizOptions

Key classes:

  • QuizEngine: Core session management
  • QuizSession: Represents a study session
  • QuizResponse: Individual card response data
  • QuizOptions: Configuration for quiz behavior
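As an illustration of how `QuizOptions` might drive card selection when a session starts, here is a small sketch. The `Card` stand-in, the `select_cards` function, and the rule that a card qualifies when it carries any of the included tags are assumptions; the source does not say whether include-tag matching is "any" or "all".

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Card:
    """Minimal stand-in for Flashcard, just enough for filtering."""
    question: str
    tags: List[str] = field(default_factory=list)


@dataclass
class QuizOptions:
    card_limit: Optional[int] = None
    include_tags: List[str] = field(default_factory=list)
    exclude_tags: List[str] = field(default_factory=list)
    shuffle_cards: bool = False


def select_cards(
    cards: Sequence[Card],
    options: QuizOptions,
    rng: Optional[random.Random] = None,
) -> List[Card]:
    include = set(options.include_tags)
    exclude = set(options.exclude_tags)
    selected = [
        card
        for card in cards
        # An empty include list means "no restriction".
        if (not include or include & set(card.tags))
        and not exclude & set(card.tags)
    ]
    if options.shuffle_cards:
        # Accept an explicit Random instance so tests can be deterministic.
        (rng or random).shuffle(selected)
    if options.card_limit is not None:
        selected = selected[: options.card_limit]
    return selected
```

Applying the limit after shuffling means a limited session draws a random subset rather than always the first cards in the deck.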

Analytics Engine

Processes learning data to provide insights:

classDiagram
    class AnalyticsEngine {
        +calculatePerformanceMetrics(user, timeframe)
        +predictMasteryTimeline(user, deck)
        +identifyStrengthsWeaknesses(user)
        +generateRecommendations(user)
    }
    
    class PerformanceTracker {
        +trackSession(session)
        +calculateRetentionRate(user, deck)
        +calculateLearningVelocity(user, deck)
        +identifyDifficultCards(user, deck)
    }
    
    class LearningCurve {
        +User user
        +Deck deck
        +List~DataPoint~ dataPoints
        +predictFuturePerformance()
        +calculateOptimalReviewSchedule()
    }
    
    class Recommendation {
        +String type
        +String description
        +Float confidence
        +List~Object~ relatedEntities
        +DateTime generated
    }
    
    AnalyticsEngine -- PerformanceTracker
    AnalyticsEngine -- LearningCurve
    AnalyticsEngine -- Recommendation

Key classes:

  • AnalyticsEngine: Core analytics processing
  • PerformanceTracker: Tracks and analyzes user performance
  • LearningCurve: Models learning progress over time
  • Recommendation: Actionable insights for users
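Two of the `PerformanceTracker` metrics can be sketched with nothing more than the recorded responses. The definitions below are assumptions chosen for illustration: retention is the share of responses graded 3 or higher, and a card counts as difficult when its mean grade falls below a threshold. The `Response` class and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List

PASS_THRESHOLD = 3  # assumption: grades of 3 or more count as recalled


@dataclass(frozen=True)
class Response:
    card_id: str
    performance: int  # grade from 0 to 5


def retention_rate(responses: Iterable[Response]) -> float:
    """Fraction of responses that met the pass threshold (0.0 if none)."""
    scores = [response.performance for response in responses]
    if not scores:
        return 0.0
    return sum(score >= PASS_THRESHOLD for score in scores) / len(scores)


def difficult_cards(
    responses: Iterable[Response], max_average: float = 3.0
) -> List[str]:
    """IDs of cards whose mean grade is below max_average, sorted by ID."""
    by_card: Dict[str, List[int]] = defaultdict(list)
    for response in responses:
        by_card[response.card_id].append(response.performance)
    return sorted(
        card_id for card_id, scores in by_card.items() if mean(scores) < max_average
    )
```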

🔄 Data Flow

Study Session Flow

sequenceDiagram
    participant User
    participant CLI as CLI Interface
    participant QE as Quiz Engine
    participant SRS as Spaced Repetition System
    participant DAL as Data Access Layer
    
    User->>CLI: Start quiz session
    CLI->>QE: startSession(deck, options)
    QE->>SRS: getReviewSchedule(deck)
    SRS->>DAL: loadCardHistory(deck)
    DAL-->>SRS: cardHistory
    SRS-->>QE: scheduledCards
    QE-->>CLI: session
    CLI-->>User: First card
    
    loop For each card
        User->>CLI: Submit answer
        CLI->>QE: recordAnswer(session, card, performance)
        QE->>SRS: calculateNextReview(card, performance)
        SRS-->>QE: updatedSchedule
        QE->>DAL: saveResponse(response)
        QE->>QE: getNextCard(session)
        QE-->>CLI: nextCard
        CLI-->>User: Show next card
    end
    
    User->>CLI: End session
    CLI->>QE: endSession(session)
    QE->>DAL: saveSession(session)
    QE->>DAL: updateCardSchedules(updatedCards)
    QE-->>CLI: sessionSummary
    CLI-->>User: Display session results

Data Synchronization Flow

sequenceDiagram
    participant User
    participant CLI as CLI Interface
    participant Sync as Sync Manager
    participant Local as Local Storage
    participant Remote as Remote Storage
    
    User->>CLI: Initiate sync
    CLI->>Sync: startSync()
    Sync->>Local: getLastSyncTimestamp()
    Local-->>Sync: lastSync
    Sync->>Local: getChangedSince(lastSync)
    Local-->>Sync: localChanges
    Sync->>Remote: authenticate()
    Remote-->>Sync: authToken
    Sync->>Remote: getChangedSince(lastSync)
    Remote-->>Sync: remoteChanges
    
    Sync->>Sync: resolveConflicts(localChanges, remoteChanges)
    
    Sync->>Remote: pushChanges(resolvedChanges)
    Remote-->>Sync: syncStatus
    Sync->>Local: updateChanges(resolvedChanges)
    Sync->>Local: updateLastSyncTimestamp()
    Sync-->>CLI: syncResults
    CLI-->>User: Display sync status

🧠 Design Patterns

FlashGenie implements several design patterns to ensure maintainability and extensibility:

Repository Pattern

Abstracts data access operations:

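from __future__ import annotations  # lets the hints name Deck before it is imported

from abc import ABC, abstractmethod
from typing import List, Optional
from uuid import UUID

# Abstract repository interface
class DeckRepository(ABC):
    @abstractmethod
    def get_by_id(self, deck_id: UUID) -> Optional[Deck]:
        """Return the deck with this ID, or None if it does not exist."""

    @abstractmethod
    def save(self, deck: Deck) -> None:
        """Persist the deck."""

    @abstractmethod
    def delete(self, deck_id: UUID) -> None:
        """Remove the deck with this ID."""

    @abstractmethod
    def find_by_tags(self, tags: List[str]) -> List[Deck]:
        """Return the decks that match the given tags."""

# Concrete implementation for JSON storage
class JsonDeckRepository(DeckRepository):
    def __init__(self, file_path: str):
        self.file_path = file_path
        # Implementation details...

    def get_by_id(self, deck_id: UUID) -> Optional[Deck]:
        # Implementation details...
        pass

    # Other method implementations...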
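The elided JSON implementation could look roughly like the sketch below. Everything here is an assumption made for illustration: the `JsonDeckStore` name, the single-file layout keyed by deck ID, the simplified `Deck` with its own `tags` list, and the "any tag matches" rule in `find_by_tags`. It rereads the file on every call, which is simple but would need caching for large collections.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional
from uuid import UUID, uuid4


@dataclass
class Deck:
    """Simplified stand-in for the domain Deck."""
    name: str
    tags: List[str] = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)


class JsonDeckStore:
    """Hypothetical JSON-file repository: one file, records keyed by deck ID."""

    def __init__(self, file_path: str) -> None:
        self._path = Path(file_path)

    def _load(self) -> Dict[str, dict]:
        if not self._path.exists():
            return {}
        return json.loads(self._path.read_text(encoding="utf-8"))

    def _dump(self, records: Dict[str, dict]) -> None:
        self._path.write_text(json.dumps(records, indent=2), encoding="utf-8")

    def save(self, deck: Deck) -> None:
        records = self._load()
        records[str(deck.id)] = {"name": deck.name, "tags": deck.tags}
        self._dump(records)

    def get_by_id(self, deck_id: UUID) -> Optional[Deck]:
        record = self._load().get(str(deck_id))
        if record is None:
            return None
        return Deck(name=record["name"], tags=record["tags"], id=deck_id)

    def delete(self, deck_id: UUID) -> None:
        records = self._load()
        records.pop(str(deck_id), None)  # deleting a missing deck is a no-op
        self._dump(records)

    def find_by_tags(self, tags: List[str]) -> List[Deck]:
        wanted = set(tags)
        return [
            Deck(name=record["name"], tags=record["tags"], id=UUID(key))
            for key, record in self._load().items()
            if wanted & set(record["tags"])
        ]
```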

Factory Pattern

Creates objects without the caller naming the concrete class:

class AlgorithmFactory:
    @staticmethod
    def create_algorithm(algorithm_type: str, **kwargs) -> SpacedRepetitionAlgorithm:
        if algorithm_type == "sm2":
            return SM2Algorithm(**kwargs)
        elif algorithm_type == "adaptive":
            return AdaptiveAlgorithm(**kwargs)
        elif algorithm_type == "leitner":
            return LeitnerSystem(**kwargs)
        else:
            raise ValueError(f"Unknown algorithm type: {algorithm_type}")
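The plugin example later in this document calls `register_algorithm` and `unregister_algorithm` on the factory, which a fixed `if`/`elif` chain cannot support. A registry-backed variant, sketched below under the assumption that algorithms are created from keyword arguments, would allow that. The `AlgorithmRegistry` name and its duplicate-name check are hypothetical.

```python
from typing import Any, Callable, Dict


class AlgorithmRegistry:
    """Hypothetical factory that maps names to algorithm constructors."""

    def __init__(self) -> None:
        self._creators: Dict[str, Callable[..., Any]] = {}

    def register_algorithm(self, name: str, creator: Callable[..., Any]) -> None:
        if name in self._creators:
            raise ValueError(f"Algorithm already registered: {name}")
        self._creators[name] = creator

    def unregister_algorithm(self, name: str) -> None:
        # Removing an unknown name is treated as a no-op.
        self._creators.pop(name, None)

    def create_algorithm(self, name: str, **kwargs: Any) -> Any:
        try:
            creator = self._creators[name]
        except KeyError:
            raise ValueError(f"Unknown algorithm type: {name}") from None
        return creator(**kwargs)
```

With this shape, adding an algorithm no longer requires editing the factory, which is what the Open-closed principle listed earlier asks for.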

Strategy Pattern

Enables algorithm selection at runtime:

class DifficultyStrategy(ABC):
    @abstractmethod
    def calculate_difficulty(self, card: Flashcard, performance: int) -> float:
        pass

class StandardDifficulty(DifficultyStrategy):
    def calculate_difficulty(self, card: Flashcard, performance: int) -> float:
        # Standard implementation
        pass

class AdaptiveDifficulty(DifficultyStrategy):
    def calculate_difficulty(self, card: Flashcard, performance: int) -> float:
        # Adaptive implementation with ML
        pass

# Usage
difficulty_strategy = config.get_difficulty_strategy()
quiz_engine = QuizEngine(difficulty_strategy=difficulty_strategy)

Observer Pattern

Enables event-based communication:

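from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

class EventManager:
    def __init__(self) -> None:
        # Maps an event type to the callables subscribed to it
        self.listeners: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, listener: Callable[[Any], None]) -> None:
        self.listeners[event_type].append(listener)

    def unsubscribe(self, event_type: str, listener: Callable[[Any], None]) -> None:
        # Ignore listeners that were never subscribed instead of raising ValueError
        if listener in self.listeners.get(event_type, []):
            self.listeners[event_type].remove(listener)

    def notify(self, event_type: str, data: Any) -> None:
        # Iterate over a copy so a listener can unsubscribe while being notified
        for listener in list(self.listeners.get(event_type, [])):
            listener(data)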

# Usage
event_manager = EventManager()
event_manager.subscribe("session_completed", analytics_engine.process_session)
event_manager.subscribe("session_completed", achievement_manager.check_achievements)

🔌 Extension System

FlashGenie's plugin architecture enables third-party extensions:

classDiagram
    class PluginManager {
        +List~Plugin~ plugins
        +loadPlugins()
        +registerPlugin(plugin)
        +unregisterPlugin(pluginId)
        +getPluginByName(name)
        +getHookImplementations(hookName)
    }
    
    class Plugin {
        +String id
        +String name
        +String version
        +String description
        +Dict hooks
        +initialize()
        +shutdown()
    }
    
    class Hook {
        +String name
        +List~Callable~ implementations
        +register(implementation)
        +unregister(implementation)
        +execute(*args, **kwargs)
    }
    
    PluginManager "1" *-- "many" Plugin
    PluginManager "1" *-- "many" Hook
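The `Hook` class in the diagram can be sketched in a few lines. This is an illustrative guess at its behaviour rather than FlashGenie's code: `execute` calls every registered implementation in registration order and collects the return values, and the `HookRegistry` helper (a hypothetical name) creates hooks on first use.

```python
from typing import Any, Callable, Dict, List


class Hook:
    def __init__(self, name: str) -> None:
        self.name = name
        self.implementations: List[Callable[..., Any]] = []

    def register(self, implementation: Callable[..., Any]) -> None:
        # Registering the same callable twice has no effect.
        if implementation not in self.implementations:
            self.implementations.append(implementation)

    def unregister(self, implementation: Callable[..., Any]) -> None:
        if implementation in self.implementations:
            self.implementations.remove(implementation)

    def execute(self, *args: Any, **kwargs: Any) -> List[Any]:
        # Call implementations in registration order and collect results.
        return [impl(*args, **kwargs) for impl in list(self.implementations)]


class HookRegistry:
    """Hypothetical container that creates hooks lazily by name."""

    def __init__(self) -> None:
        self._hooks: Dict[str, Hook] = {}

    def get_hook(self, name: str) -> Hook:
        if name not in self._hooks:
            self._hooks[name] = Hook(name)
        return self._hooks[name]
```

A plugin could then call `registry.get_hook("card_exported").register(my_callback)` during `initialize()`; the hook name here is invented for the example.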

Plugin Development

Plugins can extend FlashGenie in several ways:

# Example plugin implementation
class CustomAlgorithmPlugin(Plugin):
    def __init__(self):
        super().__init__(
            id="custom_algorithm",
            name="Custom Learning Algorithm",
            version="1.0.0",
            description="Implements a custom spaced repetition algorithm"
        )
    
    def initialize(self):
        # Register algorithm with the system
        algorithm_factory = self.get_service("algorithm_factory")
        algorithm_factory.register_algorithm(
            "custom", 
            CustomAlgorithm
        )
        
        # Register settings
        settings_manager = self.get_service("settings_manager")
        settings_manager.register_settings(
            "custom_algorithm",
            {
                "learning_rate": 0.3,
                "forgetting_threshold": 0.7
            }
        )
    
    def shutdown(self):
        # Clean up resources
        algorithm_factory = self.get_service("algorithm_factory")
        algorithm_factory.unregister_algorithm("custom")

Extension Points

FlashGenie provides several extension points:

  1. Algorithms: Custom spaced repetition algorithms
  2. Importers/Exporters: Support for additional file formats
  3. Visualizations: Custom data visualization tools
  4. Commands: New CLI commands
  5. Analytics: Custom analytics processors
  6. Storage Backends: Alternative storage mechanisms

📊 Data Model

Core Entities

erDiagram
    USER {
        uuid id
        string username
        string email
        datetime created_at
        datetime last_login
    }
    
    DECK {
        uuid id
        string name
        string description
        datetime created_at
        datetime updated_at
        uuid user_id
    }
    
    FLASHCARD {
        uuid id
        string question
        string answer
        json metadata
        datetime created_at
        datetime updated_at
        uuid deck_id
    }
    
    TAG {
        string name
        string description
        string parent_tag
    }
    
    CARD_TAG {
        uuid card_id
        string tag_name
    }
    
    REVIEW_LOG {
        uuid id
        uuid card_id
        uuid user_id
        int performance
        float confidence
        int time_to_answer
        datetime timestamp
    }
    
    CARD_SCHEDULE {
        uuid card_id
        uuid user_id
        datetime next_review
        float ease_factor
        int interval
        int repetitions
    }
    
    USER ||--o{ DECK : creates
    DECK ||--o{ FLASHCARD : contains
    FLASHCARD ||--o{ CARD_TAG : has
    CARD_TAG }o--|| TAG : references
    USER ||--o{ REVIEW_LOG : generates
    FLASHCARD ||--o{ REVIEW_LOG : subject_of
    USER ||--o{ CARD_SCHEDULE : tracks
    FLASHCARD ||--o{ CARD_SCHEDULE : scheduled_as
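For the SQLite backend, part of this entity diagram could be expressed as the schema sketched below using Python's standard `sqlite3` module. It covers only the deck, flashcard, tag, and card_tag tables. Storing UUIDs and timestamps as `TEXT`, the cascade rules, and the `open_database` and `cards_with_tag` helpers are assumptions for illustration.

```python
import sqlite3
from typing import List

SCHEMA = """
CREATE TABLE deck (
    id          TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    description TEXT,
    created_at  TEXT,
    updated_at  TEXT
);
CREATE TABLE flashcard (
    id        TEXT PRIMARY KEY,
    question  TEXT NOT NULL,
    answer    TEXT NOT NULL,
    metadata  TEXT,
    deck_id   TEXT NOT NULL REFERENCES deck(id) ON DELETE CASCADE
);
CREATE TABLE tag (
    name        TEXT PRIMARY KEY,
    description TEXT,
    parent_tag  TEXT REFERENCES tag(name)
);
CREATE TABLE card_tag (
    card_id  TEXT NOT NULL REFERENCES flashcard(id) ON DELETE CASCADE,
    tag_name TEXT NOT NULL REFERENCES tag(name),
    PRIMARY KEY (card_id, tag_name)
);
"""


def open_database(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def cards_with_tag(conn: sqlite3.Connection, tag_name: str) -> List[str]:
    rows = conn.execute(
        """
        SELECT f.question
        FROM flashcard AS f
        JOIN card_tag AS ct ON ct.card_id = f.id
        WHERE ct.tag_name = ?
        ORDER BY f.question
        """,
        (tag_name,),
    )
    return [row[0] for row in rows]
```

The composite primary key on `card_tag` prevents the same tag from being attached to a card twice.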

Data Storage

FlashGenie supports multiple storage backends:

  1. JSON Files (Default)

    • Simple file-based storage
    • One file per entity type
    • Good for individual users
  2. SQLite (Local Database)

    • Relational storage
    • Single file database
    • Improved query performance
    • Better for larger collections
  3. Future: Cloud Storage

    • Synchronized storage
    • Multi-device support
    • Backup and recovery
    • Sharing capabilities

🔄 Synchronization

Conflict Resolution

The SyncManager reconciles local and remote changes made since the last sync, detecting and resolving conflicts before anything is applied:

class SyncManager:
    def __init__(self, local_repo, remote_repo):
        self.local_repo = local_repo
        self.remote_repo = remote_repo
    
    def synchronize(self):
        # Get changes since last sync
        last_sync = self.get_last_sync_timestamp()
        local_changes = self.local_repo.get_changes_since(last_sync)
        remote_changes = self.remote_repo.get_changes_since(last_sync)
        
        # Detect and resolve conflicts
        conflicts = self.detect_conflicts(local_changes, remote_changes)
        resolved_changes = self.resolve_conflicts(conflicts)
        
        # Apply resolved changes
        self.apply_changes(resolved_changes)
        
        # Update sync timestamp
        self.update_last_sync_timestamp()
    
    def detect_conflicts(self, local_changes, remote_changes):
        # Implementation details...
        pass
    
    def resolve_conflicts(self, conflicts):
        # Implementation details...
        pass

Conflict resolution strategies:

  1. Last-Write-Wins: Most recent change takes precedence
  2. Merge: Combine non-conflicting changes
  3. Manual Resolution: User decides for critical conflicts
  4. Field-Level Resolution: Apply different strategies per field
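The first, second, and fourth strategies can be combined in one small function, sketched here under stated assumptions: each change carries only the fields it modified plus a modification time, a common base version of the record is available, and last-write-wins is the fallback when both sides changed the same field differently. The `Change` class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict


@dataclass(frozen=True)
class Change:
    entity_id: str
    fields: Dict[str, Any]   # only the fields this side modified
    modified_at: datetime


def last_write_wins(local: Change, remote: Change) -> Change:
    # Ties go to the local change (an arbitrary choice for this sketch).
    return local if local.modified_at >= remote.modified_at else remote


def merge_changes(base: Dict[str, Any], local: Change, remote: Change) -> Dict[str, Any]:
    """Field-level merge of two changes against their common base record."""
    newer = last_write_wins(local, remote)
    merged = dict(base)
    for key in set(local.fields) | set(remote.fields):
        base_value = base.get(key)
        local_value = local.fields.get(key, base_value)
        remote_value = remote.fields.get(key, base_value)
        if local_value == remote_value:
            merged[key] = local_value          # both sides agree
        elif local_value == base_value:
            merged[key] = remote_value         # only remote changed it
        elif remote_value == base_value:
            merged[key] = local_value          # only local changed it
        else:
            merged[key] = newer.fields[key]    # true conflict: newest wins
    return merged
```

Manual resolution would replace the final branch with a step that queues the conflict for the user instead of picking a winner.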

🧪 Testing Strategy

FlashGenie employs a comprehensive testing approach:

Unit Testing

# Example unit test for Flashcard
def test_flashcard_creation():
    # Arrange
    question = "What is the capital of France?"
    answer = "Paris"
    
    # Act
    card = Flashcard(question=question, answer=answer)
    
    # Assert
    assert card.question == question
    assert card.answer == answer
    assert card.id is not None
    assert card.created_at is not None

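# Example unit test for SpacedRepetitionAlgorithm
def test_sm2_algorithm_calculation():
    # Arrange
    algorithm = SM2Algorithm()
    card = Flashcard("Q", "A")
    # One prior successful review: in SM-2 the first success always yields a
    # 1-day interval, so a brand-new schedule could not show interval growth
    card_schedule = CardSchedule(
        card_id=card.id, ease_factor=2.5, interval=1, repetitions=1
    )
    previous_ease = card_schedule.ease_factor
    previous_interval = card_schedule.interval
    performance = 5  # Perfect response

    # Act
    new_schedule = algorithm.calculate_next_review(card_schedule, performance)

    # Assert (against captured values, in case the schedule is updated in place)
    assert new_schedule.ease_factor > previous_ease
    assert new_schedule.interval > previous_interval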

Integration Testing

# Example integration test for quiz session
def test_complete_quiz_session():
    # Arrange
    deck = create_test_deck()
    quiz_engine = QuizEngine()
    
    # Act
    session = quiz_engine.start_session(deck)
    
    # Simulate answering all cards
    while not session.is_complete():
        card = quiz_engine.get_next_card(session)
        performance = 3  # Average performance
        quiz_engine.record_answer(session, card, performance)
    
    results = quiz_engine.end_session(session)
    
    # Assert
    assert session.is_complete()
    assert len(session.responses) == len(deck.cards)
    assert results.accuracy_rate is not None
    assert results.average_time is not None

Performance Testing

import time

# Example performance test
def test_large_deck_performance():
    # Arrange
    large_deck = create_large_test_deck(10000)  # 10,000 cards
    quiz_engine = QuizEngine()

    # Act (perf_counter is a monotonic, high-resolution clock suited to timing)
    start_time = time.perf_counter()
    session = quiz_engine.start_session(large_deck)
    initialization_time = time.perf_counter() - start_time

    start_time = time.perf_counter()
    card = quiz_engine.get_next_card(session)
    card_retrieval_time = time.perf_counter() - start_time

    # Assert
    assert card is not None
    assert initialization_time < 1.0  # Should initialize in under 1 second
    assert card_retrieval_time < 0.01  # Should retrieve a card in under 10 ms

Code Organization

Package Structure

flashgenie/
├── __init__.py
├── cli/                  # Command-line interface
│   ├── __init__.py
│   ├── commands/         # CLI command implementations
│   └── formatters/       # Output formatting
├── core/                 # Core domain logic
│   ├── __init__.py
│   ├── flashcard.py      # Flashcard entity
│   ├── deck.py           # Deck management
│   ├── spaced_repetition.py  # SRS algorithms
│   ├── quiz_engine.py    # Quiz session management
│   ├── performance_tracker.py  # Learning analytics
│   ├── tag_manager.py    # Hierarchical tagging
│   └── smart_collections.py  # Dynamic card collections
├── data/                 # Data access layer
│   ├── __init__.py
│   ├── repositories/     # Repository implementations
│   ├── storage/          # Storage backends
│   └── serializers/      # Data serialization
├── plugins/              # Plugin system
│   ├── __init__.py
│   ├── manager.py        # Plugin management
│   ├── hooks.py          # Extension points
│   └── builtin/          # Built-in plugins
├── utils/                # Shared utilities
│   ├── __init__.py
│   ├── config.py         # Configuration management
│   ├── logging.py        # Logging utilities
│   └── validation.py     # Input validation
└── main.py               # Application entry point

Coding Standards

FlashGenie follows strict coding standards:

  1. PEP 8: Python style guide
  2. Type Hints: All public methods include type annotations
  3. Docstrings: Google-style docstrings for all classes and methods
  4. Naming Conventions:
    • Classes: PascalCase
    • Functions/Methods: snake_case
    • Constants: UPPER_SNAKE_CASE
    • Private attributes: _leading_underscore
  5. Testing: Minimum 80% code coverage

🚀 Performance Considerations

Optimization Strategies

FlashGenie implements several performance optimizations:

  1. Lazy Loading: Load data only when needed
  2. Caching: Cache frequently accessed data
  3. Indexing: Optimize data retrieval with indexes
  4. Batch Processing: Process data in batches for efficiency
  5. Asynchronous Operations: Non-blocking I/O for UI responsiveness
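Lazy loading and caching from the list above can be combined with the standard library's `functools.cached_property`, as in this sketch. The `LazyDeck` class and its `loader` callback are hypothetical names used for illustration: the cards are fetched on first access and reused afterwards.

```python
from functools import cached_property
from typing import Callable, List


class LazyDeck:
    """Hypothetical deck wrapper that defers loading its cards."""

    def __init__(self, name: str, loader: Callable[[], List[str]]) -> None:
        self.name = name
        self._loader = loader

    @cached_property
    def cards(self) -> List[str]:
        # Runs once on first access; the result is stored on the instance.
        return self._loader()

    def reload(self) -> None:
        # Drop the cached value so the next access calls the loader again.
        self.__dict__.pop("cards", None)
```

Listing decks by name then costs nothing per deck, and the card data is only read for the deck the user actually opens.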

Scalability

Design considerations for handling large datasets:

  1. Pagination: Limit data loaded in memory
  2. Incremental Processing: Process large datasets incrementally
  3. Resource Management: Careful memory and file handle management
  4. Efficient Algorithms: O(n) or better time complexity for critical operations
  5. Data Partitioning: Split large datasets into manageable chunks
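Pagination and incremental processing both come down to consuming a large collection in fixed-size chunks. A minimal sketch, assuming any iterable of cards and a hypothetical `batched` helper:

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items without loading all of them."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Because the helper is a generator, only one batch is held in memory at a time, so it works equally well over a list, a file reader, or a database cursor.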

🔒 Security Considerations

Data Protection

  1. Encryption: Sensitive data encrypted at rest
  2. Secure Defaults: Secure configuration by default
  3. Input Validation: All user input validated and sanitized
  4. Error Handling: Secure error handling without information leakage
  5. Dependency Management: Regular updates for security patches

Privacy

  1. Data Minimization: Collect only necessary data
  2. Local-First: Prioritize local storage over cloud
  3. Transparency: Clear documentation of data usage
  4. User Control: Options to export or delete all data
  5. Anonymization: Analytics data anonymized by default

🔄 Future Architecture

Planned architectural enhancements:

  1. Microservices: Split monolith into specialized services
  2. GraphQL API: Flexible API for frontend integration
  3. Real-time Sync: WebSocket-based synchronization
  4. Machine Learning Pipeline: Advanced personalization
  5. Containerization: Docker-based deployment

Next: Explore the API Reference for detailed technical documentation.