Architecture Overview.md - himent12/FlashGenie GitHub Wiki
Architecture Overview
This document provides a comprehensive overview of FlashGenie's system architecture, design patterns, and component interactions.
System Architecture
FlashGenie follows a modular, layered architecture designed for extensibility and maintainability:
graph TD
A[User Interfaces] --> B[Application Layer]
B --> C[Core Domain]
B --> D[Extension System]
C --> E[Data Access Layer]
D --> E
E --> F[Storage Backends]
subgraph "User Interfaces"
A1[CLI Interface]
A2[Future: GUI]
A3[Future: Web Interface]
A4[Future: Mobile Apps]
end
subgraph "Application Layer"
B1[Command Handlers]
B2[Service Orchestration]
B3[Event System]
end
subgraph "Core Domain"
C1[Flashcard System]
C2[Spaced Repetition]
C3[Quiz Engine]
C4[Analytics Engine]
C5[Tag Management]
end
subgraph "Extension System"
D1[Plugin Manager]
D2[Custom Algorithms]
D3[Importers/Exporters]
D4[Visualization Tools]
end
subgraph "Data Access Layer"
E1[Repository Pattern]
E2[Unit of Work]
E3[Data Mappers]
end
subgraph "Storage Backends"
F1[JSON Files]
F2[SQLite]
F3[Future: Cloud Storage]
end
Key Architectural Principles
- Domain-Driven Design: Core business logic is isolated in the domain layer
- Clean Architecture: Dependencies point inward, with domain at the center
- SOLID Principles: Single responsibility, Open-closed, Liskov substitution, Interface segregation, Dependency inversion
- Command Query Responsibility Segregation (CQRS): Separate read and write operations
- Event-Driven Architecture: Components communicate via events for loose coupling
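As an illustration of the CQRS principle listed above, the sketch below keeps a write-side command handler separate from a read-side query handler. All names here (AddCardCommand, CountCardsQuery, InMemoryStore and the handler functions) are hypothetical and are not taken from the FlashGenie codebase.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AddCardCommand:
    """Write-side request (hypothetical): changes state, returns nothing."""
    deck_name: str
    question: str
    answer: str


@dataclass(frozen=True)
class CountCardsQuery:
    """Read-side request (hypothetical): returns data, changes nothing."""
    deck_name: str


@dataclass
class InMemoryStore:
    """Stand-in for a real storage backend."""
    decks: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)


def handle_add_card(store: InMemoryStore, command: AddCardCommand) -> None:
    # Commands mutate the store and deliberately return no data.
    store.decks.setdefault(command.deck_name, []).append(
        (command.question, command.answer)
    )


def handle_count_cards(store: InMemoryStore, query: CountCardsQuery) -> int:
    # Queries only read; they never modify the store.
    return len(store.decks.get(query.deck_name, []))


store = InMemoryStore()
handle_add_card(store, AddCardCommand("geography", "Capital of France?", "Paris"))
card_count = handle_count_cards(store, CountCardsQuery("geography"))
```

Keeping the two paths apart lets the read side be cached or optimized without touching the code that changes state.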
Core Components
Flashcard System
The fundamental data model representing learning content:
classDiagram
class Flashcard {
+UUID id
+String question
+String answer
+List~String~ tags
+Metadata metadata
+DateTime created_at
+DateTime updated_at
+create()
+update()
+addTag()
+removeTag()
}
class Deck {
+UUID id
+String name
+String description
+List~Flashcard~ cards
+DateTime created_at
+DateTime updated_at
+addCard()
+removeCard()
+findCardsByTag()
+getStatistics()
}
class Tag {
+String name
+String description
+Tag parent
+List~Tag~ children
+addChild()
+removeChild()
+getFullPath()
}
Deck "1" *-- "many" Flashcard
Flashcard "many" o-- "many" Tag
Tag "1" *-- "many" Tag
Key classes:
- Flashcard: Core learning unit with question/answer pairs
- Deck: Collection of flashcards with metadata
- Tag: Hierarchical organization system
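The hierarchical Tag model can be sketched in Python as follows. This is a minimal illustration based on the class diagram above, not the project's actual code; the "/" path separator and the snake_case method names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tag:
    name: str
    # parent/children are left out of repr and equality so the
    # parent <-> child cycle cannot cause infinite recursion.
    parent: Optional["Tag"] = field(default=None, repr=False, compare=False)
    children: List["Tag"] = field(default_factory=list, repr=False, compare=False)

    def add_child(self, child: "Tag") -> None:
        child.parent = self
        self.children.append(child)

    def get_full_path(self, separator: str = "/") -> str:
        # Walk up to the root, then reverse to get root-first order.
        parts: List[str] = []
        node: Optional["Tag"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return separator.join(reversed(parts))


science = Tag("science")
physics = Tag("physics")
optics = Tag("optics")
science.add_child(physics)
physics.add_child(optics)
full_path = optics.get_full_path()  # "science/physics/optics"
```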
Spaced Repetition System
The learning algorithm that optimizes review scheduling:
classDiagram
class SpacedRepetitionAlgorithm {
+calculateNextReview(card, performance)
+calculateDifficulty(card, performance)
+getReviewSchedule(deck)
}
class SM2Algorithm {
+calculateEaseFactor(card, performance)
+calculateInterval(card, performance)
+calculateNextReview(card, performance)
}
class AdaptiveAlgorithm {
+learningRate
+forgettingCurveModel
+calculateNextReview(card, performance)
+adjustDifficulty(card, performance)
}
class CardSchedule {
+Flashcard card
+DateTime nextReview
+Float easeFactor
+Int interval
+Int repetitions
+updateAfterReview(performance)
}
SpacedRepetitionAlgorithm <|-- SM2Algorithm
SpacedRepetitionAlgorithm <|-- AdaptiveAlgorithm
CardSchedule o-- SpacedRepetitionAlgorithm
Key classes:
- SpacedRepetitionAlgorithm: Base algorithm interface
- SM2Algorithm: Classic SuperMemo 2 implementation
- AdaptiveAlgorithm: Enhanced algorithm with machine learning
- CardSchedule: Tracks review history and scheduling
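For reference, the SM-2 update rule can be sketched as a pure function. This is a generic illustration of the published SuperMemo 2 formula, not FlashGenie's SM2Algorithm. The Schedule type and sm2_review name are hypothetical. Two choices here are assumptions and reflect a common variant: the ease factor is updated on every review, and the new interval is computed from the previous ease factor.

```python
from dataclasses import dataclass

MIN_EASE_FACTOR = 1.3  # lower bound on the ease factor in SM-2


@dataclass(frozen=True)
class Schedule:
    ease_factor: float = 2.5
    interval: int = 0      # days until the next review
    repetitions: int = 0   # consecutive successful reviews


def sm2_review(schedule: Schedule, quality: int) -> Schedule:
    """Return the schedule after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality >= 3:
        # Successful recall: intervals go 1 day, 6 days, then grow by the ease factor.
        if schedule.repetitions == 0:
            interval = 1
        elif schedule.repetitions == 1:
            interval = 6
        else:
            interval = round(schedule.interval * schedule.ease_factor)
        repetitions = schedule.repetitions + 1
    else:
        # Failed recall: restart the repetition sequence.
        interval = 1
        repetitions = 0

    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    ease_factor = max(MIN_EASE_FACTOR, schedule.ease_factor + delta)
    return Schedule(ease_factor, interval, repetitions)


after_success = sm2_review(Schedule(2.5, 6, 2), quality=5)
after_failure = sm2_review(Schedule(2.5, 6, 2), quality=2)
```

Under this rule the interval only starts to grow from the third successful review onward; the first two successes always yield 1 and 6 days.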
Quiz Engine
Manages study sessions and user interactions:
classDiagram
class QuizEngine {
+startSession(deck, options)
+endSession(session)
+getNextCard(session)
+recordAnswer(session, card, performance)
}
class QuizSession {
+UUID id
+Deck deck
+DateTime startTime
+DateTime endTime
+List~QuizResponse~ responses
+QuizOptions options
+getProgress()
+getStatistics()
}
class QuizResponse {
+Flashcard card
+Int performance
+Float confidence
+Int timeToAnswer
+DateTime timestamp
}
class QuizOptions {
+String mode
+Int cardLimit
+List~String~ includeTags
+List~String~ excludeTags
+Boolean shuffleCards
}
QuizEngine -- QuizSession
QuizSession "1" *-- "many" QuizResponse
QuizSession -- QuizOptions
Key classes:
- QuizEngine: Core session management
- QuizSession: Represents a study session
- QuizResponse: Individual card response data
- QuizOptions: Configuration for quiz behavior
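One way the QuizOptions fields could drive card selection is sketched below. The Card and Options types and the select_cards function are simplified, hypothetical stand-ins. The filtering semantics are also assumptions: a card matches if it carries any included tag and none of the excluded ones.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Card:
    question: str
    tags: List[str] = field(default_factory=list)


@dataclass
class Options:
    card_limit: Optional[int] = None  # None means no limit
    include_tags: List[str] = field(default_factory=list)
    exclude_tags: List[str] = field(default_factory=list)
    shuffle_cards: bool = False


def select_cards(cards: List[Card], options: Options,
                 rng: Optional[random.Random] = None) -> List[Card]:
    include = set(options.include_tags)
    exclude = set(options.exclude_tags)
    selected = [
        card for card in cards
        # An empty include list means "all tags are allowed".
        if (not include or include & set(card.tags))
        and not exclude & set(card.tags)
    ]
    if options.shuffle_cards:
        (rng or random).shuffle(selected)
    # Slicing with None keeps every card.
    return selected[: options.card_limit]


cards = [
    Card("Capital of France?", ["geography"]),
    Card("Capital of Bhutan?", ["geography", "hard"]),
    Card("2 + 2?", ["math"]),
]
easy_geography = select_cards(
    cards, Options(include_tags=["geography"], exclude_tags=["hard"])
)
```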
Analytics Engine
Processes learning data to provide insights:
classDiagram
class AnalyticsEngine {
+calculatePerformanceMetrics(user, timeframe)
+predictMasteryTimeline(user, deck)
+identifyStrengthsWeaknesses(user)
+generateRecommendations(user)
}
class PerformanceTracker {
+trackSession(session)
+calculateRetentionRate(user, deck)
+calculateLearningVelocity(user, deck)
+identifyDifficultCards(user, deck)
}
class LearningCurve {
+User user
+Deck deck
+List~DataPoint~ dataPoints
+predictFuturePerformance()
+calculateOptimalReviewSchedule()
}
class Recommendation {
+String type
+String description
+Float confidence
+List~Object~ relatedEntities
+DateTime generated
}
AnalyticsEngine -- PerformanceTracker
AnalyticsEngine -- LearningCurve
AnalyticsEngine -- Recommendation
Key classes:
- AnalyticsEngine: Core analytics processing
- PerformanceTracker: Tracks and analyzes user performance
- LearningCurve: Models learning progress over time
- Recommendation: Actionable insights for users
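As a rough illustration of what a metric such as calculateRetentionRate might compute, the sketch below treats retention as the share of responses graded at or above a passing score. Both the function and the threshold of 3 are assumptions; the source does not define the metric.

```python
from typing import Iterable, Optional

PASSING_PERFORMANCE = 3  # assumption: grades of 3 or higher count as recalled


def retention_rate(performances: Iterable[int]) -> Optional[float]:
    """Fraction of responses that were recalled, or None if there is no data."""
    scores = list(performances)
    if not scores:
        return None
    recalled = sum(1 for score in scores if score >= PASSING_PERFORMANCE)
    return recalled / len(scores)


rate = retention_rate([5, 4, 2, 3])  # three of four responses pass
```

Returning None for an empty history avoids reporting a misleading 0% for a deck that has never been studied.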
Data Flow
Study Session Flow
sequenceDiagram
participant User
participant CLI as CLI Interface
participant QE as Quiz Engine
participant SRS as Spaced Repetition System
participant DAL as Data Access Layer
User->>CLI: Start quiz session
CLI->>QE: startSession(deck, options)
QE->>SRS: getReviewSchedule(deck)
SRS->>DAL: loadCardHistory(deck)
DAL-->>SRS: cardHistory
SRS-->>QE: scheduledCards
QE-->>CLI: session
CLI-->>User: First card
loop For each card
User->>CLI: Submit answer
CLI->>QE: recordAnswer(session, card, performance)
QE->>SRS: calculateNextReview(card, performance)
SRS-->>QE: updatedSchedule
QE->>DAL: saveResponse(response)
QE->>QE: getNextCard(session)
QE-->>CLI: nextCard
CLI-->>User: Show next card
end
User->>CLI: End session
CLI->>QE: endSession(session)
QE->>DAL: saveSession(session)
QE->>DAL: updateCardSchedules(updatedCards)
QE-->>CLI: sessionSummary
CLI-->>User: Display session results
Data Synchronization Flow
sequenceDiagram
participant User
participant CLI as CLI Interface
participant Sync as Sync Manager
participant Local as Local Storage
participant Remote as Remote Storage
User->>CLI: Initiate sync
CLI->>Sync: startSync()
Sync->>Local: getLastSyncTimestamp()
Local-->>Sync: lastSync
Sync->>Local: getChangedSince(lastSync)
Local-->>Sync: localChanges
Sync->>Remote: authenticate()
Remote-->>Sync: authToken
Sync->>Remote: getChangedSince(lastSync)
Remote-->>Sync: remoteChanges
Sync->>Sync: resolveConflicts(localChanges, remoteChanges)
Sync->>Remote: pushChanges(resolvedChanges)
Remote-->>Sync: syncStatus
Sync->>Local: updateChanges(resolvedChanges)
Sync->>Local: updateLastSyncTimestamp()
Sync-->>CLI: syncResults
CLI-->>User: Display sync status
Design Patterns
FlashGenie implements several design patterns to ensure maintainability and extensibility:
Repository Pattern
Abstracts data access operations:
from abc import ABC, abstractmethod
from typing import List, Optional
from uuid import UUID

# Deck is the domain class described under Core Components.

# Abstract repository interface
class DeckRepository(ABC):
    @abstractmethod
    def get_by_id(self, deck_id: UUID) -> Optional[Deck]:
        pass

    @abstractmethod
    def save(self, deck: Deck) -> None:
        pass

    @abstractmethod
    def delete(self, deck_id: UUID) -> None:
        pass

    @abstractmethod
    def find_by_tags(self, tags: List[str]) -> List[Deck]:
        pass

# Concrete implementation for JSON storage
class JsonDeckRepository(DeckRepository):
    def __init__(self, file_path: str):
        self.file_path = file_path
        # Implementation details...

    def get_by_id(self, deck_id: UUID) -> Optional[Deck]:
        # Implementation details...
        pass

    # Other method implementations...
Factory Pattern
Creates objects without the caller specifying the exact class:
class AlgorithmFactory:
    @staticmethod
    def create_algorithm(algorithm_type: str, **kwargs) -> SpacedRepetitionAlgorithm:
        if algorithm_type == "sm2":
            return SM2Algorithm(**kwargs)
        elif algorithm_type == "adaptive":
            return AdaptiveAlgorithm(**kwargs)
        elif algorithm_type == "leitner":
            return LeitnerSystem(**kwargs)
        else:
            raise ValueError(f"Unknown algorithm type: {algorithm_type}")
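The plugin example later on this page calls register_algorithm and unregister_algorithm on the factory, which the static version above does not provide. A registry-based factory that supports those calls could look like the following sketch. The AlgorithmRegistry class and the FixedIntervalAlgorithm stand-in are hypothetical, not part of FlashGenie.

```python
from typing import Any, Callable, Dict


class AlgorithmRegistry:
    """Factory that looks creators up by name instead of using if/elif chains."""

    def __init__(self) -> None:
        self._creators: Dict[str, Callable[..., Any]] = {}

    def register_algorithm(self, name: str, creator: Callable[..., Any]) -> None:
        if name in self._creators:
            raise ValueError(f"Algorithm already registered: {name}")
        self._creators[name] = creator

    def unregister_algorithm(self, name: str) -> None:
        # Removing an unknown name is a no-op, so shutdown code stays simple.
        self._creators.pop(name, None)

    def create_algorithm(self, name: str, **kwargs: Any) -> Any:
        try:
            creator = self._creators[name]
        except KeyError:
            raise ValueError(f"Unknown algorithm type: {name}") from None
        return creator(**kwargs)


class FixedIntervalAlgorithm:
    """Stand-in algorithm used only to demonstrate registration."""

    def __init__(self, days: int = 1) -> None:
        self.days = days


registry = AlgorithmRegistry()
registry.register_algorithm("fixed", FixedIntervalAlgorithm)
algorithm = registry.create_algorithm("fixed", days=3)
```

With a registry, adding an algorithm does not require editing the factory itself, which matches the Open-closed principle listed earlier.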
Strategy Pattern
Enables algorithm selection at runtime:
from abc import ABC, abstractmethod

class DifficultyStrategy(ABC):
    @abstractmethod
    def calculate_difficulty(self, card: Flashcard, performance: int) -> float:
        pass

class StandardDifficulty(DifficultyStrategy):
    def calculate_difficulty(self, card: Flashcard, performance: int) -> float:
        # Standard implementation
        pass

class AdaptiveDifficulty(DifficultyStrategy):
    def calculate_difficulty(self, card: Flashcard, performance: int) -> float:
        # Adaptive implementation with ML
        pass

# Usage: the strategy is chosen from configuration at runtime
difficulty_strategy = config.get_difficulty_strategy()
quiz_engine = QuizEngine(difficulty_strategy=difficulty_strategy)
Observer Pattern
Enables event-based communication:
from collections import defaultdict
from typing import Any, Callable

class EventManager:
    def __init__(self):
        self.listeners = defaultdict(list)

    def subscribe(self, event_type: str, listener: Callable):
        self.listeners[event_type].append(listener)

    def unsubscribe(self, event_type: str, listener: Callable):
        # Ignore listeners that were never subscribed
        if listener in self.listeners[event_type]:
            self.listeners[event_type].remove(listener)

    def notify(self, event_type: str, data: Any):
        # Iterate over a copy so listeners may unsubscribe while being notified
        for listener in list(self.listeners[event_type]):
            listener(data)

# Usage
event_manager = EventManager()
event_manager.subscribe("session_completed", analytics_engine.process_session)
event_manager.subscribe("session_completed", achievement_manager.check_achievements)
Extension System
FlashGenie's plugin architecture enables third-party extensions:
classDiagram
class PluginManager {
+List~Plugin~ plugins
+loadPlugins()
+registerPlugin(plugin)
+unregisterPlugin(pluginId)
+getPluginByName(name)
+getHookImplementations(hookName)
}
class Plugin {
+String id
+String name
+String version
+String description
+Dict hooks
+initialize()
+shutdown()
}
class Hook {
+String name
+List~Callable~ implementations
+register(implementation)
+unregister(implementation)
+execute(*args, **kwargs)
}
PluginManager "1" *-- "many" Plugin
PluginManager "1" *-- "many" Hook
Plugin Development
A plugin extends FlashGenie by registering its services and settings in initialize() and releasing them in shutdown():
# Example plugin implementation
class CustomAlgorithmPlugin(Plugin):
    def __init__(self):
        super().__init__(
            id="custom_algorithm",
            name="Custom Learning Algorithm",
            version="1.0.0",
            description="Implements a custom spaced repetition algorithm"
        )

    def initialize(self):
        # Register algorithm with the system
        algorithm_factory = self.get_service("algorithm_factory")
        algorithm_factory.register_algorithm(
            "custom",
            CustomAlgorithm
        )
        # Register settings
        settings_manager = self.get_service("settings_manager")
        settings_manager.register_settings(
            "custom_algorithm",
            {
                "learning_rate": 0.3,
                "forgetting_threshold": 0.7
            }
        )

    def shutdown(self):
        # Clean up resources
        algorithm_factory = self.get_service("algorithm_factory")
        algorithm_factory.unregister_algorithm("custom")
Extension Points
FlashGenie provides several extension points:
- Algorithms: Custom spaced repetition algorithms
- Importers/Exporters: Support for additional file formats
- Visualizations: Custom data visualization tools
- Commands: New CLI commands
- Analytics: Custom analytics processors
- Storage Backends: Alternative storage mechanisms
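The Hook class from the diagram above could be sketched as follows to show how an extension point collects and runs implementations. This is a minimal illustration. Returning a list of results from execute, and the "export_deck" hook name, are assumptions.

```python
from typing import Any, Callable, List


class Hook:
    """A named extension point holding the callables registered by plugins."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.implementations: List[Callable[..., Any]] = []

    def register(self, implementation: Callable[..., Any]) -> None:
        # Registering the same callable twice has no extra effect.
        if implementation not in self.implementations:
            self.implementations.append(implementation)

    def unregister(self, implementation: Callable[..., Any]) -> None:
        if implementation in self.implementations:
            self.implementations.remove(implementation)

    def execute(self, *args: Any, **kwargs: Any) -> List[Any]:
        # Call every implementation in registration order and collect results.
        return [impl(*args, **kwargs) for impl in list(self.implementations)]


def export_as_csv(cards: List[tuple]) -> str:
    return "\n".join(f"{question},{answer}" for question, answer in cards)


def export_as_text(cards: List[tuple]) -> str:
    return "\n".join(f"{question} -> {answer}" for question, answer in cards)


export_hook = Hook("export_deck")  # hypothetical hook name
export_hook.register(export_as_csv)
export_hook.register(export_as_text)
outputs = export_hook.execute([("Capital of France?", "Paris")])
```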
Data Model
Core Entities
erDiagram
USER {
uuid id
string username
string email
datetime created_at
datetime last_login
}
DECK {
uuid id
string name
string description
datetime created_at
datetime updated_at
uuid user_id
}
FLASHCARD {
uuid id
string question
string answer
json metadata
datetime created_at
datetime updated_at
uuid deck_id
}
TAG {
string name
string description
string parent_tag
}
CARD_TAG {
uuid card_id
string tag_name
}
REVIEW_LOG {
uuid id
uuid card_id
uuid user_id
int performance
float confidence
int time_to_answer
datetime timestamp
}
CARD_SCHEDULE {
uuid card_id
uuid user_id
datetime next_review
float ease_factor
int interval
int repetitions
}
USER ||--o{ DECK : creates
DECK ||--o{ FLASHCARD : contains
FLASHCARD ||--o{ CARD_TAG : has
CARD_TAG }o--|| TAG : references
USER ||--o{ REVIEW_LOG : generates
FLASHCARD ||--o{ REVIEW_LOG : subject_of
USER ||--o{ CARD_SCHEDULE : tracks
FLASHCARD ||--o{ CARD_SCHEDULE : scheduled_as
Data Storage
FlashGenie supports multiple storage backends:
- JSON Files (Default)
  - Simple file-based storage
  - One file per entity type
  - Good for individual users
- SQLite (Local Database)
  - Relational storage
  - Single file database
  - Improved query performance
  - Better for larger collections
- Future: Cloud Storage
  - Synchronized storage
  - Multi-device support
  - Backup and recovery
  - Sharing capabilities
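To make the SQLite option concrete, the sketch below creates two of the entities from the diagram above in an in-memory database using Python's built-in sqlite3 module. The column types, the index, and the choice to store UUIDs as text are assumptions; the project's actual schema may differ.

```python
import sqlite3
import uuid

# Assumed schema derived from the DECK and FLASHCARD entities.
SCHEMA = """
CREATE TABLE deck (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    description TEXT
);
CREATE TABLE flashcard (
    id TEXT PRIMARY KEY,
    question TEXT NOT NULL,
    answer TEXT NOT NULL,
    deck_id TEXT NOT NULL REFERENCES deck(id)
);
CREATE INDEX idx_flashcard_deck ON flashcard(deck_id);
"""

connection = sqlite3.connect(":memory:")
connection.executescript(SCHEMA)

deck_id = str(uuid.uuid4())
connection.execute(
    "INSERT INTO deck (id, name, description) VALUES (?, ?, ?)",
    (deck_id, "Geography", "World capitals"),
)
connection.execute(
    "INSERT INTO flashcard (id, question, answer, deck_id) VALUES (?, ?, ?, ?)",
    (str(uuid.uuid4()), "Capital of France?", "Paris", deck_id),
)
connection.commit()

# The index on deck_id keeps per-deck lookups fast as collections grow.
rows = connection.execute(
    "SELECT question, answer FROM flashcard WHERE deck_id = ?", (deck_id,)
).fetchall()
```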
Synchronization
Conflict Resolution
FlashGenie's synchronization system collects local and remote changes made since the last sync, resolves conflicts between them, and applies the result:
class SyncManager:
    def __init__(self, local_repo, remote_repo):
        self.local_repo = local_repo
        self.remote_repo = remote_repo

    def synchronize(self):
        # Get changes since last sync
        last_sync = self.get_last_sync_timestamp()
        local_changes = self.local_repo.get_changes_since(last_sync)
        remote_changes = self.remote_repo.get_changes_since(last_sync)
        # Detect and resolve conflicts
        conflicts = self.detect_conflicts(local_changes, remote_changes)
        resolved_changes = self.resolve_conflicts(conflicts)
        # Apply resolved changes
        self.apply_changes(resolved_changes)
        # Update sync timestamp
        self.update_last_sync_timestamp()

    def detect_conflicts(self, local_changes, remote_changes):
        # Implementation details...
        pass

    def resolve_conflicts(self, conflicts):
        # Implementation details...
        pass
Conflict resolution strategies:
- Last-Write-Wins: Most recent change takes precedence
- Merge: Combine non-conflicting changes
- Manual Resolution: User decides for critical conflicts
- Field-Level Resolution: Apply different strategies per field
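The first two strategies above can be combined in a few lines, as the sketch below shows: changes that exist on only one side are merged as they are, and entities changed on both sides are resolved by Last-Write-Wins. The Change type, the per-entity dictionaries, and the merge_changes function are hypothetical, and ties are resolved in favor of the local change as an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Change:
    entity_id: str
    modified_at: float  # seconds since the epoch
    payload: dict


def merge_changes(local: Dict[str, Change],
                  remote: Dict[str, Change]) -> Dict[str, Change]:
    resolved: Dict[str, Change] = {}
    for entity_id in local.keys() | remote.keys():
        local_change = local.get(entity_id)
        remote_change = remote.get(entity_id)
        if local_change is None:
            resolved[entity_id] = remote_change   # changed only remotely
        elif remote_change is None:
            resolved[entity_id] = local_change    # changed only locally
        else:
            # Conflict: the most recent write wins; max() keeps local on a tie.
            resolved[entity_id] = max(
                local_change, remote_change, key=lambda change: change.modified_at
            )
    return resolved


local_changes = {
    "card-1": Change("card-1", 100.0, {"answer": "Paris"}),
    "card-2": Change("card-2", 300.0, {"answer": "Rome"}),
}
remote_changes = {
    "card-2": Change("card-2", 200.0, {"answer": "Roma"}),
    "card-3": Change("card-3", 150.0, {"answer": "Oslo"}),
}
resolved = merge_changes(local_changes, remote_changes)
```

Last-Write-Wins depends on device clocks being reasonably accurate, which is one reason to reserve manual resolution for critical conflicts.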
Testing Strategy
FlashGenie employs a comprehensive testing approach:
Unit Testing
# Example unit test for Flashcard
def test_flashcard_creation():
    # Arrange
    question = "What is the capital of France?"
    answer = "Paris"
    # Act
    card = Flashcard(question=question, answer=answer)
    # Assert
    assert card.question == question
    assert card.answer == answer
    assert card.id is not None
    assert card.created_at is not None

# Example unit test for SpacedRepetitionAlgorithm
def test_sm2_algorithm_calculation():
    # Arrange
    algorithm = SM2Algorithm()
    card = Flashcard("Q", "A")
    card_schedule = CardSchedule(card_id=card.id, ease_factor=2.5, interval=1)
    performance = 5  # Perfect response
    # Act
    new_schedule = algorithm.calculate_next_review(card_schedule, performance)
    # Assert
    assert new_schedule.ease_factor > card_schedule.ease_factor
    assert new_schedule.interval > card_schedule.interval
Integration Testing
# Example integration test for quiz session
def test_complete_quiz_session():
    # Arrange
    deck = create_test_deck()
    quiz_engine = QuizEngine()
    # Act
    session = quiz_engine.start_session(deck)
    # Simulate answering all cards
    while not session.is_complete():
        card = quiz_engine.get_next_card(session)
        performance = 3  # Average performance
        quiz_engine.record_answer(session, card, performance)
    results = quiz_engine.end_session(session)
    # Assert
    assert session.is_complete()
    assert len(session.responses) == len(deck.cards)
    assert results.accuracy_rate is not None
    assert results.average_time is not None
Performance Testing
import time

# Example performance test
def test_large_deck_performance():
    # Arrange
    large_deck = create_large_test_deck(10000)  # 10,000 cards
    quiz_engine = QuizEngine()
    # Act (perf_counter is a monotonic clock intended for timing short intervals)
    start_time = time.perf_counter()
    session = quiz_engine.start_session(large_deck)
    initialization_time = time.perf_counter() - start_time
    start_time = time.perf_counter()
    card = quiz_engine.get_next_card(session)
    card_retrieval_time = time.perf_counter() - start_time
    # Assert
    assert initialization_time < 1.0  # Should initialize in under 1 second
    assert card_retrieval_time < 0.01  # Should retrieve a card in under 10 ms
Code Organization
Package Structure
flashgenie/
├── __init__.py
├── cli/                        # Command-line interface
│   ├── __init__.py
│   ├── commands/               # CLI command implementations
│   └── formatters/             # Output formatting
├── core/                       # Core domain logic
│   ├── __init__.py
│   ├── flashcard.py            # Flashcard entity
│   ├── deck.py                 # Deck management
│   ├── spaced_repetition.py    # SRS algorithms
│   ├── quiz_engine.py          # Quiz session management
│   ├── performance_tracker.py  # Learning analytics
│   ├── tag_manager.py          # Hierarchical tagging
│   └── smart_collections.py    # Dynamic card collections
├── data/                       # Data access layer
│   ├── __init__.py
│   ├── repositories/           # Repository implementations
│   ├── storage/                # Storage backends
│   └── serializers/            # Data serialization
├── plugins/                    # Plugin system
│   ├── __init__.py
│   ├── manager.py              # Plugin management
│   ├── hooks.py                # Extension points
│   └── builtin/                # Built-in plugins
├── utils/                      # Shared utilities
│   ├── __init__.py
│   ├── config.py               # Configuration management
│   ├── logging.py              # Logging utilities
│   └── validation.py           # Input validation
└── main.py                     # Application entry point
Coding Standards
FlashGenie follows strict coding standards:
- PEP 8: Python style guide
- Type Hints: All public methods include type annotations
- Docstrings: Google-style docstrings for all classes and methods
- Naming Conventions:
  - Classes: PascalCase
  - Functions/Methods: snake_case
  - Constants: UPPER_SNAKE_CASE
  - Private attributes: _leading_underscore
- Testing: Minimum 80% code coverage
Performance Considerations
Optimization Strategies
FlashGenie implements several performance optimizations:
- Lazy Loading: Load data only when needed
- Caching: Cache frequently accessed data
- Indexing: Optimize data retrieval with indexes
- Batch Processing: Process data in batches for efficiency
- Asynchronous Operations: Non-blocking I/O for UI responsiveness
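Lazy loading and caching often go together. The sketch below defers loading a deck's cards until they are first accessed and then reuses the result, using functools.cached_property from the standard library (Python 3.8+). The LazyDeck and CountingLoader classes are hypothetical illustrations, not FlashGenie code.

```python
from functools import cached_property
from typing import Callable, List


class LazyDeck:
    """Deck whose cards are loaded on first access and cached afterwards."""

    def __init__(self, name: str, loader: Callable[[], List[str]]) -> None:
        self.name = name
        self._loader = loader

    @cached_property
    def cards(self) -> List[str]:
        # Runs once; later accesses return the stored list.
        return self._loader()


class CountingLoader:
    """Stand-in for an expensive read from disk that counts its calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self) -> List[str]:
        self.calls += 1
        return ["Q1", "Q2"]


loader = CountingLoader()
deck = LazyDeck("demo", loader)
calls_before_access = loader.calls  # nothing has been loaded yet
first = deck.cards
second = deck.cards                 # served from the cache
```

A cache like this needs an invalidation step when the underlying file changes; deleting the cached attribute (del deck.cards) forces a reload on the next access.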
Scalability
Design considerations for handling large datasets:
- Pagination: Limit data loaded in memory
- Incremental Processing: Process large datasets incrementally
- Resource Management: Careful memory and file handle management
- Efficient Algorithms: O(n) or better time complexity for critical operations
- Data Partitioning: Split large datasets into manageable chunks
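Pagination and incremental processing can be sketched with a small generator that yields fixed-size pages from any iterable, so only one page is held in memory at a time. The paginate function is a hypothetical helper, not part of FlashGenie.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def paginate(items: Iterable[T], page_size: int) -> Iterator[List[T]]:
    """Yield successive pages of at most page_size items."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    iterator = iter(items)
    while True:
        page = list(islice(iterator, page_size))
        if not page:
            return
        yield page


pages = list(paginate(range(7), page_size=3))
```

Because the source is consumed lazily, the same helper works for a card list in memory, a file read line by line, or a database cursor.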
Security Considerations
Data Protection
- Encryption: Sensitive data encrypted at rest
- Secure Defaults: Secure configuration by default
- Input Validation: All user input validated and sanitized
- Error Handling: Secure error handling without information leakage
- Dependency Management: Regular updates for security patches
Privacy
- Data Minimization: Collect only necessary data
- Local-First: Prioritize local storage over cloud
- Transparency: Clear documentation of data usage
- User Control: Options to export or delete all data
- Anonymization: Analytics data anonymized by default
Future Architecture
Planned architectural enhancements:
- Microservices: Split monolith into specialized services
- GraphQL API: Flexible API for frontend integration
- Real-time Sync: WebSocket-based synchronization
- Machine Learning Pipeline: Advanced personalization
- Containerization: Docker-based deployment
Next: Explore the API Reference for detailed technical documentation.