Architecture Data Flow - hiraishikentaro/rails-factorybot-jump GitHub Wiki

Architecture: Data Flow

Data Flow Overview

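Rails FactoryBot Jump processes data through several distinct flows: initialization, link generation, cache updates, and user navigation. Factory and trait definitions are parsed into in-memory caches, so link generation and navigation rely on cache lookups instead of rescanning factory files.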

graph TB
    subgraph "Data Sources"
        A[Factory Files]
        B[Test Files]
        C[Configuration]
    end

    subgraph "Processing Layer"
        D[File Scanner]
        E[Pattern Detector]
        F[Cache Builder]
    end

    subgraph "Data Storage"
        G[Factory Cache]
        H[Trait Cache]
        I[File URI Cache]
    end

    subgraph "Output Layer"
        J[Document Links]
        K[Navigation Commands]
        L[User Interface]
    end

    A --> D
    B --> E
    C --> D
    D --> F
    E --> J
    F --> G
    F --> H
    F --> I
    G --> J
    H --> J
    J --> K
    K --> L

Primary Data Flows

1. Extension Initialization Flow

Trigger: VSCode opens a Ruby file for the first time

sequenceDiagram
    participant User
    participant VSCode
    participant Extension
    participant Provider
    participant FileSystem
    participant Cache

    User->>VSCode: Open Ruby file
    VSCode->>Extension: activate()
    Extension->>Provider: new FactoryLinkProvider()
    Extension->>VSCode: registerDocumentLinkProvider()
    Extension->>FileSystem: createFileSystemWatcher()
    Note over Provider: Lazy initialization
    VSCode->>Provider: provideDocumentLinks()
    Provider->>Cache: check if initialized
    Cache-->>Provider: not initialized
    Provider->>FileSystem: getConfiguration()
    FileSystem-->>Provider: factoryPaths
    Provider->>FileSystem: findFiles(factoryPaths)
    FileSystem-->>Provider: factory file URIs
    Provider->>Cache: buildFactoryCache()
    Provider->>Cache: buildTraitCache()
    Cache-->>Provider: caches ready
    Provider-->>VSCode: DocumentLink[]

Data Transformations:

  1. Configuration → Factory file paths (glob patterns)
  2. File paths → File URIs (VSCode workspace API)
  3. File URIs → File contents (text reading)
  4. File contents → Factory definitions (regex parsing)
  5. Factory definitions → Cache entries (Map structures)

Source: src/providers/factoryLinkProvider.ts

2. Factory Cache Building Flow

Input: Factory file URIs
Output: Factory and trait caches

graph LR
    A[Factory File URI] --> B[Read File Content]
    B --> C[Extract Factory Definitions]
    C --> D[Parse Factory Names]
    D --> E[Calculate Line Numbers]
    E --> F[Store in Factory Cache]

    B --> G[Extract Trait Definitions]
    G --> H[Parse Trait Names]
    H --> I[Associate with Factory]
    I --> J[Store in Trait Cache]

Factory Definition Parsing:

// Regex pattern for factory detection
const factoryPattern = /factory\s+:([a-zA-Z0-9_]+)\b/g
const traitPattern = /trait\s+:([a-zA-Z0-9_]+)\s+do/g

// Data transformation
factoryMatch → {
  name: string,           // Factory name (e.g., "user")
  uri: vscode.Uri,        // File location
  lineNumber: number      // Line in file
}

traitMatch → {
  name: string,           // Trait name (e.g., "admin")
  factory: string,        // Parent factory (e.g., "user")
  uri: vscode.Uri,        // File location
  lineNumber: number      // Line in file
}
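
The parsing step can be sketched as a pure function over file text. This is a minimal illustration, not the extension's code: `parseFactoryText` and `lineAt` are hypothetical names, `uri` is a plain string standing in for `vscode.Uri`, the two patterns are merged into one (with a leading `\b` added) so matches arrive in file order, and each trait is assumed to belong to the nearest preceding factory.

```typescript
// Stand-in shapes: `uri` is a plain string here, where the extension uses vscode.Uri.
interface FactoryEntry { name: string; uri: string; lineNumber: number }
interface TraitEntry extends FactoryEntry { factory: string }

// Convert a character offset into a 0-based line number.
function lineAt(text: string, offset: number): number {
  return text.slice(0, offset).split("\n").length - 1;
}

// Hypothetical helper: scan one factory file and return its definitions.
function parseFactoryText(uri: string, text: string): { factories: FactoryEntry[]; traits: TraitEntry[] } {
  const factories: FactoryEntry[] = [];
  const traits: TraitEntry[] = [];
  // Combined form of the two patterns above, so matches arrive in file order.
  const pattern = /\b(?:factory\s+:([a-zA-Z0-9_]+)\b|trait\s+:([a-zA-Z0-9_]+)\s+do)/g;
  let parent: string | undefined;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    const lineNumber = lineAt(text, m.index);
    const factoryName = m[1];
    const traitName = m[2];
    if (factoryName !== undefined) {
      parent = factoryName;
      factories.push({ name: factoryName, uri, lineNumber });
    } else if (traitName !== undefined && parent !== undefined) {
      // Assumption: a trait belongs to the nearest preceding factory.
      traits.push({ name: traitName, factory: parent, uri, lineNumber });
    }
  }
  return { factories, traits };
}
```

Because the sketch ignores `end` keywords, a trait defined after a nested factory closes would be attributed to the nested factory rather than the outer one.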

3. Link Generation Flow

Trigger: User hovers over factory call in test file

sequenceDiagram
    participant User
    participant VSCode
    participant Provider
    participant PatternDetector
    participant Cache

    User->>VSCode: Hover over factory call
    VSCode->>Provider: provideDocumentLinks(document)
    Provider->>PatternDetector: findFactoryCalls(text)
    PatternDetector->>PatternDetector: Apply regex patterns
    PatternDetector-->>Provider: factory call matches

    loop For each factory call
        Provider->>Cache: lookup(factoryName)
        Cache-->>Provider: factory definition
        Provider->>Provider: createDocumentLink()
    end

    Provider-->>VSCode: DocumentLink[]
    VSCode-->>User: Show clickable links

Pattern Detection Process:

// Factory call detection regex
const factoryCallPattern = /(?:create|create_list|build|build_list|build_stubbed|build_stubbed_list)\s*(?:\(\s*)?((:[a-zA-Z0-9_]+)(?:\s*,\s*(:[a-zA-Z0-9_]+))*)/g

// Data extraction
textMatch → {
  factoryName: string,    // e.g., "user"
  traits: string[],       // e.g., ["admin", "verified"]
  startPos: number,       // Character position in document
  endPos: number          // End character position
}

// Cache lookup
factoryName → {
  uri: vscode.Uri,        // Target file
  lineNumber: number      // Target line
}

// Link generation
cacheEntry → vscode.DocumentLink {
  range: vscode.Range,    // Clickable text range
  target: vscode.Uri      // Navigation command URI
}
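
As a rough sketch of the detection step, the function below applies the factory call pattern above to plain text and returns character offsets for the factory symbol. `findFactoryCalls` mirrors the name in the sequence diagram, but its body is an assumption, and converting offsets into a `vscode.Range` is left out so the example runs without the VSCode API.

```typescript
interface FactoryCallMatch {
  factoryName: string; // without the leading ':'
  traits: string[];    // without the leading ':'
  startPos: number;    // offset of the factory symbol, including ':'
  endPos: number;      // offset just past the factory symbol
}

// Factory call detection regex, as given above.
const factoryCallPattern = /(?:create|create_list|build|build_list|build_stubbed|build_stubbed_list)\s*(?:\(\s*)?((:[a-zA-Z0-9_]+)(?:\s*,\s*(:[a-zA-Z0-9_]+))*)/g;

function findFactoryCalls(text: string): FactoryCallMatch[] {
  const results: FactoryCallMatch[] = [];
  // A fresh RegExp per call avoids sharing lastIndex state between calls.
  const pattern = new RegExp(factoryCallPattern.source, "g");
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    // Group 1 holds the whole symbol list, e.g. ":user, :admin".
    const symbols = m[1].split(/\s*,\s*/).map((s) => s.slice(1));
    const [factoryName, ...traits] = symbols;
    // Group 1 is the tail of the match, so its start can be derived from the lengths.
    const startPos = m.index + m[0].length - m[1].length;
    results.push({ factoryName, traits, startPos, endPos: startPos + 1 + factoryName.length });
  }
  return results;
}
```

Only symbols that directly follow the factory name are treated as traits, so in a call such as `create_list(:user, 3, :admin)` the trailing `:admin` is not captured by this pattern.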

4. File Change Detection Flow

Trigger: Factory file is created, modified, or deleted

graph TB
    A[File System Change] --> B[File Watcher Event]
    B --> C{Event Type}
    C -->|Create| D[Add to Cache]
    C -->|Modify| E[Update Cache Entry]
    C -->|Delete| F[Remove from Cache]
    D --> G[Rebuild Affected Caches]
    E --> G
    F --> G
    G --> H[Invalidate Document Links]
    H --> I[VSCode Refreshes Links]

Cache Update Process:

// File change event
FileSystemEvent → {
  type: 'create' | 'change' | 'delete',
  uri: vscode.Uri
}

// Cache invalidation strategy
if (event.type === 'delete') {
  // Remove all entries for this file
  removeFileFromCaches(event.uri)
} else {
  // Re-parse and update entries for this file
  reparseFactoryFile(event.uri)
}

Source: src/extension.ts#L28-L41
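
The invalidation strategy above could look roughly like the following. This is a sketch under stated assumptions rather than the code in `src/extension.ts`: `factoryIndex`, `removeFileFromCache`, and `applyFileChange` are hypothetical names, URIs are plain strings, and file reading is injected as a callback so the example runs outside VSCode.

```typescript
interface CachedDefinition { uri: string; lineNumber: number }
type FileChange = { type: "create" | "change" | "delete"; uri: string };

// Hypothetical cache: factory name -> definition location.
const factoryIndex = new Map<string, CachedDefinition>();

// Drop every entry that was parsed from the given file.
function removeFileFromCache(uri: string): void {
  factoryIndex.forEach((def, key) => {
    if (def.uri === uri) factoryIndex.delete(key);
  });
}

function applyFileChange(change: FileChange, readFile: (uri: string) => string): void {
  // Stale entries are removed first, so factories renamed or deleted
  // inside a changed file do not linger in the cache.
  removeFileFromCache(change.uri);
  if (change.type === "delete") return;

  const text = readFile(change.uri);
  const pattern = /factory\s+:([a-zA-Z0-9_]+)\b/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    const lineNumber = text.slice(0, m.index).split("\n").length - 1;
    factoryIndex.set(m[1], { uri: change.uri, lineNumber });
  }
}
```

Handling `create` and `change` through the same path keeps the logic small: both amount to "forget what this file contributed, then parse it again".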

5. Navigation Command Flow

Trigger: User clicks on factory link

sequenceDiagram
    participant User
    participant VSCode
    participant Extension
    participant FileSystem

    User->>VSCode: Click factory link
    VSCode->>Extension: executeCommand("rails-factorybot-jump.gotoLine", args)
    Extension->>FileSystem: openTextDocument(uri)
    FileSystem-->>Extension: TextDocument
    Extension->>VSCode: showTextDocument(document)
    VSCode-->>Extension: TextEditor
    Extension->>VSCode: setSelection(lineNumber)
    Extension->>VSCode: revealRange(lineNumber)
    VSCode-->>User: Navigate to factory definition

Navigation Data Flow:

// Click event data
DocumentLink.target → vscode.Uri {
  scheme: "command",
  path: "rails-factorybot-jump.gotoLine",
  query: JSON.stringify({
    uri: string,      // Target file URI
    lineNumber: number // Target line number
  })
}

// Command execution
commandArgs → {
  uri: string,        // Target file URI (same value as in the link query)
  lineNumber: number  // 0-based line number
}

// VSCode operations
uri → vscode.TextDocument → vscode.TextEditor → vscode.Selection

Source: src/extension.ts#L14-L26
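
To make the link target format concrete, here is a small sketch that builds a command URI string and decodes it again. The helper names are hypothetical, and wrapping the JSON in `encodeURIComponent` is an assumption about how the query is embedded; in the extension the resulting string would be turned into a `vscode.Uri` and the decoding is done by VSCode itself before the command handler runs.

```typescript
interface GotoLineArgs {
  uri: string;        // Target file URI
  lineNumber: number; // 0-based line number
}

const GOTO_LINE_COMMAND = "rails-factorybot-jump.gotoLine";

// Build the string form of a command URI carrying the navigation arguments.
function buildCommandUri(args: GotoLineArgs): string {
  // The JSON payload is percent-encoded so that quotes, braces, and slashes
  // survive inside the query part of the URI.
  return `command:${GOTO_LINE_COMMAND}?${encodeURIComponent(JSON.stringify(args))}`;
}

// Reverse of buildCommandUri, shown only to illustrate the round trip.
function parseCommandUri(commandUri: string): GotoLineArgs {
  const query = commandUri.slice(commandUri.indexOf("?") + 1);
  return JSON.parse(decodeURIComponent(query)) as GotoLineArgs;
}
```

Round-tripping `{ uri, lineNumber }` through these two helpers returns the same values, which is the property the command handler depends on.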

Data Structures

1. Cache Data Structures

Factory Cache:

Map<string, FactoryDefinition>

interface FactoryDefinition {
  uri: vscode.Uri      // File containing factory
  lineNumber: number   // 0-based line number
}

// Example entries
{
  "user" => { uri: "file:///spec/factories/users.rb", lineNumber: 1 },
  "post" => { uri: "file:///spec/factories/posts.rb", lineNumber: 0 }
}

Trait Cache:

Map<string, TraitDefinition>

interface TraitDefinition {
  uri: vscode.Uri      // File containing trait
  lineNumber: number   // 0-based line number
  factory: string      // Parent factory name
}

// Example entries (key format: "factory:trait")
{
  "user:admin" => { uri: "file:///spec/factories/users.rb", lineNumber: 5, factory: "user" },
  "post:published" => { uri: "file:///spec/factories/posts.rb", lineNumber: 8, factory: "post" }
}
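
The `=>` notation above is shorthand for map entries. In actual TypeScript the same caches could be populated as follows; this is a sketch in which `uri` is a plain string rather than a `vscode.Uri`, and `traitKey` is a hypothetical helper for the `"factory:trait"` key format.

```typescript
interface FactoryDefinition { uri: string; lineNumber: number }
interface TraitDefinition extends FactoryDefinition { factory: string }

// The composite key keeps same-named traits of different factories apart.
function traitKey(factory: string, trait: string): string {
  return `${factory}:${trait}`;
}

const factoryCache = new Map<string, FactoryDefinition>([
  ["user", { uri: "file:///spec/factories/users.rb", lineNumber: 1 }],
  ["post", { uri: "file:///spec/factories/posts.rb", lineNumber: 0 }],
]);

const traitCache = new Map<string, TraitDefinition>();
traitCache.set(traitKey("user", "admin"), {
  uri: "file:///spec/factories/users.rb", lineNumber: 5, factory: "user",
});
traitCache.set(traitKey("post", "published"), {
  uri: "file:///spec/factories/posts.rb", lineNumber: 8, factory: "post",
});
```

With this key format, an `admin` trait on `user` and an `admin` trait on another factory occupy separate entries.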

2. Intermediate Data Structures

Pattern Match Results:

interface FactoryCallMatch {
  factoryName: string; // Factory name without ':'
  traits: string[]; // Trait names without ':'
  range: vscode.Range; // Text range in document
  fullMatch: string; // Complete matched text
}

File Parsing Results:

interface ParsedFactory {
  name: string; // Factory name
  lineNumber: number; // Definition line
  traits: ParsedTrait[]; // Traits within factory
}

interface ParsedTrait {
  name: string; // Trait name
  lineNumber: number; // Definition line
}

Performance Optimizations

1. Lazy Loading Strategy

graph LR
    A[Extension Activation] --> B{First Link Request?}
    B -->|Yes| C[Initialize Caches]
    B -->|No| D[Use Existing Caches]
    C --> E[Scan Factory Files]
    E --> F[Build Caches]
    F --> G[Generate Links]
    D --> G

Benefits:

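  • Faster extension activation, because no factory files are scanned at startup
  • No cache memory used until links are first requested
  • The scan cost is paid once, on the first link request, and reused afterwards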
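
One common way to implement this kind of lazy initialization is to memoize the promise returned by the cache builder, so that concurrent link requests share a single scan. The class below is a generic sketch of that idea; `LazyCache` is a hypothetical name and is not taken from the extension's source.

```typescript
class LazyCache<T> {
  private pending: Promise<T> | undefined;
  private readonly build: () => Promise<T>;

  constructor(build: () => Promise<T>) {
    this.build = build;
  }

  // The first caller starts the build; later callers share the same promise.
  get(): Promise<T> {
    if (this.pending === undefined) {
      this.pending = this.build();
    }
    return this.pending;
  }

  // Forget the cached value so that the next get() rebuilds it.
  reset(): void {
    this.pending = undefined;
  }
}
```

A provider could hold one `LazyCache` and `await cache.get()` at the top of each link request. Note that a rejected build promise stays memoized in this sketch, so a caller would need to invoke `reset()` after a failure to allow a retry.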

2. Incremental Cache Updates

graph TB
    A[File Change Event] --> B{Affected File?}
    B -->|Yes| C[Parse Only Changed File]
    B -->|No| D[No Action]
    C --> E[Update Specific Cache Entries]
    E --> F[Preserve Unaffected Entries]

Benefits:

  • Minimal re-processing on changes
  • Maintained cache consistency
  • Reduced CPU usage

3. Efficient Data Access Patterns

O(1) Cache Lookups:

// Direct hash map access
const factory = factoryCache.get(factoryName);
const trait = traitCache.get(`${factoryName}:${traitName}`);

Batch Operations:

// Process all factory calls in document at once
const allMatches = findAllFactoryCalls(document.getText());
const links = allMatches.map((match) => createDocumentLink(match));
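
Combining the two patterns, a batch resolver might look like the sketch below. It assumes the cache shapes described earlier, with `uri` as a plain string; `resolveTargets` and `LinkTarget` are hypothetical names, and the step of turning each target into a `vscode.DocumentLink` is omitted.

```typescript
interface Definition { uri: string; lineNumber: number }
interface CallMatch { factoryName: string; traits: string[] }
interface LinkTarget { label: string; uri: string; lineNumber: number }

function resolveTargets(
  calls: CallMatch[],
  factories: Map<string, Definition>,
  traitDefs: Map<string, Definition>
): LinkTarget[] {
  const targets: LinkTarget[] = [];
  for (const call of calls) {
    const factory = factories.get(call.factoryName);
    if (factory === undefined) continue; // unknown factory: emit no link
    targets.push({ label: call.factoryName, uri: factory.uri, lineNumber: factory.lineNumber });
    for (const traitName of call.traits) {
      // Trait keys follow the "factory:trait" format.
      const trait = traitDefs.get(`${call.factoryName}:${traitName}`);
      if (trait !== undefined) {
        targets.push({ label: traitName, uri: trait.uri, lineNumber: trait.lineNumber });
      }
    }
  }
  return targets;
}
```

Each call costs one map lookup for the factory plus one per trait, so the total work grows with the number of matches in the document rather than with the number of factory files.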

Error Handling in Data Flow

1. File System Error Handling

graph TB
    A[File Operation] --> B{Success?}
    B -->|Yes| C[Process Data]
    B -->|No| D[Log Error]
    D --> E[Continue with Available Data]
    C --> F[Update Cache]
    E --> G[Graceful Degradation]

2. Parse Error Recovery

// Robust parsing with error recovery
try {
  const factories = parseFactoryFile(fileContent);
  updateCache(factories);
} catch (error) {
  console.warn(`Failed to parse factory file: ${file.path}`, error);
  // Continue with other files
}

3. Cache Consistency

Consistency Guarantees:

  • Atomic cache updates (all or nothing)
  • Rollback on parse failures
  • Eventual consistency through file watching
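
One way to obtain the all-or-nothing behaviour listed above is to parse into a staging map and touch the live cache only after parsing succeeds. The function below is a sketch of that approach; `commitFile` is a hypothetical name, URIs are plain strings, and the parser is passed in as a callback.

```typescript
interface Def { uri: string; lineNumber: number }

// Replace one file's entries in the live cache, or leave the cache untouched.
function commitFile(
  cache: Map<string, Def>,
  uri: string,
  parse: () => Map<string, Def>
): boolean {
  let staged: Map<string, Def>;
  try {
    staged = parse(); // may throw; the live cache has not been modified yet
  } catch (error) {
    console.warn(`Failed to parse factory file: ${uri}`, error);
    return false; // nothing to roll back, since nothing was changed
  }
  // Parsing succeeded: swap the file's old entries for the staged ones.
  cache.forEach((def, key) => {
    if (def.uri === uri) cache.delete(key);
  });
  staged.forEach((def, key) => cache.set(key, def));
  return true;
}
```

With this ordering, a file that fails to parse keeps its previous cache entries, and the next file watcher event for that file gives the cache another chance to converge.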

Data Flow Monitoring

1. Performance Metrics

Key Metrics:

  • Cache build time
  • Link generation time
  • File parsing time
  • Memory usage

2. Debug Information

Data Flow Tracing:

  • File discovery events
  • Cache hit/miss ratios
  • Parse success/failure rates
  • Link generation counts
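
If these figures were to be collected in code, a small in-memory recorder would be enough. The class below is purely illustrative: `FlowMetrics` and its methods are hypothetical, and nothing here implies that the extension ships such instrumentation.

```typescript
class FlowMetrics {
  hits = 0;
  misses = 0;
  private timings = new Map<string, number[]>();

  // Count one cache lookup as a hit or a miss.
  recordLookup(found: boolean): void {
    if (found) this.hits++;
    else this.misses++;
  }

  // Run fn, record its wall-clock duration in milliseconds under label,
  // and return whatever fn returned.
  time<T>(label: string, fn: () => T): T {
    const start = Date.now();
    try {
      return fn();
    } finally {
      const samples = this.timings.get(label) ?? [];
      samples.push(Date.now() - start);
      this.timings.set(label, samples);
    }
  }

  // Fraction of lookups that were hits; 0 when nothing has been recorded.
  hitRatio(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }

  // Number of timing samples recorded under label.
  sampleCount(label: string): number {
    return (this.timings.get(label) ?? []).length;
  }
}
```

Wrapping the cache build in `time("cacheBuild", ...)` and calling `recordLookup` after each cache access would cover the cache build time and hit/miss ratio items from the lists above.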

Taken together, lazy initialization, incremental cache updates, and map-based lookups keep factory navigation responsive, while per-file error handling keeps links available when an individual factory file cannot be read or parsed.