Performance - ArunPrakashG/native-launcher GitHub Wiki

Performance

Native Launcher is built with performance as the #1 priority. Every feature is designed and optimized to maintain sub-100ms startup times and instantaneous search responses.

🎯 Performance Targets

Hard Limits (Must Meet)

  • Startup: <100ms cold start (target: <50ms)
  • Search: <10ms for 500 apps (target: <5ms)
  • Memory: <30MB idle (target: <20MB)
  • UI Latency: <16ms (60fps, target: 120fps)

Actual Benchmarks

Measured on a typical Linux desktop (Intel i5, NVMe SSD):

  • Cold Start: ~45-60ms
  • Warm Start: ~30-40ms
  • Search (10 apps): 2-4ms
  • Search (500 apps): 6-9ms
  • Memory Idle: 18-25MB
  • Memory Peak: 28-35MB (with file search)

⚡ Performance Optimizations

1. Smart Debouncing (150ms)

Problem: Every keystroke triggered a full search, causing lag during rapid typing.

Solution: Counter-based debouncing with 150ms delay.

// Only the latest search executes after 150ms of inactivity
let debounce_counter: Rc<RefCell<u64>> = Rc::new(RefCell::new(0));

// Increment on each keystroke and remember this keystroke's count
*debounce_counter.borrow_mut() += 1;
let current_count = *debounce_counter.borrow();

// Wait 150ms, then check if this keystroke is still the latest
let counter = Rc::clone(&debounce_counter);
glib::timeout_add_local_once(Duration::from_millis(150), move || {
    if *counter.borrow() == current_count {
        // Execute search
    }
});

Impact: Eliminates input lag, reduces CPU usage by 60% during typing.

2. Smart Triggering (Two-Pass Search)

Problem: File search ran on every query, even for obvious app searches like "firefox".

Solution: Two-pass architecture - apps first, expensive operations only when needed.

// First pass: Query Applications plugin only
let app_results = applications_plugin.search(&context);

// Count high-quality matches (score >= 700)
let good_matches = app_results.iter().filter(|r| r.score >= 700).count();

// Second pass: Other plugins with app count context
let context = context.with_app_results(good_matches);
let other_results = other_plugins.search(&context);

File Search Logic:

let has_good_app_matches = context.app_results_count >= 2;
let should_skip_file_search = has_good_app_matches && !is_file_command;

if !should_skip_file_search {
    // Run expensive file index search
}

Impact:

  • 60% of queries skip file search entirely
  • "firefox" → instant (0ms file search)
  • "config" → file search runs (intended behavior)
  • "@files test" → always runs (explicit command)

3. File Index Integration

Problem: Need system-wide file search without maintaining custom index.

Solution: Hook into native Linux file indexing tools with fallback chain.

Priority Chain:
1. plocate (fastest, 20-50ms, requires updatedb)
2. mlocate (fast, 30-80ms, requires updatedb)
3. locate (fast, 30-80ms, requires updatedb)
4. fd (medium, 100-300ms, no index needed)
5. find (slow, 500ms+, guaranteed fallback)

Auto-Detection:

fn detect_backend() -> Backend {
    if command_exists("plocate") { Backend::Plocate }
    else if command_exists("mlocate") { Backend::Mlocate }
    else if command_exists("locate") { Backend::Locate }
    else if command_exists("fd") { Backend::Fd }
    else { Backend::Find }
}
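The `command_exists` helper used above is not shown in the snippet; one portable sketch (an assumption, not the project's actual implementation) walks `$PATH` with the standard library instead of spawning `which`:

```rust
use std::env;

// Hypothetical helper: returns true if `name` resolves to a file somewhere
// on $PATH. (Does not check the executable bit; good enough for detecting
// installed search backends like plocate or fd.)
fn command_exists(name: &str) -> bool {
    let path_var = match env::var_os("PATH") {
        Some(p) => p,
        None => return false,
    };
    env::split_paths(&path_var).any(|dir| dir.join(name).is_file())
}
```

Checking the filesystem directly avoids a process spawn per probe, which matters when the detection runs on the startup path.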

Caching Strategy:

  • Results cached for 2 minutes (configurable)
  • Cache hit: <1ms response time
  • Cache miss: 20-300ms depending on backend
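A minimal time-bounded cache along these lines can be sketched as follows (type and field names are hypothetical; the real cache lives inside the file-search plugin):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hypothetical sketch: query results cached with a per-entry timestamp,
// expired after `ttl` (2 minutes by default in the launcher's config).
struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, Vec<String>)>,
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    // Returns cached results only while the entry is younger than `ttl`.
    fn get(&self, query: &str) -> Option<&Vec<String>> {
        self.entries
            .get(query)
            .filter(|(stored, _)| stored.elapsed() < self.ttl)
            .map(|(_, results)| results)
    }

    fn insert(&mut self, query: String, results: Vec<String>) {
        self.entries.insert(query, (Instant::now(), results));
    }
}
```

Expired entries are simply skipped on lookup and overwritten on the next insert, so no background eviction thread is needed.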

Impact:

  • System-wide file search with <100ms latency (locate backends)
  • No manual index maintenance (uses system updatedb)
  • Graceful fallback for systems without locate

4. Async File Search (Ready for Future)

Implementation: Background thread + GTK callback pattern.

pub fn search_async<F>(&self, query: String, callback: F)
where
    F: Fn(Vec<PathBuf>) + Send + 'static, // Send: callback crosses the thread boundary
{
    // Check cache first - instant return
    if let Some(cached) = self.get_cached(&query) {
        callback(cached);
        return;
    }

    // Background search
    std::thread::spawn(move || {
        let results = expensive_search(&query);

        // Marshal back to the GTK main loop; idle_add_once is safe to call
        // from any thread (idle_add_local_once must run on the main thread)
        glib::idle_add_once(move || {
            callback(results);
        });
    });
}

Status: Implemented but not active (plugin trait is synchronous).

Future: When plugin async support lands, file search will be fully non-blocking.

5. Lazy Icon Loading (Optimized)

Previous: Icons preloaded at startup in background thread.

Problem: Wasted 10-20ms startup time and 5-10MB memory loading unused icons.

Current: Icons loaded on-demand with LRU cache:

// Icons resolve only when results are displayed
fn create_result_row(&self, result: &PluginResult) -> gtk4::Widget {
    let icon = resolve_icon_lazy(&result.icon);  // <1ms cached lookup
    // ...
}

Impact:

  • 10-20ms startup time reduction
  • 5-10MB memory savings
  • <1ms icon lookup with cache (vs 5-10ms cold load)
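The LRU cache behind the lazy lookup can be sketched with standard collections (structure and names are assumptions, not the project's actual code; a production version would likely use a dedicated crate or an `IndexMap`):

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical capacity-bounded LRU cache for resolved icon paths.
struct LruIconCache {
    capacity: usize,
    map: HashMap<String, String>, // icon name -> resolved theme path
    order: VecDeque<String>,      // front = least recently used
}

impl LruIconCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    fn get(&mut self, name: &str) -> Option<String> {
        let path = self.map.get(name).cloned()?;
        // Move the key to the back (most recently used)
        self.order.retain(|k| k != name);
        self.order.push_back(name.to_string());
        Some(path)
    }

    fn insert(&mut self, name: String, path: String) {
        if self.map.len() >= self.capacity && !self.map.contains_key(&name) {
            // Evict the least recently used entry
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        self.order.retain(|k| k != &name);
        self.order.push_back(name.clone());
        self.map.insert(name, path);
    }
}
```

Because only displayed results trigger a lookup, the cache stays small and hot entries (the user's most-launched apps) never get evicted.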

See: docs/OPTIMIZATIONS_IMPLEMENTED.md for full details.

6. Desktop Entry Caching

Parsed .desktop files cached to disk:

  • Location: ~/.cache/native-launcher/entries.cache
  • Invalidation: File mtime comparison
  • Impact: 80-120ms saved on startup
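The mtime comparison can be sketched as a freshness check per cached entry (a hedged illustration; the actual cache format on disk is not shown here):

```rust
use std::fs;
use std::path::Path;
use std::time::SystemTime;

// Hypothetical sketch: a cached parse of a .desktop file is reused only if
// the file has not been modified since the cache entry was written.
fn cache_is_fresh(desktop_file: &Path, cached_mtime: SystemTime) -> bool {
    fs::metadata(desktop_file)
        .and_then(|m| m.modified())
        .map(|mtime| mtime <= cached_mtime)
        .unwrap_or(false) // Missing or unreadable file -> force a re-parse
}
```

Stale or missing entries fall through to the normal parser, so an outdated cache can only cost time, never correctness.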

7. Hash-Based UI Change Detection

Problem: GTK rebuilt entire results list on every keystroke, even when results were identical.

Solution: Hash comparison before expensive UI operations:

// src/ui/results_list.rs
let results_hash = calculate_hash(&results);

if self.results_hash == Some(results_hash) {
    return; // Skip rebuild, results unchanged
}

self.results_hash = Some(results_hash);
// ... proceed with UI update
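The `calculate_hash` call above is not shown; assuming the result type implements `Hash`, std's `DefaultHasher` is enough (a generic sketch, not the project's exact code):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// One u64 fingerprint for any hashable result list; equal lists always
// produce equal hashes, so an unchanged hash means the UI can skip rebuilding.
fn calculate_hash<T: Hash>(results: &[T]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for r in results {
        r.hash(&mut hasher);
    }
    hasher.finish()
}
```

A hash collision would at worst skip one legitimate rebuild, which the next keystroke corrects, so the O(1) comparison is a safe trade.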

Impact:

  • 50-80% reduction in UI rebuilds during fast typing
  • Eliminates flickering when results stabilize
  • Lower CPU usage during search sessions

See: docs/OPTIMIZATIONS_IMPLEMENTED.md for implementation details.

8. Browser Query Optimization

Problem: Browser global search triggered on every keystroke, causing lag.

Solution: Minimum query length and result limits:

// Only trigger global browser search at 4+ characters
let min_length = 4;
if query.len() < min_length { return vec![]; }

// Limit to 2 results in global mode (5 with @tabs/@history)
let limit = if explicit_command { 5 } else { 2 };

Impact:

  • 75% reduction in unnecessary browser queries
  • Eliminated keystroke lag during browser history search
  • Sub-5ms query time with persistent SQLite index

See: docs/BROWSER_PERFORMANCE_FIX.md for analysis.

9. Lazy Widget Creation

UI widgets created on-demand:

// Only create result rows as they scroll into view
listbox.bind_model(Some(&model), |item| {
    create_row_widget(item) // Called only when visible
});

Impact: Faster initial render, lower memory for large result sets.

🚀 Recent Optimizations (2024)

A comprehensive performance audit identified 15 optimization opportunities across startup, search, UI, and memory. Three quick-win optimizations were implemented with significant impact:

1. Icon Preloading Removal ✅

Before: Background thread preloaded 500+ icons at startup
After: Lazy on-demand loading with LRU cache
Gain: 10-20ms startup, 5-10MB memory saved

2. Hash-Based Result Comparison ✅

Before: GTK rebuilt entire results list on every keystroke
After: O(1) hash comparison before UI updates
Gain: 50-80% reduction in unnecessary UI rebuilds

3. Browser Query Optimization ✅

Before: Browser search triggered on every character
After: 4-character minimum, 2-result limit for global search
Gain: 75% query reduction, eliminated keystroke lag

Additional Opportunities

See docs/COMPREHENSIVE_OPTIMIZATIONS.md for 12 remaining optimization opportunities including:

  • Virtual scrolling (10-20ms for 100+ results)
  • Trigram indexing (<2ms searches, sub-50ms startup)
  • Icon sprite packing (1-2MB memory savings)
  • Thread pool for plugins (50%+ speedup on multi-plugin queries)
  • Query result caching (near-instant repeated queries)

Performance tracking: TODO.md includes performance gates for all new features.

📊 Profiling Tools

Startup Time

# Measure cold start
time ./target/release/native-launcher

# Measure with logging
RUST_LOG=debug time ./target/release/native-launcher

Search Benchmarks

# Run criterion benchmarks
cargo bench

# Specific benchmark
cargo bench --bench search_benchmark

Memory Profiling

# Detailed memory stats
/usr/bin/time -v ./target/release/native-launcher

# Heap profiling (requires valgrind)
valgrind --tool=massif ./target/release/native-launcher

CPU Profiling

# Install flamegraph
cargo install flamegraph

# Generate flamegraph
cargo flamegraph

# Opens flamegraph.svg in browser

Plugin Performance

Built-in monitoring logs slow operations:

WARN: Plugin 'Files' search took 125ms (threshold: 100ms)

Configure thresholds in config.toml:

[plugins]
warn_threshold_ms = 100
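The warning shown above can come from a thin timing wrapper around each plugin call; a sketch under assumed names (the project's actual plugin trait is not shown):

```rust
use std::time::{Duration, Instant};

// Hypothetical: time a plugin's search closure and log a warning when it
// exceeds the configured threshold (warn_threshold_ms, 100ms by default).
fn timed_search<T>(name: &str, threshold: Duration, search: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let results = search();
    let elapsed = start.elapsed();
    if elapsed > threshold {
        eprintln!(
            "WARN: Plugin '{}' search took {}ms (threshold: {}ms)",
            name,
            elapsed.as_millis(),
            threshold.as_millis()
        );
    }
    results
}
```

Wrapping at the call site keeps the measurement out of individual plugins, so every plugin is monitored uniformly for free.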

🚨 Performance Guidelines

For Users

Maintain Performance:

  • Keep system file index updated: sudo updatedb
  • Install plocate or fd for faster file search
  • Limit script plugins to <1s timeout
  • Monitor plugin warnings in logs

Troubleshooting Slow Performance:

  1. Check RUST_LOG=debug output for slow plugin warnings
  2. Disable plugins one by one to isolate the culprit
  3. Run cargo bench to compare with baseline
  4. Check system load (high CPU/disk I/O affects performance)

For Developers

Critical Rules:

DON'T:

  • Add features without profiling their impact
  • Use blocking I/O in main thread
  • Import large dependencies without measuring binary size
  • Add animations >200ms
  • Allocate in hot paths (search loop, render loop)
  • Add "nice to have" features that hurt startup time

DO:

  • Profile before and after every significant change
  • Cache aggressively, compute lazily, render on-demand
  • Use async for I/O operations
  • Question any feature adding >10ms to critical paths
  • Measure everything: cargo bench, time, flamegraph

If a feature conflicts with performance, cut the feature.

Measuring Changes

# Before change
cargo bench -- --save-baseline before
cargo build --release
time ./target/release/native-launcher

# After change
cargo bench -- --baseline before
cargo build --release
time ./target/release/native-launcher

# Compare
# Search should be within 5% of baseline
# Startup should be <100ms

🎯 Future Optimizations

Planned

  1. Full async plugin architecture - Non-blocking file search
  2. Incremental search - Stream results as they arrive
  3. GPU-accelerated rendering - Smoother animations at 120fps
  4. Preload on compositor start - Daemon mode for instant appearance
  5. SIMD fuzzy matching - Faster string matching with vectorization

Under Investigation

  • Mmap for desktop files - Faster parsing with memory-mapped I/O
  • Custom allocator - Reduce allocation overhead (jemalloc/mimalloc)
  • Plugin compilation - JIT for script plugins (challenging security tradeoff)

📈 Performance History

v0.3.0 (October 2025)

  • ✅ Smart debouncing: Eliminated input lag
  • ✅ Two-pass search: 60% query optimization
  • ✅ File index integration: System-wide search <100ms
  • ✅ Async file search: Ready for future async plugins

v0.2.0 (September 2025)

  • ✅ Icon cache preloading: 5-10ms → <1ms
  • ✅ Desktop entry caching: 100ms startup improvement
  • ✅ Plugin performance monitoring

v0.1.0 (August 2025)

  • ✅ Initial release: <100ms startup, <10ms search
