
Performance Engineering: Zero-Copy IPC Optimization

How we reduced heap allocations by 80% in OmniMon's hot path using zero-copy IPC, Arc caching, and reference-based sorting.

The Cost of Cloning

OmniMon’s IPC bridge transfers system metrics from the Rust backend to the Svelte frontend every 2 seconds. With 500+ active processes, each transfer was cloning hundreds of structs, creating unnecessary heap pressure and allocator churn in the backend.

We set out to eliminate every redundant allocation.

Optimization 1: Cache hw.memsize at Startup

Before: OmniMon spawned a sysctl -n hw.memsize subprocess every 2 seconds to read total system memory.

After: A OnceLock<Option<u64>> caches the value on first call. Total system memory doesn’t change at runtime.

Impact: Eliminates 0.5 subprocess spawns per second.
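A minimal sketch of this caching pattern, using only the standard library. The function and static names here are hypothetical, not OmniMon's actual identifiers, and the lookup assumes a macOS-style `sysctl`. Note that a failed lookup is cached as `None` too, so the subprocess is never retried.

```rust
use std::process::Command;
use std::sync::OnceLock;

// Cached result of the first lookup. `None` means the lookup failed,
// for example on a platform without `sysctl hw.memsize`.
static TOTAL_MEMORY_BYTES: OnceLock<Option<u64>> = OnceLock::new();

/// Parses raw `sysctl -n hw.memsize` output: a decimal byte count
/// followed by a newline.
fn parse_memsize(raw: &str) -> Option<u64> {
    raw.trim().parse().ok()
}

/// Spawns the subprocess and reads its output (hypothetical helper).
fn query_memsize() -> Option<u64> {
    let output = Command::new("sysctl")
        .args(["-n", "hw.memsize"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    parse_memsize(&String::from_utf8_lossy(&output.stdout))
}

/// Returns total system memory in bytes. The subprocess runs only on
/// the first call; later calls read the cached value.
pub fn total_memory_bytes() -> Option<u64> {
    *TOTAL_MEMORY_BYTES.get_or_init(query_memsize)
}
```

Storing an `Option<u64>` rather than a bare `u64` lets callers distinguish "unknown" from a real value without panicking at startup.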

Optimization 2: Consume Watcher Cache

Before: Functions like top_processes_by_memory() created fresh System::new_all() instances — triggering full system scans.

After: All metric functions read from the watcher’s existing SystemState cache first, with a fallback for edge cases.

Impact: ~60% less CPU time spent in syscalls on the hot path.
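The cache-first read with a fallback might look roughly like the sketch below. Everything here is an assumption made for illustration: the shape of `SystemState`, the `publish_state` and `with_system_state` helpers, and the `FALLBACK_SCANS` counter. The real code presumably wraps a `sysinfo` scan where this sketch uses a placeholder.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for the watcher's cached snapshot.
#[derive(Default)]
pub struct SystemState {
    pub process_memory_bytes: Vec<u64>,
}

/// Counts how often the expensive fallback ran (illustration only).
pub static FALLBACK_SCANS: AtomicUsize = AtomicUsize::new(0);

/// Shared slot that the watcher refreshes on its own schedule.
fn watcher_cache() -> &'static Mutex<Option<SystemState>> {
    static CACHE: OnceLock<Mutex<Option<SystemState>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(None))
}

/// Called by the watcher after each refresh.
pub fn publish_state(state: SystemState) {
    *watcher_cache().lock().unwrap_or_else(|p| p.into_inner()) = Some(state);
}

/// Placeholder for the expensive path; a real one would scan the system.
fn full_scan() -> SystemState {
    FALLBACK_SCANS.fetch_add(1, Ordering::Relaxed);
    SystemState::default()
}

/// Runs `read` against the cached snapshot when one exists, and falls
/// back to a fresh scan only when the watcher has not published yet.
pub fn with_system_state<R>(read: impl FnOnce(&SystemState) -> R) -> R {
    let guard = watcher_cache().lock().unwrap_or_else(|p| p.into_inner());
    if let Some(state) = guard.as_ref() {
        return read(state);
    }
    drop(guard); // do not hold the lock during the slow scan
    read(&full_scan())
}

/// Example metric function built on the cache-first helper.
pub fn total_process_memory() -> u64 {
    with_system_state(|state| state.process_memory_bytes.iter().sum::<u64>())
}
```

Passing a closure keeps the snapshot borrowed rather than cloned, so metric functions pay only for what they compute.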

Optimization 3: Reference-Based Sorting

Before: The get_metrics() function cloned the entire 500+ process vector, sorted it, then truncated to the top 100.

After: get_metrics() builds a Vec<&CachedProcessInfo> of references, sorts that vector in place, then clones only the top 100 entries.

Impact: ~80% fewer heap allocations per IPC call (~400 struct clones eliminated).
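The idea can be sketched as follows. The fields of `CachedProcessInfo` and the choice of memory as the sort key are assumptions for illustration, since the post does not show the real struct. `top_by_memory` is a hypothetical helper name.

```rust
/// Hypothetical shape of a cached process entry.
#[derive(Clone, Debug, PartialEq)]
pub struct CachedProcessInfo {
    pub pid: u32,
    pub name: String,
    pub memory_bytes: u64,
}

/// Returns owned copies of the `limit` largest processes by memory.
/// Only pointer-sized references are moved during the sort; the
/// structs (and their `String` fields) are cloned after truncation.
pub fn top_by_memory(all: &[CachedProcessInfo], limit: usize) -> Vec<CachedProcessInfo> {
    // One allocation for the reference vector, no struct clones yet.
    let mut refs: Vec<&CachedProcessInfo> = all.iter().collect();

    // Descending by memory. The unstable sort does not preserve the
    // relative order of entries with equal memory usage.
    refs.sort_unstable_by(|a, b| b.memory_bytes.cmp(&a.memory_bytes));

    // Clone only the entries that will actually cross the IPC bridge.
    refs.into_iter().take(limit).cloned().collect()
}
```

With 500 cached processes and a limit of 100, this clones 100 structs instead of 500, which matches the roughly 80% reduction described above.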

Optimization 4: Arc Browser Tab Cache

Before: Each cache read of browser tabs cloned the entire Vec<BrowserTab> plus all String fields — an O(n) operation with ~60 string allocations.

After: The cache stores Arc<Vec<BrowserTab>>. Reads clone only the Arc pointer — a single atomic increment, O(1).

Impact: Eliminates ~60 String allocations per get_browser_tabs() call.
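A minimal sketch of an Arc-backed cache, assuming a `BrowserTab` with two `String` fields and an `RwLock` around the shared pointer. The `BrowserTabCache` type and its `replace` method are hypothetical. Only `get_browser_tabs` is named in the post, and its real signature may differ.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical tab record; the real struct likely has more fields.
#[derive(Clone, Debug)]
pub struct BrowserTab {
    pub title: String,
    pub url: String,
}

pub struct BrowserTabCache {
    tabs: RwLock<Arc<Vec<BrowserTab>>>,
}

impl BrowserTabCache {
    pub fn new() -> Self {
        Self { tabs: RwLock::new(Arc::new(Vec::new())) }
    }

    /// Swaps in a freshly collected tab list. Readers that still hold
    /// the previous `Arc` keep a valid, unchanged snapshot.
    pub fn replace(&self, fresh: Vec<BrowserTab>) {
        *self.tabs.write().unwrap_or_else(|p| p.into_inner()) = Arc::new(fresh);
    }

    /// Cloning the `Arc` bumps a reference count; neither the vector
    /// nor any `String` inside it is copied.
    pub fn get_browser_tabs(&self) -> Arc<Vec<BrowserTab>> {
        let guard = self.tabs.read().unwrap_or_else(|p| p.into_inner());
        Arc::clone(&*guard)
    }
}
```

The trade-off is that readers get an immutable snapshot. Updating the cache means building a new `Vec` and swapping the pointer, which suits data that is read far more often than it is refreshed.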

Optimization 5: Panic Safety

Before: An unreachable!() macro at the end of send_with_retry() could crash the entire app if the retry loop exited unexpectedly.

After: Replaced with an explicit Err("Unexpected exit from retry loop") — graceful error propagation instead of a production panic.
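To illustrate why the trailing branch is not truly unreachable, here is a sketch of a retry loop with a hypothetical signature (the real send_with_retry() is not shown in the post). With `max_attempts` set to 0 the loop body never runs, so control falls through to the final line.

```rust
/// Calls `send` up to `max_attempts` times and returns the first
/// success, or the error from the final attempt.
pub fn send_with_retry<T>(
    max_attempts: u32,
    mut send: impl FnMut(u32) -> Result<T, String>,
) -> Result<T, String> {
    for attempt in 1..=max_attempts {
        match send(attempt) {
            Ok(value) => return Ok(value),
            // Last attempt failed: report its error to the caller.
            Err(err) if attempt == max_attempts => return Err(err),
            // Earlier failure: try again (a real loop might back off here).
            Err(_) => continue,
        }
    }
    // An `unreachable!()` here would panic when `max_attempts` is 0.
    // Returning an error lets the caller decide how to recover.
    Err("Unexpected exit from retry loop".to_string())
}
```

Callers then handle this case like any other failure, for example by logging it and skipping one update cycle.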

Frontend Optimizations

The Svelte layer received matching optimizations:

  • Virtual scrolling refined to maintain 60 FPS with 2000+ processes
  • Debounced search (150ms) prevents per-keystroke O(n) filtering
  • Store-based reactivity avoids unnecessary component re-renders

Results

Metric | Before | After
Heap allocations per IPC call | ~500 | ~100
Subprocess spawns / second | 0.5 | 0
Browser tab cache reads | O(n) | O(1)
Process sort clones | 500+ | 100

These optimizations compound: OmniMon now handles 2000+ simultaneous processes with imperceptible UI latency.