Q1. How do you diagnose event loop latency problems in a Node.js API under load?
I capture runtime telemetry such as event loop delay histograms, CPU profiles, and async stack traces to separate blocking CPU work from I/O stalls. Once the cause is identified, I optimize hot code paths, batch expensive operations, or move compute-heavy tasks to worker threads or job queues.
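
As a minimal sketch of the event loop delay part of that telemetry, the snippet below uses Node's built-in `monitorEventLoopDelay` from `perf_hooks`. The helper names (`summarizeLoopDelay`, `startLoopDelayReporter`), the 10 second reporting interval, and the 100 ms warning threshold are assumptions chosen for illustration and should be tuned against an idle baseline for the service.

```typescript
import { monitorEventLoopDelay } from "node:perf_hooks";
import { setInterval, clearInterval } from "node:timers";

// Minimal shape of the histogram fields read below (values are nanoseconds).
interface DelayHistogram {
  mean: number;
  max: number;
  percentile(p: number): number;
}

interface LoopDelaySummary {
  meanMs: number;
  p99Ms: number;
  maxMs: number;
}

const NS_PER_MS = 1e6;

// Convert the nanosecond histogram readings into milliseconds.
function summarizeLoopDelay(h: DelayHistogram): LoopDelaySummary {
  return {
    meanMs: h.mean / NS_PER_MS,
    p99Ms: h.percentile(99) / NS_PER_MS,
    maxMs: h.max / NS_PER_MS,
  };
}

// Hypothetical helper: samples the loop, logs one JSON line per interval,
// and returns a function that stops the reporter.
function startLoopDelayReporter(intervalMs = 10_000, warnP99Ms = 100): () => void {
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();

  const timer = setInterval(() => {
    const summary = summarizeLoopDelay(histogram);
    const level = summary.p99Ms > warnP99Ms ? "warn" : "info";
    console.log(JSON.stringify({ level, msg: "event_loop_delay", ...summary }));
    histogram.reset(); // start a fresh window for the next report
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for reporting

  return () => {
    clearInterval(timer);
    histogram.disable();
  };
}
```

A sustained high p99 alongside high CPU usage usually points at blocking work on the main thread, which a CPU profile can then localize; a high request latency with a flat loop delay points instead at slow downstream I/O.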
Q2. When would you choose worker threads over a queue-based background worker architecture?
Worker threads suit CPU-heavy work that must complete within the request lifecycle or that benefits from sharing memory with the main thread. Queue-based workers are better for durable asynchronous jobs that need retries and horizontal scaling across machines.
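
The worker-thread side of that trade-off can be sketched as follows, assuming a deliberately slow recursive Fibonacci as a stand-in for real CPU-bound work. The function name `fibInWorker` is hypothetical, and the worker body is passed inline with `eval: true` only to keep the example in one file. A production service would more likely load a separate worker file and reuse a pool of workers rather than spawn one per call.

```typescript
import { Worker } from "node:worker_threads";

// Worker body as plain JavaScript, evaluated inside the worker thread.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  parentPort.postMessage(fib(workerData));
`;

// Runs the CPU-heavy computation off the main thread so the event loop
// stays free to serve other requests while it completes.
function fibInWorker(n: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      // A non-zero exit before any message means the worker crashed.
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}
```

The result comes back within the lifetime of the request that awaited it, but nothing here survives a process restart. That gap is what a durable queue fills.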
Q3. What reliability patterns do you apply for external API calls in Node services?
I set per-request timeouts, retry with exponential backoff and jitter, put circuit breakers in front of unstable dependencies, and use request hedging only where the operation is guaranteed to be idempotent. I also emit structured logs and distributed traces so failures are diagnosable across service boundaries.
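
The timeout and backoff-with-jitter part can be sketched as below. The names `backoffDelayMs`, `callWithRetry`, and `RetryOptions` are hypothetical, the jitter strategy shown is "full jitter" (a uniformly random delay up to the capped exponential ceiling), and the sketch retries on every error, which is only safe for idempotent operations.

```typescript
import { setTimeout as sleep } from "node:timers/promises";
import { setTimeout, clearTimeout } from "node:timers";

interface RetryOptions {
  retries: number;   // retries after the first attempt
  baseMs: number;    // backoff ceiling for the first retry
  capMs: number;     // upper bound on the backoff ceiling
  timeoutMs: number; // per-attempt timeout
}

// Full jitter: pick a random delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs: number,
  capMs: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Calls op with an AbortSignal that fires after timeoutMs. The timeout only
// takes effect if op actually honors the signal.
async function callWithRetry<T>(
  op: (signal: AbortSignal) => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= opts.retries; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), opts.timeoutMs);
    try {
      return await op(controller.signal);
    } catch (err) {
      lastError = err;
    } finally {
      clearTimeout(timer);
    }
    if (attempt < opts.retries) {
      await sleep(backoffDelayMs(attempt, opts.baseMs, opts.capMs));
    }
  }
  throw lastError;
}
```

With Node 18 or later, a call might look like `callWithRetry((signal) => fetch(url, { signal }), { retries: 3, baseMs: 100, capMs: 2000, timeoutMs: 1500 })`, where `url` and the numeric values are placeholders. A circuit breaker would wrap this helper so that a dependency that keeps failing stops receiving retries at all.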