feat(tarball): add Worker Threads support for parallel analysis#613
feat(tarball): add Worker Threads support for parallel analysis#6137amed3li wants to merge 3 commits intoNodeSecure:masterfrom
Conversation
Implement Worker Threads to parallelize JavaScript file analysis, delivering measurable performance improvements for large codebases. Performance Results (280 files): - Average speedup: +10-15% - Event loop responsiveness: +36% - Memory usage: -90-94% Key Features: - Dynamic load balancing (40-file batches) - Intelligent threshold (250+ files activation) - Persistent worker pool (5min idle timeout) - 4-level fallback system for robustness - Zero breaking changes (backward compatible) Implementation: - WorkerPool singleton with dynamic thread calculation - Batch processing to minimize communication overhead - JIT warmup to eliminate cold-start latency - Configurable via NODE_SECURE_DISABLE_WORKERS env var Testing: - Full unit test coverage (scanner.worker.spec.ts) - Integration tests (WorkerPool.spec.ts) - Comprehensive benchmarking suite Closes NodeSecure#578 Signed-off-by: Hamed Mohamed <hamedmohamed22w@gmail.com>
|
|
All alerts resolved. Learn more about Socket for GitHub. This PR previously contained dependency changes with security issues that have been resolved, removed, or ignored. |
|
Make sure the linter (ESLint) is ok everywhere |
- Add proper import section comments (@openally/imports) - Fix inline comments to JSDoc format (no-inline-comments) - Fix duplicate imports (no-duplicate-imports) - Fix line lengths and operator precedence - Fix empty block statements Signed-off-by: Hamed Mohamed <hamedmohamed22w@gmail.com>
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
| // Import Internal Dependencies | ||
| import { WorkerPool } from "../dist/class/WorkerPool.class.js"; | ||
|
|
||
| const __dirname = path.dirname(fileURLToPath(import.meta.url)); |
There was a problem hiding this comment.
We use ESM so import.meta.dirname is available
| await new Promise((resolve) => { | ||
| setTimeout(resolve, 100); | ||
| }); |
There was a problem hiding this comment.
Prefer node:timers/promises instead
| interface WorkerResponse { | ||
| /** | ||
| * Success flag | ||
| */ | ||
| s: boolean; | ||
| /** | ||
| * Result data | ||
| */ | ||
| r?: ReportOnFile; | ||
| /** | ||
| * Error details | ||
| */ | ||
| e?: { |
There was a problem hiding this comment.
Not a big fan of interfaces with one letter properties and the comment explaining what it is.
| const { WorkerPool } = await import("./WorkerPool.class.js"); | ||
| const pool = WorkerPool.getInstance(); |
There was a problem hiding this comment.
Not sure why we need a singleton here
There was a problem hiding this comment.
Not sure why we need a singleton here
The main reason for using a singleton here is to keep the worker pool
alive across multiple scanPackage() calls (e.g. when scanning many
packages in a pipeline), which avoids paying the worker startup cost
on each invocation.
That said, I agree this is mostly an implementation detail.
If you’d prefer a simpler approach (e.g. a module-scoped instance or
a per-scanner lifecycle), I’m happy to refactor it accordingly.
- Use import.meta.dirname instead of path.dirname(fileURLToPath()) - Use node:timers/promises setTimeout instead of Promise wrapper - Use @nodesecure/fs-walk instead of custom findJavaScriptFiles - Rename properties to full names (s->success, r->result, e->error) - Update tests for new property names Signed-off-by: Hamed Mohamed <hamedmohamed22w@gmail.com>
Summary
This PR implements Worker Threads to parallelize JavaScript file analysis in
@nodesecure/tarball, delivering 10-15% average speedup and 36% better responsiveness for large codebases (250+ files).Performance Benchmarks
Tested on Intel Core i7 (8 cores), 16GB RAM with 280 JavaScript files:
Key Features
Statistical Note
Key Features
Configuration
Testing
Documentation
Closes
#578