perf: optimize native binary size and startup time#916

Draft
He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/native-binary-optimization

@He-Pin He-Pin commented Jun 14, 2026

Motivation

The Scala Native binary was 17.7MB and loaded OpenSSL's libcrypto at startup even when hash functions were never called. JVM format-heavy workloads had redundant ASCII range checks for known-safe format strings.

Modification

  • Strip post-link (build.mill): Add strip step after nativeLink, saving ~2MB of __LINKEDIT (non-exported symbols)
  • dlopen OpenSSL (Platform.scala): Remove scala-native-crypto dependency; load libcrypto via dlopen/dlsym only when hash functions (std.md5, std.sha256, etc.) are actually called
  • Lazy stdlib (Interpreter.scala, Val.scala): Defer StdLibModule construction, static singletons, and Interpreter.std until first use
  • Format ASCII-safe propagation (StaticOptimizer.scala): Detect AsciiSafeStr LHS in % operator and pass sourceAsciiSafe=true to scanFormat, skipping redundant ASCII range checks
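
The deferred-loading idea behind the dlopen and lazy stdlib items can be sketched in plain Scala. This is a minimal model, not the code in this PR: `LazyCryptoSketch`, `CryptoFuncs`, `loadCrypto`, and `loadCount` are hypothetical names, and `java.security.MessageDigest` stands in for the function pointers that the native build would obtain through `dlopen`/`dlsym`. The null checks mentioned in the commit message are not modeled here.

```scala
// Hypothetical model of the lazy-loading pattern; not the PR's actual code.
object LazyCryptoSketch {
  // Counts how many times the (simulated) library load runs.
  var loadCount = 0

  // Stand-in for the table of function pointers resolved via dlsym.
  final class CryptoFuncs(val md5Hex: String => String)

  // Stand-in for dlopen + dlsym: the expensive step we want to defer.
  private def loadCrypto(): CryptoFuncs = {
    loadCount += 1
    new CryptoFuncs(s => {
      val digest = java.security.MessageDigest
        .getInstance("MD5")
        .digest(s.getBytes("UTF-8"))
      digest.map(b => f"${b & 0xff}%02x").mkString
    })
  }

  // Initialized on first access, then cached for every later call.
  private lazy val crypto: CryptoFuncs = loadCrypto()

  def md5(s: String): String = crypto.md5Hex(s)

  def main(args: Array[String]): Unit = {
    println(s"loads before first hash: $loadCount")
    md5("a")
    md5("b")
    println(s"loads after two hashes: $loadCount")
  }
}
```

A program that never calls `md5` never pays for the load, and one that calls it many times pays only once.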

Result

| Metric | Before | After | Delta |
| --- | --- | --- | --- |
| Binary size (stripped) | 15.0MB | 13.0MB | -13% |
| Binary size (pre-strip) | 17.7MB | 15.0MB | -15% |
| __LINKEDIT section | 3.1MB | 1.1MB | -2.0MB |
| repeat_format benchmark | 0.135 ms/op | 0.124 ms/op | -8.1% |

Most other benchmarks are within the noise margin (±3%), since the lazy stdlib change primarily benefits startup time rather than steady-state throughput.
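
The repeat_format gain comes from skipping a check whose answer is already known. A rough sketch of that idea follows, assuming the check is a linear scan over the format string. Only the `sourceAsciiSafe` parameter name comes from this PR; `AsciiSafeSketch`, `isAscii`, `canUseAsciiPath`, and `rangeChecks` are hypothetical and do not reflect the real `scanFormat` signature.

```scala
// Hypothetical sketch; names other than sourceAsciiSafe are illustrative.
object AsciiSafeSketch {
  // Counts how many full-string range checks were performed.
  var rangeChecks = 0

  // Linear scan: true when every character is in the 7-bit ASCII range.
  def isAscii(s: String): Boolean = {
    rangeChecks += 1
    var i = 0
    while (i < s.length) {
      if (s.charAt(i) >= 128) return false
      i += 1
    }
    true
  }

  // Stand-in for the scanner: decides whether the ASCII fast path applies.
  // When the caller has already proven the string safe, the scan is skipped.
  def canUseAsciiPath(fmt: String, sourceAsciiSafe: Boolean): Boolean =
    sourceAsciiSafe || isAscii(fmt)

  def main(args: Array[String]): Unit = {
    val fmt = "%s has %d items"
    canUseAsciiPath(fmt, sourceAsciiSafe = false) // unknown origin: scans
    canUseAsciiPath(fmt, sourceAsciiSafe = true)  // proven upstream: no scan
    println(s"range checks performed: $rangeChecks")
  }
}
```

Running `main` reports one range check for the two calls, because the second call short-circuits on the flag.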

References

  • Scala Native nativeLink documentation
  • OpenSSL dlopen/dlsym lazy loading pattern

@He-Pin He-Pin marked this pull request as draft June 14, 2026 13:40
@He-Pin He-Pin closed this Jun 14, 2026
@He-Pin He-Pin reopened this Jun 14, 2026
@He-Pin He-Pin force-pushed the perf/native-binary-optimization branch 3 times, most recently from e2b5c36 to 20c2b65 Compare June 17, 2026 19:26
Motivation:
The Scala Native binary was 17.7MB and loaded OpenSSL's libcrypto at
startup even when hash functions were never called. JVM format-heavy
workloads had redundant ASCII range checks for known-safe format strings.

Modification:
- Add strip post-link step in build.mill to reduce binary size
- Lazy-load OpenSSL via dlsym instead of linking at compile time
- Cache OpenSSL function pointers and add null checks
- Wrap cryptoFuncs dlsym calls in Zone.acquire for C string allocation
- Replace generic loadSym with concrete CFuncPtr.fromPtr calls
- Revert unnecessary evaluator pattern matching and num visibility
  changes
- Apply scalafmt formatting

Result:
Reduced native binary size and improved startup time by deferring
OpenSSL loading until hash functions are actually used.
@He-Pin He-Pin force-pushed the perf/native-binary-optimization branch from 20c2b65 to 31be008 Compare June 17, 2026 23:56