refactor: defer zip manifest building to execution phase to improve analysis phase performance#3381
Merged
rickeylev merged 5 commits intobazel-contrib:mainfrom Nov 10, 2025
Conversation
Collaborator
|
As mentioned in the other PR comment: we can definitely accept PR until the questions about using an external tool are worked out. Overall, LGTM. Because of the profile results in the other thread showing format() (as called by map_zip_runfiles) was a significant chunk of the time, changing to |
py_binary and py_test rules
py_binary and py_test rules
Collaborator
Collaborator
|
Ok, cleaned this up, switched it to use @tobyh-canva If you have opportunity, could you run another profile? I'm interested to see how much of the 23 seconds of format() overhead in building the path strings is gone. |
rickeylev
approved these changes
Nov 10, 2025
Contributor
Author
|
Running a profile right now, thanks heaps! |
Contributor
Author
aignas
added a commit
to aignas/rules_python
that referenced
this pull request
Dec 6, 2025
Looking at the investigation in bazel-contrib#3381, it seems that we are calling the startswith many times and I wanted to see if it would be possible to optimize how it is done. I also realized that no matter what target we have, we will be calling the function once with a `__init__.py` path and we can inline this case as a separate if statement checking for equality instead, which Starlark optimizer should understand better. Before this PR for every executable target we would go through the `legacy_external_runfiles and "__init__.py".startswith("external")` and this PR eliminates this. Related to bazel-contrib#3380 and bazel-contrib#3381
github-merge-queue bot
pushed a commit
that referenced
this pull request
Dec 7, 2025
) Looking at the investigation in #3380, it seems that we are calling the startswith many times and I wanted to see if it would be possible to optimize how it is done. I also realized that no matter what target we have, we will be calling the function once with a `__init__.py` path and we can inline this case as a separate if statement checking for equality instead, which Starlark optimizer should understand better. Before this PR for every executable target we would go through the `legacy_external_runfiles and "__init__.py".startswith("external")` and this PR eliminates this. Related to #3380 and #3381
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.


When py_binary/py_test were being built, they were flattening the runfiles
depsets at analysis time in order to create the zip file mapping manifest for
their implicit zipapp outputs. This flattening was necessary because they had
to filter out the original main executable from the runfiles that didn't belong
in the zipapp. This flattening is expensive for large builds, in some cases
adding over 400 seconds of time and significant memory overhead.
To fix, have the zip file manifest use the
runfiles_with_exeobject, which isthe runfiles, but pre-filtered for the files zip building doesn't want. This
then allows passing the depsets directly to
Args.add_alland using map_eachto transform them.
Additionally, pass
runfiles.empty_filenamesusing a lambda. Accessing thatattribute implicitly flattens the runfiles.
Finally, because the original profiles indicated
str.format()was a non-trivialamount of time (46 seconds / 15% of build time), switch to using
+instead.This is a more incremental alternative to #3380 which achieves most of the
same optimization with only Starlark changes, as opposed to introducing an
external script written in C++.
Profile of a large build, which shows a Starlark CPU profile. It shows an overall build
time of 305 seconds. 46 seconds (15%) are spent in
map_zip_runfiles, half of whichis in
str.startswith()and the other half instr.format().