Skip to content

Conversation

@tausbn
Copy link
Contributor

@tausbn tausbn commented Jan 9, 2026

This PR contains most of the changes that are needed in order to get all of the overlay[local] annotations to compile.

Reviewing on a commit-by-commit basis is highly recommended.

No change note required as these changes should not have any visible effect.

@github-actions github-actions bot added the Python label Jan 9, 2026
@tausbn tausbn force-pushed the tausbn/python-prepare-for-overlay-annotations branch 9 times, most recently from 45e4b92 to 1242a7b Compare January 13, 2026 13:53
@tausbn tausbn force-pushed the tausbn/python-prepare-for-overlay-annotations branch 8 times, most recently from 897bbbc to b8c0ed3 Compare January 29, 2026 16:28
@tausbn tausbn added the no-change-note-required This PR does not need a change note label Jan 29, 2026
@tausbn tausbn force-pushed the tausbn/python-prepare-for-overlay-annotations branch 3 times, most recently from e3579c9 to 734702a Compare January 29, 2026 17:06
Removes the dependence on the (global) `ModuleVariableNode.getARead()`,
by adding a local version (that doesn't include `import *` reads)
instead.
This may result in more nodes, but it should still be bounded by the
number of global variables in the source code.
With `ModuleVariableNode`s now appearing for _all_ global variables (not
just the ones that actually seem to be used), some of the tests changed
a bit. Mostly this was in the form of new flow (because of new nodes
that popped into existence). For some inline expectation tests, I opted
to instead exclude these results, as there was no suitable location to
annotate. For the normal tests, I just accepted the output (after having
vetted it carefully, of course).
Explicitly adds a bunch of nodes that were previously (using a global
analysis) identified as `ExtractedArgumentNode`s. These are then
subsequently filtered out in `argumentOf` (which is global) by putting
the call to `getCallArg` there instead of in the charpred.
Fixes the test failures that arose from making `ExtractedArgumentNode`
local.

For the consistency checks, we now explicitly exclude the
`ExtractedArgumentNode`s (now much more plentiful due to the
overapproximation) that don't have a corresponding `getCallArg` tuple.

For various queries/tests using `instanceof ArgumentNode`, we instead us
`isArgumentNode`, which explicitly filters out the ones for which
`isArgumentOf` doesn't hold (which, again, is the case for most of the
nodes in the overapproximation).
Uses the same trick as for `ExtractedArgumentNode`, wherein we postpone
the global restriction on the charpred to instead be in the `argumentOf`
predicate (which is global anyway).

In addition to this, we also converted `CapturedVariablesArgumentNode`
into a proper synthetic node, and added an explicit post-update node for
it. These nodes just act as wrappers for the function part of call
nodes. Thus, to make them work with the variable capture machinery, we
simply map them to the closure node for the corresponding control-flow
or post-update node.
As we now have many more capturing closure arguments, we must once again
exclude the ones that don't actually have `argumentOf` defined.
New nodes means new results. Luckily we rarely have a test that selects
_all_ dataflow nodes.
@tausbn tausbn force-pushed the tausbn/python-prepare-for-overlay-annotations branch from 3b632d9 to 958c798 Compare January 30, 2026 12:50
@tausbn tausbn marked this pull request as ready for review January 30, 2026 15:47
@tausbn tausbn requested a review from a team as a code owner January 30, 2026 15:47
Copilot AI review requested due to automatic review settings January 30, 2026 15:47
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the Python dataflow library to prepare for overlay[local] annotations. The changes are primarily internal and should not have any visible effects on query results, except for test outputs which now include additional intermediate nodes.

Changes:

  • Expanded ModuleVariableNode creation to all global variables (previously only escaping variables)
  • Refactored captured variable argument nodes into separate synthetic nodes and ArgumentNode wrappers
  • Distinguished local reads from import star reads via new getALocalRead() method

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.

Show a summary per file
File Description
python/ql/lib/semmle/python/dataflow/new/internal/DataFlowPublic.qll Removed restrictions on ModuleVariableNode creation and added getALocalRead() method; refactored ExtractedArgumentNode with approximation logic
python/ql/lib/semmle/python/dataflow/new/internal/DataFlowDispatch.qll Split CapturedVariablesArgumentNode into SynthCapturedVariablesArgumentNode and CapturedVariablesArgumentNodeAsArgumentNode to avoid non-monotonic recursion
python/ql/lib/semmle/python/dataflow/new/internal/DataFlowPrivate.qll Added new synthetic node types to nodeIsHidden predicate
python/ql/lib/semmle/python/dataflow/new/internal/LocalSources.qll Changed from getARead() to getALocalRead() to exclude import star reads
python/ql/lib/semmle/python/dataflow/new/internal/VariableCapture.qll Added asClosureNode mappings for new synthetic argument nodes
python/ql/src/Classes/InitCallsSubclass/InitCallsSubclassMethod.ql Replaced instanceof ArgumentNode check with isArgumentNode predicate
python/ql/lib/utils/test/dataflow/callGraphConfig.qll Replaced instanceof ArgumentNode check with isArgumentNode predicate
python/ql/lib/utils/test/dataflow/MaximalFlowTest.qll Added ModuleVariableNode filtering and replaced instanceof ArgumentNode with isArgumentNode
python/ql/consistency-queries/DataFlowConsistency.ql Added missingArgumentCallExclude for overapproximated argument nodes
python/ql/test/library-tests/frameworks/internal-ql-helpers/PoorMansFunctionResolutionTest.ql Excluded ModuleVariableNodes from test results
python/ql/test/library-tests/dataflow/global-flow/test.py Updated comments and added $writes annotations for newly tracked global variables
Test expected outputs Updated to reflect new ModuleVariableNode entries and additional flow paths

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no-change-note-required This PR does not need a change note Python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants