Skip to content

perf: Speed up vertex/edge sequence construction (~10× for large result sets)#2696

Open
schochastics wants to merge 9 commits into
mainfrom
feat/lazy-altrep-names
Open

perf: Speed up vertex/edge sequence construction (~10× for large result sets)#2696
schochastics wants to merge 9 commits into
mainfrom
feat/lazy-altrep-names

Conversation

@schochastics
Copy link
Copy Markdown
Contributor

part of #2695. I went down the ALTREP hole and while it is possible to get near the return.vs.es = FALSE result, it does add quite some complexity to the code. Lets discuss if we want this or not. maybe some stuff has to go to the C core?


Summary

Functions that return many vertex sequences (max_cliques(), cliques(), ego(), shortest-path vpaths, all_simple_paths(), separators, cohesive blocks) were slow on named graphs because each returned sequence paid a fixed per-object tax. This PR removes that tax.

max_cliques(sample_gnp(500, 0.15)) on a named graph (37k cliques) locally:

median
main ~338 ms
this PR 34.5 ms
numeric floor (return.vs.es = FALSE) ~30 ms

The default (vs objects) is now within ~4 ms of returning bare integer indices.

The finding (and how it differs from the issue)

The motivation in #1614 / #1652 assumed the cost was create_vs() building the names attribute, and the fix was lazy/ALTREP names. Profiling says otherwise: lazy names was the smallest of the three contributors (~10%). The real costs were

  1. attaching a graph weak-reference to every object (three .Calls per sequence — ~75% of the time), and
  2. per-object attribute setting and R-loop overhead.

The weak-reference turned out to be the same for every sequence of a graph (Rf_duplicate() is a no-op on an environment), so one shared reference replaces one-per-object with no change in lifetime semantics. The rest came from building the payload directly, setting attributes in one pass, and finally constructing the whole list in a single C call.

This reframes #1652. The performance gap that blocked deprecating return.vs.es (the worst case stibu81 called "unusable" was an 8× gap) is now ~1.15×. For typical workloads it's negligible. The option's main remaining justification is gone, so it can be deprecated and left inert rather than urgently removed.

What changed

  • names/edge-names are a lazy ALTREP string vector (igraph_lazy_names) that materializes only when read, and subsets in O(1).
  • All sequences of a graph share one weak reference instead of minting one each.
  • Vertex-sequence lists are built by create_vs_list(), backed by a C constructor (Rx_igraph_vs_list) that builds the whole list in one pass.
  • The VERTEXSET_LIST stimulus template emits create_vs_list(); the generated aaa-*.R files were regenerated to match (verified by re-running stimulus).
  • New touchstone benchmarks cover construction (max_cliques_named, head_of_named, V_named/E_named) and the specific wins (ego_order2_named, max_cliques_sizes_named, all_simple_paths_named, vs_subset_positional).

Correctness

Output is unchanged: integer payloads with the same values and names, no names attribute on unnamed graphs, NA/out-of-range IDs map to NA_STRING, and inputs are never mutated. The shared weak reference still releases the graph: get_vs_graph(seq) returns NULL after rm(graph); gc(). The full suite passes (iterators, cliques, paths, components, flow, conversion, interface, structural-properties, cycles, attributes, aaa-auto), and the C code is clean under gctorture(TRUE).

Scope and follow-ups

Vertex sequences only. Edge sequences already share the weak reference via simple_es_index(), but their vnames ("tail|head") are still built eagerly; making those lazy and adding a C create_es_list() is left as a follow-up. The return.vs.es option and its codegen guards are untouched here — its deprecation is a separate decision, now much better supported.

Relates to #1652; complements the consistency fix in #1614 / #2439.

schochastics and others added 7 commits June 5, 2026 14:57
Vertex/edge sequences eagerly materialized their `names` (and edge
`vnames`) character vectors at construction time. For functions that
return many sequences (e.g. max_cliques returning tens of thousands),
this allocates a named character vector per object even when the names
are never read.

Add an `igraph_lazy_names` ALTREP string class (src/rinterface_extra.c)
that wraps the graph's full name vector plus a 1-based index, and only
materializes the strings when an element is touched (printing, named
indexing, as_ids()). Subsetting stays lazy via an Extract_subset method
that composes indices. V() and E() now attach names through the
lazy_index_names() helper instead of subsetting eagerly; downstream
constructors (create_vs/simple_vs_index/...) inherit laziness through
the Extract_subset path with no changes.

Names resolve identically to before; vertex/edge sequence printing,
named indexing and as_ids() are unaffected. Note: this removes the
name-materialization penalty (named sequences now cost about the same as
unnamed), but it is not by itself sufficient to close the gap to the
numeric (return.vs.es = FALSE) path -- that is dominated by per-object
graph-reference attachment in add_vses_graph_ref(), addressed separately.

cpp11::cpp_register() also drops two dead registrations
(Rx_igraph_weak_ref_run_finalizer, UUID_gen) that no R code .Call()s.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a benchmark group exercising the sequence-construction path on named
graphs, where building the `names`/`vnames` attribute and attaching the
graph reference dominate: max_cliques (tens of thousands of vertex
sequences), head_of over every edge, and V()/E() on a large named graph.
These guard the lazy-names work and any future construction-path changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Constructing a vertex/edge sequence attached a graph reference per
object via add_vses_graph_ref(), which calls .Call(Rx_igraph_copy_env),
.Call(Rx_igraph_make_weak_ref) and .Call(Rx_igraph_get_graph_id) for
every object. For functions returning many sequences (e.g. max_cliques
returning tens of thousands) this dominated construction time --
profiling showed it was ~75% of the cost, far more than name building.

The weak reference's key is the graph's environment, which is identical
for every sequence of a graph (Rf_duplicate() is a no-op on an
environment, so get_vs_ref() returns the same env each call). A single
shared weak reference is therefore semantically identical to one per
object: while the graph is alive the reference resolves, and once the
graph is released the (weak) reference reports it gone -- verified that
get_vs_graph() still returns NULL after rm(graph); gc().

simple_es_index() already propagates env/graph from its input, so edge
construction only needed the redundant per-object add_vses_graph_ref()
call removed. simple_vs_index() now propagates env/graph the same way,
and unsafe_create_vs()/unsafe_create_es() rely on that propagation. The
existing `lapply(res, unsafe_create_vs, graph = graph, verts = V(graph))`
call sites then share the single weak reference built by V()/E() with no
call-site or codegen changes.

max_cliques(sample_gnp(500, 0.15)) on a named graph: ~338ms -> ~200ms.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two construction-cost reductions on top of the shared weak reference:

* simple_vs_index() now sets names/class/env/graph in a single
  `attributes<-` call instead of separate `attr<-`/`class<-`
  assignments. Each incremental assignment shallow-copies the vector,
  and that copying dominated when building many sequences.

* unsafe_create_vs() no longer routes through simple_vs_index(). Its
  `idx` are vertex IDs from C and `verts` is always the full V(graph),
  so `verts[idx]` just reproduces `idx`; we now use the IDs directly as
  the (integer) payload and take names lazily from `verts` via the
  ALTREP's O(1) subsetting, avoiding a full copy of V(graph) per object.

Payload type (integer), names, NA handling and graph recovery are
unchanged. max_cliques(sample_gnp(500, 0.15)) on a named graph:
~200ms -> ~140ms (was ~338ms before this branch).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Functions returning many vertex sequences used
`lapply(res, unsafe_create_vs, graph = graph, verts = V(graph))`, which
re-read the graph reference, graph id and name source from `verts` on
every object. Add `create_vs_list(graph, idx_list)` which hoists all of
that per-graph work out of the loop and builds each sequence with a
single `attributes<-` and a directly-constructed lazy-names ALTREP, so
per-object cost drops to ~an integer coercion plus attribute set.

The `VERTEXSET_LIST` OUTCONV template in tools/stimulus/types-RR.yaml now
emits `create_vs_list(graph, res)`; the generated R/aaa-*.R files are
updated to match (27 sites), along with the hand-written call sites in
cliques.R, cohesive.blocks.R, components.R, conversion.R, interface.R,
paths.R and structural-properties.R (10 sites). `unsafe_create_vs()`
stays as the single-object form.

Edge sequences are left as-is: simple_es_index() already propagates the
shared weak reference from a single E(graph), so unsafe_create_es() does
not pay the per-object cost. A lazy `vnames` and an es batch form remain
as follow-ups.

max_cliques(sample_gnp(500, 0.15)) on a named graph: ~140ms -> ~100ms;
~338ms -> ~100ms over the whole branch (numeric floor ~25-30ms). Output,
names, NA handling and graph recovery are unchanged; 0 test failures
across the vertex-sequence-returning suites and aaa-auto.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a benchmark group exercising the three distinct wins on named graphs:

* ego_order2_named -- batch construction of one vertex sequence per node
  (create_vs_list via neighborhood()).
* max_cliques_sizes_named -- build many cliques but only read their sizes;
  with lazy names the name vectors are never materialized.
* all_simple_paths_named -- another high-volume vertex-sequence-list path.
* vs_subset_positional -- O(1) lazy subsetting of a large named vertex
  sequence (ALTREP Extract_subset) instead of copying a name subset.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
create_vs_list() drove the per-element work (lapply closure, as.integer,
.Call to build the lazy-names ALTREP, and attributes<-) from R, repeated
once per sequence -- the dominant cost when functions like max_cliques()
return tens of thousands of igraph.vs objects.

Move that loop into Rx_igraph_vs_list() in rinterface_extra.c, which reuses
the file-static igraph_lazy_names ALTREP class directly. R now only builds
V(graph) once (to mint the shared weak reference and graph id) and hands the
pieces to C. Each element gets a fresh integer payload (guarded against
coerceVector aliasing so the caller's vectors are never mutated), an optional
lazy-names attribute, the shared env/graph attributes, and the igraph.vs
class.

max_cliques(sample_gnp(500, 0.15)) on a named graph (medians, 12 iters):

  vs:      ~100ms -> 34.5ms   (numeric floor ~30ms)

Correctness verified: integer payloads with identical values/names, no names
on unnamed graphs, NA/out-of-range IDs map to NA_STRING, inputs unmutated, and
the shared weakref still releases the graph after rm()+gc(). Full suite passes
(iterators, cliques, paths, components, flow, conversion, interface,
structural-properties, cycles, attributes, aaa-auto); clean under gctorture.

cpp11::cpp_register() added only the Rx_igraph_vs_list registration; it left
the previously-noted unused entries untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 21f3d4b is merged into main:

  • ✔️E_named: 17ms -> 16.5ms [-6.42%, +0.55%]
  • ❗🐌V_named: 296µs -> 329µs [+2.63%, +19.76%]
  • 🚀all_simple_paths_named: 21.4ms -> 5.44ms [-76.14%, -72.96%]
  • ✔️as_adjacency_matrix: 780ms -> 779ms [-1.09%, +0.88%]
  • ✔️as_biadjacency_matrix: 782ms -> 783ms [-0.98%, +1.35%]
  • ✔️as_data_frame_both: 1.68ms -> 1.68ms [-2.39%, +2.88%]
  • ✔️as_long_data_frame: 4.42ms -> 4.33ms [-8.24%, +3.82%]
  • 🚀ego_order2_named: 36.8ms -> 7.07ms [-82.68%, -78.94%]
  • ✔️es_attr_filter: 3.11ms -> 3.13ms [-2.78%, +4.11%]
  • ❗🐌graph_from_adjacency_matrix: 127ms -> 128ms [+0.05%, +2.89%]
  • ✔️graph_from_data_frame: 3.64ms -> 3.69ms [-0.31%, +3.15%]
  • 🚀head_of_named: 639µs -> 583µs [-12.31%, -5.18%]
  • 🚀max_cliques_named: 51.4ms -> 5.78ms [-89.76%, -87.76%]
  • 🚀max_cliques_sizes_named: 50.1ms -> 8.97ms [-83.86%, -80.3%]
  • ✔️vs_attr_filter: 1.82ms -> 1.87ms [-1.18%, +6.8%]
  • ❗🐌vs_by_name: 1.15ms -> 1.21ms [+1.37%, +8.1%]
  • 🚀vs_subset_positional: 1.53ms -> 1.15ms [-28.38%, -21.54%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

R CMD check on R 4.5 flagged a non-API call to `DATAPTR`, used by the
`igraph_lazy_names` ALTREP Dataptr method. Switch to `DATAPTR_RO` (and
keep `DATAPTR_OR_NULL`), both of which are part of the API and are the
sanctioned replacements; `DATAPTR` is on tools:::nonAPI, these are not.

The materialized name cache is read-only for our purposes (as.vector(),
coercion, printing), so a read-only data pointer is sufficient. Verified
the compiled object no longer references the `DATAPTR` entry point and
that names/as_ids/as.vector/subsetting/printing still work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 79fadf6 is merged into main:

  • 🚀E_named: 16.7ms -> 16.3ms [-4.39%, -1.42%]
  • ❗🐌V_named: 280µs -> 294µs [+2.58%, +7.9%]
  • 🚀all_simple_paths_named: 19.9ms -> 5.45ms [-73.22%, -72.07%]
  • ✔️as_adjacency_matrix: 723ms -> 723ms [-0.5%, +0.33%]
  • ✔️as_biadjacency_matrix: 724ms -> 729ms [-0.08%, +1.53%]
  • ❗🐌as_data_frame_both: 1.49ms -> 1.52ms [+0.74%, +2.55%]
  • ✔️as_long_data_frame: 3.9ms -> 3.94ms [-1.44%, +3.81%]
  • 🚀ego_order2_named: 33ms -> 6.78ms [-80.08%, -78.84%]
  • ✔️es_attr_filter: 2.8ms -> 2.68ms [-10.31%, +2.25%]
  • ✔️graph_from_adjacency_matrix: 129ms -> 129ms [-0.92%, +0.39%]
  • ✔️graph_from_data_frame: 3.4ms -> 3.4ms [-0.5%, +0.59%]
  • 🚀head_of_named: 568µs -> 525µs [-9.29%, -5.6%]
  • 🚀max_cliques_named: 43.5ms -> 5.46ms [-87.93%, -86.93%]
  • 🚀max_cliques_sizes_named: 44.7ms -> 8.2ms [-82.51%, -80.78%]
  • ❗🐌vs_attr_filter: 1.53ms -> 1.58ms [+2.15%, +3.68%]
  • ❗🐌vs_by_name: 995µs -> 1.03ms [+0.89%, +6.57%]
  • 🚀vs_subset_positional: 1.38ms -> 1.09ms [-22.08%, -19.38%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant