Adding Map Caching and Removing viz dependency #409

Open
Aditya-Gupta26 wants to merge 12 commits into emerge/temp_training from aditya/predefine_goals

Conversation

@Aditya-Gupta26

This PR implements map caching: sharing common (and memory-heavy) map features across envs without having to load every .bin file.

Note: the change in the render code removes ensure_drive_binary() and any subsequent dependency on visualize.c; please continue using multi_scenario_render to control render settings while training.

It also adds a small feature to control the number of agents in renders from drive.ini.

Copilot AI review requested due to automatic review settings April 23, 2026 04:33

Copilot AI left a comment

Pull request overview

This PR introduces a shared map cache for the Drive C environment to avoid repeatedly loading large .bin map files, and updates the training/eval render pipeline to remove reliance on the legacy visualize/raylib export path.

Changes:

  • Add a process-local shared map cache in the Drive C binding and plumb map_id + road observation distance parameters through Drive env creation/reset.
  • Update training-time inline rendering to use eval_multi_scenarios_render (multi-view for EGL) instead of exporting .bin files and invoking the legacy renderer build step.
  • Add eval-time controls for min/max agents per env (gigaflow) via drive.ini.
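As a rough illustration of the last point, the new knobs might look like the fragment below. Only the key names (min_agents_per_env / max_agents_per_env) come from this PR; the section name and values are assumptions:

```ini
; Hypothetical drive.ini fragment -- section name and values are
; assumptions; only the key names appear in this PR.
[env]
min_agents_per_env = 1
max_agents_per_env = 64
```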

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Summary per file:

  • pufferlib/pufferl.py: Switch inline training render path to eval_multi_scenarios_render, remove ensure_drive_binary, and add agent-per-env controls in eval overrides.
  • pufferlib/ocean/drive/rollout.py: Add a shared Drive rollout loop utility with optional rendering context.
  • pufferlib/ocean/drive/drive.py: Plumb new cache-related kwargs into binding.shared, pass map_id into env init, and rename render APIs to env_id terminology + RenderView.
  • pufferlib/ocean/drive/drive.h: Implement shared map data structures, shared-map init path, and conditional cleanup in c_close.
  • pufferlib/ocean/drive/binding.c: Add process-local shared-map cache, cache keying, lazy population, and a map_id-based initialization fast path.
  • pufferlib/config/ocean/drive.ini: Add eval config knobs for min_agents_per_env / max_agents_per_env.


Comment thread: pufferlib/ocean/drive/binding.c (Outdated)
Comment on lines +1777 to +1790
temp_env.init_mode = init_mode;
temp_env.control_mode = control_mode;
temp_env.simulation_mode = simulation_mode;
temp_env.init_steps = init_steps;
temp_env.num_max_agents = max_agents_per_env;
temp_env.goal_radius = goal_radius;
temp_env.agents = shared->template_agents;
temp_env.num_total_agents = shared->num_total_agents;
temp_env.grid_map = shared->grid_map;
set_active_agents(&temp_env);
int active_count = temp_env.active_agent_count;
free(temp_env.active_agent_indices);
free(temp_env.static_agent_indices);
free(temp_env.expert_static_agent_indices);

Copilot AI Apr 23, 2026

In REPLAY mode, the active-agent counting path calls set_active_agents(&temp_env) with temp_env.agents pointing directly at shared->template_agents. set_active_agents mutates Agent fields (e.g., active_agent / mark_as_expert), so this permanently mutates the cached template and will affect subsequent env initializations for that map (agent selection, expert flags, etc.). To avoid corrupting the shared cache, run set_active_agents on a temporary copy of the Agent structs (shallow-copy is enough since only scalar fields are mutated), or implement a non-mutating counting routine.

Suggested change (replacing the excerpt above):

Agent *temp_agents = NULL;
temp_env.init_mode = init_mode;
temp_env.control_mode = control_mode;
temp_env.simulation_mode = simulation_mode;
temp_env.init_steps = init_steps;
temp_env.num_max_agents = max_agents_per_env;
temp_env.goal_radius = goal_radius;
temp_env.num_total_agents = shared->num_total_agents;
temp_env.grid_map = shared->grid_map;
if (shared->num_total_agents > 0) {
    temp_agents = malloc(sizeof(Agent) * shared->num_total_agents);
    if (temp_agents == NULL) {
        Py_DECREF(agent_offsets);
        Py_DECREF(map_ids);
        return PyErr_NoMemory();
    }
    for (int i = 0; i < shared->num_total_agents; i++) {
        temp_agents[i] = shared->template_agents[i];
    }
}
temp_env.agents = temp_agents;
set_active_agents(&temp_env);
int active_count = temp_env.active_agent_count;
free(temp_env.active_agent_indices);
free(temp_env.static_agent_indices);
free(temp_env.expert_static_agent_indices);
free(temp_agents);

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +38
// We must NOT free them — they belong to the parent's address space.
// Discard them and let the child rebuild its own cache on the next call.
if (g_map_cache_pid != 0 && g_map_cache_pid != getpid()) {
    reset_cache_globals();
    return;
}
Copy link

Copilot AI Apr 23, 2026

The fork-handling branch in release_map_cache_internal() resets the cache globals on PID mismatch without freeing g_map_cache / g_cache_map_paths or the SharedMapData entries. In a forked child this drops the only pointers to large allocations, effectively leaking them for the lifetime of the child process (and preventing the OS from reclaiming COW pages if the parent later releases them). Consider either (a) reusing the inherited cache after fork by updating g_map_cache_pid to getpid(), or (b) fully freeing/detaching the inherited cache in the child before resetting the globals.

Suggested change (replacing the excerpt above):

// Adopt the inherited cache in the child so it can be released normally
// instead of dropping the only references to these allocations.
if (g_map_cache_pid != 0 && g_map_cache_pid != getpid())
    g_map_cache_pid = getpid();
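The PID guard being debated can be sketched in Python (illustrative names only; the real logic is C in binding.c): the cache records the PID that built it, and a forked child discards the inherited references and rebuilds rather than freeing memory it does not own:

```python
import os

# Illustrative stand-in for the C globals in binding.c.
g_map_cache = {"pid": None, "data": None}

def get_map_cache():
    pid = os.getpid()
    if g_map_cache["pid"] is not None and g_map_cache["pid"] != pid:
        # Forked child: the cached allocations were built by the parent,
        # so drop the references and rebuild below instead of freeing.
        g_map_cache["pid"] = None
        g_map_cache["data"] = None
    if g_map_cache["data"] is None:
        g_map_cache["pid"] = pid
        g_map_cache["data"] = {"built_by": pid}  # stand-in for map load
    return g_map_cache["data"]
```

In Python the dropped references are garbage-collected, so this pattern is harmless; in C, as the review points out, discarding the only pointers leaks the child's copy of the allocations unless the child adopts or frees the inherited cache.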

Comment thread: pufferlib/pufferl.py
Comment on lines +2417 to +2425
# Wrap rollout + post-processing in try/finally so vecenv.close() is
# guaranteed to run even if an exception fires mid-rollout.
# Without this, a crash inside the step loop leaves Drive C envs alive
# (ref_count > 0), which causes the next binding.shared() call with a
# different map_dir to raise:
# RuntimeError: Cannot change map cache config while Drive environments
# are still open. Call close() on all Drive instances first.
_rollout_exc = None
try:

Copilot AI Apr 23, 2026

eval_multi_scenarios_render is described as being wrapped in try/finally to guarantee vecenv.close(), but the current implementation uses try/except, stores the exception object, and later re-raises it with raise _rollout_exc. Re-raising an exception instance this way loses the original traceback, making render crashes much harder to debug. Prefer try: ... finally: vecenv.close() and let exceptions propagate (or capture/restore exc_info) so the original traceback is preserved.
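A minimal sketch of the suggested pattern (FakeVecEnv is a stand-in, not pufferlib's vecenv): try/finally guarantees close() and lets the exception propagate with its original traceback, with no need to stash and re-raise:

```python
class FakeVecEnv:
    """Stand-in for a Drive vecenv whose step() crashes mid-rollout."""
    def __init__(self):
        self.closed = False

    def step(self):
        raise RuntimeError("crash inside the step loop")

    def close(self):
        self.closed = True  # drops the C-side ref_count in the real envs

def rollout(vecenv, steps=10):
    try:
        for _ in range(steps):
            vecenv.step()
    finally:
        vecenv.close()  # runs on normal exit and on exception alike
```

Because the exception is never caught, no `raise exc` round-trip is needed and the full stack from inside step() reaches the caller.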

Comment thread: pufferlib/ocean/drive/rollout.py (Outdated)
Comment on lines +88 to +92
with torch.no_grad():
    ob_t = torch.as_tensor(obs).to(device)
    logits, _ = policy.forward_eval(ob_t, state)
    action, _, _ = pufferlib.pytorch.sample_logits(logits)
    action_np = action.cpu().numpy().reshape(env.action_space.shape)

Copilot AI Apr 23, 2026

rollout_loop currently samples actions with pufferlib.pytorch.sample_logits(logits) (stochastic). The existing evaluation / render paths in pufferl.py use deterministic sampling for metrics and reproducible renders. To keep this shared rollout loop consistent and reproducible, consider defaulting to deterministic sampling (or adding a deterministic parameter and threading it through).
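One way to thread such a flag through, as a NumPy sketch (select_actions and its parameters are hypothetical, not the real pufferlib.pytorch.sample_logits signature):

```python
import numpy as np

def select_actions(logits, deterministic=True, rng=None):
    """Pick actions from unnormalized logits of shape [batch, num_actions]."""
    logits = np.asarray(logits, dtype=np.float64)
    if deterministic:
        # Greedy argmax: reproducible eval metrics and renders.
        return logits.argmax(axis=-1)
    # Gumbel-max trick: equivalent to sampling from softmax(logits).
    rng = rng or np.random.default_rng()
    return (logits + rng.gumbel(size=logits.shape)).argmax(axis=-1)
```

Defaulting the flag to deterministic keeps the shared rollout loop consistent with the existing eval/render paths while still allowing stochastic sampling when wanted.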

num_eval_scenarios=self.current_num_eval_scenarios, # Use the dynamic size here
road_obs_front_dist=self.road_obs_front_dist,
road_obs_behind_dist=self.road_obs_behind_dist,
road_obs_side_dist=self.road_obs_side_dist,

Copilot AI Apr 23, 2026

The resample-time call to binding.shared() inside Drive.step() does not pass goal_radius, but binding.shared/my_shared unpacks goal_radius as a required kwarg. This will raise a TypeError on the first resample. Pass goal_radius=self.goal_radius here, consistent with the initial call in __init__.

Suggested change:

road_obs_side_dist=self.road_obs_side_dist,
goal_radius=self.goal_radius,
