
Interactive seg and stereo docs #1726

Merged
BryonLewis merged 46 commits into main from interactive-seg-and-stereo-docs
Jun 27, 2026


@BryonLewis

Collaborator

Docs update I forgot to add to #1582

mattdawkins and others added 30 commits June 22, 2026 12:03
Brings in the SAM2/SAM3-based interactive segmentation feature, the
SAM3 text-query workflow, and the desktop interactive stereo mode.
Web-girder paths are intentionally untouched for now — web support
will come in a follow-up.

- New segmentation point-click recipe + EditorMenu wiring; SAM2/SAM3
  models loaded via VIAME install configs.
- Desktop backend: viame_segmentation_service-backed IPC handlers and
  matching frontend API for segmentationInitialize/Predict/SetImage/
  ClearImage/Shutdown/IsReady, textQuery/refineDetections/
  runTextQueryPipeline, and stereoEnable/Disable/SetFrame/GetStatus/
  TransferLine/TransferPoints/SetCalibration/IsEnabled, plus disparity
  ready/error event hooks.
- EditAnnotationLayer: track shift-key state and right-click for Point
  mode, propagate background flag for negative SAM points.
- Sidebar / ViewerLoader / Viewer: stereo annotation mode UI, error
  dialog when seg or text-query model fails to load,
  dot-only-on-source-frame fix.
- useModeManager / EditAnnotationLayer / recipes: keep existing geometry
  type when current editing mode already matches; right-click in Point
  creation finalises and deselects.

A track-frame's polygon now expands to a list of polygons each with
their own keys, and each polygon supports holes.

- Server CSV (de)serializer: emit polygon-key column per polygon, support
  holes in the geoJSON FeatureCollection; auto_key path to append a new
  polygon to an existing track frame.
- Client recipes / useModeManager: handleAddHole / handleAddPolygon /
  handleCancelCreation; PolygonLayer emits polygon-clicked.
- Hole drawing reuses the polygon edit pipeline (left-click places a
  hole vertex without exiting creation mode).
- Test fixtures cover multi-polygon and polygons-with-holes round-trip.
(cherry picked from commit c2f3cd0)
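
To make the new shape concrete, here is a minimal sketch of one track frame carrying two keyed polygons, one of them with a hole. GeoJSON itself defines a Polygon's first ring as the exterior and any later rings as holes; the `key` property name, the `KeyedPolygon` type, and the `toFeatureCollection` helper are illustrative assumptions, not the serializer's actual schema.

```typescript
// A ring is a closed list of [x, y] positions (first equals last).
type Ring = [number, number][];

interface KeyedPolygon {
  key: string; // hypothetical per-polygon key
  exterior: Ring;
  holes: Ring[];
}

interface PolygonFeature {
  type: 'Feature';
  properties: { key: string };
  geometry: { type: 'Polygon'; coordinates: Ring[] };
}

// Build a FeatureCollection with one Polygon feature per keyed polygon.
function toFeatureCollection(polygons: KeyedPolygon[]): {
  type: 'FeatureCollection';
  features: PolygonFeature[];
} {
  const features = polygons.map((p): PolygonFeature => ({
    type: 'Feature',
    properties: { key: p.key },
    // GeoJSON order: exterior ring first, then any holes.
    geometry: { type: 'Polygon', coordinates: [p.exterior, ...p.holes] },
  }));
  return { type: 'FeatureCollection', features };
}

// Example data (made up for illustration).
const frameGeometry = toFeatureCollection([
  {
    key: 'body',
    exterior: [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],
    holes: [[[4, 4], [6, 4], [6, 6], [4, 6], [4, 4]]],
  },
  { key: 'fin', exterior: [[12, 0], [15, 0], [15, 3], [12, 0]], holes: [] },
]);
```

Keeping each polygon as its own feature lets every polygon carry its own key, while holes stay attached to the polygon they cut into.
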
…tton)

Strip the SAM3 text-query button, dialog, API, and IPC handlers from the interactive editor, keeping segmentation and stereo intact. The full text-query feature lives on the follow-up branch dev/text-query-annot-button.

Strip the no-transcode NativeVideoAnnotator path that should not be on the
segmentation/stereo branch: removes the residual Viewer.vue async-component,
nativeVideoPath plumbing, and template branch, plus the stale settings field.

The rebase onto main dropped the opening <template> tag, so vite parsed the
root <div> as a custom block and the electron build failed. Restore it.

In continuous Detection mode, each interactive-segmentation point click now finalizes its own detection and immediately starts a fresh one, instead of refining a single detection. Non-continuous mode is unchanged: clicks still accumulate to refine one detection until confirmed. Frame-navigation preview restores are excluded so they don't spuriously create detections.

Capture the completed track ID before newTrackSettingsAfterLogic, which
in continuous detection mode spawns a new track and changes
selectedTrackId, so stereo annotation-complete events attach to the
correct detection. Also re-activate the segmentation recipe when
re-editing a finalized Point detection so clicks resume predicting.

Ported from viame/master (2cad9aa, 4008a1f).
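
The ordering issue above can be sketched as follows. This is a toy model, not the real handler: `EditorState`, `spawnNextTrack` (standing in for newTrackSettingsAfterLogic), and `completeDetection` are hypothetical names used only to show why the ID has to be read first.

```typescript
interface EditorState {
  selectedTrackId: number | null;
  nextTrackId: number;
}

// Hypothetical stand-in for newTrackSettingsAfterLogic: in continuous
// mode it spawns a new track and moves the selection onto it.
function spawnNextTrack(state: EditorState, continuous: boolean): void {
  if (continuous) {
    state.selectedTrackId = state.nextTrackId;
    state.nextTrackId += 1;
  }
}

function completeDetection(
  state: EditorState,
  continuous: boolean,
  onStereoComplete: (trackId: number) => void,
): void {
  // Capture the ID before the selection can change underneath us.
  const completedTrackId = state.selectedTrackId;
  spawnNextTrack(state, continuous);
  if (completedTrackId !== null) {
    onStereoComplete(completedTrackId);
  }
}
```

Reading `state.selectedTrackId` after `spawnNextTrack` would report the freshly spawned track instead of the one that was just completed.
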
Add a _removeIfEmpty helper and call it from selectTrack when leaving
edit mode, so a detection created but never drawn (e.g. clicked away or
right-click deselect) doesn't linger as an empty track. handleEscapeMode
now reuses the same helper.

Ported from viame/master (4139f78).
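
A minimal sketch of what such a helper might look like. The `Track` shape, the `hasGeometry` flag, and the definition of "empty" (no feature carries geometry) are assumptions for illustration; the real _removeIfEmpty may test something different.

```typescript
interface Feature {
  frame: number;
  hasGeometry: boolean; // assumed flag: true once something was drawn
}

interface Track {
  id: number;
  features: Feature[];
}

type TrackStore = { [id: number]: Track };

// Delete the track if nothing was ever drawn on it.
// Returns true when the track was removed.
function removeIfEmpty(tracks: TrackStore, trackId: number): boolean {
  const track = tracks[trackId];
  if (!track) return false;
  const isEmpty = track.features.every((f) => !f.hasGeometry);
  if (isEmpty) delete tracks[trackId];
  return isEmpty;
}
```

Calling the same helper from both the deselect path and the escape path keeps the two exits from edit mode consistent.
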
handleConfirmRecipe now returns early unless an active segmentation
recipe has a pending prediction or was explicitly reset, so the
contextmenu event from a right-click that enters Point edit mode no
longer immediately deselects/deactivates before any points are placed.
Adds a wasReset flag on the recipe to allow finalizing after a reset.

Ported from viame/master (c4d149c, 69bc0f8).
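
The early-return condition can be read as a small predicate. The sketch below is an assumption about its shape: `SegmentationRecipeState`, `hasPendingPrediction`, and `shouldConfirm` are hypothetical names; only `wasReset` comes from the commit message.

```typescript
interface SegmentationRecipeState {
  active: boolean;
  hasPendingPrediction: boolean; // assumed name
  wasReset: boolean;
}

// True only when confirming would actually finalize something.
function shouldConfirm(recipe: SegmentationRecipeState | null): boolean {
  if (recipe === null || !recipe.active) return false;
  return recipe.hasPendingPrediction || recipe.wasReset;
}
```

With this guard, the contextmenu event that arrives right after entering Point edit mode sees no pending prediction and no reset, so it does nothing.
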
Wire the edit layer's finalizeInProgress through a handler callback that
handleAddTrackOrDetection invokes, so pressing 'n' or starting a new
detection commits a valid in-progress polygon (or discards it) instead
of leaving it dangling. Also commit a pending segmentation prediction
before the track switch so a reset-on-deselect doesn't leave an empty
detection.

Ported from viame/master (842a20c, dea9653).
In continuous mode a background (negative) click is a refinement of the
current mask, not a new object, so it should no longer commit and start a
fresh detection.
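
Taken together with the earlier continuous-mode commit, the decision of whether a prediction commits and starts a new detection might be summarized like this. The `PredictionContext` fields and the function name are hypothetical; the three conditions are the ones stated in the commit messages.

```typescript
interface PredictionContext {
  continuous: boolean;      // continuous Detection mode is on
  background: boolean;      // negative (background) click
  previewRestore: boolean;  // prediction came from a frame-navigation restore
}

// True when the click should finalize the current detection and begin
// a fresh one; false when it only refines the current mask.
function commitsAndStartsNew(ctx: PredictionContext): boolean {
  return ctx.continuous && !ctx.background && !ctx.previewRestore;
}
```
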
The reset button only restored a default-key ('') polygon, so resetting a
detection whose existing polygon was segmentation-keyed removed it
entirely. Capture the pre-existing polygon keys in the snapshot and
restore all original polygon geometry, removing only segmentation-added
polygons.

Ported from viame/master.
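
One way to picture the snapshot-and-restore behaviour is sketched below. It assumes polygons are stored by key in a plain object and that the recipe knows which keys it wrote (`segmentationKeys`); both are illustrative assumptions, as are the function names.

```typescript
type Ring = [number, number][];
type PolygonsByKey = { [key: string]: Ring };

// Deep-copy the polygons so later edits cannot mutate the snapshot.
function snapshotPolygons(current: PolygonsByKey): PolygonsByKey {
  const copy: PolygonsByKey = {};
  Object.keys(current).forEach((key) => {
    copy[key] = current[key].map((p): [number, number] => [p[0], p[1]]);
  });
  return copy;
}

// Restore every polygon that existed before segmentation started and
// drop only polygons that segmentation itself added.
function resetPolygons(
  current: PolygonsByKey,
  snapshot: PolygonsByKey,
  segmentationKeys: string[],
): PolygonsByKey {
  const restored = snapshotPolygons(current);
  segmentationKeys.forEach((key) => {
    if (!(key in snapshot)) delete restored[key];
  });
  const original = snapshotPolygons(snapshot);
  Object.keys(original).forEach((key) => {
    restored[key] = original[key];
  });
  return restored;
}
```

Because restoration is driven by the snapshot's keys rather than by the default key alone, a pre-existing segmentation-keyed polygon survives the reset.
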
ViewerLoader already builds a getFrameTime (frame/fps) and the backend
already forwards frame_time to the service, but the recipe ignored it and
never set frameTime on the predict request, so interactive segmentation
on video datasets couldn't seek to the current frame. Accept getFrameTime
in the recipe and include frameTime in the request.

Ported from viame/master (23ccb25, c363531).
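
A rough sketch of the plumbing, assuming a request object with `points` and `pointLabels` fields (hypothetical names; only `frameTime` and `getFrameTime` appear in the commit message). The frame-to-seconds conversion is the frame/fps rule mentioned above.

```typescript
interface PredictRequest {
  points: [number, number][]; // assumed field
  pointLabels: number[];      // assumed field: 1 = foreground, 0 = background
  frameTime?: number;         // seconds into the video, when known
}

// frame / fps, as built by the loader.
function makeGetFrameTime(fps: number): (frame: number) => number {
  return (frame) => frame / fps;
}

function buildPredictRequest(
  points: [number, number][],
  pointLabels: number[],
  frame: number,
  getFrameTime?: (frame: number) => number,
): PredictRequest {
  const request: PredictRequest = { points, pointLabels };
  // Image-sequence datasets may not supply getFrameTime; leave it unset.
  if (getFrameTime) request.frameTime = getFrameTime(frame);
  return request;
}
```
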
BryonLewis and others added 16 commits June 25, 2026 08:53
- Await set_frame (ensureStereoFrame) before transfer in the draw handler,
  so drawing on the frame stereo was enabled on no longer stalls in the
  backend's 120s deferred-disparity wait. Factor the duplicated set-frame
  logic (enable kickoff + frame watcher) into the one helper.
- Use renderer-safe path helpers instead of npath.* (node 'path' is
  externalized under contextIsolation -> "npath is not defined").
- Declare the missing stereoCameraFps ref ("stereoCameraFps is not defined").
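
The first bullet's "await set_frame before transfer" idea can be sketched as a single shared helper. Everything here is an assumption about structure: `createEnsureStereoFrame`, the `setFrame` and `transfer` callbacks, and `drawHandler` are hypothetical stand-ins for the real IPC calls.

```typescript
// Returns a helper that makes sure the backend has the given frame set,
// reusing the in-flight request when the same frame is asked for again.
function createEnsureStereoFrame(
  setFrame: (frame: number) => Promise<void>,
): (frame: number) => Promise<void> {
  let pendingFrame: number | null = null;
  let pending: Promise<void> | null = null;
  return function ensureStereoFrame(frame: number): Promise<void> {
    if (pending !== null && pendingFrame === frame) return pending;
    const request = setFrame(frame);
    pendingFrame = frame;
    pending = request;
    // Forget a failed request so a later call can retry.
    request.catch(() => {
      if (pending === request) {
        pending = null;
        pendingFrame = null;
      }
    });
    return request;
  };
}

// The draw handler waits for the frame before asking for a transfer.
async function drawHandler<T>(
  frame: number,
  ensureStereoFrame: (frame: number) => Promise<void>,
  transfer: () => Promise<T>,
): Promise<T> {
  await ensureStereoFrame(frame);
  return transfer();
}
```

Routing both the enable kickoff and the frame watcher through one helper means the transfer never races a set-frame call that is still in flight.
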
Port the warped-line fixes from viame/master so the line transferred to
the second camera is a normal line-mode-editable annotation:

- Preserve the source line's key through the transfer (key: params.key
  instead of '') and thread it through StereoAnnotationCompleteParams.
- Emit head/tail Point markers alongside the LineString so endpoint
  handles render and can be dragged.
- Expand the warped bounds by 10% to match the source side (headtail.ts).
- Preserve editing mode when left-clicking onto a camera that already has
  the selected track (the warped annotation), so it can be adjusted
  immediately.
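
As an illustration of the first two bullets, the transferred annotation might be emitted as three features: the keyed LineString plus two Point markers. The `'head'` and `'tail'` point keys, the `KeyedFeature` type, and `buildWarpedLine` are assumptions for this sketch; the `'HeadTails'` line key is the HeadTailLineKey value named elsewhere in this PR.

```typescript
type Position = [number, number];

interface KeyedFeature {
  type: 'Feature';
  properties: { key: string };
  geometry:
    | { type: 'LineString'; coordinates: Position[] }
    | { type: 'Point'; coordinates: Position };
}

// Build the line plus endpoint markers for the second camera.
function buildWarpedLine(head: Position, tail: Position, lineKey: string): KeyedFeature[] {
  return [
    {
      type: 'Feature',
      properties: { key: lineKey }, // preserved from the source line
      geometry: { type: 'LineString', coordinates: [head, tail] },
    },
    { type: 'Feature', properties: { key: 'head' }, geometry: { type: 'Point', coordinates: head } },
    { type: 'Feature', properties: { key: 'tail' }, geometry: { type: 'Point', coordinates: tail } },
  ];
}

const warped = buildWarpedLine([100, 40], [180, 44], 'HeadTails');
```
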
In interactive stereo, only the user may modify a line a human authored:

- Mark a camera's line human-authored when the user draws/edits it (the
  stereo warp writes geometry directly and never fires this event, so the
  event firing always means a human edit).
- Warp source -> other only when the other side is absent or still
  machine-generated. Once the other side has been hand-edited it is frozen;
  further edits on the first camera no longer overwrite it (and vice versa).
- Length keeps tracking the shifting geometry, except when length_method is
  'user_set' (a new detection attribute the user can set to lock a length);
  the stereo update then leaves that length alone while still refreshing
  range/midpoint. Auto-computed lengths record length_method = 'stereo'.
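
The ownership and length rules above reduce to two small checks. In this sketch `length_method`, `'user_set'`, and `'stereo'` come from the commit message; the `humanAuthored` flag name, the attribute container, and the 3-component midpoint are assumptions.

```typescript
type LengthMethod = 'stereo' | 'user_set';

interface StereoLine {
  humanAuthored: boolean; // assumed flag set when the user draws or edits
}

interface StereoAttributes {
  length?: number;
  length_method?: LengthMethod;
  range?: number;
  midpoint?: number[];
}

interface StereoMeasurement {
  length: number;
  range: number;
  midpoint: number[];
}

// Warp only onto a side that is absent or still machine-generated.
function mayWarpOnto(other: StereoLine | undefined): boolean {
  return other === undefined || !other.humanAuthored;
}

// Always refresh range/midpoint; leave a user-locked length alone.
function applyMeasurement(attrs: StereoAttributes, m: StereoMeasurement): StereoAttributes {
  const next: StereoAttributes = { ...attrs, range: m.range, midpoint: m.midpoint };
  if (attrs.length_method !== 'user_set') {
    next.length = m.length;
    next.length_method = 'stereo';
  }
  return next;
}
```
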
A near-horizontal or near-vertical line previously produced a razor-thin box. After
the usual 10% expansion, grow the shorter side about its center until the
longer:shorter ratio is at most 6:1. Applied to both the drawn box
(headtail.ts tightBoundsExpanded) and the stereo-warped box (ViewerLoader).
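
The two-step rule can be sketched as below. This is not the code in headtail.ts: the `[x1, y1, x2, y2]` layout, the function name, and the reading of "10% expansion" as growing each dimension by 10% about the center are assumptions.

```typescript
type Bounds = [number, number, number, number]; // [x1, y1, x2, y2]

function expandWithMaxAspect(b: Bounds, expansion = 0.1, maxRatio = 6): Bounds {
  const cx = (b[0] + b[2]) / 2;
  const cy = (b[1] + b[3]) / 2;
  // Step 1: the usual expansion, applied about the center.
  let width = (b[2] - b[0]) * (1 + expansion);
  let height = (b[3] - b[1]) * (1 + expansion);
  // Step 2: grow the shorter side until longer:shorter <= maxRatio.
  const minShort = Math.max(width, height) / maxRatio;
  if (width < minShort) width = minShort;
  if (height < minShort) height = minShort;
  return [cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2];
}
```

For a perfectly horizontal 120-pixel line, step 1 gives a width of about 132 and a height of 0, and step 2 raises the height to about 22, centered on the line.
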
Clicking a detection in a camera that isn't selected used to be ignored
(LayerManager Clicked early-returned), so it took one click to switch
cameras and another to act on the detection. Now that click switches to the
clicked camera and acts on the detection in the same click: left-click
selects it, right-click edits it. Select-then-edit keeps the result
deterministic, and it bails if a mode (e.g. linking) blocked the switch.

When creating a new detection, the creation cursor is now live on every
camera (not just the selected one), and a draw is routed to whichever camera
it lands on (switch + materialize the new track there). Works for all
creation types.

- LayerManager: enable the edit layer in creation mode on non-selected
  cameras (isCreatingNewDetection); route the drawn shape to the drawn-on
  camera in the update:geojson handler; suppress the select that rides the
  click which finalizes a shape, and the first-corner click in the
  cross-camera branch, so an overlapping detection isn't grabbed instead.
- Viewer: don't intercept camera-view mousedown mid-creation (it would
  preventDefault the rectangle drag); let the draw land and route.

Known limitation: a line's 2nd vertex landing inside an existing detection
still selects that detection (event-ordering quirk); accepted for now.

applyStereoLine stored the stereo/segmentation-created LineString under an
empty key, so it wasn't recognized by the HeadTail recipe and edit-layer like
hand-drawn lines. Store it under HeadTailLineKey ('HeadTails') instead, matching
hand-drawn head/tail lines.

(The controller-init guard from the source commit is already covered here by the
getViewerFrame() helper, so only the keying fix is ported.)

Replace the single 'Interactive Mode' stereo toggle with two independent Stereo
Settings controls: 'Update lengths when modified' (on by default) recomputes the
stereo measurement when a linked line is modified, and 'Auto-compute location on
other camera' (off by default) warps an annotation to the other camera when it
has no detection there yet. The backend service starts whenever either feature
is on, and the load-time auto-enable degrades silently on failure.
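
A small sketch of how the two settings could drive the service lifecycle. The setting names, the `enable` callback, and `autoEnableOnLoad` are hypothetical; the defaults (on / off) and the "either feature" rule are taken from the description above.

```typescript
interface StereoSettings {
  updateLengthsWhenModified: boolean; // 'Update lengths when modified'
  autoComputeOtherCamera: boolean;    // 'Auto-compute location on other camera'
}

const defaultStereoSettings: StereoSettings = {
  updateLengthsWhenModified: true,
  autoComputeOtherCamera: false,
};

// The backend service is needed whenever either feature is on.
function stereoServiceNeeded(settings: StereoSettings): boolean {
  return settings.updateLengthsWhenModified || settings.autoComputeOtherCamera;
}

// Load-time auto-enable: a failure is swallowed rather than surfaced.
async function autoEnableOnLoad(
  settings: StereoSettings,
  enable: () => Promise<void>,
): Promise<boolean> {
  if (!stereoServiceNeeded(settings)) return false;
  try {
    await enable();
    return true;
  } catch (err) {
    return false; // degrade silently; the user can enable it manually later
  }
}
```
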
BryonLewis merged commit 2ca4f59 into main Jun 27, 2026
3 checks passed
BryonLewis deleted the interactive-seg-and-stereo-docs branch June 27, 2026 16:11

2 participants