fix: validate /api/set-workspace path — reject traversal, symlink escape, non-existent (closes #15) by timon0305 · Pull Request #22 · cppalliance/cppa-cursor-browser

timon0305 · 2026-05-07T21:57:17Z

Problem

POST /api/set-workspace accepted any string in body.path, did ~/ expansion, and stored the result in a module-global consumed by every subsequent file lookup. A client (anyone who reaches the dashboard, including a hostile JS payload — see related XSS issue #11) could post { "path": "/etc" } or { "path": "/home/user/.ssh" } and the app would then serve those files as Cursor data. Symlink escape (e.g. /tmp/cursor-link → /) bypassed any naive startswith check.

Change

utils/path_validation.py (new) — validate_workspace_path(raw) with WorkspacePathError. os.path.realpath() collapses .. AND resolves symlinks in one step; rejects non-string/empty, non-existent, non-directory, no-Cursor-markers; returns the canonical real path so the override is stored canonically, not as the caller sent it. Lives outside api/ so tests don't pull Flask into scope (existing test-suite convention).
api/config_api.py — handler calls the validator. Returns 400 { error: "<reason>" } on rejection (was silently 200); on accept returns 200 { success: true, path: "<canonical>" }.
tests/test_workspace_path_validation.py (new) — 11 regression cases.

Test plan

Unit (python3 -m unittest discover tests): 148/148 OK (was 137; +11 new).

Live HTTP smoke against the running app, all 9 cases:

Case	Payload	Expected	Got
A	empty body	`400 path is required`	✅
B	empty path string	`400 path is required`	✅
C	non-existent path	`400 path does not exist`	✅
D	file `/etc/passwd`	`400 path is not a directory`	✅
E	dir `/etc` (no markers)	`400 no Cursor workspaceStorage`	✅
F	traversal `…/storage/../..` lands at `/tmp`	`400` (`..` collapsed, then markers reject)	✅
G	symlink → `/`	`400` (realpath unmasked it)	✅
H	real fake workspace (`state.vscdb` in subdir)	`200 { path: "<canonical>", success: true }`	✅
I	symlink → real workspace	`200 { path: "<realpath>" }` (canonicalised — input was the symlink path)	✅

Case I is the meaningful one: even though the operator sent …/good-link, the override stored is the realpath. Every subsequent read goes through canonical, not the link — a future swap of the link target can't redirect reads.

Closes #15.

Summary by CodeRabbit

Improvements
- Stricter workspace path validation: rejects invalid/traversal inputs and non-workspace locations, resolves canonical real paths, and returns canonical path plus workspace count on validation.
- Setting a workspace now validates input and returns the canonical path on success; malformed/non-object JSON yields clear 400 errors.
Tests
- Added regression tests covering validation, canonicalization, symlink behavior, and API responses.
Documentation
- README clarified workspace-path validation and operator escape hatch.
Bug Fixes
- UI shows server-provided validation error messages.

…ape, non-existent (closes #15) POST /api/set-workspace accepted any string in body.path, ran tilde expansion, and stored it in a module-global override consumed by every subsequent file lookup. Anyone reaching the endpoint (including a hostile JS payload — see XSS issue #11) could repoint the app at /etc, /, ~/.ssh, or any other directory; the dashboard would then serve files from there as if they were Cursor chat data. Symlink-based escape (e.g. /tmp/cursor-link → /) bypassed any naive startswith-style check. New utils/path_validation.py with validate_workspace_path(raw): - Expands ~/. - os.path.realpath() — collapses `..` AND resolves symlinks in one step. Both classes of escape become equivalent to the canonical real path on disk; downstream marker check then operates on the truth. - Rejects with WorkspacePathError if the path doesn't exist, isn't a directory, or contains no immediate subdirectory with a state.vscdb marker (the same heuristic /api/validate-path already uses). - Returns the canonical real path so the override is stored in canonical form, not whatever the caller sent. api/config_api.py:set_workspace now calls the validator and returns 400 { error: "<reason>" } on rejection (was silently 200), and stores the canonical path on success: 200 { success: true, path: "<canonical>" }. Lives outside api/ so the test suite can import without Flask in scope (tests/test_cli_args.py convention). Tests: tests/test_workspace_path_validation.py — 11 cases covering: - happy path with marker file present - canonical path returned (`..` collapsed) - empty / whitespace / non-string rejected - non-existent / file-not-directory rejected - directory without Cursor markers rejected - traversal lands outside workspace → rejected on markers - symlink → / rejected on markers (POSIX-only) - symlink → real workspace canonicalised + accepted (POSIX-only) Full suite: 148/148 OK (was 137; +11 new). Live HTTP smoke against the running app verified all 9 documented behaviours (200 with canonical path on accept; 400 with the documented reason on each reject).

…n PR #16) Two CodeRabbit follow-ups: 1. api/config_api.py:set_workspace — when get_json(silent=True) returned a non-dict (JSON array, string, number, null), the truthy fallback `or {}` was bypassed and `body.get("path", "")` raised AttributeError, which the outer Exception handler mis-reported as a 500 server error. Added isinstance(body, dict) guard that returns 400 { error: "request body must be a JSON object" } directly. Diverged from CodeRabbit's literal patch in one way: they had it raise WorkspacePathError("path is required"), but the actual problem here is a malformed body shape — the error message should match the cause so the client can fix it. 2. tests/test_workspace_path_validation.py — the traversal test escaped to /tmp itself, which is shared and could be flipped by any other test / process creating <something>/state.vscdb there. Pinned the escape target to an isolated root inside self.tmp. Also added 4 API-layer regression tests (TestSetWorkspaceApi) using Flask test_client: JSON array / string / number return 400 (not 500), plus a sanity end-to-end with a valid {path} body returning 200 and the canonical realpath. Full suite: 152/152 OK (was 148; +4 new API tests).

coderabbitai · 2026-05-07T21:57:28Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 50c706cf-cd7a-4bd6-a98a-d2ad42b0a74b

📥 Commits

Reviewing files that changed from the base of the PR and between 2aa40b7 and f3ac9b9.

📒 Files selected for processing (2)

api/config_api.py
tests/test_workspace_path_validation.py

🚧 Files skipped from review as they are similar to previous changes (2)

api/config_api.py
tests/test_workspace_path_validation.py

📝 Walkthrough

Walkthrough

Adds a realpath-based workspace path validator with Cursor marker detection, integrates it into POST /api/set-workspace and POST /api/validate-path to return canonical paths and map validation failures to HTTP 400, updates docs/templates, and adds tests covering validator and endpoint behavior.

Changes

Workspace Path Validation & API Integration

Layer / File(s)	Summary
Path Validation Contract `utils/path_validation.py`	Introduces module docs and `WorkspacePathError` used to surface validation failures as client-facing errors.
Path Validation Logic `utils/path_validation.py`	Implements `_has_cursor_workspace_markers()` and `validate_workspace_path()` to expand `~`, canonicalize via `os.path.realpath`, verify existence and directory-ness, enforce Cursor workspace markers, and return the canonical real path or raise `WorkspacePathError`.
API Endpoint Integration `api/config_api.py`	Imports validator and exception; `/api/validate-path` and `/api/set-workspace` now call `validate_workspace_path()`, `/api/set-workspace` requires JSON object bodies (400 on non-dict), maps `WorkspacePathError` to 400 (or `valid:false` for validate-path), and returns `{"success": True, "path": <canonical>}` or `{"valid": True, "workspaceCount": N, "path": <canonical>}` on success.
Tests & Coverage `tests/test_workspace_path_validation.py`	Adds helper to create Cursor-like workspace markers; unit tests cover valid inputs, canonicalization, traversal and symlink scenarios; API tests verify 400 for non-dict bodies and 200 with `success` + canonical `path` for valid dict payloads; validate-path API returns `valid` plus `workspaceCount` and canonical `path`.
Docs & Template `README.md`, `utils/workspace_path.py`, `templates/config.html`	README documents API validation behavior; `resolve_workspace_path()` docstring expanded; `config.html` now displays `data.error` on validation failure when present.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related issues

Security: validate /api/set-workspace path (reject traversal, symlink escape, non-existent paths) #15 — Adds validate_workspace_path, realpath+marker checks, and updates POST /api/set-workspace and /api/validate-path to use it.

Poem

🐰 I hopped down a path both tangled and neat,

Realpath smoothed the turns beneath my feet.
Markers lit burrows where workspaces hide,
Canonical maps keep the rabbit inside.
Hooray—safe paths and carrots for guide.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 19.23% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately reflects the main change: adding validation to /api/set-workspace to reject traversal, symlinks, and non-existent paths, with reference to the closed issue `#15`.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/set-workspace-path-validation-15

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/test_workspace_path_validation.py (1)

161-187: ⚡ Quick win

Add explicit JSON null body regression case.

You already pin array/string/number non-dict bodies; adding null closes the same class completely and protects the shape-guard contract.

Suggested test

+    def test_non_dict_json_null_returns_400(self):
+        resp = self.client.post(
+            "/api/set-workspace",
+            data="null",
+            content_type="application/json",
+        )
+        self.assertEqual(resp.status_code, 400)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_workspace_path_validation.py` around lines 161 - 187, Add a
regression test to assert that a JSON null body returns 400 like other non-dict
bodies: create a test function (e.g., test_non_dict_json_null_returns_400) in
tests/test_workspace_path_validation.py mirroring the existing tests
(test_non_dict_json_array_returns_400_not_500,
test_non_dict_json_string_returns_400, test_non_dict_json_number_returns_400)
that posts data="null" with content_type="application/json" to
"/api/set-workspace" and asserts resp.status_code == 400 (and optionally that
resp.get_json() contains an "error" key to match the array case).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/config_api.py`:
- Around line 93-100: The call to set_workspace_path_override(canonical) can
raise and currently falls outside the validate_workspace_path() try/except,
causing an unstructured framework 500; wrap
set_workspace_path_override(canonical) in its own try/except that catches
Exception (or specific persistence errors if available) and returns a JSON error
response (e.g., jsonify({"error": "Failed to persist workspace path", "details":
str(e)}), 500), while optionally logging the exception; keep existing handling
for WorkspacePathError from validate_workspace_path() intact.

---

Nitpick comments:
In `@tests/test_workspace_path_validation.py`:
- Around line 161-187: Add a regression test to assert that a JSON null body
returns 400 like other non-dict bodies: create a test function (e.g.,
test_non_dict_json_null_returns_400) in tests/test_workspace_path_validation.py
mirroring the existing tests (test_non_dict_json_array_returns_400_not_500,
test_non_dict_json_string_returns_400, test_non_dict_json_number_returns_400)
that posts data="null" with content_type="application/json" to
"/api/set-workspace" and asserts resp.status_code == 400 (and optionally that
resp.get_json() contains an "error" key to match the array case).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7a3c5e99-e2df-4ce1-a18a-18556c9b9dfe

📥 Commits

Reviewing files that changed from the base of the PR and between f8b3cb3 and a63297f.

📒 Files selected for processing (3)

api/config_api.py
tests/test_workspace_path_validation.py
utils/path_validation.py

…deRabbit on PR #22) validate_workspace_path() failures were already returning structured JSON, but set_workspace_path_override(canonical) was outside the try block — a persistence failure would have surfaced as Flask's HTML 500 page instead of {"error": "..."}. Wraps the call in its own try/except so the response shape stays structured for any consumer parsing JSON.

… API tests Two test-hygiene follow-ups from a structured re-review of PR #22; no production code changes. 1. tests/test_workspace_path_validation.py — test_returns_canonical_path_collapsing_dotdot The canonical-path assertion was a substring check against `..` on the raw realpath string. That would spuriously fail if the OS-supplied tempdir name ever embedded `..` in a folder component — substring presence is the wrong primitive for the invariant we actually care about, which is "no `..` *segment* in the canonical path." Switched to `Path(result).parts`, which handles `\` vs `/` natively and asserts on path components. 2. tests/test_workspace_path_validation.py — TestSetWorkspaceApi.setUp The 200-path test mutates the module-global _workspace_path_override in utils/workspace_path.py via the API, and the tempdir it then points at is rmtree'd by the existing cleanup. Without an explicit reset, the global stays pinned at a now-deleted path across tests. Added addCleanup(set_workspace_path_override, None) so any future sibling test inspecting the override sees a clean None. Full suite: 152/152 OK (skipped=2 POSIX-only symlink tests on Windows). No behaviour change; ReadLints clean. Co-authored-by: Cursor <cursoragent@cursor.com>

bradjin8

utils/workspace_path.py 61–67

WORKSPACE_PATH env is still applied in resolve_workspace_path() without the same checks as the API. That is a normal “trusted operator” escape hatch but differs from XSS-driven POST abuse in #15. Optional: document in README, or apply the same validator when the env is read (stricter).

utils/path_validation.py 30–48

Marker heuristic only inspects immediate children for state.vscdb. Matches /api/validate-path and the issue; if Cursor ever nests storage differently, both endpoints could false-negative. Acceptable as stated.

- POST /api/validate-path now uses the same realpath + marker checks as set-workspace; returns canonical path and structured errors on failure. - README documents WORKSPACE_PATH as trusted-operator tilde expansion only. - Config page shows server error text when validation fails. - Docstrings + symlink-test CI note; TOCTOU comment after realpath. Co-authored-by: Cursor <cursoragent@cursor.com>

….com/cppalliance/cppa-cursor-browser into fix/set-workspace-path-validation-15

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

templates/config.html (1)

47-61: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't treat /api/set-workspace failures as a successful save.

Both save flows ignore the /api/set-workspace response. Because fetch() does not throw on HTTP 400/500, the page can still update localStorage, show success, and redirect even when the server rejected or failed to persist the override. Please gate the success path on response.ok/success, and store the canonical path returned by the API instead of the raw input.

Proposed fix for the manual save flow

     if (data.valid) {
-      localStorage.setItem('workspacePath', path);
-      await fetch('/api/set-workspace', {
+      const saveRes = await fetch('/api/set-workspace', {
         method: 'POST',
         headers: { 'Content-Type': 'application/json' },
         body: JSON.stringify({ path })
       });
+      const saveData = await saveRes.json();
+      if (!saveRes.ok || !saveData.success) {
+        statusEl.className = 'alert alert-danger';
+        statusEl.textContent = saveData.error || 'Failed to save configuration.';
+        statusEl.style.display = 'block';
+        return;
+      }
+      localStorage.setItem('workspacePath', saveData.path || data.path || path);
       statusEl.className = 'alert alert-success';
       statusEl.textContent = `Found ${data.workspaceCount} workspaces in the specified location`;
       statusEl.style.display = 'block';
       setTimeout(() => { window.location.href = '/'; }, 1000);

Apply the same response handling in the auto-detect branch at Lines 47-61.

Also applies to: 86-99

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@templates/config.html` around lines 47 - 61, The auto-detect branch currently
ignores the POST to '/api/set-workspace' and treats the operation as successful;
change the flow in the block handling 'detected' so you await and inspect the
response (call it e.g. setRes) from fetch('/api/set-workspace'), check setRes.ok
(or parse JSON and check a success flag), and only then update
localStorage.setItem('workspacePath', ...) / show success / redirect;
additionally parse the canonical path from the set-workspace response body and
store that canonical path (not the raw 'detected' input). Apply the same
response.ok + canonical-path handling to the manual save flow that also calls
'/api/set-workspace'.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/config_api.py`:
- Around line 55-60: The endpoint reads the JSON body with
request.get_json(silent=True) but doesn't guard against non-object JSON (e.g., a
string), so body.get("path", "") can raise; mirror the check used in
set_workspace(): after body = request.get_json(silent=True) or {}, verify
isinstance(body, dict) and if not return jsonify({"valid": False, "error":
"invalid JSON body", "workspaceCount": 0}); then use body.get("path", "") and
proceed to call validate_workspace_path(raw) and catch WorkspacePathError as
before to preserve the existing error response shape.

---

Outside diff comments:
In `@templates/config.html`:
- Around line 47-61: The auto-detect branch currently ignores the POST to
'/api/set-workspace' and treats the operation as successful; change the flow in
the block handling 'detected' so you await and inspect the response (call it
e.g. setRes) from fetch('/api/set-workspace'), check setRes.ok (or parse JSON
and check a success flag), and only then update
localStorage.setItem('workspacePath', ...) / show success / redirect;
additionally parse the canonical path from the set-workspace response body and
store that canonical path (not the raw 'detected' input). Apply the same
response.ok + canonical-path handling to the manual save flow that also calls
'/api/set-workspace'.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b77ae31b-364d-4422-acbe-56b463fc8095

📥 Commits

Reviewing files that changed from the base of the PR and between b7ba220 and 2aa40b7.

📒 Files selected for processing (6)

README.md
api/config_api.py
templates/config.html
tests/test_workspace_path_validation.py
utils/path_validation.py
utils/workspace_path.py

✅ Files skipped from review due to trivial changes (2)

README.md
utils/workspace_path.py

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_workspace_path_validation.py

Avoid AttributeError on truthy JSON scalars/arrays (same class as PR #16). Return valid:false + workspaceCount:0 to match validation error shape. Co-authored-by: Cursor <cursoragent@cursor.com>

timon0305 added 2 commits May 5, 2026 16:33

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

Comment thread api/config_api.py

timon0305 self-assigned this May 7, 2026

timon0305 requested a review from clean6378-max-it May 7, 2026 22:02

timon0305 requested review from bradjin8 and removed request for clean6378-max-it May 8, 2026 15:00

bradjin8 requested changes May 8, 2026

View reviewed changes

Comment thread tests/test_workspace_path_validation.py

Comment thread utils/path_validation.py

Comment thread api/config_api.py

bradjin8 and others added 2 commits May 8, 2026 16:07

Merge branch 'fix/set-workspace-path-validation-15' of https://github…

2aa40b7

….com/cppalliance/cppa-cursor-browser into fix/set-workspace-path-validation-15

bradjin8 approved these changes May 8, 2026

View reviewed changes

bradjin8 requested a review from wpak-ai May 8, 2026 20:12

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

Comment thread api/config_api.py

validate-path: reject non-object JSON before body.get

f3ac9b9

Avoid AttributeError on truthy JSON scalars/arrays (same class as PR #16). Return valid:false + workspaceCount:0 to match validation error shape. Co-authored-by: Cursor <cursoragent@cursor.com>

bradjin8 merged commit 95d3140 into master May 8, 2026
6 checks passed

bradjin8 deleted the fix/set-workspace-path-validation-15 branch May 8, 2026 20:18

coderabbitai Bot mentioned this pull request Jun 23, 2026

Fix/export workspace path override #114

Merged

10 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: validate /api/set-workspace path — reject traversal, symlink escape, non-existent (closes #15)#22

fix: validate /api/set-workspace path — reject traversal, symlink escape, non-existent (closes #15)#22
bradjin8 merged 7 commits into
masterfrom
fix/set-workspace-path-validation-15

timon0305 commented May 7, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 7, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

bradjin8 left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

timon0305 commented May 7, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Change

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

bradjin8 left a comment

Choose a reason for hiding this comment

utils/workspace_path.py 61–67

utils/path_validation.py 30–48

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

timon0305 commented May 7, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 7, 2026 •

edited

Loading