Skip to content

Preview/Opengrep#92

Open
andrescastellanos-coderabbit wants to merge 1 commit intomainfrom
Preview/Opengrep
Open

Preview/Opengrep#92
andrescastellanos-coderabbit wants to merge 1 commit intomainfrom
Preview/Opengrep

Conversation

@andrescastellanos-coderabbit
Copy link
Copy Markdown

@andrescastellanos-coderabbit andrescastellanos-coderabbit commented Apr 27, 2026

Screenshot 2026-04-27 at 9 20 55 AM

Demonstrates the Opengrep sandboxer tool. Custom inline ruleset (opengrep.yml) at repo root defines 12 rules across Python, JavaScript, Go, and Java targeting SQL injection, XSS, insecure deserialization (pickle and ObjectInputStream), command injection, eval/exec, weak crypto (MD5), and XXE. Each rule is paired with deliberate trigger code in opengrep/vuln_*.{py,js,go,java}. Repo-level .coderabbit.yml disables Semgrep (Opengrep's upstream), ast-grep, owasp-top10, and PMD to avoid shadowing. Opengrep does not auto-trigger without a config file present, so the opengrep.yml at root is what activates it.

Summary by CodeRabbit

  • New Features

    • Added OpenGrep security detection rule configuration with pattern matching rules across Python, JavaScript, Go, and Java.
    • Added sample implementations in multiple programming languages for testing and validation.
  • Configuration

    • Added code review tool configuration to manage review settings.

… ruleset, disable overlapping pattern scanners
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 27, 2026

📝 Walkthrough

Walkthrough

This PR adds vulnerability detection configuration and demonstration files: a CodeRabbit settings file disabling certain review tools, an OpenGrep rule definition covering SQL injection, deserialization, XSS, eval, command injection, weak hashing, and XXE patterns, and four vulnerable code samples in Go, Java, JavaScript, and Python.

Changes

Cohort / File(s) Summary
Configuration
.coderabbit.yml, opengrep.yml
CodeRabbit configuration disables semgrep, ast-grep, owasp-top10, and pmd review tools; OpenGrep rules define detectors for SQL injection via string concatenation, unsafe deserialization (pickle/ObjectInputStream), unsafe eval/exec, command injection (subprocess with shell=True, child_process.exec/execSync), XSS via innerHTML assignment, weak MD5 hashing, and XXE-vulnerable XML parsing.
Vulnerable Code Samples
opengrep/vuln_go.go, opengrep/vuln_java.java, opengrep/vuln_javascript.js, opengrep/vuln_python.py
Demonstration implementations across four languages exposing HTTP endpoints with intentional vulnerabilities: SQL injection via query concatenation, unsafe deserialization and XML parsing, eval/exec on user input, command injection via shell execution, and weak password hashing (MD5).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Four new tales of code gone wrong,
From Python's flask to Go so strong,
Injections lurk in query strings,
And eval brings such dangerous things—
But now OpenGrep shall sound the alarm! 🔔

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Preview/Opengrep' is vague and does not clearly convey the main purpose of the changeset despite being related to the content. Use a more descriptive title that explains the core purpose, such as 'Add Opengrep configuration with vulnerability detection rules' or 'Introduce Opengrep sandboxer with sample vulnerable code patterns'.
✅ Passed checks (3 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch Preview/Opengrep

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
opengrep/vuln_go.go (1)

15-29: SQL injection demo implemented correctly; ignored Scan error.

The SQL injection vulnerability using fmt.Sprintf is correctly demonstrated. However, line 26 ignores the error from rows.Scan(&id, &email). While this is demo code, handling scan errors is idiomatic Go.

Optional: handle Scan error
-		rows.Scan(&id, &email)
+		if err := rows.Scan(&id, &email); err != nil {
+			continue
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@opengrep/vuln_go.go` around lines 15 - 29, The loop in LookupUser ignores the
error returned by rows.Scan(&id, &email); update LookupUser to check the error
from rows.Scan inside the for rows.Next() loop and handle it (e.g., log or call
http.Error and return) so scan failures are not swallowed; ensure you also
handle any rows.Err() after the loop to catch iteration errors from the
sql.Rows.
opengrep.yml (1)

129-139: XXE rule may produce false positives in hardened codebases.

The java-xxe-vulnerable-parser rule flags any DocumentBuilderFactory.newInstance() or SAXParserFactory.newInstance() call. This will trigger even when the code subsequently configures XXE protections via setFeature(). Consider adding pattern-not clauses to exclude properly hardened patterns, or accept this as a "review required" warning.

For example, to reduce false positives when XXE features are disabled nearby:

# Could add pattern-not-inside to exclude hardened patterns
# pattern-not-inside: |
#   $FACTORY = DocumentBuilderFactory.newInstance()
#   ...
#   $FACTORY.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@opengrep.yml` around lines 129 - 139, The current java-xxe-vulnerable-parser
rule flags any usage of DocumentBuilderFactory.newInstance() or
SAXParserFactory.newInstance() which causes false positives when callers
immediately harden the factory with setFeature(..., true); update the rule to
add pattern-not or pattern-not-inside clauses that exclude factories which are
subsequently configured (e.g., references to
$FACTORY.setFeature("http://apache.org/xml/features/disallow-doctype-decl",
true) or other common XXE-protection features), so the rule still warns when
factories are used without protections but ignores instances where
DocumentBuilderFactory.newInstance() or SAXParserFactory.newInstance() are
followed by explicit setFeature hardening.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.coderabbit.yml:
- Around line 5-6: Remove the invalid "enabled" property from the ast-grep
configuration section and either delete the entire ast-grep block or replace it
with a supported property such as setting essential_rules: false to disable it;
ensure only valid keys remain (essential_rules, rule_dirs, util_dirs, packages)
so the config conforms to the schema with additionalProperties: false and will
not be rejected.

In `@opengrep/vuln_java.java`:
- Around line 45-54: The ObjectInputStream created in restoreSession is never
closed causing a resource leak; wrap creation of ObjectInputStream (and the
ByteArrayInputStream that uses payload) in a try-with-resources so they are
automatically closed, e.g., open ObjectInputStream in the try(...) block used to
readObject and set the session attribute (req.getSession().setAttribute("state",
state)), and keep the existing exception handling that sets resp.setStatus(400)
on failure.

In `@opengrep/vuln_javascript.js`:
- Around line 8-13: The handler app.get("/dashboard/render", (req, res) => { ...
}) is using browser DOM APIs (document.getElementById, innerHTML) on the server
which will throw; replace that DOM usage with server-side rendering or move the
logic to client code. To fix here, stop referencing document/container: read
const message = req.query.message || ""; construct the HTML string on the server
(e.g. "<div id=\"notice\"><p>" + message + "</p></div>") and send it with
res.send(...) (or sanitize/escape message if you don't want an XSS demo).
Alternatively move the container manipulation into a client-side script file so
the server simply serves the message value. Ensure references to document and
container are removed from this server route.
- Around line 15-20: The route handler app.get("/profile/bio") currently calls
document.querySelector and manipulates target.innerHTML which will crash in
Node.js; replace this client-side DOM usage with a server-side render: remove
document.querySelector/target.innerHTML, sanitize or escape req.query.bio (or
explicitly mark it unsafe only for a controlled XSS demo) and return an HTML
response via res.send or your template engine that injects the bio safely (or
use a server-side DOM library like jsdom if you must manipulate a DOM object),
referencing the handler function for changes.

In `@opengrep/vuln_python.py`:
- Around line 25-30: The current get_user function calls dict(cursor.fetchone())
which fails because cursor.fetchone() returns a tuple; change it so the fetched
row is a mapping — either set the DB connection to return sqlite3.Row (e.g., in
get_connection set conn.row_factory = sqlite3.Row) so cursor.fetchone() yields a
dict-like row, or convert the tuple to a dict in get_user using
cursor.description to map column names to values (use cursor.description and the
tuple from cursor.fetchone() to build the dict). Update the get_connection or
get_user implementation accordingly and keep the SQL injection demo intact.

---

Nitpick comments:
In `@opengrep.yml`:
- Around line 129-139: The current java-xxe-vulnerable-parser rule flags any
usage of DocumentBuilderFactory.newInstance() or SAXParserFactory.newInstance()
which causes false positives when callers immediately harden the factory with
setFeature(..., true); update the rule to add pattern-not or pattern-not-inside
clauses that exclude factories which are subsequently configured (e.g.,
references to
$FACTORY.setFeature("http://apache.org/xml/features/disallow-doctype-decl",
true) or other common XXE-protection features), so the rule still warns when
factories are used without protections but ignores instances where
DocumentBuilderFactory.newInstance() or SAXParserFactory.newInstance() are
followed by explicit setFeature hardening.

In `@opengrep/vuln_go.go`:
- Around line 15-29: The loop in LookupUser ignores the error returned by
rows.Scan(&id, &email); update LookupUser to check the error from rows.Scan
inside the for rows.Next() loop and handle it (e.g., log or call http.Error and
return) so scan failures are not swallowed; ensure you also handle any
rows.Err() after the loop to catch iteration errors from the sql.Rows.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Enterprise

Run ID: bb929bb9-a59b-4368-86ea-27a662788b39

📥 Commits

Reviewing files that changed from the base of the PR and between ef04b8d and 0349d73.

📒 Files selected for processing (6)
  • .coderabbit.yml
  • opengrep.yml
  • opengrep/vuln_go.go
  • opengrep/vuln_java.java
  • opengrep/vuln_javascript.js
  • opengrep/vuln_python.py

Comment thread .coderabbit.yml
Comment on lines +5 to +6
ast-grep:
enabled: false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

CodeRabbit ast-grep enabled property configuration

💡 Result:

In CodeRabbit, ast-grep is configured in the .coderabbit.yaml file under reviews.tools.ast-grep. The 'essential_rules' property enables the ast-grep-essentials security rules package (default: true). ast-grep runs if at least one of: essential_rules: true, rule_dirs specified, or packages listed. Example configuration: reviews: tools: ast-grep: essential_rules: true # Enables essential security rules rule_dirs: - "rules" util_dirs: - "utils" packages: - "coderabbitai/ast-grep-essentials" - essential_rules: boolean to enable predefined essential rules. - rule_dirs: directories with custom .yaml rule files. - util_dirs: directories for shared utility rules. - packages: public GitHub repos (owner/repo) with rules/utils. Rules must include a 'message' property for reviews. Supports many languages via tree-sitter.

Citations:


Remove or replace the invalid enabled property for ast-grep.

The enabled property is not supported by CodeRabbit's ast-grep configuration. Valid properties are essential_rules, rule_dirs, util_dirs, and packages. With additionalProperties: false in the schema, this configuration will be rejected. To disable ast-grep, set essential_rules: false or remove the ast-grep section entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.coderabbit.yml around lines 5 - 6, Remove the invalid "enabled" property
from the ast-grep configuration section and either delete the entire ast-grep
block or replace it with a supported property such as setting essential_rules:
false to disable it; ensure only valid keys remain (essential_rules, rule_dirs,
util_dirs, packages) so the config conforms to the schema with
additionalProperties: false and will not be rejected.

Comment thread opengrep/vuln_java.java
Comment on lines +45 to +54
protected void restoreSession(HttpServletRequest req, HttpServletResponse resp) throws java.io.IOException {
byte[] payload = req.getInputStream().readAllBytes();
try {
ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload));
Object state = ois.readObject();
req.getSession().setAttribute("state", state);
} catch (Exception e) {
resp.setStatus(400);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

ObjectInputStream not closed — resource leak.

While correctly demonstrating insecure deserialization (CWE-502), the ObjectInputStream on line 48 is never closed, causing a resource leak. Consider using try-with-resources.

Proposed fix with try-with-resources
     protected void restoreSession(HttpServletRequest req, HttpServletResponse resp) throws java.io.IOException {
         byte[] payload = req.getInputStream().readAllBytes();
         try {
-            ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload));
-            Object state = ois.readObject();
-            req.getSession().setAttribute("state", state);
+            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload))) {
+                Object state = ois.readObject();
+                req.getSession().setAttribute("state", state);
+            }
         } catch (Exception e) {
             resp.setStatus(400);
         }
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
protected void restoreSession(HttpServletRequest req, HttpServletResponse resp) throws java.io.IOException {
byte[] payload = req.getInputStream().readAllBytes();
try {
ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload));
Object state = ois.readObject();
req.getSession().setAttribute("state", state);
} catch (Exception e) {
resp.setStatus(400);
}
}
protected void restoreSession(HttpServletRequest req, HttpServletResponse resp) throws java.io.IOException {
byte[] payload = req.getInputStream().readAllBytes();
try {
try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload))) {
Object state = ois.readObject();
req.getSession().setAttribute("state", state);
}
} catch (Exception e) {
resp.setStatus(400);
}
}
🧰 Tools
🪛 OpenGrep (1.20.0)

[ERROR] 48-48: Insecure deserialization via ObjectInputStream on untrusted input enables remote code execution.

(java-insecure-deserialization)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@opengrep/vuln_java.java` around lines 45 - 54, The ObjectInputStream created
in restoreSession is never closed causing a resource leak; wrap creation of
ObjectInputStream (and the ByteArrayInputStream that uses payload) in a
try-with-resources so they are automatically closed, e.g., open
ObjectInputStream in the try(...) block used to readObject and set the session
attribute (req.getSession().setAttribute("state", state)), and keep the existing
exception handling that sets resp.setStatus(400) on failure.

Comment on lines +8 to +13
app.get("/dashboard/render", (req, res) => {
const message = req.query.message || "";
const container = document.getElementById("notice");
container.innerHTML = "<p>" + message + "</p>";
res.send("ok");
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

document is undefined in Node.js — this endpoint will crash at runtime.

Lines 10-11 reference document.getElementById and innerHTML, but this is server-side Express/Node.js code where the DOM API doesn't exist. The handler will throw a ReferenceError when invoked.

If the intent is to demonstrate XSS patterns for OpenGrep detection, consider either:

  1. Returning the unsanitized HTML in the response (server-side rendering pattern), or
  2. Moving this to a client-side script file
Proposed fix for server-rendered XSS demo
 app.get("/dashboard/render", (req, res) => {
   const message = req.query.message || "";
-  const container = document.getElementById("notice");
-  container.innerHTML = "<p>" + message + "</p>";
-  res.send("ok");
+  // Vulnerable: unsanitized user input rendered in HTML
+  res.send("<div id='notice'><p>" + message + "</p></div>");
 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
app.get("/dashboard/render", (req, res) => {
const message = req.query.message || "";
const container = document.getElementById("notice");
container.innerHTML = "<p>" + message + "</p>";
res.send("ok");
});
app.get("/dashboard/render", (req, res) => {
const message = req.query.message || "";
// Vulnerable: unsanitized user input rendered in HTML
res.send("<div id='notice'><p>" + message + "</p></div>");
});
🧰 Tools
🪛 OpenGrep (1.20.0)

[ERROR] 11-11: Potential XSS: assigning to innerHTML with untrusted data. Use textContent or escape the value.

(js-xss-innerhtml-assignment)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@opengrep/vuln_javascript.js` around lines 8 - 13, The handler
app.get("/dashboard/render", (req, res) => { ... }) is using browser DOM APIs
(document.getElementById, innerHTML) on the server which will throw; replace
that DOM usage with server-side rendering or move the logic to client code. To
fix here, stop referencing document/container: read const message =
req.query.message || ""; construct the HTML string on the server (e.g. "<div
id=\"notice\"><p>" + message + "</p></div>") and send it with res.send(...) (or
sanitize/escape message if you don't want an XSS demo). Alternatively move the
container manipulation into a client-side script file so the server simply
serves the message value. Ensure references to document and container are
removed from this server route.

Comment on lines +15 to +20
app.get("/profile/bio", (req, res) => {
const bio = req.query.bio || "";
const target = document.querySelector("#bio");
target.innerHTML = bio;
res.send("ok");
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Same issue: document.querySelector is unavailable in Node.js.

This endpoint will also crash. Apply a similar server-rendered pattern if demonstrating XSS.

Proposed fix
 app.get("/profile/bio", (req, res) => {
   const bio = req.query.bio || "";
-  const target = document.querySelector("#bio");
-  target.innerHTML = bio;
-  res.send("ok");
+  // Vulnerable: unsanitized bio rendered directly
+  res.send("<div id='bio'>" + bio + "</div>");
 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
app.get("/profile/bio", (req, res) => {
const bio = req.query.bio || "";
const target = document.querySelector("#bio");
target.innerHTML = bio;
res.send("ok");
});
app.get("/profile/bio", (req, res) => {
const bio = req.query.bio || "";
// Vulnerable: unsanitized bio rendered directly
res.send("<div id='bio'>" + bio + "</div>");
});
🧰 Tools
🪛 OpenGrep (1.20.0)

[ERROR] 18-18: Potential XSS: assigning to innerHTML with untrusted data. Use textContent or escape the value.

(js-xss-innerhtml-assignment)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@opengrep/vuln_javascript.js` around lines 15 - 20, The route handler
app.get("/profile/bio") currently calls document.querySelector and manipulates
target.innerHTML which will crash in Node.js; replace this client-side DOM usage
with a server-side render: remove document.querySelector/target.innerHTML,
sanitize or escape req.query.bio (or explicitly mark it unsafe only for a
controlled XSS demo) and return an HTML response via res.send or your template
engine that injects the bio safely (or use a server-side DOM library like jsdom
if you must manipulate a DOM object), referencing the handler function for
changes.

Comment thread opengrep/vuln_python.py
Comment on lines +25 to +30
@app.route("/users/<int:user_id>")
def get_user(user_id):
conn = get_connection()
cursor = conn.cursor()
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
return dict(cursor.fetchone() or {})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Minor issue: dict() on fetchone() result may fail.

While demonstrating SQL injection (f-string formatting), line 30 has a functional bug: cursor.fetchone() returns a tuple, not a mapping, so dict(cursor.fetchone()) will raise TypeError.

Proposed fix using Row factory
+def dict_factory(cursor, row):
+    return {col[0]: row[idx] for idx, col in enumerate(cursor.description)}
+
 def get_connection():
-    return sqlite3.connect("app.db")
+    conn = sqlite3.connect("app.db")
+    conn.row_factory = dict_factory
+    return conn
 
 `@app.route`("/users/<int:user_id>")
 def get_user(user_id):
     conn = get_connection()
     cursor = conn.cursor()
     cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
-    return dict(cursor.fetchone() or {})
+    return cursor.fetchone() or {}
🧰 Tools
🪛 OpenGrep (1.20.0)

[ERROR] 29-29: SQL injection: query built via string concatenation/formatting with untrusted input. Use parameterized queries.

(python-sql-injection-string-concat)

🪛 Ruff (0.15.12)

[error] 29-29: Possible SQL injection vector through string-based query construction

(S608)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@opengrep/vuln_python.py` around lines 25 - 30, The current get_user function
calls dict(cursor.fetchone()) which fails because cursor.fetchone() returns a
tuple; change it so the fetched row is a mapping — either set the DB connection
to return sqlite3.Row (e.g., in get_connection set conn.row_factory =
sqlite3.Row) so cursor.fetchone() yields a dict-like row, or convert the tuple
to a dict in get_user using cursor.description to map column names to values
(use cursor.description and the tuple from cursor.fetchone() to build the dict).
Update the get_connection or get_user implementation accordingly and keep the
SQL injection demo intact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant