Preview/Opengrep#92
Conversation
… ruleset, disable overlapping pattern scanners
📝 WalkthroughWalkthroughThis PR adds vulnerability detection configuration and demonstration files: a CodeRabbit settings file disabling certain review tools, an OpenGrep rule definition covering SQL injection, deserialization, XSS, eval, command injection, weak hashing, and XXE patterns, and four vulnerable code samples in Go, Java, JavaScript, and Python. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (2)
opengrep/vuln_go.go (1)
15-29: SQL injection demo implemented correctly; ignoredScanerror.The SQL injection vulnerability using
fmt.Sprintfis correctly demonstrated. However, line 26 ignores the error fromrows.Scan(&id, &email). While this is demo code, handling scan errors is idiomatic Go.Optional: handle Scan error
- rows.Scan(&id, &email) + if err := rows.Scan(&id, &email); err != nil { + continue + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@opengrep/vuln_go.go` around lines 15 - 29, The loop in LookupUser ignores the error returned by rows.Scan(&id, &email); update LookupUser to check the error from rows.Scan inside the for rows.Next() loop and handle it (e.g., log or call http.Error and return) so scan failures are not swallowed; ensure you also handle any rows.Err() after the loop to catch iteration errors from the sql.Rows.opengrep.yml (1)
129-139: XXE rule may produce false positives in hardened codebases.The
java-xxe-vulnerable-parserrule flags anyDocumentBuilderFactory.newInstance()orSAXParserFactory.newInstance()call. This will trigger even when the code subsequently configures XXE protections viasetFeature(). Consider addingpattern-notclauses to exclude properly hardened patterns, or accept this as a "review required" warning.For example, to reduce false positives when XXE features are disabled nearby:
# Could add pattern-not-inside to exclude hardened patterns # pattern-not-inside: | # $FACTORY = DocumentBuilderFactory.newInstance() # ... # $FACTORY.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@opengrep.yml` around lines 129 - 139, The current java-xxe-vulnerable-parser rule flags any usage of DocumentBuilderFactory.newInstance() or SAXParserFactory.newInstance() which causes false positives when callers immediately harden the factory with setFeature(..., true); update the rule to add pattern-not or pattern-not-inside clauses that exclude factories which are subsequently configured (e.g., references to $FACTORY.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true) or other common XXE-protection features), so the rule still warns when factories are used without protections but ignores instances where DocumentBuilderFactory.newInstance() or SAXParserFactory.newInstance() are followed by explicit setFeature hardening.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.coderabbit.yml:
- Around line 5-6: Remove the invalid "enabled" property from the ast-grep
configuration section and either delete the entire ast-grep block or replace it
with a supported property such as setting essential_rules: false to disable it;
ensure only valid keys remain (essential_rules, rule_dirs, util_dirs, packages)
so the config conforms to the schema with additionalProperties: false and will
not be rejected.
In `@opengrep/vuln_java.java`:
- Around line 45-54: The ObjectInputStream created in restoreSession is never
closed causing a resource leak; wrap creation of ObjectInputStream (and the
ByteArrayInputStream that uses payload) in a try-with-resources so they are
automatically closed, e.g., open ObjectInputStream in the try(...) block used to
readObject and set the session attribute (req.getSession().setAttribute("state",
state)), and keep the existing exception handling that sets resp.setStatus(400)
on failure.
In `@opengrep/vuln_javascript.js`:
- Around line 8-13: The handler app.get("/dashboard/render", (req, res) => { ...
}) is using browser DOM APIs (document.getElementById, innerHTML) on the server
which will throw; replace that DOM usage with server-side rendering or move the
logic to client code. To fix here, stop referencing document/container: read
const message = req.query.message || ""; construct the HTML string on the server
(e.g. "<div id=\"notice\"><p>" + message + "</p></div>") and send it with
res.send(...) (or sanitize/escape message if you don't want an XSS demo).
Alternatively move the container manipulation into a client-side script file so
the server simply serves the message value. Ensure references to document and
container are removed from this server route.
- Around line 15-20: The route handler app.get("/profile/bio") currently calls
document.querySelector and manipulates target.innerHTML which will crash in
Node.js; replace this client-side DOM usage with a server-side render: remove
document.querySelector/target.innerHTML, sanitize or escape req.query.bio (or
explicitly mark it unsafe only for a controlled XSS demo) and return an HTML
response via res.send or your template engine that injects the bio safely (or
use a server-side DOM library like jsdom if you must manipulate a DOM object),
referencing the handler function for changes.
In `@opengrep/vuln_python.py`:
- Around line 25-30: The current get_user function calls dict(cursor.fetchone())
which fails because cursor.fetchone() returns a tuple; change it so the fetched
row is a mapping — either set the DB connection to return sqlite3.Row (e.g., in
get_connection set conn.row_factory = sqlite3.Row) so cursor.fetchone() yields a
dict-like row, or convert the tuple to a dict in get_user using
cursor.description to map column names to values (use cursor.description and the
tuple from cursor.fetchone() to build the dict). Update the get_connection or
get_user implementation accordingly and keep the SQL injection demo intact.
---
Nitpick comments:
In `@opengrep.yml`:
- Around line 129-139: The current java-xxe-vulnerable-parser rule flags any
usage of DocumentBuilderFactory.newInstance() or SAXParserFactory.newInstance()
which causes false positives when callers immediately harden the factory with
setFeature(..., true); update the rule to add pattern-not or pattern-not-inside
clauses that exclude factories which are subsequently configured (e.g.,
references to
$FACTORY.setFeature("http://apache.org/xml/features/disallow-doctype-decl",
true) or other common XXE-protection features), so the rule still warns when
factories are used without protections but ignores instances where
DocumentBuilderFactory.newInstance() or SAXParserFactory.newInstance() are
followed by explicit setFeature hardening.
In `@opengrep/vuln_go.go`:
- Around line 15-29: The loop in LookupUser ignores the error returned by
rows.Scan(&id, &email); update LookupUser to check the error from rows.Scan
inside the for rows.Next() loop and handle it (e.g., log or call http.Error and
return) so scan failures are not swallowed; ensure you also handle any
rows.Err() after the loop to catch iteration errors from the sql.Rows.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Enterprise
Run ID: bb929bb9-a59b-4368-86ea-27a662788b39
📒 Files selected for processing (6)
.coderabbit.ymlopengrep.ymlopengrep/vuln_go.goopengrep/vuln_java.javaopengrep/vuln_javascript.jsopengrep/vuln_python.py
| ast-grep: | ||
| enabled: false |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
CodeRabbit ast-grep enabled property configuration
💡 Result:
In CodeRabbit, ast-grep is configured in the .coderabbit.yaml file under reviews.tools.ast-grep. The 'essential_rules' property enables the ast-grep-essentials security rules package (default: true). ast-grep runs if at least one of: essential_rules: true, rule_dirs specified, or packages listed. Example configuration: reviews: tools: ast-grep: essential_rules: true # Enables essential security rules rule_dirs: - "rules" util_dirs: - "utils" packages: - "coderabbitai/ast-grep-essentials" - essential_rules: boolean to enable predefined essential rules. - rule_dirs: directories with custom .yaml rule files. - util_dirs: directories for shared utility rules. - packages: public GitHub repos (owner/repo) with rules/utils. Rules must include a 'message' property for reviews. Supports many languages via tree-sitter.
Citations:
- 1: https://docs.coderabbit.ai/configuration/ast-grep-instructions
- 2: https://docs.coderabbit.ai/tools/ast-grep
Remove or replace the invalid enabled property for ast-grep.
The enabled property is not supported by CodeRabbit's ast-grep configuration. Valid properties are essential_rules, rule_dirs, util_dirs, and packages. With additionalProperties: false in the schema, this configuration will be rejected. To disable ast-grep, set essential_rules: false or remove the ast-grep section entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.coderabbit.yml around lines 5 - 6, Remove the invalid "enabled" property
from the ast-grep configuration section and either delete the entire ast-grep
block or replace it with a supported property such as setting essential_rules:
false to disable it; ensure only valid keys remain (essential_rules, rule_dirs,
util_dirs, packages) so the config conforms to the schema with
additionalProperties: false and will not be rejected.
| protected void restoreSession(HttpServletRequest req, HttpServletResponse resp) throws java.io.IOException { | ||
| byte[] payload = req.getInputStream().readAllBytes(); | ||
| try { | ||
| ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload)); | ||
| Object state = ois.readObject(); | ||
| req.getSession().setAttribute("state", state); | ||
| } catch (Exception e) { | ||
| resp.setStatus(400); | ||
| } | ||
| } |
There was a problem hiding this comment.
ObjectInputStream not closed — resource leak.
While correctly demonstrating insecure deserialization (CWE-502), the ObjectInputStream on line 48 is never closed, causing a resource leak. Consider using try-with-resources.
Proposed fix with try-with-resources
protected void restoreSession(HttpServletRequest req, HttpServletResponse resp) throws java.io.IOException {
byte[] payload = req.getInputStream().readAllBytes();
try {
- ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload));
- Object state = ois.readObject();
- req.getSession().setAttribute("state", state);
+ try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload))) {
+ Object state = ois.readObject();
+ req.getSession().setAttribute("state", state);
+ }
} catch (Exception e) {
resp.setStatus(400);
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| protected void restoreSession(HttpServletRequest req, HttpServletResponse resp) throws java.io.IOException { | |
| byte[] payload = req.getInputStream().readAllBytes(); | |
| try { | |
| ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload)); | |
| Object state = ois.readObject(); | |
| req.getSession().setAttribute("state", state); | |
| } catch (Exception e) { | |
| resp.setStatus(400); | |
| } | |
| } | |
| protected void restoreSession(HttpServletRequest req, HttpServletResponse resp) throws java.io.IOException { | |
| byte[] payload = req.getInputStream().readAllBytes(); | |
| try { | |
| try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload))) { | |
| Object state = ois.readObject(); | |
| req.getSession().setAttribute("state", state); | |
| } | |
| } catch (Exception e) { | |
| resp.setStatus(400); | |
| } | |
| } |
🧰 Tools
🪛 OpenGrep (1.20.0)
[ERROR] 48-48: Insecure deserialization via ObjectInputStream on untrusted input enables remote code execution.
(java-insecure-deserialization)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@opengrep/vuln_java.java` around lines 45 - 54, The ObjectInputStream created
in restoreSession is never closed causing a resource leak; wrap creation of
ObjectInputStream (and the ByteArrayInputStream that uses payload) in a
try-with-resources so they are automatically closed, e.g., open
ObjectInputStream in the try(...) block used to readObject and set the session
attribute (req.getSession().setAttribute("state", state)), and keep the existing
exception handling that sets resp.setStatus(400) on failure.
| app.get("/dashboard/render", (req, res) => { | ||
| const message = req.query.message || ""; | ||
| const container = document.getElementById("notice"); | ||
| container.innerHTML = "<p>" + message + "</p>"; | ||
| res.send("ok"); | ||
| }); |
There was a problem hiding this comment.
document is undefined in Node.js — this endpoint will crash at runtime.
Lines 10-11 reference document.getElementById and innerHTML, but this is server-side Express/Node.js code where the DOM API doesn't exist. The handler will throw a ReferenceError when invoked.
If the intent is to demonstrate XSS patterns for OpenGrep detection, consider either:
- Returning the unsanitized HTML in the response (server-side rendering pattern), or
- Moving this to a client-side script file
Proposed fix for server-rendered XSS demo
app.get("/dashboard/render", (req, res) => {
const message = req.query.message || "";
- const container = document.getElementById("notice");
- container.innerHTML = "<p>" + message + "</p>";
- res.send("ok");
+ // Vulnerable: unsanitized user input rendered in HTML
+ res.send("<div id='notice'><p>" + message + "</p></div>");
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| app.get("/dashboard/render", (req, res) => { | |
| const message = req.query.message || ""; | |
| const container = document.getElementById("notice"); | |
| container.innerHTML = "<p>" + message + "</p>"; | |
| res.send("ok"); | |
| }); | |
| app.get("/dashboard/render", (req, res) => { | |
| const message = req.query.message || ""; | |
| // Vulnerable: unsanitized user input rendered in HTML | |
| res.send("<div id='notice'><p>" + message + "</p></div>"); | |
| }); |
🧰 Tools
🪛 OpenGrep (1.20.0)
[ERROR] 11-11: Potential XSS: assigning to innerHTML with untrusted data. Use textContent or escape the value.
(js-xss-innerhtml-assignment)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@opengrep/vuln_javascript.js` around lines 8 - 13, The handler
app.get("/dashboard/render", (req, res) => { ... }) is using browser DOM APIs
(document.getElementById, innerHTML) on the server which will throw; replace
that DOM usage with server-side rendering or move the logic to client code. To
fix here, stop referencing document/container: read const message =
req.query.message || ""; construct the HTML string on the server (e.g. "<div
id=\"notice\"><p>" + message + "</p></div>") and send it with res.send(...) (or
sanitize/escape message if you don't want an XSS demo). Alternatively move the
container manipulation into a client-side script file so the server simply
serves the message value. Ensure references to document and container are
removed from this server route.
| app.get("/profile/bio", (req, res) => { | ||
| const bio = req.query.bio || ""; | ||
| const target = document.querySelector("#bio"); | ||
| target.innerHTML = bio; | ||
| res.send("ok"); | ||
| }); |
There was a problem hiding this comment.
Same issue: document.querySelector is unavailable in Node.js.
This endpoint will also crash. Apply a similar server-rendered pattern if demonstrating XSS.
Proposed fix
app.get("/profile/bio", (req, res) => {
const bio = req.query.bio || "";
- const target = document.querySelector("#bio");
- target.innerHTML = bio;
- res.send("ok");
+ // Vulnerable: unsanitized bio rendered directly
+ res.send("<div id='bio'>" + bio + "</div>");
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| app.get("/profile/bio", (req, res) => { | |
| const bio = req.query.bio || ""; | |
| const target = document.querySelector("#bio"); | |
| target.innerHTML = bio; | |
| res.send("ok"); | |
| }); | |
| app.get("/profile/bio", (req, res) => { | |
| const bio = req.query.bio || ""; | |
| // Vulnerable: unsanitized bio rendered directly | |
| res.send("<div id='bio'>" + bio + "</div>"); | |
| }); |
🧰 Tools
🪛 OpenGrep (1.20.0)
[ERROR] 18-18: Potential XSS: assigning to innerHTML with untrusted data. Use textContent or escape the value.
(js-xss-innerhtml-assignment)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@opengrep/vuln_javascript.js` around lines 15 - 20, The route handler
app.get("/profile/bio") currently calls document.querySelector and manipulates
target.innerHTML which will crash in Node.js; replace this client-side DOM usage
with a server-side render: remove document.querySelector/target.innerHTML,
sanitize or escape req.query.bio (or explicitly mark it unsafe only for a
controlled XSS demo) and return an HTML response via res.send or your template
engine that injects the bio safely (or use a server-side DOM library like jsdom
if you must manipulate a DOM object), referencing the handler function for
changes.
| @app.route("/users/<int:user_id>") | ||
| def get_user(user_id): | ||
| conn = get_connection() | ||
| cursor = conn.cursor() | ||
| cursor.execute(f"SELECT * FROM users WHERE id = {user_id}") | ||
| return dict(cursor.fetchone() or {}) |
There was a problem hiding this comment.
Minor issue: dict() on fetchone() result may fail.
While demonstrating SQL injection (f-string formatting), line 30 has a functional bug: cursor.fetchone() returns a tuple, not a mapping, so dict(cursor.fetchone()) will raise TypeError.
Proposed fix using Row factory
+def dict_factory(cursor, row):
+ return {col[0]: row[idx] for idx, col in enumerate(cursor.description)}
+
def get_connection():
- return sqlite3.connect("app.db")
+ conn = sqlite3.connect("app.db")
+ conn.row_factory = dict_factory
+ return conn
`@app.route`("/users/<int:user_id>")
def get_user(user_id):
conn = get_connection()
cursor = conn.cursor()
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
- return dict(cursor.fetchone() or {})
+ return cursor.fetchone() or {}🧰 Tools
🪛 OpenGrep (1.20.0)
[ERROR] 29-29: SQL injection: query built via string concatenation/formatting with untrusted input. Use parameterized queries.
(python-sql-injection-string-concat)
🪛 Ruff (0.15.12)
[error] 29-29: Possible SQL injection vector through string-based query construction
(S608)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@opengrep/vuln_python.py` around lines 25 - 30, The current get_user function
calls dict(cursor.fetchone()) which fails because cursor.fetchone() returns a
tuple; change it so the fetched row is a mapping — either set the DB connection
to return sqlite3.Row (e.g., in get_connection set conn.row_factory =
sqlite3.Row) so cursor.fetchone() yields a dict-like row, or convert the tuple
to a dict in get_user using cursor.description to map column names to values
(use cursor.description and the tuple from cursor.fetchone() to build the dict).
Update the get_connection or get_user implementation accordingly and keep the
SQL injection demo intact.
Demonstrates the Opengrep sandboxer tool. Custom inline ruleset (opengrep.yml) at repo root defines 12 rules across Python, JavaScript, Go, and Java targeting SQL injection, XSS, insecure deserialization (pickle and ObjectInputStream), command injection, eval/exec, weak crypto (MD5), and XXE. Each rule is paired with deliberate trigger code in opengrep/vuln_*.{py,js,go,java}. Repo-level .coderabbit.yml disables Semgrep (Opengrep's upstream), ast-grep, owasp-top10, and PMD to avoid shadowing. Opengrep does not auto-trigger without a config file present, so the opengrep.yml at root is what activates it.
Summary by CodeRabbit
New Features
Configuration