V2 Crawlers: Configurable Timeout + Command-Line Execution#288
V2 Crawlers: Configurable Timeout + Command-Line Execution#288TheAuditorTool wants to merge 2 commits intoOWASP-Benchmark:mainfrom
Conversation
…port Closes OWASP-Benchmark#3 (Add Timeout to Crawlers) Closes OWASP-Benchmark#1 (Add new command line crawler and verification crawler) New files (zero changes to existing code): - CommandLineTestCaseRequest.java: CLI test case support (tcType="CLI") - BenchmarkCrawler_newv2.java: configurable -T/--timeout, CLI dispatch - BenchmarkCrawlerVerification_newv2.java: verification with timeout + CLI
6c7eeed to
fc3f61b
Compare
|
@dandersonaspect - @TheAuditorTool stepped up and created a command line version of the verification crawler for Benchmark. Can you review/test to see if this seems to do the job we need for running command line verification? |
darkspirit510
left a comment
There was a problem hiding this comment.
Code itself looks good. I don't understand why this is a separate class.
@davewichers can't run https://github.com/OWASP-Benchmark/BenchmarkJava/blob/master/VMs/runDockerImage.sh on my Mac, so I can't verify this code actually works.
| * <p>Existing HTTP test suites work identically — this is a drop-in replacement. | ||
| */ | ||
| @Mojo(name = "run-crawler-v2", requiresProject = false, defaultPhase = LifecyclePhase.COMPILE) | ||
| public class BenchmarkCrawler_newv2 extends BenchmarkCrawler { |
There was a problem hiding this comment.
That's a weird class name 🤪
There was a problem hiding this comment.
Thats on me... I guess i was to afraid to directly replace @davewichers previous crawlers and the intention at the time was to create _newv2 so he could test it against his old versions before deciding if he liked/wanted them.
I see in hindsight how it was probably bit to cautious.
Happy to merge the v2 changes into the existing BenchmarkCrawler and BenchmarkCrawlerVerification classes if you and Dave prefer that.
| * <p>Existing HTTP test suites work identically — this is a drop-in replacement. | ||
| */ | ||
| @Mojo(name = "run-crawler-v2", requiresProject = false, defaultPhase = LifecyclePhase.COMPILE) | ||
| public class BenchmarkCrawler_newv2 extends BenchmarkCrawler { |
There was a problem hiding this comment.
TBH, I never had a look into BenchmarkCrawler.java before. Why is this a separate class?
There was a problem hiding this comment.
This became an artifact of my above choose.
It's a separate class on purpose — it registers as a different Maven goal (run-crawler-v2 vs run-crawler). The existing BenchmarkCrawler and BenchmarkCrawlerVerification are untouched.
Happy to make both official on your and daves approval.
| HttpHost httpHost = null; | ||
| String pHost = System.getProperty("proxyHost"); | ||
| String pPort = System.getProperty("proxyPort"); | ||
| if (pHost != null && pPort != null) { |
There was a problem hiding this comment.
I'm a fan of DRY. Can you refactor this code so it does the first three commands (until connection manager) and then put the setProxy in the if?
There was a problem hiding this comment.
I agree. I'll have it fixed. Thank you.
| if (selectedTestCaseName != null) { | ||
| for (AbstractTestCaseRequest request : this.testSuite.getTestCases()) { | ||
| if (request.getName().equals(selectedTestCaseName)) { | ||
| java.util.List<AbstractTestCaseRequest> requests = |
There was a problem hiding this comment.
Why does this require full qualified class name?
There was a problem hiding this comment.
It does not. My mistake of using inline instead of seeing the list. I'll have it fixed.
| JAXBContextFactory.createContext( | ||
| new Class[] {TestSuite.class, CommandLineTestCaseRequest.class}, null); | ||
| Unmarshaller unmarshaller = context.createUnmarshaller(); | ||
| unmarshaller.setEventHandler(new javax.xml.bind.helpers.DefaultValidationEventHandler()); |
There was a problem hiding this comment.
Here's the same full qualified name. Can't you just import the Handler?
There was a problem hiding this comment.
I was following the pattern from BenchmarkCrawler.java for consistency, but you're right that importing it is cleaner. I'll have it corrected.
| if (thisInstance == null) thisInstance = this; | ||
|
|
||
| if (null == this.crawlerFile) { | ||
| System.out.println("ERROR: A crawlerFile parameter must be specified."); |
There was a problem hiding this comment.
Since this will end the application in error state, maybe return a non-zero exit code?
There was a problem hiding this comment.
Same as above. I'll have it corrected.
Can you try @TheAuditorTool pull request which fixes/upgrades the Docker Image: OWASP-Benchmark/BenchmarkJava#445 and see if that works for you? If it does, I can merge it in. Let me know if this fixes your issue. |
PR OWASP-Benchmark#288 review raised 7 items. 2 were design questions answered in comments (class name rationale, why separate class). The remaining 5 are code fixes in this commit. THE PROBLEM: BenchmarkCrawler_newv2 had three categories of issue: 1. Inline FQNs (java.util.List, java.util.ArrayList, javax.xml.bind.helpers.DefaultValidationEventHandler) instead of imports — the DefaultValidationEventHandler pattern was copied from Utils.java:183 which uses the same inline FQN, but reviewer wants imports in new code 2. Duplicated HttpClients.custom() builder chain across proxy/no-proxy branches — inherited from parent BenchmarkCrawler.java:204-228 which has the same duplication 3. execute() prints error on null crawlerFile but returns silently with exit code 0 — same pattern as parent BenchmarkCrawler.java:350 but reviewer wants explicit failure THE SOLUTION: - Add imports for ArrayList, DefaultValidationEventHandler, HttpClientBuilder and replace all inline FQNs with short names - Extract HttpClientBuilder as a local variable, build once, then conditionally call .setProxy() — eliminates the duplicated 4-line builder chain - Add System.exit(-1) after the error println in both execute() methods (crawler + verification crawler) KEY DECISIONS: - System.exit(-1) not System.exit(1): codebase has 19 uses of System.exit(-1) for errors, zero uses of System.exit(1) - System.exit(-1) not throw MojoExecutionException: MojoExecutionException is declared on every execute() but never thrown anywhere in the codebase (0 occurrences). Introducing a new error pattern is scope creep. - HttpClientBuilder (explicit type) not var: Java 11 target supports var but 0 usages exist in codebase. Legacy explicit-type style throughout. - Spotless (google-java-format 1.17.0 AOSP, pom.xml:222-314) will normalize import order on mvn compile. Manual placement is correct but academic — formatter is the safety net. VERIFICATION: mvn compile (spotless-apply runs at compile phase) mvn test (267+ tests, 0 failures expected) git diff --stat shows 2 files, +16/-15 lines
|
Addressed code review items:
The class name (_newv2 suffix) and separate class design are intentional. I didn't want to directly replace the existing crawlers. Happy to merge/replace the existing ones but please review and approve first. |
V2 Crawlers: Configurable Timeout + Command-Line Execution
Closes #3 (Add Timeout to Crawlers)
Closes #1 (Add new command line crawler and verification crawler)
Overview
This adds three new Java files alongside the existing crawlers. Nothing existing is changed — the current
run-crawlerandrun-verification-crawlerMaven goals continue to work exactly as before. The new functionality is available through separate-v2goals when you're ready to use it.What's New
Configurable Timeout (Issue #3)
The current crawlers have a hardcoded 15-second timeout. If a test case takes longer (common on slower machines), the request is dropped.
The v2 crawlers add a
-Tflag that lets you set the timeout yourself:-Tflag (the default)-T 300-T 60When a request does time out, the crawler logs it and moves on to the next test case, just as Jonathon described in the issue.
Command-Line Test Case Execution (Issue #1)
The existing crawlers work by sending HTTP requests to web endpoints. The v2 crawlers can also run test cases as command-line programs — useful for benchmarking non-web applications like Python scripts or CLI tools.
To use this, test cases in the XML file use
tcType="CLI"instead ofSERVLET,SPRINGWS, orJERSEYWS:The crawler runs the command as a subprocess, captures its output, and uses the same attack/safe verification logic that the HTTP crawler uses. The
formparamelements still switch between attack and safe values — they're passed to the command as--name valuearguments.New Files
All three are in
plugin/src/main/java/org/owasp/benchmarkutils/tools/:CommandLineTestCaseRequest.javaAbstractTestCaseRequestso it fits into the existing test suite loading and sorting infrastructure.BenchmarkCrawler_newv2.javaBenchmarkCrawler. Maven goal:run-crawler-v2.BenchmarkCrawlerVerification_newv2.javaBenchmarkCrawler_newv2. Maven goal:run-verification-crawler-v2.How to Try It
Once installed (
mvn install), you can use the new goals in place of the existing ones:All existing flags (
-f,-n,-t) still work. The-Tflag is the only addition.What's Not Changed
BenchmarkCrawler.java— untouchedBenchmarkCrawlerVerification.java— untouchedAbstractTestCaseRequest.java— untouchedDesign Notes
-Tflag). This means a server that's down still fails fast, while a slow response can take as long as needed.ResponseInfoclass: stdout goes intoresponseString, exit code goes intostatusCode, and elapsed wall-clock time goes intotimeInSeconds.RegressionTesting.verifyResponse()for CLI test cases. This is the same string-matching logic used for HTTP — checking whether the attack success indicator appears in the output.