21 changes: 21 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,21 @@
version: 2
updates:
  # Monitor root Go module
  - package-ecosystem: "gomod"
    directory: "/"
    schedule:
      interval: "daily"
    commit-message:
      prefix: "chore"
      prefix-development: "chore"
      include: "scope"

  # Monitor e2e-tests tools Go module
  - package-ecosystem: "gomod"
    directory: "/e2e-tests/tools"
    schedule:
      interval: "daily"
    commit-message:
      prefix: "chore"
      prefix-development: "chore"
      include: "scope"
17 changes: 16 additions & 1 deletion .github/workflows/test.yml
@@ -25,11 +25,26 @@ jobs:
uses: actions/setup-go@v5

- name: Download dependencies
run: go mod download
run: find . -name go.mod -execdir go mod download \;

- name: Verify go.mod and go.sum are up to date
run: |
find . -name go.mod -execdir go mod tidy \;
if [ -n "$(git status --porcelain)" ]; then
echo "Error: go.mod or go.sum files are not up to date"
echo "Modified files:"
git status --porcelain
echo ""
echo "Please run 'go mod tidy' in all directories containing go.mod and commit the changes"
exit 1
fi

- name: Run tests with coverage
run: make test-coverage-and-junit

- name: Run E2E smoke test
run: make e2e-smoke-test

- name: Upload test results to Codecov
uses: codecov/test-results-action@v1
with:
6 changes: 6 additions & 0 deletions .gitignore
@@ -16,3 +16,9 @@

# Lint output
/report.xml

# E2E tests
/e2e-tests/.env
/e2e-tests/mcp-reports/
/e2e-tests/bin/
/e2e-tests/**/*-out.json
8 changes: 8 additions & 0 deletions Makefile
@@ -57,6 +57,14 @@ helm-lint: ## Run helm lint for Helm chart
test: ## Run unit tests
	$(GOTEST) -v ./...

.PHONY: e2e-smoke-test
e2e-smoke-test: ## Run E2E smoke test (build and verify mcpchecker)
	@cd e2e-tests && ./scripts/smoke-test.sh

.PHONY: e2e-test
e2e-test: ## Run E2E tests
	@cd e2e-tests && ./scripts/run-tests.sh

.PHONY: test-coverage-and-junit
test-coverage-and-junit: ## Run unit tests with coverage and junit output
	go install github.com/jstemmer/go-junit-report/v2@v2.1.0
118 changes: 118 additions & 0 deletions e2e-tests/README.md
@@ -0,0 +1,118 @@
# StackRox MCP E2E Testing

End-to-end tests for the StackRox MCP server using [mcpchecker](https://github.com/mcpchecker/mcpchecker).

## Quick Start

### Smoke Test (No Agent Required)

Validate configuration and build without running actual agents:

```bash
cd e2e-tests
./scripts/smoke-test.sh
```

This is useful in CI and for quickly checking that everything compiles.

## Prerequisites

- Go 1.25+
- Google Cloud Project with Vertex AI enabled (for Claude agent)
- OpenAI API Key (for LLM judge)
- StackRox API Token

## Setup

### 1. Build mcpchecker

```bash
cd e2e-tests
./scripts/build-mcpchecker.sh
```

### 2. Configure Environment

Create a `.env` file in the `e2e-tests/` directory:

```bash
# Required: GCP Project for Vertex AI (Claude agent)
ANTHROPIC_VERTEX_PROJECT_ID=<GCP Project ID>

# Required: StackRox Central API Token
STACKROX_MCP__CENTRAL__API_TOKEN=<StackRox API Token>

# Required: OpenAI API Key (for LLM judge)
OPENAI_API_KEY=<OpenAI API Key>

# Optional: Vertex AI region (defaults to us-east5)
CLOUD_ML_REGION=us-east5

# Optional: Judge configuration (defaults to OpenAI)
JUDGE_MODEL_NAME=gpt-5-nano
```
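
How `run-tests.sh` loads `.env` is not shown here. As a minimal pre-flight sketch, assuming the variables have already been exported into the process environment, the three settings marked "Required" above could be checked like this. `requiredVars` and `missingVars` are illustrative names, not part of this repository.

```go
package main

import (
	"fmt"
	"os"
)

// requiredVars lists the variables marked "Required" in the .env example.
var requiredVars = []string{
	"ANTHROPIC_VERTEX_PROJECT_ID",
	"STACKROX_MCP__CENTRAL__API_TOKEN",
	"OPENAI_API_KEY",
}

// missingVars returns every name in required that lookup reports as
// unset or empty. lookup has the same signature as os.LookupEnv so it
// can be replaced with a fake in tests.
func missingVars(required []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range required {
		if value, ok := lookup(name); !ok || value == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	missing := missingVars(requiredVars, os.LookupEnv)
	if len(missing) > 0 {
		fmt.Println("missing required variables:", missing)
		return
	}
	fmt.Println("all required variables are set")
}
```

The optional variables (`CLOUD_ML_REGION`, `JUDGE_MODEL_NAME`) are deliberately left out of the check because the example above documents defaults for them.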

## Running Tests

```bash
./scripts/run-tests.sh
```

Results are saved to `mcpchecker/mcpchecker-stackrox-mcp-e2e-out.json`.

### View Results

```bash
# Summary
jq '.[] | {taskName, taskPassed}' mcpchecker/mcpchecker-stackrox-mcp-e2e-out.json

# Tool calls
jq '[.[] | .callHistory.ToolCalls[]? | {name: .request.Params.name, arguments: .request.Params.arguments}]' mcpchecker/mcpchecker-stackrox-mcp-e2e-out.json
```
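
The same summary can be produced without `jq`. The sketch below assumes the output file is a top-level JSON array whose entries carry the `taskName` and `taskPassed` fields used in the filter above; the rest of the schema is not shown in this README, so it is ignored. `taskResult` and `failedTasks` are illustrative names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// taskResult mirrors only the two fields the jq summary relies on.
type taskResult struct {
	TaskName   string `json:"taskName"`
	TaskPassed bool   `json:"taskPassed"`
}

// failedTasks parses a results document and returns the names of the
// tasks whose taskPassed field is false.
func failedTasks(data []byte) ([]string, error) {
	var results []taskResult
	if err := json.Unmarshal(data, &results); err != nil {
		return nil, err
	}
	var failed []string
	for _, r := range results {
		if !r.TaskPassed {
			failed = append(failed, r.TaskName)
		}
	}
	return failed, nil
}

func main() {
	// Inline sample data; a real run would read the output file instead.
	sample := []byte(`[
		{"taskName": "list-clusters", "taskPassed": true},
		{"taskName": "cve-nonexistent", "taskPassed": false}
	]`)
	failed, err := failedTasks(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("failed tasks:", failed) // prints "failed tasks: [cve-nonexistent]"
}
```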

## Test Cases

| Test | Description | Tool |
|------|-------------|------|
| `list-clusters` | List all clusters | `list_clusters` |
| `cve-detected-workloads` | CVE detected in deployments | `get_deployments_for_cve` |
| `cve-detected-clusters` | CVE detected in clusters | `get_clusters_with_orchestrator_cve` |
| `cve-nonexistent` | Handle non-existent CVE | `get_clusters_with_orchestrator_cve` |
| `cve-cluster-does-exist` | CVE with cluster filter (cluster exists) | `list_clusters`, `get_clusters_with_orchestrator_cve` |
| `cve-cluster-does-not-exist` | CVE with cluster filter (cluster does not exist) | `list_clusters` |
| `cve-clusters-general` | General CVE query | `get_clusters_with_orchestrator_cve` |
| `cve-cluster-list` | CVE across clusters | `get_clusters_with_orchestrator_cve` |

## Configuration

- **`mcpchecker/eval.yaml`**: Main test configuration, agent settings, assertions
- **`mcpchecker/mcp-config.yaml`**: MCP server configuration
- **`mcpchecker/tasks/*.yaml`**: Individual test task definitions
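
The assertions in `eval.yaml` combine a list of required tools (`toolsUsed`) with bounds on the total number of tool calls (`minToolCalls`, `maxToolCalls`). The following is a rough sketch of those semantics, not mcpchecker's implementation: it assumes `toolPattern` is compared as an exact tool name and it ignores `argumentsMatch`. The `assertions` type and `check` function are illustrative.

```go
package main

import "fmt"

// assertions is a simplified stand-in for one task's assertion block.
type assertions struct {
	ToolsUsed    []string // tools that must each be called at least once
	MinToolCalls int      // lower bound on the total number of calls
	MaxToolCalls int      // upper bound on the total number of calls
}

// check returns nil when calls satisfies a, or an error naming the
// first violated rule. The bounds apply to the total across all
// tools, not to each tool separately.
func check(a assertions, calls []string) error {
	seen := make(map[string]bool, len(calls))
	for _, name := range calls {
		seen[name] = true
	}
	for _, want := range a.ToolsUsed {
		if !seen[want] {
			return fmt.Errorf("required tool %q was never called", want)
		}
	}
	if n := len(calls); n < a.MinToolCalls || n > a.MaxToolCalls {
		return fmt.Errorf("got %d tool calls, want between %d and %d",
			n, a.MinToolCalls, a.MaxToolCalls)
	}
	return nil
}

func main() {
	// Values taken from the cve-cluster-does-exist entry in eval.yaml.
	a := assertions{
		ToolsUsed:    []string{"list_clusters", "get_clusters_with_orchestrator_cve"},
		MinToolCalls: 2,
		MaxToolCalls: 4,
	}
	calls := []string{"list_clusters", "get_clusters_with_orchestrator_cve", "get_deployments_for_cve"}
	if err := check(a, calls); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("PASS")
}
```

In this sketch the extra `get_deployments_for_cve` call is allowed: it is not listed in `ToolsUsed`, but it only counts toward the total, which stays within the bounds.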

## How It Works

mcpchecker uses a proxy architecture to intercept MCP tool calls:

1. AI agent receives task prompt
2. Agent calls MCP tool
3. mcpchecker proxy intercepts and records the call
4. Call forwarded to StackRox MCP server
5. Server executes and returns result
6. mcpchecker validates assertions and response quality
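
The record-then-forward pattern in steps 3 to 5 can be sketched in a few lines. This is an illustration of the idea only, assuming a simplified in-process handler; mcpchecker's real proxy speaks the MCP protocol, and `toolHandler`, `recordedCall`, and `recorder` are hypothetical names.

```go
package main

import "fmt"

// toolHandler stands in for a server's tool-call entry point.
type toolHandler func(name string, args map[string]string) (string, error)

// recordedCall is what the proxy keeps for later assertions.
type recordedCall struct {
	Name string
	Args map[string]string
}

// recorder accumulates every call that passes through the proxy.
type recorder struct {
	calls []recordedCall
}

// wrap returns a handler that records each call and then forwards it
// unchanged to next, returning next's result to the caller.
func (r *recorder) wrap(next toolHandler) toolHandler {
	return func(name string, args map[string]string) (string, error) {
		r.calls = append(r.calls, recordedCall{Name: name, Args: args})
		return next(name, args)
	}
}

func main() {
	// A fake server that echoes the tool name.
	server := func(name string, args map[string]string) (string, error) {
		return "result of " + name, nil
	}

	rec := &recorder{}
	proxied := rec.wrap(server)

	out, err := proxied("list_clusters", nil)
	if err != nil {
		fmt.Println("call failed:", err)
		return
	}
	fmt.Println(out, "| recorded calls:", len(rec.calls))
}
```

Because the recorder sits between agent and server, the recorded call history is available for assertions after the run, whatever the server returned.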

## Troubleshooting

**Tests fail - no tools called**
- Verify StackRox Central is accessible
- Check API token permissions

**Build errors**
```bash
go mod tidy
./scripts/build-mcpchecker.sh
```

## Further Reading

- [mcpchecker Documentation](https://github.com/mcpchecker/mcpchecker)
- [StackRox MCP Server](../README.md)
110 changes: 110 additions & 0 deletions e2e-tests/mcpchecker/eval.yaml
@@ -0,0 +1,110 @@
kind: Eval
metadata:
name: "stackrox-mcp-e2e"
config:
agent:
type: "builtin.claude-code"
model: "claude-sonnet-4-5"
llmJudge:
env:
baseUrlKey: JUDGE_BASE_URL
apiKeyKey: JUDGE_API_KEY
modelNameKey: JUDGE_MODEL_NAME
mcpConfigFile: mcp-config.yaml
taskSets:
# Assertion Fields Explained:
# - toolsUsed: List of tools that MUST be called at least once
# - minToolCalls: Minimum TOTAL number of tool calls across ALL tools (not per-tool)
# - maxToolCalls: Maximum TOTAL number of tool calls across ALL tools (prevents runaway tool usage)
# Example: If maxToolCalls=3, the agent can make up to 3 tool calls total in the test,
# regardless of which tools are called.

# Test 1: List clusters
- path: tasks/list-clusters.yaml
assertions:
toolsUsed:
- server: stackrox-mcp
toolPattern: "list_clusters"
minToolCalls: 1
maxToolCalls: 1

# Test 2: CVE detected in workloads
# Claude does comprehensive CVE checking (orchestrator, deployments, nodes)
- path: tasks/cve-detected-workloads.yaml
assertions:
toolsUsed:
- server: stackrox-mcp
toolPattern: "get_deployments_for_cve"
argumentsMatch:
cveName: "CVE-2021-31805"
minToolCalls: 1
maxToolCalls: 3

# Test 3: CVE detected in clusters - basic
- path: tasks/cve-detected-clusters.yaml
assertions:
toolsUsed:
- server: stackrox-mcp
toolPattern: "get_clusters_with_orchestrator_cve"
argumentsMatch:
cveName: "CVE-2016-1000031"
minToolCalls: 1
maxToolCalls: 3

# Test 4: Non-existent CVE
# Allows up to 3 calls because "Is CVE detected in my clusters?" can trigger a comprehensive check
# (orchestrator, deployments, nodes). The LLM cannot know beforehand whether the CVE exists.
- path: tasks/cve-nonexistent.yaml
assertions:
toolsUsed:
- server: stackrox-mcp
toolPattern: "get_clusters_with_orchestrator_cve"
argumentsMatch:
cveName: "CVE-2099-00001"
minToolCalls: 1
maxToolCalls: 3

# Test 5: CVE with specific cluster filter (does exist)
# Claude does comprehensive checking even for single cluster (orchestrator, deployments, nodes)
- path: tasks/cve-cluster-does-exist.yaml
assertions:
toolsUsed:
- server: stackrox-mcp
toolPattern: "list_clusters"
- server: stackrox-mcp
toolPattern: "get_clusters_with_orchestrator_cve"
argumentsMatch:
cveName: "CVE-2016-1000031"
minToolCalls: 2
maxToolCalls: 4

# Test 6: CVE with specific cluster filter (does not exist)
- path: tasks/cve-cluster-does-not-exist.yaml
assertions:
toolsUsed:
- server: stackrox-mcp
toolPattern: "list_clusters"
minToolCalls: 1
maxToolCalls: 2

# Test 7: CVE detected in clusters - general
- path: tasks/cve-clusters-general.yaml
assertions:
toolsUsed:
- server: stackrox-mcp
toolPattern: "get_clusters_with_orchestrator_cve"
argumentsMatch:
cveName: "CVE-2021-31805"
minToolCalls: 1
maxToolCalls: 5

# Test 8: CVE check with cluster list reference
- path: tasks/cve-cluster-list.yaml
assertions:
toolsUsed:
- server: stackrox-mcp
toolPattern: "get_clusters_with_orchestrator_cve"
argumentsMatch:
cveName: "CVE-2024-52577"
minToolCalls: 1
maxToolCalls: 5
12 changes: 12 additions & 0 deletions e2e-tests/mcpchecker/mcp-config.yaml
@@ -0,0 +1,12 @@
mcpServers:
  stackrox-mcp:
    command: go
    args:
      - run
      - ../../cmd/stackrox-mcp/...
      - --config
      - ../stackrox-mcp-e2e-config.yaml
    # API token loaded from parent shell environment (.env file)
    # No env section = full environment inheritance
    # Auto-approve all tools
    enableAllTools: true
9 changes: 9 additions & 0 deletions e2e-tests/mcpchecker/tasks/cve-cluster-does-exist.yaml
@@ -0,0 +1,9 @@
kind: Task
metadata:
  name: "cve-cluster-does-exist"
  difficulty: easy
steps:
  prompt:
    inline: "Is this CVE-2016-1000031 detected in cluster name staging-central-cluster?"
  verify:
    contains: "A response indicating whether CVE-2016-1000031 is detected in staging-central-cluster"
9 changes: 9 additions & 0 deletions e2e-tests/mcpchecker/tasks/cve-cluster-does-not-exist.yaml
@@ -0,0 +1,9 @@
kind: Task
metadata:
  name: "cve-cluster-does-not-exist"
  difficulty: easy
steps:
  prompt:
    inline: "Is this CVE-2024-52577 detected in cluster name prod-hal-9000?"
  verify:
    contains: "A response indicating that CVE-2024-52577 is not detected in prod-hal-9000 cluster, or that the cluster does not exist"
9 changes: 9 additions & 0 deletions e2e-tests/mcpchecker/tasks/cve-cluster-list.yaml
@@ -0,0 +1,9 @@
kind: Task
metadata:
  name: "cve-cluster-list"
  difficulty: easy
steps:
  prompt:
    inline: "Is this CVE-2024-52577 detected in any of my clusters?"
  verify:
    contains: "A response indicating whether CVE-2024-52577 is detected in clusters"
9 changes: 9 additions & 0 deletions e2e-tests/mcpchecker/tasks/cve-clusters-general.yaml
@@ -0,0 +1,9 @@
kind: Task
metadata:
  name: "cve-clusters-general"
  difficulty: easy
steps:
  prompt:
    inline: "Is this CVE-2021-31805 detected in my clusters?"
  verify:
    contains: "A response indicating whether CVE-2021-31805 is detected in any clusters, with specific cluster names if detected, or a statement that no clusters have this CVE."
9 changes: 9 additions & 0 deletions e2e-tests/mcpchecker/tasks/cve-detected-clusters.yaml
@@ -0,0 +1,9 @@
kind: Task
metadata:
  name: "cve-detected-clusters"
  difficulty: easy
steps:
  prompt:
    inline: "Is this CVE-2016-1000031 detected in my clusters?"
  verify:
    contains: "A response indicating whether CVE-2016-1000031 is detected in clusters, nodes, or deployments"
9 changes: 9 additions & 0 deletions e2e-tests/mcpchecker/tasks/cve-detected-workloads.yaml
@@ -0,0 +1,9 @@
kind: Task
metadata:
  name: "cve-detected-workloads"
  difficulty: easy
steps:
  prompt:
    inline: "Is this CVE-2021-31805 detected in my workloads?"
  verify:
    contains: "A response indicating whether CVE-2021-31805 is detected in workloads/deployments, with specific deployment names if detected, or a statement that no deployments have this CVE."
9 changes: 9 additions & 0 deletions e2e-tests/mcpchecker/tasks/cve-nonexistent.yaml
@@ -0,0 +1,9 @@
kind: Task
metadata:
  name: "cve-nonexistent"
  difficulty: easy
steps:
  prompt:
    inline: "Is CVE-2099-00001 detected in my clusters?"
  verify:
    contains: "A response indicating that CVE-2099-00001 is not found or not detected in any clusters"
9 changes: 9 additions & 0 deletions e2e-tests/mcpchecker/tasks/list-clusters.yaml
@@ -0,0 +1,9 @@
kind: Task
metadata:
  name: "list-clusters"
  difficulty: easy
steps:
  prompt:
    inline: "List my clusters"
  verify:
    contains: "A response containing a list of cluster names"