feat(analyzer): policy-configurable security scoring (closes #49) by initializ-mk · Pull Request #53 · initializ/forge

initializ-mk · 2026-05-13T18:08:06Z

Summary

Fixes #49 — a custom skill's egress domain, binary, or env var can now be scored as non-risky without editing any file under forge-skills/analyzer/.

This closes the three wiring gaps called out in the issue:

SecurityPolicy.TrustedDomains existed but was never threaded into scoreEgress (always passed nil). It now is.
SecurityPolicy.MaxRiskScore was only consulted in policy violation checks. With the policy now flowing through scoring, that check uses the same effective policy.
forge skills audit hardcoded analyzer.DefaultPolicy() with no flag to override. A --policy <path> flag now loads a custom YAML.

Two new SecurityPolicy fields support binary and env-var overrides:

Field	Effect	Annotation in audit
`trusted_domains: [...]`	Domain scored as trusted (+2) instead of unknown (+10)	`trusted domain (via policy): X`
`acknowledged_bins: [...]`	Builtin high-risk binary scored as standard (+3) instead of high-risk (+15)	`high-risk binary (acknowledged by policy): X`
`acknowledged_env: [...]`	Builtin-sensitive env var scored as standard (+5) instead of sensitive (+10)	`sensitive variable (acknowledged by policy): X`

Builtin classifications still win — listing an already-builtin-trusted domain like api.github.com under trusted_domains is a no-op rather than re-attributing the trust to policy. This keeps audit reports consistent across projects that share the same builtin allowlist.

Auditability

Every policy-driven score reduction is recorded in the corresponding RiskFactor.Description. JSON and text audit output both surface "(via policy)" / "(acknowledged by policy)" next to the affected item, so operators can see which items were downgraded by policy without diffing scores.

Usage

# project.policy.yaml
trusted_domains:
  - internal.example.com
acknowledged_bins:
  - python
acknowledged_env:
  - DB_PASSWORD
max_risk_score: 90
script_policy: allow

forge skills audit --policy project.policy.yaml

A skill that previously scored 35 (script +20, python +15) might now score 23 (script +20, python +3 acknowledged) — and the audit report makes the override traceable.

API change

AnalyzeSkillDescriptor and AnalyzeSkillEntry now take a SecurityPolicy parameter. Both functions and their internal score helpers were internal to forge-skills/analyzer; the only external call sites are GenerateReport / GenerateReportFromEntries (which already accept a policy and now thread it through). Passing SecurityPolicy{} preserves the historical default scoring exactly.

Tests

gofmt -l forge-skills/ forge-cli/                                    # clean
golangci-lint run ./{forge-core,forge-cli,forge-plugins,forge-skills}/...  # 0 issues
cd forge-skills && go test ./...                                     # all pass
cd forge-cli && go test ./...                                        # all pass

New tests (analyzer package):

TestAnalyzeSkillDescriptor_PolicyTrustedDomain — domain in policy scored as trusted with via policy annotation.
TestAnalyzeSkillDescriptor_PolicyAcknowledgedBin — high-risk binary in policy drops to +3 with annotation.
TestAnalyzeSkillDescriptor_PolicyAcknowledgedEnv — sensitive env var in policy drops to +5 with annotation.
TestAnalyzeSkillDescriptor_PolicyPreservesDefault — builtin trust is not re-attributed to policy.
TestAnalyzeSkillDescriptor_PolicyDoesNotPromoteStandardBin — AcknowledgedBins cannot escalate a non-high-risk binary.
TestLoadPolicyFromFile — YAML round-trip for all new and existing fields.
TestLoadPolicyFromFile_PolicyAffectsScoring — end-to-end: YAML → LoadPolicyFromFile → scoreEgress/scoreBinaries honor it.
TestLoadPolicyFromFile_MissingFile, …_InvalidYAML — error paths.

Existing analyzer tests updated to pass SecurityPolicy{} so all prior expected scores are preserved unchanged.

Acceptance criteria from #49

Custom skill can have its egress domain, env var, or binary scored as non-risky without editing any file under forge-skills/analyzer/.
forge skills audit accepts a configurable policy that affects both scoring and policy checks, not only the latter.
Any score override path is auditable — leaves a trace in the audit report (factor description annotation).
Default behavior is unchanged for skills that declare no overrides and run under no custom policy.

Out of scope (deferred follow-ups)

Issue #49 listed two further candidate approaches that are intentionally not in this PR:

forge.security block in SKILL.md frontmatter — per-skill self-attestation (issue option 3).
forge.yaml project-level policy block — implicit policy without --policy flag (issue option 4).

Both build on this PR's policy-threading without further analyzer changes; they'd be net-additive parsers feeding the same SecurityPolicy struct. Worth a separate enhancement if there's demand.

Test plan

Run forge skills audit on the embedded registry without --policy and confirm the report matches the pre-PR output.
Create a policy.yaml with trusted_domains for a project-specific domain and confirm forge skills audit --policy policy.yaml lowers the corresponding skill's score and annotates the factor.
Confirm acknowledged_bins: [python] lowers a skill that requires python and surfaces "(acknowledged by policy)" in the text and JSON output.
Confirm a YAML with invalid content fails the command with a clear parse error.

Thread SecurityPolicy through AnalyzeSkillDescriptor / AnalyzeSkillEntry so operators can adjust scoring for custom skills without editing any file under forge-skills/analyzer/. - types.go: extend SecurityPolicy with AcknowledgedBins and AcknowledgedEnv; document TrustedDomains plus the two new fields as scoring overrides distinct from policy-check fields. - scoring.go: wire policy.TrustedDomains into scoreEgress (was always passed nil — the existing wiring gap). Add policy-aware variants for scoreBinaries and scoreEnv so AcknowledgedBins / AcknowledgedEnv lower the points and mark the RiskFactor description with "(acknowledged by policy)" or "(via policy)" so the override is visible in audit output. Builtin classifications still win to preserve audit consistency across projects. - policy_loader.go: LoadPolicyFromFile parses a YAML SecurityPolicy. - cmd/skills.go: --policy <path> flag on `forge skills audit` replaces DefaultPolicy() with the YAML-loaded one for both scoring and policy checks. - Tests: full coverage of overrides (TrustedDomains, AcknowledgedBins, AcknowledgedEnv), confirmation that builtin trust isn't re-attributed to policy, that AcknowledgedBins doesn't promote standard binaries, and a YAML loader round-trip including the end-to-end scoring effect.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(analyzer): policy-configurable security scoring (closes #49)#53

feat(analyzer): policy-configurable security scoring (closes #49)#53
initializ-mk wants to merge 1 commit into
mainfrom
enhancement/skill-security-score

initializ-mk commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

initializ-mk commented May 13, 2026

Summary

Auditability

Usage

API change

Tests

Acceptance criteria from #49

Out of scope (deferred follow-ups)

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant