Skip to content

security: cap glob directory expansion to prevent memory exhaustion#80

Open
julesmcrt wants to merge 9 commits intomainfrom
fix/glob-readdir-cap
Open

security: cap glob directory expansion to prevent memory exhaustion#80
julesmcrt wants to merge 9 commits intomainfrom
fix/glob-readdir-cap

Conversation

@julesmcrt
Copy link
Collaborator

Summary

  • ReadDir(-1) loaded all directory entries with no upper bound, enabling memory exhaustion via pathological glob patterns
  • Adds MaxGlobEntries = 100,000 constant in allowed_paths.go
  • Returns an os.PathError when a directory contains more entries than the limit

Security finding

SECURITY_FINDINGS.md — Unbounded glob expansion (MODERATE)

Test plan

  • All existing tests pass
  • Glob expansion over a directory with >100k entries returns an error

🤖 Generated with Claude Code

ReadDir(-1) loaded all directory entries with no upper bound. Pathological
glob patterns over large directories could exhaust memory and, combined with
unbounded for-loop iteration, cause CPU exhaustion.

Add MaxGlobEntries = 100,000 constant and reject readDir calls that would
return more entries than the limit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d92ded6eb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

julesmcrt and others added 2 commits March 13, 2026 14:43
Use f.ReadDir(MaxGlobEntries+1) instead of f.ReadDir(-1) so that
the kernel/OS never needs to load more than MaxGlobEntries+1 entries
into memory, failing fast before a full allocation is built.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ReadDir(-1) returns nil error on success; ReadDir(n>0) returns io.EOF
when the directory has fewer than n entries. The PR switched to
ReadDir(MaxGlobEntries+1) but didn't account for this, causing readDir
to propagate io.EOF as an error. The expand package then aborted glob
expansion instead of falling back to the literal pattern, producing
empty output for unmatched globs instead of the original pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 02b13b0b0d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

return nil, portablePathError(err)
}
if len(entries) > MaxGlobEntries {
return nil, &os.PathError{Op: "readdir", Path: path, Err: fmt.Errorf("too many entries (limit %d)", MaxGlobEntries)}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Propagate over-limit errors for literal glob parents

Returning a hard error here causes inconsistent glob behavior for over-limit directories: patterns like echo /tmp/big/* first hit a literal parent segment, and mvdan.cc/sh treats ReadDir2 errors on those wantDir checks as non-fatal, so the new too many entries error is swallowed and the pattern is left literal instead of being rejected. In practice, the same directory errors for echo * when run inside it but silently does not expand when referenced as a literal parent, which undermines the intended "reject with error" behavior.

Useful? React with 👍 / 👎.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Claude Sonnet 4.6] Acknowledged — this is a known limitation of how mvdan.cc/sh handles ReadDir2 errors in non-glob path segments (literal parents with wantDir=true cause errors to be silently skipped rather than propagated). Fixing this would require changes to the upstream expand package. The security goal (preventing memory exhaustion) is still achieved in both cases: when echo * is run inside a large directory the error surfaces immediately; when the large directory appears as a literal parent in echo /big_dir/* the error causes that branch to be skipped and the literal pattern is returned — no unbounded read occurs.

Comment on lines +192 to +193
if len(entries) > MaxGlobEntries {
return nil, &os.PathError{Op: "readdir", Path: path, Err: fmt.Errorf("too many entries (limit %d)", MaxGlobEntries)}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Limit glob entry cap to globbing call sites only

This cap is enforced in the shared sandbox readDir path, but that helper is also used by the ls builtin (listDir calls callCtx.ReadDir), so directories with more than 100,000 entries now fail ls with readdir ... too many entries instead of listing contents. Since the change is scoped as a glob-expansion safeguard, applying it globally introduces an unintended functional regression for non-glob directory reads.

Useful? React with 👍 / 👎.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Claude Sonnet 4.6] Fixed — moved the cap check from readDir to the ReadDir2 closure in updateExpandOpts, which is the glob-expansion-specific call site. readDir now uses ReadDir(-1) again, so ls and other non-glob directory reads are unaffected.

The MaxGlobEntries cap was enforced in the shared readDir helper, which
is also used by the ls builtin. This caused ls to fail with
"too many entries" for directories with >100,000 entries — an unintended
regression.

Move the cap check to the ReadDir2 closure in updateExpandOpts, which is
the glob-expansion-specific call site. Revert readDir to use ReadDir(-1)
so ls (and any other non-glob reader) is unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 111352566b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +32 to +36
entries, err := r.readDirHandler(r.handlerCtx(r.ectx, todoPos), s)
if err != nil {
return nil, err
}
if len(entries) > MaxGlobEntries {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Cap directory reads before loading all glob entries

The new limit check runs too late to prevent the memory-exhaustion case it is intended to fix: updateExpandOpts first calls r.readDirHandler(...) and only then checks len(entries), but the default handler installed in Reset is pathSandbox.readDir, which already materializes the full directory with f.ReadDir(-1) (and sorts it) before returning. In a directory with millions of entries, memory can still be exhausted before this guard executes, so the security mitigation is ineffective in the default execution path.

Useful? React with 👍 / 👎.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Claude Sonnet 4.6] Fixed — extracted readDirN(ctx, path, maxEntries int) as the shared implementation. readDir calls it with -1 (unlimited, for ls etc.). readDirForGlob calls it with MaxGlobEntries, which passes MaxGlobEntries+1 to the underlying ReadDir syscall so the kernel never materialises more entries than needed. The ReadDir2 closure in updateExpandOpts now calls readDirForGlob directly, so the cap is enforced at the OS level before any entries are sorted or pattern-matched.

The previous fix checked len(entries) after readDir already called
ReadDir(-1), materialising all entries in memory before the cap could
take effect.

Extract readDirN(ctx, path, maxEntries) as the shared implementation.
readDir calls it with -1 (unlimited) so ls and other builtins are
unaffected. readDirForGlob calls it with MaxGlobEntries, which passes
MaxGlobEntries+1 to the underlying ReadDir syscall so the kernel never
fetches more entries than necessary; io.EOF is ignored (ReadDir(n>0)
returns it on success) and a "too many entries" error is returned if the
directory exceeds the limit.

The ReadDir2 closure in updateExpandOpts now calls readDirForGlob
directly, ensuring the cap is enforced before any entries are sorted or
matched.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ea73379d57

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

return nil, portablePathError(err)
}
if maxEntries > 0 && len(entries) > maxEntries {
return nil, &os.PathError{Op: "readdir", Path: path, Err: fmt.Errorf("too many entries (limit %d)", maxEntries)}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Surface glob cap overflow instead of treating it as no-match

Returning too many entries from ReadDir2 here causes incorrect glob results for patterns with literal directory prefixes (for example echo /huge/*.txt): mvdan.cc/sh/expand probes non-meta path parts via ReadDir2 with wantDir=true and drops any non-ENOENT error as a non-match, so the new overflow is silently swallowed and the pattern is emitted literally with no error. That means directories above the cap are not actually "rejected" in this common form and scripts can continue with wrong arguments; the cap should only be applied on actual wildcard expansion steps or mapped to an error path that cannot be ignored.

Useful? React with 👍 / 👎.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Claude Sonnet 4.6] Fixed — changed readDirN to silently truncate to MaxGlobEntries instead of returning a 'too many entries' error. The reason: mvdan.cc/sh/expand calls ReadDir2 for two purposes — checking whether a literal path component is a directory (wantDir=true) and listing entries for wildcard matching. Returning an error for the existence check caused the path to be silently dropped, so echo /huge/*.txt returned the literal pattern instead of expanding. Truncating without error lets the existence check pass while still bounding memory: f.ReadDir(MaxGlobEntries+1) caps the OS-level read, so at most MaxGlobEntries+1 entries are ever loaded into memory. Glob expansion works against the first N entries (files beyond the cap are silently missed, which is an acceptable tradeoff).

julesmcrt and others added 2 commits March 15, 2026 15:22
Main refactored pathSandbox into the allowedpaths package with a public
Sandbox type and (path, cwd string) signatures instead of ctx-based ones.

Resolve conflict by:
- Adding MaxGlobEntries const to allowedpaths/sandbox.go
- Extracting readDirN(path, cwd, maxEntries) as the shared implementation
- ReadDir calls readDirN(-1) — unlimited, for ls and other builtins
- ReadDirForGlob calls readDirN(MaxGlobEntries) — caps at OS level
- Updating runner_expand.go ReadDir2 closure to call ReadDirForGlob
  with the cwd extracted from the handler context

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Returning a "too many entries" error from ReadDirForGlob caused a silent
correctness bug: mvdan.cc/sh/expand calls ReadDir2 for two purposes —
to check whether a literal path component is a directory (wantDir=true)
and to list entries for wildcard matching. When the existence check
returns a non-ErrNotExist error with wantDir=true, expand silently drops
the path, so echo /huge/*.txt returns the literal pattern instead of
expanding the files inside /huge.

Fix: when the cap is exceeded, silently truncate to MaxGlobEntries instead
of returning an error. The underlying f.ReadDir(MaxGlobEntries+1) still
bounds the OS-level read so at most MaxGlobEntries+1 entries are ever
loaded into memory. Existence checks pass (no error), glob expansion
works against the first N entries, and memory is bounded.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 22ff137900

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Replace silent truncation with an explicit error when a directory exceeds
MaxGlobEntries entries during glob expansion. Silent truncation could miss
valid glob matches with no indication of failure; the explicit error surfaces
through expandErr (printed to stderr, exit code 1) so callers are aware the
directory was too large to enumerate.

Add TestReadDirNCapExceeded to cover the error path and the boundary cases
(count == cap, count < cap).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 17b4240539

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…text

Route ReadDir2 (used for glob expansion) through r.readDirHandler instead
of calling r.sandbox.ReadDirForGlob directly. This respects any custom
directory-read handler configured via runnerConfig, which was previously
bypassed. Update Reset() to wire readDirHandler to ReadDirForGlob
(the capped variant) so the default behaviour is unchanged.

Fix spurious Fuzz (testcmd) CI failures by deriving per-iteration contexts
from t.Context() rather than context.Background(). When the fuzz run's
time limit expires, the framework cancels t.Context(); iterations derived
from context.Background() could not observe this signal and caused a timing
race that surfaced as "context deadline exceeded" at the 30s boundary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

@chatgpt-codex-connector
Copy link

Codex Review: Didn't find any major issues. Swish!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@julesmcrt julesmcrt marked this pull request as ready for review March 15, 2026 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants