Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 2 additions & 5 deletions .github/workflows/bazel.yml
Original file line number Diff line number Diff line change
Expand Up @@ -298,11 +298,8 @@ jobs:
env:
MSYS_NO_PATHCONV: 1
MSYS2_ARG_CONV_EXCL: "*"
run: |
mkdir -p build
{
${{ inputs.run }}
} 2>&1 | tee build/bazel-console.log
BAZEL_RUN_CMD: ${{ inputs.run }}
run: ./scripts/github-actions/run-bazel-with-retry.sh
- name: Rerun failures with debug
if: failure() && steps.run-bazel.outcome == 'failure'
shell: bash
Expand Down
43 changes: 43 additions & 0 deletions scripts/github-actions/run-bazel-with-retry.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
#!/usr/bin/env bash
# Run the Bazel command supplied via the BAZEL_RUN_CMD environment variable,
# retrying up to a few times on transient GitHub CDN errors (HTTP 5xx during
# repo fetch). For any other failure, exits with Bazel's actual exit code so
# downstream "rerun with debug" behavior triggers normally.
#
# Usage: BAZEL_RUN_CMD="<bazel command>" run-bazel-with-retry.sh

set -uo pipefail

: "${BAZEL_RUN_CMD:?usage: BAZEL_RUN_CMD=\"<bazel command>\" $0}"
LOG_FILE="${BAZEL_CONSOLE_LOG:-build/bazel-console.log}"
BAZEL_MAX_ATTEMPTS=3
mkdir -p "$(dirname "$LOG_FILE")"

# Matches Bazel's HttpConnector error format for 502/503/504 responses
BAZEL_ERROR_PATTERN='GET returned 50[234] '

for i in $(seq 1 "$BAZEL_MAX_ATTEMPTS"); do
bash -c "$BAZEL_RUN_CMD" 2>&1 | tee "$LOG_FILE"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Tee failures ignored 🐞 Bug ☼ Reliability

run-bazel-with-retry.sh only considers Bazel’s exit code, so if tee fails to write the console log
(e.g., permission/disk-full), the step can still exit 0 and silently drop build/bazel-console.log.
This can hide infrastructure problems and break downstream tooling that parses that log on failures.
Agent Prompt
## Issue description
`run-bazel-with-retry.sh` pipes Bazel output to `tee`, but it only uses Bazel’s exit code (`PIPESTATUS[0]`). If `tee` fails (can’t write the log), the script may still report success and/or proceed with retry logic based on incomplete logs.

## Issue Context
Downstream logic expects `build/bazel-console.log` to be present and readable when diagnosing failures.

## Fix Focus Areas
- scripts/github-actions/run-bazel-with-retry.sh[20-21]

### Implementation guidance
After the pipeline, capture both statuses (Bazel + tee) and fail fast on `tee` errors, e.g.:
- `BAZEL_EXIT_CODE=${PIPESTATUS[0]}`
- `TEE_EXIT_CODE=${PIPESTATUS[1]}`
- if `TEE_EXIT_CODE != 0`, print an error and `exit $TEE_EXIT_CODE` (or a fixed non-zero code).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

BAZEL_EXIT_CODE=${PIPESTATUS[0]}

if [ "$BAZEL_EXIT_CODE" -eq 0 ]; then
exit 0
fi

if grep -qE "$BAZEL_ERROR_PATTERN" "$LOG_FILE"; then
if [ "$i" -ge "$BAZEL_MAX_ATTEMPTS" ]; then
break
fi
SLEEP=$((15 * i))
{
echo "⚠️ Transient CDN error detected (5xx). Retrying in ${SLEEP}s... (attempt $i of $BAZEL_MAX_ATTEMPTS)"
grep -E "$BAZEL_ERROR_PATTERN" "$LOG_FILE" | head -5
} >&2
sleep "$SLEEP"
else
exit "$BAZEL_EXIT_CODE"
fi
done

echo "❌ Exhausted retries for CDN errors after $BAZEL_MAX_ATTEMPTS attempts." >&2
exit "$BAZEL_EXIT_CODE"