fix: publish-to-pages — handle large presentations with external assets#1090
fix: publish-to-pages — handle large presentations with external assets#1090aaronpowell merged 1 commit intostagedfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves the publish-to-pages skill to better handle large PPTX/PDF inputs by switching to an external assets/ directory when output would otherwise become an extremely large single HTML file, and updates publishing to support multi-file output.
Changes:
- Add auto-detection (and manual flags) to emit images/pages into an
assets/directory for large PPTX/PDF conversions. - Add large-file preflight warnings (150MB+) for PPTX/PDF conversions.
- Update
publish.shto publishindex.htmlplus an adjacentassets/directory when present.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| skills/publish-to-pages/scripts/publish.sh | Copies an adjacent assets/ folder into the repo and stages all files for publishing. |
| skills/publish-to-pages/scripts/convert-pptx.py | Adds external-assets mode, large-file detection/warnings, and CLI flags for PPTX conversion. |
| skills/publish-to-pages/scripts/convert-pdf.py | Adds external-assets mode, large-file detection/warnings, and CLI flags for PDF conversion. |
| skills/publish-to-pages/SKILL.md | Documents external-assets mode behavior, thresholds, and publishing requirements. |
Comments suppressed due to low confidence (1)
skills/publish-to-pages/scripts/convert-pptx.py:374
--external-assetsand--no-external-assetscan both be provided at the same time; currently the script will silently prefer--external-assets. Make these flags mutually exclusive (and produce a clear argparse error when both are set) so the CLI has deterministic, user-friendly behavior.
parser = argparse.ArgumentParser(description="Convert PPTX to HTML presentation")
parser.add_argument("input", help="Path to .pptx file")
parser.add_argument("output", nargs="?", help="Output HTML path (default: same name with .html)")
parser.add_argument("--external-assets", action="store_true", default=None,
help="Save images as separate files in assets/ directory (auto-detected for large files)")
parser.add_argument("--no-external-assets", action="store_true",
help="Force inline base64 even for large files")
args = parser.parse_args()
ext_assets = None # auto-detect
if args.external_assets:
ext_assets = True
elif args.no_external_assets:
ext_assets = False
| # Copy assets directory if it exists alongside the HTML file | ||
| if [ -d "$HTML_DIR/assets" ]; then | ||
| cp -r "$HTML_DIR/assets" "$TMPDIR/assets" | ||
| echo "Copied assets/ directory ($(find "$HTML_DIR/assets" -type f | wc -l) files)" |
There was a problem hiding this comment.
Because this script runs with set -euo pipefail, the find ... | wc -l pipeline inside the echo will cause the entire publish to abort if find returns non-zero (e.g., permission errors, unreadable files, broken symlinks). Consider making the file-counting best-effort (suppress errors / avoid failing the script) or dropping the count entirely so copying assets/ can’t fail due to the diagnostic output.
| echo "Copied assets/ directory ($(find "$HTML_DIR/assets" -type f | wc -l) files)" | |
| asset_count=$(find "$HTML_DIR/assets" -type f 2>/dev/null | wc -l || true) | |
| echo "Copied assets/ directory (${asset_count:-0} files)" |
| import io | ||
| import os | ||
| import re |
There was a problem hiding this comment.
io and re are imported but not used anywhere in this script. Removing unused imports will reduce noise and avoid confusing future maintainers about expected behavior.
This issue also appears on line 361 of the same file.
| import io | |
| import os | |
| import re | |
| import os |
| parser = argparse.ArgumentParser(description="Convert PDF to HTML presentation") | ||
| parser.add_argument("input", help="Path to .pdf file") | ||
| parser.add_argument("output", nargs="?", help="Output HTML path (default: same name with .html)") | ||
| parser.add_argument("--external-assets", action="store_true", default=None, | ||
| help="Save page images as separate files in assets/ directory (auto-detected for large files)") | ||
| parser.add_argument("--no-external-assets", action="store_true", | ||
| help="Force inline base64 even for large files") | ||
| parser.add_argument("--dpi", type=int, default=150, help="Render DPI (default: 150)") | ||
| args = parser.parse_args() | ||
|
|
||
| ext_assets = None | ||
| if args.external_assets: | ||
| ext_assets = True | ||
| elif args.no_external_assets: | ||
| ext_assets = False |
There was a problem hiding this comment.
--external-assets and --no-external-assets can both be provided at the same time; currently the script will silently prefer --external-assets. Make these flags mutually exclusive (and produce a clear argparse error when both are set) so users can’t accidentally get the opposite behavior than intended.
Fixes large PPTX/PDF handling per AndreaGriffiths11/publish-to-pages#1
Changes
publish.shnow handles multi-file repos (HTML + assets/ directory)--external-assets/--no-external-assetsflags for manual control