
fix(web): strengthen URL validation and redirect handling in OG scraper #1059

Open
RinZ27 wants to merge 1 commit into supermemoryai:main from RinZ27:fix-og-scraper-ssrf


RinZ27 commented Jun 6, 2026

Strengthened the isPrivateHost check to also block the cloud metadata IP used by AWS, GCP, and Azure (169.254.169.254) and the unspecified addresses (0.0.0.0, ::), which can behave like loopback on some systems. This keeps the OG scraper from being used to probe sensitive cloud metadata or internal services.
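
For reviewers, a minimal sketch of what a hostname blocklist along these lines could look like. This is not the code in the diff: the function body, the exact ranges covered, and the bracket handling are assumptions for illustration. A string check like this also does not cover public hostnames that resolve to private addresses.

```typescript
// Hypothetical sketch; the PR's actual isPrivateHost may differ.
function isPrivateHost(hostname: string): boolean {
  // Lowercase and strip the brackets that URL.hostname keeps around IPv6 literals.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");

  if (host === "localhost" || host.endsWith(".localhost")) return true;

  // IPv6 unspecified (::) and loopback (::1) addresses.
  if (host === "::" || host === "::1") return true;

  // Only dotted-quad IPv4 literals are inspected below.
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!match) return false;

  const a = Number(match[1]);
  const b = Number(match[2]);

  if (a === 0) return true; // 0.0.0.0/8, includes the unspecified address
  if (a === 127) return true; // loopback
  if (a === 10) return true; // private (10.0.0.0/8)
  if (a === 172 && b >= 16 && b <= 31) return true; // private (172.16.0.0/12)
  if (a === 192 && b === 168) return true; // private (192.168.0.0/16)
  if (a === 169 && b === 254) return true; // link-local, includes 169.254.169.254
  return false;
}
```

In this sketch the whole 169.254.0.0/16 link-local range is blocked rather than only the single metadata address, which is a design choice of the example and not necessarily of the PR.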

Switched the fetch call to redirect: 'manual' so that redirects are no longer followed automatically, which could otherwise bypass the host check. The implementation now validates each redirect location against the private host blocklist before following it, providing a more robust defense against SSRF attempts.
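
The redirect flow described above could be sketched roughly as follows. The function name fetchFollowingSafeRedirects, the HttpResponseLike interface, the injected fetcher, and the hop limit of 5 are all assumptions made for this example, not names from the diff. It also assumes a server runtime where redirect: 'manual' exposes the 3xx status and Location header.

```typescript
// Hypothetical sketch of manual redirect handling; not the PR's actual code.
interface HttpResponseLike {
  status: number;
  headers: { get(name: string): string | null };
}

type Fetcher<R extends HttpResponseLike> = (
  url: string,
  init: { redirect: "manual" },
) => Promise<R>;

async function fetchFollowingSafeRedirects<R extends HttpResponseLike>(
  startUrl: string,
  isPrivateHost: (hostname: string) => boolean,
  fetcher: Fetcher<R>,
  maxRedirects = 5, // assumed limit
): Promise<R> {
  let current = new URL(startUrl);

  for (let hop = 0; hop <= maxRedirects; hop++) {
    // Validate every URL before requesting it, including redirect targets.
    if (current.protocol !== "http:" && current.protocol !== "https:") {
      throw new Error(`Unsupported protocol: ${current.protocol}`);
    }
    if (isPrivateHost(current.hostname)) {
      throw new Error(`Blocked host: ${current.hostname}`);
    }

    const response = await fetcher(current.href, { redirect: "manual" });
    const location = response.headers.get("location");
    const isRedirect = response.status >= 300 && response.status < 400;

    // Not a redirect (or no Location header): hand the response back.
    if (!isRedirect || !location) return response;

    // Resolve relative Location values against the current URL.
    current = new URL(location, current);
  }

  throw new Error("Too many redirects");
}
```

In a server context the global fetch could be passed as the fetcher argument; injecting it keeps the loop testable without network access.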

Added a Content-Type check to ensure the scraper only processes text/html. This prevents the server from attempting to download and process large binary files, reducing the risk of resource exhaustion during scraping operations.
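
As an illustration of the Content-Type gate, a check of this kind might look like the sketch below. The helper name isHtmlContentType is hypothetical, and treating a missing header as non-HTML is an assumption of the example rather than something stated in this PR.

```typescript
// Hypothetical helper; the PR's actual check may be written inline.
function isHtmlContentType(contentType: string | null): boolean {
  // Assumption: a missing header is treated as "not HTML".
  if (!contentType) return false;

  // Drop parameters such as "; charset=utf-8" and compare case-insensitively.
  const [mediaType = ""] = contentType.split(";");
  return mediaType.trim().toLowerCase() === "text/html";
}
```

A caller would typically pass response.headers.get("content-type") and skip reading the body when the result is false.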

Refactored the HTML metadata extraction into a reusable processHtml helper to maintain clean logic across the new redirect handling flow. All existing functionality for Open Graph and Twitter tags remains intact.

graphite-app bot requested a review from Dhravya on June 6, 2026 15:03
