Improve Broken Backlinks SEO Quality #1252
Draft
🎯 Changes
Use Anchor Text for Matching
The clickable text of a link (anchor text) is now the primary signal when finding replacement URLs. If a broken link says "pricing plans", we prioritize pages about pricing rather than just matching URL patterns. → Better semantic matching
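To illustrate what "anchor text as the primary signal" means, here is a minimal sketch. It is not the matching used in this PR (that is done by the AI model in mystique). It swaps in a simple token-overlap score, and the function names and the `anchor_weight` value are hypothetical.

```python
import re
from urllib.parse import urlparse


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query: set[str], target: set[str]) -> float:
    """Fraction of query tokens that also appear in target."""
    return len(query & target) / len(query) if query else 0.0


def score_candidate(anchor_text: str, broken_url: str, candidate_url: str,
                    anchor_weight: float = 0.7) -> float:
    """Hypothetical score: anchor text dominates, URL pattern is secondary."""
    candidate = tokens(urlparse(candidate_url).path)
    anchor_score = overlap(tokens(anchor_text), candidate)
    url_score = overlap(tokens(urlparse(broken_url).path), candidate)
    return anchor_weight * anchor_score + (1 - anchor_weight) * url_score


def best_match(anchor_text: str, broken_url: str, candidates: list[str]) -> str:
    """Candidate URL with the highest hypothetical score."""
    return max(candidates, key=lambda c: score_candidate(anchor_text, broken_url, c))
```

With anchor text "pricing plans" and a broken URL under `/blog/`, this sketch prefers a `/pricing/plans` page over another `/blog/` page, even though the latter looks closer by URL pattern.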
Stop Suggesting Homepage as Fallback
Previously, when no good match was found, we suggested redirecting to the homepage. Google can treat redirects from deep URLs to the homepage as "soft 404s", which hurts SEO. Now we return no suggestion rather than a bad one. → Protects SEO health
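The "no suggestion rather than a bad one" rule can be sketched as follows. This is only an illustration under assumptions: the `(url, score)` input shape, the `min_score` threshold, and the function names are hypothetical and not taken from the PR.

```python
from typing import Optional
from urllib.parse import urlparse


def is_homepage(url: str) -> bool:
    """True when the URL has no path beyond the site root."""
    return urlparse(url).path in ("", "/")


def pick_suggestion(scored: list[tuple[str, float]],
                    min_score: float = 0.5) -> Optional[str]:
    """Return the best non-homepage candidate, or None if nothing is good enough.

    `scored` is a list of (url, score) pairs; the threshold is a made-up value.
    """
    usable = [(url, score) for url, score in scored
              if score >= min_score and not is_homepage(url)]
    if not usable:
        return None  # deliberately no homepage fallback
    return max(usable, key=lambda pair: pair[1])[0]
```

Callers then need to handle `None` explicitly, for example by leaving the suggestion empty for that broken link.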
Prioritize High-Authority Backlinks
Previously, broken links were ranked by traffic volume alone. Now we also factor in Domain Rating (the Ahrefs site-authority score): a link from a trusted site is worth more than many links from unknown blogs. High-authority backlinks appear at the top. → Fix valuable links first
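One possible way to rank with Domain Rating is sketched below. The PR description does not state the exact formula, so the ordering here (Domain Rating first, traffic as tie-breaker) and the field names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class BrokenBacklink:
    # Field names are hypothetical, chosen for this sketch only.
    url_to: str
    domain_rating: float  # authority of the linking site, 0-100
    traffic: int          # traffic volume, the previous sole ranking signal


def rank_backlinks(links: list[BrokenBacklink]) -> list[BrokenBacklink]:
    """Sort by Domain Rating, then by traffic, both descending (assumed order)."""
    return sorted(links, key=lambda l: (l.domain_rating, l.traffic), reverse=True)
```

Under this ordering, a broken link from a high-authority site is listed before a higher-traffic link from a low-authority site.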
Ensure Content Diversity in Matching
Previously, alternative URLs were just the top 200 by traffic, which often skewed toward one content type. Now we sample proportionally across different sections of the site to find good matches for any type of content. → Matches for all content types
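Proportional sampling across sections could look like the sketch below. It assumes a "section" is the first path segment of a URL and uses largest-remainder rounding to fill the quota; both are illustrative choices, not details confirmed by the PR.

```python
from collections import defaultdict
from urllib.parse import urlparse


def section_of(url: str) -> str:
    """First path segment, e.g. 'blog' for /blog/post-1 (assumed definition)."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    return parts[0] if parts else ""


def sample_by_section(pages: list[tuple[str, int]], limit: int = 200) -> list[str]:
    """Pick up to `limit` URLs, splitting the quota across sections by size.

    `pages` is a list of (url, traffic) pairs. Within a section, the
    highest-traffic pages are taken first.
    """
    if len(pages) <= limit:
        return [url for url, _ in pages]

    groups: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for url, traffic in pages:
        groups[section_of(url)].append((url, traffic))

    total = len(pages)
    # Integer share per section, then hand leftovers to the largest remainders.
    quotas = {s: limit * len(g) // total for s, g in groups.items()}
    leftover = limit - sum(quotas.values())
    by_remainder = sorted(groups, key=lambda s: (limit * len(groups[s])) % total,
                          reverse=True)
    for s in by_remainder[:leftover]:
        quotas[s] += 1

    picked: list[str] = []
    for s, group in groups.items():
        group.sort(key=lambda p: p[1], reverse=True)
        picked.extend(url for url, _ in group[:quotas[s]])
    return picked
```

Note that with strictly proportional quotas a very small section can still receive zero slots; a minimum of one per section would be a separate design choice.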
Reduce AI Processing Costs
Removed a redundant AI analysis step that duplicated work already done earlier in the pipeline. Saves ~33% of AI tokens per broken link with no loss in quality. → Lower costs
🔗 Related PRs
spacecat-shared: Adds new data fields from the Ahrefs API (Improve Broken Backlinks SEO Quality #1252)
spacecat-audit-worker: Implements ranking, sampling, and prompt changes (Improve Broken Backlinks SEO Quality spacecat-audit-worker#1830)
mystique: Updates AI matching to use anchor text and new rules (https://git.corp.adobe.com/experience-platform/mystique/pull/1107)