Hreflang Tags Done Right: A Field Guide for Multilingual Sites

The Question Hreflang Is Actually Answering

Most teams treat hreflang as a metadata afterthought — a tag the CMS emits because somebody added the requirement to a Jira ticket three years ago. Then a French user lands on the German page, a Spanish-speaking user in Mexico lands on the Spanish-from-Spain version with the wrong currency, and the support inbox lights up with bug reports that look like product issues but are actually SEO issues. Hreflang exists to answer one narrow question: which of my URLs should Google send to a user who searches in language L from country C? Everything in this article comes back to that question.

The mechanics matter because Google does not use hreflang as a ranking signal. It uses it as a routing signal. If the tag is wrong, the wrong URL is shown to that audience. If the tag is missing, Google falls back to its own language detection, which works surprisingly well on simple sites and surprisingly badly the moment your locales overlap or your URLs do not contain obvious language hints. This article is the field guide I wish I had had the first time I shipped hreflang on a serious multilingual project. The companion tool, the browser-only Hreflang Checker, implements the validation rules described below so you can audit your own implementation in seconds.

The Three Delivery Methods (and Why The Sitemap Approach Scales)

Google accepts hreflang declarations in three places, and the three are equivalent for the search engine. The choice between them is purely an engineering one, but it has real consequences once your site grows.

HTML head link tags. The most common approach. Each page emits a <link rel="alternate" hreflang="..." href="..."> for itself and every sibling. Easy to implement because the per-page metadata is naturally templated. Hard to debug at scale because you have to crawl the site to see what was actually emitted, and a templating bug on one URL silently breaks the entire cluster.

HTTP Link response headers. Same payload, different transport. Useful when the response body is not HTML (PDFs, images, JSON) but you still want to declare alternates. Rarely used for HTML pages because the headers are harder to inspect than the head tags.

XML sitemap xhtml:link annotations. The approach large enterprise sites converge on. Each <url> entry in the sitemap declares its alternates inline using <xhtml:link rel="alternate" hreflang="..." href="..."/> elements. The advantage: every international declaration lives in one regenerable artifact. The CMS exports it as part of the sitemap build, the file goes into version control, and the entire cluster topology becomes a diff-friendly text file instead of a thousand-page crawl.

If your site has fewer than a few thousand pages and one or two locales, the HTML head approach is fine. If you have more locales than you can count on one hand, more than 10,000 pages, or a content team that regenerates the sitemap from a CMS export anyway, the sitemap approach pays for itself within one release cycle because the validator can read the sitemap directly and answer "is the cluster topology correct?" without a crawl.
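To make the "same payload, different transport" point concrete, here is a minimal sketch that renders one cluster in all three forms. The cluster contents, the example.com URLs, and the function names are illustrative assumptions, not output from any particular CMS.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical cluster: hreflang value -> absolute URL (example.com is a placeholder).
CLUSTER = {
    "en": "https://example.com/en/pricing/",
    "fr": "https://example.com/fr/tarifs/",
    "x-default": "https://example.com/en/pricing/",
}


def head_tags(cluster: dict[str, str]) -> list[str]:
    """One <link> element per alternate, for the HTML <head>."""
    return [
        f"<link rel=\"alternate\" hreflang={quoteattr(lang)} href={quoteattr(url)}>"
        for lang, url in cluster.items()
    ]


def link_header(cluster: dict[str, str]) -> str:
    """The same alternates as the value of a single HTTP Link response header."""
    return ", ".join(
        f"<{url}>; rel=\"alternate\"; hreflang=\"{lang}\""
        for lang, url in cluster.items()
    )


def sitemap_links(cluster: dict[str, str]) -> list[str]:
    """xhtml:link children to repeat inside every <url> entry of the cluster."""
    return [
        f"<xhtml:link rel=\"alternate\" hreflang={quoteattr(lang)} href={quoteattr(url)}/>"
        for lang, url in cluster.items()
    ]


if __name__ == "__main__":
    print("\n".join(head_tags(CLUSTER)))
    print("Link: " + link_header(CLUSTER))
    print("\n".join(sitemap_links(CLUSTER)))
```

All three functions read from the same mapping, which is the property that matters: whichever transport you pick, generate it from one source of truth so the cluster cannot drift between pages. For the sitemap form, the enclosing urlset element also has to declare the xhtml namespace (http://www.w3.org/1999/xhtml).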

Cluster Bidirectionality: The Rule That Trips Up Everyone

The rule is simple to state and brutal in practice: every URL in a hreflang cluster must declare every other URL in the cluster, including itself. If the cluster has three URLs (English, Spanish, French) then each of the three URLs must emit three <link rel="alternate" hreflang="..."> tags: one self-reference and two siblings. If even one URL forgets even one return tag, Google ignores the annotation between those two pages, and a few such gaps can silently take the whole cluster out of its routing logic. At best you get a Search Console "international targeting issue" warning (the legacy report that raised it has since been retired), and it does not tell you which URL is missing which tag; it just tells you the cluster is broken.

The hreflang checker reports the exact missing tag. Not "cluster broken" but "URL /en/pricing/ is missing a return tag pointing to /fr/tarifs/." That precision matters because the fix is usually a one-line template change, and turning a vague Search Console warning into a precise code edit is the difference between "we will look at it next sprint" and "we are fixing it before lunch."

The most common causes of broken bidirectionality: (a) a CMS template that emits "other languages" without the current one, breaking the self-reference rule; (b) a partial deploy where some locales got the new template and some did not; (c) a hand-edited sitemap fragment that got out of sync with the rest; (d) a region rollout where the new locale's CMS knows about the old locales but the old locales' templates have not been updated to know about the new one.
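The return-tag rule reduces to a small graph check. The sketch below is one way to write it, assuming you already have each page's hreflang block as a locale-to-URL mapping; the function name and data shape are mine, not the checker's.

```python
def find_missing_return_tags(
    declared: dict[str, dict[str, str]],
) -> list[tuple[str, str]]:
    """Return (page, target) pairs where `page` lacks a tag pointing at `target`.

    `declared` maps each page URL to its own hreflang block (locale -> URL).
    A pair is reported when `target` declares `page` but `page` does not
    declare `target` back.
    """
    missing = []
    for source, alternates in declared.items():
        for target in alternates.values():
            if target == source:
                continue  # self-reference, not a return tag
            target_block = declared.get(target)
            if target_block is None:
                continue  # target was not crawled, so nothing can be verified
            if source not in target_block.values():
                missing.append((target, source))
    return sorted(set(missing))


if __name__ == "__main__":
    pages = {
        # The English template forgot the French sibling.
        "/en/pricing/": {"en": "/en/pricing/"},
        "/fr/tarifs/": {"en": "/en/pricing/", "fr": "/fr/tarifs/"},
    }
    for page, target in find_missing_return_tags(pages):
        print(f"URL {page} is missing a return tag pointing to {target}")
```

Targets that are absent from the input are skipped rather than reported, on the assumption that a partial crawl should not produce false alarms. Tighten that if your input is a complete sitemap.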

Locale Codes: ISO 639-1 + ISO 3166-1, Not What You Made Up

The hreflang value is a BCP 47 language tag. In practice that means an ISO 639-1 two-letter language code, optionally followed by a hyphen and an ISO 3166-1 alpha-2 two-letter region code. Examples that pass: en, es, en-US, es-MX, zh-CN, pt-BR. Examples that fail and routinely show up in production: en-USA (region is three letters), en-UK (UK is not an ISO 3166-1 code; the United Kingdom is GB), en_US (separator must be a hyphen), english (language must be a two-letter code). A subtler trap is br on its own: it is syntactically valid, but br is the Breton language, so it passes validation and targets the wrong audience. If you meant Brazil you want pt-BR.

The single most common mistake in this category is en-UK. It feels right because UK is the everyday abbreviation for the United Kingdom, and SEO articles online sometimes use it casually. But the ISO 3166-1 code for the United Kingdom of Great Britain and Northern Ireland is GB. en-UK fails validation and is ignored by Google. The fix is en-GB. If you find en-UK in your codebase, run a code search across your repo for the literal string and replace every occurrence. It will be in templates, in the CMS, and possibly in the sitemap generator.

The hreflang checker validates each locale value against these patterns and reports INVALID_LOCALE with the exact offending string. The literal value x-default is recognized as a sentinel and exempted from validation — it is the one special case in the spec.
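A rough sketch of that kind of validation follows. It checks only the shape of the value and carries a deliberately tiny table of known-bad regions; a real validator would compare against the full ISO 639-1 and ISO 3166-1 lists. Treating values case-insensitively is my assumption, and apart from the INVALID_LOCALE code the names are hypothetical.

```python
import re
from typing import Optional

# Shape only: two-letter language, optional hyphen plus two-letter region.
LOCALE_SHAPE = re.compile(r"[A-Za-z]{2}(-[A-Za-z]{2})?")

# Regions that match the shape but are not ISO 3166-1 alpha-2 codes.
# Illustrative and incomplete: a real check needs the full ISO list.
KNOWN_BAD_REGIONS = {"UK": "GB"}


def check_locale(value: str) -> Optional[str]:
    """Return None when the value looks valid, otherwise a short reason."""
    if value.lower() == "x-default":
        return None  # sentinel, exempt from validation
    if not LOCALE_SHAPE.fullmatch(value):
        return f"INVALID_LOCALE: {value!r} is not 'language' or 'language-REGION'"
    region = value.partition("-")[2]
    suggestion = KNOWN_BAD_REGIONS.get(region.upper())
    if suggestion:
        return f"INVALID_LOCALE: region {region!r} is not an ISO 3166-1 code, use {suggestion}"
    return None


if __name__ == "__main__":
    for value in ["en", "es-MX", "x-default", "en-USA", "en-UK", "en_US", "english", "br"]:
        print(f"{value:10} {check_locale(value) or 'ok'}")
```

Note that br comes back clean: it is a well-formed tag for Breton, so no syntax check can tell you that the author meant Brazil. That one needs a human, or a rule that flags language codes your site does not actually publish in.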

x-default: Almost Always Worth Declaring

The x-default alternate is the catch-all. It tells Google: if the user's locale does not match any of my explicit alternates, send them here. For most international sites the x-default points to the global landing page — often the English version (since English is the de facto fallback for international audiences), or a language-picker page if you have one. The spec is permissive: x-default can be the same URL as one of the language-specific alternates, or a completely separate URL.

The cost of omitting x-default is that Google has to guess. The guess tends to favor whichever locale has the most accumulated link authority, which can mean a Japanese user trying to read your global homepage gets sent to your German URL because the German URL has been around longer and accumulated more backlinks. That is rarely what you want. The fix is one extra <link rel="alternate" hreflang="x-default" href="..."/> declaration per cluster, pointing to whatever URL you want unhandled-locale users to see.

The hreflang checker reports missing x-default as a warning by default because some niche cases (single-locale clusters, tightly-scoped two-language clusters where the languages are mutually exclusive) genuinely do not need it. For anything broader, toggle Require x-default in the validation options to escalate the warning to an error so the issue cannot be silently shipped.
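The warning-versus-error toggle is easy to mirror in your own scripts. This is a minimal sketch under stated assumptions: the MISSING_X_DEFAULT code and the `require` parameter are names I made up for illustration, not the checker's actual identifiers.

```python
def check_x_default(cluster: dict[str, str], require: bool = False) -> list[dict[str, str]]:
    """Report a missing x-default as a warning, or as an error when `require` is set.

    `cluster` maps hreflang values to URLs for one page's hreflang block.
    """
    if any(locale.lower() == "x-default" for locale in cluster):
        return []
    return [
        {
            "code": "MISSING_X_DEFAULT",  # hypothetical finding code
            "severity": "error" if require else "warning",
        }
    ]


if __name__ == "__main__":
    cluster = {"en": "/en/", "de": "/de/"}
    print(check_x_default(cluster))                # warning by default
    print(check_x_default(cluster, require=True))  # escalated to error
```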

Self-Reference: The Bug Every CMS Has Shipped At Some Point

The self-reference rule says: URL A's hreflang block must include A itself as one of the alternates. The natural CMS template emits "all the OTHER languages this page is available in" — and the natural mistake is to write that template literally, excluding the current language. The result is a cluster where every URL declares every sibling but no URL declares itself, and the entire cluster is invalid.

If you only fix one hreflang bug in your codebase as a result of reading this article, fix this one. It is the cheapest mistake to make and the cheapest to fix. The hreflang checker reports it as MISSING_SELF_REFERENCE with the exact URL that omits its own back-link.
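Here is the bug and its fix side by side, as a sketch. The translation map and function names are hypothetical; the point is only that the "other languages" filter is what removes the self-reference.

```python
# Hypothetical translation map for one page: locale -> URL.
TRANSLATIONS = {
    "en": "/en/pricing/",
    "es": "/es/precios/",
    "fr": "/fr/tarifs/",
}


def other_languages_only(current: str, translations: dict[str, str]) -> dict[str, str]:
    """The buggy pattern: a literal 'other languages' list that drops the current page."""
    return {loc: url for loc, url in translations.items() if loc != current}


def full_cluster(translations: dict[str, str]) -> dict[str, str]:
    """The fix: every page emits the whole cluster, its own locale included."""
    return dict(translations)


def missing_self_reference(page_url: str, alternates: dict[str, str]) -> bool:
    """True when the page's own URL is absent from its hreflang block."""
    return page_url not in alternates.values()


if __name__ == "__main__":
    page = TRANSLATIONS["en"]
    print(missing_self_reference(page, other_languages_only("en", TRANSLATIONS)))  # True
    print(missing_self_reference(page, full_cluster(TRANSLATIONS)))                # False
```

If the "other languages" list also drives a visible language switcher, keep the filter there and build the hreflang block from the unfiltered map. The two outputs look alike but have different requirements.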

Duplicate Locales: Two URLs Claiming The Same Audience

Less common but more confusing in practice. A single URL's hreflang block declares two alternates with the same locale value pointing to different URLs. Example: a page declares both hreflang="en-US" href="/en/" and hreflang="en-US" href="/en-us/". Google has no way to decide which one is correct, and the entry is typically discarded.

The root cause is usually a CMS migration where two URL conventions coexist temporarily and the template emits both. The fix is to canonicalize on one URL pattern and remove the duplicate from the alternate list. The hreflang checker reports this as DUPLICATE_LOCALE with both target URLs.
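Detecting this takes a few lines once the tags are kept as a list of pairs rather than a dictionary, since a dictionary would silently overwrite the duplicate. A minimal sketch, with a hypothetical function name and with locale comparison done case-insensitively by assumption:

```python
from collections import defaultdict


def find_duplicate_locales(tags: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each locale that points at more than one URL to its sorted target URLs.

    `tags` is one page's hreflang block as (locale, url) pairs, in emitted order.
    Locales are lowercased so that en-US and en-us count as the same audience.
    """
    targets: dict[str, set[str]] = defaultdict(set)
    for locale, url in tags:
        targets[locale.lower()].add(url)
    return {loc: sorted(urls) for loc, urls in targets.items() if len(urls) > 1}


if __name__ == "__main__":
    block = [("en-US", "/en/"), ("en-US", "/en-us/"), ("fr", "/fr/")]
    for locale, urls in find_duplicate_locales(block).items():
        print(f"DUPLICATE_LOCALE {locale}: {', '.join(urls)}")
```

The same locale repeated with the same URL is not flagged here, because it is redundant rather than ambiguous.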

An Audit Workflow That Actually Catches Mistakes

The workflow I run for any non-trivial multilingual site:

  1. Export the production sitemap and feed it to the checker. If hreflang lives in xhtml:link annotations inside the sitemap, this single step audits the entire site topology in one pass. The checker will tell you exactly which URLs are missing return tags, which have invalid locales, and which clusters have no x-default.
  2. For HTML-head sites, sample one URL per cluster and paste the head into the HTML mode. Spot-check the most-trafficked clusters first. A bug in the head template usually affects every page that uses that template, so a single bad cluster is a signal that the whole template is broken.
  3. Triage findings by severity. Errors block international SEO. Warnings degrade it. Fix every error before the next deploy; schedule warnings into the SEO backlog.
  4. Export the findings as CSV. Drop it into an issue tracker and assign per-template ownership. The code column lets you group by error type, which often reveals that all errors come from a single broken template.
  5. Re-run the check after the fix. The whole audit takes less than a minute on a sitemap, so verification is cheap. Run it as part of the post-deploy smoke test.
  6. Wire the check into CI for the long term. The same validation logic is exposed by the open-source @anthropic-tools/tools-core package. Calling validateHreflangEntries in a CI script and failing the build on any error turns hreflang correctness into a regression-tested property of the codebase instead of a quarterly audit.
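For step 1 in a script of your own, the sitemap annotations can be read with the Python standard library. This sketch does not use the checker's package, whose API is not shown in this article; the function name and return shape are my own assumptions.

```python
import xml.etree.ElementTree as ET

# Namespaces used by sitemaps that carry hreflang annotations.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def read_sitemap_hreflang(xml_text: str) -> dict[str, list[tuple[str, str]]]:
    """Map each <loc> to the (hreflang, href) pairs declared inside its <url> entry."""
    root = ET.fromstring(xml_text)
    declared: dict[str, list[tuple[str, str]]] = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        declared[loc] = [
            (link.get("hreflang", ""), link.get("href", ""))
            for link in url.findall("xhtml:link", NS)
            if link.get("rel") == "alternate"
        ]
    return declared


if __name__ == "__main__":
    sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
      <url>
        <loc>https://example.com/en/</loc>
        <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/"/>
        <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/"/>
      </url>
    </urlset>"""
    for loc, tags in read_sitemap_hreflang(sample).items():
        print(loc, tags)
```

Pairs are kept as a list rather than collapsed into a dictionary so that duplicate locales survive parsing and can still be reported. In CI, run your checks over the returned mapping and exit non-zero on any error.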

What Hreflang Cannot Do

A clean hreflang cluster is necessary but not sufficient for international SEO. It guarantees that Google routes users to the URL you intended; it does nothing about whether that URL is the best landing experience for those users. The most common failure modes that look like hreflang issues but are not:

  • Machine-translated content. A page automatically translated to Spanish is technically a Spanish page, but the translation quality is usually visible to users and almost always worse than market-specific copy. Google's quality systems can detect machine-translated content and demote it independently of hreflang.
  • Country targeting without supporting signals. Hreflang tells Google about language and country fit; it does not by itself establish that your site is meant for, say, the Mexican market. A ccTLD carries its own country signal. For gTLDs, the international targeting setting in the legacy Search Console used to fill that role but has been retired, so the country fit has to come from locale-specific URLs and genuinely local content such as addresses, currency, and phone numbers.
  • Currency and pricing assumptions. A user sent to the Spanish URL still sees euro pricing if that is what the page emits. Hreflang routes the user; the page must localize the experience.
  • Canonicalization conflicts. If two alternates in a cluster have canonical tags pointing to each other or to a third URL, Google may collapse the cluster and treat one URL as the canonical for all locales. Run a canonical audit separately from the hreflang audit and resolve conflicts before they cause silent demotions.
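The canonical audit from the last bullet can start as a very small check: flag any cluster member whose canonical tag points somewhere other than itself. This is a sketch under the assumption that you have already extracted each page's canonical URL; the function name and data shape are hypothetical.

```python
def canonical_conflicts(
    canonicals: dict[str, str], cluster_urls: set[str]
) -> list[tuple[str, str]]:
    """Return (url, canonical) pairs where a cluster member canonicalizes elsewhere.

    `canonicals` maps page URL -> the URL in its rel="canonical" tag.
    Pages with no recorded canonical are skipped.
    """
    return sorted(
        (url, canonicals[url])
        for url in cluster_urls
        if url in canonicals and canonicals[url] != url
    )


if __name__ == "__main__":
    canonicals = {"/en/": "/en/", "/fr/": "/en/"}  # the French page points at English
    for url, target in canonical_conflicts(canonicals, {"/en/", "/fr/"}):
        print(f"{url} is in a hreflang cluster but canonicalizes to {target}")
```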

The Bottom Line

Hreflang is a five-rule system: bidirectional clusters, self-references included, valid ISO codes, no duplicate locales, and x-default declared. The hreflang checker enforces all five in seconds against a sitemap or HTML head. The cost of running it is roughly zero; the cost of skipping it is misrouted traffic and Search Console warnings that take days to diagnose without precise reporting. Wire it into your pre-deploy checklist, fix the errors it surfaces, and the international SEO foundation underneath your content stops being a liability.

Paste a sitemap or a page's HTML head into the Hreflang Checker and see what your current implementation actually says. If it says "0 errors" you have done the hard part. If it does not, the report tells you exactly which lines to change.
