01.08.13

The hreflang XML sitemap guide for international SEO

International brands occasionally have the wrong regional visibility for their websites. This is often because one domain has more authority than the regional domain, or because the right domain has not been clearly specified to the search engines. As a result, the search engine cannot identify which domain is relevant to which region. For example, an American website (.com) might take visibility priority over a German website (.de) in Germany.

Users are more inclined to convert and to spend more time on your website when they are served content in their own language and currency. Recent tests have shown indexation rates increasing by around 150% after an international sitemap was implemented.

Different kinds of website structures

There are various setups for international websites, the three most common being:

  • Country-code TLD (ccTLD) - such as .de, .nl and .co.uk

  • Subdomain - such as de.example.com, nl.example.com and uk.example.com

  • Subfolder structure - such as example.com/de, example.com/nl and example.com/uk

These three methods of internationalisation can be split into two groups:

  1. Same brand and information, but on a different domain (this applies to the ccTLD and subdomain structures)

  2. Same brand and information, on the same domain (this applies to the subfolder structure)

The rules for these two groups differ and are discussed in this document.

Language and Region Specification

Every language should be specified in ISO 639-1 format. The region is optional, but when specified it should be in ISO 3166-1 Alpha 2 format. If no region is specified, the language code applies to all speakers of that language regardless of region. For instance, de (German) would apply to all German-speaking users, whereas de-at (AT being Austria) would apply to German speakers in Austria. When the website has sections for several countries that share a language, region-specific targeting is recommended.

For language script variations, the proper script is derived from the country. For example, when using zh-TW, the language script is derived automatically (in this example: Traditional Chinese). You can also specify the script explicitly using ISO 15924 (for example zh-Hant).

At the moment there is no officially recognised code for Europe (EU), so when the whole of Europe is targeted, only a language should be specified.
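The format rules above can be checked mechanically before a sitemap is generated. The following is a minimal sketch, not an official validator: the function name and the regular expression are our own, and it only checks the shape of a code (two-letter language, optional four-letter script, optional two-letter region), not whether the code is actually assigned in the ISO lists.

```python
import re

# Shape only: ISO 639-1 language, optional ISO 15924 script,
# optional ISO 3166-1 Alpha 2 region. Assigned-code lookup is not done here.
HREFLANG_RE = re.compile(
    r"^(?P<lang>[a-z]{2})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}))?$",
    re.IGNORECASE,
)


def normalise_hreflang(code: str) -> str:
    """Return the code in conventional casing, or raise ValueError."""
    match = HREFLANG_RE.match(code.strip())
    if match is None:
        raise ValueError(f"not a valid hreflang shape: {code!r}")
    parts = [match.group("lang").lower()]
    if match.group("script"):
        parts.append(match.group("script").title())  # e.g. Hant
    if match.group("region"):
        parts.append(match.group("region").upper())  # e.g. AT
    return "-".join(parts)
```

Because the check is shape-only, a value such as en-EU would pass even though EU is not a recognised region, so a lookup against the real ISO lists would still be needed for full validation.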

Rules of a Sitemap

A sitemap should not be larger than 50MB (uncompressed), nor contain more than 50,000 URLs. Larger sets can be split into several sitemaps and defined in a sitemap-index. Special characters in the URLs should be entity-escaped, meaning an ampersand (&) becomes &amp;. This applies to all special characters. More information on this can be found here: https://support.google.com/webmasters/answer/35653?hl=en
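As an illustration of the escaping and splitting rules, here is a short sketch using the Python standard library. The helper names are hypothetical, and the per-file limit is passed as a parameter with an assumed default of 50,000 URLs (the limit in the sitemaps.org protocol); the file-size limit is not checked here.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # assumed protocol limit, adjust if needed


def escape_url(url: str) -> str:
    """Entity-escape the characters that are special in XML text."""
    # escape() handles &, < and > by default; quotes are added explicitly.
    return escape(url, {'"': "&quot;", "'": "&apos;"})


def split_into_sitemaps(urls, limit=MAX_URLS_PER_SITEMAP):
    """Yield successive lists of at most `limit` URLs."""
    for start in range(0, len(urls), limit):
        yield urls[start:start + limit]
```

Each list yielded by split_into_sitemaps would become one sitemap file, with all files then referenced from a sitemap-index.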

Robots.txt

At iCrossing we recommend that clients define their sitemaps (or sitemap-index) in the robots.txt file, so that bots can easily find the appropriate sitemap. The URL of the sitemap should be in absolute format. An example including the syntax is:

Sitemap: http://www.example.com/sitemap.xml
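To verify that a robots.txt file declares its sitemaps with absolute URLs, a check along the following lines could be used. This is a minimal sketch: the function name is our own, and it reads a robots.txt body passed in as a string rather than fetching anything from a server.

```python
from urllib.parse import urlparse


def sitemap_urls(robots_txt: str):
    """Return the absolute sitemap URLs declared in a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, _, value = line.partition(":")
        if field.strip().lower() != "sitemap":
            continue
        url = value.strip()
        parsed = urlparse(url)
        # Keep only absolute URLs (scheme and host both present).
        if parsed.scheme in ("http", "https") and parsed.netloc:
            found.append(url)
    return found
```

A relative value such as /sitemap.xml is skipped by this sketch, which makes a missing or malformed declaration easy to spot.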

Sitemap Index

When multiple XML Sitemaps are required, a sitemap-index is recommended to let the search engines know where to find the other sitemaps. The sitemap-index file is straightforward: it has a heading and contains only direct (absolute) links to the other sitemaps. An example is below:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.example.com/sitemap-1.xml</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-2.xml</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-3.xml</loc></sitemap>
</sitemapindex>
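A sitemap-index like the one above does not have to be written by hand. The sketch below builds one with Python's standard xml.etree.ElementTree module; the function name is hypothetical, and the xml_declaration argument assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap-index document as bytes for absolute sitemap URLs."""
    # Serialise the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for url in sitemap_urls:
        sitemap = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        loc = ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc")
        loc.text = url.strip()  # no stray whitespace inside <loc>
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)
```

Generating the file this way also takes care of entity-escaping, since the serialiser escapes special characters in element text.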

Top-line Structure XML Sitemap

The heading of the sitemap should be:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">

Each URL should be specified between the opening and closing url tags, as below:

<url>
  <loc>http://www.example.com/de</loc>
  <xhtml:link rel="alternate" hreflang="en" href="http://www.example.com/en" />
  <xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/de" />
</url>

Each unique URL should have its own loc-node, as below:

<loc>http://www.example.com/</loc>

Each alternate version (including the URL itself) should be specified as follows within the url node:

<xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/de" />

Each sitemap should end with the following tag:

</urlset>
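The structure described above (heading, url nodes with a loc and their alternates, closing tag) can be produced programmatically. The following is a sketch under our own assumptions, using Python's standard xml.etree.ElementTree: the function names are hypothetical, and the alternates are passed in as a mapping from hreflang code to absolute URL.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_url_entry(loc, alternates):
    """Build one <url> element.

    `alternates` maps hreflang codes to absolute URLs and is expected to
    include the entry for `loc` itself.
    """
    url = ET.Element(f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
    for hreflang, href in alternates.items():
        ET.SubElement(
            url,
            f"{{{XHTML_NS}}}link",
            {"rel": "alternate", "hreflang": hreflang, "href": href},
        )
    return url


def build_urlset(entries):
    """Wrap (loc, alternates) pairs in a <urlset> and serialise to bytes."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, alternates in entries:
        urlset.append(build_url_entry(loc, alternates))
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)
```

Calling build_urlset with a single pair, for example http://www.example.com/de and its en and de alternates, would yield a document equivalent to the url node shown earlier, wrapped in the urlset heading and closing tag.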

Different Domains

A different-domain setup uses regional subdomains or ccTLDs (country-code top-level domains). The XML Sitemap for each region should specify that region's URLs in the loc node, with every regional version listed below it as an hreflang alternate carrying the appropriate language (and, if applicable, region) code.

Example

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://www.example.com/de</loc>
    <xhtml:link rel="alternate" hreflang="en-gb" href="http://www.example.com/gb" />
    <xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/de" />
    <xhtml:link rel="alternate" hreflang="en-us" href="http://www.example.com/us" />
  </url>
  <url>
    <loc>http://www.example.com/de/contact</loc>
    <xhtml:link rel="alternate" hreflang="en-gb" href="http://www.example.com/gb/contact" />
    <xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/de/contact" />
    <xhtml:link rel="alternate" hreflang="en-us" href="http://www.example.com/us/contact" />
  </url>
</urlset>
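For a ccTLD setup, the same pattern applies with each regional version on its own hostname. The sketch below prepares the loc and alternate pairs for the sitemap of one regional site. The hostnames (example.de, example.co.uk, example.com) and the function name are purely illustrative, and the sketch assumes every path exists unchanged on every regional site.

```python
# Hypothetical regional sites; the hostnames are illustrative only.
SITES = {
    "de": "http://www.example.de",
    "en-gb": "http://www.example.co.uk",
    "en-us": "http://www.example.com",
}


def entries_for_site(hreflang, paths, sites=SITES):
    """Return (loc, alternates) pairs for the sitemap of one regional site.

    Only URLs on the site identified by `hreflang` appear as loc; every
    regional version, including that site itself, appears as an alternate.
    """
    base = sites[hreflang]
    entries = []
    for path in paths:
        # One alternate per regional site, keyed by its hreflang code.
        alternates = {code: root + path for code, root in sites.items()}
        entries.append((base + path, alternates))
    return entries
```

Calling entries_for_site("de", ["/", "/contact"]) would give the entries for the German sitemap only; the British and American sitemaps would be produced by separate calls with their own codes.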

Same Domains

When the regions sit on the same domain in subfolders, the XML Sitemap will contain all the URLs with their respective alternates. This differs from the different-domains sitemap, which only has the URLs of its own domain in the loc nodes. Every URL has to be repeated in full, each with its complete set of alternates.

Example

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://www.example.com/de</loc>
    <xhtml:link rel="alternate" hreflang="en-gb" href="http://www.example.com/gb" />
    <xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/de" />
    <xhtml:link rel="alternate" hreflang="en-us" href="http://www.example.com/us" />
  </url>
  <url>
    <loc>http://www.example.com/gb</loc>
    <xhtml:link rel="alternate" hreflang="en-gb" href="http://www.example.com/gb" />
    <xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/de" />
    <xhtml:link rel="alternate" hreflang="en-us" href="http://www.example.com/us" />
  </url>
  <url>
    <loc>http://www.example.com/us</loc>
    <xhtml:link rel="alternate" hreflang="en-gb" href="http://www.example.com/gb" />
    <xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/de" />
    <xhtml:link rel="alternate" hreflang="en-us" href="http://www.example.com/us" />
  </url>
</urlset>
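Because a same-domain sitemap holds every regional URL in one file, it can be checked on its own for completeness: each URL should list itself among its alternates, and every alternate should have its own url node with the same set of alternates. The following sketch performs that check with Python's standard library. The function name and the wording of the messages are our own, and the check assumes all regional versions live in the single sitemap being examined.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def find_hreflang_problems(sitemap_xml):
    """Return a list of human-readable problems found in a urlset."""
    root = ET.fromstring(sitemap_xml)
    alternates_by_loc = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        links = url.findall("xhtml:link[@rel='alternate']", NS)
        alternates_by_loc[loc] = {
            link.get("hreflang"): link.get("href", "").strip() for link in links
        }

    problems = []
    for loc, alternates in alternates_by_loc.items():
        if loc not in alternates.values():
            problems.append(f"{loc} does not list itself as an alternate")
        for href in alternates.values():
            if href not in alternates_by_loc:
                problems.append(f"{loc} points to {href}, which has no <url> entry")
            elif alternates_by_loc[href] != alternates:
                problems.append(f"{loc} and {href} list different alternates")
    return problems
```

An empty list means the sitemap passed these particular checks. Stray whitespace inside href values is ignored by the sketch, so it would not flag a trailing space as a problem.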
