Map of Wikipedia language codes, sorted by script
Codes used in several Wikipedia language editions and their scripts

Wikipedia language codes are the letters at the beginning of the address of each language version. They are not the same as Internet top-level domains, each of which is an “identification string that defines a realm of administrative autonomy, authority or control within the Internet.” Still, it is easy to confuse the two because they often coincide. Here is a small surrealistic poem with countries and languages:

bg is for Bulgaria and Bulgarian (also бг)

cz is for Czechia, although Czech is cs

de is for Germany and German

es is for Spain and Spanish

fi is for Finland and Finnish

fo is for Faroe Islands and Faroese

fr is for France and French

ht is for Haiti and Haitian Creole

hu is for Hungary and Hungarian

it is for Italy and Italian

lt is for Lithuania and Lithuanian

mk is for North Macedonia and Macedonian (also .мкд)

mt is for Malta and Maltese

nl is for the Netherlands and Dutch

pl is for Poland and Polish

pt is for Portugal and Portuguese

ro is for Romania and Romanian

ru is for Russia and Russian (also рф)

sk is for Slovakia and Slovak

sq is for Albanian, although Albania is al

tr is for Turkey and Turkish

But it is not that simple. Internet domains are not language codes. Let us see some examples: ar.wikipedia.org is the home page of the Arabic encyclopedia, but pages ending in dot ar are based in Argentina. Web pages from neighboring Brazil obviously take br, but in Wikipedia this code is reserved for the Breton language. The free online encyclopedia in Chuvash is quite small; its address starts with cv, which is also the ending of addresses from the small African country of Cape Verde. Cy is the Cypriot web domain but also the code for Welsh. Although the Estonian language code is et, Estonia's ccTLD (country code top-level domain) is ee, just like the Ewe language wiki. The Danish language code is da, but Danish pages on the Internet end in dk.
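The collisions in the paragraph above can be collected in two small lookup tables, one for Wikipedia language codes and one for country domains. This is only a minimal sketch in Python: the table names and the `describe` and `clashes` helpers are made up for illustration, and the tables hold just the examples from this paragraph, not a complete registry.

```python
# Wikipedia language codes mentioned in the paragraph above.
WIKI_LANGUAGE = {
    "ar": "Arabic",
    "br": "Breton",
    "cv": "Chuvash",
    "cy": "Welsh",
    "et": "Estonian",
    "ee": "Ewe",
    "da": "Danish",
}

# Country code top-level domains mentioned in the paragraph above.
CCTLD_COUNTRY = {
    "ar": "Argentina",
    "br": "Brazil",
    "cv": "Cape Verde",
    "cy": "Cyprus",
    "ee": "Estonia",
    "dk": "Denmark",
}


def describe(code):
    """Return (wiki language, ccTLD country) for a code; None where unused."""
    code = code.lower()
    return WIKI_LANGUAGE.get(code), CCTLD_COUNTRY.get(code)


def clashes():
    """Codes that appear in both tables, in alphabetical order."""
    return sorted(WIKI_LANGUAGE.keys() & CCTLD_COUNTRY.keys())


if __name__ == "__main__":
    for code in clashes():
        language, country = describe(code)
        print(f"{code}: {language} Wikipedia, but a domain of {country}")
```

Codes such as et, da and dk show up in only one of the two tables, so `describe` returns `None` for the missing half.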

Greece, like many countries with their own script, has two web domains, gr and ελ; the language code, el, is a transliteration of the second. The wiki for Hebrew, another language with a unique script, starts with he, but pages from Israel end in il or ישראל. The Armenian edition of Wikipedia starts with hy, but Armenian pages have both a non-Latin domain, հայ, and a Latin domain, am, which is also the code for the Amharic version. Kazakhstan has three: two domains, қаз and kz, and one language code, kk. Kuwait does not have an Arabic-script domain; it uses kw, which is also the code for the Cornish language wiki.
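Non-Latin domains like the ones above travel through the domain name system in an ASCII form that starts with xn--. As a rough illustration, Python's built-in `idna` codec can convert a label in both directions. The helper names `to_ascii` and `to_unicode` are invented for this sketch, and the codec follows the older IDNA 2003 rules, so treat the result as an approximation of what a registry would use.

```python
# Non-Latin country domains mentioned in the text.
LABELS = ["ελ", "ישראל", "հայ", "қаз"]


def to_ascii(label):
    """Encode one domain label to its ASCII ("xn--") form."""
    return label.encode("idna").decode("ascii")


def to_unicode(ascii_label):
    """Decode an ASCII ("xn--") label back to its original script."""
    return ascii_label.encode("ascii").decode("idna")


if __name__ == "__main__":
    for label in LABELS:
        print(label, "->", to_ascii(label))
```

Latin labels such as gr or kz pass through `to_ascii` unchanged, which is one reason countries tend to keep both forms.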

The Limburgish Wikipedia is li, like pages from Liechtenstein. Do not confuse it with the Libyan domain, ly. Norway is no, which is also the code for Norwegian Bokmål, but not for Norwegian Nynorsk, which is nn. Finally, scot is for Scotland, but the Scots Wikipedia begins with sco, while the Scottish Gaelic version begins with gd, the same letters as the domain of the tropical island of Grenada.

Some of our favorites: if you read an article in Serbian, its address will start with sr, but if you read a web page from Serbia it will be the other way around, rs (or .срб). We also love dead languages: the Old Church Slavonic language is cu, the same as the domain for Cuba. Another dead language, Latin, shares la with Laos.

Another interesting coincidence: Sweden has the domain se, but the Swedish Wikipedia uses sv, which is in fact the domain of El Salvador. Even more surprising is that se.wikipedia.org does exist. It is the home page in Northern Sami, a language spoken in Norway, Finland and … Sweden!

What about other languages in Spain? Articles starting with ca are from the Catalan wiki, but websites with this ending are domains from Canada. Dot cat is the local domain in Catalonia. Gal is for Galicia, but gl is the code for Galician. The code for the Wikipedia in Euskera, or Basque, is eu; however, EU also stands for the European Union, and eu is its domain. Web pages can use eus if they are from Euskalerria. Exactly the same happens with the code for the Wikipedia in Ukrainian: it is uk, which is also the abbreviation of United Kingdom. Thus, Ukraine uses ua or укр.

Some of the weirdest ones: sh is used for the Serbo-Croatian edition and at the same time is the domain for Saint Helena, Ascension and Tristan da Cunha, which is basically a collection of islands in the South Atlantic. Another group of islands, Trinidad and Tobago, in this case in the Caribbean Sea, has the domain tt, which is the code for the Tatar language.
