List of Languages


List of all languages referred to in the Book of Knowledge and other sections of the website.



The table below lists all languages (more than 220) referred to on the Languages in Danger website, with links to the related sections of the Book of Knowledge or other sections of the website, as well as short descriptions and references.

Some names on this list also link to individual language sub-pages, where further details and photographs can be found.


Language name 


Language info

ǂAkhoe Haiǁom
BoK 2, 3, Map ǂAkhoe Haiǁom is an endangered idiom of Namibia. It is treated as a separate language by some linguists and as a dialect of the language (dialect cluster) Khoekhoegowab by others. About 250,000 persons speak a dialect of Khoekhoegowab (also called Khoekhoe or Nama), but the number of speakers of ǂAkhoe Haiǁom is difficult to estimate; it may be about 7,500. A standardized form of Nama is used in Namibia in administration and education. ǂAkhoe Haiǁom has preserved some old words and grammatical forms that are no longer found in other dialects (source: DoBeS ǂAkhoe Haiǁom project). The language belongs to the Khoe (or Central Khoisan) group. Like other Khoisan languages, it uses clicks, two of which appear in the name ǂAkhoe Haiǁom itself. Further information and resources: DoBeS ǂAkhoe Haiǁom project: . This site contains brief information on the language and culture as well as several samples (videos, audios). The ǂAkhoe Haiǁom corpus in the DoBeS archive offers more information and resources for registered users.
Achumawi  BoK  8 A critically endangered language spoken by a few elderly speakers in California, USA. It belongs to the Hokan languages. Here you can listen to a folk tale in Achumawi:
Agta BoK 1 Agta (or Ayta) are several languages of the semi-nomadic Negrito people of Luzon, Philippines. Estimates of the number of Agta speakers are around 6,000.
Ainu BoK 4 An isolated language (unrelated to other living languages) spoken in the past on the islands of Sakhalin, the Kuril archipelago (at present in the Russian Federation), and Hokkaido (Japan). Nowadays it has only a few elderly speakers. The language and culture of the Ainu were investigated and recorded by Bronisław Piłsudski, whose scholarly heritage was researched and documented within the international project ICRAP:
Akele; Kélé  BoK  8
Akele is a Bantu language of the Niger-Congo family, spoken in Gabon. Find out more about this endangered language and watch videos in and about Akele at this site:
Akkadian   Akkadian was one of the languages of ancient Mesopotamia. It belonged to the Semitic group of the Afro-Asiatic language family.
Alemannic dialects of German BoK 6 A continuum of local varieties of Upper German (Indo-European) which are used in an area at the confluence of the borders between Germany, France and Switzerland. See maps in Wikipedia – for example, the entry on Alemannic written in Alemannic.
Algonquian languages  BoK  7 A family of languages spoken in Canada and the USA. Examples of Algonquian languages are Cheyenne, Cree, Miami and Ottawa.
Amele  BoK   3 An endangered language spoken in Papua New Guinea, of the Trans-New Guinea family.
Anglo-Romani BoK 6 In England and other English-speaking countries most speakers of Romani have shifted to English. Still, some words of Romani origin are known by most Gypsies and mixed into English when they speak among themselves. This mixed language is called Anglo-Romani.
Arabic, Classical Arabic BoK 3, 7, 8 Arabic belongs to the Semitic branch of the Afro-Asiatic language family. Modern Arabic is one of the largest languages of the world (ranked fourth or fifth, depending on the statistics). Classical Arabic is the language of Arabic texts from the Middle Ages.
Aranese BoK 5 The Aranese language is a Gascon Occitan (Romance langue d’oc) variety spoken in the fringe Pyrenean Valley of Aran in the Catalan Autonomous Community within the Kingdom of Spain by ca. 4-5 thousand people. Since 2010 it has been recognized as an official language of Val d’Aran and of the whole of Catalonia. More on the region and its language:
Armenian Armenian constitutes a group within the Indo-European language family. It is used in two distant varieties: – Eastern Armenian – the official language of the Transcaucasian Republic of Armenia (with ca. 4.3 million speakers), – Western Armenian – the language of the Armenian diaspora dispersed in European, Asian and American countries (ca. 880 thousand). Old Armenian (Grabar) is used in the liturgy of Armenian Christian churches. Armenian has its own alphabet.
Armeno-Kipchak  BoK  5 One of the Kypchak varieties within the Turkic languages (Altaic family) spoken by Armenian diasporas in e.g. Poland, Crimea and Ruthenia (~ Ukraine), where they used the traditional Armenian alphabet;  nowadays virtually extinct. More on Armenians in Poland:
Asmat  BoK  3 A cluster of dialects (or languages) of the Trans-New Guinea family spoken in Indonesia.
Asturian BoK 9 A regional language of Asturias in northern Spain, used in several West Romance dialects by ca. 100-125 thousand speakers from the Biscay Coast to north-eastern Portugal (see: Mirandese). More on Asturian and the Asturian Language Board:
Avar BoK 3 A language of the Avar-Andic group of North Caucasian languages, spoken by ca. 790 thousand people in Dagestan (Russian Federation) and transborder areas of Azerbaijan. Watch Avar dance and hear the language:
Babalia Creole  BoK  7 An Arabic-based Creole language spoken in Chad.
Baka  BoK  2 A Niger-Congo (non-Bantu) language spoken in Cameroon and Gabon. Find out more and watch sample videos at:
Bantu languages BoK 1, 4 The Bantu language family comprises between 250 and ca. 600 languages spoken in Central and Southern Africa. The most widely known Bantu languages are Swahili and Xhosa. Bantu is a subfamily of the larger family of Niger-Congo languages.
Basque  BoK  9 Euskara – non Indo-European and one of the oldest languages spoken in Europe. Basque enjoys numerous language rights in the Basque Country within the Kingdom of Spain, where it is used by ca. 500 thousand speakers (revitalized and relatively safe, used in public domains), and relatively endangered in the French Republic, where it is spoken by ca. 75 thousand people. Watch this interesting video-clip on the Basque language:
Belarusian BoK 8 Belarusian, also called (especially in the past) Byelorussian, is an East Slavic language. It is one of the official languages (together with Russian) of the Republic of Belarus, and is endangered by the dominant social and political role and functions of Russian in Belarusian society. It is also used by Belarusian minorities in Poland (almost 50 thousand), countries of the former USSR, Western Europe and North America. Listen to a Belarusian song:
Bengali  BoK  1,   4 The Bengali language ( বাংলা in Bengali) is an Indo-European language spoken natively in Bangladesh and India. According to Lewis (2009), together with L2 speakers, it is used by 250 million people. This makes Bengali one of the 10 largest languages of the world.
Bete BoK  4 An endangered Niger-Congo language spoken in Nigeria.
Bislama BoK  8 Bislama, also called Bichelamar, is a creole language, one of the official languages of Vanuatu. It is the first language of many of the “Urban ni-Vanuatu” (those who live in Port Vila and Luganville), and the second language of much of the rest of the country’s residents. More than 95% of Bislama words are of English origin; the remainder combines a few dozen words from French, as well as some vocabulary inherited from various languages of Vanuatu, essentially limited to flora and fauna terminology.
Breton  BoK  8 9 Breton is a member of the Brythonic family of Celtic  languages, similar to Cornish and Welsh and was introduced into present-day Brittany (north west France) in the 6th century CE. Breton linguists estimate the number of speakers at 172,000 (Broudic 2009). Fañch Broudic. – Parler breton au XXIe siècle. Le nouveau sondage de TMO Régions. – Brest: Emgleo Breiz, 2009.
Burmese BoK 1 Burmese is a Sino-Tibetan language with official status in Myanmar. Burmese is a tonal language and is written in a Brahmic alphasyllabary. Here you can find a Burmese-English online dictionary:
Cantonese BoK  8 Cantonese is a language that originated in the vicinity of Canton (i.e., Guangzhou) in southern China, and is spoken by approximately 62 million speakers worldwide. It is often regarded as the prestige dialect of Yue. It is an official language in Hong Kong (along with English) and in Macau (along with Portuguese). In Singapore the government has a Speak Mandarin Campaign (SMC) which seeks to actively promote the use of Mandarin over other Chinese languages, such as Hokkien (41.1%), Teochew (21.0%), Cantonese (15.4%), Hakka (7.9%) and Hainanese (6.7%).
Carrier (=Dulkw’ahke) BoK 5 An indigenous language of Canada, belonging to the Athabaskan languages of the Na-Dene suprafamily. According to Ethnologue 2009 Carrier (Dulkw’ahke) is spoken by 2060 Native Americans, and the closely related Hare (North Slavey) by 1030. Written with a syllabic (actually syllabic-alphabetic) writing system Déné developed by missionaries. See:
Catalan BoK 9 Catalan is an example of a language that is not endangered at all in one country (the Kingdom of Spain, where it is spoken in several varieties by over 11 million users) and endangered by weak intergenerational transmission in another: the Republic of France, where Catalan is a regional language used by ca. 100 thousand speakers.
Celtic languages BoK 3, Map The Celtic languages are a group of languages, spoken in the west of the island of Britain, in Ireland and in Brittany (north-west France). Their respective names are: Irish, Scottish Gaelic, Manx, Welsh, Breton and Cornish. They are divided into two branches – P-Celtic and Q-Celtic. These labels come from the word for “head” in each of the languages. In Welsh, Breton and Cornish, the word for “head” is pen or penn (and thus these three languages are known as P-Celtic). In Irish, Gaelic and Manx, the word for “head” starts with c- or k-, which historically would have been *kʷ- in Proto-Celtic, and thus we call these languages Q-Celtic.
Chamorro  BoK  3 An Austronesian language spoken in the Mariana Islands in the Philippine Sea (between Japan and New Guinea). Learn more and listen to samples at:
Champenois  BoK  9 see: Walloon
Chechen  BoK  3 Nakh group of North Caucasian languages, spoken by ca. 1.3 million people in Chechnya (Russian Federation), and – particularly after Russian-Chechen wars at the turn of the 21st c. – by Chechen refugees scattered throughout the world. In spite of the fact that the language is spoken even by Chechen monolinguals (an extremely rare case among Russia’s nationalities), the language has been threatened by gradual genocide of its speakers. Listen to a Chechen polyphonic song:
Cherokee BoK 3 4 5 A Native American language of the Iroquoian family, spoken by about 16,000 people in the USA in the states of Oklahoma (the greatest part of the speakers) and North Carolina. Cherokee is one of the few languages for which a script was invented (instead of adapting the writing system of another language): the Cherokee syllabary, which you can see, for example, here:
Czech BoK 4 Czech is a West Slavic language, like Polish. It is the national language of the Czech Republic and is spoken by about 12 million speakers worldwide.
Cheyenne  BoK  4 Cheyenne is spoken in Oklahoma and Montana by about 2100 persons. Find more information

Chinese (cf. Hui; Mandarin; Wenyan)  BoK  1 4 9 see Hui; Mandarin; Wenyan
Chinese Pidgin English  BoK  7 Chinese Pidgin English is a pidgin language lexically based on English, but influenced by a Chinese substratum. From the 17th to the 19th centuries, there was also Chinese Pidgin English spoken in Cantonese-speaking portions of China. Chinese Pidgin English is heavily influenced by various Chinese languages with variants arising among different provinces (for example in Shanghai and Ningbo).
Chipaya BoK 6, Map Chipaya is spoken by about 1800 people in Bolivia. Most of them live north of Lake Coipasa in the department of Oruro. Chipaya belongs to the Uru-Chipaya group of languages. The speakers of Chipaya know that their language is endangered and try to preserve it. They have established a language committee, which created a writing system for Chipaya in 2002 in order to have the language taught in schools. Further information and resources:

This site contains information on the language and the people, their history and culture as well as on the geography of the area.


The Chipaya people and the landscape in which they live have also fascinated professional photographers. Here are two galleries worth viewing:

Choctaw  BoK 3 Choctaw is a Native American language spoken mainly by elder adults in Oklahoma, USA. It belongs to the Muskogean family. See for more information.
Chontal Map The Chontal languages are among the many languages spoken in Oaxaca state, southern Mexico. They are genetically unclassified, although some scholars suggest they be grouped together with other languages in a Hokan stock. Our exercises and examples show Lowland Chontal, an endangered language with only around 200 mostly elderly speakers. Its sister language, Highland Chontal, is similarly endangered. Tequistlateco, a third variety of Chontal, is already extinct. Note: there is another language in Mexico called Chontal of Tabasco, which belongs to the Mayan family and is not related to Chontal of Oaxaca; the two should not be confused. The name of both languages comes from the Nahuatl word chontalli ‘strangers’. More information on Lowland Chontal and samples: DoBeS Chontal project: .
Endangeredlanguages: . On both sites you can find, among others, a word list with audio recording (words given in Spanish and translated into Chontal) and text as PDF (words in Spanish, Chontal, and English) and a short funny story about an attempt of courtship that didn’t work, as audio recording in Chontal with an English translation as PDF.
Cocopa BoK 2 A Native American language spoken in California and Mexico, of the Hokan (Yuman) family.
Cornish BoK 9 The Cornish Language (Kernewek) is the direct descendant of the ancient language spoken by the Celtic settlers who inhabited Cornwall (Kernow) and most of the British Isles long before the Roman conquest. By the nineteenth century, Cornish had died as a spoken community language, although there are records of the language being spoken particularly at sea by Newlyn fishermen. In 1929 Robert Morton Nance published his ‘Cornish for All’, a version of the language based on the medieval texts, known as Unified Cornish, and then in 1938 a Cornish-English dictionary. During the 1980s and 1990s a number of other variants of the language were developed (Richard Gendall: Late Cornish; Ken George: Kernewek Kemmyn; Nicholas Williams: Unified Cornish Revised). After Cornish was recognised as a minority language by the EU in 2000, funding became available to allow the development of a standard spelling system for the language. According to MacKinnon (2007) there are a ‘few hundred speakers’ today. See for details.
Crimean Tatar BoK 5 Crimean Tatar (see also: Tatar) – a Turkic language of the Altaic family, spoken natively by 260 thousand Tatars in the Crimea and in the post-Soviet states of Central Asia. The entire Crimean Tatar population was deported there in 1944 by a decree issued by the Soviet leader Joseph Stalin. The language is still considered endangered by assimilation to the surrounding, dominant Russian language. More on the deportation of Tatars from the Crimea:
Croatian BoK 6 Croatian is a South Slavic language, spoken in Croatia and neighbouring countries. The question whether Croatian, Serbian and Bosnian are separate languages or dialects of one language is controversial: these ethnolects are very similar to each other in vocabulary and grammar, but on cultural and political grounds they are nowadays often considered different languages. Croatian is written with the Latin alphabet, whereas Serbian and Bosnian traditionally use the Cyrillic alphabet.
Daakaka BoK 1, 3, Map Daakaka is an Austronesian language spoken by around 1200 inhabitants of the island of Ambrym in Vanuatu (Oceania). Besides Daakaka, there are five other languages spoken on that island, which is roughly the size of the Isle of Man.
Daakie  BoK  1 The Austronesian Daakie language is one of the six languages of the small island of Ambrym (Vanuatu). Apart from Daakie, there are also the following languages spoken on the island: Daakaka, Dalkalaen, the North Ambrym language, Lonwolwol and the Port-Vato language.
Dalkalaen  BoK  1 The Austronesian Dalkalaen (or: Dal kalaen) language is one of the six languages of the small island of Ambrym (Vanuatu). Apart from Dalkalaen, there are also the following languages spoken on the island: Daakaka, Daakie, the North Ambrym language, Lonwolwol and the Port-Vato language. Of all the languages spoken on Ambrym, Dalkalaen is the most dialectally diverse one.
Dravidian languages BoK 1 The Dravidian language family comprises languages spoken by approximately 200 million people in southern India, Pakistan and Sri Lanka. Tamil, Kannada and Malayalam are the best-known Dravidian languages.
Dungan  BoK  5 The Dungans are descendants of the Chinese-speaking Hui Muslims, who migrated to and settled in today’s Kyrgyzstan and Kazakhstan. As they were isolated from Chinese and the Russian writing system was imposed, their language changed significantly into a separate Sinitic ethnolect. Nowadays Dungan is spoken by some 42 thousand. Link to a Dungan language recording: .
Dutch BoK 3, 7 Dutch is a West Germanic language (as is German), spoken in the Netherlands, Belgium and Surinam and by smaller communities in other countries. The Dutch language in Belgium is also called Flemish. The closest relative of Dutch is Afrikaans, which developed out of Dutch dialects in South Africa.
Dyirbal BoK 1 Dyirbal is an extinct Australian language. It is well known to linguists because it served as the basis of a first broad study of ergativity. People interested in linguistics might know Dyirbal for its special register called mother-in-law language. In Dyirbal, one is not supposed to use everyday speech in the presence of certain family members. Instead, speakers should employ a variety that differs greatly from everyday speech in, for example, vocabulary. Some Dyirbal examples of mother-in-law language can be found here:
English BoK  3,  4
English ranks second in the list of the world’s largest languages, after Chinese.
Estonian  BoK  1 3 Estonian is one of the largest Uralic languages. It is spoken by over a million people, mainly in Estonia where it has official status. Estonian is one of the official languages of the European Union.
Evenki A complex of highly diverse Manchu-Tungusic dialects (of the Altaic suprafamily) unified into the Evenki language by Soviet language engineering in the 1920s and 1930s. Used by ca. 7.5 thousand speakers scattered over vast areas of the Russian Far East and 19 thousand in Manchuria (north-eastern China). The language is currently considered endangered, but there are attempts to revitalize it. Listen to an Evenki song from Manchuria:
Finnish BoK 1, 6 Finnish belongs to the Baltic Finnic branch of the Finno-Ugric language family. It is spoken natively in Finland and the Republic of Karelia in Russia. Finnish is an agglutinative language and for this reason it is known for having very long words like syyttämättäjättämättömyys or yhteiskuntavastuuraportointi. Here you can read an article about other beautiful Finnish words:
Finno-Ugric languages BoK 5 Besides Indo-European, Finno-Ugric is the second large family of languages spoken in Europe. Hungarian, Finnish, Estonian, the Saami languages, Udmurt, and several other languages belong to this family. Finno-Ugric and Samoyedic languages are grouped together in the supra-family of Uralic languages.
Flemish (cf. Dutch)   see: Dutch
French BoK  3, 6, 7   French belongs to the Romance group of the Indo-European language family. It is one of the largest languages of the world, especially if native and second language users are counted together. It has official status in various countries in Europe and Africa as well as in Canada. French is also a language of international communication in several European and worldwide organisations.
Frisian  BoK  9 Frisian is a complex of three languages, used in the past in a Germanic continuum area ranging from South Holland to South Jutland along the North Sea coast. Nowadays the Frisian languages consist of:

  • West Frisian – co-official language of the Province of Fryslân in the Netherlands, considered relatively safe and spoken by almost 470 thousand Frisians;
  • (East Frisian) Saterfrisian – a seriously endangered language of Saterland county (Lower Saxony, Germany), where it is spoken by ca. 2 thousand people;
  • North Frisian – a group of dialects spoken by ca. 10 thousand people in north-western Schleswig-Holstein (Germany) – considered highly endangered.

More on Frisian:


Friulian  BoK   1 9 Friulan (or Friaul) is one of the officially recognized minority languages in Italy. Thanks to the support by the Italian state this peripheral language spoken by ca. 800 thousand in the north-eastern region of Friuli is less endangered at present than in the past. See a clip with a parody of a Friulan lesson:
Fulfulde; Fula BoK 3, 8 Fula or Fulfulde is spoken in West Africa. It belongs to the Atlantic branch of the Niger-Congo family of languages. With 24 million speakers it is among the 100 largest languages of the world.
Gaelic  BoK  7 8 Gaelic is the Celtic language still spoken in some parts of Scotland to this day. It is closely related to Irish. Once the main language across the country, Gaelic is now only spoken by around one per cent of the population, particularly in communities in the Outer Hebrides. See for more details.
Georgian BoK 5 Georgian (ქართული ენა) is a Kartvelian language of the Caucasus spoken by about 4.5 million people.
German BoK 3, 4 A West Germanic language, one of the largest languages in the world by number of native speakers, with official status in several European countries.
Gothic BoK  7 Gothic is the language of the earliest literary documents of the Germanic peoples as a whole. The language belongs to the East Germanic branch of languages. The primary source of linguistic data is the remains of a translation of the Bible, written in the 4th century CE. Gothic may have survived near the Black Sea, though in altered form, until at least the 16th century as a nonliterary language, now termed Crimean Gothic.
Grabar (= Classical Armenian)   See: Armenian
Greek, Classical Greek   Greek belongs to the Indo-European family of languages, where it forms a group of its own (at least, no other living language belongs to this group).
Greenlandic  BoK  3 Greenlandic Inuit, called Kalaallisut by its speakers, is the form of the Inuit language spoken in Greenland, where it has official status.
Guadeloupe Creole BoK 7 A French-based Creole with elements from Carib and African languages, spoken in Guadeloupe.
Gujarati BoK 4 With 49 million speakers Gujarati is one of the world’s major languages. It belongs to the Indo-Aryan branch of the Indo-European family and is spoken in India.
Hainanese BoK  8 Hainanese is a variety of Min Nan Chinese spoken in the southern Chinese island province of Hainan. “Hainanese” is also used to describe the language of the Li people living in Hainan, but generally refers to the Chinese dialect spoken in Hainan. It is mutually unintelligible with other Min Nan varieties, such as Teochew and Hokkien–Taiwanese. Many Hainanese, mostly from the north-eastern part of the island, emigrated to Singapore.
Hakka BoK 8 Hakka, also rendered Kejia, is one of the major Chinese language subdivisions or varieties and is spoken natively by the Hakka people in southern China, Taiwan and throughout the diaspora areas of East Asia, Southeast Asia and around the world. It is most closely related to Gan and is sometimes classified as a variety of Gan.
Halcnovian BoK 10 A critically endangered language in Hałcnów (Poland): according to recent records it had only 8 speakers in 2013. The Ethnologue language catalogue does not offer any information on the dialects of the Bielsko-Biała enclave other than Wymysorys/Wilamowicean. Based on the existing dialect maps of the German language, it might be possible to classify them as East Middle German. Read more here:
Hare (=Slavey) BoK 5 Hare, or North Slavey, is an Athabaskan language of Canada, one of the languages written with the Canadian Aboriginal syllabic writing system. Find more information about the language and its writing system here:
Hawaiian BoK 2 9 An East Polynesian language. Critically endangered, spoken by barely 1 to 8 thousand people (Ethnologue 2009) out of almost 240 thousand ethnic Hawaiians living on the Archipelago (some 100 thousand Hawaiians live in other US states). At the beginning of the 20th century Hawaiian was the first language of 37 thousand people.
Hebrew, Classical Hebrew BoK 3 Hebrew belongs to the Semitic group within the Afro-Asiatic language family. Classical Hebrew has been used for centuries in religious texts of the Jewish community. Modern Hebrew (Ivrit), one of the official languages of Israel and spoken by 5.3 million people, is one of the most impressive examples of language revival. It was created in the 19th century on the basis of Classical Hebrew.
High Sorbian cf. Sorbian  BoK  9 See: Sorbian
Hindi  BoK  1,  4 With over 180 million users, Hindi (हिन्दी  in Hindi) is the 5th largest language in the world. It is one of the main languages of India. Hindi belongs to the Indo-European language family. Many linguists consider Hindi and Urdu of Pakistan as the same language. However, from a socio-political point of view, they are in fact two separate languages.
Hixkaryana  BoK  3 An endangered language spoken by about 600 people in Brazil. It belongs to the Carib family of languages.
Hokan or Yuman languages  BoK  2 A family of Native American languages spoken in California and Mexico
Hokkien BoK 8 Hokkien is a group of mutually intelligible Min Nan Chinese dialects spoken by many overseas Chinese throughout Southeast Asia. Hokkien originated from a dialect in southern Fujian. It is closely related to Teochew, though mutual comprehension is difficult, and is somewhat more distantly related to Hainanese.
Hoocąk   Map Hoocąk (alternative names: Hotcak, Ho-chunk, Winnebago) of the Siouan language family is one of more than 250 native languages of North America. The Hoocąk community has about 6000 members, but only around 200 of them can speak the language, most of them being elderly people. For that reason the language is considered highly endangered.
Hui BoK 9 The Hui people are a community of nearly 10 million Chinese Muslims, recognized by Chinese law and by the Chinese government as an ethnic minority. However, they have not had a language of their own that could act as a distinguishing feature. See also: Dungan
Hungarian BoK 3 With over 14 million speakers, Hungarian is the largest language of the Finno-Ugric family. It is the national language of Hungary and one of the official languages of the European Union.
Ibanag BoK  6 Ibanag is an Austronesian language spoken in northern parts of the Luzon island, the Philippines. Ibanag is used by roughly half a million speakers and enjoys institutional status.
Icelandic BoK 8 Icelandic is one of the Nordic languages, which are a subgroup of the Germanic languages. Linguistically it is most closely related to Faeroese and Norwegian. The vast majority of Icelandic speakers, about 320,000, live in Iceland. There are about 8,165 speakers of Icelandic living in Denmark.

The language is also spoken by 5,112 people in the USA and by 2,170 in Canada (Census 2001).

Igbo  BoK   4 Igbo is one of the large languages of the Niger-Congo family (as are Fula and Yoruba). It is spoken by 20 million (or more) people in Nigeria.
Inari Saami BoK 9 Inari Saami is one of the Saami languages (see: Saami). It is spoken in the territory of the Inari municipality (northern Finland). Although it has only several hundred speakers, we can be slightly more optimistic about the fate of Inari Saami than about that of the other small Saami languages. This is because of the successful implementation of the idea of language nests. You can find more information about the language and the efforts to revitalise it.
Indo-European languages BoK 1 The Indo-European language family is the largest family in the world in terms of number of speakers. The majority of speakers of Indo-European languages live in India. Indo-European languages are also autochthonous to Iran, Pakistan, Bangladesh and Afghanistan, as well as to most European countries.
Indonesian / Bahasa Indonesia  BoK  9 An Austronesian language based on Malay. Indonesian is the official language of the Republic of Indonesia and is spoken by over 220 million people, although many inhabitants of that multilingual archipelago are not fully fluent in the national language of Indonesia.
Ingush  BoK  8 Ingush belongs to the Nakh branch of Northeastern Caucasian languages (as does Chechen). It is spoken in Russia and Kazakhstan. Click this link to hear a song in Ingush:
Inuit (=Eskimo)   Native languages of the Arctic, belonging to the Eskimo-Aleut language family. Most probably spoken by some 90 thousand language users. Inuit languages are spoken in Greenland (Greenlandic), Canada and Alaska.
Irish BoK 3, 4, 6, 8 Irish is a Celtic language of the Indo-European family. It is the traditional language of Ireland and has official status in the Republic of Ireland and in Northern Ireland. However, it is the native language of only a minority in these countries, and the dominance of English makes it an endangered language. Find audio and video samples of Irish here:
Italian   Italian is a Romance language spoken in several European countries. Several ethnolects spoken in Italy and formerly thought of as Italian dialects are nowadays by many linguists regarded as languages of their own, for example Friulian, Ligurian, Sardinian, or Sicilian.
Iwaidjan languages   BoK 2, Map Iwaidja, Ilgar and Maung are Iwaidjan languages from Northern Territory, Australia. It is suggested that Iwaidjan languages be grouped together with other languages in the Arnhem Land language family. Iwaidja and Maung are highly endangered languages spoken by only around 200 people each. The last speaker of Ilgar died in September 2003.
Jalapa Mazatec, Mazatec, Jalapa de Diaz   BoK  4 One of several Mazatecan languages/dialects belonging to the Oto-Manguean family and spoken in Mexico. 
Japanese BoK 4, 5 One of the world’s major languages, the national language of Japan. It probably has no relatives other than the Ryukyuan languages spoken on Japanese islands.
Karaim  BoK  6, 9, Map Karaims are the smallest officially recognized ethnic minority in Poland (ca. 50 persons) and Lithuania (ca. 270). The Karaim language originates from the Crimean Peninsula, and is still used by some 120 persons in the two countries, as well as by some unverified number of speakers in the Ukrainian Crimea. Karaim belongs to the Kypchak branch of Turkic languages of the Altaic family. It has not developed a uniform standard, but functioned in the following dialectal varieties:

  • Crimean – or eastern, spoken in the Crimean Peninsula, considered extinct by now (although Ethnologue 2009 quotes a number of 1000 Karaims living there); as well as two western Karaim varieties:
  • Troki/Trakai – used within the Karaim communities in Lithuania and Poland, considered an endangered language – according to Ethnologue 2009 spoken by ca. 120 out of 300 ethnic Karaims,
  • Łuck-Halicz / Lutsk-Halych – in the past spoken in the two towns (nowadays in Ukraine), by now used by two last speakers (according to Ethnologue 2009 by twelve); not transmitted to next generations.

More on the Karaims:

Karata  BoK  5 The Karata/Karatin language belongs to the Avar-Andic languages of the North Caucasian family. According to Ethnologue 2009 there are ca. 5 thousand Karata. All of them still speak their language, although intergenerational language transmission is weakening because the language lacks a written form accepted for educational purposes. See: , for an extended phonetic inventory of Karata (and other Caucasian languages)
Kashubian  BoK  26, 9   West Slavic language, Lekhitic subgroup (together with dialects of Polish, Pomeranian and extinct Polabian from today’s north-eastern Germany) – regarded in the past as [most distinct] dialect of Polish, although discussion concerning the status of Kashubian started already in the 19th c. Nowadays (according to the 2002 Population Census) used in home contacts by over 52 thousand persons in the province of Pomerania. The 2005 Law on national and ethnic minorities and on the regional language recognized Kashubian as the first (and only as yet) regional language in Poland. Thanks to a strategic language planning of the Kashubs themselves the language returns to or enters entirely new domains of everyday use, including the schooling system and public sphere. For the present Kashubian has got a status of an auxiliary official language in 10 Pomeranian municipalities and is used in almost 400 bilingual place names. More on the Kashubs and their language: ; you can also listen to a Kashubian internet radio: ; more details on Kashubian in education in a dossier available:
Kaska  BoK  8 Kaska is a severely endangered Athabaskan language spoken in British Columbia, Canada. Click here to access a Kaska language website:
Kawi  BoK  5 A classical literary language based on the Old Javanese grammar and vocabulary influenced greatly by Sanskrit (Old Indian language)
Kazakh BoK 9 Kazakh is a language belonging to the Turkic subfamily of Altaic languages. It is spoken by approximately 11 million people, mainly in Kazakhstan, where it is the state language, China and the Russian Federation.
Kayardild BoK 1 A critically endangered Aboriginal language spoken on the South Wellesley Islands, Queensland, Australia.
Kedang  BoK   4 An Austronesian language spoken in Indonesia with ca. 30,000 speakers.
Khanty Map The Khanty language belongs to the Ob-Ugric branch of the Uralic language family. It has three groups of dialects: Eastern, Northern and Southern, of which the last is probably extinct by now. Khanty is one of the few languages of the world which distinguish a voiceless alveolar lateral fricative sound.
KhoeKhoegowab, KhoeKhoe, Nama  BoK   4 A cluster of dialects that belong to the Khoe (or Central Khoisan) group of languages famous for their click sounds. Spoken in Namibia. With about 200,000 speakers it is the largest Khoisan language and also used in school. Nevertheless its position is vulnerable. Find more information and samples at: (language name: Khoekhoe). See also ǂAkhoe Haiǁom.
Khoisan languages BoK 9 The Khoisan languages form one of the world’s oldest language families, including more than 100 languages spoken in the southern parts of Africa. Most Khoisan languages are seriously endangered. Of all the Khoisan languages, Nama, spoken in Namibia, has the largest number of speakers – around 200,000. A characteristic feature of Khoisan languages is the use of phonemes called clicks.
Kiliwa  BoK  2 A critically endangered language of the Yuman family spoken in Mexico.
Koongo  BoK  7 A macrolanguage of the Democratic Republic of the Congo. Population total all countries: 5,955,908 (Ethnologue 2009).
Korean   With about 78 million speakers one of the world’s major languages. Not genetically related to other languages. The Korean alphabet, called Hangul, was invented in the middle of the 15th century.
Krio  BoK  7 English-based creole spoken in Sierra Leone.
Kurru, Yerakula   BoK  8 A Dravidian language spoken in south India.
Kumeyaay BoK  2 3 A severely endangered Native American language of the Yuman family spoken in southern California and Mexico.
Lakhota BoK  3 9 A Native American language of the Siouan family, spoken in the US states of South and North Dakota. Find samples here:
Latgalian BoK  3, 4,  6, 9,  10,  Map Latgalian is a Baltic ethnolect spoken in eastern Latvia (region of Latgale). It is considered either a dialect of Latvian with an own literary written tradition or a separate regional language. During the 2011 population census, ca. 165 thousand persons declared knowledge of Latgalian.

Latin BoK  3   Extinct, but still considered an inert living language; it originates from the Italic group of Indo-European languages. The regional Vulgar (popular) Latin varieties developed into the individual Romance languages. At present one of the official languages of the Vatican City State. Look at Vicipaedia Latina:
Latvian BoK  6 Latvian is a Baltic language with about 1.5 million native speakers and a growing number of fluent speakers whose native language is different. It is the official language of Latvia and therefore one of the official languages of the EU.
Laz Kartvelian language, closely related to Georgian, used by 2 thousand speakers in south-western Georgia and 30 thousand (Muslim Georgian minority) in north-eastern Turkey. A videoclip on revitalization of Laz available at:
Lelemi BoK  3   Lelemi is a Niger-Congo language of Ghana. It belongs to the Ghana Togo Mountain languages (as does Logba).
Lemko  BoK  9 , Keep languages alive! see: Rusyn

Liberian English  BoK  7 An English-based Pidgin. ‘Liberian English’ is a term used to refer to the varieties of English spoken in the African country of Liberia.
Lingala  BoK  7 Lingala is a Bantu language spoken throughout the northwestern part of the Democratic Republic of the Congo and a large part of the Republic of the Congo, as well as to some degree in Angola and the Central African Republic. It has over 10 million speakers. See for more details.
Livonian BoK 9 Livonian belongs to the Baltic Finnic group of the Finno-Ugric languages. It was once used in several villages on the Baltic Sea shore in the northwestern part of Latvia. Nowadays Livonian is practically extinct, but there are a few dozen people who learned Livonian as their second language and who keep up a vibrant revitalisation movement.
Logba  BoK  3, Map Logba is an endangered language spoken in south-eastern Ghana, on the hills of the Ghana-Togo frontier. The speakers themselves call their language Ikpana. It belongs to the Kwa group of the large Niger-Congo language family. The estimated number of speakers of Logba is 7,500 (Ethnologue 2009), or 5000 (Encyclopedia of the World’s Endangered Languages). Most speakers of Logba also regularly use one or more other languages: Ewe, the dominant language of the region and the language of primary education, English, the official state language of Ghana, and Twi (Akan), the language with the highest number of speakers in Ghana (8.3 million). Logba is mostly a spoken language, rarely used in writing. Further information and resources: Short information and some recordings at: Dorvlo, Kofi. 2008. A grammar of Logba (Ikpana). Proefschrift, Universiteit Leiden. Available online at:
Lonwolwol  BoK  1 The Austronesian Lonwolwol language is one of the six languages of the small island of Ambrym (Vanuatu). Apart from Lonwolwol, there are also the following languages spoken on the island: Daakaka, Daakie, Dalkalaen, the North Ambrym language, and the Port-Vato language.
Lorrain   See: Walloon
Low German  BoK  6, 9 Recognized regional language in the northern federal states of Germany and (as Low Saxon) in eastern provinces of the Netherlands. Used (mostly as second language) by almost 10 million speakers, but rapidly losing its domains of use. In the past, a language of rich culture and of the trade contacts of the Hanseatic League, spoken commonly along the southern Baltic coasts. More information in a dossier available at:

Low Sorbian cf. Sorbian  BoK   1 , 4 9 See: Sorbian
Macedonian  BoK   1 Macedonian belongs to the southern group of the Slavonic languages. Because of political controversies, the numbers of speakers given in various sources differ considerably (1.6–3 million). Macedonian is the official language of the Republic of Macedonia.
Malayalam   A Dravidian language spoken in the Indian state of Kerala.
Malagasy  BoK   1 ,   3 The Austronesian Malagasy language is the national language of Madagascar. It is also spoken on Mayotte and the Comoros. In total there are 18 million speakers of Malagasy.
Mandarin Chinese  BoK   4 Mandarin Chinese is the language with by far the largest number of speakers in the world.
Mandinka  BoK  7 The Mandinka language (Mandi’nka kango) is a Mandé language spoken by millions of Mandinka people in Mali, Senegal, The Gambia, Guinea, Ivory Coast, Burkina Faso, Sierra Leone, Liberia, and Guinea-Bissau and Chad; it is the main language of The Gambia. It belongs to the Manding branch of Mandé.
Manx BoK  6 Although closely related to Irish and Scottish Gaelic, Manx looks quite different because of different spelling conventions. It was spoken by almost the entire population of the Isle of Man until 1765, when the island was sold to the British Crown. After this the number of speakers went into decline as a result of the collapse of the Manx economy and large scale emigration. By the 1960s only two native speakers of Manx remained and the language was declared extinct in 1974. A revival of interest in the language began in the 1930s, and since then many people have become ‘new’ speakers of Manx. In the 2011 census, 1,823 out of 80,398, or 2.27% of the population, claimed to have some knowledge of Manx.
Māori BoK  3 ,   9 East Polynesian Māori is a native language of New Zealand, declared a co-official language of that state, successfully revitalized and supported by various language education programs – according to Ethnologue 2009, Māori is spoken by ca. 60 thousand and understood by another 100 thousand New Zealanders. Listen to the national anthem of New Zealand in Māori:
Marquesan  BoK  4, Map A cluster of Polynesian dialects spoken on the Marquesas’ Islands. For more information see: See also: Tahuatan, tahutan
Mavea  BoK  8 Mavea (also known as Mav̋ea or Mafea) is an Oceanic language spoken on the island of Mavea in Vanuatu, off the eastern coast of Espiritu Santo. It belongs to the North–Central Vanuatu linkage of Southern Oceanic. The total population of the island is approximately 172, with only 34 fluent speakers of the Mavea language reported in 2008.
Mayan BoK 5 A family of languages spoken by at least 6 million people in Mesoamerica. The largest is K’iche’ (Quiché) with about 1 million speakers. Many Mayan languages are endangered. The ancient Mayan writing system probably shows the independent invention of writing on the American continent (although some researchers link it to Old Chinese writing).
Maybrat  BoK  3 A West Papuan language spoken in Papua New Guinea and Indonesia.
Megrelian  BoK  5 Kartvelian language, separate from Georgian, although still not recognized officially by Georgian authorities. Hardly present in public domains of use, therefore regarded as an endangered language, even if still spoken by ca. 500 thousand people in north-western Georgia. A Megrelian song can be listened to at:
Middle High German LoL   Middle High German is the form of German spoken and written between 1050 and 1350. It is the ancestor not only of modern German, but also of Yiddish and Wilamowicean.
Minangkabau BoK  3   An Austronesian language spoken on the island of Sumatra (Indonesia).
Mirandese  BoK  5 A Romance variety closely related to Asturian spoken in northern Spain; the only native minority language in Portugal, spoken by ca. 5 thousand inhabitants of the town Miranda de Douro and vicinities, located in north-eastern Portugal. Listen to Mirandese at:
Mixtec  BoK  1 The Mixtec language continuum comprises several indigenous languages spoken by around 550,000 people in southern Mexico. Mixtec belongs to the Oto-Manguean language family.
Miyako BoK 6 , Map Miyako is an endangered Japonic language spoken on the Miyako Islands located south of Okinawa (Japan). Only the Miyako elders speak the language – but there are some efforts to preserve Miyako, e.g. by recording traditional songs.
Mlabri  BoK  3 Mlabri is a small language of the Mon-Khmer family spoken in the mountains of the easternmost part of North Thailand. Its speakers are hunter-gatherers. It has two dialects with fewer than 200 speakers altogether. The linguist Jørgen Rischel, who has studied this language, writes about it: “ … a language spoken only within a couple of households need not be in a stage of disintegration. This is a fully functional language which can be used – and still is used – in all kinds of conversation and narration, as I know from my experience as a more or less participant observer.” Rischel, Jørgen. 1995. Minor Mlabri. A hunter-gatherer language of Northern Indochina. Copenhagen, 14-15.
Moldovan   A Romance language created by Soviet language engineering, based on the Moldavian dialects of Romanian, after the creation of the Moldavian Soviet Socialist Republic; written with the Cyrillic alphabet.
Molise Croatian  BoK  5 South Slavic variety, spoken by ca. 3 thousand inhabitants (originating from the territory of Croatia) of three villages in the Italian region of Molise; officially recognized as a minority language in Italy. An old song in Molise Croatian:
Mongolian, Classical Mongolian  BoK  5 Mongolian is a Mongolic language within the supra-family of Altaic languages. Spoken in the Republic of Mongolia (as official by ca. 2.3 million) and Inner Mongolia – an autonomous region of China (3.4 million). In the 1940s, due to political reasons, Mongolian changed its writing system from an Old Uyghur script to Russian Cyrillic. A presentation on Mongolian at:
Mono  BoK  3 A severely endangered Native American language of the Uto-Aztecan family, spoken in California.
Mordvinian (Erzya) BoK  6 The Mordvinian language is actually a group composed of two languages: Moksha and Erzya. Compared to their language relatives – other Uralic languages – Erzya and Moksha are quite big: together, they are used by almost 400,000 speakers. However, the lack of official status and the fact that the intergenerational transmission is interrupted make the Mordvinian languages definitely endangered.
Nadëb  BoK  3 A language of the Amazonas, spoken in Brazil by about 350 people. Nadëb belongs to the Nadahup language family.
Nama (=KhoeKhoegowab) BoK  4   See: KhoeKhoegowab.
Napore  BoK  8 Napore is a now extinct variety of Uganda and is considered a dialect of the Ng’akarimojong language (ISO 639-3: kdj). From a typological perspective, it is a VSO language, is highly inflectional, has grammatical tone, vowel harmony and voiceless vowels. It belongs to the Turkana language family.
Naxi BoK  5, Map The Naxi (also known as the Na-khi) are a non-Chinese minority people living mainly in the province of Yunnan (south-western China), as well as the neighbouring regions of Sichuan and Tibet, and most probably also in Burma/Myanmar. The Naxi language belongs to the Sino-Tibetan language family and is divided into several dialects. With a number of speakers reaching 309 thousand (Ethnologue 2009) – including ca. 100 thousand monolinguals – Naxi is not considered particularly endangered. Naxi has been written with a unique pictographic Dongba script. See the clip:
Nenets  BoK  1 Nenets together with Enets constitute the Samoyed branch of the Uralic language family. There are two Nenets languages: Tundra Nenets and Forest Nenets. They are both endangered languages, with the former spoken by about 30,000 and the latter only 1,500 people. Spoken in Siberia.
Nyang’i  BoK  8 A critically endangered or maybe already extinct Nilo-Saharan language once spoken in Uganda.
Negerhollands  BoK  7 A Dutch-based creole spoken in the U.S. Virgin Islands.
Ngiti  BoK   4 A Nilo-Saharan language spoken in the Democratic Republic of the Congo.
North Ambrym  BoK  1 The Austronesian North Ambrym language is one of the six languages of the small island of Ambrym (Vanuatu). Apart from North Ambrym, there are also the following languages spoken on the island: Daakaka, Daakie, Dalkalaen, Lonwolwol and the Port-Vato language. Here you can access a website about a documentation project on North Ambrym:
Nunggubuyu  BoK  3 A severely endangered language of northern Australia, currently spoken only by adults and not passed on to children.
Occitan  BoK  9 Occitan is a language complex (langues d’oc) including the southern French language varieties: Provençal, Gascon, Lengadocian, Lemosin and Auvernhat. In spite of a considerable number of speakers (almost 2 million according to Ethnologue 2009), it is considered endangered because of weakening intergenerational transmission. Revitalised in Calandreta immersion schools –
Oirat   (Kalmyk-)Oirat – Mongolic / Altaic language spoken by 154 thousand in Kalmykia (Russian Federation), 206 thousand in Mongolia and 139 thousand in Inner Mongolia (China). See the clip:
Ojibwe   A language, or a group of dialects, of the Algonquian family, spoken in Canada and the USA. Spoken by over 56,000 people, it is one of the major Native American languages – but still endangered.
Oriya (Odia)   An Indo-European (Indo-Aryan) language of India. With 33 million speakers one of the larger languages of the world.
Ös (Chulym) BoK  8 An endangered Turkic language of Siberia. Find information and samples here:
Pech   An endangered Central American language of the Chibchan family, spoken in Honduras.
Persian, Old Persian   Old Persian is one of the languages that are attested in the oldest form of writing, cuneiform symbols fixed in clay. Persian belongs to the Iranian branch of the Indo-European language family. Modern Persian (Farsi) is one of the world’s largest languages.
Phoenician   Phoenician is an extinct Semitic language, related to Hebrew and Arabic. The Phoenician script was the basis for the Greek alphabet.
Picard  BoK   9 see: Walloon

Pirahã  BoK   4 An endangered language of Brazil; Mura family or isolate.  Find out more about this very interesting language and the people speaking it at: .
Polish BoK 3,  4,  6, 10 Polish is a Slavic language, belonging to the Lechitic subgroup of West Slavic languages. Polish is the official language of Poland, but it is also used throughout the world by Polish minorities in other countries.
Portuguese  BoK  1, 7 Portuguese is an Indo-European language most closely related to languages such as: Catalan, Spanish and Occitan. Portuguese is one of the biggest languages of our planet, with around 215 million speakers all over the world.
Port Vato  BoK  1 The Austronesian Port-Vato language is one of the six languages of the small island of Ambrym (Vanuatu). Apart from Port-Vato, there are also the following languages spoken on the island: Daakaka, Daakie, Dalkalaen, Lonwolwol and the North Ambrym language.
Puma  BoK  3, Map Puma is one of the over 100 languages spoken in Nepal. It belongs to the Kiranti group of the Tibeto-Burman branch of the Sino-Tibetan language family. According to the 2001 census the number of speakers was 4310. It is endangered, as it is spoken mostly by adults who also regularly use another language – the national language Nepali and/or another Kiranti language, Bantawa or Chamling (source: Ethnologue). Further information and resources: DoBeS Chintang and Puma project: . This site contains some information on the language and culture, photographs and a short audio recording as sample. The Puma corpus in the DoBeS archive offers more information and many resources accessible for unregistered users.
Putonghua BoK 9 Putonghua is the official language of the People’s Republic of China and of its main ethnic community, the Han (ethnic Chinese). Putonghua is based on the Beijing dialect of Mandarin Chinese.
Quechua BoK  6 Quechua is native to South America. It is used in various regions of the Andes and in fact forms a language family: we have e.g. the Quechua of Ayacucho (Peru), the Quechua of Cuzco (Peru) or the Quechua of Santiago del Estero (Argentina). The different Quechuan languages display varying levels of vitality: from vibrant varieties used in Ecuador to critically endangered languages such as the Quechua of Yauyos (Peru). Quechua is used in several countries across South America, but only Bolivia, Ecuador and Peru recognize it as an official state language.
Rapanui  BoK  5 Rapanuian is a highly endangered language, with 3309 speakers, including 220 living on Easter Island (Ethnologue 2009). Rapanuian used Rongo-Rongo as its writing system.
Romani BoK  6 Romani is the language of the Gypsies, or maybe rather a group of languages, as different varieties of Romani may be very different. It belongs to the Indo-Aryan branch of Indo-European languages, which means it is genetically related to languages of India such as Hindi and Urdu or Sanskrit. However, as Romani has been spoken for centuries in Europe and in close contact with European languages, it has many loanwords from languages such as Greek, Romanian, Bulgarian or Macedonian. On the other hand, several words of Romani origin have been borrowed into European languages, for example English lollipop. Find more information on Romani here: and here:
Romanian   Romanian belongs to the Eastern Romance group of Indo-European languages. It is spoken in Romania, Moldavia and neighbouring countries. When Moldavia was a Soviet republic, the language spoken there was called Moldovan and written with the Cyrillic alphabet, but after 1989 Romanian is written almost everywhere with the Latin alphabet (except for Transnistria at the border of Moldavia and Ukraine).
Rotokas BoK 4 Rotokas is a Papuan language spoken on Bougainville island not far from where Teop is spoken. Rotokas is famous for its small inventory of phonemes and, related to that, its small alphabet. Here you can listen to a story in Rotokas that includes a song that may be borrowed from the Teop:
Russian  BoK  1 ,   4 One of the world’s major languages, with an estimated total number of speakers (first and second language) of 254 to 285 million. Russian belongs to the East Slavic group of languages (together with Belarusian and Ukrainian). The Russian Cyrillic alphabet has been adopted (with minor modifications) for many languages of Russia and the former USSR.
Rusyn   The Rusyn language complex comprises the following East Slavic varieties (often regarded as dialects of Ukrainian): Lemko in Poland (almost 7 thousand speakers), Prešov dialects in Slovakia (ca. 24 thousand), Transcarpathian Ruthenian in Ukraine and Romania (ca. 560 thousand), 2 Pannonian-Rusyn enclaves in northern Hungary, as well as Rusyn in the Autonomous Province of Voivodina in Serbia (30 thousand). Some sources/specialists include also Boyko and Hutsul dialects (spoken in Ukraine). The first standard for a literary Rusyn language developed in the then Socialist Federative Republic of Yugoslavia – where it was used as one of the official languages. In Poland recognised as an auxiliary minority language in 9 municipalities in southern Poland (Małopolska region/voivodship). More on Rusyn language:


Rutul  BoK  8 A Lezgic language of the North Caucasian family, spoken by ca. 29 thousand Dagestani (Russian Federation) and Azerbaijani. Find a sample (word list) of Rutul at:

Saami; Samic languages  BoK  16, 7 A Fennic language complex spoken by the Saami/Sami in central and northern Norway, northern Sweden, Finland and the Kola Peninsula (Russian Federation). Most of the Sami languages are critically endangered, e.g.:

  • Southern Sami (Norway, Sweden): 500 speakers;
  • Skolt Sami (Finland, Russia): 400 speakers;
  • Kildin Sami (Kola / Russia): 608 speakers.

Read more on the situation of Sami:

Listen to a Saami yoiking song:

Saliba  BoK  3 An Austronesian language spoken in the Milne Bay Province of Papua New Guinea. Find more information and pictures of speakers and landscape at:
Samoan  BoK  2 The Austronesian language Samoan is the first language of the vast majority of the inhabitants of Samoa islands. It has official status in both the Independent State of Samoa and American Samoa. It is therefore considered safe.
Samoyedic BoK 5   Samoyedic is one of the two branches of the Uralic language family, the other being Finno-Ugric languages. Samoyedic languages are spoken mainly in Siberia by small communities, but spread over a wide territory. All Samoyedic languages are endangered.
Sanskrit BoK 5   Sanskrit is a classical language of India, used almost exclusively in writing. Because of its exceptionally long and rich history as a written language, Sanskrit is especially important in historical linguistics. The discovery that Sanskrit is related to Greek and Latin led to the recognition of the Indo-European language family.
Serbian BoK 5   Serbian is a South Slavic language, spoken by Serbs in Serbia, Bosnia and Herzegovina, Montenegro, Croatia and Macedonia. The question whether Serbian, Croatian and Bosnian are separate languages or dialects of one language is controversial – these ethnolects are very similar to each other in vocabulary and grammar, but on cultural and political grounds they are nowadays often considered different languages. Serbian is more often written with a Cyrillic alphabet, but the Latin alphabet is also used.
Scots  BoK  9 Scots is a west Germanic language and is spoken by 1.6 million people in Scotland today. For more details see .
Serbo-Croatian   see Serbian and Croatian.
Sheko BoK  3 Map Sheko is a minority language of Ethiopia with about 39,000 speakers. It belongs to the Omotic group of the Afro-Asiatic language family. It is not an endangered language at the moment, for it is spoken by adults and children at home and in many public situations, but its position is vulnerable. Education is available in Amharic and English, but many speakers of Sheko are illiterate. Further information and resources: Short information at: Hellenthal, Anneke Christine. 2010. A grammar of Sheko.  Proefschrift, Universiteit Leiden. Available online at:

Sign languages  BoK  1 9   Sign languages are used by the deaf all over the world. Just as spoken languages, each sign language has its own grammar and vocabulary. The grammar of a sign language may be completely different from the grammar of the spoken language surrounding it (for example, German Sign Language is very different from German). People who use a sign language usually also use a spoken language and thus are bilingual. Watch here a clip about sign bilingualism as a human right:
Silesian  BoK  1 9 Silesian (also: Szlonzokian, Schlesisch) is a West Slavic complex of dialects spoken in southern Poland and north-east of the Czech Republic – although declared as language of home contacts by over 56 thousand people (2002), it is not recognized as regional/minority language in Poland, whose politicians and dialectologists claim it to be a dialect of Polish. Silesian nationality is declared by over 10 thousand Czech citizens. Silesian is being standardized and codified vigorously – see e.g. . The main danger for Silesian is its assimilation within the dominating Polish language.
Singa  BoK  8 entry for Uganda lists 6 languages, of which 3 are now considered extinct, namely Napore, Nyang’i and Singa.
Sirionó  BoK   1 The Sirionó (also known as: Siriono and Mbia Chee) language of the Tupi language family is spoken by roughly 400 people in the village of Ibiato (Bolivia). Sirionó is critically endangered.
Slavic languages; Slavonic languages  BoK   1 The Slavic (or: Slavonic) languages are the native languages of the majority of Central and Eastern Europeans. The family comprises languages such as Polish, Russian, Bulgarian and Croatian, but also smaller languages: Sorbian, Kashubian, Silesian and Rusyn, among others.
Old Church Slavonic BoK 9 Old Church Slavonic is the oldest literary Slavic language, formed on the basis of the Slavic dialect of Thessaloniki. Old Church Slavonic became the literary basis for languages such as Bulgarian, Serbo-Croatian and Russian, and is still used in the Orthodox liturgy.
Slovak   Slovak is a West Slavic language, just as Polish. It is the national language of the Slovak Republic and spoken by about 7 million speakers worldwide.
Sorbian (High Sorbian and Low Sorbian)  BoK  1 9 West Slavic Sorbian has two varieties: Upper Sorbian spoken in the German federal state of Saxony, where it has ca. 18 thousand speakers (Ethnologue 2009) – considered endangered, and Lower Sorbian with fewer than 7 thousand speakers in the German federal state of Brandenburg (formerly Prussia) – considered critically endangered.

Sumerian  BoK  1 The ancient Sumerian language was a language of Mesopotamia up until the 3rd millennium B.C. One of the oldest instances of written human language – cuneiform inscriptions from the Akkadian empire – was written in Sumerian.
Southern Sierra Miwok  BoK  3 A Native American language spoken by only a few speakers in California, Miwok-Costanoan family. Read more at:
Surzhyk BoK 7   The language variety in Ukraine known as ‘surzhyk’ (the original term means poor quality bread made of mixed rye and oats) is seen as the language of peasants, and is used on the street or at the bazaar by newly urbanized inhabitants and is widely regarded as ‘a pejorative … collective label for a wide range of mixed Ukrainian-Russian and Russian-Ukrainian language forms that dissolve and intertwine the structures of the two Eastern Slavonic languages’ (Bernsand, Niklas. 2001. Surzhyk and national identity in Ukrainian nationalist language ideology. Berliner Osteuropa Info 17, 40).
Svan BoK  5, Map Native language of the Svans (Svanuri), who inhabit southern slopes and valleys of the High Caucasus in the region of Svaneti, located in north-western Georgia. The language, called Lushnu in Svan, belongs to the Kartvelian language family, which also includes Georgian, Megrelian and Laz. Svan, Megrelian and Laz are officially regarded by Georgian authorities and institutions as dialects of Georgian, in spite of opinions expressed by linguists and other experts, who emphasise the different linguistic histories of the three regiolects and classify them as separate languages. Lack of any support from the Georgian state and the low prestige of Svan have resulted in an increased rate of language decay and endangerment. In recent years some of the traditional Svan settlements have been destroyed by land- and mudslides, resulting in the long-established clannish Svan communities being resettled to the southern, extremely poor, abandoned and plain areas, mainly along the borders with Armenia and Azerbaijan. Like the Georgians, the Svans are of Christian Orthodox denomination, although their traditions also include numerous remnants of pre-Christian beliefs. The Svan language is written at present with the letters of the Georgian alphabet, although in the past it also used Latin or Cyrillic alphabets.
Swahili  BoK  7 Swahili belongs to the Bantu sub-family within the Niger-Congo family of languages. It is the major language of East Africa, where it is spoken as a second language by about 40 million people (the number of native speakers is only 6 million). Swahili has official status in Tanzania, Kenya, Uganda, the Democratic Republic of the Congo, and the Union of the Comoros.
Taa; !Xóo, !Xuun  BoK   4, Map Taa (also known under the name !Xóo or !Xuun) has about 500 speakers in Namibia and about 4000 speakers in Botswana (source: DoBeS Taa project). Not all of these speakers are really fluent in the language, and more pessimistic estimates give the total number of speakers as 2000 or 2500 (see ). Taa belongs to the Tuu (or Southern Khoisan) group of languages, which traditionally are grouped together with other languages as Khoisan languages. The example in this package show the West !Xoon dialect spoken in Namibia. Even though children still grow up with Taa as their mother tongue in some places, the language is highly endangered. This is due to the low socio-economic status of Taa speakers in both countries, the multilingual settings in which they live, and the lack of teaching materials.” (DoBeS Taa project). Among linguists Taa is famous because of its enormous inventory of phonemes (see also BoK 3.4): it distinguishes more than 80 consonants (the exact number differs according to different theories and interpretations), including clicks, and 20 vowels. It also uses two tones (high and low). Short information and many samples (audio, video, transcripts) at: DoBeS Taa project: . This site contains short information on the language as well as on geography and culture. The Taa corpus in the DoBeS archive offers more information and resources, some accessible for unregistered users.
Tahuatan, tahutan BoK  10,   Map One of the South Marquesan dialects, belonging to the Nuclear East Polynesian family; closely related to Hawaiian, Māori or Rapanuian. Ethnologue [2009: 594] refers to 2100 persons, mainly fishermen, speaking 3 mutually intelligible dialects of the main Marquesas Islands: Tahuata, Hiva Oa and Fatu Hiva (within French Polynesia). The North Marquesan dialects constitute a separate language complex. More on Marquesan:
Tamil   A Dravidian language spoken in southern India and Sri Lanka (where it is one of the national languages) by over 70 million people.
Tatar  BoK  9 A complex name for Turkic (a subfamily of Altaic languages) varieties spoken by various Tatar peoples in Eurasia, among others:

  • Kazan or Volga Tatars in Tatarstan (Russian Federation),
  • Siberian Tatars (Khakass, Shor, Teleut),
  • Polish-Lithuanian-Belarusian (Lipka) Tatars – who lost their language centuries ago – see: ,
  • Crimean Tatars – a separate people in terms of ethnicity and language.

See e.g.:

Tay Boi   BoK  7 A French-based Pidgin spoken in Vietnam. Also known as Annamite French, Tay Boy and Vietnamese Pidgin French.
Telugu   A Dravidian language spoken in India by over 70 million people.
Temne  BoK  7 Temne is a language of the Mel branch of Niger–Congo, spoken in Sierra Leone by about 2 million first-language speakers. It also serves as a lingua franca for an additional 1,500,000 people living in areas near the Temne people (Ethnologue 2009).
Teochew BoK  8 Teochew, a variety of Southern Min Chinese, is spoken in the Chaoshan region of eastern Guangdong and by the Teochew diaspora in various regions around the world. Teochew preserves many archaic Old Chinese pronunciations and vocabulary items that have been lost in some of the other modern dialects of Chinese.
Teop BoK  3,  10, Map Teop belongs to the Oceanic branch of Austronesian languages. It is spoken on the island of Bougainville, which belongs to Papua New Guinea. Estimates of the number of speakers range from five to ten thousand. Read more and find samples at the website of the Teop Documentation Project: .
Thai BoK 4 Thai is one of the big languages of the world. Estimates of the number of speakers vary from 20 million to 60 million – depending on whether different dialects are counted as one language or several, and on whether bilingual speakers with a different mother tongue are included in the count. If you want to learn Thai, check out this site:
Tibetan, Classical Tibetan BoK 5   Classical Tibetan is the language of texts from the 10th – 12th centuries, written in the Tibetan script. It is also the basis for Modern Standard Tibetan, the official language of the Tibet Autonomous Region in the People’s Republic of China. Tibetic languages form a branch of the Sino-Tibetan language family.
Tibeto-Burman languages BoK 5 The Tibeto-Burman languages are more than 300 languages spoken from the Himalayas to Burma. Depending on the hypothesis, they are considered a subfamily within the Sino-Tibetan language family, a separate language family, or a stock of unrelated language groups in the area.
Tobian BoK  1 The Tobian language is a critically endangered Austronesian language of Palau. It is spoken by only about a dozen Palauans, mainly fishermen.
Tondano BoK  3   An Austronesian language spoken in Northern Sulawesi.
Tok Pisin  BoK  3 Tok Pisin is an official language and a language of wider communication in Papua New Guinea. It is a creole language that developed from an English-based pidgin with elements of Melanesian languages.
Totoli BoK  1, 3, Map Totoli is an Austronesian language spoken  by around 25,000 inhabitants of Central Sulawesi (Indonesia). Totoli is vulnerable to endangerment.  Here is a link to a documentation project of Totoli:
Tshanglakha, (or: Tshangla language) BoK  6 Otherwise known as ‘Sharchopkha’ – the language of the Sharchop (Tshangla) people who inhabit eastern Bhutan, the Arunachal Pradesh state in India and southern parts of Tibet. With around 170,000 speakers, it is one of the dominant languages of the region. Tshanglakha is a predominantly oral language, and verbal art – e.g. riddling – plays an important cultural role in the speaker community. The Tshangla language belongs to the Sino-Tibetan language family.
Tshiluba  BoK  7 Luba-Kasai (also called Luba-Lulua, Western Luba, Kasai, Luva, and Tshiluba) is a Bantu language spoken in the Democratic Republic of the Congo, where it is a national language, along with Lingala, Swahili, and Kikongo.
Tsotsil  BoK  3 A Mayan language spoken in Mexico.
Turkic languages   A family of languages spread over Eurasia from south-eastern Europe to western China. The largest Turkic languages are Turkish, Azerbaijani, Uzbek and Kazakh. There are also several small and endangered Turkic languages, for example the Tatar languages, Armeno-Kipchak, Ös (Chulym) and Karaim.
Turkish  BoK  3, 6 Turkish belongs to the Turkic family of languages. Other Turkic languages are spoken mainly in Central Asia and North Eurasia. With over 60 million speakers it is one of the world’s major languages.
Ubykh  BoK  4 Ubykh was a Northwestern Caucasian language, whose last speaker died in 1992.
Udmurt; Votyak  BoK  1 The Udmurt language (otherwise known as Votyak) is a Finno-Ugric language spoken primarily in the Udmurt Republic (Russia), where it enjoys official status. In the 2012 Eurovision Song Contest, Russia was represented by Buranovskiye Babushki with a song that was partly in Udmurt. They came in second place in the final. Here you can watch Buranovskiye Babushki sing:
Ukrainian  BoK  8 Ukrainian is a member of the East Slavic subgroup of the Slavic languages. It is the official state language of Ukraine and is demographically stronger in the west of the country, the east of Ukraine having more Russian speakers. Overall, 85.2% of the population of the country claimed to speak Ukrainian in the 2001 census ( )
Ugaritic   Ugaritic was a Semitic language. We know it from texts from the 14th to 12th centuries BC, written in cuneiform on clay tablets that were discovered in 1928.
Uralic   A family of languages used in the northern parts of Eurasia. The Uralic family has two subfamilies: Finno-Ugric and Samoyedic. Languages such as Estonian, Finnish, Khanty, Nenets and Udmurt belong to this family.
Urdu  BoK  1 The Indo-European Urdu language (In Urdu: اُردُو‎ )  is spoken by over 60 million people, mainly in Pakistan and India. Many linguists consider Urdu and the Hindi language of India as the same language. However, from a socio-political point of view, they are in fact two separate languages.
Uto-Aztecan languages BoK 9 A family of more than 60 languages in North and Central America. They are used by approximately 2 million indigenous people of the United States and Mexico. Examples of languages of the Uto-Aztecan family include Nahuatl of Mexico and Hopi of Arizona.
Uyghur   A Turkic language of western China.
Vietnamese BoK   4  With 76 million speakers, one of the larger languages of the world. It belongs to the Austro-Asiatic family of languages. Unlike most languages of Asia, Vietnamese is written with a modified Latin alphabet.
Võro-Seto   The Balto-Fennic South Estonian varieties Võro and Seto are used by about 70 thousand and 10-13 thousand persons respectively in southern Estonia (the latter also in the border areas in Russia). In spite of the distinct development of a written standard and a sense of separate linguistic identity (with the Seto also united by an Orthodox religious-cultural identity), the language(s) do not have any official status recognized by the Estonian state, which impedes their full vitality. The language(s), however, are being taught at a number of schools and in extracurricular courses. The Võro Institute runs its web-page: . Listen to a Seto choir: .
Walloon BoK  9 Complex of regional endogenous languages (Romance langues d’oïl) spoken in southern Belgium (Wallonia) and north-eastern France, including, besides Walloon proper, the closely related Picard, Lorrain and Champenois. Spoken by over one million and understood by two million citizens of the two countries. Because of its close linguistic proximity to French (the official language of both states), it is not recognized as a separate language, in spite of the endeavors of Walloon organizations. More on the language: . The main danger for Walloon is its assimilation into the dominant French language.
Warlpiri BoK  6 Warlpiri is a native Australian language belonging to the Pama-Nyungan language family. Warlpiri is well known to linguists for its interesting features, such as the existence of a special variety called Jiriwirri – an avoidance language with special vocabulary used in initiation rites.
Warrgamay  BoK  3 A language of Australia, related to Dyirbal. According to Wikipedia, the language is already extinct, but see here for a revival program: and or look at the wordlist here:
Wãnsöhöt    BoK  1 Wãnsöhöt, otherwise known as Puinave (or: Waipuniavi), is a language isolate spoken along the river Orinoco in Colombia and Venezuela. Puinave is a definitely endangered language, with about 4,000 speakers (Higuita and Wetzels 2007, Tone in Wãnsöhöt).
Welsh  BoK  1 3 ,   8 The oldest language spoken in Britain. Some 19 percent of the population of Wales (some 568,500 people) claim to speak this Brythonic variant of Celtic  ( )
Wenyan (=Classical Chinese)   Classical Chinese literary language
Wichita  BoK  8 Wichita is spoken by only a few elderly speakers (most of them know only some words or sentences of this language) in Oklahoma. It belongs to the Caddoan family of Native American languages. Find more information at: Here you may listen to two sentences in Wichita spoken by one of the last speakers:
Wilamowicean   BoK  3, 5, 9 Map Wilamowice is a small town with about 3,000 inhabitants near Bielsko-Biała in Silesia in southern Poland. About 50-60 of them still use an archaic regional Germanic variety called Wimisioerish. Linguists hold that this variety goes back to Middle High German dialects, brought into Poland by colonists from Germany in the late Middle Ages. The language is no longer transmitted to the next generations, although attempts are being made to revive and revitalize the ethnolect and local culture. See e.g. .

Xibe BoK 5   A Tungusic language spoken in northwest China. For some more information see:
Xhosa  BoK   4 Xhosa is one of the official languages of South Africa. It belongs to the Bantu family and has about 7.8 million speakers.
Yagua  BoK  2 A language spoken in Peru, the only surviving member of the Peba-Yaguan family. (see also Peba-Yaguan, Peru. Source: Dahl & Velupillai 2011, ).
Yaka  BoK   4 Yaka is a Bantu language of central Africa.
Yemba  BoK  8 A Niger-Congo language spoken in the western province of Cameroon, Africa; 300,000 speakers according to Ethnologue 2009.
Yeri Map Yeri is an endangered Torricelli language from Papua New Guinea. In the village where Yeri is spoken, most young people have shifted to Tok Pisin – the national language of Papua New Guinea.
Yi (= Lolo, Nuosu)  BoK  5 The Yi languages (known also as Nuosu or Lolo) constitute a separate group within the Tibeto-Burman language family, six of which are recognized officially in China as separate, mutually intelligible language varieties: Northern Nuosu, Western Yi (Lalo), Central Yi (Lolopo), Southern Yi (Nisu), South-Eastern Yi (Wusa Nasu), Eastern Yi (Nasu). Other Yi languages are also spoken in Vietnam, Burma and Thailand. Most of the Yi languages are not yet endangered in terms of numbers of speakers (although Ethnologue 2009 refers to such varieties as Miqie Yi, with 30 thousand users and decreasing); intense cultural and administrative contacts with official languages (mainly Chinese) have caused these languages to adopt many loanwords and structures borrowed from them. The endangered literature in Yi script also requires thorough documentation – see the clip: .
Yiddish BoK  7 Yiddish was, until the mid-twentieth century, the language of Ashkenazic Jews (the Jews of Central and Eastern Europe and their descendants). As a hybrid variety of Hebrew and medieval German, Yiddish takes about three-quarters of its vocabulary from German, but incorporates words liberally from Hebrew and many other languages (e.g. Polish, Russian, Romanian, etc). It has a grammatical structure distinct from modern German, and is written in an alphabet based on Hebrew characters. Yiddish is generally classified as a Germanic language, but this is sometimes questioned.
Yimas  BoK  3 An endangered language of Papua New Guinea, of the Sepik-Ramu family
Yoruba  BoK   1 ,   4 The Yoruba language belongs to the Niger-Congo language family. It is spoken by around 19 million people in Nigeria and Benin. Here you can learn some Yoruba on-line:
Yucateco BoK 5 Yucateco, also known as the Yucatec Maya language, is one of the biggest Mayan languages. It is spoken in the Yucatan peninsula and in Belize.
Yukaghir  BoK  5 The Yukaghir language varieties, Northern (Tundra) and Southern (Kolyma / Forest), are classified as an isolated group cognate with the Uralic languages (i.e. Samoyedic and Finno-Ugric). According to Ethnologue 2009 they are spoken by 90 and 30 persons respectively. Listen to a Yukaghir song:
Yurakaré  BoK  3,  10, Map Yurakaré (also spelt Yuracaré) is spoken in Bolivia, in an area northwest of Cochabamba. Current estimates of the number of speakers vary from 1,800 to 2,680 ( ). All speakers are bilingual, with Spanish as the other language. It is a language isolate, which means that its relationship to any language family has not been proven, although several linguists have tried to connect Yurakaré to other languages of the Americas. The language is highly endangered. DoBeS Yurakaré project: . This site contains information on the language and the people, their history and culture as well as on the geography of the area. Access to the resources in the DoBeS archive is possible after individual registration and request. Van Gijn, Erik. 2006. A grammar of Yurakaré. Proefschrift (PhD thesis), Radboud Universiteit Nijmegen. Available online at
Zayse  BoK  3 Zayse belongs to the Omotic branch of the Afro-Asiatic language family. It is spoken in Ethiopia by about 18,500 people.
Žemaitian / Samogitian  BoK  9 Žemaitian / Samogitian is a Baltic dialect complex within the Lithuanian language area with a separate historic, territorial and literary writing tradition. Not recognised as a separate language, its speakers endeavour to upgrade the official (political and social) status of Žemaitian in Lithuania. Browse the developing Žemaitian Vikipedeje:; Listen to traditional Žemaitian songs: .
Zhuang   Zhuang – a Kam-Tai macrolanguage spoken by almost 15 million members of that national minority in South-Eastern China. The Zhuang languages have numerous tones. Listen to a Zhuang song available at: .
Zuni  BoK   4 A Native American language spoken in New Mexico and Arizona, no known relatives.
