Welcome to the Pangloss Collection,
an open archive of endangered and under-documented languages.

Featured resources

The Pangloss Collection offers open access to linguistic audio documents, specializing in rare or less-studied languages. Its aim is to contribute to the documentation and study of the human heritage represented by the world's languages. The documents that visitors can consult (and download) here are the result of patient work by professional linguists who collect, study and safeguard the world's linguistic heritage. This task is urgent, because linguistic diversity is declining rapidly, in parallel with the decline of biodiversity. Each language that disappears takes with it a universe of knowledge. Like biodiversity, linguistic diversity is a wealth to be preserved and maintained in order to meet the challenges of the present.

Currently, the collection includes 252 corpora of languages and dialects from 46 countries, deposited by 88 researchers, representing more than 1,180 hours of recordings. The documents consist mostly of spontaneous speech, but also include word lists and questionnaire sessions, recorded in the field within speaker communities and transcribed in consultation with the speakers. They were recorded and annotated by contributors from a wide variety of backgrounds, including researchers, teacher-researchers and doctoral students from the LACITO-CNRS laboratory. About half of the recordings are transcribed and annotated, allowing all listeners to understand what they are hearing.

The collection is managed by a team from LACITO.

If you wish to deposit new resources or propose additional translations or annotations, we encourage you to contact us.

Map of corpora

The Pangloss Collection makes available the recordings collected by researchers in the course of their fieldwork on all continents.

Developed by CNRS-LACITO

Our partners