Currently approximately 7000 languages are spoken worldwide (many of them with a number of dialects). However, it is assumed that by the end of the 21st century, only one third – maybe even only one tenth – of these languages will continue to exist. As language is a unique expression of the intellectual heritage and cultural knowledge of each speaker community, ways of conceptualizing the environment and social structure will be irretrievably lost with the death of a language.
In 2000 the Volkswagen Foundation started the DOBES programme (Dokumentation bedrohter Sprachen, "Documentation of Endangered Languages") in order to document languages that are potentially in danger of becoming extinct within a few years’ time. The programme began that year with a pilot phase of seven documentation teams and one archiving team, with the intention of coming up with recommendations on how language documentation can work and how digital archiving can best be done. Since then, new documentation teams were selected on a yearly basis in order to carry out significant documentation work within 3-5 years. In total, 67 documentation projects have been funded, and the funding programme came to its final round in 2011. The first documentation teams finished their contractual phase in 2006, but many teams still carry on with the documentation work even after their granted period. Yearly workshops are held in which all past and present documentation projects meet in order to exchange experiences and results.
DOBES – as an initiative and as a collective of researchers – has had a major impact on the direction that language documentation has taken, concerning theoretical issues, questions of best practice in documentary work and technical tools and standards.
Language Documentation
Language Documentation is a reaction of the linguistic community to the imminent disappearance of the majority of the world’s languages. It has three major aims:
- Maintenance and revitalization
- Preserving information on language diversity and cultural treasures of mankind for future generations of speakers and researchers
- Introducing accountability to linguistic research
These main goals and the unique nature of each documentation team and setting shape the documentary record that is submitted to the digital archive.
Each documentation is carried out in close cooperation with the speech community. Audio and video data from a variety of genres are collected. The data are described with a set of standardized metadata categories, digitally archived according to open standards, and made accessible. In addition, the archive has to ensure the long-term persistence of the digital material.
The language documentation depositories in the archive contain the following types of material:
- audio and video recordings with annotations of differing depths: usually a transcription and a translation into one or more major languages are present, and often morphosyntactic glossing is included as well
- photographs and drawings, partly bundled into groups of photos documenting processes, e.g., how to build a house
- music recordings and videos of cultural activities and ceremonies
- documents on the language’s genetic affiliation, its socio-linguistic context, its phonetic and grammatical features, and the circumstances of research, recording and documentation
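As a rough illustration of the annotation depths mentioned in the list above, the following Python sketch models one annotated segment of a recording with a transcription, optional translations and optional glossing. The class, field and function names are hypothetical and are not taken from the DOBES archive or its tools; the sample strings are placeholders, not real language data.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Segment:
    """One annotated stretch of an audio or video recording (hypothetical model)."""
    start_ms: int
    end_ms: int
    transcription: str
    # Translations into major languages, keyed by a language label.
    translations: Dict[str, str] = field(default_factory=dict)
    # Morphosyntactic glossing is often, but not always, present.
    gloss: Optional[str] = None


def annotation_depth(segment: Segment) -> str:
    """Return the deepest annotation layer available for a segment."""
    if segment.gloss:
        return "glossed"
    if segment.translations:
        return "translated"
    return "transcribed"


if __name__ == "__main__":
    seg = Segment(0, 2500, "placeholder transcription",
                  translations={"English": "placeholder translation"})
    print(annotation_depth(seg))
```

A real annotation file would typically carry several time-aligned layers per recording; the sketch only shows how differing depths of annotation can be told apart.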
Technology in DOBES
From the beginning, the DOBES programme wanted to take advantage of modern state-of-the-art technology and, where necessary, drive technology to suit the needs of the documentation work. Therefore, the following topics were discussed and widely agreed upon, in particular in the pilot phase:
- specifications for archival document formats to promote long-term accessibility
- recommendations for recording and analysis formats, and tools to ensure quality and reduce the conversion effort
- the creation of new tools that support the audio/video annotation work, the metadata creation and the navigation in metadata domains, and of advanced web-based frameworks to access and enrich archived resources
Given the dramatic situation of recordings of cultures and languages in general (according to a UNESCO overview, about 80% of the storage media are subject to heavy chemical/physical deterioration), it was important to provide other contributors with the possibility of depositing valuable material into the archive. Therefore, web-based technology was built to allow the upload of new material or new versions into the archive. This technology can also be used by the DOBES teams to continue their work after the end of their official DOBES phase.
The Boards
Two boards coordinate and accompany all DOBES activities:
The Steering Committee coordinates the activities and addresses common issues within the DOBES programme. In particular, its members work out a programme for the workshops.
The Linguistic Advisory Board gives general advice to members of the research and archive teams and provides help with problems arising in the process of documenting a language.