Goals and objectives of the project

Ethiopia has the fastest-growing economy in Africa and the fifth fastest-growing in the world. With a growth rate of 8% a year, Ethiopia aims in the near future to build up a solid economic apparatus and to create infrastructure and work opportunities even in the most remote areas. This is leading to rapid and drastic social and cultural change that also affects the speech habits of people, particularly those who belong to minority groups such as the Bayso and the Haro.

The project has the urgent task of creating and archiving a repository of speech samples of various lengths and genres in Bayso and Haro, along with grammatical sketches, dictionaries, and sociolinguistic and anthropological profiles. The material is recorded in audio and video. The documentation activities are multidisciplinary and are based on the collaboration of linguists and anthropologists. All the team members collect, transcribe and translate the recorded speech with the collaboration of local consultants. Linguists are in charge of grammatical annotation and the creation of grammatical sketches, dictionaries and sociolinguistic profiles. The anthropologists add cultural information about the texts and create anthropological profiles.

Data collection is conducted both on Gidiccho island and in the village of Alge. Visits are also paid to the other villages where some of the Bayso and Haro live. Recordings are made digitally, using recorders that produce uncompressed (.wav) sound files. To obtain the best possible sound quality, microphones with different characteristics are chosen according to the recording context. In the case of videos, whenever possible, the audio is recorded separately with a voice recorder, and the video and audio files are then synchronized. The audiovisual material is particularly valuable because it shows endangered practices that are threatened together with the linguistic expressions describing them.
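As an illustration of how the uncompressed (.wav) requirement mentioned above could be checked before archiving, the following is a minimal sketch using Python's standard `wave` module. The function name `describe_wav` and the returned keys are hypothetical and are not part of the project's actual workflow.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM .wav file.

    The standard-library wave module reads uncompressed PCM only,
    so a file in another encoding raises wave.Error on opening.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
            # "NONE" is the value reported for uncompressed audio.
            "compression": wav.getcomptype(),
        }
```

A check of this kind could be run on each file after a recording session, so that a wrongly configured recorder is noticed while the team is still in the field.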

Speech is provided by a wide range of speakers of different ages, genders, and social statuses. A more conservative form of the languages is expected from people who spend most of their time on Gidiccho island, where practices that tend to disappear in the mainland villages are also preserved. Knowledgeable elders will be the reference speakers for the description of ancient and lost practices, while younger, educated speakers assist in transcription and translation. Others involved are university anthropology students who have field experience in the area, and other people who know the area and can provide practical assistance.

The texts are transcribed and translated on paper in the field. An agreed Latin-based orthography is used, since these languages have no standard script. The working language is Amharic. Most of the team members speak Amharic, but local consultants who know English will also act as intermediate translators when needed. The transcriptions and translations in the field notes are checked and typed into an ELAN file. The most interesting of the transcribed texts are then annotated grammatically, also in ELAN. In particular, the ELAN-CorpA version of the program will be used, as it has an additional function for semi-automatic annotation. The culturally most relevant recordings are enriched with anthropological notes.

The descriptive part of the project is devoted to the creation of vocabularies and grammatical sketches. Lexical collection starts from word lists retrieved from published material. These are enriched with items collected in dedicated recording sessions and extracted from the recorded speech. The grammatical sketches build on the available descriptions and are refined by the findings of the translation and annotation work. In the case of Haro, the descriptive work is facilitated by the availability of an extensive grammar. Alongside the linguistic description, the anthropological and sociolinguistic overviews of the speech communities provide information about the cultural and social context of speech production.

All the material is stored in the DoBeS archive together with proper metadata. The metadata describe not only the content of the speech and the characteristics of the speaker, but also the setting of the recording and the environmental and human context that influences its content. Documentation of proper consent to the recordings and their dissemination is attached to the metadata.
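The kinds of information listed above (content, speaker, setting, context, consent) can be pictured as one record per recording. The following is a minimal sketch only: all field names are hypothetical and do not reproduce the archive's actual metadata schema, and the values in the example are placeholders, not project data.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RecordingMetadata:
    """Hypothetical per-recording record; not the archive's real schema."""

    recording_id: str
    language: str          # "Bayso" or "Haro"
    genre: str             # e.g. narrative, conversation, song
    content_summary: str   # what the speech is about
    speaker_age: int
    speaker_gender: str
    location: str          # where the recording was made
    setting_notes: str     # environmental and human context
    consent_given: bool
    consent_scope: list = field(default_factory=list)  # permitted uses


def to_json(meta):
    """Serialize a record to JSON, keeping non-ASCII characters readable."""
    return json.dumps(asdict(meta), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = RecordingMetadata(
        recording_id="demo-001",
        language="Bayso",
        genre="narrative",
        content_summary="placeholder summary",
        speaker_age=60,
        speaker_gender="female",
        location="Gidiccho island",
        setting_notes="placeholder notes on setting and audience",
        consent_given=True,
        consent_scope=["archive", "community distribution"],
    )
    print(to_json(example))
```

Keeping the consent fields in the same record as the descriptive fields reflects the point made above that consent is attached to the metadata rather than stored separately.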

The material is edited in the form of scientific articles and of booklets designed for the community. The aim is to distribute to the communities a grammatical sketch and a dictionary of each language. The dictionaries will be trilingual: Bayso-Amharic-English and Haro-Amharic-English. We also plan to create a talking dictionary to be distributed on CDs. A selection of videos and photos will also be edited onto DVDs and distributed to the community.

The project aims to provide scientifically rigorous linguistic and cultural material based on data collected in a friendly and collaborative atmosphere. A crucial point is the collaboration with local consultants. The aims and objectives of the research activities are carefully explained to the community, stressing the value of these languages in the wider scientific context. This boosts interest and increases a sense of pride in the two communities. The practical result is to provide the Bayso and the Haro with sound linguistic material that encourages them, and in particular their children, to keep speaking their languages in spite of the social and cultural change affecting them. The linguistic descriptions of Bayso and Haro will have the scientific result of showing the value of these languages within the wider linguistic diversity of humankind before they shrink and eventually disappear.


Bender, Marvin Lionel (1975). Omotic: a new Afroasiatic language family. Carbondale: Southern Illinois University.

Croft, William (2003). Typology and Universals (2nd ed.). Cambridge: Cambridge University Press.

Hayward, Richard (1978). Bayso revisited. Some preliminary linguistic observations. Bulletin of the School of Oriental and African Studies 41,3:539-570.

Hayward, Richard (1979). Bayso revisited. Some preliminary linguistic observations. Bulletin of the School of Oriental and African Studies 42,1:101-132.

Hirut Wolde-Mariam (2015). A Grammar of Haro. Muenchen: Lincom.

Mekonnen Hundie (2016). The Grammar of Girirra (A Lowland East Cushitic Language of Ethiopia). Unpublished PhD thesis, Addis Ababa University.