The project's second year has started. I am now in Addis Ababa to prepare my next four weeks of fieldwork. The OK for the continuation of the project has been given by the VolkswagenStiftung, the sponsor of the project in the context of the DoBeS programme for endangered languages, on the basis of a certain amount of material that we have promised to deposit in the DoBeS archive. See www.dobes.mpi.nl and look for the Bayso and Haro project.
The corpus consists mainly of speech performances recorded in audio and video. All the recordings are transcribed and translated, and part of them is annotated with morphemic segmentation and morpheme-by-morpheme glossing. There are also video files without speech transcription, ethnographic explanations of texts dealing with certain topics such as religion and boat building, a first wordlist of Bayso and Haro translated into English, and a first draft of the grammatical sketch and of a sociolinguistic profile of the two languages. Each archived file is accompanied by metadata filled in using Arbil.
This is already rich material, but it must be revised and adjusted. The transcription is not homogeneous because each team member used a different system. Now we have a single system that came out of the phonological analysis contained in the first draft of the sketches. The morphological annotation should be revised and improved, but it is based on the existing descriptions of the two languages, which are not bad at all. The translation is also a bit rough. Some team members made both a literal translation and a free translation; some files have only one of the two. The Bayso lexicon, about 700 words, is based on the collected texts and is being compared with the one by Hayward (1978 and 1979), which I have digitised. I have also swapped the Bayso and English columns to get the English index. Our own lexicon also has two indexes. Mechthild Reh has collected all the Haro words, about 850, from the grammar by Hirut (2004) and compared them with the words found in Brenzinger (1999). An additional lexicon of about 150 cultural terms has been created by Fabienne Braukmann. Endashaw Woldemichael has also created a 300-item wordlist. However, the transcription must be revised before archiving. The draft of the grammatical sketch of Bayso is mostly based on our material, while the one of Haro is based on Hirut (2004).
I think we are on the right track. There is a clear and flexible data workflow: material is first created in a rough version and then gradually checked and revised with increasing precision. And next week brings new fieldwork and a fresh start towards a new target: getting the OK for the third and final year!