Automatically Annotated Repository of Digital Audio and Video Resources Community
AARDVARC (Automatically Annotated Repository of Digital Audio and Video Resources Community) will address the problem of untranscribed, and therefore unavailable, documentation of understudied languages by building an interdisciplinary community of linguists, anthropologists, and computer scientists. This community will share knowledge and collaborate on the specification of a repository and a suite of tools to facilitate automatic or semi-automatic transcription and analysis of audio and visual information. The project will provide for two workshops and a symposium to design a "take one leave one" repository and to explore recent advances in speech and video processing that will allow anthropologists and linguists to break the ‘transcription bottleneck’ for language and cultural data. The focus on lesser-studied languages will present new challenges for computer scientists seeking to move beyond the tools available for well-studied languages. AARDVARC is an NSF-sponsored project, award number 1244713.