Developing accessible corpora from lesser-described languages
to support new kinds of empirical research on diverse languages.

See the program of the workshop here

As a result of this workshop:
- CoEDL will have a better idea of the languages and content of corpora
- We will share methods we use for building, exploring and analysing our corpora
- We will identify places where improvement in methods or tools could assist
- We will plan future workshops/conferences on corpus issues for the CoE

Monday 18th April, morning
Opening: Stefan Schnell & Nick Thieberger

Session 1: Corpus linguistic projects

AIATSIS collections Australian languages
Australian sign languages  
language variation 
language contact 
social cognition 
language typology

Monday 18th April, afternoon

Session 2: Stocktake
Short presentations of existing material from 15 languages

Yolngu, Bininj Gun-Wok, Dalabon, Warumungu, Warlpiri, Warnman, Arrernte, Daly languages project, Wubuy, Kriol, Ngaanyatjarra, Anindilyakwa, Mudburra, Nen, Marind.

Tuesday 19th April, morning

8.30 start - Stocktake
Ku Waru, Nafsan, Living Archive of Australian Language

Session 3: Discussion and planning

We will discuss the following points with the aim of defining a first set of corpus data to be made accessible after the first year of our initiative:

- metadata content and formats
- primary data types, content and formats
- minimal standards of corpus annotation

Discussion of further corpus-linguistic CoE and CoE-associated events in 2016 and 2017.