Abstracts


Archives are a vital resource for language documentation, linguistics research, and revitalization. Archival materials preserve cultural and linguistic knowledge that is all too often lost over time. But often, the materials as deposited need further processing before they can be used for a purpose beyond the original intent of the depositor. This paper outlines ways in which linguists can enrich these deposits with Automatic Speech Recognition tools such as forced alignment, so that the resources can be used in further research and revitalization efforts [1–3]. A case study using material from five Australian language deposits in PARADISEC is presented, using the Montreal Forced Aligner [4] to align audio to words and segments. Practical considerations in this process are discussed to ensure that the resulting aligned audio is as free of errors as possible, including transcription standards, data cleaning, and post-hoc manual correction. Aligning language audio with transcripts allows for phonetic and phonological linguistics research on the material and creates material suited for online dictionaries and learning materials within the language community. Depositing force-aligned materials as addenda to the original deposit after processing allows community members and future researchers to benefit from this process far beyond the scope of the original research.
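The data-cleaning step mentioned above can be sketched as follows. This is a minimal illustration in Python, not the authors' pipeline: the function names, the convention that non-speech notes appear in square or round brackets, and the sample words (which are invented placeholders, not data from any deposit) are all assumptions made for the example.

```python
import re
import unicodedata


def clean_transcript(text):
    """Normalise one transcript line before alignment (illustrative rules only)."""
    # Use a single Unicode form so visually identical words compare equal.
    text = unicodedata.normalize("NFC", text).lower()
    # Assumed convention: notes such as [laughs] or (cough) are not speech.
    text = re.sub(r"\[[^\]]*\]|\([^)]*\)", " ", text)
    # Drop punctuation but keep apostrophes and hyphens inside words.
    text = re.sub(r"[^\w\s'-]", " ", text)
    # Collapse runs of whitespace.
    return " ".join(text.split())


def unknown_words(text, lexicon):
    """List cleaned words missing from the lexicon, for manual review."""
    return sorted({w for w in clean_transcript(text).split() if w not in lexicon})


if __name__ == "__main__":
    line = "Bala  mitu, [laughs] kana!"       # invented placeholder words
    print(clean_transcript(line))
    print(unknown_words(line, {"bala", "kana"}))
```

Flagging out-of-lexicon words rather than silently dropping them keeps the post-hoc manual correction step in human hands.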

References

1. Babinski, S. et al. A Robin Hood approach to forced alignment: English-trained algorithms and their use on Australian languages. Proc Ling Soc Amer 4, 3 (2019).

2. DiCanio, C. et al. Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment. The Journal of the Acoustical Society of America 134, 2235–2246 (2013).

3. Johnson, L. M. Forced Alignment for Understudied Language Varieties: Testing ProsodylabAligner with Tongan Data. Language Documentation 12, 44 (2018).

4. McAuliffe, M., Socolof, M., Mihuc, S., Wagner, M. & Sonderegger, M. Montreal Forced Aligner: Trainable Text-Speech Alignment Using Kaldi. in Interspeech 2017 498–502 (ISCA, 2017). doi:10.21437/Interspeech.2017-1386.

Can the adaptive potential of song, forged in face-to-face interactions, translate to online environments in which songs have been abstracted away from their originating social niches? This chapter considers how the introduction of technologies for virtualising and commoditising song (such as recording, mass media broadcasting, archiving and social networking) has transformed and is transforming songs and singing practices in a fast-changing contemporary world. Despite the colonial history of many archival song collections, many community members believe that archival collections of song, responsibly managed in partnerships between archives and communities, have the potential to mitigate cultural loss and give new generations access to their birthright. Such collaborations are transforming the archives themselves, hijacking emerging technologies and platforms to create and engage both virtual and actual communities of interest.

Understanding the diversity of sound systems in the world’s spoken languages requires detailed analyses of their phonetic patterns, but systematic phonetic documentation is scarce for the majority of languages (Whalen et al., 2020). Recent trends towards ‘corpus phonetics’ have furthered phonetic research using large datasets of natural speech and tools to speed up data processing and analysis (Liberman, 2019), but have largely focused on English varieties, partly due to limited availability of suitable technologies and corpora for many languages (Besacier et al., 2014). Many archives house rich language documentation collections, offering opportunities to extend knowledge of phonetic patterns crosslinguistically. Phonetic analysis requires time-aligned annotations of audio, and while archival materials often include utterance-level orthographic transcriptions, examinations of vowels, consonants, and prosody require more granular segment-level and word-level annotations. Utterance-level orthographic transcriptions can, however, be used to train a machine learning algorithm, returning a model which can be used to align speech rapidly. A key question is: how many hours of audio are needed to train an effective model? We present a case study using archival recordings and transcriptions of Nafsan, an Oceanic language of Vanuatu (Thieberger, 2006), which were used to create a tokenised dictionary (Moran & Cysouw, 2018) and input to the Montreal Forced Aligner (McAuliffe et al., 2017), which uses the Kaldi speech recognition engine (Povey et al., 2011). Findings show that materials in a typical language documentation corpus can be used to train very effective models if adequately prepared, and can lead to new phonetic insights into under-resourced languages.
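As a rough illustration of how a tokenised dictionary might be derived from orthographic transcriptions, the sketch below splits each word into segments by greedy longest-match against an orthography profile and emits one "word, tab, space-separated phones" line per unique word. The profile, the function names and the sample words are hypothetical; they do not reproduce the Nafsan orthography or the authors' actual workflow, and the exact dictionary format expected by a given aligner should be checked against its documentation.

```python
import unicodedata

# Hypothetical orthography profile: grapheme -> phone label.
# Illustrative only; not the Nafsan orthography.
PROFILE = {
    "a": "a", "e": "e", "i": "i", "o": "o", "u": "u",
    "k": "k", "m": "m", "n": "n", "p": "p", "r": "r", "s": "s", "t": "t",
    "ng": "ŋ",  # a digraph must win over its single-letter parts
}


def tokenise(word, profile):
    """Split a word into phones by greedy longest-match over the profile."""
    word = unicodedata.normalize("NFC", word.lower())
    longest = max(len(g) for g in profile)
    phones, i = [], 0
    while i < len(word):
        for size in range(min(longest, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in profile:
                phones.append(profile[chunk])
                i += size
                break
        else:
            # Surface gaps in the profile instead of guessing a phone.
            raise ValueError(f"no profile entry for {word[i]!r} in {word!r}")
    return phones


def build_dictionary(utterances, profile):
    """Return sorted 'word<TAB>phone phone ...' lines, one per unique word."""
    words = set()
    for utt in utterances:
        words.update(unicodedata.normalize("NFC", utt.lower()).split())
    return [f"{w}\t{' '.join(tokenise(w, profile))}" for w in sorted(words)]


if __name__ == "__main__":
    # Invented placeholder utterances, not corpus data.
    for line in build_dictionary(["mango tako", "tako nen"], PROFILE):
        print(line)
```

Raising an error on an unmatched character is a deliberate choice in this sketch: it exposes transcription inconsistencies during preparation, before they can degrade the trained model.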

Besacier, L., Barnard, E., Karpov, A., & Schultz, T. (2014). Automatic speech recognition for under-resourced languages: A survey. Speech Communication, 56, 85–100.

Liberman, M. Y. (2019). Corpus phonetics. Annual Review of Linguistics, 5, 91–107.

McAuliffe, M., Socolof, M., Mihuc, S., Wagner, M., & Sonderegger, M. (2017). Montreal Forced Aligner: Trainable text-speech alignment using Kaldi. In F. Lacerda, D. House, M. Heldner, J. Gustafson, S. Strömbergsson, & M. Włodarczak (Eds.), Proceedings of Interspeech 2017 (pp. 498–502). ISCA.

Moran, S., & Cysouw, M. (2018). The Unicode cookbook for linguists. Language Science Press.

Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlíček, P., Qian, Y., Schwarz, P., Silovský, J., Stemmer, G., & Veselý, K. (2011). The Kaldi speech recognition toolkit. In Proceedings of IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society.

Thieberger, N. (2006). A grammar of South Efate: An Oceanic language of Vanuatu. University of Hawaii Press.

Whalen, D. H., DiCanio, C., & Dockum, R. (2020). Phonetic documentation in three collections: Topics and evolution. Journal of the International Phonetic Association, FirstView, 1–27.

The Living Archive of Aboriginal Languages was established in 2012 as a digital archive of endangered literature of Indigenous languages of Australia’s Northern Territory. The purpose of the project was to ‘rescue’ the large body of printed materials produced in remote schools with bilingual education programs. The gradual decline of bilingual education left these materials vulnerable, leading to the need to rescue, digitise and make them available online. In the intervening years, this purpose has been achieved, and the active website currently contains nearly 4000 items in 50 languages, available online at http://laal.cdu.edu.au/.

The archive is now in a period of transition, as the end of funding and changes in the back-end software and management have required decisions about its future resting place. These negotiations reveal a range of possible futures for the collection – such as becoming part of the Northern Territory Library’s ‘Territory Stories’ collection, or as a historical repository in AIATSIS, or other as yet unforeseen possibilities.

Each of these imagined futures implies something about what the collection is, what it might achieve in the future, and given its future, how its past might have been different. As the Living Archive finds a new place in the world after its original development, the options available allow for a careful analysis of what has (and hasn’t) been achieved, and consideration of how the collection might continue to exist as a ‘living archive’ in a new context.

osf.io/3tk8d

Digital documentation is now pervasive, and digital methods have allowed the collection of larger and more varied corpora than could have been previously anticipated. Many older collections have also been converted to digital formats. It is now possible to use corpora in ways that were unthought of when many documentation collections were originally created. However, no one, to my knowledge, has systematically investigated how depositors’ choices for archiving might affect the uses to which these corpora can be put in the future. It is important to find this out now, while something can be done about it. In this talk I describe the framework for a digital archival audit, which focuses on the uses to which corpora can be put and the things that need to happen before a corpus can be used from an archive. The audit takes the form of a questionnaire (currently being developed) which is based around common tasks and outputs from language data. The questionnaire focuses on audio and textual material and covers language identification, forced alignment at the utterance and segment level, the creation of dictionaries and wordlists, and searchability. Unlike some other corpus audits, this questionnaire is not based around the amount of material in the deposit, but rather its utility and suitability for different types of tasks.

Recent language and song revitalisation projects in Australia have emphasised local Indigenous frameworks, intergenerational approaches and creative responses as key to reengaging archival materials and the knowledges and practices they hold (Bracknell 2019; Ford 2020; Treloyn et al. 2019). Challenges remain when it comes to sustaining regional public song and dance traditions. These include mobilising support from organisations, stakeholders and personnel for sustained access to and delivery of recordings, and most importantly, connecting apprentice singers with song specialists (for example, Turpin 2019).

This presentation reflects on ongoing collaborative work by Bininj (Aboriginal people) and Balanda (non-Indigenous/European people) at Gunbalanya, Northern Territory, to maintain diverse cultural and linguistic practices surrounding kun-borrk, a genre of public dance-accompanied song from western Arnhem Land. 2021 marks ten years since the ancestral remains of Bininj stolen in 1948 were sung and danced back to country as part of a re-burial ceremony at Gunbalanya. Around the same period, digital sound and film recordings made as part of the same Expedition began a process of return in which they were ritually and emotionally metabolised by Bininj and relocated in country and among kin. We demonstrate how the story of the theft has been reclaimed through new bim (painting) and kun-borrk. We will also discuss efforts to revitalise dormant kun-borrk song repertories through the Man-karre (ceremony/custom) Project, led by artists at Gunbalanya and kun-borrk song specialists in the region and facilitated by ethnomusicologist Reuben Brown and Injalak Arts and Crafts.

Illustrations hold a special place in language documentation, being used for language elicitation, representation and, at the resource production end of the cycle, learning materials (Goehring, 2017; Wood & Tinajero, 2002). They have advantages over photographs, such as fine-grained focus, ease of reproduction and greater anonymity of human subjects. Line-drawings have long been used in Australian spoken and sign language dictionaries (Douglas, 1990; Green, 1992) and publications on Indigenous ecological knowledge and cultural practices (Goddard, 1995; Latz, 1995). IAD’s Picture Dictionary Series is a pertinent example. The Series was originally developed in workshops with Anmatyerr people (Green, 2003). ‘Back-translations’ of illustrations were conducted to check their interpretations and suitability. Subsequent dictionary publications have generated further illustration collections (Yuwaalaraay & Gamilaraay Language Program, 2006).

The CoEDL-funded ‘Pictures and Pedagogy’ project has incorporated these collections into the ‘Australian Indigenous Languages Image Bank’ (AILIB). Creating the image corpus has involved locating and cataloguing illustrations, some of which had been, in paper form, lying under beds for over 40 years. The digital versions were dispersed, with inconsistent naming conventions and sometimes unclear provenance. The process has involved regularizing file names, contacting illustrators, and developing access and use conditions which respect and acknowledge their work. The Picture Dictionary illustrations from Central Australia provided the impetus for the project, but the collection now represents ecologies and activities from across Indigenous Australia. The archiving of AILIB with PARADISEC will make the collection of more than 1300 line-drawings widely available to language educators and resource creators.

References

Douglas, W. H. (1990). Illustrated topical dictionary of the Western Desert language: Based on the Ngaanyatjarra dialect. Kalgoorlie College Press.

Goddard, C. (1995). With Kalotas, A. & Jones, J. Punu: Yankunytjatjara plant use. Alice Springs: Institute for Aboriginal Development.

Goehring, J. (2017). Voices in pictures: Learning foreign languages with images. EPALE - Electronic Platform for Adult Learning in Europe. European Commission.

Green, J. (2003). Central Anmatyerr Picture Dictionary. Alice Springs: IAD Press.

Green, J. (1992). Alyawarr to English Dictionary. Alice Springs: IAD Press.

Latz, P. (1995). Bushfires and bush tucker: Aboriginal plant use in Central Australia. IAD Press.

Wood, K. D., & Tinajero, J. (2002). Using Pictures to Teach Content to Second Language Learners. Middle School Journal, 33(5), 47–51. doi:10.1080/00940771.2002.11495331

Yuwaalaraay & Gamilaraay Language Program (2006). Gaay Garay Dhadhin Gamilaraay & Yuwaalaraay Picture Dictionary. Alice Springs: IAD Press.

The Warlpiri Media Archive is located at Pintupi, Anmatyerr, Warlpiri (PAW) Media and Communications in Yuendumu, the heart of Warlpiri country in Central Australia. In this presentation, we draw together the perspectives of Warlpiri elders on the significance of this on-country archive for future transmission of ceremonial songs, which are deeply valued aspects of Warlpiri cultural heritage. We reflect on the social significance of having this documentation of Warlpiri culture and history available on-country, mediated by Warlpiri people and part of their living culture, and we discuss initiatives for community engagement with the archive.

The community-run Kwaio Archive was formally opened in Malaita’s central mountains in 2016, after several years of preparation. It is mostly digital and solar- and wind-powered. Its quarter of a million files include written, audio, and video field materials collected by anthropologists working with the community and increasingly by Kwaio people on their own, along with thousands of archival documents and publications about the Solomons and places beyond. The Archive is part of the Kwainaa`isi Cultural Centre, which also encompasses a community-run school and a Kwaio-wide environmental conservation project. A connected small museum is planned. While the Archive can serve as a case study of successful anthropological data repatriation and impressive community organization, it has had to deal with numerous logistical, cataloging, financial, and political challenges, and we will highlight those here. We hope to connect with other grassroots archival projects that are dealing with similar challenges (and successes). We will link to the presentation “Digitizing Field Recordings: Boosting the Feedback Loop,” by Cristela Garcia-Spitz of the Tuzin Archive for Melanesian Anthropology at UC San Diego, which has long collaborated with the Kwaio Archive.


All over the world, communities are struggling to maintain their languages and to document and preserve them. In many places, communities and the scholars working with them have been collecting rich materials which have not found their way into the established digital archives. These invaluable collections, in danger of loss and obsolescence, are often unknown to the community itself, the public and the academic audience. At the same time, major language archives like the DELAMAN archives may not be known or trusted locally, and may not be accessible or staffed in a way that lets them support the preservation efforts of communities and local scholars.

There is a clear need to increase the number of local archives that can support communities and scholars in their efforts. However, the implementation of large-scale digital infrastructures requires long-term financial and institutional commitment and technical expertise that are often not present. In the meantime, a bottom-up approach, whereby local scholars and activists set up a basic content management system and create, curate and preserve the collections, is one way to secure invaluable existing collections. In addition, such small-scale projects can function as a seed that grows into larger, institutionally supported efforts once the value of the collections can be seen online.

We will report on our bottom-up participatory approach to archive creation in our “Archive of Languages and Cultures of Ethnic Groups of Thailand” project. The major goal was the implementation of a pilot small-scale digital infrastructure for the preservation and dissemination of indigenous linguistic materials and cultural heritage in Thailand. Using an instance of Mukurtu, which we fully localised, we ran capacity-building and outreach workshops and training with local researchers and community members. The project empowered the local team and allowed the self-determination and self-representation of the local communities in the archive. Once the host university could see the potential of the digital infrastructure, institutional interest was also raised, which can now lead to greater support and to the implementation of the automated preservation layer that is currently lacking.

With concrete examples from the Thai project, in this presentation we will explore the institutional, social, technical, and financial advantages of a bottom-up approach for archive creation.

Payi Linda Ford: Ala still talks to me: Putj putj marridian

Ala Ngulilkang Nancy Daiyi ‘talks’ to me about the things that I should be doing and acting and responding as her daughter. Ala laid down the law and provided the way forward during her instructions to me. This took place from infancy, early childhood, through to motherhood and as a researcher.

Ala’s wisdom allows me to consider and appreciate the world in the way that we/I react and respond. It is through her wisdom that I have learnt to engage with people like Emeritus Professors Linda Barwick and Allan Marett. These two have worked with my Wangga & Wali ceremonial Families for four decades. Their capacity to undertake recordings has been nothing short of amazing, and their ability to engage Tyipme - Aboriginal researchers has given me and my family members the confidence to undertake our own research.

In this presentation we discuss a recent project in which Warlpiri women will preserve and maintain ceremonial singing genres in a digital website space. As much more than a repository of documentation, this digital space is being set up so that access to archival sound, video and photographic documentation is possible within contemporary performance spaces. We illustrate how this technology will facilitate engagement with archival deposits left by knowledgeable elders from past generations, and how it can be used to continue to teach contemporary generations the songs, dances and designs necessary for the maintenance of different genres of Warlpiri song.

https://lib.ucsd.edu/oceania

https://lib.ucsd.edu/tuzin-archive

https://www.clir.org/recordings-at-risk/funded-projects/

https://digitalpasifik.org/

Ethnographic field recordings, like feedback loops, capture voices that are then stored for later analysis. What happens as these recordings are transferred into digital form? Can digitizing 20th-century field recordings transfer the output into input for future interactions? With a Council on Library and Information Resources Recordings at Risk grant the UC San Diego Library has digitized approximately 800 sound recordings from seven collections in the Tuzin Archive for Melanesian Anthropology. These field recordings include rare interviews, songs, performances, oral histories and linguistic materials collected in Papua New Guinea and the Solomon Islands from the mid-to-late 20th century. Curator Cristela Garcia-Spitz will discuss issues in dealing with culturally sensitive and confidential information, variable description and metadata across these collections, and the development work that went into providing access to them through the Library’s Digital Collections website and new virtual reading room service. Digitizing these recordings has also facilitated content-sharing and new possibilities for partnership. Last October, David Akin and Tommy Esau added some 300 recordings by anthropologist Roger Keesing to the growing collections in the Kwaio Archive, the first community-run archive in the Solomons. Their work to return the voices of the ancestors to the community has also enhanced what is known about the recordings. David and Tommy will join Cristela in discussing the ongoing collaboration between the Kwaio Archive and Tuzin Archive that is enhancing knowledge exchange and the feedback loop, and how to build and strengthen such collaborations in the digital era.

Monika Höhlig's Syuba (Kagate) materials include reel-to-reel tape, cassette tapes, Super8 film, photographs and documents. These materials were mostly collected in the 1970s and 1980s, and form the earliest known recorded materials in Syuba. In 2017 I worked with Monika Höhlig on the digitisation and metadata records of these materials, which were archived with PARADISEC (Hoehlig 1972).

In this talk I discuss the process of archiving these legacy materials, how they are being used in research and in community documentation and support work, and how they sit alongside two other collections of Syuba archived with PARADISEC: the community recording project conducted by the Mother Tongue Centre Nepal (2013) and the archive of my own fieldwork recordings (Gawne 2009, Gawne 2018). I also draw on my experience of supporting the archiving of other Tibeto-Burman collections with PARADISEC, including Langtang (LAN1, Slade 2014) and Boro (BRX1, Basumatary 2009), to discuss how archives, institutions and research funders can best support the archiving of language documentation materials.

Basumatary, Prafulla (collector), 2009. Bodo narratives and descriptions of traditional practices. Collection BRX1 at catalog.paradisec.org.au [Open Access]. https://dx.doi.org/10.4225/72/56E82418613AD

Gawne, Lauren (collector), 2009. Kagate (Nepal). Collection SUY1 at catalog.paradisec.org.au [Open Access]. https://dx.doi.org/10.4225/72/56E976A071650

Gawne, Lauren. 2018. An introduction to the Syuba (Kagate) online collection. Language Documentation & Conservation 12: 204-234.

Hoehlig, Monika (collector), 1972. Monika Hoehlig's Syuba (Kagate) materials. Collection MH1 at catalog.paradisec.org.au [Open Access]. https://dx.doi.org/10.4225/72/5a2aa8fa3fde0

Mother Tongue Centre Nepal (collector), 2013. Syuba audio recordings from the Mother Tongue Centre Nepal (MTCN). Collection MTC1 at catalog.paradisec.org.au [Open Access]. https://dx.doi.org/10.4225/72/5a2aa8fe9880e

Slade, Rebekah (collector), 2014. Langtang (Nepal). Collection LAN1 at catalog.paradisec.org.au [Open Access]. https://dx.doi.org/10.4225/72/574B120949E4C

This presentation will report on our efforts to reconnect some of the earliest sound recordings made in the Pacific region to members of the PNG diaspora resident in Australia. This work emerges from the international project True Echoes: reconnecting cultures with recordings from the beginning of sound, and deals with recordings made by C.G. Seligman on the 1904 Cooke-Daniels expedition to New Guinea. In this presentation we will consider the issues that are raised by repatriating sound recordings to Hula-speaking diaspora members. When physical travel is restricted by the current pandemic, can reconnection with the diaspora facilitate direct (if informal) pathways for repatriation to Hula community members both resident in their homelands and beyond? We will also consider how the sound recordings could be thought about as already “displaced archives” (Lowry 2017), whose location at the British Library is the result of contested colonial processes. What role is played by the displacement of the historical recordings on one hand, and the displacement of descendants of those recorded on the other, in determining how digital repatriation can occur?

True Echoes is a research project centred on the British Library’s collection of Oceanic wax cylinders. These cylinders were recorded by British anthropologists during the late nineteenth and early twentieth centuries and include materials from Papua New Guinea, Vanuatu, Solomon Islands, New Caledonia, and the Torres Strait Islands. These audio recordings are hugely significant as they represent both the earliest documentation of Oceanic oral traditions and the inaugural use of sound in anthropological research.

This presentation will provide an overview of the project and its aims, and will highlight how True Echoes is working with Oceanic institutions to increase both the visibility and accessibility of these collections for the source communities.

The presentation will focus particularly on the development of the True Echoes website. Originally planned as a research output at the end of the project, the website is now being used as a key digital platform for sharing ongoing research findings based on collaborative historic research and participatory fieldwork. We will outline the project’s research methodology and how the findings are being presented on the website for the benefit of source communities. We will also highlight the website’s potential as an educational and engagement tool for Oceanic communities, researchers and the wider public.

Digital return has been an integral part of the digital endangered languages and cultures archiving movement since its inception, and access and rights concerns have motivated new and innovative ways of returning cultural heritage materials to owner communities (cf. Barwick 2004). One important tool which has emerged from these efforts is the Mukurtu Content Management System. Digital archival management can be a significant challenge for smaller libraries and organizations without access to dedicated content management solutions. Mukurtu provides these organizations and communities with the means to take control of digital heritage resources by providing culturally relevant narratives and managing access (Christen 2019). Since 2017 the Mukurtu Hubs and Spokes project has been assisting communities in Hawai‘i and the Pacific to create, curate and implement digital heritage management solutions. In this presentation we highlight some of the successes of the project, while also reflecting on the many remaining challenges. We encourage users to view Mukurtu not as a single one-size-fits-all solution, but rather as an integral part of a digital cultural heritage ecosystem, complementing related cataloging and curation efforts in community libraries and museums.

References

Barwick, Linda. 2004. Turning It All Upside Down . . . Imagining a distributed digital audiovisual archive. Literary and Linguistic Computing 19. 253–263.

Christen, Kimberly. 2019. ‘The songline is alive in Mukurtu’: Return, reuse, and respect. In Archival Returns: Central Australia and Beyond, 153–172. Honolulu and Sydney: University of Hawai’i Press and University of Sydney Press.

To date, researchers have frequently played an integral role in the process of establishing and managing archival collections. Moreover, when collections are established in an organization outside the custodian community, it is often the case that the researcher also has a central role in connecting that community with the archive and in enabling communication between the two entities. Thus, the actions of the researcher can have considerable impact upon both the archiving process and subsequent collection management, and are worthy of careful consideration. In this paper, I explore how and why the researcher might act primarily as a facilitator within this process, and focus particularly on the ways that such action might work to create space for important dialogue on archival collections. Drawing on my own experiences over more than ten years in working with one non-Australian community to create and manage an archived collection of musical recordings with PARADISEC, I illustrate some of the effects of adopting the role of researcher/facilitator within the archive management process, and suggest that this role is especially worthy of consideration in an era of globalized digital music distribution.

Gisa Jaehnichen and Ling Jiasui: The Experience of Others and How to Understand Source Communities

Understanding source communities is the main challenge in repatriating any kind of material that was collected before one’s own lifetime, and this will remain the case in the future. Most younger archivists will not have taken part in economic excursions or colonial adventures. They cannot easily gain access to cultural differences or to different time periods. Starting from a short, provocative discussion of current ideas on repatriation, Ling Jiasui compares her experiences during an internship with ILAM in South Africa with the expected applications in the context of a large Chinese urban area at the fringe of the Pacific Region. ILAM's practice of repatriation has gone through different stages, and attempts have been made to find the most genuine repatriation methods, which really take the old recordings back to the community in order to achieve a home-coming. Can these experiences of others help us understand today’s requirements? What differences between source communities exist and have to be considered? This compact paper will include examples and autoethnographic descriptions that could be of further use in the search to expand archival skills and knowledge among younger scholars.

The Pacific Manuscripts Bureau has been making preservation copies of archives, manuscripts and rare printed materials from and about the Pacific Islands since 1968. This short paper will look at how climate scientists are starting to explore the PMB collection of microfilms to assist with climate modelling. Instrumental weather recordings appear in a variety of primary documents such as shipping logs and newspapers. This valuable data is being extracted from PMB, and other collections around the world, in an effort to piece together a daily history of the world’s weather going back at least 200 years prior to official meteorological records. A number of small PMB microfilm collections have been identified and digitised. However, with only one full-time member of staff at PMB, and the vast majority of PMB holdings still only available on microfilm and difficult for some people to access, our capacity to fully explore the extent of useful data is constrained.

PMB continues to operate as an active preservation copying project, so in addition to mining the existing collection, it has the potential to digitise any newly identified collections of historical weather data. The outbreak of COVID-19 is limiting our capacity in these efforts.

While we work to translate old songs into language our children can sing, we find that many of the song words can’t really be translated. The old songs we still sing and that we hear in the archive are just the starting point to remind us of the stories they refer to. It is the story that we pass down when we give our children the songs, not just the words. The meaning is there all around the words and how they go together, where they come from and who sings them.

Through examples of our personal discoveries in the recorded archive we will tell just a few of the many stories that are coming out from between the lines of the songs that are so much more than the text on a page. Far from being a static record, the collection of archived Tiwi song recordings (made between 1912 and 1981) has revealed itself to be an animation of sung, danced and spoken social and cultural history with direct connections continuing to be made between current and past singers. We will also tell you about our latest efforts to establish a Knowledge and Culture Centre on Bathurst Island, the pivotal role of old (and new) recordings to those efforts and the tensions between preserving something while still creating it, because there’s so much more to learn than just the words.

In 2019, PARADISEC launched the podcast series Toksave: Culture Talks. Produced by Jodie Kell and Steven Gagau, the podcast is a series of discussions with people who have personal or cultural connections to collections in the PARADISEC digital archive. Drawing on Jodie’s skills as an audio engineer and musicologist and Steven’s unique position as an Indigenous community member based within the archive, the podcast covers the Oceania region, engaging with community members from Papua New Guinea, Vanuatu, Fiji, Solomon Islands and Indigenous Australia.

In this paper, we will argue that the process of producing the podcast empowers Indigenous community members to engage with archival collections. This enlivens the archive through emotional rediscovery of the past. It further generates enhanced cultural knowledge by increasing the metadata of collections and contributes to the continuation of cultural practices by improving accessibility and findability of materials.

We will demonstrate the collaborative approach we have developed in producing each episode. This aims to respectfully engage with cultural experts by acknowledging their contribution as well as enabling access to archival materials. Utilising community networks, particularly in the Melanesian diaspora communities of the Sydney region, we have been able to conduct interviews with Indigenous peoples as cultural knowledge bearers who share their lived experience of these archival materials and their connections and reconnections with the past.

Larry Kimura, Keiki Kawaiʻaeʻa, Andrea Berez-Kroeker and Dannii Yarbrough: Kaniʻāina, Voices of the Land: The digital repository for spoken ʻŌlelo Hawaiʻi

Kaniʻāina is here: http://ulukau.org/kaniaina/

We present “Kaniʻāina, Voices of the Land,” the first online repository of ʻŌlelo Hawaiʻi by L1 speakers. This short video will be in ʻŌlelo Hawaiʻi with English subtitles.

Kaniʻāina (http://ulukau.org/kaniaina/) is a digital repository with a bilingual ʻŌlelo Hawaiʻi and English interface that currently provides interactive access to some 525 hours of audio recordings, including the celebrated Ka Leo Hawaiʻi radio broadcasts from 1972–1988. These recordings are a treasure chest of Hawaiian language and cultural knowledge shared by Hawaiʻi's last L1 speakers of ʻŌlelo Hawaiʻi, born between 1882 and 1920.

In addition to providing an interface for listening to spoken ʻŌlelo Hawaiʻi recordings, Kaniʻāina, in partnership with the Kaipuleohone Digital Language Archive, will also properly preserve those recordings and transcripts and implement a procedure for crowdsourced transcription of additional recordings from e.g. the public and from University of Hawai‘i students of ʻŌlelo Hawaiʻi.

Kaniʻāina grows out of decades of successful immersion-based language education and statewide interest in promoting ʻŌlelo Hawaiʻi use at every level. This project represents a continuing refinement of the methods of language documentation and unparalleled technologies for preserving, disseminating and mobilizing four decades of documentation of spoken ʻŌlelo Hawaiʻi.

Keita Kurabe, Ja Seng Roi Sumdu and Seng Pan Maran: PARADISEC and Kachin culture: Toward community-based practices of archival return in northern Myanmar

Over the past ten years, the presenters and other local collaborators have conducted an intensive community-based documentation project on oral literature of the Kachin people, who are indigenous to northern Myanmar and adjacent areas of China and India. Our research outcomes are a large body of 2,754 stories with 1,751 transcriptions archived with PARADISEC (Collection IDs: KK1 and KK2). In this talk, we will showcase our collaborative efforts toward getting these archival materials back to the Kachin community in and outside Myanmar. We will talk about how the invaluable cultural materials are used by the local community for both general and pedagogical purposes. We will also touch on our community-driven online social media platform to help facilitate and disseminate our archival materials to a wider audience who have little idea about where to find them. Our ongoing project to animate popular stories and to produce picture storybooks for children based on our archived materials in order to promote cultural transmission will also be introduced. We will show that the direct involvement from the community has always been the core component of our project at all stages.

This paper introduces a project which aims to recover, transcribe, translate, repatriate and publish documents and memories that relate to a very early Independence movement from Vanuatu, then the New Hebrides. Three senior chiefs from Central Vanuatu, Tarip̃oa Liu of Nguna, Kalsakau of Ifira, and Ti Nabua Mata of Tongoa, began meeting in 1912 to develop and promote their case for liberation from colonial rule. Perhaps the most famous Islander-authored document of this era in Vanuatu is Tarip̃oa Liu’s Book of Desires (1935?), translated by William Milne and published by Graham Miller. Our project focuses on a less well-known document written by Ti Nabua Mata’s elder brother, Dick Fakao Ti Nabua Koto, which reviews the Islander experience of labour migration to Queensland, colonial rule in the New Hebrides, and conversion to Christianity. A copy of this document was made by the Pacific Manuscripts Bureau from an original manuscript at the Vanuatu Kaljoral Senta. Drawing on the papers of Graham Miller and other Presbyterian missionaries of the period, we address the context for the document’s production, its key themes, and some of the surviving memories of the roles in this early Independence movement of Ti Nabua Mata and Ti Nabua Koto. Given the importance of this document for source communities, our paper is also an exploration of the scope for collaborative research between communities and outsiders under the conditions of Covid-19.

Digital endangered languages archives hold vast amounts of data of spoken, signed and whistled endangered languages. Paradoxically, these archives, filled with hundreds of languages, are usually packaged in just one or, in rare cases, a handful of languages. That is, their interfaces and the metadata that make it possible to access the data are in English or another major language, constituting a linguistic barrier.

The communities whose languages and cultural practices are represented in such archives are therefore often unable to access the materials held therein. In an ideal world, an archive’s interface would be available in every language represented in its collections. Even a watered-down version of this, namely an interface available in all major contact languages of the languages represented in an archive, is currently too expensive and not a feasible solution for most archives.

This talk will showcase archive user guide templates developed for the new Endangered Languages Archive (ELAR) interface, which can easily be translated, and made available on the archive’s homepage. These guides enable users to navigate an archive even if they are unfamiliar with the language or even writing system of the interface.

In addition, this talk will propose best practices for making collections within an archive more linguistically accessible, by providing multilingual metadata, oral collection descriptions, and collection guides similar to the archive guides mentioned above.

When speakers of small languages take up the possibility of creating lasting resources in their language, what sorts of materials do they want to record? What audiences are they imagining? How do these hopes and values differ from, or resonate with, those of anthropologists or linguists who have created many existing resources in archives including PARADISEC? In this presentation, we reflect on experiences from Ranongga Island in the Western Province of Solomon Islands. Ranongga is home to a remarkable grassroots initiative called the Kulu Language Institute. A 2019 ELDP Legacy Materials grant and support from PARADISEC have allowed the digitisation of more than 100 hours of cassette recordings from the mid-1980s to the early 2000s. As part of this project, staff from the Kulu Language Institute worked on transcribing old material and typing stories that had been written by students. In September, Kulu staff and community members gathered to think about the future of the Kulu Language Institute, the future of Ranongga people, and what kinds of stories should be recorded and preserved. These conversations and the stories that were recorded focused on core moral values, values that speakers hope will orient young Ranonggans toward a better future. This process also prompted some participants to reflect in a newly critical way on histories of Christianity, considering what might have been lost as local people came to see knowledge and truth as coming from an external source.

This presentation will discuss four areas of language materials from Northeast India that are under-archived:

• Manuscripts written in traditional scripts

• British era documents, particularly manuscripts (many in private hands)

• Community-produced orthographies and literacy materials

• Community-produced heritage recordings.

Each of these areas presents special challenges. For example, Tai manuscripts held both in Buddhist temple libraries and in private hands are written in traditional scripts that have low levels of community literacy. Due to underspecification of vowel contrasts, and lack of tone marking, the reading of these manuscripts requires very different skills from, for example, reading ancient texts in Latin or Greek. A high priority for documentation is not only photographing manuscripts but also recording them being read and explained.

The identification of British era documents presents different challenges. Some publications of this period are widely available in on-line publications, but these publications are not easy for community members to access, and both publications and manuscripts may require some interpretation.

Once materials are archived, it is important to consider how they can be made more widely available to community members. Whereas long-term preservation and curation is clearly best undertaken through existing archives like PARADISEC, members of the community are more frequently accessing materials uploaded on sites like YouTube. Indeed, the 2020s present language documentation with opportunities to engage with this growing body of community-produced work. But, knowing how tentative the future of YouTube materials is, can we find a way to incorporate materials uploaded to YouTube or Facebook into our well-curated archives?

In some legacy sources, typography (including layout on the page) conveys information. For example, font weight may be used to indicate prominence, or indentation may be significant. Recovering this information in automatic processing can be challenging; for example standard OCR does not differentiate kinds of spacing. Two kinds of linguistic material, aligned interlinear (glossed) text (IGT) and dictionary entries, provide examples of the problems.

Interlinear text uses a horizontal dimension to keep information of the same kind together (original text, text analysed into morphemes, morpheme-by-morpheme gloss) and simultaneously shows the relationships between elements of these different kinds by using a vertical dimension. Spaces within cells and empty cells in this type of layout are problematic for processing. Dictionary entries are highly structured; if parts of the entry such as sub-entries are tagged with numbers or some similar device, automatic processing will be possible, but to the extent that hierarchical structure is indicated with devices such as levels of indentation, problems will arise.

Both of these cases involve interpreting spacing in a printed version. ALTO (Analysed Layout and Text Object) is an XML schema which encodes text and its layout and is a possible output of some OCR engines. Because it provides precise information about the position of text on a page, such output can be the basis for automated processing to recover information coded in typography. We illustrate this by showing that both the alignment of interlinear elements and meaningful levels of indentation can be recovered using ALTO.
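The indentation case can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the work described here: the ALTO fragment is hand-written, the function name `indentation_levels` and the 30-unit tolerance are invented for the example, and it relies only on each `TextLine` element carrying an `HPOS` (horizontal position) attribute, as ALTO output does.

```python
import xml.etree.ElementTree as ET

# Hand-written ALTO fragment for illustration; real OCR output is far larger.
ALTO_SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine HPOS="100" VPOS="200"><String HPOS="100" VPOS="200" CONTENT="headword"/></TextLine>
    <TextLine HPOS="160" VPOS="240"><String HPOS="160" VPOS="240" CONTENT="sub-entry"/></TextLine>
    <TextLine HPOS="221" VPOS="280"><String HPOS="221" VPOS="280" CONTENT="example"/></TextLine>
    <TextLine HPOS="102" VPOS="320"><String HPOS="102" VPOS="320" CONTENT="next"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the XML namespace so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def indentation_levels(alto_xml, tolerance=30.0):
    """Return (level, text) per line, with level 0 at the leftmost margin.

    `tolerance` (an assumed value) is how far apart two left offsets must be
    before they count as different indentation levels.
    """
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        words = [child.get("CONTENT", "") for child in element
                 if local_name(child.tag) == "String"]
        lines.append((float(element.get("HPOS")), " ".join(words)))

    # Cluster the left offsets into distinct margins, left to right.
    margins = []
    for hpos in sorted(h for h, _ in lines):
        if not margins or hpos - margins[-1] > tolerance:
            margins.append(hpos)

    def level(hpos):
        # The deepest margin that lies at or to the left of this line.
        return max(i for i, margin in enumerate(margins) if hpos >= margin)

    return [(level(h), text) for h, text in lines]


levels = indentation_levels(ALTO_SAMPLE)
```

Aligning interlinear rows would follow the same idea, comparing the `HPOS` values of `String` elements across adjacent lines instead of the line offsets.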

In 1977, the USA launched the two Voyager spacecraft, each famously carrying discs with 27 examples of music. They are now many billions of kilometres from earth.

The selections were made by one group of people trying to represent a diverse planet within strict technical limitations. They are a product of exploration, education, and the desire to share excellence.

Amongst these recordings is an example from Papua New Guinea, representing minimally an ancestor, a clan, village, province, nation, and planet. But this inclusion was not made in consultation with the recordist, performers, the music owners, or any government officials. My paper will consider the complex road of recording (1964), inclusion on Voyager (1977), local discovery of that inclusion (1990–2013), and desire for and final receipt of an artefact of the project (2013–19)—spanning over half a century—for people from Kandingei village in East Sepik Province. And it will also detail the work of a government cultural office, an anthropologist, and a record company as key to efforts of recognition locally and internationally.

Is the inclusion of the Papua New Guinea example seen as a proud national symbol or a cultural rip-off? Are the stated good intentions of those involved perceived as such in Kandingei? What kind of compensation is possible now that the performers (Pranis Pandang, Kumbui), recordist (Robert MacLennan), promoter (Alan Lomax), and record-committee chair (Carl Sagan) are all deceased and many questions remain unanswered?

Pete O'Connor, John Divilli, Sally Treloyn and Rona Charles: From collections to the dance ground: Bringing Junba dance-songs back to life from old sources

Over the past ten years, participants in The Junba Project from the Mowanjum Community in the west Kimberley have drawn upon digital collections of photographs, video, audio and associated documents to revive Junba dance-songs that had otherwise fallen out of usage. In this presentation dance leader Pete O’Connor (Worrorra) and song leaders John Divilli and Rona Charles (Ngarinyin, Nyikina) with ethnomusicologist Sally Treloyn, outline source collections used to revive Junba over the past ten years, how these are stored and shared locally, the work involved in identifying dances and songs in dispersed photographic, audio, and text collections, reviving tune and voice from hard-to-hear recordings, and choreographing dance from still photographs. The group speak to the social value and wellbeing stimulated by collections when they become accessible to the community.

Takurua Parent: Te firi tā'ai u'i ; Les archives, tresse intergénérationnelle ; Commun-tying archives: weaving intergenerational bonds

In French Polynesia, local communities are facing the growing need to access digital archives containing relevant information and resources that are instrumental to their well-being and long-term sustainability. This in turn raises a key question: is it still appropriate to gather data in the field without making them accessible to the source communities?

In order to make the issues at stake salient, this presentation will rely on a place-based example of how to conduct recordings within the framework of the socio-environmental “Rāhui Forum and Resource Center.” This presentation will thus focus on the processing of recently recorded materials in Tahitian collected in Bora-Bora as part of a wider socio-environmental project. In collaboration with the local Anavevo linguistic database project, those recordings will be put online with different access levels according to the wishes of the person recorded. The development of the Anavevo database is about to be completed.

More specifically, this presentation aims to demonstrate how primary recordings in Polynesian languages are more likely to be accessible to the source communities through the rise of digital archives. It will also elaborate on the many relevant and innovative uses and purposes of these brand-new digital resources for local community members, from daily resource management issues to local biocultural initiatives.

Hugh Paterson III: From Archive to Citation

We celebrate the work and trust-building it took to bring PARADISEC to 100TB of language resources. With the rise of the digital language archive and the plethora of referenceable content, a critical question arises: “How easy is it for authors to use existing tools to cite the content they are referencing?” This is especially important as people use archived materials as evidence within published language descriptions.

Archived resource metadata is well discussed in language documentation circles; however, citation metadata and its accessibility are less discussed. Discoverability metadata serves aggregators like OLAC, fulfilling the function of declaring that something exists. In contrast, citation metadata is about referencing and findability (where an item is located).

In this presentation we look at the interaction between Zotero, an open source citation manager, and the archive. We look at five different archives (PARADISEC, PANGLOSS, SIL Language and Culture Archive, ELAR, and Kaipuleohone) and three methods of importing metadata into Zotero (DOI import, HTML embedded metadata, and file based import). We report on the metadata provided by the archive to the author via Zotero’s interfaces: What’s included, what’s missing, and what’s misaligned.
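To make the second import method concrete, the Python sketch below collects citation-style `<meta>` tags from a landing page's HTML, which is the kind of embedded metadata a citation manager can read, and then lists the wanted fields that are absent. The sample page, the `citation_` and `DC.` name prefixes, the list of wanted fields and all function names are assumptions chosen for illustration; the code does not reproduce Zotero's own import logic or the markup of any of the five archives.

```python
from html.parser import HTMLParser

# Invented landing page; a real archive page would differ.
SAMPLE_PAGE = """<html><head>
<title>Example item</title>
<meta name="citation_title" content="Example recording">
<meta name="citation_author" content="Example, Ann">
<meta name="citation_author" content="Sample, Ben">
<meta name="DC.identifier" content="https://example.org/item/1">
<meta name="viewport" content="width=device-width">
</head><body></body></html>"""


class MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs that look like citation data."""

    PREFIXES = ("citation_", "dc.")  # assumed naming conventions

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content and name.lower().startswith(self.PREFIXES):
            # A field such as the author may legitimately repeat.
            self.fields.setdefault(name, []).append(content)


def embedded_citation_metadata(html):
    """Return a dict mapping each citation field name to its list of values."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    return collector.fields


def missing_fields(fields, wanted):
    """List the wanted field names that the page does not provide."""
    return [name for name in wanted if name not in fields]


found = embedded_citation_metadata(SAMPLE_PAGE)
missing = missing_fields(found, ["citation_title", "citation_author", "citation_date"])
```

Running the same check over several landing pages gives a simple table of what is included and what is missing for each archive.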

Understanding the processes by which authors collect metadata for the purpose of citation, what metadata they need, and if it is being provided, facilitates the design of useful interfaces to archives.

When striving for an in-depth description of the linguistic expression of possession in a particular language, a variety of methods can be employed, including translation tasks, grammaticality judgements, typological questionnaires, and collections of texts, songs, and conversations. Primarily in the area of language acquisition, a further method of language elicitation has become more and more important: language tasks and games (Eisenbeiss 2010). The author of this talk has adapted the latter for adult speakers of the endangered language Miriwoong (non-Pama-Nyungan, spoken in the Kununurra area of WA/Australia).

Language games and tasks were chosen as the main source of data for the author’s dissertation because they constitute a suitable elicitation method for the endangered language situation and are also a fitting supplement to the variety of revitalisation measures the community is undertaking. Archiving the resulting audio and video data serves not only to safeguard it but also to provide access to the community in a suitable form.

This paper will discuss how informed consent for the sharing of Miriwoong linguistic and cultural data and its archiving was obtained. Agreement forms were signed by the author of this talk, the Mirima Council Aboriginal Corporation and all participants. The forms comply with ethical guidelines such as those published by AIATSIS. In order to ensure that all participants can give informed consent, the goals, benefits and consequences of the project are explained in the forms and it is openly stated who is involved, and how the project is funded.

Acknowledgements

The field trips to Miriwoong country in 2014 and 2015 during which the data was obtained, were supported by grants from the DAAD (“DAAD-Doktorandenstipendium”), FEL and FAZIT. I am indebted to the Miriwoong people for devoting their time to this project and for contributing their knowledge. I am immensely grateful to Frances Kofod for allowing me to draw on her early research, recordings and transcriptions of Miriwoong language with many now deceased Miriwoong Elders.

References

AIATSIS. 2012. Guidelines for Ethical Research in Australian Indigenous Studies 2012. Available online at https://aiatsis.gov.au/research/ethical-research, accessed 2020-10-01.

Eisenbeiss, Sonja. 2010. “Production methods in language acquisition”. Experimental methods in language acquisition research. (Language Learning & Language Teaching.) ed. by Elma Blom & Sharon Unsworth, vol. 27, 11–34. Amsterdam, Netherlands: Benjamins.

Gudjal is a language of the Gudjalburra Nation, whose lands are in the country near Charters Towers in QLD. The language’s vitality has suffered greatly since European invasion, and currently there are no fluent speakers. However, in the 1970s, a handful of Gudjal speakers worked with linguists to create recordings of their language, and since then, these recordings have been used to produce a number of resources, predominantly wordlists (Santo 2006a, b, c). Our project seeks to build on these resources by further utilising the archived recordings and fieldnotes to create a learner’s guide and Welcome to Country in Gudjal. In our paper, we will discuss the challenges and benefits of working with these archival materials, with special focus paid to the complementation of knowledge from current and previous generations. We will also discuss how we intend to add value to these recordings by making their content accessible in various formats for community members.

Santo, William. 2006a. Gudjal language pocket dictionary. QLD: Black Inc Press.

Santo, William. 2006b. Gudjal book of birds. QLD: Black Inc Press.

Santo, William. 2006c. Gudjal book of animals. QLD: Black Inc Press.

https://doi.org/10.5281/zenodo.4506935

Researchers looking for data across language archives often rely on harvesters such as OLAC or VLO. In this search, they are dependent on mappings of metadata categories across archives.

While conducting a survey of metadata standards and landing pages of a number of DELAMAN archives, we came across problems with the notion that entering a search term in a unifying search box will reliably return all relevant available data, nothing less, nothing more.

However, different types of vagueness in the metadata distort this picture. Take “language”, probably the most relevant metadata category for linguists, as an example: it can refer to at least three things. In some cases it describes the content language, in others the languages of the actor(s) in the recording, and in yet others the language used to translate the resource.

In our talk, we identify different types of vagueness: the metadata category itself can be vague, the values entered can be poorly defined, values can overlap, and the relation of the metadata category to the resource can be unclear (e.g. does the date describe the publication date, the recording date, the date of metadata creation, or the duration of a documentation project?).

We show how vagueness impacts findability of resources and reliability of search results. We attempt to sketch possible solutions for some types of vagueness and discuss whether we just have to live with this vagueness in our metadata descriptions.

No archive reaches a 100TB milestone easily, and very few research archives unaffiliated with national institutions have done so. This presentation highlights the vision and courage required to establish and grow an archive in the 21st Century. It discusses emerging archival practices on the Internet and the roles research archives can play in relation to them, community partners, and the general public.

Over the past 20 years, many Indigenous Australian language centres have created their own digital dictionaries using one-off grants. The result has been a diverse set of technologies, most of which do not offer opportunity for re-use or further development. OpenDictionary is an open source framework with a basic self-serve workflow for language centres to add text files and media files. The OpenDictionary project was developed with the intent of providing a tool like a Swiss army knife that could be used to import and export text-based language files, convert language text between different fonts or spelling systems, and provide a visual web interface for accessing and/or editing dictionary and grammar material. We use the same technologies (.NET) as Lexique Pro, which is open source and cross-platform. The Mawng OpenDictionary is set up to import a text file in backslash format with Toolbox MDF fields. It currently offers a fuzzy search option, which has been identified by many users as a key feature for Indigenous Australian language dictionaries, since few users become completely confident with the spelling systems of small languages such as Mawng. Dictionary entries present copyright/license information for media. Specific 'portals' that offer views of the dictionary designed for different users, such as Mawng-speaking children, adult speakers and linguists, are a planned feature. The OpenDictionary Project is being developed by Ben McIntyre in collaboration with community language projects run by linguists Margaret Carew and Ruth Singer.
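The import and fuzzy search steps can be illustrated with a short Python sketch. It is not OpenDictionary's code (which the abstract says is built on .NET): the parser handles only the simplest backslash-format records, the fuzzy matching uses Python's standard `difflib` as a stand-in for whatever algorithm the project uses, and the headwords and glosses are invented placeholders, not Mawng data. Only the MDF markers `\lx` (lexeme), `\ps` (part of speech) and `\ge` (English gloss) are taken from the Toolbox convention.

```python
import difflib

# Invented placeholder records in backslash (MDF) format; not real language data.
MDF_SAMPLE = r"""
\lx tavolu
\ps n
\ge gloss one

\lx minaki
\ps n
\ge gloss two

\lx soreta
\ps v
\ge gloss three
"""


def parse_mdf(text):
    """Parse backslash-coded records into a list of {marker: [values]} dicts.

    A new record starts at each \\lx line. Lines without a leading
    backslash are skipped in this simplified sketch.
    """
    entries = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue
        marker, _, value = line[1:].partition(" ")
        if marker == "lx":
            current = {}
            entries.append(current)
        if current is not None:
            current.setdefault(marker, []).append(value.strip())
    return entries


def fuzzy_lookup(query, entries, cutoff=0.6):
    """Return entries whose headword is close to the query spelling."""
    by_headword = {entry["lx"][0]: entry for entry in entries}
    hits = difflib.get_close_matches(query, list(by_headword), n=5, cutoff=cutoff)
    return [by_headword[headword] for headword in hits]


entries = parse_mdf(MDF_SAMPLE)
matches = fuzzy_lookup("tavollu", entries)  # misspelt query with a doubled letter
```

A lower `cutoff` tolerates more spelling variation at the cost of more unrelated hits, which is the trade-off a dictionary interface would need to tune for its users.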

Field methods classes, while primarily pedagogical in aim, are also opportunities to record under-studied languages. It is often considered best practice to archive these data, and the PARADISEC archive currently includes around a dozen collections with ‘field methods class’ (or similar) in the title. Some collections may represent the only material easily visible and accessible for these languages.

For the undergraduate class ‘Describing a Language’ (at the University of Sydney, 2020), descriptive study provided an opportunity to archive research on phonology, morphology and grammar, and on transmission and maintenance of a language through teaching resources. The language explored was Bisakol (Austronesian) as spoken in Bulan (Sorsogon) in the Philippines (a.k.a Southern Sorsogon/Southern Sorsoganon/Bikol Sorsogon, inter alia). The class worked with language consultant M. B. Girado. As a PhD student in anthropology, Girado simultaneously brought a lay person’s language knowledge and a critical anthropological understanding of field linguistics, providing a unique perspective on descriptive linguistics and the archiving process. The amassing of diverse data in an environment which was both collaborative and aimed at instruction in fieldwork methods also raised questions about what should be archived and how, especially as our work model and goals changed in response to the Covid-19 pandemic. For example, some records were not necessarily created with archiving in mind, and incorporated video and images from other sources, but nevertheless represent unique data on Bisakol. We discuss the questions and perspectives that arose as we worked on archiving research that was produced in this field methods class.

Small language centres and similar agencies often have difficulty building descriptions of the collections of materials they create, typically resulting in accumulated files that are difficult to manage. Databases have also been used to describe and display primary records; however, this separates files from their descriptions, and the use of proprietary software can result in orphaned data. In small agencies, reliance on one individual can also mean that a well-structured set of materials is orphaned if that person leaves, with significant effort typically required to recover data and metadata. To address this issue, PARADISEC has begun working with a platform called Arkisto that is built on the Oxford Common File Layout and Research Object Crate.

With Arkisto, the guiding principle is that the catalog writes a complete metadata entry to the same directory as the item. In this way each item is self-describing. We have built services that take items from a range of collections in PARADISEC (for example, all files in one language, regardless of which collection they occur in) and create subcollections of just those items.
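The following Python sketch illustrates the principle rather than Arkisto itself. It assumes each item directory holds a metadata file named `ro-crate-metadata.json`, following the Research Object Crate convention; the `inLanguage` property, the collection and item identifiers, the language labels and the flat directory layout are all invented for the example (a real OCFL store nests content inside versioned directories).

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path

METADATA_FILE = "ro-crate-metadata.json"


def write_item(root, collection, item, language):
    """Create an item directory that carries its own metadata description."""
    item_dir = root / collection / item
    item_dir.mkdir(parents=True)
    crate = {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {"@id": METADATA_FILE, "@type": "CreativeWork",
             "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
             "about": {"@id": "./"}},
            # "inLanguage" is an assumed property choice for this sketch.
            {"@id": "./", "@type": "Dataset", "name": item, "inLanguage": language},
        ],
    }
    (item_dir / METADATA_FILE).write_text(json.dumps(crate), encoding="utf-8")


def items_by_language(root):
    """Group item directories by language using only the files on disk."""
    index = defaultdict(list)
    for meta_path in sorted(root.rglob(METADATA_FILE)):
        graph = json.loads(meta_path.read_text(encoding="utf-8"))["@graph"]
        dataset = next(entity for entity in graph if entity.get("@id") == "./")
        item_path = meta_path.parent.relative_to(root).as_posix()
        index[dataset["inLanguage"]].append(item_path)
    return dict(index)


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    write_item(store, "COLL1", "001", "lang-a")
    write_item(store, "COLL1", "002", "lang-b")
    write_item(store, "COLL2", "001", "lang-a")
    subcollections = items_by_language(store)
```

Because no central database is consulted, the same grouping can be rebuilt at any time from the directories alone, which is the property that protects against orphaned metadata.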

With Arkisto we are building a set of services to run the collection and reduce dependence on the existing monolithic PARADISEC catalog, ideally replacing it entirely at the end of this project. These services could be shared among a group of language archives. We will demonstrate an OCFL version of the collection with faceted search and microservices for viewing items.

Sally Treloyn, Rona Charles, Andrea Emberly, Lusani Dhavula and Tseki Maphasha: Using the Discovery tool to link dispersed song collections in Australia with source communities in the Kimberley (Australia) and Limpopo (South Africa)

The Discovery database tool is designed to hold surrogate source materials (audio and text documents) pertaining to particular regions and song genres. It works as a linking tool that allows the user to aggregate metadata from sources that are dispersed within and across archival collections. Users curate song materials drawn from the digital surrogates for community use for local sustainability, teaching and learning, and song, dance and language revitalisation initiatives. The tool was developed with materials tied to four regions and genres. This presentation focuses on two of these: dance-song materials referencing the Junba dance-song genre from the Kimberley, Australia, held in private collections and AIATSIS; and dance-song materials from the Blacking collection originating from Limpopo province, South Africa, held at the University of Western Australia. Presenters, including community users of the tool, will outline collections, structure and utility of the system, use in respective communities, and future directions.

The current landscape of data repository systems is rather fragmented. Language archives and data repositories in general are spending significant amounts of time and money on adopting existing solutions, on developing custom solutions themselves, or on migrating from one solution to another. Numerous open source as well as commercial offerings have a healthy user base, and while the choice of a given solution is often not based on technical features alone, this does suggest that there is a need for different solutions that meet different requirements, rather than a one-size-fits-all platform. Two recent developments, however, have the potential to change the way in which repository solutions are built and how easy it is to move from one solution to another. One is the Oxford Common File Layout specification (OCFL), which is a preservation-focused, application-independent specification for storing repository objects on a storage medium. The other development is the Digital Object Interface Protocol (DOIP), which is a proposed standard protocol for interacting with digital objects. In theory, if different repository solutions adopted these standards for both the low-level storage and the protocol for interacting with the digital objects in the repository, the repository solutions themselves would be interchangeable. This would also make it possible to develop presentation layers that could be used with different underlying data repository systems. Some further alignment on minimal metadata requirements and on collection structures might be necessary, though, to make this a reality.
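What application independence means at the storage level can be shown with a small Python sketch. In OCFL, each object carries an `inventory.json` whose `manifest` maps content digests to stored files and whose per-version `state` maps the same digests to logical file names, so any program can locate a file without the repository software that wrote it. The inventory below is hand-written and abbreviated for illustration: the object identifier and file names are invented, and the digests are short placeholders rather than real SHA-512 values.

```python
import json

# Abbreviated, hand-written inventory; digests are placeholders, not real hashes.
INVENTORY_JSON = """{
  "id": "urn:example:item-1",
  "type": "https://ocfl.io/1.0/spec/#inventory",
  "digestAlgorithm": "sha512",
  "head": "v2",
  "manifest": {
    "digest-a": ["v1/content/recording.wav"],
    "digest-b": ["v1/content/notes.txt"],
    "digest-c": ["v2/content/notes.txt"]
  },
  "versions": {
    "v1": {"created": "2020-01-01T00:00:00Z",
           "state": {"digest-a": ["recording.wav"], "digest-b": ["notes.txt"]}},
    "v2": {"created": "2020-06-01T00:00:00Z",
           "state": {"digest-a": ["recording.wav"], "digest-c": ["notes.txt"]}}
  }
}"""


def resolve(inventory, logical_path, version=None):
    """Map a logical file name in a version to its stored content path.

    Defaults to the head (most recent) version of the object.
    """
    version = version or inventory["head"]
    state = inventory["versions"][version]["state"]
    for digest, logical_paths in state.items():
        if logical_path in logical_paths:
            # The manifest lists where content with this digest is stored.
            return inventory["manifest"][digest][0]
    raise KeyError(f"{logical_path!r} is not part of version {version}")


inventory = json.loads(INVENTORY_JSON)
current_notes = resolve(inventory, "notes.txt")
original_notes = resolve(inventory, "notes.txt", version="v1")
recording = resolve(inventory, "recording.wav")
```

The unchanged recording resolves to its `v1` copy even in `v2`, since OCFL stores each distinct piece of content once and later versions refer back to it.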

Field Notes is a podcast about linguistic fieldwork, which aims to share the stories of linguists doing fieldwork to document, describe, and research languages, particularly under-documented and under-described languages. Field Notes was inspired by my initial field trip to Amami Oshima (Ryukyu), and the anxiety that accompanied me on that first trip. As an early-career linguist, I sought out unedited stories from linguists who had carried out successful fieldwork, beyond the usual manuals. I was eager to hear how other fieldworkers had dealt with the unexpected and inevitable challenges that come with fieldwork, particularly those which aren’t often mentioned in Field Methods courses. What had others done when faced with such challenges, and what could I and other early-career linguists learn from these experiences? The Field Notes podcast has become a platform for fieldworkers of all experience levels to exchange knowledge, share our work, feel a sense of solidarity, and even provide guidance. In this short talk, I will share the motivation behind Field Notes and how it has evolved into a platform to amplify lesser-heard voices in language documentation (e.g. Indigenous and BIPOC scholars) while simultaneously improving the work being done in the field of documentary linguistics.

Myfany Turpin, Jodie Kell, Clint Bracknell and Felicity Meakins: From the archives to the air: producing a podcast with archival song recordings

In this paper we discuss the process of making A Song with No Boss, a podcast and website built from archival audio recordings and due for release in 2021. Part detective story, part oral history, it is the story of a traditional travelling ceremony that was popular in the first half of the 20th century, performed by men and women across outback Australia (see map). The ceremony was first documented by Daisy Bates, who witnessed it in 1913 and referred to it as Wanjiwanji. The podcast and website bring to the modern airwaves 11 legacy audio recordings of the song, made between 1955 and 2017 in four states across a dozen language regions. They also weave in contemporary responses to the recordings, as the song is still widely known by old people, and contemplate the close relationship between music, memory and emotion. The story is a tale of an orally transmitted song that remained unchanged across hundreds of years and thousands of kilometres.

In this paper we focus on two issues. The first is the process of obtaining permissions to use the recordings, which involved negotiations with the communities of the singers, the recordist and the archive (AIATSIS), including an agreement between AIATSIS, the University and the ABC broadcaster. Only one community declined our request for usage. The second issue relates to the collection of contemporary recordings made in the course of the project from 2017 to 2019. We consider how these can be linked to the legacy recordings in the archives, and how the results of this contemporary fieldwork can be fed back into the metadata of the legacy recordings held at AIATSIS. Many of the legacy recordings do not identify the song, its social context or its meaning in the metadata, and even the identity of the singers is undocumented in the early recordings. The value of feeding this information back to the archive is apparent in terms of discovery, yet how best to do this is a more complex issue.

Tahitian online dictionary : http://www.farevanaa.pf/dictionnaire.php

Atlas : https://anareo.cloud.pf/atlas/carte/carte.php

UPF digital library : http://anaite.upf.pf/

This paper presents Anareo, literally ‘cave of languages’, a digital lexical and textual database, and its various features. Dedicated to the languages of French Polynesia, it is developed by the Université de la Polynésie française in partnership with PARADISEC. Anareo hosts several dictionaries, including that of the Académie tahitienne, as well as the Digital Atlas of the Languages of French Polynesia. These lexical resources are gradually being linked to corpora of written texts and, soon, to transcribed oral texts. For example, starting from a word in the Tahitian-French dictionary, users can view a set of referenced examples drawn from Tahitian texts of the 19th and 20th centuries that are held in a digital library. The lexical data are enriched by contributions from specialists, in particular to standardise the scientific taxonomy of plants, fish and birds and to associate photos with the named species.

Software functions developed for Anareo increase interoperability between the different resources and support synchronic lexicographic and grammatical research as well as, through the integration of data from the Polynesian Lexicon Online (POLLEX), research on etymology. A new project within the development of Anareo aims to offer, on the model of PARADISEC, a digital archive dedicated to the deposit, archiving and consultation of oral recordings made in the languages of French Polynesia.



The plight of endangered musical traditions is widespread across the globe, inspiring documentation initiatives of which PARADISEC stands as an outstandingly successful example. A case in point is dāphā, a tradition of participatory Hindu-Buddhist devotional singing performed in the towns and villages of the Kathmandu Valley, Nepal, by groups of male singers and instrumentalists mainly from the farming community. Despite its importance for the social identity and well-being of its performers and patron communities, and a centuries-long history, the tradition is in decline, and its continued transmission is under threat – from long-term social and economic changes as much as from disasters such as the 2015 earthquake and the current pandemic. What would be lost if dāphā became extinct? The words of the songs, encapsulating important religious, social and historical themes, would survive in manuscript song-books. Melodic, metrical and formal structures, however, would vanish, as they are orally transmitted and committed to memory. With them would go a range of performance practices, sedimented historical meanings, religious functions, and social roles. Dāphā songs and their performances articulate patterns that reflect history, characterise culture and structure daily life. The dynamics of dāphā as musical and social interaction demand to be documented along with the songs themselves.