Learn the foundations of linguistic annotation in just three intensive hours
Are you interested in corpus linguistics, natural language processing, digital humanities, or language data analysis? Join an intensive hands-on workshop designed to introduce participants to the essential concepts and practical skills of text annotation, hosted by INDONESIA CORPUS LAB!
This workshop provides a beginner-friendly introduction to annotation workflows, covering both automatic and manual annotation methods. Participants will gain practical experience using annotation software, working with structured text formats, and understanding how linguistic data is prepared for corpus and NLP research. Recordings of the workshop will be made available to all participants!
Duration: 3 hours
Mode: Intensive hands-on training
Level: Zero/Beginner
Maximum participants: 10
Minimum participants: 5
To ensure personalized guidance and sufficient practice time, enrollment is strictly limited to a small group.
Online: IDR 250,000 / USD 15.00
Onsite: IDR 350,000 / USD 20.00 (snack & lunch available)
Register here (deadline 1 July 2026)
14 July 2026, 8.30-11.30
Online: Zoom/Teams
Onsite: Hotel Antawirya, Undip Semarang, Meeting Room LT 1
Introduction to linguistic annotation and annotated corpora
Understanding annotation layers (tokens, lemmas, POS tags, metadata, etc.)
Automatic annotation using existing tools
Manual annotation and quality control
Introduction to XML for corpus annotation
Basic coding concepts for annotation workflows
Hands-on practice with annotation software
Creating and editing annotated datasets
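To give a flavour of what annotation layers and XML look like in practice, here is a minimal sketch in Python. It is only an illustration, not the workshop's actual material: the element names (`s`, `w`), the attribute names (`lemma`, `pos`), the tag labels, and the sample sentence are all assumptions chosen for the example, and the software used in the sessions may represent the same information differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical hand-annotated tokens: (word form, lemma, POS tag).
# The tag labels are illustrative, not a tagset prescribed by the workshop.
TOKENS = [
    ("The", "the", "DET"),
    ("cats", "cat", "NOUN"),
    ("slept", "sleep", "VERB"),
]


def build_sentence(tokens, sent_id="s1"):
    """Return an <s> element holding one <w> element per token."""
    sentence = ET.Element("s", {"id": sent_id})
    for form, lemma, pos in tokens:
        # Each annotation layer becomes an attribute on the token element.
        word = ET.SubElement(sentence, "w", {"lemma": lemma, "pos": pos})
        word.text = form
    return sentence


def to_xml(element):
    """Serialize an element tree to a plain XML string."""
    return ET.tostring(element, encoding="unicode")


xml_text = to_xml(build_sentence(TOKENS))
print(xml_text)
```

Each word form is stored as the text of a `w` element, while its lemma and part-of-speech tag sit alongside it as attributes, so the layers stay linked to the same token and can be checked or corrected by hand later.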
This workshop is suitable for:
Undergraduate and postgraduate students
Researchers in linguistics and digital humanities
Language professionals
Anyone interested in corpus linguistics or NLP
Participants need a computer (preferably running Windows). No prior knowledge is required; however, participants with basic familiarity with corpus linguistics will find it easier to follow some of the examples and exercises.
By the end of the workshop, participants will understand how annotated corpora are created and maintained, and will be able to perform basic annotation tasks using both automated tools and manual methods. The skills introduced in this workshop provide a foundation for further work in corpus linguistics, NLP, language technology, and digital humanities research.
Only 10 places are available. Early registration is strongly encouraged.
UNIVERSITAS DIPONEGORO, INDONESIA
Introduction to Linguistics (Undergraduate program in English Language and Literature)
Introduction to Corpus Linguistics (Undergraduate program in English Language and Literature)
Research Methods and Scientific Writing Techniques (Master Program in Linguistics)
UNIVERSITAS SUMATERA UTARA, INDONESIA
Corpus Linguistics (Master Program in Linguistics)
UNIVERSITAS NEGERI JAKARTA, INDONESIA
Computational Linguistics (PhD in Applied Linguistics)
LANGUAGE DEVELOPMENT AND CULTIVATION AGENCY, MINISTRY OF EDUCATION, INDONESIA (Badan Bahasa Kemendikbud)
Corpus Linguistics
Corpus Lexicography
Diponegoro Summer Course in Corpus Linguistics 2023, 2024
Corpus and Corpus Query Systems
The Linguistic Society of Indonesia (Masyarakat Linguistik Indonesia)
Introduction to Corpus Linguistics