15th National Natural Language Processing Research Symposium

March 08-09, 2019

BU Amphitheater, Bicol University, Legazpi City

Call for Papers

Organized by the Computing Society of the Philippines – Special Interest Group on Natural Language Processing (CSP SIG-NLP), National University (NU) Human Language Technology Laboratory, and Bicol University (BU)

The 15th National Natural Language Processing Research Symposium (15NNLPRS) will take place on March 08-09, 2019 at Bicol University, Legazpi City. NNLPRS is a regular gathering of researchers from different fields working on the analysis, processing, and generation of human languages. The event is intended to provide a forum for networking and for encouraging further research. Past symposia have covered a wide range of topics in NLP and have featured international invited speakers.

This symposium aims to encourage the submission of current NLP research papers, to expose attendees from selected institutions to current trends and advances in NLP, and to explore the challenges and prospects of conducting NLP research in academic settings. Specific objectives are:

  1. To establish a mechanism for leveraging NLP research in academic institutions;
  2. To provide the attendees with an orientation on NLP-based research and training on the use of various NLP tools; and
  3. To provide an avenue for the attendees to get assistance in the conduct of NLP-based research.

This year’s theme is “Indigenous Languages”. According to Ethnologue, the country has 175 indigenous languages. Among these, 14 are in trouble, 11 are dying, and 4 are already extinct. The 15NNLPRS will be a venue for discussing the various challenges and opportunities that we face in addressing this issue and in integrating human language technology to document and process Philippine languages.

Relevant topics include, but are not limited to, the following areas:

Language

  • Corpus building
  • Dictionary and Philippine languages
  • Discourse analysis
  • Phonology and morphology
  • Language resources and evaluation
  • Language clustering and mapping
  • Language learning
  • Lexicology
  • Multilingual speech corpora
  • Prosody
  • Sociolinguistics
  • Speech databases
  • Standardization
  • Syntax and grammar

Computing

  • Automatic speech recognition
  • Culturomics
  • Information retrieval
  • Machine learning for natural language
  • Machine translation
  • Named entity recognition
  • Natural language generation
  • Segmentation and labeling
  • Sentiment analysis and opinion mining
  • Sign language processing
  • Speech synthesis
  • Text summarization and generation
  • Word sense disambiguation
  • WordNets and ontologies

CHED Endorsement

15NNLPRS CHED Endorsement.pdf

Program

15NNLPRS Program v2.pdf

Speakers

Dr. Tod Allman has been working in the field of Natural Language Generation for the past twenty years. He and his colleagues have designed and developed a linguistically based natural language generator called Linguist’s Assistant (LA). LA produces high quality draft translations in a wide variety of languages, particularly minority and endangered languages. Linguists may use LA to simultaneously document a language, and also produce initial draft translations of significant texts in the language. When experienced mother-tongue translators edit the translations produced by this system into publishable texts, their productivity is increased by more than 500% without any loss of quality. LA incorporates extensive typological, semantic, syntactic, and discourse research into its semantic representational system and its transfer and synthesizing grammars. Tod has worked with linguists and mother-tongue speakers in order to develop computational lexicons and grammars for a variety of languages including Korean, Kewa (Papua New Guinea), Jula (Cote d’Ivoire), Angas (Nigeria), Chinantec (Mexico), and Nsenga (Zambia). He has been living in the Manila area for more than five years, and is presently building lexicons and grammars for five languages: Tagalog, Ayta Mag-Indi, and Botolan Sambali which are spoken here in the Philippines, Ibwe which is spoken in Indonesia, and Hlai which is spoken in China. He hopes that the texts generated by LA will empower the speakers of these languages by enabling them to participate in the larger world, and by providing them with vital information which helps them live longer, healthier, and more productive lives.

Linguistically Based Computer Assisted Translation: Gleaning the Language Data, Editing the Computer’s Drafts, and Distributing the Published Texts

Linguist’s Assistant (LA) is a linguistically based natural language generator (NLG) designed and developed entirely from a linguist’s perspective. The system incorporates extensive typological, semantic, syntactic, and discourse research into its semantic representational system and its transfer and synthesizing grammars. It is presently being used to translate numerous texts into languages from several diverse language families, including three languages spoken here in the Philippines. This presentation will describe i) the process of gleaning the necessary language data so that the software can produce initial draft translations, ii) editing the computer’s drafts into publishable texts, and iii) distributing the published texts to the people who will benefit from them.

i) Gleaning the Language Data: In order to produce translations of texts in a particular language, every NLG requires three components developed specifically for that language: 1) a lexicon, 2) a transfer grammar, and 3) a synthesizing grammar. The development of these three components requires considerable time and effort by a computational linguist. The author spent approximately one year developing the lexicon, transfer grammar, and synthesizing grammar for Tagalog. In order to reduce the time and effort required to build these three components for other languages in the Philippines, new techniques have been developed to accelerate this process. The first portion of this presentation will summarize these new techniques.
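As a rough illustration of how the three components named above might fit together, here is a minimal toy sketch in Python. It is not the Linguist’s Assistant implementation: the `Concept` class, the role labels, the marker table, and the three-word lexicon are all hypothetical, and the Tagalog handling is heavily simplified (a single verb form and fixed `ang`/`ng` markers).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A language-neutral unit in a hypothetical semantic representation."""
    name: str  # e.g. "DOG"
    role: str  # "agent", "event", or "patient"


# 1) Lexicon: maps concepts to target-language words (toy entries only).
LEXICON = {"DOG": "aso", "EAT": "kumain", "FISH": "isda"}

# 2) Transfer grammar: restructures the input for the target language.
#    Here it only reorders the concepts into verb-initial order.
ROLE_ORDER = {"event": 0, "agent": 1, "patient": 2}


def transfer(concepts):
    """Return the concepts reordered as event, agent, patient."""
    return sorted(concepts, key=lambda c: ROLE_ORDER[c.role])


# 3) Synthesizing grammar: produces the surface text.
#    Here it looks up each word and prepends a simplified role marker.
MARKERS = {"event": "", "agent": "ang", "patient": "ng"}


def synthesize(concepts):
    """Turn the restructured concepts into a capitalized sentence."""
    words = []
    for concept in concepts:
        marker = MARKERS[concept.role]
        if marker:
            words.append(marker)
        words.append(LEXICON[concept.name])
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


def generate(concepts):
    """Run the transfer step, then the synthesis step."""
    return synthesize(transfer(concepts))


# Input given in agent-event-patient order, roughly "dog eat fish".
source = [
    Concept("DOG", "agent"),
    Concept("EAT", "event"),
    Concept("FISH", "patient"),
]
print(generate(source))
```

Even in this toy form, every table (`LEXICON`, `ROLE_ORDER`, `MARKERS`) has to be written by hand for each new target language, which hints at why building the three components takes considerable effort.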

ii) Editing the Computer’s Drafts: After LA has produced an initial draft translation of a book, mother-tongue translators must edit the computer’s draft into publishable form. They begin by editing the computer’s draft into a “presentable first draft.” We define a presentable first draft as a text that doesn’t contain any grammatical errors, conveys the original message accurately, clearly, and completely, and could be read to an audience and everyone would easily understand it. A presentable first draft is ready to be circulated amongst our team of editors who will polish the text into publishable form. The editing that is done during this process generally consists of 1) selecting more precise words, 2) adding more pronouns, 3) improving the information flow, and 4) increasing the naturalness of the text. We’ve performed three timed experiments with a Tagalog mother-tongue translator who edited the computer’s drafts into presentable first drafts, and the results of those experiments will be presented.

iii) Distributing the Published Texts: After a text has been translated and published, it must be delivered to the people who will benefit from the information. More specifically, the text must be delivered in a format that appeals to the people, and at a price they can afford. We’ve found that illustrated books with optional audio recordings are an ideal format for the texts we translate. Additionally, we deliver the texts through free phone apps by visiting the mountain villages and letting the people download the apps from a portable wifi hotspot. This process will be described in the final section of this presentation.

The conclusion of this research is that the time required for developing a lexicon, transfer grammar, and synthesizing grammar for a particular language has been drastically reduced. Additionally, the time required to edit the computer’s drafts into presentable first drafts and then into publishable form has also been significantly reduced. Delivering the texts to the people who will benefit from them is always an adventure. The author hopes that this process will be repeated for many of the languages spoken here in the Philippines.

Dr. Robert R. Roxas has long been interested in Natural Language Processing, particularly in the area of Machine Translation, which was the subject of his master’s thesis. Although he worked in the area of Cyber-Film when he took up his PhD in Computer Science and Engineering in Japan, his heart for NLP has not changed. He has more students doing NLP-related research than Cyber-Film research, and has presented more NLP papers at conferences, both local and international, than Cyber-Film papers. He is also into music notation software.

The Resurging Interests in NLP Researches These Days

In the past, very few Filipino researchers, faculty and students alike, had any interest in Natural Language Processing (NLP) research, but in recent years many have become interested. The earlier lack of interest can be attributed to the widespread belief that computers could not succeed at human-like activities such as understanding human languages, translating from one language to another, building models for a particular language, or recognizing human speech, because of computational complexity. With the emergence of machine learning, deep learning, and other artificial intelligence learning algorithms, many faculty and students are now interested in doing NLP research because they believe that computers can now achieve what they failed to achieve in the past. As evidence of this phenomenon, sample works in NLP, especially those applied to Philippine languages, will be presented in overview form.

Interdisciplinary Research Presentation

Election Twittersphere 2016: Discovering Patterns in Political Conversations

Charibeth Cheng, Maria Guadalupe Salanga, Courtney Ann Ngo, De La Salle University

We gathered around 8M tweets during the PiliPinas 2016 Presidential Debate Series. These tweets were presumed to contain the opinions, beliefs, affiliations, emotions, expressions, and participation of hundreds of thousands of social media accounts towards the candidates, the electoral process, and other national, political, governance, and/or societal issues. How have the conversations on Twitter changed from 2013 to 2016? While there have been analyses of the impact of social media on the 2016 elections, most of these are anecdotal or considered only a small sample of the data on Twitter. We present the results of our study on the 8M tweets, based on conversations associated with selected presidential candidates.

PiCAB Orientation

The Seoul Accord (www.seoulaccord.org) is a mutual recognition agreement pertaining to computing and information technology-related programs, in the same way that the Washington Accord is a mutual recognition agreement pertaining to engineering programs.

The jurisdictions of the U.S.A., Canada, Australia, UK, Japan, Republic of Korea, Chinese Taipei, and Hong Kong have found it advisable in this age of globalization to become signatory members of the Seoul Accord. The signatory members of the Seoul Accord are intent on having the Accord:

  • Serve as the international authority on quality assurance for education in the Computing and IT-related professions;
  • Promote and develop best practices for the improvement of education in Computing and IT-related disciplines;
  • Continually review its policies and procedures to ensure that they are relevant and reliable indicators of the future of Computing and IT-related technologies.

The Seoul Accord Guiding Principles specify that a jurisdiction’s representative to the Seoul Accord must be an association of individuals engaged in the professional practice of computing and information technology-related occupations. This association must neither be an educational institution nor a part of government.

Thus, it is the Australian Computer Society, the British Computer Society, and the Canadian Information Processing Society that represent their respective jurisdictions in the Seoul Accord. In the Philippines, it is the Philippine Computer Society (PCS), the longest-existing association of computing and IT-related professionals in the country, that leads the Philippine jurisdiction initiative for signatory membership in the Seoul Accord.

Just as the Philippine Technological Council (PTC) established within itself the PTC Accreditation and Certification Board for Engineering and Technology (P-ACBET) as the accrediting agency to represent the Philippine jurisdiction in the Washington Accord (www.ieagreements.org), PCS has established within itself the PCS Information and Computing Accreditation Board (PICAB) as the accrediting agency to represent the Philippine jurisdiction in the Seoul Accord (www.seoulaccord.org).

PICAB’s Board of Directors is comprised of individuals from PCS (www.philippinecomputersociety.org); PSIA (www.psia.org.ph); PSITE (http://www.psite.org); and CSP (www.csp.org.ph). PICAB has been formally delegated autonomous powers to manage, supervise, conduct, and decide regarding the accreditation of computing and information technology-related baccalaureate programs. PICAB, through its activities, aims to promote and develop best practices for the improvement of education in computing and IT-related disciplines.

The application of PICAB for the Seoul Accord provisional signatory was accepted unanimously by the signatories at the Seoul Accord General Meeting 2015 on June 20, 2015, in Istanbul, Turkey (http://www.seoulaccord.org/newsdetail-99.html).

Taken from: https://www.philippinecomputersociety.org/picab2/

On March 09, 2019, PiCAB will provide an orientation during 15NNLPRS.

Important Dates

Paper Submission

January 26, 2019

Acceptance Notification

February 09, 2019

Camera Ready Deadline

February 20, 2019

Paper Submission

Submissions should describe unpublished original work. Both completed works and works-in-progress are welcome. Submissions should follow the two-column ACM format and may consist of up to six (6) pages of content, including references and appendices. Authors must use the template (letter size) available at:

https://www.acm.org/binaries/content/assets/publications/article-templates/pubform.docx

No page number should appear in the paper. The copyright box should also be deleted. Submissions will be judged based on relevance, technical strength, significance and opportunities, and interest to the attendees. As the reviewing will be blind, authors must not indicate their names and affiliations in the papers.

Papers must be submitted through the EasyChair Conference System:

https://easychair.org/conferences/?conf=15nnlprs

Accepted papers will be presented orally or as posters as determined by the committee. Oral papers and abstracts of poster papers will be included in the proceedings. Select papers will be invited to a special issue of the Philippine Computing Journal.

Registration


Early bird (until Feb. 16)

  • Undergraduate student: PHP 500.00
  • Graduate student: PHP 1,500.00
  • Regular: PHP 3,500.00

Onsite

  • Undergraduate student: PHP 800.00
  • Graduate student: PHP 2,000.00
  • Regular: PHP 4,000.00

Payment can be made through bank deposit.

  • Name: COMPUTING SOCIETY OF THE PHILIPPINES, INC.
  • Bank: BANCO DE ORO
  • Branch: Loyola Heights - Katipunan, Quezon City
  • Savings Account Number: 3570-0089-29

Please email the deposit slip to Ms. Mary Joy Canon at mjoycanon@yahoo.com

The deposit slip should also be presented during onsite registration.

CSP SIG-NLP offers an early bird rate for deposits made on or before February 16, 2019.

Pre-registration for ID printing: https://goo.gl/forms/o2bMMk1Z4YNd6pso1

Scientific Review Committee

  • Angie Ceniza-Canillo, University of San Carlos
  • Briane Paul Samson, Future University Hakodate / De La Salle University
  • Dalos Miguel, Saint Louis University
  • Erlyn Manguilimotan, Weathernews, Inc.
  • Helen Villanueva, Industry
  • Jeffrey Ingosan, University of the Cordilleras
  • John Noel Victorino, Ateneo de Manila University
  • Katrina Joy Abriol-Santos, University of the Philippines Los Baños
  • Maria Art Antonette Clariño, University of the Philippines Los Baños
  • Nathaniel Oco, National University
  • Ralph Vincent Regalado, Senti Techlabs, Inc.
  • Reginald Neil Recario, University of the Philippines Los Baños
  • Rodolfo Raga Jr., Jose Rizal University

Contact Information

Lany Maceda

  • Co-chair, 15NNLPRS
  • Faculty member, Bicol University
  • lanylm [at] yahoo [dot] com

Nathaniel Oco

  • Chair, 15NNLPRS
  • Assistant Director for Research, National University
  • naoco [at] national-u [dot] edu [dot] ph