IRworkshop2021

Workshop Overview

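With the rapid growth of the World Wide Web, information content and services of every kind keep expanding, and the way people live and interact has gradually moved onto online platforms. As wireless networking and multimedia technologies advance quickly, traditional information retrieval techniques continue to merge with these new media and platforms, producing much innovative research. These innovations and their important applications still receive close attention and lively discussion in academia and industry. This workshop therefore invites scholars and experts from Taiwan and abroad to exchange ideas and techniques. It continues an annual series of events: the 2002 "Workshop on Automatic Information Classification Technology", the 2003 "Workshop on Information Retrieval and Computer-Assisted Language Learning", the 2004 "Workshop on Document Mining Technology", the 2005 "Workshop on Web Information Retrieval Technology and Trends", the 2006 "Workshop on Web Mining Technology and Trends", the 2007 "Workshop on Web 2.0 Technology and Applications", the 2008 "Workshop on Web Community Service Computing and Mining Technology", the 2009 "Workshop on Mobile Information Retrieval and Location-Based Service Technology", the 2010 "2010 Workshop on Innovative Information Retrieval Technology", the 2011 "Workshop on Music Information Retrieval and Community Service Technology", the 2013 and 2014 "Workshop on Top Papers in Information Retrieval", the 2015 "New Trends in Cross-Disciplinary Natural Language Processing and Information Retrieval Technology", the 2016 "The Great Future of Information Retrieval", and the 2018 "Information Retrieval and Artificial Intelligence". Each year's theme has met with an enthusiastic response.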

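In recent years, as artificial intelligence has drawn growing attention and expectations, the combination and exchange of information retrieval with multimedia and natural language processing technologies has created substantial technical demand. This year we follow that wave and hold the workshop under the theme "Development Trends and the Road Ahead for Natural Language Processing and Information Retrieval Technology". We have invited leading scholars from Taiwan and abroad to share state-of-the-art work in artificial intelligence, natural language processing, and information retrieval. It is an excellent opportunity to encounter new technologies in these fields, and we warmly welcome participants from all sectors to register.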

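Organizers

張詠淳, Associate Professor (Graduate Institute of Data Science, Taipei Medical University)

古倫維 (Lun-Wei Ku), Associate Research Fellow (Institute of Information Science, Academia Sinica)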

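Hosts

College of Management, Taipei Medical University

Institute of Information Science, Academia Sinica

Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)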

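Co-organizers

Taiwan TMU Management Association (社團法人台灣北醫管理協會)

Graduate Institute of Data Science, Taipei Medical University

Industrial Master's Program in Smart Data Applications, Taipei Medical University (臺北醫學大學智慧數據應用產業碩士專班)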

Agenda

IR Workshop 2021 Agenda

March 10, 2021 (Wednesday)

8:30-8:50

Registration

8:50-9:00

Opening

🔗 Video🎞

9:00-10:00

Keynote Speech 1 - Artificial intelligence for data retrieval in medical applications, now and the future

杜奕瑾 (Ethan Tu), CEO (Taiwan AI Labs)

🔗 Video🎞

Abstract

There is increasing interest in exploiting the vast and rapidly growing body of health-related content using information retrieval and deep learning strategies. AI-based retrieval of real-world-evidence health content for medical applications is particularly challenging. Language characteristics differ implicitly by content type, and the differences stem from content formats such as healthcare documentation and clinical records, professional or scientific publications, and clinical trial documentation. Moreover, it is critical to provide search solutions for non-English content, as well as cross-language or multilingual IR solutions, to handle mixed-language content in Chinese, Taiwanese, and Chinese-English. This talk will briefly introduce how we currently apply AI-based information retrieval to diverse applications in the medical area.

Biography

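Ethan Tu is in fact 杜奕瑾, known to people in Taiwan as the "father of PTT". He set up PTT 22 years ago on a 486 computer in his dormitory, when he was a sophomore in the Department of Computer Science and Information Engineering at National Taiwan University.

For this reason, the countless "villagers" (鄉民) who once used, or still avidly use, this anonymous discussion platform have nicknamed him the "Father of PTT", "Papa Tu", and even the "Ancient Divine Beast" and "the Creator". (Editor's note: PTT is the largest BBS community in the Chinese-speaking world. At the height of BBS popularity 15 years ago, Taiwan had more than 400 BBS sites; today only PTT remains standing. The "villager" organizational norms it left behind, along with its community habits and capacity for mobilization, have deeply influenced Taiwanese society and the development of BBS communities in mainland China. Even the liking and commenting habits found on Facebook had already appeared on PTT years earlier.)

"The Creator" has never liked being in the spotlight and in recent years has rarely appeared at public events in Taiwan. Only some industry insiders in Taiwan know that, after graduating from the NTU Department of Computer Science and Information Engineering and taking part in the founding of 蕃薯藤 (Yam), one of Taiwan's first-generation internet companies, Tu went to the United States. He first worked on gene sequence and automated cancer detection research at the National Institutes of Health (NIH), where America's elite gather. Eleven years ago he joined Microsoft, then the world's technology giant, at its headquarters in Seattle, where he worked on the development of the search engine Bing and spent more than 11 years on artificial intelligence research, becoming Principal R&D Director for Asia-Pacific of Microsoft's AI team (AI.R.).

"Taiwan nurtures world-class professors and top software talent. I plan to convene 'Taiwan AI Labs' to carry out hands-on AI experiments with Taiwan's leading companies, in cooperation with international technology giants and talent, and to use Taiwan's local experience and creativity to build up Taiwan's software strength and market it to the world."

"Taiwan's first year of AI begins at this moment."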

(source: https://crossing.cw.com.tw/article/7805)

10:00-10:20

Coffee break

10:20-11:10

Invited Talk 1 - Neural Structured Learning: Theory, Framework and Applications

Dr. 阮大成 (Da-Cheng Juan), Technical Lead Manager (Google Research)

Abstract

Neural Structured Learning (NSL) is a new learning paradigm for training neural networks by leveraging structured signals in addition to feature inputs. Structure can be explicit, as represented by a graph, or implicit, as induced by adversarial perturbation. Structured signals are commonly used to represent relations or similarity among samples that may be labeled or unlabeled. Therefore, leveraging these signals during neural network training harnesses both labeled and unlabeled data, which can improve model accuracy, particularly when the amount of labeled data is relatively small. Additionally, models trained with samples that are generated by adding adversarial perturbation have been shown to be robust against malicious attacks. NSL has been open sourced as part of the TF ecosystem, and we will also introduce several industrial applications enabled by NSL, such as learning state-of-the-art image semantic embeddings and learning knowledge graph embeddings.

Biography

Machine learner, software developer, and researcher: Da-Cheng Juan is a tech lead and engineering manager at Google Research, leading a research group working on graph learning, adversarial learning, and their real-world applications. Da-Cheng also holds the position of adjunct faculty in the Department of Computer Science, National Tsing Hua University. Previously, he received his Ph.D. from the Department of Electrical and Computer Engineering and his Master’s from the Machine Learning Department, both at Carnegie Mellon University. Da-Cheng has published more than 50 research papers and has repeatedly served as a program committee member at top conferences and workshops in machine learning, computer vision, natural language processing, and related fields. In addition to research, he enjoys algorithmic programming and has won several awards in major programming contests. Da-Cheng was the recipient of the 2012 Intel PhD Fellowship. His current research interests span machine learning, convex optimization, and energy-efficient computing.

11:10-12:00

Invited Talk 2 - Towards Conversational AI

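陳縕儂 (Yun-Nung Chen), Associate Professor (Department of Computer Science and Information Engineering, National Taiwan University)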

🔗 Slide
🔗 Video🎞

Abstract

Even though conversational systems have attracted a lot of attention recently, current systems sometimes fail because of errors from different components. This talk presents potential directions for improvement: 1) we first focus on learning language embeddings specifically for practical scenarios for better robustness, and 2) we then propose a novel learning framework for natural language understanding and generation built on duality for better scalability. Both directions enhance the robustness and scalability of conversational systems, showing their potential to guide future research.

Biography

Yun-Nung (Vivian) Chen is currently an associate professor in the Department of Computer Science & Information Engineering at National Taiwan University. She earned her Ph.D. degree from Carnegie Mellon University, where her research interests focus on spoken dialogue systems, language understanding, natural language processing, and multimodality. She received Google Faculty Research Awards, Amazon AWS Machine Learning Research Awards, MOST Young Scholar Fellowship, and FAOS Young Scholar Innovation Award. Prior to joining National Taiwan University, she worked in the Deep Learning Technology Center at Microsoft Research Redmond. (http://vivianchen.idv.tw/)

12:00-13:00

Lunch

13:00-14:00

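Keynote Speech 2 - A Brand-New Natural Language Model: Principle-based Approach

許聞廉 (Wen-Lian Hsu), Distinguished Research Fellow (Institute of Information Science, Academia Sinica)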

🔗 Slide
🔗 Video🎞

Abstract

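Statistical machine learning has the following fatal weaknesses in language understanding: 1. the "knowledge" it learns (a large number of parameters) is hard to express in a form people can understand, and errors are hard to correct; 2. statistical machine learning centers on "classification" and has difficulty incorporating "rules"; 3. literal learning from "plain text" alone cannot solve the problem. A great deal of external knowledge must be added at the right moments for a computer to keep operating sensibly (end-to-end does not work). We propose a new model, the Principle-based Approach (PBA), which combines the strengths of statistical and rule-based methods and conforms to the machine learning principle of N-fold training and testing. PBA has several key elements: 1. the "modifiers" of each word X are collected statistically in advance and called the FB of X, and the phrase formed by merging the modifiers with X is called the "concept of X"; 2. an FB-based simplification method automatically expresses sentences or phrases as N-grams of concepts, which are stored in a dictionary of patterns (also called principles); 3. PBA uses pattern matching as the basis for similarity comparison. Pattern inference can resolve many difficult problems in natural language, especially those that current machine learning struggles with, and we will explain this in detail in the talk.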

Biography

Wen-Lian Hsu (F'06) is a Distinguished Research Fellow of the Institute of Information Science, Academia Sinica, Taiwan. He received his Ph.D. in operations research from Cornell University in 1980. Dr. Hsu's earlier contributions were in graph algorithms, and he has applied similar techniques to tackle computational problems in biology and natural language. In 1993, he developed a Chinese input software, GOING, which has since revolutionized Chinese input on computers. He later applied similar semantic analysis techniques to question answering systems and biological literature mining. Dr. Hsu has received numerous awards from both academia and industry. Recently, he developed an interpretable machine learning technique based on reduction, which takes advantage of the idea of context representation from word embedding and performs better than BERT in several applications.

14:00-14:50

Invited Talk 3 - All the Wiser: Fake News Intervention towards Effective Clarification

古倫維 (Lun-Wei Ku), Associate Research Fellow (Institute of Information Science, Academia Sinica)

🔗 Slide
🔗 Video🎞

Abstract

Fake news has been shown to have a significant impact on people's daily lives. Governments and research communities have proposed many approaches to stop the dissemination of fake news. However, their effectiveness is limited and some side effects have been observed. This talk will introduce a news reading platform in which we propose an implicit approach to reduce people’s belief in fake news. Specifically, it touches on how we leverage reinforcement learning to learn an intervention module on top of a recommender system (RS), such that once users encounter fake news the module is activated in place of the RS to recommend news that leads toward verification. The effectiveness of the proposed approach is shown by automatic evaluation and a user study. Moreover, comparisons with other commonly adopted methods will be discussed. The deployment, related applications, and future goals of the proposed concept will conclude this talk.

Biography

Prof. Lun-Wei Ku, IIS, Academia Sinica

Lun-Wei Ku is now an associate research fellow at the Institute of Information Science, Academia Sinica, an adjunct associate professor at National Chiao Tung University (NCTU), and the secretary-general of the Association for Computational Linguistics and Chinese Language Processing (ACLCLP). She received her M.S. and Ph.D. degrees from the Department of Computer Science and Information Engineering, National Taiwan University. Her research interests include natural language processing, information retrieval, and computational linguistics. She has been working on sentiment analysis since 2005 and was the co-organizer of the NTCIR MOAT Task (Multilingual Opinion Analysis Task, traditional Chinese side) from 2006 to 2010. Her international recognition includes Good Design Award Selected (2012), CyberLink Technical Elite Fellowship (2007), IBM Ph.D. Fellowship (2008), and ROCLING Doctoral Dissertation Distinction Award (2009). Her other international professional activities include: General Chair, StarSem 2021; Program Chair, StarSem 2019 and ARIS 2019; Best Paper Committee, ACL 2019; Student Workshop Chair, AACL-IJCNLP; Area Chair, ACL 2021, NAACL 2021, ACL 2020, COLING 2020, EMNLP 2019, ACL 2017, CCL 2016, NLPCC 2016, ACL-IJCNLP 2015 and EMNLP 2015; Financial Chair, IJCNLP 2017; Publication Co-Chair, IJCNLP 2013; Publicity Chair, AIRS 2010. She is also active in industrial collaborations and is currently working with companies such as E-Sun Bank and WinGene.

14:50-15:10

Coffee break

15:10-16:00

Invited Talk 4 - Artist Interpersonal Relation Enhanced Graph Model for Recommendation

黃瀚萱 (Hen-Hsen Huang), Assistant Professor (Department of Computer Science, National Chengchi University)

🔗 Slide

Abstract

Music recommendation is a hot research topic in both academia and industry. Existing approaches to music recommendation are mostly based on structural information, using techniques such as collaborative filtering and graph modeling. In this work, we propose a multi-modal heterogeneous graph (MMHG) model that leverages both content-based and structure-based information for music recommendation. We train our MMHG to capture the relations among different kinds of vertices, including users, music items, genres, moods, and artists' social network, and enrich the features with the acoustic and textual information of music content and with the social network of artists. By incorporating sophisticated relations among the different concepts in addition to enriched features, our approach proves effective in the experiments.

Biography

Dr. Hen-Hsen Huang is an assistant professor in the Department of Computer Science at National Chengchi University. His research interests include natural language processing and information retrieval. His work has been published in ACL, SIGIR, WWW, IJCAI, CIKM, COLING, and other venues. Dr. Huang’s awards and honors include the Honorable Mention of Doctoral Dissertation Award of ACLCLP in 2014 and the Honorable Mention of Master Thesis Award of ACLCLP in 2008. He served as the registration chair of TAAI 2017, the publication chair of ROCLING 2020, and as a PC member of representative conferences in computational linguistics including ACL, COLING, EMNLP, and NAACL. He was one of the organizers of the FinNum Task at NTCIR-2014 and the FinNLP Workshop at IJCAI 2019.

16:00-16:50

Invited Talk 5 - Information extraction from unstructured text data in the electronic medical records-status quo and challenges

許明暉 (Min-Huei Marc Hsu), Chief Data Officer (Office of Data Science, Taipei Medical University)

🔗 Slide
🔗 Video🎞

Abstract

In the past ten years, with the continuous advancement and adoption of health information technology, medical institutions around the world have accumulated a large amount of electronic medical record (EMR) data through long-term collection. For clinical scientific research, EMR data has the advantages of low cost and timeliness compared with data obtained in clinical trials. At present, more and more studies use EMR data in clinical research such as efficacy analysis and outcome analysis.

Health data includes personal health data from mobile devices, hospital clinical data, genetic data, and public data for disease prevention and control. The integration of data from the multiple sources can provide the fundamentals for health promotion, disease prevention, and national health strategies.

EMR data is characterized by diversity, incompleteness, and redundancy, which make it difficult to analyze directly. The source data must be preprocessed to improve data quality, and different types of data require different processing technologies. Most structured data needs classic preprocessing technologies, including data cleansing, data integration, data transformation, and data reduction. Semi-structured or unstructured data, such as medical text, requires more complex and challenging processing methods. This presentation will focus on information extraction from text data in EMRs. Text in EMRs is accessible, especially with open-source information extraction algorithms, and significantly improves case detection when combined with codes. However, more harmonization of reporting within EMR studies is needed.

Biography

Min-Huei Marc Hsu is a Professor at the Graduate Institute of Data Science at Taipei Medical University. Dr. Hsu has dedicated himself to the adoption of health information technology and has been deeply involved in digital health projects in Taiwan. He is one of the essential promoters of Taiwan's National EMR exchange program. Dr. Hsu was appointed Director of the Medical Informatics Center at the Ministry of Health and Welfare (MOHW) of Taiwan in March 2011. Before the MOHW appointment, Dr. Hsu served as CIO at Taipei Medical University and as a Consultant Neurosurgeon at Wanfang Hospital (a 746-bed hospital affiliated with Taipei Medical University). He is also the author or co-author of more than 80 papers and articles in international conferences and scientific journals, focusing on health data, health information technology, e-health, electronic medical record systems, hospital information management, and patient safety.

16:50-17:00

Closing

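Registration and Payment

Registration fees:

General: members NT$700, non-members NT$900

Students: members NT$500, non-members NT$700

Payment deadline: from now until March 3, 2021 (on-site registration costs an additional NT$200).

How to register: online registration (click to open the registration page)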

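Payment:

Postal transfer: account name 「社團法人中華民國計算語言學學會」; account number: 19166251

(In the memo field of the transfer slip, please note "IR Workshop" and your Registration ID; registrants from the same organization may combine their payments into one transfer.)

Online credit card payment: after completing online registration you may choose online payment; please follow the instructions to complete it.

Payment due: 2021/03/06. Registrations still unpaid by then will be treated as cancelled.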

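Notes:

"Members" here means current members of the Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會). (For membership application and validity, click to open the association's website.)

Student non-members must provide proof of student status.

Other:

Please follow the relevant epidemic-prevention regulations when attending the workshop. Click to open the Taipei Medical University epidemic-prevention page.

Please note that this workshop does not provide printed materials; only electronic copies of the handouts that speakers have agreed to share will be available for attendees to download.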

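Venue Information

Getting there:

By bus: routes 1, 1503, 207, 254, 282, 284, 284 Direct, 292, 292 Sub, 611, 650, 672, Neihu Technology Park Commuter Shuttle 10, Nangang Software Park Commuter Shuttle (Zhonghe Line), and Nangang Software Park Commuter Shuttle (Shuanghe Line); get off at the 喬治商職 (George Vocational High School) stop.

By MRT: take the MRT Wenhu Line to Liuzhangli Station. From the single exit, walk along Keelung Road toward Taipei City Hall for nearly 300 meters (about 5 minutes) to the 7-ELEVEN 喬治 (George) store; the Taipei Medical University Da'an Campus is directly across the street.

By car:

(National Freeway 3) Coming off the Xinyi Expressway, use the left 2 lanes to take the exit and enter Section 5, Xinyi Road. Go straight toward Keelung Road/Civic Center for about 1.1 km, turn left onto Section 2, Keelung Road, and continue for 1 km; the Taipei Medical University Da'an Campus will be on your right.

(Huandong Boulevard) Follow the signs for Keelung Road, keep left to continue through the Keelung Road underpass, go straight along Section 1, Keelung Road, and continue straight on Section 2, Keelung Road for 1 km; the Taipei Medical University Da'an Campus will be on your right.

Venue address:

No. 172-1, Section 2, Keelung Road, Da'an District, Taipei City

B2 Conference Hall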

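Contact

Contact person: Secretary Huang (黃秘書)

Phone: (02)6638-2736 ext. 1105

Email: chenyu@tmu.edu.tw

Office: 11F, No. 172-1, Section 2, Keelung Road, Da'an District, Taipei City 106 (Taipei Medical University Da'an Campus)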
