The 3rd annual edition of the Singapore Symposium on Natural Language Processing (SSNLP) will take place online on December 11, 2020.
December 11 SSNLP 2020 is now live! Join us here
November 21 SSNLP 2020 registration is now live! It's free, so register now!
November 20 Join our SSNLP 2020 Slack Workspace now!
November 1 We have confirmed three world-class academic speakers so far, with more on the way!
October 21 We just launched the website! Stay tuned for more details on registration and program schedule!
|December 11, 2020 (SGT)|
|08:10 - 08:25||Welcome and Opening Remarks|
|08:25 - 09:00||Knowledge-Robust and Multimodally-Grounded NLP|
speaker: Mohit Bansal :: chaired by: Soujanya Poria
|09:00 - 09:35||Low resourced but long tailed spoken dialogue system building|
speaker: Eric Fosler-Lussier :: chaired by: Li Haizhou
|09:35 - 10:10||Advances in Question Answering Research for Personal Assistants|
speaker: Alessandro Moschitti :: chaired by: Gao Wei
|10:10 - 10:45||A Typology of Ethical Risks in Language Technology with an Eye Towards Where Transparent Documentation Can Help|
speaker: Emily M. Bender :: chaired by: Kokil Jaidka
|10:45 - 11:20||Do pretraining language models really understand language?|
speaker: Minlie Huang :: chaired by: Lei Wenqiang
|11:20 - 11:55||Low Resource Machine Translation|
speaker: Pushpak Bhattacharyya :: chaired by: Soujanya Poria
|11:55 - 15:00||Lunch break|
|15:00 - 16:00||Panel Discussion: On Low-Resource NLP|
chaired by: Nancy Chen
|16:00 - 16:35||Understanding Product Reviews: Topic-Specific Word Embedding Learning|
speaker: Yulan He :: chaired by: Jing Jiang
|16:35 - 16:50||Closing Remarks|
The following speakers have agreed to give keynotes at SSNLP 2020. You can view detailed information by clicking the images.
Title: Low Resource Machine Translation
Speaker: Pushpak Bhattacharyya
Abstract: AI now and in the future will have to grapple continuously with the problem of low resources. AI will increasingly be ML-intensive, but ML needs data, often with annotation, and annotation is costly. Over the years, through work on multiple problems, we have developed insight into how to do language processing in low-resource settings; six methods, individually and in combination, seem to be the way forward.
The present talk will focus on low-resource machine translation. We describe the use of these techniques and bring home the seriousness and methodology of doing machine translation in low-resource settings.
Bio: Prof. Pushpak Bhattacharyya is a Professor in the Department of Computer Science and Engineering at IIT Bombay. His research areas are Natural Language Processing, Machine Learning, and AI (NLP-ML-AI). He has published more than 350 research papers in various areas of NLP. As the author of the textbook 'Machine Translation', he has shed light on all paradigms of machine translation with abundant examples from Indian languages. Two recent monographs co-authored by him, 'Investigations in Computational Sarcasm' and 'Cognitively Inspired Natural Language Processing: An Investigation Based on Eye Tracking', describe cutting-edge research in NLP and ML. Prof. Bhattacharyya is a Fellow of the Indian National Academy of Engineering (FNAE) and an Abdul Kalam National Fellow. For sustained contribution to technology he received the Manthan Award of the Ministry of IT, the P.K. Patwardhan Award of IIT Bombay, and the VNMM Award of IIT Roorkee. He is also a Distinguished Alumnus of IIT Kharagpur.
Title: Knowledge-Robust and Multimodally-Grounded NLP
Speaker: Mohit Bansal
Abstract: In this talk, I will present our group's recent work on NLP models that are knowledge-robust and multimodally-grounded. First, we will describe multi-task and reinforcement learning methods to incorporate novel auxiliary-skill tasks such as saliency, entailment, and back-translation validity (including bandit-based methods for automatic auxiliary task selection+mixing and multi-reward mixing). Next, we will discuss developing adversarial robustness against reasoning shortcuts and cross-domain/lingual generalization in QA and dialogue models (including auto-adversary generation). Lastly, we will discuss multimodal, grounded models which condition and reason on dynamic spatio-temporal information in images and videos, and action-based robotic navigation and assembling tasks (including commonsense reasoning for ambiguous robotic instructions).
Bio: Dr. Mohit Bansal is the Parker Associate Professor in the Computer
Science department at UNC Chapel Hill. Prior to this, he was a research assistant professor
at TTI-Chicago. He received his PhD from UC Berkeley and his BTech from IIT Kanpur. His
research expertise is in statistical natural language processing and machine learning, with
a particular focus on multimodal, grounded, and embodied semantics (including RoboNLP),
human-like language generation and Q&A/dialogue, and interpretable and generalizable deep
learning. He is a recipient of the 2020 IJCAI Early Career Spotlight, 2019 DARPA Director's
Fellowship, 2019 Google Focused Research Award, 2019 Microsoft Investigator Fellowship, and
2019 NSF CAREER Award. His service includes Program Co-Chair for CoNLL 2019, Senior Area
Chair for several ACL, EMNLP, and AAAI conferences, and Associate Editor for the CL, IEEE/ACM
TASLP, and CSL journals.
Webpages: cs.unc.edu/~mbansal, murgelab.cs.unc.edu, https://nlp.cs.unc.edu/
Title: A Typology of Ethical Risks in Language Technology with an Eye Towards Where Transparent Documentation Can Help
Speaker: Emily M. Bender
Abstract: People are impacted by language technology in various ways: as direct users of the technology (by choice or otherwise); indirectly, when others use the technology; and in its creation, as annotators or contributors to training data sets (knowingly or not). In these roles, risks are borne differentially by different speaker populations, depending on how well the technology works for their language varieties and the extent to which they are subjected to marginalization. This talk explores strategies for mitigating these risks based on transparent documentation of training data.
Bio: Emily M. Bender is a professor of linguistics at the University of Washington, where she is the faculty director of the professional MS program in computational linguistics. Her research interests include the interaction of linguistics and NLP, the societal impact of language technology, and how transparent documentation can help mitigate the effects of bias and the potential for trained systems to perpetuate systems of oppression. She is also actively working on how best to incorporate training on ethics and societal impact into NLP curricula.
Title: Low resourced but long tailed spoken dialogue system building
Speaker: Eric Fosler-Lussier
Abstract: In this talk, I discuss lessons learned from our partnership with the Ohio State School of Medicine in developing a Virtual Patient dialog system to train medical students in taking patient histories. The OSU Virtual Patient's unusual development history as a question-answering system provides some interesting insights into co-development strategies for dialog systems. I also highlight our work in “speechifying” the patient chatbot and handling semantically subtle questions when speech data is non-existent and language exemplars for questions are few.
Bio: Eric Fosler-Lussier is a Professor of Computer Science and Engineering, with courtesy appointments in Linguistics and Biomedical Informatics, at The Ohio State University. He is also co-Program Director for the Foundations of Artificial Intelligence Community of Practice at OSU's Translational Data Analytics Institute. After receiving a B.A.S. (Computer and Cognitive Science) and B.A. (Linguistics) from the University of Pennsylvania in 1993, he received his Ph.D. in 1999 from the University of California, Berkeley. He has also been a Member of Technical Staff at Bell Labs, Lucent Technologies, and held visiting positions at Columbia University and the University of Pennsylvania. He currently serves as the IEEE Speech and Language Technical Committee Chair and was co-General Chair of ASRU 2019 in Singapore. Eric's research has ranged over topics in speech recognition, dialog systems, and clinical natural language processing, which has been recognized in best paper awards from the IEEE Signal Processing Society and the International Medical Informatics Association.
Title: Advances in Question Answering Research for Personal Assistants
Speaker: Alessandro Moschitti
Abstract: Automated Question Answering (QA) has traditionally been an interesting topic for NLP researchers, as its solutions involve the use of several language components, e.g., syntactic parsers, coreference and entity resolution, semantic similarity modules, knowledge sources, inference, and so on. In recent years, there has been renewed interest in QA, thanks also to the introduction of personal assistants and chatbots, for which QA can play an essential technological role. In this talk, we will describe how current NLP breakthroughs, i.e., neural architectures, pre-training, and new datasets, can be used to build QA systems of impressive accuracy in answering standard information-intent questions. In particular, we will (i) describe the components needed to design a state-of-the-art QA system, (ii) provide an interpretation of why Transformer models are very effective for QA, (iii) illustrate our transfer-and-adapt (TANDA) approach to improve Transformer models for QA, and (iv) provide effective solutions, e.g., our Cascade Transformer, to make such technology efficient.
Bio: Alessandro Moschitti is a Principal Applied Research Scientist at Amazon Alexa, leading the research on retrieval-based QA systems (since 2018), and a professor in the CS Department of the University of Trento, Italy (since 2007). He obtained his Ph.D. in CS from the University of Rome in 2003. He was a Principal Scientist at the Qatar Computing Research Institute (QCRI) for 5 years (2013-2018) and worked as a research fellow at The University of Texas at Dallas for 2 years (2002-2004). He was (i) a visiting professor at Columbia University, the University of Colorado, Johns Hopkins University, and MIT (CSAIL); and (ii) a visiting researcher at the IBM Watson Research Center (participating in the Jeopardy! challenge, 2009-2011). His expertise concerns theoretical and applied machine learning in the areas of NLP, IR, and data mining. He has devised innovative structural kernels and neural networks for advanced syntactic/semantic processing and inference over text, documented in about 300 scientific articles. He has received four IBM Faculty Awards, one Google Faculty Award, and five best paper awards. He has led about 25 projects, e.g., MIT CSAIL and QCRI joint projects, and European projects. He was the General Chair of EMNLP 2014 and a PC co-chair of CoNLL 2015, and has had a chair role in more than 50 conferences and workshops. He has been an action editor of TACL; currently he is an action/associate editor of ACM Computing Surveys and JAIR, and is on the editorial board of MLJ and JNLE.
Title: Do pretraining language models really understand language?
Speaker: Minlie Huang
Abstract: Today, pretrained language models are dominant in various natural language understanding and generation tasks. In this talk, the speaker will try to answer the question: do pretraining language models really understand language? First, the notions of meaning, understanding, and knowledge will be discussed, along with what has and has not been learned by existing pretraining models. Finally, the speaker will discuss how NLU and NLG tasks can be done better with knowledge, covering solutions such as knowledge injection, domain-specific pretraining tasks, and explicit control of knowledge use.
Bio: Dr. Minlie Huang is an associate professor at Tsinghua University. His research interests include natural language processing, particularly dialog systems and language generation. He authored the Chinese book “Modern Natural Language Generation” and has published more than 80 papers in premier conferences. He won the Wuwenjun AI Award in 2019, the Alibaba Innovative Research Award in 2019, and the Hanvon Youth Innovation Award in 2018. He won the SIGDIAL 2020 best paper award, the NLPCC 2020 best student paper award, and the IJCAI-ECAI 2018 distinguished paper award, and was a nominee for the ACL 2019 best demo paper award. He served as ACL 2021 Diversity & Inclusion co-chair, EMNLP 2021 workshop co-chair, area chair for ACL 2020/2016, EMNLP 2020/2019/2014/2011, and AACL 2020, Senior PC member for IJCAI 2017-2020 (Distinguished SPC, IJCAI 2018) and AAAI 2017-2021, associate editor for TNNLS, and action editor for TACL. He has been supported by several NSFC projects, including one key NSFC project.
His homepage is at: http://coai.cs.tsinghua.edu.cn/hml/.
The following speakers have agreed to serve as panelists for the panel discussion at SSNLP 2020. You can view detailed information by clicking the images.
Speaker: Monojit Choudhury
Bio: Dr. Monojit Choudhury is a Principal Researcher at Microsoft Research India, where he has worked since 2007. His research spans many areas of artificial intelligence, cognitive science, and linguistics. In particular, Dr. Choudhury has been working on technologies for low-resource languages, code-switching (mixing of multiple languages in a single conversation), computational sociolinguistics, and conversational AI. Dr. Choudhury is an adjunct faculty member at the International Institute of Information Technology, Hyderabad, and at Ashoka University. He also organizes the Panini Linguistics Olympiad for high school children in India and is the founding co-chair of the Asia-Pacific Linguistics Olympiad. Dr. Choudhury holds B.Tech and PhD degrees in Computer Science and Engineering from the Indian Institute of Technology, Kharagpur.
Speaker: Lidong Bing
Bio: Lidong Bing leads the NLP team at the R&D Center Singapore, Machine Intelligence Technology, Alibaba DAMO Academy. The team works on a variety of NLP research and development projects that are tightly aligned with the globalization of Alibaba in the Southeast Asia region. Prior to joining Alibaba, he was a Senior Researcher at Tencent AI Lab. He received a PhD degree from The Chinese University of Hong Kong and was a Postdoctoral Research Fellow in the Machine Learning Department at Carnegie Mellon University. His research interests include low-resource NLP, sentiment analysis, text generation/summarization, information extraction, and knowledge bases.
Speaker: Bill Jun Lang
Bio: Dr. Jun Lang is a Senior Expert for Search R&D at Taobao, Alibaba. He obtained his Ph.D. from Harbin Institute of Technology (HIT) in January 2010. From February 2010 to February 2014, he was a Research Scientist in the Human Language Technology Department (HLT) of the Institute for Infocomm Research (I2R), Singapore, working on statistical machine translation R&D. His major research interests include Natural Language Processing, Information Extraction, Machine Translation, and Machine Learning. Currently, he leads the E-commerce Knowledge Graph Group of Taobao at Alibaba.com.
Speaker: Dat Quoc Nguyen
Bio: Dat Quoc Nguyen is a senior research scientist at VinAI Research, Vietnam. He is also an honorary fellow in the School of Computing and Information Systems at the University of Melbourne, Australia, where previously he was a research fellow. Before that, he received his PhD from the Department of Computing at Macquarie University, Australia. Dat Quoc Nguyen has been working on applications of machine learning to natural language processing. He has served as a PC member for top-tier NLP/AI conferences and authored over 30 highly-cited scientific papers.
Speaker: Attapol Rutherford
Bio: Attapol Te Rutherford is an Assistant Professor of Linguistics at Chulalongkorn University, Bangkok. He received his PhD in Computer Science from Brandeis University, USA, and was previously a data scientist at LinkedIn. He is interested in NLP infrastructure for the Thai language and in NLP applications in computational legal studies and education.
Gangeshwar Krishnamurthy, Institute of High Performance Computing
Wenqiang Lei, National University of Singapore
|General Chair:||Jing Jiang, Singapore Management University|
Min-Yen Kan, National University of Singapore
Kokil Jaidka, Nanyang Technological University
Soujanya Poria, Singapore University of Technology and Design
Ai Ti Aw, Institute for Infocomm Research
Francis Bond, Nanyang Technological University
Nancy Chen, Institute for Infocomm Research
Shafiq Joty, Nanyang Technological University
Haizhou Li, National University of Singapore
Wei Lu, Singapore University of Technology and Design
Hwee Tou Ng, National University of Singapore
Jian Su, Institute for Infocomm Research
Gao Wei, Singapore Management University
Luu Anh Tuan, Massachusetts Institute of Technology
SSNLP 2020 will be going virtual! Register now to receive the link to the event.