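Professor Jian-Yun Nie of the University of Montreal, Canada, and Professor Mingwen Wang and Associate Professor Jiali Zuo of Jiangxi Normal University Visit the Laboratory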
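On the morning of June 24, 2019, at the invitation of Research Fellow Le Sun of the laboratory, Professor Jian-Yun Nie of the University of Montreal, Canada, together with Professor Mingwen Wang and Associate Professor Jiali Zuo of Jiangxi Normal University, visited the laboratory.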
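Jian-Yun Nie is a professor at the University of Montreal, Canada. He has worked for many years in natural language processing (NLP) and information retrieval (IR). His main research areas include information retrieval models, cross-language information retrieval, query expansion, query recommendation, query understanding, the use of query logs, and sentiment analysis. Professor Nie has published more than 200 papers in journals and international conferences in IR and NLP. He serves on the editorial boards of several international journals (e.g. Journal of Information Retrieval) and, as a program committee member, has taken part in organizing the main conferences of many international IR and NLP venues. He was the general chair of SIGIR 2011 and is the program committee chair of SIGIR 2019.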
Professor Nie presented his research team's latest work on keyphrase extraction, "DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases", which is to appear at the 42nd ACM SIGIR conference (SIGIR 2019).
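Professor Nie first reviewed the characteristics of traditional unsupervised and supervised approaches to keyphrase extraction. He then introduced his team's latest method, which uses graph neural networks to capture and encode document-level associations between words. Compared with methods that do not use graph neural networks, this method achieves state-of-the-art results on the scientific-paper keyphrase generation dataset Kp20k.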
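After the talk, Professor Nie had a lively exchange with the faculty and students in attendance, answering their research questions and sharing his thoughts on current frontier problems in IR. Everyone benefited greatly.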
Appendix: abstract of "DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases": Keyphrase extraction from documents is useful to a variety of applications such as information retrieval and document summarization. We present an end-to-end method called DivGraphPointer for extracting a set of diversified keyphrases from a document. DivGraphPointer combines the advantages of traditional graph-based ranking methods and recent neural network-based approaches. Specifically, given a document, a word graph is constructed from the document based on word proximity and is encoded with graph convolutional networks, which effectively capture document-level word salience by modeling long-range dependency between words in the document and aggregating multiple appearances of identical words into one node. Furthermore, we propose a diversified point network to generate a set of diverse keyphrases out of the word graph in the decoding process. Experimental results on five benchmark data sets show that our proposed method significantly outperforms the existing state-of-the-art approaches.
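For readers unfamiliar with the idea of a word graph built from word proximity, the following is a minimal Python sketch of that one step only. It is an illustration under stated assumptions, not the paper's implementation: the function name `build_word_graph`, the `window` parameter, the undirected edges, and the use of raw co-occurrence counts as edge weights are all hypothetical choices made here, and the graph convolutional encoder and the pointer decoder are not shown.

```python
from collections import defaultdict


def build_word_graph(tokens, window=2):
    """Build an undirected word co-occurrence graph (illustrative only).

    Every distinct word becomes one node, so repeated appearances of the
    same word are merged. Two words are linked when they occur within
    `window` positions of each other; the edge weight counts how often.
    The window size and count-based weighting are assumptions of this
    sketch, not details taken from the paper.
    """
    nodes = sorted(set(tokens))
    edges = defaultdict(int)
    for i, word in enumerate(tokens):
        # Look ahead at most `window` positions from the current token.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other == word:
                continue  # skip self-loops
            edges[tuple(sorted((word, other)))] += 1
    return nodes, dict(edges)


tokens = "graph pointer network extracts diverse keyphrases from a word graph".split()
nodes, edges = build_word_graph(tokens, window=2)
```

In this toy sentence the word "graph" appears twice but yields a single node, which is linked both to "pointer" near the start and to "word" near the end. This is the sense in which merging identical words lets a graph connect distant parts of a document.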