Indexing the Biomedical Literature in a Time of Increased Demand and Limited Resources
Alan R Aronson, Lister Hill Center, U.S. National Library of Medicine, USA
Abstract
The National Library of Medicine (NLM) began the NLM Indexing Initiative (II) project in the mid-1990s to investigate methods for automatic and assisted indexing, with the aim of enhancing access to NLM document collections including MEDLINE/PubMed. A prototype indexing system was subsequently created and has since evolved into a mature application called the NLM Medical Text Indexer (MTI). MTI has provided indexing assistance since mid-2002 and has seen ever-increasing usage by NLM indexers. The MTI recommendations are now considered "first line" indexing (MTIFL) for a small but growing number of journals for which it performs well. After a brief historical introduction to MTI, this talk will focus on recent MTI development and other research areas the II team is pursuing in light of the increasing demand for indexing and the simultaneous decrease in resources available to accomplish that indexing.
Bio
Alan (Lan) R. Aronson, PhD, is a senior researcher at the Lister Hill Center, U.S. National Library of Medicine. His research focuses on applying natural language processing (NLP) techniques to biomedical text for tasks ranging from indexing and retrieval of the biomedical literature to mining clinical text. He is the developer of the world-recognized MetaMap program, which automatically detects the biomedical concepts occurring in text, and his research group is responsible for NLM's Medical Text Indexer (MTI), which assists in various indexing efforts including MEDLINE indexing.
Besides being a member of several professional societies, Dr. Aronson was elected a Fellow of the American College of Medical Informatics (FACMI) in 2005. In addition, he received the NLM Board of Regents Award in 2010 and an NIH Director's Award in 2012, both for his sustained contribution to the medical informatics community in the development of MetaMap and the Medical Text Indexer.
Watson Beyond Jeopardy!: Adaptation to the Medical Domain
Jennifer Chu-Carroll, IBM T.J. Watson Research Center, USA
Abstract
In 2011, IBM's Watson system famously defeated the two best human players, Ken Jennings and Brad Rutter, in a two-game Jeopardy! exhibition match. The Watson Jeopardy! system demonstrated both high accuracy and speed in open-domain question answering that were unparalleled then and remain so now. It achieved this level of performance through seamless integration of state-of-the-art techniques in Natural Language Processing, Information Retrieval, Knowledge Representation and Reasoning, Machine Learning, and High-Performance Computing.
Since 2011, the IBM Watson Research team has been developing rapid domain adaptation techniques to enable applications of Watson technologies in a variety of business domains. The team is currently focused on adapting Watson to work on differential diagnosis and treatment in the medical domain. This new task created the need for a high-performing medical QA system that works in conjunction with an inference engine, which reasons with assertions produced by the base QA system. In this talk, I will discuss the Watson adaptation process and the challenges we encountered in applying these procedures to the medical adaptation of the Watson system. I will also present WatsonPaths, an interactive medical application that leverages our base QA system along with the newly developed inference chaining capability to demonstrate how the technology can help Watson and the user learn from one another.
Bio
Jennifer Chu-Carroll is a Research Staff Member at IBM T.J. Watson Research Center. She also manages the Knowledge Structures group, which focuses on improving advanced search technology through the use of natural language processing and machine learning techniques. Prior to joining IBM in 2001, she spent 5 years as a Member of Technical Staff at Lucent Technologies Bell Laboratories. Her research interests include question answering, semantic search, natural language discourse processing, and spoken dialogue management.
She is currently involved in the DeepQA project, which focuses on building scalable and reusable Question Answering technology by developing and integrating state-of-the-art techniques in Natural Language Processing, Information Retrieval, Knowledge Representation and Reasoning, and Machine Learning. An initial application of DeepQA technology is Watson, a computer system that defeated Ken Jennings and Brad Rutter in a two-game Jeopardy! match in February 2011.
She is the General Chair for NAACL HLT 2012, and is currently serving on the editorial board of the Journal of Dialogue Systems. In the recent past, she served on the executive board of the North American Chapter of the ACL in 2007 and 2008, as program co-chair of the HLT/NAACL 2006 Conference and program committee area chair for EMNLP 2010 and EMNLP/HLT 2005. She also served on the editorial board of the Computational Linguistics Journal, and as secretary and scientific advisory board member of the ACL/ISCA special interest group on discourse and dialogue (SIGDIAL).