Machine learning in health: Opportunities and challenges
The Machine Learning in Health Speaker Series brings together leading thinkers, researchers and learners to discuss the current and potential impact of machine learning in healthcare.
This fall, McMaster’s Health Information Research Unit, in partnership with the Department of Health Research Methods, Evidence, and Impact (HEI), launched the Machine Learning in Health Speaker Series. Since then, leading thinkers and researchers from the fields of machine learning and AI have engaged more than 30 learners and faculty from the McMaster community, including the MacDATA Institute and the McMaster AI Society, on the topic of machine learning applications to health information, with a particular focus on text-based approaches.
Session one
In September, Lingyang Chu, an assistant professor new to McMaster’s Department of Computer Science and Software, presented on the interpretability of AI models. Chu’s research covers algorithmic data mining, machine learning and deep learning, with a focus on transferring research to real-world applications such as healthcare. He develops novel methods to understand the ‘black box’ of deep neural networks – a challenge that confounds many of us in the move to integrate AI in healthcare.
Chu explained how interpreting machine learning models helps us understand why certain predictions or decisions are made, and he gave examples of approaches used to interpret deep neural networks. Reliable interpretations need to be exact, consistent, robust and representative. Using easy-to-understand examples, he described why these properties are important and presented his ongoing work on finding exact and consistent interpretations, representative areas within images that support model predictions, and robust interpretations of graphs.
Several open problems remain in the interpretability of machine learning models in healthcare. Two areas of interest to Chu are the application of knowledge to improve the interpretability of models and the integration of what we learn from interpreting models into our knowledge base and care processes. For example, integrating clinical knowledge and evidence into model interpretation can improve the performance of neural networks; this could include having people with clinical knowledge add their expertise to the interpretations and linking model predictions to existing knowledge and evidence sources. Interpretations can also be used to generate new knowledge: if a deep neural network uncovers an important pattern in the data, how can this new information be integrated into the evidence base or care processes?
Session two
In October, Byron Wallace, an associate professor from Northeastern University, presented on AI and systematic reviews. Tools and approaches for automating components of the systematic review process are an active area of machine learning and AI research that applies natural language processing techniques. Wallace presented several projects that he and his team have been working on, including tools and datasets that are available to any user.
RobotReviewer is a tool that semi-automates information extraction for systematic reviews using a suite of machine learning models, including one that identifies randomized controlled trials (RCTs). In RobotReviewer, you can upload PDFs of articles, and the machine learning models extract trial attributes, such as the PICO elements, and autogenerate a Cochrane risk of bias assessment. The extracted information is intended to support systematic reviewers, who then validate that the information is correct and make any necessary edits. The RCT model was trained using a dataset of more than 3,000 open-access articles, with key elements annotated by clinicians. The dataset, called Evidence Inference, has also been used to answer other research questions.
Each day, Trialstreamer uses an RCT machine learning model to retrieve RCTs from PubMed and clinical trial registries, and extracts the sample size, the PICO elements and the main findings. It then presents a probability score for low risk of bias for each trial. These structured data are added to the Trialstreamer database, a living, searchable database of RCTs.
Wallace shared a number of other exciting avenues of research, including generating narrative syntheses of RCTs for systematic reviews and monitoring Trialstreamer to identify articles that could be used to update systematic reviews. The work done by Wallace and his collaborators could be of great value to members of HEI, as we conduct reviews and assess the quality of published articles.
Upcoming sessions
Muhammad Afzal
Text classification and summarization
Thursday, November 25, 2021 from 9-10 a.m. EST | Zoom link here.
Guilherme Del Fiol
TBD
Thursday, December 16, 2021 from 4-5 p.m. EST | Zoom link available soon.
Recordings of previous sessions
You can find recordings of the September and October sessions here.
If members of the HEI community have suggestions for other machine learning in healthcare topics or guests, please contact Cynthia Lokker.