Natural Language Processing

At Uppsala Monitoring Centre, our researchers use their expertise in NLP methods to bring innovative research to concrete use in pharmacovigilance.

Natural Language Processing (NLP) is a scientific domain at the intersection of linguistics and artificial intelligence. It centres on the development of computer algorithms for processing unstructured data, such as that found in clinical notes, medical literature, and patient records, to ultimately derive task-specific knowledge. Interest in harnessing data from such sources, which are rich in information relevant to pharmacovigilance, has grown. Together with recent advances in NLP methods, this has made it possible to tap into information that can help assess whether a reported side effect is associated with a medicine.

Our researchers combine state-of-the-art NLP techniques with clinical expertise to answer pertinent research questions where structured data reaches its limits.

Key NLP research at UMC

Processing case narratives

Narratives typically describe the events experienced by a patient during treatment and could be useful in identifying and assessing emerging safety concerns, as they can be rich in information not captured in the structured fields of E2B case reports. Narratives also provide the context surrounding the suspected adverse drug reaction (ADR), and sometimes state the clinical reasoning behind reporting of the suspected ADR. Given that narratives are written in free-text form, special care needs to be taken to protect the privacy of the patient and the reporter. Our research aims to extract clinically relevant information from narratives and develop algorithms to de-identify the text to support safe sharing of sensitive data.
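To make the idea of de-identification concrete, the following is a minimal sketch in Python that masks a few kinds of identifiers in free text with placeholder tags. It is an illustration only and not the method used in our research, which relies on learned models rather than hand-written rules. The patterns, the tag names, and the function deidentify are all assumptions made for this example, and they would miss most identifiers in real narratives.

```python
import re

# Hypothetical, deliberately narrow patterns for illustration only.
# Real narratives need far broader coverage than rules like these provide.
PATTERNS = [
    ("DATE", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),                 # ISO-style dates
    ("PHONE", re.compile(r"\+?\d[\d -]{7,}\d")),                    # long digit runs
    ("NAME", re.compile(r"\b(?:Dr|Mr|Mrs|Ms)\.? [A-Z][a-z]+\b")),   # title + surname
]


def deidentify(text):
    """Replace each matched identifier with a bracketed placeholder tag."""
    # Patterns are applied in order, so dates are masked before the
    # broader phone pattern gets a chance to match them.
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


narrative = (
    "Dr. Smith saw the patient on 2019-03-14 and prescribed 20 mg daily. "
    "Call +46 18 123 456."
)
print(deidentify(narrative))
```

Note that the clinically relevant detail (the 20 mg daily dose) is left untouched while the name, date, and phone number are masked. Keeping useful content while removing identifying content is the balance any de-identification approach has to strike.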

Extraction of severity information from clinical narratives using statistical natural language processing, 2016

Deep neural networks for inverse de-identification of medical case narratives in reports of suspected adverse drug reactions, 2018

Also see:

Prospective evaluation of adverse event recognition system in Twitter: results from the Web-RADR project, 2020

How far can we go with just out-of-the-box BERT models?, 2020

Can we harness social media for ADR signal detection?, 2019

Recognising adverse event data from social media

Millions of individuals share their experiences with medicines on social media, offering a potentially useful and complementary source of information for post-market surveillance of medicines. At UMC, we have explored ways to extract information about adverse drug reactions from social media posts, such as Tweets, and assessed their value in identifying emerging safety signals.

Mining drug knowledge from structured product labels

The product label of a medicine provides a summary of the essential and most up-to-date scientific information needed for the safe use of a product, such as indications, dosage, and side effects. As most reported side effects are already known, there is a need for a knowledge base that maps medicines to their known side effects. This could assist in identifying rare and previously unknown safety concerns or inform signal detection and/or assessment for related medicines. In our research, we use NLP methods to extract information from structured product label documents and map it to specific terms of interest in pharmacovigilance, such as labelled side effects or indications.
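As a rough illustration of mapping label text to standard terms, the sketch below looks up phrases from a small lexicon in a label sentence and returns the preferred terms they correspond to. It is a simplified stand-in using dictionary matching, not the deep learning approach described in the publication listed in this section. The lexicon entries, the preferred term names, and the function extract_terms are hypothetical and are not taken from any real terminology.

```python
import re

# Hypothetical lexicon mapping surface phrases to preferred terms.
# A real system would draw on a curated medical terminology instead.
LEXICON = {
    "headache": "Headache",
    "nausea": "Nausea",
    "feeling sick": "Nausea",
    "skin rash": "Rash",
    "rash": "Rash",
}


def extract_terms(label_text):
    """Return the sorted preferred terms whose phrases occur in the text."""
    text = label_text.lower()
    found = set()
    for phrase, preferred_term in LEXICON.items():
        # Word boundaries stop a phrase from matching inside a longer word.
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(preferred_term)
    return sorted(found)


sentence = "Common side effects include headache, feeling sick and skin rash."
print(extract_terms(sentence))
```

Exact phrase matching of this kind cannot handle negation, paraphrase, or misspellings, which is one reason learned models are attractive for this task.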

Extracting adverse drug reactions from product labels using deep learning and natural language processing, 2020