Living Systematic Reviews: Real-time Evidence Monitoring Powered By Natural Language Processing (NLP).

We’ve discussed systematic literature reviews (also known as SLRs or systematic reviews) several times on the CapeStart blog, including why they’re a cornerstone of evidence-based medicine (EBM).

We’ve also explored various ways machine learning (ML) and other AI tools can help improve the speed and accuracy of various elements of SLR production, such as PICO element identification and extraction.

Time-saving tools matter here because one of the main downsides of traditional SLRs is how much time and effort they take to produce. High methodological standards mean producing just one takes nearly 70 weeks, and many reviews are already outdated by the time they’re finally published.

An out-of-date review can have real-world consequences: clinicians who rely on SLRs may give the wrong advice or prescriptions based on older information.

In the age of Covid-19, especially, timely access to the latest high-quality research in a rapidly changing environment can save lives and improve health outcomes.

That’s partly why the scientific community has begun to embrace the concept of living systematic reviews, a new type of SLR that can be rapidly updated with the latest information. Because living reviews depend on advanced natural language processing (NLP), a field of AI now largely driven by ML, they weren’t even possible just a few years ago.

What are living systematic reviews?

The British medical research organization Cochrane says living systematic reviews (LSRs) are systematic reviews that are continually updated and that incorporate new and reliable evidence as it becomes available. They’re ongoing and easily accessible reviews always updated with the latest science.

That’s quite different from traditional SLRs, which are typically published once and rarely updated. Indeed, according to the BMJ, only a minority of reviews are updated within two years and nearly 10 percent of systematic reviews are inaccurate the day they’re published.

Along with having the same scientific rigor as traditional SLRs, Cochrane adds that LSRs also:

  • Rely on continuous monitoring of new evidence (typically at least once a month);
  • Incorporate any new and important evidence promptly; and
  • Proactively disclose the date of their last update and any new evidence that has been or will be incorporated.

Living systematic reviews are especially important given the rapidly expanding amount of scientific research across numerous academic and research databases, with around 2.5 million new research papers published every year. That’s good for a four percent annual growth rate, with the total number of citations set to double every 12 years.

Keeping pace with that volume by hand isn’t realistic. But advances in NLP technology – which transforms free text into code that computers can read and understand, including determining the context of words within language – and other AI methods have unlocked the potential of living reviews.

When is it appropriate to do an LSR?

Not every SLR needs to be a living systematic review: Because they’re continually updated, living reviews typically require more effort and commitment over the long term. If the research question isn’t of clinical importance or if the certainty of existing evidence is extremely solid, for instance, then an LSR isn’t necessary.

But for pressing clinical questions with unsettled conclusions, or if there’s a good chance new and important evidence could appear in the future, then an LSR is appropriate. 

Major differences between traditional and living systematic reviews

Several noteworthy differences exist between traditional and living systematic reviews. We’ve broken down some of the biggest in the following table:

| Process Element | Traditional Systematic Reviews | Living Systematic Reviews |
| --- | --- | --- |
| Workload and division of labor | Small workgroup of the same individuals with an extremely large workload for a limited period (typically between 1 and 1.5 years). | A larger workgroup, often consisting of different individuals as time passes, with a permanent but lighter workload. |
| Literature search | Create search strategy, run the search, screen results and incorporate using the push model. All of this is done once. | Create search strategy, run the search, screen results and incorporate using the push model. Regular and ongoing search is then automated and authors are notified when potential new evidence is detected. |
| Updates/incorporation of new information | Long, drawn-out republishing process that often requires the same effort as the original publication. | Update parameters are defined in advance and LSRs are continually updated in accordance with those parameters, typically using an accessible online platform. |
| Editorial and peer review | Editorial and peer review required before publication. Every update must go through the same time-consuming editorial and peer review process. | Editorial and peer review required before publication. New updates may or may not require peer and editorial review, depending on the impact of new evidence on findings and conclusions. |
| Publication | Static, rarely updated. | Dynamic, persistent publishing, typically in an online-only format via an accessible platform allowing for fast updates. |
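To make the "automated ongoing search" row more concrete, here is a minimal Python sketch of the idea: each scheduled search run is compared against the records already screened, and authors get a short notice when something new turns up. The function names, the record format (a dict with an `id` key), and the message wording are all assumptions for illustration, not part of any particular LSR platform.

```python
from datetime import date


def detect_new_records(seen_ids, search_results):
    """Return records from the latest search that have not been screened before.

    Assumes each record is a dict with a unique "id" key (hypothetical format).
    """
    new_records = []
    batch_ids = set()
    for record in search_results:
        record_id = record["id"]
        # Skip anything already screened, and duplicates within this batch.
        if record_id in seen_ids or record_id in batch_ids:
            continue
        batch_ids.add(record_id)
        new_records.append(record)
    return new_records


def run_scheduled_search(seen_ids, search_results, run_date):
    """Process one scheduled search run and build a notice for the authors."""
    new_records = detect_new_records(seen_ids, search_results)
    # Remember the new records so the next run does not flag them again.
    seen_ids.update(record["id"] for record in new_records)
    if new_records:
        notice = f"{run_date.isoformat()}: {len(new_records)} potential new record(s) to screen"
    else:
        notice = f"{run_date.isoformat()}: no new records found"
    return new_records, notice


if __name__ == "__main__":
    already_screened = {"rec-001", "rec-002"}
    latest_results = [
        {"id": "rec-002", "title": "Previously screened trial"},
        {"id": "rec-003", "title": "Newly indexed trial"},
    ]
    _, notice = run_scheduled_search(already_screened, latest_results, date(2022, 3, 1))
    print(notice)
```

In practice the search results would come from a bibliographic database query on a fixed schedule; the comparison step is what lets the workload stay "permanent but lighter."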


How exactly LSRs are updated is largely defined by the workgroup’s search parameters. Elliott et al., widely credited with inventing the concept of LSRs, proposed the following approach in 2017:

  • Search/screening updated but no new evidence found: Authors should show date of search and indicate that no new evidence was found.
  • Search/screening updated, new evidence found that’s not likely to change review findings: Authors should show date of search, state that evidence was found and describe the new evidence, and explain why it’s not yet integrated into the review.
  • Search/screening updated, new evidence found that will likely change review findings: Authors should show date of search, state that evidence was found and describe the new evidence, and incorporate the new evidence and new findings into the review.
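The three cases above amount to a small decision rule, which can be sketched in Python as follows. This is only an illustration of the logic: the `UpdateAction` names, the `plan_update` function, and the wording of the status statements are assumptions, and the judgment about whether evidence is "likely to change findings" would still come from the review authors.

```python
from datetime import date
from enum import Enum


class UpdateAction(Enum):
    REPORT_NO_NEW_EVIDENCE = "report_no_new_evidence"
    DESCRIBE_WITHOUT_INTEGRATING = "describe_without_integrating"
    INCORPORATE = "incorporate"


def plan_update(search_date, new_studies, likely_to_change_findings):
    """Map one search/screening cycle onto the three cases listed above.

    new_studies: list of short study labels found in this cycle (may be empty).
    likely_to_change_findings: the authors' judgment, supplied as a boolean.
    """
    stamp = f"Search updated {search_date.isoformat()}."
    if not new_studies:
        return UpdateAction.REPORT_NO_NEW_EVIDENCE, f"{stamp} No new evidence was found."

    found = f"{stamp} New evidence found: {', '.join(new_studies)}."
    if likely_to_change_findings:
        return UpdateAction.INCORPORATE, f"{found} The review findings have been updated."
    return (
        UpdateAction.DESCRIBE_WITHOUT_INTEGRATING,
        f"{found} It is unlikely to change the findings and is not yet integrated.",
    )


if __name__ == "__main__":
    action, statement = plan_update(date(2022, 3, 1), ["Trial A"], False)
    print(action.name, "-", statement)
```

In every branch the search date is shown, which matches the requirement that an LSR always discloses when it was last updated.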

How NLP makes living systematic reviews possible

AI tools for LSRs are most important in the research and selection phase, which traditionally requires human researchers to pore over thousands upon thousands of scientific documents. 

As Marshall et al. explain, the core NLP technologies now used in both SLRs and LSRs are text classification and data extraction/data mining.

  • Text classification: Automatically sorts documents (including abstracts, full text, or article snippets) based on predefined categories. This is often performed for purposes of abstract screening, which determines whether articles meet the project’s inclusion criteria. Typically, an NLP model assigns a probability value to each document and ranks documents by potential relevance based on probability.
  • Data extraction: Attempts to identify phrases or words/numbers corresponding to variables of interest (such as extracting the number of randomized participants in a clinical trial).
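Both tasks can be illustrated with a small, self-contained Python sketch. For screening, it trains a toy naive Bayes classifier on a handful of human-labelled abstracts, assigns each new abstract a probability of relevance, and ranks by that probability. For extraction, it uses a simple regular expression as a stand-in; production systems typically use trained models rather than hand-written patterns. Everything here (function names, the training examples, the patterns) is a hypothetical illustration, not the method of Marshall et al. or of any specific tool.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_abstracts):
    """Fit a tiny naive Bayes model from (text, is_relevant) pairs.

    Assumes at least one relevant and one irrelevant example are present.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labelled_abstracts:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def relevance_probability(model, text):
    """Return the estimated probability that an abstract is relevant."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for token in tokenize(text):
            if token in vocabulary:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][token] + 1) / (total_words + len(vocabulary))
                )
        log_scores[label] = score
    # Convert the two log scores into a normalised probability.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights[True] / (weights[True] + weights[False])


def rank_abstracts(model, abstracts):
    """Return (probability, abstract) pairs, most likely relevant first."""
    scored = [(relevance_probability(model, a), a) for a in abstracts]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative patterns only; real abstracts phrase this in many more ways.
RANDOMIZED_PATTERNS = [
    r"(\d[\d,]*)\s+(?:patients|participants|subjects)\s+were\s+randomi[sz]ed",
    r"randomi[sz]ed\s+(\d[\d,]*)\s+(?:patients|participants|subjects)",
]


def extract_randomized_count(text):
    """Return the number of randomized participants if a pattern matches, else None."""
    for pattern in RANDOMIZED_PATTERNS:
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            return int(match.group(1).replace(",", ""))
    return None
```

A reviewer would then work down the ranked list from the top, so the abstracts most likely to meet the inclusion criteria are seen first.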

NLP models are typically underpinned by ML technology, rather than rule-based algorithms. ML models learn and improve effectiveness based on feedback – the more the models learn, the more accurate they become – and can be trained by a human expert. 

To achieve this, NLP models combine the following levels of analysis:

  1. Lexical (locating words in a larger text)
  2. Morphologic (classifying words by parts of speech, such as nouns and verbs)
  3. Syntactic (determining the context of sentences and phrases around the words)
  4. Semantic (determining the meaning of words and phrases) 

While highly accurate and becoming more accurate all the time, NLP and ML models aren’t perfect. Indeed, systematic reviews require a very high level of accuracy that automation can sometimes struggle to achieve. Many organizations therefore choose to semi-automate their SLR production.
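One common way to picture semi-automation is a confidence-based triage: the model handles the clear-cut cases and routes everything uncertain to a human reviewer. The Python sketch below shows that idea under stated assumptions. The `triage` function, the queue names, and the 0.9/0.1 thresholds are hypothetical, and a real project would choose its thresholds (and whether anything is excluded without human review at all) based on its own accuracy requirements.

```python
def triage(scored_abstracts, include_at=0.9, exclude_at=0.1):
    """Split (doc_id, probability) pairs into three screening queues.

    Abstracts scored at or above include_at are flagged as likely includes,
    those at or below exclude_at as likely excludes, and everything in
    between goes to a human reviewer. Thresholds are illustrative defaults.
    """
    if not 0.0 <= exclude_at < include_at <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= exclude_at < include_at <= 1")

    queues = {"likely_include": [], "human_review": [], "likely_exclude": []}
    for doc_id, probability in scored_abstracts:
        if probability >= include_at:
            queues["likely_include"].append(doc_id)
        elif probability <= exclude_at:
            queues["likely_exclude"].append(doc_id)
        else:
            queues["human_review"].append(doc_id)
    return queues


if __name__ == "__main__":
    scores = [("rec-101", 0.97), ("rec-102", 0.55), ("rec-103", 0.03)]
    print(triage(scores))
```

Widening the gap between the two thresholds sends more abstracts to human review, trading speed for the higher accuracy that systematic reviews demand.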

But the time-saving elements of these models – even accounting for some mistakes – can still make the research and selection phase dramatically faster while improving its accuracy.

Power up your systematic reviews with the NLP experts at CapeStart

The machine learning and natural language processing experts at CapeStart can semi-automate the SLR process flow – reducing time spent, improving accuracy, and scaling your team’s effectiveness. 

Our expert teams of machine learning engineers and data scientists can deploy near-infinite combinations of customized and pre-built ML and NLP models to quickly identify, isolate, and extract relevant information from scientific literature, improve PICO identification for research question formulation, and intelligently map evidence gaps.

Contact us today to learn more about our AI and development services for SLRs and meta-analyses.
