
This function filters a data frame of articles, keeping only those whose title or abstract contains at least one of the specified whitelist terms. This makes it easy to extract articles relevant to a set of predefined topics.

Usage

filter_articles(article, whitelist_terms)

Arguments

article

A data frame or tibble containing at least the "title" and "abstract" columns.

whitelist_terms

A character vector of terms to match against the title or abstract of each article.

Value

A filtered data frame containing only articles where at least one of the whitelist terms is found in the title or abstract.

Details

The function combines the "title" and "abstract" columns into a single text string for each article and uses case-insensitive regular expression matching to search for any of the specified whitelist terms. Articles that match one or more terms are retained in the output data frame; all others are dropped.
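The behavior described above can be approximated in base R. The following is a minimal sketch, not the package's actual source: the name filter_articles_sketch and the sample data are hypothetical, and joining the terms with "|" into one pattern is an assumption about how the matching might be done.

```r
# Hypothetical re-implementation for illustration only;
# the real filter_articles() may differ in its details.
filter_articles_sketch <- function(article, whitelist_terms) {
  # Combine title and abstract into one string per row.
  text <- paste(article$title, article$abstract)
  # Join the terms into a single alternation pattern,
  # e.g. "crispr|gene therapy" (assumed approach).
  pattern <- paste(whitelist_terms, collapse = "|")
  # Keep rows whose combined text matches any term, ignoring case.
  keep <- grepl(pattern, text, ignore.case = TRUE)
  article[keep, , drop = FALSE]
}

# Made-up sample data, so the sketch runs without network access.
papers <- data.frame(
  title = c("CRISPR screens in T cells",
            "Trends in hospital staffing",
            "A new viral vector"),
  abstract = c("We edit primary human cells.",
               "A survey of nursing workloads.",
               "Advances in Gene Therapy delivery."),
  stringsAsFactors = FALSE
)

filtered <- filter_articles_sketch(papers, c("crispr", "gene therapy"))
filtered$title
```

In this sketch the first and third rows are kept, because matching ignores case. Since each term is treated as a regular expression, characters such as "." or "(" act as pattern syntax rather than literal text; a term meant to be matched literally would need to be escaped first.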

Examples

if (FALSE) { # \dontrun{
papers <- get_article(journal = "Nature Medicine")
filtered_papers <- filter_articles(papers, whitelist_terms = c("CRISPR", "gene therapy"))
} # }