This package creates a highlighted source document based on how frequently its phrases appear in one or more note sheets. The goal is to indicate which portions of the source document individuals felt were most worth copying into their notes. The procedure requires two inputs: a notes document and a source document. The output is HTML code that renders the highlighted text.
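To make the idea of frequency-based highlighting concrete, here is a minimal base R sketch. It is not the highlightr implementation: the helper names (`tokenize`, `bigrams`), the use of two-word phrases, and the opacity-based colouring are all assumptions chosen for illustration, and the sample text is made up.

```r
# Illustrative only: a toy version of phrase-frequency highlighting.
# None of these helpers come from highlightr; they are hypothetical.

# Lowercase the text, drop punctuation, and split it into words
tokenize <- function(x) {
  words <- strsplit(tolower(gsub("[^A-Za-z' ]", " ", x)), " +")[[1]]
  words[nzchar(words)]
}

# Build overlapping two-word phrases from a word vector
bigrams <- function(words) {
  if (length(words) < 2) return(character(0))
  paste(head(words, -1), tail(words, -1))
}

source_text <- "The witness saw a red car leave the scene"
notes <- c("witness saw a red car", "red car left quickly")

src_words <- tokenize(source_text)
src_bigrams <- bigrams(src_words)

# Count how many note sheets contain each bigram of the source
note_bigrams <- lapply(notes, function(n) unique(bigrams(tokenize(n))))
bigram_count <- vapply(
  src_bigrams,
  function(b) sum(vapply(note_bigrams, function(nb) b %in% nb, logical(1))),
  numeric(1),
  USE.NAMES = FALSE
)

# Give each word the largest count among the bigrams it belongs to
word_count <- pmax(c(bigram_count, 0), c(0, bigram_count))

# Map counts to a background colour whose opacity grows with frequency
alpha <- word_count / max(1, max(word_count))
html <- paste(
  sprintf('<span style="background-color: rgba(255, 215, 0, %.2f)">%s</span>',
          alpha, src_words),
  collapse = " "
)
```

In this toy input, "red" and "car" appear together in both note sheets, so they receive the strongest highlight, while words that no note sheet copied stay transparent.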
This work was funded (or partially funded) by the Center for Statistics and Applications in Forensic Evidence (CSAFE) through Cooperative Agreements 70NANB15H176 and 70NANB20H019 between NIST and Iowa State University, which includes activities carried out at Carnegie Mellon University, Duke University, University of California Irvine, University of Virginia, West Virginia University, University of Pennsylvania, Swarthmore College and University of Nebraska, Lincoln.
You can install from CRAN with:
install.packages("highlightr")
You can install the development version of highlightr from GitHub with:
# install.packages("devtools")
devtools::install_github("rachelesrogers/highlightr")
library(highlightr)
# Rename the notes column, then tokenize the notes
comment_example_rename <- dplyr::rename(comment_example, page_notes=Notes)
toks_comment <- token_comments(comment_example_rename)
# Rename the source text column, then tokenize the source
transcript_example_rename <- dplyr::rename(transcript_example, text=Text)
toks_transcript <- token_transcript(transcript_example_rename)
# Match note phrases against the source and compute frequencies
collocation_object <- collocate_comments_fuzzy(toks_transcript, toks_comment)
merged_frequency <- transcript_frequency(transcript_example_rename, collocation_object)
# Build the frequency plot and the highlighted HTML
freq_plot <- collocation_plot(merged_frequency)
page_highlight <- highlighted_text(freq_plot)
page_highlight
The image below is generated from the resulting HTML output (as seen in vignette("highlightr")).
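Outside of a vignette or R Markdown document, one way to inspect the HTML output is to write it to a file and open it in a browser. The sketch below assumes the output is a character string of HTML; `html_fragment` is a hypothetical stand-in so the snippet runs on its own, and the file name is arbitrary.

```r
# Stand-in for the HTML output; swap in page_highlight if it is a character
# string (an assumption, not something this sketch verifies).
html_fragment <- '<span style="background-color: #ffd700">example phrase</span>'

# Wrap the fragment in a minimal page so a browser renders it on its own
html_page <- c("<!DOCTYPE html>", "<html><body>", html_fragment, "</body></html>")

# Write to a temporary file; choose a permanent path to keep the result
out_file <- file.path(tempdir(), "highlighted_source.html")
writeLines(html_page, out_file)

# Open the page only when running interactively
if (interactive()) utils::browseURL(out_file)
```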