diyar: Multistage Record Linkage and Case Definition for Epidemiological Analysis

Perform multistage deterministic linkages, apply case definitions to datasets, and deduplicate records. Records (rows) from datasets are linked by different matching criteria and sub-criteria (columns) in a specified order of certainty. The linkage process handles missing data and conflicting matches based on this same order of certainty. For episode grouping, rows of dated events (e.g. sample collection) or intervals of events (e.g. hospital admission) are grouped into chronological episodes, each beginning with a "Case". The process supports several options, such as episode lengths and recurrence periods, which are used to build custom preferences for case assignment (definition). Both the record linkage and the episode grouping processes assign a unique group ID to each set of matching records or records grouped into an episode. These IDs then allow for record deduplication or sub-analysis within the groups.
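
The two processes described above can be illustrated with a minimal sketch. It is written in Python for illustration only and does not use or reproduce the package's API: the function names `link_records` and `fixed_episodes`, the field names, and the sample data are all hypothetical. It also assumes that an empty or missing value never counts as a match, that a link made at an earlier (more certain) stage is never overwritten by a later one, and that an event falling exactly on the last day of the episode length still belongs to the episode.

```python
from datetime import date, timedelta


def link_records(records, criteria):
    """Assign a group ID to each record, matching on criteria in priority order."""
    group_ids = [None] * len(records)
    for field in criteria:  # earlier criteria carry higher certainty
        buckets = {}
        for i, rec in enumerate(records):
            value = rec.get(field)
            if value is None or value == "":  # assumption: missing data never matches
                continue
            buckets.setdefault(value, []).append(i)
        for members in buckets.values():
            if len(members) < 2:
                continue  # a value seen once links nothing at this stage
            existing = [group_ids[i] for i in members if group_ids[i] is not None]
            # Reuse a group formed at an earlier stage, else start a new one.
            gid = min(existing) if existing else min(members) + 1
            for i in members:
                if group_ids[i] is None:  # earlier-stage links are kept as-is
                    group_ids[i] = gid
    # Records that never matched become singleton groups.
    return [gid if gid is not None else i + 1 for i, gid in enumerate(group_ids)]


def fixed_episodes(dates, case_length_days):
    """Group dated events into episodes of fixed length, each starting with a 'Case'."""
    window = timedelta(days=case_length_days)
    episode_ids = [None] * len(dates)
    labels = [None] * len(dates)
    case_idx = None
    for i in sorted(range(len(dates)), key=lambda k: dates[k]):
        if case_idx is None or dates[i] - dates[case_idx] > window:
            case_idx = i  # this event opens a new episode
            labels[i] = "Case"
        else:
            labels[i] = "Duplicate"  # falls within the current episode
        episode_ids[i] = case_idx + 1
    return episode_ids, labels


# Hypothetical sample data: an identifier first, then name as a weaker criterion.
records = [
    {"patient_id": "111", "name": "Ann Lee"},
    {"patient_id": None, "name": "Ann Lee"},
    {"patient_id": "111", "name": "A. Lee"},
    {"patient_id": "222", "name": "Bob Ray"},
    {"patient_id": None, "name": "Bob Ray"},
    {"patient_id": "444", "name": "Cy Orr"},
]
linked = link_records(records, ["patient_id", "name"])
print(linked)  # [1, 1, 1, 4, 4, 6]

events = [date(2020, 1, 1), date(2020, 1, 10), date(2020, 1, 20), date(2020, 1, 30)]
episodes, roles = fixed_episodes(events, case_length_days=14)
print(episodes)  # [1, 1, 3, 3]
print(roles)     # ['Case', 'Duplicate', 'Case', 'Duplicate']
```

In this sketch the group IDs are simply the 1-based position of the first record in each group, so keeping one row per ID is one way to deduplicate. Sub-criteria, recurrence periods, and interval events are left out to keep the example short.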

Version: 0.1.0
Imports: methods, grDevices, graphics, utils, dplyr (≥ 0.7.5)
Suggests: stringdist, knitr, rmarkdown, testthat, covr
Published: 2020-06-13
Author: Olisaeloka Nsonwu
Maintainer: Olisaeloka Nsonwu <olisa.nsonwu at>
License: GPL-3
NeedsCompilation: no
Language: en-GB
Materials: README NEWS
In views: OfficialStatistics
CRAN checks: diyar results


Reference manual: diyar.pdf
Vignettes: Implementing case definitions for epidemiological analysis in R, Number lines and overlaps, Multistage deterministic linkage in R
Package source: diyar_0.1.0.tar.gz
Windows binaries: r-devel:, r-release:, r-oldrel:
macOS binaries: r-release: diyar_0.1.0.tgz, r-oldrel: diyar_0.1.0.tgz
Old sources: diyar archive

