git2rdata: Store and Retrieve Data.frames in a Git Repository

The git2rdata package is an R package for writing and reading dataframes as plain text files. A metadata file stores important information.

1) Storing metadata makes it possible to maintain the classes of variables. By default, git2rdata optimizes the data for file storage. The optimization is most effective on data containing factors, but it makes the data less human readable. Users can turn it off when they prefer a human readable format over smaller files. Details on the implementation are available in vignette("plain_text", package = "git2rdata").
2) Storing metadata also allows smaller row based diffs between two consecutive commits. This is a useful feature when storing data as plain text files under version control. Details on this part of the implementation are available in vignette("version_control", package = "git2rdata"). Although we envisioned git2rdata with a git workflow in mind, you can use it in combination with other version control systems like Subversion or Mercurial.
3) git2rdata is a useful tool in a reproducible and traceable workflow. vignette("workflow", package = "git2rdata") gives a toy example.
4) vignette("efficiency", package = "git2rdata") provides some insight into the efficiency of file storage, git repository size and speed for writing and reading.
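The round trip described above can be sketched as follows. This is a minimal illustration, not the package's recommended workflow: the data frame `observations`, its columns and the temporary directory `root` are hypothetical, and the sketch assumes `write_vc()` and `read_vc()` with their `file`, `root`, `sorting` and `optimize` arguments as documented in the reference manual.

```r
library(git2rdata)

# Hypothetical scratch directory standing in for a project or repository folder
root <- tempfile("git2rdata-demo")
dir.create(root)

# Hypothetical example data: one factor column, one integer column
observations <- data.frame(
  id = 1:4,
  site = factor(c("north", "south", "north", "east")),
  count = c(10L, 3L, 7L, 5L)
)

# Write the data as plain text plus a metadata file.
# `sorting` fixes the row order, which keeps diffs between versions small.
write_vc(observations, file = "observations", root = root, sorting = "id")

# Reading uses the metadata to restore the variable classes
restored <- read_vc("observations", root = root)
str(restored)

# Trade smaller files for a more human readable format
write_vc(
  observations, file = "observations_verbose", root = root,
  sorting = "id", optimize = FALSE
)

# Inspect which files were written
list.files(root)
```

Choosing a `sorting` variable that identifies each row uniquely, such as `id` here, gives a stable row order from one write to the next.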

Version: 0.4.1
Depends: R (≥ 3.5.0)
Imports: assertthat, git2r (≥ 0.23.0), methods, yaml
Suggests: ggplot2, knitr, microbenchmark, rmarkdown, testthat
Published: 2024-09-06
DOI: 10.32614/CRAN.package.git2rdata
Author: Thierry Onkelinx [aut, cre] (Research Institute for Nature and Forest (INBO)), Floris Vanderhaeghe [ctb] (Research Institute for Nature and Forest (INBO)), Peter Desmet [ctb] (Research Institute for Nature and Forest (INBO)), Els Lommelen [ctb] (Research Institute for Nature and Forest (INBO)), Research Institute for Nature and Forest (INBO) [cph, fnd]
Maintainer: Thierry Onkelinx <thierry.onkelinx at inbo.be>
BugReports: https://github.com/ropensci/git2rdata/issues
License: GPL-3
URL: https://ropensci.github.io/git2rdata/, https://github.com/ropensci/git2rdata/, https://doi.org/10.5281/zenodo.1485309
NeedsCompilation: no
Language: en-GB
Citation: git2rdata citation info
Materials: README NEWS
CRAN checks: git2rdata results

Documentation:

Reference manual: git2rdata.pdf
Vignettes:
  Efficiency Relative to Storage and Time (source, R code)
  Adding metadata (source, R code)
  Getting Started Storing Dataframes as Plain Text (source, R code)
  Storing Large Dataframes (source, R code)
  Optimizing Storage for Version Control (source, R code)
  Suggested Workflow for Storing a Variable Set of Dataframes under Version Control (source, R code)

Downloads:

Package source: git2rdata_0.4.1.tar.gz
Windows binaries: r-devel: git2rdata_0.4.1.zip, r-release: git2rdata_0.4.1.zip, r-oldrel: git2rdata_0.4.1.zip
macOS binaries: r-release (arm64): git2rdata_0.4.1.tgz, r-oldrel (arm64): git2rdata_0.4.0.tgz, r-release (x86_64): git2rdata_0.4.1.tgz, r-oldrel (x86_64): git2rdata_0.4.0.tgz
Old sources: git2rdata archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=git2rdata to link to this page.