We detail in this vignette how {constructive} works and how you might
define custom constructors or custom .cstr_construct.*()
methods.
This document provides the general theory, but you are encouraged to look at examples.
In particular the package {constructive.example}, accessible at https://github.com/cynkra/constructive.example/, contains 2 examples: supporting a new class (“qr”) and implementing a new constructor for an already supported class (“tbl_df”). This package might be used as a template.
The scripts starting with “s3-” and “s4-” in the {constructive} package provide many more examples in a similar but slightly different shape. These 2 resources, along with the explanations in this document, should get you started. Don’t hesitate to open issues if things are unclear.
The next 5 sections describe the inner logic of the package, the last 2 sections explain how to support a new class and/or define your own constructors.
The package is young and its API is subject to breaking changes, for which we apologize in advance.
.cstr_construct() builds code recursively, without checking input or output validity, without handling errors, and without formatting.
construct() wraps .cstr_construct() and does this extra work.
.cstr_construct() is a generic and many methods are implemented in the package, for instance construct(iris) will call .cstr_construct.data.frame(), which down the line will call .cstr_construct.atomic() and .cstr_construct.factor() to construct its columns.
.cstr_construct() attempts to match its data input to a list of objects provided to the data argument.
.cstr_construct
#> function (x, ..., data = NULL)
#> {
#> data_name <- perfect_match(x, data)
#> if (!is.null(data_name))
#> return(data_name)
#> UseMethod(".cstr_construct")
#> }
#> <bytecode: 0x12ee44ae0>
#> <environment: namespace:constructive>
.cstr_construct(letters)
#> [1] "c("
#> [2] " \"a\", \"b\", \"c\", \"d\", \"e\", \"f\", \"g\", \"h\", \"i\", \"j\", \"k\", \"l\", \"m\", \"n\", \"o\","
#> [3] " \"p\", \"q\", \"r\", \"s\", \"t\", \"u\", \"v\", \"w\", \"x\", \"y\", \"z\""
#> [4] ")"
construct(letters)
#> c(
#> "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o",
#> "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"
#> )
.cstr_construct.?() methods
.cstr_construct.?() methods typically have this form:
.cstr_construct.Date <- function(x, ...) {
  opts <- .cstr_fetch_opts("Date", ...)
  if (is_corrupted_Date(x) || opts$constructor == "next") return(NextMethod())
  constructor <- constructors$Date[[opts$constructor]]
  constructor(x, ..., origin = opts$origin)
}
.cstr_fetch_opts() gathers options provided to construct() through the opts_*() function (see next section), or falls back to a default value if none were provided.
If the object is corrupted or the “next” constructor was chosen, we call NextMethod() to forward all our inputs to a lower level constructor.
constructor() actually builds the code from the object x, the parameters forwarded through ... and the optional construction details gathered in opts (here the origin).
opts_?() function
When implementing a new method you’ll need to define and export the corresponding opts_?() function. It gives the user a way to choose a constructor, and it builds the options object that .cstr_fetch_opts() retrieves in the .cstr_construct() method.
It should always have this form:
opts_Date <- function(
    constructor = c(
      "as.Date", "as_date", "date", "new_date", "as.Date.numeric",
      "as_date.numeric", "next", "atomic"
    ),
    ...,
    origin = "1970-01-01"
) {
  .cstr_combine_errors(
    constructor <- .cstr_match_constructor(constructor),
    ellipsis::check_dots_empty()
  )
  .cstr_options("Date", constructor = constructor, origin = origin)
}
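The class name appears in the name of the opts_?() function and as the first argument of .cstr_options().
Additional arguments to forward to the constructors can be added after the dots, as done here with origin.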
The following code illustrates how the information is retrieved.
# .cstr_fetch_opts() takes a class and the dots and retrieves the relevant options
# if none were provided it falls back on the default value for the relevant opts_?() function
test <- function(...) {
  .cstr_fetch_opts("Date", ...)
}
test(opts_Date("as_date"), opts_data.frame("read.table"))
#> <constructive_options_Date/constructive_options>
#> constructor: "as_date"
#> origin: "1970-01-01"
test()
#> <constructive_options_Date/constructive_options>
#> constructor: "as.Date"
#> origin: "1970-01-01"
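The lookup shown above can be mimicked in base R. The following is a minimal sketch, not the package's implementation: opts_Date_sketch(), fetch_opts_sketch() and the sketch_options classes are hypothetical names, and we assume that options are found among the dots by their class.

```r
# Hypothetical stand-in for an opts_?() function: it validates the constructor
# choice and returns a classed list of options.
opts_Date_sketch <- function(constructor = c("as.Date", "next"), ...,
                             origin = "1970-01-01") {
  constructor <- match.arg(constructor)
  structure(
    list(constructor = constructor, origin = origin),
    class = c("sketch_options_Date", "sketch_options")
  )
}

# Hypothetical stand-in for the lookup: scan the dots for an options object of
# the requested class, otherwise call the matching opts function with defaults.
fetch_opts_sketch <- function(class_name, ...) {
  target <- paste0("sketch_options_", class_name)
  for (arg in list(...)) {
    if (inherits(arg, target)) return(arg)
  }
  default_fun <- get(paste0("opts_", class_name, "_sketch"), mode = "function")
  default_fun()
}

# Options for another class are ignored when fetching "Date" options.
other <- structure(
  list(constructor = "read.table"),
  class = c("sketch_options_data.frame", "sketch_options")
)
chosen <- fetch_opts_sketch("Date", opts_Date_sketch("next"), other)
fallback <- fetch_opts_sketch("Date")
```

Here chosen carries the user's choice while fallback carries the defaults, which is the behavior the two calls to test() illustrate.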
is_corrupted_?() function
is_corrupted_?() checks if x has the right internal type and attributes, sometimes structure, so that it satisfies the expectations of a well formatted object of a given class.
If an object is corrupted for a given class we cannot use
constructors for this class, so we move on to a lower level constructor
by calling NextMethod()
in
.cstr_construct()
.
This is important so that {constructive}
doesn’t choke
on corrupted objects but instead helps us understand them.
For instance, in the following example x prints like a date but it’s corrupted: a date should not be built on top of characters, and this object cannot be built with as.Date() or other idiomatic date constructors.
x <- structure("12345", class = "Date")
x
#> [1] "2003-10-20"
x + 1
#> Error in unclass(e1) + unclass(e2): non-numeric argument to binary operator
The package defines is_corrupted_Date() (called in .cstr_construct.Date()) to detect such objects.
As a consequence the next method, .cstr_construct.default(), is called through NextMethod() and handles the object using an atomic vector constructor.
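The fallback mechanism relies only on S3 dispatch, so it can be illustrated without {constructive}. The sketch below uses hypothetical names (is_corrupted_Date_sketch(), a build() generic) and assumes that checking the internal type is enough to detect a corrupted date. It is not the package's own code.

```r
# Hypothetical check: a well-formed Date is stored as a double.
is_corrupted_Date_sketch <- function(x) {
  typeof(x) != "double"
}

# Hypothetical generic standing in for a code-building generic.
build <- function(x, ...) UseMethod("build")

build.Date <- function(x, ...) {
  # Corrupted objects are forwarded, unchanged, to the next method.
  if (is_corrupted_Date_sketch(x)) return(NextMethod())
  sprintf("as.Date(\"%s\")", format(x))
}

build.default <- function(x, ...) {
  # Low level fallback: rebuild the underlying vector, then restore the class.
  sprintf("structure(%s, class = %s)", deparse(unclass(x)), deparse(class(x)))
}

good <- structure(12345, class = "Date")
bad <- structure("12345", class = "Date")
good_code <- build(good)
bad_code <- build(bad)
```

The well-formed date goes through the idiomatic branch, while the corrupted one reaches build.default() and is rebuilt from its underlying character vector.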
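{constructive} exports a constructors environment object. It contains environments named like classes, which in turn contain the constructor functions.
The relevant constructor is retrieved in the .cstr_construct() method, as in .cstr_construct.Date() above, by:
constructor <- constructors$Date[[opts$constructor]]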
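The nested layout described here (an environment of environments holding functions) can be sketched in base R. The names constructors_sketch and register_sketch() are hypothetical, and the sketch only mirrors the layout, not the package's actual registration code.

```r
# Hypothetical registry: one environment per class, each holding constructors.
constructors_sketch <- new.env()

register_sketch <- function(class_name, ...) {
  funs <- list(...)
  class_env <- constructors_sketch[[class_name]]
  if (is.null(class_env)) {
    # First constructor for this class: create its environment.
    class_env <- new.env()
    assign(class_name, class_env, envir = constructors_sketch)
  }
  for (nm in names(funs)) {
    assign(nm, funs[[nm]], envir = class_env)
  }
  invisible(NULL)
}

register_sketch(
  "Date",
  as.Date = function(x, ...) sprintf("as.Date(\"%s\")", format(x))
)

# Lookup by class, then by the constructor name chosen in the options.
choice <- "as.Date"
constructor <- constructors_sketch$Date[[choice]]
code <- constructor(structure(12345, class = "Date"))
```

Storing constructors by name is what lets a character option such as "as.Date" select the function to call.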
For instance the default constructor for “Date” is :
constructors$Date$as.Date
#> function (x, ..., origin = "1970-01-01")
#> {
#> if (any(is.infinite(x)) && any(is.finite(x))) {
#> x_dbl <- unclass(x)
#> if (origin != "1970-01-01")
#> x_dbl <- x_dbl - as.numeric(as.Date(origin))
#> code <- .cstr_apply(list(x_dbl, origin = origin), "as.Date",
#> ..., new_line = FALSE)
#> }
#> else {
#> code <- .cstr_apply(list(format(x)), "as.Date", ...,
#> new_line = FALSE)
#> }
#> repair_attributes_Date(x, code, ...)
#> }
#> <bytecode: 0x12dfbb8a0>
#> <environment: namespace:constructive>
A function call is made of a function and its arguments. A constructor sets the function and constructs its arguments recursively. This is done with the help of .cstr_apply() once these outputs have been prepared. In the case above we have 2 logical paths because dates can be infinite, but date vectors containing infinite elements cannot be represented by as.Date(<character>), our preferred choice.
x <- structure(c(12345, 20000), class = "Date")
y <- structure(c(12345, Inf), class = "Date")
constructors$Date$as.Date(x)
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\"))"
constructors$Date$as.Date(y)
#> [1] "as.Date(c(12345, Inf), origin = \"1970-01-01\")"
It’s important to consider corner cases when defining a constructor. If some cases can’t be handled by the constructor, we should fall back to another constructor or to another .cstr_construct() method.
For instance constructors$data.frame$read.table() falls back on constructors$data.frame$data.frame() when the input contains non-atomic columns, which cannot be represented in a table input, and constructors$data.frame$data.frame() itself falls back on .cstr_construct.list() when the data frame contains list columns not defined using I(), since data.frame() cannot produce such objects.
The last line of the constructor repairs the attributes. Constructors should always end with a call to .cstr_repair_attributes() or a function that wraps it.
This is needed to adjust the attributes of an object after idiomatic constructors such as as.Date() have defined their data and canonical attributes.
x <- structure(c(12345, 20000), class = "Date", some_attr = 42)
# attributes are not visible due to "Date"'s printing method
x
#> [1] "2003-10-20" "2024-10-04"
# but constructive retrieves them
constructors$Date$as.Date(x)
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\")) |>"
#> [2] " structure(some_attr = 42)"
.cstr_repair_attributes() essentially sets attributes, with some exceptions, one of which depends on the idiomatic_class argument.
.cstr_repair_attributes() does a bit more but we don’t need to dive deeper in this vignette.
constructive:::repair_attributes_Date
#> function (x, code, ...)
#> {
#> .cstr_repair_attributes(x, code, ..., idiomatic_class = "Date")
#> }
#> <bytecode: 0x1380e9820>
#> <environment: namespace:constructive>
constructive:::repair_attributes_factor
#> function (x, code, ...)
#> {
#> .cstr_repair_attributes(x, code, ..., ignore = "levels",
#> idiomatic_class = "factor")
#> }
#> <bytecode: 0x12d13a098>
#> <environment: namespace:constructive>
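The effect of these arguments can be sketched in base R. This is only an illustration under assumptions: repair_attributes_sketch() is a hypothetical function, and we assume that attributes named in ignore are skipped and that the class is skipped when it equals idiomatic_class, which is our reading of the wrappers above and not documented behavior.

```r
# Hypothetical sketch: append a structure() call for the attributes that the
# idiomatic constructor is assumed not to have set already.
repair_attributes_sketch <- function(x, code, ignore = NULL,
                                     idiomatic_class = NULL) {
  attrs <- attributes(x)
  # Assumption: ignored attributes were already set by the constructor.
  attrs[ignore] <- NULL
  # Assumption: the class is left out when the constructor already sets it.
  if (identical(attrs[["class"]], idiomatic_class)) attrs[["class"]] <- NULL
  if (length(attrs) == 0) return(code)
  values <- vapply(
    attrs,
    function(a) paste(deparse(a), collapse = ""),
    character(1)
  )
  args <- paste0(names(attrs), " = ", values, collapse = ", ")
  c(paste(code, "|>"), paste0("  structure(", args, ")"))
}

x <- structure(c(12345, 20000), class = "Date", some_attr = 42)
base_code <- "as.Date(c(\"2003-10-20\", \"2024-10-04\"))"
repaired <- repair_attributes_sketch(x, base_code, idiomatic_class = "Date")
plain <- repair_attributes_sketch(
  structure(c(12345, 20000), class = "Date"),
  base_code,
  idiomatic_class = "Date"
)
```

With the extra some_attr attribute the code gains a piped structure() call, and without it the code is returned unchanged.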
Registering a new class is done by defining and registering a .cstr_construct.?() method. In a package you might register the method with {roxygen2} by using the “@export” tag.
You should not attempt to manually modify the constructors object of the {constructive} package. Instead, register your constructor function with:
.cstr_register_constructors(class_name, constructor_name = constructor_function, ...)
Do this in .onLoad() if the new constructor is to be part of a package, for instance.