Extending constructive


We detail in this vignette how {constructive} works and how you might define custom constructors or custom .cstr_construct.*() methods.

This document provides the general theory, but you are encouraged to look at examples.

In particular, the package {constructive.example}, accessible at https://github.com/cynkra/constructive.example/, contains 2 examples: one supports a new class (“qr”), the other implements a new constructor for an already supported class (“tbl_df”). This package might be used as a template.

The scripts starting with “s3-” and “s4-” in the {constructive} package provide many more examples in a similar but slightly different shape. Those 2 resources, along with the explanations in this document, should get you started. Don’t hesitate to open issues if things are unclear.

The next 5 sections describe the inner logic of the package; the last 2 sections explain how to support a new class and/or define your own constructors.

The package is young and its API may still change in breaking ways, for which we apologize in advance.

Recursion system

.cstr_construct
#> function (x, ..., data = NULL) 
#> {
#>     data_name <- perfect_match(x, data)
#>     if (!is.null(data_name)) 
#>         return(data_name)
#>     UseMethod(".cstr_construct")
#> }
#> <bytecode: 0x1466e00f8>
#> <environment: namespace:constructive>
.cstr_construct(letters)
#> [1] "c("
#> [2] "  \"a\", \"b\", \"c\", \"d\", \"e\", \"f\", \"g\", \"h\", \"i\", \"j\", \"k\", \"l\", \"m\", \"n\", \"o\","
#> [3] "  \"p\", \"q\", \"r\", \"s\", \"t\", \"u\", \"v\", \"w\", \"x\", \"y\", \"z\""
#> [4] ")"
construct(letters)
#> c(
#>   "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o",
#>   "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"
#> )
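
The printed generic suggests a simple pattern: return the name of a data object when the input matches it exactly, otherwise dispatch to a class-specific method, which may call the generic again on the components of the object. The following is a toy sketch of that pattern in base R, not the package's implementation. toy_construct() and its methods are hypothetical names, and matching with identical() is an assumption standing in for perfect_match().

```r
# Toy generic (hypothetical): return a data name on an exact match,
# otherwise dispatch on the class of x.
toy_construct <- function(x, ..., data = NULL) {
  for (nm in names(data)) {
    if (identical(x, data[[nm]])) return(nm)
  }
  UseMethod("toy_construct")
}

# Fallback: let base R deparse the object into a single string
toy_construct.default <- function(x, ...) {
  paste(deparse(x), collapse = "")
}

# Lists recurse: every element goes through the generic again,
# so elements can themselves be matched against `data`
toy_construct.list <- function(x, ..., data = NULL) {
  args <- vapply(x, toy_construct, character(1), data = data)
  if (!is.null(names(x))) args <- paste(names(x), "=", args)
  sprintf("list(%s)", paste(args, collapse = ", "))
}

toy_construct(list(a = 1, b = letters), data = list(letters = letters))
#> [1] "list(a = 1, b = letters)"
```

The second element is replaced by the name `letters` because it matches an object supplied through `data`, while the first element falls through to the default method.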

.cstr_construct.?() methods

.cstr_construct.?() methods typically have this form:

.cstr_construct.Date <- function(x, ...) {
  opts <- .cstr_fetch_opts("Date", ...)
  if (is_corrupted_Date(x) || opts$constructor == "next") return(NextMethod())
  constructor <- constructors$Date[[opts$constructor]]
  constructor(x, ..., origin = opts$origin)
}

opts_?() function

When implementing a new method you’ll need to define and export the corresponding opts_?() function. It gives the user a way to choose a constructor, and it creates the options object that .cstr_fetch_opts() retrieves in the .cstr_construct() method.

It should always have this form:

opts_Date <- function(
    constructor = c(
      "as.Date", "as_date", "date", "new_date", "as.Date.numeric",
      "as_date.numeric", "next", "atomic"
    ),
    origin = "1970-01-01"
  ) {
  constructor <- .cstr_match_constructor(constructor)
  .cstr_options("Date", constructor = constructor, origin = origin)
}

The following code illustrates how the information is retrieved.

# .cstr_fetch_opts() takes a class and the dots and retrieves the relevant options
# if none were provided it falls back on the default value for the relevant opts_?() function
test <- function(...) {
  .cstr_fetch_opts("Date", ...)
}
test(opts_Date("as_date"), opts_data.frame("read.table"))
#> <constructive_options_Date/constructive_options>
#> constructor: "as_date"
#> origin:      "1970-01-01"
test()
#> <constructive_options_Date/constructive_options>
#> constructor: "as.Date"
#> origin:      "1970-01-01"
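
To make the retrieval mechanism concrete, here is a toy sketch in base R of how an options object could be tagged with a class-specific S3 class and then picked out of the dots. It is an illustration under assumptions, not the package's code: toy_options(), toy_opts_Date(), toy_opts_factor() and toy_fetch_opts() are hypothetical names, and match.arg() stands in for the package's constructor matching.

```r
# Hypothetical helper: an options object is a list tagged with a
# class-specific S3 class
toy_options <- function(class, ...) {
  structure(list(...), class = c(paste0("toy_options_", class), "toy_options"))
}

toy_opts_Date <- function(constructor = c("as.Date", "next"),
                          origin = "1970-01-01") {
  constructor <- match.arg(constructor)
  toy_options("Date", constructor = constructor, origin = origin)
}

toy_opts_factor <- function(constructor = c("factor", "next")) {
  toy_options("factor", constructor = match.arg(constructor))
}

# Pick the options object for `class` out of the dots, or fall back on
# the defaults of the matching toy_opts_*() function
toy_fetch_opts <- function(class, ...) {
  wanted <- paste0("toy_options_", class)
  for (arg in list(...)) {
    if (inherits(arg, wanted)) return(arg)
  }
  get(paste0("toy_opts_", class), mode = "function")()
}

toy_fetch_opts("Date", toy_opts_factor("next"), toy_opts_Date("next"))$constructor
#> [1] "next"
toy_fetch_opts("Date", toy_opts_factor("next"))$constructor
#> [1] "as.Date"
```

Options for other classes are simply ignored, which is why every method can receive the same dots and still find only what concerns it.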

is_corrupted_?() function

is_corrupted_?() checks whether x has the right internal type and attributes, and sometimes the right structure, so that it meets the expectations of a well-formed object of a given class.

If an object is corrupted for a given class we cannot use constructors for this class, so we move on to a lower level constructor by calling NextMethod() in .cstr_construct().

This is important so that {constructive} doesn’t choke on corrupted objects but instead helps us understand them.

For instance, in the following example x prints like a date but is corrupted: a date should not be built on top of a character vector, and this object cannot be built with as.Date() or other idiomatic date constructors.

x <- structure("12345", class = "Date")
x
#> [1] "2003-10-20"
x + 1
#> Error in unclass(e1) + unclass(e2): non-numeric argument to binary operator

We have defined:

is_corrupted_Date <- function(x) {

As a consequence, the next method, .cstr_construct.default(), is called through NextMethod() and handles the object using an atomic vector constructor:

construct(x)
#> "12345" |>
#>   structure(class = "Date")
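
The fall-through described in this section can be sketched with plain S3 dispatch. This is a toy illustration, not the package's implementation: toy_build() and toy_is_corrupted_Date() are hypothetical names, and the check shown (a well-formed Date is stored as a double) is an assumption chosen to match the example above.

```r
toy_build <- function(x, ...) UseMethod("toy_build")

# Assumed corruption check: a well-formed Date is a double vector
toy_is_corrupted_Date <- function(x) !is.double(x)

# Lowest-level fallback: spell out the raw data and the class attribute
toy_build.default <- function(x, ...) {
  sprintf(
    "structure(%s, class = %s)",
    paste(deparse(unclass(x)), collapse = ""),
    paste(deparse(class(x)), collapse = "")
  )
}

# Class-specific method: use the idiomatic constructor unless the object
# is corrupted or the user asked for the next method
toy_build.Date <- function(x, ..., constructor = "as.Date") {
  if (toy_is_corrupted_Date(x) || constructor == "next") return(NextMethod())
  sprintf("as.Date(%s)", paste(deparse(format(x)), collapse = ""))
}

writeLines(toy_build(as.Date("2003-10-20")))
#> as.Date("2003-10-20")
writeLines(toy_build(structure("12345", class = "Date")))
#> structure("12345", class = "Date")
```

The corrupted object is still described faithfully, only through a lower-level representation, which is the behavior the section argues for.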

constructors

{constructive} exports a constructors environment object, which itself contains environments named after classes; these in turn contain the constructor functions.

A constructor is retrieved in the .cstr_construct() method with:

constructor <- constructors$Date[[opts$constructor]]

For instance, the default constructor for “Date” is:

constructors$Date$as.Date
#> function (x, ..., origin = "1970-01-01") 
#> {
#>     if (any(is.infinite(x)) && any(is.finite(x))) {
#>         x_dbl <- unclass(x)
#>         if (origin != "1970-01-01") 
#>             x_dbl <- x_dbl - as.numeric(as.Date(origin))
#>         code <- .cstr_apply(list(x_dbl, origin = origin), "as.Date", 
#>             ..., new_line = FALSE)
#>     }
#>     else {
#>         code <- .cstr_apply(list(format(x)), "as.Date", ..., 
#>             new_line = FALSE)
#>     }
#>     repair_attributes_Date(x, code, ...)
#> }
#> <bytecode: 0x1313d8208>
#> <environment: namespace:constructive>

A function call is made of a function and its arguments. A constructor sets the function and constructs its arguments recursively. This is done with the help of .cstr_apply() once the arguments have been prepared. In the case above we have 2 logical paths: dates can be infinite, but date vectors containing infinite elements cannot be represented by as.Date(<character>), our preferred choice.

x <- structure(c(12345, 20000), class = "Date")
y <- structure(c(12345, Inf), class = "Date")
constructors$Date$as.Date(x)
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\"))"
constructors$Date$as.Date(y)
#> [1] "as.Date(c(12345, Inf), origin = \"1970-01-01\")"

It’s important to consider corner cases when defining a constructor. If some cases can’t be handled by the constructor, we should fall back to another constructor or to another .cstr_construct() method.

For instance, constructors$data.frame$read.table() falls back on constructors$data.frame$data.frame() when the input contains non-atomic columns, which cannot be represented in a table input. In turn, constructors$data.frame$data.frame() falls back on .cstr_construct.list() when the data frame contains list columns not defined using I(), since data.frame() cannot produce such objects.

The last line of the constructor performs the attribute reparation.

Attribute reparation

Constructors should always end with a call to .cstr_repair_attributes() or to a function that wraps it.

This is needed to adjust the attributes of an object after idiomatic constructors such as as.Date() have defined its data and canonical attributes.

x <- structure(c(12345, 20000), class = "Date", some_attr = 42)
# attributes are not visible due to "Date"'s printing method
x
#> [1] "2003-10-20" "2024-10-04"

# but constructive retrieves them
constructors$Date$as.Date(x)
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\")) |>"
#> [2] "  structure(some_attr = 42)"

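.cstr_repair_attributes() essentially sets attributes, with exceptions controlled by arguments such as ignore and idiomatic_class, both of which appear in the wrappers printed in this section.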

.cstr_repair_attributes() does a bit more but we don’t need to dive deeper in this vignette.

constructive:::repair_attributes_Date
#> function (x, code, ...) 
#> {
#>     .cstr_repair_attributes(x, code, ..., idiomatic_class = "Date")
#> }
#> <bytecode: 0x1300be2e8>
#> <environment: namespace:constructive>

constructive:::repair_attributes_factor
#> function (x, code, ...) 
#> {
#>     .cstr_repair_attributes(x, code, ..., ignore = "levels", 
#>         idiomatic_class = "factor")
#> }
#> <bytecode: 0x130808208>
#> <environment: namespace:constructive>
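
As a rough illustration of what such a repair step does, the toy function below appends a structure() call for the attributes the constructor did not already take care of. It is a sketch under assumptions, not the package's implementation: toy_repair_attributes() is a hypothetical name, and the meaning given here to ignore (attributes the constructor already sets) and idiomatic_class (the class the constructor already produces) is inferred from the wrappers printed above.

```r
toy_repair_attributes <- function(x, code, ignore = NULL, idiomatic_class = NULL) {
  attrs <- as.list(attributes(x))
  # Assumption: ignored attributes are already set by the constructor
  attrs <- attrs[setdiff(names(attrs), ignore)]
  # Assumption: the class needs no repair when the constructor produces it
  if (identical(attrs$class, idiomatic_class)) attrs$class <- NULL
  if (length(attrs) == 0) return(code)
  values <- vapply(
    attrs,
    function(a) paste(deparse(a), collapse = ""),
    character(1)
  )
  args <- paste(names(attrs), "=", values, collapse = ", ")
  c(paste(code, "|>"), sprintf("  structure(%s)", args))
}

x <- structure(c(12345, 20000), class = "Date", some_attr = 42)
code <- "as.Date(c(\"2003-10-20\", \"2024-10-04\"))"
writeLines(toy_repair_attributes(x, code, idiomatic_class = "Date"))
#> as.Date(c("2003-10-20", "2024-10-04")) |>
#>   structure(some_attr = 42)
```

With a factor and `ignore = "levels", idiomatic_class = "factor"`, nothing is left to repair and the code is returned unchanged.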

Register a new class

Registering a new class is done by defining and registering a .cstr_construct.?() method. In a package you might register the method with {roxygen2} by using the “@export” tag.

Register new constructors

You should not attempt to manually modify the constructors object of the {constructive} package. Instead, you should define your constructors as functions and then register them.

Do the latter in .onLoad() if the new constructor is to be part of a package, for instance:

# in zzz.R
.onLoad <- function(libname, pkgname) {
  .cstr_register_constructors(
    class_name,
    constructor_name1 = constructor1,
    constructor_name2 = constructor2
  )
}
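
To see why registration goes through a function rather than direct modification, it can help to picture the registry as an environment of per-class environments. The following toy registry in base R mimics that layout. It is only a sketch: toy_constructors, toy_register_constructors() and date_from_number() are hypothetical names and do not reflect how {constructive} implements registration.

```r
# Hypothetical registry: one environment per class, holding constructors
toy_constructors <- new.env()

toy_register_constructors <- function(class, ...) {
  # Create the class environment on first use
  if (!exists(class, envir = toy_constructors, inherits = FALSE)) {
    assign(class, new.env(), envir = toy_constructors)
  }
  class_env <- get(class, envir = toy_constructors, inherits = FALSE)
  fns <- list(...)
  for (nm in names(fns)) assign(nm, fns[[nm]], envir = class_env)
  invisible(toy_constructors)
}

# A hypothetical constructor that always uses the numeric form of a date
date_from_number <- function(x, ...) {
  sprintf(
    "as.Date(%s, origin = \"1970-01-01\")",
    paste(deparse(unclass(x)), collapse = "")
  )
}

toy_register_constructors("Date", from_number = date_from_number)

# Lookup has the same shape as constructors$Date[[opts$constructor]]
constructor <- toy_constructors$Date[["from_number"]]
writeLines(constructor(as.Date("2003-10-20")))
#> as.Date(12345, origin = "1970-01-01")
```

Going through a registration function keeps the creation of class environments and the naming of constructors in one place, so callers never touch the registry directly.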