Skip to content

Commit

Permalink
Adds names() function and deprecates datanames() (#347)
Browse files Browse the repository at this point in the history
# Pull Request

- Fixes #333
- Companion of insightsengineering/teal.code#218

#### Blockers

- `{teal.code}`
- [ ] Merge insightsengineering/teal.code#218

#### Changes description

- [x] Adds `names()` function
    - [x] Setter is implemented but throws warning (does nothing)
- [x] Deprecates `datanames()` getter and setter
- [x] Adapt tests to new API
- [x] Update NEWS
- [ ] Review documentation
- [ ] Review vignettes

<!-- 
- [ ] `{teal}` review vignettes
- [ ] `{teal}` Review use of `datanames()
- [ ] `{tmg}` Review use of `datanames()
- [ ] `{tmc}` Review use of `datanames()
- [ ] `{teal.gallery}` Review use of `datanames()
- [ ] `{tlg.catalog}` Review use of `datanames()
- [ ] `{teal.osprey}` Review use of `datanames()
- [ ] `{teal.goshawk}` Review use of `datanames()
- [ ] `{teal.*}` Review use of `datanames()
-->

---------

Signed-off-by: André Veríssimo <[email protected]>
Co-authored-by: Dawid Kałędkowski <[email protected]>
  • Loading branch information
averissimo and gogonzo authored Nov 8, 2024
1 parent eca9695 commit 3d899ee
Show file tree
Hide file tree
Showing 26 changed files with 363 additions and 378 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ URL: https://insightsengineering.github.io/teal.data/,
BugReports: https://github.com/insightsengineering/teal.data/issues
Depends:
R (>= 4.0),
teal.code (>= 0.5.0.9011)
teal.code (>= 0.5.0.9012)
Imports:
checkmate (>= 2.1.0),
lifecycle (>= 0.2.0),
Expand Down Expand Up @@ -72,8 +72,8 @@ Collate:
'join_keys.R'
'teal.data.R'
'teal_data-class.R'
'teal_data-datanames.R'
'teal_data-get_code.R'
'teal_data-names.R'
'teal_data-show.R'
'teal_data.R'
'testhat-helpers.R'
Expand Down
3 changes: 3 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ S3method("[[<-",join_keys)
S3method("join_keys<-",join_keys)
S3method("join_keys<-",teal_data)
S3method("names<-",join_keys)
S3method("names<-",teal_data)
S3method("parents<-",join_keys)
S3method("parents<-",teal_data)
S3method(c,join_key_set)
Expand All @@ -14,6 +15,8 @@ S3method(format,join_keys)
S3method(join_keys,default)
S3method(join_keys,join_keys)
S3method(join_keys,teal_data)
S3method(length,teal.data)
S3method(names,teal_data)
S3method(parents,join_keys)
S3method(parents,teal_data)
S3method(print,join_keys)
Expand Down
13 changes: 8 additions & 5 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,20 @@

### Breaking changes

- soft deprecate `datanames` argument of `get_code()`. Use `names` instead.
- Soft deprecate `datanames` argument of `get_code()`. Use `names` instead.
- Soft deprecate of `datanames()`. Use `names()` instead.
- Deprecate of `datanames(x) <- value`. Does nothing, replace with renaming the objects inside the environment.


### Enhancements

- `datanames()`
- if `join_keys` are provided, the `datanames()` are now sorted in topological way (`Kahn` algorithm),
- `names()` function is introduced replacing `datanames`.
- if `join_keys` are provided, the `names()` are now sorted in topological way (`Kahn` algorithm),
which means the parent dataset always precedes the child dataset.
- are extended by the parent dataset name, if one of the child dataset exist in `datanames()` and
- are extended by the parent dataset name, if one of the child dataset exist in `names()` and
the connection between child-parent is set through `join_keys` and `parent` exist in `teal_data` environment.
- do not allow to set a dataset name that do not exist in `teal_data` environment.
- `teal_data` no longer set default `datanames()` based on `join_keys` names - it uses only data names.
- `teal_data` no longer set default `names()` based on `join_keys` names - it uses only data names.

### Miscellaneous

Expand Down
48 changes: 48 additions & 0 deletions R/deprecated.R
Original file line number Diff line number Diff line change
Expand Up @@ -343,3 +343,51 @@ get_join_keys <- function(...) {
get_labels <- function(...) {
.deprecate_function("get_labels()", "Use col_labels(data)")
}

#' Names of data sets in `teal_data` object
#'
#' @description
#' `r lifecycle::badge("deprecated")`
#'
#' Use `names()` instead of `datanames()`.
#'
#' `datanames()` is deprecated. If object should be hidden, then use a `.` (dot)
#' prefix for the object's name.
#'
#' @param x (`teal_data` or `qenv_error`) object to access or modify
#' @param value (`character`) new value for `@datanames`; all elements must be names of variables existing in `@.xData`
#'
#' @return The contents of `@datanames` or `teal_data` object with updated `@datanames`.
#' @aliases `datanames<-.teal_data`
#'
#' @name datanames

#' @rdname datanames
#' @export
datanames <- function(x) {
lifecycle::deprecate_soft("0.6.1", "datanames()", details = "names()")
names(x)
}

#' @rdname datanames
#' @export
`datanames<-` <- function(x, value) {
lifecycle::deprecate_soft(
"0.6.1",
"`datanames<-`()",
details = "invalid to use `datanames()<-` or `names()<-` on an object of class `teal_data`. See ?names.teal_data"
)
names(x)
}

#' @rdname datanames
#' @export
#' @keywords internal
`names<-.teal_data` <- function(x, value) {
lifecycle::deprecate_warn(
"0.6.1",
"`names<-.teal_data`()",
details = "invalid to use `datanames()<-` or `names()<-` on an object of class `teal_data`. See ?names.teal_data"
)
x
}
1 change: 0 additions & 1 deletion R/join_keys.R
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,6 @@ join_keys.teal_data <- function(...) {
#' join_keys(td)
`join_keys<-.teal_data` <- function(x, value) {
join_keys(x@join_keys) <- value
datanames(x) <- x@datanames # datanames fun manages some exceptions
x
}

Expand Down
99 changes: 38 additions & 61 deletions R/teal_data-class.R
Original file line number Diff line number Diff line change
Expand Up @@ -14,85 +14,62 @@ setOldClass("join_keys")
#' @name teal_data-class
#' @rdname teal_data-class
#'
#' @slot env (`environment`) environment containing data sets and possibly auxiliary variables.
#' Access variables with [get_var()] or [`[[`].
#' No setter provided. Evaluate code to add variables into `@env`.
#' @slot code (`character`) vector representing code necessary to reproduce the contents of `@env`.
#' @slot .xData (`environment`) environment containing data sets and possibly
#' auxiliary variables.
#' Access variables with [get()], [`$`], [get_var()] or [`[[`].
#' No setter provided. Evaluate code to add variables into `@.xData`.
#' @slot code (`character`) vector representing code necessary to reproduce the
#' contents of `@.xData`.
#' Access with [get_code()].
#' No setter provided. Evaluate code to append code to the slot.
#' @slot id (`integer`) random identifier assigned to each element of `@code`. Used internally.
#' @slot id (`integer`) random identifier assigned to each element of `@code`.
#' Used internally.
#' @slot warnings (`character`) vector of warnings raised when evaluating code.
#' Access with [get_warnings()].
#' @slot messages (`character`) vector of messages raised when evaluating code.
#' @slot join_keys (`join_keys`) object specifying joining keys for data sets in `@env`.
#' @slot join_keys (`join_keys`) object specifying joining keys for data sets in
#' `@.xData`.
#' Access or modify with [join_keys()].
#' @slot datanames (`character`) vector of names of data sets in `@env`.
#' Used internally to distinguish them from auxiliary variables.
#' Access or modify with [datanames()].
#' @slot verified (`logical(1)`) flag signifying that code in `@code` has been proven to yield contents of `@env`.
#' @slot verified (`logical(1)`) flag signifying that code in `@code` has been
#' proven to yield contents of `@.xData`.
#' Used internally. See [`verify()`] for more details.
#'
#' @import teal.code
#' @keywords internal
setClass(
Class = "teal_data",
contains = "qenv",
slots = c(join_keys = "join_keys", datanames = "character", verified = "logical"),
slots = c(join_keys = "join_keys", verified = "logical"),
prototype = list(
join_keys = join_keys(),
datanames = character(0),
verified = logical(0)
)
)

#' Initialize `teal_data` object
#' It initializes the `teal_data` class
#'
#' @name new_teal_data
#'
#' @param data (`named list`) of data objects.
#' @param code (`character` or `language`) code to reproduce the `data`.
#' Accepts and stores comments also.
#' @param join_keys (`join_keys`) object
#' @param datanames (`character`) names of datasets passed to `data`.
#' Needed when non-dataset objects are needed in the `env` slot.
#' @rdname new_teal_data
#' @keywords internal
new_teal_data <- function(data,
code = character(0),
join_keys = join_keys(),
datanames = names(data)) {
checkmate::assert_list(data)
checkmate::assert_class(join_keys, "join_keys")
if (is.null(datanames)) datanames <- character(0) # todo: allow to specify
checkmate::assert_character(datanames)
if (!any(is.language(code), is.character(code))) {
stop("`code` must be a character or language object.")
}

if (is.language(code)) {
code <- paste(lang2calls(code), collapse = "\n")
}
if (length(code)) {
code <- paste(code, collapse = "\n")
#' Accepts .xData as a list and converts it to an environment before initializing
#' parent constructor (`qenv`).
#' @noRd
setMethod(
"initialize",
"teal_data",
function(.Object, .xData = list(), join_keys = join_keys(), ...) { # nolint: object_name.
# Allow .xData to be a list and convert it to an environment
if (!missing(.xData) && inherits(.xData, "list")) {
.xData <- rlang::env_clone(list2env(.xData), parent = parent.env(.GlobalEnv)) # nolint: object_name.
lockEnvironment(.xData, bindings = TRUE)
}
args <- list(...)
checkmate::assert_environment(.xData)
checkmate::assert_class(join_keys, "join_keys")
checkmate::assert_list(args, names = "named")
methods::callNextMethod(
.Object,
.xData,
join_keys = join_keys,
verified = (length(args$code) == 0L && length(.xData) == 0L),
...
)
}
verified <- (length(code) == 0L && length(data) == 0L)

id <- sample.int(.Machine$integer.max, size = length(code))

new_env <- rlang::env_clone(list2env(data), parent = parent.env(.GlobalEnv))
lockEnvironment(new_env, bindings = TRUE)

datanames <- .get_sorted_datanames(datanames = datanames, join_keys = join_keys, env = new_env)

methods::new(
"teal_data",
env = new_env,
code = code,
warnings = rep("", length(code)),
messages = rep("", length(code)),
id = id,
join_keys = join_keys,
datanames = datanames,
verified = verified
)
}
)
68 changes: 0 additions & 68 deletions R/teal_data-datanames.R

This file was deleted.

6 changes: 4 additions & 2 deletions R/teal_data-get_code.R
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,10 @@
#'
#' Retrieve code from `teal_data` object.
#'
#' Retrieve code stored in `@code`, which (in principle) can be used to recreate all objects found in `@env`.
#' Use `names` to limit the code to one or more of the datasets enumerated in `@datanames`.
#' Retrieve code stored in `@code`, which (in principle) can be used to recreate
#' all objects found in the environment (`@.xData`).
#' Use `names` to limit the code to one or more of the datasets enumerated in
#' the environment.
#'
#' @section Extracting dataset-specific code:
#' When `names` is specified, the code returned will be limited to the lines needed to _create_
Expand Down
41 changes: 41 additions & 0 deletions R/teal_data-names.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
#' Names of data sets in `teal_data` object
#'
#' Functions to get the names of a `teal_data` object.
#' The names are obtained from the objects listed in the `qenv` environment.
#'
#' Objects named with a `.` (dot) prefix will be ignored and not returned.
#' To get the names of all objects, use `ls(x, all.names = TRUE)`, however, it
#' will not group the names by the join_keys topological structure.
#'
#' @param x A (`teal_data`) object to access or modify.
#'
#' @return A character vector of names.
#'
#' @examples
#' td <- teal_data(iris = iris)
#' td <- within(td, mtcars <- mtcars)
#' names(td)
#'
#' td <- within(td, .CO2 <- CO2)
#' names(td) # '.CO2' will not be returned
#'
#' @export
names.teal_data <- function(x) {
# Sorting from `ls` can be safely done as environments don't have any order
# nor support numeric-index subsetting
envir <- as.environment(x)
.get_sorted_names(ls(envir = envir), join_keys(x), envir)
}

#' @export
length.teal.data <- function(x) length(ls(x))

#' @keywords internal
.get_sorted_names <- function(datanames, join_keys, env) {
child_parent <- sapply(datanames, parent, x = join_keys, USE.NAMES = TRUE, simplify = FALSE)

union(
intersect(unlist(topological_sort(child_parent)), ls(env, all.names = TRUE)),
datanames
)
}
2 changes: 1 addition & 1 deletion R/teal_data-show.R
Original file line number Diff line number Diff line change
Expand Up @@ -16,5 +16,5 @@ setMethod("show", signature = "teal_data", function(object) {
} else {
cat("\u2716", "unverified teal_data object\n")
}
rlang::env_print(object@env)
rlang::env_print(teal.code::get_env(object))
})
5 changes: 3 additions & 2 deletions R/teal_data.R
Original file line number Diff line number Diff line change
Expand Up @@ -60,8 +60,9 @@ teal_data <- function(...,
if (length(data_objects) > 0 && !checkmate::test_names(names(data_objects), type = "named")) {
stop("Dot (`...`) arguments on `teal_data()` must be named.")
}
new_teal_data(
data = data_objects,
methods::new(
"teal_data",
.xData = data_objects,
code = code,
join_keys = join_keys
)
Expand Down
Loading

0 comments on commit 3d899ee

Please sign in to comment.