Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding keep_empty argument to list_c, list_cbind, and list_rbind. #1144

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
# purrr (development version)

* Added a new keep_empty argument to `list_c()`, `list_cbind()`, and `list_rbind()` which will keep empty elements as `NA` in the returned data frame (@SokolovAnatoliy, #1096).
* `list_transpose()` now works with data.frames (@KimLopezGuell, #1109).
* Added `imap_vec()` (#1084)
* `list_transpose()` inspects all elements to determine the correct
Expand Down
31 changes: 26 additions & 5 deletions R/list-combine.R
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,9 @@
#' same size (i.e. number of rows).
#' @param name_repair One of `"unique"`, `"universal"`, or `"check_unique"`.
#' See [vctrs::vec_as_names()] for the meaning of these options.
#' @param keep_empty An optional logical. If `FALSE` (the default), then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think keep_empty is the wrong name if double() isn't promoted to NA_real_ and kept. It's more like keep_null.

> list_c(list(1, double(), 2))
[1] 1 2
> list_c(list(1, double(), 2), keep_empty = T)
[1] 1 2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare with

> tidyr::unchop(tibble::tibble(x = list(1, double(), 2)), x)
# A tibble: 2 × 1
      x
  <dbl>
1     1
2     2
> tidyr::unchop(tibble::tibble(x = list(1, double(), 2)), x, keep_empty = TRUE)
# A tibble: 3 × 1
      x
  <dbl>
1     1
2    NA
3     2

#' empty (`NULL`) elements are silently ignored; if `TRUE`, then empty
#' elements are preserved by converting to `NA`.
#' @inheritParams rlang::args_dots_empty
#' @export
#' @examples
Expand All @@ -30,16 +33,21 @@
#'
#' x2 <- list(
#' a = data.frame(x = 1:2),
#' b = data.frame(y = "a")
#' b = data.frame(y = "a"),
#' c = NULL
#' )
#' list_rbind(x2)
#' list_rbind(x2, names_to = "id")
#' list_rbind(x2, names_to = "id", keep_empty = TRUE)
#' list_rbind(unname(x2), names_to = "id")
#'
#' list_cbind(x2)
list_c <- function(x, ..., ptype = NULL) {
#' list_cbind(x2, keep_empty = TRUE)
list_c <- function(x, ..., ptype = NULL, keep_empty = FALSE) {
vec_check_list(x)
check_dots_empty()
if (keep_empty) {
Comment on lines +45 to +48
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd want to use check_bool(keep_empty) everywhere

x <- convert_null_to_NA(x)
}

# For `list_c()`, we don't expose `list_unchop()`'s `name_spec` arg,
# and instead strip outer names to avoid collisions with inner names
Expand All @@ -58,19 +66,26 @@ list_cbind <- function(
x,
...,
name_repair = c("unique", "universal", "check_unique"),
size = NULL
size = NULL,
keep_empty = FALSE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DavisVaughan my intuition is that keep_empty doesn't make sense for list_cbind(), but I don't have anything concrete to back that up. Do you have any thoughts?

) {
check_list_of_data_frames(x)
check_dots_empty()
if (keep_empty) {
x <- convert_null_to_NA(x)
}

vec_cbind(!!!x, .name_repair = name_repair, .size = size, .error_call = current_env())
}

#' @export
#' @rdname list_c
list_rbind <- function(x, ..., names_to = rlang::zap(), ptype = NULL) {
list_rbind <- function(x, ..., names_to = rlang::zap(), ptype = NULL, keep_empty = FALSE) {
check_list_of_data_frames(x)
check_dots_empty()
if (keep_empty) {
x <- convert_null_to_NA(x)
}

vec_rbind(!!!x, .names_to = names_to, .ptype = ptype, .error_call = current_env())
}
Expand All @@ -95,3 +110,9 @@ check_list_of_data_frames <- function(x, error_call = caller_env()) {
call = error_call
)
}

convert_null_to_NA <- function(x) {
is_null <- map_lgl(x, is.null)
x[is_null] <- list(NA)
Comment on lines +115 to +116
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vctrs::vec_detect_missing() would be much faster at detecting NULLs since we know x is a list.

See tidyr:::list_replace_null() for a robust and very fast version of this operation

x
}
17 changes: 12 additions & 5 deletions man/list_c.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 0 additions & 7 deletions tests/testthat/_snaps/adverb-auto-browse.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,3 @@
Error in `auto_browse()`:
! `.f` must not be a primitive function.

---

Code
auto_browse(identity)(NULL)
Output
NULL

24 changes: 24 additions & 0 deletions tests/testthat/_snaps/deprec-prepend.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,27 @@
`prepend()` was deprecated in purrr 1.0.0.
i Please use append(after = 0) instead.

# prepend throws error if before param is neither NULL nor between 1 and length(x)

Code
prepend(list(), 1, before = 1)
Condition
Error in `prepend()`:
! is.null(before) || (before > 0 && before <= n) is not TRUE

---

Code
x %>% prepend(4, before = 0)
Condition
Error in `prepend()`:
! is.null(before) || (before > 0 && before <= n) is not TRUE

---

Code
x %>% prepend(4, before = 4)
Condition
Error in `prepend()`:
! is.null(before) || (before > 0 && before <= n) is not TRUE

32 changes: 32 additions & 0 deletions tests/testthat/_snaps/deprec-utils.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,3 +11,35 @@
Warning:
`rbernoulli()` was deprecated in purrr 1.0.0.

# rdunif fails if a and b are not unit length numbers

Code
rdunif(1000, 1, "a")
Condition
Error in `rdunif()`:
! is.numeric(a) is not TRUE

---

Code
rdunif(1000, 1, c(0.5, 0.2))
Condition
Error in `rdunif()`:
! length(a) == 1 is not TRUE

---

Code
rdunif(1000, FALSE, 2)
Condition
Error in `rdunif()`:
! is.numeric(b) is not TRUE

---

Code
rdunif(1000, c(2, 3), 2)
Condition
Error in `rdunif()`:
! length(b) == 1 is not TRUE

8 changes: 8 additions & 0 deletions tests/testthat/_snaps/deprec-when.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,11 @@
`when()` was deprecated in purrr 1.0.0.
i Please use `if` instead.

# error when named arguments have no matching conditions

Code
1:5 %>% when(a = sum(.) < 5 ~ 3)
Condition
Error in `when()`:
! At least one matching condition is needed.

16 changes: 16 additions & 0 deletions tests/testthat/_snaps/every-some-none.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# every() requires logical value

Code
every(list(1:3), identity)
Condition
Error in `every()`:
! `.p()` must return a single `TRUE` or `FALSE`, not an integer vector.

---

Code
every(list(function() NULL), identity)
Condition
Error in `every()`:
! `.p()` must return a single `TRUE` or `FALSE`, not a function.

24 changes: 24 additions & 0 deletions tests/testthat/_snaps/map-depth.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,16 @@
Error in `map_depth()`:
! Negative `.depth` (-5) must be greater than -4.

# default doesn't recurse into data frames, but can customise

Code
map_depth(x, 2, class)
Condition
Error in `.fmap()`:
i In index: 1.
Caused by error in `map_depth()`:
! List not deep enough

# modify_depth modifies values at specified depth

Code
Expand All @@ -42,6 +52,20 @@
Error in `modify_depth()`:
! Negative `.depth` (-5) must be greater than -4.

# vectorised operations on the recursive and atomic levels yield same results

Code
modify_depth(x, 5, `+`, 10L)
Condition
Error in `map()`:
i In index: 1.
Caused by error in `map()`:
i In index: 1.
Caused by error in `map()`:
i In index: 1.
Caused by error in `modify_depth()`:
! List not deep enough

# validates depth

Code
Expand Down
16 changes: 16 additions & 0 deletions tests/testthat/_snaps/pluck.md
Original file line number Diff line number Diff line change
Expand Up @@ -277,3 +277,19 @@
Error in `pluck_raw()`:
! Index 1 must have length 1, not 26.

# pluck() dispatches on vector methods

Code
chuck(x, 1, 1)
Condition
Error in `chuck()`:
! Length of S3 object must be a scalar integer.

---

Code
chuck(x, 1, "b", 1)
Condition
Error in `chuck()`:
! Index 2 is attempting to pluck from an unnamed vector using a string name.

18 changes: 18 additions & 0 deletions tests/testthat/test-list-combine.R
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,24 @@ test_that("NULLs are ignored", {
expect_equal(list_cbind(list(df1, NULL, df2)), vec_cbind(df1, df2))
})

test_that("NULLs are converted to NA when keep_empty = TRUE", {
df1 <- data.frame(x = 1)
df2 <- data.frame(y = 1)

expect_equal(
list_c(list(1, NULL, 2), keep_empty = TRUE),
c(1, NA, 2)
)
expect_equal(
list_rbind(list(df1, NULL, df1), keep_empty = TRUE),
data.frame(x = c(1, NA, 1))
)
expect_equal(
list_cbind(list(df1, z = NULL, df2), keep_empty = TRUE),
data.frame(df1, z = NA, df2)
)
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decide to implement this then I'd like the chance to come through and write a lot of tests for the 3 cases individually. The keep_empty logic in tidyr is quite hard to get 100% right, and required a lot of edge case tests.


test_that("empty inputs return expected output", {
expect_equal(list_c(list()), NULL)
expect_equal(list_c(list(NULL)), NULL)
Expand Down
Loading