Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make scalar_in_linter() configurable #2574

Merged
merged 10 commits into from
May 21, 2024
1 change: 1 addition & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@
* `any_is_na_linter()` is extended to catch the unusual usage `NA %in% x` (#2113, @MichaelChirico).
* `make_linter_from_xpath()` errors up front when `lint_message` is missing (instead of delaying this error until the linter is used, #2541, @MichaelChirico).
* `paste_linter()` is extended to recommend using `paste()` instead of `paste0()` for simply aggregating a character vector with `collapse=`, i.e., when `sep=` is irrelevant (#1108, @MichaelChirico).
* `scalar_in_linter` is now configurable to allow other `%in%` like operators to be linted. The data.table operator `%chin%` is no longer linted by default; use `in_operators = "%chin%"` to continue linting it. (@F-Noelle)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, i'm having trouble making edits myself since this is made in your main branch, I think you need to create a dedicated branch for me to make edits.

all this to say -- could you please move this item to the "breaking changes" section at the top :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved


### New linters

Expand Down
21 changes: 12 additions & 9 deletions R/scalar_in_linter.R
Original file line number Diff line number Diff line change
@@ -1,12 +1,13 @@
#' Block usage like x %in% "a"
#'
#' `vector %in% set` is appropriate for matching a vector to a set, but if
#' that set has size 1, `==` is more appropriate. `%chin%` from `{data.table}`
#' is matched as well.
#' that set has size 1, `==` is more appropriate.
#'
#' `scalar %in% vector` is OK, because the alternative (`any(vector == scalar)`)
#' is more circuitous & potentially less clear.
#'
#' @param in_operators Character vector of additional infix operators that behave like the `%in%` operator
F-Noelle marked this conversation as resolved.
Show resolved Hide resolved
#'
#' @examples
#' # will produce lints
#' lint(
Expand All @@ -16,7 +17,7 @@
#'
#' lint(
#' text = "x %chin% 'a'",
#' linters = scalar_in_linter()
#' linters = scalar_in_linter(in_operators = "%chin%")
#' )
#'
#' # okay
Expand All @@ -28,22 +29,24 @@
#' @evalRd rd_tags("scalar_in_linter")
#' @seealso [linters] for a complete list of linters available in lintr.
#' @export
scalar_in_linter <- function() {
scalar_in_linter <- function(in_operators = NULL) {
# TODO(#2085): Extend to include other cases where the RHS is clearly a scalar
# NB: all of logical, integer, double, hex, complex are parsed as NUM_CONST
xpath <- "
//SPECIAL[text() = '%in%' or text() = '%chin%']
xpath <- glue("
//SPECIAL[{xp_text_in_table(c('%in%', {in_operators}))}]
/following-sibling::expr[NUM_CONST[not(starts-with(text(), 'NA'))] or STR_CONST]
/parent::expr
"
")

Linter(linter_level = "expression", function(source_expression) {
xml <- source_expression$xml_parsed_content

bad_expr <- xml_find_all(xml, xpath)
in_op <- xml_find_chr(bad_expr, "string(SPECIAL)")
lint_msg <-
paste0("Use == to match length-1 scalars, not ", in_op, ". Note that == preserves NA where ", in_op, " does not.")
lint_msg <- glue(
"Use comparison operators (e.g. ==, !=, etc.) to match length-1 scalars instead of {in_op} ",
"as these operators preserve NA where {in_op} does not."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm afraid we've over-corrected 😄

"as" implies that preserving NA is the reason for the lint. But that's not true... actually this is a somewhat ambiguous thing. Some authors might prefer this behavior and thus disfavor this linter -- hence we're highlighting this aspect as a tradeoff users should be aware of.

So let's go with

Use comparison operators... to match length-1 scalars instead of {in_op}. Note that comparison operators preserve NA where {in_op} does not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

)

xml_nodes_to_lints(
bad_expr,
Expand Down
2 changes: 1 addition & 1 deletion inst/lintr/linters.csv
Original file line number Diff line number Diff line change
Expand Up @@ -90,7 +90,7 @@ repeat_linter,style readability
return_linter,style configurable default
routine_registration_linter,best_practices efficiency robustness
sample_int_linter,efficiency readability robustness
scalar_in_linter,readability consistency best_practices efficiency
scalar_in_linter,readability consistency best_practices efficiency configurable
semicolon_linter,style readability default configurable
semicolon_terminator_linter,defunct
seq_linter,robustness efficiency consistency best_practices default
Expand Down
1 change: 1 addition & 0 deletions man/configurable_linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions man/linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 7 additions & 5 deletions man/scalar_in_linter.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

32 changes: 24 additions & 8 deletions tests/testthat/test-scalar_in_linter.R
Original file line number Diff line number Diff line change
Expand Up @@ -3,28 +3,44 @@ test_that("scalar_in_linter skips allowed usages", {

expect_lint("x %in% y", NULL, linter)
expect_lint("y %in% c('a', 'b')", NULL, linter)
expect_lint("c('a', 'b') %chin% x", NULL, linter)
expect_lint("c('a', 'b') %in% x", NULL, linter)
expect_lint("z %in% 1:3", NULL, linter)
# scalars on LHS are fine (often used as `"col" %in% names(DF)`)
expect_lint("3L %in% x", NULL, linter)

# this should be is.na(x), but it more directly uses the "always TRUE/FALSE, _not_ NA"
# aspect of %in%, so we delegate linting here to equals_na_linter()
expect_lint("x %in% NA", NULL, linter)
expect_lint("x %in% NA_character_", NULL, linter)
})

test_that("scalar_in_linter blocks simple disallowed usages", {
linter <- scalar_in_linter()
lint_in_msg <- rex::rex("Use == to match length-1 scalars, not %in%.")
lint_chin_msg <- rex::rex("Use == to match length-1 scalars, not %chin%.")
linter <- scalar_in_linter(in_operators = c("%chin%", "%notin%"))
lint_msg <- rex::rex("Use comparison operators (e.g. ==, !=, etc.) to match length-1 scalars instead of")

expect_lint("x %in% 1", lint_msg, linter)
expect_lint("x %chin% 'a'", lint_msg, linter)
expect_lint("x %notin% 1", lint_msg, linter)
})

expect_lint("x %in% 1", lint_in_msg, linter)
expect_lint("x %chin% 'a'", lint_chin_msg, linter)
test_that("scalar_in_linter blocks or skips based on configuration", {
linter_default <- scalar_in_linter()
linter_config <- scalar_in_linter(in_operators = "%notin%")

lint_msg <- rex::rex("Use comparison operators (e.g. ==, !=, etc.) to match length-1 scalars instead of")

# default
expect_lint("x %in% 1", lint_msg, linter_default)
expect_lint("x %notin% 1", NULL, linter_default)
expect_lint("x %notin% y", NULL, linter_default)

# configured
expect_lint("x %in% 1", lint_msg, linter_config)
expect_lint("x %notin% 1", lint_msg, linter_config)
expect_lint("x %notin% y", NULL, linter_config)
})

test_that("multiple lints are generated correctly", {
linter <- scalar_in_linter()
linter <- scalar_in_linter(in_operators = "%chin%")

expect_lint(
trim_some('{
Expand Down
Loading