BiocCheck naive function line count penalizes good coding practices of commenting code and readable multiline code #220
Comments
Hi Pariksheet, @omsai I am not sure I understand your argument. One can argue that adding comments within the function is a stylistic choice. Best regards,
That's right,
This gets most of the way there: it counts the number of statements. One improvement would be to look recursively inside closures (e.g. if / else / for / while, etc.).

```r
# getFunctionLengths ------------------------------------------------------
file <- system.file("testpackages", "testpkg0", "R",
                    "parseme.R", package="BiocCheck")
## Read the file as a list of language expressions.
exprs <- parse(file)
lapply(exprs, function(x) x)

## Walker to organize tokens into a character vector of closures.
walker <-
    codetools::makeCodeWalker(
        call = function(e, w) {
            result <<- c(result, "(")
            codetools::walkCode(e[[1]], w)
            for (subexpr in as.list(e[-1])) {
                if (missing(subexpr)) {
                    result <<- c(result, "<Missing>")
                } else {
                    codetools::walkCode(subexpr, w)
                }
            }
            result <<- c(result, ")")
        },
        leaf = function(e, w) {
            if (typeof(e) == "symbol") {
                if (e == "(") {
                    result <<- c(result, "\\(")
                } else {
                    result <<- c(result, deparse(e))
                }
            }
        })

## Gather tokens into closure.
results <- character(length(exprs))
for (i in seq_along(exprs)) {
    result <- character()
    codetools::walkCode(exprs[[i]], walker)
    results[i] <- paste(result, collapse = " ")
}
results

## Process the extracted character vector of closures.
library(tidyverse)

closure_pop_outer <- function(closure) {
    if (str_count(closure, "[(]") == 0L) {
        return("")
    }
    str_extract(closure, "^[(][^(]+(.*)[)]$", group = 1L) |>
        str_trim()
}

closure_name <- function(closure) {
    if (str_count(closure, "[(]") == 0L) {
        return("")
    }
    str_split(closure, " ", n = 3, simplify = TRUE)[, 2]
}

#' We want to remove closures leading up to the first function, and,
#' optionally, the first parenthesis closure found right after the
#' function.
#'
#' @param closure An atomic character vector.
#' @return An atomic character.
strip_function <- function(closure) {
    while (closure != "" &&
           closure_name(closure) != "function") {
        closure <- closure_pop_outer(closure)
    }
    if (closure != "" &&
        closure_name(closure) == "function") {
        closure <- closure_pop_outer(closure)
        if (closure != "" &&
            closure_name(closure) == "{") {
            closure <- closure_pop_outer(closure)
        }
    }
    closure
}

closure_pop_sequential <- function(closure) {
    if (str_count(closure, "[(]") == 0L) {
        return("")
    }
    str_extract(closure, "^[(][^(]+[)](.*)", group = 1L) |>
        str_trim()
}

closure_count_sequential <- function(closure) {
    count <- 0L
    while (closure != "") {
        count <- count + 1L
        closure <- closure_pop_sequential(closure)
    }
    count
}

## Convert to data.frame for convenience.
df <- tibble(orig = results,
             statements = map_chr(orig, strip_function),
             n_statements = map_int(statements, closure_count_sequential))
df
```
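For the recursive improvement mentioned above, here is a minimal sketch that works directly on language objects in base R rather than on deparsed token strings. `count_statements` is a hypothetical helper, not an existing BiocCheck or codetools function, and the rule it applies (a control-flow construct counts as one statement plus whatever its branches or loop body contain) is only one possible convention.

```r
## Hypothetical helper (not part of BiocCheck or codetools): count the
## statements in an expression, descending into `{` blocks and into the
## bodies of if / else, for, while and repeat.
count_statements <- function(e) {
    if (!is.call(e)) {
        return(1L)                      # a bare symbol or constant
    }
    callee <- e[[1]]
    fn <- if (is.symbol(callee)) as.character(callee) else "<other>"
    args <- as.list(e)[-1]
    if (fn == "{") {
        ## A block contributes only the statements it contains.
        return(sum(vapply(args, count_statements, integer(1))))
    }
    bodies <- switch(fn,
        "if" = args[-1],                # drop the condition, keep the branches
        "for" = args[3],                # drop the loop variable and sequence
        "while" = args[2],              # drop the condition
        "repeat" = args[1],
        NULL)
    if (is.null(bodies)) {
        return(1L)                      # any other call is a single statement
    }
    ## The control-flow statement itself plus everything nested inside it.
    1L + sum(vapply(bodies, count_statements, integer(1)))
}

f <- function(x) {
    ## A comment-only line.
    y <- sum(x,
             na.rm = TRUE)
    if (y > 0) {
        y <- y + 1
    } else {
        y <- 0
    }
    for (i in 1:3) y <- y + i
    y
}
count_statements(body(f))
```

Under this convention the example function has 7 statements: the assignment, the `if` plus one statement per branch, the `for` plus its body, and the final `y`. Comments and the line break inside `sum()` do not affect the count.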
My unsolicited 2 cents on this: it makes sense to ignore comment-only lines, and hopefully that is not too hard to implement. However, I would not recommend trying to go too far or spending too much time on an overly complicated counting approach. Maybe counting non-comment-only lines is good enough? It's just a NOTE after all.
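As a rough illustration of how small that simpler approach could be, here is a sketch that drops blank and comment-only lines before counting. `count_code_lines` is a hypothetical name, and the assumption is that the function's source lines are already available as a character vector (for example from `readLines()`).

```r
## Hypothetical helper: count lines that are neither blank nor comment-only.
## Lines with code followed by a trailing comment still count as code.
count_code_lines <- function(lines) {
    sum(!grepl("^[[:space:]]*(#.*)?$", lines))
}

src <- c(
    "f <- function(x) {",
    "    ## Explain the next step.",
    "",
    "    y <- x + 1  # trailing comment, still code",
    "    y",
    "}"
)
count_code_lines(src)
```

Here the six source lines reduce to four counted lines. This does not address calls split over several lines, which would still need a statement-based count.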
BiocCheck currently shows a NOTE for functions longer than 50 lines. However, comment-only lines count toward this limit, as does breaking the arguments of a function call across multiple lines to make them more legible. Instead of counting lines, BiocCheck should inspect code statements. For example, cloc is a CLI utility that counts code statements for many languages https://github.com/AlDanial/cloc (there is also an R package for cloc, https://github.com/hrbrmstr/cloc/, but it's not currently on CRAN). BiocCheck already imports the codetools package, which I think should be able to count closures with an appropriate code walker.
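To make the difference concrete, here is a small sketch comparing a raw line count with a count of top-level statements taken from the parsed function body. The example function and the variable names are made up for illustration; only base R's `parse()` is used.

```r
code <- "
f <- function(x, y) {
    ## Comment-only line.
    z <- paste(x,
               y,
               sep = '-')
    toupper(z)
}
"

## Parse without source references so only language objects remain.
exprs <- parse(text = code, keep.source = FALSE)

## exprs[[1]] is the `<-` call; its third element is the `function`
## expression, whose third element is the function body.
fun_body <- exprs[[1]][[3]][[3]]

n_lines <- length(strsplit(trimws(code), "\n")[[1]])
n_statements <- length(fun_body) - 1L   # drop the leading `{` symbol

c(lines = n_lines, statements = n_statements)
```

The definition spans 7 lines but the body holds only 2 top-level statements, because the parser discards the comment and treats the multi-line `paste()` call as a single expression.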