Refactor Example Data: R-CMD check error #118

Open
the-mayer opened this issue Oct 29, 2024 · 5 comments

@the-mayer
Collaborator

the-mayer commented Oct 29, 2024

An R-CMD check error occurs when running this example code for reverseOperonSeq():

❯ checking examples ... [14s/15s] ERROR
  Running examples in ‘MolEvolvR-Ex.R’ failed
  The error most likely occurred in:
  
  > base::assign(".ptime", proc.time(), pos = "CheckExEnv")
  > ### Name: reverseOperonSeq
  > ### Title: reverseOperon: Reverse the Direction of Operons in Genomic
  > ###   ContextSeq
  > ### Aliases: reverseOperonSeq
  > 
  > ### ** Examples
  > 
  > # Example genomic context data frame
  > prot <- data.frame(GenContext = c("A>B", "C<D", "E=F*G", "H>I"))
  > reversed_prot <- reverseOperonSeq(prot)
  Error in ge[[x]] : subscript out of bounds
  Calls: reverseOperonSeq -> lapply -> FUN -> straightenOperonSeq
  Execution halted

With the example data defined in prot, the error occurs during the lapply() call at line 137, because ge has length 0 after the preceding subset operation.

Originally posted by @the-mayer in #97 (comment)

the-mayer added a commit to awasyn/MolEvolvR that referenced this issue Oct 29, 2024
jananiravi added the bug label Oct 31, 2024
@Joiejoie1
Collaborator

@jananiravi @the-mayer Can I be assigned to this issue? I want to give it a try.

@jananiravi
Member

Sure, @Joiejoie1!

@Joiejoie1
Collaborator

> Sure, @Joiejoie1!

Thanks @jananiravi

@Joiejoie1
Collaborator

@jananiravi @the-mayer Here is how I plan to approach this issue (Refactor Example Data: R-CMD check error, #118):

  1. Inspect the Code to Understand the Logic:

reverseOperonSeq() takes a data frame with a GenContext column, splits that column, and manipulates the resulting genomic context strings.
straightenOperonSeq() is designed to annotate elements with directional indicators based on certain rules.

  1. Debugging the ge List Initialization:

The error message (subscript out of bounds) suggests that ge is empty or not structured as expected at the lapply() call.
Check where ge is assigned in the code to see if it could be empty. You’ll see that ge is created from te[witheq], where witheq is derived from te.
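
One way an empty ge could produce exactly this message, assuming the loop index is built as 1:length(ge) (an assumption; the package source is not shown in this thread): for an empty list, that expression counts down from 1 to 0 instead of yielding an empty index, so the first iteration asks for ge[[1]]. A minimal base-R sketch with a made-up empty list, not the package's actual data:

```r
# Hypothetical stand-in for an empty `ge`
ge <- list()

# 1:length(ge) is 1:0, which counts down instead of being empty
idx <- 1:length(ge)
print(idx)

# Asking for the first (non-existent) element raises an error; capture its message
msg <- tryCatch(
  lapply(idx, function(x) ge[[x]]),
  error = function(e) conditionMessage(e)
)
print(msg)

# seq_along() returns a zero-length index, so a loop over it never runs
safe_idx <- seq_along(ge)
print(length(safe_idx))
```

If the real code indexes ge differently, this sketch does not apply, but it is a quick thing to rule in or out.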

  1. Print Statements to Check Intermediate Variables:

Insert print() statements before the problematic line (line 137) to output the contents of ge, te, and witheq:
print(te) # Check the contents of te before filtering
print(witheq) # Verify which elements of te have "="
print(ge) # Ensure ge has the expected structure and elements

This makes it possible to see whether ge is empty or incorrectly structured before it is processed with lapply().

  1. Run the Code in Chunks:

Execute the code in sections (or line by line) up to line 137 to isolate where ge might become empty or miss elements.

  1. Test with Known Example Data:

Define example input data and run it with reverseOperonSeq() to observe how it handles the input and where it fails. Use the example from the documentation:
prot <- data.frame(GenContext = c("A>B", "C<D", "E=F*G", "H>I"))
reversed_prot <- reverseOperonSeq(prot)
This test will help determine if the error is consistent with the sample input, and you can check how the function processes each step.

  1. Check for Edge Cases:

Consider whether certain patterns in the GenContext column could lead to ge becoming empty, such as inputs that lack particular characters or symbols.
Modify the input data to include different patterns or minimal cases (like c("A>B")) to see if the function consistently produces an output.
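
The edge-case probing described here could be automated with a small helper that runs a function on each candidate input and records whether it errors. This is only a sketch: probe_inputs() and toy_fun() are hypothetical names invented for illustration, and toy_fun() is a stand-in that fails when no entry contains "=". It is not the real reverseOperonSeq().

```r
# Hypothetical helper: call `fun` on each input and return "ok" or the error message
probe_inputs <- function(fun, inputs) {
  vapply(inputs, function(gc) {
    prot <- data.frame(GenContext = gc, stringsAsFactors = FALSE)
    tryCatch({
      fun(prot)
      "ok"
    }, error = function(e) conditionMessage(e))
  }, character(1))
}

# Hypothetical stand-in: keeps entries containing "=" and loops with 1:length()
toy_fun <- function(prot) {
  witheq <- grepl("=", prot$GenContext, fixed = TRUE)
  ge <- as.list(prot$GenContext[witheq])
  lapply(1:length(ge), function(x) ge[[x]])
}

# Candidate inputs, named so the result is easy to read
cases <- list(
  minimal = c("A>B"),
  with_eq = c("A>B", "E=F*G"),
  empty   = character(0)
)

results <- probe_inputs(toy_fun, cases)
print(results)
```

With the package loaded, toy_fun could be swapped for reverseOperonSeq to see which patterns the real function accepts.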

  1. Add Conditionals to Handle Empty Cases:

If ge turns out to be empty for some inputs, guard the lapply() call so that it only runs when there is something to process:

if (length(ge) > 0) {
  ge <- lapply(seq_along(ge), function(x) straightenOperonSeq(ge[[x]]))
} else {
  warning("No elements to process in ge; skipping lapply operation.")
}
This condition ensures that lapply() only runs if ge contains elements.
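
An alternative to the explicit length check, offered only as a sketch: iterate over ge itself, so that an empty list simply produces an empty result. straighten_stub() and process_ge() below are hypothetical placeholders; the stub only upper-cases its input and is not the package's straightenOperonSeq().

```r
# Hypothetical placeholder for straightenOperonSeq(); it only upper-cases its input
straighten_stub <- function(x) toupper(x)

# Hypothetical wrapper: apply the stub to every element of `ge`
process_ge <- function(ge) {
  # lapply() over the list itself never calls the function when the list is empty
  lapply(ge, straighten_stub)
}

empty_result <- process_ge(list())
filled_result <- process_ge(list("a>b", "c=d"))

print(length(empty_result))
str(filled_result)
```

Whether an empty ge should pass silently like this or raise a warning as in the guarded version is a design choice for the maintainers.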

  1. Save and Rerun the Script:

Save the modified file and run the code from start to finish to confirm that it executes without errors.

  1. Verify Using R CMD check:

If the function works as expected, run devtools::check() to ensure the package passes R CMD check without errors.

@Joiejoie1
Collaborator

@jananiravi @the-mayer I have opened a PR for this issue.
