-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Case study report automation #111
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@awasyn, thanks again for putting together this PR. I haven't gone through all the files & functions line-by-line, but I started noting some high-level comments first that might help with a more careful review by our team.
- It might help if this PR focuses on new function definitions and MolEvolvR/CARD calls. When possible, call existing functions directly from
R/
or use the downloaded github package calls. @the-mayer may advise one way or another. - Use current function names post batch renaming (recently merged PRs). The current list is here: https://jravilab.github.io/MolEvolvR/reference/index.html. This is another reason to avoid redefining functions.
- Avoid stylistic changes or minor formatting changes, incl. stripping tail-end white spaces as part of commits to keep content and style-based commits and PRs separate.
Please help us by pointing to new functions/functionalities so we can focus our code review there.
MolEvolvR: @the-mayer @falquaddoomi @epbrenner
CARD/amR: @AbhirupaGhosh @epbrenner
Thanks a ton once again, for your concerted effort on this mega issue!
summarise(totalcount = sum(count)) | ||
|
||
total <- left_join(prot, col_count, by = as_string(column)) | ||
total <- left_join(prot, col_count, by = column) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@the-mayer, make sure this switch causes no errors?
|
||
prot <- select(prot, {{ column }}, {{ lineage_col }}) %>% | ||
filter(!is.na({{ column }}) & !is.na({{ lineage_col }})) %>% | ||
filter({{ column }} != "") | ||
|
||
prot <- summarizeByLineage(prot, column, by = lineage_col, query = "all") | ||
col_count <- prot %>% | ||
group_by({{ column }}) %>% | ||
group_by(!!sym(column)) %>% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@the-mayer, do we prefer curly curly or is this switch going to be consistent with the rest of our code? @awasyn, is there any particular reason for this switch? Did it ({{}}
) throw a warning or error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it threw an error. eg column <- "DomArch"
is passed as string but earlier in the code there was a column <- sym(column)
which change "DomArch" to a symbol i.e "DomArch"
and the code tries to find column name in dataframe as "DomArch"
which doesn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. {{column}} should work similar to sym(column). So not sure what's going on. @the-mayer?
inst/report/scripts/viz_utils.R
Outdated
} | ||
|
||
# Function to convert accessions to names | ||
acc_to_name <- function(app_data) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Distinguish from https://jravilab.github.io/MolEvolvR/reference/mapAcc2Name.html
inst/report/scripts/viz_utils.R
Outdated
} | ||
|
||
# Function to retrieve representative accession numbers | ||
get_representative_accession_numbers <- function(app_data, phylo_select, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as https://jravilab.github.io/MolEvolvR/reference/createRepresentativeAccNum.html?
Avoid function duplication, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First off, I commend you for taking on a difficult task and completing it!
I'm still in the process of reviewing your PR, but I thought I'd submit what comments I had so far so you can act on them. I'll likely do a second review a bit later on, when I get a chance to run it and investigate your changes further.
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@awasyn, were these changes (& these) pulled from main
? If not, stylistic changes such as trailing spaces & indentation can be addressed later (post-merge).
okay thanks. I wanted to ease review by updating back to only the changed lines. my editor added spaces during save. |
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Signed-off-by: Awa Synthia <[email protected]>
Cases study report automation with MolEvolvR
rmd
report #27Rmd
/Qmd
#102This commit introduces a series of scripts and supporting code for automating case study reports via the MolEvolvR analysis pipeline. The added scripts enable end-to-end processing of pathogen and/or drug data from CARD data in FASTA format, automating key tasks like sequence retrieval, alignment, and protein annotation. Currently, the pipeline supports only the "full" analysis option,
Specifically, the following capabilities have been implemented:
BLAST and InterProScan Integration:
This approach is based on the MolEvolvR webapp pipeline , with portions of code directly adapted from MolEvolvR scripts. The original authorship has been retained to acknowledge the foundational work and provide proper attribution.
Current dependencies not inlcuded in this commit :
To-Do:
Update:
Hi, below are the complete instructions for running the case study report generation on Windows and Ubuntu, including how to install and use the getCaseStudyReport function for automating the case study:
Setup Instructions
Windows
blastn -version
cd-hit -h
Ubuntu
sudo apt-get install cd-hit -y
Next add this support data.
Try out the case study
Once everything is set up, you can test the automation by running the following in R/RStudio:
This will generate and open a detailed case study report. 🚀
The output files are in the parent directory of the current directory where the function is called.
@jananiravi @the-mayer @falquaddoomi
What kind of change(s) are included?
Checklist
Please ensure that all boxes are checked before indicating that this pull request is ready for review.