Pr 1 of 2: Add Microarray Pathway Analysis - GSEA example #345

cbethell · 2020-10-30T16:59:49Z

Analysis Purpose

This PR addresses issue #284

Pull Request Stage

This is a Refined PR - needs review of details and polishing

The draft PR relevant to this PR is #339.

Strategy

As suggested in one of the comments on the draft PR, the decision was made to break up the analysis notebook into two PRs.

For this PR, the first of the two, focus was placed on the preparation steps leading to the gene set enrichment analysis step (the second PR will be focused on the GSEA step and visualizations).

Therefore, this PR includes:

Context around the differential expression results file being imported
A section on Getting familiar with clusterProfiler's options
Gene identifier conversion from Ensembl to Entrez IDs
A step to filter out duplicate IDs to account for warning messages thrown by clusterProfiler::GSEA() (which will be used in the latter half of the notebook) -- brought up by this comment on the draft PR

Concerns/Questions for reviewers:

The context on what GSEA does at the top of the notebook, as well as the context in the Getting familiar with clusterProfiler's options section, could use an extra-close look.

The step filtering out duplicates could also use an extra-close look, as I am not sure the decision made was best suited for this purpose. I filtered out duplicated IDs because we filtered out multi-mapped IDs in a previously updated analysis and mentioned that we may want to do that depending on the downstream use case. In this case, our downstream analysis does not like duplicated IDs, and there were n = 2 duplicate IDs in this PR.

Overall, do the guided instructions and context provided in this PR appear to suffice for our users? Should more or less context be added?

Analysis Pull Request Check List (roughly in order):

Content checks

All {{BLANKS}} have been replaced with the correct content.
Sources are cited
Seed is set (not applicable)

Formatting Checks

Removed any manual numbering of sections.
Removed any instances of chunk naming.
Comments and documentation are up to date.
All links have been checked and are properly formatted.

Add datasets to S3

Added data and metadata files to S3.

Docker/Snakemake rendering components

Added the .html link to the navigation bar.
Any not yet added packages needed for this analysis have been added to the Dockerfile and it successfully builds.

* Adding in some style with css * Use css magic * Try making the navbar blue * Add survey link * Make font smaller * Need a comma * Change to normalizePath * normalizepath separate step references.bib * Move references.bib to component folder * Made ccs modifications, added logo file Made changes to css/navbar.html Tried to add the logo but it but it cuts out and not sure how to make it decent. * Resolve render-notebooks.R conflict * Remove testing html from file diff * uncommented mobile nav Co-authored-by: dvenprasad <[email protected]>

@dvenprasad

* Adding in some style with css * Use css magic * Try making the navbar blue * Add survey link * Make font smaller * Need a comma * Change to normalizePath * normalizepath separate step references.bib * Move references.bib to component folder * Update github actions to reflect staging branch (#311) * Update github actions to reflect staging branch * Add libglpk40 to Dockerfile * Make it gh-pages-stages! * Remove dockerfile change that should have been on its own all along * Does this work? * Declare a uses * Switch how env is declared * Force it to run so we can test it * try no curly brackets * What's up with the branch * Move to bash if instead * Need quotes? * forgot a `then` * Try dollar signs * Doesn't like the `.`? * Use curly brackets * Try ${GITHUB_REF} * Try ${BRANCH_NAME} * try ${GITHUB_REF#refs/*/} * use jashapiro suggestion * Change to base ref * Change back to `github.ref` * Get rid of PR `on:` * Try another test * Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316) * Add lib package 40 thing that clusterprofiler needs * Try adding options(warn = 2) * Test if options(warn =2) means it breaks like it should * Revert "Test if options(warn =2) means it breaks like it should" This reverts commit d9f688f. * Revert "Try another test" This reverts commit 845cf1a. * Add google analytics to renderings (#314) * Try adding google analytics * Add to header using includes * temporary file snuck in there * Restore master version so they aren't in the review * Let's call an html file and html file * Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316) * Add lib package 40 thing that clusterprofiler needs * Try adding options(warn = 2) * Test if options(warn =2) means it breaks like it should * Revert "Test if options(warn =2) means it breaks like it should" This reverts commit d9f688f. * Only push if we are in master. For simplicity, we will now run this even if the dockerfile hasn't changed. 
* Add test target * test staging workflow with this branch * back to latest tag * Try separate push step * change tags to test push * Revert "change tags to test push" This reverts commit 6a38574. * Remove this branch from triggers * Push staging, retag and push master Okay, so the branch name is now inaccurate, but that is fine... * Made ccs modifications, added logo file Made changes to css/navbar.html Tried to add the logo but it but it cuts out and not sure how to make it decent. * Resolve render-notebooks.R conflict * Remove testing html from file diff * uncommented mobile nav * Update scripts/render-notebooks.R * Add some issue templates (#319) * Add some rough draft issue templates * Incorporate cbethell review * Get rid of `Other` labels that aren't useful * Update diagrams showing how microarray/RNA-seq work (#326) * Mechanics for CSS file and navbar add feedback URL (#303) * Adding in some style with css * Use css magic * Try making the navbar blue * Add survey link * Make font smaller * Need a comma * Change to normalizePath * normalizepath separate step references.bib * Move references.bib to component folder * Made ccs modifications, added logo file Made changes to css/navbar.html Tried to add the logo but it but it cuts out and not sure how to make it decent. * Resolve render-notebooks.R conflict * Remove testing html from file diff * uncommented mobile nav Co-authored-by: dvenprasad <[email protected]> * Update microarray and RNAseq overview figures - add context re figures - change .jpg to .png for consistency * Revert "Mechanics for CSS file and navbar add feedback URL (#303)" This reverts commit 8b81fdd. 
* update links to diagrams * @dvenprasad updated figure spacing * add the right updated figure * replace section of link to figures with updated commit id * incorporate @cansavvy's suggested changes Co-authored-by: Candace Savonen <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Joshua Shapiro <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Chante Bethell <[email protected]>

- add first part of new GSEA notebook example - update Snakefile - update navbar file - update `references.bib` - update `dictionary.txt`

cansavvy

I think this looks very good! I just have a comment/question about maybe trimming down the mapping section, let me know what you think.

On a different note: this new PR process seems to be working well review-wise; how do you feel it's working on your end?

02-microarray/pathway-analysis_microarray_03_gsea.Rmd

cansavvy · 2020-10-30T18:30:54Z

02-microarray/pathway-analysis_microarray_03_gsea.Rmd

+
+It looks like we have two Entrez IDs that were mapped to multiple Ensembl IDs.
+For the purpose of performing GSEA later in this notebook, we will filter out the duplicated Entrez IDs.
+For more about how to explore this, take a look at our [microarray gene ID conversion example](https://alexslemonade.github.io/refinebio-examples/02-microarray/gene-id-annotation_microarray_01_ensembl.html).


I wonder if we want to spend as much time on the gene ID mapping since we have separate examples for that? We could just skip right to using multiVals = "filter" since this is essentially what you are doing here but manually.

That being said, we may not want to drop data here and I don't think we are as particular about which gene ID is used, so maybe we should just simplify, use multiVals = "first", tell them to see the mapping examples for more info and then move on?
https://www.rdocumentation.org/packages/AnnotationDbi/versions/1.30.1/topics/AnnotationDb-objects

I believe I implemented multiVals = "first" as you suggest here in the last commit. I used an inner_join() to join the expression data so as not to have NAs in the entrez_id column (which may pose an issue later when running GSEA()). Please let me know if this is what you intended @cansavvy or if I should make any further changes here.
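For readers following along, the effect of inner_join() described above can be sketched with toy data. This is only an illustration under assumptions: the data frame names, column names, and gene IDs below are made up and are not the notebook's actual objects, and the mapping table is assumed to simply lack a row for any unmapped Ensembl ID.

```r
# Hypothetical differential expression results keyed by (made-up) Ensembl IDs
dge_df <- data.frame(
  Gene = c("ENSDARG_A", "ENSDARG_B", "ENSDARG_C"),
  t = c(2.5, -1.2, 0.4),
  stringsAsFactors = FALSE
)

# Hypothetical mapping table; ENSDARG_B has no Entrez ID, so it has no row here
mapped_df <- data.frame(
  Ensembl = c("ENSDARG_A", "ENSDARG_C"),
  entrez_id = c("111", "333"),
  stringsAsFactors = FALSE
)

# A left join keeps every gene, leaving NA where no Entrez ID was found
left_joined_df <- dplyr::left_join(dge_df, mapped_df, by = c("Gene" = "Ensembl"))

# An inner join keeps only genes present in both tables, so no NAs are introduced
dge_mapped_df <- dplyr::inner_join(mapped_df, dge_df, by = c("Ensembl" = "Gene"))
```

The trade-off is that genes without a mapping are dropped silently, so it may be worth comparing row counts before and after the join.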

I will note however, that this still leaves us with two duplicated Entrez IDs that map to multiple Ensembl IDs resulting in the following warning message when running GSEA() later in the notebook: There are duplicate gene names, fgsea may produce unexpected results.

We should probably show users that these duplicates exist, then, and how to deal with them. The fact that there are only two instances of this is kind of annoying, but it's probably good that this came up so we can show users how to deal with it -- in other datasets or species it's possible it will come up more (or maybe not at all).

I think we should incorporate two steps (that should maybe be their own chunk).

Show users how to test whether there are duplicated Entrez IDs. A TRUE/FALSE check like any(duplicated()) would probably work.

Show one way that you can decide on which Entrez ID's data to keep (here's where there could be a lot of ways to do this, but we will just have to pick one that we think will be generally useful in most contexts).
I think an okay way to do this would be to keep the data for the entrez gene id with the higher t value (or lower p value) since it will be of greater interest.
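The two steps proposed above could look roughly like the following. This is a minimal sketch, assuming a data frame with entrez_id and t columns; the toy IDs and t values are invented for illustration and do not come from the dataset in this PR.

```r
library(magrittr)

# Toy data: Entrez ID "222" appears twice with different t-statistics
dge_mapped_df <- data.frame(
  entrez_id = c("111", "222", "222", "333"),
  t = c(1.8, 0.6, 3.1, -2.4),
  stringsAsFactors = FALSE
)

# Step 1: a single TRUE/FALSE answer to "are there any duplicated Entrez IDs?"
has_duplicates <- any(duplicated(dge_mapped_df$entrez_id))

# Step 2: sort so the highest t-statistic comes first, then keep only the
# first row seen for each Entrez ID
filtered_df <- dge_mapped_df %>%
  dplyr::arrange(dplyr::desc(t)) %>%
  dplyr::filter(!duplicated(entrez_id))
```

Keeping the first row after sorting is only one of several reasonable choices; averaging the duplicated rows would be another.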

May be good to get a @jashapiro opinion on this.

I think an okay way to do this would be to keep the data for the entrez gene id with the higher t value (or lower p value)

Picking whatever entry has the larger absolute value for the stat we use for ranking makes sense to me. (Are we still using LFC?) A caveat we should point out is that the genes that have duplicate identifiers could be enriched in a particular pathway/gene set and you may get an overly optimistic view of how perturbed that pathway is using this approach.

Are we still using LFC?

There is log fold change. And based on the draft PR LFC will be used for GSEA.

We are using the t value now (per your comment on the draft PR @jaclyn-taroni).

In the last commit, I added two steps, one to check for duplicate identifiers and the other to sort by t and remove the lower duplicate value.

Let me know if you think this is the best approach given the suggestions above @cansavvy and @jaclyn-taroni.

Did you find any evidence to support my thought that t would be more "standard?" That was mostly based on recollection.

Linking a comment that answers this question from PR 2 of 2 #347 here:

"""
The decision to create the pre-ranked gene vector based on the t-statistic rather than log fold change was explored based on this comment from the draft PR and eventually made based on what is recommended in Discovering statistically significant pathways in expression profiling studies and the explanation from a biostars forum which says "Try to understand how they relate to the question you're interested in e.g. if you're most interested in effect size then the fold change is what you should use but if you're more interested in statistical significance then look for one of the statistics taking into consideration the assumptions they make e.g. t-test" as I believe we are encouraging users to look at statistical significance here.
"""

cbethell · 2020-11-02T14:23:31Z

@cansavvy I believe I addressed your comments so this is ready for another look. Note my comment here however, as using multiVals = "first" still leaves us with duplicated Entrez IDs that map to multiple Ensembl IDs.

Also, to answer your question

On a different note: this new PR process seems to be working well review-wise; how do you feel it's working on your end?

I believe this new PR process is working very well from my end as well, as the PR's author.

cansavvy · 2020-11-02T17:17:55Z

I think this is almost ready! Just need to make some decisions about the multiple entrez ids: #345 (comment)

- add note re using said approach

cansavvy

Just a few little comments/requests and then I think this is ready from my end.

02-microarray/pathway-analysis_microarray_03_gsea.Rmd

cansavvy · 2020-11-04T15:50:13Z

02-microarray/pathway-analysis_microarray_03_gsea.Rmd

+ dplyr::arrange(dplyr::desc(t)) %>%
+ # Filter out the duplicated rows -- this will keep the first row with the
+ # duplicated value thus keeping the row with the highest t-statistic value
+ dplyr::filter(!duplicated(entrez_id))


I think distinct() is a more direct version of the dplyr::filter(!duplicated()) you have here.

Second question, can we prove this to ourselves a bit? Perhaps as simply as printing out one of the duplicated Entrez IDs and its t values before and after? (I don't want to add too much length to these steps, but I also think it's good to make data removal steps verifiable and clear.)

The reason I opted to use dplyr::filter(!duplicated(entrez_id)) is because dplyr::distinct(entrez_id) returns only the column entrez_id while dplyr::distinct() returns all the rows containing duplicate identifiers (since their t values etc. are different). Perhaps my implementation of dplyr::distinct() is incorrect in this case?

Also, I agree with your second point here. While developing, I used the following to find the duplicate ids and their associated data as a sanity check:

dge_mapped_df %>% dplyr::filter(duplicated(entrez_id))

However, this returns just one of the rows with each of the duplicate ids (I manually searched the before and after data frames for the associated data using the exact entrez_id value returned).

Perhaps I can include the step to print out the below output and use dge_mapped_df %>% dplyr::filter(entrez_id == 336702) as a sanity check?
I implemented this plan in the last commit.
What do you think? Do you have any suggestions to truncate this?

dplyr::distinct(entrez_id) returns only the column entrez_id

Ah. There's a .keep_all = TRUE argument for distinct() that you need to use to not drop columns.
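To make the difference between these calls concrete, here is a small sketch on toy data (the values are made up for illustration). It compares the three forms of distinct() discussed above with the original filter(!duplicated()) step.

```r
# Toy data: "222" is duplicated, with a different t-statistic in each row
toy_df <- data.frame(
  entrez_id = c("222", "222", "333"),
  t = c(3.1, 0.6, -2.4),
  stringsAsFactors = FALSE
)

# Returns only the entrez_id column (two rows, one column)
ids_only <- dplyr::distinct(toy_df, entrez_id)

# Compares whole rows, so nothing is dropped because the t values differ
all_rows <- dplyr::distinct(toy_df)

# Keeps the first row per entrez_id and retains every column
kept <- dplyr::distinct(toy_df, entrez_id, .keep_all = TRUE)

# The original approach, for comparison
same_as_filter <- dplyr::filter(toy_df, !duplicated(entrez_id))
```

Both kept and same_as_filter retain the first occurrence of each ID, which is why the preceding sort determines which row survives.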

Also, I agree with your second point here. While developing, I used the following to find the duplicate ids and their associated data as a sanity check:
dge_mapped_df %>% dplyr::filter(duplicated(entrez_id))

I was trying to find a straightforward way of doing this without having to make a separate duplicate Entrez IDs object, but I didn't come up with anything that's great. I was hoping duplicated() had an option to return values directly so you could use an %in%, but it doesn't. I also looked to see if tidyverse had a reverse of distinct(), but it doesn't seem to. If we installed another package (which I don't want to do) we could use janitor::get_dupes(), but I don't find that worth having users install another package for.

So we are left with doing the manual preview you used here -- which I think may be the simplest route for users to follow and still get the point across. OR, this kind of thing:

dup_entrez_ids <- dge_mapped_df %>% dplyr::filter(duplicated(entrez_id)) %>% dplyr::pull(entrez_id)

where you then have to use dup_entrez_ids to retrieve things, but you'd still have to do an arrange.

I think we should just stick with your simple and effective use of an example entrez id like 336702.
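For reference, the alternative weighed above (a separate object of duplicated IDs, then %in% and an arrange) might look like this. It is a sketch on invented toy data, not the approach that went into the notebook.

```r
library(magrittr)

# Toy data with two duplicated Entrez IDs ("222" and "333")
dge_mapped_df <- data.frame(
  entrez_id = c("111", "222", "222", "333", "333", "444"),
  t = c(1.8, 0.6, 3.1, -2.4, -0.3, 0.9),
  stringsAsFactors = FALSE
)

# duplicated() flags only the second and later occurrences, so this returns
# one row per duplicated ID; pull() extracts the IDs as a vector
dup_entrez_ids <- dge_mapped_df %>%
  dplyr::filter(duplicated(entrez_id)) %>%
  dplyr::pull(entrez_id)

# Retrieve every row that shares one of those IDs, grouped and sorted for review
dup_rows <- dge_mapped_df %>%
  dplyr::filter(entrez_id %in% dup_entrez_ids) %>%
  dplyr::arrange(entrez_id, dplyr::desc(t))
```

This shows all rows for every duplicated ID at once, at the cost of an extra object and a few more lines than checking a single example ID.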

In the most recent commit, I went ahead and replaced the dplyr::filter(!duplicated(entrez_id)) step with dplyr::distinct(entrez_id, .keep_all = TRUE) as you suggested @cansavvy, and left the subsequent steps as is because I also believe it is not worth having users install another package for this. I do wish distinct() had a reverse function, but I believe what we currently have is the next best simple yet effective solution.

- fix typo - add sanity check when removing duplicates

cansavvy

I think after you decide about distinct() and the other info in the comment I just left, this looks good by me. 👍 Request a review from Jackie when you are ready for her to take a look.

cbethell · 2020-11-05T13:43:17Z

@jaclyn-taroni this PR is ready for a second review whenever you are ready!

02-microarray/pathway-analysis_microarray_03_gsea.Rmd

jaclyn-taroni

This looks pretty good! Before it goes in, I think we need to make sure we're explaining and showing folks everything they need to do as they're doing it. Regarding the point about moving up the GSEA explanation, I think we can work that over on #347.

02-microarray/pathway-analysis_microarray_03_gsea.Rmd

jaclyn-taroni · 2020-11-05T14:50:52Z

02-microarray/pathway-analysis_microarray_03_gsea.Rmd

+```
+
+Looks like we were able to successfully get rid of the duplicate gene identifiers and keep the observation with the higher t-statistic value!
+Note however, that a caveat in using this approach is that the genes that have duplicate identifiers could be enriched in a particular pathway/gene set and we may get an overly optimistic view of how perturbed that pathway truly is.


Move to ~line 296 and expand upon. At this point, you haven't explained GSEA yet (https://github.com/AlexsLemonade/refinebio-examples/pull/339/files#diff-0d0fd22250c391b41f655f78697b0bcbd9626a54072e31fc31cbd01c6faf295dR295)! So I'd also think about, now that we need to explain scientific decision making during the gene identifier conversion, if we want to move up that explanation to before the gene ID conversion.

I moved this line up to ~line 296 in this PR and left a comment on #347 to address the latter part of this comment: #347 (comment)

- add preview of `dr_hallmark_df` - add context where needed - adapt approach to removing duplicate gene ids

cbethell · 2020-11-05T19:25:03Z

@jaclyn-taroni, I believe I addressed your above review comments with the exception of moving the GSEA explanation up to before the gene ID conversion. I left a comment on #347 (#347 (comment)) to address that as suggested.

Please let me know if I misinterpreted any of your comments or need to make additional changes to satisfy your comments.

jaclyn-taroni · 2020-11-05T20:40:24Z

02-microarray/pathway-analysis_microarray_03_gsea.Rmd

+```{r}
+filtered_dge_mapped_df <- dge_mapped_df %>%
+ # Sort so that highest t-statistic values are at the top
+ dplyr::arrange(dplyr::desc(t)) %>%


I missed this in my first review. I would expect this to be based on absolute value, i.e., whichever value is most likely to be highly- or lowly-ranked (or further away from the center of the ranking depending on how you'd like to talk about it) #345 (comment)

Gotcha, that does make the most sense here. I overlooked this in the first comment on #345 but believe I implemented it in the most recent commit. Please let me know if I missed an important step in the implementation @jaclyn-taroni.

Worth noting: the GSEA() step included in PR #347 throws the following error when the vector is sorted based on absolute value:

Error in GSEA_internal(geneList = geneList, exponent = exponent, minGSSize = minGSSize, : geneList should be a decreasing sorted vector...

I'm not saying we should sort the vector we pass GSEA() by absolute value (provided I am following you correctly), only that we should select the duplicate instances with a greater absolute value.

Ah gotcha! Makes even more sense now thinking about it 👍
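The distinction drawn above can be sketched as follows: absolute value decides which duplicate row to keep, while the vector handed to GSEA() is still sorted by the signed statistic in decreasing order. The toy data and object names are assumptions for illustration only.

```r
library(magrittr)

# Toy data: for "333", the row with the larger absolute t is the negative one
dge_mapped_df <- data.frame(
  entrez_id = c("111", "222", "222", "333", "333"),
  t = c(1.8, 0.6, 3.1, 1.0, -2.4),
  stringsAsFactors = FALSE
)

# Choose among duplicates by absolute value of the t-statistic
filtered_df <- dge_mapped_df %>%
  dplyr::arrange(dplyr::desc(abs(t))) %>%
  dplyr::distinct(entrez_id, .keep_all = TRUE)

# Build the named, pre-ranked vector from the signed t-statistic
ranked_vector <- filtered_df$t
names(ranked_vector) <- filtered_df$entrez_id

# Sort by the signed value, in decreasing order
ranked_vector <- sort(ranked_vector, decreasing = TRUE)
```

Note that sorting by desc(t) alone would have kept t = 1.0 for "333", discarding the row further from the center of the ranking.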

02-microarray/pathway-analysis_microarray_03_gsea.Rmd

jaclyn-taroni

I returned my review early so I could respond to your comment. See what you think of what I wrote. I expect that we will need to continue to refine that language over on #347 depending on how and where we introduce GSEA.

Co-authored-by: Jaclyn Taroni <[email protected]>

Pr 1 of 2: Add Microarray Pathway Analysis - GSEA example (#345) * add first half of microarray GSEA example nb - add first part of new GSEA notebook example - update Snakefile - update navbar file - update `references.bib` - update `dictionary.txt` * revert commit that snuck in * revert second commit that snuck in * fix render notebooks merge conflict * incorporate cansavvy's review suggestions * add step handling duplicate ids - add note re using said approach * update comment * replace lfc with t-statistic value where mentioned * incorporate @cansavvy's review suggestions - fix typo - add sanity check when removing duplicates * replace `!duplicated()` with `dplyr::distinct()` * incorporate @jaclyn-taroni's review suggestions - add preview of `dr_hallmark_df` - add context where needed - adapt approach to removing duplicate gene ids * add a bit more context re removing dup gene IDs * use absolute value of t-statistic * Apply GSEA explanation suggestion from code review Co-authored-by: Jaclyn Taroni <[email protected]> * rerun snakefile to update rendered html Co-authored-by: Candace Savonen <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Joshua Shapiro <[email protected]> Co-authored-by: Jaclyn Taroni <[email protected]>
references.bib to component folder * Made ccs modifications, added logo file Made changes to css/navbar.html Tried to add the logo but it but it cuts out and not sure how to make it decent. * Resolve render-notebooks.R conflict * Remove testing html from file diff * uncommented mobile nav Co-authored-by: dvenprasad <[email protected]> * Making staging changes live (#329) * Adding in some style with css * Use css magic * Try making the navbar blue * Add survey link * Make font smaller * Need a comma * Change to normalizePath * normalizepath separate step references.bib * Move references.bib to component folder * Update github actions to reflect staging branch (#311) * Update github actions to reflect staging branch * Add libglpk40 to Dockerfile * Make it gh-pages-stages! * Remove dockerfile change that should have been on its own all along * Does this work? * Declare a uses * Switch how env is declared * Force it to run so we can test it * try no curly brackets * What's up with the branch * Move to bash if instead * Need quotes? * forgot a `then` * Try dollar signs * Doesn't like the `.`? * Use curly brackets * Try ${GITHUB_REF} * Try ${BRANCH_NAME} * try ${GITHUB_REF#refs/*/} * use jashapiro suggestion * Change to base ref * Change back to `github.ref` * Get rid of PR `on:` * Try another test * Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316) * Add lib package 40 thing that clusterprofiler needs * Try adding options(warn = 2) * Test if options(warn =2) means it breaks like it should * Revert "Test if options(warn =2) means it breaks like it should" This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3. * Revert "Try another test" This reverts commit 845cf1aff92ea7b83f402bbefd563562b44e5eac. 
* update links to diagrams * @dvenprasad updated figure spacing * add the right updated figure * replace section of link to figures with updated commit id * incorporate @cansavvy's suggested changes Co-authored-by: Candace Savonen <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Joshua Shapiro <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Chante Bethell <[email protected]> * add latter half of GSEA microarray example (includes GSEA steps) fix merge conflicts * add incode prompt * revert commit that snuck in * revert commit * set seed and re-run * incorporate some of the wording/context suggestions from review * rerun Snakefile * incorporate suggested changes re additional context/GSEA explanation * implement `top_n()` * add a bit more context for clarification re ES score * update GSEA explanation before gene ID conversion section * incorporate @cansavvy's wording suggestions * mimic "highly" -> "most" language * incorporate wording suggestions from code review * some re-structuring/re-wording based on review suggestions * update `dictionary.txt` file * incorporate @jaclyn-taroni's review suggestions Co-authored-by: Candace Savonen <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Joshua Shapiro <[email protected]> * Delete intro Rmd and renumber (#355) * Try out intro and fix filenames * Undo intro paragraph for now. Too much * Missed one link to update in GSEA * Add words about Draft and Refined PRs to CONTRIBUTING.md (#361) * Explicitly discuss draft vs refine PRs in contrib * doctoc it * Remove asterisks * Refine wording * Use cbethell's wording suggestions * Make that one sentence more clear? 
* WGCNA Part 1: Set up (#358) * Put in basic changes: navbar, dict, snakefile, Rmd * More polishing and info and refs * Update file paths * Bring back docker changes * Add to dictionary * Add a couple refs * Add ref and other little things * Revert "Add ref and other little things" This reverts commit 7560c2a7cb861aaecefd8d241db209d8b3658989. * Address straightforward comments from cbethell * Add ref * Add more refs and re-render * Remove that extra part that should only be in part2 not here * Incorporate jashapiro review * Shorten up some more comments * rowSums!! * Get rid of tibble step and change wording * WGCNA Part 2: Running WGCNA (#360) * Put in basic changes: navbar, dict, snakefile, Rmd * More polishing and info and refs * Update file paths * Bring back docker changes * Add to dictionary * Add a couple refs * Add next steps * Add some polishing and refs * Address the straightforward items from cbethell 's review * Incorporate jashapiro review from #358 * Style Rmds * Bring over part1 changes and re-render * Edit things based on jashapiro review Co-authored-by: GitHub Actions <[email protected]> * Add pathway analysis intro paragraph to microarray ORA (#356) * Try out intro and fix filenames * Undo intro paragraph for now. Too much * Add intro paragraph * Fix typo, add links * Incorporate cbethell review * Wording change from @envest * Fix WGCNA installation (#366) * Move order of install for WGCNA * warn moar * Pr 1 of 2: Add Microarray Pathway Analysis - GSVA example (#359) * Mechanics for CSS file and navbar add feedback URL (#303) * Adding in some style with css * Use css magic * Try making the navbar blue * Add survey link * Make font smaller * Need a comma * Change to normalizePath * normalizepath separate step references.bib * Move references.bib to component folder * Made ccs modifications, added logo file Made changes to css/navbar.html Tried to add the logo but it but it cuts out and not sure how to make it decent. 
* update links to diagrams * @dvenprasad updated figure spacing * add the right updated figure * replace section of link to figures with updated commit id * incorporate @cansavvy's suggested changes Co-authored-by: Candace Savonen <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Joshua Shapiro <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Chante Bethell <[email protected]> * Add first half of microarray GSVA example notebook * add packages to Dockerfile and rerun * fix reference * add to navbar * remove mention of pheatmap * incorporate @jaclyn-taroni's suggestion on collapsing duplicates logic * incorporate cansavvy's review comments - fix logic combing rest of mapped data with the collapsed duplicates data - fix context around that logic * clarify/change some wording based on cansavvy's suggestions * incorporate single sample example of selecting max expression values * Push code that cbethell and I chatted through * Add to dictionary * Style Rmds * rerun Snakefile to update html file * Apply jaclyn-taroni's wording suggestions from code review Co-authored-by: Jaclyn Taroni <[email protected]> * incorporate the rest of jaclyn-taroni's review suggestions Co-authored-by: Candace Savonen <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Joshua Shapiro <[email protected]> Co-authored-by: GitHub Actions <[email protected]> Co-authored-by: Jaclyn Taroni <[email protected]> * WGCNA Part 3: DE and heatmaps (#363) * Put in basic changes: navbar, dict, snakefile, Rmd * More polishing and info and refs * Update file paths * Bring back docker changes * Add to dictionary * Add a couple refs * Add next steps * Add some polishing and refs * Address the straightforward items from cbethell 's review * Incorporate jashapiro review from #358 * Style Rmds * Bring over part1 changes and re-render * Add last set of steps * Push this partcular plot version in case we wanna come back 
to it * Commit this multiple module pheatmap in case I want to return to it * ComplexHeatmap is mostly wrangled * It's working! * Save to PDFs * Fix color function and re-render * Add outlier thing * Revert "Add outlier thing" This reverts commit 8b9d57ce13ff2b6b6c5ddbb0169a794f6bbd36de. * Add ref for ComplexHeatmap * Incorporate jashapiro review and rerender * Remove standardize_genes option * Wrap up those last few typo things Co-authored-by: GitHub Actions <[email protected]> * WGCNA Part 4: Warn about Outliers (#364) * Put in basic changes: navbar, dict, snakefile, Rmd * More polishing and info and refs * Update file paths * Bring back docker changes * Add to dictionary * Add a couple refs * Add next steps * Add some polishing and refs * Address the straightforward items from cbethell 's review * Incorporate jashapiro review from #358 * Style Rmds * Bring over part1 changes and re-render * Add last set of steps * Push this partcular plot version in case we wanna come back to it * Commit this multiple module pheatmap in case I want to return to it * ComplexHeatmap is mostly wrangled * It's working! * Save to PDFs * Fix color function and re-render * Add outlier thing * Style Rmds * Re-rendered html * switch the whole outlier thing to just a comment * re-render after staging merge Co-authored-by: GitHub Actions <[email protected]> * Microarray ORA Restructure Instruction (#377) * Some edits and adding other tutorials * Add more guidance about why pick ORA * A bit more word changing * A few more wording edits * Incorporating jashapiro review * Get rid of other GSEA mention * sessioninfo::session_info() * Put those two wording things in jashapiro mentioned * WGCNA Part 5: switch dataset (#379) * switch wording and dataset in general * Few more wording edits * Update dictionary; fix spelling errors * Re-render! * Change to 7 and incorporate jashapiro review * Also switch the most sig module! 
* Two comments from jashapiro review * Put the comments too * Style Rmds * Use all_of() to get rid warning * Style Rmds * Re-render Co-authored-by: GitHub Actions <[email protected]> * Change pdf -> png and rereun (#382) * Pr 2 of 2: Add Microarray Pathway Analysis - GSVA example (#362) * Mechanics for CSS file and navbar add feedback URL (#303) * Adding in some style with css * Use css magic * Try making the navbar blue * Add survey link * Make font smaller * Need a comma * Change to normalizePath * normalizepath separate step references.bib * Move references.bib to component folder * Made ccs modifications, added logo file Made changes to css/navbar.html Tried to add the logo but it but it cuts out and not sure how to make it decent. * Resolve render-notebooks.R conflict * Remove testing html from file diff * uncommented mobile nav Co-authored-by: dvenprasad <[email protected]> * Making staging changes live (#329) * Adding in some style with css * Use css magic * Try making the navbar blue * Add survey link * Make font smaller * Need a comma * Change to normalizePath * normalizepath separate step references.bib * Move references.bib to component folder * Update github actions to reflect staging branch (#311) * Update github actions to reflect staging branch * Add libglpk40 to Dockerfile * Make it gh-pages-stages! * Remove dockerfile change that should have been on its own all along * Does this work? * Declare a uses * Switch how env is declared * Force it to run so we can test it * try no curly brackets * What's up with the branch * Move to bash if instead * Need quotes? * forgot a `then` * Try dollar signs * Doesn't like the `.`? 
* Use curly brackets * Try ${GITHUB_REF} * Try ${BRANCH_NAME} * try ${GITHUB_REF#refs/*/} * use jashapiro suggestion * Change to base ref * Change back to `github.ref` * Get rid of PR `on:` * Try another test * Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316) * Add lib package 40 thing that clusterprofiler needs * Try adding options(warn = 2) * Test if options(warn =2) means it breaks like it should * Revert "Test if options(warn =2) means it breaks like it should" This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3. * Revert "Try another test" This reverts commit 845cf1aff92ea7b83f402bbefd563562b44e5eac. * Add google analytics to renderings (#314) * Try adding google analytics * Add to header using includes * temporary file snuck in there * Restore master version so they aren't in the review * Let's call an html file and html file * Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316) * Add lib package 40 thing that clusterprofiler needs * Try adding options(warn = 2) * Test if options(warn =2) means it breaks like it should * Revert "Test if options(warn =2) means it breaks like it should" This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3. * Only push if we are in master. For simplicity, we will now run this even if the dockerfile hasn't changed. * Add test target * test staging workflow with this branch * back to latest tag * Try separate push step * change tags to test push * Revert "change tags to test push" This reverts commit 6a38574d312cee82c90c3c036ac9033f9af7f7ec. * Remove this branch from triggers * Push staging, retag and push master Okay, so the branch name is now inaccurate, but that is fine... * Made ccs modifications, added logo file Made changes to css/navbar.html Tried to add the logo but it but it cuts out and not sure how to make it decent. 
* update links to diagrams * @dvenprasad updated figure spacing * add the right updated figure * replace section of link to figures with updated commit id * incorporate @cansavvy's suggested changes Co-authored-by: Candace Savonen <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Joshua Shapiro <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Chante Bethell <[email protected]> * Add part two of GSVA microarray example notebook * update comment * update violin plot and its interpretation * add to `dictionary.txt` * apply significance and multiple hypothesis testing before plotting * Switching to northcott and a sina plot of one pathway * Style Rmds * Re-render it all * Adjust wording add tidbits about limma and re-render * Few more wording edits * Caught a few more little wording issues. Re-rendered * Remove Murat2008 ref * Restore the part 1 changes that got lost in the merge * incorporate most of jaclyn-taroni's suggested changes - create annotated results df using wide -> long method - update some wording/context re `mx.diff = TRUE` and what that means * remove outdated entries in `dictionary.txt` - remove unnecessary reference in `references.bib` * fix axis label * break up `annotated_results_df` steps * Apply suggestions from code review Co-authored-by: Jaclyn Taroni <[email protected]> * add reminder of `gsva_results` format - cite gsva package vignette - add more detail around "appropriate format" for plotting Co-authored-by: Candace Savonen <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Joshua Shapiro <[email protected]> Co-authored-by: GitHub Actions <[email protected]> Co-authored-by: Jaclyn Taroni <[email protected]> * Remove getting started zip file (#392) * ORA RNA-seq: Part 1 - The Set Up (#394) * Add the file. 
It works * Add components * re-render * Add review tags * Update wording around detectable genes * Add some words to dictionary.txt * re-render * Switch to PNG * Incorporating cbethell 's and envest 's review * Switch from using gene symbols to Entrez IDs * Isolate to just set up * Fix Typo * Polish the wording in a few places * Incorporate jashapiro reviews and remove tags * Port one wording change over to microarray ORA * One more wording edit * ORA RNA-seq: Part 2 - Run ORA and get results! (#395) * Add the file. It works * Add components * re-render * Add review tags * Update wording around detectable genes * Add some words to dictionary.txt * re-render * Switch to PNG * Incorporating cbethell 's and envest 's review * Switch from using gene symbols to Entrez IDs * Couple wording polishes * Copy over changes from #394 's review * Use jashapiro wording suggestions, delete tags * Add message = FALSE to mute chatty blocks (#398) * Add message=FALSE to library loading chunks * Rerender html files * Spell check fixes * rerender * add GSVA package to Dockerfile (#401) * Add rendering options via include.R (#402) * Add option to include R code in an early chunk * Add the include file when rendering * Change width to 70 * add example rerender * comment and naming changes * Update contributing.md with include file description * Add all rendered changes * GSVA for RNA-seq Part 1: Set up (#403) * Scrapbooking together an analysis * switch back to kcdf = "Gaussian" * Rearrange based on chat with Jackie * Fix the two things from jaclyn-taroni partial review * Few wording edits * Make the dup checks more relevant * Make PNG a bit bigger * incorporate most of jaclyn-taroni review comments * Try out the msigdbr list thing * Isolate to first parts of gsva * Editing explanations * Fix a couple spelling things * Incorporate jaclyn-taroni review and delete tags * Use `vst_df` * One more wording change * Remove that instance of "lists" that isn't really what we mean * Link 
citations in render (#407) * Add links to citations * Fix umlaut * Try different strategy for ortholog file download (#411) * Try different download strategy * Couple edits * Move link to before download * One other wording change * Editing/polish of microarray heatmap notebook (#409) * Intro edits * Heatmap edits * Render changes * Couple edits that didn't get saved * One more comment compaction. * Remove relative links * Add to microarray strengths * Carry over common comment changes (#414) * Carry over common comment changes * Style Rmds * White space change to force check Co-authored-by: GitHub Actions <[email protected]> * Use same download.file strategy for ortholog RNA-seq example (#413) * Copy over changes from #411 but make it mouse * "automatically" gets deleted * GSVA for RNA-seq: Part 2 -- GSVA and a heatmap (#404) * Scrapbooking together an analysis * switch back to kcdf = "Gaussian" * Rearrange based on chat with Jackie * Fix the two things from jaclyn-taroni partial review * Few wording edits * Make the dup checks more relevant * Make PNG a bit bigger * incorporate most of jaclyn-taroni review comments * Try out the msigdbr list thing * Re-render * Update based on part 1 review * Add bit that shows overlaps * Make wording changes based on jaclyn-taroni review * Do some wording/explanation edits * RNA-seq DGE dataset switch (#416) * Introduce SRP123625 * Updating wording and some other items * Re-render it * Spell error fixes * jashapiro review suggestions * one more change and re-render * add part 1 of RNA-seq GSEA example notebook (take 2) (#419) * PCA polishing edits (#421) * Add principal component background Also shortened code lines and results tables * Add some more context and explanation of results * Updates to rnaseq PCA * Format and rerender * update screenshots * Apply suggestions from code review Co-authored-by: Candace Savonen <[email protected]> * Don't call it a matrix it's been here for years * rerender Co-authored-by: Candace Savonen 
<[email protected]> * Try out a different download strategy for ORA (#418) * Change download steps to download file * re-render * Spell error fixes * Use jashapiro wording * Use download.file for the three other notebooks (#422) * Bring over the GSEA changes * Add download.file() to the other three places * Found a typo * Fix two things cbethell mentioned in review * PR 2 of 2: Add RNA-seq Pathway Analysis - GSEA example (take 2) (#420) * add part 2 of RNA-seq GSEA example notebook (take 2) * rerun snakefile * incorporate jaclyn-taroni's review suggestions * rerun Snakefile to fix html output (#424) * Umap polish (#423) * UMAP polish edits * rendering * Apply suggestions from code review Co-authored-by: Candace Savonen <[email protected]> * rerender * Move filtering to before DESeq2 object creation (#425) * re-render it * Further fix merge conflicts and re-render * Address jashapiro comments * Heatmap polish (#426) * Polishing edits to heatmap pages * Style and render * OSPL fix Co-authored-by: Candace Savonen <[email protected]> * rownames -> row names * rerender all the things and delete a stray space * Polish differential exp microarray notebooks (#427) * Changes to differential exp microarray notebook * Spelling updates * Add comments and releveling, as suggested by @cansavvy * code formatting updates embrace the pipe * remove apeglm * Add some eBayes info! 
* polishing microarray multiple groups * Rerender everything * Split off multiple testing * Rerender * Minor Polish Diff Expr RNAseq (#429) * Minor polishing to RNAseq Diff Expr * render changes * Apply suggestions from code review Co-authored-by: Candace Savonen <[email protected]> * render Co-authored-by: Candace Savonen <[email protected]> * Polish the microarray ORA notebook (#430) * Update MSigDB section + rerender Also fixes long comments and rownames.print=FALSE * A few more edits to comments * Part 1: Add rownames.print = FALSE where its helpful (#431) * Add print.rownames = FALSE where its helpful * Apparently it doesn't affect design matrices * Push the htmls so that people can actually see them!!! * Polish the RNA-seq ORA example (#432) * Add the rownames.print = FALSE and re-render (#433) Co-authored-by: jashapiro <[email protected]> * Polish Microarray GSEA example (#434) * Polish wording and add introductory paragraphs * A bit more polishing * Apply suggestions from code review Co-authored-by: jashapiro <[email protected]> * Response to code review * Apply suggestions from code review Co-authored-by: jashapiro <[email protected]> * Propagate citation suggestion to multiple pathway notebooks * Missed these quotes * A few mapIds() items Co-authored-by: jashapiro <[email protected]> * Polish RNA-seq GSEA example (#437) * Polish wording and add introductory paragraphs * A bit more polishing * Apply suggestions from code review Co-authored-by: jashapiro <[email protected]> * Response to code review * Apply suggestions from code review Co-authored-by: jashapiro <[email protected]> * Propagate citation suggestion to multiple pathway notebooks * Missed these quotes * A few mapIds() items * Polish RNA-seq GSEA example * Missed a long comment * Update components/references.bib Co-authored-by: jashapiro <[email protected]> * Let the algorithm handle it Co-authored-by: jashapiro <[email protected]> * GHA: Slack us if Docker build or rendering fails (#438) * Add 
Slack notification to docker-build-push.yml * Add Slack notification to docker-build.yml * Add branch for testing * Add library load of package not installed * Revert "Add library load of package not installed" This reverts commit 4e83ed1e104f0760db40752f9bc9e641f916d374. * Revert "Add branch for testing" This reverts commit 06073504426dc1903cf46bb99cc65a3a91894a3c. * Polish Ensembl Gene ID conversions (bonus reference updates!) (#435) * Polish wording and add introductory paragraphs * A bit more polishing * Reference updates * ensembl gene id polish * Get out of here capital Refine.bio * No more bare dfs * Transfer changes to RNAseq (and some back) * Apply suggestions from code review Co-authored-by: jashapiro <[email protected]> * Response to code review * Some numeric updates * Add rendered files * render update * Citation update * Render updates * add comment & rerender Co-authored-by: Jaclyn Taroni <[email protected]> * Polish microarray GSVA example (#440) * Ignore the gene_sets directory * Polish the microarray GSVA example * Missed a couple mentions of GSEA * Borrow some polishing from #427 * Apply suggestions from code review Co-authored-by: Candace Savonen <[email protected]> * Newline in intro paragraph everywhere * Add note about model organisms with GSVA Link to RNA-seq GSVA example * Rerender notebooks Co-authored-by: Candace Savonen <[email protected]> * Polishing Ortholog notebooks (#436) * Polish wording and add introductory paragraphs * A bit more polishing * Reference updates * ensembl gene id polish * Get out of here capital Refine.bio * No more bare dfs * Transfer changes to RNAseq (and some back) * Apply suggestions from code review Co-authored-by: jashapiro <[email protected]> * Response to code review * Some numeric updates * Add rendered files * render update * Citation update * Add polishing for ortholog examples * Transfer intro sentence updates to related notebooks * more polish - remove duplicates from counts * Transfer changes to 
rnaseq * Rendering updates * other rendering updates * Render updates * Change ftp -> http for ftp.ebi As noted here: https://github.com/AlexsLemonade/refinebio-examples/issues/439#issuecomment-748721625 Confirmed that this does work as expected by rendering. * Add branch to docker-build-push.yml for render test * Apply suggestions from code review Co-authored-by: Candace Savonen <[email protected]> * Clarify real genes * add some spelling words * Revert "Add branch to docker-build-push.yml for render test" This reverts commit 8e7b6ec868bab71c4ca0ec2f9b21e2be25df978b. * rendering * Caught an igor Co-authored-by: Jaclyn Taroni <[email protected]> Co-authored-by: Candace Savonen <[email protected]> * Polish the RNA-seq GSVA example (#441) Co-authored-by: Joshua Shapiro <[email protected]> Co-authored-by: dvenprasad <[email protected]> Co-authored-by: Chante Bethell <[email protected]> Co-authored-by: Jaclyn Taroni <[email protected]> Co-authored-by: GitHub Actions <[email protected]>

cansavvy and others added 7 commits October 22, 2020 11:04

add first half of microarray GSEA example nb

d5f6045

- add first part of new GSEA notebook example
- update Snakefile
- update navbar file
- update `references.bib`
- update `dictionary.txt`

revert commit that snuck in

8a3d59a

revert second commit that snuck in

670277a

fix render notebooks merge conflict

b2518c3

update branch

feffa26

cbethell requested a review from cansavvy October 30, 2020 17:38

cansavvy reviewed Oct 30, 2020

incorporate cansavvy's review suggestions

c7c0a2c

cbethell requested a review from cansavvy October 30, 2020 23:51

cbethell added 2 commits November 4, 2020 08:40

add step handling duplicate ids

6fa5e43

- add note re using said approach

update comment

59f4df6

cbethell requested a review from jaclyn-taroni November 4, 2020 13:46

replace lfc with t-statistic value where mentioned

5152b3c

cansavvy reviewed Nov 4, 2020

incorporate @cansavvy's review suggestions

ef6d9aa

- fix typo
- add sanity check when removing duplicates

cansavvy reviewed Nov 4, 2020

replace !duplicated() with dplyr::distinct()

fcb451b

cbethell mentioned this pull request Nov 5, 2020

Pr 2 of 2: Add Microarray Pathway Analysis - GSEA example #347

Merged

10 tasks

jaclyn-taroni reviewed Nov 5, 2020

02-microarray/pathway-analysis_microarray_03_gsea.Rmd (outdated, resolved)

jaclyn-taroni reviewed Nov 5, 2020

02-microarray/pathway-analysis_microarray_03_gsea.Rmd (resolved)

jaclyn-taroni reviewed Nov 5, 2020

cbethell added 2 commits November 5, 2020 14:06

incorporate @jaclyn-taroni's review suggestions

90e0aaa

- add preview of `dr_hallmark_df`
- add context where needed
- adapt approach to removing duplicate gene ids

add a bit more context re removing dup gene IDs

519972e

cbethell requested a review from jaclyn-taroni November 5, 2020 19:25

jaclyn-taroni reviewed Nov 5, 2020

use absolute value of t-statistic

d64e208

jaclyn-taroni reviewed Nov 6, 2020

02-microarray/pathway-analysis_microarray_03_gsea.Rmd (outdated, resolved)

jaclyn-taroni approved these changes Nov 6, 2020

cbethell and others added 2 commits November 5, 2020 19:22

Apply GSEA explanation suggestion from code review

447b36d

Co-authored-by: Jaclyn Taroni <[email protected]>

rerun snakefile to update rendered html

dbe5a08

cbethell merged commit 489cc04 into staging Nov 6, 2020

cbethell deleted the cbethell/add-microarray-gsea-pr-1 branch November 6, 2020 13:52

This was referenced Nov 16, 2020

WIP: Add Microarray Pathway Analysis - GSEA example #339

Closed

WIP: Add Microarray Pathway Analysis - GSVA example #352

Closed

cansavvy mentioned this pull request Dec 1, 2020

Publish 3 new examples (and some other changes) #383

Closed

2 tasks
