Pr 1 of 2: Add Microarray Pathway Analysis - GSEA example #345

Merged: cbethell merged 18 commits into staging from cbethell/add-microarray-gsea-pr-1 on Nov 6, 2020

Contributor

@cbethell cbethell commented Oct 30, 2020

Analysis Purpose

This PR addresses issue #284

Pull Request Stage

This is a Refined PR - needs review of details and polishing

The draft PR relevant to this PR is #339.

Strategy

As suggested in one of the comments on the draft PR, the decision was made to break up the analysis notebook into two PRs.

For this PR, the first of the two, focus was placed on the preparation steps leading to the gene set enrichment analysis step (the second PR will be focused on the GSEA step and visualizations).

Therefore, this PR includes:

  • Context around the differential expression results file being imported
  • A section on "Getting familiar with clusterProfiler's options"
  • Gene identifier conversion from Ensembl to Entrez IDs
  • A step to filter out duplicate IDs to account for warning messages thrown by clusterProfiler::GSEA() (which will be used in the latter half of the notebook) -- brought up by this comment on the draft PR

Concerns/Questions for reviewers:

The context on what GSEA does at the top of the notebook, as well as the context in the Getting familiar with clusterProfiler's options section, could use an extra close look.

The step filtering out duplicates could also use an extra close look, as I am not sure the decision made is best suited for this purpose. I filtered out duplicated IDs because we filtered out multi-mapped IDs in a previously updated analysis and mentioned that we may want to do that depending on the downstream use case. In this case, our downstream analysis does not like duplicated IDs (there were n = 2 duplicate IDs in this PR).

Overall, do the guided instructions and context provided in this PR appear to suffice for our users? Should more or less context be added?

Analysis Pull Request Check List (roughly in order):

Content checks

  • All {{BLANKS}} have been replaced with the correct content.
  • Sources are cited
  • Seed is set (not applicable)

Formatting Checks

  • Removed any manual numbering of sections.
  • Removed any instances of chunk naming.
  • Comments and documentation are up to date.
  • All links have been checked and are properly formatted.

Add datasets to S3

Docker/Snakemake rendering components

  • Added the .html link to the navigation bar.
  • Any not yet added packages needed for this analysis have been added to the Dockerfile and it successfully builds.

cansavvy and others added 7 commits October 22, 2020 11:04
* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Update github actions to reflect staging branch (#311)

* Update github actions to reflect staging branch

* Add libglpk40 to Dockerfile

* Make it gh-pages-stages!

* Remove dockerfile change that should have been on its own all along

* Does this work?

* Declare a uses

* Switch how env is declared

* Force it to run so we can test it

* try no curly brackets

* What's up with the branch

* Move to bash if instead

* Need quotes?

* forgot a `then`

* Try dollar signs

* Doesn't like the `.`?

* Use curly brackets

* Try ${GITHUB_REF}

* Try ${BRANCH_NAME}

* try ${GITHUB_REF#refs/*/}

* use jashapiro suggestion

* Change to base ref

* Change back to `github.ref`

* Get rid of PR `on:`

* Try another test

* Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316)

* Add lib package 40 thing that clusterprofiler needs

* Try adding options(warn = 2)

* Test if options(warn =2) means it breaks like it should

* Revert "Test if options(warn =2) means it breaks like it should"

This reverts commit d9f688f.

* Revert "Try another test"

This reverts commit 845cf1a.

* Add google analytics to renderings (#314)

* Try adding google analytics

* Add to header using includes

* temporary file snuck in there

* Restore master version so they aren't in the review

* Let's call an html file and html file

* Only push if we are in master.

For simplicity, we will now run this even if the dockerfile hasn't changed.

* Add test target

* test staging workflow with this branch

* back to latest tag

* Try separate push step

* change tags to test push

* Revert "change tags to test push"

This reverts commit 6a38574.

* Remove this branch from triggers

* Push staging, retag and push master

Okay, so the branch name is now inaccurate, but that is fine...

* Update scripts/render-notebooks.R

* Add some issue templates (#319)

* Add some rough draft issue templates

* Incorporate cbethell review

* Get rid of `Other` labels that aren't useful

* Update diagrams showing how microarray/RNA-seq work  (#326)

* Mechanics for CSS file and navbar add feedback URL (#303)

* Update microarray and RNAseq overview figures


- add context re figures
- change .jpg to .png for consistency

* Revert "Mechanics for CSS file and navbar add feedback URL (#303)"

This reverts commit 8b81fdd.

* update links to diagrams

* @dvenprasad updated figure spacing

* add the right updated figure

* replace section of link to figures with updated commit id

* incorporate @cansavvy's suggested changes

Co-authored-by: Candace Savonen <[email protected]>
Co-authored-by: dvenprasad <[email protected]>

Co-authored-by: Joshua Shapiro <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Chante Bethell <[email protected]>
- add first part of new GSEA notebook example
- update Snakefile
- update navbar file
- update `references.bib`
- update `dictionary.txt`

Contributor

@cansavvy cansavvy left a comment

I think this looks very good! I just have a comment/question about maybe trimming down the mapping section, let me know what you think.

On a different note: this new PR process seems to be working well review-wise. How do you feel it's working on your end?

02-microarray/pathway-analysis_microarray_03_gsea.Rmd (outdated)

It looks like we have two Entrez IDs that were mapped to multiple Ensembl IDs.
For the purpose of performing GSEA later in this notebook, we will filter out the duplicated Entrez IDs.
For more about how to explore this, take a look at our [microarray gene ID conversion example](https://alexslemonade.github.io/refinebio-examples/02-microarray/gene-id-annotation_microarray_01_ensembl.html).

Contributor

I wonder if we want to spend as much time on the gene ID mapping since we have separate examples for that? We could just skip right to using multiVals = "filter" since this is essentially what you are doing here but manually.

That being said, we may not want to drop data here and I don't think we are as particular about which gene ID is used, so maybe we should just simplify, use multiVals = "first", tell them to see the mapping examples for more info and then move on?
https://www.rdocumentation.org/packages/AnnotationDbi/versions/1.30.1/topics/AnnotationDb-objects
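
To make the difference between the two options concrete, here is a minimal sketch that mimics (it does not call) what `multiVals = "first"` and `multiVals = "filter"` do when one Ensembl ID has several Entrez matches. The toy mapping table and its IDs are invented for illustration, and the dplyr steps are only an emulation of the documented behavior.

```r
library(magrittr)

# Invented mapping table: ENS_B maps to two Entrez IDs (one-to-many)
toy_map_df <- data.frame(
  ensembl_id = c("ENS_A", "ENS_B", "ENS_B", "ENS_C"),
  entrez_id = c("101", "202", "203", "304")
)

# Emulate multiVals = "first": keep the first Entrez ID seen per Ensembl ID
first_df <- toy_map_df %>%
  dplyr::distinct(ensembl_id, .keep_all = TRUE)

# Emulate multiVals = "filter": drop Ensembl IDs with more than one match
filter_df <- toy_map_df %>%
  dplyr::group_by(ensembl_id) %>%
  dplyr::filter(dplyr::n() == 1) %>%
  dplyr::ungroup()
```

In this sketch the "first" emulation keeps all three Ensembl IDs, while the "filter" emulation drops `ENS_B` entirely, which is the data loss the comment above is wary of.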

Contributor Author

I believe I implemented multiVals = "first" as you suggest here in the last commit. I used an inner_join() to join the expression data so as not to have NAs in the entrez_id column (which may pose an issue later when running GSEA()). Please let me know if this is what you intended @cansavvy or if I should make any further changes here.
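
As a rough illustration of why the join type matters here, the sketch below compares `left_join()` and `inner_join()` on invented data. It assumes the mapping table only contains genes that were successfully mapped; the `Gene`, `entrez_id`, and `t` column names and all values are hypothetical.

```r
library(magrittr)

# Invented mapping result: ENS_3 has no Entrez ID, so it is absent
mapped_df <- data.frame(
  Gene = c("ENS_1", "ENS_2"),
  entrez_id = c("11", "22")
)

# Invented differential expression results
dge_df <- data.frame(
  Gene = c("ENS_1", "ENS_2", "ENS_3"),
  t = c(2.5, -1.2, 0.4)
)

# left_join() keeps every row of dge_df, so the unmapped gene gets an NA
left_df <- dge_df %>%
  dplyr::left_join(mapped_df, by = "Gene")

# inner_join() keeps only genes present in both tables, so no NA is introduced
inner_df <- dge_df %>%
  dplyr::inner_join(mapped_df, by = "Gene")
```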

Contributor Author

I will note, however, that this still leaves us with two duplicated Entrez IDs that map to multiple Ensembl IDs, resulting in the following warning message when running GSEA() later in the notebook: "There are duplicate gene names, fgsea may produce unexpected results."

Contributor

We should probably show users that these duplicates exist, then, and how to deal with them. The fact that there are only two instances of this is kind of annoying, but it's probably good that this came up so we can show users how to deal with it -- in other datasets or species it's possible it will come up more (or maybe not at all).

I think we should incorporate two steps (that should maybe be their own chunk).

  1. Show users how to test whether there are duplicated Entrez IDs. A TRUE/FALSE like any(duplicated()) would probably work.

  2. Show one way that you can decide on which Entrez ID's data to keep (there could be a lot of ways to do this, but we will just have to pick one that we think will be generally useful in most contexts).
    I think an okay way to do this would be to keep the data for the entrez gene id with the higher t value (or lower p value) since it will be of greater interest.
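
The first of the two steps above could look something like the following sketch. The data frame is invented (only the `dge_mapped_df` and `entrez_id` names come from this thread), so treat it as one possible shape for the check rather than the notebook's actual chunk.

```r
library(magrittr)

# Invented mapped results: two Ensembl IDs share Entrez ID "22"
dge_mapped_df <- data.frame(
  Gene = c("ENS_1", "ENS_2", "ENS_3", "ENS_4"),
  entrez_id = c("11", "22", "22", "33"),
  t = c(2.5, -3.1, 1.4, 0.2)
)

# TRUE if any Entrez ID appears more than once
has_dups <- any(duplicated(dge_mapped_df$entrez_id))

# Number of distinct Entrez IDs that are affected
n_dup_ids <- dge_mapped_df %>%
  dplyr::count(entrez_id) %>%
  dplyr::filter(n > 1) %>%
  nrow()
```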

May be good to get a @jashapiro opinion on this.

Member

I think an okay way to do this would be to keep the data for the entrez gene id with the higher t value (or lower p value)

Picking whatever entry has the larger absolute value for the stat we use for ranking makes sense to me. (Are we still using LFC?) A caveat we should point out is that the genes that have duplicate identifiers could be enriched in a particular pathway/gene set and you may get an overly optimistic view of how perturbed that pathway is using this approach.
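
A minimal sketch of the "larger absolute value" selection, using an invented toy data frame (the `Gene` column and all values are assumptions for illustration). It contrasts sorting on the signed t-statistic with sorting on its absolute value before dropping duplicates.

```r
library(magrittr)

# Invented data: Entrez ID "22" appears twice with opposite-sign t values
dge_mapped_df <- data.frame(
  Gene = c("ENS_1", "ENS_2", "ENS_3", "ENS_4"),
  entrez_id = c("11", "22", "22", "33"),
  t = c(2.5, -3.1, 1.4, 0.2)
)

# Sorting on signed t keeps the most positive duplicate (t = 1.4)
signed_df <- dge_mapped_df %>%
  dplyr::arrange(dplyr::desc(t)) %>%
  dplyr::filter(!duplicated(entrez_id))

# Sorting on |t| keeps the duplicate furthest from zero (t = -3.1)
abs_df <- dge_mapped_df %>%
  dplyr::arrange(dplyr::desc(abs(t))) %>%
  dplyr::filter(!duplicated(entrez_id))
```

With a strongly negative duplicate, the two approaches keep different rows, so the choice can change where that gene lands in the ranking.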

Contributor

Are we still using LFC?

There is log fold change, and based on the draft PR, LFC will be used for GSEA.

Contributor Author

We are using the t value now (per your comment on the draft PR @jaclyn-taroni).

In the last commit, I added two steps, one to check for duplicate identifiers and the other to sort by t and remove the lower duplicate value.

Let me know if you think this is the best approach given the suggestions above @cansavvy and @jaclyn-taroni.

Member

Did you find any evidence to support my thought that t would be more "standard?" That was mostly based on recollection.

Contributor Author

Linking a comment that answers this question from PR 2 of 2 #347 here:

"""
The decision to create the pre-ranked gene vector based on the t-statistic rather than log fold change was explored based on this comment from the draft PR and eventually made based on what is recommended in Discovering statistically significant pathways in expression profiling studies and the explanation from a biostars forum which says "Try to understand how they relate to the question you're interested in e.g. if you're most interested in effect size then the fold change is what you should use but if you're more interested in statistical significance then look for one of the statistics taking into consideration the assumptions they make e.g. t-test" as I believe we are encouraging users to look at statistical significance here.
"""
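
To illustrate why the choice of ranking statistic matters, here is a small sketch on invented values. The `logFC` and `t` column names are assumed to follow the usual differential expression table layout; nothing here is taken from the actual results file.

```r
# Invented statistics: a large fold change with a modest t, and vice versa
toy_stats_df <- data.frame(
  entrez_id = c("11", "22", "33"),
  logFC = c(3.0, 1.0, -2.0),
  t = c(1.5, 6.0, -4.0)
)

# Gene order when ranking by effect size (log fold change)
order_by_lfc <- toy_stats_df$entrez_id[
  order(toy_stats_df$logFC, decreasing = TRUE)
]

# Gene order when ranking by the t-statistic
order_by_t <- toy_stats_df$entrez_id[
  order(toy_stats_df$t, decreasing = TRUE)
]
```

In this toy case the top-ranked gene differs between the two orderings, so the pre-ranked list passed to an enrichment step would differ as well.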

Contributor Author

cbethell commented Nov 2, 2020

@cansavvy I believe I addressed your comments so this is ready for another look. Note my comment here, however: using multiVals = "first" still leaves us with duplicated Entrez IDs that map to multiple Ensembl IDs.

Also, to answer your question

On a different note: this new PR process seems to be working well review-wise. How do you feel it's working on your end?

I believe this new PR process is working very well from my end as well, as the PR author.

Contributor

cansavvy commented Nov 2, 2020

I think this is almost ready! Just need to make some decisions about the multiple entrez ids: #345 (comment)

Contributor

@cansavvy cansavvy left a comment

Just a few little comments/requests and then I think this is ready from my end.

02-microarray/pathway-analysis_microarray_03_gsea.Rmd (outdated)
dplyr::arrange(dplyr::desc(t)) %>%
# Filter out the duplicated rows -- this will keep the first row with the
# duplicated value thus keeping the row with the highest t-statistic value
dplyr::filter(!duplicated(entrez_id))

Contributor

I think distinct() is a more direct version of the dplyr::filter(!duplicated()) you have here.

Second question, can we prove this to ourselves a bit? Perhaps as simply as printing out one of the duplicated Entrez IDs and its t values before and after? (I don't want to add too much length to these steps, but I also think it's good to make data removal steps verified and clear.)

Contributor Author

@cbethell cbethell Nov 4, 2020

The reason I opted to use dplyr::filter(!duplicated(entrez_id)) is because dplyr::distinct(entrez_id) returns only the column entrez_id while dplyr::distinct() returns all the rows containing duplicate identifiers (since their t values etc. are different). Perhaps my implementation of dplyr::distinct() is incorrect in this case?

Also, I agree with your second point here. While developing, I used the following to find the duplicate ids and their associated data as a sanity check:

dge_mapped_df %>% dplyr::filter(duplicated(entrez_id))

However, this returns just one of the rows with each of the duplicate ids (I manually searched the before and after data frames for the associated data using the exact entrez_id value returned).

Perhaps I can include the step to print out the below output and use dge_mapped_df %>% dplyr::filter(entrez_id == 336702) as a sanity check?
I implemented this plan in the last commit.
What do you think? Do you have any suggestions to truncate this?

[Screenshot: Screen Shot 2020-11-04 at 12 48 04 PM]

Contributor

@cansavvy cansavvy Nov 4, 2020

dplyr::distinct(entrez_id) returns only the column entrez_id

Ah. There's a .keep_all = TRUE argument for distinct() that you need to use to not drop columns.

Also, I agree with your second point here. While developing, I used the following to find the duplicate ids and their associated data as a sanity check:
dge_mapped_df %>% dplyr::filter(duplicated(entrez_id))

I was trying to find a straightforward way of doing this without having to make a separate duplicate Entrez IDs object, but I didn't come up with anything that's great. I was hoping duplicated() had an option to return values directly so you could use an %in%, but it doesn't. I also looked to see if tidyverse had a reverse of distinct(), but it doesn't seem to. If we installed another package (which I don't want to do) we could use janitor::get_dupes(), but I don't find that worth having users install another package for.

So we are left with doing the manual preview you used here -- which I think may be the simplest route for users to follow and still get the point across. OR, this kind of thing:

dup_entrez_ids <- dge_mapped_df %>% 
  dplyr::filter(duplicated(entrez_id)) %>%
  dplyr::pull(entrez_id)

where you then have to use dup_entrez_ids to retrieve things, but you'd still have to do an arrange.

I think we should just stick with your simple and effective use of an example entrez id like 336702.
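
For comparison, the `dup_entrez_ids` route described above might be completed roughly as follows. The data frame is invented, and `dup_rows_df` is a hypothetical name introduced only for this sketch.

```r
library(magrittr)

# Invented mapped results: Entrez ID "22" appears twice
dge_mapped_df <- data.frame(
  Gene = c("ENS_1", "ENS_2", "ENS_3", "ENS_4"),
  entrez_id = c("11", "22", "22", "33"),
  t = c(2.5, -3.1, 1.4, 0.2)
)

# Collect the Entrez IDs that occur more than once
dup_entrez_ids <- dge_mapped_df %>%
  dplyr::filter(duplicated(entrez_id)) %>%
  dplyr::pull(entrez_id)

# Retrieve every row for those IDs, arranged so duplicates sit together
dup_rows_df <- dge_mapped_df %>%
  dplyr::filter(entrez_id %in% dup_entrez_ids) %>%
  dplyr::arrange(entrez_id, dplyr::desc(t))
```

This shows all rows for each duplicated ID at once, at the cost of an extra object and an extra `arrange()` step, which is the trade-off weighed in the comment above.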

Contributor Author

In the most recent commit, I went ahead and replaced the dplyr::filter(!duplicated(entrez_id)) step with dplyr::distinct(entrez_id, .keep_all = TRUE) as you suggested @cansavvy, and left the subsequent steps as is because I also believe it is not worth having users install another package. I do wish distinct() had a reverse function, but I believe what we currently have is the next best simple yet effective solution.
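
The behaviors discussed in this thread can be compared side by side in a short sketch. The toy data frame is invented; the function calls are the ones named in the comments above.

```r
library(magrittr)

# Invented mapped results: Entrez ID "22" appears twice with different t values
dge_mapped_df <- data.frame(
  Gene = c("ENS_1", "ENS_2", "ENS_3", "ENS_4"),
  entrez_id = c("11", "22", "22", "33"),
  t = c(2.5, -3.1, 1.4, 0.2)
)

# distinct() on one column drops every other column
only_ids_df <- dge_mapped_df %>%
  dplyr::distinct(entrez_id)

# distinct() with no columns compares whole rows, so nothing is removed here
whole_rows_df <- dge_mapped_df %>%
  dplyr::distinct()

# .keep_all = TRUE keeps the first row per Entrez ID with all columns
keep_all_df <- dge_mapped_df %>%
  dplyr::distinct(entrez_id, .keep_all = TRUE)

# The original approach keeps the same rows
not_dup_df <- dge_mapped_df %>%
  dplyr::filter(!duplicated(entrez_id))
```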

- fix typo 
- add sanity check when removing duplicates

Contributor

@cansavvy cansavvy left a comment

I think after you decide about distinct() and the other info in the comment I just left, this looks good by me. 👍 Request a review from Jackie when you are ready for her to take a look.

Contributor Author

cbethell commented Nov 5, 2020

@jaclyn-taroni this PR is ready for a second review whenever you are ready!

Member

@jaclyn-taroni jaclyn-taroni left a comment

This looks pretty good! Before it goes in, I think we need to make sure we're explaining and showing folks everything they need to do as they're doing it. Regarding the point about moving up the GSEA explanation, I think we can work that over on #347.

02-microarray/pathway-analysis_microarray_03_gsea.Rmd (outdated)

Looks like we were able to successfully get rid of the duplicate gene identifiers and keep the observation with the higher t-statistic value!
Note however, that a caveat in using this approach is that the genes that have duplicate identifiers could be enriched in a particular pathway/gene set and we may get an overly optimistic view of how perturbed that pathway truly is.

Member

Move to ~line 296 and expand upon. At this point, you haven't explained GSEA yet (https://github.com/AlexsLemonade/refinebio-examples/pull/339/files#diff-0d0fd22250c391b41f655f78697b0bcbd9626a54072e31fc31cbd01c6faf295dR295)! So I'd also think about, now that we need to explain scientific decision making during the gene identifier conversion, if we want to move up that explanation to before the gene ID conversion.

Contributor Author

I moved this line up to ~line 296 in this PR and left a comment on #347 to address the latter part of this comment: #347 (comment)

- add preview of `dr_hallmark_df`
- add context where needed 
- adapt approach to removing duplicate gene ids
Contributor Author

cbethell commented Nov 5, 2020

@jaclyn-taroni, I believe I addressed your above review comments with the exception of moving the GSEA explanation up to before the gene ID conversion. I left a comment on #347 (#347 (comment)) to address that as suggested.

Please let me know if I misinterpreted any of your comments or need to make additional changes to satisfy your comments.

```{r}
filtered_dge_mapped_df <- dge_mapped_df %>%
  # Sort so that highest t-statistic values are at the top
  dplyr::arrange(dplyr::desc(t)) %>%
```

Member

I missed this in my first review. I would expect this to be based on absolute value, i.e., whichever value is most likely to be highly- or lowly-ranked (or further away from the center of the ranking depending on how you'd like to talk about it) #345 (comment)

Contributor Author

Gotcha, that does make the most sense here. I overlooked this in the first comment on #345 but believe I implemented it in the most recent commit. Please let me know if I missed an important step in the implementation @jaclyn-taroni.

Contributor Author

Worth noting: the GSEA() step included in PR #347 throws the following error when the vector is sorted based on absolute value:

Error in GSEA_internal(geneList = geneList, exponent = exponent, minGSSize = minGSSize, : geneList should be a decreasing sorted vector...

Member

I'm not saying we should sort the vector we pass GSEA() by absolute value (provided I am following you correctly), only that we should select the duplicate instances with a greater absolute value.
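
The distinction can be sketched as follows, assuming duplicates have already been resolved by absolute value. The toy values are invented, and the `all(diff(...) <= 0)` check is only a stand-in for the "decreasing sorted vector" requirement quoted in the error message, not the package's own validation code.

```r
# Invented, already de-duplicated results, still in |t| order
filtered_dge_mapped_df <- data.frame(
  entrez_id = c("22", "11", "33"),
  t = c(-3.1, 2.5, 0.2)
)

# Named vector of t-statistics, with Entrez IDs as names
t_vector <- filtered_dge_mapped_df$t
names(t_vector) <- filtered_dge_mapped_df$entrez_id

# In |t| order the signed values are not decreasing
abs_order_is_decreasing <- all(diff(t_vector) <= 0)

# Re-sort on the signed value before handing the vector to an enrichment step
sorted_t_vector <- sort(t_vector, decreasing = TRUE)
signed_order_is_decreasing <- all(diff(sorted_t_vector) <= 0)
```

So the absolute value only decides which duplicate survives, while the final vector is still ordered by the signed statistic.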

Contributor Author

Ah gotcha! Makes even more sense now thinking about it 👍

Member

@jaclyn-taroni jaclyn-taroni left a comment

I returned my review early so I could respond to your comment. See what you think of what I wrote. I expect that we will need to continue to refine that language over on #347 depending on how and where we introduce GSEA.

@cbethell cbethell merged commit 489cc04 into staging Nov 6, 2020
@cbethell cbethell deleted the cbethell/add-microarray-gsea-pr-1 branch November 6, 2020 13:52
jaclyn-taroni added a commit that referenced this pull request Dec 21, 2020
* Adding basic footer (#307)

* It works!

* Add feedback url

* Make the footer retrieve step consistent with others

Co-authored-by: dvenprasad <[email protected]>

* Updating CONTRIBUTING.md with instructions about staging -> master set up (#313)

* Updating contributing with info about staging branch

* Add more details to CONTRIBUTING about cherry picks and etc

* Add bit about html preview

* Incorporate Josh comment and drop log.log

* Add bit about a hotfixes to staging PRs

* Incorporate jashapiro feedback

* Incorporate a few more bits of jashapiro feedback

* Update doctoc

* Make pull requests section H2

* Incorporate jashapiro suggestion to be make more specific branch names

* Meh, we don't need <>

* Change to use "publish" instead of "live"

* Re DocToc

* Add a bit more direction about PR base branches

* Adjust links

* Missed one link

Co-authored-by: dvenprasad <[email protected]>

* New PR templates to help with new process (#334)

* Add basic templates

* make it live PR template

* Add html preview

* Add one more "other" template

* Minor edit

* Polish up some things. Add "PR stage"

* Minor edits

* Implement cbethell review

* Try a "main PR" strategy with links to the real PR templates (#337)

* Get rid of headers and try a "main PR" thing.

* Rearrange order

* Links don't work per se

* Try href strategy?

* Update .github/PULL_REQUEST_TEMPLATE.md

Co-authored-by: Chante Bethell  <[email protected]>

Co-authored-by: Chante Bethell  <[email protected]>

* Add contributing diagrams (#333)

* Updating contributing with info about staging branch

* Mechanics for CSS file and navbar add feedback URL (#303)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Add more details to CONTRIBUTING about cherry picks, etc.

* Add bit about html preview

* Incorporate Josh comment and drop log.log

* Add bit about hotfixes to staging PRs

* Incorporate jashapiro feedback

* Incorporate a few more bits of jashapiro feedback

* Add the PR diagrams

* Add diagrams and some words about them to CONTRIBUTING.md

* Couple minor edits

* Update doctoc

* Make pull requests section H2

* Incorporate jashapiro suggestion to make more specific branch names

* Meh, we don't need <>

* Change to use "publish" instead of "live"

* Update diagrams to say "publish"

* re doctoc

* Some polishing of wording

* Make robot emoji a png so it renders

* Update commit ids

* A few more words

* Make headlines more parallel

* Couple little updates

* A couple more polishing items

* Turn :warning: into :x: in diagrams

* Update all img commit ids

* Address comments from @cbethell 's review

* One little wording update

Co-authored-by: dvenprasad <[email protected]>

* Make the "Other" PR template the default (#341)

* Make the "Other" PR template the default

* Use jashapiro's wording suggestions

* Add timeline reminder to issue template (#342)

* Pr 1 of 2: Add Microarray Pathway Analysis - GSEA example (#345)

* Mechanics for CSS file and navbar add feedback URL (#303)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Making staging changes live (#329)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Update github actions to reflect staging branch (#311)

* Update github actions to reflect staging branch

* Add libglpk40 to Dockerfile

* Make it gh-pages-stages!

* Remove dockerfile change that should have been on its own all along

* Does this work?

* Declare a uses

* Switch how env is declared

* Force it to run so we can test it

* try no curly brackets

* What's up with the branch

* Move to bash if instead

* Need quotes?

* forgot a `then`

* Try dollar signs

* Doesn't like the `.`?

* Use curly brackets

* Try ${GITHUB_REF}

* Try ${BRANCH_NAME}

* try ${GITHUB_REF#refs/*/}

* use jashapiro suggestion

* Change to base ref

* Change back to `github.ref`

* Get rid of PR `on:`

* Try another test

* Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316)

* Add lib package 40 thing that clusterprofiler needs

* Try adding options(warn = 2)

* Test if options(warn =2) means it breaks like it should

* Revert "Test if options(warn =2) means it breaks like it should"

This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3.

* Revert "Try another test"

This reverts commit 845cf1aff92ea7b83f402bbefd563562b44e5eac.

* Add google analytics to renderings (#314)

* Try adding google analytics

* Add to header using includes

* temporary file snuck in there

* Restore master version so they aren't in the review

* Let's call an html file an html file

* Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316)

* Add lib package 40 thing that clusterprofiler needs

* Try adding options(warn = 2)

* Test if options(warn =2) means it breaks like it should

* Revert "Test if options(warn =2) means it breaks like it should"

This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3.

* Only push if we are in master.

For simplicity, we will now run this even if the dockerfile hasn't changed.

* Add test target

* test staging workflow with this branch

* back to latest tag

* Try separate push step

* change tags to test push

* Revert "change tags to test push"

This reverts commit 6a38574d312cee82c90c3c036ac9033f9af7f7ec.

* Remove this branch from triggers

* Push staging, retag and push master

Okay, so the branch name is now inaccurate, but that is fine...

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

* Update scripts/render-notebooks.R

* Add some issue templates (#319)

* Add some rough draft issue templates

* Incorporate cbethell review

* Get rid of `Other` labels that aren't useful

* Update diagrams showing how microarray/RNA-seq work  (#326)

* Mechanics for CSS file and navbar add feedback URL (#303)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Update microarray and RNAseq overview figures


- add context re figures
- change .jpg to .png for consistency

* Revert "Mechanics for CSS file and navbar add feedback URL (#303)"

This reverts commit 8b81fdd96eeecf1d0e479d7908376b8e57dc356d.

* update links to diagrams

* @dvenprasad updated figure spacing

* add the right updated figure

* replace section of link to figures with updated commit id

* incorporate @cansavvy's suggested changes

Co-authored-by: Candace Savonen <[email protected]>
Co-authored-by: dvenprasad <[email protected]>

Co-authored-by: Joshua Shapiro <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Chante Bethell <[email protected]>

* add first half of microarray GSEA example nb

- add first part of new GSEA notebook example
- update Snakefile
- update navbar file
- update `references.bib`
- update `dictionary.txt`

* revert commit that snuck in

* revert second commit that snuck in

* fix render notebooks merge conflict

* incorporate cansavvy's review suggestions

* add step handling duplicate ids 

- add note re using said approach

* update comment

* replace lfc with t-statistic value where mentioned

* incorporate @cansavvy's review suggestions

- fix typo 
- add sanity check when removing duplicates

* replace `!duplicated()` with `dplyr::distinct()`

* incorporate @jaclyn-taroni's review suggestions

- add preview of `dr_hallmark_df`
- add context where needed 
- adapt approach to removing duplicate gene ids

* add a bit more context re removing dup gene IDs

* use absolute value of t-statistic

* Apply GSEA explanation suggestion from code review

Co-authored-by: Jaclyn Taroni <[email protected]>

* rerun snakefile to update rendered html

Co-authored-by: Candace Savonen <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Joshua Shapiro <[email protected]>
Co-authored-by: Jaclyn Taroni <[email protected]>

* Pr 2 of 2: Add Microarray Pathway Analysis - GSEA example (#347)

* Mechanics for CSS file and navbar add feedback URL (#303)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Making staging changes live (#329)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Update github actions to reflect staging branch (#311)

* Update github actions to reflect staging branch

* Add libglpk40 to Dockerfile

* Make it gh-pages-stages!

* Remove dockerfile change that should have been on its own all along

* Does this work?

* Declare a uses

* Switch how env is declared

* Force it to run so we can test it

* try no curly brackets

* What's up with the branch

* Move to bash if instead

* Need quotes?

* forgot a `then`

* Try dollar signs

* Doesn't like the `.`?

* Use curly brackets

* Try ${GITHUB_REF}

* Try ${BRANCH_NAME}

* try ${GITHUB_REF#refs/*/}

* use jashapiro suggestion

* Change to base ref

* Change back to `github.ref`

* Get rid of PR `on:`

* Try another test

* Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316)

* Add lib package 40 thing that clusterprofiler needs

* Try adding options(warn = 2)

* Test if options(warn =2) means it breaks like it should

* Revert "Test if options(warn =2) means it breaks like it should"

This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3.

* Revert "Try another test"

This reverts commit 845cf1aff92ea7b83f402bbefd563562b44e5eac.

* Add google analytics to renderings (#314)

* Try adding google analytics

* Add to header using includes

* temporary file snuck in there

* Restore master version so they aren't in the review

* Let's call an html file an html file

* Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316)

* Add lib package 40 thing that clusterprofiler needs

* Try adding options(warn = 2)

* Test if options(warn =2) means it breaks like it should

* Revert "Test if options(warn =2) means it breaks like it should"

This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3.

* Only push if we are in master.

For simplicity, we will now run this even if the dockerfile hasn't changed.

* Add test target

* test staging workflow with this branch

* back to latest tag

* Try separate push step

* change tags to test push

* Revert "change tags to test push"

This reverts commit 6a38574d312cee82c90c3c036ac9033f9af7f7ec.

* Remove this branch from triggers

* Push staging, retag and push master

Okay, so the branch name is now inaccurate, but that is fine...

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

* Update scripts/render-notebooks.R

* Add some issue templates (#319)

* Add some rough draft issue templates

* Incorporate cbethell review

* Get rid of `Other` labels that aren't useful

* Update diagrams showing how microarray/RNA-seq work  (#326)

* Mechanics for CSS file and navbar add feedback URL (#303)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Update microarray and RNAseq overview figures


- add context re figures
- change .jpg to .png for consistency

* Revert "Mechanics for CSS file and navbar add feedback URL (#303)"

This reverts commit 8b81fdd96eeecf1d0e479d7908376b8e57dc356d.

* update links to diagrams

* @dvenprasad updated figure spacing

* add the right updated figure

* replace section of link to figures with updated commit id

* incorporate @cansavvy's suggested changes

Co-authored-by: Candace Savonen <[email protected]>
Co-authored-by: dvenprasad <[email protected]>

Co-authored-by: Joshua Shapiro <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Chante Bethell <[email protected]>

* add latter half of GSEA microarray example (includes GSEA steps)

fix merge conflicts

* add incode prompt

* revert commit that snuck in

* revert commit

* set seed and re-run

* incorporate some of the wording/context suggestions from review

* rerun Snakefile

* incorporate suggested changes re additional context/GSEA explanation

* implement `top_n()`

* add a bit more context for clarification re ES score

* update GSEA explanation before gene ID conversion section

* incorporate @cansavvy's wording suggestions

* mimic "highly" -> "most" language

* incorporate wording suggestions from code review

* some re-structuring/re-wording based on review suggestions

* update `dictionary.txt` file

* incorporate @jaclyn-taroni's review suggestions

Co-authored-by: Candace Savonen <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Joshua Shapiro <[email protected]>

* Delete intro Rmd and renumber (#355)

* Try out intro and fix filenames

* Undo intro paragraph for now. Too much

* Missed one link to update in GSEA

* Add words about Draft and Refined PRs to CONTRIBUTING.md (#361)

* Explicitly discuss draft vs refine PRs in contrib

* doctoc it

* Remove asterisks

* Refine wording

* Use cbethell's wording suggestions

* Make that one sentence more clear?

* WGCNA Part 1: Set up (#358)

* Put in basic changes: navbar, dict, snakefile, Rmd

* More polishing and info and refs

* Update file paths

* Bring back docker changes

* Add to dictionary

* Add a couple refs

* Add ref and other little things

* Revert "Add ref and other little things"

This reverts commit 7560c2a7cb861aaecefd8d241db209d8b3658989.

* Address straightforward comments from cbethell

* Add ref

* Add more refs and re-render

* Remove that extra part that should only be in part2 not here

* Incorporate jashapiro review

* Shorten up some more comments

* rowSums!!

* Get rid of tibble step and change wording

* WGCNA Part 2: Running WGCNA (#360)

* Put in basic changes: navbar, dict, snakefile, Rmd

* More polishing and info and refs

* Update file paths

* Bring back docker changes

* Add to dictionary

* Add a couple refs

* Add next steps

* Add some polishing and refs

* Address the straightforward items from cbethell 's review

* Incorporate jashapiro review from #358

* Style Rmds

* Bring over part1 changes and re-render

* Edit things based on jashapiro review

Co-authored-by: GitHub Actions <[email protected]>

* Add pathway analysis intro paragraph to microarray ORA (#356)

* Try out intro and fix filenames

* Undo intro paragraph for now. Too much

* Add intro paragraph

* Fix typo, add links

* Incorporate cbethell review

* Wording change from @envest

* Fix WGCNA installation (#366)

* Move order of install for WGCNA

* warn moar

* Pr 1 of 2: Add Microarray Pathway Analysis - GSVA example (#359)

* Mechanics for CSS file and navbar add feedback URL (#303)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Making staging changes live (#329)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Update github actions to reflect staging branch (#311)

* Update github actions to reflect staging branch

* Add libglpk40 to Dockerfile

* Make it gh-pages-stages!

* Remove dockerfile change that should have been on its own all along

* Does this work?

* Declare a uses

* Switch how env is declared

* Force it to run so we can test it

* try no curly brackets

* What's up with the branch

* Move to bash if instead

* Need quotes?

* forgot a `then`

* Try dollar signs

* Doesn't like the `.`?

* Use curly brackets

* Try ${GITHUB_REF}

* Try ${BRANCH_NAME}

* try ${GITHUB_REF#refs/*/}

* use jashapiro suggestion

* Change to base ref

* Change back to `github.ref`

* Get rid of PR `on:`

* Try another test

* Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316)

* Add lib package 40 thing that clusterprofiler needs

* Try adding options(warn = 2)

* Test if options(warn =2) means it breaks like it should

* Revert "Test if options(warn =2) means it breaks like it should"

This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3.

* Revert "Try another test"

This reverts commit 845cf1aff92ea7b83f402bbefd563562b44e5eac.

* Add google analytics to renderings (#314)

* Try adding google analytics

* Add to header using includes

* temporary file snuck in there

* Restore master version so they aren't in the review

* Let's call an html file an html file

* Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316)

* Add lib package 40 thing that clusterprofiler needs

* Try adding options(warn = 2)

* Test if options(warn =2) means it breaks like it should

* Revert "Test if options(warn =2) means it breaks like it should"

This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3.

* Only push if we are in master.

For simplicity, we will now run this even if the dockerfile hasn't changed.

* Add test target

* test staging workflow with this branch

* back to latest tag

* Try separate push step

* change tags to test push

* Revert "change tags to test push"

This reverts commit 6a38574d312cee82c90c3c036ac9033f9af7f7ec.

* Remove this branch from triggers

* Push staging, retag and push master

Okay, so the branch name is now inaccurate, but that is fine...

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

* Update scripts/render-notebooks.R

* Add some issue templates (#319)

* Add some rough draft issue templates

* Incorporate cbethell review

* Get rid of `Other` labels that aren't useful

* Update diagrams showing how microarray/RNA-seq work  (#326)

* Mechanics for CSS file and navbar add feedback URL (#303)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Update microarray and RNAseq overview figures


- add context re figures
- change .jpg to .png for consistency

* Revert "Mechanics for CSS file and navbar add feedback URL (#303)"

This reverts commit 8b81fdd96eeecf1d0e479d7908376b8e57dc356d.

* update links to diagrams

* @dvenprasad updated figure spacing

* add the right updated figure

* replace section of link to figures with updated commit id

* incorporate @cansavvy's suggested changes

Co-authored-by: Candace Savonen <[email protected]>
Co-authored-by: dvenprasad <[email protected]>

Co-authored-by: Joshua Shapiro <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Chante Bethell <[email protected]>

* Add first half of microarray GSVA example notebook

* add packages to Dockerfile and rerun

* fix reference

* add to navbar

* remove mention of pheatmap

* incorporate @jaclyn-taroni's suggestion on collapsing duplicates logic

* incorporate cansavvy's review comments

- fix logic combining rest of mapped data with the collapsed duplicates data
- fix context around that logic

* clarify/change some wording based on cansavvy's suggestions

* incorporate single sample example of selecting max expression values

* Push code that cbethell and I chatted through

* Add to dictionary

* Style Rmds

* rerun Snakefile to update html file

* Apply jaclyn-taroni's wording suggestions from code review

Co-authored-by: Jaclyn Taroni <[email protected]>

* incorporate the rest of jaclyn-taroni's review suggestions

Co-authored-by: Candace Savonen <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Joshua Shapiro <[email protected]>
Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: Jaclyn Taroni <[email protected]>

* WGCNA Part 3: DE and heatmaps (#363)

* Put in basic changes: navbar, dict, snakefile, Rmd

* More polishing and info and refs

* Update file paths

* Bring back docker changes

* Add to dictionary

* Add a couple refs

* Add next steps

* Add some polishing and refs

* Address the straightforward items from cbethell 's review

* Incorporate jashapiro review from #358

* Style Rmds

* Bring over part1 changes and re-render

* Add last set of steps

* Push this particular plot version in case we wanna come back to it

* Commit this multiple module pheatmap in case I want to return to it

* ComplexHeatmap is mostly wrangled

* It's working!

* Save to PDFs

* Fix color function and re-render

* Add outlier thing

* Revert "Add outlier thing"

This reverts commit 8b9d57ce13ff2b6b6c5ddbb0169a794f6bbd36de.

* Add ref for ComplexHeatmap

* Incorporate jashapiro review and rerender

* Remove standardize_genes option

* Wrap up those last few typo things

Co-authored-by: GitHub Actions <[email protected]>

* WGCNA Part 4: Warn about Outliers (#364)

* Put in basic changes: navbar, dict, snakefile, Rmd

* More polishing and info and refs

* Update file paths

* Bring back docker changes

* Add to dictionary

* Add a couple refs

* Add next steps

* Add some polishing and refs

* Address the straightforward items from cbethell 's review

* Incorporate jashapiro review from #358

* Style Rmds

* Bring over part1 changes and re-render

* Add last set of steps

* Push this particular plot version in case we wanna come back to it

* Commit this multiple module pheatmap in case I want to return to it

* ComplexHeatmap is mostly wrangled

* It's working!

* Save to PDFs

* Fix color function and re-render

* Add outlier thing

* Style Rmds

* Re-rendered html

* switch the whole outlier thing to just a comment

* re-render after staging merge

Co-authored-by: GitHub Actions <[email protected]>

* Microarray ORA Restructure Instruction (#377)

* Some edits and adding other tutorials

* Add more guidance about why pick ORA

* A bit more word changing

* A few more wording edits

* Incorporating jashapiro review

* Get rid of other GSEA mention

* sessioninfo::session_info()

* Put those two wording things in jashapiro mentioned

* WGCNA Part 5: switch dataset (#379)

* switch wording and dataset in general

* Few more wording edits

* Update dictionary; fix spelling errors

* Re-render!

* Change to 7 and incorporate jashapiro review

* Also switch the most sig module!

* Two comments from jashapiro review

* Put the comments too

* Style Rmds

* Use all_of() to get rid of warning

* Style Rmds

* Re-render

Co-authored-by: GitHub Actions <[email protected]>

* Change pdf -> png and rerun (#382)

* Pr 2 of 2: Add Microarray Pathway Analysis - GSVA example (#362)

* Mechanics for CSS file and navbar add feedback URL (#303)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Making staging changes live (#329)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Update github actions to reflect staging branch (#311)

* Update github actions to reflect staging branch

* Add libglpk40 to Dockerfile

* Make it gh-pages-stages!

* Remove dockerfile change that should have been on its own all along

* Does this work?

* Declare a uses

* Switch how env is declared

* Force it to run so we can test it

* try no curly brackets

* What's up with the branch

* Move to bash if instead

* Need quotes?

* forgot a `then`

* Try dollar signs

* Doesn't like the `.`?

* Use curly brackets

* Try ${GITHUB_REF}

* Try ${BRANCH_NAME}

* try ${GITHUB_REF#refs/*/}

* use jashapiro suggestion

* Change to base ref

* Change back to `github.ref`

* Get rid of PR `on:`

* Try another test

* Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316)

* Add lib package 40 thing that clusterprofiler needs

* Try adding options(warn = 2)

* Test if options(warn =2) means it breaks like it should

* Revert "Test if options(warn =2) means it breaks like it should"

This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3.

* Revert "Try another test"

This reverts commit 845cf1aff92ea7b83f402bbefd563562b44e5eac.

* Add google analytics to renderings (#314)

* Try adding google analytics

* Add to header using includes

* temporary file snuck in there

* Restore master version so they aren't in the review

* Let's call an html file an html file

* Docker dep fix: Add lib package 40 thing that clusterprofiler needs (#316)

* Add lib package 40 thing that clusterprofiler needs

* Try adding options(warn = 2)

* Test if options(warn =2) means it breaks like it should

* Revert "Test if options(warn =2) means it breaks like it should"

This reverts commit d9f688f68448ef69fe4c1caa48af23051cd7f4e3.

* Only push if we are in master.

For simplicity, we will now run this even if the dockerfile hasn't changed.

* Add test target

* test staging workflow with this branch

* back to latest tag

* Try separate push step

* change tags to test push

* Revert "change tags to test push"

This reverts commit 6a38574d312cee82c90c3c036ac9033f9af7f7ec.

* Remove this branch from triggers

* Push staging, retag and push master

Okay, so the branch name is now inaccurate, but that is fine...

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

* Update scripts/render-notebooks.R

* Add some issue templates (#319)

* Add some rough draft issue templates

* Incorporate cbethell review

* Get rid of `Other` labels that aren't useful

* Update diagrams showing how microarray/RNA-seq work  (#326)

* Mechanics for CSS file and navbar add feedback URL (#303)

* Adding in some style with css

* Use css magic

* Try making the navbar blue

* Add survey link

* Make font smaller

* Need a comma

* Change to normalizePath

* normalizepath separate step references.bib

* Move references.bib to component folder

* Made css modifications, added logo file

Made changes to css/navbar.html
Tried to add the logo, but it cuts out and I'm not sure how to make it decent.

* Resolve render-notebooks.R conflict

* Remove testing html from file diff

* uncommented mobile nav

Co-authored-by: dvenprasad <[email protected]>

* Update microarray and RNAseq overview figures


- add context re figures
- change .jpg to .png for consistency

* Revert "Mechanics for CSS file and navbar add feedback URL (#303)"

This reverts commit 8b81fdd96eeecf1d0e479d7908376b8e57dc356d.

* update links to diagrams

* @dvenprasad updated figure spacing

* add the right updated figure

* replace section of link to figures with updated commit id

* incorporate @cansavvy's suggested changes

Co-authored-by: Candace Savonen <[email protected]>
Co-authored-by: dvenprasad <[email protected]>

Co-authored-by: Joshua Shapiro <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Chante Bethell <[email protected]>

* Add part two of GSVA microarray example notebook

* update comment

* update violin plot and its interpretation

* add to `dictionary.txt`

* apply significance and multiple hypothesis testing before plotting

* Switching to northcott and a sina plot of one pathway

* Style Rmds

* Re-render it all

* Adjust wording add tidbits about limma and re-render

* Few more wording edits

* Caught a few more little wording issues. Re-rendered

* Remove Murat2008 ref

* Restore the part 1 changes that got lost in the merge

* incorporate most of jaclyn-taroni's suggested changes

- create annotated results df using wide -> long method
- update some wording/context re `mx.diff = TRUE` and what that means

* remove outdated entries in `dictionary.txt`

- remove unnecessary reference in `references.bib`

* fix axis label

* break up `annotated_results_df` steps

* Apply suggestions from code review

Co-authored-by: Jaclyn Taroni <[email protected]>

* add reminder of `gsva_results` format

- cite gsva package vignette
- add more detail around "appropriate format" for plotting

Co-authored-by: Candace Savonen <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Joshua Shapiro <[email protected]>
Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: Jaclyn Taroni <[email protected]>

* Remove getting started zip file (#392)

* ORA RNA-seq: Part 1 - The Set Up (#394)

* Add the file. It works

* Add components

* re-render

* Add review tags

* Update wording around detectable genes

* Add some words to dictionary.txt

* re-render

* Switch to PNG

* Incorporating cbethell's and envest's review

* Switch from using gene symbols to Entrez IDs

* Isolate to just set up

* Fix Typo

* Polish the wording in a few places

* Incorporate jashapiro reviews and remove tags

* Port one wording change over to microarray ORA

* One more wording edit

* ORA RNA-seq: Part 2 - Run ORA and get results! (#395)

* Add the file. It works

* Add components

* re-render

* Add review tags

* Update wording around detectable genes

* Add some words to dictionary.txt

* re-render

* Switch to PNG

* Incorporating cbethell's and envest's review

* Switch from using gene symbols to Entrez IDs

* Couple wording polishes

* Copy over changes from #394's review

* Use jashapiro wording suggestions, delete tags

* Add message = FALSE to mute chatty blocks (#398)

* Add message=FALSE to library loading chunks

* Rerender html files

* Spell check fixes

* rerender

* add GSVA package to Dockerfile (#401)

* Add rendering options via include.R  (#402)

* Add option to include R code in an early chunk

* Add the include file when rendering

* Change width to 70

* add example rerender

* comment and naming changes

* Update contributing.md with include file description

* Add all rendered changes

* GSVA for RNA-seq Part 1: Set up (#403)

* Scrapbooking together an analysis

* switch back to kcdf = "Gaussian"

* Rearrange based on chat with Jackie

* Fix the two things from jaclyn-taroni partial review

* Few wording edits

* Make the dup checks more relevant

* Make PNG a bit bigger

* incorporate most of jaclyn-taroni review comments

* Try out the msigdbr list thing

* Isolate to first parts of gsva

* Editing explanations

* Fix a couple spelling things

* Incorporate jaclyn-taroni review and delete tags

* Use `vst_df`

* One more wording change

* Remove that instance of "lists" that isn't really what we mean

* Link citations in render (#407)

* Add links to citations

* Fix umlaut

* Try different strategy for ortholog file download (#411)

* Try different download strategy

* Couple edits

* Move link to before download

* One other wording change

* Editing/polish of microarray heatmap notebook (#409)

* Intro edits

* Heatmap edits

* Render changes

* Couple edits that didn't get saved

* One more comment compaction.

* Remove relative links

* Add to microarray strengths

* Carry over common  comment changes (#414)

* Carry over common  comment changes

* Style Rmds

* White space change to force check

Co-authored-by: GitHub Actions <[email protected]>

* Use same download.file strategy for ortholog RNA-seq example (#413)

* Copy over changes from #411 but make it mouse

* "automatically" gets deleted

* GSVA for RNA-seq: Part 2 -- GSVA and a heatmap (#404)

* Scrapbooking together an analysis

* switch back to kcdf = "Gaussian"

* Rearrange based on chat with Jackie

* Fix the two things from jaclyn-taroni partial review

* Few wording edits

* Make the dup checks more relevant

* Make PNG a bit bigger

* incorporate most of jaclyn-taroni review comments

* Try out the msigdbr list thing

* Re-render

* Update based on part 1 review

* Add bit that shows overlaps

* Make wording changes based on jaclyn-taroni review

* Do some wording/explanation edits

* RNA-seq DGE dataset switch (#416)

* Introduce SRP123625

* Updating wording and some other items

* Re-render it

* Spell error fixes

* jashapiro review suggestions

* one more change and re-render

* add part 1 of RNA-seq GSEA example notebook (take 2) (#419)

* PCA polishing edits (#421)

* Add principal component background

Also shortened code lines and results tables

* Add some more context and explanation of results

* Updates to rnaseq PCA

* Format and rerender

* update screenshots

* Apply suggestions from code review

Co-authored-by: Candace Savonen <[email protected]>

* Don't call it a matrix

it's been here for years

* rerender

Co-authored-by: Candace Savonen <[email protected]>

* Try out a different download strategy for ORA (#418)

* Change download steps to download file

* re-render

* Spell error fixes

* Use jashapiro wording

* Use download.file for the three other notebooks (#422)

* Bring over the GSEA changes

* Add download.file() to the other three places

* Found a typo

* Fix two things cbethell mentioned in review

* PR 2 of 2: Add RNA-seq Pathway Analysis - GSEA example (take 2) (#420)

* add part 2 of RNA-seq GSEA example notebook (take 2)

* rerun snakefile

* incorporate jaclyn-taroni's review suggestions

* rerun Snakefile to fix html output (#424)

* Umap polish (#423)

* UMAP polish edits

* rendering

* Apply suggestions from code review

Co-authored-by: Candace Savonen <[email protected]>

* rerender

* Move filtering to before DESeq2 object creation  (#425)

* re-render it

* Further fix merge conflicts and re-render

* Address jashapiro comments

* Heatmap polish (#426)

* Polishing edits to heatmap pages

* Style and render

* OSPL fix

Co-authored-by: Candace Savonen <[email protected]>

* rownames -> row names

* rerender all the things

and delete a stray space

* Polish differential exp microarray notebooks (#427)

* Changes to differential exp microarray notebook

* Spelling updates

* Add comments and releveling, as suggested by @cansavvy

* code formatting updates

embrace the pipe

* remove apeglm

* Add some eBayes info!

* polishing microarray multiple groups

* Rerender everything

* Split off multiple testing

* Rerender

* Minor Polish Diff Expr RNAseq (#429)

* Minor polishing to RNAseq Diff Expr

* render changes

* Apply suggestions from code review

Co-authored-by: Candace Savonen <[email protected]>

* render

Co-authored-by: Candace Savonen <[email protected]>

* Polish the microarray ORA notebook (#430)

* Update MSigDB section + rerender

Also fixes long comments and rownames.print=FALSE

* A few more edits to comments

* Part 1: Add rownames.print = FALSE where it's helpful (#431)

* Add print.rownames = FALSE where it's helpful

* Apparently it doesn't affect design matrices

* Push the htmls so that people can actually see them!!!

* Polish the RNA-seq ORA example (#432)

* Add the rownames.print = FALSE and re-render (#433)

Co-authored-by: jashapiro <[email protected]>

* Polish Microarray GSEA example (#434)

* Polish wording and add introductory paragraphs

* A bit more polishing

* Apply suggestions from code review

Co-authored-by: jashapiro <[email protected]>

* Response to code review

* Apply suggestions from code review

Co-authored-by: jashapiro <[email protected]>

* Propagate citation suggestion to multiple pathway notebooks

* Missed these quotes

* A few mapIds() items

Co-authored-by: jashapiro <[email protected]>

* Polish RNA-seq GSEA example (#437)

* Polish wording and add introductory paragraphs

* A bit more polishing

* Apply suggestions from code review

Co-authored-by: jashapiro <[email protected]>

* Response to code review

* Apply suggestions from code review

Co-authored-by: jashapiro <[email protected]>

* Propagate citation suggestion to multiple pathway notebooks

* Missed these quotes

* A few mapIds() items

* Polish RNA-seq GSEA example

* Missed a long comment

* Update components/references.bib

Co-authored-by: jashapiro <[email protected]>

* Let the algorithm handle it

Co-authored-by: jashapiro <[email protected]>

* GHA: Slack us if Docker build or rendering fails (#438)

* Add Slack notification to docker-build-push.yml

* Add Slack notification to docker-build.yml

* Add branch for testing

* Add library load of package not installed

* Revert "Add library load of package not installed"

This reverts commit 4e83ed1e104f0760db40752f9bc9e641f916d374.

* Revert "Add branch for testing"

This reverts commit 06073504426dc1903cf46bb99cc65a3a91894a3c.

* Polish Ensembl Gene ID conversions (bonus reference updates!) (#435)

* Polish wording and add introductory paragraphs

* A bit more polishing

* Reference updates

* ensembl gene id polish

* Get out of here capital Refine.bio

* No more bare dfs

* Transfer changes to RNAseq (and some back)

* Apply suggestions from code review

Co-authored-by: jashapiro <[email protected]>

* Response to code review

* Some numeric updates

* Add rendered files

* render update

* Citation update

* Render updates

* add comment & rerender

Co-authored-by: Jaclyn Taroni <[email protected]>

* Polish microarray GSVA example (#440)

* Ignore the gene_sets directory

* Polish the microarray GSVA example

* Missed a couple mentions of GSEA

* Borrow some polishing from #427

* Apply suggestions from code review

Co-authored-by: Candace Savonen <[email protected]>

* Newline in intro paragraph everywhere

* Add note about model organisms with GSVA

Link to RNA-seq GSVA example

* Rerender notebooks

Co-authored-by: Candace Savonen <[email protected]>

* Polishing Ortholog notebooks (#436)

* Polish wording and add introductory paragraphs

* A bit more polishing

* Reference updates

* ensembl gene id polish

* Get out of here capital Refine.bio

* No more bare dfs

* Transfer changes to RNAseq (and some back)

* Apply suggestions from code review

Co-authored-by: jashapiro <[email protected]>

* Response to code review

* Some numeric updates

* Add rendered files

* render update

* Citation update

* Add polishing for ortholog examples

* Transfer intro sentence updates to related notebooks

* more polish - remove duplicates from counts

* Transfer changes to rnaseq

* Rendering updates

* other rendering updates

* Render updates

* Change ftp -> http for ftp.ebi

As noted here: https://github.com/AlexsLemonade/refinebio-examples/issues/439#issuecomment-748721625

Confirmed that this does work as expected by rendering.

* Add branch to docker-build-push.yml for render test

* Apply suggestions from code review

Co-authored-by: Candace Savonen <[email protected]>

* Clarify real genes

* add some spelling words

* Revert "Add branch to docker-build-push.yml for render test"

This reverts commit 8e7b6ec868bab71c4ca0ec2f9b21e2be25df978b.

* rendering

* Caught an igor

Co-authored-by: Jaclyn Taroni <[email protected]>
Co-authored-by: Candace Savonen <[email protected]>

* Polish the RNA-seq GSVA example (#441)

Co-authored-by: Joshua Shapiro <[email protected]>
Co-authored-by: dvenprasad <[email protected]>
Co-authored-by: Chante Bethell <[email protected]>
Co-authored-by: Jaclyn Taroni <[email protected]>
Co-authored-by: GitHub Actions <[email protected]>