WIP: WGCNA second draft #353

cansavvy · 2020-11-12T20:37:58Z

Analysis Purpose

This is a spin-off from issue #306 and the discussion on #346, and a second draft following #350.

Now that WGCNA is an advanced topic we can expand some things.

Pull Request Stage

This is a Draft PR - needs review of big concepts and outline

Strategy

After the first draft (#350), we made some decisions and changes; this is another draft before we move on to refined PRs.

I've tried to incorporate the bigger and smaller changes mentioned by @jashapiro here: #350 (comment)
I also added a section where we dive into which module is most differentially expressed between the treatment groups.

I also added a bit more guidance about some of the arguments for the blockwiseModules() step.

Concerns/Questions for reviewers:

The main changes that need to be reviewed (and are tagged as such) are the Run WGCNA! section and beyond. The rest hasn't changed as much, but I do have questions about whether the Determine parameters for WGCNA section should be expanded.

Do we still want another plot like a heatmap showing a summary or do we think this is fine?
My sina plot of the most differentially expressed module eigengene is a nice sanity check, but do we want something else/more?
There are clustering parameters that I'm not sure it's worth giving guidance about, or even what guidance to give. Section 3 of this doc is the only place I can find more description about choosing these parameters, but I still don't find it that helpful: https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/BranchCutting/Supplement.pdf
See what you think, do we let it go?
A lot of the WGCNA tutorials have this kind of graph along with the soft threshold picking graph. I didn't include it originally because it's unclear to me from their docs what more you really gain from it that you can't conclude more easily from the R^2 power graph. Let me know if you disagree.
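
For context on what the R^2 power graph summarizes, here is a minimal, simplified sketch of a scale-free topology fit across candidate soft-threshold powers. It is written in plain Python rather than the R used by the notebook, and it is not WGCNA's implementation. The toy data, the function names (`simulate_expression`, `scan_powers`, `signed_scale_free_r2`), the unsigned adjacency, and the 10-bin histogram are all assumptions made for illustration.

```python
import math
import random


def simulate_expression(n_genes=60, n_samples=20, seed=1):
    """Hypothetical toy genes x samples matrix; the first 15 genes share one latent factor."""
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n_samples)]
    return [
        [(1.0 if g < 15 else 0.0) * f + rng.gauss(0, 1) for f in factor]
        for g in range(n_genes)
    ]


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def signed_scale_free_r2(k, n_bins=10):
    """Regress log10 p(k) on log10 k over binned connectivities.

    Returns -sign(slope) * R^2, so a decreasing, power-law-like
    relationship gives a positive value.
    """
    lo, hi = min(k), max(k)
    width = (hi - lo) / n_bins or 1.0
    bins = [[] for _ in range(n_bins)]
    for v in k:
        bins[min(int((v - lo) / width), n_bins - 1)].append(v)
    # Simplification: empty bins are skipped rather than imputed.
    pts = [
        (math.log10(sum(b) / len(b)), math.log10(len(b) / len(k)))
        for b in bins
        if b and sum(b) > 0
    ]
    if len(pts) < 2:
        return 0.0
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    syy = sum((y - my) ** 2 for _, y in pts)
    if sxx == 0 or syy == 0:
        return 0.0
    r2 = sxy * sxy / (sxx * syy)
    return -math.copysign(1.0, sxy) * r2


def scan_powers(expr, powers):
    """Return (power, signed R^2, mean connectivity) for each candidate power."""
    n = len(expr)
    # Unsigned network assumption: adjacency is |cor|^power.
    abs_cor = [[abs(pearson(expr[i], expr[j])) for j in range(n)] for i in range(n)]
    rows = []
    for p in powers:
        k = [sum(abs_cor[i][j] ** p for j in range(n) if j != i) for i in range(n)]
        rows.append((p, signed_scale_free_r2(k), sum(k) / n))
    return rows


if __name__ == "__main__":
    for power, r2, mean_k in scan_powers(simulate_expression(), range(1, 13)):
        print(f"power={power:2d}  signed R^2={r2:6.3f}  mean k={mean_k:7.3f}")
```

Each row pairs a candidate power with its fit and the mean connectivity, which are the two quantities the tutorials usually plot side by side. On random toy data like this the fit values are not meaningful in themselves; the point is only to show what is being computed.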

Analysis Pull Request Check List (roughly in order):

Content checks

All {{BLANKS}} have been replaced with the correct content.
Sources are cited
Seed is set (if applicable)

Formatting Checks

Removed any manual numbering of sections.
Removed any instances of chunk naming.
Comments and documentation are up to date.
All links have been checked and are properly formatted.

Add datasets to S3

Added data and metadata files to S3.

Docker/Snakemake rendering components

Added the .html link to the navigation bar.
Any not yet added packages needed for this analysis have been added to the Dockerfile and it successfully builds.

jashapiro

High-level thoughts: This looks good, and seems to me to have about the right level of code content. It seems reasonable to move along to working on it section by section at this point. I would probably suggest 3 parts to start: 1) intro/setup 2) choosing parameters and running WGCNA (including overview results figure) 3) application (limma).

I'll start by addressing the questions as requested:

I do like the heatmap, so I would add that back. I think it gives a better overview than just looking at part of the output table. In particular, having some sense of how similar or different the eigengenes are from one another is an important part of gaining intuition about the process.
I think the sina plot is fine... I don't think this module needs to go into too much depth about what you do after WGCNA; it should focus on explaining WGCNA, with some nods to later analysis. The limma result falls into the latter part.
I'm going to reserve judgement on the parameter explanations for detailed review, but my initial impression was this was fine to start.
I think the power vs R^2 plot is probably sufficient. (I also think the text numbers for data points are redundant... I'd be happier with just plain old points and maybe gridlines to make it easier to track which value is which.)

Other thoughts:

As this is an advanced topic, I think we will want to give more background. The introduction could be expanded substantially, with more discussion of what WGCNA is and why someone might want to use it. We had some of this discussion internally in WIP: WGCNA module draft #350; it would be good to have a version of these thoughts for external consumption!
In general, I would favor more explanation of results. The module at the moment feels like it is heavily weighted toward "how" with not enough "what" and "why".
For these "advanced topics", do we want to cut down the "How to run this" material? My thought is just that hopefully people getting there will have a better sense of the basics and won't need so much detail.

cansavvy · 2020-11-16T19:13:35Z

For these "advanced topics", do we want to cut down the "How to run this" material?

Can you give me an example about what you mean here?

cansavvy · 2020-11-16T19:18:33Z

I would probably suggest 3 parts to start: 1) intro/setup 2) choosing parameters and running WGCNA (including overview results figure) 3) application (limma).

Given your review then, @jashapiro , I'll start prepping the refined PRs.

jashapiro · 2020-11-16T19:23:33Z

For these "advanced topics", do we want to cut down the "How to run this" material?

Can you give me an example about what you mean here?

This was a more high level philosophical question that would apply to any "Advanced" topic. Do we want to at some point reduce the instructions to essentially "Get this dataset from refine.bio and put it in the data folder" and remove most of the screenshots? Assume people have learned the navigate refine.bio skill by the time they get to advanced topics.

cansavvy · 2020-11-16T19:33:48Z

This was a more high level philosophical question that would apply to any "Advanced" topic. Do we want to at some point reduce the instructions to essentially "Get this dataset from refine.bio and put it in the data folder" and remove most of the screenshots? Assume people have learned the navigate refine.bio skill by the time they get to advanced topics.

If your comment is mainly about the introduction material I would say most of it should still remain. I wouldn't expect the users going through the "advanced topics" examples will always be users who have gone through the "basic" examples already so they may not be familiar with downloading from refine.bio. Your comment seems to suggest that this would be a progression (people start with non-advanced and go to advanced), but I think there might be users who come to our material only for "advanced topics".

This being said, this is probably something we should discuss in more detail and is less tied to this particular module, so I'll open up an issue for discussions about this.

Edit: See #357 to further discuss this topic and topics like it.

cansavvy added 13 commits November 10, 2020 09:54

Push a basic draft that doesn't exactly work

aa88556

More words and polishing

c029600

Try out some plots

9ac8121

Put some DRAFT and REVIEW designations

51bd54d

Add to Dockerfile

743fb45

Get rid of umap edit

cff76f6

Add wgcna example to snakefile and run

4f77862

Move WGCNA to the "advanced topics" folder

364e175

remove rna-seq files

58b4550

A set up with ANOVA

ab3df5b

Use limma instead. Its more straightforward and gives essentially the…

d16639a

… same results

Add ggforce to packages

35c1230

render it

e8b7d00

cansavvy changed the title ~~Cansavvy/wgcna draft 2~~ WIP: WGCNA second draft Nov 12, 2020

cansavvy added 3 commits November 13, 2020 12:32

Put draft tags and more words about parameters

7b1c9c4

Push a rendered notebook

c83679e

Add to dictionary; fix spellings

1de2832

cansavvy requested a review from jashapiro November 13, 2020 17:54

cansavvy mentioned this pull request Nov 16, 2020

[Investigation/Discussion] Methods/packages for identifying co-expressed gene modules #346

Closed

jashapiro reviewed Nov 16, 2020

This was referenced Nov 16, 2020

Try out new Advanced Usage intro/instructions in the WGCNA example #357

Open

WGCNA Part 1: Set up #358

Merged

WGCNA Part 2: Running WGCNA #360

Merged

WGCNA Part 3: DE and heatmaps #363

Merged

WGCNA Part 4: Warn about Outliers #364

Merged

jaclyn-taroni closed this Nov 25, 2020

jaclyn-taroni deleted the cansavvy/wgcna-draft-2 branch November 25, 2020 18:40
