New Analysis Example: Microarray Pathway Analysis - GSVA #343

cansavvy · 2020-10-29T16:28:12Z

What are the goals of this new example analysis?

ORA and GSEA are certainly popular pathway analysis methods, but GSVA requires fewer cutoffs and less decision making, so having this method as an example would probably be helpful for our users.

Per-sample pathway analysis results answer a different question, one that GSVA can address but ORA and GSEA largely can't.

What kind of dataset will this need?

We may want to use the same original dataset we used in either the GSEA or ORA example so we have a comparison of pathway analyses: GSE71270 (zebrafish CREB study) or GSE37418 (human medulloblastoma subtype).

What steps should be included in this analysis?

We can borrow some inspiration from https://github.com/AlexsLemonade/training-modules/blob/master/pathway-analysis/03-gene_set_variation_analysis.Rmd, keeping in mind that the narrative will need to change somewhat, as with other examples we've adapted from training to refinebio-examples (see #306).

Import library(GSVA) (add this to the Dockerfile)
Set up gene expression data as a matrix.
Import gene lists and decide whether to use Hallmark gene sets. This decision should take into account the discussion happening on WIP: Add Microarray Pathway Analysis - GSEA example #339 (comment); we'll want to make sure users understand the implications of multiple testing corrections and how smaller gene sets can help with this.
Use GSVA::gsva() to perform GSVA, probably start out with largely the same parameters used in training but adjust if/when things look wonky.
Display a preview of significant results in one way or another. This is somewhat related to the discussion in WIP: Add Microarray Pathway Analysis - GSEA example #339 (comment).
Make some sort of visualization of the GSVA scores. Not sure what makes the most sense here; perhaps plotting the top results and maybe a jitter plot by group?
Write results to a TSV.
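
To illustrate the multiple testing point in the steps above: the fewer gene sets we test, the milder the correction applied to whatever test is run downstream. A minimal sketch in base R, assuming a made-up raw p-value and comparing 50 tests (the size of the MSigDB Hallmark collection) against a hypothetical collection of 5000 gene sets:

```r
# Hypothetical raw p-value for a single pathway (made up for illustration)
raw_p <- 0.0005

# Bonferroni adjustment if 50 gene sets were tested (e.g. Hallmark)
adj_small <- p.adjust(raw_p, method = "bonferroni", n = 50)

# Same raw p-value if 5000 gene sets were tested (hypothetical larger collection)
adj_large <- p.adjust(raw_p, method = "bonferroni", n = 5000)

print(adj_small)
print(adj_large)
```

Bonferroni is used here only because it is the easiest correction to reason about; the example itself may well use a different method.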

What packages/methods do you recommend using or looking into for this analysis?

Probably GSVA unless there are other package suggestions we should consider.


cbethell · 2020-11-09T20:21:04Z

Based on a discussion with @cansavvy, the plan in the original comment above, and the training modules example for inspiration, the tentative plan for tackling this ticket is as follows:

Import library(GSVA) (add this to the Dockerfile)
Read in gene expression data (Homo sapiens, likely a dataset already on S3)
Import gene list from the Broad Institute URL, using the recommendation from the GSVA vignette to read in the file (and isolate Hallmark gene sets). Include context making sure users understand the implications of multiple testing corrections and how smaller gene sets can help with this (if we were to read in a smaller subset file).
Gene identifier conversion: map to human gene symbols or Entrez IDs, likely symbols.
Remove duplicate identifiers — using the highest variance to select which row to keep perhaps?
Use GSVA::gsva() to perform GSVA, probably start out with largely the same parameters used in training but adjust if/when things look wonky.
Make some sort of visualization of the GSVA scores, such as a heatmap and maybe a violin or jitter plot by group. Open question: should we pick what to plot by highest variance or by highest GSVA score?
Write results to a TSV.
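
As a rough sketch of the duplicate-identifier step in the plan above, here is one way to keep the highest-variance row per gene symbol using only base R. The matrix, sample names, and symbol assignments are toy values invented for illustration, not data from the dataset we end up using:

```r
# Toy expression matrix (hypothetical values): rows are probes, columns are samples
expr <- matrix(
  c(1, 2, 3,
    5, 5, 5,
    2, 4, 9,
    7, 8, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("s1", "s2", "s3"))
)
# Gene symbol for each row; each symbol appears twice
symbols <- c("TP53", "TP53", "MYC", "MYC")

# Variance of each row across samples
row_var <- apply(expr, 1, var)

# Visit rows from highest to lowest variance and keep the first row seen per symbol
ord <- order(row_var, decreasing = TRUE)
keep <- sort(ord[!duplicated(symbols[ord])])  # sort() restores the original row order

dedup <- expr[keep, , drop = FALSE]
rownames(dedup) <- symbols[keep]
print(dedup)
```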

Feel free to leave any suggestions/modifications you believe should be made before implementing this plan!
cc: @jaclyn-taroni and @jashapiro

jaclyn-taroni · 2020-11-09T20:53:12Z

Remove duplicate identifiers — using the highest variance to select which row to keep perhaps?

You could also aggregate to the mean value for a gene symbol for each sample.
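
A minimal base R sketch of that mean-aggregation alternative, assuming a toy matrix and made-up symbol assignments (none of these values come from a real dataset):

```r
# Toy expression matrix (hypothetical values): rows are probes, columns are samples
expr <- matrix(
  c(1, 2, 3,
    5, 5, 5,
    2, 4, 9,
    7, 8, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("s1", "s2", "s3"))
)
# Gene symbol for each row; each symbol appears twice
symbols <- c("TP53", "TP53", "MYC", "MYC")

# rowsum() adds up the rows that share a symbol, separately for each sample
sums <- rowsum(expr, group = symbols)

# Number of rows per symbol, in the same order as the rows of `sums`
counts <- as.vector(table(symbols)[rownames(sums)])

# Dividing each row by its count gives the per-sample mean for each symbol
mean_expr <- sums / counts
print(mean_expr)
```

Compared with keeping only the highest-variance row, this uses every measurement for a symbol, at the cost of averaging probes that may not behave alike.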

cansavvy mentioned this issue Nov 4, 2020

New Analysis Example: RNA-seq Pathway analysis -- ORA #344

Closed

cbethell self-assigned this Nov 5, 2020

This was referenced Nov 12, 2020

WIP: Add Microarray Pathway Analysis - GSVA example #352

Closed

Pr 1 of 2: Add Microarray Pathway Analysis - GSVA example #359

Merged

cansavvy added the "before going live" label (Needs to be done before we can "go live" or do testing) Nov 18, 2020

cbethell mentioned this issue Nov 20, 2020

Pr 2 of 2: Add Microarray Pathway Analysis - GSVA example #362

Merged

10 tasks

cbethell closed this as completed in #362 Nov 30, 2020
