-
Notifications
You must be signed in to change notification settings - Fork 13
/
README.Rmd
144 lines (109 loc) · 4.64 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
cache = TRUE,
out.width = "100%"
)
options(tibble.print_min = 5, tibble.print_max = 5)
library(BiocStyle)
```
# cBioPortalData
<!-- start badges here -->
[![BioC status](http://www.bioconductor.org/shields/build/release/bioc/cBioPortalData.svg)](https://bioconductor.org/checkResults/release/bioc-LATEST/cBioPortalData)
[![Platforms](http://www.bioconductor.org/shields/availability/release/cBioPortalData.svg)](https://www.bioconductor.org/packages/release/bioc/html/cBioPortalData.html#archives)
[![Downloads](https://www.bioconductor.org/shields/downloads/release/cBioPortalData.svg)](https://bioconductor.org/packages/stats/bioc/cBioPortalData/)
<!-- end badges here -->
## cBioPortal data and MultiAssayExperiment
### Overview
The `cBioPortalData` R package aims to import cBioPortal datasets as
`r Biocpkg("MultiAssayExperiment")` objects into Bioconductor. Some of the
features of the package include:
1. The use of the `MultiAssayExperiment` integrative container for coordinating
and representing the data.
2. The data container explicitly links all assays to the patient
clinical/pathological data.
3. With a [flexible API][1], `MultiAssayExperiment` provides harmonized
subsetting and reshaping into convenient wide and long formats.
4. The package provides datasets from both the API and the saved packaged data.
5. It also provides automatic local caching, thanks to
`r Biocpkg("BiocFileCache")`
[1]: https://github.com/waldronlab/MultiAssayExperiment/wiki/MultiAssayExperiment-API
## MultiAssayExperiment Cheatsheet
<a href="https://github.com/waldronlab/cheatsheets/blob/master/MultiAssayExperiment_QuickRef.pdf">
<img src="https://raw.githubusercontent.com/waldronlab/cheatsheets/master/pngs/MultiAssayExperiment_QuickRef.png" width="989" height="1091"/>
</a>
## Quick Start
### Installation
To install from Bioconductor (recommended for most users, this will install the
release or development version corresponding to your version of Bioconductor):
```{r,eval=FALSE}
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("cBioPortalData")
```
Developers may want to install from GitHub for bleeding-edge updates (although
this is generally not necessary because changes here are also pushed to
[bioc-devel][2]). Note that developers must be working with the development
version of Bioconductor; see [bioc-devel][2] for details.
[2]: https://contributions.bioconductor.org/use-devel.html
```{r,eval=FALSE}
if (!require("cBioPortalData", quietly = TRUE))
BiocManager::install("waldronlab/cBioPortalData")
```
To load the package:
```{r,include=TRUE,results="hide",message=FALSE,warning=FALSE}
library(cBioPortalData)
```
```{r,results="hide",include=FALSE}
cbio <- cBioPortal()
studies <- getStudies(cbio, buildReport = TRUE)
st <- studies
packpct <- round(prop.table(table(st$pack_build))[[2]], 2) * 100
apipct <- round(prop.table(table(st$api_build))[[2]], 2) * 100
```
## Note
`cBioPortalData` is a work in progress due to changes in data curation and
cBioPortal API specification. Users can view the `data(studiesTable)` dataset
to get an overview of the studies that are available and currently building
as `MultiAssayExperiment` representations. About `r apipct` % of the studies via
the API (`api_build`) and `r packpct` % of the package studies (`pack_build`)
are building, these include additional datasets that were not previously
available. Feel free to file an issue to request prioritization of fixing any
of the remaining datasets.
```{r,eval=FALSE}
cbio <- cBioPortal()
studies <- getStudies(cbio, buildReport = TRUE)
```
```{r}
table(studies$api_build)
table(studies$pack_build)
```
### API Service
Flexible and granular access to cBioPortal data from `cbioportal.org/api`.
This option is best used with a particular gene panel of interest. It allows
users to download sections of the data with molecular profile and gene panel
combinations within a study.
```{r,warning=FALSE,message=FALSE,include=TRUE,results="hide"}
gbm <- cBioPortalData(api = cbio, by = "hugoGeneSymbol", studyId = "gbm_tcga",
genePanelId = "IMPACT341",
molecularProfileIds = c("gbm_tcga_rppa", "gbm_tcga_mrna")
)
```
```{r}
gbm
```
### Packaged Data Service
This function will download a dataset from the `cbioportal.org/datasets`
website as a packaged tarball and serve it to users as a `MultiAssayExperiment`
object. This option is good for users who are interested in obtaining
all the data for a particular study.
```{r,warning=FALSE,message=FALSE,include=TRUE,results="hide"}
acc <- cBioDataPack("acc_tcga")
```
```{r}
acc
```