Problems with compareAgainstL1000 #4

Open
fransilvion opened this issue Nov 20, 2018 · 4 comments

@fransilvion

Hi,

When I run the same code from this pdf and execute the line:

compareSmallMolecule$spearman <- compareAgainstL1000( diffExprStat, l1000perturbationsSmallMolecules, cellLine, method="spearman")

I am getting the following error:

Error in `.rowNamesDF<-`(x, value = value) : duplicate 'row.names' are not allowed

What might be the problem?

sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/local/lib/R/lib/libRblas.so
LAPACK: /usr/local/lib/R/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8 LC_PAPER=en_CA.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_0.7.8 cTRAP_1.0.2 biomaRt_2.38.0 Biobase_2.42.0 BiocGenerics_0.28.0 DeMAND_1.12.0
[7] KernSmooth_2.23-15 limma_3.38.2

loaded via a namespace (and not attached):
[1] httr_1.3.1 jsonlite_1.5 bit64_0.9-7 R.utils_2.7.0 gtools_3.8.1
[6] assertthat_0.2.0 stats4_3.5.1 blob_1.1.1 yaml_2.2.0 progress_1.2.0
[11] slam_0.1-43 pillar_1.3.0 RSQLite_2.1.1 lattice_0.20-35 glue_1.3.0
[16] digest_0.6.18 colorspace_1.3-2 cowplot_0.9.3 Matrix_1.2-14 R.oo_1.22.0
[21] plyr_1.8.4 XML_3.98-1.16 pkgconfig_2.0.2 purrr_0.2.5 relations_0.6-8
[26] scales_1.0.0 gdata_2.18.0 BiocParallel_1.16.0 tibble_1.4.2 IRanges_2.16.0
[31] ggplot2_3.1.0 pbapply_1.3-4 lazyeval_0.2.1 magrittr_1.5 crayon_1.3.4
[36] memoise_1.1.0 R.methodsS3_1.7.1 gplots_3.0.1 tools_3.5.1 data.table_1.11.8
[41] prettyunits_1.0.2 hms_0.4.2 stringr_1.3.1 Rhdf5lib_1.4.0 S4Vectors_0.20.1
[46] munsell_0.5.0 cluster_2.0.7-1 AnnotationDbi_1.44.0 bindrcpp_0.2.2 compiler_3.5.1
[51] caTools_1.17.1.1 rlang_0.3.0.1 rhdf5_2.26.0 grid_3.5.1 RCurl_1.95-4.11
[56] marray_1.60.0 igraph_1.2.2 bitops_1.0-6 gtable_0.2.0 curl_3.2
[61] DBI_1.0.0 sets_1.0-18 R6_2.3.0 gridExtra_2.3 knitr_1.20
[66] bit_1.1-14 bindr_0.1.1 fastmatch_1.1-0 fgsea_1.8.0 readr_1.1.1
[71] stringi_1.2.4 Rcpp_1.0.0 piano_1.22.0 tidyselect_0.2.5

@fransilvion
Author

fransilvion commented Nov 20, 2018

Also, when I run this line:
compareSmallMolecule$gsea <- compareAgainstL1000( diffExprStat, l1000perturbationsSmallMolecules, cellLine, method="gsea", geneSize = 150)

I have a lot of warnings:

the condition has length > 1 and only the first element will be used

What exactly does it mean, and how bad is it for the following analysis? Thanks
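
For context, base R raises this warning when `if()` receives a logical vector with more than one element. In R 3.5.x only the first element is tested, and since R 4.2.0 the same situation is an error. Where the warning originates inside cTRAP is not shown in this thread, so the sketch below does not reproduce the package's internals. It uses a made-up vector `x` to show the two usual ways of making the intent explicit.

```r
# Hypothetical toy data, not taken from cTRAP
x <- c(0.2, -1.5, 3.1)

# `if (x > 0)` would check only x[1] in R < 4.2 (with the warning above)
# and stop with an error in R >= 4.2.

# Reduce the condition to a single TRUE/FALSE when one decision is needed
anyPositive <- any(x > 0)
allPositive <- all(x > 0)

# Use ifelse() when a decision is needed for each element
direction <- ifelse(x > 0, "up", "down")

print(anyPositive)
print(allPositive)
print(direction)
```

Whether the warning affects the results depends on whether the code really meant to test only the first element, which is why it is worth reporting to the maintainer.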

@nuno-agostinho
Owner

Hello @fransilvion, I just tried to run the tutorial again and everything is working fine.

Could you save your R environment in an RDS file and send it to me so I can analyse it? Please attach the RDS file to this thread or send it to me by email ([email protected]). Thank you!
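
A minimal sketch of how a single object can be written to an RDS file and read back with base R's saveRDS() and readRDS(). The toy vector and the temporary file path are placeholders; in practice the object would be the diffExprStat used in the failing call, assumed here to be a named numeric vector. Note that saveRDS() stores one object, whereas save.image() is the base R function for saving a whole workspace (as an .RData file).

```r
# Placeholder standing in for the real diffExprStat
# (a named numeric vector is an assumption)
diffExprStat <- c(GENE1 = 1.2, GENE2 = -0.7, GENE3 = 2.4)

# Write the object to a temporary location (replace with a real path)
rdsFile <- file.path(tempdir(), "diffExprStat.Rds")
saveRDS(diffExprStat, rdsFile)

# Read it back and confirm that nothing was lost
restored <- readRDS(rdsFile)
print(identical(restored, diffExprStat))
```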

@fransilvion
Author

Hi @nuno-agostinho, thanks for the quick response! I ran the tutorial again and it worked, but with my own data set I still get this error. I can send you my diffExprStat file via email. I am searching for compounds in the MCF7 cell line, and all other steps are the same as in the tutorial.
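
Because the tutorial data runs while a custom diffExprStat fails, it may be worth checking whether the gene names attached to diffExprStat are unique. R raises "duplicate 'row.names' are not allowed" whenever non-unique values are assigned as data frame row names. This is only a guess at the cause, not a confirmed diagnosis, and the sketch assumes diffExprStat is a named numeric vector. The toy values and the rule for resolving duplicates (keep the entry with the largest absolute statistic) are illustrative choices.

```r
# Toy stand-in for a custom diffExprStat with a repeated gene symbol
# (structure and values are assumptions)
diffExprStat <- c(TP53 = 2.1, BRCA1 = -1.4, TP53 = -3.0, MYC = 0.8)

# Which names occur more than once?
geneNames <- names(diffExprStat)
dupNames  <- unique(geneNames[duplicated(geneNames)])
print(dupNames)

# Assigning duplicated names as row names triggers the base R error
df  <- data.frame(stat = unname(diffExprStat))
msg <- tryCatch({
    rownames(df) <- geneNames
    "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# One possible fix: per gene, keep the entry with the largest absolute value
ord   <- order(abs(diffExprStat), decreasing = TRUE)
dedup <- diffExprStat[ord]
dedup <- dedup[!duplicated(names(dedup))]
print(dedup)
```

If `dupNames` is empty for the real data, the duplicate row names come from somewhere else and this check can be ruled out.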

@nuno-agostinho
Owner

Hey @fransilvion, sorry for the delay in answering! Could you try to run the following commands?

library(cTRAP)
diffExprStat <- readRDS("diffExprStat.Rds")

# Summarise available conditions for CMap perturbations ------------------------
cellLine <- "MCF7"
l1000metadata <- downloadL1000data("l1000metadata.txt", "metadata")
l1000metadataSmallMolecules <- filterL1000metadata(
    l1000metadata, cellLine=cellLine, perturbationType="Compound")
getL1000conditions(l1000metadataSmallMolecules)

# Load compound-associated perturbations related to MCF7 cell line -------------
l1000zscores  <- downloadL1000data("l1000zscores.gctx", "zscores",
                                   l1000metadataSmallMolecules$sig_id)
l1000geneInfo <- downloadL1000data("l1000geneInfo.txt", "geneInfo")
l1000perturbationsSmallMolecules <- loadL1000perturbations(
    l1000metadataSmallMolecules, l1000zscores, l1000geneInfo)

# Compare against CMap perturbations -------------------------------------------
gseaResults <- compareAgainstL1000(
    diffExprStat, l1000perturbationsSmallMolecules, cellLine, method="gsea",
    geneSize=150)

# Order based on similarity ----------------------------------------------------
gseaRes_ordered <- gseaResults[order(gseaResults$MCF7_WTCS, decreasing=TRUE), ]

head(gseaRes_ordered)
tail(gseaRes_ordered)

This should run with no issues and return the perturbations tested in the MCF7 cell line from CMap, ordered by similarity. How does your script deviate from the one above?
