Human 1.19 deprecated since this does not follow the guidelines #819

Closed
149 commits
ef480d2
fix: Fixed a bug in importYaml.m for when you have an old MATLAB vers…
johan-gson May 21, 2023
57c59c0
feat: Added gene essentiality prediction from DepMap data. The code c…
johan-gson May 21, 2023
f1251a6
Update code/DepMapGeneEss/Instructions.txt
johan-gson May 22, 2023
a9f1d42
Update code/DepMapGeneEss/PrepDepMapData.m
johan-gson May 22, 2023
8d960db
Update code/DepMapGeneEss/enrichment_test.m
johan-gson May 22, 2023
c752e96
Update code/DepMapGeneEss/ConvertGeneEssInputData.R
johan-gson May 24, 2023
3340aaa
Update code/DepMapGeneEss/PlotGeneEss.R
johan-gson May 24, 2023
a07b7b6
refactor: move the gene essentiality instrucitons to Human-GEM-guide
mihai-sysbio Jul 6, 2023
9f3f937
Feat: create metabolites_SMILES_inchi.tsv
JHL-452b Nov 23, 2023
f554fa2
fix: remove MAR03748 and MAR03780 and merge annotations in reactions.…
Devlin-Moyer Dec 6, 2023
67d7279
fix: update GPRs of MAR06416, MAR06419, MAR06421, and MAR06523 and re…
Devlin-Moyer Dec 8, 2023
e3073a1
fix: remove MAR02532 and MAR02534
Devlin-Moyer Dec 8, 2023
7555aa3
feat: add references for MAR06523
Devlin-Moyer Dec 8, 2023
91d3496
fix: remove MAR06329 for being redundant with MAR03149 + MAR03150 + M…
Devlin-Moyer Dec 8, 2023
9955c41
fix: remove MAR06283 for being redundant with MAR03156 + MAR03157 + M…
Devlin-Moyer Dec 8, 2023
b281b49
fix: remove MAR06279 for being redundant with MAR03157 + MAR03158 + M…
Devlin-Moyer Dec 8, 2023
df8c072
fix: remove MAR05449 for being redundant with MAR03163 + MAR03164 + M…
Devlin-Moyer Dec 8, 2023
08aa570
fix: remove MAR01009 for being redundant with MAR03885 and MAR03149,5…
Devlin-Moyer Dec 8, 2023
de0eac4
fix: removed MAR03422
Devlin-Moyer Dec 8, 2023
f44a0a2
feat: added missing thiamine and lipoyl intermediates in BCKDH and 2-…
Devlin-Moyer Dec 9, 2023
271a14c
feat: added thiamine-dependent decarboxylation reactions for BCKDH an…
Devlin-Moyer Dec 10, 2023
35b8fc1
fix: adjusted existing BCKDH and 2-OADH reactions to connect to new t…
Devlin-Moyer Dec 10, 2023
05257da
fix: removed extra space that made YAML validation fail
Devlin-Moyer Dec 10, 2023
783c84e
Merge pull request #743 from SysBioChalmers/fix/BCKDH_duplicates
haowang-bioinfo Dec 10, 2023
097881d
Merge branch 'develop' into fix/remove_MAR03422
haowang-bioinfo Dec 10, 2023
78a3d69
Merge branch 'develop' into fix/BCKDH_genes
Devlin-Moyer Dec 10, 2023
f3d3388
fix: removed MAM02158m, MAM02158n, and MAM02986n
Devlin-Moyer Dec 10, 2023
50315e8
Merge pull request #752 from SysBioChalmers/fix/remove_MAR03422
haowang-bioinfo Dec 10, 2023
2787b2c
fix: GPRs for MAR03269 and MAR03380
Devlin-Moyer Dec 10, 2023
e8bc329
fix: remove MAR00789
Devlin-Moyer Dec 13, 2023
f4a32e0
fix: remove MAR03426 and merge annotations in reactions.tsv with reac…
Devlin-Moyer Dec 13, 2023
5670d60
feat: add code to report Perox genes from beta-oxidation GPRs in mito
haowang-bioinfo Dec 13, 2023
f2561c1
Merge pull request #760 from SysBioChalmers/fix/MAR03269_MAR03380
haowang-bioinfo Dec 13, 2023
4507ee3
fix-GPR: remove ENSG00000115425 from MAR02028
haowang-bioinfo Dec 13, 2023
fa7c814
fix-GPR: remove ENSG00000115425 from MAR02148
haowang-bioinfo Dec 13, 2023
1c77562
fix-GPR: remove ENSG00000115425 from MAR02149
haowang-bioinfo Dec 13, 2023
1c3360a
fix-GPR: remove ENSG00000115425 from MAR02187
haowang-bioinfo Dec 13, 2023
f4ba4c9
fix-GPR: remove ENSG00000115425 from MAR02207
haowang-bioinfo Dec 13, 2023
82822f2
chore: merge in latest develop branch
Devlin-Moyer Dec 14, 2023
e8db446
fix: merged in latest develop branch and corrected weird annotation i…
Devlin-Moyer Dec 14, 2023
1c2060b
chore: merge in latest develop branch
Devlin-Moyer Dec 14, 2023
5e41b87
fix-GPR: remove ENSG00000113790 from GPRs of b-oxidation rxns in mito
haowang-bioinfo Dec 14, 2023
0be78ee
fix-GPR: remove ENSG00000060971 from MAR02381
haowang-bioinfo Dec 16, 2023
091f80c
fix: remove accidentally introduced typo
haowang-bioinfo Dec 16, 2023
0a8006c
fix: removed MAR00965
Devlin-Moyer Dec 18, 2023
c341127
fix: corrected GPR of MAR00970
Devlin-Moyer Dec 18, 2023
cee575c
Merge pull request #750 from SysBioChalmers/fix/BCKDH_genes
haowang-bioinfo Dec 18, 2023
04a85be
Merge pull request #768 from SysBioChalmers/fix/linolenoyl_beta_oxida…
haowang-bioinfo Dec 18, 2023
64f32f6
Merge pull request #770 from SysBioChalmers/removePeroGenesFromBetaox…
haowang-bioinfo Dec 18, 2023
bb85eec
chore: resolved merge conflicts with latest develop branch
Devlin-Moyer Dec 18, 2023
702fa11
doc: update rxnRetired column for MAR03160
haowang-bioinfo Dec 22, 2023
00a9237
Merge pull request #751 from SysBioChalmers/fix/octanoyl_beta_ox_dupes
haowang-bioinfo Dec 23, 2023
eb96739
fix: remove MAR03322 and MAR03296
haowang-bioinfo Dec 26, 2023
18a3db0
fix-gpr: remove DECR2 from mitochondria beta-oxidation reactions
haowang-bioinfo Dec 26, 2023
e6a702c
fix-gpr: remove DECR1 from peroxisome beta-oxidation reactions
haowang-bioinfo Dec 26, 2023
f337c16
doc: update rxnRetired column to genes.tsv
haowang-bioinfo Dec 26, 2023
fa52bf0
fix: remove MAR04005 due to wrong compartment assignment
haowang-bioinfo Dec 26, 2023
16d6dae
doc: sync MAR04005 removal to TSV files
haowang-bioinfo Dec 26, 2023
cc7ad31
fix: remove MAR05019 for being a duplicate of MAR03142+MAR03143
Devlin-Moyer Dec 26, 2023
a5c4cda
fix: remove MAR04967 for being a duplicate of MAR03143+MAR03144+MAR03146
Devlin-Moyer Dec 26, 2023
06ac48b
fix: remove MAR05024 for being a duplicate of MAR03142+MAR03143+MAR03…
Devlin-Moyer Dec 26, 2023
1e2fac9
fix-met: remove MAM00678c and MAM03035c
haowang-bioinfo Dec 26, 2023
3217248
fix: removed EHHADH from the GPR of MAR03288
Devlin-Moyer Dec 26, 2023
8c10adb
Merge pull request #788 from SysBioChalmers/fix/DECR1DECR2_rxnsGPRs
haowang-bioinfo Dec 27, 2023
982c5c6
Merge branch 'develop' into fix/decanoyl_beta_ox_dupes
haowang-bioinfo Dec 27, 2023
d74abc9
Merge pull request #789 from SysBioChalmers/fix/decanoyl_beta_ox_dupes
haowang-bioinfo Dec 27, 2023
2ef197e
fix: remove MAR05187 for duplicating MAR03486+MAR03386
Devlin-Moyer Dec 28, 2023
3259e90
fix: remove MAR05186 for misrepresenting catalytic activity of ALDH2A2
Devlin-Moyer Dec 28, 2023
f35687b
fix-gpr: MAR03386 and MAR03486
Devlin-Moyer Dec 28, 2023
18a27be
fix: remove KRTAP11-1 for not being an enzyme
Devlin-Moyer Dec 28, 2023
44046bf
fix: remove MAR05358 for duplicating MAR03452+MAR03453+MAR03454+MAR03455
Devlin-Moyer Dec 28, 2023
3cc9c20
fix: remove MAR05195 for duplicating MAR03456
Devlin-Moyer Dec 28, 2023
6f58fd4
fix: remove MAR05302 for duplicating MAR03457
Devlin-Moyer Dec 28, 2023
73fb11d
fix: remove MAR05229 for duplicating MAR03458
Devlin-Moyer Dec 28, 2023
d9a35a6
fix: remove MAR05208 for duplicating MAR03281+MAR03282+MAR03283
Devlin-Moyer Dec 28, 2023
5eaebad
fix: remove MAR05356 for duplicating MAR03275+MAR03277+MAR03278+MAR03279
Devlin-Moyer Dec 28, 2023
228fa53
fix: remove MAR05191 for duplicating MAR03280+MAR03281+MAR03282+MAR03283
Devlin-Moyer Dec 28, 2023
557b5e2
fix: remove MAR05123 for duplicating MAR03284+MAR03285+MAR03286+MAR03287
Devlin-Moyer Dec 28, 2023
b4d2ac4
fix: remove MAR05075 for duplicating MAR03288
Devlin-Moyer Dec 28, 2023
de7acc8
fix: remove MAR05072 for duplicating MAR03290+MAR03292+MAR03293
Devlin-Moyer Dec 28, 2023
15ea8ba
fix: replaced MAM03274m with MAM00091m in MAR05360
Devlin-Moyer Dec 28, 2023
512df3c
fix: removed MAM03274m for duplicating MAM00091m
Devlin-Moyer Dec 28, 2023
8c404a0
fix: removed MAM03653m for duplicating MAM00072m
Devlin-Moyer Dec 28, 2023
b186b22
fix: removed MAM03221m for duplicating MAM00089m
Devlin-Moyer Dec 28, 2023
b16646e
fix: removed MAM03182m for duplicating MAM03009m
Devlin-Moyer Dec 28, 2023
adc489b
fix: removed MAM03978m for duplicating MAM01576m
Devlin-Moyer Dec 28, 2023
84505c8
fix: removed MAM03203m for duplicating MAM01575m
Devlin-Moyer Dec 28, 2023
79a9c66
fix: removed MAM02698m for duplicating MAM03021m
Devlin-Moyer Dec 28, 2023
db41d5f
fix: removed MAM03650m for duplicating MAM01577m
Devlin-Moyer Dec 28, 2023
c09f91e
chore: merge in latest develop branch
Devlin-Moyer Dec 28, 2023
1fb40a6
fix a small type in rxn ID
feiranl Dec 31, 2023
7bab1e9
fix: standardized format of new GPR for MAR00970
Devlin-Moyer Dec 31, 2023
f3e706c
Merge pull request #790 from SysBioChalmers/fix/MAR03288_GPR
feiranl Jan 2, 2024
426590e
fix: consistent formatting of names of thiamine conjugates
Devlin-Moyer Jan 2, 2024
d137b60
fix: added MAM20081m to MAR03800 as a product
Devlin-Moyer Jan 2, 2024
d0e5e8b
fix: typo in KEGG ID for MAM20077m
Devlin-Moyer Jan 2, 2024
265216d
fix: formula for MAR20177 (accidentally used the new formula for MAR0…
Devlin-Moyer Jan 2, 2024
e68425d
feat-code for getting smiles and inchi of metabolites
JHL-452b Jan 4, 2024
2c71ce4
Merge pull request #753 from SysBioChalmers/fix/thiamine_in_oxoacid_d…
haowang-bioinfo Jan 8, 2024
f52107c
doc: add removed rxn id to "rxnRetired" column
haowang-bioinfo Jan 8, 2024
9ae41e9
Merge pull request #794 from SysBioChalmers/fix/remove_MAR00965
haowang-bioinfo Jan 8, 2024
a7370db
Merge branch 'develop' into fix/phytanoyl_alpha_ox
haowang-bioinfo Jan 8, 2024
5bfc037
fix: typos in refences field
mihai-sysbio Jan 8, 2024
86ad0df
chore: merge in latest develop branch
Devlin-Moyer Jan 8, 2024
4b80d74
Merge pull request #731 from JHL-452b/feat_smiles_inchi
haowang-bioinfo Jan 10, 2024
33273b1
Merge pull request #803 from SysBioChalmers/chore/pmid-typos
mihai-sysbio Jan 11, 2024
9281d58
fix: renamed MAM02746 metabolites to (3R)-phytanic acid
Devlin-Moyer Jan 28, 2024
0fb4aea
fix: brought back previously-deprecated MAM03884 metabolites and rena…
Devlin-Moyer Jan 28, 2024
0fc25b2
fix: brought back previously-deprecated MAR00659 and MAR01617
Devlin-Moyer Jan 28, 2024
a28a001
fix: renamed MAM02747 metabolites to (3R)-phytanoyl-CoA
Devlin-Moyer Jan 28, 2024
3ed81be
fix: replaced MAM02746c with MAM03884c in MAR03388
Devlin-Moyer Jan 28, 2024
ead6259
fix: made MAR03481 and MAR06779 both ATP-dependent and CoA-hydrolyzing
Devlin-Moyer Jan 28, 2024
1c955ba
fix: replaced MAM02746x with MAM03884x in MAR03389
Devlin-Moyer Jan 28, 2024
872477a
fix: renamed MAM00655x to (3R)-2-hydroxyphytanoyl-CoA
Devlin-Moyer Jan 28, 2024
a0af2aa
created MAM20083x to represent peroxisomal (2R)-pristanal
Devlin-Moyer Jan 28, 2024
0543365
fix: replaced MAM00564x with MAM20083x in MAR03486
Devlin-Moyer Jan 28, 2024
9c5e5ad
fix: renamed MAM02766 metabolites to (3S)-pristanic acid
Devlin-Moyer Jan 28, 2024
ea06b38
fix: brought back previously-deprecated MAM00077x and MAR03488 and re…
Devlin-Moyer Jan 28, 2024
0f3547a
fix: renamed MAM02766 metabolites to (2S)-pristanic acid (wrongly ren…
Devlin-Moyer Jan 28, 2024
9c5912b
fix: replaced MAM02766x with MAM00077x in MAR03493
Devlin-Moyer Jan 28, 2024
020e536
feat: created MAM00077c and MAM00077e
Devlin-Moyer Jan 28, 2024
80a1936
fix: removed deprecated MetaNetX ID from annotations for MAM02766 met…
Devlin-Moyer Jan 28, 2024
c9b7872
fix: replace MAM02766c with MAM00077c in MAR03390
Devlin-Moyer Jan 28, 2024
043f51d
feat: created MAR20179 and MAR20180 to represent export and exchange …
Devlin-Moyer Jan 28, 2024
92d5295
fix: removed MAR04774, MAR04775, and MAM02955c from YAML file
Devlin-Moyer Feb 23, 2024
e4849bd
chore: moved row for MAM02955c from model/metabolites.tsv to data/dep…
Devlin-Moyer Feb 23, 2024
235a087
chore: moved rows for MAR04774 and MAR04775 from model/reactions.tsv …
Devlin-Moyer Feb 23, 2024
b4711dd
feat: added references for tagatose metabolism reactions
Devlin-Moyer Feb 23, 2024
3c8159b
fix: GPRs for mitochondrial 2-enoyl-CoA hydratase reactions
Devlin-Moyer Feb 23, 2024
f108c6c
fix: fill in exchange reaction names
edkerk Mar 13, 2024
a0b15f6
Merge pull request #811 from SysBioChalmers/fix/exchangeRxnNames
feiranl Mar 19, 2024
055ef55
fix: undid changes to GPR of MAR03181 that should not have been made
Devlin-Moyer Apr 4, 2024
afabd27
Merge pull request #574 from SysBioChalmers/feat/AddGeneEssDepMap
feiranl Apr 23, 2024
c1b7bc5
Merge pull request #792 from SysBioChalmers/fix/phytanoyl_alpha_ox
feiranl Apr 23, 2024
bfc65ea
Merge branch 'develop' into fix/dupe_linoleoyl_gamma_linolenoyl_beta_…
feiranl Apr 23, 2024
c5d4835
Merge pull request #793 from SysBioChalmers/fix/dupe_linoleoyl_gamma_…
feiranl Apr 23, 2024
ba08540
fix: changed GPR of MAR03281 to ENSG00000084754 (which I missed earlier)
Devlin-Moyer Apr 23, 2024
189984a
chore: resolve merge conflict with latest develop branch
Devlin-Moyer Apr 23, 2024
3a16284
chore: resolve merge conflicts with latest version of develop branch
Devlin-Moyer Apr 23, 2024
56279d7
Merge pull request #808 from SysBioChalmers/fix/remove_EHHADH_from_mi…
feiranl Apr 24, 2024
be23d02
Merge pull request #807 from SysBioChalmers/fix/tagatose
feiranl Apr 24, 2024
eefe972
Merge pull request #805 from SysBioChalmers/fix/phytanic_stereo
feiranl Apr 24, 2024
34f6271
feat: added MAR20181-3 to represent transport functions of SLC25A19
Devlin-Moyer Apr 24, 2024
37bda79
fix: removed MAR01788, MAR01789, MAR04205, MAR08745, and MAR08747
Devlin-Moyer Apr 24, 2024
7930323
fix: modified unbalance reactions
JHL-452b Apr 25, 2024
56b345d
Merge pull request #816 from JHL-452b/fix-Correction_of_unbalanced_re…
feiranl Apr 26, 2024
9b68a73
Merge pull request #815 from SysBioChalmers/fix/thiamine_transport
feiranl May 6, 2024
d79466c
chore: update checkout action from v3 to v4
mihai-sysbio May 7, 2024
25c5c81
Merge pull request #818 from SysBioChalmers/chore/update-actions
feiranl May 7, 2024
2 changes: 1 addition & 1 deletion .github/workflows/check-metabolictasks.yml
@@ -11,7 +11,7 @@ jobs:
         task-type: [essential, verification]
     steps:
       - name: Checkout
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4

       - name: Check ${{ matrix.task-type }} metabolic tasks
         run: |
2 changes: 1 addition & 1 deletion .github/workflows/yaml-conversion.yml
@@ -9,7 +9,7 @@ jobs:

     steps:
       - name: Checkout
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4

       - name: Run conversion script
         run: |
2 changes: 1 addition & 1 deletion .github/workflows/yaml-validation.yml
@@ -13,7 +13,7 @@ jobs:

     steps:
       - name: Checkout
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4

       - name: YAML Lint
         uses: metabolicatlas/action-yamllint@v3
69 changes: 69 additions & 0 deletions code/DepMapGeneEss/ConvertGeneEssInputData.R
@@ -0,0 +1,69 @@
library(tidyverse)

setwd(dirname(rstudioapi::getActiveDocumentContext()$path))

d = read_csv("data/CCLE_expression_full.csv") #path is relative to the script folder set by setwd above
dim(d)#1377 52055

#transpose
ds = as_tibble(cbind(gene = names(d)[-1], t(as.matrix(d[,-1]))))
colnames(ds)[-1] = d[[1]]

#make sure the columns are numeric instead of chr
ds2 = ds
for (i in 2:ncol(ds)) {
ds2[[i]] = as.numeric(ds[[i]])

}

#convert the gene names

#To get ensembl:
#pattern = ".*\\(([A-Z0-9]*)\\)"
#newGenes = str_match(ds$gene, pattern)
#ds2$gene = newGenes[,2]
#fix the ERCC genes
#ds2$gene[is.na(newGenes[,2])] = ds$gene[is.na(newGenes[,2])];

#To get gene symbols:
#It is a bit tricky, not all genes follow the pattern. Some are like LINC00328-2P (ENSG00000225016),
#some are just an ensembl id (we then take the ensembl id), some are ERCC
newGenes = ds2$gene
x = strsplit(ds2$gene, " ") #labels without a space (ensembl-only ids, ERCC) come back unchanged
for(i in 1:length(x)) {
  newGenes[i] = x[[i]][1] #the first token handles all cases
}
length(newGenes)
length(unique(newGenes)) #not the same, we need to merge a few rows, done later

ds2$gene = newGenes

ds2


#now convert the data. It is currently as log2(TPM + 1)
dsTPM = ds2
dsTPM[,-1] = 2^ds2[,-1] - 1
colSums(dsTPM[,-1])#very minor roundoff differences, ok

#Sum up the rows that have the same gene name
duplGenes = unique(dsTPM$gene[duplicated(dsTPM$gene)])
length(duplGenes)#9
dsTPM[dsTPM$gene %in% duplGenes,1:10]
#CCDC39 is a good example to test
#is 1.59 + 0.150 in the first row, those should be summed up

rowsToRem = rep(FALSE, nrow(dsTPM))
for (i in 1:length(duplGenes)) {
inds = which(dsTPM$gene == duplGenes[i])
dsTPM[inds[1], -1] = colSums(dsTPM[inds, -1]) #the first row now gets the sum of all rows
rowsToRem[inds[-1]] = TRUE #the other rows are marked for deletion
}
sum(rowsToRem) #9, looks good
#test:
dsTPM[[2]][dsTPM$gene == "CCDC39"] #1.74 0.15, as expected, ok
dsTPM$gene[rowsToRem] # CCDC39 is in there, ok
dsTPM2 = dsTPM[!rowsToRem,]

write_tsv(dsTPM2, "data/DepMap_tpm_gene_symbols.txt")
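
Note on ConvertGeneEssInputData.R: the script takes the first space-separated token of each DepMap gene label as the gene symbol, inverts the log2(TPM + 1) transform, and sums the rows that end up with the same symbol. The Python sketch below restates that logic for reference only. The function names and the toy labels (ID_A, ID_B) are assumptions made for illustration and are not part of this PR; the two CCDC39 values are the ones quoted in the script's comments.

```python
import math


def symbol_from_label(label):
    """Return the first space-separated token of a gene label.

    Works for 'SYMBOL (ENSG...)', for bare Ensembl ids and for ERCC ids,
    because labels without a space are returned unchanged.
    """
    return label.split(" ")[0]


def merge_duplicate_genes(rows):
    """Convert log2(TPM + 1) rows to TPM and sum rows sharing a symbol.

    rows: list of (label, [log2(TPM + 1) value per sample]).
    Returns a dict mapping symbol -> [summed TPM per sample].
    """
    merged = {}
    for label, log_values in rows:
        symbol = symbol_from_label(label)
        tpm = [2.0 ** v - 1.0 for v in log_values]  # invert log2(TPM + 1)
        if symbol in merged:
            merged[symbol] = [a + b for a, b in zip(merged[symbol], tpm)]
        else:
            merged[symbol] = tpm
    return merged


# Toy input (hypothetical labels): two rows map to CCDC39, one ERCC row.
toy_rows = [
    ("CCDC39 (ID_A)", [math.log2(1.59 + 1)]),
    ("CCDC39 (ID_B)", [math.log2(0.15 + 1)]),
    ("ERCC-00002", [0.0]),
]
merged = merge_duplicate_genes(toy_rows)
print(round(merged["CCDC39"][0], 2))
```

The sum is taken after the back-transform because TPM values are additive while log2(TPM + 1) values are not.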

62 changes: 62 additions & 0 deletions code/DepMapGeneEss/PlotGeneEss.R
@@ -0,0 +1,62 @@
library(ggplot2)
library(tidyverse)

setwd(dirname(rstudioapi::getActiveDocumentContext()$path))


figPath = "figures/"


############################################
# Fig 1B - Gene essentiality
############################################

# load data
# fns = c("data/geneEss_model1.txt",
# "data/geneEss_newalg.txt",
# "data/geneEss_newalg2.txt"
# )
fns = c("data/geneEss_model1.txt")


#names = c("tINIT","ftINIT 1+0", "ftINIT 1+1")
names = c("Human2")

gea_res = NULL
for (i in 1:length(fns)) {
x = read.delim(file = fns[i], sep='\t', stringsAsFactors=F)
x$model = names[i]
gea_res = rbind(gea_res, x)
}
gea_res$model = factor(gea_res$model, as.character(names)[1:length(fns)]) # to enforce the model order


color_palette <- c('#B5D39B','#6B97BC','#E7B56C') # light green, light blue, light yellow

p1B = ggplot(gea_res, aes(x = model, y = MCC, fill = model)) +
geom_violin(trim=F, show.legend=F, scale='count') +
scale_fill_manual(values=color_palette) +
theme_classic() +
ylab('MCC') +
xlab('') +
theme(text = element_text(size=14),
axis.text.x = element_text(angle=90, hjust=1, vjust=0.5,
color='black', size=14),
axis.text.y = element_text(color='black', size=14),
axis.line.x = element_blank()) +
ylim(c(0.08,0.40)) # + #manipulate these numbers to include all data
#ylim(c(0,0.5)) # +
p1B


ggsave(
paste0(figPath, "FigGeneEss.png"),
plot = p1B,
width = 3.5, height = 3.2, dpi = 300)

ggsave(
paste0(figPath, "FigGeneEss.eps"),
plot = p1B,
width = 3.5, height = 3.2, dpi = 300)


43 changes: 43 additions & 0 deletions code/DepMapGeneEss/PrepDepMapData.m
@@ -0,0 +1,43 @@
%we are using depmap version 2021_Q3

%Prepares the DepMap data, specifically by filtering out RNA-Seq samples for which no CRISPR data exists
%cd C:/Work/MatlabCode/components/human-GEM/Human-GEMDepMapEval/Human-GEM/code/DepMapGeneEss %to be used if not running the whole file, may need to change this
cd(fileparts(which(mfilename))); %to be used if running the whole file

%% Load and prepare DepMap RNA-Seq data (cell lines)

% load RNA-Seq data from txt file
rna_data = readtable('data/DepMap_tpm_gene_symbols.txt');

% load gene essentiality data (Achilles gene effect)
ach_data = readtable('data/Achilles_gene_effect.csv');
samples = ach_data.DepMap_ID; % extract sample IDs

% filter RNA-Seq data to only include samples for which we have
% essentiality data
cellLineNames = rna_data.Properties.VariableNames;
%now replace '_' with '-'
cellLineNames = strrep(cellLineNames, '_', '-');

% readtable stores the original column heading 'ACH-001113' as 'ACH_001113', hence the replacement above

keep = ismember(cellLineNames, samples);

sum(keep) %891
sum(keep)/length(keep) % 65%, seems reasonable

% add RNA-Seq data to arrayData
arrayDataDepMap.genes = rna_data.gene;
arrayDataDepMap.tissues = cellLineNames(keep)';
arrayDataDepMap.levels = table2array(rna_data(:, keep));
arrayDataDepMap.threshold = 1;


% save tINIT inputs
save('data/arrayDataDepMap.mat','arrayDataDepMap');

%Generate ftINIT prepData - only needs to be done once. Can take up to an hour to run
model = importYaml('../../model/Human-GEM.yml');
[model.grRules, skipped] = simplifyGrRules(model.grRules, true);%takes a few minutes to run
prepData = prepHumanModelForftINIT(model, true, '../../data/metabolicTasks/metabolicTasks_Essential.txt', '../../model/reactions.tsv');
save('data/prepDataGeneSymbols.mat', 'prepData')
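
Note on PrepDepMapData.m: the filtering step maps each expression column name back to a DepMap ID by replacing '_' with '-' and keeps only the columns that also appear in the CRISPR gene effect table. A minimal Python sketch of that step follows. The function name is hypothetical, and the second cell line ID in the toy input is made up for illustration.

```python
def filter_expression_columns(column_names, crispr_sample_ids):
    """Return the indices of expression columns that have CRISPR data.

    column_names: column headings of the expression table, with '_' where
        the DepMap ID has '-' (for example 'ACH_001113').
    crispr_sample_ids: DepMap IDs present in the gene effect table.
    """
    wanted = set(crispr_sample_ids)
    keep = []
    for idx, name in enumerate(column_names):
        depmap_id = name.replace("_", "-")  # 'ACH_001113' -> 'ACH-001113'
        if depmap_id in wanted:
            keep.append(idx)
    return keep


# Toy input: the 'gene' column and one cell line without CRISPR data drop out.
columns = ["gene", "ACH_001113", "ACH_000001"]
crispr_ids = ["ACH-001113"]
kept = filter_expression_columns(columns, crispr_ids)
print(kept)
```

The 'gene' column is excluded without special handling because it never matches a sample ID. The blanket replacement assumes that no genuine DepMap ID contains an underscore.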
Empty file.
46 changes: 46 additions & 0 deletions code/DepMapGeneEss/enrichment_test.m
@@ -0,0 +1,46 @@
%This file is for convenience copied from the Human1 paper.
function [penr,pdep] = enrichment_test(pop,sample,successes)
%enrichment_test calculates p-value of enrichment of successes in a sample.
%
% [PENR,PDEP] = enrichment_test(POP,SAMPLE,SUCCESSES) evaluates the
% significance of enrichment (and depletion) of SUCCESSES in a SAMPLE drawn
% from population POP using the hypergeometric test.
%
%
%--------------------------------- INPUTS ---------------------------------
%
% pop Vector of genes comprising the population from
% which samples are drawn.
%
% sample Vector of genes sampled from the population.
%
% successes Vector of genes in the population that are defined as
% "successes".
%
% The function will test if these "success" genes are
% significantly depleted or enriched in the sample, given
% that they were drawn from the population.
%
% EXAMPLE: If a metabolite set enrichment analysis (MSEA) is being
% performed, then SAMPLE is the list of metabolites of interest,
% and SUCCESSES is a metabolite set (e.g., TCA cycle metabolites).
% POP is the list of all metabolites from which SAMPLE and
% SUCCESSES are drawn.
%
%--------------------------------- OUTPUTS --------------------------------
%
% penr p-value associated with enrichment of successes in sample.
%
% pdep p-value associated with depletion of successes in sample.
%
%
% ***** WARNING: P-VALUES ARE NOT ADJUSTED FOR FALSE DISCOVERY RATE! *****
%

x = numel(intersect(successes,sample)); % calc # of successes in sample
m = numel(pop); % calc size of population
k = numel(intersect(successes,pop));
n = numel(sample);

penr = hygecdf(x-1,m,k,n,'upper');
pdep = hygecdf(x,m,k,n);
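
Note on enrichment_test.m: penr is the upper tail P(X >= x) and pdep is the lower tail P(X <= x) of a hypergeometric distribution with population size m, k successes in the population, and n draws. The Python sketch below reproduces both tails with exact binomial coefficients so that the two hygecdf calls can be checked by hand on a small case. It is an illustration, not the PR's implementation, and it assumes that the inputs contain no duplicates and that the sample is drawn from pop; the MATLAB function uses numel and makes neither check.

```python
from math import comb


def enrichment_test(pop, sample, successes):
    """Hypergeometric enrichment and depletion p-values.

    Returns (penr, pdep) where penr = P(X >= x) and pdep = P(X <= x),
    with x the number of successes observed in the sample.
    """
    pop, sample, successes = set(pop), set(sample), set(successes)
    x = len(successes & sample)  # successes observed in the sample
    m = len(pop)                 # population size
    k = len(successes & pop)     # successes in the population
    n = len(sample)              # sample size

    lowest = max(0, n - (m - k))  # fewest successes a sample can contain
    highest = min(k, n)           # most successes a sample can contain
    total = comb(m, n)
    pmf = {
        i: comb(k, i) * comb(m - k, n - i) / total
        for i in range(lowest, highest + 1)
    }
    penr = sum(p for i, p in pmf.items() if i >= x)
    pdep = sum(p for i, p in pmf.items() if i <= x)
    return penr, pdep


# Toy case: 10 genes, 3 successes, 4 sampled, 2 of the sampled are successes.
penr, pdep = enrichment_test(range(10), [0, 1, 5, 6], [0, 1, 2])
print(penr, pdep)
```

For this toy case the tails are 70/210 and 203/210. They overlap at X = x, so they sum to more than one.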