From b29722ab2a5522eb98a500aaca2fdbdc634634c6 Mon Sep 17 00:00:00 2001
From: Payal Chandak
Date: Wed, 27 Jul 2022 21:36:22 -0400
Subject: [PATCH] Update README.md

---
 README.md | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/README.md b/README.md
index b16c1d1..b35206c 100644
--- a/README.md
+++ b/README.md
@@ -90,11 +90,11 @@ pykeen.datasets.has_dataset('primekg')
 
 ## Building an updated PrimeKG
 
-#### Downloading primary data resources
+### Downloading primary data resources
 
 All persistent identifiers and weblinks to download the 20 primary data resources used to build PrimeKG are systematically provided in the Data Records section of our article. We have also mentioned the exact filenames that were downloaded from each resource for easy corroboration.
 
-#### Curating primary data resources
+### Curating primary data resources
 
 We provide the scripts used to process all primary data resources and the names of the resulting output files generated by those scripts. We would be happy to share the intermediate processing datasets that were used to create PrimeKG on request.
 
@@ -119,26 +119,14 @@ UBERON | uberon.py | uberon_terms.csv, uberon_rels.csv, uberon_is_a.csv
 UMLS | umls.py, map_umls_mondo.py | umls_mondo.csv
 UMLS | umls.ipynb | umls_def_disorder_2021.csv, umls_def_disease_2021.csv
 
-#### Harmonizing datasets into PrimeKG
+### Harmonizing datasets into PrimeKG
 
 The code to harmonize datasets and construct PrimeKG is available at `build_graph.ipynb`. Simply run this jupyter notebook in order to construct the knowledge graph form the outputs of the processing files mentioned above. This jupyter notebook produces all three versions of PrimeKG, `kg_raw.csv`, `kg_giant.csv`, and the complete version `kg.csv`.
 
-#### Feature extraction
+### Feature extraction
 
 The code required to engineer features can be found at `engineer_features.ipynb` and `mapping_mayo.ipynb`.
 
-
-
 ## Cite Us
 
 If you find PrimeKG useful, cite our work: