dgg32/kegg_disease

Introduction

This repository contains code and data for my article "Neo4j for Diseases".

  1. The scripts are for data processing.

  2. The data folder contains the five CSV files to be imported into Neo4j.

Prerequisite

Neo4j Desktop

Run

The data folder contains data from 2021. To download the newest data, follow these steps:

  1. Download the KEGG data through its API:
python download_various_kegg.py ds [kegg_download_folder]
  2. Generate the nodes and edges:
python parse_disease.py [kegg_download_folder]
  3. Add taxonomy to the pathogens. Step 2 generates a file called pathogen_tmp.csv. Add the taxonomy to it with:
python add_taxonomy.py pathogen_tmp.csv > pathogen.csv
  4. Put all the CSV files, except pathogen_tmp.csv, into the Import folder of your Neo4j project. Then follow the instructions in the article.
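
The steps above can also be chained in a small driver script. The following is only a minimal sketch, not part of this repository: the function names (build_commands, csv_files_to_import, run_pipeline) are hypothetical, and it assumes that the three scripts are run from the repository root and that they write their CSV files to the current working directory. Adjust the paths if your setup differs.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def build_commands(download_dir):
    """Return the script invocations for steps 1-3, in order."""
    py = sys.executable
    return [
        [py, "download_various_kegg.py", "ds", str(download_dir)],
        [py, "parse_disease.py", str(download_dir)],
        [py, "add_taxonomy.py", "pathogen_tmp.csv"],
    ]


def csv_files_to_import(paths):
    """Keep only CSV files, skipping the intermediate pathogen_tmp.csv (step 4)."""
    return sorted(
        p for p in map(Path, paths)
        if p.suffix == ".csv" and p.name != "pathogen_tmp.csv"
    )


def run_pipeline(download_dir, neo4j_import_dir):
    """Run steps 1-4. Assumes the CSV files land in the current directory."""
    download_cmd, parse_cmd, taxonomy_cmd = build_commands(download_dir)
    subprocess.run(download_cmd, check=True)
    subprocess.run(parse_cmd, check=True)
    # Step 3 prints to stdout, so redirect it into pathogen.csv.
    with open("pathogen.csv", "w") as out:
        subprocess.run(taxonomy_cmd, check=True, stdout=out)
    # Step 4: copy everything except pathogen_tmp.csv into the Import folder.
    for csv_path in csv_files_to_import(Path.cwd().iterdir()):
        shutil.copy(csv_path, neo4j_import_dir)
```

Calling run_pipeline("kegg_download", "/path/to/neo4j/import") would then run all four steps; both arguments are placeholders for your own folders.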

Authors

  • Sixing Huang - Concept and Coding

License

This project is licensed under the MIT License. See the LICENSE file for details.
