OntoGPT

Introduction

OntoGPT is a Python package for extracting structured information from text with large language models (LLMs), instruction prompts, and ontology-based grounding.

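Two strategies for knowledge extraction are currently implemented in OntoGPT: SPIRES (Structured Prompt Interrogation and Recursive Extraction of Semantics), a zero-shot information extraction approach, and SPINDOCTOR, a gene set summarization approach. Both are referenced in the Citation section.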

For more details, please see the full documentation.

Quick Start

OntoGPT runs on the command line, though there's also a minimal web app interface (see Web Application section below).

Ensure you have Python 3.9 or greater installed.
Install with pip:
```
pip install ontogpt
```
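
If installation fails, the interpreter version is worth checking first. The following is a minimal sketch of that check using only the standard library; the helper name `meets_minimum` is an illustrative assumption and not part of OntoGPT.

```python
import sys

# Minimum version stated in this README.
MINIMUM_PYTHON = (3, 9)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) pair is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} satisfies the 3.9+ requirement.")
    else:
        print(f"Python {current} is too old; OntoGPT needs 3.9 or greater.")
```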

Set your OpenAI API key:

```
runoak set-apikey -e openai <your openai api key>
```

See the list of all OntoGPT commands:
```
ontogpt --help
```
Try a simple example of information extraction:
```
echo "One treatment for high blood pressure is carvedilol." > example.txt
ontogpt extract -i example.txt -t drug
```
OntoGPT will retrieve the necessary ontologies and output results to the command line. Your output will provide all extracted objects under the heading extracted_object.
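
The same extraction can be driven from a Python script. The sketch below shells out to the `ontogpt extract` command shown above and then pulls out the lines under the `extracted_object` heading. The function names are hypothetical helpers, not part of OntoGPT's API, and the parsing assumes that the heading is an unindented line whose contents are the indented lines that follow; adjust it if your output is laid out differently.

```python
import shutil
import subprocess
from pathlib import Path


def build_extract_command(input_path, template):
    """Build the same command line shown in the Quick Start."""
    return ["ontogpt", "extract", "-i", str(input_path), "-t", template]


def extracted_object_section(output_text):
    """Return the lines under the `extracted_object` heading.

    Assumes the heading is an unindented line and that its contents are
    the indented lines that follow, up to the next unindented line.
    """
    section = []
    inside = False
    for line in output_text.splitlines():
        if line.startswith("extracted_object:"):
            inside = True
            continue
        if inside:
            if line.strip() and not line[0].isspace():
                break  # the next top-level heading ends the section
            if line.strip():
                section.append(line.strip())
    return section


def run_extract(text, template="drug", input_path="example.txt"):
    """Write `text` to a file, run the extraction, and return its section."""
    if shutil.which("ontogpt") is None:
        return None  # OntoGPT is not installed or not on PATH
    Path(input_path).write_text(text + "\n", encoding="utf-8")
    result = subprocess.run(
        build_extract_command(input_path, template),
        capture_output=True,
        text=True,
        check=True,
    )
    return extracted_object_section(result.stdout)
```

Calling `run_extract("One treatment for high blood pressure is carvedilol.")` mirrors the two shell commands above. It returns `None` when the `ontogpt` executable is not found, and raises `subprocess.CalledProcessError` if the command exits with an error.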

Web Application

There is a bare-bones web application for running OntoGPT and viewing results.

First, install the required dependencies with pip by running the following command:

```
pip install ontogpt[web]
```

Then run this command to start the web application:

```
web-ontogpt
```

NOTE: We do not recommend hosting this webapp publicly without authentication.

Evaluations

OntoGPT's functions have been evaluated on test data. Please see the full documentation for details on these evaluations and how to reproduce them.

Citation

The information extraction approach used in OntoGPT, SPIRES, is described further in: Caufield JH, Hegde H, Emonet V, Harris NL, Joachimiak MP, Matentzoglu N, et al. Structured prompt interrogation and recursive extraction of semantics (SPIRES): A method for populating knowledge bases using zero-shot learning. arXiv publication: http://arxiv.org/abs/2304.02711

The gene summarization approach used in OntoGPT, SPINDOCTOR, is described further in: Joachimiak MP, Caufield JH, Harris NL, Kim H, Mungall CJ. Gene Set Summarization using Large Language Models. arXiv publication: http://arxiv.org/abs/2305.13338

Acknowledgements

This project is part of the Monarch Initiative. We also gratefully acknowledge Bosch Research for their support of this research project.
