EmergeTechInc/graphrag-examples

 
 

GraphRAG Examples

This repo contains a Streamlit app for introducing and teaching example GraphRAG patterns.

Running the Streamlit App

Follow the steps below to run the sample app:

1. Get an OpenAI API Key

The sample app uses OpenAI to demonstrate embedding and LLM capabilities. To get an OpenAI API key:

  1. Create an OpenAI account if you don't have one already. Otherwise, sign in.
  2. Navigate to the API key page and select "Create new secret key", optionally naming the key. Save the key somewhere safe, and do not share it with anyone.

2. Load the Data

This app uses two datasets:

  1. The classic Northwind Database: Sales data for Northwind Traders, a fictitious specialty foods export/import company.
  2. A sample of the H&M Fashion Dataset: Real-world retail data, including customer purchases and rich product information such as names, types, descriptions, and department sections.

The app has 4 pages in total, reflecting 4 GraphRAG patterns. Each page relies on one of the above datasets:

| Page / GraphRAG Pattern | Dataset Used | Pattern Description |
| --- | --- | --- |
| Vector Search With Graph Context | Northwind | Use graph traversals to retrieve items related to vector search results |
| Text2Cypher | Northwind | Convert natural language prompts to explicit Cypher queries for retrieval |
| Graph Vectors | H&M Fashion Dataset | Use graph embeddings for retrieval, incorporating both structured and unstructured data in vector similarity search |
| Graph Filtering | H&M Fashion Dataset | Use graph patterns and properties to pre/post filter vector search results (can also include hybrid search) |
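As a rough illustration of the first pattern, the sketch below builds a Cypher query that runs a vector index search and then follows one relationship from each hit. It is not the query the app uses: the index name, the `PART_OF` relationship, the `Category` label, and the property names are all hypothetical stand-ins, and only the `db.index.vector.queryNodes` procedure is a real Neo4j call (available in Neo4j 5.11 and later).

```python
def vector_search_with_context(index_name, top_k, embedding):
    """Return (query, params) for a vector search followed by one graph hop."""
    query = (
        # Step 1: nearest-neighbour lookup against a vector index.
        "CALL db.index.vector.queryNodes($index_name, $top_k, $embedding) "
        "YIELD node, score "
        # Step 2: graph context. The relationship type, label, and
        # property names below are assumptions, not the repo's schema.
        "MATCH (node)-[:PART_OF]->(category:Category) "
        "RETURN node.productName AS product, score, "
        "category.categoryName AS category "
        "ORDER BY score DESC"
    )
    params = {"index_name": index_name, "top_k": top_k, "embedding": embedding}
    return query, params


# "product_text_embeddings" is a made-up index name for illustration.
query, params = vector_search_with_context("product_text_embeddings", 5, [0.1, 0.2])
```

The query and parameters would then be passed to a Neo4j driver session; passing values as parameters rather than formatting them into the string keeps the query cacheable and avoids injection issues.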

For the entire app to work, each dataset must be loaded into its own Neo4j database. If you choose not to load one of the datasets, the associated pages will not function, which may be acceptable if those pages are not of interest to you.
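The dependency between pages and datasets can be written down as a small lookup. This is only an illustrative sketch built from the page and dataset names listed above; the `PAGE_DATASET` mapping and `working_pages` helper are hypothetical and not part of the app.

```python
# Which dataset each page relies on, taken from the list of pages above.
PAGE_DATASET = {
    "Vector Search With Graph Context": "Northwind",
    "Text2Cypher": "Northwind",
    "Graph Vectors": "H&M Fashion Dataset",
    "Graph Filtering": "H&M Fashion Dataset",
}


def working_pages(loaded_datasets):
    """Return the pages whose dataset has been loaded, in page order."""
    loaded = set(loaded_datasets)
    return [page for page, dataset in PAGE_DATASET.items() if dataset in loaded]


print(working_pages(["Northwind"]))
# prints ['Vector Search With Graph Context', 'Text2Cypher']
```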

To Load Northwind:

  1. Create an empty database on a Neo4j deployment type of your choosing. Good options include a blank Neo4j Sandbox or an Aura Free instance.
  2. Run the Cypher from load-data/northwind-data.cypher on that database through Neo4j Browser. At the top of that script, you will need to replace <your OpenAI API Key> with your own OpenAI API key.
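If you prefer not to edit the script by hand, the placeholder substitution can be done in a few lines of Python. This is a minimal sketch, assuming the placeholder appears in the script exactly as written above; the `fill_api_key` helper and the sample line are made up for illustration.

```python
PLACEHOLDER = "<your OpenAI API Key>"


def fill_api_key(script_text, api_key):
    """Return the script text with the placeholder replaced by the key."""
    if PLACEHOLDER not in script_text:
        # Fail loudly rather than silently running a script with no key.
        raise ValueError("placeholder not found in script")
    return script_text.replace(PLACEHOLDER, api_key)


# Made-up one-line stand-in for the top of the real script.
sample = "// OpenAI key: <your OpenAI API Key>"
print(fill_api_key(sample, "sk-..."))
# prints // OpenAI key: sk-...
```

In practice you would read load-data/northwind-data.cypher into a string, apply the helper, and paste the result into Neo4j Browser. Avoid writing the filled-in script back to a tracked file, since that would put the key on disk.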

To Load the H&M Fashion Dataset:

  1. This dataset relies on graph machine learning, so you will need to create an empty Neo4j database with Graph Data Science (GDS) enabled. There is no Aura Free option for this. Two good options are:
    • (free) Start a blank Graph Data Science Neo4j Sandbox, which should be sufficient for learning and exploration.
    • (paid) Use an AuraDS instance. This costs $1.00 USD per hour but should run significantly faster for loading, indexing, querying, and running GDS algorithms.
  2. Run the notebook load-data/hm-data.ipynb. It will attempt to read Neo4j and OpenAI credentials from a secrets.toml file. You can create that file per the directions below or replace the lookups with hard-coded credentials in the notebook.

3. Configure App and Environment

  1. Create a secrets.toml file using secrets.toml.example as a template:

    cp .streamlit/secrets.toml.example .streamlit/secrets.toml
    vi .streamlit/secrets.toml
  2. Fill in the credentials below in the secrets.toml file:

    # OpenAI
    OPENAI_API_KEY = "sk-..."
    
    # NEO4J
    NORTHWIND_NEO4J_URI = "neo4j+s://<xxxxx>.databases.neo4j.io"
    NORTHWIND_NEO4J_USERNAME = "neo4j"
    NORTHWIND_NEO4J_PASSWORD = "<password>"
    
    HM_NEO4J_URI = "neo4j+s://<xxxxx>.databases.neo4j.io"
    HM_NEO4J_USERNAME = "neo4j"
    HM_NEO4J_PASSWORD = "<password>"
    HM_AURA_DS = false
  3. Install requirements (recommended in an isolated Python virtual environment):

    pip install -r requirements.txt

4. Run the App

Run the app with the command: streamlit run Home.py --server.port=80
