Open Source Institute-Cognitive System of Machine Intelligent Computing (OpenSI-CoSMIC)


Official Implementation


This is the official implementation of the Open Source Institute-Cognitive System of Machine Intelligent Computing (OpenSI-CoSMIC) v1.0.0.

Installation

# For users using SSH on GitHub
git clone --recursive [email protected]:TheOpenSI/CoSMIC.git

# For users using GitHub account and token
git clone --recursive https://github.com/TheOpenSI/CoSMIC.git

Users need to download the Stockfish binary (stockfish-ubuntu-x86-64-avx2 for Linux) for chess-game queries and store it at the default path, "third_party/stockfish/stockfish-ubuntu-x86-64-avx2". The path of this binary can be changed in config.yaml as

chess:
  stockfish_path: ""  # set a custom path in ""; if empty, the default path is used.
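The fallback behaviour described above can be sketched as follows. Note that `resolve_stockfish_path` and `DEFAULT_STOCKFISH` are illustrative names for this sketch, not identifiers from the repository:

```python
import yaml  # PyYAML, listed in requirements.txt

# Default location from the installation instructions above.
DEFAULT_STOCKFISH = "third_party/stockfish/stockfish-ubuntu-x86-64-avx2"

def resolve_stockfish_path(config_text: str) -> str:
    """Read chess.stockfish_path from a config.yaml snippet, falling back
    to the default binary location when the field is empty or missing."""
    config = yaml.safe_load(config_text) or {}
    path = (config.get("chess") or {}).get("stockfish_path") or ""
    return path if path else DEFAULT_STOCKFISH

# An empty string in config.yaml means the default path is used.
print(resolve_stockfish_path('chess:\n  stockfish_path: ""'))
```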

Requirements

Please install the following packages before using this code; they are also listed in requirements.txt. Users need to register for a Hugging Face account (set hf_token=[your token] in .env) to download base LLMs, and an OpenAI account (set openai_token=[your token] in .env) to use the API if applicable.

To use Ollama models, set the LLM name in config.yaml with the "ollama:" prefix, i.e., llm_name: ollama:[your ollama model name]. If an Ollama model has not yet been pulled to a local directory, pulling it may take a few minutes, depending on the model size.
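Splitting the "ollama:" prefix off an LLM name can be sketched as below; `parse_llm_name` is a hypothetical helper and "llama3" is just an example model name, neither taken from the repository:

```python
def parse_llm_name(llm_name: str) -> tuple:
    """Split an llm_name into (backend, model): names with the 'ollama:'
    prefix are routed to Ollama, anything else to the default backend."""
    prefix = "ollama:"
    if llm_name.startswith(prefix):
        return "ollama", llm_name[len(prefix):]
    return "default", llm_name

print(parse_llm_name("ollama:llama3"))             # ('ollama', 'llama3')
print(parse_llm_name("mistral-7b-instruct-v0.1"))  # ('default', 'mistral-7b-instruct-v0.1')
```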

huggingface_hub==0.24.0
setuptools==75.1.0
chess==1.10.0
stockfish==3.28.0
bitsandbytes==0.43.1
faiss-cpu==1.8.0
imageio==2.34.2
langchain==0.2.14
langchain_community==0.2.12
langchain_huggingface==0.0.3
llama_index==0.11.1
matplotlib==3.7.5
numpy==1.24.3
openai==1.42.0
pandas==2.0.3
peft==0.11.1
Pillow==10.4.0
python-dotenv==1.0.1
pytz==2024.1
torch==2.3.0
transformers==4.42.4
python-box==7.2.0
PyYAML==6.0.2
regex==2024.5.15

To use the "code generation and evaluation" service, users need to install Docker via

apt install docker.io
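Before relying on that service, it can be useful to check from Python that the docker CLI is actually on PATH; `docker_available` is an illustrative helper for this sketch, not part of the repository:

```python
import shutil
import subprocess

def docker_available() -> bool:
    """Return True when a usable docker CLI is found on PATH."""
    if shutil.which("docker") is None:
        return False
    try:
        # "docker --version" succeeds even when the daemon is not running.
        subprocess.run(["docker", "--version"], check=True, capture_output=True)
        return True
    except (OSError, subprocess.CalledProcessError):
        return False

print(docker_available())
```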

Framework

The system is configured through config.yaml. Currently, it has five base services, including

Each query will be parsed by an LLM-based analyser to select the most relevant service.

Upper-level chess-game services include

Get Started

[General User] Chatbot

We provide a web-based chatbot for interaction between the user and OpenSI-CoSMIC. The backend program is executed in a Docker container. Start it by running

touch .env   # if the OpenAI GPT API is used, add OPENAI_API_KEY="[your API key]" to .env.
bash run_chatbot.sh

with the configuration settings in

This chatbot is built on the open-source Open-WebUI, which is distributed under the MIT license.

Development

The default LLM for both QA and the query analyser is "gpt-4o"; one can change them in config.yaml. The full list of supported LLMs is provided in LLM_MODEL_DICT.

  • We demonstrate the use of OpenSI-CoSMIC below.

    # Quit by entering quit or exit.
    python demo.py
  • Alternatively, one can use the following development instruction.

    from src.opensi_cosmic import OpenSICoSMIC
    from utils.log_tool import set_color
    
    # Build the system with a config file, which contains LLM name, or a given base LLM name.
    use_config_file = True
    
    if use_config_file:
        config_path = "scripts/configs/config.yaml"
        opensi_cosmic = OpenSICoSMIC(config_path=config_path)
    else:
        llm_name = "mistral-7b-instruct-v0.1"
        opensi_cosmic = OpenSICoSMIC(llm_name=llm_name)
    
    # Set the question.
    # One can set each question with "[question],[reference answer (optional)]" in a .csv file.
    query = "What is the capital of Australia?"
    
    # Get the answer, raw_answer for response without truncation, retrieve_score (if switched on) for
    # the similarity score to context in the system's vector database.
    answer, raw_answer, retrieve_score = opensi_cosmic(query, log_file=None)
    
    # Print the answer.
    print(set_color("info", f"Question: {query}\nAnswer: {answer}."))
    
    # Remove memory cached in the system.
    opensi_cosmic.quit()

    More example questions are provided in test.csv, which can be used as

    import os, csv
    import pandas as pd
    
    from src.opensi_cosmic import OpenSICoSMIC
    from utils.log_tool import set_color
    
    # Build the system with a given base LLM.
    llm_name = "mistral-7b-instruct-v0.1"
    opensi_cosmic = OpenSICoSMIC(llm_name=llm_name)
    
    # Get the file's absolute path.
    current_dir = os.path.dirname(os.path.abspath(__file__))
    root = f"{current_dir}"
    
    # Set a bunch of questions, can also read from .csv.
    df = pd.read_csv(f"{root}/data/test.csv")
    queries = df["Question"]
    answers = df["Answer"]
    
    # Loop over questions to get the answers.
    for idx, (query, gt) in enumerate(zip(queries, answers)):
        # Skip marked questions.
        if query.find("skip") > -1: continue
    
        # Create a log file.
        if query.find(".csv") > -1:
            # Remove all whitespace.
            query = query.replace(" ", "")
    
            # Return if file is invalid.
            if not os.path.exists(query):
                print(set_color("error", f"!!! Error, {query} does not exist."))
                continue
    
            # Change the data folder to results for log file.
            log_file = query.replace("/data/", f"/results/{llm_name}/")
    
            # Create a folder to store log file.
            log_file_name = log_file.split("/")[-1]
            log_dir = log_file.replace(log_file_name, "")
            os.makedirs(log_dir, exist_ok=True)
            log_file_pt = open(log_file, "w")
            log_file = csv.writer(log_file_pt)
        else:
            log_file_pt = None
            log_file = None
    
        # Run for each question/query, return the truncated response if applicable.
        answer, _, _ = opensi_cosmic(query, log_file=log_file)
    
        # Print the answer.
        if isinstance(gt, str):  # compare with GT string
            # Check whether the ground truth appears in the answer.
            status = "success" if (answer.find(gt) > -1) else "fail"
    
            print(set_color(
                status,
                f"\nQuestion: '{query}' with GT: {gt}.\nAnswer: '{answer}'.\n")
            )
    
        # Close log file pointer.
        if log_file_pt is not None:
            log_file_pt.close()
        
    # Remove memory cached in the system.
    opensi_cosmic.quit()

Reference

If this repository is useful for you, please cite the paper below.

@misc{Adnan2024,
    title         = {Unleashing Artificial Cognition: Integrating Multiple AI Systems},
    author        = {Muntasir Adnan and Buddhi Gamage and Zhiwei Xu and Damith Herath and Carlos C. N. Kuhn},
    howpublished  = {Australasian Conference on Information Systems},
    year          = {2024}
}

Contact

For technical support, please contact Danny Xu or Muntasir Adnan. For project support, please contact Carlos C. N. Kuhn.

Contributing

We welcome contributions from the community! Whether you’re a researcher, developer, or enthusiast, there are many ways to get involved:

  • Report Issues: Found a bug or have a feature request? Open an issue on our GitHub page.
  • Submit Pull Requests: Contribute code by submitting pull requests. Please follow our contribution guidelines.
  • Make a Donation: Support our project by making a donation here.

License

This code is distributed under the MIT license. If Mistral 7B v0.1, Mistral 7B Instruct v0.1, Gemma 7B, or Gemma 7B It from Hugging Face is used, please also follow the license of Hugging Face; if the API of GPT-3.5 Turbo or GPT-4o from OpenAI is used, please also follow the license of OpenAI.

Funding

This project is funded under the agreement with the ACT Government for Future Jobs Fund with Open Source Institute (OpenSI)-R01553 and NetApp Technology Alliance Agreement with OpenSI-R01657.
