UniDM: A Unified Framework for Data Manipulation with Large Language Models

This is the official repo for UniDM: A Unified Framework for Data Manipulation with Large Language Models.

Install

Download and unpack the dataset:

mkdir dataset
wget https://fm-data-tasks.s3.us-west-1.amazonaws.com/datasets.tar.gz -P dataset
tar xvf dataset/datasets.tar.gz -C dataset/

Prerequisites

pip install -r requirements.txt

# Manifest
git clone [email protected]:HazyResearch/manifest.git
cd manifest
pip install -e .

Run

To run inference, use

python inference.py --help

Some examples are as follows: (The API KEY can be obtained by registering with the LLM provider. For instance, if you want to run inference with the OpenAI API models, create an account here.)

To run data imputation task select 3 example to add to the prompt by the auto-retrieve module, using both the adaptive data parsing and prompt engineering modules,

python inference.py \
    --api_key <YOUR API KEY> \
    --data_dir <DATA DIR> \
    --task data_imputation \
    --instance_num 3 \
    --metadata_wise \
    --instance_wise \
    --data_parsing \
    --prompt_engineering

To run entity resolution task and select 3 examples to add to the prompt, using both the adaptive data parsing and prompt engineering modules,

python inference.py \
    --api_key <YOUR API KEY> \
    --data_dir <DATA DIR> \
    --task entity_resolution \
    --context_num 3 \
    --metadata_wise \
    --instance_wise \
    --data_parsing \
    --prompt_engineering

To run data transformation task and select 3 examples to add to the prompt, using both the adaptive data parsing and prompt engineering modules,

python inference.py \
    --api_key <YOUR API KEY> \
    --data_dir <DATA DIR> \
    --task data_transformation \
    --context_num 3 \
    --data_parsing \
    --prompt_engineering

Notes

The data retrieval may take time. When inference, we restore the retrieval scores in the ret_score folder intermediately. And we provide the retrieval scores of examples above for quick verification.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
model		model
ret_score		ret_score
utils		utils
README.md		README.md
UniDM_appendix.pdf		UniDM_appendix.pdf
inference.py		inference.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

UniDM: A Unified Framework for Data Manipulation with Large Language Models

Install

Download and unpack the dataset:

Prerequisites

Run

Notes

About

Releases

Packages

Languages

mx54039q/UniDM

Folders and files

Latest commit

History

Repository files navigation

UniDM: A Unified Framework for Data Manipulation with Large Language Models

Install

Download and unpack the dataset:

Prerequisites

Run

Notes

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages