Build SceneInstruct Dataset

Please follow the instructions below to reproduce the procedure of building SceneInstruct.

Model Preparation

Llama-3.1-70B-Instruct: You can download the weights of Llama-3.1-70B-Instruct at HF Repo. To serve Llama-3.1-70B-Instruct with vLLM:
```
vllm serve <Llama-3.1-70B path> --tensor_parallel_size 2
```
OpenAI API key: Create a file openai_key and add your API key.

Run the following command:

python create_descriptions.py \
    --num-prompts-needed 3000 # the number of new descriptions to be created

Run the following command:

python collect_before_assign_placement.py
python collect_assign_placement.py

The generated SceneInstruct dataset is saved in three files: data_prompt_assign_placement.jsonl, data_prompt_check_positional_error.jsonl, and data_prompt_fix_positional_error.jsonl.