Enabling robots to navigate by following diverse language instructions in unexplored environments is an attractive goal for human-robot interaction. In this work, we propose InstructNav, a generic instruction navigation system. InstructNav makes the first endeavor to handle various instruction navigation tasks without any navigation training or pre-built maps. To reach this goal, we introduce Dynamic Chain-of-Navigation (DCoN) to unify the planning process for different types of navigation instructions. Furthermore, we propose Multi-sourced Value Maps to model the key elements in instruction navigation, so that linguistic DCoN plans can be converted into trajectories the robot can act on.
With InstructNav, we complete the R2R-CE task in a zero-shot way for the first time and outperform many methods that require task-specific training. InstructNav also surpasses the previous SOTA by 10.48% on zero-shot Habitat ObjNav and by 86.34% on demand-driven navigation (DDN). Real-robot experiments in diverse indoor scenes further demonstrate our method's robustness to environment and instruction variations. Please refer to our paper (arXiv:2406.04882) for more details.
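To give a rough intuition for how value maps can turn a language-level plan into an actionable goal, below is a purely illustrative Python/NumPy sketch (not the paper's implementation): several per-cell value maps over a 2D grid are fused by a weighted sum, and the highest-valued traversable cell is chosen as the next waypoint. The map names, weights, and fusion rule here are assumptions made only for illustration.

import numpy as np

# Illustrative only -- NOT the paper's implementation.
# Fuse several 2D value maps over a grid and pick the best free cell as the next waypoint.
H, W = 64, 64
semantic_value = np.random.rand(H, W)    # e.g. similarity to the instructed target
trajectory_value = np.random.rand(H, W)  # e.g. consistency with the planned route
frontier_value = np.random.rand(H, W)    # e.g. proximity to unexplored frontiers
free_space = np.random.rand(H, W) > 0.3  # traversability mask (hypothetical)

fused = 1.0 * semantic_value + 0.5 * trajectory_value + 0.3 * frontier_value
fused[~free_space] = -np.inf             # never select an occupied cell

waypoint = np.unravel_index(np.argmax(fused), fused.shape)
print("Next waypoint (row, col):", waypoint)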
- 2024.9.11: The HM3D objnav benchmark code is released.
- 2024.9.5: Our paper is accepted by CoRL 2024. Code will be released soon.
Our project is built on habitat-sim and habitat-lab. Please follow their installation guides to set them up in your Python environment; you can directly install the latest versions of both. Also make sure you have properly downloaded the navigation scenes (HM3D, MP3D) and the episode datasets for both vision-and-language navigation (VLN-CE) and object navigation.
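After installation, a minimal import test like the following (not part of our codebase, just a sanity check) confirms that habitat-sim and habitat-lab are importable from your environment:

# Quick sanity check that habitat-sim and habitat-lab are importable.
import habitat
import habitat_sim

print("habitat-lab:", getattr(habitat, "__version__", "unknown"))
print("habitat-sim:", getattr(habitat_sim, "__version__", "unknown"))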
Firstly, clone our repo as:
git clone https://github.com/LYX0501/InstructNav.git
cd InstructNav
pip install -r requirements.txt
Our method depends on GLEE, an open-vocabulary detection and segmentation model. Please check the original repo or use the copy located in the ./thirdparty/ directory.
Please prepare API keys for calling the large language model (LLM) and large vision-language model (VLM). We use GPT-4 and GPT-4V for inference, and our code follows the AzureOpenAI calling convention. Before running the benchmarks, set your own API keys, endpoints, deployment names, and API versions. See ./llm_utils/gpt_request.py for usage details; a minimal calling sketch is also shown after the variable list below.
export GPT4_API_BASE=<YOUR_GPT4_ENDPOINT>
export GPT4_API_KEY=<YOUR_GPT4_KEY>
export GPT4_API_DEPLOY=<GPT4_MODEL_NAME>
export GPT4_API_VERSION=<GPT4_MODEL_VERSION>
export GPT4V_API_BASE=<YOUR_GPT4V_ENDPOINT>
export GPT4V_API_KEY=<YOUR_GPT4V_KEY>
export GPT4V_API_DEPLOY=<GPT4V_MODEL_NAME>
export GPT4V_API_VERSION=<GPT4V_MODEL_VERSION>
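The snippet below is a minimal sketch of how these variables are typically consumed with the openai package's AzureOpenAI client; the actual request logic used by InstructNav lives in ./llm_utils/gpt_request.py, and the message content here is only a placeholder.

import os
from openai import AzureOpenAI

# Minimal AzureOpenAI chat call driven by the environment variables above.
# Illustration only; see ./llm_utils/gpt_request.py for the real request logic.
client = AzureOpenAI(
    azure_endpoint=os.environ["GPT4_API_BASE"],
    api_key=os.environ["GPT4_API_KEY"],
    api_version=os.environ["GPT4_API_VERSION"],
)
response = client.chat.completions.create(
    model=os.environ["GPT4_API_DEPLOY"],  # Azure deployment name
    messages=[{"role": "user", "content": "Hello from the InstructNav setup check."}],
)
print(response.choices[0].message.content)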
If everything goes well, you can directly run the evaluation code for the different navigation tasks. For example:
python objnav_benchmark.py
All episode results and intermediate outputs, such as GPT-4 inputs/outputs and value maps, will be saved in the /tmp/ directory. The agent's real-time first-person RGB, depth, and segmentation observations will be saved in the project root directory. Examples are shown below:
example_objnav_episode.mp4
Please cite our paper if you find it helpful :)
@misc{InstructNav,
      title={InstructNav: Zero-shot System for Generic Instruction Navigation in Unexplored Environment},
      author={Yuxing Long and Wenzhe Cai and Hongcheng Wang and Guanqi Zhan and Hao Dong},
      year={2024},
      eprint={2406.04882},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
}