Move phi_1_5 to canary as it does not install on the docker build (#2095)

Summary:
The docker build runner runs out of CPU memory (OOM) when installing phi_1_5, so this commit moves the model to canary.

https://github.com/pytorch/benchmark/actions/runs/7263599720

Pull Request resolved: #2095

Test Plan:
Nightly docker build:
https://github.com/pytorch/benchmark/actions/runs/7301484235

Reviewed By: aaronenyeshi

Differential Revision: D52391171

Pulled By: xuzhao9

fbshipit-source-id: 4866098292cbca7459d632c5c05fe620e638077e
xuzhao9 authored and facebook-github-bot committed Dec 23, 2023
1 parent ac77055 commit 79c236a
Showing 7 changed files with 6 additions and 1 deletion.
1 change: 1 addition & 0 deletions scripts/torchbench_install.sh
@@ -16,4 +16,5 @@ conda activate "${CONDA_ENV}"
 parent_dir=$(dirname "$(readlink -f "$0")")/..
 cd ${parent_dir}

+python -c "import torch; print(torch.__version__); print(torch.version.git_version)"
 python install.py
2 changes: 1 addition & 1 deletion test.py
@@ -61,7 +61,7 @@ def example_fn(self):
 try:
     _create_example_model_instance(task, device)
     accuracy = task.get_model_attribute("accuracy")
-    assert accuracy == "pass" or accuracy == "eager_1st_run_OOM", f"Expected accuracy pass, get {accuracy}"
+    assert accuracy == "pass" or accuracy == "eager_1st_run_OOM" or accuracy == "eager_2nd_run_OOM", f"Expected accuracy pass, get {accuracy}"
     task.del_model_instance()
 except NotImplementedError as e:
     self.skipTest(f'Method `get_module()` on {device} is not implemented because "{e}", skipping...')
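
The relaxed assertion in the test.py hunk accepts three accuracy statuses instead of two. As a rough sketch only, the same check could be written as a set-membership test; `ACCEPTED_ACCURACY` and `is_accepted_accuracy` are hypothetical names used for illustration and do not appear in this commit.

```python
# Hypothetical helper, not part of test.py in this commit.
# The three status strings are the ones accepted by the updated assertion.
ACCEPTED_ACCURACY = frozenset({
    "pass",
    "eager_1st_run_OOM",
    "eager_2nd_run_OOM",  # newly tolerated by this commit
})


def is_accepted_accuracy(accuracy: str) -> bool:
    """Return True if the reported accuracy status should not fail the test."""
    return accuracy in ACCEPTED_ACCURACY


if __name__ == "__main__":
    for status in ("pass", "eager_2nd_run_OOM", "fail_accuracy"):
        print(status, is_accepted_accuracy(status))
```

Keeping the accepted statuses in one collection would mean a future status needs a one-line change rather than another `or` clause, though the commit itself keeps the inline comparison.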
File renamed without changes.
File renamed without changes.
File renamed without changes.
4 changes: 4 additions & 0 deletions torchbenchmark/models/llama_v2_7b_16h/metadata.yaml
@@ -7,5 +7,9 @@ eval_nograd: true
 not_implemented:
 - device: cpu
 - device: NVIDIA A10G
+# TODO: llama_v2_7b_16h accuracy test will cause "CUBLAS_STATUS_NOT_INITIALIZED" Error
+# https://github.com/pytorch/benchmark/issues/2064
+- device: NVIDIA A100-SXM4-40GB
+  test: eval
 train_benchmark: false
 train_deterministic: false
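
The new `not_implemented` entry restricts the skip to the `eval` test on one device, while the older entries name only a device. The sketch below shows one plausible reading of such a list, assuming an entry matches when every key it specifies equals the current run's value. The matching rule and the names `NOT_IMPLEMENTED` and `is_not_implemented` are assumptions for illustration; the commit does not show how TorchBench evaluates these entries.

```python
from typing import Dict, List

# The entries from the metadata.yaml hunk, written as plain dicts so the
# sketch needs no YAML parser.
NOT_IMPLEMENTED: List[Dict[str, str]] = [
    {"device": "cpu"},
    {"device": "NVIDIA A10G"},
    {"device": "NVIDIA A100-SXM4-40GB", "test": "eval"},
]


def is_not_implemented(entries: List[Dict[str, str]], device: str, test: str) -> bool:
    """Assumed rule: an entry matches if every key it sets equals the current value.

    A key that an entry omits (for example `test`) matches any value.
    """
    current = {"device": device, "test": test}
    return any(
        all(current.get(key) == value for key, value in entry.items())
        for entry in entries
    )


if __name__ == "__main__":
    print(is_not_implemented(NOT_IMPLEMENTED, "NVIDIA A100-SXM4-40GB", "eval"))
    print(is_not_implemented(NOT_IMPLEMENTED, "NVIDIA A100-SXM4-40GB", "train"))
```

Under this reading, the A100 entry would skip only `eval`, leaving `train` on that device unaffected.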
