Add qnn 16a16w quantization test (#7039)
Summary: Pull Request resolved: #7039

Differential Revision: D66390212
cccclai authored and facebook-github-bot committed Nov 22, 2024
1 parent a7ed425 commit f4278a9
Showing 3 changed files with 817 additions and 807 deletions.
8 changes: 8 additions & 0 deletions .ci/scripts/test_llama.sh
@@ -27,6 +27,10 @@ while [[ $# -gt 0 ]]; do
MODE="$2" # portable or xnnpack+custom or xnnpack+custom+qe
shift 2
;;
-pt2e_quantize)
PT2E_QUANTIZE="$2" # PT2E quantization scheme, e.g. qnn_16a16w
shift 2
;;
-upload)
UPLOAD_DIR="$2"
shift 2
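
The hunk above adds a `-pt2e_quantize` option to the argument-parsing loop so CI can select a PT2E quantization scheme ("16a16w" denotes 16-bit activations and 16-bit weights). A hedged invocation sketch using only the flags visible in this diff; the -mode and -upload values are illustrative, and the qnn_16a16w branch in the next hunk additionally requires the script to run with QNN enabled:

    # Illustrative invocation; -pt2e_quantize is the flag added by this commit,
    # the other values are placeholders, not taken from this diff.
    bash .ci/scripts/test_llama.sh \
      -mode portable \
      -pt2e_quantize qnn_16a16w \
      -upload /tmp/artifacts
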
@@ -234,6 +238,10 @@ if [[ "${COREML}" == "ON" ]]; then
fi
if [[ "${QNN}" == "ON" ]]; then
EXPORT_ARGS="${EXPORT_ARGS} -kv -v --qnn --disable_dynamic_shape"
echo "PT2E_QUANTIZE is ${PT2E_QUANTIZE}"
if [[ "${PT2E_QUANTIZE}" == "qnn_16a16w" ]]; then
EXPORT_ARGS+=" --tokenizer_path tokenizer.model --pt2e_quantize qnn_16a16w --calibration_tasks wikitext --calibration_limit 1 --calibration_seq_length 128 --calibration_data Once "
fi
fi
# Add dynamically linked library location
$PYTHON_EXECUTABLE -m examples.models.llama.export_llama ${EXPORT_ARGS}
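When QNN is ON and PT2E_QUANTIZE is qnn_16a16w, the export invocation assembled above effectively expands to the following. This is a reconstruction from the appended arguments in the hunk; `${EXPORT_ARGS}` may carry further flags set elsewhere in the script that are not shown here:

    # Reconstructed command line; assumes QNN=ON and PT2E_QUANTIZE=qnn_16a16w.
    $PYTHON_EXECUTABLE -m examples.models.llama.export_llama \
      -kv -v --qnn --disable_dynamic_shape \
      --tokenizer_path tokenizer.model \
      --pt2e_quantize qnn_16a16w \
      --calibration_tasks wikitext \
      --calibration_limit 1 \
      --calibration_seq_length 128 \
      --calibration_data Once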
