
Odla_trt optimized and model test #499

Open · ZZWHU wants to merge 5 commits into base: master

Conversation

ZZWHU (Collaborator) commented Aug 27, 2021

ODLA/platforms/tensorrt/odla_tensorrt.cc: move the cudaMalloc call in front of the first batch (see the sketch after this list).

models/vision/invoke_halo.py: change the preprocessing and execution code to handle batch sizes greater than 1.

models/vision/onnx2tensorrt.py, common.py, tensorrt_baseline_run_all.sh: native TensorRT test API for baseline runs.

models/vision/classification/model_name/run_model_name.sh, run_model_name_tensorrt.sh: test shell scripts for each model.
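
A minimal sketch of the first change, using hypothetical names (BindingInfo, PreallocateBindings) rather than the PR's actual code in odla_tensorrt.cc: allocate the device buffers once, before the first batch, so the per-batch path only copies and launches.

#include <cstddef>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical per-binding record; not the PR's actual struct.
struct BindingInfo {
  void* dev_ptr = nullptr;
  size_t bytes = 0;
};

// Run once at context creation, before the first batch executes.
bool PreallocateBindings(std::vector<BindingInfo>& bindings) {
  for (auto& b : bindings) {
    if (cudaMalloc(&b.dev_ptr, b.bytes) != cudaSuccess) return false;
  }
  return true;
}

// The per-batch path then only copies data; no per-batch cudaMalloc.
bool CopyInput(const BindingInfo& b, const void* host_src) {
  return cudaMemcpy(b.dev_ptr, host_src, b.bytes,
                    cudaMemcpyHostToDevice) == cudaSuccess;
}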

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@@ -210,6 +216,7 @@ struct _odla_context {
std::unordered_map<std::string, InputPtrInfo> input_ptrs;

int run_batch_size = 0;
// CUdeviceptr cumemalloc_address;
Collaborator: please remove

// size_t bytes =
// GetTotalElements(real_shape) * GetElementSize(value->type.element_type);
// CUdeviceptr dev_ptr;
// CHECK(cuMemAlloc(&dev_ptr, bytes));
Collaborator: remove dead code

// std::cerr << "odla_value_shape:" << real_shape << "\n";
// size_t bytes =
// GetTotalElements(real_shape) * GetTotalElements(v->type.element_type);
// CHECK(cudaMalloc(&dev_ptr, bytes));
Collaborator: same here

@@ -189,6 +193,8 @@ struct _odla_context {
odla_computation comp = nullptr;
nvinfer1::ICudaEngine* engine = nullptr;
nvinfer1::IExecutionContext* ctx = nullptr;
void* temp_input_ptr = nullptr;
void* temp_output_ptr = nullptr;
Collaborator: what if there are multiple inputs/outputs?
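
One way to handle that (a sketch under the assumption that buffers can be keyed by value name; not the PR's code): keep one map per direction instead of a single pointer.

#include <string>
#include <unordered_map>

// Sketch: one temporary device buffer per input/output value, keyed by
// name, so models with several inputs/outputs still work. The member
// names below are illustrative, not taken from the PR.
struct ContextBuffers {
  std::unordered_map<std::string, void*> temp_input_ptrs;
  std::unordered_map<std::string, void*> temp_output_ptrs;
};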

@@ -2,7 +2,7 @@
export SRC_DIR=${PWD}/..
export BUILD_DIR=$SRC_DIR/build
export MODELS_SRC=/models

export MODELS_ROOT=/models
Collaborator: why add a new variable?

@@ -12,17 +12,20 @@ fi

if [[ $TEST_WITH_GPU -eq 1 ]]; then
echo "======== Testing with ODLA TensorRT ========"
for i in 1 2 4 8 16 32 64
do
Collaborator: where is "i" used?
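
Presumably the loop variable should drive the batch size. A sketch of what the loop body might look like; the --batch-size flag and the driver path are assumptions, not confirmed by this PR.

for i in 1 2 4 8 16 32 64
do
  # Sketch: pass the loop variable to the test driver as the batch size.
  # The flag name and script path below are assumed, not from this PR.
  python3 "$SRC_DIR/models/vision/invoke_halo.py" --batch-size "$i"
done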

#!/bin/bash
# RUN: %s
model_name="alexnet"
docker_model_file="/models/vision/classification/$model_name"
Collaborator: use $MODELS_ROOT
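
That is, reference the exported variable instead of the hard-coded /models prefix; the setup script exports MODELS_ROOT=/models, so the resulting path is unchanged:

docker_model_file="$MODELS_ROOT/vision/classification/$model_name"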

@weimingzha0 (Collaborator)

this PR needs some cleanup
