Odla_trt optimized and model test #499
base: master
Conversation
@@ -210,6 +216,7 @@ struct _odla_context {
  std::unordered_map<std::string, InputPtrInfo> input_ptrs;

  int run_batch_size = 0;
  // CUdeviceptr cumemalloc_address;
please remove
// size_t bytes =
//     GetTotalElements(real_shape) * GetElementSize(value->type.element_type);
// CUdeviceptr dev_ptr;
// CHECK(cuMemAlloc(&dev_ptr, bytes));
remove dead code
// std::cerr << "odla_value_shape:" << real_shape << "\n";
// size_t bytes =
//     GetTotalElements(real_shape) * GetTotalElements(v->type.element_type);
// CHECK(cudaMalloc(&dev_ptr, bytes));
same here
@@ -189,6 +193,8 @@ struct _odla_context {
  odla_computation comp = nullptr;
  nvinfer1::ICudaEngine* engine = nullptr;
  nvinfer1::IExecutionContext* ctx = nullptr;
  void* temp_input_ptr = nullptr;
  void* temp_output_ptr = nullptr;
what if there are multiple inputs/outputs?
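One way to address this, sketched below under assumptions (the container and helper names are illustrative, not from the PR): keep one device buffer per named input/output value instead of a single pointer pair, so models with multiple inputs or outputs each get their own staging buffer.

// Hypothetical sketch: per-value device buffers instead of a single
// temp_input_ptr/temp_output_ptr pair. Names are illustrative only.
#include <cuda_runtime.h>
#include <string>
#include <unordered_map>

struct PerValueBuffers {
  // One device pointer per named input/output value.
  std::unordered_map<std::string, void*> input_ptrs;
  std::unordered_map<std::string, void*> output_ptrs;
};

// Allocate the buffer for a given input value on first use, then reuse it.
void* get_input_buffer(PerValueBuffers& bufs, const std::string& name,
                       size_t bytes) {
  void*& slot = bufs.input_ptrs[name]; // inserts nullptr on first lookup
  if (slot == nullptr) {
    cudaMalloc(&slot, bytes); // error handling elided for brevity
  }
  return slot;
}

A map keyed by value name also makes cleanup straightforward: iterate both maps and cudaFree each pointer when the context is destroyed.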
@@ -2,7 +2,7 @@
export SRC_DIR=${PWD}/..
export BUILD_DIR=$SRC_DIR/build
export MODELS_SRC=/models

export MODELS_ROOT=/models
why add a new variable?
@@ -12,17 +12,20 @@ fi

if [[ $TEST_WITH_GPU -eq 1 ]]; then
  echo "======== Testing with ODLA TensorRT ========"
  for i in 1 2 4 8 16 32 64
  do
where is "i" used?
#!/bin/bash
# RUN: %s
model_name="alexnet"
docker_model_file="/models/vision/classification/$model_name"
use $MODELS_ROOT
this PR needs some cleanup
ODLA/platforms/tensorrt/odla_tensorrt.cc: move the cudaMalloc calls ahead of the first batch so device buffers are allocated once and reused.
models/vision/invoke_halo.py: update the preprocessing and execution code to handle batch sizes greater than 1.
models/vision/onnx2tensorrt.py, common.py, tensorrt_baseline_run_all.sh: native TensorRT baseline test API.
models/vision/classification/model_name/run_model_name.sh and run_model_name_tensorrt.sh (model_name is a placeholder): test shell scripts for each model.
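The first summary item (cudaMalloc ahead of the first batch) is the core runtime change. A minimal sketch of that pattern under assumptions — the function and variable names are hypothetical, not the PR's actual code: allocate device memory once before the batch loop and reuse it, rather than allocating inside each iteration.

// Sketch of "allocate before the first batch"; names are illustrative.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  const int num_batches = 8;
  const size_t bytes = 1 << 20; // per-batch buffer size (example value)

  // Allocate once, ahead of the first batch, instead of per iteration.
  void* dev_buf = nullptr;
  if (cudaMalloc(&dev_buf, bytes) != cudaSuccess) {
    std::fprintf(stderr, "cudaMalloc failed\n");
    return 1;
  }

  for (int b = 0; b < num_batches; ++b) {
    // ... copy inputs into dev_buf, run inference, copy outputs back ...
    // The buffer is reused here; no allocation happens inside the loop.
  }

  cudaFree(dev_buf);
  return 0;
}

Hoisting the allocation out of the per-batch path avoids repeated allocator round-trips and keeps per-batch latency stable, which matters when sweeping batch sizes as the test scripts above do.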