[luci] Introduce Compress weights pass #13521
base: master
Current encoded array format: Using example: Download mobilenet_v1_1.0_224_quant.tflite
Compression results for mobilenet_v1_1.0_224_quant.circle: |
case luci::WeightCompression::NONE:
  return circle::WeightCompressionType_NONE;
case luci::WeightCompression::HUFFMAN:
  return circle::WeightCompressionType_Huffman;
I assume you are adding weight compression with Huffman coding.
Since you are adding this define here, I think compress_weights_huffman would be a better name than the general compress_weights.
@seanshpark fixed, please take a look
98fc07b to 685f2b9
950a0c0 to 2de0564
auto conv2d = dynamic_cast<luci::CircleConv2D *>(node);
if (not conv2d)
  continue;
loco::DataType weights_dtype = loco::must_cast<luci::CircleConst *>(conv2d->filter())->dtype();
Please split this line. I'd like the code to be easy to read.
- loco::DataType weights_dtype = loco::must_cast<luci::CircleConst *>(conv2d->filter())->dtype();
+ auto filter = loco::must_cast<luci::CircleConst *>(conv2d->filter());
+ auto weights_dtype = filter->dtype();
arr.push_back(
  *(static_cast<const uint8_t *>(static_cast<const void *>(&kTreeSizeInBits)) + i));
Can you please split these lines?
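For illustration only, one way to make this byte-wise push easier to read is a small helper that copies the value's bytes with std::memcpy. This is a sketch, not the code in this PR: `append_bytes` is a hypothetical name, and it assumes kTreeSizeInBits is a trivially copyable integer that should be stored in host byte order, as the original cast does.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of luci): appends the object representation
// of `value` to `out`, byte by byte, in host byte order.
template <typename T>
void append_bytes(std::vector<uint8_t> &out, const T &value)
{
  uint8_t bytes[sizeof(T)];
  std::memcpy(bytes, &value, sizeof(T)); // well-defined alternative to pointer casts
  out.insert(out.end(), bytes, bytes + sizeof(T));
}
```

With such a helper the call site would shrink to something like append_bytes(arr, kTreeSizeInBits), replacing the loop over i.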
@seanshpark Maybe we should think about a better place for the HuffmanEncoder.h/HuffmanDecoder.h library? It will be used in onert-micro and other inference engines, so in the current configuration a luci dependency is required. Can we place this at the top level as a separate dependency?
can we place this on top level as separate dependency?
I need to think about this. This may be the first one, ... and I don't think I will like this.
Let's go with duplicated code. Sharing code between components will complicate the dependencies.
@@ -831,6 +837,7 @@ table Conv2DOptions {
   dilation_h_factor:int = 1;
   // Parameters for Conv2D version 8 or above.
   // When set, quantized_bias_type defines the dtype for both bias and accumulator.
+  weight_compression_type:WeightCompressionType = NONE;
We need to upgrade to 0.9, and this requires lots of other changes.
CC @hseok-oh
}
else
{
  throw std::runtime_error("Huffman weights compression supports s8 and u8");
Please do not throw; debug info is enough.
We do not want to stop circle2circle for this reason.
This throw should be in the import module.
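A rough sketch of the "log and skip" behaviour requested here, for illustration only. The types below (`DataType`, `Filter`, `compress_supported`) are hypothetical stand-ins rather than luci or loco types, and std::clog stands in for whatever debug logging the pass would actually use.

```cpp
#include <iostream>
#include <vector>

// Hypothetical stand-ins for the real graph types (assumptions, not luci API).
enum class DataType { S8, U8, FLOAT32 };

struct Filter
{
  DataType dtype;
  bool compressed;
};

bool is_supported(DataType d) { return d == DataType::S8 || d == DataType::U8; }

// Marks every supported filter as compressed and returns how many were handled.
// Unsupported filters are reported and skipped instead of aborting the run.
int compress_supported(std::vector<Filter> &filters)
{
  int count = 0;
  for (auto &f : filters)
  {
    if (!is_supported(f.dtype))
    {
      std::clog << "skip: Huffman weights compression supports s8 and u8\n";
      continue; // leave this filter untouched
    }
    f.compressed = true; // placeholder for the actual encoding step
    ++count;
  }
  return count;
}
```

The point of the sketch is only that an unsupported dtype leaves the node unchanged and the remaining nodes are still processed.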
@@ -34,7 +34,8 @@ namespace luci
  */
 class CircleConv2D final : public FixedArityNode<3, CircleNodeImpl<CircleOpcode::CONV_2D>>,
                            public CircleNodeMixin<CircleNodeTrait::FusedActFunc>,
-                           public CircleNodeMixin<CircleNodeTrait::Bias>
+                           public CircleNodeMixin<CircleNodeTrait::Bias>,
+                           public CircleNodeMixin<CircleNodeTrait::WeightCompression>
Q) why does Conv2D have this attribute? why not the filter Constant?
Q) why does Conv2D have this attribute? why not the filter Constant?
CircleConst is a virtual node, and we need some way to export and import circle, so I suggest setting this attribute when importing the op from circle and also setting it in CircleConst.
I don't understand. Please give more explanation.
@SlavikMIPT, is the purpose of compression to reduce file size, or are there other reasons?
I don't see any code changes in |
I recommend to introduce |
There are two purposes: reducing file size (for microcontrollers, this can allow the use of models that would not fit in flash memory without compression) and reducing memory bandwidth requirements (which can be a bottleneck for hardware accelerators). Huffman and RLE encodings are relatively cheap computationally. By combining this with compression-aware training or quantization, we can potentially achieve higher compression rates.
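To give a feel for the file-size argument, here is a small self-contained sketch that estimates how many payload bits a Huffman code would need for a byte array. It is not the encoder from this PR: `huffman_payload_bits` is a hypothetical name, and the estimate ignores the cost of storing the tree and any header.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Returns the number of payload bits an optimal Huffman code needs for `data`
// (tree/header overhead is NOT included; that is an assumption of this sketch).
std::size_t huffman_payload_bits(const std::vector<uint8_t> &data)
{
  std::array<std::size_t, 256> freq{}; // zero-initialized symbol counts
  for (uint8_t b : data)
    ++freq[b];

  // Min-heap of subtree weights.
  std::priority_queue<std::size_t, std::vector<std::size_t>, std::greater<std::size_t>> heap;
  for (std::size_t f : freq)
    if (f > 0)
      heap.push(f);

  if (heap.empty())
    return 0;
  if (heap.size() == 1)
    return data.size(); // a single distinct symbol still needs 1 bit each

  std::size_t total = 0;
  while (heap.size() > 1)
  {
    std::size_t a = heap.top();
    heap.pop();
    std::size_t b = heap.top();
    heap.pop();
    total += a + b; // each merge adds one bit to every symbol below it
    heap.push(a + b);
  }
  return total;
}
```

For example, the eight bytes {0, 0, 0, 0, 1, 1, 2, 3} take 64 bits raw, while the code lengths 1, 2, 3 and 3 give 4*1 + 2*2 + 1*3 + 1*3 = 14 payload bits. Skewed quantized weight distributions benefit in the same way.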
1e12f94 to eb3bbd8
@@ -0,0 +1,355 @@
/*
1/ This is a header-only file. Is there any particular reason for that? Why not split the implementation into a .cpp file?
2/ The copyright notice names only Samsung. Did you write this file from scratch?
3/ There is no .test.cpp file for this. Can you add some tests?
1/ ok
2/ yes
3/ ok
2ec5c30 to 71c600d
Looks good.
af43c81 to a4f7dbf
7792c51 to 526d3d3
Refactored, but I am not sure that duplicating the Decoder/Encoder code is a good idea. I would think about extracting it into a separate component.
526d3d3 to ca1cbb8
This commit introduces CompressWeightsPass for Conv2D
ONE-DCO-1.0-Signed-off-by: Vyacheslav Bazhenov <[email protected]>
ca1cbb8 to 9a80ea4
if (lhs->size<loco::DataType::FLOAT32>() != rhs->size<loco::DataType::FLOAT32>())
  return false;
This code looks very ugly. Isn't it possible to just check lhs->compression() == rhs->compression()?
 *
 * To see the target Op pattern, please visit implementation.
 */
struct CompressWeightsPass final : public logo::Pass
The name looks too general. Can you rename it to something like CompressWeightsHuffmanPass?
It seems that test code is missing, as @seanshpark pointed out. I left some comments because @SlavikMIPT requested a review, but I don't know the details of the algorithm. It would be better to add another reviewer, e.g., @hseok-oh.
plz split each |