-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce RemovableAttributes #14868
Conversation
This file contains no useful change (or so I think) In reply to: 1450728870 In reply to: 1450728870 Refers to: onnxruntime/core/framework/op_kernel_info.cc:2 in 6eed668. [](commit_id = 6eed668, deletion_comment = True) |
This needs more thought. One idea is that each kernel can expose an API to return the set of attributes that it deems removable. Then in session state finalization we can go through all the kernels to retrieve these attributes and remove them from the corresponding graph node as the session state still holds a mutable version of the graph (and hence no const_cast required). We do something similar with pre-packed tensors. The kernel exposes an API to pre-pack a tensor and after the session state is done with calling PrePack on each kernel, we remove the corresponding initializer from the graph. Edit: Obviously, not all kernels have to implement this API thereby relieving them of doing one more thing. Hence we can have a default implementation that returns an empty set of attrs. |
That would work. I tried to change the code to follow this idea. Let me know if this what you had in mind. |
* Clears removable attributes. These are no longer needed after the initialization | ||
* of the session. | ||
*/ | ||
int PruneRemovableAttributes(const InlinedVector<std::string>& removable_attributes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document what the return value indicates.
for (size_t i = 0; i < session_kernels_.size(); ++i) { | ||
auto status = session_kernels_[i].get()->RemovableAttributes(removable_attributes); | ||
if (!status.IsOK()) { | ||
ORT_THROW("Kernel ", i, " failed at retrieving the removable attributes."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we don't need to throw here because this is merely an optimization. We must log a warning though. Also, the index of the kernel doesn't really give us any useful information; would be good to log the node type and name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -1528,6 +1528,13 @@ common::Status InferenceSession::Initialize() { | |||
session_id_, model_->IrVersion(), model_->ProducerName(), model_->ProducerVersion(), model_->Domain(), | |||
model_->MainGraph().DomainToVersionMap(), model_->MainGraph().Name(), model_->MetaData(), | |||
telemetry_.event_name_, execution_providers_.GetIds(), model_has_fp16_inputs); | |||
|
|||
// once the model is saved, we may remove unnecessary attributes for inference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be placed before the telemetry is logged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
onnxruntime/core/graph/graph.cc
Outdated
graph_->SetGraphResolveNeeded(); | ||
graph_->SetGraphProtoSyncNeeded(); | ||
int n_removed = 0; | ||
for (auto name : removable_attributes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto& name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// All attributes returned by this method will be removed by method | ||
// PruneRemovableAttributes of they exists. | ||
// @param removable_attributes set of attributes the session can safely remove. | ||
virtual Status RemovableAttributes(InlinedVector<std::string>& removable_attributes) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be renamed to GetRemovableAttributes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Can you add a test case to exercise this? |
How does this affect exporting optimized model into flatbuffers and loading such an optimized model? |
I did not add any because all test based on TreeEnsemble are going through the code I added. I'll try to see how I can add a few lines to make sure one attribute was removed. I was thinking to add a test case like the following:
It checks saving the optimized model is not broken. It does not change the fact a removable attribute was removed. The public API of InferenceSession seems to give access to SessionState and then to the nodes. Is there a unit test I could use as a starting point for that? |
Signed-off-by: xadupre <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
### Description TreeEnsemble* kernels fully copies all the parameters from the onnx graph. Even if they are no longer needed or unused (hitrates), they remain in memory. For big models >= 200 trees, max_depth > 10, the model usually weights more than 10 Mb. This change offers a kernel the possibility to remove all unneeded attributes after they were used to create the session. Attributes are deleted after the model was possibly saved, at the of the session creation. The current design is to be debatted: * it stored the list of removable attributes in class `onnxruntime::Node`, * the node is marked as `const` everytime this implementation needs to register the name of a removable attribute or to remove them. The current implementation is just a POC as it needs to cast `onnxruntime::Node*` into `const onnxruntime::Node*`. Should we keep the list of removable attributes in `onnxruntime::Node`? ### Motivation and Context Motivation is mostly to reduce memory consumption. --------- Signed-off-by: xadupre <[email protected]>
Description
TreeEnsemble* kernels fully copies all the parameters from the onnx graph. Even if they are no longer needed or unused (hitrates), they remain in memory. For big models >= 200 trees, max_depth > 10, the model usually weights more than 10 Mb. This change offers a kernel the possibility to remove all unneeded attributes after they were used to create the session. Attributes are deleted after the model was possibly saved, at the of the session creation.
The current design is to be debatted:
onnxruntime::Node
,const
everytime this implementation needs to register the name of a removable attribute or to remove them.The current implementation is just a POC as it needs to cast
onnxruntime::Node*
intoconst onnxruntime::Node*
.Should we keep the list of removable attributes in
onnxruntime::Node
?Motivation and Context
Motivation is mostly to reduce memory consumption.