Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce RemovableAttributes #14868

Merged
merged 14 commits into from
Mar 7, 2023
Merged

Introduce RemovableAttributes #14868

merged 14 commits into from
Mar 7, 2023

Conversation

xadupre
Copy link
Member

@xadupre xadupre commented Mar 1, 2023

Description

TreeEnsemble* kernels fully copies all the parameters from the onnx graph. Even if they are no longer needed or unused (hitrates), they remain in memory. For big models >= 200 trees, max_depth > 10, the model usually weights more than 10 Mb. This change offers a kernel the possibility to remove all unneeded attributes after they were used to create the session. Attributes are deleted after the model was possibly saved, at the of the session creation.

The current design is to be debatted:

  • it stored the list of removable attributes in class onnxruntime::Node,
  • the node is marked as const everytime this implementation needs to register the name of a removable attribute or to remove them.

The current implementation is just a POC as it needs to cast onnxruntime::Node* into const onnxruntime::Node*.

Should we keep the list of removable attributes in onnxruntime::Node?

Motivation and Context

Motivation is mostly to reduce memory consumption.

@xadupre xadupre added the core runtime issues related to core runtime label Mar 1, 2023
@yuslepukhin
Copy link
Member

yuslepukhin commented Mar 1, 2023

// Licensed under the MIT License.

This file contains no useful change (or so I think)


In reply to: 1450728870


In reply to: 1450728870


Refers to: onnxruntime/core/framework/op_kernel_info.cc:2 in 6eed668. [](commit_id = 6eed668, deletion_comment = True)

@pranavsharma
Copy link
Contributor

pranavsharma commented Mar 1, 2023

This needs more thought. One idea is that each kernel can expose an API to return the set of attributes that it deems removable. Then in session state finalization we can go through all the kernels to retrieve these attributes and remove them from the corresponding graph node as the session state still holds a mutable version of the graph (and hence no const_cast required). We do something similar with pre-packed tensors. The kernel exposes an API to pre-pack a tensor and after the session state is done with calling PrePack on each kernel, we remove the corresponding initializer from the graph.

Edit: Obviously, not all kernels have to implement this API thereby relieving them of doing one more thing. Hence we can have a default implementation that returns an empty set of attrs.

@xadupre
Copy link
Member Author

xadupre commented Mar 2, 2023

That would work. I tried to change the code to follow this idea. Let me know if this what you had in mind.

* Clears removable attributes. These are no longer needed after the initialization
* of the session.
*/
int PruneRemovableAttributes(const InlinedVector<std::string>& removable_attributes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document what the return value indicates.

for (size_t i = 0; i < session_kernels_.size(); ++i) {
auto status = session_kernels_[i].get()->RemovableAttributes(removable_attributes);
if (!status.IsOK()) {
ORT_THROW("Kernel ", i, " failed at retrieving the removable attributes.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we don't need to throw here because this is merely an optimization. We must log a warning though. Also, the index of the kernel doesn't really give us any useful information; would be good to log the node type and name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1528,6 +1528,13 @@ common::Status InferenceSession::Initialize() {
session_id_, model_->IrVersion(), model_->ProducerName(), model_->ProducerVersion(), model_->Domain(),
model_->MainGraph().DomainToVersionMap(), model_->MainGraph().Name(), model_->MetaData(),
telemetry_.event_name_, execution_providers_.GetIds(), model_has_fp16_inputs);

// once the model is saved, we may remove unnecessary attributes for inference
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be placed before the telemetry is logged?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

graph_->SetGraphResolveNeeded();
graph_->SetGraphProtoSyncNeeded();
int n_removed = 0;
for (auto name : removable_attributes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const auto& name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// All attributes returned by this method will be removed by method
// PruneRemovableAttributes of they exists.
// @param removable_attributes set of attributes the session can safely remove.
virtual Status RemovableAttributes(InlinedVector<std::string>& removable_attributes) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be renamed to GetRemovableAttributes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@pranavsharma
Copy link
Contributor

Can you add a test case to exercise this?

@yuslepukhin
Copy link
Member

How does this affect exporting optimized model into flatbuffers and loading such an optimized model?

@xadupre
Copy link
Member Author

xadupre commented Mar 3, 2023

Can you add a test case to exercise this?

I did not add any because all test based on TreeEnsemble are going through the code I added. I'll try to see how I can add a few lines to make sure one attribute was removed.

I was thinking to add a test case like the following:

TEST(GraphRuntimeOptimizationTest, RemovableAttributes) {
  SaveAndLoadRuntimeOptimizationsForModel(
      ORT_TSTR("testdata/transform/SklearnHGBRegressorTrue.model.onnx"),
      ORT_TSTR("testdata/transform/runtime_optimization/SklearnHGBRegressorTrue.model.onnx"),
      [](const OpCountMap& loaded_ops, const OpCountMap& initialized_ops) {
        EXPECT_EQ(loaded_ops,
                  (OpCountMap{{"TreeEnsembleRegressor", 1}}));

        EXPECT_EQ(initialized_ops,
                  (OpCountMap{{"TreeEnsembleRegressor", 1}}));
      });
}

It checks saving the optimized model is not broken. It does not change the fact a removable attribute was removed. The public API of InferenceSession seems to give access to SessionState and then to the nodes. Is there a unit test I could use as a starting point for that?

@xadupre xadupre changed the title [WIP] Introduce RemovableAttributes Introduce RemovableAttributes Mar 3, 2023
Copy link
Member

@yuslepukhin yuslepukhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@xadupre xadupre merged commit 5930e7e into microsoft:main Mar 7, 2023
mszhanyi pushed a commit that referenced this pull request Mar 9, 2023
### Description
TreeEnsemble* kernels fully copies all the parameters from the onnx
graph. Even if they are no longer needed or unused (hitrates), they
remain in memory. For big models >= 200 trees, max_depth > 10, the model
usually weights more than 10 Mb. This change offers a kernel the
possibility to remove all unneeded attributes after they were used to
create the session. Attributes are deleted after the model was possibly
saved, at the of the session creation.

The current design is to be debatted:
* it stored the list of removable attributes in class
`onnxruntime::Node`,
* the node is marked as `const` everytime this implementation needs to
register the name of a removable attribute or to remove them.

The current implementation is just a POC as it needs to cast
`onnxruntime::Node*` into `const onnxruntime::Node*`.

Should we keep the list of removable attributes in `onnxruntime::Node`?

### Motivation and Context
Motivation is mostly to reduce memory consumption.

---------

Signed-off-by: xadupre <[email protected]>
@xadupre xadupre deleted the removeatt branch November 7, 2024 10:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core runtime issues related to core runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants