LabelEncoder kernel creation improvement #16516

adityagoel4512 · 2023-06-28T09:51:09Z

Description

This PR updates the initialisation of the _map in LabelEncoder_2 to be more memory efficient.

Firstly, we switch from std::unordered_map to absl::flat_hash_map. The latter has a more compact layout and avoids the overhead of guaranteeing pointer and reference stability, as std::unordered_map does. We do not need that guarantee here, since we only initialise _map once at creation and then perform lookups during compute. Abseil is already used extensively within onnxruntime.
Secondly, space is reserved before inserting to prevent rehashing and reallocation.
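
A minimal sketch of the reserve-then-emplace pattern described above. It uses std::unordered_map so that it compiles with the standard library alone; absl::flat_hash_map offers the same reserve and emplace calls. BuildLookup and Lookup are hypothetical names chosen for illustration and are not the actual onnxruntime code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not the onnxruntime implementation): builds a
// key -> value lookup table from two parallel lists of equal length.
template <typename TKey, typename TValue>
std::unordered_map<TKey, TValue> BuildLookup(const std::vector<TKey>& keys,
                                             const std::vector<TValue>& values) {
  assert(keys.size() == values.size());

  std::unordered_map<TKey, TValue> map;
  // Reserve space for all entries up front so that the inserts below
  // do not trigger any rehashing.
  map.reserve(keys.size());

  for (std::size_t i = 0; i < keys.size(); ++i) {
    // emplace constructs the pair in place; if a key repeats,
    // the first value inserted for it is kept.
    map.emplace(keys[i], values[i]);
  }
  return map;
}

// Hypothetical helper: returns the mapped value, or default_value
// when the key is not present.
template <typename TKey, typename TValue>
TValue Lookup(const std::unordered_map<TKey, TValue>& map, const TKey& key,
              const TValue& default_value) {
  auto it = map.find(key);
  return it == map.end() ? default_value : it->second;
}
```

Because the table is filled once and only read afterwards, nothing in this pattern relies on element addresses staying fixed, which is why a flat, open-addressing map is a reasonable substitute.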

Motivation and Context

For very large lookups, the LabelEncoder's kernel creation can require more RAM than necessary. These simple changes to _map initialisation speed up initialisation and reduce the memory allocated.

Signed-off-by: Aditya Goel <[email protected]>

adityagoel4512 · 2023-07-03T15:11:47Z

Closes #16575

adityagoel4512 · 2023-07-03T15:18:30Z

@baijumeswani not sure if you are the right person to ask, but would it be possible to get a review on this?

onnxruntime/core/providers/cpu/ml/label_encoder.h

baijumeswani · 2023-07-03T22:32:01Z

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline

baijumeswani · 2023-07-03T22:32:22Z

/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, ONNX Runtime React Native CI Pipeline

azure-pipelines · 2023-07-03T22:32:36Z

Azure Pipelines successfully started running 9 pipeline(s).

azure-pipelines · 2023-07-03T22:32:51Z

Azure Pipelines successfully started running 8 pipeline(s).

adityagoel4512 · 2023-07-05T10:27:12Z

@baijumeswani looks like the CI has passed

baijumeswani · 2023-07-05T14:09:42Z

Thank you for your contribution.

adityagoel4512 added 2 commits June 28, 2023 10:44

Preallocate unordered_map in label encoder and use emplace

5da913c

Signed-off-by: Aditya Goel <[email protected]>

Switch to abseil flat hashmap

ad97909

adityagoel4512 changed the title ~~Preallocate unordered_map in label encoder and use emplace~~ Make LabelEncoder more memory efficient. Jul 2, 2023

adityagoel4512 changed the title ~~Make LabelEncoder more memory efficient.~~ Make LabelEncoder creation more memory efficient. Jul 2, 2023

adityagoel4512 changed the title ~~Make LabelEncoder creation more memory efficient.~~ LabelEncoder kernel creation improvement Jul 3, 2023

baijumeswani reviewed Jul 3, 2023

onnxruntime/core/providers/cpu/ml/label_encoder.h (outdated)

Use InlinedHashMap

be94fea

baijumeswani approved these changes Jul 5, 2023

baijumeswani merged commit 9799d43 into microsoft:main Jul 5, 2023

adityagoel4512 deleted the preallocate_label_encoder_map branch July 5, 2023 14:36
