Skip to content

Commit

Permalink
[Quant Tool] Flaky test due to Pad reflect bug (#22798)
Browse files Browse the repository at this point in the history
### Description
Fixes a unit test that would fail intermittently due to an existing bug
with Pad (reflect mode). When the number of padded values is >= the
inner dimension size, the ORT Pad implementation accesses invalid
memory. This PR makes the number of padding values less than the inner
dimension size to avoid triggering the bug.


### Motivation and Context
See related issues:
#8265
#11828
#20801

Here's a valgrind trace obtained on a Linux machine (with
`sess_options.enable_cpu_mem_arena = False`)
```
==864228== Invalid read of size 4
==864228==    at 0x2716272A: void onnxruntime::PadInnermostAxis<unsigned int>(unsigned int*, unsigned int*, long, unsigned long) (pad.cc:370)
==864228==    by 0x2715D213: onnxruntime::common::Status onnxruntime::PadImpl<unsigned int>(onnxruntime::OpKernelContext*, absl::lts_20240722::InlinedVector<long, 10ul, std::allocator<long> > const&, absl::lts_20240722::InlinedVector<long, 10ul, std::allocator<long> > const&, onnxruntime::Mode const&, unsigned int) (pad.cc:551)
==864228==    by 0x2715B2BB: onnxruntime::Pad::Compute(onnxruntime::OpKernelContext*) const (pad.cc:725)
==864228==    by 0x276FF6A7: onnxruntime::ExecuteKernel(onnxruntime::StreamExecutionContext&, unsigned long, unsigned long, bool const&, onnxruntime::SessionScope&) (sequential_executor.cc:484)
==864228==    by 0x276F4A04: onnxruntime::LaunchKernelStep::Execute(onnxruntime::StreamExecutionContext&, unsigned long, onnxruntime::SessionScope&, bool const&, bool&) (execution_steps.cc:73)
...
```

The above is obtained with the basic Pad(reflect) example on the [ONNX
Pad operator spec
page](https://onnx.ai/onnx/operators/onnx__Pad.html#summary):

```python
data = [
    [1.0, 1.2],
    [2.3, 3.4],
    [4.5, 5.7],
]

pads = [0, 2, 0, 0]

mode = 'reflect'

# Expected output by ONNX spec
expected_output = [
    [1.0, 1.2, 1.0, 1.2],
    [2.3, 3.4, 2.3, 3.4],
    [4.5, 5.7, 4.5, 5.7],
]

# Bugged output from onnxruntime has invalid/uninitialized data for the first element in the inner dimension
# invalid data may be 0.0, inf, nan, etc.
ort_output = [
    [inf, 1.2, 1.0, 1.2],
    [inf, 3.4, 2.3, 3.4],
    [inf, 5.7, 4.5, 5.7],
]
```
  • Loading branch information
adrianlizarraga authored and guschmue committed Dec 2, 2024
1 parent 7416a64 commit d892dc1
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions onnxruntime/test/python/quantization/test_op_pad.py
Original file line number Diff line number Diff line change
Expand Up @@ -548,14 +548,15 @@ def build_pad_model(
opset: int = 21,
float_type: onnx.TensorProto.DataType = onnx.TensorProto.FLOAT,
) -> onnx.ModelProto:
num_pads_start = 1
input_0 = onnx.helper.make_tensor_value_info("input_0", float_type, (3, 2))
output_0 = onnx.helper.make_tensor_value_info("output_0", float_type, (3, 4))
output_0 = onnx.helper.make_tensor_value_info("output_0", float_type, (3, 2 + num_pads_start))

initializers = []
pad_input_names = ["input_0"]
attrs = {"mode": mode}

pads_data = np.array([0, 2, 0, 0], dtype=np.int64) # Pad two vals at beginning of axis 1.
pads_data = np.array([0, num_pads_start, 0, 0], dtype=np.int64) # Pad one val at beginning of axis 1.
if opset >= 11:
initializers.append(onnx.numpy_helper.from_array(pads_data, "pads"))
pad_input_names.append("pads")
Expand Down

0 comments on commit d892dc1

Please sign in to comment.