Skip to content

Latest commit

 

History

History
89 lines (72 loc) · 3.39 KB

driver_sum.md

File metadata and controls

89 lines (72 loc) · 3.39 KB

Sum Driver

Usage

    ./benchdnn --sum [benchdnn-knobs] [sum-knobs] [sum-desc] ...

where sum-knobs are:

  • --sdt={f32:f32 [default], ...} -- src data type. Refer to Inputs below. Refer to data types for details.
  • --ddt={f32 [default], s32, s8, u8, bf16, f16} -- dst data type. Refer to data types for details.
  • --stag={nchw:nchw [default], ...} -- physical src memory layout. Refer to Inputs below. Refer to tags for details.
  • --dtag={undef [default], ...} -- physical dst memory layout. Refer to tags for details.
  • --scales={FLOAT[:FLOAT...]} -- input scales. Refer to Scales below. The default is 1.f.
  • --match=REGEX -- skip problems not matching the regular expression in REGEX. By default no pattern is applied (run everything). Note: Windows may interpret only string arguments surrounded by double quotation marks.
  • Any attributes options. Refer to attributes for details.

and sum-desc is a problem descriptor. The canonical form is:

    NxNxNxNxN

where N is an integer number. This represents 3D spatial problem with the following logical dimensions: N, C, D, H, W. Consider removing each xN from the end to specify fewer dimensions.

Inputs

The sum primitive accepts at least two input sources. That is why it has a slightly different interface than almost every other driver. Specifying several inputs is done via the special ':' delimiter, e.g. --sdt=f32:s32, which means that the first source will be of type f32 and the second will be s32. --stag option must either specify a single tag (this tag will be used for all input tensors) or specify the same number of tags as the number of data types in --sdt delimited by ':'.

Scales

The same logic as in Inputs works for sum scales. --scales supports two modes:

  • Single value. In this case, a given value will be broadcasted for each input.
  • Multiple values. In this case, the driver will require the amount of scales to coincide with the amount of the --sdt option. Each input will use its own scale value.

Essence of Testing

Fill input data with integers so that an output will not overflow in the f16 or bf16 data types.

Note that threshold is set to a small value, which means that not each scale may pass it.

Examples

Run the set of sum from sum/test_sum_all with the default settings:

    ./benchdnn --sum --batch=inputs/sum/test_sum_all

Run a specific sum problem with three inputs of s8, s8, and u8 data types, with each input in nhwc physical memory layout, providing scales for each input individually and requesting the output in the s32 data type and nhwc layout:

    ./benchdnn --sum --sdt=s8:s8:u8 --ddt=s32 \
               --stag=nhwc:nhwc:nhwc --dtag=nhwc \
               --scales=2:4:1 16x16x3x5

Run a specific sum problem with the default scales, requesting to deduce the output with the f32 data type, iterating over source data types and physical memory layouts:

    ./benchdnn --sum --sdt=bf16:bf16,f32:f32 --ddt=f32 \
               --stag=nChw16c:nChw16c,nchw:nchw --dtag=undef 16x16x16x16

More examples with different driver options can be found at inputs/sum/test_sum_all. Examples with different benchdnn common options can be found at driver_conv.md.