Releases · NVIDIA/cudnn-frontend

13 Aug 21:43

v0.4.1

8360d4a

Release 0.4.1

[Bug Fix]: Fixed an issue where the vector count was not copied over during move construction.
[Samples]: Added a new sample for INT8x32 config (utilizing integer tensor cores). The example includes an errata filter which blocks an engine that has a known issue running this config.
[CleanUp]: Cleaned up all move constructors and fixed the move assignment operator.
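
The move-construction bug described above is easy to reproduce in any class with a hand-written move constructor: a scalar member that is left out of the initializer list silently keeps its default value. The following is a minimal sketch, not the frontend's actual tensor class. `TensorSketch`, `makeInt8x32Sketch`, and the member names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor descriptor wrapper; not the frontend's real class.
class TensorSketch {
public:
    TensorSketch(std::vector<int64_t> dims, int64_t vectorCount)
        : dims_(std::move(dims)), vectorCount_(vectorCount) {
        assert(vectorCount_ >= 1);  // a vector count below 1 makes no sense here
    }

    // Move constructor: every member is transferred, including the scalar
    // vectorCount_. Omitting it here would leave the default value of 1.
    TensorSketch(TensorSketch&& other) noexcept
        : dims_(std::move(other.dims_)), vectorCount_(other.vectorCount_) {}

    // Move assignment: guard against self-assignment, then transfer every member.
    TensorSketch& operator=(TensorSketch&& other) noexcept {
        if (this != &other) {
            dims_        = std::move(other.dims_);
            vectorCount_ = other.vectorCount_;
        }
        return *this;
    }

    // Move-only type, so copies are disabled.
    TensorSketch(const TensorSketch&)            = delete;
    TensorSketch& operator=(const TensorSketch&) = delete;

    const std::vector<int64_t>& dims() const { return dims_; }
    int64_t vectorCount() const { return vectorCount_; }

private:
    std::vector<int64_t> dims_;
    int64_t vectorCount_ = 1;
};

// Builds a tensor with a vector count of 32 and passes it through an explicit move.
inline TensorSketch makeInt8x32Sketch() {
    TensorSketch original({1, 2, 28, 28}, 32);
    TensorSketch moved(std::move(original));
    return moved;
}
```

With the scalar copied in both the move constructor and the move assignment operator, a tensor returned from `makeInt8x32Sketch()` still reports a vector count of 32.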

Co-authored-by: agopal [email protected]


01 Jul 04:23

Anerudhan

v0.4

73210a9

v0.4 release

[New API] : Added a new function get_heuristics_list which accepts a list of heuristic modes and returns a concatenated list of the engine heuristics.
[New Feature]: Added a new heuristic mode (HEUR_MODE_FALLBACK) to the backend. The sample has been updated to use it, which provides a generic way to access the fallback engines. FallbackEngineList is retained as a way to add custom engines in the frontend.
[New Feature]: Added support to set vectorization dimension and vectorization count attributes in the tensor descriptor.
[Rename]: setDataType in OperationBuilder is deprecated and replaced with the clearer setComputePrecision().
[CleanUp] : cudnnFindPlan and cudnnGetPlan now take an L-value operationGraph rather than an R-value as before.
[CleanUp] : cudnnFindPlan and time_sorted_plan return executionPlans_t (a vector of plans) instead of executionOptions_t (a vector of structs, each containing a plan and its time). This makes them compatible with cudnnGet.
[Samples]: New sample added for DP4A.
[Samples]: ConvBiasScaleRelu sample.
[Bug fix]: Errata filter was erroneously filtering out unspecified engines.
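
The concatenation behaviour described for get_heuristics_list can be sketched independently of cuDNN. The code below is an illustration under stated assumptions, not the library's API: `HeurMode`, `EngineConfig`, `HeuristicQuery`, `concatHeuristics`, and `toyQuery` are all hypothetical names, and the real function works with backend descriptors rather than strings. It shows only the ordering idea: each mode is queried in turn and its results are appended, so fallback configs land after those of earlier modes.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical types: the real frontend works with backend descriptors, not strings.
enum class HeurMode { Instant, Fallback };
using EngineConfig     = std::string;
using EngineConfigList = std::vector<EngineConfig>;
using HeuristicQuery   = std::function<EngineConfigList(HeurMode)>;

// Queries each mode in the order given and appends its results,
// so configs from earlier modes come first in the combined list.
inline EngineConfigList concatHeuristics(const std::vector<HeurMode>& modes,
                                         const HeuristicQuery& query) {
    assert(static_cast<bool>(query));  // an empty std::function cannot be called
    EngineConfigList all;
    for (HeurMode mode : modes) {
        EngineConfigList found = query(mode);
        all.insert(all.end(), found.begin(), found.end());
    }
    return all;
}

// Toy query used only for illustration; the engine names are made up.
inline EngineConfigList toyQuery(HeurMode mode) {
    if (mode == HeurMode::Instant) {
        return {"engine_A", "engine_B"};
    }
    return {"engine_fallback"};
}
```

Calling `concatHeuristics({HeurMode::Instant, HeurMode::Fallback}, toyQuery)` yields the two instant-mode entries followed by the fallback entry, which is the shape a caller would iterate over when trying engines in priority order.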


09 Jun 20:33

Anerudhan

v0.3.1

949f2ac

MR for quick fix for graceful exit

[Maintenance] Adding status check on the cudnnBackendExecute during warm up.
[Maintenance] Adding status check on json_handle when loading from a file.
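
A status check during warm-up can be sketched as follows. This is one plausible reading of "graceful exit", not the frontend's implementation: `Status`, `PlanSketch`, and `warmUpAndFilter` are hypothetical, and a plain callable stands in for the call to cudnnBackendExecute. The assumption here is that a plan whose warm-up run fails is skipped rather than timed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical status code standing in for cudnnStatus_t.
enum class Status { Success, ExecutionFailed };

// Hypothetical plan: a callable that reports whether its execution succeeded.
using PlanSketch = std::function<Status()>;

// Runs each plan once as a warm-up and returns the indices of those that
// succeeded, so a failing plan is skipped instead of being timed afterwards.
inline std::vector<std::size_t> warmUpAndFilter(const std::vector<PlanSketch>& plans) {
    std::vector<std::size_t> usable;
    for (std::size_t i = 0; i < plans.size(); ++i) {
        assert(static_cast<bool>(plans[i]));  // every entry must be callable
        if (plans[i]() != Status::Success) {
            continue;  // warm-up failed: drop this plan from the candidates
        }
        usable.push_back(i);
    }
    return usable;
}
```

Returning indices rather than copies keeps the sketch independent of how plans are stored; a caller would then time only the surviving candidates.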


17 May 06:49

Anerudhan

v0.3

51e60d8

v0.3

[New feature] Support reduction operation in the frontend.
[New feature] Add engine runtime compilation filter in the frontend as a behavior filter.
[New feature] Adding fallback list for convBiasAct.
[New feature Beta] Adding Errata filter with a sample.
[Samples] Add ConvBnstats and ConvColReduction tests.
[Bug Fix] Clamp upper_clip to float max in the pointwise descriptor when computeType is float.
[Bug Fix] Compilation fix for newer gcc toolchain (gcc 9+).
[Bug Fix] Add operation tag to the Plan generated by cudnnFind and cudnnGet.
[Maintenance] Added default fallback lists to newer versions of cudnn.
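
The upper_clip fix above addresses a common numeric pitfall: a clip bound held as a double (for example, the largest double) does not fit in a float, and converting it without clamping is undefined behaviour in C++ and typically produces infinity. The sketch below shows the clamping idea only. `clampedUpperClip` is a hypothetical helper, and the assumption that the bound is stored as a double is mine, not stated in the release notes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: clamps a double-valued upper clip bound to the largest
// finite float before narrowing, so the stored value is never infinite.
inline float clampedUpperClip(double upperClip) {
    assert(!std::isnan(upperClip));  // a NaN bound is treated as a caller error
    const double floatMax = static_cast<double>(std::numeric_limits<float>::max());
    return static_cast<float>(std::min(upperClip, floatMax));
}
```

Ordinary bounds such as 6.0 pass through unchanged, while a bound beyond the float range is reduced to `std::numeric_limits<float>::max()`.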

