
v0.6.2

@Anerudhan Anerudhan released this 21 Apr 20:06
· 39 commits to main since this release
43709ab
  • [New Feature] Serialization:

    Execution Plan Serialization and Deserialization (Experimental)

    cuDNN v8.4 and above provides execution plan serialization and deserialization, which saves an execution plan as a string in JSON format. The plan can be restored from that string later, which saves compilation time compared to rebuilding it from scratch. Currently, this is an experimental feature that supports only the runtime fusion engine. No forward/backward or cross-device compatibility guarantee is offered at this time.

    API:

      - std::string cudnn_frontend::ExecutionPlan_v8::getJsonRepresentation() : Serialize the execution plan into a string in JSON format.
      - cudnn_frontend::ExecutionPlan_v8&& cudnn_frontend::ExecutionPlanBuilder_v8::loadFromJson(const std::string &json_plan) : Deserialize from a string containing the JSON representation of the execution plan.
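
One way to use the two calls above is to cache the JSON string on disk between runs. The sketch below shows only the file round trip. `save_plan_json` and `load_plan_json` are hypothetical helper names, not part of cudnn_frontend, and the cuDNN calls appear only in comments so the snippet compiles without cuDNN installed.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not a cudnn_frontend function): write the JSON string
// returned by ExecutionPlan_v8::getJsonRepresentation() to a file.
bool save_plan_json(const std::string& path, const std::string& json) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out << json;
    out.flush();
    return out.good();
}

// Hypothetical helper: read the file back into json_out.
// Returns false if the file cannot be opened, in which case the caller
// would rebuild the plan from scratch.
bool load_plan_json(const std::string& path, std::string& json_out) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    json_out.assign(std::istreambuf_iterator<char>(in),
                    std::istreambuf_iterator<char>());
    return true;
}

// Intended use (assumes a finalized plan and a configured builder):
//   save_plan_json("plan.json", plan.getJsonRepresentation());
//   ...
//   std::string json;
//   if (load_plan_json("plan.json", json)) { /* pass json to loadFromJson */ }
```

Because no forward/backward or cross-device compatibility is guaranteed, a cache like this would probably need to be keyed on the cuDNN version and the device, and discarded when either changes.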
    
  • [New API] Added a new API

    get_heuristics_list(std::array<std::string, SIZE> modes,
      OperationGraph_v8 &opGraph,
      std::function<bool(cudnnBackendDescriptor_t)> filter_fn,
      EngineConfigList &filtered_configs,
      bool evaluate_all = false)
    

    This function takes a list of heuristics modes ("heuristics_instant", "heuristic_fallback", "heuristic_mode_b") and computes a list of engine configs that do not satisfy the blocking condition in filter_fn, storing them in filtered_configs. Setting evaluate_all to true makes the function keep going even if one of the modes fails.
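
The filter convention is easy to get backwards: filter_fn describes what to block, so a config is kept when the predicate returns false. The sketch below models only that convention with stand-in types. `FakeEngineConfig` and `keep_unblocked` are hypothetical names, and the real function passes a cudnnBackendDescriptor_t to the predicate instead.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in for an engine config (hypothetical; the real predicate receives
// a cudnnBackendDescriptor_t).
struct FakeEngineConfig {
    std::string name;
    bool uses_tensor_cores;
};

// Keep every candidate for which the blocking predicate returns false.
// This models the filtering convention only, not the heuristics queries.
std::vector<FakeEngineConfig> keep_unblocked(
    const std::vector<FakeEngineConfig>& candidates,
    const std::function<bool(const FakeEngineConfig&)>& filter_fn) {
    std::vector<FakeEngineConfig> kept;
    for (const auto& cfg : candidates) {
        if (!filter_fn(cfg)) {
            kept.push_back(cfg);
        }
    }
    return kept;
}
```

For example, a predicate that returns true for configs without tensor cores leaves only the tensor-core configs in the result.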

  • [New Features] Added support for BN Finalize i.e. generation of mean and variance to perform batch normalization.

  • [New Features] Added support for BN Stats fusion pattern. This pattern covers Scale, Bias, Relu, Conv and generation of SUM and SQSUM for batch normalization.

  • [New Features] Added support for CUDNN_POINTWISE_GEN_INDEX and CUDNN_POINTWISE_BINARY_SELECT pointwise operations added in cuDNN 8.4.0.
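
As a rough CPU model of what these two operations compute: GEN_INDEX produces, for each element, its index along a chosen axis, and BINARY_SELECT picks between two inputs element by element using a predicate tensor. The function names below are hypothetical, the tensor is assumed to be 2-D and row-major, and the choice that a nonzero predicate selects the first input is an assumption to check against the cuDNN documentation.

```cpp
#include <cstddef>
#include <vector>

// For a rows x cols row-major tensor, return each element's index along
// `axis` (0 = row index, 1 = column index).
std::vector<int> gen_index(std::size_t rows, std::size_t cols, int axis) {
    std::vector<int> out(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[r * cols + c] = static_cast<int>(axis == 0 ? r : c);
        }
    }
    return out;
}

// Elementwise select: where pred is nonzero take `first`, otherwise `second`.
// Which input a true predicate selects is an assumption of this sketch.
// Precondition: all three vectors have the same length.
std::vector<float> binary_select(const std::vector<float>& first,
                                 const std::vector<float>& second,
                                 const std::vector<int>& pred) {
    std::vector<float> out(first.size());
    for (std::size_t i = 0; i < first.size(); ++i) {
        out[i] = pred[i] != 0 ? first[i] : second[i];
    }
    return out;
}
```

Combining the two is one way to build masks: generate row and column indices, compare them, and use the comparison result as the predicate.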

  • [Cleanup] Fixed a bug where using CUDNN_HEUR_MODE_B from multiple threads led to a crash under certain conditions.

  • [Cleanup] Added CUDNN_PATH to CMakeLists.txt, allowing users to build against a different cuDNN installation path.

  • [Cleanup] Made the Engine_v8 constructor default, which prevents overwriting of the status during knob creation.

  • [Cleanup] The UIDs of the variant pack are now taken as a const pointer.

  • [Cleanup] Fixed a crash that occurred when logging was enabled and no plan returned by heuristics was finalizable.

  • [Samples] Added a new sample to showcase CUDNN_POINTWISE_GEN_INDEX and CUDNN_POINTWISE_BINARY_SELECT pointwise operations.

  • [Samples] Modified the MHA sample to show improved numerical stability. Investigation is ongoing to further improve the MHA sample.

  • [Samples] Added samples for fused operation graph for BN Stats generation and stats finalization.

  • Added missing return statements for operation.

  • Added warn-as-error to the Samples Makefile.

  • Addressed multiple compiler warnings triggered by clang:

    • Unused variables.
    • Undefined destructor for class with virtual methods
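
The notes do not say which classes or which clang diagnostic were involved. As a general illustration of the destructor item, a class with virtual methods usually declares a virtual destructor so that deleting a derived object through a base pointer runs the derived destructor; clang's -Wnon-virtual-dtor reports classes that omit it. The class names below are hypothetical and are not taken from cudnn_frontend.

```cpp
#include <memory>

// Hypothetical base class with a virtual method.
class Shape {
public:
    // Virtual destructor: deleting through a Shape* runs the derived destructor.
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

// Hypothetical derived class that records when its destructor runs.
class Triangle : public Shape {
public:
    explicit Triangle(int* destroyed) : destroyed_(destroyed) {}
    ~Triangle() override { ++*destroyed_; }
    int sides() const override { return 3; }

private:
    int* destroyed_;
};

// Create a Triangle, destroy it through a base-class pointer, and return
// how many times the derived destructor ran.
int destroy_through_base() {
    int destroyed = 0;
    {
        std::unique_ptr<Shape> shape(new Triangle(&destroyed));
    }  // shape goes out of scope here; delete happens via Shape*
    return destroyed;
}
```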