Cherry pick 1.17.3 - Round 2 #20178
Conversation
See the comments inside the changed files for more detailed information. The files onnxruntime/core/platform/windows/hardware_core_enumerator.cc and onnxruntime/core/platform/windows/hardware_core_enumerator.h were copied from the WinML source folder in this repo, with minor coding-style changes. I had an offline discussion with Sheil; we agree that, given the lack of a future-proof solution, we can check in this temporary fix first and rework it later. I will meet with @ivberg to discuss the issue in depth and look for a long-term solution. Thanks for offering to help, @ivberg! With this change, we will see about a 2x perf improvement on some Intel CPUs.
### Description

This PR adds flash attention v2 and support for INT4 CUDA benchmarking in PyTorch.

### Motivation and Context

The [flash attention v2](https://github.com/Dao-AILab/flash-attention) algorithm helps improve model performance in PyTorch. Support for INT4 CUDA in PyTorch is done through the [`bitsandbytes`](https://github.com/TimDettmers/bitsandbytes) package.
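For context, below is a minimal sketch (not the benchmarking script added in this PR) of how a PyTorch model can be loaded with INT4 weights via `bitsandbytes` through its Hugging Face `transformers` integration. The model name, device placement, and generation settings are illustrative placeholders, not values taken from this change.

```python
# Sketch: load a causal LM with 4-bit (INT4/NF4) weights using bitsandbytes,
# then run a short generation on CUDA. Model id below is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder, not from this PR

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize linear-layer weights to 4-bit
    bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="cuda",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hello, ONNX Runtime!", return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```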
/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline
/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline
/azp run iOS CI Pipeline,ONNX Runtime React Native CI Pipeline
Azure Pipelines successfully started running 10 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 2 pipeline(s).
### Description

See #19921. This addresses one comment there: #19921 (comment). Since that is an external branch, another pull request needs to be opened for this.

### Motivation and Context

---------

Co-authored-by: Sai Kishan Pampana <[email protected]>
Co-authored-by: rachguo <[email protected]>
Co-authored-by: Jian Chen <[email protected]>
Adds an example to demonstrate exporting the OpenAI Whisper implementation with batch_size > 1 and the addition of prompts for each audio snippet. Also handles the scenario where the prompts are not the same length. For example, if our prompt ids are [p1_id_1, p1_id_2] and [p2_id_1], the final decoder_input_ids will look as follows after padding: `[prev_token, p1_id_1, p1_id_2, start_token, lang_token, transcribe_token] [prev_token, p2_id_1, PAD_TOKEN, start_token, lang_token, transcribe_token]` A sketch of this padding scheme is shown below. --------- Co-authored-by: kunal-vaishnavi <[email protected]>
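Below is a minimal Python sketch of the padding scheme described above; the token names and id values (PREV_TOKEN, START_TOKEN, PAD_TOKEN, etc.) are illustrative placeholders, not necessarily the ids used by the exported Whisper model.

```python
# Placeholder special-token ids; the real exported model defines its own.
PREV_TOKEN = 50361
START_TOKEN = 50258
LANG_TOKEN = 50259
TRANSCRIBE_TOKEN = 50359
PAD_TOKEN = 50257

def build_decoder_input_ids(prompt_ids_per_sample):
    """Pad each sample's prompt to a common length, then add the forced tokens."""
    max_prompt_len = max(len(p) for p in prompt_ids_per_sample)
    decoder_input_ids = []
    for prompt in prompt_ids_per_sample:
        # Pad the prompt so every row has the same number of prompt tokens.
        padded_prompt = prompt + [PAD_TOKEN] * (max_prompt_len - len(prompt))
        decoder_input_ids.append(
            [PREV_TOKEN] + padded_prompt + [START_TOKEN, LANG_TOKEN, TRANSCRIBE_TOKEN]
        )
    return decoder_input_ids

# Two audio snippets with prompts of different lengths, as in the example above:
# row 1 -> [prev, p1_id_1, p1_id_2, start, lang, transcribe]
# row 2 -> [prev, p2_id_1, PAD,     start, lang, transcribe]
print(build_decoder_input_ids([[101, 102], [201]]))
```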