Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DFT] Initial DFT work & MKLGPU backend #259

Merged
merged 25 commits into from
Mar 8, 2023
Merged

[DFT] Initial DFT work & MKLGPU backend #259

merged 25 commits into from
Mar 8, 2023

Conversation

hjabird
Copy link
Contributor

@hjabird hjabird commented Dec 29, 2022

Co-authored-by @t4c1 and @anantsrivastava30.

Description

Checklist

All Submissions

  • Tests are still a work in progress. Tested against prototype tests and code adapted from the MKLGPU DFT examples.
  • [ x ] Have you formatted the code using clang-format?

New interfaces

  • Targets the OneAPI 1.2-rev-1 spec.
  • Complete New features checklist

@hjabird
Copy link
Contributor Author

hjabird commented Dec 29, 2022

This commit contains @anantsrivastava30 work from, with additional alterations:

@anantsrivastava30
Copy link
Contributor

Please correct me if I am wrong but I don't thing this is the correct git practice, we need to make sure the work is rebased on top of others' commits because this masks the work done by previous authors and as github keeps track of the credit we should make sure thats held intact.

@hjabird
Copy link
Contributor Author

hjabird commented Dec 30, 2022

Please correct me if I am wrong but I don't thing this is the correct git practice, we need to make sure the work is rebased on top of others' commits because this masks the work done by previous authors and as github keeps track of the credit we should make sure thats held intact.

I've credited you and Tadej as co-authors in both the commit message (with Co-authored-by:) and at the top of this PR. I've squashed the commits into one to create a clean commit history. Even if the commits were separate now, I think commits from different authors are usually squashed when a PR is merged.

From a review perspective, I think it is best that your work is included in this PR - I had to make quite a few changes to your code. For example, the descriptor class had to be moved from the dft to the detail namespace to avoid ambiguity with the decsriptor class in the Intel MKLGPU closed-source library. I don' t think there is any benefit to reviewers first reviewing the descriptor class, and then its move to the detail namespace when they could just review detail::descriptor.

If you'd like me to separate out your work within this PR I can of course do that. However, as I said, I'd expect the commits to be squashed on merging.

EDIT: It appears that I needed to put Co-authored-by at the bottom of the commit message rather than the top. Sorry for this mistake.

@lhuot lhuot requested review from mkrainiuk and mmeterel January 9, 2023 16:55
@mmeterel
Copy link
Contributor

mmeterel commented Jan 9, 2023

A quick high-level comment on the organization of this PR and #261. I am not in favor of adding a backend with no actual tests to cover it. Can we either add the tests first (they won't be used until we add the backend, but that is ok) or add the backend and tests in the same PR with proper testing? (with attached tests logs in the PR)

Also, unless I missed it, I don't see any changes to README and other documentation regarding the addition of new backend (along with supported compilers, OS, hardware, etc). Could you please update them as needed?

FMarno

This comment was marked as resolved.

@hjabird
Copy link
Contributor Author

hjabird commented Jan 13, 2023

A quick high-level comment on the organization of this PR and #261. I am not in favor of adding a backend with no actual tests to cover it. Can we either add the tests first (they won't be used until we add the backend, but that is ok) or add the backend and tests in the same PR with proper testing? (with attached tests logs in the PR)

Also, unless I missed it, I don't see any changes to README and other documentation regarding the addition of new backend (along with supported compilers, OS, hardware, etc). Could you please update them as needed?

I've added that MKLGPU is supported in the main README in 6307446.

I very much agree that this should be tested before merging. I think that this PR is already large enough, and that for the purpose review, it would be better to have feedback on the work in the tests PR and this PR separately. I think it would be best if the tests were merged first, and this PR rebased to include them.

Copy link
Contributor

@mkrainiuk mkrainiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I have several questions below to these changes.

CMakeLists.txt Show resolved Hide resolved
examples/dft/compile_time_dispatching/CMakeLists.txt Outdated Show resolved Hide resolved
examples/dft/CMakeLists.txt Show resolved Hide resolved
examples/dft/CMakeLists.txt Outdated Show resolved Hide resolved
src/dft/backends/mklgpu/commit.cpp Outdated Show resolved Hide resolved
hjabird and others added 12 commits February 15, 2023 17:22
* Implements DFT common functionality
* Implements MKLGPU backend
* Tested against adapted examples from closed-source MKLGPU library and prototype
  unit tests (to be submitted in another PR).

Co-authored-by: Anant Srivastava <[email protected]>
Co-authored-by: Tadej Ciglaric <[email protected]>
* It may be desireable to link to only libonemkl_dft_mklgpu without
the need to also link to libonemkl_dft.
* To allow this, the OneMKL interface must be duplicated within the
backend library.
* This PR duplicates the interface, allowing direct linking.
tests/unit_tests/CMakeLists.txt Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
@mmeterel
Copy link
Contributor

Now that #261 is merged, it would be good to see the test logs in this PR.

FMarno and others added 3 commits February 16, 2023 11:11
There is no reason to believe it wouldn't work, but it has not been
tested.
@Rbiessy
Copy link
Contributor

Rbiessy commented Feb 16, 2023

Here are the logs:
log_mklcpu.txt
log_mklgpu.txt

Note that I ran this with the OpenCL runtime as this is currently easier on my development machine. I am commenting out this skip here.

Copy link
Contributor

@anantsrivastava30 anantsrivastava30 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, given the md test PR #275 passes with these as well

@lhuot
Copy link
Contributor

lhuot commented Mar 2, 2023

@mmeterel and @mkrainiuk this PR looks good on the DFT side. Do you have any concerns left regarding how this fit to the overall design of the project?

@FMarno
Copy link
Contributor

FMarno commented Mar 6, 2023

the MKLCPU backend wasn't building, so I've fixed that. Here is the new log:
log_mklcpu.txt

@lhuot lhuot merged commit e4c3b2e into uxlfoundation:develop Mar 8, 2023
@Rbiessy Rbiessy deleted the FFT_MKLGPU_staging branch March 9, 2023 11:18
Comment on lines +116 to +117
set_value_item(descHandle, DFTI_INPUT_DISTANCE, config.fwd_dist);
set_value_item(descHandle, DFTI_OUTPUT_DISTANCE, config.bwd_dist);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anantsrivastava30, @t4c1, @Rbiessy, @hjabird @lhuot: I know this PR was merged already but what motivated this usage of DFTI_INPUT_DISTANCE and DFTI_OUTUT_DISTANCE via dpc++ APIs? Those parameters are not listed in the specs and we explicitly recommend to use FWD_DISTANCE and BWD_DISTANCE for the usage of the closed-source oneMKL via DPC++ APIs here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @raphael-egan the usage of these are mostly to conform to the classical API usage, that still uses these parameters, however I see you point and we can mask them in the cpu backend pr#288, so that on the user end we still use fwd/bwd distance and later on in this part of the code change these to input/output distance, lets move this discussion to #288

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it makes sense to move to the C++ API in the CPU MR. The initial work used the C API for the CPU backend and we didn't update this to avoid conflicts. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants