Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Reorganized cugraph source directory #286

Merged
merged 19 commits into from
May 14, 2019
Merged

[REVIEW] Reorganized cugraph source directory #286

merged 19 commits into from
May 14, 2019

Conversation

kaatish
Copy link
Collaborator

@kaatish kaatish commented May 9, 2019

Fixes #258

@kaatish kaatish changed the title Reorganized cugraph source directory [WIP] Reorganized cugraph source directory May 9, 2019
nvgraph_gdf no longer include library_types.h
@kaatish kaatish changed the title [WIP] Reorganized cugraph source directory [REVIEW] Reorganized cugraph source directory May 9, 2019
@afender
Copy link
Member

afender commented May 9, 2019

The new graph categories look good to me.

I noticed another big change in this PR. The API implementation is now split across internal "backend" files whereas it used to be in a single file (cugraph.cu). Having a single API file allows for fast global API modifications and faster refactoring (as the core implementation is independent of the API).
I also see the advantages of the new solution. For instance, it is more consistent with the python directory structure as well as rapids in general.

@afender
Copy link
Member

afender commented May 9, 2019

Ideally, we should merge PRs that are based on top of the former directory organization first. Such as PR #287

@seunghwak
Copy link
Contributor

Ideally, we should merge PRs that are based on top of the former directory organization first. Such as PR #287

Yes, and I am also planning to submit a PR pretty soon, and it touches multiple files. The reorganization may better happen after that PR gets merged.

The cudf team is planning to adopt the following file structure (https://github.com/rapidsai/cudf/blob/branch-0.7/cpp/docs/TRANSITIONGUIDE.md).

File Structure
In lieu of the monolithic functions.h, external function APIs should be grouped based on functionality into an appropriately titled header file cudf/cpp/include/. For example, cudf/cpp/include/copying.hpp contains the APIs for functions related to copying from one column to another. Note the .hpp file extension used to indicate a C++ header file.

As existing functions are ported from a C to C++ API, new header files will need to be added to cudf/cpp/include/. For naming of these headers, it should be consistent with the name of the folder that contains the implementation of the API. For example, the implementation of the APIs found in cudf/cpp/include/copying.hpp are located in cudf/src/copying.

We should also start replacing existing .h files with .hpp files and renaming/regrouping/splitting public header files to match our new directory structure (we may also need this for internal header files but I guess this is less important/urgent than public ones).

And this may be outside the scope of this reorganization (PLEASE IGNORE ANYTHING FROM HERE IF THEY ARE OUTSIDE THE SCOPE) but the cudf team is planning to explicitly separate external APIs from internal implementations through namespace (https://github.com/rapidsai/cudf/blob/branch-0.7/cpp/docs/TRANSITIONGUIDE.md#detail-namespace). Following this,

we should do

namespace cugraph {

external APIs.

} // namespace cugraph

namespace cugraph {
namespace detail{

internal implementations.

} // namespace detail
} // namespace cugraph

And all functions specific to a translation unit (functions or data types relevant only to a single compiled file, or static functions) go to anonymous namespace.

namespace {

implementations only relevant to a single translation unit.

}

So every source file should look like.

namespace {
implementations only relevant to a single translation unit.
}

namespace cugraph {
namespace detail {
cugraph internal implementations.
}

external APIs.
}

See https://github.com/rapidsai/cudf/blob/branch-0.8/cpp/src/copying/gather.cu for a cudf's example.

And we may not be considering cpp/nvgraph here, but we should think about seemingly unused directories/files under cpp/nvgraph (e.g. cpp/nvgraph/test, cpp/nvgraph/cpp/tests, cpp/nvgraph/include/app, cpp/nvgraph/include/tests and so on). And cpp/nvgraph has two directories for external packages (cpp/nvgraph/external and cpp/nvgraph/cpp/thirdparty). These two directories may better be merged to one.

@BradReesWork
Copy link
Member

directory structure matches NewtworkX - I don't agree that Jaccard is link prediction, but that is a side issue :)

@afender
Copy link
Member

afender commented May 10, 2019

@seunghwak I think it makes sense to address the namespace refactoring as part of switching to a real C++ API (like cuDF).

We should create issues for header splitting and h/hpp renaming. This is something we should have on the radar but it is not as critical as it is for cuDF. For instance, cuDF has headers with 1200+ lines while ours have less than 20 functions. We should do it too, but probably not hold this PR for that.

@afender
Copy link
Member

afender commented May 13, 2019

should we merge smpv.cuh and snmg_csrmv.cuh? That's the same feature

@BradReesWork BradReesWork self-requested a review May 13, 2019 21:07
@kaatish
Copy link
Collaborator Author

kaatish commented May 14, 2019

should we merge smpv.cuh and snmg_csrmv.cuh? That's the same feature

I have renamed the file for clarity. snmg folder has two files :

  • spmv.cu : defines the gdf_snmg* functions
  • spmv.cuh : defines templated class SNMGcsrmv which is used in snmg/blas/spmv.cu and snmg/link_analysis/pagerank.cu

We need separate files because the templated class is being used elsewhere and defining the function in cuh files is going to be problematic.

Copy link
Member

@BradReesWork BradReesWork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the new structure. We will need to think about snmg since wit could mimic the same structure. It might eventually go into a dask_cugraph repop.
I think we should get this merged and then address any issues in a separate PR

@BradReesWork BradReesWork merged commit f2f6f91 into rapidsai:branch-0.8 May 14, 2019
@kaatish kaatish deleted the fea-analytic-reorg branch May 14, 2019 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Improve directory structure
4 participants