Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to remove duplicated edges in batchUpdate #64

Open
ax7e opened this issue May 15, 2023 · 4 comments
Open

Unable to remove duplicated edges in batchUpdate #64

ax7e opened this issue May 15, 2023 · 4 comments

Comments

@ax7e
Copy link

ax7e commented May 15, 2023

Thanks for your great work!
I read throw the batchUpdate functions.
I'm confused about the following facts:

  1. Does hornet keeps the nodes inside each chunk sorted? If so, why batch update will finish with out furthur sorting?

template <typename... EdgeMetaTypes,
typename vid_t, typename degree_t>
void
BATCH_UPDATE::
print(bool sort) noexcept {
auto ptr = in_edge().get_soa_ptr();
std::cout<<"src, dst : "<<size()<<"\n";
rmm::device_vector<vid_t> src(size());
rmm::device_vector<vid_t> dst(size());
thrust::copy(ptr.template get<0>(), ptr.template get<0>() + size(), src.begin());
thrust::copy(ptr.template get<1>(), ptr.template get<1>() + size(), dst.begin());
if (!sort) {
thrust::copy(src.begin(), src.end(), std::ostream_iterator<vid_t>(std::cout, " "));
std::cout<<"\n";
thrust::copy(dst.begin(), dst.end(), std::ostream_iterator<vid_t>(std::cout, " "));
std::cout<<"\n";
} else {
thrust::host_vector<vid_t> hSrc = src;
thrust::host_vector<vid_t> hDst = dst;
std::vector<std::pair<vid_t, vid_t>> e;
for (int i = 0; i < size(); ++i) {
e.push_back(std::make_pair(hSrc[i], hDst[i]));
}
std::sort(e.begin(), e.end());
for (unsigned i = 0; i < e.size(); ++i) { std::cout<<e[i].first<<" "; }
std::cout<<"\n";
for (unsigned i = 0; i < e.size(); ++i) { std::cout<<e[i].second<<" "; }
std::cout<<"\n";
}
}

In the above code, the writer seems to assume that the edges are not sorted.

https://github.com/rapidsai/cuhornet/blob/ab70d14a562bdaa950e820b412fe827c570c0ca3/hornet/include/Core/HornetOperations/HornetInsert.i.cuh#LL12C1-L27C2

The code do don't sort the edges after batch update.
2. If the edges are don't sorted, actually the remove_graph_duplicates function is wrong.

int found = xlib::lower_bound_left(
batch_dst_ids + start,
end - start,
dst);

It uses xlib::lower_bound_left, which requires the array to be sorted.

  1. More Importantly, the delete function also assumes the chunks are sorted, but they are not.

https://github.com/rapidsai/cuhornet/blob/ab70d14a562bdaa950e820b412fe827c570c0ca3/hornet/include/Core/BatchUpdate/BatchUpdateKernels.cuh#L293-L296C6

@ax7e
Copy link
Author

ax7e commented May 15, 2023

@BradReesWork @bdice

@bdice
Copy link
Contributor

bdice commented Jun 14, 2023

Hi, unfortunately I won't be much help here. I have done some packaging work for this library but am unfamiliar with its internals. I believe this library is also scheduled for deprecation? cc: @rlratzel @ChuckHastings

@ChuckHastings
Copy link
Contributor

@ax7e - thanks for your interest.

This repository is a copy of the original hornet repository from Georgia Tech (https://github.com/hornet-gt/hornet). We copied the original repository so that we could use a handful of features from this library and integrate them into the NVIDIA RAPIDS cugraph library (https://github.com/rapidsai/cugraph). At the moment, the only remaining use of the hornet code base within cugraph is the implementation of k-truss).

Our support for this copy over the last 4 years has entirely been limited to updating things to compile for the subset of algorithms that we are using... which is now only k-truss.

The cugraph library will be phasing out use of cuhornet later this year and will drop support for this. I don't believe that Georgia Tech is actively working on this library either (at least I see no updates to the repository since 2019). @ogreen might have some insight into that.

If you are interested in using cugraph and/or collaborating with us, I certainly encourage that. We have an active team of developers working on the cugraph library. The big thing that hornet offers that cugraph does not is support for dynamic graphs. We do intend to add support for dynamic graphs (it is on our long-term road map), but our work at the moment focuses on static graph analysis and Graph Neural Networks.

If dynamic graphs are important to you, you can reach out to us and we'd be happy to offer suggestions on how to make use of cugraph, and perhaps you can help inform our plans for adding dynamic graph support in the future.

@ax7e
Copy link
Author

ax7e commented Jun 20, 2023

@ChuckHastings Thanks for your comment!. I'm truely very interested in dynamic graphs. It poses a unique challenge on load balancing and memory coaleasing. With the rapid development of LLM, the GPU memory is becoming sufficiently big that it could fits amost all graph streaming workloads. I'm truely happy to discuss about what I can do in adding dynamic graph support in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants