Implementing new graph file parsers for graphchi #1
Comments
The documentation is quite sparse, admittedly. Here are some tips on how to get started. The parsers are implemented in src/preprocessing/conversions.hpp. Look first around line 541 to see how the parsers for the different file types are called. Then write your own parser method similar to convert_adjlist(basefilename, sharderobj), which starts on line 285. After that, you should be done: just specify "--filetype=myfiletype" on the command line, where myfiletype is the identifier for the format you want to implement.
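To make the dispatch step concrete, here is a minimal, self-contained sketch of selecting a parser by its file type identifier. It is not GraphChi's actual code: `EdgeSink`, `Parser`, `parse_edgelist` and `run_parser` are hypothetical names invented for this illustration, and the real logic in conversions.hpp may be structured differently.

```cpp
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the sharder object; not GraphChi's real class.
struct EdgeSink {
    std::vector<std::pair<long, long>> edges;
    void add_edge(long from, long to) { edges.emplace_back(from, to); }
};

// A parser reads one input stream and feeds every edge to the sink.
using Parser = std::function<void(std::istream&, EdgeSink&)>;

// Illustrative parser: whitespace-separated "from to" pairs.
void parse_edgelist(std::istream& in, EdgeSink& sink) {
    long from = 0, to = 0;
    while (in >> from >> to) sink.add_edge(from, to);
}

// Look up the parser registered for `filetype` and run it.
void run_parser(const std::string& filetype, std::istream& in, EdgeSink& sink) {
    static const std::map<std::string, Parser> parsers = {
        {"edgelist", parse_edgelist},
        // A new format would be registered here under its own identifier.
    };
    auto it = parsers.find(filetype);
    if (it == parsers.end()) {
        throw std::invalid_argument("unknown filetype: " + filetype);
    }
    it->second(in, sink);
}
```

With this layout, adding a format means writing one more function with the `Parser` signature and registering it under the identifier that is passed on the command line.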
Thanks, this was very helpful. I think I might be able to implement a parser for the METIS format. Is there a specific reason why the C++ standard library is used so sparingly in the parsers? Is it okay to work with std::ifstream, std::stringstream, std::getline, etc.? Kind regards
It is OK to use the C++ standard library. I just found that the C functions were a bit faster, and with billions of edges that can make a difference.
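As an illustration of the C-style approach mentioned here, the following sketch tokenizes a line of integers with std::strtol instead of std::stringstream. The function name `parse_ids` is hypothetical and not part of GraphChi, and no speed measurement is claimed for it; whether it is faster for a given workload would have to be benchmarked.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Parse whitespace-separated integers from a NUL-terminated C string.
// strtol skips leading whitespace itself and reports where it stopped.
std::vector<long> parse_ids(const char* line) {
    assert(line != nullptr);
    std::vector<long> ids;
    const char* p = line;
    while (true) {
        char* end = nullptr;
        long value = std::strtol(p, &end, 10);
        if (end == p) break;  // nothing could be converted: end of input
        ids.push_back(value);
        p = end;              // continue right after the parsed number
    }
    return ids;
}
```

The same loop can be applied to a buffer filled with fgets or to the c_str() of a line obtained with std::getline, so the two styles can be mixed.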
I implemented the METIS format parser, see: However, when I try to test the new format with the community detection example, I get a crash:
cls ~/workspace/Prototypes/graphchi-cpp $ ./bin/example_apps/communitydetection --filetype=metis file /Users/cls/workspace/Data/DIMACS/Clustering/pgp.graph
This is hard for me to diagnose. Am I doing anything obviously wrong? Kind regards
(In reply to Aapo Kyrola's message of 25.07.2013, 18:20.)
Sorry, I had not noticed your message. It seems your interim file is empty: see the message "Max vertex id: 0". You can send me the code and I will be happy to have a look.
You should be able to view and pull the code from here: Alternatively, I have attached the source file. Thank you for having a look at this. Chris
(In reply to Aapo Kyrola's message of 28.07.2013, 04:02.)
Hmm, I notice that none of the output that convert_metis writes to logstream is shown. I don't see anything obviously wrong in your code. I suggest you add std::cout << "debug ... " << std::endl; in many places and hunt down why no edges are read from the file.
(In reply to Aapo Kyrola's message of 29.07.2013, 19:06.)
No edges are read from the file because, starting from the community detection example app, the control flow never reaches my convert_metis function. In the main function of the example, graphchi_init(argc, argv) is called, which I suppose reads the --filetype=metis option. Then it calls set_argc, puts the key-value pair into the configuration, and prints it, right? I guess get_option_string_interactive is supposed to get the value. I cannot figure out where convert is actually called; the example only calls convert_if_notexists explicitly. Any idea how to fix this?
convert_if_notexists calls convert. What is happening there?
For some reason, it did not enter the if (!sharderobj.preprocessed_file_exists()) block. I tried it with a new file, and now reading a graph in METIS format seems to work. Community detection on a large web graph 1 runs in 269 seconds. Are you interested in adding the parser code to graphchi? Kind regards
(In reply to Aapo Kyrola's message of 03.08.2013, 19:49.)
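The behaviour described here, where the parser is silently skipped because preprocessed output already exists, can be sketched as follows. This is an assumption-laden illustration rather than GraphChi's implementation: `convert_if_missing` and both callbacks are hypothetical names, and the real convert_if_notexists may check more than one condition.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Run `convert` only when no preprocessed output exists for `basefilename`.
// Returns true if the conversion ran and false if it was skipped.
bool convert_if_missing(
        const std::string& basefilename,
        const std::function<bool(const std::string&)>& preprocessed_exists,
        const std::function<void(const std::string&)>& convert) {
    assert(preprocessed_exists && convert);
    if (!preprocessed_exists(basefilename)) {
        convert(basefilename);  // the only place the parser is invoked
        return true;
    }
    // Leftover output from an earlier (possibly failed) run lands here,
    // so a newly written parser is never called for this input file.
    return false;
}
```

Under this pattern, stale preprocessed files from a previous attempt have to be deleted, or a fresh input file used, before a new parser gets exercised.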
Great! Just make a pull request and I will add it. Thanks!
(In reply to clstaudt's message of Aug 4, 2013, 14:31.)
I would like to try graphchi with a collection of graphs in the so-called METIS format, a simple adjacency list format, which is however not the same as the adjacency lists already supported.
http://www.cc.gatech.edu/dimacs10/downloads.shtml
The Introduction to Example Applications states that "it is fairly easy to write your own parsers", but it is not apparent how this works. Looking at the source code did not get me far. The documentation should include some hints on how to create a new parser.
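For readers unfamiliar with the METIS format mentioned above, here is a minimal reader for its unweighted variant. It assumes the usual layout: lines starting with '%' are comments, the first remaining line is a header holding the vertex and edge counts, and the i-th line after it lists the 1-based neighbours of vertex i. Weighted variants are not handled, and `MetisGraph` and `read_metis_unweighted` are hypothetical names; this is not the parser that was contributed to GraphChi.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct MetisGraph {
    long num_vertices = 0;
    long num_edges = 0;  // undirected edge count as declared in the header
    // 0-based (from, to) pairs, one per adjacency-list entry, so each
    // undirected edge normally appears twice.
    std::vector<std::pair<long, long>> edges;
};

MetisGraph read_metis_unweighted(std::istream& in) {
    MetisGraph g;
    std::string line;
    bool have_header = false;
    long vertex = 0;  // 0-based id of the vertex whose line comes next
    while (std::getline(in, line)) {
        if (!line.empty() && line[0] == '%') continue;  // comment line
        if (!have_header) {
            if (line.empty()) continue;  // tolerate blank lines before header
            std::istringstream header(line);
            header >> g.num_vertices >> g.num_edges;
            have_header = true;
            continue;
        }
        if (vertex >= g.num_vertices) break;  // ignore trailing lines
        // An empty line is a vertex without neighbours, so it still counts.
        const char* p = line.c_str();
        while (true) {
            char* end = nullptr;
            long neighbour = std::strtol(p, &end, 10);
            if (end == p) break;  // no further number on this line
            g.edges.emplace_back(vertex, neighbour - 1);  // 1-based to 0-based
            p = end;
        }
        ++vertex;
    }
    return g;
}
```

Feeding it a stream with the contents "3 2", "2 3", "1", "1" on successive lines describes a three-vertex star centred on the first vertex; each of the two undirected edges is reported once per direction.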