Example app trianglecounting fails with twitter graph #21

Closed

disa-mhembere opened this issue Apr 14, 2014 · 6 comments

Comments
@disa-mhembere

I ran the trianglecounting example app with no problem on small graphs, but I get a segfault when I run it on the Twitter graph with 42 million vertices and 1.5 billion edges. I have 500 GB of RAM, 64 cores, and over 1 TB of free disk space, and this crash occurs when nothing else is running.

Here is the output from gdb when I try to debug. It seems it fails within malloc:

DEBUG: slidingshard.hpp(sliding_shard:213): Total edge data size: 839065660, /mnt/ram0/graphs/twitter_rv.net_degord.edata.e4B.0_0.dyngraph.0_7 sizeof(ET): 4

Program received signal SIGSEGV, Segmentation fault.
_int_malloc (av=0x7ffff7492720, bytes=643908276) at malloc.c:3900
3900    malloc.c: No such file or directory.
(gdb) backtrace
#0  _int_malloc (av=0x7ffff7492720, bytes=643908276) at malloc.c:3900
#1  0x00007ffff715bf95 in __GI___libc_malloc (bytes=643908276) at malloc.c:2924
#2  0x00000000004428ed in graphchi::graphchi_dynamicgraph_engine<unsigned int, unsigned int, graphchi::graphchi_vertex<unsigned int, unsigned int> >::commit_graph_changes (this=0x7fffffffd730) at ./src/engine/dynamic_graphs/graphchi_dynamicgraph_engine.hpp:733
#3  0x000000000042a476 in graphchi::graphchi_engine<unsigned int, unsigned int, graphchi::graphchi_vertex<unsigned int, unsigned int> >::run (this=0x7fffffffd730, userprogram=..., _niters=<optimized out>) at ./src/engine/graphchi_engine.hpp:952
#4  0x0000000000404eb3 in main (argc=<optimized out>, argv=<optimized out>) at example_apps/trianglecounting.cpp:470
(gdb) frame 2
#2  0x00000000004428ed in graphchi::graphchi_dynamicgraph_engine<unsigned int, unsigned int, graphchi::graphchi_vertex<unsigned int, unsigned int> >::commit_graph_changes (this=0x7fffffffd730) at ./src/engine/dynamic_graphs/graphchi_dynamicgraph_engine.hpp:733
733             edata = (graphchi_edge<ET> *) malloc(num_edges * sizeof(graphchi_edge<ET>));

Any idea how to solve this?
Thanks

@akyrola
Member

akyrola commented Apr 14, 2014

Hi,

are you running it from ram disk (/mnt/ram0/...)?

It is probably the reason it crashes, although 500GB RAM should be enough.... Please place the input file on the hard drive.

Aapo

Aapo Kyrola
Ph.D. student, http://www.cs.cmu.edu/~akyrola
GraphChi: Big Data - small machine: http://graphchi.org
twitter: @kyrpov

@disa-mhembere
Author

Hi Aapo,

Thanks for the quick response. I am running on a RAM disk because I want as
high performance as possible. This seems to have been caused by me messing
with the configuration file - totally my fault.
I still can't seem to match your results, even with my machine specs. Would
you be able to provide the config file or the individual command-line options
you used to achieve the results you show in your paper for: (i) Pagerank,
(ii) Triangle counting, (iii) ConnectedComponents?

Thanks again,
Disa


@akyrola
Member

akyrola commented Apr 14, 2014

The reason you won't be able to achieve the same results is that you have a hard drive, not an SSD.
Note also that the numbers in the paper do not include preprocessing time; they are the "runtime" reported at the end of processing.

But the main configuration I used was --membudget_mb=3000
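
For concreteness, an invocation with that budget might look like the following command-line fragment (binary path and remaining arguments are hypothetical and elided; only the `--membudget_mb=3000` flag comes from this thread):

```shell
# Hypothetical sketch; other arguments (input file, iterations) depend on the app.
./example_apps/trianglecounting ... --membudget_mb=3000
```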

Aapo


@disa-mhembere
Author

Ok - yes, I am not including preprocessing (sharding etc.) time, and I am running completely on a 128 GB RAM disk, which is at least an order of magnitude faster than SSD reads and writes, so I would expect to be even faster. I do have a ~1 TB SSD array on the machine and will test there as well, but I don't see how that could be faster. I'm confused. Any idea what I might be doing wrong, or suggestions for configuration now that you know the specs of my machine?

EDIT: Other than membudget_mb?

Thanks again,

@akyrola
Member

akyrola commented Apr 14, 2014

Can you tell me what kind of results you get? Also send me the full output log of your execution (send to [email protected]).


@disa-mhembere
Author

Ahh, it seems it ultimately was just an issue of giving GraphChi more memory, and it surpasses the paper's numbers as I expected. Thanks so much. This issue is resolved!
