Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault in node v5.* #5900

Closed
icefapper opened this issue Mar 25, 2016 · 37 comments
Closed

Segmentation fault in node v5.* #5900

icefapper opened this issue Mar 25, 2016 · 37 comments
Labels
confirmed-bug Issues with confirmed bugs. v8 engine Issues and PRs related to the V8 dependency.

Comments

@icefapper
Copy link

Hello
while fiddling with the toy js parser I have made, i came across weird 139 (that is, Segmentation fault) errors which are driving me nuts. I've pruned every unnecessary part of the parser and looks like i have been able to isolate the error-causing case. i have very little experience using debugging tools to delve in this issue all by myself; i would hence be glad if someone could help me find out what is going on .

the faulty case: https://github.com/icefapper/lubejs
meminfo: https://gist.github.com/icefapper/3081ce0e4f1e8bb17314
my sys is : Linux 3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux (specifically it is LM 17.1)
cpuinfo: https://gist.github.com/icefapper/5dc357ab7d751a7ceb86

i have been experiencing it on 5.6.0 and up (i.e, even with the latest stable 5.9.* which i even built from source)

Thanks a lot reading this far, and I hope you could reproduce the error case. Simply run the "run.sh" thing 40-times or so, and the read the "lubelean.log"; please note, though, that the "lubelean.log" already contains the logs I received by running 'run.sh' which means if you need your own logs, you must delete it and then run 'run.sh'.

@ChALkeR
Copy link
Member

ChALkeR commented Mar 25, 2016

I can reproduce a segfault with just node lubelean.js, with this file: https://raw.githubusercontent.com/icefapper/lubejs/fd592cd28796cf3f557853b5011abcade6ea29fd/lubelean.js

Requires several runs, though.

It doesn't use any modules or Node.js API, except for the console. Most probably this is a v8 issue.

@icefapper could you try to produce a smaller testcase?

@icefapper
Copy link
Author

Thanks for such a quick reply!
Actually, the reason lubelean.js is so big is because i thought someone might want to use other cases (that is, other js code snippets than the default one i have provided.) I'd be glad to make it smaller if it is necessary.

Regards,
Neni

@ChALkeR
Copy link
Member

ChALkeR commented Mar 25, 2016

@icefapper I've done some initial cleanup in https://github.com/ChALkeR/lubejs/blob/master/lubelean.js.
Just removing methods that were not being called reduced the line count about three times.

@icefapper
Copy link
Author

WoW! I just did some leaning myself; I confess though, it is by no means leaner than yours :
https://github.com/icefapper/lubejs/raw/master/lubelean.js

@mscdex mscdex added the v8 engine Issues and PRs related to the V8 dependency. label Mar 25, 2016
@ofrobots
Copy link
Contributor

Stack trace:

  * frame #0: 0x0000000100377534 node`v8::internal::StoreBuffer::IteratePointersToNewSpace(void (*)(v8::internal::HeapObject**, v8::internal::HeapObject*)) + 1972
    frame #1: 0x0000000100313709 node`v8::internal::Heap::Scavenge() + 1273
    frame #2: 0x000000010031203f node`v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) + 879
    frame #3: 0x0000000100311a01 node`v8::internal::Heap::CollectGarbage(v8::internal::GarbageCollector, char const*, char const*, v8::GCCallbackFlags) + 689
    frame #4: 0x00000001002ca72c node`v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationSpace) + 108
    frame #5: 0x000000010051618e node`v8::internal::Runtime_AllocateInTargetSpace(int, v8::internal::Object**, v8::internal::Isolate*) + 110
    frame #6: 0x00000ec7dcd062d5
    frame #7: 0x00000ec7dce9ba92
    frame #8: 0x00000ec7dce9b513
    frame #9: 0x00000ec7dce9b513

@icefapper
Copy link
Author

@ofrobots sorry for asking a newbieish thing, but, could i ask what it means and how you acquird it? thanks a lot

@ofrobots
Copy link
Contributor

@icefapper I got this stack trace by running node inside a debugger (e.g lldb -- node lubelean.js). This shows that the segfault crash occurs while gc is iterating some objects. I am continuing to look into this further.

@ofrobots
Copy link
Contributor

frame #1: 0x000000010021fc93 node_g`v8::internal::Map::instance_size(this=0xdeadbeedbeadbeed) + 35 at objects-inl.h:4294
   4291
   4292
   4293 int Map::instance_size() {
-> 4294   return NOBARRIER_READ_BYTE_FIELD(
   4295       this, kInstanceSizeOffset) << kPointerSizeLog2;
   4296 }
   4297
(lldb) p this
(v8::internal::Map *) $1 = 0xdeadbeedbeadbeed
(lldb) up
frame #2: 0x000000010021faa1 node_g`v8::internal::HeapObject::SizeFromMap(this=0x00001d2ea2ae9929, map=0xdeadbeedbeadbeed) + 33 at objects-inl.h:4353
   4350
   4351
   4352 int HeapObject::SizeFromMap(Map* map) {
-> 4353   int instance_size = map->instance_size();
   4354   if (instance_size != kVariableSizeSentinel) return instance_size;
   4355   // Only inline the most frequent cases.
   4356   InstanceType instance_type = map->instance_type();
(lldb) p this
(v8::internal::HeapObject *) $2 = 0x00001d2ea2ae9929
(lldb) v8 print 0x00001d2ea2ae9928
<Smi: 7470>

EDIT: v8 print is not to be trusted.

@ChALkeR
Copy link
Member

ChALkeR commented Mar 25, 2016

I performed a bit more cleanup, the result is at https://github.com/ChALkeR/lubejs

@ofrobots
Copy link
Contributor

@ChALkeR thanks! I can reproduce the crash with vanilla d8; replacing console.log with print.

@icefapper
Copy link
Author

@ChALkeR funny, I thought the segv error is due to the stack getting repeatedly pushd and popd as the precedences fluctuated (in prseNonSeqExpr); you've proven me wrong :)

@icefapper
Copy link
Author

laugh at me, but aside from the prevalent 139 i frequently got, there was also a 132 error (i.e, "invalid instruction") occasionally showing up, this time coupled with a core dump, and "slice" was always the first frame that appeard in the dump; now i'm not telling the slice is the culprit, but it just makes me quite suspicious about it

@ofrobots
Copy link
Contributor

Very curious. I see a bunch of deadbeef, and kin, in the area of the heap around we die while doing a GC walk. This is the only area on this heap page that I found deadbeef. https://gist.github.com/ofrobots/2e1c02541bfc0ebb675c

@icefapper I haven't been able to reproduce the invalid instruction crashes. That is interesting. These kinds of symptoms would indicate a memory-corruption type failure to me. If you do have those core dumps still available, I would be interested in know about what is in memory around the instruction the processor was trying to execute. I can give you the commands to run to get this data.

@ofrobots ofrobots added the confirmed-bug Issues with confirmed bugs. label Mar 25, 2016
@ofrobots
Copy link
Contributor

On a debug build, the GC zaps garbage memory blocks with deadbee*. This is more reproducible with a debug build, or if you add --verify-heap. This looks like a gc issue to me.

I can reproduce this with V8 4.5 (Node 4) as well.

@ChALkeR
Copy link
Member

ChALkeR commented Mar 25, 2016

@ofrobots, I reproduced the invalid instruction crash. But it happened only once out of many runs.

@ofrobots
Copy link
Contributor

@ChALkeR If you still have a debugger active, or have a core-dump, could you print some instructions around the crash point? Running x/20i $pc in lldb or gdb should work.

@ofrobots
Copy link
Contributor

Opened V8 issue here: https://bugs.chromium.org/p/v8/issues/detail?id=4871

@icefapper
Copy link
Author

will try my best to reproduce the inv-instr error.

Also, thanks for confirming it's not a bug on my side. i was literally on the verge of a nervous breakdown.

@ChALkeR
Copy link
Member

ChALkeR commented Mar 25, 2016

@ofrobots Sorry, I don't have that now. Perhaps running the original testcase overnight would help?

@ChALkeR
Copy link
Member

ChALkeR commented Mar 25, 2016

The testcase is under 100 lines now.

@icefapper
Copy link
Author

@ofrobots @ChALkeR
I have made bit of a progress:
after inlining (i.e, substituting with their bodies) the calls to this.loc, this.locBegin, and this.locOn in both the lean and the original testcase, SIGSEGV disappeard.

First I thought it might be that V8 does not like a function returning an object (silly, I cede, but experience has taught me that silly does not mean impossible.) I guess I was wrong there ; reason? take a look at this excellently leand testcase. Try fiddling with the object lterals that contain a 'loc:' by changing its value to {}, or by eliminating it altogether (there are only four of them I guess). I will try to share some of the variations that make the SIGSEGV disappear. I was so giddy finding out about it that I did not remember to save my findings :\ My take is that it has something to do with the convoluted way V8 might be handling the so-calld "hidden classes". As to why only 'loc' gets affected, I have no idea.

@ChALkeR
Copy link
Member

ChALkeR commented Mar 26, 2016

after inlining (i.e, substituting with their bodies) the calls to this.loc, this.locBegin, and this.locOn in both the lean and the original testcase, SIGSEGV disappeard.

Yes, I noticed that, too. It's not as trivial, though. Only one place counts (take a look at my testcase — it has the call to .loc() in a single place).

Also notice how my testcase redefines loc() several times — removing that keeps the segfault, but it looks like that makes it less often for some reason. I'm not 100% sure on this, though.

The same about re-assigning tok — looks like its presense increases the chances to get a segfault.

@ChALkeR
Copy link
Member

ChALkeR commented Mar 26, 2016

@ofrobots Update the testcase, please. It looks like the new one fails more often (and doesn't require the tok re-assigning anymore).

@icefapper
Copy link
Author

Looks like the vm is gc'ing undefs; please give it a go

@icefapper
Copy link
Author

further evidence for it : https://gist.github.com/icefapper/ee8e346c5eead603c855

@ChALkeR
Copy link
Member

ChALkeR commented Mar 26, 2016

@icefapper Change it to head.type in the first example, and it will still crash. Note that head.type is defined =).

@icefapper
Copy link
Author

Thanks ; could I ask the precise location to apply the change?

@ChALkeR
Copy link
Member

ChALkeR commented Mar 26, 2016

@icefapper both of the lines you changed =)
Edit: that would be in the second example, in fact =)

@icefapper
Copy link
Author

@ChALkeR
You are completely right ; funnily enough, it happens even with ({l: 12}).l instead of head.type

@ChALkeR
Copy link
Member

ChALkeR commented Jun 2, 2016

v8 issue is fixed, is this still reproducable?

@ChALkeR
Copy link
Member

ChALkeR commented Jun 14, 2016

The v8 issue is fixed, and afaik it landed in Node.js.
I can't reproduce on Node.js v6.2.1.

Closing. Feel free to comment and/or reopen =).

@ChALkeR ChALkeR closed this as completed Jun 14, 2016
ofrobots added a commit to ofrobots/node that referenced this issue Jun 15, 2016
This is part 1/2 of the fixes from v8:4871. This fixes a segfault in
verify-heap.

Original commit message:
  [crankshaft] Write fillers for folded old space allocations during verify-heap

  If we don't write fillers, we crash during PagedSpace verification when we try
  to iterate over dead memory (unused folded allocation slots).

  BUG=v8:4871,chromium:580959
  LOG=N

  Review URL: https://codereview.chromium.org/1837163002

  Cr-Commit-Position: refs/heads/master@{#35097}

Fixes: nodejs#5900
V8-Bug: https://bugs.chromium.org/p/v8/issues/detail?id=4871
ofrobots added a commit to ofrobots/node that referenced this issue Jun 15, 2016
This is part 2/2 of the fixes needed for v8:4871. This fix never landed
upstream because the bug is not present in active V8 version. The patch
is available from the upstream v8 bug however.

The segfault occurs at the intersection of the following three
conditions that are dependent on the allocation pattern of an
application: A pretenured (1) allocation site has to be optimized into
a merged allocation by the allocation folding optimization (2) and
there needs to be overflow of the store buffer (3).

This patch disables the allocation folding optimization for pretenured
allocations. This may have some, hopefully negligible, performance
impact on real world applications.

Fixes: nodejs#5900
@ofrobots
Copy link
Contributor

The problem still exist on on v4.x and v5.x.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 15, 2016

@ofrobots Is there anything actionable here (i.e. is it feasible to backport the fix)?

Edit: ah, I see the PR for 4.x.

@ChALkeR ChALkeR reopened this Jun 15, 2016
@ofrobots
Copy link
Contributor

Yes, I opened the PR for v4.x here: #7303.

I am not sure there is enough runway left on v5.x for it to be worth fixing on that branch.

MylesBorins pushed a commit that referenced this issue Jul 1, 2016
This is part 1/2 of the fixes from v8:4871. This fixes a segfault in
verify-heap.

Original commit message:
  [crankshaft] Write fillers for folded old space allocations during verify-heap

  If we don't write fillers, we crash during PagedSpace verification when we try
  to iterate over dead memory (unused folded allocation slots).

  BUG=v8:4871,chromium:580959
  LOG=N

  Review URL: https://codereview.chromium.org/1837163002

  Cr-Commit-Position: refs/heads/master@{#35097}

Fixes: #5900
V8-Bug: https://bugs.chromium.org/p/v8/issues/detail?id=4871

PR-URL: #7303
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: Myles Borins <[email protected]>
MylesBorins pushed a commit that referenced this issue Jul 1, 2016
This is part 2/2 of the fixes needed for v8:4871. This fix never landed
upstream because the bug is not present in active V8 version. The patch
is available from the upstream v8 bug however.

The segfault occurs at the intersection of the following three
conditions that are dependent on the allocation pattern of an
application: A pretenured (1) allocation site has to be optimized into
a merged allocation by the allocation folding optimization (2) and
there needs to be overflow of the store buffer (3).

This patch disables the allocation folding optimization for pretenured
allocations. This may have some, hopefully negligible, performance
impact on real world applications.

Fixes: #5900

PR-URL: #7303
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: Myles Borins <[email protected]>
MylesBorins pushed a commit that referenced this issue Jul 12, 2016
This is part 1/2 of the fixes from v8:4871. This fixes a segfault in
verify-heap.

Original commit message:
  [crankshaft] Write fillers for folded old space allocations during verify-heap

  If we don't write fillers, we crash during PagedSpace verification when we try
  to iterate over dead memory (unused folded allocation slots).

  BUG=v8:4871,chromium:580959
  LOG=N

  Review URL: https://codereview.chromium.org/1837163002

  Cr-Commit-Position: refs/heads/master@{#35097}

Fixes: #5900
V8-Bug: https://bugs.chromium.org/p/v8/issues/detail?id=4871

PR-URL: #7303
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: Myles Borins <[email protected]>
MylesBorins pushed a commit that referenced this issue Jul 12, 2016
This is part 2/2 of the fixes needed for v8:4871. This fix never landed
upstream because the bug is not present in active V8 version. The patch
is available from the upstream v8 bug however.

The segfault occurs at the intersection of the following three
conditions that are dependent on the allocation pattern of an
application: A pretenured (1) allocation site has to be optimized into
a merged allocation by the allocation folding optimization (2) and
there needs to be overflow of the store buffer (3).

This patch disables the allocation folding optimization for pretenured
allocations. This may have some, hopefully negligible, performance
impact on real world applications.

Fixes: #5900

PR-URL: #7303
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: Myles Borins <[email protected]>
@targos
Copy link
Member

targos commented Jul 13, 2016

Can we close this? The fix has landed in v4.x-staging and I don't think we will be doing any new release of v5.

@MylesBorins
Copy link
Contributor

v5 is eol. closing.

richardlau added a commit to ibmruntimes/node that referenced this issue Jul 14, 2016
Original commit message:

deps: backport e7cc609 from upstream V8

This is part 1/2 of the fixes from v8:4871. This fixes a segfault in
verify-heap.

Original commit message:
  [crankshaft] Write fillers for folded old space allocations during verify-heap

  If we don't write fillers, we crash during PagedSpace verification when we try
  to iterate over dead memory (unused folded allocation slots).

  BUG=v8:4871,chromium:580959
  LOG=N

  Review URL: https://codereview.chromium.org/1837163002

  Cr-Commit-Position: refs/heads/master@{#35097}

Fixes: nodejs/node#5900
V8-Bug: https://bugs.chromium.org/p/v8/issues/detail?id=4871

PR-URL: nodejs/node#7303
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: Myles Borins <[email protected]>
richardlau added a commit to ibmruntimes/node that referenced this issue Jul 14, 2016
Original commit message:

deps: fix segfault during gc

This is part 2/2 of the fixes needed for v8:4871. This fix never landed
upstream because the bug is not present in active V8 version. The patch
is available from the upstream v8 bug however.

The segfault occurs at the intersection of the following three
conditions that are dependent on the allocation pattern of an
application: A pretenured (1) allocation site has to be optimized into
a merged allocation by the allocation folding optimization (2) and
there needs to be overflow of the store buffer (3).

This patch disables the allocation folding optimization for pretenured
allocations. This may have some, hopefully negligible, performance
impact on real world applications.

Fixes: nodejs/node#5900

PR-URL: nodejs/node#7303
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: Myles Borins <[email protected]>
@dalvizu
Copy link

dalvizu commented Jan 6, 2017

In case anyone was wondering what the fixed versions were I believe this is it:

dalvizu:~/git/dalvizu/node$ git tag --contains 1164f542
v4.5.0
v4.6.0
v4.6.1
v4.6.2
v4.7.0
v4.7.1
v4.7.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. v8 engine Issues and PRs related to the V8 dependency.
Projects
None yet
Development

No branches or pull requests

7 participants