Version 11+ makes Windows 10 unusable / memory leak #748

Closed
pocesar opened this issue Mar 23, 2018 · 22 comments

@pocesar

pocesar commented Mar 23, 2018

Description of bug:

The wallet starts briefly, then freezes; the database size stays the same

image

Steps to reproduce the issue:

  1. Run Nano Milestone 11 on Windows from a fresh start (database files deleted)
  2. The main window hangs and the werfault.exe process starts using 100% of the CPU (on one core)

Describe the results you received:

Windows Error Reporting starts using 100% of the CPU (on one core) and cannot be killed.
image
image

the only way to stop this is to forcibly restart the computer (taskkill can't kill it)

image

image

Describe the results you expected:

It should work like previous versions

Additional information you deem important (e.g. issue happens only occasionally):

Happened every time, in all 3 attempts

Environment:

  • OS information: Windows 10 x64
  • Node version: Milestone 11

logs

[2018-03-23 17:37:44.065215]: Bootstrap stopped because there are no peers
[2018-03-23 17:37:44.065215]: Bootstrap stopped because there are no peers
[2018-03-23 17:37:44.065215]: Bootstrap stopped because there are no peers
[2018-03-23 17:37:44.065215]: Bootstrap stopped because there are no peers
[2018-03-23 17:37:44.065215]: Exiting bootstrap attempt
[2018-03-23 17:37:44.510075]: Beginning pending block search
[2018-03-23 17:37:44.510075]: Pending block search phase complete
[2018-03-23 17:37:45.515069]: UDP Receive error: No connection could be made because the target machine actively refused it
[2018-03-23 17:37:47.737077]: Beginning pending block search
[2018-03-23 17:37:47.737077]: Pending block search phase complete
[2018-03-23 17:37:49.068256]: Starting bootstrap attempt
[2018-03-23 17:37:49.238260]: Connection established to [::ffff:192.99.176.122]:7075
[2018-03-23 17:37:49.246260]: Connection established to [::ffff:192.99.176.121]:7075
[2018-03-23 17:37:49.246260]: Connection established to [::ffff:165.227.201.217]:7075
[2018-03-23 17:37:49.248260]: Connection established to [::ffff:144.217.167.119]:7075
[2018-03-23 17:37:49.274260]: Connection established to [::ffff:159.89.143.80]:7075
[2018-03-23 17:37:49.276261]: Connection established to [::ffff:138.68.2.234]:7075
[2018-03-23 17:37:49.304261]: Connection established to [::ffff:139.162.199.142]:7075
[2018-03-23 17:37:49.322262]: Connection established to [::ffff:138.201.94.249]:7075
[2018-03-23 17:37:49.402263]: Invalid size: expected 64, got 0
[2018-03-23 17:37:49.404263]: frontier_req failed, reattempting
[2018-03-23 17:37:59.094699]: Error initiating bootstrap connection to [::ffff:89.64.59.63]:10025: The I/O operation has been aborted because of either a thread exit or an application request
[2018-03-23 17:37:59.094699]: Error initiating bootstrap connection to [::ffff:192.81.216.141]:7075: The I/O operation has been aborted because of either a thread exit or an application request
[2018-03-23 17:37:59.094699]: Error initiating bootstrap connection to [::ffff:54.246.128.136]:7075: The I/O operation has been aborted because of either a thread exit or an application request
[2018-03-23 17:37:59.350706]: Connection established to [::ffff:45.76.92.115]:7075
[2018-03-23 17:37:59.352705]: Connection established to [::ffff:212.47.237.7]:7075
[2018-03-23 17:37:59.356701]: Connection established to [::ffff:188.226.155.250]:7075
[2018-03-23 17:37:59.484704]: Connection established to [::ffff:172.104.32.150]:7075
[2018-03-23 17:38:01.985391]: Found a representative at [2001:41d0:8:d85f::1]:7075
[2018-03-23 17:38:02.857584]: Found a representative at [2600:3c03::f03c:91ff:fee5:29e]:7075
[2018-03-23 17:38:04.403929]: Received 303959 frontiers from [::ffff:192.99.176.121]:7075
[2018-03-23 17:38:09.642300]: Found a representative at [::ffff:139.59.31.249]:7075
[2018-03-23 17:38:10.134317]: Found a representative at [::ffff:35.200.122.8]:7075
[2018-03-23 17:38:10.548320]: Completed frontier request, 468710 out of sync accounts according to [::ffff:192.99.176.121]:7075
[2018-03-23 17:38:10.931790]: Found a representative at [::ffff:178.22.66.84]:7075
[2018-03-23 17:38:10.932791]: Found a representative at [::ffff:178.22.66.84]:7075
[2018-03-23 17:38:11.020796]: Requesting account xrb_33de47sgua1pup78kw4ye8nz7ujdz47oucfgco1zb95n8n7hkbq1x81z3md6 from [::ffff:165.227.201.217]:7075. 468709 accounts in queue
[2018-03-23 17:38:11.020796]: Error receiving block type End of file
[2018-03-23 17:38:11.021797]: Error receiving block type End of file
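
A log like the one above is easier to compare between runs once it is tallied by message type. The following is only a minimal sketch: the regular expression assumes the `[timestamp]: message` layout seen in these lines, and the category names in `CATEGORIES` are hypothetical labels chosen for illustration, not anything defined by the node.

```python
import re
from collections import Counter

# Assumed layout, taken from the lines above: "[timestamp]: message"
LOG_LINE = re.compile(r"^\[(?P<ts>[\d\- :.]+)\]: (?P<msg>.*)$")

# Hypothetical category labels keyed by message prefixes seen in the log.
CATEGORIES = {
    "Connection established": "connected",
    "Error initiating bootstrap connection": "connect_error",
    "Error receiving block type": "receive_error",
    "Found a representative": "representative",
    "Bootstrap stopped": "bootstrap_stopped",
}


def categorize(message):
    """Return the label of the first matching prefix, or "other"."""
    for prefix, name in CATEGORIES.items():
        if message.startswith(prefix):
            return name
    return "other"


def summarize(lines):
    """Count log lines per category, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[categorize(match.group("msg"))] += 1
    return counts


sample = [
    "[2018-03-23 17:37:44.065215]: Bootstrap stopped because there are no peers",
    "[2018-03-23 17:37:49.238260]: Connection established to [::ffff:192.99.176.122]:7075",
    "[2018-03-23 17:38:11.020796]: Error receiving block type End of file",
    "[2018-03-23 17:38:11.021797]: Error receiving block type End of file",
]
print(summarize(sample))
```

Running `summarize` over the full log file (one line per list element) would show at a glance whether receive errors or connection errors dominate.
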
@argakiig
Contributor

Had you run a previous build? Did you remove everything from the data dir or just the data.ldb? If you haven't already, could you restart the computer, clear out the data dir and try again?

@pocesar
Author

pocesar commented Mar 24, 2018

yes, it doesn't matter. just tried again and had to hard reset my machine

@clemahieu
Contributor

Based on the number of accounts out of sync it looks like this is a fresh install. Have you run previous versions on this machine?

@pocesar
Author

pocesar commented Mar 24, 2018

yes, since version 8. but because milestone 11 crashed on startup, I decided to wipe the databases and wallet files and try again, rather than reuse the database that was working with previous versions

@argakiig
Contributor

Can you try this version instead and let me know what the results are? You should uninstall the previous one and reinstall with this one.
https://ci.appveyor.com/api/buildjobs/lvlgfxfjamx5cdlc/artifacts/Nano_Installer-11.0.GIT-a22ceb5-win64.exe

@pocesar
Author

pocesar commented Apr 2, 2018

sorry for the delay in responding. I'm trying 11.1 right now and will report back. so far, it says it's synchronizing without the lockup, and using my previous seed without any issue

image

EDIT: although the problem with #227 persists

@pocesar
Author

pocesar commented Apr 2, 2018

ok, it seems that it's stuck forever on Block 1, even from a fresh start. the block count keeps growing, the database file is 1,7GB, but it's just "frozen" in time, although the GUI is responsive.

issue #635 still happens (I closed it, and the process stays in the background doing nothing, but using more and more memory as time passes)

image

@argakiig
Contributor

argakiig commented Apr 2, 2018

What did the unchecked count get to? Chances are the blocks that need to be processed for bootstrapping to continue had not been received yet, which would block any further blocks from being checked in. The process still running after exit is still being investigated.
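
The blocking behaviour described here can be pictured with a toy model. This is only a sketch under stated assumptions: the `Ledger` class and its method names are hypothetical and are not the node's actual data structures. It only illustrates why a single missing block can hold back every block that depends on it.

```python
from collections import defaultdict


class Ledger:
    """Toy model: a block is checked in only once its predecessor is present."""

    def __init__(self, genesis):
        self.checked = [genesis]            # blocks accepted so far, in order
        self.known = {genesis}              # fast lookup of accepted blocks
        self.unchecked = defaultdict(list)  # missing predecessor -> waiting blocks

    def receive(self, block, previous):
        """Accept the block if its predecessor is known, otherwise park it."""
        if previous not in self.known:
            self.unchecked[previous].append(block)
            return
        pending = [block]
        while pending:
            current = pending.pop()
            self.known.add(current)
            self.checked.append(current)
            # Release every block that was waiting on the one just accepted.
            pending.extend(self.unchecked.pop(current, []))

    def unchecked_count(self):
        return sum(len(waiting) for waiting in self.unchecked.values())


ledger = Ledger("A")
ledger.receive("D", previous="C")  # parked: C is missing
ledger.receive("C", previous="B")  # parked: B is missing
print(len(ledger.checked), ledger.unchecked_count())  # 1 2
ledger.receive("B", previous="A")  # unblocks B, then C, then D
print(len(ledger.checked), ledger.unchecked_count())  # 4 0
```

In this model the checked count stays at 1 while the unchecked count grows, and everything is released at once when the missing block arrives.
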

@pocesar
Author

pocesar commented Apr 2, 2018

it was around 3,8kk (3.8 million) and the process is still stuck, and eventually it will leak memory until my system crashes

image

[2018-04-02 12:22:06.746556]: Error receiving block type End of file
[2018-04-02 12:22:06.763563]: Connection established to [::ffff:172.104.32.150]:7075
[2018-04-02 12:22:06.865076]: Error receiving block type End of file
[2018-04-02 12:22:06.867072]: Error receiving block type End of file
[2018-04-02 12:22:07.141112]: Error receiving block type End of file
[2018-04-02 12:22:07.355068]: Error initiating bootstrap connection to [::ffff:35.227.118.51]:7075: The I/O operation has been aborted because of either a thread exit or an application request
[2018-04-02 12:22:07.356068]: Error initiating bootstrap connection to [::ffff:187.58.41.233]:7075: The I/O operation has been aborted because of either a thread exit or an application request
[2018-04-02 12:22:07.555014]: Connection established to [::ffff:167.99.0.55]:7075
[2018-04-02 12:22:07.567938]: Connection established to [::ffff:142.44.246.137]:7075
[2018-04-02 12:22:07.584147]: Connection established to [::ffff:159.89.122.242]:7075
[2018-04-02 12:22:07.621005]: Connection established to [::ffff:188.166.54.69]:7075
[2018-04-02 12:22:07.631510]: Connection established to [::ffff:159.89.4.209]:7075
[2018-04-02 12:22:07.636516]: Connection established to [::ffff:144.76.158.210]:7075
[2018-04-02 12:22:07.718511]: Error receiving block type End of file
[2018-04-02 12:22:07.744531]: Error receiving block type An existing connection was forcibly closed by the remote host
[2018-04-02 12:22:07.776180]: Connection established to [::ffff:14.187.168.219]:7075
[2018-04-02 12:22:07.858256]: Error receiving block type An existing connection was forcibly closed by the remote host
[2018-04-02 12:22:07.868758]: Error initiating bootstrap connection to [::ffff:196.52.2.87]:62444: No connection could be made because the target machine actively refused it
[2018-04-02 12:22:07.881033]: Requesting account xrb_31q4a1fdiyrcpn8s85rmahrju9bertz1pdt8nifikrmo3j5inwsncm7obzw3 from [::ffff:159.65.234.33]:7075. 374517 accounts in queue
[2018-04-02 12:22:08.548974]: Error receiving block type The network connection was aborted by the local system
[2018-04-02 12:22:08.607476]: Connection established to [::ffff:37.187.4.93]:7075
[2018-04-02 12:22:09.367579]: Error initiating bootstrap connection to [::ffff:31.48.146.60]:7075: The I/O operation has been aborted because of either a thread exit or an application request
[2018-04-02 12:22:09.367579]: Error initiating bootstrap connection to [::ffff:193.1.164.220]:7075: The I/O operation has been aborted because of either a thread exit or an application request
[2018-04-02 12:22:09.368573]: Error initiating bootstrap connection to [::ffff:64.235.45.71]:7075: The I/O operation has been aborted because of either a thread exit or an application request
[2018-04-02 12:22:09.636108]: Connection established to [::ffff:62.210.200.13]:7075
[2018-04-02 12:22:09.747122]: Connection established to [2a01:4f8:1c0c:57ea::1]:7075
[2018-04-02 12:22:09.878639]: Error receiving block type End of file
[2018-04-02 12:22:10.092666]: Requesting account xrb_38bpkbmr8b454agr7usznjjjwuebqom49m7chn6am1zcw58byk6dxnkdbdx6 from [::ffff:74.82.30.7]:7075. 374264 accounts in queue
[2018-04-02 12:22:10.373201]: Error initiating bootstrap connection to [::ffff:35.197.141.88]:7075: The I/O operation has been aborted because of either a thread exit or an application request
[2018-04-02 12:22:10.374201]: Error initiating bootstrap connection to [::ffff:115.66.29.38]:7075: The I/O operation has been aborted because of either a thread exit or an application request

@argakiig
Contributor

argakiig commented Apr 2, 2018

Does the memory leak only start when you exit?

@pocesar
Author

pocesar commented Apr 2, 2018

yes, while it was open, the memory was stable, but the UI wasn't doing anything

@argakiig
Contributor

argakiig commented Apr 2, 2018

Also, 3.8kk is only about half downloaded, which would explain why you weren't getting past 1 if it hadn't downloaded the needed block yet. You may want to wait for all of the almost 7.3m blocks to download, as it could be the very last block you download that is needed to continue checking in blocks.

@pocesar
Author

pocesar commented Apr 2, 2018

well, that's really counterproductive. anyway, the process is leaking memory and nearing 2GB, I'll have to kill it

image

@clemahieu
Contributor

clemahieu commented Apr 2, 2018

It sounds like you’re running it on a machine with slow disk IO. If it can't write to disk fast enough it keeps it in memory.

Improving disk IO is slowly coming to the top of the list of things to work on; there have been other, higher-priority issues.
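
The effect described above, where blocks stay in memory whenever the disk cannot keep up, can be sketched as a simple backlog calculation. This is an illustrative model only: the function name and the per-tick rates are assumptions made for the example, not measurements from the node.

```python
def backlog_over_time(arrivals_per_tick, writes_per_tick, ticks):
    """Return the number of blocks held in memory after each tick.

    Each tick, `arrivals_per_tick` blocks arrive from the network and at most
    `writes_per_tick` blocks are flushed to disk; the rest stay buffered.
    """
    backlog = 0
    history = []
    for _ in range(ticks):
        backlog = max(0, backlog + arrivals_per_tick - writes_per_tick)
        history.append(backlog)
    return history


# Disk slower than the network: the in-memory backlog grows without bound.
print(backlog_over_time(100, 60, 5))   # [40, 80, 120, 160, 200]

# Disk faster than the network: nothing accumulates.
print(backlog_over_time(100, 150, 5))  # [0, 0, 0, 0, 0]
```

Under this model, steady memory growth points to slow writes only if arrivals really do outpace them, which is why the disk throughput matters for the diagnosis.
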

@pocesar
Author

pocesar commented Apr 2, 2018

I'm on a Samsung EVO 850 👍

image

@clemahieu
Contributor

Odd. I haven’t seen a memory spike like that, so we’ll have to see what it takes to reproduce.

@argakiig
Contributor

argakiig commented Apr 2, 2018

How much memory does it show as using while it's running and stable?

@pocesar
Author

pocesar commented Apr 3, 2018

@argakiig this is the usual memory usage

image

that's why it's a clear indication of a serious memory leak

EDIT: now that it started processing the blocks, although higher, the memory is stable around 800MB

image

image

@clemahieu
Contributor

It’d be interesting to know what it stabilizes to. When I run it on OSX it’s pretty stable around 70-120. I’ve seen it spike on systems that have bad disk IO, as it keeps blocks in memory until they are written to disk, but with that SSD that’s clearly not the case.

@pocesar
Author

pocesar commented Apr 3, 2018

yeah, the I/O isn't high, that's why it's strange. maybe it's a Windows issue. I decided to monitor the disk activity and it seems fine, no crazy reads or writes: it tops out around 10MB/s, averaging 2MB/s, so it's clearly not an I/O issue. average response time is 3ms. it's fairly constant as expected, the only issue is when you close the wallet

image

image

image

don't know what else to look for. I'm on an older system apart from the SSD, though (Core 2 Quad Q9550 and 8 GB of DDR2 800), but I have no issues with, for example, Bitcoin Core or other more intensive / aggressive / non-optimized coins, like Ethereum

@pocesar
Author

pocesar commented Apr 4, 2018

still synchronizing, really really slowly, at 2,6kk blocks (after more than 24 hours already), and slowing the whole of Windows down. I won't be able to work with the wallet open... text editors lag, sound glitches, and the whole system becomes unresponsive from time to time. nothing wrong in the logs, besides the republish messages:

[2018-04-04 07:29:26.719838]: Block 67F139376DF255B90B62F1FEDED19AF94A549F831AD93EF24EF66FF6C4D6C020 was republished to peers
[2018-04-04 07:29:47.684049]: Block 527F728B5AE76BD938CF34B4A7BFA278E504D3F8B1A18C374C9F91FA4ABB662A was republished to peers
[2018-04-04 07:29:47.685052]: Block 67F139376DF255B90B62F1FEDED19AF94A549F831AD93EF24EF66FF6C4D6C020 was republished to peers
[2018-04-04 07:30:05.485077]: Block 67F139376DF255B90B62F1FEDED19AF94A549F831AD93EF24EF66FF6C4D6C020 was republished to peers
[2018-04-04 07:30:05.486077]: Block 527F728B5AE76BD938CF34B4A7BFA278E504D3F8B1A18C374C9F91FA4ABB662A was republished to peers
[2018-04-04 07:30:23.409112]: Block 527F728B5AE76BD938CF34B4A7BFA278E504D3F8B1A18C374C9F91FA4ABB662A was republished to peers
[2018-04-04 07:30:23.411112]: Block 67F139376DF255B90B62F1FEDED19AF94A549F831AD93EF24EF66FF6C4D6C020 was republished to peers
[2018-04-04 07:30:36.085844]: Block 527F728B5AE76BD938CF34B4A7BFA278E504D3F8B1A18C374C9F91FA4ABB662A was republished to peers
[2018-04-04 07:30:53.320838]: Block 025C36CA39AA81DAE324424566BF952397A954B0069FADF276645FEBB8BC9D49 was republished to peers
[2018-04-04 07:31:08.033694]: UDP Receive error: An existing connection was forcibly closed by the remote host
[2018-04-04 07:31:10.072805]: Block 025C36CA39AA81DAE324424566BF952397A954B0069FADF276645FEBB8BC9D49 was republished to peers
[2018-04-04 07:31:31.934074]: Block 025C36CA39AA81DAE324424566BF952397A954B0069FADF276645FEBB8BC9D49 was republished to peers
[2018-04-04 07:31:31.936069]: Block F1A5731B788F48BF4DB3479FCC6E07613910E1DEE505CB0A219527900BDAEAE8 was republished to peers
[2018-04-04 07:31:45.616858]: Block 025C36CA39AA81DAE324424566BF952397A954B0069FADF276645FEBB8BC9D49 was republished to peers
[2018-04-04 07:31:45.618858]: Block 7837BDCBDAE0EE59064A8156A733BE4EB862B2FA62AF744BC0C86AB4A482BE9A was republished to peers
[2018-04-04 07:31:45.620859]: Block F1A5731B788F48BF4DB3479FCC6E07613910E1DEE505CB0A219527900BDAEAE8 was republished to peers
[2018-04-04 07:32:06.263050]: Block F1A5731B788F48BF4DB3479FCC6E07613910E1DEE505CB0A219527900BDAEAE8 was republished to peers
[2018-04-04 07:32:06.264051]: Block 7837BDCBDAE0EE59064A8156A733BE4EB862B2FA62AF744BC0C86AB4A482BE9A was republished to peers
[2018-04-04 07:32:19.271801]: Block F1A5731B788F48BF4DB3479FCC6E07613910E1DEE505CB0A219527900BDAEAE8 was republished to peers
[2018-04-04 07:32:19.273802]: Block 7837BDCBDAE0EE59064A8156A733BE4EB862B2FA62AF744BC0C86AB4A482BE9A was republished to peers
[2018-04-04 07:32:39.860991]: Block 7837BDCBDAE0EE59064A8156A733BE4EB862B2FA62AF744BC0C86AB4A482BE9A was republished to peers
[2018-04-04 07:33:07.768607]: Found a representative at [::ffff:174.138.4.198]:7075
[2018-04-04 07:33:27.716754]: Block 155FC0C47CB5F21B550FC109375217620151949BF2266258F29A9C0D979C34D2 was republished to peers
[2018-04-04 07:33:48.012653]: Block 155FC0C47CB5F21B550FC109375217620151949BF2266258F29A9C0D979C34D2 was republished to peers
[2018-04-04 07:34:01.997264]: Block 155FC0C47CB5F21B550FC109375217620151949BF2266258F29A9C0D979C34D2 was republished to peers
[2018-04-04 07:34:10.403351]: Found a representative at [::ffff:178.22.66.84]:7075
[2018-04-04 07:34:10.404351]: Found a representative at [::ffff:178.22.66.84]:7075
[2018-04-04 07:34:21.210027]: Block 155FC0C47CB5F21B550FC109375217620151949BF2266258F29A9C0D979C34D2 was republished to peers
[2018-04-04 07:35:27.126818]: Block ADDE2775806EA605BAD6DB89E4A6AC221FFE8052D5B142748B527C0917B8E622 was republished to peers
[2018-04-04 07:35:27.128819]: Block E7CBA3AD696277364BCBB6AE80523C9A811E97735603DEBB9DDC11E4D62126C7 was republished to peers
[2018-04-04 07:35:43.931208]: Block ADDE2775806EA605BAD6DB89E4A6AC221FFE8052D5B142748B527C0917B8E622 was republished to peers
[2018-04-04 07:35:43.932207]: Block E7CBA3AD696277364BCBB6AE80523C9A811E97735603DEBB9DDC11E4D62126C7 was republished to peers
[2018-04-04 07:36:03.997056]: Block AF7C5F9DBE6A556E83C92A69828C3BE3ECD2D4DE6EC3AFBEE6C78F343346EB40 was republished to peers
[2018-04-04 07:36:03.998057]: Block E7CBA3AD696277364BCBB6AE80523C9A811E97735603DEBB9DDC11E4D62126C7 was republished to peers
[2018-04-04 07:36:04.001058]: Block 03FA14707C698B32B2B803548D6C57122B0C7002081E919ECFB8AA4C0C289356 was republished to peers
[2018-04-04 07:36:04.002057]: Block ADDE2775806EA605BAD6DB89E4A6AC221FFE8052D5B142748B527C0917B8E622 was republished to peers
[2018-04-04 07:36:22.798904]: Block 03FA14707C698B32B2B803548D6C57122B0C7002081E919ECFB8AA4C0C289356 was republished to peers
[2018-04-04 07:36:22.800904]: Block AF7C5F9DBE6A556E83C92A69828C3BE3ECD2D4DE6EC3AFBEE6C78F343346EB40 was republished to peers
[2018-04-04 07:36:22.801905]: Block ADDE2775806EA605BAD6DB89E4A6AC221FFE8052D5B142748B527C0917B8E622 was republished to peers
[2018-04-04 07:36:22.802904]: Block E7CBA3AD696277364BCBB6AE80523C9A811E97735603DEBB9DDC11E4D62126C7 was republished to peers
[2018-04-04 07:36:40.382219]: Block AF7C5F9DBE6A556E83C92A69828C3BE3ECD2D4DE6EC3AFBEE6C78F343346EB40 was republished to peers
[2018-04-04 07:36:40.383219]: Block 03FA14707C698B32B2B803548D6C57122B0C7002081E919ECFB8AA4C0C289356 was republished to peers
[2018-04-04 07:36:57.426216]: Block 03FA14707C698B32B2B803548D6C57122B0C7002081E919ECFB8AA4C0C289356 was republished to peers
[2018-04-04 07:36:57.427216]: Block AF7C5F9DBE6A556E83C92A69828C3BE3ECD2D4DE6EC3AFBEE6C78F343346EB40 was republished to peers
[2018-04-04 07:38:59.565811]: Found a representative at [::ffff:64.235.37.184]:7075
[2018-04-04 07:39:08.071785]: UDP Receive error: No connection could be made because the target machine actively refused it
[2018-04-04 07:40:23.742060]: Block 03BC6C9437C341BC8C4ACF0355C719ACB1C59B2537879184612B60A1E7078994 was republished to peers
[2018-04-04 07:40:41.015632]: Block 03BC6C9437C341BC8C4ACF0355C719ACB1C59B2537879184612B60A1E7078994 was republished to peers
[2018-04-04 07:40:55.668284]: Block 03BC6C9437C341BC8C4ACF0355C719ACB1C59B2537879184612B60A1E7078994 was republished to peers
[2018-04-04 07:41:12.588451]: Block 03BC6C9437C341BC8C4ACF0355C719ACB1C59B2537879184612B60A1E7078994 was republished to peers
[2018-04-04 07:44:53.295779]: Found a representative at [::ffff:5.9.31.82]:7075
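
To see how often each block shows up in republish messages like the ones above, the hashes can be counted directly. This is a minimal sketch that assumes the exact wording "was republished to peers" and a 64-character uppercase hex hash, as seen in these lines; `republish_counts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed message format, copied from the log lines above.
REPUBLISHED = re.compile(r"Block ([0-9A-F]{64}) was republished to peers")


def republish_counts(lines):
    """Count how many times each block hash appears in a republish message."""
    counts = Counter()
    for line in lines:
        match = REPUBLISHED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "[2018-04-04 07:40:23.742060]: Block 03BC6C9437C341BC8C4ACF0355C719ACB1C59B2537879184612B60A1E7078994 was republished to peers",
    "[2018-04-04 07:40:41.015632]: Block 03BC6C9437C341BC8C4ACF0355C719ACB1C59B2537879184612B60A1E7078994 was republished to peers",
    "[2018-04-04 07:40:55.668284]: Block 03BC6C9437C341BC8C4ACF0355C719ACB1C59B2537879184612B60A1E7078994 was republished to peers",
    "[2018-04-04 07:41:12.588451]: Block 03BC6C9437C341BC8C4ACF0355C719ACB1C59B2537879184612B60A1E7078994 was republished to peers",
    "[2018-04-04 07:44:53.295779]: Found a representative at [::ffff:5.9.31.82]:7075",
]
for block_hash, count in republish_counts(sample).most_common():
    print(block_hash[:8], count)  # 03BC6C94 4
```
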

@pocesar pocesar changed the title Version 11 just crashes on Windows Version 11+ makes Windows 10 unusable Apr 4, 2018
@pocesar pocesar changed the title Version 11+ makes Windows 10 unusable Version 11+ makes Windows 10 unusable / memory leak Apr 7, 2018
@rkeene rkeene added this to the V18.0 milestone Aug 23, 2018
@rkeene rkeene added question and removed question labels Aug 23, 2018
@rkeene rkeene self-assigned this Dec 21, 2018
@rkeene rkeene added the bug label Dec 21, 2018
@rkeene rkeene removed this from the V18.0 milestone Dec 21, 2018
@rkeene rkeene added this to the V17.0 milestone Dec 21, 2018
@rkeene
Contributor

rkeene commented Dec 21, 2018

This should be resolved by the memory leak fixes supplied in V17.0! See #1338 for additional details.

@rkeene rkeene closed this as completed Dec 21, 2018