About to connect the wrong block #1562
Comments
Some info that would help:
To recover, you can usually start dcrdata with the following switch added to your usual command: |
It seems dcrdata took too long to shut down, so I was forced to close the terminal instead. The command |
I'm glad that resolved it for you. Since it seems to be relatively common that dcrdata gets killed forcibly without shutting down gracefully, and the message says nothing about how to recover by purging blocks (nor does dcrdata try to do it automatically), I'm going to reopen this issue so we can resolve it properly. |
@chappjc I have been facing this issue for some time now, and purging didn't help. In my case, dcrdata was running on a dev server without any sort of interruption.
Also, I found out that duplicate rows are not removed for the votes, misses, and tickets tables. Lines 393 to 402 in fab8a7e
Lines 199 to 312 in fab8a7e
|
Good finds. Any idea how to reproduce that error? I literally never hit it, and I run several servers in production. Unclean shutdown is the only thing that can cause it afaik |
I dropped the dcrdata DB, deleted the data dir, and ran dcrdata afresh and uninterrupted (in pm2 mode), but the panic persisted until I applied a fast-forward to bring stakedb up to the main DB height. |
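For readers following along, here is a minimal sketch of the kind of height check being discussed. It is not dcrdata's code: `checkConnect` and `blocksToFastForward` are hypothetical names, and the error text only mirrors the panic message quoted in this issue. It assumes the panic fires when the block about to be connected is not the direct successor of the current tip, and that a "fast-forward" means advancing the lagging DB up to the main DB height.

```go
package main

import "fmt"

// checkConnect is a hypothetical helper (not dcrdata's API). It returns an
// error when the block about to be connected is not the direct successor of
// the current tip height.
func checkConnect(tipHeight, blockHeight int64) error {
	if blockHeight != tipHeight+1 {
		return fmt.Errorf("about to connect the wrong block: got %d, want %d",
			blockHeight, tipHeight+1)
	}
	return nil
}

// blocksToFastForward is also hypothetical. It reports how many blocks the
// stake DB would need to advance to reach the main DB height, or 0 if the
// stake DB is not behind.
func blocksToFastForward(stakeHeight, mainHeight int64) int64 {
	if stakeHeight >= mainHeight {
		return 0
	}
	return mainHeight - stakeHeight
}

func main() {
	// A stake DB that lags the main DB cannot connect the main DB's next block.
	fmt.Println(checkConnect(100, 121))
	fmt.Println("blocks to fast-forward:", blocksToFastForward(100, 120))
}
```

A check like this only detects the mismatch; it says nothing about why the two heights diverged in the first place.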
You have to be dropping the wrong data. How else would it report any db heights? |
Wait, how does pm2 start and stop dcrdata? Also, what does that do for you? You can run in a tmux. |
I removed |
What I don't understand about pm2 is how it stops dcrdata. You cannot just kill it.
If dcrdata reports any positive DB heights other than dcrd itself, you've not started fresh. Paste the part of the startup logs that looks like this:
Also please paste the error message with the block heights. |
BTW, when I ask how to "reproduce that error", I'm asking for the steps that cause the error starting from a functioning setup. I understand you are getting it repeatedly, but a reproduction is how you get into that state. |
First log output from dcrdata
Everything went well until:
Since then, the panic has persisted at different heights. At some point I did purge the main DB (which took over 8 hours), but after it had synced for some time, I saw the same panic error. |
Thank you for pasting that. I am surprised anything like that happened during the initial sync. If the initial sync (pre-indexing) is interrupted, you really shouldn't bother trying to purge anything, just drop the whole table and datadir. So have you not been able to get a full sync of dcrdata so far because of this? Always panic during stage 1 of 5? |
yes. |
That sucks. I don't know what to recommend. I assume you're not out of disk space? dcrdata/db/dcrpg/chkdcrpg/main.go Lines 197 to 221 in fab8a7e
BTW, those blks/sec are slow. What kind of disk are you using, not that it would cause the error, just slowness. |
Do you know why this is not more like 870897? Is dcrd syncing at the same time? |
At that time, yes. dcrd finished syncing in less than 8 hours. |
OK, don't try to do a dcrdata sync until dcrd has finished its sync/IBD (initial block download). Does it still panic even if dcrd is idle and fully synchronized with the network? |
Yeah, the "fast forward" loop runs in the main import loop instead of a panic.
It still panics. |
OK I'm going to run a fresh sync on both mainnet and testnet. The fast forward loop is not a solution though. We need to get to the root cause. |
I am running a dev server with Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-96-generic x86_64) and 154 GB of disk space. |
And how is disk utilization then? The production testnet machine uses up almost all of that on just the postgresql DB:
The dcrdata datadir will take about 1.5GB on top of that. The dcrd datadir is about 21G.
|
That's a lot of space. I might be able to add more soon, but without the "fast-forward" I can't get past the initial block import. |
I think you're out of disk space... |
Yeah, I noticed it while setting spending info in the addresses table. |
Yeah, 120 GB is a surprising amount. On mainnet:
I'm not sure why testnet surged so much recently. |
|
Scraping the limits of usable space there. You should check the postgresql logs as well as dcrd logs and journalctl. I'd be very surprised if at least one of those wasn't throwing messages about failure to allocate. |
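Alongside checking the logs, free space can be checked before starting a long sync. The sketch below is an illustration only, not part of dcrdata: `freeBytes` and `hasHeadroom` are hypothetical helpers, the path "/" stands in for whichever filesystem holds the PostgreSQL data, and the 150 GiB threshold simply echoes the rough estimate given in this thread. It relies on `syscall.Statfs`, so it assumes Linux or another Unix.

```go
package main

import (
	"fmt"
	"syscall"
)

// freeBytes reports the space available to unprivileged users on the
// filesystem that contains path. Unix only, since it uses syscall.Statfs.
func freeBytes(path string) (uint64, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return 0, err
	}
	return st.Bavail * uint64(st.Bsize), nil
}

// hasHeadroom reports whether free covers the required number of bytes.
func hasHeadroom(free, required uint64) bool {
	return free >= required
}

func main() {
	const gib = 1 << 30
	// "/" is a placeholder; use the mount point of the database volume.
	free, err := freeBytes("/")
	if err != nil {
		fmt.Println("statfs failed:", err)
		return
	}
	fmt.Printf("free: %.1f GiB, enough for 150 GiB: %v\n",
		float64(free)/gib, hasHeadroom(free, 150*gib))
}
```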
dcrd has been running smoothly. For postgresql and stakeDB, space wasn't a concern during the block import stage. I will add more space soon, though, since a complete sync would require about 150GB dedicated to dcrdata alone. |
fresh mainnet sync stage 1 passed:
will update when I do testnet3 |
testnet
|
Mainnet stage 1 syncing in under 3 hours and testnet in under 4 hours is fast, and there was no connect block error. I will run testnet once I top up my server space.
Hi,
When I launch dcrdata with the command ./dcrdata, I am getting a panic of "about to connect the wrong block". See the logs.