About to connect the wrong block #1562

Open
mkingori opened this issue Sep 29, 2019 · 32 comments

@mkingori
Contributor

Hi,

When I launch dcrdata with the command ./dcrdata, I get a panic: "about to connect the wrong block". See the logs below.

2019-09-29 14:12:38.440 [INF] DATD: Log folder:  /home/kingmoses/.dcrdata/logs/mainnet
2019-09-29 14:12:38.440 [INF] DATD: Config file: /home/kingmoses/.dcrdata/dcrdata.conf
2019-09-29 14:12:38.441 [INF] DATD: dcrdata version 5.2.0-pre+dev (Go version go1.12.9)
2019-09-29 14:12:38.473 [INF] RPCC: Established connection to RPC server localhost:9109
2019-09-29 14:12:38.474 [INF] DATD: Connected to dcrd (JSON-RPC API v6.1.0) on MainNet
2019-09-29 14:12:38.474 [INF] SKDB: Loading ticket pool DB. This may take a minute...
2019-09-29 14:12:38.558 [INF] SKDB: badger: All 1 tables opened in 83ms
2019-09-29 14:12:38.561 [INF] SKDB: Loading all ticket pool diffs...
2019-09-29 14:12:55.277 [INF] SKDB: Successfully loaded 383908 ticket pool diffs
2019-09-29 14:12:55.390 [INF] SKDB: Advancing ticket pool DB to tip via diffs...
2019-09-29 14:12:55.828 [INF] SKDB: Pre-populating live ticket cache and computing pool value...
2019-09-29 14:12:59.121 [INF] DATD: Loaded StakeDatabase at height 383907
2019-09-29 14:12:59.122 [INF] DATD: Setting up the Politeia's proposals clone repository. Please wait...
2019-09-29 14:13:00.819 [INF] DATD: Address cache capacity: 986895 rows, 134217728 bytes
2019-09-29 14:13:00.826 [INF] PSQL: Switching PostgreSQL time zone to UTC for this session.
2019-09-29 14:13:00.828 [INF] PSQL: PostgreSQL 11.5 (Debian 11.5-2+b1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.2.1-8) 9.2.1 20190909, 64-bit
2019-09-29 14:13:00.837 [INF] PSQL: DB schema version 1.2.0
2019-09-29 14:13:02.161 [INF] PSQL: Pre-loading unspent ticket info for InsertVote optimization.
2019-09-29 14:13:02.442 [INF] PSQL: Storing data for 43413 unspent tickets in cache.
2019-09-29 14:13:02.451 [INF] PSQL: Setting PostgreSQL DB statement timeout to 1h0m0s.
2019-09-29 14:13:02.675 [INF] EXPR: Mean Voting Blocks calculated: 7860
2019-09-29 14:13:02.708 [INF] EXPR: Starting WebsocketHub run loop.
2019-09-29 14:13:02.711 [INF] PUBS: Starting WebsocketHub run loop.
2019-09-29 14:13:02.724 [INF] IAPI: Started Insight socket.io server for up to 16384 clients.
2019-09-29 14:13:02.726 [INF] DATD: Now serving the explorer and APIs on http://127.0.0.1:7777/
2019-09-29 14:13:02.976 [INF] DATD: Starting blockchain sync...
2019-09-29 14:13:02.977 [INF] PSQL: Current best block (dcrd):        383910
2019-09-29 14:13:02.977 [INF] PSQL: Current best block (primary db):  383908
2019-09-29 14:13:02.977 [INF] PSQL: Current best block (stakedb):     383907
2019-09-29 14:13:02.977 [INF] PSQL: Processing blocks 383909 to 383910...
panic: about to connect the wrong block: 383909, 383907

goroutine 383963 [running]:
github.com/decred/dcrdata/db/dcrpg/v4.(*ChainDB).SyncChainDB(0xc00037a2c0, 0x12fd500, 0xc00007bf00, 0x130c420, 0xc001839280, 0xc00a880000, 0x0, 0xc001d1f0e0, 0x0, 0x0, ...)
	/home/kingmoses/go/src/github.com/decred/dcrdata/db/dcrpg/sync.go:414 +0x4905
github.com/decred/dcrdata/db/dcrpg/v4.(*ChainDB).SyncChainDBAsync(0xc00037a2c0, 0x12fd500, 0xc00007bf00, 0xc001842660, 0x130c420, 0xc001839280, 0x0, 0x0, 0xc001d1f0e0)
	/home/kingmoses/go/src/github.com/decred/dcrdata/db/dcrpg/sync.go:112 +0xaf
created by main._main.func10
	/home/kingmoses/go/src/github.com/decred/dcrdata/main.go:812 +0x23f
@chappjc
Member

chappjc commented Sep 29, 2019

Some info that would help:

  1. Details of how you got to this point. Had you been running without trouble prior to this? Did the initial sync on a previous run complete all the way? Did dcrdata crash without a clean shutdown? Can you supply logs?
  2. Are you able to reproduce this from scratch? Not just from the broken data folder, but from an empty DB and data folder, following a certain set of steps.

To recover, you can usually start dcrdata with the following switch added to your usual command: --purge-n-blocks N, where N is at least the number that "stakedb" is behind "primary db". In this case, just use 32 or more.
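
For example, with the command used in the report above, that would look something like this (just a sketch using the value suggested in this comment; adjust N for your own heights):

./dcrdata --purge-n-blocks=32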

@mkingori
Contributor Author

It seems dcrdata took too long to shut down, so I was forced to close the terminal instead. The --purge-n-blocks N switch helped me solve the issue.

@chappjc
Member

chappjc commented Sep 30, 2019

I'm glad that resolved it for you.

Since it seems to be relatively common that dcrdata gets killed forcibly without shutting down gracefully, and the panic message says nothing about how to recover by purging blocks (nor does dcrdata try to do it automatically), I'm going to reopen this issue so we can resolve it properly.

@chappjc chappjc reopened this Sep 30, 2019
@ukane-philemon
Collaborator

@chappjc I have been facing this issue for some time now, and purging didn't help. In my case, dcrdata was running on a dev server without any sort of interruption.
A solution that worked for me, though I'm not certain whether it has consequences:

  • Fast-forwarding the stakedb height to the main db height before connecting the next block.

Also, I found that duplicate rows are not removed for the votes, misses, and tickets tables (a sketch of the missing entries follows the quoted code below):

dcrdata/db/dcrpg/sync.go

Lines 393 to 402 in fab8a7e

	// To build indexes, there must NOT be duplicate rows in terms of the
	// constraints defined by the unique indexes. Duplicate transactions,
	// vins, and vouts can end up in the tables when identical transactions
	// are included in multiple blocks. This happens when a block is
	// invalidated and the transactions are subsequently re-mined in another
	// block. Remove these before indexing.
	log.Infof("Finding and removing duplicate table rows before indexing...")
	if err = pgb.DeleteDuplicates(barLoad); err != nil {
		return 0, err
	}

dcrdata/db/dcrpg/tables.go

Lines 199 to 312 in fab8a7e

// DeleteDuplicates attempts to delete "duplicate" rows in tables where unique
// indexes are to be created.
func (pgb *ChainDB) DeleteDuplicates(barLoad chan *dbtypes.ProgressBarLoad) error {
	var allDuplicates []dropDuplicatesInfo
	if pgb.cockroach {
		allDuplicates = append(allDuplicates,
			// Remove duplicate vins
			dropDuplicatesInfo{TableName: "vins", DropDupsFunc: pgb.DeleteDuplicateVinsCockroach},
			// Remove duplicate vouts
			dropDuplicatesInfo{TableName: "vouts", DropDupsFunc: pgb.DeleteDuplicateVoutsCockroach},
		)
	} else {
		allDuplicates = append(allDuplicates,
			// Remove duplicate vins
			dropDuplicatesInfo{TableName: "vins", DropDupsFunc: pgb.DeleteDuplicateVins},
			// Remove duplicate vouts
			dropDuplicatesInfo{TableName: "vouts", DropDupsFunc: pgb.DeleteDuplicateVouts},
		)
	}
	allDuplicates = append(allDuplicates,
		// Remove duplicate transactions
		dropDuplicatesInfo{TableName: "transactions", DropDupsFunc: pgb.DeleteDuplicateTxns},
		// Remove duplicate agendas
		dropDuplicatesInfo{TableName: "agendas", DropDupsFunc: pgb.DeleteDuplicateAgendas},
		// Remove duplicate agenda_votes
		dropDuplicatesInfo{TableName: "agenda_votes", DropDupsFunc: pgb.DeleteDuplicateAgendaVotes},
	)
	var err error
	for _, val := range allDuplicates {
		msg := fmt.Sprintf("Finding and removing duplicate %s entries...", val.TableName)
		if barLoad != nil {
			barLoad <- &dbtypes.ProgressBarLoad{BarID: dbtypes.InitialDBLoad, Subtitle: msg}
		}
		log.Info(msg)
		var numRemoved int64
		if numRemoved, err = val.DropDupsFunc(); err != nil {
			return fmt.Errorf("delete %s duplicate failed: %v", val.TableName, err)
		}
		msg = fmt.Sprintf("Removed %d duplicate %s entries.", numRemoved, val.TableName)
		if barLoad != nil {
			barLoad <- &dbtypes.ProgressBarLoad{BarID: dbtypes.InitialDBLoad, Subtitle: msg}
		}
		log.Info(msg)
	}
	// Signal task is done
	if barLoad != nil {
		barLoad <- &dbtypes.ProgressBarLoad{BarID: dbtypes.InitialDBLoad, Subtitle: " "}
	}
	return nil
}
// DeleteDuplicatesRecovery attempts to delete "duplicate" rows in all tables
// where unique indexes are to be created. This is like DeleteDuplicates, but
// it also includes transactions, tickets, votes, and misses.
func (pgb *ChainDB) DeleteDuplicatesRecovery(barLoad chan *dbtypes.ProgressBarLoad) error {
	allDuplicates := []dropDuplicatesInfo{
		// Remove duplicate vins
		{TableName: "vins", DropDupsFunc: pgb.DeleteDuplicateVins},
		// Remove duplicate vouts
		{TableName: "vouts", DropDupsFunc: pgb.DeleteDuplicateVouts},
		// Remove duplicate transactions
		{TableName: "transactions", DropDupsFunc: pgb.DeleteDuplicateTxns},
		// Remove duplicate tickets
		{TableName: "tickets", DropDupsFunc: pgb.DeleteDuplicateTickets},
		// Remove duplicate votes
		{TableName: "votes", DropDupsFunc: pgb.DeleteDuplicateVotes},
		// Remove duplicate misses
		{TableName: "misses", DropDupsFunc: pgb.DeleteDuplicateMisses},
		// Remove duplicate agendas
		{TableName: "agendas", DropDupsFunc: pgb.DeleteDuplicateAgendas},
		// Remove duplicate agenda_votes
		{TableName: "agenda_votes", DropDupsFunc: pgb.DeleteDuplicateAgendaVotes},
	}
	var err error
	for _, val := range allDuplicates {
		msg := fmt.Sprintf("Finding and removing duplicate %s entries...", val.TableName)
		if barLoad != nil {
			barLoad <- &dbtypes.ProgressBarLoad{BarID: dbtypes.InitialDBLoad, Subtitle: msg}
		}
		log.Info(msg)
		var numRemoved int64
		if numRemoved, err = val.DropDupsFunc(); err != nil {
			return fmt.Errorf("delete %s duplicate failed: %v", val.TableName, err)
		}
		msg = fmt.Sprintf("Removed %d duplicate %s entries.", numRemoved, val.TableName)
		if barLoad != nil {
			barLoad <- &dbtypes.ProgressBarLoad{BarID: dbtypes.InitialDBLoad, Subtitle: msg}
		}
		log.Info(msg)
	}
	// Signal task is done
	if barLoad != nil {
		barLoad <- &dbtypes.ProgressBarLoad{BarID: dbtypes.InitialDBLoad, Subtitle: " "}
	}
	return nil
}

Addressed HERE.
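
For illustration, a minimal sketch of the kind of change I mean, reusing the dropDuplicatesInfo entries already defined in DeleteDuplicatesRecovery above (the actual change is in the linked PR, so treat this only as an outline):

	// Hypothetical addition inside DeleteDuplicates, mirroring DeleteDuplicatesRecovery:
	// also queue the tickets, votes, and misses tables for duplicate removal.
	allDuplicates = append(allDuplicates,
		// Remove duplicate tickets
		dropDuplicatesInfo{TableName: "tickets", DropDupsFunc: pgb.DeleteDuplicateTickets},
		// Remove duplicate votes
		dropDuplicatesInfo{TableName: "votes", DropDupsFunc: pgb.DeleteDuplicateVotes},
		// Remove duplicate misses
		dropDuplicatesInfo{TableName: "misses", DropDupsFunc: pgb.DeleteDuplicateMisses},
	)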

@chappjc
Member

chappjc commented Feb 6, 2022

Good finds.

Any idea how to reproduce that error? I literally never hit it, and I run several servers in production. An unclean shutdown is the only thing that can cause it, afaik.

@ukane-philemon
Collaborator

ukane-philemon commented Feb 6, 2022

I dropped the dcrdata db, deleted the data dir, and ran dcrdata afresh, uninterrupted (pm2 mode), but the panic still persisted until I applied a fast-forward to take stakedb to the main db height.

@chappjc
Member

chappjc commented Feb 6, 2022

You have to be dropping the wrong data. How else would it report any db heights?

@chappjc
Member

chappjc commented Feb 6, 2022

Wait, how does pm2 start and stop dcrdata? Also, what does that do for you? You can run in a tmux.

@ukane-philemon
Collaborator

You have to be dropping the wrong data. How else would it report any db heights?

I removed the ~/.dcrdata/data path and dropped the dcrdata db, so I am starting afresh.
pm2 helps me keep the dcrdata process running on my dev server while I am offline. Maybe I will give tmux a try.

@chappjc
Member

chappjc commented Feb 6, 2022

What I don't understand about pm2 is how it stops dcrdata. You cannot just kill it.

I removed the ~/.dcrdata/data path and dropped the dcrdata db, so I am starting afresh.

If dcrdata reports any positive DB heights other than dcrd itself, you've not started fresh. Paste the part of the startup logs that looks like this:

[INF] PSQL: DB schema version 1.11.0
[DBG] PSQL: meta height 628189 / blocks height 628189
...
[DBG] DATD: chainDB height: 628189
...
[INF] DATD: Starting blockchain sync...
[INF] PSQL: Current best block (dcrd):        631276
[INF] PSQL: Current best block (primary db):  628189
[INF] PSQL: Current best block (stakedb):     628189

Also please paste the error message with the block heights.

@chappjc
Member

chappjc commented Feb 6, 2022

BTW, when I ask how to "reproduce that error", I'm asking for the steps that cause the error starting from a functioning setup. I understand you are getting it repeatedly, but a reproduction means showing how you got to that state.

@ukane-philemon
Collaborator

First log output from dcrdata

2022-01-19 20:59:07.531 [INF] PSQL: Empty database "dcrdata". Creating tables...
2022-01-19 20:59:07.532 [INF] PSQL: Creating the "meta" table.
2022-01-19 20:59:07.552 [INF] PSQL: Creating the "blocks" table.
2022-01-19 20:59:07.560 [INF] PSQL: Creating the "transactions" table.
2022-01-19 20:59:07.564 [INF] PSQL: Creating the "vins" table.
2022-01-19 20:59:07.578 [INF] PSQL: Creating the "vouts" table.
2022-01-19 20:59:07.583 [INF] PSQL: Creating the "block_chain" table.
2022-01-19 20:59:07.590 [INF] PSQL: Creating the "addresses" table.
2022-01-19 20:59:07.595 [INF] PSQL: Creating the "tickets" table.
2022-01-19 20:59:07.598 [INF] PSQL: Creating the "votes" table.
2022-01-19 20:59:07.602 [INF] PSQL: Creating the "misses" table.
2022-01-19 20:59:07.607 [INF] PSQL: Creating the "agendas" table.
2022-01-19 20:59:07.611 [INF] PSQL: Creating the "agenda_votes" table.
2022-01-19 20:59:07.613 [INF] PSQL: Creating the "testing" table.
2022-01-19 20:59:07.615 [INF] PSQL: Creating the "stats" table.
2022-01-19 20:59:07.620 [INF] PSQL: Creating the "treasury" table.
2022-01-19 20:59:07.622 [INF] PSQL: Creating the "swaps" table.
2022-01-19 20:59:07.630 [INF] PSQL: Pre-loading unspent ticket info for InsertVote optimization.
2022-01-19 20:59:07.631 [INF] PSQL: Setting PostgreSQL DB statement timeout to 20m0s.
2022-01-19 20:59:07.648 [INF] DATD: chainDB block summary table is empty.
2022-01-19 20:59:07.654 [INF] AGDB: agendas.db version 1.0.0 was set
2022-01-19 20:59:07.662 [INF] PRDB: proposals.db version 2.0.0 was set
2022-01-19 20:59:07.838 [INF] EXPR: Mean Voting Blocks calculated: 1006
2022-01-19 20:59:07.886 [INF] EXPR: Starting WebsocketHub run loop.
2022-01-19 20:59:07.887 [INF] PUBS: Starting WebsocketHub run loop.
2022-01-19 20:59:07.914 [INF] DATD: chainDB block summary table is empty.
2022-01-19 20:59:07.914 [INF] IAPI: Started Insight socket.io server.
2022-01-19 20:59:07.916 [INF] DATD: Now serving the explorer and APIs on http://localhost:17778/
2022-01-19 20:59:08.167 [INF] DATD: Starting blockchain sync...
2022-01-19 20:59:08.168 [INF] PSQL: Tables are empty, starting fresh.
2022-01-19 20:59:08.169 [INF] PSQL: Current best block (dcrd):        341542
2022-01-19 20:59:08.169 [INF] PSQL: Current best block (primary db):  -1
2022-01-19 20:59:08.169 [INF] PSQL: Current best block (stakedb):     0
2022-01-19 20:59:08.169 [INF] PSQL: Collecting all UTXO data prior to height 0...
2022-01-19 20:59:08.169 [INF] PSQL: Pre-warming UTXO cache with 0 UTXOs...
2022-01-19 20:59:08.169 [INF] PSQL: UTXO cache is ready.
2022-01-19 20:59:08.169 [INF] PSQL: Large bulk load: Removing indexes and disabling duplicate checks.
2022-01-19 20:59:08.172 [INF] PSQL: Creating temporary index on addresses(tx_vin_vout_row_id).
2022-01-19 20:59:08.174 [INF] PSQL: Beginning SYNC STAGE 1 of 5 (block data import).
2022-01-19 20:59:08.174 [INF] PSQL: Scanning genesis block into chain db.
2022-01-19 20:59:08.187 [INF] PSQL: Processing blocks 1 to 500...
2022-01-19 20:59:15.555 [INF] PSQL: Processing blocks 501 to 1000...
2022-01-19 20:59:25.511 [INF] PSQL: Processing blocks 1001 to 1500...
2022-01-19 20:59:28.193 [INF] PSQL: ( 53 blk/s,  480 tx/s,  562 vin/sec, 1697 vout/s)
2022-01-19 20:59:48.182 [INF] PSQL: ( 21 blk/s,  540 tx/s,  746 vin/sec, 2011 vout/s)
2022-01-19 20:59:48.635 [INF] PSQL: Processing blocks 1501 to 2000...
2022-01-19 21:00:07.542 [INF] PSQL: Processing blocks 2001 to 2500...

Everything went well until:

panic: About to connect the wrong block: 549218, 546806
The stake database is corrupted. Restart with --purge-n-blocks=4824 to recover.

goroutine 1 [running]:
github.com/decred/dcrdata/db/dcrpg/v7.(*ChainDB).SyncChainDB.func6(0x86162, 0x128cf48, 0x32, 0xc017afa1a0)
	/root/dcrdata/db/dcrpg/sync.go:308 +0x13bf
github.com/decred/dcrdata/db/dcrpg/v7.(*ChainDB).SyncChainDB(0xc0004c8000, 0x161d5f8, 0xc000357200, 0x161da20, 0xc017277180, 0x101, 0x0, 0xc017436540, 0x0, 0x0, ...)
	/root/dcrdata/db/dcrpg/sync.go:361 +0xfc3
main._main.func10(0x0, 0x0, 0x0)
	/root/dcrdata/cmd/dcrdata/main.go:812 +0xd7
main._main(0x161d5f8, 0xc000357200, 0x0, 0x0)
	/root/dcrdata/cmd/dcrdata/main.go:825 +0x3839
main.main()
	/root/dcrdata/cmd/dcrdata/main.go:60 +0x85
panic: About to connect the wrong block: 549217, 546806
The stake database is corrupted. Restart with --purge-n-blocks=4822 to recover.

Since then, the panic has persisted at different heights. At some point I did purge the main db (it took over 8 hours), but after it synced for some time, I saw the same panic error.

@chappjc
Member

chappjc commented Feb 6, 2022

Thank you for pasting that. I am surprised anything like that happened during the initial sync.

If the initial sync (pre-indexing) is interrupted, you really shouldn't bother trying to purge anything; just drop the whole table and datadir. So have you not been able to get a full sync of dcrdata so far because of this? Always a panic during stage 1 of 5?
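
For reference, starting completely fresh might look roughly like this (a sketch only; the database name "dcrdata" and the ~/.dcrdata data directory are the ones that appear in the logs and paths in this thread, and assume default settings):

dropdb dcrdata && createdb dcrdata
rm -rf ~/.dcrdata/data

Then start dcrdata again and let the initial sync run to completion.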

@ukane-philemon
Collaborator

So have you not been able to get a full sync of dcrdata so far because of this? Always panic during stage 1 of 5?

Yes.

@chappjc
Member

chappjc commented Feb 6, 2022

That sucks. I don't know what to recommend. I assume you're not out of disk space?
What did you mean by "fast forward"? Like this?

// // Advance to last block.
// if stakeDBHeight+1 < lastBlock {
// 	log.Infof("Fast-forwarding StakeDatabase from %d to %d...",
// 		stakeDBHeight, lastBlock)
// }
// for stakeDBHeight < lastBlock {
// 	// check for quit signal
// 	if shutdownRequested(ctx) {
// 		return fmt.Errorf("StakeDatabase fast-forward cancelled at height %d.",
// 			stakeDBHeight)
// 	}
// 	block, blockHash, err := rpcutils.GetBlock(stakeDBHeight+1, client)
// 	if err != nil {
// 		return fmt.Errorf("GetBlock failed (%s): %v", blockHash, err)
// 	}
// 	if err = stakeDB.ConnectBlock(block); err != nil {
// 		return err
// 	}
// 	stakeDBHeight = int64(stakeDB.Height())
// 	if stakeDBHeight%1000 == 0 {
// 		log.Infof("Stake DB at height %d.", stakeDBHeight)
// 	}
// }

BTW, those blk/s numbers are slow. What kind of disk are you using? Not that it would cause the error, just slowness.

@chappjc
Member

chappjc commented Feb 6, 2022

2022-01-19 20:59:08.169 [INF] PSQL: Current best block (dcrd): 341542

Do you know why this is not more like 870897? Is dcrd syncing at the same time?

@ukane-philemon
Collaborator

Do you know why this is not more like 870897? Is dcrd syncing at the same time?

At the time, yes. dcrd finished syncing in less than 8 hours.

@chappjc
Member

chappjc commented Feb 6, 2022

OK, don't try to do the dcrdata sync until dcrd has finished its sync/IBD (initial block download). Does it still panic even if dcrd is idle and all synchronized with the network?

@ukane-philemon
Collaborator

What did you mean by "fast forward"? Like this?

Yeah, the "fast forward" loop runs in the main import loop instead of a panic.

Does it still panic even if dcrd is idle and all synchronized with the network?

It still panics.

@chappjc
Member

chappjc commented Feb 6, 2022

OK I'm going to run a fresh sync on both mainnet and testnet. The fast forward loop is not a solution though. We need to get to the root cause.

@ukane-philemon
Collaborator

I am running a dev server with Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-96-generic x86_64) and 154 GB of disk space.

@chappjc
Member

chappjc commented Feb 6, 2022

And how is disk utilization then? The production testnet machine uses up almost all of that on just the postgresql DB:

postgres=# SELECT datname, pg_size_pretty(pg_database_size(datname)) db_size FROM pg_database ORDER BY db_size;
  datname  | db_size 
-----------+---------
 dcrdata   | 145 GB

The dcrdata datadir will take about 1.5GB on top of that.

The dcrd datadir is about 21G.

df -H says the drive is at 221G utilization with everything else.
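
If it would help to see where that space goes, a per-table breakdown is available with a plain PostgreSQL catalog query like the one below (standard psql, nothing dcrdata-specific), run while connected to the dcrdata database:

SELECT relname, pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;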

@ukane-philemon
Collaborator

df -H says the drive is at 221G utilization with everything else.

That's a lot of space. I might be able to add more space soon, but without the "fast-forward" I can't get past the initial block import.

@chappjc
Member

chappjc commented Feb 6, 2022

I think you're out of disk space...
When you hit the error, check df -h and report back. Also check dcrd logs and journalctl -e

@ukane-philemon
Collaborator

I think you're out of disk space...

Yeah, I noticed it while setting spending info in the addresses table.

@chappjc
Member

chappjc commented Feb 6, 2022

That's a lot of space

Yeah, 120 GB is a surprising amount. On mainnet:

postgres=# SELECT datname, pg_size_pretty(pg_database_size(datname)) db_size FROM pg_database ORDER BY db_size;
     datname     | db_size 
-----------------+---------
 dcrdata_mainnet | 85 GB

I'm not sure why testnet surged so much recently.

@ukane-philemon
Collaborator

df -h report:

udev            3.9G     0  3.9G   0% /dev
tmpfs           796M  1.1M  795M   1% /run
/dev/vda1       155G  143G   13G  93% /
tmpfs           3.9G   16K  3.9G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/vda15      105M  5.2M  100M   5% /boot/efi
/dev/loop2       68M   68M     0 100% /snap/lxd/21835
/dev/loop1       68M   68M     0 100% /snap/lxd/21545
/dev/loop3       44M   44M     0 100% /snap/snapd/14295
/dev/loop4       44M   44M     0 100% /snap/snapd/14549
/dev/loop5       62M   62M     0 100% /snap/core20/1270
tmpfs           796M     0  796M   0% /run/user/0
/dev/loop6       62M   62M     0 100% /snap/core20/1328

@chappjc
Member

chappjc commented Feb 6, 2022

You're scraping the limits of usable space there. You should check the PostgreSQL logs as well as the dcrd logs and journalctl. I'd be very surprised if at least one of those wasn't throwing messages about failure to allocate.
Please try on a machine with more space.
However, you might be able to sync mainnet on that machine, since for whatever reason it needs less space than testnet. Even that will be tight with both dcrdata and dcrd on the same drive.

@ukane-philemon
Collaborator

ukane-philemon commented Feb 6, 2022

dcrd has been running smoothly. For PostgreSQL and the stake DB, space wasn't a concern during the block import stage. I will add more space soon, though, since a complete sync would require about 150 GB dedicated to dcrdata alone.

@chappjc
Member

chappjc commented Feb 7, 2022

Fresh mainnet sync, stage 1 passed:

2022-02-06 17:44:07.553 [INF] PSQL: Processing blocks 631001 to 631317...
2022-02-06 17:44:15.453 [INF] PSQL: ( 40 blk/s,  725 tx/s, 4134 vin/sec, 4953 vout/s)
2022-02-06 17:44:15.887 [INF] PSQL: Avg. speed: 1145 tx/s, 5152 vout/s
2022-02-06 17:44:15.887 [INF] PSQL: Block import elapsed: 162.01 minutes, 631318 blocks (64.95 blocks/s)
2022-02-06 17:44:15.887 [INF] PSQL: Beginning SYNC STAGE 2 of 5 (duplicate row removal).
...
2022-02-06 17:49:23.124 [INF] PSQL: Beginning SYNC STAGE 3 of 5 (table indexing and analyzing).
...
2022-02-06 17:56:38.334 [INF] PSQL: Beginning SYNC STAGE 4 of 5 (deep database ANALYZE).
2022-02-06 17:57:38.349 [INF] PSQL: Beginning SYNC STAGE 5 of 5 (setting spending info in addresses table). This will take a while.
2022-02-06 17:57:38.351 [INF] PSQL: Updating address rows for blocks [0,9999]...
2022-02-06 17:57:52.104 [INF] PSQL: Updating address rows for blocks [10000,19999]...
...
2022-02-06 18:17:49.033 [INF] PSQL: Updating address rows for blocks [630000,631317]...
2022-02-06 18:17:52.871 [INF] PSQL: Updated 33460663 rows of addresses table.
2022-02-06 18:17:52.872 [INF] PSQL: Indexing addresses table on matching_tx_hash...
2022-02-06 18:19:16.418 [INF] PSQL: Performing an ANALYZE(600) on addresses table...
2022-02-06 18:19:19.576 [INF] PSQL: Catching up with network at block height 631326 from 631317...
2022-02-06 18:19:20.610 [INF] PSQL: SYNC COMPLETED at height 631326. Delta:
			    631327 blocks
			  11138463 transactions
			  37259547 ins
			  50082487 outs
			  72150839 addresses

Will update when I do testnet3.

@chappjc
Member

chappjc commented Feb 7, 2022

Testnet:

2022-02-06 22:08:39.126 [INF] PSQL: Processing blocks 870501 to 871000...
2022-02-06 22:08:46.132 [INF] PSQL: ( 42 blk/s,  691 tx/s, 4080 vin/sec, 4828 vout/s)
2022-02-06 22:08:51.330 [INF] PSQL: Processing blocks 871001 to 871106...
2022-02-06 22:08:53.892 [INF] PSQL: Avg. speed: 961 tx/s, 4721 vout/s
2022-02-06 22:08:53.892 [INF] PSQL: Block import elapsed: 223.13 minutes, 871107 blocks (65.07 blocks/s)
2022-02-06 22:08:53.893 [INF] PSQL: Beginning SYNC STAGE 2 of 5 (duplicate row removal).
...
2022-02-06 22:15:37.713 [INF] PSQL: Beginning SYNC STAGE 3 of 5 (table indexing and analyzing).
...
2022-02-06 22:24:56.163 [INF] PSQL: Beginning SYNC STAGE 4 of 5 (deep database ANALYZE).
2022-02-06 22:26:11.787 [INF] PSQL: Beginning SYNC STAGE 5 of 5 (setting spending info in addresses table). This will take a while.
2022-02-06 22:26:11.794 [INF] PSQL: Updating address rows for blocks [0,9999]...
2022-02-06 22:26:33.473 [INF] PSQL: Updating address rows for blocks [10000,19999]...
...
2022-02-06 22:55:34.173 [INF] PSQL: Updating address rows for blocks [870000,871106]...
2022-02-06 22:55:38.609 [INF] PSQL: Updated 43088652 rows of addresses table.
2022-02-06 22:55:38.609 [INF] PSQL: Indexing addresses table on matching_tx_hash...
2022-02-06 22:57:45.107 [INF] PSQL: Performing an ANALYZE(600) on addresses table...
2022-02-06 22:57:52.259 [INF] PSQL: Catching up with network at block height 871135 from 871106...
2022-02-06 22:57:54.661 [INF] PSQL: SYNC COMPLETED at height 871135. Delta:
			    871136 blocks
			  12866327 transactions
			  48427059 ins
			  63206847 outs
			  91843276 addresses

@ukane-philemon
Collaborator

Mainnet stage 1 syncing in < 3 hrs and testnet in < 4 hrs is fast, and no connect-block error. Will run testnet once I top up my server space.
