exp/services/ledgerexporter: Extend tool to work on ledger ranges in parallel. #4426

Closed
Tracked by #4571
Shaptic opened this issue Jun 6, 2022 · 9 comments

Shaptic (Contributor) commented Jun 6, 2022:

We need to extend the ledgerexporter tool to export ledgers in parallel (e.g. so that it is AWS Batch-able), because single-process exporting is absolutely time-prohibitive for large (and dense) ledger ranges.

Shaptic mentioned this issue Jun 6, 2022
Shaptic changed the title from "Extend ledgerexporter to work in parallel (e.g. AWS Batch-able), because single-process exporting is absolutely time-prohibitive for large ledger ranges." to "exp/services/ledgerexporter: Extend tool to work on ledger ranges in parallel." Jun 6, 2022
bartekn self-assigned this Jun 6, 2022
2opremio (Contributor) commented:

I started implementing a single-process parallel version using goroutines but, considering the current memory requirements of Captive Core, I am thinking that's probably a bad idea.

Then, I thought about simply adding a new --end-ledger flag, which we could use to split subranges across different processes, but that poses another challenge: there is a race condition on the /latest path. I don't think we can get around using some sort of distributed lock.

Additionally, I am wondering whether it's OK to update the /latest path before all the prior ledgers are successfully exported (e.g. update /latest when ledger 35 is exported but ledger 34 hasn't finished yet). Depending on the answer, we may want to choose different parallelization strategies.
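To make that second question concrete, the bookkeeping for the strict option (only advance /latest over a contiguous prefix of exported ledgers) could look like the following in-process sketch. The type and method names are invented for illustration; coordinating this across processes is exactly where the distributed lock would come in.

```go
package ledgerexport

import "sync"

// latestTracker is a hypothetical helper: it only lets /latest
// advance across a contiguous prefix of exported ledgers, so
// /latest never points past a gap.
type latestTracker struct {
	mu       sync.Mutex
	latest   uint32          // last ledger already reflected in /latest
	exported map[uint32]bool // done ledgers not yet contiguous with latest
}

func newLatestTracker(latest uint32) *latestTracker {
	return &latestTracker{latest: latest, exported: map[uint32]bool{}}
}

// markExported records a finished ledger and returns the highest
// ledger that is safe to publish as /latest.
func (t *latestTracker) markExported(seq uint32) uint32 {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.exported[seq] = true
	for t.exported[t.latest+1] { // advance only while there is no gap
		t.latest++
		delete(t.exported, t.latest)
	}
	return t.latest
}
```

With this scheme, exporting ledger 35 before 34 simply parks 35 in the map; /latest stays put until 34 lands.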

2opremio (Contributor) commented:

@bartekn any thoughts about this? Using a distributed lock is going to be a bit of a pain in the ass. If we need to do so, I think I would rather use Kubernetes instead of AWS Batch (where we can use things like https://github.com/werf/lockgate).

2opremio (Contributor) commented Jun 28, 2022:

Actually, we could use a lock in the archive backend itself. For S3, we have Object Lock.

bartekn removed their assignment Jun 29, 2022
bartekn (Contributor) commented Jun 29, 2022:

I'm not sure any kind of locking is needed, because this issue is about exporting old ledgers. So we can split any range into smaller subranges and ingest them in parallel.
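A sketch of that splitting, with a hypothetical helper (names and types are illustrative, not part of the tool):

```go
package ledgerexport

// ledgerRange is a closed range of ledger sequence numbers.
type ledgerRange struct{ From, To uint32 }

// splitRange divides [start, end] into n contiguous subranges,
// one per parallel worker, spreading any remainder over the
// first workers.
func splitRange(start, end uint32, n int) []ledgerRange {
	total := end - start + 1
	size := total / uint32(n)
	rem := total % uint32(n)
	out := make([]ledgerRange, 0, n)
	from := start
	for i := uint32(0); i < uint32(n); i++ {
		to := from + size - 1
		if i < rem {
			to++
		}
		out = append(out, ledgerRange{From: from, To: to})
		from = to + 1
	}
	return out
}
```

For instance, splitRange(10001, 14000, 4) yields 10001–11000, 11001–12000, 12001–13000, and 13001–14000.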

2opremio (Contributor) commented Jun 29, 2022:

Yes, it doesn't make sense to parallelize exporting online ledgers. However, how do you suggest we handle the update of the /latest path? For instance:

  1. The network is at ledger 50000
  2. The ledger archive is at, say, ledger 10000 (as indicated by the /latest path)
  3. We want to use e.g. 4 workers (1k ledgers each), each in its own process
    • worker A: range 10001 to 11000
    • worker B: range 11001 to 12000
    • ...

How does /latest get updated? If we (wrongly) decided that each process updates /latest on its own, we have a race condition (e.g. if worker A finishes last, /latest will end up with the value 11000, which is incorrect).

Maybe we could cut corners and keep it simple, e.g.:

  • we disable updating the /latest endpoint through a flag (which we supply to the exporter workers when running in parallel; see the sketch below)
  • we allow having gaps in the archive (so that we don't have to worry about only updating /latest when all the prior ledgers have been exported)

Thoughts?
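If we accepted that, the driver for the parallel run stays small. A sketch, reusing the ledgerRange type from the earlier sketch; the flags shown (--start-ledger, --end-ledger, --skip-latest-update) are assumptions made for illustration, not the tool's actual CLI:

```go
package ledgerexport

import (
	"fmt"
	"os"
	"os/exec"

	"golang.org/x/sync/errgroup"
)

// runWorkers launches one exporter process per subrange and waits
// for all of them, returning the first error encountered.
func runWorkers(ranges []ledgerRange) error {
	var g errgroup.Group
	for _, r := range ranges {
		r := r // capture the loop variable for the goroutine
		g.Go(func() error {
			cmd := exec.Command("ledgerexporter",
				"--start-ledger", fmt.Sprint(r.From),
				"--end-ledger", fmt.Sprint(r.To),
				"--skip-latest-update", // hypothetical flag from the first bullet above
			)
			cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
			return cmd.Run()
		})
	}
	return g.Wait()
}
```

On AWS Batch the equivalent would be one job per subrange rather than one local process, but the flag-based division of labor is the same.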

Shaptic (Contributor, Author) commented Jun 29, 2022:

This would require quite a bit of extra work, but we could have a /latest.lock and read+update the /latest file "atomically" (see the sketch at the end of this comment)...

We could also have the workers wait on each other in sequence so that /latest is updated in order, though that poses complications with separate runtimes (e.g. on Batch).

We could also abandon the idea of /latest getting written by the exporter jobs and either have a "job manager" which writes it out at the end, or a post-processing step in the job itself (or the whole batch), e.g. `echo $(ls ./ledgers | sort -n | tail -n1) > ./latest`
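The /latest.lock idea is straightforward to sketch for a filesystem-backed archive, where an exclusive create gives atomicity. The function name and layout here are assumptions, and the approach doesn't translate directly to S3, which is what makes the distributed-lock question hard:

```go
package ledgerexport

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// updateLatest acquires /latest.lock via an atomic exclusive
// create, then bumps /latest if seq is ahead of it.
func updateLatest(dir string, seq uint32) error {
	lock := filepath.Join(dir, "latest.lock")
	f, err := os.OpenFile(lock, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
	if err != nil {
		return err // lock held by another worker; caller retries
	}
	f.Close()
	defer os.Remove(lock)

	path := filepath.Join(dir, "latest")
	cur, _ := os.ReadFile(path) // a missing file just parses as 0
	prev, _ := strconv.ParseUint(strings.TrimSpace(string(cur)), 10, 32)
	if uint64(seq) <= prev {
		return nil // never move /latest backwards
	}
	return os.WriteFile(path, []byte(fmt.Sprint(seq)), 0o644)
}
```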

Shaptic (Contributor, Author) commented Jun 29, 2022:

> Additionally, I am wondering whether it's OK to update the /latest path before all the prior ledgers are successfully exported (e.g. update /latest when ledger 35 is exported but ledger 34 hasn't finished yet). Depending on the answer, we may want to choose different parallelization strategies.

Currently, the indexer assumes that /latest is accurate. Specifically, it polls the file, and if it grows, the assumption is that those newly added ledgers are available. If we want that behavior to keep working (it was unclear at the time whether we'd do real-time or batched index updates), then /latest needs to in fact be the latest.
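For context, the consumer side of that contract looks roughly like the following sketch (names assumed, not the indexer's actual code):

```go
package ledgerexport

import (
	"os"
	"strconv"
	"strings"
	"time"
)

// pollLatest watches the /latest file and reports each newly
// visible ledger range. It trusts that every ledger up to /latest
// exists, which is exactly why /latest must never run ahead of a gap.
func pollLatest(path string, interval time.Duration, process func(from, to uint32)) {
	var seen uint32
	for range time.Tick(interval) {
		b, err := os.ReadFile(path)
		if err != nil {
			continue // file missing or unreadable; try again later
		}
		v, err := strconv.ParseUint(strings.TrimSpace(string(b)), 10, 32)
		if err != nil || uint32(v) <= seen {
			continue // no growth since the last poll
		}
		process(seen+1, uint32(v))
		seen = uint32(v)
	}
}
```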

Shaptic (Contributor, Author) commented Jun 30, 2022:

Another suggestion: we could use a two-step model for parallelization, where each process exports ledgers to its own (sub-)directory with its own /latest file, and then another process merges them all into a single directory.
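A sketch of that merge step, assuming a layout of a ledgers/ subdirectory plus a latest file per worker directory (in practice this would go through the archive backend rather than the local filesystem, and it presumes every worker completed its subrange):

```go
package ledgerexport

import (
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// merge moves every worker's exported ledgers into one final
// directory, then writes the overall /latest as the max of the
// per-worker /latest files.
func merge(workerDirs []string, dst string) error {
	var latest uint64
	for _, d := range workerDirs {
		entries, err := os.ReadDir(filepath.Join(d, "ledgers"))
		if err != nil {
			return err
		}
		for _, e := range entries {
			src := filepath.Join(d, "ledgers", e.Name())
			if err := os.Rename(src, filepath.Join(dst, "ledgers", e.Name())); err != nil {
				return err
			}
		}
		b, err := os.ReadFile(filepath.Join(d, "latest"))
		if err != nil {
			return err
		}
		v, err := strconv.ParseUint(strings.TrimSpace(string(b)), 10, 64)
		if err != nil {
			return err
		}
		if v > latest {
			latest = v
		}
	}
	return os.WriteFile(filepath.Join(dst, "latest"),
		[]byte(strconv.FormatUint(latest, 10)), 0o644)
}
```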

Shaptic (Contributor, Author) commented Jul 14, 2022:

Done in #4443
