Horizon light: Dockerize indexer to run in AWS Batch #4501

2opremio · 2022-08-03T14:06:59Z

This PR creates a docker image (stellar/horizon-indexer) which works in a similar fashion to stellar/horizon-verify-rage
and is tested and pushed as part of the Horizon GitHub workflow.

sreuland · 2022-08-03T17:23:26Z

services/horizon/docker/indexer/Dockerfile

@@ -0,0 +1,27 @@
+FROM golang:1.18 AS builder


what is the basis for locating this in the services/horizon/docker path, I'm currently developing in same space for dockering up light images and was thinking of the docker/k8s files being closer to lighthorizon like /exp/lighthorizon/build or something?

I think you have a point. Can we move them in a separate PR? (Since we also need to move the ledgerexporter files)

yes, I can do that in my upcoming PR for deployment of this on k8s, here's compare of my dev branch I have in progress that uses exp/lighthorizon/builld

once you merge this as-is, I'll pull that into my 4493 dev, and refactor this and ledger exporter over to exp/lighthorizon/build, sound good?

exp/lighthorizon/index/types/bitmap.go

services/horizon/docker/indexer/README.md

Shaptic · 2022-08-03T22:21:31Z

services/horizon/docker/indexer/start

+END="${END:=0}"
+CONTINUE="${CONTINUE:=false}"
+# Writing to /latest is disabled by default to avoid race conditions between parallel container runs
+WRITE_LATEST_PATH="${WRITE_LATEST_PATH:=false}"


Same here, right? I'm not sure how the parameters are getting passed through.

@Shaptic @2opremio , I wonder if it may be easier on us If we hold on this PR, and I take the net intent of it for index and roll it into the dev branch for #4493 i have in in process, which is dockering index and web processes from horizonlight and defining k8s for them? was thinking to also move services/horizon/docker/ledgerexporter into there as well.

We definitely need a map-reduce-specific Docker image for index building so that we can build indices on AWS Batch ➡️ S3.

The "real-time" or "watcher" indexer can be a separate image and deployed on k8s like you're saying.

Maybe we can just scope this one to the map/reduce part and ignore single?

ah, yes, that break-out helps, so, there are two index deployment modes, map-reduce and real-time and we can align docker images against that mode.

For map-reduce , it needs to run the two index commands map and reduce in some arrangement of o/s processes in one container or across multiple depends on the orchestration environment, in either of those cases it could be a single docker image having both index map and reduce installed and env vars determine which is executed?

and then for 'real-time' we build another image with just the index 'single' cmd installed and env variables to config it's exec. sounds good, I can focus on dev for the real-time in #4493, and then won't overlap/collide with scope of this PR for the 'map-reduce'.

thanks for clarification!

The start file was wrongly copied. It's not used.

@sreuland why can't you reuse the indexer image for all usecases? I included all the binaries and you simply have to specify the command to run to the container.

no tech reason, there seemed to be some separation/encapsulation above at orchestration layer, ie.a docker image per higher level mode/function which provides ENTRYPOINT through a start script that adapts env variables and knows how to invoke the lower level index commands, rather than the orchestration layer needing to know details on how to invoking each index cmd directly.

@2opremio , I tried this out in the PR for the web docker, I put all the docker for lighthorizon under exp/lighthorizon/build, I think exp/lighthorizon/build/build.sh index-batch stellar would produce the same docker image with bins as here, moved ledgerexporter into there also. lmk if it looks like a good direction, maybe we can combine efforts.

Shaptic · 2022-08-08T16:57:57Z

.github/workflows/horizon.yml

                     -e WORKER_COUNT=1 -v $PWD/ledgermeta/:/ledgermeta -e TXMETA_SOURCE=file:///ledgermeta -v $PWD/index:/index -e INDEX_TARGET=file:///index stellar/horizon-indexer map

          # run reduce job
-          docker run -e NETWORK_PASSPHRASE='pubnet' -e AWS_BATCH_JOB_ARRAY_INDEX=0 -e MAP_JOB_COUNT=1 -e REDUCE_JOB_COUNT=1 \
+          docker run -e NETWORK_PASSPHRASE='pubnet' -e JOB_INDEX_ENV=AWS_BATCH_JOB_ARRAY_INDEX -e AWS_BATCH_JOB_ARRAY_INDEX=0 -e MAP_JOB_COUNT=1 -e REDUCE_JOB_COUNT=1 \


Why not JOB_INDEX=$AWS_BATCH_JOB_ARRAY_INDEX? Then it's just passing the value, rather than the name of a name to look up heh.

That assumes that the batch system we use will replace env variables when setting them, which I don't even know is the case for batch

stale

sreuland · 2022-08-13T20:59:24Z

@2opremio , since #4519 merged which refactored the docker images under exp/lighthorizon/build, I grabbed the net changes from this PR based on current lighthorizon branch and captured them in new PR #4543, I think that PR can be used and deprecate the need to fix conflicts&merge this PR, lmk what you think, thanks!

2opremio · 2022-08-14T11:36:49Z

@sreuland sounds good! Thanks for the refactoring work!!!

sreuland · 2022-08-15T16:10:14Z

confirmed with @2opremio, closing this PR in lieu of PR #4543 which carries the net changes from this on top of newly refactored docker image setup under lighthorizon.

2opremio force-pushed the dockerize-hlight-map-reduce branch from 53f9ce6 to 009ce92 Compare August 3, 2022 14:09

2opremio requested a review from a team August 3, 2022 14:16

2opremio marked this pull request as ready for review August 3, 2022 14:16

sreuland reviewed Aug 3, 2022

View reviewed changes

sreuland approved these changes Aug 3, 2022

View reviewed changes

Shaptic reviewed Aug 3, 2022

View reviewed changes

exp/lighthorizon/index/types/bitmap.go Show resolved Hide resolved

Shaptic previously requested changes Aug 3, 2022

View reviewed changes

2opremio force-pushed the dockerize-hlight-map-reduce branch from 009ce92 to f60b00c Compare August 4, 2022 10:08

2opremio added 2 commits August 4, 2022 21:16

Dockerize indexer to run in AWS Batch

d5c7292

Parameterize job index env variable (it can't be overriden in Batch)

e12cc83

2opremio force-pushed the dockerize-hlight-map-reduce branch from f3032f0 to e12cc83 Compare August 4, 2022 19:17

Shaptic reviewed Aug 8, 2022

View reviewed changes

This was referenced Aug 8, 2022

exp/lighthorizon: add horizon web docker/k8s deployment #4519

Merged

exp/lighthorizon/build/index-batch: merge map/reduce updates to latest on feature branch #4543

Merged

sreuland closed this Aug 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Horizon light: Dockerize indexer to run in AWS Batch #4501

Horizon light: Dockerize indexer to run in AWS Batch #4501

2opremio commented Aug 3, 2022

sreuland Aug 3, 2022

2opremio Aug 3, 2022

sreuland Aug 3, 2022

Shaptic Aug 3, 2022

sreuland Aug 3, 2022

Shaptic Aug 3, 2022

sreuland Aug 4, 2022

2opremio Aug 4, 2022

2opremio Aug 4, 2022 •

edited

Loading

sreuland Aug 4, 2022

sreuland Aug 5, 2022

Shaptic Aug 8, 2022

2opremio Aug 12, 2022

sreuland commented Aug 13, 2022

2opremio commented Aug 14, 2022

sreuland commented Aug 15, 2022

Horizon light: Dockerize indexer to run in AWS Batch #4501

Horizon light: Dockerize indexer to run in AWS Batch #4501

Conversation

2opremio commented Aug 3, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

2opremio Aug 4, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sreuland commented Aug 13, 2022

2opremio commented Aug 14, 2022

sreuland commented Aug 15, 2022

2opremio Aug 4, 2022 •

edited

Loading