-
Notifications
You must be signed in to change notification settings - Fork 499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Horizon light: Dockerize indexer to run in AWS Batch #4501
Horizon light: Dockerize indexer to run in AWS Batch #4501
Conversation
53f9ce6
to
009ce92
Compare
@@ -0,0 +1,27 @@ | |||
FROM golang:1.18 AS builder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the basis for locating this in the services/horizon/docker path, I'm currently developing in same space for dockering up light images and was thinking of the docker/k8s files being closer to lighthorizon like /exp/lighthorizon/build
or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have a point. Can we move them in a separate PR? (Since we also need to move the ledgerexporter files)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I can do that in my upcoming PR for deployment of this on k8s, here's compare of my dev branch I have in progress that uses exp/lighthorizon/builld
once you merge this as-is, I'll pull that into my 4493 dev, and refactor this and ledger exporter over to exp/lighthorizon/build, sound good?
END="${END:=0}" | ||
CONTINUE="${CONTINUE:=false}" | ||
# Writing to /latest is disabled by default to avoid race conditions between parallel container runs | ||
WRITE_LATEST_PATH="${WRITE_LATEST_PATH:=false}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, right? I'm not sure how the parameters are getting passed through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Shaptic @2opremio , I wonder if it may be easier on us If we hold on this PR, and I take the net intent of it for index and roll it into the dev branch for #4493 i have in in process, which is dockering index and web processes from horizonlight and defining k8s for them? was thinking to also move services/horizon/docker/ledgerexporter into there as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely need a map-reduce-specific Docker image for index building so that we can build indices on AWS Batch ➡️ S3.
The "real-time" or "watcher" indexer can be a separate image and deployed on k8s like you're saying.
Maybe we can just scope this one to the map/reduce part and ignore single
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, yes, that break-out helps, so, there are two index deployment modes, map-reduce
and real-time
and we can align docker images against that mode.
For map-reduce
, it needs to run the two index commands map
and reduce
in some arrangement of o/s processes in one container or across multiple depends on the orchestration environment, in either of those cases it could be a single docker image having both index map
and reduce
installed and env vars determine which is executed?
and then for 'real-time' we build another image with just the index 'single' cmd installed and env variables to config it's exec. sounds good, I can focus on dev for the real-time
in #4493, and then won't overlap/collide with scope of this PR for the 'map-reduce'.
thanks for clarification!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The start file was wrongly copied. It's not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sreuland why can't you reuse the indexer image for all usecases? I included all the binaries and you simply have to specify the command to run to the container.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no tech reason, there seemed to be some separation/encapsulation above at orchestration layer, ie.a docker image per higher level mode/function which provides ENTRYPOINT through a start
script that adapts env variables and knows how to invoke the lower level index commands, rather than the orchestration layer needing to know details on how to invoking each index cmd directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@2opremio , I tried this out in the PR for the web docker, I put all the docker for lighthorizon under exp/lighthorizon/build
, I think exp/lighthorizon/build/build.sh index-batch stellar
would produce the same docker image with bins as here, moved ledgerexporter into there also. lmk if it looks like a good direction, maybe we can combine efforts.
009ce92
to
f60b00c
Compare
f3032f0
to
e12cc83
Compare
-e WORKER_COUNT=1 -v $PWD/ledgermeta/:/ledgermeta -e TXMETA_SOURCE=file:///ledgermeta -v $PWD/index:/index -e INDEX_TARGET=file:///index stellar/horizon-indexer map | ||
|
||
# run reduce job | ||
docker run -e NETWORK_PASSPHRASE='pubnet' -e AWS_BATCH_JOB_ARRAY_INDEX=0 -e MAP_JOB_COUNT=1 -e REDUCE_JOB_COUNT=1 \ | ||
docker run -e NETWORK_PASSPHRASE='pubnet' -e JOB_INDEX_ENV=AWS_BATCH_JOB_ARRAY_INDEX -e AWS_BATCH_JOB_ARRAY_INDEX=0 -e MAP_JOB_COUNT=1 -e REDUCE_JOB_COUNT=1 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not JOB_INDEX=$AWS_BATCH_JOB_ARRAY_INDEX
? Then it's just passing the value, rather than the name of a name to look up heh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That assumes that the batch system we use will replace env variables when setting them, which I don't even know is the case for batch
@2opremio , since #4519 merged which refactored the docker images under exp/lighthorizon/build, I grabbed the net changes from this PR based on current lighthorizon branch and captured them in new PR #4543, I think that PR can be used and deprecate the need to fix conflicts&merge this PR, lmk what you think, thanks! |
@sreuland sounds good! Thanks for the refactoring work!!! |
This PR creates a docker image (stellar/horizon-indexer) which works in a similar fashion to stellar/horizon-verify-rage
and is tested and pushed as part of the Horizon GitHub workflow.