
Horizon light: Dockerize indexer to run in AWS Batch #4501

Closed

Conversation

@2opremio (Contributor) commented Aug 3, 2022:

This PR creates a Docker image (stellar/horizon-indexer) which works in a similar fashion to stellar/horizon-verify-range
and is tested and pushed as part of the Horizon GitHub workflow.

@2opremio 2opremio force-pushed the dockerize-hlight-map-reduce branch from 53f9ce6 to 009ce92 Compare August 3, 2022 14:09
@2opremio 2opremio requested a review from a team August 3, 2022 14:16
@2opremio 2opremio marked this pull request as ready for review August 3, 2022 14:16
@@ -0,0 +1,27 @@
FROM golang:1.18 AS builder
Contributor:
What is the basis for locating this in the services/horizon/docker path? I'm currently developing in the same space for dockerizing light images, and was thinking of the docker/k8s files being closer to lighthorizon, like /exp/lighthorizon/build or something?

Contributor (Author):

I think you have a point. Can we move them in a separate PR? (Since we also need to move the ledgerexporter files)

Contributor:

Yes, I can do that in my upcoming PR for deployment of this on k8s. Here's a compare of my dev branch in progress that uses exp/lighthorizon/build.

Once you merge this as-is, I'll pull that into my #4493 dev branch, and refactor this and ledgerexporter over to exp/lighthorizon/build. Sound good?

@Shaptic previously requested changes Aug 3, 2022
services/horizon/docker/indexer/README.md (outdated)
END="${END:=0}"
CONTINUE="${CONTINUE:=false}"
# Writing to /latest is disabled by default to avoid race conditions between parallel container runs
WRITE_LATEST_PATH="${WRITE_LATEST_PATH:=false}"
Contributor:

Same here, right? I'm not sure how the parameters are getting passed through.
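(A hedged illustration of the pass-through in question, assuming the defaults above live in the image's start script: variables set with docker run -e arrive in the container's environment, and anything omitted falls back to the ${VAR:=default} expansions. The image and command here are placeholders, not the actual invocation.)

    # Hypothetical invocation sketch: -e values override the start script's
    # defaults; END, CONTINUE, and WRITE_LATEST_PATH are the variables from
    # the diff above.
    docker run \
      -e END=100000 \
      -e CONTINUE=true \
      -e WRITE_LATEST_PATH=true \
      <image> <command>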

Contributor:

@Shaptic @2opremio, I wonder if it may be easier on us if we hold off on this PR, and I take the net intent of it for the indexer and roll it into the dev branch for #4493 that I have in process, which dockerizes the index and web processes from lighthorizon and defines k8s for them? I was thinking to also move services/horizon/docker/ledgerexporter in there as well.

Contributor:

We definitely need a map-reduce-specific Docker image for index building so that we can build indices on AWS Batch ➡️ S3.

The "real-time" or "watcher" indexer can be a separate image and deployed on k8s like you're saying.

Maybe we can just scope this one to the map/reduce part and ignore single?

Contributor:

Ah, yes, that break-out helps. So there are two index deployment modes, map-reduce and real-time, and we can align docker images against each mode.

For map-reduce, it needs to run the two index commands, map and reduce, in some arrangement of OS processes; whether that's in one container or across multiple depends on the orchestration environment. In either case it could be a single docker image with both index map and reduce installed, and env vars determine which is executed?

And then for 'real-time' we build another image with just the index 'single' cmd installed and env variables to config its exec. Sounds good; I can focus on dev for the real-time in #4493, and then won't overlap/collide with the scope of this PR for the 'map-reduce'.

thanks for the clarification!
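(A minimal sketch of the single-image dispatch being discussed, assuming a hypothetical start script as the ENTRYPOINT and hypothetical binary paths; the real image may wire this differently.)

    #!/usr/bin/env bash
    # One image carries both batch binaries; the first container argument
    # (or a JOB_TYPE env var, if no argument is given) selects which runs.
    set -euo pipefail
    JOB_TYPE="${1:-${JOB_TYPE:-}}"
    case "$JOB_TYPE" in
      map)    exec /usr/local/bin/map ;;
      reduce) exec /usr/local/bin/reduce ;;
      *)      echo "usage: $0 {map|reduce}" >&2; exit 1 ;;
    esac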

Contributor (Author):

The start file was wrongly copied. It's not used.

@2opremio (Contributor, Author) commented Aug 4, 2022:

@sreuland why can't you reuse the indexer image for all use cases? I included all the binaries, and you simply have to specify the command for the container to run.

Contributor:

No tech reason; there seemed to be some separation/encapsulation above at the orchestration layer, i.e. a docker image per higher-level mode/function which provides an ENTRYPOINT through a start script that adapts env variables and knows how to invoke the lower-level index commands, rather than the orchestration layer needing to know the details of how to invoke each index cmd directly.

Contributor:

@2opremio, I tried this out in the PR for the web docker: I put all the docker for lighthorizon under exp/lighthorizon/build. I think exp/lighthorizon/build/build.sh index-batch stellar would produce the same docker image with bins as here; I moved ledgerexporter in there also. lmk if it looks like a good direction, maybe we can combine efforts.

@2opremio 2opremio force-pushed the dockerize-hlight-map-reduce branch from 009ce92 to f60b00c Compare August 4, 2022 10:08
@2opremio 2opremio force-pushed the dockerize-hlight-map-reduce branch from f3032f0 to e12cc83 Compare August 4, 2022 19:17
-e WORKER_COUNT=1 -v $PWD/ledgermeta/:/ledgermeta -e TXMETA_SOURCE=file:///ledgermeta -v $PWD/index:/index -e INDEX_TARGET=file:///index stellar/horizon-indexer map

# run reduce job
- docker run -e NETWORK_PASSPHRASE='pubnet' -e AWS_BATCH_JOB_ARRAY_INDEX=0 -e MAP_JOB_COUNT=1 -e REDUCE_JOB_COUNT=1 \
+ docker run -e NETWORK_PASSPHRASE='pubnet' -e JOB_INDEX_ENV=AWS_BATCH_JOB_ARRAY_INDEX -e AWS_BATCH_JOB_ARRAY_INDEX=0 -e MAP_JOB_COUNT=1 -e REDUCE_JOB_COUNT=1 \
Contributor:

Why not JOB_INDEX=$AWS_BATCH_JOB_ARRAY_INDEX? Then it's just passing the value, rather than the name of a name to look up heh.

Contributor (Author):

That assumes that the batch system we use will expand env variables when setting them, which I don't even know is the case for AWS Batch.
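(A minimal sketch of the indirection being discussed, assuming the start script resolves JOB_INDEX_ENV via bash indirect expansion; the variable names mirror the diff above, but the script itself is hypothetical.)

    #!/usr/bin/env bash
    # JOB_INDEX_ENV holds the *name* of the variable carrying the job index
    # (AWS Batch sets AWS_BATCH_JOB_ARRAY_INDEX on each array job itself),
    # so the image works even if the batch system never expands env vars
    # when setting them. ${!VAR} dereferences the named variable.
    JOB_INDEX_ENV="${JOB_INDEX_ENV:=AWS_BATCH_JOB_ARRAY_INDEX}"
    JOB_INDEX="${!JOB_INDEX_ENV:?$JOB_INDEX_ENV is not set}"
    echo "running with job index $JOB_INDEX"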

@sreuland (Contributor):

@2opremio, since #4519 merged, which refactored the docker images under exp/lighthorizon/build, I grabbed the net changes from this PR based on the current lighthorizon branch and captured them in new PR #4543. I think that PR can be used, deprecating the need to fix conflicts and merge this PR. lmk what you think, thanks!

@2opremio (Contributor, Author):

@sreuland sounds good! Thanks for the refactoring work!!!

@sreuland (Contributor):

Confirmed with @2opremio; closing this PR in favor of PR #4543, which carries the net changes from this one on top of the newly refactored docker image setup under lighthorizon.

@sreuland sreuland closed this Aug 15, 2022