Solr on ECS #36
Merged
- Solr running on ECS in fargate
- Public IP (no load balancer)
- No authentication
- No persistent volumes
- Create IAM Role for ECS Task Role and ECS Execution Role (same role)
- Encrypt EFS Volume with key that is managed here
- Add EFS File Policy to ensure data transit is encrypted
- Optimize security groups for NFS communication from ECS to EFS
- Enable Cloudwatch Logging from ECS Cluster and ECS Solr Task Definition
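As a rough illustration of the "EFS File Policy to ensure data transit is encrypted" item, the sketch below builds the kind of policy document AWS documents for this purpose: a single `Deny` statement keyed on `aws:SecureTransport`. The function name, the `Sid`, and the example ARN are assumptions made for illustration; the PR itself manages this policy in Terraform and its exact statements are not shown here.

```python
import json


def efs_tls_only_policy(file_system_arn: str) -> str:
    """Return an EFS file system policy (as JSON) that denies non-TLS access.

    Hypothetical helper: the name and Sid are illustrative, not from the PR.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedTransport",
                "Effect": "Deny",
                "Principal": {"AWS": "*"},
                "Action": "*",
                "Resource": file_system_arn,
                # Requests made without TLS have aws:SecureTransport == "false".
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder ARN; substitute the real file system ARN.
    print(efs_tls_only_policy(
        "arn:aws:elasticfilesystem:us-west-2:111122223333:file-system/fs-12345678"
    ))
```

A policy like this only rejects unencrypted mounts; the client still has to mount with TLS enabled for the connection to succeed.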
To avoid collisions between deployments, make each unique
- Use Load Balancer in front of ecs task
- Run all workloads in private subnets
- Only Load Balancer is public
- Enable SSL for Load Balancer domain
- Create custom domain (copy from eks brokerpak, dns, dnssec, et cetera)
- Update security groups for load balancer and ecs task
- Force HTTPS connection to solr
- For the load balancer to serve traffic, it needs to be in a public subnet.
- Everything else is in a private subnet and security groups are used to facilitate internal traffic
- Update name for vpc
- Disable deletion protection on the load balancer so that it can be deprovisioned properly
- Disable public IP for ecs task
- Fix the DNS Alias record to load balancer
- Use GSA Solr Image (with cyber bugfixes)
- Allow LB to talk to GHCR to pull image
- Refine the start command to allow the ckan core to be created automatically
The 'root_directory' config in an efs access_point allows a unique directory to be created with specific chmod/chown permissions. This is perfect for solr user ownership :)
- Fix IAM and Security Groups so the LB Health Check passes
- Fix silently failing EFS mount issue (EFS would mount, but it was missing a security group rule to allow communication to task)
- Do some work to ensure longer Broker names work (WIP)
- Temporarily allow all Egress until GHCR Image pull is fixed again
- Lots of things were opened up to be less restrictive during testing. Will iterate back to a more secure version since we have a working point now **wipes sweat**
- Don't use built-in EFS volume mount support, use a modified docker image with EFS-utils installed and manually mount volume during startup
- Add more permissions to IAM role
- Disable some restrictions on EFS file system policy
- ECS Service is mounting efs directly, so it needs to depend on the EFS volume so that it can unmount before EFS gets destroyed
- Restore EFS File System Policy to deny insecure connections
- Can mount the /var/solr/data into a different directory on the EFS mount w/o efs ap :)
- Use a temporary password to create a randomly generated secure admin user/pass
- Output url/user/pass for user to see (hint: use 'terraform output -json' to see sensitive values)
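The PR generates these credentials inside Terraform; as a language-neutral illustration of what "randomly generated secure admin user/pass" amounts to, here is a minimal Python sketch using the standard `secrets` module. The function name, the lengths, and the alphanumeric-only alphabet are assumptions, not values taken from the PR.

```python
import secrets
import string

# Alphanumeric only, to sidestep quoting issues in URLs and shell commands
# (an assumption; the PR may allow other characters).
ALPHABET = string.ascii_letters + string.digits


def random_credentials(user_len: int = 16, pass_len: int = 32) -> tuple[str, str]:
    """Return a (username, password) pair drawn from a CSPRNG."""
    user = "".join(secrets.choice(ALPHABET) for _ in range(user_len))
    password = "".join(secrets.choice(ALPHABET) for _ in range(pass_len))
    return user, password


if __name__ == "__main__":
    user, password = random_credentials()
    print(f"user={user} password=<{len(password)} characters, not printed>")
```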
Also, clean up some unused lines
1 vCPU to 3 vCPU
This is a lot simpler since we are using the Solr URL API to add/remove users. URL/USER/PASSWORD are inputs from the provisioning and the same previous outputs exist as outputs.
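A minimal sketch of adding and removing users over HTTP, assuming the instance uses Solr's Basic Authentication plugin, whose `set-user` and `delete-user` commands are posted to `/solr/admin/authentication`. The helper names are hypothetical, and the sketch assumes the URL input is the host root (without a `/solr` suffix); it only builds the requests and does not send them.

```python
import base64
import json
import urllib.request


def _basic_auth(user: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"


def _auth_api_request(base_url, admin_user, admin_pass, command):
    """Build (but do not send) a POST to Solr's authentication API."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/solr/admin/authentication",
        data=json.dumps(command).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": _basic_auth(admin_user, admin_pass),
        },
        method="POST",
    )


def add_user_request(base_url, admin_user, admin_pass, new_user, new_pass):
    """Request that creates new_user or resets its password."""
    return _auth_api_request(
        base_url, admin_user, admin_pass, {"set-user": {new_user: new_pass}}
    )


def delete_user_request(base_url, admin_user, admin_pass, old_user):
    """Request that removes old_user."""
    return _auth_api_request(
        base_url, admin_user, admin_pass, {"delete-user": [old_user]}
    )
```

Sending a request is then `urllib.request.urlopen(req)`. Granting the new user a role is a separate call to Solr's authorization endpoint, which this sketch leaves out.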
The authorization link was actually getting hit twice, this fixes that
NO KUBERNETES!!!!
Remove unused variables, update terraform files
Solr brokerpak is now specifying AWS Resources by itself and no longer depends on k8s; also, pass in aws creds
some variables are actually numbers and the docker image needs a colon to specify image from tag
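To make the two fixes concrete, here is a small sketch of the same ideas outside Terraform: joining a repository and tag with a colon, and coercing a numeric setting that arrives as a string. Both helper names and the example values are hypothetical.

```python
def image_reference(repository: str, tag: str) -> str:
    """Join repository and tag as 'repository:tag', the form container runtimes expect."""
    if not repository or not tag:
        raise ValueError("both repository and tag are required")
    return f"{repository}:{tag}"


def as_number(value) -> int:
    """Coerce a setting such as a CPU or memory size to an integer."""
    return int(value)


if __name__ == "__main__":
    # Placeholder repository and tag, not the image used in the PR.
    print(image_reference("ghcr.io/example/solr", "8"))
    print(as_number("1024") + 1)
```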
- Fix variables definitions in solr-cloud.yml
- Fix terraform version in brokerpak
- Fix reference to service/plan ids in Makefile
added missing parts of the commands
This might cause issues if it has a max length limit, but will see and fix later
Need to revive the original service so that both can be supported side by side; lots to do still
- In order for EFS mounting to work, the solr container needs to be set to root initially
- Attempt to hide the admin user/pass since the provision outputs are combined with the bind outputs (https://github.com/cloudfoundry/cloud-service-broker/blob/main/docs/brokerpak-dissection.md#outputs)
- Wait up to an hour for DNS to resolve (this is the next thing I'm going to fix)
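The "wait up to an hour for DNS to resolve" step is essentially a poll-until-deadline loop. A minimal sketch, assuming a plain hostname lookup is an adequate readiness signal; the function name, the 30-second interval, and the injectable `resolve`/`sleep`/`clock` parameters (there to make the loop testable) are all illustrative choices rather than what the PR does.

```python
import socket
import time


def wait_for_dns(hostname, timeout_s=3600.0, interval_s=30.0,
                 resolve=socket.gethostbyname, sleep=time.sleep,
                 clock=time.monotonic):
    """Poll until hostname resolves; return its address or raise TimeoutError."""
    deadline = clock() + timeout_s
    while True:
        try:
            return resolve(hostname)
        except OSError:
            # socket.gaierror (name not found yet) is a subclass of OSError.
            if clock() >= deadline:
                raise TimeoutError(
                    f"{hostname} did not resolve within {timeout_s} seconds"
                )
            sleep(interval_s)
```

Usage would be `wait_for_dns("solr.example.com")`, with the hostname replaced by the provisioned domain.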
- Implement ECS CloudMap Service Discovery to have container-to-container dns communication
- Create an init container to create the new admin user and delete the temporary admin user
- Copy terraform files to terraform/solrcloud
- Restore certain files (solr-cloud.yml, generate*.sh) to their original forms
- Caveat (unfortunately), solr-on-ecs won't be able to work without kind because the solrcloud service needs the configuration for kind in parallel to solr-cloud :/
- Lots of changes... mostly blind.. will have a lot of debugging to do
This file was important enough to make a separate commit
hopefully this works.. again, no formal tests, just make sure the brokerpak can provision and deprovision the services
Just waiting for secrets to be populated and this should put them where they need to be :)
make sure to deprovision before destroying the broker
It was worrying me and my suspicions were correct... it didn't actually mount the EFS volume in the container... still need to fix EFS mounting. I tested it by restarting the service and seeing if the data was still there.. it wasn't 😭
I verified this by creating the service, connecting it to catalog-dev, running a re-index, restarting the solr service and verifying that the data was able to load when the solr service was re-created...
- This is not an optimal design, but it works, so if we want to improve from here, we can
The tests are failing because the AWS Development account needs to be cleaned up.
FuhuXia
approved these changes
Jun 1, 2022
Related to GSA/data.gov#3826
Support for a standalone, reliable solr instance on ECS with proper security/encryption.
New Additions:
solr-on-ecs
List of AWS Services:
List of Terraform Providers: