Creating the EC2Architect #603
Codecov Report
@@ Coverage Diff @@
## main #603 +/- ##
==========================================
- Coverage 66.36% 63.90% -2.47%
==========================================
Files 83 85 +2
Lines 7605 8050 +445
==========================================
+ Hits 5047 5144 +97
- Misses 2558 2906 +348
Dear @JackUrb, thank you so much for this PR. I have been facing some pushback from my IT team to use Heroku. Using EC2Architect would mean an all-AWS + AMT solution, which is perfect for me. Requests:
Thanks!
Hi @kushalchawla, I'm hoping to have it committed Monday. The only thing you need is to register a domain name, either with a standard registrar or with a provider that offers free domains. After that, running the setup script will tell you where to delegate your domain, and it will then set up the rest (including registering certificates). The one downside is that the initial setup takes around 2 hours to propagate, and you incur costs for as long as it is running (on the order of a dollar or so a day, based on my current testing).
Merging without documentation for now after extended testing; will write official usage documentation next week.
Overview
Introduces a new backend for deploying Mephisto tasks. This version uses AWS EC2 instances, along with a host of other services, to set up an automated structure relying on just one SSL certificate. This allows multiple users to host tasks, and includes a fallback server that workers see if one of the task servers is down (the fallback also logs these bad requests, so we can track down broken tasks!).
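As a rough illustration of the fallback behaviour described above, the routing decision can be sketched in Python. This is not the load balancer's actual configuration, which lives in AWS; the domain, the target names, and the resolve_target function are hypothetical, and the lookup is keyed by subdomain as detailed under Implementation.

```python
import logging
from typing import Dict

logger = logging.getLogger("fallback")

# Hypothetical values: the real domain is whatever was registered and
# delegated during the pre-setup.
REGISTERED_DOMAIN = "example.com"
FALLBACK_TARGET = "fallback-server"


def resolve_target(host: str, task_targets: Dict[str, str]) -> str:
    """Pick a backend for a request based on its Host header.

    task_targets maps a task subdomain to the server currently hosting it.
    Anything without a registered task server goes to the fallback.
    """
    host = host.lower().split(":")[0]  # normalise case and drop any port
    suffix = "." + REGISTERED_DOMAIN
    if host.endswith(suffix):
        subdomain = host[: -len(suffix)]
        target = task_targets.get(subdomain)
        if target is not None:
            return target
    # No task server is registered for this host: log the request so the
    # broken task can be traced, then serve the fallback.
    logger.warning("No task server for host %r, using fallback", host)
    return FALLBACK_TARGET
```

For example, with task_targets = {"my-task": "task-server-1"}, a request for my-task.example.com resolves to the task server, while a request for a subdomain whose server has been shut down resolves to the fallback and is logged.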
Implementation
The primary implementation is split into the pre-setup, which is handled by prepare_ec2_servers.py, and the deploy of task-related instances, which is handled by the EC2Architect. Both of these rely heavily on helpers written in ec2_helpers.py.
The core of the pre-work is to set up a VPC to host our servers, a load balancer that points to the fallback by default, and various auxiliary components to make this happen (like requesting and registering the SSL certificate). The fallback takes anything routed to *.<registered_domain.com>.
When an instance is being launched for a specific task, we register it to <task_name>.<registered_domain.com>, and route traffic from the load balancer based on that subdomain to the task server.
When a server is actually deployed, we set it up to run with systemctl remotely (which is what the .service files are for), and this methodology is used for both the fallback server and for task servers.
Files containing information about launched servers are stored in the servers directory, such that they can be referenced in the future and cleaned up when their usage is done (even if there's an issue with the python process).
For the future
An in-depth document on the architecture of what services are set up to run this architect, how the pre-setup work (involving the VPC and load balancer) differs from what is done to launch instances later, and more.
Testing
Tested by launching the parlai_chat task with an EC2 architect after having run the pre-setup with one of our internal roles. Servers successfully launch, deploy, and then clean up after the task is done. After executing everything, the only remaining resources are those for the fallback server.
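The cleanup behaviour tested here depends on the per-server files in the servers directory mentioned under Implementation. A minimal sketch of that idea follows, assuming one JSON file per launched server. The file format, the field names, and both function names are assumptions made for illustration, not the actual contents of ec2_helpers.py.

```python
import json
import os
from typing import Any, Callable, Dict


def save_server_record(servers_dir: str, task_name: str, details: Dict[str, Any]) -> str:
    """Write one JSON file describing a launched server and return its path."""
    os.makedirs(servers_dir, exist_ok=True)
    path = os.path.join(servers_dir, f"{task_name}.json")
    with open(path, "w") as record_file:
        json.dump(details, record_file)
    return path


def cleanup_server_records(
    servers_dir: str, terminate: Callable[[Dict[str, Any]], None]
) -> int:
    """Tear down every recorded server and return how many were handled.

    terminate is a caller-supplied callback that releases the cloud
    resources described by one record. A record file is deleted only after
    its callback returns, so a crash leaves the record in place for a retry.
    """
    cleaned = 0
    for name in sorted(os.listdir(servers_dir)):
        if not name.endswith(".json"):
            continue
        path = os.path.join(servers_dir, name)
        with open(path) as record_file:
            details = json.load(record_file)
        terminate(details)
        os.remove(path)
        cleaned += 1
    return cleaned
```

Because the records live on disk rather than in memory, a later run can still find and remove servers left behind by a process that exited unexpectedly.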