Document deploying Restate on Fly.io, AWS integration when operating in external environment #2183

Open
10thfloor opened this issue Oct 28, 2024 · 9 comments
Labels
aws-lambda-endpoint documentation Improvements or additions to documentation

Comments

@10thfloor

10thfloor commented Oct 28, 2024

I'm trying to use a custom config file for my restate-server deployment. I'm using the config.toml from the docs unmodified, but restate whoami refuses to parse it. After placing the config file in the restate-server config directory, it fails with the error below.

I tried everything I could think of (fixing line endings, removing invisible characters, etc.).

restate whoami

Error:TOML parse error at line 1, column 9
|
1 | roles = ["worker", "admin", "metadata-store"]
|         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
invalid type: sequence, expected a map
in root/.config/restate/config.tomlTOML file 

Tested on linux-x86 and macOS.

Restate v1.1

@10thfloor
Author

10thfloor commented Oct 28, 2024

Also tried copy/pasting the output of restate-server --dump-config into the toml file and still get the same error.

@10thfloor 10thfloor changed the title Config parsing error. restate-server Config parsing error. Oct 28, 2024
@tillrohrmann
Contributor

Thanks for reporting this issue @10thfloor. .config/restate/config.toml is the configuration of the CLI where the CLI records environment information. The linked config.toml is the configuration for the server and won't be understood by the CLI. Normally you don't have to touch .config/restate/config.toml yourself. Instead, we recommend using the restate config subcommand to edit it.

If you want to configure the restate-server deployment, then you can start restate-server --config <PATH_TO_SERVER_CONFIG>. You can also configure the Restate server by passing environment variables. Here you can find a few more details.
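As a rough illustration of the environment-variable route, the sketch below maps a dotted config key to an environment variable name. It assumes the naming convention of a `RESTATE_` prefix, `__` between nesting levels, and hyphens replaced by underscores; please check the server configuration docs before relying on it. The function `to_env_var` and the key `admin.bind-address` are used for illustration only.

```python
def to_env_var(key_path: str, prefix: str = "RESTATE_") -> str:
    """Map a dotted config key such as 'admin.bind-address' to an env var name.

    Assumed convention: upper-case, '-' becomes '_', nesting joined by '__'.
    """
    parts = [part.replace("-", "_").upper() for part in key_path.split(".")]
    return prefix + "__".join(parts)


print(to_env_var("admin.bind-address"))  # RESTATE_ADMIN__BIND_ADDRESS
print(to_env_var("roles"))               # RESTATE_ROLES
```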

@10thfloor
Author

10thfloor commented Oct 29, 2024

@tillrohrmann Thanks for the reply. It makes more sense now.

I'm trying to self-host the restate-server via the Docker image (docker.io/restatedev/restate:1.1) on Fly.io. It's working; however, I can't invoke the services I've deployed to AWS. I was trying to figure out how to set the Environment ID for my restate-server deployment so I could provide it to the AWS trust policy (restate-server "Lambda invoker" IAM role).

I realize now that the procedure for setting up a production instance of restate-server is not documented, because that is something you will provide via your own infrastructure, i.e. restate-cloud.

I am VERY excited about restate, and the possibilities it can unlock for smaller teams, especially when combined with tools like SST for managing AWS infrastructure. Thanks for your excellent work!

@tillrohrmann
Contributor

Glad to hear that you find Restate useful 😊 We try to continually improve it so that it becomes more powerful.

If you are referring to the environment ID used in Restate's CLI, then this is indeed only relevant if you are interacting with a cloud-managed Restate server.

While I haven't deployed Restate server on Fly.io, I think this deployment setup should be possible. You probably need to provide the credentials to the process to talk to your services running on AWS. Maybe this resource can help in setting things up: https://fly.io/blog/oidc-cloud-roles/. I am also pulling in @pcholakov who knows a lot more than I about AWS and how to set up the right credentials.

@pcholakov
Contributor

pcholakov commented Oct 31, 2024

Hey @10thfloor! Thank you for the kind words! The reason why this is not documented is just an oversight since we ourselves tend to run Restate servers either on dev machines, or on AWS - and nobody has asked yet 😅 There's no hidden magic!

The nice thing about running Restate on a piece of AWS infrastructure like EC2 is that you get provided bootstrap credentials for a role, e.g. the EC2 instance profile is essentially an IAM role that you can grant permissions to. When you're running Restate outside of AWS, you have to manage two things yourself:

  • create a role in your AWS account that Restate will assume (which I believe you've done)
  • propagate credentials linked to this role to the place where you're going to run restate-server - that could be your laptop, Fly.io, whatever

You're in luck as Fly.io offers really nice OIDC integration with AWS that Till linked to above - that lets you set up cross-service trust so that you can obtain AWS STS credentials for the Restate "execution role" using Fly.io's own secure init mechanism. First, you'll need to set up an external OIDC provider in your AWS account as described here:

https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc.html

I believe that the article Till mentioned has the necessary details:

https://fly.io/blog/oidc-cloud-roles/
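For orientation, here is a sketch of what the role's trust policy might look like once the OIDC provider exists. The action `sts:AssumeRoleWithWebIdentity` and the `<provider>:aud` / `<provider>:sub` condition keys are standard for IAM OIDC federation. The provider URL form (`oidc.fly.io/<org-slug>`), the `sts.amazonaws.com` audience, and the `<org>:<app>:*` subject pattern are assumptions based on the Fly.io article linked above, so verify them there. The account ID, org slug, and app name are placeholders.

```python
import json


def fly_oidc_trust_policy(account_id: str, org_slug: str, app_name: str) -> dict:
    """Build a trust policy letting one Fly.io app assume the role via OIDC."""
    provider = f"oidc.fly.io/{org_slug}"  # assumed provider URL, without https://
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    # Only accept tokens issued for AWS STS ...
                    "StringEquals": {f"{provider}:aud": "sts.amazonaws.com"},
                    # ... and only from machines of this org/app (assumed format).
                    "StringLike": {f"{provider}:sub": f"{org_slug}:{app_name}:*"},
                },
            }
        ],
    }


policy = fly_oidc_trust_policy("123456789012", "example-org", "restate-server")
print(json.dumps(policy, indent=2))
```

Restricting the `sub` claim to a single app keeps other apps in the same Fly.io organization from assuming the role.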

Once that's done, all you have to do is set AWS_ROLE_ARN to the ARN of your Lambda invoker role. When you register your Lambda handler as a deployment in Restate, don't specify a separate role via --assume-role-arn, since Restate will already be running under the appropriate role implicitly (Fly.io should assume it and export its credentials to the restate-server process). If you specify the role as the assume-role, Restate will attempt to assume the same role it's already running under. That's unnecessary, and roles don't have permission to assume themselves by default, so you might get a confusing error from STS. Just deploy a specific version or alias of your function handler like this:

restate deployments register "arn:aws:lambda:region:account:function:function-name:$LATEST"
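The rule about not re-assuming the server's own role can be captured in a small helper that builds the CLI arguments. This is only a sketch: `register_command` is a hypothetical wrapper, and it uses just the `restate deployments register` command and the `--assume-role-arn` flag mentioned above. It returns the argument list and does not run anything.

```python
from typing import Optional


def register_command(
    lambda_arn: str,
    server_role_arn: str,
    assume_role_arn: Optional[str] = None,
) -> list[str]:
    """Build the registration command, skipping a redundant self-assume."""
    cmd = ["restate", "deployments", "register", lambda_arn]
    # Pass --assume-role-arn only if it differs from the role the server
    # already runs under; a role cannot assume itself by default.
    if assume_role_arn and assume_role_arn != server_role_arn:
        cmd += ["--assume-role-arn", assume_role_arn]
    return cmd


# Placeholder ARNs for illustration.
fn = "arn:aws:lambda:us-east-1:123456789012:function:my-handler:1"
role = "arn:aws:iam::123456789012:role/ExampleLambdaInvoker"
print(register_command(fn, role, assume_role_arn=role))
```

With the same role passed twice, the printed command contains no `--assume-role-arn` flag.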

Please let us know how this works for you - let's keep this issue open as a reminder for us to update the documentation in this area! It would be great to include a Fly.io example with OIDC trust.

@10thfloor
Author

10thfloor commented Nov 1, 2024

Yes worked like a charm!
Using this fly.toml for the restate deployment on fly:
(using my org & app name + custom Role arn)

app = "restate-server"
organization = "boxfactory"

[build]
image = "docker.io/restatedev/restate:1.1"

[vm]
size = "performance-2x"

[env]
AWS_ROLE_ARN = "arn:aws:iam::313969411467:role/FlyBoxFactoryLambdaInvoker"
AWS_REGION = "us-east-1"

[mounts]
source = "restate_data"
destination = "/restate-data"

[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = "suspend"
auto_start_machines = true
min_machines_running = 0
[http_service.concurrency]
type = "requests"
soft_limit = 200
hard_limit = 250

# Node-ctrl port (gRPC + HTTP metrics)
[[services]]
internal_port = 5122
protocol = "tcp"
[[services.ports]]
handlers = ["http"]
port = 5122

# Metadata store (gRPC)
[[services]]
internal_port = 5123
protocol = "tcp"
[[services.ports]]
handlers = ["http"]
port = 5123

# Admin REST API
[[services]]
internal_port = 9070
protocol = "tcp"
[[services.ports]]
handlers = ["tls", "http"]
port = 9070

# Postgres protocol
[[services]]
internal_port = 9071
protocol = "tcp"
[[services.ports]]
handlers = ["pg_tls"]
port = 9071

Couldn't be easier. Plop that file into a directory and run fly deploy. Wait 2 minutes and it's up and running. I had to use a higher-memory machine to accommodate the restate-server process because the basic machine didn't have enough memory, so this is running on a performance-2x machine, which is probably overkill. 1GB volume for restate-data ...

8080 is the invocation endpoint, so I added the load balancer there. Not sure if all these ports need to be exposed to public internet.

OIDC setup with Fly/AWS was SUPER straightforward. Added permission to invoke lambdas in my account, added the AWS_ROLE_ARN env to fly.toml, and I'm off to the races.
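For readers reproducing the "permission to invoke lambdas" step, a permissions policy along these lines is one way to do it. This is a sketch, not the policy used above: `lambda:InvokeFunction` is the real IAM action, while the function ARN is a placeholder and the helper name is hypothetical. Each function is listed both unqualified and with a `:*` suffix so that versions and aliases are covered.

```python
import json


def lambda_invoke_policy(function_arns: list[str]) -> dict:
    """Build a policy allowing invocation of the given (unqualified) functions."""
    resources = []
    for arn in function_arns:
        resources.append(arn)         # the unqualified function ARN
        resources.append(f"{arn}:*")  # any version or alias of that function
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "lambda:InvokeFunction",
                "Resource": resources,
            }
        ],
    }


arn = "arn:aws:lambda:us-east-1:123456789012:function:my-handler"
print(json.dumps(lambda_invoke_policy([arn]), indent=2))
```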

As simple as it gets.

Set up the restate-cli env to point to my deployment and I can register my services.

[boxfactory]
ingress_base_url = "https://restate-server.fly.dev:8080"
admin_base_url = "https://restate-server.fly.dev:9070"

@pcholakov
Contributor

Excellent, thanks for sharing your setup, @10thfloor!

@tillrohrmann
Contributor

tillrohrmann commented Nov 2, 2024

Great to hear @10thfloor :-). I think you only need to expose the ingress port 8080 and the admin port 9070 for normal operations. If you want to connect with a Postgres client, then also expose 9071, although you can use the restate-server:9070/query endpoint instead.

@pcholakov pcholakov changed the title restate-server Config parsing error. Document deploying Restate on Fly.io, AWS integration when operating in external environment Nov 12, 2024
@pcholakov pcholakov added documentation Improvements or additions to documentation aws-lambda-endpoint labels Nov 12, 2024
@gvdongen
Contributor

I added a reference to this in the documentation issues: restatedev/documentation#480
Thank you for putting work into this @10thfloor

4 participants