-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(glue): Timeout and workerType missing validation for Ray job types #29612
Comments
Thanks for creating this issue as well as the PR. I think adding validation as you described makes sense if the deployment is failing. Linking Ray job documentation for reference:
Did not see much on the topics in this issue, maybe more needs to be added on that. |
Hey Tim, thanks for attaching the docs there - in the same vein, this is also not noted in:
The worker type constraint is mentioned, but for both, timeout is just listed as an optional parameter, no mention that it must be omitted in some circumstances. Having the only documentation be in the request failure message is not ideal, but not sure if it's something that even can be enforced at the L1 / Cfn level, so validating at the L2 construct level should hopefully avoid some frustration. |
### Issue # (if applicable) Closes #29612. ### Reason for this change AWS Glue Ray job has some restriction. - must use Z.2X worker type ```sh CREATE_FAILED [...] Worker type cannot be null and only [Z.2X] worker types are supported for glueray jobs ``` - must not specify timeout ```sh UPDATE_FAILED [...] Timeout not supported for Ray jobs ``` ### Description of changes Add validation for above restriction. ```ts if (executable.type.name === JobType.RAY.name) { if (props.workerType !== WorkerType.Z_2X) { throw new Error(`WorkerType must be Z_2X for Ray jobs, got: ${props.workerType}`); } if (props.timeout !== undefined) { throw new Error('Timeout cannot be set for Ray jobs'); } } ``` ### Description of how you validated changes Add unit test. ### Checklist - [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md) ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Comments on closed issues and PRs are hard for our team to see. |
Describe the bug
When developing Ray jobs via CDK, I bumped into the fact that specifying a timeout or not specifying a worker type on
Job
for Ray job types will succeed in synthesis, but fail deployment.Expected Behavior
I'd expect an error to be thrown if workerType is not provided or timeout is provided, as both of these will cause deployment failures for Ray job types.
Current Behavior
Synthesis passes and failure occurs on deployment:
or
Reproduction Steps
Possible Solution
Additional validation in
packages/@aws-cdk/aws-glue-alpha/lib/job.ts
would address this and provide a slightly smoother developer experience when working with ray job types. I would be interested in contributing a fix if triage determines this is valid bug.Additional Information/Context
No response
CDK CLI Version
2.133.0
Framework Version
2.133.0
Node.js Version
20.10.0
OS
macOS
Language
TypeScript
Language Version
5.3.3
Other information
No response
The text was updated successfully, but these errors were encountered: