Add events Spark Standalone support to the spark agent check #2752
Conversation
- Added eventing logic for job and stage lifecycles
- Events work for both YARN and Standalone, for both jobs and stages
@wjsl this is some great stuff. I was actually aware of this limitation and knew sooner or later we'd have to work on something like this. It's quite a PR, so we'll have to review carefully, but this looks great. Thank you so much.
```python
'''
Figures out what mode we're in and fetches running apps
'''
cluster_mode = instance.get(SPARK_CLUSTER_MODE)
```
We should probably default to SPARK_YARN_MODE, maybe after printing a warning. This is to avoid any backward-compatibility breakage for existing customers.
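The suggested fallback could be sketched as follows (a minimal illustration; the `get_cluster_mode` helper and the exact warning text are assumptions, not taken from the PR):

```python
import logging

log = logging.getLogger(__name__)

SPARK_CLUSTER_MODE = 'spark_cluster_mode'
SPARK_YARN_MODE = 'spark_yarn_mode'


def get_cluster_mode(instance):
    # Fall back to YARN mode so existing configs keep working unchanged
    cluster_mode = instance.get(SPARK_CLUSTER_MODE)
    if cluster_mode is None:
        log.warning(
            "The value for `%s` was not set in the configuration; "
            "defaulting to `%s`", SPARK_CLUSTER_MODE, SPARK_YARN_MODE)
        cluster_mode = SPARK_YARN_MODE
    return cluster_mode
```

With this in place, an instance config that predates the Standalone support resolves to YARN mode exactly as before.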
Really beautiful PR, great code and some great refactoring @wjsl. Just a few minor comments.
```python
SPARK_YARN_MODE = 'spark_yarn_mode'
SPARK_STANDALONE_MASTER = 'spark_standalone_master_uri'
SPARK_STANDALONE_SERVICE_CHECK = 'spark_standalone_master'
```
This should probably be something along the lines of `spark.standalone_master.can_connect`
Superseded by #2786 - closing. Comments herein still valid.
Specific changes:
- Changed `resourcemanager_uri` to `spark_url`
- Removed constant `SPARK_STANDALONE_MASTER` as it was not used
- Utilize `''.format()` in events
- Updated value of `SPARK_STANDALONE_SERVICE_CHECK`
- Moved event titles after if/else logic
- Added status checks
- General reformatting to standardize spacing and quote types
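For illustration, the `''.format()` change in event titles might look like this (the title wording and variable names here are hypothetical, not copied from the PR):

```python
# Hypothetical event title built with str.format() instead of string concatenation
job_id, old_status, new_status = 42, 'RUNNING', 'COMPLETED'
title = 'Spark job {0} changed status from {1} to {2}'.format(
    job_id, old_status, new_status)
print(title)  # Spark job 42 changed status from RUNNING to COMPLETED
```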
What does this PR do?
Enables Spark metrics to be gathered when clusters are provisioned using Spark's Standalone cluster mode. Currently the agent check works only when Spark is running with Apache YARN to provision its resources. This PR reuses the bulk of the collection logic, but retrofits an adapter to locate application servers via Spark Standalone's APIs.
Events have also been added. The agent can now raise events when both jobs and stages have status transitions (i.e., moving from WAITING to RUNNING to COMPLETED or FAILED).
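The status-transition detection described above can be sketched like this (the function name and dict shapes are assumptions; the PR's actual implementation may differ):

```python
def detect_transitions(previous, current):
    """Return (entity_id, old_status, new_status) tuples for jobs or
    stages whose status changed between two consecutive polls."""
    transitions = []
    for entity_id, status in current.items():
        old = previous.get(entity_id)
        # Only report entities seen before whose status actually changed
        if old is not None and old != status:
            transitions.append((entity_id, old, status))
    return transitions


# A job moving from WAITING to RUNNING yields one transition:
print(detect_transitions({'job_1': 'WAITING'}, {'job_1': 'RUNNING'}))
# [('job_1', 'WAITING', 'RUNNING')]
```

Each tuple returned would then be turned into an agent event (for example, a title built with `format()` and an alert type derived from the new status).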
Testing Guidelines
Run the mock test using `nosetests`. Note that the test fixture was only modified to continue forcing the agent to pretend it is using YARN.