
Add events and Spark Standalone support to the spark agent check #2752

Closed
wjsl wants to merge 1 commit

Conversation

@wjsl commented Aug 12, 2016

What does this PR do?

Enables Spark metrics to be gathered when clusters are provisioned using Spark's Standalone cluster mode. Currently the agent check only works when Spark is running with Apache YARN to provision its resources. This PR reuses the bulk of the collection logic, but retrofits an adapter to locate application servers via Spark Standalone's APIs.
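
For context, a minimal sketch of what such an adapter can look like (hypothetical names, not this PR's actual code), assuming the standalone master's /json endpoint, which serves cluster state including the active applications:

import requests

def _standalone_running_apps(master_uri):
    # The standalone master web UI exposes cluster state as JSON,
    # including the list of currently active applications.
    resp = requests.get('{}/json'.format(master_uri), timeout=5)
    resp.raise_for_status()
    state = resp.json()
    # Map app id -> app name so the shared collection logic can fetch
    # per-application metrics from the corresponding Spark UI.
    return dict((app['id'], app['name']) for app in state.get('activeapps', []))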

Events have also been added. The agent can now raise events when jobs and stages have status transitions (i.e., moving from WAITING to RUNNING to COMPLETED or FAILED).
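
A minimal sketch of the transition-to-event idea (hypothetical helper on the check class, not this PR's actual code):

import time

def _raise_job_status_event(self, job_id, previous_status, current_status):
    # Method on the AgentCheck subclass; only emit when the status changed.
    if previous_status == current_status:
        return
    self.event({
        'timestamp': int(time.time()),
        'event_type': 'spark.job_status',
        'msg_title': 'Spark job {} changed status'.format(job_id),
        'msg_text': '{} -> {}'.format(previous_status, current_status),
        'alert_type': 'error' if current_status == 'FAILED' else 'info',
    })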

Testing Guidelines

Run the mock tests using nosetests. Note that the test fixture was only modified to continue forcing the agent to pretend it is using YARN.

- Added eventing logic for job and stage lifecycles
	- Events work for both jobs and stages under both YARN and Spark Standalone
@truthbk self-assigned this Aug 12, 2016
@truthbk (Member) commented Aug 12, 2016

@wjsl this is some great stuff. I was actually aware of this limitation and knew sooner or later we'd have to work on something like this. It's quite a PR, so we'll have to review carefully, but this looks great. Thank you so much.

@truthbk added this to the Triage milestone Aug 12, 2016
@truthbk modified the milestones: 5.9.0, Triage Aug 23, 2016
'''
Figures out what mode we're in and fetches running apps
'''
cluster_mode = instance.get(SPARK_CLUSTER_MODE)
@truthbk (Member) commented on this code:

We should probably default to SPARK_YARN_MODE - after maybe printing a warning. This is to avoid any backward compatibility breakage for existing customers.
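
One way to implement that fallback (sketch only, not the final code):

cluster_mode = instance.get(SPARK_CLUSTER_MODE)
if cluster_mode is None:
    # Preserve backward compatibility for configs written before
    # standalone support existed.
    self.log.warning('`%s` not set in the instance configuration, '
                     'defaulting to `%s`', SPARK_CLUSTER_MODE, SPARK_YARN_MODE)
    cluster_mode = SPARK_YARN_MODE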

@truthbk (Member) commented Aug 24, 2016

Really beautiful PR, great code and some great refactoring @wjsl

Just a few minor comments.

SPARK_YARN_MODE = 'spark_yarn_mode'

SPARK_STANDALONE_MASTER = 'spark_standalone_master_uri'
SPARK_STANDALONE_SERVICE_CHECK = 'spark_standalone_master'
@truthbk (Member) commented Aug 24, 2016 on this code:

This should probably be something along the lines of spark.standalone_master.can_connect
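
To illustrate the suggested naming with the agent's service_check API (sketch; master_address is an assumed variable):

SPARK_STANDALONE_SERVICE_CHECK = 'spark.standalone_master.can_connect'

# AgentCheck.OK comes from the dd-agent checks module.
self.service_check(
    SPARK_STANDALONE_SERVICE_CHECK,
    AgentCheck.OK,
    tags=['url:{}'.format(master_address)],
    message='Connection to Spark standalone master "{}" was successful'.format(master_address))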

@truthbk (Member) commented Aug 25, 2016

Superseded by #2786 - closing.

Comments herein still valid.

@truthbk closed this Aug 25, 2016
zachradtka pushed a commit to zdata-inc/dd-agent that referenced this pull request Aug 26, 2016
Specific changes:
- Changed `resourcemanager_uri` to `spark_url`
- Removed constant SPARK_STANDALONE_MASTER as it was not used
- Utilize ''.format() in events (see the sketch after this list)
- Updated value of SPARK_STANDALONE_SERVICE_CHECK
- Moved event titles after if/else logic
- Added status checks
- General reformatting to standardize spacing and quote types
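
Illustrative only, the kind of change the ''.format() bullet above describes for event text:

# before: msg_title = 'Spark job ' + str(job_id) + ' ' + status
msg_title = 'Spark job {} {}'.format(job_id, status)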