Use Spark job grouping to distinguish steps of the machine learning flow #467
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #467      +/-  ##
=========================================
+ Coverage   86.99%      87%   +0.01%
=========================================
  Files         344      345       +1
  Lines       11576    11575       -1
  Branches      370      593     +223
=========================================
+ Hits        10070    10071       +1
+ Misses       1506     1504       -2
Continue to review full report at Codecov.
Great addition!!
Review threads:
- core/src/main/scala/com/salesforce/op/stages/impl/selector/ModelSelector.scala (resolved)
- core/src/test/scala/com/salesforce/op/utils/spark/JobGroupUtilTest.scala (outdated, resolved)
- utils/src/main/scala/com/salesforce/op/utils/spark/OpSparkListener.scala (outdated, resolved)
LGTM
Related issues
N/A
Describe the proposed solution
Leverages Spark's ability to set a "job group" ID to distinguish certain steps of the machine learning flow. Examples: data IO, model IO, feature engineering, cross-validation.
OpSparkListener is extended to capture which job group is currently active. Also, the new OpStep enum's entry names automatically show up in the Spark UI, so that the function of stages can be more easily interpreted. To this end, an OpStep enum is introduced.
Describe alternatives you've considered
Because a main goal is to get the current step into the real-time SparkListener framework, the latter's ability to get hold of the Spark job group was an easy way to accomplish this.
Considered but not feasible: the addition of other steps, such as "sanity checker", "scoring" or "metrics". These are not included here, as this would require changes to further classes (OpWorkflow/OpWorkflowModel).
Additional context
The extension to OpSparkListener allows for more advanced handling of the metrics that are collected by it, e.g. the metrics can be grouped by OpStep.
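The mechanism described in this PR can be illustrated with a minimal, self-contained sketch. All names below (OpStep entries, JobGroupUtil, JobMetric, MetricsByStep) are hypothetical stand-ins, not the PR's actual code. In real Spark, the wrapper would call `sparkContext.setJobGroup(...)` / `clearJobGroup()`, and a SparkListener can read the active group from a job's properties under the key `spark.jobGroup.id`; here the setter is injected so the sketch runs without a Spark dependency.

```scala
// Steps of the ML flow; the entry names double as Spark job group IDs,
// so they would appear in the Spark UI. Entry names are illustrative.
object OpStep extends Enumeration {
  val DataReadingAndFiltering, FeatureEngineering, CrossValidation, ModelIO = Value
}

object JobGroupUtil {
  // Run `block` with `step` set as the active job group, clearing it afterwards.
  // With a real SparkSession, `set` would be
  // spark.sparkContext.setJobGroup(step.toString, ...) and `clear` would be
  // spark.sparkContext.clearJobGroup(); they are injected here for testability.
  def withJobGroup[A](step: OpStep.Value)(set: String => Unit, clear: () => Unit)(block: => A): A = {
    set(step.toString)
    try block finally clear()
  }
}

// Listener-side view: once each job records the group it ran under,
// the collected metrics can be aggregated per step.
final case class JobMetric(jobGroup: Option[String], durationMs: Long)

object MetricsByStep {
  // Sum durations per job group; jobs submitted outside any group fall
  // into an "unknown" bucket.
  def apply(metrics: Seq[JobMetric]): Map[String, Long] =
    metrics
      .groupBy(_.jobGroup.getOrElse("unknown"))
      .map { case (group, ms) => group -> ms.map(_.durationMs).sum }
}
```

The bracketing `try`/`finally` in `withJobGroup` mirrors the usual pattern for thread-local Spark properties: the group must be cleared even if the step throws, or subsequent jobs would be misattributed.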