Scaler and descaler transformers #223
Codecov Report
@@            Coverage Diff             @@
##           master     #223      +/-   ##
==========================================
+ Coverage   86.36%   86.39%   +0.03%
==========================================
  Files         310      312       +2
  Lines       10137    10182      +45
  Branches      347      333      -14
==========================================
+ Hits         8755     8797      +42
- Misses       1382     1385       +3
features/src/main/scala/com/salesforce/op/stages/impl/feature/ScalingType.scala
core/src/main/scala/com/salesforce/op/stages/impl/feature/ScalerTransformer.scala
core/src/test/scala/com/salesforce/op/stages/impl/feature/DescalerTransformerTest.scala
core/src/test/scala/com/salesforce/op/stages/impl/feature/LinearScalerTest.scala
core/src/test/scala/com/salesforce/op/stages/impl/feature/LinearScalerTest.scala
import scala.util.{Failure, Success}

@RunWith(classOf[JUnitRunner])
class ScalerMetadataTest extends FlatSpec with TestSparkContext{
Missing a space here as well: TestSparkContext {
core/src/test/scala/com/salesforce/op/stages/impl/feature/ScalerTest.scala
🎉 lgtm and thank you!
Thanks for the contribution! Before we can merge this, we need @ericwayman to sign the Salesforce.com Contributor License Agreement.
Related Issues
In regression use cases the response variable is often not normally distributed. As a data scientist training regression models, I'd like generic transformers that make it easy to scale the response variable at training time and to descale the predictions at scoring time, so that predictions are returned in the original scale.
Proposed Solution
A general framework for scaling and descaling the response feature. It lets developers add support for new feature scaling functions with minimal boilerplate code and without writing new code to read and write metadata.
Each family of scaling functions is represented by a case class which extends the Scaler case class.
All supported families of scaling functions are stored in the ScalingType enum.
The ScalerTransformer constructs a transformer that scales features from scalingType and scalingArgs. The scalingType and scalingArgs are stored in the metadata of this feature.
The DescalerTransformer takes in a feature for descaling and a scaled feature containing the metadata for constructing the scaling inverse. This allows for predictions (in the scaled domain) to be descaled at scoring time.
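
The design described above can be illustrated with a minimal, Spark-free sketch. It is not the code from this PR: the member names (scale, descale, slope, intercept), the ScalerMetadata stand-in, and the toMetadata/fromMetadata helpers are assumptions made for illustration, and Scaler is modeled as a sealed trait to keep the example self-contained. It shows how a scaling type plus its arguments is enough to rebuild the inverse function later.

```scala
// Hypothetical, simplified sketch: names mirror the description, not the PR's actual code.

// The supported families of scaling functions (a stand-in for the ScalingType enum).
sealed trait ScalingType
object ScalingType {
  case object Linear extends ScalingType
  case object Logarithmic extends ScalingType
}

// Each scaling family exposes a function and its inverse.
sealed trait Scaler {
  def scalingType: ScalingType
  def scale(v: Double): Double
  def descale(v: Double): Double
}

// Linear family: v => slope * v + intercept (assumed parameter names).
final case class LinearScaler(slope: Double, intercept: Double) extends Scaler {
  require(slope != 0.0, "slope must be non-zero so the scaling is invertible")
  val scalingType: ScalingType = ScalingType.Linear
  def scale(v: Double): Double = slope * v + intercept
  def descale(v: Double): Double = (v - intercept) / slope
}

// Logarithmic family: natural log, inverted by exp.
final case class LogScaler() extends Scaler {
  val scalingType: ScalingType = ScalingType.Logarithmic
  def scale(v: Double): Double = math.log(v)
  def descale(v: Double): Double = math.exp(v)
}

// Stand-in for the scalingType and scalingArgs stored in the feature metadata.
final case class ScalerMetadata(scalingType: ScalingType, scalingArgs: Map[String, Double])

object Scaler {
  // What a scaling transformer would write alongside the scaled feature.
  def toMetadata(s: Scaler): ScalerMetadata = s match {
    case LinearScaler(m, b) =>
      ScalerMetadata(ScalingType.Linear, Map("slope" -> m, "intercept" -> b))
    case LogScaler() =>
      ScalerMetadata(ScalingType.Logarithmic, Map.empty)
  }

  // What a descaling transformer would do: rebuild the scaler, then call descale.
  def fromMetadata(m: ScalerMetadata): Scaler = m.scalingType match {
    case ScalingType.Linear =>
      LinearScaler(m.scalingArgs("slope"), m.scalingArgs("intercept"))
    case ScalingType.Logarithmic =>
      LogScaler()
  }
}
```

In this sketch, adding a new scaling family means adding one case class, one ScalingType entry, and one branch in each of the two metadata helpers. The descaling side never needs to know which scaler was used at training time, because it recovers that from the metadata.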