When calculating direct features use default value if parent missing #682
Comments
Is there any sample code that reproduces this problem? I would like to confirm my understanding of it. For example, if transactions.value is [1,2,3,4,float('nan')], SUM(transactions.value) should be 10.0 (ignoring nan). Am I correct?
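For reference, the NaN-skipping sum described in the question can be checked with plain pandas. This is only a sketch of the arithmetic; it does not call Featuretools, and whether the `SUM` primitive behaves identically is an assumption here.

```python
import pandas as pd

# The example values from the question, including one missing entry.
values = pd.Series([1, 2, 3, 4, float("nan")])

# pandas skips NaN by default (skipna=True), so the missing entry is ignored.
total = values.sum()

# With skipna=False the NaN propagates and the result is NaN.
total_strict = values.sum(skipna=False)

print(total, total_strict)
```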
Here is code that reproduces
the output of fm is
id 4 should be
If no one is working on this, may I take this up?
@seriallazer sure!
I've created a pull request for the change: #1217.
For example, if there is a relationship `transactions.session_id -> sessions.id` and we are calculating a feature `transactions: sessions.SUM(transactions.value)`, any rows for which there is no corresponding session should be given the default value of `0` instead of `NaN`.

Of course this should not normally occur, but when it does it seems more reasonable to use the `default_value`.

`DirectFeature.default_value` is already implemented. We should be able to use the same logic that we do for aggregation features.

https://github.com/Featuretools/featuretools/blob/6f4ffd7ef7ea42f95dbaf3892615717a521299db/featuretools/computational_backends/feature_set_calculator.py#L611-L618
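The requested behavior can be illustrated with plain pandas. This is a sketch under stated assumptions, not the Featuretools implementation: the toy data, the `DEFAULT_VALUE` constant, and the variable names are hypothetical, and the lookup is done with `Series.map` rather than the library's own calculator.

```python
import pandas as pd

# Hypothetical toy data: the transaction with session_id 99 has no parent session.
sessions = pd.DataFrame({"id": [1, 2]})
transactions = pd.DataFrame({
    "id": [10, 11, 12, 13],
    "session_id": [1, 1, 2, 99],
    "value": [5.0, 7.0, 3.0, 4.0],
})

# Aggregation on the parent: SUM(transactions.value) for each known session.
session_sum = (
    transactions.groupby("session_id")["value"].sum()
    .reindex(sessions["id"])
)

# Direct feature on the child: look up the parent's aggregate for each row.
# Rows whose session_id has no matching session come back as NaN.
direct = transactions["session_id"].map(session_sum)

# Assumed default for SUM, matching the value of 0 mentioned in the issue.
DEFAULT_VALUE = 0

# Replace only the rows whose parent is missing, leaving other values untouched.
missing_parent = ~transactions["session_id"].isin(sessions["id"])
direct_filled = direct.where(~missing_parent, DEFAULT_VALUE)

print(direct_filled.tolist())
```

Masking on `missing_parent` instead of calling `fillna` keeps any NaN that a parent legitimately produced, so only the orphaned rows receive the default.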