Improve discreteness handling, allow binary outcomes #816

fverac · 2023-09-22T17:00:41Z

Adds a binary_outcome keyword arg to most estimators, where if True then the outcome nuisance model will be a classifier.

kbattocchi

Mostly looks great. I've suggested a few minor changes.

Related to this PR, I think lines 52-57 of econml/dml/dml.py should be made more robust - what happens if the target is discrete and there is no predict_proba method, or target is not discrete and there is a predict_proba method? I think in either case, we should at least warn the user that they are passing a classifier where a regressor is expected or vice versa. When there is no predict_proba method but one is expected, there doesn't seem to be much harm in falling back to calling predict instead as long as the user is warned; in the opposite scenario it's less clear to me that calling predict_proba instead of just calling predict as usual is a good idea, but at least if we warn the user they can change the discreteness and get that behavior if they want.

Whatever we decide here, if there's a non-trivial amount of logic and we do something similar with other estimators that don't use this first-stage wrapper, that logic should probably live in a new utility method that this module and others can all use consistently (perhaps something like get_prediction(estimator, expected_discrete)).
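To make the suggestion concrete, here is a minimal sketch of what such a shared helper could look like. It follows the fallback behavior described in the comment above, not necessarily what was merged. The `X` parameter, the warning text, and the use of `UserWarning` are assumptions for illustration; only the name `get_prediction` and the `expected_discrete` flag come from the comment.

```python
import warnings


def get_prediction(estimator, X, expected_discrete):
    """Hypothetical helper: predict with a fitted model, warning on a type mismatch."""
    has_proba = hasattr(estimator, "predict_proba")
    if expected_discrete and not has_proba:
        # Discrete target but no predict_proba: warn, then fall back to predict.
        warnings.warn(
            "The target is discrete but the model has no predict_proba method; "
            "falling back to predict. A classifier was probably intended.",
            UserWarning,
        )
        return estimator.predict(X)
    if expected_discrete:
        return estimator.predict_proba(X)
    if has_proba:
        # Continuous target but a classifier-like model: warn, then call predict as usual.
        warnings.warn(
            "The target is not discrete but the model has a predict_proba method; "
            "calling predict as usual. A regressor was probably intended.",
            UserWarning,
        )
    return estimator.predict(X)
```

Keeping the mismatch handling in one place would let every estimator module emit the same warnings instead of re-implementing the check.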

econml/_ortho_learner.py

econml/dr/_drlearner.py

econml/tests/test_bootstrap.py

econml/tests/test_ortho_learner.py

fverac · 2024-01-05T17:46:40Z

Added a warning for the case where the first-stage target is discrete but the model does not have predict_proba,
and an error for the case where the first-stage target is continuous but the model does have predict_proba.
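A rough sketch of the resulting check, under stated assumptions: the function name `check_model_matches_target`, the message wording, and the `UserWarning` category are hypothetical. The exception type follows the PR description, which mentions an AttributeError for the continuous-target case.

```python
import warnings


def check_model_matches_target(model, discrete_target):
    """Hypothetical check of a first-stage model against its target type."""
    has_proba = hasattr(model, "predict_proba")
    if discrete_target and not has_proba:
        # A regressor was passed for a discrete target: allowed, but flagged.
        warnings.warn(
            "The first-stage target is discrete but the model does not have "
            "predict_proba; a classifier was probably intended.",
            UserWarning,
        )
    elif not discrete_target and has_proba:
        # A classifier was passed for a continuous target: rejected outright.
        raise AttributeError(
            "The first-stage target is continuous but the model has "
            "predict_proba; pass a regressor or mark the target as discrete."
        )
```

The asymmetry is deliberate in this sketch: falling back to predict for a discrete target is mostly harmless, while silently using a classifier on a continuous target is more likely to be a user error.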

Signed-off-by: Fabio Vera <[email protected]>

kbattocchi

This looks good; I've made a couple of minor suggestions, but you can merge as soon as you are comfortable with it.

kbattocchi · 2024-01-10T19:37:10Z

econml/_ortho_learner.py

                if len(self.outcome_transformer.classes_) > 2:
                    raise AttributeError(
-                        "More than 2 outcome classes detected. This method currently only supports binary outcomes")
+                        f"({self.outcome_transformer.classes_} outcome classes detected. \


I think you're including the classes themselves rather than their count here.
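For illustration, a message that reports the count could be built along these lines. This is only a sketch: the helper name `outcome_class_error_message` is hypothetical and this is not necessarily the wording that was merged.

```python
def outcome_class_error_message(classes):
    """Hypothetical helper: report how many outcome classes were seen, not the classes themselves."""
    # Adjacent string literals are concatenated, so no backslash continuation is needed
    # and no indentation whitespace leaks into the message.
    return (
        f"{len(classes)} outcome classes detected. "
        "This method currently only supports binary outcomes"
    )
```

With three classes, for example, the message starts with "3 outcome classes detected."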

kbattocchi · 2024-01-10T19:49:06Z

econml/utilities.py

@@ -1482,3 +1482,29 @@ def jacify_featurizer(featurizer):
       a function for calculating the jacobian
    """
    return _TransformerWrapper(featurizer)
+
+
+def single_strata_from_discrete_arrays(arrs):


[minor] The singular for strata is stratum, so this name seems slightly weird. I think strata_from_discrete_arrays (since this gets the strata for all of the rows at once) is shorter and just as clear.
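The function body is not shown in this excerpt. As a sketch of the idea only, assuming the function combines several discrete arrays into a single stratum label per row, a pure-Python version might look like the following; the `_sketch` suffix marks it as hypothetical, and it is not the library's implementation.

```python
def strata_from_discrete_arrays_sketch(arrs):
    """Hypothetical sketch: one integer stratum label per row from several discrete arrays."""
    if not arrs:
        return None
    n_rows = len(arrs[0])
    strata = [0] * n_rows
    for arr in arrs:
        if len(arr) != n_rows:
            raise ValueError("All arrays must have the same number of rows")
        # Encode this array's distinct values as 0..k-1 (values must be mutually orderable).
        levels = {value: code for code, value in enumerate(sorted(set(arr)))}
        # Mixed-radix update: rows share a label only if they agree in every array so far.
        strata = [s * len(levels) + levels[v] for s, v in zip(strata, arr)]
    return strata
```

Rows with the same combination of values across all arrays receive the same label, so the result can be used directly for stratified splitting.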

Signed-off-by: Fabio Vera <[email protected]>

Adds a binary_outcome keyword arg to most estimators; if True, the outcome nuisance model will be a classifier. Additionally adds constraints to ensure that nuisance model discreteness is handled appropriately by the user: if a nuisance model has a continuous target but a classifier is passed, an AttributeError is raised; conversely, if a nuisance model has a discrete target but a regressor is passed, a warning is issued.

fverac added 7 commits September 22, 2023 12:28

initial commit for binary outcome, warn when clf passed but disc_treat=False

06f85fe

Signed-off-by: Fabio Vera <[email protected]>

add init args to drlearner, causalforestdml

6bc0660

Signed-off-by: Fabio Vera <[email protected]>

modify bootstrap test to use np array

058c3e8

Signed-off-by: Fabio Vera <[email protected]>

bugfix causalforest firststagewrapper

a92d140

Signed-off-by: Fabio Vera <[email protected]>

fix test bug ortholearner

8929eab

Signed-off-by: Fabio Vera <[email protected]>

fix test bugs treatfeat OL doctest

1540a08

Signed-off-by: Fabio Vera <[email protected]>

add tests, allow str y, add warnings/errors

d39a091

Signed-off-by: Fabio Vera <[email protected]>

fverac marked this pull request as ready for review October 13, 2023 15:44

fverac added 13 commits October 13, 2023 09:16

Merge branch 'main' into fverac/improve_discreteness_handling

bfb6e67

Signed-off-by: fverac <[email protected]>

bugfixes

ee64b0e

Signed-off-by: Fabio Vera <[email protected]>

Merge branch 'main' into fverac/improve_discreteness_handling

5b36a4a

Signed-off-by: fverac <[email protected]>

linting

5aaee9d

Signed-off-by: Fabio Vera <[email protected]>

indent

9064f8b

Signed-off-by: Fabio Vera <[email protected]>

linting

c98edbc

Signed-off-by: Fabio Vera <[email protected]>

rlearner doctest

1ff9505

Signed-off-by: Fabio Vera <[email protected]>

Merge branch 'main' into fverac/improve_discreteness_handling

3c4eac7

Signed-off-by: fverac <[email protected]>

linting

a67eb54

Signed-off-by: Fabio Vera <[email protected]>

more typos

e104d73

Signed-off-by: Fabio Vera <[email protected]>

bugfixes, docstrings, enable for intenttotreatdrivs

edc0b48

Signed-off-by: Fabio Vera <[email protected]>

fix default

79a3b07

Signed-off-by: Fabio Vera <[email protected]>

bugfixes

17a0b36

Signed-off-by: Fabio Vera <[email protected]>

fverac requested a review from kbattocchi January 3, 2024 05:09

kbattocchi requested changes Jan 3, 2024

fverac requested a review from kbattocchi January 5, 2024 19:15

fverac added 4 commits January 5, 2024 14:26

test_binary_outcome bugfix

6ba3b1f

Signed-off-by: Fabio Vera <[email protected]>

adjust tests

5d75de4

Signed-off-by: Fabio Vera <[email protected]>

address comments; binary_outcome->discrete_outcome, improve warnings

9e7d701

Signed-off-by: Fabio Vera <[email protected]>

line endings

0757d39

Signed-off-by: Fabio Vera <[email protected]>

fverac force-pushed the fverac/improve_discreteness_handling branch from 87ccacf to 0757d39 Compare January 5, 2024 19:31

fix tests where clf was used without specifying disc treat

b848e73

Signed-off-by: Fabio Vera <[email protected]>

kbattocchi approved these changes Jan 10, 2024

fverac added 4 commits January 10, 2024 15:45

rename function, fix warning

12dae44

Signed-off-by: Fabio Vera <[email protected]>

add test for discrete model constraints, fix warning whitespace

5014d4c

Signed-off-by: Fabio Vera <[email protected]>

fix test

74842eb

Signed-off-by: Fabio Vera <[email protected]>

merge main

961cf24

Signed-off-by: Fabio Vera <[email protected]>

fverac merged commit ababb7e into main Jan 12, 2024
77 checks passed

fverac deleted the fverac/improve_discreteness_handling branch January 12, 2024 20:31
