Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap DensityDist's random with generate_samples #3554

Merged
merged 11 commits into from
Aug 16, 2019

Conversation

lucianopaz
Copy link
Contributor

@lucianopaz lucianopaz commented Jul 19, 2019

This fixes issue #3553 by relying on generate_samples to make the random number generators aware of the distribution's shape. The old incorrect behavior can still be achieved with some extra arguments during initialization.

):
"""
Parameters
----------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NICE!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one question I have -- and this is not a question about your change -- is whether we should have a default shape that is not None (i.e., that records the fact that the programmer did not specify shape).
I'm sure that the shape is most likely (), but worried about the case where it isn't, and the programmer/modeler has not realized that they need to specify the shape.
But this is a different issue, and should not delay merging this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say no, we shouldn't add a special empty shape value. Ideally, the default random variable's shape should be deduced from its parameters. We are far from this goal. At the moment the shape information must be explicitly supplied to every distribution in pymc3, not just DensityDist. The default behavior package wise is to assume a scalar distribution.
When we start to move towards a more symbolic handling of shapes, we should address whether to reserve a special empty value or not.

@lucianopaz
Copy link
Contributor Author

@junpenglao, @twiecki I have no idea what caused the test failure. Could be some deprecation that theano didn't account for?

@junpenglao
Copy link
Member

Seems to be a real test fail:

self = <pymc3.tests.test_models_utils.TestUtils object at 0x7fd6abddbbe0>
    def test_dict_input(self):
        m, l = utils.any_to_tensor_and_labels(self.data.to_dict('dict'))
        self.assertMatrixLabels(m, l, mt=self.data[l].values, lt=l)
    
        m, l = utils.any_to_tensor_and_labels(self.data.to_dict('series'))
        self.assertMatrixLabels(m, l, mt=self.data[l].values, lt=l)
    
        m, l = utils.any_to_tensor_and_labels(self.data.to_dict('list'))
        self.assertMatrixLabels(m, l, mt=self.data[l].values, lt=l)
    
        inp = {k: tt.as_tensor_variable(v.values) for k, v in self.data.to_dict('series').items()}
>       m, l = utils.any_to_tensor_and_labels(inp)
pymc3/tests/test_models_utils.py:44: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pymc3/glm/utils.py:118: in any_to_tensor_and_labels
    x = tt.as_tensor_variable(x)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
x = array([[Subtensor{int64}.0, Subtensor{int64}.0],
       [Subtensor{int64}.0, Subtensor{int64}.0],
       [Subtensor{int64}.0, Subtensor{int64}.0]], dtype=object)

@lucianopaz
Copy link
Contributor Author

@junpenglao, but it's completely unrelated to the PR. In fact, I thought that maybe I should add a .values to fix the issue in 81e96b0, but the problem persisted.

@junpenglao
Copy link
Member

I am not clear about the context of this test, but looking at it naively I would say we dont need to convert x to tensor but instead just do tt.concatenate

Copy link
Contributor

@rpgoldman rpgoldman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the notion of adding a ShapeError error class as I did in my PR? It seems like this is enough of a thing that we might want it.

@rpgoldman
Copy link
Contributor

I am not clear about the context of this test, but looking at it naively I would say we dont need to convert x to tensor but instead just do tt.concatenate

@lucianopaz is right: master is broken, too.

Copy link
Contributor

@rpgoldman rpgoldman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a little to the docstring, but other than that, looks great!

@lucianopaz
Copy link
Contributor Author

Thanks @rpgoldman. I'll edit the note you added though. The user supplied callable never sees the dist_shape parameter, it is only viewed by generate_samples.

@rpgoldman
Copy link
Contributor

Thanks @rpgoldman. I'll edit the note you added though. The user supplied callable never sees the dist_shape parameter, it is only viewed by generate_samples.

Maybe it would be helpful to give an example of this case? I'm not sure what the relationship of generate_samples to the random function is, and I'm not sure how the supplier of the random function to the DensityDist would be able to use the dist_shape if the generator can't see it.

I'm going to "activate" some more of the docstrings to help with this.

@lucianopaz
Copy link
Contributor Author

Great @rpgoldman! I'll write up an example in the docstring

@lucianopaz
Copy link
Contributor Author

While writing an example I ran into an unexpected issue that this PR introduces. I'm setting the WIP flag now, as this is not ready to merge yet.

@lucianopaz
Copy link
Contributor Author

Should be done now. There was a problem that popped out when I tried to write down examples using multidimensional distributions (shape problems never stop). I had to add in some more control flow but I hope this is now working well.

Copy link
Contributor

@rpgoldman rpgoldman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

The two meanings of the same keyword argument (size) is really yucky, but I don't have an idea how to fix it.

@rpgoldman
Copy link
Contributor

@lucianopaz Should I rebase this, update RELEASE-NOTES and merge? Or would you like to?

@junpenglao and I both gave it the 👍🏻, and it still looks good after the updates (and maybe the rebase means the first 2 of the additional commits are not necessary?).

@lucianopaz
Copy link
Contributor Author

Thanks @rpgoldman. I'd be able to do this on Monday so feel free to go ahead with the rebase and merge.

@rpgoldman
Copy link
Contributor

Rebased. Now waiting to make sure the Travis tests still pass. If so, will merge.

@rpgoldman rpgoldman merged commit 8b3ecf9 into pymc-devs:master Aug 16, 2019
@rpgoldman
Copy link
Contributor

Thanks, @lucianopaz !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants