at.square(x) and at.pow(x, 2) don't etuplize to the same expression #1213

Open
rlouf opened this issue Sep 26, 2022 · 5 comments

Comments

@rlouf
Member

rlouf commented Sep 26, 2022

import aesara.tensor as at
from etuples import etuplize

a = at.scalar('a')

etuplize(at.square(a))
# e(e(<class 'aesara.tensor.elemwise.Elemwise'>, sqr, <frozendict {}>), a)

etuplize(at.pow(a, 2))
# e(e(<class 'aesara.tensor.elemwise.Elemwise'>, pow, <frozendict {}>), a, TensorConstant{2})

Which means that a**2 will fail to unify with etuplize(at.square(a)):

print(etuplize(a ** 2))
# e(e(<class 'aesara.tensor.elemwise.Elemwise'>, pow, <frozendict {}>), a, TensorConstant{2})

I guess this question is more about why at.square is not an alias to at.pow(..., 2).

@ricardoV94
Contributor

ricardoV94 commented Sep 27, 2022

> I guess this question is more about why at.square is not an alias to at.pow(..., 2).

Probably an optimization. I wouldn't be surprised if squaring was faster than power(x, 2).

@rlouf
Member Author

rlouf commented Sep 27, 2022

So that could eventually become a rewrite? And is that universally true for every backend, or is that distinction tied to the C backend?

@ricardoV94
Contributor

We already have some rewrites:

def local_pow_specialize(fgraph, node):

I guess you mean we might be missing an intermediate canonicalization form that is the same for either graph

@rlouf
Member Author

rlouf commented Sep 28, 2022

> I guess you mean we might be missing an intermediate canonicalization form that is the same for either graph

Yes.

Then we can have at.square(x) be an alias for at.pow(x, 2), which is a question of consistency of the representation in the IR, and let canonicalisation handle the computational cost concerns.
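
To make the idea concrete, here is a minimal sketch of the two phases on toy expression trees. It does not use Aesara's graph or rewrite API: expressions are plain tuples, and canonicalize and specialize are hypothetical helpers named for this illustration only. The canonical pass maps both spellings to a single pow form, and a later pass swaps in the presumably cheaper sqr.

```python
# Toy expression trees: ("op", arg1, arg2, ...) tuples, with strings and
# numbers as leaves. This is an illustration, NOT Aesara's rewrite machinery.


def canonicalize(expr):
    """Rewrite sqr(x) into pow(x, 2) so both spellings share one form."""
    if not isinstance(expr, tuple):
        return expr  # leaf: variable name or constant
    op, *args = expr
    args = [canonicalize(a) for a in args]
    if op == "sqr" and len(args) == 1:
        return ("pow", args[0], 2)
    return (op, *args)


def specialize(expr):
    """Late pass: turn pow(x, 2) back into sqr(x) for the backend."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [specialize(a) for a in args]
    if op == "pow" and len(args) == 2 and args[1] == 2:
        return ("sqr", args[0])
    return (op, *args)


square_a = ("sqr", "a")
pow_a = ("pow", "a", 2)

# Before canonicalization the two trees differ, as in the etuplize output.
assert square_a != pow_a
# After canonicalization they are identical, so pattern matching sees one form.
assert canonicalize(square_a) == canonicalize(pow_a)
# Specialization can still recover the dedicated op afterwards.
assert specialize(canonicalize(pow_a)) == ("sqr", "a")
```

Whether sqr is actually cheaper than pow on a given backend is the open question raised above; the sketch only shows that the representation and the cost concern can be handled in separate passes.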

@rlouf
Member Author

rlouf commented Feb 25, 2023

Note that at.reciprocal(x) should be an alias for at.pow(x, -1) and at.sqrt(x) an alias for at.pow(x, 0.5) for the same consistency reasons.

Consistency is going to prevent special-casing the code in downstream libraries, for instance in AePPL.
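
A rough sketch of what this buys a downstream library, again on toy tuples rather than real Aesara graphs. The constructors pow_, square, reciprocal and sqrt and the matcher match_pow are hypothetical names made up for this example: if every constructor is an alias that builds a pow node, the downstream matcher needs a single pattern.

```python
# Toy constructors: every power-like function builds the same "pow" node.
# These are illustrative stand-ins, not the Aesara functions themselves.


def pow_(x, exponent):
    return ("pow", x, exponent)


def square(x):
    return pow_(x, 2)


def reciprocal(x):
    return pow_(x, -1)


def sqrt(x):
    return pow_(x, 0.5)


def match_pow(expr):
    """Downstream matcher: return (base, exponent) for a pow node, else None."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "pow":
        return expr[1], expr[2]
    return None


# One pattern covers all three spellings; no per-op special cases are needed.
assert match_pow(square("x")) == ("x", 2)
assert match_pow(reciprocal("x")) == ("x", -1)
assert match_pow(sqrt("x")) == ("x", 0.5)
assert match_pow(("exp", "x")) is None
```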
