compute ideal source coverage with astropy xmatch #555

Merged: 4 commits, Jun 24, 2021

@marxide marxide commented Jun 23, 2021

Previous implementation required the Cartesian product of every source and sky region be held in memory. Fixes #554.

@marxide marxide requested a review from ajstewart June 23, 2021 17:20
@marxide marxide self-assigned this Jun 23, 2021
[skip ci]
@marxide marxide marked this pull request as ready for review June 23, 2021 17:57

@ajstewart ajstewart left a comment

This looks like it's ready to go to me!

You've just replaced that part with an astropy crossmatch search around sky, right? Definitely a leaner way to go, thanks.

On a side note, from memory, I had speed issues when performing separation calculations using astropy in conjunction with a pandas.apply() and/or vectorising things - that's pretty much the only reason that on_sky_sep function exists.
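For context on the kind of helper being described, here is a minimal sketch of a vectorised on-sky separation written with NumPy only. It is not the project's `on_sky_sep` function; the name `angular_separation_deg`, its signature, and the choice of the haversine formula are assumptions made purely for illustration.

```python
import numpy as np


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between paired positions.

    Hypothetical helper for illustration only. All inputs are in degrees
    and may be scalars or equal-length arrays; the arithmetic is
    vectorised, so no per-row pandas apply is needed.
    """
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    # Haversine formula; clip guards against rounding just outside [0, 1].
    a = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0))))


# Two pairs: (0, 0) vs (0, 90) and (10, 0) vs (11, 0).
seps = angular_separation_deg(
    np.array([0.0, 10.0]), np.array([0.0, 0.0]),
    np.array([0.0, 11.0]), np.array([90.0, 0.0]),
)
```

Because it operates on whole columns at once, a function of this shape can be called directly on DataFrame columns converted to arrays.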


marxide commented Jun 24, 2021

> You've just replaced that part with an astropy crossmatch search around sky, right? Definitely a leaner way to go, thanks.

Yes. Instead of building a DataFrame of every source and sky region combination with their separations and then filtering by separation, it now finds the candidate pairs with astropy's `search_around_sky` function. The separation threshold differs per sky region, so I search using the maximum threshold first, then filter each match against its own region's threshold afterwards.

> On a side note, from memory, I had speed issues when performing separation calculations using astropy in conjunction with a pandas.apply() and/or vectorising things - that's pretty much the only reason that on_sky_sep function exists.

I didn't test if this is any faster or slower. It might be slower, but I think sacrificing speed in return for avoiding OOM errors is preferable!

@ajstewart

> > On a side note, from memory, I had speed issues when performing separation calculations using astropy in conjunction with a pandas.apply() and/or vectorising things - that's pretty much the only reason that on_sky_sep function exists.
>
> I didn't test if this is any faster or slower. It might be slower, but I think sacrificing speed in return for avoiding OOM errors is preferable!

For sure! Just wanted to give context on why that function is there in the first place.

@marxide marxide merged commit 348de3e into dev Jun 24, 2021
@marxide marxide deleted the issue-554-skyregion-memory branch June 24, 2021 17:09
Development

Successfully merging this pull request may close these issues.

High memory usage during sky region ideal coverage