classifyarray #10

Merged
merged 16 commits into from
Feb 15, 2021

Conversation

mkborregaard
Member

No description provided.

@mkborregaard mkborregaard force-pushed the mkb/classify branch 2 times, most recently from 0cc195b to a1bc588 Compare February 11, 2021 21:20
maskedarray = isnothing(classifyMask) ? array : mask!(copy(array))
nCells = count(isfinite, maskedarray)
boundaryIndexes = ceil(Int, cumulativeProportions * nCells)
boundaryValues = sort(vec(maskedarray))[boundaryIndexes]
Member Author

there are two full copies in this function - it must be possible to avoid at least this one

@mkborregaard
Member Author

mkborregaard commented Feb 11, 2021

Just realized that calcboundaries is just quantile, though a little bit faster:

using StatsBase: quantile
a,c = rand(1000, 1000), [0.2, 0.4, 0.7, 0.8, 1]

julia> @btime quantile(vec(a), c)
  98.238 ms (5 allocations: 7.63 MiB)
5-element Vector{Float64}:
 0.20055273960046652
 0.40035575057383455
 0.7008467088240219
 0.8006453000863003
 0.9999981820198582

julia> @btime calcBoundaries(a, c)
  72.043 ms (7 allocations: 7.63 MiB)
5-element Vector{Float64}:
 0.2005524615247647
 0.40035544743228324
 0.7008461304132632
 0.8006452681104108
 0.9999981820198582
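
The small differences in the trailing digits above would be consistent with the two approaches picking boundaries slightly differently: `quantile` interpolates linearly between neighbouring sorted values by default, whereas indexing the sorted vector at `ceil(p * n)` returns an actual element. A minimal sketch of that difference on a toy vector, where `boundaries_by_index` is a hypothetical stand-in modelled on the snippet under review and not the package's actual helper:

```julia
using Statistics  # standard library; provides `quantile`

# Hypothetical stand-in for the helper discussed above (an assumption, not
# the package code): pick the sorted element at index ceil(p * n) for each
# cumulative proportion p.
function boundaries_by_index(values::AbstractVector{<:Real}, props)
    sorted = sort(values)                        # sorted copy; input untouched
    idx = ceil.(Int, props .* length(sorted))    # one index per proportion
    return sorted[idx]
end

values = collect(1.0:10.0)
props = [0.25, 0.5, 1.0]

by_index = boundaries_by_index(values, props)    # elements of `values`
by_quantile = quantile(values, props)            # linearly interpolated
```

On this input the index-based version returns `[3.0, 5.0, 10.0]` while `quantile` returns `[3.25, 5.5, 10.0]`; the two agree at `p = 1`, as in the benchmark output above.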

@tpoisot
Contributor

tpoisot commented Feb 12, 2021

Isn't it because quantile is doing some checks internally that calcBoundaries doesn't? I'm more than willing to pay the 20ms penalty on a 10⁶ grid if the result is safer.

@mkborregaard
Member Author

Yes totally agree. It was just fun that it took me several readings of the code - and a port PR - to realize that this functionality already exists and has a name :-)

@mkborregaard
Member Author

But I just need to check that quantile behaves as expected in the presence of NaNs
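
As far as I know, `Statistics.quantile` throws an `ArgumentError` when the input contains `NaN`, so masked cells would need to be dropped first. A minimal sketch of that, assuming masked cells are stored as `NaN`; `finite_quantile` is a hypothetical helper name, not part of the package:

```julia
using Statistics  # standard library; provides `quantile`

# Hypothetical helper (not part of the package): compute quantiles over the
# finite cells only, so NaN-masked cells are ignored.
finite_quantile(a, props) = quantile(filter(isfinite, vec(a)), props)

a = [1.0 NaN; 3.0 2.0]

# Calling quantile directly on data containing NaN is expected to throw.
threw = try
    quantile(vec(a), [0.5])
    false
catch err
    err isa ArgumentError
end

b = finite_quantile(a, [0.5, 1.0])   # median and maximum of the finite cells
```

Filtering with `isfinite` also drops `Inf` cells, which matches the `count(isfinite, maskedarray)` call in the snippet under review.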

@tpoisot
Contributor

tpoisot commented Feb 14, 2021

So, if calcboundaries is just quantile, shouldn't it be an overload of quantile instead?

@codecov-io

codecov-io commented Feb 14, 2021

Codecov Report

Merging #10 (86cce3a) into main (19161a8) will decrease coverage by 7.43%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main      #10      +/-   ##
==========================================
- Coverage   37.73%   30.30%   -7.44%     
==========================================
  Files           8        9       +1     
  Lines          53       66      +13     
==========================================
  Hits           20       20              
- Misses         33       46      +13     
Flag Coverage Δ
unittests 30.30% <0.00%> (-7.44%) ⬇️

Impacted Files Coverage Δ
src/NeutralLandscapes.jl 100.00% <ø> (ø)
src/classify.jl 0.00% <0.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mkborregaard mkborregaard changed the title from "[wip] start porting classifyarray" to "classifyarray" Feb 14, 2021
@mkborregaard
Member Author

mkborregaard commented Feb 14, 2021

I think this is ready to be reviewed now - though there are neither docs nor tests.

a = rand(DistanceGradient(rand(1:50*50, 10)), 50, 50);
plot(
    heatmap(a), 
    heatmap(classifyArray!(copy(a), [0.25, 1, 1, 1, 0.25, 0.5])), 
    size = (800, 300)
)

[Image: heatmaps of the original and the classified landscape]
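
For readers without the package at hand, here is a rough, self-contained sketch of what a weight-based classification step like the one above might do. It is not the code in src/classify.jl; `classify_sketch!` is a hypothetical name, and using `quantile` for the class boundaries is an assumption taken from the discussion earlier in this thread.

```julia
using Statistics  # standard library; provides `quantile`

# Hypothetical sketch, not the package implementation: replace each finite
# cell with the index of the class it falls into. Class sizes are taken to
# be proportional to `weights`.
function classify_sketch!(a::AbstractArray{<:AbstractFloat}, weights)
    props = cumsum(weights) ./ sum(weights)             # cumulative proportions
    bounds = quantile(filter(isfinite, vec(a)), props)  # upper bound per class
    for i in eachindex(a)
        isfinite(a[i]) || continue                      # leave masked cells alone
        # first class whose upper bound is >= the value, clamped so that
        # rounding in the top boundary cannot produce an extra class
        a[i] = min(searchsortedfirst(bounds, a[i]), length(bounds))
    end
    return a
end

landscape = [0.1 0.4; 0.7 NaN]
classes = classify_sketch!(copy(landscape), [1, 1, 1])
```

With three equal weights, the three finite cells end up in classes 1, 2 and 3, and the `NaN` cell is left untouched.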

@mkborregaard
Member Author

Honestly I feel like I should just inline those two helper functions? They're not used anywhere else.

Contributor

@tpoisot tpoisot left a comment

Can you add an illustration to docs/src/gallery.md?

@tpoisot
Contributor

tpoisot commented Feb 14, 2021

> Honestly I feel like I should just inline those two helper functions? They're not used anywhere else.

I don't feel strongly about that - I might just rename them to start with _ to give a little hint that they're really not user-facing, but otherwise it doesn't really matter.

@mkborregaard
Member Author

Done. How do we feel about the camelCase? I'm almost inclined to just call it classify! though that sounds like something that could clash with other packages.

@tpoisot
Contributor

tpoisot commented Feb 14, 2021

I like classify! - maybe we don't export it?

@mkborregaard
Member Author

Is NeutralLandscapes.classify! nicer than classifyLandscape!? 🤔

@mkborregaard
Member Author

Hi @tpoisot, I ran into a number of merge conflicts - I think I resolved them, but I had to rebase and force-push - hope you won't get into trouble over at #22, which you branched from this.
I renamed to classify! and unexported - if that works let's merge this now.
But actually as you can see in the example, the demolandscape function isn't actually that flexible, so it causes problems here - maybe it's better to go back to the raw code actually?

@rafaqz
Member

rafaqz commented Feb 15, 2021

We could just export classify!. I can't find other uses of it, and people can use import to avoid it if they need to?

Member

@rafaqz rafaqz left a comment

Looks good to me. I use BlueStyle most of the time now so the camelCase is a bit confusing in terms of separating out types and variables, but that's another one for the style guide.

src/classify.jl Outdated
@@ -0,0 +1,26 @@
function _w2cp(vec)
Member

could inline this or give a clearer name, took me a while to figure out what it was
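
Going by the name and by how weights are passed in the example earlier in this thread, `_w2cp` presumably converts class weights into cumulative proportions. That reading is an assumption, since the function body isn't shown here. Under that assumption, a more descriptive one-liner might look like this:

```julia
# Assumed meaning of `_w2cp` (weights -> cumulative proportions); the name
# below is a hypothetical, more descriptive alternative.
weights_to_cumulative_proportions(w) = cumsum(w) ./ sum(w)

cp = weights_to_cumulative_proportions([1, 1, 2])
```

Here `cp` is `[0.25, 0.5, 1.0]`: the first class covers a quarter of the cells, the second another quarter, and the third the remaining half.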

@mkborregaard mkborregaard merged commit b4fe96c into main Feb 15, 2021
@mkborregaard
Member Author

OK, I've exported it again (in the ecosystem only MLBase.jl appears to export a function of that name), inlined the two helper functions, and merged.

@mkborregaard mkborregaard deleted the mkb/classify branch February 15, 2021 08:35
@tpoisot
Contributor

tpoisot commented Feb 15, 2021

great, I'll be able to deal with #22 (which doesn't need it yet anyways, but the NN cluster does)

@tpoisot
Contributor

tpoisot commented Feb 15, 2021

also in favor of not using demolandscape if it causes more issues than it solves

4 participants