Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jq/pushappend #9

Merged
merged 3 commits into from
Aug 30, 2016
Merged

Jq/pushappend #9

merged 3 commits into from
Aug 30, 2016

Conversation

quinnj
Copy link
Member

@quinnj quinnj commented Aug 24, 2016

No description provided.

@@ -193,3 +193,13 @@ function getindex(A::CatOrdArray, i::Int)
end

levels!(A::CatOrdArray, newlevels::Vector) = _levels!(A, newlevels)

Base.push!{T, R}(A::CategoricalArray{T, 1, R}, item::T) = (push!(A.refs, get!(A.pool, item)); return A)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better use the CatOrdArray alias as elsewhere, rather than duplicating definitions.

@nalimilan
Copy link
Member

nalimilan commented Aug 24, 2016

I'm afraid these definitions won't work in two cases:

  • when the new values are not included in existing levels
  • when a null value is added (for nullable arrays only)

For push!, the solution is to resize the array, and use setindex!. For append!, you need to call levels!(A, intersect(levels(A), levels(B)) first.

EDIT: please also add tests for these corner cases (in both 11_array.jl and 12_nullablearray.jl).

@quinnj
Copy link
Member Author

quinnj commented Aug 24, 2016

Would we really want intersect instead of union?

@quinnj
Copy link
Member Author

quinnj commented Aug 24, 2016

Another method I realized I need is empty!(A). Do you think for CategoricalArrays, it would be proper to just empty!(A.refs)? Or should we empty the pool too?

@nalimilan
Copy link
Member

Sorry, I meant union of course. As regards empty!, I'm really not sure what's the best solution. Let's preserve the levels for now, at least until we find a reason to act differently.

@quinnj
Copy link
Member Author

quinnj commented Aug 25, 2016

Alrighty, cleaned up and tests passing. Might not be the most performant solutions in the world, but they seem to be functional.

@quinnj
Copy link
Member Author

quinnj commented Aug 25, 2016

I do find it interesting that there are two coverage systems on CI, but no travis or appveyor? :)

x = V{String, R}(a)
push!(x, "a")
@test length(x) == 4
@test String(x[end]) == "a"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for String here, == is implemented to allow direct comparisons with NominalValue.

@nalimilan
Copy link
Member

Alrighty, cleaned up and tests passing. Might not be the most performant solutions in the world, but they seem to be functional.

Thanks. These look reasonably fast. We could be slightly faster by 1) copying the integer codes for x as-is, and 2) computing a map of level codes in y to level codes in the new vector, storing it in a vector, and accessing it to recode instead of relying on setindex!. But let's leave these for later.

I do find it interesting that there are two coverage systems on CI, but no travis or appveyor? :)

AFAIK Travis is enabled. We could add AppVeyor, but nothing should be Windows-specific in this package, so not a high priority.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.6%) to 83.986% when pulling 7e33fe3b7d0ec99114ae76ed4180c83b7c55e0b4 on quinnj:jq/pushappend into 98c147d on nalimilan:master.

1 similar comment
@coveralls
Copy link

coveralls commented Aug 25, 2016

Coverage Status

Coverage increased (+0.6%) to 83.986% when pulling 7e33fe3b7d0ec99114ae76ed4180c83b7c55e0b4 on quinnj:jq/pushappend into 98c147d on nalimilan:master.

@quinnj
Copy link
Member Author

quinnj commented Aug 25, 2016

Ok updated.

I mention travis because it doesn't seem to be showing up here to show its status.



empty!(x)
@test length(x) == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please test that levels are preserved.

@nalimilan
Copy link
Member

There's a test failure on Travis. Apart from this and the two details I noted, should be OK.

@quinnj
Copy link
Member Author

quinnj commented Aug 26, 2016

Weird, I can't reproduce the error when I run the test file by itself, only when I run the entire test suite. Can you think of any reason I'd see that? Unfortunately, the backtraces seemed to be messed up when the test suite is run in full and it doesn't tell me the individual test that is failing. I'll keep trying ot figure out what's going on here.


y = V{String, R}(a)
append!(x, y)
@test length(x) == 10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also test the contents of x after appending? Same below.

@nalimilan
Copy link
Member

Got it. The error was only happening with --bounds-check=yes, since it comes from indexing inside an @inbounds loop in _levels! at src/array.jl:167. So the new tests actually uncovered a real bug in levels! triggered by append!: the x > 0 condition should apply to both lines in the loop. Please make that change, even removing the intermediate variable j which isn't really useful.

Could you also repeat the new test code for each series of tests in each file? The test are currently repeated (with some variations were needed) for vectors (without then with nulls) and matrices (without then with nulls). I could probably simplify this, but for now I'd prefer that you follow that structure.

@quinnj
Copy link
Member Author

quinnj commented Aug 26, 2016

Well, we won't need the tests for matrices since we're only dealing with vector operations here (push!, append!)

@nalimilan
Copy link
Member

Right, so please just repeat the tests twice (one for without nullables, then one with nullables).

@coveralls
Copy link

coveralls commented Aug 29, 2016

Coverage Status

Coverage increased (+0.6%) to 83.63% when pulling 82a9345 on quinnj:jq/pushappend into 70b6a5d on nalimilan:master.

@codecov-io
Copy link

codecov-io commented Aug 29, 2016

Current coverage is 83.62% (diff: 100%)

Merging #9 into master will increase coverage by 0.60%

@@             master         #9   diff @@
==========================================
  Files             7          7          
  Lines           271        281    +10   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            225        235    +10   
  Misses           46         46          
  Partials          0          0          

Powered by Codecov. Last update 70b6a5d...7786cd3

@quinnj
Copy link
Member Author

quinnj commented Aug 29, 2016

Ok, think we're good to go here.

@test length(x) == 6
@test x[1] == x[end]
@test levels(x) == ["e", "a", "b", "c", "zz"]
# ["c", "a", "a", "a", "zz", "c"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover?


append!(x, x)
@test length(x) == 12
@test x[1] == "c"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be replaced with x == ["c", "a", ...]?

@quinnj
Copy link
Member Author

quinnj commented Aug 29, 2016

Alright, let's see what CI says about this.

@coveralls
Copy link

coveralls commented Aug 29, 2016

Coverage Status

Coverage increased (+0.6%) to 83.63% when pulling 7786cd3 on quinnj:jq/pushappend into 70b6a5d on nalimilan:master.

2 similar comments
@coveralls
Copy link

Coverage Status

Coverage increased (+0.6%) to 83.63% when pulling 7786cd3 on quinnj:jq/pushappend into 70b6a5d on nalimilan:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.6%) to 83.63% when pulling 7786cd3 on quinnj:jq/pushappend into 70b6a5d on nalimilan:master.

@nalimilan nalimilan merged this pull request into JuliaData:master Aug 30, 2016
@nalimilan
Copy link
Member

Thanks! Sorry it took so many rounds of review!

Tell me if you need me to tag a new release.

@nalimilan nalimilan mentioned this pull request Aug 30, 2016
nalimilan pushed a commit that referenced this pull request Aug 30, 2016
Needed by DataStreams.jl. This uncovered a bug in _levels!() when the input contained missing values.
@quinnj
Copy link
Member Author

quinnj commented Aug 30, 2016

No worries. Yeah, let's tag a new release. I can do that if you'd like.

@quinnj quinnj deleted the jq/pushappend branch August 30, 2016 13:06
@nalimilan
Copy link
Member

I'd just like to include commit d4a4e70, but tests unexplicably fail on Travis while they pass locally. Could you checkout nl/test and tell me whether tests pass on your machine?

@nalimilan
Copy link
Member

Ah, nevermind, I just realized this depends on JuliaStats/NullableArrays.jl#141, which isn't merged upstream. Let me find a temporary fix.

@nalimilan
Copy link
Member

OK, should be good as soon as all tests are done. Feel free to tag the new commit as version 0.0.3 (I'm going to be offline for a few hours).

@quinnj
Copy link
Member Author

quinnj commented Aug 30, 2016

Do you want to do a "release" on github and I can just use that?

@nalimilan
Copy link
Member

Ah, right. See https://github.com/nalimilan/CategoricalArrays.jl/releases/tag/v0.0.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants