Noob Question #29

Open
alexhallam opened this issue Nov 6, 2016 · 4 comments

Comments

@alexhallam

alexhallam commented Nov 6, 2016

Hi,

First I just want to say, coming from a "tidyverse/R" background, I love the idea of this package.

I am just starting to explore Julia, so I think this is more of a general question than one specifically about your package.

julia> q = @query filter(iris, sepal_length > 5.0)

julia> collect(q)
ERROR: MethodError: no method matching default(::DataFrames.DataFrame)
Closest candidates are:
  default(::TablesDemo.Table) at /Users/alexhallam/.julia/v0.5/TablesDemo/src/query/interface.jl:3
 in _collect(::DataFrames.DataFrame, ::StructuredQueries.FilterNode) at /Users/alexhallam/.julia/v0.5/AbstractTables/src/column_indexable/query/filter.jl:2
 in _collect(::StructuredQueries.FilterNode) at /Users/alexhallam/.julia/v0.5/StructuredQueries/src/collect/collect.jl:10
 in collect(::StructuredQueries.Query) at /Users/alexhallam/.julia/v0.5/StructuredQueries/src/collect/collect.jl:7

From what I can gather, this error is saying that collect() is not available for DataFrames and that the alternative is TablesDemo. So my question is: how do I tell Julia that I want to use the TablesDemo collect and not the DataFrames one?

Thanks!

@nalimilan

I don't think the package is meant to be usable with DataFrames yet, though you could have a look at https://github.com/davidagold/Collect.jl.
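
For what it's worth, the trace above seems to say something slightly different from "collect() is not available": the call that fails is `default(::DataFrames.DataFrame)`, and the only method of `default` listed as a candidate takes a `TablesDemo.Table`. A minimal sketch of the same kind of failure, using a made-up function `describe_source` (hypothetical, not part of any of these packages) and only built-in types:

```julia
# Hypothetical function with a single method, defined only for Dict arguments.
describe_source(x::Dict) = "dict-backed source"

# Calling it with a Vector finds no matching method, so Julia throws a MethodError.
err = try
    describe_source([1, 2, 3])
catch e
    e
end

println(isa(err, MethodError))
```

If that reading is right, the issue is not which `collect` gets picked, but that a helper the query needs has no method for the DataFrame type yet.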

@davidagold
Owner

I appreciate your interest, Alex! These packages (the present one and Collect.jl) are all works in progress. I'm hoping to release a version that is usable with DataFrames by the end of the year. Until then, there will be a lot of heavy revision. I hope to push a fairly extensive overhaul to this package within a couple of weeks. This will let me roll out preliminary support for DataFrame manipulation elsewhere. So, do stay tuned!

@alexhallam
Author

alexhallam commented Nov 7, 2016

Hi, thanks. From what I have read, Julia has some problems dealing with NaN values, but I am having trouble understanding what the actual problem is. For example:

julia> a = [1,2,3,4]
4-element Array{Int64,1}:
 1
 2
 3
 4

julia> b = [1,NaN,1,1]
4-element Array{Float64,1}:
   1.0
 NaN  
   1.0
   1.0

julia> b + a
4-element Array{Float64,1}:
   2.0
 NaN  
   4.0
   5.0

I am not seeing the problem. Could you point me to a blog post or some information so I can understand why we need NullableArrays and other solutions to the "missing value" problem?

Also, why are there two separate packages, Collect.jl and StructuredQueries.jl?

@nalimilan

Julia has no problems with NaN. The main problem with NaN is that it only works for floating-point numbers, i.e. not for integers, strings, dates, etc. Pandas 2 is also moving away from NaN for this reason. See in particular these links:
http://www.johnmyleswhite.com/notebook/2014/11/29/whats-wrong-with-statistics-in-julia/
http://www.johnmyleswhite.com/notebook/2015/11/28/why-julias-dataframes-are-still-slow/
http://julialang.org/blog/2015/10/nullablearrays
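
A small sketch of that limitation, using only Base Julia: NaN is itself a floating-point value, so an array containing it is promoted to Float64, and it cannot be stored in an integer array at all. The variable names here are only illustrative.

```julia
# Mixing integers with NaN promotes every element to Float64.
b = [1, NaN, 1, 1]
println(eltype(b))

# An integer array has no way to represent NaN, so the conversion fails.
stored = try
    Int[1, NaN, 3]
    true
catch err
    println(typeof(err))
    false
end

println(stored)
```

The same gap exists for strings and dates, which is why a missing-value marker that is independent of the element type is needed.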
