error with 'convert' types #26

Closed
SimonAB opened this issue Oct 13, 2014 · 14 comments

Comments

@SimonAB
Contributor

SimonAB commented Oct 13, 2014

Dear Doug,
Using fit(lmm(forehead ~ palm + (1|slap), data)), where palm is a PooledDataArray, I get:

julia> ERROR: `convert` has no method matching convert(::Type{Array{Symbol,1}}, ::Array{Any,1})
 in convert at base.jl:13
 in ModelFrame at ~/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:238
 in lmm at ~/.julia/v0.3/MixedModels/src/linearmixedmodels.jl:25
@dmbates
Collaborator

dmbates commented Oct 13, 2014

I just encountered a similar problem myself with extracting the grouping factors from the random-effects term. I will commit the patch for that problem.

This one seems more related to the formula parsing in DataFrames. There have been recent changes there, and also changes in the syntax for creating Any arrays, but you are running 0.3.x so that shouldn't be the issue here. I'll try just the formula to see if that is the problem.

@dmbates
Collaborator

dmbates commented Oct 13, 2014

@SimonAB Can you provide a reproducible example? I tried to reproduce with the master version of MixedModels under Julia v0.3.1 and got

julia> using MixedModels,RDatasets

julia> psts = dataset("lme4","Pastes");

julia> m1 = fit(lmm(Strength ~ 1 + Batch + (1|Sample), psts))
Linear mixed model fit by maximum likelihood
Formula: Strength ~ 1 + Batch + (1 | Sample)

 logLik: -116.197036, deviance: 232.394071

 Variance components:
                Variance    Std.Dev.
 Sample         5.509438    2.347219
 Residual       0.678001    0.823408
 Number of obs: 60; levels of grouping factors: 30

  Fixed-effects parameters:
              Estimate Std.Error   z value
(Intercept)    62.2667   1.39624    44.596
BatchB        -2.96667   1.97458  -1.50243
BatchC       -0.216667   1.97458 -0.109728
BatchD        -2.56667   1.97458  -1.29986
BatchE        -6.36667   1.97458  -3.22432
BatchF        -1.23333   1.97458 -0.624606
BatchG        -2.36667   1.97458  -1.19857
BatchH            0.85   1.97458  0.430472
BatchI        -3.58333   1.97458  -1.81473
BatchJ        -3.68333   1.97458  -1.86538

@dmbates
Collaborator

dmbates commented Oct 13, 2014

@SimonAB The version of MixedModels that I used is now v0.3.16. I don't think the changes should affect the problem you encountered but it would be good to update before creating an example - just so we are both on the same version.

@SimonAB
Contributor Author

SimonAB commented Oct 19, 2014

@dmbates I can reproduce your example fine. After updating to MixedModels 0.3.16 I still get an error, but with different output, for example when using this file:

using MixedModels
df_expl = readtable("df_expl.csv")
lmm = fit(lmm(log_nw ~ 1 + age + (1|session) + (1|cage), df_expl))
 julia> ERROR: ArgumentError("float64(String): invalid number format")
 in float64 at string.jl:1594
 in map at abstractarray.jl:1328
 in cols at /Users/s_a_b/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:262
 in ModelMatrix at /Users/s_a_b/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:296
 in lmm at /Users/s_a_b/.julia/v0.3/MixedModels/src/linearmixedmodels.jl:26

Starting to wonder if this is a problem with my input data, though it never caused trouble before my initial message above.

@dmbates
Collaborator

dmbates commented Oct 20, 2014

@SimonAB I can reproduce the problem with the data you provided. The problem traces back to the way that age is stored by readtable. If you check the DataFrame returned by readtable you get

julia> dump(df)
DataFrame  16 observations of 4 variables
  log_nw: DataArray{Float64,1}(16) [1.38629,1.38629,1.38629,1.09861]
  age: DataArray{UTF8String,1}(16) UTF8String["4wk","4wk","4wk","4wk"]
  session: DataArray{Int64,1}(16) [1,1,1,1]
  cage: DataArray{Int64,1}(16) [2,2,2,2]

Notice that age is stored as a DataArray of UTF8Strings. To be used in lmm it should be a PooledDataArray.

julia> df[:age] = pool(df[:age])
16-element PooledDataArray{UTF8String,Uint8,1}:
 "4wk" 
 "4wk" 
 "4wk" 
 "4wk" 
 "14wk"
 "14wk"
 "14wk"
 "14wk"
 "4wk" 
 "4wk" 
 "4wk" 
 "4wk" 
 "14wk"
 "14wk"
 "14wk"
 "14wk"

julia> fm1 = fit(lmm(log_nw ~ 1 + age + (1|cage) + (1|session), df))
Linear mixed model fit by maximum likelihood
Formula: log_nw ~ 1 + age + (1 | cage) + (1 | session)

 logLik: -15.343272, deviance: 30.686544

 Variance components:
                Variance    Std.Dev.
 cage           0.000000    0.000000
 session        0.000000    0.000000
 Residual       0.398532    0.631294
 Number of obs: 16; levels of grouping factors: 5, 2

  Fixed-effects parameters:
             Estimate Std.Error   z value
(Intercept)   1.38943  0.223196   6.22515
age4wk       -0.15912  0.315647 -0.504109

By the way, it is not surprising that the estimates of the variances of the random effects for cage and for session are zero. You can't expect to estimate variances when there are only two levels of session. It would be best to fit a fixed-effects model.

Also, it is not a good idea to assign the fitted model to the name lmm because that masks the function lmm.

@dmbates dmbates closed this as completed Oct 20, 2014
@SimonAB
Contributor Author

SimonAB commented Oct 27, 2014

Thanks for this -- and sorry, I obscured the actual problem in my example. However, I was still having trouble with ERROR: 'convert' has no method matching convert(::Type{Array{Symbol,1}}, ::Array{Any,1}) until I updated to julia 0.3.2 and MixedModels 0.3.17. Maybe something to do with this? Anyway, I guess this issue is closed...

@SimonAB
Contributor Author

SimonAB commented Nov 1, 2014

Well, I'm having trouble with this again, and I have figured out how to break things (though not why they break). There seems to be a clash caused by Gadfly... Using this dummy data frame, the following

using MixedModels
using Gadfly
df = readtable("df.csv")
df[:age] = pool(df[:age])
fm1 = fit(lmm(log_nw ~ 1 + age + (1|cage), df))
fm2 = fit(lmm(log_nw ~ 0 + age + (1|cage), df))

throws an error for each model fit:

julia> ERROR: `convert` has no method matching convert(::Type{Array{Symbol,1}}, ::Array{Any,1})
 in convert at base.jl:13
 in ModelFrame at ~/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:238
 in lmm at ~/.julia/v0.3/MixedModels/src/linearmixedmodels.jl:25

julia> ERROR: `convert` has no method matching convert(::Type{Array{Symbol,1}}, ::Array{Any,1})
 in convert at base.jl:13
 in ModelFrame at ~/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:238
 in lmm at ~/.julia/v0.3/MixedModels/src/linearmixedmodels.jl:25

Whereas in a new Julia session the following (note the position of using Gadfly):

using MixedModels
df = readtable("df.csv")
df[:age] = pool(df[:age])
fm1 = fit(lmm(log_nw ~ 1 + age + (1|cage), df))
using Gadfly
fm2 = fit(lmm(log_nw ~ 0 + age + (1|cage), df))

works for both models:

julia> Linear mixed model fit by maximum likelihood
Formula: log_nw ~ 1 + age + (1 | cage)

 logLik: -15.343273, deviance: 30.686545

 Variance components:
                Variance    Std.Dev.
 cage           0.000000    0.000000
 Residual       0.398532    0.631294
 Number of obs: 16; levels of grouping factors: 5

  Fixed-effects parameters:
              Estimate Std.Error  z value
(Intercept)    1.38943  0.223196  6.22515
age4wk       -0.159121  0.315647 -0.50411


julia> Linear mixed model fit by maximum likelihood
Formula: log_nw ~ 0 + age + (1 | cage)

 logLik: -22.683637, deviance: 45.367274

 Variance components:
                Variance    Std.Dev.
 cage           0.939132    0.969088
 Residual       0.575690    0.758743
 Number of obs: 16; levels of grouping factors: 5

  Fixed-effects parameters:
        Estimate Std.Error z value
age4wk   1.23031  0.735885 1.67187

@aviks

aviks commented Nov 1, 2014

Thank you @SimonAB for reporting this. I had lm() working in one Julia session, and not working in another. I was completely flummoxed till I came to this issue, and figured that I had Gadfly imported into one session, and not in another.

It also transpires that the order of import matters. If you run lm() before importing Gadfly, it continues to work after. But if you import Gadfly before the first invocation of lm(), it does not work.

./julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.3-pre+18 (2014-10-28 09:26 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit bdd4597* (4 days old release-0.3)
|__/                   |  x86_64-apple-darwin13.4.0

julia> using DataFrames

julia> using GLM
Warning: could not import Base.add! into NumericExtensions

julia> using Gadfly
Warning: could not import Base.has into Gadfly
Warning: could not import StatsBase.bandwidth into Stat
Warning: could not import StatsBase.kde into Stat

julia> df = readtable("/Users/aviks/dev/talks/statslang/julia/data/regression.csv");

julia> pool!(df, [:Sex])

julia> lm(OI ~ Age + Sex , df)
ERROR: `convert` has no method matching convert(::Type{Array{Symbol,1}}, ::Array{Any,1})
 in convert at base.jl:13
 in ModelFrame at /Users/aviks/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:238
 in fit at /Users/aviks/.julia/v0.3/DataFrames/src/statsmodels/statsmodel.jl:52
 in lm at /Users/aviks/.julia/v0.3/GLM/src/lm.jl:43

./julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.3-pre+18 (2014-10-28 09:26 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit bdd4597* (4 days old release-0.3)
|__/                   |  x86_64-apple-darwin13.4.0

julia> using DataFrames

julia> using GLM
Warning: could not import Base.add! into NumericExtensions

julia> df = readtable("/Users/aviks/dev/talks/statslang/julia/data/regression.csv");

julia> pool!(df, [:Sex])

julia> lm(OI ~ Age + Sex , df)
DataFrameRegressionModel{LinearModel{DensePredQR{Float64}},Float64}:

Coefficients:
              Estimate Std.Error t value Pr(>|t|)
(Intercept)    1.73525  0.974969  1.7798   0.0782
Age          0.0815745 0.0194358 4.19713    <1e-4
Sex - Male     2.30983  0.699491 3.30216   0.0013


julia> using Gadfly
Warning: could not import Base.has into Gadfly
Warning: could not import StatsBase.bandwidth into Stat
Warning: could not import StatsBase.kde into Stat

julia> lm(OI ~ Age + Sex , df)
DataFrameRegressionModel{LinearModel{DensePredQR{Float64}},Float64}:

Coefficients:
              Estimate Std.Error t value Pr(>|t|)
(Intercept)    1.73525  0.974969  1.7798   0.0782
Age          0.0815745 0.0194358 4.19713    <1e-4
Sex - Male     2.30983  0.699491 3.30216   0.0013

Importing Gadfly adds 271 new methods of convert, bringing the total to 764, so I'm not sure where to start debugging this!

@aviks

aviks commented Nov 1, 2014

@aviks

aviks commented Nov 2, 2014

As a workaround, run the following line of code before using Gadfly. Everything should work fine after that:

convert(Vector{Symbol}, {"a","b"})

@SimonAB
Contributor Author

SimonAB commented Nov 2, 2014

@aviks thanks for the workaround. It works fine in simple situations, but things break again when switching back and forth between Gadfly and MixedModels (or, I guess, any packages that use incompatible methods of convert)… It seems a pretty fundamental bug may have been introduced sometime after Julia v0.3.0 (I think I first saw this in 0.3.1).

@aviks

aviks commented Nov 4, 2014

@SimonAB can you pin Color to 0.3.6 and see if this goes away?

@SimonAB
Contributor Author

SimonAB commented Nov 5, 2014

@aviks yes, pinning Color to 0.3.6 has fixed this problem for me (I had previously tried pinning to 0.3.9 but that didn't help) -- and switching between MixedModels and Gadfly functions doesn't break the fix anymore. Not sure if this is related to the bug, but pinning back to 0.3.6 also removed Compat 0.1.1…
Thanks for following up on this.

@SimonAB
Contributor Author

SimonAB commented Nov 8, 2014

Upgrading Color to v 0.3.12 has solved this issue for me, as per @timholy's suggestion here and here.
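
For anyone landing here later, the version changes mentioned in this thread can be sketched roughly as below. This assumes the Julia 0.3-era package manager functions (`Pkg.pin`, `Pkg.free`, `Pkg.update`, `Pkg.installed`). Later Julia releases replaced that API, so treat this as an illustration for that era rather than a current recipe.

```julia
# Assumption: Julia 0.3.x with its built-in Pkg module.

# Temporary workaround reported above: hold Color at 0.3.6.
Pkg.pin("Color", v"0.3.6")

# Once a fixed release is available, drop the pin and upgrade.
Pkg.free("Color")
Pkg.update()

# Check which version is now installed; 0.3.12 is the version
# reported in this thread to resolve the `convert` error.
Pkg.installed("Color")
```

Pinning is only a stopgap: as noted above, pinning back to 0.3.6 also changed which other packages were installed, so freeing the pin and updating is the cleaner end state.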
