error with 'convert' types #26
I just encountered a similar problem myself with extracting the grouping factors from the random-effects term. I will commit the patch for that problem. This one seems more related to the formula parsing in DataFrames. There have been recent changes in that and also changes in the syntax of creating |
@SimonAB Can you provide a reproducible example? I tried to reproduce with the master version of
julia> using MixedModels,RDatasets
julia> psts = dataset("lme4","Pastes");
julia> m1 = fit(lmm(Strength ~ 1 + Batch + (1|Sample), psts))
Linear mixed model fit by maximum likelihood
Formula: Strength ~ 1 + Batch + (1 | Sample)
logLik: -116.197036, deviance: 232.394071
Variance components:
Variance Std.Dev.
Sample 5.509438 2.347219
Residual 0.678001 0.823408
Number of obs: 60; levels of grouping factors: 30
Fixed-effects parameters:
Estimate Std.Error z value
(Intercept) 62.2667 1.39624 44.596
BatchB -2.96667 1.97458 -1.50243
BatchC -0.216667 1.97458 -0.109728
BatchD -2.56667 1.97458 -1.29986
BatchE -6.36667 1.97458 -3.22432
BatchF -1.23333 1.97458 -0.624606
BatchG -2.36667 1.97458 -1.19857
BatchH 0.85 1.97458 0.430472
BatchI -3.58333 1.97458 -1.81473
BatchJ -3.68333 1.97458 -1.86538 |
@SimonAB The version of MixedModels that I used is now v0.3.16. I don't think the changes should affect the problem you encountered but it would be good to update before creating an example - just so we are both on the same version. |
@dmbates I can reproduce your example fine. After having updated to MixedModels 0.3.16, I still get an error, however with a different output, using this file for example:
using MixedModels
df_expl = readtable("df_expl.csv")
lmm = fit(lmm(log_nw ~ 1 + age + (1|session) + (1|cage), df_expl))
julia> ERROR: ArgumentError("float64(String): invalid number format")
in float64 at string.jl:1594
in map at abstractarray.jl:1328
in cols at /Users/s_a_b/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:262
in ModelMatrix at /Users/s_a_b/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:296
in lmm at /Users/s_a_b/.julia/v0.3/MixedModels/src/linearmixedmodels.jl:26
Starting to wonder if this is a problem with my input data, though it never caused trouble before my initial message above. |
@SimonAB I can reproduce the problem with the data you provided. The problem traces back to the way that
julia> dump(df)
DataFrame 16 observations of 4 variables
log_nw: DataArray{Float64,1}(16) [1.38629,1.38629,1.38629,1.09861]
age: DataArray{UTF8String,1}(16) UTF8String["4wk","4wk","4wk","4wk"]
session: DataArray{Int64,1}(16) [1,1,1,1]
cage: DataArray{Int64,1}(16) [2,2,2,2]
Notice that
julia> df[:age] = pool(df[:age])
16-element PooledDataArray{UTF8String,Uint8,1}:
"4wk"
"4wk"
"4wk"
"4wk"
"14wk"
"14wk"
"14wk"
"14wk"
"4wk"
"4wk"
"4wk"
"4wk"
"14wk"
"14wk"
"14wk"
"14wk"
julia> fm1 = fit(lmm(log_nw ~ 1 + age + (1|cage) + (1|session), df))
Linear mixed model fit by maximum likelihood
Formula: log_nw ~ 1 + age + (1 | cage) + (1 | session)
logLik: -15.343272, deviance: 30.686544
Variance components:
Variance Std.Dev.
cage 0.000000 0.000000
session 0.000000 0.000000
Residual 0.398532 0.631294
Number of obs: 16; levels of grouping factors: 5, 2
Fixed-effects parameters:
Estimate Std.Error z value
(Intercept) 1.38943 0.223196 6.22515
age4wk -0.15912 0.315647 -0.504109
By the way, it is not surprising that the estimates of the variances of the random effects for
Also, it is not a good idea to assign the fitted model to the name |
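To illustrate why the unpooled age column triggers the float64(String) error, and what pooling changes, here is a minimal sketch in plain Julia. It assumes a current Julia 1.x release rather than the 0.3 series used in this thread. The names ages, parsed, levels and codes are illustrative only, and the sketch does not reproduce the internals of pool or PooledDataArray.

```julia
# Hypothetical stand-in for the `age` column read from the CSV file.
ages = ["4wk", "4wk", "14wk", "14wk"]

# None of the labels is a valid number, so the column cannot be used as
# a numeric covariate; tryparse returns `nothing` instead of throwing.
parsed = [tryparse(Float64, a) for a in ages]

# A categorical encoding stores the sorted distinct labels once and
# replaces each entry with the index of its label.
levels = sort(unique(ages))
codes = [findfirst(l -> l == a, levels) for a in ages]

println(parsed)
println(levels)
println(codes)
```

Because "14wk" sorts before "4wk" as a string, it becomes the first level in this sketch, which is consistent with the fitted model above reporting a single coefficient named age4wk.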
Thanks for this -- and sorry, I obscured the actual problem in my example. However, I was still having trouble with |
Well, I'm having trouble with this again, and I have figured out how to break things (though not why they break). There seems to be a clash caused by Gadfly... Using this dummy data frame, the following
using MixedModels
using Gadfly
df = readtable("df.csv")
df[:age] = pool(df[:age])
fm1 = fit(lmm(log_nw ~ 1 + age + (1|cage), df))
fm2 = fit(lmm(log_nw ~ 0 + age + (1|cage), df))
throws an error for each model fit:
julia> ERROR: `convert` has no method matching convert(::Type{Array{Symbol,1}}, ::Array{Any,1})
in convert at base.jl:13
in ModelFrame at ~/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:238
in lmm at ~/.julia/v0.3/MixedModels/src/linearmixedmodels.jl:25
julia> ERROR: `convert` has no method matching convert(::Type{Array{Symbol,1}}, ::Array{Any,1})
in convert at base.jl:13
in ModelFrame at ~/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:238
in lmm at ~/.julia/v0.3/MixedModels/src/linearmixedmodels.jl:25
Whereas in a new Julia session the following (note the position of
using MixedModels
df = readtable("df.csv")
df[:age] = pool(df[:age])
fm1 = fit(lmm(log_nw ~ 1 + age + (1|cage), df))
using Gadfly
fm2 = fit(lmm(log_nw ~ 0 + age + (1|cage), df))
works for both models:
julia> Linear mixed model fit by maximum likelihood
Formula: log_nw ~ 1 + age + (1 | cage)
logLik: -15.343273, deviance: 30.686545
Variance components:
Variance Std.Dev.
cage 0.000000 0.000000
Residual 0.398532 0.631294
Number of obs: 16; levels of grouping factors: 5
Fixed-effects parameters:
Estimate Std.Error z value
(Intercept) 1.38943 0.223196 6.22515
age4wk -0.159121 0.315647 -0.50411
julia> Linear mixed model fit by maximum likelihood
Formula: log_nw ~ 0 + age + (1 | cage)
logLik: -22.683637, deviance: 45.367274
Variance components:
Variance Std.Dev.
cage 0.939132 0.969088
Residual 0.575690 0.758743
Number of obs: 16; levels of grouping factors: 5
Fixed-effects parameters:
Estimate Std.Error z value
age4wk 1.23031 0.735885 1.67187 |
Thank you @SimonAB for reporting this. I had
It also transpires that the order of import matters. If you run
./julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.3-pre+18 (2014-10-28 09:26 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit bdd4597* (4 days old release-0.3)
|__/                   |  x86_64-apple-darwin13.4.0
julia> using DataFrames
julia> using GLM
Warning: could not import Base.add! into NumericExtensions
julia> using Gadfly
Warning: could not import Base.has into Gadfly
Warning: could not import StatsBase.bandwidth into Stat
Warning: could not import StatsBase.kde into Stat
julia> df = readtable("/Users/aviks/dev/talks/statslang/julia/data/regression.csv");
julia> pool!(df, [:Sex])
julia> lm(OI ~ Age + Sex , df)
ERROR: `convert` has no method matching convert(::Type{Array{Symbol,1}}, ::Array{Any,1})
in convert at base.jl:13
in ModelFrame at /Users/aviks/.julia/v0.3/DataFrames/src/statsmodels/formula.jl:238
in fit at /Users/aviks/.julia/v0.3/DataFrames/src/statsmodels/statsmodel.jl:52
in lm at /Users/aviks/.julia/v0.3/GLM/src/lm.jl:43
./julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.3-pre+18 (2014-10-28 09:26 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit bdd4597* (4 days old release-0.3)
|__/                   |  x86_64-apple-darwin13.4.0
julia> using DataFrames
julia> using GLM
Warning: could not import Base.add! into NumericExtensions
julia> df = readtable("/Users/aviks/dev/talks/statslang/julia/data/regression.csv");
julia> pool!(df, [:Sex])
julia> lm(OI ~ Age + Sex , df)
DataFrameRegressionModel{LinearModel{DensePredQR{Float64}},Float64}:
Coefficients:
Estimate Std.Error t value Pr(>|t|)
(Intercept) 1.73525 0.974969 1.7798 0.0782
Age 0.0815745 0.0194358 4.19713 <1e-4
Sex - Male 2.30983 0.699491 3.30216 0.0013
julia> using Gadfly
Warning: could not import Base.has into Gadfly
Warning: could not import StatsBase.bandwidth into Stat
Warning: could not import StatsBase.kde into Stat
julia> lm(OI ~ Age + Sex , df)
DataFrameRegressionModel{LinearModel{DensePredQR{Float64}},Float64}:
Coefficients:
Estimate Std.Error t value Pr(>|t|)
(Intercept) 1.73525 0.974969 1.7798 0.0782
Age 0.0815745 0.0194358 4.19713 <1e-4
Sex - Male 2.30983 0.699491 3.30216 0.0013
Importing Gadfly adds 271 new methods of |
As a workaround, run the following line of code before
|
@aviks thanks for the workaround. It works fine in simple situations, but things break again when switching back and forth between Gadfly and MixedModels (or, I guess, any packages that use incompatible methods of |
@SimonAB can you pin |
@aviks yes, pinning |
Dear Doug,
Using fit(lmm(forehead ~ palm + (1|slap), data)), where palm is a PooledDataArray, I get: