Plotting methods #38

joshday · 2015-09-24T17:15:04Z

Since Plots.jl exists, we can now implement plotting methods with less worry about a user's preferred plotting package.

In the josh branch, I currently have a new traceplot!(o, b, data...; f) function, which plots the point (x = nobs(o), y = f(o)) after every b observations. f defaults to v -> state(v)[1], but any function that takes the OnlineStat and returns a Real scalar or Vector should work.

Edit for example:

x = randn(10_000, 10)
beta = collect(1:10)
y = x*beta + randn(10_000)

o = SGModel(10)
traceplot!(o, 100, x, y)
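A minimal sketch of the trace idea without any plotting dependency, in Julia 1.x syntax. It assumes a hypothetical RunningMean type in place of a real OnlineStat, and tracepoints!, update!, nobs, and value are illustrative names, not the package's API. It only collects the (nobs(o), f(o)) pairs that a trace plot would draw:

```julia
# Hypothetical stand-in for an OnlineStat: a running mean.
mutable struct RunningMean
    n::Int
    mu::Float64
end
RunningMean() = RunningMean(0, 0.0)

nobs(o::RunningMean) = o.n
value(o::RunningMean) = o.mu

# Update with a single observation.
function update!(o::RunningMean, x::Real)
    o.n += 1
    o.mu += (x - o.mu) / o.n
    return o
end

# Feed data in chunks of b observations and record
# (nobs(o), f(o)) after each chunk.
function tracepoints!(o, b::Integer, data::AbstractVector; f = value)
    xs = Int[]
    ys = Float64[]
    for start in 1:b:length(data)
        stop = min(start + b - 1, length(data))
        for x in view(data, start:stop)
            update!(o, x)
        end
        push!(xs, nobs(o))
        push!(ys, f(o))
    end
    return xs, ys
end

xs, ys = tracepoints!(RunningMean(), 100, collect(1.0:1000.0))
```

The resulting xs and ys vectors could then be handed to whichever plotting backend is loaded.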

tbreloff · 2015-09-24T17:27:24Z

👍 Cool... this is certainly the type of thing I was thinking of.

It would also be great to add some "plotting recipes" that work with various stats. For example, it would be really cool to be able to do a PCA/SVD on a scatter plot and add overlays like those in the image from Wikipedia.

tbreloff · 2015-10-12T04:29:30Z

I was just peeking through your plotmethods code... what do you think about adding a keyword arg ontrace::Function to the tracefit! method, and changing it to call the function instead of adding to and returning an array of values:

function tracefit!(o::OnlineStat, b::Integer, data...; batch::Bool = false, ontrace::Function = nop)
    b = @compat Int(b)
    n = nrows(data[1])
    i = 1
    s = state(o)
    #result = [copy(o)]
    while i <= n
        rng = i:min(i + b - 1, n)
        batch_data = map(x -> rows(x, rng), data)
        batch ? updatebatch!(o, batch_data...) : update!(o, batch_data...)
        #push!(result, copy(o))
        ontrace(o)
        i += b
    end
    #result
    return
end

# then implement the current functionality like (untested code):
o = Mean()
result = Mean[]
tracefit!(o, 1, rand(10); ontrace = o->push!(result, copy(o)))

The advantage here is that you can use the same tracefit! method to add to a plot or do something else other than return a bunch of copies of an object.

Although, I think this interface could be cleaned up further, so maybe add a new method with this functionality for now?

joshday · 2015-10-12T13:13:35Z

Since my end goal was making plots, I like this idea.

b should be changed to a keyword argument. As a positional argument it interrupts the call between o and the data. In general tracefit! needs a rewrite.

joshday · 2015-10-12T16:28:04Z

What about something like this? It updates an OnlineStat (with all of data by default, but you can pass a different batch size), and calls a function after each update.

I think it could replace onlinefit!, tracefit!, and traceplot!. EDIT: Maybe not replace traceplot!, but traceplot! would call this function.

function update_do!(o::OnlineStat, data...;
        b::Integer = size(data[1], 1),
        dothis::Function = x -> nothing,
        batch::Bool = false
    )
    b = @compat Int(b)
    n = size(data[1], 1)
    i = 1
    while i <= n
        rng = i:min(i + b - 1, n)
        batch_data = map(x -> rows(x, rng), data)
        batch ? updatebatch!(o, batch_data...) : update!(o, batch_data...)
        i += b
        dothis(o)
    end
end

julia> o = OnlineStats.Mean();

julia> OnlineStats.update_do!(o, randn(100), b = 50, dothis = o -> println(nobs(o)))
50
100

tbreloff · 2015-10-12T18:08:01Z

I certainly like it more as part of the core update function, but would argue that having both update! and update_do! doesn't change anything. Also while we're at it, how do you feel about a name change to fit!?

Here's a possible abstraction for callbacks, designed as functors:

julia> abstract OnlineCallback

julia> immutable DoEvery <: OnlineCallback
           b::Int
           f::Function
       end

julia> cb = DoEvery(5, o->println("callback for $o."))
DoEvery(5,(anonymous function))

julia> Base.call(cb::DoEvery, i::Integer, o) = mod1(i,cb.b)==cb.b ? cb.f(o) : nothing
call (generic function with 1501 methods)

julia> for i in 1:20
           cb(i,i^2)
       end
callback for 25.
callback for 100.
callback for 225.
callback for 400.
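For readers on current Julia: overloading Base.call was removed in Julia 0.5, and abstract/immutable became abstract type/struct in 0.6. A sketch of the same functor in Julia 1.x syntax follows. The type parameter F and the fired vector are additions for illustration and are not part of the proposal above:

```julia
abstract type OnlineCallback end

struct DoEvery{F} <: OnlineCallback
    b::Int
    f::F
end

# Call overloading: run f only on every b-th iteration.
(cb::DoEvery)(i::Integer, o) = mod1(i, cb.b) == cb.b ? cb.f(o) : nothing

# Record the values the callback receives instead of printing them.
fired = Int[]
cb = DoEvery(5, o -> push!(fired, o))

for i in 1:20
    cb(i, i^2)
end

fired  # [25, 100, 225, 400]
```

The callback fires at i = 5, 10, 15, 20, matching the REPL output shown above.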

Then a rough draft for altering the method, assuming we can add a parameter which defines whether a stat should update batch or not:

immutable NoCallback <: OnlineCallback end
Base.call(cb::NoCallback, args...) = nothing

abstract BatchMode
immutable Batch <: BatchMode end
immutable Online <: BatchMode end

function fit!(o::OnlineStat{Online}, x, y;
              onupdate::OnlineCallback = NoCallback())
    for i in 1:size(x,1)
        fit!(o, row(x,i), row(y,i))
        onupdate(i, o)
    end
end

function fit!(o::OnlineStat{Batch}, x, y;
              onupdate::OnlineCallback = NoCallback(),
              b::Integer = default_batch_size(o))
    n = size(x, 1)
    i = 0
    while i < n
        rng = i+1:min(i + b, n)
        batchfit!(o,  rows(x,rng), rows(y,rng))
        i += b
        onupdate(i, o)
    end
end
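A self-contained sketch of dispatching on the batch mode, in Julia 1.x syntax. It assumes a hypothetical RunningSum stat in place of OnlineStat, and a plain two-argument function as the callback. The default batch size of 10 is arbitrary:

```julia
abstract type BatchMode end
struct Online <: BatchMode end
struct Batch <: BatchMode end

# Hypothetical stat: a running sum tagged with its update mode.
mutable struct RunningSum{M<:BatchMode}
    total::Float64
    updates::Int   # number of update steps performed
end
RunningSum(mode::BatchMode) = RunningSum{typeof(mode)}(0.0, 0)

# Online mode: one update step per observation.
function fit!(o::RunningSum{Online}, x::AbstractVector;
              onupdate = (i, o) -> nothing)
    for i in eachindex(x)
        o.total += x[i]
        o.updates += 1
        onupdate(i, o)
    end
    return o
end

# Batch mode: one update step per chunk of at most b observations.
function fit!(o::RunningSum{Batch}, x::AbstractVector;
              onupdate = (i, o) -> nothing, b::Integer = 10)
    n = length(x)
    i = 0
    while i < n
        rng = (i + 1):min(i + b, n)
        o.total += sum(view(x, rng))
        o.updates += 1
        i = last(rng)   # never exceeds n, unlike i += b
        onupdate(i, o)
    end
    return o
end

a = fit!(RunningSum(Online()), ones(25))
c = fit!(RunningSum(Batch()), ones(25); b = 10)
```

Both calls see the same 25 observations. The online stat takes 25 update steps and the batch stat takes 3, and the caller never has to pick between two differently named functions.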

joshday · 2015-10-12T19:05:25Z

It would be nice to extend fit!. I think update! is a better verb ("fit a variance" doesn't sound right), but maybe it's too general and prone to name conflicts. I could go either way.

I'm worried that OnlineCallback/BatchMode is too elegant. With a few small changes to update_do! (including a better name), it seems like all that functionality could be included.

Do we just need a more general update! method that incorporates callbacks and batch updates?

tbreloff · 2015-10-12T19:24:06Z

It's always bugged me that there are different verbs for batch update vs singleton updates. Singleton updates are just a batch size of 1, no?
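A small sketch of that point, assuming a hypothetical RunningMean type (not the package's Mean): the batch update is the only real implementation, and the singleton method forwards a one-element batch to it.

```julia
mutable struct RunningMean
    n::Int
    mu::Float64
end

# Batch update: merge the mean of k new observations into the running mean.
function update!(o::RunningMean, batch::AbstractVector)
    k = length(batch)
    k == 0 && return o
    o.n += k
    o.mu += (k / o.n) * (sum(batch) / k - o.mu)
    return o
end

# A singleton update is the same call with a one-element batch.
update!(o::RunningMean, x::Real) = update!(o, [x])

data = collect(1.0:10.0)

one_by_one = RunningMean(0, 0.0)
foreach(x -> update!(one_by_one, x), data)

in_batches = RunningMean(0, 0.0)
update!(in_batches, data[1:4])
update!(in_batches, data[5:10])
```

Both objects end with n == 10 and a mean of 5.5 up to floating-point rounding, so a single verb can cover both cases.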

I mostly agree with you about fit vs update. I'm just trying to reconcile the verbs here with the ones that may come out of JuliaML/LossFunctions.jl#3. I really want to be able to chain these things together into a pipeline, and having different verbs (which mean essentially the same thing) will ruin that. Maybe the primary verb there should be update as well (could even consider leaving off the ! since the verb is clear that it mutates.), and just define:

module LearnBase
  update(o, args...) = @not_implemented
  transform(o, args...) = @not_implemented

  const fit = update
  const solve = update
  const predict = transform
end

And anyone can choose the one that feels natural to them. Although maybe we should actually reference any verbs from StatsBase instead.

Evizero · 2015-10-12T19:32:50Z

Personally I dislike fit as well. It doesn't sound right for many situations. I think it's just an artefact for historic reasons that originated from JuliaStats. To me the most natural would be train.

But from a community perspective I think that it's probably best if we just adopt the language standard of Julia. Using synonyms seems like an ugly solution and just makes the code less readable for outsiders. Let's not make it more complicated than it needs to be and just use fit if something learns from data.

joshday · 2015-10-12T19:33:46Z

Thanks for the link! A unified framework is enough to sell me on fit!.

Evizero · 2015-10-12T19:44:11Z

Great to have you on board. There is a lot to learn from your package and I think we are at a good point in time to attempt such a design coordination. I think combined we cover enough of a scope so that others would likely follow (provided we succeed in this endeavour, which I am confident about).

joshday · 2015-11-05T17:46:56Z

#42 is relevant. All plotting methods should go in the plots.jl file which gets included with Requires.

joshday mentioned this issue Sep 24, 2015

Documentation/Wiki #29

Closed

joshday closed this as completed Nov 5, 2015
