
should scale_penalty_with_samples = true be default? #149

Closed
alex-s-gardner opened this issue Aug 31, 2023 · 4 comments

Comments

@alex-s-gardner

alex-s-gardner commented Aug 31, 2023

Just sinking my teeth into MLJLinearModels, and I see that scale_penalty_with_samples = true is the default. Playing around with a number of toy datasets, it seems that scale_penalty_with_samples = true does not produce intuitive results while scale_penalty_with_samples = false does, e.g.:

# create data 
t = 1:0.001:10;
y = 10 .+ 10 * sin.(t) .+ 5 * t .+ randn(length(t)) * 2 .+
    rand((zeros(round(Int64, length(t) / 5))..., 6, -8, 100, -200, 178, -236, 77,
          -129, -50, -100, -45, -33, -114, -1010, -1238, -2000), length(t)) * 0.5;
X = hcat(ones(length(t)), sin.(t), t);
scatter(t, y; markerstrokecolor=:match, markerstrokewidth=0, label = "obs", ylim = (0, 70))

# Base LSQ model fit
θ = X \ y;
scatter!(t, X * θ, markerstrokewidth=0, label="Base lsq")

# pass X[:, 2:end] since fit adds an intercept by default
θ = fit(LinearRegression(), X[:, 2:end], y);
scatter!(t, hcat(X[:, 2:end], ones(length(t))) * θ, markerstrokewidth=0, label="linear")

θ = fit(HuberRegression(scale_penalty_with_samples=true), X[:, 2:end], y);
scatter!(t, hcat(X[:, 2:end], ones(length(t))) * θ, markerstrokewidth=0, label="huber")

θ = fit(HuberRegression(scale_penalty_with_samples=false), X[:, 2:end], y);
scatter!(t, hcat(X[:, 2:end], ones(length(t))) * θ, markerstrokewidth=0, label="huber-no_scale_penalty")
(Screenshot, 2023-08-31: scatter plot comparing the observations with the base LSQ, linear, huber, and huber-no_scale_penalty fits.)

Given this, should scale_penalty_with_samples = false be made the default, or is there a logical reason that it is not?
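For context on what the flag does (my reading of the option; I may be wrong): with scale_penalty_with_samples = true the penalty term is multiplied by the number of samples n, so the objective is roughly Σᵢ ρ(yᵢ − xᵢᵀθ) + n·λ·‖θ‖². Because the loss is a sum over samples, both terms then grow with n and the regularization strength stays constant as data accumulates; without the scaling, the penalty's relative weight vanishes as n grows. A minimal one-dimensional ridge sketch (written in Python for easy checking, not MLJLinearModels' code; `ridge_theta` is a name I made up):

```python
# Hypothetical 1-D ridge regression (no intercept), minimizing
#   sum_i (y_i - theta * x_i)^2 + penalty * theta^2
# Closed form: theta = sum(x*y) / (sum(x^2) + penalty).
def ridge_theta(x, y, lam, scale_with_n):
    # scale_with_n mimics scale_penalty_with_samples: penalty = lam * n
    penalty = lam * len(x) if scale_with_n else lam
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + penalty)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0 * xi for xi in x]          # true slope is 2.0

for k in (1, 100):                  # duplicate the data to grow n
    xs, ys = x * k, y * k
    print(k,
          ridge_theta(xs, ys, 1.0, True),    # scaled: shrinkage independent of n
          ridge_theta(xs, ys, 1.0, False))   # unscaled: approaches 2.0 as n grows
```

Duplicating the data 100× leaves the scaled estimate's shrinkage unchanged, while the unscaled estimate drifts toward the unpenalized slope 2.0. That is consistent with the plots above: with ~9000 samples, the scaled default penalty is large enough to visibly bias the fit, while scale_penalty_with_samples = false tracks the base LSQ fit.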

@tlienart
Collaborator

tlienart commented Aug 31, 2023

I think this tends to depend on what you use MLJLM for, and how.

Looping in @jbrea since he's the one who introduced the change and had a good rationale for it (I think there's an issue where this was discussed; I'll try to find it).

Yeah here's a previous discussion: #124

@alex-s-gardner
Author

@tlienart thanks for the link to the previous discussion. To me it seems a smaller default value for lambda probably should have been set. New users, if moving quickly, will conclude that several of the models give very bad predictions given the defaults. I realize that each use case is different, but it seems to me that a lambda would be wanted in only a minority of use cases.

@tlienart
Collaborator

tlienart commented Aug 31, 2023

Isn't that what the end of the discussion and the relevant commit did?

In any case, if you look at scikit-learn for instance, which is one of the reference implementations out there, they use scaling, and they use an L2 penalty with a nontrivial lambda by default for logistic regression.

I personally don't have a strong opinion on this; to me it's a matter of having correct docs and potentially guiding the user in what they should do (e.g. hyperparameter tuning). If you feel that the docs were unclear, please consider opening a PR for it.

@alex-s-gardner
Author

I think expanding the docs with examples of the different regression options would be a good idea. I'll open a new issue for this with the hope of supplying a PR soon.
