Terminal leaves contain linear regression models? #51

Closed
juanitorduz opened this issue Jan 26, 2023 · 7 comments

@juanitorduz (Contributor)

Hi 👋! I am not sure about the difficulty, but it would be very interesting to have the option for terminal leaves to contain linear regression models, as in LightGBM (see https://lightgbm.readthedocs.io/en/latest/Parameters.html#linear_tree). Do you think this might help with the extrapolation limitations inherent to BART (as described in pymc-devs/pymc-examples#507)?
Just food for thought :)
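
For reference, a minimal sketch (synthetic data, not from the original post) of the LightGBM option referenced above, assuming a recent LightGBM version where `linear_tree` is available as a training parameter:

```python
# With `linear_tree=True`, each leaf fits a linear model on the samples it
# contains instead of predicting a constant value.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=500)

train_set = lgb.Dataset(X, label=y)
params = {"objective": "regression", "linear_tree": True, "verbose": -1}
booster = lgb.train(params, train_set, num_boost_round=50)

# Predictions outside the training range follow the per-leaf linear fits
# rather than flattening to a constant leaf value.
X_new = np.array([[12.0], [15.0]])
print(booster.predict(X_new))
```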

@aloctavodia (Member)

Hi! That may help; at the very least we would get something more sensible close to the observed values and higher uncertainty as we move away from them. A previous version of BART included an option for a linear response (pymc-devs/pymc#5044), so it should not be that difficult to bring it back! There is also a proposal for GP-BART, but that would probably be much more work: https://arxiv.org/abs/2204.02112
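
To make the idea concrete, here is a purely illustrative sketch (not the actual pymc-bart or pymc-devs/pymc#5044 code) of the difference between a constant leaf and a linear-response leaf for the samples that fall into one terminal node:

```python
# Toy comparison: constant leaf value vs. a simple linear response fit
# to the covariate values in a single leaf.
import numpy as np

x_leaf = np.array([1.0, 1.5, 2.0, 2.5, 3.0])   # covariate values in the leaf
y_leaf = np.array([2.1, 3.0, 4.2, 4.9, 6.1])   # responses in the leaf

# Standard BART leaf: a single constant value (the leaf mean).
constant_pred = y_leaf.mean()

# Linear-response leaf: a least-squares fit y ≈ a + b * x,
# which keeps extrapolating as x moves beyond the observed range.
b, a = np.polyfit(x_leaf, y_leaf, deg=1)
x_new = 5.0
print(constant_pred, a + b * x_new)
```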

@juanitorduz (Contributor, Author)

Wow, cool! May I ask why this feature is not included now? Do you have plans to bring it back?
GP-BART sounds very interesting ... but I can imagine it is a non-trivial implementation.

@aloctavodia (Member)

I don't actually remember; I think it was lost when porting BART to PyMC 4, and at that time I decided to focus on improving the more "standard" features. This is probably a good time to bring it back.

@juanitorduz (Contributor, Author)

Indeed, it would be great to bring it back, as I see a lot of interest in using this model for out-of-sample predictions (of course, knowing the limitations of tree-based models). I probably do not have the knowledge / technical level to help with the source code, but I'll be happy to help with more examples and docs :)

@waudinio27

Hello Juan, Hello Osvaldo,

I just made a donation to support the efforts of the PyMC BART Team. It is really something innovative with ideas that I like.

Have a nice weekend and best regards.

https://www.youtube.com/watch?v=Su8oNkTn76k

@aloctavodia (Member)

Thanks @waudinio27, very much appreciated! Have a nice weekend too!

@waudinio27 commented Feb 6, 2023

Hello,

I have taken a look at pgbart.py and the y_fit = a + b * X linear response, and it is a very good solution together with out-of-sample prediction.

There is a package, https://github.com/cerlymarco/linear-tree, where the linear tree is implemented in a similar fashion.

In the Linear Forest part, a meta-model is built: a linear model is fit first, and the final predictions are the sum of the raw linear predictions and the residuals modeled by the Random Forest. Maybe this could help to smooth out the predictions, like the mix option.

It is not Bayesian like BART, but it has some useful ideas and information.
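
A small sketch (synthetic data, plain scikit-learn rather than the linear-tree package itself) of the Linear Forest idea described above, i.e. fit a linear model first and then model its residuals with a Random Forest:

```python
# Final prediction = linear model prediction + Random Forest prediction
# of the linear model's residuals.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(300, 2))
y = 3.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.3, size=300)

# Step 1: fit the linear model and compute its residuals.
linear = LinearRegression().fit(X, y)
residuals = y - linear.predict(X)

# Step 2: model the residuals with a Random Forest.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, residuals)

# Step 3: sum the two predictions for new data.
X_new = rng.uniform(0, 12, size=(5, 2))
y_pred = linear.predict(X_new) + forest.predict(X_new)
print(y_pred)
```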
