use protostuff to serialize/deserialize RCF model #251
Based on my understanding, the protostuff problems are fixed in the new version, so should we use it to serialize/deserialize all the Tribuo models, like linear regression? Also, what is the problem with continuing to use the Java object stream for serialization/deserialization?
protostuff can't serialize tribuo models, will throw similar errors like this for
|
BTW we (the Tribuo developers) plan to add protobuf serialization support for all Tribuo types which currently implement |
Thanks @Craigacp. Considering backward compatibility, can you add some tests to make sure that a model serialized by the current object stream can be deserialized by protobuf in 4.3? Otherwise, we have to keep using the current object stream to deserialize. |
It won't be an automatic change. For all models in 4.3 there will be two serialization options, I guess we could look at adding a |
Keep two serialization options sounds good. It will be good if tribuo can add |
Signed-off-by: Yaliang Wu <[email protected]>
Description
We have to use the Java object stream to serialize/deserialize Tribuo models because protostuff throws exceptions on them, and when we released 1.3, protostuff had a reflection warning issue. So we decided to serialize/deserialize the RCF model the same way. To do that, we had to change the RCF state class to implement the `Serializable` interface; see aws/random-cut-forest-by-aws#298.

But with that change, AD can't deserialize old RCF models, so we reverted it in aws/random-cut-forest-by-aws#306. Protostuff has now released 1.8, which fixes the reflection issue. This PR adds a new `RCFModelSerDeSer` to serialize/deserialize the RCF model with protostuff, the same way AD does. For other Tribuo models like Kmeans and LinearRegression, we still use the existing object stream to serialize/deserialize.

This PR also updates the local RCF jar, which was built from my branch commit ylwu-amzn/random-cut-forest-by-aws@ad37748. We will check whether we have enough time to upload the latest RCF jar to Maven. If not, we will build the latest RCF jar from the main branch and upload it. Created GitHub issue #250 to track this.
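For readers unfamiliar with the "existing object stream" path mentioned above, here is a minimal sketch of a Java object stream round trip using only `java.io`. `ObjectStreamSketch` and `ToyModel` are hypothetical names made up for illustration; they are not classes from this PR, Tribuo, or RCF, and the real code may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ObjectStreamSketch {

    // Hypothetical stand-in for a trained model; not a Tribuo or RCF class.
    static class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final double[] weights;

        ToyModel(String name, double[] weights) {
            this.name = name;
            this.weights = weights;
        }
    }

    // Write any Serializable object into an in-memory byte array.
    static byte[] serialize(Serializable model) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        }
        return bytes.toByteArray();
    }

    // Read the object back; the caller casts to the expected type.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ToyModel original = new ToyModel("toy", new double[] {0.5, -1.25});
        ToyModel restored = (ToyModel) deserialize(serialize(original));
        System.out.println(restored.name + " " + restored.weights.length);
    }
}
```

This approach only works for classes that implement `Serializable`, which is why the RCF state class would have needed that interface to follow the same path.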
Issues Resolved
Close aws/random-cut-forest-by-aws#305
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.