-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Export fields for allowing gob encoding/decoding #9
Conversation
Hey @federicojasson sorry for the late response. Thank you for your contribution. I will be reviewing this PR this week! |
Hey @federicojasson , I have reviewed your PR locally and tested it. While I understand the valid use case you are bringing to the table, I have some concerns about the proposed changes. Exporting the root node of the tree allows users to directly change values without using the intended tree methods, potentially leading to an inconsistent or unbalanced state. Additionally, this change enables users to concurrently modify the data without the protection of the internal mutex, requiring them to control and serialize the changes made directly to the exported root node. In order to mitigate the leak of abstraction, It is important to maintain a tight API by exporting the minimum amount of data necessary for correct functionality. Fortunately, the func (st *MultiValueSearchTree[V, T]) GobEncode() ([]byte, error) {
var b bytes.Buffer
enc := gob.NewEncoder(&b)
if err := enc.Encode(st.root); err != nil {
return nil, err
}
if err := enc.Encode(st.config.allowIntervalPoint); err != nil {
return nil, err
}
return b.Bytes(), nil
}
func (st *MultiValueSearchTree[V, T]) GobDecode(data []byte) error {
b := bytes.NewBuffer(data)
enc := gob.NewDecoder(b)
if err := enc.Decode(&st.root); err != nil {
return err
}
if err := enc.Decode(&st.config.allowIntervalPoint); err != nil {
return err
}
return nil
} By adding these two methods to your implementation and unexporting the What do you think? Does that cover your use case? |
Hey @rdleal, thanks for the feedback. Yes, I completely agree with your comment, I actually started working on a similar approach, but realized it wasn't going to be that simple. The complex thing here is that, if I'm not mistaken, the encoder can only serialize exported fields, so even if we call I can give it a try and see how it goes 👍 |
@federicojasson The Does that make sense? |
@rdleal Ok, that worked, thanks for the help! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@federicojasson thanks for reviewing the suggestions and applying the changes. There's a few more I'd like to point out bellow.
Thanks for you contribution again.
@rdleal Thanks for the suggestions, I wasn't aware of some of these conventions. I think I've applied all the suggested changes (but please double-check it):
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@federicojasson great work! I let just a few more comments about concurrency and the new methods description.
@federicojasson thanks for the changes! Could you please update your branch with the After updating your branch with |
Done! |
@rdleal Any idea what is going on with the GitHub actions? |
@federicojasson thank you for updating you branch. About the GHA, I think it's a permission issue. When I opened the PR #10 for changing the .github/workflows/test.yml, the logs of the actions indicate the workflow could find the codecov token variable, but the workflow for this PR cannot. I will investigate a little bit further today night after the work and keep you updated. |
@rdleal Let me know if you want me to recreate the PR in a branch in your repository (you can give me permissions temporarily if you want). |
@federicojasson while I sort this out, I'd love to hear you opinion about this: with the current implementation of func main() {
st := NewMultiValueSearchTree[string, int](func(x, y int) int { return x - y })
st.Insert(17, 19, "node1")
st.Insert(5, 8, "node2")
st.Insert(21, 24, "node3")
st.Insert(21, 24, "node4")
st.Insert(4, 4, "node5")
b := mustEncodeMultiValueTree(t, st)
r := bufio.NewReader(&b)
dec := gob.NewDecoder(r)
st2 := NewSearchTree[string, int](func(x, y int) int { return x - y })
err := dec.Decode(&st2)
if err != nil {
t.Fatalf("got unexpected %v error; want nil", err)
}
stval, stOK := st.Find(17, 19)
t.Logf("st.Find(17, 19): %t : %v", stOK, stval) // Output: true : ["node1"]
st2val, st2OK := st2.Find(17, 19)
t.Logf("st2.Find(17, 19): %t : %v", st2OK, st2val) // Output: true :
} Note that no error happens, because the encoded gob of type While this seems like an edge case, I think it's important for our package to not only be easy to use, but also hard to misuse. So my two cents to avoid this would be for us to encode the name of the type along with the tree itself, and when decoding, we can check if that name matches with the destination type, if not, we return a custom error: // MismatchTypeError represents an error that occurs when there is a type mismatch
// during decoding a tree rom the gob format. It indicates that the encoded value does not
// match the expected type.
type MismatchTypeError struct {
from, to string
}
// Error returns the string representation of this error.
func (e MismatchTypeError) Error() string {
return fmt.Sprintf("interval: cannot decode type %q into type %q", e.from, e.to)
}
// GobEncode encodes the tree (compatible with [encoding/gob]).
func (st *MultiValueSearchTree[V, T]) GobEncode() ([]byte, error) {
st.mu.RLock()
defer st.mu.RUnlock()
var b bytes.Buffer
enc := gob.NewEncoder(&b)
if err := enc.Encode("MultiValueSearchTree"); err != nil {
return nil, err
}
if err := enc.Encode(st.config.allowIntervalPoint); err != nil {
return nil, err
}
if st.root != nil {
if err := enc.Encode(st.root); err != nil {
return nil, err
}
}
return b.Bytes(), nil
}
// GobDecode decodes the tree (compatible with [encoding/gob]).
func (st *MultiValueSearchTree[V, T]) GobDecode(data []byte) error {
st.mu.Lock()
defer st.mu.Unlock()
b := bytes.NewBuffer(data)
enc := gob.NewDecoder(b)
var typeName string
if err := enc.Decode(&typeName); err != nil {
return err
}
if wantType := "MultiValueSearchTree"; typeName != wantType {
return MismatchTypeError{from: typeName, to: wantType}
}
if err := enc.Decode(&st.config.allowIntervalPoint); err != nil {
return err
}
if err := enc.Decode(&st.root); err != nil {
if err != io.EOF {
return err
}
// An EOF error implies that the root
// wasn't encoded because it was nil
st.root = nil
}
return nil
} It might be an overkill, but I'd like to guarantee we are aware of this and at least discussed about it. What do you think? |
Yeah, I think is a good idea. I've implemented it in two commits:
|
interval/search_tree_test.go
Outdated
err := dec.Decode(&st2) | ||
if err == nil { | ||
t.Fatal("got unexpected <nil> error; want not nil") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A common practice in Go since 1.13 is to test errors against specific types, instead of checking only for the presence of the error:
err := dec.Decode(&st2) | |
if err == nil { | |
t.Fatal("got unexpected <nil> error; want not nil") | |
} | |
var wantErr TypeMismatchError | |
err := dec.Decode(&st2) | |
if !errors.As(err, &wantErr) { | |
t.Fatal("got unexpected error %T; want %T", err, wantErr) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (#9 (comment))
interval/search_tree_test.go
Outdated
err := dec.Decode(&st2) | ||
if err == nil { | ||
t.Fatal("got unexpected <nil> error; want not nil") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
err := dec.Decode(&st2) | |
if err == nil { | |
t.Fatal("got unexpected <nil> error; want not nil") | |
} | |
var wantErr TypeMismatchError | |
err := dec.Decode(&st2) | |
if !errors.As(err, &wantErr) { | |
t.Fatal("got unexpected error %T; want %T", err, wantErr) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (#9 (comment))
interval/search_tree_test.go
Outdated
if err.Error() != `interval: cannot decode type "MultiValueSearchTree[string, int]" into type "MultiValueSearchTree[string, string]"` { | ||
t.Fatalf("got unexpected error: %v", err.Error()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string value returned by the Error()
method are human readable messages, testing these message might create a high coupling between the test and the code being tested. For example, if we decide to change the prefix interval:
to intervalst
, because interval is too generic, and intervalst might reflect better the package name, we would have to change the test too. Depending on which unit test school you came from, that might not be a problem, but typically, testing the error type is enough.
So I would either remove this test, or if you want to guarantee that at least the from
and to
values are part of this message, I would change this test to:
if err.Error() != `interval: cannot decode type "MultiValueSearchTree[string, int]" into type "MultiValueSearchTree[string, string]"` { | |
t.Fatalf("got unexpected error: %v", err.Error()) | |
} | |
if !strings.Contains(err.Error(), wantErr.from) { | |
t.Fatalf("got error message: %q; want it to contain %q", err, wantErr.from) | |
} | |
if !strings.Contains(err.Error(), wantErr.to) { | |
t.Fatalf("got error message: %q; want it to contain %q", err, wantErr.to) | |
} |
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I managed to do something simpler: Error check refactoring
By using errors.Is
, if I'm not mistaken, we can check both the type and the values (from
/to
) without having to hardcode the error string here.
Let me know if this makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems great!
interval/search_tree_test.go
Outdated
if err.Error() != `interval: cannot decode type "SearchTree[string, int]" into type "SearchTree[string, string]"` { | ||
t.Fatalf("got unexpected error: %v", err.Error()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (#9 (comment))
interval/search_tree.go
Outdated
func (st *MultiValueSearchTree[V, T]) typeName() string { | ||
var v [0]V | ||
var t [0]T | ||
|
||
return fmt.Sprintf( | ||
"MultiValueSearchTree[%v, %v]", | ||
reflect.TypeOf(v).Elem().Name(), | ||
reflect.TypeOf(t).Elem().Name(), | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you point about encoding the generic types of the tree, but in this case I think this proposal goes against one of the main advantages of the gob values: flexibility. For example, it should not be an error when encoding a MultiValueSearchTree[string, int]
, and decoding it to MultiValueSearchTree[string, int8]
, because gob values aren't about types. When encoding an int
, its value is transmitted as an unsized, variable-length integer; so as long as the receiver side decodes that into an arbitrary integer type in which the value fits in, everything should work.
Bottom line, I wouldn't restrict this by the generic types, only the tree type. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels inconsistent to check the tree type but not its generic types (one could argue the same thing about flexibility for the former case). But sure, I can change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad you made that point. I just suggested to check for the tree types because they are not compatible: while SearchTree
reads the values from the val
field of node
type, the MultiValueSearchTree
reads values from the vals
field. If it wasn't for that, I would totally agree with you that it would be inconsistent to test for the tree type but not for the generic types. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just an FYI, using different generic types will also (obviously) provoke incompatibility issues.
e.g. wrong type (string) for received field interval[string,int].Start
So we have a tradeoff of flexibility vs misuse-verification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Gob decoding will help us in this regard. If users try to encode a (non-empty) MultiValueSearchTree[string, string]
into a MultiValueSearchTree[string, int]
, they will receive an error as you correctly exemplified. Which is expected and well documented in the encoding/gob package.
I see your point about the tradeoff of flexibility vs misuse-verification, and I think it's worth for users to lose the flexibility of encoding a SearchTree
and decoding it into a MultiValueSearchTree
in order to make this package hard to misuse, because by design, both types are incompatible.
Thank you for the changes, btw.
interval/search_tree.go
Outdated
func (st *SearchTree[V, T]) typeName() string { | ||
var v [0]V | ||
var t [0]T | ||
|
||
return fmt.Sprintf( | ||
"SearchTree[%v, %v]", | ||
reflect.TypeOf(v).Elem().Name(), | ||
reflect.TypeOf(t).Elem().Name(), | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@federicojasson as another attempt to make the coverage upload to CodeCov from a fork work, I applied another change to the |
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #9 +/- ##
==========================================
- Coverage 98.67% 95.52% -3.15%
==========================================
Files 6 6
Lines 602 670 +68
==========================================
+ Hits 594 640 +46
- Misses 6 18 +12
- Partials 2 12 +10 ☔ View full report in Codecov by Sentry. |
Looks like it's working! 👏 |
LGTM! Thanks for the patience in reading all of the feedback and engaging in the discussion to make this package better @federicojasson . I will be merging your PR with the main branch and release a new version with the changes. Thank you again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
Thank you for the help and feedback. |
Hello 👋
This PR exports fields from the
SearchTree
,node
,interval
andTreeConfig
structs, so that is possible to encode/decode a tree using encoding/gob. This is particularly useful if one needs to persist the tree in disk between executions.Let me know what you think, I know is not a minor change.