Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid ReadOptions mutated by reference release in iterator #172

Merged
merged 6 commits into from
Mar 3, 2025

Conversation

mmsqe
Copy link
Contributor

@mmsqe mmsqe commented Feb 20, 2025

avoid the reference of bytes properties in ReadOptions being accidentally released (mainly due to missing call of Destroy) after creating iterator

being accidentally released (mainly due to missing call of Destroy) after creating iterator
db.go Outdated
@@ -1146,7 +1146,7 @@ func (db *DB) NewIterator(opts *ReadOptions) *Iterator {
// that uses the ReadOptions given.
func (db *DB) NewIteratorCF(opts *ReadOptions, cf *ColumnFamilyHandle) *Iterator {
cIter := C.rocksdb_create_iterator_cf(db.c, opts.c, cf.c)
return newNativeIterator(cIter)
return newNativeIteratorWithTS(cIter, opts.timestamp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a few other bytes fields in ReadOptions, need to handle in a similar way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, if it's ok with this direction

@mmsqe mmsqe changed the title imp: avoid the reference of timestamp related bytes in ReadOptions imp: avoid ReadOptions mutated by reference release in iterator Feb 21, 2025
@mmsqe mmsqe marked this pull request as ready for review February 21, 2025 01:29
@linxGnu linxGnu changed the title imp: avoid ReadOptions mutated by reference release in iterator Avoid ReadOptions mutated by reference release in iterator Feb 23, 2025
Copy link
Owner

@linxGnu linxGnu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, the purpose of this PR is avoiding data-race? It should be done at application space instead. However, I think it should be no problem if we handle it in the library.

Please address my comments below

iterator.go Outdated
func newNativeIteratorWithReadOptions(c *C.rocksdb_iterator_t, opts *ReadOptions) *Iterator {
return &Iterator{
c: c,
iterUpperBound: opts.iterUpperBound,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please clone the byte slice as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's by purpose to hold the reference of byte slice in iterator. So to ensure they're not released early when using iterator.

Copy link
Owner

@linxGnu linxGnu Feb 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmsqe I mean you must clone the slice, not just keep the reference.

Avoiding accidentally release opts is just one aspect. I would like to avoid data-race as well. In this way, modifying opts.iterUpperBound (outside) will not harm the iterator!

iterUpperBound: clone(opts.iterUpperBound)


func clone(b []byte) []byte {
   cloned := make([]byte, len(b))
   copy(cloned, b)
   return cloned
}

Copy link
Contributor Author

@mmsqe mmsqe Feb 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll lost the reference, you might try TestIteratorCFWithTSWhenNoDestroy with larger total

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmsqe

of course, you must use cloned slice as reference. In another word, please clone opts, use cloned one as native iterator's params.

iterator.go Outdated
iterUpperBound: opts.iterUpperBound,
iterLowerBound: opts.iterLowerBound,
timestamp: opts.timestamp,
timestampStart: opts.timestampStart,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

@yihuang
Copy link
Contributor

yihuang commented Feb 23, 2025

AFAIK, the purpose of this PR is avoiding data-race? It should be done at application space instead. However, I think it should be no problem if we handle it in the library.

Please address my comments below

It's to avoid golang gc free slices that's referenced in rocksdb.

@linxGnu
Copy link
Owner

linxGnu commented Feb 23, 2025

@yihuang

Yes, and also avoid data race as well if other routines modify opts.slices* while one routine is iterating. In this way, I think we should clone opts and used cloned one for the iterator.

@mmsqe
Copy link
Contributor Author

mmsqe commented Feb 23, 2025

Yes, and also avoid data race as well if other routines modify opts.slices* while one routine is iterating. In this way, I think we should clone opts and used cloned one for the iterator.

@linxGnu Cloning the slice would defeat the reference holding (in iterator) as the original pointer is released right after clone. We do not read nor modify the bytes in iterator at all so it won't cause data race.

@linxGnu
Copy link
Owner

linxGnu commented Feb 24, 2025

@mmsqe

Then LGTM. Will find a better way later with cloning whole opts

@linxGnu
Copy link
Owner

linxGnu commented Feb 24, 2025

@yihuang could you please help reviewing the PR again. Thank you

iterator.go Outdated
@@ -25,14 +25,29 @@ import (
// return err
// }
type Iterator struct {
c *C.rocksdb_iterator_t
c *C.rocksdb_iterator_t
iterUpperBound []byte
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can reference the *ReadOptions directly, slightly simpler and future proof if we'll add new slices later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#174

Opened a separate PR to support this practice, WDYT @linxGnu

yihuang added a commit to yihuang/grocksdb that referenced this pull request Feb 24, 2025
1. it's not necessary
2. user may want to keep reference it indirectly

ref: linxGnu#172
yihuang added a commit to yihuang/grocksdb that referenced this pull request Feb 24, 2025
1. it's not necessary
2. user may want to keep reference it indirectly

ref: linxGnu#172
mmsqe pushed a commit to mmsqe/grocksdb that referenced this pull request Feb 24, 2025
1. it's not necessary
2. user may want to keep reference it indirectly

ref: linxGnu#172
mmsqe pushed a commit to mmsqe/grocksdb that referenced this pull request Feb 27, 2025
1. it's not necessary
2. user may want to keep reference it indirectly

ref: linxGnu#172
Copy link
Owner

@linxGnu linxGnu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@linxGnu linxGnu merged commit d11410a into linxGnu:master Mar 3, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants