Clean up transaction oracle as we go, take two #1275

Merged: 53 commits into dgraph-io:master on May 18, 2020

Conversation

@muXxer (Contributor) commented Mar 24, 2020

Hi badger team,

I resolved all the comments @jarifibrahim made in #1198.
It would be great if this could finally be merged, since RAM usage is much lower.

Thanks to @damz !



campoy and others added 30 commits June 25, 2019 15:44
* Cast sz to uint32 to fix compilation on 32 bit (dgraph-io#1175)

`env GOOS=linux GOARCH=arm GOARM=7 go build` no longer fails with overflow.

Similar to commit fb0cdb8.

Signed-off-by: Christian Stewart <[email protected]>
(cherry picked from commit 465f28a)

* Fix commit sha for WithInMemory in CHANGELOG. (dgraph-io#1172)

(cherry picked from commit 03af216)

* Fix windows build (dgraph-io#1177)

Fix windows build and some deepsource warnings

(cherry picked from commit 0f2e629)

* Fix checkOverlap in compaction (dgraph-io#1166)

The overlap check in compaction would keep additional keys in case
the levels under compaction had overlap amongst themselves.
This commit fixes it.

This commit also fixes the `numVersionsToKeep` check. Without this
commit, we would end up keeping `2` versions of a key even when the
number of versions to keep was set to `1`. See
`level 0 to level 1 with lower overlap` test.

Fixes dgraph-io#1053

The following test fails on master but works with this commit
```go
func TestCompaction(t *testing.T) {
	t.Run("level 0 to level 1", func(t *testing.T) {
		dir, err := ioutil.TempDir("", "badger-test")
		require.NoError(t, err)
		defer removeDir(dir)

		// Disable compactions and keep single version of each key.
		opt := DefaultOptions(dir).WithNumCompactors(0).WithNumVersionsToKeep(1)
		db, err := OpenManaged(opt)
		require.NoError(t, err)

		l0 := []keyValVersion{{"foo", "bar", 3}, {"fooz", "baz", 1}}
		l01 := []keyValVersion{{"foo", "bar", 2}}
		l1 := []keyValVersion{{"foo", "bar", 1}}
		// Level 0 has table l0 and l01.
		createAndOpen(db, l0, 0)
		createAndOpen(db, l01, 0)
		// Level 1 has table l1.
		createAndOpen(db, l1, 1)

		// Set a high discard timestamp so that all the keys are below the discard timestamp.
		db.SetDiscardTs(10)

		getAllAndCheck(t, db, []keyValVersion{
			{"foo", "bar", 3}, {"foo", "bar", 2}, {"foo", "bar", 1}, {"fooz", "baz", 1},
		})
		cdef := compactDef{
			thisLevel: db.lc.levels[0],
			nextLevel: db.lc.levels[1],
			top:       db.lc.levels[0].tables,
			bot:       db.lc.levels[1].tables,
		}
		require.NoError(t, db.lc.runCompactDef(0, cdef))
		// foo version 2 should be dropped after compaction.
		getAllAndCheck(t, db, []keyValVersion{{"foo", "bar", 3}, {"fooz", "baz", 1}})
	})
}
```

(cherry picked from commit 0a06173)

* Remove the 'this entry should've caught' log from value.go (dgraph-io#1170)

Fixes - dgraph-io#1031
(There wasn't a bug to fix. The log statement shouldn't have been there)

This PR removes the warning message `WARNING: This entry should have
been caught.`. The warning message assumed that we will always find the
**newest key if two keys have the same version**. This assumption is
valid for a normal key, but it's **NOT TRUE** for **move keys**.

Here's how we can end up fetching the older version of a move key if
two move keys have the same version.

```
It might be possible that the entry read from LSM Tree points to an
older vlog file. This can happen in the following situation. Assume DB
is opened with numberOfVersionsToKeep=1

Now, if we have ONLY one key in the system "FOO" which has been updated
3 times and the same key has been garbage collected 3 times, we'll have
3 versions of the movekey for the same key "FOO".
NOTE: moveKeyi is the moveKey with version i
Assume we have 3 move keys in L0.
- moveKey1 (points to vlog file 10),
- moveKey2 (points to vlog file 14) and
- moveKey3 (points to vlog file 15).

Also, assume there is another move key "moveKey1" (points to vlog
file 6) (this is also a move Key for key "FOO" ) on upper levels (let's
say level 3). The move key "moveKey1" on level 0 was inserted because
vlog file 6 was GCed.

Here's what the arrangement looks like
L0 => (moveKey1 => vlog10), (moveKey2 => vlog14), (moveKey3 => vlog15)
L1 => ....
L2 => ....
L3 => (moveKey1 => vlog6)

When L0 compaction runs, it keeps only moveKey3 because the number of
versions to keep is set to 1. (we've dropped moveKey1's latest version)

The new arrangement of keys is
L0 => ....
L1 => (moveKey3 => vlog15)
L2 => ....
L3 => (moveKey1 => vlog6)

Now if we try to GC vlog file 10, the entry read from vlog file will
point to vlog10 but the entry read from LSM Tree will point to vlog6.
The move key read from LSM tree will point to vlog6 because we've asked
for version 1 of the move key.

This might seem like an issue but it's not really an issue because the
user has set the number of versions to keep to 1 and the latest version
of moveKey points to the correct vlog file and offset. The stale move
key on L3 will be eventually dropped by compaction because there is a
newer version in the upper levels.
```

(cherry picked from commit 2a90c66)

* Avoid sync in inmemory mode (dgraph-io#1190)

This makes db.Sync() a no-op when badger is running in in-memory mode.
The previous code would unnecessarily load an atomic and
acquire locks.

(cherry picked from commit 2698bfc)
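For illustration only, a minimal sketch of the idea behind that change; the type and field names (`options.InMemory`, `DB.opt`) are assumptions here, not badger's actual internals:

```go
package main

// Sketch: Sync returns immediately for an in-memory store, before touching
// any atomics or locks; only the on-disk path does real work.
type options struct{ InMemory bool }

type DB struct {
	opt options
	// value log, locks, etc. elided
}

func (db *DB) Sync() error {
	if db.opt.InMemory {
		return nil // nothing on disk to sync
	}
	// normal path: sync the value log / SSTs here
	return nil
}

func main() {
	db := &DB{opt: options{InMemory: true}}
	_ = db.Sync() // no-op
}
```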

* Use fastRand instead of locked-rand in skiplist (dgraph-io#1173)

The math/rand package (https://golang.org/src/math/rand/rand.go) uses
a global lock to allow concurrent access to the rand object.

The PR replaces the locked `math/rand` calls with `ristretto/z.FastRand()`. `FastRand`
is much faster than the global `math/rand` generator and `rand.New(..)` generators.

The change improves concurrent writes to skiplist by ~30%
```go
func BenchmarkWrite(b *testing.B) {
	value := newValue(123)
	l := NewSkiplist(int64((b.N + 1) * MaxNodeSize))
	defer l.DecrRef()
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		rng := rand.New(rand.NewSource(time.Now().UnixNano()))
		for pb.Next() {
			l.Put(randomKey(rng), y.ValueStruct{Value: value, Meta: 0, UserMeta: 0})
		}
	})
}
```
```
name      old time/op  new time/op  delta
Write-16   657ns ± 3%   441ns ± 1%  -32.94%  (p=0.000 n=9+10)
```

(cherry picked from commit 9d6512b)
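As a rough usage sketch of the swap (`z.FastRand` comes from github.com/dgraph-io/ristretto/z; the level-growth constant below is made up, not badger's actual tuning):

```go
package main

import (
	"fmt"

	"github.com/dgraph-io/ristretto/z"
)

// randomHeight picks a skiplist-style level without going through the
// globally locked math/rand functions. z.FastRand returns a uint32.
func randomHeight(maxHeight int) int {
	h := 1
	for h < maxHeight && z.FastRand()%4 == 0 { // ~1/4 chance to grow a level
		h++
	}
	return h
}

func main() {
	fmt.Println(randomHeight(12))
}
```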

* Add Jaegar to list of projects (dgraph-io#1192)


(cherry picked from commit 01a00cb)

* Run all tests on CI (dgraph-io#1189)

This could possibly be a bug in `go test` command
golang/go#36527 .

The `go test` command would skip tests in sub-packages if
the top-level package has a `custom flag` defined and the
sub-packages don't define it.

Issue golang/go#36527 (comment)
has an example of this.

This PR also removes code from the test that would unnecessarily
start a web server. I see two problems here:
1. An unnecessary web server running.
2. We cannot run multiple tests at the same time, since the second
    run of the test would try to start a web server and
    crash saying `port already in use`.

(cherry picked from commit 5870b7b)

* Improve write stalling on level 0 and 1

We don't need to stall writes if Level 1 does not have enough space.
Level 1 is stored on disk and it should be okay to have more
tables (more data) on Level 1 than the `max level 1 size`.
These tables will eventually be compacted to lower levels.

This commit changes the following
- We no longer stall writes if L1 doesn't have enough space.
- We stall writes on level 0 only if `KeepL0InMemory` is true.
- Upper levels (L0, L1, etc.) get priority in compaction (previously, the
    level with the higher priority score would get preference)

(cherry picked from commit 3747be5)

* Update ristretto to version 8f368f2 (dgraph-io#1195)

This commit pulls the following changes from ristretto
```
git log c1f00be0418e...8f368f2 --oneline
```
```

8f368f2 (HEAD -> master) Fix DeepSource warnings
adb35f0 delete item immediately 
fce5e91 Support nil *Cache values in Clear and Close
4e224f9 Add .travis.yml 
8350826 Fix the way metrics are handled for deletions
99d1bbb (tag: v0.0.1) default to 128bit hashing for collision checks
1eea1b1 atomic Get-Delete operation for eviction
8d6a8a7 add cache test for identifying MaxCost overflows
ae09250 use the custom KeyToHash function if one is set
1dd5a4d dgraph-io#19 Fixes memory leak due to Oct 1st regression in processItems
```

(cherry picked from commit 82381ac)

* Support disabling the cache completely. (dgraph-io#1183) (dgraph-io#1185)

The cache can be disabled by setting `opt.MaxCacheSize=0`

(cherry picked from commit 7e5a956)
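A usage sketch, assuming the v2 options API described above (the path is arbitrary):

```go
package main

import (
	"log"

	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	// Setting MaxCacheSize to 0 disables the block/index cache entirely.
	opt := badger.DefaultOptions("/tmp/badger-nocache").WithMaxCacheSize(0)
	db, err := badger.Open(opt)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```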

* Fix L0/L1 stall test (dgraph-io#1201)

The test `TestL0Stall` and `TestL1Stall` would never fail
because of errors in the manifest file. This commit fixes it.

(cherry picked from commit 0acb3f6)

* Disable compression and set ZSTD Compression Level to 1 (dgraph-io#1191)

This PR
- Disables compression. By default, badger does not use any compression.
- Sets the default ZSTD compression level to 1.
    Level 15 is very slow for any practical use of badger.

```
no_compression-16              10	 502848865 ns/op	 165.46 MB/s
zstd_compression/level_1-16     7	 739037966 ns/op	 112.58 MB/s
zstd_compression/level_3-16     7	 756950250 ns/op	 109.91 MB/s
zstd_compression/level_15-16    1	11135686219 ns/op	   7.47 MB/s
```

(cherry picked from commit c3333a5)
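And a sketch of opting back in, assuming `WithCompression` and `WithZSTDCompressionLevel` are available in this version of the v2 options API:

```go
package main

import (
	"log"

	badger "github.com/dgraph-io/badger/v2"
	"github.com/dgraph-io/badger/v2/options"
)

func main() {
	// Compression is now off by default; enable ZSTD explicitly and keep
	// the fast level 1 rather than the very slow level 15.
	opt := badger.DefaultOptions("/tmp/badger-zstd").
		WithCompression(options.ZSTD).
		WithZSTDCompressionLevel(1)
	db, err := badger.Open(opt)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```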

* Add support for caching bloomfilters (dgraph-io#1204)

This PR adds support for caching bloom filters in ristretto.
The bloom filters and blocks are removed from the cache 
when the table is deleted.

(cherry picked from commit 4676ca9)

* Rework concurrency semantics of valueLog.maxFid (dgraph-io#1184) (dgraph-io#1187)

Move all access to `valueLog.maxFid` under `valueLog.filesLock`, while
all mutations happen either with writes stopped or sequentially under
valueLog.write.

Fixes a concurrency issue in `valueLog.Read` where the maxFid variable
and the `writableLogOffset` variable could point to two different log files.

(cherry picked from commit 3e25d77)
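A generic sketch of that locking pattern (simplified names, not the actual badger types): every read of the fid happens under the same lock that protects rotation, so a reader can never pair a stale fid with a newer write offset:

```go
package main

import (
	"fmt"
	"sync"
)

type valueLog struct {
	filesLock sync.RWMutex
	maxFid    uint32
}

// currentFid reads maxFid only while holding filesLock.
func (vlog *valueLog) currentFid() uint32 {
	vlog.filesLock.RLock()
	defer vlog.filesLock.RUnlock()
	return vlog.maxFid
}

// rotate mutates maxFid under the same lock (in badger this happens with
// writes stopped or serialized under valueLog.write).
func (vlog *valueLog) rotate() {
	vlog.filesLock.Lock()
	defer vlog.filesLock.Unlock()
	vlog.maxFid++
}

func main() {
	vlog := &valueLog{}
	vlog.rotate()
	fmt.Println(vlog.currentFid()) // 1
}
```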

* Change else-if statements to idiomatic switch statements. (dgraph-io#1207)



(cherry picked from commit eee1602)
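For context, the kind of mechanical rewrite meant here (a generic example, not a diff from this commit):

```go
// Before: chained else-if.
func sign(n int) string {
	if n < 0 {
		return "negative"
	} else if n == 0 {
		return "zero"
	} else {
		return "positive"
	}
}

// After: an expressionless switch, the more idiomatic Go form.
func signSwitch(n int) string {
	switch {
	case n < 0:
		return "negative"
	case n == 0:
		return "zero"
	default:
		return "positive"
	}
}
```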

* Fix flaky TestPageBufferReader2 test (dgraph-io#1210)

Fixes dgraph-io#1197

The `TestPageBufferReader2` test would fail often because of an
`off-by-1` issue. The problem can be reproduced by statically setting `randOffset`
to the biggest number that `rand.Int31n` may return:

```
    //randOffset := int(rand.Int31n(int32(b.length)))
    randOffset := int(int32(b.length-1))
```

This makes the problem reliably reproducible as the offset is now
pointing at EOF.

Thus changing the line to this should hopefully solve the problem:
`randOffset := int(rand.Int31n(int32(b.length-1)))`

(cherry picked from commit c51748e)
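The reason the narrowed argument helps: `rand.Int31n(k)` returns values in the half-open range [0, k), so the largest offset it can produce is k-1.

```go
// rand.Int31n(k) yields values in [0, k):
//   k = b.length   -> max offset b.length-1, i.e. exactly at EOF
//   k = b.length-1 -> max offset b.length-2, always a readable position
randOffset := int(rand.Int31n(int32(b.length - 1)))
```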

* Replace t.Fatal with require.NoError in tests (dgraph-io#1213)

We are using the following pattern in tests, which can be
replaced with `require.NoError(t, err)`.
```go
if err != nil {
    t.Fatal(err)
}
```

(cherry picked from commit 78d405a)
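For completeness, the replacement pattern looks like this (`doWork` is a hypothetical stand-in for whatever a given test exercises):

```go
package mypkg

import (
	"testing"

	"github.com/stretchr/testify/require"
)

// doWork stands in for the code under test.
func doWork() error { return nil }

func TestDoWork(t *testing.T) {
	err := doWork()
	// One line replaces the old `if err != nil { t.Fatal(err) }` block and
	// prints the error on failure.
	require.NoError(t, err)
}
```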

* Add missing package to README for badger.NewEntry (dgraph-io#1223)


(cherry picked from commit 8734e3a)

* Remove ExampleDB_Subscribe Test (dgraph-io#1214)

The test ExampleDB_Subscribe doesn't run reliably on appveyor.
This commit removes it.

(cherry picked from commit e029e93)

* Fix int overflow for 32bit (dgraph-io#1216)

- Fix tests for 32 bit systems
- Enable 32-bit builds on Travis

(cherry picked from commit bce069c)

* Update CHANGELOG for Badger 2.0.2 release. (dgraph-io#1230)

Co-authored-by: Ibrahim Jarif <[email protected]>
(cherry picked from commit e908818)

* Fix changelog for v2.0.2

(cherry picked from commit b81faa5)

Co-authored-by: Christian Stewart <[email protected]>
Co-authored-by: Leyla Galatin <[email protected]>
Co-authored-by: Damien Tournoud <[email protected]>
Co-authored-by: Martin Martinez Rivera <[email protected]>
Co-authored-by: Dieter Plaetinck <[email protected]>
Co-authored-by: Apoorv Vardhan <[email protected]>

@jarifibrahim (Contributor) left a comment


Reviewable status: 0 of 3 files reviewed, 9 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @muXxer)


batch.go, line 33 at r4 (raw file):

	err      error

	isManaged bool

@damz why is this change necessary? I couldn't find anything affected by wb.isManaged except the newTransaction(...) call in wb.commit.

If the outcome with or without wb.isManaged is the same, I suggest we remove it.


txn.go, line 141 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I wonder if this assumes linearizability, but that would only affect the read ts of the current txn. Lack of linearizability could cause the read ts of a new txn to be lower than the commit ts of a txn just before it.

If commit ts (of older txns) < txn.readTs, then we're good.

If this PR gets merged, let's add the above comment here, so we don't have confusion later.

Added a comment about it.


txn.go, line 207 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

We should ensure that newCommitTs would not add entries to commits in managed mode. Otherwise, if a managed-mode user does not set the discard ts, the committedTxns list would grow infinitely.

Which does mean that if someone relies upon the conflict checking from Oracle in managed mode, they would not get that anymore (breaking change?). Though, you could argue that in managed mode they're supposed to be handling txn conflicts and all that, so it's ok.

Fixed. The cleanupCommittedTransactions() is no longer executed in managedMode.


txn.go, line 226 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

https://github.com/golang/go/wiki/SliceTricks#filtering-without-allocating

tmp := o.committedTxns[:0]

tmp = append(tmp, ...)

Done


txn.go, line 235 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

o.committedTxns = tmp

Done


txn.go, line 695 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Let's run the bank test on this, at least 8h, but better 24hrs.

Started bank test today. I'll report the results in 24 hours.

@damz (Contributor) left a comment


Reviewable status: 0 of 3 files reviewed, 9 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @muXxer)


batch.go, line 33 at r4 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

@damz why is this change necessary? I couldn't find anything affected by wb.isManaged except the newTransaction(...) call in wb.commit.

If the outcome with or without wb.isManaged is the same, I suggest we remove it.

The current code in WriteBatch will always start the underlying transaction in managed mode, regardless of the mode of the database. It might have been working before by accident... but it is especially bad after your "Do not process committedTxns in managed mode" change, because it means batches do not participate in conflict detection for other transactions.

@jarifibrahim (Contributor) left a comment


Reviewable status: 0 of 3 files reviewed, 9 unresolved discussions (waiting on @ashish-goswami, @damz, @manishrjain, and @muXxer)


batch.go, line 33 at r4 (raw file):
Got it. Earlier there wasn't much effect but with my recent change this might have some side effects.

but it is especially bad after your "Do not process committedTxns in managed mode" because it will mean batches do not participate in conflict detection for other transactions.

We decided not to do conflict detection in managed mode because if a system using badger in managed mode doesn't set o.discardTs, it will lead to high memory usage.

For instance, Dgraph runs badger in managed mode and sets the o.discardTs when it knows that the data has been replicated to all the nodes in the system (this could take a decent amount of time if one of the nodes is running behind on updates). In that case, the committedTxns won't be cleared for a long time and we will end up with high memory usage 🤷‍♂️

@damz (Contributor) left a comment


Reviewable status: 0 of 3 files reviewed, 9 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @muXxer)


batch.go, line 33 at r4 (raw file):

Got it. Earlier there wasn't much effect but with my recent change this might have some side effects.

Exactly.

I am not sure I agree with the rationale about not doing conflict detection in managed mode (because it is a big change in behavior), but I am not a consumer of the managed mode, so it is your call.

@jarifibrahim jarifibrahim changed the base branch from ibrahim/release/v2.0-cherry-pick-v2.0.3 to master April 21, 2020 09:51
@jarifibrahim jarifibrahim changed the base branch from master to ibrahim/release/v2.0-cherry-pick-v2.0.3 April 21, 2020 09:53
@jarifibrahim (Contributor) left a comment


Reviewable status: 0 of 3 files reviewed, 9 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @muXxer)


txn.go, line 695 at r3 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

Started bank test today. I'll report the results in 24 hours.

Stopped the bank tests today. No errors.

@damz (Contributor) commented Apr 23, 2020

Nice 👍

@manishrjain (Contributor) left a comment


:lgtm:

Reviewed 2 of 3 files at r1, 1 of 1 files at r4.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ashish-goswami, @golangcibot, @jarifibrahim, @manishrjain, and @muXxer)


txn.go, line 181 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

line is 109 characters (from lll)

AssertTruef is expensive, so avoid putting it in the critical path. Just do AssertTrue, or use an if.
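For illustration, the trade-off being pointed at; `y.AssertTrue`/`y.AssertTruef` are badger's exported internal helpers, and the condition and variables below are made up:

```go
package main

import (
	"fmt"

	"github.com/dgraph-io/badger/v2/y"
)

func main() {
	readTs, commitTs := uint64(5), uint64(7) // made-up values

	// AssertTruef builds a variadic argument list on every call, even when
	// the condition holds, which adds overhead on a hot path.
	y.AssertTruef(readTs <= commitTs, "readTs %d > commitTs %d", readTs, commitTs)

	// Cheaper: no formatting machinery at the call site.
	y.AssertTrue(readTs <= commitTs)

	// Or a plain if, paying for the message only on failure.
	if readTs > commitTs {
		panic(fmt.Sprintf("readTs %d > commitTs %d", readTs, commitTs))
	}
}
```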


txn.go, line 169 at r4 (raw file):

	if !o.isManaged {
		o.doneRead(txn)

Can be merged with the below if.

@jarifibrahim jarifibrahim changed the base branch from ibrahim/release/v2.0-cherry-pick-v2.0.3 to master May 13, 2020 07:42
@jarifibrahim jarifibrahim merged commit 62b7a10 into dgraph-io:master May 18, 2020
@jarifibrahim (Contributor)

Thank you for all the work @damz and @muXxer 🎉

@damz (Contributor) commented May 21, 2020

Thanks @jarifibrahim 🎉

damz added a commit to damz/badger that referenced this pull request Jul 10, 2020
This fixes an issue introduced in dgraph-io#1275.

If the read watermark is, say `10`, it means that all the read transactions
of `readTs <= 10` have completed. All the remaining in-flight transactions
have `readTs >= 11`, which cannot possibly conflict with transactions
of commitTs <= 11.
damz added a commit to damz/badger that referenced this pull request Jul 10, 2020
damz added a commit to damz/badger that referenced this pull request Jul 10, 2020
jarifibrahim pushed a commit that referenced this pull request Oct 2, 2020
mYmNeo added a commit to mYmNeo/badger that referenced this pull request Jan 16, 2023
mYmNeo added a commit to mYmNeo/badger that referenced this pull request Feb 13, 2023
mYmNeo added a commit to mYmNeo/badger that referenced this pull request Sep 14, 2024