-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage/backend: set InitialMmapSize to 2GB on 32-bit to prevent overflow #4715
Conversation
arm mmap accepts a 10GB mapping length? |
@@ -30,5 +30,5 @@ import ( | |||
// silently ignore this flag. Please update your kernel to prevent this. | |||
var boltOpenOptions = &bolt.Options{ | |||
MmapFlags: syscall.MAP_POPULATE, | |||
InitialMmapSize: InitialMmapSize, | |||
InitialMmapSize: int(InitialMmapSize), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this still overflow 32bit? /cc @hongchaodeng
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @luxas https://play.golang.org/p/HDHHzHbxkO
Probably we should convert to max.int32
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, probably. See here: https://github.com/boltdb/bolt/blob/master/bolt_arm.go#L4.
I would prefer moving this to a specific ARM file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 * 1024 * 1024 * 1024 == math.MaxInt32 + 1
, so math.MaxInt32
sgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
boltdb
maxMapSize equals math.MaxInt32
too
@heyitsanthony I don't believe that. I'm perfectly fine to move |
This was just the first way of getting it compiled. Appreciating discussion. |
What will be the next steps? We do worry about delivering unreliable software that can subtly break at runtime. (And we were considering about prevent 32bit from building actually) |
It'll build, but I am confident it won't work very well:
|
Sorry, git error while updating fork. Fixing... |
Well, now we have a much better alternative. Thanks for your patience. Now we have a "real" solution. |
Yeah I know that, but can you merge this when ready and shift the blocker to e.g. 32-bit intel or something instead. |
ef54510
to
845e3b2
Compare
Okay, so here comes the 32-build fix. Let's merge ASAP. |
@luxas I just tried |
Yeah, of course |
@heyitsanthony The patch sets |
You may easily run $ git diff
diff --git a/build b/build
index e08bcc4..ce56aa8 100755
--- a/build
+++ b/build
@@ -29,5 +29,5 @@ else
fi
# Static compilation is useful when etcd is run in a container
-CGO_ENABLED=0 go build $GO_BUILD_FLAGS -installsuffix cgo -ldflags "-s -X ${REPO_PATH}/version.GitSHA${LINK_OPERATOR}${GIT_SHA}" -o bin/etcd ${REPO_PATH}
-CGO_ENABLED=0 go build $GO_BUILD_FLAGS -installsuffix cgo -ldflags "-s" -o bin/etcdctl ${REPO_PATH}/etcdctl
+CGO_ENABLED=0 GOARCH=386 go build $GO_BUILD_FLAGS -installsuffix cgo -ldflags "-s -X ${REPO_PATH}/version.GitSHA${LINK_OPERATOR}${GIT_SHA}" -o bin/etcd ${REPO_PATH}
+CGO_ENABLED=0 GOARCH=386 go build $GO_BUILD_FLAGS -installsuffix cgo -ldflags "-s" -o bin/etcdctl ${REPO_PATH}/etcdctl
diff --git a/etcdmain/etcd.go b/etcdmain/etcd.go
index e6cdf64..3a7456a 100644
--- a/etcdmain/etcd.go
+++ b/etcdmain/etcd.go
@@ -13,7 +13,6 @@
// limitations under the License.
// TODO: support arm64
-// +build amd64
package etcdmain
diff --git a/etcdmain/etcd_phony.go b/etcdmain/etcd_phony.go
index b918b42..67ef009 100644
--- a/etcdmain/etcd_phony.go
+++ b/etcdmain/etcd_phony.go
@@ -12,7 +12,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.
-// +build !amd64
+// +build ignore
package etcdmain With those small changes, it's just |
Interesting. I've tested it on two separate machines (arm, x86) using both linux and darwin; the tests crash in both cases without #4733 but work fine on amd64. This sort of thing is exactly why 32-bit is considered unstable and not ready for general release. |
I've run |
@heyitsanthony @xiang90 Updated commit message. ptal |
I'm not sure these specializations are necessary for the time being. I'm running tests on 386 and arm now to see if it matters. |
I think they are. |
var test = int64(10 * 1024 * 1024 * 1024);
var test2 = int64((2 * 1024 * 1024 * 1024) - 1); // 2 GB, math.MaxInt32
println(int(test))
println(int(test2)) gives on
|
@luxas 2147483647 is not 2GB; it is not page-aligned. -2147483648 is 2GB. The 386 tests all passed; about to test arm. I looked at the boltdb code to see what happens with 2GB-- turns out it amounts to a no-op. However, I also found that boltdb is clearly broken for files >= 2GB on 32-bit. If the db file grows >= 2GB, it won't be able to open the file again. func (db *DB) mmap(minsz int) error {
db.mmaplock.Lock()
defer db.mmaplock.Unlock()
info, err := db.file.Stat()
if err != nil {
return fmt.Errorf("mmap stat error: %s", err)
} else if int(info.Size()) < db.pageSize*2 {
return fmt.Errorf("file size too small")
}
// Ensure the size is at least the minimum size.
var size = int(info.Size())
if size < minsz {
size = minsz
}
size, err = db.mmapSize(size)
if err != nil {
return err
} |
Crashes near 1GB of DB data on 386.
|
With or without this patch?
Yeah, noticed the same
What do you mean with that?
I think it's pretty the same, but I'm glad you're testing |
It doesn't make a difference. 32-bit is very broken in this case.
mmap rounds sizes up to the nearest page; 2GB is page aligned-- mmap will round (((1 << 31) - 1) up to 1 << 31. Likewise, -2147483648 = 0x80000000 = 2GB |
@heyitsanthony Strange, // mmapSize determines the appropriate size for the mmap given the current size
// of the database. The minimum size is 32KB and doubles until it reaches 1GB.
// Returns an error if the new mmap size is greater than the max allowed.
func (db *DB) mmapSize(size int) (int, error) {
// Double the size from 32KB until 1GB.
for i := uint(15); i <= 30; i++ {
if size <= 1<<i {
return 1 << i, nil
}
}
// Verify the requested size is not above the maximum allowed.
if size > maxMapSize {
return 0, fmt.Errorf("mmap too large")
} |
And |
@luxas, (1 << 31) is a no-op because boltdb is using an int so it defaults to using the file size; (1 << 31) - 1 seems like it should immediately trigger the mmapSize failure, but I couldn't reproduce that case. I assume the crash I'm seeing is because I'm building with: I start a cluster with: I generate a workload with: |
@heyitsanthony I agree that This patch sets |
As size is an int, which is 32bit on 386, the "size >= maxMapSize" will never be true, if maxMapSize = (1<<31)-1 == math.MaxInt32 == 0x7FFFFFFF Either we should use int64 for the size, or check for size < 0 or ? |
OK. I think this should wait until 32-bit is not crashing on >1GB data sets. Otherwise it is misleading and wastes a gigabyte of address space for nothing. |
@heyitsanthony Do you think #4742 affected the 32-bit situation? |
Yeah, read your issue some minutes ago, and it seems resonable. And I assume this 32-bit boltdb issue doesn't affect |
Fixes #4700
@xiang90 @hongchaodeng