memory: switch to the latest tcmalloc for x86_64 builds #13251
Signed-off-by: Dmitry Rozhkov <[email protected]>
Hm.. the way how |
…ew tcmalloc Signed-off-by: Dmitry Rozhkov <[email protected]>
/retest |
Retrying Azure Pipelines, to retry CircleCI checks, use |
When Envoy is linked with the new tcmalloc and compiled with gcc, it crashes on an assertion in tcmalloc because a buffer slice is allocated with `new []` but deallocated with `delete`. Signed-off-by: Dmitry Rozhkov <[email protected]>
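For readers unfamiliar with this class of bug: memory obtained with `new []` must be released with `delete []`, and releasing it with plain `delete` is undefined behavior in C++. The following is a minimal sketch of one way to keep the two forms matched. It assumes a hypothetical `Slice` class and `fillAndSum` helper invented for illustration; it is not Envoy's buffer implementation and not necessarily how this commit fixed the problem.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a buffer slice; not Envoy's actual class.
class Slice {
public:
  explicit Slice(std::size_t capacity)
      : storage_(new std::uint8_t[capacity]), capacity_(capacity) {}

  std::size_t capacity() const { return capacity_; }
  std::uint8_t* data() { return storage_.get(); }

private:
  // std::unique_ptr<T[]> releases with delete[], which matches the new[] above.
  // Holding the array in a raw pointer and calling plain `delete` on it would
  // be the mismatch described in the commit message.
  std::unique_ptr<std::uint8_t[]> storage_;
  std::size_t capacity_;
};

// Hypothetical helper: fills a slice with `value` and returns the byte sum,
// so the storage is allocated, used, and released once per call.
inline std::uint64_t fillAndSum(std::size_t capacity, std::uint8_t value) {
  Slice slice(capacity);
  std::uint64_t sum = 0;
  for (std::size_t i = 0; i < slice.capacity(); ++i) {
    slice.data()[i] = value;
    sum += slice.data()[i];
  }
  return sum;
}
```

Tying the array to `std::unique_ptr<T[]>` means the correct form of `delete` is selected by the type, so the allocation and release sites cannot drift apart.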
@jmarantz With the new tcmalloc the amount of consumed memory becomes non-deterministic, and from run to run I see different results of `memory_test.consumedBytes()` in the same test. Can we relax the tests a bit and avoid the strict equality checks?
Are you sure that deterministic byte-counts are not still available via the API in some way? It looks like maybe you changed the memory-stats APIs to include unmapped bytes?
@@ -310,7 +311,7 @@ TEST_P(ClusterMemoryTestRunner, MemoryLargeClusterSizeWithFakeSymbolTable) {
 // https://github.com/envoyproxy/envoy/issues/12209
 // EXPECT_MEMORY_EQ(m_per_cluster, 44949);
 }
-EXPECT_MEMORY_LE(m_per_cluster, 47500); // Round up to allow platform variations.
+EXPECT_MEMORY_LE(m_per_cluster, 47700); // Round up to allow platform variations.
this whole test is now gone, once you merge, so you can just keep the edit for the real symbol-tables test below, and drop this one.
namespace Memory {

uint64_t Stats::totalCurrentlyAllocated() {
  return tcmalloc::MallocExtension::GetNumericProperty("generic.current_allocated_bytes")
maybe for deterministic memory checks we can just use generic.current_allocated_bytes and not include tcmalloc.cpu_free?
We can make 2 APIs if that helps.
I'm also confused by the github UI; it is not showing me the prior impl of this method.
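The "2 APIs" idea could look roughly like the sketch below: one accessor reports only `generic.current_allocated_bytes` for deterministic test checks, and a second one also adds `tcmalloc.cpu_free`. The `Properties` map, `property`, `allocatedBytes`, and `allocatedPlusCpuFreeBytes` are hypothetical names that stand in for the allocator's property lookup so the example runs without linking tcmalloc; only the two property strings come from this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the allocator's numeric-property lookup.
using Properties = std::map<std::string, std::uint64_t>;

// Returns the named property, or 0 when the allocator does not report it.
inline std::uint64_t property(const Properties& props, const std::string& name) {
  const auto it = props.find(name);
  return it == props.end() ? 0 : it->second;
}

// Bytes currently allocated by the application; the candidate for
// deterministic memory checks in tests.
inline std::uint64_t allocatedBytes(const Properties& props) {
  return property(props, "generic.current_allocated_bytes");
}

// Allocated bytes plus the free bytes reported under "tcmalloc.cpu_free".
// Per the discussion above, this total may differ from run to run.
inline std::uint64_t allocatedPlusCpuFreeBytes(const Properties& props) {
  return allocatedBytes(props) + property(props, "tcmalloc.cpu_free");
}
```

Keeping the two accessors separate lets tests depend only on the first, while diagnostics can still report the larger figure.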
This part (from `#if` to `#elif`) is new for tcmalloc. The previous impl is below.
> maybe for deterministic memory checks we can just use generic.current_allocated_bytes and not include tcmalloc.cpu_free?

Right, my bad. It shouldn't be included. I'll drop it. Though after the last merge all tests passed for a6d4c42. But the version without tcmalloc.cpu_free passes the release tests too on my local host. Let's see...

> I'm also confused by the github UI; it is not showing me the prior impl of this method.

The old version is still there, just wrapped in `#if defined(GPERFTOOLS_TCMALLOC)`. It's needed for non-x86 builds.
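As a rough illustration of that layout, the sketch below selects an implementation at compile time. `GPERFTOOLS_TCMALLOC` and the property name are taken from this thread; the `TCMALLOC` macro name and the free function `currentlyAllocatedBytes` are assumptions made for the example, and the real method in the PR may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(TCMALLOC) // Assumed macro name for the new tcmalloc build.
#include "tcmalloc/malloc_extension.h"
#elif defined(GPERFTOOLS_TCMALLOC)
#include "gperftools/malloc_extension.h"
#endif

// Hypothetical free function; reports bytes currently allocated, or 0 when
// no tcmalloc variant is compiled in.
inline std::uint64_t currentlyAllocatedBytes() {
#if defined(TCMALLOC)
  // New tcmalloc: static call that returns an optional value.
  return tcmalloc::MallocExtension::GetNumericProperty("generic.current_allocated_bytes")
      .value_or(0);
#elif defined(GPERFTOOLS_TCMALLOC)
  // gperftools tcmalloc: singleton instance with an out-parameter.
  std::size_t value = 0;
  MallocExtension::instance()->GetNumericProperty("generic.current_allocated_bytes", &value);
  return value;
#else
  return 0;
#endif
}
```

With neither macro defined the function compiles to the fallback branch and returns 0, so callers do not need their own conditional code.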
LGTM, thanks. Defer to @jmarantz for the deterministic memory checks and then merge.
I did include unmapped bytes to |
 // symbol_table_mem_used: 1726056 (3.9x) -- does not seem to depend on STL sizes.
-EXPECT_MEMORY_LE(string_mem_used, 7759488);
+EXPECT_MEMORY_LE(string_mem_used, 7775872);
CI fails on gcc now with:
Expected: (string_mem_used) <= (7775872), actual: 7792256 vs 7775872
I'm guessing this is related to varying STL implementations of std::string across platforms rather than to tcmalloc. I think we can just remove the exact check for string_mem_used. We can keep the test that symbol tables cut the usage by 2/3.
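A ratio check of that kind does not depend on the absolute byte counts, which is why it tolerates platform differences. A minimal sketch follows, assuming a hypothetical helper `cutsUsageByTwoThirds`; the actual assertion in the Envoy test may be phrased differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when the symbol-table representation uses at most
// one third of the memory used by plain strings, i.e. cuts usage by 2/3.
// Multiplying instead of dividing avoids integer-division rounding.
inline bool cutsUsageByTwoThirds(std::uint64_t string_mem_used,
                                 std::uint64_t symbol_table_mem_used) {
  return symbol_table_mem_used * 3 <= string_mem_used;
}
```

With the figures quoted in this thread (7792256 bytes for strings on gcc, 1726056 for symbol tables), the check passes even though the exact upper bound on `string_mem_used` failed.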
Awesome, thanks!
This morning my default build was using clang+llvm-8.0.0 and the build failed with "error: 'asm goto' constructs are not supported yet". I updated clang to 9 and I'm good, but based on that and an unsubstantiated google search (http://llvm.1065342.n5.nabble.com/llvm-dev-is-clang9-supporting-asm-goto-td128867.html) I wonder if we need to update the build requirements doc (https://www.envoyproxy.io/docs/envoy/latest/install/building#requirements)
For the record, I just restored clang-8 as default and verified it builds with `--define tcmalloc=gperftools`, so ideally we can just update the requirements doc to note that folks should use that option with older versions of clang.
Right. Thanks for letting me know! I'll make a PR to mention it. Basically, for non-x86 builds the current requirement is still sufficient too. Those who build Envoy for embedded targets very often have to deal with quite outdated toolchains from the BSPs they get from board vendors.
I added some comments in #13438 as we're hoping to cut the release today or tomorrow.
Great, you corrected it already! I commented on the PR.
Hmm this change is triggering this for our build:
nvm just saw the prev comment
Commit Message: memory: switch to the latest tcmalloc for x86_64 builds
Additional Description: Switch to the new tcmalloc for Linux x86_64 builds. All other builds still get the old tcmalloc from gperftools. Switching profiling or the debugalloc feature on will result in linking to the old tcmalloc too. The old tcmalloc can still be enabled for x86_64 builds with `--define tcmalloc=gperftools`.
Risk Level: High
Testing: run unit tests
Docs Changes: added a note about the new `--define tcmalloc=gperftools` option to bazel/README.md.
Release Notes: updated the "Minor Behavior Changes" section of version_history/current.rst
Fixes #10053