You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In the new system, there are multiple distinct StatName representations for the same string, which can lead to issues with map usage.
I discovered this by modifying source/common/stats/symbol_table_creator.cc to set the default to use real symbol tables, and found crashes in several tests due to unexpected map behavior. One of these was //test/extensions/filters/network/mongo_proxy:proxy_test . There were two problems:
The unexpected map behavior without canonical names led to a crash. This can be resolved by removing a superfluous toString() -> encode() dance which occurs currently when instantiating MetricImpl (a common base class to all stats). With this fix in place, there are no longer crashes due to unexpected map behavior, at least in the mongo test.
After that, there is still an issue that we are doing a stat lookup based on a pure string in a test, and that fails, because the StatName representation formed from the test flow is not the same as the one created in the production code.
I believe that an effective fix can be made by always re-encoding the StatNames without dynamic components when storing in the Allocator, which will match the behavior expected in tests. Other maps, including the histogram map in IsolatedStoreImpl, will have to be treated more carefully.
Users should not enable --use_fake_symbol_tables 0 until this is resolved.
The text was updated successfully, but these errors were encountered:
jmarantz
changed the title
stats: new dynamic stat functionality has an issue with aliasing for StatNames.
stats: new dynamic stat functionality has an issue with aliasing for StatNames, and its interaction with maps.
Jan 22, 2020
Two tests that fail even with the re-encoding workaround listed above:
//test/extensions/filters/network/thrift_proxy/filters/ratelimit:ratelimit_test FAILED in 3.3s
/export/data/jmarantz/bazel-cache/_bazel_jmarantz/ae395351f4dffd6d28b6d1d8f7f4c65d/execroot/envoy/bazel-out/k8-fastbuild/testlogs/test/extensions/filters/network/thrift_proxy/filters/ratelimit/ratelimit_test/test.log
//test/extensions/tracers/lightstep:lightstep_tracer_impl_test FAILED in 2.0s
/export/data/jmarantz/bazel-cache/_bazel_jmarantz/ae395351f4dffd6d28b6d1d8f7f4c65d/execroot/envoy/bazel-out/k8-fastbuild/testlogs/test/extensions/tracers/lightstep/lightstep_tracer_impl_test/test.log
Strategy 1 in the description looks like the right one. It does require some more test-helpers so that the proper StatName structure can be conveniently specified in tests. Unit tests that do map lookups of stats will have to know which segments of the stat-name are dynamically created, which seems reasonable in the 3 tests for which this matters.
In the new system, there are multiple distinct StatName representations for the same string, which can lead to issues with map usage.
I discovered this by modifying source/common/stats/symbol_table_creator.cc to set the default to use real symbol tables, and found crashes in several tests due to unexpected map behavior. One of these was //test/extensions/filters/network/mongo_proxy:proxy_test . There were two problems:
The unexpected map behavior without canonical names led to a crash. This can be resolved by removing a superfluous toString() -> encode() dance which occurs currently when instantiating MetricImpl (a common base class to all stats). With this fix in place, there are no longer crashes due to unexpected map behavior, at least in the mongo test.
After that, there is still an issue that we are doing a stat lookup based on a pure string in a test, and that fails, because the StatName representation formed from the test flow is not the same as the one created in the production code.
I believe that an effective fix can be made by always re-encoding the StatNames without dynamic components when storing in the Allocator, which will match the behavior expected in tests. Other maps, including the histogram map in IsolatedStoreImpl, will have to be treated more carefully.
Users should not enable
--use_fake_symbol_tables 0
until this is resolved.The text was updated successfully, but these errors were encountered: