Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make the Json::sanitize method exception free #35598

Closed
wbpcode opened this issue Aug 6, 2024 · 22 comments
Closed

make the Json::sanitize method exception free #35598

wbpcode opened this issue Aug 6, 2024 · 22 comments
Labels
enhancement Feature requests. Not bugs or questions. stale stalebot believes this issue/PR has not been touched recently

Comments

@wbpcode
Copy link
Member

wbpcode commented Aug 6, 2024

This is necessary for core data path usage and envoy-mobile.

The Json::sanitize may call the Nlohmann::Factory::serialize method to sanitize string that contains special character. And the Nlohmann::Factory::serialize will construct a json object and call its dump method.

Basically, the constructor won't throw because it won't validate the input string. But the dump() will throw by default if the input string contains invalid UTF8 characters.
This exception could be suppressed by setting the error_handler to the replace. Then the invalid UTF-8 sequences will be replaced by U+FFFD.

This will break exist behavior, but is a quick way to remove exception from the Json::sanitize.

std::string Factory::serialize(absl::string_view str) {
  // The constructor will not throw even if the string is not valid UTF-8
  // because the constructor will only copy the str to the internal string
  // and won't validate it.
  nlohmann::json j(str);
  return j.dump(-1, ' ', false, nlohmann::json::error_handler_t::replace);
}
@wbpcode wbpcode added enhancement Feature requests. Not bugs or questions. triage Issue requires triage labels Aug 6, 2024
@wbpcode
Copy link
Member Author

wbpcode commented Aug 6, 2024

cc @jmarantz Considering the fact that the Json::sanitize is only used by the stats_render, is this an acceptable way to continue?

@wbpcode wbpcode removed the triage Issue requires triage label Aug 6, 2024
@jmarantz
Copy link
Contributor

jmarantz commented Aug 6, 2024

When I wrote the sanitizer I originally did not use (or even know about) nlohmann, and I had a lot of manual code to do the same thing, which was 2x faster, and exception free. See #20428.

@htuch argued not to add that code to the codebase given the existing dependence on nlohmann, and I agreed that the speedup was not worth the code complexity. And the usage of the sanitizer at the time was all in the main thread, in the admin stats endpoint, where we did allow exceptions (compiled out for mobile).

I think if we want to be exception free I think we should revive #20428.

@htuch
Copy link
Member

htuch commented Aug 6, 2024

Why can't we just do what @wbpcode suggests and use an error handler? I'm not sure I see the need to import a 1k+ code doing non-trivial sanitization if we don't have to.

@jmarantz
Copy link
Contributor

jmarantz commented Aug 6, 2024

Maybe -- @wbpcode what did you mean by "this will break existing behavior". I think I keyed on that warning.

@jmarantz
Copy link
Contributor

jmarantz commented Aug 6, 2024

Also is it enough to avoid nlohmann throwing exceptions at runtime? Or do we need to compile all code linked into E-M with exceptions disabled?

@wbpcode
Copy link
Member Author

wbpcode commented Aug 6, 2024

Maybe -- @wbpcode what did you mean by "this will break existing behavior". I think I keyed on that warning.

In current implemention, if the string contains invalid utf-8, then all non-ascii chars will be replaced with \xxx, the xxx is the number value of byte.

See

         buffer.append(absl::StrFormat("\\%03o", c));

But if we use the way I suggested, only the invalid utf-8 will be replaced with U+FFFD

I personally think this is fine and more correct and this method is used in limited scope for now.****

cc @htuch @jmarantz

@jmarantz
Copy link
Contributor

jmarantz commented Aug 6, 2024

Why is u+fffd more correct? I am not sure if it's that important that we escape invalid utf8 the way it was done, but is it enough to avoid nlohmann throwing exceptions at runtime? Or do we need to compile all code linked into E-M with exceptions disabled?

@wbpcode
Copy link
Member Author

wbpcode commented Aug 6, 2024

Why is u+fffd more correct? 

Sorry, I didn't make it clear. Not the u+fffd is more correct. But the way to handle the other legal utf-8 in a string that contains invalid utf-8 is more correct.

For a string that contains invalid utf-8, the Nlohmann json at least will handle other legal utf-8 correctly.
But the current implemention will convert all non-ascii include the legal utf-8 to \xxx.

@wbpcode
Copy link
Member Author

wbpcode commented Aug 6, 2024

but is it enough to avoid nlohmann throwing exceptions at runtime?

From source code and doc, yeah. Fuzz test could be used to ensure it.

Or do we need to compile all code linked into E-M with exceptions disabled?

Yeah. I think alyssawilk has done lots (super lots) of work to ensure this point. @alyssawilk

@alyssawilk
Copy link
Contributor

+1 to the suggested error handler. I think this is a non-problem for E-M #35607 but represents a potential config validation issue for envoy core (so is worth fixing)

@wbpcode
Copy link
Member Author

wbpcode commented Aug 7, 2024

Ensure the runtime exception issue again:

dump():

Throws type_error.316 if a string stored inside the JSON value is not UTF-8 encoded and error_handler is set to strict

By setting the error_handler to replace or ignore could suppress this behavior.

And I also do a quick check to the source code.


constructor:

Depend on the to_json implementation. No other known exception will be throwed from the doc. See https://json.nlohmann.me/api/basic_json/basic_json/#exception-safety

Specific to constructing json object from string view, only the allocator may throw exception which we basically will ignore. There is the core code snippet:


// This will be called by the constructor.

template<typename BasicJsonType, typename CompatibleString,
         enable_if_t<std::is_constructible<typename BasicJsonType::string_t, CompatibleString>::value, int> = 0>
inline void to_json(BasicJsonType& j, const CompatibleString& s)
{
    external_constructor<value_t::string>::construct(j, s);
}

// This will be called by the to_json implementation for string view.
    template < typename BasicJsonType, typename CompatibleStringType,
               enable_if_t < !std::is_same<CompatibleStringType, typename BasicJsonType::string_t>::value,
                             int > = 0 >
    static void construct(BasicJsonType& j, const CompatibleStringType& str)
    {
        j.m_data.m_value.destroy(j.m_data.m_type);
        j.m_data.m_type = value_t::string;
        j.m_data.m_value.string = j.template create<typename BasicJsonType::string_t>(str);
        j.assert_invariant();
    }

// This will be called by the construct implementation for string_view

    /// helper for exception-safe object creation
    template<typename T, typename... Args>
    JSON_HEDLEY_RETURNS_NON_NULL
    static T* create(Args&& ... args)
    {
        AllocatorType<T> alloc;
        using AllocatorTraits = std::allocator_traits<AllocatorType<T>>;

        auto deleter = [&](T * obj)
        {
            AllocatorTraits::deallocate(alloc, obj, 1);
        };
        std::unique_ptr<T, decltype(deleter)> obj(AllocatorTraits::allocate(alloc, 1), deleter);
        AllocatorTraits::construct(alloc, obj.get(), std::forward<Args>(args)...);
        JSON_ASSERT(obj != nullptr);
        return obj.release();
    }





@wbpcode
Copy link
Member Author

wbpcode commented Aug 7, 2024

I have split the escaping check to a separated function and them add a new utility method escape to implement the exception free sanitizing with only 4 additional lines code.

Utility::EscapedString Utility::escape(absl::string_view str) {
  if (!requireEscaping(str)) {
    return str;
  }
  return Nlohmann::Factory::serialize(str, true);
}

The utility method will only be used by the new JsonFormatter which requires that the users to enable it explicitly by setting runtime flag. See the latest commit in the #35545.

Will this way may help to eliminate some worries about the runtime exception? cc @jmarantz

cc @htuch @alyssawilk

We can make this method to be the only one implementation after it is battle tested. For example, after the JsonFormatter completely replaced the legacy JsonFormatter.

@htuch
Copy link
Member

htuch commented Aug 7, 2024

I guess I'm still a little opposed to bringing in the extra complexity for marginal benefit. Do we have a performance problem today? The 20x is a microbenchmark which will not show up in practice (unless I'm missing some context). Meanwhile, we permanently live with our new more complicated JSON formatter and sanitizer.

Why not just replace the invalid characters and move on? This feels like a non-problem.

@jmarantz
Copy link
Contributor

jmarantz commented Aug 7, 2024

I don't know if I have gotten a specific answer to this question, which I think is for @alyssawilk .

I think @wbpcode is proposing to add the json sanitizer as a dependency in e-m which I think it was not before. json sanitizer currently links in nlohmann which includes 'throw' in its compiled code.

I understand the strategy is that we would prevent nlohmann from calling throw, dynamically. However I thought it might still be a problem to link in a dependency that compiles in a reference to throw.

@jmarantz
Copy link
Contributor

jmarantz commented Aug 7, 2024

@alyssawilk you said "potential config validation issue for envoy core (so is worth fixing)". I did not undestand that phrase. Can you explain?

@wbpcode
Copy link
Member Author

wbpcode commented Aug 8, 2024

I guess I'm still a little opposed to bringing in the extra complexity for marginal benefit. Do we have a performance problem today? The 20x is a microbenchmark which will not show up in practice (unless I'm missing some context). Meanwhile, we permanently live with our new more complicated JSON formatter and sanitizer.

Why not just replace the invalid characters and move on? This feels like a non-problem.

Sorry, I think maybe I don't make it clear enough.

The core of #35545 is a new JsonFormatter implemention for access logging, with completely new design. It's 20x faster (and I think the code is also much simpler) than our legacy one and provide same performance with our text formatter. The poor performance of legacy JsonFormatter actually effects our users (see #35501, an user report similar problem and profiling result in that issue).

An exception free sanitize/escape is necessary for #35545 (our new JsonFormatter) because the new JsonFormatter will be used at core data paths and will be linked to e-m.
But exist sanitize/escape depends on std:: exception directly. It doesn't meet my requirement. So I created this issue to discuss this problem. But note, these methods is not the core part of #35545.

My solution now is creat a new exception-free sanitize/escape method by replacing the invalid utf-8 (as my inital suggestion in this issue) for our new JsonFormatter usage. The new method only bring 4 additional lines code and is much simpler than previous method because we needn't to handle exception. I don't think it will bring extract complexity.
It will simplify our code because the previous sanitize/escape will be replaced once the new one is proven that is stable.

So, I think you may misunderstood something (sorry for my previous unclear description.)

Cc @htuch

@jmarantz
Copy link
Contributor

jmarantz commented Aug 8, 2024

@htuch @alyssawilk

the context here is that @wbpcode is adding a new dependence on JsonSanitizer that requires it to be linked in and functional in envoy-mobile. Hence he is trying to remove the exception flow.

The benefit of this is in #35501 which is looking for a 20x improvement in json serialization during access loggers. I'm very much aligned with this -- I was able to get a huge improvement in json stats output before because the default model of populating a protobuf and then serializing that is a bag full of slow. We are iterating on some of the details of course.

I'm assuming this is necessary because json access loggers are needed in envoy-mobile; if they are not then maybe this change can be skipped.

@alyssawilk
Copy link
Contributor

ah. so I think adding more exceptions to envoy is a poor choice given we're trying to move away from them due to the number of times we've found configs and queries of death where they were used improperly.
if we can get the speed up without exceptions that makes it a clear win.
all of that said envoy mobile doesn't use a json access logger AFIK (mobile/envoy_build_config/extensions_build_config.bzl) @wbpcode can verify by building //test/performance:test_binary_size in mobile - if it builds then there's a test-only dependency which can be easily sorted

@htuch
Copy link
Member

htuch commented Aug 8, 2024

OK, @wbpcode I think I see the reasonableness of the optimization in this use case, thanks for sharing. My vote is for whatever keeps exceptions from occurring (less opinionated about compile time, but sure) and whatever is simplest in implementation that achieves the end goal.

The one concern I have in all this code is it's incredibly fragile and easy to get wrong (enforcing sanitization, parsing, ensuring schema or valid JSON syntax on the way out), mistakes are security bugs in many cases, even for access logs.

Copy link

github-actions bot commented Sep 7, 2024

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Sep 7, 2024
Copy link

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 14, 2024
@jmarantz jmarantz reopened this Sep 14, 2024
@jmarantz
Copy link
Contributor

Actually this was fixed in #36000

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature requests. Not bugs or questions. stale stalebot believes this issue/PR has not been touched recently
Projects
None yet
Development

No branches or pull requests

4 participants