Race conditions when passing in a BMv2 JSON config #1243
Pasting the line TSAN is complaining about here for convenience:
As a side note, running the TSAN tool while removing the BMv2 Config JSON in the command line arguments (and using the |
I found the following context in
Just an educated guess, but on the surface this looks like a race between configuration through the command line and configuration through P4Runtime. If so, it could be fixed by not starting the P4Runtime server until after the initial configuration has been applied.
Here is the complete ThreadSanitizer report:
Swapping the relative order of the calls to |
Do you push the forwarding pipeline as soon as the P4Runtime server is started? It's strange that these 2 things happen concurrently. I think it would be fine to swap these 2 function calls as you suggested. I don't see any potential drawback or issue.
^ I was wrong there. I think the solution you propose (changing order of function calls) may not work in the end. I see that behavioral-model/src/bm_sim/switch.cpp Lines 76 to 90 in 2ad84a2
If we start the switch without a P4 config and we wait for |
Yes, pretty much. We're seeing this issue in a unit test which is pretty much straight line code without any delays.
Ohhhh, good catch! Perhaps we can swap the calls only if an initial config was specified?
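To make the conditional ordering concrete, here is a minimal sketch. It assumes the two calls in question are "start the switch" and "start the P4Runtime server"; the function and step names are hypothetical labels for illustration and are not bmv2 functions.

```cpp
#include <string>
#include <vector>

// Hypothetical step names for illustration only; these are not bmv2 APIs.
// Returns the order in which the two startup steps would run.
std::vector<std::string> startup_order(bool has_initial_config) {
  if (has_initial_config) {
    // A config is already available, so the switch can finish starting
    // before any P4Runtime client is able to push a pipeline.
    return {"start_switch", "start_p4runtime_server"};
  }
  // Without an initial config the switch has to wait for one, and it can
  // only arrive through the server, so the server must come up first.
  return {"start_p4runtime_server", "start_switch"};
}
```

With an initial config the server would only become reachable after the switch has started, which is what would avoid the concurrent configuration. Without one, the original order is kept so that startup does not block forever.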
I would recommend just locking the behavioral-model/src/bm_sim/switch.cpp Line 87 in 2ad84a2
Since targets are using this callback to validate the initial config, it makes sense to lock the mutex before calling it. I hope I am not missing something, but this should work:
void
SwitchWContexts::start_and_return() {
  std::unique_lock<std::mutex> config_lock(config_mutex);
  if (!config_loaded && !enable_swap) {
    Logger::get()->error(
        "The switch was started with no P4 and config swap is disabled");
  }
  config_loaded_cv.wait(config_lock, [this]() { return config_loaded; });
  start();  // DevMgr::start
  start_and_return_();
  // Starts any registered periodically-executing externs
  PeriodicTaskList::get_instance().start();
}
After the condition variable's |
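The locking pattern in the snippet above can be tried in isolation. The following is a minimal, self-contained sketch of a mutex plus condition variable guarding a "config loaded" flag. `ConfigGate`, `load`, `wait_and_get`, and `run_gate_demo` are hypothetical names invented for this example and are not part of bmv2.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for the switch's config state; not bmv2's real class.
class ConfigGate {
 public:
  // Called by whichever path delivers the config (command line or P4Runtime).
  void load(const std::string &json) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      config_ = json;
      loaded_ = true;
    }
    cv_.notify_all();  // wake any thread blocked in wait_and_get()
  }

  // Blocks until a config is present, then returns a copy taken under the lock.
  std::string wait_and_get() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this]() { return loaded_; });
    return config_;  // the lock is still held here, so the read is race-free
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool loaded_ = false;
  std::string config_;
};

// Loads a config from a second thread while the calling thread waits for it.
std::string run_gate_demo() {
  ConfigGate gate;
  std::thread loader([&gate]() { gate.load("{\"program\": \"demo\"}"); });
  std::string seen = gate.wait_and_get();
  loader.join();
  return seen;
}
```

Because the flag and the config are only touched while the mutex is held, a tool such as ThreadSanitizer should not flag this pattern. The predicate passed to `wait` also covers the case where the config is loaded before the waiter arrives.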
Where does the initial config from the JSON file get set? I guess we would want to verify that at least that bit happens before the P4RT server is started, right?
I believe that the initial config from the JSON file gets set in the |
Thanks for verifying that, that looks correct to me. And thanks @antoninbas for the suggestion!
After running our tests under TSan with a simple gRPC service and a BMv2 JSON config passed in through the command-line arguments, I noticed flaky tests caused by race conditions. The error is as follows:
I was wondering if this is the expected behavior?
@smolkaj @jonathan-dilorenzo