Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrency bug in DefaultContractResolver #1163

Closed
dotnetjunkie opened this issue Jan 11, 2017 · 10 comments
Closed

Concurrency bug in DefaultContractResolver #1163

dotnetjunkie opened this issue Jan 11, 2017 · 10 comments

Comments

@dotnetjunkie
Copy link

Just this week a client of mine stumbled on a concurrency bug in the application which I traced back to JSON.NET. The application uses JSON.NET v7.0.1 but after reviewing the code, my conclusion is that the bug still exists in v9.0.1.

Below I'll describe why I conclude the problem is in JSON.NET.

Analysis of the encountered bug in the application

The application runs the following code:

private static readonly Dictionary<string, Type> knownQueryTypes =
    Bootstrapper.GetQueryTypes().ToDictionary(type => type.FullName, type => type);

private static readonly JsonSerializerSettings jsonSettings = new JsonSerializerSettings
{
    TypeNameHandling = TypeNameHandling.Auto,
};

[OperationContract]
[FaultContract(typeof(ValidationError))]
public string JsonExecute(string queryTypeName, string json)
{
    Type queryType = knownQueryTypes[queryTypeName];

    dynamic query = JsonConvert.DeserializeObject(json, queryType, jsonSettings);

    Type resultType = GetQueryResultType(queryType);
    Type queryHandlerType = typeof(IQueryHandler<,>).MakeGenericType(queryType, resultType);
    dynamic queryHandler = Bootstrapper.Container.GetInstance(queryHandlerType);

    object result = queryHandler.Handle(query);

    return this.jsonSerializerService.SerializeObject(result);
}

In short, the JsonExecute method is a WCF service call that accepts a query message, serialized as json where the message type is supplied as queryTypeName. Internally, the method converts the queryTypeName to the actual Type using a dictionary that is initialized during application start-up and after that is only read from.

Based on this query Type, a IQueryHandler<TQuery, TResult> implementation (a query handler) is requested from the use DI container, after which this handler is invoked using C# dynamic.

Things to note in this code:

  • The method is completely synchronous in nature. No async.
  • The knownQueryTypes dictionary is never changed after creation.
  • The only two places where currency issues could arrise are JsonConvert.DeserializeObject and the call to Container.GetInstance.

After this web service has ran for several years, this week the following exception ocurred:

Microsoft.CSharp.RuntimeBinder.RuntimeBinderException: The best overloaded method match for 'MyApp.BusinessLayer.CrossCuttingConcerns.PermissionQueryHandlerDecorator<MyApp.Contract.Queries.GetAnalyseAanvraagViewByIdQuery, MyApp.Contract.DTOs.AnalyseAanvraagView>.Handle(MyApp.Contract.Queries.GetAnalyseAanvraagViewByIdQuery)' has some invalid arguments

With the following stack trace

at CallSite.Target(Closure , CallSite , Object , Object )
at System.Dynamic.UpdateDelegates.UpdateAndExecute2[T0,T1,TRet](CallSite site, T0 arg0, T1 arg1)
at MyApp.ServiceLayer.QueryService.JsonExecute(String queryTypeName, String json)

In the log we saw that the

Ftom the call was for the following request:

JsonExecute("MyApp.Contract.Queries.GetAnalyseAanvraagViewByIdQuery", "{ json payload }");

During that day all following requests for this particular query failed with the same exception. After the service was restarted the problem disapeared. We haven't seen it.

After close inspection of the code I came to the conclusion that the call to JsonConvert.DeserializeObject returned an instance of the incorrect type, because:

  • The call to knownQueryTypes[queryTypeName] can't return the incorrect type, since the dictionary is never changed after creation.
  • Since the queryType is correct, the correct query handler must be requested from the container.
  • The container returns the correct handler type, because the C# dynamic binder reports that it found the a method with the signature Handle(GetAnalyseAanvraagViewByIdQuery), but that it doesn't match.

This means that the only possible reason for the C# binder to report that the method signature doesn't match (and overload resolution fails) is because the actual type of the query instance (created by JsonConvert.DeserializeObject) is different from the requested queryType. Also note that this application runs unchanged in production for quite some time and used a lot.

Analysis of the JSON.NET code base

After coming to the conclusion that the problem must be in JSON.NET, I started looking at the code base of JSON.NET 7.0.1 and I found the following code that might explain why things go wrong (note the following code is extract from v7.0.1 using Reflector):

public virtual JsonContract ResolveContract(Type type)
{
    JsonContract contract;
    if (type == null)
    {
        throw new ArgumentNullException("type");
    }
    DefaultContractResolverState state = this.GetState();
    ResolverContractKey key = new ResolverContractKey(base.GetType(), type);
    Dictionary<ResolverContractKey, JsonContract> contractCache = state.ContractCache;
    if ((contractCache == null) || !contractCache.TryGetValue(key, out contract))
    {
        contract = this.CreateContract(type);
        lock (TypeContractCacheLock)
        {
            contractCache = state.ContractCache;
            Dictionary<ResolverContractKey, JsonContract> dictionary2 = (contractCache != null) 
				? new Dictionary<ResolverContractKey, JsonContract>(contractCache) 
				: new Dictionary<ResolverContractKey, JsonContract>();
            dictionary2[key] = contract;
            state.ContractCache = dictionary2;
        }
    }
    return contract;
}

After analyzing this code we can quickly spot a concurrency bug: contractCache.TryGetValue is used outside the lock. Dictionary<TKey, TValue> can not safely be used with one concurrent writer. MSDN clearly states:

A Dictionary<TKey, TValue> can support multiple readers concurrently, as long as the collection is not modified.

In other words, there is a concurrency bug in the JsonContract cache, and that might explain problem described above. If the DefaultContractResolver resolves a JsonContract for the wrong type, deserialization can fail silently, since that type can usually be created successfully, but it will be incorrectly initialized. In this case all query objects have default constructors. Deserialization would likely fail if the query class was immutable with a non-default constructor.

A possible fix for this problem is to replace the Dictionary<ResolverContractKey, JsonContract> with an old fashioned HashTable, since:

Hashtable is thread safe for use by multiple reader threads and a single writing thread.

Do note that the latest code contains the same bug. This means that even v9.0.1 is affected.

@davkean
Copy link

davkean commented Jan 11, 2017

HashTable is not safe for read/writer at the same time. We should update the documentation...

@dotnetjunkie
Copy link
Author

@davkean I am pretty sure HashTable is meant to be safe for this. Take a look at the code for HashTable. You'll see that the code goes into great length to ensure it supports multiple readers and a single writer concurrently. You'll see spinwait loops in there for this.

@davkean
Copy link

davkean commented Jan 11, 2017

It does attempt to go to great lengths to attempt to this - unfortunately, it wasn't successful and still had lots of threading issues (when I'm next connected to the VPN, I'll look up the bugs we had around this if you'd like). In the end, this is why Dictionary wasn't made safe for a writer and reader at the same time - lock free data structures are hard.

@dotnetjunkie
Copy link
Author

@davkean I think it is really important that you make this publically that there are issues with HashTable and what they are. There are OSS projects that depend on this documented behavior of HashTable.

@dotnetjunkie
Copy link
Author

But in case HashTable with one writer and multiple concurrent readers is out of the question, the best solution would be to use 'immutable' Dictionary<T, V> and never update the instance but replace its reference with the reference of a new copy that includes the new value.

@nblumhardt
Copy link

Would be great to know the bugs. Our usage in Serilog is pretty forgiving (I don't think we care if we see stale reads, and we don't enumerate or have hard requirements for a correct Count), so as long as the bugs aren't state-corrupting we'll probably be fine.

Seems a bit late to change the docs here though; perhaps adding the known-issues list would be more useful?

@JamesNK
Copy link
Owner

JamesNK commented Jan 11, 2017

After analyzing this code we can quickly spot a concurrency bug: contractCache.TryGetValue is used outside the lock. Dictionary

It is but you are incorrect that it's not thread safe. The dictionary assigned to state.ContractCache is never modified. When state.ContractCache needs to change then a new dictionary is created on a local variable, the new value is added to it, and only then is it assigned to state.ContractCache. It is then thread safe to access anywhere because it is immutable.

        lock (TypeContractCacheLock)
        {
            contractCache = state.ContractCache;
            Dictionary<ResolverContractKey, JsonContract> dictionary2 = (contractCache != null) 
				? new Dictionary<ResolverContractKey, JsonContract>(contractCache) 
				: new Dictionary<ResolverContractKey, JsonContract>();
            dictionary2[key] = contract;
            state.ContractCache = dictionary2;
        }

@dotnetjunkie
Copy link
Author

It is but you are incorrect that it's not thread safe.

Crap! You are right. I missed the newing of the dictionaries. You are already reference swapping.

@JamesNK JamesNK closed this as completed Jan 12, 2017
@dotnetjunkie
Copy link
Author

This doesn't invalidate the analysis of the reported problem though. I suspect there to still be a concurrency problem somewhere in JSON.NET (at least in v7). I advised my client to upgrade to v9 and to add an extra guard clause to make it very clear when DeserializeObject returns a different type than the supplied one. In case this happens, I will post a new bug report.

@dotnetjunkie
Copy link
Author

@davkean, do you have any public announcement about HashTable and the possible problems with it? It would really help the whole community and libraries like Serilog that depend on hashtable, quite extensively.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants