Skip to content
This repository has been archived by the owner on Aug 25, 2018. It is now read-only.

std::wstring_convert performance bottleneck #161

Closed
wants to merge 2 commits into from

Conversation

mcg1969
Copy link
Contributor

@mcg1969 mcg1969 commented May 18, 2016

I recently updated a project from VS2010 to VS2015 (finally!) and nanodbc 1.x to 2.x. I saw a horrible slowdown that can be traced to the poor performance of the std::wstring_convert constructor; see this post for discussion. Making this converter static seems to have addressed the problem.

I only made the change in the VS2015 version of the converter, but I suspect this same pattern should be considered anywhere std::wstring_convert is used.

@lexicalunit
Copy link
Owner

Nice catch! Makes total sense to have that be static there too.

@mcg1969
Copy link
Contributor Author

mcg1969 commented May 18, 2016

I'm happy to modify this patch to do both if you like.

@mcg1969
Copy link
Contributor Author

mcg1969 commented May 18, 2016

Indeed, it looks like there are a few places it could be added---plus I have an indentation issue. Curse you, GitHub editor... I'll fix it locally.

@mloskot
Copy link
Contributor

mloskot commented May 18, 2016

Good catch. Thanks for very interesting report

@mcg1969
Copy link
Contributor Author

mcg1969 commented May 18, 2016

Glad you like it. VS2015 has some really convenient profiling features that made this easy to find.

@mloskot
Copy link
Contributor

mloskot commented May 19, 2016

@mcg1969 Yes, looks like GH editor is messed with tabs, instead of inserting 4 spaces.

@lexicalunit merging? I can untabify after merge.

@mcg1969
Copy link
Contributor Author

mcg1969 commented May 19, 2016

Can't blame GitHub for the second commit; that was VS2015. Bothersome. I can fix before merging if you want.

@mloskot
Copy link
Contributor

mloskot commented May 19, 2016

Fixing would be helpful as it is a good idea to not to mess the history with 'untabify' commits.

Regarding multiple commits, yeah, that's why the 'old-school' way of submitting PRs is the best :)

git co yourbranch
... # edits
git commit --amend
git push -f yourremote yourbranch

But, if it is a problem, I will take care of it.

@kentf
Copy link
Contributor

kentf commented May 19, 2016

Using a static converter is fine for single threaded applications. How would this hold up against multiple threads using a single (static) converter?

Neither to_bytes or from_bytes are const, so I would assume this will be disastrous in multithreaded applications, even when there is a single nanodbc object per thread, and even to different databases. All instances would share the same converter due to the static keyword.

This answer seems to agree with me: http://stackoverflow.com/a/26055879

One solution would be to use a static mutex and a lock per converter, before using those converters... But then again, it will be convoluted and it would incur a slight performance penalty in single threaded applications.

@mcg1969
Copy link
Contributor Author

mcg1969 commented May 19, 2016

@kentf thanks for chiming in, that's a crucial observation. For my application this is not going to be a problem, but I assume that we shouldn't merge this until this is addressed somehow. I doubt the mutex would be as expensive as the current (non-static) implementation.

@kentf
Copy link
Contributor

kentf commented May 19, 2016

@mcg1969: I agree it's not as expensive as constructing the converter every time.

It would make sense to opt out of using std::mutex iff NANODBC_DONT_USE_CONVERTER_MUTEX_ONLY_USE_IN_SINGLE_THREADED_ENVIRONMENTS (or an equivalent thereof) is defined and set to true.

Using a non-static std::unique_lock, around a static std::mutex at each location should solve the synchronization issue.

I'd love to implement it, but I don't have the time required for testing and verifying it.

@jon-v
Copy link

jon-v commented May 19, 2016

Isn't this what thread_local is for?

@mcg1969
Copy link
Contributor Author

mcg1969 commented May 19, 2016

I was just going to ask. I'm a bit new to the latest C++ goodies.

@jon-v
Copy link

jon-v commented May 19, 2016

The downside of thread_local would be poor performance if an app were constantly launching new threads, but I think that's far preferable to the scalability impact of a mutex.

@mloskot
Copy link
Contributor

mloskot commented May 20, 2016

I agree with @kentf that mutex (or TLS) should be conditionally compiled in.

@lexicalunit
Copy link
Owner

Agree that we should make efforts to ensure things don't break for multithreaded applications when possible. I haven't used thread_local before and I'm not super familiar with the trade-offs here.

@mcg1969
Copy link
Contributor Author

mcg1969 commented May 21, 2016

I'm using it now with VS2015. Seems to work fine. And it's c++11, so it's reasonably safe, it would seem.

@lexicalunit
Copy link
Owner

There are two more places where we create static wstring_convert in nanodbc.cpp. We should probably all thread_local to those too. After that I'd be happy to accept this PR :)

@mcg1969
Copy link
Contributor Author

mcg1969 commented May 25, 2016

Thanks Amy. I'm happy to work on both of these PRs in a few days but I've got to help get a release of conda out the door first. (No, we don't use these in conda :-))

@lexicalunit
Copy link
Owner

"(No, we don't use these in conda :-))"

Yet! :-p

@lexicalunit
Copy link
Owner

@mcg1969 this PR can be closed in favor of #201, correct?

@mcg1969
Copy link
Contributor Author

mcg1969 commented Aug 24, 2016

I'm sorry for missing this---got your closure notice though! Yes you're right, the focus should be on #201. I've had some difficulty getting the Linux debugging going

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants