-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Conversation
Can one of the admins verify this patch? |
1 similar comment
Can one of the admins verify this patch? |
I did some research, didn't find your issue and came with a similar (less elegant) patch ( master...superboum:superboum/pypy ) and finally found your issue... The next step I think, which is not so simple, would be to evaluate precisely what is the gain (if any). On some pure computation problem, I had a 5x gain by switching from cpython to pypy. But here, it's a bit more complex and not purely computanional. Anyway, if the travis build does not pass, that's is for the following reason:
|
synapse/storage/engines/__init__.py
Outdated
@@ -31,7 +32,12 @@ def create_engine(database_config): | |||
engine_class = SUPPORTED_MODULE.get(name, None) | |||
|
|||
if engine_class: | |||
module = importlib.import_module(name) | |||
needs_pypy_hack = (name == "psycopg2" and | |||
platform.python_implementation() == "PyPy") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line fails the build because it breaks the linter
If anyone has quantitive measures on how much faster or less/more RAM this makes synapse go, we would be all ears (and consider merging it) |
I won't have time to look into this further. However, I'd point out that merging this should have no adverse effects on cpython at all, while making the lives easier for people who want to try and experiment with pypy. Officially supporting pypy is a different beast, of course. |
@@ -362,7 +362,7 @@ def cursor_to_dict(cursor): | |||
Returns: | |||
A list of dicts where the key is the column header. | |||
""" | |||
col_headers = list(intern(column[0]) for column in cursor.description) | |||
col_headers = list(intern(str(column[0])) for column in cursor.description) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you explain why the cast to str is required under psycopg2cffi? this is such a hot path that I'm reluctant to add anything that isn't absolutely vital.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure why this is necessary: The value returned here by psycopg2cffi comes from ffi.string, which for some reason couldn't be interned as-is. I couldn't find anything on google regarding ffi.string vs intern, and casting to str was a simple fix. Could be it returns a unicode object, but PQfname is documented to return char* and not wchar_t* so if it does I don't know why :)
If this is performance-critical, might be worth checking the actual runtime type. Although actually cursor_to_dict
might be conceptually suboptimal if this is really on a super hot path?
Thanks for doing this work, @Valodim, and sorry for sitting on it for months. As you rightly say there doesn't seem to be much disadvantage to landing these patches to help people experiment with pypy. Would you be able to sign off the PR as per https://github.com/matrix-org/synapse/blob/master/CONTRIBUTING.rst ? |
This commit drop-in replaces blist with SortedContainers. They are written in pure python so work with pypy, but perform as good as native implementations, at least in a couple benchmarks: http://www.grantjenks.com/docs/sortedcontainers/performance.html
The psycopg2 package isn't available for PyPy. This commit adds a check if the runtime is PyPy, and if it is uses psycopg2cffi module in favor of psycopg2. This is almost a drop-in replacement, except for one place where an additional cast to string is required.
For some reason, string interpolation on a DomainSpecificString object like "%r" % (domainSpecificStringObj) fails under PyPy, because the default __repr__ implementation wants to iterate over the object. I'm not sure why that happens, but overriding __repr__ instead of __str__ fixes this problem, and is arguably the more appropriate thing to do anyways.
PyPy's incminimark GC can't be triggered manually. From what I observed there are no obvious issues with just letting it run normally. And unlike CPython, it actually returns unused RAM to the system. Signed-off-by: Vincent Breitmoser <[email protected]>
@matrixbot: test this please @Valodim: any chance you can sign off the PR (or each of the commits)? |
Signed-off-by: Vincent Breitmoser [email protected] |
That, sir, is an awesome email address. |
[part of this has been reverted by d3347ad ] |
Ugh, this is one of these situations where sortedcontainers probably meant to work as a drop-in, but now they couldn't change even if they wanted to because some users expect the method to work that way. Dang. Nice catch, hope it didn't eat too much of your time |
I've opened #3098 to track redoing this |
Changes in synapse v0.28.0-rc1 (2018-04-26) =========================================== Bug Fixes: * Fix quarantine media admin API and search reindex (PR #3130) * Fix media admin APIs (PR #3134) Changes in synapse v0.28.0-rc1 (2018-04-24) =========================================== Minor performance improvement to federation sending and bug fixes. (Note: This release does not include state resolutions discussed in matrix live) Features: * Add metrics for event processing lag (PR #3090) * Add metrics for ResponseCache (PR #3092) Changes: * Synapse on PyPy (PR #2760) Thanks to @Valodim! * move handling of auto_join_rooms to RegisterHandler (PR #2996) Thanks to @krombel! * Improve handling of SRV records for federation connections (PR #3016) Thanks to @silkeh! * Document the behaviour of ResponseCache (PR #3059) * Preparation for py3 (PR #3061, #3073, #3074, #3075, #3103, #3104, #3106, #3107, #3109, #3110) Thanks to @NotAFile! * update prometheus dashboard to use new metric names (PR #3069) Thanks to @krombel! * use python3-compatible prints (PR #3074) Thanks to @NotAFile! * Send federation events concurrently (PR #3078) * Limit concurrent event sends for a room (PR #3079) * Improve R30 stat definition (PR #3086) * Send events to ASes concurrently (PR #3088) * Refactor ResponseCache usage (PR #3093) * Clarify that SRV may not point to a CNAME (PR #3100) Thanks to @silkeh! * Use str(e) instead of e.message (PR #3103) Thanks to @NotAFile! * Use six.itervalues in some places (PR #3106) Thanks to @NotAFile! * Refactor store.have_events (PR #3117) Bug Fixes: * Return 401 for invalid access_token on logout (PR #2938) Thanks to @dklug! * Return a 404 rather than a 500 on rejoining empty rooms (PR #3080) * fix federation_domain_whitelist (PR #3099) * Avoid creating events with huge numbers of prev_events (PR #3113) * Reject events which have lots of prev_events (PR #3118)
Is this still working? I tested within this environment:
And I had an exception under Twisted
I made a very ugly patch to "/var/lib/synapse/pypy_venv/site-packages/twisted/python/reflect.py"_, line 162 by adding as documented in psycopg2cffi project:
like so:
And everything works smoothly. Did i miss anything or was some patch reverted? |
Seems like the psycopg2cffi patch I made doesn't affect the adbapi setup, since it was added at a later point. As an alternative patch to the one you have, can you try modifying synapse's server.py line 363 like this?: def build_db_pool(self):
name = self.db_config["name"]
import platform
if (name == "psycopg2" and
platform.python_implementation() == "PyPy"):
name = "psycopg2cffi" I think #3098 is no longer relevant, and hence synapse should run just fine on PyPy. Please do report your experience if you keep running synapse on pypy for a while. If we can get pypy into the tox matrix, synapse's upstream support should become much more stable 👍 (On a related note, synapse might benefit from moving to txpostgres. It hasn't been around quite as long as adbapi, but is considered stable and might offer better performance since it actually uses the libpq async api to implement deferreds, rather than a blocking connection pool of python threads) |
That patch did solve the problem. I've been running it under Pypy with no other issues so far. It is running just fine. (Regarding txpostgres I didn't give it a try, yet. I want to make sure that running under Pypy is stable without adding more variables) |
continued in #4174 |
This is a stack of patches I applied to Synapse to be able to run it on the PyPy runtime. The amount of changes is actually quite small, I commented in the individual commits. I ran my production instance (something like 20 active users I'd estimate, and in a couple big rooms too) for two days on PyPy 5.6.0 with this, without a hitch 👍
All unit tests pass, except for image preview ones which I didn't test. The reason is that these require lxml, which can be installed for pypy but not trivially. Tracking continuous compatibility with tox should also be fairly simple.
I would say that both memory and performance were about even. Memory usage might have been slightly higher than on CPython, but it's hard to tell since memory usage depends so much on forward extremities and regularly explodes for me on CPython as well.
Feel free to merge, cherry-pick, or close. If there is interest I'd be happy to add a paragraph to the readme, which I guess mostly boils down to "pypy also works! please test, measure, and report".