Possible memory leak in _cdecimal.c with 'z' format #114563
Comments
Inline quote below. Was there a thread specifically about the memory leak? Haven't looked at it, but seems to be good news that the mpdecimal library picked up z-format support.
@belm0 in case you're not aware, there's some history here (which I won't get into) that may make the author of that announcement hesitant to interact directly with CPython. If we can verify the memory leak exists, we should fix it in CPython.
Thanks for pointing that out, @JelleZijlstra. I agree we should just try and fix it on our side. I don't know how feasible it is to update to a new mpdecimal and remove our 'z' code, if that will fix it.
I recall
I can reproduce the memory leak on main with this code:

from decimal import Decimal
while True:
    d = Decimal('-.508e+41211')
    format(d, 'D=-z,.44%')

When I run this on my (macOS / Intel, FWIW) machine, RAM usage (observed with

import decimal
import tracemalloc
tracemalloc.start()
repeat = 100
before = tracemalloc.take_snapshot()
for _ in range(repeat):
    d = decimal.Decimal('-.508e+41211')
    format(d, 'D=-z,.44%')
after = tracemalloc.take_snapshot()
top_stats = after.compare_to(before, 'lineno')
for stat in top_stats[:3]:
    print(stat)

then the first line of the output is:
and the size grows linearly with @belm0 Can you reproduce the above on your machine, and if so do you have bandwidth to investigate further?
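One way to check for linear growth is to wrap the measurement above in a function and compare the net allocation for a few loop counts. This is only a sketch: the helper name `net_allocated` and the chosen loop counts are illustrative and not from the thread, and the default spec uses 'z', which needs a Python version that accepts it for Decimal (3.11 or later).

```python
import decimal
import tracemalloc


def net_allocated(repeat, spec='D=-z,.44%'):
    """Return the net bytes still allocated after `repeat` format calls.

    Hypothetical helper: it repeats the tracemalloc measurement from the
    reproduction above for a given loop count.
    """
    d = decimal.Decimal('-.508e+41211')
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(repeat):
        format(d, spec)  # the result is discarded, so it should be freed
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Sum the per-line differences to get one overall figure.
    return sum(stat.size_diff for stat in after.compare_to(before, 'lineno'))


if __name__ == '__main__':
    for repeat in (10, 100, 1000):
        print(repeat, net_allocated(repeat))
```

On an affected build the printed figure should grow roughly in proportion to the loop count; on a build without the leak it should stay small and roughly constant.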
I don't understand / can't reproduce this: the

mdickinson@lovelace cpython % ./python.exe
Python 3.13.0a3+ (heads/main:a768e12f09, Jan 28 2024, 09:45:39) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from decimal import Decimal
>>> format(Decimal('-0'), '.6E')
'-0.000000E+6'
>>> format(Decimal('-0'), 'z.6E')
'0.000000E+6'
https://www.bytereef.org/mpdecimal/doc/libmpdec/assign-convert.html#to-string
https://www.bytereef.org/mpdecimal/doc/libmpdec/memory.html
https://www.bytereef.org/contrib/0001-main-revert-z-format-specifier.patch
Finally, @JelleZijlstra, prevailing open source conventions would dictate that upstream is contacted when a new feature is desired in a library, not the other way round.
@mdickinson Thank you for reproducing the memory leak! "EG" was a typo (introduced while coordinating the large number of patches); I meant "F":
@belm0 I did not "maintain" the integration code, I am the sole author of
Thank you for the additional links.
I neglected to consider this precisely because distributions may ship/override mpdecimal, so putting an implementation in Python seemed the only way to provide the enhancement for all cases. @mdickinson what do you think of the fallback approach in the cited patch? It seems like less tricky code to maintain.
Do we need a separate issue for supporting
Yes, I agree, up to a point. For example, all relevant distributions are now https://repology.org/project/mpdecimal/versions So now instances of I'm very glad that you mention the difficulties of the coordination issue! I received very little empathy in 2020 when I tried to fix an incorrect (and unreleased!) patched libmpdec in and for Debian before the freeze.
No, the fallback code fixes everything including hash formatting. mpdecimal speedups will be picked up automatically when available.
I opened PR #114879 based on Stefan's patches.
Seems reasonable to me: it makes a lot of sense to me to separate the (somewhat) frequently-changing Python-specific formatting details from the standards-based decimal core. It's a bit ugly to have some of the formatting being done in the C code and some in Python, but that seems like the most pragmatic compromise. (Moving all formatting to Python sounds nice in theory, but would almost certainly have an unacceptable performance impact for common cases.) Thanks for the PR. I'll review shortly.
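The fallback idea discussed above can be sketched in pure Python as follows. This is not the code from the PR: `fast_format`, `slow_format`, and `format_with_fallback` are hypothetical names, the "fast" path is only a stand-in for a C formatter that rejects 'z', and the sketch assumes specs of the simple form `[z][.precision]type` with no fill, alignment, or width.

```python
from decimal import Decimal


def fast_format(value, spec):
    """Stand-in for a core formatter that does not know 'z' (hypothetical)."""
    if spec.startswith("z"):
        raise ValueError("unsupported format spec: " + repr(spec))
    return format(value, spec)


def slow_format(value, spec):
    """Stand-in for the slower fallback: handle a leading 'z' by
    formatting without it and then dropping the sign of a zero result."""
    strip_zero_sign = spec.startswith("z")
    text = format(value, spec[1:] if strip_zero_sign else spec)
    if strip_zero_sign and value.is_finite() and text.startswith("-"):
        # Look only at the coefficient, not at any exponent digits.
        coefficient = text[1:].upper().split("E")[0]
        digits = [c for c in coefficient if c.isdigit()]
        if digits and all(c == "0" for c in digits):
            return text[1:]
    return text


def format_with_fallback(value, spec):
    """Try the fast path first; use the fallback only when it refuses."""
    try:
        return fast_format(value, spec)
    except ValueError:
        return slow_format(value, spec)


print(format_with_fallback(Decimal("-0.0001"), "z.2f"))  # '0.00'
print(format_with_fallback(Decimal("-1.5"), "z.2f"))     # '-1.50'
```

With this shape, common specs never leave the fast path, and the fast path can take over 'z' later without any change to callers.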
…trings (GH-114879)
Immediate merits:
* eliminate complex workarounds for 'z' format support (NOTE: mpdecimal recently added 'z' support, so this becomes efficient in the long term.)
* fix 'z' format memory leak
* fix 'z' format applied to 'F'
* fix missing '#' format support
Suggested and prototyped by Stefan Krah.
Fixes gh-114563, gh-91060
Co-authored-by: Stefan Krah <[email protected]>
…ormat strings (GH-114879) (GH-115353)
Immediate merits:
* eliminate complex workarounds for 'z' format support (NOTE: mpdecimal recently added 'z' support, so this becomes efficient in the long term.)
* fix 'z' format memory leak
* fix 'z' format applied to 'F'
* fix missing '#' format support
Suggested and prototyped by Stefan Krah.
Fixes gh-114563, gh-91060
(cherry picked from commit 72340d1)
Co-authored-by: John Belmonte <[email protected]>
Co-authored-by: Stefan Krah <[email protected]>
…ormat strings (GH-114879) (GH-115384)
Immediate merits:
* eliminate complex workarounds for 'z' format support (NOTE: mpdecimal recently added 'z' support, so this becomes efficient in the long term.)
* fix 'z' format memory leak
* fix 'z' format applied to 'F'
* fix missing '#' format support
Suggested and prototyped by Stefan Krah.
Fixes gh-114563, gh-91060
(cherry picked from commit 72340d1)
(cherry picked from commit 09c98e4)
Co-authored-by: Stefan Krah <[email protected]>
Bug report
Bug description:
See https://mail.python.org/archives/list/[email protected]/thread/DHZROL7YYJZTPWJQ3WME4HI3Z65K2H4F/
This feature was originally implemented in #30049
I have not verified the memory leak or looked at Stefan's suggestions.
@mdickinson @belm0
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response
Linked PRs