Memory leak #180
If you're willing, I'd also love to see if this is an issue purely in the Docker image or also in the code in general. Try running the
and see if this issue also happens through that |
Include any logs and details as well. No amount of detail is too much for us 👍 |
Say.... are you running into issues with the Docker image after running multiple books through it? And it's just having this increase in RAM usage after each book is processed in order, without wiping and restarting the Docker image from scratch? 🤔 |
Testing right now with this in the GUI with standard settings. Docker was launched with an upper CPU memory limit of 4 GB, with this command for my test:
docker run -it -p 7860:7860 --platform=linux/amd64 --memory=4g athomasson2/ebook2audiobook:latest python app.py
edit: I am testing it with this book because there are errors with your
|
This has been happening with every novel-length book I try.
It fails consistently after a few hours (getting about 4-6% of the way through) once it hits the memory threshold. Unrestricted or set to 4 GB, it doesn't seem to matter. Logs don't show anything interesting; it just dies mid-sentence. I've reverted to using the piper-TTS image for now, which is still awesome. I'll comment if I figure out anything useful. |
Interesting Yeah that's weird then |
Even if you turn off sentence splitting? |
Any chance you could show us the book you're using? |
Trying this now; I previously misread it as turn on sentence splitting. The book is Penpal by Dathan Auerbach. I own a copy. What's the best way for me to get it to you? |
Discord. Don't want anyone thinking we're distributing books to the public illegally |
No luck with text splitting off and a 4gb memory limit. It crashed this time at 3.8%. |
Anyone who has issues with this, keep posting ✨🫶🏻 This is just a helpful list of legacy things that might work for you in the meantime. Other LEGACY versions of ebook2audiobook that might not have this issue (I'm not updating them though, as these will be integrated eventually):
- Legacy Ebook2Audiobook v1.0
- Legacy Ebook2Audiobook piperTTS
- Legacy Ebook2Audiobook StyleTTS
- Legacy Ebook2Audiobook Espeak |
@DevonGrandahl
|
Sure! It doesn't fail at the same spot every time. The last lines in the logs:
The next line is:
The PC is a ProDesk mini PC I use as a server. |
Ha, you said it doesn't fail at the same spot, right? Every time it's random? |
Yep, seems to be random. It's never made it past ~8%. Trying the legacy v1.0 image now. Update: the v1.0 image just passed this sentence in maybe 1/3rd the time. Using Attenborough and no memory limit. Update: Aaaaand it crashed. Went much faster, but still crashed when memory topped out. |
So I don't think it's related to ebook2audiobook, but more to how your OS is managing the Docker container. |
RAM failure would be a strange thing to happen to multiple people at the same time trying to run the app. Plenty of apps run inside docker with no issue, so I'm also not sure about the Windows/Docker management issue. Could be, though! I tried mounting Docker volumes for the tmp & audiobook directories with no luck, but I have a hunch I did that wrong. Will try again, since that could help with RAM usage. |
"Failure" is maybe not the right word; it's more how the OS is managing the Docker container's RAM.... |
The memory leak happens also when running in native mode. I'll try next with text splitting off. |
I don't think it's related to enable_text_splitting; it's more a coqui-tts issue... If you say it fails around 8%, then there is something somewhere where the memory is not freed.... |
Attaching logs and a couple of screenshots of memory utilization (see the clock at the top left of the screen). |
OK, there is already something wrong with the processes running. In any case it should be 3 or 4 processes, unless several users are on it, which would explain your memory "leak": not a leak, but just a need for more RAM since there are more users... |
Those different PIDs are actually threads of the same process. Only one instance is running, by just one user - this is a desktop computer. What I wanted to show with the two screenshots is that in just 7 minutes RAM usage increased by ca. 2.6 GB (from 5.3 GB to 7.9 GB). |
It shouldn't be threads, since ebook2audiobook runs with multiprocessing and is NOT allowing threading. So a race occurs, which can explain the increase in your RAM. Now we need to understand why threads are running.... |
I'm running on Linux, Ubuntu 24.10. |
OK, try to run a conversion, then check the PID of each thread, and try to kill all but one with kill -SIGTERM <PID>, then provide the log
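Something along these lines can do that check from Python instead of the shell (a minimal, Linux-only sketch: the app.py marker and the keep-one policy are assumptions for illustration, and running kill -SIGTERM <PID> by hand works just as well):

```python
import os
import signal

def matching_pids(marker="app.py"):
    """Return PIDs whose command line contains `marker` (assumed name)."""
    pids = []
    for entry in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                if marker.encode() in f.read():
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return sorted(pids)

pids = matching_pids()
for pid in pids:
    # Each entry under /proc/<pid>/task is one thread of that process.
    thread_count = len(os.listdir(f"/proc/{pid}/task"))
    print(f"PID {pid}: {thread_count} threads")

for pid in pids[1:]:
    # Keep one worker alive, terminate the rest, then collect the log.
    os.kill(pid, signal.SIGTERM)
```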
ok so the issue is more complex... any chance you have another machine to test the same? |
Just replicated the issue on the Google Colab. Log attached.
|
try to replace this function with this one
|
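The replacement function itself did not survive the copy into this thread; judging from the reply below, the suggestion was to run the synthesis call under torch.no_grad(). A minimal sketch of that pattern, with illustrative names rather than the project's actual code:

```python
import torch

def synth_sentence(tts_model, text, out_path):
    # Illustrative wrapper only; the real call happens inside coqui-tts.
    # torch.no_grad() keeps autograd from retaining intermediate tensors,
    # the usual first suspect when inference memory keeps growing.
    with torch.no_grad():
        return tts_model.tts_to_file(text=text, file_path=out_path)
```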
Memory increases also with torch.no_grad(), same as before. |
Add import gc, and at the end of the function add gc.collect() |
No changes with gc.collect(), memory still increases. |
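For reference, the cleanup that was tried amounts to roughly this (a sketch, not the project's code; the empty_cache() call reflects the "torch cache dump" mentioned further down and only matters when a GPU is in use):

```python
import gc
import torch

def free_after_sentence():
    # Force a collection of unreachable Python objects...
    gc.collect()
    # ...and, when running on a GPU, hand cached allocator blocks back to
    # the driver. On CPU this branch simply never runs.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```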
If gc.collect() has no effect, then I'm afraid it's out of eb2ab's scope and the bug is coming from a library |
FWIW, I am seeing this same behavior in the StyleTTS version of the project. |
ok now let's target the origin more precisely. |
I'm running in cpu mode, I've used 3 different ebooks, 1 in English and 2 in Italian. Same behavior across all inputs. |
did you try on another computer? |
I've created a debian VM (debian 12.8) on virtualbox and will run it there. I'll keep you posted. |
Well, if you created a VM on the same computer, it will be the same.... |
I don't have another computer, and the issue was replicated on Google Colab by @DevonGrandahl. Memory is being allocated in TTS/tts/models/xtts.py, in the inference function, line 568:
After this function the allocated memory is never collected/cleaned even when executing gc.collect(). |
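One stdlib-only way to confirm that the growth really sits across that call (Linux /proc only; synth here is a placeholder for whatever invokes coqui-tts for one sentence, not the project's API):

```python
import gc

def rss_mb():
    # Current resident set size of this process, read from Linux /proc.
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # value is in kB
    return 0.0

def traced(synth, *args, **kwargs):
    # Wrap one synthesis call and report how much RSS it leaves behind
    # even after an explicit garbage collection.
    before = rss_mb()
    result = synth(*args, **kwargs)
    gc.collect()
    print(f"RSS {before:.1f} MB -> {rss_mb():.1f} MB")
    return result
```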
So it's what I've said from the start: it's a coqui-tts issue, not ours... and to fix it, good luck with the fork we are working with... at our level we cannot change anything. The issue you are encountering apparently concerns only a few computers. |
These are the versions installed in the python_env which was created automatically by the application: torch==2.5.1 |
It's all fine, so it's really a coqui-tts issue with torch memory management.... Weird that gc.collect() does not do anything, though. |
Do we know that users are getting successful runs of full-length novels? Saying this is an issue with CoquiTTS seems equivalent to saying this project is DOA, no? It's failing in the Colab, so it's not exactly isolated to a couple of machines. Also, I want to reiterate that this same thing is happening with the old StyleTTS library, which feels like a big coincidence if it's outside of e2a's domain. I'll report back if I get time to dig through the Python. Crossing my fingers this is fixable! |
@DevonGrandahl In native mode, yes, of course! With the Docker image, @DrewThomasson will tell you more. |
I'm messing around with the native mode code, and adding gc.collect and a torch cache dump might be slowing memory growth, but I can confirm it's not fixed. I wonder about deleting the TTS object entirely after every chapter (or x sentences) and rebuilding. It's a dumb idea (it'll definitely be slower), but combined with garbage collection it might stop the unlimited memory growth? |
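A rough sketch of that workaround, not the project's actual pipeline: the model id, the 50-sentence interval, the voice clip, and the helper names are all assumptions for illustration. As reported a few comments further down, rebuilding on every single sentence made things worse, so the interval matters.

```python
import gc
import torch
from TTS.api import TTS  # coqui-tts

MODEL_ID = "tts_models/multilingual/multi-dataset/xtts_v2"
REBUILD_EVERY = 50  # arbitrary interval, purely illustrative

def convert(sentences, voice_wav, out_dir, language="en"):
    tts = TTS(MODEL_ID)
    for i, sentence in enumerate(sentences, 1):
        tts.tts_to_file(text=sentence, speaker_wav=voice_wav,
                        language=language,
                        file_path=f"{out_dir}/{i:05d}.wav")
        if i % REBUILD_EVERY == 0:
            # Drop every reference to the model and rebuild it, hoping the
            # allocator returns the memory between chunks. Slower, but it
            # should cap growth if the leak lives inside the model object.
            del tts
            gc.collect()
            if torch.cuda.is_available():
                torch.cuda.empty_cache()
            tts = TTS(MODEL_ID)
```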
@DevonGrandahl Keep in mind that very few users have this problem, so if it were a major issue the entire project wouldn't be usable. The issue is elsewhere for sure. Are you saying it's the same in native mode? BTW, I just saw you are on Windows 10, not 11, right? |
Yep, I'm seeing the same issue in native mode. The Google Colab is also in native mode, I think. Correct, Windows 10. |
Windows 10 could be the issue.... I also saw users on the coqui-tts forums having the same issue on Windows 10, even with 16 GB of RAM. |
I tried creating a new TTS object at every sentence and then executing a gc.collect(), but memory grew even faster. I also tried setting the params and session objects to None to see if something was holding references to other objects, but that also did not reduce memory utilization. |
Please read my comment above carefully |
On the Debian 12.8 VM in VirtualBox, memory is barely growing (it grows by approx. 1 MB per sentence instead of tens or hundreds of MB per sentence like on Ubuntu 24.10). There could be something related to the kernel/OS that influences how memory is managed, resulting in a leak. This is weird. |
Any VM is still dependent on the host OS, so Windows 10. Then Docker does the mapping, and for sure the mapping behavior differs from one VM to another |
Just tried this on a Windows 11 gaming machine and the memory climb seems much more reasonable. Can't leave this running to see if it ever crashes, but it seems like it would probably work. |
A VirtualBox VM is different from a Docker container in many respects, including from a memory management standpoint. My host is not Windows 10 but Ubuntu 24.10. Anyway, it reached 8 GB of RAM after 417 sentences, and the process was killed by the OS, as the VM has 9 GB of RAM allocated. So it grew, but more slowly than on Ubuntu 24.10. |
Running a TTS AI on a VM is not reasonable, BTW.... it's OK for testing, but not for production. |
hawthorne_lettera_scarlatta.epub.gz
When processing this file using the docker container, memory utilization grows continuously. Command line:
/ebook2audiobook.sh --headless --ebook ../hawthorne_lettera_scarlatta.epub --language ita