Memory leak #180

Open
filippocarone opened this issue Jan 2, 2025 · 65 comments
Labels
bug (Something isn't working) · docker (Related to docker)

Comments

@filippocarone

hawthorne_lettera_scarlatta.epub.gz

When processing this file using the docker container, memory utilization grows continuously. Command line:

/ebook2audiobook.sh --headless --ebook ../hawthorne_lettera_scarlatta.epub --language ita

@DrewThomasson
Owner

If you're willing, I'd also love to see whether this is an issue purely in the Docker image

Or also in the code in general

Try running

ebook2audiobook.sh

directly, and see if this issue also happens through that

@DrewThomasson
Owner

Please include any logs and other details as well.

No amount of detail is too much for us 👍

DrewThomasson added the bug (Something isn't working) and docker (Related to docker) labels on Jan 2, 2025
@DrewThomasson
Owner

Say....

Are you running into issues with the Docker image after running multiple books through it?

And is it just showing this increase in RAM usage after each book is processed in sequence, without wiping and restarting the Docker container from the base image? 🤔

@DrewThomasson
Owner

DrewThomasson commented Jan 2, 2025

Testing right now with this:

alice_in_wonderland.txt

In the GUI with standard settings.

Docker was launched with an upper (CPU) memory limit of 4 GB for my test, using this command:

docker run -it -p 7860:7860 --platform=linux/amd64 --memory=4g athomasson2/ebook2audiobook:latest python app.py

Edit: I am testing with this book because there are errors with your .epub file; it is seen as a directory for some reason.

@DrewThomasson
Owner

DrewThomasson commented Jan 2, 2025

seems to be going.... well XD

image

It's BARELY hanging on right now but still progressing.

log_so_far.txt

Update: still going strong

image

log_so_far2.txt

Update: still going strong

image

log_so_far3.txt

@DevonGrandahl

DevonGrandahl commented Jan 3, 2025

This has been happening with every novel-length book I try.

  • Docker on Windows 10
  • Mostly using a ~2k sentence book
  • Mostly using Bryan Cranston's voice
  • Mid/low-end mini PC w/ 8gb of available RAM.

It fails consistently after a few hours (getting about 4-6% of the way through) once it hits the memory threshold. Unrestricted or limited to 4 GB, it doesn't seem to matter. Logs don't show anything interesting; it just dies mid-sentence.

I've reverted to using the piper-TTS image for now, which is still awesome. I'll comment if I figure out anything useful.

@DrewThomasson
Owner

Interesting

Yeah that's weird then

@DrewThomasson
Owner

Even if you turn off sentence splitting?

@DrewThomasson
Owner

@DevonGrandahl

Any chance you could show us the book you're using?

@DevonGrandahl

Even if you turn off sentence splitting?

Trying this now; I previously misread it as turning sentence splitting on.

The book is Penpal by Dathan Auerbach. I own a copy. What's the best way for me to get it to you?

@DrewThomasson
Owner

DrewThomasson commented Jan 3, 2025

@DevonGrandahl

Discord

Don't want anyone thinking we're distributing books to the public illegally

@DevonGrandahl

No luck with text splitting off and a 4gb memory limit. It crashed this time at 3.8%.

@DrewThomasson
Owner

DrewThomasson commented Jan 3, 2025

Anyone who has issues with this, keep posting ✨🫶🏻

In the meantime, here is a helpful list of legacy versions that might work for you.

Other LEGACY versions of ebook2audiobook that might not have this issue (I'm not updating them, though, as they will eventually be integrated):

Legacy Ebook2Audiobook v1.0

Legacy Ebook2AudiobookpiperTTS

Legacy Ebook2AudiobookStyleTTS

Legacy Ebook2AudiobookEspeak

@ROBERT-MCDOWELL
Collaborator

@DevonGrandahl
Could you provide:

  • the next text after the last sentence converted (see the terminal when you run eb2ab)
  • the CPU and GPU of your PC

@DevonGrandahl

Sure! It doesn't fail at the same spot every time.

The last lines in the logs:

2025-01-02 21:20:49 94/2396 Sentence: Small towns lack many of the luxuries of larger towns or cities; what few stores there are close down early, 
2025-01-02 21:20:49 
2025-01-02 21:21:47 
Processing 3.88%: : 94/2396

The next line is:

traveling events don’t stop there because they probably missed your small dot on the map, and there aren’t many police or hospitals at your disposal.

The PC is a ProDesk mini PC I use as a server.
Processor: Intel(R) Core(TM) i7-6700T CPU @ 2.80GHz, 2808 Mhz, 4 Core(s), 8 Logical Processor(s)
RAM: 16gb (8gb available to Docker)
GPU: Intel HD Graphics 530 (Integrated)

@ROBERT-MCDOWELL
Collaborator

Ha, you said it doesn't fail at the same spot, right? Every time it's random?

@DevonGrandahl

DevonGrandahl commented Jan 3, 2025

Yep, seems to be random. It's never made it past ~8%.

Trying the legacy v1.0 image now.

Update: the v1.0 image just passed this sentence in maybe 1/3rd the time. Using Attenborough and no memory limit.

Update: Aaaaand it crashed. Went much faster, but still crashed when memory topped out.

@ROBERT-MCDOWELL
Collaborator

So I don't think it's related to ebook2audiobook, but more to how your OS is managing Docker.
It could also be a RAM failure...

@DevonGrandahl

RAM failure would be a strange thing to happen to multiple people at the same time trying to run the app. Plenty of apps run inside docker with no issue, so I'm also not sure about the Windows/Docker management issue. Could be, though!

I tried mounting Docker volumes for the tmp & audiobook directories with no luck, but I have a hunch I did that wrong. Will try again, since that could help with RAM usage.

@ROBERT-MCDOWELL
Collaborator

ROBERT-MCDOWELL commented Jan 3, 2025

Failure is maybe not the right word; it's more about how the OS is managing the Docker RAM...
Maybe there is also something else outside Docker causing trouble for your Docker setup.
Try rebooting with the minimum of services using RAM and virtual memory, then try again and see if it's better.

@filippocarone
Author

The memory leak also happens when running in native mode. Next I'll try with text splitting off.

@ROBERT-MCDOWELL
Collaborator

I don't think it's related to enable_text_splitting; it's more likely a coqui-tts issue... If you say it fails at around 8%, then there is something somewhere where memory is not being freed...

@filippocarone
Author

Attaching logs and a couple of screenshots of memory utilization (see the clock at the top left of the screen).

log.txt
Screenshot_20250104_080709_Termius
Screenshot_20250104_080001_Termius

@ROBERT-MCDOWELL
Collaborator

OK, there is already something wrong with the processes running. In any case there should only be 3 or 4 processes, unless several users are on it, which would explain your memory "leak": not a leak, just a need for more RAM because of more users...

@filippocarone
Author

Those different PIDs are actually threads of the same process. Only one instance is running, by just one user - this is a desktop computer. What I wanted to show with the 2 screenshots is that in just 7 minutes RAM usage has increased by ca. 2.6 GB (from 5.3 to 7.9).
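
To put numbers on that growth per sentence rather than watching the clock, here is a minimal logging sketch that could be dropped into the conversion loop (assumes psutil is installed; log_rss and sentence_number are hypothetical names, not part of ebook2audiobook):

import os

import psutil

_proc = psutil.Process(os.getpid())

def log_rss(tag=''):
    # Print the resident set size (RSS) of the current process in MB.
    rss_mb = _proc.memory_info().rss / (1024 * 1024)
    print(f'[mem] {tag}: RSS = {rss_mb:.1f} MB')

# Hypothetical usage inside the sentence loop:
# log_rss(f'after sentence {sentence_number}')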

@ROBERT-MCDOWELL
Collaborator

ROBERT-MCDOWELL commented Jan 4, 2025

They shouldn't be threads, since ebook2audiobook runs with multiprocessing and is NOT supposed to allow threading. So a race occurs, which could explain the increase in your RAM. Now we need to understand why threads are running...
Are you on Windows 11?

@filippocarone
Author

I'm running on Linux, Ubuntu 24.10.

@ROBERT-MCDOWELL
Collaborator

OK, try to run a conversion, then check the PID of each thread and try to kill all but one with kill -SIGTERM, then provide the log.
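
If it helps, a minimal sketch (assuming psutil is available; 12345 is a placeholder for the main ebook2audiobook PID) to see whether those extra entries are child processes or threads before killing anything:

import psutil

# Placeholder: the PID of the main ebook2audiobook process from ps/top.
main = psutil.Process(12345)

# Child processes (what multiprocessing would normally create):
for child in main.children(recursive=True):
    print('child process', child.pid, child.name())

# Threads inside the main process (htop on Linux can show these as
# separate entries, which look like extra PIDs):
print('thread count:', main.num_threads())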

@ROBERT-MCDOWELL
Collaborator

OK, so the issue is more complex... Any chance you have another machine to run the same test on?

@DevonGrandahl

DevonGrandahl commented Jan 5, 2025

Just replicated the issue on the Google Colab. Log attached.

  • ~2k sentence ebook (Penpal, but I'll switch to a Project Gutenberg book for testing going forward)
  • Bryan Cranston voice
  • Using T4 runtime
  • Unaltered (did not remove tqdm or refreshes)
  • Took maybe 30 minutes to crash out
    image
    e2aLog_1_5_25.txt

@ROBERT-MCDOWELL
Collaborator

ROBERT-MCDOWELL commented Jan 5, 2025

Try replacing this function with this one:

def convert_sentence_to_audio(params, session):
    try:
        if session['cancellation_requested']:
            #stop_and_detach_tts(params['tts'])
            print('Cancel requested')
            return False
        generation_params = {
            "temperature": session['temperature'],
            "length_penalty": session["length_penalty"],
            "repetition_penalty": session['repetition_penalty'],
            "num_beams": int(session['length_penalty']) + 1 if session["length_penalty"] > 1 else 1,
            "top_k": session['top_k'],
            "top_p": session['top_p'],
            "speed": session['speed'],
            "enable_text_splitting": session['enable_text_splitting']
        }
        if params['tts_model'] == 'xtts':
            if session['custom_model'] is not None or session['fine_tuned'] != 'std':
                with torch.no_grad():
                    output = params['tts'].inference(
                        text=params['sentence'],
                        language=session['metadata']['language_iso1'],
                        gpt_cond_latent=params['gpt_cond_latent'],
                        speaker_embedding=params['speaker_embedding'],
                        **generation_params
                    )
                    torchaudio.save(
                        params['sentence_audio_file'],
                        torch.tensor(output[audioproc_format]).unsqueeze(0),
                        sample_rate=24000
                    )
            else:
                with torch.no_grad():
                    params['tts'].tts_to_file(
                        text=params['sentence'],
                        language=session['metadata']['language_iso1'],
                        file_path=params['sentence_audio_file'],
                        speaker_wav=params['voice_file'],
                        **generation_params
                    )
        elif params['tts_model'] == 'fairseq':
            with torch.no_grad():
                params['tts'].tts_with_vc_to_file(
                    text=params['sentence'],
                    file_path=params['sentence_audio_file'],
                    speaker_wav=params['voice_file'].replace('_24khz','_16khz'),
                    split_sentences=session['enable_text_splitting']
                )
        if session['device'] == 'cuda':
            torch.cuda.empty_cache()
        if os.path.exists(params['sentence_audio_file']):
            return True
        print(f"Cannot create {params['sentence_audio_file']}")
        return False
    except Exception as e:
        raise DependencyError(e)

@filippocarone
Author

Memory still increases with torch.no_grad(), same as before.

@ROBERT-MCDOWELL
Collaborator

add
import gc

and at the end of the function add
collected = gc.collect()
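
In context, a sketch of where those two additions could sit in the function above; placing the call just before the final file check is one reading of "at the end of the function" (torch, os and DependencyError come from the existing module, as in the snippet above):

import gc  # added near the module's other imports

def convert_sentence_to_audio(params, session):
    try:
        # ... inference code exactly as in the snippet above ...
        if session['device'] == 'cuda':
            torch.cuda.empty_cache()
        # Added: force a garbage-collection pass after each sentence.
        collected = gc.collect()
        if os.path.exists(params['sentence_audio_file']):
            return True
        print(f"Cannot create {params['sentence_audio_file']}")
        return False
    except Exception as e:
        raise DependencyError(e)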

@filippocarone
Author

No changes with gc.collect(), memory still increases.

@ROBERT-MCDOWELL
Collaborator

If gc.collect() has no effect, then I'm afraid it's out of ebook2audiobook's scope and the bug is coming from a library.

@DevonGrandahl

FWIW, I am seeing this same behavior in the StyleTTS version of the project.

@ROBERT-MCDOWELL
Collaborator

OK, now let's target the origin more precisely.
Switch to CPU mode and tell me if it's better. If not, use another ebook sample in the same language, with around the same number of sentences. If it's still not better, use an English ebook sample with the same characteristics.
If it's still not OK, we will start commenting out parts of the code to localize the one causing the issue.

@filippocarone
Author

I'm running in cpu mode, I've used 3 different ebooks, 1 in English and 2 in Italian. Same behavior across all inputs.

@ROBERT-MCDOWELL
Collaborator

did you try on another computer?

@filippocarone
Author

I've created a Debian VM (Debian 12.8) in VirtualBox and will run it there. I'll keep you posted.

@ROBERT-MCDOWELL
Collaborator

Well, if you created the VM on the same computer, it will be the same...

@filippocarone
Author

filippocarone commented Jan 6, 2025

I don't have another computer, and the issue was replicated on Google Colab by @DevonGrandahl.

Memory is being allocated in TTS/tts/models/xtts.py, in the inference function, line 568:

            wavs.append(self.hifigan_decoder(gpt_latents, g=speaker_embedding).cpu().squeeze())

After this call, the allocated memory is never collected/freed, even when executing gc.collect().
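
One way to check whether tensors from that call are actually being retained between sentences is to count the tensors the garbage collector can still see; a minimal diagnostic sketch (count_live_tensors is a hypothetical helper, not project or coqui-tts code):

import gc

import torch

def count_live_tensors():
    # Count tensors still reachable via the garbage collector and the
    # total number of elements they hold; a number that keeps climbing
    # between sentences means references are being retained somewhere.
    count, numel = 0, 0
    for obj in gc.get_objects():
        try:
            if torch.is_tensor(obj):
                count += 1
                numel += obj.numel()
        except Exception:
            continue
    return count, numel

# Hypothetical usage: call it before and after one sentence of
# inference and compare the two results.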

@ROBERT-MCDOWELL
Collaborator

ROBERT-MCDOWELL commented Jan 6, 2025

So it's what I've said from the start: it's a coqui-tts issue, not ours... and as for fixing it, good luck with the fork we are working with... at our level we cannot change anything. The issue you are encountering apparently concerns only a few computers.
I suggest you open an issue there.
By the way, what torch version is installed?

@filippocarone
Author

These are the versions installed in the python_env which was created automatically by the application:

torch==2.5.1
torchaudio==2.5.1
coqui-tts==0.25.1
coqui-tts-trainer==0.2.0

@ROBERT-MCDOWELL
Collaborator

Those are all fine, so it really is a coqui-tts / torch memory-management issue... Weird that gc.collect() does not do anything, though.

@DevonGrandahl

Do we know that users are getting successful runs of full-length novels? Saying this is an issue with coqui-tts seems equivalent to saying this project is DOA, no? It's failing in the Colab, so it's not exactly isolated to a couple of machines.

Also, I want to reiterate that this same thing is happening with the old StyleTTS library, which feels like a big coincidence if it's outside of e2a's domain.

I'll report back if I get time to dig through the Python. Crossing my fingers this is fixable!

@ROBERT-MCDOWELL
Collaborator

@DevonGrandahl In native mode, yes, of course! For the Docker side, @DrewThomasson will tell you more.

@DevonGrandahl

I'm messing around with the native-mode code, and adding gc.collect() and a torch cache dump might be slowing memory growth, but I can confirm it's not fixed. I wonder about deleting the TTS object entirely after every chapter (or every x sentences) and rebuilding it. It's a dumb idea (it'll definitely be slower), but combined with garbage collection it might stop the unlimited memory growth?
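
For reference, a rough sketch of that idea, assuming a load_tts() factory function that rebuilds the model the same way the project originally built params['tts'] (load_tts and recycle_tts are hypothetical names, not project code):

import gc

import torch

def recycle_tts(params, session, load_tts):
    # Drop the only reference to the current TTS object, reclaim what
    # the garbage collector (and the CUDA cache, if used) will give
    # back, then rebuild the model from scratch. Slower, but it may
    # cap the unbounded memory growth.
    del params['tts']
    gc.collect()
    if session['device'] == 'cuda':
        torch.cuda.empty_cache()
    params['tts'] = load_tts(session)

# Hypothetical usage: call recycle_tts(...) after every chapter, or
# every N sentences.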

@ROBERT-MCDOWELL
Collaborator

ROBERT-MCDOWELL commented Jan 6, 2025

@DevonGrandahl Keep in mind that very few users have this problem; if it were a major issue, the entire project wouldn't be usable. The issue is elsewhere for sure. Are you saying it's the same in native mode? By the way, I just saw you are on Windows 10, not 11, right?

@DevonGrandahl

Yep, I'm seeing the same issue in native mode. The Google Colab is also in native mode, I think.

Correct, Windows 10.

@ROBERT-MCDOWELL
Collaborator

ROBERT-MCDOWELL commented Jan 6, 2025

Windows 10 could be the issue... I also saw users on the coqui-tts forums having the same problem on Windows 10, even with 16 GB of RAM,
but the original coqui-tts repo closed it as "won't fix". However, some found a way to stabilize memory by cutting sentences to a maximum of 100 characters and creating a new TTS instance for each one (not sure about that last part). Maybe with gc.collect() we can avoid creating a new TTS instance; I can also code a new condition to split sentences further in CPU mode. The thing I still don't get is that my test laptop is a 17-year-old Core 2 Duo with 4 GB of RAM, and some of my tests ran around 30,000 sentences on CPU; after 3 days the RAM usage was unchanged, maxed out plus virtual memory, but no crash, in native mode, on Windows 11.
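
For anyone who wants to experiment with that workaround, a minimal sketch of splitting a sentence into chunks of at most 100 characters on word boundaries (chunk_sentence is a hypothetical helper, not project code):

def chunk_sentence(sentence, max_chars=100):
    # Pack whole words into chunks no longer than max_chars, so each
    # TTS call receives a short piece of text. A single word longer
    # than max_chars still becomes its own (oversized) chunk.
    chunks, current = [], ''
    for word in sentence.split():
        candidate = f'{current} {word}'.strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks

# Example: chunk_sentence('Small towns lack many of the luxuries of larger towns or cities; ...')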

@filippocarone
Author

I tried creating a new TTS object for every sentence and then executing gc.collect(), but memory grew even faster. I also tried setting the params and session objects to None to see if something was holding references to other objects, but that did not reduce memory utilization either.

@ROBERT-MCDOWELL
Collaborator

Please read my comment above carefully.

@filippocarone
Author

filippocarone commented Jan 6, 2025

On the Debian 12.8 VM in VirtualBox, memory grows far more slowly (approximately 1 MB per sentence instead of tens or hundreds of MB per sentence as on Ubuntu 24.10). There could be something related to the kernel/OS that influences how memory is managed, resulting in a leak. This is weird.
Anyway, I'll try a full run and see how it behaves to the end.

@ROBERT-MCDOWELL
Collaborator

ROBERT-MCDOWELL commented Jan 6, 2025

Any VM is still dependent on the host OS, so Windows 10. Then Docker does the mapping, and the mapping behavior surely differs from one VM to another.

@DevonGrandahl

Just tried this on a Windows 11 gaming machine and the memory climb seems much more reasonable. Can't leave this running to see if it ever crashes, but it seems like it would probably work.

@filippocarone
Author

A VirtualBox VM is different from a Docker container in many respects, including memory management. My host is not Windows 10 but Ubuntu 24.10. Anyway, it reached 8 GB of RAM after 417 sentences, and the process was killed by the OS, as the VM has 9 GB of RAM allocated. So it still grew, just more slowly than directly on Ubuntu 24.10.

@ROBERT-MCDOWELL
Collaborator

Running a TTS AI in a VM is not reasonable, by the way... It's OK for testing, but not for production.
