Performance degradation over time #2146

Closed

vitoyucepi opened this issue Jan 9, 2022 · 23 comments

@vitoyucepi
Collaborator

vitoyucepi commented Jan 9, 2022

Describe the bug
After running for some time, liquidsoap begins to consume much more CPU time.
Here's the log from:

while true; do 
  ps -e -o pid,vsize,rssize,etime,time,%cpu,%mem,args | grep liquidsoa[p];
  sleep 10;
done
516448 883684 195224      27:30 00:01:04  3.8  0.5 liquidsoap /var/liquidsoap/default.liq
516448 883684 195224      27:40 00:01:04  3.8  0.5 liquidsoap /var/liquidsoap/default.liq
516448 883684 195224      27:50 00:01:05  3.8  0.5 liquidsoap /var/liquidsoap/default.liq
516448 883684 195224      28:00 00:01:05  3.8  0.5 liquidsoap /var/liquidsoap/default.liq
...
516448 940520 263924   23:11:05 02:36:51 11.2  0.8 liquidsoap /var/liquidsoap/default.liq
516448 940520 263924   23:11:15 02:36:53 11.2  0.8 liquidsoap /var/liquidsoap/default.liq
516448 940520 263924   23:11:25 02:36:55 11.2  0.8 liquidsoap /var/liquidsoap/default.liq
516448 940520 263924   23:11:35 02:36:57 11.2  0.8 liquidsoap /var/liquidsoap/default.liq

It also looks like there's a memory leak.

Another log; the CPU consumption is higher here because I additionally encode 6 streams (2×mp3 + 2×ogg + 2×opus).

365034 922216 225860      23:41 00:04:05 17.2  0.6 liquidsoap /var/liquidsoap/default.liq
365034 922216 225860      23:51 00:04:06 17.2  0.6 liquidsoap /var/liquidsoap/default.liq
365034 922216 225860      24:01 00:04:08 17.2  0.6 liquidsoap /var/liquidsoap/default.liq
365034 922216 225860      24:11 00:04:10 17.2  0.6 liquidsoap /var/liquidsoap/default.liq
...
365034 1204016 518528  16:41:49 04:18:04 25.7  1.5 liquidsoap /var/liquidsoap/default.liq
365034 1204016 518528  16:41:59 04:18:09 25.7  1.5 liquidsoap /var/liquidsoap/default.liq
365034 1204016 518528  16:42:09 04:18:13 25.7  1.5 liquidsoap /var/liquidsoap/default.liq
365034 1204016 518528  16:42:19 04:18:16 25.7  1.5 liquidsoap /var/liquidsoap/default.liq

To Reproduce

  1. Base. I didn't notice any performance degradation with this config.
settings.log.stdout.set(true)

security = single("/var/liquidsoap/security.ogg")

jingles = playlist(mode = "randomize", reload_mode = "watch", "/media/jingles")
music_1 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/1")
music_2 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/2")
music_3 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/3")

schedule = random(weights = [1, 1, 1], [music_1, music_2, music_3])
rotation = rotate(weights = [3, 1], [schedule, jingles])
radio = fallback(track_sensitive = false, [rotation, security])

output.dummy(radio)
  2. Extended. It seems this config can cause performance degradation over time.
settings.log.stdout.set(true)

security = single("/var/liquidsoap/efg.ogg")

jingles = playlist(mode = "randomize", reload_mode = "watch", "/media/jingles")
music_1 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/1")
music_2 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/2")
music_3 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/3")

schedule = random(weights = [1, 1, 1], [music_1, music_2, music_3])
rotation = rotate(weights = [3, 1], [schedule, jingles])

radio = switch(track_sensitive = false, [(schedule.is_ready, rotation), ({true}, security)])

radio = normalize(target = 0.0, window = 0.03, gain_min = -16.0, gain_max = 0.0, radio)
radio = compress.exponential(id = "radio", mu = 1.0, radio)

radio = crossfade(radio)

add_input = input.harbor(
  port = 8000,
  user = 'a',
  password = 'a',
  buffer = 1.0,
  '/a'
)

main_stream = smooth_add(p = 0.15, normal = radio, special = add_input)

override_input = input.harbor(
  port = 8000,
  user = 'b',
  password = 'b',
  buffer = 1.0,
  '/b'
)

live_stream = fallback(track_sensitive=false, [override_input, main_stream])

output.dummy(main_stream)
output.dummy(live_stream)

Music generation script

#!/bin/bash

set -eu
set -o pipefail

function generateTrack() {
  type="$1"
  number="$2"
  read -r -d '' ssml <<XML || true
<speak>
  <prosody rate="x-slow">
    This is the <say-as interpret-as="ordinal">$number</say-as> $type
    <break time="5s"/>
    The <say-as interpret-as="ordinal">$number</say-as> $type ended
    <break time="200ms"/>
  </prosody>
</speak>
XML
  curl -G --data-urlencode "ssml=$ssml" --data-urlencode "lang=en-US" "https://www.google.com/speech-api/v1/synthesize"
}

function main() {
  mediaPath="${1:-media}"
  jinglesPath="$mediaPath/jingles"
  musicPath="$mediaPath/music"
  for path in "$jinglesPath" "$musicPath"; do
    if [ -e "$path" ]; then
      rm -rf "$path"
    fi
    mkdir -p "$path"
  done

  for number in $(seq 0 9); do
    generateTrack 'jingle' "$number" |
      ffmpeg -i - -metadata title="jingle $number" -c copy "$jinglesPath/jingle_$number.mp3"
  done

  for music in $(seq 1 3); do
    path="$musicPath/$music"
    if [ -e "$path" ]; then
      rm -rf "$path"
    fi
    mkdir "$path"
    for number in $(seq 0 9); do
      generateTrack 'music' "$number" |
        ffmpeg -i - -metadata title="music$music $number" -c copy "$path/${music}_$number.mp3"
    done
  done
}

main "$@"

Expected behavior
The same CPU consumption over several days of operation.

Version details

  • OS: ubuntu:20.04 in docker
  • Version: 2.0.2

Install method
Deb package from the Liquidsoap CI artifacts on GitHub.

Common issues
N/A

@vitoyucepi
Collaborator Author

vitoyucepi commented Jan 10, 2022

I have to monitor different versions of liquidsoap, so it will take some time.
Basically I can see a difference after 3 hours, but I'll check again after a day.

1.4.4

set("log.stdout", true)

security = single("/var/liquidsoap/efg.ogg")

jingles = playlist(mode = "randomize", reload_mode = "watch", "/media/jingles")
music_1 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/1")
music_2 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/2")
music_3 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/3")

schedule = random(weights = [1, 1, 1], [music_1, music_2, music_3])
rotation = rotate(weights = [3, 1], [schedule, jingles])
radio = switch(track_sensitive = false, [(fun() -> source.is_ready(schedule), rotation), ({true}, security)])
radio = audio_to_stereo(radio)

radio = normalize(target = 0.0, window = 0.03, gain_min = -16.0, gain_max = 0.0, radio)
radio = compress.exponential(id = "radio", mu = 1.0, radio)

radio = crossfade(radio)

add_input = input.harbor(
  port = 8000,
  user = 'a',
  password = 'a',
  buffer = 1.0,
  '/a'
)

main_stream = smooth_add(p = 0.15, normal = radio, special = add_input)

override_input = input.harbor(
  port = 8000,
  user = 'b',
  password = 'b',
  buffer = 1.0,
  '/b'
)

live_stream = fallback(track_sensitive=false, [override_input, main_stream])

output.dummy(main_stream)
output.dummy(live_stream)

2.0.0+

settings.log.stdout.set(true)

security = single("/var/liquidsoap/efg.ogg")

jingles = playlist(mode = "randomize", reload_mode = "watch", "/media/jingles")
music_1 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/1")
music_2 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/2")
music_3 = playlist(mode = "randomize", reload_mode = "watch", "/media/music/3")

schedule = random(weights = [1, 1, 1], [music_1, music_2, music_3])
rotation = rotate(weights = [3, 1], [schedule, jingles])

radio = switch(track_sensitive = false, [(schedule.is_ready, rotation), ({true}, security)])

radio = normalize(target = 0.0, window = 0.03, gain_min = -16.0, gain_max = 0.0, radio)
radio = compress.exponential(id = "radio", mu = 1.0, radio)

radio = crossfade(radio)

add_input = input.harbor(
  port = 8000,
  user = 'a',
  password = 'a',
  buffer = 1.0,
  '/a'
)

main_stream = smooth_add(p = 0.15, normal = radio, special = add_input)

override_input = input.harbor(
  port = 8000,
  user = 'b',
  password = 'b',
  buffer = 1.0,
  '/b'
)

live_stream = fallback(track_sensitive=false, [override_input, main_stream])

output.dummy(main_stream)
output.dummy(live_stream)

@toots
Member

toots commented Jan 10, 2022

Thanks for the report. A comparison would be great. As for the memory, there may not be a leak: the OCaml garbage collector may decide to increase the size of the allocated memory pool to better accommodate the application's usage.
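
For reference, one way to tell pool growth apart from an actual leak is to look at the OCaml GC counters. A minimal standalone sketch (plain OCaml using the stdlib Gc module, not liquidsoap script):

(* gc_report.ml: print major-heap size vs. live data. A heap that grows
   while live words stay flat usually means the GC kept a larger memory
   pool rather than the program leaking. *)
let () =
  let s = Gc.stat () in
  Printf.printf "heap=%d words  live=%d words  top=%d words\n"
    s.Gc.heap_words s.Gc.live_words s.Gc.top_heap_words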

@vitoyucepi
Collaborator Author

All of them

 754336 865308 175712      21:34 00:00:52  4.0  0.5 liquidsoap /var/liquidsoap/default_2.0.0.liq
 754347 882724 195416      21:34 00:00:52  4.0  0.5 liquidsoap /var/liquidsoap/default_2.0.2.liq
 754466 873528 185812      21:34 00:00:51  3.9  0.5 liquidsoap /var/liquidsoap/default_2.0.1.liq
 754499 891064 140328      21:34 00:00:31  2.4  0.4 liquidsoap /var/liquidsoap/default_1.4.4.liq

 754336 865308 179144   01:42:32 00:04:21  4.2  0.5 liquidsoap /var/liquidsoap/default_2.0.0.liq
 754347 882724 201224   01:42:32 00:04:13  4.1  0.6 liquidsoap /var/liquidsoap/default_2.0.2.liq
 754466 873528 185812   01:42:32 00:04:03  3.9  0.5 liquidsoap /var/liquidsoap/default_2.0.1.liq
 754499 932840 183096   01:42:32 00:02:52  2.8  0.5 liquidsoap /var/liquidsoap/default_1.4.4.liq

 754336 894100 213600   06:42:58 00:25:53  6.4  0.6 liquidsoap /var/liquidsoap/default_2.0.0.liq
 754347 883492 202280   06:42:58 00:22:04  5.4  0.6 liquidsoap /var/liquidsoap/default_2.0.2.liq
 754466 883148 199804   06:42:58 00:19:27  4.8  0.6 liquidsoap /var/liquidsoap/default_2.0.1.liq
 754499 1036532 265544  06:42:58 00:20:56  5.1  0.8 liquidsoap /var/liquidsoap/default_1.4.4.liq

 754336 894100 213600   06:52:58 00:26:52  6.5  0.6 liquidsoap /var/liquidsoap/default_2.0.0.liq
 754347 883492 202280   06:52:58 00:22:50  5.5  0.6 liquidsoap /var/liquidsoap/default_2.0.2.liq
 754466 883148 199804   06:52:58 00:20:05  4.8  0.6 liquidsoap /var/liquidsoap/default_2.0.1.liq
 754499 1036532 265544  06:52:58 00:21:49  5.2  0.8 liquidsoap /var/liquidsoap/default_1.4.4.liq

At the moment, the average CPU time relative to real time over the last 10 minutes is:

  • 1.4.4: 8.8%
  • 2.0.0: 9.8%
  • 2.0.1: 6.3%
  • 2.0.2: 7.7%

@vitoyucepi
Collaborator Author

Maybe this problem is related to the crossfade function combined with short tracks, because in production I deployed liquidsoap without crossfade and the tracks have a normal length, except the jingles.

@vitoyucepi
Collaborator Author

By mistake I deployed instances both with and without crossfade.
The crossfade version increased its CPU consumption after 12 hours.

@toots
Member

toots commented Jan 16, 2022

Thanks for these reports, I will have a look very shortly.

@ItsMitchh

Hey toots, any update on this? I've been seeing a fair few people complaining about higher than normal memory/CPU usage.

@toots
Member

toots commented Jan 21, 2022

Hi Mitch. I haven't looked at the issue yet. Could you give me more details about how this is impacting users? Please note that the issue you are referring to in AzuraCast is about memory consumption, not CPU.

@toots
Member

toots commented Jan 21, 2022

@vitoyucepi any longer-term update? I'm not sure that I see a significant difference in the numbers above.

@vitoyucepi
Collaborator Author

@SC2Mitch I'm not sure these issues are related.
Originally I wanted to rename this issue to "crossfade starts to consume more CPU over time".

@toots It seems this issue is related to crossfade only, and it's CPU related.
But in any case there are definitely other sources of memory leakage, because after 5 weeks:

 2314 1200380 605016 40-15:10:09 5-22:12:38 14.5 29.6 liquidsoap /var/liquidsoap/default.liq

@toots
Member

toots commented Jan 21, 2022

Thanks for reporting. I'm not sure what I am supposed to look at with the last log @vitoyucepi 🙂

@vitoyucepi
Collaborator Author

vitoyucepi commented Jan 21, 2022

@toots 29.6 is %mem.
In 5 weeks memory consumption went from 10% to 30%, i.e. from 200 MB to 600 MB.

But 14.5 is %cpu and that's fine. After I removed crossfade from the config there was no more CPU leakage.

@toots
Member

toots commented Jan 21, 2022

@vitoyucepi Thanks! The memleak happens with the 2.0.0+ script above, right?

@vitoyucepi
Collaborator Author

@toots I haven't done memory leak tests yet. Originally I wanted to figure out if #1539 still persists, but unexpectedly found this crossfade-related problem.

@toots
Member

toots commented Jan 21, 2022

Initial research seems to point at a single issue causing both: a memory leak from buffers that, in the case of cross, would put a lot of stress on the GC, causing an increase in CPU usage as well. Will report as soon as possible.
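
A rough way to observe that kind of GC pressure from the OCaml side is to watch the collector's counters over time. A minimal standalone sketch (plain OCaml linked against the unix library; not liquidsoap code):

(* gc_pressure.ml: print the growth of the GC counters every 10 seconds.
   Leaking heap-allocated buffers shows up as steadily growing major_words
   together with a rising major-collection rate, i.e. extra CPU spent
   inside the collector. *)
let () =
  let rec loop prev =
    Unix.sleepf 10.0;
    let s = Gc.quick_stat () in
    Printf.printf "+%d major collections, +%.0f major words\n%!"
      (s.Gc.major_collections - prev.Gc.major_collections)
      (s.Gc.major_words -. prev.Gc.major_words);
    loop s
  in
  loop (Gc.quick_stat ())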

@vitoyucepi
Collaborator Author

@toots I reread your questions.
All of my test setups leak memory. However, I didn't run tests for versions below 1.4.4.
#2146 (comment)

@toots
Member

toots commented Jan 22, 2022

I can reproduce the memory leak but at the same time, memtrace from the OCaml side says memory is stable:
[Screenshot, 2022-01-22: memtrace report showing stable OCaml-side memory usage]

This seems to indicate a bug in the C allocations, perhaps something similar to #2054.
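
For anyone reproducing this kind of analysis: the OCaml-side trace comes from the memtrace opam package (liquidsoap's actual integration may differ, so treat this as a sketch). A program opts in with a single call plus the MEMTRACE environment variable:

(* memtrace_example.ml: when MEMTRACE=/path/to/trace.ctf is set, an
   allocation trace of the OCaml heap is written for memtrace-viewer.
   A flat OCaml-side profile while the process RSS keeps growing points
   at allocations made outside the OCaml heap, e.g. in C bindings. *)
let () =
  Memtrace.trace_if_requested ()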

cc/ AzuraCast/AzuraCast#5010

@toots
Member

toots commented Jan 23, 2022

Update: The memory allocation inspection found some inefficient float operations with bigarrays in ocaml-mm. @vitoyucepi if you recompile with the latest ocaml-mm from git, you should see some improvement in CPU usage. Still chasing the memory allocation growth.
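
To illustrate the general pitfall (this is not the actual ocaml-mm code, just the pattern involved): bigarray accesses whose element kind is not statically known go through a C stub and box every float, which creates a lot of short-lived allocations in tight audio loops.

(* Hypothetical illustration, not code from ocaml-mm. *)
open Bigarray

(* Kind left abstract: each Array1.get goes through the generic C stub and
   allocates a boxed float on every access. *)
let sum_generic : type k. (float, k, c_layout) Array1.t -> float =
 fun a ->
  let s = ref 0.0 in
  for i = 0 to Array1.dim a - 1 do s := !s +. Array1.get a i done;
  !s

(* Kind fixed to float64: the accesses compile to direct unboxed loads. *)
let sum_float64 (a : (float, float64_elt, c_layout) Array1.t) =
  let s = ref 0.0 in
  for i = 0 to Array1.dim a - 1 do s := !s +. Array1.unsafe_get a i done;
  !s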

@toots
Member

toots commented Jan 26, 2022

Quick update on this one: still working on it. I can confirm that it's coming from crossfade; this script shows a memory leak:

radio = single("/path/to/song.mp3")
radio = crossfade(radio)

clock.assign_new(sync="none",[radio])

output.dummy(fallible=true,radio)

I'm still working on zeroing in on the root cause.

@toots
Member

toots commented Jan 31, 2022

Update on this issue: after a lot of testing, it appears that there isn't anything going wrong besides OCaml's garbage collector taking some initiative in the trade-off between CPU usage and memory allocation. It looks like a lot of the OCaml ecosystem is generally leaning toward more memory allocation and less CPU usage.

To give a little more detail, OCaml's memory management (the GC) has to balance reclaiming allocated memory against the extra CPU cost of doing so. This means that the GC can sometimes leave allocated memory uncollected for a while to avoid consuming too many resources.

There are parameters to control this behavior and they are now exported in the v2.0.3-preview and main branches as scripting operators. With some aggressive settings, I am able to limit the memory usage in my reproduction script.
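
For reference, the parameters in question are the standard OCaml Gc controls; the sketch below shows the underlying OCaml calls rather than the liquidsoap-side operators themselves. space_overhead is the main knob: lower values make the collector run more eagerly, trading CPU for a smaller heap, and the same parameter can also be set from the environment via OCAMLRUNPARAM (the o= setting).

(* Plain OCaml sketch of the GC tuning that the new scripting operators
   expose. Lowering space_overhead below its default makes collection more
   aggressive; Gc.compact additionally compacts the major heap. *)
let () =
  Gc.set { (Gc.get ()) with Gc.space_overhead = 20 };
  Gc.compact ()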

Before closing this, I'd like to do a little more research to see if we can provide users with a simpler wrapper that would allow simple configurations, such as making sure that the allocated memory never crosses a certain threshold.

We will see what is possible. We also want to release 2.0.3 soon, so this should be wrapped up shortly.

@toots
Member

toots commented Feb 7, 2022

There was a typo in the above; I meant to say:

There are parameters to control this behavior and they are now exported in the v2.0.3-preview and main branches as scripting operators. With some aggressive settings, I am able to limit the memory usage in my reproduction script.
We're waiting on confirmation that more aggressive settings fix the memory usage on AzuraCast rolling releases and then we should be good to go with 2.0.3!

@toots
Member

toots commented Feb 7, 2022

See: AzuraCast/AzuraCast#5010

@toots
Member

toots commented Feb 12, 2022

AzuraCast has confirmed that they are not seeing any errors on their end, so I'm going to mark this one as fixed with the 2.0.3 release.

@toots toots closed this as completed Feb 12, 2022