
webp-server uses too much memory #198

Closed · bhzhu203 opened this issue May 17, 2023 · 41 comments · Fixed by #222
Labels: bug (Something isn't working), enhancement (New feature or request)

Comments

@bhzhu203

At first it uses 1 GB of memory, but after 7 hours it grows to 6.7 GB and continues to grow slowly.

[screenshot]

@n0vad3v (Member) commented May 17, 2023

🤔 Thanks for reporting, we need to dig more into this issue. Are you using version 0.7.0?

Before that, you can try running it with docker-compose with limited resources allocated to it to mitigate this problem; more info can be found at https://docs.webp.sh/usage/docker/ and #75.

@bhzhu203 (Author)

So, how can the memory usage be improved further?

@n0vad3v (Member) commented May 20, 2023

Could you please try adding the MALLOC_ARENA_MAX=1 env before running the program?
Ref: https://github.com/davidbyttow/govips#memory-usage-note

In my test case, the RAM used after benchmarking with the method from #200 dropped to less than 100 MiB:

eb5eff3296d1   webp-server-webp-1   0.04%     95.56MiB / 45.89GiB   0.20%     54.6kB / 11.9MB   98.3kB / 28.6MB   20

My docker-compose.yml example:

version: '3'

services:
  webp:
    image: webpsh/webp-server-go
    environment:
      - MALLOC_ARENA_MAX=1
    volumes:
      - ./pics:/opt/pics
      - ./exhaust:/opt/exhaust
    ports:
      -  127.0.0.1:3333:3333
    command: '/usr/bin/webp-server --config /etc/config.json -v'

@Ingiboy commented May 20, 2023

Could you please have a try adding MALLOC_ARENA_MAX=1 env before running the program?

Yes, but it still eats 1-3 GB.
I run docker exec webp-server-go sh -c 'kill -HUP 1' every 15 minutes.

@n0vad3v (Member) commented May 21, 2023

@Ingiboy OK, that seems better.

By the way, you can use the docker-compose.yml described in https://docs.webp.sh/usage/docker/ with a resource limit so it will automatically restart when RAM reaches your limit, instead of restarting it manually; example below.

version: '3'

services:
  webp:
    image: webpsh/webp-server-go
    # image: ghcr.io/webp-sh/webp_server_go
    restart: always
    volumes:
      - ./path/to/pics:/opt/pics
      - ./path/to/exhaust:/opt/exhaust
    ports:
      -  127.0.0.1:3333:3333
    deploy:
      resources:
        limits:
          memory: 200M

Start it with docker-compose --compatibility up -d.

@Ingiboy commented May 22, 2023

When I set the deploy section with a memory limit of 512M, load average grows to 30-40, CPU iowait is 60-80%, and swap usage goes up to 2.5 GB.

Without 'deploy', load average stays below 1 and no swap is used.

@BennyThink (Member)

That is because the Docker engine will try to kill and then restart the container quite often, hence the high load average.

@n0vad3v can we consider adding back MaxCacheMem and MaxCacheSize? We can read environment variables so it's totally up to end users.

@Ingiboy commented May 22, 2023

That is because the Docker engine will try to kill and then restart the container quite often, hence the high load average.

No, 'docker ps' shows the container uptime is more than 6 hours. I think the memory limit only limits RAM, and the container eats swap.

@n0vad3v (Member) commented May 23, 2023

@Ingiboy
I've dug some more into https://docs.docker.com/config/containers/resource_constraints/#--memory-swap-details and found that:

If --memory-swap is unset, and --memory is set, the container can use as much swap as the --memory setting, if the host container has swap memory configured. For instance, if --memory="300m" and --memory-swap is not set, the container can use 600m in total of memory and swap.

So using the docker-compose.yml as above:

version: '3'

services:
  webp:
    image: webpsh/webp-server-go
    # image: ghcr.io/webp-sh/webp_server_go
    restart: always
    volumes:
      - ./path/to/pics:/opt/pics
      - ./path/to/exhaust:/opt/exhaust
    ports:
      -  127.0.0.1:3333:3333
    deploy:
      resources:
        limits:
          memory: 200M

The container can use 200M of RAM and 200M of swap, which can be confirmed by docker inspect:

"Memory": 209715200,
...
"MemorySwap": 419430400,

To limit swap usage, here is a working example (note I've added MALLOC_ARENA_MAX to env):

version: '3'

services:
  webp:
    image: webpsh/webp-server-go
    restart: always
    environment:
      - MALLOC_ARENA_MAX=1
    volumes:
      - ./path/to/pics:/opt/pics
      - ./path/to/exhaust:/opt/exhaust
    ports:
      -  127.0.0.1:3333:3333

    deploy:
      resources:
        limits:
          memory: 200M
    memswap_limit: 200M

In this case, the container can use no swap and only 200M of RAM, and will be restarted if RAM usage goes above 200M.

Example inspired by https://stackoverflow.com/questions/70095537/how-to-use-swap-memory-in-docker-compose; I have to say the structure is somewhat weird, not sure why it is designed like that.

Please give this a try; by the way, a RAM limit higher than 512 MB is suggested.

@Ingiboy commented May 23, 2023

@n0vad3v, oh, thanks! I'll try it.

@Ingiboy commented May 23, 2023

Hmm, it doesn't help.

version: "3.3"
services:
  webp-http:
    container_name: webp-http
    image: webpsh/webp-server-go:latest
    restart: unless-stopped
    user: 1001:1001
    cap_drop:
      - ALL
    environment:
      - MALLOC_ARENA_MAX=1
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./webp/config.json:/etc/config.json:ro
      - /var/tmp/webp-http/exhaust:/opt/exhaust
      - /var/tmp/webp-http/remote-raw:/opt/remote-raw
    ports:
     - 127.0.0.1:3335:3333
    deploy:
      resources:
        limits:
          memory: 196M
    memswap_limit: 196M

After some minutes, load average grows to 30-40, CPU iowait is about 80%, and webp uses more than 100% CPU.
I've temporarily reverted to the no-limit scheme with kill -HUP 1 every 15 minutes.

@n0vad3v (Member) commented May 23, 2023

@Ingiboy

Hmm, it doesn't help.

Seems weird; with memswap_limit set in the compose file above, my understanding is that it should not write to swap, yet you've described "CPU in wa state about 80%". Are your images large, or is your machine's disk quite slow? Or, if you're using a VPS, might this be caused by CPU overselling?

May I have some more info on your case:

  • What type of disk are you using? (HDD, SSD, NVMe SSD?) The disk type can have an impact on the overall performance, especially when dealing with I/O operations.
  • When "LA grows up to 30-40, CPU in wa state about 80%. webp utilizes CPU more than 100%", it's important to understand the underlying reasons. Could you provide more details about the number of requests being sent to the service and the sizes of the images involved? This information will help in identifying potential bottlenecks.
  • From your previous reply "Yes, but it eat 1-2-3Gb.", I'm assuming your machine has more than 3 GB of RAM. If you raise memory and memswap_limit to, say, 1024 MB, what will happen after running?

@Ingiboy commented May 23, 2023

@n0vad3v

  1. I use a VPS, no info about the disk; I think it's not an HDD.
  2. For example, 12000+ requests per hour:
    main.findSmallestFiles() - 4382,
    main.proxyHandler() - 5489,
    main.convertLog() - 2344,
    Average image size - 600 KB.
  3. Yes, it has 8 GB of RAM. I tried to set a 1 GB limit, but I saw a peak of 1.6 GB.

@Ingiboy commented May 24, 2023

Still not good. I tried setting memory=1G with memswap_limit=1G or memswap_limit=0. When memory usage is around a gigabyte, load average is about 40, CPU iowait about 80%, and so on.

@n0vad3v (Member) commented May 24, 2023

🤔 Kinda weird, as I still cannot reproduce your issue in my local env; when setting memory and memswap_limit, my container gets instantly killed when RAM exceeds that value, with some failed requests and almost no high CPU spikes.

Have you tried changing to another VPS provider or machine to run another test on a similar scenario?

@Ingiboy commented May 24, 2023

Maybe this helps:

#docker -v
Docker version 20.10.21, build 20.10.21-0ubuntu1~18.04.3
# lsb_release -a
Description:    Ubuntu 18.04.6 LTS

Unfortunately I can't change the server, because it's prod.

@Ingiboy commented May 24, 2023

Generally, it doesn't matter exactly how Docker is restarted. webp-server is a great tool, many thanks!
I'll wait until the memory leaks are fixed.

@BennyThink BennyThink added enhancement New feature or request bug Something isn't working labels May 25, 2023
@bugfest (Contributor) commented May 29, 2023

I'm experiencing a similar issue when too many requests hit the server:

4 vCPU, 8 GB of RAM box:

Out of memory: Killed process 4402 (webp-server) total-vm:10140532kB, anon-rss:743668>

I'm going to take a look at the vips setup.

@bugfest (Contributor) commented May 30, 2023

Using this fixed config for vips keeps my server from crashing (OOM killer) under load:

        vips.Startup(&vips.Config{
-               ConcurrencyLevel: runtime.NumCPU(),
+               ConcurrencyLevel: 1,
+               MaxCacheMem: 16 * 1024 * 1024,
+               MaxCacheSize: 16 * 1024 * 1024,
+               MaxCacheFiles: 128,
        })

But this is just a workaround; I'd say the real solution is to solve #75. I'm now evaluating the following goroutine pool implementations:

@n0vad3v (Member) commented May 30, 2023

Maybe we need to figure out how these params affect real performance; in PR #200 I've done some tests and said:

However we need more information on MaxCacheMem and MaxCacheSize before considering merging this PR.

With @BennyThink's reply:

vips_cache_set_max_mem() and vips_cache_get_max_mem(): These are functions that control the maximum amount of memory that libvips will use for caching. If a request to allocate more memory for caching comes in and it would cause libvips to exceed this limit, the cache is trimmed to stay under this limit. The size is in bytes.
vips_cache_set_max() and vips_cache_get_max(): These control the maximum number of operations that libvips will cache. If more operations are requested than this number, the least recently used operations are dropped from the cache. This is essentially a count of cached operations, not a size in bytes.
(by ChatGPT)

I mean, if the MaxCacheMem, MaxCacheSize, and MaxCacheFiles values above have no performance impact on WebP Server, maybe we should just hard-code them? (Maybe leave some env vars for users to adjust them if they wish.)

@n0vad3v (Member) commented May 30, 2023

I've done some benchmarking on some photos (239 photos, 2.4 GiB total) using prefetch.


Normal configuration:

	vips.Startup(&vips.Config{
		ConcurrencyLevel: runtime.NumCPU(),
	})
Prefetching... 100% |██████████████████████████████████████████████████████| (239/239, 11 it/s)         
Prefetch completeY(^_^)Y in 22.192475241s


bugfest configuration:

	vips.Startup(&vips.Config{
		ConcurrencyLevel: 1,
		MaxCacheMem:      16 * 1024 * 1024,
		MaxCacheSize:     16 * 1024 * 1024,
		MaxCacheFiles:    128,
	})
Prefetching... 100% |██████████████████████████████████████████████████████| (239/239, 12 it/s)         
Prefetch completeY(^_^)Y in 20.291148467s

However, prefetch by default uses all cores to convert, so it might not be the best benchmark case; still, RES usage shows that bugfest's configuration seems to do no harm to performance while reducing RAM usage.
If this is the case, I think we can just hard-code them into WebP Server Go. 🤔

@bugfest (Contributor) commented May 30, 2023

Sorry @n0vad3v, I didn't mention I use it in proxy mode with AVIF enabled. I think the issue is that each new HTTP request spins up a new goroutine, so memory consumption skyrockets, especially on first load, with no cached webp/avif images on disk.

@bugfest (Contributor) commented May 30, 2023

Also, I don't know if we should develop some parallel tests to benchmark edge cases here, like concurrent image conversion at the maximum size allowed by the image format (64k x 64k).

@Ingiboy commented May 30, 2023

Sorry @n0vad3v, I didn't mention I use it in proxy mode with AVIF enabled.

Yes, I use it in proxy mode too, but without AVIF, only WebP.

@n0vad3v (Member) commented May 30, 2023

Also, I don't know if we should develop some parallel tests to benchmark edge cases here, like concurrent image conversion at the maximum size allowed by the image format (64k x 64k).

Yeah, and I've been thinking of creating a test website for proxy mode testing, like generating 200+ images and hosting them on a website, but I have no idea about the image source (especially when dealing with licensing).

@bugfest (Contributor) commented May 30, 2023

What about generating completely random/noise images (https://github.com/mathisve/GoRandomNoiseImage) on the fly so that we can benchmark/measure the conversion when running in parallel?

@bhzhu203 (Author) commented May 31, 2023

export MALLOC_ARENA_MAX=1

With this, the memory is stable at 2.5 GB - 3 GB.

export MALLOC_ARENA_MAX=1
export LD_PRELOAD=/libjemalloc.so

With these, the memory is stable at 150 MB - 204.2 MB.

It seems jemalloc gives the best memory performance. Could webp-server be statically linked against the jemalloc lib?

[screenshot]

@n0vad3v (Member) commented May 31, 2023

export LD_PRELOAD=/libjemalloc.so

Nice catch! I've done some local testing and found that jemalloc indeed uses less RAM. However, will this have a negative impact on stability or performance?

I saw some related discussion at: lovell/sharp#955 (comment) and libvips/libvips#1064 (comment).

Currently I'm planning to add jemalloc and tcmalloc to the Dockerfile, and allow users to switch between them via an ENV.

@n0vad3v n0vad3v linked a pull request May 31, 2023 that will close this issue
@n0vad3v n0vad3v reopened this May 31, 2023
@n0vad3v (Member) commented May 31, 2023

We've released 0.8.4 (https://github.com/webp-sh/webp_server_go/releases/tag/0.8.4); this version allows users to choose a malloc implementation when using Docker. More info can be found at https://docs.webp.sh/usage/docker/

@Ingiboy commented Jun 1, 2023

In proxy mode on prod:

#docker-compose.yml
version: "3.3"

services:
  webp:
    container_name: webp
    image: webpsh/webp-server-go:0.8.4
    restart: unless-stopped
    user: 1001:1001
    cap_drop:
      - ALL
    environment:
      - MALLOC_ARENA_MAX=1
      - LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#      - LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4.5.6
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./webp/config.json:/etc/config.json:ro
      - /var/tmp/webp/exhaust:/opt/exhaust
      - /var/tmp/webp/remote-raw:/opt/remote-raw
    ports:
     - 127.0.0.1:3335:3333
#config.json
{
  "HOST": "0.0.0.0",
  "PORT": "3333",
  "QUALITY": "80",
  "MAX_JOB_COUNT": "4",
  "IMG_PATH": "http://127.0.0.1:8080",
  "EXHAUST_PATH": "./exhaust",
  "ALLOWED_TYPES": ["jpg","png","jpeg","bmp"],
  "ENABLE_AVIF": false
}

[screenshots]

@Ingiboy commented Jun 1, 2023

#Dockerfile
FROM alpine:3.18 as builder
RUN apk --update add --no-cache go vips-dev && \
    mkdir /build
COPY ./build/go.mod /build
RUN cd /build && \
    go mod download
COPY ./build /build
RUN cd /build && \
    go build -ldflags="-s -w" -o webp-server .

FROM alpine:3.18
RUN apk --update add --no-cache vips ca-certificates jemalloc
COPY --from=builder /build/webp-server  /usr/bin/webp-server
COPY --from=builder /build/config.json /etc/config.json
WORKDIR /opt
CMD ["/usr/bin/webp-server", "--config", "/etc/config.json"]
#docker-compose.yml
      - LD_PRELOAD=/usr/lib/libjemalloc.so.2

In docker stats I saw a short peak: CPU up to 300+% and RAM up to 1.4 GB, dropping back to 200 MB in 5-10 seconds.

[screenshot]

@bugfest (Contributor) commented Jun 2, 2023

I've been doing a lot of research and extended testing (including Go profiling). I still have issues with WebP+AVIF in a concurrent context with proxy mode.

With 70+ requests in parallel when I start without any local cache, the server has to convert all those images. What I think is happening is that we're not queuing requests to govips, so when those requests hit the server I get a peak of 100+ goroutines converting to both image types. This causes a physical memory consumption spike and then triggers the Docker or host-system OOM killer.

Current known workarounds:

  • Limit the number of requests hitting the server at once (I'd like to avoid this, as it does not really help the stability of webp_server_go itself)
  • Add more memory to the system
  • Disable AVIF conversion (so that the memory footprint is much lower)
  • Reduce the VIPS concurrency to 1

I think it'd be worth creating a priority queue to limit how many jobs we actually send to govips. That way we could add some more config parameters to control how many parallel jobs the host can handle, and even allow opportunistic conversion to AVIF when enough system resources are available, etc.

Any thoughts, @n0vad3v, @BennyThink?

@BennyThink (Member) commented Jun 2, 2023

  • Disable AVIF conversion (so that the memory footprint is much lower) ✅
  • Reduce the VIPS concurrency to 1 - maybe not 1 but 10? Does the ConcurrencyLevel config work?

@n0vad3v (Member) commented Jun 2, 2023

@bugfest I agree with the conversion queue plan, so users can control how many conversions can be triggered at one moment. You've mentioned:

I get a peak of 100+ goroutines converting to both image types

Maybe a simple way to limit this would be to limit the total goroutines in the program? (That could automatically queue the later requests.)

@bugfest (Contributor) commented Jun 2, 2023

* Reduce the VIPS concurrency to 1 - maybe not 1 but 10? Does `ConcurrencyLevel` [config](https://github.com/davidbyttow/govips/blob/c6838fceef8d93bca9e044187a412540b62c6090/vips/govips.go#L47) works?

@BennyThink Increasing the value, in my case, makes the server consume resources faster: #198 (comment)

@bugfest I agree with the conversion queue plan, so users can control how many conversions can be triggered at one moment

@n0vad3v Nice! I'll share a proposal as soon as I get it working.

I get a peak of 100+ goroutines converting to both image types

Maybe a simple way to limit this would be to limit the total goroutines in the program? (That could automatically queue the later requests.)

It also has an impact on the HTTP server performance itself, so I wouldn't go that way.

@bugfest (Contributor) commented Jun 8, 2023

Hi @n0vad3v, @BennyThink, I've prepared the async implementation as per my last comment. Opened PR #226.

@Ingiboy commented Jul 6, 2023

With Alpine + Docker, version 0.9.2 goes down every 3-5 minutes with:

2023/07/06 17:59:05 [VIPS.info] g_getenv( "PATH" ) == "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
2023/07/06 17:59:05 [VIPS.info] looking in "/usr/local/sbin" for "govips"
2023/07/06 17:59:05 [VIPS.info] looking in "/usr/local/bin" for "govips"
2023/07/06 17:59:05 [VIPS.info] looking in "/usr/sbin" for "govips"
2023/07/06 17:59:05 [VIPS.info] looking in "/usr/bin" for "govips"
2023/07/06 17:59:05 [VIPS.info] looking in "/sbin" for "govips"
2023/07/06 17:59:05 [VIPS.info] looking in "/bin" for "govips"
2023/07/06 17:59:05 [VIPS.info] trying for dir = "/opt/govips", name = "govips"
2023/07/06 17:59:05 [VIPS.info] canonicalised path = "/opt"
2023/07/06 17:59:05 [VIPS.info] VIPS_PREFIX = /usr
2023/07/06 17:59:05 [VIPS.info] VIPS_LIBDIR = /usr/lib
2023/07/06 17:59:05 [VIPS.info] prefix = /usr
2023/07/06 17:59:05 [VIPS.info] libdir = /usr/lib
2023/07/06 17:59:05 [VIPS.info] searching "/usr/lib/vips-modules-8.14"
2023/07/06 17:59:05 [VIPS.info] searching "/usr/lib/vips-plugins-8.14"
2023/07/06 17:59:05 [govips.info] vips 8.14.2 started with concurrency=4 cache_max_files=0 cache_max_mem=0 cache_max=0
2023/07/06 17:59:05 [govips.info] registered image type loader type=png
2023/07/06 17:59:05 [govips.info] registered image type loader type=svg
2023/07/06 17:59:05 [govips.info] registered image type loader type=tiff
2023/07/06 17:59:05 [govips.info] registered image type loader type=webp
2023/07/06 17:59:05 [govips.info] registered image type loader type=jpeg
2023/07/06 17:59:05 [govips.info] registered image type loader type=jp2k
2023/07/06 17:59:05 [govips.info] registered image type loader type=gif
fatal error: concurrent map read and map write
...many lines

How can I fix it? Thanks.

Full error output: https://pastebin.com/BGhEBVq2

@n0vad3v (Member) commented Jul 6, 2023

@Ingiboy Hmmm, that's weird, but thanks for reporting; we're investigating this, probably something wrong with the code.

In the meantime, could you please try our official Docker image to see if this issue reproduces?

Example usage can be found at https://docs.webp.sh/usage/docker/

@n0vad3v (Member) commented Jul 6, 2023

@Ingiboy We think we've found the problem; please try version 0.9.3 to see if it solves it.

@Ingiboy commented Jul 6, 2023

@n0vad3v 30 minutes live without errors; it looks like it works.
Many thanks!

@BennyThink (Member)

Since users can use jemalloc or tcmalloc to mitigate the OOM issue, consider this finished.
