*127975840 [lua] responses.lua:107: do_authentication(): failed to get from node cache: could not write to lua_shared_dict: no memory #3105

Closed
Zeous9 opened this issue Dec 18, 2017 · 13 comments

Comments

Zeous9 commented Dec 18, 2017


Summary

Using the default settings results in a "no memory" error.

Steps To Reproduce

1. Create 4 APIs.
2. Create 100 consumers.
3. Have the consumers access those APIs.
4. The following error appears:
*127975840 [lua] responses.lua:107: do_authentication(): failed to get from node cache: could not write to lua_shared_dict: no memory

Additional Details & Logs

  • Kong version: 0.11.0
  • Operating system: Linux (Ubuntu 16.04 LTS)
kikito (Member) commented Dec 18, 2017

Hello and thanks for reporting this.

That message seems to indicate that you need to increase the value of the mem_cache_size variable. It's 128 MB by default.
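
For example, that setting can be raised in kong.conf, or via the corresponding environment variable when starting Kong; the 256m value below is only an illustrative choice, not a recommendation:

    # kong.conf -- in-memory cache size for database entities (default: 128m)
    mem_cache_size = 256m

    # or equivalently via the environment:
    # KONG_MEM_CACHE_SIZE=256m kong start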

kikito closed this as completed Dec 18, 2017
thibaultcha (Member) commented:

Because we do not use the "safe" setter in our shdict cache, we should expect the shdict to evict older items via its LRU mechanism, so I don't think this is related to the size of mem_cache_size.

It seems to me @Zeous9 like the NGINX slab allocator failed to allocate more pages to the shared memory zone, and you might be running into this code path. Are you running NGINX in some peculiar environment (memory limitations, containerized, or anything of the sort)?
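
For context, here is a minimal OpenResty sketch of the difference between the two setters; the dict name kong_cache is assumed purely for illustration and is not necessarily the zone name Kong uses:

    -- sketch only: assumes `lua_shared_dict kong_cache 128m;` is declared
    local dict = ngx.shared.kong_cache
    local value = string.rep("x", 1024) -- arbitrary payload

    -- set(): when the zone is full, least-recently-used items are evicted to
    -- make room; forcible == true means valid items were removed. It can still
    -- fail with "no memory" if, even after evicting tens of items, the slab
    -- allocator cannot find a large enough contiguous block.
    local ok, err, forcible = dict:set("some_key", value)

    -- safe_set(): never evicts valid items; it simply fails with "no memory"
    -- when the zone is full.
    local ok2, err2 = dict:safe_set("some_key", value)
    if not ok2 and err2 == "no memory" then
        ngx.log(ngx.ERR, "shared dict is full and nothing was evicted")
    end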

thibaultcha (Member) commented:

@Zeous9 Another possibility is that you may have a single value that is too large (larger than mem_cache_size), which prevents the LRU eviction mechanism from kicking in altogether because evicting items would be useless in that case. Do you have any such value? Maybe a large text field in some of these entities?

Could you also share the list of plugins you are using? That'd be great!

derrley commented Jan 8, 2018

We've also just run into this issue. We are running Kong in a containerized environment, but it appears to be living within its means just fine (screenshot attached). We run with mem_cache_size at its default value.

[screenshot: Kong memory usage graph, 2018-01-08 12:46 PM]

derrley commented Jan 8, 2018

plugin configuration:

    "plugins": {
        "available_on_server": {
            "acl": true,
            "aws-lambda": true,
            "basic-auth": true,
            "bot-detection": true,
            "correlation-id": true,
            "cors": true,
            "datadog": true,
            "file-log": true,
            "galileo": true,
            "hmac-auth": true,
            "http-log": true,
            "ip-restriction": true,
            "jwt": true,
            "key-auth": true,
            "ldap-auth": true,
            "loggly": true,
            "oauth2": true,
            "rate-limiting": true,
            "request-size-limiting": true,
            "request-termination": true,
            "request-transformer": true,
            "response-ratelimiting": true,
            "response-transformer": true,
            "runscope": true,
            "statsd": true,
            "syslog": true,
            "tcp-log": true,
            "udp-log": true
        },
        "enabled_in_cluster": [
            "key-auth",
            "acl",
            "rate-limiting"
        ]
    },

This is a test environment, so there are lots of consumers coming and going as test users are added and removed; the consumer count is currently "total": 39477.

thibaultcha (Member) commented:

@derrley We recently hit this issue internally as well, and we found out that in our case it would happen when our VMs' memory was full, preventing NGINX from allocating any additional memory, as I suspected in my comment above. Could you investigate this possibility on your side as well? I am having a hard time making sense of the graph you posted, or what it is supposed to show.
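
If it helps, a couple of standard (non-Kong-specific) checks on the host or inside the container:

    # overall memory usage
    free -m

    # any OOM-killer activity or allocation failures in the kernel log
    dmesg | grep -i -E 'out of memory|oom|killed process'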

derrley commented Jan 8, 2018

@thibaultcha The Kong pod is holding steady at 549M out of its allocated 576M. We'll try increasing the limit to see if that resolves the issue.
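
(Assuming the pod is managed by Kubernetes, raising the limit amounts to bumping the container's resources block; the values below are illustrative only.)

    # fragment of the container spec; values are placeholders
    resources:
      requests:
        memory: "512Mi"
      limits:
        memory: "1Gi"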

jeremyjpj0916 (Contributor) commented:

Is this related: #3124? I see you are using rate limiting as well.

derrley commented Jan 8, 2018

@jeremyjpj0916 -- I'm not sure. We only use one worker process. Does that mitigate the leak issue?

jeremyjpj0916 (Contributor) commented:

I am curious whether your problem disappears if you temporarily turn off rate limiting.

derrley commented Jan 8, 2018

@jeremyjpj0916 I've already lifted Kong's memory limit, which forced Kong to redeploy; that redeploy itself massively reduced its memory consumption. If the problem reproduces in our test environment, I will turn off rate limiting and see if that fixes it.

derrley commented Jan 8, 2018

@jeremyjpj0916 Is the bug only in the shared-memory rate limiting? If we used Redis-backed rate limiting, would that fix the issue?
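
(For reference, switching the existing rate-limiting plugin to the redis policy would look roughly like the Admin API call below; the plugin id and Redis address are placeholders.)

    # <plugin-id> and <redis-host> are placeholders
    curl -X PATCH http://localhost:8001/plugins/<plugin-id> \
      --data "config.policy=redis" \
      --data "config.redis_host=<redis-host>" \
      --data "config.redis_port=6379"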

Tieske (Member) commented Jan 9, 2018

@derrley Shared memory is used for many purposes, so even if you temporarily get away with using another rate-limiting mechanism, the issue will return sooner or later in another spot.

As such, I'd rather not classify it as a 'bug' just yet, when it might simply be an out-of-memory issue.
