Snimpy Memory Leak #33
Comments
Memory leaks are expected, as I have only used Snimpy in one-off scripts. I will look into them and fix them. Thanks for reporting! I see that you noticed another bug; don't hesitate to open an issue for it. |
Just before
import snimpy.mib
snimpy.mib._smi.smiExit()
This should remove any potential leaks from libsmi (but won't solve your problem). What is the lifetime of your Celery tasks? Is it like enclosing your code in a for loop? Or is it doing something like reloading modules? |
Thank you for your responsiveness.
from __future__ import absolute_import
from celery import shared_task
from celery.task import task
from django.core.management.base import BaseCommand, CommandError
from celery.decorators import periodic_task
from celery.schedules import crontab
import platform
import socket
import time
import sys
import os
import requests
import json
from django.conf import settings
from snimpy.manager import Manager as M
from snimpy.manager import load
from snimpy.snmp import SNMPNoSuchInstance
from snimpy.snmp import SNMPException
from snimpy.mib import SMIException
from socket import gaierror
from pyasn1.type.error import ValueConstraintError
import os
import traceback
from masac.feed.models import CiscoEquipment
from masac.feed.models import Routers
from masac.feed.models import Switches
import pickle
from copy import deepcopy
from copy import copy
import sqlite3 as lite
import inspect
from os.path import expanduser

LOG_DEBUG = settings.LOG_DEBUG
CARBON_SERVER = settings.CARBON_SERVER # Server where Cyanite is running
CARBON_PORT = settings.CARBON_PORT
SDSL_TIMEOUT = settings.SDSL_TIMEOUT
SWITCH_TIMEOUT = settings.SWITCH_TIMEOUT
ROUTER_TIMEOUT = settings.ROUTER_TIMEOUT

@periodic_task(run_every=(crontab(hour="*", minute="*/3", day_of_week="*")), queue='masac.switch', options={'queue': 'masac.switch', 'routing_key' : 'masac.switch'})
def send_switch_metrics():
    try:
        node = platform.node().replace('.', '-')
        timestamp = int(time.time())
        switch_list = Switches.objects.filter(marque='switch', banned=0)
        DELAY = 1
        #sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock = socket.socket()
        sock.connect((CARBON_SERVER, CARBON_PORT))
        message = ''
        mib_path = os.path.join(settings.BASE_DIR, 'scripts', 'mib')
        load(os.path.join(mib_path, 'SNMPv2-SMI.txt'))
        load(os.path.join(mib_path, 'SNMPv2-TC.txt'))
        load(os.path.join(mib_path, 'SNMPv2-CONF.txt'))
        load(os.path.join(mib_path, 'SNMPv2-MIB.txt'))
        load(os.path.join(mib_path, 'IANAifType-MIB.txt'))
        load(os.path.join(mib_path, 'IF-MIB.txt'))
        list_count = len(switch_list)
        if list_count < 1:
            print("Warning: List of switches is empty. Re-feed....")
            # get_switches()
        for device in switch_list:
            hostname = str(device.login)
            ip = str(device.ip)
            comm = str(device.snmp_community)
            if LOG_DEBUG == True:
                print("Testing Switch: " + hostname + " " + ip + " " + comm)
            try:
                m = M(host=hostname, retries=0, community=comm, version=2, timeout=SWITCH_TIMEOUT, cache=False)
            except gaierror as ge:
                # print(traceback.format_exc())
                if LOG_DEBUG == True:
                    print("Snimpy Manager Exception: gaierror")
            metrics = {}
            try:
                for index in m.ifDescr:
                    metrics[m.ifDescr[index]] = {}
                    if "thernet" in str(m.ifDescr[index]) and "unrouted" not in str(m.ifDescr[index]) and "Vlan" not in str(m.ifDescr[index]) and "Null" not in str(m.ifDescr[index]):
                        metrics[m.ifDescr[index]]['ifHCOutOctets'] = m.ifHCOutOctets[index]
                        metrics[m.ifDescr[index]]['ifHCInOctets'] = m.ifHCInOctets[index]
                        # print(index, repr(m.ifDescr[index]), repr(m.ifHCOutOctets[index]), repr(m.ifHCInOctets[index]))
            except SNMPNoSuchInstance as nie:
                # print(traceback.format_exc())
                print("SNMPNoSuchInstance for " + hostname + " " + ip + " " + comm)
            except SNMPException as snmpe:
                Switches.objects.filter(login=hostname, marque='switch').update(banned=1)
                print("SNMPException for " + hostname + " " + ip + " " + comm + " | List Count: " + str(list_count))
                # print(traceback.format_exc())
                pass
            except ValueConstraintError as e:
                print("ValueConstraintError for " + hostname + " " + ip + " " + comm)
                # print(traceback.format_exc())
            lines= []
            for if_descr in list(metrics.keys()):
                for key in list(metrics[if_descr].keys()):
                    line = 'masac.switch.%s.%s.%s %d %d' % (hostname.replace(".","_"), if_descr.replace("/","_").replace(" ","_"), key, metrics[if_descr][key], timestamp)
                    lines.append(line)
            message = '\n'.join(lines) + '\n'
            if LOG_DEBUG == True:
                print("MESSAGE for " + hostname + " : " + message)
            sock.sendall(message.encode('UTF-8'))
        sock.close()
        time.sleep(DELAY)
    except Exception as e:
        print(traceback.format_exc())
|
Hello @vincentbernat Any news about this issue? |
I think the valgrind output is misleading and the memory leak doesn't happen in the CFFI module. I have fixed a small memory leak in the CFFI module, but I think this is not related to what you are describing. I still need to find some time to check the issue. I suppose that the issue could be tested by running the first snippet you provided in a loop and using some tool to find the memory leak. |
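As a rough illustration of the "run it in a loop and measure" idea above, here is a minimal sketch that tracks the peak resident set size of the process across repeated calls. It is not taken from Snimpy; `peak_rss_kib`, `measure_growth`, and the placeholder `workload` are hypothetical names, and it assumes a Unix system where the standard `resource` module is available (`ru_maxrss` is reported in KiB on Linux and in bytes on macOS). Unlike a Python-level tracer, a process-level figure also reflects allocations made by C libraries.

```python
import gc
import resource


def peak_rss_kib():
    """Return the peak resident set size of this process (KiB on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_growth(func, iterations=1000, warmup=10):
    """Call func repeatedly and return how much peak RSS grew after warm-up."""
    # Warm-up calls absorb one-time costs (imports, caches, MIB loading).
    for _ in range(warmup):
        func()
    gc.collect()
    before = peak_rss_kib()
    for _ in range(iterations):
        func()
    gc.collect()
    return peak_rss_kib() - before


if __name__ == "__main__":
    # Placeholder workload (assumption); replace it with the function
    # that polls a device.
    def workload():
        return [0] * 1000

    print("peak RSS growth:", measure_growth(workload))
```

A growth figure that keeps increasing with `iterations` suggests a leak somewhere in the process; a figure that stays flat after warm-up suggests the per-call memory is being released.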
The initialization of such an object is expensive (so Snimpy is slow) and seems to trigger a memory leak. Just use a singleton instead. Related to #33.
I have pushed a new commit that should fix the problem. For the record, here is the code I have used to "trace" memory leaks.
#!/usr/bin/python3
from snimpy.manager import Manager as M
from snimpy.manager import load
import tracemalloc
import gc

def sdsl_test(hostname, community):
    load('IF-MIB')
    load('SNMPv2-MIB')
    m = M(host=hostname, retries=0, community=community, version=2,
          timeout=1, cache=0)
    uptime = m.sysUpTime
    print(uptime)
    for ix in m.ifDescr:
        desc = m.ifDescr[ix]
        metric = m.ifHCInOctets[ix]
        print(desc + " " + str(metric))

tracemalloc.start()
sdsl_test('127.0.0.1', 'public')
gc.collect()
snapshot1 = tracemalloc.take_snapshot()
for i in range(1000):
    sdsl_test('127.0.0.1', 'public')
gc.collect()
snapshot2 = tracemalloc.take_snapshot()
top_stats = snapshot2.compare_to(snapshot1, 'lineno')
print("[ Top 10 differences ]")
for stat in top_stats[:10]:
    print(stat)
|
Ok great. I am going to test again. |
I didn't do a release, in case there is something else to fix. You can also just apply the patch manually. It is pretty simple: cce0501 |
Due to #35, I have reverted the fix. I have asked for help in the PySNMP mailing list and hopefully will come with a proper fix. Otherwise, I'll try to hack around. |
I forgot about this bug. The current situation with 0.8.8 is that only one command generator per thread will be used for SNMPv2. This is not the case with SNMPv3 as the command generator cannot be reused between sessions in this case. So, you shouldn't get a memory leak with 0.8.8 either in your case. |
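To illustrate the "one command generator per thread" idea in general terms, here is a minimal sketch using `threading.local`. It is not Snimpy's actual code: `ExpensiveGenerator` and `get_generator` are hypothetical stand-ins for an object that is costly to build and the accessor that reuses it.

```python
import threading

# Each thread sees its own attributes on this object.
_local = threading.local()


class ExpensiveGenerator:
    """Hypothetical stand-in for an object that is costly to build."""

    instances = 0
    _lock = threading.Lock()

    def __init__(self):
        # Count constructions so reuse can be observed.
        with ExpensiveGenerator._lock:
            ExpensiveGenerator.instances += 1


def get_generator():
    """Return the calling thread's generator, creating it on first use."""
    gen = getattr(_local, "generator", None)
    if gen is None:
        gen = ExpensiveGenerator()
        _local.generator = gen
    return gen
```

With this pattern, repeated calls from the same thread return the same object, so the construction cost (and any memory it retains) is paid once per thread rather than once per session.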
That is excellent news. I am gonna put that to test in my next iteration. Thanks for the great job! |
When Snimpy is used in Celery tasks, the server's RAM consumption grows steadily until the kernel starts killing Celery worker processes (OOM).
After running Valgrind against a simple Snimpy script, a memory leak is detected:
If needed, I can provide the code of the Celery task using Snimpy. Basically, the task loops over a list of network routers, fetches metrics, and sends them to a server.