Reported on the frictionlessdata Gitter channel. The symptom is that jobs created via the API (e.g. on try.goodtables.io or openspending.org/packager/provide-data) never finish, and the client keeps polling for an update until it gets the following error:
Error: Reached Maximum Attempts
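For context, this is the polling pattern on the client side that eventually surfaces that error; a minimal sketch, where the endpoint path, attempt budget, and interval are illustrative rather than taken from the actual client code:

import time
import requests

API_BASE = "https://goodtables.io/api"  # illustrative base URL
MAX_ATTEMPTS = 60                       # hypothetical attempt budget
POLL_INTERVAL = 5                       # seconds between polls

def wait_for_job(job_id):
    # Poll the job status until the worker finishes it or we give up.
    for _ in range(MAX_ATTEMPTS):
        resp = requests.get("{}/job/{}".format(API_BASE, job_id))
        status = resp.json().get("status")
        if status in ("success", "failure", "error"):
            return status
        time.sleep(POLL_INTERVAL)
    # If the worker never picks the job up (the bug reported here),
    # the budget runs out and the client reports this error:
    raise RuntimeError("Error: Reached Maximum Attempts")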
Looking at the pods and the logs, the following output from the worker container points to the problem:
kubectl -n production logs goodtables-io-production-xxxxx-yyyy -c goodtables-worker -f
[2019-07-26 09:49:37,169: ERROR/MainProcess] Process 'ForkPoolWorker-1' pid:19 exited with 'signal 9 (SIGKILL)'
[2019-07-26 09:49:37,184: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL).',)
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/billiard/pool.py", line 1223, in mark_as_worker_lost
human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL).
[2019-07-26 09:50:34,453: ERROR/ForkPoolWorker-6] There was an error setting the GitHub status: https://api.github.com/repos/bzg/data-codes-sources-fr/statuses/42f134bb64848b2df3e0318822805fc24b45afd1 {"message":"Not Found","documentation_url":"https://developer.github.com/v3/repos/statuses/#create-a-status"} 406 success
[2019-07-26 09:52:53,348: ERROR/MainProcess] Process 'ForkPoolWorker-6' pid:566 exited with 'signal 9 (SIGKILL)'
[2019-07-26 09:52:53,362: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL).',)
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/billiard/pool.py", line 1223, in mark_as_worker_lost
human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL).
[2019-07-26 10:17:58,438: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 129, in _read_from_socket
raise socket.error(SERVER_CLOSED_CONNECTION_ERROR)
OSError: Connection closed by server.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/redis/client.py", line 2165, in _execute
return command(*args)
File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 577, in read_response
response = self._parser.read_response()
File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 238, in read_response
response = self._buffer.readline()
File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 168, in readline
self._read_from_socket()
File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 143, in _read_from_socket
(e.args,))
redis.exceptions.ConnectionError: Error while reading from socket: ('Connection closed by server.',)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 439, in connect
sock = self._connect()
File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 494, in _connect
raise err
File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 482, in _connect
sock.connect(socket_address)
TimeoutError: [Errno 110] Operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", line 318, in start
blueprint.start(self)
File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", line 584, in start
c.loop(*c.loop_args())
File "/usr/local/lib/python3.5/site-packages/celery/worker/loops.py", line 88, in asynloop
next(loop)
File "/usr/local/lib/python3.5/site-packages/kombu/async/hub.py", line 354, in create_loop
cb(*cbargs)
File "/usr/local/lib/python3.5/site-packages/kombu/transport/redis.py", line 1040, in on_readable
self.cycle.on_readable(fileno)
File "/usr/local/lib/python3.5/site-packages/kombu/transport/redis.py", line 337, in on_readable
chan.handlers[type]()
File "/usr/local/lib/python3.5/site-packages/kombu/transport/redis.py", line 667, in _receive
ret.append(self._receive_one(c))
File "/usr/local/lib/python3.5/site-packages/kombu/transport/redis.py", line 678, in _receive_one
response = c.parse_response()
File "/usr/local/lib/python3.5/site-packages/redis/client.py", line 2183, in parse_response
return self._execute(connection, connection.read_response)
File "/usr/local/lib/python3.5/site-packages/redis/client.py", line 2172, in _execute
connection.connect()
File "/usr/local/lib/python3.5/site-packages/redis/connection.py", line 442, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 110 connecting to goodtables-io-production-redis-master:6379. Operation timed out.
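A note on the SIGKILLs above: signal 9 is never raised by Python itself, so it almost always means the kernel OOM killer or the container runtime terminated the worker for exceeding its memory limit. One way to confirm is to inspect the container's last termination state (pod name elided as in the logs):

kubectl -n production describe pod goodtables-io-production-xxxxx-yyyy
kubectl -n production get pod goodtables-io-production-xxxxx-yyyy -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'

If the worker container was killed for memory, the terminated reason reads OOMKilled, and raising the pod's memory limit (or lowering the Celery concurrency) is the usual remedy.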
Prior to this there are some errors about files not being found; I'm not sure if they are related:
[2019-07-26 09:41:28,823: ERROR/ForkPoolWorker-4] Signal handler <function task_postrun at 0x7f6dfbabbf28> raised: FileNotFoundError(2, 'No such file or directory')
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/celery/utils/dispatch/signal.py", line 167, in send
response = receiver(signal=self, sender=sender, **named)
File "/srv/app/goodtablesio/signals.py", line 40, in task_postrun
os.remove(path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmptduxzzma/invalid.csv'
[2019-07-26 09:41:28,835: ERROR/ForkPoolWorker-4] ["Signal handler <function task_postrun at 0x7f6dfbabbf28> raised: FileNotFoundError(2, 'No such file or directory')", ' File "celery/utils/dispatch/signal.py", line 167, in send', ' File "goodtablesio/signals.py", line 40, in task_postrun']
[2019-07-26 09:41:28,835: WARNING/ForkPoolWorker-4] b'["Signal handler <function task_postrun at 0x7f6dfbabbf28> raised: FileNotFoundError(2, \'No such file or directory\')", \' File "celery/utils/dispatch/signal.py", line 167, in send\', \' File "goodtablesio/signals.py", line 40, in task_postrun\']'
[2019-07-26 09:41:28,919: ERROR/ForkPoolWorker-1] Sentry responded with an API error: RateLimited(Creation of this event was denied due to rate limiting)
[2019-07-26 09:41:28,919: WARNING/ForkPoolWorker-1] b'Sentry responded with an API error: RateLimited(Creation of this event was denied due to rate limiting)'
[2019-07-26 09:41:28,920: ERROR/ForkPoolWorker-1] ["Signal handler <function task_postrun at 0x7f6dfbabbf28> raised: FileNotFoundError(2, 'No such file or directory')", ' File "celery/utils/dispatch/signal.py", line 167, in send', ' File "goodtablesio/signals.py", line 40, in task_postrun']
[2019-07-26 09:41:28,920: WARNING/ForkPoolWorker-1] b'["Signal handler <function task_postrun at 0x7f6dfbabbf28> raised: FileNotFoundError(2, \'No such file or directory\')", \' File "celery/utils/dispatch/signal.py", line 167, in send\', \' File "goodtablesio/signals.py", line 40, in task_postrun\']'
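The FileNotFoundError itself looks benign: the task_postrun handler in goodtablesio/signals.py unconditionally calls os.remove(path) on a temp file that can already be gone. A minimal sketch of a defensive version, assuming a hypothetical _tmp_path_for() helper in place of whatever lookup the real handler does:

import os
from celery import signals

@signals.task_postrun.connect
def task_postrun(task_id=None, **kwargs):
    path = _tmp_path_for(task_id)  # hypothetical: resolve the job's temp file
    try:
        os.remove(path)
    except FileNotFoundError:
        # The file may already have been cleaned up by an earlier run or a
        # sibling worker; removing it twice should not crash the handler.
        pass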
Update: the issue seems to be fixed. I restarted both pods (the app + worker one and the Redis one) using the command in #317.
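For reference, the restart amounts to deleting the pods so the deployments recreate them (the exact command is documented in #317; the pod names below follow the elided pattern from the logs and are placeholders):

kubectl -n production delete pod goodtables-io-production-xxxxx-yyyy
kubectl -n production delete pod goodtables-io-production-redis-xxxxx-yyyy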