With django-apscheduler, some jobs are never executed due to connection timeout #13

Closed
WesleyBlancoYuan opened this issue Feb 1, 2018 · 9 comments


@WesleyBlancoYuan

I find that if I keep APScheduler busy, i.e., keep adding jobs to it, it stays happy and active. But if I leave it without jobs for a while, say 12 hours or more, and then add a job again, it accepts the job but does not execute it when the run_date arrives, even though it is running (scheduler.state returns 1, which is RUNNING, when I query it through a web service). I have configured django-apscheduler as the jobstore.

I am using APScheduler along with Apache and Django.

Apache/2.4.6 (CentOS) PHP/5.4.16 mod_wsgi/3.4 Python/2.7.5
APScheduler (3.4.0)
django-apscheduler (0.2.3)

Yesterday at 10:00 AM I scheduled three jobs: job A at 23:30 yesterday, job B at 09:00 today, and job C at 09:50 today. After that I left the server alone; no more jobs were added between 10:00 and 23:30 yesterday.

Checking yesterday's logs, this error happened at 23:30:

[Wed Jan 31 23:30:00.010202 2018] [:error] [pid 17924] DEBUG:apscheduler.scheduler:Looking for jobs to run
[Wed Jan 31 23:30:00.019000 2018] [:error] [pid 17924] ERROR:root:
[Wed Jan 31 23:30:00.019013 2018] [:error] [pid 17924] Traceback (most recent call last):
[Wed Jan 31 23:30:00.019031 2018] [:error] [pid 17924]   File "/usr/lib/python2.7/site-packages/django_apscheduler/jobstores.py", line 40, in get_due_jobs
[Wed Jan 31 23:30:00.019034 2018] [:error] [pid 17924]     return self._get_jobs(next_run_time__lte=serialize_dt(now))
[Wed Jan 31 23:30:00.019036 2018] [:error] [pid 17924]   File "/usr/lib/python2.7/site-packages/django_apscheduler/jobstores.py", line 100, in _get_jobs
[Wed Jan 31 23:30:00.019038 2018] [:error] [pid 17924]     for job_id, job_state in job_states:
[Wed Jan 31 23:30:00.019040 2018] [:error] [pid 17924]   File "/usr/lib64/python2.7/site-packages/django/db/models/query.py", line 250, in __iter__
[Wed Jan 31 23:30:00.019042 2018] [:error] [pid 17924]     self._fetch_all()
[Wed Jan 31 23:30:00.019044 2018] [:error] [pid 17924]   File "/usr/lib64/python2.7/site-packages/django/db/models/query.py", line 1118, in _fetch_all
[Wed Jan 31 23:30:00.019046 2018] [:error] [pid 17924]     self._result_cache = list(self._iterable_class(self))
[Wed Jan 31 23:30:00.019048 2018] [:error] [pid 17924]   File "/usr/lib64/python2.7/site-packages/django/db/models/query.py", line 122, in __iter__
[Wed Jan 31 23:30:00.019050 2018] [:error] [pid 17924]     for row in compiler.results_iter():
[Wed Jan 31 23:30:00.019052 2018] [:error] [pid 17924]   File "/usr/lib64/python2.7/site-packages/django/db/models/sql/compiler.py", line 828, in results_iter
[Wed Jan 31 23:30:00.019054 2018] [:error] [pid 17924]     results = self.execute_sql(MULTI)
[Wed Jan 31 23:30:00.019056 2018] [:error] [pid 17924]   File "/usr/lib64/python2.7/site-packages/django/db/models/sql/compiler.py", line 886, in execute_sql
[Wed Jan 31 23:30:00.019058 2018] [:error] [pid 17924]     raise original_exception
[Wed Jan 31 23:30:00.019060 2018] [:error] [pid 17924] OperationalError: (2006, 'MySQL server has gone away')
[Wed Jan 31 23:30:00.020039 2018] [:error] [pid 17924] Exception in thread APScheduler:
[Wed Jan 31 23:30:00.020050 2018] [:error] [pid 17924] Traceback (most recent call last):
[Wed Jan 31 23:30:00.020053 2018] [:error] [pid 17924]   File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
[Wed Jan 31 23:30:00.020055 2018] [:error] [pid 17924]     self.run()
[Wed Jan 31 23:30:00.020057 2018] [:error] [pid 17924]   File "/usr/lib64/python2.7/threading.py", line 764, in run
[Wed Jan 31 23:30:00.020059 2018] [:error] [pid 17924]     self.__target(*self.__args, **self.__kwargs)
[Wed Jan 31 23:30:00.020061 2018] [:error] [pid 17924]   File "/usr/lib64/python2.7/site-packages/apscheduler/schedulers/blocking.py", line 30, in _main_loop
[Wed Jan 31 23:30:00.020063 2018] [:error] [pid 17924]     wait_seconds = self._process_jobs()
[Wed Jan 31 23:30:00.020065 2018] [:error] [pid 17924]   File "/usr/lib64/python2.7/site-packages/apscheduler/schedulers/base.py", line 943, in _process_jobs
[Wed Jan 31 23:30:00.020067 2018] [:error] [pid 17924]     for job in due_jobs:
[Wed Jan 31 23:30:00.020069 2018] [:error] [pid 17924] TypeError: 'NoneType' object is not iterable
[Wed Jan 31 23:30:00.020071 2018] [:error] [pid 17924]

Then, at 09:00 and 09:50, this error does not appear in the log, and the jobs were not executed.

I think django-apscheduler should wake the DB connection before it executes a job, because at that moment we cannot be sure whether the connection is still alive.
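
The log above suggests why the later jobs were skipped silently: the job store call seems to have logged the database error and returned None, the scheduler loop then failed on `for job in due_jobs`, and the APScheduler thread exited while the state still read RUNNING. Here is a minimal stdlib-only sketch of that failure mode. The names `swallow_db_errors`, `get_due_jobs`, `main_loop` and `state` are hypothetical stand-ins, not the real django-apscheduler or APScheduler code.

```python
import logging
import threading

logger = logging.getLogger("sketch")


def swallow_db_errors(func):
    """Hypothetical wrapper: log the error, then implicitly return None."""
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ConnectionError:
            logger.exception("database error")
    return wrapper


@swallow_db_errors
def get_due_jobs():
    # Stand-in for a job store query that hits a dead connection.
    raise ConnectionError("(2006, 'MySQL server has gone away')")


# Stand-in for the scheduler state flag; nothing below ever resets it.
state = {"running": True, "error": None}


def main_loop():
    try:
        for job in get_due_jobs():  # get_due_jobs() returned None
            job()
    except TypeError as exc:
        # Recorded here only so it can be inspected; an uncaught error
        # would end the thread in the same way.
        state["error"] = exc


worker = threading.Thread(target=main_loop)
worker.start()
worker.join()

print(type(state["error"]).__name__, state["running"], worker.is_alive())
```

After the thread has exited, the flag still says running, so new jobs are accepted but nothing is left to execute them.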

@sallyruthstruik
Collaborator

Thanks for the very detailed explanation. I'm sure the problem is about MySQL connection handling: MySQL closes connections that sleep for a long time. I'll take a look today or tomorrow and try to solve it.

sallyruthstruik added a commit that referenced this issue Feb 1, 2018
@sallyruthstruik
Collaborator

Fixed, release 0.2.5 is available on PyPI: https://github.com/jarekwg/django-apscheduler/releases/tag/0.2.5

Please check, and if everything is OK I'll close the issue.

@WesleyBlancoYuan
Author

You are welcome.

By the way, you can reproduce the issue by:

  1. Create a scheduler and add DjangoJobStore, and start it.
  2. Add a job which will run in 5 minutes.
  3. Restart the DB service. (systemctl stop/start mariadb, etc.)
  4. Wait until the 5 minutes have passed.

You may see the same error.

Another thing: I agree that MySQL may close the connection, but raising the timeout on the MySQL side may not be a solution, because I may schedule a job 7 days ahead, and that is too long for a timeout.

@sallyruthstruik
Collaborator

I've just added handling for the "MySQL server has gone away" exception. Please update to 0.2.5 and check whether the exception is gone, and give me feedback so that I can close the issue.

@WesleyBlancoYuan
Author

Wow, fixed! What did you add in the code? I see a try/except, but do you reconnect to the DB in the except block?
You can close the issue. Thanks!

@sallyruthstruik
Collaborator

sallyruthstruik commented Feb 5, 2018

It is a slightly tricky solution: on every access to the DjangoJob queryset I try to ping the database (https://github.com/jarekwg/django-apscheduler/blob/master/django_apscheduler/models.py#L17). If the ping fails, I reconnect.

Unfortunately, it works only for DjangoJob queries. But you can use this Manager class for any model you want to protect from the "MySQL server has gone away" error. I don't know an elegant way in Django to beat this error in long-running scripts, only hand-written try-except blocks with a reconnect. It is my pain point with Django, because I mostly use MySQL.
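
The ping-then-reconnect idea described above can be sketched without Django. This is only an illustration under stated assumptions: it uses the standard library's sqlite3 in place of MySQL, a closed connection stands in for one the server has dropped, and `ReconnectingDB` is a hypothetical name, not the Manager class from django-apscheduler.

```python
import sqlite3


class ReconnectingDB:
    """Hypothetical wrapper: ping before each query, reopen on failure."""

    def __init__(self, path=":memory:"):
        self.path = path
        self.conn = sqlite3.connect(path)
        self.reconnects = 0

    def ping(self):
        """Return True if a trivial query succeeds on the current connection."""
        try:
            self.conn.execute("SELECT 1")
            return True
        except sqlite3.Error:
            return False

    def query(self, sql, params=()):
        # Check the connection first and replace it if the check fails.
        if not self.ping():
            self.conn = sqlite3.connect(self.path)
            self.reconnects += 1
        return self.conn.execute(sql, params).fetchall()


db = ReconnectingDB()
db.conn.close()                      # simulate the server dropping the connection
rows = db.query("SELECT 41 + 1")     # ping fails, wrapper reconnects, query runs
print(rows, db.reconnects)
```

The cost of this design is one extra round trip per query, which is usually acceptable for a scheduler that touches the database only occasionally.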

Anyway, I'm glad the exception disappeared. Feel free to contact me with any other problems.

@WesleyBlancoYuan
Author

WesleyBlancoYuan commented Feb 5, 2018

Yeah... I see your pain. I think that is what is bad about Python: it is useful for scripting, but if you want to use a script, it is your obligation to manage all the imports and so on, and to handle all the possible problems in a relatively isolated situation.
Fortunately, Django manages the connection very well. I now import Django and call django.setup() every time I want to execute the job. See the related bug in APScheduler for details.

I didn't expect your reply to be so quick. Thanks again! The reconnection code is indispensable for solving this.

@izhaolinger

Hello, I have the same problem.
APScheduler==3.5.3
django-apscheduler==0.2.13
OperationalError: (2006, 'MySQL server has gone away')

@TreHack

TreHack commented Dec 13, 2018

Hi @sallyruthstruik, you can write a decorator like the one below to handle the "MySQL has gone away" problem:

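from functools import wraps
from django.db import connection

def db_auto_reconnect(func):
    """Auto reconnect db when mysql has gone away."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            # Ping the underlying driver connection. This also raises when no
            # connection has been opened yet (connection.connection is None).
            connection.connection.ping()
        except Exception:
            # Drop the stale connection; Django opens a new one on next use.
            connection.close()
        return func(*args, **kwargs)
    return wrapper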

Then use the decorator like this:

@db_auto_reconnect
def _process_submission_event(self, event):
    .....

@db_auto_reconnect
def _process_execution_event(self, event):
    .....

I think this is an elegant approach!
However, the _ping_interval = 30 setting is unreliable.
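
One possible reading of that last remark, shown as a sketch: assuming `_ping_interval` means the ping is skipped whenever the previous check was less than 30 seconds ago, a connection that dies inside that window goes unnoticed until the window ends. `FlakyConnection` and `ThrottledPinger` are hypothetical stand-ins with an injected clock, not django-apscheduler or MySQL driver classes.

```python
import time


class FlakyConnection:
    """Stand-in for a DB connection; not a real driver API."""

    def __init__(self):
        self.alive = True

    def ping(self):
        if not self.alive:
            raise ConnectionError("server has gone away")


class ThrottledPinger:
    """Pings at most once per `interval` seconds (assumed semantics)."""

    def __init__(self, connect, interval=30.0, clock=time.monotonic):
        self.connect = connect
        self.interval = interval
        self.clock = clock
        self.conn = connect()
        self.last_ping = clock()

    def ensure(self):
        now = self.clock()
        if now - self.last_ping < self.interval:
            return self.conn  # ping skipped: a dead connection goes unnoticed
        try:
            self.conn.ping()
        except ConnectionError:
            self.conn = self.connect()  # reconnect only once the window is over
        self.last_ping = now
        return self.conn


now = [0.0]  # fake clock so the example does not need to sleep
pinger = ThrottledPinger(FlakyConnection, interval=30.0, clock=lambda: now[0])

pinger.conn.alive = False   # the server drops the connection
now[0] = 10.0
stale = pinger.ensure()     # inside the window: dead connection is handed out
now[0] = 31.0
fresh = pinger.ensure()     # window over: ping fails, a new connection is made
print(stale.alive, fresh.alive)
```

Pinging on every call, as the decorator above does, closes that window at the cost of an extra round trip per call.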
