-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make canarie-api independent of cron and proxy #18
make canarie-api independent of cron and proxy #18
Conversation
mishaschwartz
commented
Apr 17, 2023
- Separate the application, cron job, and proxy (nginx) into separate containers so that they can be run independently.
- Add option to independently enable/disable the two cron jobs that can be run (monitor, parse logs).
- Do not handle log rotation for nginx anymore. Nginx should handle this on its own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comments
I like the idea of splitting everything up as it should be.
Let's see how the tests do. Maybe more items will have to be validated regarding the log parsing strategy.
self-note for deploy
- major revision
1.0.0
required
docker-compose.yml
Outdated
volumes: | ||
- access_log:/opt/local/src/access_file.log | ||
|
||
proxy: | ||
image: nginx | ||
volumes: | ||
- access_log:/var/log/nginx/access.log | ||
|
||
volumes: | ||
access_log: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering log rotation that could contain multiple files, it would probably be better to mount a dedicated directory.
I think it would be wise also to provide an example (i.e.: what is expected) of a configured log rotation of nginx here using the docker service definition for reference (e.g.: https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/config/proxy/docker-compose-extra.yml#L3-L8, but not using driver: json-file
so the parsing still works in CanarieAPI).
I think the log rotation aspect could be improved (not necessarily in this PR, but setup for it).
Log parsers should probably count everything in the "current log" and add them to call counts from older logs, which are gradually wiped.
When the cron job is called, only log entries after the latest update would be parsed to avoid re-counting the same entries twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that all makes a lot of sense! But... from our previous discussion I got the sense that the log monitoring isn't used much, if at all. Do you think that it is worth the effort to make the log rotation work really well if there is a chance we might drop support for that feature in the near future?
If you think there is a use-case for it still I'll leave it in and make the changes you suggest. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically for the case of nginx requests, there are a lot of logs. There must be some form of log rotation, or the proxy will very quickly run into full disk error. This is the reason the referenced logging definition was originally added.
I don't mind not putting too much effort on parsing the log rotation, but I think it could be added easily.
The issue I can see with the PR is that since CanarieAPI is no longer in charge of managing the rotation, logs that will be parsed when the cron is called will not be in sync with whichever rotation is applied on the proxy docker. But like you said, those stats are not used that much, so maybe not that much of an issue.
canarieapi-cron
Outdated
* * * * * root [ -n "$CANARIE_PARSE_LOGS" ] && /usr/local/bin/python3 -c 'from canarieapi import logparser; logparser.cron_job()' 2>&1 | /usr/bin/tee -a /var/log/logparser_cron.log >/proc/1/fd/1 2>/proc/1/fd/1 | ||
* * * * * root [ -n "$CANARIE_MONITOR" ] && /usr/local/bin/python3 -c 'from canarieapi import monitoring; monitoring.cron_job()' 2>&1 | /usr/bin/tee -a /var/log/monitoring_cron.log >/proc/1/fd/1 2>/proc/1/fd/1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure I understand what is happening with those redirects.
Only the logs from canarie-api-cron
itself will be visible (i.e.: the logger.info(...)
calls within parse_log
), and not the logs from nginx
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes that's right. This is essentially a way of making cron logs visible on the stdout of the main process (the one running in docker). The problem is that even if you run cron in the foreground with cron -f
stdout and stderr are still redirected to the cron log files. This is a way of getting around that and sending the logs to actual stdout (as one might expect from a docker container)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds good.
This docker will also need the log rotation config applied to avoid it blowing up the docker-compose logs.
canarieapi/logparser.py
Outdated
if not os.path.isfile(filename): | ||
return {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should raise instead of failing silently, and also log an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I mostly did this because the tests assume that this file is present (it was previously created as an empty file in the rotate_logs function). I can update the tests though so that they don't fail by default.
cron-entrypoint
Outdated
@@ -0,0 +1,5 @@ | |||
#!/bin/sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The repository root is starting to be cluttered with too many docker-specific definitions.
Can you move them under a docker
directory?
|
||
To run the the monitoring job, add the following to a crontab file:: | ||
|
||
* * * * * python3 -c 'from canarieapi import monitoring; monitoring.cron_job()' 2>&1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commands differ from the sample cron definitions.
Here, you could use .. literalinclude:: canarieapi-cron
to avoid maintaining both places
(https://devopstutodoc.readthedocs.io/en/stable/documentation/doc_generators/sphinx/rest_sphinx/code/literalinclude/literalinclude.html). Use :lines:
to select each instruction individually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example here would be for running cron on the host machine (not on docker) so the stdout/stderr redirects and the conditional execution isn't as necessary.
I'm assuming, based on the rest of this file, that this documentation is intended as "basic usage when running canarie-api locally" and is different from the docker usage. I didn't notice that there were docker specific usage documentation but if I've missed it let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there was no docker specific usage.
Even this doc is outdated because ../bin/gunicorn
is not even in the repo anymore (just run gunicorn
installed in the python env).
It's ok to leave those examples as is. Personally, I debug calling the python commands directly instead of using cron. I don't expect many usages outside of Docker except for developers.
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #18 +/- ##
==========================================
- Coverage 80.58% 79.68% -0.91%
==========================================
Files 11 11
Lines 577 576 -1
Branches 89 95 +6
==========================================
- Hits 465 459 -6
+ Misses 84 83 -1
- Partials 28 34 +6
☔ View full report in Codecov by Sentry. |
I'm making this back into a draft for now so that I can think about it a bit more. The main issue is that if nginx runs in its own container then the logs are streamed to stdout/stderr and so are not written to a file directly in order to rotate them. One option would be to make a custom nginx image that writes logs to a file instead but then there's not much point in running this as a separate service since one of the main advantages is being able to inspect the logs for individual processes. Another option is to mount the container logs from the host machine as a volume but the log format that docker produces isn't necessarily stable so it would be a pain to rely on them. |
Sorry for being late to the party. Some thought, ignore if too naive since I do not know the complexity of the code here.
The proxy will be vanila nginx image? Parse logs cronjob share data-volume with nginx proxy container to parse the nginx logs? Monitor cronjob runs from inside the nginx proxy container? I think it should run inside the proxy container so it can effectively test that the proxy has working connection to all the other containers. The cronjob do not have to be started by the proxy container itself. It can
Could be useful for other usage of Canarie-Api. I think PAVICS maybe will always want both of these cronjobs?
Would this affect the ease of log parsing? If Nginx log rotate itself, maybe the log parsing has to know this schedule and parse the logs before it is rotated? |
Yeah it could be, or at least could start as a vanilla nginx image that could be built on.
Yes that is one solution but doesn't fully solve the problem of log rotation
Right now the monitor cronjob does run inside the same container as the proxy. This PR would change it so that the monitor cronjob runs in a different container.
Yeah that's a good idea. I was thinking about adding another docker-proxy component that can allow the cronjob container to query the nginx logs parsed by docker. This could also be used to rotate the logs in the nginx container with an
Ok. I thought from our emails that the log parsing job was not needed/not used now that the funding from canarie is done. Correct me if I'm wrong.
It makes it slightly more complex but not too bad really. My main concern with this is that canarie-api ties the log rollover to the monitoring very strictly and a sysadmin may want to rotate logs differently. I think that this can be handled gracefully by canarie-api without too much trouble. |
Opps sorry yes, log parsing for the stats counter not required for PAVICS. Francis said maybe we still need to provide the stats until Sept 2023. Might want to double check with David Huard. |
Aren't logs stored in I think the solution would be to mount a volume on |
Yes but that is defined in birdhouse-deploy and isn't mentioned anywhere in this repo. It should at the very least be described or documented here.
Yes that is the current solution |
oops sorry I realize I haven't pushed those edits yet |
@tlvu @fmigneault To be honest I think that this has become more effort than it is worth to make the feature that parses the nginx logs work and isolate the processes (nginx, canarie, cron). If you're ok with me removing the feature that parses nginx logs I can do that (I suggest we wait until September to roll that out so that the Canarie requirements are finished). I vote to do that. If you want that feature kept but allow it to be disabled optionally, then I suggest we close this PR and go with #19 instead. |
@mishaschwartz |
This is implemented in the changes to api.py (see here and in #19)
Ok then I propose we forget about this PR and #19. I would really like there to be an option where we have a service that just monitors the stack and provides information about the other services that are running and nothing else. I think that canarie-api will do for now but I will introduce an alternative for the birdhouse stack later on (I will work on other things first). CRIM and Ouranos can continue using canarie-api as it is, and other deployments can use the alternative monitoring app if they prefer something simpler. |
@mishaschwartz |
You're right, it's not difficult. I've pushed an example solution (not tested just yet, I'll get to it tonight). I propose an alternative because I don't want to require a component in the birdhouse stack where some of its features are mandatory but not others. I've thought about it a bit more and I think we can come to a solution if we make the monitoring job mandatory but the log parsing job optional. I'll push a few more changes later tonight and add some clarifying documentation and we can discuss further from there |
You have my vote. I think logs parsing should be generalized to all components in the stack, not just nginx. There is an issue opened for that here bird-house/birdhouse-deploy#218 |
e49e739
to
03bccb3
Compare
Another reason this is useful is that if canarie-api fails (and exits) the proxy goes down as well (which is not good) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to pass linting and tests validations.
Looks good overall from static code analysis. I'll need to see it working within birdhouse-deploy for further validation.
canarieapi/api.py
Outdated
service_stats.append(("invocations", cron_info["invocations"])) | ||
monitor_info.append(("lastInvocationsUpdate", cron_info["last_log_update"])) | ||
|
||
monitor_info.extend([ | ||
("lastAccess", cron_info["last_access"]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe consider having "Disabled"
instead of the default "Never"
so that it is more obvious that this status is expected when PARSE_LOGS = False
was specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was not clear what "lastAccess" referred to in this case. If it refers to the last time this route was accessed by a user then it should be displayed conditionally if monitoring is enabled. I've moved it up into the if
block (above) to reflect this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It refers more to when the logs were last accessed (aka checked). So, more like a "last updated".
I don't think the last access time of the endpoint is relevant for anyone. It is usually used to validate if the logs were parsed recently or are ~1min old (or more depending on crontab config). The "Never"
is usually an indication that something is misconfigured, or that the instance encountered some kind of error.
I want to avoid an expected "Never"
(because log parsing is disabled) to be misinterpreted as something going wrong in the service. Alternatively, you could output parseLogsEnabled: true|false
directly in the response. That would be even more explicit than the "Disabled"
previously suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still remains to be addressed.
"filename": "/opt/local/src/CanarieAPI/stats.db", | ||
"access_log": "/var/log/nginx/access_file.log", | ||
"log_pid": "/var/run/nginx.pid" | ||
"filename": "/data/stats.db", | ||
"access_log": "/logs/nginx-access.log" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This causes breaking changes.
Need to consider MAJOR revision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree
@tlvu |
Sorry, I haven't been following this PR. What "approve and run" button are you talking about? |
@fmigneault This is odd I do not see what you see (that approve and run button) This is my view on github.com So, I am not sure what I can do to help! The reviewer that is requesting change is you by the way. |
@tlvu This repo is likely set up to not allow workflow runs from outside contributors (https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks). If you add me to the repo as a member, my changes to this PR should trigger the workflow runs automatically. |
It will show only if @mishaschwartz pushes another commit. For now, the latest commit was already approved by me, so you see the CI results of the corresponding run. |
@tlvu I pushed another commit if you want to take another look. Thanks |
Indeed ! You're added to this repo with write permission now. If it still does not resolve the workflow not firing problem, you probably should push directly to this repo instead of to your fork. |
Yes I see that "approve and run" button now and just clicked it. Hopefully now you're added with write permission to this repo, the workflow will fire automatically on each push. |
@fmigneault @tlvu only the labeler is failing and I'm pretty sure that's because I'm merging this from my own fork. See for details: actions/labeler#12 What do you want me to do? Are you willing to approve this without the labeler running? |
I'll let @fmigneault approve the change since he knows the CanarieAPI more than I do. However I don't think the labeller is something critical. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor items to add.
Good to merge after.
You can ignore the labeler action.
CHANGES.rst
Outdated
@@ -7,7 +7,8 @@ CHANGES | |||
------------------------------------------------------------------------------------ | |||
|
|||
* Separate the application, cron job, and proxy (nginx) into separate containers so that they can be run independently. | |||
* Add option to independently enable/disable the parse_logs cron job. | |||
* Add option to independently enable/disable the parse_logs cron job by setting the PARSE_LOGS configuration option to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing quotes for parse_logs
and PARSE_LOGS
.
@fmigneault this is good to merge I think (I don't have permission to merge it myself) |