[integration] Remote Pilot Logger to Tornado #6208
405b86a
to
5a6f587
Compare
This is an initial review.
Apart from the documentation, which can come later, the PR is missing unit and integration tests.
src/DIRAC/WorkloadManagementSystem/Service/TornadoPilotLoggingHandler.py
src/DIRAC/WorkloadManagementSystem/Service/TornadoPilotLoggingHandler.py
src/DIRAC/WorkloadManagementSystem/Service/TornadoPilotLoggingHandler.py
src/DIRAC/WorkloadManagementSystem/Service/BasicPilotLoggingPlugin.py
571c1c9
to
50ce8c9
Compare
50ce8c9
to
94a4a63
Compare
@@ -244,6 +244,29 @@ def _getPilotOptionsPerSetup(self, setup, pilotDict):
return queueOptionRes
queuesDict[queue] = queueOptionRes["Value"]
pilotDict["Setups"][setup]["Logging"]["Queues"] = queuesDict
elif "loggingRESTService" in pilotDict["Setups"][setup]:
    self.log.debug(
        "Getting option of ", "/DIRAC/Setups/%s/%s" % (setup, pilotDict["Setups"][setup]["loggingRESTService"])
Does not look like a suitable location.
I'm wondering if we really need this (a section in pilot.json). It was there in the original code. Now we are sending logs to Tornado and, in a second step, selecting a plugin by name (MQ or file cache) directly from the CS; TornadoPilotLoggingHandler does this. So we don't need this section. I would revert to the original code here.
We might need it for logging before DIRAC is installed.
src/DIRAC/WorkloadManagementSystem/Utilities/PilotCStoJSONSynchronizer.py
Getting a proxy on demand and executing a function or code block:
28ea921
to
f203f0e
Compare
self.setup = gConfig.getValue("/DIRAC/Setup", None)
if self.setup is None:
    self.log.error("Setup is not defined in the configuration")
    return S_ERROR("Setup is not defined in the configuration")
I don't think this test should be here.
pilotOptions.append("-z ")
# Pilot Logging defined? This enables the extended (possibly remote) logger.
# We need the URL and an optional flag which allows fine tuning on VO by VO basis:
pilotLogging = opsHelper.getValue("/Pilot/RemoteLogging", True)
Again I would start with
- pilotLogging = opsHelper.getValue("/Pilot/RemoteLogging", True)
+ pilotLogging = opsHelper.getValue("/Pilot/RemoteLogging", False)
src/DIRAC/WorkloadManagementSystem/Agent/test/Test_Agent_PilotLoggingAgent.py
src/DIRAC/WorkloadManagementSystem/Client/TornadoPilotLoggingClient.py
f203f0e
to
010991b
Compare
src/DIRAC/WorkloadManagementSystem/Client/TornadoPilotLoggingClient.py
src/DIRAC/WorkloadManagementSystem/Service/BasicPilotLoggingPlugin.py
src/DIRAC/WorkloadManagementSystem/Service/FileCacheLoggingPlugin.py
src/DIRAC/WorkloadManagementSystem/Service/TornadoPilotLoggingHandler.py
0a7b90f
to
5eb5379
Compare
For the failing PilotWrapper tests: you need to change this line https://github.com/DIRACGrid/DIRAC/blob/integration/tests/Integration/WorkloadManagementSystem/Test_GenerateAndExecutePilotWrapper.py#L56 from
to
and then we would need to merge your Pilot PR (to
Well, it takes pilot code from
The tars are updated every night by this action: https://github.com/DIRACGrid/Pilot/actions/workflows/nightly.yml
There is something strange happening here. When examining the
I don't see anything wrong ...
Well, so why, in the (failing) wrapper test, does the untar list more files than there actually are in the tar which is listed in that test and is here: https://diracproject.web.cern.ch/diracproject/tars/Pilot/DIRAC/devel/ ?
Run these locally and you should see them failing:
Hi,
The cwd the test is running in is
@aldbr is this your code? Can you help me with the failing tests?
I checked out your branch, and, if I run only this (py)test:
it works fine. If instead I run
This is because in one of the unit tests you have added in this PR you are moving to a temporary directory. Please fix it one way or the other.
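One way to keep a test from leaking its working directory is to restore the previous directory when the test body finishes. The sketch below is only an illustration: `working_directory` and `run_in_scratch_dir` are hypothetical names, not code from this PR.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current working directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield path
    finally:
        # Always go back, even if the test body raised an exception.
        os.chdir(previous)


def run_in_scratch_dir():
    """Hypothetical test body: write a file inside a temporary directory."""
    with tempfile.TemporaryDirectory() as scratch:
        with working_directory(scratch):
            with open("pilot.log", "w") as handle:
                handle.write("test record\n")
            return os.path.exists("pilot.log")
```

With this pattern, tests that run later in the same pytest session still see the directory they were started from.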
I fixed the code avoiding using
Yes that's independent from you!
Sure
PS: you mean all or just the failed ones?
Didn't even know you could do that :-D
You mean that I could do that, or that one could do that?
Both ....
87acd47
to
8cb436c
Compare
I have rebased and tests are still failing, in another place, though.
Yes, this looks like a failure elsewhere in the DIRAC code. I can't tell though whose it is, though I feel an urge to tag arbitrary Chrises....
@chrisburr I think the test failure is due to an upgrade to python 3.11: #7112 is me trying to fix this.
Hi, you all have the wrong Chris Burr, this is the guy you want: @chrisburr
8cb436c
to
5a2af95
Compare
5a2af95
to
9d437d8
Compare
OK. Now just add a note in https://github.com/DIRACGrid/DIRAC/wiki/DIRAC-8.1 on what is needed to activate this and we are ready to merge this PR, and to test it in the DIRAC certification setup.
I have added preliminary configuration info to the Wiki.
BEGINRELEASENOTES
This is a draft of a pilot remote logging system to a Tornado Web server - DIRAC side.
*Workload Management System
CHANGE: PilotWrapper has been modified to enable sending messages to a Tornado server.
NEW: The remote pilot logger uses a StringIO buffer which, when full, flushes messages to Tornado. If enabled, remote logging happens in parallel to the existing pilot logging system. The buffer can be used indirectly by calling log.debug or log.info, or directly by writing to it. The latter method is used within pilot command subprocesses, where records emitted by the DIRAC (not Pilot) logger are simply fed into the buffer and sent to Tornado. This requires no changes to logging from those subprocesses.
NEW: TornadoPilotLoggingHandler loads a plugin to dispatch messages received by the server. As an example, a FileCacheLoggingPlugin is implemented. It writes log records to a file, one per pilot. When the last record is written, or a pilot command exits with a code != 0, the log file is "finalised" and collected by an agent (PilotLoggingAgent).
NEW: add a possibility to read a pilot stamp from the DIRAC_PILOT_STAMP environment variable if supplied by a CE. This is the same stamp (PilotStamp) as stored in the PilotAgents table. This allows linking log files to pilots.
CHANGE: PilotCStoJSONSynchronizer to dump the entire Operations section of the CS.
ENDRELEASENOTES
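The buffering behaviour described in the release notes (collect records, flush them to the server when the buffer is full) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Pilot implementation: `FlushingBuffer`, `sender` and `max_lines` are hypothetical names, and the sender here just appends to a list instead of posting to Tornado.

```python
from io import StringIO


class FlushingBuffer:
    """Collect log lines and hand them to `sender` once `max_lines` is reached."""

    def __init__(self, sender, max_lines=3):
        self._sender = sender          # callable receiving the buffered text
        self._max_lines = max_lines    # flush threshold, counted in lines
        self._buffer = StringIO()
        self._count = 0

    def write(self, text):
        """Add one or more lines; flush automatically when the buffer is full."""
        for line in text.splitlines(keepends=True):
            if not line.endswith("\n"):
                line += "\n"
            self._buffer.write(line)
            self._count += 1
            if self._count >= self._max_lines:
                self.flush()

    def flush(self):
        """Send whatever is buffered (if anything) and start a fresh buffer."""
        payload = self._buffer.getvalue()
        if payload:
            self._sender(payload)
        self._buffer = StringIO()
        self._count = 0


sent = []  # stand-in for the remote server (assumption for this sketch)
buf = FlushingBuffer(sent.append, max_lines=2)
buf.write("first\n")
buf.write("second\n")  # buffer is now full, so both lines are sent
buf.write("third\n")   # stays buffered until the next flush
buf.flush()            # e.g. at the end of a pilot command
```

Because the object only needs a `write` method, a subprocess's output can be fed into it line by line without changing how that subprocess logs.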
It is foreseen that an MQ logging agent will be developed which converts log records into JSON and ships them to an MQ service.
Notes:
The Tornado URL should be provided in the DIRAC CS.
PilotCStoJSONSynchronizer could also be used to store (parts of ?) the URL in pilot.json
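The file-cache dispatch from the release notes (one log file per pilot, "finalised" after the last record so that an agent can collect it) might look roughly like the sketch below. All names are hypothetical and the on-disk convention (a `.part` file renamed to `.log` on finalisation) is an assumption made for illustration. Only the `DIRAC_PILOT_STAMP` environment variable comes from the PR description.

```python
import os
import tempfile


class FileCachePluginSketch:
    """Hypothetical stand-in for a file-cache plugin: one log file per pilot stamp."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir

    def _path(self, stamp, finalised=False):
        # Assumption: unfinished logs end in ".part", finalised ones in ".log".
        suffix = ".log" if finalised else ".part"
        return os.path.join(self.cache_dir, stamp + suffix)

    def send_message(self, stamp, record):
        """Append one log record to the file belonging to this pilot."""
        with open(self._path(stamp), "a") as handle:
            handle.write(record.rstrip("\n") + "\n")

    def finalise(self, stamp):
        """Mark the log as complete so that a collecting agent can pick it up."""
        final_path = self._path(stamp, finalised=True)
        os.replace(self._path(stamp), final_path)
        return final_path


def get_pilot_stamp(default="unknown"):
    """Read the pilot stamp from the environment, if the CE supplied one."""
    return os.environ.get("DIRAC_PILOT_STAMP", default)


def demo():
    """Write two records for one pilot, finalise the file, return its content."""
    with tempfile.TemporaryDirectory() as cache:
        plugin = FileCachePluginSketch(cache)
        stamp = get_pilot_stamp()
        plugin.send_message(stamp, "pilot started")
        plugin.send_message(stamp, "pilot finished")
        with open(plugin.finalise(stamp)) as handle:
            return handle.read()
```

Keying the file name on the stamp is what makes it possible to link a collected log file back to a pilot.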