Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements in Orion logs system #3694

Closed
fgalan opened this issue Sep 11, 2020 · 8 comments
Closed

Improvements in Orion logs system #3694

fgalan opened this issue Sep 11, 2020 · 8 comments
Labels
Milestone

Comments

@fgalan
Copy link
Member

fgalan commented Sep 11, 2020

The following modifications has to be done in the Orion Context Broker logs system:

  • Move all the current LM_I traces to LM_T, choosing a suitable (new) tracelevel in DEBUG level except the trace about Orion startup, that must remain in INFO level
  • Introduce the following new traces at INFO level
    • At the end of each GET request on the CB API endpoint, log the URL used and the response status code to that request. No payload is loged.
    • At the end of each non-GET request on the CB API endpoint, log the URL used, the response status code to that request and the request payload
    • At the end of each notification send, log the URL of the endpoind and subId of the associated subcription. No payload is logged
  • For notifications, add the following token to the correlator ID: ; notif=n where "n" is an autoincremental counter (1, 2, 3, etc.) local to the e2e flow that caused the nofiication. I mean, if a given update has corr=10405986-e4e7-4ee4-95c7-bfec5282e3a4 and three notifications are generated as consecuence of that update, then notifications will have the following correlators:
    • corr=10405986-e4e7-4ee4-95c7-bfec5282e3a4; notif=1
    • corr=10405986-e4e7-4ee4-95c7-bfec5282e3a4; notif=2
    • corr=10405986-e4e7-4ee4-95c7-bfec5282e3a4; notif=3
  • Modify documentation in logs.md
@fgalan fgalan added the backlog label Sep 11, 2020
@fgalan fgalan mentioned this issue Sep 17, 2020
10 tasks
@fgalan
Copy link
Member Author

fgalan commented Sep 17, 2020

Move all the current LM_I traces to LM_T, choosing a suitable (new) tracelevel in DEBUG level except the trace about Orion startup, that must remain in INFO level

The following old-INFO traces will remain at the end:

src/app/contextBroker/contextBroker.cpp:    LM_I(("Orion context broker exiting due to receiving a signal"));
src/app/contextBroker/contextBroker.cpp:    LM_I(("Orion context broker exits in an ordered manner (%s)", reason.c_str()));
src/app/contextBroker/contextBroker.cpp:  LM_I(("Orion Context Broker is running"));
src/app/contextBroker/contextBroker.cpp:  LM_I(("Startup completed"));

which I think is good enough.

@fgalan
Copy link
Member Author

fgalan commented Sep 17, 2020

Regarding

For notifications, add the following token to the correlator ID: ; notif=n where "n" is an autoincremental counter (1, 2, 3, etc.) local to the e2e flow that caused the nofiication.

It will be implemented only for non-initial notifications (i.e. it will be implemented for notifications that are caused by a entity creation/update). In the case of initical notification only only notification exists associated to the causing operation (which is subscription creation/update) so it is pointless to add a counter.

@fgalan
Copy link
Member Author

fgalan commented Sep 18, 2020

Feedback from a review with @manucarrace @mrutid and @AlvaroVega

  • It would be desirable to include the Orion options at the startup in a INFO message. It would be also desirable to include information regarding MongoDB connection (nice to have)
  • Use only one line (instead of 2 or 3), using the following formats:
    • Request received: POST /v2/entities, request payload (xx bytes): { ... }, response code: 204
    • Request received: GET /v2/entities, response code: 200
    • Notif delivered (subId: ...): POST localhost:1028/accumulate, response code: 200
  • Change correlator prefix in notifications from notif= to cbnotif=
  • Take into account also forwarding request, similar to notifications, although in this case with full information about request and response payloads.
corr=0c1f105a-f98e-11ea-9068-000c29df7908; cbfwd=1
...
Request forwarded (regId: ...): POST http://cprs/v2/op/update, request payload (xx bytes): { ... }, response payload (xx bytes): { ... }, response code: 204
Request forwarded (regId: ...): POST http://cprs/v2/op/query, request payload (xx bytes): { ... }, response payload (xx bytes): { ... }, response code: 200
  • In the case of notifications and forwarding, if the reason of the fail is known, use that as response code instead of current -1 (I guess we have that information at hand, as it is now shown in lastFailureReason for subscription)
  • The case of 1 subscription and 1 batch update hitting the subscription several times is not working fine. In the following trace, we shoul see notif=1, notif=2 and notif=3, but all them are using notif=1. This has to be solved.
time=Friday 18 Sep 09:23:58 2020.625Z | lvl=INFO | corr=ae660c22-f990-11ea-ab30-000c29df7908 | trans=1600419051-983-00000000017 | from=0.0.0.0 | srv=foo | subsrv=/bar | comp=Orion | op=rest.cpp[574]:requestCompleted | msg=Request finishes: POST /v2/op/update
time=Friday 18 Sep 09:23:58 2020.627Z | lvl=INFO | corr=ae660c22-f990-11ea-ab30-000c29df7908 | trans=1600419051-983-00000000017 | from=0.0.0.0 | srv=foo | subsrv=/bar | comp=Orion | op=rest.cpp[577]:requestCompleted | msg=Request payload (409 bytes): {  "actionType": "append",  "entities": [    {      "type": "T2",      "id": "E1",      "temperature": {        "value": 21.2,        "type": "Float"      }    },    {      "type": "T2",      "id": "E2",      "temperature": {        "value": 31.8,        "type": "Float"      }      },    {      "type": "T2",      "id": "E2",      "temperature": {        "value": 32.8,        "type": "Float"      }    }  ]}
time=Friday 18 Sep 09:23:58 2020.627Z | lvl=INFO | corr=ae660c22-f990-11ea-ab30-000c29df7908 | trans=1600419051-983-00000000017 | from=0.0.0.0 | srv=foo | subsrv=/bar | comp=Orion | op=rest.cpp[579]:requestCompleted | msg=Response code: 204

time=Friday 18 Sep 09:23:58 2020.628Z | lvl=INFO | corr=ae660c22-f990-11ea-ab30-000c29df7908; notif=1 | trans=1600419051-983-00000000018 | from=0.0.0.0 | srv=foo | subsrv=/bar | comp=Orion | op=senderThread.cpp[117]:startSenderThread | msg=Notification transaction finishes (subId: 5f647c0f6d8aa198a4fc88c9): POST localhost:1028/accumulate
time=Friday 18 Sep 09:23:58 2020.628Z | lvl=INFO | corr=ae660c22-f990-11ea-ab30-000c29df7908; notif=1 | trans=1600419051-983-00000000018 | from=0.0.0.0 | srv=foo | subsrv=/bar | comp=Orion | op=senderThread.cpp[118]:startSenderThread | msg=Response code: -1

time=Friday 18 Sep 09:23:58 2020.636Z | lvl=INFO | corr=ae660c22-f990-11ea-ab30-000c29df7908; notif=1 | trans=1600419051-983-00000000020 | from=0.0.0.0 | srv=foo | subsrv=/bar | comp=Orion | op=senderThread.cpp[117]:startSenderThread | msg=Notification transaction finishes (subId: 5f647c0f6d8aa198a4fc88c9): POST localhost:1028/accumulate
time=Friday 18 Sep 09:23:58 2020.641Z | lvl=INFO | corr=ae660c22-f990-11ea-ab30-000c29df7908; notif=1 | trans=1600419051-983-00000000020 | from=0.0.0.0 | srv=foo | subsrv=/bar | comp=Orion | op=senderThread.cpp[118]:startSenderThread | msg=Response code: -1

time=Friday 18 Sep 09:23:58 2020.641Z | lvl=INFO | corr=ae660c22-f990-11ea-ab30-000c29df7908; notif=1 | trans=1600419051-983-00000000019 | from=0.0.0.0 | srv=foo | subsrv=/bar | comp=Orion | op=senderThread.cpp[117]:startSenderThread | msg=Notification transaction finishes (subId: 5f647c0f6d8aa198a4fc88c9): POST localhost:1028/accumulate
time=Friday 18 Sep 09:23:58 2020.641Z | lvl=INFO | corr=ae660c22-f990-11ea-ab30-000c29df7908; notif=1 | trans=1600419051-983-00000000019 | from=0.0.0.0 | srv=foo | subsrv=/bar | comp=Orion | op=senderThread.cpp[118]:startSenderThread | msg=Response code: -1

Subscription used for the test:

curl -v localhost:1026/v2/subscriptions -s -S -H 'Content-Type: application/json' -H 'fiware-service: foo' -H 'fiware-servicepath: /bar' -d @- <<EOF
{
  "subject": {
    "entities": [
      {
        "idPattern": ".*",
        "type": "T2"
      }
    ]
  },
  "notification": {
    "http": {
      "url": "http://localhost:1028/accumulate"
    }
  }
}
EOF

Update used for the test:

curl -v localhost:1026/v2/op/update?options=forcedUpdate -s -S -H 'Content-Type: application/json' -H 'Content-Type: application/json' -H 'fiware-service: foo' -H 'fiware-servicepath: /bar' -d @- <<EOF
{
  "actionType": "append",
  "entities": [
    {
      "type": "T2",
      "id": "E1",
      "temperature": {
        "value": 21.2,
        "type": "Float"
      }
    },
    {
      "type": "T2",
      "id": "E2",
      "temperature": {
        "value": 31.8,
        "type": "Float"
      }	  
    },
    {
      "type": "T2",
      "id": "E2",
      "temperature": {
        "value": 32.8,
        "type": "Float"
      }
    }
  ]
}
EOF

@fgalan
Copy link
Member Author

fgalan commented Oct 1, 2020

Check implementation with large payloads. I think we have a "LINE TOO LONG" guard in traces that can hide log trace completely and we should see how we can deal with that.

@fgalan
Copy link
Member Author

fgalan commented Oct 1, 2020

Regarding:

It would be desirable to include the Orion options at the startup in a INFO message. It would be also desirable to include information regarding MongoDB connection (nice to have)

Now INFO at startup is as follows:

$ ORION_UNKNOWN=foo NO_ES_ORION=bar ORION_PORT=1028 ORION_MONGO_HOST=127.0.0.1 contextBroker -fg -logLevel INFO

time=Thursday 01 Oct 15:10:00 2020.399Z | lvl=INFO | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=contextBroker.cpp[980]:main | msg=start command line <contextBroker -fg -logLevel INFO>
time=Thursday 01 Oct 15:10:00 2020.399Z | lvl=INFO | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=contextBroker.cpp[867]:logEnvVars | msg=env var ORION_PORT (-port): 1028
time=Thursday 01 Oct 15:10:00 2020.399Z | lvl=INFO | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=contextBroker.cpp[859]:logEnvVars | msg=env var ORION_MONGO_HOST (-dbhost): 127.0.0.1
time=Thursday 01 Oct 15:10:00 2020.399Z | lvl=INFO | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=contextBroker.cpp[1048]:main | msg=Orion Context Broker is running
time=Thursday 01 Oct 15:10:00 2020.415Z | lvl=INFO | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=MongoGlobal.cpp[247]:mongoInit | msg=Connected to mongo at 127.0.0.1:orion (poolsize: 10)
time=Thursday 01 Oct 15:10:00 2020.419Z | lvl=INFO | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=contextBroker.cpp[1174]:main | msg=Startup completed

@fgalan
Copy link
Member Author

fgalan commented Oct 2, 2020

With regards to query forwarding reference execution using cases/0787_cprs_full_functional_v2/ngsiv2_query_five_different_cprs_in_one_query.test

time=Friday 02 Oct 14:32:00 2020.284Z | lvl=INFO | corr=08298bf4-04bc-11eb-8143-000c29df7908 | trans=1601649035-663-00000000001 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=rest.cpp[571]:requestCompleted | msg=Request received: POST /v1/registry/registerContext, request payload (227 bytes): { "contextRegistrations": [ { "entities": [ { "type": "T1", "isPattern": "false", "id": "E1" } ], "attributes": [ { "name": "A1", "type": "string" } ], "providingApplication": "http://localhost:9801/v1" } ], "duration": "P1M" }, response code: 200
time=Friday 02 Oct 14:32:00 2020.512Z | lvl=INFO | corr=084d96fc-04bc-11eb-8c77-000c29df7908 | trans=1601649035-663-00000000002 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=rest.cpp[571]:requestCompleted | msg=Request received: POST /v1/registry/registerContext, request payload (227 bytes): { "contextRegistrations": [ { "entities": [ { "type": "T1", "isPattern": "false", "id": "E1" } ], "attributes": [ { "name": "A2", "type": "string" } ], "providingApplication": "http://localhost:9802/v1" } ], "duration": "P1M" }, response code: 200
time=Friday 02 Oct 14:32:00 2020.657Z | lvl=INFO | corr=086b4d50-04bc-11eb-95ca-000c29df7908 | trans=1601649035-663-00000000003 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=rest.cpp[571]:requestCompleted | msg=Request received: POST /v1/registry/registerContext, request payload (227 bytes): { "contextRegistrations": [ { "entities": [ { "type": "T1", "isPattern": "false", "id": "E1" } ], "attributes": [ { "name": "A3", "type": "string" } ], "providingApplication": "http://localhost:9803/v1" } ], "duration": "P1M" }, response code: 200
time=Friday 02 Oct 14:32:00 2020.862Z | lvl=INFO | corr=088a4b88-04bc-11eb-b52c-000c29df7908 | trans=1601649035-663-00000000004 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=rest.cpp[571]:requestCompleted | msg=Request received: POST /v1/registry/registerContext, request payload (227 bytes): { "contextRegistrations": [ { "entities": [ { "type": "T1", "isPattern": "false", "id": "E1" } ], "attributes": [ { "name": "A4", "type": "string" } ], "providingApplication": "http://localhost:9804/v1" } ], "duration": "P1M" }, response code: 200
time=Friday 02 Oct 14:32:00 2020.984Z | lvl=INFO | corr=089d1ea2-04bc-11eb-b538-000c29df7908 | trans=1601649035-663-00000000005 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=rest.cpp[571]:requestCompleted | msg=Request received: POST /v1/registry/registerContext, request payload (227 bytes): { "contextRegistrations": [ { "entities": [ { "type": "T1", "isPattern": "false", "id": "E1" } ], "attributes": [ { "name": "A5", "type": "string" } ], "providingApplication": "http://localhost:9805/v1" } ], "duration": "P1M" }, response code: 200

time=Friday 02 Oct 14:32:01 2020.413Z | lvl=INFO | corr=08b51368-04bc-11eb-8c84-000c29df7908; cbfwd=1 | trans=1601649035-663-00000000007 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=postQueryContext.cpp[309]:queryForward | msg=Request forwarded (regId: 5f7739e06db7dd1b647e8a08): POST http://localhost:9801/v1/queryContext, request payload (78 bytes): {"entities":[{"type":"T1","isPattern":"false","id":"E1"}],"attributes":["A1"]}, response payload (198 bytes): {"contextResponses":[{"contextElement":{"type":"T1","isPattern":"false","id":"E1","attributes":[{"name":"A1","type":"string","value":"A1 in CP1"}]},"statusCode":{"code":"200","reasonPhrase":"OK"}}]}, response code: 200
time=Friday 02 Oct 14:32:01 2020.429Z | lvl=INFO | corr=08b51368-04bc-11eb-8c84-000c29df7908; cbfwd=2 | trans=1601649035-663-00000000008 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=postQueryContext.cpp[309]:queryForward | msg=Request forwarded (regId: 5f7739e06db7dd1b647e8a09): POST http://localhost:9802/v1/queryContext, request payload (78 bytes): {"entities":[{"type":"T1","isPattern":"false","id":"E1"}],"attributes":["A2"]}, response payload (198 bytes): {"contextResponses":[{"contextElement":{"type":"T1","isPattern":"false","id":"E1","attributes":[{"name":"A2","type":"string","value":"A2 in CP2"}]},"statusCode":{"code":"200","reasonPhrase":"OK"}}]}, response code: 200
time=Friday 02 Oct 14:32:01 2020.450Z | lvl=INFO | corr=08b51368-04bc-11eb-8c84-000c29df7908; cbfwd=3 | trans=1601649035-663-00000000009 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=postQueryContext.cpp[309]:queryForward | msg=Request forwarded (regId: 5f7739e06db7dd1b647e8a0a): POST http://localhost:9803/v1/queryContext, request payload (78 bytes): {"entities":[{"type":"T1","isPattern":"false","id":"E1"}],"attributes":["A3"]}, response payload (198 bytes): {"contextResponses":[{"contextElement":{"type":"T1","isPattern":"false","id":"E1","attributes":[{"name":"A3","type":"string","value":"A3 in CP3"}]},"statusCode":{"code":"200","reasonPhrase":"OK"}}]}, response code: 200
time=Friday 02 Oct 14:32:01 2020.482Z | lvl=INFO | corr=08b51368-04bc-11eb-8c84-000c29df7908; cbfwd=4 | trans=1601649035-663-00000000010 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=postQueryContext.cpp[309]:queryForward | msg=Request forwarded (regId: 5f7739e06db7dd1b647e8a0b): POST http://localhost:9804/v1/queryContext, request payload (78 bytes): {"entities":[{"type":"T1","isPattern":"false","id":"E1"}],"attributes":["A4"]}, response payload (198 bytes): {"contextResponses":[{"contextElement":{"type":"T1","isPattern":"false","id":"E1","attributes":[{"name":"A4","type":"string","value":"A4 in CP4"}]},"statusCode":{"code":"200","reasonPhrase":"OK"}}]}, response code: 200
time=Friday 02 Oct 14:32:01 2020.524Z | lvl=INFO | corr=08b51368-04bc-11eb-8c84-000c29df7908; cbfwd=5 | trans=1601649035-663-00000000011 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=postQueryContext.cpp[309]:queryForward | msg=Request forwarded (regId: 5f7739e06db7dd1b647e8a0c): POST http://localhost:9805/v1/queryContext, request payload (78 bytes): {"entities":[{"type":"T1","isPattern":"false","id":"E1"}],"attributes":["A5"]}, response payload (198 bytes): {"contextResponses":[{"contextElement":{"type":"T1","isPattern":"false","id":"E1","attributes":[{"name":"A5","type":"string","value":"A5 in CP5"}]},"statusCode":{"code":"200","reasonPhrase":"OK"}}]}, response code: 200
time=Friday 02 Oct 14:32:01 2020.524Z | lvl=INFO | corr=08b51368-04bc-11eb-8c84-000c29df7908 | trans=1601649035-663-00000000006 | from=0.0.0.0 | srv=<none> | subsrv=<none> | comp=Orion | op=rest.cpp[571]:requestCompleted | msg=Request received: POST /v2/op/query, request payload (96 bytes): { "entities": [ { "id": "E1", "type": "T1" } ], "attributes": [ "A1", "A2", "A3", "A4", "A5" ] }, response code: 200

Note they are NGSIv1 payload (it's and old test) but, for the point of view of this functionality, it doesn't make any difference.

@fgalan
Copy link
Member Author

fgalan commented Oct 6, 2020

Check implementation with large payloads. I think we have a "LINE TOO LONG" guard in traces that can hide log trace completely and we should see how we can deal with that.

Ideally, we should introduce two CLI switches to deal with this:

  • -logLineMaxSize to set the maximum absolute size for trace line (no matter the log level). Default value: 32 * 1024 (to keep backward compatibility, as now is hardwired to that value)
  • -logInfoPayloadMaxSize to set the maximum size for payloads in log level INFO. Default value: 5 * 1024.

@fgalan fgalan added this to the 2.5.0 milestone Oct 26, 2020
@fgalan
Copy link
Member Author

fgalan commented Oct 26, 2020

PR #3696

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant