Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Send Notification API does not send failure to send response #256

Closed
Tracked by #231
lezzago opened this issue Aug 4, 2021 · 5 comments
Closed
Tracked by #231

[BUG] Send Notification API does not send failure to send response #256

lezzago opened this issue Aug 4, 2021 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@lezzago
Copy link
Member

lezzago commented Aug 4, 2021

Describe the bug
When giving an invalid notification channel to the notification plugin to send a notification from the alerting plugin, the notification plugin sends a successful response even if the notification is not sent successfully due to the problems with the channel configuration. Alerting needs to know the notification was not sent successfully since we can put the alert in an error state and inform the user that the notification was not sent.

To Reproduce
Steps to reproduce the behavior:

  1. Create an invalid notification channel
  2. Call the send notification api with the channel and it will return a success response

Expected behavior
The sendNotification API should return an error stating the notification failed to send and include the reason for failure to send.

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

  • OS: [e.g. iOS]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

@zhongnansu
Copy link
Member

I have some doubts about this issue. I think like you mentioned, the sendNotification API always resturn a successfull response with a notifciatonEventID in the sendNotificationResponse object except few cases when OpenSearchStatusException is thrown in this handler, but none of the exception are related to the actual "delivery status".

However since we save all status in the audit index with notificationEventID, which can be queried and get the delivery status including failure details.

For the source plugin which consumes the APIs. You can make another call to the getNotificationEvent API with the notificationEventID and traverse the results to get any info you need.

As for doing it from notification plugin side to avoid the above additonal call, I feel like we can also do that by modifying the sendNotificationResponse data model in common-utils to add event-related fields(recipient, delivery status). I am just not so sure if it's the right thing to do. @akbhatta AK, could you give some advice on this? Thx

https://github.com/opensearch-project/notifications/blob/develop/notifications/notifications/src/main/kotlin/org/opensearch/notifications/send/SendMessageActionHelper.kt#L94-L96

@lezzago
Copy link
Member Author

lezzago commented Aug 11, 2021

That is good to know getNotificationEvent API should get the response. However, I think it would make sense for the sendNotification API to return if the notification was sent out successfully or not, so users using the API won't just assume the notification was sent successfully when it actually hadn't since they wouldn't know to check the getNotificationEvent API.

@zhongnansu
Copy link
Member

zhongnansu commented Aug 11, 2021

@lezzago yes I agree we should return failure response for sendNotification API. But for the response error messge, there are 2 options. Let me know what you think

  1. to only include the notificationEventId, and some generic failure wording. If caller wants to know what excatly went wrong, he should make another call to getNotificationEvent API by that id.
  2. convert the eventStatus along with notificationId to string and return as error message.

I like option 1 more. Becuase our sendNotificationAPI support send to multiple channels in a single call, and will have status on all channels and recipients. It could easily become a messy error message if we go with option 2. Check the example below with only 1 chime destination that fails. Though I can make it less verbose with only returning the crucial info about delivery status, since I don't know your actual use case(or any other caller's use case), I am not sure what info should be kept and what shouldn't. But by option 1, you call another API and you get all the info from the DeliveryStaus or EventStatus as objects whose data model is defined in common-utils already, and it's easier to access.

{
  "error": {
    "root_cause": [
      {
        "type": "status_exception",
        "reason": "{\"notification_id\":\"4ybKNnsBOqyb4be7lOkT\",\"event_doc\":{\"metadata\":{\"last_updated_time_ms\":1628711850081,\"created_time_ms\":1628711848764,\"tenant\":\"\",\"access\":[]},\"event\":{\"event_source\":{\"title\":\"[alerting] Test Message Title-4ibKNnsBOqyb4be7hen6\",\"reference_id\":\"4ibKNnsBOqyb4be7hen6\",\"feature\":\"alerting\",\"severity\":\"info\",\"tags\":[]},\"status_list\":[{\"config_id\":\"4ibKNnsBOqyb4be7hen6\",\"config_type\":\"chime\",\"config_name\":\"this is a sample config name\",\"email_recipient_status\":[],\"delivery_status\":{\"status_code\":\"500\",\"status_text\":\"Failed to send message Failed: HttpResponseProxy{HTTP/1.1 400 Bad Request [Content-Type: application/json, Content-Length: 113, Connection: keep-alive, Date: Wed, 11 Aug 2021 19:57:30 GMT, x-amzn-ErrorType: ValidationException, x-amzn-RequestId: 5f62b8b8-0deb-4bf0-8433-5fafa930b77b, X-Cache: Error from cloudfront, Via: 1.1 be84d08eeed51234cd122d3c30e6f7c1.cloudfront.net (CloudFront), X-Amz-Cf-Pop: SFO5-C1, X-Amz-Cf-Id: cG_tH_8TxCUKXA9QRC0yeWoWgLDRMSkoQV3JhJeT7Hw0wDhp9m8XAQ==] ResponseEntityProxy{[Content-Type: application/json,Content-Length: 113,Chunked: false]}}\"}}]}}}"
      }
    ],
    "type": "status_exception",
    "reason": "{\"notification_id\":\"4ybKNnsBOqyb4be7lOkT\",\"event_doc\":{\"metadata\":{\"last_updated_time_ms\":1628711850081,\"created_time_ms\":1628711848764,\"tenant\":\"\",\"access\":[]},\"event\":{\"event_source\":{\"title\":\"[alerting] Test Message Title-4ibKNnsBOqyb4be7hen6\",\"reference_id\":\"4ibKNnsBOqyb4be7hen6\",\"feature\":\"alerting\",\"severity\":\"info\",\"tags\":[]},\"status_list\":[{\"config_id\":\"4ibKNnsBOqyb4be7hen6\",\"config_type\":\"chime\",\"config_name\":\"this is a sample config name\",\"email_recipient_status\":[],\"delivery_status\":{\"status_code\":\"500\",\"status_text\":\"Failed to send message Failed: HttpResponseProxy{HTTP/1.1 400 Bad Request [Content-Type: application/json, Content-Length: 113, Connection: keep-alive, Date: Wed, 11 Aug 2021 19:57:30 GMT, x-amzn-ErrorType: ValidationException, x-amzn-RequestId: 5f62b8b8-0deb-4bf0-8433-5fafa930b77b, X-Cache: Error from cloudfront, Via: 1.1 be84d08eeed51234cd122d3c30e6f7c1.cloudfront.net (CloudFront), X-Amz-Cf-Pop: SFO5-C1, X-Amz-Cf-Id: cG_tH_8TxCUKXA9QRC0yeWoWgLDRMSkoQV3JhJeT7Hw0wDhp9m8XAQ==] ResponseEntityProxy{[Content-Type: application/json,Content-Length: 113,Chunked: false]}}\"}}]}}}"
  },
  "status": 500
}

@lezzago
Copy link
Member Author

lezzago commented Aug 18, 2021

I think we would want option 2 because in the alert error, we need to give more details to the customer. If not, then the alerting plugin will have to call the other API and this will break since it would need the cluster permission to make the getNotificationEvent API call.

I am fine with option 1 if you have different generic wording that give the customer a good enough indication what could be wrong. Then in the alert error, the plugin can inform the customer to go to the notification plugin to get more details.

@zhongnansu
Copy link
Member

@lezzago Agree. I have a PR #276 to resolve this issue, which actually follows option 2 that gives more info.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants