Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ISSUE #4824] Add HTTP Sink Connector #4837

Merged
merged 27 commits into from
May 8, 2024
Merged

[ISSUE #4824] Add HTTP Sink Connector #4837

merged 27 commits into from
May 8, 2024

Conversation

cnzakii
Copy link
Contributor

@cnzakii cnzakii commented Apr 14, 2024

Fixes #4824

Motivation

[Feature] Add HTTP Sink Connector #4824

Modifications

  1. Add the relevant code for the HTTP Sink Connector.
  2. Write HttpSinkConnectorTest to test the functionality of the HTTP Sink Connector.
  3. Upgrade and add dependencies to ensure the code runs successfully.

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

Copy link
Member

@Pil0tXia Pil0tXia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider supporting Webhook and HTTPS/SSL, and provide corresponding options for users in yml file.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 14, 2024

Please consider supporting Webhook and HTTPS/SSL, and provide corresponding options for users in yml file.

OK, I will continue to improve the relevant functions.
Does a new issue need to be created for enhancement or is it done in this PR ?

@Pil0tXia
Copy link
Member

Does a new issue need to be created for enhancement or is it done in this PR ?

You may directly add more commits in this PR.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 14, 2024

@Pil0tXia

I understand that a webhook is a callback function in HTTP (where a server sends a request to a client).
Why does an Http Sink Connector need this capability? At the moment, I'm also struggling to think of a universal function that can handle different callback data.

Copy link
Member

@Pil0tXia Pil0tXia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cnzakii

HTTP Sink Connector would serve as a "notifier" when Webhook function enabled.

Due to Webhook's callback feature, its implementation code may differ with normal HTTP requests, which may request users to choose using normal HTTP or Webhook. If the current normal HTTP request can perform Webhook deliveries, then there's no need to add an option for it.

In order to display Webhook callback response body, you may need to store them in a memory queue, and leave a comment indicating its future use.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 14, 2024

In order to display Webhook callback response body, you may need to store them in a memory queue, and leave a comment indicating its future use.

@Pil0tXia
I have another question, the data passed must be different for different usage scenarios, so it seems difficult to specify a generic function that handles the callback data efficiently. Or do we just need to store the callback data for now and leave the rest of the work to other parts?

@Pil0tXia
Copy link
Member

@cnzakii

the data passed must be different for different usage scenarios

Are you referring to the ConnectRecord's data field being of type Object? You can serialize the data as String. vertx has type inference to further determine if the text is formatted as json or plain text if you need to.

so it seems difficult to specify a generic function that handles the callback data efficiently

May you please describe in more detail the problems encountered in this part? If you're referring to the "Webhook has a lot of protocols" issue, you may check out the org.apache.eventmesh.webhook.receive.protocol package, considering supporting some default protocols, and providing a template for users to implement their own protocols.

However, does the HTTP Sink Connector, as an HTTP Client, need to implement a special protocol? That should be the job of an HTTP Server (HTTP Source Connector).

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 14, 2024

The way I envision or understand for a client with webhook functionality is as follows:

The web client sends a request containing a ConnectRecord and a callback URL to the server, then listens on this callback URL. When changes occur on the server, data is passed to the client through this callback URL.

Based on this concept, the data structure passed from the server to the client via the callback URL is unknown, thus it's not feasible to establish a uniform data processing function.

I'm unsure whether my concept aligns with the intended goal.😶‍🌫️

@Pil0tXia
Copy link
Member

@cnzakii

Based on this concept, the data structure passed from the server to the client via the callback URL is unknown, thus it's not feasible to establish a uniform data processing function.

May I ask that why do you want to process data HTTP Sink Connector received? Basically we just need to store and display it to users and it is the user who should be the one to understand the response body.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 15, 2024

May I ask that why do you want to process data HTTP Sink Connector received? Basically we just need to store and display it to users and it is the user who should be the one to understand the response body.

OK, I get it, do both "storage" and "display" need to be done? Or should we store the data in the memory queue for now, and the "display" function will be implemented in the future or in other parts?

This is the webhook implementation I envisioned
screenshot-20240415-113809

@Pil0tXia
Copy link
Member

Pil0tXia commented Apr 15, 2024

@cnzakii

do both "storage" and "display" need to be done?

You can leave an HTTP endpoint to get all Webhook delivery response results (which is a memory object as I mentioned above).

This is the webhook implementation I envisioned

Regarding NoResponseQueue and ResponseQueue, I don't think it's necessary to read and write to NoResponseQueue frequently. You can put responses of a 200 status code received into a queue called successfulDeliveries and others into failedDeliveries, and retry failed deliveries later.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 15, 2024

Regarding NoResponseQueue and ResponseQueue, I don't think it's necessary to read and write to NoResponseQueue frequently. You can put responses of a 200 status code received into a queue called successfulDeliveries and others into failedDeliveries, and retry failed deliveries later.

@Pil0tXia

Regarding NoResponseQueue and ResponseQueue, my original intention was for NoResponseQueue to record requests that have not yet received callback responses, while ResponseQueue is intended to record callback responses. Currently, NoResponseQueue is only used for recording purposes and can likely be removed.

As for retry mechanisms, I believe that requests encountering errors at the network level can be retried, but retrying requests that have already received error status codes doesn't make much sense because servers typically respond identically to the same request.

@Pil0tXia
Copy link
Member

Pil0tXia commented Apr 15, 2024

@cnzakii

That's a good idea, should we retry any failed request?

I would recommend this as a useful feature, but it is not a necessary task.

Re-initiating a request that failed at the network level may work, but re-initiating a request that returned an error status code will usually not change the result

You've found a good point. In that case, you can put network error requests into the third queue and only retry network errors. This also applies to ordinary HTTP requests.

The number of retry attempts can be set as a YAML parameter, and when its value is set to 0, it means no retries.

Additionally, we can introduce another parameter that allows the user to decide whether to only retry network errors or retry all requests with non-200 status codes (such as 429, 503 indicating that server is busy, while 400, 422, 500 may get the same results). By default, only network errors are retried.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 16, 2024

@Pil0tXia

In order to implement the webhook function, we must add the callback url to the request so that the server knows where it should send the callback data, then we need to modify the original request. Now I have several options:

  1. Add request header: Callback-Url: http://example.com/callback
  2. Modification request body, the following are several modification methods:
    screenshot-20240416-114143
    screenshot-20240416-114011

What do you think about this ?

@Pil0tXia
Copy link
Member

@cnzakii

callback url belongs to webhook function, and it isn't part of event data. So the former one is definitely better.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 16, 2024

callback url belongs to webhook function, and it isn't part of event data. So the former one is definitely better.

@Pil0tXia

Can you clarify which option it is? Because I proposed three options.

In terms of code implementation, it is simpler to add the callback url to the request header. You only need to use if to determine whether you need to add this request header, without modifying the original function or creating a new function to adapt to different request bodies.

@Pil0tXia
Copy link
Member

@cnzakii

You can add callback url both in header and body to achieve best compatibility.

If you believe that only header or body is needed to add, then one of them is enough.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 16, 2024

You can add callback url both in header and body to achieve best compatibility.

If you believe that only header or body is needed to add, then one of them is enough.

The header is carried by every HTTP request, so I think passing the callback url through the header is enough. The redundant design of declaring the callback url again in the body does not seem to improve compatibility, unless faced with A server that completely ignores HTTP headers.

@Pil0tXia
Copy link
Member

@cnzakii May I ask that what do you want to put in callbackUrl? It is the same concept as the payload server url and should be configured in YAML.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 25, 2024

@cnzakii I can't see the messages on your screenshot from my side. Your screenshot shows that your comments are in Pending state, and you have to publish them or they will always be in an invisible draft state.

I'm very sorry, that was my mistake, I've submitted my comment and you should see it now.

@Pil0tXia
Copy link
Member

@cnzakii Never mind! It's not your fault. I personally think it is GitHub's design oversight. I might prefer not reply a conversation via Files Changed page.

@cnzakii
Copy link
Contributor Author

cnzakii commented Apr 27, 2024

@Pil0tXia PTAL

Copy link
Member

@Pil0tXia Pil0tXia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall design is very good, and it's highly complete. It's clear that a lot of thought and effort has been put into it. I've only suggested some minor modifications.

Copy link
Member

@Pil0tXia Pil0tXia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick and brief review. I will see changes to Retry tomorrow.

@cnzakii cnzakii requested a review from Pil0tXia April 29, 2024 12:50
@cnzakii cnzakii requested a review from Pil0tXia April 29, 2024 13:57
@cnzakii cnzakii requested a review from Pil0tXia April 30, 2024 14:36
Copy link
Member

@Pil0tXia Pil0tXia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The completion of this HTTP Sink Connector is truly remarkable. I have a strong feeling it will become one of the most frequently used connectors among EventMesh users.

I sincerely invite you to join in the development of Webhook support for the HTTP Source Connector. With your experience with the HTTP Sink Connector, I believe you are more than capable of handling this feature, and can significantly alleviate the review workload for the community.

Documentation can be submitted later to the https://github.com/apache/eventmesh-site repository. It would be wonderful if you could showcase the extensive features of the EventMesh HTTP Sink Connector to the users there.

@Pil0tXia Pil0tXia added the ready for review PR is waiting for reviewer's approval or opinion (used as a strong reminder) label Apr 30, 2024
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 11.16883% with 2052 lines in your changes are missing coverage. Please review.

Project coverage is 16.43%. Comparing base (e04156b) to head (d235a48).
Report is 18 commits behind head on master.

❗ Current head d235a48 differs from pull request most recent head 2f2e81e. Consider uploading reports for the commit 2f2e81e to get more accurate results

Files Patch % Lines
...he/eventmesh/runtime/metrics/http/HttpMetrics.java 0.00% 416 Missing ⚠️
...ache/eventmesh/runtime/metrics/tcp/TcpMetrics.java 0.00% 122 Missing ⚠️
...time/metrics/http/EventMeshHttpMetricsManager.java 0.00% 121 Missing ⚠️
...s/prometheus/PrometheusMetricsRegistryManager.java 0.00% 120 Missing ⚠️
...he/eventmesh/runtime/metrics/grpc/GrpcMetrics.java 0.00% 105 Missing ⚠️
...atgpt/source/connector/ChatGPTSourceConnector.java 4.71% 101 Missing ⚠️
...ector/http/sink/handle/WebhookHttpSinkHandler.java 54.07% 45 Missing and 17 partials ⚠️
...mesh/runtime/metrics/tcp/TcpMetricsCalculator.java 0.00% 61 Missing ⚠️
...time/metrics/grpc/EventMeshGrpcMetricsManager.java 0.00% 59 Missing ⚠️
...nnector/chatgpt/source/managers/OpenaiManager.java 0.00% 53 Missing ⚠️
... and 112 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #4837      +/-   ##
============================================
+ Coverage     16.20%   16.43%   +0.22%     
- Complexity     1710     1817     +107     
============================================
  Files           857      910      +53     
  Lines         30883    32351    +1468     
  Branches       2685     2776      +91     
============================================
+ Hits           5005     5317     +312     
- Misses        25410    26519    +1109     
- Partials        468      515      +47     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@cnzakii
Copy link
Contributor Author

cnzakii commented May 1, 2024

@Pil0tXia

LGTM. The completion of this HTTP Sink Connector is truly remarkable. I have a strong feeling it will become one of the most frequently used connectors among EventMesh use

Thank you very much for your help during this period. Without your assistance, it would have been difficult for me to complete the Http Sink Connector to this extent.

Documentation can be submitted later to the https://github.com/apache/eventmesh-site repository.

After this PR is successfully merged, I will complete this work.

I sincerely invite you to join in the development of #4869.

I am very willing to continue contributing to the community. After I complete all the work on the Http Sink Connector, I will attempt to finish it (if the issue hasn't been assigned yet).

Copy link
Member

@mxsm mxsm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mxsm mxsm merged commit 05b1f43 into apache:master May 8, 2024
9 checks passed
@cnzakii cnzakii deleted the feat_4824 branch May 8, 2024 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for review PR is waiting for reviewer's approval or opinion (used as a strong reminder)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Add HTTP Sink Connector
4 participants