Support Large Message Size (PIP-37 / "chunking") #456
Comments
Hello, I am very interested in this feature. The following is my plan.

### Motivation

Make the Pulsar Go client support chunking to produce and consume big messages.

### Modifications

#### Publish Chunked Messages

See pulsar-client-go/pulsar/producer_partition.go Lines 427 to 436 in 0f7041f
If the message payload is larger than `maxMessageSize`, the message is discarded. Instead, it should be split into chunks, each no larger than `maxMessageSize`, and the chunks should be sent to the broker separately. I think the chunking logic can be added in `internalSendAsync`.

pulsar-client-go/pulsar/producer_partition.go Line 741 in 0f7041f
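As a rough illustration of the splitting step, here is a minimal sketch. `splitPayload` and its `chunkSize` parameter are hypothetical names made up for this example, not code from the client. It only shows cutting a payload into pieces of bounded size and leaves out metadata, sequence IDs, and the actual send.

```go
package main

import "fmt"

// splitPayload is a hypothetical helper, not the client's actual code.
// It cuts payload into consecutive chunks of at most chunkSize bytes.
// The returned slices share memory with payload; nothing is copied.
func splitPayload(payload []byte, chunkSize int) ([][]byte, error) {
	if chunkSize <= 0 {
		return nil, fmt.Errorf("chunkSize must be positive, got %d", chunkSize)
	}
	total := (len(payload) + chunkSize - 1) / chunkSize // ceiling division
	chunks := make([][]byte, 0, total)
	for start := 0; start < len(payload); start += chunkSize {
		end := start + chunkSize
		if end > len(payload) {
			end = len(payload)
		}
		chunks = append(chunks, payload[start:end])
	}
	return chunks, nil
}

func main() {
	payload := make([]byte, 10)
	chunks, err := splitPayload(payload, 4)
	if err != nil {
		panic(err)
	}
	for i, c := range chunks {
		// Each chunk would be sent as its own message, carrying its
		// index and the total chunk count in the message metadata.
		fmt.Printf("chunk %d of %d: %d bytes\n", i, len(chunks), len(c))
	}
}
```

In a real producer the chunk size would have to be derived from `maxMessageSize` minus the metadata and framing overhead, which this sketch does not model.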
#### Receive Chunked Messages

Pulsar allows multiple producers to publish to the same topic at the same time, which means that the chunks of several big messages may be interleaved in the topic. The chunks of one big message do not necessarily arrive consecutively, but they do arrive in order, which is guaranteed by the broker.

pulsar-client-go/pulsar/consumer_partition.go Line 553 in 0f7041f
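To make the interleaving concrete, here is a minimal sketch of reassembly on the consumer side. The types `chunk`, `chunkedMsgCtx`, and `assembler` are hypothetical and heavily simplified. The field names loosely follow the chunk metadata of PIP-37 (a UUID per big message, a chunk ID, and the number of chunks), and the policy of dropping a partial message on an out-of-order chunk is an assumption of this sketch.

```go
package main

import "fmt"

// chunk is a hypothetical, simplified view of one received chunk.
type chunk struct {
	UUID      string // identifies the big message this chunk belongs to
	ChunkID   int    // 0-based position of the chunk
	NumChunks int    // total number of chunks in the big message
	Payload   []byte
}

// chunkedMsgCtx buffers the chunks of one big message received so far.
type chunkedMsgCtx struct {
	lastChunkID int
	buf         []byte
}

// assembler tracks in-flight big messages by UUID, so chunks of
// different messages can be interleaved in the stream.
type assembler struct {
	pending map[string]*chunkedMsgCtx
}

func newAssembler() *assembler {
	return &assembler{pending: make(map[string]*chunkedMsgCtx)}
}

// add consumes one chunk. It returns the full payload and true once the
// last chunk of a message arrives, otherwise nil and false.
func (a *assembler) add(c chunk) ([]byte, bool) {
	ctx, ok := a.pending[c.UUID]
	if !ok {
		if c.ChunkID != 0 {
			return nil, false // first chunk was never seen; ignore
		}
		ctx = &chunkedMsgCtx{lastChunkID: -1}
		a.pending[c.UUID] = ctx
	}
	if c.ChunkID != ctx.lastChunkID+1 {
		delete(a.pending, c.UUID) // gap or duplicate: drop the partial message
		return nil, false
	}
	ctx.buf = append(ctx.buf, c.Payload...)
	ctx.lastChunkID = c.ChunkID
	if c.ChunkID == c.NumChunks-1 {
		delete(a.pending, c.UUID)
		return ctx.buf, true
	}
	return nil, false
}

func main() {
	a := newAssembler()
	// Chunks of messages "A" and "B" arrive interleaved but in order.
	stream := []chunk{
		{"A", 0, 2, []byte("hello ")},
		{"B", 0, 2, []byte("foo")},
		{"A", 1, 2, []byte("world")},
		{"B", 1, 2, []byte("bar")},
	}
	for _, c := range stream {
		if msg, done := a.add(c); done {
			fmt.Printf("%s: %s\n", c.UUID, msg)
		}
	}
}
```

A real consumer would also need to bound the number of pending contexts and expire incomplete messages, which this sketch omits.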
### Some Details

#### Batching

Currently the Pulsar Go client depends on pulsar-client-go/pulsar/producer_partition.go Lines 472 to 474 in 0f7041f
In the Java client, the batching logic skips chunked messages, so we need a single-message send implementation that is independent of `BatchBuilder`. Considering the problem with the consumer's available-permits calculation under shared subscriptions (issue #10417), batching and chunking cannot be enabled at the same time.
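The mutual exclusion of batching and chunking could be enforced when the producer is created. The following is only a sketch: `producerConfig`, its fields, and `validate` are hypothetical names for illustration and do not claim to match the client's real options.

```go
package main

import (
	"errors"
	"fmt"
)

// producerConfig is a hypothetical stand-in for the producer options.
type producerConfig struct {
	EnableBatching bool
	EnableChunking bool
}

var errBatchingWithChunking = errors.New("batching and chunking cannot be enabled at the same time")

// validate rejects a configuration that turns on both features.
func (c producerConfig) validate() error {
	if c.EnableBatching && c.EnableChunking {
		return errBatchingWithChunking
	}
	return nil
}

func main() {
	cfg := producerConfig{EnableBatching: true, EnableChunking: true}
	fmt.Println(cfg.validate())
}
```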
#### Chunked Message ID

This is related to PIP 107. It would be good to adopt the solution from the new Java client, which is to implement an

pulsar-client-go/pulsar/consumer_partition.go Lines 428 to 431 in 0f7041f
pulsar-client-go/pulsar/consumer_partition.go Lines 447 to 459 in 0f7041f
#### Size Calculation

This is related to issue #16196. The message metadata should be updated before computing the chunk size, and the total size should include all bytes other than the metadata and payload, e.g. the 4-byte checksum field.

#### Shared Subscription

There are some problems with chunking under shared subscriptions. Issue #16202 added support for chunking with Shared subscriptions, and the Go client may not need to limit chunking with Shared subscription in

#### unAckedChunkedMessageIdSequenceMap

The Go client doesn't support ackTimeout now. So there is no
Master Issue: [#456](#456)

### Motivation

Make the Pulsar Go client support chunking to produce/consume big messages. The earlier implementation ([#717](#717)) didn't take many details into account, so I decided to reimplement it.

### Modifications

- Add `internalSingleSend` to send a message without batching, because batched messages are not chunked.
- Move the `BlockIfQueueFull` check from `internalSendAsync` to `internalSend` (`canAddQueue`) so that blocking still works as usual when chunking.
- Make the producer send big messages by chunking.
- Add `chunkedMsgCtxMap` to store the metadata and data of chunked messages.
- Enable the consumer to receive chunks and consume the big message.
@RobertIndie Since the chunking feature is supported in v0.10 (#805), can we close this issue as completed?
Yes. Sure. Thanks for the reminder.
Currently the Go client does not seem to support PIP-37, which allows messages larger than the maximum message size to be sent by breaking them up on the producer side and reassembling them in the consumer. This would be a handy feature to have for parity with the Java client.