The PublishEvents RPC should block when the queue is full #84
Comments
@cmacknz The issue description sounds like this is to be implemented in the queue itself. If we just blocked on the channel like it is now implemented in Beats (the referenced memory and disk queue implementations), then we would not comply with "The RPC should not block until all events in the batch have been accepted into the queue."

Blocking on the channel inside Publish would mean blocking until the whole batch is accepted. So, my initial idea after our discussion was: when calling […].

@faec any thoughts on this?
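For concreteness, the per-event blocking being warned about would look like the following; queueCh and events are illustrative names, not an existing API:

```go
// Naive approach: every send blocks until the queue has room, so the RPC
// cannot return until the entire batch has been accepted.
for _, e := range events {
	queueCh <- e
}
```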
We will need to modify the Publish() interface to pass the RPC context through to allow cancellation to work properly at least, and we can consider further changes or additions to the queue Publish interface if it helps us get the behaviour we want. I would like to eliminate retry loops and anything that resembles busy waiting if we can, and just rely on having the queue unblock us when it is ready to accept new events. Could we expose the underlying channel from the queue to help with this, perhaps?

// Unconditionally block for the first event in the batch.
queueCh <- events[0]

// Avoid blocking for subsequent attempts to write.
for i := 1; i < len(events); i++ {
	select {
	case queueCh <- events[i]:
	default:
		// send would block, return from RPC
	}
}
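A minimal sketch of how the exposed-channel idea could also honour RPC cancellation; publishBatch, queueCh, and the interface{} event type are illustrative names, not an existing shipper or Beats API:

```go
import "context"

// publishBatch blocks until the first event is accepted or the RPC context is
// cancelled (for example, the client's deadline expires), then accepts as many
// of the remaining events as it can without blocking. It returns how many
// events were accepted.
func publishBatch(ctx context.Context, queueCh chan<- interface{}, events []interface{}) (int, error) {
	if len(events) == 0 {
		return 0, nil
	}
	// Block for the first event, but give up if the RPC is cancelled.
	select {
	case queueCh <- events[0]:
	case <-ctx.Done():
		return 0, ctx.Err()
	}
	accepted := 1
	// Best-effort, non-blocking sends for the rest of the batch.
	for _, e := range events[1:] {
		select {
		case queueCh <- e:
			accepted++
		default:
			// The queue is full again; stop without blocking the RPC.
			return accepted, nil
		}
	}
	return accepted, nil
}
```

The handler could then report the accepted count back to the client instead of blocking for the whole batch.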
Well, it's straightforward to handle events after the first one by calling TryPublish.
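A rough sketch of that approach against the Beats queue.Producer interface quoted in the next comment, assuming TryPublish mirrors Publish's (EntryID, bool) signature; tryPublishBatch itself is illustrative, not real shipper code:

```go
import "github.com/elastic/beats/v7/libbeat/publisher/queue"

// tryPublishBatch blocks on Publish for the first event only, then uses the
// non-blocking TryPublish for the rest of the batch, stopping at the first
// event the queue refuses. It returns how many events were accepted.
func tryPublishBatch(p queue.Producer, events []interface{}) int {
	if len(events) == 0 {
		return 0
	}
	if _, ok := p.Publish(events[0]); !ok {
		return 0 // the queue is shutting down
	}
	accepted := 1
	for _, e := range events[1:] {
		if _, ok := p.TryPublish(e); !ok {
			break // the queue is full; don't block the RPC
		}
		accepted++
	}
	return accepted
}
```

Note that this still cannot honour RPC cancellation while blocked in Publish, which is the gap the rest of the thread discusses.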
@faec I'm somewhat confused now. So, we have this […], but at the same time I see that if it returns […]:

elastic-agent-shipper/queue/queue.go, lines 63 to 68 in 230a013

The behaviour we want is to block on the first event until the queue is free and then (as you said) TryPublish the rest. The missing piece is that I can't see how we can achieve this without changes to the queue implementation in Beats; am I missing something? What would be the concrete steps we need to take in order to achieve this?
Yes, I agree, this requires changes to the Beats queue. I see a couple of options. We could modify the queue producer interface itself:

type Producer interface {
	// Publish adds an event to the queue, blocking if necessary, and returns
	// the new entry's id and true on success.
	Publish(event interface{}) (EntryID, bool)
	...

We could change this prototype to be […]. There are three main producers that would need to change to support it.

The other option that comes to mind is to modify the queue interface itself. We removed the abstraction of queue "consumers" a while ago since they had no state of their own, but we still have "producers" because they encapsulate state about callbacks / notifications. However, the shipper has no need of these features, and they add potential race conditions where there is no essential dependency, so we could add a direct function on the queue for callers that don't need to preserve producer state:

// (beats/libbeat/publisher/queue/queue.go)
type Queue interface {
	Publish(event interface{}, block bool, cancel chan struct{})
	...

This publish call would do the same API calls as the producers, but with no producer state and an explicit cancel channel to address the needs of the shipper. This way the shipper wouldn't have to worry about queue producers or properly synchronizing them at all, so I think I like this option somewhat better, but both are probably fine, so whatever makes for a more practical implementation.
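To make the second option concrete, here is a sketch of how the shipper's PublishEvents handler might use such a queue-level call; the bool return value and the receive-only cancel channel are assumptions on top of the prototype above, and none of this exists in Beats today:

```go
import "context"

// directQueue is a stand-in for the proposed queue-level call. The bool return
// (event accepted or not) and the receive-only cancel channel are assumptions;
// the latter lets ctx.Done() be passed in directly.
type directQueue interface {
	Publish(event interface{}, block bool, cancel <-chan struct{}) bool
}

// publishBatchDirect blocks only for the first event; the remaining events are
// best-effort. RPC cancellation (a deadline or a client disconnect) unblocks
// the queue through the cancel channel.
func publishBatchDirect(ctx context.Context, q directQueue, events []interface{}) int {
	cancel := ctx.Done()
	accepted := 0
	for i, e := range events {
		block := i == 0 // only the first event may block
		if !q.Publish(e, block, cancel) {
			break
		}
		accepted++
	}
	return accepted
}
```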
After a meeting, @cmacknz, @faec, and I decided on the following steps: […]
Issue description

Implementation issue following the discussion in #81.

Specifically, the RPC should block until at least one event is accepted into the queue. The RPC should not block until all events in the batch have been accepted into the queue.

The shipper queue's Publish interface already blocks when the queue is full, based on the underlying Beats memory and disk queue implementations:

elastic-agent-shipper/queue/queue.go, lines 57 to 64 in cf6513d

One missing piece with the existing Publish method is that it does not accept a context.Context as input. This means that when a PublishEvents RPC call is made with a timeout, that timeout will be ignored if the RPC is blocked in the queue Publish method. We will need to modify the queue interface to accept a context.Context and propagate the RPC context to it in the PublishEvents method.
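For illustration, the kind of change described here might look roughly like the following; Queue, inputChan, and the interface{} event type are placeholders rather than the shipper's real Beats-backed implementation:

```go
import (
	"context"
	"fmt"
)

// Queue is a stand-in for the shipper's queue wrapper; inputChan is an assumed
// internal channel, not the real Beats-backed queue.
type Queue struct {
	inputChan chan interface{}
}

// Publish is a context-aware sketch: the blocking send also watches ctx.Done(),
// so a PublishEvents RPC that was started with a timeout stops blocking in the
// queue when that timeout fires or the client goes away.
func (q *Queue) Publish(ctx context.Context, event interface{}) error {
	select {
	case q.inputChan <- event:
		return nil
	case <-ctx.Done():
		return fmt.Errorf("event not queued: %w", ctx.Err())
	}
}
```

The PublishEvents handler would then pass its RPC context straight into this call, so the client's timeout or cancellation propagates to the blocked send.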