Personal/duburson/remove sa prototype #68
Conversation
… partition is joined in one batch for consumption in the function.
In general, correct me if I'm wrong, but the implementation becomes more like a periodic long-running job. Assuming we're going to use AKS, we would need to deal with the complexity of managing both the lifecycle of a Pod and the lifecycle of a buffered event batch with respect to auto-scaling. For example, the function runtime could be holding events in a tumbling window while the orchestrator decides to terminate it. I think there is a mechanism to deal with this situation, but it would require some more investigation.
Kind of. If we are buffering but the application is terminated and restarted across multiple pods in a scale-out or scale-in scenario, we still have the original watermark in Azure Storage for each partition, so we should just resume where we left off. The data that was buffered will need to be re-read, but it should be buffered and then eventually flushed in the same manner. This is why it is important to have a deterministic buffering mechanism. The ability to deterministically replay is important not just for the reseed scenario but is also critical for these failure/scaling scenarios. This is also why the custom Azure Function trigger is necessary: without it we don't control when the watermark is persisted in relation to the buffering. The code here is designed to only update the watermark after the batch is complete. Hopefully that covers your concern. Let me know if you need more discussion!
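To illustrate the resume behavior, here is a minimal sketch (not the PR's actual code; ReadWatermarkAsync is an assumed helper, and EventPosition comes from the newer Azure.Messaging.EventHubs SDK):

using System.Threading.Tasks;
using Azure.Messaging.EventHubs.Consumer;

public class PartitionResumeSketch
{
    // Assumed helper: reads the per-partition sequence number persisted in
    // Azure Storage after the last fully flushed batch.
    private Task<long?> ReadWatermarkAsync(string partitionId) =>
        Task.FromResult<long?>(null); // stub for illustration

    public async Task<EventPosition> GetStartingPositionAsync(string partitionId)
    {
        long? watermark = await ReadWatermarkAsync(partitionId);

        // The watermark only advances after a batch is fully flushed, so any
        // events after it were buffered but never flushed; re-reading them
        // rebuilds the same batch because the buffering is deterministic.
        return watermark.HasValue
            ? EventPosition.FromSequenceNumber(watermark.Value, isInclusive: false)
            : EventPosition.Earliest;
    }
}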
return new EventProcessor(_options, _executor, _logger, _singleDispatch);
}

public IScaleMonitor GetMonitor()
How would we auto-scale this new EventHubTrigger/EventHubListener across Azure Functions?
This is handled by the hosting infrastructure. If we emit the same metrics as the current EventHubTrigger, we should be able to leverage the same built-in AKS scaling. The scaling itself isn't handled in the function; the function just collects the telemetry.
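As a hedged sketch of that idea (not the PR's implementation): a custom IScaleMonitor from Microsoft.Azure.WebJobs.Host.Scale could report the same backlog-style metrics the built-in trigger does, and the host's scale controller makes the actual decision. EstimateBacklogAsync, the descriptor id, and the thresholds below are assumptions.

using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs.Host.Scale;

public class BufferedEventHubMetrics : ScaleMetrics
{
    public long EventCount { get; set; }   // events behind the watermark
    public int PartitionCount { get; set; }
}

public class BufferedEventHubScaleMonitor : IScaleMonitor
{
    public ScaleMonitorDescriptor Descriptor { get; } =
        new ScaleMonitorDescriptor("devicedata-buffered-trigger"); // illustrative id

    public async Task<ScaleMetrics> GetMetricsAsync()
    {
        // Assumed helper: last enqueued sequence number minus the checkpointed
        // sequence number, summed across partitions.
        long backlog = await EstimateBacklogAsync();
        return new BufferedEventHubMetrics
        {
            EventCount = backlog,
            PartitionCount = 4,
            Timestamp = DateTime.UtcNow,
        };
    }

    public ScaleStatus GetScaleStatus(ScaleStatusContext context)
    {
        var latest = context.Metrics?.OfType<BufferedEventHubMetrics>().LastOrDefault();
        var vote = latest == null || latest.EventCount == 0
            ? ScaleVote.ScaleIn                               // idle: shrink
            : latest.EventCount > 1000 * context.WorkerCount  // assumed threshold
                ? ScaleVote.ScaleOut
                : ScaleVote.None;
        return new ScaleStatus { Vote = vote };
    }

    private Task<long> EstimateBacklogAsync() => Task.FromResult(0L); // stub
}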
_logger = logger;
_pendingEventData = new ConcurrentDictionary<string, PartitionEventData>();
_flushTimeSpan = TimeSpan.FromMinutes(1);
_timedFlush = Task.Run(TimedFlush);
In order to support time-based flushing, won't the function app need to be running? And if it's running, won't it be incurring cost?
Even without time-based flushing the function is always running, since the underlying Event Hub processor needs to poll for events. It is probably better to think of this as a time-based buffer, since in the normal mode events are immediately sent to the function (the name in the code notwithstanding). This instead waits until the specified time to send the batch, reducing the calls to the FHIR server. This is very helpful if you are using the SampledData type to collect many data points in the same observation. If you are just using ValueQuantity or some other single-value type, it is less beneficial since each message will become an observation no matter what.
The scale-to-zero solution with KEDA works independently of this trigger. In a KEDA scenario, KEDA would monitor the event hub and then spin up the first instance of the function. At that point data would be buffered and sent to the FHIR service at regular intervals.
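For context, here is a minimal sketch of what the timed buffer loop behind _timedFlush could look like, based only on the excerpt above (not the PR's exact code; _cancellationToken and FlushPartitionAsync are assumptions):

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class PartitionEventData { /* buffered events for one partition */ }

public class BufferedListenerSketch
{
    private readonly ConcurrentDictionary<string, PartitionEventData> _pendingEventData = new();
    private readonly TimeSpan _flushTimeSpan = TimeSpan.FromMinutes(1);
    private readonly CancellationToken _cancellationToken; // assumed to come from the host

    private async Task TimedFlush()
    {
        while (!_cancellationToken.IsCancellationRequested)
        {
            await Task.Delay(_flushTimeSpan, _cancellationToken);

            // Drain each partition's buffer as one batch so the FHIR server
            // sees fewer, larger calls instead of one call per event.
            foreach (var partitionId in _pendingEventData.Keys)
            {
                if (_pendingEventData.TryRemove(partitionId, out var batch))
                {
                    // Assumed: sends the batch to the function, then advances
                    // the watermark only after the batch completes.
                    await FlushPartitionAsync(partitionId, batch);
                }
            }
        }
    }

    private Task FlushPartitionAsync(string partitionId, PartitionEventData batch) =>
        Task.CompletedTask; // stub for illustration
}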
I'm wondering if there is an opportunity to remove the normalizeddata Event Hub and call the MeasurementCollectionToFhir function through a service deployed in K8s, and watermark only after a 200 success (when the Observation create/update is successful). I can see this being useful for a few reasons:
Thanks for the suggestion, Nate. Unfortunately there are some major hurdles if we remove the normalized event hub. The biggest is that we use the hop from device data to normalized data to partition the normalized data according to the device id (see the partitioning sketch below). This ensures the data for a given device will land in the same partition and we don't have contention across multiple partitions for the same device's data. This isn't really something we can enforce on the device data event hub, since the partition for an event is set when a message is written, and I would prefer not to leave such a critical item up to the customer to implement correctly. Some additional potential problems with removing the normalized data Event Hub:
If we were to remove the normalized event hub, I don't think we need to keep the current HttpTrigger. It can just be a call to an internal method on another class; we don't really gain much by keeping it an HttpTrigger. We would still need to rotate function keys, transform the data into HTTP requests, and deal with the overhead. If we end up going this route, I would suggest an in-memory pipeline with distinct stages instead (sketched below). On the cost front, the polling itself is minimal. This is also alleviated if we implement a scale-to-zero solution like KEDA; KEDA still polls, but more infrequently until data is detected, then consumers are ramped up. The concern about where the data is can be handled with proper telemetry and documentation. One clarifying point: data in the Event Hub isn't consumed like a queue; it will remain there until it ages out. As for identity management, we will still need the internal managed identity. The managed identity is used to access the key vault where the other secrets are kept, like the store keys. Also, as part of this work I think we will need to write our own triggers to use the latest Event Hub and Storage SDKs. I haven't confirmed, but there may be an opportunity to leverage managed identity for accessing the storage account with this change.
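To make the partitioning point concrete, here is a hedged sketch of the guarantee the normalized event hub hop provides: writing with the device id as the partition key pins all of a device's data to one partition. (This uses the newer Azure.Messaging.EventHubs SDK mentioned above; connectionString, normalizedPayload, and deviceId are illustrative variables.)

using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Producer;

await using var producer = new EventHubProducerClient(connectionString, "normalizeddata");

// Events with the same partition key always land in the same partition, so a
// single consumer owns a given device's data and there is no cross-partition
// contention for it.
await producer.SendAsync(
    new[] { new EventData(normalizedPayload) },
    new SendEventOptions { PartitionKey = deviceId });

And a hedged sketch of the in-memory pipeline idea: two stages joined by a bounded channel replace the HttpTrigger hop with a direct method call. The stage and type names here are illustrative, not the project's API.

using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public record NormalizedMeasurement(string DeviceId, string Payload);

public class InProcessPipelineSketch
{
    // Bounded so a slow FHIR import stage applies back-pressure upstream.
    private readonly Channel<NormalizedMeasurement> _channel =
        Channel.CreateBounded<NormalizedMeasurement>(500);

    // Stage 1: normalization writes here instead of to an event hub.
    public ValueTask PublishAsync(NormalizedMeasurement m) =>
        _channel.Writer.WriteAsync(m);

    // Stage 2: FHIR import reads directly; no function keys, no HTTP overhead.
    public async Task RunImportStageAsync(Func<NormalizedMeasurement, Task> importToFhir)
    {
        await foreach (var measurement in _channel.Reader.ReadAllAsync())
        {
            await importToFhir(measurement); // e.g. the MeasurementCollectionToFhir logic
        }
    }
}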
Create a Prototype for processing IoMT events without Streaming Analytics
Leverages source code from the Azure Functions Event Hub extensions.
The majority of the changes are in the EventHubListner.cs and MeasurementFhirImportService.cs classes.
Additional improvements include:
Additional Ideas: