-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Report Scheduler] Documentation of Scheduling Logic #31134
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -37,6 +37,25 @@ class TimerContext | |
virtual void TimerFired() = 0; | ||
}; | ||
|
||
/** | ||
* @class ReportScheduler | ||
* | ||
* @brief This class is responsible for scheduling Engine runs based on the reporting intervals of the ReadHandlers. | ||
* | ||
* | ||
* This class holds a pool of ReadHandlerNodes that are used to keep track of the minimum and maximum timestamps for a report to be | ||
* emitted based on the reporting intervals of the ReadHandlers associated with the node. | ||
* | ||
* The ReportScheduler also holds a TimerDelegate pointer that is used to start and cancel timers for the ReadHandlers depending | ||
* on the reporting logic of the Scheduler. | ||
* | ||
* It inherits the ReadHandler::Observer class to be notified of reportability changes in the ReadHandlers. | ||
* It inherits the ICDStateObserver class to allow the implementation to generate reports based on the changes in ICD devices state, | ||
* such as going from idle to active and vice-versa. | ||
* | ||
* @note The logic for how and when to schedule reports is implemented in the subclasses of ReportScheduler, such as | ||
* ReportSchedulerImpl and SyncronizedReportSchedulerImpl. | ||
*/ | ||
class ReportScheduler : public ReadHandler::Observer, public ICDStateObserver | ||
{ | ||
public: | ||
|
@@ -60,6 +79,29 @@ class ReportScheduler : public ReadHandler::Observer, public ICDStateObserver | |
virtual Timestamp GetCurrentMonotonicTimestamp() = 0; | ||
}; | ||
|
||
/** | ||
* @class ReadHandlerNode | ||
* | ||
* @brief This class is in charge of determining when a ReadHandler is reportable depending on the monotonic timestamp of the | ||
* system and the intervals of the ReadHandler. It inherits the TimerContext class to allow it to be used as a context for a | ||
* TimerDelegate so the TimerDelegate can call the TimerFired method when the timer expires. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
* | ||
* The Logic to determine if a ReadHandler is reportable at a precise timestamp is as follows: | ||
* 1: The ReadHandler is in the CanStartReporting state | ||
* 2: The minimal interval since last report has elapsed | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
* 3: The maximal interval since last report has elapsed or the ReadHandler is dirty | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
* If the three conditions are met, the ReadHandler is reportable. | ||
* | ||
* Additionnal flags have been provided for specific use cases: | ||
* | ||
* CanbeSynced: Mechanism to allow the ReadHandler to emit a report if another readHandler is ReportableNow. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. CanBeSynced, no? |
||
* This flag can substitute the maximal interval condition or the dirty condition. It is currently only used by the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I assume this means "This flag can replace the maximal interval condition or the dirty condition"? But this is actually really hard to follow. First we say "we are reportable if all three of these things are true", and then we later say "oh, actually, some of those can be false if some of these other things are true". What this documentation should talk about is conceptually what things might keep us from being reportable and for each one which sets of things might be able to remove that impediment. As I see, it there are three impediments:
Do those descriptions make sense? I think recasting this documentation in terms sort of like those would be a lot clearer.... |
||
* SynchronizedReportScheduler. | ||
* | ||
* EngineRunScheduled: Mechanism to ensure that the reporting engine will see the ReadHandler as reportable if a timer fires. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Bringing us back to: why is this called EngineRunScheduled? Is that the really important thing? Or is the important thing "we think we have reached our min interval, even though the clock disagrees"? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It means this: "This ReadHandler's timer fired, thus ScheduleReport was called, and we want the Report Engine to see the ReadHandler as reportable on the next run, regardless of what the clock thinks" This could be renamed "TimerFired" flag if this seems more clear. Will update the description in my ongoing re-write, I can also update the name if you think it would make sense. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about ForceReportableOnNextEngineRun if that's what it means? |
||
* This flag can substitute the minimal interval condition or the maximal interval condition. The goal is to allow for | ||
* reporting when timers fire earlier than the minimal timestamp du to mechanism such as NTP clock adjustments. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. s/du/due/, but this whole comment might end up changing a bunch. |
||
*/ | ||
class ReadHandlerNode : public TimerContext | ||
{ | ||
public: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,6 +24,26 @@ namespace chip { | |
namespace app { | ||
namespace reporting { | ||
|
||
/** | ||
* @class ReportSchedulerImpl | ||
* | ||
* @brief This class extends ReportScheduler and provides a scheduling logic for the CHIP Interaction Model Reporting Engine. | ||
* | ||
* It is reponsible for implementing the ReadHandler and ICD observers callbacks to the Scheduler can take actions whenever a | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
* ReadHandler event occurs or the ICD changes modes. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
* | ||
* All ReadHandlers Observers callbacks rely on the node pool to create or find the node associated to the ReadHandler that | ||
* triggered the callback and will use the FindReadHandlerNode() method to do so. | ||
* | ||
* ## Scheduling Logic | ||
* | ||
* This class implements a scheduling logic that calculates the next report timeout based on the current system timestamp, the state | ||
* of the ReadHandlers associated with the scheduler nodes and the min and max intervals of the ReadHandlers. | ||
* | ||
* @note This class mimics the original scheduling in which the ReadHandlers would schedule themselves. The key difference is that | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think anyone cares that this "mimics" code that is now gone. It would be better instead to clearly describe the algorithm this class is aiming to implement. |
||
* this implementation only relies on a single timer from the scheduling moment rather than having a timer expiring on the min | ||
* interval that would trigger the start of a second timer expiring on the max interval. | ||
*/ | ||
class ReportSchedulerImpl : public ReportScheduler | ||
{ | ||
public: | ||
|
@@ -33,29 +53,98 @@ class ReportSchedulerImpl : public ReportScheduler | |
~ReportSchedulerImpl() override { UnregisterAllHandlers(); } | ||
|
||
// ICDStateObserver | ||
|
||
/** | ||
* @brief When the ICD changes to Idle, no action is taken in this implementation. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This does not seem to be a useful comment. It's clear that no action is taken from looking at the next line of code. The question is why. If we're going to document anything, that is what needs to be documented. In general, a lot of the documentation this PR added talks about various "what" (like what classes things inherit) and not enough about the "whys". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In my updated documentation, explaining the why, but in this case it might be better to just remove the comment altogether. |
||
*/ | ||
void OnTransitionToIdle() override{}; | ||
|
||
/** | ||
* @brief When the ICD changes to Active, this implementation will trigger a report emission on each ReadHandler that is not | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
* blocked on its min interval. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
* | ||
* @note Most action triggering a change to the Active mode already trigger a report emission, so this method is optionnal as it | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. syntaxe is off here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "Most actions". "optional". Please go through these comments carefully with a grammar/spelling checker. |
||
* might be redundant. | ||
*/ | ||
void OnEnterActiveMode() override; | ||
void OnTransitionToIdle() override; | ||
// No action is needed by the ReportScheduler on ICD operation Mode changes | ||
|
||
/** | ||
* @brief When the ICD changes operation mode, no action is taken in this implementation. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See above about comments that don't say anything useful. |
||
*/ | ||
void OnICDModeChange() override{}; | ||
|
||
// ReadHandlerObserver | ||
|
||
/** | ||
* @brief When a ReadHandler is added, adds a node and register it in the scheduler node pool. Scheduling the report here is | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. syntaxe is off here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "and registers", to be consistent with "adds". |
||
* un-necessary since the ReadHandler will call MoveToState(HandlerState::CanStartReporting);, which will call | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
* OnBecameReportable() and schedule the report. | ||
* | ||
* @note This method sets a now Timestamp that is used to calculate the next report timeout. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, it means a timestamp to the moment of the call. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about "This method captures the current timestamp, which can later be used to ... "? |
||
*/ | ||
void OnSubscriptionEstablished(ReadHandler * aReadHandler) final; | ||
|
||
/** | ||
* @brief When a ReadHandler becomes reportable, recalculate and reschedule the report. | ||
* | ||
* @note This method sets a now Timestamp that is used to calculate the next report timeout. | ||
*/ | ||
void OnBecameReportable(ReadHandler * aReadHandler) final; | ||
|
||
/** | ||
* @brief When a ReadHandler report is sent, recalculate and reschedule the report. | ||
* | ||
* @note This method is called after the report is sent, so the ReadHandler is no longer reportable, and thus CanBeSynced and | ||
* EngineRunScheduled of the node associated to the ReadHandler are set to false in this method. | ||
Comment on lines
+97
to
+98
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this really API documentation, or implementation documentation? As an API consumer what I would want to know is:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right, this is more useful in the implementation doc, will move it there. |
||
* | ||
* @note This method sets a now Timestamp that is used to calculate the next report timeout. | ||
*/ | ||
void OnSubscriptionReportSent(ReadHandler * aReadHandler) final; | ||
|
||
/** | ||
* @brief When a ReadHandler is destroyed, remove the node from the scheduler node pool and cancel the timer associated to it. | ||
*/ | ||
void OnReadHandlerDestroyed(ReadHandler * aReadHandler) override; | ||
|
||
virtual bool IsReportScheduled(ReadHandler * aReadHandler); | ||
|
||
void ReportTimerCallback() override; | ||
|
||
protected: | ||
/** | ||
* @brief Schedule a report for the ReadHandler associated to the node. | ||
* | ||
* If a report is already scheduled for the ReadHandler, cancel it and schedule a new one. | ||
* If the timeout is 0, directly calls the TimerFired() method of the node instead of scheduling a report. | ||
Comment on lines
+117
to
+118
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the audience for this comment? This seems to be a restatement of the method's implementation, but why does anyone care (who would not just read the implementation)? I thought the goal here, based on what I had asked for, was documentation of how the system works as a whole, not per-method restatements of what the code in those methods does... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The audience of the second part would be someone trying to establish if a report is scheduled by using the IsReportScheduled() method, which looks at the timer to assess if it is the case. This will be moved to the IsReportScheduler method above. The first part informs a user that Scheduling twice will result in replacing the first scheduled report, not scheduling 2 reports at separate time. |
||
* | ||
* @param[in] timeout The timeout to schedule the report. | ||
* @param[in] node The node associated to the ReadHandler. | ||
* @param[in] now The current system timestamp. | ||
* | ||
* @return CHIP_ERROR CHIP_NO_ERROR on success, timer related error code otherwise (This can only fail on starting the timer) | ||
*/ | ||
virtual CHIP_ERROR ScheduleReport(Timeout timeout, ReadHandlerNode * node, const Timestamp & now); | ||
void CancelReport(ReadHandler * aReadHandler); | ||
virtual void UnregisterAllHandlers(); | ||
|
||
private: | ||
friend class chip::app::reporting::TestReportScheduler; | ||
|
||
/** | ||
* @brief Find the next timer when a report should be scheduled for a ReadHandler. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not finding a "timer". |
||
* | ||
* @param[out] timeout The timeout to calculate. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is computing a delay from "now", right? That would be useful to document. |
||
* @param[in] aNode The node associated to the ReadHandler. | ||
* @param[in] now The current system timestamp. | ||
* | ||
* @return CHIP_ERROR CHIP_NO_ERROR on success or CHIP_ERROR_INVALID_ARGUMENT if aNode is not in the pool. | ||
* | ||
* The logic is as follows: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why does this needs to be documented here? |
||
* - If the ReadHandler is reportable now, the timeout is 0. | ||
* - If the ReadHandler is reportable, but the current timestamp is earlier thant the next min interval's timestamp, the timeout | ||
* is the delta between the next min interval and now. | ||
* - If the ReadHandler is not reportable, the timeout is the difference between the next max interval and now. | ||
*/ | ||
virtual CHIP_ERROR CalculateNextReportTimeout(Timeout & timeout, ReadHandlerNode * aNode, const Timestamp & now); | ||
}; | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -84,8 +84,6 @@ bool SynchronizedReportSchedulerImpl::IsReportScheduled(ReadHandler * ReadHandle | |
return mTimerDelegate->IsTimerActive(this); | ||
} | ||
|
||
/// @brief Find the smallest maximum interval possible and set it as the common maximum | ||
/// @return NO_ERROR if the smallest maximum interval was found, error otherwise, INVALID LIST LENGTH if the list is empty | ||
CHIP_ERROR SynchronizedReportSchedulerImpl::FindNextMaxInterval(const Timestamp & now) | ||
{ | ||
VerifyOrReturnError(mNodesPool.Allocated(), CHIP_ERROR_INVALID_LIST_LENGTH); | ||
|
@@ -105,16 +103,13 @@ CHIP_ERROR SynchronizedReportSchedulerImpl::FindNextMaxInterval(const Timestamp | |
return CHIP_NO_ERROR; | ||
} | ||
|
||
/// @brief Find the highest minimum timestamp possible that still respects the lowest max timestamp and sets it as the common | ||
/// minimum. If the max timestamp has not been updated and is in the past, or if no min timestamp is lower than the current max | ||
/// timestamp, this will set now as the common minimum timestamp, thus allowing the report to be sent immediately. | ||
/// @return NO_ERROR if the highest minimum timestamp was found, error otherwise, INVALID LIST LENGTH if the list is empty | ||
CHIP_ERROR SynchronizedReportSchedulerImpl::FindNextMinInterval(const Timestamp & now) | ||
{ | ||
VerifyOrReturnError(mNodesPool.Allocated(), CHIP_ERROR_INVALID_LIST_LENGTH); | ||
System::Clock::Timestamp latest = now; | ||
|
||
mNodesPool.ForEachActiveObject([&latest, this](ReadHandlerNode * node) { | ||
// We only consider the min interval if the handler is reportable to prevent holding the reports | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This comment is very cryptic. Holding what reports? What does "holding" even mean here? It seems like there is a subtlety here that needs to be explained, but this comment is not doing it. |
||
if (node->GetMinTimestamp() > latest && this->IsReadHandlerReportable(node->GetReadHandler()) && | ||
node->GetMinTimestamp() <= this->mNextMaxTimestamp) | ||
{ | ||
|
@@ -138,7 +133,10 @@ CHIP_ERROR SynchronizedReportSchedulerImpl::CalculateNextReportTimeout(Timeout & | |
bool reportableNow = false; | ||
bool reportableAtMin = false; | ||
|
||
// Find out if any handler is reportable now or at the next min interval | ||
mNodesPool.ForEachActiveObject([&reportableNow, &reportableAtMin, this, now](ReadHandlerNode * node) { | ||
// If a node is already scheduled, we don't need to check if it is reportable now, unless a chunked report is in progress | ||
// in which case we need to keep scheduling engine runs until the report is complete | ||
Comment on lines
+138
to
+139
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we have There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, node->IsChunkedReport() can only be true while reporting, which only happens on reportable handler. |
||
if (!node->IsEngineRunScheduled() || node->IsChunkedReport()) | ||
{ | ||
if (node->IsReportableNow(now)) | ||
|
@@ -156,8 +154,6 @@ CHIP_ERROR SynchronizedReportSchedulerImpl::CalculateNextReportTimeout(Timeout & | |
return Loop::Continue; | ||
}); | ||
|
||
// Find out if any handler is reportable now | ||
|
||
if (reportableNow) | ||
{ | ||
timeout = Milliseconds32(0); | ||
|
@@ -175,8 +171,6 @@ CHIP_ERROR SynchronizedReportSchedulerImpl::CalculateNextReportTimeout(Timeout & | |
return CHIP_NO_ERROR; | ||
} | ||
|
||
/// @brief Callback called when the report timer expires to schedule an engine run regardless of the state of the ReadHandlers, as | ||
/// the engine already verifies that read handlers are reportable before sending a report | ||
void SynchronizedReportSchedulerImpl::TimerFired() | ||
{ | ||
Timestamp now = mTimerDelegate->GetCurrentMonotonicTimestamp(); | ||
|
@@ -188,13 +182,14 @@ void SynchronizedReportSchedulerImpl::TimerFired() | |
mNodesPool.ForEachActiveObject([now, &firedEarly](ReadHandlerNode * node) { | ||
if (node->GetMinTimestamp() <= now) | ||
{ | ||
// Mark the handler as CanBeSynced if the min interval has elapsed so it will emit a report on the next engine run | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This might be clearer as:
or so. That's the real intent of what's going on here, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes it is. |
||
node->SetCanBeSynced(true); | ||
} | ||
|
||
if (node->IsReportableNow(now)) | ||
{ | ||
// We set firedEarly false here because we assume we fired the timer early if no handler is reportable at the moment, | ||
// which becomes false if we find a handler that is reportable | ||
// We set firedEarly false here because we assume we fired the timer early if no handler is reportable at the | ||
// moment, which becomes false if we find a handler that is reportable | ||
firedEarly = false; | ||
node->SetEngineRunScheduled(true); | ||
ChipLogProgress(DataManagement, "Handler: %p with min: 0x" ChipLogFormatX64 " and max: 0x" ChipLogFormatX64 "", (node), | ||
|
@@ -206,12 +201,14 @@ void SynchronizedReportSchedulerImpl::TimerFired() | |
|
||
if (firedEarly) | ||
{ | ||
// If we fired the timer early, we need to recalculate the next report timeout and reschedule the report | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "reschedule the report for a time when we should have a handler that's ready to report"? Or for some other reason? Why do we need to do this? That's the question this comment leaves unanswered. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The reason is that when we fire early, we essentially say: "my timer disagrees with the monotonic timestamp but I will trust the timer so let's ignore the min timestamp". But since we use a common timer in this implementation, we don't know which handler should be reportable now and we don't want to systematically ignore min intervals. (I wrote something along those lines in the update doc PR). |
||
Timeout timeout = Milliseconds32(0); | ||
ReturnOnFailure(CalculateNextReportTimeout(timeout, nullptr, now)); | ||
ScheduleReport(timeout, nullptr, now); | ||
} | ||
else | ||
{ | ||
// If we did not fire the timer early, we can schedule an engine run | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The key part here is not the "did not fire early", really. It's "we have a handler that's reportable now", no? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes it is. |
||
InteractionModelEngine::GetInstance()->GetReportingEngine().ScheduleRun(); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`such as going from idle to active mode and vice-versa.