Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #540, add event callback framework #541

Merged

Conversation

jphickey
Copy link
Contributor

@jphickey jphickey commented Jul 10, 2020

Describe the contribution
Adds an event callback mechanism to certain state changes in OSAL. This allows the CFE PSP to be notified at these points, and therefore it can add platform-specific functionality. This can, for instance, set the task name as requested in #532 or set the processor affinity in a multi-core setup (TBD).

Fixes #540

Testing performed
Create an event handler in the pc-linux CFE PSP and register it, and print out each event received. Confirm events are generated as expected, and that it didn't break the FSW in any way.

Note - Not completely tested yet -- this is mainly a proof of concept and pushed for design review at this time.

Expected behavior changes
Adds ability to implement custom platform-specific functionality for defined events.

System(s) tested on
Ubuntu 20.04

Additional context
Previous discussion in #532.

Contributor Info - All information REQUIRED for consideration of pull request
Joseph Hickey, Vantage Systems, Inc.

@jphickey jphickey marked this pull request as draft July 10, 2020 19:58
@jphickey jphickey added the CCB:Ready Pull request is ready for discussion at the Configuration Control Board (CCB) label Jul 10, 2020
@jphickey
Copy link
Contributor Author

marked as "ccb ready" for concept/architecture discussion (not for merge yet).

* @param[inout] data An abstract data/context object associated with the event, or NULL.
* @return status Execution status, see @ref OSReturnCodes.
*/
typedef int32 (*OS_EventHandler_t)(OS_Event_t event, uint32 object_id, void *data);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confused by the flow of the "data" void *. Usu. with a callback architecture, the opaque data is passed in by the registration and passed through as another "in" to the callback function pointer. But the registration does not define the opaque and this is defined as in/out? (Which implies that the caller would want/need to understand the contents of the data buffer.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(may be a good opportunity to draw up a sequence diagram or some other way to describe the flow of control and data handoffs.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usage of the "data" opaque pointer field would be indicated by the event ID. Currently the set of events I've predefined don't use this, and all events pass the field as NULL. But there could be events that pass additional context info via this field, such as the object name. OR - it could be used for inputs, so the event could pass a buffer which is filled/set by the handler.

Again, right now the presence of this field is more of a "future-proof" item - to offer a way to include additional context if necessary without having to change the API in the future.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chris has a good point. This should be some context registered with the handler and passed back to it. The concept you're talking about with using it for extra event parameters is going to lead to you having to either force the user to typecast in their callback or create a set of API functions to make the opaque data useful.

@CDKnightNASA
Copy link
Contributor

Also would like consideration how this might play with the OSAL "object id" framework. (And currently threads/tasks do not use the object id framework...which it should!) It might make sense that all "objects" would have this capability.

@astrogeco
Copy link
Contributor

CCB-2020-07-15: Good architecture, conversation about whether this is the appropriate way to solve the request in #533

@astrogeco
Copy link
Contributor

@acudmore and @jwilmot thoughts?

@jphickey
Copy link
Contributor Author

Also would like consideration how this might play with the OSAL "object id" framework. (And currently threads/tasks do not use the object id framework...which it should!) It might make sense that all "objects" would have this capability.

Not sure what you mean by "currently threads/tasks do not use the object id framework" .. can you elaborate? Everything does use the object ID framework, including tasks.

@astrogeco astrogeco removed the CCB:Ready Pull request is ready for discussion at the Configuration Control Board (CCB) label Jul 29, 2020
@jphickey jphickey linked an issue Aug 12, 2020 that may be closed by this pull request
@jphickey jphickey marked this pull request as ready for review August 12, 2020 20:06
@jphickey
Copy link
Contributor Author

Update - this PR here has some cleanup around the deletion of objects that would be nice to have for other reasons. IIRC it was discussed/reviewed in CCB a while back and there were no objections.

@astrogeco - Can this be merged?

If still on the fence about the callback structure, then if ncessary I can split it into a separate PR which consolidates the deletion/cleanup code (which I would like to have merged) and the callback bits.

@astrogeco
Copy link
Contributor

Did we want to get a review by either of the architects?

@skliper
Copy link
Contributor

skliper commented Aug 13, 2020

Maybe @jwilmot and @klystron78. They have a potential to take advantage of this for the SMP work.

@astrogeco
Copy link
Contributor

@jphickey let's split this into two PRs so we can accelerate merging the callback stuff.

* @retval #OS_SUCCESS @copybrief OS_SUCCESS
* @retval #OS_ERROR @copybrief OS_ERROR
*/
int32 OS_RegisterEventHandler (OS_EventHandler_t handler);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to have a bitmask so the caller can specify which events are of interest. This idea also scales forward as more events are added. It could be 32-bit or 64-bit number which gives that many extra bits for future use. Perhaps 0 means "all events."

* The event handler is an application-defined callback
* that gets invoked as resources are created/configured/deleted.
*/
OS_EventHandler_t EventHandler;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can there only be one? It might be useful to follow the observer/observable pattern and have a list of observers, so that apps can monitor events as well to get an idea of the state of the system. I am not fully briefed on the use-case for this work, so this thought may be beyond scope.

@@ -157,14 +157,8 @@ int32 OS_BinSemDelete (uint32 sem_id)
{
return_code = OS_BinSemDelete_Impl(local_id);

/* Free the entry in the master table now while still locked */
if (return_code == OS_SUCCESS)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return_code is set on line 158 and not checked with this deletion. It's set again on 161.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return code is checked inside the OS_ObjectIdFinalizeDelete function. There is cleanup to do for both success and failure cases. (This is a mirror image of the way the allocation/creation works too).

*-----------------------------------------------------------------*/
int32 OS_RegisterEventHandler (OS_EventHandler_t handler)
{
if (handler == NULL)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it an error if null is passed in? The effect would be to say "I am no longer interested in notifications," which seems like would not be an error.

@@ -150,14 +150,8 @@ int32 OS_CountSemDelete (uint32 sem_id)
{
return_code = OS_CountSemDelete_Impl(local_id);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like before, return_code is set and not checked. It's set again on 154.

@@ -187,14 +187,8 @@ int32 OS_DirectoryClose(uint32 dir_id)
{
return_code = OS_DirClose_Impl(local_id);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@@ -223,14 +223,8 @@ int32 OS_close (uint32 filedes)
{
return_code = OS_GenericClose_Impl(local_id);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@@ -289,14 +289,8 @@ int32 OS_ModuleUnload ( uint32 module_id )
*/
return_code = OS_ModuleUnload_Impl(local_id);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same thing with return_code set and then not checked, and set again below

@@ -145,14 +145,8 @@ int32 OS_MutSemDelete (uint32 sem_id)
{
return_code = OS_MutSemDelete_Impl(local_id);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same thing with return_code set and then not checked, and set again below

@@ -156,14 +156,8 @@ int32 OS_QueueDelete (uint32 queue_id)
{
return_code = OS_QueueDelete_Impl(local_id);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same thing with return_code set and then not checked, and set again below

}

/*
** Call the thread Delete hook if there is one.
*/
if (delete_hook != NULL)
if (return_code == OS_SUCCESS && delete_hook != NULL)
{
delete_hook();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both delete hook and event for deletion?

@@ -260,14 +260,8 @@ int32 OS_TimeBaseDelete(uint32 timer_id)
{
return_code = OS_TimeBaseDelete_Impl(local_id);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same thing with return_code set and then not checked, and set again below

@jphickey jphickey force-pushed the fix-540-callback-framework branch from c0b790c to c0bb696 Compare August 18, 2020 20:36
@jphickey jphickey marked this pull request as draft August 18, 2020 20:39
@jphickey
Copy link
Contributor Author

See split in PRs #566, #567

Review of this should focus on commit c0bb696. Converted back to draft while discussing the event callback API and use case.

@jphickey
Copy link
Contributor Author

Note to reviewers - the initial use case for this callback is in #532, which requires invoking a Linux-specific API to set the name of the task.

It is not intended for dynamic application level use - it is intended more as an extension mechanism to replace what would otherwise be platform-specific #ifdef routines. As such, it is a simpler problem - there should only ever be one callback function registered, which should be done immediately after OS_API_Init(), and it should never be subsequently un-registered.

If we wanted to provide a more generic application-level callback - then yes I agree with the comments regarding providing a method to subscribe to only certain events, and providing another application-provided object to pair with the callback, and so forth. But that wasn't really the objective here. We can certainly do that if folks think its useful, but I don't really see the use-case for it right now. The singlular BSP/PSP-provided callback proposed here is much simpler to implement, as any dynamic registration/unregistration would be much more involved i.e. it would have to maintain a list, proper locking to support being invoked by any thread, etc.

@skliper skliper modified the milestones: 5.2.0, 6.0.0 Sep 1, 2020
@skliper
Copy link
Contributor

skliper commented Sep 21, 2020

Need a path to closure, marking for review (unless we can reach consensus before CCB).

@skliper skliper added the CCB:Ready Pull request is ready for discussion at the Configuration Control Board (CCB) label Sep 21, 2020
@astrogeco
Copy link
Contributor

CCB 2020-09-23 rebase on main and revisit on 2020-09-30. @acudmore and @CDKnightNASA will examine

Define an interface to allow an app/PSP to be notified when state changes
or other events occur at the OS level.  Initially defined events are resource
creation/deletion and task startup.

The interface is easily extendable with more events as needed.

This can be used to add platform-specific/nonstandard functions by putting
the code inside the event handler at the PSP level.
@jphickey jphickey force-pushed the fix-540-callback-framework branch from 51ac344 to ba0448a Compare September 23, 2020 20:35
@jphickey jphickey marked this pull request as ready for review September 23, 2020 20:39
@jphickey
Copy link
Contributor Author

Rebased to latest main branch. Ready for (re-)review.

Also see the use-case example at nasa/PSP@main...jphickey:jph-wip-linux-event-handler

I have confirmed that when running this on my linux box with the change above applied, then the /proc filesystem reads accordingly, i.e. the /proc/<pid>/task/<threadid>/comm special files reads as CFE_EVS, CFE_SB, etc.

@astrogeco astrogeco added CCB-20200930 and removed CCB:Ready Pull request is ready for discussion at the Configuration Control Board (CCB) labels Sep 30, 2020
@astrogeco
Copy link
Contributor

CCB 2020-09-30: APPROVED

@astrogeco astrogeco merged commit d753a45 into nasa:integration-candidate Oct 1, 2020
@jphickey jphickey deleted the fix-540-callback-framework branch December 3, 2020 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OSAL event callback framework for platform-specific handling
5 participants