Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed missing callbacks for pending events #1717

Merged
merged 2 commits into from
Jul 4, 2023

Conversation

e-hndrks
Copy link
Contributor

@e-hndrks e-hndrks commented Jun 5, 2023

When setting a listener, any pending (i.e. unhandled) event received before would not result in an immediate callback. Now the set_listener call also checks for pending events, and invokes the registered callbacks when appropriate. This fix is a prerequisite for #eclipse-cyclonedds/cyclonedds-cxx#410

e-hndrks added a commit to e-hndrks/cyclonedds-cxx that referenced this pull request Jun 6, 2023
When creating an Entity with a Listener, any instantaneous events that occurred during creation would miss their callback.
This was caused by the C++ API not being able to wrap the underlying C API with a C++ object prior to C already trying to invoking the callback.
That is now resolved by not setting the Listener at creation time, but by setting it after successfully putting a C++ wrapper around the underlying C Entity.
This should fix issue eclipse-cyclonedds#410
A prerequisite for applying this fix is to make sure pull request #eclipse-cyclonedds/cyclonedds#1717 is applied to the cyclonedds repository.

Signed-off-by: Erik Hendriks <[email protected]>
Copy link
Contributor

@eboasson eboasson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this'll work pretty well for setting the status callbacks on the reader/writer, but I don't think it'll cause listener callbacks when setting a listeners on a entity higher up in the hierarchy. Extending it to deal with the data-available/data-on-readers events seems quite doable to me, and so we should add that.

There are also some other details and it really needs some (stress) testing ... Testing this will be tricky, there are two aspects to it:

  • whether it covers all listeners
  • whether the invocation mechanism works properly
  • whether it indeed handles cases where listeners are (re)set to 0 correctly

The first one can realistically be covered by inspection (just check whether each status flag/listener is accounted for and the names line up).

The second is probably most easily covered by testing with data-available and data-on-readers (those are easy to manipulate via the API, after all).

The last one could also be done with just one type of event and then checking all cases are handled consistently.

Especially that second test will take quite a bit of effort. Perhaps it would be an idea to add a sanity-check test in this PR and do a follow-up that does more thorough checking later? That sanity-check should be straightforward.

@@ -252,6 +256,11 @@ inline void dds_entity_deriver_refresh_statistics (const struct dds_entity *e, s
dds_entity_deriver_table[e->m_kind]->refresh_statistics (e, s);
}

/** @component statistics */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/** @component statistics */
/** @component entity_listener */

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check. Typical consequence of using copy-paste too eagerly. Will correct.

@@ -22,6 +22,9 @@ struct ddsi_status_cb_data;
/** @component reader */
void dds_reader_status_cb (void *entity, const struct ddsi_status_cb_data * data);

/** @component reader */
void dds_reader_invoke_cbs_for_pending_events(const struct dds_entity *e, uint32_t status);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This resets the status flag, right? So it is not const struct dds_entity *. (Self-evidently this applies to all places involving invoke_cbs_for_pending_events in one form or the other.)

(That also has the advantage that I don't have to note that the const removed by the cast in the implementation.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again a matter of copy-pasting too eagerly. Will correct this.

src/core/ddsc/src/dds_reader.c Show resolved Hide resolved
src/core/ddsc/src/dds_entity.c Show resolved Hide resolved
/* TODO: dds_get_status_changes may not be invoked for DATA_ON_READERS on Subscriber, check if that is the case!! */
status = ddsrt_atomic_ld32 (&e->m_status.m_status_and_mask) & SAM_STATUS_MASK;
if (listener && status) {
dds_entity_deriver_invoke_cbs_for_pending_events(e, status & listener->reset_on_invoke);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reset_on_invoke mask makes it possible to have listeners that only observe events, without affecting other things. This is an extension to the standard DDS API and comes in very useful at times (the ROS 2 RMW layer relies on it, for example).

There is no reason to make it part of the decision to invoke a listener or not. If the user chose to not reset the status flags on a listener invocation, then I don't see why we would have to worry about generating a spurious event in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense: I already wondered in which case the reset_on_invoke would not be set when the corresponding listener callback was. So instead I will be checking whether a callback function is registered prior to determining whether it needs to be invoked or not.

@@ -1020,10 +1025,12 @@ static void pushdown_listener (dds_entity *e)
ddsrt_mutex_unlock (&e->m_mutex);
}


dds_return_t dds_set_listener (dds_entity_t entity, const dds_listener_t *listener)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to update the documentation to state that we invoke listeners for events raised in the status flags. I think it would also also state that spurious invocations may occur: that simplifies the interactions with the reset_on_invoke flags and the DATA_ON_READERS thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Which documentation files would be impacted by this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only dds_set_listener in dds.h. It just needs a bit of text like:

dds_set_listener will invoke any listeners for which the corresponding status flag is set. It may cause spurious invocations, including multiple invocations for one listener. For most cases this is unlikely, but for the DATA_ON_READERS listeners it is quite likely, though not certain.

At least that would work for me, but YMMV. Suggestions always welcome 🙂

while (e->m_cb_count > 0)
ddsrt_cond_wait (&e->m_observers_cond, &e->m_observers_lock);
e->m_cb_count++;
/* TODO: dds_get_status_changes may not be invoked for DATA_ON_READERS on Subscriber, check if that is the case!! */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be addressed fairly easily, so let's do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taken care off now. Will remove the comment.

src/core/ddsc/src/dds_entity.c Show resolved Hide resolved
src/core/ddsc/src/dds_reader.c Show resolved Hide resolved
@e-hndrks e-hndrks force-pushed the master_erik branch 5 times, most recently from 28092e8 to 3465c6a Compare June 15, 2023 09:28
e-hndrks added a commit to e-hndrks/cyclonedds-cxx that referenced this pull request Jun 16, 2023
When creating an Entity with a Listener, any instantaneous events that occurred during creation would miss their callback.
This was caused by the C++ API not being able to wrap the underlying C API with a C++ object prior to C already trying to invoking the callback.
That is now resolved by not setting the Listener at creation time, but by setting it after successfully putting a C++ wrapper around the underlying C Entity.
This should fix issue eclipse-cyclonedds#410
A prerequisite for applying this fix is to make sure pull request #eclipse-cyclonedds/cyclonedds#1717 is applied to the cyclonedds repository.

Signed-off-by: Erik Hendriks <[email protected]>
@trittsv
Copy link
Contributor

trittsv commented Jun 26, 2023

Hey @eboasson and @e-hndrks :) is there a chance that after the merge of this PR (and the cxx counterpart) a new CycloneDDS release could be created?

We are in a pitty because we want to release our software which is using cyclonedds. But our software does not work without these fixes.
We want to say the users that they should install official CycloneDDS Version X.Y. And not our patched CycloneDDS version.

@eboasson
Copy link
Contributor

@trittsv definitely!

The details are (of course) still to be decided ...

My wished-for 11.0 release is long overdue, in part because of the paperwork to be done and the need for a release/progress review by the Eclipse Foundation. The former is not a valid argument, but the latter really does mean it takes a while longer, and so altogether probably a longer timescale than what would work for you.

I also had a quick look into backporting this to 0.10.x and that looks eminently feasible. There are some conflicts but they appear straightforward. The behavioural change is small enough, and more importantly, entirely in line with how people would've worked around it (always invoking the listener function once in case an event happened earlier) and leaving that workaround in should cause no problems.

So let's aim for that.

Copy link
Contributor

@eboasson eboasson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few details, but overall it looks really good — thanks! I've marked some conversations as "resolved" as an easy way to remove some clutter and to signal my agreement.

@@ -22,6 +22,9 @@ struct ddsi_status_cb_data;
/** @component reader */
void dds_reader_status_cb (void *entity, const struct ddsi_status_cb_data * data);

/** @component reader */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/** @component reader */
/** @brief Invokes listeners for events signalled in the entity status
* @component reader
* @note expects `e->m_observers_lock` to be held */

(Hopefully I got the doxygen syntax right)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will copy this into my pull request.

@@ -119,6 +119,7 @@ typedef struct dds_entity_deriver {
dds_return_t (*validate_status) (uint32_t mask);
struct dds_statistics * (*create_statistics) (const struct dds_entity *e);
void (*refresh_statistics) (const struct dds_entity *e, struct dds_statistics *s);
void (*invoke_cbs_for_pending_events) (struct dds_entity *e, uint32_t status);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void (*invoke_cbs_for_pending_events) (struct dds_entity *e, uint32_t status);
/// Invoke listeners for which the corresponding event flag is set in the status mask
/// @note expects `e->m_observers_lock` to be held on entry
/// @note may unlock and re-lock `e->m_observers_lock`
void (*invoke_cbs_for_pending_events) (struct dds_entity *e, uint32_t status);

And similar for the other related functions. The reason for making such a fuss is that it is really easy to make a mistake with the locking if it behaves like this, so better to state it in every location where one might conceivably look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will copy this into my pull request.

@@ -1007,6 +1012,12 @@ static void pushdown_listener (dds_entity *e)

ddsrt_mutex_lock (&e->m_observers_lock);
dds_override_inherited_listener (&c->m_listener, &e->m_listener);
if (ddsrt_avl_is_empty(&c->m_children)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder ...

The status flags are only ever set on the entity it concerns, so, e.g., a "publication matched" event is only reflected in the writer's status word. The only status that is a bit weird in this respect is "data on readers" because it may be approximated using "data available" (if it is not materialized). So there's not that much to worry about calling a listener too many times.

There exists the case where I have a subscriber with materialized "data on readers" and one data reader, and I then delete the reader after receiving some data but without performing any operation that resets the "data on readers" flag, thus leaving it in a state with the "data on readers" flag set. If I then install a "data on readers" listener, it would get called if you remove this condition.

Maybe I'm overlooking something, but I think the condition on not having any children is unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you are basically saying here is that since the events only occur on the ultimate children, it doesn't hurt to let for example a Participant or a Publisher check its OfferedIncompatibleQosStatus bit, since that bit will never be set on that level? That might be true, and the check might actually prevent a pending DATA_ON_READERS to be invoked, because a subscriber probably does have children.....
I will remove this check in my pull-request.

@@ -1020,10 +1025,12 @@ static void pushdown_listener (dds_entity *e)
ddsrt_mutex_unlock (&e->m_mutex);
}


dds_return_t dds_set_listener (dds_entity_t entity, const dds_listener_t *listener)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only dds_set_listener in dds.h. It just needs a bit of text like:

dds_set_listener will invoke any listeners for which the corresponding status flag is set. It may cause spurious invocations, including multiple invocations for one listener. For most cases this is unlikely, but for the DATA_ON_READERS listeners it is quite likely, though not certain.

At least that would work for me, but YMMV. Suggestions always welcome 🙂

if (ddsrt_avl_is_empty(&c->m_children)) {
uint32_t status = ddsrt_atomic_ld32 (&c->m_status.m_status_and_mask) & SAM_STATUS_MASK;
if (status) {
dds_entity_deriver_invoke_cbs_for_pending_events(c, status);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not be better to also increment/decrement m_cb_count? I can't (yet) give an execution why it could fail without it, but at the very least there is a consistency and the fact that shaving off a pair of trivial instructions in such a rarely used piece of code isn't worth an inconsistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the revised pull-request we will increase m_cb_pending_count instead, which gives increased safety against two threads running dds_set_listener on the same entity in parallel.

@@ -1007,6 +1012,12 @@ static void pushdown_listener (dds_entity *e)

ddsrt_mutex_lock (&e->m_observers_lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line (or, perhaps more precisely, the fact that it is held here until after the dds_entity_deriver_invoke_cbs_for_pending_events means in this one path it holds the parent entity's m_observers_lock while invoking the listener. I'm sure that'll be fine in many cases, but before I can merge it I want to understand what the effect of that will be.

Or it needs to be released before invoking the listeners. The only thing it is doing for which it requires it is the dds_override_inherited_listener call. So moving the "unlock" is probably the best option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the next pull-request we will unlock m_observers_lock prior to invoking the dds_entity_deriver_invoke_cbs_for_pending_events.

@@ -30,6 +30,7 @@

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is pretty trivial to split this commit into one that fixes the deadline test and one that makes the functional changes on setting the listeners. I think doing so would be worth it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that means we will do the deadline fix in a separate pull request?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do a separate PR, but as far as I am concerned it would also be fine to have in this PR as long as it is separate commit.

e-hndrks added a commit to e-hndrks/cyclonedds-cxx that referenced this pull request Jul 3, 2023
When creating an Entity with a Listener, any instantaneous events that occurred during creation would miss their callback.
This was caused by the C++ API not being able to wrap the underlying C API with a C++ object prior to C already trying to invoking the callback.
That is now resolved by not setting the Listener at creation time, but by setting it after successfully putting a C++ wrapper around the underlying C Entity.
This should fix issue eclipse-cyclonedds#410
A prerequisite for applying this fix is to make sure pull request #eclipse-cyclonedds/cyclonedds#1717 is applied to the cyclonedds repository.

Signed-off-by: Erik Hendriks <[email protected]>
src/core/ddsc/src/dds__types.h Fixed Show fixed Hide fixed
src/core/ddsc/src/dds_entity.c Fixed Show fixed Hide fixed
e-hndrks added a commit to e-hndrks/cyclonedds-cxx that referenced this pull request Jul 3, 2023
When creating an Entity with a Listener, any instantaneous events that occurred during creation would miss their callback.
This was caused by the C++ API not being able to wrap the underlying C API with a C++ object prior to C already trying to invoking the callback.
That is now resolved by not setting the Listener at creation time, but by setting it after successfully putting a C++ wrapper around the underlying C Entity.
This should fix issue eclipse-cyclonedds#410
A prerequisite for applying this fix is to make sure pull request #eclipse-cyclonedds/cyclonedds#1717 is applied to the cyclonedds repository.

Signed-off-by: Erik Hendriks <[email protected]>
e-hndrks added 2 commits July 4, 2023 01:04
When setting a listener, any pending (i.e. unhandled) event received before would not result in an immediate callback.
Now the set_listener call also checks for pending events, and invokes the registered callbacks when appropriate.
This fix is a prerequisite for #eclipse-cyclonedds/cyclonedds-cxx#410

Signed-off-by: Erik Hendriks <[email protected]>
Copy link
Contributor

@eboasson eboasson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me now 😀 Thank you very much @e-hndrks!

@eboasson eboasson merged commit 50d2daf into eclipse-cyclonedds:master Jul 4, 2023
e-hndrks added a commit to e-hndrks/cyclonedds-cxx that referenced this pull request Jul 4, 2023
When creating an Entity with a Listener, any instantaneous events that occurred during creation would miss their callback.
This was caused by the C++ API not being able to wrap the underlying C API with a C++ object prior to C already trying to invoking the callback.
That is now resolved by not setting the Listener at creation time, but by setting it after successfully putting a C++ wrapper around the underlying C Entity.
This should fix issue eclipse-cyclonedds#410
A prerequisite for applying this fix is to make sure pull request #eclipse-cyclonedds/cyclonedds#1717 is applied to the cyclonedds repository.

Signed-off-by: Erik Hendriks <[email protected]>
eboasson pushed a commit to eclipse-cyclonedds/cyclonedds-cxx that referenced this pull request Jul 4, 2023
When creating an Entity with a Listener, any instantaneous events that occurred during creation would miss their callback.
This was caused by the C++ API not being able to wrap the underlying C API with a C++ object prior to C already trying to invoking the callback.
That is now resolved by not setting the Listener at creation time, but by setting it after successfully putting a C++ wrapper around the underlying C Entity.
This should fix issue #410
A prerequisite for applying this fix is to make sure pull request #eclipse-cyclonedds/cyclonedds#1717 is applied to the cyclonedds repository.

Signed-off-by: Erik Hendriks <[email protected]>
eboasson pushed a commit to eboasson/cyclonedds-cxx that referenced this pull request Jul 21, 2023
When creating an Entity with a Listener, any instantaneous events that occurred during creation would miss their callback.
This was caused by the C++ API not being able to wrap the underlying C API with a C++ object prior to C already trying to invoking the callback.
That is now resolved by not setting the Listener at creation time, but by setting it after successfully putting a C++ wrapper around the underlying C Entity.
This should fix issue eclipse-cyclonedds#410
A prerequisite for applying this fix is to make sure pull request #eclipse-cyclonedds/cyclonedds#1717 is applied to the cyclonedds repository.

Signed-off-by: Erik Hendriks <[email protected]>
eboasson pushed a commit to eclipse-cyclonedds/cyclonedds-cxx that referenced this pull request Aug 28, 2023
When creating an Entity with a Listener, any instantaneous events that occurred during creation would miss their callback.
This was caused by the C++ API not being able to wrap the underlying C API with a C++ object prior to C already trying to invoking the callback.
That is now resolved by not setting the Listener at creation time, but by setting it after successfully putting a C++ wrapper around the underlying C Entity.
This should fix issue #410
A prerequisite for applying this fix is to make sure pull request #eclipse-cyclonedds/cyclonedds#1717 is applied to the cyclonedds repository.

Signed-off-by: Erik Hendriks <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants