docs: [FC-0074] add how-to add event bus support to an Open edX Event #428

Open
wants to merge 12 commits into
base: main
154 changes: 154 additions & 0 deletions docs/how-tos/adding-event-bus-support-to-an-event.rst
@@ -0,0 +1,154 @@
Adding Event Bus Support to an Open edX Event
=============================================

Before sending an event across services, you need to ensure that the event is compatible with the Open edX Event Bus. This involves ensuring that the event, with its corresponding payload, can be emitted by a service through the event bus and that it can be consumed by other services. This guide will walk you through the process of adding event bus support to an Open edX event.

For more details on how the :term:`Event Payload` is structured, refer to the :doc:`../decisions/0003-events-payload` decision record.

.. note::
    This guide assumes that you have already created an Open edX event. If you haven't, refer to the :doc:`../how-tos/creating-new-events` how-to guide.

Step 1: Does my Event Need Event Bus Support?
----------------------------------------------

By default, Open edX Events should be compatible with the Open edX Event Bus. However, there are cases when the support might not be possible or needed for a particular event. Here are some scenarios where you might not need to add event bus support:

- The event is only used within the same application process and cannot be scoped to other services.
- The :term:`Event Payload` contains data types that are not supported by the event bus, and it is not possible to refactor the :term:`Event Payload` to use supported data types.
Contributor

Suggested change
- The :term:`Event Payload` contains data types that are not supported by the event bus, and it is not possible to refactor the :term:`Event Payload` to use supported data types.
- The :term:`Event Payload` contains data types that are not supported by the event bus (such as ...), and it is not possible to refactor the :term:`Event Payload` to use supported data types.


When adding support is not possible, do the following:

- Add it to the ``KNOWN_UNSERIALIZABLE_SIGNALS`` list in the ``openedx_events/tooling.py`` file so the event bus ignores it.
- Add a ``warning`` in the event's docstring to inform developers that the event is not compatible with the event bus and why.

If you don't add the event to the ``KNOWN_UNSERIALIZABLE_SIGNALS`` list, the CI/CD pipeline will fail because the Avro schema for the :term:`Event Payload` could not be generated. If you don't add a warning in the event's docstring, developers might try to send the event across services and encounter issues.
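
The two actions above can be sketched roughly as follows. This is only an illustration: it assumes ``KNOWN_UNSERIALIZABLE_SIGNALS`` is a plain list of event type strings, and the event type, the docstring wording, and the ``needs_avro_schema`` helper are hypothetical names that do not come from the library.

```python
# Sketch of the exclusion list kept in openedx_events/tooling.py.
# Assumption: entries are event type strings; the entry below is made up.
KNOWN_UNSERIALIZABLE_SIGNALS = [
    "org.openedx.learning.example.thing.changed.v1",
]

# Hypothetical docstring text for the event definition, carrying the warning.
EXAMPLE_EVENT_DOCSTRING = """
.. warning:: This event is not compatible with the event bus because its
   payload contains data types that the event bus cannot serialize.
"""


def needs_avro_schema(event_type: str) -> bool:
    """Illustrative helper: True when schema checks should cover the event."""
    return event_type not in KNOWN_UNSERIALIZABLE_SIGNALS


print(needs_avro_schema("org.openedx.learning.example.thing.changed.v1"))  # False
print(needs_avro_schema("org.openedx.learning.course.enrollment.created.v1"))  # True
```

Keeping the exclusion explicit, together with the docstring warning, makes the decision visible both to the schema checks and to developers reading the event definition.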
Comment on lines +11 to +24
Member Author

@mariajgrimaldi mariajgrimaldi Dec 9, 2024

@bmtcril: At first we thought that all events in the repo should have event bus support by default. So I was going to add support for these events: https://github.com/openedx/openedx-events/blob/main/openedx_events/tooling.py#L20-L32. However, I realized that we would also need to add support for dictionaries (typed and more complex) and/or a rewrite of the data classes, which requires a lot more effort than what we gain with the support since, as far as I understand, most of those events are locally scoped.

Do you think the question, "Does my Event Need Event Bus Support?" is relevant considering what I mentioned, and that we should study each event before committing to event bus support?

FYI @sarina @felipemontoya

Member

I'm not sure I follow when you say locally scoped.

Some of the discussion thread events are very likely to be interesting to consumers of the event bus trying to make the discussion experience more reactive.

How much effort could it take to refactor those classes or to implement serialization capabilities for a list/dict with a limited capability of nesting. E.g: lists of primitives currently supported by avro.
Another option would be to say that we send most of the envelope of those discussion events and we keep the content of the discussion out of the serialization. We do the same with other objects such as a Django user: we pass name, email, and ID, and we leave the consumer to figure out the rest via API calls or such.

Member Author

@mariajgrimaldi mariajgrimaldi Dec 10, 2024

@felipemontoya: thanks for the reply!

> I'm not sure I follow when you say locally scoped.

I meant for interest only within the service where it's sent. But as you said, you can argue that all events can be of interest to consumers. Do you think this section "Does my Event Need Event Bus Support?" is still relevant?

> How much effort could it take to refactor those classes or to implement serialization capabilities for a list/dict with a limited capability of nesting. E.g: lists of primitives currently supported by avro.

I'm more concerned about doing it properly. We already have "support" for those limited nesting capabilities, i.e., using attrs classes for fixed dicts or JSON structs as strings when we don't know the content of the dicts beforehand. I already added a note here with the suggestion: https://github.com/openedx/openedx-events/pull/428/files#diff-67886caf4b3357c606ceb6d3ea25e3839b6056f13b22d48a146431adc0fa829dR120. So I don't think adding support for dicts is strictly necessary. I'm sorry I didn't mention this in my previous comment, so it read only that it was too much effort to add support.

But then we have dicts of lists or lists of data attrs, which my guess is are more difficult to serialize, or maybe we haven't had a strong use case for them, and that's why they're not supported.

That's where I wondered, should an event always have support for the event bus or can we be flexible with that requirement? I totally understand why we wouldn't want the separation between events with/without event bus support, but that's where we currently stand.

> Another option would be to say that we send most of the envelope of those discussion events and we keep the content of the discussion out of the serialization.

I was thinking we could create new event versions without the complex serializable sections and considering what I said about fixed/dynamic dicts, so we can rewrite the data and make it suitable for the event bus, but leave the previous versions to be sent within the service. However, that would require sending/maintaining two versions of the event.

Member Author

@mariajgrimaldi mariajgrimaldi Dec 12, 2024

As a result of this conversation, I started testing two approaches to give basic event bus support to data classes with dictionaries (str keys, only primitive types as values).

  1. Send dicts as JSON structs (str): feat: add event bus support to forum events V2 #434
  2. Add avro map support to dicts (based on this PR Cristhian opened a few months ago): feat: add support for annotated python dicts as avro map type  #433

I was able to generate schemas for both approaches and avro tests seem to be passing, but I haven't tested them with an event bus implementation just yet. Let me know what you think!

Member Author

I managed to send forum events through the event bus by using this implementation: #433


Step 2: Define the Event Payload
--------------------------------

An Open edX Event is compatible with the event bus when its payload can be serialized, sent, and deserialized by other services. The payload, structured as `attrs data classes`_, must align with the event bus schema format, which in this case is the :term:`Avro Schema`. This schema is used to serialize and deserialize the :term:`Event Payload` when sending it across services.

This ensures the event can be sent by the producer and then re-emitted by the same instance of `OpenEdxPublicSignal`_ on the consumer side, guaranteeing that the data sent and received is identical. Serializing this way should prevent data inconsistencies between services, e.g., timezone issues and precision loss. For more information on the event bus schema format, refer to the :doc:`../decisions/0004-external-event-bus-and-django-signal-events` and :doc:`../decisions/0005-external-event-schema-format` decision records.

The data types that the current Open edX Event Bus supports in attrs classes, given the chosen schema format, are:

Primitive Data Types
~~~~~~~~~~~~~~~~~~~~

- Boolean
- Integer
- Float
- String
- Bytes

Complex Data Types
~~~~~~~~~~~~~~~~~~

- Type-annotated Lists (e.g., ``List[int]``, ``List[str]``)
- Attrs Classes (e.g., ``UserNonPersonalData``, ``UserPersonalData``, ``UserData``, ``CourseData``)
- Types with Custom Serializers (e.g., ``CourseKey``, ``datetime``)

Ensure that the :term:`Event Payload` is structured as `attrs data classes`_ and that the data types used in those classes align with the event bus schema format.

In the ``data.py`` files within each architectural subdomain you can find examples of the :term:`Event Payload` structured as `attrs data classes`_ that align with the event bus schema format.
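
As an illustration of a payload that uses only the types listed above, here is a minimal sketch. The class and field names (``ExampleAuthorData``, ``ExamplePostData``) are hypothetical and do not exist in the library; the ``attr.s``/``attr.ib`` declaration style is an assumption about how such classes are usually written with attrs.

```python
from datetime import datetime, timezone
from typing import List

import attr


@attr.s(frozen=True)
class ExampleAuthorData:
    """Hypothetical nested attrs class built from primitive types."""

    username = attr.ib(type=str)
    is_staff = attr.ib(type=bool, default=False)


@attr.s(frozen=True)
class ExamplePostData:
    """Hypothetical payload: primitives, a nested attrs class, a datetime, a typed list."""

    post_id = attr.ib(type=int)
    title = attr.ib(type=str)
    author = attr.ib(type=ExampleAuthorData)
    created = attr.ib(type=datetime)
    tags = attr.ib(type=List[str], factory=list)


post = ExamplePostData(
    post_id=1,
    title="Welcome",
    author=ExampleAuthorData(username="alice"),
    created=datetime(2024, 12, 9, tzinfo=timezone.utc),
    tags=["intro"],
)
```

Note that the sketch avoids plain ``dict`` fields and untyped lists, since every field needs a type that the schema format can describe.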

Step 3: Ensure Serialization and Deserialization
------------------------------------------------

Before sending the event across services, you need to ensure that the :term:`Event Payload` can be serialized and deserialized correctly. The event bus concrete implementations use the :term:`Avro Schema` to serialize and deserialize the :term:`Event Payload` as mentioned in the :doc:`../decisions/0005-external-event-schema-format` decision record. The concrete implementation of the event bus handles the serialization and deserialization with the help of methods implemented by this library.

.. For example, here's how the Redis event bus handles serialization before sending a message:
Contributor

Looks like your formatting got messed up here

Member Author

I commented this section because it was too specific and unrelated to adding event bus support for an event. What do you think?


.. .. code-block:: python
.. :emphasize-lines: 4

.. # edx_event_bus_redis/internal/producer.py
.. full_topic = get_full_topic(topic)
.. context.full_topic = full_topic
.. event_bytes = serialize_event_data_to_bytes(event_data, signal)
.. message = RedisMessage(topic=full_topic, event_data=event_bytes, event_metadata=event_metadata)
.. stream_data = message.to_binary_dict()

.. Where `serialize_event_data_to_bytes`_ is a method that serializes the :term:`Event Payload` to bytes using the Avro schema. While the consumer side deserializes the :term:`Event Payload` using the Avro schema with the help of the `deserialize_bytes_to_event_data`_ method:

.. .. code-block:: python
.. :emphasize-lines: 3

.. # edx_event_bus_redis/internal/consumer.py
.. signal = OpenEdxPublicSignal.get_signal_by_type(msg.event_metadata.event_type)
.. event_data = deserialize_bytes_to_event_data(msg.event_data, signal)
.. send_results = signal.send_event_with_custom_metadata(msg.event_metadata, **event_data)

If the :term:`Event Payload` contains types that are not supported by the event bus, you could implement custom serializers for these types. This ensures that the :term:`Event Payload` can be serialized and deserialized correctly when sent across services.

Here is an example of a custom serializer for the ``CourseKey`` type:

.. code-block:: python

    # event_bus/avro/custom_serializers.py
    class CourseKeyAvroSerializer(BaseCustomTypeAvroSerializer):
        """
        CustomTypeAvroSerializer for CourseKey class.
        """

        cls = CourseKey
        field_type = PYTHON_TYPE_TO_AVRO_MAPPING[str]

        @staticmethod
        def serialize(obj) -> str:
            """Serialize obj into string."""
            return str(obj)

        @staticmethod
        def deserialize(data: str):
            """Deserialize string into obj."""
            return CourseKey.from_string(data)


After implementing the serializer, add it to ``DEFAULT_CUSTOM_SERIALIZERS`` at the end of the ``event_bus/avro/custom_serializers.py`` file:

.. code-block:: python

    DEFAULT_CUSTOM_SERIALIZERS = [
        # Other custom serializers
        CourseKeyAvroSerializer,
    ]

Now the :term:`Event Payload` can be serialized and deserialized correctly when sent across services.
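
The same pattern could be applied to another type. The sketch below is a standalone illustration, not library code: it defines a stand-in ``BaseCustomTypeAvroSerializer`` so it can run on its own, and ``UUIDAvroSerializer`` is a hypothetical serializer that assumes a UUID can be stored as an Avro ``string``.

```python
import uuid
from abc import ABC, abstractmethod


class BaseCustomTypeAvroSerializer(ABC):
    """Stand-in for the library's base class so this sketch runs on its own."""

    cls = None
    field_type = None

    @staticmethod
    @abstractmethod
    def serialize(obj) -> str:
        """Convert obj into its Avro-friendly representation."""

    @staticmethod
    @abstractmethod
    def deserialize(data: str):
        """Rebuild the original object from its serialized form."""


class UUIDAvroSerializer(BaseCustomTypeAvroSerializer):
    """Hypothetical serializer that stores a UUID as an Avro string."""

    cls = uuid.UUID
    field_type = "string"  # Avro primitive type name, assumed to match the str mapping

    @staticmethod
    def serialize(obj) -> str:
        """Serialize the UUID into its canonical string form."""
        return str(obj)

    @staticmethod
    def deserialize(data: str):
        """Rebuild the UUID from its string form."""
        return uuid.UUID(data)


# Round trip: what the producer serializes, the consumer should get back unchanged.
original = uuid.UUID("12345678-1234-5678-1234-567812345678")
restored = UUIDAvroSerializer.deserialize(UUIDAvroSerializer.serialize(original))
assert restored == original
```

A round-trip check like the last three lines is a cheap way to confirm that ``serialize`` and ``deserialize`` are true inverses before registering a serializer.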

.. warning::
    One of the known limitations of the current Open edX Event Bus is that it does not support dictionaries as data types. If the :term:`Event Payload` contains dictionaries, you may need to refactor the :term:`Event Payload` to use supported data types. When you know the structure of the dictionary, you can create an attrs class that represents the dictionary structure. If not, you can use a str type to represent the dictionary as a string and deserialize it on the consumer side using JSON deserialization.
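
The string-based workaround for dictionaries of unknown structure can be sketched with the standard ``json`` module. The dictionary contents and variable names below are made up for illustration; the payload field holding ``extra_json`` would be declared as ``str``.

```python
import json

# Producer side: a dict whose structure is not known ahead of time.
extra = {"source": "forum", "votes": 3}

# Store it in the payload as a string field instead of a dict field.
extra_json = json.dumps(extra, sort_keys=True)
print(extra_json)  # {"source": "forum", "votes": 3}

# Consumer side: rebuild the dict from the string field.
restored = json.loads(extra_json)
assert restored == extra
```

This only works for values that JSON can represent, so types such as ``datetime`` or ``CourseKey`` would need to be converted to strings first.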

Step 4: Generate the Avro Schema
--------------------------------

As mentioned in the previous step, the serialization and deserialization of the :term:`Event Payload` is handled by the concrete event bus implementation with the help of methods implemented in this library. Although openedx-events does not handle the serialization and deserialization of the :term:`Event Payload` directly, it ensures the payload of new events can be serialized and deserialized correctly by adding checks in the CI/CD pipeline for schema verification. To ensure tests pass, you need to generate an Avro test schema for your new event's :term:`Event Payload`:

1. Run the following command to generate the Avro schema for the :term:`Event Payload`:

   .. code-block:: bash

      python manage.py generate_avro_schemas YOUR_EVENT_TYPE

   Run ``python manage.py generate_avro_schemas --help`` to see the available options for the command.

2. The Avro schema for the :term:`Event Payload` will be generated in the ``openedx_events/event_bus/avro/tests/schemas`` directory.
3. Push the changes to the branch and create a pull request or run the checks locally to verify that the Avro schema was generated correctly.

   .. code-block:: bash

      make test
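
To give a rough idea of what an Avro record schema for a flat payload looks like, here is a standalone sketch. It is not the library's generator: ``PYTHON_TO_AVRO`` and ``build_record_schema`` are hypothetical names, the type mapping is an assumption based on Avro's primitive type names, and the files produced by ``generate_avro_schemas`` may be laid out differently.

```python
import json

# Assumed mapping from Python types to Avro primitive type names.
PYTHON_TO_AVRO = {
    bool: "boolean",
    int: "long",
    float: "double",
    str: "string",
    bytes: "bytes",
}


def build_record_schema(name, fields):
    """Build an Avro record schema dict for a flat mapping of field name to Python type."""
    return {
        "name": name,
        "type": "record",
        "fields": [
            {"name": field_name, "type": PYTHON_TO_AVRO[field_type]}
            for field_name, field_type in fields.items()
        ],
    }


# Hypothetical payload with two primitive fields.
schema = build_record_schema("ExampleData", {"post_id": int, "title": str})
print(json.dumps(schema, indent=2))
```

Comparing a stored schema like this against a freshly generated one is what lets a schema check catch payload changes that would break consumers.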

Step 5: Send the Event Across Services with the Event Bus
---------------------------------------------------------

To validate that the event emitted by a service can be consumed through the event bus, send the event across services. You can do this with the Redis event bus implementation by following the `setup instructions in a Tutor environment`_. We also recommend following :doc:`../how-tos/using-the-event-bus` to understand how to use the event bus in your environment.

.. note:: If you implemented a custom serializer for a type in the :term:`Event Payload`, the custom serializer support must be included in both the producer and consumer sides before it can be used.

.. _Avro: https://avro.apache.org/
.. _OpenEdxPublicSignal: https://github.com/openedx/openedx-events/blob/main/openedx_events/tooling.py#L37
.. _attrs data classes: https://www.attrs.org/en/stable/overview.html
.. _serialize_event_data_to_bytes: https://github.com/openedx/openedx-events/blob/main/openedx_events/event_bus/avro/serializer.py#L82-L98
.. _deserialize_bytes_to_event_data: https://github.com/openedx/openedx-events/blob/main/openedx_events/event_bus/avro/deserializer.py#L86-L98
.. _setup instructions in a Tutor environment: https://github.com/openedx/event-bus-redis/blob/main/docs/tutor_installation.rst
1 change: 1 addition & 0 deletions docs/how-tos/index.rst
@@ -8,5 +8,6 @@ How-tos
creating-new-events
adding-events-to-a-service
using-events
adding-event-bus-support-to-an-event
using-the-event-bus
add-new-event-bus-concrete-implementation
2 changes: 2 additions & 0 deletions docs/how-tos/using-the-event-bus.rst
@@ -1,6 +1,8 @@
Using the Open edX Event Bus
============================

.. note:: Be sure to check out how to make your Open edX Event compatible with the event bus in the :doc:`../how-tos/adding-event-bus-support-to-an-event` guide.

After creating a new Open edX Event, you might need to send it across services instead of just within the same process. For this kind of use case, you might want to use the Open edX Event Bus. The :doc:`../concepts/event-bus` page has useful information to help you get familiar with the Open edX Event Bus.

The Open edX Event Bus is a key component of the Open edX architecture, enabling services to communicate without direct dependencies on each other. This guide provides an overview of how to use the event bus to broadcast Open edX Events to multiple services, allowing them to react to changes or actions in the system.
7 changes: 7 additions & 0 deletions docs/reference/glossary.rst
@@ -41,4 +41,11 @@ An event has multiple components that are used to define, trigger, and handle th
Topic
How the event bus implementation groups related events, such as streams in Redis. Producers publish events to topics, and consumers subscribe to topics to receive events.

Avro Schema
A specification describing the expected field names and types in an Avro record dictionary. See `Apache Avro`_ for more information.

Avro Record Dictionary
A dictionary whose structure is determined by an Avro schema. These dictionaries are the entities that are actually serialized to bytes and sent over the wire to the event bus.

.. _Events Payload ADR: :doc: `/decisions/0003-events-payload`
.. _Apache Avro: https://avro.apache.org/docs/current/spec.html