docs: [FC-0074] add how-to add event bus support to an Open edX Event #428
@@ -0,0 +1,154 @@

Adding Event Bus Support to an Open edX Event
=============================================

Before sending an event across services, you need to ensure that the event is compatible with the Open edX Event Bus. This means that the event, with its corresponding payload, can be emitted by a service through the event bus and consumed by other services. This guide walks you through the process of adding event bus support to an Open edX event.

For more details on how the :term:`Event Payload` is structured, refer to the :doc:`../decisions/0003-events-payload` decision record.

.. note::
   This guide assumes that you have already created an Open edX event. If you haven't, refer to the :doc:`../how-tos/creating-new-events` how-to guide.

Step 1: Does my Event Need Event Bus Support?
----------------------------------------------

By default, Open edX Events should be compatible with the Open edX Event Bus. However, there are cases when support might not be possible or needed for a particular event. Here are some scenarios where you might not need to add event bus support:

- The event is only used within the same application process and cannot be scoped to other services.
- The :term:`Event Payload` contains data types that are not supported by the event bus, and it is not possible to refactor the :term:`Event Payload` to use supported data types.

When adding support is not possible, do the following:

- Add the event to the ``KNOWN_UNSERIALIZABLE_SIGNALS`` list in the ``openedx_events/tooling.py`` file so the event bus ignores it.
- Add a ``warning`` in the event's docstring to inform developers that the event is not compatible with the event bus and why.

If you don't add the event to the ``KNOWN_UNSERIALIZABLE_SIGNALS`` list, the CI/CD pipeline will fail because the Avro schema for the :term:`Event Payload` cannot be generated. If you don't add a warning in the event's docstring, developers might try to send the event across services and encounter issues.
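
The opt-out described above can be sketched roughly as follows. This is only an illustration: the list here is a stand-in for the one in ``openedx_events/tooling.py``, it assumes the list holds event type strings, and both the event type and the helper function are hypothetical.

.. code-block:: python

   # Stand-in for the list defined in openedx_events/tooling.py.
   # Assumption: entries are event type strings.
   KNOWN_UNSERIALIZABLE_SIGNALS = [
       # ... existing entries ...
       "org.openedx.learning.example.thing.changed.v1",  # hypothetical event type
   ]


   def is_excluded_from_event_bus(event_type: str) -> bool:
       """Hypothetical helper: True when the event type is listed as unserializable."""
       return event_type in KNOWN_UNSERIALIZABLE_SIGNALS

Listing the event this way only documents that it is skipped; the warning in the event's docstring is still needed so developers know why.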

Comment on lines +11 to +24

Review comment: @bmtcril: At first we thought that all events in the repo should have event bus support by default. So I was going to add support for these events: https://github.com/openedx/openedx-events/blob/main/openedx_events/tooling.py#L20-L32. However, I realized that we would also need to add support for dictionaries (typed and more complex) and/or a rewrite of the data classes, which requires a lot more effort than what we gain with the support since, as far as I understand, most of those events are locally scoped. Do you think the question, "Does my Event Need Event Bus Support?" is relevant considering what I mentioned, and that should we study each event before compromising on event bus support?

Review comment: I'm not sure I follow when you say locally scoped. Some of the discussion thread events are very likely to be interesting by consumers of the event bus trying to make the discussion experience more reactive. How much effort could it take to refactor those classes or to implement serialization capabilities for a list/dict with a limited capability of nesting. E.g: lists of primitives currently supported by avro.

Review comment: @felipemontoya: thanks for the reply! I meant for interest only within the service where it's sent. But as you said, you can argue that all events can be of interest to consumers. Do you think this section "Does my Event Need Event Bus Support?" is still relevant?

I'm more concerned about doing it properly. We already have "support" for those limited nesting capabilities, i.e., using attrs classes for fixed dicts or JSON structs as strings when we don't know the content of the dicts beforehand. I already added a note here with the suggestion: https://github.com/openedx/openedx-events/pull/428/files#diff-67886caf4b3357c606ceb6d3ea25e3839b6056f13b22d48a146431adc0fa829dR120. So I don't think adding support for dicts is strictly necessary. I'm sorry I didn't mention this in my previous comment, so it read only that it was too much effort to add support. But then we have dicts of lists or lists of data attrs which my guess is that is more difficult to serialize, or maybe we hadn't had a strong use case for it, and that's why it's not supported. That's where I wondered, should an event always have support for the event bus or can we be flexible with that requirement? I totally understand why we wouldn't want the separation between events with/without event bus support, but that's where we currently stand.

I was thinking we could create new event versions without the complex serializable sections and considering what I said about fixed/dynamic dicts, so we can rewrite the data and make it suitable for the event bus, but leave the previous versions to be sent within the service. However, that would require sending/maintaining two versions of the event.

Review comment: As a result of this conversation, I started testing two approaches to give basic event bus support to data classes with dictionaries (str keys, only primitive types as values).

I was able to generate schemas for both approaches and avro tests seem to be passing, but I haven't tested them with an event bus implementation just yet. Let me know what you think!

Review comment: I managed to send forum events through the event bus by using this implementation: #433

Step 2: Define the Event Payload
--------------------------------

An Open edX Event is compatible with the event bus when its payload can be serialized, sent, and deserialized by other services. The payload, structured as `attrs data classes`_, must align with the event bus schema format, which in this case is the :term:`Avro Schema`. This schema is used to serialize and deserialize the :term:`Event Payload` when sending it across services.

This ensures the event can be sent by the producer and then re-emitted by the same instance of `OpenEdxPublicSignal`_ on the consumer side, guaranteeing that the data sent and received is identical. Serializing this way should prevent data inconsistencies between services, e.g., timezone issues and precision loss. For more information on the event bus schema format, refer to the :doc:`../decisions/0004-external-event-bus-and-django-signal-events` and :doc:`../decisions/0005-external-event-schema-format` decision records.

The data types that the current Open edX Event Bus supports in attrs classes, with the chosen schema, are:

Primitive Data Types
~~~~~~~~~~~~~~~~~~~~

- Boolean
- Integer
- Float
- String
- Bytes

Complex Data Types
~~~~~~~~~~~~~~~~~~

- Type-annotated Lists (e.g., ``List[int]``, ``List[str]``)
- Attrs Classes (e.g., ``UserNonPersonalData``, ``UserPersonalData``, ``UserData``, ``CourseData``)
- Types with Custom Serializers (e.g., ``CourseKey``, ``datetime``)
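
To make the relationship between these Python types and an Avro schema more concrete, here is a minimal, self-contained sketch of how primitives and type-annotated lists could be translated into Avro type descriptions. It is not the library's implementation: the function name and the mapping table are assumptions for illustration, and the mapping the library actually uses may differ.

.. code-block:: python

   from typing import List, get_args, get_origin

   # Assumed mapping from Python primitives to Avro primitive type names.
   PRIMITIVE_TO_AVRO = {
       bool: "boolean",
       int: "long",
       float: "double",
       str: "string",
       bytes: "bytes",
   }


   def avro_type_for(python_type):
       """Return an Avro type description for a primitive or a type-annotated list."""
       if python_type in PRIMITIVE_TO_AVRO:
           return PRIMITIVE_TO_AVRO[python_type]
       if get_origin(python_type) is list:
           args = get_args(python_type)
           if not args:
               # A bare ``List`` carries no item type, so no schema can be derived.
               raise TypeError("Lists must be type-annotated, e.g. List[int]")
           return {"type": "array", "items": avro_type_for(args[0])}
       raise TypeError(f"Unsupported type for this sketch: {python_type!r}")

For example, ``avro_type_for(List[str])`` describes an Avro array of strings, while an unannotated ``dict`` is rejected, which mirrors why untyped containers cannot be given a schema.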

Ensure that the :term:`Event Payload` is structured as `attrs data classes`_ and that the data types used in those classes align with the event bus schema format.

In the ``data.py`` files within each architectural subdomain you can find examples of the :term:`Event Payload` structured as `attrs data classes`_ that align with the event bus schema format.
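
As a rough illustration of such a payload, the sketch below defines two attrs classes that use only the data types listed in this step. The class and field names are hypothetical and do not exist in openedx-events; the example requires the ``attrs`` package.

.. code-block:: python

   from datetime import datetime
   from typing import List

   import attr


   @attr.s(frozen=True)
   class ExampleItemData:
       """Hypothetical nested payload made only of primitive fields."""

       item_id = attr.ib(type=str)
       score = attr.ib(type=float)


   @attr.s(frozen=True)
   class ExampleEventData:
       """Hypothetical payload combining primitives, a typed list, and a nested attrs class."""

       name = attr.ib(type=str)
       is_active = attr.ib(type=bool)
       tags = attr.ib(type=List[str], factory=list)
       item = attr.ib(type=ExampleItemData, default=None)
       # datetime is one of the types the guide lists as having a custom serializer.
       created = attr.ib(type=datetime, default=None)

Every field carries an explicit type, which is what allows a schema to be derived from the class definition.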

Step 3: Ensure Serialization and Deserialization
------------------------------------------------

Before sending the event across services, you need to ensure that the :term:`Event Payload` can be serialized and deserialized correctly. The event bus concrete implementations use the :term:`Avro Schema` to serialize and deserialize the :term:`Event Payload` as mentioned in the :doc:`../decisions/0005-external-event-schema-format` decision record. The concrete implementation of the event bus handles the serialization and deserialization with the help of methods implemented by this library.

.. For example, here's how the Redis event bus handles serialization before sending a message:

Review comment: Looks like your formatting got messed up here

Reply: I commented this section because it was too specific and unrelated to adding event bus support for an event. What do you think?

.. .. code-block:: python
..    :emphasize-lines: 4

..    # edx_event_bus_redis/internal/producer.py
..    full_topic = get_full_topic(topic)
..    context.full_topic = full_topic
..    event_bytes = serialize_event_data_to_bytes(event_data, signal)
..    message = RedisMessage(topic=full_topic, event_data=event_bytes, event_metadata=event_metadata)
..    stream_data = message.to_binary_dict()

.. Where `serialize_event_data_to_bytes`_ is a method that serializes the :term:`Event Payload` to bytes using the Avro schema. While the consumer side deserializes the :term:`Event Payload` using the Avro schema with the help of the `deserialize_bytes_to_event_data`_ method:

.. .. code-block:: python
..    :emphasize-lines: 3

..    # edx_event_bus_redis/internal/consumer.py
..    signal = OpenEdxPublicSignal.get_signal_by_type(msg.event_metadata.event_type)
..    event_data = deserialize_bytes_to_event_data(msg.event_data, signal)
..    send_results = signal.send_event_with_custom_metadata(msg.event_metadata, **event_data)

If the :term:`Event Payload` contains types that are not supported by the event bus, you can implement custom serializers for these types. This ensures that the :term:`Event Payload` can be serialized and deserialized correctly when sent across services.

Here is an example of a custom serializer for the ``CourseKey`` type:

.. code-block:: python

   # event_bus/avro/custom_serializers.py
   class CourseKeyAvroSerializer(BaseCustomTypeAvroSerializer):
       """
       CustomTypeAvroSerializer for CourseKey class.
       """

       cls = CourseKey
       field_type = PYTHON_TYPE_TO_AVRO_MAPPING[str]

       @staticmethod
       def serialize(obj) -> str:
           """Serialize obj into string."""
           return str(obj)

       @staticmethod
       def deserialize(data: str):
           """Deserialize string into obj."""
           return CourseKey.from_string(data)

After implementing the serializer, add it to ``DEFAULT_CUSTOM_SERIALIZERS`` at the end of the ``event_bus/avro/custom_serializers.py`` file:

.. code-block:: python

   DEFAULT_CUSTOM_SERIALIZERS = [
       # Other custom serializers
       CourseKeyAvroSerializer,
   ]

Now the :term:`Event Payload` can be serialized and deserialized correctly when sent across services.
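
A quick way to gain confidence in a new serializer is to check that a value survives a serialize/deserialize round trip. The sketch below shows the idea with a ``datetime`` serializer. To stay self-contained it uses a simplified stand-in for the base class rather than importing the library, so the class names and the ``round_trips`` helper are assumptions for illustration.

.. code-block:: python

   from datetime import datetime, timezone


   class BaseSerializerSketch:
       """Simplified stand-in for the library's base serializer class."""

       cls = None
       field_type = None

       @staticmethod
       def serialize(obj):
           raise NotImplementedError

       @staticmethod
       def deserialize(data):
           raise NotImplementedError


   class DatetimeSerializerSketch(BaseSerializerSketch):
       """Serialize datetimes as ISO 8601 strings, keeping timezone and microseconds."""

       cls = datetime
       field_type = "string"

       @staticmethod
       def serialize(obj) -> str:
           return obj.isoformat()

       @staticmethod
       def deserialize(data: str):
           return datetime.fromisoformat(data)


   def round_trips(serializer, value) -> bool:
       """Return True when deserializing the serialized value yields an equal object."""
       return serializer.deserialize(serializer.serialize(value)) == value

A check like ``round_trips(DatetimeSerializerSketch, some_datetime)`` is the kind of assertion worth adding to the tests of any new custom serializer.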

.. warning::
   One of the known limitations of the current Open edX Event Bus is that it does not support dictionaries as data types. If the :term:`Event Payload` contains dictionaries, you may need to refactor the :term:`Event Payload` to use supported data types. When you know the structure of the dictionary, you can create an attrs class that represents the dictionary structure. If not, you can use a str type to represent the dictionary as a string and deserialize it on the consumer side using JSON deserialization.
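
The string-based fallback mentioned in the warning can be sketched as follows. This is a minimal illustration using only the standard ``json`` module; the helper names are hypothetical and are not part of openedx-events, and it assumes the dictionary holds only JSON-compatible values.

.. code-block:: python

   import json


   def dict_to_payload_string(data: dict) -> str:
       """Producer side: store the dictionary in a ``str`` field as JSON text."""
       # sort_keys makes the output deterministic, which helps when comparing payloads.
       return json.dumps(data, sort_keys=True)


   def payload_string_to_dict(raw: str) -> dict:
       """Consumer side: rebuild the dictionary from the JSON text."""
       return json.loads(raw)

The trade-off is that the schema only sees a string, so the consumer gets no validation of the dictionary's contents. When the structure is known, an attrs class remains the better choice.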

Step 4: Generate the Avro Schema
--------------------------------

As mentioned in the previous step, the serialization and deserialization of the :term:`Event Payload` is handled by the concrete event bus implementation with the help of methods implemented in this library. Although openedx-events does not handle the serialization and deserialization of the :term:`Event Payload` directly, it ensures the payload of new events can be serialized and deserialized correctly by adding checks in the CI/CD pipeline for schema verification. To ensure tests pass, you need to generate an Avro test schema for your new event's :term:`Event Payload`:

1. Run the following command to generate the Avro schema for the :term:`Event Payload`:

   .. code-block:: bash

      python manage.py generate_avro_schemas YOUR_EVENT_TYPE

   Run ``python manage.py generate_avro_schemas --help`` to see the available options for the command.

2. The Avro schema for the :term:`Event Payload` will be generated in the ``openedx_events/event_bus/avro/tests/schemas`` directory.
3. Push the changes to the branch and create a pull request, or run the checks locally to verify that the Avro schema was generated correctly:

   .. code-block:: bash

      make test

Step 5: Send the Event Across Services with the Event Bus
---------------------------------------------------------

To validate that the event emitted by a service can be consumed through the event bus, send the event across services. One way to do this is with the Redis event bus implementation, following the `setup instructions in a Tutor environment`_. We also recommend following :doc:`../how-tos/using-the-event-bus` to understand how to use the event bus in your environment.

.. note:: If you implemented a custom serializer for a type in the :term:`Event Payload`, the custom serializer support must be included in both the producer and consumer sides before it can be used.
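
As a sketch of what enabling the event on the producer side might look like, the fragment below assumes that your environment uses the ``EVENT_BUS_PRODUCER_CONFIG`` Django setting to decide which events are published and to which topics. The event type, topic name, and key field are placeholders to replace with your own values; check the event bus how-to for the authoritative format.

.. code-block:: python

   # Placeholder values: replace the event type, topic name, and key field.
   EVENT_BUS_PRODUCER_CONFIG = {
       "YOUR_EVENT_TYPE": {
           "your-topic-name": {
               # Dotted path inside the event data used as the message key.
               "event_key_field": "your_data.key_field",
               "enabled": True,
           },
       },
   }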

.. _Avro: https://avro.apache.org/
.. _OpenEdxPublicSignal: https://github.com/openedx/openedx-events/blob/main/openedx_events/tooling.py#L37
.. _attrs data classes: https://www.attrs.org/en/stable/overview.html
.. _serialize_event_data_to_bytes: https://github.com/openedx/openedx-events/blob/main/openedx_events/event_bus/avro/serializer.py#L82-L98
.. _deserialize_bytes_to_event_data: https://github.com/openedx/openedx-events/blob/main/openedx_events/event_bus/avro/deserializer.py#L86-L98
.. _setup instructions in a Tutor environment: https://github.com/openedx/event-bus-redis/blob/main/docs/tutor_installation.rst