Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: event bus related ADRs #40

Merged
merged 6 commits into from
Mar 24, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
69 changes: 35 additions & 34 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Change Log
----------
==========

..
All enhancements and patches to openedx_events will be documented
Expand All @@ -12,115 +12,116 @@ Change Log
.. There should always be an "Unreleased" section for changes pending release.

Unreleased
~~~~~~~~~~
----------

[0.8.1] - 2022-03-03
~~~~~~~~~~~~~~~~~~~~
--------------------

Added
-----
~~~~~
* Added missing field for event COURSE_DISCUSSIONS_CHANGED

[0.8.0] - 2022-02-25
~~~~~~~~~~~~~~~~~~~~
--------------------
Added
-----
~~~~~
* Added COURSE_DISCUSSIONS_CHANGED for discussion event

Changed
--------
~~~~~~~
* Changed openedx_events/enterprise/LicenseLifecycle class to openedx_events/enterprise/SubscriptionLicenseData
* Changed LicenseCreated signal class to SUBSCRIPTION_LICENSE_MODIFIED signal class

[0.7.1] - 2022-01-13
~~~~~~~~~~~~~~~~~~~~
--------------------
Added
-----
~~~~~
* Added data definition for enterprise/LicenseLifecycle
* Added LicenseCreated signal definition

[0.7.0] - 2022-01-06
~~~~~~~~~~~~~~~~~~~~
--------------------
Added
-----
~~~~~
* Added AvroAttrsBridge class to convert between avro standard and attrs classes

[0.6.0] - 2021-09-15
~~~~~~~~~~~~~~~~~~~~
--------------------
Added
_____
~~~~~
* Add custom formatting class for events responses.
* Add a way to use send method instead of send_robust while testing.

Changed
_______
~~~~~~~
* Remove unnecessary InstantiationError exception.
* Default is send_robust instead of send unless specified otherwise.

[0.5.1] - 2021-08-26
~~~~~~~~~~~~~~~~~~~~
--------------------
Changed
_______
~~~~~~~
* Remove TestCase inheritance from OpenEdxTestMixin.

[0.5.0] - 2021-08-24
~~~~~~~~~~~~~~~~~~~~
--------------------
Added
_____
~~~~~
* Utilities to use while testing in other platforms.

[0.4.1] - 2021-08-18
~~~~~~~~~~~~~~~~~~~~
--------------------
Changed
_______
~~~~~~~
* Remove raise_exception assignment in events metadata.

[0.4.0] - 2021-08-18
~~~~~~~~~~~~~~~~~~~~
--------------------
Added
_____
~~~~~
* Preliminary Open edX events definitions.

[0.3.0] - 2021-08-18
~~~~~~~~~~~~~~~~~~~~
--------------------
Added
_____
~~~~~
* Add tooling needed to create and trigger events in Open edX platform.
* Add Data Attribute classes used as arguments by Open edX Events.


[0.2.0] - 2021-07-28
~~~~~~~~~~~~~~~~~~~~
--------------------
Changed
_______
~~~~~~~

* Update reposirtory purpose.
* Changed max-doc-length from 79 to 120 following community recommendation.

[0.1.3] - 2021-07-01
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--------------------
Changed
_______
~~~~~~~

* Update setup.cfg with complete bumpversion configuration.

[0.1.2] - 2021-06-29
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--------------------
Changed
_______
~~~~~~~

* Update documentation with current organization info.

[0.1.1] - 2021-06-29
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--------------------
Added
_____
~~~~~

* Add Django testing configuration.

[0.1.0] - 2021-04-07
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--------------------

Added
_____
~~~~~

* First release on PyPI.
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
4. External event bus and Django Signal events
==============================================

Status
------

Provisional

Context
-------

Open edX services already use Django signals for internal events (see `OEP-50: Hooks extension framework <https://open-edx-proposals.readthedocs.io/en/latest/architectural-decisions/oep-0050-hooks-extension-framework.html>`_). Additionally, we are adding the ability to trigger external events using an Event Bus to the platform (see `OEP-52: Event Bus <https://github.com/openedx/open-edx-proposals/pull/233>`_).

The purpose of this ADR is to better define the relationship between the existing internal events and the new external events.

Decision
--------

- Event definitions will be shared between internal and external events. The event definitions will continue to be in the form of ``OpenEdxPublicSignal``, as decided in ADR ":doc:`0003-events-payload`".

- At this time, event bus events will be triggered from a Django signal handler of the Django signal representing the corresponding internal event.

- The Django signal must contain all of the necessary details to serialize the event and send it across the event bus, such that it can be deserialized and converted back into the ``OpenEdxPublicSignal`` again on the consumer side.

- For consumption, the event bus implementation will convert the messages back into Django signals and emit them within the consumer application.

Consequences
------------

- The ``OpenEdxPublicSignal`` serves as the event definition, and doubles as a Django signal, for both internal and external events. In other words, the ``OpenEdxPublicSignal`` provides the schema for external events.

jinder1s marked this conversation as resolved.
Show resolved Hide resolved
- An external event will never be sent without a corresponding internal event (at this time, based on the current design).

- The external event bus handler will listen for Django signals (``OpenEdxPublicSignals``) that have been configured to be sent as external events. These external events will be serialized using the AvroAttrsBridge, as decided in ADR ":doc:`0005-external-event-schema-format`", before sending the messages over the wire.

- The use of the ``OpenEdxPublicSignal`` on both the event producing and event consuming sides for external events should hopefully provide a consistent mechanism to plug in for events.

- It is unclear whether the use of an extra layer of signals when sending/consuming external events will cause difficulties for implementations. If so, we may need to adjust and document when and where an alternative approach should be taken.

- The data definition in ``OpenEdxPublicSignal`` is geared toward signals, where the top-level dict represents keyword arguments emitted to the signal. This definition is not designed for event bus events, except in that it could be converted back to a signal.

Rejected Alternatives
---------------------

- Sending external event at same place of code as the internal Signal is sent. Decided that this is more complex than we need, and we can add this later if it becomes necessary.

- Sending external events without a corresponding internal event. This might be useful for Event Sourcing, and is not being ruled out forever, but is not currently being designed for.

Additional Resources
--------------------

If you want further background for this decision, see the following discussions:

- `#38: What to send over the wire (Kafka)? <https://github.com/openedx/openedx-events/issues/38>`_

- `#39: ARCHBOM-2010: How does Event Bus interact with OpenedX Django Signals? <https://github.com/openedx/openedx-events/issues/39>`_
59 changes: 59 additions & 0 deletions docs/decisions/0005-external-event-schema-format.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
5. External Event Schema Format
===============================

Status
------

Provisional


Context
-------

* It is a best practice to use an explicit schema definition. Avro is the recommended serialization format for Kafka.

* The attrs objects that we currently have for signal-based events don't easily inter-operate with event bus client objects. Source ADR ":doc:`0003-events-payload`".

* Industry best practices seem to suggest using a binary encoding of messages.

* Industry best practice also suggests using AVRO over JSONSchema for better expression around schema evolution.

* Even if we use a JSON encoding for the message we'll probably want to enable compression on the wire to reduce data
size. This would mean whether or not we use JSON encoding is moot because the data on the wire will still be binary
encoded.

Decision
--------

* We will continue using ``attrs`` decorated classes for explicit schema definition.

* We will also use the binary serialization of messages that are transmitted over the event bus.

* The binary encoding of messages will use AVRO specification.

* A new utility will be created to auto generate Avro schema based on ``attrs`` decorated classes.

Implementation Notes
--------------------

`AvroAttrsBridge`_ class is a potential approach to this using the `built-in metadata`_ that the ``attrs`` library provides for extending ``attrs``.

.. _AvroAttrsBridge: https://github.com/openedx/openedx-events/blob/main/openedx_events/bridge/avro_attrs_bridge.py
.. _built-in metadata: https://www.attrs.org/en/stable/extending.html

Consequences
------------

* There will be bridging code that will abstract away schema generation from most developers of events. This may have a negative impact as it might make it harder for developers to reason about schema evolution.

* For non primitive types (eg. Opaque Keys objects), the bridging code between ``Avro`` and ``attrs`` will need to have special serializers or clearly fail.

* Any reference to using JSON or JSONSchema in `OEP-41: Asynchronous Server Event Message Format`_ should be review and updated to clarify implied or explicit decisions that may be reversed by this decision.

* `OEP-41: Asynchronous Server Event Message Format`_ also dictates the use of the CloudEvents specification. Combined with this ADR, we would be required to adhere to the `CloudEvents Avro Format`_. There may also be additional CloudEvent related work tied to a particular protocol binding, like the `Kafka Protocol Binding for CloudEvents`_. This, however, is out of scope of this particular decision.

.. _`OEP-41: Asynchronous Server Event Message Format`: https://open-edx-proposals.readthedocs.io/en/latest/architectural-decisions/oep-0041-arch-async-server-event-messaging.html

.. _CloudEvents Avro Format: https://github.com/cloudevents/spec/blob/v1.0.2/cloudevents/formats/avro-format.md

.. _Kafka Protocol Binding for CloudEvents: https://github.com/cloudevents/spec/blob/v1.0.2/cloudevents/bindings/kafka-protocol-binding.md#3-kafka-message-mapping
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
6. Event schema serialization and evolution
===========================================

Status
------

Provisional

Context
-------

.. note:: Because the event definitions are shared, this decision applies to both external and internal events

We are experimenting with adding event bus technology (`OEP-52: Event Bus <https://github.com/openedx/open-edx-proposals/pull/233>`_) to the Open edX platform. The goal is to iterate through different implementations of an event bus and slowly get to a solution that works for the larger Open edX developer community.

Currently, the specification for events are written as OpenEdxPublicSignal instances in signal.py modules in the `openedx-events <https://github.com/openedx/openedx-events/blob/main/openedx_events/learning/signals.py>`_ repository.

Over time, as needs change, applications will need the schema for an event to change.

For internal events, there is some discussion of versioning events related to how they might change in ADR ":doc:`0002-events-naming-and-versioning`". For external events to be sent using an event bus, the same OpenEdxPublicSignal definitions will provide the schema.

However, for the event bus, we will be introducing more rigorous event evolution rules using an explicit schema and schema registry. We had decided to use Avro in ":doc:`0005-external-event-schema-format`".

Event schemas can evolve in various ways and each implementation of an event bus should choose a schema evolution configuration. The choices include how data can change, as well as the order in which producer and consumer needs to be deployed with new changes.

Event bus technologies support various schema evolutions (`source <https://docs.confluent.io/platform/current/schema-registry/avro.html>`_)::

- Backward / Backward Transitive: Allows you to delete fields and add optional fields. The consumer of the messages must upgrade to handle new schema before producer.

- Forward / Forward Transitive: Allows you to add fields and delete optional fields. The producer of the messages must upgrade to handle new schema before consumer of messages.

- Full / Full Transitive: Allows you to add or delete optional fields. Either producer or consumer can change first.

Currently, there are two notable classes related to schema: OpenEdxPublicSignal and attrs decorated classes in data.py modules (`example <https://github.com/eduNEXT/openedx-events/blob/main/openedx_events/learning/data.py>`_).

OpenEdxPublicSignal defines all the information necessary to create, serialize, and send an event. This includes the data to be sent and the metadata. The data to be sent is defined as the 'init_data' attribute. 'init_data' is a dict of key/value pairs. The keys are strings and the values can be of various types (the exact supported types has not be specified yet). These keys correspond to keyword arguments sent via the signal.

Most of the values in `init_data` are special attr decorated classes used to provide additional structure and reuse of the data. Since the attr decorated classes provide more structure, they are preferred.

Decision
--------

- The ability to add new data to an event definition is being prioritized over the ability to delete information. Deleting will not be allowed without introducing breaking changes.

- The top-level data dict of an OpenEdxPublicSignal can be evolved by adding new key/values. For the purposes of an explicit schema definition, these will all be considered required fields (vs optional fields).

- An attr Data class will not be evolved once in use. If new fields are required for an event, they can only be added into the top-level data dict.

- The above decisions result in a decision to use “Forward / Forward Transitive” schema evolution for external events.

Consequences
------------

- As noted in the decision, once events are evolved, we must use "Forward / Forward Transitive" with the event bus for external events. This means that the producer of an updated event must upgrade to handle the new schema before consumers can be upgraded. This inflexibility is directly related to not supporting optional fields.

- Note: This may have issues as things stand. Currently, this repo has shared definitions for all services. So, for example, if the LMS needs to evolve an event that it publishes, it may need to upgrade ``openedx-events`` to pick up its updated event. However, with this upgrade, it may pick up updated events which the LMS consumes, but which its publishers has not yet upgraded.

- It is unclear what CI functionality can exist to catch schema changes that break the rules before the schema registry tests the rules for external events in some test environment.

- To avoid breaking changes, since all event evolution must happen in the top-level dict, and not in the attr based data classes, there is potential to have events be messier than they otherwise would be. Again, this could potentially be avoided by adding support for optional fields.

Deferred/Rejected Decisions
---------------------------

We could add the ability to add optional fields to an event, possibly into the attr Data classes and/or possibly into the top-level data dict in the OpenEdxPublicSignal.

- Optional fields would allow for "Full / Full Transitive" schema evolution. This option provides more flexibility around deployment of upgraded producers and consumers indepently, which is useful when each is owned by different teams.

- Optional fields would add additional complexity to the definitions, as well as to the bridge code that serializes these definitions for external events.

- Optional fields may enable cleaner event definitions in the case that new data would make more sense in the attr Data class than in the top-level data dict for all events that share the same Data class.

- Due to the complexity, this work is being deferred and this decision can be revisited when and if a valid use case arises.
17 changes: 17 additions & 0 deletions docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ Contents:

.. toctree::
:maxdepth: 2
:caption: General Contents

readme
getting_started
Expand All @@ -20,6 +21,22 @@ Contents:
modules
changelog

.. toctree::
:maxdepth: 2
:caption: Decisions

decisions/0001-purpose-of-this-repo
decisions/0002-events-naming-and-versioning
decisions/0003-events-payload
decisions/0004-external-event-bus-and-django-signal-events
decisions/0005-external-event-schema-format
decisions/0006-event-schema-serialization-and-evolution

.. toctree::
:maxdepth: 2
:caption: How-Tos

how_tos/avro_attrs_bridge

Indices and tables
==================
Expand Down