This repository has been archived by the owner on Apr 3, 2024. It is now read-only.

Merge pull request #1 from openedx/bmtcril/add_event_listener
Add Course Published event listener and plugin plumbing
bmtcril authored May 5, 2023
2 parents d8fc81a + 61a1c6e commit 8a9d84e
Showing 24 changed files with 1,402 additions and 128 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.rst
@@ -14,7 +14,9 @@ Change Log
Unreleased
**********

*
* First functional version, includes a CMS listener for COURSE_PUBLISHED
* README updates
* New configuration settings for connection to ClickHouse

0.1.0 – 2023-04-24
**********************************************
52 changes: 43 additions & 9 deletions README.rst
@@ -7,17 +7,25 @@ Event Sink ClickHouse
Purpose
*******

A listener for `Open edX events`_ to send them to ClickHouse. This project
acts as a plugin to the Edx Platform, listens for configured Open edX events,
and sends them to a ClickHouse database for analytics or other processing. This
is being maintained as part of the Open Analytics Reference System (OARS)
project.
This project acts as a plugin to the `Edx Platform`_, listens for
configured `Open edX events`_, and sends them to a `ClickHouse`_ database for
analytics or other processing. This is being maintained as part of the Open
Analytics Reference System (`OARS`_) project.

OARS consumes the data sent to ClickHouse by this plugin as part of data
enrichment for reporting, or capturing data that otherwise does not fit in
xAPI.

Currently the only sink is in the CMS. It listens for the ``COURSE_PUBLISHED``
signal and serializes a subset of the published course blocks into one table
and the relationships between blocks into another table. With those we are
able to recreate the "graph" of the course and get relevant data, such as
block names, for reporting.
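
As a rough illustration of how the two tables could be combined to recreate the course "graph", the sketch below rebuilds an outline from in-memory rows. The column names (``location``, ``display_name``, ``parent``, ``child``, ``order``) and the helper functions are assumptions made for this example; they are not the plugin's actual schema or API.

```python
from collections import defaultdict

# Hypothetical rows shaped like what the two tables might hold.
# Column names are assumptions for illustration, not the real schema.
blocks = [
    {"location": "course", "display_name": "Demo Course"},
    {"location": "chapter1", "display_name": "Week 1"},
    {"location": "seq1", "display_name": "Lesson 1"},
    {"location": "seq2", "display_name": "Lesson 2"},
]
relationships = [
    {"parent": "course", "child": "chapter1", "order": 0},
    {"parent": "chapter1", "child": "seq2", "order": 1},
    {"parent": "chapter1", "child": "seq1", "order": 0},
]


def build_course_tree(block_rows, relationship_rows):
    """Return (names, children) lookups built from the two row sets."""
    names = {row["location"]: row["display_name"] for row in block_rows}
    children = defaultdict(list)
    # Sort so each parent's children are appended in their stored order.
    for row in sorted(relationship_rows, key=lambda r: (r["parent"], r["order"])):
        children[row["parent"]].append(row["child"])
    return names, dict(children)


def outline(root, names, children, depth=0):
    """Yield indented display names, walking the tree depth first."""
    yield "  " * depth + names[root]
    for child in children.get(root, []):
        yield from outline(child, names, children, depth + 1)


names, children = build_course_tree(blocks, relationships)
course_outline = list(outline("course", names, children))
print("\n".join(course_outline))
```

In a real report the same join would more likely be expressed as a query against ClickHouse; the point here is only that block rows supply the names while relationship rows supply the edges and their ordering.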

.. _Open edX events: https://github.com/openedx/openedx-events
.. _Edx Platform: https://github.com/openedx/edx-platform
.. _ClickHouse: https://clickhouse.com
.. _OARS: https://docs.openedx.org/projects/openedx-oars/en/latest/index.html

Getting Started
***************
@@ -75,12 +83,38 @@ Every time you develop something in this repo
Deploying
=========

TODO: How can a new user go about deploying this component? Is it just a few
commands? Is there a larger how-to that should be linked here?
This plugin is deployed by default in an OARS Tutor environment. For other
deployments, install the library or add it to the private requirements of your
virtual environment (``requirements/private.txt``).

#. Run ``pip install openedx-event-sink-clickhouse``.

#. Run migrations:

- ``python manage.py lms migrate``

- ``python manage.py cms migrate``

PLACEHOLDER: For details on how to deploy this component, see the `deployment how-to`_
#. Restart LMS service and celery workers of edx-platform.

Configuration
===============

Currently, all events are listened to by default (there is only one), so the
only configuration required is the ClickHouse connection:

.. code-block::
.. _deployment how-to: https://docs.openedx.org/projects/openedx-event-sink-clickhouse/how-tos/how-to-deploy-this-component.html
    EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG = {
        # URL to a running ClickHouse server's HTTP interface. ex: https://foo.openedx.org:8443/ or
        # http://foo.openedx.org:8123/ . Note that we only support the ClickHouse HTTP interface
        # to avoid pulling in more dependencies to the platform than necessary.
        "url": "http://clickhouse:8123",
        "username": "changeme",
        "password": "changeme",
        "database": "event_sink",
        "timeout_secs": 3,
    }
Getting Help
************
16 changes: 4 additions & 12 deletions catalog-info.yaml
@@ -2,9 +2,9 @@
# https://open-edx-proposals.readthedocs.io/en/latest/processes/oep-0055-proc-project-maintainers.html

apiVersion: backstage.io/v1alpha1
kind: ""
kind: "Component"
metadata:
  name: 'openedx_event_sink_clickhouse'
  name: 'openedx-event-sink-clickhouse'
  description: "A sink for Open edX events to send them to ClickHouse"
  annotations:
    # (Optional) Annotation keys and values can be whatever you want.
@@ -15,18 +15,10 @@ metadata:
spec:

  # (Required) This can be a group(`group:<group_name>` or a user(`user:<github_username>`)
  owner: ""
  owner: "group:openedx-event-sink-clickhouse-maintainers"

  # (Required) Acceptable Type Values: service, website, library
  type: ''
  type: 'library'

  # (Required) Acceptable Lifecycle Values: experimental, production, deprecated
  lifecycle: 'experimental'

  # (Optional) The value can be the name of any known component.
  subcomponentOf: '<name_of_a_component>'

  # (Optional) An array of different components or resources.
  dependsOn:
  - '<component_or_resource>'
  - '<another_component_or_resource>'
37 changes: 37 additions & 0 deletions event_sink_clickhouse/apps.py
@@ -3,6 +3,7 @@
"""

from django.apps import AppConfig
from edx_django_utils.plugins import PluginSettings, PluginSignals


class EventSinkClickhouseConfig(AppConfig):
@@ -11,3 +12,39 @@ class EventSinkClickhouseConfig(AppConfig):
    """

    name = 'event_sink_clickhouse'
    verbose_name = "Event Sink ClickHouse"

    plugin_app = {
        PluginSettings.CONFIG: {
            'lms.djangoapp': {
                'production': {PluginSettings.RELATIVE_PATH: 'settings.production'},
                'common': {PluginSettings.RELATIVE_PATH: 'settings.common'},
            },
            'cms.djangoapp': {
                'production': {PluginSettings.RELATIVE_PATH: 'settings.production'},
                'common': {PluginSettings.RELATIVE_PATH: 'settings.common'},
            }
        },
        # Configuration setting for Plugin Signals for this app.
        PluginSignals.CONFIG: {
            # Configure the Plugin Signals for each Project Type, as needed.
            'cms.djangoapp': {
                # List of all plugin Signal receivers for this app and project type.
                PluginSignals.RECEIVERS: [{
                    # The name of the app's signal receiver function.
                    PluginSignals.RECEIVER_FUNC_NAME: 'receive_course_publish',

                    # The full path to the module where the signal is defined.
                    PluginSignals.SIGNAL_PATH: 'xmodule.modulestore.django.COURSE_PUBLISHED',
                }],
            }
        },
    }

    def ready(self):
        """
        Import our Celery tasks for initialization.
        """
        super().ready()

        from . import tasks  # pylint: disable=import-outside-toplevel, unused-import
Empty file.
19 changes: 19 additions & 0 deletions event_sink_clickhouse/settings/common.py
@@ -0,0 +1,19 @@
"""
Default settings for the openedx_event_sink_clickhouse app.
"""


def plugin_settings(settings):
    """
    Adds default settings
    """
    settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG = {
        # URL to a running ClickHouse server's HTTP interface. ex: https://foo.openedx.org:8443/ or
        # http://foo.openedx.org:8123/ . Note that we only support the ClickHouse HTTP interface
        # to avoid pulling in more dependencies to the platform than necessary.
        "url": "http://clickhouse:8123",
        "username": "changeme",
        "password": "changeme",
        "database": "event_sink",
        "timeout_secs": 3,
    }
13 changes: 13 additions & 0 deletions event_sink_clickhouse/settings/production.py
@@ -0,0 +1,13 @@
"""
Production settings for the openedx_event_sink_clickhouse app.
"""


def plugin_settings(settings):
    """
    Override the default app settings with production settings.
    """
    settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG = settings.ENV_TOKENS.get(
        'EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG',
        settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG
    )
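
Reviewer note: the two settings modules above can be exercised without Django to see how they layer. The sketch below is a minimal stand-in, assuming the common settings are applied before the production ones; the function names, the `SimpleNamespace` settings object, and all connection values are hypothetical and exist only for this illustration.

```python
from types import SimpleNamespace


def common_plugin_settings(settings):
    """Stand-in for settings/common.py: install the defaults."""
    settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG = {
        "url": "http://clickhouse:8123",
        "username": "changeme",
        "password": "changeme",
        "database": "event_sink",
        "timeout_secs": 3,
    }


def production_plugin_settings(settings):
    """Stand-in for settings/production.py: prefer the ENV_TOKENS value."""
    settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG = settings.ENV_TOKENS.get(
        "EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG",
        settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG,
    )


def resolve(env_tokens):
    """Apply both layers to a fake settings object and return the result."""
    settings = SimpleNamespace(ENV_TOKENS=env_tokens)
    common_plugin_settings(settings)       # assumed to run first
    production_plugin_settings(settings)   # then the production override
    return settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG


# No token set: the defaults survive.
defaults_only = resolve({})

# Hypothetical full override supplied through ENV_TOKENS.
full_override = resolve({
    "EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG": {
        "url": "https://clickhouse.example.com:8443",
        "username": "sink_user",
        "password": "not-a-real-password",
        "database": "event_sink",
        "timeout_secs": 5,
    }
})

# A partial override replaces the whole dict; it is not merged with defaults.
partial_override = resolve({
    "EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG": {"url": "http://ch.example.com:8123"}
})
```

Because the production module swaps in the `ENV_TOKENS` value wholesale, a partial dict would leave keys such as `database` undefined, and `BaseSink.__init__` reads every key with `[...]`. Deployments would probably want to supply all five keys.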
13 changes: 13 additions & 0 deletions event_sink_clickhouse/signals.py
@@ -0,0 +1,13 @@
"""
Signal handler functions, mapped to specific signals in apps.py.
"""


def receive_course_publish(sender, course_key, **kwargs):  # pylint: disable=unused-argument
    """
    Receives COURSE_PUBLISHED signal and queues the dump job.
    """
    # import here, because signal is registered at startup, but items in tasks are not yet able to be loaded
    from .tasks import dump_course_to_clickhouse  # pylint: disable=import-outside-toplevel

    dump_course_to_clickhouse.delay(str(course_key))
Empty file.
47 changes: 47 additions & 0 deletions event_sink_clickhouse/sinks/base_sink.py
@@ -0,0 +1,47 @@
"""
Base classes for event sinks
"""
from collections import namedtuple

import requests
from django.conf import settings

ClickHouseAuth = namedtuple("ClickHouseAuth", ["username", "password"])


class BaseSink:
    """
    Base class for ClickHouse event sink, allows overwriting of default settings
    """
    def __init__(self, connection_overrides, log):
        self.log = log
        self.ch_url = settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG["url"]
        self.ch_auth = ClickHouseAuth(settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG["username"],
                                      settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG["password"])
        self.ch_database = settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG["database"]
        self.ch_timeout_secs = settings.EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG["timeout_secs"]

        # If any overrides to the ClickHouse connection
        if connection_overrides:
            self.ch_url = connection_overrides.get("url", self.ch_url)
            self.ch_auth = ClickHouseAuth(connection_overrides.get("username", self.ch_auth.username),
                                          connection_overrides.get("password", self.ch_auth.password))
            self.ch_database = connection_overrides.get("database", self.ch_database)
            self.ch_timeout_secs = connection_overrides.get("timeout_secs", self.ch_timeout_secs)

    def _send_clickhouse_request(self, request):
        """
        Perform the actual HTTP requests to ClickHouse.
        """
        session = requests.Session()
        prepared_request = request.prepare()

        try:
            response = session.send(prepared_request, timeout=self.ch_timeout_secs)
            response.raise_for_status()
        except requests.exceptions.HTTPError as e:
            self.log.error(str(e))
            self.log.error(e.response.headers)
            self.log.error(e.response)
            self.log.error(e.response.text)
            raise
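
Reviewer note: the override handling in `BaseSink.__init__` falls back per key, so callers may pass only the values they want to change. The sketch below mirrors that behaviour with plain dictionaries so it can be run without Django or `requests`. The `resolve_connection` helper and the sample values are hypothetical and are not part of this commit.

```python
from collections import namedtuple

ClickHouseAuth = namedtuple("ClickHouseAuth", ["username", "password"])


def resolve_connection(defaults, connection_overrides=None):
    """
    Hypothetical helper mirroring BaseSink.__init__: start from the
    configured defaults and let each override key replace one value.
    """
    overrides = connection_overrides or {}
    return {
        "url": overrides.get("url", defaults["url"]),
        "auth": ClickHouseAuth(
            overrides.get("username", defaults["username"]),
            overrides.get("password", defaults["password"]),
        ),
        "database": overrides.get("database", defaults["database"]),
        "timeout_secs": overrides.get("timeout_secs", defaults["timeout_secs"]),
    }


# Same shape as EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG in settings/common.py.
defaults = {
    "url": "http://clickhouse:8123",
    "username": "changeme",
    "password": "changeme",
    "database": "event_sink",
    "timeout_secs": 3,
}

# Override only the URL and timeout; credentials and database fall back.
connection = resolve_connection(
    defaults, {"url": "http://localhost:8123", "timeout_secs": 10}
)
unchanged = resolve_connection(defaults, None)
```

Also worth noting from the diff: `_send_clickhouse_request` calls `request.prepare()` itself, so callers are expected to pass an unprepared `requests.Request`, and the method as committed does not return the response.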