In Redwood, we should use the latest version of Clickhouse #30

Closed
waseem-blended opened this issue Jan 28, 2024 · 11 comments

@waseem-blended

Hi,

I'm trying to add a new connector to a third-party provider with Cairn's ClickHouse. To make this work, we need to upgrade the ClickHouse server version currently in use:

FROM docker.io/yandex/clickhouse-server:22.1.3.7

However, I checked this repo on Docker Hub and it seems that no recent ClickHouse images have been released there. Please upgrade to the latest version.

@regisb
Collaborator

regisb commented Jan 30, 2024

It appears that the clickhouse Docker image has moved from https://hub.docker.com/r/yandex/clickhouse-server to https://hub.docker.com/r/clickhouse/clickhouse-server/

Yet, we cannot bump the clickhouse version in the current stable release. Otherwise we might break people's databases. This will only happen in the next release, Redwood, scheduled for June. We can make this change now in the nightly branch but will not be able to backport it to the master branch just yet.

@regisb regisb moved this from Pending Triage to Backlog in Tutor project management Jan 30, 2024
@regisb regisb changed the title Use latest version of clickhouse In Redwood, we should use the latest version of Clickhouse Jan 30, 2024
@FahadKhalid210 FahadKhalid210 moved this from Backlog to In Progress in Tutor project management Mar 18, 2024
@FahadKhalid210
Contributor

Issues with Latest Version (24.2.1.2248)

Deprecation of Live View

We've encountered issues with the latest version of ClickHouse server (24.2.1.2248) due to the deprecation of the Live View feature. This has caused errors when attempting to access the course_enrollments table.

[image: screenshot of the error]

Alternative: Refreshable Materialized View
Refreshable materialized views are a possible alternative, but that feature is still in progress.

Workarounds

  • Create Normal View for course_enrollments:
    A normal view for course_enrollments works as a workaround, but it reads from private tables. Since users don't have access to private tables, they get errors when they query course_enrollments. Using the openedx DB resolves this issue.

  • Divide the Live Query into Two Views:
    An alternative workaround (a hack) is to split the live query into two live views, which seems to resolve the issue.

[images: two screenshots]
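
For reference, the normal-view workaround might look roughly like the sketch below. The three underscore-prefixed table names are the ones mentioned later in this thread; the column names and join keys are assumptions for illustration only, since the real schema is not shown here.

```sql
-- Sketch only: column names (id, user_id, course_id, is_active, username, name)
-- and join keys are hypothetical, not taken from the actual schema.

-- Before: a live view, which depends on an experimental setting.
-- SET allow_experimental_live_view = 1;
-- CREATE LIVE VIEW course_enrollments AS SELECT ...;

-- After: a normal view. It stores no data and re-runs the query on each read.
CREATE VIEW IF NOT EXISTS course_enrollments AS
SELECT
    e.course_id,
    e.is_active,
    u.username,
    p.name
FROM _openedx_course_enrollments AS e
INNER JOIN _openedx_users AS u ON u.id = e.user_id
INNER JOIN _openedx_user_profiles AS p ON p.user_id = u.id;
```

Because a normal view is evaluated at query time, whoever reads it also needs to be able to read the underlying tables, which is the access problem described in the first workaround.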

@FahadKhalid210
Contributor

@regisb As mentioned earlier, Live views are deprecated. Are we good to proceed with implementing a normal view for course_enrollments instead?

@regisb
Collaborator

regisb commented Apr 15, 2024

I don't know. What do you suggest? Why?

@FahadKhalid210
Contributor

@regisb Given the deprecation of Live Views and the limitations of materialized views in updating existing data, implementing a Normal View seems like a more feasible approach. With a Normal View, we can efficiently track changes without encountering the limitations associated with Live Views or materialized views.

@regisb
Collaborator

regisb commented Apr 16, 2024

That sounds great. I suggest you create a proof of concept to migrate a single live view to a normal view. This will help you decide if we should migrate all live views.

@FahadKhalid210
Contributor

@DawoudSheraz
We currently have two live views, course_enrollments and course_block_completion. After upgrading the ClickHouse version, we encountered errors due to the deprecation of live views.

Error we got:
[image: error screenshot]

After switching from live views to normal views to resolve this, we ran into a new problem: course_enrollments is only accessible in the main DB (openedx), and we get an error on staff-specific DBs. This is because staff users don't have access to the tables that the view statement uses (_openedx_course_enrollments, _openedx_user_profiles and _openedx_users). We need to give staff users access to those tables as well, and we also need to change the row policy. Granting access seems justified, since the live view already shows data from these tables through join operations.
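
To make the proposed access change concrete, here is a rough sketch of what the grants and a row policy could look like. The user name `staff_user`, the policy name, the `course_id` filter and the `openedx.` database qualifier on the tables are all assumptions for illustration; the actual change may differ.

```sql
-- Sketch only: staff_user, the policy name and the filter are hypothetical.

-- Let the staff user read the tables that the normal view selects from.
GRANT SELECT ON openedx._openedx_course_enrollments TO staff_user;
GRANT SELECT ON openedx._openedx_user_profiles TO staff_user;
GRANT SELECT ON openedx._openedx_users TO staff_user;

-- Restrict which rows that user can see in the enrollments table.
CREATE ROW POLICY IF NOT EXISTS staff_enrollments_policy
ON openedx._openedx_course_enrollments
FOR SELECT USING course_id = 'course-v1:Org+Course+Run'
TO staff_user;
```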

Please share your thoughts or any additional considerations you believe are important to address moving forward.

Thanks

@DawoudSheraz
Contributor

@FahadKhalid210 Alright, if the live view already relies on staff users having access to certain tables, we can do the same here. It would be good, though, to document under what conditions staff users require that access.

@FahadKhalid210
Contributor

PR #38

@DawoudSheraz
Contributor

@FahadKhalid210 Does this require any further actions? Thanks

@FahadKhalid210
Contributor

@DawoudSheraz, @waseem-blended The PR is merged. Closing this issue.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Tutor project management May 10, 2024