Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build viable Docker images for Polaris using Quarkus #610

Merged
merged 1 commit into from
Jan 22, 2025

Conversation

adutra
Copy link
Contributor

@adutra adutra commented Jan 3, 2025

This PR redesigns how Docker images are built, to make them fully compatible with Quarkus. With Quarkus, docker images can be built as part of the normal build. There is no need to do it manually.

This PR also fixes the issues described in #537.

Summary of changes

Docker images

  • The Dockerfile at the root of the repository has been removed (not needed anymore, and has security flaws – see Rework main Dockerfile #537 for details);
  • The main Dockerfile for Polaris server is now located at quarkus/server/src/main/docker/Dockerfile.jvm (default location for dockerfiles for Quarkus)
  • The main Dockerfile for Polaris admin tool is now located at quarkus/admin/src/main/docker/Dockerfile.jvm
  • Both docker images can be built during the regular build: ./gradlew assemble -Dquarkus.container-image.build=true
  • The images are built with the Quarkus docker build provider. You must have a valid docker environment for the image build to succeed.
### JDBC drivers

Furthermore, this PR modifies the eclipse-link module and includes 2 JDBC drivers in the resulting artifacts: H2 and Postgres.

This is absolutely required to be able to run Polaris with Postgres as the JDBC drivers must be present at build time.

The licenses of the 2 drivers are compatible with ASF and were included in the relevant files.

Helm tests

This PR changes the Helm chart slightly, only to make the Helm tests pass in CI. But another PR will introduce a bigger revamp of the Helm chart in order to fully to adapt it to Quarkus: #626.

In CI, a docker image is first built with Gradle and Quarkus, then loaded into minikube. Then the tests are executed normally.

Regression tests

  • The docker-compose.yaml at the root of the repository is moved to the regtests folder where it naturally belongs.
  • The default image in the docker-compose.yaml is apache/polaris:latest. This is the image name that is created when running ./gradlew assemble -Dquarkus.container-image.build=true.
  • In CI, a docker image is first built with Gradle and Quarkus. Then the tests are executed.

Note: there is no released version of Polaris yet, so running the docker compose file today will result in an error unless the user builds the image manually first with ./gradlew assemble -Dquarkus.container-image.build=true; but that's a temporary issue.

Remaining work

Again, this PR also defers all documentation changes to a follow-up task: #700

Also, this PR does not address the specific needs of release builds. For such builds, we need:

  • Multi-platform manifests
  • Multi-registry pushes

@adutra adutra force-pushed the quarkus-docker branch 8 times, most recently from 3a3efcd to 3bbf02c Compare January 7, 2025 19:19
@adutra adutra force-pushed the quarkus-docker branch 7 times, most recently from 6d67aaa to 748e414 Compare January 10, 2025 20:11
@adutra adutra changed the title [WIP] Build Docker images with Quarkus [WIP] Build viable Docker images for Polaris using Quarkus Jan 10, 2025
@adutra adutra force-pushed the quarkus-docker branch 11 times, most recently from 860f7d0 to 432f581 Compare January 14, 2025 13:25
@adutra adutra changed the title [WIP] Build viable Docker images for Polaris using Quarkus Build viable Docker images for Polaris using Quarkus Jan 14, 2025
@adutra adutra marked this pull request as ready for review January 14, 2025 13:37
@@ -65,6 +65,10 @@ dependencies {
compileOnly("com.fasterxml.jackson.core:jackson-annotations")
compileOnly("com.fasterxml.jackson.core:jackson-core")

// JDBC Drivers
runtimeOnly(libs.h2)
runtimeOnly(libs.postgresql)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going back to in-memory by default, these changes aren't needed either

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are :-) If I remove this, the released artifacts wouldn't contain these drivers, and nobody would be able to use Eclipselink at all.

Copy link
Contributor

@dimas-b dimas-b Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is valuable for end users that Polaris's normal build includes usable JDBC drivers. I do not think it is viable to force users to recompile to include PostgreSQL.

@@ -34,8 +34,7 @@
<class>org.apache.polaris.jpa.models.ModelSequenceId</class>
<shared-cache-mode>NONE</shared-cache-mode>
<properties>
<property name="jakarta.persistence.jdbc.url"
value="jdbc:h2:file:./build/test_data/polaris/{realm}/db"/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this changing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I don't see the value of creating a file on disk?
That said, if you really insist on having that file created, I can revert this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - using in-memory here is better.

@@ -81,6 +81,7 @@ opentelemetry-bom = { module = "io.opentelemetry:opentelemetry-bom", version = "
opentelemetry-semconv = { module = "io.opentelemetry.semconv:opentelemetry-semconv", version = "1.25.0-alpha" }
picocli = { module = "info.picocli:picocli-codegen", version.ref = "picocli" }
picocli-codegen = { module = "info.picocli:picocli-codegen", version.ref = "picocli" }
postgresql = { module = "org.postgresql:postgresql", version = "42.7.4" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this is needed. Otherwise the resulting Docker images will be unusable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, we can remove this and ship just with H2 for now. But I don't see the point of a Polaris Docker image that is only usable with H2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This declaration does look now (after recent review-driven changes)

Copy link
Contributor

@eric-maynard eric-maynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there are still metastore-related changes here

@jbonofre
Copy link
Member

@eric-maynard just for my understanding, you are asking to include changes only related to Docker here (and not include changes on persistence.xml) right ? This is the change you are requesting: revert the change on persistence.xml, correct ?

Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me and comply with what's needed for Quarkus.
I don't see a technical reason to block this change.

@eric-maynard please retract your review or provide the obligatory justified technical reason.

@@ -34,8 +34,7 @@
<class>org.apache.polaris.jpa.models.ModelSequenceId</class>
<shared-cache-mode>NONE</shared-cache-mode>
<properties>
<property name="jakarta.persistence.jdbc.url"
value="jdbc:h2:file:./build/test_data/polaris/{realm}/db"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - using in-memory here is better.

@eric-maynard
Copy link
Contributor

@jbonofre

@eric-maynard just for my understanding, you are asking to include changes only related to Docker here (and not include changes on persistence.xml) right ? This is the change you are requesting: revert the change on persistence.xml, correct ?

For now, yes. I think we should try to first restore the docker workflow to its previous state or something close to it before making improvements to the metastore that are not strictly related to building a Docker image.

@adutra
Copy link
Contributor Author

adutra commented Jan 20, 2025

For now, yes. I think we should try to first restore the docker workflow to its previous state or something close to it before making improvements to the metastore that are not strictly related to building a Docker image.

Fine, I'm going to remove the EclipseLink changes. The images will contain the in-memory metastore only for now.

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm totally fine with building the Docker image with only in-memory persistence. This, however, makes are few other changes in this PR unnecessary, so it might be best to move them to a separate PR for the sake of clarity. All changes make sense to me, though, this is only about keeping PRs neat :)

@@ -35,6 +35,7 @@ dependencies {
implementation(project(":polaris-api-iceberg-service"))
implementation(project(":polaris-service-common"))
implementation(project(":polaris-quarkus-service"))
runtimeOnly(project(":polaris-eclipselink"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this change can be in a separate PR now that the docker image does not use EclipseLink.

@@ -29,6 +29,7 @@ dependencies {
implementation(project(":polaris-api-iceberg-service"))
implementation(project(":polaris-service-common"))
implementation(project(":polaris-quarkus-defaults"))
runtimeOnly(project(":polaris-eclipselink"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need dep in this PR now that docker runs with in-memory storage? I'd support this dependency in general, but it does not look necessary for this particular PR now 🤷

@@ -181,6 +182,10 @@ public void maybeInitializeInMemoryRealm(
MetaStoreManagerFactory factory,
RealmContextConfiguration realmContextConfiguration) {
if (factory instanceof InMemoryPolarisMetaStoreManagerFactory) {
LoggerFactory.getLogger(QuarkusProducers.class)
.warn(
"Polaris is configured with an in-memory metastore; "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe move to a separate PR for clarity?

@@ -81,6 +81,7 @@ opentelemetry-bom = { module = "io.opentelemetry:opentelemetry-bom", version = "
opentelemetry-semconv = { module = "io.opentelemetry.semconv:opentelemetry-semconv", version = "1.25.0-alpha" }
picocli = { module = "info.picocli:picocli-codegen", version.ref = "picocli" }
picocli-codegen = { module = "info.picocli:picocli-codegen", version.ref = "picocli" }
postgresql = { module = "org.postgresql:postgresql", version = "42.7.4" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This declaration does look now (after recent review-driven changes)

@adutra
Copy link
Contributor Author

adutra commented Jan 21, 2025

This, however, makes are few other changes in this PR unnecessary, so it might be best to move them to a separate PR for the sake of clarity.

Indeed this PR needs some more cleanup. Working on it.

@@ -37,6 +37,10 @@ dependencies {
implementation(project(":polaris-quarkus-service"))
implementation(project(":polaris-quarkus-defaults"))

if (project.hasProperty("eclipseLinkDeps")) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are not including any JDBC drivers anymore; we now need this hook.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a fair trade-off to me.

@adutra
Copy link
Contributor Author

adutra commented Jan 21, 2025

OK I think I've cleaned up everything, removed JDBC drivers, removed unrelated changes. The scope is now considerably smaller.

@dimas-b @eric-maynard please have a second look.

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@@ -37,6 +37,10 @@ dependencies {
implementation(project(":polaris-quarkus-service"))
implementation(project(":polaris-quarkus-defaults"))

if (project.hasProperty("eclipseLinkDeps")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a fair trade-off to me.

@adutra adutra mentioned this pull request Jan 22, 2025
@adutra
Copy link
Contributor Author

adutra commented Jan 22, 2025

I would like to merge this PR today, if that's OK for everybody. @eric-maynard please review and if fine with you, remove your -1.

Copy link
Contributor

@eric-maynard eric-maynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; the eclipseLinkDeps change makes sense to me

@adutra adutra merged commit 070b905 into apache:main Jan 22, 2025
5 checks passed
@adutra adutra deleted the quarkus-docker branch January 22, 2025 20:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants