Upgrade repository-hdfs to Hadoop 3 #76897

Merged
masseyke merged 13 commits into elastic:master from fix/hdfs-over-the-wire-encryption on Sep 23, 2021

Conversation

masseyke
Member

@masseyke masseyke commented Aug 24, 2021

This upgrades the repository-hdfs plugin to Hadoop 3. Tests are performed against both Hadoop 2 and Hadoop 3 HDFS. The advantages of using the Hadoop 3 client are:
- Over-the-wire encryption works (tests coming in an upcoming PR).
- We don't have to add (or ask customers to add) additional JVM permissions to the Elasticsearch JVM.
- It's compatible with Java versions higher than Java 8.

@masseyke masseyke requested a review from jbaiera August 24, 2021 19:46
@elasticmachine elasticmachine added the Team:Data Management label Aug 24, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@masseyke masseyke changed the title Fix/hdfs over the wire encryption Upgrade to Hadoop 3 and fix hdfs over the wire encryption Aug 24, 2021
@masseyke masseyke linked an issue Aug 24, 2021 that may be closed by this pull request
@masseyke
Member Author

@elasticmachine update branch

@jbaiera
Member

jbaiera commented Aug 30, 2021

I think we should probably split out the Hadoop 3 upgrade from the testing additions for over-the-wire hdfs encryption, just to keep things tidy.

@masseyke
Member Author

masseyke commented Sep 8, 2021

@elasticmachine update branch

@masseyke masseyke changed the title Upgrade to Hadoop 3 and fix hdfs over the wire encryption Upgrade repository-hdfs to Hadoop 3 Sep 13, 2021
@masseyke
Member Author

@elasticmachine update branch

@masseyke
Member Author

@elasticmachine run elasticsearch-ci/part-2

Member

@jbaiera jbaiera left a comment

LGTM, though I think we should make sure to circle back and upgrade HDFS in the searchable snapshot builds at some point in another PR.

Comment on lines 33 to 34
final int minTestedHadopoVersion = 2;
final int maxTestedHadopoVersion = 3;
Member

nit: Typo Hadopo

@@ -95,6 +97,7 @@ tasks.named("dependencyLicenses").configure {

tasks.named("integTest").configure {
dependsOn(project.tasks.named("bundlePlugin"))
enabled = false
Member

Should this still be disabled?

Member

oh I see, do we discard this test and just create new ones down below for hadoop v2 + v3?

Member Author

Yeah, I wanted to be very explicit about hadoop 2 vs hadoop 3. Not sure if this was the best way to go about it, but it works.
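
The approach discussed in this thread, leaving the generic `integTest` task disabled and registering one explicit task per tested Hadoop version, might look roughly like the following in a Gradle build script. This is a minimal sketch and not the code from this PR: the task names `integTestHadoop2`/`integTestHadoop3`, the `test.hadoop.version` system property, and the use of the plain `Test` task type are assumptions made for illustration. It also assumes the `java` plugin is applied and that a `bundlePlugin` task exists, as in the snippet quoted above.

```groovy
// Hypothetical sketch: task names, the system property, and the plain Test
// task type are assumptions for illustration, not the code merged in this PR.
final int minTestedHadoopVersion = 2
final int maxTestedHadoopVersion = 3

// Keep the generic task disabled so no run happens without an explicit Hadoop version.
tasks.named("integTest").configure {
  enabled = false
}

for (int hadoopVersion = minTestedHadoopVersion; hadoopVersion <= maxTestedHadoopVersion; hadoopVersion++) {
  // Copy the loop counter: the configuration closure runs lazily, after the loop ends.
  final int version = hadoopVersion

  tasks.register("integTestHadoop${version}", Test) {
    description = "Runs integration tests against a Hadoop ${version} HDFS fixture"
    dependsOn(tasks.named("bundlePlugin"))
    testClassesDirs = sourceSets.test.output.classesDirs
    classpath = sourceSets.test.runtimeClasspath
    // Lets the tests discover which Hadoop version they are running against.
    systemProperty "test.hadoop.version", version
  }
}
```

With these hypothetical names, `./gradlew integTestHadoop3` would run only the Hadoop 3 variant, which keeps the version under test visible in the task name.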

@masseyke
Member Author

> LGTM, though I think we should make sure to circle back and upgrade HDFS in the searchable snapshot builds at some point in another PR.

Yeah I had the same thought. I changed it locally just to make sure there were no problems (there weren't) but left it at 2 since that's what's currently tested.

@masseyke
Member Author

@elasticmachine update branch

@masseyke masseyke merged commit a02e8ad into elastic:master Sep 23, 2021
arteam added a commit to arteam/elasticsearch that referenced this pull request Sep 24, 2021
In elastic#76897 the `hadoop-common` module was renamed to `hadoop-client-ide`, but
the change wasn't reflected in `elasticsearch.ide.gradle` script.
Because of that an IDE import started failing with the
`Task with path ':plugins:repository-hdfs:hadoop-common:shadowJar" not found` error.

Fix the import by setting the correct module name in the `buildDependencyArtifacts` task.
arteam added a commit that referenced this pull request Sep 24, 2021
In #76897 the `hadoop-common` module was renamed to `hadoop-client-ide`, but
the change wasn't reflected in `elasticsearch.ide.gradle` script.
Because of that an IDE import started failing with the
`Task with path ':plugins:repository-hdfs:hadoop-common:shadowJar" not found` error.

Fix the import by setting the correct module name in the `buildDependencyArtifacts` task.
arteam added a commit to arteam/elasticsearch that referenced this pull request Sep 27, 2021
In elastic#76897 the `hadoop-common` module was renamed to `hadoop-client-ide`, but
the change wasn't reflected in `elasticsearch.ide.gradle` script.
Because of that an IDE import started failing with the
`Task with path ':plugins:repository-hdfs:hadoop-common:shadowJar" not found` error.

Fix the import by setting the correct module name in the `buildDependencyArtifacts` task.
masseyke added a commit that referenced this pull request Oct 6, 2021
…on fails (#78409)

Until recently, if a user configured over-the-wire encryption for repository-hdfs they would get an exception. That was fixed in an upgraded ticket in two ways: (1) JVM permissions were opened up for Hadoop 2, and (2) support for the Hadoop 3 HDFS client was added. This commit adds configuration to a couple of integration tests so that they fail if over-the-wire encryption is not working.
Relates #76897 #76734
masseyke added a commit that referenced this pull request Oct 6, 2021
This upgrades the repository-hdfs plugin to Hadoop 3. Tests are performed against both Hadoop 2 and Hadoop 3 HDFS. The advantages of using the Hadoop 3 client are:
- Over-the-wire encryption works (tests coming in an upcoming PR).
- We don't have to add (or ask customers to add) additional JVM permissions to the Elasticsearch JVM.
- It's compatible with Java versions higher than Java 8.
Relates #76897
@masseyke masseyke deleted the fix/hdfs-over-the-wire-encryption branch October 6, 2021 19:50
arteam added a commit to arteam/elasticsearch that referenced this pull request Oct 11, 2021
In elastic#76897 the `hadoop-common` module was renamed to `hadoop-client-ide`, but
the change wasn't reflected in `elasticsearch.ide.gradle` script.
Because of that an IDE import started failing with the
`Task with path ':plugins:repository-hdfs:hadoop-common:shadowJar" not found` error.
arteam added a commit that referenced this pull request Oct 11, 2021
In #76897 the `hadoop-common` module was renamed to `hadoop-client-ide`, but
the change wasn't reflected in `elasticsearch.ide.gradle` script.
Because of that an IDE import started failing with the
`Task with path ':plugins:repository-hdfs:hadoop-common:shadowJar" not found` error.
@jakelandis jakelandis added v8.0.0-beta1 :Distributed Coordination/Snapshot/Restore and removed v8.0.0 labels Oct 27, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) label Oct 27, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Labels
:Distributed Coordination/Snapshot/Restore, Team:Data Management, Team:Distributed (Obsolete), v7.16.0, v8.0.0-beta1
Development

Successfully merging this pull request may close these issues.

HDFS Repository fails when over-the-wire encryption is enabled
5 participants