Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1451] feat(hive): Hive catalog supports to execute operations in the Kerberos mode. #1685

Merged
merged 32 commits into from
Jan 31, 2024

Conversation

qqqttt123
Copy link
Contributor

@qqqttt123 qqqttt123 commented Jan 24, 2024

What changes were proposed in this pull request?

Hive catalog supports to execute operations in the Kerberos mode.

Why are the changes needed?

Fix: #1451

Does this PR introduce any user-facing change?

Yes, added the document.

How was this patch tested?

Manual test.

@qqqttt123 qqqttt123 marked this pull request as draft January 24, 2024 09:23
@qqqttt123
Copy link
Contributor Author

Wait for #596

@qqqttt123 qqqttt123 marked this pull request as ready for review January 30, 2024 15:47
@qqqttt123 qqqttt123 requested review from yuqi1129, jerryshao, mchades, xunliu, FANNG1, diqiu50 and Clearvive and removed request for yuqi1129 and jerryshao January 31, 2024 02:33
Comment on lines 178 to 210
try {
URI uri = new URI(keyTabUri);
String scheme = Optional.ofNullable(uri.getScheme()).orElse("file");
switch (scheme) {
case "http":
case "https":
case "ftp":
int timeout =
(int)
catalogPropertiesMetadata.getOrDefault(
conf, HiveCatalogPropertiesMeta.FETCH_TIMEOUT_SEC);
URLConnection uc = uri.toURL().openConnection();
uc.setConnectTimeout(timeout * 1000);
uc.setReadTimeout(timeout * 1000);
uc.connect();

Files.copy(uc.getInputStream(), keyTabFile.toPath());
break;

case "file":
Files.createSymbolicLink(keyTabFile.toPath(), new File(uri.getPath()).toPath());
break;

// TODO: Supports to download keytab from HCFS, it's unsafe to store
// keytab in an unsecured HDFS cluster. If we use a secured HDFS cluster,
// we should have a keytab first.
default:
throw new IllegalArgumentException(
String.format("Don't support the scheme %s", scheme));
}
} catch (URISyntaxException ue) {
throw new IllegalArgumentException("The uri of keytab has the wrong format", ue);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please move this to a common module with a generic method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// we should have a keytab first.
default:
throw new IllegalArgumentException(
String.format("Don't support the scheme %s", scheme));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Doesn't support...."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Files.createSymbolicLink(keyTabFile.toPath(), new File(uri.getPath()).toPath());
break;

// TODO: Supports to download keytab from HCFS, it's unsafe to store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to add HDFS support, it is quite common to use.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is HCFS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hadoop Compatible File System.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to add HDFS support, it is quite common to use.

Like the comment. If we use a unsecured cluster to download a key tab. It's weird.

refreshScheduledExecutor = null;
}

File keyTabFile = new File(String.format("/tmp/gravitino-%s-keytab", entity.id()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not safe to put keytab in /tmp folder, /tmp folder will be cleaned by the administrator. You'd better choose a new folder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -204,6 +322,7 @@ public HiveSchema createSchema(
NameIdentifier ident, String comment, Map<String, String> properties)
throws NoSuchCatalogException, SchemaAlreadyExistsException {
try {
LOG.warn("Current user : " + UserGroupInformation.getCurrentUser().getUserName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the usage here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

.put(
REFRESH_INTERVAL_SEC,
PropertyEntry.integerOptionalPropertyEntry(
REFRESH_INTERVAL_SEC, "The interval to refresh the principal", true, 60, false))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

60 seconds is too frequent for key refreshment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This interval just check whether the UserGroupInformation is valid. It won't login every time. I will optimize the name and description.

File keyTabFile = new File(String.format("/tmp/gravitino-%s-keytab", entity.id()));
keyTabFile.deleteOnExit();
if (keyTabFile.exists() && !keyTabFile.delete()) {
LOG.warn("Fail to delete key tab file {}", keyTabFile.getAbsolutePath());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the warn level is not enough for the failure of dropping keyTableFile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Files.createSymbolicLink(keyTabFile.toPath(), new File(uri.getPath()).toPath());
break;

// TODO: Supports to download keytab from HCFS, it's unsafe to store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is HCFS?

case "https":
case "ftp":
int timeout =
(int)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileUtils.copyURLToFile();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

throw new IllegalArgumentException(
String.format("Don't support the scheme %s", scheme));
}
} catch (URISyntaxException ue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we can simplify the nested try-catch block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

private DownloadUtils() {}

public static void downloadFile(
String keyTabUri, File keyTabFile, int timeout, Configuration conf) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the parameter should be keyTabXXX, it is a common method. Please carefully review your code again.

Also, can you please move this method to common module. It actually can be used by other modules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't put this to common module. If we put this to common module, the Hadoop class will be loaded by the root JVM. We can't isolate the class of Hadoop.

String.format("Doesn't support the scheme %s", scheme));
}
} catch (URISyntaxException ue) {
throw new IllegalArgumentException("The uri of keytab has the wrong format", ue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you have a UT for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

File keyTabFile = new File(String.format("tmp/gravitino-%s-keytab", entity.id()));
keyTabFile.deleteOnExit();
if (keyTabFile.exists() && !keyTabFile.delete()) {
LOG.error("Fail to delete key tab file {}", keyTabFile.getAbsolutePath());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you throw an exception here, or continue to execute your code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

file.mkdir();
}

File keyTabFile = new File(String.format("tmp/gravitino-%s-keytab", entity.id()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“keytab” is a word here, so don't use camel style like "keyTab".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you use a random name to avoid duplication?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

entity id is a unique id. We won't duplicate.

return client.getDelegationToken(
currentUser.getUserName(),
PrincipalUtils.getCurrentPrincipal().getName());
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you get delegation token every time you do the action?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we will have many users. It's useless if we cache the token.

Heng Qin added 2 commits January 31, 2024 14:50
checkScheduledExecutor = null;
}

File keyTabFile = new File(String.format("tmp/gravitino-%s-keytab", entity.id()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that you have used deleteOnExit to remove the file when JVM terminates, do we need to drop it explicitly here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped them in the close method, too.

DownloadUtils.downloadFile(srcFile.toURI().toString(), destFile, 10, new Configuration());
Assertions.assertTrue(destFile.exists());

srcFile.delete();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's best to test one remote file system to ensure it functions properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that I should add a MiniHdfsCluster for this pr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

break;

case "hdfs":
FileSystem.get(conf).copyToLocalFile(new Path(uri), new Path(destFile.toURI()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need a break here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

String.format("Doesn't support the scheme %s", scheme));
}
} catch (URISyntaxException ue) {
throw new IllegalArgumentException("The uri of keytab has the wrong format", ue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception message is not correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

if (UserGroupInformation.AuthenticationMethod.KERBEROS
== SecurityUtil.getAuthenticationMethod(hadoopConf)) {
try {
File keytabTemporaryDirectory = new File("tmp");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to use name of "keytabs", is it better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

}

// The id of entity is a random unique id.
File keytabFile = new File(String.format("tmp/gravitino-%s-keytab", entity.id()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a unique id, not a random id. So if you have your gravitino restart, the id will not changed, so the keytab will be existed anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file is deleteOnExit.

keytabFile.deleteOnExit();
if (keytabFile.exists() && !keytabFile.delete()) {
throw new IllegalStateException(
String.format("Fail to delete key tab file %s", keytabFile.getAbsolutePath()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keytab is a word here and below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

String catalogPrincipal = (String) catalogPropertiesMetadata.getOrDefault(conf, PRINCIPAL);
Preconditions.checkArgument(
StringUtils.isNotBlank(catalogPrincipal),
"If you use Kerberos, principal can't be blank");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the third-person statement here and above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@jerryshao jerryshao merged commit 1a8768c into apache:main Jan 31, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Hive catalog supports to impersonate to execute operations in Kerberos mode
3 participants