-
Notifications
You must be signed in to change notification settings - Fork 395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#1187] feat(lakehouse-iceberg): the basic framework of storing Iceberg metrics #1164
Conversation
What's the purpose of writing metrics to FS, not db? |
Analyzing the metrics with Spark or Python is easy if writing to FS. |
remove the specified storage implement, leaving the basic framework. |
@Clearvive @jerryshao , could you help to review this pr? |
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Outdated
Show resolved
Hide resolved
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Show resolved
Hide resolved
|
||
public static final ConfigEntry<Integer> ICEBERG_METRICS_STORAGE_RETAIN_DAYS = | ||
new ConfigBuilder(IcebergMetricsManager.ICEBERG_METRICS_STORAGE_RETAIN_DAYS) | ||
.doc("The retain days of Iceberg metrics") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it reasonable to keep the default setting based on the day dimension, and would it be more reasonable to configure it based on the hour dimension
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep metrics in days seems enough
Do we need a metrics storage for the Gravitino server? |
...rc/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/IcebergObjectMapper.java
Outdated
Show resolved
Hide resolved
...-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/IcebergConfig.java
Outdated
Show resolved
Hide resolved
...g/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/IcebergRestUtils.java
Outdated
Show resolved
Hide resolved
no, it's used to store Iceberg query or commit metrics not server metrics. |
@Clearvive @yuqi1129 , are comments addressed, please review again. thx. |
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Outdated
Show resolved
Hide resolved
LGTM except the few comments |
...-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/IcebergConfig.java
Show resolved
Hide resolved
...-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/IcebergConfig.java
Outdated
Show resolved
Hide resolved
mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false); | ||
mapper.setPropertyNamingStrategy(new PropertyNamingStrategies.KebabCaseStrategy()); | ||
RESTSerializers.registerAll(mapper); | ||
return mapper; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of change here, why do you need a new ObjectMapper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previouse implement(JsonUtils.objectMapper()
) get a global ObjectMapper and applies Iceberg REST properties to it, this's not a good design. it may effect other ObjectMappers.
LocalDateTime currentDateTime = | ||
LocalDateTime.ofInstant(currentTimestamp, ZoneId.systemDefault()); | ||
LocalDateTime nextHourStartTime = currentDateTime.plusHours(1).withMinute(0).withSecond(0); | ||
return nextHourStartTime.atZone(ZoneId.systemDefault()).toInstant(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of using LocalDateTime
, it is quite simple to calculate the time hour's timestamp by using mod, like (curr_time + 3600000) - (curr_time + 3600000) % 3600000
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LocalDateTime
seems more maintainable.
...java/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/EmptyMetricsStorage.java
Outdated
Show resolved
Hide resolved
this.icebergObjectMapper = IcebergObjectMapper.getInstance(); | ||
} | ||
|
||
public String toJsonOrToString(MetricsReport metricsReport) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method name is weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about toPrintableString()
?
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
private void logMetrics(String message, MetricsReport metricsReport) { | ||
LOG.info("{} {}", message, icebergMetricsFormatter.toJsonOrToString(metricsReport)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this IcebergMetricsFormatter
is only used here, Is it necessary to have a class to define this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IcebergMetricsFormatter is designed to provide variable format for other storages like KVStorage
or SQLStorage
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Outdated
Show resolved
Hide resolved
@jerryshao , all comments are addressed, please help to review again. |
planing to do:
|
supported in the lastest pr , cc @jerryshao |
I think there's no need to use service loader to make it pluggable, using reflection should be enough. |
replace service loader with reflection. |
...-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/IcebergConfig.java
Show resolved
Hide resolved
...-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/IcebergConfig.java
Show resolved
Hide resolved
public String toJson(MetricsReport metricsReport) throws JsonProcessingException { | ||
return icebergObjectMapper.writeValueAsString(metricsReport); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this class is only used for print logs, I don't think it is necessary to have a class, some helper methods should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will use this class to serialize metrics report to json string in KVMetricsStorage in next pr. should I keep it as a class?
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Outdated
Show resolved
Hide resolved
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Outdated
Show resolved
Hide resolved
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Outdated
Show resolved
Hide resolved
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Outdated
Show resolved
Hide resolved
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsManager.java
Show resolved
Hide resolved
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsStorage.java
Outdated
Show resolved
Hide resolved
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/IcebergMetricsStorage.java
Outdated
Show resolved
Hide resolved
...ava/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/MemoryMetricsStorage.java
Outdated
Show resolved
Hide resolved
...-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/IcebergConfig.java
Show resolved
Hide resolved
...va/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/metrics/BlackHoleMetricsStore.java
Outdated
Show resolved
Hide resolved
} | ||
} | ||
|
||
public void save(MetricsReport metricsReport) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This save method is quite confused with the method above, you'd better choose a better name.
Also you should put all the public method together, then all all private method.
metricsWriterThread.join(); | ||
} catch (InterruptedException e) { | ||
LOG.warn("Iceberg metrics manager is interrupted while join metrics writer thread."); | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should not return directly here, you still need to execute the follow-up logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If IcebergMetricsManager is interrupted, I think it's better to quit as quickly as possible in case the other cleanup actions block the close.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least you should close the iceberg metrics store, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
What changes were proposed in this pull request?
MetricsManager
to manage storagesIcebergMetricsFormatter
to format metrics, support JSON for now.Why are the changes needed?
Fix: #1187
Does this PR introduce any user-facing change?
no
How was this patch tested?