-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DO NOT SUBMIT SPLIT INTO OTHER PRsAdd HarnessMonitoringInfosRequest/Response InstructionRequest and MonitoringInfosMetadataRequest/Response support to Java #14490
Conversation
R: @pabloem |
32bd70a
to
2bbc236
Compare
sorry about the delay. taking a look.. |
Run Java PostCommit |
@@ -36,4 +38,9 @@ | |||
|
|||
/** Reset this metric. */ | |||
void reset(); | |||
|
|||
/** Return the cumulative values for any metrics in this container as MonitoringInfos. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this docstring right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the comment
|
||
/** Return the cumulative values for any metrics in this container as MonitoringInfos. */ | ||
default @Nullable DateTime getStartTime() { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to not add a default, and make this not-nullable? (i.e. force all implementers to define a non-null start time?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to add this to all the metrics, because I didn't want to add unncessary calls to the system to get clock times.
Let's add it to just this place first, and consider adding more if they are needed for another project.
@@ -302,7 +325,7 @@ public void commitUpdates() { | |||
ImmutableList.Builder<MetricUpdate<UpdateT>> updates = ImmutableList.builder(); | |||
for (Map.Entry<MetricName, CellT> cell : cells.entries()) { | |||
UpdateT update = checkNotNull(cell.getValue().getCumulative()); | |||
updates.add(MetricUpdate.create(MetricKey.create(stepName, cell.getKey()), update)); | |||
updates.add(MetricUpdate.create(MetricKey.create(stepName, cell.getKey()), update, null)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe the start time should be the current time? or maybe not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -98,7 +98,8 @@ public void flush(boolean async) { | |||
UpdateT value = cell.getValue(); | |||
if (value != null) { | |||
MetricKey key = MetricKey.create(stepName, cell.getName()); | |||
MetricUpdates.MetricUpdate<UpdateT> update = MetricUpdates.MetricUpdate.create(key, value); | |||
MetricUpdates.MetricUpdate<UpdateT> update = | |||
MetricUpdates.MetricUpdate.create(key, value, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to get the time from the cell?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is actually now the same MetricCell class. so I can't pull a startTime off it.
its actually a CellT extends AbstractMetric
return false; | ||
} | ||
|
||
default void setIsProcessWide(boolean procesWide) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if setting does not do anything, maybe we should throw an error instead of ignoring the instruction silently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
private HashMap<MonitoringInfoMetricName, String> infoKeyToShortId = new HashMap<>(); | ||
|
||
private HashMap<String, MetricsApi.MonitoringInfo> shortIdToInfo = new HashMap<>(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if these should use a Guava cache with automatic invalidation and so on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file was deleted. As @robertwb checked in a different implemetnation in the last few days.
...ava/harness/src/main/java/org/apache/beam/fn/harness/control/MonitoringInfoShortIdCache.java
Outdated
Show resolved
Hide resolved
* MonitoringInfos payloads and MonitoringInfo metadata for "process-wide" metrics, which return | ||
* metric values calculated over the life of the process. | ||
*/ | ||
public class ProcessWideInstructionHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe rename this to reference MonitoringInfoMetadataInstructionHandler or something like that? Since it has process-wide and non-process-wide MIs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to HarnessMonitoringInfosInstructionHandler, which is all this does now.
BeamFnApi.MonitoringInfosMetadataResponse.newBuilder(); | ||
response.putAllMonitoringInfo( | ||
MonitoringInfoShortIdCache.getShortIdCache() | ||
.getInfos(request.getMonitoringInfos().getMonitoringInfoIdList())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the only moment when a MInfo gets added to the ShortIdCache is when we call getShortId
on it, right? I don't see getShortId
getting called for the monitoringInfoMetadata
, only for the harnessMonitoringInfos
- so I wonder how do we get the monitoring infos from the other containers in this call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That wasn't implemented when I sent you this PR. Though robert's change did introduce it.
I have updated the PR to merge with his implementation.
813774d
to
e10e165
Compare
Run Java_Examples_Dataflow PreCommit |
e10e165
to
7f18273
Compare
Run Java PreCommit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except for the one comment.
MonitoringInfo.Builder cleaner = MonitoringInfo.newBuilder(info); | ||
cleaner.clearPayload(); | ||
cleaner.clearStartTime(); | ||
cleaner.clearType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't the type also necessary to identify a monitoring info? What identifies it / what goes into the hash? Is it hash consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR seems to do three separate things.
(1) Change the way ShortIdMap works,
(2) Add process-wide metrics, and
(3) add the notion of startTime to a subset of metrics.
It may be preferable to split them up.
@@ -45,6 +48,8 @@ | |||
*/ | |||
public CounterCell(MetricName name) { | |||
this.name = name; | |||
// The start time of when this cell was first instantied by the container. | |||
this.startTime = new DateTime(DateTimeZone.UTC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a concern that this might be slow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to #14805
Yes, revised to only make this call once per process in the new PR
@@ -94,6 +99,11 @@ public MetricName getName() { | |||
return name; | |||
} | |||
|
|||
@Override | |||
public @Nullable DateTime getStartTime() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we always initialize it, when would it be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to #14805
I don't want to make the system call to get the time for every metric as you suggested due to performance concerns. So I make only the process wide MetricContainer make this call. So it should be called once for the process.
@@ -74,6 +74,8 @@ | |||
|
|||
protected final @Nullable String stepName; | |||
|
|||
private boolean isProcessWide = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this final. We should know at instantiation time whether it's process wide (given that the step name above is final).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to #14805
Done in the new PR
/** Create a new {@link MetricsContainerImpl} associated with the given {@code stepName}. */ | ||
/** | ||
* Create a new {@link MetricsContainerImpl} associated with the given {@code stepName}. If | ||
* stepName is null, this MetricsContainer is not bound to a step. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, require this be non-null, and add a no-arg constructor for the no-step-name (presumably is-process-wide) case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to #14805
Done in the new PR
@@ -262,6 +265,8 @@ public static void main( | |||
} | |||
}); | |||
|
|||
MetricsEnvironment.setProcessWideContainer(new MetricsContainerImpl(null)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we don't hold onto a reference for this, why don't we have MetricsEnvironment create (if possibly lazily) the process-wide container?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to #14805
Unfortunately I could not make MetricsEnvironment instantiate a MetricsContainerImpl inside itself as this would create a dep cycle between their packages. So keeping this one as is.
I'm wondering about the value of attaching startTime to each metric. It seems to add a lot of complexity and is a bit ill-defined in the Beam model. Are these tied somehow to process-wide metrics? If so, could we just use the start time of the process itself when publishing these rather than adding the overhead of tracking it per-metric? |
…quest and MonitoringInfosMetadataRequest/Response support to Java SDK.
loses critial information. So we must use two maps.
7f18273
to
25aae15
Compare
i have split this up into two PRs, rather than 3. As the startTime logic is much simpler now, and only done for the process wide MetricContainer. Overall the whole thing is much smaller. PTAL (1) Change the way ShortIdMap works, |
Thanks.
I'm still not following why we need per-metric start times. And
without start times, it seems the (payload-less) BiMap should just work
which simplifies things even more.
…On Wed, May 12, 2021 at 8:47 PM Alex Amato ***@***.***> wrote:
i have split this up into two PRs, rather than 3. As the startTime logic
is much simpler now, and only done for the process wide MetricContainer.
Overall the whole thing is much smaller. PTAL
(1) Change the way ShortIdMap works,
#14804 <#14804>
(2) Add process-wide metrics, and add the notion of startTime to a subset
of metrics.
#14805 <#14805>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#14490 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AADWVAKMZQS67DO5YM5FCPTTNNDWNANCNFSM42UFLLAQ>
.
|
[BEAM-11994] Add HarnessMonitoringInfosRequest/Response InstructionRequest and MonitoringInfosMetadataRequest/Response support to Java SDK.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).[BEAM-XXX] Fixes bug in ApproximateQuantiles
, where you replaceBEAM-XXX
with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.