-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Dimensional metric reporting #376
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -229,6 +229,17 @@ defmodule NewRelic do | |
defdelegate report_custom_event(type, attributes), | ||
to: NewRelic.Harvest.Collector.CustomEvent.Harvester | ||
|
||
@doc """ | ||
Report a Dimensional Metric. | ||
Valid types: `:count`, `:gauge`, and `:summary`. | ||
|
||
```elixir | ||
NewRelic.report_dimensional_metric(:count, "my_metric_name", 1, %{some: "attributes"}) | ||
``` | ||
""" | ||
defdelegate report_dimensional_metric(type, name, value, attributes \\ %{}), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I feel like there should be other functions that call this, as well. Similar to how incrementing a timeslice metric works. |
||
to: NewRelic.Harvest.TelemetrySdk.DimensionalMetrics.Harvester | ||
|
||
@doc """ | ||
Report a Custom metric. | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
defmodule NewRelic.Harvest.TelemetrySdk.DimensionalMetrics.Harvester do | ||
use GenServer | ||
|
||
@moduledoc false | ||
|
||
alias NewRelic.Harvest | ||
alias NewRelic.Harvest.TelemetrySdk | ||
|
||
@interval_ms TelemetrySdk.Config.lookup(:dimensional_metrics_harvest_cycle) | ||
|
||
@valid_types [:count, :gauge, :summary] | ||
|
||
def start_link(_) do | ||
GenServer.start_link(__MODULE__, []) | ||
end | ||
|
||
def init(_) do | ||
{:ok, | ||
%{ | ||
start_time_ms: System.system_time(:millisecond), | ||
metrics: [] | ||
}} | ||
end | ||
|
||
# API | ||
|
||
@spec report_dimensional_metric(:count | :gauge | :summary, atom() | binary(), any, map()) :: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should I remove the typespec? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I feel like a pretty detailed spec would be worth it on the function inside |
||
:ok | ||
def report_dimensional_metric(type, name, value, attributes) when type in @valid_types do | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The main risk I'm noticing in this PR is that Which behavior do you want this function to have? I suspect most folks would assume the aggregated behavior. For example, if someone calls this function for every single request, we wouldn't want to copy the name & attributes potentially thousands of times a second.. If you did want that behavior, perhaps There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah. I definitely agree that it's not what I would expect. What are your thoughts on efficiently storing the map of attributes along with the metric name? Eg I report metric_a with attributes a, b, and c. Then I report metric_a with attributes c, d, and e. Do I merge the maps? Or does that remain as two metrics? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I could use something like phash2 on the attributes map and then aggregate matching ones. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that's right, the whole of the attributes makes up a unique metric. The other metric harvester has an optimized aggregation implementation you could steal There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So it would be keyed off of a tuple of the name and the hash of the attributes? The timeslice metric harvester uses the name and the scope, it seems. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep! |
||
TelemetrySdk.DimensionalMetrics.HarvestCycle | ||
|> Harvest.HarvestCycle.current_harvester() | ||
|> GenServer.cast({:report, %{type: type, name: name, value: value, attributes: attributes}}) | ||
end | ||
|
||
def gather_harvest, | ||
do: | ||
TelemetrySdk.DimensionalMetrics.HarvestCycle | ||
|> Harvest.HarvestCycle.current_harvester() | ||
|> GenServer.call(:gather_harvest) | ||
|
||
# do not accept more report messages when harvest has already been reported | ||
def handle_cast(_late_msg, :completed), do: {:noreply, :completed} | ||
|
||
def handle_cast({:report, metric}, state) do | ||
# TODO: merge metrics with the same type/name/attributes? | ||
# Take phash2 of attributes and store them as a map with the | ||
# key being {type, name, phash2(attributes)} => [metrics] | ||
{:noreply, %{state | metrics: [metric | state.metrics]}} | ||
end | ||
|
||
# do not resend metrics when harvest has already been reported | ||
def handle_call(_late_msg, _from, :completed), do: {:reply, :completed, :completed} | ||
|
||
def handle_call(:send_harvest, _from, state) do | ||
send_harvest(state) | ||
{:reply, :ok, :completed} | ||
end | ||
|
||
def handle_call(:gather_harvest, _from, state) do | ||
{:reply, build_dimensional_metric_data(state.metrics, state), state} | ||
end | ||
|
||
# Helpers | ||
|
||
defp send_harvest(state) do | ||
TelemetrySdk.API.log(build_dimensional_metric_data(state.metrics, state)) | ||
log_harvest(length(state.metrics)) | ||
end | ||
|
||
defp log_harvest(harvest_size) do | ||
NewRelic.log( | ||
:debug, | ||
"Completed TelemetrySdk.DimensionalMetrics harvest - size: #{harvest_size}" | ||
) | ||
end | ||
|
||
defp build_dimensional_metric_data(metrics, state) do | ||
[ | ||
%{ | ||
metrics: metrics, | ||
common: common(state) | ||
} | ||
] | ||
end | ||
|
||
defp common(%{start_time_ms: start_time_ms}) do | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I made the start time the common timestamp for any metrics in this harvest cycle. I don't think there is any accuracy below (or even near to) 5 seconds when querying. |
||
%{"timestamp" => start_time_ms, "interval.ms" => @interval_ms} | ||
tpitale marked this conversation as resolved.
Show resolved
Hide resolved
|
||
end | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
defmodule DimensionalMetricTest do | ||
use ExUnit.Case | ||
|
||
test "reports dimensional metrics" do | ||
TestHelper.restart_harvest_cycle( | ||
NewRelic.Harvest.TelemetrySdk.DimensionalMetrics.HarvestCycle | ||
) | ||
|
||
NewRelic.report_dimensional_metric(:count, "memory.foo_baz", 100, %{cpu: 1000}) | ||
NewRelic.report_dimensional_metric(:summary, "memory.foo_bar", 50, %{cpu: 2000}) | ||
|
||
[%{common: common, metrics: metrics}] = | ||
TestHelper.gather_harvest(NewRelic.Harvest.TelemetrySdk.DimensionalMetrics.Harvester) | ||
|
||
assert common["interval.ms"] > 0 | ||
assert common["timestamp"] > 0 | ||
|
||
assert length(metrics) == 2 | ||
[metric1, metric2] = metrics | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This points out that the metrics are coming in reverse order because that's how they're added. Not sure if it is worth swapping the order, or reversing the metrics at gather time. |
||
assert metric1.name == "memory.foo_bar" | ||
assert metric1.type == :summary | ||
|
||
assert metric2.name == "memory.foo_baz" | ||
assert metric2.type == :count | ||
|
||
TestHelper.pause_harvest_cycle(NewRelic.Harvest.TelemetrySdk.DimensionalMetrics.HarvestCycle) | ||
end | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
defmodule TelemetrySdk.DimensionalMetricsHarvesterTest do | ||
use ExUnit.Case | ||
|
||
alias NewRelic.Harvest | ||
alias NewRelic.Harvest.TelemetrySdk | ||
|
||
test "Harvester collects dimensional metrics" do | ||
{:ok, harvester} = | ||
DynamicSupervisor.start_child( | ||
TelemetrySdk.DimensionalMetrics.HarvesterSupervisor, | ||
TelemetrySdk.DimensionalMetrics.Harvester | ||
) | ||
|
||
metric1 = %{} | ||
GenServer.cast(harvester, {:report, metric1}) | ||
|
||
metrics = GenServer.call(harvester, :gather_harvest) | ||
assert length(metrics) > 0 | ||
end | ||
|
||
test "harvest cycle" do | ||
Application.put_env(:new_relic_agent, :dimensional_metrics_harvest_cycle, 300) | ||
TestHelper.restart_harvest_cycle(TelemetrySdk.DimensionalMetrics.HarvestCycle) | ||
|
||
first = Harvest.HarvestCycle.current_harvester(TelemetrySdk.DimensionalMetrics.HarvestCycle) | ||
Process.monitor(first) | ||
|
||
# Wait until harvest swap | ||
assert_receive {:DOWN, _ref, _, ^first, :shutdown}, 1000 | ||
|
||
second = Harvest.HarvestCycle.current_harvester(TelemetrySdk.DimensionalMetrics.HarvestCycle) | ||
Process.monitor(second) | ||
|
||
refute first == second | ||
assert Process.alive?(second) | ||
|
||
TestHelper.pause_harvest_cycle(TelemetrySdk.DimensionalMetrics.HarvestCycle) | ||
Application.delete_env(:new_relic_agent, :dimensional_metrics_harvest_cycle) | ||
|
||
# Ensure the last harvester has shut down | ||
assert_receive {:DOWN, _ref, _, ^second, :shutdown}, 1000 | ||
end | ||
|
||
test "Ignore late reports" do | ||
TestHelper.restart_harvest_cycle(TelemetrySdk.DimensionalMetrics.HarvestCycle) | ||
|
||
harvester = | ||
TelemetrySdk.DimensionalMetrics.HarvestCycle | ||
|> Harvest.HarvestCycle.current_harvester() | ||
|
||
assert :ok == GenServer.call(harvester, :send_harvest) | ||
|
||
GenServer.cast(harvester, {:report, :late_msg}) | ||
|
||
assert :completed == GenServer.call(harvester, :send_harvest) | ||
|
||
TestHelper.pause_harvest_cycle(TelemetrySdk.DimensionalMetrics.HarvestCycle) | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should add some docs (or a link) on when to use each of the types.