Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sidekiq args filtration #2177

Merged
merged 17 commits into from
Aug 30, 2023
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .rubocop.yml
Original file line number Diff line number Diff line change
Expand Up @@ -1259,6 +1259,7 @@ Style/MethodCallWithArgsParentheses:
- raise
- require
- skip
- sleep
- source
- stub
- stub_const
Expand Down
36 changes: 33 additions & 3 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,14 +1,44 @@
# New Relic Ruby Agent Release Notes


## dev

Version <dev> allows the agent to record additional response information on a transaction when middleware instrumentation is disabled and fixes a bug in `NewRelic::Rack::AgentHooks.needed?`.

Version <dev> allows the agent to record additional response information on a transaction when middleware instrumentation is disabled, introduces new `:'sidekiq.args.include'` and `:'sidekiq.args.exclude:` configuration options to permit capturing only certain Sidekiq job arguments, and fixes a bug in `NewRelic::Rack::AgentHooks.needed?`.

- **Feature: Report transaction HTTP status codes when middleware instrumentation is disabled**
Previously, when `disable_middleware_instrumentation` was set to `true`, the agent would not record the value of the response code or content type on the transaction. This was due to the possibility that a middleware could alter the response, which would not be captured by the agent when the middleware instrumentation was disabled. However, based on customer feedback, the agent will now report the HTTP status code and content type on a transaction when middleware instrumentation is disabled. [PR#2175](https://github.com/newrelic/newrelic-ruby-agent/pull/2175)

- **Feature: Permit capturing only certain Sidekiq job arguments**
New `:'sidekiq.args.include'` and `:'sidekiq.args.exclude'` configuration options have been introduced to permit fine grained control over which Sidekiq job arguments (args) are reported to New Relic. By default, no Sidekiq args are reported. To report any Sidekiq options, the `:'attributes.include'` array must include the string `'jobs.sidekiq.args.*'`. With that string in place, all arguments will be reported unless one or more of the new include/exclude options are used. The `:'sidekiq.args.include'` option can be set to an array of strings. Each of those strings will be passed to `Regexp.new` and collectively serve as an allowlist for desired args. For job arguments that are hashes, if a hash's key matches one of the include patterns, then both the key and its corresponding value will be included. For scalar arguments, the string representation of the scalar will need to match one of the include patterns to be captured. The `:'sidekiq.args.exclude'` option works similarly. It can be set to an array of strings that will each be passed to `Regexp.new` to create patterns. These patterns will collectively serve as a denylist for unwanted job args. Any hash key, has value, or scalar that matches an exclude pattern will be excluded (not sent to New Relic). [PR#2177](https://github.com/newrelic/newrelic-ruby-agent/pull/2177)
fallwith marked this conversation as resolved.
Show resolved Hide resolved

`newrelic.yml` examples:

Any string in the `:'sidekiq.args.include'` or `:'sidekiq.args.exclude'` arrays will be turned into a regular expression. Knowledge of [Ruby regular expression support](https://ruby-doc.org/3.2.2/Regexp.html) can be leveraged but is not required. If regular expression syntax is not used, inexact matches will be performed and the string "Fortune" will match both "Fortune 500" and "Fortune and Glory". For exact matches, use [regular expression anchors](https://ruby-doc.org/3.2.2/Regexp.html#class-Regexp-label-Anchors).

```yaml
# Include any argument whose string representation matches either "apple" or "banana"
# The "apple" pattern will match both "green apple" and "red apple"
sidekiq.args.include:
- apple
- banana
fallwith marked this conversation as resolved.
Show resolved Hide resolved

# Exclude any arguments that match either "grape", "orange", or "pear"
sidekiq.args.exclude:
- grape
- orange
- pear

# Exclude any argument that is a 9 digit number
sidekiq.args.exclude:
- '\d{9}'

# Include anything that starts with "blue" but exclude anything that ends in "green"
sidekiq.args.include
- '^blue'

sidekiq.args.exclude
- 'green$'
```

- **Bugfix: Resolve inverted logic of NewRelic::Rack::AgentHooks.needed?**
Previously, `NewRelic::Rack::AgentHooks.needed?` incorrectly used inverted logic. This has now been resolved, allowing AgentHooks to be installed when `disable_middleware_instrumentation` is set to true. [PR#2175](https://github.com/newrelic/newrelic-ruby-agent/pull/2175)

Expand Down
1 change: 1 addition & 0 deletions lib/new_relic/agent.rb
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@ module Agent
require 'new_relic/agent/deprecator'
require 'new_relic/agent/logging'
require 'new_relic/agent/distributed_tracing'
require 'new_relic/agent/attribute_pre_filtering'
require 'new_relic/agent/attribute_processing'
require 'new_relic/agent/linking_metadata'
require 'new_relic/agent/local_log_decorator'
Expand Down
109 changes: 109 additions & 0 deletions lib/new_relic/agent/attribute_pre_filtering.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,109 @@
# This file is distributed under New Relic's license terms.
# See https://github.com/newrelic/newrelic-ruby-agent/blob/main/LICENSE for complete details.
# frozen_string_literal: true

module NewRelic
module Agent
module AttributePreFiltering
module_function

PRE_FILTER_KEYS = %i[include exclude].freeze
DISCARDED = :nr_discarded

def formulate_regexp_union(option)
return if NewRelic::Agent.config[option].empty?

Regexp.union(NewRelic::Agent.config[option].map { |p| string_to_regexp(p) }.uniq.compact).freeze
rescue StandardError => e
NewRelic::Agent.logger.warn("Failed to formulate a Regexp union from the '#{option}' configuration option " +
"- #{e.class}: #{e.message}")
end

def string_to_regexp(str)
Regexp.new(str)
rescue StandardError => e
NewRelic::Agent.logger.warn("Failed to initialize Regexp from string '#{str}' - #{e.class}: #{e.message}")
end

# attribute filtering suppresses data that has already been flattened
# and coerced (serialized as text) via #flatten_and_coerce, and is
# restricted to basic text matching with a single optional wildcard.
# pre filtering operates on raw Ruby objects beforehand and uses full
# Ruby regex syntax
def pre_filter(values = [], options = {})
return values unless !options.empty? && PRE_FILTER_KEYS.any? { |k| options.key?(k) }

# if there's a prefix in play for (non-pre) attribute filtration and
# attribute filtration won't allow that prefix, then don't even bother
# with pre filtration that could only result in values that would be
# blocked
if options.key?(:attribute_namespace) &&
!NewRelic::Agent.instance.attribute_filter.might_allow_prefix?(options[:attribute_namespace])
return values
end

values.each_with_object([]) do |element, filtered|
object = pre_filter_object(element, options)
filtered << object unless discarded?(object)
end
end

def pre_filter_object(object, options)
if object.is_a?(Hash)
pre_filter_hash(object, options)
elsif object.is_a?(Array)
pre_filter_array(object, options)
else
pre_filter_scalar(object, options)
end
end

def pre_filter_hash(hash, options)
filtered_hash = hash.each_with_object({}) do |(key, value), filtered|
filtered_key = pre_filter_object(key, options)
next if discarded?(filtered_key)

# If the key is permitted, skip include filtration for the value
# but still apply exclude filtration
if options.key?(:exclude)
exclude_only = options.dup
exclude_only.delete(:include)
filtered_value = pre_filter_object(value, exclude_only)
next if discarded?(filtered_value)
else
filtered_value = value
end

filtered[filtered_key] = filtered_value
end

filtered_hash.empty? && !hash.empty? ? DISCARDED : filtered_hash
end

def pre_filter_array(array, options)
filtered_array = array.each_with_object([]) do |element, filtered|
filtered_element = pre_filter_object(element, options)
next if discarded?(filtered_element)

filtered.push(filtered_element)
end

filtered_array.empty? && !array.empty? ? DISCARDED : filtered_array
end

def pre_filter_scalar(scalar, options)
return DISCARDED if options.key?(:include) && !scalar.to_s.match?(options[:include])
return DISCARDED if options.key?(:exclude) && scalar.to_s.match?(options[:exclude])

scalar
end

# `nil`, empty enumerable objects, and `false` are all valid in their
# own right as application data, so pre-filtering uses a special value
# to indicate that filtered out data has been discarded
def discarded?(object)
object == DISCARDED
end
end
end
end
32 changes: 32 additions & 0 deletions lib/new_relic/agent/configuration/default_source.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
# frozen_string_literal: true

require 'forwardable'
require_relative '../../constants'

module NewRelic
module Agent
Expand Down Expand Up @@ -1715,6 +1716,37 @@ def self.enforce_fallback(allowed_values: nil, fallback: nil)
:transform => DefaultSource.method(:convert_to_regexp_list),
:description => 'Define transactions you want the agent to ignore, by specifying a list of patterns matching the URI you want to ignore. For more detail, see [the docs on ignoring specific transactions](/docs/agents/ruby-agent/api-guides/ignoring-specific-transactions/#config-ignoring).'
},
# Sidekiq
:'sidekiq.args.include' => {
default: NewRelic::EMPTY_ARRAY,
kaylareopelle marked this conversation as resolved.
Show resolved Hide resolved
public: true,
type: Array,
dynamic_name: true,
allowed_from_server: false,
description: <<~SIDEKIQ_ARGS_INCLUDE.chomp.tr("\n", ' ')
An array of strings that will collectively serve as an allowlist for filtering which Sidekiq
job arguments get reported to New Relic. The capturing of any Sidekiq arguments requires that
'job.sidekiq.args.*' be added to the separate :'attributes.include' configuration option. Each
fallwith marked this conversation as resolved.
Show resolved Hide resolved
string in this array will be turned into a regular expression via `Regexp.new` to permit advanced
matching. For job argument hashes, if either a key or value matches the pair will be included. All
matching job argument array elements and job argument scalars will be included.
SIDEKIQ_ARGS_INCLUDE
},
:'sidekiq.args.exclude' => {
default: NewRelic::EMPTY_ARRAY,
public: true,
type: Array,
dynamic_name: true,
allowed_from_server: false,
description: <<~SIDEKIQ_ARGS_EXCLUDE.chomp.tr("\n", ' ')
An array of strings that will collectively serve as a denylist for filtering which Sidekiq
job arguments get reported to New Relic. The capturing of any Sidekiq arguments requires that
'job.sidekiq.args.*' be added to the separate :'attributes.include' configuration option. Each string
fallwith marked this conversation as resolved.
Show resolved Hide resolved
in this array will be turned into a regular expression via `Regexp.new` to permit advanced matching.
For job argument hashes, if either a key or value matches the pair will be excluded. All matching job
argument array elements and job argument scalars will be excluded.
SIDEKIQ_ARGS_EXCLUDE
},
# Slow SQL
:'slow_sql.enabled' => {
:default => value_of(:'transaction_tracer.enabled'),
Expand Down
26 changes: 23 additions & 3 deletions lib/new_relic/agent/instrumentation/sidekiq/server.rb
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@ class Server
include NewRelic::Agent::Instrumentation::ControllerInstrumentation
include Sidekiq::ServerMiddleware if defined?(Sidekiq::ServerMiddleware)

ATTRIBUTE_BASE_NAMESPACE = 'sidekiq.args'
ATTRIBUTE_FILTER_TYPES = %i[include exclude].freeze
ATTRIBUTE_JOB_NAMESPACE = :"job.#{ATTRIBUTE_BASE_NAMESPACE}"

# Client middleware has additional parameters, and our tests use the
# middleware client-side to work inline.
def call(worker, msg, queue, *_)
Expand All @@ -18,10 +22,16 @@ def call(worker, msg, queue, *_)
trace_headers = msg.delete(NewRelic::NEWRELIC_KEY)

perform_action_with_newrelic_trace(trace_args) do
NewRelic::Agent::Transaction.merge_untrusted_agent_attributes(msg['args'], :'job.sidekiq.args',
NewRelic::Agent::AttributeFilter::DST_NONE)
NewRelic::Agent::Transaction.merge_untrusted_agent_attributes(
NewRelic::Agent::AttributePreFiltering.pre_filter(msg['args'], self.class.nr_attribute_options),
ATTRIBUTE_JOB_NAMESPACE,
NewRelic::Agent::AttributeFilter::DST_NONE
)

if ::NewRelic::Agent.config[:'distributed_tracing.enabled'] && trace_headers&.any?
::NewRelic::Agent::DistributedTracing::accept_distributed_trace_headers(trace_headers, 'Other')
end
fallwith marked this conversation as resolved.
Show resolved Hide resolved

::NewRelic::Agent::DistributedTracing::accept_distributed_trace_headers(trace_headers, 'Other') if ::NewRelic::Agent.config[:'distributed_tracing.enabled'] && trace_headers&.any?
yield
end
end
Expand All @@ -33,5 +43,15 @@ def self.default_trace_args(msg)
:category => 'OtherTransaction/SidekiqJob'
}
end

def self.nr_attribute_options
@nr_attribute_options ||= begin
ATTRIBUTE_FILTER_TYPES.each_with_object({}) do |type, opts|
pattern =
NewRelic::Agent::AttributePreFiltering.formulate_regexp_union(:"#{ATTRIBUTE_BASE_NAMESPACE}.#{type}")
opts[type] = pattern if pattern
end.merge(attribute_namespace: ATTRIBUTE_JOB_NAMESPACE)
end
end
end
end
1 change: 1 addition & 0 deletions test/helpers/config_scanning.rb
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ def scan_and_remove_used_entries(default_keys, non_test_files)
captures.flatten.compact.each do |key|
default_keys.delete(key.delete("'").to_sym)
end

# Remove any config keys that are annotated with the 'dynamic_name' setting
# This indicates that the names of these keys are constructed dynamically at
# runtime, so we don't expect any explicit references to them in code.
Expand Down
16 changes: 0 additions & 16 deletions test/multiverse/suites/sidekiq/after_suite.rb

This file was deleted.

98 changes: 98 additions & 0 deletions test/multiverse/suites/sidekiq/sidekiq_args_filtration_test.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
# This file is distributed under New Relic's license terms.
# See https://github.com/newrelic/newrelic-ruby-agent/blob/main/LICENSE for complete details.
# frozen_string_literal: true

require_relative 'sidekiq_test_helpers'

class SidekiqArgsFiltrationTest < Minitest::Test
include SidekiqTestHelpers

ARGUMENTS = [{'username' => 'JBond007',
'color' => 'silver',
'record' => true,
'items' => %w[stag thistle peat],
'price_map' => {'apple' => 0.75, 'banana' => 0.50, 'pear' => 0.99}},
'When I thought I heard myself say no',
false].freeze

def setup
cache_var = :@nr_attribute_options
if NewRelic::Agent::Instrumentation::Sidekiq::Server.instance_variables.include?(cache_var)
NewRelic::Agent::Instrumentation::Sidekiq::Server.remove_instance_variable(cache_var)
end
end

def test_by_default_no_args_are_captured
captured_args = run_job_and_get_attributes(*ARGUMENTS)

assert_empty captured_args, "Didn't expect to capture any attributes for the Sidekiq job, " +
"captured: #{captured_args}"
end

def test_all_args_are_captured
expected = flatten(ARGUMENTS)
with_config(:'attributes.include' => 'job.sidekiq.args.*') do
captured_args = run_job_and_get_attributes(*ARGUMENTS)

assert_equal expected, captured_args, "Expected all args to be captured. Wanted:\n\n#{expected}\n\n" +
"Got:\n\n#{captured_args}\n\n"
end
end

def test_only_included_args_are_captured
included = ['price_map']
expected = flatten([{included.first => ARGUMENTS.first[included.first]}])
with_config(:'attributes.include' => 'job.sidekiq.args.*',
:'sidekiq.args.include' => included) do
captured_args = run_job_and_get_attributes(*ARGUMENTS)

assert_equal expected, captured_args, "Expected only '#{included}' args to be captured. " +
"Wanted:\n\n#{expected}\n\nGot:\n\n#{captured_args}\n\n"
end
end

def test_excluded_args_are_not_captured
excluded = ['username']
without_excluded = ARGUMENTS.dup
without_excluded.first.delete(excluded.first)
expected = flatten(without_excluded)
with_config(:'attributes.include' => 'job.sidekiq.args.*',
:'sidekiq.args.exclude' => excluded) do
captured_args = run_job_and_get_attributes(*ARGUMENTS)

assert_equal expected, captured_args, "Expected '#{excluded}' to be excluded from capture. " +
"Wanted:\n\n#{expected}\n\nGot:\n\n#{captured_args}\n\n"
end
end

def test_include_and_exclude_cascaded
included = ['price_map']
excluded = %w[apple pear]
hash = {included.first => ARGUMENTS.first[included.first].dup}
# TODO: OLD RUBIES - Requires 3.0
# Hash#except would be better here, requires Ruby v3+
excluded.each { |exclude| hash[included.first].delete(exclude) }
expected = flatten([hash])
with_config(:'attributes.include' => 'job.sidekiq.args.*',
:'sidekiq.args.include' => included,
:'sidekiq.args.exclude' => excluded) do
captured_args = run_job_and_get_attributes(*ARGUMENTS)

assert_equal expected, captured_args, "Used included='#{included}', excluded='#{excluded}'. " +
"Wanted:\n\n#{expected}\n\nGot:\n\n#{captured_args}\n\n"
end
end

def test_arcane_pattern_usage
# no booleans, nothing with numbers, no *.name except unitname, anything ending in 't', a string with I, I, and y, y
excluded = ['^true|false$', '\d+', '(?!<unit)name$', 't$', 'I.*I.*y.*.y']
expected = flatten([{'color' => 'silver', 'items' => %w[stag thistle]}])
with_config(:'attributes.include' => 'job.sidekiq.args.*',
:'sidekiq.args.exclude' => excluded) do
captured_args = run_job_and_get_attributes(*ARGUMENTS)

assert_equal expected, captured_args, "Used excluded='#{excluded}'. " +
"Wanted:\n\n#{expected}\n\nGot:\n\n#{captured_args}\n\n"
end
end
end
Loading