Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEBUG-2334 Remaining dynamic instrumentation code #4063

Closed
wants to merge 12 commits into from
Closed

Conversation

p-datadog
Copy link
Member

@p-datadog p-datadog commented Nov 4, 2024

What does this PR do?

Adds remaining dynamic instrumentation code, tests and exposes the feature to users.

  • Adds serializers for Time, Date, DateTime (used by ActiveRecord timestamps)
  • Adds a mechanism to register custom serializers (used by the ActiveRecord::Base serializer included)
  • Adds a custom serializer for ActiveRecord::Base objects, currently this just exports the attributes (thus not exporting all of the other fields of AR objects that usually are not interesting) but could also export new_record? flag and possibly some other fields
  • Internal DI settings moved to under internal namespace
  • Internal setting added to enable DI in development environments, this is needed for the test suite to work and I am using this for internal troubleshooting/testing
  • Minimum send interval for the network submission worker moved to configuration setting from a constant
  • Proper logging implemented
  • Telemetry is injected and unhandled exceptions are reported to it
  • Line probes now require a single match of the provided path suffix, if multiple paths match an exception is raised (this will be reported via UI to the user, on the assumption that user wanted a specific file instrumented and we don't know which of the multiple matching files is the correct one)
  • When probe installation fails for reasons other than the referenced target being missing (presumably not yet loaded), the failure is noted and that probe is not attempted again in the future
  • DI code tracking is enabled by default when DI loads. This is needed to instrument third-party and customer code via line probes.
  • DI is loaded by default
  • DI component added
  • Remote client updated to invoke DI to process DI configs
  • Probe manager component added - this stores the probes received from remote config, tracks their status (pending/installed), and drives the rest of DI to instrument code, send notifications/events, etc.
  • Code tracker invokes DI when a file is loaded to install any pending line probes in that file
  • Notifications/events now use actual path of file where the code was invoked, previously this was hacked via basename comparisons

Motivation:
Initial implementation of DI for Ruby.

Change log entry
Dynamic instrumentation is now available in Ruby. Currently only log probes are implemented, but they can be set on both methods and lines.

Additional Notes:

Apologies for the diff size. lib diff is only about 1/3 of the total size, the rest is in spec.

Still remaining:

  • Go through all DI code and check all outstanding TODO comments for what needs to be addressed prior to initial release
  • Various additional tests, I plan to write them in parallel with working on system tests, documentation and reliability environment

How to test the change?
This PR includes fairly extensive integration tests for DI, however true end-to-end testing is to be done via system tests which is my current task.

@p-datadog p-datadog requested a review from a team as a code owner November 4, 2024 17:04
@github-actions github-actions bot added the core Involves Datadog core libraries label Nov 4, 2024
# a common base class but are all of different classes) or for Mongoid
# models (that do not have a common base class at all but include a
# standard Mongoid module).
@@flat_registry = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Code Quality Violation

Do not use class variables (...read more)

The rule "Avoid class variables" refers to the practice of refraining from using class variables (variables starting with '@@') in Ruby. Class variables are shared between a class and all of its descendants, which can lead to unexpected behavior and bugs that are difficult to trace. This is because if a class variable is changed in a subclass, that change will also affect the superclass and all other subclasses.

This rule is crucial for maintaining clean, predictable, and easy-to-debug code. It also helps to prevent unintentional side effects that can occur when class variables are manipulated in different parts of a program.

To adhere to this rule, consider using class instance variables or constants instead. Class instance variables belong solely to the class they are defined in, and their value does not get shared with subclasses. Constants, on the other hand, are a good option when the value is not meant to change. For example, in the given non-compliant code, the class variable @@class_var could be replaced with a class instance variable @class_var or a constant CLASS_VAR, depending on the intended use.

View in Datadog  Leave us feedback  Documentation

# TODO lock here?

# Probe failed to install previously, do not try to install it again.
if msg = failed_probes[probe.id]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Violation

Suggested change
if msg = failed_probes[probe.id]
if (msg = failed_probes[probe.id])
Use parentheses around assignment in condition (...read more)

The rule "Wrap assignment in condition" is designed to avoid a common programming bug where a single equals sign (=) is used in a conditional statement instead of a double equals sign (==). This can lead to unexpected behavior because the single equals sign is used for assignment in Ruby, not for comparison.

This is important because using assignment in a conditional statement can lead to code that is hard to read and understand. It can also lead to bugs if the assignment is unintentional. The assignment will always return a truthy value unless the assigned value is nil or false, which may not be the expected behavior.

To avoid violating this rule, always use double equals (==) for comparison in conditional statements. If you need to assign a value and use it in the condition, you should do the assignment outside of the conditional statement. If you must do the assignment inside the condition, wrap the assignment in parentheses to make it clear that the assignment is intentional. This improves code readability and helps prevent bugs.

View in Datadog  Leave us feedback  Documentation

# same steep failure with array slices.
# https://github.com/soutaro/steep/issues/1219
value = value[0...max] || []
value = if String === value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Violation

Suggested change
value = if String === value
value = if value.is_a?(String)
Do not use === (...read more)

The case equality operator === in Ruby is used to test equality within a when clause of a case statement. However, it's often considered a bad practice to use this operator explicitly outside of a case statement. This is because its behavior can be quite unpredictable and confusing, as it behaves differently for different classes.

The use of the === operator can lead to code that is harder to read and understand. It's also potentially prone to bugs, as it might not behave as expected with certain objects. Therefore, it's recommended to avoid the explicit use of the === operator.

Instead of using the === operator, it's better to use more explicit methods that clearly indicate what you're trying to achieve. For example, if you're trying to check if a string matches a regular expression, you can use the match? method. If you want to check if an object is an instance of a certain class, you can use the is_a? method. These methods are much more clear and straightforward, leading to better, more maintainable code.

View in Datadog  Leave us feedback  Documentation

context 'when serialization raises an exception' do
before do
# Register a custom serializer that will raise an exception
Datadog::DI::Serializer.register(condition: lambda { |value| DISerializerCustomExceptionTestClass === value }) do |*args|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Violation

Suggested change
Datadog::DI::Serializer.register(condition: lambda { |value| DISerializerCustomExceptionTestClass === value }) do |*args|
Datadog::DI::Serializer.register(condition: lambda { |value| value.is_a?(DISerializerCustomExceptionTestClass) }) do |*args|
Do not use === (...read more)

The case equality operator === in Ruby is used to test equality within a when clause of a case statement. However, it's often considered a bad practice to use this operator explicitly outside of a case statement. This is because its behavior can be quite unpredictable and confusing, as it behaves differently for different classes.

The use of the === operator can lead to code that is harder to read and understand. It's also potentially prone to bugs, as it might not behave as expected with certain objects. Therefore, it's recommended to avoid the explicit use of the === operator.

Instead of using the === operator, it's better to use more explicit methods that clearly indicate what you're trying to achieve. For example, if you're trying to check if a string matches a regular expression, you can use the match? method. If you want to check if an object is an instance of a certain class, you can use the is_a? method. These methods are much more clear and straightforward, leading to better, more maintainable code.

View in Datadog  Leave us feedback  Documentation


context c.fetch(:name) do
let(:value) do
if Proc === value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Violation

Suggested change
if Proc === value
if value.is_a?(Proc)
Do not use === (...read more)

The case equality operator === in Ruby is used to test equality within a when clause of a case statement. However, it's often considered a bad practice to use this operator explicitly outside of a case statement. This is because its behavior can be quite unpredictable and confusing, as it behaves differently for different classes.

The use of the === operator can lead to code that is harder to read and understand. It's also potentially prone to bugs, as it might not behave as expected with certain objects. Therefore, it's recommended to avoid the explicit use of the === operator.

Instead of using the === operator, it's better to use more explicit methods that clearly indicate what you're trying to achieve. For example, if you're trying to check if a string matches a regular expression, you can use the match? method. If you want to check if an object is an instance of a certain class, you can use the is_a? method. These methods are much more clear and straightforward, leading to better, more maintainable code.

View in Datadog  Leave us feedback  Documentation

else
::ActiveRecord::Base.connection_config
end
rescue ActiveRecord::ConnectionNotEstablished
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Violation

Do not suppress exceptions without a comment (...read more)

The rule "Do not suppress exceptions without a comment" is a guideline that helps in maintaining the readability, debuggability, and maintainability of your code. When an exception is silently suppressed, it can lead to confusion for other developers who later maintain the code, as it may not be clear why the exception is being ignored. This can also hide potential issues in your code which might lead to bugs that are difficult to diagnose.

This rule is important because exceptions are generally a sign of an unexpected condition or error in your code. Ignoring them without any explanation can lead to hidden bugs and make your code harder to understand and maintain. It can also lead to incorrect behavior of your application, as the exception may be signaling a condition that needs to be handled.

To avoid violating this rule, always handle exceptions in your rescue block, or if you are intentionally suppressing an exception, clearly document the reason with a comment. This way, other developers will understand why the exception is being ignored. For example, you can handle the exception by logging it or by executing some fallback code. If you are suppressing the exception because it's expected and safe to ignore, explain why this is the case in a comment. For example: rescue SomeError # We expect this error and it's safe to ignore because.... This will make your code easier to understand and maintain.

View in Datadog  Leave us feedback  Documentation

# TODO this line is not being exercised
DI.component&.probe_manager&.install_pending_line_probes(path)
rescue => exc
if component = DI.component
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Violation

Suggested change
if component = DI.component
if (component = DI.component)
Use parentheses around assignment in condition (...read more)

The rule "Wrap assignment in condition" is designed to avoid a common programming bug where a single equals sign (=) is used in a conditional statement instead of a double equals sign (==). This can lead to unexpected behavior because the single equals sign is used for assignment in Ruby, not for comparison.

This is important because using assignment in a conditional statement can lead to code that is hard to read and understand. It can also lead to bugs if the assignment is unintentional. The assignment will always return a truthy value unless the assigned value is nil or false, which may not be the expected behavior.

To avoid violating this rule, always use double equals (==) for comparison in conditional statements. If you need to assign a value and use it in the condition, you should do the assignment outside of the conditional statement. If you must do the assignment inside the condition, wrap the assignment in parentheses to make it clear that the assignment is intentional. This improves code readability and helps prevent bugs.

View in Datadog  Leave us feedback  Documentation

{name: var_name}
end

if expected_matches = c[:expected_matches]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Violation

Suggested change
if expected_matches = c[:expected_matches]
if (expected_matches = c[:expected_matches])
Use parentheses around assignment in condition (...read more)

The rule "Wrap assignment in condition" is designed to avoid a common programming bug where a single equals sign (=) is used in a conditional statement instead of a double equals sign (==). This can lead to unexpected behavior because the single equals sign is used for assignment in Ruby, not for comparison.

This is important because using assignment in a conditional statement can lead to code that is hard to read and understand. It can also lead to bugs if the assignment is unintentional. The assignment will always return a truthy value unless the assigned value is nil or false, which may not be the expected behavior.

To avoid violating this rule, always use double equals (==) for comparison in conditional statements. If you need to assign a value and use it in the condition, you should do the assignment outside of the conditional statement. If you must do the assignment inside the condition, wrap the assignment in parentheses to make it clear that the assignment is intentional. This improves code readability and helps prevent bugs.

View in Datadog  Leave us feedback  Documentation

describe '.register' do
context 'with condition' do
before do
described_class.register(condition: lambda { |value| String === value && value =~ /serializer spec hello/ }) do |serializer, value, name:, depth:|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Violation

Suggested change
described_class.register(condition: lambda { |value| String === value && value =~ /serializer spec hello/ }) do |serializer, value, name:, depth:|
described_class.register(condition: lambda { |value| value.is_a?(String) && value =~ /serializer spec hello/ }) do |serializer, value, name:, depth:|
Do not use === (...read more)

The case equality operator === in Ruby is used to test equality within a when clause of a case statement. However, it's often considered a bad practice to use this operator explicitly outside of a case statement. This is because its behavior can be quite unpredictable and confusing, as it behaves differently for different classes.

The use of the === operator can lead to code that is harder to read and understand. It's also potentially prone to bugs, as it might not behave as expected with certain objects. Therefore, it's recommended to avoid the explicit use of the === operator.

Instead of using the === operator, it's better to use more explicit methods that clearly indicate what you're trying to achieve. For example, if you're trying to check if a string matches a regular expression, you can use the match? method. If you want to check if an object is an instance of a certain class, you can use the is_a? method. These methods are much more clear and straightforward, leading to better, more maintainable code.

View in Datadog  Leave us feedback  Documentation

Comment on lines +132 to +135
if settings.remote.enabled && old_state&.[](:remote)
# remote should be defined here
remote.start
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use it for testing to avoid having to hit the app before it instruments code but it does not belong in this PR, I will revert it.

Comment on lines +21 to +29
# The third-party library integrations need to be loaded after the
# third-party libraries are loaded. Tracing and appsec use Railtie
# to delay integrations until all of the application's dependencies
# are loaded, when running under Rails. We should do the same here in
# principle, however DI currently only has an ActiveRecord integration
# and AR should be loaded before any application code is loaded, being
# part of Rails, therefore for now we should be OK to just require the
# AR integration from here.
require_relative 'di/contrib/active_record'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work in a regular Rails app (meaning, ActiveRecord::Base is loaded when dd-trace-rb is loaded)? If so, we add to the comment that "it works in practice blah blah".

end
end

# TODO this line is not being exercised
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't help me understand why the line is not being executed, but if it's clear to you, it's ok!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line is exercised in #4097, comment removed.

Comment on lines +84 to +85
rescue => exc
if component = DI.component
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document when this error path happens? It looks like the log lines below give a hint, but it would be nice to have a description here, even if short.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #4089

Comment on lines +113 to +114

settings :internal do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here documentation the nature of this internal settings group (for example, is it internal because it should not be used outside of library development, or internal because these are advanced options?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. The group is internal because it is not for use by customers - the options are meant for (my) development and testing only.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #4089.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #4089.

value_to_serialize = {
attributes: value.attributes,
}
serializer.serialize_value(value_to_serialize, depth: depth ? depth - 1 : nil, type: value.class)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the - 1 in depth - 1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We went from the object to its attributes, this is one level in. I will double-check against other serialization code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We went from the object to its attributes, this is one level in. I will double-check against other serialization code.

@@ -0,0 +1,9 @@
# frozen_string_literal: true

Datadog::DI::Serializer.register(condition: lambda { |value| ActiveRecord::Base === value }) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a stupid question, but why not make this condition

Suggested change
Datadog::DI::Serializer.register(condition: lambda { |value| ActiveRecord::Base === value }) \
Datadog::DI::Serializer.register(condition: lambda { |value| defined?(ActiveRecord::Base) && ActiveRecord::Base === value }) \

so that it can always be loaded, even if ActiveRecord is not loaded?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a genius idea actually. The reason not to do it is for performance when AR is not loaded (since otherwise every object will have to evaluate this condition) but perhaps it's worthwhile to take the performance hit to permit loading Rails after datadog.


# config is one probe info
def add_probe(probe)
# TODO lock here?
Copy link
Member

@marcotc marcotc Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, adding a probe is normally done by a "slow" process (user action, application initialization, remote configuration) and if adding two conflicting probes concurrently can cause big issues, I think it's reasonable to add a lock here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in #4097.

# to instrument them every time remote configuration is processed.
attr_reader :failed_probes

# config is one probe info
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is probably not accurate anymore to describe this function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in #4097.

Comment on lines +103 to +104
# TODO add top stack frame to message
failed_probes[probe.id] = "#{exc.class}: #{exc}"
Copy link
Member

@marcotc marcotc Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# TODO add top stack frame to message
failed_probes[probe.id] = "#{exc.class}: #{exc}"
failed_probes[probe.id] = "#{exc.class} #{exc.message}: #{Array(exc.backtrace).first}"

end

def remove_other_probes(probe_ids)
pending_probes.values.each do |probe|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pending_probes.values.each do |probe|
pending_probes.each_value do |probe|

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I had it was intentional because I am modifying the container during iteration.

rescue => exc
raise if settings.dynamic_instrumentation.internal.propagate_all_exceptions
# Silence all exceptions?
# TODO should we propagate here and rescue upstream?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to answer this comment: in my read of this PR, it makes sense to try to unhook all probes, even if one of them fail to unhook. Which would mean that propagating the error early will prevent other probes from being propagated. So it looks to me that we should not propagate the error immediately here.

if probe.type_name == cls.name
# TODO is it OK to hook from trace point handler?
# TODO the class is now defined, but can hooking still fail?
# TODO pass rate_limiter here, need to get it from somewhere
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the rate limiter available here, in probe.rate_limiter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, deleted the comment.

# Class/module definition trace point (:end type).
# Used to install hooks when the target classes/modules aren't yet
# defined when the hook request is received.
attr_reader :definition_trace_point
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this attr_reader is declared at the end of the class, instead of together with the other attr_readers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an internal attribute, even though it is set from outside of the class.

* master: (53 commits)
  [NO-TICKET] Re-enable ASAN memory leaks testing
  [PROF-9470] Enable "heap clean after GC" profiler optimization by default
  [🤖] Lock Dependency: https://github.com/DataDog/dd-trace-rb/actions/runs/11724221529
  Force rspec >= 3.13 to pick up rspec/rspec-core#2957
  Minor style fix
  Add missing assertions/stubs to avoid noise in test output
  Hide skipped/pending tests from rspec output by default
  Revert "Disable crashtracking by default"
  Lockfile updates for libdatadog upgrade
  [NO-TICKET] Upgrade to libdatadog 14.1
  remove unused rspec integration app
  Update lockfiles for release 2.6.0
  Bump version 2.5.0 to 2.6.0
  Add 2.6.0 to CHANGELOG.md
  Remove unintentional libddwaf require
  Specify gem version for macos test
  Address PR comments
  Address new Ruby 3.4 error message changes
  Add macos and yjit tests
  [NO-TICKET] Relax profiling CODEOWNERS configuration
  ...
@p-datadog
Copy link
Member Author

Replaced by the PRs listed above.

@p-datadog p-datadog closed this Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Involves Datadog core libraries
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants