Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix unit test to support Logstash 8 nano timestamp #67

Merged
merged 6 commits into from
Jun 16, 2022

Conversation

kaisecheng
Copy link
Contributor

@kaisecheng kaisecheng commented May 3, 2022

fix red ci

logstash_1_ca91d0bf13e2 | Failures:
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 |   1) LogStash::Filters::Fingerprint `ecs_compatibility => disabled` Timestamps OpenSSL Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 |      Failure/Error: expect(fingerprint).to eq('1d5379ec92d86a67cfc642d55aa050ca312d3b9a')
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        expected: "1d5379ec92d86a67cfc642d55aa050ca312d3b9a"
logstash_1_ca91d0bf13e2 |             got: "437291481f9b52199fcc6e3c6ea31ef4ad1c8fe5"
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        (compared using ==)
logstash_1_ca91d0bf13e2 |      # ./spec/filters/fingerprint_spec.rb:284:in `block in <main>'
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 |   2) LogStash::Filters::Fingerprint `ecs_compatibility => disabled` Timestamps MURMUR3 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 |      Failure/Error: expect(fingerprint).to eq(743372282)
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        expected: 743372282
logstash_1_ca91d0bf13e2 |             got: 1154765817
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        (compared using ==)
logstash_1_ca91d0bf13e2 |      # ./spec/filters/fingerprint_spec.rb:292:in `block in <main>'
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 |   3) LogStash::Filters::Fingerprint `ecs_compatibility => disabled` Timestamps MURMUR3_128 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 |      Failure/Error: expect(fingerprint).to eq("37785b62a8cae473acc315d39b66d86e")
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        expected: "37785b62a8cae473acc315d39b66d86e"
logstash_1_ca91d0bf13e2 |             got: "a0287bec80fce9eb6a1457efae3a7aeb"
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        (compared using ==)
logstash_1_ca91d0bf13e2 |      # ./spec/filters/fingerprint_spec.rb:300:in `block in <main>'
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 |   4) LogStash::Filters::Fingerprint `ecs_compatibility => v1` Timestamps MURMUR3_128 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 |      Failure/Error: expect(fingerprint).to eq("37785b62a8cae473acc315d39b66d86e")
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        expected: "37785b62a8cae473acc315d39b66d86e"
logstash_1_ca91d0bf13e2 |             got: "a0287bec80fce9eb6a1457efae3a7aeb"
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        (compared using ==)
logstash_1_ca91d0bf13e2 |      # ./spec/filters/fingerprint_spec.rb:300:in `block in /usr/share/plugins/plugin/spec/filters/fingerprint_spec.rb'
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 |   5) LogStash::Filters::Fingerprint `ecs_compatibility => v1` Timestamps MURMUR3 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 |      Failure/Error: expect(fingerprint).to eq(743372282)
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        expected: 743372282
logstash_1_ca91d0bf13e2 |             got: 1154765817
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        (compared using ==)
logstash_1_ca91d0bf13e2 |      # ./spec/filters/fingerprint_spec.rb:292:in `block in /usr/share/plugins/plugin/spec/filters/fingerprint_spec.rb'
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 |   6) LogStash::Filters::Fingerprint `ecs_compatibility => v1` Timestamps OpenSSL Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 |      Failure/Error: expect(fingerprint).to eq('1d5379ec92d86a67cfc642d55aa050ca312d3b9a')
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        expected: "1d5379ec92d86a67cfc642d55aa050ca312d3b9a"
logstash_1_ca91d0bf13e2 |             got: "437291481f9b52199fcc6e3c6ea31ef4ad1c8fe5"
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        (compared using ==)
logstash_1_ca91d0bf13e2 |      # ./spec/filters/fingerprint_spec.rb:284:in `block in /usr/share/plugins/plugin/spec/filters/fingerprint_spec.rb'
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 |   7) LogStash::Filters::Fingerprint `ecs_compatibility => v8` Timestamps OpenSSL Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 |      Failure/Error: expect(fingerprint).to eq('1d5379ec92d86a67cfc642d55aa050ca312d3b9a')
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        expected: "1d5379ec92d86a67cfc642d55aa050ca312d3b9a"
logstash_1_ca91d0bf13e2 |             got: "437291481f9b52199fcc6e3c6ea31ef4ad1c8fe5"
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        (compared using ==)
logstash_1_ca91d0bf13e2 |      # ./spec/filters/fingerprint_spec.rb:284:in `block in /usr/share/plugins/plugin/spec/filters/fingerprint_spec.rb'
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 |   8) LogStash::Filters::Fingerprint `ecs_compatibility => v8` Timestamps MURMUR3 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 |      Failure/Error: expect(fingerprint).to eq(743372282)
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        expected: 743372282
logstash_1_ca91d0bf13e2 |             got: 1154765817
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        (compared using ==)
logstash_1_ca91d0bf13e2 |      # ./spec/filters/fingerprint_spec.rb:292:in `block in /usr/share/plugins/plugin/spec/filters/fingerprint_spec.rb'
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 |   9) LogStash::Filters::Fingerprint `ecs_compatibility => v8` Timestamps MURMUR3_128 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 |      Failure/Error: expect(fingerprint).to eq("37785b62a8cae473acc315d39b66d86e")
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        expected: "37785b62a8cae473acc315d39b66d86e"
logstash_1_ca91d0bf13e2 |             got: "a0287bec80fce9eb6a1457efae3a7aeb"
logstash_1_ca91d0bf13e2 |      
logstash_1_ca91d0bf13e2 |        (compared using ==)
logstash_1_ca91d0bf13e2 |      # ./spec/filters/fingerprint_spec.rb:300:in `block in /usr/share/plugins/plugin/spec/filters/fingerprint_spec.rb'
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 | Finished in 6.11 seconds (files took 10.26 seconds to load)
logstash_1_ca91d0bf13e2 | 120 examples, 9 failures
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 | Failed examples:
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 | rspec './spec/filters/fingerprint_spec.rb[1:1:7:1:1]' # LogStash::Filters::Fingerprint `ecs_compatibility => disabled` Timestamps OpenSSL Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 | rspec './spec/filters/fingerprint_spec.rb[1:1:7:2:1]' # LogStash::Filters::Fingerprint `ecs_compatibility => disabled` Timestamps MURMUR3 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 | rspec './spec/filters/fingerprint_spec.rb[1:1:7:3:1]' # LogStash::Filters::Fingerprint `ecs_compatibility => disabled` Timestamps MURMUR3_128 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 | rspec './spec/filters/fingerprint_spec.rb[1:2:7:3:1]' # LogStash::Filters::Fingerprint `ecs_compatibility => v1` Timestamps MURMUR3_128 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 | rspec './spec/filters/fingerprint_spec.rb[1:2:7:2:1]' # LogStash::Filters::Fingerprint `ecs_compatibility => v1` Timestamps MURMUR3 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 | rspec './spec/filters/fingerprint_spec.rb[1:2:7:1:1]' # LogStash::Filters::Fingerprint `ecs_compatibility => v1` Timestamps OpenSSL Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 | rspec './spec/filters/fingerprint_spec.rb[1:3:7:1:1]' # LogStash::Filters::Fingerprint `ecs_compatibility => v8` Timestamps OpenSSL Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 | rspec './spec/filters/fingerprint_spec.rb[1:3:7:2:1]' # LogStash::Filters::Fingerprint `ecs_compatibility => v8` Timestamps MURMUR3 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 | rspec './spec/filters/fingerprint_spec.rb[1:3:7:3:1]' # LogStash::Filters::Fingerprint `ecs_compatibility => v8` Timestamps MURMUR3_128 Fingerprinting fingerprints the timestamp correctly
logstash_1_ca91d0bf13e2 | 
logstash_1_ca91d0bf13e2 | Randomized with seed 41691
logstash_1_ca91d0bf13e2 | 
ci_logstash_1_ca91d0bf13e2 exited with code 1
Aborting on container exit...
The command ".ci/docker-run.sh" exited with 1.

Closes: #68

let(:fingerprint_method) { "SHA1" }
let(:data) { { "@timestamp" => epoch_time } }
it "fingerprints the timestamp correctly" do
expect(fingerprint).to eq('1d5379ec92d86a67cfc642d55aa050ca312d3b9a')
digest = OpenSSL::Digest::SHA1.new
Copy link
Member

@jsvd jsvd May 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the goal with having explicit values was that we could compute the digests "manually" independently from the code we use in the plugin(e.g. https://cimi.io/murmurhash3js-revisited/). It's not clear if this change causes the digests of a few tests to change, or if it's just a code reorganization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the test fails in Logstash 8 due to the breaking change of nanosecond timestamp in elastic/logstash#12797 which changes the string format of @timestamp, as a result, the hash value change in Logstash 8.

I can make a version specific hash value in test, or getting the value from OpenSSL::Digest::SHA1.new

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 because LS8's Timestamp#to_s does not include a trailing .000 for whole-second values, and does include .XXX for timestamps with fractional seconds, this breaks the fingerprinting of timestamps in 1 out of 1000 events.

Can we normalize the serialization in this library to include a minimum precision?

We could do it with refinements like:

diff --git a/lib/logstash/filters/fingerprint.rb b/lib/logstash/filters/fingerprint.rb
index e027109..2e39d46 100644
--- a/lib/logstash/filters/fingerprint.rb
+++ b/lib/logstash/filters/fingerprint.rb
@@ -24,6 +24,18 @@ require "logstash/plugin_mixins/ecs_compatibility_support"
 # To generate UUIDs, prefer the <<plugins-filters-uuid,uuid filter>>.
 class LogStash::Filters::Fingerprint < LogStash::Filters::Base
 
+  module MinimumSerializationLengthTimestamp
+    THREE = java.time.format.DateTimeFormatterBuilder.new.appendInstant(3).toFormatter
+
+    refine LogStash::Timestamp do
+      def to_s
+        return super unless nsec == 0
+
+        THREE.format(to_java.toInstant)
+      end
+    end
+  end
+
   INTEGER_MAX_32BIT = (1 << 31) - 1
   INTEGER_MIN_32BIT = -(1 << 31)
 
@@ -119,6 +131,9 @@ class LogStash::Filters::Fingerprint < LogStash::Filters::Base
     end
   end
 
+
+  using MinimumSerializationLengthTimestamp if LOGSTASH_VERSION.split('.').first.to_i >= 8
+
   def filter(event)
     case @method
     when :UUID

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave a test record here

Logstash 7.17.0

irb(main):001:0> LogStash::Timestamp.new('2000-01-01T05:00:00.12Z').to_s
=> "2000-01-01T05:00:00.120Z"
irb(main):002:0> LogStash::Timestamp.new('2000-01-01T05:00:00.000Z').to_s
=> "2000-01-01T05:00:00.000Z"
irb(main):003:0> LogStash::Timestamp.new('2000-01-01T05:00:00.123456Z').to_s
=> "2000-01-01T05:00:00.123Z"

Logstash 8.1.1

irb(main):001:0> LogStash::Timestamp.new('2000-01-01T05:00:00.12Z').to_s
=> "2000-01-01T05:00:00.120Z"
irb(main):002:0> LogStash::Timestamp.new('2000-01-01T05:00:00.000Z').to_s
=> "2000-01-01T05:00:00Z"
irb(main):003:0> LogStash::Timestamp.new('2000-01-01T05:00:00.123456Z').to_s
=> "2000-01-01T05:00:00.123456Z"

7.17 always gives three fractional digit while 8 formats fractional digits in groups of three.
The patch is good enough to make zero to three fractional digits backward compatible.

@yaauie I think your patch is the best fit for user. I am wondering is there any reason not to put using MinimumSerializationLengthTimestamp right after MinimumSerializationLengthTimestamp module?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refinement scoping can get... weird. They're active until the lexical scope in which the using is sent is closed, and affect methods that are defined while they are active, so my first thought was to minimize the scope in which it can be active by sending the using after defining other methods.

Looking at the implementation though, I think you are right that they should be grouped together -- there is little risk of accidental timestamp serialization during the lexical scope or using the methods defined in this lexical scope.

The following diff makes the build green.

diff --git a/lib/logstash/filters/fingerprint.rb b/lib/logstash/filters/fingerprint.rb
index e027109..cf0f20c 100644
--- a/lib/logstash/filters/fingerprint.rb
+++ b/lib/logstash/filters/fingerprint.rb
@@ -24,6 +24,27 @@ require "logstash/plugin_mixins/ecs_compatibility_support"
 # To generate UUIDs, prefer the <<plugins-filters-uuid,uuid filter>>.
 class LogStash::Filters::Fingerprint < LogStash::Filters::Base
 
+  ##
+  # Logstash 8+ has variable-length serialization of timestamps
+  # that do not include subsecond info for whole-second timestamps.
+  # For backward-compatibility we refine the implementation to use
+  # our own three-decimal-place formatter for whole-second
+  # timestamps.
+  if LOGSTASH_VERSION.split('.').first.to_i >= 8
+    module MinimumSerializationLengthTimestamp
+      THREE_DECIMAL_INSTANT_FORMATTER = java.time.format.DateTimeFormatterBuilder.new.appendInstant(3).toFormatter
+
+      refine LogStash::Timestamp do
+        def to_s
+          return super unless nsec == 0
+
+          THREE_DECIMAL_INSTANT_FORMATTER.format(to_java.toInstant)
+        end
+      end
+    end
+    using MinimumSerializationLengthTimestamp
+  end
+
   INTEGER_MAX_32BIT = (1 << 31) - 1
   INTEGER_MIN_32BIT = -(1 << 31)
 

I would like to see:

  • some additional specs with fractional seconds, passing both before and after this change, AND
  • some nano-precise specs that only run on Logstash 8+ (since the extra precision is truncated while parsing in Timestamp#new on 7.x).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7.17.0

irb(main):007:0> LogStash::Timestamp.new(Time.at(946702800.12)).to_s
=> "2000-01-01T05:00:00.120Z"

8.1.1

irb(main):038:0> LogStash::Timestamp.new(Time.at(946702800.12)).to_s
=> "2000-01-01T05:00:00.120000004Z"

LogStash::Timestamp of 2000-01-01T05:00:00.12Z is consistent with the patch. However, LogStash::Timestamp( Time -> RubyTime -> Instant ), RubyTime convert .12 to millisecond and 4 nanasecond.
I think Logstash core is a better place to handle this case. I believe most of the use cases are converting from datetime string, except customized script in ruby filter, so the patch is still valuable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yaauie I have created an issue to track the timestamp elastic/logstash#14088
Do you think we can merge the patch now?

@kaisecheng kaisecheng requested review from jsvd and removed request for jsvd May 4, 2022 17:31
@kaisecheng kaisecheng requested a review from yaauie June 14, 2022 09:48
@kaisecheng kaisecheng merged commit d8929e9 into logstash-plugins:main Jun 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Test Failure] Specs fingerprinting timestamps are failing on Logstash 8.x
4 participants