Initial kinesis + tower sink #93
src/sinks/kinesis.rs (outdated)
```rust
.downcast_ref::<PutRecordsError>()
.unwrap()
{
    PutRecordsError::ProvisionedThroughputExceeded(_) => {
```
This should only retry on ProvisionedThroughput here, but we probably want to back off and delay.
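To illustrate the "back off and delay" part, here is a minimal std-only sketch of capped exponential backoff. The helper name and the base/cap values are made up for illustration; the real policy would plug a delay like this into its retry loop.

```rust
use std::time::Duration;

// Hypothetical helper (not in the PR): one way the retry policy could
// add a delay between attempts, using capped exponential backoff.
fn backoff_delay(attempt: u32) -> Duration {
    let base_ms: u64 = 100;
    let cap_ms: u64 = 10_000;
    // 100ms, 200ms, 400ms, ... never more than 10s.
    let ms = base_ms.saturating_mul(1u64 << attempt.min(16));
    Duration::from_millis(ms.min(cap_ms))
}

fn main() {
    for attempt in 0..5 {
        println!("attempt {} -> {:?}", attempt, backoff_delay(attempt));
    }
}
```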
Ok, lots to unpack here. Overall, the kinesis service looks good, just had a few smaller comments and questions there. For the rest of it, I'll try to split it up a bit:
Retry policy
The `Policy` system seems neat, but this doesn't feel ready to go yet. If we're going to ignore all the errors other than `PutRecordsError`, we should limit the scope of the policy to only operate over that service. But really it feels like there are a lot of other potential errors to deal with, and we should at least stub out actually dealing with them. I imagine things like timeouts won't be terribly uncommon.
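Stubbing out the other error classes could look something like the sketch below. The enum and function are illustrative stand-ins, not the real rusoto types; the point is that each failure mode gets an explicit retry decision instead of being ignored.

```rust
use std::time::Duration;

// Illustrative only: a hand-rolled enum standing in for the failure
// modes a Kinesis request can hit, including timeouts.
#[derive(Debug, PartialEq)]
enum RequestError {
    ProvisionedThroughputExceeded,
    Timeout,
    ServiceUnavailable,
    Validation(String),
}

// Deciding per variant keeps the retry policy's scope explicit:
// retryable errors get a delay, the rest fail fast.
fn retry_after(err: &RequestError) -> Option<Duration> {
    match err {
        RequestError::ProvisionedThroughputExceeded => Some(Duration::from_millis(500)),
        RequestError::Timeout | RequestError::ServiceUnavailable => Some(Duration::from_millis(100)),
        RequestError::Validation(_) => None, // not retryable
    }
}

fn main() {
    println!("{:?}", retry_after(&RequestError::Timeout));
}
```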
BatchSink
This is a lot of generic code that's only used in one place. I'd much prefer we start with something simpler and work up from there.
More specifically, this seam doesn't feel right. `Service<Vec<B::Item>>` is not really what we want for sinks like S3. I don't think we have a good idea yet for the best way to split up and compose batching, request building, retries, etc., so I'd rather spend a little time looking at all our sinks and figuring that out than starting with something big and generic.
Maybe instead we start with the absolute minimum thing that lets us use a `Service` as a `Sink`.
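To make "the absolute minimum thing" concrete, here is a deliberately synchronous sketch: batching lives in the sink, and the service only ever sees full batches. The `Service` trait and all names here are illustrative stand-ins, not the real tower interfaces, and the async machinery is left out entirely.

```rust
// Illustrative stand-in for tower's Service trait, synchronous on purpose.
trait Service<Request> {
    fn call(&mut self, req: Request);
}

// Minimal batching sink: accumulate items, hand the service a Vec when full.
struct BatchSink<S> {
    service: S,
    buffer: Vec<String>,
    size: usize,
}

impl<S: Service<Vec<String>>> BatchSink<S> {
    fn send(&mut self, item: String) {
        self.buffer.push(item);
        if self.buffer.len() >= self.size {
            self.flush();
        }
    }

    fn flush(&mut self) {
        if !self.buffer.is_empty() {
            let batch = std::mem::replace(&mut self.buffer, Vec::new());
            self.service.call(batch);
        }
    }
}

struct PrintService;
impl Service<Vec<String>> for PrintService {
    fn call(&mut self, batch: Vec<String>) {
        println!("sending batch of {}", batch.len());
    }
}

fn main() {
    let mut sink = BatchSink { service: PrintService, buffer: Vec::new(), size: 2 };
    sink.send("a".into());
    sink.send("b".into()); // triggers a flush of ["a", "b"]
    sink.send("c".into());
    sink.flush(); // flush the partial batch
}
```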
Tests
We should probably read the data back and make sure it all got there 😉
src/sinks/kinesis.rs (outdated)
```rust
let policy = RetryPolicy { attempts: 5 };

let service = Timeout::new(service, Duration::from_secs(5));
let service = Buffer::new(service, 1).unwrap();
```
What is this buffer doing for us? It seems like we'd have our own buffer impl, the batch sink, and then this buffer inside of it. I just want to be clear on what they're all doing.
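For context on the question, tower's `Buffer` roughly amounts to putting a bounded channel in front of the service and driving it from a single worker. A std-only sketch of that idea, with a closure-like sum standing in for the inner service (all names here are illustrative):

```rust
use std::sync::mpsc;
use std::thread;

// Rough sketch of what a Buffer layer provides: callers push requests
// onto a bounded channel; one worker drives the inner "service".
fn run_buffered(batches: Vec<Vec<u8>>) -> usize {
    // Bound of 1, analogous to Buffer::new(service, 1).
    let (tx, rx) = mpsc::sync_channel::<Vec<u8>>(1);

    let worker = thread::spawn(move || {
        let mut total = 0;
        for batch in rx {
            total += batch.len(); // the inner service handles each request
        }
        total
    });

    for batch in batches {
        tx.send(batch).unwrap(); // blocks when the buffer is full
    }
    drop(tx); // closing the channel lets the worker finish

    worker.join().unwrap()
}

fn main() {
    println!("processed {} items", run_buffered(vec![vec![1, 2, 3], vec![4]]));
}
```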
src/sinks/kinesis.rs (outdated)
```rust
.source()
.unwrap()
.downcast_ref::<PutRecordsError>()
.unwrap()
```
This is confusing and the unwraps don't make me very confident. Is there a way we can rework this to avoid them? If we know what type the error is going to be, it'd be nice to encode that and let the type system do its thing.
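One way to "let the type system do its thing" is to keep the concrete error type in the signature instead of a boxed `dyn Error`. A self-contained sketch, with a stand-in enum defined locally so it compiles on its own (the real type is rusoto's):

```rust
use std::fmt;

// Stand-in for rusoto's PutRecordsError, defined here only so the
// example is self-contained.
#[derive(Debug)]
enum PutRecordsError {
    ProvisionedThroughputExceeded(String),
}

impl fmt::Display for PutRecordsError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            PutRecordsError::ProvisionedThroughputExceeded(msg) => {
                write!(f, "provisioned throughput exceeded: {}", msg)
            }
        }
    }
}

// With the concrete type in the signature, the
// source().unwrap().downcast_ref().unwrap() chain disappears and the
// compiler checks the match exhaustively.
fn should_retry(err: &PutRecordsError) -> bool {
    match err {
        PutRecordsError::ProvisionedThroughputExceeded(_) => true,
    }
}

fn main() {
    let err = PutRecordsError::ProvisionedThroughputExceeded("slow down".to_string());
    println!("{}: retry = {}", err, should_retry(&err));
}
```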
A few things to follow up on, but in general I think it's looking reasonable.
tests/kinesis.rs (outdated)
```rust
.block_on(fetch_records(STREAM_NAME.into(), timestamp))
.unwrap();

assert_eq!(records.len(), 11);
```
Could we have `input_records` and `output_records` and assert they're the same? We do that in most other sink tests.
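The suggested test pattern, sketched with stubs: compare what went in against what came back rather than asserting on a magic count. The `roundtrip` helper here is a hypothetical stand-in for "write to the stream, then fetch the records back".

```rust
// Generate deterministic input records for the test.
fn make_input_records(n: usize) -> Vec<String> {
    (0..n).map(|i| format!("record-{}", i)).collect()
}

// Hypothetical stub standing in for send-then-fetch against Kinesis.
fn roundtrip(records: &[String]) -> Vec<String> {
    records.to_vec()
}

fn main() {
    let input_records = make_input_records(11);
    let output_records = roundtrip(&input_records);
    // Assert contents match, not just the count.
    assert_eq!(input_records, output_records);
    println!("ok: {} records match", output_records.len());
}
```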
```rust
self.batcher.push(item.into());

if self.batcher.len() > self.size {
    self.poll_complete()?;
```
Is this correct? I feel like `start_send` should only `poll_complete` when it has to (i.e., lazily).
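A synchronous sketch of the lazy behaviour being suggested: `start_send` hands the item back when the batch is already full, instead of pushing and then eagerly flushing itself. Names echo the futures 0.1 `Sink` trait, but this is a simplified illustration, not the real interface.

```rust
// Simplified stand-ins for the Sink return values.
enum StartSend {
    Accepted,
    NotReady(String), // caller must poll_complete, then retry this item
}

struct Batcher {
    items: Vec<String>,
    size: usize,
}

impl Batcher {
    fn start_send(&mut self, item: String) -> StartSend {
        if self.items.len() >= self.size {
            return StartSend::NotReady(item); // flush only when forced to
        }
        self.items.push(item);
        StartSend::Accepted
    }

    fn poll_complete(&mut self) -> usize {
        let flushed = self.items.len();
        self.items.clear(); // pretend the batch was sent downstream
        flushed
    }
}

fn main() {
    let mut b = Batcher { items: Vec::new(), size: 2 };
    b.start_send("a".to_string());
    b.start_send("b".to_string());
    // The third item is refused until the caller flushes.
    if let StartSend::NotReady(item) = b.start_send("c".to_string()) {
        b.poll_complete();
        b.start_send(item);
    }
}
```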
```rust
continue;
}
Ok(Async::NotReady) => return Ok(Async::NotReady),
Err(err) => panic!("Error sending request: {:?}", err),
```
Do we actually want to panic here? Or just log the error and move on?
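A sketch of the log-and-move-on alternative, with `Result<(), String>` standing in for the real request error type so the example is self-contained:

```rust
// Drain a batch of request results, logging failures instead of
// panicking, so one bad response doesn't take down the whole sink.
fn drain(results: Vec<Result<(), String>>) -> usize {
    let mut errors = 0;
    for res in results {
        match res {
            Ok(()) => continue,
            Err(err) => {
                // was: panic!("Error sending request: {:?}", err)
                eprintln!("Error sending request: {:?}", err);
                errors += 1;
            }
        }
    }
    errors
}

fn main() {
    let errors = drain(vec![Ok(()), Err("timeout".to_string()), Ok(())]);
    println!("{} errors logged", errors);
}
```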
tests/kinesis.rs (outdated)
```rust
.block_on(futures::lazy(|| {
    future::ok::<_, ()>(KinesisService::new(config))
}))
.unwrap();
```
Can't all this just be `KinesisService::new(config)`?
Yes, now it can.
```rust
}))
.unwrap();

let timestamp = chrono::Utc::now().timestamp_millis();
```
Let's inline this where it's actually used. No reason for it to be way up here.
LOG-2734: A test exists to verify there are no ring dependencies
Adds our initial `aws_kinesis_data_stream` sink.

External refs: fluent/fluent-bit#2485